# Introduction 

DNA (1-2 µg) was bisulfite-converted using the EpiTect Bisulfite Kit (Qiagen, ID 59104) following the manufacturer’s instructions. After purification, 100 ng or 11 µl of each sample was amplified by PCR for the TSS region and EGFL7 intron 7. A high-fidelity DNA polymerase, KAPA HiFi HotStart Uracil+ Kit (Roche, Basel, CH), was used; primer sequences and thermocycler settings are shown below.  
Methylation profiling and differential methylation analysis (between the groups of interest) were performed only within the PCR-amplified regions. 

# Data analysis workflow

The data analysis consists of three main steps:
- QC
- Mapping and CpG coverage quantification
- Differential methylation analysis

## QC

Raw reads were quality-checked with FastQC (v0.11.8) and trimmed for adapters and low-quality bases with cutadapt (v1.16; minimum base quality 20, minimum length 25 bp).  
Where necessary, both ends of the reads were additionally clipped, with clipping parameters set per sequencing batch.  
Samples of very poor quality were excluded from downstream analyses.  
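
To make the two trimming thresholds concrete, the following Python sketch applies a 3' quality cutoff of 20 and a minimum length of 25 bp to a single read. It is a simplified illustration of the BWA-style running-sum trimming that cutadapt documents, not cutadapt's own code; the function names and the Phred+33 encoding are assumptions, and adapter removal and 5' clipping are left out.

```python
from typing import Optional, Tuple

MIN_QUALITY = 20   # minimum base quality stated in the text
MIN_LENGTH = 25    # minimum read length (bp) stated in the text
PHRED_OFFSET = 33  # assumption: Phred+33 encoded FASTQ qualities


def trim_3prime(seq: str, qual: str, cutoff: int = MIN_QUALITY) -> Tuple[str, str]:
    """Trim low-quality bases from the 3' end (hypothetical helper).

    Walks from the 3' end, accumulating (cutoff - quality), and cuts
    where that running sum is largest.
    """
    stop = len(qual)
    running = 0
    best = 0
    for i in reversed(range(len(qual))):
        running += cutoff - (ord(qual[i]) - PHRED_OFFSET)
        if running < 0:
            break
        if running > best:
            best = running
            stop = i
    return seq[:stop], qual[:stop]


def trim_and_filter(seq: str, qual: str) -> Optional[Tuple[str, str]]:
    """Return the trimmed read, or None if it is shorter than MIN_LENGTH."""
    trimmed_seq, trimmed_qual = trim_3prime(seq, qual)
    if len(trimmed_seq) < MIN_LENGTH:
        return None
    return trimmed_seq, trimmed_qual
```

A read with 30 high-quality bases followed by 5 low-quality bases keeps its first 30 bases, while a read that drops below 25 bp after trimming is discarded.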

## Mapping and CpG coverage quantification

Reads passing quality checks were then mapped to the human reference genome GRCh38 using Bismark v0.22.1 (--local mode).
Coverage files (.cov) produced by Bismark were then used as input for differential methylation analysis with the methylKit R package (v1.10.0).  
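
For orientation, a Bismark coverage file is tab-separated with six columns: chromosome, start, end, methylation percentage, methylated count, and unmethylated count. The Python sketch below reads such a file into records. The class and function names are hypothetical, the example line uses made-up values, and the actual analysis read these files with methylKit in R.

```python
import csv
import io
from typing import List, NamedTuple, TextIO


class CpgCoverage(NamedTuple):
    """One line of a Bismark .cov file (hypothetical container)."""
    chrom: str
    start: int
    end: int
    pct_meth: float
    n_meth: int
    n_unmeth: int

    @property
    def depth(self) -> int:
        # Total read coverage at this cytosine.
        return self.n_meth + self.n_unmeth


def read_cov(handle: TextIO) -> List[CpgCoverage]:
    """Parse tab-separated coverage lines from an open text handle."""
    records = []
    for row in csv.reader(handle, delimiter="\t"):
        if not row:
            continue  # skip blank lines
        chrom, start, end, pct, n_meth, n_unmeth = row[:6]
        records.append(
            CpgCoverage(chrom, int(start), int(end), float(pct),
                        int(n_meth), int(n_unmeth))
        )
    return records


# Illustrative line with made-up values, not taken from the data.
example = io.StringIO("chr9\t136668100\t136668100\t75.0\t30\t10\n")
records = read_cov(example)
print(records[0].chrom, records[0].start, records[0].depth)
```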

## Differential methylation analysis

Differential methylation analysis was performed primarily at single-CpG resolution, given the short length of the targeted region.
To exclude CpGs outside the targeted regions, we restricted the analysis to the genomic coordinates chr9:136668000-136671000, which cover the TSS and the 5’ and 3’ intron 7 subregions, using the selectByOverlap function.  
We then discarded CpGs covered by fewer than 10 reads and CpGs with coverage below the 5th percentile. This allowed us to evaluate the methylation percentage in most of the samples under analysis.  
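
The region restriction and coverage filter described above can be sketched in Python as follows. This is an illustration of the logic only, not methylKit's implementation: the function names are hypothetical, the percentile is assumed to be computed per sample over the targeted sites with linear interpolation, and the region bounds are treated as inclusive.

```python
import math
from typing import List, Sequence, Tuple

REGION = ("chr9", 136668000, 136671000)  # coordinates given in the text
MIN_COUNT = 10                           # minimum read coverage per CpG
LOW_PERCENTILE = 5.0                     # lower coverage percentile cutoff

Site = Tuple[str, int, int]  # (chromosome, position, coverage)


def in_region(site: Site, region: Tuple[str, int, int] = REGION) -> bool:
    """True if the site lies inside the targeted interval (inclusive)."""
    chrom, pos, _ = site
    return chrom == region[0] and region[1] <= pos <= region[2]


def percentile(values: Sequence[float], pct: float) -> float:
    """Percentile with linear interpolation between closest ranks."""
    ordered = sorted(values)
    rank = (len(ordered) - 1) * pct / 100.0
    lower, upper = math.floor(rank), math.ceil(rank)
    return ordered[lower] + (ordered[upper] - ordered[lower]) * (rank - lower)


def filter_sites(sites: List[Site]) -> List[Site]:
    """Keep targeted sites that pass both coverage thresholds."""
    targeted = [s for s in sites if in_region(s)]
    if not targeted:
        return []
    low_cut = percentile([s[2] for s in targeted], LOW_PERCENTILE)
    cutoff = max(MIN_COUNT, low_cut)
    return [s for s in targeted if s[2] >= cutoff]
```

Applying both thresholds as a single combined cutoff is a design shortcut for this sketch; a site must clear whichever of the two is stricter for that sample.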
Coverage was then normalized across samples with the normalizeCoverage function.  
Exploratory data analysis was performed using the package's built-in functions, including PCA, clustering, and correlation analysis.  
CpGs with a q-value < 0.01 (logistic regression test) and an absolute methylation difference of at least 10 percentage points (in either direction) between the compared groups were considered differentially methylated.  
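
As a minimal sketch of this decision rule, the Python snippet below classifies per-CpG test results using the two cutoffs from the text. It assumes q-values and methylation differences have already been computed (in the actual workflow, by methylKit); the record type, function names, and example values are hypothetical.

```python
from typing import Dict, List, NamedTuple

Q_CUTOFF = 0.01       # q-value threshold from the text
MIN_ABS_DELTA = 10.0  # minimum absolute methylation difference (percentage points)


class CpgTest(NamedTuple):
    """Per-CpG test result (hypothetical container)."""
    chrom: str
    pos: int
    qvalue: float
    meth_diff: float  # group difference in percent methylation


def is_differential(result: CpgTest) -> bool:
    """Apply both cutoffs: q-value below 0.01 and |difference| of at least 10."""
    return result.qvalue < Q_CUTOFF and abs(result.meth_diff) >= MIN_ABS_DELTA


def split_by_direction(results: List[CpgTest]) -> Dict[str, List[CpgTest]]:
    """Separate significant CpGs into hyper- and hypomethylated sets."""
    significant = [r for r in results if is_differential(r)]
    return {
        "hyper": [r for r in significant if r.meth_diff > 0],
        "hypo": [r for r in significant if r.meth_diff < 0],
    }


# Made-up values for illustration only.
results = [
    CpgTest("chr9", 136668100, 0.001, 25.0),   # significant, hypermethylated
    CpgTest("chr9", 136668200, 0.001, -12.5),  # significant, hypomethylated
    CpgTest("chr9", 136668300, 0.200, 30.0),   # fails the q-value cutoff
    CpgTest("chr9", 136668400, 0.001, 4.0),    # fails the difference cutoff
]
calls = split_by_direction(results)
print(len(calls["hyper"]), len(calls["hypo"]))
```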
The percent methylation matrix was used as input to produce heatmaps (pheatmap R package).  
Color palette presets from the RColorBrewer package were used to color the % methylation and row annotation tiles.


