How do I determine if my peak calling worked?


The outcome of peak calling depends on the quality of the Unique BAM file used as input. For high-quality Unique BAM files, the selected peak caller (MACS2, SICER, or SEACR) will deliver statistically significant peaks in accordance with the preselected, field-supported q-value, False Discovery Rate (FDR), and significance thresholds. Peak calling produces two files that compare the target enrichment sites to background/IgG: a BED file and a text file. Together they report the location of each significant peak, the number of peaks called, and a FRiP score (Fraction of Reads in Peaks), which is the proportion of reads that fall within called peaks. These files can be used in external analysis. Within the CUTANA™ Cloud, you can use the BED file to visualize peaks by navigating to the Peak Calling tab/IGV.

The highlighted bars beneath the respective bigWig traces represent the called peaks. Interpreting peak locations relative to genomic features requires some understanding of the target's behavior. For example, if you are calling peaks for a target known to be associated with transcription, you would expect peaks near or adjacent to promoter regions, and the bars representing called peaks in the IGV window should reflect this. In cases like this, it is helpful to compare against the enrichment of the H3K4me3 control (a mark of active transcription).
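For users who export the BED file for external analysis, the visual promoter check can also be approximated numerically. The following is a minimal Python sketch, not part of the platform: the function names, the 1 kb window, and the toy TSS (transcription start site) coordinates are all illustrative assumptions, and a real analysis would load TSS positions from a gene annotation for the matching genome build.

```python
from bisect import bisect_left


def parse_bed(lines):
    """Parse BED lines into (chrom, start, end) tuples, skipping header lines."""
    peaks = []
    for line in lines:
        line = line.strip()
        if not line or line.startswith(("#", "track", "browser")):
            continue
        chrom, start, end = line.split()[:3]
        peaks.append((chrom, int(start), int(end)))
    return peaks


def fraction_near_tss(peaks, tss_by_chrom, window=1000):
    """Return the fraction of peaks lying within `window` bp of any TSS.

    BED intervals are treated as 0-based and half-open: [start, end).
    """
    if not peaks:
        return 0.0
    sorted_tss = {chrom: sorted(pos) for chrom, pos in tss_by_chrom.items()}
    near = 0
    for chrom, start, end in peaks:
        positions = sorted_tss.get(chrom, [])
        # Index of the first TSS at or after the left edge of the widened peak.
        i = bisect_left(positions, start - window)
        if i < len(positions) and positions[i] < end + window:
            near += 1
    return near / len(peaks)


# Toy data (hypothetical coordinates, for illustration only).
example_bed = [
    "chr1\t900\t1200\tpeak1",
    "chr1\t50000\t50400\tpeak2",
    "chr2\t100\t300\tpeak3",
]
example_tss = {"chr1": [1000, 80000], "chr2": [5000]}
near_fraction = fraction_near_tss(parse_bed(example_bed), example_tss)
```

In this toy example only the first peak lies near a TSS, so one third of the peaks are promoter-proximal. What counts as an expected fraction depends on the target, so a figure like this is best read alongside the IGV view and the H3K4me3 control, not as a pass/fail threshold.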

It is important to note that more peaks do not necessarily mean a better peak calling result. Judging the trustworthiness of peak calling is multifactorial: it relies on the quality of the Unique BAM file used as input, on visually comparing peak locations and characteristics in IGV, and on the FRiP (Fraction of Reads in Peaks) score. In general, the higher the FRiP, the more trustworthy the data.
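To make the FRiP definition concrete, here is a minimal Python sketch that divides the number of reads falling inside any called peak by the total number of reads. It assumes each read is reduced to a single (chromosome, position) point and that peaks are half-open BED-style intervals. The function names and toy data are hypothetical, and the platform's own FRiP calculation may differ in detail (for example, by counting whole-fragment overlaps).

```python
from bisect import bisect_right


def merge_intervals(intervals):
    """Merge overlapping or touching half-open (start, end) intervals."""
    merged = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([start, end])
    return merged


def frip(read_positions, peaks):
    """FRiP = reads falling inside any peak / total reads.

    read_positions: list of (chrom, pos) tuples, one per read.
    peaks: list of (chrom, start, end) tuples, half-open intervals.
    """
    if not read_positions:
        return 0.0
    by_chrom = {}
    for chrom, start, end in peaks:
        by_chrom.setdefault(chrom, []).append((start, end))
    # Merging first ensures a read in two overlapping peaks is counted once.
    merged = {chrom: merge_intervals(iv) for chrom, iv in by_chrom.items()}
    starts = {chrom: [s for s, _ in iv] for chrom, iv in merged.items()}
    in_peaks = 0
    for chrom, pos in read_positions:
        chrom_starts = starts.get(chrom)
        if not chrom_starts:
            continue
        # Last merged interval whose start is at or before the read position.
        i = bisect_right(chrom_starts, pos) - 1
        if i >= 0 and pos < merged[chrom][i][1]:
            in_peaks += 1
    return in_peaks / len(read_positions)


# Toy data (hypothetical, for illustration only).
example_peaks = [("chr1", 100, 200), ("chr1", 150, 300), ("chr2", 0, 50)]
example_reads = [("chr1", 120), ("chr1", 250), ("chr1", 300), ("chr2", 10), ("chr3", 5)]
example_frip = frip(example_reads, example_peaks)
```

Here three of the five reads fall inside a peak, giving a FRiP of 0.6. The read at position 300 is excluded because the interval end is exclusive.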