
Last update: 20241217

CollectWgsMetrics (Picard)

Collects metrics about the coverage and performance of whole genome sequencing (WGS) experiments. The tool reports the fractions of reads that pass the base- and mapping-quality filters, as well as coverage (read-depth) levels for WGS analyses. The minimum base quality, the minimum mapping quality, and the maximum read depth (coverage cap) are all user defined.

Note: Metrics labeled as percentages (the PCT_* columns) are actually expressed as fractions!

Documentation: https://gatk.broadinstitute.org/hc/en-us/articles/360037269351-CollectWgsMetrics-Picard

Usage Example:

java -jar picard.jar CollectWgsMetrics \
       I=input.bam \
       O=collect_wgs_metrics.txt \
       R=reference_sequence.fasta 
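
The user-defined thresholds mentioned above can be passed explicitly. The following is a sketch rather than the command used in the study book: the argument names MINIMUM_MAPPING_QUALITY, MINIMUM_BASE_QUALITY, and COVERAGE_CAP are taken from the Picard documentation, the values shown (20, 20, 250) are what that documentation lists as defaults as far as we know, and the shell variable names and the picard.jar check are only illustrative.

```shell
# Illustrative threshold values (assumed to match Picard's defaults);
# adjust them to your experiment.
MIN_MAPQ=20    # reads with lower mapping quality count toward PCT_EXC_MAPQ
MIN_BASEQ=20   # bases with lower base quality count toward PCT_EXC_BASEQ
COV_CAP=250    # depth above this cap counts toward PCT_EXC_CAPPED

# Only run when picard.jar is present in the current directory,
# so the snippet is safe to paste and inspect.
if [ -f picard.jar ]; then
  java -jar picard.jar CollectWgsMetrics \
       I=input.bam \
       O=collect_wgs_metrics.txt \
       R=reference_sequence.fasta \
       MINIMUM_MAPPING_QUALITY="$MIN_MAPQ" \
       MINIMUM_BASE_QUALITY="$MIN_BASEQ" \
       COVERAGE_CAP="$COV_CAP"
else
  echo "picard.jar not found; skipping CollectWgsMetrics" >&2
fi
```

Keeping the thresholds in variables makes it easier to record which settings produced a given metrics file.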

We use this in our study book for:

  1. CollectWgsMetrics: 03b_collectwgsmetrics.sh -> study_book/qc_summary_stats (mapping, depth, and more).
| Metric | Summary |
|---|---|
| GENOME_TERRITORY | The number of non-N bases in the genome reference over which coverage will be evaluated. |
| MEAN_COVERAGE | The mean coverage in bases of the genome territory, after all filters are applied. |
| SD_COVERAGE | The standard deviation of coverage of the genome after all filters are applied. |
| MEDIAN_COVERAGE | The median coverage in bases of the genome territory, after all filters are applied. |
| MAD_COVERAGE | The median absolute deviation of coverage of the genome after all filters are applied. |
| PCT_EXC_ADAPTER | The fraction of aligned bases that were filtered out because they were in reads with mapping quality 0 and looked like adapter reads. |
| PCT_EXC_MAPQ | The fraction of aligned bases that were filtered out because they were in reads with low mapping quality (lower than MIN_MAPPING_QUALITY). |
| PCT_EXC_DUPE | The fraction of aligned bases that were filtered out because they were in reads marked as duplicates. |
| PCT_EXC_UNPAIRED | The fraction of aligned bases that were filtered out because they were in reads without a mapped mate pair. |
| PCT_EXC_BASEQ | The fraction of aligned bases that were filtered out because they were of low base quality (lower than MIN_BASE_QUALITY). |
| PCT_EXC_OVERLAP | The fraction of aligned bases that were filtered out because they were the second observation from an insert with overlapping reads. |
| PCT_EXC_CAPPED | The fraction of aligned bases that were filtered out because they would have raised coverage above COVERAGE_CAP. |
| PCT_EXC_TOTAL | The total fraction of aligned bases excluded due to all filters. |
| PCT_1X | The fraction of bases that attained at least 1X sequence coverage in post-filtering bases. |
| PCT_5X | The fraction of bases that attained at least 5X sequence coverage in post-filtering bases. |
| PCT_10X | The fraction of bases that attained at least 10X sequence coverage in post-filtering bases. |
| PCT_15X | The fraction of bases that attained at least 15X sequence coverage in post-filtering bases. |
| PCT_20X | The fraction of bases that attained at least 20X sequence coverage in post-filtering bases. |
| PCT_25X | The fraction of bases that attained at least 25X sequence coverage in post-filtering bases. |
| PCT_30X | The fraction of bases that attained at least 30X sequence coverage in post-filtering bases. |
| PCT_40X | The fraction of bases that attained at least 40X sequence coverage in post-filtering bases. |
| PCT_50X | The fraction of bases that attained at least 50X sequence coverage in post-filtering bases. |
| PCT_60X | The fraction of bases that attained at least 60X sequence coverage in post-filtering bases. |
| PCT_70X | The fraction of bases that attained at least 70X sequence coverage in post-filtering bases. |
| PCT_80X | The fraction of bases that attained at least 80X sequence coverage in post-filtering bases. |
| PCT_90X | The fraction of bases that attained at least 90X sequence coverage in post-filtering bases. |
| PCT_100X | The fraction of bases that attained at least 100X sequence coverage in post-filtering bases. |
| FOLD_80_BASE_PENALTY | The fold over-coverage necessary to raise 80% of bases to the mean coverage level. |
| FOLD_90_BASE_PENALTY | The fold over-coverage necessary to raise 90% of bases to the mean coverage level. |
| FOLD_95_BASE_PENALTY | The fold over-coverage necessary to raise 95% of bases to the mean coverage level. |
| HET_SNP_SENSITIVITY | The theoretical HET SNP sensitivity. |
| HET_SNP_Q | The Phred Scaled Q Score of the theoretical HET SNP sensitivity. |
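
Because the PCT_* metrics are stored as fractions, it can help to convert them when reading the output. The sketch below assumes the usual Picard metrics layout: a tab-separated header row directly after the line starting with "## METRICS CLASS", followed by a single row of values. The function name wgs_metrics_long is hypothetical and not part of Picard or the study book scripts.

```shell
# wgs_metrics_long FILE
# Print one "NAME<TAB>VALUE" line per metric from a CollectWgsMetrics
# output file, showing PCT_* fractions as percentages.
# Usage (file name taken from the example command): 
#   wgs_metrics_long collect_wgs_metrics.txt
wgs_metrics_long() {
  awk -F'\t' '
    # The column names are on the line after "## METRICS CLASS".
    /^## METRICS CLASS/ { state = 1; next }
    state == 1 { n = split($0, name, "\t"); state = 2; next }
    # The next line holds the values for this sample.
    state == 2 {
      for (i = 1; i <= n; i++) {
        if (name[i] ~ /^PCT_/)
          printf "%s\t%.2f%%\n", name[i], $i * 100   # fraction -> percent
        else
          printf "%s\t%s\n", name[i], $i
      }
      exit   # stop before the histogram section
    }
  ' "$1"
}
```

With this, a PCT_10X value of 0.9512 in the metrics file would be shown as 95.12%, while non-PCT columns such as MEAN_COVERAGE are printed unchanged.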