Table 2.
Metric | Atlas-Indel2 | GATK Unified Genotyper | SAMtools mpileup |
---|---|---|---|
Average INDELs/sample (Coding and Non-coding) | 23525 | 9648 | 26139 |
Average Coding INDELs/sample | 194 | 1947 | 1560 |
Average % 3(n) Coding INDELs/sample | 47.52 | 10.39 | 25.82 |
# Coding INDELs | 816 | 12027 | 12305 |
% 3(n) Coding INDELs | 38.11 | 7.78 | 23.84 |
# Non-coding INDELs | 19607 | 3441 | 28135 |
% 3(n) Non-coding INDELs | 14.06 | 9.79 | 17.19 |
Summary of INDELs called by Atlas-Indel2, GATK Unified Genotyper and SAMtools mpileup on 10 SOLiD samples (5 LWK, 5 CEU). The metrics compared are the average number of coding and non-coding INDELs per sample, the number of INDEL alleles merged across all 10 samples, and the percentage of 3(n) INDELs. 3(n) INDELs are INDELs whose length is a multiple of 3 and which therefore do not cause a frameshift in a coding region. Previous studies have reported that coding regions tend to harbor fewer frameshift-causing INDELs. Coding refers to the consensus exome target regions of the genome as defined by the 1000 Genomes consortium. Non-coding refers to all regions outside the exome target regions. In the merged call sets, INDELs found at the same site in different samples are merged into a single population VCF file. Individual sample results are shown in Additional file 2.
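
To make the % 3(n) metric concrete, the following minimal Python sketch computes it for one sample and for a merged call set. It is an illustration only, not the pipeline used to produce Table 2: the `IndelCall` record, the function names, and the rule of merging calls that share chromosome, position, and alleles are all assumptions made for this example.

```python
from typing import Dict, Iterable, List, NamedTuple, Set


class IndelCall(NamedTuple):
    """Hypothetical minimal record for one INDEL call (VCF-style REF/ALT)."""
    chrom: str
    pos: int
    ref: str
    alt: str


def indel_length(call: IndelCall) -> int:
    """Number of bases inserted or deleted by the call."""
    return abs(len(call.alt) - len(call.ref))


def percent_3n(calls: Iterable[IndelCall]) -> float:
    """Percentage of INDELs whose length is a multiple of 3.

    Assumes every input is a true INDEL (REF and ALT differ in length).
    """
    calls = list(calls)
    if not calls:
        return 0.0
    in_frame = sum(1 for c in calls if indel_length(c) % 3 == 0)
    return 100.0 * in_frame / len(calls)


def merge_calls(per_sample: Dict[str, List[IndelCall]]) -> Set[IndelCall]:
    """Merge per-sample calls so each distinct site/allele is counted once.

    Assumption: two calls are the same if chrom, pos, ref and alt all match.
    """
    merged: Set[IndelCall] = set()
    for calls in per_sample.values():
        merged.update(calls)
    return merged


# Toy data for illustration only; these are not calls from the study.
sample_a = [IndelCall("1", 100, "A", "ATTT"), IndelCall("1", 200, "ACG", "A")]
sample_b = [IndelCall("1", 100, "A", "ATTT"), IndelCall("2", 50, "AT", "A")]

print(percent_3n(sample_a))  # 50.0: one 3-bp insertion, one 2-bp deletion
merged = merge_calls({"sample_a": sample_a, "sample_b": sample_b})
print(len(merged))           # 3: the shared insertion is counted once
```

Because a shared INDEL is counted once in the merged set but once per sample in the per-sample averages, the merged and averaged % 3(n) values in the table need not agree.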