Author manuscript; available in PMC: 2017 Feb 4.
Published in final edited form as: Nature. 2016 Aug 4;536(7614):41–47. doi: 10.1038/nature18642

Extended Data Table 3. Counts and properties of variants identified in sequenced subjects.

a. Variant numbers for the 2,657 individuals with whole genome sequence data passing QC and included in the association analysis data set; b. Variant numbers for the 13,008 individuals passing initial rounds of QC, from which further QC defined the 12,940 subjects included in the association analysis data set. Private refers to variants seen in only a single ancestral group; cosmopolitan refers to variants seen in all five major ancestral groups.
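The sharing and frequency categories used in the table can be sketched in code. This is a minimal illustration only: the function names `sharing_class` and `frequency_class` and the label `intermediate` (for variants seen in two to four groups) are hypothetical, and assigning a MAF of exactly 0.5% or 5% to the low-frequency class is an assumption, since the table uses strict inequalities on both sides.

```python
# The five major ancestral groups listed in panel b of the table.
GROUPS = frozenset(
    {"African-American", "East-Asian", "European", "Hispanic", "South-Asian"}
)


def sharing_class(groups_with_variant):
    """Classify a variant by the ancestral groups in which it is observed."""
    seen = frozenset(groups_with_variant)
    if not seen or not seen <= GROUPS:
        raise ValueError("expected a non-empty subset of the five groups")
    if len(seen) == 1:
        return "private"        # seen in only a single ancestral group
    if seen == GROUPS:
        return "cosmopolitan"   # seen in all five ancestral groups
    return "intermediate"       # hypothetical label; not named in the table


def frequency_class(maf):
    """Bin a minor allele frequency (given as a fraction, e.g. 0.01 for 1%)."""
    if not 0.0 <= maf <= 0.5:
        raise ValueError("MAF must lie between 0 and 0.5")
    if maf < 0.005:
        return "rare"
    if maf > 0.05:
        return "common"
    # Assumption: boundary values (exactly 0.5% or 5%) fall in this bin.
    return "low frequency"
```

For example, `sharing_class(["European"])` returns `"private"` and `frequency_class(0.02)` returns `"low frequency"`.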

a
Genomes integrated panel

Variant type, N (% total):
    SNV 25.2M (94%)
    Indel 1.50M (5.6%)
    SV 8,876 (0.03%)

Function, N (% total):
    Coding 888K (3.3%)
    Non-coding 25.8M (97%)

Frequency spectrum, N (% total):
    Rare (MAF<0.5%) 6.26M (23%)
    Low frequency (0.5%<MAF<5%) 4.16M (16%)
    Common (MAF>5%) 16.3M (61%)

dbSNV, N (% total):
    b137 14.6M (55%)
    Novel 12.1M (45%)
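As a rough cross-check of panel a, the percentage shares by variant type can be recomputed from the reported counts. This is only a sketch: the counts are the rounded values printed in the table, so the recomputed shares are approximate, and the helper name `percent_shares` is hypothetical.

```python
# Rounded variant counts by type, as printed in panel a.
VARIANT_COUNTS = {"SNV": 25.2e6, "Indel": 1.50e6, "SV": 8876}


def percent_shares(counts):
    """Return each category's share of the total, in percent."""
    total = sum(counts.values())
    return {name: 100.0 * n / total for name, n in counts.items()}


SHARES = percent_shares(VARIANT_COUNTS)
```

Rounded to the precision used in the table, the recomputed shares agree with the printed 94%, 5.6%, and 0.03%.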
b
Exome sequence data
All samples African-American East-Asian European Hispanic South-Asian

Samples: 13,008 2,086 2,165 4,579 1,959 2,219
    T2D cases 6,504 1,018 1,012 2,359 1,021 1,094
    T2D controls 6,436 1,056 1,153 2,182 922 1,123
Excluded from association analysis 68 12 0 38 16 2

Coverage:
    Coding:
        Mean (Mc) per gene 81.7 ±23.7 83.2 ±24.0 84.6 ±23.8 78.6 ±23.3 83.8 ±24.1 78.2 ±23.2
        # of genes with Mc <20 368 302 302 351 269 325
    Non-coding:
        Mean per gene 59.0 ±21.0 60.9 ±21.5 62.2 ±21.6 57.5 ±20.6 59.2 ±21.2 55.4 ±20.3
        # of genes with Mc <20 1,150 738 731 1,102 804 945

Variant annotations:
    Synonymous SNV 627,630 237,430 178,232 192,282 156,231 211,218
    Missense SNV 1,110,897 354,797 296,707 327,049 231,351 344,191
    Start SNV 2,055 593 523 639 384 583
    Nonsense SNV 26,321 7,188 6,668 8,030 4,660 7,339
    Frameshift INDEL 26,901 6,605 6,159 7,515 4,155 6,609
    Inframe INDEL 11,090 3,471 2,963 3,145 2,068 3,165
    3′UTR SNV, INDEL 65,013 24,583 19,149 21,102 16,959 22,177
    5′UTR SNV, INDEL 43,965 16,920 13,520 15,562 11,634 15,595
    Intron SNV, INDEL 931,449 352,398 270,564 296,970 243,139 314,810
    Essential splicing SNV, INDEL 14,286 3,648 3,454 4,108 2,301 3,744
    Other splicing SNV, INDEL 128,644 45,876 35,413 38,263 30,301 41,122
    Non-coding RNA SNV, INDEL 18,113 7,247 5,996 6,715 5,084 6,706
    Intergenic SNV, INDEL 37,345 14,335 11,498 13,614 10,700 12,937
All 3,043,709 1,075,091 850,846 934,994 718,967 990,196

Coding frequency spectrum:
    Rare (MAF<0.5%) 95.79% 83.30% 90.06% 89.19% 84.56% 89.89%
        private 77.93% 53.79% 65.47% 51.80% 37.26% 61.55%
        cosmopolitan 0.35% 1.80% 3.02% 1.88% 2.24% 1.73%
    Low frequency (0.5%<MAF<5%) 2.57% 10.36% 4.61% 5.52% 8.21% 5.10%
        private 0.17% 1.43% 1.10% 0.26% 0.52% 1.02%
        cosmopolitan 0.60% 1.50% 1.54% 1.94% 2.74% 1.62%
    Common (MAF>5%) 1.65% 6.35% 5.33% 5.29% 7.23% 5.00%
        private 0.09% 0.00% 0.00% 0.00% 0.01% 0.00%
        cosmopolitan 1.50% 4.35% 5.17% 4.97% 6.88% 4.86%

Intron/UTR frequency spectrum:
    Rare (MAF<0.5%) 94.09% 78.68% 86.91% 86.17% 81.43% 86.68%
        private 74.76% 49.81% 61.36% 45.26% 31.03% 56.96%
        cosmopolitan 0.46% 2.07% 3.98% 2.49% 2.66% 2.19%
    Low frequency (0.5%<MAF<5%) 3.52% 12.57% 5.63% 6.51% 9.43% 6.32%
        private 0.25% 1.74% 1.25% 0.29% 0.47% 1.18%
        cosmopolitan 0.80% 1.81% 2.11% 2.53% 3.30% 2.17%
    Common (MAF>5%) 2.39% 8.76% 7.46% 7.32% 9.14% 7.00%
        private 0.15% 0.00% 0.00% 0.01% 0.00% 0.00%
        cosmopolitan 2.17% 5.94% 7.26% 6.93% 8.77% 6.81%
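The sample counts and annotation totals in panel b can be checked for internal consistency with a short script. This is a sketch under stated assumptions: the numbers are transcribed from the table, and the helper names (`sample_rows_consistent`, `ancestry_columns_sum_to_all`) are hypothetical. With the values as printed, cases plus controls plus excluded samples equal the sample total in every column, the five ancestry columns sum to the "All samples" column, and the annotation categories sum to the "All" row for all samples.

```python
# Each entry: (total samples, T2D cases, T2D controls, excluded from analysis).
SAMPLES = {
    "All samples": (13008, 6504, 6436, 68),
    "African-American": (2086, 1018, 1056, 12),
    "East-Asian": (2165, 1012, 1153, 0),
    "European": (4579, 2359, 2182, 38),
    "Hispanic": (1959, 1021, 922, 16),
    "South-Asian": (2219, 1094, 1123, 2),
}

# "All samples" column of the variant annotation rows, in table order
# (Synonymous SNV through Intergenic SNV, INDEL).
ANNOTATION_COUNTS_ALL = [
    627630, 1110897, 2055, 26321, 26901, 11090, 65013,
    43965, 931449, 14286, 128644, 18113, 37345,
]
ANNOTATION_TOTAL_ALL = 3043709  # the "All" row for all samples


def sample_rows_consistent(samples):
    """Check cases + controls + excluded == total for every column."""
    return {
        name: cases + controls + excluded == total
        for name, (total, cases, controls, excluded) in samples.items()
    }


def ancestry_columns_sum_to_all(samples):
    """Check that the five ancestry columns add up to the 'All samples' column."""
    per_group = [v for k, v in samples.items() if k != "All samples"]
    sums = tuple(sum(row[i] for row in per_group) for i in range(4))
    return sums == samples["All samples"]
```

Note that the 12,940 subjects in the association analysis correspond to the 6,504 cases plus 6,436 controls, that is, 13,008 minus the 68 excluded samples.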