Table 1 |.
Population and consortium (if applicable) | Number of individuals | Data type | Total novel sequence reported | Average per individual | Additional requirements | Publication year | Refs |
---|---|---|---|---|---|---|---|
Swedish, SweGen | 1,000 (subset of 2) | Short read (long read) | 46 Mb (17.3 Mb) | 0.6 Mb (12.1 Mb) | Over 300 bp (over 100 bp) | 2019 (2018) | 18,79 |
Han Chinese | 275 | Short read | 29.5 Mb | ~5 Mb fully unaligned + ~6 Mb partially unaligned to reference | Over 500 bp | 2019 | 69 |
Mixed, TOPMed | 53,831 | Short read | 2.2 Mb | 0.2–0.5 Mb | Must align to a hominid genome | 2019 | 65 |
Mixed | 154 | BioNano maps, linked reads (10X Genomics) | 60 Mb | 14.2 Mb | >2 kb | 2019 | 71 |
Mixed | 15 | Long read | 21.3 Mb | 6.4 Mb | Not in peri-centromeric regions, over 50 bp | 2019 | 68 |
African ancestry, Consortium on Asthma among African-ancestry Populations in the Americas | 910 | Short read | 296.5 Mb | 2.5 Mb | >1 kb | 2019 | 28 |
Mixed | 17 | Linked reads (10X Genomics) | 2.1 Mb | 0.71 Mb | Breakpoint resolved, over 50 bp of non-repetitive content per sequence | 2018 | 73 |
Icelandic | 15,219 | Short read | 0.33 Mb | 0.16 Mb | Non-repetitive, breakpoint resolved | 2017 | 15 |
Danish, Danish Genome Project | 150 | Short read | >15,000 insertions<sup>a,b</sup> | Not reported | >50 bp | 2017 | 17 |
Dutch, Genome of the Netherlands | 769 | Short read | 4.3 Mb | Not reported | >150 bp | 2016 | 70 |
Mixed | 10,545 | Short read | 3.26 Mb | 0.7 Mb | Non-repetitive, >200 bp | 2016 | 25 |
Mixed, data from 1KGP | 45 | Short read | 61.6 Mb | 17,700–20,500 insertions<sup>a,c</sup> | No size or other restrictions reported | 2016 | 74 |
Mixed, Simons Genome Diversity Project | 300 | Short read | 5.8 Mb (13 Mb with repetitive elements) | Not reported | Non-repetitive, >500 bp | 2016 | 24 |
Japanese, Tohoku Medical Megabank Organization | 1,070 | Short read | 9,354 insertions<sup>a</sup> | 45 insertions<sup>a</sup> | >1 kb | 2015 | 72 |
1KGP, 1000 Genomes Project; TOPMed, Trans-Omics Precision Medicine.
<sup>a</sup>Did not report number of bases.
<sup>b</sup>Estimates separated into the average number of contiguous sequences per population with at least a partial match.
<sup>c</sup>The 61.6 Mb reported was based on 30,879 insertions.