2018 Mar 14;27(R1):R14–R21. doi: 10.1093/hmg/ddy081

Table 1.

Selected biobanks with linked EHRs and genetic data in ≥50 000 participants, listed in descending order of the number of samples with available genetic data

| Cohort | Country | Institution or company [a] | Cohort size [b,c] | Samples with matched EHR and genetic data available [b,d] | Access | References |
| --- | --- | --- | --- | --- | --- | --- |
| UK Biobank (UKBB) | United Kingdom | UK Biobank charity | 500 000 | 488 377 genotyped | Application for bona fide researcher | (9,52) |
| deCODE Genetics | Iceland | Amgen | >350 000 | >350 000 | Contact to collaborate | (4,53,54) |
| Million Veteran Program (MVP) | USA | Department of Veterans Affairs | >500 000 | >350 000 | Contact to collaborate | (12,13) |
| BioBank Japan Project | Japan | Pharmacogenomics Research Network | 200 000 | 162 255 genotyped | Contact to collaborate | (34,55) |
| China Kadoorie Biobank | China | University of Oxford, Chinese Academy of Medical Sciences | 510 000 | >130 000 | Application for bona fide researcher | (56,57) |
| Kaiser Permanente Research Bank our-research/for-researchers/ | USA | Kaiser Permanente | 270 570 | 102 998 genotyped | Application for bona fide researcher | (7,58) |
| eMERGE Network | USA | NHGRI | 105 325 | 83 717 | Application for eMERGE affiliate membership | (6,18) |
| Danish Biobank Register | Denmark | Danish National Biobank | 5.7 million | >70 000 | Application for bona fide researcher | (59) |
| Nord-Trøndelag Health Study (HUNT) | Norway | Norwegian University of Science and Technology | 120 000 | 69 037 genotyped | Application and collaboration with PI affiliated with a Norwegian research institute | (15,60) |
| DiscovEHR | USA | Geisinger Health System, Regeneron Genetics Center | 50 000 | >50 000 exome sequences | Contact to collaborate | (43,61) |

a. Main institution responsible for the resource; many other institutions may provide funding or support.

b. Sample size as of January 2018. Where up-to-date sample sizes were difficult to find, sample sizes from recent publications were used.

c. Unique number of participants with some type of data available (52–61).

d. Actual samples available for analysis may be fewer owing to quality control. The number includes both sequencing and genotyping, with the type of data described when possible.