Deep exome resequencing is a powerful approach for delineating patterns of protein-coding variation among genes, pathways, individuals and populations. We analyzed exome data from 2,440 individuals of European and African ancestry as part of the National Heart, Lung, and Blood Institute’s Exome Project, the aim of which is to discover novel genes and mechanisms that contribute to heart, lung and blood disorders. Each exome was sequenced to a mean coverage of 116×, allowing detailed inferences about the population genomic patterns of both common variation and rare coding variation. We identified more than 500,000 single nucleotide variations, the majority of which were novel and rare (76% of variants had a minor allele frequency of less than 0.1%), reflecting the recent dramatic increase in the size of the human population. The unprecedented magnitude of this dataset allowed us to rigorously characterize the large variation in nucleotide diversity among genes (ranging from 0 to 1.32%), as well as the role of positive and purifying selection in shaping patterns of protein-coding variation and the differential signatures of population structure from rare and common variation. This dataset provides a framework for personal genomics and is an important resource that will allow inferences of broad importance to human evolution and health.
2011 Sep 19;12(Suppl 1):I1. doi: 10.1186/gb-2011-12-s1-i1
Analysis of 2,440 human exomes highlights the evolution and functional impact of rare coding variation
Joshua Akey 1,✉
1Department of Genome Sciences, University of Washington, Seattle, WA 98195, USA
✉ Corresponding author.
Supplement: Beyond the Genome 2011 (http://genomebiology.com/supplements/12/S1)
Conference: Beyond the Genome 2011, 19-22 September 2011, Washington, DC, USA
Issue date 2011.
Copyright ©2011 Akey; licensee BioMed Central Ltd.
This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
PMCID: PMC3439047 PMID: 22607170