Abstract
A paradigm shift towards biology occurred in the 1990s and was subsequently catalyzed by the sequencing of the human genome in 2000. The cost of DNA sequencing has gone from millions to thousands of dollars, with sequencing of one’s entire genome costing only $1,000. Rapid DNA sequencing is being embraced for single gene disorders, particularly for sporadic cases and those from small families. Transmission of lethal genes, such as that associated with Huntington’s disease, to one’s offspring can be avoided through in vitro fertilization. DNA sequencing will meet the challenge of elucidating the genetic predisposition for common polygenic diseases, especially in determining the function of the novel common genetic risk variants and identifying the rare variants, which may also partially ascertain the source of the missing heritability.
The challenge for DNA sequencing remains great. Although human genome sequences are 99.5% identical, each genome carries about 3 million single nucleotide polymorphisms (SNPs), which are responsible for most of its unique features, and acquires about 60 new mutations per generation, which for 7 billion people amounts to 420 billion mutations. It is claimed that DNA sequencing capacity has increased 10,000-fold while information storage and retrieval has increased only 16-fold. The physician and health user will be challenged by the convergence of two major trends: whole genome sequencing and the storage, retrieval and integration of the data.
Keywords: Genomics, Sequence, Data
Historical Perspective
Captain Cook wrote in his log upon reaching Australia that “I have not only travelled farther than any other man, but I have travelled as far as man can travel” (1). Thus, by the 18th Century, all the continents had now been discovered and named. It appeared logical and perhaps appropriate for mankind to pursue the inner treasures of the planet. This coincided with the industrial revolution which led to the harnessing of energy from coal, electricity and oil as well as the discovery of all the marvelous elements including uranium which enabled many human endeavors, from cancer therapy to the invention of the atomic bomb. While this trend continues, in the 1990’s a major worldwide shift occurred in which mankind became interested in the inner workings of human biology. The word Biology is today often associated with excitement and activity, not just in science but also in medicine and commerce. This revolutionary concept received a major boost with the sequencing of the human genome in 2000 (2). In fact, sequencing of the human genome may be to the 21st Century as invention of the vowels and development of democracy was to the 6th Century BC or the industrial revolution was to the 18th Century.
The Human Genome – new developments
The double stranded human genome of each cell contains 6.4 billion nucleotides. While proteins are the molecules that do the work, only about 1% of the human genome sequences are designated to encode mRNAs for protein coding (3). Until recently, most of DNA was considered junk (3), but we now know that virtually all of DNA is transcribed into RNA (3). The Encyclopedia of DNA Elements (ENCODE) project has enabled us to assign biochemical functions for 80% of the genome (4). It is of note that only a small proportion of the transcribed RNAs are translated into protein with the remainder performing a host of functions, affecting those sequences (genes) that encode for protein. These RNAs that do not code for protein are as a group referred to as non-coding RNA (ncRNA). Most genes coding for protein are in some way regulated by these ncRNAs (5). These ncRNAs are very promiscuous—each RNA can affect multiple different genes on the same or different chromosomes.
The Source of Human Genetic Biodiversity
All genomes from all species share most of their DNA sequences, having acquired them over a 3.8 billion year evolutionary history since the origin of life. Despite the common sequence ancestry, each individual genome within each species has maintained itself as unique. The development of biodiversity and of the unique sequences of each genome, whether within or between species, is due primarily to errors in the process of copying DNA. Copying errors during the replication of one’s DNA induce primarily single base changes through substitution of one base (nucleotide) for another (e.g. thymine for adenine). These substitutions are passed on from generation to generation and are referred to as single nucleotide polymorphisms (SNPs). SNP substitutions account for 94% of the errors from copying or replicating DNA, while deletions of 1-4 bp account for 4.5%, and the remainder are due to insertions of 1-4 bp (6;7). Other types of DNA variation exist, such as chromosomal rearrangements, duplications (copy number variants) and translocations. The mutations induced by DNA copying errors, if beneficial, are conserved and their frequency increases, while deleterious mutations remain rare or are eliminated. Fortunately, many of these SNPs have modest to minimal effects or are neutral. The human DNA (6 billion bases) replicates itself every few days, and although it makes only 1 error per 1 billion bases created, it can accumulate a significant number of mutations over generations. Kruglyak (8) estimates that, with a mutation rate of 2×10⁻⁸ per base pair per generation and a human genome of over 3 billion base pairs, each genome carries 60 new mutations per generation. The world population of 7 billion thus has 420 billion new mutations in the current generation.
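The arithmetic behind these estimates can be checked with a short sketch. The figures are the rounded values quoted above (a mutation rate of 2×10⁻⁸ per base pair per generation, a genome of about 3 billion base pairs, and a population of 7 billion); the constant and function names are illustrative only.

```python
# Rounded values quoted in the text; all names here are illustrative.
MUTATION_RATE = 2e-8      # new mutations per base pair per generation
GENOME_SIZE_BP = 3e9      # approximate genome size in base pairs
WORLD_POPULATION = 7e9    # approximate world population


def new_mutations_per_genome(rate, genome_size_bp):
    """Expected new mutations in one genome per generation."""
    return rate * genome_size_bp


per_person = new_mutations_per_genome(MUTATION_RATE, GENOME_SIZE_BP)
population_total = per_person * WORLD_POPULATION

print(f"{per_person:.0f} new mutations per genome per generation")
print(f"{population_total / 1e9:.0f} billion new mutations in the population")
```

Because every input is a rounded estimate, the result should be read as an order-of-magnitude figure rather than a precise count.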
The genetic diversity of mankind is exemplified by the observation that the exons (protein coding regions) of each individual genome, collectively referred to as the exome, encompass ~13,000 non-synonymous and ~7,000 potentially functional variants, posing considerable challenges in the identification of disease causing DNA sequence variants (DSVs) (9;10). Despite the sequence of the human genome being 99.5% identical, the remaining 0.5% is more than adequate to provide each of us with a unique genome that, until sequenced, will have many hidden surprises. Current knowledge indicates there are 3 million SNPs per genome, which account for over 80% of human phenotype variation, whether it be the color of one’s eyes or the susceptibility to disease (11).
The Search for Disease Related Genes
A major goal is to identify DNA regions that predispose or cause cardiovascular disease. This refers to the ongoing studies that correlate physical or biochemical features (phenotype) to that of the genotype. Defining the phenotype precisely is fundamental to the discovery of the associated or causal genotype. The role of the clinician in detecting the phenotype has been crucial to this pursuit and will continue to be even more so as we further refine and specify sub-phenotypes. DNA can be obtained from the blood, other body fluids such as saliva or body tissue. The approach to identify the causal genes and variants has evolved dramatically over the past three decades. The conventional approach of genetic linkage analysis in large families, which was very successful in linking causal DNA mutations to rare single gene disorders, has all but been replaced with the newer approaches of Genome Wide Association Studies (GWAS) and Next Generation DNA Sequencing (NGS) in small families and individual cases. The newer approaches not only have partially overcome a major limitation of genetic linkage in identifying the causal variant in small size families but also have afforded the opportunity to identify the causal alleles in sporadic cases with single gene diseases and the susceptibility (risk) alleles in those with the complex phenotypes.
Single Gene Disorders - the success of genetic linkage analysis
Single gene disorders are the phenotypic consequences of rare DSVs that impart large effect sizes. The mutation is both necessary and sufficient to induce the disease. Familial Hypertrophic Cardiomyopathy (HCM) was the first cardiovascular single gene disorder for which the responsible mutation was discovered. The responsible mutation was a missense mutation in the gene that encodes the beta-cardiac myosin heavy chain (12). Introducing the human mutant gene as a transgene induced the disease in both the mouse (13) and the rabbit (14). While the rare variant is sufficient to cause the disease, there is often variable expressivity (severity of the phenotype), determined by other genetic and non-genetic factors. The conventional approach for mapping the chromosomal location (locus) of the gene responsible for a single gene disorder has been genetic linkage analysis. In this technique, DNA of members of a 2 to 3 generation pedigree affected with the disease are genotyped using a few hundred short tandem repeat DNA markers. DNA markers that are inherited more commonly than by chance by the affected members of the family indicate the markers are in close physical proximity to the DNA region containing the responsible gene. Sequencing of candidate genes at the mapped locus usually identifies the causal variant. This approach has been exceedingly successful in mapping the causal genes for various single gene disorders, typically in large and moderate size families. It is estimated there are about 6,000 single gene disorders of which causative genes have been discovered for over 3,500 (15). Accordingly, several dozen genes for hereditary cardiomyopathies, including dilated, hypertrophic and arrhythmogenic cardiomyopathies; hereditary arrhythmias, such as atrial fibrillation, long QT syndromes, short QT syndromes, and catecholaminergic polymorphic ventricular tachycardia; and cardiac conduction defects have been identified (16). 
In addition to linkage analysis, the candidate gene approach, guided by the biological and functional similarities between the known causal genes and the candidate gene, has been used to screen and identify new causal genes for single gene disorders. Both approaches are limited by not offering sufficient resolution to identify the causal genes in small families or in sporadic cases.
Single Gene Disorders – DNA sequencing, a paradigm shift
The advent of next generation sequencing (NGS) platforms has eased one of the bottlenecks to complete elucidation of the genetic causes of single gene disorders (9), including those occurring in small families or sporadically, and NGS has emerged as the preferred method. The unbiased approach of whole exome sequencing (WES), which sequences all of the exons in the genome, or whole genome sequencing (WGS) enables identification of all DSVs and hence offers the opportunity to discover not only the causal variants but also modifier variants that influence phenotypic expression of the disease.
The NGS technologies are based on parallel sequencing of millions of DNA fragments simultaneously. The sequencing reads are relatively short, typically comprising 35 to 100 bases, but can be as long as 1,000 bases depending on the platform. The reads are aligned with the reference sequence and multiple reads of the same DNA fragments are compared to identify the variants. The existing technologies afford the opportunity to generate up to ~600 billion bases (Gb) of sequence per run in about one to two weeks. Given that each genome is ~3.2 billion base pairs (Gbp) and each exome is ~30 million bp (Mbp), such platforms afford the opportunity to sequence one genome or a dozen or so exomes at a high mean ‘coverage rate’ (100×). The coverage rate refers to the number of times each DNA fragment is sequenced and mapped to the reference sequence. A new approach to sequencing is being developed based on “Nanopore Technology”, whereby a pore is small enough to enable only a single strand of DNA to pass through it. Detection of the specific nucleotide is based on the changes in conductivity as each DNA (or RNA) nucleotide passes through the pore (17). There is no need for fluorescence or chemicals, hence it should be relatively inexpensive. Oxford Nanopore Technologies recently announced the generation of a plastic pore with an attached enzyme that pulls the single strand of DNA through at a given speed. It is estimated that 25,000 of these pores would fit into the diameter of a human hair. The simultaneous operation of a large number of nanopores makes it possible to sequence a human genome within hours at less than $1,000 per genome. The machine would be a small laptop-sized device and also relatively inexpensive. The company has announced that it will deliver testing machines before the end of 2012, and mass production is expected in 2013.
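The coverage rate defined earlier in this section can be sketched as a simple ratio of sequenced bases to target size. The sketch below is an idealization: the `usable_fraction` parameter is a hypothetical knob introduced here to stand for reads lost to duplication, failed mapping or off-target capture, and its default of 1.0 gives an upper bound rather than a realistic yield.

```python
def mean_coverage(total_bases, target_size_bp, n_samples=1, usable_fraction=1.0):
    """Idealized mean depth of coverage per sample.

    total_bases     -- bases produced by one sequencing run
    target_size_bp  -- size of the genome or exome being sequenced
    n_samples       -- samples sharing the run
    usable_fraction -- hypothetical share of bases that map uniquely on target
    """
    return (total_bases * usable_fraction) / (target_size_bp * n_samples)


RUN_YIELD_BASES = 600e9   # ~600 Gb per run, as quoted in the text
GENOME_SIZE_BP = 3.2e9    # ~3.2 Gbp, as quoted in the text

upper_bound = mean_coverage(RUN_YIELD_BASES, GENOME_SIZE_BP)
print(f"Upper-bound mean coverage for one genome: {upper_bound:.1f}x")
```

In practice the usable fraction is well below 1, particularly for exome capture, so achievable depth per sample is lower than this ratio suggests.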
The most commonly used approach is sequencing of the approximately 180,000 protein coding exons in the 21,000 genes in the genome, which encompass approximately 30 Mbp of genomic DNA. The approach is referred to as WES, as opposed to WGS, wherein the entire genome is sequenced. In view of the large number of DNA sequence variants in each exome/genome, skilful interpretation of the genetic data utilizing various bioinformatics and genetic resources, as well as exquisite phenotyping, is necessary to reduce the number of putative causal variants. WES has other shortcomings, including incomplete capture and inadequate coverage (per read) of all exons, as well as incorrect mapping of the reads. In general, approximately 500 of the 21,000 genes may not be correctly sequenced due to inherent errors in WES. For medical sequencing, i.e., genetic testing, all DSVs identified by the NGS platforms should be validated either by repeat independent NGS reactions, Sanger sequencing or at least by genotyping.
The NGS platforms have already been successful for many Mendelian disorders (9), as shown in Table 1. Utilizing this approach, TTN, encoding the giant protein Titin, was identified as a major causal gene for hereditary and sporadic dilated cardiomyopathy (18). While WES and WGS are useful for identification of the causal genes/variants in small families, robust study design is necessary to filter out the large number of variants that typically segregate with the phenotype in small families, which renders identification of the true causal variants challenging. Various study designs and approaches have been suggested to strengthen the likelihood of success (9). Various bioinformatics programs, such as PolyPhen2 and SIFT, as well as genetic databases, such as the NHLBI Exome Sequencing Project and 1000 Genomes, are available to filter the DNA sequence variants identified by WES or WGS experiments, which restricts the number of putative candidate causal genes.
TABLE 1.
Single Gene Disorder | Gene | Function | Ref |
---|---|---|---|
Familial dilated cardiomyopathy | TTN | Titin is a large sarcomere protein spanning half of sarcomere length | (18) |
Familial and sporadic dilated cardiomyopathy | BAG3 | Co-chaperone protein | (20) |
Autosomal recessive dilated cardiomyopathy | GATAD1 | GATA zinc finger domain containing protein 1 transcription factor | (21) |
Hypertrophic cardiomyopathy (mitochondrial) | MRPL3 | Abnormal assembly of mitochondrial respiratory chain | (22) |
Cantú syndrome (Patent ductus arteriosus, cardiac hypertrophy, pulmonary hypertension and pericardial effusion in conjunction with non-cardiac manifestations) | ABCC9 | An ATP-sensitive potassium channel | (23) |
Thoracic aortic aneurysm | SMAD3 | Signal transducer of TGF-beta | (24) |
Familial pheochromocytoma | MAX | Neural crest development | (25) |
Congenital heart defects | MYH6 | Sarcomeric myofibril formation | (26) |
Determining the causal mutation in autosomal recessive disorders is facilitated by the fact that the causal mutation must be homozygous to induce the disease as opposed to heterozygous in autosomal dominant disease. In autosomal dominant disease, WES typically leads to identification of several dozen putative candidates that co-segregate with the phenotype in small or medium size families and hence, it is difficult to discern the causal variant. Despite the advantage of ascertaining the significance of polymorphisms within families, there will remain many polymorphisms which cannot be annotated definitively as causative for disease. While techniques such as bioinformatics and filtering mechanisms can reduce the number of putative causal variants, for some it will ultimately require extensive in vitro and in vivo studies to delineate biological and functional significance of these variants. Identification of non-synonymous variants by WES has the advantage of being in a protein coding region, which considerably facilitates functional analysis and the search for a corresponding phenotype. These points have been discussed in greater detail in a recent review by Marian (19).
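The filtering logic described above can be illustrated with a minimal sketch. It assumes a simplified variant record and hypothetical gene names and genotypes; the frequency cut-off, the set of functional effect labels and the segregation rules (which ignore incomplete penetrance) are illustrative assumptions, not the criteria of any particular study or software package.

```python
from dataclasses import dataclass


@dataclass
class Variant:
    gene: str                    # hypothetical gene label
    effect: str                  # e.g. "missense", "synonymous", "nonsense"
    population_maf: float        # frequency in a reference database
    affected_genotypes: tuple    # variant allele count (0, 1 or 2) per affected relative
    unaffected_genotypes: tuple  # variant allele count per unaffected relative


# Assumed set of effect labels treated as potentially functional.
FUNCTIONAL_EFFECTS = {"missense", "nonsense", "frameshift", "splice_site"}


def is_candidate(variant, mode, max_maf=0.01):
    """Keep rare, potentially functional variants that segregate with disease."""
    if variant.effect not in FUNCTIONAL_EFFECTS:
        return False
    if variant.population_maf > max_maf:
        return False
    if mode == "recessive":
        # Affected relatives must be homozygous; unaffected relatives must not be.
        return (all(g == 2 for g in variant.affected_genotypes)
                and all(g < 2 for g in variant.unaffected_genotypes))
    if mode == "dominant":
        # Affected relatives carry at least one copy; unaffected carry none.
        return (all(g >= 1 for g in variant.affected_genotypes)
                and all(g == 0 for g in variant.unaffected_genotypes))
    raise ValueError(f"unknown inheritance mode: {mode}")


# Entirely hypothetical example data.
variants = [
    Variant("GENE_A", "missense", 0.0005, (1, 1, 1), (0, 0)),
    Variant("GENE_B", "synonymous", 0.0001, (1, 1, 1), (0, 0)),
    Variant("GENE_C", "nonsense", 0.0, (2, 2, 2), (1, 0)),
    Variant("GENE_D", "missense", 0.12, (1, 2, 1), (0, 0)),
]

dominant_hits = [v.gene for v in variants if is_candidate(v, "dominant")]
recessive_hits = [v.gene for v in variants if is_candidate(v, "recessive")]
print("Dominant candidates:", dominant_hits)
print("Recessive candidates:", recessive_hits)
```

The recessive rule is the stricter of the two, which reflects why fewer candidates typically survive filtering in autosomal recessive disorders than in autosomal dominant ones.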
DNA Sequencing as a Genetic Screen for Single Gene Disorders
A targeted subgenomic sequencing approach, as opposed to WES, may be used to screen for mutations in the known genes for single gene disorders (27). However, the approach is restricted to the known genes and does not lend itself to identification of novel genes. It might also be used to identify double or triple causal mutations and as a part of cascade screening of family members. Cascade screening refers to genetic testing of family members of a proband in whom the causal mutation has already been identified. Cascade screening may entail simple genotyping for the presence of the specific mutation, Sanger sequencing, subgenomic sequencing and even WES. While the latter seems excessive for cascade screening and is currently not covered by insurance companies, it affords the opportunity to identify additional mutations that might contribute to the phenotype and to define the genetic structure of the individual. Technical aspects of WES as a genetic screening tool in autosomal dominant diseases are similar to those relevant to gene discovery by NGS. Typically, a much higher coverage is demanded for medical sequencing than for gene discovery studies.
Genetics of Coronary Artery Disease – an archetypical polygenic disorder
It has been recognized for some time (28) that genetic predisposition to common diseases such as CAD would be due to multiple common genes, each with minimal to modest effect on the phenotype. In polygenic disorders, unlike single gene disorders, no one gene is sufficient or necessary to induce the phenotype (29). Genetic linkage analysis, which utilizes a few hundred DNA markers, lacks the necessary resolution to identify the predisposing genes in polygenic disorders. It was recognized that the case-control association study would be the better approach, but that it would require hundreds of thousands of DNA markers to span the genome, which were not available (30). In 2005, HapMap annotated the chromosomal location of millions of SNPs (31), which provided the necessary DNA markers to perform genome-wide association studies (GWAS). At the same time, platforms for high-throughput genotyping were developed (29;32), which enabled mapping of the first genetic variant for CAD, 9p21, in 2007 (33;34). This was followed by one of the largest collaborative efforts in cardiology (35), CARDIoGRAM, which involved 2 continents and a sample size of 143,000 dedicated to mapping genes for CAD, and then by CARDIoGRAMplusC4D with a sample of 193,000. In just five years, 36 genetic variants have been confirmed to be associated with increased risk for CAD (36). Each of these 36 genetic risk variants for CAD was confirmed in populations independent of the discovery population and most recently underwent a meta-analysis in a total sample size of 190,000 (37). Given this sample size, it is very unlikely that even one of these loci is a false positive (38). It is important to realize that the DNA risk region is indicated by a SNP. This SNP serves as a marker and in most cases is not the SNP causing the disease risk. Thus, the actual sequence responsible for the risk is in most cases yet to be identified, but its identification will be markedly facilitated by the availability of rapid and inexpensive sequencing.
Furthermore, most of the SNPs (23 of 36) mediate their risk independent of known risk factors (e.g. hypertension and cholesterol) through mechanisms as yet unknown. Functional analysis of the independent 23 risk variants for CAD is currently being pursued. Functional analysis is confounded by the observation that most of these SNPs are in non-protein coding regions. Determining the function and identifying the polymorphism will be extremely difficult since the effect of any one risk variant is small and its specific intermediary phenotype that contributes to coronary atherosclerosis or myocardial infarction is unknown. Functional analysis is further confounded by the many contributing components to atherosclerosis such as macrophage formation, plaque rupture, platelet adhesiveness or thrombosis to name just a few. One approach to function is the pursuit of network modeling techniques (39) (40) (41) in an attempt to identify DNA, RNA, and protein pathways that involve the DNA region containing the disease associated DNA marker. This is pursued along with conventional analysis of in vitro (cells) and in vivo (animal) expression studies.
The common risk variants for CAD discovered by GWAS have several features in common, as listed below. These features are discussed in detail in a recent review (36).
The common genetic risk variants occur frequently, with ten of the variants occurring in ≥75% of the population and half of them in ≥50% of the population.
The risk effect per variant is small, averaging a risk increase of about 18%.
Ten of the variants act through known conventional risk factors: seven through cholesterol (SORT1, PCSK9, LPA, ZNF259/APOA5, TRIB1, APOE, ABCG8, LDLR); two through hypertension (CYP17A1 and SH2B3); and the ABO locus (9q34) through increased propensity for coronary thrombosis.
Two-thirds of the genetic risk variants act through mechanisms independent of conventional risk factors.
Most of the SNPs signaling a risk variant are found in non-protein coding regions.
Risk is proportional to the total number of risk variants inherited by an individual, rather than a specific risk variant.
In our analysis of 23 risk variants for CAD, we observed that while the maximum present in any one individual could be 46, the average was 17 with a maximum observed of 26 and a minimum of 7.
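The notion that risk tracks the total number of risk variants inherited can be sketched as a simple unweighted count of risk alleles. The genotype vector below is hypothetical and the function name is illustrative; each entry is the number of risk alleles (0, 1 or 2) carried at one of the 23 variants, so the maximum possible count is 2 × 23 = 46.

```python
N_VARIANTS = 23                     # risk variants analyzed, as stated in the text
MAX_RISK_ALLELES = 2 * N_VARIANTS   # two alleles per variant


def risk_allele_count(genotypes):
    """Unweighted sum of risk alleles (0, 1 or 2 per variant) for one individual."""
    if any(g not in (0, 1, 2) for g in genotypes):
        raise ValueError("each genotype must be 0, 1 or 2 risk alleles")
    return sum(genotypes)


# Hypothetical individual: one entry per variant.
example_genotypes = [1, 0, 2, 1, 1, 0, 1, 2, 0, 1, 1, 0,
                     1, 1, 0, 2, 0, 1, 1, 0, 1, 0, 0]

count = risk_allele_count(example_genotypes)
print(f"{count} of a possible {MAX_RISK_ALLELES} risk alleles")
```

An unweighted count treats every variant as contributing equally; weighting each allele by its estimated effect size is a common refinement but requires per-variant estimates not given here.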
Genetic Risk of Coronary Artery Disease and Clinical Application
The clinical application of the genetic risk factors for complex diseases such as CAD is yet to be recommended. One approach is to wait until we have specific therapy related to these genetic risk variants before recommending genetic testing. This is likely to require many years, since drug therapy as a rule requires a minimum of 10 years for development and approval. Another approach would be to incorporate them as risk factors into the current prevention guidelines. For example, current guidelines for prevention of CAD recommend lowering LDL cholesterol to 160 mg/dl if one has no conventional risk factors, but if another independent risk factor such as hypertension is present, LDL cholesterol should be decreased to 140 mg/dl. Since genetic risk variants such as 9p21 have been proven to be independent risk factors, they could be incorporated into current guidelines and would lead to more intense treatment of known risk factors such as cholesterol. Currently, the genetic risk variants for CAD are not recommended for routine prevention and treatment of CAD. The independent genetic risk factors imply several mechanisms involved in the pathogenesis of atherosclerosis, which have yet to be discovered. While GWAS has not specifically identified the culprits, the implications for the pathogenesis and biology of atherosclerosis provide tremendous potential for development of new drug targets and innovative therapy.
Missing Heritability – the need for DNA sequencing
Despite the many common genetic risk variants for CAD, they account for only a small percentage of the expected heritability (42). It is estimated that about 50% of predisposition for CAD is genetic (28), yet the 36 risk variants account for only about 10% of the expected heritability. There are several possibilities to account for this discrepancy: rare risk variants (minor allele frequency, MAF≤5%), undiscovered common variants, epistasis (gene-gene interactions), or miscalculations. GWAS has the resolution to detect common SNPs but not rare SNPs, which can only be detected by direct DNA sequencing. An ongoing approach uses WES instead of WGS. This is the NHLBI-sponsored “Exome Sequencing Project” for rare variants, which involves sequencing about 30 million bases encompassing all 180,000 exons in the 23,000 genes in the genome (27). The initial results confirm the expectation that there are many more rare variants (43) than common variants. Based on a sample of 202 genes in 14,000 Europeans, investigators observed that one base pair per 21 base pairs had undergone mutation to a rare polymorphism. These variants are very rare (MAF<1%), with 75% of these rare variants having a frequency of only 1 per 200 to 300 individuals (44).
While sequencing is necessary to detect rare polymorphisms, it does not determine their function or whether they are disease related. Once a rare polymorphism or SNP is identified, one must determine, through case-control association studies, whether the SNP is statistically more common in cases than in controls. The advantage of functional rare variants associated with disease is that they occur primarily in protein coding regions and are associated with several fold increased risk (44;45). Rare variants that cause single gene diseases such as HCM (3) and Wolff–Parkinson–White syndrome (WPW) (46) are associated with several fold increased risk and are in themselves potent enough to induce the phenotype, as shown in transgenic animals (47). The sample size required for 30 rare risk variants with an average frequency of 1% and power≥80% is over 6,000 cases and controls, if risk is increased two-fold. If one is assessing 30 rare risk variants with an average frequency of 0.1%, it would require 60,000 cases and controls.
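The dependence of sample size on variant frequency can be illustrated with a textbook two-proportion calculation. This is only a sketch under stated assumptions: it compares allele frequencies between cases and controls, treats a two-fold risk as an allelic odds ratio of 2, and applies a Bonferroni correction across 30 variants. It is not the method behind the figures quoted above and is not expected to reproduce them; it shows only that a tenfold drop in frequency requires roughly tenfold more samples.

```python
from statistics import NormalDist


def case_allele_freq(control_freq, odds_ratio):
    """Allele frequency in cases implied by a control frequency and an odds ratio."""
    odds = odds_ratio * control_freq / (1 - control_freq)
    return odds / (1 + odds)


def alleles_per_group(control_freq, odds_ratio, alpha, power):
    """Alleles needed per group for a two-sided two-proportion z-test."""
    p0 = control_freq
    p1 = case_allele_freq(p0, odds_ratio)
    p_bar = (p0 + p1) / 2
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    numerator = (z_alpha * (2 * p_bar * (1 - p_bar)) ** 0.5
                 + z_beta * (p0 * (1 - p0) + p1 * (1 - p1)) ** 0.5) ** 2
    return numerator / (p1 - p0) ** 2


# Assumed settings: Bonferroni correction over 30 variants, 80% power, odds ratio 2.
ALPHA = 0.05 / 30
POWER = 0.80
ODDS_RATIO = 2.0

n_at_1_percent = alleles_per_group(0.01, ODDS_RATIO, ALPHA, POWER)
n_at_0_1_percent = alleles_per_group(0.001, ODDS_RATIO, ALPHA, POWER)
print(f"Ratio of required sample sizes: {n_at_0_1_percent / n_at_1_percent:.1f}")
```

Each individual contributes two alleles, so the number of individuals per group is half the allele count returned by `alleles_per_group`.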
In determining the biological or pathological function, rare variants have certain advantages over common variants. Common variants occur primarily in non-protein coding regions (48) as opposed to disease related rare variants, which predominantly occur in protein coding regions (44;45). Thus, WES, in which only the protein coding regions are sequenced, is appropriate for rare variants and is much more economical than WGS (49). Since most of the rare variants occur in known proteins, detection of the phenotype will be greatly facilitated with prior knowledge of the protein expressed, whether performed in vitro or in vivo.
It remains to be determined whether rare variants, despite their greater effect contribute significantly to the missing heritability. It is important to emphasize that the frequency of the genetic risk variant has nothing to do with its importance as a therapeutic target. Their importance as a therapeutic target is highly enriched by the greater effect over that of common variants. This is illustrated by the cholesterol receptor that was identified back in the 1970’s by Brown and Goldstein (50). This inherited defect referred to as Familial Hypercholesterolemia only occurs in 1 in 5,000, yet this rare disorder was the tipping point to recognize that cholesterol played a major role in precipitating premature CAD in these individuals. This led to the development of statins which inhibit the synthesis of cholesterol and today, statins are the mainstay in the prevention of CAD (51). A more relevant and recent example of the potency of rare variants is the rare polymorphism discovered in PCSK9 which has a frequency of about 1% (52). An antibody to PCSK9 was associated with a 60% further reduction in LDL cholesterol over that of statin therapy (52).
The other possibility is the overly stringent statistical requirement demanded by GWAS of p≤5×10⁻⁸. There is considerable evidence, as indicated by Peter Visscher et al. (53;54), that common variants of less than genome-wide significance may account for much of the genetic “missing heritability”. In the genetics of height, Yang et al. (49) showed that more than 40% of the expected heritability can be accounted for by utilizing less significant common variants. A more recent study by Simonson et al. (55) also indicates that common variants of less than genome-wide significance do account for some of the missing heritability.
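The genome-wide threshold of p≤5×10⁻⁸ is commonly motivated as a Bonferroni correction of a 0.05 significance level for roughly one million independent common-variant tests. That number of tests is a conventional approximation assumed here, not a figure taken from the text, and the names below are illustrative.

```python
FAMILY_WISE_ALPHA = 0.05
ASSUMED_INDEPENDENT_TESTS = 1_000_000   # conventional approximation, an assumption


def bonferroni_threshold(alpha, n_tests):
    """Per-test significance level that keeps the family-wise error rate at alpha."""
    return alpha / n_tests


GENOME_WIDE_THRESHOLD = bonferroni_threshold(FAMILY_WISE_ALPHA,
                                             ASSUMED_INDEPENDENT_TESTS)


def is_genome_wide_significant(p_value, threshold=GENOME_WIDE_THRESHOLD):
    """True if the association p-value meets the genome-wide threshold."""
    return p_value <= threshold


print(f"Genome-wide threshold: {GENOME_WIDE_THRESHOLD:.0e}")
print(is_genome_wide_significant(1e-9), is_genome_wide_significant(1e-6))
```

A variant with p = 1×10⁻⁶ fails this test even though it would be considered strong evidence in a single-hypothesis setting, which is the sense in which the requirement may exclude true associations of small effect.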
The current approach to assessing the total risk effect of common variants is by simply adding their individual effects. A major proportion of this missing heritability may be due to Epistasis or gene to gene interaction, which is not accounted for in our current calculations (56). In their natural state, genes exert their effect through combined networks rather than as single units and likely have synergistic effects over and above that of their individual effects. As more genes are discovered and their functions elucidated together with their interacting networks, it should be possible to ascertain and confirm the source of the missing heritability. To resolve this issue, it will be necessary to have genome sequencing in massive sample sizes to identify the rare variants and elucidate their function.
Pharmacogenetics
Pharmacogenetics is rapidly expanding in defining the relationship of DNA sequence variation and drug response. This has been most notable with two drugs commonly prescribed for cardiovascular therapy, clopidogrel and warfarin.
Since the establishment of dual antiplatelet therapy as the gold standard therapy following coronary stent placement, clopidogrel has become one of the most widely prescribed cardiac drugs. Single nucleotide polymorphisms in the gene encoding cytochrome P450 2C19 have been shown to affect the degree to which clopidogrel attenuates platelet aggregation. In both PLATO (57) and TRITON-TIMI 38 (58) carriage of CYP P450 2C19 polymorphisms was associated with major adverse cardiac events including the potentially catastrophic outcome of stent thrombosis. It is unclear at this point as to whether tailoring antiplatelet therapy can favorably modify outcomes in those individuals that are carriers. However, it has been demonstrated that identification of carriers by point-of-care testing and tailored prescription of a dual-platelet regimen successfully eliminates high residual platelet activity (59). A randomized study using such technology needs to be executed to determine whether such testing reduces adverse outcomes.
Polymorphisms in cytochrome P450 2C9 and vitamin K epoxide reductase (VKORC1) have been shown to modify warfarin response. Several pharmacogenetic models have been developed in order to predict warfarin-dosing requirements. These models incorporate CYP P450 2C9 and VKORC1 genotype, smoking status, relevant medications, age, sex and body mass index (60). The application of these algorithms has been investigated in several prospective studies demonstrating feasibility of this approach. However, only a few were randomized and all suffered from small sample sizes. A recent publication compared a standard dosing regimen with two genotype-guided algorithms (61). Primary outcomes were the percentage of out-of-range (OOR) international normalized ratios and time in therapeutic range (TTR) at 3 months. The combined genotype-guided prescription cohort demonstrated superior outcomes with respect to both primary endpoints. Moreover, serious events were significantly less frequent in the genotype-guided cohort (4.5% vs. 9.4% of patients, p<0.001). It should be noted that there was no difference in the primary outcome between the two genotype-based algorithms. As a consequence, routine use of such algorithms has not been endorsed in the guidelines.
While the primary thrust of pharmacogenomic inquiry has been to define sequence variation that modifies drug efficacy, some work has been done with respect to sequence variation that predisposes to adverse effects. One striking example is the association between an SLCO1B1 polymorphism and HMG CoA reductase inhibitor induced myopathy, where homozygosity confers a relative risk of 16.9 compared with non-carriers (62).
Individual Genome Sequencing - a new reality
While a draft of the human genome sequence was completed in 2000 (2), the first individual human genome sequenced in its entirety was that of John Craig Venter (63), at a time when sequencing an individual genome cost millions of dollars. The introduction of NGS (27) revolutionized the rate and cost of DNA sequencing, as shown in Figure 1: sequencing a human genome today costs $5,000 and is expected to cost less than $1,000 within one to two years. It is estimated that over 30,000 individuals will have had their whole genome sequenced by the end of this year (64). Recent reviews on genome sequencing are listed in Table 2.
Table 2. Recent reviews on genome sequencing
Title | Ref |
---|---|
Targeted next-generation sequencing for the molecular genetic diagnostics of cardiomyopathies | (27) |
Genome sequencing. Search for pore-fection | (17) |
DNA Sequencing: Clinical Applications of New DNA Sequencing Technologies | (45) |
Whole-Genome Sequencing: The New Standard of Care? | (65) |
Secrets of the human genome disclosed | (64) |
The Ultimate Genetic Test | (66) |
Accurate whole-genome sequencing and haplotyping from 10 to 20 human cells | (67) |
Next Steps in Cardiovascular Disease Genomic Research-Sequencing, Epigenetics, and Transcriptomics | (68) |
What does it mean to have one’s genome sequenced?
If the current trajectory of parallel sequencing continues, whole genome sequencing will be feasible, inexpensive and rapid, and is expected to be routine within the next 5 years, perhaps sooner if the nanopore approach proves robust. What does it mean to have one’s genome completely sequenced on the basis of a venipuncture, a buccal smear or a sample of one’s hair? The implications of knowing one’s DNA disease risk variants from a single measurement are far-reaching, considering that these variants will not change in one’s lifetime. These DNA risk variants are not influenced by meals, time of day, age, gender or medications. A record of one’s DNA variants can be stored and attached to one’s medical record as a permanent, unchanging blueprint of the individual’s genetic makeup. The NIH has already launched a project referred to as “eMERGE”, involving five medical centers in the United States, in which individuals’ DNA sequence data and medical records will be analyzed for genotype-phenotype correlations (69). This could be the prototype for the future, whereby one’s buccal smear, blood or tissue is stored in a bio-repository and the genetic analysis is correlated with the stored electronic phenotypic data. Similar projects are ongoing for other diseases such as cancer. This information will be routinely available and be part of the hospital record. A couple known to carry a gene for a lethal disease, such as Huntington’s disease, can avoid transmitting it to their children through in-vitro fertilization, selecting their own egg and sperm without the mutation. Having one’s genome sequenced avoids misinterpretation and immediately determines whether one has one or more mutations proven to be associated with disease.
Despite the utility of the GWAS and NGS platforms in offering robust strategies to elucidate the genetic basis of complex diseases, clinical applications of such discoveries confront a number of challenges. Among them is the daunting task of identifying the true causal allele from the vast number of variants that are present in each genome or exome, including nsSNVs and insertion/deletion variants. Bioinformatics algorithms might offer information about the potential pathogenicity of the variants, but such predictions are often discordant across different platforms. Likewise, large-scale high-throughput screening tools to identify the pathogenic variants are currently not available. The focus of NGS-based identification of risk or causal variants is on the rare alleles, which are expected to exert larger effect sizes than the common alleles. However, a significant number of rare variants are also not expected to be pathogenic. Therefore, a practical approach is to identify the variants that have already been linked to the phenotype. Such variants are typically rare and are often nonsense, missense or frameshift mutations that either have been shown to cause cardiovascular pathology or are located in a gene that is known to be causal for a Mendelian disease. Each genome comprises a handful of such variants that might be used for early identification of those at risk. However, whether NGS-based early identification and interventions could influence the outcome in cardiovascular disease is an empirical question that remains to be tested.
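The practical approach described above (retain variants that are rare, have a protein-altering consequence, and lie in a gene already known to cause a Mendelian disease) can be sketched as a simple filter. The Python example below is a minimal illustration under stated assumptions: the 1% frequency cutoff, the consequence labels, the three-gene list and the `Variant` record are all hypothetical choices made for the example, not values taken from the text or from any annotation tool.

```python
from dataclasses import dataclass

# Illustrative assumptions, not values given in the text.
RARE_MAF_CUTOFF = 0.01
DAMAGING_CONSEQUENCES = {"nonsense", "missense", "frameshift"}
KNOWN_MENDELIAN_GENES = {"MYH7", "TNNT2", "LDLR"}


@dataclass(frozen=True)
class Variant:
    gene: str          # gene symbol the variant falls in
    consequence: str   # predicted effect, e.g. "missense" or "synonymous"
    maf: float         # minor allele frequency in a reference population


def prioritize(variants, genes=KNOWN_MENDELIAN_GENES, maf_cutoff=RARE_MAF_CUTOFF):
    """Keep variants that are rare, protein-altering and in a known causal gene."""
    return [
        v for v in variants
        if v.maf < maf_cutoff
        and v.consequence in DAMAGING_CONSEQUENCES
        and v.gene in genes
    ]


# Hypothetical variants from one genome
candidates = [
    Variant("MYH7", "missense", 0.0001),    # kept: rare, damaging, known gene
    Variant("MYH7", "synonymous", 0.0001),  # dropped: not protein-altering
    Variant("LDLR", "nonsense", 0.20),      # dropped: too common
    Variant("GENE_X", "frameshift", 0.0005) # dropped: gene not on the list
]
for variant in prioritize(candidates):
    print(variant)
```

A filter of this kind only narrows the candidate list; as the paragraph notes, it does not establish that a retained variant is pathogenic.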
The convergence of two technologies – a challenge for personalized medicine
A major challenge for healthcare policy makers, physicians, caregivers and end users is being created by the convergence of two major technologies: cost-effective DNA sequencing of the whole genome and digitization of patient data. DNA sequencing is said to have improved 10,000 fold in the past 8 years (70), while our ability to store, retrieve and analyze data has only improved 16 fold (70;71). Some claim that the convergence of these two technologies is the tipping point for personalized medicine. It could be costly not to realize we are at the cusp of the new era of personalized medicine. Detailed genome knowledge is rapidly becoming available, as DNA sequencing is accelerating much faster than our ability to store and analyze the data (Figure 2). Interpreting the data will probably require elucidation of the function of the DNA risk variants. The era of population medicine where ‘one drug fits all’ will be replaced by medicine based on one’s genetic composition, molecular makeup and how it affects the particular disease phenotype in that individual. Given the etiological and phenotypic complexity of the common cardiovascular disorders, and in view of the difficulties in identifying the true risk alleles, one has to avoid a cavalier approach in assigning clinical implications to the genetic data. Experienced clinicians with training and expertise in medical genetics, alone or in conjunction with medical geneticists, should carefully assess the clinical significance of the genetic discoveries. The field is clearly not ready for a direct-to-consumer approach, which has the potential to offer false information with considerable medical and psychological implications.
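The size of the gap between the two growth figures is easier to appreciate as a doubling time. The short Python sketch below converts each fold improvement into the implied doubling time, assuming steady exponential growth and assuming that both figures refer to the same 8-year window (the text states the period only for sequencing); the function name is illustrative.

```python
import math


def doubling_time_years(fold_improvement, years):
    """Years per doubling implied by a fold improvement over a period,
    assuming constant exponential growth."""
    return years / math.log2(fold_improvement)


# 10,000-fold over 8 years (sequencing) vs. 16-fold over 8 years (storage/analysis)
sequencing_doubling = doubling_time_years(10_000, 8)
storage_doubling = doubling_time_years(16, 8)

print(f"Sequencing doubles about every {sequencing_doubling * 12:.1f} months")
print(f"Storage and analysis double about every {storage_doubling:.1f} years")
```

Under these assumptions sequencing capacity doubles roughly every 7 months, against every 2 years for data storage and analysis.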
The human genome’s effects have hardly been felt by some, but one effect is obvious to all of society and was best put by Leroy Hood, one of the pioneers: “Revolutions that have been generated by the first draft of the Human Genome Project, have barely been felt, but there is one profound change that has already occurred and that is the realization that biology is fundamentally an informational science.” (72) This informational revolution could not be more unlike the industrial revolution. It has minimal, if any, unfavorable effects on the environment, being performed in cybernetic space that most of us believe is intangible, untouchable and lily-white clean. The immense nature of the informational revolution was recently summarized in a book by Firestein (73). From 5,000 years ago until 2003, humanity created a total of 5 exabytes of information (an exabyte is a billion gigabytes). From 2003 to 2010, we created this amount every 2 days, and in 2013 we create this amount every 10 minutes. Another way of stating this is to realize that every few hours, we create more information than all of the information created by humanity since the start of civilization.
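The figures quoted above can be put on a common daily scale with a few lines of arithmetic. The Python sketch below uses only the numbers given in the text (5 exabytes every 2 days for 2003 to 2010, and every 10 minutes in 2013); the variable and function names are illustrative.

```python
MINUTES_PER_DAY = 24 * 60
BATCH_EXABYTES = 5  # the amount said to have been created up to 2003


def exabytes_per_day(minutes_per_batch):
    """Daily output if one 5-exabyte batch is created every `minutes_per_batch`."""
    return BATCH_EXABYTES * MINUTES_PER_DAY / minutes_per_batch


rate_2003_2010 = exabytes_per_day(2 * MINUTES_PER_DAY)  # one batch every 2 days
rate_2013 = exabytes_per_day(10)                        # one batch every 10 minutes
acceleration = rate_2013 / rate_2003_2010

print(f"2003-2010: {rate_2003_2010} exabytes per day")
print(f"2013: {rate_2013} exabytes per day ({acceleration:.0f}-fold faster)")
```

On these figures the rate of information creation rose from 2.5 to 720 exabytes per day, a 288-fold increase.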
Acknowledgments
The authors acknowledge Peggy Offley for her assistance in the preparation of this manuscript.
Grant Support
CIHR #MOP82810 (RR)/Canada; CIHR #MOP77682 (AFRS)/Canada; CFI #11966 (RR)/Canada
R01-088498/PHS HHS/United States; R21 AG038597-01/AG/NIA NIH HHS/United States;
R34HL105563/HL/NHLBI NIH HHS/United States
Natural Sciences and Engineering Research Council of Canada
Canadian Diabetes Association
ABBREVIATIONS LIST
- CAD
Coronary Artery Disease
- DNA
Deoxyribonucleic acid
- DSV
DNA sequence variants
- GWAS
Genome-Wide Association Studies
- GWS
Genome-Wide Significance
- HCM
Hypertrophic Cardiomyopathy
- MAF
Minor Allele Frequency
- Mbp
Mega base pairs (1 million)
- Gb
Giga base (1 billion)
- mRNA
messenger RNA
- ncRNA
non-coding RNA
- NGS
Next Generation Sequencing
- MI
Myocardial Infarction
- RNA
Ribonucleic Acid
- SNP
Single Nucleotide Polymorphism
- WES
Whole Exome Sequencing
- WGS
Whole Genome Sequencing
- WPW
Wolff–Parkinson–White syndrome
Footnotes
Disclosure of no conflicts:
Dr. R. Roberts is a consultant to Cumberland Pharmaceuticals and confirms no conflicts.
STATEMENT OF DISCLOSURE
The Authors are responsible for recognizing and disclosing any relationship with industry that could be perceived to bias their work, acknowledging all financial support and any other personal connections.
Contributor Information
Robert Roberts, Email: rroberts@ottawaheart.ca, University of Ottawa Heart Institute, Ottawa, Canada. Director, John & Jennifer Ruddy Canadian Cardiovascular Genetics Centre 40 Ruskin Street, Ottawa, ON K1Y 4W7 Canada, Tel: 613.761.4779.
A.J. Marian, Institute of Molecular Medicine, Center for Cardiovascular Genetic Research, University of Texas Health Sciences Center, Houston, Texas USA, Tel. 713-500-2350.
Sonny Dandona, McGill University, Montreal, Quebec, Canada, McIntyre Medical Building, 3655 Sir William Osler, Montreal, Quebec H3G 1Y6 Tel: 514-934-1934 x36880.
Alexandre F.R. Stewart, John & Jennifer Ruddy Canadian Cardiovascular Genetics Centre, Associate Professor of Medicine, University of Ottawa Heart Institute Tel: 613-761-5189.
Reference List
- 1. Boorstin D. The Discoverers. Random House; 1983. p. 287.
- 2. Lander ES, Linton LM, Birren B, et al. Initial sequencing and analysis of the human genome. Nature. 2001 Feb 15;409(6822):860–921. doi: 10.1038/35057062.
- 3. Roberts R, McNally EM. Genetic Basis for Cardiovascular Disease. In: Fuster V, Walsh RA, Harrington RA, editors. Hurst’s The Heart. 13th ed. New York, NY: McGraw Hill; 2011. pp. 195–205.
- 4. Dunham I, Kundaje A, Aldred SF, et al. An integrated encyclopedia of DNA elements in the human genome. Nature. 2012 Sep 6;489(7414):57–74. doi: 10.1038/nature11247.
- 5. Amaral PP, Dinger ME, Mercer TR, Mattick JS. The Eukaryotic Genome as an RNA Machine. Science. 2008;319(5871):1787–9. doi: 10.1126/science.1155472.
- 6. Carlson C. Considerations for SNP Selection. In: Michael P, Winer SGJCS, editors. Genetic variation: a laboratory manual. Cold Spring Harbor Laboratory Press; 2007. pp. 263–81.
- 7. Bhangale TR, Rieder MJ, Livingston RJ, Nickerson DA. Comprehensive identification and characterization of diallelic insertion-deletion polymorphisms in 330 human candidate genes. Hum Mol Genet. 2005 Jan 1;14(1):59–69. doi: 10.1093/hmg/ddi006.
- 8. Kruglyak L, Nickerson D. Variation is the spice of life. Nat Genet. 2001 Mar;27:234–6. doi: 10.1038/85776.
- 9. Marian AJ, Belmont J. Strategic approaches to unraveling genetic causes of cardiovascular diseases. Circ Res. 2011 May 13;108(10):1252–69. doi: 10.1161/CIRCRESAHA.110.236067.
- 10. Ng PC, Levy S, Huang J, Stockwell TB, et al. Genetic variation in an individual human exome. PLoS Genet. 2008;4(8):e1000160. doi: 10.1371/journal.pgen.1000160.
- 11. Stranger BE, Forrest MS, Dunning M, Ingle CE, et al. Relative impact of nucleotide and copy number variation on gene expression phenotypes. Science. 2007;315(5813):848–53. doi: 10.1126/science.1136678.
- 12. Geisterfer-Lowrance A, Kass S, Tanigawa G, et al. A molecular basis for Familial Hypertrophic Cardiomyopathy: A beta cardiac myosin heavy chain missense mutation. Cell. 1990;62:999–1006. doi: 10.1016/0092-8674(90)90274-i.
- 13. Lim DS, Oberst L, McCluggage M, et al. Decreased left ventricular ejection fraction in transgenic mice expressing mutant cardiac troponin T-Q(92), responsible for human hypertrophic cardiomyopathy. J Mol Cell Cardiol. 2000 Mar;32(3):365–74. doi: 10.1006/jmcc.1999.1081.
- 14. Marian AJ, Wu Y, Lim DS, et al. A transgenic rabbit model for human hypertrophic cardiomyopathy. J Clin Invest. 1999 Dec;104(12):1683–92. doi: 10.1172/JCI7956.
- 15. Hamosh A, Scott AF, et al. Online Mendelian Inheritance in Man (OMIM), a knowledgebase of human genes and genetic disorders. Nucl Acids Res. 2002;30(1):52–5. doi: 10.1093/nar/30.1.52.
- 16. Marian AJ, Brugada R, Roberts R. Cardiovascular Diseases caused by Genetic Abnormalities. In: Fuster V, Walsh RA, Harrington RA, editors. Hurst’s The Heart. 13th ed. New York, NY: McGraw Hill; 2011. pp. 1783–826.
- 17. Pennisi E. Genome sequencing. Search for pore-fection. Science. 2012 May 4;336(6081):534–7. doi: 10.1126/science.336.6081.534.
- 18. Herman DS, Lam L, Taylor MR, et al. Truncations of titin causing dilated cardiomyopathy. N Engl J Med. 2012 Feb 16;366(7):619–28. doi: 10.1056/NEJMoa1110186.
- 19. Marian AJ. Challenges in Medical Applications of Whole Exome/Genome Sequencing Discoveries. Trends Cardiovasc Med. 2012 Aug 23. doi: 10.1016/j.tcm.2012.08.001.
- 20. Norton N, Li D, Rieder MJ, et al. Genome-wide studies of copy number variation and exome sequencing identify rare variants in BAG3 as a cause of dilated cardiomyopathy. Am J Hum Genet. 2011 Mar 11;88(3):273–82. doi: 10.1016/j.ajhg.2011.01.016.
- 21. Theis JL, Sharpe KM, Matsumoto ME, et al. Homozygosity mapping and exome sequencing reveal GATAD1 mutation in autosomal recessive dilated cardiomyopathy. Circ Cardiovasc Genet. 2011 Dec;4(6):585–94. doi: 10.1161/CIRCGENETICS.111.961052.
- 22. Galmiche L, Serre V, Beinat M, et al. Exome sequencing identifies MRPL3 mutation in mitochondrial cardiomyopathy. Hum Mutat. 2011 Nov;32(11):1225–31. doi: 10.1002/humu.21562.
- 23. Harakalova M, van Harssel JJ, Terhal PA, et al. Dominant missense mutations in ABCC9 cause Cantu syndrome. Nat Genet. 2012 Jul;44(7):793–6. doi: 10.1038/ng.2324.
- 24. van de Laar IM, Oldenburg RA, Pals G, et al. Mutations in SMAD3 cause a syndromic form of aortic aneurysms and dissections with early-onset osteoarthritis. Nat Genet. 2011 Feb;43(2):121–6. doi: 10.1038/ng.744.
- 25. Comino-Mendez I, Gracia-Aznarez FJ, Schiavi F, et al. Exome sequencing identifies MAX mutations as a cause of hereditary pheochromocytoma. Nat Genet. 2011 Jul;43(7):663–7. doi: 10.1038/ng.861.
- 26. Granados-Riveron JT, Ghosh TK, Pope M, et al. Alpha-cardiac myosin heavy chain (MYH6) mutations affecting myofibril formation are associated with congenital heart defects. Hum Mol Genet. 2010 Oct 15;19(20):4007–16. doi: 10.1093/hmg/ddq315.
- 27. Meder B, Haas J, Keller A, et al. Targeted next-generation sequencing for the molecular genetic diagnostics of cardiomyopathies. Circ Cardiovasc Genet. 2011 Apr;4(2):110–22. doi: 10.1161/CIRCGENETICS.110.958322.
- 28. Chan L, Boerwinkle E. Gene-environment Interactions and Gene Therapy in Atherosclerosis. Cardiol Rev. 1994;2(3):130–7.
- 29. Roberts R. A customized genetic approach to the number one killer: coronary artery disease. Curr Opin Cardiol. 2008 Nov;23(6):629–33. doi: 10.1097/HCO.0b013e32830e6b4e.
- 30. Kruglyak L. Prospects for whole-genome linkage disequilibrium mapping of common disease genes. Nat Genet. 1999 Jun;22(2):139–44. doi: 10.1038/9642.
- 31. The International HapMap Consortium. A haplotype map of the human genome. Nature. 2005 Oct 27;437(7063):1299–320. doi: 10.1038/nature04226.
- 32. Roberts R, Stewart AF, Wells GA. Identifying genes for coronary artery disease: An idea whose time has come. Can J Cardiol. 2007 Aug;23(Suppl A):7A–15A. doi: 10.1016/s0828-282x(07)71000-0.
- 33. McPherson R, Pertsemlidis A, Kavaslar N. A common allele on chromosome 9 associated with coronary heart disease. Science. 2007 Jun 8;316(5830):1488–91. doi: 10.1126/science.1142447.
- 34. Helgadottir A, Thorleifsson G, Manolescu A, et al. A common variant on chromosome 9p21 affects the risk of myocardial infarction. Science. 2007 Jun 8;316(5830):1491–3. doi: 10.1126/science.1142842.
- 35. Preuss M, Konig IR, Thompson JR, et al. Design of the Coronary ARtery DIsease Genome-Wide Replication And Meta-Analysis (CARDIoGRAM) Study: A Genome-wide association meta-analysis involving more than 22 000 cases and 60 000 controls. Circ Cardiovasc Genet. 2010 Oct;3(5):475–83. doi: 10.1161/CIRCGENETICS.109.899443.
- 36. Roberts R, Stewart AF. Genes and Coronary Artery Disease: Where Are We? J Am Coll Cardiol. 2012 Oct 30;60(18):1715–21. doi: 10.1016/j.jacc.2011.12.062.
- 37. The CARDIoGRAMplusC4D Consortium. Coronary artery disease risk loci identified in over 190,000 individuals implicate lipid metabolism and inflammation as key causal pathways. Nat Genet. 2012. In press.
- 38. Dudbridge F, Gusnanto A. Estimation of significance thresholds for genomewide association scans. Genet Epidemiol. 2008 Apr;32(3):227–34. doi: 10.1002/gepi.20297.
- 39. Califano A, Butte AJ, Friend S, et al. Leveraging models of cell regulation and GWAS data in integrative network-based association studies. Nat Genet. 2012 Aug;44(8):841–7. doi: 10.1038/ng.2355.
- 40. Zhong H, Yang X, Kaplan LM, et al. Integrating pathway analysis and genetics of gene expression for genome-wide association studies. Am J Hum Genet. 2010 Apr 9;86(4):581–91. doi: 10.1016/j.ajhg.2010.02.020.
- 41. Ravasi T, Suzuki H, Cannistraci CV, et al. An atlas of combinatorial transcriptional regulation in mouse and man. Cell. 2010 Mar 5;140(5):744–52. doi: 10.1016/j.cell.2010.01.044.
- 42. Manolio TA, Collins FS, Cox NJ, et al. Finding the missing heritability of complex diseases. Nature. 2009;461:747–53. doi: 10.1038/nature08494.
- 43. Tennessen JA, Bigham AW, O’Connor TD, et al. Evolution and Functional Impact of Rare Coding Variation from Deep Sequencing of Human Exomes. Science. 2012 May 21. doi: 10.1126/science.1219240.
- 44. Nelson MR, Wegmann D, Ehm MG, et al. An Abundance of Rare Functional Variants in 202 Drug Target Genes Sequenced in 14,002 People. Science. 2012 May 17. doi: 10.1126/science.1217876.
- 45. Dewey FE, Pan S, Wheeler MT, et al. DNA sequencing: clinical applications of new DNA sequencing technologies. Circulation. 2012 Feb 21;125(7):931–44. doi: 10.1161/CIRCULATIONAHA.110.972828.
- 46. Gollob MH, Green MS, Tang AS, et al. Identification of a gene responsible for familial Wolff-Parkinson-White syndrome. N Engl J Med. 2001 Jun 14;344(24):1823–31. doi: 10.1056/NEJM200106143442403.
- 47. Sidhu JS, Rajawat YS, Rami TG, et al. Transgenic mouse model of ventricular preexcitation and atrioventricular reentrant tachycardia induced by an AMP-activated protein kinase loss-of-function mutation responsible for Wolff-Parkinson-White syndrome. Circulation. 2005 Jan 4;111(1):21–9. doi: 10.1161/01.CIR.0000151291.32974.D5.
- 48. Hindorff LA, Sethupathy P, Junkins HA, et al. Potential etiologic and functional implications of genome-wide association loci for human diseases and traits. Proc Natl Acad Sci U S A. 2009 Jun 9;106(23):9362–7. doi: 10.1073/pnas.0903103106.
- 49. Kiezun A, Garimella K, Do R, et al. Exome sequencing and the genetic basis of complex traits. Nat Genet. 2012 Jun;44(6):623–30. doi: 10.1038/ng.2303.
- 50. Brown MS, Goldstein JL. Expression of the familial hypercholesterolemia gene in heterozygotes: mechanism for a dominant disorder in man. Science. 1974;185:61–3. doi: 10.1126/science.185.4145.61.
- 51. Shepherd J, Cobbe SM, Ford I, et al. Prevention of Coronary Heart Disease with Pravastatin in Men with Hypercholesterolemia. N Engl J Med. 1995 Nov 16;333(20):1301–8. doi: 10.1056/NEJM199511163332001.
- 52. Stein EA, Mellis S, Yancopoulos GD, et al. Effect of a monoclonal antibody to PCSK9 on LDL cholesterol. N Engl J Med. 2012 Mar 22;366(12):1108–18. doi: 10.1056/NEJMoa1105803.
- 53. Visscher PM, Brown MA, McCarthy MI, Yang J. Five Years of GWAS Discovery. Am J Hum Genet. 2012 Jan 13;90(1):7–24. doi: 10.1016/j.ajhg.2011.11.029.
- 54. Yang J, Benyamin B, McEvoy BP, et al. Common SNPs explain a large proportion of the heritability for human height. Nat Genet. 2010 Jul;42(7):565–9. doi: 10.1038/ng.608.
- 55. Simonson MA, Wills AG, Keller MC, McQueen MB. Recent methods for polygenic analysis of genome-wide data implicate an important effect of common variants on cardiovascular disease risk. BMC Med Genet. 2011;12:146. doi: 10.1186/1471-2350-12-146.
- 56. Zuk O, Hechter E, Sunyaev SR, Lander ES. The mystery of missing heritability: Genetic interactions create phantom heritability. Proc Natl Acad Sci U S A. 2012 Jan 24;109(4):1193–8. doi: 10.1073/pnas.1119675109.
- 57. Wallentin L, James S, Storey RF, et al. Effect of CYP2C19 and ABCB1 single nucleotide polymorphisms on outcomes of treatment with ticagrelor versus clopidogrel for acute coronary syndromes: a genetic substudy of the PLATO trial. Lancet. 2010 Oct 16;376(9749):1320–8. doi: 10.1016/S0140-6736(10)61274-3.
- 58. Mega JL, Close SL, Wiviott SD, et al. Cytochrome p-450 polymorphisms and response to clopidogrel. N Engl J Med. 2009 Jan 22;360(4):354–62. doi: 10.1056/NEJMoa0809171.
- 59. Roberts JD, Wells GA, Le May MR, et al. Point-of-care genetic testing for personalisation of antiplatelet treatment (RAPID GENE): a prospective, randomised, proof-of-concept trial. Lancet. 2012 Mar 28. doi: 10.1016/S0140-6736(12)60161-5.
- 60. Carlquist JF, Anderson JL. Using pharmacogenetics in real time to guide warfarin initiation: a clinician update. Circulation. 2011 Dec 6;124(23):2554–9. doi: 10.1161/CIRCULATIONAHA.111.019737.
- 61. Anderson JL, Horne BD, Stevens SM, et al. A randomized and clinical effectiveness trial comparing two pharmacogenetic algorithms and standard care for individualizing warfarin dosing (CoumaGen-II). Circulation. 2012 Apr 24;125(16):1997–2005. doi: 10.1161/CIRCULATIONAHA.111.070920.
- 62. Link E, Parish S, Armitage J, et al. The SEARCH Collaborative Group. SLCO1B1 variants and statin-induced myopathy--a genomewide study. N Engl J Med. 2008 Aug 21;359(8):789–99. doi: 10.1056/NEJMoa0801936.
- 63. Levy S, Sutton G, Ng PC, et al. The diploid genome sequence of an individual human. PLoS Biol. 2007 Sep 4;5(10):e254. doi: 10.1371/journal.pbio.0050254.
- 64. Hayden EC. Secrets of the human genome disclosed. Nature. 2011 Oct 6;478(7367):17. doi: 10.1038/478017a.
- 65. Brunham LR, Hayden MR. Medicine. Whole-genome sequencing: the new standard of care? Science. 2012 Jun 1;336(6085):1112–3. doi: 10.1126/science.1220967.
- 66. Drmanac R. Medicine. The ultimate genetic test. Science. 2012 Jun 1;336(6085):1110–2. doi: 10.1126/science.1221037.
- 67. Peters BA, Kermani BG, Sparks AB, et al. Accurate whole-genome sequencing and haplotyping from 10 to 20 human cells. Nature. 2012 Jul 12;487(7406):190–5. doi: 10.1038/nature11236.
- 68. Schnabel RB, Baccarelli A, Lin H, et al. Next steps in cardiovascular disease genomic research--sequencing, epigenetics, and transcriptomics. Clin Chem. 2012 Jan;58(1):113–26. doi: 10.1373/clinchem.2011.170423.
- 69. McCarty CA, Chisholm RL, Chute CG, et al. The eMERGE Network: a consortium of biorepositories linked to electronic medical records data for conducting genomic studies. BMC Med Genomics. 2011;4:13. doi: 10.1186/1755-8794-4-13.
- 70. Stein LD. The case for cloud computing in genome informatics. Genome Biol. 2010;11(5):207. doi: 10.1186/gb-2010-11-5-207.
- 71. Zerbino DR, Paten B, Haussler D. Integrating genomes. Science. 2012 Apr 13;336(6078):179–82. doi: 10.1126/science.1216830.
- 72. DeSalle R, Yudell M. After the Genome: Where Should We Go? In: DeSalle R, Yudell M, editors. The Genomic Revolution: Unveiling the Unity of Life. Joseph Henry Press; 2002. pp. 64–74.
- 73. Firestein S. Ignorance: How it Drives Science. Oxford University Press; 2012.