Author manuscript; available in PMC: 2010 Sep 12.
Published in final edited form as: Lancet. 2010 May 1;375(9725):1525–1535. doi: 10.1016/S0140-6736(10)60452-7

Clinical evaluation incorporating a personal genome

Euan A Ashley 1,2,3,*,^, Atul J Butte 1,2,3,^, Matthew T Wheeler 1,2,3, Rong Chen 1,2,3, Teri E Klein 1,2,3, Frederick E Dewey 1,2,3, Joel T Dudley 1,2,3, Kelly E Ormond 1,2,3, Aleksandra Pavlovic 1,2,3, Louanne Hudgins 1,2,3, Li Gong 1,2,3, Laura M Hodges 1,2,3, Dorit S Berlin 1,2,3, Caroline F Thorn 1,2,3, Katrin Sangkuhl 1,2,3, Joan M Hebert 1,2,3, Mark Woon 1,2,3, Hersh Sagreiya 1,2,3, Ryan Whaley 1,2,3, Alexander A Morgan 1,2,3, Dmitry Pushkarev 1,2,3, Norma F Neff 1,2,3, Joshua W Knowles 1,2,3, Mike Chou 1,2,3, Joseph Thakuria 1,2,3, Abraham Rosenbaum 1,2,3, Alexander Wait Zaranek 1,2,3, George Church 1,2,3, Henry T Greely 1,2,3,^, Stephen R Quake 1,2,3,^, Russ B Altman 1,2,3,^
PMCID: PMC2937184  NIHMSID: NIHMS221253  PMID: 20435227

Abstract

Background

The cost of genomic information has fallen steeply, but the path to clinical translation of risk estimates for common variants found in genome wide association studies remains unclear. Since complete genomes can be sequenced ever faster and at rapidly declining cost, more comprehensive means of analyzing these data in concert with rare variants for genetic risk assessment and individualisation of therapy are required. Here, we present the first integrated analysis of a complete human genome in a clinical context.

Methods

An individual with a family history of vascular disease and early sudden death was evaluated. Clinical assessment included risk prediction for coronary artery disease, screening for causes of sudden cardiac death, and genetic counselling. Genetic analysis included the development of novel methods for the integration of whole genome sequence data including 2.6 million single nucleotide polymorphisms and 752 copy number variations. The algorithm focused on predicting genetic risk from variants in genes associated with known Mendelian disease, recognised drug responses, and pathogenicity for novel variants. In addition, since integration of risk ratios derived from case control studies is challenging, we estimated posterior probabilities from age and sex appropriate prior probability and likelihood ratios derived for each genotype. We also developed a visualisation approach to account for gene-environment interactions and conditionally dependent risks.

Findings

We found increased genetic risk for myocardial infarction, type II diabetes and certain cancers. Rare variants in LPA are consistent with the family history of coronary artery disease. Pharmacogenomic analysis suggested a positive response to lipid lowering therapy, likely clopidogrel resistance, and a low initial dosing requirement for warfarin. Many variants of uncertain significance were reported.

Interpretation

Although challenges remain, our results suggest that whole genome sequencing can yield useful and clinically relevant information for individual patients, especially for those with a strong family history of significant disease.

Keywords: sequencing, personal genomics, single nucleotide polymorphism, coronary artery disease, arrhythmogenic right ventricular dysplasia/cardiomyopathy, pharmacogenomics


Technological advance has brought a steep decline in the cost of genetic information, but the explanatory power and path to clinical translation of risk estimates for common variants found in genome wide association studies remain unclear. Much of the reason for this lies in the presence of rare and structural genetic variation. Since we are now able to rapidly and inexpensively sequence complete genomes,15 more comprehensive genetic risk assessment and individualisation of therapies may be possible.6 However, analytic tools are presently lacking to make these data accessible in a clinical context and the clinical utility of these data at an individual level has not been formally evaluated.

Clinical assessment

The patient was assessed at Stanford's Center for Inherited Cardiovascular Disease by a cardiologist (EA) as well as a board certified genetic counsellor (KO). The patient was a 40 year old male who presented with a family history concerning for coronary artery disease and sudden death. There was no significant past medical history and the patient exercised regularly without symptoms. The patient was taking no medications. A four generation family pedigree was drawn (Figure 1). Family history revealed coronary artery disease and abdominal aortic aneurysm in first and second degree relatives. There was also a family history of early sudden, presumed cardiac, death. On examination, the patient was well appearing. Clinical examination was within normal limits. Conventional risk assessment for coronary artery disease included a lipid panel (Table 1). Due to his family history of cardiovascular disease, he underwent electrocardiography, which showed sinus rhythm, normal axis and high praecordial voltage with early repolarisation. The family history of sudden death prompted an echocardiogram and cardiopulmonary exercise test. The echocardiogram revealed normal right and left ventricular size, systolic, diastolic and valvular function. There were no wall motion abnormalities on maximal exercise and 1.5 mm ST depression was upsloping. Maximum oxygen uptake was 49 ml/kg/min.

Figure 1.

Patient pedigree. The arrow indicates the patient. Square shapes represent males, circles represent females. ARVD/C – arrhythmogenic right ventricular dysplasia/cardiomyopathy; AAA – abdominal aortic aneurysm; HTN – hypertension; CAD – coronary artery disease; VT – paroxysmal ventricular tachycardia; HC – hypercholesterolemia; ARMD – age related macular degeneration; OA – osteoarthritis; SCD – sudden cardiac death (presumed).

Table 1.

Clinical characteristics of the patient

Characteristic | Patient | Reference Range
Age (yr) | 40 |
Height (in) | 71 |
Weight (lb) | 190 |
Body-mass index | 26.5 |
Blood pressure (mm Hg)
 Systolic | 128 |
 Diastolic | 80 |
Laboratory Tests
Hemoglobin (g/dl) | 15.7 | 13.5–17.7
Creatinine (mg/dl) | 1.2 | <1.2
Urea nitrogen (mg/dl) | 20 | 5–25
White-cell count (K per microliter) | 4.9 | 4–11
Cholesterol (mg/dl)
 Total | 218 |
 LDL (mg/dl) | 156 |
 HDL (mg/dl) | 48 |
Triglycerides (mg/dl) | 68 |
High-sensitivity C-reactive protein (mg/dl) | <0.2 |
Lp(a) lipoprotein (mg/dl) | 114 | <30
Exercise Testing
Maximal VO2 (ml/kg/min) | 49.6 |
Maximal external work (W) | 450 |
Ve/VCO2 slope | 26 |
Maximal heart rate (bpm) | 191 |
Resting cardiac output (liters/min) | 6.3 |
Maximum cardiac output (liters/min) | 24.5 |
Electrocardiography Parameters
Heart Rate (bpm) | 60 |
QTc (ms) | 421 |
Echocardiography Parameters
Interventricular septum diastole (mm) | 10 | 6–11
Left ventricular posterior wall diastole (mm) | 9.7 | 6–11
Left ventricular internal diameter diastole (mm) | 45 | 37–57
Ejection fraction by method of discs (%) | 63 | >55
Aortic root diameter (mm) | 36 | 25–40
Mitral inflow
 E (cm/s) | 84 |
 a (cm/s) | 53 |

Methods

Genome sequencing and assembly

The technical details of the genome sequencing for this individual have been described previously.7 In brief, genomic DNA was purified from 2 ml of whole blood and sequenced with a Heliscope genome sequencer. Output comprised 148 GB of raw sequence with an average read length of 33 bases. Sequence data were mapped to the National Center for Biotechnology Information reference human genome build 36 using the open-source aligner IndexDP.7 Base calling was performed with the UMKA algorithm, resulting in the detection of 2.6 million single nucleotide variations and 752 copy number variations from the reference sequence. A subset of SNP calls were independently validated with the Illumina BeadArray and with Sanger sequencing. A subset of CNV calls were independently validated with digital PCR.

Analysis pipeline

Disease and risk analysis of the genome was focused in four areas: (i) variants associated with genes for known Mendelian disease, (ii) novel mutations, (iii) variants known to modulate response to pharmacotherapy, and (iv) single nucleotide polymorphisms previously associated with complex disease.

Rare and Mendelian variants

Database queries, biophysical prediction algorithms, and analyses of non-coding regions were used to screen rare and novel variants in the genome. We queried disease-specific mutation databases, the Human Gene Mutation Database (HGMD) and Online Mendelian Inheritance in Man (OMIM) to identify genes and mutations with known associations to monogenic diseases. We applied prediction algorithms to weight the likelihood of variant pathogenicity based on allele frequency, conservation and protein domain disruption. In addition, we developed algorithms to index variants affecting or creating start sites, stop sites, splice sites and microRNAs (Figure 2, Supplemental methods).

Figure 2.

Approach to rare or novel variants

See text for details. GVS - Genome Variation Server25; SIFT - Sorting Intolerant From Tolerant26; HGMD – Human Gene Mutation Database27; Polyphen – POLYmorphism PHENotyping28; HGVS – Human Genome Variation Society29; ARVC Database13; mtSNP - a database of human mitochondrial genome polymorphisms30; UniProt - UNIversal PROTein Resource31; PolyDom – a whole genome database for the identification of non-synonymous coding SNPs with the potential to impact disease32; OMIM - Online Mendelian Inheritance in Man33.

Pharmacogenomics

PharmGKB8 contains data on 2500 variants, of which 650 refer specifically to drug response phenotypes. PharmGKB curators examined these 650 annotations in the context of this patient's genotype. Key variants were then identified based on the relevance of the phenotype in the annotation, the medical and family history, and the study population upon which the annotation was based. Since our disease risk estimation and pharmacogenomic analysis draw on previously published observations, we rated the level of evidence used in one of three categories (Supplemental methods).

Disease Risk

To integrate common variant genetic risk across a spectrum of human disease, we built a manually curated disease-SNP database (Supplemental methods). Diseases and phenotypes were mapped to the Unified Medical Language System (UMLS) Concept Unique Identifiers (CUIs). Since strand direction was variably reported in studies, we identified strand direction by comparing with the major/minor alleles found in the appropriate HapMap population. Odds ratios were available for allele comparisons in most cases (Supplemental Figure 1). However, to generate a medically relevant posterior probability of disease from integrated environmental and genetic risk, we calculated likelihood ratios (LRs) for the most significant SNP from each haplotype block. Pre-test probability was derived from published sources (Supplemental Table 4), and the LR was applied to the pre-test odds of disease, calculated from age and sex appropriate population prevalence. Some studies did not provide the genotype frequency data needed to calculate the LR.
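
The calculation described above can be sketched in a few lines. This is a minimal illustration rather than the pipeline used in the study: it assumes the likelihood ratio for a genotype is the ratio of its frequency in cases to its frequency in controls, and the function names and the numbers in the example are hypothetical.

```python
def likelihood_ratio(freq_in_cases, freq_in_controls):
    """LR for a genotype: P(genotype | disease) / P(genotype | no disease).

    Assumption: both genotype frequencies come from the same published study.
    """
    return freq_in_cases / freq_in_controls


def post_test_probability(pre_test_probability, lr):
    """Apply a likelihood ratio to the pre-test odds and convert back to a probability."""
    pre_test_odds = pre_test_probability / (1.0 - pre_test_probability)
    post_test_odds = pre_test_odds * lr
    return post_test_odds / (1.0 + post_test_odds)


# Hypothetical numbers, for illustration only: the genotype is carried by
# 30% of cases and 20% of controls, and the age and sex appropriate
# prevalence (pre-test probability) is 10%.
example_lr = likelihood_ratio(0.30, 0.20)
example_post = post_test_probability(0.10, example_lr)
print(f"LR = {example_lr:.2f}, post-test probability = {example_post:.3f}")
```

Working on the odds scale is what allows a prior that reflects age and sex to be combined with genotype evidence. An odds ratio alone does not carry that prior.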

Ethical approval

The study was approved by the Institutional Review Board of Stanford University. The patient received education and counselling before signing the consent form and throughout the process of testing and follow up.

Role of the funding source

The study sponsors had no role in the design, data collection, data analysis, data interpretation, or writing of the report. Dr. Ashley had full access to all data in the study and final responsibility in the decision to submit the manuscript for publication.

Results

Family history

A four generation family pedigree (Figure 1) revealed atherosclerotic vascular disease with multiple manifestations as well as prominent osteoarthritis. The patient's first cousin once removed (IV-1) died suddenly of an unknown cause.

Rare variants

An important benefit of sequencing over DNA chip based methods of genotyping is the identification of rare or novel variants. We searched for evidence of rare or novel variants that would predispose the patient or his family to disease (Table 2, Supplemental Table 2). Specific to cardiovascular disease, we discovered rare variants in three genes clinically associated with sudden cardiac death: TMEM43, DSP, and MYBPC3. The MYBPC3 variant, encoding an arginine to glutamine change at position 326 of the cardiac myosin binding protein C, was originally associated with late onset hypertrophic cardiomyopathy.9 Subsequently it has also been found in multiple independent control populations without known hypertrophic cardiomyopathy,10 suggesting it may be a benign variant. Mutations in TMEM4311 or DSP12 have been associated with familial arrhythmogenic right ventricular dysplasia/cardiomyopathy (ARVD/C). Review of clinical assessment of extended family members revealed minor criteria for ARVD/C in one first cousin, whose son died suddenly in his teens. In contrast to the findings for the identified rare MYBPC3 variant, the TMEM43 variant, encoding a methionine to valine change at position 41 of transmembrane protein 43, has not been previously published, but was seen in 1 of 150 probands known to have ARVD/C.13 The identified DSP variant, encoding an arginine to histidine change to amino acid 1838 of the desmoplakin protein, is entirely novel. Control populations from clinical testing laboratories (more than 1000 total chromosomes) have not found either the DSP or TMEM43 variants.

Table 2.

Selected rare nonsynonymous variants in genes associated with disease.

Chr | Position | SNP ID | Ref Base | Pt DNA | Gene Symbol | Amino Acid Substitution | Gene Name | Associated Disease | Mutation Databases | Functional Prediction | Mode of disease-gene inheritance | Reference
Previously described rare variants in genes associated with common disease
6 | 160881127 | rs3798220 | T | CT | LPA | I4399M* | Apolipoprotein(a) Precursor (Lp(a)) | Coronary artery disease | Associated with high Lp(a) | Benign | n/a | 20, 21
2 | 183411581 | rs288326 | G | AG | FRZB | R200W | Frizzled-related protein | Osteoarthritis | Possibly associated with osteoarthritis** | Damaging | n/a | 34
Previously described rare variants in genes associated with rare disease
6 | 26199158 | rs1799945 | C | CG | HFE | H63D | Hereditary haemochromatosis protein Precursor | Haemochromatosis | Probably disease associated | Damaging | recessive, incomplete penetrance | 35, 36
3 | 15661697 | rs13078881 | G | CG | BTD | D444H | Biotinidase Precursor | Biotinidase Deficiency | previously described, intermediate phenotype | Damaging | recessive | 37
5 | 149340823 | none | C | CT | SLC26A2 | R492W | Solute carrier family 26 (sulfate transporter), Member 2 | Diastrophic dysplasia | disease associated | Damaging | recessive | 38
1 | 207865689 | none | G | AG | LAMB3 | R635X | Laminin Beta-3 | Epidermolysis Bullosa, Junctional | disease associated, most common mutation | Truncated protein | recessive | 39
2 | 44393296 | none | T | CT | SLC3A1 | M467T | Solute carrier family 3 (cystine, dibasic, and neutral amino acid transporter) Member 1 | Cystinuria | disease associated, most common mutation | Damaging | recessive | 40
Previously described variants of unknown significance in disease associated genes
3 | 14146021 | none | A | AG | TMEM43 | M41V | Transmembrane protein 43 | Arrhythmogenic Right Ventricular Dysplasia/Cardiomyopathy | found in 1 of 150 probands with ARVC | Benign | dominant, incomplete penetrance | 13
11 | 47324447 | rs34580776 | C | CT | MYBPC3 | R326Q | Myosin-binding protein C, cardiac-type | Familial Hypertrophic Cardiomyopathy | variant of unknown significance | Intermediate | dominant, incomplete penetrance | 41
13 | 31870584 | none | A | AG | BRCA2 | I3312V | Breast cancer type 2 susceptibility protein | Breast Cancer | variant of unknown significance | Intermediate | dominant, loss of heterozygosity | 42
Novel variants potentially associated with rare disease
6 | 7528007 | novel | G | AG | DSP | R1838H | Desmoplakin | Arrhythmogenic Right Ventricular Dysplasia/Cardiomyopathy | not found | Damaging | dominant, incomplete penetrance | 13
1 | 191468879 | novel | C | CT | HRPT2 | Q430X | Parafibromin | Hyperparathyroidism-Jaw Tumour | not found | Truncated protein | dominant, loss of heterozygosity | 43
7 | 93888305 | novel | C | CT | COL1A2 | P782S | Alpha-2 type I collagen | Ehlers-Danlos Syndrome, Type VII | not found | Intermediate | recessive | 44
7 | 116976093 | novel | G | AG | CFTR | G458R | Cystic fibrosis transmembrane conductance regulator | Cystic Fibrosis | not found | Damaging | recessive | 45
19 | 40467780 | novel | A | AG | HAMP | T84A | Hepcidin Precursor | Haemochromatosis, Juvenile | not found | Intermediate | recessive |
1 | 144127058 | novel | C | CT | HFE2 | H174Y | Hemojuvelin Precursor | Haemochromatosis, Juvenile | not found | Damaging | recessive |

Rare nonsynonymous variants in genes associated with inherited disease. 1 - RefSeq reference allele in the human genome reference sequence (RefSeq), build 36. 2 - Disease associated with inherited mutations in the gene evaluated. 3 - Mutation databases evaluated for presence of the found variant, including the UniProt protein variant database, the Human Gene Mutation Database curated mutation database, Locus-Specific Mutation Databases (curated by the Human Genome Variation Society), Online Mendelian Inheritance in Man, and clinical testing laboratory databases together with associated links. 4 - Functional prediction: prediction of functional effect of mutation, derived from substitution effect prediction algorithms, Polymorphism Phenotyping (PolyPhen) and Sorting Intolerant from Tolerant (SIFT); published in vitro experimental evidence; and evaluation of typical mutational mechanism in other disease gene associated mutations.

* Also reported as I1891M. Each copy of C allele increases lipoprotein(a) level 1.8 SD and risk for coronary artery disease 2–3 fold.

** Inconclusive association in metaanalysis of osteoarthritis related SNPs though moderate association with severe hip osteoarthritis.

The patient's genome revealed three novel and potentially damaging variants in two related genes previously associated with the development of haemochromatosis. Subsequent to these findings, detailed personal and family history review failed to identify a history of haemochromatosis in the patient or family members. Available clinical testing did not show evidence of haemochromatosis based on echocardiographic parameters and liver function testing. The justification for further surveillance and testing with serum iron studies was explored with the patient. Additionally, the patient was found to harbor a novel stop mutation in a gene implicated in hyperparathyroidism and parathyroid tumours. This variant may increase probability of future development of hyperparathyroidism or parathyroid tumours through a loss-of-heterozygosity mechanism. Osteoarthritis was prominent in family history and knee pain without a formal diagnosis was present in the patient.

Pharmacogenomics

We found 64 clinically relevant previously described pharmacogenomic variants (see Table 3, Supplemental Table 3) and 12 novel, non-conservative, amino acid changing SNPs in genes known to be important for drug response. There was a heterozygous null mutation in CYP2C19, a gene product important for the metabolism of many drugs, including proton pump inhibitors, antiepileptics, and the anti-platelet agent clopidogrel. Notably, the rate of cardiovascular events is higher among patients taking clopidogrel with CYP2C19 loss of function mutations.14 In addition, the patient has three distinct genetic variations that suggest a lower maintenance dose of warfarin. The patient has the single most important variant in VKORC1 associated with a lower maintenance dose,15 is homozygous for a CYP4F2 SNP associated with lower dosing, and interestingly has a novel non-synonymous SNP in VKORC1.16 Thus, warfarin loading could be managed in an individualised manner for this patient with lower expected doses. The patient has several variants associated with a good response to statins (including lower risk for myopathy) and one variant suggesting that he may need a higher dose to achieve a good response. Finally, the patient is wild type (with no copy number variations) for the important drug metabolizing enzymes that impact hundreds of drug responses: CYP2D6, CYP2C9, and CYP3A4.

Disease Risk

While genome wide association studies have provided highly significant association of many common variants with disease, integrating these small odds ratios in the context of the individual patient remains challenging. In particular, additive or multiplicative models of even highly significant SNPs can add little to the classified status of the patient.17, 18 Further, these approaches take no account of prior probability of disease. To approach some of these concerns, we adopted established methods from within evidence based medicine that have to date rarely been applied to clinical genetics. We calculated pre-test probabilities from referenced sources for 121 diseases (Supplemental Table 4). Of the 55 diseases for which we could calculate a post-test probability, there was consistently increased genetic risk (likelihood ratio, LR > 2) for 8 diseases and decreased genetic risk (LR < 0.5) for 7 diseases (Figure 3A). Of note, an increase in genetic risk did not always translate into a high post-test probability. It was rare for us to find post-test probabilities that were an order of magnitude higher or lower than pre-test probabilities. Decision towards acting on these predictions will necessarily be a function of the post-test probability threshold for action (e.g. the post-test probability of type 2 diabetes), the consequences of action (e.g. regular testing for fasting blood sugar), and the utility and efficacy of action.

Figure 3A.

Clinical risk incorporating genetic risk estimates for major diseases.

Post-test probabilities were calculated by converting published pre-test probabilities or disease prevalence (in Caucasian males in the patient's age range, when available; see also Supplemental Table 4) to pre-test odds and multiplying these by a series of independent likelihood ratios, one for each patient allele. Only the 32 diseases with (1) available pre-test probabilities, (2) more than one associated SNP, and (3) published genotype frequencies are shown here. Disorders such as abdominal aortic aneurysm and progressive supranuclear palsy are not listed here, because they are diseases with only one available SNP. The backs of the arrow heads indicate pre-test probabilities, and the arrows point in the direction of the change in probability. Blue lines indicate a lowered post-test probability, and orange lines indicate an increased post-test probability. The number of independent SNPs used in the calculation of post-test probability for each disease is shown on the right. The advantage of plotting pre- and post-test probabilities is illustrated by several cases. For example, while the patient has increased genetic risk for Graves' disease, the pre-test probability of this disease is very low so that post-test probability also remains low. However, while the patient exhibits much less genetic contribution to his risk for prostate cancer, his prior probability is high.

Increased genetic risk for myocardial infarction (MI) took the form of 6 MI-susceptible SNPs and 2 protective SNPs (Figure 3B). In addition, the patient had risk markers at the locus (9p21) most replicated in genome wide association studies (an example is rs1333049, associated with an odds ratio of 1.5 for early onset myocardial infarction19; this marker is part of a commercial genetic “risk” test for myocardial infarction). Further, the patient harbors a single copy of the previously studied rare variant of the LPA gene that encodes the apolipoprotein(a) precursor. Notably, the patient manifested a very high Lp(a) level (114 mg/dl, Table 1), which is associated with an increased risk of cardiovascular events. This variant is associated with a 5-fold higher median plasma Lp(a) level, a 1.7 to 2-fold20 risk of coronary artery disease, and a 3-fold21 adjusted odds ratio versus non-carriers for severe coronary artery disease. This polymorphism has been associated with a low number of Kringle IV-2 (KIV-2) domain repeats in the LPA gene, high Lp(a) levels, and adverse cardiovascular events.22,23 Given the technical limitations of the short read sequencing, a precise estimate of the number of KIV-2 domains in the patient's genome sequence was not determined.

Figure 3B.

The contribution of individual alleles to overall risk for four example diseases. Using our pre-test probability estimate as a starting point, we use SNPs with an association published from a genome wide association study, ordered by decreasing number of studies showing association and decreasing sample size. The darker colour indicates a greater number of published studies reporting association of that SNP with the disease, and the size of the box scales with the logarithm of the number of samples used to calculate the likelihood ratio. The SNP related gene, if known, and the patient's genotype calls are shown to the left of the diagram. The right shows, for each SNP, the likelihood ratio of the disease for the patient's genotype, the number of studies reporting an association, the number of samples used to calculate the likelihood ratio, and the post-test probability to that point down the graph. SNPs at the top of the graph are reported in more and larger studies, and we have greater confidence in their association with disease. The post-test probabilities are calculated by serially stepping down the list of SNPs and calculating an updated post-test probability using the contribution of that genotype, while including the contribution to our estimate of the SNPs above.
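
The serial stepping described in this legend can be sketched as follows. This is an illustration under stated assumptions, not the authors' code: SNPs are treated as independent, the sort key (number of studies first, then sample size) is one possible reading of the ordering, and the SNP names and values are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class SnpEvidence:
    snp_id: str      # hypothetical identifier
    lr: float        # likelihood ratio for the patient's genotype
    n_studies: int   # number of studies reporting the association
    n_samples: int   # samples used to calculate the likelihood ratio


def serial_post_test(pre_test_probability, snps):
    """Return (snp_id, running post-test probability) for each SNP, best-supported first."""
    # Assumed ordering: more studies first, then larger sample size.
    ordered = sorted(snps, key=lambda s: (-s.n_studies, -s.n_samples))
    odds = pre_test_probability / (1.0 - pre_test_probability)
    steps = []
    for snp in ordered:
        odds *= snp.lr  # each SNP updates the estimate carried down from above
        steps.append((snp.snp_id, odds / (1.0 + odds)))
    return steps


# Hypothetical evidence for one disease with a 20% pre-test probability.
evidence = [
    SnpEvidence("snp_b", lr=0.5, n_studies=1, n_samples=2000),
    SnpEvidence("snp_a", lr=2.0, n_studies=3, n_samples=9000),
]
steps = serial_post_test(0.20, evidence)
for snp_id, probability in steps:
    print(snp_id, round(probability, 3))
```

Because the updates are multiplicative on the odds scale, the final value does not depend on the order. The ordering only determines which intermediate estimates rest on the best-supported SNPs.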

Gene-environment interaction and conditionally dependent risk

We placed the disease associated genetic risk into the context of known environmental and behavioural modifiers, as well as predisposing conditions (Figure 3C). Diseases that may be independently associated with low genetic risk (e.g. abdominal aortic aneurysm) are visualised in the context of others that may be aetiologically related but for which genetic risk may be higher (e.g. obesity, which predisposes to type 2 diabetes and hypertension). Thus, overall risk can be assessed using both direct and conditionally dependent information because they are illustrated together in the circuit. For example, we predict a reduced risk probability for hypertension of 16.8% (LR = 0.81) relative to the normal population; however, the patient has a substantially elevated genetic risk for obesity (LR = 6.28), imparting a high post-test risk of 56.1% for a predisposing risk factor for hypertension. Furthermore, hypertension is associated with a number of modifiable environmental factors that act on risk either directly (e.g. sodium intake) or conditionally by association with another node in the circuit (e.g. antipsychotics). Although no methods currently exist for statistical integration of such conditionally dependent risks, interpretation of findings in the context of the causal circuit diagram allows individualised assessment of the combined effect of environmental and genetic risk.

Figure 3C. Gene-environment interaction.

A conditional dependency diagram for diseases represented in the patient's genetic risk profile. Only diseases with a calculable post-test risk probability > 10% are shown. The size of the disease name text is proportional to the post-test risk probability. A solid black directed edge is drawn between disease names if one disease is known to be a predisposing aetiological factor for another disease. Environmental factors that are potentially modifiable are shown around the circumference, and a dashed grey directed edge is drawn between an environmental factor and a disease if the factor has been frequently published in association with the aetiology of the disease. Environmental factors are portrayed in a size proportional to the number of diseases they are associated with in the circuit. The intensity of the colours of the factor circles represents the maximum post-test risk probability among those diseases directly associated with each factor.
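
The structure of such a diagram can be sketched as a small directed graph. This is an illustrative data structure only, not the software used to draw the figure. The two post-test probabilities and the two edges for hypertension are taken from the text; the placeholder disease, the variable names and the helper functions are hypothetical.

```python
# Post-test probabilities; the placeholder entry is hypothetical and exists
# only to show the 10% display threshold being applied.
post_test = {
    "obesity": 0.561,
    "hypertension": 0.168,
    "placeholder low-risk disease": 0.04,
}

# Solid edges: (predisposing disease, disease it predisposes to).
disease_edges = [("obesity", "hypertension")]

# Dashed edges: (modifiable environmental factor, disease).
factor_edges = [("sodium intake", "hypertension")]


def shown_diseases(probabilities, threshold=0.10):
    """Keep only diseases whose post-test probability exceeds the display threshold."""
    return {d: p for d, p in probabilities.items() if p > threshold}


def context_for(disease, shown, disease_edges, factor_edges):
    """Collect predisposing diseases (with their risk) and direct factors for one disease."""
    predisposing = {
        src: shown[src]
        for src, dst in disease_edges
        if dst == disease and src in shown
    }
    factors = sorted(f for f, dst in factor_edges if dst == disease)
    return predisposing, factors


shown = shown_diseases(post_test)
predisposing, factors = context_for("hypertension", shown, disease_edges, factor_edges)
print(predisposing, factors)
```

Reading hypertension together with its upstream node makes the point of the diagram concrete: the direct genetic estimate is modest, while a predisposing condition carries a much higher post-test probability.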

Genetic counselling

We discussed the possibility that this clinical assessment incorporating a personal genome might uncover high risk of a serious disease, even some that do not have therapies. In addition, we described the reproductive implications of heterozygous status for autosomal recessive diseases such as cystic fibrosis, potentially not predictable from family history (Table 2, Figure 1). We also warned of increases or decreases in genetic risk for common diseases. We noted that the vast majority of the sequence information available is currently difficult to interpret. We discussed error rates and validation processes. We addressed the possibility of discrimination based on genetics. While a specialised physician can provide information for a patient seeking a specific single disease genetic test, patients with whole genome sequence data need information on significantly more diseases with a wide clinical range (Table 2). For this reason, we offered extended access to clinical geneticists, genetic counsellors and clinical lab directors to interpret the information we presented.

Discussion

We provide an approach to comprehensive analysis of a human genome in a defined clinical context. We assessed whole genome genetic risk, focusing on variants in genes associated with Mendelian disease, novel rare variants across the genome, and variants of known pharmacogenomic importance. In addition, we developed an approach to the integration of disease risk across multiple common polymorphisms. Although the methodology is nascent, the results provide a proof-of-principle that clinically meaningful information can be derived about disease risk and the response to medications in patients with whole genome sequence data. A prominent aspect of the patient's family history (Figure 1) is the diagnosis of arrhythmogenic right ventricular dysplasia/cardiomyopathy in his first cousin (III-3) and the sudden death of his first cousin once removed (IV-1). Our patient shares 12.5% of his genetic information with his first cousin and 6.25% with that relative's son and, while a diagnostic workup would involve targeted sequencing of DNA from these individuals, our analysis uncovered several variants in genes with potential explanatory value. Most were common variants. One variant (in MYBPC3) was previously associated with hypertrophic cardiomyopathy but seems in fact to be a common variant, exemplifying the limitations of current variant databases. Two rare variants in genes previously associated with ARVD/C (TMEM43, DSP) were novel.
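
The 12.5% and 6.25% figures quoted above follow from the standard coefficient of relationship, which sums (1/2)^L over every path of L parent-child links connecting two relatives through a shared ancestor. A minimal sketch, assuming no inbreeding and two shared grandparents (the function name is ours):

```python
from fractions import Fraction


def relatedness(path_lengths):
    """Expected shared fraction: sum of (1/2)**L over paths through each shared ancestor."""
    return sum((Fraction(1, 2) ** length for length in path_lengths), Fraction(0))


# First cousins: two shared grandparents, each linked by a path of 4 steps.
first_cousins = relatedness([4, 4])
# First cousin once removed: one generation further, so 5 steps per path.
first_cousin_once_removed = relatedness([5, 5])

print(float(first_cousins), float(first_cousin_once_removed))
```

These are expected values. The fraction actually shared by any two relatives varies around them because of recombination.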

Our patient reported a prominent family history of vascular disease including aortic aneurysm and coronary artery disease (Figure 1, individuals II-1, II-2, I-1, I-2). While it is possible that the collagen variant we found contributes to familial risk of aortic aneurysm, disease in this family is more likely related to atherosclerotic disease. In estimating the risk of coronary artery disease, we integrated the most replicated risk associations, likelihood ratio projections from the entire literature, and a known rare variant in the LPA gene that may not have been found using chip based genotyping. According to the ATP-III guidelines,24 our patient does not currently have major risk factors for coronary artery disease and would require an LDL > 190 mg/dl to qualify for lipid lowering therapy. However, he is borderline for three major risk factors and any two of these would lower the LDL threshold for treatment to 160 mg/dl (his measured level was 156 mg/dl). Although no standards yet exist for the incorporation of global genetic risk in cardiovascular risk assessment, physicians are accustomed to incorporating many sources of information in clinical decision-making. In this case, the patient's physician took account of this lifetime genetic risk and knowledge of his likely response to therapy into the clinical decision to recommend a lipid lowering medication. Part of this decision focused on the likely response to this therapy. His genome includes variants (Table 3) that predict greater likelihood of beneficial effect for statin medications and lower risk for the adverse effect of skeletal myopathy. In addition, a significant reduction of attributable risk was found in carriers of the LPA risk allele who took aspirin,20 leading to a discussion between the physician and his patient on the threshold for primary prevention with aspirin therapy. 
Given a predisposition to coronary artery disease and other diseases on which risk is conditionally dependent (Figure 3c), understanding the patient's potential response to clopidogrel and warfarin may be an important aspect of individualising future medical therapy. The patient is at risk for clopidogrel resistance as a result of his CYP2C19 loss-of-function mutation, and his physician recommended either a higher dose of clopidogrel in the event of future use or consideration of newer agents with alternative metabolism. In contrast, should the patient develop an indication for warfarin, his genotype at the VKORC1 and CYP4F2 loci suggests he should take lower initial doses of warfarin. The novel VKORC1 variant may have additional effects on warfarin metabolism.
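As an illustration of how such findings might be queried at the point of prescribing, the sketch below stores the clopidogrel and warfarin rows of Table 3a and looks them up by drug name. The data structure and the function `findings_for` are assumptions made for this example, not part of the analysis pipeline described in the paper, and the excerpt is not a dosing tool.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PgxFinding:
    drug: str
    gene: str
    rsid: str
    genotype: str
    summary: str
    evidence: str


# Rows copied from Table 3a; the clopidogrel drug label is shortened from
# "Clopidogrel & CYP2C19 substrates" for lookup convenience.
TABLE_3A_EXCERPT = [
    PgxFinding("Clopidogrel", "CYP2C19", "rs4244285", "A/G",
               "CYP2C19 poor metaboliser; many drugs may need adjustment", "High"),
    PgxFinding("Warfarin", "VKORC1", "rs9923231", "C/T",
               "Requires lower dose", "High"),
    PgxFinding("Warfarin", "CYP4F2", "rs2108622", "C/C",
               "Requires lower dose", "High"),
]


def findings_for(drug, table=TABLE_3A_EXCERPT):
    """Return every stored finding for a drug (case-insensitive match)."""
    return [f for f in table if f.drug.lower() == drug.lower()]


for f in findings_for("warfarin"):
    print(f"{f.gene} {f.rsid} {f.genotype}: {f.summary} ({f.evidence})")
```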

In contrast, our patient did not report a family history of haemochromatosis or parathyroid tumours yet harbours some genetic risk for these conditions. An important contribution of clinical-genetic risk integration is the appropriate consideration of further screening studies. In addition, risk alleles may be discovered that carry reproductive or familial significance rather than personal significance (such as those for breast or ovarian cancer in a male patient). Appropriate incorporation of such risk alleles into both medical and ethical discussion is warranted.

Limitations

There remain significant limitations to our ability to comprehensively integrate genetic information into clinical care. For example, there is a lack of a comprehensive rare mutation database or a framework for the statistical combination of risk estimates from multiple common polymorphisms. Since risk estimates change as more studies are completed, a continuously updated pipeline is required. On a technical level, we remain limited in our ability to improve error rates associated with sequencing, in particular, detecting structural variants. Finally, gene-environment interactions are challenging to quantify and have to date been little studied.
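For context on the combination problem noted above, the abstract describes estimating posterior probabilities from an age and sex appropriate prior probability and genotype-specific likelihood ratios. A minimal sketch of that arithmetic follows. It assumes the likelihood ratios are independent and can simply be multiplied, which is exactly the kind of assumption the missing statistical framework would need to justify; the prior and likelihood ratio values are hypothetical placeholders, not figures from this study.

```python
def posterior_probability(prior, likelihood_ratios):
    """Convert a prior probability to odds, multiply by each likelihood
    ratio (assumed independent), and convert back to a probability."""
    odds = prior / (1 - prior)
    for lr in likelihood_ratios:
        odds *= lr
    return odds / (1 + odds)


# Hypothetical inputs for illustration only.
prior = 0.10                  # assumed pre-test probability of disease
genotype_lrs = [1.2, 0.9, 1.5]  # assumed per-genotype likelihood ratios

post = posterior_probability(prior, genotype_lrs)
print(f"{post:.3f}")  # 0.153
```

Because risk estimates change as studies accumulate, the list of likelihood ratios in such a calculation would need to be regenerated whenever the underlying literature is updated.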

Conclusion

As whole genome sequencing becomes more widespread, obtaining genomic information will no longer be the limiting factor in the application of genetics to clinical medicine. Developing tools to integrate genetic data on common and rare variants along with clinical data to assist in clinical decision making is a large step towards individualised medicine. The transition to a new era of genome-informed medical care will require a team approach that incorporates medical and genetics professionals, ethicists and healthcare delivery organizations.

Supplementary Material

Supplementary Methods

Table 3a.

Pharmacogenomic variants with summary of effect and level of evidence

| Drug | Summary | Level of Evidence (PMID) | Gene | Gene Name | rsID or SNP location | Patient Genotype |
|---|---|---|---|---|---|---|
| HMG CoA Reductase Inhibitors (statins) | No increased risk of myopathy | High (2811365; 17177112; 18650507) | SLCO1B1 | solute carrier organic anion transporter family, member 1B1 | rs4149056 | T/T |
| Clopidogrel & CYP2C19 substrates | CYP2C19 poor metaboliser; many drugs may need adjustment | High (19106084) | CYP2C19 | cytochrome P450, family 2, subfamily C, polypeptide 19 | rs4244285 | A/G |
| Warfarin | Requires lower dose | High (15888487) | VKORC1 | vitamin K epoxide reductase complex, subunit 1 | rs9923231 | C/T |
| Warfarin | Requires lower dose | High (19270263) | CYP4F2 | cytochrome P450, family 4, subfamily F, polypeptide 2 | rs2108622 | C/C |
| Atenolol; Metoprolol | May be better than calcium-channel blockers | High (18615004; 12844134; 16815314) | ADRB1 | adrenergic, beta-1-, receptor | rs1801252 | A/A |
| Fluvastatin | Good response | Medium (18781850) | SLCO1B1 | solute carrier organic anion transporter family, member 1B1 | rs11045819 | A/C |
| Pravastatin | May have good response | Medium (15199031) | HMGCR | 3-hydroxy-3-methylglutaryl-Coenzyme A reductase | rs17238540 | T/T |
| Pravastatin, Simvastatin | No reduced efficacy | Medium (15199031) | HMGCR | 3-hydroxy-3-methylglutaryl-Coenzyme A reductase | rs17244841 | A/A |
| Beta blockers | Other options may be preferred | Medium (16189366) | ADRB2 | adrenergic, beta-2-, receptor, surface | rs1042713 | A/G |
| Beta blockers | Other options may be preferred | Medium (12835612; 16189366) | ADRB2 | adrenergic, beta-2-, receptor, surface | rs1042714 | C/C |
| Metoprolol and other CYP2D6 substrates | Normal CYP2D6 metaboliser | Medium (19037197) | CYP2D6 | cytochrome P450, family 2, subfamily D, polypeptide 6 | rs3892097; rs1800716 | C/C |
| Desipramine; Fluoxetine | Depression may improve more than average | Medium (19414708) | BDNF | brain-derived neurotrophic factor | rs61888800 | G/G |
| Metformin | Less likely to respond | Medium (18544707) | CDKN2A/B | cyclin-dependent kinase inhibitor 2A/2B | rs10811661 | T/T |
| Troglitazone | Less likely to respond | Medium (18544707) | CDKN2A/B | cyclin-dependent kinase inhibitor | rs10811661 | T/T |

Table 3b.

Pharmacogenomic rare and novel non-synonymous damaging variants (predicted damaging by PhD-SNP algorithm46)

| Drug | Effect type | Coding Change | Gene | Gene Name | SNP location | Patient Genotype |
|---|---|---|---|---|---|---|
| adefovir dipivoxil, tenofovir | Pharmacokinetic | H191D | AK2 | adenylate kinase 2 | 1:33251518 | C/G |
| infliximab | Pharmacodynamic | V793M | NOD2 | nucleotide-binding oligomerization domain containing 2 | 16:49303700 | A/G |
| infliximab | Pharmacodynamic | S431L | NOD2 | nucleotide-binding oligomerization domain containing 2 | 16:49302615 | C/T |
| trastuzumab, erlotinib, gefitinib, lapatinib, PHA-665752, chloroquine, cisplatin, gemcitabine, cetuximab | Pharmacodynamic | H578Y | ERBB3 | v-erb-b2 erythroblastic leukemia viral oncogene homolog 3 (avian) | 12:54774480 | C/T |
| mercaptopurine, methotrexate | Pharmacodynamic | I485F | MYLK | myosin light chain kinase | 3:124923809 | A/A |
| atorvastatin, fluvastatin, hmg coa reductase inhibitors, lovastatin, pravastatin, rosuvastatin, simvastatin | Pharmacokinetic | Y21C | SLC15A1 | solute carrier family 15 (oligopeptide transporter), member 1 | 13:98176691 | C/T |
| cladribine, fludarabine, uridine, mercaptopurine, thioguanine, antineoplastic agents, gemcitabine, azathioprine, folic acid | Pharmacokinetic | S443F | SLC28A3 | solute carrier family 28 (sodium-coupled nucleoside transporter), member 3 | 9:86090799 | A/G |
| antimetabolites, mercaptopurine, methotrexate, adenosine, antineoplastic agents, azathioprine, folic acid, thioguanine | Pharmacodynamic | P246L | AHCY | adenosylhomocysteinase | 20:32342227 | A/G |
| clozapine | Pharmacodynamic | T262K | HLA-DRB5 | major histocompatibility complex, class II, DR beta 5 | 6:32593811 | T/T |
| mercaptopurine, methotrexate | Pharmacodynamic | I14T | MICA | MHC class I polypeptide-related sequence A | 6:31484467 | C/T |
| cimetidine, estrone, antiinflammatory and antirheumatic products, non-steroids, ibuprofen, indomethacin, ketoprofen, methotrexate, phenylbutazone, piroxicam, probenecid, atorvastatin, fluvastatin, hmg coa reductase inhibitors, lovastatin, pravastatin, rosuvastatin, simvastatin, adefovir dipivoxil, tenofovir, antineoplastic agents, cyanocobalamin, folic acid, leucovorin, pyridoxine | Pharmacokinetic | R534Q | SLC22A8 | solute carrier family 22 (organic anion transporter), member 8 | 11:62517376 | C/T |
| warfarin | Pharmacodynamic | G64R | VKORC1 | vitamin K epoxide reductase complex, subunit 1 | 16:31012227 | C/T |

Acknowledgements

The authors would like to acknowledge the invaluable help of Josephine Puryear, Joshua Spin, Emidio Capriotti, and Connie Oshiro. We also thank Yuti Bhide and Prajkta Bhide from Optra Systems for the curation of disease-associated SNPs from the literature.

Grant support: This work was supported by grants from the National Institutes of Health General Medical Sciences (GM61374 and associated ARRA supplement, GM079719), National Institutes of Health Heart, Lung And Blood Institute (F32HL097462, LM009719, K08 HL083914), National Institutes of Health Human Genome Research Institute (HG003389), Howard Hughes Medical Institute, and The John D. and Catharine T. MacArthur Foundation for The Law and Neuroscience Project.

Footnotes

Conflicts of Interest: RA is a consultant to a direct-to-consumer genetic testing company, 23andMe. GC is an advisor to several sequencing and direct-to-consumer companies (23andMe, Knome, Helicos). Full list accessible here: http://arep.med.harvard.edu/gmc/tech.html. KO was a paid consultant as a member of the Genetic Counseling Task Force for Navigenics from 6/07 to 8/09. SQ is a founder, consultant and equity holder in Helicos BioSciences. DP is an equity holder in Helicos BioSciences. EA, DB, AB, RC, MC, FD, JD, LG, HG, JH, LMH, LH, TK, JWK, AM, NN, AP, AR, HS, KS, JT, CT, RW, MTW, MW, and AZ declare that they have no conflicts of interest.

References

  • 1. Choi M, Scholl UI, Ji W, et al. Genetic diagnosis by whole exome capture and massively parallel DNA sequencing. Proc Natl Acad Sci U S A. 2009;106(45):19096–101. doi: 10.1073/pnas.0910672106.
  • 2. Ng SB, Turner EH, Robertson PD, et al. Targeted capture and massively parallel sequencing of 12 human exomes. Nature. 2009;461(7261):272–6. doi: 10.1038/nature08250.
  • 3. Wheeler DA, Srinivasan M, Egholm M, et al. The complete genome of an individual by massively parallel DNA sequencing. Nature. 2008;452(7189):872–6. doi: 10.1038/nature06884.
  • 4. Kim JI, Ju YS, Park H, et al. A highly annotated whole-genome sequence of a Korean individual. Nature. 2009;460(7258):1011–5. doi: 10.1038/nature08211.
  • 5. Levy S, Sutton G, Ng PC, et al. The diploid genome sequence of an individual human. PLoS biology. 2007;5(10):e254. doi: 10.1371/journal.pbio.0050254.
  • 6. Tucker T, Marra M, Friedman JM. Massively parallel sequencing: the next big thing in genetic medicine. Am J Hum Genet. 2009;85(2):142–54. doi: 10.1016/j.ajhg.2009.06.022.
  • 7. Pushkarev D, Neff NF, Quake SR. Single-molecule sequencing of an individual human genome. Nat Biotechnol. 2009;27(9):847–52. doi: 10.1038/nbt.1561.
  • 8. Klein TE, Chang JT, Cho MK, et al. Integrating genotype and phenotype information: an overview of the PharmGKB project. Pharmacogenetics Research Network and Knowledge Base. The pharmacogenomics journal. 2001;1(3):167–70. doi: 10.1038/sj.tpj.6500035.
  • 9. Morner S, Richard P, Kazzam E, et al. Identification of the genotypes causing hypertrophic cardiomyopathy in northern Sweden. J Mol Cell Cardiol. 2003;35(7):841–9. doi: 10.1016/s0022-2828(03)00146-9.
  • 10. Van Driest SL, Vasile VC, Ommen SR, et al. Myosin binding protein C mutations and compound heterozygosity in hypertrophic cardiomyopathy. J Am Coll Cardiol. 2004;44(9):1903–10. doi: 10.1016/j.jacc.2004.07.045.
  • 11. Merner ND, Hodgkinson KA, Haywood AF, et al. Arrhythmogenic right ventricular cardiomyopathy type 5 is a fully penetrant, lethal arrhythmic disorder caused by a missense mutation in the TMEM43 gene. Am J Hum Genet. 2008;82(4):809–21. doi: 10.1016/j.ajhg.2008.01.010.
  • 12. Yang Z, Bowles NE, Scherer SE, et al. Desmosomal dysfunction due to mutations in desmoplakin causes arrhythmogenic right ventricular dysplasia/cardiomyopathy. Circ Res. 2006;99(6):646–55. doi: 10.1161/01.RES.0000241482.19382.c6.
  • 13. van der Zwaag PA, Jongbloed JD, van den Berg MP, et al. A genetic variants database for arrhythmogenic right ventricular dysplasia/cardiomyopathy. Human mutation. 2009;30(9):1278–83. doi: 10.1002/humu.21064.
  • 14. Shuldiner AR, O'Connell JR, Bliden KP, et al. Association of cytochrome P450 2C19 genotype with the antiplatelet effect and clinical efficacy of clopidogrel therapy. JAMA. 2009;302(8):849–57. doi: 10.1001/jama.2009.1232.
  • 15. Klein TE, Altman RB, Eriksson N, et al. Estimation of the warfarin dose with clinical and pharmacogenetic data. N Engl J Med. 2009;360(8):753–64. doi: 10.1056/NEJMoa0809329.
  • 16. Caldwell MD, Awad T, Johnson JA, et al. CYP4F2 genetic variant alters required warfarin dose. Blood. 2008;111(8):4106–12. doi: 10.1182/blood-2007-11-122010.
  • 17. Jakobsdottir J, Gorin MB, Conley YP, Ferrell RE, Weeks DE. Interpretation of genetic association studies: markers with replicated highly significant odds ratios may be poor classifiers. PLoS genetics. 2009;5(2):e1000337. doi: 10.1371/journal.pgen.1000337.
  • 18. Kathiresan S, Melander O, Anevski D, et al. Polymorphisms associated with cholesterol and risk of cardiovascular events. N Engl J Med. 2008;358(12):1240–9. doi: 10.1056/NEJMoa0706728.
  • 19. Genome-wide association study of 14,000 cases of seven common diseases and 3,000 shared controls. Nature. 2007;447(7145):661–78. doi: 10.1038/nature05911.
  • 20. Chasman DI, Shiffman D, Zee RY, et al. Polymorphism in the apolipoprotein(a) gene, plasma lipoprotein(a), cardiovascular disease, and low-dose aspirin therapy. Atherosclerosis. 2009;203(2):371–6. doi: 10.1016/j.atherosclerosis.2008.07.019.
  • 21. Luke MM, Kane JP, Liu DM, et al. A polymorphism in the protease-like domain of apolipoprotein(a) is associated with severe coronary artery disease. Arterioscler Thromb Vasc Biol. 2007;27(9):2030–6. doi: 10.1161/ATVBAHA.107.141291.
  • 22. Kamstrup PR, Tybjaerg-Hansen A, Steffensen R, Nordestgaard BG. Pentanucleotide repeat polymorphism, lipoprotein(a) levels, and risk of ischemic heart disease. J Clin Endocrinol Metab. 2008;93(10):3769–76. doi: 10.1210/jc.2008-0830.
  • 23. Clarke R, Peden JF, Hopewell JC, et al. Genetic variants associated with Lp(a) lipoprotein level and coronary disease. N Engl J Med. 2009;361(26):2518–28. doi: 10.1056/NEJMoa0902604.
  • 24. Grundy SM, Cleeman JI, Merz CN, et al. Implications of recent clinical trials for the National Cholesterol Education Program Adult Treatment Panel III guidelines. Circulation. 2004;110(2):227–39. doi: 10.1161/01.CIR.0000133317.49796.0E.
  • 25. http://gvs.gs.washington.edu/GVS/.
  • 26. Ng PC, Henikoff S. SIFT: Predicting amino acid changes that affect protein function. Nucleic Acids Res. 2003;31(13):3812–4. doi: 10.1093/nar/gkg509.
  • 27. Stenson PD, Mort M, Ball EV, et al. The Human Gene Mutation Database: 2008 update. Genome medicine. 2009;1(1):13. doi: 10.1186/gm13.
  • 28. Ramensky V, Bork P, Sunyaev S. Human non-synonymous SNPs: server and survey. Nucleic Acids Res. 2002;30(17):3894–900. doi: 10.1093/nar/gkf493.
  • 29. Human Genome Variation Society 2009 (Accessed at http://www.hgvs.org/dblist/glsdb.html.)
  • 30. mtSNP: a database of human mitochondrial genome polymorphisms. Ann NY Acad Sci. 2004;1011:7–20. doi: 10.1007/978-3-662-41088-2_2.
  • 31. The Universal Protein Resource (UniProt) 2009. Nucleic Acids Res. 2009;37(Database issue):D169–74. doi: 10.1093/nar/gkn664.
  • 32. Jegga AG, Gowrisankar S, Chen J, Aronow BJ. PolyDoms: a whole genome database for the identification of non-synonymous coding SNPs with the potential to impact disease. Nucleic Acids Res. 2007;35(Database issue):D700–6. doi: 10.1093/nar/gkl826.
  • 33. Online Mendelian Inheritance in Man. (Accessed at http://www.ncbi.nlm.nih.gov/omim/.)
  • 34. Evangelou E, Chapman K, Meulenbelt I, et al. Large-scale analysis of association between GDF5 and FRZB variants and osteoarthritis of the hip, knee, and hand. Arthritis and rheumatism. 2009;60(6):1710–21. doi: 10.1002/art.24524.
  • 35. Allen KJ, Gurrin LC, Constantine CC, et al. Iron-overload-related disease in HFE hereditary hemochromatosis. N Engl J Med. 2008;358(3):221–30. doi: 10.1056/NEJMoa073286.
  • 36. Tomatsu S, Orii KO, Fleming RE, et al. Contribution of the H63D mutation in HFE to murine hereditary hemochromatosis. Proc Natl Acad Sci U S A. 2003;100(26):15788–93. doi: 10.1073/pnas.2237037100.
  • 37. Hymes J, Stanley CM, Wolf B. Mutations in BTD causing biotinidase deficiency. Human mutation. 2001;18(5):375–81. doi: 10.1002/humu.1208.
  • 38. Rossi A, Superti-Furga A. Mutations in the diastrophic dysplasia sulfate transporter (DTDST) gene (SLC26A2): 22 novel mutations, mutation review, associated skeletal phenotypes, and diagnostic relevance. Human mutation. 2001;17(3):159–71. doi: 10.1002/humu.1.
  • 39. Pulkkinen L, Christiano AM, Gerecke D, et al. A homozygous nonsense mutation in the beta 3 chain gene of laminin 5 (LAMB3) in Herlitz junctional epidermolysis bullosa. Genomics. 1994;24(2):357–60. doi: 10.1006/geno.1994.1627.
  • 40. Calonge MJ, Gasparini P, Chillaron J, et al. Cystinuria caused by mutations in rBAT, a gene involved in the transport of cystine. Nat Genet. 1994;6(4):420–5. doi: 10.1038/ng0494-420.
  • 41. Maron BJ, Niimura H, Casey SA, et al. Development of left ventricular hypertrophy in adults in hypertrophic cardiomyopathy caused by cardiac myosin-binding protein C gene mutations. J Am Coll Cardiol. 2001;38(2):315–21. doi: 10.1016/s0735-1097(01)01386-9.
  • 42. Yoshikawa Y, Morimatsu M, Ochiai K, et al. Insertion/deletion polymorphism in the BRCA2 nuclear localization signal. Biomedical research (Tokyo, Japan) 2005;26(3):109–16. doi: 10.2220/biomedres.26.109.
  • 43. Shattuck TM, Valimaki S, Obara T, et al. Somatic and germ-line mutations of the HRPT2 gene in sporadic parathyroid carcinoma. N Engl J Med. 2003;349(18):1722–9. doi: 10.1056/NEJMoa031237.
  • 44. Schwarze U, Hata R, McKusick VA, et al. Rare autosomal recessive cardiac valvular form of Ehlers-Danlos syndrome results from mutations in the COL1A2 gene that activate the nonsense-mediated RNA decay pathway. Am J Hum Genet. 2004;74(5):917–30. doi: 10.1086/420794.
  • 45. Cuppens H, Legius E, Cabello P, et al. Association between XV2c/CS7/KM19/D9 haplotypes and the delta F508 mutation. A study of 57 Belgian families. Human genetics. 1990;85(4):402–3. doi: 10.1007/BF02428277.
  • 46. Capriotti E, Calabrese R, Casadio R. Predicting the insurgence of human genetic diseases associated to single point protein mutations with support vector machines and evolutionary information. Bioinformatics. 2006;22(22):2729–34. doi: 10.1093/bioinformatics/btl423.
