Author manuscript; available in PMC: 2015 Dec 1.
Published in final edited form as: Circ Cardiovasc Genet. 2014 Sep 1;7(6):751–759. doi: 10.1161/CIRCGENETICS.113.000578

Targeted Analysis of Whole Genome Sequence Data to Diagnose Genetic Cardiomyopathy

Jessica R Golbus 1,#, Megan J Puckelwartz 1,#, Lisa Dellefave-Castillo 1, John P Fahrenbach 1, Viswateja Nelakuditi 1, Lorenzo L Pesce 1, Peter Pytel 1, Elizabeth M McNally 1
PMCID: PMC4270910  NIHMSID: NIHMS625206  PMID: 25179549

Abstract

Background

Cardiomyopathy is highly heritable but genetically diverse. At present, genetic testing for cardiomyopathy uses targeted sequencing to simultaneously assess the coding regions of more than 50 genes. New genes are routinely added to panels to improve the diagnostic yield. With the anticipated $1000 genome, it is expected that genetic testing will shift towards comprehensive genome sequencing accompanied by targeted gene analysis. Therefore, we assessed the reliability of whole genome sequencing and targeted analysis to identify cardiomyopathy variants in 11 subjects with cardiomyopathy.

Methods and Results

Whole genome sequencing with an average of 37× coverage was combined with targeted analysis focused on 204 genes linked to cardiomyopathy. Genetic variants were scored using multiple prediction algorithms combined with frequency data from public databases. This pipeline yielded 1-14 potentially pathogenic variants per individual. Variants were further analyzed using clinical criteria and/or segregation analysis. Three of three previously identified primary mutations were detected by this analysis. In six subjects for whom the primary mutation was previously unknown, we identified mutations that segregated with disease, had clinical correlates, and/or had additional pathological correlation to provide evidence for causality. For two subjects with previously known primary mutations, we identified additional variants that may act as modifiers of disease severity. In total, we identified the likely pathological mutation in 9 of 11 (82%) subjects.

Conclusions

These pilot data demonstrate that ~30-40× coverage whole genome sequencing combined with targeted analysis is feasible and sensitive to identify rare variants in cardiomyopathy-associated genes.

Keywords: cardiomyopathy, genetics, human, genomics, whole genome sequencing

Introduction

Inherited cardiomyopathy is genetically diverse and has been linked to mutations that are rare in the general population. Genetic testing for cardiomyopathy is a useful adjunct for diagnosis as it is positioned to provide prognostic information for individuals and families. Specific gene mutations may suggest a greater risk of arrhythmias or a rapid course and, importantly, gene-positive individuals with early signs of cardiomyopathy may benefit from early treatment.1-4 Currently, clinical genetic testing for cardiomyopathy relies on screening the coding region of multiple genes simultaneously as a gene panel. The first gene panel for cardiomyopathy, introduced in 2007, sampled only 5 genes, while current panels assess more than 50 different genes.5 Concomitant with panel expansion, sensitivity for mutation identification has increased.

Massively parallel, next generation sequencing is now being transitioned into the clinical arena.6 Options include whole exome sequencing (WES) and whole genome sequencing (WGS). WES interrogates only the coding sequence and relies on an exon capture step. This capture step is limited by oligonucleotide design and may be incomplete because of uneven exon capture arising from GC bias, off-target sequencing, and omission of non-canonical transcripts, all especially important in the heart.7, 8 WES arrays are regularly updated to reflect changes in the annotation of the coding region of the genome.9 Like panel-based testing, WES may necessitate recapture and resequencing as genome annotation continues. Currently, WES is less expensive than WGS, but that is rapidly changing. High-throughput sequencing technology is now available that can produce high coverage (30×) genomes for $1000.00 (http://www.illumina.com/systems/hiseq-x-sequencing-system.ilmn). This progression in sequencing technology narrows the cost gap between targeted sequencing and whole genome approaches, making whole genome sequencing a viable alternative to panel sequencing.

The declining cost of WGS and the 100-fold increase in genome coverage make it a viable alternative to genetic panels and WES for genetic testing of cardiomyopathy. Transitioning WGS into a useful tool for diagnosing cardiomyopathy requires development of an analytical approach that permits detection of rare mutations. As a first step for cardiomyopathy WGS analysis, we created a “super gene-set” with 204 known and putative cardiomyopathy genes. This super gene-set exceeds the number of genes on most clinically available gene panels for cardiomyopathy since the super gene-set was extended based on association with cardiomyopathy in humans as well as animal based modeling. WGS data were filtered through this super gene-set to determine the feasibility of using WGS as a reliable screening method for cardiomyopathy variants. To test the sensitivity of WGS combined with this analytic approach, we applied it to 11 unrelated subjects with DCM. The pathogenic or likely pathogenic variants were identified in nine of eleven subjects (82% detection rate). These data demonstrate that ~30-40× coverage WGS is a reliable alternative to panel-based testing for cardiomyopathy. Furthermore, for those individuals in whom the super gene-set was unrevealing, the remaining genome sequence is available for immediate further interrogation instead of waiting for additional panel-based testing, justifying the extra cost and time associated with WGS.

Methods

Study subjects

Eleven unrelated subjects with nonischemic DCM were selected for WGS. Personal and family history of cardiomyopathy was available for all subjects. The study was approved by the University of Chicago Institutional Review Board. All study subjects provided written informed consent. Genetic counseling for WGS was provided.

Generation of whole genome sequence data

Genomic DNA was extracted from the peripheral blood of ten study subjects and the explanted heart tissue of one subject. Reversible terminator massively parallel sequencing was performed by Illumina (San Diego, CA) on the HiSeq2000. Paired-end reads were mapped to the NCBI reference genome 37.1 (hg19) and variants were called using Illumina’s proprietary software (ELAND/CASAVA).

The Myopathy Super Gene-Set

The myopathy gene set comprises 204 genes (245 transcripts) identified by published association with cardiac or skeletal myopathies or cardiac arrhythmias as a single gene disorder, association-based study, or in animal models. In genes with multiple transcripts, those transcripts with the highest expression in cardiac and skeletal muscle were included in the gene set. For all transcripts in the gene set, exonic boundaries +/− 10 base pairs (bps) were downloaded from the Ensembl Genome Browser (http://useast.ensembl.org/index.html).
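As a rough illustration of the +/− 10 bp padding described above, the following Python sketch extends exon coordinates and merges any intervals that overlap after padding. The function name, the example coordinates, and the 1-based inclusive convention are assumptions made for this example; they are not taken from the study.

```python
from typing import List, Tuple

Interval = Tuple[int, int]  # (start, end), 1-based inclusive (assumed convention)


def pad_and_merge(exons: List[Interval], pad: int = 10) -> List[Interval]:
    """Extend each exon by `pad` bp on both sides, then merge overlapping or adjacent intervals."""
    padded = sorted((max(1, start - pad), end + pad) for start, end in exons)
    merged: List[Interval] = []
    for start, end in padded:
        if merged and start <= merged[-1][1] + 1:
            # Overlaps (or touches) the previous interval: extend it
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


# Hypothetical exon coordinates on a single chromosome
exons = [(100, 200), (215, 300), (1000, 1100)]
targets = pad_and_merge(exons)
print(targets)
```

Merging keeps a variant that falls between two closely spaced exons from being counted twice when it is intersected with the target regions.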

Analysis of protein coding Single Nucleotide Variants (SNVs)

The effect of SNVs was determined using SeattleSeq Annotation 134 (http://snp.gs.washington.edu/SeattleSeqAnnotation134/). Variants were analyzed using PolyPhen-2 (PP2), SIFT, PhastCons, GERP, Panther, and ConSeq.10-15 Frequency was assessed using three publicly available databases: the March 2012 Integrated Phase 1 release of the 1000Genomes Project16, the NHLBI Exome Sequencing Project (ESP 5400) (http://evs.gs.washington.edu/EVS/), and dbSNP 135/136 (http://www.ncbi.nlm.nih.gov/projects/SNP/). A minor allele frequency (MAF) of ≤0.01 was used to restrict variants.
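The MAF restriction can be sketched in Python as below. Two points are assumptions of this example rather than details stated in the Methods: a variant absent from a database (represented as `None`) is treated as rare, and a variant must meet the cutoff in every database that reports it. The database keys are illustrative labels only.

```python
from typing import Dict, Optional

MAF_CUTOFF = 0.01  # threshold stated in the Methods


def is_rare(freqs: Dict[str, Optional[float]], cutoff: float = MAF_CUTOFF) -> bool:
    """Return True if no database reports a frequency above the cutoff.

    A value of None means the variant is absent from that database,
    which this sketch treats as rare (an assumption).
    """
    return all(f is None or f <= cutoff for f in freqs.values())


# Hypothetical frequency records for two variants
rare_variant = {"1000G": None, "ESP5400": 0.0004, "dbSNP": None}
common_variant = {"1000G": 0.05, "ESP5400": 0.008, "dbSNP": None}

print(is_rare(rare_variant), is_rare(common_variant))
```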

Prioritization of intronic SNVs

Variants were analyzed by MaxEntScan using publicly available Perl scripts.17 Retained variants were then analyzed for frequency using the March 2012 Integrated Phase 1 release of the 1000Genomes Project16 and dbSNP 135/136 (http://www.ncbi.nlm.nih.gov/projects/SNP/).

Identification and analysis of insertion/deletion (Indel) polymorphisms

Indels were annotated using the Variant Effect Predictor (VEP) (http://useast.ensembl.org/index.html). Indels in the coding sequence or at a splice junction were scored for frequency using the 1000Genomes Project.16

Desmin expression

Full-length human desmin cDNA (Origene) was mutated. Wildtype and mutant cDNAs were placed into pcDNA3.1/His C (Invitrogen) and transfected into C2C12 cells (ATCC) grown in DMEM supplemented with 10% Fetal Bovine Serum (FBS) and 1% penicillin/streptomycin in a 10% CO2 incubator at 37°C. After 48 hours, cells were fixed in 100% methanol (−20°C) and stained as described using an anti-Xpress Antibody (Invitrogen). Results were imaged as described.18

Results

Eleven unrelated subjects with nonischemic cardiomyopathy (Table 1) were selected for WGS and targeted analysis. Nine of the eleven subjects or an affected family member had undergone previous panel-based clinical genetic testing (Table 1). The number of genes assessed by this previous testing varied with the year in which the subject had undergone testing. In three of the nine subjects who had undergone clinical testing, a pathogenic mutation was found via clinical testing (Table 1). WGS was performed in these families to confirm that WGS has the sensitivity to detect known mutations. Also, two of the families (DCM-AAB and DCM-BI) exhibited marked phenotypic variability between generations. In these families the probands required heart transplant in the second decade of life while other family members, carrying the clinically identified mutation, were only mildly affected into the 4th decade of life.

Table 1.

Clinical Features of subjects

| Subject | Age at Dx | Clinical features | Age at death (D) or transplant (T) | # Affected family members | Previous genes tested |
|---|---|---|---|---|---|
| DCM-O1 | 12 | Heart block; Pacemaker age 12. | 32 (D) | 7 (5 DCM, 1 SCD, 1 arrhythmia) | ACTC, LDB3/ZASP, LMNA, MYBPC3, MYH7, PLN, TAZ, TNNI3, TNNT2, TPM1 (affected relative tested) |
| DCM-AAB03 | 20 | VT; BiV ICD. | 20 (T) | 7 (DCM) | LMNA, MYBPC3, MYH7, TNNI3, TNNT2, TPM1 *TPM1 D230N |
| DCM-AAW02 | 33 | Postpartum CMP with EF of 20%; VT; ICD. | | 4 (DCM) | ACTC, GLA, LAMP2, MYBPC3, MYH7, MYL3, MYL2, PRKAG2, TNNT2, TNNI3, TNNC1, TPM1, LMNA |
| DCM-AAY02 | 57 | LAF and LIV conduction delay | | 2 (DCM) | ABCC9, ACTN2, CSRP3, CTF1, DES, EMD, LDB3, LMNA, MYBPC3, MYH7, PLN, SGCD, TAZ, TCAP, TNNI3, TPM1, VCL |
| LGMD-AH01 | 30s | Dual chamber ICD + LGMD | | 3 (LGMD) | LMNA, LDB3, TNNT2, DES, SGCD, PLN, ACTC1, MYH7, TPM1, TNNI3, TAZ, TTR, MYBPC3, LAMP2, cardiomyopathy mitochondrial genes |
| DCM-Q14 | 21 | | 3rd decade (D) | 12 (10 DCM, 2 SCD) | |
| SD-303 | 52 | VT | 61 (T) | 13 | |
| DCM-AAL01 | 32 | VT and AF; BiV ICD | 43 (T) | 13 (7 DCM, 6 SCD). | LMNA, MYH7, MYBPC3, TNNT2, TNNI3, TPM1, ACTC, MYL2, MYL3, LAMP2, PRKAG2 |
| MDC-01 | | AF + LGMD | 37 (D) | 17 (13 DCM, 4 LGMD) | ABCC9, ACTC, ACTN2, CSRP3, CTF1, DES, EMD, MDB3, LMNA, MYBPC3, MYH7, PLN, SGCD, TAZ, TCAP, TNNI3, TNNT2, TPM, VCL *DES c.735+3A>G 22 |
| DCM-BI01 | 16 | | 16 (T) | 5 (DCM) | ANKRD1, ACTC, LDB3, LMNA, MYBPC3, MYH7, PLN, SCN5A, TNNCI, TNNI3, TNNT2, TPM1 (affected relative tested) *TNNT2 K210del |
| DCM-BH01 | 62 | LBBB; ICD | | 2 (DCM) | LMNA, MYH7, TNNT2, ACTC1, DES, MYBPC3, TPM1, TNNI3, ZASP, TAZ, PLN, TTR, LAMP2, SGCD, MYL2, MYL3, PRKAG2, cardiomyopathy mitochondrial genes |

Dx=Diagnosis; DCM=Dilated cardiomyopathy; SCD= Sudden cardiac death; VT=Ventricular tachycardia; BiV=Biventricular; CMP=Cardiomyopathy; EF=Ejection fraction; ICD=Implantable cardioverter defibrillator; LAF=Left atriofascicular; LIV=Left intraventricular; LGMD=Limb girdle muscular dystrophy; AF=Atrial fibrillation; VF=Ventricular fibrillation; LBBB=Left bundle branch block

* Primary mutation identified by panel sequencing

These data suggested additional genetic modifiers. For six of nine subjects, previous clinical genetic testing was unrevealing. WGS was completed with an average coverage of 37.1 fold. On average, 113.4 GB of data per individual passed filter (Q≥20) and aligned to the reference genome, covering 97.9% of the non-N reference genome (Supplemental Table 1). We restricted our analysis to a super gene-set that included 204 genes (245 transcripts) previously associated with Mendelian or non-Mendelian forms of cardiomyopathy, skeletal myopathies, or cardiac arrhythmias as demonstrated in either humans or animal models (Supplemental Table 2).

Analysis of SNVs

WGS identified an average of 3.7 million single nucleotide variants (SNVs) per individual with a greater number of SNVs identified in the two non-Caucasian individuals (4.0 million). Each genome had ~11,586 non-synonymous SNVs (nsSNVs). Restricting analysis to the super gene-set reduced the average number of missense SNVs to 167 per individual (Supplemental Table 3). Missense SNVs were filtered using a combination of algorithms that predict effect based on conservation and/or structure. SNVs were identified as rare based on their frequency in the population at large (Figure 1). For TTN, only truncating variants were considered as described by Herman and colleagues, as is the standard practice for clinical genetic testing.19 Missense variants in TTN were not considered because these variants are highly prevalent in the general population and vastly exceed the frequency of cardiomyopathy, making them difficult to interpret at this time.20 Variants common to multiple individuals in the sequencing cohort and absent from frequency databases were discarded as they likely represent sequencing or aligning artifacts. This analysis pipeline reduced the number of potentially damaging missense variants to 0-11 per individual (Supplemental Table 4). Variants were confirmed using Sanger sequencing. We tested the sensitivity of this pipeline using 100 independent mutations reported in inherited cardiomyopathy and found it to be 91% sensitive (Supplemental Table 5). We also detected the one known pathogenic missense mutation previously identified in the cohort in subject DCM-AAB03 (TPM1 D230N).

Figure 1.

Variant analysis pipeline. A. Missense variant analysis. The Super Gene-set includes genes linked to cardiomyopathy. ~3.7 million variants were identified; restricting to the Super Gene-set reduces the number of variants to ~11.5K. Missense SNVs from these genes were analyzed using PolyPhen-2, SIFT, PhastCons, and GERP.10-13 Variants were retained if predicted to be probably or possibly damaging by PolyPhen-2 or if damaging by two of the three remaining programs. Cutoff scores of 0.95 and 3 were considered damaging by PhastCons and GERP, respectively. If no prediction was made by at least three of four programs, the variant was retained for further analysis. Variants were then analyzed by Panther and ConSeq.14, 15 Variants with a Panther subPSEC score of < −2 or a ConSeq score < 0 were retained. If neither program was able to make a prediction, the variant was retained. This filtering reduces the number of variants to ~167 per genome. Variants were analyzed for frequency using three databases: the 1000Genomes Project, NHLBI Exome Sequencing Project (ESP 5400), and dbSNP 135/136.16 Variants present at a frequency of ≤0.01 were retained, resulting in 0-11 nsSNVs per genome. Nonsense SNVs within the myopathy super gene-set were filtered using frequency. B. Splice site variant analysis. Analysis was restricted to intronic SNVs within 10 bps of an exon, reducing variant lists from ~3.7M to 50 per genome. Variants were analyzed using MaxENT.17 The 1000Genomes Project and dbSNP 135/136 were used to determine frequency. Variants present at a frequency of ≤0.01 were deemed potentially deleterious, resulting in 0-2 variants per genome. (SNV=Single nucleotide variant)
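The retention rules for missense variants given in the Figure 1 legend can be sketched in Python as follows. This is an illustrative reading of the legend, not the authors' code: the class and field names are hypothetical, scores at exactly the PhastCons and GERP cutoffs are assumed to count as damaging, SIFT is reduced to a damaging/tolerated flag, and the frequency step is not included.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Missense:
    """Per-variant predictions; None means the program made no prediction."""
    polyphen2: Optional[str] = None        # "probably", "possibly" or "benign"
    sift_damaging: Optional[bool] = None
    phastcons: Optional[float] = None
    gerp: Optional[float] = None
    panther_subpsec: Optional[float] = None
    conseq: Optional[float] = None


def passes_stage1(v: Missense) -> bool:
    """PolyPhen-2 damaging, or two of SIFT/PhastCons/GERP damaging."""
    calls = [
        None if v.polyphen2 is None else v.polyphen2 in ("probably", "possibly"),
        v.sift_damaging,
        None if v.phastcons is None else v.phastcons >= 0.95,  # >= is an assumption
        None if v.gerp is None else v.gerp >= 3,                # >= is an assumption
    ]
    if sum(c is None for c in calls) >= 3:
        return True  # too few predictions: retain for further analysis
    if calls[0]:
        return True
    return sum(bool(c) for c in calls[1:]) >= 2


def passes_stage2(v: Missense) -> bool:
    """Panther subPSEC < -2 or ConSeq < 0; retain if neither could predict."""
    if v.panther_subpsec is None and v.conseq is None:
        return True
    panther_hit = v.panther_subpsec is not None and v.panther_subpsec < -2
    conseq_hit = v.conseq is not None and v.conseq < 0
    return panther_hit or conseq_hit


def retain(v: Missense) -> bool:
    return passes_stage1(v) and passes_stage2(v)


# Hypothetical variants
print(retain(Missense(polyphen2="possibly", panther_subpsec=-3.1)))
print(retain(Missense(polyphen2="benign", sift_damaging=False, phastcons=0.5, gerp=4.0)))
```

Representing a missing prediction as `None` rather than as a benign call keeps the "no prediction" rules explicit at both stages.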

To assess variants positioned to alter splicing, SNVs within 10 bps of exon boundaries were filtered through MaxENT and evaluated for frequency in the population at large (Figure 1). MaxENT estimates the strength of the 5′ and 3′ splice junctions using a maximum entropy model.17 An average of 50 intronic SNVs (range 44-63) were identified in each individual using the super gene-set. Filtering with MaxENT reduced this list to ≤4 per individual. By including only rare variants, the number of splice-site altering SNVs was reduced to 0-2 per individual (Supplemental Table 4). Variants were confirmed using Sanger Sequencing. We tested the sensitivity of this approach using a control dataset of 25 known splice-site altering mutations and found it to be 88% sensitive (Supplemental Table 6). This approach also detected the single known splice variant within the cohort in MDC-01 (DES c.735+3 A>G); this variant was previously shown to disrupt splicing.21, 22 Combining the analyses for missense, nonsense, and splicing variants produced 1-13 potentially pathogenic SNVs per individual (Supplemental Table 4). All variants that passed pipeline criteria were then manually curated based on the specific phenotype of the proband. For example, variants in genes that are usually associated with muscle involvement were ranked lower in probands without muscle disease. Variants were then tested for segregation, where possible (Table 2).
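The first step above, selecting intronic SNVs within 10 bps of an exon boundary, can be sketched in Python as below. The function name and coordinates are hypothetical, strand is ignored, and MaxENT scoring itself is not reproduced here.

```python
from typing import Optional, Tuple


def splice_window_offset(pos: int, exon: Tuple[int, int], window: int = 10) -> Optional[int]:
    """Return the distance (in bp) from the exon boundary if `pos` lies in the
    flanking intron within `window` bp; return None for exonic or distant positions.

    Coordinates are 1-based inclusive (an assumption of this sketch).
    """
    start, end = exon
    if start - window <= pos < start:
        return start - pos   # intronic, before the first exonic base
    if end < pos <= end + window:
        return pos - end     # intronic, after the last exonic base
    return None


exon = (1000, 1100)  # hypothetical exon
for pos in (999, 1103, 1050, 980):
    print(pos, splice_window_offset(pos, exon))
```

On the forward strand, an offset of 1 before the exon corresponds to a position written as "-1" in coding notation (as in c.3791-1), and an offset of 3 after the exon to "+3" (as in c.735+3).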

Table 2.

Pathogenic and likely pathogenic variants identified by WGS

| Subject | Gene | Function | Position | NHLBI ESP frequency | 1000 Genomes frequency | Additional Evidence for Causality |
|---|---|---|---|---|---|---|
| DCM-AAB03 | GLA; TPM1* | Missense; Missense | R118C; D230N | 0.0004; Absent | Absent; Absent | Segregation n=7 |
| DCM-AAW02 | TTN | Nonsense | E3707X | Absent | Absent | Segregation n=2 |
| DCM-AAY02 | FLNC | Intronic, SS | c.3791-1 G>C | | Absent | Absent in unaffected sibling |
| DCM-Q14 | TTN | 1bp insertion | L20605PfsX2 | | Absent | Segregation n=38 |
| DCM-AAL01 | DES | Missense | R127P | Absent | Absent | Histopathology; Functional studies (Fig 3) |
| MDC-01* | DES | Intronic, SS | c.735+3 A>G | | Absent | Ref 22 |
| DCM-BI01 | TTN; TNNT2* | Intronic, SS; 3bp deletion | c.42521-5 C>G; K210del | | Absent; Absent | Segregation n=3 |
| DCM-BH01 | SCN5A | Missense | G1318V | 0.00008 | Absent | Segregation n=2 |
| SD-303 | SCN5A | Missense | R814W | Absent | Absent | Segregation n=16 |

* Identified by panel-based genetic testing

Frequency refers to the overall variant frequency. SS=Splice site; bp=basepair.

Analysis of insertion/deletion polymorphisms

Each genome had on average 293,729 insertions and 312,947 deletions. Filtering these variants using the super gene-set reduced the number to ~88 indels per individual (Supplemental Table 3). Indels in the coding sequence or at a splice junction and present in the 1000Genomes database at frequency ≤0.01 were retained. Indels common in the sequenced cohort were omitted as these are likely sequencing/aligning artifact. This analysis reduced the number of potentially pathogenic indels to 0-1 per individual, which were confirmed using Sanger Sequencing. This analysis also detected the single known pathogenic indel within the cohort in DCM-BI01 (TNNT2 K210del). Combining the analysis for indels with that for SNVs produced 1-14 potentially pathogenic variants per individual.
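The cohort-recurrence filter described above (and earlier for SNVs) can be sketched in Python as follows. How many carriers make a variant "common" in the cohort is not stated, so the `max_carriers` parameter is an assumption, as are the function name and the variant keys; variants present in a frequency database are spared, following the rule given for SNVs.

```python
from collections import Counter
from typing import Dict, Set


def drop_cohort_artifacts(
    calls: Dict[str, Set[str]],
    in_databases: Set[str],
    max_carriers: int = 1,  # assumed threshold; not specified in the text
) -> Dict[str, Set[str]]:
    """Remove variants carried by more than `max_carriers` subjects
    that are absent from public frequency databases."""
    counts = Counter(v for variants in calls.values() for v in variants)
    suspect = {v for v, n in counts.items() if n > max_carriers and v not in in_databases}
    return {subject: variants - suspect for subject, variants in calls.items()}


# Hypothetical calls for three unrelated subjects
calls = {
    "S1": {"chr1:100:insA", "chr2:500:delT"},
    "S2": {"chr1:100:insA"},
    "S3": {"chr1:100:insA", "chr3:42:delG"},
}
filtered = drop_cohort_artifacts(calls, in_databases=set())
print(filtered)
```

Because the subjects are unrelated, a very rare variant recurring across several of them is more plausibly a systematic calling or alignment error than a shared pathogenic allele, which is the rationale the text gives for this step.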

Likely pathogenic cardiomyopathy variants detected by WGS

Potentially pathogenic variants were filtered based on the genes’ tissue expression pattern, association with syndromic features, and pattern of inheritance. Candidate variants were confirmed by Sanger sequencing and sequenced in family members for segregation when available. We identified 11 likely pathogenic mutations in 9 individuals, including 9 new mutations not identified by traditional panel-based screening (Table 2, Supplemental Tables 7-13). Of these mutations, two are second mutations identified within an individual that may function as disease modifiers. For 7 subjects, segregation analysis was performed in 2-38 family members, including both affected and unaffected individuals (Table 2, Supplemental Figures 1-4).

Subject DCM-BH01 presented at age 62 with nonischemic DCM, non-sustained ventricular tachycardia, and a wide QRS-complex consistent with cardiac conduction system disease (Figure 2A). Clinical genetic testing for 20 DCM genes was negative (Table 1). WGS identified three potentially pathogenic variants, including missense variants in SCN5A, FKTN, and HPS6 (Figure 2B). Both HPS6 and FKTN were excluded because of association with syndromic, autosomal recessive disorders.23, 24 The SCN5A G1318V variant was absent from public databases. SCN5A mutations near this region have been linked to both DCM and inherited arrhythmic disorders25, consistent with the striking conduction system disease observed in this individual. 26 The SCN5A G1318V variant was also found in an adult offspring whose left ventricular ejection fraction (LVEF) was 47.8% (Figure 2C).

Figure 2.

SCN5A G1318V detected by WGS. WGS was used to assess the genome of DCM-BH01 with DCM, left bundle branch block and nonsustained VT. A) 12 lead EKG from proband. B) Variants were filtered through the analysis pipeline identifying three potentially pathogenic variants. FKTN and HPS6 were excluded as these genes are linked to recessive, syndromic disease.23, 24 The SCN5A G1318V variant was considered pathogenic since SCN5A gene mutations are known to affect the cardiac conduction system in addition to causing DCM.26 C) SCN5A G1318V variant was identified in the proband’s offspring with DCM.

An SCN5A variant was also identified in subject SD-303 (Supplemental Figure 1). SD-303 has a history of arrhythmias beginning in the third decade of life. Congestive heart failure was diagnosed in the sixth decade of life and a heart transplant was performed at age 61. Family history is remarkable for arrhythmia, sudden death and congestive heart failure with some family members requiring transplant before the age of 40. WGS identified seven potentially pathogenic variants, all missense (Supplemental Table 7). Segregation analysis of SCN5A R814W in 16 family members confirmed its association with disease (Supplemental Figure 1). SCN5A R814W was both rare and considered highly deleterious by all algorithms. This particular variant has also been described as a de novo mutation in a 23-year-old woman with sporadic DCM, atrial flutter and short runs of nonsustained ventricular tachycardia.27 Nguyen and colleagues found that the SCN5A R814W mutation disrupted both activation and deactivation of Nav1.5.28

Subject DCM-AAL01 was diagnosed with DCM at age 32 and underwent heart transplantation at age 43 (Figure 3A). Clinical genetic testing was negative for 11 genes (Table 1). Seven potentially pathogenic missense variants were identified in the myopathy super gene-set (Supplemental Table 8). Variants in LDB3, MYH11, and DES were identified as potentially relevant based on the proband’s phenotype and were confirmed by Sanger sequencing (Figure 3B). The variant in MYH11 is unlikely to be the primary driver as MYH11 variants are often associated with aortic aneurysm, which was not a factor in this subject’s disease.29, 30 We performed immunohistochemical staining for desmin and electron microscopy on frozen sections from the proband’s explanted heart. Desmin aggregates were readily evident, diagnostic of a desmin-related myopathy (Figure 3C).31, 32 Segregation analysis could not be performed as all affected family members had died or were unavailable for testing. Epitope-tagged DES R127P or wildtype desmin was introduced into myogenic C2C12 cells to assess pathogenicity. Intracellular desmin aggregates were readily detected with DES R127P but not with wildtype desmin, confirming the pathogenicity of this variant (Figure 3D). LDB3, also known as ZASP or CYPHER, has been linked to cardiomyopathy and, given the variant’s rarity, may contribute to the phenotype.33 As desmin protein aggregates were observed in the subject’s explanted heart, and as these aggregates are not known to be a feature of LDB3 mutations, these data lead us to conclude that DES is the primary genetic mutation in this individual. Genetic counseling was provided to the subject.

Figure 3.

WGS identified a desmin (DES) gene mutation, R127P. A) DCM-AAL01 pedigree reveals DCM and sudden cardiac death (SCD). The proband (*) underwent cardiac transplantation at the age of 42. B) Seven variants were identified as potentially deleterious; three were relevant to this individual’s phenotype. MYH11 is linked to muscle phenotypes and therefore was excluded.29, 30 LDB3 I558V may be a contributing variant. The DES R127P variant was further tested. C) The explanted heart from the proband demonstrated desmin aggregates on immunohistochemistry (brown staining, left panel), a feature pathognomonic for desmin related myopathies.32 EM revealed granulofilamentous material consistent with desmin aggregates (right panel). D) When expressed in myogenic C2C12 cells, the DES R127P variant formed aggregates (right panel, arrowhead) while wildtype desmin (left panel) did not. Expressed desmin was tagged with the Xpress epitope tag (green). Nuclei are stained with DAPI.

Whole genome sequencing identifies “second hits” as disease modifiers

WGS and the super gene-set were applied to two families that displayed a range of phenotypes, consistent with disease modifiers. The first was an individual with a known TPM1 mutation with a more severe clinical course compared to other affected family members (Figure 4A). Subject DCM-AAB03 underwent heart transplantation at age 20 after presenting with refractory heart failure symptoms and severe LV systolic dysfunction (LVEF <20%). With the exception of one brother who died from cardiomyopathy and muscle weakness at age 12, other family members remain only mildly affected into the 4th decade and beyond. WGS identified six potentially pathogenic missense mutations, including a mutation in the X-encoded GLA gene (Supplemental Table 9). Mutations in GLA, which codes for the protein alpha-galactosidase A, cause the X-linked disorder Fabry disease.34, 35 DCM-AAB03 was hemizygous for the GLA R118C variant while his less severely affected sisters and mother were heterozygous for the variant and his more mildly affected brother carried only the primary mutation in TPM1.

Figure 4.

WGS identified primary and secondary pathogenic sequence variation. A) DCM-AAB03 Pedigree. The TPM1 D230N variant segregates with cardiomyopathy in all family members. Several members of this family had earlier onset disease. Individual II-3 presented at age 20 with heart failure and was found to be hemizygous for the X-linked GLA R118C variant. B) DCM-BI01 Pedigree. The TNNT2 K210del variant has been described previously in cardiomyopathy.33, 38 The two younger members who required cardiac transplantation had an additional TTN mutation predicted to disrupt the splice site and truncate TTN.

The second family was known to have a primary TNNT2 mutation but showed evidence of phenotypic variability. Specifically, subject DCM-BI01 presented at age 16 with a LVEF of 9.8% and underwent cardiac transplantation (Figure 4B). Clinical genetic testing identified TNNT2 K210del as a pathogenic mutation. The TNNT2 K210del mutation was also detected using the filtering pipeline for WGS. Both the patient and his brother, who also required heart transplantation in adolescence, carry the TNNT2 K210 deletion. This mutation was also found in the subject’s affected mother and in his mildly affected grandmother, who at age 65 had asymptomatic LV dysfunction (LVEF 45%). This phenotypic variability has previously been noted with the TNNT2 K210del variant.36-38 WGS also identified a TTN splice-variant c.42521-5 G>C (Supplemental Table 10). This variant was found in the two boys who required heart transplants in their second decade but not in their grandmother who was only mildly affected at age 65 (Figure 4B). This TTN splice variant has a MaxEnt score of 5.97, which is deleterious by the criteria set forth by Herman et al. for TTN splice sites.19

Discussion

WGS detects rare variants for cardiomyopathy

WGS offers a comprehensive approach to identifying genetic variation across the genome. We now show that rare variants can be detected in cardiomyopathy genes using WGS and targeted analysis. While the comprehensive nature of WGS or even WES is attractive, analytical tools must be refined to identify pathogenic variants. The pipeline applied herein relied on 1) restricting the number of genes for analysis to a “super gene-set”, 2) filtering based on frequency with the assumption that a rare disease is caused by rare genetic variation, and 3) protein prediction algorithms that largely rely on disrupting conserved regions. The method successfully identified three known mutations (DES c.735+3 A>G in patient MDC-01, TNNT2 K210del in patient DCM-BI01, TPM1 D230N in patient DCM-AAB03), providing proof-of-principle that WGS at 30-40× coverage is sensitive to detect these rare variants. Likely pathogenic variants were identified in 6 of the remaining patients. Each proband had 1-14 variants that passed filtering criteria of our pipeline. We relied on segregation analysis, functional data and phenotypic data from each subject to further refine the variant list. It is important to note that phenotypic information is invaluable when identifying likely pathogenic variation and that manual curation of each high probability variant is required.

In subject DCM-AAY02, a putative mutation in the gene encoding γ-filamin, also known as filamin C (FLNC), was found (Table 2, Supplemental Table 11). DCM-AAY02 presented at age 57 with left anterior hemiblock and intraventricular conduction delay, but without skeletal muscle disease. FLNC encodes an actin-crosslinking protein that interacts with the dystrophin-associated protein complex, and mutations in FLNC lead to skeletal myopathy in humans and mice.39, 40 Moreover, in a fish model, medaka, reduced FLNC expression resulted in an enlarged and mechanically weakened heart.41 Furthermore, a quantitative proteomic assessment of aggregates in desminopathy identified filamin C as the second most abundant protein in these aggregates, providing additional support for the role of filamin C as a mediator of cardiomyopathy.42 The FLNC mutation identified in DCM-AAY02 is located 1 bp from a splice-acceptor, is absent from the 1000Genomes database, is predicted to be strongly deleterious by MaxENT, and is not present in an unaffected sibling. While further functional studies are required to confirm the pathogenicity of this variant, these data underscore the utility of having a mutable super gene-set that allows for the interrogation of putative cardiomyopathy genes.

Perhaps the greatest value of broad-based sequencing is the capacity to further interrogate the data if a mutation is not identified on first pass analysis. Here, the super gene-set is defined by genes previously associated with cardiomyopathy, restricting our ability to identify new cardiomyopathy-associated genes. The super gene-set is mutable and genes can be added and removed at any time, allowing for the identification of variation in genes that may be suspected in the pathogenesis of cardiomyopathy. Analysis can be rerun in minutes to hours depending on the size of the dataset. However, this approach is candidate driven and will not identify variation in genes not on the super gene-set. Additional panel-based testing can take weeks to months to complete and may still not supply variants of interest. Further, as WGS becomes more commonplace, it is possible that patients will already have sequencing data. While panel-based testing is generally higher coverage, this pilot study confirms that 30-40× coverage sequencing data are appropriate for identifying cardiomyopathy mutations.

For two subjects in this pilot study, WGS did not identify a clear mutation. We expect that these two individuals have mutations in genes outside the super gene-set. Importantly, the available WGS sequence from these genomes will permit ongoing analysis and will not require reinvestment in additional sequencing. Larger gene panels have increased sensitivity for mutation detection, especially for DCM. Therefore, it can be expected that having comprehensive sequencing should have even greater power to detect primary mutations, and importantly secondary modifier mutations which may account for disease severity or point to potentially treatable disorders such as cardiomyopathy amenable to enzyme replacement therapy like Fabry.43

Cost considerations

The cost of targeted panel sequencing, whole exome sequencing, and whole genome sequencing is a major consideration, and costs are shifting rapidly with new technology. Clinical testing for cardiomyopathy using targeted panel sequencing is around $4000 at this time. Current pricing for clinical exome sequencing is approximately $7000 but varies widely depending on the extent of variant analysis. Clinical whole genome sequencing is approximately $9000-9500 and, like other clinical genetic testing, this price includes the cost of analysis. The anticipated $1000 genome is for research-based studies and does not include the cost of analysis.

Sequencing depth is also a major consideration for sequencing studies. Next-generation sequencing technologies have a higher per-base error rate than traditional Sanger sequencing, which reduces the probability that a called variant is a true positive.44 The higher error rate requires higher depth, whereby each base is sequenced multiple times by high-quality reads. However, as depth increases, so does cost. In this study, WGS was performed at 30-40× because previous work has shown that the number of readable bases and the number of SNVs identified increase rapidly with increasing depth, with a plateau at 30×, indicating diminishing returns with additional sequencing.45 It is also important to note that sequencing reads should undergo extensive filtering to identify and discard low-quality reads. It is imperative that analysis pipelines be applied to sequencing data to control for faulty reads, misalignment to the reference genome, and low-quality variant calls. These steps contribute to analysis costs. Given the range of approaches used for bioinformatics, costs for analysis will vary widely. Reductions in analytical costs will derive from improving alignment and variant calling pipelines, refining which genes to analyze, and automating pipelines to reduce human analytical time. The informatically designed super gene-set, which was applied here to WGS, is expected to be refined to provide more specific results. The utility of this approach is that the super gene-set is virtual, and as such the analysis can readily be updated and optimized without the need for recapture and resequencing. Thus, for those individuals for whom a primary mutation is not identified, the remaining genome information is available for analysis. These data provide a wealth of information that may inform not only about new genes that cause cardiomyopathy but also about combinations of genes that may provide important prognostic information.
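The depth and quality filtering described in this section can be sketched in a few lines of Python. This is an illustrative example under stated assumptions, not the filtering applied in this study: the `Call` record and the thresholds `MIN_DEPTH` and `MIN_QUAL` are hypothetical, and real pipelines apply many additional checks. The only fixed relationship used is the standard Phred scale, in which a quality score Q corresponds to an error probability of 10^(-Q/10).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Call:
    """A variant call with its supporting evidence (illustrative only)."""
    gene: str
    depth: int    # number of reads covering the site
    qual: float   # Phred-scaled quality of the call


# Hypothetical thresholds chosen only for this example.
MIN_DEPTH = 10
MIN_QUAL = 30.0


def phred_to_error_prob(q):
    """Convert a Phred-scaled quality score to an error probability."""
    return 10 ** (-q / 10)


def passes_filters(call, min_depth=MIN_DEPTH, min_qual=MIN_QUAL):
    """Keep a call only if it is well covered and confidently called."""
    return call.depth >= min_depth and call.qual >= min_qual


calls = [
    Call("FLNC", depth=35, qual=60.0),  # well covered, high quality
    Call("TTN", depth=4, qual=55.0),    # too few reads at this site
    Call("MYH7", depth=40, qual=12.0),  # covered, but low-confidence call
]

kept = [c for c in calls if passes_filters(c)]
print([c.gene for c in kept])
print(phred_to_error_prob(MIN_QUAL))
```

Raising either threshold trades sensitivity for specificity, which is one reason analysis choices, and therefore analysis costs, vary between laboratories.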

Supplementary Material

000578 - PAP
000578 - Supplemental Material
CircGenetics_CIRCCVG-2014-000578-T.xml
Clinical Perspective

Acknowledgments

We thank the families for their participation.

Funding Sources: Doris Duke Charitable Foundation, New York, New York; Sarnoff Foundation, Great Falls, VA; National Institutes of Health, Bethesda, Maryland, NIH T32 HL007237, NIH F32 HL097587.

Footnotes

Conflict of Interest Disclosures: None.

References

  • 1. Cirino AL, Ho CY. Genetic testing for inherited heart disease. Circulation. 2013;128:e4–8. doi: 10.1161/CIRCULATIONAHA.113.002252.
  • 2. Lakdawala NK, Thune JJ, Colan SD, Cirino AL, Farrohi F, Rivero J, et al. Subtle abnormalities in contractile function are an early manifestation of sarcomere mutations in dilated cardiomyopathy. Circ Cardiovasc Genet. 2012;5:503–510. doi: 10.1161/CIRCGENETICS.112.962761.
  • 3. Maron BJ, Roberts WC, Arad M, Haas TS, Spirito P, Wright GB, et al. Clinical outcome and phenotypic expression in LAMP2 cardiomyopathy. JAMA. 2009;301:1253–1259. doi: 10.1001/jama.2009.371.
  • 4. Mestroni L, Taylor MR. Lamin A/C gene and the heart: how genetics may impact clinical care. J Am Coll Cardiol. 2008;52:1261–1262. doi: 10.1016/j.jacc.2008.07.021.
  • 5. Zimmerman RS, Cox S, Lakdawala NK, Cirino A, Mancini-DiNardo D, Clark E, et al. A novel custom resequencing array for dilated cardiomyopathy. Genet Med. 2010;12:268–278. doi: 10.1097/GIM.0b013e3181d6f7c0.
  • 6. Meder B, Haas J, Keller A, Heid C, Just S, Borries A, et al. Targeted next-generation sequencing for the molecular genetic diagnostics of cardiomyopathies. Circ Cardiovasc Genet. 2011;4:110–122. doi: 10.1161/CIRCGENETICS.110.958322.
  • 7. Clark MJ, Chen R, Lam HY, Karczewski KJ, Euskirchen G, Butte AJ, et al. Performance comparison of exome DNA sequencing technologies. Nat Biotechnol. 2011;29:908–914. doi: 10.1038/nbt.1975.
  • 8. Asan, Xu Y, Jiang H, Tyler-Smith C, Xue Y, Jiang T, et al. Comprehensive comparison of three commercial human whole-exome capture platforms. Genome Biol. 2011;12:R95. doi: 10.1186/gb-2011-12-9-r95.
  • 9. Bamshad MJ, Ng SB, Bigham AW, Tabor HK, Emond MJ, Nickerson DA, et al. Exome sequencing as a tool for Mendelian disease gene discovery. Nat Rev Genet. 2011;12:745–755. doi: 10.1038/nrg3031.
  • 10. Adzhubei IA, Schmidt S, Peshkin L, Ramensky VE, Gerasimova A, Bork P, et al. A method and server for predicting damaging missense mutations. Nat Methods. 2010;7:248–249. doi: 10.1038/nmeth0410-248.
  • 11. Siepel A, Bejerano G, Pedersen JS, Hinrichs AS, Hou M, Rosenbloom K, et al. Evolutionarily conserved elements in vertebrate, insect, worm, and yeast genomes. Genome Res. 2005;15:1034–1050. doi: 10.1101/gr.3715005.
  • 12. Ng PC, Henikoff S. Predicting deleterious amino acid substitutions. Genome Res. 2001;11:863–874. doi: 10.1101/gr.176601.
  • 13. Cooper GM, Stone EA, Asimenos G, Green ED, Batzoglou S, Sidow A. Distribution and intensity of constraint in mammalian genomic sequence. Genome Res. 2005;15:901–913. doi: 10.1101/gr.3577405.
  • 14. Thomas PD, Kejariwal A, Campbell MJ, Mi H, Diemer K, Guo N, et al. PANTHER: a browsable database of gene products organized by biological function, using curated protein family and subfamily classification. Nucleic Acids Res. 2003;31:334–341. doi: 10.1093/nar/gkg115.
  • 15. Ashkenazy H, Erez E, Martz E, Pupko T, Ben-Tal N. ConSurf 2010: calculating evolutionary conservation in sequence and structure of proteins and nucleic acids. Nucleic Acids Res. 2010;38:W529–533. doi: 10.1093/nar/gkq399.
  • 16. A map of human genome variation from population-scale sequencing. Nature. 2010;467:1061–1073. doi: 10.1038/nature09534.
  • 17. Yeo G, Burge CB. Maximum entropy modeling of short sequence motifs with applications to RNA splicing signals. J Comput Biol. 2004;11:377–394. doi: 10.1089/1066527041410418.
  • 18. Demonbreun AR, Fahrenbach JP, Deveaux K, Earley JU, Pytel P, McNally EM. Impaired muscle growth and response to insulin-like growth factor 1 in dysferlin-mediated muscular dystrophy. Hum Mol Genet. 2011;20:779–789. doi: 10.1093/hmg/ddq522.
  • 19. Herman DS, Lam L, Taylor MR, Wang L, Teekakirikul P, Christodoulou D, et al. Truncations of titin causing dilated cardiomyopathy. N Engl J Med. 2012;366:619–628. doi: 10.1056/NEJMoa1110186.
  • 20. Golbus JR, Puckelwartz MJ, Fahrenbach JP, Dellefave-Castillo LM, Wolfgeher D, McNally EM. Population-based variation in cardiomyopathy genes. Circ Cardiovasc Genet. 2012;5:391–399. doi: 10.1161/CIRCGENETICS.112.962928.
  • 21. Dalakas MC, Park KY, Semino-Mora C, Lee HS, Sivakumar K, Goldfarb LG. Desmin myopathy, a skeletal myopathy with cardiomyopathy caused by mutations in the desmin gene. N Engl J Med. 2000;342:770–780. doi: 10.1056/NEJM200003163421104.
  • 22. Park KY, Dalakas MC, Goebel HH, Ferrans VJ, Semino-Mora C, Litvak S, et al. Desmin splice variants causing cardiac and skeletal myopathy. J Med Genet. 2000;37:851–857. doi: 10.1136/jmg.37.11.851.
  • 23. Gahl WA, Huizing M. Hermansky-Pudlak Syndrome. 1993.
  • 24. Arimura T, Hayashi YK, Murakami T, Oya Y, Funabe S, Arikawa-Hirasawa E, et al. Mutational analysis of fukutin gene in dilated cardiomyopathy and hypertrophic cardiomyopathy. Circ J. 2009;73:158–161. doi: 10.1253/circj.cj-08-0722.
  • 25. Ruan Y, Liu N, Priori SG. Sodium channel mutations and arrhythmias. Nat Rev Cardiol. 2009;6:337–348. doi: 10.1038/nrcardio.2009.44.
  • 26. Schott JJ, Alshinawi C, Kyndt F, Probst V, Hoorntje TM, Hulsbeek M, et al. Cardiac conduction defects associate with mutations in SCN5A. Nat Genet. 1999;23:20–21. doi: 10.1038/12618.
  • 27. Olson TM, Michels VV, Ballew JD, Reyna SP, Karst ML, Herron KJ, et al. Sodium channel mutations and susceptibility to heart failure and atrial fibrillation. JAMA. 2005;293:447–454. doi: 10.1001/jama.293.4.447.
  • 28. Nguyen TP, Wang DW, Rhodes TH, George AL Jr. Divergent biophysical defects caused by mutant sodium channels in dilated cardiomyopathy with arrhythmia. Circ Res. 2008;102:364–371. doi: 10.1161/CIRCRESAHA.107.164673.
  • 29. Babu GJ, Warshaw DM, Periasamy M. Smooth muscle myosin heavy chain isoforms and their role in muscle physiology. Microsc Res Tech. 2000;50:532–540. doi: 10.1002/1097-0029(20000915)50:6<532::AID-JEMT10>3.0.CO;2-E.
  • 30. Tajsharghi H, Ohlsson M, Palm L, Oldfors A. Myopathies associated with beta-tropomyosin mutations. Neuromuscul Disord. 2012;22:923–933. doi: 10.1016/j.nmd.2012.05.018.
  • 31. Chourbagi O, Bruston F, Carinci M, Xue Z, Vicart P, Paulin D, et al. Desmin mutations in the terminal consensus motif prevent synemin-desmin heteropolymer filament assembly. Exp Cell Res. 2011;317:886–897. doi: 10.1016/j.yexcr.2011.01.013.
  • 32. Clemen CS, Herrmann H, Strelkov SV, Schroder R. Desminopathies: pathology and mechanisms. Acta Neuropathol. 2013;125:47–75. doi: 10.1007/s00401-012-1057-6.
  • 33. Hershberger RE, Parks SB, Kushner JD, Li D, Ludwigsen S, Jakobs P, et al. Coding sequence mutations identified in MYH7, TNNT2, SCN5A, CSRP3, LBD3, and TCAP from 313 patients with familial or idiopathic dilated cardiomyopathy. Clin Transl Sci. 2008;1:21–26. doi: 10.1111/j.1752-8062.2008.00017.x.
  • 34. Schafer E, Baron K, Widmer U, Deegan P, Neumann HP, Sunder-Plassmann G, et al. Thirty-four novel mutations of the GLA gene in 121 patients with Fabry disease. Hum Mutat. 2005;25:412. doi: 10.1002/humu.9327.
  • 35. Sheppard MN. The heart in Fabry’s disease. Cardiovasc Pathol. 2011;20:8–14. doi: 10.1016/j.carpath.2009.10.003.
  • 36. Hershberger RE, Pinto JR, Parks SB, Kushner JD, Li D, Ludwigsen S, et al. Clinical and functional characterization of TNNT2 mutations identified in patients with dilated cardiomyopathy. Circ Cardiovasc Genet. 2009;2:306–313. doi: 10.1161/CIRCGENETICS.108.846733.
  • 37. Richard P, Charron P, Carrier L, Ledeuil C, Cheav T, Pichereau C, et al. Hypertrophic cardiomyopathy: distribution of disease genes, spectrum of mutations, and implications for a molecular diagnosis strategy. Circulation. 2003;107:2227–2232. doi: 10.1161/01.CIR.0000066323.15244.54.
  • 38. Hanson EL, Jakobs PM, Keegan H, Coates K, Bousman S, Dienel NH, et al. Cardiac troponin T lysine 210 deletion in a family with dilated cardiomyopathy. J Card Fail. 2002;8:28–32. doi: 10.1054/jcaf.2002.31157.
  • 39. Furst DO, Goldfarb LG, Kley RA, Vorgerd M, Olive M, van der Ven PF. Filamin C-related myopathies: pathology and mechanisms. Acta Neuropathol. 2013;125:33–46. doi: 10.1007/s00401-012-1054-9.
  • 40. Thompson TG, Chan YM, Hack AA, Brosius M, Rajala M, Lidov HG, et al. Filamin 2 (FLN2): A muscle-specific sarcoglycan interacting protein. J Cell Biol. 2000;148:115–126. doi: 10.1083/jcb.148.1.115.
  • 41. Fujita M, Mitsuhashi H, Isogai S, Nakata T, Kawakami A, Nonaka I, et al. Filamin C plays an essential role in the maintenance of the structural integrity of cardiac and skeletal muscles, revealed by the medaka mutant zacro. Dev Biol. 2012;361:79–89. doi: 10.1016/j.ydbio.2011.10.008.
  • 42. Maerkens A, Kley RA, Olive M, Theis V, van der Ven PF, Reimann J, et al. Differential proteomic analysis of abnormal intramyoplasmic aggregates in desminopathy. J Proteomics. 2013;90:14–27. doi: 10.1016/j.jprot.2013.04.026.
  • 43. Lidove O, West ML, Pintos-Morell G, Reisin R, Nicholls K, Figuera LE, et al. Effects of enzyme replacement therapy in Fabry disease--a comprehensive review of the medical literature. Genet Med. 2010;12:668–679. doi: 10.1097/GIM.0b013e3181f13b75.
  • 44. Bentley DR, Balasubramanian S, Swerdlow HP, Smith GP, Milton J, Brown CG, et al. Accurate whole human genome sequencing using reversible terminator chemistry. Nature. 2008;456:53–59. doi: 10.1038/nature07517.
  • 45. Ajay SS, Parker SC, Abaan HO, Fajardo KV, Margulies EH. Accurate and comprehensive sequencing of personal genomes. Genome Res. 2011;21:1498–1505. doi: 10.1101/gr.123638.111.
