Summary
Short-read genome sequencing (GS) holds the promise of becoming the primary diagnostic approach for the assessment of autism spectrum disorder (ASD) and fetal structural anomalies (FSAs). However, few studies have comprehensively evaluated its performance against current standard-of-care diagnostic tests: karyotype, chromosomal microarray (CMA), and exome sequencing (ES). To assess the clinical utility of GS, we compared its diagnostic yield against these three tests in 1,612 quartet families including an individual with ASD and in 295 prenatal families. Our GS analytic framework identified a diagnostic variant in 7.8% of ASD probands, almost 2-fold more than CMA (4.3%) and 3-fold more than ES (2.7%). However, when we systematically captured copy-number variants (CNVs) from the exome data, the diagnostic yield of ES (7.4%) was brought much closer to, but did not surpass, GS. Similarly, we estimated that GS could achieve an overall diagnostic yield of 46.1% in unselected FSAs, representing a 17.2% increased yield over karyotype, 14.1% over CMA, and 4.1% over ES with CNV calling or 36.1% increase without CNV discovery. Overall, GS provided an added diagnostic yield of 0.4% and 0.8% beyond the combination of all three standard-of-care tests in ASD and FSAs, respectively. This corresponded to nine GS unique diagnostic variants, including sequence variants in exons not captured by ES, structural variants (SVs) inaccessible to existing standard-of-care tests, and SVs where the resolution of GS changed variant classification. Overall, this large-scale evaluation demonstrated that GS significantly outperforms each individual standard-of-care test while also outperforming the combination of all three tests, thus warranting consideration as the first-tier diagnostic approach for the assessment of ASD and FSAs.
Keywords: genome sequencing, karyotype, microarray, exome sequencing, structural variant, autism spectrum disorder, structural anomaly, prenatal, first-tier, diagnostic
We evaluated genome sequencing (GS) as a single test to displace the sequential application of karyotype, chromosomal microarray, and exome sequencing, three standard-of-care tests used for the assessment of autism and fetal structural anomalies. Our data suggest GS warrants consideration as a first-tier diagnostic approach for these two phenotypes.
Introduction
Fetal structural anomalies (FSAs) and autism spectrum disorder (ASD) represent developmental defects that share significant overlap in genetic architecture1,2,3,4,5,6,7,8 and clinical diagnostic recommendations.9,10,11,12,13,14,15,16 Both are genetically heterogeneous and are associated with many of the same pathogenic variants (e.g., 22q11.2 deletions [MIM: 611867], point mutations in CHD8 [MIM: 610528])17,18 that have a wide range of potential clinical outcomes.19,20 Broad and comprehensive testing strategies are required to maximize diagnostic sensitivity for FSAs and ASD, as it is difficult to predict the genetic basis of these conditions a priori due to the diversity of pathogenic variants contributing to these conditions1,2,3,4,5,6,7,8 and widespread existence of variable expressivity.21,22 The current standard-of-care testing for genome-wide genetic surveys involves three orthogonal and largely complementary diagnostic tests: karyotype to discover microscopically visible balanced and unbalanced chromosomal abnormalities, chromosomal microarray (CMA) to capture sub-microscopic copy-number variants (CNVs), and exome sequencing (ES) to identify single-nucleotide variants (SNVs) and small insertions and deletions (indels) within the ∼2% of the genome that codes for proteins.9,10,11,12,13,14,15,16 All three tests are required to capture the full range of genetic variation currently known to be associated with FSAs and ASD. This sequential diagnostic testing strategy is inefficient in the prenatal setting where rapid diagnosis is critical and cumbersome in the pediatric setting where families can be easily lost to follow-up as a result of an unnecessarily long diagnostic odyssey.23
Short-read genome sequencing (GS) has the potential to identify almost all pathogenic variation captured by these currently applied technologies in a single test as well as potentially discovering novel diagnostic variants that are cryptic to current approaches.24,25,26 To date, studies performing GS for the diagnostic assessment of FSAs and neurodevelopmental disorders (NDDs), of which ASD is a subtype, have only included small cohorts of highly selected individuals with disparate diagnostic pre-screening, resulting in variable GS diagnostic yields ranging from 19.8% to 57.7% for FSAs27,28,29,30,31,32,33,34 and 30% to 50% for ASD/NDDs.35,36,37 These GS studies typically do not provide the opportunity for direct technology comparisons, as multiple standard-of-care tests are rarely available on the same individuals. Given that no single study has quantified the performance of GS against karyotype, CMA, and ES, the added value of GS remains unknown for most phenotypes, including for FSAs and ASD.
The goal of this study was to systematically evaluate the performance of GS against the current standard-of-care diagnostic tests for the assessment of FSAs and ASD. We developed a comprehensive GS analytic framework that characterized nine different classes of genetic variation while maintaining a manageable burden of manual variant review, which currently presents a significant barrier to the widespread implementation of clinical GS.38,39 We tested our GS analytic framework on 1,612 systematically collected ASD quartet families (n = 6,448 individuals total), which represented an ideal technical benchmarking cohort because each individual in the family had GS and matched CMA and ES data available for re-analysis. To assess the diagnostic yield of GS in FSAs, we applied our analytic framework to 295 prenatal families that had clinical results from karyotype, CMA, and/or ES available for comparison. The diagnostic yields from these large-scale studies suggest that a shift toward recommending GS as a first-tier diagnostic test for the assessment of ASD and FSAs is warranted.
Subjects and methods
Study subjects
We applied our short-read GS analytic framework to 1,612 ASD quartet families from the Simons Foundation Autism Research Initiative (SFARI) Simons Simplex Collection (SSC; n = 6,448 individuals total; Table S1).40 Each quartet family comprised one proband diagnosed with ASD, one unaffected sibling, and two unaffected parents (Figure 1). The ASD cohort was chosen as the primary technical comparison for our GS pipeline because every individual had CMA, ES, and GS data available for re-processing. This facilitated direct technology comparisons that were not impacted by differences in bioinformatic analyses, variant interpretation methods, and/or assessment timepoints.41 Additionally, given the significant overlap in the types of variants that contribute to ASD and FSAs, particularly SVs,1,6 the larger size of the ASD cohort enabled the discovery and interpretation of a broader spectrum of diagnostic variants. All participants or their legal guardians provided written informed consent for participation and their data were de-identified by SFARI before sharing with qualified researchers.40
We next applied the same analytic framework to 295 fetuses that met criteria for diagnostic testing because of the presence of a structural anomaly (n = 281) or advanced maternal age (AMA) (n = 14; Figure 1). The 295 fetuses included 249 trios (n = 747 individuals) comprising a fetus with a structural anomaly detected by ultrasound and two unaffected parents. Of the 249 FSAs, 85.5% (n = 213) were prescreened (i.e., no diagnostic variant identified) with CMA, the current recommended first-tier diagnostic test for fetuses with structural anomalies,1 67.0% (n = 167) with karyotype, and 35.3% (n = 88) with ES. With respect to overlapping tests, 58.6% of the FSA cohort had negative results from both CMA and karyotype and 6.4% had negative results from all three tests (Table S2). We also included 46 singleton fetuses (n = 32 FSAs and n = 14 AMA) that were pre-selected for carrying a clinically reportable variant (n = 53) detected by karyotype, CMA, or ES. We used these samples to benchmark the performance of GS against tests performed in clinical diagnostic laboratories (Table S3). We also wanted to explore the potential for GS to discover variants originally identified by karyotyping. The prenatal cohort includes fetuses recruited from the Carmen and John Thain Center for Prenatal Pediatrics at Columbia University (n = 160), the University of California San Francisco (UCSF; n = 59), and the Prenatal Diagnosis Program at the University of North Carolina Chapel Hill (UNC; n = 30). A subset of the fetuses have had their karyotype, CMA, and ES data previously published.1,3,42,43,44 This study was approved by the institutional review boards at Mass General Brigham, Columbia University, UNC, and UCSF. All participants or their legal guardians provided written informed consent prior to participation.
GS and sample level quality control
All 7,241 samples analyzed in this study underwent short-read Illumina GS following standard library protocols to a mean genome coverage of >30× (Tables S1–S3; additional details in the supplemental methods). Whole-blood-derived DNA was sequenced for every individual in the ASD cohort and all the unaffected parents from the fetal structural anomaly trios. Fetal DNA was obtained from chorionic villi, amniocytes, umbilical cord blood, or products of conception. Sample relatedness was confirmed for all individuals via KING32 and all pregnancies were genetically confirmed to have arisen from non-consanguineous unions (Figure S1). We also used GS data to infer genetic sex by using PLINK45 and depth-based chromosomal analyses (Figure S2; details in supplemental methods).
GS analytic framework
We developed a framework to identify pathogenic and likely pathogenic (P/LP) variants from GS data with high sensitivity while limiting the number of variants requiring manual review (Figure 2; Table S4). The framework is organized into four components: variant discovery, annotation, filtering, and manual classification. Additional details on the framework can be found in the supplemental methods.
Variant discovery
Variant discovery identified nine different classes of genetic variation, including SNVs, indels, deletions and duplications that ranged from 50 base pairs to full chromosomal aneuploidies, inversions, insertions, translocations, complex rearrangements (16 different sub-classes),54 and short tandem repeats (STRs), via a suite of algorithms.55,56,57,58,59,60,61,62 All samples were jointly processed in batches following GATK Best Practices Workflows for SNV and indel discovery with Terra.63 SV discovery and genotyping were performed across all samples with GATK-SV, a publicly available cloud-enabled ensemble method that leverages data from multiple SV algorithms to boost sensitivity and filters to improve specificity.24,54 Here, we ran six individual SV detection algorithms55,56,57,58,59,60 on all samples and then ran GATK-SV in cohort mode (a single sample version of GATK-SV is also available as a workflow on Terra). We used GATK-SV for filtering, genotyping, breakpoint refinement, and complex variant resolution to produce a VCF for each cohort. Finally, we ran ExpansionHunter to identify potentially diagnostic STR expansion candidates.61
Variant annotation
All variants (SNVs, indels, SVs, and STRs) were annotated for genic overlap and functional consequences against GENCODE v.26 gene boundaries based on the canonical transcript.64 Sequence variants (SNVs and indels) were annotated with ANNOVAR65 and any variants predicted to be stop-gain, stop-loss, frameshift insertion, frameshift deletion, or splicing (within 2 bp of a splice junction) according to RefSeq or GENCODE annotations were considered loss of function (LoF). SVs were annotated with GATK-SV and functional consequence was determined for each SV type. LoF SVs were defined as any deletion overlapping coding sequence, an inversion, mobile element insertion, complex SV, or translocation with one or more breakpoints disrupting coding sequence, or an intragenic exonic duplication (a duplication that overlaps coding sequence with both breakpoints contained within the same gene boundary). Full gene copy gains were defined as duplications that fully overlap a gene boundary. Partial gene duplications were defined as duplications with one breakpoint located within the gene boundary and one outside.46 Additionally, we annotated allele frequency (AF) for all SNVs and indels by using the maximum AF across gnomAD genomes,66 gnomAD exomes/ExAC,67 the 1000 Genomes Project,68 and parental samples from each cohort. The SV and STR allele frequencies were calculated based on the prevalence of each event in gnomAD.54,69
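The SV consequence definitions above can be sketched as a small classifier. This is an illustration only and is not GATK-SV's annotation code: the `Gene` and `SV` records, their field names, the SV type labels, and the reduction of a gene to a single interval with a list of coding intervals are all assumptions made for the example. It also treats a breakpoint as "disrupting coding sequence" only when it falls inside a coding interval, which is a simplification.

```python
from dataclasses import dataclass
from typing import List, Tuple

Interval = Tuple[int, int]  # half-open [start, end) on a single chromosome


@dataclass
class Gene:  # hypothetical record, not a GATK-SV type
    start: int
    end: int
    coding: List[Interval]


@dataclass
class SV:  # hypothetical record, not a GATK-SV type
    svtype: str  # assumed labels: "DEL", "DUP", "INV", "INS_ME", "CPX", "CTX"
    start: int
    end: int


def overlaps(a: Interval, b: Interval) -> bool:
    return a[0] < b[1] and b[0] < a[1]


def hits_coding(sv: SV, gene: Gene) -> bool:
    return any(overlaps((sv.start, sv.end), cds) for cds in gene.coding)


def breakpoint_in_coding(sv: SV, gene: Gene) -> bool:
    return any(cds[0] <= bp < cds[1]
               for bp in (sv.start, sv.end) for cds in gene.coding)


def consequence(sv: SV, gene: Gene) -> str:
    """Return 'LoF', 'copy_gain', 'partial_dup', or 'none' for one SV-gene pair."""
    if sv.svtype == "DEL":
        # Any deletion overlapping coding sequence is treated as LoF.
        return "LoF" if hits_coding(sv, gene) else "none"
    if sv.svtype in {"INV", "INS_ME", "CPX", "CTX"}:
        # LoF when at least one breakpoint lands in coding sequence.
        return "LoF" if breakpoint_in_coding(sv, gene) else "none"
    if sv.svtype == "DUP":
        if sv.start <= gene.start and sv.end >= gene.end:
            return "copy_gain"  # duplication spans the whole gene
        start_in = gene.start <= sv.start < gene.end
        end_in = gene.start < sv.end <= gene.end
        if start_in and end_in:
            # Intragenic duplication: LoF only if it overlaps coding sequence.
            return "LoF" if hits_coding(sv, gene) else "none"
        if start_in != end_in:
            return "partial_dup"  # one breakpoint inside the gene, one outside
    return "none"
```

Under these assumptions, a duplication with both breakpoints inside a gene and overlapping an exon is labeled LoF, while one that starts upstream and ends inside the gene is labeled a partial duplication, which the filtering step later excludes.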
Variant filtering
We first filtered variants on the basis of predicted functional impact. For SNVs and indels, we retained all variants annotated as LoF or missense variants that had a CADD70 score > 15 and were not annotated as benign, likely benign, risk factor, association, drug response, or protective in ClinVar.71 All SNVs and indels that passed our quality control and allele frequency thresholds were retained if they were predicted to functionally alter a gene on our list of disease-associated genes (see Table S4 and supplemental methods for specific thresholds). The only aspect of our GS analytic framework that differed between the ASD and prenatal cohorts was the content of the gene lists, which were phenotype-specific and computationally derived to limit the burden of up-front gene curation.39,72 Briefly, the ASD gene list included 901 genes classified as having a “confirmed” or “probable” association with NDDs in the Developmental Disorders Genotype-Phenotype Database (Table S5).73 To account for the phenotypic heterogeneity of the structural anomalies observed in our FSA cohort (Tables S2 and S3), we compiled a separate list of 2,535 genes from eight sources that are broadly associated with developmental disorders and congenital anomalies (Table S6 and supplemental methods). All variants were then filtered under four genotype categories (de novo, rare inherited, homozygous, and hemizygous) depending on the specific mode(s) of inheritance of the gene-disease association (dominant, recessive, or X-linked). Finally, we applied more stringent filters (described in supplemental methods) to inherited, homozygous, compound heterozygous, and hemizygous missense variants given that they contributed significantly to the number of variants requiring manual review but have not been shown to substantially contribute to the etiology of ASD or FSAs.2,3,4,74
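The first-pass SNV/indel filter described above can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the `Variant` record, its field names, and the consequence labels are hypothetical. The allele frequency cutoff is a placeholder, because the actual thresholds are given only in Table S4 and the supplemental methods. The sketch also assumes one reading of the text, in which the CADD and ClinVar conditions apply to missense variants and LoF variants are retained without them.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Set

# Consequence classes the text counts as loss of function (LoF).
LOF_CLASSES = {"stop_gain", "stop_loss", "frameshift_insertion",
               "frameshift_deletion", "splicing"}
# ClinVar labels that exclude a missense variant, as listed in the text.
CLINVAR_EXCLUDED = {"benign", "likely benign", "risk factor",
                    "association", "drug response", "protective"}
CADD_MIN = 15.0             # from the text: CADD score > 15
MAX_AF_PLACEHOLDER = 0.001  # ASSUMPTION: the real thresholds are in Table S4


@dataclass
class Variant:  # hypothetical record for illustration
    gene: str
    consequence: str
    cadd: Optional[float] = None
    clinvar: Set[str] = field(default_factory=set)
    afs: Dict[str, float] = field(default_factory=dict)  # AF per source


def max_af(v: Variant) -> float:
    """Maximum AF across sources (e.g. gnomAD genomes/exomes, 1000G, parents)."""
    return max(v.afs.values(), default=0.0)


def has_predicted_impact(v: Variant) -> bool:
    if v.consequence in LOF_CLASSES:
        return True
    return (v.consequence == "missense"
            and v.cadd is not None
            and v.cadd > CADD_MIN
            and not (v.clinvar & CLINVAR_EXCLUDED))


def keep_for_review(v: Variant, disease_genes: Set[str],
                    max_allowed_af: float = MAX_AF_PLACEHOLDER) -> bool:
    """Retain impactful, rare variants in a phenotype-specific gene list."""
    return (has_predicted_impact(v)
            and v.gene in disease_genes
            and max_af(v) <= max_allowed_af)
```

Taking the maximum AF across all sources is the conservative choice: a variant that is common in any one reference set is dropped even if it is absent from the others.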
A hierarchical filtering process was applied to all SVs. First, SVs predicted to be LoF or full gene copy gains were retained and partial gene duplications were excluded given their unknown functional impact.46 Then, following current recommendations, multigenic CNVs (deletions and duplications overlapping ≥35 and ≥50 protein-coding genes, respectively)46 were prioritized for manual classification regardless of whether any of the genes have been previously associated with disease. Next, any rare SV overlapping one of the 64 known genomic disorder loci (Table S7) or the 17 noncoding loci associated with pathogenic positional effects (Table S8) was retained. The SVs that did not meet any of the preceding criteria were then filtered on the basis of their overlap with the phenotype-specific disease-associated gene lists following the same inheritance patterns and allele frequency thresholds described above. All STRs that exceeded a pathogenic repeat length based on literature review were retained if they overlapped an STR-mediated locus associated with an early-onset developmental disorder (18 loci described in Table S9). Finally, the identification of candidate compound heterozygous variants comprised three filtering steps: (1) compiling heterozygous SNVs, indels, and LoF SVs located in the same recessive disease-associated gene; (2) annotating each variant with inheritance status; and (3) retaining only the instances where individuals had more than one variant in a recessive disease-associated gene with disparate inheritance patterns (e.g., one maternally inherited, one de novo). To retain variants in trans, we used inheritance as a proxy for phasing and required that at least one variant per compound heterozygous grouping be inherited from a parent (i.e., not all could occur de novo).
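The three compound heterozygous filtering steps can be sketched as below. This is an illustrative sketch under stated assumptions, not the authors' code: the `HetCall` record, its field names, and the inheritance labels are hypothetical, and the input is assumed to already contain only heterozygous SNVs, indels, and LoF SVs that passed the earlier filters.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, List, Set, Tuple


@dataclass(frozen=True)
class HetCall:  # hypothetical record for illustration
    sample: str
    gene: str
    variant_id: str
    inheritance: str  # assumed labels: "maternal", "paternal", "de_novo"


def compound_het_candidates(
    calls: List[HetCall], recessive_genes: Set[str]
) -> Dict[Tuple[str, str], List[HetCall]]:
    # Step 1: compile heterozygous calls per individual and recessive gene.
    grouped: Dict[Tuple[str, str], List[HetCall]] = defaultdict(list)
    for call in calls:
        if call.gene in recessive_genes:
            grouped[(call.sample, call.gene)].append(call)

    candidates: Dict[Tuple[str, str], List[HetCall]] = {}
    for key, group in grouped.items():
        # Step 2: collect the inheritance status of each variant in the group.
        origins = {call.inheritance for call in group}
        # Step 3: keep groups with more than one variant, disparate origins,
        # and at least one parentally inherited variant (proxy for in trans).
        inherited = bool(origins & {"maternal", "paternal"})
        if len(group) > 1 and len(origins) > 1 and inherited:
            candidates[key] = group
    return candidates
```

Two variants inherited from the same parent are dropped because they are most likely in cis, and a group consisting only of de novo variants is dropped because inheritance gives no phasing information for it.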
Manual variant classification
To ensure all variants were high quality, we visually inspected the read evidence for each candidate diagnostic variant output by our filtering pipeline by using the Integrative Genomics Viewer for SNVs, indels, and SVs;75 CNView for CNVs;76 and REViewer for STRs.77 All variants that passed manual visual inspection were assessed by a variant review panel consisting of board-certified clinical geneticists, cytogeneticists, molecular geneticists, obstetricians, maternal-fetal specialists, pediatricians, and genetic counselors as well as population geneticists and bioinformaticians with expertise in SV identification and interpretation. All variants were first evaluated for a gene-phenotype association on an individual-specific basis.78 If a reliable match was determined for the individual in question, all variants in that gene were reviewed following guidelines for sequence variant and CNV interpretation from the American College of Medical Genetics and Genomics (ACMG), the Association for Molecular Pathology (AMP), the Clinical Genome Resource (ClinGen),46,47 and recommendations for adjusting the standard clinical guidelines from the ClinGen Sequence Variant Interpretation (SVI) Working Group.48,49,50,51,52,53 Overall, these guidelines provide a systematic and robust method to identify variants with a 90% or greater certainty of causing disease.47 This method is reliably reproduced across laboratories79 and rarely results in downgrading P/LP variants over time.80 All variants classified as P/LP in a gene robustly associated with the individual’s phenotype (i.e., the indication for testing) were considered a molecular diagnosis and were counted toward the diagnostic yield of GS.
Benchmarking the performance of the GS analytic framework
ASD proband vs. unaffected sibling comparisons
The quartet family structure of the ASD cohort provided us with a unique opportunity to evaluate our bioinformatic filtering and variant classification methods by comparing the number of variants output at each step between the affected probands with ASD and their unaffected siblings. To confirm that our filtering pipeline was enriching for potentially pathogenic variants as intended and to assess the potential false positive rate of the variant interpretation guidelines, we treated each ASD proband and their unaffected sibling as separate trios with both parents. After filtering, we compared the number of variants requiring manual review in the ASD probands to that in their unaffected siblings and then manually reviewed all variants blind to affected status (i.e., all variants were reviewed as if the child was diagnosed with ASD). We then compared the fraction of P/LP variants identified between these two groups.
Cross-technology comparisons
To quantify the sensitivity of GS against CMA and ES, we first leveraged the ASD cohort, which had unfiltered data for each technology available for re-analysis. For the CMA analysis, we obtained CNVs identified from Illumina single-nucleotide polymorphism (SNP) microarrays that were processed as previously described.6 Briefly, SNP genotyping data were generated via three Illumina CMA platforms and CNVs were identified from these data via PennCNV,81 QuantiSNP v.2.3,82 and GNOSIS/CNVision.83 All CNVs identified from CMA were lifted over from GRCh37 to GRCh38 for comparisons against ES and GS. For the ES analysis, we used the SNV, indel, and CNV calls that were generated as part of a larger ASD sequencing initiative.4,5 To summarize, raw reads from all 6,448 samples were aligned to GRCh38 and SNV and indel discovery was performed with GATK v.4.1.2.0.62 All samples were jointly genotyped following GATK Best Practices for Variant Calling.63 We also employed GATK-gCNV for exome CNV detection,84 a new algorithm that is specifically designed to adjust for known bias factors of exome capture and sequencing (e.g., GC content), while automatically controlling for other technical and systematic differences. The GATK-gCNV workflow is publicly available in a Terra workspace. We applied the same version of our GS analytic pipeline to the CMA and ES data from all 6,448 individuals in the ASD quartet families. The only modification made was to the allele balance and depth filters to accommodate the higher coverage of ES compared to GS (Figure S3).
We also analyzed GS data from 46 fetuses that were pre-selected for having received a clinically reportable finding from karyotype, CMA, or ES. Inclusion of these benchmarking fetal samples allowed us to investigate the impact of DNA source (whole blood vs. chorionic villi or amniocytes) on the performance of GS as well as evaluate the ability of GS to identify a range of cytogenetically visible balanced chromosomal rearrangements (BCRs). Each recruitment site provided us with the list of clinically reported variants found in each fetus by using their in-house methods and pipelines (i.e., raw data were not available for re-analysis).1,3,42,43,44 We identified STRs across 18 loci (Table S9) in the ASD and FSA cohorts, despite there being no clinical STR test results available for direct comparison. Previous studies using the same computational approach have demonstrated 97.3% sensitivity and 99.6% specificity against existing PCR tests.85 The sensitivity of GS was calculated as the proportion of P/LP variants identified by each diagnostic test (karyotype, CMA, and ES) that were also identified by GS.
Application of GS to a prescreened fetal structural anomaly cohort
After systematically benchmarking the GS analytic framework, we applied it to 249 retrospectively obtained fetal structural anomaly trios (n = 747 individuals) that had been pre-screened with karyotype, CMA, and/or ES (Table S2). The analysis performed on the FSA trios was identical to that applied to the benchmarking samples described above. The added diagnostic yield of GS in this cohort was calculated on the basis of the number of P/LP variants identified by GS.
Results
Assessment of the GS analytic framework
We analyzed short-read GS data from 1,612 ASD quartet families (n = 6,448 individuals) that also had matched CMA and ES data available to directly compare the relative value of each technology. Overall, our GS variant calling methods identified an average of 3.7M short variants (3.4M SNVs, 0.3M indels) and 8,814 SVs per genome that passed filtering criteria as well as 115,821 STR genotypes at 18 targeted disease loci across the cohort. Our filtering strategy reduced the number of variants requiring manual curation to an average of 0.49 variants per child (range = 0–9), totaling 1,743 variants across 901 NDD-associated genes and loci (Table S5) in the ASD probands and unaffected siblings. We observed an enrichment of variants requiring manual review per person in the ASD probands compared to their unaffected siblings (0.58 mean variants per ASD proband and 0.39 per unaffected sibling; p = 4.12 × 10⁻¹⁴; two-sided Wilcoxon test), suggesting that our filtering pipeline was accurately enriching for potentially pathogenic variants. Demonstrating the power of the interpretation guidelines, this proband enrichment further increased following manual variant curation, which identified 128 P/LP variants in 126 ASD probands (7.8% yield; 95% CI 6.5–9.1) compared to 17 P/LP variants in unaffected siblings (1.1% yield; 95% CI 0.6–1.6; odds ratio [OR] = 7.9; 95% CI = 4.7–14.1; p = 2.2 × 10⁻¹⁶; Fisher’s exact test; Figure 3; Table S10). Importantly, 71% of the P/LP variants identified in siblings included CNVs associated with reduced penetrance, which are known to be a challenge for genetic counseling and are already encountered by clinicians during routine CMA testing.
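The yield and odds ratio figures above can be reproduced approximately with a short calculation. The sketch below is illustrative and rests on assumptions: the source does not state how the yield intervals were computed, so a normal-approximation (Wald) interval is used here because it matches the reported values after rounding; the 17 sibling variants are assumed to fall in 17 different siblings; and the odds ratio is the simple cross-product estimate, which can differ slightly from the conditional estimate typically reported alongside Fisher's exact test. The function names are hypothetical.

```python
import math
from typing import Tuple


def wald_interval(k: int, n: int, z: float = 1.959964) -> Tuple[float, float, float]:
    """Return (proportion, lower, upper) using a normal-approximation 95% CI."""
    p = k / n
    half_width = z * math.sqrt(p * (1 - p) / n)
    return p, p - half_width, p + half_width


def cross_product_odds_ratio(a: int, b: int, c: int, d: int) -> float:
    """Sample odds ratio for the 2x2 table [[a, b], [c, d]]."""
    return (a * d) / (b * c)


N_FAMILIES = 1612          # one proband and one sibling per quartet family
PROBANDS_DIAGNOSED = 126   # from the text
SIBLINGS_WITH_PLP = 17     # ASSUMPTION: 17 variants in 17 distinct siblings

proband = wald_interval(PROBANDS_DIAGNOSED, N_FAMILIES)
sibling = wald_interval(SIBLINGS_WITH_PLP, N_FAMILIES)
odds_ratio = cross_product_odds_ratio(
    PROBANDS_DIAGNOSED, N_FAMILIES - PROBANDS_DIAGNOSED,
    SIBLINGS_WITH_PLP, N_FAMILIES - SIBLINGS_WITH_PLP,
)

for label, (p, lo, hi) in (("probands", proband), ("siblings", sibling)):
    print(f"{label}: {100 * p:.1f}% (95% CI {100 * lo:.1f}-{100 * hi:.1f})")
print(f"cross-product odds ratio: {odds_ratio:.2f}")
```

The cross-product estimate comes out just under 8, close to the reported OR of 7.9.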
Evaluating the diagnostic performance of GS
We benchmarked the diagnostic performance of GS against standard-of-care tests by applying the equivalent GS framework to the CMA and ES data from the ASD cohort, with minor modifications to accommodate each data type (Figure S3; supplemental methods). Overall, GS identified a diagnostic variant in almost 2-fold more probands than CMA (n = 126 vs. n = 71; OR = 1.8; 95% CI 1.3–2.5; p = 6.5 × 10⁻⁵) and almost 3-fold more than ES (n = 126 vs. n = 49; OR = 2.7; 95% CI 1.9–3.9; p = 1.98 × 10⁻¹⁰) (Figure 3). When we used a new method to capture CNVs from ES data (GATK-gCNV),84 the overall diagnostic yield of ES approached that of GS (7.4% vs. 7.8%, respectively), though it still did not capture all known P/LP variants. For example, a single exon deletion overlapping the first exon of NRXN1 (MIM: 600565) identified by GS was missed by ES because it did not pass our stringent filtering criteria that required CNVs to overlap >2 exons.5,84 Manual inspection revealed the deletion was present in the raw ES CNV calls, suggesting strategies for clinical exome CNV calling could consider relaxing filtering for pre-defined disease-associated genes, particularly for those where CNVs are a known mechanism of disease.86
Overall, GS captured 100% of the P/LP variants identified by CMA (n = 71) and ES (n = 118) while also uniquely identifying an additional diagnostic variant in seven (0.4%) ASD probands (Figure 3). We reviewed the properties of the variants uniquely identified by GS, which included one SNV and one indel: a de novo stop-gain in ANKRD11 (MIM: 611192) and a 44 bp de novo frameshift insertion in SMARCA4 (MIM: 603254), and five SVs: single exon deletions in RERE (MIM: 610226) and RORA (MIM: 600825),87,88 a reciprocal translocation disrupting GRIN2B (MIM: 138252), an SVA retrotransposon insertion in DMD (MIM: 300377), and a 47.2 Mb complex SV involving chromosome 1 comprised of four deletions, an inversion, and an inverted insertional translocation (Table S11). The ANKRD11 stop-gain was in an exon with no ES coverage and the SMARCA4 insertion was within 30 bp of an intron-exon boundary and was not present in the ES read evidence (Figure S4). In contrast to the single exon NRXN1 deletion described above, the smaller RERE (5.6 kb) and RORA (0.5 kb) deletions identified by GS were not detectable in the raw ES data, suggesting that ES will not be able to capture all single-exon deletions of clinical relevance. As expected, CMA and ES were unable to detect the balanced translocation. Similarly, while CMA and ES both detected the four de novo deletions involved in the complex SV, they were unable to identify the inversions that link the deletions into a single event. Finally, we did not apply a mobile element insertion algorithm to the ES data, as it is not currently implemented in routine clinical diagnostics,89 but this type of ES analysis could potentially capture variants labeled as GS unique in this study, such as the SVA insertion. Taken together, these data demonstrate that GS outperforms both CMA and ES, capturing all P/LP variants from these two technologies and providing a modest increase in diagnostic yield beyond the combination of both diagnostic tests.
Using DNA obtained from diagnostic procedures performed in pregnancy, we next confirmed the benchmarking results in prenatal samples and assessed the ability of GS to detect BCRs routinely identified by karyotype. We chose 46 fetuses that carried 53 reportable variants identified from standard clinical testing due to AMA (n = 14) or ultrasound detection of an FSA (n = 32) (Table S3). These variants included seven aneuploidies, 20 CNVs, and 18 SNVs or indels (including four compound heterozygous variant pairs), all of which are commonly observed in prenatal testing. This benchmarking cohort was also highly enriched for BCRs (n = 8/46 fetuses; 17.4% here vs. 3.0% estimated prevalence across all FSAs).1 Overall, GS captured 100% of the clinically reportable CNVs and SNVs/indels originally identified by CMA (n = 20) and ES (n = 12) and 62.5% of the BCRs identified by karyotype (n = 5/8). On the basis of the reported karyotype, the three BCRs not captured by GS are localized to highly repetitive telomeric and centromeric regions, which are known to be inaccessible to short-read GS.26 This class of missed BCRs accounts for <1% of the total diagnostic yield provided by karyotype in FSAs.1
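As a worked example of the sensitivity definition used in this study (the proportion of variants reported by a clinical test that GS also identified), the prenatal benchmarking counts above can be tabulated as follows. The counts are taken from the text; the function and dictionary names are hypothetical and chosen only for illustration.

```python
def sensitivity(n_found_by_gs: int, n_reported: int) -> float:
    """Fraction of clinically reported variants that GS also identified."""
    if n_reported <= 0:
        raise ValueError("n_reported must be positive")
    if not 0 <= n_found_by_gs <= n_reported:
        raise ValueError("n_found_by_gs must lie between 0 and n_reported")
    return n_found_by_gs / n_reported


# (found by GS, reported by the clinical test), counts from the text.
benchmark_counts = {
    "CMA CNVs": (20, 20),
    "ES SNVs/indels": (12, 12),
    "karyotype BCRs": (5, 8),
}

sensitivity_pct = {
    test: 100 * sensitivity(found, reported)
    for test, (found, reported) in benchmark_counts.items()
}

for test, pct in sensitivity_pct.items():
    print(f"GS sensitivity vs. {test}: {pct:.1f}%")
```

Reporting sensitivity separately for each test and variant class keeps the karyotype shortfall visible instead of averaging it into a single pooled figure.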
Determining the added diagnostic yield of GS for the assessment of fetal structural anomalies
After systematically benchmarking the performance of our GS analytic framework, we applied it to 249 fetus-parent trios that were pre-screened with karyotype, CMA, and/or ES. The structural anomalies impacted a wide range of organ systems and 36.1% (n = 90/249) of the cohort had multisystem involvement (Figure 4; Table S2). GS identified 816 candidate variants requiring manual review, resulting in an average of 3.3 variants per fetus (median = 3.0, range = 0–21). The increased number of variants output by our GS filtering in fetuses compared to the ASD probands is due to a greater number of SNVs and indels across the larger gene list used, with an average of 2.65 sequence variants across n = 2,535 genes for the fetal cohort compared to an average of 0.31 sequence variants across n = 901 genes for the ASD cohort. Manual variant curation identified 21 P/LP variants in 19 (7.6%) fetuses with a structural anomaly (Table S12). On the basis of our benchmarking analyses, the majority (n = 17/19; 89.5%) of these molecular diagnoses would have also been identified by a combination of contemporary CMA and ES. For example, 78.9% (n = 15/19) of the diagnoses included SNVs and indels identified in fetuses that had not previously undergone ES. Similarly, GS identified a 67 kb deletion in MED13L (MIM: 608771) and a maternal uniparental disomy (UPD) event involving chromosome 20 in two fetuses who had previously undergone array comparative genomic hybridization (aCGH). The MED13L deletion was missed because the custom aCGH platform did not have probe coverage over the region and the UPD was missed because regions of homozygosity are not identifiable without the analysis of SNP probes, which are absent from aCGH.1 These data demonstrate the importance of taking previous diagnostic testing, technology platforms, and analysis pipelines into consideration when reporting comparative diagnostic yields.
The most conservative estimate therefore suggests that GS uniquely provided a molecular diagnosis in two FSA probands: a single exon deletion in MED13L (1.3 kb in size) and a compound heterozygous variant pair comprising a missense variant in trans with a 143 kb intragenic exonic duplication in DYNC2H1 (MIM: 603297). While the identification of the compound heterozygous variants is technically feasible with the combination of CMA and ES, most clinical analysis pipelines do not systematically integrate variants across technologies. Instead, diagnostic laboratories often manually follow-up on individual genes when there is a strong a priori suspicion of a gene-phenotype match, as was true for this fetus in clinic. A pathogenic missense variant in DYNC2H1 was identified by ES in a fetus with short-rib thoracic dysplasia. Given the specificity of the gene-phenotype association,90 the diagnostic laboratory manually reviewed the ES read depth profile across this gene, identified the duplication, and confirmed the event with fluorescence in situ hybridization.42 While this ultimately represented a successful approach for this fetus, it is not systematic and the increased burden of these additional steps is unlikely to scale, particularly for phenotypes associated with multiple recessive genes. Overall, these data suggest that GS provided a 0.8% increase in diagnostic yield beyond the combination of karyotype, CMA, and ES in these FSA trios (Figure 4).
Classification of SVs unique to GS
Over 75% (n = 7/9) of the diagnostic variants uniquely identified by GS in the ASD and FSA cohorts were SVs (Figure 5), including SVs below the resolution of and/or inaccessible to existing standard-of-care tests (n = 5) and SVs for which the base pair resolution provided by GS resulted in a medically relevant change in classification from variant of uncertain significance (VUS) to P/LP (n = 2).80 Notably, while STRs represent a variant class uniquely identifiable from GS, we did not identify any STRs that met P/LP criteria in the ASD or FSA cohorts. As studies examining the contribution of STRs to disease risk increase,85,91,92 we expect the interpretation of these variants to improve. Indeed, predicting the functional consequences of many GS-unique SVs was challenging, particularly for in-frame single exon deletions like the 5,618 bp de novo deletion in RERE in an ASD proband. For small rare in-frame CNVs (e.g., that disrupt <10% of the protein),50 evidence that the altered exon codes for a functional unit of the protein is one way to strengthen the classification of the variant. However, this type of exon-level annotation is unavailable for most genes, suggesting that gene-level metrics quantifying the impact of in-frame CNVs would be of value.
GS also identified SVs that could only be classified as diagnostic using the resolution uniquely provided by this technology, such as the pathogenic balanced translocation disrupting GRIN2B in an ASD proband.24 Reciprocal translocations identified by karyotype are routinely reported back to families, but very little can be said about their contribution to the phenotype because the precise location of the breakpoints, and thus the predicted functional impact, remains unknown.93,94,95 Indeed, our previous work has demonstrated that GS revises the location of cytogenetically visible BCRs by one or more cytogenetic bands in over 93% of individuals,46,93 suggesting that conclusions about pathogenicity for the indication for testing cannot be drawn on the basis of karyotype results alone. Similarly, we identified a pathogenic de novo 47.2 Mb complex SV in an ASD proband that was only resolved by GS. Current guidelines recommend the individual assessment of CNVs involved in a complex SV; however, GS can resolve complex SVs to a single event so there is strong rationale to evaluate the overall rearrangement in diagnostic classification. In this study, we applied the gene-number thresholds to the total number of genes overlapped by all four deletions to classify this complex SV as LP, but we note that these thresholds were derived from very large canonical CNVs and did not include the analysis of complex SV.46 To improve gene-number thresholds, future studies could consider including complex SVs as well as CNVs below the resolution of CMA, which are now robustly detectable with GS.96 Taken together, these data provide specific examples of the types of variants, particularly SVs, that will be encountered as comprehensive variant identification from clinical GS becomes more widespread.
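Evaluating a complex SV as a single event, rather than as separate CNVs, amounts to pooling the genes overlapped by every component segment before applying a gene-number threshold. The sketch below illustrates only that pooling step. The gene symbols, the threshold value, and the function names are hypothetical placeholders and do not reproduce the ACMG/ClinGen scoring thresholds or the variant described above:

```python
from typing import Iterable, List, Set


def genes_in_complex_sv(segments: Iterable[Set[str]]) -> Set[str]:
    """Union of the genes overlapped by every CNV segment of one complex SV."""
    combined: Set[str] = set()
    for genes in segments:
        combined |= genes
    return combined


def meets_gene_threshold(segments: Iterable[Set[str]], threshold: int) -> bool:
    """True if the pooled, de-duplicated gene count reaches the threshold."""
    return len(genes_in_complex_sv(segments)) >= threshold


# Hypothetical example: four deletions belonging to one complex SV.
segments: List[Set[str]] = [
    {"GENE_A", "GENE_B"},
    {"GENE_B", "GENE_C"},
    {"GENE_D"},
    {"GENE_E", "GENE_F"},
]
EXAMPLE_THRESHOLD = 5  # placeholder value, not a guideline threshold

combined = genes_in_complex_sv(segments)
print(len(combined), meets_gene_threshold(segments, EXAMPLE_THRESHOLD))
```

In this toy example no single deletion reaches the placeholder threshold, but the pooled event does, which is the situation in which assessing the overall rearrangement can change the classification.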
Discussion
Since the advent of massively parallel sequencing technologies, the application of clinical short-read GS has represented an enticing approach to ascertain almost all pathogenic variation in a single diagnostic test. Despite this enthusiasm,97,98 there remains a dearth of unbiased and large-scale studies to systematically assess this technology against conventional tests for any phenotype, and in particular for FSAs. As such, it has been asserted that GS can provide anywhere from no improved diagnostic yield99 to over 50%.29,30 Unfortunately, existing studies examining the clinical utility of GS frequently have disparate standard-of-care tests available on individuals for comparison, precluding systematic benchmarking of GS against any individual test as well as the combination of multiple tests. Further, SVs are often not considered36,100,101 or only identified via a small number of algorithms102,103,104,105 despite evidence demonstrating the need for multiple approaches to maximize sensitivity.25 This places an unnecessary technical constraint on the diagnostic value of GS and represents a critical limitation for surveying conditions where the contribution of SVs is significant, such as for FSAs and ASD.1,6 We demonstrate here that these limitations can be circumvented with a comprehensive GS framework to capture, filter, and interpret a broad spectrum of variant classes without significantly increasing the burden of manual variant curation.39
The scale of the benchmarking conducted here, namely the 1,612 ASD quartet families that had three technologies (GS, ES, and CMA) available for re-analysis on all individuals, demonstrated that GS captures all diagnostic variants identified by CMA and ES and provides a molecular diagnosis for almost 2-fold more ASD probands than either technology alone. We also illustrate that the diagnostic yield of ES can approach that of GS if sensitive CNV discovery is performed on the exome data.106,107,108 While phenotype, ascertainment, and clinical context are expected to impact comparative diagnostic yields, our study demonstrates the importance of comprehensive variant discovery across technologies to avoid overestimating the added diagnostic yield of a single technology. As exemplified by our FSA cohort, inflated yields of GS (e.g., 7.6% vs. 0.8%) can easily occur when previous testing, technology platforms, assessment timepoints, bioinformatic analyses, and interpretation guidelines are not taken into consideration.
To confirm these results in fetal DNA samples, we applied the GS analytic framework to 46 fetuses pre-selected to harbor a reportable variant identified by karyotype, CMA, or ES. As expected, GS identified 100% of the CNVs and SNVs/indels identified by CMA and ES, respectively. In contrast, only 62.5% of the BCRs identified by karyotype were recapitulated by GS, largely as a result of the localization of BCRs to highly repetitive acrocentric chromosomes.109 Previous studies have found that short-read GS may identify upwards of 90.8% of BCR breakpoints when rearrangements involving the acrocentric chromosomes are excluded,93 suggesting the true performance of GS for detecting all BCRs will likely fall within the 62.5%–90.8% range. However, the impact of these missed BCRs on the total yield of GS will be small (e.g., 0.3%–1.1%), as the fraction of BCRs identified in FSAs is only estimated to be 3%.1 Indeed, we can extrapolate our benchmarking results to diagnostic yields obtained from unselected FSAs that were ascertained from the same catchment area as the majority (64.2%) of our FSA cohort. Using these historical data,1,3 we estimate that GS can provide an overall diagnostic yield of 46.1% in unselected FSAs, significantly outperforming each individual standard-of-care test: 17.2% increase over karyotype, 14.1% over CMA, and 36.1% over ES when only SNVs and indels are considered, and 4.1% when CNVs are also robustly identified from ES data (Figure 3). Based on diagnostic performance alone, these data strongly argue for GS to displace the serial application of karyotype, CMA, and ES for the assessment of FSAs and ASD, provided analysis and interpretation are sufficiently optimized to identify and interpret all classes of variation.
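The 0.3%–1.1% range quoted above for the impact of missed BCRs can be reproduced with simple arithmetic. The sketch below assumes that the missed yield equals the estimated BCR fraction multiplied by the proportion of BCRs that GS fails to detect. The text does not state this formula explicitly, so it is an assumption, and the names are illustrative:

```python
# Figures taken from the text above; names are illustrative.
BCR_FRACTION = 0.03          # estimated fraction of FSAs with a BCR
GS_SENSITIVITY_LOW = 0.625   # BCRs recapitulated by GS in this benchmark
GS_SENSITIVITY_HIGH = 0.908  # reported when acrocentric BCRs are excluded


def missed_yield(bcr_fraction: float, sensitivity: float) -> float:
    """Percentage of all cases whose BCR would be missed by GS (assumed model)."""
    return round(100 * bcr_fraction * (1 - sensitivity), 1)


worst_case = missed_yield(BCR_FRACTION, GS_SENSITIVITY_LOW)
best_case = missed_yield(BCR_FRACTION, GS_SENSITIVITY_HIGH)

print(f"{best_case}%-{worst_case}%")
```

Under this assumption the two sensitivity estimates bound the loss in total yield, and both bounds are small relative to the overall yield estimated for GS.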
These studies found that GS uniquely identified nine P/LP variants across ASD probands and fetuses with structural anomalies, representing an added diagnostic yield of 0.4% and 0.8% in each cohort, respectively. Our study revealed that most diagnostic GS-unique variants included SVs that were inaccessible to existing standard-of-care diagnostic tests or were only determined to be pathogenic on the basis of information that was uniquely provided by GS. These included BCRs, complex SVs, single exon in-frame deletions, and mobile element insertions. It may be possible to further increase the yield of ES by improving filtering to recapture single exon CNVs. However, we previously demonstrated that the false positive rate of deletions and duplications detected by GATK-gCNV from ES data can dramatically increase if filtering is relaxed to one exon genome wide without manual curation of individual variants.5,84 These data should temper enthusiasm regarding immediate significant increases in interpretable pathogenic variation from either ES or GS. Advances in genomics technologies and algorithms will continue to only provide incremental increases in diagnostic yield without improvements in variant annotation (e.g., predicting the functional impact of a variant) and interpretation.
Beyond diagnostic yield, there are additional technical, logistical, and economic considerations when deciding to implement a new diagnostic test such as GS. Among these, technical capacity and timely return-of-results are paramount in the prenatal setting. While assessing turn-around-time and the impact of GS on downstream health care costs was beyond the scope of this study, previous studies have demonstrated that GS results can be delivered within 18–21 days for the assessment of FSAs.33,34 Additionally, rapid GS (ranging from 26 h to 3.2 days for analysis completion)100,110 has been demonstrated in the pediatric setting for the assessment of critically ill infants, where, similar to prenatal diagnostics, time to diagnosis can have a significant impact on medical management and clinical outcomes. Further, clinical GS costs less than existing standard-of-care diagnostic tests for individuals with a developmental disorder and/or congenital anomaly111 and rapid GS has reduced the cost of hospitalization for children admitted to neonatal or pediatric intensive care units.112,113 Taken together, these data suggest that the benefits of GS are likely to extend to reductions in health care costs and rapid return-of-results in addition to improved diagnostic yield. Yet, efforts to ensure that GS does not exacerbate health inequities will be critical, as access to testing will initially be limited to metropolitan areas with major medical centers. Additionally, initiatives to expand diverse population representation in reference databases will be integral to ensuring that individuals from non-European genetic ancestries have an equal opportunity to receive a diagnosis, as population-specific allele frequencies are essential for variant interpretation.
In conclusion, these studies demonstrate the potential for GS to displace a series of standard-of-care diagnostic tests that individually identify only a small portion of the genomic variant spectrum associated with FSAs and ASD. The large-scale benchmarking performed in this study was critical, as these analyses focus on rare variants that span an array of mutational mechanisms but are not frequently observed in the general population or in small cohorts. We demonstrate that GS is unlikely to significantly increase the diagnostic yield in FSAs or ASD without improvements in variant annotation and interpretation, particularly for noncoding variation, as we were only able to consider a small number of noncoding disease-associated loci. Some discrete phenotypes will also continue to require specialized assays (e.g., methylation tests, microsatellite analysis) for variants not accessible to any short-read GS technology. Overall, these data suggest that GS can effectively displace karyotype, CMA, and ES as a single diagnostic test for the assessment of FSAs and ASD and will provide a marginal, but important, increase in diagnostic yield beyond the combination of all three current standard-of-care diagnostic tests.
Data and code availability
The genomic and phenotype data for the ASD families can be accessed through SFARIbase with permission from the Simons Foundation Autism Research Initiative. The raw sequencing data generated from the fetal structural anomaly cohort are restricted because of consent limitations. However, all diagnostic variants identified in the ASD and FSA cohorts are provided in Tables S10 and S12.
Acknowledgments
We thank the families and clinicians from the Columbia University Carmen and John Thain Center for Prenatal Pediatrics, the University of North Carolina Chapel Hill Prenatal Diagnosis Program, the University of California San Francisco Prenatal Diagnostic Center, and the Simons Simplex Collection for their participation. This study was supported by resources from the National Institutes of Health (NIH): HD081256, HD099547, and MH115957 (awarded to M.E.T.); HD088742 (awarded to N.V.); UM1HG008900 (awarded to A.H.O'D-L., H.R., and M.E.T.); HD105266 (awarded to R.W. and M.E.T.); K99HD108392 (awarded to C.L.); F31NS113414 (awarded to E.V.); T32HG002295 (supporting R.L.C.); and K99DE026824 (awarded to H.B.). Additional support came from the Simons Foundation Autism Research Initiative (SFARI #573206 awarded to M.E.T.). C.L. was also supported by a postdoctoral fellowship from the Canadian Institutes of Health Research and R.L.C. was supported by the National Science Foundation (GRFP #2017240332). J.-Y.A. was supported by the National Research Foundation of Korea (2020R1C1C1003426 and 2021M3E5D9021878).
Author contributions
Study design: B.L., H.B., D.G.M., R.W., and M.E.T. Family recruitment and sample collection: J.L.G., V.S.A., D.L., M.E.N., T.M., K.G., B.P., A.B., M.D., N.L.V., B.L., and R.W. Sample library preparation: B.B.C. and K.O’K. Computational analysis: C.L., E.V., H.Z.W., E.P.H., N.K., C.W.W., S.P.H., B.W., V.J., J.F., R.L.C., X.Z., L.D.G., C.T., N.S., J.-Y.A., S.D., B.D., D.B.G., S.J.S., D.G.M., and H.B. Manual variant curation: C.L., E.V., J.L.G., C.A.A.-T., E.E., G.L., K.G., F.V., J.C.H., A.H.O’D-L., H.L.R., N.L.V., B.L., and R.W. Verified the underlying data for these analyses: C.L., E.V., H.B., and M.E.T. Wrote the manuscript and generated the figures: C.L., E.V., H.B., and M.E.T. All authors reviewed the manuscript. C.L. and E.V. contributed equally to this study.
Declaration of interests
M.E.T. and H.R. receive research funding from Microsoft Inc and/or research reagents from Illumina Inc. M.E.T. also received research funding from Levo Therapeutics and research reagents from Ionis Therapeutics for unrelated research projects.
Published: August 17, 2023
Footnotes
Supplemental information can be found online at https://doi.org/10.1016/j.ajhg.2023.07.010.
Web resources
ClinGen Sequence Variant Interpretation Resources, https://clinicalgenome.org/working-groups/sequence-variant-interpretation/
GATK Best Practices Workflows, https://gatk.broadinstitute.org/hc/en-us/sections/360007226651-Best-Practices-Workflows
GATK-gCNV, https://app.terra.bio/#workspaces/help-gatk/Germline-CNVs-GATK4
GATK-SV single sample pipeline, https://app.terra.bio/#workspaces/help-gatk/GATK-Structural-Variants-Single-Sample
Terra, https://terra.bio/
References
- 1. Wapner R.J., Martin C.L., Levy B., Ballif B.C., Eng C.M., Zachary J.M., Savage M., Platt L.D., Saltzman D., Grobman W.A., et al. Chromosomal microarray versus karyotyping for prenatal diagnosis. N. Engl. J. Med. 2012;367:2175–2184. doi: 10.1056/NEJMoa1203382.
- 2. Lord J., McMullan D.J., Eberhardt R.Y., Rinck G., Hamilton S.J., Quinlan-Jones E., Prigmore E., Keelagher R., Best S.K., Carey G.K., et al. Prenatal exome sequencing analysis in fetal structural anomalies detected by ultrasonography (PAGE): a cohort study. Lancet. 2019;393:747–757. doi: 10.1016/S0140-6736(18)31940-8.
- 3. Petrovski S., Aggarwal V., Giordano J.L., Stosic M., Wou K., Bier L., Spiegel E., Brennan K., Stong N., Jobanputra V., et al. Whole-exome sequencing in the evaluation of fetal structural anomalies: a prospective cohort study. Lancet. 2019;393:758–767. doi: 10.1016/S0140-6736(18)32042-7.
- 4. Satterstrom F.K., Kosmicki J.A., Wang J., Breen M.S., De Rubeis S., An J.Y., Peng M., Collins R., Grove J., Klei L., et al. Large-scale exome sequencing study implicates both developmental and functional changes in the neurobiology of autism. Cell. 2020;180:568–584.e23. doi: 10.1016/j.cell.2019.12.036.
- 5. Fu J.M., Satterstrom F.K., Peng M., Brand H., Collins R.L., Dong S., Wamsley B., Klei L., Wang L., Hao S.P., et al. Rare coding variation provides insight into the genetic architecture and phenotypic context of autism. Nat. Genet. 2022;54:1320–1331. doi: 10.1038/s41588-022-01104-0.
- 6. Sanders S.J., He X., Willsey A.J., Ercan-Sencicek A.G., Samocha K.E., Cicek A.E., Murtha M.T., Bal V.H., Bishop S.L., Dong S., et al. Insights into Autism Spectrum Disorder Genomic Architecture and Biology from 71 Risk Loci. Neuron. 2015;87:1215–1233. doi: 10.1016/j.neuron.2015.09.016.
- 7. Marshall C.R., Noor A., Vincent J.B., Lionel A.C., Feuk L., Skaug J., Shago M., Moessner R., Pinto D., Ren Y., et al. Structural variation of chromosomes in autism spectrum disorder. Am. J. Hum. Genet. 2008;82:477–488. doi: 10.1016/j.ajhg.2007.12.009.
- 8. An J.Y., Lin K., Zhu L., Werling D.M., Dong S., Brand H., Wang H.Z., Zhao X., Schwartz G.B., Collins R.L., et al. Genome-wide de novo risk score implicates promoter variation in autism spectrum disorder. Science. 2018;362. doi: 10.1126/science.aat6576.
- 9. Monaghan K.G., Leach N.T., Pekarek D., Prasad P., Rose N.C., ACMG Professional Practice and Guidelines Committee. The use of fetal exome sequencing in prenatal diagnosis: a points to consider document of the American College of Medical Genetics and Genomics (ACMG). Genet. Med. 2020;22:675–680. doi: 10.1038/s41436-019-0731-7.
- 10. Van den Veyver I.B., Chandler N., Wilkins-Haug L.E., Wapner R.J., Chitty L.S., ISPD Board of Directors. International Society for Prenatal Diagnosis Updated Position Statement on the use of genome-wide sequencing for prenatal diagnosis. Prenat. Diagn. 2022;42:796–803. doi: 10.1002/pd.6157.
- 11. Lazier J., Hartley T., Brock J.-A., Caluseriu O., Chitayat D., Laberge A.-M., Langlois S., Lauzon J., Nelson T.N., Parboosingh J., et al. Clinical application of fetal genome-wide sequencing during pregnancy: position statement of the Canadian College of Medical Geneticists. J. Med. Genet. 2022;59:931–937. doi: 10.1136/jmedgenet-2021-107897.
- 12. International Society for Prenatal Diagnosis, Society for Maternal and Fetal Medicine, Perinatal Quality Foundation. Joint Position Statement from the International Society for Prenatal Diagnosis (ISPD), the Society for Maternal Fetal Medicine (SMFM), and the Perinatal Quality Foundation (PQF) on the use of genome-wide sequencing for fetal diagnosis. Prenat. Diagn. 2018;38:6–9. doi: 10.1002/pd.5195.
- 13. Miller D.T., Adam M.P., Aradhya S., Biesecker L.G., Brothman A.R., Carter N.P., Church D.M., Crolla J.A., Eichler E.E., Epstein C.J., et al. Consensus statement: chromosomal microarray is a first-tier clinical diagnostic test for individuals with developmental disabilities or congenital anomalies. Am. J. Hum. Genet. 2010;86:749–764. doi: 10.1016/j.ajhg.2010.04.006.
- 14. Srivastava S., Love-Nichols J.A., Dies K.A., Ledbetter D.H., Martin C.L., Chung W.K., Firth H.V., Frazier T., Hansen R.L., Prock L., et al. Meta-analysis and multidisciplinary consensus statement: exome sequencing is a first-tier clinical diagnostic test for individuals with neurodevelopmental disorders. Genet. Med. 2019;21:2413–2421. doi: 10.1038/s41436-019-0554-6.
- 15. Carter M.T., Srour M., Au P.-Y.B., Buhas D., Dyack S., Eaton A., Inbar-Feigenberg M., Howley H., Kawamura A., Lewis S.M.E., et al. Genetic and metabolic investigations for neurodevelopmental disorders: position statement of the Canadian College of Medical Geneticists (CCMG). J. Med. Genet. 2023;60:523–532. doi: 10.1136/jmg-2022-108962.
- 16. Mone F., McMullan D.J., Williams D., Chitty L.S., Maher E.R., Kilby M.D., Fetal Genomics Steering Group of the British Society for Genetic Medicine, Royal College of Obstetricians and Gynaecologists. Evidence to Support the Clinical Utility of Prenatal Exome Sequencing in Evaluation of the Fetus with Congenital Anomalies: Scientific Impact Paper No. 64 [February] 2021. BJOG. 2021;128:e39–e50. doi: 10.1111/1471-0528.16616.
- 17. Costain G., McDonald-McGinn D.M., Bassett A.S. Prenatal genetic testing with chromosomal microarray analysis identifies major risk variants for schizophrenia and other later-onset disorders. Am. J. Psychiatry. 2013;170:1498. doi: 10.1176/appi.ajp.2013.13070880.
- 18. Dingemans A.J.M., Truijen K.M.G., van de Ven S., Bernier R., Bongers E.M.H.F., Bouman A., de Graaff-Herder L., Eichler E.E., Gerkes E.H., De Geus C.M., et al. The phenotypic spectrum and genotype-phenotype correlations in 106 patients with variants in major autism gene CHD8. Transl. Psychiatry. 2022;12:421. doi: 10.1038/s41398-022-02189-1.
- 19. Lowther C., Costain G., Bassett A.S. Reproductive genetic testing and human genetic variation in the era of genomic medicine. Am. J. Bioeth. 2015;15:25–26. doi: 10.1080/15265161.2015.1028661.
- 20. Kammenga J.E. The background puzzle: how identical mutations in the same gene lead to different disease symptoms. FEBS J. 2017;284:3362–3373. doi: 10.1111/febs.14080.
- 21. Wright C.F., West B., Tuke M., Jones S.E., Patel K., Laver T.W., Beaumont R.N., Tyrrell J., Wood A.R., Frayling T.M., et al. Assessing the Pathogenicity, Penetrance, and Expressivity of Putative Disease-Causing Variants in a Population Setting. Am. J. Hum. Genet. 2019;104:275–286. doi: 10.1016/j.ajhg.2018.12.015.
- 22. Chen R., Shi L., Hakenberg J., Naughton B., Sklar P., Zhang J., Zhou H., Tian L., Prakash O., Lemire M., et al. Analysis of 589,306 genomes identifies individuals resilient to severe Mendelian childhood diseases. Nat. Biotechnol. 2016;34:531–538. doi: 10.1038/nbt.3514.
- 23. Sawyer S.L., Hartley T., Dyment D.A., Beaulieu C.L., Schwartzentruber J., Smith A., Bedford H.M., Bernard G., Bernier F.P., Brais B., et al. Utility of whole-exome sequencing for those near the end of the diagnostic odyssey: time to address gaps in care. Clin. Genet. 2016;89:275–284. doi: 10.1111/cge.12654.
- 24. Werling D.M., Brand H., An J.Y., Stone M.R., Zhu L., Glessner J.T., Collins R.L., Dong S., Layer R.M., Markenscoff-Papadimitriou E., et al. An analytical framework for whole-genome sequence association studies and its implications for autism spectrum disorder. Nat. Genet. 2018;50:727–736. doi: 10.1038/s41588-018-0107-y.
- 25. Chaisson M.J.P., Sanders A.D., Zhao X., Malhotra A., Porubsky D., Rausch T., Gardner E.J., Rodriguez O.L., Guo L., Collins R.L., et al. Multi-platform discovery of haplotype-resolved structural variation in human genomes. Nat. Commun. 2019;10:1784. doi: 10.1038/s41467-018-08148-z.
- 26. Zhao X., Collins R.L., Lee W.-P., Weber A.M., Jun Y., Zhu Q., Weisburd B., Huang Y., Audano P.A., Wang H., et al. Expectations and blind spots for structural variation detection from long-read assemblies and short-read genome sequencing technologies. Am. J. Hum. Genet. 2021;108:919–928. doi: 10.1016/j.ajhg.2021.03.014.
- 27. Cao Y., Chau M.H.K., Zheng Y., Zhao Y., Kwan A.H.W., Hui S.Y.A., Lam Y.H., Tan T.Y.T., Tse W.T., Wong L., et al. Exploring the diagnostic utility of genome sequencing for fetal congenital heart defects. Prenat. Diagn. 2022;42:862–872. doi: 10.1002/pd.6151.
- 28. So P.L., Hui A.S.Y., Ma T.W.L., Shu W., Hui A.P.W., Kong C.W., Lo T.K., Kan A.N.C., Kan E.Y.L., Chong S.C., et al. Implementation of Public Funded Genome Sequencing in Evaluation of Fetal Structural Anomalies. Genes. 2022;13:2088. doi: 10.3390/genes13112088.
- 29. Westenius E., Sahlin E., Conner P., Lindstrand A., Iwarsson E. Diagnostic yield using whole-genome sequencing and in-silico panel of 281 genes associated with non-immune hydrops fetalis in clinical setting. Ultrasound Obstet. Gynecol. 2022;60:487–493. doi: 10.1002/uog.24911.
- 30. Liao Y., Yang Y., Wen H., Wang B., Zhang T., Li S. Abnormal Sylvian fissure at 20-30 weeks as an indicator of malformations of cortical development: role for prenatal whole-genome sequencing. Ultrasound Obstet. Gynecol. 2022;59:552–555. doi: 10.1002/uog.24771.
- 31. Wang Y., Greenfeld E., Watkins N., Belesiotis P., Zaidi S.H., Marshall C., Thiruvahindrapuram B., Shannon P., Roifman M., Chong K., et al. Diagnostic yield of genome sequencing for prenatal diagnosis of fetal structural anomalies. Prenat. Diagn. 2022;42:822–830. doi: 10.1002/pd.6108.
- 32. Choy K.W., Wang H., Shi M., Chen J., Yang Z., Zhang R., Yan H., Wang Y., Chen S., Chau M.H.K., et al. Prenatal Diagnosis of Fetuses With Increased Nuchal Translucency by Genome Sequencing Analysis. Front. Genet. 2019;10:761. doi: 10.3389/fgene.2019.00761.
- 33. Zhou J., Yang Z., Sun J., Liu L., Zhou X., Liu F., Xing Y., Cui S., Xiong S., Liu X., et al. Whole Genome Sequencing in the Evaluation of Fetal Structural Anomalies: A Parallel Test with Chromosomal Microarray Plus Whole Exome Sequencing. Genes. 2021;12:376. doi: 10.3390/genes12030376.
- 34. Yang Y., Zhao S., Sun G., Chen F., Zhang T., Song J., Yang W., Wang L., Zhan N., Yang X., et al. Genomic architecture of fetal central nervous system anomalies using whole-genome sequencing. NPJ Genom. Med. 2022;7:31. doi: 10.1038/s41525-022-00301-4.
- 35. van der Sanden B.P.G.H., Schobers G., Corominas Galbany J., Koolen D.A., Sinnema M., van Reeuwijk J., Stumpel C.T.R.M., Kleefstra T., de Vries B.B.A., Ruiterkamp-Versteeg M., et al. The performance of genome sequencing as a first-tier test for neurodevelopmental disorders. Eur. J. Hum. Genet. 2022;31:81–88. doi: 10.1038/s41431-022-01185-9.
- 36. Soden S.E., Saunders C.J., Willig L.K., Farrow E.G., Smith L.D., Petrikin J.E., LePichon J.-B., Miller N.A., Thiffault I., Dinwiddie D.L., et al. Effectiveness of exome and genome sequencing guided by acuity of illness for diagnosis of neurodevelopmental disorders. Sci. Transl. Med. 2014;6:265ra168. doi: 10.1126/scitranslmed.3010076.
- 37. Jiang Y.H., Yuen R.K.C., Jin X., Wang M., Chen N., Wu X., Ju J., Mei J., Shi Y., He M., et al. Detection of clinically relevant genetic variants in autism spectrum disorder by whole-genome sequencing. Am. J. Hum. Genet. 2013;93:249–263. doi: 10.1016/j.ajhg.2013.06.012.
- 38. Marshall C.R., Bick D., Belmont J.W., Taylor S.L., Ashley E., Dimmock D., Jobanputra V., Kearney H.M., Kulkarni S., Rehm H., Medical Genome Initiative. The Medical Genome Initiative: moving whole-genome sequencing for rare disease diagnosis to the clinic. Genome Med. 2020;12:48. doi: 10.1186/s13073-020-00748-z.
- 39. Austin-Tse C.A., Jobanputra V., Perry D.L., Bick D., Taft R.J., Venner E., Gibbs R.A., Young T., Barnett S., Belmont J.W., et al. Best practices for the interpretation and reporting of clinical whole genome sequencing. NPJ Genom. Med. 2022;7:1–13. doi: 10.1038/s41525-022-00295-z.
- 40. Fischbach G.D., Lord C. The Simons Simplex Collection: a resource for identification of autism genetic risk factors. Neuron. 2010;68:192–195. doi: 10.1016/j.neuron.2010.10.006.
- 41. Costain G., Jobling R., Walker S., Reuter M.S., Snell M., Bowdin S., Cohn R.D., Dupuis L., Hewson S., Mercimek-Andrews S., et al. Periodic reanalysis of whole-genome sequencing data enhances the diagnostic advantage over standard clinical genetic testing. Eur. J. Hum. Genet. 2018;26:740–744. doi: 10.1038/s41431-018-0114-6.
- 42. Vora N.L., Gilmore K., Brandt A., Gustafson C., Strande N., Ramkissoon L., Hardisty E., Foreman A.K.M., Wilhelmsen K., Owen P., et al. An approach to integrating exome sequencing for fetal structural anomalies into clinical practice. Genet. Med. 2020;22:954–961. doi: 10.1038/s41436-020-0750-4.
- 43. Vora N.L., Powell B., Brandt A., Strande N., Hardisty E., Gilmore K., Foreman A.K.M., Wilhelmsen K., Bizon C., Reilly J., et al. Prenatal exome sequencing in anomalous fetuses: new opportunities and challenges. Genet. Med. 2017;19:1207–1216. doi: 10.1038/gim.2017.33.
- 44. Slavotinek A., Rego S., Sahin-Hodoglugil N., Kvale M., Lianoglou B., Yip T., Hoban H., Outram S., Anguiano B., Chen F., et al. Diagnostic yield of pediatric and prenatal exome sequencing in a diverse population. NPJ Genom. Med. 2023;8:10. doi: 10.1038/s41525-023-00353-0.
- 45. Purcell S., Neale B., Todd-Brown K., Thomas L., Ferreira M.A.R., Bender D., Maller J., Sklar P., de Bakker P.I.W., Daly M.J., Sham P.C. PLINK: a tool set for whole-genome association and population-based linkage analyses. Am. J. Hum. Genet. 2007;81:559–575. doi: 10.1086/519795.
- 46. Riggs E.R., Andersen E.F., Cherry A.M., Kantarci S., Kearney H., Patel A., Raca G., Ritter D.I., South S.T., Thorland E.C., et al. Technical standards for the interpretation and reporting of constitutional copy-number variants: a joint consensus recommendation of the American College of Medical Genetics and Genomics (ACMG) and the Clinical Genome Resource (ClinGen). Genet. Med. 2020;22:245–257. doi: 10.1038/s41436-019-0686-8.
- 47. Richards S., Aziz N., Bale S., Bick D., Das S., Gastier-Foster J., Grody W.W., Hegde M., Lyon E., Spector E., et al. Standards and guidelines for the interpretation of sequence variants: a joint consensus recommendation of the American College of Medical Genetics and Genomics and the Association for Molecular Pathology. Genet. Med. 2015;17:405–424. doi: 10.1038/gim.2015.30.
- 48. Biesecker L.G., Harrison S.M., ClinGen Sequence Variant Interpretation Working Group. The ACMG/AMP reputable source criteria for the interpretation of sequence variants. Genet. Med. 2018;20:1687–1688. doi: 10.1038/gim.2018.42.
- 49. Ghosh R., Harrison S.M., Rehm H.L., Plon S.E., Biesecker L.G., ClinGen Sequence Variant Interpretation Working Group. Updated recommendation for the benign stand-alone ACMG/AMP criterion. Hum. Mutat. 2018;39:1525–1530. doi: 10.1002/humu.23642.
- 50. Abou Tayoun A.N., Pesaran T., DiStefano M.T., Oza A., Rehm H.L., Biesecker L.G., Harrison S.M., ClinGen Sequence Variant Interpretation Working Group. ClinGen SVI Recommendations for interpreting the loss of function PVS1 ACMG/AMP variant criterion. Hum. Mutat. 2018;39:1517–1524. doi: 10.1002/humu.23626.
- 51. Brnich S.E., Abou Tayoun A.N., Couch F.J., Cutting G.R., Greenblatt M.S., Heinen C.D., Kanavy D.M., Luo X., McNulty S.M., Starita L.M., et al. Recommendations for application of the functional evidence PS3/BS3 criterion using the ACMG/AMP sequence variant interpretation framework. Genome Med. 2019;12:3. doi: 10.1186/s13073-019-0690-2.
- 52. Pejaver V., Byrne A.B., Feng B.-J., Pagel K.A., Mooney S.D., Karchin R., O’Donnell-Luria A., Harrison S.M., Tavtigian S.V., Greenblatt M.S., et al. Calibration of computational tools for missense variant pathogenicity classification and ClinGen recommendations for PP3/BP4 criteria. Am. J. Hum. Genet. 2022;109:2163–2177. doi: 10.1016/j.ajhg.2022.10.013.
- 53. Walker L.C., Hoya M.d.l., Wiggins G.A.R., Lindy A., Vincent L.M., Parsons M.T., Canson D.M., Bis-Brewer D., Cass A., Tchourbanov A., et al. Using the ACMG/AMP framework to capture evidence related to predicted and observed impact on splicing: Recommendations from the ClinGen SVI Splicing Subgroup. Am. J. Hum. Genet. 2023;110:1046–1067. doi: 10.1016/j.ajhg.2023.06.002.
- 54. Collins R.L., Brand H., Karczewski K.J., Zhao X., Alföldi J., Francioli L.C., Khera A.V., Lowther C., Gauthier L.D., Wang H., et al. A structural variation reference for medical and population genetics. Nature. 2020;581:444–451. doi: 10.1038/s41586-020-2287-8.
- 55. Chen X., Schulz-Trieglaff O., Shaw R., Barnes B., Schlesinger F., Källberg M., Cox A.J., Kruglyak S., Saunders C.T. Manta: rapid detection of structural variants and indels for germline and cancer sequencing applications. Bioinformatics. 2016;32:1220–1222. doi: 10.1093/bioinformatics/btv710.
- 56. Layer R.M., Chiang C., Quinlan A.R., Hall I.M. LUMPY: a probabilistic framework for structural variant discovery. Genome Biol. 2014;15:R84. doi: 10.1186/gb-2014-15-6-r84.
- 57.Gardner E.J., Lam V.K., Harris D.N., Chuang N.T., Scott E.C., Pittard W.S., Mills R.E., 1000 Genomes Project Consortium. Devine S.E. The Mobile Element Locator Tool (MELT): population-scale mobile element discovery and biology. Genome Res. 2017;27:1916–1929. doi: 10.1101/gr.218032.116. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 58.Abyzov A., Urban A.E., Snyder M., Gerstein M. CNVnator: an approach to discover, genotype, and characterize typical and atypical CNVs from family and population genome sequencing. Genome Res. 2011;21:974–984. doi: 10.1101/gr.114876.110. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 59.Kronenberg Z.N., Osborne E.J., Cone K.R., Kennedy B.J., Domyan E.T., Shapiro M.D., Elde N.C., Yandell M. Wham: Identifying Structural Variants of Biological Consequence. PLoS Comput. Biol. 2015;11 doi: 10.1371/journal.pcbi.1004572. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 60.Klambauer G., Schwarzbauer K., Mayr A., Clevert D.A., Mitterecker A., Bodenhofer U., Hochreiter S. cn.MOPS: mixture of Poissons for discovering copy number variations in next-generation sequencing data with a low false discovery rate. Nucleic Acids Res. 2012;40:e69. doi: 10.1093/nar/gks003. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 61.Dolzhenko E., Deshpande V., Schlesinger F., Krusche P., Petrovski R., Chen S., Emig-Agius D., Gross A., Narzisi G., Bowman B., et al. ExpansionHunter: a sequence-graph-based tool to analyze variation in short tandem repeat regions. Bioinformatics. 2019;35:4754–4756. doi: 10.1093/bioinformatics/btz431. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 62.Poplin R., Ruano-Rubio V., DePristo M.A., Fennell T.J., Carneiro M.O., Van der Auwera G.A., Kling D.E., Gauthier L.D., Levy-Moonshine A., Roazen D., et al. Scaling accurate genetic variant discovery to tens of thousands of samples. bioRxiv. 2018 doi: 10.1101/201178. Preprint at. [DOI] [Google Scholar]
- 63.van der Auwera G., O’Connor B.D. O’Reilly Media, Incorporated; 2020. Genomics in the Cloud: Using Docker, GATK, and WDL in Terra. [Google Scholar]
- 64.Frankish A., Diekhans M., Ferreira A.M., Johnson R., Jungreis I., Loveland J., Mudge J.M., Sisu C., Wright J., Armstrong J., et al. GENCODE reference annotation for the human and mouse genomes. Nucleic Acids Res. 2019;47:D766–D773. doi: 10.1093/nar/gky955. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 65.Wang K., Li M., Hakonarson H. ANNOVAR: functional annotation of genetic variants from high-throughput sequencing data. Nucleic Acids Res. 2010;38:e164. doi: 10.1093/nar/gkq603. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 66.Karczewski K.J., Francioli L.C., Tiao G., Cummings B.B., Alföldi J., Wang Q., Collins R.L., Laricchia K.M., Ganna A., Birnbaum D.P., et al. The mutational constraint spectrum quantified from variation in 141,456 humans. Nature. 2020;581:434–443. doi: 10.1038/s41586-020-2308-7. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 67.Lek M., Karczewski K.J., Minikel E.V., Samocha K.E., Banks E., Fennell T., O’Donnell-Luria A.H., Ware J.S., Hill A.J., Cummings B.B., et al. Analysis of protein-coding genetic variation in 60,706 humans. Nature. 2016;536:285–291. doi: 10.1038/nature19057. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 68.Sudmant P.H., Rausch T., Gardner E.J., Handsaker R.E., Abyzov A., Huddleston J., Zhang Y., Ye K., Jun G., Fritz M.H.-Y., et al. An integrated map of structural variation in 2,504 human genomes. Nature. 2015;526:75–81. doi: 10.1038/nature15394. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 69.Weisburd, B., VanNoy, G., and Watts, N. The addition of short tandem repeat calls to gnomAD.
- 70.Rentzsch P., Witten D., Cooper G.M., Shendure J., Kircher M. CADD: predicting the deleteriousness of variants throughout the human genome. Nucleic Acids Res. 2019;47:D886–D894. doi: 10.1093/nar/gky1016. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 71.Landrum M.J., Lee J.M., Benson M., Brown G.R., Chao C., Chitipiralla S., Gu B., Hart J., Hoffman D., Jang W., et al. ClinVar: improving access to variant interpretations and supporting evidence. Nucleic Acids Res. 2018;46:D1062–D1067. doi: 10.1093/nar/gkx1153. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 72.Lazo de la Vega L., Yu W., Machini K., Austin-Tse C.A., Hao L., Blout Zawatsky C.L., Mason-Suares H., Green R.C., Rehm H.L., Lebo M.S. A framework for automated gene selection in genomic applications. Genet. Med. 2021;23:1993–1997. doi: 10.1038/s41436-021-01213-x. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 73.Wright C.F., Fitzgerald T.W., Jones W.D., Clayton S., McRae J.F., van Kogelenberg M., King D.A., Ambridge K., Barrett D.M., Bayzetinova T., et al. Genetic diagnosis of developmental disorders in the DDD study: a scalable analysis of genome-wide research data. Lancet. 2015;385:1305–1314. doi: 10.1016/S0140-6736(14)61705-0. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 74.Doan R.N., Lim E.T., De Rubeis S., Betancur C., Cutler D.J., Chiocchetti A.G., Overman L.M., Soucy A., Goetze S., et al. Autism Sequencing Consortium Recessive gene disruptions in autism spectrum disorder. Nat. Genet. 2019;51:1092–1098. doi: 10.1038/s41588-019-0433-8. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 75.Robinson J.T., Thorvaldsdóttir H., Winckler W., Guttman M., Lander E.S., Getz G., Mesirov J.P. Integrative genomics viewer. Nat. Biotechnol. 2011;29:24–26. doi: 10.1038/nbt.1754. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 76.Collins R.L., Stone M.R., Brand H., Glessner J.T., Talkowski M.E. 2016. CNView: a visualization and annotation tool for copy number variation from whole-genome sequencing. [Google Scholar]
- 77.Dolzhenko E., Weisburd B., Ibañez K., Rajan-Babu I.-S., Anyansi C., Bennett M.F., Billingsley K., Carroll A., Clamons S., Danzi M.C., et al. REViewer: haplotype-resolved visualization of read alignments in and around tandem repeats. Genome Med. 2022;14:84. doi: 10.1186/s13073-022-01085-z. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 78.Strande N.T., Riggs E.R., Buchanan A.H., Ceyhan-Birsoy O., DiStefano M., Dwight S.S., Goldstein J., Ghosh R., Seifert B.A., Sneddon T.P., et al. Evaluating the Clinical Validity of Gene-Disease Associations: An Evidence-Based Framework Developed by the Clinical Genome Resource. Am. J. Hum. Genet. 2017;100:895–906. doi: 10.1016/j.ajhg.2017.04.015. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 79.Amendola L.M., Muenzen K., Biesecker L.G., Bowling K.M., Cooper G.M., Dorschner M.O., Driscoll C., Foreman A.K.M., Golden-Grant K., Greally J.M., et al. Variant Classification Concordance using the ACMG-AMP Variant Interpretation Guidelines across Nine Genomic Implementation Research Studies. Am. J. Hum. Genet. 2020;107:932–941. doi: 10.1016/j.ajhg.2020.09.011. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 80.Harrison S.M., Rehm H.L. Is “likely pathogenic” really 90% likely? Reclassification data in ClinVar. Genome Med. 2019;11:72. doi: 10.1186/s13073-019-0688-9. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 81.Wang K., Li M., Hadley D., Liu R., Glessner J., Grant S.F.A., Hakonarson H., Bucan M. PennCNV: an integrated hidden Markov model designed for high-resolution copy number variation detection in whole-genome SNP genotyping data. Genome Res. 2007;17:1665–1674. doi: 10.1101/gr.6861907. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 82.Colella S., Yau C., Taylor J.M., Mirza G., Butler H., Clouston P., Bassett A.S., Seller A., Holmes C.C., Ragoussis J. QuantiSNP: an Objective Bayes Hidden-Markov Model to detect and accurately map copy number variation using SNP genotyping data. Nucleic Acids Res. 2007;35:2013–2025. doi: 10.1093/nar/gkm076. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 83.Sanders S.J., Ercan-Sencicek A.G., Hus V., Luo R., Murtha M.T., Moreno-De-Luca D., Chu S.H., Moreau M.P., Gupta A.R., Thomson S.A., et al. Multiple recurrent de novo CNVs, including duplications of the 7q11.23 Williams syndrome region, are strongly associated with autism. Neuron. 2011;70:863–885. doi: 10.1016/j.neuron.2011.05.002. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 84.Babadi M., Fu J.M., Lee S.K., Smirnov A.N., Gauthier L.D., Walker M., Benjamin D.I., Karczewski K.J., Wong I., Collins R.L., et al. GATK-gCNV: A Rare Copy Number Variant Discovery Algorithm and Its Application to Exome Sequencing in the UK Biobank. bioRxiv. 2022 doi: 10.1101/2022.08.25.504851. Preprint at. [DOI] [Google Scholar]
- 85.Ibañez K., Polke J., Hagelstrom R.T., Dolzhenko E., Pasko D., Thomas E.R.A., Daugherty L.C., Kasperaviciute D., Smith K.R., et al. WGS for Neurological Diseases Group Whole genome sequencing for the diagnosis of neurological repeat expansion disorders in the UK: a retrospective diagnostic accuracy and prospective clinical validation study. Lancet Neurol. 2022;21:234–245. doi: 10.1016/S1474-4422(21)00462-2. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 86.Lowther C., Speevak M., Armour C.M., Goh E.S., Graham G.E., Li C., Zeesman S., Nowaczyk M.J.M., Schultz L.A., Morra A., et al. Molecular characterization of NRXN1 deletions from 19,263 clinical microarray cases identifies exons important for neurodevelopmental disease expression. Genet. Med. 2017;19:53–61. doi: 10.1038/gim.2016.54. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 87.Fregeau B., Kim B.J., Hernández-García A., Jordan V.K., Cho M.T., Schnur R.E., Monaghan K.G., Juusola J., Rosenfeld J.A., Bhoj E., et al. De Novo Mutations of RERE Cause a Genetic Syndrome with Features that Overlap Those Associated with Proximal 1p36 Deletions. Am. J. Hum. Genet. 2016;98:963–970. doi: 10.1016/j.ajhg.2016.03.002. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 88.Guissart C., Latypova X., Rollier P., Khan T.N., Stamberger H., McWalter K., Cho M.T., Kjaergaard S., Weckhuysen S., Lesca G., et al. Dual Molecular Effects of Dominant RORA Mutations Cause Two Variants of Syndromic Intellectual Disability with Either Autism or Cerebellar Ataxia. Am. J. Hum. Genet. 2018;102:744–759. doi: 10.1016/j.ajhg.2018.02.021. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 89.Gardner E.J., Prigmore E., Gallone G., Danecek P., Samocha K.E., Handsaker J., Gerety S.S., Ironfield H., Short P.J., Sifrim A., et al. Contribution of retrotransposition to developmental disorders. Nat. Commun. 2019;10:4630. doi: 10.1038/s41467-019-12520-y. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 90.Dagoneau N., Goulet M., Geneviève D., Sznajer Y., Martinovic J., Smithson S., Huber C., Baujat G., Flori E., Tecco L., et al. DYNC2H1 mutations cause asphyxiating thoracic dystrophy and short rib-polydactyly syndrome, type III. Am. J. Hum. Genet. 2009;84:706–711. doi: 10.1016/j.ajhg.2009.04.016. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 91.Trost B., Engchuan W., Nguyen C.M., Thiruvahindrapuram B., Dolzhenko E., Backstrom I., Mirceta M., Mojarad B.A., Yin Y., Dov A., et al. Genome-wide detection of tandem DNA repeats that are expanded in autism. Nature. 2020;586:80–86. doi: 10.1038/s41586-020-2579-z. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 92.Mitra I., Huang B., Mousavi N., Ma N., Lamkin M., Yanicky R., Shleizer-Burko S., Lohmueller K.E., Gymrek M. Patterns of de novo tandem repeat mutations and their role in autism. Nature. 2021;589:246–250. doi: 10.1038/s41586-020-03078-7. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 93.Redin C., Brand H., Collins R.L., Kammin T., Mitchell E., Hodge J.C., Hanscom C., Pillalamarri V., Seabra C.M., Abbott M.A., et al. The genomic landscape of balanced cytogenetic abnormalities associated with human congenital anomalies. Nat. Genet. 2017;49:36–45. doi: 10.1038/ng.3720. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 94.Halgren C., Nielsen N.M., Nazaryan-Petersen L., Silahtaroglu A., Collins R.L., Lowther C., Kjaergaard S., Frisch M., Kirchhoff M., Brøndum-Nielsen K., et al. Risks and Recommendations in Prenatally Detected De Novo Balanced Chromosomal Rearrangements from Assessment of Long-Term Outcomes. Am. J. Hum. Genet. 2018;102:1090–1103. doi: 10.1016/j.ajhg.2018.04.005. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 95.Lowther C., Mehrjouy M.M., Collins R.L., Bak M.C., Dudchenko O., Brand H., Dong Z., Rasmussen M.B., Gu H., Weisz D., et al. Balanced chromosomal rearrangements offer insights into coding and noncoding genomic features associated with developmental disorders. medRxiv. 2022 doi: 10.1101/2022.02.15.22270795. Preprint at. [DOI] [Google Scholar]
- 96.Raca G., Astbury C., Behlmann A., De Castro J.M., Hickey S.E., Karaca E., Lowther C., Riggs E.R., Seifert B.A., Thorland E., Deignan J.L. Points to consider in the detection of germline structural variants using next-generation sequencing: A statement of the American College of Medical Genetics and Genomics (ACMG). Genet Med. 2023 Feb;25(2):100316. doi: 10.1016/j.gim.2022.09.017. Epub 2022 Dec 12. [DOI] [PubMed]
- 97.Turro E., Astle W.J., Megy K., Gräf S., Greene D., Shamardina O., Allen H.L., Sanchis-Juan A., Frontini M., Thys C., et al. Whole-genome sequencing of patients with rare diseases in a national health system. Nature. 2020;583:96–102. doi: 10.1038/s41586-020-2434-2. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 98.All of Us Research Program Investigators. Denny J.C., Rutter J.L., Goldstein D.B., Philippakis A., Smoller J.W., Jenkins G., Dishman E. The “All of Us” Research Program. N. Engl. J. Med. 2019;381:668–676. doi: 10.1056/NEJMsr1809937. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 99.Clark M.M., Stark Z., Farnaes L., Tan T.Y., White S.M., Dimmock D., Kingsmore S.F. Meta-analysis of the diagnostic and clinical utility of genome and exome sequencing and chromosomal microarray in children with suspected genetic diseases. NPJ Genom. Med. 2018;3:16. doi: 10.1038/s41525-018-0053-8. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 100.Willig L.K., Petrikin J.E., Smith L.D., Saunders C.J., Thiffault I., Miller N.A., Soden S.E., Cakici J.A., Herd S.M., Twist G., et al. Whole-genome sequencing for identification of Mendelian disorders in critically ill infants: a retrospective analysis of diagnostic and clinical findings. Lancet Respir. Med. 2015;3:377–387. doi: 10.1016/S2213-2600(15)00139-3. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 101.Gilissen C., Hehir-Kwa J.Y., Thung D.T., van de Vorst M., van Bon B.W.M., Willemsen M.H., Kwint M., Janssen I.M., Hoischen A., Schenck A., et al. Genome sequencing identifies major causes of severe intellectual disability. Nature. 2014;511:344–347. doi: 10.1038/nature13394. [DOI] [PubMed] [Google Scholar]
- 102.Lionel A.C., Costain G., Monfared N., Walker S., Reuter M.S., Hosseini S.M., Thiruvahindrapuram B., Merico D., Jobling R., Nalpathamkalam T., et al. Improved diagnostic yield compared with targeted gene sequencing panels suggests a role for whole-genome sequencing as a first-tier genetic test. Genet. Med. 2018;20:435–443. doi: 10.1038/gim.2017.119. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 103.Kingsmore S.F., Cakici J.A., Clark M.M., Gaughran M., Feddock M., Batalov S., Bainbridge M.N., Carroll J., Caylor S.A., Clarke C., et al. A Randomized, Controlled Trial of the Analytic and Diagnostic Performance of Singleton and Trio, Rapid Genome and Exome Sequencing in Ill Infants. Am. J. Hum. Genet. 2019;105:719–733. doi: 10.1016/j.ajhg.2019.08.009. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 104.Stavropoulos D.J., Merico D., Jobling R., Bowdin S., Monfared N., Thiruvahindrapuram B., Nalpathamkalam T., Pellecchia G., Yuen R.K.C., Szego M.J., et al. Whole Genome Sequencing Expands Diagnostic Utility and Improves Clinical Management in Pediatric Medicine. NPJ Genom. Med. 2016;1:15012. doi: 10.1038/npjgenmed.2015.12. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 105.Taylor J.C., Martin H.C., Lise S., Broxholme J., Cazier J.-B., Rimmer A., Kanapin A., Lunter G., Fiddy S., Allan C., et al. Factors influencing success of clinical genome sequencing across a broad spectrum of disorders. Nat. Genet. 2015;47:717–726. doi: 10.1038/ng.3304. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 106.Retterer K., Juusola J., Cho M.T., Vitazka P., Millan F., Gibellini F., Vertino-Bell A., Smaoui N., Neidich J., Monaghan K.G., et al. Clinical application of whole-exome sequencing across clinical indications. Genet. Med. 2016;18:696–704. doi: 10.1038/gim.2015.148. [DOI] [PubMed] [Google Scholar]
- 107.Retterer K., Scuffins J., Schmidt D., Lewis R., Pineda-Alvarez D., Stafford A., Schmidt L., Warren S., Gibellini F., Kondakova A., et al. Assessing copy number from exome sequencing and exome array CGH based on CNV spectrum in a large clinical cohort. Genet. Med. 2015;17:623–629. doi: 10.1038/gim.2014.160. [DOI] [PubMed] [Google Scholar]
- 108.Pfundt R., Del Rosario M., Vissers L.E.L.M., Kwint M.P., Janssen I.M., de Leeuw N., Yntema H.G., Nelen M.R., Lugtenberg D., Kamsteeg E.-J., et al. Detection of clinically relevant copy-number variants by exome sequencing in a large cohort of genetic disorders. Genet. Med. 2017;19:667–675. doi: 10.1038/gim.2016.163. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 109.Aganezov S., Yan S.M., Soto D.C., Kirsche M., Zarate S., Avdeyev P., Taylor D.J., Shafin K., Shumate A., Xiao C., et al. A complete reference genome improves analysis of human genetic variation. Science. 2022;376 doi: 10.1126/science.abl3533. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 110.Cakici J.A., Dimmock D.P., Caylor S.A., Gaughran M., Clarke C., Triplett C., Clark M.M., Kingsmore S.F., Bloss C.S. A Prospective Study of Parental Perceptions of Rapid Whole-Genome and -Exome Sequencing among Seriously Ill Infants. Am. J. Hum. Genet. 2020;107:953–962. doi: 10.1016/j.ajhg.2020.10.004. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 111.Li C., Vandersluis S., Holubowich C., Ungar W.J., Goh E.S., Boycott K.M., Sikich N., Dhalla I., Ng V. Cost-effectiveness of genome-wide sequencing for unexplained developmental disabilities and multiple congenital anomalies. Genet. Med. 2021;23:451–460. doi: 10.1038/s41436-020-01012-w. [DOI] [PubMed] [Google Scholar]
- 112.Farnaes L., Hildreth A., Sweeney N.M., Clark M.M., Chowdhury S., Nahas S., Cakici J.A., Benson W., Kaplan R.H., Kronick R., et al. Rapid whole-genome sequencing decreases infant morbidity and cost of hospitalization. NPJ Genom. Med. 2018;3:10. doi: 10.1038/s41525-018-0049-4. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 113.Incerti D., Xu X.-M., Chou J.W., Gonzaludo N., Belmont J.W., Schroeder B.E. Cost-effectiveness of genome sequencing for diagnosing patients with undiagnosed rare genetic diseases. Genet. Med. 2021;23:1833–1835. doi: 10.1016/j.gim.2021.08.015. [DOI] [PubMed] [Google Scholar]
Data Availability Statement
The genomic and phenotype data for the ASD families can be accessed through SFARIbase with permission from the Simons Foundation Autism Research Initiative. The raw sequencing data generated from the fetal structural anomaly (FSA) cohort are restricted because of consent limitations. However, all diagnostic variants identified in the ASD and FSA cohorts are provided in Tables S10 and S12.