The Journal of Clinical Endocrinology and Metabolism. 2013 Jun 14;98(8):E1428–E1437. doi: 10.1210/jc.2013-1534

Large-Scale Pooled Next-Generation Sequencing of 1077 Genes to Identify Genetic Causes of Short Stature

Sophie R Wang 1,*, Heather Carmichael 1,*, Shayne F Andrew 1, Timothy C Miller 1, Jennifer E Moon 1, Michael A Derr 1, Vivian Hwa 1, Joel N Hirschhorn 1, Andrew Dauber 1
PMCID: PMC3733853  PMID: 23771920

Abstract

Context:

The majority of patients presenting with short stature do not receive a definitive diagnosis. Advances in genetic sequencing allow for large-scale screening of candidate genes, potentially leading to genetic diagnoses.

Objectives:

The purpose of this study was to discover genetic variants that contribute to short stature in a cohort of children with no known genetic etiology.

Design:

This was a prospective cohort study of subjects with short stature.

Setting:

The setting was the pediatric endocrinology and genetics clinics at an academic center.

Patients:

A total of 192 children with short stature with no defined genetic etiology and 192 individuals of normal stature from the Framingham Heart Study were studied.

Intervention:

Pooled targeted sequencing using next-generation DNA sequencing technology of the exons of 1077 candidate genes was performed.

Main Outcome Measures:

The numbers of rare nonsynonymous genetic variants found in case patients but not in control subjects, known pathogenic variants in case patients, and potentially pathogenic variants in IGF1R were determined.

Results:

We identified 4928 genetic variants in 1077 genes that were present in case patients but not in control subjects. Of those, 1349 variants were novel (898 nonsynonymous). False-positive rates from pooled sequencing were 4% to 5%, and the false-negative rate was 0.1% in regions covered well by sequencing. We identified 3 individuals with known pathogenic variants in PTPN11 causing undiagnosed Noonan syndrome. There were 9 rare potentially nonsynonymous variants in IGF1R, one of which is a novel, probably pathogenic, frameshift mutation. A previously reported pathogenic variant in IGF1R was present in a control subject.

Conclusions:

Large-scale sequencing efforts have the potential to rapidly identify genetic etiologies of short stature, but data interpretation is complex. Noonan syndrome may be an underdiagnosed cause of short stature.


Growth is a fundamental biological process that occurs during childhood. With the exception of diabetes, short stature is one of the most common reasons for referral to a pediatric endocrinologist. In most cases, short stature is familial, consistent with a strong genetic influence on childhood and adult height. In some cases, however, children have severe short stature that is out of proportion to the parental heights or have short stature associated with syndromic features. Molecular defects associated with these rarer cases have, over the last decade, expanded the list of genes and biological pathways known to influence growth. Multiple mutations, for example, have been found in the GH pathway, not only in the GH gene (GH1) itself, but also downstream within the GHR (GH receptor), STAT5B, IGF1, IGFALS, and IGF1R (IGF-I receptor) genes (1). Many genes underlying severe skeletal dysplasias associated with short stature have also been identified (2). Despite these advances, the molecular causality in the vast majority of patients, including those with severe or syndromic short stature, remains unresolved. Consequently, most affected patients continue to be classified as having idiopathic short stature.

Genome-wide association (GWA) studies have enabled the identification of common genetic variants (frequencies >5%) influencing quantitative traits such as height. Indeed, recent GWA studies identified 180 genetic loci with common DNA sequence variants that influence human stature (3). Intriguingly, these common variants are often located in or near genes that underlie syndromes of abnormal skeletal growth. This overlap suggests that rare variants in other genes highlighted by the GWA studies could have significant effects on growth.

To explore the role of rare genetic variants in short stature, we developed and applied large-scale candidate gene sequencing technologies (4) in a cohort of children with short stature of unknown cause. The selected list of 1077 candidates is composed of genes from identified GWA loci, genes known to cause syndromic short stature, and genes known to be involved in growth plate biology or growth plate signaling. Herein, we report our initial screening and assessment of pooled exonic sequencing on DNA samples from 192 children with short stature and 192 control children of normal stature. We identified a large number of nonsynonymous variants present in case patients but not in control subjects. There are a number of possible analytical approaches to explore these data. First, one can search for variants that have previously been reported to be pathogenic. Second, one can search for novel variants within genes known to cause short stature and then perform further familial segregation and functional studies to validate those variants. Third, one can search for multiple likely deleterious variants in novel genes not previously known to cause short stature. In the current article, we discuss the first approach, looking for known pathogenic variants, and provide a more detailed analysis of rare genetic variants identified in IGF1R as an example of the second approach. Haploinsufficiency of IGF1R is known to cause significant short stature, and our data demonstrate the utility of large-scale sequencing and the critical need for careful interpretation of the resulting data. Future work will explore the other analytical approaches.

Materials and Methods

Height candidate genes

In this study, we sequenced the exons of 1077 genes (∼2 Mb total target size). Of these 1077 genes, one-third (n = 356) were known biological candidates, including genes known to underlie syndromic growth disorders or skeletal dysplasias as well as genes involved in growth plate biology or GH signaling. The remaining two-thirds (n = 777) included genes within genomic loci associated with height based on GWA studies; 56 genes belong to both categories (see Supplemental Table 1 published on The Endocrine Society's Journals Online web site at http://jcem.endojournals.org) (3). For the genes within the GWA loci, we set the genomic boundaries at each height-associated locus using linkage disequilibrium cutoffs (HapMap CEU r² > 0.5) for the top single-nucleotide polymorphism (SNP). For loci with ≥2 genes within the genomic boundary, all genes were included. Loci with >10 genes were excluded. For SNPs with <2 genes within the genomic boundary, genes beyond the boundary but within the next recombination hotspots were included.
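The locus rules above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' pipeline: the function name, its arguments, and the reading that widened loci keep any gene already inside the boundary are all assumptions made for the example. The final lines only check the gene-count arithmetic stated in the text.

```python
def genes_for_locus(genes_in_ld_block, genes_to_next_hotspots, max_genes=10):
    """Pick the genes to sequence for one height-associated locus.

    genes_in_ld_block: genes inside the r^2 > 0.5 boundary of the top SNP.
    genes_to_next_hotspots: genes beyond that boundary, out to the next
        recombination hotspots (hypothetical input for this sketch).
    """
    n = len(genes_in_ld_block)
    if n > max_genes:
        return set()                       # gene-dense locus: excluded
    if n >= 2:
        return set(genes_in_ld_block)      # keep every gene in the boundary
    # Fewer than 2 genes: widen the search to the next recombination hotspots.
    # Assumption: a gene already inside the boundary is kept as well.
    return set(genes_in_ld_block) | set(genes_to_next_hotspots)


# Gene-count check from the text: two categories that share 56 genes.
biological_candidates = 356
gwa_locus_genes = 777
in_both = 56
total_genes = biological_candidates + gwa_locus_genes - in_both
```

With these numbers `total_genes` comes to 1077, matching the size of the candidate list.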

Subjects

This study was approved by the institutional review board at Boston Children's Hospital (Boston, Massachusetts). All subjects or their legal guardians provided written informed consent. The 192 patients with short stature (>2 SD below the mean for age and sex) (5), but without defined genetic etiologies, were recruited from the endocrinology and genetics clinics at Boston Children's Hospital. Because we were searching for rare genetic syndromes, subjects were allowed to have additional medical comorbidities, dysmorphic features, or other hormonal deficiencies as long as these alternate medical problems did not provide a clear explanation for the subject's short stature. In addition, 192 control subjects were chosen from the Framingham Heart Study. Control subjects were chosen from the middle of the Framingham Heart Study height distribution (height z scores between −0.7 and +0.7 SDs). z scores were calculated by regressing the height phenotype, stratifying by sex and adjusting for age.
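One way to picture the control selection is the following Python sketch. It assumes a simple linear adjustment for age within each sex and standardizes the residuals; the actual regression model used for the Framingham data is not described here, so the function names and the linear form are illustrative assumptions only.

```python
from statistics import mean, stdev


def height_z_scores(heights, ages, sexes):
    """Sex-stratified, age-adjusted height z scores (illustrative model).

    Within each sex, height is regressed on age by ordinary least squares
    and the residuals are divided by their standard deviation.
    """
    z = [0.0] * len(heights)
    for group in set(sexes):
        idx = [i for i, s in enumerate(sexes) if s == group]
        x = [ages[i] for i in idx]
        y = [heights[i] for i in idx]
        mx, my = mean(x), mean(y)
        sxx = sum((xi - mx) ** 2 for xi in x)
        slope = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sxx
        resid = [yi - (my + slope * (xi - mx)) for xi, yi in zip(x, y)]
        sd = stdev(resid)
        for i, r in zip(idx, resid):
            z[i] = r / sd
    return z


def in_control_range(z, limit=0.7):
    """True when a z score falls in the middle of the height distribution."""
    return -limit <= z <= limit
```

Applying `in_control_range` to each z score would keep only individuals between −0.7 and +0.7 SD, the window used for the control subjects.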

Sequencing protocol

DNA samples from multiple subjects were pooled for DNA sequencing using previously described methods available at the Broad Institute (6). To identify variants present only in a single individual (hereafter referred to as singleton variants), we applied a simple overlapping pooling design (Supplemental Figure 1). The samples from short stature subjects and control samples were each arranged into a 14 × 14 matrix of 28 pools, with 13 to 14 samples in each pool. Four empty “holes” were included in each matrix for assessing the false-positive rate. Each sample was sequenced in 2 pools (1 row pool and 1 column pool). Singleton variants appear only in 1 row pool and 1 column pool. Therefore, the subject whose DNA sample is present at the intersection of these 2 pools must be the individual carrying that singleton variant. The targeted exons of the 1077 candidate genes were enriched using a custom Agilent SureSelect hybrid selection system. Sequencing was performed on the Illumina HiSeq platform. There was an average of 12 961 604 reads per pool, resulting in mean target coverage of 213 reads (15 reads per subject in a pool of 14 subjects or 30 total reads per subject, as each subject is present in 2 pools). Variant calling was performed using Syzygy software (6) and then a new likelihood-based secondary calling strategy that integrated the extra information from our matrix design was applied (Supplemental Methods).
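The row-and-column logic of the matrix design can be illustrated with a short Python sketch. The hole positions, sample labels, and function names below are hypothetical; the published layout is given in Supplemental Figure 1, and the real variant calling was done with Syzygy plus a likelihood-based step, not with this lookup.

```python
from itertools import product

N = 14
# Hypothetical hole positions; each sits in a different row and column,
# so every pool ends up with 13 or 14 samples.
HOLES = {(0, 0), (3, 7), (9, 4), (13, 13)}


def build_matrix(sample_ids, n=N, holes=HOLES):
    """Assign samples to the non-hole cells of an n x n matrix."""
    cells = [rc for rc in product(range(n), repeat=2) if rc not in holes]
    if len(sample_ids) != len(cells):
        raise ValueError("need exactly one sample per non-hole cell")
    return dict(zip(cells, sample_ids))


def locate_singleton(row_pools, col_pools, layout):
    """Map a variant call to a sample.

    row_pools / col_pools: sets of pool indices in which the variant was seen.
    Returns the sample at the intersection, "HOLE" if the intersection is an
    empty cell (a false positive), or None if the variant is not a singleton.
    """
    if len(row_pools) != 1 or len(col_pools) != 1:
        return None
    cell = (next(iter(row_pools)), next(iter(col_pools)))
    return layout.get(cell, "HOLE")
```

A variant seen only in row pool 2 and column pool 5 is assigned to the sample at cell (2, 5). A call that lands on one of the four empty cells cannot belong to any subject, which is what makes the holes usable for estimating the false-positive rate.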

Variants were annotated for functional effect using SnpEff 2.0.5 (http://snpeff.sourceforge.net/). Variant allele frequency data were obtained from 3 publicly available datasets: (1) the integrated variant call set of 1000 Genomes phase 1 samples (7) (February 2012 release); (2) the National Heart, Lung, and Blood Institute Exome Variant Server (8), and (3) ∼12 000 sequenced genomes and exomes assembled for exome genotyping chip design (http://genome.sph.umich.edu/wiki/Exome_Chip_Design). The maximal allele frequencies from all 3 sources were used. Novel variants are those not observed in any of these datasets.
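The frequency annotation rule (take the maximum across the 3 sources, and call a variant novel when no source reports it) can be sketched as follows. The variant key format and the dictionary representation of each dataset are assumptions made for the example, not the format of the actual resources.

```python
def annotate_frequency(variant, datasets):
    """Return the maximal allele frequency across datasets, or flag as novel.

    variant: a hashable key, e.g. (chrom, pos, ref, alt) (hypothetical format).
    datasets: list of dicts mapping variant keys to allele frequencies.
    """
    freqs = [d[variant] for d in datasets if variant in d]
    if not freqs:
        return {"novel": True, "max_af": None}
    return {"novel": False, "max_af": max(freqs)}


# Toy data standing in for the three public resources.
thousand_genomes = {("15", 1000, "G", "A"): 0.002}
exome_variant_server = {("15", 1000, "G", "A"): 0.003}
exome_chip_set = {}

known = annotate_frequency(("15", 1000, "G", "A"),
                           [thousand_genomes, exome_variant_server, exome_chip_set])
unseen = annotate_frequency(("15", 2000, "C", "T"),
                            [thousand_genomes, exome_variant_server, exome_chip_set])
```

Using the maximum rather than the mean is the conservative choice for filtering: a variant is treated as rare only if it is rare in every source that reports it.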

Assessing false-positive and false-negative rates

We estimated the false-negative and false-positive rates by comparing pooling data with data from exome sequencing previously performed in 6 of the short stature subjects. To start, we determined the overlapping targets between pooling and exome capture arrays. Then, limiting to sites with ≥10 reads, we assumed that the exome sequencing data reflected the gold standard because of its much greater depth of coverage. False positives were defined as singletons observed in pooling data but not in exome data, whereas false-negative results were those observed in exome data but not in either of the 2 relevant pools. The false-positive rate was also estimated by looking for singleton variants that mapped to one of the empty holes in the matrix. The singleton variants that mapped to empty holes were false-positives, permitting the number of false-positive variants per individual to be estimated, and from there we could independently estimate the false-positive rate of singleton variants.

IGF1R functional studies

Whole blood samples (BD Vacutainer Cell Preparation Tube with sodium heparin; Becton, Dickinson and Company, Franklin Lakes, New Jersey) were collected from the adopted patient and from her unrelated adoptive mother, who served as the normal control subject. Peripheral blood mononuclear cells (PBMCs) were isolated following the manufacturer's protocol. PBMCs, in freezing medium (RPMI 1640 medium + 40% fetal bovine serum + 10% dimethyl sulfoxide), were stored in liquid nitrogen.

For immunoblot analysis, fresh PBMCs (2 × 10⁶/treatment) were resuspended in serum-free RPMI 1640 medium, with or without recombinant IGF-I (100 ng/mL; GroPep Ltd, Thebarton, South Australia, Australia), for 20 minutes at 37°C in a CO2 incubator, before pelleting, and cells were lysed as described previously for fibroblast cell cultures (9). Western immunoblot analyses were performed as described previously (9).

For flow cytometry analysis by fluorescence-activated cell sorting (FACS) of cell surface IGF1R, PBMCs, warmed to 37°C from liquid nitrogen storage, were washed twice, aliquoted as 1 × 10⁶ cells/sample in RPMI 1640 medium + 10% fetal bovine serum, and incubated overnight at 37°C (5% CO2 incubator). Before IGF-I treatment, cells were washed twice with serum-free RPMI 1640 medium + 0.5% BSA and equilibrated in 0.5 mL of serum-free RPMI 1640 medium for 4 hours. Cells were treated with or without IGF-I (100 ng/mL final concentration) for 1 hour, after which cells were washed twice with cold staining medium (1× PBS-0.5% BSA-0.1% sodium azide) and incubated with phycoerythrin (PE)–conjugated anti-human IGF1R-α (CD221; BD Biosciences, San Jose, California) for 30 minutes at 4°C in the dark. After antibody staining, cells were washed twice with cold staining medium, resuspended in 200 μL of staining medium-0.25% propidium iodide, and incubated on ice for 10 minutes. A total of 100 000 live PBMCs (propidium iodide negative, CD221 positive) per sample were acquired via a FACSCalibur flow cytometer (BD Biosciences), and the fluorescence emitted by IGF1R-PE–labeled PBMCs was analyzed using FCS Express 3 analysis software (De Novo Software, Los Angeles, California).

Results

Description of cohort

Participants in this study included 192 subjects (106 male and 86 female), 75% of whom were white. The height z scores ranged from −2.05 to −7.01 SD (Figure 1). The ages of these subjects ranged from 3 to 22 years with a mean of 10.3 years. Seventy subjects (36.4%) had begun GH therapy for short stature before enrollment in the study. However, GH deficiency was diagnosed in only 31 subjects (16%); of these, 22 had isolated GH deficiency without additional pituitary hormone defects. For those subjects receiving GH therapy, only height z scores before initiation of therapy are shown (Figure 1). An additional 14 subjects were thought to have known genetic syndromes, but clinical diagnostic testing for the suspected syndromes had not identified pathogenic variants. Twenty-nine subjects were reported to have developmental delay.

Figure 1.

Height z score at enrollment or before initiation of GH therapy. Each bar represents individuals with a height z score less than or equal to the number noted below it on the x-axis but greater than the number below the bar to the left. For example, the right-most bar represents individuals with a height z score −2.25 < z ≤ −2.

Validating pooled sequencing results

Using our pooled sequencing design, the false-positive rate of singleton variants estimated by comparison to the exome data was 4.8% (range, 0%–12.5% per individual) (Table 1). In addition, a total of 7 singleton variants mapped to the 8 holes in the 2 matrices compared with 7680 singleton variants that mapped to the 384 subjects (patients and control subjects), resulting in a similar estimation of the false-positive rate of 4.2%. These numbers establish the upper bound for the variant false-positive rate, because singleton variants are more likely to be false positives than variants found in multiple individuals. Of a total of 6618 variants present in the 6 exome samples within our target region, 7 variants were not identified by the pooled sequencing, giving an estimated overall false-negative rate of 0.1%. Similar to the false-positive rate, the false-negative rate for singleton variants is likely to be higher than that of other variants because singleton variants only appear in 2 pools and are more difficult to identify.

Table 1.

False-Positive Rate of Singletons Estimated by Comparing With Exome Sequencing of 6 Samples

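| Sample No. | No. of Singletons | False-Positive Singletons | False-Positive Rate, % |
| --- | --- | --- | --- |
| 1 | 34 | 4 | 11.8 |
| 2 | 58 | 1 | 1.7 |
| 3 | 55 | 3 | 5.5 |
| 4 | 15 | 0 | 0.0 |
| 5 | 30 | 0 | 0.0 |
| 6 | 16 | 2 | 12.5 |
| Summary | 208 | 10 | 4.8 |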
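The rates in Table 1 and the false-negative rate quoted in the text follow directly from the counts. The short Python check below recomputes them; the variable names are ours, and the counts are transcribed from the table and the preceding paragraph.

```python
# (number of singletons, false-positive singletons) per exome-sequenced sample
table1 = {1: (34, 4), 2: (58, 1), 3: (55, 3), 4: (15, 0), 5: (30, 0), 6: (16, 2)}

per_sample_rate = {k: round(100 * fp / n, 1) for k, (n, fp) in table1.items()}

total_singletons = sum(n for n, _ in table1.values())
total_false_pos = sum(fp for _, fp in table1.values())
pooled_fp_rate = 100 * total_false_pos / total_singletons   # summary row

# False negatives: exome variants in the target region missed by pooling.
exome_variants = 6618
missed_by_pooling = 7
fn_rate = 100 * missed_by_pooling / exome_variants
```

Note that the summary row is the pooled rate (10 of 208 singletons), not the average of the six per-sample percentages, so samples with more singletons carry more weight.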

Pathogenicity of rare variants

In the 192 short stature patients, we identified a total of 10 819 variants, of which 4928 were not detected in the control samples. Of these, 1349 were novel (Table 2). To screen for possible causal effects of variants found in our cohort, we compared these variants to those found in the Human Gene Mutation Database (HGMD) (10). The database contains 26 995 SNPs or indels located in the 1077 genes in our study, in which the variant has been reported as being associated with a particular clinical phenotype. We identified 66 such SNPs that matched a variant detected in our case subjects but not in control subjects. Because the HGMD is known to have erroneous entries, we eliminated 7 variants with a minor allele frequency of ≥1% because these are unlikely to be true pathogenic variants. Of the remaining 59 variants, 32 were associated with recessive conditions or predispositions to complex traits, and the clinical pictures of the patients were not consistent with the disease phenotype, suggesting that they are unaffected heterozygous carriers. The final 27 variants previously associated with dominantly inherited diseases are listed in Supplemental Table 2. We reviewed the phenotypes of the 27 case subjects and identified 1 case of autosomal dominant brachyolmia type 3 and 3 cases of Noonan syndrome. The remaining 24 case subjects did not have phenotypes consistent with the reported disease associations.
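The HGMD filtering funnel (66 matches, minus common variants, minus recessive or complex-trait entries) can be written as a small filter. This is a sketch under stated assumptions: the record fields `maf` and `inheritance` are hypothetical, the toy records are placeholders built only to reproduce the reported counts, and the clinical review of each patient's phenotype is not modeled.

```python
def triage_hgmd_matches(matches, maf_cutoff=0.01):
    """Apply the two filters described in the text.

    matches: list of dicts with hypothetical keys 'maf' (float or None)
        and 'inheritance' ('dominant' or 'recessive_or_complex').
    Returns (rare, dominant): variants below the MAF cutoff, and the subset
    of those reported for dominantly inherited disease.
    """
    rare = [m for m in matches if m["maf"] is None or m["maf"] < maf_cutoff]
    dominant = [m for m in rare if m["inheritance"] == "dominant"]
    return rare, dominant


# Placeholder records chosen only to match the reported counts (66 -> 59 -> 27).
matches = (
    [{"maf": 0.02, "inheritance": "dominant"}] * 7
    + [{"maf": 0.001, "inheritance": "recessive_or_complex"}] * 32
    + [{"maf": None, "inheritance": "dominant"}] * 27
)
rare, dominant = triage_hgmd_matches(matches)
```

The 27 variants left at the end are the ones whose carriers were then reviewed individually for a matching phenotype.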

Table 2.

Variants Identified in Short Stature Samples but Not in Control Samples

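| Variant Type | Known Variants, All | Known Variants, MAF ≤ 5% | Known Variants, MAF ≤ 1% | Novel Variants |
| --- | --- | --- | --- | --- |
| Silent | 1632 | 1602 | 1356 | 451 |
| Missense | 1903 | 1888 | 1704 | 829 |
| Splice | 11 | 11 | 10 | 8 |
| Indel | 11 | 11 | 10 | 46 |
| Nonsense | 22 | 22 | 22 | 15 |
| Total | 3579 | 3534 | 3102 | 1349 |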

Abbreviation: MAF, minor allele frequency.
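As a consistency check, the column totals of Table 2 can be recomputed and tied back to the figures quoted in the abstract (4928 case-only variants, 1349 novel, 898 novel nonsynonymous). The snippet below is only a bookkeeping aid; it treats every novel variant that is not silent as nonsynonymous, which is our reading of how the 898 figure was obtained.

```python
# Columns: known (all), known MAF <= 5%, known MAF <= 1%, novel
table2 = {
    "Silent":   (1632, 1602, 1356, 451),
    "Missense": (1903, 1888, 1704, 829),
    "Splice":   (11, 11, 10, 8),
    "Indel":    (11, 11, 10, 46),
    "Nonsense": (22, 22, 22, 15),
}

# Sum each column across variant types.
totals = tuple(sum(row[i] for row in table2.values()) for i in range(4))

case_only_variants = totals[0] + totals[3]            # known + novel
novel_nonsynonymous = totals[3] - table2["Silent"][3]  # novel minus silent
```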

Identification of pathological variants associated with brachyolmia and Noonan syndrome

The patient with brachyolmia has a height of −3.88 SD and platyspondyly of the cervical spine. The mutation in TRPV4 is a missense mutation (c.1858G>A, V620I) that is a known variant causing the disease (11). Before the research results became available but subsequent to enrollment in our study, brachyolmia was clinically diagnosed in the patient, and clinical testing revealed this mutation.

All 3 patients with Noonan syndrome carried variants in PTPN11, the most common causative gene in this syndrome. Noonan syndrome is an autosomal dominant condition with characteristic dysmorphic facial features as well as short stature, webbed neck, and cardiac abnormalities (12). The first subject is an 11-year-old girl with a height z score of −2.7 SD. Isolated GH deficiency was diagnosed at age 7 years, and she had a poor response to GH therapy. Of note, she was born with a transitional atrioventricular canal defect, which was repaired at 4 months of age. She had a triangular face with a mildly low posterior hair line and slightly wide-spaced eyes. She did not have ptosis or downslanting eyes, and her ears were normal. She carries the c.188A>G/p.Y63C variant (13). The second subject is an 8-year-old girl with a height z score of −1.7 SD. She reached a height nadir of −3.0 SD at age 5 before the start of GH therapy for an indication of being small for gestational age to which she had a good response. She was evaluated at age 4 by a geneticist for the possibility of Russell-Silver syndrome, but no formal diagnosis was made. She has ptosis, epicanthal folds, downslanting eyes, and posteriorly rotated ears. She carries the c.925A>G/p.I309V variant (14), which she inherited from her father whose height is 173 cm (−0.5 SD). The third subject is a 16-year-old male adolescent with a height z score of −2.5 SD. He reached a height nadir of −3.2 SD at age 13 before isolated GH deficiency was diagnosed and GH therapy was started. He also started testosterone therapy at age 15 years for delayed puberty. He has mild learning issues, and on examination has a low posterior hair line but no other facial features consistent with Noonan syndrome. He carries the c.853T>C/p.F285L variant (14).

Identification of one pathological IGF1R variant among all IGF1R rare variants identified

To demonstrate the utility of a large-scale sequencing approach and the need for careful interpretation of results, we focused on rare variants in IGF1R, a gene for which haploinsufficiency is known to cause significant short stature (15–17). Our approach was to identify nonsynonymous variants present in case patients only that segregated with the phenotype of short stature within the families. Variants meeting these criteria would be classified as potentially pathogenic variants requiring further functional validation. In total, our targeted sequencing found 25 unique IGF1R variants in both case patients and control subjects. Of these, 16 were synonymous SNPs, most of which were common (minor allele frequency >0.01); these were not evaluated further. The remaining 9 variants included 6 missense, 1 frameshift, and 2 intronic variants. The intronic variants were found within 5 bp of an exon, which suggests a potential involvement in splicing, and thus were included for further analysis. Five of these variants were present in case patients only (Table 3). All 9 variants were validated via traditional Sanger sequencing and confirmed to be heterozygous. To determine the biological significance of these variants, segregation of variants 2 through 6 within families was performed (Figure 2). There was no correlation between the individual family member's heights and the carrier status of the variants, suggesting that these variants are not likely to be major contributors to the patients' short stature, and, therefore, we excluded these variants from further consideration as pathogenic variants. Variant 7 was present in multiple case patients and control subjects and was also not likely to be pathogenic. Of note, 1 of the 2 rare missense variants found only in control subjects in our study (variant 9) was previously reported as pathogenic in the literature (18). This control subject is of normal stature at −0.4 SD.
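The triage just described can be traced step by step in a short Python sketch. The case and control flags are transcribed from Table 3 and the segregation outcomes from the text; how to treat variant 1, whose family could not be tested, is our assumption (it is kept as a candidate pending functional work rather than excluded).

```python
# (variant number, effect, seen in case patients, seen in control subjects)
IGF1R_VARIANTS = [
    (1, "frameshift", True, False),
    (2, "intronic", True, False),
    (3, "intronic", True, False),
    (4, "missense", True, False),
    (5, "missense", True, False),
    (6, "missense", True, True),
    (7, "missense", True, True),
    (8, "missense", False, True),
    (9, "missense", False, True),
]

# Segregation with short stature in the family: variants 2-6 did not track
# with height; variant 1 could not be tested (adopted patient).
SEGREGATION = {1: "untested", 2: "no", 3: "no", 4: "no", 5: "no", 6: "no"}

case_only = [v for v, _, in_case, in_ctrl in IGF1R_VARIANTS
             if in_case and not in_ctrl]
# Assumption: an untested variant stays on the list; a failed test removes it.
candidates = [v for v in case_only if SEGREGATION.get(v) != "no"]
```

Under these rules five variants pass the case-only filter and a single one, variant 1, remains for functional follow-up.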

Table 3.

IGF1R Potentially Nonsynonymous Variants

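| Variant | Exon | cDNA | Protein | MAF | Subject | Sex | Height SD | Birth weight, gᵃ | IGF-1 (Normal Rangeᵇ), ng/mL |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 1 | 2 | c.418dupG | p.A140Gfs*5 | Novel | Case 1 | F | −4.1 | Unknown | 389.8 (244–787) |
| 2 | 5 | c.1247+3A>G | Intron | 0.0007 | Case 2 | F | −3.9 | 2800 | 26.6 (49–342) |
| 3 | 7 | c.1463−5C>A | Intron | 0.012 | Case 3 | M | −2.4 | 3500 | 52.2 (49–342) |
| 4 | 6 | c.1411C>T | p.R471C | Novel | Case 4 | M | −2.3 | 4100 | 34 (63–279) |
| 5 | 7 | c.1502C>T | p.S501L | Novel | Case 5 | M | −3.0 | 3400 | 148 (63–279) |
| 6 | 6 | c.1336A>G | p.M446V | 0.0027 | Case 6 | F | −2.8 | 2600 | 97.2 (49–342) |
|  |  |  |  |  | Control 6 | F | 0.0 |  |  |
| 7 | 6 | c.1310G>A | p.R437H | 0.004 | Case 7a | F | −3.3 | 4200 | 69.9 (49–342) |
|  |  |  |  |  | Case 7b | F | −3.1 | 3800 | 62 (63–279) |
|  |  |  |  |  | Control 7a | F | +0.4 |  |  |
|  |  |  |  |  | Control 7b | M | 0.0 |  |  |
|  |  |  |  |  | Control 7c | M | −0.4 |  |  |
| 8 | 5 | c.1162G>A | p.V388M | 0.003 | Control 8 | M | −0.6 |  |  |
| 9 | 7 | c.1532G>A | p.R511Q | 0.003 | Control 9 | M | −0.4 |  |  |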

Abbreviations: F, female; M, male.

a All case patients were the product of full-term pregnancies (>37 weeks) with the exception of case 7b (36.5 weeks). None of the case patients met the definition for intrauterine growth retardation (weight <2500 g at birth for normal gestation).

b Normal range for sex and Tanner stage. All IGF-I values were obtained during a baseline clinical evaluation and were measured when the patient was not receiving growth hormone therapy.

Figure 2.

Segregation of identified IGF1R nonsynonymous variants in affected families does not correlate with short stature. Numbers below the individuals denote the height z scores. ∧ indicates that the height was estimated by a family member. All other values were measured. Individuals carrying the heterozygous variants are indicated as black half-filled circles (females) or squares (males). The arrow points to the affected proband in each family.

Variant 1 was a novel frameshift mutation (c.418dupG/p.A140Gfs*5) (Supplemental figure 2) found in 1 patient in the heterozygous state. The mutation causes severe truncation of the protein with complete abrogation of the transmembrane and intracellular domains and thus was predicted to lead to haploinsufficiency. This patient was adopted from China at 6 years of age, and therefore a complete history and familial samples could not be obtained. At the age of 15 years, the patient had Tanner stage 4 breast development with height of 136 cm (−4.06 SD), weight of 30.2 kg (−4.87 SD), and a head circumference of 49.3 cm (−4.4 SD). She has a history notable for bilateral cleft lip and palate as well as attention deficit disorder and mild developmental delay. Her IGF-I was normal at 389.8 ng/mL (normal range, 244–787 ng/mL for a Tanner stage 4 female). GH stimulation testing with arginine and glucagon demonstrated a peak GH level of 18.8 ng/mL. She had previously been treated with GH therapy with a possible mild increase in growth velocity, although this occurred concurrently with entering puberty.

Variant 1 was the only variant in IGF1R that met our prespecified criteria for consideration as a potential pathogenic variant. To determine whether variant 1 was causal for the patient's phenotype, we evaluated IGF1R expression and function in primary PBMCs derived from the patient compared with those in control PBMCs (procured from the unrelated adoptive mother). Flow cytometric analysis by FACS of live PBMCs (counts, y-axis; Figure 3) indicated that fluorescence emitted by IGF1R-PE-labeled PBMCs was markedly reduced (fluorescence intensity, x-axis; Figure 3) in patient PBMCs compared with that by the normal control PBMCs (Figure 3A). When the live PBMCs were treated with IGF-I, emitted fluorescence was comparably reduced for both control and patient PBMCs, suggesting normal internalization of IGF1R upon ligand binding (Figure 3B). Immunoblot analysis of cell lysates, furthermore, indicated that total IGF1R expression was reduced in the patient's PBMCs with correlating reductions in IGF-I–induced signaling (Figure 3C). Taken together, the results are consistent with the heterozygous IGF1R c.418dupG variant inducing a state of IGF1R deficiency and being an excellent candidate to cause the subject's short stature.

Figure 3.

IGF1R expression and signaling in primary peripheral PBMCs of patient carrying heterozygous IGF1R c.418dupG. PBMCs were isolated as indicated in Materials and Methods. Flow cytometry analysis by FACS was used to detect IGF1R, labeled by PE-conjugated anti-human IGF1R-α antibody (see Materials and Methods), on the cell surface of live PBMCs. Live PBMCs (counts, y-axis) and fluorescence emitted by the IGF1R-PE–labeled PBMC were collated (log scale fluorescence intensity, x-axis). (A) Patient (red graph) compared with normal (black graph) PBMCs. Background fluorescence emitted by unlabeled and untreated PBMC control is shown by the gray-shaded region. The geometric mean of the fluorescent intensity (FI) detected in normal PBMCs was given an arbitrary unit of 100% (table). (B) Effect of IGF-I treatment (100 ng/mL, 1 hour) on the detection of IGF1R-PE–labeled PBMC from normal (top panel) and patient (bottom panel) PBMCs. For each, the geometric mean of the fluorescent intensity (FI) detected in untreated PBMCs was given an arbitrary unit of 100%. (C) Western immunoblot analysis of total cell lysates from PBMCs treated with IGF-I (100 ng/mL) vs untreated cells. Molecular mass (kilodaltons) is indicated on the left side of the immunoblots. The intracellular proteins detected are indicated by arrows (on right).

Discussion

Short stature is a common problem confronting pediatric endocrinologists. After exclusion of other chronic diseases or overt hormonal deficiencies, clinicians are often unable to provide a definitive diagnosis for the etiology of an individual patient's short stature. There are a multitude of genetic causes for short stature, but most patients do not fall into a previously identified genetic syndrome. We, therefore, designed and performed a large-scale sequencing project to identify pathogenic rare genetic variants in individuals with short stature. We sequenced 1077 candidate genes including known skeletal dysplasia genes, genes within the GH signaling pathway, genes known to affect growth plate biology, and genes within loci associated with adult height in large GWA studies. Using this approach, we identified 4 known pathogenic variants causing short stature as well as novel variants in genes known to affect stature.

To facilitate the sequencing of a large number of genes in many subjects, we used a pooled sequencing design, which significantly reduced the cost of such analysis (19). Most of the cost associated with a targeted next-generation sequencing project is typically incurred at the library construction stage, in which targeted regions of DNA are separated from the remainder of the genome for sequencing. In a pooled sequencing design, this process only has to be performed once per pool. Although actual sequencing depth may need to be increased to ensure adequate representation of all samples in the pool, the associated cost is relatively minor. Indeed, the cost per sample of our pooled targeted sequencing approach is estimated to be ∼15% of the cost for individual exome sequencing (ie, ∼$110 compared with ∼$800 per sample, based on current prices available at our institution). Exome sequencing could also be done in a pooled fashion, in which case cost differences will depend on the cost of sequencing coverage, a process that is becoming cheaper to perform. Although pooled exome sequencing does have the advantage that nearly all genes are evaluated, analysis and interpretation of the data generated would be much more complex because of the large number of novel nonsynonymous variants in genes with no known connection to the phenotype of interest. Our simple matrix pooling design, in contrast, allows for the rapid assessment of low-frequency variants in candidate genes and the identification of individuals carrying singleton variants, which are more likely to be pathogenic than variants with a higher minor allele frequency. However, pooling does limit the ability to discern whether a single variant is homozygous or heterozygous in an individual subject and follow-up confirmatory genotyping is necessary. Using this design, we were able to identify a large number of very rare nonsynonymous variants within our candidate genes with low false-positive and low false-negative rates.
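The source of the savings is easy to see with a little arithmetic: library construction is done once per pool rather than once per subject. The figures below come from the text (2 matrices of 28 pools, 384 subjects, approximate per-sample prices); the variable names are ours, and the prices are the rounded values quoted above rather than an exact cost model.

```python
pools_per_matrix = 14 + 14          # 14 row pools + 14 column pools
matrices = 2                        # one for case patients, one for controls
libraries_pooled = pools_per_matrix * matrices
libraries_individual = 192 + 192    # one library per subject without pooling

cost_pooled_per_sample = 110        # approximate USD, from the text
cost_exome_per_sample = 800         # approximate USD, from the text
cost_ratio = cost_pooled_per_sample / cost_exome_per_sample
```

This gives 56 libraries instead of 384, and a per-sample cost ratio of about 0.14, in line with the roughly 15% quoted above.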

We identified 4 subjects in our cohort who had known pathogenic variants implicated in disease. Notably, 3 of these subjects have mutations in PTPN11 that cause Noonan syndrome. Noonan syndrome is known to have a wide phenotypic spectrum, leading to difficulty in diagnosis (12), and, indeed, the father of one of our subjects carries a proven pathogenic variant in PTPN11 yet never presented with the overt clinical manifestations of Noonan syndrome. Although it is true that our subjects may have had features consistent with Noonan syndrome that were unrecognized, such as a cardiac defect or delayed puberty, this retrospective recognition of related features does not eliminate the benefit of genetic screening. The lack of diagnoses in our cohort represents clinical reality, because these subjects were extensively evaluated by experienced pediatric endocrinologists and in one case by a geneticist as well. This suggests that a substantial number of patients with Noonan syndrome are designated as having idiopathic short stature or isolated GH deficiency even after clinical evaluation by pediatric subspecialists. Additional research is needed to determine whether widespread screening for PTPN11 or the other Noonan syndrome genes is warranted for patients with short stature of unknown etiology.

It is important to note that, in our cohort, the vast majority of HGMD-reported disease-causing dominant mutations did not manifest with the associated clinical phenotype. The classification of these variants as pathogenic is probably erroneous and based on insufficient clinical evidence. However, we cannot rule out the possibility that the variants have variable expressivity and some of our subjects are presenting on the very mild end of the clinical spectrum with short stature as their disease manifestation.

Our focus on rare variants of the IGF1R gene illustrates the critical importance of providing supporting familial segregation and functional data when a rare variant has been identified. IGF-I, the primary mediator of GH function, is essential for growth. Heterozygous and compound heterozygous mutations in IGF1R that lead to decreases in the quantity or function of the receptor have been described in nearly a dozen human cases (1, 15–17, 20, 21). These patients display variable phenotypes, with shared characteristics that include poor prenatal and postnatal growth, microcephaly, high or normal IGF-I levels, and developmental delay (1, 15–17, 20, 21).

Our targeted sequencing approach identified 7 unique rare nonsynonymous IGF1R variants as well as 2 intronic variants with the potential to affect splicing because of their proximity to exons. Only 1 of these 9 variants, a novel c.418dupG frameshift mutation located in exon 2, was associated with clinical features suggestive of a pathological IGF1R deficiency state (high levels of serum IGF-I, microcephaly, and intrauterine growth retardation). Furthermore, in primary cells derived from this patient, significant decreases in both IGF1R expression and IGF-I–induced signaling supported the pathogenicity of the IGF1R c.418dupG defect. None of the remaining variants found in case patients shows convincing evidence of pathogenicity. This example demonstrates that large-scale sequencing efforts will identify numerous very rare and novel nonsynonymous variants in candidate genes. Most of these will be missense variants that change a single amino acid without affecting protein function and therefore represent incidental findings. Segregation of these variants with the phenotype within families is the first critical step in evaluating potential pathogenicity, highlighting the importance of collecting familial samples at the time of the initial DNA collection.

Filtering strategies based on population allele frequency are useful and necessary, but most public databases do not provide individual phenotypic data linked to the subject's genotype, thus limiting the ability to determine whether a variant is potentially pathogenic. Therefore, simultaneous sequencing of a control cohort with a known phenotype, in this case normal stature, provides additional information about the lack of pathogenicity of rare variants in a gene. This fact is exemplified by our finding that an IGF1R variant previously reported to be pathogenic (c.1532G>A/p.R511Q) (18) was found in the heterozygous state in a control subject of normal stature. This variant was originally identified in the heterozygous state in a patient and her maternal aunt, both of whom presented with extreme short stature (−6.1 and −5.7 SD, respectively). It is of note that information regarding the parents of the patient was lacking in this report. In vitro reconstitution studies of the homozygous p.R511Q variant were performed to support the pathogenicity of this variant, although the effect of heterozygosity was unknown. These caveats, together with our identification of the same variant in a control subject of normal stature, strongly suggest that a heterozygous p.R511Q is not likely to be the cause of the previously reported family's extreme short stature. Furthermore, Kansra et al (22) recently detected the R511Q variant in 6 of 1800 public school students. Indeed, carriers had an average height around the 27th percentile, thus providing additional evidence that this variant does not cause severe short stature. Taken together, these results support the importance of segregation analysis and the need to include primary cells in functional analysis.
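The combination of allele-frequency filtering with a phenotyped control cohort can be sketched as follows. This is an illustrative outline only: the record fields, the 1% frequency threshold, the gene labels, and the example records are hypothetical assumptions and do not reproduce the study's data or pipeline. The only figure taken from the text is the count of 6 carriers among 1800 students.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Variant:
    gene: str               # hypothetical gene label
    change: str             # hypothetical variant label
    case_carriers: int      # carriers among short-stature cases
    control_carriers: int   # carriers among normal-stature controls
    population_maf: float   # 0.0 if absent from reference databases


def candidate_variants(variants, maf_threshold=0.01):
    """Keep rare variants seen in cases but in no phenotyped control.

    The 1% threshold is an assumed value chosen for illustration.
    """
    return [
        v for v in variants
        if v.case_carriers > 0
        and v.control_carriers == 0
        and v.population_maf < maf_threshold
    ]


def carrier_frequency(carriers, cohort_size):
    """Fraction of individuals in a cohort who carry a variant."""
    return carriers / cohort_size


# Hypothetical records, not data from the study.
variants = [
    Variant("GENE_A", "variant_1", 1, 0, 0.0),    # novel, cases only
    Variant("GENE_B", "variant_2", 1, 1, 0.002),  # also in a control
    Variant("GENE_C", "variant_3", 4, 0, 0.05),   # too common
]
kept = candidate_variants(variants)

# Carrier frequency for a variant found in 6 of 1800 individuals.
freq = carrier_frequency(6, 1800)
```

Here only the first record survives both filters. A variant that also appears in a normal-stature control is set aside even when it is rare in reference databases, which is the additional information a phenotyped control cohort provides.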

Our study has a number of important limitations. We recruited a very heterogeneous cohort, allowing for the inclusion of dysmorphic features, other congenital anomalies, and hormonal deficiencies provided that there was no known genetic etiology for these findings. Thus, subjects in this cohort do not meet a strict definition of idiopathic short stature (23). Nevertheless, we believe that this cohort more accurately represents the diversity of patients who are seen in a referral setting and is probably enriched for individuals with rare genetic variants that may have multisystem effects. In addition, our hybrid selection strategy only targets the exons of the candidate genes, and, thus, any noncoding variation that affects gene expression cannot be detected by our methods. Variants affecting gene expression can play an important role in causing short stature. For example, Russell-Silver syndrome, an important syndromic form of short stature, is often due to abnormalities in methylation of chromosome 11p15.5, leading to aberrant gene expression (24). In addition, our current approach does not assess copy number variation (ie, deletions or duplications of genes), which may also be an important genetic defect leading to short stature. We are currently pursuing copy number analysis of this cohort using a custom chromosomal microarray (data not shown). Furthermore, we did not obtain perfect sequencing coverage of all variants in the targeted region and could miss potentially pathogenic variants in the candidate genes. Finally, because of the large numbers of rare missense variants in both case patients and control subjects, we have limited power to discover new genes with a statistically significant excess of mutations in case patients vs control subjects. Ongoing work to increase sample size and examine subjects at the extremes of the height distribution will provide additional data to support novel gene discovery.

In conclusion, we present the initial results of a large-scale candidate gene sequencing effort in children with short stature and demonstrate the complexity of data interpretation of such efforts. Of our 192 subjects, 3 were found to have known pathogenic variants in PTPN11, highlighting the possibility that Noonan syndrome is underdiagnosed in the clinical setting. We report a novel frameshift mutation in IGF1R and provide functional evidence of its pathogenicity in patient-derived primary cells. In addition, we provide evidence that a previously reported variant in IGF1R is not pathogenic. Analyses of variants identified in the other candidate genes are currently ongoing.

Acknowledgments

We thank Jason Flannick for his assistance in running the Syzygy software and Dr. Amy Roberts for her helpful discussions regarding Noonan syndrome.

This work was supported by Harvard Catalyst, The Harvard Clinical and Translational Science Center (National Institutes of Health [NIH] Award UL1 RR 025758 and financial contributions from Harvard University and its affiliated academic health care centers). The content is solely the responsibility of the authors and does not necessarily represent the official views of Harvard Catalyst, Harvard University and its affiliated academic health care centers, the National Center for Research Resources, or the NIH. Sequencing experiments were performed by the Sequencing Core Facility of the Molecular Genetics Core Facility at Boston Children's Hospital supported by NIH P30-HD18655. This work was also supported by NIH Grant 1K23HD073351 (to A.D.), a fellowship grant from the Genentech Center for Clinical Research in Endocrinology (to A.D.), the Translational Research Program at Boston Children's Hospital, and March of Dimes Grant 6-FY09-507 (to J.N.H.). Samples were provided from the Framingham Heart Study of the National Heart, Lung, and Blood Institute of the NIH and Boston University School of Medicine, which was supported by the National Heart, Lung, and Blood Institute Framingham Heart Study (Contract N01-HC-25195).

Disclosure Summary: A.D. has previously consulted for Ipsen Pharma. J.N.H. received grant support from Pfizer Inc (2011−present). The other authors have nothing to disclose.

Footnotes

Abbreviations: FACS, fluorescence-activated cell sorter; GWA, genome-wide association; HGMD, Human Gene Mutation Database; PBMC, peripheral blood mononuclear cell; PE, phycoerythrin; SNP, single nucleotide polymorphism.

References

  • 1. David A, Hwa V, Metherell LA, et al. Evidence for a continuum of genetic, phenotypic, and biochemical abnormalities in children with growth hormone insensitivity. Endocr Rev. 2011;32:472–497.
  • 2. Rimoin DL, Cohn D, Krakow D, Wilcox W, Lachman RS, Alanay Y. The skeletal dysplasias: clinical-molecular correlations. Ann NY Acad Sci. 2007;1117:302–309.
  • 3. Lango Allen H, Estrada K, Lettre G, et al. Hundreds of variants clustered in genomic loci and biological pathways affect human height. Nature. 2010;467:832–838.
  • 4. DePristo MA, Banks E, Poplin R, et al. A framework for variation discovery and genotyping using next-generation DNA sequencing data. Nat Genet. 2011;43:491–498.
  • 5. Kuczmarski R, Ogden C, Guo S. CDC growth charts for the United States: methods and development. Vital Health Stat. 2002;11:246.
  • 6. Rivas MA, Beaudoin M, Gardet A, et al. Deep resequencing of GWAS loci identifies independent rare variants associated with inflammatory bowel disease. Nat Genet. 2011;43:1066–1073.
  • 7. 1000 Genomes Project Consortium. A map of human genome variation from population-scale sequencing. Nature. 2010;467:1061–1073.
  • 8. National Heart, Lung, and Blood Institute Exome Variant Server. http://evs.gs.washington.edu/EVS/ Accessed May 5, 2012.
  • 9. Fang P, Schwartz ID, Johnson BD, et al. Familial short stature caused by haploinsufficiency of the insulin-like growth factor I receptor due to nonsense-mediated messenger ribonucleic acid decay. J Clin Endocrinol Metab. 2009;94:1740–1747.
  • 10. Stenson PD, Ball E, Howells K, et al. Human Gene Mutation Database: towards a comprehensive central mutation database. J Med Genet. 2008;45:124–126.
  • 11. Rock MJ, Prenen J, Funari VA, et al. Gain-of-function mutations in TRPV4 cause autosomal dominant brachyolmia. Nat Genet. 2008;40:999–1003.
  • 12. Romano AA, Allanson JE, Dahlgren J, et al. Noonan syndrome: clinical features, diagnosis, and management guidelines. Pediatrics. 2010;126:746–759.
  • 13. Tartaglia M, Mehler EL, Goldberg R, et al. Mutations in PTPN11, encoding the protein tyrosine phosphatase SHP-2, cause Noonan syndrome. Nat Genet. 2001;29:465–468.
  • 14. Tartaglia M, Kalidas K, Shaw A, et al. PTPN11 mutations in Noonan syndrome: molecular spectrum, genotype-phenotype correlation, and phenotypic heterogeneity. Am J Hum Genet. 2002;70:1555–1563.
  • 15. Ester WA, van Duyvenvoorde HA, de Wit CC, et al. Two short children born small for gestational age with insulin-like growth factor 1 receptor haploinsufficiency illustrate the heterogeneity of its phenotype. J Clin Endocrinol Metab. 2009;94:4717–4727.
  • 16. Abuzzahab MJ, Schneider A, Goddard A, et al. IGF-I receptor mutations resulting in intrauterine and postnatal growth retardation. N Engl J Med. 2003;349:2211–2222.
  • 17. Kawashima Y, Kanzaki S, Yang F, et al. Mutation at cleavage site of insulin-like growth factor receptor in a short-stature child born with intrauterine growth retardation. J Clin Endocrinol Metab. 2005;90:4679–4687.
  • 18. Inagaki K, Tiulpakov A, Rubtsov P, et al. A familial insulin-like growth factor-I receptor mutant leads to short stature: clinical and biochemical characterization. J Clin Endocrinol Metab. 2007;92:1542–1548.
  • 19. Golan D, Erlich Y, Rosset S. Weighted pooling--practical and cost-effective techniques for pooled high-throughput sequencing. Bioinformatics. 2012;28:i197–i206.
  • 20. Fang P, Cho YH, Derr MA, Rosenfeld RG, Hwa V, Cowell CT. Severe short stature caused by novel compound heterozygous mutations of the insulin-like growth factor 1 receptor (IGF1R). J Clin Endocrinol Metab. 2012;97:E243–E247.
  • 21. Walenkamp MJ, van der Kamp HJ, Pereira AM, et al. A variable degree of intrauterine and postnatal growth retardation in a family with a missense mutation in the insulin-like growth factor I receptor. J Clin Endocrinol Metab. 2006;91:3062–3070.
  • 22. Kansra AR, Dolan LM, Martin LJ, Deka R, Chernausek SD. IGF receptor gene variants in normal adolescents: effect on stature. Eur J Endocrinol. 2012;167:777–781.
  • 23. Cohen P, Rogol AD, Deal CL, et al. Consensus statement on the diagnosis and treatment of children with idiopathic short stature: a summary of the Growth Hormone Research Society, the Lawson Wilkins Pediatric Endocrine Society, and the European Society for Paediatric Endocrinology Workshop. J Clin Endocrinol Metab. 2008;93:4210–4217.
  • 24. Abu-Amero S, Wakeling EL, Preece M, Whittaker J, Stanier P, Moore GE. Epigenetic signatures of Silver-Russell syndrome. J Med Genet. 2010;47:150–154.

Articles from The Journal of Clinical Endocrinology and Metabolism are provided here courtesy of The Endocrine Society