Abstract
Objective
Custom genotyping of markers in families with Familial Idiopathic Scoliosis (FIS) was used to fine-map candidate regions on chromosomes 9 and 16, in order to identify candidate genes that contribute to this disorder and to prioritize them for next-generation sequence analysis.
Methods
Candidate regions on 9q and 16p–16q, previously identified as linked to FIS in a study of 202 families, were genotyped with a high-density map of single nucleotide polymorphisms (SNPs). Tests of linkage for fine-mapping and intra-familial tests of association, including tiled regression, were performed on scoliosis as both a qualitative and quantitative trait.
Results and Conclusions
Nominally significant linkage results were found for markers in both candidate regions. Results from intra-familial tests of association and tiled regression corroborated the linkage findings and identified possible candidate genes suitable for follow-up with next-generation sequencing in these same families. Candidate genes that met our prioritization criteria included FAM129B and CERCAM on chromosome 9 and SYT17, GNAO1, and CDH3 on chromosome 16.
Keywords: idiopathic scoliosis, chromosome 9q, chromosome 16, genetic heterogeneity, genetics, association, family-based association study, complex disease
INTRODUCTION
Idiopathic scoliosis (IS) is the most common spinal abnormality of children, and it is clinically characterized by a pain-free abnormal spinal curvature in otherwise normal individuals [1]. The variation of presentation, the limited therapeutic options, and the inability to detect individuals at risk for significant progression have led to the establishment of high-cost screening programs. These programs have resulted in higher rates of subspecialty referrals and additional radiographic analyses. Spinal fusion is the recommended treatment option when curve progression is significant. The economic cost attributable to scoliosis within the United States has been estimated to be as high as $3 billion per year, and this does not take into account the morbidity in adulthood related to long-term effects of a spinal fusion done at a young age [2].
Genome-wide linkage screens and fine-mapping of candidate loci in families with at least two members with IS, termed Familial Idiopathic Scoliosis (FIS), have identified candidate regions on 6q, 10q, 18q [3]; 17p11.2 [4]; 19p13.3, 2q [5]; Xq23-26 [6]; 6p25-22, 6q14-16, 9q32-34, 16q11-12, 17p11-q11 [7]; 8q12 [8]; 9q31.2-q34.2, 17q25.3-qtel [9]; and 12pter [10]. Candidate regions on chromosomes 9q32-34 and 19p13 have been independently confirmed [9,11]. Recently, a common variant near LBX1 on 10q24.31 was found to be associated with adolescent idiopathic scoliosis in unrelated individuals in two Asian populations [12,13].
In this study, a set of families with FIS was genotyped with two different high density custom oligonucleotide panels of single nucleotide polymorphisms (SNPs) in order to identify candidate genes and prioritize them for next-generation sequence analysis.
MATERIALS AND METHODS
The study population
Written informed consent was obtained from all study participants, in accordance with a protocol approved by the Johns Hopkins School of Medicine Institutional Review Board. The study population comprised Caucasian families with two or more individuals with IS, and all family members participating in the study were ascertained and examined by a single orthopaedic surgeon. Characterization of the study population was performed to document the uniformity and/or variation within the sample population. Parameters of gender, curve type, and size within this familial study population were consistent with previous reports in the literature [14]. The number of affected females exceeded the number of affected males (270 to 110), and the mean curve severity of the females was greater than that of the males (35.0 ± 1.2 vs. 26.9 ± 2.0 degrees). The primary curve pattern represented was the single right thoracic curvature [14]. The criteria for a diagnosis of IS were history and physical examination consistent with a lateral spinal curvature, and standing anteroposterior spinal radiographs exhibiting ≥ 10 degrees curvature in the coronal plane by the Cobb method, with pedicle rotation and no congenital deformity [15]. The threshold of ten degrees is based on the observation that a graph of scoliosis prevalence in the general population is a smooth exponential function in which the sharpest change in slope occurs at ten degrees of curvature in the coronal plane [15]. While the initial threshold criterion of ten degrees for the definition of scoliosis has proven to be clinically relevant, the significance of this threshold with respect to the underlying genetics is unknown; therefore, additional thresholds of ≥ 20°, ≥ 30°, and ≥ 40° were considered. Radiographic measurements of the proband within each family were taken at the time of inclusion into the study; probands ranged in age from 8 to 16 years, with curve measurements of 16 to 88 degrees.
Radiographs of family members were obtained either from historical radiographs or standing spinal radiographs at the time of their inclusion into the study. A single orthopaedic surgeon performed all radiographic measurements. Measurements related to scoliotic spinal curvatures from radiographs have been well studied for intra-observer consistency [16,17]. Historical evidence or clinical signs of conditions, including blood clots, cardiac defects, osteoporosis, and known hereditary disorders, in any individual, excluded the family from the study. In order to avoid misclassification, individuals without radiographic information were classified as unknown. For individuals with two or more curves, the degree of curvature was obtained from the curve with the largest observed Cobb angle. In addition to the degree of lateral curvature, variables measured included type of curve, age at diagnosis, ethnic background, awareness of condition, presence of pain and type of treatment.
In this study, the sample population was genotyped, and fine-mapping linkage analysis and intra-familial tests of association were performed in order to identify and prioritize candidate genes for next-generation sequence analysis. The sample consisted of 544 individuals belonging to a group of 95 families determined most likely to be segregating an autosomal dominant form of FIS; average family size was 5.7 individuals with a range from 2 to 29; and 358 (65.8%) of these individuals were female. Genotype and phenotype information was available on 510 (93.75%) of these individuals, yielding a missing rate of 6.25%. This sample was genotyped with SNPs located in the previously identified candidate regions on chromosome 9 (between STRP markers D9S930 and D9S1826 spanning 24 Megabases (Mb): 115.2–138.4 Mb) and on chromosome 16 (between STRPs D16S764 and D16S2624 spanning 54 Mb: 23.0–54.8 Mb) [7].
Genotype analysis
Blood samples were previously obtained from all participants. Genomic DNA was extracted with standard purification protocols [18]. The SNP panels were generated with the National Center for Biotechnology Information (NCBI) dbSNP (http://www.ncbi.nlm.nih.gov/projects/SNP) chromosome report. Custom oligo pools for fine-mapping linkage analysis and intra-familial tests of association were designed for the candidate regions on chromosomes 9 and 16. The panels were designed with special attention to intragenic positioning, linkage disequilibrium blocks, and genes that may be related to spinal growth and development.
SNPs were genotyped using the Illumina BeadArray platform [19,20]. The SNPs were genotyped at the Center for Inherited Disease Research (CIDR), McKusick/Nathans Institute of Genetic Medicine, Johns Hopkins School of Medicine. Illumina’s BeadStudio software was used to cluster all SNPs using samples in this study. The SNP plots were manually evaluated and clustered as needed. SNPs were not included if they had poorly defined clusters, excessive replicate or Mendelian errors, and/or more than 50% missing data. Genotypes were identified with a GenCall score, a quantitative measure that ranges from 0 to 1 and reflects the proximity within a cluster plot of the intensities of that genotype to the centroid of the nearest cluster. All SNPs with a GenCall score < 0.25 were dropped.
A total of 1324 SNPs were released for chromosomes 9 and 16 (Table 1). As reported by CIDR, the missing, Mendelian inconsistency and duplicate error rates were 1.4%, 0.06% and 0.05%, respectively, for the SNPs genotyped.
Table 1.
Characteristics of SNPs on 9q31-34 and 16p12-16q22.
| | Chromosome 9 | Chromosome 16 |
|---|---|---|
| SNPs released | 519 | 805 |
| Average spacing | 46 kb | 66 kb^b |
| Location | 9q31.3-34.3 | 16p12.3-q22.2 |
| SNP boundaries (bp)^a | 114552025 – 138460445 | 18074463 – 71258937 |
^a Map positions obtained from UCSC website, build GRCh37 (www.genome.ucsc.edu)
^b Includes the centromere
Statistical genetic analysis
Allele frequencies were estimated in the founders with FREQ [21]. Familial relationships were verified with RELCHECK [22,23] which infers the most likely relationship between pairs of relatives by using estimates of identical by descent (IBD) sharing. Genotyping data were checked for Mendelian inconsistencies with PEDCHECK [24] and non-systematic inconsistencies were removed from the data. Physical positions for the SNPs were obtained from the NCBI dbSNP database (Build 131). For each sibling pair, the proportion of alleles shared IBD was estimated using GENIBD [21].
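To illustrate the kind of check that a Mendelian-inconsistency screen performs, the following is a minimal sketch for a single parent-offspring trio at one SNP. It is not PEDCHECK's algorithm, which works on full pedigrees; the function name and the allele-pair genotype encoding are assumptions made purely for illustration.

```python
from itertools import product


def mendelian_consistent(father, mother, child):
    """Return True if the child's genotype can arise from the two parents.

    Each genotype is a pair of alleles, e.g. ("A", "G"); allele order is ignored.
    Hypothetical helper for illustration only, not part of PEDCHECK.
    """
    child_alleles = sorted(child)
    # The child must carry one allele from the father and one from the mother.
    return any(sorted((pa, ma)) == child_alleles
               for pa, ma in product(father, mother))


print(mendelian_consistent(("A", "A"), ("G", "G"), ("A", "G")))  # True
print(mendelian_consistent(("A", "A"), ("A", "A"), ("A", "G")))  # False
```

A genotype flagged as inconsistent by such a check would be a candidate for removal, in the spirit of the cleaning step described above.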
Monomorphic SNPs, SNPs with a missing rate > 10%, and SNPs for which the minor allele frequency (MAF) was < 1% were removed prior to analysis. Tests of Hardy-Weinberg equilibrium [25] identified 2 SNPs on chromosome 9 and 4 SNPs on chromosome 16 with p-values < 0.001. Two of these SNPs, one on chromosome 9 and one on chromosome 16, were dropped due to genotyping errors. Because these data are highly ascertained, the other SNPs were flagged and retained for subsequent analysis. This reduced the number of released SNPs to 1252, with 500 and 752 on chromosomes 9 and 16, respectively.
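The SNP-level filters described above can be sketched as follows. This is a simplified illustration, not the pipeline used in the study: genotypes are assumed to be coded as 0, 1, or 2 copies of one allele (with `None` for a missing call), allele frequencies are computed from all called genotypes rather than from founders only, and the Hardy-Weinberg check is a 1-df chi-square test chosen for illustration, since the text does not say which test was applied. All function names are hypothetical.

```python
import math


def snp_qc(genotypes, max_missing=0.10, min_maf=0.01):
    """Apply the missing-rate, monomorphic, and MAF filters to one SNP.

    genotypes: non-empty list of 0/1/2 allele counts, None for missing calls.
    Returns (keep, reason).
    """
    called = [g for g in genotypes if g is not None]
    missing_rate = 1 - len(called) / len(genotypes)
    if missing_rate > max_missing:
        return False, "missing rate"
    freq = sum(called) / (2 * len(called))
    maf = min(freq, 1 - freq)
    if maf == 0:
        return False, "monomorphic"
    if maf < min_maf:
        return False, "low MAF"
    return True, "pass"


def hwe_chi2_p(n_aa, n_ab, n_bb):
    """1-df chi-square p-value for Hardy-Weinberg proportions (polymorphic SNP)."""
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)
    q = 1 - p
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    chi2 = sum((o - e) ** 2 / e for o, e in zip((n_aa, n_ab, n_bb), expected))
    # Survival function of a chi-square with 1 df: P(X > x) = erfc(sqrt(x / 2)).
    return math.erfc(math.sqrt(chi2 / 2))


print(snp_qc([0, 1, 2, 1]))        # (True, 'pass')
print(snp_qc([0] * 99 + [1]))      # (False, 'low MAF')
print(hwe_chi2_p(50, 0, 50) < 0.001)  # True: no heterozygotes at all
```

In the study, SNPs failing the Hardy-Weinberg test were flagged rather than automatically dropped, so a p-value from such a test would feed a review step, not a hard filter.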
Scoliosis was analyzed both as a qualitative and quantitative trait; the qualitative measure represents a clinically relevant classification, and the quantitative measure represents the variation of the degree of lateral curvature. For scoliosis treated as a qualitative trait, the presence or absence of scoliosis was determined by a threshold value used to dichotomize the degree of lateral curvature. This threshold value allows for the conversion of a quantitative trait into a qualitative trait dichotomized into affected and unaffected classes, set by a clinically relevant, although arbitrary, threshold. For the qualitative trait, thresholds of ≥ 10°, ≥ 20°, ≥ 30°, and ≥ 40° were considered. For scoliosis treated as a quantitative trait, both untransformed and transformed measurements of the degree of lateral curvature were considered in the analysis. The distribution of the degree of lateral curvature is highly skewed and kurtotic with a substantial probability mass at 0. Although several transformations were considered (e.g., natural log, Box-Cox) prior to genetic analysis, none sufficiently normalized the distribution with respect to both skewness and kurtosis. The natural log transformation was chosen because it best discriminated between individuals with a curve of 0 versus a curve greater than 0 and represented a mixture of two distributions.
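The two phenotype codings can be summarized in a short sketch. The function names are hypothetical; the offset of 1 inside the logarithm follows the regression models reported in the Results, which are written in terms of loge(degree of lateral curvature + 1), and individuals without radiographic information are kept as unknown.

```python
import math

THRESHOLDS = (10, 20, 30, 40)  # degrees, as considered for the qualitative trait


def affection_status(cobb_angle, threshold):
    """Dichotomize curvature: True = affected, False = unaffected, None = unknown."""
    if cobb_angle is None:  # no radiographic information
        return None
    return cobb_angle >= threshold


def log_curvature(cobb_angle):
    """Natural log of (curvature + 1), so that a curve of 0 maps to 0."""
    return math.log(cobb_angle + 1)


# A 16-degree curve is affected at the 10-degree threshold only.
print([affection_status(16, t) for t in THRESHOLDS])  # [True, False, False, False]
print(log_curvature(0))  # 0.0
```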
SNPs in high linkage disequilibrium (LD) tend to produce highly correlated results, both for linkage analysis and for intra-familial tests of association. Including SNPs in high LD increases the number of non-independent tests while providing little, if any, additional information. SNPs in high LD were identified by calculating the pairwise LD measures r² and D' with Haploview [26]. Any pair of SNPs with D' = 1.0 and r² > 0.4 was defined as being in LD, and only the most informative SNP from the pair was retained. The final number of SNPs in these analyses was 846, with 352 and 494 SNPs on chromosomes 9 and 16, respectively.
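For readers unfamiliar with the two LD measures, the sketch below computes D' and r² from counts of the four two-SNP haplotypes and applies the pruning rule stated above. It assumes phased haplotype counts are available, whereas Haploview estimates haplotype frequencies from unphased genotypes; the function names and the numerical tolerance on D' are assumptions.

```python
def ld_measures(n11, n12, n21, n22):
    """Return (D', r^2) from counts of the four two-SNP haplotypes.

    n11 = A1-B1, n12 = A1-B2, n21 = A2-B1, n22 = A2-B2.
    Both SNPs must be polymorphic in the sample.
    """
    n = n11 + n12 + n21 + n22
    p_a = (n11 + n12) / n      # frequency of allele A1
    p_b = (n11 + n21) / n      # frequency of allele B1
    d = n11 / n - p_a * p_b    # raw disequilibrium coefficient D
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    d_prime = abs(d) / d_max
    r2 = d * d / (p_a * (1 - p_a) * p_b * (1 - p_b))
    return d_prime, r2


def in_high_ld(d_prime, r2, r2_min=0.4, tol=1e-9):
    """Pruning rule from the text: D' = 1.0 and r^2 > 0.4."""
    return d_prime >= 1.0 - tol and r2 > r2_min


# D' = 1 with unequal allele frequencies gives a low r^2, so the pair is kept.
d_prime, r2 = ld_measures(10, 0, 40, 50)
print(round(d_prime, 3), round(r2, 3), in_high_ld(d_prime, r2))  # 1.0 0.111 False
```

The example shows why both criteria are needed: D' can equal 1 whenever one of the four haplotypes is absent, even if the two SNPs are only weakly correlated.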
Linkage analysis
Prior to multipoint linkage analysis, the 846 SNP genotypes were merged with 13 STRP genotypes from the previous genomic screen (6 and 7 on chromosomes 9 and 16, respectively) [7]. Model-independent single-point and multi-point linkage analyses were performed using SIBPAL [21]. The traditional Haseman-Elston analysis for quantitative and qualitative traits was performed on full sib relationships. For multi-point linkage analysis, the genetic distances (in cM) were assumed to be a linear function of the map distance (in Mb). Results from SIBPAL linkage analysis are presented as a p-plot, which plots the log of the inverse of the p-value, or minus the log(p), against the location of each marker, as a linear function of the map distance [27].
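The idea behind the traditional Haseman-Elston method can be sketched in a few lines: the squared trait difference of each sib pair is regressed on the estimated proportion of alleles the pair shares IBD at a marker, and a negative slope is evidence for linkage. The code below is an illustration only and is not SIBPAL; it computes the least-squares slope without a significance test, and it assumes a base-10 logarithm for the p-plot, which the text does not specify.

```python
import math


def haseman_elston_slope(pairs):
    """Least-squares slope of squared sib-pair trait difference on IBD sharing.

    pairs: list of (trait_sib1, trait_sib2, ibd_proportion) tuples.
    """
    y = [(a - b) ** 2 for a, b, _ in pairs]
    x = [pi for _, _, pi in pairs]
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    return sxy / sxx


def p_plot_value(p_value):
    """Height of a marker on the p-plot: minus log10 of the p-value (base assumed)."""
    return -math.log10(p_value)


# Toy data: pairs sharing more alleles IBD have more similar trait values.
toy_pairs = [(0.0, 4.0, 0.0), (0.0, 2.0, 0.5), (1.0, 1.0, 1.0)]
print(haseman_elston_slope(toy_pairs))  # -16.0
```

In practice the slope is tested one-sided against zero, because linkage predicts that sibs sharing more alleles IBD differ less in the trait.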
Intra-familial tests of association
Intra-familial tests of association were carried out on 846 SNPs using FBAT v2.0.2c for qualitative traits [28,29,30] and ASSOC [21] for quantitative traits. FBAT uses data from the members of a nuclear family and can perform tests of association using single SNPs or haplotypes. In order to test for association in these areas of known linkage, the robust empirical variance estimator implemented in FBAT was used, assuming an additive, bi-allelic model [31]. Based on results from linkage analysis, individuals with a curvature of ≥ 20° were classified as affected with FIS for the qualitative trait association analysis.
ASSOC is a likelihood-based test of association that compares the likelihood of the data in models with and without a marker (e.g., a single SNP). Unlike other intra-familial tests, ASSOC uses the phenotype and genotype information of the entire family. The untransformed and natural log transformed (loge) degree of lateral curvature were used as quantitative phenotypes and a genotypic test was performed. The Bonferroni threshold for the intra-familial tests of association was based on the independent factors (846 SNPs × three tests of association) and was taken to be 0.00002.
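The Bonferroni threshold quoted above can be reproduced with simple arithmetic. The family-wise significance level of 0.05 is an assumption here, since the text gives only the number of tests and the resulting threshold; the function name is hypothetical.

```python
def bonferroni_threshold(alpha, n_snps, tests_per_snp):
    """Per-test significance level after Bonferroni correction."""
    return alpha / (n_snps * tests_per_snp)


# 846 SNPs x 3 association tests = 2538 tests in total.
threshold = bonferroni_threshold(0.05, 846, 3)
print(round(threshold, 5))  # 2e-05
```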
In addition to single marker association tests, tiled regression was used to identify the set of independent SNPs that are responsible for the variation in the degree of lateral curvature in the context of other SNPs [32,33,34,35]. In tiled regression, the genome is divided into independent segments based on predefined regions. Recombination hot spots (i.e., well-defined regions of increased recombination) were used to delineate regions in this study. The term tile denotes both the sequence of DNA between two hot spot regions and a hot spot region itself. Each sequence variant is assigned to a tile based on its physical position. A tile is selected if the multiple linear regression on all SNPs in the tile shows a significant relationship to trait variation (testing the null hypothesis that all SNP coefficients are 0) or if the simple linear regression on any single SNP in the tile is significant. A stepwise regression is then used to select the important individual independent SNPs identified in each selected tile. Thereafter, the significant SNPs are combined across tiles in higher-order stepwise regressions at the chromosome and then genome levels. Generalized estimating equations (GEE) [36] were included in the tiled regression framework to allow for correlations among family members [33]. Sibships were used as the clusters in the GEE covariance matrix estimation, and SNPs with a MAF < 0.05 were removed prior to analysis. The end result is a multiple linear regression model that includes the set of SNPs that independently contribute to trait variation [35]. Tiled regression was performed with the Tiled Regression Analysis Package (TRAP v 1.0) [34] for the quantitative trait on chromosomes 9 and 16, separately. Critical values of 0.01 and 0.1 were used to select the significant tiles for the simple and multiple linear regressions. A critical value of 0.05 was used to select markers in the stepwise regression.
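The first two steps of tiled regression, assigning SNPs to tiles and deciding which tiles to carry forward, can be sketched as follows. This is not the TRAP implementation: the hot spot coordinates and SNP names are invented, hot spots are treated as half-open intervals [start, end), the regressions themselves are not shown (their p-values are taken as inputs), and the two critical values are read as applying to the simple and multiple regressions respectively, which is an interpretation of the sentence above.

```python
from bisect import bisect_right
from collections import defaultdict


def assign_tiles(snp_positions, hotspots):
    """Map each SNP to a tile index based on its physical position.

    hotspots: sorted, non-overlapping (start, end) intervals in bp.
    Even tile indices are segments between hot spots; odd indices are hot spots.
    """
    boundaries = [edge for interval in hotspots for edge in interval]
    tiles = defaultdict(list)
    for snp, position in snp_positions.items():
        tiles[bisect_right(boundaries, position)].append(snp)
    return dict(tiles)


def tile_selected(multiple_p, single_ps, multiple_crit=0.1, single_crit=0.01):
    """Keep a tile if the joint test or any single-SNP test is significant."""
    return multiple_p < multiple_crit or any(p < single_crit for p in single_ps)


# Hypothetical coordinates, for illustration only.
hotspots = [(100, 200), (500, 600)]
snps = {"snpA": 50, "snpB": 150, "snpC": 300, "snpD": 550, "snpE": 700}
print(assign_tiles(snps, hotspots))
# {0: ['snpA'], 1: ['snpB'], 2: ['snpC'], 3: ['snpD'], 4: ['snpE']}
print(tile_selected(0.2, [0.5, 0.005]))  # True: one SNP passes on its own
```

SNPs in selected tiles would then enter the stepwise regressions within each tile and, subsequently, across tiles.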
There were 391 markers in 243 tiles on chromosome 9 and 696 markers in 330 tiles on chromosome 16. All p-values reported are nominal values and are not adjusted for multiple tests.
Because different intra-familial tests of association use different kinds of information and test for different types of association, as described above, the criterion for prioritizing markers was significance at p < 0.05 for at least two different methods.
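The prioritization rule can be expressed as a simple filter. The sketch below applies it to the chromosome 9 p-values for the loge transformed trait as listed in Table 2, with `None` standing for a blank cell; only the two quantitative methods reported in that table are included, and the function name is hypothetical.

```python
def prioritized(pvalues_by_snp, alpha=0.05, min_methods=2):
    """Return SNPs that are significant at `alpha` in at least `min_methods` methods."""
    return [
        snp
        for snp, pvalues in pvalues_by_snp.items()
        if sum(p is not None and p < alpha for p in pvalues.values()) >= min_methods
    ]


# Chromosome 9, loge transformed trait (values from Table 2).
table2 = {
    "rs7847869": {"ASSOC": 0.0014, "tiled": None},
    "rs2676628": {"ASSOC": 0.0044, "tiled": None},
    "rs1871692": {"ASSOC": 0.0077, "tiled": 0.0007},
    "rs944323": {"ASSOC": 0.4435, "tiled": 0.0349},
    "rs2249110": {"ASSOC": 0.0053, "tiled": 0.0132},
    "rs943392": {"ASSOC": 0.2345, "tiled": 0.0279},
    "rs7259": {"ASSOC": 0.0395, "tiled": 0.0169},
    "rs2519760": {"ASSOC": 0.0086, "tiled": None},
    "rs1537189": {"ASSOC": 0.0141, "tiled": 0.0256},
}
print(prioritized(table2))
# ['rs1871692', 'rs2249110', 'rs7259', 'rs1537189']
```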
Results
Linkage
Chromosome 9
Results for the model-independent multi-point linkage analysis on chromosome 9 for both the qualitative and quantitative traits are presented in Figure 1. When the threshold was ≥ 20°, p-values < 0.001 were found in an 8.0 Mb region (rs3737048 to rs1475718), and p-values < 0.0001 in a 4.2 Mb region (rs968477 to rs753659). At a threshold of ≥ 30°, the linked region was similar to that of the 20° threshold, but the p-values in this region were not as significant (p < 0.001, rs968477 to rs753659) as those in the region based on the 20° threshold, most likely due to the smaller number of affected individuals at this higher threshold. When the threshold was ≥ 40°, the region with p-values < 0.01 was less than 0.5 Mb in length (rs944028 to rs568203), which overlaps the region with the most significant p-values (< 0.0001) obtained with a threshold of ≥ 20°. No p-value < 0.05 was found when the affection threshold was ≥ 10°. For the quantitative linkage analysis of the transformed and untransformed trait, p-values < 0.01 were found in the same region as identified by qualitative analysis of FIS and spanned about 5.7 and 6.8 Mb for the untransformed and loge transformed traits, respectively (Figure 1).
Figure 1.
Model-independent multi-point linkage analyses results for STRPs and SNPs on chromosome 9
Chromosome 16
Evidence for linkage on chromosome 16 spanned the centromere but was strongest on the q arm region adjacent to the centromere (Figure 2). As with chromosome 9, the significance of the linkage peak increased as the threshold used to dichotomize the qualitative trait increased from ≥ 10° to ≥ 20°. The most significant p-values were obtained for a threshold of ≥20°, with p-values < 0.01 extending over a 34.6 Mb region that included the centromere. P-values < 0.001 extended over a 3.7 Mb region on 16p11 (rs713547 to rs3116150) and over 8.2 Mb (rs1566467 to rs8051405) on 16q11. For the ≥30° threshold, the linkage region on 16p11 was only significant at the 0.01 level, while the peak on 16q11 remained significant at p-values < 0.001, and was approximately 4.8 Mb long (rs1420263 to rs2518054). No p-values < 0.05 were found in the analysis of the ≥40° threshold, again most likely due to the small number of individuals taken to be affected with FIS, or for the untransformed quantitative trait. For the loge transformed trait, two SNPs (rs1022455 and rs508414) on 16p12.1 resulted in p-values < 0.01, with p-values < 0.05 extending over a 2.6 Mb region. However, this region was not the same as the region from the qualitative analyses (Figure 2).
Figure 2.
Model-independent multi-point linkage analyses results for STRPs and SNPs on chromosome 16
Intra-Familial Tests of Association
Chromosome 9, Qualitative Analysis of FIS with a 20° threshold
For the tests of association for FIS as a qualitative trait with FBAT, two SNPs had nominal p-values < 0.01: rs1306 (132900076 bp, p = 0.009) located in the 3’ untranslated region of GPR107 (G-protein coupled receptor) and rs1536480 (137412564 bp, p = 0.0093) which is not located within a gene.
Chromosome 9, Quantitative Analysis of degree of lateral curvature
Only the results for the loge transformed trait are presented in the Tables, and all significance levels presented are nominal, not adjusted for multiple tests. For ASSOC, five SNPs were significant at a level of p < 0.01 for the loge transformation (Table 2). The most significant association obtained (p = 0.0014) was for rs7847869, located 23 kilobases (kb) from microRNA MIR3134 (miR3134). For the tiled regression analysis, six independent SNPs in six tiles were identified. The most significant of these was rs1871692 (p = 0.0007), which is not located in a known gene, but lies in a characterized region indicative of a regulatory function. Significance levels for the SNPs identified with tiled regression are presented in Table 2 and the final multiple regression model was:
Loge (Degree of lateral curvature + 1) = 0.34 + 0.28 × rs1871692 + 0.19 × rs944323 + 0.22 × rs2249110 − 0.22 × rs943392 − 0.28 × rs7259 − 0.19 × rs1537189.
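As a worked illustration of how such a model is read, the sketch below evaluates the chromosome 9 equation for a given set of genotypes and back-transforms the result to degrees. Two assumptions are made that the text does not state: each SNP is coded as the number of copies (0, 1, or 2) of the modeled allele, and the intercept is read as 0.34. The function name is hypothetical.

```python
import math

INTERCEPT = 0.34  # read from the reported model (assumed positive)
COEFFICIENTS = {
    "rs1871692": 0.28,
    "rs944323": 0.19,
    "rs2249110": 0.22,
    "rs943392": -0.22,
    "rs7259": -0.28,
    "rs1537189": -0.19,
}


def predicted_curvature(genotypes):
    """Predicted degree of lateral curvature for a dict of SNP allele counts.

    SNPs missing from `genotypes` are treated as 0 copies (assumed coding).
    """
    log_value = INTERCEPT + sum(
        coef * genotypes.get(snp, 0) for snp, coef in COEFFICIENTS.items()
    )
    # Invert loge(curvature + 1).
    return math.exp(log_value) - 1


print(round(predicted_curvature({}), 2))                # 0.4
print(round(predicted_curvature({"rs1871692": 1}), 2))  # 0.86
```

Because the model is fitted on the log scale, each coefficient acts multiplicatively on (curvature + 1) rather than adding a fixed number of degrees.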
Four SNPs were significant at the p < 0.05 level for both ASSOC and tiled regression: rs1871692 located in a potential regulatory region, rs2249110 located in an intron of the FAM129B gene, rs7259, a synonymous SNP, located in the CERCAM gene, and rs1537189 not located within a known gene. No SNP was significant across FIS taken both as qualitative and as a quantitative trait.
Table 2.
Quantitative tests of association for markers on chromosome 9.
| SNP | bp^a | ASSOC | Tiled Regression | Location |
|---|---|---|---|---|
| rs7847869 | 114736023 | 0.0014** | | |
| rs2676628 | 115369058 | 0.0044** | | intron KIAA1958 |
| rs1871692 | 116501277 | 0.0077** | 0.0007*** | |
| rs944323 | 119674722 | 0.4435 | 0.0349* | intron ASTN2 |
| rs2249110 | 130329655 | 0.0053** | 0.0132* | intron FAM129B |
| rs943392 | 130725441 | 0.2345 | 0.0279* | intron FAM102A |
| rs7259 | 131196704 | 0.0395* | 0.0169* | CERCAM |
| rs2519760 | 135758421 | 0.0086** | | intron C9orf9 |
| rs1537189 | 137800247 | 0.0141* | 0.0256* | |
^a Map positions obtained from NCBI dbSNP website, GRCh37
Significance: * p-value ≤ 0.05; ** p-value ≤ 0.01; *** p-value ≤ 0.001
Seven SNPs were significant at a level of p < 0.01 for the intra-familial tests of association for the untransformed trait (results not shown). The most significant association obtained was for rs752090 (134550028 bp, p = 0.000002). No SNP was significant at the p < 0.05 level for both the intra-familial test of association and tiled regression, or across FIS taken as both an untransformed quantitative and a qualitative trait.
Chromosome 16, Qualitative Analysis of FIS with a 20° threshold
Eight SNPs were significant at a level of p < 0.01 for the qualitative trait association tests with FBAT (Table 3). The most significant result (p = 0.0004) was for rs2059251, which is not located in a gene.
Table 3.
Qualitative tests of association for markers on chromosome 16.
| SNP | bp^a | FBAT ≥ 20° | Location |
|---|---|---|---|
| rs208600 | 22968910 | 0.0064** | |
| rs4889606 | 31011183 | 0.0029** | intron STX1B |
| rs889548 | 31137712 | 0.0073** | intron MYST1 |
| rs7204626 | 51678263 | 0.0058** | |
| rs1111487 | 53562129 | 0.0050** | |
| rs1544806 | 60692954 | 0.0068** | |
| rs2059251 | 60730790 | 0.0004*** | |
| rs1864148 | 64909469 | 0.0006*** | |
^a Map positions from NCBI dbSNP website, GRCh37
Significance: * p-value ≤ 0.05; ** p-value ≤ 0.01; *** p-value ≤ 0.001
Chromosome 16, Quantitative Analysis of degree of lateral curvature
For the loge transformed trait on chromosome 16, 4 SNPs were significant at a level of p < 0.01 for ASSOC and 7 SNPs were significant at this same level for tiled regression (Table 4). Rs723876 was the most significant SNP with p < 0.001 for ASSOC but it is not located in any known gene. The final multiple regression model for tiled regression was:
Loge (Degree of lateral curvature + 1) = −0.43 − 0.24 × rs229018 + 0.51 × rs876856 + 0.19 × rs2214437 + 0.25 × rs723876 − 0.25 × rs1009303 − 0.31 × rs1861545 − 0.24 × rs733017 − 0.46 × rs899234 − 0.19 × rs1864148 + 0.44 × rs233551 + 0.23 × rs3785133.
Four SNPs were significant at the p < 0.05 level for both ASSOC and tiled regression: rs229018, located in an intron of SYT17; rs723876, not located in a known gene; rs899234, located in an intron of the GNAO1 gene; and rs3785133, located in an intron of the CDH3 gene. No SNP was significant at the p < 0.05 level across FIS taken as both a qualitative and a quantitative trait.
Table 4.
Summary of quantitative statistical tests for markers on chromosome 16.
| SNP | bp^a | ASSOC | Tiled Regression | Location |
|---|---|---|---|---|
| rs229018 | 19222713 | 0.0023** | 0.0291* | intron SYT17 |
| rs876856 | 19872529 | 0.1123 | 0.0016** | intron GPRC5B |
| rs2214437 | 24275388 | 0.1385 | 0.0329* | intron CACNG3 |
| rs723876 | 26886087 | 0.0001*** | 0.0043** | |
| rs1009303 | 49110866 | 0.0531 | 0.0158* | |
| rs1861545 | 50499844 | 0.1523 | 0.0058** | |
| rs733017 | 54517482 | 0.0545 | 0.0093** | |
| rs899234 | 56354079 | 0.0310* | 0.0005*** | intron GNAO1 |
| rs1864148 | 64909469 | 0.0888 | 0.1450 | |
| rs233551 | 66327313 | 0.1062 | 0.0052** | |
| rs3785133 | 68728611 | 0.0225* | 0.0051** | intron CDH3 |
| rs755702 | 64051063 | 0.0071** | | |
| rs1423828 | 65536620 | 0.0082** | | intron LOC283867 |
^a Map positions from NCBI dbSNP website, GRCh37
P-values are from analysis of scoliosis as a quantitative trait, loge transformed curvature.
Significance: * p-value ≤ 0.05; ** p-value ≤ 0.01; *** p-value ≤ 0.001
For the untransformed trait, 2 SNPs were significant at a level of p < 0.01 for ASSOC and 5 SNPs were significant at this same level for tiled regression (results not shown). Three SNPs were significant at the p < 0.05 level for both tiled regression and ASSOC: rs723876, rs1009303, and rs3785133, the last of which is located in an intron of the gene CDH3. No SNP was significant at the p < 0.05 level across FIS taken as both an untransformed quantitative and a qualitative trait.
DISCUSSION
The genomics era has resulted in major advances in the identification of candidate loci for both sporadic and familial IS. We previously identified susceptibility loci on multiple chromosomes in a genome-wide linkage scan for FIS in a large study sample [7]. Candidate regions on chromosomes 9q31-34 and 16p–16q have been replicated in distinct study samples and investigations. Ocaka et al. (2008) mapped a locus for FIS on chromosome 9q31.2-q34.2 in a British sample [9]; Sharma et al. (2011) reported mild associations to this same region [37]. The area on chromosome 16 has been confirmed by J. Morcuende (personal communication) in a sample from the Midwestern United States. In this study, we report results from fine-mapping linkage and intra-familial tests of association analyses of these two regions utilizing custom SNP panels, which corroborated our earlier results and aided in the prioritization of candidate genes for next-generation sequencing in families based on the combined statistical analyses. Because different methods of association use different kinds of information and have different strengths and weaknesses with respect to the underlying model, we use corroboration of significant results across methods as our criterion for prioritization (at least two methods significant at p < 0.05). Most of the corroborative results were from the likelihood-based intra-familial test of association (ASSOC) and tiled regression, which identifies the set of independent variants that, in the context of all variants in a tile, best predict the degree of lateral curvature.
Given the large number of SNPs tested, the multiple phenotypes used, and the number of association tests carried out, none of the SNPs were significant after adjusting for multiple tests (Bonferroni significance level of p = 0.00002). Association analyses identified several SNPs within the linkage peaks that were significant at the nominal p < 0.01 level, some approaching but not reaching Bonferroni significance. Four SNPs on chromosome 9 and four SNPs on chromosome 16 met our prioritization criteria (see Tables 2 and 4). Six of the SNPs are located in genes or in characterized regions, resulting in five candidate genes: FAM129B and CERCAM on chromosome 9 and SYT17, GNAO1, and CDH3 on chromosome 16.
On chromosome 9, rs2249110 lies in an intron of the FAM129B gene, a target of the MAP kinase-signaling cascade in human melanoma cells that may play a role in apoptosis suppression [38]. Rs7259 lies within the CERCAM gene, which produces a cerebral endothelial cell adhesion molecule potentially involved in leukocyte transmigration across the blood-brain barrier [39]. The last SNP, rs1871692, is not located in a gene, but is in a DNase sensitive region indicative of a regulatory region [39]. This SNP is less than 350 kb from two SNPs (rs4979321 and rs891725) found to be mildly associated (p = 0.0009 and p = 0.0003, respectively) in a genome-wide association test of 419 families with IS [37].
On chromosome 16, rs229018 lies within an intron of SYT17, a member of the synaptotagmin protein family. These proteins are characterized by a specific N-terminal transmembrane region, a variable linker, and two C-terminal domains and act in membrane trafficking between cells [39]. Rs3785133 lies within the cadherin 3 (CDH3) gene, a member of the cadherin superfamily. The encoded protein is a calcium-dependent cell-cell adhesion glycoprotein comprised of five extracellular cadherin repeats, a transmembrane region and a conserved cytoplasmic tail [40]. Aberrant expression of this protein is noted in some tissue-specific cancers. Mutations in the CDH3 gene are responsible for hypotrichosis with juvenile macular dystrophy (HJMD), an autosomal recessive disorder characterized by sparse scalp hair and early blindness [41]. Rs899234 is located within an intron of guanine nucleotide binding protein, alpha (GNAO1). GNAO1, a heterotrimeric G protein, is related to neuronal growth cone control and synaptic activities, but specific functional roles are unknown [42]. This is notable given recent data suggestive of neural cell adhesion molecules and axonal guidance neurodevelopmental pathways as potential mechanisms contributing to idiopathic scoliosis pathogenesis [37].
As with many other complex diseases (e.g., breast cancer, diabetes, familial hypercholesterolemia) that initially had fairly substantial linkage peaks, no definitive associations were found. The failure of the intra-familial tests of association to identify a single definitive SNP responsible for most of the linkage signal was not unexpected. Such a SNP would be expected only if FIS were genetically homogeneous and caused by a single simple Mendelian variant. Results from segregation analyses, linkage analyses and genome-wide association studies for FIS suggest that this is clearly not the case. Most likely, FIS is genetically heterogeneous, probably not simply caused by common variants, but rather by both common and rare variants that differ from family to family.
In summary, we have utilized high-density SNP genotyping panels and statistical analyses to confirm previous work showing a significant relationship between loci on chromosomes 9 and 16 and the FIS phenotype and to prioritize candidate genes for either targeted, whole exome or whole genome next generation sequencing. Ultimately, these findings will lead to sequencing of regions under the linkage peaks and the identification of both common and rare variants within genes and regulatory regions which are related to the pathophysiology of FIS.
ACKNOWLEDGEMENTS
This work was supported by the Center for Inherited Disease Research [federal contract from the National Institutes of Health to Johns Hopkins University, contract number N01-HG-65403]; S.A.G.E. [National Center for Research Resources grant 1 P41 RR03655]; LARRK Foundation [private donation]; National Institutes of Health [R01-AR048862-01A1]; and the Division of Intramural Research of the National Human Genome Research Institute, National Institutes of Health. Additional SNP genotyping was performed at The SNP Center, Genetic Resources Core Facility, McKusick/Nathans Institute of Genetic Medicine, Johns Hopkins School of Medicine.
REFERENCES
- 1. Weinstein SL. Adolescent idiopathic scoliosis: prevalence and natural history. In: Weinstein SL, editor. The Pediatric spine: principles and practice. New York: Raven Press; 1994. pp. 463–478.
- 2. Daffner SD, Beimesch CF, Wang JC. Geographic and demographic variability of cost and surgical treatment of idiopathic scoliosis. Spine (Phila Pa 1976) 2010;35:1165–1169. doi: 10.1097/BRS.0b013e3181d88e78.
- 3. Wise CA, Barnes R, Gillum J, Herring JA, Bowcock AM, et al. Localization of susceptibility to familial idiopathic scoliosis. Spine. 2000;25:2372–2380. doi: 10.1097/00007632-200009150-00017.
- 4. Salehi LB, Mangino M, De Serio S, De Cicco D, Capon F, et al. Assignment of a locus for autosomal dominant idiopathic scoliosis (IS) to human chromosome 17p11. Hum Genet. 2002;111:401–404. doi: 10.1007/s00439-002-0785-4.
- 5. Chan V, Fong GC, Luk KD, Yip B, Lee MK, et al. A genetic locus for adolescent idiopathic scoliosis linked to chromosome 19p13.3. Am J Hum Genet. 2002;71:401–406. doi: 10.1086/341607.
- 6. Justice CM, Miller NH, Marosy B, Zhang J, Wilson AF. Familial idiopathic scoliosis: evidence of an X-linked susceptibility locus. Spine. 2003;28:589–594. doi: 10.1097/01.BRS.0000049940.39801.E6.
- 7. Miller NH, Justice CM, Marosy B, Doheny KF, Pugh E, et al. Identification of candidate regions for familial idiopathic scoliosis. Spine. 2005;30:1181–1187. doi: 10.1097/01.brs.0000162282.46160.0a.
- 8. Gao X, Gordon D, Zhang D, Browne R, Helms C, et al. CHD7 gene polymorphisms are associated with susceptibility to idiopathic scoliosis. Am J Hum Genet. 2007;80:957–965. doi: 10.1086/513571.
- 9. Ocaka L, Zhao C, Reed JA, Ebenezer ND, Brice G, et al. Assignment of two loci for autosomal dominant adolescent idiopathic scoliosis to chromosomes 9q31.2-q34.2 and 17q25.3-qtel. J Med Genet. 2008;45:87–92. doi: 10.1136/jmg.2007.051896.
- 10. Raggio CL, Giampietro PF, Dobrin S, Zhao C, Dorshorst D, et al. A novel locus for adolescent idiopathic scoliosis on chromosome 12p. J Orthop Res. 2009;27:1366–1372. doi: 10.1002/jor.20885.
- 11. Alden KJ, Marosy B, Nzegwu N, Justice CM, Wilson AF, et al. Idiopathic scoliosis: identification of candidate regions on chromosome 19p13. Spine. 2006;31:1815–1819. doi: 10.1097/01.brs.0000227264.23603.dc.
- 12. Takahashi Y, Kou I, Takahashi A, Johnson TA, Kono K, et al. A genome-wide association study identifies common variants near LBX1 associated with adolescent idiopathic scoliosis. Nat Genet. 2011;43:1237–1240. doi: 10.1038/ng.974.
- 13. Fan YH, Song YQ, Chan D, Takahashi Y, Ikegawa S, et al. SNP rs11190870 near LBX1 is associated with adolescent idiopathic scoliosis in southern Chinese. J Hum Genet. 2012. doi: 10.1038/jhg.2012.11.
- 14. Miller NH, Schwab DL, Sponseller PD, Manolio TA, Pugh EW, et al. Characterization of idiopathic scoliosis in a clinically well-defined population. Clin Orthop Relat Res. 2001:349–357. doi: 10.1097/00003086-200111000-00045.
- 15. Kane WJ. Scoliosis prevalence: a call for a statement of terms. Clin Orthop Relat Res. 1977:43–46.
- 16. Pruijs JE, Hageman MA, Keessen W, van der Meer R, van Wieringen JC. Variation in Cobb angle measurements in scoliosis. Skeletal Radiol. 1994;23:517–520. doi: 10.1007/BF00223081.
- 17. Mehta SS, Modi HN, Srinivasalu S, Chen T, Suh SW, et al. Interobserver and intraobserver reliability of Cobb angle measurement: endplate versus pedicle as bony landmarks for measurement: a statistical analysis. J Pediatr Orthop. 2009;29:749–754. doi: 10.1097/BPO.0b013e3181b72550.
- 18. Sambrook J, Fritsch EF, Maniatis T. Molecular cloning: a laboratory manual. Cold Spring Harbor, N.Y.: Cold Spring Harbor Laboratory Press; 1989.
- 19. Fan JB, Oliphant A, Shen R, Kermani BG, Garcia F, et al. Highly parallel SNP genotyping. Cold Spring Harb Symp Quant Biol. 2003;68:69–78. doi: 10.1101/sqb.2003.68.69.
- 20. Gunderson KL, Kruglyak S, Graige MS, Garcia F, Kermani BG, et al. Decoding randomly ordered DNA arrays. Genome Res. 2004;14:870–877. doi: 10.1101/gr.2255804.
- 21. S.A.G.E. Statistical Analysis for Genetic Epidemiology. (6.0.1 ed) http://darwin.cwru.edu/.
- 22. Boehnke M, Cox NJ. Accurate inference of relationships in sib-pair linkage studies. Am J Hum Genet. 1997;61:423–429. doi: 10.1086/514862.
- 23. Broman KW, Weber JL. Estimation of pairwise relationships in the presence of genotyping errors. Am J Hum Genet. 1998;63:1563–1564. doi: 10.1086/302112.
- 24. O'Connell JR, Weeks DE. PedCheck: a program for identification of genotype incompatibilities in linkage analysis. Am J Hum Genet. 1998;63:259–266. doi: 10.1086/301904.
- 25. Wigginton JE, Cutler DJ, Abecasis GR. A note on exact tests of Hardy-Weinberg equilibrium. Am J Hum Genet. 2005;76:887–893. doi: 10.1086/429864.
- 26. Barrett JC, Fry B, Maller J, Daly MJ. Haploview: analysis and visualization of LD and haplotype maps. Bioinformatics. 2005;21:263–265. doi: 10.1093/bioinformatics/bth457.
- 27. Pugh EW, Mandal DM, Wilson AF. A graphical approach for presenting linkage results from a genomic screen. Genet Epidemiol. 1995;12:807–812. doi: 10.1002/gepi.1370120646.
- 28. Rabinowitz D, Laird N. A unified approach to adjusting association tests for population admixture with arbitrary pedigree structure and arbitrary missing marker information. Hum Hered. 2000;50:211–223. doi: 10.1159/000022918.
- 29. Laird NM, Horvath S, Xu X. Implementing a unified approach to family-based tests of association. Genet Epidemiol. 2000;19(Suppl 1):S36–S42. doi: 10.1002/1098-2272(2000)19:1+<::AID-GEPI6>3.0.CO;2-M.
- 30. Horvath S, Xu X, Lake SL, Silverman EK, Weiss ST, et al. Family-based tests for associating haplotypes with general phenotype data: application to asthma genetics. Genet Epidemiol. 2004;26:61–69. doi: 10.1002/gepi.10295.
- 31. Lake SL, Blacker D, Laird NM. Family-based tests of association in the presence of linkage. Am J Hum Genet. 2000;67:1515–1525. doi: 10.1086/316895.
- 32. Wilson AF, Kim Y, Sung H, Cai JL, McMahon FJ, et al. Tiled regression: the use of regression methods in hotspot defined genomic segments to identify independent genetic variants responsible for variation in quantitative traits. Genetic Epidemiology. 2009;33:793–793.
- 33. Kim Y, Justice C, Sung H, Cai JL, Sorant AJM, et al. Tests of Association for Family Data: Tiled Regression with Generalized Estimation Equations. Genetic Epidemiology. 2010;34:962–962.
- 34. Sorant AJM, Cai JL, Sung H, Kim Y, Wilson AF. TiledReg: Software Implementation of Tiled Regression. Genetic Epidemiology. 2010;34:984–985.
- 35. Sung H, Kim Y, Cai J, Cropp CD, Simpson CL, et al. A Comparison of Results from Tests of Association in Unrelated Individuals on Collapsed and Uncollapsed Sequence Variants Using Tiled Regression in the GAW17 Data. BMC Genetics Proceedings. 2011. doi: 10.1186/1753-6561-5-S9-S15.
- 36. Zeger SL, Liang KY. Longitudinal data analysis for discrete and continuous outcomes. Biometrics. 1986;42:121–130.
- 37. Sharma S, Gao X, Londono D, Devroy SE, Mauldin KN, et al. Genome-wide association studies of adolescent idiopathic scoliosis suggest candidate susceptibility genes. Hum Mol Genet. 2011;20:1456–1466. doi: 10.1093/hmg/ddq571.
- 38. Chen S, Evans HG, Evans DR. FAM129B/MINERVA, a Novel Adherens Junction-associated Protein, Suppresses Apoptosis in HeLa Cells. J Biol Chem. 2011;286:10201–10209. doi: 10.1074/jbc.M110.175273.
- 39. Rosenbloom KR, Dreszer TR, Pheasant M, Barber GP, Meyer LR, et al. ENCODE whole-genome data in the UCSC Genome Browser. Nucleic Acids Res. 2010;38:D620–D625. doi: 10.1093/nar/gkp961.
- 40. Shimoyama Y, Yoshida T, Terada M, Shimosato Y, Abe O, et al. Molecular cloning of a human Ca2+-dependent cell-cell adhesion molecule homologous to mouse placental cadherin: its low expression in human placental tissues. J Cell Biol. 1989;109:1787–1794. doi: 10.1083/jcb.109.4.1787.
- 41. Jelani M, Salman Chishti M, Ahmad W. A novel splice-site mutation in the CDH3 gene in hypotrichosis with juvenile macular dystrophy. Clin Exp Dermatol. 2009;34:68–73. doi: 10.1111/j.1365-2230.2008.02933.x.
- 42. Nishimura-Akiyoshi S, Niimi K, Nakashiba T, Itohara S. Axonal netrin-Gs transneuronally determine lamina-specific subdendritic segments. Proc Natl Acad Sci U S A. 2007;104:14801–14806. doi: 10.1073/pnas.0706919104.


