American Journal of Human Genetics. 2024 Jul 11;111(7):1258–1260. doi: 10.1016/j.ajhg.2024.05.010

Exploring the noncoding genome with chromosomal structural rearrangements

Cynthia Casson Morton
PMCID: PMC11308653  PMID: 38996468

Abstract

Highlighting the Distinguished Speakers Symposium on “The Future of Human Genetics and Genomics,” this collection of articles is based on presentations at the ASHG 2023 Annual Meeting in Washington, DC, in celebration of all our field has accomplished in the past 75 years, since the founding of ASHG in 1948.

Main text

It is my pleasure to speak with you today on the last day of the 75th anniversary celebration of the American Society of Human Genetics. I’ve had a career-long fascination with chromosomes and the technologies that have expanded their usefulness in diagnostics of human disorders. Today I will speak about our noncoding genome, the dark matter, and chromosomal structural rearrangements that result in inherited and de novo germline disorders.

I understand that TED talks are meant to tell a story, so I will begin by reminding you about the cytogenetic methods that have deepened our understanding of human biology, leading us from what I fondly refer to as the “original genome scan” through banding, FISH, and microarrays to the “James Webb telescope view,” in which we have learned about our genome at the nucleotide level and how variation in our genome contributes to health and disease.

I will pose three questions. The first is: why do we solve only about 50% of rare diseases with all the omics tools that we now have? I’m going to tell you about a project we undertook called DGAP (Developmental Genome Anatomy Project). DGAP is a paradigm of gene discovery in human genetics. The hypothesis is that chromosomal rearrangements in individuals with congenital anomalies can be etiologic in their abnormal phenotype through disruption or dysregulation of genes critical in human development. And I’m proud to report that there were many diagnostic successes from this project … but not all cases were solved.

I’m going to speak about long noncoding RNAs (lncRNAs).1,2 There have been many attempts at nomenclature and classification of lncRNAs by the HUGO Gene Nomenclature Committee, the GENCODE consortium, and others, predominantly based on their genomic position and orientation relative to protein-coding genes.2 Linking lncRNAs to nearby genes has been useful because it provides context and has sometimes provided clues to lncRNA function (e.g., in regulating gene expression).1 Another characteristic is that there are various lncRNA biotypes. Divergent head-to-head (XH) and antisense inside (XI) lncRNAs comprise 19%–27% and 20%–21%, respectively, of total lncRNAs, representing the two largest lncRNA biotypes in the human and mouse genomes. Particularly notable is that such lncRNAs are transcribed in the opposite orientation with respect to the protein-coding gene. These lncRNAs are located closer to the nearest protein-coding gene than a long intergenic noncoding RNA (lincRNA) is, and closer than two protein-coding genes are to each other, with the exception of protein-coding genes that overlap another protein-coding gene.
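To make the position-and-orientation idea concrete, here is a minimal Python sketch that labels a lncRNA relative to one neighboring protein-coding gene. It is an illustration under stated assumptions, not the classification procedure used by any consortium or by the cited studies: the `Gene` record, the `classify_lncrna` function, the 2,000-base-pair `max_gap` cutoff, and all coordinates and gene names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Gene:
    """A gene span on one chromosome (toy model, hypothetical fields)."""
    name: str
    start: int   # leftmost coordinate, inclusive
    end: int     # rightmost coordinate, inclusive
    strand: str  # "+" (forward) or "-" (reverse)

    @property
    def tss(self) -> int:
        # Transcription start site: left end on "+", right end on "-".
        return self.start if self.strand == "+" else self.end


def classify_lncrna(lnc: Gene, coding: Gene, max_gap: int = 2000) -> str:
    """Return "XH", "XI", or "other" for a lncRNA and one coding gene.

    Assumed working definitions (illustrative only):
      XI: opposite strand, lncRNA lies wholly inside the coding gene span.
      XH: opposite strand, no overlap, the two start sites sit back to
          back within max_gap bp, so transcription runs in diverging
          directions.
    """
    if lnc.strand == coding.strand:
        return "other"  # both biotypes here require antisense orientation

    # Antisense inside: lncRNA fully contained in the coding gene span.
    if coding.start <= lnc.start and lnc.end <= coding.end:
        return "XI"

    overlaps = lnc.start <= coding.end and coding.start <= lnc.end
    if not overlaps:
        if coding.strand == "+":
            # lncRNA must sit to the left and be read leftward.
            upstream = lnc.end < coding.start
            gap = coding.tss - lnc.tss
        else:
            # lncRNA must sit to the right and be read rightward.
            upstream = lnc.start > coding.end
            gap = lnc.tss - coding.tss
        if upstream and 0 < gap <= max_gap:
            return "XH"
    return "other"


# Toy coordinates, invented for illustration only.
coding = Gene("GENE_A", 10_000, 20_000, "+")
divergent = Gene("GENE_A-AS1", 8_000, 9_500, "-")
embedded = Gene("GENE_A-AS2", 14_000, 15_000, "-")
convergent = Gene("LNC_X", 21_000, 23_000, "-")

for lnc in (divergent, embedded, convergent):
    print(lnc.name, classify_lncrna(lnc, coding))
```

The sketch checks strand first because both biotypes discussed here are defined by antisense orientation; a real annotation pipeline would also have to handle partial overlaps, multiple neighboring genes, and transcript-level rather than gene-level coordinates.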

Question 2 is: how might cytogenetic rearrangements be used to interpret variants in lncRNAs etiologic in clinical phenotypes? This is the story of two DGAP cases. DGAP353 begins with a referral about two decades ago for an increased maternal serum risk for Down syndrome. The karyotype of the fetus from an amniocentesis revealed an apparently balanced t(14;17)(q24.3;q23) translocation, and maternal inheritance was subsequently determined. The mother and daughter share a phenotype of mild to moderate hearing loss and otherwise are clinically normal. A negative exome analysis ruled out a known variant in a gene for hearing loss, and a literature search in the region of the chromosome 17 breakpoint revealed an unrelated individual with hearing loss among other clinical findings. The chromosome 17 breakpoint did not overlap TBX2 (T-box transcription factor 2); it was located 355 base pairs upstream of TBX2, within the divergent antisense (AS) RNA 1 for TBX2 (TBX2-AS1) (Figure 1). In Figure 1, TBX2 in the genome browser view is shown underneath the relevant topologically associating domain (TAD) region of chromosome 17. It is located quite close to TBX2-AS1. The position of the breakpoint within TBX2-AS1 is indicated by a gold vertical bar. We hypothesize that disruption of one allele of TBX2-AS1 results in a loss of function of TBX2. Another characteristic typical of lncRNAs is that the protein-coding gene, here TBX2, is larger than its companion divergent lncRNA, TBX2-AS1. Consistent with this characteristic, TBX2 has seven exons (on the forward DNA strand) and TBX2-AS1 has three exons (on the reverse DNA strand).

Figure 1.


DGAP353

DGAP353 has an apparently balanced t(14;17)(q24.3;q23) in the mother that is inherited by her daughter. The chromosome 17 breakpoint, indicated by the gold vertical bar, disrupts TBX2-AS1 at 355 base pairs centromeric to TBX2 and does not overlap TBX2. TBX2-AS1 is located on the reverse strand in a divergent orientation to TBX2. We hypothesize that the TBX2-AS1 allele disrupted by the translocation results in a loss of function that affects regulation of TBX2.

Shortly thereafter TBX2 was described as a master regulator of hair cell fate in the mouse inner ear.3 The normal mouse inner ear has a single row of inner hair cells and three rows of outer hair cells akin to that in humans. The mechanosensory inner hair cells synapse onto neurons to transmit sensory information to the brain and the outer hair cells selectively amplify auditory inputs. Ablation of Tbx2 in a mouse inner hair cell results in a total of four rows of outer hair cells, and ectopic expression of Tbx2 in an outer hair cell prevents an outer hair cell from transdifferentiating into an inner hair cell. Tbx2 is necessary and sufficient to make inner hair cells distinct from outer hair cells and maintain this difference throughout development.3 The mouse genome has a similarly annotated region and will be our focus for substantiating a loss of function in Tbx2-AS1 in the mouse model associated with hearing loss.

The second DGAP case I’ll share is DGAP103,4 an eight-year-old boy referred to DGAP with a complex phenotype of facial dysmorphism, extreme somatic overgrowth, advanced endochondral bone and dental ages, bilateral bowing of the legs, arthritis, brachydactyly of hands and feet, and multiple subcutaneous lipomas. These clinical features are clearly uncommon in a young boy. His parents recognized his overgrowth (height, weight, and head circumference all at or above the 95th percentile), with upper incisors erupting at three months of age and eight teeth present by five months of age. At eight years he was at the 50th percentile for a 15-year-old boy, and his final height was reported as 7 feet 8.5 inches. A de novo pericentric inversion was detected in chromosome 12 involving bands p11.22 and q14.3, designated inv(12)(p11.22q14.3)dn.

So, what role might lncRNAs play in the DGAP103 phenotype? In DGAP103, the gene HMGA2, encoding an architectural factor in the high-mobility group (HMG) of proteins with three AT-hook domains, is disrupted in 12q14.3 by the chromosomal inversion (Figure 2A, vertical gold bar). The three AT-hook domains in exons 1–3 bind to the minor groove of AT-rich DNA, alter chromatin architecture, and provide accessibility to transcription factors. The antisense inside lncRNA HMGA2-AS1 transcript is wholly embedded on the reverse DNA strand in the large third intron of HMGA2, separating the N-terminal end of this protein from the lncRNA HMGA2-AS1 (Figure 2A). The 5′ region of the HMGA2 transcript encoding the three AT-hook domains becomes inverted into the short arm of chromosome 12 at p11.22 and resides on the reverse strand. Within the TAD on 12p is located PTHLH, the gene encoding parathyroid hormone-like hormone; loss-of-function variants of PTHLH are etiologic in brachydactyly type 4 (Figure 2B). The precise mechanism of a loss of function for PTHLH, likely mediated by the inversion, remains to be determined. However, the repositioning of the AT-hook domains to 12p on the reverse strand, centromeric to PTHLH, may underlie both a loss of function (i.e., in PTHLH) and a gain of function (i.e., overgrowth syndrome) in DGAP103.

Figure 2.


DGAP103

(A) DGAP103 has a pericentric inversion in chromosome 12 that disrupts HMGA2, which encodes an architectural protein that binds DNA to provide accessibility to transcription factors. The gold vertical bar marks the inversion breakpoint in 12q. The antisense inside HMGA2-AS1 transcript is telomeric to the breakpoint on the long arm of chromosome 12 and separates the three AT-hook domains from the C-terminal end of HMGA2. We hypothesize that the phenotype of overgrowth in DGAP103 results from dysregulation of the AT-hook domains.

(B) The 5′ region of HMGA2, with the three exons containing the AT-hook domains, now resides on the reverse strand, centromeric in the TAD on chromosome 12 that also includes PTHLH. Loss-of-function mutations and deletions in PTHLH are etiologic in type 4 brachydactyly. Although the mechanism underlying PTHLH loss of function remains to be elucidated, the striking phenotype of brachydactyly in the setting of long bone overgrowth leads one to hypothesize that the pericentric inversion in DGAP103 causes both a loss of function (i.e., PTHLH) and a gain of function (i.e., dysregulated AT-hooks).

Other cases of DGAP have yet to be explored for a potential role of a lncRNA in the participant’s phenotype, and many of these “DGAP-type” cases reside in the laboratories of members of the ASHG audience who have had the privilege to study chromosomes. These cases await further investigation. This is a “shout out” to the community of cytogeneticists! We have an opportunity once again to step forward from “the original genome scan” to “the James Webb telescope version.” Let’s stand together and seize this next opportunity with our knowledge of structural rearrangements to serve individuals who await our insight into their biology and their potential future therapies.

References

  1. Luo S., Lu J.Y., Liu L., Yin Y., Chen C., Han X., Wu B., Xu R., Liu W., Yan P., et al. Divergent lncRNAs regulate gene expression and lineage differentiation in pluripotent cells. Cell Stem Cell. 2016;18:637–652. doi: 10.1016/j.stem.2016.01.024.
  2. Mattick J.S., Amaral P.P., Carninci P., Carpenter S., Chang H.Y., Chen L.L., Chen R., Dean C., Dinger M.E., Fitzgerald K.A., et al. Long non-coding RNAs: definitions, functions, challenges and recommendations. Nat. Rev. Mol. Cell Biol. 2023;24:430–447. doi: 10.1038/s41580-022-00566-8.
  3. García-Añoveros J., Clancy J.C., Foo C.Z., García-Gómez I., Zhou Y., Homma K., Cheatham M.A., Duggan A. Tbx2 is a master regulator of inner versus outer hair cell differentiation. Nature. 2022;605:298–303. doi: 10.1038/s41586-022-04668-3.
  4. Ligon A.H., Moore S.D.P., Parisi M.A., Mealiffe M.E., Harris D.J., Ferguson H.L., Quade B.J., Morton C.C. Constitutional rearrangement of the architectural factor HMGA2: a novel human phenotype including overgrowth and lipomas. Am. J. Hum. Genet. 2005;76:340–348. doi: 10.1086/427565.

Articles from American Journal of Human Genetics are provided here courtesy of American Society of Human Genetics
