Proceedings of the National Academy of Sciences of the United States of America. 2010 Jan 13;107(Suppl 1):1765–1771. doi: 10.1073/pnas.0906222107

Genomic disorders: A window into human gene and genome evolution

Claudia M B Carvalho a, Feng Zhang a, James R Lupski a,b,c,1
PMCID: PMC2868291  PMID: 20080665

Abstract

Gene duplications alter the genetic constitution of organisms and can be a driving force of molecular evolution in humans and the great apes. In this context, the study of genomic disorders has uncovered the essential role played by the genomic architecture, especially low copy repeats (LCRs) or segmental duplications (SDs). In fact, regardless of the mechanism, LCRs can mediate or stimulate rearrangements, inciting genomic instability and generating dynamic and unstable regions prone to rapid molecular evolution. In humans, copy-number variation (CNV) has been implicated in common traits such as neuropathy, hypertension, color blindness, infertility, and behavioral traits including autism and schizophrenia, as well as disease susceptibility to HIV, lupus nephritis, and psoriasis among many other clinical phenotypes. The same mechanisms implicated in the origin of genomic disorders may also play a role in the emergence of segmental duplications and the evolution of new genes by means of genomic and gene duplication and triplication, exon shuffling, exon accretion, and fusion/fission events.

Keywords: chromosomal rearrangements, low copy repeats, segmental duplications, copy-number variation

Genomic Disorders Result from Copy-Number Variation

One decade ago, the concept of genomic disorders was proposed on the basis of two major premises: first, the conveyed clinical phenotype results not from a point mutation but from a genomic rearrangement; and second, the DNA rearrangement results from instability incited by genome architectural features (1, 2). It was considered that elucidating the rules for the mechanisms of human genomic rearrangements could potentially provide insights into what regions of the human genome are susceptible to instability. Structural variation can produce copy-number variation (CNV) that has been implicated in Mendelian diseases and common traits such as obesity (3, 4), neurobehavioral traits (4–8), and craniofacial features (9, 10), as well as in sporadic diseases (1, 2, 11, 12). The clinical phenotype conferred will vary depending on the genes and the genomic region involved and may result from distinct mechanisms including gene dosage effects, gene disruption, and position effects or by unmasking a recessive allele (13–16).

Mechanistically, the instability and thus mutability of our genome can be facilitated by the ubiquitous presence of repeat sequences, such as low copy repeats (LCRs) or segmental duplications (SDs), as well as by the presence of repetitive sequences such as short interspersed nuclear elements (SINEs) and long interspersed nuclear elements (LINEs). Characterization of many genomic rearrangements causative of human diseases revealed two rearrangement types that could be distinguished at a given locus: recurrent and nonrecurrent rearrangements. Recurrent rearrangements have the same size and fixed breakpoints that cluster in LCRs (17); these LCRs can act as homologous recombination substrates. Nonrecurrent rearrangements have varied sizes and breakpoints for each patient. The mapping and delineation of the smallest region of overlap (SRO) pertaining to nonrecurrent duplications and deletions in a given patient cohort can be used to delineate the genes or regulatory sequences within the dosage-sensitive genomic interval mediating the phenotypic consequences of the genomic change. A subtype of the nonrecurrent rearrangements is characterized by one breakpoint grouping, but not clustering, in a genomic region. The breakpoint grouping can be coincident with genomic intervals laden with genomic sequence elements able to form unusual non-B DNA structures, such as hairpins and cruciforms, potentially stimulating specific mechanisms that drive nonrecurrent rearrangements (17).
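Computationally, the smallest region of overlap is the intersection of the rearranged intervals observed in a cohort. The following is a minimal sketch, assuming each patient's deletion or duplication is summarized as a (start, end) pair on the same chromosome. The function name and the coordinates are hypothetical and serve only as illustration; they are not data from the studies cited here.

```python
from typing import Iterable, Optional, Tuple

Interval = Tuple[int, int]  # (start, end) on one chromosome, with start <= end


def smallest_region_of_overlap(intervals: Iterable[Interval]) -> Optional[Interval]:
    """Return the segment shared by every interval, or None if no such segment exists."""
    intervals = list(intervals)
    if not intervals:
        return None
    # The shared segment starts at the rightmost start and ends at the leftmost end.
    start = max(s for s, _ in intervals)
    end = min(e for _, e in intervals)
    return (start, end) if start <= end else None


# Hypothetical nonrecurrent rearrangements (illustrative coordinates, not real data).
cohort = [(100, 900), (250, 1200), (50, 700)]
sro = smallest_region_of_overlap(cohort)
```

In practice the resulting interval would then be intersected with gene and regulatory annotations to nominate dosage-sensitive candidates; that step is omitted from this sketch.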

The Human Genome Is Enriched in Both Repeated and Repetitive Sequences

LCRs were defined as intrachromosomal duplications ≥10 kb in length and with ≥97% sequence identity that probably arose by duplication of genomic segments resulting in paralogous regions of the human genome (15). SDs were defined as segments of DNA containing ≥90% of sequence identity and ≥1 kb in length (18); both terms are used interchangeably. In contrast, repetitive sequences were defined much earlier (1968) by Britten and Kohne using reassociation kinetics (19), therefore constituting a different class of repeats. LCRs/SDs became apparent during mechanistic studies of genomic disorders, and their genomewide nature was independently revealed during studies of the sequence of the draft haploid human genome (20, 21); these were not revealed by reassociation kinetics. In fact, as much as 5.4% of our genome is duplicated (≥1 kb and ≥90% identity) (22). Also, 52% of the remaining gaps in the reference haploid human genome, regions refractory to all techniques available at the moment, are flanked by LCRs with >90% identity (23). According to the Human Genome Sequence Consortium, “by far, the most difficult regions of the genome were those containing near-exact segmental duplications” (23). The analysis of the (almost) finished human euchromatic genome sequence provided in 2004 enabled the confirmation of several remarkable LCR features already documented by previous studies. LCRs are present across the entire human genome, they can be inter- or intrachromosomal, and they often contain partial or complete gene sequences with intron–exon structure. In addition, LCRs can be classified into three categories: pericentromeric, subtelomeric, or interstitial. Pericentromeric and subtelomeric LCRs are biased toward interchromosomal LCRs and organized as a complex mosaic of duplications; by contrast, interstitial LCRs are enriched for interspersed LCRs (22).
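The two operational definitions quoted above differ only in their thresholds, which can be made explicit in a short sketch. The function below applies the cutoffs literally as stated in the text (LCR: intrachromosomal, ≥10 kb, ≥97% identity; SD: ≥1 kb, ≥90% identity). The function name and its arguments are hypothetical, and real annotation pipelines involve alignment steps that are not modeled here.

```python
def classify_duplication(length_bp: int, identity_pct: float, intrachromosomal: bool) -> set:
    """Label a duplicated segment using the thresholds quoted in the text."""
    labels = set()
    # SD definition: at least 1 kb long with at least 90% sequence identity.
    if length_bp >= 1_000 and identity_pct >= 90.0:
        labels.add("SD")
    # Original LCR definition: intrachromosomal, at least 10 kb, at least 97% identity.
    if intrachromosomal and length_bp >= 10_000 and identity_pct >= 97.0:
        labels.add("LCR")
    return labels


# An 11.3-kb repeat with 99% identity on the same chromosome meets both definitions.
both = classify_duplication(11_300, 99.0, intrachromosomal=True)
# A 2-kb interchromosomal segment with 92% identity meets only the SD definition.
sd_only = classify_duplication(2_000, 92.0, intrachromosomal=False)
```

Because every segment that satisfies the stricter LCR cutoffs also satisfies the SD cutoffs, the sketch shows why the two terms can be used interchangeably for long, highly identical repeats.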

Proximal 17p as a Model for Human Genomic Disorders and Evolution

Studies of rearrangements involving chromosome 17p11.2p12 showed that the proximal 17p chromosome is marked by several direct and inverted interspersed LCRs (2). The ∼7.5-Mb LCR-rich region evolved to a complex genomic architecture that involved serial segmental duplication events during primate evolution (24), often emanating from preferential LCR-containing genomic intervals or cores, some regions with apparent increased mutation rates and others with apparent reduced recombination (12, 25–27), potentially reflecting inversion polymorphisms. It is also the site of the breakpoint for an evolutionary translocation, t(4;19), that occurred in an ancestral gorilla chromosome (28). Four genomic disorders, Charcot–Marie–Tooth disease type 1A [CMT1A (MIM 118220)], hereditary neuropathy with liability to pressure palsies [HNPP (MIM 162500)], Smith–Magenis microdeletion syndrome [SMS (MIM 182290)] and Potocki–Lupski microduplication syndrome [PTLS (MIM 610883)], map to this region. CMT1A is a length-dependent distal symmetric polyneuropathy caused by a 1.4-Mb duplication generated by nonallelic homologous recombination (NAHR) between the distal CMT1A-REP and proximal CMT1A-REP (29). HNPP is a milder condition with susceptibility to asymmetric neuropathy; it results from the reciprocal deletion of the same genomic segment (30). SMS is a multiple congenital anomaly mental retardation syndrome with obesity, sleep disturbance, and specific behavioral abnormalities due to a recurrent 3.7-Mb deletion generated by NAHR between two LCRs, the so-called proximal and distal SMS-REPs (27, 31). PTLS is due to the reciprocal duplication and manifests as failure to thrive and neurobehavioral abnormalities, including features of autism (6, 32). 
NAHR between the same LCRs can produce the microdeletion or the reciprocal duplication disorder (6, 32–34), but the large number of LCRs spanning the region can also result in uncommon recurrent rearrangements that use alternative LCRs as homologous recombination (HR) substrates (35). Furthermore, nonrecurrent deletions/duplications are potentially stimulated by other LCRs in the region (12, 32, 36).

The 17p11.2p12 region also undergoes other structural variations observed in the population such as an inversion involving the distal SMS-REP and middle SMS-REPs [Database of Genomic Variants (DGV), http://projects.tcag.ca/variation/]. Furthermore, structural changes therein can occur somatically and be associated with cancer (37–40). The common breakpoint of the i(17q) chromosome, formally idic(17)(p11.2), frequently observed in patients with hematological malignancies associated with poor prognosis, maps to the LCRs REPA and -B located between middle and proximal SMS-REPs (41). The same region is very polymorphic within the population (42). A summary of the characterized proximal 17p evolutionary, constitutional (i.e., germ-line), and somatic rearrangement events is shown in Fig. 1.

Fig. 1.

Schematic representation of the genome architecture susceptible to rearrangements in the proximal chromosome 17p. The low copy repeats are shown in rectangles (color-coded or similar symbols for given repeats), along with the distribution of the rearrangement breakpoints. (Upper) Diverse alterations (constitutional, evolutionary, somatic) thus far documented for this region. They are color coded for matching the involved segment on 17p. The green horizontal arrow below represents the recurrent duplication and deletion causative of CMT1A and HNPP, respectively; purple horizontal arrows represent the recurrent deletion and duplication causative of SMS and PTLS (3.7 Mb) and the recurrent but uncommon deletion causative of SMS (∼5 Mb). Black arrows below represent the uncommon nonrecurrent deletions and duplications causative of SMS and PTLS, respectively. Solid black line: marker chromosome breakpoints. (Lower) Schematic representation of the isodicentric chromosome 17q, formally designated idic(17)(p11.2), generated according to the model proposed by Barbouti et al. (41) and adapted, with permission, from refs. 12 and 113.

The presence of the specific dosage-sensitive gene within the 17p11.2p12 chromosome was demonstrated by the identification of rare patients who had disease-causative point mutations in the dosage-sensitive gene rather than large genomic rearrangements including that gene: for example, patients with HNPP and without a deletion who had loss-of-function PMP22 point mutations (nonsense/frameshift) and rare CMT1A patients without duplication who instead had gain-of-function PMP22 point mutations (43–45). Dosage alteration of the retinoic acid inducible 1 (RAI1) gene causes most of the clinical phenotypes observed in patients with SMS, an observation also supported by mouse models (10, 46–48). Furthermore, nonsense and frameshift point mutations within that gene were detected in patients with SMS who did not have a genomic deletion of RAI1, implicating haploinsufficiency as a major contributing factor for the disease (46–48). Point mutations leading to gain-of-function are predicted to cause PTLS but such patients have not yet been identified.

Delineation of the NAHR Mechanism Enabled the Prediction of Novel Genomic Disorders

NAHR is a frequent mechanism underlying disease-associated genomic rearrangements. LCRs are the usual substrates for NAHR due to their high degree of sequence identity. Experimental observations have implicated the existence of recombination hotspots for the occurrence of the crossovers within the LCRs (33, 49, 50). Two LCRs involved in a particular NAHR can be interchromosomal, intrachromosomal, or intrachromatidal, and they can be either directly or inversely oriented to each other. The rearrangement will generate different products accordingly, i.e., duplication, deletion, inversion, or translocation (1, 15). Experimentally, deletions occur twice as often as duplications during meiosis in male germ cells (51).

The high frequency of interspersed LCRs in the human genome predicts many regions of genomic instability that could potentially undergo NAHR-mediated rearrangements and be associated with genomic disorders. In a “genome-first” approach, Sharp et al. (7) developed a BAC-based array Comparative Genomic Hybridization (aCGH) designed to interrogate 130 genomic intervals flanked by directly orientated LCRs >10 kb in length, with >95% identity, and within a distance of 50 kb to 5 Mb. Such an approach was used to screen patient cohorts with defined phenotypes, such as mental retardation and congenital anomalies, enabling the detection of five microdeletions (at 17q21.31, 17q12, 15q24, 15q13.3, and 1q21.1) and further description of five novel genomic disorders (7). Therefore, knowledge of the NAHR mechanism has played a pivotal role in uncovering new human syndromes with profound consequences for clinical genetics.
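The selection criteria described for the Sharp et al. array design amount to a simple filter over pairs of LCRs. The sketch below encodes only the thresholds stated in the text (directly oriented, >10 kb, >95% identity, 50 kb to 5 Mb apart). The `LCRPair` type, the field names, and the example values are hypothetical illustrations and do not reproduce the authors' actual pipeline or data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LCRPair:
    length_bp: int          # length of the shorter repeat copy
    identity_pct: float     # sequence identity between the two copies
    direct: bool            # True if the copies are in the same orientation
    distance_bp: int        # genomic distance separating the two copies


def is_candidate_interval(pair: LCRPair) -> bool:
    """Apply the thresholds quoted in the text for intervals flanked by LCRs."""
    return (
        pair.direct
        and pair.length_bp > 10_000
        and pair.identity_pct > 95.0
        and 50_000 <= pair.distance_bp <= 5_000_000
    )


# Hypothetical pairs (values chosen for illustration only).
direct_pair = LCRPair(length_bp=24_000, identity_pct=98.0, direct=True, distance_bp=1_400_000)
inverted_pair = LCRPair(length_bp=11_300, identity_pct=99.0, direct=False, distance_bp=100_000)
too_far = LCRPair(length_bp=30_000, identity_pct=99.0, direct=True, distance_bp=8_000_000)
candidates = [p for p in (direct_pair, inverted_pair, too_far) if is_candidate_interval(p)]
```

The orientation test matters because the design targets deletions and duplications; an inverted pair is excluded by the filter even when it satisfies the length and identity thresholds.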

Other Mechanisms Produce Nonrecurrent Rearrangements

Nonrecurrent rearrangements can be generated by NAHR between repetitive sequences such as SINEs and LINEs (36), but other molecular mechanisms are also implicated for their origin, including nonhomologous end joining (NHEJ), fork stalling and template switching (FoSTeS), microhomology-mediated break-induced replication (MMBIR), and retrotransposition (reviewed in refs. 16, 17, 52, 53). NHEJ is one of the repair pathways responsible for double-strand break (DSB) repair in cells. Following detection of DSBs, NHEJ rejoins the broken DNA ends without the requirement for homology; this process requires the preparation of damaged ends using base removal and insertions of new bases, without ensuring sequence restoration around the break (54). FoSTeS is a recently described replication-based mechanism proposed to explain the complex PLP1 duplications at Xq22, associated with the genomic disorder Pelizaeus–Merzbacher disease [PMD (MIM 312080)] (55). It was proposed that during DNA replication the DNA replication forks could stall, and the 3′ end of the newly synthesized strand could resume DNA synthesis on a different template in a second nearby replication fork. Microhomologies between the switched template and the original fork are used to prime replication. DNA deletion or duplication can be generated depending on whether the template switching occurred to a new replication fork located upstream or downstream. Inversions can also be produced depending on the direction of the fork progression and if the leading or the lagging strands are used on the switched template. The disengaging/resuming replication in a different fork/extension process can occur multiple times, producing complex rearrangements (17, 55). The FoSTeS model has been further generalized and the molecular details are provided in the MMBIR model that appears to be operative in all domains of life (52). 
In this model, a nick on the template strand causes the replication fork to collapse as the fork proceeds through the nick. The collapsed fork generates a one-ended, double-stranded DNA end that is resected to expose the 3′ end, which can mediate break-induced replication (BIR) using microhomology to prime the template switch.

Complex rearrangements have been increasingly detected in several genomic disorders owing to the greater resolution of advanced genome technologies. Recent examples include MECP2 duplications (56) [MRXSL (MIM 300260)] and duplications in 17p13.3 involving the PAFAH1B1 (LIS1) and/or the YWHAE (14-3-3ε) genes (57). Remarkably, Zhang et al. (58) observed that as many as 57% of the nonrecurrent PTLS-associated duplications involving the 17p11.2 region can be complex rearrangements. The extent to which FoSTeS/MMBIR is involved in the generation of human structural variation is still undetermined as breakpoint sequences of the complex rearrangements are particularly difficult to obtain.

Genomic Architecture Incites Rearrangements

The presence of LCRs in a specific genomic region increases the probability of occurrences of new rearrangements, such as duplications, deletions, gene conversions, and inversions therein or in the flanking segments (36). Indeed, the association between structural variation and LCRs has been shown in several studies, including those examining the genome of different populations (59–63), those analyzing individual genomic loci (through human-specific disease studies) (1, 9, 12, 32, 55, 56, 64–70), and genomic evolutionary studies (24, 71). In fact, it has been shown that between human and chimpanzee ∼70–80% of inversions and ∼40% of deletions/duplications map to regions containing LCRs (71). Interestingly, the unique regions flanking segmental duplications are ∼10 times more likely to become duplicated compared to other randomly distributed regions, a phenomenon termed “duplication shadowing” (22) that partially explains the nonrandom distribution and the complex mosaic patterns of LCRs. This observation is supported by a recent comparative study in primates where it was shown that LCRs do not arise randomly, but are likely to arise within or adjacent to another LCR already present (24, 72, 73). Therefore, the unique regions flanking LCRs will eventually undergo rearrangements that can either create new LCRs or add new complexities to the previous one; additionally, ectopic homologous recombination and gene conversion can produce homogenization, maintaining the sequence conservation within the LCRs (15).

The role of the LCRs in recurrent rearrangements as substrates for NAHR is well established (1). However, LCRs can also be associated with nonrecurrent rearrangements generated by FoSTeS/MMBIR. Inoue et al. (68) analyzed families with PMD due to deletion of the PLP1 gene at Xq22 and found the distal breakpoints in two of three cases were embedded in LCRs. This finding was supported by the results of Lee et al. (55, 74) who studied PMD patients carrying PLP1 duplications. Later a statistically significant association between LCRs and the distal breakpoints of duplications involving the MECP2 gene in male patients with neurodevelopmental delay was shown (56). Approximately 77% (23/30) of the distal duplication breakpoints map within or near one of the LCRs (LCRJ and LCRK) that are located 47 and 201 kb telomeric to the MECP2 gene. LCRJ is formed by two genes that constitute the Opsin array, OPN1LW (long-wave sensitive) and OPN1MW (middle-wave sensitive). In vertebrates, the visual pigments are the products of five families of Opsin genes that probably have arisen by multiple gene duplication events at least 540 million years ago (Mya) (reviewed in ref. 75). The ability to absorb three different wavelengths (short, medium, and long) in the retina is not found among many mammals and constitutes a distinctive feature of the primates. Such evolutionary acquisition enabled primates to see three primary colors (blue, green, and red), changing their vision from dichromatic to trichromatic. From a molecular evolutionary standpoint, the event that enabled trichromatic vision in Catarrhines was the duplication of the X chromosome Opsin gene that occurred ∼35 Mya followed by gene diversification (75–77). Interestingly, in humans, differences regarding the sensitivity to distinguish red–green colors are very common. 
As many as 8% of Caucasian males present color-vision defects and polymorphisms resulting from frequent chromosomal rearrangements and gene conversions at the Opsin locus (78). Of note, the evolutionary timing of the switch from dichromatic to trichromatic color vision coincides with the loss of many functional olfactory receptor genes and may reflect increased dependence of higher primates on vision versus olfaction to sense one’s environment (79).

In our cohort of patients with MECP2 duplication, the strongest breakpoint bias was observed in patients with complex rearrangements (triplications embedded within duplications) (56) in which both duplication and triplication breakpoints map within a low copy repeat pair termed LCRs K (Fig. 2). The LCRK1 and LCRK2 are positioned in an inverted orientation with respect to each other, have 99% sequence identity, and are 11.3 kb in length (56, 80). The region between the LCRs K, which contains the FLNA and EMD genes, is inverted in 18% of individuals of European descent (81). Nonrecurrent deletions involving one of the LCRs K and the EMD gene have been reported to cause X-linked Emery–Dreifuss muscular dystrophy [EDMD (MIM 310300)] (81). Caceres et al. (82) identified the presence of the LCRs K in diverse eutherians, suggesting that they are derived from an ancestral duplication and probably have a single common origin. In addition, inversion events occurred at least 10 independent times along the eutherian evolution (82).

Fig. 2.

Schematic representation of the MECP2 telomeric region. (Top) Blue boxes represent the pathogenic rearrangements documented in the literature thus far: distal breakpoint grouping of most of the patients with MECP2 duplications, deletions and/or gene conversions of the Opsin genes that cause color blindness, and deletions of the EMD gene that cause Emery–Dreifuss muscular dystrophy (EDMD). (Middle) The genomic context telomeric to MECP2. LCRJ spans 114 kb and is formed by three genes and/or pseudogenes that constitute the Opsin array, OPN1LW, OPN1MW, and TEX28. The nearby LCRs, K1 and K2, are positioned in inverted orientation, have 99% sequence identity, and are 11.3 kb in length. Hatched bars within arrows inside the LCRs K represent the small region that is 100% identical between them. Blue arrows show alignment of the join points of the patients carrying complex rearrangements (triplications embedded in duplications). (Bottom) Human structural variation (yellow rectangles) includes CNVs and inversions; evolutionary genomic rearrangements (orange rectangles) include the duplication of the Opsin gene and further acquisition of the trichromatic color vision during the primate evolution in addition to a recurrent inversion that has been occurring multiple times in eutherians. *, based on data reported in Carvalho et al. (56).

Duplication Rearrangements and the Emergence of Novel Traits

In his seminal work, Ohno (83) proposed that gene duplications coupled with rapid sequence diversification may play a fundamental role in evolution. Increasing evidence from experimental studies in diverse organisms has confirmed his prediction. In fact, duplications may act as a “reservoir” for producing adaptive phenotypes (84), but they can also cause a dramatic increase in the dosage of specific genes, producing an immediate advantageous effect (85).

In primates, LCRs are implicated in lineage-specific gene creation and potentially in speciation as well. A comparative study between human and chimpanzee genomes estimated that 2.7% of euchromatic sequences were differentially duplicated between chimpanzee and human (86). In contrast, single-base pair differences account for 1.2% of the genetic difference (87). Therefore, some of the genes that distinguish human from chimpanzee arose and/or expanded as LCRs. The salivary amylase gene (AMY1), which encodes a protein that catalyzes the first step in digestion of dietary starch and glycogen, constitutes an interesting example. It has approximately three times more copies in humans compared to chimpanzees and copy-number differences correlate positively with the higher levels of salivary amylase protein (88). The copy number of AMY1 shows evidence of positive selection in populations with high-starch consumption, suggesting that its copy-number increase in humans was selectively favored due to the concomitant increase of starch consumption in agricultural societies (88). Also, the human-specific amplification of the aquaporin-7 gene (AQP7), coupled to positive selection, provides another example of adaptive traits that emerged after lineage-specific gene duplication; the aquaporin-7 protein is involved in water, glycerol, and urea membrane transport and may have contributed to enabling the human capacity for endurance running (89, 90).

Using interspecies cDNA CGH in five hominoid species including humans, Fortna et al. (89) identified 140 genes showing human lineage-specific variation in copy number, most of them (134/140) due to amplification. Several genes are implicated in neuronal function, including a neurotransmitter transporter for γ-aminobutyric acid (GABA) (SLC6A13) and the gene encoding the neuronal apoptosis inhibitory protein (NAIP), which is suspected to have a role in neuronal proliferation and/or brain size in humans (89). Remarkably, they showed that the neuronal-expressed DUF1220 domain, which presents the highest number of copies in humans compared to other primates, is apparently under positive selection (91). It is estimated that perhaps as many as 34 human genes encode a DUF1220 domain; these genes map to several genomic sites on chromosome 1 with the majority localized to the 1q21.1 region. Rearrangements of 1q21.1 are associated with congenital anomalies, mental retardation, and neuropsychiatric phenotypes (7, 9, 92). Sikela et al. found a high correlation between DUF1220 domain copy number and human head circumference, suggesting that DUF1220 domains may have a role in shaping human brain size (93). Brunetti-Pierri et al. (9) recently showed that 1q21.1 deletion is associated with microcephaly whereas 1q21.1 duplication is associated with macrocephaly. In fact, rearrangements involving 1q21.1 represent an interesting example of copy-number variation causing developmental and behavioral phenotypes. Notably, 63.6% of the content of the 1q21 chromosome sequence is represented by LCRs. Recent studies showed recurrent deletions and duplications in patients with a broad range of clinical phenotypes including dysmorphic features, congenital anomalies, mental retardation, and neuropsychiatric conditions such as attention deficit hyperactivity disorder (ADHD), autism, anxiety/depression, and antisocial behavior (7, 9, 92). 
Deletions have also been recently associated with schizophrenia (5, 94). The association of microdeletion with microcephaly and duplication with macrocephaly can be potentially explained by the copy-number alteration of the human-specific paralog of the gene HYDIN. In mice, mutations causing premature termination of the Hydin gene product were reported to cause hydrocephalus (95). The 1q21.1 paralog HYDIN copy results from a 360-kb interchromosomal duplication from the 16q22.2 segment (96) containing the original HYDIN gene.

Exon Shuffling and the Emergence of New Genes

Along with gene duplication, exon shuffling is also implicated in the generation of novel genes and proteins and, once more, the LCRs might play a pivotal role underlying that event. In 1978, Walter Gilbert launched the concept of exon shuffling when he proposed that recombination between introns could rearrange exons, creating new transcription units, and consequently new proteins could be formed (97). In the primate lineage, including humans, there is evidence of exon shuffling generating novel genes, e.g., the creation of testis-specific genes (98, 99). Some additional evidence for exon shuffling observed in human and mouse subjects is listed in Table 1.

Table 1.

Examples of exon shuffling and their potential mechanisms

| Organism | Involved gene | Mechanism | Microhomology at breakpoint | Reference |
|---|---|---|---|---|
| Mouse | αA-crystallin | Illegitimate recombination* | CCCAT | (123) |
| Mouse | Gnb5, Myo5a | Nonhomologous recombination* | GG | (124) |
| Human | LDL receptor, EGF precursor | NA | NA | (125, 126) |
| Human | Multiple genes | L1 retrotransposition | NA | (127) |
| Human | Kua, UEV | Gene fusion | NA | (128) |
| Human | PMCHL1, MCH | Retrotransposition | NA | (129) |
| Human | ATM | Retrotransposition | NA | (130) |
| Human | TRE-2 (USP6) | Gene fusion | NA | (99) |
| Human | PIPSL, PIP5K1A, PSMD4 | L1 retrotransposition | NA | (131) |

NA, not available.

*Microhomology was shown at the breakpoint, which can be alternatively interpreted to be caused by the FoSTeS/MMBIR mechanism.

An interesting example of the “birth” of a gene due to duplication and exon shuffling is the proximal CMT1A-REP. This LCR arose by genomic rearrangement whereby exon VI of the COX10 gene and surrounding 25-kb intronic sequences (i.e., distal CMT1A-REP) were duplicated and inserted 1.4 Mb more proximal on 17p within the human–chimpanzee ancestral chromosome. This one event created proximal CMT1A-REP (Fig. 3) and gave birth to two novel genes through exon accretion and fission, respectively (24, 26). Interestingly, both novel genes, HREP and CDRT1, are expressed in humans although they have different tissue specificity: HREP is expressed in heart and skeletal muscle whereas CDRT1 is mainly expressed in pancreas (26, 100). The original COX10 (distal CMT1A-REP) is highly expressed in multiple tissues (101); its protein product farnesylates the heme group incorporated into cytochrome oxidase that is important for mitochondrial function.

Fig. 3.

Duplication of selected LCRs during molecular evolution of the primates (updated from ref. 122). The figure is not to scale. LCRJ, Opsin and TEX28 array at Xq28; LCR15, LCR highly repeated in chromosome 15q11-q14; LCRK, LCR flanking the genes FLNA and EMD at Xq28; PWS/AS, Prader–Willi and Angelman syndromes; DGS, DiGeorge syndrome; SMS, Smith–Magenis syndrome; WBS, Williams–Beuren syndrome; GBA, glucocerebrosidase gene; NEMO, gene mutated in incontinentia pigmenti; PMCHL1/2, chimeric genes derived from the melanin-concentrating hormone gene; NF1, neurofibromatosis 1; CMT1A, Charcot-Marie-Tooth disease type 1A; LCR16a, low copy repeats on chromosome 16; SMN2, gene mutated in spinal muscular atrophy. This figure was adapted, with permission, from ref. 122.

Another example is the hominoid testis-specific gene TRE-2 (USP6) that emerged during primate evolution resulting from the chimeric fusion of two genes, USP32 and TBC1D3 (Table 1) (99). TBC1D3 itself is derived from a segmental duplication that underwent multiple gene duplications during primate evolution (99). Interestingly, TBC1D3 underwent mutations with respect to its closest homolog, USP6NL and acquired the features of an adaptor molecule involved in the macropinocytic process (102).

Can FoSTeS/MMBIR Account for Exon Shuffling Events?

It has been estimated that at least ∼19% of exons in eukaryotic genes were formed by exon shuffling (103). However, the underlying mechanisms are not fully understood. Two mechanisms have been proposed, illegitimate recombination (104) and retrotransposed exon insertion (105); nevertheless, many exon rearrangements are not readily explained by either mechanism. Recently, our group (58) reported complex rearrangements, including triplications, detected at the join points of duplications in patients with PTLS and in patients with PMP22 duplication and deletions. Importantly, the complex patterns implicating FoSTeS/MMBIR could be detected at different levels of genome resolution from involving megabases of the human genome to small genomic intervals containing a single gene or even only one exon (58). The sequencing of the join points of the deletion involving just one exon of the PMP22 gene revealed a complex pattern including a small insertion in an inverted orientation. This complex rearrangement of a coding exon caused by FoSTeS/MMBIR suggests that this mechanism may contribute to exon shuffling (58). The replicative FoSTeS/MMBIR mechanism could readily shuffle any given exon by a template switch before and after that exon anywhere within the flanking introns (58).

Birth Defects: Evolution in Real Time

Structural variation in the human genome encompasses a wide range of different alterations, including aneuploidies, heteromorphisms, fragile sites, repetitive elements, micro- and minisatellites, insertions, deletions, inversions, duplications, balanced and unbalanced translocations, and complex genomic and chromosomal rearrangements (60, 62, 106–110). Some of the changes are large enough to be visualized by light microscopy whereas others require special techniques, e.g., submicroscopic alterations can be detected by CGH (62, 107), whereas inversions can be detected by paired-end sequencing techniques (60, 109, 111) or by PCR-based approaches (112).

In the human genome, the de novo locus-specific mutation rates for genomic rearrangements were estimated on the basis of disease prevalence rates as ∼10⁻⁶ to 10⁻⁴ (113, 114). This range is two to four orders of magnitude greater than the locus-specific rates for base pair changes (∼10⁻⁸). Therefore, CNVs may frequently occur de novo and can be associated with sporadic birth defects. This contention is supported by a recent study on neonates with various birth defects in which a high frequency of de novo pathological CNVs was identified (115). Array CGH (aCGH) was used to screen 638 neonates with different birth defects, including dysmorphic features, multiple congenital anomalies, congenital heart disease, and cleft palate. Pathogenic CNVs were detected in up to 20% of subjects (115). In patients with a clinical indication of suspected chromosomal abnormalities, the rate of de novo CNV detection was as high as 66.7% (115), three times greater than the published rates for chromosome studies.
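The comparison between rearrangement and base pair mutation rates is simple arithmetic on the quoted figures, and can be checked with a short sketch. The variable and function names below are illustrative assumptions; the three rates are the approximate values quoted in the text, not new data.

```python
import math

# Approximate locus-specific de novo rates quoted in the text.
rearrangement_rate_low = 1e-6    # lower bound for genomic rearrangements
rearrangement_rate_high = 1e-4   # upper bound for genomic rearrangements
base_pair_change_rate = 1e-8     # approximate rate for base pair changes


def orders_of_magnitude(rate: float, reference: float) -> int:
    """Return log10(rate / reference), rounded to the nearest integer."""
    return round(math.log10(rate / reference))


fold_low = orders_of_magnitude(rearrangement_rate_low, base_pair_change_rate)
fold_high = orders_of_magnitude(rearrangement_rate_high, base_pair_change_rate)
print(f"Rearrangement rates exceed base pair change rates by "
      f"{fold_low} to {fold_high} orders of magnitude")
```

Rounding the logarithm avoids spurious results from floating-point division, which matters because the inputs are themselves order-of-magnitude estimates.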

Can Structural Variation Produce Atavism?

Atavism is a concept proposed by Darwin in 1868 to describe the reappearance of ancestral characteristics in individuals of a species in later generations. Evidence for atavistic traits has been found in horses and whales (116). In humans, proposed atavistic traits include extra nipples, the ability to move the scalp, natural “earring” holes, and hypertrichosis (117). Hypertrichosis is a rare condition characterized by excessive generalized or localized hairiness (118). Macias-Flores et al. (119) described a family with a severe form of X-linked hypertrichosis. Figuera et al. (118) mapped the X-linked locus to chromosome Xq24-q27.1, but autosomal-dominant inheritance patterns with associated clinical signs, e.g., gingival hyperplasia, skeletal abnormalities, mental retardation, and others, have also been described (reviewed in ref. 120). Recently, Sun et al. (121) mapped the congenital generalized hypertrichosis terminalis (CGHT) trait to chromosome 17q24.2-q24.3 by linkage analysis in three Han Chinese families. They identified nonrecurrent microdeletions in the three CGHT families and one de novo microduplication in a sporadic patient as causative of the trait. The candidate gene has not yet been identified, but they postulated a long-range position effect potentially due to the presence of the SOX9 gene nearby.

Conclusions

The development of the concept of genomic disorders, and the definition of the mechanisms (e.g., NAHR, FoSTeS/MMBIR) that form the rearrangements underlying these conditions, have led to improved clinical ascertainment and the discovery of novel syndromes. Such studies have revealed a great deal of new information about human genome structure and evolution and have delineated the role of the genomic architecture, including repetitive elements (e.g., SINEs and LINEs) and repeat sequences (LCRs/SDs), as a facilitator of genomic instability that can cause disease. Adaptive traits can be driven by structural variation, as exemplified by the amylase (AMY1) copy-number variation associated with the change in human eating habits. Moreover, increasing data regarding human CNVs and how they can convey neuropsychiatric phenotypes suggest that CNVs may play a major role in human cognition and other complex traits.

Footnotes

The authors declare no conflict of interest.

This paper results from the Arthur M. Sackler Colloquium of the National Academy of Sciences, “Evolution in Health and Medicine” held April 2–3, 2009, at the National Academy of Sciences in Washington, DC. The complete program and audio files of most presentations are available on the NAS web site at www.nasonline.org/Sackler_Evolution_Health_Medicine.

This article is a PNAS Direct Submission. D.R.G. is a guest editor invited by the Editorial Board.

References

  • 1. Lupski JR. Genomic disorders: Structural features of the genome can lead to DNA rearrangements and human disease traits. Trends Genet. 1998;14:417–422. doi: 10.1016/s0168-9525(98)01555-8.
  • 2. Lupski JR. Genomic disorders ten years on. Genome Med. 2009;1:42. doi: 10.1186/gm42.
  • 3. Sha BY, et al. Genome-wide association study suggested copy number variation may be associated with body mass index in the Chinese population. J Hum Genet. 2009;54:199–202. doi: 10.1038/jhg.2009.10.
  • 4. Walz K, Paylor R, Yan J, Bi W, Lupski JR. Rai1 duplication causes physical and behavioral phenotypes in a mouse model of dup(17)(p11.2p11.2). J Clin Invest. 2006;116:3035–3041. doi: 10.1172/JCI28953.
  • 5. International Schizophrenia Consortium. Rare chromosomal deletions and duplications increase risk of schizophrenia. Nature. 2008;455:237–241. doi: 10.1038/nature07239.
  • 6. Potocki L, et al. Molecular mechanism for duplication 17p11.2—the homologous recombination reciprocal of the Smith-Magenis microdeletion. Nat Genet. 2000;24:84–87. doi: 10.1038/71743.
  • 7. Sharp AJ, et al. Discovery of previously unidentified genomic disorders from the duplication architecture of the human genome. Nat Genet. 2006;38:1038–1042. doi: 10.1038/ng1862.
  • 8. Sharp AJ, et al. Characterization of a recurrent 15q24 microdeletion syndrome. Hum Mol Genet. 2007;16:567–572. doi: 10.1093/hmg/ddm016.
  • 9. Brunetti-Pierri N, et al. Recurrent reciprocal 1q21.1 deletions and duplications associated with microcephaly or macrocephaly and developmental and behavioral abnormalities. Nat Genet. 2008;40:1466–1471. doi: 10.1038/ng.279.
  • 10. Yan J, et al. Reduced penetrance of craniofacial anomalies as a function of deletion size and genetic background in a chromosome engineered partial mouse model for Smith-Magenis syndrome. Hum Mol Genet. 2004;13:2613–2624. doi: 10.1093/hmg/ddh288.
  • 11. Gu W, Lupski JR. CNV and nervous system diseases—What’s new? Cytogenet Genome Res. 2008;123:54–64. doi: 10.1159/000184692.
  • 12. Stankiewicz P, et al. Genomic disorders: Genome architecture results in susceptibility to DNA rearrangements causing common human traits. Cold Spring Harbor Symp Quant Biol. 2003;68:445–454. doi: 10.1101/sqb.2003.68.445.
  • 13. Henrichsen CN, Chaignat E, Reymond A. Copy number variants, diseases and gene expression. Hum Mol Genet. 2009;18:R1–R8. doi: 10.1093/hmg/ddp011.
  • 14. Lupski JR, Stankiewicz P. Genomic disorders: Molecular mechanisms for rearrangements and conveyed phenotypes. PLoS Genet. 2005;1:e49. doi: 10.1371/journal.pgen.0010049.
  • 15. Stankiewicz P, Lupski JR. Genome architecture, rearrangements and genomic disorders. Trends Genet. 2002;18:74–82. doi: 10.1016/s0168-9525(02)02592-1.
  • 16. Zhang F, Gu W, Hurles ME, Lupski JR. Copy number variation in human health, disease, and evolution. Annu Rev Genomics Hum Genet. 2009;10:451–481. doi: 10.1146/annurev.genom.9.081307.164217.
  • 17. Gu W, Zhang F, Lupski JR. Mechanisms for human genomic rearrangements. Pathogenetics. 2008;1:4. doi: 10.1186/1755-8417-1-4.
  • 18. Bailey JA, et al. Recent segmental duplications in the human genome. Science. 2002;297:1003–1007. doi: 10.1126/science.1072047.
  • 19. Britten RJ, Kohne DE. Repeated sequences in DNA. Hundreds of thousands of copies of DNA sequences have been incorporated into the genomes of higher organisms. Science. 1968;161:529–540. doi: 10.1126/science.161.3841.529.
  • 20. Eichler EE. Masquerading repeats: Paralogous pitfalls of the human genome. Genome Res. 1998;8:758–762. doi: 10.1101/gr.8.8.758.
  • 21. Bailey JA, Yavor AM, Massa HF, Trask BJ, Eichler EE. Segmental duplications: Organization and impact within the current human genome project assembly. Genome Res. 2001;11:1005–1017. doi: 10.1101/gr.187101.
  • 22. Bailey JA, Eichler EE. Primate segmental duplications: Crucibles of evolution, diversity and disease. Nat Rev Genet. 2006;7:552–564. doi: 10.1038/nrg1895.
  • 23. IHGSC. Finishing the euchromatic sequence of the human genome. Nature. 2004;431:931–945. doi: 10.1038/nature03001.
  • 24. Stankiewicz P, Shaw CJ, Withers M, Inoue K, Lupski JR. Serial segmental duplications during primate evolution result in complex human genome architecture. Genome Res. 2004;14:2209–2220. doi: 10.1101/gr.2746604.
  • 25. Bi W, et al. Genes in a refined Smith-Magenis syndrome critical deletion interval on chromosome 17p11.2 and the syntenic region of the mouse. Genome Res. 2002;12:713–728. doi: 10.1101/gr.73702.
  • 26. Inoue K, et al. The 1.4-Mb CMT1A duplication/HNPP deletion genomic region reveals unique genome architectural features and provides insights into the recent evolution of new genes. Genome Res. 2001;11:1018–1033. doi: 10.1101/gr.180401.
  • 27. Park SS, et al. Structure and evolution of the Smith-Magenis syndrome repeat gene clusters, SMS-REPs. Genome Res. 2002;12:729–738. doi: 10.1101/gr.82802.
  • 28. Stankiewicz P, Park SS, Inoue K, Lupski JR. The evolutionary chromosome translocation 4;19 in Gorilla gorilla is associated with microduplication of the chromosome fragment syntenic to sequences surrounding the human proximal CMT1A-REP. Genome Res. 2001;11:1205–1210. doi: 10.1101/gr.181101.
  • 29. Lupski JR, et al. DNA duplication associated with Charcot-Marie-Tooth disease type 1A. Cell. 1991;66:219–232. doi: 10.1016/0092-8674(91)90613-4.
  • 30. Chance PF, et al. DNA deletion associated with hereditary neuropathy with liability to pressure palsies. Cell. 1993;72:143–151. doi: 10.1016/0092-8674(93)90058-x.
  • 31. Chen KS, et al. Homologous recombination of a flanking repeat gene cluster is a mechanism for a common contiguous gene deletion syndrome. Nat Genet. 1997;17:154–163. doi: 10.1038/ng1097-154.
  • 32. Potocki L, et al. Characterization of Potocki-Lupski syndrome (dup(17)(p11.2p11.2)) and delineation of a dosage-sensitive critical interval that can convey an autism phenotype. Am J Hum Genet. 2007;80:633–649. doi: 10.1086/512864.
  • 33. Bi W, et al. Reciprocal crossovers and a positional preference for strand exchange in recombination events resulting in deletion or duplication of chromosome 17p11.2. Am J Hum Genet. 2003;73:1302–1315. doi: 10.1086/379979.
  • 34. Shaw CJ, Bi W, Lupski JR. Genetic proof of unequal meiotic crossovers in reciprocal deletion and duplication of 17p11.2. Am J Hum Genet. 2002;71:1072–1081. doi: 10.1086/344346.
  • 35. Shaw CJ, Withers MA, Lupski JR. Uncommon deletions of the Smith-Magenis syndrome region can be recurrent when alternate low-copy repeats act as homologous recombination substrates. Am J Hum Genet. 2004;75:75–81. doi: 10.1086/422016.
  • 36. Shaw CJ, Lupski JR. Implications of human genome architecture for rearrangement-based disorders: The genomic basis of disease. Hum Mol Genet. 2004;13(Spec No 1):R57–R64. doi: 10.1093/hmg/ddh073.
  • 37. Mendrzyk F, et al. Isochromosome breakpoints on 17p in medulloblastoma are flanked by different classes of DNA sequence repeats. Genes Chromosomes Cancer. 2006;45:401–410. doi: 10.1002/gcc.20304.
  • 38. Babicka L, et al. Complex chromosomal rearrangements in patients with chronic myeloid leukemia. Cancer Genet Cytogenet. 2006;168:22–29. doi: 10.1016/j.cancergencyto.2005.11.017.
  • 39. McCabe MG, et al. High-resolution array-based comparative genomic hybridization of medulloblastomas and supratentorial primitive neuroectodermal tumors. J Neuropathol Exp Neurol. 2006;65:549–561. doi: 10.1097/00005072-200606000-00003.
  • 40. Fabris S, et al. Molecular and transcriptional characterization of 17p loss in B-cell chronic lymphocytic leukemia. Genes Chromosomes Cancer. 2008;47:781–793. doi: 10.1002/gcc.20579.
  • 41. Barbouti A, et al. The breakpoint region of the most common isochromosome, i(17q), in human neoplasia is characterized by a complex genomic architecture with large, palindromic, low-copy repeats. Am J Hum Genet. 2004;74:1–10. doi: 10.1086/380648.
  • 42. Carvalho CM, Lupski JR. Copy number variation at the breakpoint region of isochromosome 17q. Genome Res. 2008;18:1724–1732. doi: 10.1101/gr.080697.108.
  • 43. Nicholson GA, et al. A frame shift mutation in the PMP22 gene in hereditary neuropathy with liability to pressure palsies. Nat Genet. 1994;6:263–266. doi: 10.1038/ng0394-263.
  • 44. Roa BB, et al. Charcot-Marie-Tooth disease type 1A. Association with a spontaneous point mutation in the PMP22 gene. N Engl J Med. 1993;329:96–101. doi: 10.1056/NEJM199307083290205.
  • 45. Valentijn LJ, et al. Identical point mutations of PMP-22 in Trembler-J mouse and Charcot-Marie-Tooth disease type 1A. Nat Genet. 1992;2:288–291. doi: 10.1038/ng1292-288.
  • 46. Slager RE, Newton TL, Vlangos CN, Finucane B, Elsea SH. Mutations in RAI1 associated with Smith-Magenis syndrome. Nat Genet. 2003;33:466–468. doi: 10.1038/ng1126.
  • 47. Bi W, et al. Inactivation of Rai1 in mice recapitulates phenotypes observed in chromosome engineered mouse models for Smith-Magenis syndrome. Hum Mol Genet. 2005;14:983–995. doi: 10.1093/hmg/ddi085.
  • 48. Bi W, et al. Rai1 deficiency in mice causes learning impairment and motor dysfunction, whereas Rai1 heterozygous mice display minimal behavioral phenotypes. Hum Mol Genet. 2007;16:1802–1813. doi: 10.1093/hmg/ddm128.
  • 49. Lupski JR. Hotspots of homologous recombination in the human genome: Not all homologous sequences are equal. Genome Biol. 2004;5:242. doi: 10.1186/gb-2004-5-10-242.
  • 50. Reiter LT, et al. A recombination hotspot responsible for two inherited peripheral neuropathies is located near a mariner transposon-like element. Nat Genet. 1996;12:288–297. doi: 10.1038/ng0396-288.
  • 51. Turner DJ, et al. Germline rates of de novo meiotic deletions and duplications causing several genomic disorders. Nat Genet. 2008;40:90–95. doi: 10.1038/ng.2007.40.
  • 52. Hastings PJ, Ira G, Lupski JR. A microhomology-mediated break-induced replication model for the origin of human copy number variation. PLoS Genet. 2009;5:e1000327. doi: 10.1371/journal.pgen.1000327.
  • 53. Hastings PJ, Lupski JR, Rosenberg SM, Ira G. Mechanisms of change in gene copy number. Nat Rev Genet. 2009;10:551–564. doi: 10.1038/nrg2593.
  • 54. Weterings E, van Gent DC. The mechanism of non-homologous end-joining: A synopsis of synapsis. DNA Repair (Amst). 2004;3:1425–1435. doi: 10.1016/j.dnarep.2004.06.003.
  • 55. Lee JA, Carvalho CM, Lupski JR. A DNA replication mechanism for generating nonrecurrent rearrangements associated with genomic disorders. Cell. 2007;131:1235–1247. doi: 10.1016/j.cell.2007.11.037.
  • 56. Carvalho CM, et al. Complex rearrangements in patients with duplications of MECP2 can occur by fork stalling and template switching. Hum Mol Genet. 2009;18:2188–2203. doi: 10.1093/hmg/ddp151.
  • 57. Bi W, et al. Increased LIS1 expression affects human and mouse brain development. Nat Genet. 2009;41:168–177. doi: 10.1038/ng.302.
  • 58. Zhang F, et al. The DNA replication FoSTeS/MMBIR mechanism can generate genomic, genic and exonic complex rearrangements in humans. Nat Genet. 2009;41:849–853. doi: 10.1038/ng.399.
  • 59. Kidd JM, et al. Mapping and sequencing of structural variation from eight human genomes. Nature. 2008;453:56–64. doi: 10.1038/nature06862.
  • 60. Korbel JO, et al. Paired-end mapping reveals extensive structural variation in the human genome. Science. 2007;318:420–426. doi: 10.1126/science.1149504.
  • 61. Mehan MR, Freimer NB, Ophoff RA. A genome-wide survey of segmental duplications that mediate common human genetic variation of chromosomal architecture. Hum Genomics. 2004;1:335–344. doi: 10.1186/1479-7364-1-5-335.
  • 62. Redon R, et al. Global variation in copy number in the human genome. Nature. 2006;444:444–454. doi: 10.1038/nature05329.
  • 63. Shaw CJ, Lupski JR. Non-recurrent 17p11.2 deletions are generated by homologous and non-homologous mechanisms. Hum Genet. 2005;116:1–7. doi: 10.1007/s00439-004-1204-9.
  • 64. Antonacci F, et al. Characterization of six human disease-associated inversion polymorphisms. Hum Mol Genet. 2009;18:2555–2566. doi: 10.1093/hmg/ddp187.
  • 65. Ben-Shachar S, et al. Microdeletion 15q13.3: A locus with incomplete penetrance for autism, mental retardation, and psychiatric disorders. J Med Genet. 2009;46:382–388. doi: 10.1136/jmg.2008.064378.
  • 66. Edelmann L, et al. A common molecular basis for rearrangement disorders on chromosome 22q11. Hum Mol Genet. 1999;8:1157–1167. doi: 10.1093/hmg/8.7.1157.
  • 67. Groth M, et al. High-resolution mapping of the 8p23.1 beta-defensin cluster reveals strictly concordant copy number variation of all genes. Hum Mutat. 2008;29:1247–1254. doi: 10.1002/humu.20751.
  • 68. Inoue K, et al. Genomic rearrangements resulting in PLP1 deletion occur by nonhomologous end joining and cause different dysmyelinating phenotypes in males and females. Am J Hum Genet. 2002;71:838–853. doi: 10.1086/342728.
  • 69. Pentao L, Wise CA, Chinault AC, Patel PI, Lupski JR. Charcot-Marie-Tooth type 1A duplication appears to arise from recombination at repeat sequences flanking the 1.5 Mb monomer unit. Nat Genet. 1992;2:292–300. doi: 10.1038/ng1292-292.
  • 70. Stankiewicz P, Lupski JR. The genomic basis of disease, mechanisms and assays for genomic disorders. Genome Dyn. 2006;1:1–16. doi: 10.1159/000092496.
  • 71. Kehrer-Sawatzki H, Cooper DN. Molecular mechanisms of chromosomal rearrangement during primate evolution. Chromosome Res. 2008;16:41–56. doi: 10.1007/s10577-007-1207-1.
  • 72. Marques-Bonet T, et al. A burst of segmental duplications in the genome of the African great ape ancestor. Nature. 2009;457:877–881. doi: 10.1038/nature07744.
  • 73. Kolb J, et al. Cruciform-forming inverted repeats appear to have mediated many of the microinversions that distinguish the human and chimpanzee genomes. Chromosome Res. 2009;17:469–483. doi: 10.1007/s10577-009-9039-9.
  • 74. Lee JA, et al. Role of genomic architecture in PLP1 duplication causing Pelizaeus-Merzbacher disease. Hum Mol Genet. 2006;15:2250–2265. doi: 10.1093/hmg/ddl150.
  • 75. Jacobs GH. Primate color vision: A comparative perspective. Vis Neurosci. 2008;25:619–633. doi: 10.1017/S0952523808080760.
  • 76. Hunt DM, et al. Molecular evolution of trichromacy in primates. Vision Res. 1998;38:3299–3306. doi: 10.1016/s0042-6989(97)00443-4.
  • 77. Nathans J, Piantanida TP, Eddy RL, Shows TB, Hogness DS. Molecular genetics of inherited variation in human color vision. Science. 1986;232:203–210. doi: 10.1126/science.3485310.
  • 78. Nathans J, Thomas D, Hogness DS. Molecular genetics of human color vision: The genes encoding blue, green, and red pigments. Science. 1986;232:193–202. doi: 10.1126/science.2937147.
  • 79. Gilad Y, Przeworski M, Lancet D. Loss of olfactory receptor genes coincides with the acquisition of full trichromatic vision in primates. PLoS Biol. 2004;2:E5. doi: 10.1371/journal.pbio.0020005.
  • 80. del Gaudio D, et al. Increased MECP2 gene copy number as the result of genomic duplication in neurodevelopmentally delayed males. Genet Med. 2006;8:784–792. doi: 10.1097/01.gim.0000250502.28516.3c.
  • 81. Small K, Iber J, Warren ST. Emerin deletion reveals a common X-chromosome inversion mediated by inverted repeats. Nat Genet. 1997;16:96–99. doi: 10.1038/ng0597-96.
  • 82. Caceres M, Sullivan RT, Thomas JW. A recurrent inversion on the eutherian X chromosome. Proc Natl Acad Sci USA. 2007;104:18571–18576. doi: 10.1073/pnas.0706604104.
  • 83. Ohno S. Evolution by Gene Duplication. Berlin: Springer-Verlag; 1970.
  • 84. Larkin DM, et al. Breakpoint regions and homologous synteny blocks in chromosomes have different evolutionary histories. Genome Res. 2009;19:770–777. doi: 10.1101/gr.086546.108.
  • 85. Innan H. Population genetic models of duplicated genes. Genetica. 2009;137:19–37. doi: 10.1007/s10709-009-9355-1.
  • 86. Cheng Z, et al. A genome-wide comparison of recent chimpanzee and human segmental duplications. Nature. 2005;437:88–93. doi: 10.1038/nature04000.
  • 87. Fujiyama A, et al. Construction and analysis of a human-chimpanzee comparative clone map. Science. 2002;295:131–134. doi: 10.1126/science.1065199.
  • 88. Perry GH, et al. Diet and the evolution of human amylase gene copy number variation. Nat Genet. 2007;39:1256–1260. doi: 10.1038/ng2123.
  • 89. Fortna A, et al. Lineage-specific gene duplication and loss in human and great ape evolution. PLoS Biol. 2004;2:E207. doi: 10.1371/journal.pbio.0020207.
  • 90. Lupski JR. An evolution revolution provides further revelation. BioEssays. 2007;29:1182–1184. doi: 10.1002/bies.20686.
  • 91. Popesco MC, et al. Human lineage-specific amplification, selection, and neuronal expression of DUF1220 domains. Science. 2006;313:1304–1307. doi: 10.1126/science.1127980.
  • 92. Mefford HC, et al. Recurrent rearrangements of chromosome 1q21.1 and variable pediatric phenotypes. N Engl J Med. 2008;359:1685–1699. doi: 10.1056/NEJMoa0805384.
  • 93. Dumas L, Sikela JM. DUF1220 domains, cognitive disease, and human brain evolution. Cold Spring Harbor Symp Quant Biol. 2009; in press. doi: 10.1101/sqb.2009.74.025.
  • 94. Stefansson H, et al. Large recurrent microdeletions associated with schizophrenia. Nature. 2008;455:232–236. doi: 10.1038/nature07229.
  • 95. Davy BE, Robinson ML. Congenital hydrocephalus in hy3 mice is caused by a frameshift mutation in Hydin, a large novel gene. Hum Mol Genet. 2003;12:1163–1170. doi: 10.1093/hmg/ddg122.
  • 96. Doggett NA, et al. A 360-kb interchromosomal duplication of the human HYDIN locus. Genomics. 2006;88:762–771. doi: 10.1016/j.ygeno.2006.07.012.
  • 97. Gilbert W. Why genes in pieces? Nature. 1978;271:501. doi: 10.1038/271501a0.
  • 98. Babushok DV, Ostertag EM, Kazazian HH Jr. Current topics in genome evolution: Molecular mechanisms of new gene formation. Cell Mol Life Sci. 2007;64:542–554. doi: 10.1007/s00018-006-6453-4.
  • 99. Paulding CA, Ruvolo M, Haber DA. The Tre2 (USP6) oncogene is a hominoid-specific gene. Proc Natl Acad Sci USA. 2003;100:2507–2511. doi: 10.1073/pnas.0437015100.
  • 100. Inoue K, Lupski JR. Molecular mechanisms for genomic disorders. Annu Rev Genomics Hum Genet. 2002;3:199–242. doi: 10.1146/annurev.genom.3.032802.120023.
  • 101. Murakami T, Reiter LT, Lupski JR. Genomic structure and expression of the human heme A:farnesyltransferase (COX10) gene. Genomics. 1997;42:161–164. doi: 10.1006/geno.1997.4711.
  • 102. Frittoli E, et al. The primate-specific protein TBC1D3 is required for optimal macropinocytosis in a novel ARF6-dependent pathway. Mol Biol Cell. 2008;19:1304–1316. doi: 10.1091/mbc.E07-06-0594.
  • 103. Long M, Betran E, Thornton K, Wang W. The origin of new genes: Glimpses from the young and old. Nat Rev Genet. 2003;4:865–875. doi: 10.1038/nrg1204.
  • 104. van Rijk A, Bloemendal H. Molecular mechanisms of exon shuffling: Illegitimate recombination. Genetica. 2003;118:245–249.
  • 105. Esnault C, Maestre J, Heidmann T. Human LINE retrotransposons generate processed pseudogenes. Nat Genet. 2000;24:363–367. doi: 10.1038/74184.
  • 106. Feuk L, Carson AR, Scherer SW. Structural variation in the human genome. Nat Rev Genet. 2006;7:85–97. doi: 10.1038/nrg1767.
  • 107. Iafrate AJ, et al. Detection of large-scale variation in the human genome. Nat Genet. 2004;36:949–951. doi: 10.1038/ng1416.
  • 108. Sebat J, et al. Large-scale copy number polymorphism in the human genome. Science. 2004;305:525–528. doi: 10.1126/science.1098918.
  • 109. Tuzun E, et al. Fine-scale structural variation of the human genome. Nat Genet. 2005;37:727–732. doi: 10.1038/ng1562.
  • 110. Zhang F, Carvalho CM, Lupski JR. Complex human chromosomal and genomic rearrangements. Trends Genet. 2009;25:298–307. doi: 10.1016/j.tig.2009.05.005.
  • 111. Bailey JA, Kidd JM, Eichler EE. Human copy number polymorphic genes. Cytogenet Genome Res. 2008;123:234–243. doi: 10.1159/000184713.
  • 112. Flores M, et al. Recurrent DNA inversion rearrangements in the human genome. Proc Natl Acad Sci USA. 2007;104:6099–6106. doi: 10.1073/pnas.0701631104.
  • 113. Lupski JR. Genomic rearrangements and sporadic disease. Nat Genet. 2007;39:S43–S47. doi: 10.1038/ng2084.
  • 114. van Ommen GJ. Frequency of new copy number variation in humans. Nat Genet. 2005;37:333–334. doi: 10.1038/ng0405-333.
  • 115. Lu XY, et al. Genomic imbalances in neonates with birth defects: High detection rates by using chromosomal microarray analysis. Pediatrics. 2008;122:1310–1318. doi: 10.1542/peds.2008-0297.
  • 116. Hall BK. Atavisms and atavistic mutations. Nat Genet. 1995;10:126–127. doi: 10.1038/ng0695-126.
  • 117. Cantu JM, Ruiz C. On atavisms and atavistic genes. Ann Genet. 1985;28:141–142.
  • 118. Figuera LE, Pandolfo M, Dunne PW, Cantu JM, Patel PI. Mapping of the congenital generalized hypertrichosis locus to chromosome Xq24-q27.1. Nat Genet. 1995;10:202–207. doi: 10.1038/ng0695-202.
  • 119. Macias-Flores MA, et al. A new form of hypertrichosis inherited as an X-linked dominant trait. Hum Genet. 1984;66:66–70. doi: 10.1007/BF00275189.
  • 120. Garcia-Cruz D, Figuera LE, Cantu JM. Inherited hypertrichoses. Clin Genet. 2002;61:321–329. doi: 10.1034/j.1399-0004.2002.610501.x.
  • 121. Sun M, et al. Copy-number mutations on chromosome 17q24.2-q24.3 in congenital generalized hypertrichosis terminalis with or without gingival hyperplasia. Am J Hum Genet. 2009;84:807–813. doi: 10.1016/j.ajhg.2009.04.018.
  • 122. Stankiewicz P, Lupski JR. Molecular-evolutionary mechanisms for genomic disorders. Curr Opin Genet Dev. 2002;12:312–319. doi: 10.1016/s0959-437x(02)00304-0.
  • 123. van Rijk AA, de Jong WW, Bloemendal H. Exon shuffling mimicked in cell culture. Proc Natl Acad Sci USA. 1999;96:8074–8079. doi: 10.1073/pnas.96.14.8074.
  • 124. Jones JM, et al. The mouse neurological mutant flailer expresses a novel hybrid gene derived by exon shuffling between Gnb5 and Myo5a. Hum Mol Genet. 2000;9:821–828. doi: 10.1093/hmg/9.5.821.
  • 125. Sudhof TC, et al. Cassette of eight exons shared by genes for LDL receptor and EGF precursor. Science. 1985;228:893–895. doi: 10.1126/science.3873704.
  • 126. Sudhof TC, Goldstein JL, Brown MS, Russell DW. The LDL receptor gene: A mosaic of exons shared with different proteins. Science. 1985;228:815–822. doi: 10.1126/science.2988123.
  • 127. Moran JV, DeBerardinis RJ, Kazazian HH Jr. Exon shuffling by L1 retrotransposition. Science. 1999;283:1530–1534. doi: 10.1126/science.283.5407.1530.
  • 128. Thomson TM, et al. Fusion of the human gene for the polyubiquitination coeffector UEV1 with Kua, a newly identified gene. Genome Res. 2000;10:1743–1756. doi: 10.1101/gr.gr-1405r.
  • 129. Courseaux A, Nahon JL. Birth of two chimeric genes in the Hominidae lineage. Science. 2001;291:1293–1297. doi: 10.1126/science.1057284.
  • 130. Ejima Y, Yang L. Trans mobilization of genomic DNA as a mechanism for retrotransposon-mediated exon shuffling. Hum Mol Genet. 2003;12:1321–1328. doi: 10.1093/hmg/ddg138.
  • 131. Babushok DV, et al. A novel testis ubiquitin-binding protein gene arose by exon shuffling in hominoids. Genome Res. 2007;17:1129–1138. doi: 10.1101/gr.6252107.

Articles from Proceedings of the National Academy of Sciences of the United States of America are provided here courtesy of National Academy of Sciences
