Abstract
Collagens, or more precisely collagen-based extracellular matrices, are often considered as a metazoan hallmark. Among the collagens, fibrillar collagens are present from sponges to humans, and are involved in the formation of the well-known striated fibrils. In this review we discuss the different steps in the evolution of this protein family, from the formation of an ancestral fibrillar collagen gene to the formation of different clades. Genomic data from the choanoflagellate (sister group of Metazoa) Monosiga brevicollis, and from diploblast animals, have suggested that the formation of an ancestral α chain occurred before the metazoan radiation. Phylogenetic studies have suggested an early emergence of the three clades that were first described in mammals. Hence the duplication events leading to the formation of the A, B and C clades occurred before the eumetazoan radiation. Another important event has been the two rounds of “whole genome duplication” leading to the amplification of fibrillar collagen gene numbers, and the importance of this diversification in developmental processes. We will also discuss some other aspects of fibrillar collagen evolution such as the development of the molecular mechanisms involved in the formation of procollagen molecules and of striated fibrils.
Keywords: fibrillar collagen, extracellular matrix, metazoan evolution
1. Introduction
Collagen, in all its forms, represents the most abundant protein in animals. The collagens represent a heterogeneous family of extracellular matrix glycoproteins containing at least one triple helical domain and are generally involved in the formation of supramolecular networks [1,2]. All of the collagen molecules are made up of three α chains that may or may not be identical. At the primary structure level, the sequence of α chains involved in the formation of triple-helical structure consists of repeating Gly-Xaa-Yaa triplets and is called the collagenous domain or triple helix motif. Thus, the triple-helical structure corresponds to a right-handed superhelix resulting from the intertwining of the collagenous domains of three α chains, each of which adopts a polyproline II-like left-handed conformation. Collagens are often considered as a metazoan hallmark, even if proteins possessing triple-helical motifs have been identified in viruses, bacteria, fungi and the protists choanoflagellates [2–5]. Moreover, some other metazoan proteins contain a triple helix but are not members of the collagen family. These include several humoral proteins included in the so-called defense collagen family, and implicated in innate immunity [2,6].
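The Gly-Xaa-Yaa repeat described above can be illustrated with a minimal sketch that scans a protein sequence for its longest uninterrupted run of such triplets. The function name `longest_gxy_run` and the short test peptide are hypothetical and chosen only for illustration; real collagenous domains would normally be identified with curated domain annotations rather than this simple scan.

```python
def longest_gxy_run(seq: str) -> tuple[int, int]:
    """Return (start_index, n_triplets) of the longest uninterrupted
    Gly-Xaa-Yaa run in a one-letter amino acid sequence.

    A run is a series of consecutive three-residue units whose first
    residue is glycine (G). Returns (0, 0) if no such unit is found.
    """
    seq = seq.upper()
    best_start, best_n = 0, 0
    # Scan the three possible registers so a run may start at any position.
    for frame in range(3):
        i = frame
        while i + 3 <= len(seq):
            if seq[i] != "G":
                i += 3
                continue
            start, n = i, 0
            # Extend the run while each complete triplet starts with Gly.
            while i + 3 <= len(seq) and seq[i] == "G":
                n += 1
                i += 3
            if n > best_n:
                best_start, best_n = start, n
    return best_start, best_n


# Hypothetical peptide: three Gly-Xaa-Yaa triplets flanked by other residues.
print(longest_gxy_run("MKTGPPGAPGEKAAL"))
```

A glycine substitution, such as those mentioned later for types XXIV and XXVII, would split one long run into two shorter ones in this simple scheme.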
Among the 28 different types identified in vertebrates [7,8], basement membrane type IV and the fibrillar collagens are the only ones to have been hitherto described from sponges to humans [9,10]. Type IV collagen is one of the major constituents of basement membranes where it forms a three-dimensional network. It has been characterized in Homoscleromorpha, the only sponge group presenting basement membrane-like structures. Interestingly, in another sponge group devoid of basement membrane structure, the Demospongiae, a short-chain collagen family evolutionarily related to type IV collagens has been described. These short-chain collagens seem to be present only in invertebrates with the notable exception of Ecdysozoa [11].
The fibrillar collagens are present in almost all animals, and are the components of the well-known striated fibrils. A prototypal fibrillar procollagen α chain consists of an uninterrupted collagenous domain or major triple helix made up of approximately 338 Gly-Xaa-Yaa triplets, this region being flanked by two non-collagenous domains, the N- and the C-propeptides. The biosynthetic pathway leading to the formation of the striated fibrils (Figure 1A) begins with the selection and association of three procollagen α chains. The resultant procollagen molecules are processed into collagen molecules. During this maturation, the N- and the C-propeptides are generally cleaved by specific proteases. The collagen molecules mostly correspond to the major triple helix and appear as rod-like structures approximately 300 nm in length and 1.5 nm in diameter. Two short non-collagenous segments or telopeptides flank the major triple helix in these mature collagen molecules, which are then able to assemble into fibrils. The triple helical structure is less susceptible to proteases than non-collagenous domains, and since fibrillar collagens are abundant proteins, this might explain why they have been useful in analyzing palaeontologic material [13,14]. Indeed, taking advantage of the fact that vertebrate type I collagen is one of the most abundant vertebrate proteins, several studies have been able to sequence part of this well preserved molecule using fossilized bones from dinosaurs 68 to 80 million years old [15,16]. Although there is some controversy about the results of these studies, reanalysis by another group of the Tyrannosaurus rex sample has led to analogous results [17]. As indicated by Bern et al. [17], contamination remains a tricky and possibly unresolvable issue for these ancient fossilized samples, necessitating considerable precautions during the extraction process.
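The dimensions quoted above for the major triple helix can be cross-checked with simple arithmetic. The sketch below converts the number of Gly-Xaa-Yaa triplets into residues and then into an approximate rod length. The axial rise of 0.286 nm per residue is an assumption taken from the commonly cited geometry of the collagen triple helix and is not stated in this review; the function names are hypothetical.

```python
TRIPLETS_IN_MAJOR_HELIX = 338   # value given in the text
RESIDUES_PER_TRIPLET = 3        # Gly-Xaa-Yaa
RISE_PER_RESIDUE_NM = 0.286     # assumption: commonly cited axial rise, not from the text


def helix_residues(n_triplets: int) -> int:
    """Number of residues in an uninterrupted run of Gly-Xaa-Yaa triplets."""
    return n_triplets * RESIDUES_PER_TRIPLET


def helix_length_nm(n_residues: int, rise_nm: float = RISE_PER_RESIDUE_NM) -> float:
    """Approximate length of a triple helix, given the axial rise per residue."""
    return n_residues * rise_nm


residues = helix_residues(TRIPLETS_IN_MAJOR_HELIX)
print(residues, "residues, about", round(helix_length_nm(residues)), "nm")
```

Under this assumption, 338 triplets correspond to 1014 residues and a rod of roughly 290 nm, in line with the length of approximately 300 nm given for the mature collagen molecule.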
2. Fibrillar Collagen Family
To better understand the evolution of fibrillar collagens, we will first describe some sequence particularities of these proteins by taking into account human data. In humans, the fibrillar collagen family includes types I–III, V and XI, but also the more recently characterized types XXIV and XXVII [2,18–20]. As illustrated in Figure 1B, types XXIV and XXVII differ in some respects from the other fibrillar collagen chains. Their major triple helix is slightly shorter (997 instead of 1014 to 1020 residues) and presents two successive glycine substitutions and one Gly-Xaa-Yaa-Zaa imperfection. Moreover, their N-propeptide domains are devoid of a minor triple-helical region, unlike the other fibrillar collagens. Human fibrillar α chains harbor different non-collagenous modules in their N-propeptide domains (Figures 1B and 1C). In the nomenclature used here, a proα2(I) collagen polypeptide, for example, corresponds to the proα chain 2 of type I collagen. The proα1(I), proα1(II), proα1(III) and proα2(V) chains contain a VWC module in their N-propeptides while a TSPN domain is observed for the proα1(V), proα3(V), proα1(XI), proα2(XI), proα1(XXIV) and proα1(XXVII) chains. For the proα2(I) chain, the N-propeptide is almost entirely made of a short triple helical region. At the molecular level, types II and III collagens are made up of three identical chains (homotrimer). Structural considerations and tissue localizations suggest that collagens XXIV and XXVII form homotrimeric molecules [18–20]. Type I is generally a heterotrimer comprising two proα1(I) chains and one proα2(I) chain (Figure 1A), but homotrimeric molecules made of three proα1(I) chains have been detected. The situation of types V and XI is more complicated. These often represent heterotrimeric molecules, but a homotrimer of the proα1(V) chain has been characterized and composite molecules of types V and XI have also been described [21]. 
It should be noted that the proα3(XI) chain appears to be a modified product of the gene encoding the type II collagen chain [22,23].
Since the same cell can synthesize different types of fibrillar collagen at the same time, a specific molecular mechanism is required for the recognition and discrimination of α chains during assembly of the procollagen molecule. Using chimeric recombinants of human proα2(I) and proα1(III) chains, Lee et al. [24] have characterized a discontinuous region of 15 amino acids in the C-propeptide which is involved in such recognition of fibrillar procollagen chains. Multiple alignment analysis of mammalian C-propeptides permitted these authors to show that this region corresponds to two relatively hydrophilic stretches of 12 and three amino acids separated by a highly conserved and hydrophobic sequence (Figure 2A). Hence, the C-propeptide plays a fundamental role allowing for cell type-specific assembly of fibrillar procollagen molecules.
3. The N-Propeptide Region of Fibrillar Procollagens
As indicated above, after removal of the propeptides from procollagens, the resultant collagen molecules are involved in the formation of striated fibrils. Once more, this is not a simple situation. Fibrils are generally heterotypic structures, being composed of one or two quantitatively major collagens (I–III) as well as one quantitatively minor collagen (V or XI). We can also distinguish fibrils present in cartilage (including types II and XI) from those in non-cartilage tissues (types I, III and V). Moreover, partial processing of the N-propeptide of minor fibrillar collagens in these heterotypic fibrils has been demonstrated. The structural importance of the retention of the N-propeptide in fibrils has been pointed out in several studies [26,27]. From the model of Linsenmayer et al. [27], the presence of the type V N-propeptide on the fibril surface regulates fibril diameter by sterically preventing the further addition of collagen molecules. In other words, the thinnest heterotypic fibrils have the highest minor collagen content. In the case of type XXVII collagen, it was recently shown that this unusual fibrillar collagen is involved in the formation of ultra-thin 10 nm thick non-striated fibrils [28]. However, another group has indicated that type XXVII is a component of non-banded fibrous structures, filamentous networks, and thin banded fibrils [29].
Four N-propeptide configurations have been described in human fibrillar collagen chains (Figure 1B). As shown in Figure 1C, most metazoan α chains can be assigned to one of these four types of N-propeptide. One exception to this rule occurs in Cnidaria, where proα chains possess an N-propeptide made of WAP or WAP and VWA modules (not shown in Figure 1B) in addition to a minor triple helix. Also not shown is the WAP module (eight cysteine residues), which we have demonstrated is similar to the VWC domain (10 cysteine residues) in terms of length, location in the N-propeptide and presence of two successive cysteine residues near the C-terminus [30]. The second exception concerns the presence in some sea urchin α chains of a series of four-cysteine modules (called SURF modules) between a VWC domain and the minor triple helix within their N-propeptide regions (Figure 1C) [31,32].
Another sea urchin α chain from Strongylocentrotus purpuratus, termed 1α, has an N-propeptide reduced to a minor triple helix reminiscent of the situation found in the vertebrate proα2(I) chain [33]. However, the large intronic sequence (approximately 22,800 bp) between the first two 5′ exons of the gene encoding the 1α chain might potentially encode one VWC and 16 SURF modules without perturbing the open reading frame (Figure 3). Blast analysis reveals that four ESTs (from S. purpuratus larval tissues) span part of these newly deduced exons (Figure 3). It should be noted that the study concerning the S. purpuratus 1α chain was carried out using blastula to pluteus embryos and that in all the cDNAs analyzed the open reading frame encoded a 1α chain presenting an N-propeptide reduced to the minor triple helix [33]. In agreement with the cDNA study, Northern-blot analysis revealed a 5 kb mRNA specific to the 1α chain. Interestingly, and as shown in Figure 3, over-exposed autoradiograms show an 11 kb mRNA band at the pluteus stage that might potentially encode the long isoform of the 1α chain.
4. The C-Propeptide of Fibrillar Collagens
The C-propeptide or COLF1 domain contains highly conserved sequences that are probably involved in its own structure, but is also punctuated by less conserved regions like those involved in chain selection [24,35]. The most important result concerning the evolution of this domain has been to realize that most of the defined chain selectivity recognition sequence is absent in invertebrate fibrillar collagens [36]. As shown in Figure 2B, only one invertebrate chordate α chain, Ci759-Cin, possesses a complete sequence. The relevance of this situation in regard to fibrillar collagen evolution will be discussed later. In spite of the increase of metazoan data, the COLF1 domain has only been described to date at the C-terminus of fibrillar collagen chains. Choanoflagellates are the closest living relatives of the Metazoa. Interestingly, King et al. [5] have indicated that the genome of the choanoflagellate Monosiga brevicollis can potentially encode two proteins including a triple-helical sequence and three possessing a COLF1 domain. Multiple alignment analysis (Figure 2A) reveals that M. brevicollis COLF1 modules lack cysteine residues 2, 3, 5 and 8. In fibrillar collagens, Cys-5 and Cys-8 form an intra-chain disulfide bond while either Cys-2 or Cys-3 or both of them can be absent.
5. The Major Triple Helical Sequences and the Formation of an Ancestral Gene
The characterization of the genes encoding fibrillar collagen chains has led to proposals concerning the exon-intron organization of an ancestral fibrillar collagen chain in addition to some steps leading to its formation. In chronological order, it was first obvious that half the exons encoding the major triple helix of types I and III collagens were 54 bp in length while the others were multiples of 54 bp (108 and 162 bp) or multiples of 54 bp minus 9 bp (45 and 99 bp). From this observation, Yamada et al. [37] suggested that the primordial genetic unit of an ancestral fibrillar collagen includes an exon of 54 bp in length, beginning with an intact glycine codon, ending with an intact Yaa codon, and encoding six Gly-Xaa-Yaa triplets. From this point of view, the formation of a putative ancestral gene arose from the multiple duplications of this primordial unit. The presence of the 45 bp and 99 bp exons in fibrillar collagen genes could be explained in this hypothesis by unequal crossing-over. Later on, from the study of a freshwater sponge fibrillar collagen gene and by comparing these data with the exon-intron structures of mammalian types I–III genes, we have been able to propose the exon-intron organization of a putative ancestral fibrillar collagen gene [38]. In comparison to the Yamada model, we suggest that two genetic units are at the origin of the fibrillar collagen genes. As proposed by Yamada et al. [37], the first steps have been multiple rounds of duplication of an exon of 54 bp. An unequal crossing-over event led to the formation of a 45 bp exon beginning with an intact glycine codon. As shown in Figure 4, multiple duplications of a new genetic unit including a 54 bp and a 45 bp exon might explain the particular distribution of these two types of exons in fibrillar collagen genes. 
Interestingly, and with the availability of more genomic data, the exon-intron organization of numerous genes [25,30,39,40] is consistent with that suggested for an ancestral fibrillar collagen gene ([41] and Figure 4).
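The exon-size arithmetic behind these models can be sketched as follows. Each Gly-Xaa-Yaa triplet is encoded by 9 bp, so a 54 bp exon encodes six triplets and a 45 bp exon (54 bp minus 9 bp) encodes five. The function names below are hypothetical, and the classification only reproduces the size classes mentioned in the text; it does not model intron positions or codon phase.

```python
UNIT_BP = 54      # primordial exon proposed by Yamada et al. [37]
TRIPLET_BP = 9    # one Gly-Xaa-Yaa triplet = three codons


def triplets_encoded(exon_bp: int) -> int:
    """Number of complete Gly-Xaa-Yaa triplets encoded by an exon."""
    return exon_bp // TRIPLET_BP


def classify_exon(exon_bp: int) -> str:
    """Classify an exon length as a multiple of 54 bp, a multiple of
    54 bp minus 9 bp, or 'other'."""
    if exon_bp <= 0:
        return "other"
    if exon_bp % UNIT_BP == 0:
        return f"{exon_bp // UNIT_BP} x 54"
    if (exon_bp + TRIPLET_BP) % UNIT_BP == 0:
        return f"{(exon_bp + TRIPLET_BP) // UNIT_BP} x 54 - 9"
    return "other"


# Exon sizes quoted in the text for the major triple helix.
for size in (54, 108, 162, 45, 99):
    print(size, classify_exon(size), triplets_encoded(size))
```

In this scheme the proposed second genetic unit, one 54 bp exon followed by one 45 bp exon, spans 99 bp of coding sequence and encodes eleven triplets.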
6. Using Triple Helical Sequences to Decipher Fibrillar Collagen Evolution
Prior to phylogenetic analyses, molecular and electron microscopic studies suggested that fibrillar collagens from invertebrates, and more especially from diploblast animals, are related to vertebrate types V/XI [9,42,43]. For instance, in sponges, striated fibrils have a uniform diameter of 25 nm, a situation observed in vertebrates for heterofibrils containing types V or XI collagens. A second observation allowing the first classification of vertebrate fibrillar collagens resulted from the sequencing of the related genes. Takahara et al. [39] proposed, from the exon/intron distribution in the region encoding the major triple helix, that the fibrillar collagens could be divided into two subgroups. The first includes the genes encoding types I–III and the proα2(V) chains while the second includes those encoding the proα1(V), proα1(XI) and proα2(XI) chains [39,40]. Later on, sequencing projects demonstrated that COL5A3 is a member of the second group. The next step in knowledge of the evolution of this collagen family was to make a phylogenetic analysis using human α chains and a few invertebrate fibrillar collagens [44]. Despite the small number of sequences used, it was clear that vertebrate α chains could be divided into two subfamilies, this study confirming the previous suggestion made from simple observations of gene organization. Hence, the first (types I–III, proα2(V) chain) and second (proα1(V), proα3(V), proα1(XI), and proα2(XI) chains) subfamilies were called the A and B clades, respectively. In agreement with these studies, the N-propeptide compositions of the A and B chains are different. A clade members possess a VWC module in their N-propeptide, while B clade members possess a TSPN module. It should be noted that the proα2(I) chain lacks the VWC domain, but is included in the A clade. More recently, a third subfamily of fibrillar collagen that includes the proα1(XXIV) and proα1(XXVII) chains was characterized, and called the C clade [19]. 
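The clade assignments of the human chains, together with the N-propeptide modules described in Section 2, can be summarized as a small lookup table. The sketch below is only an illustration: the dictionary `HUMAN_CHAINS` and the helper functions are hypothetical names, and the proα3(XI) chain is omitted because it is a product of the type II collagen gene.

```python
from collections import defaultdict

# chain name -> (clade, non-collagenous N-propeptide module), as stated in the text.
# None marks the proα2(I) chain, whose N-propeptide is reduced to a minor triple helix.
HUMAN_CHAINS = {
    "proα1(I)": ("A", "VWC"),
    "proα2(I)": ("A", None),
    "proα1(II)": ("A", "VWC"),
    "proα1(III)": ("A", "VWC"),
    "proα2(V)": ("A", "VWC"),
    "proα1(V)": ("B", "TSPN"),
    "proα3(V)": ("B", "TSPN"),
    "proα1(XI)": ("B", "TSPN"),
    "proα2(XI)": ("B", "TSPN"),
    "proα1(XXIV)": ("C", "TSPN"),
    "proα1(XXVII)": ("C", "TSPN"),
}


def chains_by_clade(chains=HUMAN_CHAINS):
    """Group chain names by clade."""
    groups = defaultdict(list)
    for name, (clade, _module) in chains.items():
        groups[clade].append(name)
    return dict(groups)


def clades_by_module(chains=HUMAN_CHAINS):
    """For each N-propeptide module, list the clades in which it occurs."""
    result = defaultdict(set)
    for clade, module in chains.values():
        result[module].add(clade)
    return dict(result)


print(chains_by_clade())
print(clades_by_module())
```

Grouping by module shows that a TSPN domain occurs in both the B and C clades, so the N-propeptide module alone does not identify the clade; the C clade chains are distinguished by features such as the absence of a minor triple helix and the imperfections in their major triple helix.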
While at this point the evolutionary relationship of vertebrate fibrillar collagen chains was well understood, there remained some difficulty and/or controversy in assigning invertebrate fibrillar α chains to one of the vertebrate clades despite the presence in several invertebrate α chains of a VWC module in their N-propeptide. For these phylogenetic analyses [19,44], the authors used the C-propeptide sequences. However, due to several factors (variability in length and sequence), this domain is not sufficiently informative to decipher the evolution of the fibrillar collagens. A new approach was to postulate that the conservation of the exon/intron organization of metazoan fibrillar collagen genes in the region encoding the major triple helix reflects conservation of related amino acid sequences [25]. In agreement with this methodological approach, it should be noted that, with the availability of complete sequences of eukaryotic genomes, exon-intron-structures have indeed been used as a novel source of evolutionary information [45,46]. As indicated by Csurös et al. [47], “comparative-genomic studies show that numerous intron positions in orthologous genes are conserved at great evolutionary depths, for example, between plants and animals”. Hence, the use of intron positions might improve multiple protein sequence alignments in regions of questionable alignment [48]. It is even possible to align two or more unrelated collagenous sequences by this method, while multiple alignments generally result in numerous gaps. In contrast, multiple alignments of bilaterian major triple helices confirm the pattern of introns of fibrillar collagen genes [25,30].
Using triple-helical sequences with or without the C-propeptide domain, it was possible to investigate more precisely the fibrillar collagen story (Figures 5 and 6). Altogether, these studies have permitted the evolution of this protein family to be followed from sponges to humans [12,19,25,30,50,51]. As shown in Figure 5, the three fibrillar collagen clades defined in humans emerged before the emergence of chordates. Hence, the invertebrate chordate Ciona intestinalis (ascidian) clearly possesses a member of each clade, but also another α chain (906-ascidian) related to the C clade and presenting numerous imperfections and glycine substitutions in its major triple helical domain. With the availability of poriferan (parazoan) and cnidarian (Radiata, eumetazoan) genomes, new analyses have revealed the early evolution of the fibrillar collagen family [30,50]. Hence, there is strong phylogenetic support for the hypothesis that the emergence of the A, B and C clades predated the Radiata–Bilateria split [30]. Moreover, although this is not strongly supported by phylogenetic analyses, other data suggest that the emergence of the three clades predated the divergence of poriferan lineages. In particular, the modular structures of the sponge fibrillar collagen chains are in good agreement with this hypothesis.
7. Suggested Evolutionary Model for the Fibrillar Collagen Family
A model for the evolution of fibrillar collagens is presented in Figure 6. Moreover, to better understand Figure 6, a simplified tree of life is illustrated in Figure 7. From the literature available to date, fibrillar collagens seem to be specific to Metazoa. Although we cannot reject the hypothesis that fibrillar collagen information was lost in Choanoflagellatea (the sister group of Metazoa, Figure 7), several data favor the idea that an ancestral fibrillar collagen gene arose in the lineage leading up to the Metazoa. Hence, the choanoflagellate M. brevicollis might potentially encode proteins including either triple helical or COLF1 sequences while fibrillar collagens are present in sponges. The three M. brevicollis COLF1 sequences lack cysteine residues 5 and 8. The fact that cysteine residues 5 and 8 form an intra-chain disulfide bond and are strictly conserved among fibrillar collagen chains suggests that they play an important function in the structure of the COLF1 module. As shown in Figure 6, the COLF1 domain present in the ancestral fibrillar collagen chain might possess cysteine residues 5 and 8. The significance of the lack of cysteine residues 2 and 3 in M. brevicollis COLF1 sequences is less clear. Indeed, previous studies have suggested that only α chains possessing all 8 cysteine residues are able to form homotrimeric procollagen molecules [59,60]. In contrast, these two cysteine residues are absent in some invertebrate fibrillar collagen chains [30]. Moreover, recombinant studies using semi-intact cells have shown that cysteine residue 2 is not required during chain association and triple helix folding [61].
The emergence of the A and B/C clades occurred before the parazoan–eumetazoan split, the earliest divergence among extant animal phyla. As shown in this model (Figure 6), the divergence between the ancestral B and C clade fibrillar collagen chains occurred either after the separation of poriferan and eumetazoan lineages (H1 hypothesis) or before metazoan cladogenesis (H2 hypothesis). The modular organization of sponge and sea anemone fibrillar α chains (Figure 1C and see Figure 1 in Reference [30]) is in favor of the H2 hypothesis, and reveals that only B clade collagens have conserved their N-propeptides and triple helix characteristics from sponges to humans. For the A clade, we can note that the N-propeptide seems to be reduced to the minor triple helix in sponges. The formation of an A clade fibrillar collagen chain possessing a VWC module in its N-propeptide predated the bilaterian radiation. The VWC module might have evolved from a WAP module present in Cnidaria as previously suggested [30], or the presence of WAP sequences in A clade related α chains might be specific to Cnidaria. The C clade is the least known family, and in the absence of more sequence data, we can only indicate that the major triple helix of C clade members seems to have evolved more rapidly than the comparable domains in clades A and B. For instance, the ascidian C clade fibrillar collagen chains do not have the Gly-Xaa-Yaa-Zaa imperfection present in types XXIV and XXVII.
As indicated in Figure 6, lamprey and hagfish (agnathans) have true orthologs of COL2A1, the gene encoding the vertebrate type II collagen (member of the A clade), which is the major structural protein of cartilage in gnathostomes [62–64]. Although the cartilage of agnathans was first described as a non-collagenous tissue [65], the presence of type II collagen has been demonstrated, suggesting that the formation of a collagen-based cartilage predated the agnathan-gnathostome split. This hypothesis has to be related to recent studies indicating that the two rounds (2R) of whole genome duplication occurred after the origin of chordates and before the divergence between cyclostomes and gnathostomes [66,67]. Hence, Kuraku [67] suggests that “a post-2R state is a genomic synapomorphy for all extant vertebrates”. The relationships between fibrillar collagens and cartilage have also been investigated in invertebrates [12,51,63,68]. Rychel et al. [51] have shown that collagenous proteins are present in the pharyngeal cartilage of hemichordates and cephalochordates, this result suggesting that the formation of this collagenous tissue occurred near the time of deuterostome diversification [68]. In lancelet, the gene encoding this collagenous protein is expressed in the notochord [63]. It is related to the A clade and might be defined as the pre-2R ortholog of types I, II, III and the proα2(V) fibrillar collagen chains. In jawed vertebrates, the A clade genes are mostly expressed in notochord and/or notochordal sheath. As indicated by Zhang and Cohn [63], the vertebrate chondrocytes that express the type II gene may have evolved from notochordal cells.
The evolution of the chain selection sequence present in the COLF1 module (Figure 6) was first described as a model of molecular incest [36]. In this model, a rare genomic event led to the formation of a sequence encoding a complete chain selection region at the dawn of the chordates and before the two whole genome duplication events. This gene might encode a fibrillar proα chain participating in the formation of a homotrimeric procollagen molecule. After the first round of duplications, the two newly formed genes can make the same homotrimeric molecule. With time, these two genes diverge but their translational products might still trimerize together in what is now an incestuous relationship. For the authors [36], this model could hold for all the multimeric proteins. Later on, from the studies of invertebrate chordate fibrillar collagens, another group pointed out that it is not one but two rare genomic events that preceded the two rounds of genome duplication [12]. The first event occurred before the ascidian-vertebrate split and permitted the formation of an A clade gene encoding a fibrillar collagen possessing a complete chain selection sequence. The second rare genomic event predated the vertebrate radiation and led to the formation of a B clade gene including the complete coding sequence of the chain selection domain. The functional importance of the chain selection sequence in vertebrates led us to think about the situation of invertebrate fibrillar collagen chains. Seven and eight fibrillar collagen chains have been described in sponges and sea anemones, respectively [30]. Since all these α chains have an incomplete chain selection sequence, other mechanisms might be used in invertebrates. First, in the absence of biochemical and developmental data, we can imagine that these sponge or sea anemone genes are differentially expressed in these organisms. 
Second, some of these α chains might be indiscriminately used in the formation of heterotrimeric molecules. Third, other variable regions of the COLF1 domain might be involved in chain selection in invertebrates.
8. Conclusions
During the last few years, data arising from sequencing projects have led to a better understanding of the evolution of the fibrillar collagen genes [30,50], and have highlighted the importance of the chain selection sequence and of the two rounds of genome duplication in the evolution of vertebrate development [53,62,63]. Another level of complexity concerns the molecular mechanisms leading to the formation and structural aspects of collagen fibrils. In vertebrates, fibrils are generally heterotypic, their diameters depending on the procollagen types present and their ratio, as well as the N-propeptide maturation of minor procollagens and interactions with other extracellular matrix components [69]. Moreover, it has been suggested that the minor fibrillar collagens (types V and XI) play a pivotal function in the nucleation of fibril assembly [69–71], and they have been named the “nucleators” of collagen fibrillogenesis [69]. Interestingly, the B clade fibrillar collagen chains have conserved the same modular organization and triple helix characteristics from sponges to humans. While little is known about the composition of invertebrate fibrils, we have previously demonstrated in sea urchin the presence of heterotypic fibrils made of quantitatively major and minor collagen molecules undergoing distinct maturation of their N-propeptide domains [32]. In sponges, all collagen fibrils have a thin, uniform diameter of 20–25 nm [2], although phylogenetic studies suggest that members of the three fibrillar collagen clades are present in these animals. In the coming years, a new challenge will be to decipher the evolution of the collagen fibrils.
Acknowledgments
We are indebted to David J. Hulmes for critically reading the manuscript and revising the English.
References and Notes
- 1.Myllyharju J, Kivirikko KI. Collagens, modifying enzymes and their mutations in humans, flies and worms. Trends Genet. 2004;20:33–43. doi: 10.1016/j.tig.2003.11.004. [DOI] [PubMed] [Google Scholar]
- 2.Exposito JY, Cluzel C, Garrone R, Lethias C. Evolution of collagens. Anat. Rec. 2002;268:302–316. doi: 10.1002/ar.10162. [DOI] [PubMed] [Google Scholar]
- 3.Rasmussen M, Jacobsson M, Björck L. Genome-based identification and analysis of collagen-related structural motifs in bacterial and viral proteins. J. Biol. Chem. 2003;278:32313–32316. doi: 10.1074/jbc.M304709200. [DOI] [PubMed] [Google Scholar]
- 4.Celerin M, Ray JM, Schisler NJ, Day AW, Stetler-Stevenson WG, Laudenbach DE. Fungal fimbriae are composed of collagen. EMBO J. 1996;15:4445–4453. [PMC free article] [PubMed] [Google Scholar]
- 5.King N, Westbrook MJ, Young SL, Kuo A, Abedin M, Chapman J, Fairclough S, Hellsten U, Isogai Y, Letunic I, Marr M, Pincus D, Putnam N, Rokas A, Wright KJ, Zuzow R, Dirks W, Good M, Goodstein D, Lemons D, Li W, Lyons JB, Morris A, Nichols S, Richter DJ, Salamov A, Sequencing JGI, Bork P, Lim WA, Manning G, Miller WT, McGinnis W, Shapiro H, Tjian R, Grigoriev IV, Rokhsar D. The genome of the choanoflagellate Monosiga brevicollis and the origin of metazoans. Nature. 2008;451:783–788. doi: 10.1038/nature06617. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 6. Acton S, Resnick D, Freeman M, Ekkel Y, Ashkenas J, Krieger M. The collagenous domains of macrophage scavenger receptors and complement component C1q mediate their similar, but not identical, binding specificities for polyanionic ligands. J. Biol. Chem. 1993;268:3530–3537.
- 7. Heino J. The collagen family members as cell adhesion proteins. Bioessays. 2007;29:1001–1010. doi: 10.1002/bies.20636.
- 8. Söderhäll C, Marenholz I, Kerscher T, Rüschendorf F, Esparza-Gordillo J, Worm M, Gruber C, Mayr G, Albrecht M, Rohde K, Schulz H, Wahn U, Hubner N, Lee YA. Variants in a novel epidermal collagen gene (COL29A1) are associated with atopic dermatitis. PLoS Biol. 2007;5:e242. doi: 10.1371/journal.pbio.0050242.
- 9. Exposito JY, Garrone R. Characterization of a fibrillar collagen gene in sponges reveals the early evolutionary appearance of two collagen gene families. Proc. Natl. Acad. Sci. USA. 1990;87:6669–6673. doi: 10.1073/pnas.87.17.6669.
- 10. Boute N, Exposito JY, Boury-Esnault N, Vacelet J, Noro N, Miyazaki K, Yoshizato K, Garrone R. Type IV collagen in sponges, the missing link in basement membrane ubiquity. Biol. Cell. 1996;88:37–44. doi: 10.1016/s0248-4900(97)86829-3.
- 11. Aouacheria A, Geourjon C, Aghajari N, Navratil V, Deléage G, Lethias C, Exposito JY. Insights into early extracellular matrix evolution: spongin short chain collagen-related proteins are homologous to basement membrane type IV collagens and form a novel family widely distributed in invertebrates. Mol. Biol. Evol. 2006;23:2288–2302. doi: 10.1093/molbev/msl100.
- 12. Wada H, Okuyama M, Satoh N, Zhang S. Molecular evolution of fibrillar collagen in chordates, with implications for the evolution of vertebrate skeletons and chordate phylogeny. Evol. Dev. 2006;8:370–377. doi: 10.1111/j.1525-142X.2006.00109.x.
- 13. Wick G, Kalischnig G, Maurer H, Mayerl C, Müller PU. Really old-palaeoimmunology: immunohistochemical analysis of extracellular matrix proteins in historic and pre-historic material. Exp. Gerontol. 2001;36:1565–1579. doi: 10.1016/s0531-5565(01)00141-3.
- 14. Franc S, Marzin E, Boutillon MM, Lafont R, Lechéne de la Porte P, Herbage D. Immunohistochemical and biochemical analyses of 20,000–25,000-year-old fossil cartilage. Eur. J. Biochem. 1995;234:125–131. doi: 10.1111/j.1432-1033.1995.125_c.x.
- 15. Asara JM, Schweitzer MH, Freimark LM, Phillips M, Cantley LC. Protein sequences from mastodon and Tyrannosaurus rex revealed by mass spectrometry. Science. 2007;316:280–285. doi: 10.1126/science.1137614.
- 16. Schweitzer MH, Zheng W, Organ CL, Avci R, Suo Z, Freimark LM, Lebleu VS, Duncan MB, Vander Heiden MG, Neveu JM, Lane WS, Cottrell JS, Horner JR, Cantley LC, Kalluri R, Asara JM. Biomolecular characterization and protein sequences of the Campanian hadrosaur B. canadensis. Science. 2009;324:626–631. doi: 10.1126/science.1165069.
- 17. Bern M, Phinney BS, Goldberg D. Reanalysis of Tyrannosaurus rex Mass Spectra. J. Proteome Res. 2009;8:4328–4332. doi: 10.1021/pr900349r.
- 18. Koch M, Laub F, Zhou P, Hahn RA, Tanaka S, Burgeson RE, Gerecke DR, Ramirez F, Gordon MK. Collagen XXIV, a vertebrate fibrillar collagen with structural features of invertebrate collagens: selective expression in developing cornea and bone. J. Biol. Chem. 2003;278:43236–43244. doi: 10.1074/jbc.M302112200.
- 19. Boot-Handford RP, Tuckwell DS, Plumb DA, Rock CF, Poulsom R. A novel and highly conserved collagen [pro(α)1(XXVII)] with a unique expression pattern and unusual molecular characteristics establishes a new clade within the vertebrate fibrillar collagen family. J. Biol. Chem. 2003;278:31067–31077. doi: 10.1074/jbc.M212889200.
- 20. Pace JM, Corrado M, Missero C, Byers PH. Identification, characterization and expression analysis of a new fibrillar collagen gene, COL27A1. Matrix Biol. 2003;22:3–14. doi: 10.1016/s0945-053x(03)00007-6.
- 21. Kleman JP, Aeschlimann D, Paulsson M, van der Rest M. Transglutaminase-catalyzed cross-linking of fibrils of collagen V/XI in A204 rhabdomyosarcoma cells. Biochemistry. 1995;34:13768–13775. doi: 10.1021/bi00042a007.
- 22. Burgeson RE, Hollister DW. Collagen heterogeneity in human cartilage: identification of several new collagen chains. Biochem. Biophys. Res. Commun. 1979;87:1124–1131. doi: 10.1016/s0006-291x(79)80024-8.
- 23. Reese CA, Mayne R. Minor collagens of chicken hyaline cartilage. Biochemistry. 1981;20:5443–5448. doi: 10.1021/bi00522a014.
- 24. Lees JF, Tasab M, Bulleid NJ. Identification of the molecular recognition sequence which determines the type-specific assembly of procollagen. EMBO J. 1997;16:908–916. doi: 10.1093/emboj/16.5.908.
- 25. Aouacheria A, Cluzel C, Lethias C, Gouy M, Garrone R, Exposito JY. Invertebrate data predict an early emergence of vertebrate fibrillar collagen clades and an anti-incest model. J. Biol. Chem. 2004;279:47711–47719. doi: 10.1074/jbc.M408950200.
- 26. Thom JR, Morris NP. Biosynthesis and proteolytic processing of type XI collagen in embryonic chick sterna. J. Biol. Chem. 1991;266:7262–7269.
- 27. Linsenmayer TF, Gibney E, Igoe F, Gordon MK, Fitch JM, Fessler LI, Birk DE. Type V collagen: molecular structure and fibrillar organization of the chicken α1(V) NH2-terminal domain, a putative regulator of corneal fibrillogenesis. J. Cell Biol. 1993;121:1181–1189. doi: 10.1083/jcb.121.5.1181.
- 28. Plumb DA, Dhir V, Mironov A, Ferrara L, Poulsom R, Kadler KE, Thornton DJ, Briggs MD, Boot-Handford RP. Collagen XXVII is developmentally regulated and forms thin fibrillar structures distinct from those of classical vertebrate fibrillar collagens. J. Biol. Chem. 2007;282:12791–12795. doi: 10.1074/jbc.C700021200.
- 29. Hjorten R, Hansen U, Underwood RA, Telfer HE, Fernandes RJ, Krakow D, Sebald E, Wachsmann-Hogiu S, Bruckner P, Jacquet R, Landis WJ, Byers PH, Pace JM. Type XXVII collagen at the transition of cartilage to bone during skeletogenesis. Bone. 2007;41:535–542. doi: 10.1016/j.bone.2007.06.024.
- 30. Exposito JY, Larroux C, Cluzel C, Valcourt U, Lethias C, Degnan BM. Demosponge and sea anemone fibrillar collagen diversity reveals the early emergence of A/C clades and the maintenance of the modular structure of type V/XI collagens from sponge to human. J. Biol. Chem. 2008;283:28226–28235. doi: 10.1074/jbc.M804573200.
- 31. Exposito JY, D’Alessio M, Ramirez F. Novel amino-terminal propeptide configuration in a fibrillar procollagen undergoing alternative splicing. J. Biol. Chem. 1992;267:17404–17408.
- 32. Cluzel C, Lethias C, Garrone R, Exposito JY. Distinct maturations of N-propeptide domains in fibrillar procollagen molecules involved in the formation of heterotypic fibrils in adult sea urchin collagenous tissues. J. Biol. Chem. 2004;279:9811–9817. doi: 10.1074/jbc.M311803200.
- 33. Exposito JY, D’Alessio M, Solursh M, Ramirez F. Sea urchin collagen evolutionarily homologous to vertebrate pro-α2(I) collagen. J. Biol. Chem. 1992;267:15559–15562.
- 34. Suzuki HR, Reiter RS, D’Alessio M, Di Liberto M, Ramirez F, Exposito JY, Gambino R, Solursh M. Comparative analysis of fibrillar and basement membrane collagen expression in embryos of the sea urchin, Strongylocentrotus purpuratus. Zoolog. Sci. 1997;14:449–454. doi: 10.2108/zsj.14.449.
- 35. Dion AS, Myers JC. COOH-terminal propeptides of the major human procollagens. Structural, functional and genetic comparisons. J. Mol. Biol. 1987;193:127–143. doi: 10.1016/0022-2836(87)90632-2.
- 36. Boot-Handford RP, Tuckwell DS. Fibrillar collagen: the key to vertebrate evolution? A tale of molecular incest. Bioessays. 2003;25:142–151. doi: 10.1002/bies.10230.
- 37. Yamada Y, Avvedimento VE, Mudryj M, Ohkubo H, Vogeli G, Irani M, Pastan I, de Crombrugghe B. The collagen gene: evidence for its evolutionary assembly by amplification of a DNA segment containing an exon of 54 bp. Cell. 1980;22:887–892. doi: 10.1016/0092-8674(80)90565-6.
- 38. Exposito JY, van der Rest M, Garrone R. The complete intron/exon structure of Ephydatia mülleri fibrillar collagen gene suggests a mechanism for the evolution of an ancestral gene module. J. Mol. Evol. 1993;37:254–259. doi: 10.1007/BF00175502.
- 39. Takahara K, Hoffman GG, Greenspan DS. Complete structural organization of the human α1(V) collagen gene (COL5A1): divergence from the conserved organization of other characterized fibrillar collagen genes. Genomics. 1995;29:588–597. doi: 10.1006/geno.1995.9961.
- 40. Vuoristo MM, Pihlajamaa T, Vandenberg P, Prockop DJ, Ala-Kokko L. The human COL11A2 gene structure indicates that the gene has not evolved with the genes for the major fibrillar collagens. J. Biol. Chem. 1995;270:22873–22881. doi: 10.1074/jbc.270.39.22873.
- 41. Exposito JY, Cluzel C, Lethias C, Garrone R. Tracing the evolution of vertebrate fibrillar collagens from an ancestral α chain. Matrix Biol. 2000;19:275–279. doi: 10.1016/s0945-053x(00)00067-6.
- 42. Miura S, Kimura S. Jellyfish mesogloea collagen. Characterization of molecules as α1α2α3 heterotrimers. J. Biol. Chem. 1985;260:15352–15356.
- 43. Tillet E, Franc JM, Franc S, Garrone R. The evolution of fibrillar collagens: a sea-pen collagen shares common features with vertebrate type V collagen. Comp. Biochem. Physiol. B: Biochem. Mol. Biol. 1996;113:239–246. doi: 10.1016/0305-0491(95)02014-4.
- 44. Sicot FX, Exposito JY, Masselot M, Garrone R, Deutsch J, Gaill F. Cloning of an annelid fibrillar-collagen gene and phylogenetic analysis of vertebrate and invertebrate collagens. Eur. J. Biochem. 1997;246:50–58. doi: 10.1111/j.1432-1033.1997.00050.x.
- 45. Yandell M, Mungall CJ, Smith C, Prochnik S, Kaminker J, Hartzell G, Lewis S, Rubin GM. Large-scale trends in the evolution of gene structures within 11 animal genomes. PLoS Comput. Biol. 2006;2:e15. doi: 10.1371/journal.pcbi.0020015.
- 46. Roy SW, Gilbert W. Resolution of a deep animal divergence by the pattern of intron conservation. Proc. Natl. Acad. Sci. USA. 2005;102:4403–4408. doi: 10.1073/pnas.0409891102.
- 47. Csurös M, Rogozin IB, Koonin EV. Extremely intron-rich genes in the alveolate ancestors inferred with a flexible maximum-likelihood approach. Mol. Biol. Evol. 2008;25:903–911. doi: 10.1093/molbev/msn039.
- 48. Irimia M, Roy SW. Spliceosomal introns as tools for genomic and evolutionary analysis. Nucleic Acids Res. 2008;36:1703–1712. doi: 10.1093/nar/gkn012.
- 49. Chevenet F, Brun C, Banuls AL, Jacq B, Christen R. TreeDyn: Towards dynamic graphics and annotations for analyses of trees. BMC Bioinformatics. 2006;7:439. doi: 10.1186/1471-2105-7-439.
- 50. Zhang X, Boot-Handford RP, Huxley-Jones J, Forse LN, Mould AP, Robertson DL, Li L, Athiyal M, Sarras MP Jr. The collagens of hydra provide insight into the evolution of metazoan extracellular matrices. J. Biol. Chem. 2007;282:6792–6802. doi: 10.1074/jbc.M607528200.
- 51. Rychel AL, Smith SE, Shimamoto HT, Swalla BJ. Evolution and development of the chordates: collagen and pharyngeal cartilage. Mol. Biol. Evol. 2006;23:541–549. doi: 10.1093/molbev/msj055.
- 52. DeSalle R, Schierwater B. An even “newer” animal phylogeny. Bioessays. 2008;30:1043–1047. doi: 10.1002/bies.20842.
- 53. Zhang G, Cohn MJ. Genome duplication and the origin of the vertebrate skeleton. Curr. Opin. Genet. Dev. 2008;18:387–393. doi: 10.1016/j.gde.2008.07.009.
- 54. Degnan BM, Vervoort M, Larroux C, Richards GS. Early evolution of metazoan transcription factors. Curr. Opin. Genet. Dev. 2009;19:591–599. doi: 10.1016/j.gde.2009.09.008.
- 55. Garcia-Fernàndez J, Benito-Gutiérrez E. It’s a long way from amphioxus: descendants of the earliest chordate. Bioessays. 2009;31:665–675. doi: 10.1002/bies.200800110.
- 56. Hejnol A, Obst M, Stamatakis A, Ott M, Rouse GW, Edgecombe GD, Martinez P, Baguñà J, Bailly X, Jondelius U, Wiens M, Müller WE, Seaver E, Wheeler WC, Martindale MQ, Giribet G, Dunn CW. Assessing the root of bilaterian animals with scalable phylogenomic methods. Proc. Biol. Sci. 2009;276:4261–4270. doi: 10.1098/rspb.2009.0896.
- 57. Philippe H, Derelle R, Lopez P, Pick K, Borchiellini C, Boury-Esnault N, Vacelet J, Renard E, Houliston E, Quéinnec E, Da Silva C, Wincker P, Le Guyader H, Leys S, Jackson DJ, Schreiber F, Erpenbeck D, Morgenstern B, Wörheide G, Manuel M. Phylogenomics revives traditional views on deep animal relationships. Curr. Biol. 2009;19:706–712. doi: 10.1016/j.cub.2009.02.052.
- 58. van de Peer Y, Maere S, Meyer A. The evolutionary significance of ancient genome duplications. Nat. Rev. Genet. 2009;10:725–732. doi: 10.1038/nrg2600.
- 59. Weil D, Bernard M, Gargano S, Ramirez F. The proα2(V) collagen gene is evolutionarily related to the major fibrillar-forming collagens. Nucleic Acids Res. 1987;15:181–198. doi: 10.1093/nar/15.1.181.
- 60. Bernard M, Yoshioka H, Rodriguez E, van der Rest M, Kimura T, Ninomiya Y, Olsen BR, Ramirez F. Cloning and sequencing of pro-α1(XI) collagen cDNA demonstrates that type XI belongs to the fibrillar class of collagens and reveals that the expression of the gene is not restricted to cartilagenous tissue. J. Biol. Chem. 1988;263:17159–17166.
- 61. Bulleid NJ, Wilson R, Lees JF. Type-III procollagen assembly in semi-intact cells: chain association, nucleation and triple-helix folding do not require formation of inter-chain disulphide bonds but triple-helix nucleation does require hydroxylation. Biochem. J. 1996;317:195–202. doi: 10.1042/bj3170195.
- 62. Zhang G, Miyamoto MM, Cohn MJ. Lamprey type II collagen and Sox9 reveal an ancient origin of the vertebrate collagenous skeleton. Proc. Natl. Acad. Sci. USA. 2006;103:3180–3185. doi: 10.1073/pnas.0508313103.
- 63. Zhang G, Cohn MJ. Hagfish and lancelet fibrillar collagens reveal that type II collagen-based cartilage evolved in stem vertebrates. Proc. Natl. Acad. Sci. USA. 2006;103:16829–16833. doi: 10.1073/pnas.0605630103.
- 64. Ohtani K, Yao T, Kobayashi M, Kusakabe R, Kuratani S, Wada H. Expression of Sox and fibrillar collagen genes in lamprey larval chondrogenesis with implications for the evolution of vertebrate cartilage. J. Exp. Zool. B: Mol. Dev. Evol. 2008;310:596–607. doi: 10.1002/jez.b.21231.
- 65. Wright GM, Keeley FW, Robson P. The unusual cartilaginous tissues of jawless craniates, cephalochordates and invertebrates. Cell Tissue Res. 2001;304:165–174. doi: 10.1007/s004410100374.
- 66. Kuraku S, Meyer A, Kuratani S. Timing of genome duplications relative to the origin of the vertebrates: did cyclostomes diverge before or after? Mol. Biol. Evol. 2009;26:47–59. doi: 10.1093/molbev/msn222.
- 67. Kuraku S. Insights into cyclostome phylogenomics: Pre-2R or post-2R. Zoolog. Sci. 2008;25:960–968. doi: 10.2108/zsj.25.960.
- 68. Rychel AL, Swalla BJ. Development and evolution of chordate cartilage. J. Exp. Zool. B: Mol. Dev. Evol. 2007;308:325–335. doi: 10.1002/jez.b.21157.
- 69. Kadler KE, Hill A, Canty-Laird EG. Collagen fibrillogenesis: fibronectin, integrins, and minor collagens as organizers and nucleators. Curr. Opin. Cell Biol. 2008;20:495–501. doi: 10.1016/j.ceb.2008.06.008.
- 70. Wenstrup RJ, Florer JB, Davidson JM, Phillips CL, Pfeiffer BJ, Menezes DW, Chervoneva I, Birk DE. Murine model of the Ehlers-Danlos syndrome col5a1 haploinsufficiency disrupts collagen fibril assembly at multiple stages. J. Biol. Chem. 2006;281:12888–12895. doi: 10.1074/jbc.M511528200.
- 71. Fernandes RJ, Weis M, Scott MA, Seegmiller RE, Eyre DR. Collagen XI chain misassembly in cartilage of the chondrodysplasia (cho) mouse. Matrix Biol. 2007;26:597–603. doi: 10.1016/j.matbio.2007.06.007.