Abstract
The increase of bodyplan complexity in early bilaterian evolution correlates with the advent and diversification of microRNAs. These small RNAs guide animal development by regulating temporal transitions in gene expression involved in cell fate choices and transitions between pluripotency and differentiation. One of the two known microRNAs whose origins date back before the bilaterian ancestor is mir-100. In Bilateria, it appears stably associated in polycistronic transcripts with let-7 and mir-125, two key regulators of development. In vertebrates, these three microRNA families have expanded to form a complex system of developmental regulators. In this contribution, we disentangle the evolutionary history of the let-7 locus, which was restructured independently in nematodes, platyhelminths, and deuterostomes. The foundation of a second let-7 locus in the common ancestor of vertebrates and urochordates predates the vertebrate-specific genome duplications, which then caused a rapid expansion of the let-7 family.
Keywords: let-7, miRNA evolution, microRNA, mir-100, mir-125
Introduction
As a class, microRNAs exhibit several unusual features in their evolutionary history. Despite the short length of the functional mature microRNAs (miRs) with only ~22 nt, they are extremely well conserved and hence detectable with high accuracy in genomic sequence data.1 Since the class of microRNAs is subdivided into hundreds of families with presumably independent origins, the presence/absence of at least one representative of a family forms a set of valuable and phylogenetically highly informative characters.2,3 While the loss of entire families is rare overall, some lineages, such as the tunicate Oikopleura dioica,4 have undergone a major restructuring of their microRNA repertoire. On the other hand, individual microRNA families, with a few exceptions, evolve much like other multi-gene families, showing gains by duplication and lineage-specific losses of paralogs, and reflecting the genome-wide duplication events.3,5
Most microRNA families have only a few paralogous members making it fairly straightforward to resolve their evolutionary histories in full detail. The most complex example studied exhaustively to date is the mir-17 cluster, comprising about 15 microRNAs with eight different miRBase numbers that can belong to three unrelated families.6,7 Apart from repeat-derived microRNAs and the huge imprinted mammalian microRNA clusters that behave in a repeat-like fashion, there is only a single ancient class of microRNAs whose evolutionary history has remained poorly understood: the let-7 family. This may come as a surprise, given that Caenorhabditis elegans let-7 and lin-4 were the first microRNAs to be discovered8 more than a decade before microRNAs were recognized as a general class of RNA regulators,9 and despite the fact that its phylogenetic distribution has been the subject of systematic investigation already a decade ago.10,11
Both let-7 and lin-4 direct temporal development in different larval stages of C. elegans.8,12 While lin-4 miRNA does not seem to exist outside of Rhabditida (parasitic Roundworms), the let-7 miRNA is conserved throughout Bilateria. Outside of Rhabditida, it is clustered with mir-100; in Panarthropoda and Deuterostomia, this cluster is further extended by mir-125 as a third microRNA, whose sequence is unrelated to both let-7 and mir-100. Vertebrate genomes, on the other hand, typically contain a dozen or more let-7 paralogs, some in clusters with paralogous copies of mir-125 and/or mir-100 and some located in isolation,13 see also the data provided in miRBase Release 18.0.14 Only part of this diversity can be accounted for by the genome-wide duplication events at the origin of vertebrates.15 The major difficulty in deriving a comprehensive picture of the evolution of the let-7 family is the correct assignment of orthology; unfortunately, the naming conventions used by miRBase are not helpful and even misleading in some cases. In addition, the annotated set of animal let-7 sequences is still rather incomplete.
In this contribution, we therefore characterize the evolution of the let-7 family and its associated microRNAs based on a comprehensive homology search and a careful analysis of their orthology relationships by means of both sequence comparison and assessment of synteny. We further suggest a refined nomenclature for the members of the let-7 family that better reflects their evolutionary relationships.
Materials and Methods
The starting point of our analysis was the collection of all lin-4, let-7, mir-100, and mir-125 precursor sequences compiled in the miRBase database release 16. This includes 250 let-7, 86 mir-125, and 85 mir-100 sequences in deuterostomes as well as 33 let-7, 26 mir-125, 27 mir-100, and 6 lin-4 microRNAs in protostomes.
We performed a comprehensive homology search in the genomic sequences available for 60 deuterostomes, 50 protostomes, and five cnidarians. To this end, we extended all miRBase-derived microRNA-precursor sequences to a uniform length. Subsequently, BLAST16 was applied with these sequences to search the genomes of 60 deuterostomes in order to collect a comprehensive data set. In cases where BLAST was not able to detect a certain homolog, we used the semi-global sequence alignment tool GotohScan17 and finally Infernal.18
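The stepwise search strategy described above (a fast method first, more sensitive methods only when nothing is found) can be sketched as follows. This is a minimal illustration, not the pipeline used in this work: the two search functions are simple stand-ins written for this example and do not call BLAST, GotohScan, or Infernal, and the hit format, scaffold name, and toy genome string are assumptions.

```python
from typing import Callable, List, Optional, Sequence, Tuple

Hit = Tuple[str, int, int]  # (scaffold, start, end); hypothetical hit format
SearchFn = Callable[[str, str], List[Hit]]


def cascade_search(query: str, genome: str,
                   methods: Sequence[Tuple[str, SearchFn]]
                   ) -> Tuple[Optional[str], List[Hit]]:
    """Run the search methods in order and stop at the first one that reports hits."""
    for name, search in methods:
        hits = search(query, genome)
        if hits:
            return name, hits
    return None, []


def exact_match(query: str, genome: str) -> List[Hit]:
    """Stand-in for a fast search: exact substring match only."""
    pos = genome.find(query)
    return [("scaffold_1", pos, pos + len(query))] if pos >= 0 else []


def one_mismatch(query: str, genome: str) -> List[Hit]:
    """Stand-in for a more sensitive search: allow one substitution."""
    hits = []
    for i in range(len(genome) - len(query) + 1):
        window = genome[i:i + len(query)]
        if sum(a != b for a, b in zip(query, window)) <= 1:
            hits.append(("scaffold_1", i, i + len(query)))
    return hits


methods = [("exact", exact_match), ("one_mismatch", one_mismatch)]
genome = "TTTTGAGGTAGTAGGTTGTCTAGTTAAAA"  # toy sequence, not real data
found_by, hits = cascade_search("GAGGTAGTAGGTTGTATAGTT", genome, methods)
print(found_by, hits)
```

In this toy case the exact search fails because of a single substitution, so the hit is reported by the second, more permissive method.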
In order to firmly establish orthology relations, we determined for each microRNA gene its genomic context: for intergenic microRNAs, we recorded both adjacent protein-coding genes, for intronic locations, we recorded the surrounding genes. Homology of these protein-coding genes was established by alignments of the amino acid sequences whenever the corresponding information could not be retrieved from a database. Synteny among teleost species was determined from the pairwise alignment nets19 provided through the UCSC genome browser. Regions with a size on the order of 100 kb were visually inspected for this purpose.
Taking into account the genomic locations, synteny information, and conservation of the 5p-miR/3p-miR regions, we built alignments for each microRNA family and its subfamilies. MicroRNA-like hairpin structures and secondary structure conservation were checked using RNAfold and RNAalifold from the Vienna RNA package.20,21
Analyses of deuterostome data were mainly based on manually curated alignments, calculated by ClustalW.22 For phylogenetic analyses, we refined these alignments to a selection of taxa containing one member of primates (human), Laurasiatheria (dog), Metatheria (either wallaby or opossum), Prototheria (platypus), Lepidosauria (anoles), Aves (either chicken or turkey), Amphibia (frog), and all available teleost sequences. Different members of the same cluster were aligned independently and then concatenated to increase the signal-to-noise ratio. SplitsTree23 was applied for a visual investigation and refinement of our microRNA family assignment.
We estimated the phylogeny of the mixed clusters A, C, and D and the homogeneous clusters E–J using the program MrBayes v3.1.24 Therein, the mir-100 family of cluster C served as outgroup. We used jModelTest25 to select the best fitting nucleotide substitution model using the AIC. The JC and K80+I+G models were selected for the mixed and homogeneous clusters, respectively. MrBayes was used to infer the posterior majority rule consensus tree along with posterior support for all internal branches using the respective evolutionary model and mainly default settings. The Bayesian phylogenetic analyses of mixed and homogeneous clusters were run twice in parallel with eight (seven heated) Metropolis-coupled Markov chain Monte Carlo chains for 10 000 000 and 2 000 000 generations, sampling every 1000th and 100th iteration, respectively. The initial 2 500 000 and 500 000 generations were discarded as burn-in during the estimation of the consensus trees of clusters A, C, and D and E–J, respectively. The trees are illustrated using Dendroscope.26 All branches are labeled with their respective posterior support.
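The sampling settings quoted above translate into sample counts per run as sketched below. The function name is introduced here for illustration only, and the calculation ignores the initial state at generation 0, which a sampler may record in addition.

```python
def mcmc_sample_counts(generations, sample_every, burnin_generations):
    """Return (total, discarded, kept) sample counts for a single MCMC run."""
    total = generations // sample_every
    discarded = burnin_generations // sample_every
    return total, discarded, total - discarded


# Settings as stated in the text for the two analyses
mixed = mcmc_sample_counts(10_000_000, 1000, 2_500_000)      # clusters A, C, D
homogeneous = mcmc_sample_counts(2_000_000, 100, 500_000)    # clusters E-J

for label, (total, discarded, kept) in [("mixed", mixed), ("homogeneous", homogeneous)]:
    print(label, total, discarded, kept, f"burn-in fraction = {discarded / total:.2f}")
```

Under these assumptions, both analyses discard one quarter of the samples of each run as burn-in.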
The Infernal package18 was used for structure- and sequence-based homology searches whenever sequence conservation alone seemed to be insufficient. Therein, the program cmbuild was utilized with standard parameters to derive a CM for every let-7, mir-100, and mir-125 (sub)family, given the corresponding cleaned alignment and its consensus secondary structure. In this scope, cleaned alignments solely contain full-length microRNA sequences that do not comprise any nucleotide other than A, C, G, or T. In addition, we created a compound mir-100 CM based on a manually curated alignment of mir-100 sequences derived from clusters A, C, and D for searches in Protostomia and basal Metazoa. Besides homology searches, these family models were used to determine the structural and sequence similarity of the let-7 families in contrast to the phylogenetic analysis that was done with MrBayes. For each let-7 family, we used all its microRNA sequences to calculate the average bitscore against each family CM.
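The averaging step at the end of the previous paragraph can be sketched as follows. The input is assumed to be a list of (covariance model, family, bitscore) triples that has already been extracted from the search output; the model names and all scores below are invented for illustration. The population standard deviation is used here, which is an assumption, since the text does not state which estimator was applied.

```python
from collections import defaultdict
from statistics import mean, pstdev


def average_bitscores(hits):
    """Map (cm_name, family_name) to (mean bitscore, population std. dev.)."""
    grouped = defaultdict(list)
    for cm_name, family_name, bitscore in hits:
        grouped[(cm_name, family_name)].append(bitscore)
    return {key: (mean(scores), pstdev(scores)) for key, scores in grouped.items()}


# Hypothetical scores: one family CM scored against two let-7 families
hits = [
    ("CM_E-let-7-1", "E-let-7-1", 60.0),
    ("CM_E-let-7-1", "E-let-7-1", 64.0),
    ("CM_E-let-7-1", "A-let-7", 30.0),
    ("CM_E-let-7-1", "A-let-7", 34.0),
]
table = average_bitscores(hits)
for (cm_name, family_name), (avg, sd) in sorted(table.items()):
    print(cm_name, family_name, avg, sd)
```

Each entry of the resulting dictionary corresponds to one cell of a model-versus-family matrix.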
Results
Our survey resulted in a single copy of let-7 in Protostomia, 14 let-7 genes in human, and 19 let-7 copies in teleosts, except for the zebrafish (Danio rerio) where 21 genes could be retrieved. We note that mir-99 has long been known as a homolog of mir-100, while mir-98 is a let-7 homolog, see e.g. Roush and Slack.13 In the following, we discuss the evolutionary history of the let-7 system in detail. As a resource, we provide extensive supplemental information in machine readable form, including covariance models, structure-annotated multiple sequence alignments, and the genomic coordinates of all microRNAs discussed in this work (http://www.bioinf.uni-leipzig.de/publications/supplements/11-022).1
Let-7 microRNAs in basal Metazoans
It is well known that miRNA mir-100 is one of the oldest miRNAs in animal species. The most ancient organism that has a mir-100 copy encoded in its genome is the sea anemone Nematostella vectensis, a cnidarian.27 Somewhat surprisingly, no mir-100 ortholog was detectable in any of the other diploblast genomes (cnidaria, ctenophorans, and poriferans), although this might be explainable by the incomplete status of these genome projects.
The let-7 microRNAs were detected by northern blot in a wide variety of both deuterostomes and protostomes, but remained undetectable in diploblasts.11 Consistent results were obtained by microRNA sequencing.28 Chaetognatha, which sometimes have been hypothesized as pre-dating the protostome-deuterostome divergence,29 also have a let-7 homolog.11 More recent phylogenetic studies, however, place them firmly within Protostomia.30,31 No trace of a let-7 or mir-125 homolog can be found in any of the non-bilaterian animal genomes.
Let-7 in Protostomes
In most major protostome clades, we find a single intact cluster of mir-100, let-7, and mir-125. Typically, the cluster is tightly linked indicating an intact polycistronic transcript. This is the case for both lophotrochozoans and ecdysozoans with some exceptions.
Among lophotrochozoans, complete and tightly linked clusters are found in the annelids Capitella teleta and Platynereis dumerilii. In the latter, the expression of the let-7 cluster has been studied in detail.32 In the mollusc Lottia gigantea, the mir-125 homolog is missing. In contrast, the cluster has disintegrated in platyhelminthes and mir-100 appears to be missing completely. Schistosomes have a single copy of let-7 and two mir-125 paralogs.33-35 In Schmidtea mediterranea, multiple copies of let-7 and mir-125 as well as a single copy of lin-4 have been annotated.36-38 In this scope, lin-4 can be seen as a putative homolog of mir-125. Both microRNA families show perfect conservation of their seed sequences, i.e., either nucleotides 1–7 or 2–8 of the 5p-miR region, compared with human let-7 and mir-125 miRs, respectively. Several substitutions are encountered in the remaining part of the 5p-miR sequences, however. Hence, the assignment of platyhelminth let-7 and mir-125 paralogs to particular subfamilies remains inconclusive.39
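The seed comparison mentioned above (nucleotides 1–7 or 2–8 of the 5p-miR) can be illustrated with a short sketch. The function names are introduced here for illustration. The reference string is the widely cited human let-7a-5p mature sequence and should be checked against miRBase before reuse; the two variants are hypothetical sequences constructed for this example and are not platyhelminth data.

```python
def seeds(mature):
    """Return the two seed definitions used in the text: nt 1-7 and nt 2-8 (1-based)."""
    rna = mature.upper().replace("T", "U")
    return {"1-7": rna[0:7], "2-8": rna[1:8]}


def seed_conservation(mir_a, mir_b):
    """Report, for each seed definition, whether the two mature sequences agree."""
    seeds_a, seeds_b = seeds(mir_a), seeds(mir_b)
    return {window: seeds_a[window] == seeds_b[window] for window in seeds_a}


HSA_LET_7A = "UGAGGUAGUAGGUUGUAUAGUU"   # human let-7a-5p (verify against miRBase)
VARIANT_3P = "UGAGGUAGUUGGAUGUAUAGCU"   # hypothetical: substitutions outside the seed
VARIANT_5P = "AGAGGUAGUAGGUUGUAUAGUU"   # hypothetical: substitution at position 1

print(seed_conservation(HSA_LET_7A, VARIANT_3P))
print(seed_conservation(HSA_LET_7A, VARIANT_5P))
```

The first variant keeps both seed windows despite several downstream substitutions, which mirrors the situation described for the platyhelminth paralogs; the second keeps only the 2–8 window.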
Much more genomic data are available for Ecdysozoa. In arthropods, mir-100, let-7, and mir-125 form a tight genomic cluster in which the microRNAs are separated only by a few hundred nucleotides. In Drosophila melanogaster, the polycistronic primary transcript and its expression have been studied in detail.40 In a few species, one of the cluster members is lacking, possibly due to missing data. In nematodes, an intact cluster is present only in Trichinella spiralis, i.e., in the most basal clade Dorylaimia. In contrast, most rhabditid worms including Caenorhabditis elegans have an isolated let-7 gene and lack annotated mir-125 and mir-100 homologs. The loss of mir-100 appears to be a relatively recent phenomenon in Caenorhabditis and Pristionchus,41 since a mir-100 linked to let-7 can be found in Heterorhabditis bacteriophora. Ruby et al. proposed, based on a match of the seed sequence, that the two related microRNA clusters mir-51/mir-53 (Chr.IV) and mir-54/mir-55/mir-56 (Chr.X) are co-orthologs of miR-100 in C. elegans.42 The cluster on Chr.X is separated from cel-let-7 by more than 1.5 Mb. Beyond the seed nucleotides, no homology with mir-100 is detectable, however, so that their relation with mir-100 remains uncertain.
Clusters comprising mir-100 and let-7 are also found in Tylenchina (e.g., Meloidogyne, Heterodera) as well as Spirurina (Brugia malayi and Ascaris suum). Poole and colleagues reported four mir-100 paralogs in B. malayi,43 only one of which, bma-mir-100b, is linked with the sole annotated let-7 and, furthermore, shows perfect sequence conservation with the human miR-100. The remaining three microRNAs show a conserved seed region but comprise various mutations in the 3′ end of their 5p-miR sequence. None of these genomes contain a mir-125.
Lin-4, one of the first microRNAs to be discovered,44 is functionally closely associated with let-7.45 It was recognized as a putative ortholog of mir-125 by Lagos-Quintana et al.46 based on the similarities of the 5p-miR regions. We find that the sequence homology covers the majority of the precursor hairpin supporting the homology of lin-4 and mir-125. In contrast to mir-125, however, none of the annotated lin-4 sequences is linked with let-7 and/or mir-100. C. elegans lin-4 is located in intron 9 of the protein-coding gene F59G1.4. This arrangement is conserved in both Pristionchus pacificus and B. malayi. No lin-4 sequence is detectable in T. spiralis, which has an intact mir-100/let-7/mir-125 cluster.
In C. elegans, there is an antisense transcript of lin-4, which could also give rise to a miRNA similar to the iab-4/iab-8 pair in Drosophila.47 Thus, we checked whether lin-4 might originate from a mir-125 antisense hairpin. Comparisons of a lin-4 CM against annotated mir-125 sequences and a mir-125 CM against lin-4 sequences in both reading directions show that mir-125 and lin-4 match significantly better in sense direction. We thus hypothesize that the sequence divergence of lin-4 is coupled with the breaking up of the ancestral cluster.
Let-7 in Gnathostomes
In total, we collected 874 microRNAs among Deuterostomia including 128 mir-100 sequences, 135 mir-125 sequences, and 611 let-7 sequences. Most of these sequences were found in Gnathostomata (jawed vertebrates). The miRBase lists 12 let-7 paralogs in human including three genomic loci at which let-7 appears to be accompanied by other microRNAs. The best-known of these clusters, A, is composed of mir-99b, let-7e, and mir-125a on chromosome 19. The two other loci are C: mir-99a, let-7c, mir-125b-2 (chr.21) and D: mir-100, let-7a-2, mir-125b-1 (chr.11). The association of mir-125 and let-7 at the latter two loci, although previously noticed, e.g., Roush and Slack,13 is not annotated as a cluster in miRBase, since the distances of 50 and 46 kb, resp., are larger than the (arbitrary) 10 kb threshold. The second type of let-7 cluster consists of members of the let-7 family only. The paradigmatic example is the cluster E on chromosome 9, consisting of let-7a-1, let-7f-1, and let-7d. The two remaining clustered loci are F on chr.X (let-7f-2 and mir-98) and G on chr.22 (let-7a-3 and let-7b). Two additional loci, designated here as I (chr.12) and J (chr.3), each harbour a single annotated human let-7 miRNA (let-7i and let-7g, resp.). Our homology searches revealed two additional sequences similar to let-7d, located at positions K (chr.17) and L (chr.1), i.e., unrelated to the previously described let-7 loci.
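The effect of a fixed distance threshold on cluster annotation, as discussed above for loci C and D, can be made explicit with a small sketch. The grouping rule (chaining neighbours on the same chromosome whose start positions lie within the threshold) is an assumption made for this example and is not necessarily the exact miRBase procedure. All coordinates are hypothetical offsets; only the let-7/mir-125 spacings of 50 and 46 kb are taken from the text.

```python
def group_by_distance(loci, max_gap=10_000):
    """Chain loci on the same chromosome whose start positions differ by <= max_gap.

    loci: list of (name, chromosome, start) tuples.
    Returns a list of clusters, each a list of locus names.
    """
    clusters = []
    for name, chrom, start in sorted(loci, key=lambda locus: (locus[1], locus[2])):
        last = clusters[-1][-1] if clusters else None
        if last and last[1] == chrom and start - last[2] <= max_gap:
            clusters[-1].append((name, chrom, start))
        else:
            clusters.append([(name, chrom, start)])
    return [[name for name, _, _ in cluster] for cluster in clusters]


# Hypothetical coordinates (offsets only)
loci = [
    ("let-7c", "chr21", 100_000), ("mir-125b-2", "chr21", 150_000),    # 50 kb apart
    ("let-7a-2", "chr11", 100_000), ("mir-125b-1", "chr11", 146_000),  # 46 kb apart
    ("let-7a-1", "chr9", 100_000), ("let-7f-1", "chr9", 100_400),
    ("let-7d", "chr9", 102_900),                                       # invented spacing
]

print(group_by_distance(loci))                  # 10 kb threshold
print(group_by_distance(loci, max_gap=60_000))  # relaxed threshold
```

With the 10 kb threshold, the let-7 and mir-125 members of loci C and D are reported separately; relaxing the threshold to 60 kb joins each pair.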
With the exception of the novel loci K and L (see below), this arrangement of let-7 paralogs is well conserved: The three mixed clusters A, C, and D, the three homogeneous clusters E, F, and G, and the two isolated loci I and J can be traced throughout all available tetrapods. A summary of these gnathostome let-7 clusters is compiled in Table 1, the corresponding gene phylogenies are shown in Figure 1. An extended table showing annotated miRBase names and distances between adjacent microRNAs is also available online.
Table 1. Overview of miRNA clusters among Gnathostomata ordered by their presumed evolutionary history.

Previously annotated miRNAs (miRBase 18) are depicted by filled circles, newly found putative miRNAs are shown as empty circles. Dashed lines separate different lineages: Primates + Tupaia (1), Glires (2), Euarchontoglires (1+2), Laurasiatheria (3), Afrotheria (4), Xenarthra (5), Eutheria (1–5), Metatheria (6), Sauropsida (7), Teleostei (8). The two paralogous sets of clusters are separated by the long-dashed line.
Figure 1. Estimated phylogenetic trees for the let-7 miRNA sequences. The two trees contain selected sequences from clusters A, C, and D (left) and E to J (right). The numbers at branches indicate the posterior support. Subtrees that are specific for teleosts (label-prefix “teleost”) or non-teleosts (no label-prefix) were collapsed to increase readability of the right tree, even if the subtrees are not complete.
Orthology of the corresponding loci is unambiguously established based on both synteny information and sequence similarity (see Methods for details). In the chicken genome, two additional clustered let-7 paralogs, let-7k and let-7j, were reported.48 They clearly form a fourth homogeneous cluster, H, absent in eutheria and metatheria. Evidence for the presence of the D, E, G, H, and J loci can also be found in the genome of the elephant shark. Since this genome is sequenced only at low coverage,49 it is plausible that missing loci are due to lack of data rather than true losses.
The genomes of nearly all vertebrates, more precisely of the gnathostomes to the exclusion of lampreys and hagfishes, share two rounds of genome duplications.50-52 Both the three mixed (A, C, and D) and the four homogeneous (E, F, G, and H) let-7 clusters clearly are the result of the vertebrate-specific (2R) genome duplications.
The situation is more complex in the five teleosts due to an extra round of genome duplication. The fish-specific genome duplication (FSGD) preceded the divergence of the teleosts.53,54 Combining synteny information and sequence comparison allows us to resolve the orthology relationships of the let-7 loci among the teleosts, see Table 2.
Table 2. Correspondence of let-7 loci in the teleost genomes of Danio rerio (dre), Oryzias latipes (ola), Gasterosteus aculeatus (gac), Takifugu rubripes (tru), and Tetraodon nigroviridis (tni).
| loc. | dre | ola | gac | tru | tni |
|---|---|---|---|---|---|
| Aa | 1627M | 1612M | XX8M | s.22 | 86M |
| Ab | 1910M | s.1995 | X6.6M | s.37 | 21-rnd |
| Ca | 1039M | 141M | VII13M | — | — |
| Cb | 1529M | — | — | — | — |
| Da | 1520M | 132M | — | s.144 | 168M |
| Db | 531M | 1416M | VII18M | s.6 | 76M |
| Ea | 1128M | — | — | — | — |
| Ha | 654M | 512M | XVII12M | s.7484 | 119M |
| Ja | 641M | 57M | c.7697 | s.56 | 115M |
| Hb | 235M | 713M | XII12M | s.93 | 94.6M |
| Fa | 2328M | 712M | XII11M | s.66 | 93.6M |
| Fb | 2318M | 726M | XII13M | s.2159 | 910M |
| Ga | 252M | 66M | XIX4.8M | s.2 | 139.2M |
| Ia | 251.6M | 65.8M | XIX4.5M | s.2 | 139.0M |
| Gb | 417M | 236M | IV19M | s.177 | 195.3M |
| Ib | — | s.1942 | IV20M | s.177 | 195.4M |
Synteny was determined from the pairwise alignment nets19 provided through the UCSC genome browser. Note that the genomic coordinates of each locus are abbreviated by the chromosome or scaffold number followed directly by the position on megabase scale (e.g., XIX4.8M denotes position 4.8 Mb on linkage group XIX).
The correspondence of tetrapod and teleost let-7 clusters cannot be determined based on sequence similarity alone due to the short sequences and the large phylogenetic distances. For most loci, strong support comes again from synteny information. The teleost Aa locus, in particular, shares several flanking protein-coding genes with the human A locus, e.g., SMG9 and HAS1. We note, however, that the sequences of the teleost Aa and Ab loci are not recognizable as orthologs of the tetrapod A locus, while synteny and sequence data are largely consistent for the other loci. There is no support for the alternative explanation either, namely that the teleost clusters Aa and Ab really derive from an ancestral gnathostome B locus that has been lost completely in Tetrapoda, while the A locus has completely disappeared in teleosts. Taken together, thus, the data suggest an ancestral state prior to the FSGD that closely matches the ancestral state in gnathostomes, see Figure 2. The only changes that can be attributed to the actinopterygian stem lineages are the loss of A-mir-100 and E-let-7–3. Surprisingly, the loci I and J are found in a loose association with the H and G clusters. In the case of G/I, this association is found in both paralogs, implying that the proximity of G and I loci preceded the FSGD. The G and I loci are also found on the same chromosomes, although separated by many megabases, in rat, dog, cow, sheep, and in sauropsids.
Figure 2. Putative evolutionary history of the let-7 microRNA clusters across Bilateria. A white triangle represents a mir-100, a gray triangle represents mir-125 microRNAs while let-7 sequences are depicted by black triangles. Annotated lin-4 microRNAs are shown with circles. 1R/2R denote two rounds of whole genome duplications, whereas FSGD labels the additional teleost-specific genome duplication. The duplicate clusters in teleosts are highlighted by different shading. Entire lineages are written in bold, genera are written in italic. Dispersed, highly derived let-7 and mir-125 paralogs in platyhelminthes and in Brugia are not shown.
During gnathostome evolution, we observe several clade-specific loss events of entire clusters and individual microRNAs, cf. Figure 2. The most dramatic reductions occur in the wake of the FSGD with the complete loss of one copy of clusters E and J, the subsequent loss of the other copy of E in the percomorph lineage (pufferfishes, medaka, and stickleback), and the deletion of the Ca cluster in pufferfishes. Aves lack both the A and the F clusters, both of which are still present in the lizard genome. This could, however, be due to the bird genome assemblies, which are known to be incomplete, in particular in their coverage of the micro-chromosomes.55 Among mammals, only platypus features an H cluster, while this locus is lost in all Theria. Other missing clusters in Eutheria affect in particular low coverage genomes and might be explained better by an incomplete assembly. A conspicuous pattern is the lack of the A cluster in the lemurs, however. A gain of new let-7 loci is observed only in primates. The L locus appears in Haplorhini (tarsier and monkeys), while the K locus is present in Catarrhini (old world monkeys) only. We found evidence of expression of the miR sequence of K and L loci in miRNA-seq data of human and rhesus macaque brain samples56 as well as in small RNA-seq data of the ENCODE cell lines.57
The ancestral let-7 clusters were apparently tightly linked as one would expect from a polycistronic primary transcript. However, some of these distances in mixed clusters substantially increase in tetrapods, namely, D-mir-100/D-let-7, C-let-7/C-mir-125, and D-let-7/D-mir-125, see also Figure 3. In both cases, the entire clusters are contained in the introns of non-coding primary precursors, in human known as LINC00478 (C-cluster) and MIR100HG (D-cluster), respectively. Cluster F is expressed from an intron of the coding HUWE1 gene throughout and locus J is conserved within an intron of the coding WDR82 gene. Cluster G, in contrast, is exonic, located in the 3′ exon of the non-coding host gene MIRLET7BHG. Loci E and I are associated with clusters of unspliced ESTs, the expression of cluster A cannot be resolved from currently available data.

Figure 3. Distances between pairs of adjacent microRNAs (mir-100/let-7 and let-7/mir-125) in the mixed clusters A, C, and D are conserved across Mammalia. Locus A is quite compact throughout the gnathostomes. Cluster D, in contrast, shows consistently large distances in tetrapods. In the cat Felis catus, distance outliers are owed to large gaps in the assembly within clusters.
Interestingly, human MIRLET7BHG harbors in one of its introns the additional annotated microRNA mir-3619 where evidence of expression was found in small RNA-seq data from embryonic stem cells.58 This is an evolutionarily young innovation, present only in old world monkeys.
By the use of family-wide covariance models of all let-7 families, clear evidence for the close relationships of let-7 microRNAs of mixed clusters can be found, cf. Figure 4. Furthermore, all let-7–1 and let-7–2 clusters appear to be closely related to each other, corroborating their origin by genome duplications.
Figure 4. Heatmap illustrating the structural and sequence similarity of all let-7 families. To this end, all let-7 microRNA sequences of each family were scored against each family-wide covariance model. The average bitscore of covariance model (row) vs. let-7 family (column) is visualized in a color gradient. The standard deviations of these bitscores are always below 15 with a median bitscore standard deviation of 3.4. Due to the sequence divergence of cluster A in tetrapods and Aa/Ab in teleost fishes, both lineages were analyzed separately.
Let-7 in Basal Deuterostomes
Cyclostomia, lampreys and hagfishes, share at least one and possibly both rounds of the vertebrate genome duplication. Genomic data are solely available for the lamprey Petromyzon marinus. The miRBase lists 7 let-7, 3 mir-100, and a single mir-125,59 not all of which can be recovered from the available genome assembly. The pma-mir-100c sequence, however, is the reverse complement of pma-mir-100a.
The mir-100a and mir-125 genes are genomically linked and hence derive from the ancestral mixed cluster, although the corresponding let-7 expected to lie between them is missing. The presence of pma-mir-100b (without a linked mir-125 paralog) serves as a witness of at least one round of genome duplication. Based on the similarities measured with the help of covariance models, we can also identify pma-let-7a-4 as descendant of the let-7 located in the ancestral mixed cluster. It seems to be the only lamprey let-7 miRNA originating from a mixed cluster. However, the assignment of the corresponding cluster is not possible since the sequence is not mappable to any available genome assembly. Among the homogeneous clusters, pma-let-7d and pma-let-7a-3 form a cluster; for the other loci the genome assembly is too fragmented to be informative. By applying Infernal to these miRNAs, strong evidence is obtained that P. marinus contains at least 2 homogeneous clusters comprising both a let-7-1 and a let-7-2 paralog. One of the two remaining sequences, pma-let-7c, shows also obvious characteristics of a let-7-1 subfamily, whereas the other one, pma-let-7a-1, reveals no clear features to make a precise assignment of its origin. Sequencing of short RNAs also shows the presence of, presumably, multiple copies of mir-100, mir-125, and let-7 in the genome of the hagfish Myxine glutinosa.59
In Ciona intestinalis, an intron of an EST cluster that is homologous to the HUWE1 protein harbours a closely spaced cluster of four copies of let-7. On a different chromosome, there is a single mixed cluster consisting of mir-1473, let-7d, and mir-125. A very similar arrangement is found in Ciona savignyi. Upon closer inspection, cin-mir-1473 and csa-mir-1473 are clearly homologs of mir-100 revealing an ancestral mixed cluster, see Figure 5. In Oikopleura dioica, one locus harbours a let-7 and a mir-1473/mir-100 ortholog,4 a second locus consists of two let-7 paralogs, and a final copy of let-7 is found on a third scaffold. The cin-let-7a-1 and cin-let-7a-2, as well as their counterparts csa-let-7c-1 and csa-let-7c-2, appear to be homologs of the let-7-2 and let-7-1 subfamily, respectively. In C. intestinalis, let-7a-1 is the first miRNA in the homogeneous let-7 cluster while let-7a-2 is located at the end. On the other hand, both C. savignyi sequences are located at the beginning. In O. dioica, however, solely odi-let-7c located on scaffold 10 appears to be assignable to the let-7-1 subfamily whereas the corresponding member of the let-7-2 subfamily is missing. Both cin-let-7c and its ortholog csa-let-7a might be copies of the let-7 miRNA of the mixed cluster. Unfortunately, the other orthologous miRNAs cin-let-7b and csa-let-7b cannot clearly be assigned to any let-7 subfamily by the use of covariance models, although their 5p-miR appears to be closely related to let-7 miRNAs originating from mixed clusters. Nevertheless, both remaining O. dioica sequences, namely odi-let-7a and odi-let-7b, are neither unambiguously assignable to any ascidian let-7 miRNA nor to any other subfamily. The relationships of the let-7 loci in basal deuterostomes are summarized in Figure 6.
Figure 5. Highly derived mir-100 paralogs in basal deuterostomes. Although the seed regions also contain substitutions, the homology is clearly visible.

Figure 6. Relationships of let-7 loci in the basal deuterostomes P. marinus (pma), O. dioica (odi), C. savignyi (csa), and C. intestinalis (cin). Assignments to certain let-7 subfamilies were made with the use of Infernal. For details, see text.
In the lancelet Branchiostoma floridae, there is a second copy of let-7 located at the 3′ end of the canonical cluster.60 The two bfl-let-7 precursor sequences differ by only 4 point mutations, and hence are probably the results of a lineage-specific duplication. The ancestral cluster mir-100/let-7/mir-125 is also present in ambulacrarians, i.e., the acorn worm (hemichordata) and the sea urchin (echinodermata). In the latter, the mir-100 ortholog is rather diverged and recorded as spu-mir-2003 in miRBase.
The most basal clade of Deuterostomia, recently termed Xenacoelomorpha,61 is composed of Xenoturbellida and Acoelomorpha. For Xenoturbella bocki, mature sequences of mir-125, let-7, and mir-100 have been reported.61 In contrast, no evidence for any of the three microRNA families was found in microRNA libraries of Hofstenia miamia61 and Symsagittifera roscoffensis.28,62 A survey of several acoels by northern blot also returned a negative result.11
Antisense microRNAs and cluster extensions
MicroRNA precursors are very stable hairpin structures so that their 3′ and 5′ halves are close to being reverse complements. An antisense transcript therefore will in general also give rise to a pre-mir-like hairpin structure. Indeed, functional antisense microRNAs have been reported in the literature for several loci, see e.g. Bender.63 It does not come as a surprise, therefore, that antisense microRNAs have been found in deep sequencing data for some of the let-7 loci, cf. Table 3. In this survey, however, microRNAs that are located antisense to miRBase annotated mir-100, mir-125, or let-7 microRNAs were not further investigated.
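The argument that an antisense transcript of a hairpin is again a hairpin can be checked on an idealized example. The sketch below builds a perfectly paired toy hairpin (Watson-Crick pairs only) and shows that its reverse complement has the same arm pairing. The loop sequence is hypothetical, the arm is used purely as an example string, and the helper names are introduced here. Real precursors contain G-U wobble pairs and mismatches; a G-U pair becomes a non-pairing C-A in the antisense strand, which is one reason why the correspondence holds only approximately.

```python
COMPLEMENT = str.maketrans("ACGU", "UGCA")


def reverse_complement(rna):
    """Reverse complement of an RNA string over the alphabet A, C, G, U."""
    return rna.translate(COMPLEMENT)[::-1]


def arms_pair_perfectly(hairpin, arm_len):
    """True if the first arm_len nt are the reverse complement of the last arm_len nt."""
    return hairpin[:arm_len] == reverse_complement(hairpin[-arm_len:])


arm = "UGAGGUAGUAGGUUGUAUAGUU"   # example 5' arm
loop = "GAAA"                     # hypothetical loop
hairpin = arm + loop + reverse_complement(arm)
antisense = reverse_complement(hairpin)

print(arms_pair_perfectly(hairpin, len(arm)))
print(arms_pair_perfectly(antisense, len(arm)))
print(antisense == arm + reverse_complement(loop) + reverse_complement(arm))
```

In this idealized case, the antisense strand differs from the sense hairpin only in its loop, which is replaced by its own reverse complement.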
Table 3. Antisense microRNAs associated with let-7 loci listed in miRBase v.18.
| Species | loc. | sense | antisense | Ref. |
|---|---|---|---|---|
| rat | D | rno-let-7a-2 | rno-mir-3596a | 68 |
| rat | A | rno-let-7e | rno-mir-3596c | 68 |
| rat | E-1 | rno-let-7f-1 | rno-mir-3596d | 68 |
| rat | E-3 | rno-let-7d | rno-mir-3596b | 68 |
| rat | C | rno-mir-125b-2 | rno-mir-3588 | 68 |
| cow | G-2 | bta-let-7b | bta-mir-3596 | 69 |
| lancelet | — | bfl-let-7a-1 | bfl-let-7b | 70 |
| lancelet | — | bfl-mir-125a | bfl-mir-125b | 70 |
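The reverse-complement argument for antisense hairpins can be illustrated with a toy example. The following is a minimal sketch, not the folding procedure used in this study: the sequence is made up, the helper names `reverse_complement` and `stem_pairs` are hypothetical, and base pairing is approximated by simply pairing each position with its mirror position around a short loop. It shows that Watson-Crick pairs of the sense hairpin remain Watson-Crick pairs in the antisense strand, whereas a G-U wobble pair turns into an A-C mismatch, which is why the antisense hairpin forms "in general" rather than always.

```python
# Toy illustration only: the sequence below is invented, not a real pre-miRNA,
# and mirror-position pairing is a crude stand-in for real structure prediction.
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}
WATSON_CRICK = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G")}
WOBBLE = {("G", "U"), ("U", "G")}


def reverse_complement(rna):
    """Return the antisense (reverse-complement) RNA sequence."""
    return "".join(COMPLEMENT[base] for base in reversed(rna))


def stem_pairs(rna, min_loop=4):
    """Pair position i with its mirror position from the 3' end.

    Pairing stops when fewer than `min_loop` unpaired bases would remain
    between the two positions. Returns counts of
    (Watson-Crick pairs, G-U wobble pairs, mismatches).
    """
    wc = wobble = mismatch = 0
    i, j = 0, len(rna) - 1
    while j - i - 1 >= min_loop:
        pair = (rna[i], rna[j])
        if pair in WATSON_CRICK:
            wc += 1
        elif pair in WOBBLE:
            wobble += 1
        else:
            mismatch += 1
        i += 1
        j -= 1
    return wc, wobble, mismatch


# 10-nt 5' arm, 4-nt loop, 10-nt 3' arm; the outermost pair is a G-U wobble.
sense = "GGAUCGUAGC" + "GAAA" + "GCUACGAUCU"
antisense = reverse_complement(sense)

print("sense    ", stem_pairs(sense))      # (9, 1, 0)
print("antisense", stem_pairs(antisense))  # (9, 0, 1)
```

In this toy case the antisense strand keeps nine of the ten stem pairs, so a stable hairpin is still expected; a precursor rich in G-U pairs would lose correspondingly more.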
Hairpins of about the size of pre-microRNAs are among the most frequent secondary structure motifs. This provides a likely mechanism for the innovation of new microRNAs.6 The current version of miRBase reports several cases of additional microRNAs that emerged within or closely adjacent to a let-7 cluster. The best conserved example is hsa-mir-4763 (ref. 64), located in the G cluster between hsa-let-7a-3 and hsa-let-7b. The human miR sequence is fairly well conserved in primates and to some extent in Eutheria, see the corresponding alignment in the supplemental material. However, there was no clear evidence of expression of the annotated hsa-mir-4763 in brain samples56 or ENCODE cell lines.57 Furthermore, the conserved orthologs in Macaca mulatta and Canis familiaris showed no expression in miRNA-seq data from rhesus macaque brain56 or in small RNA-seq data from domestic dog lymphocytes,65 respectively. The cow microRNA bta-mir-2443, which is also located in cluster G between the two let-7 microRNAs, is not an ortholog of the hsa-mir-4763 sequence. There is evidence for its expression in small RNA libraries of bovine kidney cells.66 Interestingly, its only detectable ortholog was found in the dolphin Tursiops truncatus. This microRNA is located upstream of the mir-4763 ortholog and of the G-let-7-2 sequence, while the corresponding let-7-1 sequence of cluster G is missing. In Bombyx mori, bmo-mir-2795 is inserted between let-7 and mir-100.67 A search of the NCBI databases did not reveal homologs in other insects. In the zebra finch, finally, tgu-mir-2987 is found about 3.1 kb downstream of the E cluster. Its sequence is not conserved in other bird genomes.
Discussion
The association of mir-100, mir-125, and let-7 with its key conserved function in developmental timing32 is one of the evolutionarily most ancient systems of microRNA-based regulation. The ancestral cluster of these three microRNAs dates back to the advent of Bilateria. In fact, only mir-100 and mir-10 date back further and are common to Eumetazoa.3-5 The evolution of the let-7 cluster in Protostomia is characterized mostly by partial losses and only occasional gene duplications (e.g., mir-100 in Brugia and mir-125 in platyhelminths). In contrast, early chordates have acquired a second let-7 locus that subsequently expanded by tandem duplication. The vertebrate-specific genome duplications expanded this system to a large number of paralogous loci. The retention rate of these paralogs is rather high, with up to 20 let-7 cluster microRNAs present in extant tetrapods, compared with the ancestral 24 microRNAs that are inferred from two rounds of duplication of the two chordate clusters. This is comparable with the fate of important transcriptional regulators such as the HOX gene clusters,68,69 whereas the redundancy generated by genome duplications is nearly completely resolved for other gene classes, e.g., metabolic enzymes.
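The retention estimate can be made explicit with a few lines of arithmetic. This is a minimal sketch that assumes each of the two rounds of genome duplication exactly doubles the let-7 cluster complement; the pre-duplication count of six microRNAs across the two chordate clusters is not stated in the text but follows from the inferred ancestral number of 24. All variable names are illustrative.

```python
# Numbers taken from the text; the doubling model is a simplifying assumption.
ROUNDS_OF_DUPLICATION = 2   # vertebrate-specific genome duplications
INFERRED_ANCESTRAL = 24     # let-7 cluster microRNAs after both rounds
RETAINED_MAX = 20           # upper bound observed in extant tetrapods

fold_increase = 2 ** ROUNDS_OF_DUPLICATION
# Implied complement of the two chordate clusters before the duplications.
pre_duplication = INFERRED_ANCESTRAL // fold_increase
retention = RETAINED_MAX / INFERRED_ANCESTRAL

print(f"pre-duplication microRNAs: {pre_duplication}")
print(f"maximum retention: {retention:.0%}")
```

Under these assumptions, at most four of the 24 inferred ancestral copies have been lost in the best-preserved tetrapod lineages.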
The detailed analysis of the let-7 family also shows that microRNAs are not always as conserved as one might expect. Beyond loss events, we also found highly derived paralogs that are unambiguously recognizable as homologs by a combination of synteny and sequence similarity. The best examples are the homology of lin-4 and mir-125 in nematodes and the mir-100 paralogs mir-1473 (tunicates) and mir-2003 (echinoderms). This observation suggests that undocumented homologies are also present among other annotated microRNA families. It also has an impact on the use of microRNAs as phylogenetic markers, since an unrecognized derived microRNA family can be misinterpreted as the ancestral state in which the microRNA family has not yet emerged.
The naming convention of miRBase for paralogous microRNAs has turned out to be a major technical inconvenience for the present study. True orthologs (as determined by both synteny and sequence comparison of the complete precursor sequences) not infrequently have different names in different species. Even worse, paralogous copies may have the same name. It would be desirable, therefore, to rethink the naming schemes to convey information on the genomic location. For vault RNAs, which also form multiple clusters in mammalian genomes, names that make the cluster membership explicit were recently adopted by the HGNC.70
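The effect of unrecognized homologs on presence/absence characters can be sketched as follows. This is an illustration under stated assumptions, not the analysis performed here: the alias table encodes only the three homologies named above, the sea urchin entry follows the cluster composition described in the text, and the tunicate and nematode annotation sets as well as the function name `presence_matrix` are hypothetical.

```python
# Derived names mapped to the family they are homologous to (from the text).
HOMOLOG_ALIASES = {
    "mir-2003": "mir-100",  # echinoderms
    "mir-1473": "mir-100",  # tunicates
    "lin-4": "mir-125",     # nematodes
}

FAMILIES = ("mir-100", "let-7", "mir-125")

# Annotated names per taxon; the two "illustrative" rows are invented examples.
ANNOTATED = {
    "sea urchin": {"mir-2003", "let-7", "mir-125"},
    "tunicate (illustrative)": {"mir-1473", "let-7"},
    "nematode (illustrative)": {"lin-4", "let-7"},
}


def presence_matrix(annotated, aliases=None):
    """Score each family as present/absent, optionally resolving aliases."""
    aliases = aliases or {}
    matrix = {}
    for species, names in annotated.items():
        families = {aliases.get(name, name) for name in names}
        matrix[species] = {family: family in families for family in FAMILIES}
    return matrix


naive = presence_matrix(ANNOTATED)
curated = presence_matrix(ANNOTATED, HOMOLOG_ALIASES)

for species in ANNOTATED:
    print(species, "naive:", naive[species], "curated:", curated[species])
```

Scored by name alone, mir-100 appears absent from the sea urchin, which a parsimony analysis could read as the ancestral state; resolving the alias restores the character to present.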
Acknowledgments
This work is based on the results of a bioinformatics computer lab course in the Winter Semester 2010/2011. The following students contributed their preliminary analysis to this work: Denise Aumer, Simon Bin, Peter Buske, Jörg Dreier, Katharina Eichler, Martin Fischer, Belinda Kahnt, Rebecca Kirsch, Martin Krüger, Vera Lede, Sandra Treffkorn, Sigrid Uxa, Sarah Witzsche. This work was funded in part by the European Regional Development Fund (ERDF) and by the Free State of Saxony within the framework of LIFE – Leipzig Research Center for Civilization Diseases (CO), a PhD stipend funded by the European Social Fund (AW), and the European FP-7 project QUANTOMICS (PFS, SB, no. 222664).
Footnotes
Previously published online: www.landesbioscience.com/journals/rnabiology/article/18974
References
- 1. Berezikov E, Guryev V, van de Belt J, Wienholds E, Plasterk RH, Cuppen E. Phylogenetic shadowing and computational identification of human microRNA genes. Cell. 2005;120:21–4. doi: 10.1016/j.cell.2004.12.031.
- 2. Sempere LF, Cole CN, McPeek MA, Peterson KJ. The phylogenetic distribution of metazoan microRNAs: insights into evolutionary complexity and constraint. J Exp Zool B Mol Dev Evol. 2006;306:575–88. doi: 10.1002/jez.b.21118.
- 3. Heimberg AM, Sempere LF, Moy VN, Donoghue PCJ, Peterson KJ. MicroRNAs and the advent of vertebrate morphological complexity. Proc Natl Acad Sci U S A. 2008;105:2946–50. doi: 10.1073/pnas.0712259105.
- 4. Fu X, Adamski M, Thompson EM. Altered miRNA repertoire in the simplified chordate, Oikopleura dioica. Mol Biol Evol. 2008;25:1067–80. doi: 10.1093/molbev/msn060.
- 5. Hertel J, Lindemeyer M, Missal K, Fried C, Tanzer A, Flamm C, et al. The expansion of the metazoan microRNA repertoire. BMC Genomics. 2006;7:25. doi: 10.1186/1471-2164-7-25.
- 6. Tanzer A, Stadler PF. Molecular evolution of a microRNA cluster. J Mol Biol. 2004;339:327–35. doi: 10.1016/j.jmb.2004.03.065.
- 7. Tanzer A, Stadler PF. Evolution of microRNAs. In: Ying SY, editor. MicroRNA Protocols. Methods in Molecular Biology, vol. 342. Totowa, NJ: Humana Press, 2006:335–50.
- 8. Ambros V. A hierarchy of regulatory genes controls a larva-to-adult developmental switch in C. elegans. Cell. 1989;57:49–57. doi: 10.1016/0092-8674(89)90171-2.
- 9. Lau NC, Lim LP, Weinstein EG, Bartel DP. An abundant class of tiny RNAs with probable regulatory roles in Caenorhabditis elegans. Science. 2001;294:858–62. doi: 10.1126/science.1065062.
- 10. Pasquinelli AE, Reinhart BJ, Slack F, Martindale MQ, Kuroda MI, Maller B, et al. Conservation of the sequence and temporal expression of let-7 heterochronic regulatory RNA. Nature. 2000;408:86–9. doi: 10.1038/35040556.
- 11. Pasquinelli AE, McCoy A, Jiménez E, Saló E, Ruvkun G, Martindale MQ, et al. Expression of the 22 nucleotide let-7 heterochronic RNA throughout the Metazoa: a role in life history evolution? Evol Dev. 2003;5:372–8. doi: 10.1046/j.1525-142X.2003.03044.x.
- 12. Reinhart BJ, Slack FJ, Basson M, Pasquinelli AE, Bettinger JC, Rougvie AE, et al. The 21-nucleotide let-7 RNA regulates developmental timing in Caenorhabditis elegans. Nature. 2000;403:901–6. doi: 10.1038/35002607.
- 13. Roush S, Slack FJ. The let-7 family of microRNAs. Trends Cell Biol. 2008;18:505–16. doi: 10.1016/j.tcb.2008.07.007.
- 14. Griffiths-Jones S. The microRNA Registry. Nucleic Acids Res. 2004;32(Database issue):D109–11. doi: 10.1093/nar/gkh023.
- 15. Bompfünewerer AF, Flamm C, Fried C, Fritzsch G, Hofacker IL, Lehmann J, et al. Evolutionary patterns of non-coding RNAs. Theory Biosci. 2005;123:301–69. doi: 10.1016/j.thbio.2005.01.002.
- 16. Altschul SF, Gish W, Miller W, Myers EW, Lipman DJ. Basic local alignment search tool. J Mol Biol. 1990;215:403–10. doi: 10.1016/S0022-2836(05)80360-2.
- 17. Hertel J, de Jong D, Marz M, Rose D, Tafer H, Tanzer A, et al. Non-coding RNA annotation of the genome of Trichoplax adhaerens. Nucleic Acids Res. 2009;37:1602–15. doi: 10.1093/nar/gkn1084.
- 18. Nawrocki EP, Kolbe DL, Eddy SR. Infernal 1.0: inference of RNA alignments. Bioinformatics. 2009;25:1335–7. doi: 10.1093/bioinformatics/btp157.
- 19. Kent WJ, Baertsch R, Hinrichs A, Miller W, Haussler D. Evolution’s cauldron: duplication, deletion, and rearrangement in the mouse and human genomes. Proc Natl Acad Sci U S A. 2003;100:11484–9. doi: 10.1073/pnas.1932072100.
- 20. Hofacker IL, Fontana W, Stadler PF, Bonhoeffer SL, Tacker M, Schuster P. Fast folding and comparison of RNA secondary structures. Monatsh Chem. 1994;125:167–88. doi: 10.1007/BF00818163.
- 21. Bernhart SH, Hofacker IL, Will S, Gruber AR, Stadler PF. RNAalifold: improved consensus structure prediction for RNA alignments. BMC Bioinformatics. 2008;9:474. doi: 10.1186/1471-2105-9-474.
- 22. Larkin MA, Blackshields G, Brown NP, Chenna R, McGettigan PA, McWilliam H, et al. Clustal W and Clustal X version 2.0. Bioinformatics. 2007;23:2947–8. doi: 10.1093/bioinformatics/btm404.
- 23. Huson DH, Bryant D. Application of phylogenetic networks in evolutionary studies. Mol Biol Evol. 2006;23:254–67. doi: 10.1093/molbev/msj030.
- 24. Ronquist F, Huelsenbeck JP. MrBayes 3: Bayesian phylogenetic inference under mixed models. Bioinformatics. 2003;19:1572–4. doi: 10.1093/bioinformatics/btg180.
- 25. Posada D. jModelTest: phylogenetic model averaging. Mol Biol Evol. 2008;25:1253–6. doi: 10.1093/molbev/msn083.
- 26. Huson DH, Richter DC, Rausch C, Dezulian T, Franz M, Rupp R. Dendroscope: An interactive viewer for large phylogenetic trees. BMC Bioinformatics. 2007;8:460. doi: 10.1186/1471-2105-8-460.
- 27. Grimson A, Srivastava M, Fahey B, Woodcroft BJ, Chiang HR, King N, et al. Early origins and evolution of microRNAs and Piwi-interacting RNAs in animals. Nature. 2008;455:1193–7. doi: 10.1038/nature07415.
- 28. Sempere LF, Martinez P, Cole C, Baguñà J, Peterson KJ. Phylogenetic distribution of microRNAs supports the basal position of acoel flatworms and the polyphyly of Platyhelminthes. Evol Dev. 2007;9:409–15. doi: 10.1111/j.1525-142X.2007.00180.x.
- 29. Ball EE, Miller DJ. Phylogeny: the continuing classificatory conundrum of chaetognaths. Curr Biol. 2006;16:R593–6. doi: 10.1016/j.cub.2006.07.006.
- 30. Harzsch S, Müller CH. A new look at the ventral nerve centre of Sagitta: implications for the phylogenetic position of Chaetognatha (arrow worms) and the evolution of the bilaterian nervous system. Front Zool. 2007;4:14. doi: 10.1186/1742-9994-4-14.
- 31. Paps J, Baguñà J, Riutort M. Bilaterian phylogeny: a broad sampling of 13 nuclear genes provides a new Lophotrochozoa phylogeny and supports a paraphyletic basal acoelomorpha. Mol Biol Evol. 2009;26:2397–406. doi: 10.1093/molbev/msp150.
- 32. Christodoulou F, Raible F, Tomer R, Simakov O, Trachana K, Klaus S, et al. Ancient animal microRNAs and the evolution of tissue identity. Nature. 2010;463:1084–8. doi: 10.1038/nature08744.
- 33. Xue X, Sun J, Zhang Q, Wang Z, Huang Y, Pan W. Identification and characterization of novel microRNAs from Schistosoma japonicum. PLoS One. 2008;3:e4034. doi: 10.1371/journal.pone.0004034.
- 34. Huang J, Hao P, Chen H, Hu W, Yan Q, Liu F, et al. Genome-wide identification of Schistosoma japonicum microRNAs using a deep-sequencing approach. PLoS One. 2009;4:e8206. doi: 10.1371/journal.pone.0008206.
- 35. Wang Z, Xue X, Sun J, Luo R, Xu X, Jiang Y, et al. An “in-depth” description of the small non-coding RNA population of Schistosoma japonicum schistosomulum. PLoS Negl Trop Dis. 2010;4:e596. doi: 10.1371/journal.pntd.0000596.
- 36. Palakodeti D, Smielewska M, Graveley BR. MicroRNAs from the Planarian Schmidtea mediterranea: a model system for stem cell biology. RNA. 2006;12:1640–9. doi: 10.1261/rna.117206.
- 37. Friedländer MR, Adamidi C, Han T, Lebedeva S, Isenbarger TA, Hirst M, et al. High-resolution profiling and discovery of planarian small RNAs. Proc Natl Acad Sci U S A. 2009;106:11546–51. doi: 10.1073/pnas.0905222106.
- 38. Lu YC, Smielewska M, Palakodeti D, Lovci MT, Aigner S, Yeo GW, et al. Deep sequencing identifies new and regulated microRNAs in Schmidtea mediterranea. RNA. 2009;15:1483–91. doi: 10.1261/rna.1702009.
- 39. Copeland CS, Marz M, Rose D, Hertel J, Brindley PJ, Santana CB, et al. Homology-based annotation of non-coding RNAs in the genomes of Schistosoma mansoni and Schistosoma japonicum. BMC Genomics. 2009;10:464. doi: 10.1186/1471-2164-10-464.
- 40. Sokol NS, Xu P, Jan YN, Ambros V. Drosophila let-7 microRNA is required for remodeling of the neuromusculature during metamorphosis. Genes Dev. 2008;22:1591–6. doi: 10.1101/gad.1671708.
- 41. de Wit E, Linsen SE, Cuppen E, Berezikov E. Repertoire and evolution of miRNA genes in four divergent nematode species. Genome Res. 2009;19:2064–74. doi: 10.1101/gr.093781.109.
- 42. Ruby JG, Jan C, Player C, Axtell MJ, Lee W, Nusbaum C, et al. Large-scale sequencing reveals 21U-RNAs and additional microRNAs and endogenous siRNAs in C. elegans. Cell. 2006;127:1193–207. doi: 10.1016/j.cell.2006.10.040.
- 43. Poole CB, Davis PJ, Jin J, McReynolds LA. Cloning and bioinformatic identification of small RNAs in the filarial nematode, Brugia malayi. Mol Biochem Parasitol. 2010;169:87–94. doi: 10.1016/j.molbiopara.2009.10.004.
- 44. Lee RC, Feinbaum RL, Ambros V. The C. elegans heterochronic gene lin-4 encodes small RNAs with antisense complementarity to lin-14. Cell. 1993;75:843–54. doi: 10.1016/0092-8674(93)90529-Y.
- 45. Caygill EE, Johnston LA. Temporal regulation of metamorphic processes in Drosophila by the let-7 and miR-125 heterochronic microRNAs. Curr Biol. 2008;18:943–50. doi: 10.1016/j.cub.2008.06.020.
- 46. Lagos-Quintana M, Rauhut R, Yalcin A, Meyer J, Lendeckel W, Tuschl T. Identification of tissue-specific microRNAs from mouse. Curr Biol. 2002;12:735–9. doi: 10.1016/S0960-9822(02)00809-6.
- 47. Tyler DM, Okamura K, Chung WJ, Hagen JW, Berezikov E, Hannon GJ, et al. Functionally distinct regulatory RNAs generated by bidirectional transcription and processing of microRNA loci. Genes Dev. 2008;22:26–36. doi: 10.1101/gad.1615208.
- 48. International Chicken Genome Sequencing Consortium. Sequence and comparative analysis of the chicken genome provide unique perspectives on vertebrate evolution. Nature. 2004;432:695–716. doi: 10.1038/nature03154.
- 49. Venkatesh B, Kirkness EF, Loh YH, Halpern AL, Lee AP, Johnson J, et al. Survey sequencing and comparative analysis of the elephant shark (Callorhinchus milii) genome. PLoS Biol. 2007;5:e101. doi: 10.1371/journal.pbio.0050101.
- 50. Ohno S. Evolution by gene duplication. Berlin: Springer-Verlag, 1970.
- 51. Meyer A, Schartl M. Gene and genome duplications in vertebrates: the one-to-four (-to-eight in fish) rule and the evolution of novel gene functions. Curr Opin Cell Biol. 1999;11:699–704. doi: 10.1016/S0955-0674(99)00039-3.
- 52. Dehal P, Boore JL. Two rounds of whole genome duplication in the ancestral vertebrate. PLoS Biol. 2005;3:e314. doi: 10.1371/journal.pbio.0030314.
- 53. Hoegg S, Brinkmann H, Taylor JS, Meyer A. Phylogenetic timing of the fish-specific genome duplication correlates with the diversification of teleost fish. J Mol Evol. 2004;59:190–203. doi: 10.1007/s00239-004-2613-z.
- 54. Crow KD, Stadler PF, Lynch VJ, Amemiya CT, Wagner GP. The “fish-specific” Hox cluster duplication is coincident with the origin of teleosts. Mol Biol Evol. 2006;23:121–36. doi: 10.1093/molbev/msj020.
- 55. Dodgson JB, Delany ME, Cheng HH. Poultry genome sequences: progress and outstanding challenges. Cytogenet Genome Res. 2011;134:19–26. doi: 10.1159/000324413.
- 56. Somel M, Guo S, Fu N, Yan Z, Hu HY, Xu Y, et al. MicroRNA, mRNA, and protein expression link development and aging in human and macaque brain. Genome Res. 2010;20:1207–18. doi: 10.1101/gr.106849.110.
- 57. Birney E, Stamatoyannopoulos JA, Dutta A, Guigó R, Gingeras TR, Margulies EH, et al.; ENCODE Project Consortium; NISC Comparative Sequencing Program; Baylor College of Medicine Human Genome Sequencing Center; Washington University Genome Sequencing Center; Broad Institute; Children’s Hospital Oakland Research Institute. Identification and analysis of functional elements in 1% of the human genome by the ENCODE pilot project. Nature. 2007;447:799–816. doi: 10.1038/nature05874.
- 58. Morin RD, O’Connor MD, Griffith M, Kuchenbauer F, Delaney A, Prabhu AL, et al. Application of massively parallel sequencing to microRNA profiling and discovery in human embryonic stem cells. Genome Res. 2008;18:610–21. doi: 10.1101/gr.7179508.
- 59. Heimberg AM, Cowper-Sal-lari R, Sémon M, Donoghue PC, Peterson KJ. microRNAs reveal the interrelationships of hagfish, lampreys, and gnathostomes and the nature of the ancestral vertebrate. Proc Natl Acad Sci U S A. 2010;107:19379–83. doi: 10.1073/pnas.1010350107.
- 60. Campo-Paysaa F, Sémon M, Cameron RA, Peterson KJ, Schubert M. microRNA complements in deuterostomes: origin and evolution of microRNAs. Evol Dev. 2011;13:15–27. doi: 10.1111/j.1525-142X.2010.00452.x.
- 61. Philippe H, Brinkmann H, Copley RR, Moroz LL, Nakano H, Poustka AJ, et al. Acoelomorph flatworms are deuterostomes related to Xenoturbella. Nature. 2011;470:255–8. doi: 10.1038/nature09676.
- 62. Wheeler BM, Heimberg AM, Moy VN, Sperling EA, Holstein TW, Heber S, et al. The deep evolution of metazoan microRNAs. Evol Dev. 2009;11:50–68. doi: 10.1111/j.1525-142X.2008.00302.x.
- 63. Bender W. MicroRNAs in the Drosophila bithorax complex. Genes Dev. 2008;22:14–9. doi: 10.1101/gad.1614208.
- 64. Persson H, Kvist A, Rego N, Staaf J, Vallon-Christersson J, Luts L, et al. Identification of new microRNAs in paired normal and tumor breast tissue suggests a dual role for the ERBB2/Her2 gene. Cancer Res. 2011;71:78–86. doi: 10.1158/0008-5472.CAN-10-1869.
- 65. Friedländer MR, Chen W, Adamidi C, Maaskola J, Einspanier R, Knespel S, et al. Discovering microRNAs from deep sequencing data using miRDeep. Nat Biotechnol. 2008;26:407–15. doi: 10.1038/nbt1394.
- 66. Glazov EA, Kongsuwan K, Assavalapsakul W, Horwood PF, Mitter N, Mahony TJ. Repertoire of bovine miRNA and miRNA-like small regulatory RNAs expressed upon viral infection. PLoS One. 2009;4:e6349. doi: 10.1371/journal.pone.0006349.
- 67. Liu S, Li D, Li Q, Zhao P, Xiang Z, Xia Q. MicroRNAs of Bombyx mori identified by Solexa sequencing. BMC Genomics. 2010;11:148. doi: 10.1186/1471-2164-11-148.
- 68. Prohaska SJ, Fried C, Amemiya CT, Ruddle FH, Wagner GP, Stadler PF. The shark HoxN cluster is homologous to the human HoxD cluster. J Mol Evol. 2004;58:212–7. doi: 10.1007/s00239-003-2545-z.
- 69. Amemiya CT, Powers TP, Prohaska SJ, Grimwood J, Schmutz J, Dickson M, et al. Complete HOX cluster characterization of the coelacanth provides further evidence for slow evolution of its genome. Proc Natl Acad Sci U S A. 2010;107:3622–7. doi: 10.1073/pnas.0914312107.
- 70. Stadler PF, Chen JJ, Hackermüller J, Hoffmann S, Horn F, Khaitovich P, et al. Evolution of vault RNAs. Mol Biol Evol. 2009;26:1975–91. doi: 10.1093/molbev/msp112.




