Genome Biology and Evolution. 2014 Feb 2;6(3):491–499. doi: 10.1093/gbe/evu023

Evolution of the Relaxin/Insulin-Like Gene Family in Anthropoid Primates

José Ignacio Arroyo 1,2, Federico G Hoffmann 3,4, Juan C Opazo 1,*
PMCID: PMC3971578  PMID: 24493383

Abstract

The relaxin/insulin-like gene family includes signaling molecules that perform a variety of physiological roles mostly related to reproduction and neuroendocrine regulation. Several previous studies have focused on the evolutionary history of relaxin genes in anthropoid primates, with particular attention to resolving the duplication history of the RLN1 and RLN2 genes, which are found as duplicates only in apes. These studies have revealed that the RLN1 and RLN2 paralogs in apes have a more complex history than their phyletic distribution would suggest. In this regard, alternative scenarios have been proposed to explain the timing of duplication, and the history of gene gain and loss along the organismal tree. In this article, we revisit the question and specifically reconstruct phylogenies based on coding and noncoding sequence in anthropoid primates to readdress the timing of the duplication event giving rise to RLN1 and RLN2 in apes. Results from our phylogenetic analyses based on noncoding sequence revealed that the duplication event that gave rise to RLN1 and RLN2 occurred in the last common ancestor of catarrhine primates, between ∼44.2 and 29.6 Ma, and not in the last common ancestor of apes or anthropoids, as previously suggested. Comparative analyses based on coding and noncoding sequence suggest an event of convergent evolution at the sequence level between co-ortholog genes, the single-copy RLN gene found in New World monkeys and the RLN1 gene of apes, where changes in a fraction of the convergent sites appear to be driven by positive selection.

Keywords: relaxin, differential gene retention, convergent evolution, gene duplication, positive selection

Introduction

Convergent evolution is defined as the process whereby unrelated organisms independently reach similar character states. At the phenotype level, one of the best known examples of convergence is the wing, in which phylogenetically unrelated groups (e.g., insects, bats, and birds) evolved the ability of flight independently. At the molecular level, several cases have been reported in which preexisting genes have changed their original function (Eizinger et al. 1999; Piatigorski 2007). One remarkable example is the independent evolution of the oxygen-transport hemoglobins between gnathostomes (jawed vertebrates) and cyclostomes (jawless vertebrates) (Hoffmann et al. 2010). An important issue regarding convergent evolution is to understand the role of different evolutionary forces that are behind the process to understand the mechanisms of functional adaptation. Although convergent evolution represents an important mechanism to promote evolutionary innovations, detecting convergent events represents a challenge especially when the duplicative history of the genes is complex, and orthologous relationships are not well understood.

The relaxin/insulin-like gene family includes signaling molecules that perform a variety of physiological roles mostly related to reproduction and neuroendocrine regulation (Bathgate et al. 2003; Sherwood 2004; Park et al. 2005; McGowan et al. 2008). Recent analyses revealed that the two whole genome duplications that occurred early in vertebrate evolution are linked to the initial expansion of this group of genes (Hoffmann and Opazo 2011; Yegorov and Good 2012). Members of this gene family are found on three different genomic locations in mammals, which have been called relaxin family locus (RFL) A, B, and C (Park et al. 2008).

The number and nature of genes in these three genomic loci are well conserved in most mammalian lineages, with the exception of the RFLB locus (Park et al. 2008; Hoffmann and Opazo 2011; Arroyo, Hoffmann, Good, et al. 2012; Arroyo, Hoffmann, Opazo 2012a, 2012b). This locus possesses a complex duplicative history characterized by small-scale duplications and differential gene retention, where the relative age of many genes is not consistent with their phyletic distribution (Hoffmann and Opazo 2011; Arroyo, Hoffmann, Good, et al. 2012; Arroyo, Hoffmann, Opazo 2012a, 2012b). For example, the INSL4 gene, also called placentin, is restricted to catarrhine primates but derives from a duplication event in the last common ancestor of placental mammals (Bieche et al. 2003; Park et al. 2008; Park, Semyonov, et al. 2008; Arroyo, Hoffmann, Good, et al. 2012; Arroyo, Hoffmann, Opazo 2012b). This is also true for the RLN1 and RLN2 paralogs of anthropoid primates (Wilkinson et al. 2005; Park et al. 2008; Park, Semyonov, et al. 2008; Hoffmann and Opazo 2011; Arroyo, Hoffmann, Good, et al. 2012; Arroyo, Hoffmann, Opazo 2012b), for which multiple competing scenarios have been proposed to explain their evolutionary origin. Initial studies postulated that the duplication event that gave rise to the RLN1 and RLN2 genes, which are only found in duplicate in apes, occurred in their last common ancestor (fig. 1A; Evans et al. 1994; Wilkinson et al. 2005; Park et al. 2008; Park, Semyonov, et al. 2008; Hoffmann and Opazo 2011). In this scenario, the RLN1 and RLN2 genes in apes would be co-orthologs to the single copy RLN gene found in most mammals. More recently, Arroyo, Hoffmann, Opazo (2012b) suggested that RLN1 and RLN2 originated in the last common ancestor of anthropoid primates, and were only retained as duplicates in apes, whereas New and Old World monkeys independently lost copies of RLN1 and RLN2, respectively (fig. 1B).
Here, the single copy RLN gene from New World monkeys would be a 1:1 ortholog to the RLN1 gene of apes, whereas the single copy RLN gene from Old World monkeys would be a 1:1 ortholog to the RLN2 gene of apes. However, dot-plot comparisons suggested the possibility that the RLN gene found in New World monkeys could be a 1:1 ortholog to the RLN2 gene of apes (fig. 1C; Arroyo, Hoffmann, Opazo [2012b]). Thus, the relationships among these genes remained unresolved.

Fig. 1.—

Schematic representations of alternative hypotheses regarding phylogenetic relationships among the duplicated RLN genes in anthropoid primates. In (A), the RLN1 and RLN2 genes arose via duplication of a proto-RLN gene in the last common ancestor of apes. In (B), the duplication event that gave rise to the RLN1 and RLN2 genes predates the radiation of anthropoid primates. Although a two-gene arrangement was present in the last common ancestor of anthropoid primates, only apes appear to have retained both copies, whereas New and Old World monkeys independently retained complementary gene copies, RLN1 and RLN2, respectively. In (C), the duplication event also predates the radiation of anthropoid primates, but this time New and Old World monkeys have independently retained the RLN2 paralog. Lineages in gray denote gene losses.

The main goal of this research is to unravel the history of duplication of the RLN1 and RLN2 genes of anthropoid primates, to estimate the timing of the duplication that gave rise to them, and to assess the potential role of natural selection in their divergence. To this end, we contrasted phylogenies based on coding and noncoding sequences, and compared rates of synonymous and nonsynonymous substitution along the tree based on coding sequences. Results from our phylogenetic analyses based on noncoding sequence revealed that the duplication event that gave rise to the RLN1 and RLN2 genes occurred in the last common ancestor of catarrhine primates, between ∼44.2 and 29.6 Ma, and not in the last common ancestor of apes or anthropoids, as previously inferred. Comparative analyses based on coding and noncoding sequence suggest an event of convergent evolution at the sequence level between co-ortholog genes, the single-copy RLN gene found in New World monkeys and the RLN1 gene of apes. Molecular evolution analyses suggest that changes in some of the convergent sites appear to be driven by positive selection, and also suggest that the C peptide of the relaxin precursor might play functionally relevant roles that need to be explored.

Materials and Methods

DNA Sequence Data

We manually identified relaxin/insulin-like genes that belong to the Relaxin Family Locus B (RFLB) in 15 species of primates representing all main groups of the order (supplementary table S1, Supplementary Material online). The primate species included six apes (human, Homo sapiens; chimpanzee, Pan troglodytes; bonobo, P. paniscus; gorilla, Gorilla gorilla; orangutan, Pongo abelii; and gibbon, Nomascus leucogenys), four Old World monkeys (rhesus macaque, Macaca mulatta; crab-eating macaque, M. fascicularis; olive baboon, Papio anubis; and hamadryas baboon, Pap. hamadryas), two New World monkeys (squirrel monkey, Saimiri boliviensis, and marmoset, Callithrix jacchus), one tarsier (Tarsius syrichta), and two strepsirrhines (mouse lemur, Microcebus murinus, and bushbaby, Otolemur garnetti). We compared annotated exon sequences with unannotated genomic sequences using the program Blast2seq (Tatusova and Madden 1999). Putatively functional genes were characterized by an intact open reading frame with the canonical two exon/one intron structure typical of vertebrate RLN/INSL-like genes, whereas pseudogenes were identifiable because of their high sequence similarity to functional orthologs and the presence of inactivating mutations, and/or the lack of exons. To distinguish among tandemly arrayed gene copies, we indexed each gene copy with the symbol T followed by a number that corresponds to the linkage order in the 5′ to 3′ orientation; thus, the first gene in the cluster is labeled T1, the second T2, and so forth. Pseudogenes were indexed with the ps suffix.
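The indexing convention described above can be illustrated with a minimal Python sketch. The `GeneCopy` class, the coordinates, and the exact way the ps suffix is attached to the label (here `T2ps`) are assumptions made for illustration; the text specifies only the T-plus-number scheme and the use of a ps suffix.

```python
from dataclasses import dataclass


@dataclass
class GeneCopy:
    start: int           # hypothetical start coordinate along the 5'->3' orientation
    is_pseudogene: bool  # True if inactivating mutations or missing exons were found


def label_copies(copies):
    """Label tandem copies T1, T2, ... in 5'->3' linkage order.

    Pseudogenes receive a 'ps' suffix (suffix placement is an assumption).
    """
    ordered = sorted(copies, key=lambda copy: copy.start)
    labels = []
    for index, copy in enumerate(ordered, start=1):
        label = f"T{index}"
        if copy.is_pseudogene:
            label += "ps"
        labels.append(label)
    return labels


# Three made-up copies, deliberately listed out of linkage order.
example = [GeneCopy(5200, False), GeneCopy(1000, False), GeneCopy(3100, True)]
print(label_copies(example))
```

Sorting by coordinate before numbering keeps the labels tied to linkage order rather than to the order in which copies happened to be found.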

Phylogenetic Inference

We estimated phylogenetic relationships among RLN genes in all major groups of primates. We used maximum likelihood and Bayesian analyses, as implemented in the programs Treefinder version March 2011 (Jobb et al. 2004) and MrBayes v3.1.2 (Ronquist and Huelsenbeck 2003), respectively. Because convergent evolution is typically restricted to the coding regions, in addition to using phylogenetic reconstructions based on coding sequence, we also used noncoding sequences (flanking regions and intron 1) to unravel the evolutionary history of the RLN genes in anthropoid primates. Sequence alignments were carried out using the L-INS-i strategy from MAFFT v.6 (Katoh et al. 2009). In the case of the coding sequence, the best fitting models for each structural domain (signal peptide, and peptides B, C, and A) were estimated separately using the propose model routine from the program Treefinder version March 2011 (Jobb et al. 2004). For noncoding sequences, a single model of molecular evolution was estimated for each region (up- and downstream flanking sequences, and intron 1). In the case of maximum likelihood, we estimated the best tree under the selected models, and assessed support for the nodes with 1,000 bootstrap pseudoreplicates. In the Bayesian analysis, two simultaneous independent runs were performed for 10 × 10⁶ iterations of a Markov Chain Monte Carlo algorithm, with six simultaneous chains sampling trees every 1,000 generations. Support for the nodes and parameter estimates were derived from a majority rule consensus of the last 5,000 trees sampled after convergence. The average standard deviation of split frequencies remained ≤0.01 after the burn-in threshold.
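As a quick check on the sampling scheme, the following Python sketch works out how many trees each run stores and what burn-in the retained sample implies. It assumes that the 5,000 retained trees refer to each run separately, which the text does not state explicitly.

```python
# Values taken from the Bayesian analysis settings described above.
iterations = 10 * 10**6   # MCMC iterations per run
sample_every = 1_000      # one tree sampled every 1,000 generations
retained_trees = 5_000    # trees kept for the consensus (assumed to be per run)

trees_per_run = iterations // sample_every
burnin_trees = trees_per_run - retained_trees
burnin_fraction = burnin_trees / trees_per_run

print(trees_per_run, burnin_trees, burnin_fraction)
```

Under that assumption, each run stores 10,000 trees and the first half is discarded as burn-in.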

Molecular Evolution Analysis

To investigate the possible role of natural selection in the evolutionary history of the RLN gene of New World monkeys, we explored variation in ω, the ratio of the rate of nonsynonymous substitutions per nonsynonymous site (dN) to the rate of synonymous substitutions per synonymous site (dS), in a maximum likelihood framework using the program codeml from the PAML v4.4 package (Yang 2007). We compared two sets of models: the first set focused on comparing changes in ω (= dN/dS) along the branches of the tree, and the second set focused on comparing changes in ω along the different sites in the alignment between background and foreground sets of branches. We first compared the following two branch models: 1) a 1-ω model in which a single ω estimate was assigned to all branches in the tree; and 2) a 2-ω model, which assigned one ω to the ancestral branch of the New World monkey RLN clade, and a second ω to all other branches. We also implemented branch-site models, which explore changes in ω for a set of sites in a specific branch of the tree to assess changes in their selective regime (Yang and dos Reis 2011). In this case, the ancestral branch of the New World monkey RLN clade was labeled as the foreground branch. We compared the modified model A (Yang et al. 2005; Zhang et al. 2005), in which some sites are allowed to change to an ω > 1 in the foreground branch, with the corresponding null hypothesis of neutral evolution. The Bayes Empirical Bayes (BEB) method was used to identify sites under positive selection (Nielsen and Yang 1998; Yang et al. 2000). Because the branch-site analysis estimates rates of evolution on a codon-by-codon basis, its implementation is particularly useful in cases when different gene segments evolve at different rates, as is the case with the different domains of the RLN genes.
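For readers who want to set up comparable analyses, the four models can be expressed as codeml control-file settings. The Python sketch below is not the authors' configuration, which is not given in the text. The option names (`model`, `NSsites`, `fix_omega`, `omega`, `runmode`, `seqtype`) are standard codeml settings, whereas the file names, the model keys, and the starting value of 0.5 for ω are hypothetical. The foreground branch would additionally need to be marked in the tree file (for example with `#1`).

```python
# Model-specific codeml options for the four analyses described above.
MODELS = {
    # one omega for all branches
    "branch_1w": {"model": 0, "NSsites": 0, "fix_omega": 0, "omega": 0.5},
    # separate omega for the labeled (foreground) branch
    "branch_2w": {"model": 2, "NSsites": 0, "fix_omega": 0, "omega": 0.5},
    # branch-site model A, null: foreground omega fixed at 1
    "branchsite_null": {"model": 2, "NSsites": 2, "fix_omega": 1, "omega": 1},
    # branch-site model A, alternative: foreground omega estimated
    "branchsite_alt": {"model": 2, "NSsites": 2, "fix_omega": 0, "omega": 0.5},
}


def render_ctl(name, seqfile="rln_codons.phy", treefile="rln_labeled.tre"):
    """Return control-file text for one model (file names are hypothetical)."""
    shared = {
        "seqfile": seqfile,
        "treefile": treefile,
        "outfile": f"{name}.out",
        "runmode": 0,  # evaluate the user-supplied tree
        "seqtype": 1,  # codon sequences
    }
    options = {**shared, **MODELS[name]}
    return "\n".join(f"{key} = {value}" for key, value in options.items())


print(render_ctl("branchsite_null"))
```

Each pair of nested models differs by one free parameter, which is why the likelihood ratio tests are usually evaluated against a chi-square distribution with 1 degree of freedom.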

Results and Discussion

The evolutionary history of the relaxin genes in anthropoid primates has been intensely studied (Evans et al. 1994; Wilkinson et al. 2005; Park et al. 2008; Park, Semyonov, et al. 2008; Hoffmann and Opazo 2011; Arroyo, Hoffmann, Opazo 2012b). Most studies have focused on resolving the duplicative history of the RLN1 and RLN2 genes of apes. These studies suggest that the RLN1 and RLN2 paralogs of apes have a more complex history than their phyletic distribution suggests. In this regard, three alternative scenarios have been proposed to explain the timing of duplication and gene gains and losses along the organismal tree (fig. 1A–C). Initial studies had suggested that the duplication giving rise to RLN1 and RLN2 mapped to the last common ancestor of apes, between approximately 29.6 and 18.8 Ma (fig. 1A; Evans et al. 1994; Wilkinson et al. 2005; Park et al. 2008; Park, Semyonov, et al. 2008), but phylogenies with more extensive taxonomic sampling suggested that the same duplication mapped to the last common ancestor of anthropoid primates, the group that includes apes and Old and New World monkeys, between ∼71.1 and 44.2 Ma. The identity of the RLN gene lost by New and Old World monkeys remained unclear (fig. 1B and C; Arroyo, Hoffmann, Opazo 2012b), as support for the relevant nodes was not strong enough to discriminate among the competing alternatives.

The phylogenetic evidence presented by Arroyo, Hoffmann, Opazo (2012b) suggested an older origin than previously proposed, but it was not conclusive (Wilkinson et al. 2005; Park et al. 2008; Park, Semyonov, et al. 2008; Hoffmann and Opazo 2011). Phylogenetic analyses of paralogous members of a gene family often result in nonorthologous genes appearing more similar to each other than they are to their true orthologs. In particular, gene conversion and positive Darwinian selection often obscure phylogenetic reconstructions among paralog members of a gene family. However, because both gene conversion and positive Darwinian selection are largely restricted to coding regions, true homologous relationships can often be determined by analyzing variation in introns and flanking sequence. Accordingly, we expanded our phylogenetic analyses of the RLN1 and RLN2 paralogs of primates to include noncoding sequences corresponding to the single intron plus the upstream and downstream flanking regions, and also explored the role of natural selection in the evolution of the coding sequence of these genes.

In all analyses the RLN1 and RLN2 paralogs of apes fell in two separate clades that did not deviate significantly from the expected organismal phylogenies (fig. 2). Thus, we infer that these phylogenies resolved orthology among the RLN1 and RLN2 paralogs of apes, with the exception of a small conversion tract in the first exon restricted to chimps and bonobos (Evans et al. 1994). Interestingly, phylogenies based on coding and noncoding sequences gave contrasting answers regarding the position of the single copy RLN gene of New World monkeys (fig. 2). As in Arroyo, Hoffmann, Opazo (2012b), phylogenies based on coding sequence placed the single copy RLN gene of New World monkeys as sister to the RLN1 genes of apes (fig. 2). This tree topology suggests that the duplication that gave rise to the RLN1/RLN2 paralogs occurred in the last common ancestor of anthropoid primates (Arroyo, Hoffmann, Opazo 2012b). However, phylogenies based on the three separate noncoding fragments consistently placed the New World monkey RLN genes as sister to the clade containing RLN1/RLN2 sequences from Old World monkeys and apes (fig. 2). This result suggests a novel alternative to the three evolutionary scenarios already proposed, in which the RLN1 and RLN2 paralogs derive from the duplication of a proto-RLN gene in the last common ancestor of catarrhine primates, between ∼44.2 and 29.6 Ma (fig. 3). According to this novel scenario, the single copy RLN gene of New World monkeys represents the ancestral condition, whereas the single copy RLN gene of Old World monkeys derives from the secondary loss of the RLN1 paralog in the group (fig. 3). This was also supported by approximately unbiased topology tests (Shimodaira and Hasegawa 1999), based on the intron or downstream alignments, which rejected the placement of the New World monkey RLN gene as sister to the RLN1 gene of apes (P < 0.001).
Because the observed differences between coding and noncoding phylogenies were statistically significant, our results are indicative of a pattern of convergent evolution at the sequence level.

Fig. 2.—

Maximum likelihood phylograms depicting relationships among relaxin-like genes in primates based on 1 kb of 5′ flanking sequence, coding sequence, intron 1, and 1 kb of 3′ flanking sequence. Numbers on the nodes correspond to maximum likelihood bootstrap support values and Bayesian posterior probabilities. The single copy RLN genes found in New World monkeys are shaded.

Fig. 3.—

An evolutionary model for the evolution of the RLN1 and RLN2 genes in anthropoid primates. The model indicates that the RLN1 and RLN2 paralogs derive from the duplication of a proto-RLN gene in the last common ancestor of catarrhine primates, and not in the last common ancestor of apes or anthropoids as previously thought. Although a two gene arrangement was present in the last common ancestor of catarrhine primates, only apes appear to have retained both copies, whereas Old World monkeys lost the RLN1 paralog.

Phylogenetic reconstructions have been widely used in the literature to investigate events of putative convergent evolution at the sequence level (Castoe et al. 2009; Li et al. 2010; Liu et al. 2010; Yokoyama et al. 2011). Cases where species with similar phenotypes are grouped together rather than with their true relatives have been considered as evidence for convergent evolution, defined here in a loose manner to include both convergent and parallel evolution. For example, Liu et al. (2010) studied the evolution of prestin genes, which encode for a protein involved in hearing, and found that a process of convergent evolution driven by natural selection was responsible for the placement of the dolphin gene within a clade that included echolocating microbats rather than to the cow, which was its true closest relative.

In this case, we investigated the potential role of natural selection on the evolution of the single copy RLN gene of New World monkeys. In particular, we focused on exploring the possibility that the phylogenetic affinity between the RLN gene from New World monkeys and the RLN1 paralog of apes is due to convergent evolution at the sequence level driven by natural selection. If this were the case, we hypothesized that the branch leading to the RLN gene of New World monkeys would have a dN/dS ratio significantly higher than 1, and that some of the codons under natural selection could have converged to the same state independently in both lineages.

To test the first of these predictions, we explored variation in ω (= dN/dS) among the branches in the tree in a maximum likelihood framework. First, we compared a 2-ω model, which assigned one independent ω estimate to the ancestral branch of the RLN clade of New World monkeys and a second one to the rest of the tree, with a 1-ω model in which all branches were assigned the same ω. The 2-ω model was significantly better according to the likelihood ratio test (LRT = 6.32, P < 0.02). Under the 2-ω model, the ancestral branch of the New World monkey RLN clade had an ω estimate of 1.77 whereas all other branches had an ω of 0.76 (table 1). The branch-site analyses yielded similar results, as the LRT favored the alternative model (LRT = 3.86, P = 0.049), in which several residues switched to a positive selection regime in the ancestral branch of the New World monkey RLN clade. The BEB analysis identified 35 codons under a positive selection regime: two in the region encoding the signal peptide, four in the region encoding the B peptide, 21 in the region encoding the C peptide, and eight in the region encoding the A peptide (table 1). These results suggest that positive Darwinian selection in the ancestral branch of the New World monkey RLN clade was responsible for the remodeling of this protein, and probably accounts for the phylogenetic position of the New World monkey RLN gene in phylogenies derived from coding sequence.
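The two likelihood ratio tests can be reproduced from the log-likelihood values reported in table 1. The Python sketch below assumes that each statistic is compared against a chi-square distribution with 1 degree of freedom, which is consistent with the reported P values; the function name `lrt` is ours.

```python
import math


def lrt(lnl_null, lnl_alt):
    """Return the LRT statistic and its P value for 1 degree of freedom."""
    statistic = 2.0 * (lnl_alt - lnl_null)
    # For 1 df, the chi-square survival function reduces to erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(statistic / 2.0))
    return statistic, p_value


# Branch models: 1-omega (null) versus 2-omega (alternative).
branch_stat, branch_p = lrt(-4734.19, -4731.03)

# Branch-site models: omega fixed (null) versus omega free (alternative).
site_stat, site_p = lrt(-4685.07, -4683.14)

print(f"branch models:      LRT = {branch_stat:.2f}, P = {branch_p:.3f}")
print(f"branch-site models: LRT = {site_stat:.2f}, P = {site_p:.3f}")
```

Using `math.erfc` avoids a dependency on a statistics library, at the cost of handling only the 1-degree-of-freedom case.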

Table 1.

Log Likelihood and Parameter Estimates under Different Branch and Branch-Site Models

| Model | ln L | Parameter Estimates | Positively Selected Sites |
|---|---|---|---|
| Branch models | | | |
| 1-ω | −4,734.19 | ω (all branches) = 0.799 | NA |
| 2-ω | −4,731.03 | ω (non-New World monkey branches) = 0.758; ω (ancestral branch of the New World monkey RLN clade) = 1.776 | NA |
| Branch-site models | | | |
| ω fixed (NWM) | −4,685.07 | p0 = 0.259; p1 = 0.347; p2a = 0.167; p2b = 0.224; ω0 = 0.246; ω1 = 1; ω2 = 1 | NA |
| ω free (NWM) | −4,683.14 | p0 = 0.170; p1 = 0.228; p2a = 0.257; p2b = 0.343; ω0 = 0.245; ω1 = 1; ω2a/b = 3.477 | SP: 19, 20; B: 2, 4, 22, 28; C: 7, 11, 12, 13, 15, 16, 18, 25, 26, 38, 49, 50, 52, 55, 56, 66, 71, 74, 102, 103, 107; A: 3, 4, 7, 8, 9, 12, 19, 22 |

Note.—ln L, log-likelihood value; p, proportion of site class; ω, omega value for branches or site classes; NWM, New World monkeys; SP, signal peptide; B, B peptide; C, C peptide; A, A peptide.
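The per-domain counts quoted in the text can be checked directly against the site list in table 1. This short Python sketch tallies the positively selected sites by domain; the site numbers are copied from the table, and the variable names are ours.

```python
# Positively selected sites from the branch-site analysis (table 1),
# numbered within each structural domain.
beb_sites = {
    "SP": [19, 20],
    "B": [2, 4, 22, 28],
    "C": [7, 11, 12, 13, 15, 16, 18, 25, 26, 38, 49,
          50, 52, 55, 56, 66, 71, 74, 102, 103, 107],
    "A": [3, 4, 7, 8, 9, 12, 19, 22],
}

counts = {domain: len(sites) for domain, sites in beb_sites.items()}
total = sum(counts.values())
c_peptide_share = counts["C"] / total

print(counts, total, c_peptide_share)
```

The tally gives 35 sites in total, with the C peptide accounting for 21 of them (60%).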

We then explored whether convergence at the nucleotide level resulted in convergence at the amino acid level. In this scenario, a number of the codons under natural selection in the ancestral branch of the New World monkey RLN clade would have converged to the same amino acid state as the RLN1 genes of apes. To do so, we reconstructed ancestral sequences of the relevant nodes using a maximum likelihood approach and tracked amino acid changes along the tree (fig. 4). We found that two of the codons inferred to be evolving under positive Darwinian selection, B4 and C49, had changed in parallel (fig. 4). In the case of the B4 site, a Met was substituted by a Lys in both ancestral branches, whereas a Thr was substituted by an Ala at the C49 site (fig. 4). We identified one additional positively selected codon, C66, where the derived amino acid state belongs to the same functional group (fig. 4). In this case, a nonpolar/neutral amino acid (ValC66) was replaced by amino acids with the same functional properties (fig. 4). The fact that two amino acid replacements were strictly parallel, and that in another case the derived state belongs to the same functional group, indicates that a few of the positively selected codons support the convergence hypothesis at the amino acid level. Thus, our analyses suggest that the sister group relationship between the single copy RLN gene from New World monkeys and the RLN1 paralog of apes is due to an event of convergent evolution at the sequence level between co-ortholog genes, where changes in a subset of the convergent sites appear to be driven by positive selection.
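The distinction between strictly parallel and convergent replacements can be made explicit with a small classifier. The Python sketch below is an illustration rather than the procedure used for fig. 4: the function name and return labels are ours, and only the two sites whose ancestral and derived states are stated in the text (B4 and C49) are included.

```python
def classify_change(anc_a, der_a, anc_b, der_b):
    """Classify replacements at one site along two independent branches.

    'parallel': same ancestral state and same derived state in both branches.
    'convergent': different ancestral states but the same derived state.
    Anything else is reported as 'neither'.
    """
    changed = anc_a != der_a and anc_b != der_b
    if not changed or der_a != der_b:
        return "neither"
    return "parallel" if anc_a == anc_b else "convergent"


# (ancestral, derived) states for the New World monkey RLN branch and the
# ape RLN1 branch, as described in the text.
sites = {
    "B4": ("Met", "Lys", "Met", "Lys"),
    "C49": ("Thr", "Ala", "Thr", "Ala"),
}

for site, states in sites.items():
    print(site, classify_change(*states))
```

Both sites come out as parallel changes, which matches the loose definition of convergence adopted in the text to cover parallel and convergent replacements alike.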

Fig. 4.—

Alignments of relaxin amino acid sequences. The upper panel depicts an alignment of the ancestral states reconstructed for the branch leading to the New World monkey RLN clade (nodes 1 and 2), and two extant New World monkey species. The middle panel shows an alignment of the ancestral states reconstructed for the branch leading to the ape RLN1 clade (nodes 3 and 4), and five extant ape species. The lower panel shows RLN2 sequences from five extant ape species. Amino acids in bold denote sites inferred to be under positive selection, shaded amino acids are parallel changes, and the boxed amino acid is a parallel change where the derived amino acid state was not the same in both lineages but belongs to the same functional group. Amino acid sites labeled with an X were not included in the ancestral sequence reconstruction analysis, as entire columns of gapped sites were removed.

Aside from resolving the evolutionary history of the RLN1 and RLN2 paralogs, our results have functional implications as well. Most of the positively selected residues are located in the region encoding the C peptide, an interesting result given that in marmoset, prorelaxin, the form of the hormone whose C-peptide domain has not been proteolytically cleaved, possesses biological activity similar to that of the processed peptide (Tan et al. 1998; Zarreh-Hoshyari-Khah et al. 2001; Silvertown et al. 2003). Similar results have been shown for relaxin 3 (Bathgate et al. 2006), suggesting that processing the precursor might not be an essential prerequisite for the acquisition of biological activity. A similar situation has been demonstrated for the proinsulin molecule, a member of a closely related gene family, which is an active agent that binds to the insulin-receptor A, eliciting a differential signaling with enhanced mitogenic effects that regulate embryo development (Hernandez-Sanchez et al. 2006; Malaguarnera et al. 2012). In this regard, proinsulin has been detected in the chick embryo as early as 0.5 days of development, during gastrulation, and also in the retinal neuroepithelium at day 3 (Diaz et al. 1999; Hernandez-Sanchez et al. 2002). In addition to the physiological roles of the C peptide in the unprocessed molecule, it is also involved in the correct folding and disulphide bond pairing of the relaxin molecule. Although the C peptide is approximately 100 amino acids long, it has been shown that the full length is not required to attain the correct molecular conformation (Vandlen et al. 1995). In the particular case of the RLN2 molecule, Vandlen et al. (1995) demonstrated that a C peptide of just 13 amino acids is enough to achieve the correct folding and disulphide bond pairing. Similar results have been shown for the insulin molecule (Busse and Carpenter 1976).

A full exploration of the convergent evolution scenario should be accompanied with physiological data that demonstrates that both proteins, RLN1 from apes and RLN gene from New World monkeys, perform the same physiological function. However, this is difficult to demonstrate at this time, as in a recent review, Bathgate et al. (2013) stated, “The function of the RLN1 gene in humans and higher primates is unknown.” In the same work they also said “The RLN1 gene is only found in humans and the great apes, but in some of these species, it is doubtful that a functional peptide is produced. Even in humans where mRNA expression is detected in multiple tissues, there is no evidence for functional peptide production.” In agreement with these statements, Shabanpoor et al. (2009) wrote, “the mRNA expression of H1 relaxin has been detected in human deciduas, prostate gland and placenta trophoblast. However, its functional significance remains unknown.”

At the expression level, it has been reported that the RLN1 gene has a more restricted expression than the RLN2 gene. Expression of the RLN1 gene has been detected in the decidua, trophoblast, and prostate (Sakbun et al. 1990; Hansell et al. 1991), whereas the RLN2 gene is expressed in the corpus luteum, endometrium, decidua, placenta, prostate, mammary glands, heart, and brain (Bathgate et al. 2006; Ivell et al. 2011). Accordingly, it could be hypothesized that one of the consequences of a convergent event between the RLN1 of apes and the single copy RLN gene of New World monkeys could be a restriction in the expression pattern of the single copy RLN gene found in New World monkeys. However, given the essential physiological roles of the single copy RLN gene found in the RFLB locus in most mammalian species, we think it is highly improbable that this gene could suffer a restriction of its expression pattern in any extant mammal, including New World monkeys. In support of this claim, it has been shown that in marmoset (C. jacchus) the pattern of relaxin expression appears to be very similar to that of humans (Steinetz et al. 1995; Einspanier et al. 1997, 1999).

Conclusions

Our results allowed us to refine the current model for the evolution of the RLN1 and RLN2 paralogs in anthropoid primates. According to our phylogenies, the duplication event that gave rise to the RLN1 and RLN2 paralogs occurred in the last common ancestor of catarrhine primates (fig. 3), and not in the last common ancestor of apes or anthropoids, as previously thought. Although both genes were present in the last common ancestor of catarrhine primates, only apes appear to have retained both copies, whereas Old World monkeys lost the RLN1 paralog. This refined model highlights the role of the differential retention of relatively old paralogs in shaping the gene complement in catarrhine primates. In addition, we showed that the sister group relationship between the RLN gene of New World monkeys and the RLN1 paralog of apes was due to convergent evolution at the nucleotide level partly driven by positive Darwinian selection. We speculate that it is unlikely that the observed convergence at the nucleotide level has resulted in convergence at the functional level. Importantly, our molecular evolution analyses suggest novel research questions regarding the “functional homology” between the New World monkey RLN gene and the RLN1 and RLN2 genes from apes, and regarding the putative functional role of the C peptide and of prorelaxin (i.e., the relaxin molecule that includes the C peptide).

Supplementary Material

Supplementary table S1 is available at Genome Biology and Evolution online (http://www.gbe.oxfordjournals.org/).

Acknowledgments

This work was supported by the Fondo Nacional de Desarrollo Científico y Tecnológico (FONDECYT 1120032) grants to J.C.O., the National Science Foundation (EPS-0903787) grants to F.G.H., and a CONICYT doctoral fellowship to J.I.A.

Literature Cited

  1. Arroyo JI, Hoffmann FG, Good S, Opazo JC. INSL4 pseudogenes help define the relaxin family repertoire in the common ancestor of placental mammals. J Mol Evol. 2012;75:73–78. doi: 10.1007/s00239-012-9517-0.
  2. Arroyo JI, Hoffmann FG, Opazo JC. Gene duplication and positive selection explains unusual physiological roles of the relaxin gene in the European rabbit. J Mol Evol. 2012a;74:52–60. doi: 10.1007/s00239-012-9487-2.
  3. Arroyo JI, Hoffmann FG, Opazo JC. Gene turnover and differential retention in the relaxin/insulin-like gene family in primates. Mol Phylogenet Evol. 2012b;63:768–776. doi: 10.1016/j.ympev.2012.02.011.
  4. Bathgate RA, Samuel CS, Burazin TC, Gundlach AL, Tregear GW. Relaxin: new peptides, receptors and novel actions. Trends Endocrinol Metab. 2003;14:207–213. doi: 10.1016/s1043-2760(03)00081-x.
  5. Bathgate RA, et al. Relaxin-3: improved synthesis strategy and demonstration of its high-affinity interaction with the relaxin receptor LGR7 both in vitro and in vivo. Biochemistry. 2006;45:1043–1053. doi: 10.1021/bi052233e.
  6. Bieche I, et al. Placenta-specific INSL4 expression is mediated by a human endogenous retrovirus element. Biol Reprod. 2003;68:1422–1429. doi: 10.1095/biolreprod.102.010322.
  7. Busse WD, Carpenter FH. Synthesis and properties of carbonylbis(methionyl)insulin, a proinsulin analogue which is convertible to insulin by cyanogen bromide cleavage. Biochemistry. 1976;15:1649–1657. doi: 10.1021/bi00653a010.
  8. Castoe TA, et al. Evidence for an ancient adaptive episode of convergent molecular evolution. Proc Natl Acad Sci U S A. 2009;106:8986–8991. doi: 10.1073/pnas.0900233106.
  9. Diaz B, Pimentel B, de Pablo F, de La Rosa EJ. Apoptotic cell death of proliferating neuroepithelial cells in the embryonic retina is prevented by insulin. Eur J Neurosci. 1999;11:1624–1632. doi: 10.1046/j.1460-9568.1999.00577.x.
  10. Einspanier A, et al. Local relaxin biosynthesis in the ovary and uterus through the oestrous cycle and early pregnancy in the female marmoset monkey (Callithrix jacchus). Hum Reprod. 1997;12:1325–1337. doi: 10.1093/humrep/12.6.1325.
  11. Einspanier A, et al. Relaxin in the marmoset monkey: secretion pattern in the ovarian cycle and early pregnancy. Biol Reprod. 1999;61:512–520. doi: 10.1095/biolreprod61.2.512.
  12. Eizinger A, Jungblut B, Sommer RJ. Evolutionary change in the functional specificity of genes. Trends Genet. 1999;15:197–202. doi: 10.1016/s0168-9525(99)01728-x.
  13. Evans BA, Fu P, Tregear GW. Characterization of two relaxin genes in the chimpanzee. J Endocrinol. 1994;140:385–392. doi: 10.1677/joe.0.1400385.
  14. Hansell D, Bryant G, Greenwood F. Expression of the human relaxin H1 gene in the decidua, trophoblast, and prostate. J Clin Endocrinol Metab. 1991;72:899–904. doi: 10.1210/jcem-72-4-899. [DOI] [PubMed] [Google Scholar]
  15. Hernandez-Sanchez C, Mansilla A, de la Rosa EJ, de Pablo F. Proinsulin in development: new roles for an ancient prohormone. Diabetologia. 2006;49:1142–1150. doi: 10.1007/s00125-006-0232-5. [DOI] [PubMed] [Google Scholar]
  16. Hernandez-Sanchez C, Rubio E, Serna J, de la Rosa EJ, de Pablo F. Unprocessed proinsulin promotes cell survival during neurulation in the chick embryo. Diabetes. 2002;51:770–777. doi: 10.2337/diabetes.51.3.770. [DOI] [PubMed] [Google Scholar]
  17. Hoffmann FG, Opazo JC. Evolution of the relaxin/insulin-like gene family in placental mammals: implications for its early evolution. J Mol Evol. 2011;72:72–79. doi: 10.1007/s00239-010-9403-6. [DOI] [PubMed] [Google Scholar]
  18. Hoffmann FG, Opazo JC, Storz JF. Gene cooption and convergent evolution of oxygen transport hemoglobins in jawed and jawless vertebrates. Proc Natl Acad. Sci U S A. 2010;107:14274–14279. doi: 10.1073/pnas.1006756107. [DOI] [PMC free article] [PubMed] [Google Scholar]
  19. Ivell R, Kotula-Balak M, Glynn D, Heng K, Anand-Ivell R. Relaxin family peptides in the male reproductive system—a critical appraisal. Mol Hum Reprod. 2011;17:71–84. doi: 10.1093/molehr/gaq086. [DOI] [PubMed] [Google Scholar]
  20. Jobb G, Haeseler AV, Strimmer K. TREEFINDER: a powerful graphical analysis environment for molecular phylogenetics. BMC Evol Biol. 2004;4:18. doi: 10.1186/1471-2148-4-18. [DOI] [PMC free article] [PubMed] [Google Scholar] [Retracted]
  21. Katoh K, Asimenos G, Toh H. Multiple alignment of DNA sequences with MAFFT. Methods Mol Biol. 2009;537:39–64. doi: 10.1007/978-1-59745-251-9_3. [DOI] [PubMed] [Google Scholar]
  22. Li Y, Liu Z, Shi P, Zhang J. The hearing gene Prestin unites echolocating bats and whales. Curr Biol. 2010;20:R55–R56. doi: 10.1016/j.cub.2009.11.042. [DOI] [PMC free article] [PubMed] [Google Scholar]
  23. Liu Y, et al. Convergent sequence evolution between echolocating bats and dolphins. Curr Biol. 2010;20:R53–R54. doi: 10.1016/j.cub.2009.11.058. [DOI] [PubMed] [Google Scholar]
  24. Malaguarnera R, et al. Proinsulin binds with high affinity the insulin receptor isoform A and predominantly activates the mitogenic pathway. Endocrinology. 2012;153:2152–2163. doi: 10.1210/en.2011-1843. [DOI] [PubMed] [Google Scholar]
  25. McGowan BM, et al. Relaxin-3 stimulates the hypothalamic-pituitary-gonadal axis. Am J Physiol Endocrinol Metab. 2008;295:E278–E286. doi: 10.1152/ajpendo.00028.2008. [DOI] [PMC free article] [PubMed] [Google Scholar]
  26. Nielsen R, Yang Z. Likelihood models for detecting positively selected amino acid sites and applications to the HIV-1 envelope gene. Genetics. 1998;148:929–936. doi: 10.1093/genetics/148.3.929. [DOI] [PMC free article] [PubMed] [Google Scholar]
  27. Park JI, Chang CL, Hsu SY. New Insights into biological roles of relaxin and relaxin-related peptides. Rev Endocr Metab Disord. 2005;6:291–296. doi: 10.1007/s11154-005-6187-x. [DOI] [PubMed] [Google Scholar]
  28. Park J-I, Semyonov J, Yi W, Chang CL, Hsu SYT. Regulation of receptor signaling by relaxin A chain motifs: derivation of pan-specific and LGR7-specific human relaxin analogs. J Biol Chem. 2008;283:32099–32109. doi: 10.1074/jbc.M806817200. [DOI] [PMC free article] [PubMed] [Google Scholar]
  29. Park J-I, et al. Origin of INSL3-mediated testicular descent in therian mammals. Genome Res. 2008;18:974–985. doi: 10.1101/gr.7119108. [DOI] [PMC free article] [PubMed] [Google Scholar]
  30. Piatigorski J. Gene sharing and evolution: the diversity of protein functions. Cambridge: Harvard University Press; 2007. [Google Scholar]
  31. Ronquist F, Huelsenbeck JP. MrBayes 3: Bayesian phylogenetic inference under mixed models. Bioinformatics. 2003;19:1572–1574. doi: 10.1093/bioinformatics/btg180. [DOI] [PubMed] [Google Scholar]
  32. Sakbun V, Ali SM, Greenwood FC, Bryant-Greenwood GD. Human relaxin in the amnion, chorion, decidua parietalis, basal plate, and placental trophoblast by immunocytochemistry and northern analysis. J Clin Endocrinol Metab. 1990;70:508–514. doi: 10.1210/jcem-70-2-508. [DOI] [PubMed] [Google Scholar]
  33. Sherwood OD. Relaxin’s physiological roles and other diverse actions. Endocr Rev. 2004;25:205–234. doi: 10.1210/er.2003-0013. [DOI] [PubMed] [Google Scholar]
  34. Shimodaira H, Hasegawa M. Multiple comparisons of log-likelihoods with applications to phylogenetic inference. Mol Biol Evol. 1999;16:1114–1116. [Google Scholar]
  35. Silvertown JD, Geddes BJ, Summerlee AJ. Adenovirus-mediated expression of human prorelaxin promotes the invasive potential of canine mammary cancer cells. Endocrinology. 2003;144:3683–3691. doi: 10.1210/en.2003-0248. [DOI] [PubMed] [Google Scholar]
  36. Tan YY, Wade JD, Tregear GW, Summers RJ. Comparison of relaxin receptors in rat isolated atria and uterus by use of synthetic and native relaxin analogues. Br J Pharmacol. 1998;123:762–770. doi: 10.1038/sj.bjp.0701659. [DOI] [PMC free article] [PubMed] [Google Scholar]
  37. Tatusova TA, Madden TL. BLAST 2 Sequences, a new tool for comparing protein and nucleotide sequences. FEMS Microbiol Lett. 1999;174:247–250. doi: 10.1111/j.1574-6968.1999.tb13575.x. [DOI] [PubMed] [Google Scholar]
  38. Vandlen R, Winslow J, Moffat B, Rinderknecht E. Human relaxin: purification, characterization and production of recombinant relaxins for structure function studies. In: MacLennan AH, Tregear GW, Bryant-Greenwood GD, editors. Progress in relaxin research: the proceedings of the second international congress on the hormone relaxin. Singapore: World Scientific Publications; 1995. pp. 59–74. [Google Scholar]
  39. Wilkinson TN, Speed TP, Tregear GW, Bathgate RAD. Evolution of the relaxin-like peptide family. BMC Evol Biol. 2005;5:14. doi: 10.1186/1471-2148-5-14. [DOI] [PMC free article] [PubMed] [Google Scholar]
  40. Yang Z. PAML 4: phylogenetic analysis by maximum likelihood. Mol Biol Evol. 2007;24:1586–1591. doi: 10.1093/molbev/msm088. [DOI] [PubMed] [Google Scholar]
  41. Yang Z, dos Reis M. Statistical properties of the branch-site test of positive selection. Mol Biol Evol. 2011;28:1217–1228. doi: 10.1093/molbev/msq303. [DOI] [PubMed] [Google Scholar]
  42. Yang Z, Nielsen R, Goldman N, Pedersen AM. Codon-substitution models for heterogeneous selection pressure at amino acid sites. Genetics. 2000;155:431–449. doi: 10.1093/genetics/155.1.431. [DOI] [PMC free article] [PubMed] [Google Scholar]
  43. Yang Z, Wong WS, Nielsen R. Bayes empirical Bayes inference of amino acid sites under positive selection. Mol Biol Evol. 2005;22:1107–1118. doi: 10.1093/molbev/msi097. [DOI] [PubMed] [Google Scholar]
  44. Yegorov S, Good S. Using paleogenomics to study the evolution of gene families: origin and duplication history of the relaxin family hormones and their receptors. PLoS One. 2012;7:e32923. doi: 10.1371/journal.pone.0032923. [DOI] [PMC free article] [PubMed] [Google Scholar]
  45. Yokoyama S, Altun A, DeNardo DF. Molecular convergence of infrared vision in snakes. Mol Biol Evol. 2011;28:45–48. doi: 10.1093/molbev/msq267. [DOI] [PMC free article] [PubMed] [Google Scholar]
  46. Zarreh-Hoshyari-Khah R, Bartsch O, Einspanier A, Pohnke Y, Ivell R. Bioactivity of recombinant prorelaxin from the marmoset monkey. Regul Pept. 2001;97:139–146. doi: 10.1016/s0167-0115(00)00205-6. [DOI] [PubMed] [Google Scholar]
  47. Zhang J, Nielsen R, Yang Z. Evaluation of an improved branch-site likelihood method for detecting positive selection at the molecular level. Mol Biol Evol. 2005;22:2472–2479. doi: 10.1093/molbev/msi237. [DOI] [PubMed] [Google Scholar]

Articles from Genome Biology and Evolution are provided here courtesy of Oxford University Press
