Abstract
The superfamily of armadillo repeat proteins is a fascinating archetype of modular-binding proteins involved in various fundamental cellular processes, including cell–cell adhesion, cytoskeletal organization, nuclear import, and molecular signaling. Despite their diverse functions, they all share tandem armadillo (ARM) repeats, which stack together to form a conserved three-dimensional structure. This superhelical armadillo structure enables them to interact with distinct partners by wrapping around them. Despite the important functional roles of this superfamily, a comprehensive analysis of the composition, classification, and phylogeny of this protein superfamily has not been reported. Furthermore, relatively little is known about a subset of ARM proteins, and some of the current annotations of armadillo repeats are incomplete or incorrect, often due to high similarity with HEAT repeats. We identified the entire armadillo repeat superfamily repertoire in the human genome, annotated each armadillo repeat, and performed an extensive evolutionary analysis of the armadillo repeat proteins in both metazoan and premetazoan species. Phylogenetic analyses of the superfamily classified them into several discrete branches with members showing significant sequence homology, and often also related functions. Interestingly, the phylogenetic structure of the superfamily revealed that about 30 % of the members predate metazoans and represent an ancient subset, which is gradually evolving to acquire complex and highly diverse functions.
Electronic supplementary material
The online version of this article (doi:10.1007/s00018-016-2319-6) contains supplementary material, which is available to authorized users.
Keywords: Phylogeny, ARM repeat, HEAT repeat, Metazoa, Premetazoan, Molecular evolution, Protein domains, Catenins, Importins
Introduction
Armadillo proteins share similar imperfect tandem repeats, generally composed of about 40 amino acids and called armadillo (ARM) repeats. ARM repeats were first identified in the Drosophila segment polarity protein named armadillo [1], which is the ortholog of mammalian β-catenin (encoded by CTNNB1 in man), the founding member of this large protein superfamily [2]. Armadillo proteins are involved in a broad range of biological processes, including cell adhesion, molecular signaling, cytoskeletal regulation, and intracellular transport (reviewed in [3]).
Although the sequence similarity between the individual ARM repeats is low, each ARM repeat folds into a conserved three-dimensional (3D) compact helical bundle composed of three α helices (H1, H2, and H3) (Fig. 1a, b). ARM proteins may differ considerably from each other in the number of ARM repeats and in the organization of their ARM domains. However, as far as is known, ARM repeats fold together into a similar superhelical structural domain (ARM domain) in which the neighboring H2–3 helices are packed together in an antiparallel orientation [4]. The resulting ARM domain has a positively charged groove that varies among the different proteins and enables them to interact with distinct partners by wrapping around them (Fig. 1c). The intrinsic thermodynamic stability of the individual repeats and the resulting ARM domain is mediated by interactions between the highly conserved hydrophobic residues in the repeat sequences (Fig. 1b; Fig. S1). The resulting elongated binding surface and the stable biophysical properties of these repeats make ARM domains excellent tools for designing modular peptide-binding scaffolds [5, 6].
In the last decade, many so-called ARM repeat proteins have been identified and several structures of ARM domains have been characterized, including that of p120 catenin (encoded by CTNND1), plakoglobin (JUP), plakophilin 1 (PKP1), adenomatous polyposis coli (APC), and importin-α (KPNA), most of which have been well studied in the past two decades (e.g., [7–9]). However, relatively little is known about many ARM proteins and some of the current annotations of ARM repeats are either incomplete or incorrect. This makes it difficult to identify the full repertoire of ARM superfamily members, to classify them accordingly, and to understand their functional specification and diversification in full.
A major reason for this inconsistency seems to be the similarity between ARM repeats and HEAT repeats (acronym for Huntingtin, elongation factor 3, protein phosphatase 2A, and the yeast kinase TOR1). Although a single HEAT repeat contains only two alpha helices compared to three helices in one ARM repeat, HEAT repeats form a superhelical 3D structure comparable to that of ARM repeats, and it has been suggested that HEAT and ARM repeat proteins are evolutionarily related [10]. Besides the frequent difficulty of distinguishing between these types of repeats, correct detection of ARM and HEAT repeats is problematic [11]. Recent crystal structures of the formins, such as Fmnl2 and Diaph1, show the presence of ARM repeats [12, 13], but remarkably a conserved domain search with the current Pfam or SMART position-specific scoring matrices (PSSMs) fails to detect these repeats. Methods using information from experimentally determined structures improve the prediction of ARM and HEAT repeats and show enhanced sensitivity at low false positive rates [11].
Here, we identify and analyze for the first time what we think is the full repertoire of the ARM superfamily in the human genome. To reconstruct the evolutionary relationships of ARM superfamily members, we extended our comprehensive analysis to 13 metazoan and premetazoan species. Our study unravels the evolution of ARM protein families in multicellular animals, identifies close non-metazoan relatives, and presents an original in-depth phylogenetic analysis on the ARM superfamily.
Materials and methods
Identification of human ARM proteins
We used three strategies to detect all ARM proteins encoded by the human genome. First, the assembly release of the human genome (hg19, GRCh37 Genome Reference Consortium Human Reference 37) was downloaded from the UCSC Genome Browser (http://hgdownload.soe.ucsc.edu/downloads.html). The six-frame translation of the human genome was searched with two profile hidden Markov models (HMM), which were created using Pfam ARM seed (built with 242 sequences) and Pfam ARM full (built with 12,724 sequences) alignments, in the Pfam database [14]. All 504 significant ARM repeat hits, sorted by their E value, are shown in supplementary Table S1 (sheet 1). To identify the corresponding genes, each detected putative ARM repeat was used as query in BLASTp searches against the human genome. The results are shown in Table S1 (sheet 2), and unique entries are summarized in Table S2. Second, the NCBI Gene Entrez database was searched against the ARM-specific PSSM cd00020 “ARM” and other ARM-related PSSMs, such as pfam04826 “armadillo-like” and cl02500 “Armadillo/beta-catenin-like repeat”. The obtained Gene IDs were converted to the protein Reference Sequence (RefSeq) IDs, and for each identified “putative ARM protein,” the longest isoform was collected for the annotation step. The results of this NCBI Gene Entrez database search are summarized in Table S3. Finally, we searched the Protein Data Bank (PDB) for the text entry “Armadillo” and domain entry Pfam “PF00514,” and the results were refined for metazoan entries. All the identified putative ARM hits from NCBI Gene Entrez, HMMs, and PDB searches are summarized in Table S2.
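The six-frame translation mentioned in the first strategy can be illustrated with a minimal Python sketch. It assumes the standard genetic code and plain uppercase or lowercase DNA input; the function names are hypothetical and this is not the pipeline used in the study, which does not specify its translation tool.

```python
from itertools import product

# Standard genetic code (NCBI translation table 1), codons ordered T, C, A, G.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna):
    """Return the reverse complement of an uppercase DNA string."""
    return dna.translate(COMPLEMENT)[::-1]


def translate(dna):
    """Translate consecutive codons; codons with ambiguous bases become 'X'."""
    return "".join(
        CODON_TABLE.get(dna[i:i + 3], "X") for i in range(0, len(dna) - 2, 3)
    )


def six_frame_translation(dna):
    """Return the three forward (+1..+3) and three reverse (-1..-3) frames."""
    dna = dna.upper()
    frames = {}
    for strand, seq in (("+", dna), ("-", reverse_complement(dna))):
        for offset in range(3):
            frames[f"{strand}{offset + 1}"] = translate(seq[offset:])
    return frames
```

Each of the six translated frames could then be scanned with a profile HMM; stop codons appear as `*` and would normally be used to split a frame into open reading frames before searching.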
Annotation of human ARM proteins
Since solved structures of armadillo proteins can be considered as concrete criteria for distinguishing ARM from HEAT repeats and correctly annotating the boundaries between consecutive ARM repeats, we first visualized each published structure containing ARM repeats with PyMOL (https://www.pymol.org/). After careful annotation of each ARM repeat in these structures, we searched for the closely related paralogs [e.g., published structure plakophilin-1 (PDB ID: 1XM9); two other paralogs plakophilin-2 and -3]. For each protein, multiple sequence alignments (MSA) between the paralogs were built with MUSCLE software [15], and the corresponding ARM repeats were annotated as such (e.g., supplementary Fig. S2A–C). Next, all the remaining entries whose structure has not been characterized were checked, and ARM domains were annotated using a live (not pre-calculated) conserved domain (CD) search with an E value threshold of 0.1 and the database CD v3.11, which is a curated database combining NCBI, Pfam, SMART, COG, PRK, and TIGRFAM [16]. Finally, annotation data from published studies were used to cross-check and extend our annotations of human ARM proteins. The results of the entire annotation process are summarized in Table S2 (sheet 2).
Sequence searches and annotation of orthologous sequences
To identify potential orthologs in the other 12 metazoan and non-metazoan species (Fig. S3), we used 70 human ARM proteins as queries for BLASTp searches (Table S2, sheet 3). The sequence with the highest homology (lowest E value) was considered as “best-hit.” To confirm the potential orthologous relations of these best hits, we performed reciprocal best-hit analysis. All the obtained best hits from the previous step were retrieved and used as a query for BLASTp searches against the human genome (Table S2, sheet 3). Although the reciprocal BLAST search is quite robust and effective for detection of genuine orthologous genes, it might miss some information due to many-to-many orthologous relations [17]. In order not to rule out genome specific duplication events, we also checked the other hits that returned with an E value less than 1E−10. In some cases, where only partial sequences could be found, additional gene predictions using Fgenesh [18] and GenScan [19] were performed around the genomic region in which the putative genes were located. After detection of the orthologs of each of the 70 human ARM proteins (summarized in Fig. 2), we first built an MSA with MUSCLE (in CLC Main workbench, http://www.clcbio.com/products/clc-main-workbench/) using only the vertebrate sequences and annotated their ARM repeats. Then, we extended the alignments and annotations by gradually adding the ARM proteins of the remaining bilaterian, non-bilaterian, and premetazoan species (illustrated in Fig. S2D, E).
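The reciprocal best-hit logic described above can be sketched as follows. This is a minimal illustration, assuming BLASTp results have already been parsed into lists of (subject ID, E value) pairs; the function names and the input layout are hypothetical, and only the 1E−10 cutoff comes from the text.

```python
def best_hit(hits):
    """Return the subject ID with the lowest E value, or None if no hits."""
    if not hits:
        return None
    return min(hits, key=lambda hit: hit[1])[0]


def reciprocal_best_hits(forward, reverse):
    """Pair each human query with its best hit if the relation is reciprocal.

    forward: human protein -> [(species protein, E value), ...]
    reverse: species protein -> [(human protein, E value), ...]
    """
    pairs = {}
    for query, hits in forward.items():
        candidate = best_hit(hits)
        if candidate is None:
            continue
        # Keep the pair only if the back-search returns the original query.
        if best_hit(reverse.get(candidate, [])) == query:
            pairs[query] = candidate
    return pairs


def additional_hits(hits, max_evalue=1e-10):
    """List non-best hits below the E value cutoff (possible lineage-specific duplicates)."""
    top = best_hit(hits)
    return [subject for subject, evalue in hits if evalue < max_evalue and subject != top]
```

The second helper reflects the extra check for genome-specific duplications: hits that are not the best hit but still fall below the cutoff are returned for manual inspection rather than discarded.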
Comparative and phylogenetic analysis of ARM proteins
For pairwise homology analysis of ARM domains (or repeats), we used EMBOSS Needle (http://www.ebi.ac.uk/Tools/psa/emboss_needle/), and for multifasta comparison, we used “basic local alignment search tool two sequences” (bl2seq) [20]. The analyses included ARM-by-ARM comparisons and pairwise sequence alignment. For multiple sequence homology, sequences were aligned by MUSCLE [15] using the default settings, except for the maximum number of iterations, which was set to 100 to ensure reaching convergence. For Fig. 3, a Bayesian inference (BI) consensus tree was built with MrBayes 3 [21] (6,000,000 generations, sample frequency 1000, burn-in 25 % and final average standard deviation = 0.0095). The BI consensus tree was drawn using Interactive Tree Of Life Version v3 [22] and represented as a circular phylogram with Bayesian posterior probabilities (PP). The other phylogenetic trees, which were focused on specific subfamilies, were built either with BI (until final average standard deviation <0.001) and/or with Randomized Axelerated Maximum Likelihood (RAxML) [23] using an LG amino acid replacement matrix with 500 bootstrap replicates, unless otherwise stated.
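The idea behind a global pairwise comparison such as the one performed by EMBOSS Needle can be sketched with a basic Needleman-Wunsch alignment. This is a simplified illustration only: the match, mismatch, and linear gap scores are assumptions chosen for readability, whereas EMBOSS Needle uses a substitution matrix and affine gap penalties, so the numbers it reports will differ.

```python
def global_align(a, b, match=1, mismatch=-1, gap=-2):
    """Needleman-Wunsch global alignment with a linear gap penalty (toy scores)."""
    n, m = len(a), len(b)
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            diag = score[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            score[i][j] = max(diag, score[i - 1][j] + gap, score[i][j - 1] + gap)

    # Trace back from the bottom-right corner to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0:
            step = match if a[i - 1] == b[j - 1] else mismatch
            if score[i][j] == score[i - 1][j - 1] + step:
                out_a.append(a[i - 1])
                out_b.append(b[j - 1])
                i, j = i - 1, j - 1
                continue
        if i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1
    return "".join(reversed(out_a)), "".join(reversed(out_b))


def percent_identity(a, b):
    """Identical aligned positions as a percentage of the alignment length."""
    ali_a, ali_b = global_align(a, b)
    same = sum(x == y and x != "-" for x, y in zip(ali_a, ali_b))
    return 100.0 * same / len(ali_a)
```

Identity is computed here over the full alignment length, including gap columns; other denominators (for example, the shorter sequence length) are also in common use and give different percentages.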
Results
Identification and improved annotation of human ARM superfamily members
To obtain a complete inventory of the human ARM proteins, we combined several strategies. First, we searched the human NCBI Gene records [24] for entries that are annotated with an ARM domain-specific PSSM. Second, the human genome was searched using the ARM profile hidden Markov model (HMM) available in the Pfam database [14]. While the NCBI search identified 70 putative ARM proteins, the HMM search using both Pfam seed and full armadillo model (see “Materials and methods”) identified 52 putative ARM proteins. Of the identified proteins, 42 were present in both data sets (Table S2). Finally, the Protein Data Bank (PDB) was searched for ARM repeat-containing structures and refined for metazoan protein entries. Of 24 structurally characterized ARM proteins, 16 were present in our NCBI + HMM data set and 8 were not detected by PSSM and/or HMM searches. This combined strategy resulted in the identification of 95 putative ARM repeat proteins (Table S2).
Although sequence similarity between individual ARM repeats is generally low, the folding of ARM repeats into a superhelical ARM domain is very similar among them (Fig. 1a, c). Since the structurally related HEAT repeat can be distinguished from ARM repeats by their two-helical pack instead of three-helical structure, we first analyzed the published 3D structures of human ARM proteins. Surprisingly, some of the proteins identified as ARM proteins by PSSM and HMM models, such as importin-beta 1 (Kpnb1) and transportin (Tnpo), were characterized as HEAT repeats in structural studies [7, 25]. Furthermore, the structurally characterized HEAT repeat proteins, such as Kpnb1 and Tnpo, are considered as ARM proteins also in the InterPro database [26] and by the HUGO Gene Nomenclature Committee (http://www.genenames.org/cgi-bin/genefamilies/set/409). To exclude such false positives from our initial list, each repeat was annotated separately and manually corrected whenever appropriate. Next, the annotations based on the 3D structures were extended by alignments within the known subfamily members. To annotate the remaining human ARM proteins, which do not belong to an established ARM family and for which no 3D structure is available, we searched each of these sequences in Conserved Domain search (CD) [16]. Using this multi-step strategy on the 95 candidate ARM sequences, 70 human ARM proteins were retained, comprising 433 manually curated and carefully annotated ARM repeats (Fig. 2, column H. sapiens). The 25 excluded sequences turned out to have HEAT repeats instead of ARM repeats (highlighted in pink in Table S2).
The number of ARM repeats in the 70 validated human ARM proteins varies considerably. In most cases, the ARM repeats are organized in a consecutive order folding into a superhelical structure. However, several ARM proteins (Ctnnd2 and relatives, Armc8, Armc2, and Rsph14) have a large insertion between two ARM regions (illustrated by a dashed line in Fig. 1c). Furthermore, based on the recent structural analysis of the formin-like (Fmnl) proteins [12] and on our annotations, the proteins Fmnl1–3 have a large insertion within the second ARM repeat (Fig. 2b). The conserved domain search on the human armadillo reference set revealed that about half of them have additional domains, e.g., an importin β-binding (IBB) domain or a formin homology domain (FH) (see legend of Fig. 2). The presence of other domains besides a specific number of ARM repeats is a first feature that helps to identify distinct families in the superfamily (Fig. 2b).
Phylogenetic distribution of the human ARM superfamily members in (pre-)metazoan species
To study the metazoan evolution of ARM proteins, we identified the orthologs of the human reference set of 70 ARM proteins (see “Materials and methods” and Table S2) in nine metazoan species occupying key phylogenetic positions or representing specific lineages in the metazoan evolution (Fig. S3). We analyzed the deuterostome genomes of Mus musculus (mouse), Gallus gallus (chicken), Danio rerio (zebrafish), and the tunicate Ciona intestinalis. In the protostome lineage, we searched the genomes of the mollusk Aplysia californica (sea hare) and the arthropod Drosophila melanogaster (fruit fly). We also identified the armadillo proteins in three non-bilaterian animals: the cnidarian Nematostella vectensis, the placozoan Trichoplax adhaerens, and the poriferan Amphimedon queenslandica. Finally, these analyses were expanded to choanoflagellate (Monosiga brevicollis and Salpingoeca rosetta) and filasterean lineages (Capsaspora owczarzaki), which are unicellular close relatives of Metazoa and are a useful outgroup for comparative studies [27]. To annotate the ARM domains in the identified orthologous proteins, we first aligned each human ARM protein with its corresponding vertebrate orthologs to deduce the conserved ARM repeats and annotated them accordingly. Then, we extended the MSA data and ARM annotations by gradually adding the ARM proteins of the remaining bilaterian, non-bilaterian, and premetazoan species (illustrated for beta-catenin orthologs in Fig. S2).
One of the clear differences between the ARM repertoires of mammals and of the other investigated vertebrates (chicken and zebrafish) is the expansion of the G protein-coupled receptor-associated sorting proteins (GASP family) members in mammals. The zebrafish genome encodes all the remaining human ARM proteins, with the exception of most of the GASP family members. In the chicken genome, besides GASPs, we could not detect the orthologs of five other superfamily members: two ARM formins, Ankar, Armc5, and Zyg11a. About half of the human ARM repertoire is not present in the other bilaterian genomes, such as those of sea hare and fruit fly, indicating a major expansion of the ARM superfamily in the vertebrate lineage. It is noteworthy that at least one member of each family in the ARM superfamily is present in basal animals. One of the major events in animal evolution has been the transition to bilaterality. Interestingly, the non-bilaterian animals N. vectensis and T. adhaerens possess a similar number and almost the same set of ARM proteins as non-vertebrate bilaterian animals (Fig. 2a). In the non-metazoan choanoflagellates and the filasterean C. owczarzaki, respectively, 30 and 24 % of the human ARM repertoire is represented, corresponding to the ancestral set of metazoan ARM superfamily.
Phylogeny of the ARM proteins
The phylogenetic relationships among the metazoan ARM proteins have not been studied yet at the superfamily level. Only the expansions in the families of importins [28], formins [29], GASPs [30], and catenins [31] have been studied. Studying the phylogeny of all ARM proteins faces several major challenges. First, the number of ARM repeats varies considerably among the ARM proteins, from one in Armc1 to 13 in Armc4. Second, there is high sequence divergence between the various ARM repeats (Fig. 1b), which might be an evolutionary feature to adapt the ability to interact with very different ligands, such as cytoskeletal proteins and transcription factors. Finally, the ARM superfamily members have indeed quite diverse functions, such as molecular signaling, intracellular transport and cell adhesion, and except for some families, such as the armadillo formins, there is no large common additional domain or structural feature besides the ARM domain.
We first tried to construct the phylogeny of the superfamily by alignment of blocks comprising three ARM repeats (~120 amino acids), i.e., as much as in the current PSSM models to identify ARM repeat proteins. Although there were some conserved blocks in the multiple sequence alignment (MSA), different phylogenetic methods did not produce trees with the same topology. Hence, to overcome these difficulties and construct a reliable phylogeny of the ARM superfamily, we selected ARM proteins that possess at least five ARM repeats (~200 amino acids) and aligned only those ARM domains, which are the structural and functional cores of these proteins. The resulting MSA of conserved ARM repeats was then used to construct a phylogenetic tree with Bayesian inference. Bayesian phylogeny is considered the best method in the field [32, 33], but the increased accuracy of this method requires high computational power. Assigning a phylogeny for many divergent sequences with variable lengths, such as the members of the ARM superfamily, is not practically feasible, not even for supercomputers [34]. Therefore, to compare the results of the two approaches, we used only the ARM proteins from early branching metazoans and premetazoan species (see “Materials and methods” for details). Bayesian phylogeny of the armadillo superfamily based on five ARM repeat blocks identified several discrete branches or families (Fig. 3). This classification was in essence supported by a maximum likelihood (ML) phylogenetic tree, which resulted in the same topology (data not shown). Interestingly, each of the discrete branches described below has one or more members in the premetazoan subset.
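The selection step described above (keep only proteins with at least five annotated ARM repeats and align only their ARM regions) can be sketched as below. The input layout, the function name, and the choice to concatenate annotated repeats are assumptions made for illustration; the source does not state how inserts between repeats were handled before alignment.

```python
def arm_domain_blocks(proteins, min_repeats=5):
    """Extract ARM-only sequence blocks for proteins with enough repeats.

    proteins: name -> (sequence, [(start, end), ...]) where the coordinates
    are 1-based, inclusive positions of each annotated ARM repeat.
    Returns name -> concatenated repeat sequence (hypothetical convention).
    """
    blocks = {}
    for name, (sequence, repeats) in proteins.items():
        if len(repeats) < min_repeats:
            continue  # too few repeats for the five-repeat block analysis
        ordered = sorted(repeats)
        # Convert 1-based inclusive coordinates to Python slices.
        blocks[name] = "".join(sequence[start - 1:end] for start, end in ordered)
    return blocks
```

The resulting blocks would then be passed to a multiple sequence aligner; restricting the input to the repeat regions keeps linker and insert sequences from dominating the alignment.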
As expected, the HEAT repeats that were used from the importin-beta (KPNB) family clearly form an outgroup to the ARM repeats. Rap1gds1 (Rap1 GTPase-GDP dissociation stimulator 1) and its homolog Rqcd1 (required for cell differentiation 1 homolog) are rather solitary members and form separate branches from the rest of the ingroup taxa in the superfamily. Both proteins are found in choanoflagellates and filastereans (Capsaspora), and Rqcd1 orthologs are even found in yeasts and plants, indicating that these proteins probably arose more than a billion years ago and had similar functions in the different kingdoms of life. The ARM repeats found in the structure of Rcd1 [35] are not detected by the current domain models of SMART, CDD, and Pfam.
Phylogenetic branch comprising the delta-catenin and formin families
Evolutionary studies on FH2-containing formins revealed that both the N-terminal GTPase-binding domain (GBD) and the FH3 domain that overlaps with it are conserved between diaphanous-related formin (Diaph1-3), formin-like (Fmnl1-3), inverted formin 2 (Inf2), FH1/FH2 domain-containing proteins (Fhod1 and -3), and dishevelled associated activator of morphogenesis (Daam1-2), but have been lost in delphilin, formins (Fmn1, -2), and inverted formin 1 (Fhdc1) [29, 36]. On the other hand, FH1 and FH2 domains are consistently conserved among formins. Structural studies focused on mouse Diaph1 and human Fmnl2 revealed that a region in the FH3 domain folds like armadillo repeats [12, 13].
The question then arises whether the other GBD/FH3/FH2 formins besides Fmnl2 and Diaph1 also contain an armadillo repeat region. Sequence similarities between the GBD/FH3/FH2 formins confirm that the armadillo repeat region is indeed conserved between the human Diaph, Fmnl, Daam, Fhod members, and Inf2 (Table S4). Sequence searches showed that five GBD/FH3/FH2 formins are present in premetazoans and basal metazoans. These five members gave rise by gene duplication in vertebrates to 11 distinct members in mammals (Fig. 2). Phylogenetic analysis using either only the ARM domain of the GBD/FH3/FH2 formins or the full-length protein sequences did not change the relative positions of the subfamilies in the trees; this highlights the conservation of this ARM region of formins throughout the metazoan evolution (Fig. S4).
Our Bayesian phylogenetic tree (Fig. 3) suggests that the formin and delta-catenin families are closely related to each other and form a clade with a posterior probability (PP) of 1.00. Sequence searches on the above-mentioned species (Fig. S3) showed that delta-catenin members are confined to the metazoans. In vertebrates, an ancestral delta-catenin-like gene was duplicated into seven members [37] (Fig. 2). Except for two copies of delta-catenin-like genes in the sea squirt (tunicate) and sea anemone (cnidarian), none of the other non-vertebrate organisms investigated have multiple copies of this protein family in their genomes. Bayesian phylogeny of the delta-catenin family showed that a first duplication event had occurred before vertebrates arose, giving rise to the Ctnnd1/Arvcf-like genes on the one hand and the Ctnnd2/Pkp4-like genes on the other hand, since the two delta-catenins of sea squirt (here referred to as Ci Ctnnd-A and Ci Ctnnd-B) form clades with, respectively, vertebrate Ctnnd1/Arvcf and Ctnnd2/Pkp4 subfamilies (Fig. S5). ARM-by-ARM comparison confirmed the position of Ci Ctnnd-B, because eight of its nine ARM repeats are more similar to human Ctnnd2 (Table S5). Ci Ctnnd-A and all remaining non-vertebrate delta-catenins have a rather mosaic ARM repeat pattern and show similarities to both subfamilies (Table S5). Plakophilins clade neither with the Ctnnd1/Arvcf nor with the Ctnnd2/Pkp4 branch and form their own branch. Therefore, it is not clear which subfamily is at the origin of plakophilins (Fig. S5). Remarkably, sea anemone has two delta-catenins and three classical cadherins, each with a delta-catenin-binding domain [38]. Other non-bilaterians, such as Trichoplax and Amphimedon, have both a single classical cadherin and a single delta-catenin.
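An ARM-by-ARM comparison of the kind summarized in Table S5 can be sketched as a per-repeat assignment: each repeat of a query protein is compared with the repeat at the same position in each reference protein and assigned to the most similar one. The sketch below is illustrative only. It uses Python's `difflib.SequenceMatcher` ratio as a crude stand-in for a real alignment-based similarity, and the function names and input layout are hypothetical.

```python
from difflib import SequenceMatcher


def similarity(a, b):
    """Crude string similarity in [0, 1]; not an alignment-based identity score."""
    return SequenceMatcher(None, a, b, autojunk=False).ratio()


def assign_repeats(query_repeats, references):
    """Assign each query repeat to the reference whose same-position repeat is most similar.

    query_repeats: list of repeat sequences for the query protein.
    references: reference name -> list of repeat sequences in the same order.
    Returns one reference name (or None) per query repeat.
    """
    calls = []
    for index, repeat in enumerate(query_repeats):
        scores = {
            name: similarity(repeat, repeats[index])
            for name, repeats in references.items()
            if index < len(repeats)
        }
        calls.append(max(scores, key=scores.get) if scores else None)
    return calls
```

Counting how often each reference name occurs in the returned list gives a summary of the form "eight of nine repeats resemble reference X", while a mixed list corresponds to the mosaic pattern described for the remaining non-vertebrate delta-catenins.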
Phylogenetic branch comprising Armc8, Cab39, and Uso1
Our phylogenetic analyses suggest that calcium-binding protein 39 (Cab39) and Uso1 (also known as general vesicular transport factor, p115) form a branch close to the root of the superfamily tree with a PP of 1.00, with Armc8 as its outgroup (PP = 0.65) (Fig. 3). Tracing the evolution of these genes revealed that CAB39 is the only duplicated member of this branch and gave rise to CAB39L in vertebrates (Fig. 2). Human Cab39 shares 80 % sequence identity with human Cab39l, and its orthologs can be found in all the metazoan and non-metazoan species investigated here (Fig. 2). Functional studies have shown that the scaffolding protein Cab39 is a key regulator of protein kinases and cell polarity in metazoan species [39, 40] and even in fission yeast [41]. Except for the choanoflagellates and fruit fly, Armc8 is conserved in all the metazoans and in non-metazoan Capsaspora. Human Armc8 contains a first ARM domain with four consecutive ARM repeats and a second ARM domain with five ARM repeats, connected by a large insert or loop region composed of about 150 AA (Fig. 2). Like some other ARM proteins (such as β-catenin, plakophilins, and Diaph1), Armc8 physically interacts with α-catenin and is responsible for proteasome-dependent degradation of α-catenin [42]. Although catenins are confined to the metazoans, Armc8 is found in Capsaspora (Fig. 2). Moreover, Armc8 orthologs have been reported in plants, algae, and yeast, where they participate in the degradation of gluconeogenesis enzymes [43, 44]. Uso1 has 11 ARM repeats and its orthologs can be found in all metazoans and non-metazoans (Fig. 2). Like the two other members in the same branch (Armc8 and Cab39), Uso1 is found in yeast, where it is involved in vesicular transport [45, 46]. Although the Cab39 and Uso1 proteins are more closely related to each other than to Armc8, many evolutionary changes have occurred between them as reflected by the branch lengths in the phylogenetic trees (Fig. 3).
Phylogenetic branch comprising importin-α and Spag6 proteins
Previous phylogenetic studies showed that importin-α proteins can be divided into three major clades: α1, α2, and α3 [28]. In vertebrates, three members (Kpna1/5/6) did arise from α1, whereas α2 predates the Kpna2/7 importins, and α3 is the ancestor of the Kpna3/4 importins. Evolutionary studies focused on Drosophila species and on Caenorhabditis elegans showed that atypical importin-α proteins exist in addition to the three classical importin-α types [47, 48]. The C. elegans genome encodes one classical importin (α, type 3; named Ima3) and two non-conventional importin α proteins (Ima1 and Ima2). In contrast, the D. melanogaster genome encodes all three classical importin-α proteins and at least one non-conventional importin-α that does not bear an N-terminal IBB domain (named αKap4 and in certain Drosophila species αKap5) (for a suggested improved nomenclature, see Table S6).
To investigate the details of the premetazoan and metazoan evolution of importin-α proteins, we searched the genomes of the above-mentioned ten species (Fig. S3). Sequence searches revealed that all metazoan species encode ancestral members for the three clades, here referred to as ImpA1, ImpA2, and ImpA3 (ancestors α1-3, respectively; Table S6, Fig. S6A). To investigate the evolutionary relationship of non-conventional importin-α proteins with the classical types, we included extra importin-α sequences from Drosophila bipectinata, Drosophila parabipectinata, and C. elegans in our analyses. Neighbor-joining phylogeny on the importin-α proteins suggests that next to the classical three types, another type is distinguishable (Fig. S6A). Remarkably, both C. elegans ImpA0a/b and Drosophila ImpA0a/b fall into this new clade (named type α0). Phylogenetic analyses on the importin-α proteins showed that the α0 type forms a branch together with the α2 and α3 types (BS = 94 in Fig. S6A). However, within this branch, the exact position of the α0 type is unclear due to the low bootstrap support value (BS = 28 in Fig. S6A). Since the α0-type members do not form a branch together with any one of the three classical types (Fig. S6A) and since all α0-type members have lost the IBB domain, the occurrence of another type separate from α1 to α3 is suggested. So far, type α0 importin-α proteins have been reported only in Ecdysozoans and not in Lophotrochozoa or Deuterostomia. To detect other members, we used type α0 sequences of C. elegans and D. melanogaster as queries in BLASTp and tBLASTn to search the genomes of non-bilaterians and premetazoan species. Indeed, we detected α0-type sequences in the metazoan T. adhaerens and in premetazoans (Table S6). A phylogenetic tree constructed using all the detected importin-α proteins suggests that the atypical importin-α proteins form another clade (type α0) and together with α1 type predate metazoans (Fig. S6B).
While ARM importins had gone through two gene duplications that gave rise to seven vertebrate members, we could not detect any duplicated paralog of the sperm-associated antigen 6 gene (SPAG6) in any lineage (Fig. 2). The filasterean Capsaspora genome apparently lacks an ortholog of Spag6. However, when we BLAST the Spag6 sequences from any other lineage against the Capsaspora genome, the best hit returned is a Capsaspora importin, with an E value of 1E−10 or lower (Table S7). This further supports the close phylogenetic relationship between ARM importins and Spag6 (PP = 0.96 in Fig. 3).
Phylogenetic analysis of Armc6, β-catenin, Apc, Kifap3, and Armc2
Remarkably, the two armadillo catenin families (β- and δ-catenins) do not fall into the same branch in the ARM superfamily phylogenetic tree (Fig. 3). Both beta- and delta-catenin families first appeared in multicellular animals (Fig. 2). We could not detect any ARM catenin ortholog in choanoflagellates or Capsaspora. However, several studies reported a β-catenin homolog (called Aardvark) in the soil-living ameba Dictyostelium discoideum [49]. Aardvark is required for intercellular adherens junction formation, molecular signaling [50], and cell polarity [51]. An ancient catenin complex is hypothesized to mediate cell polarity before multicellularity [51]. However, our phylogenetic analyses strongly suggest that the Dictyostelium Aardvark (aarA) is the ortholog of metazoan armadillo repeat-containing 6 (Armc6) (PP = 1.00) rather than of β-catenin (Fig. 3). In line with this, the comparison of ARM repeats of human and N. vectensis β-catenin and Armc6 with Dictyostelium Aardvark supports this suggested orthology (Table S8).
Furthermore, the phylogenetic tree indicates that the adenomatous polyposis coli (APC) protein is related to β-catenin (Fig. 3). Apc binds β-catenin and regulates its phosphorylation and ubiquitination [52, 53]. As with β-catenin, we could not detect any orthologs of Apc outside the metazoan kingdom, and the corresponding genes both experienced a duplication event at the origin of vertebrates, which gave rise to APC2 and plakoglobin-encoding JUP genes (Fig. 2), indicating that the two genes might have co-evolved. Based on the phylogenetic tree, Kifap3 and Armc2 are positioned in the same branch (PP = 0.90) with Apc and β-catenin and this branch is possibly related to Armc6 (PP = 0.72) (Fig. 3).
Phylogenetic branch comprising Armc3, Armc4, and Ankar
Phylogeny of the ARM superfamily indicates that Armc3 and Armc4 possibly form a separate branch together with ankyrin and armadillo repeat-containing protein (Ankar) (Fig. 3). Although the support value for this branch is rather low (PP = 0.65), the relationship of this group to the Armc6/β-catenin branch is well supported (PP = 0.90) (Fig. 3). Moreover, ARM repeat comparison confirmed that Armc3, Armc4, and Ankar are closely related, because almost all ARM repeats in these proteins share a high sequence similarity (Table S9). Multiple sequence alignment revealed that the last five ARM repeats of Armc3 are similar to the first five repeats of Armc4 and Ankar (Table S9). Like other Ankyrin gene family members [54], in which no ARM repeats could be detected, ANKAR is confined to the metazoans (Fig. 2). ARMC3 and ARMC4 orthologs are encoded in non-metazoan choanoflagellates, but not in Capsaspora (Fig. 2), amebae, plants, or bacteria (data not shown). The similarity between the ARM domains of Armc4 from man and unicellular M. brevicollis is more than 65 % (for Armc3, 45 % between man and M. brevicollis), indicating the importance of this protein in early animal evolution.
Phylogenetic branch comprising Ctnnbl1 and Unc45
Sequence searches showed that all investigated metazoan and non-metazoan species encode both Catenin beta-like 1 (CTNNBL1) and unc-45 myosin chaperone (UNC45) in their genomes (Fig. 2). The phylogenetic tree for the ARM superfamily members suggests that the two proteins are evolutionarily related to each other with a PP value of 0.71 (Fig. 3). While UNC45 was duplicated at the origin of vertebrates, we could not detect any paralogs of CTNNBL1 in vertebrates. CTNNBL1 is found in amebae, fungi, algae, and plants, but UNC45 orthologs are not found in plants and amebae (data not shown). Although the Ctnnbl1 name refers to a close similarity to β-catenin (Ctnnb1), these proteins share only 12 % amino acid identity, and the ARM repeats of Ctnnbl1 and Ctnnb1 are not significantly similar to each other (Table S10).
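A figure such as "12 % identical" depends on how identity is counted. The sketch below shows one common convention, assumed here for illustration only: identical residues divided by the number of alignment columns in which neither sequence has a gap. The text does not state which denominator was used for the Ctnnbl1/Ctnnb1 comparison, and the toy fragments are not real protein sequence.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of identical residues over columns where neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    identical = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == "-" or b == "-":
            continue  # skip gapped columns (an assumed convention)
        compared += 1
        if a == b:
            identical += 1
    return 100.0 * identical / compared if compared else 0.0

# Toy aligned fragments: 7 ungapped columns, 5 of them identical.
print(round(percent_identity("MKL-VAAG", "MRLQVSAG"), 1))
```

Dividing by the full alignment length or by the length of the shorter protein instead would give a lower value for the same alignment, so the convention should be reported alongside the percentage.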
Since the evolutionary relationship between the branches Ctnnbl1/Unc45 and Armc3/Armc4/Ankar on the one hand and the Armc6/β-catenin/Apc/Kifap3/Armc2 group on the other hand is not strongly supported by our BI analysis, we re-analyzed the phylogeny of this part of the ARM superfamily tree. In addition to the proteins investigated in Fig. 3, we included related ARM proteins from human, zebrafish, and fruit fly for detailed investigation. The resulting BI analysis confirms that the Ctnnbl1/Unc45 and Armc3/Armc4/Ankar groups form separate branches, now with PP values of 0.87 and 0.99, respectively (Fig. S7). In the Armc6/β-catenin/Apc/Kifap3/Armc2 group, Kifap3 and Armc2 are clustered together with a PP value of 0.89. Although the positions of the Apc and Ctnnb1 proteins are not fully clear within this group, in both analyses, the Ctnnbl1/Unc45 branch, the Armc3/Armc4/Ankar branch, and the Armc6/β-catenin/Apc/Kifap3/Armc2 group are closely related (PP = 0.90 in Fig. 3 and PP = 0.99 in Fig. S7).
Solitary members in the ARM phylogenetic analyses
About one-third of the detected ARM proteins are not included in our ARM superfamily phylogeny, either because they are not represented in more than one lower metazoan species [as is the case for the enigmatic evolution of the G protein-coupled receptor-associated sorting proteins (GASP family)], or because they possess fewer than five consecutive ARM repeats (used as a selection criterion as discussed above in the phylogeny section). Among these solitary members, GASPs and Zyg11 members had undergone duplication events in mammalian and vertebrate lineages, respectively (Fig. 2).
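The two exclusion rules stated above can be written as a simple filter. This is only a sketch under assumptions: the record type, field names, and example values are hypothetical, and the counts would in practice come from the repeat annotation and ortholog searches rather than being entered by hand.

```python
from dataclasses import dataclass, field
from typing import Set

@dataclass
class ArmProtein:
    name: str
    max_consecutive_repeats: int  # longest run of consecutive ARM repeats
    lower_metazoan_species: Set[str] = field(default_factory=set)

def eligible_for_tree(protein: ArmProtein,
                      min_repeats: int = 5,
                      min_species: int = 2) -> bool:
    """Keep proteins with at least five consecutive ARM repeats that are
    represented in more than one lower metazoan species."""
    return (protein.max_consecutive_repeats >= min_repeats
            and len(protein.lower_metazoan_species) >= min_species)

# Hypothetical candidates (names and values are placeholders).
candidates = [
    ArmProtein("protein_A", 9, {"species_1", "species_2", "species_3"}),
    ArmProtein("protein_B", 3, {"species_1", "species_2"}),  # too few repeats
    ArmProtein("protein_C", 6, {"species_1"}),               # too few species
]
selected = [p.name for p in candidates if eligible_for_tree(p)]
print(selected)
```

Proteins rejected by either rule correspond to the solitary members discussed in this section.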
ARMCX1 to ARMCX6 (Armadillo repeat-containing X-linked proteins), GPRASP1, GPRASP2, BHLHB9, and ARMC10 have been classified as GASP family members (Table S11) [55]. They all share a common ARM domain at their C-termini [56]. While GPRASP1, GPRASP2, BHLHB9, and ARMCX6 encode proteins with five ARM repeats, the remaining members have an extra ARM repeat at the end of the C-terminal region (Fig. 2). Previous phylogenetic studies on the GASP family members have shown that an X-chromosome-linked GASP gene cluster is confined to the Eutherian genomes. On the other hand, Armc10 can be found in all vertebrate lineages. Therefore, it predates the remaining GASP members and is found even in basal chordates, such as amphioxus [30]. Searching additional non-vertebrate genomes with PSI-BLAST revealed that ARMC10-like genes are present in several non-vertebrate lineages, such as mollusks and cnidarians (Table S11 and Fig. S8A). Like the vertebrate Armc10, putative invertebrate Armc10-like proteins all have a conserved ARM domain composed of six tandem repeats located in the C-terminal part (Fig. S8B). Reciprocal BLAST searches with these putative invertebrate Armc10-like proteins confirmed that Armc10 predates vertebrates (Table S12).
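The logic of a reciprocal search can be sketched as follows, assuming the strictest reading (bidirectional best hits): a candidate is retained only if searching it back against the starting genome returns the original query. The function name and all identifiers are hypothetical, and the exact criteria applied for Table S12 are not specified here.

```python
from typing import Dict

def reciprocal_best_hits(forward: Dict[str, str], reverse: Dict[str, str]) -> Dict[str, str]:
    """forward maps each query in genome A to its best hit in genome B;
    reverse maps each protein in genome B to its best hit in genome A.
    A pair is kept only when the two searches point at each other."""
    return {a: b for a, b in forward.items() if reverse.get(b) == a}

# Hypothetical best hits (identifiers are placeholders).
forward = {"vertebrate_ARMC10": "invertebrate_p1", "vertebrate_other": "invertebrate_p2"}
reverse = {"invertebrate_p1": "vertebrate_ARMC10", "invertebrate_p2": "vertebrate_unrelated"}
print(reciprocal_best_hits(forward, reverse))
```

Bidirectional best hits are known to miss orthologs in duplication-rich lineages [17], so a failed reciprocal check is weaker evidence than a successful one.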
The Zyg11 gene was first identified in C. elegans and named after its chromosomal locus [57]. Except for N. vectensis, which contains two copies, the non-vertebrate animals investigated and non-metazoan choanoflagellates have a single Zyg11 ortholog (Fig. 2). Three paralogs are found in vertebrates, Zyg11a, Zyg11b, and Zyg11-related-1 (Zer1).
To identify the origin of the ARM repeats of these solitary members, we compared their ARM repeats with a reference ARM repeat set that was created according to our phylogenetic tree. ARM domain comparison of these solitary members suggests that the ARM repeats of Rsph14 and Armc5 show significant similarity to the ARM repeats of Armc3/Armc4, and that the ARM repeats of Tmco6 are significantly similar to the ARM repeats of Armc8 (Table S13). However, such similarity was not observed for many other solitary members, such as Armc7 and Armc10, in which the ARM repeats show similarities to several branches (Table S13).
Discussion
Here, we report the first exhaustive classification of armadillo repeat proteins in the animal kingdom based on structural, evolutionary, and functional criteria. Mining the human genome with a combination of different approaches (including HMMs and PSSMs) revealed that this well-studied genome encodes at least 70 ARM proteins (Fig. 2). Of all reported human armadillo proteins, about 26 % (25 out of 95) are in fact HEAT repeat proteins. This discrepancy is due mainly to the current models used in automated domain detection, which often detect HEAT repeats as ARM repeats. In addition, some of the superfamily members that we identified, for example, based on structural evidence, are not detected by the current models. Although we already used a combined approach to identify genuine armadillo proteins in animals, it is likely that other ARM repeat proteins will be found in the future. To identify such new superfamily members, an improved ARM model will need to be developed: it has to be more specific, it should be based on correctly annotated ARM repeats, and it should exclude HEAT repeats. Solving molecular structures provides undeniable evidence for annotating ARM repeats correctly.
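The proportion quoted above can be checked with a few lines of arithmetic. The variable names are illustrative. That the remainder equals the count of 70 ARM proteins is an observation about the numbers, not a statement of how the 70 were derived.

```python
reported_arm = 95   # human proteins previously reported as armadillo proteins
actually_heat = 25  # of those, reassigned here to HEAT repeat proteins

heat_fraction = actually_heat / reported_arm
remaining = reported_arm - actually_heat

print(f"{heat_fraction:.1%}")  # 26.3%
print(remaining)               # 70
```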
Gene ontology (GO) analysis showed that the human ARM proteins are involved in diverse cellular roles, such as molecular signaling, cell adhesion, and intracellular transport (Fig. S9). Their tandemly arranged ARM repeats form an elongated binding surface (Fig. 1c), and their conserved hydrophobic residues (Fig. 1b) create a modular recognition mechanism that makes them excellent modular peptide binders compared to other repeat proteins, such as HEAT and tetratricopeptide repeat (TPR) proteins [58]. Next to the binding surfaces created by the ARM domain itself, ARM proteins generally include an additional domain. This helps them to recruit additional interaction partners, which can lead to the formation of physiologically relevant protein complexes, such as the β-catenin destruction complex [59] and the nuclear pore-targeting complex [28]. Extensive sequence searches to identify the orthologs of the human ARM repertoire in the closest unicellular relatives of Metazoa (the choanoflagellates M. brevicollis and S. rosetta and the filasterean C. owczarzaki) revealed that about 30 % of the ARM proteins were already present more than 600 million years ago (Fig. 2). Thus, these orthologs predate metazoans and represent the metazoan origin of this large superfamily. The proposed and proven functions of the ancestral ARM repertoire show that they were involved in key cellular processes, such as gametogenesis, cell differentiation, and nuclear transport (Table 1). Later, during metazoan evolution, duplicated members, such as plakophilins and plakoglobin, evolved more specialized roles, such as desmosome formation [60].
Table 1.
Gene | Cellular functions | References |
---|---|---|
Predating metazoa and conserved in choanoflagellida and/or filasterea | ||
RAP1GDS1 | Chaperone; Signaling; Cell proliferation, adhesion | [86–89] |
SIL1 | Chaperone | [90, 91] |
UNC45 | Chaperone; Cell differentiation, proliferation; Apoptosis | [92–94] |
CAB39 | Chaperone; Signaling, kinase activation | [95–97] |
HSPBP1 | Chaperone; Spermatogenesis | [98, 99] |
ZER1 (ZYG11l) | Spermatogenesis; Cell cycle regulator | [100, 101] |
ARMC4 | Spermatogenesis; Ciliogenesis | [71, 102] |
ARMC3 | Spermatogenesis; Ciliogenesis | [73, 103] |
SPAG6 | Sperm motility; Ciliogenesis; Regulation of the microtubules | [104–106] |
KIFAP3 | Intracellular transport; Chromosome segregation; Sperm motility | [67, 107, 108] |
Arm importins | Nuclear transport; Spermatogenesis, gametogenesis | [7, 77, 109] |
USO1 | Vesicular transport | [46] |
CTNNBL1 | Spliceosome; Nuclear transport; Apoptosis | [110–112] |
RQCD1 | Transcriptional cofactor; Cell differentiation | [35, 113] |
Arm formins | Cell differentiation, adhesion, migration; Cytokinesis | [12, 114–116] |
ARMC8 | Cell proliferation; Protein degradation | [42, 117] |
ARMC-1/2/6/7/9 | Unknown | |
TTC12 | Unknown | |
At least 22 genes encoding different ARM proteins predate metazoans and represent the metazoan origin of this large superfamily
Our phylogenetic analysis of the ARM superfamily members of metazoan and premetazoan species identified several discrete branches (Fig. 3). The delta-catenin and ARM formin families were found in the same phylogenetic branch (PP = 1.00). Members of both families were reported to play roles in cytoskeletal regulation. For example, Diaph1 knockdown in MCF7 cells leads to disordered adherens junction formation due to suppression of E-cadherin levels and reduced levels of α- and β-catenin at cell–cell contacts [61]. Functional studies on p120 catenin revealed its association with the juxtamembrane domain (JMD) of classical cadherins at adherens junctions and its role in stabilization of those cadherins at the plasma membrane by suppressing their endocytosis [9, 62]. Many delta-catenins and ARM formins participate in actin cytoskeleton remodeling by regulating Rho family GTPases (reviewed in [63, 64]). As the delta-catenin members are confined to Metazoa, the premetazoan ancestor of the delta-catenin family might be an ARM formin-like protein that was originally involved in the regulation of the actin cytoskeleton. Later, with the emergence of multicellularity when classical cadherins arose [37], ARM formin proteins probably co-evolved and gained an additional function in the adherens junction complex.
Surprisingly, beta-catenin and the related plakoglobin (Jup) were not positioned together with delta-catenins, but with Armc6, Apc, Kifap3, and Armc2 (Figs. 2, 3). Although these two protein groups share the same core name ‘catenin’ and both can interact with various cadherins at cell–cell junctions, the functions of these two ‘catenin’ groups are substantially different from each other. While delta-catenins are involved in regulation of Rho-GTPases, cadherin endocytosis, and modulation of Kaiso (reviewed in [63]), beta-catenin stabilizes cadherin junctions by linking them to the actin cytoskeleton through interaction with alpha-catenin [65], and moreover, it is involved in the modulation of TCF/LEF transcription factors in the nucleus (reviewed in [66]). The delta-catenin members can interact with cadherin JMD domains, while beta-catenin and plakoglobin bind the more C-terminal catenin-binding domain (CBD) of cadherins. Moreover, delta-catenin members and beta-catenin use distinct sets of hydrophobic residues to interact with cadherins [9]. This indicates that the ARM repeats of beta- and delta-catenins are optimized to bind different regions of the same classical cadherins, and their ARM repeats might have evolved from different premetazoan ancestors.
Interestingly, our phylogenetic tree positions the D. discoideum β-catenin homolog, Aardvark (AARA in Fig. 3), together with Armc6 (PP = 1.00). Moreover, Aardvark and Armc6 share more sequence similarity to each other than to β-catenin (Table S8). Since Armc6 orthologs predate the Metazoa, Armc6 might be the premetazoan ancestor of modern metazoan β-catenin. Next to Armc6 and β-catenin, we found that Kifap3 and Armc2 are also represented in the same branch (Fig. 3). No functional studies have been performed on Armc2, but Kifap3 plays a critical role in the regulation of mitosis and chromosomal segregation [67]. Remarkably, both APC and β-catenin are involved in similar processes, such as chromosome segregation [68, 69]. Moreover, Kifap3 interacts physically with both Apc and β-catenin and functions in microtubular transport [70]. Altogether, these data suggest that β-catenin, Apc, and Kifap3 are related both evolutionarily and functionally and that two uncharacterized proteins, Armc2 and Armc6, might have roles similar to those of the well-characterized members in this branch.
Based on their phylogenetic position and sequence similarity, Armc3 and Armc4 are closely related to each other (Fig. 3; Table S9) and both can be found outside the metazoan kingdom. Despite their evolutionary conservation, only a few studies on them have been reported. Recent functional studies in man and fruit fly have revealed that Armc4 is involved in spermatogenesis and sperm motility [71, 72]. Moreover, a study (almost the only one) on Armc3 has reported that in Bos taurus, a frameshift mutation in the ARMC3 gene causes a fertility disorder, indicating its crucial role in spermatogenesis as well [73]. Although the function of Ankar has not been explored, other ankyrin repeat proteins, such as Ankrd49 and Ankrd36, have been shown to be important in spermatogenesis [74, 75].
Another branch is formed by the ancestral importin-α proteins (type 1 and 0) and Spag6 (PP = 0.96) (Fig. 3; Table S3). Besides their phylogenetic clustering and sequence similarity, they also share functional similarity, as both ARM importins and Spag6 have been reported to be essential in gametogenesis [76, 77]. Despite the lack of a SPAG6 gene, the Capsaspora genome encodes at least four ARM importins (Table S6), whereas both ARM importin and Spag6 orthologs can be found in fungi, algae, and plants (data not shown), indicating the evolutionary importance of both protein types.
Genome-wide studies in plants indicate that they typically contain more armadillo proteins than animals, e.g., 108 in Arabidopsis (mouse ear cress) and 158 in Oryza (rice) [78, 79]. This is most likely the result of dramatic genome expansions in land plants. Phylogenetic analysis divided them into 15 different groups. Next to importin-α and PF16, which is the plant ortholog of Spag6, only a few other armadillo proteins are found in both animals and plants. Animal Rqcd1 and its plant ortholog Rcd1 are involved in cell differentiation [80]. The vesicle transport protein Uso1 is also present in plants and its function has been proposed to be similar to Uso1 in animals and yeast [81]. For the Arm proteins Armc6, Armc7, Armc8, and Armc9, orthologs can be found in plants but whether they share similar functions still remains enigmatic.
We have shown in the past that classical cadherins have progressively lost N-terminal cadherin repeats during metazoan evolution [38]. For the two so-called 7D cadherins of vertebrates, several studies indicate a gain of two repeats by duplication of the first two N-terminal repeats from an ancestral five-repeat cadherin [82]. The reason for these gains and losses of cadherin repeats is likely structural needs for the formation of cis- and trans-dimers. However, in our current study, we did not find evidence for such events in armadillo repeat proteins. For instance, the sequence comparison of R(q)cd1 and importin-α proteins, which are two of the few ARM proteins conserved between animals, filastereans, yeasts, and plants, revealed exactly the same number of ARM repeats across evolution (Figs. S10 and S11).
In conclusion, our exhaustive analysis of the ARM superfamily in a wide range of organisms, ranging from man to ancestral metazoans and including even premetazoan choanoflagellates and the filasterean C. owczarzaki, has provided a more thorough insight into this large group of versatile proteins. First, our analysis demonstrates the importance of discriminating more consistently between ARM and HEAT repeat proteins, as this has both structural and functional consequences. Second, our analysis emphasizes less appreciated relationships within the ARM superfamily (e.g., between ARM formins and delta-catenins) and corrects the perspective of generally accepted relationships (e.g., between beta-catenin and delta-catenins). Third, our results make it possible to discriminate more reliably between the ancestral representatives and modern relatives in several ARM protein families and give examples of the functional implications. Finally, we conclude that the ARM superfamily is even more diverse than the Cadherin superfamily with extracellular EC repeats [38, 83, 84] and propose that ARM domains have evolved independently several times during evolution, while this is less likely the case for the cadherin ectodomains. An obvious reason for that is that the binding functions of ARM domains are modular (as exemplified in Fig. 1c), whereas the interaction partners of cadherin ectodomains are generally other cadherins. Collectively, these findings represent a significant advance in our understanding of ARM proteins.
Acknowledgments
We thank Dr. Amin Bredan for critical reading and careful editing of the manuscript and our colleagues for helpful discussions. This work was supported by the Research Foundation—Flanders (FWO-Vlaanderen, Award G.0320.11N), the Belgian Science Policy (Interuniversity Attraction Poles—Award IAP7/07), and the Special Research Fund of Ghent University (Award BOF 01J14211).
References
- 1. Riggleman B, Wieschaus E, Schedl P. Molecular analysis of the armadillo locus: uniformly distributed transcripts and a protein with novel internal repeats are associated with a Drosophila segment polarity gene. Genes Dev. 1989;3(1):96–113. doi: 10.1101/gad.3.1.96.
- 2. Peifer M, Berg S, Reynolds AB. A repeating amino acid motif shared by proteins with diverse cellular roles. Cell. 1994;76(5):789–791. doi: 10.1016/0092-8674(94)90353-0.
- 3. Tewari R, Bailes E, Bunting KA, Coates JC. Armadillo-repeat protein functions: questions for little creatures. Trends Cell Biol. 2010;20(8):470–481. doi: 10.1016/j.tcb.2010.05.003.
- 4. Huber AH, Nelson WJ, Weis WI. Three-dimensional structure of the armadillo repeat region of beta-catenin. Cell. 1997;90(5):871–882. doi: 10.1016/S0092-8674(00)80352-9.
- 5. Madhurantakam C, Varadamsetty G, Grutter MG, Pluckthun A, Mittl PR. Structure-based optimization of designed Armadillo-repeat proteins. Protein Sci. 2012;21(7):1015–1028. doi: 10.1002/pro.2085.
- 6. Parmeggiani F, Pellarin R, Larsen AP, Varadamsetty G, Stumpp MT, Zerbe O, Caflisch A, Pluckthun A. Designed armadillo repeat proteins as general peptide-binding scaffolds: consensus design and computational optimization of the hydrophobic core. J Mol Biol. 2008;376(5):1282–1304. doi: 10.1016/j.jmb.2007.12.014.
- 7. Cingolani G, Petosa C, Weis K, Muller CW. Structure of importin-beta bound to the IBB domain of importin-alpha. Nature. 1999;399(6733):221–229. doi: 10.1038/20367.
- 8. Choi HJ, Gross JC, Pokutta S, Weis WI. Interactions of plakoglobin and beta-catenin with desmosomal cadherins: basis of selective exclusion of alpha- and beta-catenin from desmosomes. J Biol Chem. 2009;284(46):31776–31788. doi: 10.1074/jbc.M109.047928.
- 9. Ishiyama N, Lee SH, Liu S, Li GY, Smith MJ, Reichardt LF, Ikura M. Dynamic and static interactions between p120 catenin and E-cadherin regulate the stability of cell–cell adhesion. Cell. 2010;141(1):117–128. doi: 10.1016/j.cell.2010.01.017.
- 10. Andrade MA, Perez-Iratxeta C, Ponting CP. Protein repeats: structures, functions, and evolution. J Struct Biol. 2001;134(2–3):117–131. doi: 10.1006/jsbi.2001.4392.
- 11. Kippert F, Gerloff DL. Highly sensitive detection of individual HEAT and ARM repeats with HHpred and COACH. PLoS One. 2009;4(9):e7148. doi: 10.1371/journal.pone.0007148.
- 12. Kuhn S, Erdmann C, Kage F, Block J, Schwenkmezger L, Steffen A, Rottner K, Geyer M. The structure of FMNL2-Cdc42 yields insights into the mechanism of lamellipodia and filopodia formation. Nat Commun. 2015;6:7088. doi: 10.1038/ncomms8088.
- 13. Rose R, Weyand M, Lammers M, Ishizaki T, Ahmadian MR, Wittinghofer A. Structural and mechanistic insights into the interaction between Rho and mammalian Dia. Nature. 2005;435(7041):513–518. doi: 10.1038/nature03604.
- 14. Finn RD, Mistry J, Tate J, Coggill P, Heger A, Pollington JE, Gavin OL, Gunasekaran P, Ceric G, Forslund K, Holm L, Sonnhammer EL, Eddy SR, Bateman A. The Pfam protein families database. Nucleic Acids Res. 2010;38(Database issue):D211–D222. doi: 10.1093/nar/gkp985.
- 15. Edgar RC. MUSCLE: multiple sequence alignment with high accuracy and high throughput. Nucleic Acids Res. 2004;32(5):1792–1797. doi: 10.1093/nar/gkh340.
- 16. Marchler-Bauer A, Derbyshire MK, Gonzales NR, Lu S, Chitsaz F, Geer LY, Geer RC, He J, Gwadz M, Hurwitz DI, Lanczycki CJ, Lu F, Marchler GH, Song JS, Thanki N, Wang Z, Yamashita RA, Zhang D, Zheng C, Bryant SH. CDD: NCBI’s conserved domain database. Nucleic Acids Res. 2015;43(Database issue):D222–D226. doi: 10.1093/nar/gku1221.
- 17. Dalquen DA, Dessimoz C. Bidirectional best hits miss many orthologs in duplication-rich clades such as plants and animals. Genome Biol Evol. 2013;5(10):1800–1806. doi: 10.1093/gbe/evt132.
- 18. Solovyev V, Kosarev P, Seledsov I, Vorobyev D. Automatic annotation of eukaryotic genes, pseudogenes and promoters. Genome Biol. 2006;7(Suppl 1):S10.1–12. doi: 10.1186/gb-2006-7-s1-s10.
- 19. Burge CB, Karlin S. Finding the genes in genomic DNA. Curr Opin Struct Biol. 1998;8(3):346–354. doi: 10.1016/S0959-440X(98)80069-9.
- 20. Johnson M, Zaretskaya I, Raytselis Y, Merezhuk Y, McGinnis S, Madden TL. NCBI BLAST: a better web interface. Nucleic Acids Res. 2008;36(Web Server issue):W5–W9. doi: 10.1093/nar/gkn201.
- 21. Ronquist F, Huelsenbeck JP. MrBayes 3: Bayesian phylogenetic inference under mixed models. Bioinformatics. 2003;19(12):1572–1574. doi: 10.1093/bioinformatics/btg180.
- 22. Letunic I, Bork P. Interactive Tree Of Life (iTOL): an online tool for phylogenetic tree display and annotation. Bioinformatics. 2007;23(1):127–128. doi: 10.1093/bioinformatics/btl529.
- 23. Stamatakis A. RAxML version 8: a tool for phylogenetic analysis and post-analysis of large phylogenies. Bioinformatics. 2014;30(9):1312–1313. doi: 10.1093/bioinformatics/btu033.
- 24. Pruitt KD, Brown GR, Hiatt SM, Thibaud-Nissen F, Astashyn A, Ermolaeva O, Farrell CM, Hart J, Landrum MJ, McGarvey KM, Murphy MR, O’Leary NA, Pujar S, Rajput B, Rangwala SH, Riddick LD, Shkeda A, Sun H, Tamez P, Tully RE, Wallin C, Webb D, Weber J, Wu W, DiCuccio M, Kitts P, Maglott DR, Murphy TD, Ostell JM. RefSeq: an update on mammalian reference sequences. Nucleic Acids Res. 2014;42(Database issue):D756–D763. doi: 10.1093/nar/gkt1114.
- 25. Imasaki T, Shimizu T, Hashimoto H, Hidaka Y, Kose S, Imamoto N, Yamada M, Sato M. Structural basis for substrate recognition and dissociation by human transportin 1. Mol Cell. 2007;28(1):57–67. doi: 10.1016/j.molcel.2007.08.006.
- 26. Mitchell A, Chang HY, Daugherty L, Fraser M, Hunter S, Lopez R, McAnulla C, McMenamin C, Nuka G, Pesseat S, Sangrador-Vegas A, Scheremetjew M, Rato C, Yong SY, Bateman A, Punta M, Attwood TK, Sigrist CJ, Redaschi N, Rivoire C, Xenarios I, Kahn D, Guyot D, Bork P, Letunic I, Gough J, Oates M, Haft D, Huang H, Natale DA, Wu CH, Orengo C, Sillitoe I, Mi H, Thomas PD, Finn RD. The InterPro protein families database: the classification resource after 15 years. Nucleic Acids Res. 2015;43(Database issue):D213–D221. doi: 10.1093/nar/gku1243.
- 27. Suga H, Chen Z, de Mendoza A, Sebe-Pedros A, Brown MW, Kramer E, Carr M, Kerner P, Vervoort M, Sanchez-Pons N, Torruella G, Derelle R, Manning G, Lang BF, Russ C, Haas BJ, Roger AJ, Nusbaum C, Ruiz-Trillo I. The Capsaspora genome reveals a complex unicellular prehistory of animals. Nat Commun. 2013;4:2325. doi: 10.1038/ncomms3325.
- 28. Goldfarb DS, Corbett AH, Mason DA, Harreman MT, Adam SA. Importin alpha: a multipurpose nuclear-transport receptor. Trends Cell Biol. 2004;14(9):505–514. doi: 10.1016/j.tcb.2004.07.016.
- 29. Chalkia D, Nikolaidis N, Makalowski W, Klein J, Nei M. Origins and evolution of the formin multigene family that is involved in the formation of actin filaments. Mol Biol Evol. 2008;25(12):2717–2733. doi: 10.1093/molbev/msn215.
- 30. Lopez-Domenech G, Serrat R, Mirra S, D’Aniello S, Somorjai I, Abad A, Vitureira N, Garcia-Arumi E, Alonso MT, Rodriguez-Prados M, Burgaya F, Andreu AL, Garcia-Sancho J, Trullas R, Garcia-Fernandez J, Soriano E. The Eutherian Armcx genes regulate mitochondrial trafficking in neurons and interact with Miro and Trak2. Nat Commun. 2012;3:814. doi: 10.1038/ncomms1829.
- 31. Zhao ZM, Reynolds AB, Gaucher EA. The evolutionary history of the catenin gene family during metazoan evolution. BMC Evol Biol. 2011;11:198. doi: 10.1186/1471-2148-11-198.
- 32. Hall BG. Comparison of the accuracies of several phylogenetic methods using protein and DNA sequences. Mol Biol Evol. 2005;22(3):792–802. doi: 10.1093/molbev/msi066.
- 33. Gaucher EA, Kratzer JT, Randall RN. Deep phylogeny—how a tree can help characterize early life on Earth. Cold Spring Harb Perspect Biol. 2010;2(1):a002238. doi: 10.1101/cshperspect.a002238.
- 34. Wang LS, Leebens-Mack J, Kerr Wall P, Beckmann K, dePamphilis CW, Warnow T. The impact of multiple protein sequence alignment on phylogenetic estimation. IEEE/ACM Trans Comput Biol Bioinform. 2011;8(4):1108–1119. doi: 10.1109/TCBB.2009.68.
- 35. Garces RG, Gillon W, Pai EF. Atomic model of human Rcd-1 reveals an armadillo-like-repeat protein with in vitro nucleic acid binding properties. Protein Sci. 2007;16(2):176–188. doi: 10.1110/ps.062600507.
- 36. Schonichen A, Geyer M. Fifteen formins for an actin filament: a molecular view on the regulation of human formins. Biochim Biophys Acta. 2010;1803(2):152–163. doi: 10.1016/j.bbamcr.2010.01.014.
- 37. Hulpiau P, Gul IS, van Roy F. New insights into the evolution of metazoan cadherins and catenins. Prog Mol Biol Transl Sci. 2013;116:71–94. doi: 10.1016/B978-0-12-394311-8.00004-2.
- 38. Hulpiau P, van Roy F. New insights into the evolution of metazoan cadherins. Mol Biol Evol. 2011;28(1):647–657. doi: 10.1093/molbev/msq233.
- 39. Yamamoto Y, Izumi Y, Matsuzaki F. The GC kinase Fray and Mo25 regulate Drosophila asymmetric divisions. Biochem Biophys Res Commun. 2008;366(1):212–218. doi: 10.1016/j.bbrc.2007.11.128.
- 40. Filippi BM, de los Heros P, Mehellou Y, Navratilova I, Gourlay R, Deak M, Plater L, Toth R, Zeqiraj E, Alessi DR. MO25 is a master regulator of SPAK/OSR1 and MST3/MST4/YSK1 protein kinases. EMBO J. 2011;30(9):1730–1741. doi: 10.1038/emboj.2011.78.
- 41.Mendoza M, Redemann S, Brunner D. The fission yeast MO25 protein functions in polar growth and cell separation. Eur J Cell Biol. 2005;84(12):915–926. doi: 10.1016/j.ejcb.2005.09.013. [DOI] [PubMed] [Google Scholar]
- 42.Suzuki T, Ueda A, Kobayashi N, Yang J, Tomaru K, Yamamoto M, Takeno M, Ishigatsubo Y. Proteasome-dependent degradation of alpha-catenin is regulated by interaction with ARMc8alpha. Biochem J. 2008;411(3):581–591. doi: 10.1042/BJ20071312. [DOI] [PubMed] [Google Scholar]
- 43.Menssen R, Schweiggert J, Schreiner J, Kusevic D, Reuther J, Braun B, Wolf DH. Exploring the topology of the Gid complex, the E3 ubiquitin ligase involved in catabolite-induced degradation of gluconeogenic enzymes. J Biol Chem. 2012;287(30):25602–25614. doi: 10.1074/jbc.M112.363762. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 44.Francis O, Han F, Adams JC. Molecular phylogeny of a RING E3 ubiquitin ligase, conserved in eukaryotic cells and dominated by homologous components, the muskelin/RanBPM/CTLH complex. PLoS One. 2013;8(10):e75217. doi: 10.1371/journal.pone.0075217. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 45.Nakajima H, Hirata A, Ogawa Y, Yonehara T, Yoda K, Yamasaki M. A cytoskeleton-related gene, uso1, is required for intracellular protein transport in Saccharomyces cerevisiae . J Cell Biol. 1991;113(2):245–260. doi: 10.1083/jcb.113.2.245. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 46.Striegl H, Roske Y, Kummel D, Heinemann U. Unusual armadillo fold in the human general vesicular transport factor p115. PLoS One. 2009;4(2):e4656. doi: 10.1371/journal.pone.0004656. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 47.Geles KG, Adam SA. Germline and developmental roles of the nuclear transport factor importin alpha3 in C. elegans . Development. 2001;128(10):1817–1830. doi: 10.1242/dev.128.10.1817. [DOI] [PubMed] [Google Scholar]
- 48.Phadnis N, Hsieh E, Malik HS. Birth, death, and replacement of karyopherins in Drosophila . Mol Biol Evol. 2012;29(5):1429–1440. doi: 10.1093/molbev/msr306. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 49.Grimson MJ, Coates JC, Reynolds JP, Shipman M, Blanton RL, Harwood AJ. Adherens junctions and beta-catenin-mediated cell signalling in a non-metazoan organism. Nature. 2000;408(6813):727–731. doi: 10.1038/35047099. [DOI] [PubMed] [Google Scholar]
- 50.Coates JC, Grimson MJ, Williams RS, Bergman W, Blanton RL, Harwood AJ. Loss of the beta-catenin homologue aardvark causes ectopic stalk formation in Dictyostelium. Mech Dev. 2002;116(1–2):117–127. doi: 10.1016/S0925-4773(02)00152-1.
- 51.Dickinson DJ, Nelson WJ, Weis WI. A polarized epithelium organized by beta- and alpha-catenin predates cadherin and metazoan origins. Science. 2011;331(6022):1336–1339. doi: 10.1126/science.1199633.
- 52.Ha NC, Tonozuka T, Stamos JL, Choi HJ, Weis WI. Mechanism of phosphorylation-dependent binding of APC to beta-catenin and its role in beta-catenin degradation. Mol Cell. 2004;15(4):511–521. doi: 10.1016/j.molcel.2004.08.010.
- 53.Yang J, Zhang W, Evans PM, Chen X, He X, Liu C. Adenomatous polyposis coli (APC) differentially regulates beta-catenin phosphorylation and ubiquitination in colon cancer cells. J Biol Chem. 2006;281(26):17751–17757. doi: 10.1074/jbc.M600831200.
- 54.Cai X, Zhang Y. Molecular evolution of the ankyrin gene family. Mol Biol Evol. 2006;23(3):550–558. doi: 10.1093/molbev/msj056.
- 55.Abu-Helo A, Simonin F. Identification and biological significance of G protein-coupled receptor associated sorting proteins (GASPs). Pharmacol Ther. 2010;126(3):244–250. doi: 10.1016/j.pharmthera.2010.03.004.
- 56.Simonin F, Karcher P, Boeuf JJ, Matifas A, Kieffer BL. Identification of a novel family of G protein-coupled receptor associated sorting proteins. J Neurochem. 2004;89(3):766–775. doi: 10.1111/j.1471-4159.2004.02411.x.
- 57.Kemphues KJ, Wolf N, Wood WB, Hirsh D. Two loci required for cytoplasmic organization in early embryos of Caenorhabditis elegans. Dev Biol. 1986;113(2):449–460. doi: 10.1016/0012-1606(86)90180-6.
- 58.Reichen C, Hansen S, Pluckthun A. Modular peptide binding: from a comparison of natural binders to designed armadillo repeat proteins. J Struct Biol. 2014;185(2):147–162. doi: 10.1016/j.jsb.2013.07.012.
- 59.Stamos JL, Weis WI. The beta-catenin destruction complex. Cold Spring Harb Perspect Biol. 2013;5(1):a007898. doi: 10.1101/cshperspect.a007898.
- 60.Hatzfeld M. Plakophilins: multifunctional proteins or just regulators of desmosomal adhesion? Biochim Biophys Acta. 2007;1773(1):69–77. doi: 10.1016/j.bbamcr.2006.04.009.
- 61.Carramusa L, Ballestrem C, Zilberman Y, Bershadsky AD. Mammalian diaphanous-related formin Dia1 controls the organization of E-cadherin-mediated cell–cell junctions. J Cell Sci. 2007;120(Pt 21):3870–3882. doi: 10.1242/jcs.014365.
- 62.Ireton RC, Davis MA, van Hengel J, Mariner DJ, Barnes K, Thoreson MA, Anastasiadis PZ, Matrisian L, Bundy LM, Sealy L, Gilbert B, van Roy F, Reynolds AB. A novel role for p120 catenin in E-cadherin function. J Cell Biol. 2002;159(3):465–476. doi: 10.1083/jcb.200205115.
- 63.Kourtidis A, Ngok SP, Anastasiadis PZ. p120 catenin: an essential regulator of cadherin stability, adhesion-induced signaling, and cancer progression. Prog Mol Biol Transl Sci. 2013;116:409–432. doi: 10.1016/B978-0-12-394311-8.00018-2.
- 64.Young KG, Copeland JW. Formins in cell signaling. Biochim Biophys Acta. 2010;1803(2):183–190. doi: 10.1016/j.bbamcr.2008.09.017.
- 65.Buckley CD, Tan J, Anderson KL, Hanein D, Volkmann N, Weis WI, Nelson WJ, Dunn AR. Cell adhesion. The minimal cadherin-catenin complex binds to actin filaments under force. Science. 2014;346(6209):1254211. doi: 10.1126/science.1254211.
- 66.Valenta T, Hausmann G, Basler K. The many faces and functions of beta-catenin. EMBO J. 2012;31(12):2714–2736. doi: 10.1038/emboj.2012.150.
- 67.Haraguchi K, Hayashi T, Jimbo T, Yamamoto T, Akiyama T. Role of the kinesin-2 family protein, KIF3, during mitosis. J Biol Chem. 2006;281(7):4094–4099. doi: 10.1074/jbc.M507028200.
- 68.Kaplan KB, Burds AA, Swedlow JR, Bekir SS, Sorger PK, Nathke IS. A role for the adenomatous polyposis coli protein in chromosome segregation. Nat Cell Biol. 2001;3(4):429–432. doi: 10.1038/35070123.
- 69.Stolz A, Neufeld K, Ertych N, Bastians H. Wnt-mediated protein stabilization ensures proper mitotic microtubule assembly and chromosome segregation. EMBO Rep. 2015;16(4):490–499. doi: 10.15252/embr.201439410.
- 70.Jimbo T, Kawasaki Y, Koyama R, Sato R, Takada S, Haraguchi K, Akiyama T. Identification of a link between the tumour suppressor APC and the kinesin superfamily. Nat Cell Biol. 2002;4(4):323–327. doi: 10.1038/ncb779.
- 71.Onoufriadis A, Shoemark A, Munye MM, James CT, Schmidts M, Patel M, Rosser EM, Bacchelli C, Beales PL, Scambler PJ, Hart SL, Danke-Roelse JE, Sloper JJ, Hull S, Hogg C, Emes RD, Pals G, Moore AT, Chung EM, UK10K, Mitchison HM. Combined exome and whole-genome sequencing identifies mutations in ARMC4 as a cause of primary ciliary dyskinesia with defects in the outer dynein arm. J Med Genet. 2014;51(1):61–67. doi: 10.1136/jmedgenet-2013-101938.
- 72.Cheng W, Ip YT, Xu Z. Gudu, an Armadillo repeat-containing protein, is required for spermatogenesis in Drosophila. Gene. 2013;531(2):294–300. doi: 10.1016/j.gene.2013.08.080.
- 73.Pausch H, Venhoranta H, Wurmser C, Hakala K, Iso-Touru T, Sironen A, Vingborg RK, Lohi H, Soderquist L, Fries R, Andersson M. A frameshift mutation in ARMC3 is associated with a tail stump sperm defect in Swedish Red (Bos taurus) cattle. BMC Genet. 2016;17(1):49. doi: 10.1186/s12863-016-0356-7.
- 74.Iida H, Urasoko A, Doiguchi M, Mori T, Toshimori K, Shibata Y. Complementary DNA cloning and characterization of rat spergen-2, a spermatogenic cell-specific gene 2 encoding a 56-kilodalton nuclear protein bearing ankyrin repeat motifs. Biol Reprod. 2003;69(2):421–429. doi: 10.1095/biolreprod.102.013987.
- 75.Wang HL, Fan SS, Pang M, Liu YH, Guo M, Liang JB, Zhang JL, Yu BF, Guo R, Xie J, Zheng GP. The Ankyrin repeat domain 49 (ANKRD49) augments autophagy of serum-starved GC-1 cells through the NF-kappaB pathway. PLoS One. 2015;10(6):e0128551. doi: 10.1371/journal.pone.0128551.
- 76.Straschil U, Talman AM, Ferguson DJ, Bunting KA, Xu Z, Bailes E, Sinden RE, Holder AA, Smith EF, Coates JC, Tewari R. The Armadillo repeat protein PF16 is essential for flagellar structure and function in Plasmodium male gametes. PLoS One. 2010;5(9):e12901. doi: 10.1371/journal.pone.0012901.
- 77.Holt JE, Ly-Huynh JD, Efthymiadis A, Hime GR, Loveland KL, Jans DA. Regulation of nuclear import during differentiation; the IMP alpha gene family and spermatogenesis. Curr Genomics. 2007;8(5):323–334. doi: 10.2174/138920207782446151.
- 78.Mudgil Y, Shiu SH, Stone SL, Salt JN, Goring DR. A large complement of the predicted Arabidopsis ARM repeat proteins are members of the U-box E3 ubiquitin ligase family. Plant Physiol. 2004;134(1):59–66. doi: 10.1104/pp.103.029553.
- 79.Sharma M, Singh A, Shankar A, Pandey A, Baranwal V, Kapoor S, Tyagi AK, Pandey GK. Comprehensive expression analysis of rice Armadillo gene family during abiotic stress and development. DNA Res. 2014;21(3):267–283. doi: 10.1093/dnares/dst056.
- 80.Teotia S, Lamb RS. RCD1 and SRO1 are necessary to maintain meristematic fate in Arabidopsis thaliana. J Exp Bot. 2011;62(3):1271–1284. doi: 10.1093/jxb/erq363.
- 81.Latijnhouwers M, Gillespie T, Boevink P, Kriechbaumer V, Hawes C, Carvalho CM. Localization and domain characterization of Arabidopsis golgin candidates. J Exp Bot. 2007;58(15–16):4373–4386. doi: 10.1093/jxb/erm304.
- 82.Baumgartner W. Possible roles of LI-Cadherin in the formation and maintenance of the intestinal epithelial barrier. Tissue Barriers. 2013;1(1):e23815. doi: 10.4161/tisb.23815.
- 83.Oda H, Takeichi M. Evolution: structural and functional diversity of cadherin at the adherens junction. J Cell Biol. 2011;193(7):1137–1146. doi: 10.1083/jcb.201008173.
- 84.Sotomayor M, Gaudet R, Corey DP. Sorting out a promiscuous superfamily: towards cadherin connectomics. Trends Cell Biol. 2014;24(9):524–536. doi: 10.1016/j.tcb.2014.03.007.
- 85.Xing Y, Takemaru K, Liu J, Berndt JD, Zheng JJ, Moon RT, Xu W. Crystal structure of a full-length beta-catenin. Structure. 2008;16(3):478–487. doi: 10.1016/j.str.2007.12.021.
- 86.Schuld NJ, Hauser AD, Gastonguay AJ, Wilson JM, Lorimer EL, Williams CL. SmgGDS-558 regulates the cell cycle in pancreatic, non-small cell lung, and breast cancers. Cell Cycle. 2014;13(6):941–952. doi: 10.4161/cc.27804.
- 87.Hauser AD, Bergom C, Schuld NJ, Chen X, Lorimer EL, Huang J, Mackinnon AC, Williams CL. The SmgGDS splice variant SmgGDS-558 is a key promoter of tumor growth and RhoA signaling in breast cancer. Mol Cancer Res. 2014;12(1):130–142. doi: 10.1158/1541-7786.MCR-13-0362.
- 88.Tew GW, Lorimer EL, Berg TJ, Zhi H, Li R, Williams CL. SmgGDS regulates cell proliferation, migration, and NF-kappaB transcriptional activity in non-small cell lung carcinoma. J Biol Chem. 2008;283(2):963–976. doi: 10.1074/jbc.M707526200.
- 89.de Bruyn KM, Zwartkruis FJ, de Rooij J, Akkerman JW, Bos JL. The small GTPase Rap1 is activated by turbulence and is involved in integrin αIIbβ3-mediated cell adhesion in human megakaryocytes. J Biol Chem. 2003;278(25):22412–22417. doi: 10.1074/jbc.M212036200.
- 90.Chung KT, Shen Y, Hendershot LM. BAP, a mammalian BiP-associated protein, is a nucleotide exchange factor that regulates the ATPase activity of BiP. J Biol Chem. 2002;277(49):47557–47563. doi: 10.1074/jbc.M208377200.
- 91.Inaguma Y, Hamada N, Tabata H, Iwamoto I, Mizuno M, Nishimura YV, Ito H, Morishita R, Suzuki M, Ohno K, Kumagai T, Nagata K. SIL1, a causative cochaperone gene of Marinesco–Sjögren syndrome, plays an essential role in establishing the architecture of the developing cerebral cortex. EMBO Mol Med. 2014;6(3):414–429. doi: 10.1002/emmm.201303069.
- 92.Lee CF, Hauenstein AV, Fleming JK, Gasper WC, Engelke V, Sankaran B, Bernstein SI, Huxford T. X-ray crystal structure of the UCS domain-containing UNC-45 myosin chaperone from Drosophila melanogaster. Structure. 2011;19(3):397–408. doi: 10.1016/j.str.2011.01.002.
- 93.Price MG, Landsverk ML, Barral JM, Epstein HF. Two mammalian UNC-45 isoforms are related to distinct cytoskeletal and muscle-specific functions. J Cell Sci. 2002;115(Pt 21):4013–4023. doi: 10.1242/jcs.00108.
- 94.Jilani Y, Lu S, Lei H, Karnitz LM, Chadli A. UNC45A localizes to centrosomes and regulates cancer cell proliferation through ChK1 activation. Cancer Lett. 2015;357(1):114–120. doi: 10.1016/j.canlet.2014.11.009.
- 95.Nozaki M, Onishi Y, Togashi S, Miyamoto H. Molecular characterization of the Drosophila Mo25 gene, which is conserved among Drosophila, mouse, and yeast. DNA Cell Biol. 1996;15(6):505–509. doi: 10.1089/dna.1996.15.505.
- 96.Zeqiraj E, Filippi BM, Deak M, Alessi DR, van Aalten DM. Structure of the LKB1-STRAD-MO25 complex reveals an allosteric mechanism of kinase activation. Science. 2009;326(5960):1707–1711. doi: 10.1126/science.1178377.
- 97.Dettmann A, Illgen J, Marz S, Schurg T, Fleissner A, Seiler S. The NDR kinase scaffold HYM1/MO25 is essential for MAK2 map kinase signaling in Neurospora crassa. PLoS Genet. 2012;8(9):e1002950. doi: 10.1371/journal.pgen.1002950.
- 98.Alberti S, Bohse K, Arndt V, Schmitz A, Hohfeld J. The cochaperone HspBP1 inhibits the CHIP ubiquitin ligase and stimulates the maturation of the cystic fibrosis transmembrane conductance regulator. Mol Biol Cell. 2004;15(9):4003–4010. doi: 10.1091/mbc.E04-04-0293.
- 99.Rogon C, Ulbricht A, Hesse M, Alberti S, Vijayaraj P, Best D, Adams IR, Magin TM, Fleischmann BK, Hohfeld J. HSP70-binding protein HSPBP1 regulates chaperone expression at a posttranslational level and is essential for spermatogenesis. Mol Biol Cell. 2014;25(15):2260–2271. doi: 10.1091/mbc.E14-02-0742.
- 100.Feral C, Wu YQ, Pawlak A, Guellaen G. Meiotic human sperm cells express a leucine-rich homologue of Caenorhabditis elegans early embryogenesis gene, Zyg-11. Mol Hum Reprod. 2001;7(12):1115–1122. doi: 10.1093/molehr/7.12.1115.
- 101.Vasudevan S, Starostina NG, Kipreos ET. The Caenorhabditis elegans cell-cycle regulator ZYG-11 defines a conserved family of CUL-2 complex components. EMBO Rep. 2007;8(3):279–286. doi: 10.1038/sj.embor.7400895.
- 102.Hjeij R, Lindstrand A, Francis R, Zariwala MA, Liu X, Li Y, Damerla R, Dougherty GW, Abouhamed M, Olbrich H, Loges NT, Pennekamp P, Davis EE, Carvalho CM, Pehlivan D, Werner C, Raidt J, Kohler G, Haffner K, Reyes-Mugica M, Lupski JR, Leigh MW, Rosenfeld M, Morgan LC, Knowles MR, Lo CW, Katsanis N, Omran H. ARMC4 mutations cause primary ciliary dyskinesia with randomization of left/right body asymmetry. Am J Hum Genet. 2013;93(2):357–367. doi: 10.1016/j.ajhg.2013.06.009.
- 103.Lonergan KM, Chari R, Deleeuw RJ, Shadeo A, Chi B, Tsao MS, Jones S, Marra M, Ling V, Ng R, Macaulay C, Lam S, Lam WL. Identification of novel lung genes in bronchial epithelium by serial analysis of gene expression. Am J Respir Cell Mol Biol. 2006;35(6):651–661. doi: 10.1165/rcmb.2006-0056OC.
- 104.Sapiro R, Kostetskii I, Olds-Clarke P, Gerton GL, Radice GL, Strauss JF 3rd. Male infertility, impaired sperm motility, and hydrocephalus in mice deficient in sperm-associated antigen 6. Mol Cell Biol. 2002;22(17):6298–6305. doi: 10.1128/MCB.22.17.6298-6305.2002.
- 105.Teves ME, Sears PR, Li W, Zhang Z, Tang W, van Reesema L, Costanzo RM, Davis CW, Knowles MR, Strauss JF 3rd, Zhang Z. Sperm-associated antigen 6 (SPAG6) deficiency and defects in ciliogenesis and cilia function: polarity, density, and beat. PLoS One. 2014;9(10):e107271. doi: 10.1371/journal.pone.0107271.
- 106.Li W, Mukherjee A, Wu J, Zhang L, Teves ME, Li H, Nambiar S, Henderson SC, Horwitz AR, Strauss JF 3rd, Fang X, Zhang Z. Sperm Associated Antigen 6 (SPAG6) regulates fibroblast cell growth, morphology, migration and ciliogenesis. Sci Rep. 2015;5:16506. doi: 10.1038/srep16506.
- 107.Manning BD, Snyder M. Drivers and passengers wanted! the role of kinesin-associated proteins. Trends Cell Biol. 2000;10(7):281–289. doi: 10.1016/S0962-8924(00)01774-8.
- 108.Bansal SK, Gupta N, Sankhwar SN, Rajender S. Differential genes expression between fertile and infertile spermatozoa revealed by transcriptome analysis. PLoS One. 2015;10(5):e0127007. doi: 10.1371/journal.pone.0127007.
- 109.Ratan R, Mason DA, Sinnot B, Goldfarb DS, Fleming RJ. Drosophila importin alpha1 performs paralog-specific functions essential for gametogenesis. Genetics. 2008;178(2):839–850. doi: 10.1534/genetics.107.081778.
- 110.Jabbour L, Welter JF, Kollar J, Hering TM. Sequence, gene structure, and expression pattern of CTNNBL1, a minor-class intron-containing gene—evidence for a role in apoptosis. Genomics. 2003;81(3):292–303. doi: 10.1016/S0888-7543(02)00038-1.
- 111.Ganesh K, Adam S, Taylor B, Simpson P, Rada C, Neuberger M. CTNNBL1 is a novel nuclear localization sequence-binding protein that recognizes RNA-splicing factors CDC5L and Prp31. J Biol Chem. 2011;286(19):17091–17102. doi: 10.1074/jbc.M110.208769.
- 112.Huang X, Wang G, Wu Y, Du Z. The structure of full-length human CTNNBL1 reveals a distinct member of the armadillo-repeat protein family. Acta Crystallogr D Biol Crystallogr. 2013;69(Pt 8):1598–1608. doi: 10.1107/S0907444913011360.
- 113.Hiroi N, Ito T, Yamamoto H, Ochiya T, Jinno S, Okayama H. Mammalian Rcd1 is a novel transcriptional cofactor that mediates retinoic acid-induced cell differentiation. EMBO J. 2002;21(19):5235–5244. doi: 10.1093/emboj/cdf521.
- 114.Ryu JR, Echarri A, Li R, Pendergast AM. Regulation of cell–cell adhesion by Abi/Diaphanous complexes. Mol Cell Biol. 2009;29(7):1735–1748. doi: 10.1128/MCB.01483-08.
- 115.Schulze N, Graessl M, Blancke Soares A, Geyer M, Dehmelt L, Nalbant P. FHOD1 regulates stress fiber organization by controlling the dynamics of transverse arcs and dorsal fibers. J Cell Sci. 2014;127(Pt 7):1379–1393. doi: 10.1242/jcs.134627.
- 116.Kim HC, Jo YJ, Kim NH, Namgoong S. Small molecule inhibitor of formin homology 2 domains (SMIFH2) reveals the roles of the formin family of proteins in spindle assembly and asymmetric division in mouse oocytes. PLoS One. 2015;10(4):e0123438. doi: 10.1371/journal.pone.0123438.
- 117.Xie C, Jiang G, Fan C, Zhang X, Zhang Y, Miao Y, Lin X, Wu J, Wang L, Liu Y, Yu J, Yang L, Zhang D, Xu K, Wang E. ARMC8alpha promotes proliferation and invasion of non-small cell lung cancer cells by activating the canonical Wnt signaling pathway. Tumour Biol. 2014;35(9):8903–8911. doi: 10.1007/s13277-014-2162-z.