Abstract
ATP‐binding cassette (ABC) systems, characterized by ABC‐type nucleotide‐binding domains (NBDs), play crucial roles in various aspects of human physiology. Human ABCG5 and ABCG8 form a heterodimeric transporter that functions in the efflux of sterols. We used sequence similarity search, multiple sequence alignment, phylogenetic analysis, and structure comparison to study the evolutionary origin and sequence signatures of ABCG5 and ABCG8. Orthologs of ABCG5 and ABCG8, supported by phylogenetic analysis and signature residues, were identified in bilaterian animals, Filasterea, Fungi, and Amoebozoa. Such a phylogenetic distribution suggests that ABCG5 and ABCG8 could have originated in the last common ancestor of Amorphea (the unikonts), the eukaryotic group including Amoebozoa and Opisthokonta. ABCG5 and ABCG8 were missing in genomes of various lineages such as snakes, jawless vertebrates, non‐vertebrate chordates, echinoderms, and basal metazoan groups. Amino‐acid changes in key positions in the ABCG8 Walker A motif and/or the ABCG5 C‐loop were observed in most tetrapod organisms, likely resulting in the loss of ATPase activity at one nucleotide‐binding site. ABCG5 and ABCG8 in Ecdysozoa (such as insects) exhibit elevated evolutionary rates and have accumulated various changes in their NBD functional motifs. Alignment inspection revealed several residue positions that show different amino‐acid usages in ABCG5/ABCG8 compared to other ABCG subfamily proteins. These residues were mapped to the structural cores of transmembrane domains (TMDs), the NBD‐TMD interface, and the interface between TMDs. They serve as sequence signatures to differentiate ABCG5/ABCG8 from other ABCG subfamily proteins, and some of them may contribute to substrate specificity of the ABCG5/ABCG8 transporter.
Keywords: ABC systems, ABC transporters, ABCG5, ABCG8, sequence signatures, sterol efflux
1. INTRODUCTION
ATP‐binding cassette (ABC) systems are universally present in all domains of life and play important roles in a plethora of cellular processes including lipid translocation, hormone secretion, drug export, and gene regulation. 1 , 2 The essential component of an ABC system is an ABC‐type ATPase, also called the nucleotide‐binding domain (NBD). Based on sequence similarity and domain organization, eukaryotic ABC systems are classified into eight subfamilies named alphabetically from ABCA to ABCH. 3 , 4 , 5 , 6 Members of the ABCE and ABCF subfamilies possess only NBDs and are involved in non‐transporter functions such as translation and DNA repair. 7 Proteins of the other six eukaryotic ABC subfamilies (ABCA, ABCB, ABCC, ABCD, ABCG, and ABCH) are mainly exporters formed by single polypeptides or multi‐subunit complexes consisting of two NBDs and two transmembrane domains (TMDs).
An ABC‐type NBD possesses several conserved sequence motifs, including the Walker A and Walker B motifs (common to all P‐loop NTPases), 8 the signature motif of the ABC‐type ATPases (C‐loop), the D‐loop, the Q‐loop, and the H‐loop. 9 The Walker A motif (GxGxxGK[ST]), with conserved glycine, lysine, and serine/threonine residues, is mainly responsible for interactions with the phosphate moieties of the substrate ATP. The Walker B motif (hhhhDE, h: a hydrophobic residue) is crucial for the catalytic reaction of ATP hydrolysis. 10 As ABC‐type NBDs function as homodimers or heterodimers, the C‐loop (SGG[EQ]) in one monomer is responsible for binding ATP on the opposite side to the Walker A motif in the other monomer. 11 The C‐loop and the D‐loop are essential for cross‐talk between the two monomers. 12 The conserved glutamine residue in the Q‐loop is involved in magnesium binding, 13 and the conserved histidine in the H‐loop (also called the Switch II region) is an active site residue involved in catalysis. 14
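As an illustration of how the motif consensuses above can be applied, the following minimal sketch scans a sequence with regular expressions derived from the patterns GxGxxGK[ST], hhhhDE, and SGG[EQ]. The set of residues treated as hydrophobic, the function name, and the toy sequence are assumptions made for this example only; they are not taken from the analysis described in this paper.

```python
import re

# Residues treated as hydrophobic ("h") here; this alphabet is an assumption.
HYDROPHOBIC = "AVILMFWYC"

# Regular expressions written from the consensus patterns quoted in the text.
MOTIFS = {
    "Walker A": re.compile(r"G.G..GK[ST]"),
    "Walker B": re.compile(rf"[{HYDROPHOBIC}]{{4}}DE"),
    "C-loop": re.compile(r"SGG[EQ]"),
}


def find_motifs(seq):
    """Return {motif name: [0-based start positions]} for one protein sequence."""
    seq = seq.upper()
    return {name: [m.start() for m in pattern.finditer(seq)]
            for name, pattern in MOTIFS.items()}


if __name__ == "__main__":
    # Hypothetical toy sequence, not a real protein.
    toy = "MKGAGTSGKTLLNALSGGQRKRVLILDEPT"
    print(find_motifs(toy))
    # Replacing the Walker A serine/threonine with alanine abolishes the match,
    # mimicking the kind of substitution discussed for degenerate sites.
    print(find_motifs(toy[:9] + "A" + toy[10:]))
```

A strict pattern match of this kind only flags candidate motifs; judging whether a substitution inactivates a site still requires alignment context and experimental evidence.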
Compared to NBDs, TMDs of ABC transporters are structurally more variable due to the diversity of the transported substrates. 1 , 15 TMDs of eukaryotic ABC transporters can form symmetric homodimers or pseudo‐centrosymmetric heterodimers. They are classified as either type I exporters (ABCB, ABCC, and ABCD) or type II exporters (ABCA, ABCG, and ABCH), which differ in the topology of the TMDs. 1 A functional ABC transporter in these subfamilies has minimally two NBDs and two TMDs from either one protein (a full transporter) or two proteins, each of which is called a half transporter (with one NBD and one TMD).
Eukaryotic genomes encode multiple ABC‐system proteins, ranging from fewer than 30 in certain yeasts 16 to more than 100 in green plants 17 and the spider mite Tetranychus urticae. 18 The human genome encodes around 50 ABC‐system proteins 19 in subfamilies from ABCA to ABCG. A few of them are responsible for efflux of cholesterol and other sterol species, including the ABCA1 full transporter, the ABCG1 homodimer, and the ABCG5/ABCG8 heterodimer. 20 As cholesterol is an essential component of vertebrate cell membranes, the secretion of cholesterol from hepatocytes to bile by ABCG5/ABCG8 is critical to maintain the balance of this sterol. 21 , 22 ABCG5/ABCG8 is also responsible for reducing circulating plant‐derived phytosterols, such as sitosterols, absorbed from the diet. Mutations in ABCG5 or ABCG8 have been found to block sterol secretion and lead to the autosomal recessive disorder sitosterolemia, which is characterized by premature coronary atherosclerosis caused by sterol accumulation. 23 , 24
We aimed to study the evolutionary origin of ABCG5 and ABCG8 by comparative sequence and structure analysis of these proteins and other ABCG subfamily proteins in eukaryotic proteomes. Supported by the results of phylogenetic analysis, we identified orthologs of ABCG5/ABCG8 in several lineages outside vertebrates, including protostomes, Filasterea, Fungi, and Amoebozoa. Such a phylogenetic distribution suggests a deep evolutionary origin of these proteins in eukaryotes, dating back to at least the last common ancestor of Amorphea (the unikonts). ABCG5 and ABCG8 appear to have been lost in genomes of snakes, jawless vertebrates, non‐vertebrate chordates, echinoderms, and non‐bilaterian metazoans. Degeneration of NBD motifs was observed in one nucleotide‐binding site of most tetrapod ABCG5/ABCG8 proteins and in both nucleotide‐binding sites of ecdysozoan ABCG5/ABCG8 proteins. Manual inspection of the ABCG subfamily alignment revealed positions showing different amino‐acid preferences in ABCG5/ABCG8 proteins compared to other ABCG proteins. Mutations in some of these positions have been linked to diseases. These signature residues map to several clusters in the ABCG5/ABCG8 heterodimer structure, 25 suggesting their involvement in maintaining the structural integrity of TMDs, inter‐domain interactions, and substrate interactions.
2. RESULTS AND DISCUSSION
2.1. Identification of ABCG5 and ABCG8 orthologs in diverse lineages of eukaryotes
We used BLAST 26 to search for eukaryotic homologs of human ABCG5 and ABCG8 proteins against a sequence database consisting of proteins from 2,568 eukaryotic proteomes downloaded from NCBI (Section 4). Orthologs of ABCG5 and ABCG8 were found in various vertebrates, including mammals, birds, reptiles, amphibians, teleost fishes, and chondrichthyan fishes. An ABC transporter has two NBDs that jointly form two nucleotide‐binding sites (NBSs). For the ABCG5 and ABCG8 heterodimer, one NBS (named NBS1 25 ) is formed by the Walker A/Walker B/Q‐loop motifs of ABCG8 and the signature motif (C‐loop) of ABCG5, and the other NBS (named NBS2) is formed by the Walker A/Walker B/Q‐loop motifs of ABCG5 and the signature motif of ABCG8. The Walker A motif of mammalian ABCG8 and the signature motif of ABCG5 in NBS1 have accumulated amino‐acid changes in key positions, while the crucial residues in NBS2 are preserved. For example, the serine/threonine in the Walker A motif of human ABCG8 has changed to alanine, and the first glycine in the C‐loop has changed to threonine in human ABCG5 (Figure 1). Such changes indicate the loss of ATPase activity at NBS1. Indeed, while mutations introduced to key residues in NBS2 prevented bile sterol secretion, mutagenesis of the NBS1 residues did not affect sterol secretion. 27 , 28
FIGURE 1.
Multiple sequence alignment of sequence motifs in NBDs of the ABCG subfamily proteins. Consensuses of NBD sequence motifs are marked on top of the alignment. Conserved amino acids in these motifs are marked by red bold letters. Substitutions in these positions are marked by bold, black and underlined letters. Each protein is denoted by a two‐letter species abbreviation code (hs, Homo sapiens; cg, C. gigas; dm, D. melanogaster; is, I. scapularis; sc, S. cerevisiae; pv, P. violaceum; at, A. thaliana) and its NCBI accession number. Available common gene names are placed after the accession numbers for human, fruit fly, budding yeast, and A. thaliana. N‐terminal and C‐terminal NBDs of full transporters are denoted by “_N” and “_C” after the common names, respectively. Names of possible inactive NBDs due to substitutions in motifs are underlined. Organism names are colored as follows: magenta: metazoan; black: Filasterea; orange: fungi; blue: amoebozoan; green: green plants. The position showing different amino acid composition in ABCG5/ABCG8 is highlighted in grey background, with conserved amino acids in ABCG5/ABCG8 in bold magenta letters. The amino acid position numbers in human proteins are labeled above (Q206 for ABCG5 and Q226 for ABCG8). Numbers of amino acids in between the blocks are shown in parentheses.
We found that the NBD motifs of both NBS1 and NBS2 in ABCG5 and ABCG8 proteins are conserved in cartilaginous fishes and bony fishes. In contrast, amino‐acid changes in the ABCG8 Walker A motif and/or the ABCG5 C‐loop, the two motifs that contribute to NBS1, were observed in most species of the tetrapod groups such as amphibians (except the three caecilians), reptiles, birds, and mammals (Figure S1), suggesting a loss of ATPase activity at NBS1. It is thus possible that both ABCG5 and ABCG8 NBDs were functional ATPases in the last common ancestor of vertebrates, and that the likely loss of ATPase activity at NBS1 is a relatively recent evolutionary event that occurred after the split of tetrapods from their sarcopterygian fish ancestors.
ABCG5 and ABCG8 were not found in proteomes of snakes, jawless vertebrates (hagfish and lampreys), nonvertebrate chordates (cephalochordates and tunicates), echinoderms, and hemichordates. Other transporters may substitute for ABCG5/ABCG8 in sterol secretion in these organisms. Control of excessive sterols could also be achieved through other mechanisms. For example, secretory phospholipase A2 has been shown to decrease the serum cholesterol level in mice, 29 and this enzyme is one main component of snake venom, which can decrease the cholesterol level in human serum. 30 Therefore, snakes may rely on phospholipase A2 to control their blood sterol level. Lampreys are known to thrive despite developmental biliary atresia (loss of the bile duct and gallbladder) during metamorphosis. 31 Loss of ABCG5/ABCG8 genes is consistent with this anatomical change in lampreys. Adult lampreys adapt to such a change by altering the composition of hepatic bile salt and maintaining normal plasma bile salt levels mainly through renal bile excretion. 32
Outside vertebrates, we identified orthologs of ABCG5 and ABCG8 in the Lophotrochozoa group including several phyla such as Mollusca, Annelida, and Brachiopoda. 33 Orthology was inferred by “reciprocal best BLAST hit” criterion (see Materials and methods) between ABCG5 or ABCG8 in human and corresponding proteins in these proteomes, and this relationship is further supported by phylogenetic analysis of ABCG subfamily proteins (Figure 3). Lophotrochozoan ABCG5 and ABCG8 proteins maintain the conservation of key NBD motifs (one sequence from Crassostrea gigas shown in Figure 1), suggesting functional ATPase activity in both NBS1 and NBS2.
FIGURE 3.
Phylogenetic tree of selected ABCG subfamily proteins. Sequence names are denoted in the same way as in Figure 1. Sequences with possible inactive ATPases due to substitutions in conserved motifs are denoted by underlined names and red branches.
Orthologs of ABCG5 and ABCG8 have been previously reported in Arthropoda, 5 , 18 a phylum in Ecdysozoa (a clade of Protostomia distinct from Lophotrochozoa). In the fruit fly Drosophila melanogaster, the orthologs of ABCG5 and ABCG8 are two proteins of unknown function: CG11069 and CG31121. 5 Like vertebrate ABCG5 and ABCG8 genes, the genes encoding these two D. melanogaster orthologs are arranged in a head‐to‐head orientation in the genome, suggesting co‐regulation at the gene level. BLAST searches using these proteins identified various orthologous proteins of ABCG5 and ABCG8 in Arthropoda and several other ecdysozoan phyla including Nematoda, Tardigrada, and Priapulimorpha. Ecdysozoan ABCG5 and ABCG8 orthologs harbor various changes in key positions in NBD motifs such as Walker A, Q‐loop, C‐loop, and H‐loop (Figure 1) in both NBS1 and NBS2, suggesting that both NBSs could have lost ATPase activity. It is thus likely that these proteins no longer function as ATP‐dependent transporters. These ecdysozoan ABCG5 and ABCG8 orthologs could still be functionally important and may have adopted other roles such as binding and sensing sterol molecules in the membrane. The molecular and physiological functions of these proteins remain to be experimentally characterized.
The best BLAST hit outside Metazoa against the human ABCG5 query is a protein (NCBI accession: XP_004347341.1) from Capsaspora owczarzaki, a single‐cell organism of the Filasterea lineage. 34 This protein is the reciprocal best hit of ABCG5 between the C. owczarzaki proteome and the human proteome. It shares 36% identity with the full‐length human ABCG5 protein and 42% identity in the NBD region (Figure 1). Similarly, a protein from C. owczarzaki (XP_004345711.1) is the reciprocal best BLAST hit of human ABCG8 (Figure 1). These findings suggest that ABCG5 and ABCG8 have an evolutionary origin beyond metazoans. Further BLAST searches using C. owczarzaki ABCG5 and ABCG8 orthologs revealed closely related sequences as their putative orthologs in several other eukaryotic lineages, such as Fonticula, Fungi, and Amoebozoa. For example, C. owczarzaki ABCG5 identified a protein from the amoebozoan Acytostelium subglobosum (XP_012752403.1) with a sequence identity of 40.3% and an e‐value of 1.34e–141. It also found various full transporters from Fungi with two NBD‐TMD units. The best hits of C. owczarzaki ABCG5 are to the C‐terminal NBD‐TMD units of the Fungi full transporters, and the best hits of C. owczarzaki ABCG8 are to the N‐terminal NBD‐TMD units of the same proteins. Such observations suggest that ABCG5 and ABCG8 genes are fused in the Fungi lineage, with ABCG8 corresponding to the N‐terminal NBD‐TMD unit of the full transporter. Such full transporters include the YOL075C protein of unknown function from Saccharomyces cerevisiae. They have been shown to form a clade separated from other Fungi ABCG subfamily full transporters such as PDR5. 35
In summary, ABCG5 and ABCG8 orthologs were identified in the following eukaryotic lineages: Amoebozoa, Fonticula alba, Fungi, Filasterea, and Metazoa. These lineages all belong to the group of Amorphea (the unikonts), the ancestor of which was inferred to have a single flagellum. 12 On the other hand, ABCG5 and ABCG8 were not found in any proteomes of the Bikonta group, 36 which contains phylogenetic groups such as SAR (stramenopiles, alveolates, and Rhizaria), Archaeplastida (such as green plants, red algae, and glaucophytes), and Excavata (such as Euglenozoa and Heterolobosea). It is thus possible that the ABCG5/ABCG8 complex originated in the common ancestor of Amorphea, which contains Amoebozoa and Opisthokonta. Sequencing data sampling more Bikonta organisms may help resolve the questions of whether ABCG5 and ABCG8 have a deeper evolutionary origin and whether the absence of ABCG5/ABCG8 in Bikonta is due to gene loss. Within the Amorphea group, ABCG5 and ABCG8 appear to be missing in proteomes of organisms from lineages such as Apusozoa, Ichthyosporea, and the four non‐bilaterian (basal) groups of metazoans (Porifera, Ctenophora, Cnidaria, and Placozoa), possibly due to lineage‐specific gene loss events.
Like their mammalian orthologs, nonmammalian ABCG5/ABCG8 could be involved in the transport of sterols or sterol‐like molecules. For example, YOL075C, the ABCG5/ABCG8 ortholog in budding yeast, was recently characterized to be a steryl ester transporter. 37 Nonmammalian ABCG5/ABCG8 may also be involved in the transport of other ligands, as a recent study found that the zebrafish ABCG5/ABCG8 transporter is involved in the detoxification of the pesticide lindane. 38
2.2. Phylogenetic analysis of ABCG subfamily proteins from selected eukaryotes
We constructed a maximum likelihood phylogenetic tree (Section 4) using the ABCG subfamily proteins from selected eukaryotic organisms (Figure 3). This tree shows strong support for the separation of ABCG5 and ABCG8 proteins from other ABCG members (bootstrap value: 100), consistent with the hypothesis that they resulted from a gene duplication of an ancestral homodimeric half transporter. The monophyly of ABCG5 proteins including yeast and amoebozoan sequences is strongly supported (bootstrap value: 99). ABCG8 orthologs also form a monophyletic group, within which the subgroup of metazoan and filasterean sequences is well supported (bootstrap value: 96.4). The ABCG5 and ABCG8 proteins from ecdysozoans (D. melanogaster (dm|), C. elegans (ce|), and tick (is|)) form long branches. The elevated evolutionary rates in ecdysozoans are consistent with the observed changes in active site residues in both ABCG5 and ABCG8 NBDs (Figure 1).
Several other ABCG groups have strong support and are labeled in Figure 3. The ABCG2 group of transporters includes human multidrug transporter ABCG2. 39 It was found in Metazoa, Fungi, and Amoebozoa (Figure 3). Consistent with the result of a previous study, 5 we did not find orthologs of ABCG2 in arthropods. The WHITE group of ABCG transporters includes D. melanogaster pigment transporters white, brown and scarlet. 40 They are also found in C. elegans and the mollusk C. gigas, but not in vertebrates. The ABCG1 group contains human ABCG1 and ABCG4, two closely related transporters functioning in sterol export. 41 This group also includes an expanded set of duplicated gene products from D. melanogaster, one of which (Atet) has been shown to be a transporter of the precursor steroid hormone ecdysone. 42 The E23 group contains D. melanogaster protein E23 (Early gene at 23) 43 and some plant sequences (Figure 3).
The PDR (pleiotropic drug resistance) group of ABCG full transporters is responsible for exporting xenobiotics in yeast and heavy metals and secondary metabolites in plants. 44 , 45 , 46 We identified PDR transporters in a variety of eukaryotes including Fungi (such as S. cerevisiae proteins PDR5, SNQ2, and AUS1, Figures 1 and 2), C. owczarzaki, Amoebozoa, and green plants. In contrast, PDRs were not found in metazoans. The N‐terminal NBD‐TMD units (PDR_N in Figure 3) and C‐terminal NBD‐TMD units (PDR_C in Figure 3) of the PDR full transporters form well‐supported monophyletic groups individually (bootstrap values >95). These two groups together form a monophyletic group, suggesting that they arose from the duplication of a half transporter followed by gene fusion. The branch lengths of the N‐terminal NBD‐TMD units of PDRs are longer than those of the C‐terminal NBD‐TMD units, consistent with the degeneration of NBD motifs (Figure 1) and potential loss of ATPase activity at one nucleotide‐binding site. While the yeast protein YOL075C is also a full transporter, it does not belong to the PDR group of ABCG full transporters. Instead, its N‐ and C‐terminal NBD‐TMD units are orthologous to ABCG8 and ABCG5, respectively. The gene fusion of ABCG5 and ABCG8 to form a full transporter appears to be a Fungi‐specific event as ABCG5 and ABCG8 are encoded by separate genes in other lineages including Amoebozoa, Fonticula, Filasterea, and Metazoa.
FIGURE 2.
Multiple sequence alignment showing ABCG5/ABCG8‐signature residues in TMDs of ABCG subfamily proteins. Sequences are named and ordered in the same way as in Figure 1. Small residues (G, A, S, C, T, P) in positions with mainly small residues are colored green. Residues in positions showing differences in amino acid usages of ABCG5 or ABCG8 compared to other ABCG subfamily proteins are highlighted in grey or yellow background. These positions are labeled by their amino acid numbers in human ABCG5 and ABCG8 proteins. The signature residues in ABCG5 and ABCG8 are colored based on their locations in the structure of the human ABCG5/ABCG8 complex: magenta: positions at the interface of TMD and NBD; cyan: positions in the interface of ABCG5 TMD and ABCG8 TMD; orange: polar residues in the core of the transmembrane helices; yellow background: positions showing size changes in ABCG5 TMD. Ligand‐interacting residues are marked by asterisks above human ABCG5 and ABCG8 (PDB: 7R8B, site 1 residues in green and site 2 residues in red), human ABCG1 (PDB: 7R8D), and human ABCG2 (PDB: 7OJ8).
2.3. Identification of ABCG5 and ABCG8 signature residues
We constructed a multiple sequence alignment of ABCG5, ABCG8, and other ABCG subfamily proteins. Several positions were identified from this alignment that exhibit specific amino‐acid usages for ABCG5 proteins and/or ABCG8 proteins. These residue positions are highlighted with residue numbers labeled in alignment Figures 1 and 2 and mapped onto the structure of ABCG5 and ABCG8 in Figure 4. They represent signature residues that help differentiate ABCG5 and ABCG8 proteins from other ABCG subfamily proteins. It should be noted that these positions often have different amino‐acid usages in the ecdysozoan ABCG5 and ABCG8 proteins that have experienced accelerated evolution.
FIGURE 4.
Structural mapping of ABCG5/ABCG8‐signature residues. C‐alpha traces of ABCG5 and ABCG8 are shown in black and gray, respectively. Side chains of ABCG5 and ABCG8 signature residues are shown as spheres and labeled with their residue numbers. Color coding of side‐chain carbon atoms in these residues is consistent with the color coding of these residues in Figures 1 and 2. Sterol molecules in site 1 and site 2 are colored in green and red, respectively.
For the NBDs, the most striking difference between ABCG5/ABCG8 and the other ABCG members is observed in the position that is nine amino acids away from the C‐terminus of the C‐loop motif SGGE/Q. This position (Q206 in human ABCG5 and Q226 in human ABCG8, in magenta background in Figure 1) is mostly occupied by a glutamine in ABCG5 and ABCG8 proteins, compared to a negatively charged residue (glutamate being the most frequent) in other ABCG subfamily proteins. Structurally, this position is located at the interface between the NBD and the TMD. Interestingly, one of its interacting residues in the TMD coupling helix also has a different amino‐acid composition in ABCG5 and ABCG8 proteins compared to other ABCG subfamily proteins (Figure 4). It is preferably a negatively charged residue (glutamate or aspartate) in ABCG5/ABCG8 (D455 in human ABCG5 and D484 in human ABCG8, in magenta background in Figure 2) as opposed to mainly a noncharged polar residue in other ABCG subfamily proteins, such as serine in human ABCG2 and asparagine in human ABCG1 and ABCG4. The concerted changes in these two positions at the NBD‐TMD interface maintain the polar side‐chain interactions between a charged residue and a noncharged polar residue and avoid placing two negatively charged residues in structural vicinity.
Several ABCG5/ABCG8‐signature residues reside in the middle four core transmembrane helices (TMH2‐5, shown in Figure 2). Among them, polar residues Y424, Q425, and R550 in ABCG5 and D466, N564, and N568 in ABCG8 are located in the interface between the two TMDs (colored cyan in Figures 2 and 4). These positions are often occupied by hydrophobic residues in other ABCG subfamily proteins. They could contribute to substrate binding surfaces or entry/exit ways for sterols to access the heterodimer interface. Residue R550 of human ABCG5 lies at the apex of ABCG5 and ABCG8 interface. Its position is occupied by a hydrophobic residue in non‐ABCG5 proteins (Figure 2). One mutation (R550S) has been reported in sitosterolemia patients and is likely to be disease‐causing, 24 suggesting that R550 could play an important role in the sterol transport.
A recent structure of the ABCG5/ABCG8 complex showed two sterol molecules binding at two sites located in the interface between ABCG5 and ABCG8 TMDs (PDB: 7R8B). 47 The first site (site 1 47 ) is located within the cytosolic leaflet of the membrane and contains a sterol (shown in green in Figure 4) positioned parallel to the transmembrane helices. Sterol‐contacting residues (within 5 Å) at site 1 are mainly from TMH2 of ABCG8 and TMH5 of ABCG5 (marked by green asterisks in Figure 2). One signature residue (D466 of ABCG8) makes contact with the sterol in site 1. This site is also observed in the structure of ABCG1, which binds sterols in a similar fashion (PDB: 7R8D) 47 and involves a similar set of binding residues from TMH2 and TMH5 (marked by green asterisks above human ABCG1 in Figure 2). The second sterol‐binding site (site 2) of ABCG5/ABCG8 is located midway through the TMHs, and the orientation of the sterol molecule (shown in red in Figure 4) is nearly parallel to the membrane. Sterol‐interacting residues of site 2 are mainly from TMH2 and TMH5 of ABCG5 and TMH5 of ABCG8 (marked by red asterisks in Figure 2). Signature residues Q425 of ABCG5 and N564/N568 of ABCG8 make contact with the sterol in site 2. These three positions are occupied by mainly hydrophilic residues in ABCG5/ABCG8 proteins and mainly hydrophobic residues in other ABCG subgroups. The location of ABCG5/ABCG8 site 2 is similar to the ligand‐binding sites observed in the structures of human ABCG2, 48 , 49 , 50 a multi‐drug transporter with broad substrate specificity. One ABCG2 ligand with a chemical structure similar to sterols (PDB: 7OJ8, ligand: estrone 3‐sulfate with four fused rings) 48 showed a similar set of ligand‐binding residues (marked by red asterisks in Figure 2 above human ABCG2) to those of the ABCG5/ABCG8 sterol‐binding site 2.
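The 5 Å contact criterion used above can be expressed as a simple distance test. The sketch below is one possible formulation, assuming that a residue counts as ligand‐contacting when any of its atoms lies within the cutoff of any ligand atom; the function name, the input layout, and the residue labels are hypothetical, and parsing of coordinate files is deliberately left out.

```python
import math


def contacting_residues(protein_atoms, ligand_atoms, cutoff=5.0):
    """Return sorted labels of residues with any atom within `cutoff` of the ligand.

    protein_atoms: iterable of (residue_label, (x, y, z)) pairs, one per atom.
    ligand_atoms:  iterable of (x, y, z) coordinates of the ligand atoms.
    Distances are in the same unit as the coordinates (angstroms for PDB files).
    """
    ligand_atoms = list(ligand_atoms)
    contacts = set()
    for residue, coord in protein_atoms:
        # One atom within the cutoff is enough to call the residue a contact.
        if any(math.dist(coord, lig) <= cutoff for lig in ligand_atoms):
            contacts.add(residue)
    return sorted(contacts)


if __name__ == "__main__":
    # Hypothetical coordinates chosen only to exercise the function.
    protein = [
        ("A:10", (0.0, 0.0, 0.0)),
        ("A:10", (9.0, 0.0, 0.0)),
        ("A:11", (0.0, 8.0, 0.0)),
        ("B:20", (3.0, 4.0, 0.0)),
    ]
    ligand = [(0.0, 0.0, 1.0), (6.0, 4.0, 0.0)]
    print(contacting_residues(protein, ligand))
```

The all‐against‐all loop is adequate for a single ligand; a spatial index would be preferable for whole‐structure scans.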
Residues N437, Y467, S475, and E514 in ABCG5 and residues H504 and R543 in ABCG8 (colored orange in Figures 2 and 4) are involved in polar interaction networks inside their TMDs. Concerted changes are observed in these positions when comparing ABCG5 or ABCG8 to other ABCG subfamily proteins. For example, the position of human ABCG5 Y467 is often occupied by a positively charged residue in other ABCG members. These polar interactions inside TMDs should be important for maintaining the structural stability of TMDs. Several mutations have been mapped to polar residues in the structural cores of the ABCG5 and ABCG8 TMDs. For example, the N437K mutation of ABCG5 led to no expression of ABCG5 51 and can cause sitosterolemia. 52 The R543S mutation of ABCG8 caused a decrease in the proportion of ABCG8 present in the mature form. 51
Small residues (such as glycine and alanine) inside TMDs often occur at tightly packed sites between transmembrane helices. Several small‐residue positions are conserved among all ABCG subfamily proteins (colored green in Figure 2). We observed two positions in ABCG5 TMD (in yellow background in Figure 2) that showed changes of side‐chain size compared to other ABCG members. The position corresponding to A478 of human ABCG5 is occupied by mainly small residues in ABCG5 proteins and mainly large residues in other ABCG subfamily proteins (e.g., Y507 in human ABCG8). The opposite size change is observed in the position corresponding to L521 of human ABCG5, where large residues occur in ABCG5 proteins compared to mainly small residues in other ABCG members (e.g., A550 in human ABCG8). The size differences in these two positions suggest subtle changes in the details of transmembrane helix packing in ABCG5.
3. CONCLUSIONS
Sterol molecules with a shared core structure of four fused rings are universally present in the cell membrane of eukaryotes. The ABCG5/ABCG8 heterodimeric transporter plays crucial roles in maintaining sterol balance in mammals by exporting diet‐derived sterols. Through extensive sequence similarity searches, we identified ABCG5/ABCG8 in diverse eukaryotic lineages in the group of Amorphea, including Amoebozoa, Fungi, Filasterea, protostomian invertebrates, and vertebrates. Such a finding suggests that an ancient gene duplication event gave rise to this heterodimeric transporter. Some vertebrate organisms, such as snakes and lampreys, lost their ABCG5/ABCG8, which may be correlated with their unique physiology. Orthology of ABCG5 and ABCG8 was supported by phylogenetic analysis and the discovery of ABCG5/ABCG8‐specific signature residues compared to other subgroups of the ABCG‐type transporters. These signature residues structurally map to the interdomain interaction interfaces involving the coupling helices, the cores of TMDs, and the interface between TMDs. The catalytic asymmetry observed in mammalian ABCG5/ABCG8 28 likely originated after the split of tetrapods from other vertebrates, as we observed that most ABCG5/ABCG8 proteins in the tetrapod lineage have degenerated sequence motifs at one nucleotide‐binding site (Figure S1). Most ABCG5/ABCG8 proteins in lineages outside tetrapods likely maintain the catalytic integrity of both NBSs by preserving key conserved residues. One exception is the ecdysozoan lineage, where both NBSs have accumulated changes at critical residues, suggesting that ecdysozoan ABCG5/ABCG8 proteins may be catalytically inactive and have nontransporter functions. Few experiments have been performed for ABCG5/ABCG8 proteins outside mammalian organisms. The ligands and physiological roles of ABCG5/ABCG8 orthologs in lineages outside mammals remain to be experimentally explored.
4. MATERIALS AND METHODS
4.1. Sequence database of eukaryotic proteomes and BLAST searches
Eukaryotic proteomes were downloaded from the NCBI genome database (https://www.ncbi.nlm.nih.gov/genome) as of February 2020. One proteome was kept for each species, corresponding to the reference genome or the representative genome designated by NCBI. Sequence redundancies were removed at the 100% sequence identity level for each proteome by CD‐HIT. 53 Proteomes were then combined and formatted to form the BLAST database. The database contains 2,568 eukaryotic proteomes and 49,102,568 sequences.
BLASTP searches were initiated for human ABCG5 and ABCG8 proteins. Potential orthologous proteins were identified by reciprocal best BLASTP hits between proteomes. Two proteins A and B in two different proteomes are called reciprocal best hits of each other if the BLASTP search using protein A as the query identifies protein B as the best‐scoring hit in the proteome that contains B, and the BLASTP search using protein B as the query identifies protein A as the best‐scoring hit in the proteome that contains A. Lineage‐specific BLASTP searches were then performed using selected ABCG5 and ABCG8 orthologs as queries in different lineages. They include ABCG5 and ABCG8 from C. gigas (the Lophotrochozoa lineage), Ixodes scapularis (the Arthropoda lineage), Toxocara canis (the Nematoda lineage), C. owczarzaki (the Filasterea lineage), Fonticula alba (the Fonticula lineage), Spizellomyces punctatus DAOM BR117 (the Fungi lineage), and Acytostelium subglobosum LB1 (the Amoebozoa lineage). Top hits of these BLAST searches were manually checked to verify their orthology to the query ABCG5 and ABCG8 proteins by strong e‐values (less than 1e−40, database size: 25,494,900,823 letters), presence of ABCG5/ABCG8‐signature residues, and reciprocal best BLAST hits between two organisms.
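The reciprocal best hit criterion defined above can be sketched in a few lines. The example below is a minimal illustration, assuming that hits have already been parsed into (query, subject, score) tuples for each search direction between two proteomes; the function names and the sequence identifiers are hypothetical, and ties are resolved by keeping the first hit encountered.

```python
def best_hits(hits):
    """Map each query to its best-scoring subject.

    hits: iterable of (query_id, subject_id, score) tuples from searching
    one proteome against another; a higher score means a better hit.
    """
    best = {}
    for query, subject, score in hits:
        if query not in best or score > best[query][1]:
            best[query] = (subject, score)
    return {query: subject for query, (subject, _) in best.items()}


def reciprocal_best_hits(a_vs_b, b_vs_a):
    """Return sorted (a, b) pairs that are each other's best-scoring hit."""
    forward = best_hits(a_vs_b)
    reverse = best_hits(b_vs_a)
    return sorted((a, b) for a, b in forward.items() if reverse.get(b) == a)


if __name__ == "__main__":
    # Hypothetical identifiers and scores, for illustration only.
    a_vs_b = [("a1", "b1", 300.0), ("a1", "b2", 250.0),
              ("a2", "b2", 280.0), ("a3", "b2", 150.0)]
    b_vs_a = [("b1", "a1", 310.0), ("b2", "a2", 275.0), ("b2", "a1", 240.0)]
    print(reciprocal_best_hits(a_vs_b, b_vs_a))
```

In this toy input, a3 has b2 as its best hit, but b2's best hit is a2, so the pair (a3, b2) is not reported.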
4.2. Multiple sequence alignment and phylogenetic inference of selected ABCG subfamily proteins
We selected one or two representative proteomes from each of the following lineages: vertebrates (Homo sapiens); Lophotrochozoa (C. gigas); Arthropoda (D. melanogaster and I. scapularis); Nematoda (C. elegans); Filasterea (C. owczarzaki); Fungi (S. cerevisiae); Amoebozoa (P. violaceum); and Viridiplantae (A. thaliana). ABCG members in these proteomes were manually identified based on BLASTP searches (e‐value <1e−20). For human and fruit fly (an arthropod), we kept all ABCG subfamily proteins. Like fruit fly, the arthropod I. scapularis experienced gene expansion of ABCG proteins; due to space limitations, we included only ABCG5 and ABCG8 of I. scapularis for alignment and tree reconstruction. For each of the other organisms, we used CD‐HIT to cluster ABCG subfamily proteins at the 40% identity level and selected the longest sequence from each cluster as its representative. Full transporters among the representative sequences were split into N‐terminal NBD‐TMD units and C‐terminal NBD‐TMD units. MAFFT (version 7.475) 54 was used to construct a multiple sequence alignment of the half transporters and the NBD‐TMD units of full transporters (options: --localpair --maxiterate 1000). For phylogenetic tree reconstruction, we removed poorly aligned regions with high gap ratios (positions with a gap fraction over 50%) near the N‐ and C‐termini and submitted the alignment to the IQTree server (IQTree version: 1.6.12). 55 The best phylogenetic model (LG + F + R7: LG substitution model, 56 empirical amino‐acid frequencies counted from the alignment, FreeRate heterogeneity with seven rate categories) was identified by ModelFinder. 57 Branch support was assessed by the SH‐aLRT test. 58
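The terminal trimming step can be sketched in Python as follows. This is a minimal illustration under assumptions, not the procedure actually applied: it trims columns from both ends of the alignment until it reaches the first and last columns whose gap fraction does not exceed the threshold, and it leaves internal columns untouched. The function name and the toy alignment are hypothetical.

```python
def trim_gappy_termini(alignment, max_gap_fraction=0.5):
    """Trim leading and trailing alignment columns with too many gaps.

    alignment: list of equal-length aligned sequences, with '-' as the gap.
    Columns are removed from each end until a column whose gap fraction is
    at most max_gap_fraction is reached. Internal columns are always kept.
    """
    if not alignment:
        return []
    length = len(alignment[0])
    if any(len(row) != length for row in alignment):
        raise ValueError("all aligned sequences must have the same length")

    n_rows = len(alignment)
    # Indices of columns that pass the gap-fraction threshold.
    keep = [
        i for i, column in enumerate(zip(*alignment))
        if column.count("-") / n_rows <= max_gap_fraction
    ]
    if not keep:
        return ["" for _ in alignment]
    start, end = keep[0], keep[-1] + 1
    return [row[start:end] for row in alignment]


# Toy alignment (hypothetical): the first two and the last column are gappy.
toy = ["--MKTLLA-",
       "-AMKT-LAG",
       "--MKTLLA-",
       "---KTLLA-"]
trimmed = trim_gappy_termini(toy)
```

For the toy alignment, the first two columns and the final column each have a gap fraction above 50% and are removed, while the internal column with a single gap is retained.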
CONFLICT OF INTEREST
None declared.
AUTHOR CONTRIBUTIONS
Jimin Pei: Conceptualization (equal); formal analysis (equal); writing – original draft (equal). Qian Cong: Conceptualization (equal); writing – review and editing (equal).
Supporting information
Figure S1 Walker A motifs and C‐loops of ABCG5 and ABCG8 in various groups of vertebrates. Conserved residues in Walker A motif (GxxGxGK[ST]) and C‐loop (SGG[EQ]) are colored red. Changes in these conserved residues are underlined and in cyan background.
ACKNOWLEDGMENTS
We would like to thank Jonathan Cohen, Helen Hobbs, Nick Grishin, Lisa Kinch, and Xiaochen Bai for helpful discussions.
Pei J, Cong Q. Evolutionary origin and sequence signatures of the heterodimeric ABCG5/ABCG8 transporter. Protein Science. 2022;31(5):e4297. 10.1002/pro.4297
Review Editor: Nir Ben‐Tal
Funding information: Southwestern Medical Foundation scholar
REFERENCES
- 1. Srikant S, Gaudet R. Mechanics and pharmacology of substrate selection and transport by eukaryotic ABC exporters. Nat Struct Mol Biol. 2019;26(9):792–801.
- 2. Dassa E, Bouige P. The ABC of ABCS: A phylogenetic and functional classification of ABC systems in living organisms. Res Microbiol. 2001;152(3–4):211–229.
- 3. Theodoulou FL, Kerr ID. ABC transporter research: Going strong 40 years on. Biochem Soc Trans. 2015;43(5):1033–1040.
- 4. Xiong J, Feng J, Yuan D, Zhou J, Miao W. Tracing the structural evolution of eukaryotic ATP binding cassette transporter superfamily. Sci Rep. 2015;5:16724.
- 5. Dermauw W, Van Leeuwen T. The ABC gene family in arthropods: Comparative genomics and role in insecticide transport and resistance. Insect Biochem Mol Biol. 2014;45:89–110.
- 6. Dean M, Annilo T. Evolution of the ATP‐binding cassette (ABC) transporter superfamily in vertebrates. Annu Rev Genomics Hum Genet. 2005;6:123–142.
- 7. Kerr ID. Sequence analysis of twin ATP binding cassette proteins involved in translational control, antibiotic resistance, and ribonuclease L inhibition. Biochem Biophys Res Commun. 2004;315(1):166–173.
- 8. Walker JE, Saraste M, Runswick MJ, Gay NJ. Distantly related sequences in the alpha‐ and beta‐subunits of ATP synthase, myosin, kinases and other ATP‐requiring enzymes and a common nucleotide binding fold. EMBO J. 1982;1(8):945–951.
- 9. Schmitt L, Tampe R. Structure and mechanism of ABC transporters. Curr Opin Struct Biol. 2002;12(6):754–760.
- 10. Orelle C, Dalmas O, Gros P, di Pietro A, Jault JM. The conserved glutamate residue adjacent to the Walker‐B motif is the catalytic base for ATP hydrolysis in the ATP‐binding cassette transporter BmrA. J Biol Chem. 2003;278(47):47002–47008.
- 11. Jones PM, George AM. The ABC transporter structure and mechanism: Perspectives on recent research. Cell Mol Life Sci. 2004;61(6):682–699.
- 12. He D, Fiz‐Palacios O, Fu CJ, Fehling J, Tsai CC, Baldauf SL. An alternative root for the eukaryote tree of life. Curr Biol. 2014;24(4):465–470.
- 13. Hopfner KP, Tainer JA. Rad50/SMC proteins and ABC transporters: Unifying concepts from high‐resolution structures. Curr Opin Struct Biol. 2003;13(2):249–255.
- 14. Zaitseva J, Jenewein S, Jumpertz T, Holland IB, Schmitt L. H662 is the linchpin of ATP hydrolysis in the nucleotide‐binding domain of the ABC transporter HlyB. EMBO J. 2005;24(11):1901–1910.
- 15. Thomas C, Aller SG, Beis K, et al. Structural and functional diversity calls for a new classification of ABC transporters. FEBS Lett. 2020;594(23):3767–3775.
- 16. Gaur M, Choudhury D, Prasad R. Complete inventory of ABC proteins in human pathogenic yeast, Candida albicans. J Mol Microbiol Biotechnol. 2005;9(1):3–15.
- 17. Kang J, Park J, Choi H, et al. Plant ABC transporters. Arabidopsis Book. 2011;9:e0153.
- 18. Dermauw W, Osborne E, Clark RM, Grbić M, Tirry L, van Leeuwen T. A burst of ABC genes in the genome of the polyphagous spider mite Tetranychus urticae. BMC Genomics. 2013;14:317.
- 19. Vasiliou V, Vasiliou K, Nebert DW. Human ATP‐binding cassette (ABC) transporter family. Hum Genomics. 2009;3(3):281–290.
- 20. Ikonen E. Mechanisms for cellular cholesterol transport: Defects and human disease. Physiol Rev. 2006;86(4):1237–1261.
- 21. Lee MH, Lu K, Hazard S, et al. Identification of a gene, ABCG5, important in the regulation of dietary cholesterol absorption. Nat Genet. 2001;27(1):79–83.
- 22. Graf GA, Yu L, Li WP, et al. ABCG5 and ABCG8 are obligate heterodimers for protein trafficking and biliary cholesterol excretion. J Biol Chem. 2003;278(48):48275–48282.
- 23. Berge KE, Tian H, Graf GA, et al. Accumulation of dietary cholesterol in sitosterolemia caused by mutations in adjacent ABC transporters. Science. 2000;290(5497):1771–1775.
- 24. Lu K, Lee MH, Hazard S, et al. Two genes that map to the STSL locus cause sitosterolemia: Genomic structure and spectrum of mutations involving sterolin‐1 and sterolin‐2, encoded by ABCG5 and ABCG8, respectively. Am J Hum Genet. 2001;69(2):278–290.
- 25. Lee JY, Kinch LN, Borek DM, et al. Crystal structure of the human sterol transporter ABCG5/ABCG8. Nature. 2016;533(7604):561–564.
- 26. Altschul SF, Madden TL, Schäffer AA, et al. Gapped BLAST and PSI‐BLAST: A new generation of protein database search programs. Nucleic Acids Res. 1997;25(17):3389–3402.
- 27. Wang J, Grishin N, Kinch L, Cohen JC, Hobbs HH, Xie XS. Sequences in the nonconsensus nucleotide‐binding domain of ABCG5/ABCG8 required for sterol transport. J Biol Chem. 2011;286(9):7308–7314.
- 28. Zhang DW, Graf GA, Gerard RD, Cohen JC, Hobbs HH. Functional asymmetry of nucleotide‐binding domains in ABCG5 and ABCG8. J Biol Chem. 2006;281(7):4507–4516.
- 29. Tietge UJ, Maugeais C, Lund‐Katz S, Grass D, de Beer FC, Rader DJ. Human secretory phospholipase A2 mediates decreased plasma levels of HDL cholesterol and apoA‐I in response to inflammation in human apoA‐I transgenic mice. Arterioscler Thromb Vasc Biol. 2002;22(7):1213–1218.
- 30. Winkler E, Chovers M, Almog S, et al. Decreased serum cholesterol level after snake bite (Vipera palaestinae) as a marker of severity of envenomation. J Lab Clin Med. 1993;121(6):774–778.
- 31. Sidon EW, Youson JH. Morphological changes in the liver of the sea lamprey, Petromyzon marinus L., during metamorphosis: I. Atresia of the bile ducts. J Morphol. 1983;177(1):109–124.
- 32. Cai SY, Lionarons DA, Hagey L, Soroka CJ, Mennone A, Boyer JL. Adult sea lamprey tolerates biliary atresia by altering bile salt composition and renal excretion. Hepatology. 2013;57(6):2418–2426.
- 33. Philippe H, Lartillot N, Brinkmann H. Multigene analyses of bilaterian animals corroborate the monophyly of Ecdysozoa, Lophotrochozoa, and Protostomia. Mol Biol Evol. 2005;22(5):1246–1253.
- 34. Shalchian‐Tabrizi K, Minge MA, Espelund M, et al. Multigene phylogeny of choanozoa and the origin of animals. PLoS One. 2008;3(5):e2098.
- 35. Kovalchuk A, Driessen AJ. Phylogenetic analysis of fungal ABC transporters. BMC Genomics. 2010;11:177.
- 36. Burki F, Pawlowski J. Monophyly of Rhizaria and multigene phylogeny of unicellular bikonts. Mol Biol Evol. 2006;23(10):1922–1930.
- 37. Ellis EC. Characterization of the first steryl ester transporter in biology, Yol075c, and its pleiotropic effects (PhD Dissertation). Champaign, IL: University of Illinois at Urbana‐Champaign, 2020.
- 38. Zhang L, Fang Y, Lu X, et al. Transcriptional response of zebrafish larvae exposed to lindane reveals two detoxification genes of ABC transporter family (abcg5 and abcg8). Comp Biochem Physiol C Toxicol Pharmacol. 2020;232:108755.
- 39. Mo W, Zhang JT. Human ABCG2: Structure, function, and its role in multidrug resistance. Int J Biochem Mol Biol. 2012;3(1):1–27.
- 40. Mackenzie SM, Brooker MR, Gill TR, Cox GB, Howells AJ, Ewart GD. Mutations in the white gene of Drosophila melanogaster affecting ABC transporters that determine eye colouration. Biochim Biophys Acta. 1999;1419(2):173–185.
- 41. Vaughan AM, Oram JF. ABCA1 and ABCG1 or ABCG4 act sequentially to remove cellular cholesterol and generate cholesterol‐rich HDL. J Lipid Res. 2006;47(11):2433–2443.
- 42. Yamanaka N, Marques G, O'Connor MB. Vesicle‐mediated steroid hormone secretion in Drosophila melanogaster. Cell. 2015;163(4):907–919.
- 43. Hock T, Cottrill T, Keegan J, Garza D. The E23 early gene of Drosophila encodes an ecdysone‐inducible ATP‐binding cassette transporter capable of repressing ecdysone‐mediated gene activation. Proc Natl Acad Sci USA. 2000;97(17):9519–9524.
- 44. Crouzet J, Trombik T, Fraysse ÅS, Boutry M. Organization and function of the plant pleiotropic drug resistance ABC transporter family. FEBS Lett. 2006;580(4):1123–1130.
- 45. Sipos G, Kuchler K. Fungal ATP‐binding cassette (ABC) transporters in drug resistance & detoxification. Curr Drug Targets. 2006;7(4):471–481.
- 46. Nuruzzaman M, Zhang R, Cao HZ, Luo ZY. Plant pleiotropic drug resistance transporters: Transport mechanism, gene expression, and function. J Integr Plant Biol. 2014;56(8):729–740.
- 47. Sun Y, Wang J, Long T, et al. Molecular basis of cholesterol efflux via ABCG subfamily transporters. Proc Natl Acad Sci USA. 2021;118(34):e2110483118.
- 48. Yu Q, Ni D, Kowal J, et al. Structures of ABCG2 under turnover conditions reveal a key step in the drug transport mechanism. Nat Commun. 2021;12(1):4376.
- 49. Kowal J, Ni D, Jackson SM, Manolaridis I, Stahlberg H, Locher KP. Structural basis of drug recognition by the multidrug transporter ABCG2. J Mol Biol. 2021;433(13):166980.
- 50. Manolaridis I, Jackson SM, Taylor NMI, Kowal J, Stahlberg H, Locher KP. Cryo‐EM structures of a human ABCG2 mutant trapped in ATP‐bound and substrate‐bound states. Nature. 2018;563(7731):426–430.
- 51. Graf GA, Cohen JC, Hobbs HH. Missense mutations in ABCG5 and ABCG8 disrupt heterodimerization and trafficking. J Biol Chem. 2004;279(23):24881–24888.
- 52. Hubacek JA, Berge KE, Cohen JC, Hobbs HH. Mutations in ATP‐cassette binding proteins G5 (ABCG5) and G8 (ABCG8) causing sitosterolemia. Hum Mutat. 2001;18(4):359–360.
- 53. Fu L, Niu B, Zhu Z, Wu S, Li W. CD‐HIT: Accelerated for clustering the next‐generation sequencing data. Bioinformatics. 2012;28(23):3150–3152.
- 54. Katoh K, Standley DM. MAFFT multiple sequence alignment software version 7: Improvements in performance and usability. Mol Biol Evol. 2013;30(4):772–780.
- 55. Nguyen LT, Schmidt HA, von Haeseler A, Minh BQ. IQ‐TREE: A fast and effective stochastic algorithm for estimating maximum‐likelihood phylogenies. Mol Biol Evol. 2015;32(1):268–274.
- 56. Le SQ, Gascuel O. An improved general amino acid replacement matrix. Mol Biol Evol. 2008;25(7):1307–1320.
- 57. Kalyaanamoorthy S, Minh BQ, Wong TKF, von Haeseler A, Jermiin LS. ModelFinder: Fast model selection for accurate phylogenetic estimates. Nat Methods. 2017;14(6):587–589.
- 58. Guindon S, Dufayard JF, Lefort V, Anisimova M, Hordijk W, Gascuel O. New algorithms and methods to estimate maximum‐likelihood phylogenies: Assessing the performance of PhyML 3.0. Syst Biol. 2010;59(3):307–321.