Abstract
Laforin catalyzes glycogen dephosphorylation. Mutations in its gene result in Lafora disease, a fatal progressive myoclonus epilepsy, the hallmark being water-insoluble, hyperphosphorylated carbohydrate inclusions called Lafora bodies. Human laforin consists of an N-terminal carbohydrate-binding module (CBM) from family CBM20 and a C-terminal dual specificity phosphatase domain. Laforin is conserved in all vertebrates, some basal metazoans and a small group of protozoans. The present in silico study defines the evolutionary relationships among the CBM20s of laforin with an emphasis on newly identified laforin orthologues. The study reveals putative laforin orthologues in Trichinella, a parasitic nematode, and identifies two sequence inserts in the CBM20 of laforin from parasitic coccidia. Finally, we identify that the putative laforin orthologues from some protozoa and algae possess more than one CBM20.
Keywords: laforin, Lafora disease, carbohydrate-binding module, family CBM20, domain arrangement, evolutionary relatedness
Introduction
Starch-binding domains (SBDs) as independent sequence-structural modules help to increase the binding and degradation of raw starch by amylolytic enzymes and are therefore important for their potential application in biotechnologies [1]. They play their main role in approximately 10% of various starch hydrolases and related enzymes of microbial origin [2–4]. It is of interest that homologues have been revealed in different plant and animal non-amylolytic enzymes and in these proteins the SBDs allow the proteins to bind α-glucans, such as starch and glycogen [1,5,6].
In the Carbohydrate Active enZymes (CAZy) database ([7]; http://www.cazy.org/), the SBDs have been classified among the families of carbohydrate-binding modules (CBMs). Currently, there are 83 different CBM families with 15 considered SBDs due to their ability to bind starch and related α-glucans, such as glycogen, cyclodextrins and various maltooligosaccharides – these are CBM20, CBM21, CBM25, CBM26, CBM34, CBM41, CBM45, CBM48, CBM53, CBM58, CBM68, CBM69, CBM74, CBM82 and CBM83 [7–12]. The CBM74 members are about 250 residues long and this family obviously represents a novel type of an SBD [13]. All remaining individual CBM families of SBDs share a β-barrel fold of a β-sandwich although their sequences are, in general, poorly conserved [8]. Although there are no clans of CBMs in CAZy [7], several related SBDs are classified in different CBM families, e.g. CBM20, CBM48 and CBM69 [1,4,6,14–18]. They also exhibit a striking sequence-structural similarity suggesting a conserved mode of carbohydrate binding and often involve corresponding amino acid residues for carbohydrate binding.
Of all SBD/CBM families, the CBM20 family was the first established SBD family and it has also been extensively studied [1,6,19]. Among starch hydrolases and related enzymes classified in CAZy as glycoside hydrolase (GH) families, the CBM20 family is found associated with several activities. There are CBM20 domains within the α-amylase family GH13 (α-amylases, maltooligosaccharide-producing amylases and cyclodextrin glucanotransferases), GH14 β-amylases and GH15 glucoamylases, 4-α-glucanotransferases (DPE2) from GH77 and even in families GH31 (6-α-glucosyltransferase) and GH57 (putative amylopullulanase) [1]. Several CBM20 consensus residues were identified by sequence alignment [20], most of which have later been confirmed experimentally as crucial for activity (for reviews, see [1,6]). In general, the SBDs from the CBM20 family consist of approximately 100 amino acid residues; its secondary structure being formed by two main antiparallel β-sheets that adopt a β-sandwich fold with two binding sites [21–23].
In addition to typical microbial amylases, the unambiguous sequence features of SBDs from the CBM20 family have been revealed in non-amylolytic enzymes and proteins mostly of animal origin, such as the glucan phosphatase laforin [24] and the glycogen binding protein genethonin-1 [25]. The CBM20 is positioned C-terminally in most amylolytic enzymes and in the glycogen-associated protein genethonin-1 (also known as SBD1). Conversely, the laforin CBM20 is located at the N-terminus [26–28].
The present study is focused on laforin and, in particular, on the evolution of its CBM20. Human laforin is a bimodular protein consisting of a CBM20 followed by a dual specificity phosphatase (DSP) catalytic domain [29]. Laforin is encoded by the EPM2A gene. Mutations in EPM2A cause a progressive myoclonus epilepsy [30,31] called Lafora disease (LD) [32], which is a fatal, autosomal recessive, neurodegenerative form of epilepsy and a glycogen storage disease [33]. Laforin dephosphorylates glycogen in vitro and in vivo [34–37]. Additionally, laforin has been proposed to perform other functions via direct and/or indirect interactions with binding partners [29]. It has been shown in vitro that laforin forms a complex with the E3 ubiquitin ligase malin and assists with malin-directed ubiquitination of proteins involved in glycogen metabolism [29]. In the absence of the main enzymatic function of laforin, i.e. the α-glucan monophosphate hydrolysing activity [29,34,35], the glucose chains within glycogen become longer, hyperphosphorylated and water-insoluble, forming what is called a Lafora body (LB). Thus, laforin is often thought to prevent the formation of LBs by dephosphorylating glycogen. Work from the Guinovart lab has demonstrated that the LBs are responsible for the onset and development of LD [38–41]. A recent study suggests that it is not glycogen phosphorylation but rather the abnormal chain length of glycogen that is important for LB formation; however, this area of research is currently under debate and investigation [42]. In any case, the CBM20 is indispensable for laforin’s function since it targets the enzyme to its substrate, i.e. glycogen [43,44].
The recently determined three-dimensional structure of human laforin [45] has revealed a unique quaternary structure of two laforin molecules forming an antiparallel dimer and yielding tetramodular CBM-DSP-DSP-CBM structure [46]. Laforin was co-crystallized in the presence of maltohexaose, an α-linked malto-oligosaccharide consisting of six glucosyl moieties. The structure shows a complex with maltohexaose bound not only to catalytic domain but also to CBM20 and confirmed the crucial role of CBM20 binding residues – Trp32, Lys87 and Trp99 [41]. These residues correspond to conserved CBM20 residues responsible for α-glucan binding in the SBD counterparts from amylolytic enzymes [1,6,21–23].
Previous work demonstrated that laforin is conserved in all vertebrates and that it has an ancient and unique evolutionary lineage [26,29]. Its evolution is of interest due to the following two observations: (i) genomes of most non-vertebrate organisms, such as yeast, fly and worm, i.e. eukaryotic model organisms, lack the gene coding for laforin; whereas (ii) laforin has been identified in two basal metazoans (Branchiostoma floridae and Nematostella vectensis) and even five protozoans (Cyanidioschyzon merolae, Toxoplasma gondii, Eimeria tenella, Tetrahymena thermophila and Paramecium tetraurelia) [47–49]. In order to define the evolutionary history of laforin, we performed an in silico study focused especially on its non-catalytic CBM20 domain. Searching recently sequenced genomes enabled us to identify new putative laforin orthologues in additional taxa and to describe the unique domain arrangement and evolution of their CBM20s.
Materials and methods
Selection of sequences
In the first step, the sequence of human laforin [30] was used as the query in a protein BLAST search ([50]; http://blast.ncbi.nlm.nih.gov/) yielding: (i) four representative laforin orthologues from various vertebrates; (ii) ten putative orthologues from the phyla Cephalochordata, Cnidaria, Chromerida and Nematoda; and (iii) five protist laforin orthologues from Cyanidioschyzon merolae, Toxoplasma gondii, Eimeria tenella, Tetrahymena thermophila and Paramecium tetraurelia recognized previously [47]. Based on BLAST searches with each individual laforin sequence, these twenty sequences (including the human query) were supplemented with twelve additional putative laforin orthologues from different protozoans (Table 1). Only sequences containing both the catalytic (DSP) and binding (CBM20) domains, with at least one full-length CBM20, were included in the analysis. All sequences were retrieved from the GenBank [51] and UniProt [52] databases.
Table 1.
The list of laforin sequences used in the present study.a
No. | Taxonomy | Organism | GenBank | UniProt | Source | Length | Copy 1 | Copy 2 | Copy 3 | DSP |
---|---|---|---|---|---|---|---|---|---|---|
1 | Vertebrata | Homo sapiens | AAG18377.1 | O95278 | CAZy database | 331 | 1–124 | | | 138–322 |
2 | Vertebrata | Gallus gallus | AAR21595.1 | Q5ZL46 | BLAST-O95278 | 319 | 1–111 | | | 125–309 |
3 | Vertebrata | Tetraodon nigroviridis | CAG03589.1 | Q4S6Z3 | BLAST-O95278 | 312 | 1–106 | | | 120–305 |
4 | Vertebrata | Alligator mississippiensis | KYO37738.1 | A0A151NLS3 | BLAST-O95278 | 305 | 1–99 | | | 113–297 |
5 | Vertebrata | Xenopus laevis | AAH73202.1 | Q6GPD8 | BLAST-O95278 | 313 | 1–106 | | | 120–304 |
6 | Cephalochordata | Branchiostoma floridae | EEN49019.1 | C3ZE63 | BLAST-O95278 | 316 | 1–117 | | | 131–316 |
7 | Cnidaria | Exaiptasia pallida | KXJ09946.1 | --- | BLAST-O95278 | 324 | 1–118 | | | 132–313 |
8 | Cnidaria | Nematostella vectensis | EDO32135.1 | A7SVW9 | BLAST-O95278 | 324 | 1–118 | | | 132–313 |
9 | Chromerida | Vitrella brassicaformis | CEL97703.1 | A0A0G4ELK3 | BLAST-O95278 | 358 | 1–139 | | | 152–337 |
10 | Nematoda | Trichinella britovi | KRY46872.1 | A0A0V1CCK1 | BLAST-O95278 | 361 | 1–104 | | | 118–299 |
11 | Nematoda | Trichinella murrelli | KRX36920.1 | A0A0V0TD31 | BLAST-O95278 | 360 | 1–104 | | | 118–299 |
12 | Nematoda | Trichinella papuae | KRZ74431.1 | A0A0V1MS55 | BLAST-O95278 | 312 | 1–104 | | | 118–299 |
13 | Nematoda | Trichinella pseudospiralis | KRY67691.1 | A0A0V1E1K1 | BLAST-O95278 | 314 | 1–104 | | | 118–301 |
14 | Nematoda | Trichinella sp. T8 | KRZ83465.1 | A0A0V1NHG2 | BLAST-O95278 | 361 | 1–104 | | | 118–299 |
15 | Nematoda | Trichinella spiralis | EFV53640.1 | E5SNJ4 | BLAST-O95278 | 347 | 1–104 | | | 118–288 |
16 | Rhodophyta | Cyanidioschyzon merolae | BAM83396.1 | M1UXX5 | Ref. [47] | 532 | 165–265 | 270–370 | | 380–525 |
17 | Rhodophyta | Chondrus crispus | CDF36183.1 | R7QEI4 | BLAST-M1UXX5 | 549 | 1–100 | 176–278 | 285–387 | 401–545 |
18 | Cryptophyta | Guillardia theta | EKX31889.1 | L1I7M6 | BLAST-M1UXX5 | 611 | 97–182 | 209–294 | 317–421 | 455–607 |
19 | Ciliophora | Paramecium tetraurelia | CAK57701.1 | A0BGN4 | Ref. [47] | 726 | 257–356 | 365–460 | | 538–685 |
20 | Ciliophora | Tetrahymena thermophila | EAR89845.2 | Q22X01 | Ref. [47] | 480 | 58–162 | 163–267 | | 270–420 |
21 | Ciliophora | Oxytricha trifallax | EJY86497.1 | J9IW88 | BLAST-M1UXX5 | 474 | 49–177 | 182–283 | | 320–470 |
22 | Ciliophora | Stylonychia lemnae | CDW83192.1 | A0A078APE9 | BLAST-M1UXX5 | 545 | 200–250 | 255–357 | | 399–545 |
23 | Apicomplexa | Eimeria tenella | CDJ40554.1 | U6KW34 | Ref. [47] | 459 | 1–254 | | | 269–450 |
24 | Apicomplexa | Toxoplasma gondii | EPT29986.1 | S8F2K7 | Ref. [47] | 523 | 1–312 | | | 327–508 |
25 | Apicomplexa | Eimeria acervulina | CDI81398.1 | U6GQ52 | BLAST-U6KW34 | 1435 | 856–1108 | | | 1115–1233 |
26 | Apicomplexa | Eimeria brunetti | CDJ47235.1 | U6LHG3 | BLAST-S8F2K7 | 1418 | 965–1218 | | | 1233–1414 |
27 | Apicomplexa | Eimeria maxima | CDJ57191.1 | U6LZL6 | BLAST-S8F2K7 | 1392 | 924–1177 | | | 1206–1387 |
28 | Apicomplexa | Eimeria mitis | CDJ36110.1 | U6KDK1 | BLAST-S8F2K7 | 740 | 288–541 | | | 556–737 |
29 | Apicomplexa | Eimeria necatrix | CDJ62429.1 | U6MIY3 | BLAST-S8F2K7 | 459 | 1–254 | | | 269–450 |
30 | Apicomplexa | Eimeria praecox | CDI86477.1 | U6H3K9 | BLAST-S8F2K7 | 1365 | 913–1165 | | | 1180–1361 |
31 | Apicomplexa | Hammondia hammondi | KEP61982.1 | A0A074TJ74 | BLAST-S8F2K7 | 523 | 1–312 | | | 327–508 |
32 | Apicomplexa | Neospora caninum | CBZ52253.1 | F0VEV9 | BLAST-S8F2K7 | 524 | 1–312 | | | 327–508 |
The table contains sequences of both experimentally characterized and putative laforin orthologues from several groups: (i) representatives of vertebrates (mammals, birds, fishes, reptiles and amphibians; all in green); (ii) non-vertebrates represented by the phyla Cephalochordata (walnut), Cnidaria (cyan), Chromerida (orange) and Nematoda (magenta); (iii) red and cryptomonad algae (blue); and (iv) protozoans represented by the phyla Ciliophora (gold) and Apicomplexa (red). GenBank [51] and UniProt [52] give the accession numbers of the sequences in the two databases (if available). Source indicates how the sequence was found; BLAST followed by a UniProt accession number means that the sequence was found in a BLAST search using that UniProt sequence as the query. Length is the number of amino acid residues in the protein. The three “Copy” columns show the boundaries of the CBM20s, which may be present in more than one copy within a single laforin sequence. The last column, “DSP”, indicates the boundaries of the catalytic DSP domain.
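The domain boundaries in Table 1 lend themselves to simple arithmetic. The sketch below is an illustration only and was not part of the analysis: it stores three rows of Table 1 as inclusive residue ranges and derives the number of CBM20 copies, the length of the linker between the last CBM20 and the DSP, and the length of the tail following the DSP. The class and method names are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Tuple

Range = Tuple[int, int]  # inclusive, 1-based residue boundaries


@dataclass
class Laforin:
    """One row of Table 1 (names of fields are illustrative assumptions)."""
    organism: str
    length: int
    cbm20: List[Range]  # CBM20 copies ordered from N- to C-terminus
    dsp: Range

    def n_copies(self) -> int:
        return len(self.cbm20)

    def linker_length(self) -> int:
        # Residues between the end of the last CBM20 and the start of the DSP.
        return self.dsp[0] - self.cbm20[-1][1] - 1

    def c_terminal_tail(self) -> int:
        # Residues following the DSP domain.
        return self.length - self.dsp[1]


# Boundaries copied from Table 1.
rows = [
    Laforin("Homo sapiens", 331, [(1, 124)], (138, 322)),
    Laforin("Tetrahymena thermophila", 480, [(58, 162), (163, 267)], (270, 420)),
    Laforin("Chondrus crispus", 549, [(1, 100), (176, 278), (285, 387)], (401, 545)),
]

for row in rows:
    print(row.organism, row.n_copies(), row.linker_length(), row.c_terminal_tail())
```

Extending `rows` to all 32 entries would allow the total number of CBM20 copies to be recomputed directly from the table.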
To confirm the domain arrangement seen in human laforin [45], a verifying BLAST search was performed using laforin’s CBM20 sequence [30] as the query against the relevant taxa, such as Metazoa, Cnidaria, Chromerida, Nematoda, Algae, Ciliophora and Apicomplexa.
Comparison of CBM20 sequences
The sequence alignment of all CBM20s was performed using Clustal-X [53]. The computer-produced alignment was manually adjusted with regard to the knowledge of: (i) positions of functionally important residues responsible for carbohydrate binding in the family CBM20 [1,20]; (ii) the previous alignment of five protozoan laforin orthologues [47]; (iii) the tertiary structure of laforin from Homo sapiens solved as a complex with maltohexaose [45]; (iv) mutations affecting the function of CBM20 in laforin [45,54]; and (v) structural features of full-length CBM20s from putative laforin orthologues obtained with the homology modelling server Phyre2 ([55]; http://www.sbg.bio.ic.ac.uk/phyre2/). The alignment was finalized by also considering the Phyre2 [55] modelled structures of the two inserts within the CBM20s from the parasitic coccidia Toxoplasma gondii (Ala75-Leu198 and Gly215-Pro272) and Eimeria tenella (Gly76-Leu154 and Pro171-Asn214).
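The lengths of the inserts follow directly from the residue spans quoted above. As a minimal sketch (the helper name `span_length` is an assumption, not a tool used in the study), a span written as `Ala75-Leu198` can be parsed and measured as follows; the function also rejects spans whose end precedes their start.

```python
import re


def span_length(span: str) -> int:
    """Length in residues of an inclusive span such as 'Ala75-Leu198'."""
    match = re.fullmatch(r"[A-Z][a-z]{2}(\d+)-[A-Z][a-z]{2}(\d+)", span)
    if match is None:
        raise ValueError(f"unrecognized span: {span!r}")
    start, end = int(match.group(1)), int(match.group(2))
    if end < start:
        raise ValueError(f"span ends before it starts: {span!r}")
    return end - start + 1


# Spans quoted in the text for Toxoplasma gondii and the second
# Eimeria tenella insert.
for span in ("Ala75-Leu198", "Gly215-Pro272", "Pro171-Asn214"):
    print(span, span_length(span))
```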
Boundaries for all individual recognized CBM20s were defined according to previous bioinformatics and structural studies [1,4,45,47], BLAST searches [50], homology modelling trials using the Phyre2 server [55] and information available in the Pfam database [56].
An alignment of DSPs was also performed using Clustal-X [53]; the computer output being used for subsequent evolutionary analysis without further manual adjustment.
Evolutionary analysis
The evolutionary trees of all 41 identified CBM20 copies present in 32 collected laforin sequences as well as of their DSP domains (Table 1) were calculated from the two final alignments including all gaps as a Phylip-tree type using the neighbour-joining clustering [57] and the bootstrapping procedure with 1,000 bootstrap trials [58] implemented in the Clustal-X package [53]. Both trees were displayed with the program iTOL ([59]; http://itol.embl.de/).
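For readers unfamiliar with neighbour-joining, the clustering step can be sketched in a few lines. The code below is a generic textbook formulation, not the Clustal-X implementation used here, and it omits bootstrapping; the four-taxon distance matrix is a made-up additive example (taxa A to D), not laforin data.

```python
from itertools import combinations


def neighbor_joining(names, pair_dist):
    """Return (joins, remaining_nodes); each join is (a, b, len_a, len_b)."""
    # Build a symmetric distance lookup from the pairwise input.
    d = {a: {} for a in names}
    for (a, b), value in pair_dist.items():
        d[a][b] = d[b][a] = value
    nodes = list(names)
    joins = []
    while len(nodes) > 2:
        n = len(nodes)
        # Net divergence of each node from all others.
        r = {a: sum(d[a][b] for b in nodes if b != a) for a in nodes}
        # Pick the pair minimizing the Q criterion (first pair wins ties).
        a, b = min(
            combinations(nodes, 2),
            key=lambda p: (n - 2) * d[p[0]][p[1]] - r[p[0]] - r[p[1]],
        )
        len_a = 0.5 * d[a][b] + (r[a] - r[b]) / (2 * (n - 2))
        len_b = d[a][b] - len_a
        new = f"({a},{b})"
        d[new] = {}
        # Distances from the new internal node to every remaining node.
        for c in nodes:
            if c not in (a, b):
                d[new][c] = d[c][new] = 0.5 * (d[a][c] + d[b][c] - d[a][b])
        nodes = [c for c in nodes if c not in (a, b)] + [new]
        joins.append((a, b, len_a, len_b))
    return joins, nodes


# Hypothetical additive distances for the tree ((A:1,B:2):3,(C:4,D:5)).
toy = {("A", "B"): 3, ("A", "C"): 8, ("A", "D"): 9,
       ("B", "C"): 9, ("B", "D"): 10, ("C", "D"): 9}
joins, remaining = neighbor_joining(["A", "B", "C", "D"], toy)
print(joins, remaining)
```

In a bootstrap analysis, the alignment columns would be resampled with replacement, the distances recomputed, and this clustering repeated for each replicate.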
Results and discussion
Identification of novel laforin orthologues
This study presents a detailed in silico analysis of laforin focused on the unique evolution of its CBM20 domain, which in some cases was found to be present in two or even three copies.
It is of interest that laforin was first identified in vertebrates and immediately linked to LD [30,31]. Subsequent work identified a laforin orthologue in Toxoplasma gondii and hypothesized that laforin was present in the genome due to the organism’s accumulation of amylopectin [47,60,61]. In any case, laforin is absent in the genomes of bacteria and Archaea. In addition to Toxoplasma gondii, laforin has been found in four other protozoans, i.e. Cyanidioschyzon merolae, Eimeria tenella, Tetrahymena thermophila and Paramecium tetraurelia [47], and later even in two basal metazoan non-vertebrates, Branchiostoma floridae and Nematostella vectensis [48]. The five protists mentioned above produce an insoluble glucan as a source of energy during hibernation, which they undergo at some stage of their life cycle [60,61]. Since the glucans are proposed to be biochemically similar to LBs, it was hypothesized [47] that protists can use their laforin for converting the insoluble glucans into energy, whereas the main role of laforin in vertebrates is rather to prevent the accumulation of the insoluble glucan in the form of LBs. Based on these assumptions, laforin was correctly predicted to occur also in other protozoans, such as Neospora caninum and Guillardia theta [47].
The present study extends the spectrum of organisms possessing the laforin gene in their genomes by identifying putative laforin orthologues in two additional protozoans – Oxytricha trifallax and Stylonychia lemnae, one red alga – Chondrus crispus, and seven coccidians – six representatives of the genus Eimeria and Hammondia hammondi (Table 1). Laforin from the red alga Cyanidioschyzon merolae [62] was considered as one of the most ancient sequences [47], but, remarkably, we identified a putative laforin in the genome of an additional evolutionarily ancient alga – the flagellate Guillardia theta [63]. Notably, the Guillardia theta laforin possesses three CBM20 copies at its N-terminus (Table 1). It was previously proposed that typical non-vertebrate organisms, such as insects and worms as well as yeast, lack laforin [47], but we have identified putative laforin orthologues in six representatives of parasitic nematodes from the genus Trichinella (Table 1).
It should be noted that all laforin orthologues to date contain a CBM20 followed by a DSP domain [26,29–31,45]. The plant proteome includes the Starch EXcess4 (SEX4) protein that contains a DSP domain followed by a CBM48, similar domains to laforin but in the opposite orientation [15,64]. In order to determine whether laforin orthologues with a different domain arrangement than that of human laforin, i.e. a DSP domain followed by a C-terminal CBM20, could be detected, we performed BLAST searches using only the laforin CBM20 sequence as query against Metazoa, Cnidaria, Chromerida, Nematoda, Algae, Ciliophora and Apicomplexa. We did not identify any proteins containing a DSP domain followed by CBM20. Additionally, we previously analyzed the CBM20 and CBM48 families, including those from SEX4 and laforin, and found that the laforin CBM20s were located in the CBM20 branch and all laforin CBM20s clustered together [1]. Thus, all putative laforin orthologues should have the same domain arrangement as that found in the human laforin with the CBM20 positioned N-terminally.
Domain arrangement
The arrangement of the two fundamental domains of laforin, i.e. the N-terminal CBM20 binding domain and the C-terminal DSP catalytic domain, can be best illustrated by human laforin (Fig. 1). The laforin CBM20 is always positioned N-terminal to the DSP catalytic domain in all identified laforin orthologues (Fig. 1 and Table 1). In some laforin orthologues, such as those from Tetrahymena thermophila and Toxoplasma gondii, however, the DSP domain does not extend exactly to the C-terminus of the protein (Fig. 1). Our analysis of the laforin CBM20 domains revealed three remarkable facts: (i) in the group of protozoans covering ciliophores, red algae and a flagellate, the CBM20 was found in two or even in three copies; (ii) the N-terminal portion of protozoan laforin orthologues includes up to 257 additional amino acids that may include currently unrecognized domains; and (iii) in the group of parasitic coccidia, represented by Toxoplasma gondii, the CBM20 was found to consist of three segments due to the presence of two amino acid inserts (Fig. 1). Conversely, the putative laforin orthologues from the newly recognized group of parasitic nematodes (Fig. 1) represented by Trichinella spiralis [65] share the basic two-domain arrangement seen in the human laforin and its vertebrate orthologues [47–49] (Fig. 1).
Fig. 1.
Domain arrangement of human laforin and its representative orthologues. Homo sapiens, Trichinella spiralis and Toxoplasma gondii represent vertebrates, parasitic nematodes and parasitic coccidia, respectively. The non-catalytic CBM20 is in green, whereas the catalytic DSP that follows it is in yellow. The individual CBM20s are numbered from the CBM20 that directly precedes the catalytic DSP (without a number) toward the N-terminus. The CBM20 of laforin from the group of parasitic coccidia, represented by Toxoplasma gondii, contains two amino acid inserts.
The laforin orthologues from the red alga Cyanidioschyzon merolae and the protozoans Paramecium tetraurelia and Stylonychia lemnae contain two CBM20s. Additionally, these three orthologues contain longer N-terminal amino acid regions preceding the recognized CBM20s, where an additional domain could be present (Fig. 1). Although no putative conserved domain has currently been identified in these regions, they could contain as yet undefined domains. Furthermore, the first CBM20 copy of the putative laforin from Stylonychia lemnae [66] is very short (~50 residues) and most probably has lost its carbohydrate binding function, while the second CBM20 in S. lemnae is of the expected size (Fig. 1).
Remarkably, in the group of parasitic coccidia represented by the Toxoplasma gondii laforin [47], the CBM20 is interrupted by two inserts (Fig. 1). The inserts could be a consequence of duplications within this domain or a transcribed intron that had not been spliced during evolution. Further investigation is necessary to define how this CBM20 folds to perform its binding function.
Sequence comparison
The amino acid sequence alignment of all 41 CBM20s originating from the 32 collected laforin orthologues – both experimentally confirmed and putative ones (Table 1) – is shown in Figure 2. It is of note that laforin orthologues from Cyanidioschyzon merolae, Paramecium tetraurelia, Tetrahymena thermophila, Oxytricha trifallax and Stylonychia lemnae contain two copies of CBM20, whereas those from Chondrus crispus and Guillardia theta possess three.
Fig. 2.
Amino acid sequence alignment of CBM20s from human laforin and its taxonomically different orthologues. The CBM20 positions involved in starch binding sites 1 and 2 [1,22,45] are signified by numbers “1” and “2”, respectively, below the alignment. The individual residues are coloured as follows: Trp – yellow; Phe, Tyr – red; His – brown; Lys, Arg – cyan; Val, Leu, Ile – blue; Asp, Glu – green; Asn, Gln – dark yellow; Cys – magenta; Met – purple; Ala, Ser, Thr – gray; Gly, Pro – black. The green and red lanes above the alignment discriminate the borders of CBM20 segments (green) from the two inserts (red) present in sequences from parasitic coccidia. For details concerning the studied sequences and their colouring scheme, please see Table 1. The individual laforins are marked by the binomial name of the organism preceded by the accession number from the UniProt database (except for the cnidarian Exaiptasia pallida, which is preceded by the GenBank accession number). If there is more than one CBM20 copy in a single laforin sequence, the digit “1” and, where applicable, “2” follows the accession number from the database. The individual CBM20s are numbered from the CBM20 that directly precedes the catalytic DSP (without a number) toward the N-terminus.
The recently determined three-dimensional structure of human laforin in complex with maltohexaose [45] revealed the amino acids crucial for glucan binding, namely Trp32, Lys87 and Trp99 (Homo sapiens laforin numbering), and their functional roles were confirmed by site-directed mutagenesis [43,45,54]. These three residues correspond with Trp543, Lys578 and Trp590 (Aspergillus niger GH15 glucoamylase numbering) [22], forming the starch-binding site 1 of amylolytic enzymes [1]. Mutations of the three crucial residues decreased both the stability of the CBM20 and the glucan phosphatase activity of laforin [45]. Apart from a few exceptions, the above-mentioned residues are conserved in all CBM20s from the studied sample of laforin sequences, both experimentally confirmed laforins and putative orthologues (Fig. 2).
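Checking conservation at positions given in human laforin numbering amounts to mapping a residue number of the reference sequence onto its alignment column and reading that column in every other sequence. The sketch below illustrates this bookkeeping only; the function names are assumptions and the three aligned strings are invented toy sequences, not the laforin CBM20s of Figure 2.

```python
def column_of(aligned_ref: str, residue_number: int) -> int:
    """0-based alignment column of a 1-based residue in the gapped reference."""
    seen = 0
    for col, char in enumerate(aligned_ref):
        if char != "-":
            seen += 1
            if seen == residue_number:
                return col
    raise ValueError("residue number exceeds the reference length")


def residues_at(alignment: dict, ref: str, residue_number: int) -> dict:
    """Residue (or gap) found in each sequence at the reference position."""
    col = column_of(alignment[ref], residue_number)
    return {name: seq[col] for name, seq in alignment.items()}


# Hypothetical toy alignment; gaps are written as '-'.
toy_alignment = {
    "ref":    "MW-KA",
    "copy_a": "MWGKA",
    "copy_b": "-F-TA",
}

print(residues_at(toy_alignment, "ref", 2))  # reference residue 2 (W)
print(residues_at(toy_alignment, "ref", 3))  # reference residue 3 (K)
```

Applied to a real alignment, the same lookup for residues 32, 87 and 99 of the human sequence would list the substitutions discussed in the following paragraph.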
In the three copies of CBM20 from Guillardia theta laforin [63], the second tryptophan is replaced by an aromatic phenylalanine residue in the CBM20 that directly precedes the catalytic DSP, whereas it is replaced by a tyrosine or is absent in the two additional CBM20 copies (Fig. 2). Additionally, the most N-terminal CBM20 lacks the first tryptophan. The putative laforin from Stylonychia lemnae [66] is of special interest since its CBM20 preceding the catalytic DSP contains both tryptophans and the lysine, but the potential additional CBM20 seems to consist of only relics of an extant CBM20 (Fig. 2). The laforin from Tetrahymena thermophila [47] is another unique example possessing two CBM20 copies: the copy directly preceding the DSP has a tyrosine in the position corresponding with Trp32 of human laforin and the two remaining crucial residues, Lys87 and Trp99, are replaced by threonine and isoleucine, respectively, whereas the N-terminal CBM20 possesses all three invariantly conserved residues (Fig. 2). By contrast, in the putative laforin from Oxytricha trifallax [67], which also has two CBM20s, the first tryptophan (Trp32) is missing only in the N-terminal CBM20 copy, whereas the second CBM20 (preceding the DSP) contains all the residues necessary for its carbohydrate binding function (Fig. 2).
In the amino acid sequence alignment, we also identified the position of Trp60 (Homo sapiens laforin numbering), which corresponds with the starch binding site 2 of amylolytic enzymes [1], e.g. Trp563 in the CBM20 of Aspergillus niger GH15 glucoamylase [22]. Although no substantial binding role has been ascribed to this residue in laforin, it is worth noting that the tryptophan was replaced by a phenylalanine in four cases and by a non-aromatic residue in two cases (Fig. 2). In addition, a few more residues were aligned and found conserved, such as Phe5, Tyr86 and Phe88 (Homo sapiens laforin numbering), mutations of which also result in the onset of LD [45]. While both Tyr86 and Phe88 were conserved almost invariantly, the position of Phe5 was found to be variable in CBM20s of some representatives of the genus Eimeria and in the CBM20 copies preceding the DSP domain of Paramecium and Tetrahymena (Fig. 2). It is of interest that two glycines, Gly24 and Gly30 (Homo sapiens laforin numbering), were also well conserved. The former was conserved totally and the latter was absent only in the N-terminal and rather incomplete CBM20 copy of the laforin from Stylonychia lemnae, i.e. in the copy that may represent only a relic of a real CBM20 (Fig. 2).
The parasitic coccidia CBM20 sequences all contain unusual inserts (Fig. 2). Some of these coccidian putative laforin orthologues from the genus Eimeria [68] were exceptionally long, consisting of more than 1,000 residues, with both CBM20 and DSP located in the C-terminal part of the sequence. With regard to the two inserts, the first one is longer and located between the functionally important Trp60 and Lys87, whereas the second one is shorter and is positioned between Lys87 and Trp99 (Homo sapiens laforin numbering). Both CBM20 inserts of the members of the family Sarcocystidae – Toxoplasma gondii, Hammondia hammondi and Neospora caninum – are longer (~123 and ~57 residues) than those of the genus Eimeria (~80 and ~44 residues, based on the Eimeria tenella boundaries given in Materials and methods).
To investigate a potential role of the inserts, their structure was modelled using Phyre2 [55] allowing the server to choose the best templates. The template for the second insert was the structure of human laforin (PDB code: 4RKK; [45]), in particular the N-terminal part of its CBM20 – Pro11-Pro57. However, there was no counterpart tryptophan for Trp32 from human laforin in the second insert of Toxoplasma gondii and the model covered only a small part of the insert. We conclude that: (i) a part of CBM20 could have been duplicated during evolution in the CBM20 of Toxoplasma gondii laforin; or (ii) it could be an intron not spliced during evolution. Of all parasitic coccidian laforin orthologues studied here, the same should be applicable for CBM20 domains of putative laforin orthologues from both Hammondia hammondi and Neospora caninum containing the two inserts comparable with the Toxoplasma gondii laforin CBM20. Conversely, putative Eimeria laforin orthologues lack this insert (Fig. 2).
Evolutionary relationships of CBM20s from laforin orthologues
Previous in silico and experimental studies demonstrated that the laforin gene is found in the primitive red alga Cyanidioschyzon merolae and is present in all vertebrates, but is lacking in most non-vertebrates (e.g. yeast, fly and worm) [47–49]. Thus, the laforin gene was proposed to be an ancient gene that has undergone a unique evolutionary history. The overall evolutionary relationships of all CBM20s from the sample of laforin orthologues collected in the present study can be seen in the evolutionary tree (Fig. 3a). Four main groups or clusters can be recognized in the tree: (i) typical laforin orthologues from vertebrates including also some non-vertebrates from basal metazoans; (ii) putative laforin orthologues from parasitic nematodes; (iii) putative laforin orthologues from parasitic coccidia; and (iv) laforin orthologues from protozoans and algae having more than one CBM20 copy.
Fig. 3.
Evolutionary tree of (a) laforin CBM20s and (b) laforin DSP domains. The trees are based on the alignment of all laforin CBM20 and DSP sequences from Figure 2 and Table 1.
The first part of the evolutionary tree covers the CBM20 sequences of typical laforin orthologues from Homo sapiens as well as other representatives of vertebrates (birds, reptiles, fishes and amphibians). The clustering of these sequences reflects their high degree of mutual sequence similarity. This portion also contains the CBM20 of laforin from the lancelet Branchiostoma floridae, a chordate closely related to vertebrates [69]. More distantly related to vertebrates are the basal metazoan sea anemones (Exaiptasia pallida and Nematostella vectensis) and the alveolate Vitrella brassicaformis, whose putative laforin CBM20s also cluster in this region (Fig. 3a).
Interestingly, the CBM20 sequences from newly recognized putative laforin orthologues from parasitic nematodes from the genus Trichinella share the common branch with those from parasitic coccidians (Fig. 3a). The clustering of these two groups together may reflect the fact that these organisms represent the intracellular parasites of animals.
While the CBM20s of putative laforin orthologues from parasitic nematodes are very closely related to each other, in the part of the tree covering the CBM20s of putative laforin orthologues from parasitic coccidia there are two separated clusters. One contains members of the genus Eimeria, i.e. parasites attacking poultry and cattle [67], and the other comprises three coccidians (Toxoplasma gondii, Hammondia hammondi and Neospora caninum) that infect domestic cats and dogs as their definitive hosts [70]. It is worth mentioning, however, that taxonomy was not the only reason for separating the two branches of parasitic coccidia in the tree. The two groups, i.e. members of the genus Eimeria and the remaining coccidians, also differ from each other in the length of the two inserts, which are shorter in the CBM20 sequences of putative laforin orthologues from the genus Eimeria than in those of their counterparts from the three remaining coccidians (Fig. 2).
The last part of the evolutionary tree is occupied by sequences of CBM20s that exist in more than one copy in laforin orthologues from algae and various protists (Table 1). Although there is no totally unambiguous trend in clustering, a subgroup of CBM20s that directly precede the catalytic DSP can be seen, as well as a subgroup of CBM20s preceding another CBM20 copy (Fig. 3a). Within both subgroups, it was also possible to trace the taxonomy, e.g. for the representatives of Stylonychia lemnae and Oxytricha trifallax, both members of the ciliate protozoan class Spirotrichea. In other cases, especially if there were three CBM20 copies, the clustering reflected the corresponding position of the CBM20 copy in the entire protein sequence regardless of taxonomy, such as in the case of the middle CBM20 of laforin orthologues from Chondrus crispus and Guillardia theta (Fig. 3a).
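The positional subgroups mentioned above (a CBM20 directly preceding the DSP versus a CBM20 preceding another CBM20) can be assigned mechanically once the domain order of each protein is known. The sketch below is only an illustration under that assumption; the function name `label_cbm20_copies` and the two example architectures are hypothetical and are not taken from Table 1.

```python
def label_cbm20_copies(domains: list[str]) -> list[tuple[int, str]]:
    """Label each CBM20 copy by the domain that follows it.

    `domains` lists the domains of one protein from N- to C-terminus.
    Returns (copy number, following domain) pairs; a CBM20 with nothing
    after it is labelled "C-terminus".
    """
    labels = []
    copy_number = 0
    for index, domain in enumerate(domains):
        if domain != "CBM20":
            continue
        copy_number += 1
        if index + 1 < len(domains):
            follower = domains[index + 1]
        else:
            follower = "C-terminus"
        labels.append((copy_number, follower))
    return labels


# Hypothetical architectures with one and three CBM20 copies.
single_copy = ["CBM20", "DSP"]
triple_copy = ["CBM20", "CBM20", "CBM20", "DSP"]

print(label_cbm20_copies(single_copy))
print(label_cbm20_copies(triple_copy))
```

Grouping copies by the returned label separates those that directly precede the DSP from those that precede another CBM20.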
In order to gain more insight into laforin evolution, we also analyzed the laforin DSP domains. We generated the corresponding evolutionary tree based on the DSPs from the sample of laforin orthologues collected in the present study (Fig. 3b). This tree could thus shed some light on whether or not the CBM20-DSP fusions observed in laforins are the result of a single evolutionary event. The same four main groups (or clusters) of laforin orthologues are observed in the DSP tree as in the CBM20 tree, i.e. typical metazoans, parasitic nematodes, parasitic coccidia, and protozoans and algae (Fig. 3). The main difference between the trees concerns the grouping of the representatives of protozoa and algae. In the CBM20 tree (Fig. 3a), they are positioned separately from the remaining three main groups mentioned above, whereas in the DSP tree they cluster adjacent to the representatives of parasitic nematodes (Fig. 3b). This “protozoans/algae” group covers the organisms whose laforins contain two or three CBM20 copies (Table 1). It is worth mentioning that not all of these “multiple” CBM20 copies are necessarily functional and able to bind α-glucans. For instance, those from Tetrahymena thermophila and Oxytricha trifallax lack some of the residues (Trp32 and Lys87; Homo sapiens laforin numbering) identified as responsible for binding (Fig. 2). Moreover, it is unlikely that the CBM20 copies from the two ciliophoran laforin sequences that positionally correspond to each other possess similar binding ability.
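Checking whether a CBM20 copy retains residues such as Trp32 and Lys87 amounts to mapping reference positions onto alignment columns and reading the residue that each sequence has there. The following is a minimal sketch of that lookup, not the analysis behind Fig. 2. The helper names, the six-column toy alignment and the positions 2 and 4 are all hypothetical; with a real alignment one would pass the human laforin row as the reference and positions 32 and 87.

```python
def column_of(ref_aln: str, position: int) -> int:
    """Map a 1-based residue position of the ungapped reference
    to a 0-based column of the alignment."""
    seen = 0
    for col, char in enumerate(ref_aln):
        if char != "-":
            seen += 1
            if seen == position:
                return col
    raise ValueError(f"reference has fewer than {position} residues")


def residues_at(alignment: dict[str, str], ref_name: str,
                positions: dict[str, int]) -> dict[str, dict[str, str]]:
    """For every aligned sequence, report the character found in the
    columns that correspond to the given reference positions."""
    ref = alignment[ref_name]
    cols = {label: column_of(ref, pos) for label, pos in positions.items()}
    return {
        name: {label: seq[col] for label, col in cols.items()}
        for name, seq in alignment.items()
    }


# Hypothetical toy alignment; '-' marks a gap.
toy_alignment = {
    "reference": "AW-GKT",
    "copy_a": "AWSGKT",
    "copy_b": "AF-G-T",
}
# Hypothetical reference positions standing in for Trp32 and Lys87.
toy_positions = {"aromatic": 2, "basic": 4}

table = residues_at(toy_alignment, "reference", toy_positions)
for name, row in table.items():
    print(name, row)
```

Here `copy_a` keeps both residues, whereas `copy_b` has a substitution at the first site and a gap at the second, the pattern that would flag a copy as a doubtful α-glucan binder.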
While the CBM20 and DSP domain trees share a number of similarities, it is also possible that the two domains have evolved differently. It has been documented for CBM20s from the amylolytic families GH13, GH14 and GH15 that CBM20 evolution reflects species evolution more closely than the evolution of enzyme specificities (i.e. of the catalytic domain) [2,3].
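One simple way to quantify how far two domain trees agree is to compare the sets of clades they contain. The sketch below does this for rooted trees written as nested tuples. It is an illustration only: the function name `clades` is an assumption, and the two topologies are hand-written toy examples that are NOT the topologies of Fig. 3a and Fig. 3b.

```python
def clades(tree) -> set[frozenset[str]]:
    """Collect the leaf set under every internal node of a rooted tree
    given as nested tuples with string leaves."""
    found: set[frozenset[str]] = set()

    def walk(node) -> frozenset[str]:
        if isinstance(node, str):
            return frozenset([node])
        leaves = frozenset().union(*(walk(child) for child in node))
        found.add(leaves)
        return leaves

    walk(tree)
    return found


# Hypothetical toy topologies over four group labels (not the published trees).
tree_a = ((("nematodes", "coccidia"), "metazoans"), "protists_algae")
tree_b = ((("nematodes", "protists_algae"), "coccidia"), "metazoans")

clades_a = clades(tree_a)
clades_b = clades(tree_b)
shared = clades_a & clades_b
only_a = clades_a - clades_b
only_b = clades_b - clades_a

print("shared clades:", len(shared))
print("clades unique to either tree:", len(only_a) + len(only_b))
```

The number of clades found in only one of the two trees is a rooted analogue of the Robinson-Foulds distance; a value of zero would mean the two domains give identical groupings.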
Based on the above data and those previously published, it is likely that there was only a single fusion event between a CBM20 and a DSP resulting in laforin. Subsequently, there were CBM20 multiplication events yielding a DSP domain fused with multiple CBM20s as observed in some representatives of protozoa and algae.
Conclusions
This study defines the unique evolution of the laforin CBM20. Previous work stated that laforin orthologues are absent in most non-vertebrate organisms, such as yeast, flies and worms. However, the present in silico analysis extends the spectrum of organisms possessing a putative laforin to a group of parasitic nematodes represented by the genus Trichinella. In addition, it also reveals that the CBM20 is interrupted by two sequence inserts in putative laforin orthologues from the group of parasitic coccidia and that there are multiple CBM20 copies in laforin orthologues of some protozoa and algae including the unicellular Cyanidioschyzon merolae.
Acknowledgments
This work was supported by grant No. 2/0146/17 from the Slovak Grant Agency VEGA to SJ and grant No. R01NS070899 from the National Institutes of Health to MSG. AK thanks the Slovak Academic Information Agency SAIA for a short-term fellowship.
Abbreviations
- CBM: carbohydrate-binding module
- DSP: dual specificity phosphatase
- GH: glycoside hydrolase
- LB: Lafora body
- LD: Lafora disease
- SBD: starch-binding domain
Footnotes
Author’s contribution
AK collected data, analysed results, prepared figures and contributed to writing the manuscript; MSG contributed to designing the study, interpreting results and writing the manuscript; SJ designed the study, contributed to collecting data and preparing figures, analysed and interpreted results, and wrote the manuscript.
References
- 1. Janecek S, Svensson B, MacGregor EA. Structural and evolutionary aspects of two families of non-catalytic domains present in starch and glycogen binding proteins from microbes, plants and animals. Enzyme Microb Technol. 2011;49:429–440. doi: 10.1016/j.enzmictec.2011.07.002.
- 2. Janecek S, Sevcik J. The evolution of starch-binding domain. FEBS Lett. 1999;456:119–125. doi: 10.1016/s0014-5793(99)00919-9.
- 3. Janecek S, Svensson B, MacGregor EA. Relation between domain evolution, specificity, and taxonomy of the α-amylase family members containing a C-terminal starch-binding domain. Eur J Biochem. 2003;270:635–645. doi: 10.1046/j.1432-1033.2003.03404.x.
- 4. Machovic M, Janecek S. The evolution of putative starch-binding domains. FEBS Lett. 2006;580:6349–6356. doi: 10.1016/j.febslet.2006.10.041.
- 5. Machovic M, Janecek S. Starch-binding domains in the post-genome era. Cell Mol Life Sci. 2006;63:2710–2724. doi: 10.1007/s00018-006-6246-9.
- 6. Christiansen C, Abou Hachem M, Janecek S, Viksø-Nielsen A, Blennow A, Svensson B. The carbohydrate-binding module family 20, diversity, structure, and function. FEBS J. 2009;276:5006–5029. doi: 10.1111/j.1742-4658.2009.07221.x.
- 7. Cantarel BL, Coutinho PM, Rancurel C, Bernard T, Lombard V, Henrissat B. The Carbohydrate-Active EnZymes database (CAZy): an expert resource for Glycogenomics. Nucleic Acids Res. 2009;37:D233–D238. doi: 10.1093/nar/gkn663.
- 8. Boraston AB, Bolam DN, Gilbert H, Davies GJ. Carbohydrate-binding modules: fine tuning polysaccharide recognition. Biochem J. 2004;382:769–781. doi: 10.1042/BJ20040892.
- 9. Hashimoto H. Recent structural studies of carbohydrate-binding modules. Cell Mol Life Sci. 2006;63:2954–2967. doi: 10.1007/s00018-006-6195-3.
- 10. Guillen D, Sanchez S, Rodriguez-Sanoja R. Carbohydrate-binding domains: multiplicity of biological roles. Appl Microbiol Biotechnol. 2010;85:1241–1249. doi: 10.1007/s00253-009-2331-y.
- 11. Carvalho CC, Phan NN, Chen Y, Reilly PJ. Carbohydrate-binding module tribes. Biopolymers. 2015;103:203–214. doi: 10.1002/bip.22584.
- 12. Cockburn DW, Suh C, Medina KP, Duvall RM, Wawrzak Z, Henrissat B, Koropatkin NM. Novel carbohydrate binding modules in the surface anchored α-amylase of Eubacterium rectale provide a molecular rationale for the range of starches used by this organism in the human gut. Mol Microbiol. 2017. doi: 10.1111/mmi.13881.
- 13. Valk V, Lamberts van Bueren A, van der Kaaij RM, Dijkhuizen L. Carbohydrate-binding module 74 is a novel starch-binding domain associated with large and multi-domain α-amylase enzyme. FEBS J. 2016;283:2354–2368. doi: 10.1111/febs.13745.
- 14. Polekhina G, Gupta A, van Denderen BJ, Feil SC, Kemp BE, Stapleton D, Parker MW. Structural basis for glycogen recognition by AMP-activated protein kinase. Structure. 2005;13:1453–1462. doi: 10.1016/j.str.2005.07.008.
- 15. Vander Kooi CW, Taylor AO, Pace RM, Meekins DA, Guo HF, Kim Y, Gentry MS. Structural basis for the glucan phosphatase activity of Starch Excess4. Proc Natl Acad Sci USA. 2010;107:15379–15384. doi: 10.1073/pnas.1009386107.
- 16. Ross FA, MacKintosh C, Hardie DG. AMP-activated protein kinase: a cellular energy sensor that comes in 12 flavours. FEBS J. 2016;283:2987–3001. doi: 10.1111/febs.13698.
- 17. Peng H, Zheng Y, Chen M, Wang Y, Xiao Y, Gao Y. A starch-binding domain identified in α-amylase (AmyP) represents a new family of carbohydrate-binding modules that contribute to enzymatic hydrolysis of soluble starch. FEBS Lett. 2014;588:1161–1167. doi: 10.1016/j.febslet.2014.02.050.
- 18. Li X, Yu J, Zhang J, Sun H, Zhang X. Backbone and side-chain assignments for a novel CBM69 starch binding domain AmyP-SBD. Biomol NMR Assign. 2017;11:235–237. doi: 10.1007/s12104-017-9755-6.
- 19. Rodriguez-Sanoja R, Oviedo N, Sanchez S. Microbial starch-binding domain. Curr Opin Microbiol. 2005;8:260–267. doi: 10.1016/j.mib.2005.04.013.
- 20. Svensson B, Jespersen H, Sierks MR, MacGregor EA. Sequence homology between putative raw-starch binding domains from different starch-degrading enzymes. Biochem J. 1989;264:309–311. doi: 10.1042/bj2640309.
- 21. Penninga D, van der Veen BA, Knegtel RM, van Hijum SA, Rozeboom HJ, Kalk KH, Dijkstra BW, Dijkhuizen L. The raw starch binding domain of cyclodextrin glycosyltransferase from Bacillus circulans strain 251. J Biol Chem. 1996;271:32777–32784. doi: 10.1074/jbc.271.51.32777.
- 22. Sorimachi K, Le Gal-Coëffet MF, Williamson G, Archer DB, Williamson MP. Solution structure of the granular starch-binding domain of Aspergillus niger glucoamylase bound to β-cyclodextrin. Structure. 1997;5:647–661. doi: 10.1016/s0969-2126(97)00220-7.
- 23. Mikami B, Adachi M, Kage T, Sarikaya E, Nanmori T, Shinke R, Utsumi S. Structure of raw starch-digesting Bacillus cereus β-amylase complexed with maltose. Biochemistry. 1999;38:7050–7061. doi: 10.1021/bi9829377.
- 24. Minassian BA, Ianzano L, Meloche M, Andermann E, Rouleau GA, Delgado-Escueta AV, Scherer SW. Mutation spectrum and predicted function of laforin in Lafora’s progressive myoclonus epilepsy. Neurology. 2000;55:341–346. doi: 10.1212/wnl.55.3.341.
- 25. Janecek S. A motif of a microbial starch-binding domain found in human genethonin. Bioinformatics. 2002;18:1534–1537. doi: 10.1093/bioinformatics/18.11.1534.
- 26. Gentry MS, Dixon JE, Worby CA. Lafora disease: insights into neurodegeneration from plant metabolism. Trends Biochem Sci. 2009;34:628–639. doi: 10.1016/j.tibs.2009.08.002.
- 27. Stapleton D, Nelson C, Parsawar K, McClain D, Gilbert-Wilson R, Barker E, Rudd B, Brown K, Hendrix W, O’Donnell P, Parker G. Analysis of hepatic glycogen-associated proteins. Proteomics. 2010;10:2320–2329. doi: 10.1002/pmic.200900628.
- 28. Roach PJ, Depaoli-Roach AA, Hurley TD, Tagliabracci VS. Glycogen and its metabolism: some new developments and old themes. Biochem J. 2012;441:763–787. doi: 10.1042/BJ20111416.
- 29. Gentry MS, Roma-Mateo C, Sanz P. Laforin, a protein with many faces: glucan phosphatase, adapter protein, et alii. FEBS J. 2013;280:525–537. doi: 10.1111/j.1742-4658.2012.08549.x.
- 30. Minassian BA, Lee JR, Herbrick JA, Huizenga J, Soder S, Mungall AJ, Dunham I, Gardner R, Fong CY, Carpenter S, Jardim L, Satishchandra P, Andermann E, Snead OC, Lopes-Cendes I, Tsui LC, Delgado-Escueta AV, Rouleau GA, Scherer SW. Mutations in a gene encoding a novel protein tyrosine phosphatase cause progressive myoclonus epilepsy. Nat Genet. 1998;20:171–174. doi: 10.1038/2470.
- 31. Serratosa JM, Gomez-Garre P, Gallardo ME, Anta B, de Bernabé DB, Lindhout D, Augustijn PB, Tassinari CA, Malafosse RM, Topcu M, Grid D, Dravet C, Berkovic SF, de Cordoba SR. A novel protein tyrosine phosphatase gene is mutated in progressive myoclonus epilepsy of the Lafora type (EPM2). Hum Mol Genet. 1999;8:345–352. doi: 10.1093/hmg/8.2.345.
- 32. Lafora GR, Glueck B. Beitrag zur histopathologie der myoklonischen epilepsie. Z Gesamte Neurol Psychiatr. 1911;6:1–14.
- 33. Turnbull J, Tiberia E, Striano P, Genton P, Carpenter S, Ackerley CA, Minassian BA. Lafora disease. Epileptic Disord. 2016;18(S2):38–62. doi: 10.1684/epd.2016.0842.
- 34. Worby CA, Gentry MS, Dixon JE. Laforin, a dual specificity phosphatase that dephosphorylates complex carbohydrates. J Biol Chem. 2006;281:30412–30418. doi: 10.1074/jbc.M606117200.
- 35. Tagliabracci VS, Turnbull J, Wang W, Girard JM, Zhao X, Skurat AV, Delgado-Escueta AV, Minassian BA, Depaoli-Roach AA, Roach PJ. Laforin is a glycogen phosphatase, deficiency of which leads to elevated phosphorylation of glycogen in vivo. Proc Natl Acad Sci USA. 2007;104:19262–19266. doi: 10.1073/pnas.0707952104.
- 36. Tagliabracci VS, Heiss C, Karthik C, Contreras CJ, Glushka J, Ishihara M, Azadi P, Hurley TD, Depaoli-Roach AA, Roach PJ. Phosphate incorporation during glycogen synthesis and Lafora disease. Cell Metab. 2011;13:274–282. doi: 10.1016/j.cmet.2011.01.017.
- 37. Nitschke F, Wang P, Schmieder P, Girard JM, Awrey DE, Wang T, Israelian J, Zhao X, Turnbull J, Heydenreich M, Kleinpeter E, Steup M, Minassian BA. Hyperphosphorylation of glucosyl C6 carbons and altered structure of glycogen in the neurodegenerative epilepsy Lafora disease. Cell Metab. 2013;17:756–767. doi: 10.1016/j.cmet.2013.04.006.
- 38. Vilchez D, Ros S, Cifuentes D, Pujadas L, Valles J, Garcia-Fojeda B, Criado-Garcia O, Fernandez-Sanchez E, Medrano-Fernandez I, Dominguez J, Garcia-Rocha M, Soriano E, Rodriguez de Cordoba S, Guinovart JJ. Mechanism suppressing glycogen synthesis in neurons and its demise in progressive myoclonus epilepsy. Nat Neurosci. 2007;10:1407–1413. doi: 10.1038/nn1998.
- 39. Duran J, Tevy MF, Garcia-Rocha M, Calbo J, Milan M, Guinovart JJ. Deleterious effects of neuronal accumulation of glycogen in flies and mice. EMBO Mol Med. 2012;4:719–729. doi: 10.1002/emmm.201200241.
- 40. Sinadinos C, Valles-Ortega J, Boulan L, Solsona E, Tevy MF, Marquez M, Duran J, Lopez-Iglesias C, Calbo J, Blasco E, Pumarola M, Milan M, Guinovart JJ. Neuronal glycogen synthesis contributes to physiological aging. Aging Cell. 2014;13:935–945. doi: 10.1111/acel.12254.
- 41. Duran J, Guinovart JJ. Brain glycogen in health and disease. Mol Aspects Med. 2015;46:70–77. doi: 10.1016/j.mam.2015.08.007.
- 42. Nitschke F, Sullivan MA, Wang P, Zhao X, Chown EE, Perri AM, Israelian L, Juana-López L, Bovolenta P, Rodriguez de Cordoba S, Steup M, Minassian BA. Abnormal glycogen chain length pattern, not hyperphosphorylation, is critical in Lafora disease. EMBO Mol Med. 2017;9:906–917. doi: 10.15252/emmm.201707608.
- 43. Wang J, Stuckey JA, Wishart MJ, Dixon JE. A unique carbohydrate binding domain targets the Lafora disease phosphatase to glycogen. J Biol Chem. 2002;277:2377–2380. doi: 10.1074/jbc.C100686200.
- 44. Ganesh S, Tsurutani N, Suzuki T, Hoshii Y, Ishisara T, Delgado-Escueta AV, Yamakawa K. The carbohydrate-binding domain of Lafora disease protein targets Lafora polyglucosan bodies. Biochem Biophys Res Commun. 2004;313:1101–1109. doi: 10.1016/j.bbrc.2003.12.043.
- 45. Raththagala M, Brewer MK, Parker MV, Sherwood AR, Wong BK, Hsu S, Bridges TM, Paasch BC, Hellman LM, Husodo S, Meekins DA, Taylor AO, Turner BD, Auger KD, Dukhande VV, Chakravarthy S, Sanz P, Woods VL, Li S, Vander Kooi CW, Gentry MS. Structural mechanism of laforin function in glycogen dephosphorylation and Lafora disease. Mol Cell. 2015;57:261–272. doi: 10.1016/j.molcel.2014.11.020.
- 46. Gentry MS, Brewer MK, Vander Kooi CW. Structural biology of glucan phosphatases from humans to plants. Curr Opin Struct Biol. 2016;40:62–69. doi: 10.1016/j.sbi.2016.07.015.
- 47. Gentry MS, Dowen RH, Worby CA, Mattoo S, Ecker JR, Dixon JE. The phosphatase laforin crosses evolutionary boundaries and links carbohydrate metabolism to neuronal disease. J Cell Biol. 2007;178:477–488. doi: 10.1083/jcb.200704094.
- 48. Gentry MS, Pace RM. Conservation of the glucan phosphatase laforin is linked to rates of molecular evolution and the glucan metabolism of the organism. BMC Evol Biol. 2009;9:138. doi: 10.1186/1471-2148-9-138.
- 49. Emanuelle S, Brewer MK, Meekins DA, Gentry MS. Unique carbohydrate binding platforms employed by the glucan phosphatases. Cell Mol Life Sci. 2016;73:2765–2778. doi: 10.1007/s00018-016-2249-3.
- 50. Altschul SF, Gish W, Miller W, Myers EW, Lipman DJ. Basic local alignment search tool. J Mol Biol. 1990;215:403–410. doi: 10.1016/S0022-2836(05)80360-2.
- 51. Benson DA, Clark K, Karsch-Mizrachi I, Lipman DJ, Ostell J, Sayers EW. GenBank. Nucleic Acids Res. 2015;43:D30–D35. doi: 10.1093/nar/gku1216.
- 52. UniProt Consortium. UniProt: a hub for protein information. Nucleic Acids Res. 2015;43:D204–D212. doi: 10.1093/nar/gku989.
- 53. Larkin MA, Blackshields G, Brown NP, Chenna R, McGettigan PA, McWilliam H, Valentin F, Wallace IM, Wilm A, Lopez R, Thompson JD, Gibson TJ, Higgins DG. ClustalW and Clustal X version 2.0. Bioinformatics. 2007;23:2947–2948. doi: 10.1093/bioinformatics/btm404.
- 54. Srikumar PS, Rohini K, Rajesh PK. Molecular dynamics simulations and principal component analysis on human laforin mutation W32G and W32G/K87A. Protein J. 2014;33:289–295. doi: 10.1007/s10930-014-9561-2.
- 55. Kelley LA, Sternberg MJ. Protein structure prediction on the Web: a case study using the Phyre server. Nat Protoc. 2009;4:363–371. doi: 10.1038/nprot.2009.2.
- 56. Finn RD, Coggill P, Eberhardt RY, Eddy SR, Mistry J, Mitchell AL, Potter SC, Punta M, Qureshi M, Sangrador-Vegas A, Salazar GA, Tate J, Bateman A. The Pfam protein families database: towards a more sustainable future. Nucleic Acids Res. 2016;44:D279–D285. doi: 10.1093/nar/gkv1344.
- 57. Saitou N, Nei M. The neighbor-joining method: a new method for reconstructing phylogenetic trees. Mol Biol Evol. 1987;4:406–425. doi: 10.1093/oxfordjournals.molbev.a040454.
- 58. Felsenstein J. Confidence limits on phylogenies: an approach using the bootstrap. Evolution. 1985;39:783–791. doi: 10.1111/j.1558-5646.1985.tb00420.x.
- 59. Letunic I, Bork P. Interactive Tree Of Life v2: online annotation and display of phylogenetic trees made easy. Nucleic Acids Res. 2011;39:W475–W478. doi: 10.1093/nar/gkr201.
- 60. Coppin A, Varre J, Lienard L, Dauvillee D, Guerardel Y, Soyer-Gobillard MO, Buleon A, Ball S, Tomavo S. Evolution of plant-like crystalline storage polysaccharide in the protozoan parasite Toxoplasma gondii argues for a red alga ancestry. J Mol Evol. 2005;60:257–267. doi: 10.1007/s00239-004-0185-6.
- 61. Guerardel Y, Leleu D, Coppin A, Lienard L, Slomianny C, Strecker G, Ball S, Tomavo S. Amylopectin biogenesis and characterization in the protozoan parasite Toxoplasma gondii, the intracellular development of which is restricted in the HepG2 cell line. Microbes Infect. 2005;7:41–48. doi: 10.1016/j.micinf.2004.09.007.
- 62. Matsuzaki M, Misumi O, Shin-i T, Maruyama S, Takahara M, Miyagishima S, Mori T, Nishida K, Yagisawa F, Nishida K, … Kuroiwa T. Genome sequence of the ultrasmall unicellular red alga Cyanidioschyzon merolae 10D. Nature. 2004;428:653–657. doi: 10.1038/nature02398.
- 63. Curtis BA, Tanifuji G, Burki F, Gruber A, Irimia M, Maruyama S, Arias MC, Ball SG, Gile GH, Hirakawa Y, … Archibald JM. Algal genomes reveal evolutionary mosaicism and the fate of nucleomorphs. Nature. 2012;492:59–65. doi: 10.1038/nature11681.
- 64. Meekins DA, Raththagala M, Husodo S, White CJ, Guo HF, Kötting O, Vander Kooi CW, Gentry MS. Phosphoglucan-bound structure of starch phosphatase Starch Excess4 reveals the mechanism for C6 specificity. Proc Natl Acad Sci USA. 2014;111:7272–7277. doi: 10.1073/pnas.1400757111.
- 65. Mitreva M, Jasmer DP, Zarlenga DS, Wang Z, Abubucker S, Martin J, Taylor CM, Yin Y, Fulton L, Minx P, … Wilson RK. The draft genome of the parasitic nematode Trichinella spiralis. Nat Genet. 2011;43:228–235. doi: 10.1038/ng.769.
- 66. Aeschlimann SH, Jonsson F, Postberg J, Stover NA, Petera RL, Lipps HJ, Nowacki M, Swart EC. The draft assembly of the radically organized Stylonychia lemnae macronuclear genome. Genome Biol Evol. 2014;6:1707–1723. doi: 10.1093/gbe/evu139.
- 67. Swart EC, Bracht JR, Magrini V, Minx P, Chen X, Zhou Y, Khurana JS, Goldman AD, Nowacki M, Schotanus K, … Landweber LF. The Oxytricha trifallax macronuclear genome: a complex eukaryotic genome with 16,000 tiny chromosomes. PLoS Biol. 2013;11:e1001473. doi: 10.1371/journal.pbio.1001473.
- 68. Reid AJ, Blake DP, Ansari HR, Billington K, Browne HP, Bryant J, Dunn M, Hung SS, Kawahara F, Miranda-Saavedra D, … Pain A. Genomic analysis of the causative agents of coccidiosis in domestic chickens. Genome Res. 2014;24:1676–1685. doi: 10.1101/gr.168955.113.
- 69. Putnam NH, Butts T, Ferrier DE, Furlong RF, Hellsten U, Kawashima T, Robinson-Rechavi M, Shoguchi E, Terry A, Yu JK, … Rokhsar DS. The amphioxus genome and the evolution of the chordate karyotype. Nature. 2008;453:1064–1071. doi: 10.1038/nature06967.
- 70. Reid AJ, Vermont SJ, Cotton JA, Harris D, Hill-Cawthorne GA, Konen-Waisman S, Latham SM, Mourier T, Norton R, Quail MA, … Wastling JM. Comparative genomics of the apicomplexan parasites Toxoplasma gondii and Neospora caninum: coccidia differing in host range and transmission strategy. PLoS Pathog. 2012;8:e1002567. doi: 10.1371/journal.ppat.1002567.