Author manuscript; available in PMC: 2016 Jan 1.
Published in final edited form as: Dev Comp Immunol. 2014 Oct 28;48(1):234–243. doi: 10.1016/j.dci.2014.10.009

A family of variable immunoglobulin and lectin domain containing molecules in the snail Biomphalaria glabrata

Nolwenn M Dheilly, David Duval, Gabriel Mouahid, Rémi Emans, Jean-François Allienne, Richard Galinier, Clémence Genthon, Emeric Dubois, Louis Du Pasquier, Coen M Adema, Christoph Grunau, Guillaume Mitta, Benjamin Gourbal
PMCID: PMC4255472  NIHMSID: NIHMS642371  PMID: 25451302

Abstract

Technical limitations have hindered comprehensive studies of highly variable immune response molecules that are thought to have evolved due to pathogen-mediated selection, such as Fibrinogen-related proteins (FREPs) from Biomphalaria glabrata. FREPs combine upstream immunoglobulin superfamily (IgSF) domains with a C-terminal fibrinogen-related domain (FreD) and participate in reactions against trematode parasites. From RNAseq data we assembled a de novo reference transcriptome of B. glabrata to investigate the diversity of FREP transcripts. This study increased over two-fold the number of bona fide FREP subfamilies and revealed important sequence diversity within the FREP12 subfamily. We also report the discovery of related molecules that feature one or two IgSF domains associated with different C-terminal lectin domains, named C-type lectin-related proteins (CREPs) and Galectin-related protein (GREP). Together, the highly similar FREPs, CREPs and GREP were designated VIgL (Variable Immunoglobulin and Lectin domain containing molecules).

Keywords: Diversity, FREPs, C-type lectin, Galectin, RNAseq, immunoglobulin superfamily, IgL

1. Introduction

Pathogens, and especially specialized parasites, impose considerable selective pressures on their hosts. It is therefore not surprising that animals of different phyla have independently acquired the capacity to individually diversify parasite recognition and immune responses. In fact, diversification of genes or gene families of immune factors is now considered a common feature of animal immunity across phylogeny (Bowden et al., 2007; Ghosh et al., 2011; Hauton and Smith, 2007). Highly variable molecules with potential immune function may be found in any phylum of invertebrates. Among those, 185/333 proteins, Toll-like receptors and Nod-like receptors from sea urchins (Buckley and Rast, 2012; Nair et al., 2005), Variable Chitin Binding proteins (VCBP) in cephalochordates (Cannon et al., 2002) and tunicates (Dishaw et al., 2011), Down syndrome cell adhesion molecules (DSCAMs) in arthropods (Brites et al., 2013; Brites et al., 2008; Neves and Chess, 2004; Watson et al., 2005; Watthanasurorot et al., 2011), and Fibrinogen-related proteins (FREPs) in molluscs (Adema et al., 1997b; Zhang et al., 2004; Zhang and Loker, 2003), have been studied in detail. The list of such highly variable immune factors continues to expand: additional families of potential immune receptors or effectors with an ability for specific recognition of pathogens are regularly described, such as the C-type lectins (CTLD) of Caenorhabditis elegans (Schulenburg et al., 2008) and NACHT/NB-ARC in the coral Acropora digitifera (Hamada et al., 2012; Shinzato et al., 2011).

To date, most of our knowledge regarding the diversity of these molecules derives from traditional transcriptomic approaches (generation of ESTs and targeted PCR) (Brites et al., 2008; Zhang et al., 2004) and proteomics (Dheilly et al., 2012; Dheilly et al., 2009; Dheilly et al., 2013; Moné et al., 2010). However, the comprehensive study of the diversity of these highly variable immune molecules has been challenging because each individual sequence variant is expressed at low level such that detection is difficult (at protein level) or expensive (at mRNA or genomic DNA level by traditional Sanger sequencing). Therefore, these traditional approaches have allowed us to glimpse, but not fully explore in depth, the extent of sequence diversity and its role in the defense response capabilities of invertebrates to counter infections. To meet such limitations, we used Next Generation Sequencing (NGS) technologies that carry low costs and provide high sequencing coverage for study of the diversity of transcripts that result from somatic diversification of a multigenic family such as FREPs of the gastropod B. glabrata (Dheilly et al., 2014).

Functional and genomic characteristics suggest that FREPs play a key role in the immune processes underlying immuno-compatibility between B. glabrata and the trematode parasite Schistosoma mansoni. The immune role of FREPs is indicated by their differential expression over the course of an infection (Hanington et al., 2010b; Hertel et al., 2005) and their specific interaction with polymorphic mucins (SmPoMucs), antigens of S. mansoni (Moné et al., 2010). Furthermore, anti-trematode resistance of B. glabrata is reduced following FREP knockdown through RNA interference (Hanington et al., 2010a; Hanington et al., 2012). FREPs consist of one or two immunoglobulin domains (called IgSF1 and IgSF2) and a carboxyl terminal fibrinogen (FBG) domain (Adema et al., 1997b). They belong to a multigenic family of at least 14 members of which 6 full-length sequences have been obtained at cDNA or DNA level (Adema et al., 1997b; Léonard et al., 2001; Zhang et al., 2001; Zhang et al., 2008; Zhang and Loker, 2003). The combination of allelic polymorphism (Zhang et al., 2004) and somatic modification of FREP genes (Hanington et al., 2010a; Zhang et al., 2004) leads to a remarkable diversification within individual snails. Because of the considerable challenge to study this high diversity, FREPs are thus perfect candidates to test the potential of NGS in deciphering the diversity of transcript sequences derived from diverse, complex, multi-domain gene families.

In this study, we optimized the generation of a transcriptome de novo from Illumina sequencing of cDNA, in the absence of a reference genome (see supplementary file 1 for details on transcriptome assembly). This assembly was inspected for transcripts that encoded complete and partial FREPs. This study greatly expanded the number of FREP gene subfamilies from B. glabrata and revealed a great diversity within the FREP12 subfamily. In addition, it led to the discovery of new molecules that feature immunoglobulin domains similar to the IgSF1 and IgSF2 domains of FREPs, associated with either a C-type lectin domain or a galectin domain.

2. Materials and Methods

2.1. Snail biological material and sampling

The de novo transcriptome assembly was conducted for the BgBRE strain of Biomphalaria glabrata, originally collected in Recife, Brazil and maintained in the laboratory for 30 years. This strain has low neutral genetic diversity: microsatellite (neutral genetic marker) characterization indicated that the expected heterozygosity (He) was 0.387, allelic richness (AR) was 3, and Fis was 0.252. This study used two samples of 10 juvenile snails of BgBRE (shell diameter from 4 to 7 mm), two samples of 10 mature adult snails (shell diameter from 8 to 11 mm) and two samples of 10 old adult snails (shell diameter from 12 to 16 mm). All snails were healthy and were not intentionally immunized; the objective was to assemble the transcriptome of constitutively expressed genes.

2.2. RNA extraction

Total RNA was extracted from individual juvenile, adult and old snails. First, the shell was removed and whole snail body tissues were disrupted under liquid nitrogen with pestle and mortar. Powdered tissues were mixed with 500 µl Trizol reagent and stored at −80°C. Total RNA was extracted according to the manufacturer’s instructions. Briefly, 100 µl of chloroform was added and the sample was homogenized for 15 min at room temperature. Samples were centrifuged 15 min at 12,000×g and the aqueous phase was transferred into a new tube. RNA was precipitated with 500 µl of isopropanol and centrifuged 10 min at 12,000×g. RNA pellets were washed 2 times with 500 µl of cold ethanol (70%) and dissolved in water. RNA concentrations were determined using a ND-1000 spectrophotometer (Nanodrop Technologies).

2.3. cDNA library construction and Illumina SOLEXA sequencing

Equimolar amounts of RNA from juvenile, adult and old B. glabrata were pooled to constitute two biological replicates of 30 individuals: Bre1 and Bre2. RNA concentrations were determined spectrophotometrically (ND-1000 Nanodrop Technologies). RNA integrity was checked on a 2100 Bioanalyzer (Agilent) using RNA 6000 Nano kits (Agilent Technologies). The RNA Integrity Number (RIN) was not considered because of the hidden break in 28S rRNA from B. glabrata that causes 28S RNA to break into two fragments that run alongside 18S as in other invertebrate species (Dheilly et al., 2011; Ishikawa, 1977; Winnebeck et al., 2010). Two paired-end 72bp cDNA libraries were generated using the mRNA-Seq kit for transcriptome sequencing on the Illumina Genome analyzer II platform. Three samples were multiplexed for each lane. Library construction and sequencing were performed by MGX (Montpellier Genomix, c/o Institut de Génomique Fonctionnelle, Montpellier, France). Each library was purified and quantified using a DNA 1000 Chip on a 2100 BioAnalyzer (Agilent Technologies). cDNA fragment size ranged from 220 to 500 bp with an average size of 300 bp. Adapters were ligated to cDNA before analysis on the Illumina Genome analyzer II platform. The numbers of 72 bp reads resulting from samples Bre1 and Bre2 were 99,316,948 and 73,000,210, respectively. Only reads that passed the default quality filtering performed by the Illumina pipeline were retained. Reads were further cleaned using a workflow created in a local instance of Galaxy (Giardine et al., 2005). This workflow used the FastX toolkit (http://hannonlab.cshl.edu/fastx_toolkit/) and included the removal of sequencing artefacts, sequence trimming, and clipping of adapters. Quality control revealed an unexpected bias in nucleotide distribution associated with a reduced error per base calling for the initial 13 nucleotides at the 5’ terminus of each read (Figure S1) possibly due to the random primers used for library generation. 
Additionally, the sequencing quality was reduced for the last 3 nucleotides at the 3’ terminus of each read (Figure S1). Thus all reads were trimmed to remove these low quality sequence intervals (retaining bases 14 through 69), resulting in paired-end reads of 56 nucleotides each. Subsequently, reads without a pair, named orphan reads, were removed (about 10% of total reads). The resulting two high quality libraries, with 90,331,578 and 66,588,544 paired reads for Bre1 and Bre2, respectively, were used for transcriptome assembly.
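The trimming and orphan-removal steps can be illustrated with a minimal Python sketch. It is not the Galaxy/FastX workflow used in the study; the function names and the in-memory dictionaries are hypothetical, and it assumes 1-based read positions, so that dropping the first 13 and last 3 bases of a 72 nt read leaves 56 nt.

```python
def trim_read(seq, qual, start=14, end=69):
    """Keep 1-based positions start..end (inclusive) of a read and its quality string.

    Defaults are an assumption: they drop the first 13 and last 3 bases of a 72 nt read.
    """
    return seq[start - 1:end], qual[start - 1:end]


def drop_orphans(mates1, mates2):
    """Keep only reads whose ID is present in both mate dictionaries.

    mates1 and mates2 map a read ID to a (sequence, quality) tuple (hypothetical layout).
    """
    shared = sorted(mates1.keys() & mates2.keys())
    return {k: mates1[k] for k in shared}, {k: mates2[k] for k in shared}
```

A 72-character read passed through trim_read comes back with 56 characters, matching the read length stated above.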

2.4. De novo assembly of transcriptomes

Each sequence dataset was assembled using the Python script provided by Oases (version 0.2.06; http://www.ebi.ac.uk/~zerbino/oases/) that runs Velvet (version 1.2.02; http://www.ebi.ac.uk/~zerbino/velvet/) for individual single-k assemblies of short reads using the de Bruijn graph algorithm. Oases exploits paired-end information to construct transcript isoforms and compute a merged assembly (Schulz et al., 2012; Zerbino and Birney, 2008). The resulting contigs were merged into unigene clusters with CD-HIT-EST (version 4.5.4, parameters -c 0.95 -n 10; Fu et al., 2012). To optimize the de novo transcriptome assembly, the effect of k-mer usage, multiple k-mer assembly and sequencing depth was assessed (supplementary file 1, Figure S2). In addition, biological and technical variability of the method was evaluated by comparing the identity of transcripts generated independently from Bre1 and Bre2 samples. The comparison was performed by merging the Bre1 and Bre2 transcriptomes using CD-HIT-EST, to produce a transcriptome called Bre1+2 (Figure S3). Finally, a high quality reference transcriptome was produced with the 157 million paired reads from Bre1 and Bre2, comprising 117,269 transcripts with an N50 of 983 nucleotides (Figures S3 and S4).
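For readers unfamiliar with the N50 statistic quoted for the assembly, the following Python sketch shows the usual definition: the length L such that transcripts of length L or longer contain at least half of all assembled bases. The function name is hypothetical and the sketch is independent of the assembly software.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    Sequences are visited from longest to shortest; the N50 is the length of the
    sequence at which the running total first reaches half of the summed length.
    """
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length
    raise ValueError("n50() requires at least one sequence length")
```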

2.5. Comparative analysis

The biological and technical variability of the method was further evaluated by comparing the expression levels of transcripts present in the Bre1 and Bre2 samples (supplementary file 1, Figure S4). Illumina reads were mapped against the reference transcriptome using BWA (Li and Durbin, 2009) and analyzed using SAMtools (Li et al., 2009). For each transcript, the number of reads per kilobase per million mapped reads (rpkm) was extracted. To compare Bre1 and Bre2 libraries (supplementary file 1, Figure S4), a statistical analysis was performed using the DESeq R package following the developer’s instructions. Differential expression analysis (Anders and Huber, 2010) was applied using the blind method in fit-only sharing mode with an adjusted P-value cutoff of 0.1.
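The rpkm normalization mentioned above follows the standard formula, rpkm = reads × 10^9 / (transcript length in nt × total mapped reads). A minimal Python sketch, with a hypothetical function name and argument names:

```python
def rpkm(read_count, transcript_length_nt, total_mapped_reads):
    """Reads per kilobase of transcript per million mapped reads."""
    if transcript_length_nt <= 0 or total_mapped_reads <= 0:
        raise ValueError("length and total mapped reads must be positive")
    return read_count * 1e9 / (transcript_length_nt * total_mapped_reads)
```

For example, 100 reads on a 1,000 nt transcript in a library of one million mapped reads gives an rpkm of 100.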

2.6. Transcriptome annotation

Transcripts were automatically annotated using Blast2GO version 2.4.2 (Conesa et al., 2005) using BlastX (e-value cutoff of 1e-3) against the NCBI non-redundant (nr) protein database, Gene Ontology functional annotation (e-value hit-filter 1e-6) (GO; http://www.geneontology.org/), InterProScan functional domain search, and enzyme annotation using the Kyoto Encyclopedia of Genes and Genomes (KEGG). Analyses of Blast2GO annotations are provided in Supplementary file 2 and Figures 1 and S5.

Figure 1. A high diversity of Immune relevant protein domains in B. glabrata transcriptome.

The histogram provides the number of transcripts with immune-relevant protein domains annotated by InterProScan (blue histograms, left axis) and mean rpkm values (red squares, right axis). The dotted red line shows the mean rpkm value of all transcripts within the transcriptome (8.2). Ig-fold: Immunoglobulin-like fold, IPR013783. LRR: Leucine-rich repeat, IPR001611. CLECT: C-type lectin (CTL) or carbohydrate recognition domain (CRD), IPR001304. Ig: Immunoglobulin, IPR003599. FBG: Fibrinogen-related domains (FReDs, or FBG), IPR002181. TNF: Tumor necrosis factor family, IPR006052. TIR: Toll-Interleukin 1-receptor, IPR000157. C1q: Complement component C1q domain, IPR001073. SR: Scavenger Receptor Cysteine-rich, IPR017448. CARD: Caspase recruitment domain, IPR001315. DEATH: DEATH domain, found in proteins involved in cell death, IPR000488. GLECT: Galectin, IPR001079. MACPF: membrane-attack complex / perforin, IPR020864. SPRY: Domain in SPIa and the RYanodine receptor, similar to pyrin domain, IPR018355. IRF: Interferon regulatory factor, IPR001346. BPI: BPI/LBP/CETP: Lipopolysaccharide binding protein, IPR001124 and IPR017942. PGRP: Animal peptidoglycan recognition protein homolog, IPR006619.

2.7. Single nucleotide polymorphism discovery

Reads were initially mapped with Bowtie2 version 2.0.5 (build options as reported by the program: -O3 -m64 -msse2 -funroll-loops -g3; sizeof (int, long, long long, void*, size_t, off_t): (4, 8, 8, 8, 8, 8)). Then an mpileup file was generated with SAMtools. Single Nucleotide Polymorphisms (SNPs) were extracted with VarScan with the following options: minimum coverage: 8; minimum reads: 2; minimum quality: 15; minimum variant allele frequency: 0.01; p-value: 99e-02.

2.8. Phylogenetic analysis

All transcripts in the reference transcriptome that either showed a high level of similarity (at the amino acid level) to FREP sequences or presented the classical constituent domains (IgSF and FBG) of FREP sequences were aligned and compared to previous FREP entries available from GenBank (Figure S6), NCBI (http://www.ncbi.nlm.nih.gov/). Three protein-encoding sequence regions characteristic of B. glabrata FREPs were analyzed further, and only sequences containing full-length domains were retained: Immunoglobulin superfamily 1 domain (IgSF1), 26 sequences available: 7 NCBI entries and 19 BgBre sequences; Immunoglobulin superfamily 2 domain (IgSF2), 39 sequences available: 10 NCBI entries and 29 BgBre sequences; and the FBG domain, 25 sequences available: 13 NCBI entries and 12 BgBre sequences. Sequence accession numbers and abbreviations of sequences used for phylogenetic analyses are provided in Supplementary table 1.

Multiple sequence alignments (MSA) of the predicted amino acid sequences were obtained using the Muscle component of the software MEGA 5.2 (Tamura et al., 2011) and refined by Gblocks 0.91b (Castresana, 2000; Dereeper et al., 2010; Dereeper et al., 2008).

Pairwise distances were calculated using MEGA 5.2, applying the number of different residues to determine percentage identity. These values were used to generate a matrix of identity for each FREP region (IgSF1, IgSF2, FBG). Each matrix was used to generate a graphical distribution of percentage of identity frequencies (Figure S7).
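The identity matrix described above can be sketched in a few lines of Python. This is an illustration rather than the MEGA procedure: the function names are hypothetical, and it assumes that alignment columns with a gap in either sequence are ignored (pairwise deletion), which may differ from the settings used in the study.

```python
def percent_identity(seq_a, seq_b, gap="-"):
    """Percentage of identical residues between two aligned sequences.

    Columns where either sequence has a gap are skipped (an assumption).
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must come from the same alignment")
    compared = same = 0
    for a, b in zip(seq_a, seq_b):
        if a == gap or b == gap:
            continue
        compared += 1
        same += a == b
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * same / compared


def identity_matrix(aligned):
    """Map every ordered pair of sequence names to its percent identity.

    aligned is a dict of name -> aligned sequence (hypothetical layout).
    """
    names = list(aligned)
    return {(p, q): percent_identity(aligned[p], aligned[q]) for p in names for q in names}
```

The values of such a matrix can then be binned to obtain the frequency distribution of percentage identity.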

Phylogenetic analyses were performed on nucleotide and amino-acid sequences using the neighbor-joining (NJ) and the maximum parsimony (MP) methods in MEGA 5.2. The maximum-likelihood (ML) method was performed with the PhyML program on the Seaview platform (Gouy et al., 2010). Reliability of internal branches was assessed using a bootstrapping procedure (2000 replicates for NJ and 1000 replicates for MP and ML). For each sequence dataset, the probabilistic model of sequence evolution (Nei and Kumar, 2000) and the gamma distribution (G) used to approximate rate heterogeneity among sites in the phylogenetic reconstruction were estimated using MEGA 5.2. The phylogenetic analyses yielded congruent topologies for sequences that were represented in each of the three datasets (IgSF1, IgSF2, FBG). The ML trees are provided in Figure 2 (IgSF2) and Figures S8 to S13. Support values below the 50% significance level were not discussed. Some of the deep nodes are not supported.

Figure 2. Subfamily grouping of Immunoglobulin (IgSF2) domains of Variable Immunoglobulin and Lectin domain containing molecules (VIgLs).

A/ Maximum-likelihood tree showing the relationships between FREPs, CREPs and GREP based on alignment of 38 IgSF2 domains. The clustering pattern demonstrates the similarities of IgSF2 sequences among VIgL molecules, yet shows that CREPs and GREP remain distinct from all FREP subfamilies identified to date. BgSel was used as outgroup to confirm that CREPs are more similar to FREPs than to BgSel. See Supplementary Table 1 for the gene sequence abbreviations and accession numbers. Combined with the analysis of the percentage of identity between sequences, this analysis identified new members of the FREP2, FREP3 and FREP12 families, new FREP subfamilies (8 new subfamilies featuring both IgSF domains and FBG domains) and unknown VIgL subfamilies (2 new subfamilies with unknown lectin domain). B/ Alignment of IgSF2 domains of FREP3.3, FREP3.2, CREP and GREP. Identical amino acid residues between CREP, GREP and FREP3 are boxed in grey.

Figure 3. High diversity within FREP12 subfamily.

A/ Positions of SNPs as obtained from sequencing of PCR products. B/ Mapping of RNA-seq reads against reference sequences. C/ Venn diagram representing the number of SNPs found with both techniques. D/ Legend of color-coded substitutions in A and B.

2.9. Validation of de novo transcripts assembly

Several de novo assembled transcripts were validated by traditional Sanger sequencing of PCR products. Briefly, BgBRE total RNA was reverse transcribed with random primers and RevertAid premium enzyme (Thermo Scientific). Two µl of the RT reaction was then used for PCR (Advantage 2 PCR system, Invitrogen, Carlsbad, CA, USA) with primers that were designed to specifically target and amplify novel predicted transcripts (supplementary table 2). Amplicons were cloned (pCR4-TOPO, Invitrogen) and sequenced (GATC Biotech, Konstanz, Germany). The Sequencher 4.5 program (Gene Codes, Ann Arbor, MI, USA) was used to align sequences from PCR products and for computational assembly of de novo transcripts (Supplementary file 3).

2.10. Data availability

Raw read fastq files were submitted to the Sequence Read Archive at NCBI (http://trace.ncbi.nlm.nih.gov/Traces/sra/) under the reference PRJNA213050. The reference transcriptome is publicly available from the transcriptomic database of Biomphalaria glabrata at the 2ei website (http://2ei.univ-perp.fr/?page_id=89). The sequences of CREP 1 to 4 and GREP have been submitted to NCBI database under the accession numbers KM975643, KM975644, KM975645, KM975646, KM975647, respectively.

3. Results

3.1. An extended array of potential immune-related molecules

RNAseq data were used to generate de novo a high quality reference transcriptome for B. glabrata (supplementary file 1, Figures S1 to S5). The annotated transcriptome was employed to search for and calculate the constitutive expression values of (known and novel) immune-related molecules and gene families (Supplementary File 2, Figure 1). We used a combined approach of keyword and protein domain searches similar to the one employed by Philipp et al. (Philipp et al., 2012). Immunoglobulin-like fold, C-type lectin domains (CLECT), Leucine-rich repeats (LRR), and FBG domains were the most frequent domains, followed by Tumor Necrosis Factor (TNF) and Toll-Interleukin-1 receptor (TIR) domains (Figure 1). Twenty of the transcripts with IgSF domains were predicted to be FREPs, and eighty-one of the FBG domain-containing transcripts (90% of these transcripts) were predicted to be FREPs. The most abundant transcripts had peptidoglycan recognition protein (PGRP) domains, lipopolysaccharide-binding protein (LBP/BPI) domains, Galectin (GLECT) domains and CLECT domains (expressed in rpkm values; Figure 1). Proteins with perforin (MACPF) domains, complement component C1q (C1q) domains and FBG domains are also among the top 10% of most highly expressed transcripts. The present study focused on the analysis of the diversity presented by FREPs and related molecules.

3.2. Diversity of Fibrinogen-related proteins (FREPs)

3.2.1. FREPs characterization

Further Blast searches identified 173 complete and partial transcripts that were highly similar to previously characterized B. glabrata FREPs. Among those, 86 transcripts incorporated characteristic combinations of diagnostic constituent domains that define FREP sequences. In order, from N- to C-terminal, these are signal peptide (SP), IgSF1, small connecting region (SCR), IgSF2, interceding region (ICR) and FBG. Note that the structure for FREPs with a single IgSF domain is SP, IgSF, ICR, FBG, with the sequence and intron-exon structure of the IgSF domain most similar to the IgSF2 found in the longer FREPs that present two IgSF domains. Specifically, 24 transcripts (27.9%) contained a complete or partial IgSF domain, a complete ICR and a complete or partial FBG domain; 58 transcripts (67.4%) contained a complete or partial IgSF domain and a partial ICR; and 4 transcripts (4.6%) were full length and contained all FREP domains (see alignment of Locus_2616_transcript_20 with FREP12 and FREP13 in Figure S6).

Transcript lengths ranged from a partial sequence of 179 nt (Locus_2479_Transcript_77) to a full-length FREP-encoding sequence of 2513 nt (Locus_2616_transcript_20). The IgSF and FBG domains were highly conserved among all transcripts whereas the ICR varied considerably in sequence and length (from 195 nt for Locus_2616_transcript_20 to 854 nt for Locus_3534_transcript_156). Characteristic imperfect repeats (consisting of amino acid triplets frequently composed of I, K, E residues) were often present within long ICR. The de novo assembly of 13 new FREPs (9 partial and 4 complete sequences) was validated by sequencing of PCR products obtained using transcript specific primers (Supplementary file 3, Supplementary Table 2).

3.2.2. Diversity of FREP subfamilies

None of the 173 transcripts was 100% identical to a FREP sequence entry in GenBank. The discovery of numerous new potential FREPs expands the sequence data available for comparison and facilitates a sharpening of the criteria used to assign FREPs as members of different subfamilies. Zhang et al. (Zhang and Loker, 2004a; Zhang et al., 2004) previously suggested assigning two sequences to the same subfamily if their nucleotide identity was equal to or greater than 85%. For three FREP regions, IgSF1, IgSF2 (combined with the IgSF sequence of FREPs with a single IgSF domain), and FBG, we generated matrices of amino acid identity and nucleotide identity with all BgBre sequences and all previously reported FREPs from GenBank. Phylogenetic approaches were used to cluster sequences and allowed subfamily assignments at the transcript and protein level. The branches of the resulting trees were strongly supported and distinguished different gene subfamilies of FREPs. At the nucleotide level, 90% or greater sequence identity of constituent domains correctly captures all FREP sequences that cluster together in "gene trees". This level of sequence difference groups FREP proteins that differ in amino acid composition by no more than 15% (≥85% identity at AA level) and also provides a sensitive criterion for distinction of products from different FREP gene subfamilies (Figure S7). Congruent subfamily assignments were obtained from analysis of the different FREP domains.
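The identity criterion can be illustrated with a short Python sketch that groups sequences whenever a pair reaches the threshold (single-linkage grouping). This is only an illustration of how a 90% cutoff partitions a set of sequences; in the study, subfamily assignment combined identity matrices with phylogenetic trees. The function name and the input layout (a dict of name pairs to percent identity) are hypothetical.

```python
def group_by_identity(names, identity, threshold=90.0):
    """Group sequence names linked by pairwise identity >= threshold.

    identity maps (name_a, name_b) -> percent identity. Grouping is single-linkage:
    two sequences share a group if a chain of pairs at or above the threshold connects them.
    """
    parent = {name: name for name in names}

    def find(name):
        # Follow parent links to the group representative, compressing the path.
        while parent[name] != name:
            parent[name] = parent[parent[name]]
            name = parent[name]
        return name

    for (a, b), pct in identity.items():
        if pct >= threshold:
            parent[find(a)] = find(b)

    groups = {}
    for name in names:
        groups.setdefault(find(name), []).append(name)
    return sorted(sorted(members) for members in groups.values())
```

Single-linkage grouping can chain together sequences that are individually below the threshold to one another, which is one reason a tree-based check remains useful.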

According to the analysis of the IgSF2 domain, six BgBre FREP transcripts were identified as new members of FREP families 2, 3 and 12 (see the ML trees provided in Figure 2 and Figures S8 to S13): L50943T1 was highly similar to NCBI FREP2; L3534T147 and L3534T148 were highly similar to NCBI FREP3; and L2479T70, L2479T84 and L2479T128 were highly similar to NCBI FREP12. The remaining FREP transcripts grouped into eight new FREP subfamilies and two novel, unknown subfamilies containing IgSF2. The analyses of IgSF1 and FBG showed that all the potential FREP transcripts were novel FREPs or unknown IgSF1-containing or FBG-containing molecules: these sequences did not cluster on the same branches as previously reported FREPs (ML trees) and did not have ≥ 90% nucleotide identity with any FREP in GenBank (Figures S8 to S13).

3.2.3. Diversity within a FREP subfamily

Based on >90% nucleotide identity, several transcripts were evidently members of the FREP12 subfamily. The mapping of reads from the Bre1 and Bre2 libraries against three divergent reference transcripts of a 420 nt long fragment of FREP12 (Locus_2479_Transcript_128, Locus_2479_transcript_147 and Locus_2479_Transcript_85) revealed 39 variable (SNP) nucleotide positions. From a pool of the original Bre1 and Bre2 RNA samples, 59 cloned amplicons of the same 420 bp long fragment were generated and sequenced by traditional Sanger sequencing (Supplementary file 3). Alignment of these sequences revealed 41 variable sites. The position and the nature of 33 SNPs (80%) were identical with both approaches (Figure 3); this confirmed that 7.9% of the residues in this 420 nt long fragment represent polymorphic sites. Clearly, FREP12 is highly diverse within this strain of B. glabrata.
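The comparison of the two SNP sets amounts to a set intersection. A minimal Python sketch, assuming each SNP is represented as a (position, substituted base) tuple so that both position and nature must match; the function name and this representation are hypothetical.

```python
def snp_concordance(rnaseq_snps, sanger_snps, fragment_length):
    """Summarize the SNPs shared by two detection methods.

    Each SNP is a (position, base) tuple; a SNP counts as shared only if both match.
    """
    shared = set(rnaseq_snps) & set(sanger_snps)
    return {
        "shared": len(shared),
        "percent_of_sanger": 100.0 * len(shared) / len(set(sanger_snps)),
        "percent_of_fragment": 100.0 * len(shared) / fragment_length,
    }
```

With the counts reported above (33 shared SNPs, 41 Sanger sites, 420 nt fragment), the ratios are 33/41 ≈ 80% and 33/420 ≈ 7.9%.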

3.3. Discovery of new Variable Immunoglobulin and Lectin containing molecules (VIgLs)

The search for particular IgSF domains as component of FREP sequences led to identification of novel molecules that resemble FREPs in domain structure and sequence. These included transcripts that consisted of one or two IgSF and an ICR highly similar to FREP sequences followed by either a C-type lectin domain or a galectin domain (Supplementary file 4, Figure 4). These were named C-type lectin-related protein (CREP) and galectin-related protein (GREP), respectively. The new de novo assembled transcripts were validated by Sanger sequencing of PCR products (Supplementary file 3, supplementary Table 2).

Figure 4. Similar domain structure of VIgLs.

SP: Signal Peptide; IgSF1: Immunoglobulin superfamily domain 1; IgSF2: Immunoglobulin superfamily domain 2; SCR: small connecting region; ICR: interceding region; FBG: fibrinogen domain; CLECT: C-type lectin domain; GLECT: galectin domain.

A total of four distinct CREPs composed of an IgSF domain in association with a C-terminal C-type lectin domain were identified in the reference transcriptome (Supplementary file 3). This architecture is identical to that of Biomphalaria selectin (BgSel; Guillou et al., 2004) and the lectins of ~35 kDa recovered from albumen gland and egg masses of B. glabrata (Hathaway et al., 2010), but the IgSF domain of CREP is more similar to the second IgSF domain sequence from FREPs (60% nt identity with IgSF2 from FREP3.3; gi: 346721864). The CLECT domain of CREPs had all 11 canonical residues that compose the ligand binding interface of C-type lectins (Figure S14). BlastX analysis showed that the CLECT domain of CREP1 (Locus_51322_Transcript_1) was more similar to molluscan CLECT (incilarin, perlucin, C-type lectin proteins and selectins), whereas CREP2, CREP3 and CREP4 (Locus_40683_Transcript_4, Locus_38272_Transcript_4 and Locus_39839_Transcript_1, respectively) have CLECT domains that were more similar to vertebrate CLECT (CLEC4 and CLEC17).

Only one full length GREP transcript was found (Locus_751_Transcript_1). It consisted of two tandemly arranged IgSF domains upstream of a C-terminal galectin domain. The N terminal IgSF domain is highly similar to (the N-terminal) IgSF1 of FREPs (52% nt identity with IgSF1 from FREP3.3; gi: 28875402), and the second IgSF domain has high similarity with (the second) IgSF2 of FREPs (43% nt identity with IgSF2 from FREP14; gi: 346721861). The GLECT domain of GREP was highly divergent from other previously reported galectins. Moreover, this is the first report of a GLECT domain that associates with another type of domain. Still, the GLECT domain of GREP did contain all the 8 canonical amino acids that constitute the conserved sugar binding pocket of GLECT (Figure S14). In addition, 8 of the 11 residues that compose the dimerization interface were conserved and all 11 residues that compose the putative alternate dimerization interface were present (Figure S14).

Due to the high percentage of identity between IgSF domains of CREPs, GREP and FREPs, IgSF2 domains of CREPs and GREP could be included in the phylogenetic analysis of IgSF2 domains of FREPs (BgSel was used as an outgroup to confirm that CREPs and GREP are more similar to FREPs; Figures S10 and S11, Figure 2) and the IgSF1 domain of GREP could be included in the phylogenetic analysis of IgSF1 domains of FREPs (Figures S8 and S9). Alignment demonstrates the high proportion of identical nucleotide and amino acid residues among IgSF domains of CREP, GREP, FREP3.2 and FREP3.3 (Figure 2). Moreover, the CREP and GREP-derived IgSF sequences did not segregate out as separate clades but clustered within different branches of FREP IgSF domains (Figures S8 to S11, Figure 2). Therefore, CREP and GREP IgSF sequences are as similar to the IgSF of FREP genes as the IgSF sequences of different FREPs are to each other (Figure 2). Interestingly, and in accordance with the analysis of CLECT domains, the IgSF domain of CREP1 clusters apart from the IgSF domains of CREP2, CREP3 and CREP4. The latter appear more similar to the IgSF domains of FREP2 and FREP14. In consideration of the evident high similarity of IgSF1 and IgSF2 domains of FREPs, CREPs and GREP, we grouped all these molecules within a single family distinct from other molecules with the same domain architecture (IgSF and lectin domains). FREPs, CREPs and GREP were named VIgL for Variable Immunoglobulin and Lectin domain containing molecules (Figure 4).

4. Discussion

Next generation sequencing techniques are lifting limitations that restricted access to invertebrate genomes, enabling comparative immunologists to study the structure and function of the immune system of any organism, including non-model species (Dheilly et al., 2014). In the present study, we tested the hypothesis that NGS allows the study of the diversity of highly variable molecules (Dheilly et al., 2014; Dheilly et al., 2013) by investigating the diversity of B. glabrata FREPs from a de novo generated transcriptome. FREPs constitute the most highly diversified immune recognition gene family described from B. glabrata and are actively involved in the anti-schistosome immune response (Hanington et al., 2010a; Mitta et al., 2012; Moné et al., 2010). This makes FREPs good candidates to evaluate the ability of state of the art de novo transcriptome assembly to provide a comprehensive representation of a large diversity of highly similar sequences. To date, 14 subfamilies of FREPs have been described, and these differ not just in nucleotide content but also by having one or two immunoglobulin domains and in the length of the interceding region. Moreover, FREPs within subfamilies diversify even further through different mechanisms including alternative splicing and somatic diversification through gene conversions and point mutations (Hanington et al., 2012; Zhang et al., 2004).

The initial search for FREP sequences in the reference transcriptome led to the discovery of transcripts that encode novel types of lectins, in which upstream IgSF domains highly similar to FREP IgSF domains are associated with different types of lectin domains: C-type lectins for CREPs and galectin for GREP (Figure 4). Sequencing of specific PCR products provided experimental confirmation for the existence of these new molecules. The recovery of four full-length CREP sequences suggests that CREPs constitute a new category of lectins in B. glabrata. The single full-length GREP has a novel, unique structure that reveals the existence of a new category of galectins. Indeed, it is the first report of a galectin domain associated with another domain. Galectins are a family of β-galactoside-binding lectins and are among the most conserved and ubiquitous families of lectins (Kilpatrick, 2002). They are composed of one to four galectin domains (Vasta et al., 2012). Galectins are involved in host-pathogen interactions through recognition of exogenous ligands such as glycans on the surface of viruses, bacteria, fungi and protozoa (Vasta, 2009; Vasta, 2012). Moreover, B. glabrata galectin (BgGal) has hemagglutinating activity. BgGal is absent from cell-free plasma and immunolocalizes in the plasma membrane of some sub-populations of snail hemocytes. It has been suggested that BgGal may serve as a pattern recognition receptor that selectively recognizes and binds hemocytes to pathogens with appropriate sugar ligands (Yoshino et al., 2008). All these observations provide further support for the idea that GREP could have a role in the B. glabrata immune response. Similarly to other galectins, GREPs may interact with each other and form dimers or multimers (Song et al., 2011; Tasumi and Vasta, 2007; Vasta, 2012; Zhang et al., 2011). The discovery of a GREP protein and of four CREPs reveals the existence in B. glabrata of a broader category of lectins that, like FREPs, Bgselectin (Guillou et al., 2004), and IgSF-CLECT (Hathaway et al., 2010) sequences, is composed of one or two closely related IgSF domains with a downstream lectin domain that may be either a fibrinogen (FBG), C-type lectin (CLECT) or galectin (GLECT) domain. The high similarity at nucleotide and amino acid levels of the IgSF domains and ICR within FREPs, CREPs and GREP may indicate that these sequences originated from a common ancestor gene and/or that these molecules participate in the same or related biological pathways. Lectins with related N-terminal sequences are likely to be processed by similar receptors or cell types and/or interact with each other in order to provide diversity in recognition molecules of various polysaccharides. Together, FREPs, CREPs and GREP are designated VIgL, which stands for the family of Variable Immunoglobulin and Lectin domain containing molecules.

The discovery of an extended VIgL family, in which GREP and CREPs are related yet different lectins that also contain IgSF sequences similar to the IgSF domains recorded from FREPs, emphasizes the need for a precise definition of B. glabrata FREPs. Correct identification of a FREP requires demonstration of an upstream IgSF domain followed by an FBG domain sequence. Partial transcripts that cover only IgSF domains, or IgSF domains and the ICR, cannot be named FREPs because they may derive from CREPs or GREP. Similarly, a partial sequence that contains only an FBG domain may originate from a B. glabrata fibrinogen-related molecule (FREM), which combines an FBG domain with upstream collagen-like repeats (Zhang et al., 2008). Consequently, of the 14 previously reported FREP gene subfamilies, 8 can be unambiguously considered as FREPs because they possess both IgSF and FBG domains (FREPs 2, 3, 4, 5, 7, 12, 13 and 14) (Léonard et al., 2001; Zhang et al., 2001; Zhang and Loker, 2003, 2004b).
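The naming rule stated above can be written as a small decision procedure. The following Python sketch is illustrative only: the function name, the domain labels passed as strings and the returned category strings are assumptions made for this example and are not part of the annotation pipeline used in the study.

```python
from typing import Sequence

# Lectin domain abbreviations follow the text (FBG, CLECT, GLECT).
# The mapping and all names below are hypothetical, for illustration only.
LECTIN_TO_CLASS = {"FBG": "FREP", "CLECT": "CREP", "GLECT": "GREP"}


def classify_transcript(domains: Sequence[str]) -> str:
    """Name a transcript from the ordered list of domains it covers (5' to 3')."""
    igsf = [i for i, d in enumerate(domains) if d == "IgSF"]
    lectins = [(i, d) for i, d in enumerate(domains) if d in LECTIN_TO_CLASS]

    if igsf and lectins:
        lectin_pos, lectin = lectins[0]
        # An IgSF domain must lie upstream of the lectin domain.
        if igsf[0] < lectin_pos:
            return LECTIN_TO_CLASS[lectin]
        return "unclassified"
    if igsf:
        # IgSF (with or without ICR) but no lectin domain: could be a partial
        # FREP, CREP or GREP, so it cannot be named more precisely.
        return "unknown VIgL"
    if lectins and lectins[0][1] == "FBG":
        # An FBG domain alone may come from a FREP or from a FREM.
        return "FBG only (FREP or FREM?)"
    return "unclassified"


if __name__ == "__main__":
    for doms in (["IgSF", "IgSF", "ICR", "FBG"], ["IgSF", "CLECT"],
                 ["IgSF", "ICR"], ["collagen-like", "FBG"]):
        print(doms, "->", classify_transcript(doms))
```

In this sketch a partial transcript is never promoted to FREP, CREP or GREP unless both an upstream IgSF domain and the corresponding lectin domain are covered, which mirrors the conservative definition given in the text.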

The current study yielded 28 transcripts that covered at least partially the IgSF domain and the FBG domain, and these new FREPs clustered in at least 8 new FREP subfamilies. The remaining 58 transcripts are referred to as “unknown VIgL” until a more complete sequence covering the IgSF domain and lectin domain is obtained. However, it seems that the more diversified a family is, the greater the proportion of partial sequences generated by the de novo assembly. For instance, of the 28 transcripts confirmed to be FREPs, only 14% were full length whereas the four CREP transcripts and the single GREP transcript recovered were almost completely full length. Although it indicates that at this time, for highly variable multigenic sequence families, it is difficult to use only Illumina RNA sequencing in efforts to assemble and reconstruct the complete family, this observation strongly suggests that most “unknown VIgL” subfamilies may be partial FREP sequences. The high discovery rate of novel FREP subfamilies is afforded by the unprecedented sequence coverage that is provided by RNAseq. Additionally, the absence of some subfamilies previously entered into GenBank and the discovery of new FREP and “unknown VIgL” subfamilies may result directly from the use of a Brazilian strain of B. glabrata, different from those studied before (Puerto Rico strains or crosses between Brazilian and Puerto Rico strains). The FREP gene family may increase in size each time the transcriptome of a different strain of B. glabrata is inspected, as new alleles or loci are revealed. Regardless, this study increased over two-fold the number of known FREP subfamilies and suggests that it represents only a fraction of the existing FREP diversity in B. glabrata. The present study has provided the necessary baseline knowledge of the diversity of FREPs, which can now be further investigated using a targeted sequencing approach to sequence the different FREP subfamilies (Babik et al., 2009; Hughes et al., 2013).

In previous analyses of FREP diversity, Zhang et al. (Zhang et al., 2004) and Hanington et al. (Hanington et al., 2012) showed that gene conversions may generate additional variants within the same FREP3 gene subfamily. The phylogenetic component of the present study did not detect the occurrence of recombinatorial diversification between different FREP subfamilies. Such recombination would have been revealed by an incongruent clustering of the same sequence in the phylogenetic analyses of the different domains (IgSF1, IgSF2 and FBG). While it may be that the assembly process did not reveal all sequence variants, or that healthy animals did not produce recombinant (variant or diversified) molecules, it may also be that recombinatorial diversification remains restricted to within subfamilies of FREPs. FREP gene sequences are also (somatically) diversified through random point mutations (Hanington et al., 2012; Moné et al., 2010; Zhang et al., 2004). In the present study of the Brazilian B. glabrata strain, the FREP12 subfamily appeared to be the most highly diversified subfamily. Different FREP12 source variants were assembled de novo. Then, read mapping against reference FREP12 transcripts and traditional Sanger sequencing of clones confirmed a further diversification by point mutation. Previous studies revealed a high diversity of FREP3 in the S. mansoni-exposed resistant BS90 strain of B. glabrata and a high diversity of FREP2 in an infected Brazilian strain of B. glabrata. The present study investigated the diversity of immune recognition molecules in non-stimulated snails. Indeed, B. glabrata is able to activate an efficient innate cellular immune response immediately after an encounter with a pathogen (Deleury et al., 2012; Mitta et al., 2012). Here, FREP12 variability is found in healthy non-challenged individuals, which suggests a constitutive expression of diversified FREPs that participate in constitutive or anticipatory immunity. 
Now, further studies on VIgLs diversity are necessary to determine if these molecules are more diversified in response to pathogen exposures and if there are differences between individuals and/or populations.
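The test for recombination between subfamilies described above (incongruent clustering of the same sequence across the IgSF1, IgSF2 and FBG trees) can be sketched as a simple comparison of per-domain subfamily assignments. This Python example is a minimal illustration under stated assumptions: the function name, the input structure and the transcript identifiers are hypothetical, and the subfamily assignments would in practice have to be read from the domain-specific phylogenies.

```python
from typing import Dict, Optional


def incongruent_transcripts(
    clade_by_domain: Dict[str, Dict[str, Optional[str]]]
) -> Dict[str, Dict[str, Optional[str]]]:
    """Return transcripts whose subfamily assignment differs between domains.

    clade_by_domain maps a transcript identifier to a dict of
    {domain name: subfamily in which that domain clusters}. A value of None
    means the domain is absent or not covered by the transcript.
    """
    flagged = {}
    for transcript, assignments in clade_by_domain.items():
        # Ignore domains that are missing from partial transcripts.
        observed = {clade for clade in assignments.values() if clade is not None}
        if len(observed) > 1:
            # Domains cluster with different subfamilies: candidate recombinant.
            flagged[transcript] = dict(assignments)
    return flagged


if __name__ == "__main__":
    # Entirely hypothetical assignments, for illustration only.
    example = {
        "transcript_A": {"IgSF1": "FREP12", "IgSF2": "FREP12", "FBG": "FREP12"},
        "transcript_B": {"IgSF1": None, "IgSF2": "FREP3", "FBG": "FREP3"},
        "transcript_C": {"IgSF1": "FREP2", "IgSF2": "FREP2", "FBG": "FREP14"},
    }
    print(sorted(incongruent_transcripts(example)))
```

In this made-up example only transcript_C would be flagged. A congruent result for every transcript, as reported in the present study, is consistent with the absence of recombination between subfamilies but does not rule out recombination within a subfamily.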

FREPs actively participate in B. glabrata resistance to S. mansoni (Hanington et al., 2010a; Hanington et al., 2012). They bind to S. mansoni Polymorphic Mucins (SmPoMuc) on the parasite surface (Moné et al., 2010) in ways thought to lead to parasite elimination (Mitta et al., 2012). The polymorphism and high diversity of interacting FREPs and SmPoMuc support the compatibility polymorphism and represent one part of the molecular processes involved in the matching phenotype hypothesis (Mitta et al., 2012). Indeed, in natural populations, some snail/schistosome interactions are compatible (meaning that the host is successfully infected) and others are not (the host is resistant to the parasite strain). The RNAi-effected knockdown of FREP3 led to susceptibility to trematode infection in about 30% of normally resistant B. glabrata. This demonstrated the contribution of FREP3 to gastropod immunity and at the same time confirmed the involvement of additional determinants of anti-schistosome immune responses (Hanington et al., 2010a; Hanington et al., 2012). Perhaps FREPs, CREPs and GREP serve as complementary or collaborative recognition factors that would be processed through the same pathway. In addition, different VIgLs could interact with each other to form complexes. Adema et al. (Adema et al., 1997a) showed that parasite-reactive lectins, including FREPs, occur as high molecular weight complexes under native conditions. It was not resolved whether these large multimers were of homologous or heterologous composition. Protein polymerization has previously been proposed for other highly variable molecules such as 185/333 proteins of sea urchins (Dheilly et al., 2009). In Anopheles gambiae, functional studies of fibrinogen-related domain (FreD) containing proteins and C-type lectins have revealed complementary and synergistic functions that are mediated by inter- and intra-molecular associations. 
Formation of multimers provides a means of increasing the mosquito’s pathogen recognition receptor repertoire and mediating anti-pathogen responses (Cirimotich et al., 2010). As stated by Schulenburg et al. (Schulenburg et al., 2007) regarding lectins, “multimerization increases binding valency and avidity, and as such, the potential for specific recognition of parasite molecules”. Interestingly, B. glabrata is able to mount a highly specific inducible protection against different strains or species of Schistosoma, as demonstrated in immune priming experiments (Portela et al., 2013). Future studies may clarify whether FREPs, CREPs and GREP are able to interact by multimerization. It is tempting to hypothesize that the high combinatorial diversity of the different types of VIgLs, which derive from different loci with multiple alleles and include FREPs that are further diversified by alternative splicing and continuous somatic mutations, serves non-self recognition in B. glabrata.


Highlights

  • Optimized de novo transcriptome assembly allows the study of highly diversified molecules
  • We increased over two-fold the number of bona fide FREP gene subfamilies
  • FREP12 are constitutively highly diversified
  • C-type lectin-related proteins (CREPs) combine an IgSF domain with a C-type lectin domain
  • Galectin-related protein (GREP) combines two IgSF domains with a Galectin domain
  • FREPs, CREPs and GREP constitute the VIgL family

Acknowledgments

NMD was supported by the Agence Nationale de la Recherche (ANR) Blanc, SVSE7, project Bodyguard to FT. BG acknowledges support from ANR JCJC INVIMORY number ANR-13-JSV7-0009. CMA acknowledges support from NIH grant number P20GM103452 from the National Institute of General Medical Sciences (NIGMS).

Footnotes

1. Present address: School of Marine and Atmospheric Sciences, Stony Brook University, Stony Brook, NY, USA

References

  1. Adema CM, Hertel LA, Loker ES. Infection with Echinostoma paraensei (Digenea) induces parasite-reactive polypeptides in the hemolymph of the gastropod host Biomphalaria glabrata. In: B N, editor. Parasite Effects on Host Physiology and Behavior. New York: Chapman Press; 1997a. pp. 77–99.
  2. Adema CM, Hertel LA, Miller RD, Loker ES. A family of fibrinogen-related proteins that precipitates parasite-derived molecules is produced by an invertebrate after infection. Proceedings of the National Academy of Sciences of the United States of America. 1997b;94:8691–8696. doi: 10.1073/pnas.94.16.8691.
  3. Anders S, Huber W. Differential expression analysis for sequence count data. Genome Biol. 2010;11:R106. doi: 10.1186/gb-2010-11-10-r106.
  4. Babik W, Taberlet P, Ejsmond M, Radwan J. New generation sequencers as a tool for genotyping of highly polymorphic multilocus MHC system. Mol Ecol Resour. 2009;9:713–719. doi: 10.1111/j.1755-0998.2009.02622.x.
  5. Bowden L, Dheilly NM, Raftos DA, Nair SV. New immune systems: pathogen-specific host defence, life history strategies and hypervariable immune-response genes of invertebrates. Invert Surv J. 2007;4:127–136.
  6. Brites D, Brena C, Ebert D, Du Pasquier L. More than one way to produce protein diversity: duplication and limited alternative splicing of an adhesion molecule gene in Basal arthropods. Evolution. 2013;67:2999–3011. doi: 10.1111/evo.12179.
  7. Brites D, McTaggart S, Morris K, Anderson J, Thomas K, Colson I, Fabbro T, Little TJ, Ebert D, Du Pasquier L. The Dscam homologue of the crustacean Daphnia is diversified by alternative splicing like in insects. Mol Biol Evol. 2008;25:1429–1439. doi: 10.1093/molbev/msn087.
  8. Buckley KM, Rast JP. Dynamic evolution of toll-like receptor multigene families in echinoderms. Frontiers in immunology. 2012;3:136. doi: 10.3389/fimmu.2012.00136.
  9. Cannon JP, Haire RN, Litman GW. Identification of diversified genes that contain immunoglobulin-like variable regions in a protochordate. Nat Immunol. 2002;3:1200–1207. doi: 10.1038/ni849.
  10. Castresana J. Selection of conserved blocks from multiple alignments for their use in phylogenetic analysis. Mol Biol Evol. 2000;17:540–552. doi: 10.1093/oxfordjournals.molbev.a026334.
  11. Cirimotich CM, Dong Y, Garver LS, Sim S, Dimopoulos G. Mosquito immune defenses against Plasmodium infection. Dev Comp Immunol. 2010;34:387–395. doi: 10.1016/j.dci.2009.12.005.
  12. Conesa A, Gotz S, Garcia-Gomez JM, Terol J, Talon M, Robles M. Blast2GO: a universal tool for annotation, visualization and analysis in functional genomics research. Bioinformatics. 2005;21:3674–3676. doi: 10.1093/bioinformatics/bti610.
  13. Deleury E, Dubreuil G, Elangovan N, Wajnberg E, Reichhart JM, Gourbal B, Duval D, Baron OL, Gouzy J, Coustau C. Specific versus non-specific immune responses in an invertebrate species evidenced by a comparative de novo sequencing study. PLoS ONE. 2012;7:e32512. doi: 10.1371/journal.pone.0032512.
  14. Dereeper A, Audic S, Claverie JM, Blanc G. BLAST-EXPLORER helps you building datasets for phylogenetic analysis. BMC evolutionary biology. 2010;10:8. doi: 10.1186/1471-2148-10-8.
  15. Dereeper A, Guignon V, Blanc G, Audic S, Buffet S, Chevenet F, Dufayard JF, Guindon S, Lefort V, Lescot M, Claverie JM, Gascuel O. Phylogeny.fr: robust phylogenetic analysis for the non-specialist. Nucleic Acids Res. 2008;36:W465–W469. doi: 10.1093/nar/gkn180.
  16. Dheilly N, Lelong C, Huvet A, Favrel P. Development of a Pacific oyster (Crassostrea gigas) 31,918-feature microarray: identification of reference genes and tissue-enriched expression patterns. BMC Genomics. 2011;12:468. doi: 10.1186/1471-2164-12-468.
  17. Dheilly NM, Adema C, Raftos DA, Gourbal B, Grunau C, Du Pasquier L. No more non model species: The promise of next generation sequencing for comparative immunology. Dev Comp Immunol. 2014;45:56–66. doi: 10.1016/j.dci.2014.01.022.
  18. Dheilly NM, Haynes PA, Raftos DA, Nair SV. Time course proteomic profiling of cellular responses to immunological challenge in the sea urchin, Heliocidaris erythrogramma. Dev Comp Immunol. 2012;37:243–256. doi: 10.1016/j.dci.2012.03.006.
  19. Dheilly NM, Nair SV, Smith LC, Raftos DA. Highly variable immune-response proteins (185/333) from the sea urchin, Strongylocentrotus purpuratus: proteomic analysis identifies diversity within and between individuals. J Immunol. 2009;182:2203–2212. doi: 10.4049/jimmunol.07012766.
  20. Dheilly NM, Raftos DA, Haynes PA, Smith LC, Nair SV. Shotgun proteomics of coelomic fluid from the purple sea urchin, Strongylocentrotus purpuratus. Dev Comp Immunol. 2013;40:35–50. doi: 10.1016/j.dci.2013.01.007.
  21. Dishaw LJ, Giacomelli S, Melillo D, Zucchetti I, Haire RN, Natale L, Russo NA, De Santis R, Litman GW, Pinto MR. A role for variable region-containing chitin-binding proteins (VCBPs) in host gut-bacteria interactions. Proc Natl Acad Sci. 2011;108:16747–16752. doi: 10.1073/pnas.1109687108.
  22. Fu L, Niu B, Zhu Z, Wu S, Li W. CD-HIT: accelerated for clustering the next-generation sequencing data. Bioinformatics. 2012;28:3150–3152. doi: 10.1093/bioinformatics/bts565.
  23. Ghosh J, Lun CM, Majeske AJ, Sacchi S, Schrankel CS, Smith LC. Invertebrate immune diversity. Dev Comp Immunol. 2011;35:959–974. doi: 10.1016/j.dci.2010.12.009.
  24. Giardine B, Riemer C, Hardison RC, Burhans R, Elnitski L, Shah P, Zhang Y, Blankenberg D, Albert I, Taylor J, Miller W, Kent WJ, Nekrutenko A. Galaxy: a platform for interactive large-scale genome analysis. Genome Res. 2005;15:1451–1455. doi: 10.1101/gr.4086505.
  25. Gouy M, Guindon S, Gascuel O. SeaView version 4: A multiplatform graphical user interface for sequence alignment and phylogenetic tree building. Mol Biol Evol. 2010;27:221–224. doi: 10.1093/molbev/msp259.
  26. Guillou F, Mitta G, Dissous C, Pierce R, Coustau C. Use of individual polymorphism to validate potential functional markers: case of a candidate lectin (BgSel) differentially expressed in susceptible and resistant strains of Biomphalaria glabrata. Comparative biochemistry and physiology. 2004;138:175–181. doi: 10.1016/j.cbpc.2004.03.010.
  27. Hamada M, Shoguchi E, Shinzato C, Kawashima T, Miller DJ, Satoh N. The complex NOD-like receptor repertoire of the coral Acropora digitifera includes novel domain combinations. Mol Biol Evol. 2012;30:167–176. doi: 10.1093/molbev/mss213.
  28. Hanington PC, Forys MA, Dragoo JW, Zhang SM, Adema CM, Loker ES. Role for a somatically diversified lectin in resistance of an invertebrate to parasite infection. Proc Natl Acad Sci. 2010a;107:21087–21092. doi: 10.1073/pnas.1011242107.
  29. Hanington PC, Forys MA, Loker ES. A somatically diversified defense factor, FREP3, is a determinant of snail resistance to schistosome infection. PLoS Negl Trop Dis. 2012;6:e1591. doi: 10.1371/journal.pntd.0001591.
  30. Hanington PC, Lun CM, Adema CM, Loker ES. Time series analysis of the transcriptional responses of Biomphalaria glabrata throughout the course of intramolluscan development of Schistosoma mansoni and Echinostoma paraensei. Int J Parasitol. 2010b;40:819–831. doi: 10.1016/j.ijpara.2009.12.005.
  31. Hathaway JJM, Adema CM, Stout BA, Mobarak CD, Loker ES. Identification of protein components of egg masses indicates parental investment in immunoprotection of offspring by Biomphalaria glabrata (Gastropoda, Mollusca). Dev Comp Immunol. 2010;34:425–435. doi: 10.1016/j.dci.2009.12.001.
  32. Hauton C, Smith VJ. Adaptive immunity in invertebrates: A straw house without a mechanistic foundation. BioEssays. 2007;29:1138–1146. doi: 10.1002/bies.20650.
  33. Hertel LA, Adema CM, Loker ES. Differential expression of FREP genes in two strains of Biomphalaria glabrata following exposure to the digenetic trematodes Schistosoma mansoni and Echinostoma paraensei. Dev Comp Immunol. 2005;29:295–303. doi: 10.1016/j.dci.2004.08.003.
  34. Hughes GM, Gang L, Murphy WJ, Higgins DG, Teeling EC. Using Illumina next generation sequencing technologies to sequence multigene families in de novo species. Mol Ecol Resour. 2013;13:510–521. doi: 10.1111/1755-0998.12087.
  35. Ishikawa H. Evolution of Ribosomal RNA. Comp Biochem Physiol B. 1977;58:1–7. doi: 10.1016/0305-0491(77)90116-x.
  36. Kilpatrick DC. Animal lectins: a historical introduction and overview. Biochimica et Biophysica Acta. 2002;1572:187–197. doi: 10.1016/s0304-4165(02)00308-2.
  37. Léonard PM, Adema CM, Zhang S-M, Loker ES. Structure of two FREP genes that combine IgSF and fibrinogen domains, with comments on diversity of the FREP gene family in the snail Biomphalaria glabrata. Gene. 2001;269:155–165. doi: 10.1016/s0378-1119(01)00444-9.
  38. Li H, Durbin R. Fast and accurate short read alignment with Burrows-Wheeler transform. Bioinformatics. 2009;25:1754–1760. doi: 10.1093/bioinformatics/btp324.
  39. Li H, Handsaker B, Wysoker A, Fennell T, Ruan J, Homer N, Marth G, Abecasis G, Durbin R. The Sequence Alignment/Map format and SAMtools. Bioinformatics. 2009;25:2078–2079. doi: 10.1093/bioinformatics/btp352.
  40. Mitta G, Adema CM, Gourbal B, Loker ES, Theron A. Compatibility polymorphism in snail/schistosome interactions: From field to theory to molecular mechanisms. Dev Comp Immunol. 2012;37:1–8. doi: 10.1016/j.dci.2011.09.002.
  41. Moné Y, Gourbal B, Duval D, Du Pasquier L, Kieffer-Jaquinod S, Mitta G. A large repertoire of parasite epitopes matched by a large repertoire of host Immune receptors in an invertebrate host/parasite model. PLoS Negl Trop Dis. 2010;4:e813. doi: 10.1371/journal.pntd.0000813.
  42. Nair SV, Del Valle H, Gross PS, Terwilliger DP, Smith LC. Macroarray analysis of coelomocyte gene expression in response to LPS in the sea urchin. Identification of unexpected immune diversity in an invertebrate. Physiol Genomics. 2005;22:33–47. doi: 10.1152/physiolgenomics.00052.2005.
  43. Nei M, Kumar S. Molecular evolution and phylogenetics. Oxford University Press; 2000. p. 333.
  44. Neves G, Chess A. Dscam-mediated self- versus non-self-recognition by individual neurons. Cold Spring Harbor symposia on quantitative biology. 2004;69:485–488. doi: 10.1101/sqb.2004.69.485.
  45. Philipp EER, Kraemer L, Melzner F, Poustka AJ, Thieme S, Findeisen U, Schreiber S, Rosenstiel P. Massively Parallel RNA Sequencing Identifies a Complex Immune Gene Repertoire in the lophotrochozoan Mytilus edulis. PLoS One. 2012;7:e33091. doi: 10.1371/journal.pone.0033091.
  46. Portela J, Duval D, Rognon A, Galinier R, Boissier J, Coustau C, Mitta G, Theron A, Gourbal B. Evidence for specific genotype-dependent immune priming in the Lophotrochozoan Biomphalaria glabrata snail. Journal of innate immunity. 2013;5:261–276. doi: 10.1159/000345909.
  47. Schulenburg H, Boehnisch C, Michiels NK. How do invertebrates generate a highly specific innate immune response? Mol Immunol. 2007;44:3338–3344. doi: 10.1016/j.molimm.2007.02.019.
  48. Schulenburg H, Hoeppner MP, Weiner J, 3rd, Bornberg-Bauer E. Specificity of the innate immune system and diversity of C-type lectin domain (CTLD) proteins in the nematode Caenorhabditis elegans. Immunobiol. 2008;213:237–250. doi: 10.1016/j.imbio.2007.12.004.
  49. Schulz MH, Zerbino DR, Vingron M, Birney E. Oases: robust de novo RNA-seq assembly across the dynamic range of expression levels. Bioinformatics. 2012;28:1086–1092. doi: 10.1093/bioinformatics/bts094.
  50. Shinzato C, Shoguchi E, Kawashima T, Hamada M, Hisata K, Tanaka M, Fujie M, Fujiwara M, Koyanagi R, Ikuta T, Fujiyama A, Miller DJ, Satoh N. Using the Acropora digitifera genome to understand coral responses to environmental change. Nature. 2011;476:320–323. doi: 10.1038/nature10249.
  51. Song X, Zhang H, Wang L, Zhao J, Mu C, Song L, Qiu L, Liu X. A galectin with quadruple-domain from bay scallop Argopecten irradians is involved in innate immune response. Dev Comp Immunol. 2011;35:592–602. doi: 10.1016/j.dci.2011.01.006.
  52. Tamura K, Peterson D, Peterson N, Stecher G, Nei M, Kumar S. MEGA5: molecular evolutionary genetics analysis using maximum likelihood, evolutionary distance, and maximum parsimony methods. Mol Biol Evol. 2011;28:2731–2739. doi: 10.1093/molbev/msr121.
  53. Tasumi S, Vasta GR. A galectin of unique domain organization from hemocytes of the eastern oyster (Crassostrea virginica) is a receptor for the protistan parasite Perkinsus marinus. J Immunol. 2007;179:3086–3098. doi: 10.4049/jimmunol.179.5.3086.
  54. Vasta GR. Roles of galectins in infection. Nat Rev Micro. 2009;7:424–438. doi: 10.1038/nrmicro2146.
  55. Vasta GR. Galectins as pattern recognition receptors: structure, function, and evolution. Adv Exp Med Biol. 2012;946:21–36. doi: 10.1007/978-1-4614-0106-3_2.
  56. Vasta GR, Ahmed H, Nita-Lazar M, Banerjee A, Pasek M, Shridhar S, Guha P, Fernandez-Robledo J. Galectins as self/non-self recognition receptors in innate and adaptive immunity: An unresolved paradox. Frontiers in immunology. 2012;3. doi: 10.3389/fimmu.2012.00199.
  57. Watson FL, Puttmann-Holgado R, Thomas F, Lamar DL, Hughes M, Kondo M, Rebel VI, Schmucker D. Extensive diversity of Ig-superfamily proteins in the immune system of insects. Science. 2005;309:1874–1878. doi: 10.1126/science.1116887.
  58. Watthanasurorot A, Jiravanichpaisal P, Liu H, Soderhall I, Soderhall K. Bacteria-induced Dscam isoforms of the crustacean, Pacifastacus leniusculus. PLoS pathogens. 2011;7:e1002062. doi: 10.1371/journal.ppat.1002062. [Retracted]
  59. Winnebeck EC, Millar CD, Warman GR. Why does insect RNA look degraded? Journal of insect science (Online). 2010;10:159. doi: 10.1673/031.010.14119.
  60. Yoshino TP, Dinguirard N, Kunert J, Hokke CH. Molecular and functional characterization of a tandem-repeat galectin from the freshwater snail Biomphalaria glabrata, intermediate host of the human blood fluke Schistosoma mansoni. Gene. 2008;411:46–58. doi: 10.1016/j.gene.2008.01.003.
  61. Zerbino DR, Birney E. Velvet: Algorithms for de novo short read assembly using de Bruijn graphs. Genome Res. 2008;18:821–829. doi: 10.1101/gr.074492.107.
  62. Zhang D, Jiang S, Hu Y, Cui S, Guo H, Wu K, Li Y, Su T. A multidomain galectin involved in innate immune response of pearl oyster Pinctada fucata. Dev Comp Immunol. 2011;35:1–6. doi: 10.1016/j.dci.2010.08.007.
  63. Zhang S-M, Léonard PM, Adema CM, Loker ES. Parasite-responsive IgSF members in the snail Biomphalaria glabrata: characterization of novel genes with tandemly arranged IgSF domains and a fibrinogen domain. Immunogenetics. 2001;53:684–694. doi: 10.1007/s00251-001-0386-8.
  64. Zhang S-M, Loker ES. Representation of an immune responsive gene family encoding fibrinogen-related proteins in the freshwater mollusc Biomphalaria glabrata, an intermediate host for Schistosoma mansoni. Gene. 2004a;341:255–266. doi: 10.1016/j.gene.2004.07.003.
  65. Zhang S-M, Nian H, Zeng Y, Dejong RJ. Fibrinogen-bearing protein genes in the snail Biomphalaria glabrata: characterization of two novel genes and expression studies during ontogenesis and trematode infection. Dev Comp Immunol. 2008;32:1119–1130. doi: 10.1016/j.dci.2008.03.001.
  66. Zhang SM, Adema CM, Kepler TB, Loker ES. Diversification of Ig superfamily genes in an invertebrate. Science. 2004;305:251–254. doi: 10.1126/science.1088069.
  67. Zhang SM, Loker ES. The FREP gene family in the snail Biomphalaria glabrata: additional members, and evidence consistent with alternative splicing and FREP retrosequences. Fibrinogen-related proteins. Dev Comp Immunol. 2003;27:175–187. doi: 10.1016/s0145-305x(02)00091-5.
  68. Zhang SM, Loker ES. Representation of an immune responsive gene family encoding fibrinogen-related proteins in the freshwater mollusc Biomphalaria glabrata, an intermediate host for Schistosoma mansoni. Gene. 2004b;341:255–266. doi: 10.1016/j.gene.2004.07.003.
