Abstract
Nematode Chemosensory G-Protein Coupled Receptors (GPCRs) (NemChRs) have expanded within nematodes, where they play important roles in foraging and host-seeking behavior. In various parasitic nematodes, NemChRs are most highly expressed during free-living stages, when chemosensory signaling is required for host detection and nematode activation. This positions NemChRs at the transition from infective to parasitic stages and makes them important regulators to study in terms of host seeking and host specificity. To facilitate the analysis of NemChRs, here we describe an integrative database of nematode chemoreceptors called NemChR-DB. This database enables users to study diverse parasitic nematode chemoreceptors, functionally explore sequence entries through structural and literature-based annotations, and perform cross-species comparisons.
Keywords: Nematode, Parasite, Chemoreceptor, G-protein coupled receptor, Database, NemChR
Nematodes are a ubiquitous and diverse group of animals that comprise both free-living and parasitic species. Parasitic nematodes of humans infect more than 1 billion people globally with an estimated 4 billion at risk for infection (Bethony et al., 2006). Four species of soil-transmitted helminths (STHs), Ascaris lumbricoides, Trichuris trichiura, Necator americanus and Ancylostoma duodenale, are responsible for most cases of human infection. They are prevalent in tropical and subtropical zones with poor sanitation practices that lead to transmission from eggs present in human feces. In the case of hookworm, heavy infection can result in debilitating and sometimes fatal iron deficiency anemia caused by blood loss due to the adult nematodes that feed in the intestine. In some cases, blockage of the intestine will necessitate surgery. These heavy infections are especially devastating in children, causing stunted physical and cognitive development, which may be permanent (Lozoff et al., 1991). Pregnant women as well as elderly people are also at high risk for morbidity (Bundy et al., 1995). Patients with lighter infections may be asymptomatic or exhibit mild symptoms such as diarrhea or abdominal pain.
G-Protein Coupled Receptors (GPCRs) represent the largest and most diverse family of membrane proteins in eukaryotes. Each member of the family shares a common architecture consisting of a single polypeptide containing seven hydrophobic transmembrane domains. A subgroup of GPCRs called the Nematode Chemosensory GPCRs (NemChRs) is unique to nematodes, and therefore represents an important target of study for understanding and potentially treating nematode infections (Robertson and Thomas, 2006; Krishnan et al., 2014). Additionally, NemChRs and their agonists are important regulators of diverse behaviors in nematodes (Sengupta et al., 1996; Choe et al., 2012; Greene et al., 2016), and genetic deletion of their downstream effectors has been shown to block nematode activation and behavioral responses to host-emitted cues (Gang et al., 2020; Wheeler et al., 2020). Recent work from our group and others has revealed a strong correlation between NemChR expression and free-living stages of parasitic nematodes (Bernot et al., 2020; Wheeler et al., 2020), suggesting a role for NemChRs in host-seeking behavior. The primary sensory organ in nematodes is the amphid, and at the neuroanatomical level this organ is highly conserved across diverse nematode species (Ashton and Schad, 1996). In Caenorhabditis elegans, NemChRs have been shown to localize at the dendritic cilia of the amphid sensory organ (Sengupta et al., 1996; Vidal et al., 2018), and based on sequence data NemChRs have been organized into 19 groups. Members of these 19 groups are organized into the following family and subfamily classifications: Sra - sra, srab, srb, sre; Str - srd, srh, sri, srj; Srg - srt, sru, srv, srx, srxa; and other: srbc, srsx, srw, srz (Robertson and Thomas, 2006).
Interestingly, chemosensory GPCRs in eukaryotes likely evolved independently multiple times, which has led to substantial diversification of chemosensory GPCRs between deuterostomes and protostomes, and to subsequent expansion of ancestral chemosensory GPCRs within the nematode lineage to produce the NemChRs (Krishnan et al., 2014). To accelerate the study of NemChRs in terms of nematode evolution and chemosensory behavior, we describe here a unified resource of NemChRs called NemChR-DB. NemChR-DB is freely available to all users and provides an intuitive and integrative platform for the study of parasitic nematode chemoreceptors. NemChR-DB is accessible at the following URL: http://ohalloranlab.net/nemchr-db.
Data contained within NemChR-DB were mined from WormBase ParaSite (ver. WBPS14) (Howe et al., 2017) (Table 1) and UniProt (UniProt Consortium, 2019) using a pipeline developed previously (Wheeler et al., 2020). Briefly, hmmsearch (Mistry et al., 2013) was used to compare proteins against a database of GPCR Pfam hidden Markov models (HMMs) (Finn et al., 2014), and the results were filtered to keep proteins with matches to nematode chemoreceptor HMMs and Caenorhabditis elegans chemoreceptors. We also included an additional filter to remove FMRFamide-like receptors. Sequences were then provided as input to TMHMM (ver. 2.0c) (Sonnhammer et al., 1998) and Phobius (Käll et al., 2004) in order to obtain the predicted number of transmembrane (TM) domains and their topology. The sequences were then used to retrieve additional information such as PubMed ID, taxonomy and molecular weight from UniProt. Next, all of this information was organized into a database of JSON objects, which is served asynchronously using Ajax and jQuery DataTables. The database contains 8,790 putative chemoreceptors from 53 unique nematode species representing 20 superfamilies of nematodes and 28 nematode families. Table 2 details the summary statistics for NemChR-DB in terms of chemoreceptor counts and quality scores for genomes represented in the database.

We also examined the relationship between putative chemoreceptor counts and genome quality by comparing the numbers of putative chemoreceptors from each species with Core Eukaryotic Genes Mapping Approach (CEGMA) scores (Parra et al., 2007) and Benchmarking Universal Single Copy Orthologs (BUSCO) scores (Seppey et al., 2019) (Fig. 1). We did not observe a significant correlation in either case, suggesting that the numbers of predicted NemChRs in each species were not influenced significantly by genome quality (Fig. 1: BUSCO Pearson’s correlation coefficient r = 0.037 and CEGMA r = 0.156, P > 0.05 in each case). However, BUSCO scores varied across the genomes we used to identify NemChRs (Table 2: mean BUSCO score = 75.25, σ = 18.5), and so improvement of BUSCO scores in some cases may alter final NemChR counts.
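The HMM-based filtering step of the pipeline can be illustrated with a minimal Python sketch. It is not the authors' pipeline (which is described in Wheeler et al., 2020); it only assumes that hmmsearch was run with a tabular (`--tblout`) output, in which the first column is the protein, the third column is the HMM profile, and the fifth and sixth columns are the full-sequence E-value and bit score. The profile names in `CHEMORECEPTOR_HMMS`, the `max_evalue` cutoff, and the example rows are hypothetical placeholders.

```python
from typing import Dict, Iterable, List, Set, Tuple

# Illustrative profile names only; the actual HMM list used for NemChR-DB is not given here.
CHEMORECEPTOR_HMMS: Set[str] = {"7TM_GPCR_Srh", "7TM_GPCR_Str", "7TM_GPCR_Srsx"}


def best_hits(tblout_lines: Iterable[str]) -> Dict[str, Tuple[str, float]]:
    """Map each protein to (profile name, E-value) of its highest-scoring hit."""
    best: Dict[str, Tuple[str, float, float]] = {}
    for line in tblout_lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comment/header lines
        fields = line.split()
        protein, profile = fields[0], fields[2]
        evalue, score = float(fields[4]), float(fields[5])
        if protein not in best or score > best[protein][2]:
            best[protein] = (profile, evalue, score)
    return {p: (prof, ev) for p, (prof, ev, _) in best.items()}


def keep_chemoreceptors(tblout_lines: Iterable[str], max_evalue: float = 1e-4) -> List[str]:
    """Keep proteins whose best hit is a chemoreceptor profile (hypothetical cutoff)."""
    hits = best_hits(tblout_lines)
    return sorted(
        protein
        for protein, (profile, evalue) in hits.items()
        if profile in CHEMORECEPTOR_HMMS and evalue <= max_evalue
    )


# Invented rows, truncated to the first seven columns for brevity.
EXAMPLE_TBLOUT = [
    "# target name  accession  query name  accession  E-value  score  bias",
    "protA  -  7TM_GPCR_Srh  -  1e-50  170.2  10.1",
    "protA  -  7tm_1         -  1e-10   40.0   2.0",
    "protB  -  7tm_1         -  3e-30  100.5   1.5",
]

kept = keep_chemoreceptors(EXAMPLE_TBLOUT)
print(kept)
```

Selecting the best-scoring profile per protein before filtering is one simple way to exclude proteins that match a chemoreceptor HMM only weakly while matching another GPCR family more strongly; the comparison against C. elegans chemoreceptors and the FMRFamide-like receptor filter are not shown.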
Table 1.
List of species and genomes used to predict chemoreceptors for NemChR-DB
Species name | Assembly | BioProject ID | Clade |
---|---|---|---|
Trichuris muris | TMUE3.0 | PRJEB126 | Clade I |
Trichuris suis | Tsuis_adult_female_1.0 | PRJNA208416 | Clade I |
Trichinella spiralis | T1_ISS3_r1.0 | PRJNA257433 | Clade I |
Trichuris trichiura | TTRE2.1 | PRJEB535 | Clade I |
Trichinella pseudospiralis | T4_ISS588_r1.0 | PRJNA257433 | Clade I |
Romanomermis culicivorax | nRc.2.0 | PRJEB1358 | Clade I |
Toxocara canis | Toxocara_canis_adult_r1.0 | PRJNA248777 | Clade III |
Ascaris suum | AscSuum_1.0_submitted | PRJNA80881 | Clade III |
Brugia malayi | Bmal-4.0 | PRJNA10729 | Clade III |
Ascaris lumbricoides | A_lumbricoides_Ecuador_v1_5_4 | PRJEB4950 | Clade III |
Onchocerca volvulus | ASM49940v2 | PRJEB513 | Clade III |
Brugia pahangi | B_pahangi_Glasgow_0011_upd | PRJEB497 | Clade III |
Dracunculus medinensis | D_medinensis_Ghana_v2_0_4 | PRJEB500 | Clade III |
Brugia timori | B_timori_Indonesia_v1_0_4_001_upd | PRJEB4663 | Clade III |
Onchocerca ochengi | ASM90053720v1 | PRJEB1465 | Clade III |
Thelazia callipaeda | T_callipaeda_Ticino_0011_upd | PRJEB1205 | Clade III |
Syphacia muris | S_muris_Valencia_v1_0_4 | PRJEB524 | Clade III |
Loa loa | Loa_loa_V3 | PRJNA37757 | Clade III |
Litomosoides sigmodontis | ASM90053727v1 | PRJEB3075 | Clade III |
Anisakis simplex | A_simplex_0011_upd | PRJEB496 | Clade III |
Wuchereria bancrofti | W_bancrofti_Jakarta_0011_upd | PRJEB536 | Clade III |
Dirofilaria immitis | nDi.2.2 | PRJEB1797 | Clade III |
Parascaris univalens | ASM225920v1 | PRJNA386823 | Clade III |
Elaeophora elaphi | E_elaphi_v1_0_4 | PRJEB502 | Clade III |
Steinernema glaseri | S_glas_v1_submitted | PRJNA204943 | Clade IV |
Bursaphelenchus xylophilus | ASM23113v1_submitted | PRJEA64437 | Clade IV |
Steinernema carpocapsae | ASM75764v3 | PRJNA202318 | Clade IV |
Rhabditophanes sp. KR3021 | Rhabditophanes_sp_KR3021 | PRJEB1297 | Clade IV |
Strongyloides venezuelensis | S_venezuelensis_HH1 | PRJEB530 | Clade IV |
Meloidogyne hapla | Freeze_1 | PRJNA29083 | Clade IV |
Strongyloides stercoralis | S_stercoralis_PV0001_v2_0_4 | PRJEB528 | Clade IV |
Strongyloides ratti | S_ratti_ED321_v5_0_4 | PRJEB125 | Clade IV |
Strongyloides papillosus | S_papillosus_LIN_v2_1_4 | PRJEB525 | Clade IV |
Parastrongyloides trichosuri | P_trichosuri_KNP | PRJEB515 | Clade IV |
Globodera pallida | GPAL001 | PRJEB123 | Clade IV |
Panagrellus redivivus | Pred3 | PRJNA186477 | Clade IV |
Ancylostoma ceylanicum | A_ceylanicum1.3.ec.cg.pg | PRJNA72583 | Clade V |
Ancylostoma caninum | A_caninum_9.3.2.ec.cg.pg | PRJNA72585 | Clade V |
Haemonchus contortus | Hco_v4_coding_submitted | PRJNA205202 | Clade V |
Ancylostoma duodenale | A_duodenale_2.2.ec.cg.pg | PRJNA72581 | Clade V |
Nippostrongylus brasiliensis | N_brasiliensis_RM07_v1_5_4_0011_upd | PRJEB511 | Clade V |
Necator americanus | N_americanus_v1 | PRJNA72135 | Clade V |
Dictyocaulus viviparus | D_viviparus_9.2.1.ec.pg | PRJNA72587 | Clade V |
Angiostrongylus cantonensis | A_cantonensis_Taipei_v1_5_4 | PRJEB493 | Clade V |
Haemonchus placei | H_placei_MHpl1_0011_upd | PRJEB509 | Clade V |
Diploscapter pachys | DipSp1Ass11Ann3 | PRJNA280107 | Clade V |
Heligmosomoides polygyrus | nHp_v2.0 | PRJEB15396 | Clade V |
Oesophagostomum dentatum | O_dentatum_10.0.ec.cg.pg | PRJNA72579 | Clade V |
Heterorhabditis bacteriophora | Heterorhabditis_bacteriophora-7.0 | PRJNA13977 | Clade V |
Angiostrongylus costaricensis | A_costaricensis_Costa_Rica_0011_upd | PRJEB494 | Clade V |
Strongylus vulgaris | S_vulgaris_Kentucky_0011_upd | PRJEB531 | Clade V |
Teladorsagia circumcincta | T_circumcincta.14.0.ec.cg.pg | PRJNA72569 | Clade V |
Cylicostephanus goldi | C_goldi_Cheshire_0011 | PRJEB498 | Clade V |
Table 2.
Summary of chemoreceptor counts and genome quality scores in NemChR-DB
Metric | Value (species) |
---|---|
Max. CR counts | 811 (Ancylostoma ceylanicum) |
Min. CR counts | 21 (Litomosoides sigmodontis) |
Avg. CR counts | 165.8 |
Median CR counts | 135 |
StDev CR counts | 137.4 |
Max. BUSCO score | 97.6 (Onchocerca volvulus) |
Min. BUSCO score | 11.4 (Cylicostephanus goldi) |
Avg. BUSCO score | 75.25 |
Median BUSCO score | 79.9 |
StDev BUSCO score | 18.5 |
Max. CEGMA score | 99.6 (Strongyloides ratti) |
Min. CEGMA score | 17.74 (Cylicostephanus goldi) |
Avg. CEGMA score | 90 |
Median CEGMA score | 94.76 |
StDev CEGMA score | 14.9 |
CR, chemoreceptor; Max., maximum; Min., minimum; Avg., average; StDev, standard deviation.
Fig. 1.
Correlation plots of chemoreceptor counts versus complete BUSCO scores (A) and CEGMA scores (B). BUSCO and CEGMA scores were downloaded from WormBase ParaSite ver. WBPS14 (WS271). Pearson’s correlation coefficient: r = 0.037 and P = 0.79053 for BUSCO; r = 0.156 and P = 0.2599 for CEGMA. Spearman’s Rho: rs = −0.136 and P = 0.323 for BUSCO; rs = 0.094 and P = 0.49 for CEGMA. Neither test provides significant evidence for a correlation between genome quality and chemoreceptor counts. Plots were generated using matplotlib; Pearson’s correlation coefficient was calculated using pearsonr and Spearman’s Rho using spearmanr, both from scipy.stats.
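For readers who want to see what the two coefficients in Fig. 1 measure, the following standard-library sketch computes Pearson's r and Spearman's Rho (Pearson's r on average ranks). It is an illustration only: the published values came from pearsonr and spearmanr in scipy.stats, which also return the P values that this sketch does not compute, and the numbers below are invented toy data rather than values from NemChR-DB.

```python
import math
from typing import List, Sequence


def pearson_r(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson's correlation coefficient of two equal-length samples."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def average_ranks(values: Sequence[float]) -> List[float]:
    """1-based ranks, with tied values sharing the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1  # extend the run of tied values
        shared_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


def spearman_rho(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman's Rho, computed as Pearson's r on the ranked data."""
    return pearson_r(average_ranks(x), average_ranks(y))


# Toy data (invented): genome completeness scores and chemoreceptor counts.
toy_scores = [50.0, 60.0, 70.0, 80.0, 90.0]
toy_counts = [120, 30, 200, 45, 150]

print(round(pearson_r(toy_scores, toy_counts), 3))
print(round(spearman_rho(toy_scores, toy_counts), 3))
```

Reporting both coefficients, as in Fig. 1, guards against a monotonic but non-linear relationship being missed by Pearson's r alone.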
The database can be searched by clicking the “BROWSE” link at the top of each page (Fig. 2). Users can sort the data tables by clicking on headers, and perform searches using diverse identifiers including species or gene name. Search results can be copied or exported as an Excel file by clicking the corresponding button at the top of the table after performing a search. The TM topology column was created using TMHMM (Sonnhammer et al., 1998) and Phobius (Käll et al., 2004) to provide the starting and ending residue positions for each TM domain. TMHMM also returns the expected number of amino acids in TM helices (ExpAA), an indication of how likely a protein is to be a TM protein (typically values >18 indicate a TM protein), and Phobius indicates whether the protein is predicted to have a signal peptide; both of these data points are also represented as columns in the database table (ExpAA and SP). Protein sequences for each entry can be viewed by clicking the ‘+’ icon next to each entry (Fig. 2).

NemChR-DB can also be searched by protein sequence by clicking the “BLAST” link at the top of any page. A local BlastP executable is launched from server-side PHP scripts, and the results are returned as Hypertext Markup Language (HTML) in the user-specified output format. Default parameters are implemented for BlastP, including an initial word_size match set to 3, a minimum threshold score for adding a word to the lookup table set to 11, and the X-dropoff value (in bits) for the final gapped alignment equal to 25. Users can use BLAST to search for similar sequences among the parasitic NemChRs in the database organized by clade, or against the entire database, by selecting the database type on the BLAST page. Furthermore, users may be interested in the relationship between NemChRs in their parasite of interest and NemChRs from C. elegans; to accommodate this, there is an option to use BLAST to search against all parasitic NemChRs contained within the database as well as the predicted NemChRs from C. elegans.

In addition to searching the database with candidate chemoreceptors using BLAST, users can explore the TM domain topology of candidate chemoreceptors by navigating to the TM FINDER page. Here, users can supply a raw protein sequence to identify predicted TM domains in their protein. TM Finder uses TMHMM (Sonnhammer et al., 1998) to identify predicted TM domains and renders the results using Feature-Viewer (Garcia et al., 2014).
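As an illustration of how TM residue positions and the ExpAA value can be derived from TMHMM output, the sketch below parses one line of TMHMM's short output format. It assumes the usual `key=value` layout of that format (`len`, `ExpAA`, `First60`, `PredHel`, `Topology`); the example line is invented, the field names of the returned record are hypothetical, and this is not the code behind TM Finder or the database table.

```python
import re
from typing import Dict, List, Tuple

# A helix appears in the topology string as "start-end", e.g. "o15-37i49-71o".
TM_SEGMENT = re.compile(r"(\d+)-(\d+)")


def parse_tmhmm_short(line: str) -> Dict[str, object]:
    """Parse one TMHMM short-format line into a small record."""
    seq_id, *pairs = line.split()
    fields = dict(pair.split("=", 1) for pair in pairs)
    helices: List[Tuple[int, int]] = [
        (int(start), int(end)) for start, end in TM_SEGMENT.findall(fields["Topology"])
    ]
    exp_aa = float(fields["ExpAA"])
    return {
        "id": seq_id,
        "length": int(fields["len"]),
        "ExpAA": exp_aa,
        "likely_tm": exp_aa > 18,  # rule of thumb quoted in the text
        "tm_domains": helices,     # (start, end) residue positions per helix
    }


# Invented example line for a seven-helix protein.
EXAMPLE_LINE = (
    "example_protein\tlen=330\tExpAA=152.31\tFirst60=22.50\tPredHel=7\t"
    "Topology=o15-37i49-71o91-113i134-156o186-208i229-251o266-288i"
)

record = parse_tmhmm_short(EXAMPLE_LINE)
print(record["tm_domains"])
```

A record of this shape could be serialized with `json.dumps(record)` for display in a table, although the actual JSON schema used by NemChR-DB is not described here.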
Fig. 2.
From the home page of NemChR-DB, users can follow the links at the top to browse the database (1), search and download results (2 and 3), or explore functional annotations as well as related literature (4). Users can also characterize candidate nematode chemoreceptors using TM Finder to identify and visualize predicted transmembrane domains (5) or query the database using BlastP to uncover related sequences (6). NemChR-DB can be accessed at the following site: http://ohalloranlab.net/nemchr-db.
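The BlastP settings described for the BLAST page correspond to standard BLAST+ command-line options (`-word_size`, `-threshold`, `-xdrop_gap_final`). The sketch below assembles such a command in Python; the site itself launches BlastP from PHP, so this is only an equivalent illustration, and the query file, database name, and output format shown are hypothetical.

```python
import shlex
from typing import List


def build_blastp_command(query_fasta: str, database: str, outfmt: str = "6") -> List[str]:
    """Assemble a blastp argument list using the parameter values quoted in the text."""
    return [
        "blastp",
        "-query", query_fasta,       # FASTA file holding the user's protein sequence
        "-db", database,             # e.g. a clade-specific or whole-database BLAST db
        "-word_size", "3",           # initial word size
        "-threshold", "11",          # minimum score to add a word to the lookup table
        "-xdrop_gap_final", "25",    # X-dropoff (bits) for the final gapped alignment
        "-outfmt", outfmt,           # user-selected output format
    ]


# Hypothetical file and database names.
command = build_blastp_command("query.fasta", "nemchr_clade_V")
print(" ".join(shlex.quote(part) for part in command))

# With BLAST+ installed, the list could be passed to
# subprocess.run(command, capture_output=True, text=True, check=True).
```

Building the command as an argument list rather than a single shell string avoids passing user-supplied text through a shell, which matters for a web service that accepts arbitrary sequences.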
NemChR-DB is the first database of parasitic nematode chemoreceptors and provides a platform for users to study NemChRs and functionally explore these receptors from very diverse nematode species. To construct NemChR-DB, data were sourced from highly curated databases so as to provide the most up-to-date information on annotated nematode chemoreceptors in diverse species. Our database is designed to permit growth and to allow new sequences and annotations to be easily included when they become available. We plan to add chemoreceptors from more species on a rolling basis based on user feedback (more details can be found on the METHODS page of the database at the following URL: http://ohalloranlab.net/nemchr-db/methods.html), and we will use the WormBase Application Programming Interface (API) to track new releases to include in NemChR-DB. In conclusion, NemChR-DB will serve as a useful portal for exploration and discovery in future experiments on nematode chemoreceptor biology.
Acknowledgements
This project was supported by grant R21AI137771 from the National Institute of Allergy and Infectious Diseases, USA. The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Institute of Allergy and Infectious Diseases or the National Institutes of Health, USA. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript. We are grateful to Nicolas Wheeler and Mostafa Zamanian for help with data collection, methods, and feedback on the manuscript.
References
- Ashton FT, Schad GA, 1996. Amphids in Strongyloides stercoralis and other parasitic nematodes. Parasitol. Today (Regul. Ed.) 12, 187–194. 10.1016/0169-4758(96)10012-0
- Bernot JP, Rudy G, Erickson PT, Ratnappan R, Haile M, Rosa BA, Mitreva M, O’Halloran DM, Hawdon JM, 2020. Transcriptomic analysis of hookworm Ancylostoma ceylanicum life cycle stages reveals changes in G-protein coupled receptor diversity associated with the onset of parasitism. Int. J. Parasitol 50, 603–610. 10.1016/j.ijpara.2020.05.003
- Bethony J, Brooker S, Albonico M, Geiger SM, Loukas A, Diemert D, Hotez PJ, 2006. Soil-transmitted helminth infections: ascariasis, trichuriasis, and hookworm. Lancet 367, 1521–1532. 10.1016/S0140-6736(06)68653-4
- Bundy DA, Chan MS, Savioli L, 1995. Hookworm infection in pregnancy. Trans. R. Soc. Trop. Med. Hyg 89, 521–522. 10.1016/0035-9203(95)90093-4
- Choe A, von Reuss SH, Kogan D, Gasser RB, Platzer EG, Schroeder FC, Sternberg PW, 2012. Ascaroside Signaling is Widely Conserved Among Nematodes. Curr Biol 22, 772–780. 10.1016/j.cub.2012.03.024
- Finn RD, Bateman A, Clements J, Coggill P, Eberhardt RY, Eddy SR, Heger A, Hetherington K, Holm L, Mistry J, Sonnhammer ELL, Tate J, Punta M, 2014. Pfam: the protein families database. Nucleic Acids Res. 42, D222–230. 10.1093/nar/gkt1223
- Gang SS, Castelletto ML, Yang E, Ruiz F, Brown TM, Bryant AS, Grant WN, Hallem EA, 2020. Chemosensory mechanisms of host seeking and infectivity in skin-penetrating nematodes. Proc. Natl. Acad. Sci. U.S.A 117, 17913–17923. 10.1073/pnas.1909710117
- Garcia L, Yachdav G, Martin MJ, 2014. FeatureViewer, a BioJS component for visualization of position-based annotations in protein sequences. F1000Res 3, 47–47.v2. eCollection 2014. 10.12688/f1000research.3-47.v2
- Greene JS, Dobosiewicz M, Butcher RA, McGrath PT, Bargmann CI, 2016. Regulatory changes in two chemoreceptor genes contribute to a Caenorhabditis elegans QTL for foraging behavior. Elife 5. 10.7554/eLife.21454
- Howe KL, Bolt BJ, Shafie M, Kersey P, Berriman M, 2017. WormBase ParaSite - a comprehensive resource for helminth genomics. Mol. Biochem. Parasitol 215, 2–10. 10.1016/j.molbiopara.2016.11.005
- Käll L, Krogh A, Sonnhammer ELL, 2004. A combined transmembrane topology and signal peptide prediction method. J. Mol. Biol 338, 1027–1036. 10.1016/j.jmb.2004.03.016
- Krishnan A, Almén MS, Fredriksson R, Schiöth HB, 2014. Insights into the origin of nematode chemosensory GPCRs: putative orthologs of the Srw family are found across several phyla of protostomes. PLoS ONE 9, e93048. 10.1371/journal.pone.0093048
- Krogh A, Larsson B, von Heijne G, Sonnhammer EL, 2001. Predicting transmembrane protein topology with a hidden Markov model: application to complete genomes. J. Mol. Biol 305, 567–580. 10.1006/jmbi.2000.4315
- Lozoff B, Jimenez E, Wolf AW, 1991. Long-term developmental outcome of infants with iron deficiency. N. Engl. J. Med 325, 687–694. 10.1056/NEJM199109053251004
- Mistry J, Finn RD, Eddy SR, Bateman A, Punta M, 2013. Challenges in homology search: HMMER3 and convergent evolution of coiled-coil regions. Nucleic Acids Res. 41, e121. 10.1093/nar/gkt263
- Parra G, Bradnam K, Korf I, 2007. CEGMA: a pipeline to accurately annotate core genes in eukaryotic genomes. Bioinformatics 23, 1061–1067. 10.1093/bioinformatics/btm071
- Robertson HM, Thomas JH, 2006. The putative chemoreceptor families of C. elegans. WormBook 1–12. 10.1895/wormbook.1.66.1
- Sengupta P, Chou JH, Bargmann CI, 1996. odr-10 encodes a seven transmembrane domain olfactory receptor required for responses to the odorant diacetyl. Cell 84, 899–909.
- Seppey M, Manni M, Zdobnov EM, 2019. BUSCO: Assessing Genome Assembly and Annotation Completeness. Methods Mol. Biol 1962, 227–245. 10.1007/978-1-4939-9173-0_14
- Sonnhammer EL, von Heijne G, Krogh A, 1998. A hidden Markov model for predicting transmembrane helices in protein sequences. Proc Int Conf Intell Syst Mol Biol 6, 175–182.
- UniProt Consortium, 2019. UniProt: a worldwide hub of protein knowledge. Nucleic Acids Res. 47, D506–D515. 10.1093/nar/gky1049
- Vidal B, Aghayeva U, Sun H, Wang C, Glenwinkel L, Bayer EA, Hobert O, 2018. An atlas of Caenorhabditis elegans chemoreceptor expression. PLOS Biology 16, e2004218. 10.1371/journal.pbio.2004218
- Wheeler NJ, Heimark ZW, Airs PM, Mann A, Bartholomay LC, Zamanian M, 2020. Genetic and functional diversification of chemosensory pathway receptors in mosquito-borne filarial nematodes. PLoS Biol. 18, e3000723. 10.1371/journal.pbio.3000723