2017 Jan 18;6:e20023. doi: 10.7554/eLife.20023

Figure 7. Candidate GPCRs expressed in AWC neurons.

Gene expression in AWC neurons. Expression data are shown for 5326 C. elegans genes (5294 protein-coding and 32 non-coding RNAs) that exhibited above-background expression in our pooled AWC RNA-seq (i.e., that had a minimum expression level, in a 99% credibility interval, of ≥0.1 TPM). The x-axis shows AWC-specificity, computed for each gene as the ratio of AWC gene expression (in TPM) to larval gene expression (in TPM). The y-axis shows the absolute magnitude of AWC gene activity in TPM. Most genes are labeled ‘AWC’ and shown as gray dots; highlighted against them are 32 genes encoding G-protein coupled receptors (green triangles) and 984 genes encoding housekeeping functions (red dots). The housekeeping genes were identified in a previous single-cell RNA-seq analysis (Schwarz et al., 2012). Full identities, annotations, and expression data for all C. elegans genes are provided in Figure 7—source data 1.

DOI: http://dx.doi.org/10.7554/eLife.20023.010
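As a rough illustration of how the two plotted quantities relate to the underlying table, the following Python sketch filters genes by the ≥0.1 minimum-TPM threshold and computes the AWC/larvae ratio used for the x-axis. The record type, function name, gene labels, and numbers are hypothetical, and the treatment of genes with zero larval expression (here, an infinite ratio) is an assumption rather than something stated in the legend.

```python
from dataclasses import dataclass


@dataclass
class GeneExpr:
    """One row of a hypothetical expression table (field names are illustrative)."""
    gene: str
    awc_tpm: float       # pooled AWC expression, in TPM
    larvae_tpm: float    # pooled larval expression, in TPM
    awc_min_tpm: float   # lower bound of the 99% credibility interval, in TPM


def awc_specificity(rows, min_tpm=0.1):
    """Return (gene, AWC/larvae ratio, AWC TPM) for above-background genes,
    sorted by descending ratio."""
    points = []
    for r in rows:
        # Keep only genes whose minimum AWC expression meets the threshold.
        if r.awc_min_tpm < min_tpm:
            continue
        # Assumption: a gene undetected in larvae gets an infinite ratio.
        if r.larvae_tpm > 0:
            ratio = r.awc_tpm / r.larvae_tpm
        else:
            ratio = float("inf")
        points.append((r.gene, ratio, r.awc_tpm))
    points.sort(key=lambda p: p[1], reverse=True)
    return points


# Made-up values, for illustration only.
rows = [
    GeneExpr("gene_a", 50.0, 0.5, 30.0),
    GeneExpr("gene_b", 10.0, 20.0, 5.0),
    GeneExpr("gene_c", 0.3, 1.0, 0.05),  # below background, dropped
]
for gene, ratio, tpm in awc_specificity(rows):
    print(gene, ratio, tpm)
```

Each returned tuple corresponds to one point in the plot: the ratio is the x-coordinate and the AWC TPM is the y-coordinate.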

Figure 7—source data 1. Traits of C. elegans genes expressed in AWC, with RNA-seq expression data generated with our full data set and our most recent computational methods.
Data columns are as follows. Gene: a given predicted protein-coding or ncRNA-coding gene in the C. elegans genome, from WormBase release WS245, for which we observed non-zero gene activity in AWC neurons. All further data columns are pertinent to that particular gene. AWC: the expression level for a given gene in AWC neurons, generated from a pooled set of all AWC RNA-seq reads, measured in TPM. Larvae: the expression level for a given gene in whole larvae, generated from a pooled set of all larval RNA-seq reads, measured in TPM. AWC/larvae: the ratio of gene expression (measured in TPM) between AWC neurons and whole C. elegans larvae. Genes in this table have been ranked by descending values of AWC/larvae, as a general measure of their AWC-specificity. AWC.vs.larvae_padj: the statistical significance (if any) with which a given gene was expressed either more or less strongly in AWC neurons than in larvae. Significance is given as a p-value, adjusted for multiple testing (hence, named ‘padj’) by the collective false discovery rate (FDR) formula of Benjamini and Hochberg (Benjamini and Hochberg, 1995). To compute this set of padj values, we compared mapped reads per gene from two biologically independent sets of single-cell AWC RNA-seq data (set 1, cells 1, 2, and 4; set 2, cells 3 and 5) to mapped reads from two pools of whole C. elegans larvae (Supplementary file 2). Significance was calculated with DESeq2 (Love et al., 2014) using default arguments to optimize the number of genes detected with a padj of ≤0.01. AWC_minTPM: the minimum expression level for a given gene in AWC neurons at a credibility interval of 99%, generated from a pooled set of all AWC RNA-seq reads (Supplementary file 2), measured in TPM. Larvae_minTPM: the minimum expression level for a given gene in whole larvae at a credibility interval of 99%, generated from a pooled set of all larval RNA-seq reads (Supplementary file 2), measured in TPM. 
AWC_cell_[number]_TPM: the expression level for a given gene in a single AWC neuron (number 1 through 5), generated from the specific set of AWC RNA-seq reads from that individually dissected and amplified neuron (Supplementary file 2), measured in TPM. AWC_cell_[number]_minTPM: the minimum expression level for a given gene in a single AWC neuron (number 1 through 5) at a credibility interval of 99%, generated from the specific set of AWC RNA-seq reads from that individually dissected and amplified neuron (Supplementary file 2), measured in TPM. Coding: the nature of a given gene's coding potential, as annotated in WormBase WS245. Most genes are either solely protein-coding or solely ncRNA-coding, and are noted as such in this data column. For 301 genes in C. elegans, WS245 predicts both protein-coding and non-protein-coding transcripts; in this table, such genes are denoted with ‘protein; ncRNA’. However, for purposes of gene analysis, we assume that any gene with dual predicted nature is solely protein-coding. Prot_size: this shows the full range of sizes for all protein products from a gene's predicted isoforms. Max_prot_size: the size of the largest predicted protein product. Housekeeping: a set of genes that we previously observed, by single-cell RNA-seq, to be consistently active both in whole C. elegans larvae and in three different states of migrating C. elegans linker cells (Schwarz et al., 2012). 7TM_GPCRs: a set of genes encoding G-protein coupled receptors (GPCRs), a class of genes of particular biological interest in deciphering AWC function. Prominent members of this set include dop-1, gar-1, lat-1, odr-10, and ser-2 (Hobert, 2013). Pfam-A: for protein-coding genes, predicted domains from the annotated (Pfam-A) subdivision of PFAM 27 (Finn et al., 2014; PMID 24288371), with an E-value of ≤10⁻⁵. eggNOG: for protein-coding genes, predicted orthology groups from the eggNOG 3.0 database (Powell et al., 2012).
Phobius: this denotes predictions of signal and transmembrane sequences made with Phobius (Käll et al., 2004). 'SigP' indicates a predicted signal sequence, and 'TM' indicates one or more transmembrane-spanning helices, with N helices indicated with '(Nx)'. Varying predictions from different isoforms are listed. NCoils: this shows coiled-coil domains, predicted by ncoils (Lupas, 1996). As with Psegs, the relative and absolute fractions of each protein's coiled-coil residues are shown. Psegs: this shows what fraction of a protein is low-complexity sequence, as detected by pseg (Wootton, 1994). Both the proportion of such sequence (ranging from 0.01 to 1.00) and the exact ratio of low-complexity residues to total residues are given. Proteins with no predicted low-complexity residues are blank. GO_terms: this denotes Gene Ontology terms for which a gene was annotated in WormBase release WS245.
DOI: 10.7554/eLife.20023.011
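The AWC.vs.larvae_padj column above reports p-values adjusted by the Benjamini and Hochberg procedure. For readers unfamiliar with that adjustment, here is a minimal Python sketch of the standard step-up calculation. It is not the DESeq2 implementation, which performs the adjustment internally along with other steps not shown here; the function name and example p-values are made up.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values, in the input order."""
    m = len(pvalues)
    # Indices of the p-values, from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, scaling each by m / rank and
    # enforcing monotonicity with a running minimum (also caps values at 1).
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Made-up p-values, for illustration only.
raw = [0.01, 0.04, 0.03, 0.005]
padj = benjamini_hochberg(raw)
significant = [p <= 0.01 for p in padj]
```

With a cutoff such as the padj of ≤0.01 mentioned above, a gene is called significant only if its adjusted value, not its raw p-value, falls at or below the cutoff.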
Figure 7—source data 2. Traits of C. elegans genes expressed in AWC, with RNA-seq expression data generated with our older data set and computational methods.
Data columns are as follows. Gene: a given predicted protein-coding gene (ncRNA-coding genes were not analyzed) in the C. elegans genome, from WormBase release WS190, for which we observed non-zero gene activity in AWC neurons. All further data columns are pertinent to that particular gene. Note that gene identifications have been mapped from the older values used in our older analytical pipeline (which used genes from WormBase release WS190) to the newer values in WormBase release WS245. AWC: the expression level for a given gene in AWC neurons, generated solely from our nine-cell pool of AWC RNA-seq reads, measured in RPKM. Larvae: the expression level for a given gene in whole larvae, generated from a pooled set of all larval RNA-seq reads, measured in RPKM. AWC/larvae: the ratio of gene expression (measured in RPKM) between AWC neurons and whole C. elegans larvae. Genes in this table have been ranked by descending values of AWC/larvae, as a general measure of their AWC-specificity. [All other data columns]: these have the same meaning as they do in Figure 7—source data 1. All annotations were generated for WS245 genes.
DOI: 10.7554/eLife.20023.012
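Because the older analysis reports RPKM while the newer one reports TPM, it may help to recall how the two units relate within a single sample: TPM is RPKM rescaled so that the values across all genes sum to one million. The Python sketch below shows only that generic rescaling, with a hypothetical function name and made-up values. It should not be read as a way to convert the older table into the newer one, since the two analyses differ in reads, gene annotations, and computational methods.

```python
def rpkm_to_tpm(rpkm):
    """Rescale one sample's RPKM values so that they sum to one million.

    `rpkm` maps gene name to RPKM; all values must come from the same sample
    and the same gene set for the result to be meaningful.
    """
    total = sum(rpkm.values())
    if total <= 0:
        raise ValueError("RPKM values must sum to a positive number")
    return {gene: value / total * 1e6 for gene, value in rpkm.items()}


# Made-up two-gene sample, for illustration only.
tpm = rpkm_to_tpm({"gene_a": 1.0, "gene_b": 3.0})
```

Within one sample, a ratio between two genes is the same in either unit. A ratio between two samples, such as AWC/larvae, can differ between RPKM and TPM because each sample is rescaled by its own total.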
Figure 7—source data 3. Gene functions associated with genes specifically expressed in AWC.
GO terms whose associated genes exhibited a significant excess of high AWC-specificity (i.e., high ratios of AWC/larval RNA-seq expression), against a background set of 5937 genes with minTPM ≥0.1 in at least one AWC RNA-seq data set. For each GO term, a refined p-value was calculated by FUNC.
DOI: 10.7554/eLife.20023.013
Figure 7—source data 4. Comparison of selected data for C. elegans genes encoding GPCRs expressed in AWC.
These data are according to either our newer RNA-seq analysis (using all our data and newer analytical methods) or our older RNA-seq analysis (using our nine-cell pooled data and older analytical methods). The Excel file contains three sheets of data. The first data sheet provides a comparison between GPCR genes identified as expressed in AWC by the newer analysis, versus those identified by the older analysis. Data columns are as follows. Gene: a given predicted protein-coding gene in the C. elegans genome. All further data columns are pertinent to that particular gene. AWC.TPM.new: expression values in AWC for that gene from our newer RNA-seq analysis (Figure 7—source data 1), in units of TPM. AWC.RPKM.old: expression values in AWC for that gene from our older RNA-seq analysis (Figure 7—source data 2), in units of RPKM. Pfam-A: predicted domains from the annotated (Pfam-A) subdivision of PFAM 27 with an E-value of ≤10⁻⁵ (Finn et al., 2014).
DOI: 10.7554/eLife.20023.014