Molecular Endocrinology. 2012 Aug 21;26(10):1783–1792. doi: 10.1210/me.2012-1176

Research Resource: RNA-Seq Reveals Unique Features of the Pancreatic β-Cell Transcriptome

Gregory M Ku 1,*, Hail Kim 1,*, Ian W Vaughn 1, Matthew J Hangauer 1, Chang Myung Oh 1, Michael S German 1, Michael T McManus 1
PMCID: PMC3458219  PMID: 22915829

Abstract

The pancreatic β-cell is critical for the maintenance of glycemic control. Knowing the compendium of genes expressed in β-cells will further our understanding of this critical cell type and may allow the identification of future antidiabetes drug targets. Here, we report the use of next-generation sequencing to obtain nearly 1 billion reads from the polyadenylated RNA of islets and purified β-cells from mice. These data reveal novel examples of β-cell-specific splicing events, promoter usage, and over 1000 long intergenic noncoding RNA expressed in mouse β-cells. Many of these long intergenic noncoding RNA are β-cell specific, and we hypothesize that this large set of novel RNA may play important roles in β-cell function. Our data demonstrate unique features of the β-cell transcriptome.


The pancreatic β-cell is the body's main source of insulin. The devastating metabolic consequences of the β-cell loss or dysfunction seen in diabetes mellitus highlight the critical role of these cells in nutrient metabolism. The ability to match insulin production to physiological needs results from the β-cell's unique transcriptional program. Yet no studies have defined β-cell transcriptional landscapes at high resolution, in either diseased or healthy primary β-cells.

Some studies have described transcriptional profiles of β-cells and pancreatic islets using oligonucleotide arrays (1, 2) and, more recently, massively parallel signature sequencing (3). However, oligonucleotide array studies are limited to the detection of sequences that are already printed on the arrays, whereas unbiased massively parallel signature sequencing is limited by sheer throughput. Next-generation mRNA sequencing (mRNA-seq) addresses these shortcomings (4) but has not yet been applied to primary β-cells.

The ability of mRNA-seq to detect low-abundance, novel transcripts has resulted in the identification of a novel class of RNA, long intergenic noncoding RNA (lincRNA). These RNA are greater than 200 nucleotides in length and do not encode proteins. Thousands of distinct lincRNA loci have been described in the mouse and human genomes (5, 6). Although the biological functions of only a few have been explored, lincRNA regulate diverse processes including epigenetic silencing, apoptosis, alternative splicing, and protein translation (reviewed in Ref. 7).

Here, we describe a high-resolution analysis of pancreatic β-cells, providing a new view of the β-cell transcriptome with an unprecedented level of specificity, sensitivity, and breadth. In addition to β-cell-specific gene expression, we also delineate β-cell-specific promoter use, alternative splicing, and a comprehensive inventory of novel β-cell-specific lincRNA.

Materials and Methods

Islet isolation and cell sorting

Islets from 16- to 20-wk-old mouse insulin promoter (MIP)-green fluorescent protein (GFP) mice were isolated by the University of California San Francisco Islet Production Core. Islets were digested with trypsin until single-cell suspensions were obtained. GFP-positive or -negative cells were sorted by flow cytometry (Aria II; BD, San Jose, CA).

Read mapping and fragments per kilobase of transcript per million mapped reads (FPKM) estimation

Nonamplified, nondirectional polyadenylated mRNA sequencing was performed at the University of British Columbia using the Illumina platform generating 82- to 85-bp paired end reads. The samples were mapped with TopHat version 1.3.1 to UCSC mm9 with default parameters. Mapped read counts were as follows: female islet, 371 million reads; male β-cell-1, 160 million reads; male β-cell-2 (replicate), 180 million reads; female β-cell, 150 million reads. For known genes (Figs. 1 and 2), the iGenomes gtf (Illumina) was used as the reference and quantitated using Cufflinks version 1.3.0 while masking the Ins1 and Ins2 genes, rRNA, and mitochondrial RNA. The false discovery rate was set to 0.1, and the minimum number of counts to test significance was 30. Because Cufflinks becomes computationally demanding for genes expressed at very high levels (e.g. Ins2, Ins1, Gcg, Chga, and Chgb), we used HTSeq (with the modification that a mapped read to two overlapping features was scored as a read for both features) and DESeq for the analysis shown in Fig. 1B (8). Non-β-cell samples were downloaded from National Center for Biotechnology Information, Sequence Read Archive. These were mapped and quantitated using the same pipeline used for the islet and β-cell samples.
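As a reminder of what the FPKM unit expresses, the following minimal sketch computes it directly from its definition. It is a simplification for illustration only: Cufflinks estimates FPKM with a statistical model that handles multiple isoforms and ambiguously mapped fragments, and the function name and input values below are hypothetical.

```python
def fpkm(fragments: float, transcript_length_bp: int, total_mapped_fragments: float) -> float:
    """Return fragments per kilobase of transcript per million mapped fragments."""
    if transcript_length_bp <= 0 or total_mapped_fragments <= 0:
        raise ValueError("transcript length and library size must be positive")
    kilobases = transcript_length_bp / 1_000
    millions = total_mapped_fragments / 1_000_000
    return fragments / kilobases / millions


# Hypothetical values: 500 fragments assigned to a 2-kb transcript in a
# library of 50 million mapped fragments.
example_fpkm = fpkm(500, 2_000, 50_000_000)
print(f"FPKM = {example_fpkm:.2f}")
```

Because the value is scaled by both transcript length and library size, FPKM estimates can be compared across transcripts and across samples of different sequencing depth.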

Fig. 1.

mRNA-seq of mouse islets and primary mouse β-cells demonstrates β-cell-specific genes. A, A comparison of RT-qPCR quantification (y-axis) and mRNA-seq quantification (x-axis) of GPCR expression. Each point represents a single GPCR gene. RT-qPCR data were plotted with permission (16). B, Log base 2-fold change in normalized read counts from β-cells vs. islets for the indicated genes. All were statistically significant with q value < 1 × 10^−14. C, Identification of RefSeq genes enriched and depleted in β-cells. FPKM in β-cells (y-axis) is plotted vs. the FPKM in islets (x-axis). Red dots indicate a q value < 0.1. Black dots indicate q value > 0.1. The blue dashed oval is highly enriched for exocrine secreted enzymes (see text). D, Histogram of FPKM levels of RefSeq genes expressed in β-cells. E, Log base 10 ratio of β-cell FPKM to the average FPKM of all non-β tissues is plotted (β-cell specificity score) for β-cell-expressed genes. Green circles indicate genes statistically significantly increased in β-cells over all five non-β-cell tissues (q value < 0.1). F, FPKM values of the 16 genes with no detectable expression in any of the other five non-β-cell tissues examined. AU, Arbitrary units.

Fig. 2.

mRNA-seq identifies the β-cell Tph1 promoter. A, The short form of Tph1 is the dominant isoform expressed in β-cells. mRNA-seq reads from β-cells (top) are plotted with the exon structure of RefSeq transcripts for the Tph1 locus (bottom). On the left is a heat map of expression level of each transcript in β-cells and skeletal muscle. B, RT-PCR with primers to the indicated exons was performed and run on an agarose gel. C, RT-qPCR targeting the indicated exons from islets isolated from pregnant (preg) and nonpregnant (non-preg) mice. Fold induction over nonpregnant levels is plotted with SE (n = 2 mice from each condition). *, P value < 0.01 between pregnant and nonpregnant mice. D, The upstream region of exon 1b is sufficient to drive transcription in β-cells. The indicated constructs were transfected into MIN6 cells, and luciferase activity was determined. The SE is plotted (n = 4). *, P < 0.0001 vs. no-promoter control.

De novo transcriptome assembly

For islet and β-cell samples, Cufflinks version 1.3.0 was used with default parameters except that upper quartile normalization was employed. Transcriptomes were combined using Cuffcompare. Only transcripts predicted by at least two of the biological replicates were accepted into the final assembly. For the non-islet, non-β-cell samples, we performed transcriptome assembly using Cufflinks version 1.3.0 again only accepting isoforms that were predicted by at least two samples. These two transcriptomes were then combined with the RefSeq transcript set using Cuffcompare.
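The replicate-support rule can be illustrated with a short sketch. This is not the Cuffcompare implementation; it assumes, for illustration, that each assembled transcript has already been reduced to a hashable key (here a hypothetical tuple of chromosome, strand, and intron chain) so that identical structures from different samples compare equal.

```python
from collections import Counter


def supported_transcripts(assemblies, min_support=2):
    """Keep transcript keys that occur in at least `min_support` assemblies."""
    counts = Counter()
    for assembly in assemblies:
        # A set ensures each sample contributes at most one vote per transcript.
        counts.update(set(assembly))
    return {key for key, n in counts.items() if n >= min_support}


# Hypothetical transcript keys: (chromosome, strand, intron chain).
tx_x = ("chr1", "+", ((100, 200), (300, 400)))
tx_y = ("chr1", "+", ((100, 200), (350, 400)))
tx_z = ("chr2", "-", ((5000, 5200),))

replicate_1 = [tx_x, tx_y]
replicate_2 = [tx_x, tx_z]
replicate_3 = [tx_y]

accepted = supported_transcripts([replicate_1, replicate_2, replicate_3])
print(len(accepted))
```

Requiring support from two samples trades some sensitivity for a lower rate of spurious assemblies, which matters most for low-abundance transcripts.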

β-Cell-specific isoform identification

We selected transcripts that were marked as statistically significantly different between β-cells and all other non-β-cell tissues. From this subset, we took only those transcripts whose fractional expression of the total gene FPKM changed by more than 50% and whose FPKM level in all tissues was greater than 1.
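One possible reading of the second filter is sketched below. It assumes that "fractional expression" is the isoform FPKM divided by the total gene FPKM, that a change of more than 50% means an absolute difference in that fraction greater than 0.5 against every other tissue, and that the FPKM threshold applies to the gene in each tissue; these interpretations, the function names, and the input values are assumptions. The statistical test performed by Cuffdiff is not reproduced.

```python
def isoform_fraction(isoform_fpkm: float, gene_fpkm: float) -> float:
    """Fraction of the gene's total FPKM contributed by one isoform."""
    return isoform_fpkm / gene_fpkm if gene_fpkm > 0 else 0.0


def is_isoform_switch(beta, others, min_delta=0.5, min_gene_fpkm=1.0):
    """beta: (isoform_fpkm, gene_fpkm); others: tissue -> (isoform_fpkm, gene_fpkm)."""
    beta_isoform, beta_gene = beta
    if beta_gene <= min_gene_fpkm:
        return False
    beta_fraction = isoform_fraction(beta_isoform, beta_gene)
    for tissue_isoform, tissue_gene in others.values():
        if tissue_gene <= min_gene_fpkm:
            return False
        tissue_fraction = isoform_fraction(tissue_isoform, tissue_gene)
        if abs(beta_fraction - tissue_fraction) <= min_delta:
            return False
    return True


# Hypothetical FPKM values for one isoform and its gene.
switch = is_isoform_switch((9.0, 10.0), {"liver": (1.0, 10.0), "brain": (2.0, 8.0)})
no_switch = is_isoform_switch((9.0, 10.0), {"liver": (1.0, 10.0), "brain": (5.0, 8.0)})
print(switch, no_switch)
```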

LincRNA filtering

Any transcript from the de novo transcriptome assembly with overlap with an Ensembl annotated protein-coding gene, small nuclear RNA, small nucleolar RNA, microRNA, tRNA, or pseudogene was removed. We also extended 5′ and 3′ untranslated regions (UTR) of RefSeq protein-coding genes using Cufflinks version 1.1.0 with the faux reads option (Hangauer M. J., I. W. Vaughn, and M. T. McManus, unpublished observations). Any de novo transcript with overlap to these extended UTR was also removed. The resulting transcripts were then assessed for coding potential using PhyloCSF (10); any transcript with a positive PhyloCSF score was removed.
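The filtering logic can be sketched as follows. The sketch assumes that overlap is judged on the genomic span of each transcript without regard to strand, that coordinates are 0-based and half-open, and that a PhyloCSF score has already been computed for each transcript; the class and function names are hypothetical, and a linear scan is used for clarity rather than speed.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Interval:
    chrom: str
    start: int  # 0-based, inclusive
    end: int    # exclusive


@dataclass(frozen=True)
class Transcript:
    name: str
    span: Interval
    phylocsf_score: float


def overlaps(a: Interval, b: Interval) -> bool:
    """True if two intervals on the same chromosome share at least one base."""
    return a.chrom == b.chrom and a.start < b.end and b.start < a.end


def is_candidate_lincrna(tx: Transcript, excluded: list) -> bool:
    """Reject transcripts with positive coding potential or annotated overlap."""
    if tx.phylocsf_score > 0:
        return False
    return not any(overlaps(tx.span, region) for region in excluded)


# Hypothetical annotation: one protein-coding gene with its extended UTR.
excluded_regions = [Interval("chr1", 1000, 2000)]
candidates = [
    Transcript("overlapping", Interval("chr1", 1500, 2500), -5.0),
    Transcript("intergenic", Interval("chr1", 3000, 4000), -5.0),
    Transcript("coding_like", Interval("chr2", 1500, 1800), 12.0),
]
kept = [tx.name for tx in candidates if is_candidate_lincrna(tx, excluded_regions)]
print(kept)
```

For genome-scale annotation sets, an interval tree or sorted sweep would replace the linear scan.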

Conservation

PhyloP conservation scores calculated using the PHAST package on the multiple alignment of 20 placental mammal genomes were retrieved from the UCSC genome browser database. The maximum average PhyloP score for any 50-bp window within each transcript was calculated. Size-matched repetitive elements from the RepeatMasker database were used as a nonconserved negative control set.
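The windowed conservation statistic can be illustrated with a running-sum sketch. It assumes that per-base PhyloP scores for a transcript are available as a single list in transcript order; whether windows were taken in transcript or genomic coordinates is not stated above, so that choice, like the function name and the toy scores, is an assumption.

```python
def max_window_mean(scores, window=50):
    """Return the highest mean score over any contiguous window of the given size."""
    if window <= 0 or len(scores) < window:
        raise ValueError("need at least one full window of scores")
    current = sum(scores[:window])
    best = current
    for i in range(window, len(scores)):
        # Slide the window one base: add the new score, drop the oldest.
        current += scores[i] - scores[i - window]
        best = max(best, current)
    return best / window


# Toy transcript: a 50-bp conserved element flanked by unconserved sequence.
toy_scores = [0.0] * 100 + [2.0] * 50 + [0.0] * 100
peak = max_window_mean(toy_scores)
print(f"maximum 50-bp mean PhyloP = {peak:.2f}")
```

Taking the maximum over windows, rather than the transcript-wide mean, allows a short conserved element to be detected within an otherwise poorly conserved transcript.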

Heat map and mRNA-seq plots

FPKM estimates were normalized within each gene and then were plotted with Java TreeView (11). mRNA-seq reads were visualized with IGV version 2.0 (12).

RT-quantitative PCR (qPCR) and RT-PCR

Total RNA was extracted from isolated mouse islets using TRIzol. First-strand cDNA was synthesized from 1 μg total RNA in a 20-μl volume using a mixture of random hexamers, oligo-deoxythymidine, and Superscript III reverse transcriptase (Invitrogen, Carlsbad, CA). For real-time qPCR, the appropriate amount of the reverse transcription reaction mixture was amplified with specific primers using SYBR green PCR master mix (Applied Biosystems, Foster City, CA). RNA samples were normalized by determining β-actin mRNA level. Primers used in PCR are as follows: exon 1A-F, gcaagccaaggtttcaagag; exon 1A-R, ggcccgtggacatacttcta; exon 2-F, accatgattgaagacaacaaggag; exon 3-R, tcaactgttctcggctgatg; exon 3-F, catcagccgagaacagttga; exon 4-R, ttcggatccatacaacagca. For RT-PCR, 1 μl of the reverse transcription reaction mixture was amplified with primers specific for the Tph1 gene, as indicated, in a total volume of 30 μl. Samples were amplified for 30 cycles using the following parameters: 92 C for 30 sec, 62 C for 30 sec, and 72 C for 30 sec. GAPDH primers were used as an internal control for quality and quantity of RNA. The PCR products were subjected to electrophoresis on a 1.5% agarose gel.

Luciferase assays

MIN6 cells were maintained in DMEM supplemented with 15% fetal bovine serum, 100 U/ml penicillin, 100 μg/ml streptomycin, and 71.5 μM β-mercaptoethanol. Cells were plated in six-well plates (6 × 10^5 cells per well) and cultured overnight. Transient transfection and luciferase assays were performed using Lipofectamine 2000 reagent (Invitrogen) and a dual luciferase assay kit (Promega, Madison, WI). Firefly luciferase activities were normalized with renilla luciferase activities to adjust for transfection efficiency. Normalized luciferase activities are shown as the mean ± SE of four biological replicates and are expressed as relative luciferase activity.
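The normalization and summary statistics described here can be sketched in a few lines. The readings below are hypothetical, the function names are illustrative, and any further scaling to a control construct is not shown.

```python
from math import sqrt
from statistics import mean, stdev


def normalized_activity(firefly, renilla):
    """Divide each firefly reading by the renilla reading from the same well."""
    if len(firefly) != len(renilla):
        raise ValueError("firefly and renilla readings must be paired")
    return [f / r for f, r in zip(firefly, renilla)]


def mean_and_se(values):
    """Return the mean and the standard error of the mean (sample SD / sqrt(n))."""
    if len(values) < 2:
        raise ValueError("at least two replicates are required")
    return mean(values), stdev(values) / sqrt(len(values))


# Hypothetical raw readings from four biological replicates.
firefly_counts = [2100.0, 2400.0, 1900.0, 2250.0]
renilla_counts = [1000.0, 1100.0, 950.0, 1050.0]

ratios = normalized_activity(firefly_counts, renilla_counts)
activity_mean, activity_se = mean_and_se(ratios)
print(f"relative luciferase activity = {activity_mean:.2f} ± {activity_se:.2f}")
```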

Results

mRNA-seq of primary mouse islets

To capture a more complete map of the β-cell RNA landscape, we performed RNA-seq on isolated islets from MIP-GFP mice (13). We performed conventional bioinformatic treatment of RNA sequence reads. Gene expression levels in FPKM were estimated using TopHat/Cufflinks (14, 15). To validate our quantitation of gene expression, we compared the levels of all mRNA encoding G protein-coupled receptors (GPCR) as determined by mRNA-seq to the levels determined by RT-qPCR in a previous study (16). mRNA-seq- and RT-qPCR-determined expression levels were in good agreement (Spearman correlation ρ = 0.844) (Fig. 1A). Most of the variation between the two methods occurred in low-abundance genes expressed at or below the lower limit of detection for RT-qPCR (<5 arbitrary units).
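For readers unfamiliar with the statistic, the Spearman correlation is the Pearson correlation of the ranks of the two measurements, which makes it insensitive to the different scales of FPKM and RT-qPCR units. The sketch below is a dependency-free illustration with hypothetical values, not the code used for Fig. 1A; it is undefined if either input is constant.

```python
def average_ranks(values):
    """Assign 1-based ranks, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=values.__getitem__)
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical expression values for five genes measured by two methods.
fpkm_values = [1.0, 2.0, 3.0, 4.0, 5.0]
qpcr_values = [10.0, 100.0, 1000.0, 5.0, 50000.0]
rho = spearman(fpkm_values, qpcr_values)
print(f"Spearman rho = {rho:.3f}")
```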

Identification of genes enriched in pancreatic β-cells vs. whole islets

Because the pancreatic islet consists of multiple cell types, whole islet expression data represent the weighted sum of several unique gene expression profiles. To analyze the β-cell-specific transcriptome, we dissociated MIP-GFP islets into single cells and isolated GFP-positive β-cells by flow cytometry. Before sorting, the dissociated islets were approximately 50% GFP positive. After sorting, the β-cell samples were approximately 95% GFP positive.

β-Cell purity can be assessed at the level of mRNA-seq by analysis of cell-type-specific genes. We compared the expression levels of amylase, glucagon, pancreatic polypeptide, somatostatin, ghrelin, and Sox9, mRNA predominantly expressed by the non-β-cell populations of the islet, between the β-cell-enriched samples and the whole islet. As expected, all of these mRNA were depleted up to 70-fold in the isolated β-cells as compared with total islets (all q values < 10^−14). Conversely, the β-cell-specific Ins2 and Ins1 mRNA were increased by nearly 2-fold in sorted β-cells compared with total islets [FDR adjusted p value (q) = 0.002 and 0.008, respectively] (Fig. 1B). This increase is consistent with the increase in β-cell purity between total islet and sorted β-cells from approximately 50 to 95%. These data show that the cell-sorting procedure successfully enriched β-cell RNA while depleting non-β-cell RNA.
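The expected enrichment of a strictly β-cell-specific transcript follows from the purity figures alone, as the short calculation below shows. It assumes, as a simplification, that the GFP-positive cell fraction is proportional to the β-cell share of total RNA; the function name is illustrative.

```python
def expected_fold_enrichment(fraction_before: float, fraction_after: float) -> float:
    """Fold change of a cell-type-specific mRNA when that cell type's share rises."""
    if not (0 < fraction_before <= 1 and 0 < fraction_after <= 1):
        raise ValueError("fractions must be in the interval (0, 1]")
    return fraction_after / fraction_before


# Approximate purities reported above: 50% before sorting, 95% after.
ins_fold = expected_fold_enrichment(0.50, 0.95)
print(f"expected enrichment = {ins_fold:.1f}-fold")
```

Under this assumption, no β-cell-specific transcript can be enriched by more than 2-fold when starting from a population that is already half β-cells.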

To identify annotated genes enriched or depleted within the purified β-cell population, the FPKM values for all RefSeq transcripts in whole islets vs. sorted β-cells were plotted, marking in red those RNA with significantly different expression between the two samples (q value < 0.1) (Fig. 1C). Notably, a group of mRNA were significantly depleted in β-cells yet still had high FPKM values (marked by a blue dashed oval). These RNA predominantly corresponded to exocrine secretory enzymes (amylases, lipases, proteases, elastases, and ribonucleases). The linear, diagonal arrangement of these genes demonstrates that exocrine-specific mRNA had a similar fold depletion in the β-cell sample. Moreover, genes specific to non-β-cell types had a similar range of disenrichment: 46-fold for exocrine genes, 70-fold for endothelial genes, 17-fold for α-cell genes, and 60-fold for neuronal genes (Table 1). We classified all RefSeq annotated mRNA that were depleted in β-cells by more than 4-fold as non-β-cell enriched genes (Supplemental Table 1, published on The Endocrine Society's Journals Online web site at http://mend.endojournals.org).

Table 1.

Fold depletion in sorted β-cells vs. total islets for the indicated genes

Cell type and gene    Fold depletion
α-Cell
    Gcg                15.18
    Irx1               17.45
    Pou3f4             20.81
    Arx                13.28
    Average            16.68
Exocrine
    Amy2a2             28.40
    Amy2b              72.90
    Pnlip              35.66
    Ptf1a              34.78
    Rbpjl              61.61
    Average            46.67
Neuronal
    Gata2             157.09
    Foxd3              40.16
    Hand2              31.22
    Sox10              21.21
    Average            62.42
Endothelial
    Tcf21              31.62
    Sox18             103.71
    Nkx2-3             72.45
    Hlx                72.01
    Average            69.95

The ratio of the islet Cuffdiff-determined FPKM value divided by the β-cell FPKM value for each gene (fold depletion in β-cells) is shown.
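The group averages in Table 1 are arithmetic means of the per-gene fold depletions, which can be checked with the short sketch below. The values are transcribed from the table; the dictionary layout and names are illustrative.

```python
from statistics import mean

# Fold depletion (islet FPKM / β-cell FPKM) transcribed from Table 1.
FOLD_DEPLETION = {
    "alpha-cell": {"Gcg": 15.18, "Irx1": 17.45, "Pou3f4": 20.81, "Arx": 13.28},
    "exocrine": {"Amy2a2": 28.40, "Amy2b": 72.90, "Pnlip": 35.66,
                 "Ptf1a": 34.78, "Rbpjl": 61.61},
    "neuronal": {"Gata2": 157.09, "Foxd3": 40.16, "Hand2": 31.22, "Sox10": 21.21},
    "endothelial": {"Tcf21": 31.62, "Sox18": 103.71, "Nkx2-3": 72.45, "Hlx": 72.01},
}

group_means = {group: mean(genes.values()) for group, genes in FOLD_DEPLETION.items()}

for group, value in group_means.items():
    print(f"{group}: {value:.2f}-fold depletion")
```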

To study those genes expressed predominantly in β-cells, non-β-cell-enriched genes were excluded, and expression levels of β-cell-enriched mRNA were evaluated. As has been seen for other cell types, the distribution of FPKM values is notable for a large group of RNA expressed at FPKM values much less than 1, which presumably reflects RNA with fewer than one copy per cell on average (17, 18). However, the majority of detected mRNA were expressed at an FPKM well above 1 (Fig. 1D). The functional significance of RNA with an FPKM much less than 1 is not yet clear, so we eliminated those RNA with an FPKM of less than 0.5 from further analysis. The list of β-cell-expressed genes is given in Supplemental Table 2.

Identification of genes enriched in pancreatic β-cells vs. other tissues

To help classify β-cell-specific genes, previously published mRNA-seq data from lung fibroblasts, neural precursor cells, liver, skeletal muscle, and brain (4, 6, 19, 20) were compared with the high-resolution pancreatic β-cell mRNA described above. For each gene, the log base 10 ratio of the β-cell FPKM to the average FPKM of all other tissues was plotted (Fig. 1E). Genes marked in green showed a statistically significant increase over all five non-β-cell tissues examined. Sixteen of 12,082 genes expressed in β-cells had no detectable expression in any other cell type, and their FPKM ranged from slightly less than 1 to several hundred (Fig. 1F). Notably, this list includes well-known β-cell-specific genes (Pdx1, Rfx6, and Gpr119) and several lowly expressed genes that have not been previously described as being β-cell specific. Eighty-seven genes were 200-fold enriched in β-cells, and 548 genes were over 10-fold β-cell enriched. The FPKM values for all genes expressed in β-cells in all examined tissues are given in Supplemental Table 3.
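The β-cell specificity score can be written out as a short function. How genes with no detectable expression in other tissues were handled numerically is not stated, so returning infinity in that case is an assumption, as are the function name and the example values.

```python
import math


def specificity_score(beta_fpkm: float, other_fpkms) -> float:
    """log10 of the β-cell FPKM over the mean FPKM of the other tissues."""
    if beta_fpkm <= 0 or not other_fpkms:
        raise ValueError("need a positive β-cell FPKM and at least one other tissue")
    average_other = sum(other_fpkms) / len(other_fpkms)
    if average_other == 0:
        # Assumption: no detectable expression elsewhere gives an unbounded score.
        return math.inf
    return math.log10(beta_fpkm / average_other)


# Hypothetical gene: FPKM of 200 in β-cells and 1 in each of five other tissues.
score = specificity_score(200.0, [1.0, 1.0, 1.0, 1.0, 1.0])
print(f"specificity score = {score:.2f}")
```

On this scale, 10-fold enrichment corresponds to a score of 1 and 200-fold enrichment to a score of approximately 2.3.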

Identification of the Tph1 promoter in β-cells using mRNA-seq

Little is known about β-cell-specific promoter usage, despite the high value in understanding mechanisms for β-cell gene expression. Several well-known genes use β-cell-specific promoters including Hnf4α and Gck (21, 22). We compared β-cell and liver mRNA-seq reads and confirmed the β-cell and liver-cell-specific 5′ exons for both the Hnf4α and Gck genes (Supplemental Fig. 1), suggesting that our mRNA-seq data can clearly delineate β-cell-specific promoter usage. As another test of this, we examined the β-cell expression of the tryptophan hydroxylase gene, Tph1. RefSeq contains two isoforms of Tph1: variant 1 (NM_009414), beginning with a distal first exon (we termed this exon 1a) and variant 2 (NM_001136084), an isoform that begins with a more proximal first exon (we termed this exon 1b) (Fig. 2A, bottom panel). In β-cell mRNA-seq, there are no reads mapping to exon 1a, whereas a substantial number of reads mapped to exon 1b (Fig. 2A, top panel). Confirming this visual inspection of the reads, Cufflinks-determined FPKM values show exclusive expression of variant 2, whereas skeletal muscle expresses predominantly variant 1 (Fig. 2A, left panel). The lack of expression of the exon 1a isoform in whole mouse islets was confirmed by RT-PCR (Fig. 2B). Likewise, the up-regulation of Tph1 in the islets of pregnant mice (23–25) is not seen with probes against exon 1a (Fig. 2C). Finally, we show that the genomic region upstream of exon 1a actually suppresses transcriptional activity, whereas the analogous region upstream of exon 1b is transcriptionally active in the MIN6 β-cell line (Fig. 2D).

De novo transcriptome assembly reveals novel β-cell-specific promoter use and splicing

To expand upon our analysis of novel β-cell-specific promoter use and to explore β-cell-specific alternative splicing, de novo transcriptome assembly was performed with Cufflinks for the islet and β-cell samples (14). As a primary filter, only transcripts that were predicted by two or more of the biological samples were retained. These high-confidence transcripts were combined with all RefSeq transcripts and with a set of similarly derived de novo transcripts from liver, brain, neural progenitor cell, skeletal muscle, and lung fibroblasts (see Materials and Methods).

Analysis of the high-confidence de novo transcriptome assembly shows 4254 novel transcripts that share one splice junction with a RefSeq transcript and reveals many examples with 5′ or 3′ UTR extended beyond the currently annotated RefSeq transcripts. The assembled transcriptomes are provided in the Supplemental Data.

To identify novel instances of β-cell-specific promoter use and alternative splicing, we then used Cuffdiff to estimate FPKM values for isoforms in the de novo transcriptome. Cuffdiff predicted 5892 genes with a different pattern of alternative splicing between β-cells and another non-β-cell tissue. To focus on the most dramatic of isoform switches, we identified isoforms of genes that exhibited at least a 50% change in the isoform's use between β-cells and all other cell types where the gene is expressed. Of the 43 genes matching this strict criterion (Supplemental Table 4), several demonstrated clear changes in protein-coding sequence. One example is the vacuolar protein sorting-associated protein 8 (Vps8). The majority of Vps8 in β-cells begins with a novel exon (highlighted in a green box) that results in an mRNA that does not contain any of the Vps8 coding sequence (Fig. 3A). A second example of β-cell-specific splicing is Rasgrf1, a 27-exon gene expressed in both the brain and the β-cell. However, the vast majority of Rasgrf1 in β-cells ends with a novel β-cell-specific exon 2 (highlighted in a green box), which is predicted to result in a significantly truncated protein product (Fig. 3B). A previous report identified an alternative truncated form of Rasgrf1 termed Grfbeta (NM_001039655), which has an additional intron (26), but this variant appears to be a minor isoform in primary β-cells. These data demonstrate an abundance of β-cell-specific promoter use and alternative splicing.

Fig. 3.

Examples of β-cell-specific splicing events and alternative promoter use. A, Vps8: heat map showing expression of three major isoforms of Vps8 (left) and their transcript structures (bottom right). mRNA-seq reads mapping to the locus are shown for β-cell, liver, lung fibroblast (fibro), and neural progenitor cell (NPC) (top). The green box highlights a β-cell-specific exon 1 that is not present in other tissues. B, Rasgrf1: as in A, but Rasgrf1 is expressed only in brain and β-cells, so only these two tissues are listed. The green box highlights a β-cell-specific exon 2.

Identification of lincRNA expressed in the pancreatic β-cell

Although our sequencing pipeline began with the purification of polyadenylated mRNA, we reasoned that additional β-cell relevant polyadenylated noncoding RNA might also be present in our high-resolution dataset. To begin exploring this possibility, we examined the de novo transcriptome assembly for potential lincRNA. First, any transcript in the merged assembly that overlapped with an Ensembl annotated protein-coding gene, rRNA, tRNA, small nucleolar RNA, small nuclear RNA, microRNA, or pseudogene was removed. Next, any transcript having an overlap with a 3′ or 5′ UTR of a RefSeq protein-coding gene (including extended UTR) was also removed. Transcripts shorter than 200 nucleotides and those with coding potential as measured by a positive PhyloCSF score (10) were also removed. In total, 2790 transcripts corresponding to 2425 loci were identified (Fig. 4A and Supplemental Table 5). Of these loci, 1342 genes were completely novel to RefSeq annotation.

Fig. 4.

Identification of β-cell lincRNA. A, Filtering strategy for lincRNA identification (see text for details). B, Cumulative distribution plot of FPKM of lincRNA (red) and protein-coding (green) genes shows that most lincRNA are expressed at lower levels vs. protein-coding genes. C, β-Cell enrichment score (log base 10 ratio of β-cell FPKM over average non-β-cell FPKM) for lincRNA (black and red) and protein-coding genes (black and green). Colored circles indicate genes for which there is a statistically significant increase in β-cells over all six non-β-cell tissues examined. D, Histogram of maximum PhyloP score (placental mammals) of lincRNA (red), protein-coding genes (NM, blue), and repeat masked sequence (RPMSK, black) showing that most protein-coding genes show high conservation, whereas only a minority of lincRNA show equivalent conservation. E, mRNA-seq reads and predicted transcript structure of a novel potential lincRNA 5′ of the Nkx6-1 locus. F, mRNA-seq reads and predicted transcript structure of a potential lincRNA lying antisense to the Pdx1 locus. NPC, Neural progenitor cell; sk muscl, skeletal muscle.

In agreement with previous reports, the expression of lincRNA in β-cells was much lower than that of protein-coding genes (Fig. 4B), with about 80% of lincRNA expressed below an FPKM of 5. In contrast, only approximately 65% of the β-cell protein-encoding genes display this low range of expression. To help narrow the list to those lincRNA that are specifically expressed in β-cells, we eliminated those lincRNA genes depleted in β-cells by 4-fold relative to islets. We also eliminated those expressed below FPKM of 0.5, resulting in a set of 1359 high-confidence lincRNA genes (Supplemental Table 6). The β-cell specificity of these lincRNA was higher than that of protein-coding genes with 108 of 2425 genes having no detectable expression in any other tissue type and 160 over 200-fold β-cell enriched (Fig. 4C and Supplemental Table 6).

As others have reported, the overall conservation of lincRNA tends to be much weaker than that of protein-coding genes. Indeed, most functionally characterized lincRNA appear poorly conserved when analyzed using traditional phylogenetic methods. It has therefore been proposed that conservation of secondary structure or of short sequence windows within a transcript may be sufficient to maintain function (27). In the present analysis, we used a sliding 50-bp window to identify the most conserved element within each lincRNA transcript as measured by the PhyloP algorithm (28). Many of these lincRNA harbor significant windows of conservation comparable to protein-coding transcripts (Fig. 4D).

Previous reports suggest that lincRNA are often located in proximity to transcription factors important to a particular cell type (27, 29, 30). As an example of this, we found a potential multi-exonic lincRNA flanking the 5′ end of the Nkx6-1 gene that is approximately 1600 times more highly expressed in β-cells than non-β-cell tissues (ranked 117th most specific) (Fig. 4E). An additional example lincRNA, XLOC_019089, is not detected in any non-β-cell tissue and is located antisense to the Pdx1 locus (Fig. 4F). We note that this transcript appears to be a short isoform of the 210019I11Rik gene. We propose that these β-cell-specific lincRNA are novel candidates for noncoding modulators of β-cell function.

Discussion

By harnessing the power of RNA-seq technologies, we have provided an unprecedented view of the β-cell transcriptome. Besides the identification of β-cell-specific genes, we annotated many novel β-cell-specific promoters and splice forms and discovered a large class of β-cell-specific lincRNA.

Fluorescence-activated cell sorting (FACS) was crucial for the success of our approach because it allowed us to resolve primary β-cells from non-β-cells. Based on the reductions in known α-cell-, PP-cell-, and exocrine-cell-specific genes, the FACS strategy was eminently effective. Using this strategy, we were able to eliminate genes from these contaminating cells of the islet and exocrine pancreas, thereby clarifying the broad range of genes that are expressed in β-cells. This dataset includes many genes expressed at low levels that are β-cell specific but were not seen in previous studies, likely due to their low expression levels.

Of note, we could still detect glucagon and other highly abundant non-β-cell hormones in our β-cell samples. This is not surprising because an α-cell-specific gene with an FPKM of 100 will have an FPKM of 0.1 in a sorted β-cell sample with 0.1% contamination; we suspect that the FPKM of Gcg in a pure α-cell population is much higher than 100. Nonetheless, Gcg and other α-cell-specific mRNA were disenriched nearly 17-fold in the purified β-cell samples relative to intact islets, and exocrine mRNA were disenriched nearly 50-fold. In contrast, the mRNA encoding Sst was disenriched only 2-fold. This differential depletion may reflect a difference in the ability of FACS to separate contaminating δ-cells from β-cells. Therefore, it is possible that there is a component of δ-cell contamination in our β-cell gene expression data and that some of the novel transcripts we identified as β-cell specific might be both β- and δ-cell specific or even δ-cell specific. Resolving this possibility will require a higher degree of β-cell purification or the use of immunohistochemical or in situ hybridization techniques. Notably, Sst mRNA was detectable in single-cell PCR of mouse pancreatic β-cells of MIP-GFP mice and was also found in sorted primary human β-cells, suggesting that either δ-cells are a common contaminant in FACS-sorted β-cells or that FACS-sorted β-cells do express some level of Sst mRNA (1, 9). In addition, even if an mRNA is significantly depleted in the sorted β-cell samples, we cannot exclude the possibility of low-level, but biologically important, expression in β-cells. Finally, if there is heterogeneity among β-cells, as single-cell PCR experiments suggest (9), our data would represent the weighted average of these distinct gene expression profiles.
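The contamination argument can be made explicit with a simple linear mixing model. The sketch below assumes that the apparent FPKM in a sorted sample is the average of the FPKM in each cell type weighted by its share of the sample, and that all cells contribute similar amounts of RNA; both are simplifications, and the function name is illustrative.

```python
def apparent_fpkm(target_fpkm: float, contaminant_fpkm: float,
                  contaminant_fraction: float) -> float:
    """FPKM observed in a mixture of target cells and contaminating cells."""
    if not 0 <= contaminant_fraction <= 1:
        raise ValueError("contaminant fraction must be between 0 and 1")
    return ((1 - contaminant_fraction) * target_fpkm
            + contaminant_fraction * contaminant_fpkm)


# A gene absent from β-cells, with an FPKM of 100 in α-cells, at 0.1% contamination.
observed = apparent_fpkm(target_fpkm=0.0, contaminant_fpkm=100.0,
                         contaminant_fraction=0.001)
print(f"apparent FPKM in sorted sample = {observed:.2f}")
```

The same model shows why detection of a transcript in sorted cells cannot by itself establish expression in β-cells: a highly abundant transcript from a rare contaminant can exceed a typical detection threshold.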

Some genes considered β-cell specific on the basis of previous studies (Pdx1 and Nkx6-1) were only modestly enriched in sorted β-cells and did not reach statistical significance. This failure to reach significance likely reflects the maximum 2-fold enrichment for β-cell-specific genes, which reduces the detection power for this class of genes. It should also be noted that, as in most gene expression studies, gene expression levels are likely to have changed as a result of islet dissociation and sorting. Although the time from islet isolation to RNA extraction was typically less than 4 h, we did see changes in gene expression that likely reflect acute changes during sorting (for example, up-regulation of immediate early genes Fos and FosB).

We detected many instances of islet-cell-specific and β-cell-specific alternative splicing and promoter use. As yet, the functional consequences of most of these splicing events are not clear, although we note that many of these events have drastic effects on protein-coding sequence. We also describe over 1000 lincRNA expressed in β-cells. As has been seen for other tissue types, most of these are β-cell specific. With the identification of novel β-cell lincRNA, future experiments to address their function are critical.

For investigators interested in β-cells, we expect this analysis to be a useful resource. Furthermore, as de novo transcriptome assembly algorithms become more refined, the primary read data may prove useful for reanalysis. These data are publicly accessible at the National Center for Biotechnology Information, Sequence Read Archive (SRA056174), and the alignments are viewable as a track at the β-Cell Biology Consortium for public access.

In conclusion, we demonstrate the rich transcriptional landscape of the primary pancreatic β-cell. Understanding these gene expression profiles may further our understanding of the unique phenotype of the β-cell and assist in efforts to improve or emulate β-cell function for the treatment of diabetes.

Supplementary Material

Supplemental Data

Acknowledgments

We thank Chester Chamberlain and Stefani Nalle of the University of California San Francisco (UCSF) Diabetes Center for thoughtful review of the manuscript.

G.M.K. was supported by a grant from the A.P. Giannini Medical Foundation and is currently supported by National Institutes of Health (NIH) Grant K08DK087945. H.K. is supported by a Juvenile Diabetes Research Foundation Advanced Postdoctoral Fellowship (10-2010-553), a grant of the Korean Health Technology R&D Project, Ministry of Health and Welfare, Republic of Korea (A110571 to C.M.O., A112024 to H.K.) and by the Basic Science Research Program through the National Research Foundation of Korea funded by the Ministry of Education, Science, and Technology (2011-0023387). M.H. is supported by a Susan G. Komen Research for the Cure Postdoctoral Fellowship. This research was supported by NIH RC1DK086290 to M.T.M. and R01DK021344, U01DK08954 to M.S.G. We also acknowledge the support of the UCSF Diabetes Research Center P30DK063720.

Author contributions: G.M.K. researched data and wrote the manuscript. H.K., I.V., and C.M.O. researched data and reviewed/edited the manuscript. M.H., M.S.G., and M.T.M. contributed to discussion and reviewed/edited the manuscript.

Disclosure Summary: The authors have nothing to disclose.

Footnotes

Abbreviations: FACS, fluorescence-activated cell sorting; FPKM, fragments per kilobase of transcript per million mapped reads; GFP, green fluorescent protein; GPCR, G protein-coupled receptor; lincRNA, long intergenic noncoding RNA; MIP, mouse insulin promoter; mRNA-seq, mRNA sequencing; qPCR, quantitative PCR; UTR, untranslated regions.

References

  • 1. Dorrell C, Schug J, Lin CF, Canaday PS, Fox AJ, Smirnova O, Bonnah R, Streeter PR, Stoeckert CJ, Jr, Kaestner KH, Grompe M. 2011. Transcriptomes of the major human pancreatic cell types. Diabetologia 54:2832–2844.
  • 2. Martens GA, Jiang L, Hellemans KH, Stangé G, Heimberg H, Nielsen FC, Sand O, Van Helden J, Van Lommel L, Schuit F, Gorus FK, Pipeleers DG. 2011. Clusters of conserved β-cell marker genes for assessment of β-cell phenotype. PLoS One 6:e24134.
  • 3. Kutlu B, Burdick D, Baxter D, Rasschaert J, Flamez D, Eizirik DL, Welsh N, Goodman N, Hood L. 2009. Detailed transcriptome atlas of the pancreatic β-cell. BMC Med Genomics 2:3.
  • 4. Mortazavi A, Williams BA, McCue K, Schaeffer L, Wold B. 2008. Mapping and quantifying mammalian transcriptomes by RNA-Seq. Nat Methods 5:621–628.
  • 5. Cabili MN, Trapnell C, Goff L, Koziol M, Tazon-Vega B, Regev A, Rinn JL. 2011. Integrative annotation of human large intergenic noncoding RNA reveals global properties and specific subclasses. Genes Dev 25:1915–1927.
  • 6. Guttman M, Garber M, Levin JZ, Donaghey J, Robinson J, Adiconis X, Fan L, Koziol MJ, Gnirke A, Nusbaum C, Rinn JL, Lander ES, Regev A. 2010. Ab initio reconstruction of cell type-specific transcriptomes in mouse reveals the conserved multi-exonic structure of lincRNA. Nat Biotechnol 28:503–510.
  • 7. Wapinski O, Chang HY. 2011. Long noncoding RNA and human disease. Trends Cell Biol 21:354–361.
  • 8. Anders S, Huber W. 2010. Differential expression analysis for sequence count data. Genome Biol 11:R106.
  • 9. Katsuta H, Akashi T, Katsuta R, Nagaya M, Kim D, Arinobu Y, Hara M, Bonner-Weir S, Sharma AJ, Akashi K, Weir GC. 2010. Single pancreatic β-cells co-express multiple islet hormone genes in mice. Diabetologia 53:128–138.
  • 10. Lin MF, Jungreis I, Kellis M. 2011. PhyloCSF: a comparative genomics method to distinguish protein coding and non-coding regions. Bioinformatics 27:i275–i282.
  • 11. Saldanha AJ. 2004. Java Treeview: extensible visualization of microarray data. Bioinformatics 20:3246–3248.
  • 12. Robinson JT, Thorvaldsdóttir H, Winckler W, Guttman M, Lander ES, Getz G, Mesirov JP. 2011. Integrative genomics viewer. Nat Biotechnol 29:24–26.
  • 13. Hara M, Wang X, Kawamura T, Bindokas VP, Dizon RF, Alcoser SY, Magnuson MA, Bell GI. 2003. Transgenic mice with green fluorescent protein-labeled pancreatic β-cells. Am J Physiol Endocrinol Metab 284:E177–E183.
  • 14. Trapnell C, Williams BA, Pertea G, Mortazavi A, Kwan G, van Baren MJ, Salzberg SL, Wold BJ, Pachter L. 2010. Transcript assembly and quantification by RNA-Seq reveals unannotated transcripts and isoform switching during cell differentiation. Nat Biotechnol 28:511–515.
  • 15. Trapnell C, Roberts A, Goff L, Pertea G, Kim D, Kelley DR, Pimentel H, Salzberg SL, Rinn JL, Pachter L. 2012. Differential gene and transcript expression analysis of RNA-seq experiments with TopHat and Cufflinks. Nat Protoc 7:562–578.
  • 16. Regard JB, Sato IT, Coughlin SR. 2008. Anatomical profiling of G protein-coupled receptor expression. Cell 135:561–571.
  • 17. Mercer TR, Gerhardt DJ, Dinger ME, Crawford J, Trapnell C, Jeddeloh JA, Mattick JS, Rinn JL. 2012. Targeted RNA sequencing reveals the deep complexity of the human transcriptome. Nat Biotechnol 30:99–104.
  • 18. Toung JM, Morley M, Li M, Cheung VG. 2011. RNA-sequence analysis of human B-cells. Genome Res 21:991–998.
  • 19. Guo H, Ingolia NT, Weissman JS, Bartel DP. 2010. Mammalian microRNA predominantly act to decrease target mRNA levels. Nature 466:835–840.
  • 20. Robertson G, Schein J, Chiu R, Corbett R, Field M, Jackman SD, Mungall K, Lee S, Okada HM, Qian JQ, Griffith M, Raymond A, Thiessen N, Cezard T, Butterfield YS, Newsome R, Chan SK, She R, Varhol R, Kamoh B, Prabhu AL, Tam A, Zhao Y, Moore RA, Hirst M, Marra MA, Jones SJ, Hoodless PA, Birol I. 2010. De novo assembly and analysis of RNA-seq data. Nat Methods 7:909–912.
  • 21. Thomas H, Jaschkowitz K, Bulman M, Frayling TM, Mitchell SM, Roosen S, Lingott-Frieg A, Tack CJ, Ellard S, Ryffel GU, Hattersley AT. 2001. A distant upstream promoter of the HNF-4α gene connects the transcription factors involved in maturity-onset diabetes of the young. Hum Mol Genet 10:2089–2097.
  • 22. Magnuson MA, Shelton KD. 1989. An alternate promoter in the glucokinase gene is active in the pancreatic β-cell. J Biol Chem 264:15936–15942.
  • 23. Kim H, Toyofuku Y, Lynn FC, Chak E, Uchida T, Mizukami H, Fujitani Y, Kawamori R, Miyatsuka T, Kosaka Y, Yang K, Honig G, van der Hart M, Kishimoto N, Wang J, Yagihashi S, Tecott LH, Watada H, German MS. 2010. Serotonin regulates pancreatic β-cell mass during pregnancy. Nat Med 16:804–808.
  • 24. Rieck S, White P, Schug J, Fox AJ, Smirnova O, Gao N, Gupta RK, Wang ZV, Scherer PE, Keller MP, Attie AD, Kaestner KH. 2009. The transcriptional response of the islet to pregnancy in mice. Mol Endocrinol 23:1702–1712.
  • 25. Schraenen A, Lemaire K, de Faudeur G, Hendrickx N, Granvik M, Van Lommel L, Mallet J, Vodjdani G, Gilon P, Binart N, in't Veld P, Schuit F. 2010. Placental lactogens induce serotonin biosynthesis in a subset of mouse β-cells during pregnancy. Diabetologia 53:2589–2599.
  • 26. Arava Y, Seger R, Walker MD. 1999. GRFβ, a novel regulator of calcium signaling, is expressed in pancreatic β-cells and brain. J Biol Chem 274:24449–24452.
  • 27. Ulitsky I, Shkumatava A, Jan CH, Sive H, Bartel DP. 2011. Conserved function of lincRNA in vertebrate embryonic development despite rapid sequence evolution. Cell 147:1537–1550.
  • 28. Pollard KS, Hubisz MJ, Rosenbloom KR, Siepel A. 2010. Detection of nonneutral substitution rates on mammalian phylogenies. Genome Res 20:110–121.
  • 29. Guttman M, Amit I, Garber M, French C, Lin MF, Feldser D, Huarte M, Zuk O, Carey BW, Cassady JP, Cabili MN, Jaenisch R, Mikkelsen TS, Jacks T, Hacohen N, Bernstein BE, Kellis M, Regev A, Rinn JL, Lander ES. 2009. Chromatin signature reveals over a thousand highly conserved large non-coding RNA in mammals. Nature 458:223–227.
  • 30. Ponjavic J, Oliver PL, Lunter G, Ponting CP. 2009. Genomic and transcriptional co-localization of protein-coding and long non-coding RNA pairs in the developing brain. PLoS Genet 5:e1000617.

Articles from Molecular Endocrinology are provided here courtesy of The Endocrine Society
