Abstract
Clonal population expansion of T cells during an immune response is dependent on the affinity of the T cell receptor (TCR) for its antigen [1]. However, there is little understanding of how this process is controlled transcriptionally. We found that the transcription factor IRF4 was induced in a manner dependent on TCR-affinity and was critical for the clonal expansion and maintenance of effector function of antigen-specific CD8+ T cells. We performed a genome-wide expression profiling experiment using RNA sequencing technology (RNA-seq) to interrogate global expression changes when IRF4 was deleted in CD8+ T cells activated with either a low or high affinity peptide ligand. This allowed us not only to determine IRF4-dependent transcriptional changes but also to identify transcripts dependent on TCR-affinity [2]. Here we describe in detail the analyses of the RNA-seq data, including quality control, read mapping, quantification, normalization and assessment of differential gene expression. The RNA-seq data can be accessed from the Gene Expression Omnibus database (accession number GSE49929).
Keywords: IRF4, RNA-sequencing analysis, TCR-affinity, T cells, Clonal expansion
| Specifications | |
|---|---|
| Organism/cell line/tissue | Mus musculus/spleen and lymph node |
| Strain | C57BL/6 |
| Sequencer or array type | Illumina HiSeq 2000 sequencer |
| Data format | FASTQ |
| Experimental factors | Wild-type or IRF4−/− CD8+ OT-I T cells, activated with either a low or high affinity peptide ligand |
| Experimental features | RNA-seq data |
| Consent | Mice were maintained and used in accordance with the guidelines of the Walter and Eliza Hall Institute Animal Ethics Committee. |
| Sample source location | Melbourne, Australia |
Direct link to deposited data
Deposited data can be accessed via: http://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE49929.
Experimental design, materials and methods
Sample preparation
Irf4−/− mice have been described previously [3] and were maintained on a C57BL/6 (Ly5.2+) background. They were crossed to OT-I mice, which carry an MHC class I-restricted TCR transgene resulting in the expression of an ovalbumin (OVA) peptide-specific TCR [4]. Naive CD8+ T cells were isolated from the spleens and lymph nodes of OT-I mice on either a wild-type or Irf4−/− background and were activated for 72 h in vitro with OVA peptide N4 (SIINFEKL, high affinity) or V4 (SIIVFEKL, low affinity) [1] (1 μg/ml) in the presence of recombinant human IL-2 (100 U/ml; R&D Systems).
RNA sequencing
RNA was purified with an RNeasy Plus Mini Kit (Qiagen) according to the manufacturer's protocol. The DNA fragments were ligated to Illumina adaptors with blunt ends, amplified and then sequenced on an Illumina HiSeq 2000 sequencer. Each sample had two or three biological replicates. Sequencing generated paired-end 90 bp reads.
Sequencing quality
Fig. 1 shows the distribution of base-calling Phred scores at each base location in all the reads included in one of the libraries. Although nucleotides located at the ends of reads were found to have a lower sequencing quality than those in the middle of reads, the overall sequencing quality is high, since the majority of read bases have a Phred score greater than 30 (i.e., the probability of an incorrect base call is less than 0.001). Other libraries included in this study were found to have a sequencing quality similar to that shown in Fig. 1.
Fig. 1.
Distribution of base-calling Phred scores at each base location in all the reads included in one of the libraries. The horizontal axis gives the position of each nucleotide in the read and the vertical axis shows a box plot of Phred scores of called nucleotides at each read position. For each base position, the box shows the 25%, 50% and 75% quantiles of the Phred scores. Scores more than 1.5 interquartile ranges from the median for that position are plotted as individual points. Phred scores of read bases were retrieved from the FASTQ input file using the qualityScores function in the Bioconductor R package Rsubread.
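The relationship between a Phred score and the probability of an incorrect base call, and the per-position quartiles summarized in Fig. 1, can be illustrated with a minimal Python sketch. This is not the Rsubread code used in the study. The function names, the toy quality strings, the quality offset of 33 and the "inclusive" quantile method are all assumptions made for illustration.

```python
import statistics

def phred_to_error_prob(q):
    """Convert a Phred score Q to the probability of an incorrect base call."""
    return 10.0 ** (-q / 10.0)

def decode_quality_string(qual, offset=33):
    """Decode one FASTQ quality line into integer Phred scores.

    offset=33 is an assumption (Sanger / Illumina 1.8+ encoding);
    older Illumina pipelines used an offset of 64.
    """
    return [ord(ch) - offset for ch in qual]

def per_position_quartiles(quality_strings, offset=33):
    """Return the (25%, 50%, 75%) quantiles of Phred scores at each read position.

    Assumes all reads have the same length.
    """
    decoded = [decode_quality_string(q, offset) for q in quality_strings]
    summary = []
    for scores in zip(*decoded):
        q1, q2, q3 = statistics.quantiles(scores, n=4, method="inclusive")
        summary.append((q1, q2, q3))
    return summary

# Toy quality lines (hypothetical, not taken from the deposited data)
toy_quals = ["IIIIH", "IIIHG", "IIHGF", "IIIIG"]
quartiles = per_position_quartiles(toy_quals)
for position, (q1, q2, q3) in enumerate(quartiles, start=1):
    print(position, q1, q2, q3, phred_to_error_prob(q2))
```

A Phred score of 30 corresponds to an error probability of 0.001, which is the threshold quoted in the text.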
Read mapping and summarization
Sequence reads were mapped to mouse reference genome mm9 using the Subread aligner [5], which is capable of mapping both exonic and exon-spanning reads. Mapped reads were summarized to NCBI RefSeq genes using the featureCounts program [6]. Raw read counts were generated for each gene in each library after summarization.
Gene filtering and normalization
Genes were removed from the analysis if they failed to achieve an FPKM (fragments per kilobase per million mapped reads) value of 0.5 or greater in at least one library. Counts were converted to log2 counts per million (CPM), quantile normalized and precision weighted using voom [7]. Fig. 2 shows the relationship between mean expression values of genes and their expression variations. Expression variations of genes were estimated from the biological replicates. Fig. 3 shows the clustering of samples after normalization. Distinct cell types were clearly separated and sample replicates were clustered together.
Fig. 2.

Mean–variance relationship estimated from the sequence data by voom. The horizontal axis gives the mean log2-CPM value of each gene and the vertical axis gives the square root of the standard deviation of the log2-CPM values of each gene, estimated from the biological replicates.
Fig. 3.

Unsupervised clustering of the samples by multi-dimensional scaling. ‘WT’ and ‘KO’ denote wild-type and Irf4−/− OT-I T cells, respectively. ‘N4’ and ‘V4’ denote stimulation with high affinity peptide and stimulation with low affinity peptide, respectively. Distances on the plot represent average absolute log2 fold change for the leading 500 genes that distinguish each pair of samples. This figure was generated using the plotMDS function in the Bioconductor R package limma.
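The gene filter and the log2-CPM conversion described in this section can be sketched in Python as follows. This is an illustrative sketch, not the R code used in the study. It assumes that the library size is the total number of mapped fragments in a library, it uses the offsets that voom applies when computing log2-CPM values, and it omits the quantile normalization and precision weighting steps. All function names and toy numbers are hypothetical.

```python
import math

def fpkm(count, gene_length_bp, library_size):
    """Fragments per kilobase of gene length per million mapped fragments."""
    return count / ((gene_length_bp / 1e3) * (library_size / 1e6))

def keep_gene(counts, gene_length_bp, library_sizes, min_fpkm=0.5):
    """Keep a gene if it reaches min_fpkm in at least one library."""
    return any(
        fpkm(c, gene_length_bp, n) >= min_fpkm
        for c, n in zip(counts, library_sizes)
    )

def log2_cpm(count, library_size):
    """log2 counts per million with the offsets used by voom:
    log2((count + 0.5) / (library_size + 1) * 1e6).
    """
    return math.log2((count + 0.5) / (library_size + 1.0) * 1e6)

# Toy example with two libraries (hypothetical numbers)
library_sizes = [20_000_000, 25_000_000]
gene_a = {"length_bp": 2000, "counts": [30, 10]}  # FPKM 0.75 in library 1
gene_b = {"length_bp": 4000, "counts": [20, 30]}  # FPKM below 0.5 in both

for name, gene in (("gene_a", gene_a), ("gene_b", gene_b)):
    kept = keep_gene(gene["counts"], gene["length_bp"], library_sizes)
    values = [log2_cpm(c, n) for c, n in zip(gene["counts"], library_sizes)]
    print(name, "kept" if kept else "removed", values)
```

In this toy example gene_a passes the filter because it reaches an FPKM of 0.75 in the first library, whereas gene_b is removed.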
Differential expression analysis
Linear models were fitted to genes using the Bioconductor R package limma [8]. Precision weights for genes that were estimated by voom were used in the linear modeling process. Empirical Bayes moderated t-statistics were used to assess differential expression [9]. A false discovery rate cutoff of 5% and a fold-change cutoff of 2 were applied to call differentially expressed (DE) genes. In addition, DE genes were required to have an FPKM value of 8 or greater in at least one of the two samples being compared. DE genes found in comparing Irf4−/− with wild type in high-affinity CD8+ T cells are highlighted in Fig. 4, in which genome-wide expression changes between the two samples are shown.
Fig. 4.
Genome-wide expression changes between Irf4−/− and wild type in high-affinity CD8+ OT-I T cells. Significantly up-regulated and down-regulated genes are highlighted. This figure was generated using the plotMA function in limma.
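The three criteria used to call DE genes in this section (false discovery rate, fold change and minimum FPKM) can be combined as in the Python sketch below. The sketch assumes that the false discovery rate is controlled with the Benjamini-Hochberg procedure and that the cutoffs are applied as "adjusted p-value below 0.05" and "absolute log2 fold change of at least 1". The text does not state these details, so they should be read as assumptions. The function names, dictionary keys and toy values are hypothetical, and the p-values stand in for those obtained from the moderated t-statistics.

```python
import math

def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

def call_de(genes, fdr=0.05, min_fold=2.0, min_fpkm=8.0):
    """Return names of genes passing the FDR, fold-change and FPKM cutoffs.

    genes: list of dicts with keys name, log2fc, p, fpkm_a, fpkm_b
    (fpkm_a and fpkm_b are the FPKM values in the two samples compared).
    """
    adjusted = benjamini_hochberg([g["p"] for g in genes])
    min_abs_log2fc = math.log2(min_fold)
    called = []
    for g, q in zip(genes, adjusted):
        passes_fdr = q < fdr
        passes_fold = abs(g["log2fc"]) >= min_abs_log2fc
        passes_fpkm = max(g["fpkm_a"], g["fpkm_b"]) >= min_fpkm
        if passes_fdr and passes_fold and passes_fpkm:
            called.append(g["name"])
    return called

# Toy input (hypothetical values, not results from the study)
toy_genes = [
    {"name": "g1", "log2fc": 2.0, "p": 0.001, "fpkm_a": 20.0, "fpkm_b": 5.0},
    {"name": "g2", "log2fc": -1.5, "p": 0.01, "fpkm_a": 3.0, "fpkm_b": 9.0},
    {"name": "g3", "log2fc": 0.5, "p": 0.02, "fpkm_a": 50.0, "fpkm_b": 40.0},
    {"name": "g4", "log2fc": 3.0, "p": 0.03, "fpkm_a": 2.0, "fpkm_b": 6.0},
    {"name": "g5", "log2fc": 1.2, "p": 0.5, "fpkm_a": 10.0, "fpkm_b": 25.0},
]
de_genes = call_de(toy_genes)
print(de_genes)
```

In the toy input, g3 fails the fold-change cutoff, g4 fails the FPKM requirement and g5 fails the false discovery rate cutoff, so only g1 and g2 are called.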
Discussion
Here we have provided a detailed description of the analyses we carried out for the RNA-seq data generated in the original study of TCR-affinity and IRF4-mediated transcriptional changes in CD8+ T cells [2]. Raw sequence read data have been made publicly available and the software programs used in this analysis can be freely downloaded from Bioconductor [10] or SourceForge (http://subread.sourceforge.net). These should enable the RNA-seq analysis results presented in the original study to be readily reproduced. We also note that the pipeline used in this data analysis has been found to be one of the best-performing pipelines for RNA-seq analysis by the SEQC/MAQC III Consortium in their recent efforts to benchmark RNA-seq technologies [11].
Acknowledgements
This work was supported by grants and fellowships from the National Health and Medical Research Council of Australia (NHMRC) (GKS, SLN, AK, WS), the Sylvia and Charles Viertel Foundation (AK), the Australian Research Council (SLN, AK), and the WEHI Genomics Fund (AK). The funding includes NHMRC Project Grant 1023454 (GKS, WS), Project Grant 1032850 (AK) and Program grant 1054618 (GKS). This study was made possible through the Victorian State Government Operational Infrastructure Support and the Australian Government NHMRC Independent Research Institute Infrastructure Support scheme.
References
- 1. Zehn D., Lee S.Y., Bevan M.J. Complete but curtailed T-cell response to very low-affinity antigen. Nature. 2009;458:211–214. doi: 10.1038/nature07657.
- 2. Man K., Miasari M., Shi W., Xin A., Henstridge D.C. The transcription factor IRF4 is essential for TCR affinity-mediated metabolic programming and clonal expansion of T cells. Nat. Immunol. 2013;14:1155–1165. doi: 10.1038/ni.2710.
- 3. Mittrucker H.W., Matsuyama T., Grossman A., Kundig T.M., Potter J. Requirement for the transcription factor LSIRF/IRF4 for mature B and T lymphocyte function. Science. 1997;275:540–543. doi: 10.1126/science.275.5299.540.
- 4. Hogquist K.A., Jameson S.C., Heath W.R., Howard J.L., Bevan M.J. T cell receptor antagonist peptides induce positive selection. Cell. 1994;76:17–27. doi: 10.1016/0092-8674(94)90169-4.
- 5. Liao Y., Smyth G.K., Shi W. The Subread aligner: fast, accurate and scalable read mapping by seed-and-vote. Nucleic Acids Res. 2013;41:e108. doi: 10.1093/nar/gkt214.
- 6. Liao Y., Smyth G.K., Shi W. featureCounts: an efficient general purpose program for assigning sequence reads to genomic features. Bioinformatics. 2014;30:923–930. doi: 10.1093/bioinformatics/btt656.
- 7. Law C.W., Chen Y., Shi W., Smyth G.K. Voom: precision weights unlock linear model analysis tools for RNA-seq read counts. Genome Biol. 2014;15:R29. doi: 10.1186/gb-2014-15-2-r29.
- 8. Smyth G.K. Limma: linear models for microarray data. In: Gentleman R., Carey V., Dudoit S., Irizarry R., Huber W., editors. Bioinformatics and Computational Biology Solutions Using R and Bioconductor. Springer; New York: 2005. pp. 397–420.
- 9. Smyth G.K. Linear models and empirical Bayes methods for assessing differential expression in microarray experiments. Stat. Appl. Genet. Mol. Biol. 2004;3:Article3. doi: 10.2202/1544-6115.1027.
- 10. Gentleman R.C., Carey V.J., Bates D.M., Bolstad B., Dettling M. Bioconductor: open software development for computational biology and bioinformatics. Genome Biol. 2004;5:R80. doi: 10.1186/gb-2004-5-10-r80.
- 11. Su Z., Labaj P.P., Li S., Thierry-Mieg J., Thierry-Mieg D. A comprehensive assessment of RNA-seq accuracy, reproducibility and information content by the Sequencing Quality Control Consortium. Nat. Biotechnol. 2014;32:903–914. doi: 10.1038/nbt.2957.


