Abstract
Background
Gene expression profiling experiments with few replicates lead to great variability in the estimates of gene variances. Toward this end, several moderated t-test methods have been developed to reduce this variability and to increase power for testing differential expression. Most of these moderated methods are based on linear models with fixed effects where residual variances are smoothed under a hierarchical Bayes framework. However, they are inadequate for designs with complex correlation structures, therefore application of moderated methods to linear models with mixed effects are needed for differential expression analysis.
Results
We demonstrated the implementation of the fully moderated t-statistic method for linear models with mixed effects, where both residual variances and variance estimates of random effects are smoothed under a hierarchical Bayes framework. We compared the proposed method with two current moderated methods and show that the proposed method can control the expected number of false positives at the nominal level, while the two current moderated methods fail.
Conclusions
We proposed an approach for testing differential expression under complex correlation structures while providing variance shrinkage. The proposed method is able to improve power by moderation and controls the expected number of false positives properly at the nominal level.
Keywords: Fully moderated T-statistic, Linear mixed-effects model, Variance shrinkage, Expected number of false positives
Background
To understand the dynamics of gene function, many functional genomic studies have profiled transcriptomic data of individuals with multiple samples of different origins [1, 2], diverse cell types [3, 4], and various time points [5–7]. These measurements of gene expression levels from multiple samples of the same individual are correlated in nature. To analyze the correlated gene expression levels under complex structures, linear mixed-effects models for genes are needed. Due to the constrain of availability, financial resources, or even practical considerations, small number of individuals are commonly used for many studies, which can lead to undesired low power for detecting differential expression between conditions when genes are analyzed separately using linear mixed-effects models. Therefore, variance shrinkage over the range of genes’ expression levels is necessary in order to improve power. However, currently there is a lack of methods that use variance shrinkage techniques for linear mixed-effects models.
Several variance shrinkage methods through moderation under hierarchical Bayes framework have been proposed for detecting differential expression with microarray or RNA-Seq data [8–11]. These methods are based on variance shrinkage of residual errors of linear models with only fixed effects, but the power gain relative to other test methods without variance shrinkage is significant. However, use of these models in designs with correlation structures will lead to biased estimation of gene variances, thus will result in inflated false positive and false negative rates.
There are two variance shrinkage methods that try to address the correlation issue. Limma method [12] incorporates correlation estimates of replicates in linear modeling, but it enforces a common correlation for all genes, which works well for within-array replicates as originally planned, but is risky for between-array replicates. Dream method [13] tries to solve this issue in the setting of linear mixed-effects models. This method shrinks residual variances only, while variance estimates of random effects are scaled to residual variances before and after shrinkage. The limitation of this approach is that variance shrinkage of random effects is ignored by using the same shrinkage for both random and residual variances over the range of genes’ expression levels.
Toward this end, we implemented a novel procedure that allows us to shrink both types of variation independently through moderated t-methods under linear mixed-effects models. Our method assumes that variances of both residual errors and random effects have different mean functional forms over the range of genes’ expression levels. Then testing for differential expression between conditions is based on the combined variance shrinkage estimators. Simulations were performed to evaluate the proposed procedure. A real gene expression experiment on alopecia areata treatment on mouse skin samples was used as a case study.
Methods
Linear mixed-effects model
For each gene, Y denotes expression levels of all samples, which are normally distributed. The linear mixed-effects model (LMM) is formulated as
where β are fixed-effects, γ are random-effects, and ε are residual errors with Var(γ)=G and Var(ε)=R. γ and ε are independent. Y has mean Xβ and variance ZGZ′+R.
The hypothesis for testing fixed-effects is
Then the t-statistic for the hypothesis test is
where , .
Hierarchical model
Following our previously published fully moderated t-statistic (FMT) method [10], a hierarchical Bayes model is assumed for residual variances or variance estimates of a random effect from a linear mixed-effects model of log transformed expression data.
Residual variance for a gene g, , is assumed to follow a scaled Chi-square distribution with degrees of freedom:
where is the variance of residual error, which is assigned a scaled inverse Chi-square prior:
1 |
and are the prior degrees of freedom and location, respectively. Under this model, the posterior mean of , given , is
Similarly, variance estimate of a random effect for a gene g, , is assumed to follow a scaled Chi-square distribution with degrees of freedom:
where is the variance of a random effect, which is also assigned a scaled inverse Chi-square prior:
2 |
and are the prior degrees of freedom and location, respectively. Under this model, the posterior mean of , given , is
Then the denominator of the moderated t-statistic is
Variance components method [14] or Welch-Satterthwaite method [15] is used to calculate an approximation to the degrees of freedom for the linear combinations of independent posterior variance estimates and .
Estimation of hyper-parameters
Hyper-parameter estimation procedures for and are performed separately. The hyperparameters and as well as and are estimated by an empirical Bayes approach implemented in our FMT method [10]. The accuracy of hyper-parameter estimation (in terms of bias and variance) can be approximated by using the empirical distribution of hyper-parameters’ estimates generated from multiple simulation runs, which is demonstrated in our FMT method paper.
False positive control
With thousands of genes available in genome-wide profiling studies, multiplicity adjustment is needed for false positive error control. While the Bonferroni method controls the family-wise error rate (probability of one or more false rejections among all comparisons), it is well-known to be a conservative method for genome-wide profiling studies. A less conservative procedure (the extended interpretation of the Bonferroni method) that controls the expected mean number of false positives was used for multiplicity adjustment throughout this paper [16]. This procedure controls the per family error rate (PFER). It is as powerful as the Benjamimi-Hochberg (BH) FDR control procedure, but has better stability compared to the BH FDR control procedure. In simulations, the nominal level for the false positive error control is set as the PFER.
Simulation procedures
1: For 12,000 genes (g=1,2,⋯,12,000), simulate average log expression, αg, from a three parameter log-normal distribution: ln(αg−3.75)∼N(1.35,0.35).
- 2: For of gene g, calculate the gene-specific prior location and prior degrees of freedom as a function of the average log expression values (αg) obtained from Step 1. The prior location and prior degrees of freedom are modeled as:
For of gene g, calculate the gene-specific prior location and prior degrees of freedom as a function of the average log expression values (αg) obtained from Step 1. The prior location and prior degrees of freedom are modeled as: 3: For gene g, simulate and from an inverse Chi-square distribution (Eqs. 1 and 2) using the hyper-parameters from Step 2.
4: For gene g, simulate random effect Si from .
- 5: Among the 12,000 expressed genes, 500 genes are simulated to be differentially expressed using the following steps:
- Simulate mean log-differences μg from N(0,2) for 500 differentially expressed genes.
- Simulate expression data for each replicate in group 1 from and for each replicate in group 2 from .
6: For the remaining 11,500 expressed genes, simulate expression data for each replicate from .
Results
Simulation
Parameter setting
For simulation, we chose a simple design for gene expression experiment without loss of generality. We assumed two cohort groups with each cohort having ns biological replicates (subjects) and each subject having nr technical replicates. Our main interest is to compare gene expression differences between two cohort groups for each gene. We simulated gene expression data under linear mixed-effects models, where subject is a random effect and variation due to technical replicates was included in the residual errors.
Expression levels of 12,000 genes were simulated independently, among which 500 genes were simulated with differential expression between two cohort groups (see above Simulation Procedures). Simulation models were based on model assumptions of our published FMT method, which allows hyper-parameters to vary as a function of genes’ expression levels. All parameters were set to resemble real gene expression arrays. Under this setting, we generated data sets for 13 different biological replicates (3-20 in each cohort group) and 6 different technical replicates (2-11) within each biological replicate. One hundred simulation runs were conducted for each simulation scenario.
Methods comparison
We compared the proposed method (FMT in LMM) to Limma [12] and Dream methods [13]. The Limma method calculates a common correlation coefficient between technical replicates for all genes to account for the correlation of repeated measurements within subjects in a linear fixed-effects model setting. The Dream method uses a linear mixed-effects model, where variance estimates of random subject effect is standardized to residual variances, then residual variances were modeled as a function of average of expression levels. In our proposed method, we used two ways to estimate the degrees of freedom of the moderated t-statistic in Eq. 3. The first approach is the variance components method (VC), which divides the residual degrees of freedom into between-subject and within-subject portions based on variance decomposition and assigns the between-subject degrees of freedom to the fixed effects if subjects are nested in the fixed effects. The second approach is the Welch-Satterthwaite method (Sat), which uses an approximation of the effective degrees of freedom of a linear combination of independent variance estimates. We also compared our method to the ordinary t-test method (OT) in LMM, which does not perform variance shrinkage to the variance estimates.
Power and false positives
Power averaged over 100 simulation runs are shown in Fig. 1, where power is plotted against numbers of biological replicates (3-20) for 2 technical replicates and 5 expected mean number of false positives. The Limma method has the best power except at the small number of biological replicates and OT-VC has the lowest power. The power difference between all methods goes down when the number of biological replicates increases. For both FMT and OT methods, the Welch-Satterthwaite method results in better power than the variance components method. Fig. 2 shows the actual number of false positives over numbers of biological replicates for 2 technical replicates and 5 expected mean number of false positives. The Dream method has the largest actual number of false positives, and it spikes more than 100 fold higher than the controlled level when the number of biological replicates is relatively small and drops dramatically when the number of biological replicates increases. The Limma method has inflated actual number of false positives as well, but it is relatively stable over numbers of biological replicates. Only FMT-VC and OT-VC maintain the nominal level of false positive error control at each biological replicate level. Similar patterns are observed for higher numbers of technical replicates.
Power averaged over 100 simulation runs is plotted against numbers of technical replicates (2-11) for 3 biological replicate and 5 expected number of false positives as shown in Fig. 3. The Limma and Dream methods have higher power than other methods. The power differences between methods seem stable across the range of technical replicates. Figure 4 shows the actual number of false positives by technical replicates for 3 biological replicates and 5 expected number of false positives. Similarly, FMT-VC and OT-VC maintain the desired false positive error control at each technical replicate number. When the number of technical replicates is bigger than 4, the FMT-Sat method maintains the desired false positive error control, while the other three methods still fail to maintain false positives at the nominal level. For both Limma and Dream methods, the actual number of false positives increases with larger numbers of technical replicates. Similar patterns are observed for higher numbers of biological replicates. We believe that the ultrally inflated actual number of false positives for both Limma and Dream methods contributes to the higher power compared to both FMT and OT methods.
Gene correlation
To understand effects of gene correlation on the tests’ performance, we also performed simulations using a block-diagonal gene correlation structure. We randomly assigned genes into blocks of size 10, where gene correlations inside blocks are fixed at 0.5 while gene correlations between blocks are set to 0. Results on power and actual number of false positives under the gene correlation structure are highly similar to those under gene independence. Supplemental plots are provided in Additional file 1.
A case study
To demonstrate the performance of the proposed method for detecting differential expression under linear mixed-effects models, we applied it to a real microarray study published in Nature Medicine in 2014 by Xing et al. [17]. Microarray data were deposited in the GEO database with an access number GSE45514. This study used Affymetrix Mouse Genome 430 2.0 Array on mouse skin samples taken at different time points (weeks 6 and 12) from C3H/HeJ mice treated with topical JAK inhibitors ruxolitinib (Jak1), tofacitinib (Jak3) or control (PBS). Three skin samples were taken at each time point from 3 mice for each treatment. We focused on the comparison between two treatments (J3 vs. PBS), and we analyzed the data using our FMT-VC linear mixed-effects model where mouse was treated as a random effect. We compared the FMT-VC method with other methods as done in simulation studies. Figures 5 and 6 show the fitted values of residual variances and variance estimates of random mouse effect. As shown in the venn-diagram in Fig. 7, lists of significant genes of week 12 at the PFER cutoff value of 5 expected mean number of false positives are compared among methods. There are several genes detected only by the Limma and Dream methods, but these genes are most likely to be false positives since our simulation studies demonstrate the ultrally inflated false positive rate for both methods especially when the number of biological replicates is small. The few significant genes for OT-VC method are mainly due to low power of that method.
Discussion
Differential expression analysis through moderated t-tests has become common practice in genomic studies for the past decade. But most moderated t-tests are limited to simple designs with the usage of linear fixed-effects models. Recent efforts have been made to accommodate designs associated with complex correlation structures using LMM. In this paper, we proposed a novel approach to address the need of applying moderated methods in LMM and compared our method with two other current methods. Our proposed method is very flexible because we allow different shrinkage over the range of expression levels for each variance component. However the Limma and Dream methods can only do shrinkage of residual variances over the range of expression levels.
We have employed two approaches for approximating degrees of freedom of moderated t-statistic (the variance components method and the Welch-Satterthwaite method) for both FMT and OT methods in LMM for testing differential expression. The Welch-Satterthwaite method uses all variance estimates to approximate degrees of freedom, but the VC method only accounts for between-subject variation and ignores within-subject variation from the total degrees of freedom when fixed effects change across subjects. The Welch-Satterthwaite method inflates type I error and fails to maintain false positive error control at the nominal level, but the VC method performs proper control for false positive errors. In addition, FMT-VC is more powerful than OT-VC mainly due to moderation.
The performance of the proposed method was demonstrated through a case study with a simple design. However it can be applied to comprehensive designs including multi-factors or multi-levels in linear mixed-effects models. The proposed method can also be applied to data sets that measure gene expression by other technologies, e.g. RNA-Seq, protein expression, or quantities of small molecules.
Conclusions
Current moderated-t methods are limited to shrinking only the residual variance of linear models. In this paper, we proposed an approach to allow variance shrinkage of both residual error and random effects using moderated t-statistic under LMM. The gain in power through moderation and proper control of false positive errors was demonstrated by simulation studies. We also applied the proposed method to a real gene expression data set.
Supplementary information
Abbreviations
- BH
Benjamimi-Hochberg
- FDR
False discovery rate
- FMT
Fully moderated t-statistic
- LMM
Linar mixed-effects models
- PFER
Per family error rate
Authors’ contributions
LY generated the original idea and drafted the manuscript. LY and JZ performed the simulations. GB and SF guided the research and revised the manuscript. All authors read and approved the final version of this manuscript.
Funding
This research was partially supported by NIH grants 2P30CA016058-40. Publication costs are covered by internal fund of the corresponding author’s department.
Availability of data and materials
The microarray data set used in the case study is from publicly available repositories. R package fmt used in the analysis is available from CRAN. R code for the case study is available from the corresponding author.
Ethics approval and consent to participate
Not applicable.
Consent for publication
Not applicable.
Competing interests
The authors declare that they have no competing interests.
Footnotes
Publisher’s Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Guy Brock and Soledad Fernandez contributed equally to this work.
Contributor Information
Lianbo Yu, Email: Lianbo.Yu@osumc.edu.
Guy Brock, Email: Guy.Brock@osumc.edu.
Soledad Fernandez, Email: Soledad.Fernandez@osumc.edu.
Supplementary information
Supplementary information accompanies this paper at 10.1186/s12859-019-3248-9.
References
- 1.Carithers LJ, Ardlie K, Barcus M, Branton PA, Britton A, Buia SA, Compton CC, DeLuca DS, Peter-Demchok J, Gelfand ET, Guan P, Korzeniewski GE, Lockhart NC, Rabiner CA, Rao AK, Robinson KL, Roche NV, Sawyer SJ, Segrè AV. Shive CE, Smith AM, Sobin LH, Undale AH, Valentino KM, Vaught J, Young TR, Moore HM, GTEx Consortium A Novel Approach to High-Quality Postmortem Tissue Procurement: The GTEx Project. Biopreserv Biobank. 2015;13(5):311–9. doi: 10.1089/bio.2015.0032. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 2.Yu YP, Landsittel D, Jing L, Nelson J, Ren B, Liu L, McDonald C, Thomas R, Dhir R, Finkelstein S, Michalopoulos G, Becich M, Luo JH. Gene expression alterations in prostate cancer predicting tumor aggression and preceding development of malignancy. J Clin Oncol. 2004;22(14):2790–9. doi: 10.1200/JCO.2004.05.158. [DOI] [PubMed] [Google Scholar]
- 3.Liu H, Dowdle JA, Khurshid S, Sullivan NJ, Bertos N, Rambani K, Mair M, Daniel P, Wheeler E, Tang X, Toth K, Lause M, Harrigan ME, Eiring K, Sullivan C, Sullivan MJ, Chang SW, Srivastava S, Conway JS, Kladney R, McElroy J, Bae S, Lu Y, Tofigh A, Saleh SMI, Fernandez SA, Parvin JD, Coppola V, Macrae ER, Majumder S, Shapiro CL, Yee LD, Ramaswamy B, Hallett M, Ostrowski MC, Park M, Chamberlin HM, Leone G. Discovery of Stromal Regulatory Networks that Suppress Ras-Sensitized Epithelial Cell Proliferation. Dev Cell. 2017;41(4):392–407. doi: 10.1016/j.devcel.2017.04.024. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 4.Shi L, Zhang Y, Feng L, Wang L, Rong W, Wu F, Wu J, Zhang K, Cheng S. Multi-omics study revealing the complexity and spatial heterogeneity of tumor-infiltrating lymphocytes in primary liver carcinoma. Oncotarget. 2017;8(21):34844–57. doi: 10.18632/oncotarget.16758. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 5.Kupfer DM, White VL, Strayer DL, Crouch DJ, Burian D. Microarray characterization of gene expression changes in blood during acute ethanol exposure. BMC Med Genomics. 2013;6:26. doi: 10.1186/1755-8794-6-26. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 6.Golumbeanu M, Desfarges S, Hernandez C, Quadroni M, Rato S, Mohammadi P, Telenti A, Beerenwinkel N, Ciuffi A. Proteo-Transcriptomic Dynamics of Cellular Response to HIV-1 Infection. Sci Rep. 2019;9(1):213. doi: 10.1038/s41598-018-36135-3. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 7.Muscedere J, Maslove DM, Boyd JG, O’Callaghan N, Sibley S, Reynolds S, Albert M, Hall R, Jiang X, Day AG, Jones G, Lamontagne F. Prevention of Nosocomial Infections in Critically Ill Patients With Lactoferrin: A Randomized, Double-Blind, Placebo-Controlled Study. Crit Care Med. 2018;46(9):1450–6. doi: 10.1097/CCM.0000000000003294. [DOI] [PubMed] [Google Scholar]
- 8.Smyth GK. Linear models and empirical bayes methods for assessing differential expression in microarray experiments. Stat Appl Genet Mol Biol. 2004;3:Article3. doi: 10.2202/1544-6115.1027. [DOI] [PubMed] [Google Scholar]
- 9.Sartor MA, Tomlinson CR, Wesselkamper SC, Sivaganesan S, Leikauf GD, Medvedovic M. Intensity-based hierarchical Bayes method improves testing for differentially expressed genes in microarray experiments. BMC Bioinformatics. 2006;7:538. doi: 10.1186/1471-2105-7-538. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 10.Yu L, Gulati P, Fernandez S, Pennell M, Kirschner L, Jarjoura D. Fully moderated T-statistic for small sample size gene expression arrays. Stat Appl Genet Mol Biol. 2011;10(1):Article 42. doi: 10.2202/1544-6115.1701. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 11.Law CW, Chen Y, Shi W, Smyth GK. voom: Precision weights unlock linear model analysis tools for RNA-seq read counts. Genome Biol. 2014;15(2):R29. doi: 10.1186/gb-2014-15-2-r29. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 12.Smyth GK, Michaud J, Scott HS. Use of within-array replicate spots for assessing differential expression in microarray experiments. Bioinformatics. 2005;21(9):2067–75. doi: 10.1093/bioinformatics/bti270. [DOI] [PubMed] [Google Scholar]
- 13.Hoffman GE, Roussos P. Dream: Powerful differential expression analysis for repeated measures designs. biorxiv. 2018. 10.1101/432567. [DOI] [PMC free article] [PubMed]
- 14.Schluchter MD, Elashoff JD. Small-sample adjustments to tests with unbalanced repeated measures assuming several covariance structures. J Stat Comput Simul. 1990;37:69–87. doi: 10.1080/00949659008811295. [DOI] [Google Scholar]
- 15.Satterthwaite FE. An approximate distribution of estimates of variance components. Biom Bull. 1946;2:110–4. doi: 10.2307/3002019. [DOI] [PubMed] [Google Scholar]
- 16.Gordon A, Glazko G, Qiu X, Yakovlev A. Control of the mean number of false discoveries, Bonferroni and stability of multiple testing. Annals Appl Stat. 2007;1:179–90. doi: 10.1214/07-AOAS102. [DOI] [Google Scholar]
- 17.Xing L, Dai Z, Jabbari A, Cerise JE, Higgins CA, Gong W, de Jong A, Harel S, DeStefano GM, Rothman L, Singh P, Petukhova L, Mackay-Wiggan J, Christiano AM, Clynes R. Alopecia areata is driven by cytotoxic T lymphocytes and is reversed by JAK inhibition. Nat Med. 2014;20(9):1043–9. doi: 10.1038/nm.3645. [DOI] [PMC free article] [PubMed] [Google Scholar]
Associated Data
This section collects any data citations, data availability statements, or supplementary materials included in this article.
Supplementary Materials
Data Availability Statement
The microarray data set used in the case study is from publicly available repositories. R package fmt used in the analysis is available from CRAN. R code for the case study is available from the corresponding author.