Abstract
Methods for identifying differential expression were compared on time series microarray data simulated from artificial gene networks. The ability to identify differential expression depended on normalization and on whether the background was removed. Loess normalization after background correction improved results for most methods. On data without background correction, median centering improved performance. We recommend Cui and Churchill’s ANOVA variants on background-subtracted data and Efron and Tibshirani’s Empirical Bayes Wilcoxon Rank Sum test when the background cannot be removed.
Introduction
The purpose of this study was to characterize the conditions in which tests for differential expression developed for static microarray data are capable of detecting changes in time series microarray experiments. We compared several methods using data simulated from artificial gene networks, which allow one to create realistic but controllable data.
Materials and Methods
We simulated microarray data using artificial gene networks. Networks from the three available topology categories were compared: Erdös and Rényi, “small-worlds”, and “scale-free”. The data were simulated using GEPASI 3 Biochemical Simulator software. We selected five networks from each topology and, for each network, performed ten simulations, each under one of the following five experimental conditions: Normal; Perturbation – High and Low, in which a single gene is initialized to a concentration considerably higher or lower than its steady state level, respectively; and Mutation – High and Low, which are similar to the perturbation simulations except that the kinetics of the altered gene were changed so that it was no longer under regulatory control. Only one gene per network was explicitly altered. All others were initialized to near their normal steady state. Normal simulations served as the reference, and all other simulations were used as the Cy5 (channel 2) data. Each simulation was run from time zero until steady state. From each run we sampled a set of “arrays” for analysis, with a relative distribution of time points likely in a time course microarray experiment.
The following methods were compared: Empirical Bayes Wilcoxon Rank Sum (EB WRS), the t-test, the Bayesian t-test (Cyber-T), the Significance Analysis of Microarrays (SAM) test statistic (without permutation analysis), the t-statistic with maximum-likelihood standard error (MLE-T), the B statistic, and the “three flavors” of F test (F1, F2 and F3).
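As a rough illustration of how two of the listed statistics relate, the sketch below computes an ordinary pooled two-sample t statistic and a SAM-style statistic for one gene. It is not the code used in this study. The function names, the two-group layout, and the value of the fudge constant s0 are assumptions made for the example; in SAM itself s0 is estimated from the data.

```python
from math import sqrt
from statistics import mean


def pooled_standard_error(x, y):
    """Pooled standard error of the difference in means of two groups."""
    mx, my = mean(x), mean(y)
    ss = sum((v - mx) ** 2 for v in x) + sum((v - my) ** 2 for v in y)
    return sqrt((1 / len(x) + 1 / len(y)) * ss / (len(x) + len(y) - 2))


def t_statistic(x, y):
    """Ordinary two-sample t statistic with pooled variance."""
    return (mean(x) - mean(y)) / pooled_standard_error(x, y)


def sam_like_statistic(x, y, s0):
    """SAM-style statistic: the t statistic with a constant s0 added to the
    denominator. s0 is a hypothetical, user-supplied value here."""
    return (mean(x) - mean(y)) / (pooled_standard_error(x, y) + s0)


# Hypothetical log-expression values for one gene in two groups.
altered = [1.0, 1.2, 0.8]
normal = [0.1, 0.0, -0.1]
t_value = t_statistic(altered, normal)
d_value = sam_like_statistic(altered, normal, s0=0.5)
```

With s0 = 0 the two statistics coincide. A positive s0 shrinks the statistic most for genes whose estimated variance is very small, which is the motivation for adding it.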
We evaluated all comparison methods by means of the area under the Receiver Operating Characteristic curve (AUC). To generate the curves, all comparison results were compared to the “truth” for that network. We defined the truth as those genes directly influenced by the gene we explicitly altered in that network, including the altered gene itself. Data normalization across arrays can considerably influence the results. We therefore also compared median centering and loess normalization.
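The evaluation and the simpler of the two normalizations can be sketched as follows. This is a minimal illustration under stated assumptions, not the study's code: the function names are hypothetical, scores are assumed to be oriented so that larger means stronger evidence of differential expression, and the AUC is computed through its pairwise (Mann-Whitney) interpretation rather than by tracing the curve.

```python
from statistics import median


def roc_auc(scores, truth):
    """Area under the ROC curve as the fraction of (true, false) gene pairs
    in which the true gene has the higher score; ties count as one half."""
    pos = [s for s, t in zip(scores, truth) if t]
    neg = [s for s, t in zip(scores, truth) if not t]
    if not pos or not neg:
        raise ValueError("need at least one true and one false gene")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def median_center(log_ratios):
    """Median-center one array: subtract the array's median log ratio."""
    m = median(log_ratios)
    return [v - m for v in log_ratios]


# Hypothetical scores for four genes; the first two are in the "truth" set.
auc = roc_auc([0.9, 0.8, 0.1, 0.2], [True, True, False, False])
centered = median_center([1.0, 2.0, 4.0])
```

An AUC of 0.5 corresponds to a ranking no better than chance, and 1.0 to a ranking that places every true gene above every other gene.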
Results & Discussion
Several methods adequately detected differentially expressed genes in the mutant simulations, in which the absence or over-expression of a gene was persistent. This was expected, since a mutation alters the entire temporal expression pattern of the mutated gene and of all the genes that it influences. The mean of those expression patterns is therefore more likely to differ from the mean of the corresponding normal expression patterns. Thus, identifying differentially expressed genes in a time series with a persistent mutation is similar to doing so in a static experiment. The comparison methods studied were less able to identify differentially expressed genes in the perturbation simulations.
ANOVA, particularly F2 or F3, gave the best results on background subtracted data for most conditions and network topologies. ANOVA is similar to the other statistical methods in that it finds differences based on variance in mean expression among groups. Its increased power compared to the other methods is attributed to the additional term in the linear model that accounts for the time variable. This advantage was reduced on the data without background subtraction. EB WRS had the best relative results on non-background subtracted data. The increased noise owing to the uncorrected background likely favored the non-parametric EB WRS.
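To make the role of the time term concrete, the sketch below fits an additive condition-plus-time model to one gene and forms three F-type statistics for the condition effect. It rests on several assumptions that the text above does not confirm: a balanced design with one value per condition and time point, the same design for every gene, and the common description of F1, F2 and F3 as differing only in the error variance used (gene-specific, the average of gene-specific and pooled, and pooled across genes, respectively). All function names are hypothetical.

```python
from statistics import mean


def condition_mean_squares(table):
    """table[c][t] holds one log-expression value for condition c at time t.

    Returns (condition mean square, residual mean square) for the additive
    model: value = grand mean + condition effect + time effect + error.
    """
    n_cond, n_time = len(table), len(table[0])
    grand = mean([v for row in table for v in row])
    cond_means = [mean(row) for row in table]
    time_means = [mean([row[t] for row in table]) for t in range(n_time)]
    ss_cond = n_time * sum((m - grand) ** 2 for m in cond_means)
    # Removing the time means takes time-driven variation out of the residual.
    ss_resid = sum(
        (table[c][t] - cond_means[c] - time_means[t] + grand) ** 2
        for c in range(n_cond)
        for t in range(n_time)
    )
    df_resid = (n_cond - 1) * (n_time - 1)
    return ss_cond / (n_cond - 1), ss_resid / df_resid


def f_variants(genes):
    """Return one (F1, F2, F3) tuple per gene (assumed definitions)."""
    ms = [condition_mean_squares(table) for table in genes]
    pooled = mean([resid for _, resid in ms])  # error variance pooled over genes
    return [
        (cond / resid, cond / ((resid + pooled) / 2), cond / pooled)
        for cond, resid in ms
    ]


# Two hypothetical genes, each with 2 conditions (rows) x 3 time points.
genes = [
    [[1.0, 2.0, 4.0], [2.0, 3.5, 4.5]],
    [[1.0, 2.0, 3.0], [1.5, 2.0, 3.5]],
]
results = f_variants(genes)
```

Because expression in both rows rises over time, a test that ignored time would count that rise as noise. The additive model assigns it to the time effect, which leaves a smaller residual and a larger F statistic for the condition effect.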
The results were more reliable on the background-subtracted data; hence the background should be removed when possible. In most instances, however, the background cannot be properly removed, and this should factor into the choice of analysis method. Based on this study, and given that true biological networks approximate a scale-free topology with small-world characteristics, we recommend the ANOVA variants for analyzing background-subtracted data and EB WRS when the background cannot be removed.
