In comparisons of human burn conditions and mouse models of infection, we found that 1,608 genes, roughly 12% of the genes that changed in humans, changed in the same direction in mice, as indicated by Warren et al. (1). However, we argue that this fact does not indicate that mice are a poor animal model, for the following reasons. First, do we at present understand even 12% of the entire picture of a complex biological phenomenon such as inflammation? If not, the shared changes of 1,608 genes may not be negligible and would be useful for understanding the mechanisms shared by humans and mice. Second, these genes involved 185 commonly and significantly changed molecular pathways/biogroups (dataset S1 in ref. 2), which could serve as potential targets for preclinical studies. Third, the statistical significance represented by the overlap P value estimated using NextBio characterizes not only the direction of the changes but also the extent (ranking) of the fold-changes (FCs). The P value (3.4 × 10⁻³⁵) indicates an extraordinarily significant overlap, which motivated the use of the word “greatly” in the title of our paper. Furthermore, the similarity of gene expression patterns between humans and mice was possibly underestimated. As Shay et al. (3) and we (2) point out, the datasets compared in Seok et al. have not been optimized for comparisons with regard to matching of time courses/frames between human and mouse datasets, treatment effects, heterogeneity in human datasets, etc. As suggested by Shay et al., with more optimized datasets, the extent of similarity between human diseases and mouse models would become even stronger.
Warren et al. also suggest that our analyses may not be “genome-wide” because of changes in gene selection criteria for humans and mice. The extent of gene expression changes in mice is generally smaller than that in humans. Considering this difference, using the same cutoff value (|FC| > 1.2) for both species would bias the calculation of correlation coefficients; therefore, applying a larger cutoff to humans (|FC| > 2.0) may serve as a rough normalization for data selection, with 20% shared by mice in the above case. Moreover, the method for obtaining overlap P values using NextBio is considered a nonbiased and predetermined method of genome-wide analysis (4). The same criteria of gene selection (|FC| > 1.2, P < 0.05) are applied to both conditions, and a Running Fisher analysis is then conducted on the significantly changed genes in both sets, including those changed in opposite directions. NextBio ontology-driven meta-analysis is also a nonbiased method, without any arbitrary selection of the genes of interest.
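To make the selection-then-overlap logic above concrete, the following is a minimal sketch only. It applies the stated criteria (|FC| > 1.2, P < 0.05) to two toy gene lists and then computes a plain one-sided Fisher's exact test on the overlap. It does not reproduce NextBio's rank-based Running Fisher algorithm described in ref. 4; the function names, the signed fold-change convention, the background size, and all gene values are hypothetical assumptions for illustration.

```python
from math import comb


def select_genes(fold_changes, p_values, fc_cutoff=1.2, p_cutoff=0.05):
    """Return genes passing |FC| > fc_cutoff and P < p_cutoff.

    Assumption: fold changes are signed, e.g. -1.5 means 1.5-fold down.
    """
    return {
        gene
        for gene, fc in fold_changes.items()
        if abs(fc) > fc_cutoff and p_values[gene] < p_cutoff
    }


def overlap_p_value(set_a, set_b, n_background):
    """One-sided Fisher's exact test (hypergeometric upper tail).

    Probability of observing at least the actual overlap if set_b were
    drawn at random from n_background genes. This is a simplification,
    not the Running Fisher procedure.
    """
    a, b = len(set_a), len(set_b)
    k = len(set_a & set_b)
    tail = sum(
        comb(a, i) * comb(n_background - a, b - i)
        for i in range(k, min(a, b) + 1)
    )
    return tail / comb(n_background, b)


# Toy data (hypothetical genes and values)
human_fc = {"G1": 2.5, "G2": -1.8, "G3": 1.1, "G4": 3.0, "G5": -1.3}
human_p = {"G1": 0.001, "G2": 0.01, "G3": 0.2, "G4": 0.03, "G5": 0.2}
mouse_fc = {"G1": 1.4, "G2": -1.3, "G3": 1.0, "G4": 1.1, "G5": 1.5}
mouse_p = {"G1": 0.02, "G2": 0.04, "G3": 0.5, "G4": 0.3, "G5": 0.01}

human_set = select_genes(human_fc, human_p)
mouse_set = select_genes(mouse_fc, mouse_p)
p_overlap = overlap_p_value(human_set, mouse_set, n_background=5)
```

Note that this sketch treats every selected gene equally, whereas the overlap P value discussed in the text also reflects the ranking of the fold changes.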
Indeed, a broad biogroup such as “innate immune response” may not be particularly useful for prioritization of therapeutic candidates. However, we identified many other significantly changed pathways/biogroups in our analyses (dataset S1 in ref. 2). More specific ones, such as “PDGFR-β signaling pathway” and “Fc γ R-mediated phagocytosis,” may be useful for such prioritization.
We agree with the suggestion of Shay et al. that shared denominator artifacts exist in the human datasets used in Seok et al., because identical control samples (n = 37) are shared by human burn and trauma cases. Any unique features of the shared control samples, such as the weight, body mass index, and clinical history of individuals, time of sampling, and conditions of gene chip experiments, would result in similar gene expression changes between different comparisons. Such artifacts could have contributed to the high correlation coefficient in gene expression observed between the human burn and trauma datasets.
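The shared-denominator effect described above can be illustrated with a toy simulation. This is a sketch under strong assumptions, not an analysis of the actual datasets: every value is pure Gaussian noise on a log scale, no gene has a real response, and all names and parameters are hypothetical. Under these assumptions, subtracting the same control profile from two unrelated case profiles is expected to give a correlation of about 0.5, whereas independent controls give a correlation near 0.

```python
import random


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def simulate(n_genes=2000, seed=0):
    """Compare log fold changes computed with shared vs. independent controls."""
    rng = random.Random(seed)

    def noise_profile():
        # One mean log-expression value per gene; noise only, no true effect
        return [rng.gauss(0.0, 1.0) for _ in range(n_genes)]

    burn, trauma = noise_profile(), noise_profile()
    shared_control = noise_profile()
    control_a, control_b = noise_profile(), noise_profile()

    # Both comparisons use the same control: its noise enters both differences
    r_shared = pearson(
        [b - c for b, c in zip(burn, shared_control)],
        [t - c for t, c in zip(trauma, shared_control)],
    )
    # Each comparison has its own control: the differences are independent
    r_independent = pearson(
        [b - c for b, c in zip(burn, control_a)],
        [t - c for t, c in zip(trauma, control_b)],
    )
    return r_shared, r_independent


r_shared, r_independent = simulate()
```

The value of 0.5 follows because each difference has variance 2 while the two differences share covariance 1 from the common control term; real data would add true biological effects on top of this artifact.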
Shay et al. contend that the datasets used in Seok et al. (1) and in our report (2) are unsuitable for interspecies comparisons for several reasons. Although we basically agree with them, we would like to point out that highly significant similarities can still be found between human and mouse conditions even for datasets lacking optimization, as suggested by Shay et al. We compared datasets from mouse experiments for which time course data are available with the human burn dataset and obtained overlap P values for each time point. For example, data are available for four time points of blood sampling in mouse burn models (2 h to 7 d after injury), all of which showed high similarity with the human burn dataset. In fact, the datasets with the highest similarity and lowest similarity both showed significant overlap P values (P = 4.7 × 10⁻⁷ to 3.9 × 10⁻³⁴).* Therefore, the gene expression changes shared between datasets without careful matching of variables could be considered to represent genomic responses that are robustly induced regardless of various conditions and species.
Shay et al. also suggest that many of the changes in gene expression may reflect differences in the proportions of cell types in whole blood leukocytes between species. We reevaluated the expression changes of genes specifically expressed in neutrophils and lymphocytes, which are abundant in humans and mice, respectively.† Although neutrophils are rich in human blood but not in mouse blood, a number of genes specific to neutrophils were up-regulated in both humans and mice. Meanwhile, lymphocytes are preponderant in mouse blood, yet lymphocyte-specific genes were largely down-regulated in both humans and mice. These results failed to provide evidence that the overall genomic responses in human diseases and mouse models reflect differential responses between neutrophils and lymphocytes.
Thus, we can now agree with the idea that at least some genomic responses to inflammation show statistically significant similarities between humans and mice, regardless of subjective and opposing views on whether 5.9–15.0% (figure 2 in ref. 2) represents a poor or great overlap. An important question is whether such commonality involves biological processes that are critical for understanding diseases and developing novel therapies. Warren et al. present an analogy of the similarities and differences between a station wagon and a motorcycle. Although components specific to each vehicle, such as airbags and sunroofs, could be essential for their commercial competitiveness, any problems in basic parts, such as wheels or spark plugs, have the potential to prevent the vehicles from performing their fundamental and common function. Therefore, our hypothesis is that the key to understanding a fatal human illness may be hidden in such essential machinery that is conserved across species. This idea requires further detailed testing.
Footnotes
The authors declare no conflict of interest.
*Available at cbsn.neuroinf.jp/modules/xoonips/detail.php?item_id=29440.
References
- 1. Warren HS, et al. Mice are not men. Proc Natl Acad Sci USA. 2015;112:E345. doi: 10.1073/pnas.1414857111.
- 2. Takao K, Miyakawa T. Genomic responses in mouse models greatly mimic human inflammatory diseases. Proc Natl Acad Sci USA. 2015;112:1167–1172. doi: 10.1073/pnas.1401965111.
- 3. Shay T, Lederer JA, Benoist C. Genomic responses to inflammation in mouse models mimic humans: We concur, apples to oranges comparisons won’t do. Proc Natl Acad Sci USA. 2015;112:E346. doi: 10.1073/pnas.1416629111.
- 4. Kupershmidt I, et al. Ontology-based meta-analysis of global collections of high-throughput public data. PLoS ONE. 2010;5(9):e13066. doi: 10.1371/journal.pone.0013066.