Abstract
Neutrophils are frequently studied in mouse models, but the extent to which findings translate to humans remains poorly defined. In an integrative analysis of 11 mouse and 13 human datasets, we find a strong correlation of neutrophil gene expression across species. In inflammation, neutrophils display substantial transcriptional diversity but share a core inflammation program. This program includes genes encoding IL-1 family members, CD14, IL-4R, CD69, and PD-L1. Chromatin accessibility of core inflammation genes increases in blood compared to bone marrow and further in tissue. Transcription factor enrichment analysis implicates members of the NF-κB family and AP-1 complex as important drivers, and HoxB8 neutrophils with JunB knockout show reduced expression of core inflammation genes in resting and activated cells. In independent single-cell validation data, neutrophil activation by type I or type II interferon, G-CSF, and E. coli leads to upregulation of core inflammation genes. In COVID-19 patients, higher expression of core inflammation genes in neutrophils is associated with more severe disease. In vitro treatment with GM-CSF, LPS, and type II interferon induces surface protein upregulation of core inflammation members. Together, we demonstrate transcriptional conservation in neutrophils in homeostasis and identify a core inflammation program shared across heterogeneous inflammatory conditions.
Subject terms: Neutrophils, Immunogenetics, Innate immunity
Difficulties can be encountered when translating research between cells from animals and humans because of gene expression differences. Here the authors perform an integrative transcriptomic analysis from human and mouse neutrophils and identify a core inflammation program shared across inflamed contexts.
Introduction
Neutrophils mediate homeostatic and inflammatory processes and display substantial phenotypic and functional heterogeneity. While animal models fuel fundamental discoveries in immunology, differences between humans and mice can impair the translation of findings1. To maximize impact on human health, life sciences increasingly benefit from seamless transitions between the mouse and human system. However, due to structural and functional differences in genomes, it is often unclear which aspects reflect conserved biology. Therefore, integrative analyses of cellular systems across species are important for the success of translational research.
Structurally, the mouse and human genomes are closely related. They harbor ~16,000 protein-coding genes considered to be one-to-one orthologs with high confidence2. However, structural orthology does not equal functional similarity since expression patterns of orthologous genes can deviate substantially across organs and development3. In leukocytes, expression of most orthologous genes and lineage-specific genes, in particular, is well-conserved between humans and mice4. Despite this overall similarity, different species can display substantial differences in ortholog expression between tissues5. For example, human neutrophils are highly abundant in defensins, yet their mouse orthologs are expressed in gut epithelial cells, not in neutrophils. Furthermore, neutrophils display high phenotypic and functional heterogeneity as a function of organ, maturation, and inflammatory condition6–9, but whether a core inflammation program consisting of genes that become induced across a range of inflammatory conditions exists is not known. It is thus unclear how similarities and differences between human and mouse transcriptomes should be interpreted, particularly in the context of different inflammatory conditions.
To address these gaps in knowledge, we perform an integrative analysis of resting and inflamed leukocytes from humans and mice and assess the degree of conservation of gene expression. We find that human and mouse transcriptomes can be analyzed together and that lineage-specific gene expression is closely related between humans and mice. We further study how the neutrophil transcriptome changes in inflammation, using a wide range of studies covering in vitro and in vivo inflammation as well as resting conditions in human10–21 and mouse12,22–31. While transcriptional responses to different activating stimuli are heterogeneous, we identify a core inflammation program in neutrophils conserved across species and conditions. We predict upstream regulators and find increasing accessibility of core inflammation program members in ATAC-seq. JunB−/− HoxB8 cells display a lower upregulation of core inflammation genes when stimulated with zymosan compared to wild-type cells. In single-cell RNA-seq data from resting and activated neutrophils, stimulation with type I or type II interferon, G-CSF, or E. coli is associated with higher expression of core inflammation genes. Further, neutrophils from COVID-19 patients with more severe disease display higher expression of core inflammation genes. Finally, we validate members of the core inflammation program using flow cytometry of stimulated human and mouse neutrophils and identify an interplay between tissue of origin and stimulation in driving the phenotype of the neutrophil inflammatory response. Our approach illustrates that multiple datasets of mouse and human gene expression data can be effectively combined to identify patterns shared across conditions and conserved across species. This approach can be transferred to other cell types and organisms to facilitate studies comparing gene expression across species.
Results
Integrative analysis of leukocyte gene expression across species
To assess gene expression similarities and differences between human and mouse immune cells, we obtained bulk RNA-seq data from six sorted leukocyte lineages from the Haemopedia atlas12,32 (Supplementary Fig. 1). This dataset consisted of a total of 76 samples of T cells, B cells, dendritic cells, monocytes, NK cells, and neutrophils (Supplementary Fig. 1). Sequencing depths for samples across all lineages are shown in Supplementary Fig. 2a, b, and detailed quality control metrics are summarized in Supplementary Data 1. We then integrated gene expression matrices by mapping protein-coding, one-to-one orthologous genes with high confidence, according to ENSEMBL33.
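The integration step described above can be illustrated with a minimal sketch. It assumes a small ortholog table with an orthology type and a high-confidence flag per gene pair; the table rows, the placeholder gene GENEX, and the helper names `one_to_one_pairs` and `integrate` are hypothetical and are not the pipeline used in this study.

```python
# Toy ortholog table: (human_gene, mouse_gene, orthology_type, high_confidence).
# The rows are illustrative only; a real table would be exported from ENSEMBL.
ORTHOLOGS = [
    ("CSF3R", "Csf3r", "ortholog_one2one", True),
    ("CD19", "Cd19", "ortholog_one2one", True),
    ("FCGR3A", "Fcgr4", "ortholog_one2many", True),   # dropped: not one-to-one
    ("GENEX", "Genex", "ortholog_one2one", False),    # dropped: low confidence
]


def one_to_one_pairs(rows):
    """Keep only high-confidence one-to-one ortholog pairs."""
    return [
        (human, mouse)
        for human, mouse, kind, high_conf in rows
        if kind == "ortholog_one2one" and high_conf
    ]


def integrate(human_expr, mouse_expr, pairs):
    """Join two gene -> [sample values] matrices on ortholog pairs.

    The merged matrix is keyed by the human symbol; each row holds the
    human samples followed by the mouse samples. Genes missing from
    either species are skipped.
    """
    merged = {}
    for human, mouse in pairs:
        if human in human_expr and mouse in mouse_expr:
            merged[human] = human_expr[human] + mouse_expr[mouse]
    return merged


human_expr = {"CSF3R": [5.0, 6.0], "CD19": [0.1, 0.2], "FCGR3A": [9.0, 8.0]}
mouse_expr = {"Csf3r": [4.0], "Cd19": [0.3], "Fcgr4": [2.0]}
merged = integrate(human_expr, mouse_expr, one_to_one_pairs(ORTHOLOGS))
```

In this toy input only CSF3R and CD19 survive the filter, which mirrors why genes such as FCGR3A require the separate one-to-many treatment described later in the text.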
To evaluate the robustness of this approach, we performed a principal component analysis on the integrated expression matrix. For each lineage, up to 200 lineage-associated genes were selected. Here, sample distribution was driven predominantly by lineage, followed by species (Fig. 1a). As expected, lineage-associated gene expression was highest in each respective lineage, and this pattern held across species in all lineages (Fig. 1b). Similarly, clustering of sample-wise Pearson correlation coefficients based on these genes was driven predominantly by lineage, confirming that in our analytical approach, lineage identity dominates species differences (Fig. 1c).
Correspondingly, expression of key lineage-associated genes was highly conserved between humans and mice (Fig. 1d), such as CSF3R and CHI3L1 in neutrophils, CD19 and CD22 in B cells, CD3 molecules and CD28 in T cells, NKG7 and GZMA in NK cells, MSR1 and SERPINB2 in monocytes, and FLT3 and MYCL in dendritic cells. The highest correlation between human and mouse gene expression was observed in neutrophils (r = 0.79), followed by T cells (r = 0.65), B cells (r = 0.65), and monocytes (r = 0.56), with weaker correlations in NK cells (r = 0.24) and dendritic cells (r = 0.22) (Fig. 1d).
This analysis demonstrates that mapping one-to-one orthologs allows an integrated analysis of leukocyte transcriptomes across species to identify conserved and divergent expression patterns of structurally related genes. Of note, although these data indicate a higher correlation in neutrophils compared to other lineages, this effect may have been influenced by smaller library complexities in neutrophils.
Transcriptional conservation in resting neutrophils
To systematically analyze which genes display similar and divergent expression across species, we integrated transcriptional profiles of resting (not activated) neutrophils available through the Sequence Read Archive (SRA). In this context, resting neutrophils were defined as those isolated from blood or tissue in the absence of disease or experimental manipulation. In a total of 84 human and 39 mouse samples, we observed a high correlation in overall gene expression, transcription factor expression, and lineage-associated gene expression across humans and mice (Pearson’s r between 0.78–0.87, P < 2.2 × 10−16) (Fig. 2a). These results were remarkably similar to those obtained from the more homogenous Haemopedia dataset, further illustrating the robustness of this approach even when integrating multiple datasets from different sources.
We next focused on neutrophil lineage-associated genes and defined five categories of GENE:Gene (HUMAN:Mouse) pairs based on their expression patterns. In addition to one-to-one orthologs, we considered high-confidence one-to-many and many-to-many orthologs.
Orthologs with high expression in both humans and mice included the key neutrophil genes CSF3R (encoding the G-CSF receptor), CXCR2, NCF4 (neutrophil cytosolic factor 4), the transcription factors MCL1, SPI1 (encoding PU.1, an essential transcription factor for terminal granulopoiesis34,35) and JUNB, a transcription factor prominently expressed in late neutrotime which plays a vital role in the inflammatory response of neutrophils9,36 (Fig. 2b). As CSF3R, CXCR2 and JUNB expression changes along neutrophil development, their concordance in expression might suggest that the analyzed neutrophils from humans and mice were of comparable developmental stage.
Orthologs with higher expression in human neutrophils included FCGR3A and FCGR3B (encoding CD16A and CD16B, respectively), both of which are one-to-many orthologs of mouse Fcgr4. This group also included the receptor for activated complement (C5AR1) and CXCR1, the receptor for CXCL8 (human)/KC (mouse). Genes with higher expression in mouse neutrophils included the protease Mmp9, Camp (encoding Cathelicidin Antimicrobial Peptide), Il1b, and Retnlg (encoding Resistin-like gamma) (Fig. 2b).
Of note, most genes in categories 1–3 were one-to-one orthologs, although 13/133 (9.8 %) were one-to-many orthologs. However, well-known neutrophil genes without one-to-one orthologs were also identified (categories 4 and 5) and included CXCL8 in humans, a cytokine abundantly expressed in blood neutrophils, and Ccl6, one of the most abundant chemokines in mouse neutrophils (Fig. 2b). Enrichment for neutrophil-related GO terms was found across all five groups of genes (Supplementary Fig. 3).
Thus, while resting human and mouse neutrophils display conserved expression of many key neutrophil genes and transcription factors, gene expression can deviate substantially in the same lineage between species, even for structurally highly related genes.
A core inflammation program is shared across conditions and conserved across species
We next assessed how the expression of one-to-one structural orthologs changes in different inflammatory contexts. Neutrophils display varied phenotypes in homeostasis and inflammation6,7,9,37, but it is unknown if a proportion of the transcriptional characteristics of different neutrophil states is shared across different inflammatory conditions9. Here, resting neutrophils were defined as above and compared with their respective inflammatory condition.
To identify changes in inflammation, we analyzed 11 studies encompassing a total of 46 resting and 66 activated neutrophil samples across different conditions (Fig. 3a, Supplementary Data 2). We tested for differential expression of genes with high-confidence one-to-one orthologs according to ENSEMBL separately within each study, comparing all reported conditions against their own resting controls to reduce the effect of technical variation between studies.
Compared to controls, inflamed neutrophils displayed 975 (median) differentially expressed genes (adjusted P < 0.05, absolute log2 fold change ≥ 0.5). These comprised 621 (median) significantly increased and 205 (median) significantly decreased genes (Supplementary Fig. 4a). Both the number of differentially expressed genes and the genes themselves were heterogeneous—concordant with the diverse transcriptional responses neutrophils can undergo in inflammation.
We next searched for potential overlap in the inflammatory response shared across conditions. Such an overlap may represent a “core inflammation program”, from which neutrophils preferentially upregulate genes across a broad range of activating conditions.
We used Fisher’s combined test to obtain a combined test statistic for each gene, summarizing individual comparisons from all datasets (Supplementary Data 3). Based on the elbow of the P-value-rank plot, we selected from the top 500 genes with the lowest P-value those with absolute log2 fold change ≥0.5 (Fig. 3b).
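Fisher's method combines k independent P-values through the statistic X = −2 Σ ln(p_i), which follows a chi-square distribution with 2k degrees of freedom under the null hypothesis. The following minimal sketch shows this per-gene combination step only. The function name `fisher_combined` is an illustrative choice, and the subsequent ranking and fold-change filtering are not reproduced.

```python
import math


def fisher_combined(pvalues):
    """Combine independent P-values with Fisher's method.

    Returns (X, combined_p), where X = -2 * sum(ln p_i). Because the
    chi-square distribution has an even number of degrees of freedom (2k),
    its upper tail has the closed form
        P(chi2_{2k} >= X) = exp(-X/2) * sum_{i=0}^{k-1} (X/2)^i / i!
    so no statistics library is needed.
    """
    k = len(pvalues)
    if k == 0:
        raise ValueError("at least one P-value is required")
    # Clip at a tiny positive value so that p = 0 does not produce log(0).
    half_x = -sum(math.log(max(p, 1e-300)) for p in pvalues)
    term = 1.0
    series = 1.0
    for i in range(1, k):
        term *= half_x / i
        series += term
    combined_p = min(1.0, math.exp(-half_x) * series)
    return 2.0 * half_x, combined_p


# One gene tested in three hypothetical comparisons.
statistic, p_combined = fisher_combined([0.04, 0.01, 0.20])
```

With a single input the method returns that P-value unchanged, and several moderately small P-values combine into a smaller one, which is why consistently induced genes rise to the top of the P-value rank plot.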
A total of 221 genes displayed consistent changes in inflammation across studies: 179 genes were upregulated across comparisons (the “core inflammation program”), and 42 genes were downregulated (Fig. 3c). Effect sizes of those 221 up- and downregulated genes agreed well across all tested comparisons and across species (Fig. 3c, Supplementary Fig. 4b).
Core inflammation genes included the IL-1 molecules IL1A and IL1B, the LPS co-receptor CD14, the adhesion molecule ICAM1, the lectin receptor CD69, CD40, IL4R and CD274 (encoding PD-L1) (Fig. 3c, d). Downregulated genes in inflammation included the cyclin-dependent kinase CDK5R1, TLR5 (encoding Toll Like Receptor 5, an essential pathogen recognition receptor38), CXCR4, CD101, and the member of the mitogen-activated protein kinase family MAP3K15 (Fig. 3c, d).
As expression of CD101 and CXCR4 changes throughout neutrophil maturation and aging, we compared the fold change of these markers between neutrophils activated in vitro and those activated in vivo to rule out the effects of differential release from the bone marrow under stress. No differences were observed in either marker (Supplementary Fig. 4c), suggesting that the transcriptional downregulation of CXCR4 and CD101 observed during neutrophil activation is cell-intrinsic and does not reflect a different maturation stage of neutrophils captured in the in vivo studies.
On the level of individual samples, we could confirm that the group of 179 core inflammation genes had either weak or absent expression in healthy neutrophils and were induced in inflamed neutrophils (Fig. 3d).
Gene set enrichment analysis identified a conserved enrichment of pathways related to apoptosis, inflammatory response, IL-2 and IL-6 signaling, IFN-γ response, TNF signaling via NF-κB, and KRAS signaling (Fig. 3e).
Taken together, this integrative analysis of resting and activated neutrophils nominated a core inflammation program in neutrophils which is shared across inflammatory conditions and across species.
The core inflammation program is detectable using different analytical strategies and in single-cell data
To further test the robustness of the core inflammation program, we performed two independent analyses. Using a linear mixed model, we observed high replicability of our results, with differentially expressed genes (absolute β ≥ 1, Padj < 0.05) identified by the linear mixed model showing a strong skewing toward low Fisher P-values and a π1-statistic of 0.71 (Supplementary Fig. 5).
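The π1-statistic estimates the proportion of true positives among a set of P-values, for example the P-values that genes called significant by one method receive in another. A minimal sketch follows, assuming Storey's estimator of the null proportion π0 at a single fixed threshold λ = 0.5. This is a simplification: the function name and the choice of λ are assumptions, and the estimator used for the reported values may instead smooth over a range of λ.

```python
def pi1_statistic(pvalues, lam=0.5):
    """Estimate pi1 = 1 - pi0 from a list of P-values.

    pi0 is estimated as #{p > lam} / (m * (1 - lam)): under the null,
    P-values are uniform, so the density above `lam` reflects the null
    fraction. The estimate is capped at 1 so that pi1 is never negative.
    """
    m = len(pvalues)
    if m == 0:
        raise ValueError("at least one P-value is required")
    if not 0.0 <= lam < 1.0:
        raise ValueError("lam must lie in [0, 1)")
    pi0 = sum(p > lam for p in pvalues) / (m * (1.0 - lam))
    return 1.0 - min(1.0, pi0)


# Hypothetical replication P-values for eight genes.
example = [0.001, 0.002, 0.01, 0.04, 0.2, 0.6, 0.7, 0.9]
pi1 = pi1_statistic(example)
```

In this toy example, three of eight P-values exceed 0.5, giving π0 = 0.75 and π1 = 0.25. Values closer to 1 indicate stronger replication.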
We additionally assessed the replicability of differentially expressed genes between all tested comparisons. Median values of the π1-statistic ranged from 0.06 to 0.60, depending on the study, and, importantly, did not show systematic species-driven differences (Supplementary Fig. 6a). Normalized enrichment scores for differentially expressed gene sets were in concordance with up-/downregulation of the tested sets across all studies, supporting the existence of a shared core inflammation program. Of note, the downregulation of specific genes in inflammation was more variable across studies and hence less informative (Supplementary Fig. 6b). Pearson correlation coefficients of log2 fold change values showed strong positive skewing, again pointing toward a core inflammatory response across conditions and species (Supplementary Fig. 6c).
As an additional analytical approach, we performed a weighted correlation network analysis (WGCNA)39. WGCNA constructs correlation networks and can help to identify clusters of genes (“modules”) that are co-expressed across different conditions. It identified four modules (19, 5, 8, and 4) with significant enrichment for core inflammatory response genes (Fisher’s exact test, Padj < 0.05). Gene expression within those four modules increased in inflammation and contained several members of the core inflammation program (Supplementary Fig. 7).
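The module enrichment test can be sketched as a one-sided Fisher's exact test, which is equivalent to the upper tail of a hypergeometric distribution. The function name and the example numbers below are hypothetical; the gene universe and the multiple-testing adjustment used for the reported Padj values are not reproduced here.

```python
from math import comb


def enrichment_pvalue(universe, set_size, module_size, overlap):
    """One-sided Fisher's exact test for over-representation.

    Returns P(X >= overlap), where X is the number of gene-set members
    in a module of `module_size` genes drawn without replacement from
    `universe` genes, of which `set_size` belong to the gene set.
    """
    upper = min(module_size, set_size)
    tail = sum(
        comb(set_size, x) * comb(universe - set_size, module_size - x)
        for x in range(overlap, upper + 1)
    )
    return tail / comb(universe, module_size)


# Hypothetical example: 12,000 tested genes, a 179-gene set, and a
# 300-gene module that contains 20 of the 179 genes.
p_module = enrichment_pvalue(12000, 179, 300, 20)
```

Exact integer arithmetic keeps the tail probability accurate even when it is very small, at the cost of speed for very large modules.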
For validation purposes, we analyzed four recent single-cell RNA-sequencing datasets that had not been used to derive the core inflammation program. These included neutrophils from healthy control individuals and those with mild to moderate or severe COVID-19 (Combes et al., dataset 1)40, human neutrophils stimulated with G-CSF, IFN-β or IFN-γ (Montaldo et al., datasets 2+3)41 and mouse neutrophils infected with E. coli (Xie et al., dataset 4)7.
Expression of most of the 179 core inflammation genes increased in inflamed neutrophils (Fig. 4a). A gene set was created based on the 179 core inflammation genes, and changes in expression were tested compared to random background genes with the same expression abundance. A significant increase in the core inflammation genes was detected in all conditions and was higher in patients with severe compared to mild to moderate COVID-19 (Fig. 4b). However, examination of the expression of the core inflammation program on a single cell level indicated heterogeneity within the population of neutrophils, which was characterized by the presence of groups of cells with exceptionally high or low expression of the defined gene set in inflamed states (Fig. 4c).
The core inflammation program shows conserved transcriptional regulation across species
To identify putative regulators of neutrophil activation in inflammation, we applied transcription factor (TF) enrichment analysis individually to up- and downregulated genes in each study. TF enrichment across mouse and human inflamed neutrophils was highly consistent in TFs with decreasing (Supplementary Fig. 8a) and increasing (Supplementary Fig. 8b) activity.
Transcription factors that we found to be enriched in genes expressed in resting neutrophils include AKNA, PU.1 (encoded by SPI1), FOXO3, FOXO1, TFEB, RARA, and STAT5B (Supplementary Fig. 8a). Transcription factors that we found to be enriched in genes associated with inflamed neutrophils included CSRNP1, PLSCR1, FOS, FOSB, the NF-κB components NFKB1/NFKB2, the emergency granulopoiesis transcription factor CEBPB and JUNB (Supplementary Fig. 8b).
To narrow this selection of transcription factors to those with the largest changes in inflammation, we compared the predicted regulatory activity of transcription factors with their respective gene expression in inflammation. This analysis highlighted that the genes encoding CSRNP1, JUNB, CEBPB, XBP1, and ETS2 were strongly upregulated in inflamed neutrophils while also displaying strongly increased regulatory activity (Fig. 5a).
On the level of individual studies, we also found high consistency in the transcription factors predicted to be enriched in genes upregulated and downregulated in activated neutrophils (Supplementary Fig. 8c). These results were consistent with an independent enrichment analysis performed separately for each species (Supplementary Fig. 8d, e).
Migration into tissue and activation significantly enhance chromatin accessibility and expression of core inflammation genes
If genes in the core inflammation program are predisposed to be upregulated, then chromatin accessibility for these genes should increase upon neutrophil maturation, migration into tissues, and exposure to inflammatory stimuli.
To test this hypothesis, we analyzed chromatin accessibility data derived from bone marrow, blood, and an air pouch model of acute inflammation. These data were generated using the Assay for Transposase-Accessible Chromatin using sequencing (ATAC-seq), a method that profiles genome-wide chromatin accessibility. Briefly, ATAC-seq sequences DNA fragments tagged by a hyperactive Tn5 transposase, which preferentially inserts sequencing adapters into open chromatin regions42. In the air pouch model (performed in C57BL/6J mice), blood neutrophils first migrate into a sterile membrane in the skin before being activated by zymosan in the air pouch36. Of the 179 core inflammation program genes, 29 displayed increased accessibility in blood vs. bone marrow, compared to only 10 genes with decreased accessibility (Fig. 5b). Neutrophils that had transmigrated from the blood into the membrane displayed enhanced accessibility of 78 genes. This increase was significantly higher (P = 5.1 × 10−9) than the increase of 29 genes for neutrophils in blood compared to bone marrow.
Similar skewing toward enhanced accessibility of core inflammation genes was observed when membrane vs. bone marrow (89 up, P = 1.2 × 10−13), inflamed air pouch vs. blood (85 up, P = 1.5 × 10−11), and inflamed air pouch vs. bone marrow (100 up, P = 2.1 × 10−17) were compared to blood vs. bone marrow (Fig. 5b). The number of genes with increased accessibility was significantly higher than expected by chance, as compared to the accessibility of randomly selected background genes. Importantly, the genes with increased and decreased accessibility were highly consistent across the comparisons (Fig. 5c).
After finding that core inflammation genes have increased chromatin accessibility even before the onset of inflammation, we searched for potential driver transcription factors displaying increasing expression and regulatory activity in inflammation. Comparing motif enrichment (HOMER) with actual expression change in air pouch vs. blood, we observed an increase in both measures for a remarkably restricted set of transcription factors, namely ATF3, BATF, FOSL1, JUNB, and JUN (Fig. 5d).
We next investigated whether the core inflammation program represents a group of genes from which neutrophils preferentially draw upon exposure to inflammatory stimuli. If this were the case, then core inflammation genes would be more likely to be upregulated in inflammation than all other genes. We analyzed RNA-seq data from differentiated HoxB8 neutrophils incubated with or without zymosan for 2 h36. We observed that in activated neutrophils, a significantly higher proportion of core inflammation genes (107/179 ≈ 60%) was upregulated than expected by chance (36–74 genes in 1000 simulations using expression-matched background genes; P for overrepresentation = 6.5 × 10−41) (Fig. 5e).
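The simulation with expression-matched background genes can be sketched as follows. This is an illustration under stated assumptions rather than the procedure used here: genes are matched by rank-based expression bins, `n_bins`, `seed`, and both function names are hypothetical, and each bin is assumed to contain enough non-core genes to sample from.

```python
import random


def matched_background_counts(expr, upregulated, core, n_sim=1000, n_bins=10, seed=0):
    """Count upregulated genes in random, expression-matched gene sets.

    expr:        dict gene -> mean expression (used only for binning)
    upregulated: set of genes called upregulated in the comparison
    core:        set of core genes; every core gene must be present in expr
    Each simulated set draws, per expression bin, as many non-core genes
    as there are core genes in that bin.
    """
    rng = random.Random(seed)
    ranked = sorted(expr, key=expr.get)
    bin_of = {g: i * n_bins // len(ranked) for i, g in enumerate(ranked)}
    pools, need = {}, {}
    for g in ranked:
        b = bin_of[g]
        if g in core:
            need[b] = need.get(b, 0) + 1
        else:
            pools.setdefault(b, []).append(g)
    counts = []
    for _ in range(n_sim):
        hits = 0
        for b, k in need.items():
            # rng.sample raises ValueError if a bin has fewer than k background genes.
            hits += sum(g in upregulated for g in rng.sample(pools.get(b, []), k))
        counts.append(hits)
    return counts


def empirical_p(observed, counts):
    """Empirical one-sided P-value with the usual +1 correction."""
    return (1 + sum(c >= observed for c in counts)) / (1 + len(counts))
```

An empirical P-value from n simulations cannot fall below 1/(n + 1), so the simulated range (36–74 genes) is best read as the null reference against which the observed count of 107 is compared.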
When evaluating predicted conserved regulatory activity and change in chromatin accessibility together, JUNB emerged as a prominently affected transcription factor and has previously been shown to control neutrophil activation36 and to be highly expressed upon neutrophil activation43. On the other hand, CEBPB has previously been shown to be a key transcription factor mediating emergency granulopoiesis44 and showed a high predicted regulatory activity in our analysis with limited changes in chromatin accessibility. To assess the impact of two transcription factors identified in our enrichment analysis on the expression of core inflammation program genes, we repeated the same analysis in differentiated HoxB8 neutrophils carrying a genetic knockout of either JunB or Cebpβ. CEBPB showed upregulation in inflamed neutrophils as well as increased regulatory activity. In addition, JUNB, which plays an important role in the inflammatory response of neutrophils9,36, also had increased motif enrichment in the air pouch vs. blood comparison.
Based on these analyses, we expected a modest reduction in the expression of core inflammation genes in Cebpβ−/− cells and a stronger reduction in JunB−/− cells. Indeed, this was the case: In a direct comparison of resting knockout (JunB−/− and Cebpβ−/−) versus wild-type cells, we observed a significantly stronger downregulation of the core inflammation program in JunB−/− cells (69 genes; P = 1.5 × 10−9) than in Cebpβ−/− cells (43 genes; P = 0.0011) (Fig. 5e and Supplementary Fig. 9).
Comparing zymosan-stimulated knockout cells versus wild-type cells, we again saw a significant downregulation of core inflammation genes in the JunB−/− condition (51 genes; P = 0.0025) but not in the Cebpβ−/− condition (25 genes; P = 0.79) (Fig. 5e and Supplementary Fig. 9).
Together, these results indicate that maturation and migration into an inflamed tissue site predispose neutrophils to upregulate genes of the core inflammation program and that knockout of Cebpβ and especially JunB leads to a weaker induction of core inflammation genes compared to WT cells.
Members of the core inflammation program can be validated on the protein level in activated human and mouse neutrophils
To validate members of the core inflammation program experimentally, we filtered the list of genes by surface proteins, yielding 36 markers (Fig. 6a)45. Based on antibody availability, we developed a flow cytometry panel including canonical lineage markers (human: CD15, mouse: Ly6G) and five proteins predicted to be part of the core inflammation program: CD14, CD69, CD40, CD274 (PD-L1) and IL-4R (Supplementary Tables 1 and 2).
We isolated human neutrophils from peripheral blood and mouse neutrophils from bone marrow and cultured them over 48 h with or without the addition of GM-CSF + LPS and GM-CSF + IFN-γ (Fig. 6b).
Prolonged cell culture without activation led to an increase in CXCR4 and loss of CD62L and CD101 in human cells, while mouse cells showed a reversed phenotype with upregulation of CD62L and CD101 as well as a downregulation of CXCR4, suggesting continued maturation of bone marrow neutrophils in vitro and not classical neutrophil aging (Supplementary Fig. 10). Compared to unstimulated cells, activated mouse neutrophils significantly upregulated the predicted core inflammation program markers CD14, CD40, CD69, PD-L1, and IL-4R in the condition containing LPS and all but CD69 in the condition containing IFN-γ (Fig. 6c). Human neutrophils displayed a highly concordant increase in those markers. CD69 and PD-L1 increased with similar magnitude, while upregulation of CD14 was stronger in mouse neutrophils compared to human neutrophils. In human neutrophils, upregulation of CD40 was restricted to a small (~2%) population of neutrophils (in line with previous findings46) but reached significance on the bulk level for both stimulations (Fig. 6c).
Differences were also noticeable between inflammatory conditions. In mouse neutrophils, the combination of GM-CSF and LPS led to a stronger increase in the expression of CD14, CD69, IL-4R, and CD40 compared to GM-CSF and IFN-γ. The reverse was true for PD-L1, which is driven substantially by IFN-γ signaling8. Further, IFN-γ stimulation reduced CD69 expression, while LPS increased it. In human neutrophils, the combination of GM-CSF and IFN-γ led to stronger increases in CD14, CD69, IL-4R, and PD-L1 than the combination of GM-CSF and LPS.
A combined diffusion map analysis revealed a high degree of overlap between mouse and human neutrophils, while cell distribution was driven predominantly by experimental conditions (Fig. 6d). Correspondingly, activated neutrophils of both species displayed a continuous upregulation of the inflammatory response markers (Fig. 6e).
These findings confirm the predicted activation markers, further substantiating the conservation of inflammatory response programs in neutrophils while also revealing differences between species and inflammatory conditions.
Neutrophil origin and inflammatory condition influence the expression of the core inflammation program
Neutrophil heterogeneity is influenced by the tissue microenvironment6,9. To evaluate the impact of tissue origin on the phenotype of neutrophils in inflammation, we performed stimulation experiments with paired leukocyte preparations from blood, bone marrow, and spleen of wild-type BL6 mice. In a principal component analysis of flow cytometry data, resting neutrophils clustered closely together, but each tissue remained distinguishable based on subtle baseline expression differences in IL-4R, CD69, and CD40 (Fig. 7a). Inflamed neutrophils deviated markedly from their resting counterparts and reached distinct states as a function of tissue and inflammatory condition (Fig. 7a).
Neutrophils from all tissues upregulated CD69 and IL-4R, suggesting that these markers can be utilized as neutrophil activation markers across a variety of conditions (Fig. 7a). In contrast, expression of CD40, CD14, and PD-L1 showed greater tissue dependence. CD40 (evident most prominently in mouse neutrophils) was robustly upregulated in splenic neutrophils and less prominently in blood neutrophils. Conversely, CD14 and PD-L1 expression was inducible to a greater extent in blood neutrophils and bone marrow neutrophils but less in splenic neutrophils.
We also noted differences related to activating stimuli, for example, through more prominent PD-L1 induction by IFN-γ compared to LPS. The single-cell analysis highlighted a continuum of states in all organs (Fig. 7b), driven by increasing expression of the core inflammation markers (Fig. 7c). Importantly, the core program was already inducible in bone marrow neutrophils, suggesting that in vitro and adoptive transfer experiments performed with bone marrow neutrophils can recapitulate important features of neutrophil biology in inflammation.
Discussion
Neutrophils are important mediators of immune defense and protagonists in immune-mediated diseases. Mouse and human neutrophils differ in morphology, frequency in blood (humans ~50–70%, mice ~10–25%), and expression of marker proteins. For example, mouse neutrophils are defined by surface expression of Ly6G, not present in the human genome, whereas mouse neutrophils lack expression of defensins47.
Both in humans and mice, neutrophils are phenotypically heterogeneous across different tissues and inflammatory conditions37,48,49. Recent studies suggest that neutrophil heterogeneity in homeostasis is driven by a chronological sequence of maturation and activation termed neutrotime, whereas the combination of aging, tissue factors, environmental features, and inflammatory signals promote their polarization toward distinct states6,7,9.
While the neutrotime signature can be detected in both species and this overarching principle of neutrophil ontogeny is likely conserved across humans and mice, it is poorly understood which features of the neutrophil inflammatory response are shared across species. Furthermore, it is unclear which aspects of the neutrophil inflammatory response reflect a general inflammatory response program shared across multiple inflammatory conditions and which features are highly specific to certain triggers or sites of inflammation.
To address these gaps in knowledge, we performed an integrative analysis of resting and inflamed RNA-seq samples from humans and mice. We validated our computational approach by comparing gene expression conservation across six immune cell lineages: T cells, B cells, monocytes, dendritic cells, NK cells, and neutrophils. Expression of lineage-specific genes was generally well-conserved across humans and mice. Intriguingly, neutrophils displayed both the greatest number of lineage-specific genes and the highest correlation of gene expression between mice and humans, suggesting a higher degree of conservation in this phagocytic cell compared to other lineages.
While different inflammatory conditions induced highly heterogeneous responses in neutrophils, our combined analysis allowed us to predict a core inflammation program conserved across mice and humans. The robustness of this program was underscored by the high concordance between the gene set derived from Fisher’s combined test and complementary approaches based on a linear mixed model as well as weighted correlation network analysis (WGCNA39,50,51). It is important to note that different analytical strategies may be used to derive this core inflammation program, each detecting a varying number of genes. The situation is similar for differential gene expression in general, which depends on the chosen method, as has been reviewed extensively52. Nevertheless, our analysis indicates that a group of genes exists from which neutrophils preferentially draw when they become activated across humans and mice and across a large range of conditions and disease states. The conservation of a small set of transcription factors predicted to regulate a broad variety of conditions across humans and mice highlights the conserved nature of gene expression in neutrophils.
To validate the predicted core inflammation program in different models, we analyzed differential gene accessibility in ATAC-sequencing data from a mouse air pouch model of inflammation. We found a significant proportion of core inflammation program genes to be more accessible with maturation and in pro-inflammatory conditions. While this model is very specific, it covered neutrophils from different maturation stages and presented the opportunity to study transmigrated and activated neutrophils separately. Further, analysis of the transcriptome on a single cell level in both in vivo and in vitro inflamed neutrophils of both species allowed us to validate the core inflammation program. While the overall enrichment of the proposed gene set on a pseudo-bulk level was clearly evident, our analyses also suggested significant heterogeneity within the population of inflamed neutrophils, consistent with recent analyses7,9,40,41. These analyses further highlight the predictive value of the program in a method not used in its generation.
HoxB8-derived neutrophils are a powerful tool to model neutrophil function. We assessed differential expression in zymosan-activated HoxB8-derived neutrophils versus control, which showed a significant overrepresentation of core inflammation genes in activated neutrophils. Zymosan activates myeloid cells through TLR2 and is a commonly used pro-inflammatory trigger. The core inflammation program was reduced in resting cells carrying a knockout of key regulators of this program (JunB−/− and Cebpβ−/−). Core inflammation genes were also significantly less upregulated in zymosan-stimulated JunB−/− cells, indicating an impaired neutrophil inflammatory response in these cell lines. Concordant with previous reports of a more limited impact of the Cebpβ knockout on inflammatory neutrophil functions compared to the JunB knockout36, the underrepresentation of core inflammation genes in Cebpβ−/− cells was nonsignificant in our analysis.
Finally, we validated key components of the predicted core inflammation program experimentally. Using primary human and mouse neutrophils, we showed that the surface proteins CD14, CD69, IL-4R, CD40, and PD-L1 are induced by in vitro cytokine stimulation, and this upregulation is observable in both species, although CD40 was restricted to a small subset of neutrophils in humans, as expected46.
This finding further underlines the conserved character of the inflammation program as presented in this study. Interestingly, while neutrophils from different mouse tissues upregulated the inflammatory response markers, the magnitude of upregulation differed across bone marrow, spleen, and blood, suggesting that the tissue origin of neutrophils is an important consideration in experimental studies.
Recently, Jin et al. identified a distinct neutrophil population termed “antigen-presenting aged neutrophils (APANs)”53. In humans, this population was characterized as CD66b+CXCR4+CD62LloCD40+CD86+, while in mice, they were identified as Ly6G+CXCR4+CD62L-/loMHCII+CD40+CD86+. APANs were capable of inducing CD4 T cell proliferation via IL-12 and exhibited a hyper-NETosis phenotype. The presence of these neutrophils in patients with sepsis was associated with increased mortality. While we also observed the upregulation of key marker genes like CD40 in our study’s core inflammation program, APANs displayed distinct features, such as elevated levels of CXCR4 and coexpression with CD74, suggesting a unique neutrophil polarization state discriminable from both neutrophil aging and canonical activation. The phenotype observed by the authors suggests the importance of further studying APANs, their features and their role in antigen presentation in humans and mice.
The upregulation of IL-4R we observed is concordant with reports of IL-4R upregulation during sterile inflammation in mice, with implications for diseases that are IL-4 mediated54. CD14 has recently been shown to be an important, highly cell-specific mediator of the TNF response in a mouse sepsis model55. Interestingly, CD14+ macrophages and neutrophils were found to be key players leading to lethality in response to TNF (with improved survival in CD14-deficient mice), which provides a model for the cytokine storm seen in severe sepsis and provides evidence for the complexity of the CD14-mediated inflammatory response beyond TLR signaling. These examples highlight the importance of core inflammation program members and stress the need to study them in a broad variety of inflammatory contexts.
The question of how well mouse models mimic human immunology is an area of ongoing debate. Even the same data can support different conclusions56,57, highlighting the impact of analytical decisions. Furthermore, it is important to compare suitable datasets, control for batch effects, and make comparisons to varying controls to avoid a shared denominator effect56,58,59.
In the context of neutrophils, fundamental differences between humans and mice exist60,61. Those differences must be considered when using the mouse as a model to study neutrophil function, especially in disease, as previously discussed62. Granule proteins found in neutrophils play a key role in defense against infection. An important difference in the granule protein repertoire concerns α-defensins, which exert antimicrobial63,64 and chemotactic65 activity and are absent in mouse neutrophils. It is also known that mouse neutrophils express less MPO, leading to a more limited capability to produce hypochlorous acid compared to their human counterparts66. The importance of cytokine production by neutrophils has been increasingly recognized67,68, with some cytokines such as IFN-β and IL-17 apparently expressed in mouse and not human neutrophils. The different immunoreceptor reservoir69 is, in part, a result of pathogen responses that are exclusive to the human species. For example, human neutrophils express specific CEACAMs that mediate uptake of the human-specific pathogen Neisseria gonorrhoeae70, which must be taken into account when modeling neutrophil responses to this pathogen71. Taken together, these studies provide important context to be taken into account when interpreting the core inflammation program identified.
The derivation of the core inflammation program was limited to bulk RNA-sequencing samples since a similar analysis using single-cell studies requires datasets that are only now beginning to emerge. To circumvent potential batch effects, we focused our analysis on studies with internal controls of resting neutrophils, excluding other potentially interesting studies containing only neutrophils harvested from inflamed sites. Analyzed samples were also limited by technical factors, including the known intron retention in neutrophils72, as well as the less complex transcriptome associated with low RNA and high RNase content. Furthermore, analysis of single-cell RNA sequencing data, the ATAC-seq data from the air pouch model of inflammation, and RNA-seq data from zymosan-activated HoxB8 samples represent only selected validation strategies in specific modalities of inflammation, which might limit the generalizability of some of the findings.
Nevertheless, our combined analysis of 11 human and mouse neutrophil transcriptomic datasets identified a largely conserved transcriptomic landscape across species, supporting the use of mouse neutrophils to illuminate human biology. We furthermore predicted and experimentally confirmed the existence of a core inflammation program conserved across human and mouse inflamed neutrophils. This study sets the stage for more fine-grained analyses of the epigenome, transcriptome, and proteome of neutrophils across varying conditions, which together will paint a clearer picture of the neutrophil response to different environments. Going forward, genetic perturbations and pharmacological interventions to interfere with pathologic neutrophil activation will be particularly informative if focused on programs conserved across species. The systems biology approach presented here can be transferred to other cell types and organisms to facilitate further studies comparing gene expression across species.
Methods
Ethics approval
Research with healthy human participants followed the Declaration of Helsinki. Blood of healthy donors was collected under IRB-approved protocols (Heidelberg S-272/2021 and Heidelberg S-285-2015) approved by the ethics committee of the University of Heidelberg, Heidelberg, Germany. Informed consent was obtained from all participants.
Experiments involving animals were conducted under the approval of the Animal Care Facility Heidelberg and the Animal welfare officers (approval #T66/21) at the University of Heidelberg, Heidelberg, Germany.
We obtained publicly available RNA sequencing data from mouse and human leukocytes through GEO (262 samples from 24 studies) and integrated these data by mapping orthologous genes. Differential expression analysis between resting and inflamed neutrophils was performed separately for each dataset, and the core inflammation program was derived using Fisher’s combined test. Transcription factor enrichment analysis was performed using ChEA3 and DoRothEA and compared to chromatin accessibility data from ATAC-seq (GSE161765). The impact of Cebpβ and JunB knockout on the core inflammation program was studied using RNA-seq data from HoxB8 cells (GSE161765). The core inflammation program was validated in stimulated mouse and human neutrophils by flow cytometry.
Datasets
For all analyses, we used the following datasets:
RNA sequencing
Datasets of interest were identified through a literature search on PubMed and the NCBI Gene Expression Omnibus. In total, 262 publicly available RNA sequencing samples from 24 studies were included.
Lineage atlas dataset (Table 1): 76 samples, 40 human samples, 36 mouse samples. This dataset is a curated subset of the Haemopedia RNA-Seq atlas. Human cells were from buffy coats of healthy donors, and mouse cells were from blood, bone marrow, spleen, and lymph nodes.
Neutrophil dataset (Table 2): 195 samples (including the 9 Haemopedia neutrophil samples, Choi J 2019), 136 human samples from 13 studies, 59 mouse samples from 11 studies. All studies in this dataset were selected only to contain neutrophils. A subset of this dataset from studies with inflamed samples as well as healthy controls was used for differential expression testing and inflammatory core signature construction. Other subsets of samples from studies not selected for differential expression analysis have been used in analyses focusing on healthy control samples (Fig. 2).
Table 1.
Lineage | Species | N samples |
---|---|---|
B cells | Human | 8 |
Dendritic cells | Human | 7 |
Monocytes | Human | 7 |
Neutrophils | Human | 3 |
NK cells | Human | 5 |
T cells | Human | 10 |
B cells | Mouse | 2 |
Dendritic cells | Mouse | 4 |
Monocytes | Mouse | 6 |
Neutrophils | Mouse | 6 |
NK cells | Mouse | 2 |
T cells | Mouse | 16 |
Table 2.
Species | Study name | Used for signature construction | N samples |
---|---|---|---|
Human | Adrover JM 2020 (ref. 10) | FALSE | 6 |
Human | Catapano M 2020 (ref. 11) | FALSE | 11 |
Human | Choi J 2019 (ref. 12) | FALSE | 3 |
Human | Franco LM 2019 (ref. 13) | FALSE | 2 |
Human | Grabowski P 2019 (ref. 14) | FALSE | 22 |
Human | Vecchio F 2018 (ref. 19) | FALSE | 8 |
Human | McCreary M 2019 | TRUE | 10 |
Human | Miralda I 2020 (ref. 16) | TRUE | 16 |
Human | Mistry P 2019 (ref. 17) | TRUE | 28 |
Human | Ter Haar NM 2018 (ref. 107) | TRUE | 6 |
Human | Thomas HB 2015 (ref. 18) | TRUE | 12 |
Human | Wright HL 2013 (ref. 21) | TRUE | 6 |
Human | Wright HL 2020 (ref. 20) | TRUE | 6 |
Mouse | Bhalla M 2021 (ref. 22) | FALSE | 7 |
Mouse | Casulli J 2019 (ref. 23) | FALSE | 3 |
Mouse | Coffelt SB 2015 (ref. 24) | FALSE | 4 |
Mouse | Choi J 2019 (ref. 12) | FALSE | 6 |
Mouse | Germann M 2020 (ref. 26) | FALSE | 4 |
Mouse | Hsu BE 2019 (ref. 27) | FALSE | 4 |
Mouse | Zhu YP 2018 (ref. 31) | FALSE | 3 |
Mouse | Gal-Oz ST 2019 (ref. 25) | TRUE | 12 |
Mouse | Hutchins AP 2015 (ref. 28) | TRUE | 4 |
Mouse | Stasulli NM 2015 (ref. 29) | TRUE | 6 |
Mouse | Yan Z 2019 (ref. 30) | TRUE | 6 |
RNA-Seq of HoxB8 cells
Khoyratty TE 2021 (ref. 36): 18 samples
ATAC-Seq
Khoyratty TE 2021 (ref. 36): 5 peak annotations
Flow cytometry
This study: samples from 8 human donors and 9 mice
This study: samples from different organs of 6 mice
Single-cell RNA sequencing
Xie et al., 2020
Montaldo et al., 2022
Combes et al., 2021
Data retrieval and processing
We downloaded raw sequencing reads for the selected studies to the MLS&WISO bwForCluster using release 1.5 of the nf-core73 fetchngs pipeline and quantified them using release 3.6 of the nf-core rnaseq pipeline. The pipelines were launched using nextflow74 (v22.04.0). To ensure high reproducibility, all pipeline processes were run inside singularity (v3.9.2) containers. For bulk RNA-seq samples, we mapped all downloaded samples using salmon75 (v1.5.2) with the parameters libType set to ‘A’ and indexing the reference genomes with 21 base k-mers. Quantified transcripts were summarized to the gene level using bioconductor-tximeta76 (v1.8.0). All human samples were mapped to the GRCh38 genome. All mouse samples were mapped to the GRCm39 genome unless stated otherwise. Quality control was conducted using FastQC (Supplementary Data 1). Author-supplied metadata was queried using GEOquery77 (v2.64.0) and integrated manually to ensure consistency across studies (Supplementary Data 2). R (v4.2.0) was used for downstream analyses. Bioconductor (v3.15) and additional packages were used for downstream analyses and visualizations51,78–80. Sequencing depths (total amount of mapped reads) for human samples in the lineage atlas dataset ranged between 11,970,326 and 16,060,356 (median 13,267,958), for mouse samples between 16,076,257 and 33,595,867 (median 18,689,418; Supplementary Fig. 2a, b). In the neutrophil dataset, sequencing depths ranged between 195,712 and 57,144,004 (median 29,741,078) for human and between 1,022,486 and 42,005,334 (median 12,429,230) for mouse samples. The subset of samples that was selected for differential expression testing and inflammatory core program calculation was sequenced between 8,135,455 and 57,144,004 (median 33,522,466) for human and between 1,022,486 and 36,405,228 (median 6,388,735; Supplementary Fig. 2c, d) for mouse samples.
For single-cell RNA sequencing data, the raw sequencing reads were downloaded as described above and aligned using cellranger (v7.1.0) using the GRCh38 genome for human datasets and mm10 genome for mouse samples, respectively. Downstream analysis was carried out in Python (v3.10) using the scanpy81,82 API (v1.9.3) for data analysis and visualization.
Orthology analyses and mapping
For downstream analyses, genes were mapped using ENSEMBL version 107 (ref. 2). We restricted all our composite cross-species analyses to protein-coding genes with a high-confidence orthology relationship and available gene symbols in both species. Mouse and human gene expression datasets were combined based on these orthologs.
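The ortholog-based merge can be illustrated with a short Python sketch. It is not the pipeline used in this study: the tuple layout of the ortholog table, the function names, the placeholder genes, and the restriction to one-to-one pairs are assumptions made for illustration, and all values are invented.

```python
from collections import Counter


def one_to_one_orthologs(pairs):
    """Return {human_symbol: mouse_symbol} for unambiguous, high-confidence pairs.

    `pairs` holds (human_symbol, mouse_symbol, high_confidence) tuples; this
    layout is assumed for illustration only.
    """
    confident = [(h, m) for h, m, ok in pairs if ok and h and m]
    human_counts = Counter(h for h, _ in confident)
    mouse_counts = Counter(m for _, m in confident)
    return {
        h: m
        for h, m in confident
        if human_counts[h] == 1 and mouse_counts[m] == 1
    }


def merge_by_ortholog(human_expr, mouse_expr, orthologs):
    """Combine per-gene values of both species, keyed by the human symbol."""
    return {
        h: {"human": human_expr[h], "mouse": mouse_expr[m]}
        for h, m in orthologs.items()
        if h in human_expr and m in mouse_expr
    }


# Toy example with made-up values and placeholder gene names.
pairs = [
    ("CD14", "Cd14", True),
    ("IL1B", "Il1b", True),
    ("GENE_A", "Gene_a1", True),   # one-to-many: dropped in this sketch
    ("GENE_A", "Gene_a2", True),
    ("GENE_B", "Gene_b", False),   # low confidence: dropped
]
orthologs = one_to_one_orthologs(pairs)
merged = merge_by_ortholog({"CD14": 5.1, "IL1B": 2.0}, {"Cd14": 4.8}, orthologs)
```

Keying the merged table by the human symbol mirrors the convention of referring to each mapped gene by its human gene symbol.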
Identification of lineage-associated genes
Lineage-associated genes were identified using a linear model-based differential expression test, implemented in limma83 (v3.52.0) and edgeR84–86 (v3.38.0). Differential expression testing was restricted to protein-coding genes that could be assigned high-confidence orthologs between human and mouse samples. We constructed a cross-species count matrix based on those mappings and referred to each mapped gene by its human gene symbol. Counts were filtered using edgeR’s filterByExpr filtering approach. We applied TMM normalization to account for differences in library composition. We then transformed counts to log2(CPM) values and estimated weights for each observation using voom. We applied limma to fit a linear model to our data and calculated differential expression for a given lineage against all remaining lineages. Lineage-associated genes were defined as genes that were differentially expressed in each lineage against all other lineages at a Benjamini-Hochberg corrected P-value of ≤0.05 and a log2 FC >0. Genes were ranked according to their F statistic, and up to 200 genes were selected per lineage.
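As a rough illustration of the log2(CPM) transformation step, the following Python sketch applies the offsets that voom uses (0.5 added to each count and 1 added to the library size). The `norm_factor` argument is a stand-in for a TMM scaling factor, whose calculation is not shown, and the counts are invented.

```python
import math


def log2_cpm(counts, norm_factor=1.0):
    """log2 counts per million for one sample, with voom-style offsets.

    `norm_factor` stands in for a TMM scaling factor (computing TMM itself is
    not shown here).
    """
    effective_lib_size = sum(counts) * norm_factor
    return [
        math.log2((c + 0.5) / (effective_lib_size + 1.0) * 1e6)
        for c in counts
    ]


sample_counts = [99, 999900]   # made-up counts; library size 999,999
values = log2_cpm(sample_counts)
```

The offsets keep the transformation defined for genes with zero counts.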
Lineage PCA, correlation analysis, and clustering
We used these balanced lineage-associated gene sets to perform PCA as well as correlation and clustering analysis on all samples. Human and mouse samples were combined as described above. To emphasize our focus on comparisons between lineages, we mean-centered log2(CPM) for each species prior to combining the count matrices. A PCA was computed for all integrated samples, taking the concatenated lineage-associated gene sets as input features. Correlation of expression analyses was performed based on the same features, calculating Pearson’s r correlation coefficient for each inter-sample combination. We subsequently performed a hierarchical clustering analysis on the obtained correlation coefficients.
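The centering and correlation steps can be sketched in Python as follows. The sketch assumes that mean-centering is applied per gene within each species before the sample columns are joined; the toy matrices (rows: genes, columns: samples) and function names are hypothetical.

```python
import math


def center_rows(matrix):
    """Subtract each gene's mean across samples (rows: genes, columns: samples)."""
    centered = []
    for row in matrix:
        mean = sum(row) / len(row)
        centered.append([v - mean for v in row])
    return centered


def pearson_r(x, y):
    """Pearson correlation coefficient of two equally long value lists."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def combine_species(human, mouse):
    """Center each species separately, then join the sample columns gene by gene."""
    return [h + m for h, m in zip(center_rows(human), center_rows(mouse))]


# Made-up log2(CPM) values for three genes and two samples per species.
human = [[1, 3], [5, 9], [2, 2]]
mouse = [[10, 14], [0, 8], [3, 3]]
combined = combine_species(human, mouse)
samples = [list(col) for col in zip(*combined)]
r_h1_m1 = pearson_r(samples[0], samples[2])
r_h1_m2 = pearson_r(samples[0], samples[3])
```

Centering within each species removes species-wide offsets in expression level, so that the correlation reflects differences between samples rather than between species.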
Comparison of neutrophil lineage gene expression profiles in resting neutrophils
To compare expression patterns of neutrophil lineage-associated genes in resting human and mouse neutrophils, we first defined lineage-associated genes for human and mouse samples separately. We defined as lineage-associated those genes that were upregulated (Benjamini-Hochberg corrected P-value of ≤0.05 and a log2 FC >1) in neutrophils against all other lineages. We next mapped those gene sets to their human and mouse counterparts, considering only high-confidence one-to-one, one-to-many, and many-to-many orthology relationships. Based on those mappings, we merged all genes detected as lineage-specific in either of the considered species. We also included genes that were detected as lineage-specific in either species but could not be mapped to a high-confidence ortholog. The resulting genes were subset to include only genes that showed evidence of expression in the inflammatory dataset.
Taking the computed mappings and log2(TPM+1) expression values of mapped gene-gene pairs, we tested for differential expression of those pairs between species using a linear mixed model87 (lme4 v1.1-29) accounting for study-related batch effects by including the study annotation as a random effect:
Full model: log2(TPM+1) ~ species + (1|study)
Null model: log2(TPM+1) ~ (1|study)
P-values were computed by performing a likelihood ratio test between these models. We subsequently adjusted those values using the Benjamini-Hochberg correction method based on the total number of tested gene-gene pairs (pairs of genes that appeared as lineage-specific in either species, were expressed in the inflammatory dataset, and could be mapped to one or more counterparts with high confidence).
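For a comparison of two nested models that differ by a single fixed effect (here, species), the likelihood ratio test has one degree of freedom. The Python sketch below shows only the final step, converting two log-likelihoods into a P-value; fitting the mixed models is not shown, and the log-likelihood values are invented.

```python
import math


def likelihood_ratio_p(loglik_full, loglik_null):
    """P-value of a likelihood ratio test with one extra parameter (df = 1)."""
    statistic = max(0.0, 2.0 * (loglik_full - loglik_null))
    # For df = 1, the chi-square survival function equals erfc(sqrt(x / 2)).
    return math.erfc(math.sqrt(statistic / 2.0))


# Made-up log-likelihoods of the full and null model for one gene-gene pair.
p_value = likelihood_ratio_p(-100.0, -101.9207294103470)
```

The closed form via `math.erfc` applies to one degree of freedom only; a comparison with more additional parameters would need a general chi-square survival function.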
Using the average expression of mapped gene-gene pairs and the differential expression P-value, we defined 5 different expression profiles. The first comprised genes that showed high (>95th percentile of all genes that were detected as lineage-specific in either species) average expression levels in both species and did not exhibit differential expression between species (Benjamini-Hochberg corrected P-value ≥ 0.05, absolute β < 1). Additionally, we defined 4 divergent clusters of genes that either had high expression levels in only one of the two species and showed evidence of differential expression (Benjamini-Hochberg corrected P-value < 0.05, absolute β ≥ 1) or were abundantly expressed but could not be assigned an orthologous gene in the other species.
Differential expression testing
We performed differential expression analyses between inflamed and resting conditions on a total of 112 samples from N = 11 (human: 7, mouse: 4) studies. To account for potential batch effects between studies, we used DESeq2 (ref. 88; v1.36.0) in each of the studies individually to identify differentially expressed genes in inflamed compared to healthy control samples. Each study’s gene list was pre-filtered to only include genes with counts >1 in at least 1 sample before differential expression analysis, which is based on the negative binomial distribution. To remove noise while preserving significant differences, log2 fold change results were then shrunk using the apeglm package89. Differential gene expression results were additionally filtered through DESeq2’s default independent filtering approach, as well as its count outlier filtering.
Identification of a core inflammation program
To assess which inflammation-driven changes in the neutrophil transcriptome are shared across conditions and conserved across species, we applied a Fisher’s combined test to the adjusted P-values of each gene in each study, restricting the analysis to genes that passed our expression filter as well as DESeq2 filters in ≥80% of studies. This analysis provided a Benjamini-Hochberg-corrected composite P-value for all genes and allowed us to rank genes by their likelihood of dysregulation in inflammation. Additionally, we calculated a mean log2 fold change for each gene across all studies.
Based on a rank-P-value plot (Fig. 3b), we determined a P-value cutoff at a rank equaling 500, corresponding to an adjusted Fisher P ≤ 6.164117 × 10⁻⁴⁰. Genes with an absolute log2 fold change greater than or equal to 0.5 and an adjusted P-value below our defined threshold were considered conserved in inflammation. We defined the upregulated subset (log2 FC ≥ 0.5) of those conserved genes as the core inflammation program.
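The combination of per-study P-values and the subsequent selection can be sketched in Python using only the standard library. This is a minimal illustration, not the code used in this study: the function names, the tuple layout of the gene table, and the floor applied to very small P-values are assumptions. Fisher's statistic X = −2 Σ ln p follows a chi-square distribution with 2k degrees of freedom for k studies, and its survival function has a closed form for even degrees of freedom.

```python
import math


def fisher_combined_p(p_values, floor=1e-300):
    """Fisher's method: X = -2 * sum(ln p) follows chi-square with 2k df."""
    k = len(p_values)
    # `floor` avoids log(0); it is a choice made for this sketch.
    half_x = -sum(math.log(max(p, floor)) for p in p_values)
    # Closed-form chi-square survival function for even degrees of freedom.
    term, total = 1.0, 1.0
    for i in range(1, k):
        term *= half_x / i
        total += term
    return min(1.0, math.exp(-half_x) * total)


def benjamini_hochberg(p_values):
    """Benjamini-Hochberg adjusted P-values, returned in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def select_core_program(genes, rank_cutoff=500, min_lfc=0.5):
    """genes: list of (symbol, adjusted_p, mean_log2_fc) tuples (assumed layout)."""
    top_ranked = sorted(genes, key=lambda g: g[1])[:rank_cutoff]
    return [symbol for symbol, _, lfc in top_ranked if lfc >= min_lfc]


combined = fisher_combined_p([0.05, 0.05])
adjusted = benjamini_hochberg([0.01, 0.04, 0.03, 0.005])
toy_genes = [("A", 1e-50, 2.0), ("B", 1e-45, -1.0), ("C", 1e-3, 3.0), ("D", 1e-48, 0.2)]
core = select_core_program(toy_genes, rank_cutoff=3)
```

Because the number of studies in which a gene passes the filters can differ between genes, k is taken from the length of each gene's own P-value list.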
Pathway enrichment analysis
Inflammatory pathway enrichment analysis was performed for each study individually using the fGSEA implementation of the Gene Set Enrichment Analysis method. For each study, the differential expression analysis results were ranked by log2 fold change. Enrichment was calculated for hallmark gene sets that were retrieved from the Molecular Signatures Database90 (v7.5.1).
Data preprocessing for single-cell RNA-seq
Datasets were imported using the raw count matrices from cellranger. First, empty droplets were determined by estimating the profile of ambient mRNA and testing deviations from this profile using a Dirichlet-multinomial model of UMI count sampling as implemented in the EmptyDrops method91 (implemented in the DropletUtils package, v1.18.1). Ambient RNA correction was applied using the soupX-algorithm92 (v1.6.2), and doublets were determined using a computational doublet detection tool that uses artificially created cell doublets to identify real cell doublets by nearest-neighbor-analysis in gene expression space93 (v1.12.0). Cells expressing hemoglobin-related genes in a proportion above 0.02 were excluded, as well as cells containing less than 250 (Xie et al., Montaldo et al.) or 100 (Combes et al.) unique genes per cell. Cells with a content above 5% (Xie et al.) or 10% (Combes et al., Montaldo et al.) of mitochondrial genes were also excluded. Cell types were identified using the SingleR package94 (v2.0.0) in R with BlueprintEncodeData and MonacoImmuneData as reference datasets for human datasets and ImmGenData as reference dataset for mouse data, as provided by the package celldex94 (v1.8.0). Data were log-normalized, and neutrophils were selected. For UMAP visualization, the 2000 genes containing the highest variance were selected and UMAP was computed using the scanpy API with default settings.
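The per-cell exclusion criteria stated above can be summarized in a short Python sketch. The thresholds are taken from the text, whereas the dictionary keys, field names, and the example cell are hypothetical.

```python
# Thresholds as stated in the text:
# (minimum genes per cell, maximum mitochondrial fraction) per dataset.
QC_THRESHOLDS = {
    "Xie": (250, 0.05),
    "Montaldo": (250, 0.10),
    "Combes": (100, 0.10),
}
MAX_HEMOGLOBIN_FRACTION = 0.02


def passes_qc(cell, dataset):
    """Return True if a cell is retained under the dataset-specific thresholds."""
    min_genes, max_mito = QC_THRESHOLDS[dataset]
    return (
        cell["n_genes"] >= min_genes
        and cell["mito_fraction"] <= max_mito
        and cell["hemoglobin_fraction"] <= MAX_HEMOGLOBIN_FRACTION
    )


# Made-up cell: retained in the Combes and Montaldo settings, excluded in Xie.
example_cell = {"n_genes": 300, "mito_fraction": 0.08, "hemoglobin_fraction": 0.0}
kept_in_combes = passes_qc(example_cell, "Combes")
kept_in_xie = passes_qc(example_cell, "Xie")
```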
Gene set enrichment in single-cell RNA-seq
The gene counts were normalized and log1p transformed using scanpy. The enrichment of genes belonging to the core program was quantified using the difference between the average expression of the core inflammation program genes and the average expression of a random reference set of genes95 that have been sampled to match the expression distribution of the core inflammation program (scanpy score_genes function with default settings). We tested for enrichment in inflamed conditions using a linear mixed model with the sample (and organ for the dataset from Xie et al.) as a random effect and the respective treatment as a fixed effect. For datasets with multiple inflamed conditions, a separate model was fitted for each comparison. P-values were calculated by performing a likelihood ratio test between both models as described above. For visualization, the gene set scores were quantile-capped to the 5th and 95th percentile.
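The idea behind this score can be illustrated with a simplified Python sketch. It is not scanpy's implementation: genes are binned by rank of their average expression, control genes are drawn from the bins that contain program genes, and the score per cell is the mean of the program genes minus the mean of the control genes. The values of `n_bins` and `ctrl_size` are chosen to resemble scanpy's defaults, and the input layout (gene mapped to a list of per-cell values) is an assumption.

```python
import random


def gene_set_scores(expr, gene_set, n_bins=25, ctrl_size=50, seed=0):
    """Per-cell score: mean of gene-set genes minus mean of matched control genes."""
    rng = random.Random(seed)
    targets = sorted(g for g in gene_set if g in expr)
    # Rank genes by average expression and split the ranking into n_bins bins.
    ranked = sorted(expr, key=lambda g: sum(expr[g]) / len(expr[g]))
    bin_of = {g: i * n_bins // len(ranked) for i, g in enumerate(ranked)}
    control = []
    for b in sorted({bin_of[g] for g in targets}):
        pool = [g for g in ranked if bin_of[g] == b and g not in gene_set]
        control.extend(rng.sample(pool, min(ctrl_size, len(pool))))
    if not targets or not control:
        raise ValueError("need at least one target gene and one control gene")
    n_cells = len(expr[targets[0]])
    scores = []
    for c in range(n_cells):
        target_mean = sum(expr[g][c] for g in targets) / len(targets)
        control_mean = sum(expr[g][c] for g in control) / len(control)
        scores.append(target_mean - control_mean)
    return scores


# Made-up expression values for four genes in two cells.
toy_expr = {"A": [2, 0], "B": [4, 0], "C": [1, 1], "D": [1, 3]}
toy_scores = gene_set_scores(toy_expr, {"A", "B"}, n_bins=1)
```

A positive score indicates that the program genes are expressed above the expression-matched background in that cell.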
Transcription factor enrichment analysis
In order to identify regulators associated with genes induced or downregulated in inflammation, we used ChEA3 (ref. 96) with default settings as described9, using the 250 most significantly up- and downregulated genes, respectively, for each condition, ranked by their adjusted P-value. We then calculated the arithmetic mean of the negative logarithms of the ChEA3 scores per species and transcription factor to compare average TF activity across species.
We used a paired t-test to assess significant differences between a TF’s ChEA3 enrichment in up- against downregulated genes across all comparisons. The resulting P-values were corrected using a Benjamini-Hochberg correction for all tested TFs. We used these corrected P-values to determine if a TF was significantly more enriched in genes upregulated in inflammation or vice versa. We subsequently inferred transcription factor activity using DoRothEA97 (v1.8.0) and decoupleR98 (v2.2.2), taking advantage of the species-specific transcription factor databases. Here, log2 fold change matrices per species served as input, leading to enrichment scores with their respective P-values. For downstream analyses, we calculated the mean enrichment scores per species and preserved the highest observed P-value for each transcription factor. For data visualization (Fig. 5a), only transcription factors where the respective gene had an assigned P-value in Fisher’s combined test are shown (N = 680). Genes encoding transcription factors with a mean log2 fold change ≥0 were merged with transcription factor scores that were derived from the upregulated score set (as in Supplementary Fig. 8b), and genes with a mean log2 fold change <0 were merged with transcription factor scores that were derived from the downregulated score set (as in Supplementary Fig. 8a). Labeling was selectively applied for genes encoding transcription factors with the 10 highest absolute log2 fold changes per direction which fall under the adjusted Fisher’s combined P-rank cutoff of 500 genes, as well as the 5 transcription factors with the lowest P-values and CEBPB.
ATAC-sequencing analysis
We retrieved ATAC-sequencing data from mice that were subjected to the air pouch model of acute inflammation (GEO: GSE161765, mapped to the GRCm38 genome). Genes annotated based on differentially accessible peaks as defined in the study (Padj < 0.05, log2 fold change > 1.5) were compared with the conserved upregulated genes as defined in the core inflammation program. The ratio and number of core inflammation program genes that were associated with projected increased accessibility served as an input for pairwise Fisher’s exact tests, and P-values were adjusted using the Benjamini-Hochberg method. For each comparison, the significance of the number of core inflammation genes with increased accessibility was retrieved by comparing the number with the results of this analysis using a 1000-fold repeated random selection of expression-matched background genes (as described below for RNA sequencing).
RNA-sequencing analysis of zymosan-treated HoxB8 cells
We retrieved featureCounts (per ENSEMBL-ID) from HoxB8 cells that were subjected to differentiation and zymosan treatment (GEO: GSE161765, mapped to the GRCm38 genome). Differential expression analysis was performed as described in the respective section above. We restricted the analysis to HoxB8 cells that were differentiated for 5 days and then compared (1) wild-type cells that were treated for 2 h with zymosan (50 µg/ml) or DMSO (control), (2) resting stable knockout HoxB8 cell lines versus wild type, (3) zymosan-treated stable knockout HoxB8 cell lines versus zymosan-treated wild-type HoxB8 cells. A significant up- or downregulation of core inflammation program genes was then assessed by performing pairwise overrepresentation analyses. For each overrepresentation analysis, we defined differentially expressed genes as genes with an FDR ≤ 0.05 and a |log2 fold change| ≥ 1. We then used goseq (v1.48.0) to calculate a Probability Weighting Function for the given set of genes, and calculated P-values by approximating the true distribution by the Wallenius non-central hypergeometric distribution as previously described99.
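The logic of an overrepresentation test can be illustrated with the ordinary hypergeometric distribution, to which the Wallenius non-central hypergeometric distribution reduces when all genes carry equal weight (odds ratio of 1). The Python sketch below therefore shows this unweighted special case and does not reproduce the bias-corrected goseq calculation; the function name and example numbers are hypothetical.

```python
from math import comb


def hypergeometric_enrichment_p(n_universe, n_in_set, n_drawn, n_overlap):
    """P(overlap >= n_overlap) when n_drawn genes are drawn without replacement.

    n_universe: all tested genes; n_in_set: genes in the gene set of interest;
    n_drawn: differentially expressed genes; n_overlap: observed overlap.
    """
    total = comb(n_universe, n_drawn)
    upper = min(n_in_set, n_drawn)
    tail = sum(
        comb(n_in_set, i) * comb(n_universe - n_in_set, n_drawn - i)
        for i in range(n_overlap, upper + 1)
    )
    return tail / total


# Made-up example: 10 genes, 5 in the set, 5 differentially expressed, all 5 overlap.
p_all_overlap = hypergeometric_enrichment_p(10, 5, 5, 5)
```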
The control expression was calculated as previously described95. For each comparison, the gene expression was distributed in 25 bins. Then, each core inflammation program member was assigned to its respective bin. The randomized sets were then sampled according to the distribution of core inflammation program gene expressions. This sampling was repeated 1000 times.
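The sampling of expression-matched random gene sets can be sketched in Python as follows. The sketch assumes rank-based (quantile) bins, since the binning scheme is not specified beyond the number of bins, and it adds an empirical P-value with a +1 correction as one possible way to compare an observed count against the random sets; both choices, like the function names and toy data, are assumptions.

```python
import random
from collections import Counter


def matched_random_sets(expression, program_genes, n_bins=25, n_iter=1000, seed=0):
    """Draw random gene sets that match the program genes' expression bins."""
    rng = random.Random(seed)
    ranked = sorted(expression, key=expression.get)
    bin_of = {g: i * n_bins // len(ranked) for i, g in enumerate(ranked)}
    program = {g for g in program_genes if g in bin_of}
    # Background pool per bin, excluding the program genes themselves.
    pools = {}
    for g in ranked:
        if g not in program:
            pools.setdefault(bin_of[g], []).append(g)
    needed = Counter(bin_of[g] for g in program)
    random_sets = []
    for _ in range(n_iter):
        draw = []
        for b, k in sorted(needed.items()):
            pool = pools.get(b, [])
            draw.extend(rng.sample(pool, min(k, len(pool))))
        random_sets.append(draw)
    return random_sets


def empirical_p(observed, null_values):
    """Share of random sets reaching the observed value (with +1 correction)."""
    exceed = sum(1 for v in null_values if v >= observed)
    return (exceed + 1) / (len(null_values) + 1)


# Made-up expression values for 100 genes; two hypothetical program genes.
toy_expression = {f"g{i}": float(i) for i in range(100)}
toy_sets = matched_random_sets(toy_expression, {"g0", "g50"}, n_bins=10, n_iter=5)
```

Each random set contains as many genes per expression bin as the program itself, so that differences between the program and the background are not driven by expression level alone.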
Experimental validation
The list of ranked conserved inflammatory response genes was filtered to include genes encoding surface proteins using the surfaceome resource45. The remaining N = 69 surface protein-encoding genes were then filtered by available human and mouse antibodies (BioLegend), and a panel consisting of CD14, CD69, CD40, IL-4R, and PD-L1 was selected for validation.
Human samples
Neutrophils were isolated using density gradient centrifugation with Polymorphprep as previously described8: 30 ml whole blood was layered onto 20 ml Polymorphprep (Progen #1114683) and centrifuged at 535 × g for 35 min. The PBMC-containing layer was discarded by suction, and neutrophils were recovered and subjected to hypotonic lysis using 0.2% NaCl. The cells were subsequently washed with cell culture medium (RPMI 1640 (Gibco #21875-034)) supplemented with 10% heat-inactivated FBS (PAN Biotech #3302/P101102) and 1% GlutaMAX (Gibco #35050-061) and seeded at 5 million cells per 6 wells in a total volume of 5 ml in a humidified atmosphere at 37 °C with 5% CO2. The cells were cultured over 48 h either in the absence of cytokines (vehicle control), with GM-CSF + IFN-γ or GM-CSF + LPS. GM-CSF was used at a final concentration of 100 U/ml (R&D #215GM), IFN-γ at 10 ng/ml (BioLegend #570208), and LPS at 100 ng/ml (Invivogen #tlrl-3pelps). After 48 h, 1 million cells were collected and stained using the Zombie Yellow Fixable Viability Kit (BioLegend #423103) for live/dead discrimination, followed by an antibody panel (Supplementary Table 1) in 50 µl of FACS buffer (2% FBS, 5 mM EDTA and 0.1% sodium azide in PBS) for 25 min.
Mouse samples
Mice were housed under SPF conditions with a 12 h light/dark cycle, a humidity of 50–60%, a temperature of 22 ± 2 °C and food and water available ad libitum. Male and female C57BL/6J mice were sacrificed by cervical dislocation, and bone marrow was extracted by flushing with RPMI. Neutrophils were enriched by density centrifugation using Histopaque 1077 (Sigma-Aldrich #10771) and Histopaque 1119 (Sigma-Aldrich #11191). Cells were recovered from the interphase of both Histopaque layers and centrifuged. Cells were washed with RPMI containing 10% FBS and 1% GlutaMAX and seeded at 10⁶ cells/ml in 48 well plates in a total volume of 500 µl. Mouse GM-CSF (Peprotech #315-03, 100 U/ml), mouse IFN-γ (Peprotech #315-05, 10 ng/ml), and LPS (Invivogen #tlrl-3pelps, 100 ng/ml) were added to the medium for 24 h and 48 h in combination as indicated in the respective figures. Cells cultured in the absence of cytokines were used as controls. After the indicated times, cells were collected and stained with the antibody panel (Supplementary Table 2) in 50 µl of FACS buffer containing 2% FBS, 5 mM EDTA and 0.1% sodium azide.
To assess neutrophils from different organs, mice were sacrificed by cardiac puncture under general anesthesia. Subsequently, the femora and tibiae were flushed with PBS to obtain bone marrow. Any remaining fat was removed from spleens, and splenic tissue was mechanically disintegrated using the back of a syringe. Cells were pelleted at 400×g, and erythrocytes were lysed using ACK Lysing Buffer (Lonza #10-548E) for 5 min at 4 °C. Cells were seeded at 10⁶ cells/ml in 48 well plates in a total volume of 500 µl. Cytokines were added as described above for a total of 8 h.
Flow cytometry
Flow cytometry was performed on a BD LSRII flow cytometer. At least 50,000 events were recorded per sample. FCS files were exported from FACSDiva and subsequently compensated and gated in FlowJo (v10.8.0) for single, live, CD15+ (human) or Ly6G+ (mouse) cells. Eosinophils were excluded based on high autofluorescence in the live/dead (Pacific Orange) channel (Supplementary Fig. 11). Gated events and their median fluorescence intensity values were exported and concatenated into a SingleCellExperiment object using CATALYST100 (v1.16.2) in R (v4.2.0). The dataset was arcsinh transformed using manually determined cofactors8,101 and clustered by FlowSOM clustering followed by ConsensusClusterPlus metaclustering. For combined analysis of human and mouse cells, both datasets were mean-centered, scaled, and combined into one SingleCellExperiment. Dimensionality reduction was performed using the DiffusionMap algorithm as implemented in the CATALYST package with standard settings. For visualization, a random subset of 1500 cells per sample was plotted using ggplot2102 (v3.3.5). Principal components were calculated based on the median fluorescence values of the respective marker proteins per sample and plotted using matplotlib (v3.5.1) in Python (v3.9.1).
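The arcsinh transformation and the per-species centering and scaling described above can be illustrated with a minimal Python sketch. This is not the CATALYST implementation used in the study; the intensity values, the cofactor of 150, and the use of the sample standard deviation for scaling are assumptions chosen for illustration only.

```python
import math
from statistics import mean, stdev


def arcsinh_transform(values, cofactor):
    """Apply asinh(x / cofactor) to raw fluorescence intensities."""
    return [math.asinh(v / cofactor) for v in values]


def center_and_scale(values):
    """Mean-center and scale to unit sample standard deviation."""
    mu = mean(values)
    sd = stdev(values)
    if sd == 0:
        # A constant marker carries no information after centering.
        return [0.0 for _ in values]
    return [(v - mu) / sd for v in values]


# Hypothetical intensities of one marker in each species (not study data).
human_raw = [120.0, 950.0, 4300.0, 15000.0]
mouse_raw = [80.0, 400.0, 2500.0, 9000.0]
cofactor = 150.0  # assumed; the study used manually determined cofactors

# Transform and standardize each species separately, then combine.
human_z = center_and_scale(arcsinh_transform(human_raw, cofactor))
mouse_z = center_and_scale(arcsinh_transform(mouse_raw, cofactor))
combined = human_z + mouse_z
```

Standardizing each species separately before concatenation places both datasets on a comparable scale, so that differences in absolute fluorescence between panels do not dominate the joint embedding.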
Gene ontology enrichment analysis
We used EnrichR103–105 to assess enriched gene sets in a given list of genes. The analysis was restricted to terms annotated in GO_Biological_Process_2021.
Linear mixed-effect model
To validate the core inflammation program derived from Fisher’s combined test, we globally tested for differential expression between resting and inflamed cells, including all samples used in the Fisher’s combined testing approach. We accounted for batch effects by correcting gene counts for the study using ComBat-Seq106 (sva v3.44.0). From batch-corrected counts, we calculated TMM-normalized log2 counts per million that were subsequently quantile normalized and then used as input for linear modeling. Modeling was implemented using lme487 (v1.1-29) to fit a linear mixed-effects model (LMM) to normalized counts. For each gene, we fit a full model, expression ~ condition + (1 | study), and a reduced model, expression ~ (1 | study), where condition was the variable to test for and study was the covariate treated as the random effect. We retrieved β as an estimate for the log2(FC) from the full model and subsequently performed a likelihood ratio test comparing the full with the reduced model to retrieve the respective P-values. P-values were then adjusted using the Benjamini-Hochberg procedure.
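The likelihood ratio test and the Benjamini-Hochberg adjustment described above can be sketched in a few lines of Python. This is an illustration rather than the lme4-based analysis code: the log-likelihood values are hypothetical, and the test assumes that the full and reduced models differ by a single parameter (a two-level condition term), so that the test statistic is compared against a chi-square distribution with one degree of freedom.

```python
import math


def lrt_pvalue_df1(loglik_full, loglik_reduced):
    """Likelihood ratio test P-value for nested models differing by one parameter."""
    stat = max(2.0 * (loglik_full - loglik_reduced), 0.0)
    # Survival function of chi-square with 1 df: P(X > x) = erfc(sqrt(x / 2)).
    return math.erfc(math.sqrt(stat / 2.0))


def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted P-values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical (full, reduced) log-likelihoods for three genes.
logliks = [(-100.0, -108.0), (-95.0, -95.5), (-120.0, -123.0)]
pvals = [lrt_pvalue_df1(full, reduced) for full, reduced in logliks]
padj = benjamini_hochberg(pvals)
```

In practice the log-likelihoods would come from models fitted by maximum likelihood rather than REML, since likelihood ratio tests on fixed effects are not valid for REML fits.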
π1-statistic
Using the qvalue package39,50,51 (v2.28.0), we calculated the π1-statistic (1 − π0) as the estimated proportion of truly differentially expressed genes for a given set of P-values. To account for a selection of genes potentially biased toward low P-values when testing for the replicability of DEGs between studies, q-value calculation was implemented via the qvalue_truncp function.
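To make the π1-statistic concrete, the following Python sketch estimates π0 with Storey's estimator at a single tuning value λ and returns π1 = 1 − π0. This is a simplification under stated assumptions: the qvalue package by default smooths π0 estimates over a grid of λ values, and the truncated-P-value handling of qvalue_truncp is not reproduced here. The P-values are hypothetical.

```python
def pi1_statistic(pvalues, lam=0.5):
    """Estimate pi1 = 1 - pi0 using Storey's estimator at a fixed lambda.

    pi0 is estimated as #{p > lambda} / (m * (1 - lambda)), capped at 1.
    """
    m = len(pvalues)
    if m == 0:
        raise ValueError("at least one P-value is required")
    if not 0.0 <= lam < 1.0:
        raise ValueError("lambda must lie in [0, 1)")
    n_above = sum(1 for p in pvalues if p > lam)
    pi0 = min(1.0, n_above / (m * (1.0 - lam)))
    return 1.0 - pi0


# Hypothetical P-values of candidate genes re-tested in a second study.
replication_pvalues = [0.001, 0.002, 0.01, 0.02, 0.03, 0.04, 0.6, 0.8, 0.3, 0.9]
pi1 = pi1_statistic(replication_pvalues, lam=0.5)
```

The estimator relies on the fact that P-values of true null hypotheses are uniformly distributed, so the density of P-values above λ reflects the null proportion.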
Gene expression modules using WGCNA
For WGCNA39 analysis, we selected the same samples that were used for differential expression testing. We accounted for batch effects by correcting gene counts using ComBat-Seq106 (sva v3.44.0). From batch-corrected counts, we calculated TMM-normalized log2 counts per million that were subsequently quantile normalized and then used as input for WGCNA. The network was constructed as a signed network, using a soft thresholding power of 13, a minimum module size of 30, and a merge cut height of 0.25. Modules with more than 1000 genes were removed from subsequent analyses.
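The effect of the soft thresholding power in a signed network can be sketched as follows. In a signed WGCNA network, the adjacency between two genes is ((1 + cor) / 2) raised to the power β, so that negatively correlated genes receive adjacencies near zero. The Python code below is an illustration, not the WGCNA R implementation, and the expression profiles are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def signed_adjacency(cor, power=13):
    """Signed-network adjacency: ((1 + cor) / 2) ** power."""
    return ((1.0 + cor) / 2.0) ** power


# Hypothetical normalized expression of three genes across five samples.
gene_a = [1.0, 2.0, 3.0, 4.0, 5.0]
gene_b = [1.1, 2.3, 2.9, 4.2, 4.8]   # co-regulated with gene_a
gene_c = [5.0, 4.1, 3.2, 1.9, 1.0]   # anti-correlated with gene_a

adj_ab = signed_adjacency(pearson(gene_a, gene_b))
adj_ac = signed_adjacency(pearson(gene_a, gene_c))
```

With a power of 13, even an uncorrelated gene pair (cor = 0) has an adjacency of only 0.5^13, roughly 0.00012, which is why a high power suppresses weak correlations and emphasizes strongly co-expressed gene pairs.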
Statistical analyses
Correlations indicated on scatter plots represent Pearson’s R (Pearson’s correlation coefficient) with their respective P-value. For comparisons of the mean, we used the Mann-Whitney U test (two groups) or Kruskal-Wallis H test (three groups) if the Shapiro-Wilk test indicated non-normality in at least one group. When multiple comparisons were performed, P-values and/or asterisks indicate adjusted P-values using the Holm-Bonferroni method unless stated otherwise. For pairwise comparisons, the Mann-Whitney U test was used post-hoc, taking multiple comparisons into account using the Holm-Bonferroni method. To test for categorical associations, we used Fisher’s exact test. Asterisks represent the following P-value ranges: ns, P > 0.05; *, 0.01 < P ≤ 0.05; **, 0.001 < P ≤ 0.01; ***, 0.0001 < P ≤ 0.001; ****, P ≤ 0.0001.
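The Holm-Bonferroni adjustment and the asterisk notation defined above can be expressed as a short Python sketch. The function names and the example P-values are hypothetical and serve only to illustrate the procedure; they are not taken from the study's analysis code.

```python
def holm_bonferroni(pvalues):
    """Return Holm-Bonferroni adjusted P-values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_max = 0.0
    # The smallest P-value is multiplied by m, the next by m - 1, and so on;
    # a running maximum keeps the adjusted values monotone.
    for rank, i in enumerate(order):
        candidate = min(1.0, (m - rank) * pvalues[i])
        running_max = max(running_max, candidate)
        adjusted[i] = running_max
    return adjusted


def asterisks(p):
    """Map an (adjusted) P-value to the notation used in the figures."""
    if p > 0.05:
        return "ns"
    if p > 0.01:
        return "*"
    if p > 0.001:
        return "**"
    if p > 0.0001:
        return "***"
    return "****"


# Hypothetical raw P-values from three pairwise Mann-Whitney U tests.
raw = [0.01, 0.04, 0.03]
adjusted = holm_bonferroni(raw)
labels = [asterisks(p) for p in adjusted]
```

Note that a comparison with a raw P-value below 0.05 can become non-significant after adjustment, as happens for the second and third comparisons in this example.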
Reporting summary
Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.
Acknowledgements
N.S.H. and F.A.R. were supported by MD fellowships from the Boehringer Ingelheim Fonds. F.A.R. and T.E. were supported by an MD/PhD fellowship from the Medical Faculty of Heidelberg. P.A.N. was supported by NIH grants R01AR065538 and P30AR070253. This work was supported by grants to R.G.-B. from Deutsche Forschungsgemeinschaft (DFG, GR 5979/2-1, 517717827), Else Kröner-Fresenius-Stiftung (2022_EKEA.72), state of Baden-Wuerttemberg within the Centers for Personalized Medicine Baden-Wuerttemberg (ZPM) and a research grant from the German Society for Rheumatology (DGRh). Figure 6b and Supplementary Fig. 1 were created with BioRender.com. The authors acknowledge support by the state of Baden-Württemberg through bwHPC and the German Research Foundation (DFG) through grant INST 35/1597-1 FUGG.
Author contributions
N.S.H. and F.A.R. conceptualized and designed the study, performed computational and statistical analysis, provided conceptual input in experimental planning, analyzed experiments, conceptualized the core inflammation program, and wrote the manuscript. T.E. performed computational and statistical analysis, planned experiments, processed human and mouse samples, performed and analyzed flow cytometry, and wrote the manuscript. H.-M.L., C.M.-T., P.A.N. and G.W. guided data analysis, interpretation, and validation. R.G.-B. conceptualized and designed the study, performed computational and statistical analysis, provided conceptual input in experimental planning, analyzed experiments, conceptualized the core inflammation program, supervised the entire project, and wrote the manuscript.
Peer review
Peer review information
Nature Communications thanks Monowar Aziz, David Casero and the other anonymous reviewer(s) for their contribution to the peer review of this work. A peer review file is available.
Funding
Open Access funding enabled and organized by Projekt DEAL.
Data availability
All sequencing data were obtained from publicly available repositories (Gene Expression Omnibus and European Nucleotide Archive), and all accession numbers are listed in Supplementary Data 2. Flow cytometry data have been deposited at flowrepository.org under the accessions FR-FCM-Z6U3 and FR-FCM-Z6U4. All other data are available in the article and its Supplementary files or from the corresponding author upon request. Source data are provided with this paper.
Code availability
Analysis code is publicly available on GitHub: https://github.com/rgb-lab/inflamed_neutrophil_transcriptome.
Competing interests
The authors declare no competing interests.
Footnotes
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
These authors contributed equally: Nicolaj S. Hackert, Felix A. Radtke, Tarik Exner.
Supplementary information
The online version contains supplementary material available at 10.1038/s41467-023-43573-9.
References
- 1. Brubaker DK, Lauffenburger DA. Translating preclinical models to humans. Science. 2020;367:742–743. doi: 10.1126/science.aay8086.
- 2. Howe KL, et al. Ensembl 2021. Nucleic Acids Res. 2021;49:D884–D891. doi: 10.1093/nar/gkaa942.
- 3. Cardoso-Moreira M, et al. Developmental gene expression differences between humans and mammalian models. Cell Rep. 2020;33:108308. doi: 10.1016/j.celrep.2020.108308.
- 4. Shay T, et al. Conservation and divergence in the transcriptional programs of the human and mouse immune systems. Proc. Natl Acad. Sci. USA. 2013;110:2946–2951. doi: 10.1073/pnas.1222738110.
- 5. Lin S, et al. Comparison of the transcriptional landscapes between human and mouse tissues. Proc. Natl Acad. Sci. USA. 2014;111:17224–17229. doi: 10.1073/pnas.1413624111.
- 6. Ballesteros I, et al. Co-option of neutrophil fates by tissue environments. Cell. 2020;183:1282–1297.e1218. doi: 10.1016/j.cell.2020.10.003.
- 7. Xie X, et al. Single-cell transcriptome profiling reveals neutrophil heterogeneity in homeostasis and infection. Nat. Immunol. 2020;21:1119–1133. doi: 10.1038/s41590-020-0736-z.
- 8. Grieshaber-Bouyer R, et al. Ageing and interferon gamma response drive the phenotype of neutrophils in the inflamed joint. Ann. Rheum. Dis. 2022;81:805–814.
- 9. Grieshaber-Bouyer R, et al. The neutrotime transcriptional signature defines a single continuum of neutrophils across biological compartments. Nat. Commun. 2021;12:2856. doi: 10.1038/s41467-021-22973-9.
- 10. Adrover JM, et al. Programmed ‘disarming’ of the neutrophil proteome reduces the magnitude of inflammation. Nat. Immunol. 2020;21:135–144. doi: 10.1038/s41590-019-0571-2.
- 11. Catapano M, et al. IL-36 promotes systemic IFN-I responses in severe forms of psoriasis. J. Invest. Dermatol. 2020;140:816–826.e813. doi: 10.1016/j.jid.2019.08.444.
- 12. Choi J, et al. Haemopedia RNA-seq: a database of gene expression during haematopoiesis in mice and humans. Nucleic Acids Res. 2019;47:D780–D785. doi: 10.1093/nar/gky1020.
- 13. Franco LM, et al. Immune regulation by glucocorticoids can be linked to cell type-dependent transcriptional responses. J. Exp. Med. 2019;216:384–406. doi: 10.1084/jem.20180595.
- 14. Grabowski P, et al. Proteome analysis of human neutrophil granulocytes from patients with monogenic disease using data-independent acquisition. Mol. Cell Proteomics. 2019;18:760–772. doi: 10.1074/mcp.RA118.001141.
- 15. Jiang K, Sun X, Chen Y, Shen Y, Jarvis JN. RNA sequencing from human neutrophils reveals distinct transcriptional differences associated with chronic inflammatory states. BMC Med. Genomics. 2015;8:55. doi: 10.1186/s12920-015-0128-7.
- 16. Miralda I, et al. Whole transcriptome analysis reveals that Filifactor alocis modulates TNFα-stimulated MAPK activation in human neutrophils. Front. Immunol. 2020;11:497. doi: 10.3389/fimmu.2020.00497.
- 17. Mistry P, et al. Transcriptomic, epigenetic, and functional analyses implicate neutrophil diversity in the pathogenesis of systemic lupus erythematosus. Proc. Natl Acad. Sci. USA. 2019;116:25222–25228. doi: 10.1073/pnas.1908576116.
- 18. Thomas HB, Moots RJ, Edwards SW, Wright HL. Whose gene is it anyway? The effect of preparation purity on neutrophil transcriptome studies. PLoS ONE. 2015;10:e0138982. doi: 10.1371/journal.pone.0138982.
- 19. Vecchio F, et al. Abnormal neutrophil signature in the blood and pancreas of presymptomatic and symptomatic type 1 diabetes. JCI Insight. 2018;3:e122146.
- 20. Wright HL, Lyon M, Chapman EA, Moots RJ, Edwards SW. Rheumatoid arthritis synovial fluid neutrophils drive inflammation through production of chemokines, reactive oxygen species, and neutrophil extracellular traps. Front. Immunol. 2020;11:584116. doi: 10.3389/fimmu.2020.584116.
- 21. Wright HL, Thomas HB, Moots RJ, Edwards SW. RNA-seq reveals activation of both common and cytokine-specific pathways following neutrophil priming. PLoS ONE. 2013;8:e58598. doi: 10.1371/journal.pone.0058598.
- 22. Bhalla M, et al. Transcriptome profiling reveals CD73 and age-driven changes in neutrophil responses against Streptococcus pneumoniae. Infect. Immun. 2021;89:e0025821. doi: 10.1128/IAI.00258-21.
- 23. Casulli J, et al. CD200R deletion promotes a neutrophil niche for Francisella tularensis and increases infectious burden and mortality. Nat. Commun. 2019;10:2121. doi: 10.1038/s41467-019-10156-6.
- 24. Coffelt SB, et al. IL-17-producing γδ T cells and neutrophils conspire to promote breast cancer metastasis. Nature. 2015;522:345–348. doi: 10.1038/nature14282.
- 25. Gal-Oz ST, et al. ImmGen report: sexual dimorphism in the immune system transcriptome. Nat. Commun. 2019;10:4295. doi: 10.1038/s41467-019-12348-6.
- 26. Germann M, et al. Neutrophils suppress tumor-infiltrating T cells in colon cancer via matrix metalloproteinase-mediated activation of TGFbeta. EMBO Mol. Med. 2020;12:e10681. doi: 10.15252/emmm.201910681.
- 27. Hsu BE, et al. Immature low-density neutrophils exhibit metabolic flexibility that facilitates breast cancer liver metastasis. Cell Rep. 2019;27:3902–3915.e3906. doi: 10.1016/j.celrep.2019.05.091.
- 28. Hutchins AP, Takahashi Y, Miranda-Saavedra D. Genomic analysis of LPS-stimulated myeloid cells identifies a common pro-inflammatory response but divergent IL-10 anti-inflammatory responses. Sci. Rep. 2015;5:9100. doi: 10.1038/srep09100.
- 29. Stasulli NM, et al. Spatially distinct neutrophil responses within the inflammatory lesions of pneumonic plague. mBio. 2015;6:e01530-15. doi: 10.1128/mBio.01530-15.
- 30. Yan Z, et al. Deficiency of Socs3 leads to brain-targeted EAE via enhanced neutrophil activation and ROS production. JCI Insight. 2019;5:e126520.
- 31. Zhu YP, et al. Identification of an early unipotent neutrophil progenitor with pro-tumoral activity in mouse and human bone marrow. Cell Rep. 2018;24:2329–2341.e2328. doi: 10.1016/j.celrep.2018.07.097.
- 32. de Graaf CA, et al. Haemopedia: an expression atlas of murine hematopoietic cells. Stem Cell Rep. 2016;7:571–582. doi: 10.1016/j.stemcr.2016.07.007.
- 33. Cunningham F, et al. Ensembl 2022. Nucleic Acids Res. 2022;50:D988–D995. doi: 10.1093/nar/gkab1049.
- 34. Borregaard N. Neutrophils, from marrow to microbes. Immunity. 2010;33:657–670. doi: 10.1016/j.immuni.2010.11.011.
- 35. Ng LG, Ostuni R, Hidalgo A. Heterogeneity of neutrophils. Nat. Rev. Immunol. 2019;19:255–265. doi: 10.1038/s41577-019-0141-8.
- 36. Khoyratty TE, et al. Distinct transcription factor networks control neutrophil-driven inflammation. Nat. Immunol. 2021;22:1093–1106. doi: 10.1038/s41590-021-00968-4.
- 37. Neuenfeldt F, et al. Inflammation induces pro-NETotic neutrophils via TNFR2 signaling. Cell Rep. 2022;39:110710. doi: 10.1016/j.celrep.2022.110710.
- 38. Yoon SI, et al. Structural basis of TLR5-flagellin recognition and signaling. Science. 2012;335:859–864. doi: 10.1126/science.1215584.
- 39. Langfelder P, Horvath S. WGCNA: an R package for weighted correlation network analysis. BMC Bioinformatics. 2008;9:559. doi: 10.1186/1471-2105-9-559.
- 40. Combes AJ, et al. Global absence and targeting of protective immune states in severe COVID-19. Nature. 2021;591:124–130. doi: 10.1038/s41586-021-03234-7.
- 41. Montaldo E, et al. Cellular and transcriptional dynamics of human neutrophils at steady state and upon stress. Nat. Immunol. 2022;23:1470–1483. doi: 10.1038/s41590-022-01311-1.
- 42. Buenrostro JD, Giresi PG, Zaba LC, Chang HY, Greenleaf WJ. Transposition of native chromatin for fast and sensitive epigenomic profiling of open chromatin, DNA-binding proteins and nucleosome position. Nat. Methods. 2013;10:1213–1218. doi: 10.1038/nmeth.2688.
- 43. Fischer J, et al. Safeguard function of PU.1 shapes the inflammatory epigenome of neutrophils. Nat. Immunol. 2019;20:546–558. doi: 10.1038/s41590-019-0343-z.
- 44. Hirai H, et al. C/EBPbeta is required for ‘emergency’ granulopoiesis. Nat. Immunol. 2006;7:732–739. doi: 10.1038/ni1354.
- 45. Bausch-Fluck D, et al. The in silico human surfaceome. Proc. Natl Acad. Sci. USA. 2018;115:E10988–E10997. doi: 10.1073/pnas.1808790115.
- 46. Perazzio SF, et al. Soluble CD40L is associated with increased oxidative burst and neutrophil extracellular trap release in Behcet’s disease. Arthritis Res. Ther. 2017;19:235. doi: 10.1186/s13075-017-1443-5.
- 47. Risso A. Leukocyte antimicrobial peptides: multifunctional effector molecules of innate immunity. J. Leukoc. Biol. 2000;68:785–792. doi: 10.1189/jlb.68.6.785.
- 48. Silvestre-Roig C, Fridlender ZG, Glogauer M, Scapini P. Neutrophil diversity in health and disease. Trends Immunol. 2019;40:565–583. doi: 10.1016/j.it.2019.04.012.
- 49. Grieshaber-Bouyer R, Nigrovic PA. Neutrophil heterogeneity as therapeutic opportunity in immune-mediated disease. Front. Immunol. 2019;10:346. doi: 10.3389/fimmu.2019.00346.
- 50. Zhang B, Horvath S. A general framework for weighted gene co-expression network analysis. Stat. Appl. Genet. Mol. Biol. 2005;4:17.
- 51. Storey JD, Tibshirani R. Statistical significance for genomewide studies. Proc. Natl Acad. Sci. USA. 2003;100:9440–9445. doi: 10.1073/pnas.1530509100.
- 52. Baik B, Yoon S, Nam D. Benchmarking RNA-seq differential expression analysis methods using spike-in and simulation data. PLoS ONE. 2020;15:e0232271. doi: 10.1371/journal.pone.0232271.
- 53. Jin H, et al. Antigen-presenting aged neutrophils induce CD4+ T cells to exacerbate inflammation in sepsis. J. Clin. Invest. 2023;133:e164585.
- 54. Panda SK, et al. IL-4 controls activated neutrophil FcgammaR2b expression and migration into inflamed joints. Proc. Natl Acad. Sci. USA. 2020;117:3103–3113. doi: 10.1073/pnas.1914186117.
- 55. Muendlein HI, et al. Neutrophils and macrophages drive TNF-induced lethality via TRIF/CD14-mediated responses. Sci. Immunol. 2022;7:eadd0665. doi: 10.1126/sciimmunol.add0665.
- 56. Takao K, Miyakawa T. Genomic responses in mouse models greatly mimic human inflammatory diseases. Proc. Natl Acad. Sci. USA. 2015;112:1167–1172. doi: 10.1073/pnas.1401965111.
- 57. Seok J, et al. Genomic responses in mouse models poorly mimic human inflammatory diseases. Proc. Natl Acad. Sci. USA. 2013;110:3507–3512. doi: 10.1073/pnas.1222878110.
- 58. Warren HS, et al. Mice are not men. Proc. Natl Acad. Sci. USA. 2015;112:E345. doi: 10.1073/pnas.1414857111.
- 59. Shay T, Lederer JA, Benoist C. Genomic responses to inflammation in mouse models mimic humans: we concur, apples to oranges comparisons won’t do. Proc. Natl Acad. Sci. USA. 2015;112:E346. doi: 10.1073/pnas.1416629111.
- 60. Styrt B. Species variation in neutrophil biochemistry and function. J. Leukoc. Biol. 1989;46:63–74. doi: 10.1002/jlb.46.1.63.
- 61. Mestas J, Hughes CC. Of mice and not men: differences between mouse and human immunology. J. Immunol. 2004;172:2731–2738. doi: 10.4049/jimmunol.172.5.2731.
- 62. Nauseef WM. Human neutrophils ≠ murine neutrophils: does it matter? Immunol. Rev. 2023;314:442–456.
- 63. Wilson SS, Wiens ME, Smith JG. Antiviral mechanisms of human defensins. J. Mol. Biol. 2013;425:4965–4980. doi: 10.1016/j.jmb.2013.09.038.
- 64. Xu D, Lu W. Defensins: a double-edged sword in host immunity. Front. Immunol. 2020;11:764. doi: 10.3389/fimmu.2020.00764.
- 65. Yang D, Chertov O, Oppenheim JJ. Participation of mammalian defensins and cathelicidins in anti-microbial immunity: receptors and activities of human defensins and cathelicidin (LL-37). J. Leukoc. Biol. 2001;69:691–697. doi: 10.1189/jlb.69.5.691.
- 66. Rausch PG, Moore TG. Granule enzymes of polymorphonuclear neutrophils: a phylogenetic comparison. Blood. 1975;46:913–919. doi: 10.1182/blood.V46.6.913.913.
- 67. Tecchio C, Micheletti A, Cassatella MA. Neutrophil-derived cytokines: facts beyond expression. Front. Immunol. 2014;5:508. doi: 10.3389/fimmu.2014.00508.
- 68. Cassatella MA, Ostberg NK, Tamassia N, Soehnlein O. Biological roles of neutrophil-derived granule proteins and cytokines. Trends Immunol. 2019;40:648–664. doi: 10.1016/j.it.2019.05.003.
- 69. van Rees DJ, Szilagyi K, Kuijpers TW, Matlung HL, van den Berg TK. Immunoreceptors on neutrophils. Semin. Immunol. 2016;28:94–108. doi: 10.1016/j.smim.2016.02.004.
- 70. Gray-Owen SD, Dehio C, Haude A, Grunert F, Meyer TF. CD66 carcinoembryonic antigens mediate interactions between Opa-expressing Neisseria gonorrhoeae and human polymorphonuclear phagocytes. EMBO J. 1997;16:3435–3445. doi: 10.1093/emboj/16.12.3435.
- 71. Sarantis H, Gray-Owen SD. Defining the roles of human carcinoembryonic antigen-related cellular adhesion molecules during neutrophil responses to Neisseria gonorrhoeae. Infect. Immun. 2012;80:345–358. doi: 10.1128/IAI.05702-11.
- 72. Ullrich S, Guigo R. Dynamic changes in intron retention are tightly associated with regulation of splicing factors and proliferative activity during B-cell development. Nucleic Acids Res. 2020;48:1327–1340. doi: 10.1093/nar/gkz1180.
- 73. Ewels PA, et al. The nf-core framework for community-curated bioinformatics pipelines. Nat. Biotechnol. 2020;38:276–278. doi: 10.1038/s41587-020-0439-x.
- 74. Di Tommaso P, et al. Nextflow enables reproducible computational workflows. Nat. Biotechnol. 2017;35:316–319. doi: 10.1038/nbt.3820.
- 75. Patro R, Duggal G, Love MI, Irizarry RA, Kingsford C. Salmon provides fast and bias-aware quantification of transcript expression. Nat. Methods. 2017;14:417–419. doi: 10.1038/nmeth.4197.
- 76. Love MI, et al. Tximeta: reference sequence checksums for provenance identification in RNA-seq. PLoS Comput. Biol. 2020;16:e1007664. doi: 10.1371/journal.pcbi.1007664.
- 77. Davis S, Meltzer PS. GEOquery: a bridge between the Gene Expression Omnibus (GEO) and BioConductor. Bioinformatics. 2007;23:1846–1847. doi: 10.1093/bioinformatics/btm254.
- 78. Gu Z, Eils R, Schlesner M. Complex heatmaps reveal patterns and correlations in multidimensional genomic data. Bioinformatics. 2016;32:2847–2849. doi: 10.1093/bioinformatics/btw313.
- 79. Huber W, et al. Orchestrating high-throughput genomic analysis with Bioconductor. Nat. Methods. 2015;12:115–121. doi: 10.1038/nmeth.3252.
- 80. Schroder MS, Culhane AC, Quackenbush J, Haibe-Kains B. survcomp: an R/Bioconductor package for performance assessment and comparison of survival models. Bioinformatics. 2011;27:3206–3208. doi: 10.1093/bioinformatics/btr511.
- 81. Virshup I, et al. The scverse project provides a computational ecosystem for single-cell omics data analysis. Nat. Biotechnol. 2023;41:604–606. doi: 10.1038/s41587-023-01733-8.
- 82. Wolf FA, Angerer P, Theis FJ. SCANPY: large-scale single-cell gene expression data analysis. Genome Biol. 2018;19:15. doi: 10.1186/s13059-017-1382-0.
- 83. Ritchie ME, et al. limma powers differential expression analyses for RNA-sequencing and microarray studies. Nucleic Acids Res. 2015;43:e47. doi: 10.1093/nar/gkv007.
- 84. Robinson MD, McCarthy DJ, Smyth GK. edgeR: a Bioconductor package for differential expression analysis of digital gene expression data. Bioinformatics. 2010;26:139–140. doi: 10.1093/bioinformatics/btp616.
- 85. McCarthy DJ, Chen Y, Smyth GK. Differential expression analysis of multifactor RNA-Seq experiments with respect to biological variation. Nucleic Acids Res. 2012;40:4288–4297. doi: 10.1093/nar/gks042.
- 86. Chen Y, Lun AT, Smyth GK. From reads to genes to pathways: differential expression analysis of RNA-Seq experiments using Rsubread and the edgeR quasi-likelihood pipeline. F1000Res. 2016;5:1438. doi: 10.12688/f1000research.8987.1.
- 87. Bates D, Mächler M, Bolker B, Walker S. Fitting linear mixed-effects models using lme4. J. Stat. Softw. 2015;67:1–48. doi: 10.18637/jss.v067.i01.
- 88. Love MI, Huber W, Anders S. Moderated estimation of fold change and dispersion for RNA-seq data with DESeq2. Genome Biol. 2014;15:550. doi: 10.1186/s13059-014-0550-8.
- 89. Zhu A, Ibrahim JG, Love MI. Heavy-tailed prior distributions for sequence count data: removing the noise and preserving large differences. Bioinformatics. 2019;35:2084–2092. doi: 10.1093/bioinformatics/bty895.
- 90. Liberzon A, et al. The Molecular Signatures Database (MSigDB) hallmark gene set collection. Cell Syst. 2015;1:417–425. doi: 10.1016/j.cels.2015.12.004.
- 91. Lun ATL, et al. EmptyDrops: distinguishing cells from empty droplets in droplet-based single-cell RNA sequencing data. Genome Biol. 2019;20:63. doi: 10.1186/s13059-019-1662-y.
- 92. Young MD, Behjati S. SoupX removes ambient RNA contamination from droplet-based single-cell RNA sequencing data. Gigascience. 2020;9:giaa151.
- 93. Germain PL, Lun A, Garcia Meixide C, Macnair W, Robinson MD. Doublet identification in single-cell sequencing data using scDblFinder. F1000Res. 2021;10:979. doi: 10.12688/f1000research.73600.1.
- 94. Aran D, et al. Reference-based analysis of lung single-cell sequencing reveals a transitional profibrotic macrophage. Nat. Immunol. 2019;20:163–172. doi: 10.1038/s41590-018-0276-y.
- 95. Tirosh I, et al. Dissecting the multicellular ecosystem of metastatic melanoma by single-cell RNA-seq. Science. 2016;352:189–196. doi: 10.1126/science.aad0501.
- 96. Keenan AB, et al. ChEA3: transcription factor enrichment analysis by orthogonal omics integration. Nucleic Acids Res. 2019;47:W212–W224. doi: 10.1093/nar/gkz446.
- 97. Garcia-Alonso L, Holland CH, Ibrahim MM, Turei D, Saez-Rodriguez J. Benchmark and integration of resources for the estimation of human transcription factor activities. Genome Res. 2019;29:1363–1375. doi: 10.1101/gr.240663.118.
- 98. Badia-i-Mompel P, et al. decoupleR: ensemble of computational methods to infer biological activities from omics data. Bioinform. Adv. 2022;2:vbac016.
- 99. Young MD, Wakefield MJ, Smyth GK, Oshlack A. Gene ontology analysis for RNA-seq: accounting for selection bias. Genome Biol. 2010;11:R14. doi: 10.1186/gb-2010-11-2-r14.
- 100. Crowell HL, Zanotelli VR, Chevrier S, Robinson MD, Bodenmiller B. CATALYST: Cytometry dATa anALYSis Tools. R package version 1.16.2 (2021).
- 101. Melsen JE, van Ostaijen-ten Dam MM, Lankester AC, Schilham MW, van den Akker EB. A comprehensive workflow for applying single-cell clustering and pseudotime analysis to flow cytometry data. J. Immunol. 2020;205:864–871.
- 102. Wickham H. ggplot2: Elegant Graphics for Data Analysis (Springer, 2016).
- 103. Chen EY, et al. Enrichr: interactive and collaborative HTML5 gene list enrichment analysis tool. BMC Bioinformatics. 2013;14:128. doi: 10.1186/1471-2105-14-128.
- 104. Kuleshov MV, et al. Enrichr: a comprehensive gene set enrichment analysis web server 2016 update. Nucleic Acids Res. 2016;44:W90–W97. doi: 10.1093/nar/gkw377.
- 105. Xie Z, et al. Gene set knowledge discovery with Enrichr. Curr. Protoc. 2021;1:e90. doi: 10.1002/cpz1.90.
- 106. Zhang Y, Parmigiani G, Johnson WE. ComBat-seq: batch effect adjustment for RNA-seq count data. NAR Genom. Bioinform. 2020;2:lqaa078. doi: 10.1093/nargab/lqaa078.
- 107. Ter Haar NM, et al. Reversal of sepsis-like features of neutrophils by interleukin-1 blockade in patients with systemic-onset juvenile idiopathic arthritis. Arthritis Rheumatol. 2018;70:943–956. doi: 10.1002/art.40442.