eLife. 2022 Oct 18;11:e77373. doi: 10.7554/eLife.77373

Non-invasive classification of macrophage polarisation by 2P-FLIM and machine learning

Nuno GB Neto 1,2, Sinead A O'Rourke 1,2,3, Mimi Zhang 4, Hannah K Fitzgerald 3, Aisling Dunne 3,5, Michael G Monaghan 1,2,5,6
Editors: Michael L Dustin 7, Aleksandra M Walczak 8
PMCID: PMC9578711  PMID: 36254592

Abstract

In this study, we utilise fluorescence lifetime imaging of NAD(P)H-based cellular autofluorescence as a non-invasive modality to classify two contrasting states of human macrophages by proxy of their governing metabolic state. Macrophages derived from human blood-circulating monocytes were polarised using established protocols and metabolically challenged using small molecules to validate their responding metabolic actions in extracellular acidification and oxygen consumption. Large field-of-view images of individual polarised macrophages were obtained using fluorescence lifetime imaging microscopy (FLIM). These were challenged in real time with small-molecule perturbations of metabolism during imaging. We uncovered FLIM parameters that are pronounced under the action of carbonyl cyanide-p-trifluoromethoxyphenylhydrazone (FCCP) and that strongly stratify the phenotype of polarised human macrophages; however, this performance is impacted by donor variability when analysing the data at a single-cell level. The stratification and parameters emanating from a full field-of-view and single-cell FLIM approach serve as the basis for machine learning models. Applying a random forests model, we identify three strongly governing FLIM parameters, achieving an area under the receiver operating characteristic curve (ROC-AUC) value of 0.944 and an out-of-bag (OOB) error rate of 16.67% when classifying human macrophages in a full field-of-view image. To conclude, 2P-FLIM with the integration of machine learning models is shown to be a powerful technique for analysis of both human macrophage metabolism and polarisation at full FoV and single-cell level.

Research organism: Human

Introduction

Two-photon fluorescence lifetime imaging microscopy (2P-FLIM) is a non-destructive modality that can interrogate exogenous and endogenous fluorophores. 2P-FLIM provides high spatial and temporal resolution to image murine and human cell types, 2D and 3D cell cultures, and biopsies in vitro and in vivo (Skala et al., 2007; Okkelman et al., 2019; Neto et al., 2020). It allows reduced nicotinamide adenine dinucleotide (NAD(P)H) and flavin adenine dinucleotide (FAD+) to be studied and quantified, giving insight into cellular metabolism (Lakowicz et al., 1992; Skala et al., 2007; Neto et al., 2020). The average time for a fluorophore to return to the ground state from the excited state while emitting fluorescence is known as the fluorescence lifetime. NADH and NADPH fluorescence properties are identical, and hence NAD(P)H refers to these intracellular pools combined (Huang et al., 2002). NAD(P)H has a long fluorescence lifetime when enzyme-bound and a short fluorescence lifetime when free in the cytoplasm. NAD(P)H fluorescence properties therefore offer additional information on NAD(P)H protein/enzyme binding (Lakowicz et al., 1992). 2P-FLIM facilitates the quantification of NAD(P)H fluorescence lifetimes and their respective proportions. In addition, 2P-FLIM can be used as a microscopy-based spatial approach and applied to limited cell numbers, which is beneficial in assessing limited sample numbers or validating metabolism-based therapies (Peterson et al., 2018; Shields et al., 2020).

Macrophages are essential components of the host immune response and key regulators of homeostatic function. In addition to host defence, macrophages are intimately involved in tissue homeostasis and play a key role in pathologies including heart failure, diabetes, and cancer (Mosser and Edwards, 2008). Macrophages adopt specific polarisation states, ranging in a spectrum, to accomplish various functions and mechanisms of action. Here, classically activated (M1) and alternatively activated (M2) macrophages occupy opposite ends (Gordon, 2003; Mosser and Edwards, 2008). In vitro, M1 macrophage behaviour can be evoked using IFNγ and/or microbial products such as LPS. The M1 macrophage phenotype is defined by secretion of a significant amount of pro-inflammatory cytokines (TNFα and IL-1β), enhanced endocytosis, and the ability to kill intracellular pathogens (Adams, 1989; Martinez et al., 2008).

M2 macrophage behaviour exists more heterogeneously and can be triggered by IL-4 or IL-13, IL-1β, TGF-β, IL-6, phagocytosis of apoptotic cells, or association with a tumour microenvironment, respectively, generating M2a, M2b, M2c, M2f, and tumour-associated macrophages (TAM) (Mantovani et al., 2002; Mantovani et al., 2017; Graney et al., 2020). M2 macrophages exhibit decreased expression of protein membrane markers, such as CD14 and CCR5, and increased fibronectin-1 production. In addition, M2 macrophages are also characterised by the downregulation of pro-inflammatory cytokines (Wang et al., 1998; Gratchev et al., 2001; Martinez et al., 2008). Most often, macrophage behaviour is assessed by endpoint assays using cytokine measurements, gene analysis, and staining of surface markers. However, there is a growing shift towards non-invasive modalities to speed up this process and obtain spatio-temporal analysis. Two recent examples of this include the use of Raman microscopy to map the lipidomic spatial signature of polarised macrophages (Feuerer et al., 2021), and the use of average fluorescence lifetime parameters to discern murine macrophage phenotype (Alfonso-García et al., 2016).

We use an imaging-based approach, primarily focusing on metabolic machinery characteristically employed by polarised human macrophages. Macrophage metabolism holds great potential for the next generation of therapeutics for inflammatory disease. Human macrophage function and metabolism are inextricably linked (Van den Bossche et al., 2017). IFNγ-activated human macrophages (IFNγ-M1) are primed for enhanced inflammatory responses by stabilising HIF1α levels, activation of the JAK-STAT pathway, and increased production of IL-1β, all of which are dependent on enhanced levels of glycolysis (Wang et al., 2018). Alternatively activated human anti-inflammatory macrophages, most often observed in vitro through stimulation with IL-4 (IL-4-M2), are defined by an intact tricarboxylic acid cycle (TCA), enhanced oxidative phosphorylation (OxPhos), increased fatty acid synthesis (FAS), and fatty acid oxidation (FAO) (O’Neill et al., 2016; Van den Bossche et al., 2017). In short, IFNγ-M1 macrophages are more active in glycolysis, while IL-4-M2 macrophages are more dependent on OxPhos for their energy production. When assessing metabolism and bioenergetics, most methods are based on extracellular flux assays, measurement of cellular oxygen consumption, exogenous staining, radio-labelling of nutrients, and gas chromatography-mass spectrometry (GC-MS). These methods require a substantial amount of sample processing yet are still limited by short-lived oxidative metabolites (Vivekanandan-Giri et al., 2011; Koo et al., 2016; Fall et al., 2020; Ma et al., 2020).

2P-FLIM imaging of NAD(P)H acquires five fluorescence lifetime variables (τ1, τ2, α1, α2, τavg) and one fluorescence intensity-based variable (optical redox ratio [ORR]) descriptive of the polarisation-linked metabolic state of IFNγ-M1 and IL-4-M2 macrophage phenotypes. Furthermore, additional information can be obtained from these measurements by real-time perturbation of basal metabolism and metabolic capacity. To achieve this, sequential inhibition of intracellular pathways and uncoupling of mitochondria are performed in this study via metabolic inhibitors.
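As a minimal sketch of how the derived variables relate to the fitted ones, the snippet below computes τavg from a bi-exponential fit and an ORR from two intensities. The amplitude-weighted form τavg = α1τ1 + α2τ2, the reading of τ1 as the short (free) and τ2 as the long (bound) component, and the ORR form FAD+ / (NAD(P)H + FAD+) are assumptions made for illustration; the study's own definitions are given in its equations, which are not reproduced in this section. All function names and numerical values are hypothetical.

```python
def tau_avg(tau1, tau2, alpha1, alpha2):
    """Amplitude-weighted mean lifetime of a bi-exponential decay (assumed form)."""
    total = alpha1 + alpha2
    if total <= 0:
        raise ValueError("alpha1 + alpha2 must be positive")
    # Normalise so the two fractions sum to 1 before weighting.
    return (alpha1 * tau1 + alpha2 * tau2) / total


def optical_redox_ratio(i_fad, i_nadph):
    """ORR as FAD+ / (NAD(P)H + FAD+) intensity (assumed definition)."""
    denom = i_fad + i_nadph
    if denom <= 0:
        raise ValueError("summed intensity must be positive")
    return i_fad / denom


# Hypothetical values: short (free) and long (bound) lifetimes in ns.
free_rich = tau_avg(tau1=0.4, tau2=3.4, alpha1=0.8, alpha2=0.2)
bound_rich = tau_avg(tau1=0.4, tau2=3.4, alpha1=0.6, alpha2=0.4)
print(free_rich, bound_rich)
print(optical_redox_ratio(i_fad=30.0, i_nadph=70.0))
```

Under these assumptions, a larger bound fraction (α2) raises τavg, and a larger NAD(P)H signal relative to FAD+ lowers the ORR.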

For data visualisation purposes, we applied the Uniform Manifold Approximation and Projection (UMAP) technique, in preference to principal component analysis (PCA), due to its processing speed, capability to preserve the global and local structures of the data, and ability to use non-metric distance functions (McInnes et al., 2018). For in-depth statistical analysis, machine learning algorithms are employed for classification/supervised pattern recognition (Mohri and Rostamizadeh, 2012). Machine learning algorithms have been used to characterise DNA- and RNA-binding proteins, determine genetic and epigenetic contributions of antibody repertoire diversity, and classify chronic periodontitis patients based on their immune cell response to ex vivo stimulation with ligands (Alipanahi et al., 2015; Bolland et al., 2016; Culos et al., 2020). In addition, machine learning algorithms can be used to group samples into different classes (e.g., in this study IFNγ-M1 vs. IL-4-M2 based on 2P-FLIM variables) whilst determining which variables are the most important for this task (Touw et al., 2013). For our work, we employed the random forests algorithm for the classification task due to its high prediction accuracy, robustness to outliers, and ability to obtain the relative importance of each variable (Breiman, 2001; Verikas et al., 2011). Another advantage of the random forests model is that the out-of-bag (OOB) error estimate can be used to determine the values of the hyper-parameters, with no need to set aside an independent dataset for validation. This makes random forests especially suitable for small datasets, where we do not have additional data for validating the model. The hyper-parameters of the random forests model are the number of trees (ntree) and the number of variables selected for the best split at each node (mtry). The OOB error of the final optimal random forests model serves as the estimate of the model’s prediction error.
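The out-of-bag idea can be illustrated without training a forest. In random forests, each tree is fitted on a bootstrap sample drawn with replacement, and the samples not drawn for a given tree are "out of bag" for it and can be used to score it. The sketch below only shows the sampling step; the function name, the seed, and the sizes are illustrative assumptions, not the authors' code.

```python
import random


def bootstrap_split(n_samples, rng):
    """Draw one bootstrap sample; return (in_bag indices, out_of_bag indices)."""
    in_bag = [rng.randrange(n_samples) for _ in range(n_samples)]
    drawn = set(in_bag)
    out_of_bag = [i for i in range(n_samples) if i not in drawn]
    return in_bag, out_of_bag


rng = random.Random(0)          # fixed seed so the sketch is repeatable
n_samples, n_trees = 36, 100    # illustrative sizes only

oob_fractions = []
for _ in range(n_trees):
    in_bag, out_of_bag = bootstrap_split(n_samples, rng)
    oob_fractions.append(len(out_of_bag) / n_samples)

# On average roughly (1 - 1/n)^n of the samples are left out of each tree.
mean_oob = sum(oob_fractions) / n_trees
expected = (1 - 1 / n_samples) ** n_samples
print(round(mean_oob, 3), round(expected, 3))
```

Because every sample is out of bag for roughly a third of the trees, each sample can be predicted by trees that never saw it, which is what allows the OOB error to stand in for a held-out validation set.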

The trained random forests model will be used to distinguish between the different populations of macrophages and to measure which 2P-FLIM variables are the most important for this differentiation. In our study, we used all data points obtained from both full field-of-view (FoV) and donor-specific single-cell images for training and validating the random forests model. Thus, there is no extra data available for testing our trained model. Therefore, to demonstrate the predictive ability and efficiency of the trained model, we calculated the area under the receiver operating characteristic curve (ROC-AUC), in which the specificity and sensitivity of the trained model are plotted. The ROC-AUC statistic is a performance evaluation metric for (binary) classification models. ROC curves plot the true positive rate against the false positive rate at various threshold values. The AUC measures the ability of a classifier to distinguish between two classes and is used as a summary of the ROC curve. An ROC-AUC value of 0.5 suggests no discrimination, 0.7–0.8 is considered acceptable, 0.8–0.9 is considered excellent, and more than 0.9 is considered outstanding (Mandrekar, 2010). Similar applications of machine learning methods have been explored by Walsh et al. for classification of activated T cells and Qian et al. for quality control of cardiomyocyte differentiation (Walsh et al., 2020; Qian et al., 2021).
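To make the ROC-AUC statistic concrete, the following sketch computes it directly as the probability that a randomly chosen positive sample receives a higher score than a randomly chosen negative one (ties counted as one half), which equals the area under the ROC curve. The labels and scores are hypothetical, and treating M1 as the positive class is an arbitrary choice for illustration.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison of scores.

    labels: 1 for the positive class, 0 for the negative class.
    scores: classifier scores, higher meaning "more likely positive".
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5   # ties count as half
    return wins / (len(pos) * len(neg))


# Hypothetical vote fractions for six images (1 = M1, 0 = M2).
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.6, 0.3, 0.1]
auc = roc_auc(labels, scores)
print(round(auc, 3))
```

A classifier that assigns the same score to every sample yields 0.5 under this definition, and one that ranks every positive above every negative yields 1.0.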

We hypothesise that 2P-FLIM of NAD(P)H and FAD+ provides quantitative information to evaluate and identify human-derived macrophage polarisation by proxy of their metabolism. 2P-FLIM of NAD(P)H and FAD+ has strong clinical potential fuelled by the emergence of metabolic approaches to treat disease and inflammation. To test our hypothesis, we derived human macrophages from blood-circulating monocytes and polarised them into IFNγ-M1 or IL-4-M2 macrophages. We confirmed human macrophage cytokine and gene expression-related polarisation, while human macrophage metabolic behaviour was assessed via traditional extracellular flux assays. Finally, 2P-FLIM was applied, during which real-time responses of photonic variables to metabolism-challenging small molecules were measured. This study establishes 2P-FLIM as a method to discriminate IFNγ-M1 and IL-4-M2 macrophages, a distinction that is most pronounced in the macrophages’ differential responses to treatment with carbonyl cyanide-p-trifluoromethoxyphenylhydrazone (FCCP). This allows the accurate classification of macrophage polarisation using machine learning methods, for example, the random forests model, and the evaluation of single-cell heterogeneity. Our work reports, for the first time, the classification of human macrophages by random forests using data obtained from real-time metabolic perturbations triggered during 2P-FLIM.

Results

Macrophage polarisation with IFNγ and IL-4 induces metabolic reprogramming

Human blood-derived macrophages were polarised by incubating in cell culture media containing IFNγ (M1) or IL-4 (M2) for 24 hr. Polarisation was confirmed using ELISA and RT-PCR. Cellular metabolic activity was analysed using a sequence of metabolic enzyme inhibitors, and the inhibitors’ effect was measured by extracellular acidification rate (ECAR), oxygen consumption rate (OCR), and 2P-FLIM (Figure 1).

Figure 1. Overview of experimental work.

Image created using biorender.com.

A slightly higher amount of TNFα production was obtained for IFNγ-M1 when compared with IL-4-M2 macrophages. Regarding IL-10, a statistically significantly higher production was measured in IL-4-M2 when compared with IFNγ-M1 and untreated macrophages (Figure 2A and D). For gene expression, a higher amount of CXCL9 and a statistically significant increase in CXCL10 in IFNγ-M1 macrophages were observed (Figure 2B and C). In addition, MRC1 and CCL13 were more highly expressed in IL-4-M2 macrophages when compared with untreated and IFNγ-M1 macrophages (Figure 2E and F). IFNγ-M1 macrophages have a higher dependence on aerobic glycolysis, whilst IL-4-M2 macrophages are more reliant on oxidative phosphorylation. We used ECAR and OCR to confirm this metabolic behaviour, which is linked to macrophage polarisation. For ECAR and OCR, we used four different metabolic modulators in succession, oligomycin, FCCP, rotenone + antimycin A (Rot + AA), and 2-deoxy-d-glucose (2-DG), to evaluate cellular metabolism (Figure 2). IFNγ-M1 macrophages exhibited a higher ECAR and lower OCR in response to the treatments added, whilst IL-4-M2 and untreated macrophages had lower ECAR and higher OCR (Figure 2G and J). After plotting ECAR and OCR curves, the areas under the curves (AUC) were measured to reflect basal glycolysis, maximal glycolysis, basal respiration, and maximal respiration.
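The area-under-the-curve step for the flux traces can be sketched as follows. The trapezoidal rule is an assumption, since this section does not state how the areas were integrated, and the time points, values, and segment boundaries are hypothetical.

```python
def trapezoid_auc(times, values):
    """Area under a sampled curve using the trapezoidal rule."""
    if len(times) != len(values) or len(times) < 2:
        raise ValueError("need at least two matching (time, value) points")
    area = 0.0
    for i in range(1, len(times)):
        dt = times[i] - times[i - 1]
        if dt <= 0:
            raise ValueError("times must be strictly increasing")
        area += 0.5 * (values[i] + values[i - 1]) * dt
    return area


# Hypothetical OCR trace (arbitrary units) sampled every 6 min.
times = [0, 6, 12, 18, 24, 30]
ocr = [100, 102, 98, 60, 58, 150]

basal = trapezoid_auc(times[:3], ocr[:3])          # before the first injection
after_first = trapezoid_auc(times[2:5], ocr[2:5])  # between two injections
print(basal, after_first)
```

Integrating each segment between consecutive treatments separately gives one summary value per metabolic state, which can then be compared across macrophage groups.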

Figure 2. Validation of macrophage polarisation and metabolic profiling of IFNγ-M1, IL-4-M2, and untreated (UT) macrophages.

(A, B) ELISA of inflammatory cytokine TNFα and anti-inflammatory IL-10 in IFNγ-M1, IL-4-M2, and UT macrophages. (C–F) Evaluation of CXCL9, MRC1, CXCL10, and CCL13 gene expression in IFNγ-M1, IL-4-M2, and UT macrophages. (G, J) Extracellular acidification rate (ECAR) and oxygen consumption rate (OCR) profile of IFNγ-M1, IL-4-M2, and UT macrophages when treated sequentially with oligomycin, carbonyl cyanide-p-trifluoromethoxyphenylhydrazone (FCCP), rotenone + antimycin A, and 2-deoxy-d-glucose (2-DG). (H, I, K, L) Area under the curve (AUC) values calculated from ECAR and OCR between each treatment. Data displayed as average ± SD. Statistical significance verified by one-way ANOVA with *p<0.05, **p<0.01, ***p<0.001 to show significance for N = 6 donors.

Figure 2—source data 1. ELISA, gene expression, extracellular acidification rate, and oxygen consumption rate measurements for each replicate, and details of statistical tests and chosen parameters.

Figure 2—figure supplement 1. Validation of macrophage polarisation using flow cytometry.

Primary human macrophages were left untreated (UT) or treated with IFNγ (20 ng/ml) or IL-4 (20 ng/ml) for 24 hr. Cells were stained for M1 maturation surface markers CD80 and CD86 and M2 surface markers CD163 and CD206, and analysed by flow cytometry. (A–D) Representative histograms depicting median fluorescence intensity (MFI) of surface markers. Bar graphs depict MFI as a percentage of the control (untreated cells) (N = 3). All data are represented as mean ± SEM and analysed by one-way ANOVA with Dunnett’s multiple-comparisons test.
Figure 2—figure supplement 1—source data 1. Flow cytometry surface markers measurement and details of statistical tests and chosen parameters.

IFNγ-M1 macrophages have a statistically significant increase of basal and max glycolysis when compared with untreated macrophages (Figure 2H and I). In addition, all macrophage types have similar basal respiration, whilst IL-4-M2 macrophages have a statistically significant increase in max respiration when compared with untreated macrophages (Figure 2K and L).

2P-FLIM captures metabolic shifts in IFNγ- and IL-4-treated macrophages

2P-FLIM harvests NAD(P)H and FAD+ autofluorescence to infer cellular metabolism. The enzyme-bound state of NAD(P)H is characterised by a longer fluorescence lifetime, whilst the free state has a shorter fluorescence lifetime. NAD(P)H and FAD+ fluorescence intensities are measured in order to calculate the ORR (Equation 3). These fluorescence features enable the distinction between an OxPhos-dependent and a glycolysis-dependent metabolism (Skala et al., 2007; Okkelman et al., 2019; Schaefer et al., 2019; Floudas et al., 2020; Neto et al., 2020; Walsh et al., 2020; Perottoni et al., 2021).

For this experiment, we seeded unpolarised macrophages in ibidi Luer μ-slides in static conditions and polarised the macrophages using IFNγ or IL-4 for 24 hr. These macrophages are derived from the same donors as those presented in Figure 2. Afterwards, we applied the same sequence of metabolic enzyme inhibitors used in the ECAR/OCR experiment, treating the macrophages with oligomycin, FCCP, Rot + AA, and 2-DG. During the time course of the experiments, the field of view was maintained so as to record single-cell metabolic variations (Figure 3A, Figure 3—figure supplements 1 and 2).

Figure 3. Two-photon fluorescence lifetime imaging microscopy (2P-FLIM) metabolic imaging analysis.

(A) Time-course imaging of representative (same field of view throughout) IFNγ-M1 macrophages, scale bar: 100 μm. (B, C) Average fluorescence lifetime (τavg) and optical redox ratio (ORR) values for IFNγ-M1 and IL-4-M2 macrophages of a representative donor when treated sequentially with oligomycin, carbonyl cyanide-p-trifluoromethoxyphenylhydrazone (FCCP), rotenone + antimycin A, and 2-deoxy-d-glucose (2-DG). (D) z-score heatmap of 2P-FLIM acquired data for six donors separated by macrophage polarisation and metabolic inhibitor; each individual row corresponds to an imaging field. (E) Uniform Manifold Approximation and Projection (UMAP) plot of 2P-FLIM variables after each treatment; each dot corresponds to an individual imaging field. (F) UMAP plot of 2P-FLIM variables after FCCP treatment; each dot corresponds to an individual imaging field. (G) Receiver operating characteristic curve and area under the curve values of the random forests machine learning model applied to 2P-FLIM data after FCCP treatment. (H) Importance of 2P-FLIM variables determined by mean decrease in accuracy and mean decrease in Gini of the random forests model used to classify macrophages.

Figure 3—source data 1. Fluorescence lifetime imaging microscopy (FLIM) imaging and corresponding variables measurement for each replicate, Uniform Manifold Approximation and Projection (UMAP) analysis, and machine learning (random forests) input data and coding.

Figure 3—figure supplement 1. Two-photon fluorescence lifetime imaging microscopy (2P-FLIM) of IFNγ-M1 macrophages for basal conditions, oligomycin, carbonyl cyanide-p-trifluoromethoxyphenylhydrazone (FCCP), rotenone with antimycin A, and 2-deoxy-d-glucose treatments with the concentrations detailed in the article.

All treatment images were collected at 30 min of treatment and colour-coded for different NADH FLIM variables.
Figure 3—figure supplement 2. Two-photon fluorescence lifetime imaging microscopy (2P-FLIM) of IL-4-M2 macrophages for basal conditions, oligomycin, carbonyl cyanide-p-trifluoromethoxyphenylhydrazone (FCCP), rotenone with antimycin A, and 2-deoxy-d-glucose treatments with the concentrations detailed in the article.

All treatment images were collected at 30 min of treatment and colour-coded for different NADH FLIM variables.
Figure 3—figure supplement 3. Phasor analysis of NADH fluorescence lifetime imaging microscopy (FLIM) variables in basal and carbonyl cyanide-p-trifluoromethoxyphenylhydrazone (FCCP) conditions.

(A) Basal phasor with values set at 3.4 ns (red) and 0.4 ns (green) and basal FLIM data distribution for all donors. (B) Histogram of all donors normalised to pixel intensity in basal conditions. (C) Histogram distribution of a representative donor normalised to pixel intensity in basal conditions. (D) FCCP phasor with values set at 3.4 ns (red) and 0.4 ns (green) and FCCP FLIM data distribution for all donors. (E) Histogram of all donors (N = 6) normalised to pixel intensity in FCCP conditions. (F) Histogram distribution of a representative donor normalised to pixel intensity in FCCP conditions. The Ψ value is the non-overlapping area of the two curves in each plot relative to the total area occupied by both curves. A higher Ψ value indicates better segregation between datasets.
Figure 3—figure supplement 3—source data 1. Phasor fluorescence lifetime imaging microscopy (FLIM) analysis raw histogram data.
Figure 3—figure supplement 4. 2P-FLIM variables correlation matrix.

Principal component analysis (PCA) and t-SNE data visualisation of two-photon fluorescence lifetime imaging microscopy (2P-FLIM) variables obtained from full field of view (FoV) collected during the course of metabolic treatments (A, B). PCA data visualisation of 2P-FLIM variables obtained from full FoV collected during carbonyl cyanide-p-trifluoromethoxyphenylhydrazone (FCCP) treatments and representation of 99.9% confidence ellipses derived from 3 standard deviations of the data presented for each group of samples (C). t-SNE data visualisation of 2P-FLIM variables obtained from full FoV collected during FCCP treatments (D).
Figure 3—figure supplement 5. Correlation matrix of two-photon fluorescence lifetime imaging microscopy (2P-FLIM) NAD(P)H variables with correlation values displayed.

Figure 3—figure supplement 6. Dispersion plot distribution of photons/pixel per fluorescence lifetime variable value.

(A) τ1, (B) τ2, (C) α1, (D) α2, (E) optical redox ratio (ORR) distribution based on NADH results, (F) ORR distribution based on FAD results, and (G) τavg. These values are obtained after background removal.
Figure 3—figure supplement 6—source data 1. Dispersion plot distribution of photons/pixel per fluorescence lifetime variable value raw data.

We derived the average fluorescence lifetime (τavg) and ORR of IFNγ-M1 and IL-4-M2 macrophages from the full FoV of 2P-FLIM data and observed an increasing trend of τavg in response to the application of metabolic enzyme inhibitors, with the exception of 2-DG, for which a decrease of τavg was observed for both macrophage phenotypes (Figure 3B). Regarding ORR, there is a slight decreasing trend for IFNγ-M1 macrophages, followed by a rise in ORR with the 2-DG treatment. For IL-4-M2 macrophages, there is a decrease in ORR with the oligomycin treatment, followed by stabilisation with FCCP and Rot + AA, and finally an increase elicited by 2-DG (Figure 3C). In addition, we utilised phasor analysis on the raw FLIM data. Here, we plotted the phasor maps while fixing the lifetimes at 3.4 ns and 0.4 ns as indicated in the literature (Ranjit et al., 2018). Furthermore, we generated histogram plots that showcase the distribution of the data in the phasor plot as well as the difference between IFNγ-M1 and IL-4-M2 macrophages (Figure 3—figure supplement 3). We compiled all the full FoV 2P-FLIM variables (τ1, τ2, α1, α2, τavg, and ORR) into a representative z-score heatmap, stratified according to macrophage type and metabolic inhibitors across all donors. IFNγ-M1 macrophages have lower τ1, τ2, τavg, and ORR values when compared with IL-4-M2 macrophages (Figure 3D).
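A z-score heatmap of this kind rests on standardising each variable so that variables with different units become comparable. The sketch below assumes that each variable is standardised across all imaging fields and that the population standard deviation is used; neither choice is stated in this section, and the values and names are hypothetical.

```python
from statistics import mean, pstdev


def z_scores(values):
    """Standardise one variable: subtract the mean, divide by the SD."""
    mu = mean(values)
    sd = pstdev(values)   # population SD; the sample SD is an equally valid choice
    if sd == 0:
        raise ValueError("variable is constant; z-score undefined")
    return [(v - mu) / sd for v in values]


# Hypothetical per-field values (one list entry per imaging field).
fields = {
    "tau_avg": [1.0, 1.2, 1.4, 1.6],
    "orr": [0.30, 0.28, 0.35, 0.27],
}
heatmap = {name: z_scores(col) for name, col in fields.items()}
for name, col in heatmap.items():
    print(name, [round(z, 2) for z in col])
```

After standardisation every variable has mean 0 and unit spread, so a lifetime in nanoseconds and a dimensionless ratio can share one colour scale.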

2P-FLIM variables allow the classification of IFNγ-M1 and IL-4-M2 macrophages

UMAP was applied to full FoV 2P-FLIM variables associated with IFNγ-M1 and IL-4-M2 macrophages as a data visualisation tool. The coordinates for each image were defined using a cosine distance function computed using the 2P-FLIM variables: τavg, τ1, τ2, α1, α2, and ORR (Figure 3E). UMAP representation of 2P-FLIM variables acquired during FCCP treatment provides a separation between IFNγ-M1 and IL-4-M2 macrophages (Figure 3F). This segregation is also observed when applying PCA to FCCP-treated human macrophages. However, this separation is not observed with t-SNE (Figure 3—figure supplement 4).
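The cosine distance named above can be written out for a pair of imaging fields as one minus the cosine of the angle between their feature vectors. This is only the pairwise metric, not the UMAP embedding itself; the feature values are hypothetical, and whether the variables were scaled before the distance was computed is not stated in this section.

```python
import math


def cosine_distance(u, v):
    """1 - cosine similarity between two equal-length vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        raise ValueError("cosine distance is undefined for a zero vector")
    return 1.0 - dot / (norm_u * norm_v)


# Hypothetical feature vectors: (tau_avg, tau1, tau2, alpha1, alpha2, ORR).
field_a = [1.0, 0.4, 3.4, 0.8, 0.2, 0.30]
field_b = [1.6, 0.4, 3.4, 0.6, 0.4, 0.25]

print(cosine_distance(field_a, field_b))
print(cosine_distance(field_a, [2 * x for x in field_a]))  # same direction, ~0
```

Cosine distance depends only on the direction of the feature vector and not its magnitude, so two fields whose variables differ by a common scale factor are treated as identical.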

Random forests classification models were applied to classify macrophage polarisation from 2P-FLIM variables when treated with FCCP (Table 1). To adequately train the random forests model, we removed the α1 2P-FLIM variable as it exhibits a negative correlation with the α2 variable (Figure 3—figure supplement 5). ROC curves of our dataset reveal high accuracy for predicting macrophage polarisation in the full FoV during FCCP treatment (AUC = 0.944), when using 2P-FLIM variables as predictors (Figure 3G).
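The redundancy between α1 and α2 can be illustrated with a Pearson correlation. The sketch assumes that the two fractions are normalised to sum to one for every field, which is common for bi-exponential fits but is not stated in this section; under that assumption α1 is fully determined by α2 and the correlation is exactly −1. The values are hypothetical.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mx = sum(x) / len(x)
    my = sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical fractions, assuming alpha1 + alpha2 = 1 for every field.
alpha2 = [0.20, 0.25, 0.31, 0.40, 0.36]
alpha1 = [1.0 - a for a in alpha2]

r = pearson_r(alpha1, alpha2)
print(round(r, 6))
```

When one predictor is a deterministic function of another, keeping both adds no information and can split the importance scores between them, which is one reason to drop one of the pair before training.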

Table 1. Hyper-parameters, OOB error, ROC-AUC, and confusion matrix of the trained random forests model.

Donor (no. of data points) ntree mtry OOB (%) ROC-AUC TP FP FN TN
All donors – full FoV (36) 100 2 16.67 0.944 16 2 4 14

ntree, number of trees; mtry, number of variables selected for the best split at each node; OOB, out-of-bag error; ROC-AUC, area under the receiver operating characteristic curve; TP, true positive; FP, false positive; FN, false negative; TN, true negative.

Next, the mean decrease accuracy and mean decrease Gini returned by the random forests model reveal that τ1, τ2, and τavg are the most important 2P-FLIM variables for macrophage classification and data segregation (Figure 3H). When using only τ1, τ2, and τavg as the 2P-FLIM predictors for random forests training, a high prediction accuracy was still achieved (ROC-AUC = 0.934) (Figure 3G).

2P-FLIM classification models are sensitive to cell heterogeneity

Macrophage polarisation heterogeneity at a single-cell level was evaluated within each donor in response to the FCCP treatment (Figure 4). Here, we utilised CellProfiler to evaluate and track single-cell metabolic shifts (Figure 4A). The UMAP of a representative donor suggests two clusters, one predominantly occupied by IFNγ-M1 macrophages and the other by IL-4-M2 macrophages (Figure 4B).

Figure 4. Single-cell two-photon fluorescence lifetime imaging microscopy (2P-FLIM) imaging analysis.

(A) Single-cell analysis using a custom-built CellProfiler script, scale bar = 100 µm. (B) Single-cell Uniform Manifold Approximation and Projection (UMAP) visualisation of a representative donor after carbonyl cyanide-p-trifluoromethoxyphenylhydrazone (FCCP) treatment using 2P-FLIM variables. (C) Receiver operating characteristic (ROC) curves of random forests models for classification of macrophages of all human donors used in this study. (D) Mean decrease in accuracy and (E) mean decrease in Gini of each 2P-FLIM variable returned by the random forests model.

Figure 4—source data 1. Image and single-cell segmentation CellProfiler coding, single-cell fluorescence lifetime imaging microscopy (FLIM) variables measurement for each replicate, and machine learning input data and coding.

Afterwards, we trained a random forests model for each donor. The new random forests models were trained using τavg, τ1, τ2, α2, and ORR as predictors, the measurements of which were obtained from single cells. The values of the hyper-parameters were decided according to the OOB estimate (Table 2). Subsequently, we evaluated the performance of the trained random forests models by plotting ROC curves and calculating the ROC-AUC (Figure 4C, Table 2). From Figure 4C and Table 2, it is evident that single-cell classification performance is affected by donor variability during the FCCP treatment. Donors D and E have the highest ROC-AUC values and lowest OOB errors. Finally, for the trained random forests models, we plotted the relative importance of each 2P-FLIM variable, and found that τ1, τ2, τavg, and α2 are the most important variables for classifying macrophage type at a single-cell level (Figure 4D and E).

Table 2. Hyper-parameters, OOB error, ROC-AUC, and confusion matrix of donor-specific random forests models.

Donor (no. of data points) ntree mtry OOB (%) ROC-AUC TP FP FN TN
A (170) 100 4 24.41 0.740 39 18 13 57
B (155) 150 2 38.79 0.650 28 27 18 43
C (232) 250 3 32.18 0.813 67 29 27 51
D (199) 400 4 10.07 0.968 76 5 10 58
E (179) 250 1 19.40 0.854 58 9 17 50
F (212) 250 1 26.42 0.801 101 2 40 16

ntree, number of trees; mtry, number of variables selected for the best split at each node; OOB, out-of-bag error; ROC-AUC, area under the receiver operating characteristic curve; TP, true positive; FP, false positive; FN, false negative; TN, true negative.
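The error rates in Tables 1 and 2 can be cross-checked against the confusion matrices: the misclassified fraction (FP + FN) / (TP + FP + FN + TN) reproduces each reported OOB value to two decimal places. The sketch below performs this check with the counts copied from the tables. The sensitivity and specificity it prints are derived quantities that are not reported in this section, and the function name is illustrative. Note that the counts in each row of Table 2 sum to fewer than the listed number of data points; the section does not state why.

```python
# (TP, FP, FN, TN, reported OOB %) copied from Tables 1 and 2.
rows = {
    "All donors, full FoV": (16, 2, 4, 14, 16.67),
    "A": (39, 18, 13, 57, 24.41),
    "B": (28, 27, 18, 43, 38.79),
    "C": (67, 29, 27, 51, 32.18),
    "D": (76, 5, 10, 58, 10.07),
    "E": (58, 9, 17, 50, 19.40),
    "F": (101, 2, 40, 16, 26.42),
}


def summarise(tp, fp, fn, tn):
    """Return (error %, sensitivity, specificity) from confusion-matrix counts."""
    total = tp + fp + fn + tn
    error_pct = 100.0 * (fp + fn) / total
    sensitivity = tp / (tp + fn)   # true positive rate
    specificity = tn / (tn + fp)   # true negative rate
    return error_pct, sensitivity, specificity


for name, (tp, fp, fn, tn, reported) in rows.items():
    err, sens, spec = summarise(tp, fp, fn, tn)
    print(f"{name}: error={err:.2f}% (reported {reported}), "
          f"sensitivity={sens:.2f}, specificity={spec:.2f}")
```

Looking at sensitivity and specificity separately shows whether a donor's errors are concentrated in one class, which a single error rate hides.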

Discussion

In this study, we use macrophage polarisation as a model system to demonstrate the feasibility and effectiveness of the random forests model for classification, applied to 2P-FLIM parameters influenced by metabolic perturbations (Figure 1). The polarisation of human macrophages is often crudely described as two opposite phenotypes: classical activation (IFNγ-M1 macrophages) and alternative activation (IL-4-M2 macrophages). The higher production of TNFα and low production of IL-10 in IFNγ-treated human macrophages are evidence of classical macrophage activation (Tokunaga et al., 2018; Figure 2A and B). In contrast, the increase in IL-10 production and the higher expression of MRC1 and CCL13 in human macrophages treated with IL-4 are characteristic of alternative macrophage activation (Martinez et al., 2006; Artyomov et al., 2016; Figure 2D–F). Furthermore, we performed flow cytometry of polarised macrophages using antibodies to detect CD80, CD86, CD163, and CD206 surface markers, further validating the intended macrophage polarisation (Figure 2—figure supplement 1). Extending from this, extracellular flux analysis was performed. Extracellular flux measurements revealed a higher extracellular acidification rate (ECAR) and a lower OCR for IFNγ-M1 macrophages during the different stages of metabolic inhibition when compared with IL-4-M2 macrophages (Figure 2G and J). Calculation of basal and maximal glycolysis showed that IFNγ-M1 macrophages had higher glycolytic rates (Figure 2H and I). The acidification (from ECAR) is linked with the production of lactate as a by-product of glycolysis, which reduces extracellular pH (Wang et al., 2018). Regarding IL-4-M2 macrophages, a reduced ECAR and increased OCR during the extracellular flux treatments (Figure 2G and J), together with a higher maximal respiration potential, were observed when compared with untreated macrophages. However, no difference was observed in basal respiration (Figure 2K and L). The ECAR and OCR results are consistent with a higher dependence on OxPhos as the more active metabolic machinery in IL-4-M2 macrophages. In order to fuel the upregulation of the TCA cycle and the electron transport chain (ETC), the mitochondria need to consume more oxygen at the complex IV site of the ETC (Van den Bossche et al., 2015; O’Neill et al., 2016).

We next sought to underline 2P-FLIM as a complementary and, more advantageously, non-invasive spatial evaluation of macrophage metabolism reflecting macrophage-induced polarisation. Recapitulating the sequence of metabolic enzyme inhibition which formed the basis of the extracellular flux analysis, sequential 2P-FLIM micrographs of IFNγ- or IL-4-polarised macrophages were acquired (Figure 3A). A reduced τavg, observed with IFNγ-M1 macrophages, reflects a higher relative amount of free NAD(P)H (which has characteristically short fluorescence lifetimes), indicative of glycolysis (Walsh et al., 2013; Perottoni et al., 2021). In contrast, IL-4-M2 macrophages had a higher τavg, reflective of OxPhos due to an increased proportion of longer-lifetime, protein-bound NAD(P)H (Okkelman et al., 2019; Walsh et al., 2020; Figure 3B). Varying interpretations of ORR are reported in previous studies (Varone et al., 2014; Walsh et al., 2020), and for this study, we adopt a lower ORR reflecting a higher fraction of NAD(P)H and a lower FAD+ associated with upregulation of OxPhos (Equation 3; Neto et al., 2020). IL-4-M2 macrophages exhibited a lower ORR across the treatments in contrast with IFNγ-M1 macrophages, validating a higher dependency of IL-4-M2 macrophages on OxPhos when compared with IFNγ-M1 macrophages. A heatmap overviewing the 2P-FLIM output variables was compiled, showcasing the shifts promoted by the different treatments in both phenotype-directed human macrophages (Figure 3D). With IL-4-M2 macrophages, there is a noticeable increase in τ1, τ2, and τavg and an appreciable decrease in ORR when compared with IFNγ-M1 macrophages. The trending increase of NAD(P)H fluorescence lifetimes is further exacerbated after treatment with FCCP in IL-4-M2 macrophages. Indeed, given the ability of FLIM to image and measure NAD(P)H and FAD+, others have employed small molecules to gain further information about the dynamics of metabolic machinery in states of disease and differentiation. For instance, during stem cell osteogenic differentiation, Guo et al. tested oligomycin A (a mitochondrial respiration inhibitor) as an experimental treatment in parallel to standard osteogenic media. A decrease in NAD(P)H average lifetime was calculated, reflective of reduced oxidative phosphorylation. In addition, oligomycin A treatment promoted an increase in lactate production, lower oxygen consumption, and lower osteogenic differentiation (Guo et al., 2015). The heterogeneous response towards metabolic inhibitors by IFNγ-M1 or IL-4-M2 macrophages yields clustering patterns in the two-dimensional projected space via the UMAP method (Figure 3E). When analysing each treatment individually, measurements emanating from the FCCP treatment yielded the highest segregation across all donors between IFNγ-M1 and IL-4-M2 macrophages (Figure 3F). In addition, our phasor analysis (Figure 3—figure supplement 3) also demonstrates a higher segregation of IFNγ-M1 and IL-4-M2 macrophages when using FCCP treatment. However, pre-defining values for fluorescence lifetime variables such as τ1 and τ2 is a limitation, as it can mask important predictors for the classification problem.

Higher NAD(P)H fluorescence lifetimes such as τ1, τ2, and τavg are attributable to two major factors: increases in NADPH concentrations and microenvironmental shifts (Blacker et al., 2014; Schaefer et al., 2019). FCCP functions as an uncoupler of the mitochondrial inner membrane, allowing unrestricted proton flux into the mitochondrial matrix. Consequently, this proton flux reduces mitochondrial pH, effectively increasing the fluorescence lifetime of NAD(P)H. This is in line with existing studies, in which Schaefer et al. reported an increased NAD(P)H fluorescence lifetime due to reduced mitochondrial pH elicited by FCCP treatment (Blinova et al., 2005; Schaefer et al., 2017). Another consequence of FCCP treatment is an increase in ETC activity, indicated by increased oxygen consumption of IL-4-M2 macrophages (Figure 2J). Increased ETC activity promotes an increase of NADH and FAD+, directly impacting the ORR. One would have expected the ORR to increase after the FCCP treatment, as observed in IFNγ-M1 macrophages. However, for IL-4-M2 macrophages, the ORR begins to decrease. FCCP induction of maximal ETC activity increases the demand for NADH and FADH2, causing a concomitant increase of the FAO and TCA cycle activity already upregulated in IL-4-M2 macrophages (Ludtmann et al., 2014). This demand results in a reduced pool of FAD+ and an increase of NADH, effectively reducing ORR (Ludtmann et al., 2014; Akie and Cooper, 2015; Viola et al., 2019). The higher heterogeneity in FCCP response is due to the effect of FCCP on the mitochondrial membrane and the higher basal dependence of IL-4-M2 macrophages on FAS/FAO (O’Neill et al., 2016). Blacker et al. reported the impact of FCCP in great detail: when seeking to separate NADH and NADPH fluorescence in live cells and tissues using FLIM, they inhibited mitochondrial oxidative phosphorylation in wild-type HEK293 cells using rotenone (10 μM) or uncoupled it using FCCP (1 μM) (Blacker et al., 2014).
The study of Blacker et al. provides excellent insights into NADH and NADPH dynamics and separation, and the treatment of FCCP has a similar effect on HEK293 cells as on the IL-4-M2 macrophages reported in our study. Here, FCCP uncoupling promotes higher ETC and TCA activity, impacting the ORR, and, at the same time, a decreased mitochondrial pH increases the fluorescence lifetimes of NAD(P)H (Blinova et al., 2005; Schaefer et al., 2017). Regarding IFNγ-M1 macrophages, their lower dependence on ETC, TCA, and FAO and their lower mitochondrial membrane potential, while producing most of their ATP by glycolysis, make the impact of FCCP on the mitochondria and fluorescence lifetimes less pronounced.

Random forests models were applied to study the endpoint variables of the 2P-FLIM obtained during the FCCP treatment of IFNγ-M1 macrophages and IL-4-M2 macrophages. 2P-FLIM measurements contain a minimum of 24 cells per FoV and culminate in a total of 36 data points used for random forests training. For the trained random forests model, a low OOB error estimate and a high ROC-AUC value were achieved when classifying a population of IFNγ-M1 and IL-4-M2 macrophages (Figure 3G and H, Table 1). When the random forests model was trained using only three predictors, τ1, τ2, and τavg, the ROC-AUC decreased slightly from 0.944 to 0.934 (Figure 3G). The feature importance ranking based on mean decrease in accuracy and mean decrease in Gini indicates that τ1, τ2, and τavg are the most important 2P-FLIM variables for classifying an IFNγ-M1 or an IL-4-M2 macrophage when subjected to the FCCP treatment (Figure 3H). The relative importance of the three features, τ1, τ2, and τavg, implies that these FLIM features are divergent in IFNγ-M1 and IL-4-M2 macrophages. This outcome agrees with our previous results showing that FCCP highly impacts the NAD(P)H fluorescence lifetimes in IFNγ-M1 and IL-4-M2 macrophages (Figure 3D). The high predictive power of 2P-FLIM variables for classifying cell phenotype complements the machine learning approach of Walsh et al. for classifying CD3+ and CD3+CD8+ T-cell activation (Walsh et al., 2020).
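As a hedged illustration of the two performance metrics reported above, the sketch below computes a ROC-AUC using the rank-based (Mann-Whitney) formulation and an error rate as a simple proportion. The function names, labels, and scores are hypothetical and are not taken from the study data; this is not the R code used in the study.

```python
def roc_auc(labels, scores):
    """ROC-AUC as the probability that a positive outranks a negative.

    labels: 1 for the positive class (e.g. IL-4-M2), 0 for the negative class.
    scores: classifier scores, higher meaning "more likely positive".
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    # Count positive/negative pairs ranked correctly; ties count as half.
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def error_rate(n_misclassified, n_total):
    """Misclassification rate as a fraction of all scored data points."""
    return n_misclassified / n_total


# Hypothetical scores for six fields of view (not study data).
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.6, 0.3, 0.2]
auc = roc_auc(labels, scores)   # 8 of the 9 positive/negative pairs are ordered correctly
oob = error_rate(6, 36)         # 6 errors among 36 points
```

Under this reading, an OOB error of 16.67% on 36 data points would correspond to 6 misclassified out-of-bag predictions.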

Precise regulation of macrophage activation state is key to understanding disease control, tissue homeostasis, and implant response, with this regulation shown to be directly related to macrophage intracellular metabolism (O’Neill et al., 2016). Therefore, impaired macrophage metabolism results in compromised homeostasis, as in the case of diabetes, the foreign body response to biomaterials, obesity, or cancer (Mantovani and Sica, 2010; McNelis and Olefsky, 2014). Depending on the investigation being applied, shifts observed in cellular metabolism, cytokine production, or gene expression are typically a cumulative output from a broad population. We investigated the clustering pattern of IFNγ-M1 macrophages and IL-4-M2 macrophages using single-cell data within individual donors and found that the IFNγ-M1 and IL-4-M2 macrophages appeared as separate clusters within each donor (Figure 4A and B). Classifying IFNγ-M1 and IL-4-M2 macrophages at a single-cell level yielded varied results, with four donors providing acceptable predictive performance (OOB < 31%; ROC-AUC > 0.75) (Figure 4C and D). There is some cell-to-cell variability, which could stem from the uptake capacity of FCCP and other treatments in our experiments as well as the underlying health of the donors, information on which is not available (Smiley et al., 1991; Stiebing et al., 2017). Nonetheless, for the four donors with the best classification performance, τ1, τ2, τavg, and α2 are the most relevant variables for the classification problem, which agrees with the case of full FoV analysis. Future efforts to improve single-cell classification include increasing the overall number of cells analysed to ensure a stronger classification.
In addition, by observing phenotypic cell surface markers (for example, CD80 and CD86 for IFNγ-M1 macrophages, and CD163 and CD206 for IL-4-M2 macrophages) during the imaging process by immunofluorescence or other modes of tagging, it would be possible to improve the classification efficiency (Murray et al., 2014).

2P-FLIM imaging has several advantages when compared with traditional metabolic assays and methods to classify and validate macrophage metabolism. 2P-FLIM enables spatial and temporal resolution in a non-invasive manner, allowing single-cell and cell-to-cell evaluations of cellular heterogeneity in a basal and interrogated mode. 2P-FLIM requires neither fixation nor staining of cells and can be performed in real time with only a small number of cells. In this work, we demonstrated the feasibility of using 2P-FLIM as a tool to distinguish and classify opposing human macrophage polarisation states based on cellular metabolism and fluorescence lifetime variables. Visualisation of the data showed a clear clustering pattern of IFNγ-M1 and IL-4-M2 macrophages in response to FCCP during real-time imaging in a full FoV. The excellent performance of machine learning models, applied to the data extracted from the non-invasive technique, further underlines the efficiency of this workflow. This workflow can be easily adapted to non-invasively characterise macrophage polarisation in in vivo models and in vitro multicellular organoid models. These organoid models can be developed to study foreign body interactions, biomaterial assessment, pharmaceutical research and screening, and clinical applications such as disease diagnosis.

Materials and methods

Human blood monocyte-derived macrophage isolation

This study was approved by the research ethics committee of the School of Biochemistry and Immunology, Trinity College Dublin, and was conducted in accordance with the Declaration of Helsinki. Leucocyte-enriched buffy coats from anonymous healthy donors were obtained with permission from the Irish Blood Transfusion Board (IBTS), St. James’s Hospital, Dublin. Donors provided informed written consent to the IBTS for their blood to be used for research purposes. PBMCs were isolated and differentiated into macrophages as described previously (Mahon et al., 2020). The purity of CD14+CD11b+ macrophages was assessed by flow cytometry and was routinely >95%.

Cytokine measurements

Macrophages (1 × 10⁶ cells/ml) were treated with IFNγ (20 ng/ml) or IL-4 (20 ng/ml) for 24 hr. Supernatants were harvested, and cytokine concentrations of TNFα and IL-10 were quantified by ELISA (eBioscience) according to the manufacturer’s protocol.

Real-time PCR

Macrophages (1 × 10⁶ cells/ml) were treated with IFNγ (20 ng/ml) or IL-4 (20 ng/ml) for 24 hr. RNA was extracted using High-Pure RNA Isolation Kits (Roche) and assessed for concentration and purity using the NanoDrop 2000c UV-Vis spectrophotometer. RNA was equalised and reverse transcribed using the Applied Biosystems High-Capacity cDNA reverse transcription kit. A Real-Time PCR Detection System (Bio-Rad Laboratories, CA) was used to detect mRNA expression of target genes. PCR reactions included iTaq Universal SYBR Green Supermix (Bio-Rad Laboratories), cDNA, TaqMan fast universal PCR Master Mix, and pre-designed TaqMan gene expression probes (Applied Biosystems) for CXCL9, CXCL10, MRC1, CCL13, and the housekeeping gene, 18S ribosomal RNA. The 2^(−ΔΔCT) method was used to analyse relative gene expression.
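The relative expression calculation can be sketched as follows, assuming the standard definitions ΔCT = CT(target) − CT(housekeeping) and ΔΔCT = ΔCT(treated) − ΔCT(control). The function name and the CT values are hypothetical and serve only to illustrate the arithmetic.

```python
def relative_expression(ct_target_treated, ct_ref_treated,
                        ct_target_control, ct_ref_control):
    """Fold change of a target gene by the 2^(-ddCT) method."""
    # Normalise the target gene to the housekeeping gene in each condition.
    delta_ct_treated = ct_target_treated - ct_ref_treated
    delta_ct_control = ct_target_control - ct_ref_control
    # Compare the treated condition against the control condition.
    delta_delta_ct = delta_ct_treated - delta_ct_control
    return 2.0 ** (-delta_delta_ct)


# Hypothetical CT values: a target gene and 18S rRNA in treated vs untreated cells.
fold_change = relative_expression(22.0, 10.0, 27.0, 10.0)
print(fold_change)  # 32.0, i.e. the target crosses threshold 5 cycles earlier
```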

Seahorse analyser

Macrophages were cultured at 1 × 10⁶ cells/ml for 6 days prior to re-seeding at 2 × 10⁵ cells/well in a Seahorse 96-well microplate and allowed to rest for 5 hr prior to stimulation with IFNγ (20 ng/ml) or IL-4 (20 ng/ml) for 24 hr. The Seahorse cartridge plate was hydrated with XF calibrant fluid and incubated in a non-CO2 incubator at 37°C for a minimum of 8 hr prior to use. Thirty minutes prior to placement into the Seahorse XF/XFe analyser, cell culture medium was replaced with complete XF assay medium (Seahorse Biosciences, supplemented with 10 mM glucose, 1 mM sodium pyruvate, 2 mM L-glutamine, and pH adjusted to 7.4) and incubated in a non-CO2 incubator at 37°C. Blank wells (XF assay medium only) were prepared without cells for subtracting the background OCR and ECAR during analysis. Oligomycin (1 mM, Cayman Chemicals), FCCP (1 mM, Santa Cruz Biotechnology), Rot (500 nM), AA (500 nM), and 2-DG (25 mM; all Sigma-Aldrich) were prepared in XF assay medium, loaded into the appropriate injection ports on the cartridge plate, and incubated for 10 min in a non-CO2 incubator at 37°C. OCR and ECAR were measured over time with sequential injections of oligomycin, FCCP, Rot and AA, and 2-DG. Analysis of results was performed using Wave software (Agilent Technologies). The rates of basal glycolysis, maximal glycolysis, basal respiration, and maximal respiration were calculated as detailed in the manufacturer’s protocol and supplied in Table 3.

Table 3. Calculation of basal glycolysis, max glycolysis, basal respiration, and max respiration for the ECAR/OCR experimental setup.

| Rate | Calculation |
| --- | --- |
| Basal glycolysis | Average ECAR values prior to oligomycin treatment − non-glycolytic ECAR |
| Max glycolysis | Average ECAR values after oligomycin and before FCCP treatment |
| Basal respiration | Average OCR values prior to oligomycin treatment − nonmitochondrial OCR |
| Max respiration | Average OCR values after FCCP and before rotenone/antimycin A treatment |

ECAR, extracellular acidification rate; OCR, oxygen consumption rate; FCCP, carbonyl cyanide-p-trifluoromethoxyphenylhydrazone.
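The calculations in Table 3 can be sketched as below. This is a minimal illustration, not the Wave software implementation: the phase names, the readings, and the function name are hypothetical, and the sketch assumes that non-glycolytic ECAR is taken from readings after 2-DG and nonmitochondrial OCR from readings after Rot/AA, which Table 3 does not state explicitly.

```python
from statistics import mean


def seahorse_rates(ecar, ocr):
    """Compute the four Table 3 rates.

    ecar, ocr: dicts mapping a phase name ("basal", "oligomycin", "fccp",
    "rot_aa", "2dg") to the list of readings recorded in that phase.
    """
    non_glycolytic_ecar = mean(ecar["2dg"])   # assumption: ECAR after 2-DG
    non_mito_ocr = mean(ocr["rot_aa"])        # assumption: OCR after Rot/AA
    return {
        "basal_glycolysis": mean(ecar["basal"]) - non_glycolytic_ecar,
        "max_glycolysis": mean(ecar["oligomycin"]),
        "basal_respiration": mean(ocr["basal"]) - non_mito_ocr,
        "max_respiration": mean(ocr["fccp"]),
    }


# Hypothetical readings for a single well (arbitrary units, not study data).
ecar = {"basal": [30, 32, 31], "oligomycin": [50, 52, 51], "fccp": [48, 49, 50],
        "rot_aa": [45, 46, 47], "2dg": [10, 10, 10]}
ocr = {"basal": [100, 102, 101], "oligomycin": [60, 61, 62], "fccp": [200, 204, 202],
       "rot_aa": [20, 20, 20], "2dg": [18, 19, 20]}
rates = seahorse_rates(ecar, ocr)
```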

Two-photon fluorescence lifetime imaging microscopy (2P-FLIM)

2P-FLIM was performed on 24 hr-polarised macrophages seeded in ibidi Luer μ-slides with a 0.8 mm channel height. 2P-FLIM was achieved using a custom upright (Olympus BX61WI) laser multiphoton microscopy system equipped with a pulsed (80 MHz) titanium:sapphire laser (Chameleon Ultra, Coherent, USA), water-immersion 25× objective (Olympus, 1.05 NA), and temperature-controlled stage at 37°C. Two-photon excitation of NAD(P)H and FAD+ fluorescence was performed at excitation wavelengths of 760 and 800 nm, respectively. Several studies have reported that two-photon excitation in the range of 720–760 nm can be used to selectively excite NAD(P)H, while for FAD+ an excitation wavelength above 900 nm is commonly used (Huang et al., 2002; Levitt et al., 2011). 458/64 nm and 520/35 nm bandpass filters were used to isolate the NAD(P)H and FAD+ fluorescence emissions, respectively, based on their emission spectra (Huang et al., 2002).

512 × 512 pixel images were acquired with a pixel dwell time of 3.81 μs and 30 s collection time. A PicoHarp 300 TCSPC system operating in the time-tagged mode coupled with a photomultiplier detector assembly (PMA) hybrid detector (PicoQuant GmbH, Germany) was used for fluorescence decay measurements, yielding 256 time bins per pixel. TCSPC requires a defined ‘start’ signal, provided by the electronics steering the laser pulse or a photodiode, and a defined ‘stop’ signal, realised by detection with single-photon sensitive detectors. The measurement of this time delay is repeated many times to account for the statistical variance of the fluorophore’s emission. For more detailed information, the reader is referred elsewhere (Wahl et al., 2013).
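The start-stop principle described above can be sketched as a histogram of photon delays. This is a minimal illustration under stated assumptions: the 80 MHz repetition rate and the 256 bins come from the text, whereas the assumption that the bins tile the full 12.5 ns pulse period, the simulated 2 ns lifetime, and all names are hypothetical.

```python
import random

REP_RATE_HZ = 80e6                   # laser repetition rate stated in the text
N_BINS = 256                         # time bins per pixel stated in the text
PERIOD_NS = 1e9 / REP_RATE_HZ        # 12.5 ns between consecutive pulses
BIN_WIDTH_NS = PERIOD_NS / N_BINS    # assumption: bins tile the full period


def tcspc_histogram(delays_ns):
    """Bin start-stop delays (ns) into a decay histogram for one pixel."""
    hist = [0] * N_BINS
    for delay in delays_ns:
        # A delay is measured relative to the most recent laser pulse.
        index = int((delay % PERIOD_NS) / BIN_WIDTH_NS)
        hist[min(index, N_BINS - 1)] += 1   # guard against float rounding
    return hist


# Simulate 10,000 photons from a single-exponential emitter (hypothetical 2 ns).
rng = random.Random(0)
delays = [rng.expovariate(1 / 2.0) for _ in range(10000)]
hist = tcspc_histogram(delays)
```

Repeating the measurement many times, as the text notes, is what fills this histogram well enough for the decay shape to be fitted.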

Fluorescence lifetime images with their associated decay curves for NAD(P)H were obtained with a minimum of 1 × 10⁶ photons peak. After imaging, the background noise was removed. This was performed by defining regions of interest (ROI) of the cells on the 2P-FLIM image. Consequently, lower values of photons/pixels are removed from analysis improving the signal-to-noise ratio (Figure 3—figure supplement 6).

The decay curve was generated and fitted with a double-exponential decay without including the instrument response function (IRF) (Equation 1).

$$I(t) = I_0\left[\alpha_1 e^{-t/\tau_1} + \alpha_2 e^{-t/\tau_2}\right] + C \tag{1}$$

I(t) represents the fluorescence intensity measured at time t after laser excitation; α1 and α2 represent the fractions of the overall signal contributed by the τ1 and τ2 components, respectively, as paired in Equation 1. τ1 and τ2 are the long and short lifetime components, respectively; C corresponds to background light. A chi-square statistical test was used to evaluate the goodness of the multiexponential fit to the raw fluorescence decay data. In this study, all of the fluorescence lifetime fitting values with χ2 < 1.3 were considered as ‘good’ fits. For NAD(P)H, the double exponential decay was used to differentiate between the protein-bound (τ1) and free (τ2) NAD(P)H. The average fluorescence lifetime was calculated using Equation 2.

$$\tau_{avg} = \frac{\tau_1 \alpha_1 + \tau_2 \alpha_2}{\alpha_1 + \alpha_2} \tag{2}$$
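Equations 1 and 2, together with the χ2 criterion, can be illustrated with the sketch below. It assumes that the reported χ2 is a reduced chi-square with Poisson weighting and that the fit has five independent parameters; neither is stated in the text. All names and numerical values are hypothetical, and no fitting routine is included.

```python
import math


def biexp_decay(t, i0, a1, tau1, a2, tau2, c):
    """Equation 1: double-exponential decay with a constant background."""
    return i0 * (a1 * math.exp(-t / tau1) + a2 * math.exp(-t / tau2)) + c


def tau_avg(a1, tau1, a2, tau2):
    """Equation 2: amplitude-weighted average lifetime."""
    return (tau1 * a1 + tau2 * a2) / (a1 + a2)


def reduced_chi_square(observed, expected, n_params):
    """Poisson-weighted chi-square per degree of freedom (assumed definition)."""
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    return chi2 / (len(observed) - n_params)


# Hypothetical parameters, following the text's labelling of tau1 as the long component.
A1, TAU1_NS = 0.3, 2.5
A2, TAU2_NS = 0.7, 0.4
times_ns = [i * 0.05 for i in range(200)]
model = [biexp_decay(t, 1000.0, A1, TAU1_NS, A2, TAU2_NS, 5.0) for t in times_ns]
# Synthetic "data": every bin deviates from the model by one Poisson standard deviation.
data = [m + (1 if i % 2 else -1) * math.sqrt(m) for i, m in enumerate(model)]

average_lifetime_ns = tau_avg(A1, TAU1_NS, A2, TAU2_NS)      # about 1.03 ns
chi2_red = reduced_chi_square(data, model, n_params=5)       # about 1.03
is_good_fit = chi2_red < 1.3                                 # threshold from the text
```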

Intensity-based images of NAD(P)H and FAD+ were acquired, and their ratio was calculated using Equation 3 to obtain the ORR.

$$\mathrm{ORR} = \frac{\mathrm{FAD^{+}}}{\mathrm{NAD(P)H}} \tag{3}$$
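A minimal sketch of Equation 3 is given below. It assumes that the ratio is taken between the summed FAD+ and summed NAD(P)H intensities over the same set of pixels; whether the study used summed, mean, or pixel-wise ratios is not stated. The function name and intensity values are hypothetical.

```python
def optical_redox_ratio(fad_pixels, nadph_pixels):
    """Equation 3: FAD+ intensity divided by NAD(P)H intensity.

    Both arguments are flat lists of intensities from the same pixels
    (for example, the pixels inside one region of interest).
    """
    if len(fad_pixels) != len(nadph_pixels):
        raise ValueError("intensity lists must cover the same pixels")
    total_nadph = sum(nadph_pixels)
    if total_nadph <= 0:
        raise ValueError("NAD(P)H intensity must be positive")
    return sum(fad_pixels) / total_nadph


# Hypothetical intensities (arbitrary units, not study data).
orr_low = optical_redox_ratio([10, 12, 8], [40, 44, 36])     # 30 / 120
orr_high = optical_redox_ratio([30, 28, 32], [40, 44, 36])   # 90 / 120
```

Under the convention adopted in this study, the lower of the two values would indicate relatively more NAD(P)H and less FAD+.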

From the images acquired using 2P-FLIM, single-cell analysis was performed using a custom-made script in CellProfiler (McQuin et al., 2018). The single-cell analysis was conducted in a similar way to the global 2P-FLIM analysis.

Macrophage classification and machine learning

UMAP was used for data visualisation and exploratory analysis of the clustering patterns in the 2P-FLIM imaging datasets for both global and single-cell analysis (McInnes et al., 2018). UMAP was implemented in Python, and the plots were produced in GraphPad. The random forests model was applied to classify IFNγ-M1 and IL-4-M2 macrophages in both full FoV and single-cell donor-specific approaches. Random forests classification was implemented in R. Random forests hyper-parameters include the number of decision trees in the forest, the number of features considered by each tree when splitting a node, and the maximal depth of each tree. The maximal depth of each tree was controlled by setting the maximal number of terminal nodes in each tree to 8. The values of the other two hyper-parameters were determined through grid search according to the OOB error estimate (Tables 1 and 2). The α1 variable was removed from the random forests model due to its deterministic relationship with the α2 variable (Figure 3—figure supplement 5). ROCs were plotted, and the AUC values were calculated. For the full FoV approach, the training dataset was also used for the global analysis due to the limited data size. For the single-cell donor approach, by contrast, the overall dataset was divided into 75% training and 25% testing datasets. In addition, random forests feature selection was utilised to evaluate the weight of each 2P-FLIM variable to determine its relative importance in macrophage classification for both the overall and the single-cell datasets. Support vector machine (SVM) and logistic regression models were also implemented for comparison with the random forests model and yielded similar performance, with 87.5% accuracy when dividing the full FoV datapoints into 80% training and 20% testing datasets.
However, neither the SVM model nor the logistic model is able to provide information on the relative importance of the predictors, and they both require an independent dataset for model validation.
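To illustrate how an OOB error estimate avoids the need for an independent validation set, the sketch below bags single-threshold decision stumps and scores each data point only with the stumps whose bootstrap sample did not contain it. This is a deliberately simplified stand-in, not the random forests model fitted in R: it uses depth-1 trees and no per-split feature subsampling, and all names and data are hypothetical.

```python
import random


def fit_stump(rows, labels):
    """Fit a depth-1 tree: the single feature threshold with fewest errors."""
    majority = int(2 * sum(labels) >= len(labels))
    # Start from a constant majority-class predictor as the fallback.
    best = (sum(y != majority for y in labels), None, None, majority)
    for f in range(len(rows[0])):
        values = sorted({r[f] for r in rows})
        for lo, hi in zip(values, values[1:]):
            thr = (lo + hi) / 2   # candidate split halfway between neighbours
            for above in (0, 1):
                errors = sum((above if r[f] > thr else 1 - above) != y
                             for r, y in zip(rows, labels))
                if errors < best[0]:
                    best = (errors, f, thr, above)
    _, f, thr, above = best
    if f is None:
        return lambda r: above
    return lambda r: above if r[f] > thr else 1 - above


def bagged_oob_error(rows, labels, n_trees=50, seed=0):
    """OOB error of a bagged stump ensemble with majority voting."""
    rng = random.Random(seed)
    n = len(rows)
    votes = [[0, 0] for _ in range(n)]
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]   # bootstrap sample
        in_bag = set(idx)
        stump = fit_stump([rows[i] for i in idx], [labels[i] for i in idx])
        for i in range(n):
            if i not in in_bag:                      # out-of-bag points only
                votes[i][stump(rows[i])] += 1
    scored = [(v, y) for v, y in zip(votes, labels) if sum(v) > 0]
    wrong = sum((v[1] > v[0]) != (y == 1) for v, y in scored)
    return wrong / len(scored)


# Hypothetical, well-separated toy data: (lifetime-like feature, constant feature).
rows = [(1.0 + 0.1 * i, 5.0) for i in range(6)] + [(3.0 + 0.1 * i, 5.0) for i in range(6)]
labels = [0] * 6 + [1] * 6
oob_error = bagged_oob_error(rows, labels)
```

Because every point is scored only by models that never saw it, the OOB estimate behaves like a built-in validation step, which is the property the paragraph above contrasts with SVM and logistic regression.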

Statistics

Each experiment was performed in at least four healthy donors (defined by N) with 3–4 technical replicates run for each experiment (defined by n), depending on the assay type. Normality tests were performed to assess whether the data were normally distributed. For ELISA and PCR data, one-way ANOVA and Tukey’s test were used for comparing more than two groups. For Seahorse data, repeated-measures one-way ANOVA was used to account for the variance in basal metabolism across donors. All statistical analyses were performed on GraphPad Prism 9.00 (GraphPad Software).

Acknowledgements

NN is supported by a Trinity College Dublin, Provost’s PhD Award, and the TCD FLIM core unit directed by MM is supported by a SFI Infrastructure Programme: Category D Opportunistic Funds Call (16/RI/3403). This work was also partially supported by EPSRC and SFI Centre for Doctoral Training in Engineered Tissues for Discovery, Industry and Medicine, Grant Number EP/S02347X/1 and in part by a grant from Science Foundation Ireland (SFI) and the European Regional Development Fund (ERDF) under grant number 13/RC/2073_P2.

Funding Statement

The funders had no role in study design, data collection and interpretation, or the decision to submit the work for publication.

Contributor Information

Michael G Monaghan, Email: monaghmi@tcd.ie.

Michael L Dustin, University of Oxford, United Kingdom.

Aleksandra M Walczak, CNRS LPENS, France.

Funding Information

This paper was supported by the following grants:

  • Science Foundation Ireland 16/RI/3403 to Michael G Monaghan.

  • Science Foundation Ireland EP/S02347X/1 to Michael G Monaghan.

  • Science Foundation Ireland 13/RC/2073_P2 to Michael G Monaghan, Aisling Dunne.

Additional information

Competing interests

No competing interests declared.

Author contributions

Conceptualization, Data curation, Software, Formal analysis, Validation, Investigation, Visualization, Methodology, Writing – original draft, Writing – review and editing.

Data curation, Formal analysis, Validation, Investigation, Methodology, Writing – review and editing.

Resources, Software, Formal analysis, Visualization, Methodology, Writing – review and editing.

Resources, Visualization.

Resources, Writing – review and editing.

Conceptualization, Resources, Data curation, Software, Supervision, Funding acquisition, Visualization, Methodology, Writing – original draft, Project administration, Writing – review and editing.

Additional files

Transparent reporting form

Data availability

All data generated or analysed during this study are included in the manuscript and supporting files (uploaded as source data).

References

  1. Adams DO. Molecular interactions in macrophage activation. Immunology Today. 1989;10:33–35. doi: 10.1016/0167-5699(89)90298-3.
  2. Akie TE, Cooper MP. Determination of fatty acid oxidation and lipogenesis in mouse primary hepatocytes. Journal of Visualized Experiments. 2015;5:e52982. doi: 10.3791/52982.
  3. Alfonso-García A, Smith TD, Datta R, Luu TU, Gratton E, Potma EO, Liu WF. Label-Free identification of macrophage phenotype by fluorescence lifetime imaging microscopy. Journal of Biomedical Optics. 2016;21:46005. doi: 10.1117/1.JBO.21.4.046005.
  4. Alipanahi B, Delong A, Weirauch MT, Frey BJ. Predicting the sequence specificities of DNA- and RNA-binding proteins by deep learning. Nature Biotechnology. 2015;33:831–838. doi: 10.1038/nbt.3300.
  5. Artyomov MN, Sergushichev A, Schilling JD. Integrating immunometabolism and macrophage diversity. Seminars in Immunology. 2016;28:417–424. doi: 10.1016/j.smim.2016.10.004.
  6. Blacker TS, Mann ZF, Gale JE, Ziegler M, Bain AJ, Szabadkai G, Duchen MR. Separating NADH and NADPH fluorescence in live cells and tissues using FLIM. Nature Communications. 2014;5:3936. doi: 10.1038/ncomms4936.
  7. Blinova K, Carroll S, Bose S, Smirnov AV, Harvey JJ, Knutson JR, Balaban RS. Distribution of mitochondrial NADH fluorescence lifetimes: steady-state kinetics of matrix NADH interactions. Biochemistry. 2005;44:2585–2594. doi: 10.1021/bi0485124.
  8. Bolland DJ, Koohy H, Wood AL, Matheson LS, Krueger F, Stubbington MJT, Baizan-Edge A, Chovanec P, Stubbs BA, Tabbada K, Andrews SR, Spivakov M, Corcoran AE. Two mutually exclusive local chromatin states drive efficient V(D)J recombination. Cell Reports. 2016;15:2475–2487. doi: 10.1016/j.celrep.2016.05.020.
  9. Breiman L. Random forests. Machine Learning. 2001;45:5–32. doi: 10.1023/A:1010933404324.
  10. Culos A, Tsai AS, Stanley N, Becker M, Ghaemi MS, McIlwain DR, Fallahzadeh R, Tanada A, Nassar H, Espinosa C, Xenochristou M, Ganio E, Peterson L, Han X, Stelzer IA, Ando K, Gaudilliere D, Phongpreecha T, Marić I, Chang AL, Shaw GM, Stevenson DK, Bendall S, Davis KL, Fantl W, Nolan GP, Hastie T, Tibshirani R, Angst MS, Gaudilliere B, Aghaeepour N. Integration of mechanistic immunological knowledge into a machine learning pipeline improves predictions. Nature Machine Intelligence. 2020;2:619–628. doi: 10.1038/s42256-020-00232-8.
  11. Fall F, Lamy E, Brollo M, Naline E, Lenuzza N, Thévenot E, Devillier P, Grassin-Delyle S. Metabolic reprograming of LPS-stimulated human lung macrophages involves tryptophan metabolism and the aspartate-arginosuccinate shunt. PLOS ONE. 2020;15:e0230813. doi: 10.1371/journal.pone.0230813.
  12. Feuerer N, Marzi J, Brauchle EM, Carvajal Berrio DA, Billing F, Weiss M, Jakobi M, Schneiderhan-Marra N, Shipp C, Schenke-Layland K. Lipidome profiling with Raman microspectroscopy identifies macrophage response to surface topographies of implant materials. PNAS. 2021;118:e2113694118. doi: 10.1073/pnas.2113694118.
  13. Floudas A, Neto N, Marzaioli V, Murray K, Moran B, Monaghan MG, Low C, Mullan RH, Rao N, Krishna V, Nagpal S, Veale DJ, Fearon U. Pathogenic, glycolytic PD-1+ B cells accumulate in the hypoxic RA joint. JCI Insight. 2020;5:21. doi: 10.1172/jci.insight.139032.
  14. Gordon S. Alternative activation of macrophages. Nature Reviews. Immunology. 2003;3:23–35. doi: 10.1038/nri978.
  15. Graney PL, Ben-Shaul S, Landau S, Bajpai A, Singh B, Eager J, Cohen A, Levenberg S, Spiller KL. Macrophages of diverse phenotypes drive vascularization of engineered tissues. Science Advances. 2020;6:eaay6391. doi: 10.1126/sciadv.aay6391.
  16. Gratchev A, Guillot P, Hakiy N, Politz O, Orfanos CE, Schledzewski K, Goerdt S. Alternatively activated macrophages differentially express fibronectin and its splice variants and the extracellular matrix protein betaig-h3. Scandinavian Journal of Immunology. 2001;53:386–392. doi: 10.1046/j.1365-3083.2001.00885.x.
  17. Guo H-W, Yu J-S, Hsu S-H, Wei Y-H, Lee OK, Dong C-Y, Wang H-W. Correlation of NADH fluorescence lifetime and oxidative phosphorylation metabolism in the osteogenic differentiation of human mesenchymal stem cell. Journal of Biomedical Optics. 2015;20:017004. doi: 10.1117/1.JBO.20.1.017004.
  18. Huang S, Heikal AA, Webb WW. Two-Photon fluorescence spectroscopy and microscopy of NAD(P)H and flavoprotein. Biophysical Journal. 2002;82:2811–2825. doi: 10.1016/S0006-3495(02)75621-X.
  19. Koo SJ, Chowdhury IH, Szczesny B, Wan X, Garg NJ. Macrophages promote oxidative metabolism to drive nitric oxide generation in response to Trypanosoma cruzi. Infection and Immunity. 2016;84:3527–3541. doi: 10.1128/IAI.00809-16.
  20. Lakowicz JR, Szmacinski H, Nowaczyk K, Johnson ML. Fluorescence lifetime imaging of free and protein-bound NADH. PNAS. 1992;89:1271–1275. doi: 10.1073/pnas.89.4.1271.
  21. Levitt JM, McLaughlin-Drubin ME, Münger K, Georgakoudi I, Egles C. Automated biochemical, morphological, and organizational assessment of precancerous changes from endogenous two-photon fluorescence images. PLOS ONE. 2011;6:e24765. doi: 10.1371/journal.pone.0024765.
  22. Ludtmann MHR, Angelova PR, Zhang Y, Abramov AY, Dinkova-Kostova AT. Nrf2 affects the efficiency of mitochondrial fatty acid oxidation. The Biochemical Journal. 2014;457:415–424. doi: 10.1042/BJ20130863.
  23. Ma J, Wei K, Liu J, Tang K, Zhang H, Zhu L, Chen J, Li F, Xu P, Chen J, Liu J, Fang H, Tang L, Wang D, Zeng L, Sun W, Xie J, Liu Y, Huang B. Glycogen metabolism regulates macrophage-mediated acute inflammatory responses. Nature Communications. 2020;11:1769. doi: 10.1038/s41467-020-15636-8.
  24. Mahon OR, Kelly DJ, McCarthy GM, Dunne A. Osteoarthritis-Associated basic calcium phosphate crystals alter immune cell metabolism and promote M1 macrophage polarization. Osteoarthritis and Cartilage. 2020;28:603–612. doi: 10.1016/j.joca.2019.10.010.
  25. Mandrekar JN. Receiver operating characteristic curve in diagnostic test assessment. Journal of Thoracic Oncology. 2010;5:1315–1316. doi: 10.1097/JTO.0b013e3181ec173d.
  26. Mantovani A, Sozzani S, Locati M, Allavena P, Sica A. Macrophage polarization: tumor-associated macrophages as a paradigm for polarized M2 mononuclear phagocytes. Trends in Immunology. 2002;23:549–555. doi: 10.1016/s1471-4906(02)02302-5.
  27. Mantovani A, Sica A. Macrophages, innate immunity and cancer: balance, tolerance, and diversity. Current Opinion in Immunology. 2010;22:231–237. doi: 10.1016/j.coi.2010.01.009.
  28. Mantovani A, Marchesi F, Malesci A, Laghi L, Allavena P. Tumour-Associated macrophages as treatment targets in oncology. Nature Reviews. Clinical Oncology. 2017;14:399–416. doi: 10.1038/nrclinonc.2016.217.
  29. Martinez FO, Gordon S, Locati M, Mantovani A. Transcriptional profiling of the human monocyte-to-macrophage differentiation and polarization: new molecules and patterns of gene expression. Journal of Immunology. 2006;177:7303–7311. doi: 10.4049/jimmunol.177.10.7303.
  30. Martinez FO, Sica A, Mantovani A, Locati M. Macrophage activation and polarization. Frontiers in Bioscience. 2008;13:453–461. doi: 10.2741/2692.
  31. McInnes L, Healy J, Saul N, Großberger L. UMAP: uniform manifold approximation and projection. Journal of Open Source Software. 2018;3:861. doi: 10.21105/joss.00861.
  32. McNelis JC, Olefsky JM. Macrophages, immunity, and metabolic disease. Immunity. 2014;41:36–48. doi: 10.1016/j.immuni.2014.05.010.
  33. McQuin C, Goodman A, Chernyshev V, Kamentsky L, Cimini BA, Karhohs KW, Doan M, Ding L, Rafelski SM, Thirstrup D, Wiegraebe W, Singh S, Becker T, Caicedo JC, Carpenter AE. CellProfiler 3.0: next-generation image processing for biology. PLOS Biology. 2018;16:e2005970. doi: 10.1371/journal.pbio.2005970. [DOI] [PMC free article] [PubMed] [Google Scholar]
  34. Mohri M, Rostamizadeh A. Foundations of Machine Learning. MIT press; 2012. [Google Scholar]
  35. Mosser DM, Edwards JP. Exploring the full spectrum of macrophage activation. Nature Reviews. Immunology. 2008;8:958–969. doi: 10.1038/nri2448. [DOI] [PMC free article] [PubMed] [Google Scholar]
  36. Murray PJ, Allen JE, Biswas SK, Fisher EA, Gilroy DW, Goerdt S, Gordon S, Hamilton JA, Ivashkiv LB, Lawrence T, Locati M, Mantovani A, Martinez FO, Mege J-L, Mosser DM, Natoli G, Saeij JP, Schultze JL, Shirey KA, Sica A, Suttles J, Udalova I, van Ginderachter JA, Vogel SN, Wynn TA. Macrophage activation and polarization: Nomenclature and experimental guidelines. Immunity. 2014;41:14–20. doi: 10.1016/j.immuni.2014.06.008. [DOI] [PMC free article] [PubMed] [Google Scholar]
  37. Neto N, Dmitriev RI, Monaghan MG. In: Cell Engineering and Regeneration. Gimble M, Presen D, Oreffo ROC, Wolbank S, Redl H, editors. Springer International Publishing; 2020. Seeing is believing: noninvasive microscopic imaging modalities for tissue engineering and regenerative medicine; pp. 599–638. [DOI] [Google Scholar]
  38. Okkelman IA, Neto N, Papkovsky DB, Monaghan MG, Dmitriev RI. A deeper understanding of intestinal organoid metabolism revealed by combining fluorescence lifetime imaging microscopy (FLIM) and extracellular flux analyses. Redox Biology. 2019;30:101420. doi: 10.1016/j.redox.2019.101420. [DOI] [PMC free article] [PubMed] [Google Scholar]
  39. O’Neill LAJ, Kishton RJ, Rathmell J. A guide to immunometabolism for immunologists. Nature Reviews. Immunology. 2016;16:553–565. doi: 10.1038/nri.2016.70. [DOI] [PMC free article] [PubMed] [Google Scholar]
  40. Perottoni S, Neto NGB, Di Nitto C, Dmitriev RI, Raimondi MT, Monaghan MG. Intracellular label-free detection of mesenchymal stem cell metabolism within a perivascular niche-on-a-chip. Lab on a Chip. 2021;21:1395–1408. doi: 10.1039/d0lc01034k. [DOI] [PubMed] [Google Scholar]
  41. Peterson KR, Cottam MA, Kennedy AJ, Hasty AH. Macrophage-targeted therapeutics for metabolic disease. Trends in Pharmacological Sciences. 2018;39:536–546. doi: 10.1016/j.tips.2018.03.001. [DOI] [PMC free article] [PubMed] [Google Scholar]
  42. Qian T, Heaster TM, Houghtaling AR, Sun K, Samimi K, Skala MC. Label-free imaging for quality control of cardiomyocyte differentiation. Nature Communications. 2021;12:4580. doi: 10.1038/s41467-021-24868-1. [DOI] [PMC free article] [PubMed] [Google Scholar]
  43. Ranjit S, Malacrida L, Jameson DM, Gratton E. Fit-free analysis of fluorescence lifetime imaging data using the phasor approach. Nature Protocols. 2018;13:1979–2004. doi: 10.1038/s41596-018-0026-5. [DOI] [PubMed] [Google Scholar]
  44. Schaefer PM, Hilpert D, Niederschweiberer M, Neuhauser L, Kalinina S, Calzia E, Rueck A, von Einem B, von Arnim CAF. Mitochondrial matrix pH as a decisive factor in neurometabolic imaging. Neurophotonics. 2017;4:045004. doi: 10.1117/1.NPh.4.4.045004. [DOI] [PMC free article] [PubMed] [Google Scholar]
  45. Schaefer PM, Kalinina S, Rueck A, von Arnim CAF, von Einem B. NADH autofluorescence: a marker on its way to boost bioenergetic research. Cytometry. Part A. 2019;95:34–46. doi: 10.1002/cyto.a.23597. [DOI] [PubMed] [Google Scholar]
  46. Shields CW, Evans MA, Wang LLW, Baugh N, Iyer S, Wu D, Zhao Z, Pusuluri A, Ukidve A, Pan DC, Mitragotri S. Cellular backpacks for macrophage immunotherapy. Science Advances. 2020;6:eaaz6579. doi: 10.1126/sciadv.aaz6579. [DOI] [PMC free article] [PubMed] [Google Scholar]
  47. Skala MC, Riching KM, Bird DK, Gendron-Fitzpatrick A, Eickhoff J, Eliceiri KW, Keely PJ, Ramanujam N. In vivo multiphoton fluorescence lifetime imaging of protein-bound and free nicotinamide adenine dinucleotide in normal and precancerous epithelia. Journal of Biomedical Optics. 2007;12:1–19. doi: 10.1117/1.2717503. [DOI] [PMC free article] [PubMed] [Google Scholar]
  48. Skala MC, Riching KM, Gendron-Fitzpatrick A, Eickhoff J, Eliceiri KW, White JG, Ramanujam N. In vivo multiphoton microscopy of NADH and FAD redox states, fluorescence lifetimes, and cellular morphology in precancerous epithelia. PNAS. 2007;104:19494–19499. doi: 10.1073/pnas.0708425104. [DOI] [PMC free article] [PubMed] [Google Scholar]
  49. Smiley ST, Reers M, Mottola-Hartshorn C, Lin M, Chen A, Smith TW, Steele GD, Chen LB. Intracellular heterogeneity in mitochondrial membrane potentials revealed by a J-aggregate-forming lipophilic cation JC-1. PNAS. 1991;88:3671–3675. doi: 10.1073/pnas.88.9.3671. [DOI] [PMC free article] [PubMed] [Google Scholar]
  50. Stiebing C, Meyer T, Rimke I, Matthäus C, Schmitt M, Lorkowski S, Popp J. Real-time Raman and SRS imaging of living human macrophages reveals cell-to-cell heterogeneity and dynamics of lipid uptake. Journal of Biophotonics. 2017;10:1217–1226. doi: 10.1002/jbio.201600279. [DOI] [PubMed] [Google Scholar]
  51. Tokunaga R, Zhang W, Naseem M, Puccini A, Berger MD, Soni S, McSkane M, Baba H, Lenz HJ. CXCL9, CXCL10, CXCL11/CXCR3 axis for immune activation: a target for novel cancer therapy. Cancer Treatment Reviews. 2018;63:40–47. doi: 10.1016/j.ctrv.2017.11.007. [DOI] [PMC free article] [PubMed] [Google Scholar]
  52. Touw WG, Bayjanov JR, Overmars L, Backus L, Boekhorst J, Wels M, van Hijum SAFT. Data mining in the life sciences with random forest: a walk in the park or lost in the jungle? Briefings in Bioinformatics. 2013;14:315–326. doi: 10.1093/bib/bbs034. [DOI] [PMC free article] [PubMed] [Google Scholar]
  53. Van den Bossche J, Baardman J, de Winther MPJ. Metabolic characterization of polarized M1 and M2 bone marrow-derived macrophages using real-time extracellular flux analysis. Journal of Visualized Experiments. 2015;5:53424. doi: 10.3791/53424. [DOI] [PMC free article] [PubMed] [Google Scholar]
  54. Van den Bossche J, O’Neill LA, Menon D. Macrophage immunometabolism: where are we (going)? Trends in Immunology. 2017;38:395–406. doi: 10.1016/j.it.2017.03.001. [DOI] [PubMed] [Google Scholar]
  55. Varone A, Xylas J, Quinn KP, Pouli D, Sridharan G, McLaughlin-Drubin ME, Alonzo C, Lee K, Münger K, Georgakoudi I. Endogenous two-photon fluorescence imaging elucidates metabolic changes related to enhanced glycolysis and glutamine consumption in precancerous epithelial tissues. Cancer Research. 2014;74:3067–3075. doi: 10.1158/0008-5472.CAN-13-2713. [DOI] [PMC free article] [PubMed] [Google Scholar]
  56. Verikas A, Gelzinis A, Bacauskiene M. Mining data with random forests: a survey and results of new tests. Pattern Recognition. 2011;44:330–349. doi: 10.1016/j.patcog.2010.08.011. [DOI] [Google Scholar]
  57. Viola A, Munari F, Sánchez-Rodríguez R, Scolaro T, Castegna A. The metabolic signature of macrophage responses. Frontiers in Immunology. 2019;10:1462. doi: 10.3389/fimmu.2019.01462. [DOI] [PMC free article] [PubMed] [Google Scholar]
  58. Vivekanandan-Giri A, Byun J, Pennathur S. Quantitative analysis of amino acid oxidation markers by tandem mass spectrometry. Methods in Enzymology. 2011;491:73–89. doi: 10.1016/B978-0-12-385928-0.00005-5. [DOI] [PMC free article] [PubMed] [Google Scholar]
  59. Wahl M, Röhlicke T, Rahn H-J, Erdmann R, Kell G, Ahlrichs A, Kernbach M, Schell AW, Benson O. Integrated multichannel photon timing instrument with very short dead time and high throughput. The Review of Scientific Instruments. 2013;84:043102. doi: 10.1063/1.4795828. [DOI] [PubMed] [Google Scholar]
  60. Walsh AJ, Cook RS, Manning HC, Hicks DJ, Lafontant A, Arteaga CL, Skala MC. Optical metabolic imaging identifies glycolytic levels, subtypes, and early-treatment response in breast cancer. Cancer Research. 2013;73:6164–6174. doi: 10.1158/0008-5472.CAN-13-0527. [DOI] [PMC free article] [PubMed] [Google Scholar]
  61. Walsh AJ, Mueller KP, Tweed K, Jones I, Walsh CM, Piscopo NJ, Niemi NM, Pagliarini DJ, Saha K, Skala MC. Classification of T-cell activation via autofluorescence lifetime imaging. Nature Biomedical Engineering. 2020;5:77–88. doi: 10.1038/s41551-020-0592-z. [DOI] [PMC free article] [PubMed] [Google Scholar]
  62. Wang J, Roderiquez G, Oravecz T, Norcross MA. Cytokine regulation of human immunodeficiency virus type 1 entry and replication in human monocytes/macrophages through modulation of CCR5 expression. Journal of Virology. 1998;72:7642–7647. doi: 10.1128/JVI.72.9.7642-7647.1998. [DOI] [PMC free article] [PubMed] [Google Scholar]
  63. Wang F, Zhang S, Jeon R, Vuckovic I, Jiang X, Lerman A, Folmes CD, Dzeja PD, Herrmann J. Interferon gamma induces reversible metabolic reprogramming of M1 macrophages to sustain cell viability and pro-inflammatory activity. EBioMedicine. 2018;30:303–316. doi: 10.1016/j.ebiom.2018.02.009. [DOI] [PMC free article] [PubMed] [Google Scholar]

Editor's evaluation

Michael L Dustin 1

The authors introduce a machine-learning-based classifier for M1 and M2 polarised macrophages, based on autofluorescence lifetime parameters excited by two-photon excitation in the NAD(P)H emission band following uncoupling of oxidative phosphorylation. They have identified a promising direction for use of metabolic imaging for macrophage classification.

Decision letter

Editor: Michael L Dustin1
Reviewed by: Michael L Dustin2, Sergi Padilla-Parra3

Our editorial process produces two outputs: (i) public reviews designed to be posted alongside the preprint for the benefit of readers; (ii) feedback on the manuscript for the authors, including requests for revisions, shown below. We also include an acceptance summary that explains what the editors found interesting or important about the work.

Decision letter after peer review:

Thank you for submitting your article "Non-Invasive classification of macrophage polarisation by 2P-FLIM and machine learning" for consideration by eLife. Your article has been reviewed by 2 peer reviewers, including Michael L Dustin as the Reviewing Editor and Reviewer #1, and the evaluation has been overseen by Aleksandra Walczak as the Senior Editor. The following individual involved in the review of your submission has agreed to reveal their identity: Sergi Padilla-Parra (Reviewer #2).

The reviewers have discussed their reviews with one another, and the Reviewing Editor has drafted this to help you prepare a revised submission.

Essential revisions:

1) Please produce more data utilising the phasor plot.

2) Produce 2D graphs clearly showing the number of photons (per pixel) vs lifetime.

3) Show all pictures of your cells with the different parameters.

4) If photons are limited leading to artifacts, increase the resolution of all your images (utilise a higher magnification/higher NA objective) and collect more photons per cell.

5) Figure 2 – Please indicate n number. Is this data from all 6 donors?

6) Figure 3 – In D, E and F, please indicate what each bar or point represents. Is this a single cell, an imaging field or 200 cells? Please indicate for each panel.

7) The authors should make clear whether the ROC AUC value of 0.944 is for single cells or a population. If not single cell, how many cells are needed to reach this level?

8) The authors cite Kröger et al.'s 2021 preprint that uses lifetime parameters to classify human macrophages in vivo. Are the results here consistent with this kind of accuracy without the application of metabolic inhibitors? In this case, does in vivo environment likely serve as a discriminative condition that might separate the cells by FLIM better than the excess of oxygen in the in vitro setting?

9) The study could be strengthened further by looking at the phenotypic markers of polarisation at the single-cell level, for example, by immunofluorescence in a manner that could be correlated with the FLIM measurements. This might reveal how much the accuracy of the methods is related to some failure of macrophages to polarise in a population, rather than a true error in classification. This could be discussed as a future effort.

[Editors' note: further revisions were suggested prior to acceptance, as described below.]

Thank you for resubmitting the paper entitled "Non-Invasive classification of macrophage polarisation by 2P-FLIM and machine learning" for further consideration by eLife. Your revised article has been evaluated by a Senior Editor and a Reviewing Editor. We are sorry to say that we have decided that this submission will not be considered further for publication by eLife.

While the revisions were appreciated, they reinforced concerns about S/N. Your response regarding the importance of machine learning to the classification led to a deeper analysis of this machine learning approach, which also suggests several weaknesses. We hope both sets of comments will be helpful in going forward with development of a robust approach to this important and interesting problem.

Reviewer #1 (Recommendations for the authors):

Comments on machine learning approach-

Doing large grid sweeps when cross-validating is not exactly best practice, as you will be optimising for performance on the test set. Data should instead be split into train-test-val: grid sweeps should be evaluated on the test set, but final performance should be evaluated on a validation set. Otherwise, these parameters may have been overfitted to the test set and may not generalise to other validation sets.

line 110 – t-SNE cannot be used for dimension reduction as it doesn’t learn a function that can be re-applied.

Given there only appear to be 6 variables used, PCA will likely be useful and faster than UMAP, and the principal components will be highly interpretable. The main utility of UMAP over PCA is that it is a non-linear transformation. PCA should still be explored to see what features make up the principal components.

139 – ROC-AUC is a bit subjective based on the number of cases: true positive, false positive, true negative and false negative should be reported too.

In figure 3: is 3F a reapplication of the UMAP learned in E, or is it a new UMAP?

Figure 4B does not at all look convincing: the M1 and M2 groups do not appear to be separated. Furthermore, was the UMAP used in this figure re-trained on this patient's data, or was it pre-trained on a different dataset? It is not clear.

There's a lot going on in figure 4: was a new model trained for each patient, or was each patient tested on the same model? In either case, a variability of 0.937 ROC-AUC to 0.650 ROC-AUC does not suggest that this classification model is robust.

Perhaps most importantly, it is not apparent where this score of 0.944 comes from: is this the max of cross-validation or the mean? Is it on a dataset that collates all the data, or only on a subset? In particular, there is no link to the findings shown in figure 4.

This could be extended by doing ‘cross-validation’ when one of the patients is held out each time and generalisation performance is evaluated on the held-out patient.

Line 235: 'it is noticeable that single-cell classification performance is affected by donor variability during the FCCP treatment' – this should be emphasised in the abstract – it is a weakness of the paper.

In Table 1 it looks like mtry and ntree are the wrong way around – having only 2-5 trees in a random forest is in no way stable, and given there are 6 features, mtry can not be more than 6 (instead of the reported range of 100 to 300).

If the researchers are doing cross-validation, they should report the mean and standard deviation/S.E. for their ROC-AUC scores along with their TP/TN/FP/FN. It's not apparent whether they've picked the highest ROC-AUC score they got in the cross-validations or the mean.

Their analysis of feature importance is fairly ad hoc: a method like SHAP should be explored.

It appears that in the fluorescence some features are combinations of other features.

Line 486: if SVMs and logistic regressions have been done these must be reported along with their confidence intervals. Logistic regressions will also be highly interpretable. It does not appear cross-validation has been done on this data, however.

Reviewer #2 (Recommendations for the authors):

The authors have performed a number of analyses to try to respond to my questions. They have produced the phasor plot for some of the data and have also presented pixel-by-pixel images and the photon histograms, which is really valuable for understanding how reliable the data are.

I have to say that after examining these data I am not convinced that the signal to noise and the limited photon collection are not affecting the results:

In Figure 3 supplement 1, all figures are the same! There is no difference in pixel-by-pixel values across conditions. The same can be seen in Figure 3 supplement 2: all figures, regardless of treatment condition, look the same to me. In the case of the phasor plot, the differences might come from S/N, as the shift in the plot is minimal and might be equivalent to your error (a few ps). In Figure 3 Supplement 4, the average lifetime clearly shows that for lower numbers of photons you have a shift in the lifetime, which suggests that your changes in lifetime are affected by your poor signal. If you are considering the average lifetime as the mean value from a double exponential, this still shows that your calculations are affected by poor photon collection. You might consider a non-fitting approach altogether (for instance, photon arrival time), and the number of photons would still be important. This would be better shown in a graph in which you do not bin the average lifetimes to a particular lifetime value (histograms) but instead plot each photon value directly against its corresponding lifetime value (dispersion plot). If you do this pixel by pixel instead of averaging your results per cell, you will get a big distribution of lifetimes that are strongly affected by poor photon collection, and you will have to determine the minimal number of photons that gives a reliable lifetime.

I am sorry to say that this view is strengthened when examining the pixel-by-pixel images provided for the different treatments. No differences at all can be seen between the different treatments (i.e. oligomycin, FCCP…). Even when comparing IL-4-M2 macrophages with IFNγ-M1 macrophages, I could not see any significant difference.

Overall, I do appreciate the effort in producing all these data and I understand that there might be some differences in lifetimes that are quantified. However, the impact of the S/N and the difficulty of deconvolving background noise from real signal, as shown in the histograms and also the images, put in doubt the main hypothesis of the paper.

eLife. 2022 Oct 18;11:e77373. doi: 10.7554/eLife.77373.sa2

Author response


Essential revisions:

1) Please produce more data utilising the phasor plot.

We thank the reviewers for this suggestion. We have produced phasor plots from the data originally presented in the first version of this manuscript and present the results in supplementary figure 5. We have also referred to these data in the Results and Discussion sections. While we do see some separation between the M1 and M2 populations when they are treated with FCCP and analysed with the phasor approach, the separation is quite weak. This strengthens the rationale for a machine-learning approach.

2) Produce 2D graphs clearly showing the number of photons (per pixel) vs lifetime.

We thank the reviewers for the comment and hope that we have interpreted it correctly. We have calculated the number of photons per pixel for all images acquired across all fluorescence lifetime and intensity measurements and plotted these as histograms so that the distribution of the data can be observed. Normality tests were conducted for all measurements, and all data except τavg were normally distributed. Since τavg is not a raw variable but is derived from other fluorescence lifetime variables, we believe its double Gaussian distribution is likely due to the two different populations analysed in this work (IFNγ-M1 and IL-4-M2 macrophages). In addition, for the fitted Gaussian curves we provide the amplitude, mean, standard deviation and R2 value. All of these results can be found in supplementary figure 3. This comment also overlaps with comment 4 (below), where we explain the applicability of the resolution and the objective employed.

3) Show all pictures of your cells with the different parameters.

We thank the reviewers for this comment and have added all of the images for all of the parameters in supplementary figures 3 and 4. We have donor-matched the extra figures with those in main figure 3A for both IFNγ-M1 and IL-4-M2 macrophages.

4) If photons are limited leading to artifacts, increase the resolution of all your images (utilise a higher magnification/higher NA objective) and collect more photons per cell.

The objective used in this study is an Olympus XLPlan 25x NA 1.05 water immersion objective designed specifically for two-photon applications. It is among the best objectives available on the market for multiphoton excitation (Singh et al. 2015).

5) Figure 2 – Please indicate n number. Is this data from all 6 donors?

We thank the reviewers for this comment and have made corrections to make this point clearer.

To clarify, these data are from all 6 donors.

6) Figure 3 – In D, E and F, please indicate what each bar or point represents. Is this a single cell, an imaging field or 200 cells? Please indicate for each panel.

We thank the reviewers for this comment and have made corrections to make this point clearer.

In Figure 3D, E and F each row and point represents an imaging field. Notably in figure 3F, each symbol is also shape-coded for each donor.

7) The authors should make clear whether the ROC AUC value of 0.944 is for single cells or a population. If not single cell, how many cells are needed to reach this level?

We thank the reviewers for this comment and have made corrections to make this point clearer. The ROC-AUC value is calculated for a population, based on the imaging fields obtained during the experiment. The minimum number of cells present in an imaging field was 24.

8) The authors cite Kröger et al.'s 2021 preprint that uses lifetime parameters to classify human macrophages in vivo. Are the results here consistent with this kind of accuracy without the application of metabolic inhibitors? In this case, does in vivo environment likely serve as a discriminative condition that might separate the cells by FLIM better than the excess of oxygen in the in vitro setting?

Yes, the results of Kröger et al. show a similar accuracy to this work without resorting to metabolic inhibitors, and an in vivo environment can certainly play a role in making the phenotypes of M1 and M2 macrophages more distinguishable by FLIM. As described in that same work, M1 macrophages actively engage in phagocytosis, which generates ROS and leads to metabolic stress. In addition, nutrient availability, a 3D environment with a complex ECM structure, and the cross-talk between the different cell types present in the human dermis can trigger further metabolic changes that make it easier to evaluate macrophage polarisation using FLIM. This evokes the age-old debate of in vivo versus in vitro; the two are not directly comparable, as even the most advanced in vitro conditions (3D, bioreactors, multicellularity, ECM) do not fully capture in vivo environmental conditions, never mind those that are diseased or infected. However, our platform serves as a powerful tool for in vitro analysis and non-discriminative classification of human macrophage behaviour.

9) The study could be strengthened further by looking at the phenotypic markers of polarisation at the single-cell level, for example, by immunofluorescence in a manner that could be correlated with the FLIM measurements. This might reveal how much the accuracy of the methods is related to some failure of macrophages to polarise in a population, rather than a true error in classification. This could be discussed as a future effort.

We thank the reviewers for this comment and have added this point to the Discussion section.

We agree with this point and have included it in the Discussion, together with a new reference that showcases fluorescent markers that could be used to determine macrophage phenotype. Indeed, in supplementary figure 6 we already use some markers to distinguish between macrophage polarisations, which can serve as a starting point for improving the classification model in future work.

[Editors' note: further revisions were suggested prior to acceptance, as described below.]

While the revisions were appreciated, they reinforced concerns about S/N. Your response regarding the importance of machine learning to the classification led to a deeper analysis of this machine learning approach, which also suggests several weaknesses. We hope both sets of comments will be helpful in going forward with development of a robust approach to this important and interesting problem.

Reviewer #1 (Recommendations for the authors):

Comments on machine learning approach-

Doing large grid sweeps when cross-validating is not exactly best practice, as you will be optimising for performance on the test set. Data should instead be split into train-test-val: grid sweeps should be evaluated on the test set, but final performance should be evaluated on a validation set. Otherwise, these parameters may have been overfitted to the test set and may not generalise to other validation sets.

We thank the reviewer for this comment. Our data comprise small datasets with a limited number of data points. We decided to use all of these data points to train and validate the random forests model using cross-validation; therefore, we do not have extra data with which to test our model. When training the random forests, we tuned the hyperparameters to achieve the smallest out-of-bag (OOB) error. The OOB error allows us to estimate the predictive error of our random forests models. In addition, we calculate the ROC curves and corresponding AUC values to evaluate the predictive ability and efficiency of the trained models. We corrected the manuscript to include this explanation.
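
As a minimal sketch of where an out-of-bag estimate comes from (this is not the authors' code; the sample size of 36 is the number of data points quoted later in this response, and the number of trees and the random seed are illustrative assumptions): each tree in a random forest is trained on a bootstrap sample, and the data points that were not drawn for that tree form its out-of-bag set, on which the tree can be evaluated without a separate test set.

```python
import random


def bootstrap_split(n_samples, rng):
    """Return (in_bag, out_of_bag) index lists for a single tree."""
    # Draw n_samples indices with replacement: the tree's training sample.
    in_bag = [rng.randrange(n_samples) for _ in range(n_samples)]
    drawn = set(in_bag)
    # Every index that was never drawn is out-of-bag for this tree.
    out_of_bag = [i for i in range(n_samples) if i not in drawn]
    return in_bag, out_of_bag


rng = random.Random(0)   # fixed seed, illustrative only
n_samples = 36           # data points quoted in the response
n_trees = 100            # hypothetical forest size

oob_fractions = []
for _ in range(n_trees):
    in_bag, out_of_bag = bootstrap_split(n_samples, rng)
    oob_fractions.append(len(out_of_bag) / n_samples)

# On average roughly a third of the points are out-of-bag for any one tree.
mean_oob_fraction = sum(oob_fractions) / n_trees
print(f"mean out-of-bag fraction: {mean_oob_fraction:.2f}")
```

The OOB error is then the misclassification rate obtained when each data point is predicted only by the trees for which it was out-of-bag. It estimates predictive error, but it is not a substitute for a held-out set when the same error is also used to tune hyperparameters.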

line 110 – t-SNE cannot be used for dimension reduction as it doesn’t learn a function that can be re-applied.

Given there only appear to be 6 variables used, PCA will likely be useful and faster than UMAP, and the principal components will be highly interpretable. The main utility of UMAP over PCA is that it is a non-linear transformation. PCA should still be explored to see what features make up the principal components.

Our intention is to use UMAP, PCA or t-SNE as data visualisation tools only. We added a new figure 3 supplement 5, where we use PCA and t-SNE to visualise global (i.e. all treatments) and FCCP-treatment FLIM data. Here, we noticed that PCA produces a clear clustering of the FCCP data similar to that observed with UMAP, whilst t-SNE does not cluster as well. We have corrected the manuscript to reflect this.

139 – ROC-AUC is a bit subjective based on the number of cases: true positive, false positive, true negative and false negative should be reported too.

Good point. For clarity, we now report these values in tables 1 and 2 as true positive (TP), false positive (FP), true negative (TN) and false negative (FN).
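
For illustration, the four counts and a ROC-AUC value can be computed from labels and classifier scores as sketched below. The labels, scores and the 0.5 decision threshold are hypothetical and are not taken from the study; the AUC is computed with the rank-based definition (the probability that a randomly chosen positive scores higher than a randomly chosen negative).

```python
def confusion_counts(y_true, y_pred, positive="M1"):
    """Count TP, FP, TN and FN, treating `positive` as the positive class."""
    counts = {"TP": 0, "FP": 0, "TN": 0, "FN": 0}
    for truth, pred in zip(y_true, y_pred):
        if pred == positive:
            counts["TP" if truth == positive else "FP"] += 1
        else:
            counts["FN" if truth == positive else "TN"] += 1
    return counts


def roc_auc(y_true, scores, positive="M1"):
    """AUC as the fraction of positive/negative pairs ranked correctly."""
    pos = [s for t, s in zip(y_true, scores) if t == positive]
    neg = [s for t, s in zip(y_true, scores) if t != positive]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    # A tie between a positive and a negative score counts as half a win.
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical labels and predicted probabilities of being M1.
y_true = ["M1", "M1", "M1", "M2", "M2", "M2"]
scores = [0.9, 0.7, 0.4, 0.6, 0.3, 0.1]
y_pred = ["M1" if s >= 0.5 else "M2" for s in scores]

counts = confusion_counts(y_true, y_pred)
auc = roc_auc(y_true, scores)
print(counts, round(auc, 3))
```

Reporting the counts alongside the AUC makes the number of cases behind the score visible, which is the point raised by the reviewer.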

In figure 3: is 3F a reapplication of the UMAP learned in E, or is it a new UMAP?

Figure 4B does not at all look convincing: the M1 and M2 groups do not appear to be separated. Furthermore, was the UMAP used in this figure re-trained on this patient's data, or was it pre-trained on a different dataset? It is not clear.

Figure 3F is a reapplication of the UMAP of figure 3E, applied only to the FCCP treatment. We use UMAP as a data visualisation technique and classify the data using random forests; therefore, there is no need to enforce a clear clustering pattern. The UMAP in figure 4B was re-trained on data from a representative donor.

There's a lot going on in figure 4: was a new model trained for each patient, or was each patient tested on the same model? In either case, a variability of 0.937 ROC-AUC to 0.650 ROC-AUC does not suggest that this classification model is robust.

Perhaps most importantly, it is not apparent where this score of 0.944 comes from: is this the max of cross-validation or the mean? Is it on a dataset that collates all the data, or only on a subset? In particular, there is no link to the findings shown in figure 4.

This could be extended by doing ‘cross-validation’ when one of the patients is held out each time and generalisation performance is evaluated on the held-out patient.

To clarify, there is a new model for each donor. Our point here is to show that we can evaluate donor heterogeneity at the single-cell level by applying random forests to single-donor data. The 0.944 value comes from a model trained using all donor data from a large field of view treated with FCCP, specifically from figure 3F, and is now reported in table 1.

Line 235: 'it is noticeable that single-cell classification performance is affected by donor variability during the FCCP treatment' – this should be emphasised in the abstract – it is a weakness of the paper.

We thank the reviewer for this comment. We have corrected the manuscript and added this information to the abstract as: “We uncover FLIM parameters that are pronounced under the action of carbonyl cyanide-p-trifluoromethoxyphenylhydrazone (FCCP), which strongly stratifies the phenotype of polarised human macrophages, however this performance is impacted by donor variability when analysing the data at a single-cell level”
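
The donor-held-out evaluation suggested by the reviewer can be sketched as follows. This is an illustration only, not the analysis performed in the study: the donor identifiers, labels and feature values are hypothetical, and a toy one-feature threshold classifier stands in for the random forests model.

```python
from statistics import mean, stdev


def leave_one_donor_out(donor_ids):
    """Yield (held_out_donor, train_indices, test_indices) once per donor."""
    for donor in sorted(set(donor_ids)):
        train = [i for i, d in enumerate(donor_ids) if d != donor]
        test = [i for i, d in enumerate(donor_ids) if d == donor]
        yield donor, train, test


def fit_threshold(values, labels):
    """Toy classifier: threshold halfway between the two class means."""
    m1 = mean(v for v, lab in zip(values, labels) if lab == "M1")
    m2 = mean(v for v, lab in zip(values, labels) if lab == "M2")
    return (m1 + m2) / 2, m1 > m2


def predict(value, threshold, m1_is_high):
    return "M1" if (value > threshold) == m1_is_high else "M2"


# Hypothetical data: one FLIM feature, two imaging fields per donor.
donors = ["d1", "d1", "d2", "d2", "d3", "d3"]
labels = ["M1", "M2", "M1", "M2", "M1", "M2"]
feature = [1.10, 0.90, 1.20, 0.95, 0.98, 0.85]

accuracies = []
for donor, train, test in leave_one_donor_out(donors):
    threshold, m1_is_high = fit_threshold([feature[i] for i in train],
                                          [labels[i] for i in train])
    hits = sum(predict(feature[i], threshold, m1_is_high) == labels[i]
               for i in test)
    accuracies.append(hits / len(test))

# Report the spread across held-out donors, not only the best fold.
print(f"accuracy: {mean(accuracies):.2f} +/- {stdev(accuracies):.2f}")
```

Because each donor is excluded from training when it is scored, the spread of the per-donor results gives a direct view of how donor variability affects generalisation.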

In Table 1 it looks like mtry and ntree are the wrong way around – having only 2-5 trees in a random forest is in no way stable, and given there are 6 features, mtry can not be more than 6 (instead of the reported range of 100 to 300).

How embarrassing; the reviewer is entirely correct, and they were the wrong way around. We have corrected table 1 and table 2. The mtry range is now 2-5 and the ntree range is 100-400.

If the researchers are doing cross-validation, they should report the mean and standard deviation/S.E. for their ROC-AUC scores along with their TP/TN/FP/FN. It's not apparent whether they've picked the highest ROC-AUC score they got in the cross-validations or the mean.

Their analysis of feature importance is fairly ad hoc: a method like SHAP should be explored.

We thank the reviewer for this comment and have added these values to table 1. For the ROC-AUC values, we did not perform cross-validation of the testing datasets; we utilised all of the data to train the random forests models. Therefore, the results obtained are based on the predictive capabilities of the trained models without a testing dataset. Using random forests allows us to evaluate the feature importance of each predictor, and we have presented these results in figure 3H and figure 4E.
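
A small sketch of such a hyperparameter grid, with a guard against the kind of mix-up described above, might look as follows. The ranges are those quoted in this response; the ntree step of 100 and the function name `make_grid` are assumptions made for illustration, and the count of five predictors follows the authors' statement that five of the six 2P-FLIM variables are used.

```python
from itertools import product

N_PREDICTORS = 5  # five of the six 2P-FLIM variables are used as predictors


def make_grid(mtry_values, ntree_values, n_predictors=N_PREDICTORS):
    """Build all (mtry, ntree) combinations, rejecting impossible mtry values."""
    mtry_values = list(mtry_values)
    ntree_values = list(ntree_values)
    for mtry in mtry_values:
        # mtry is the number of predictors tried at each split, so it can
        # never exceed the number of predictors available.
        if not 1 <= mtry <= n_predictors:
            raise ValueError(f"mtry={mtry} is outside 1..{n_predictors}")
    return [{"mtry": m, "ntree": t}
            for m, t in product(mtry_values, ntree_values)]


# mtry 2-5 and ntree 100-400; the step of 100 is an assumption.
grid = make_grid(range(2, 6), range(100, 401, 100))
print(len(grid), grid[0], grid[-1])
```

With the two parameters swapped, the check fails immediately because an mtry of 100 exceeds the number of predictors.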

It appears that in the fluorescence some features are combinations of other features.

We thank the reviewer for this comment. We provide a correlation matrix in figure 4 supplement 1. Here, we observed that α2 is negatively correlated with α1 and it is therefore not used as a predictor. The other values, although mathematically derived from other features, do not show a correlation high enough to warrant their removal as variables for random forest training. For this reason, although we acquire 6 variables using 2P-FLIM, we use only 5 variables as predictors to train the random forest models.
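
To illustrate why α2 carries no information beyond α1, and how a derived feature such as τavg is built, consider the sketch below. It assumes the commonly used amplitude-weighted mean for a biexponential decay, τavg = (α1·τ1 + α2·τ2)/(α1 + α2), and fractional amplitudes that sum to one; neither the formula nor the numerical values are taken from this section, so both should be read as assumptions.

```python
from math import sqrt


def tau_avg(alpha1, tau1, alpha2, tau2):
    """Amplitude-weighted mean lifetime of a biexponential decay (assumed form)."""
    return (alpha1 * tau1 + alpha2 * tau2) / (alpha1 + alpha2)


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)


# Hypothetical fractional amplitudes for four imaging fields.
alpha1 = [0.70, 0.75, 0.80, 0.65]
alpha2 = [1 - a for a in alpha1]   # the two fractions sum to one

# The correlation is -1 up to rounding: alpha2 is redundant as a predictor.
r = pearson(alpha1, alpha2)

# Hypothetical lifetimes in nanoseconds for the short and long components.
t_avg = tau_avg(0.75, 0.4, 0.25, 2.4)
print(round(r, 6), round(t_avg, 3))
```

A derived feature such as τavg depends on several inputs at once, so its correlation with any single one of them can remain moderate, which is consistent with the authors keeping it as a predictor.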

Line 486: if SVMs and logistic regressions have been done these must be reported along with their confidence intervals. Logistic regressions will also be highly interpretable. It does not appear cross-validation has been done on this data, however.

We appreciate the reviewer’s comment. We have added the accuracy results of SVM and logistic regression directly in the manuscript. Regarding the confidence intervals, we believe they are describing the bootstrapping confidence interval. For our study, it’s not appropriate to do bootstrapping. Here, we only have 36 datapoints while performing bootstrapping requires a large dataset. In addition, bootstrapping is used more for regression and not for classification, as the assumptions underlying bootstrapping usually fail for classification problems.

Reviewer #2 (Recommendations for the authors):

The authors have performed a number of analyses to try to respond to my questions. They have produced the phasor plot for some of the data and have also presented pixel-by-pixel images and the photon histograms, which is really valuable for understanding how reliable the data are.

I have to say that after examining these data I am not convinced that the signal to noise and the limited photon collection are not affecting the results:

We appreciate the reviewer’s concern. Their comments are addressed point by point below, but in summary we are confident that S/N is not an issue. From the outset, before multiexponential decay fitting, we set a minimum threshold of photons/pixel required for fitting. The lifetime values for bound and free NADH agree with those of several independent international labs and with our previous measurements of pure suspended NADH. We have added more detail to the revised manuscript and more supplementary figures to answer the reviewer’s queries.

Figure 3 supplement 1 all figures are the same!!! There is no difference in pixel by pixel values for all conditions! The same can be seen in Figure 3 supplement 2. All figures regardless treatment conditions look the same to me. In the case of the phasor plot the differences might come from S/N as the shift in the plot is minimal and might be equivalent to your error (a few ps).

We thank the reviewer for this comment. We have changed the colorscales of Figure 3 supplement 1 and Figure 3 supplement 2 to be more sensitive to the shifts we observed. We believe that now the impact of different metabolic treatment can be fully appreciated.

In Figure 3 Supplement 4, the average lifetime clearly shows that for lower number of photons you have a shift in the lifetime which suggests that your changes in lifetime are affected by your poor signal. If you are considering the average lifetime as the mean value from a double exponential, still this shows that your calculations are affected by poor photon collection. You might consider non-fitting approach at all (for instance photon arrival time) and the number of photons would still be important. This would be better shown in a graph in which you do not bin the average lifetimes to a particular lifetime value (histograms) but instead you plot directly each photon value versus its corresponding lifetime value (dispersion plot). If you do this pixel by pixel instead of averaging your results per cell you will get a bid distribution of lifetimes that are pretty much affected by poor photon collection and you will have to determine which is the minimal amount of photons that gives a reliable lifetime.

We thank the reviewer for this comment. We have now generated new dispersion plots that show each photon value (photons/pixel) directly against its corresponding FLIM and intensity variable. As suggested by the reviewer, we do this in a pixel by pixel manner. We acquire our data on a 2P-FLIM system equipped with an objective designed for multiphoton excitation (25x, 1.05NA) and TCSPC detectors. In addition, we strive to reduce imaging noise by working in a dark room with physical obstructions (closed system) to reduce light interference. Each image is acquired over 30 seconds at a power appropriate to generate adequate photon counts for fitting a multiexponential decay curve with χ2 values below 1.4. After imaging, we remove the image background using a proprietary tool of Symphotime, which removes all pixels with photon counts below a pre-determined value. In the new dispersion plots there are no values below 450 photons/pixel, an observable consequence of this user-guided, software-assisted background removal.
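The photon-count threshold described above can be illustrated with a minimal sketch. This is not the proprietary Symphotime tool; the function name `mask_low_photon_pixels` and the array layout are hypothetical, and the default of 450 photons/pixel is only taken from the lower bound visible in the dispersion plots.

```python
import numpy as np

def mask_low_photon_pixels(photon_counts, lifetime_map, min_photons=450):
    """Blank out lifetime values in pixels with too few photons.

    Hypothetical helper for illustration only: pixels whose photon count
    is below `min_photons` are set to NaN so they are excluded from any
    downstream statistics.
    """
    counts = np.asarray(photon_counts)
    lifetimes = np.asarray(lifetime_map, dtype=float)
    if counts.shape != lifetimes.shape:
        raise ValueError("photon_counts and lifetime_map must have the same shape")
    keep = counts >= min_photons          # boolean mask of retained pixels
    masked = np.where(keep, lifetimes, np.nan)
    return masked, keep
```

Returning the boolean mask alongside the masked map makes it easy to report how many pixels were discarded for a given threshold.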

Regarding the average lifetime plot in Figure 3 —figure supplement 4, we believe that the separation observed was due to the presence of two metabolically opposed cell populations. As shown in the feature importance data, τavg, τ1 and τ2 are the most important parameters for distinguishing the two cell populations, and since τavg is a mathematical combination of four fluorescence parameters, the difference between the two populations is expected to be larger in the τavg plot. In addition, the average lifetime plot shows similar frequency values for lower and higher lifetimes. If lower photon counts were directly affecting the average fluorescence lifetime measurements, we would instead observe a trend skewed towards higher frequency values around a particular lifetime range. Furthermore, there seems to be no trend in the other, non-derived 2P-FLIM parameters.
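To make the dependence of τavg on the fitted parameters concrete, the sketch below computes an amplitude-weighted average lifetime from a bi-exponential fit. It assumes the four parameters are the two lifetimes (τ1, τ2) and their amplitudes (α1, α2), combined in the commonly used amplitude-weighted form; the function name is hypothetical.

```python
def average_lifetime(alpha1, tau1, alpha2, tau2):
    """Amplitude-weighted average lifetime of a bi-exponential decay.

    Assumed definition (illustrative):
        tau_avg = (alpha1 * tau1 + alpha2 * tau2) / (alpha1 + alpha2)
    Lifetimes are returned in the same units as tau1 and tau2.
    """
    total = alpha1 + alpha2
    if total <= 0:
        raise ValueError("the sum of the amplitudes must be positive")
    return (alpha1 * tau1 + alpha2 * tau2) / total
```

Because τavg mixes both lifetimes and both amplitudes, a shift in any one of them moves τavg, which is consistent with the populations separating more clearly in the derived parameter than in a single fitted one.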

Nonetheless, with the new dispersion plots the trend described by the reviewer is no longer present. The distributions appear random: high photon counts (2000 photons/pixel) can correspond to low FLIM variable values, and vice versa. Specifically, for the average fluorescence lifetime referred to by the reviewer, we no longer observe any trend. Our data contain pixels with 1000 photons/pixel and a corresponding lifetime of 0.95 ns, as well as pixels with 1000 photons/pixel and a lifetime of 1.2 ns. With these new plots we are confident in the data being measured and in the values obtained. In the past we used this same technique and these same conditions to evaluate metabolic changes that were further validated with standard biochemical techniques (DOI: 10.1172/jci.insight.139032).
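One simple way to quantify whether lifetime depends on photon count is a pixel-wise correlation coefficient. The sketch below is an illustration under stated assumptions and not the analysis performed for the dispersion plots; the function name and input layout are hypothetical, and a Pearson coefficient only captures linear trends.

```python
import numpy as np

def photon_lifetime_correlation(photon_counts, lifetimes):
    """Pearson correlation between per-pixel photon counts and lifetimes.

    Hypothetical helper: inputs are flattened, and pixels where either
    value is NaN or infinite (e.g. removed background) are ignored.
    A value near 0 indicates no linear trend between the two quantities.
    """
    counts = np.asarray(photon_counts, dtype=float).ravel()
    taus = np.asarray(lifetimes, dtype=float).ravel()
    if counts.shape != taus.shape:
        raise ValueError("inputs must contain the same number of pixels")
    valid = np.isfinite(counts) & np.isfinite(taus)
    if valid.sum() < 2:
        raise ValueError("at least two valid pixels are required")
    return float(np.corrcoef(counts[valid], taus[valid])[0, 1])
```

A rank-based coefficient would be a reasonable alternative if a monotonic but non-linear dependence were suspected.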

I am sorry to say that this vision is strengthened when examining the pixel by pixel images provided with different treatments. No differences at all can be seen when looking at the different treatments (i.e. Oligomycin, FCCP…). Even when comparing IL-4-M2 macrophages vs IFNγ-M1 macrophages I could not see any significant difference.

We thank the reviewer for this comment. We have changed the limits of the colorscale of Figure 3A as well as of Figure 3 —figure supplement 1 and Figure 3 —figure supplement 2. We believe that the fluorescence lifetime shifts of each treatment are now more pronounced and easier to appreciate.

Overall, I do appreciate the effort in producing all these data and I understand that there might be some differences in lifetimes that are quantified. However, the impact of the S/N and the difficulty of deconvolving background noise from real signal, as shown in the histograms and also the images, put in doubt the main hypothesis of the paper.

While this critique is acknowledged, we sincerely believe we have comprehensively addressed this reviewer’s concern.

Associated Data

    Supplementary Materials

    Figure 2—source data 1. ELISA, gene expression, extracellular acidification ratio, and oxygen consumption ratio measurements for each replicate data and details of statistical tests and chosen parameters.
    Figure 2—figure supplement 1—source data 1. Flow cytometry surface markers measurement and details of statistical tests and chosen parameters.
    Figure 3—source data 1. Fluorescence lifetime imaging microscopy (FLIM) imaging and corresponding variables measurement for each replicate, Uniform Manifold Approximation and Projection (UMAP) analysis, and machine learning (random forests) input data and coding.
    Figure 3—figure supplement 3—source data 1. Phasor fluorescence lifetime imaging microscopy (FLIM) analysis raw histogram data.
    Figure 3—figure supplement 6—source data 1. Dispersion plot distribution of photons/pixel per fluorescence lifetime variable value raw data.
    Figure 4—source data 1. Image and single-cell segmentation cell profiler coding, single-cell fluorescence lifetime imaging microscopy (FLIM) variables measurement for each replicate, and machine learning input data and coding.
    Transparent reporting form

    Data Availability Statement

    All data generated or analysed during this study are included in the manuscript and supporting files (uploaded as source data).


    Articles from eLife are provided here courtesy of eLife Sciences Publications, Ltd