Skip to main content
Bioinformatics logoLink to Bioinformatics
. 2022 Jan 18;38(7):2015–2021. doi: 10.1093/bioinformatics/btac032

massNet: integrated processing and classification of spatially resolved mass spectrometry data using deep learning for rapid tumor delineation

Walid M Abdelmoula 1,2, Sylwia A Stopka 3,4, Elizabeth C Randall 5, Michael Regan 6, Jeffrey N Agar 7, Jann N Sarkaria 8, William M Wells 9,10, Tina Kapur 11, Nathalie Y R Agar 12,13,14,
Editor: Olga Vitek
PMCID: PMC8963284  PMID: 35040929

Abstract

Motivation

Mass spectrometry imaging (MSI) provides rich biochemical information in a label-free manner and therefore holds promise to substantially impact current practice in disease diagnosis. However, the complex nature of MSI data poses computational challenges in its analysis. The complexity of the data arises from its large size, high-dimensionality and spectral nonlinearity. Preprocessing, including peak picking, has been used to reduce raw data complexity; however, peak picking is sensitive to parameter selection that, perhaps prematurely, shapes the downstream analysis for tissue classification and ensuing biological interpretation.

Results

We propose a deep learning model, massNet, that provides the desired qualities of scalability, nonlinearity and speed in MSI data analysis. This deep learning model was used, without prior preprocessing and peak picking, to classify MSI data from a mouse brain harboring a patient-derived tumor. The massNet architecture established automatically learning of predictive features, and automated methods were incorporated to identify peaks with potential for tumor delineation. The model’s performance was assessed using cross-validation, and the results demonstrate higher accuracy and a substantial gain in speed compared to the established classical machine learning method, support vector machine.

Availability and implementation

https://github.com/wabdelmoula/massNet. The data underlying this article are available in the NIH Common Fund’s National Metabolomics Data Repository (NMDR) Metabolomics Workbench under project id (PR001292) with http://dx.doi.org/10.21228/M8Q70T.

Supplementary information

Supplementary data are available at Bioinformatics online.

1 Introduction

Mass spectrometry imaging (MSI) is a rapidly growing technology that can provide spatial mapping of a wide range of biomolecular classes (such as proteins, metabolites and lipids) simultaneously and directly from a tissue section in a label-free manner (McDonnell and Heeren, 2007). These make MSI a promising discovery tool with the potential to impact the accuracy and speed of cancer diagnosis and complement the current practice of anatomic pathology (Dewez et al., 2020; Norris and Caprioli, 2013). In support providing information that can complement anatomic pathology, MSI can detect molecular alterations in diseased tissues prior to the manifestation of observable morphological changes (Abdelmoula et al., 2016; Addie et al., 2015; Randall et al., 2020). Recent developments have also enabled automated multimodal integration between MSI and histology (Abdelmoula et al., 2014; Patterson et al., 2018; Race et al., 2021). Such multimodal integration can harness complementary molecular and anatomical information from biological systems, enabling, e.g. a better understanding of molecular mechanisms and pathology of various diseases (Chaurand et al., 2004; Veselkov et al., 2014), more informed drug metabolite distribution (Castellino et al., 2011; Randall et al., 2018), and more sensitive surgical guidance (Eberlin et al., 2014; Santagata et al., 2014).

Matrix-assisted laser desorption ionization (MALDI) is a promising sample introduction technique for the development of diagnostic applications based on MSI (Basu et al., 2019; Calligaris et al., 2015; Drake et al., 2017; Huizing et al., 2019). Acquisition of spectral data from different molecular classes depends largely on the tissue preparation and matrix selection (Carreira et al., 2015). MSI data acquisition is done through laser scanning of a tissue surface for desorption and ionization of molecules using a predefined spatial resolution grid, which defines pixel dimensions. The mass-to-charge ratio (m/z) and relative intensity of the released ions are detected by a mass analyzer. Each pixel then provides a mass spectrum that would be considered as a high-dimensional data point for machine learning purposes. The quality and complexity of mass spectral data depend upon the mass analyzer. Fourier-transform ion cyclotron resonance mass spectrometry imaging (FT-ICR MSI) offers the highest mass resolving power (Bowman et al., 2020). Such ultrahigh mass resolution technology significantly improves the mass identification accuracy, but at the cost of concomitant increases in data complexity and volume (McDonnell et al., 2010b).

The complexity of raw MSI data poses challenges for classical machine learning approaches, where prone to the curse of dimensionality-related issues and slow processing (Alexandrov, 2020). These complexities can be described in terms of massive dimensionality (e.g. 105-107 m/z values) and a large number of spectra—that can exceed one million spectra per image, depending on both spectral and spatial resolutions—which can reduce clustering and classification quality (Van Der Maaten et al., 2009). MSI data preprocessing such as peak picking has been a fundamental step that, to avoid the aforementioned limitations, preceded the analysis routines of classical machine learning (McDonnell et al., 2010b). This fundamental preprocessing step aims to significantly reduce the original data dimensionality through identification of as many relevant peaks as possible while minimizing the noise. However, the available implementations of currently established peak picking approaches are highly sensitive to the selection of parameters that require expert optimization (e.g. peak shape, signal-to-noise ratio, full-width-half-maximum, peak frequency, baseline subtraction and spectral smoothing; Donnelly et al., 2019; Murta et al., 2021). In addition, the computational performance of peak picking approaches implemented in widely used commercial software is typically quadratic (Alexandrov, 2012), resulting in slow analysis that can take several hours or even a few days depending on mass resolving power and the number of mass spectra processed. A faster alternative but less sensitive approach is to base the peak picking analysis on the mean spectrum (Alexandrov, 2012). McDonnell et al. (2010a) showed that peak picking on the base peak spectrum, which displays the maximum intensity of each m/z value across the dataset, could improve results compared to the analysis of the mean spectrum, although this analysis is still depended upon expert optimization of peak picking parameters. Spectral preprocessing and the effect of parameters selection do not only impact the downstream analysis (e.g. dimensionality reduction, clustering and classification) but can also affect the overall biological interpretation (Murta et al., 2021; Seddiki et al., 2020).

Deep learning methods have revolutionized many application areas of biomedical imaging (Esteva et al., 2017; Hosny et al., 2018; Ronneberger et al., 2015). Unlike well-established classical approaches that require feature engineering (Balluff et al., 2015; Calligaris et al., 2015), deep learning can perform automatic learning of predictive features (Lecun et al., 2015). Deep learning techniques offer scalability, nonlinearity and efficiency that can accommodate the complexity of MSI data (Abdelmoula et al., 2020; Inglese et al., 2017; Thomas et al., 2016). For example, convolutional neural networks (CNN) were successfully applied to preprocessed and peak picked MSI data for tumor classification (Behrmann et al., 2018; Guo et al., 2020; van Kersbergen et al., 2019). Recently, CNN methods revealed promising results when applied on small scale MSI data without prior processing (Seddiki et al., 2020). Despite the promising results of these CNN architectures, the classification process is sensitive to a user defined, hyperparameter at the input layer that is referred to as the receptive field. The receptive field defines a convolutional kernel window in these CNN architectures to identify salient mass spectral patterns that depend on the selected size of the receptive field (Behrmann et al., 2018). Fully connected neural networks (FCNN) were applied on MSI data to perform nonlinear dimensionality reduction (Dexter et al., 2020; Inglese et al., 2017; Thomas et al., 2016), and we recently applied FCNN-based architecture to capture spatial patterns and learn underlying m/z peaks of interest from large-scale MSI data while bypassing conventional preprocessing (Abdelmoula et al., 2020).

In this work, we extend our recent deep learning development of msiPL (Abdelmoula et al., 2020) to enable tissue classification while avoiding potential bias from the user’s parameter optimization in spectral preprocessing. We propose, massNet, a scalable deep learning architecture to perform probabilistic pixel-based classification directly from mass spectral data with massive dimensionality (e.g. tens of thousands of m/z values) and without prior preprocessing such as peak picking. Unlike classical machine learning methods, massNet is capable of automatically learning predictive features from large-scale MSI data. We demonstrate our method on a MALDI 9.4 Tesla FT-ICR MSI dataset from a patient-derived xenograft (PDX) mouse brain tumor model of glioblastoma (GBM). The model stability was evaluated using 5-fold cross-validation, and the classification speed and robustness were assessed on a test set using various established machine learning metrics such as receiver operating characteristic curve (ROC), accuracy, recall, precision and F1-score. Moreover, the classification performance of massNet was benchmarked by comparison to support vector machine (SVM) a widely used classical machine learning method.

2 Materials and methods

2.1 MALDI FT-ICR MSI data

MALDI FT-ICR MSI data acquisition was performed on tissue sections from five different intracranial GBM PDX models (Fig. 1). This MSI data and acquisition process were previously published in another study from our group (Randall et al., 2020). Briefly, eight GBM tissue sections of 12-µm thickness were prepared and analyzed using a 9.4 Tesla SolariX mass spectrometer (Bruker Daltonics, Billerica, MA) in the positive ion mode with spatial resolution of 100 µm. The MSI data were exported from SCiLS lab 2020a (Bruker, Bremen, Germany) in the standardized format imzML (Race et al., 2012) and converted to the HDF5 format (Folk et al., 2011) for deep learning analysis. The data underlying this article are available in the NIH Common Fund’s National Metabolomics Data Repository (NMDR) Metabolomics Workbench (Sud et al., 2016) under project id (PR001292).

Fig. 1.

Fig. 1.

Tissue sections from five intracranial GBM PDX models were divided into training/validation and testing sets: (a) schematic distribution of tissue sections from different GBM models, (b) annotated tumor regions in the MSI datasets which were guided by the H&E annotations (c)

2.2 Microscopy imaging and tumor annotation

Five tissue sections consecutive to those used for the MSI analysis were thaw mounted onto glass slides for hematoxylin and eosin (H&E) staining. The H&E images were acquired using a Zeiss Observer Z.1 microscope equipped with 20× Plan-APOCHROMAT lens and AxioCam MR3 camera. The tumor regions in these H&E images were annotated by an expert pathologist and the annotations were manually transferred to annotate the MSI data as demonstrated in Randall et al. (2020; Fig. 1).

2.3 Deep learning architecture of massNet

The hyperspectral image from MSI encompasses a set of high-dimensional data points X=x1, x2, , xN, where N is the total number of spectra (or pixels) and each spectrum is a d-dimensional point xiRd where d is the total number of m/z variables. The pixel-wise annotation of the hyperspectral image is represented by a binary vector Y=y1, y2, , yN, where yi0,1C is the ground truth class label of spectrum x(i) which belongs to one of the total two-class labels C which are here normal and tumor. Our proposed massNet architecture shown in Figure 2 aims to learn a predictive function F that would establish the relationship between each spectrum xiand its associated class label yi as depicted in

y^i=F(H,xi)), (1)

where y^i0,1C is a probabilistic estimate of the predicted class labels C that is computed based on the optimized artificial neural network hyperparameter set H. Class prediction of normal and tumor for a given spectrum xi can then be, respectively, represented as PNy^ic=1 and 1-PN= PTy^ic=2. The massNet architecture has two main modules to optimize the hyperparameter set H, namely the variational autoencoder (VAE) and probabilistic classification modules. The VAE module aims to learn a nonlinear manifold of the original complex MSI data through optimizing two functions: a probabilistic encoder Pzx and a probabilistic decoder Pθ(z|x) for data reconstruction. The learned nonlinear manifold represents a latent variable zRk;kd which is of lower dimensions (k) and assumed to be sampled from a normal distribution (Coombes et al., 2005) as such z  Nμx, σx. The VAE parameters and θ are optimized based on maximizing the variational lower bound (L) that is given in Equation (2). For more information on VAE we refer readers to Kingma and Welling (2013)  

Lϕ,θ;xi=- KLqϕzxipθz+ Eqϕzxilogpθxiz. (2)

Fig. 2.

Fig. 2.

Deep learning-based architecture of massNet for probabilistic two-class classification of large-scale MSI data without prior preprocessing and peak picking. The artificial neural network is based on spectral-wise analysis and consists of two modules, namely: VAE for nonlinear manifold learning that is captured at the ‘Code’ layer, and two fully connected layers that take input from the ‘Code’ layer to yield probabilistic predictions at the output layer using the sigmoid activation. massNet is regularized based on batch normalization and drop out to maintain learning stabilization and faster optimization

The second module of massNet performs probabilistic classification, which is activated with a sigmoid function (pa=1/(1+e-a)), where a is an arbitrary variable at the output layer. This classifier module is parametrized by a hyperparameter ζ and consists of two fully connected hidden layers and an output layer of two classes (normal and tumor; Fig. 2). These two hidden layers are activated based on a rectified linear unit (ReLU) and take their input from the optimized latent variable (z) of the VAE. The classification result is based on optimizing a loss function of the binary cross-entropy (E), shown in Equation (3), to maximize the similarity between the real and predicted class labels. Finally, the hyperparameters H, given in Equation (1), consist of hyperparameters from these two optimized modules H={,ζ}  

E(y,y^)= -[ylogy^+1-ylog1-y^. (3)

2.4 Data visualization and feature localization

Uniform Manifold Approximation and Projection (UMAP) is a nonlinear dimensionality reduction method for data visualization in a single map representation in which it preserves both local and global data structures from higher dimensional space (McInnes et al., 2018). UMAP has been found promising in visualizing different types of omics data (Becht et al., 2019). The UMAP algorithm was applied on the learned k-dimensional latent variable (z) to enable data visualization in a two-dimensional space. Of note, applying the UMAP on the latent variable instead of the original hyperdimensional MSI data is a more efficient way to avoid the curse of dimensionality (Van Der Maaten et al., 2009).

The classification predictions were spatially visualized (based on the location of each corresponding mass spectra in the image space) to gain insight into the molecular interpretability of the predicted classes. This enabled the predicted class labels to be mapped back into the image space to reveal spatial patterns associated with each of the predicted classes. Image-based Pearson correlation was then applied on each of these spatial patterns and the identified peaks from the VAE module using the msiPL method (Abdelmoula et al., 2020). The highly colocalized m/z peaks are those that achieved the highest Pearson correlation coefficient (r0.7).

2.5 Model evaluation

The model’s learning stability was evaluated using 5-fold cross-validation on the training set (80% training and 20% validation), and the best model was finally applied and assessed on an independent test set. The classification speed and robustness were assessed on a test set using various established machine learning metrics such as ROC, accuracy, recall, precision and F1-score. Moreover, the classification performance of massNet was benchmarked by comparison to SVM a widely used classical machine learning method. The SVM was based on a linear kernel and implemented in python using the default values provided by the public machine learning library of Scikit-learn in which the regularization parameter was set to 1.

3 Results

3.1 Hyperparameters and implementation details of the massNet architecture

The massNet architecture, shown in Figure 2, consists of an input layer Lin, five hidden layers h1,hcode, h3, h4, h5 and two output layers (Lout1, Lout2). The input layer Lin takes its input signal from total-ion-count (TIC) normalized mass spectra. The VAE module has three hidden layers (h1,hcode, h3) and an output layer (Lout1). The lower dimensional latent variable (z) is captured at the Code layer (hcode), which is then used by the generative model for spectral data reconstruction at the output layer (Lout1). The Code layer (hcode) also provides an input to the classification module which has two fully connected hidden layers (h4, h5) and eventually yields a probabilistic estimate of class prediction at the output layer (Lout2). The layer size of each Lin and Lout1 is equivalent to the number of m/z variables, whereas the size of Lout2 is equivalent to the number of class labels. The size of hidden layers is a user tunable parameter, and it was empirically set as h1=512, hcode=5,  h3=512, h4=128, h5=128 in which the VAE parameter settings were adopted from our former msiPL method (Abdelmoula et al., 2020). The ReLU was used as an activation function for all layers except for the two output layers (Lout1, Lout2) which were activated using the sigmoid function. The proposed neural network architecture was regularized using batch normalization and dropout to stabilize the learning process, fasten the convergence and significantly reduce overfitting (Ioffe and Szegedy, 2015; Srivastava et al., 2014). A 20% dropout was added to the two hidden layers of the classification module. The cost function was optimized based on minibatch processing (batch size = 100) using the adaptive stochastic gradient descent method of Adam optimization (with default learning rate = 0.001) and the total number of epochs for VAE and classification modules were, respectively, 100 and 30. The architecture was implemented in python using the publicly available deep learning platforms of Keras (Chollet, 2017) and TensorFlow (Abadi et al., 2016), and it was trained on a PC workstation (Windows 10, Intel Xenon 3.3 GHz, 64-bit Windows and NVIDIA TITAN XP Graphics Card).

3.2 UMAP visualization of the VAE latent variable reveals separation between normal and tumor spectra

The deep learning model was optimized on a training set acquired by MALDI FT-ICR MSI from three different intracranial GBM PDX models, as demonstrated in the left column of Figure 1. This training set encompasses mass spectra with 85 062 m/z variables that were collected from two different classes of normal and tumor tissue types with total number of spectra 7432 and 3572, respectively. The VAE model optimization revealed stable convergence distribution in 100 epochs with random shuffling of data batches (Supplementary Fig. S1) and a total running time of 42.77 min. The optimized model captured the latent variable (z) of five-dimensions at the Code layer (Supplementary Fig. S2) which was then used to efficiently reconstruct the original TIC-normalized MSI data with a total mean squared error (MSE) of 5.31×10-5 (Fig. 3a). The UMAP visualization of the latent variable in two-dimensional space revealed separation of mass spectra from normal and tumor tissues as shown in Figure 3b. The UMAP features were also colored based on different GBM models to reveal intertumor heterogeneity and visualizing the learning efficiency in capturing similarities based on the molecular phenotype and assess potential batch effect (Fig. 3b).

Fig. 3.

Fig. 3.

Performance of the VAE module and nonlinear data visualization: overlay of the TIC-normalized average spectrum of original and reconstructed data for both training (a) and testing (b) datasets. UMAP visualization of the five-dimensional latent variable captured by the VAE model reveals distinction between normal and tumor mass spectra from different GBM models for both training (b) and testing (d) dataset

3.3 Pretrained VAE enables optimization of the classification module

The pretrained and optimized VAE provides input to the classifier module through the latent variable that was captured at the Code layer (hcode). The classification module was optimized in 30 epochs which also included random shuffling of data batches and the total running time of 4.5 min. To avoid potential overfitting and provide unbiased estimate of out-of-sample error (i.e. on test data), the model was evaluated using 5-fold cross-validation. For each of the cross-validation folds, the entire training dataset (shown in Fig. 1a and b) was randomly divided into 80% training and 20% validation. The robustness and evaluation of the different cross-validation models are shown in Supplementary Table S1, in which the best model with maximum accuracy and minimum loss values was eventually selected to be further independently evaluated on the withheld test set. The best model was benchmarked by comparison to SVM where both models showed comparable performance but massNet was faster (Table 1). Of note, the total running time shown in Table 1 for optimizing the overall massNet model included the running time for both the VAE and the classification modules.

Table 1.

Classification performance during both training and testing phases on MALDI FT-ICR MSI dataset without peak picking

Phase Class Spectra # Precision (%) Recall (%) F1-score (%) Accuracy (%) Time (min)
Training phase: massNet Normal 7432 100 100 100 100 47.27
Tumor 3572 100 100 100
Training phase: SVM Normal 7432 94 98 96 94 97.95
Tumor 3572 95 87 91
Test phase: massNet Normal 12 390 100 97 98 97.6 0.59
Tumor 2933 90 98 94
Test phase: SVM Normal 12 390 100 96 98 96.14 117.14
Tumor 2933 84 98 91

3.4 Rapid and efficient classification performance on MALDI MSI test set of large dimensions

The optimized massNet model was evaluated on the withheld test set of MALDI FT-ICR MSI data from four tissues from two different intracranial GBM PDX models as demonstrated in the right column of Figure 1. This test set encompasses mass spectra with 85 062 m/z variables that were collected from two different classes of normal and tumor tissue types with total number of spectra 12 390 and 2933, respectively. The optimized VAE module took 61.8 s to analyze the entire test set as such the latent variable was captured (Supplementary Fig. S3) and the original TIC-normalized MSI test data were reconstructed with an overall MSE of 5.917×10-5 (Fig. 3c). The UMAP visualization of the five-dimensional latent variable revealed separation between molecularly distinct phenotypes of normal and tumor tissues, as shown in Figure 3d.

The optimized massNet took about 35.34 s to provide spectral-wise probabilistic prediction of the entire test set. The robustness of this ultrafast classification (compared to SVM, Table 1) is supported by ROC analysis (AUC values for normal and tumor classes were 99.54% and 99.55%, respectively) and the confusion matrix which revealed true negative and true positive values of 97% and 98% respectively, as demonstrated in Figure 4. This classification performance was further benchmarked by comparison to an SVM classifier using different machine learning evaluation metrics shown in Table 1. Of note, both the massNet and SVM models were first optimized on the training set and then applied on this test set. The overall performance of massNet was slightly higher than SVM with respective accuracy of 96.8% and 95.26%, but more significantly, the massNet was substantially faster than SVM as shown in Table 1. Here, the slow running time of SVM was mainly due to the massive dimensionality of MSI data and the computational complexity of SVM is at least quadratic O(nfeatures×msamples2) during either training or prediction stages (Chang and Lin, 2011).

Fig. 4.

Fig. 4.

Classification performance on the MALDI FT-ICR MSI test set: (a) ROC curve distribution for both normal (blue) and tumor (orange) classes with an AUC of 99.54% and 99.55%, respectively. (b) Confusion matrix showing the prediction performance compared to the ground truth labels

3.5 Spatial mapping of predicted scores reveals uncertainty at the tumor margin

The probabilistic scores of the predicted class labels were spatially mapped to enable their visualization in the image space (Fig. 5a and b). Despite the massNet model being quite accurate in predicting the true class labels in most of the tissue regions, it showed a higher level of uncertainty at the tumor margin (Fig. 5c). This could reflect that the tumor margin has a convoluted molecular signature likely resulting from mixtures of normal and tumor cells compared to either the tumor core or normal tissue types, therefore representing an infiltrative edge. This observation is in accordance with other studies that studied mass spectral signatures from tumor margins (Calligaris et al., 2014). Identification of m/z peaks that are highly correlated with each of the predicted class labels were determined through the Pearson correlation coefficient. This coefficient was computed by correlating each of the predicted spatial maps with each of the 716 m/z values that were identified by msiPL. A list of top correlated m/z values correlated with the tumor region were identified and presented in Table 2, whereas the highest correlated values for normal tissue are shown in Supplementary Table S2. One of the highest correlated peak with the tumor class was at m/z480.9211±0.01 which is elevated in the tumor region (Fig. 5e). The automated nonlinear image registration method (Abdelmoula et al., 2014) was applied to nonlinearly warp this ion image and fuse it with an image of the H&E stained tissue section (Fig. 5d and e). This multimodal integration can provide a reference system for pathologists to further study and gain more insight about the tissue that can go beyond tissue anatomy (Caprioli, 2019).

Fig. 5.

Fig. 5.

Spatial mapping of the classification predictions and multimodal integration: spatial distribution of the spectral-wise probabilistic predictions for normal (a) and tumor (b) classes. (c) Closeup visualization of the spatially mapped tumor prediction scores reveals a higher level of uncertainty at the interface between normal and tumor (i.e. tumor margins). (d) The H&E microscopy images show tumor regions in different GBM models (columns). (e) Multimodal integration of the H&E images and ion image at m/z 480.9211 ± 0.01 which is highly correlated and elevated in the tumor region

Table 2.

A list top correlated m/z values with the tumor tissue

m/z Correlation Tentative assignment Molecular formula Adduct m/z calculated Database Error (ppm)
450.29788 0.894
468.3084 0.845
480.9211 0.841 Uridine diphosphate (UDP) C9H14N2O12P2 [M + 2K − H]+ HMDB −0.21
730.537 0.824
428.3732 0.8 Stearoylcarnitine C25H49NO4 [M + H]+ HMDB 0.47
518.8773 0.79
400.9546 0.7807 Uridine monophosphate (UMP) C9H13N2O9P [M + 2K − H]+ 400.9554553 HMDB −2.1
545.9591 0.775 Adenosine triphosphate (ATP) C10H16N5O13P3 [M + K]+ 545.9589 Metlin 0.36
703.57416 0.7235 SM (34:1)/PE-Cer (37d/1) C39H79N2O6P [M + H]+ 703.5749 Metlin −1.05
583.91437 0.717 Adenosine triphosphate (ATP) C10H16N5O13P3 [M + 2K − H]+ 583.9148 HMDB −0.74

4 Discussion

We presented massNet, a fully connected deep learning architecture for spectral-wise classification of large-scale MSI data without prior preprocessing and peak picking. The massNet architecture consists of two main modules, namely: VAE and classification modules. The VAE module learns a lower dimensional latent variable that was used to reconstruct the original MSI data and assess the learning quality of the VAE module. That optimized latent variable provides an input to the classification modules which was activated at the output layer using a sigmoid function to provide a probabilistic estimate of the predicted class membership. Unlike other architectures for MSI data classification, massNet is based on a fully connected artificial neural network and does not rely on optimizing a receptive field parameter, which is inherently associated with convolutional artificial neural networks and has an impact on the learning process (Behrmann et al., 2018; Guo et al., 2020; van Kersbergen et al., 2019). The optimized model showed high accuracy and ultrafast performance (<1 min) on the full MSI test data that encompassed a total of 15 323 mass spectra with 85 062 m/z variables. When compared for analysis of the same test dataset, the trained massNet model achieved higher accuracy and was substantially faster than a trained SVM model that took about 117.14 min to provide prediction during the testing phase (Table 1).

The performance of massNet methods was also benchmarked on preprocessed MALDI FT-ICR MSI GBM data after peak picking in SCiLS lab 2020a (Bruker, Bremen). The peak picking algorithm is based on the orthogonal matching pursuit (OMP) algorithm which has a quadratic computational complexity (Alexandrov et al., 2010). The peak picking analysis took 2 h and 58 min on the same workstation (Windows 10, Intel Xenon 3.3 GHz, 64-bit Windows and NVIDIA TITAN XP Graphics Card) and resulted in a reduction of the feature space from 85 062 to 243 dimensions. The overall training time of the massNet model on the preprocessed data was faster (Supplementary Table S3) compared to the analysis of MSI data without preprocessing (Table 1). The massNet training convergence distribution is shown in Supplementary Figure S4a, and the optimized model captured latent variables that revealed structural information (Supplementary Fig. S4b). The two-dimensional UMAP visualization of these latent variables is given in Supplementary Figure S4c and d and shows separation between normal and tumor. The trained model was then applied on the unseen and peak picked test set, and classification performance shows higher AUC scores for both normal and tumor classes, respectively (Supplementary Fig. S5). These results are in agreement with the MSI analysis without classical preprocessing (Figs 3 and 4). While the model training was faster on preprocessed data for both massNet and SVM (Table 1 and Supplementary Table S3), the peak picking step is more computationally expensive and sensitive to parameter settings that can impact downstream analysis and biological interpretation (Murta et al., 2021).

While the proposed model showed promising results for binary class classification, the model could be extended in future development to enable multiclass classification. We envision a slight change in the massNet architecture mainly at the output layer (Lout2) in which its size will be defined based on the new number of class labels and it must be activated using a different function such as softmax. The ground truth annotations could be defined using microscopy images and then mapped into the MSI space using image registration (Heijs et al., 2015). Depending on the application and tissue types, different stainings could be investigated to provide increased structural annotations compared to the most common H&E staining. MSI and microscopy imaging are complementary modalities and annotating the MSI data based solely on annotated histology is an approximation approach which might not be optimal especially in case of intratumor heterogeneity (Balluff et al., 2015). This is mainly because the MSI data could reveal molecular intratumor heterogeneity that is not yet observed in the microscopic image (Jones et al., 2012; Randall et al., 2020).

The proposed model was applied to the analysis of MALDI FT-ICR MSI data here but the massNet architecture is independent of the mass spectral ionization nature and could be applied to the analysis of MSI data from different platforms with distinct ionization and mass analyzer technologies. The massNet architecture could also be applied to the classification of different tissues and tumor types where the challenging task becomes the establishment of ground truth annotation of the MSI data. For instance, the massNet revealed a higher level of uncertainty around tumor margins in the binary classification task as shown in Figure 5b and c. This could reflect distinct molecular signatures in the tumor margin area compared to either the tumor core or normal tissue types. Tumor margins could therefore either be treated as an independent class or deconvoluted to extract signal from tumor and normal cells; however, the ground truth annotation is a challenging task that would require more pathological investigation and could be enhanced by integrating with multiplexed immunofluorescence to clearly delineate cell types from the tumor and microenvironment.

Funding

This work was supported by the NIH Grants U54-CA210180 (N.Y.R.A.), P41-EB028741 (N.Y.R.A.) and T32EB025823 (S.A.S.).

Conflict of Interest: In compliance with Harvard Medical School and Partners Healthcare guidelines on potential conflict of interest, we disclose that N.Y.R.A. is a scientific advisor to BayesianDX and inviCRO.

Data availability

Additional data are available from the corresponding author on reasonable request.

Supplementary Material

btac032_Supplementary_Data

Contributor Information

Walid M Abdelmoula, Department of Neurosurgery, Brigham and Women’s Hospital, Harvard Medical School, Boston, MA 02115, USA; Invicro LLC, Boston, MA 02210, USA.

Sylwia A Stopka, Department of Neurosurgery, Brigham and Women’s Hospital, Harvard Medical School, Boston, MA 02115, USA; Department of Radiology, Brigham and Women’s Hospital, Harvard Medical School, Boston, MA 02115, USA.

Elizabeth C Randall, Department of Radiology, Brigham and Women’s Hospital, Harvard Medical School, Boston, MA 02115, USA.

Michael Regan, Department of Neurosurgery, Brigham and Women’s Hospital, Harvard Medical School, Boston, MA 02115, USA.

Jeffrey N Agar, Department of Chemistry and Chemical Biology, Northeastern University, Boston, MA 02111, USA.

Jann N Sarkaria, Department of Radiation Oncology, Mayo Clinic, Rochester, MN 55902, USA.

William M Wells, Department of Radiology, Brigham and Women’s Hospital, Harvard Medical School, Boston, MA 02115, USA; Computer Science and Artificial Intelligence Laboratory, MIT, Cambridge, MA 02139, USA.

Tina Kapur, Department of Radiology, Brigham and Women’s Hospital, Harvard Medical School, Boston, MA 02115, USA.

Nathalie Y R Agar, Department of Neurosurgery, Brigham and Women’s Hospital, Harvard Medical School, Boston, MA 02115, USA; Department of Radiology, Brigham and Women’s Hospital, Harvard Medical School, Boston, MA 02115, USA; Department of Cancer Biology, Dana-Farber Cancer Institute, Harvard Medical School, Boston, MA 02115, USA.

References

  1. Abadi M.  et al. (2016) TensorFlow: a system for large-scale machine learning. In: 12th USENIX Symposium on Operating Systems Design and Implementation (OSDI’16), USA, Vol. 16, pp. 265–283. [Google Scholar]
  2. Abdelmoula W.M.  et al. (2014) Automatic generic registration of mass spectrometry imaging data to histology using nonlinear stochastic embedding. Anal. Chem., 86, 9204–9211. [DOI] [PubMed] [Google Scholar]
  3. Abdelmoula W.M.  et al. (2016) Data-driven identification of prognostic tumor subpopulations using spatially mapped t-SNE of Mass spectrometry imaging data. Proc. Natl. Acad. Sci. USA, 113, 12244–12249. [DOI] [PMC free article] [PubMed] [Google Scholar]
  4. Abdelmoula W.M.  et al. (2020) Peak learning of mass spectrometry imaging data using artificial neural networks. Nat. Commun., 12, 1–13. [DOI] [PMC free article] [PubMed] [Google Scholar]
  5. Addie R.D.  et al. (2015) Current state and future challenges of mass spectrometry imaging for clinical research. Anal. Chem., 87, 6426–6433. [DOI] [PubMed] [Google Scholar]
  6. Alexandrov T. (2012) MALDI imaging mass spectrometry: statistical data analysis and current computational challenges. BMC Bioinformatics, 13, S11. [DOI] [PMC free article] [PubMed] [Google Scholar]
  7. Alexandrov T. (2020) Spatial metabolomics and imaging mass spectrometry in the age of artificial intelligence. Annu. Rev. Biomed. Data Sci., 3, 61–87. [DOI] [PMC free article] [PubMed] [Google Scholar]
  8. Alexandrov T.  et al. (2010) Spatial segmentation of imaging mass spectrometry data with edge-preserving image denoising and clustering. J. Proteome Res., 9, 6535–6546. [DOI] [PubMed] [Google Scholar]
  9. Balluff B.  et al. (2015) De novo discovery of phenotypic intratumour heterogeneity using imaging mass spectrometry. J. Pathol., 235, 3–13. [DOI] [PubMed] [Google Scholar]
  10. Basu S.S.  et al. (2019) Rapid MALDI mass spectrometry imaging for surgical pathology. npj Precis. Oncol., 3, 17. [DOI] [PMC free article] [PubMed] [Google Scholar]
  11. Becht E.  et al. (2019) Dimensionality reduction for visualizing single-cell data using UMAP. Nat. Biotechnol., 37, 38–44. [DOI] [PubMed] [Google Scholar]
  12. Behrmann J.  et al. (2018) Deep learning for tumor classification in imaging mass spectrometry. Bioinformatics, 34, 1215–1223. [DOI] [PubMed] [Google Scholar]
  13. Bowman A.P.  et al. (2020) Ultra-high mass resolving power, mass accuracy, and dynamic range MALDI mass spectrometry imaging by 21-T FT-ICR MS. Anal. Chem., 92, 3133–3142. [DOI] [PMC free article] [PubMed] [Google Scholar]
  14. Calligaris D.  et al. (2014) Application of desorption electrospray ionization mass spectrometry imaging in breast cancer margin analysis. Proc. Natl. Acad. Sci. USA, 111, 15184–15189. [DOI] [PMC free article] [PubMed] [Google Scholar]
  15. Calligaris D.  et al. (2015) MALDI mass spectrometry imaging analysis of pituitary adenomas for near-real-time tumor delineation. Proc. Natl. Acad. Sci. USA, 112, 9978–9983. [DOI] [PMC free article] [PubMed] [Google Scholar]
  16. Caprioli R.M. (2019) Imaging mass spectrometry: a perspective. J. Biomol. Tech., 30, 7–11. [DOI] [PMC free article] [PubMed] [Google Scholar]
  17. Carreira R.J.  et al. (2015) Large-scale mass spectrometry imaging investigation of consequences of cortical spreading depression in a transgenic mouse model of migraine. J. Am. Soc. Mass Spectrom., 26, 853–861. [DOI] [PMC free article] [PubMed] [Google Scholar]
  18. Castellino S.  et al. (2011) MALDI imaging mass spectrometry: bridging biology and chemistry in drug development. Bioanalysis, 3, 2427–2441. [DOI] [PubMed] [Google Scholar]
  19. Chang C.C., Lin C.J. (2011) LIBSVM: a library for support vector machines. ACM Trans. Intell. Syst. Technol., 2, 1–27. [Google Scholar]
  20. Chaurand P.  et al. (2004) Integrating histology and imaging mass spectrometry. Anal. Chem., 76, 1145–1155. [DOI] [PubMed] [Google Scholar]
  21. Chollet F. (2017) Keras (2015). http://keras.io (accessed 27 January 2022).
  22. Coombes K.R.  et al. (2005) Understanding the characteristics of mass spectrometry data through the use of simulation. Cancer Inform., 1, 41–52. [PMC free article] [PubMed] [Google Scholar]
  23. Dewez F.  et al. (2020) MS imaging-guided microproteomics for spatial omics on a single instrument. Proteomics, 20, 1900369. [DOI] [PubMed] [Google Scholar]
  24. Dexter A.  et al. (2020) Training a neural network to learn other dimensionality reduction removes data size restrictions in bioinformatics and provides a new route to exploring data representations. bioRxiv preprint. doi: 10.1101/2020.09.03.269555. [DOI]
  25. Donnelly D.P.  et al. (2019) Best practices and benchmarks for intact protein analysis for top-down mass spectrometry. Nat. Methods, 16, 587–594. [DOI] [PMC free article] [PubMed] [Google Scholar]
  26. Drake R.R.  et al. (2017) MALDI mass spectrometry imaging of N-linked glycans in cancer tissues. Adv Cancer Res., 134, 85–116. [DOI] [PMC free article] [PubMed] [Google Scholar]
  27. Eberlin L.S.  et al. (2014) Molecular assessment of surgical-resection margins of gastric cancer by mass-spectrometric imaging. Proc. Natl. Acad. Sci. USA, 111, 2436–2441. [DOI] [PMC free article] [PubMed] [Google Scholar]
  28. Esteva A.  et al. (2017) Dermatologist-level classification of skin cancer with deep neural networks. Nature, 542, 115–118. [DOI] [PMC free article] [PubMed] [Google Scholar]
  29. Folk M.  et al. (2011) An overview of the HDF5 technology suite and its applications. In: Proceedings of the EDBT/ICDT 2011 Workshop on Array Databases. ACM, USA, pp. 36–47. [Google Scholar]
  30. Guo D.  et al. (2020) Deep multiple instance learning classifies subtissue locations in mass spectrometry images from tissue-level annotations. Bioinformatics, 36, i300–i308. [DOI] [PMC free article] [PubMed] [Google Scholar]
  31. Heijs B.  et al. (2015) Histology-guided high-resolution matrix-assisted laser desorption ionization mass spectrometry imaging. Anal. Chem., 87, 11978–11983. [DOI] [PubMed] [Google Scholar]
  32. Hosny A.  et al. (2018) Artificial intelligence in radiology. Nat. Rev. Cancer, 18, 500–510. [DOI] [PMC free article] [PubMed] [Google Scholar]
  33. Huizing L.R.S.  et al. (2019) Development and evaluation of matrix application techniques for high throughput mass spectrometry imaging of tissues in the clinic. Clin. Mass Spectrom., 12, 7–15. [DOI] [PMC free article] [PubMed] [Google Scholar]
  34. Inglese P.  et al. (2017) Variational autoencoders for tissue heterogeneity exploration from (almost) no preprocessed mass spectrometry imaging data. arXiv preprint. arXiv:1708.07012.
  35. Ioffe S., Szegedy C. (2015) Batch normalization: accelerating deep network training by reducing internal covariate shift. arXiv preprint. arXiv:1502.03167.
  36. Jones E.A.  et al. (2012) Imaging mass spectrometry statistical analysis. J. Proteomics, 75, 4962–4989. [DOI] [PubMed] [Google Scholar]
  37. Kingma D.P., Welling M. (2013) Auto-encoding variational bayes. arXiv preprint. arXiv:1312.6114.
  38. Lecun Y.  et al. (2015) Deep learning. Nature, 521, 436–444. [DOI] [PubMed] [Google Scholar]
  39. McDonnell L.A., Heeren R.M. (2007) Imaging mass spectrometry. Mass Spectrom. Rev., 26, 606–643. [DOI] [PubMed] [Google Scholar]
  40. McDonnell L.A.  et al. (2010a) Automated imaging MS: toward high throughput imaging mass spectrometry. J. Proteomics, 73, 1279–1282. [DOI] [PubMed] [Google Scholar]
  41. McDonnell L.A.  et al. (2010b) Imaging mass spectrometry data reduction: automated feature identification and extraction. J. Am. Soc. Mass Spectrom., 21, 1969–1978. [DOI] [PubMed] [Google Scholar]
  42. McInnes L.  et al. (2018) UMAP: uniform manifold approximation and projection for dimension reduction. arXiv Preprint. arXiv1802.03426.
  43. Murta T.  et al. (2021) Implications of peak selection in the interpretation of unsupervised mass spectrometry imaging data analyses. Anal. Chem., 93, 2309–2316. [DOI] [PubMed] [Google Scholar]
  44. Norris J.L., Caprioli R.M. (2013) Imaging mass spectrometry: a new tool for pathology in a molecular age. Proteomics, 7, 733–738. [DOI] [PMC free article] [PubMed] [Google Scholar]
  45. Patterson N.H.  et al. (2018) Next generation histology-directed imaging mass spectrometry driven by autofluorescence microscopy. Anal. Chem., 90, 12404–12413. [DOI] [PMC free article] [PubMed] [Google Scholar]
  46. Race A.M.  et al. (2012) Inclusive sharing of mass spectrometry imaging data requires a converter for all. J. Proteomics, 75, 5111–5112. [DOI] [PubMed] [Google Scholar]
  47. Race A.M.  et al. (2021) Deep learning-based annotation transfer between molecular imaging modalities: an automated workflow for multimodal data integration. Anal. Chem., 93, 3061–3071. [DOI] [PubMed] [Google Scholar]
  48. Randall E.C.  et al. (2018) Integrated mapping of pharmacokinetics and pharmacodynamics in a patient-derived xenograft model of glioblastoma. Nat. Commun., 9, 4904. [DOI] [PMC free article] [PubMed] [Google Scholar]
  49. Randall E.C.  et al. (2020) Localized metabolomic gradients in patient-derived xenograft models of glioblastoma. Cancer Res., 80, 1258–1267. [DOI] [PMC free article] [PubMed] [Google Scholar]
  50. Ronneberger O.  et al. (2015) U-Net: convolutional networks for biomedical image segmentation. In: International Conference on Medical image computing and computer-assisted intervention. Springer, Cham, 2015.
  51. Santagata S.  et al. (2014) Intraoperative mass spectrometry mapping of an onco-metabolite to guide brain tumor surgery. Proc. Natl. Acad. Sci. USA, 111, 11121–11126. [DOI] [PMC free article] [PubMed] [Google Scholar]
  52. Seddiki K.  et al. (2020) Cumulative learning enables convolutional neural network representations for small mass spectrometry data classification. Nat. Commun., 11, 5595. [DOI] [PMC free article] [PubMed] [Google Scholar]
  53. Srivastava N.  et al. (2014) Dropout: a simple way to prevent neural networks from overfitting. J. Mach. Learn. Res., 15, 1929–1958. [Google Scholar]
  54. Sud M.  et al. (2016) Metabolomics workbench: an international repository for metabolomics data and metadata, metabolite standards, protocols, tutorials and training, and analysis tools. Nucleic Acids Res., 44, D463–D470. [DOI] [PMC free article] [PubMed] [Google Scholar]
  55. Thomas S.A.  et al. (2016) Dimensionality reduction of mass spectrometry imaging data using autoencoders. In: 2016 IEEE Symposium Series on Computational Intelligence (SSCI). New York, USA: IEEE, pp. 1–7. [Google Scholar]
  56. Van Der Maaten L.J.P.  et al. (2009) Dimensionality reduction: a comparative review. J. Mach. Learn. Res., 10, 1–41. [Google Scholar]
  57. van Kersbergen J.  et al. (2019) Cancer detection in mass spectrometry imaging data by dilated convolutional neural networks. In: Medical Imaging: Digital Pathology. Vol.  10956.  International Society for Optics and Photonics, Washington, USA: SPIE, p. 109560I. [Google Scholar]
  58. Veselkov K.A.  et al. (2014) Chemo-informatic strategy for imaging mass spectrometry-based hyperspectral profiling of lipid signatures in colorectal cancer. Proc. Natl. Acad. Sci. USA, 111, 1216–1221. [DOI] [PMC free article] [PubMed] [Google Scholar]

Associated Data

This section collects any data citations, data availability statements, or supplementary materials included in this article.

Supplementary Materials

btac032_Supplementary_Data

Data Availability Statement

Additional data are available from the corresponding author on reasonable request.


Articles from Bioinformatics are provided here courtesy of Oxford University Press

RESOURCES