Scientific Reports. 2020 Jan 29;10:1462. doi: 10.1038/s41598-020-58299-7

Machine learning-based prediction of glioma margin from 5-ALA induced PpIX fluorescence spectroscopy

Pierre Leclerc 1,2, Cedric Ray 1, Laurent Mahieu-Williame 2, Laure Alston 2, Carole Frindel 2, Pierre-François Brevet 1, David Meyronet 3,4, Jacques Guyotat 3, Bruno Montcel 2,✉,#, David Rousseau 2,5,#
PMCID: PMC6989497  PMID: 31996727

Abstract

Gliomas are infiltrative brain tumors whose margin is difficult to identify. 5-ALA induced PpIX fluorescence measurements are a clinical standard, but expert-based classification models still lack sensitivity and specificity. Here a fully automatic clustering method is proposed to discriminate glioma margin from spectroscopic fluorescence measurements acquired with a recently introduced intraoperative setup. We describe a data-driven selection of the best spectral features and show how it improves the discrimination of margin from healthy tissue compared with the standard biomarker-based prediction. This pilot study, based on 10 patients and 50 samples, shows promising results, with a best performance of 77% accuracy in discriminating healthy tissue from margin tissue.

Subject terms: Diagnostic markers, Surgical oncology

Introduction

Gliomas account for more than fifty percent of primary brain tumors. They are infiltrative tumors, with a margin that is difficult to identify and to discriminate from the surrounding healthy tissue. The World Health Organization (WHO) classifies gliomas into 4 grades1, but most studies consider two separate groups: High-Grade Gliomas (HGG) and Low-Grade Gliomas (LGG). Studies have shown that in 85% of cases, recurrences of HGG are localized less than 2 centimeters away from the initial tumor2. Improving the extent of resection is therefore relevant to prevent recurrence and to improve quality of life and life expectancy3–5. Pre-operative MRI combined with neuro-navigation is currently used in the operating theater6,7 but shows strong limitations8–10. 5-aminolevulinic acid (5-ALA) induced protoporphyrin IX (PpIX) fluorescence microscopy has shown its relevance in neuro-oncology11. PpIX absorbs light at 405 nm and emits fluorescence with a main peak centered at 634 nm. This technique is the current clinical standard for PpIX-based surgical assistance. However, its sensitivity is still limited when applied to low-density infiltrative parts of HGG12,13 or to LGG14.

Various 5-ALA induced PpIX fluorescence spectroscopy methods have been proposed to overcome these sensitivity issues. Previous works6,15–24 focus on the extraction of biomarkers from the measurements, based on a priori information on the link between the biomarkers and the microenvironment of PpIX. These approaches are known as expert-based, and various biomarker models have been proposed in the literature. Quantification of PpIX concentration15 shows enhanced sensitivity in both HGG16 and LGG17. Normalization procedures can also increase the robustness of biomarkers6,18,19. Other works suggest that relevant models could be obtained based on the shape of the PpIX emission spectrum18–26. These works show that the spectral complexity of PpIX fluorescence emission in tissue is closely linked with the pathological status. However, the origin of this complexity is still unresolved, which hinders the extraction of the best features with an expert-based method and thus prevents the classification of measurements into relevant pathological statuses.

In this study, we adopt a different approach for the prediction of glioma margin from fluorescence information. Instead of choosing a small number of numerical features used as biomarkers, as in the recent literature cited above6,15–24, we investigate the prediction of glioma margin with an entirely data-driven approach in which no expert-based feature selection is assumed. To this end, we implement, for the first time to our knowledge in this context, a machine learning classification approach. The pipeline, illustrated in Fig. 1 and detailed in the Methods section, predicts glioma margin from the raw fluorescence spectrum. It is applied to the same data acquired previously during surgical procedures of 5-ALA induced PpIX guided glioma removal, which had so far only been processed with a biomarker approach23. This choice enables a comparison of the glioma margin prediction performance of a biomarker approach with that of a novel expert-independent approach. As another element of novelty, the prediction of glioma margin is performed from 3 different fluorescence spectra acquired in response to 3 distinct excitation wavelengths applied successively over the same area, whereas previously only a few features extracted from a single fluorescence spectrum were used for analysis23.

Figure 1.

Global view of the proposed machine learning-based prediction of glioma margin by PpIX fluorescence spectroscopic measurements. In this study, the data set is composed of 50 samples from 10 patients. From left to right, the optical spectrum of cells around a tumor is measured. The dimension of the spectral information is then reduced to lower the redundancy. Supervised or unsupervised algorithms are finally used to classify the data and create a prediction of tissue state from the PpIX fluorescence spectroscopic measurements.

Results

Before comparing the performance of machine-learning-based classification (supervised and unsupervised) with that of an expert-based approach, we estimate the number of clusters and identify the best spectral features with the pipeline of Fig. 1, described step by step in the Methods section.

Data driven estimation of the number of clusters corresponds to the clinical taxonomy

The number of classes in the data set was automatically estimated using data-driven methods. As illustrated in Fig. 2, the optimum number of clusters is robustly found between 3 and 4. This is indicated by the Bayesian information criterion (BIC)27 and the gap criterion28, where the extrema of the curves indicate the optimal number of clusters for both tested clustering methods (K-Means, Gaussian Mixture Models). Interestingly, although obtained here from a purely data-driven approach, this number of clusters is compatible (see red dotted lines in Fig. 2) with the number of classes proposed independently by the clinical taxonomy described in ref. 23 from histological images: tumor core, high-density margin, low-density margin, healthy tissue.

Figure 2.

Bayesian information criterion (BIC) (left) and gap criterion (right) as a function of the number of clusters for K-means (top row) and GMM (bottom row). The minimum of the BIC and the maximum of the gap criterion (highlighted by the red dash-dotted line) correspond to the optimal number of clusters in our data. Interestingly, K-means and GMM are best described with 3 or 4 clusters, which fits with the red dotted lines corresponding to the number of classes from the clinical taxonomy.
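To make the gap criterion concrete, the following is a minimal sketch on synthetic data, not the authors' implementation. The plain K-means routine, the uniform bounding-box reference distribution, the helper names (`kmeans_inertia`, `gap_statistic`) and all parameter values are assumptions chosen for illustration. As in the caption above, the number of clusters with the largest gap is retained.

```python
import numpy as np

def kmeans_inertia(X, k, rng, n_init=5, n_iter=100):
    """Best within-cluster sum of squares over n_init random K-means starts."""
    best = np.inf
    for _ in range(n_init):
        # Initialize centers on k distinct data points.
        centers = X[rng.choice(len(X), size=k, replace=False)]
        for _ in range(n_iter):
            dist = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
            labels = dist.argmin(axis=1)
            # Keep the previous center if a cluster happens to be empty.
            new_centers = np.array([
                X[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
                for j in range(k)
            ])
            if np.allclose(new_centers, centers):
                break
            centers = new_centers
        dist = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        best = min(best, dist.min(axis=1).sum())
    return best

def gap_statistic(X, k_max, n_ref=10, seed=0):
    """Gap(k) = mean(log W_k of uniform reference data) - log W_k of the data."""
    rng = np.random.default_rng(seed)
    lo, hi = X.min(axis=0), X.max(axis=0)
    log_w = np.empty(k_max)
    gaps = np.empty(k_max)
    for k in range(1, k_max + 1):
        log_w[k - 1] = np.log(kmeans_inertia(X, k, rng))
        ref = [np.log(kmeans_inertia(rng.uniform(lo, hi, size=X.shape), k, rng))
               for _ in range(n_ref)]
        gaps[k - 1] = np.mean(ref) - log_w[k - 1]
    return gaps, log_w

# Synthetic stand-in for the reduced feature space: 4 separated groups in 3 dimensions.
data_rng = np.random.default_rng(42)
group_centers = np.array([[0, 0, 0], [6, 0, 0], [0, 6, 0], [0, 0, 6]], dtype=float)
X = np.vstack([c + data_rng.normal(scale=0.5, size=(15, 3)) for c in group_centers])

gaps, log_w = gap_statistic(X, k_max=6)
best_k = int(np.argmax(gaps)) + 1
print("gap per k:", np.round(gaps, 2), "-> k with largest gap:", best_k)
```

Other selection rules exist for the gap statistic (for example rules that account for the spread of the reference values); the simple maximum is used here only because it mirrors the description in the text.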

Identification of best spectral features

Our original feature space is composed of the raw fluorescence spectrum sampled at 900 emission wavelengths (from 435 to 840 nm) for each of three distinct excitation wavelengths, i.e. 2700 features. In order both to select the best spectral features and to analyze their shapes, dimension reduction techniques were applied. Based on principal component analysis (PCA)29, the Scree test30 and the cumulative variance were used to find the statistically relevant principal spectral components. As illustrated in Fig. 3, both the Scree test (left) and the cumulative variance (right) show that a minimum of 5 components is required to describe the data variance. Indeed, the Scree test “elbow” and the saturation around 95% of cumulative variance are both found around five components (red dotted vertical line in Fig. 3). By comparison, in our previous study23, for one excitation wavelength, 2 features were extracted from the spectrum: the relative intensity of the component leading to a peak of PpIX fluorescence at 620 nm (PpIX620) and the relative intensity of the component leading to a peak of PpIX fluorescence at 634 nm (PpIX634). With three excitation wavelengths, the resulting feature space of this model has 2 × 3 = 6 descriptors to describe the variability of the data. Thus the number of statistically relevant descriptors computed with the data-driven principal component analysis (5 components) is compatible with that of the expert-based model (6 descriptors). In addition to the number of relevant principal components, the shapes of these components were analyzed in the original feature space, i.e. the fluorescence emission spectrum. They are plotted in Fig. 4. In order to simplify the visualization, only the result for one excitation wavelength (405 nm) is displayed; the results for the two other excitation wavelengths were very similar.

Figure 3.

Scree test (on the left) and cumulative variance (on the right) of principal component analysis. A minimum of 5 principal components is required to describe the data variance as can be inferred from the Scree test “elbow” and the saturation around 95% of the cumulative variance highlighted in red dotted lines.
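The cumulative-variance criterion can be sketched as follows. This is an illustration on synthetic data rather than the measured spectra: the five hidden patterns, the noise level and the variable names are assumptions; only the matrix size (50 samples, 900 emission wavelengths × 3 excitation wavelengths) and the 95% threshold come from the text.

```python
import numpy as np

rng = np.random.default_rng(0)
n_samples, n_emission, n_excitation = 50, 900, 3
n_features = n_emission * n_excitation  # three spectra concatenated per sample

# Synthetic stand-in for the spectra (assumption): 5 hidden patterns plus weak noise.
n_latent = 5
patterns = rng.normal(size=(n_latent, n_features))
patterns /= np.linalg.norm(patterns, axis=1, keepdims=True)
scores = rng.normal(scale=3.0, size=(n_samples, n_latent))
X = scores @ patterns + rng.normal(scale=0.01, size=(n_samples, n_features))

# PCA through the singular values of the mean-centered data matrix.
Xc = X - X.mean(axis=0)
singular_values = np.linalg.svd(Xc, compute_uv=False)
explained_ratio = singular_values**2 / np.sum(singular_values**2)  # Scree curve
cumulative = np.cumsum(explained_ratio)

# Smallest number of components whose cumulative variance reaches 95%.
n_components = int(np.searchsorted(cumulative, 0.95) + 1)
print("features:", n_features, "| components for 95% variance:", n_components)
```

Plotting `explained_ratio` against the component index gives the Scree curve, whose elbow is read visually, while `cumulative` gives the curve shown on the right of Fig. 3.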

Figure 4.

Normalized principal components of the PCA (in blue) in the original feature space (i.e. the optical spectrum space). For each principal component, the reference spectrum of PpIX (for the state peaking at 634 nm) is also plotted (in red). For clarity, only one of the three fluorescence emission spectra from the original feature space is represented, as they are similar for all three excitation wavelengths. The first principal component is similar to the spectrum of PpIX, with a peak at 636 nm. The second principal component is best described as the autofluorescence of the measured tissue, i.e. the contribution of other fluorophores. The five following components all show a peak shifting between 620 nm and 636 nm.

It is then interesting to compare the shapes of the principal components extracted with the data-driven PCA approach with the spectral patterns extracted in the expert-based approach of ref. 23, as shown in Fig. 4. Not surprisingly, the first principal component is very similar to the fluorescence emission spectrum of PpIX19, with a peak around 634 nm. Also not surprisingly, the second component corresponds to the autofluorescence of the tissue, which can be linked to numerous endogenous autofluorescent molecules such as NADH, FAD and lipofuscin31,32. These molecules are also known to be correlated with cancerous pathological status through the Warburg effect22,33, which modifies cell metabolism and favors glycolysis. The third principal component is a mix between the spectrum of PpIX, with a peak around 632 nm, and the autofluorescence described above. The fourth principal component comprises a peak at 629 nm and a smaller one at 695 nm. The fifth and last relevant principal component shows a maximum at 625 nm. All principal components present a peak varying from 622 nm (principal component 6) to 636 nm (principal component 1). Interestingly, the variance of our normalized spectra is best described as a blue-shift of the PpIX spectrum peak from 636 nm to 620 nm. Remarkably, the shapes of these principal components, extracted with a purely data-driven approach, correspond to spectral components identified in the expert-based studies19,23.

Good predictions are obtained in unsupervised and supervised modes

Unsupervised learning was then used to find clusters autonomously in a reduced feature space. Two dimension reduction methods were tested: PCA and T-SNE34. A reduction to 3 dimensions was chosen to facilitate the visualization of clusters. With this choice, clustering was more accurate with T-SNE than with PCA, and thus only T-SNE results are presented here, combined with K-means and GMM for the four classes identified previously. As K-Means and GMM are randomly initialized, the classification was repeated twenty times and the results averaged in order to assess their robustness. Results for four classes with K-means are given in Table 1 for the entire feature space (3 excitation wavelengths). A comparison with feature spaces based on each single excitation wavelength is presented and discussed further in the Supplementary Material. Typical clustering results are displayed in Fig. 5.

Table 1.

Confusion matrix for K-means with 4 classes: tumor core, high density margin, low density margin and healthy tissue.

| | Predicted Core | Predicted HD Margin | Predicted LD Margin | Predicted Healthy |
|---|---|---|---|---|
| True Core (10) | 8 (80%), σ = 0 | 1.9 (19%), σ = 0.31 | 0 (0%), σ = 0 | 0.1 (1%), σ = 0.31 |
| True HD Margin (24) | 1.3 (5%), σ = 0.5 | 7.1 (30%), σ = 0.32 | 9.7 (40%), σ = 0.48 | 5.9 (25%), σ = 0.32 |
| True LD Margin (9) | 0 (0%), σ = 0 | 0.1 (1%), σ = 0.31 | 5.6 (62%), σ = 0.52 | 3.3 (37%), σ = 0.48 |
| True Healthy (7) | 0 (0%), σ = 0 | 0 (0%), σ = 0 | 1 (14%), σ = 0 | 6 (86%), σ = 0 |

Statistics (average, standard deviation) result from 20 predictions. In each cell, from left to right: number of instances, percentage of total class population and standard deviation.
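The way such statistics can be built from repeated runs is sketched below. The labels are toy values invented for the example, the cluster indices are assumed to have already been matched to the histological classes, and the use of the sample standard deviation (`ddof=1`) is an assumption, since the text does not state which estimator was used.

```python
import numpy as np

def confusion_matrix(true_labels, predicted_labels, n_classes):
    """Count matrix with true classes in rows and predicted classes in columns."""
    cm = np.zeros((n_classes, n_classes), dtype=int)
    for t, p in zip(true_labels, predicted_labels):
        cm[t, p] += 1
    return cm

# Toy example (hypothetical): 6 samples, 3 classes, 3 clustering runs.
true_labels = [0, 0, 1, 1, 2, 2]
runs = [
    [0, 0, 1, 1, 2, 2],
    [0, 1, 1, 1, 2, 2],
    [0, 0, 1, 2, 2, 2],
]

stack = np.stack([confusion_matrix(true_labels, run, 3) for run in runs])
mean_cm = stack.mean(axis=0)         # average number of instances per cell
std_cm = stack.std(axis=0, ddof=1)   # spread over the runs (assumed estimator)
row_percent = 100 * mean_cm / mean_cm.sum(axis=1, keepdims=True)

print(np.round(mean_cm, 2))
print(np.round(row_percent, 1))
```

Each cell of Table 1 then corresponds to one entry of `mean_cm`, `row_percent` and `std_cm`, in that order.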

Figure 5.

Unsupervised classification in a T-SNE reduced space. From left to right: the histological truth as given by the anatomopathologist, K-means classification and GMM classification both with 4 clusters.

As seen in Fig. 5, the four computed clusters correspond fairly well to the clinical taxonomy given by the anatomopathologist (on the left). In addition, in this feature space the clusters are ordered by severity, following the density of diseased cells, with the largest distance between tumor core and healthy cells. Both K-means and GMM (middle and right of Fig. 5, respectively) give results in accordance with the histological truth. Figure 5 and Table 1 constitute promising results. While accurate prediction of the tumor core is expected since it is the main target of PpIX, accurate prediction is also obtained for healthy tissue. Overall, a prediction between healthy tissue, tumor core, and margin led to a 73% accuracy, which went up to 77% for classification with only three classes. To further stress the interest of machine learning-based prediction of glioma margin, a supervised method was also used. Typical results are shown in the Supplementary Material.
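The 73% figure for the healthy, core and margin problem is consistent with the mean counts of Table 1 once the high-density and low-density margin classes are pooled. The short check below assumes that accuracy is defined as the number of correctly assigned samples divided by the total number of samples; the function and variable names are illustrative.

```python
# Mean confusion counts from Table 1 (rows: true class, columns: predicted class).
classes = ["core", "hd_margin", "ld_margin", "healthy"]
table1 = [
    [8.0, 1.9, 0.0, 0.1],
    [1.3, 7.1, 9.7, 5.9],
    [0.0, 0.1, 5.6, 3.3],
    [0.0, 0.0, 1.0, 6.0],
]

# Pool the two margin classes into a single "margin" class.
pooled = {"core": "core", "hd_margin": "margin", "ld_margin": "margin", "healthy": "healthy"}

def accuracy(matrix, labels, group):
    """Fraction of samples whose predicted group equals their true group."""
    correct = sum(
        matrix[i][j]
        for i, true_label in enumerate(labels)
        for j, predicted_label in enumerate(labels)
        if group[true_label] == group[predicted_label]
    )
    total = sum(sum(row) for row in matrix)
    return correct / total

three_class_accuracy = accuracy(table1, classes, pooled)
print(f"accuracy with pooled margin: {three_class_accuracy:.2f}")  # 0.73
```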

Comparison between machine-learning-based and expert-based feature selection

As stated above, the number of clusters and the shapes of the computed principal components are remarkably similar to what has been described in ref. 23. The benefits of this fully automated approach compared to the expert-based model are thus investigated. We use the expert-based method of our previous study23, which consists of fitting the fluorescence spectrum to retrieve the relative contributions of the two states of PpIX, PpIX620 and PpIX634, for each excitation wavelength. In this study the fit is carried out for the three excitation wavelengths, instead of only the 405 nm excitation wavelength as in the previous study. The resulting feature space of dimension 6 is then normalized and reduced to 3 dimensions using T-SNE, as in the machine learning approach, in order to enable a fair comparison between the discrimination power of each method. The average result of 20 predictions can be seen in Fig. 6. The machine learning-based method shows a clear superiority over the expert-based model. In particular, for the “Healthy tissue” class, it is significantly better for true positives and true negatives while being comparable or better for false positives.

Figure 6.

Comparison of confusion matrices for K-means with 4 classes: tumor core, high- and low-density margin and healthy tissue. Average result of 20 predictions. ML model stands for machine learning-based model, HD for high density and LD for low density.

This result demonstrates that, despite similar features, the fully automated machine-learning-based method gives significantly better results than the previously used expert-based model. In the expert-based model of ref. 23, the fit was performed between 585 and 640 nm, which did not include the patterns around 700 nm that show variability among the datasets. Also, in ref. 23 the autofluorescence was used to normalize every spectrum and was then subtracted from the signal. In our case, autofluorescence was used differently since it was kept in the signal (mainly present in the second component of the PCA in Fig. 4).

Discussion

This study demonstrated the interest of a machine learning-based approach for the prediction of glioma margin. This approach found spectral features important for the prediction of glioma margin, which happen to be close to those selected in the expert-based model of ref. 23. This is an important result since it is obtained in two independent ways on the same dataset. It reinforces, quantitatively, the evidence for using the blue-shift in the PpIX fluorescence spectrum as a means of discrimination between margin and healthy tissue. Other works already suggested that a blue-shift of the peak wavelength of the emission spectrum is correlated with the tissue pathological status18,19,23. In particular, a second peak of fluorescence at 620 nm has been observed in tissues19–23,25,26 and in cell culture21,24. Its in vivo origin is still an open issue: some works support the assumption that it is related to different aggregates of PpIX19,23,24,26, while other works20,35 argue that it is fluorescence induced by the precursors of PpIX, uroporphyrins or coproporphyrins. This work cannot give clues on the origin of the blue-shift. However, the relevance of the blue-shift effect is recovered by this data-driven approach, which reinforces the previous expert-driven works18–26 focusing on the blue-shift. The biological mechanism behind this shift remains to be uncovered and is not the subject of this work. In this section, we rather discuss some elements of data preparation, excitation wavelengths, and further machine learning approaches.

Data sample preparation

In contrast with ref. 23, no distinction was made between LGG and HGG, in order to increase the sample size of the clusters. This choice is open to discussion since high-grade and low-grade pathologies are classified as different from the perspective of histology, which is based on tissue structures at the supracellular scale. However, a posteriori, we found that this choice does not prevent a reasonably good clustering prediction for the whole data set. In a future study, with increased sample size, it would be interesting to compare clustering with labels differentiating LGG and HGG against the fused approach followed in this study.

Multi-excitation wavelengths

The guiding thread of the article is to investigate how enlarging the fluorescence feature space can improve classification performance compared with an approach based on a few fitted parameters. We extend the feature space to the entire spectrum of a single excitation wavelength in the core of the manuscript. We demonstrated that this feature space extension produces a gain in classification performance. In the Supplementary Material, we also report the performance when the feature space is further extended with fluorescence spectra obtained at other excitation wavelengths. In order to probe the two PpIX states (peaking at 634 nm and 620 nm), the fluorescence spectra were acquired under excitation by three different LEDs: 385, 405 and 420 nm. We investigated the potential of using different excitation wavelengths to expand the feature space, suppress potential degeneracy, and potentially obtain more accurate information and better insight. Using PCA, we analysed the principal components in the original feature space (the eigenvectors in the spectral space). However, the resulting principal components were rather similar for all 3 LEDs, showing no significantly different information with respect to the excitation wavelength. This is probably due to the significant overlap of the emission spectra of the 3 LEDs. Another explanation for this degeneracy may be that the sample size is too small for minor, subtle changes between spectra acquired at different excitation wavelengths to be discriminating in this study.

Supervised learning

Supervised learning gave results consistent with unsupervised learning (see Supplementary Material). This is an interesting result since it shows the robustness of the classification across various machine learning algorithms. While this is certainly encouraging, we cannot be definitive about these supervised methods until the sample size is increased. More data would allow fine-tuning of these models or the use of more advanced supervised classifiers with more hyperparameters (such as deep neural networks). An extended cohort and a larger training dataset would also enable real-time classification for clinical use without the need to perform cross-validation.

Conclusion

In this article, we have demonstrated the interest of a machine learning approach for the prediction of glioma margin from 5-ALA induced PpIX fluorescence spectroscopy. When the entire raw spectrum is considered as the input feature space, classical dimension reduction was shown to select spectral patterns similar to those identified around 634 nm and 620 nm as possible biomarkers for margin prediction in our previous expert-based work23. This purely data-driven evidence, independent of the expert-based approach found in the current literature, is significant since the biochemical or physical origin of these spectral biomarkers is not yet understood. A second interest of the machine learning approach proposed here is that it improves discrimination compared with our previous expert-based features used as biomarkers23. A best performance of 77% accuracy between healthy tissue and margin is found. Despite the relatively small size of the data set considered here, these can be considered promising pilot results owing to their consistency with the classical expert-based approach for feature selection. Repetition with a larger cohort will have to be carried out to establish the added value of the optical probe for surgery. With such an extended data set, other machine learning models with higher capacity and more tuning parameters could be tested. The performance of other expert models16,17 could also be compared against the data-driven approach to test its robustness. Another direction of investigation for the future could be to enlarge the feature space. No positive effect on prediction performance was recorded in this study when increasing the number of excitation wavelengths with single-photon fluorescence. Two-photon fluorescence or the effect of polarization could also be worth investigating in this context while revisiting the proposed machine-learning approach.

Methods

The data set was acquired during a clinical study conducted at the neurologic center of the Pierre Wertheimer hospital in Bron, France. This study was described in detail in previous works23. All experiments were carried out in accordance with, and approved by, the French Agency for Health (ANSM) and the local ethics committee of Lyon University Hospital (France). All participating patients signed written informed consent. Only a part of the acquired data was used since we focused on “in vivo” measurements and used the multi-wavelength excitation measurements. For readability, we briefly describe the method and add complementary information to the method already published in ref. 23.

Spectroscopic device

The device has been described in detail in previous works23. Here, as a novelty, multi-wavelength excitation, not described previously, was used. We therefore describe the setup here, including its multi-wavelength excitation capabilities. Excitation was performed with three light-emitting diodes (LEDs) centered at 385 nm, 405 nm, and 420 nm with 7 nm full width at half maximum (M385F1, M405F1, M420F1, Thorlabs). The emitted light was transmitted through three optical fibers (HCG M0600T, sedi-fibres) to a dedicated probe. The probe entrance consists of a bundle of 7 optical fibers of 600 μm core diameter. The other ends of these fibers are cleaved, so that the excited tissue area and the emitting tissue area are the same. The light goes through a low-pass filter (Edmund Optics OD4 low pass 450 nm) with a cutoff wavelength of 450 nm. This led to an output irradiance of 80 W/m², 30 W/m², and 50 W/m² for the LEDs centered at 385 nm, 405 nm, and 420 nm, respectively. Tissue reflectance was collected through the same probe, with a detection fiber, and passed through a high-pass filter (HQ485LP, Chroma) with a cutoff wavelength of 485 nm. The filtered light was finally injected into a spectrometer (Maya2000, Ocean Optics). The system was characterized on calibrated phantoms36.

Surgical procedure and data acquisition

Patients were given an oral dose of 20 mg/kg of body weight of 5-aminolevulinic acid (Gliolan; Medac GmbH) approximately 3 hours prior to the induction of anesthesia. For each patient, the standard surgical procedure was started in order to expose the tissue and, at the surgeon's request, was paused so that fluorescence measurements could be performed. Each acquisition lasted 200 ms with the LED turned on, followed by the same duration with the LED turned off in order to remove the contribution of ambient light coming from the operating room. For each measurement, 6, 12, and 6 acquisitions were performed for the LEDs centered at 385 nm, 405 nm, and 420 nm, respectively. This gave a total acquisition time of 9.6 s. The tissue was then removed and sent for histopathological analysis. These fluorescence measurements were performed so as to sample different densities of infiltrative tumor cells for each glioma. In total, 50 measurements were kept in this analysis.
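The total acquisition time follows directly from the numbers above, as this short check shows (variable names are illustrative):

```python
# Each acquisition: 200 ms with the LED on, then 200 ms with the LED off.
led_on_ms = 200
led_off_ms = 200

# Number of acquisitions per excitation wavelength (nm), as stated in the text.
acquisitions = {385: 6, 405: 12, 420: 6}

total_ms = sum(acquisitions.values()) * (led_on_ms + led_off_ms)
print(total_ms / 1000, "s")  # 24 acquisitions x 400 ms = 9.6 s
```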

Histopathology

Histopathological analysis was performed on formalin-fixed paraffin-embedded biopsy tissue specimens processed for H & E staining. Each H & E stained tissue section was assessed for the presence of tumor cells, necrosis, mitotic activity, nuclear atypia, microvascular proliferation, and reactive astrocytosis. Molecular criteria were also assessed. Biopsy specimens were then classified into five categories based on WHO histopathological and molecular criteria1: HGG solid part, HGG margin, HGG margin of low density, LGG and healthy tissue. Finally, LGG and HGG data were combined in order to increase the sample size and to compare the expert-driven and data-driven approaches. This grouping is supported by previous works19 showing that LGG and HGG margins share common properties in terms of PpIX fluorescence intensity. The samples from LGG patients were included in the healthy, HGG low-density margin, or HGG high-density margin class depending on their pathological status. The resulting data set was composed of 50 samples from 10 patients. It includes 28 HGG samples: 10 from the tumor core, eight from the high-density margin, five from the low-density margin, and five healthy samples. It also includes 22 LGG samples: 17 included in the HGG high-density margin class, three in the HGG low-density margin class, and two in the healthy tissue class, depending on their pathological status.
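The class sizes obtained after merging the LGG samples into the HGG-derived classes can be tallied from the counts given in this paragraph. The dictionary keys below are illustrative labels, not identifiers used by the authors.

```python
# Sample counts per class, as stated in the paragraph above.
hgg = {"tumor core": 10, "high-density margin": 8, "low-density margin": 5, "healthy": 5}
lgg = {"high-density margin": 17, "low-density margin": 3, "healthy": 2}

# LGG samples are merged into the class matching their pathological status.
merged = {name: hgg[name] + lgg.get(name, 0) for name in hgg}
total = sum(merged.values())

for name, count in merged.items():
    print(f"{name}: {count}")
print("total:", total)
```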

Data processing pipeline

The data processing pipeline is illustrated in Fig. 1. In the first step, the spectrum was acquired with the optical system. Three fluorescence spectra corresponding to the three excitation LEDs were acquired. The pathological status of the corresponding tissue was recorded from the histopathological analysis. In the second step, each spectrum was normalized by its global energy. In the third step, the feature space created by the three spectra, being too large for clustering methods to be applied directly37,38, was reduced with dimension reduction techniques. In our case, principal component analysis (PCA) and t-distributed stochastic neighbor embedding (T-SNE), two basic techniques for dimension reduction, were tested34,39. PCA was chosen among other orthogonal transformations mainly for its exploratory ability to summarize data along their main characteristics. T-SNE was chosen mainly for its ability to preserve local structure, so that points close to one another in the high-dimensional feature space tend to be close to one another in the reduced feature space. The number of principal components to retain was determined with the Scree test30 or by computing the number of principal components required to reach 95% of cumulative variance. The last step was the classification from the reduced feature space, in both an unsupervised and a supervised way. Two clustering methods were chosen for this study: K-Means40 and the Gaussian mixture model (GMM)41. A large variety of methods can be found in the literature, with various levels of complexity and numbers of hyperparameters to be adjusted42. Because the data set is fairly small, algorithms with a minimal number of hyperparameters were chosen.

K-Means clusters points in the feature space inside hyper-spheres according to a Euclidean distance, while GMM includes an additional degree of freedom in the organization of the points, which are clustered inside hyper-ellipsoids. In K-Means and GMM, the number of clusters is a hyperparameter, which was determined with the Bayesian information criterion27 and the “gap” criterion28. We compared the predictive value of the purely data-driven feature space with the predictive value of the expert-based feature space. The expert-based feature space consists of two features, α634 and α620, computed from the raw spectrum. As described in ref. 23, the contribution of the other intrinsic fluorophores (excluding PpIX) emitting below 620 nm was removed from the emission spectrum. The resulting spectrum was then fitted with the contribution of two PpIX spectra acquired in vitro. One of these spectra presented a peak, assumed Gaussian, at 620 nm, and its relative contribution to the resulting spectrum was named α620. The other presented a peak, also assumed Gaussian, at 634 nm, and its relative contribution was named α634. A detailed explanation of this feature space can be found in previous work23. The reference spectra in Fig. 4 were also taken from this previous work.
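The expert-based features can be illustrated with a minimal least-squares sketch. It is not the fitting procedure of ref. 23: the reference spectra are replaced here by pure Gaussian bands, the 8 nm band width is a hypothetical value, "global energy" is read as the summed intensity, and the relative contributions are assumed to be rescaled to sum to one. The function names are illustrative.

```python
import numpy as np

def gaussian(wavelength_nm, center_nm, width_nm):
    """Unit-height Gaussian band (stand-in for a reference spectrum measured in vitro)."""
    return np.exp(-0.5 * ((wavelength_nm - center_nm) / width_nm) ** 2)

def normalize_energy(spectrum):
    """Divide by the summed intensity, one possible reading of 'global energy'."""
    return spectrum / spectrum.sum()

def fit_two_states(wavelength_nm, spectrum, width_nm=8.0):
    """Least-squares weights of the 620 nm and 634 nm bands, rescaled to sum to 1."""
    basis = np.column_stack([
        gaussian(wavelength_nm, 620.0, width_nm),
        gaussian(wavelength_nm, 634.0, width_nm),
    ])
    weights, *_ = np.linalg.lstsq(basis, spectrum, rcond=None)
    return weights / weights.sum()

# Synthetic spectrum over the 585-640 nm fit window, with known contributions.
wavelength_nm = np.linspace(585.0, 640.0, 221)
spectrum = (0.3 * gaussian(wavelength_nm, 620.0, 8.0)
            + 0.7 * gaussian(wavelength_nm, 634.0, 8.0))

alpha_620, alpha_634 = fit_two_states(wavelength_nm, normalize_energy(spectrum))
print(f"alpha_620 = {alpha_620:.3f}, alpha_634 = {alpha_634:.3f}")
```

Repeating such a fit for each of the three excitation wavelengths yields the six expert-based descriptors compared with the data-driven features in the Results section.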

Acknowledgements

LABEX PRIMES (ANR-11-LABX-0063) of Université de Lyon, within the program “Investissements d’Avenir” (ANR-11-IDEX-0007) operated by the French National Research Agency (ANR); Cancéropôle Lyon Auvergne Rhône Alpes (CLARA) within the program “OncoStarter”; Infrastructures d’Avenir en Biologie Santé (ANR-11-INBS-000), within the program “Investissements d’Avenir” operated by the French National Research Agency (ANR). The authors thank the PILoT facility for the support provided on the signal acquisition.

Author contributions

P.L. conducted the data analysis. L.A., L.M.W., B.M., D.M. and J.G. conducted the experiment. L.A., B.M. and L.M.W. conceived the experiment. C.R., D.R. and B.M. analysed the results. All authors reviewed the manuscript. B.M. and D.R. have equally contributed and jointly supervised the work.

Data availability

The datasets analysed during the current study are available from the corresponding author on reasonable request.

Competing interests

The authors declare no competing interests.

Footnotes

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

These authors contributed equally: Bruno Montcel and David Rousseau.

Supplementary information is available for this paper at 10.1038/s41598-020-58299-7.

References

  • 1. Louis DN, et al. The 2016 World Health Organization Classification of Tumors of the Central Nervous System: a summary. Acta Neuropathol. 2016;131:803–820. doi: 10.1007/s00401-016-1545-1.
  • 2. Petrecca K, Guiot M-C, Panet-Raymond V, Souhami L. Failure pattern following complete resection plus radiotherapy and temozolomide is at the resection margin in patients with glioblastoma. J. Neuro-Oncology. 2013;111:19–23. doi: 10.1007/s11060-012-0983-4.
  • 3. Laws ER, et al. Survival following surgery and prognostic factors for recently diagnosed malignant glioma: data from the Glioma Outcomes Project. J. Neurosurg. 2003;99:467–473. doi: 10.3171/jns.2003.99.3.0467.
  • 4. Lacroix M, Toms SA. Maximum Safe Resection of Glioblastoma Multiforme. J. Clin. Oncol. 2014;32:727–728. doi: 10.1200/JCO.2013.53.2788.
  • 5. Li YM, Suki D, Hess K, Sawaya R. The influence of maximum safe resection of glioblastoma on survival in 1229 patients: Can we do better than gross-total resection? J. Neurosurg. 2016;124:977–988. doi: 10.3171/2015.5.JNS142087.
  • 6. Haj A, et al. Extent of resection in newly diagnosed glioblastoma: Impact of a specialized neuro-oncology care center. Brain Sci. 2018;8:5. doi: 10.3390/brainsci8010005.
  • 7. Wu GN, Ford JM, Alger JR. MRI measurement of the uptake and retention of motexafin gadolinium in glioblastoma multiforme and uninvolved normal human brain. J. Neuro-Oncology. 2006;77:95–103. doi: 10.1007/s11060-005-9101-1.
  • 8. Nabavi A, et al. Serial Intraoperative Magnetic Resonance Imaging of Brain Shift. Neurosurgery. 2001;48:787–798. doi: 10.1097/00006123-200104000-00019.
  • 9. Kubben PL, et al. Intraoperative MRI-guided resection of glioblastoma multiforme: a systematic review. The Lancet Oncol. 2011;12:1062–1070. doi: 10.1016/S1470-2045(11)70130-9.
  • 10. Senft C, et al. Intraoperative MRI guidance and extent of resection in glioma surgery: a randomised, controlled trial. The Lancet Oncol. 2011;12:997–1003. doi: 10.1016/S1470-2045(11)70196-6.
  • 11. Stummer W, et al. Technical Principles for Protoporphyrin-IX-Fluorescence Guided Microsurgical Resection of Malignant Glioma Tissue. Acta Neurochir. 1998;140:995–1000. doi: 10.1007/s007010050206.
  • 12. Bravo JJ, et al. Hyperspectral data processing improves PpIX contrast during fluorescence guided surgery of human brain tumors. Sci. Reports. 2017;7:9455. doi: 10.1038/s41598-017-09727-8.
  • 13. Hadjipanayis CG, Widhalm G, Stummer W. What is the surgical benefit of utilizing 5-Aminolevulinic acid for fluorescence-guided surgery of malignant gliomas? Neurosurgery. 2015;77:663–673. doi: 10.1227/NEU.0000000000000929.
  • 14. Jaber M, et al. Is visible aminolevulinic acid-induced fluorescence an independent biomarker for prognosis in histologically confirmed (World Health Organization 2016) low-grade gliomas? Neurosurgery. 2019;84:1214–1224. doi: 10.1093/neuros/nyy365.
  • 15. Valdés PA, et al. Quantitative, spectrally-resolved intraoperative fluorescence imaging. Sci. Reports. 2012;2:798. doi: 10.1038/srep00798.
  • 16. Valdés PA, et al. System and methods for wide-field quantitative fluorescence imaging during neurosurgery. Opt. Lett. 2013;38:2786–2788. doi: 10.1364/OL.38.002786.
  • 17. Valdés PA, et al. Quantitative fluorescence using 5-aminolevulinic acid-induced protoporphyrin IX biomarker as a surgical adjunct in low-grade glioma surgery. J. Neurosurg. 2015;123:771–780. doi: 10.3171/2014.12.JNS14391.
  • 18. Ando T, et al. Precise comparison of protoporphyrin IX fluorescence spectra with pathological results for brain tumor tissue identification. Brain Tumor Pathol. 2011;28:43–51. doi: 10.1007/s10014-010-0002-4.
  • 19. Montcel B, Mahieu-Williame L, Armoiry X, Meyronet D, Guyotat J. Two-peaked 5-ALA-induced PpIX fluorescence emission spectrum distinguishes glioblastomas from low grade gliomas and infiltrative component of glioblastomas. Biomed. Opt. Express. 2013;4:548–558. doi: 10.1364/BOE.4.000548.
  • 20. Zanello M, et al. Multimodal optical analysis discriminates freshly extracted human sample of gliomas, metastases and meningiomas from their appropriate controls. Sci. Reports. 2017;7:41724. doi: 10.1038/srep41724.
  • 21. Dietel W, Fritsch C, Pottier RH, Wendenburg R. 5-aminolaevulinic-acid-induced formation of different porphyrins and their photomodifications. Lasers Med. Sci. 1997;12:226–236. doi: 10.1007/BF02765103.
  • 22. Figueredo AJ, Wolf PSA. Assortative pairing and life history strategy. Hum. Nat. 2009;20:317–330. doi: 10.1007/s12110-009-9068-2.
  • 23. Alston L, et al. Spectral complexity of 5-ala induced ppix fluorescence in guided surgery: a clinical study towards the discrimination of healthy tissue and margin boundaries in high and low grade gliomas. Biomed. Opt. Express. 2019;10:2478–2492. doi: 10.1364/BOE.10.002478.
  • 24. Hope CK, Higham SM. Evaluating the effect of local pH on fluorescence emissions from oral bacteria of the genus Prevotella. J. Biomed. Opt. 2016;21:084003. doi: 10.1117/1.JBO.21.8.084003.
  • 25. Montcel, B. et al. 5-ALA-induced PpIX fluorescence in gliomas resection: spectral complexity of the emission spectrum in the infiltrative compound. In SPIE Proceedings, 8804, 880409 (Optical Society of America, 2013).
  • 26. Alston, L. M. et al. Interventional fluorescence spectroscopy: preliminary results to detect tumor margins during glioma resection with two fluorescence spectra of PpIX. In SPIE Proc. 10411, 104110C, doi: 10.1117/12.2286110 (Optical Society of America, 2017).
  • 27. Schwarz G. Estimating the dimension of a model. The Annals Stat. 1978;6:461–464. doi: 10.1214/aos/1176344136.
  • 28. Tibshirani R, Walther G, Hastie T. Estimating the number of clusters in a data set via the gap statistic. J. Royal Stat. Soc. Ser. B (Statistical Methodol.) 2001;63:411–423. doi: 10.1111/1467-9868.00293.
  • 29. Pearson K. On lines and planes of closest fit to systems of points in space. The London, Edinburgh, Dublin Philos. Mag. J. Sci. 1901;2:559–572. doi: 10.1080/14786440109462720.
  • 30. Cattell RB. The scree test for the number of factors. Multivar. Behav. Res. 1966;1:245. doi: 10.1207/s15327906mbr0102_10.
  • 31. Andersson H, Baechi T, Hoechl M, Richter C. Autofluorescence of living cells. J. microscopy. 1998;191:1–7. doi: 10.1046/j.1365-2818.1998.00347.x.
  • 32. Minamikawa T, et al. Simplified and optimized multispectral imaging for 5-ALA-based fluorescence diagnosis of malignant lesions. Sci. Reports. 2016;6:25530. doi: 10.1038/srep25530.
  • 33. Warburg O, Wind F, Negelein E. The metabolism of tumors in the body. The J. Gen. Physiol. 1927;8:519–530. doi: 10.1085/jgp.8.6.519.
  • 34. Van Der Maaten L, Hinton G. Visualizing data using t-SNE. J. Mach. Learn. Res. 2008;9:2579–2605.
  • 35. Dietel W, Pottier R, Pfister W, Schleier P, Zinner K. 5-Aminolaevulinic acid (ALA) induced formation of different fluorescent porphyrins: A study of the biosynthesis of porphyrins by bacteria of the human digestive tract. J. Photochem. Photobiol. B: Biol. 2007;86:77–86. doi: 10.1016/j.jphotobiol.2006.07.006.
  • 36. Alston L, Rousseau D, Hébert M, Mahieu-Williame L, Montcel B. Nonlinear relation between concentration and fluorescence emission of protoporphyrin IX in calibrated phantoms. J. Biomed. Opt. 2018;23:097002. doi: 10.1117/1.JBO.23.9.097002.
  • 37. Bellman R, Kalaba R. Reduction of dimensionality, dynamic programming, and control processes. J. Basic Eng. 1961;83:82–84. doi: 10.1115/1.3658896.
  • 38. Steinbach Michael, Ertöz Levent, Kumar Vipin. New Directions in Statistical Physics. Berlin, Heidelberg: Springer Berlin Heidelberg; 2004. The Challenges of Clustering High Dimensional Data; pp. 273–309.
  • 39. Jolliffe Ian T, Cadima J. Principal component analysis: a review and recent developments. Philos. Transactions Royal Soc. A: Math. Phys. Eng. Sci. 2016;374:20150202. doi: 10.1098/rsta.2015.0202.
  • 40. MacQueen, J. Some methods for classification and analysis of multivariate observations. In Proceedings of the Fifth Berkeley Symposium on Mathematical Statistics and Probability, Volume 1: Statistics, 281–297 (University of California Press, Berkeley, Calif., 1967).
  • 41. Dempster AP, Laird NM, Rubin DB. Maximum likelihood from incomplete data via the EM algorithm. J. Royal Stat. Soc. Ser. B (Methodological) 1977;39:1–22. doi: 10.1111/j.2517-6161.1977.tb01600.x.
  • 42. Bishop, C. M. Pattern Recognition And Machine Learning, 1st ed. 2006. corr. 2nd printing 2011 edn (Springer-Verlag New York Inc., New York, 2006).

Articles from Scientific Reports are provided here courtesy of Nature Publishing Group