Abstract
Justification
Automatic brain tumor classification by MRS has been under development for more than a decade. Nonetheless, to our knowledge, there are no published evaluations of predictive models with unseen cases that are subsequently acquired in different centers. The multicenter eTUMOUR project (2004–2009), which builds upon previous expertise from the INTERPRET project (2000–2002), has allowed such an evaluation to take place.
Materials and Methods
A total of 253 pairwise classifiers for glioblastoma, meningioma, metastasis, and low-grade glial diagnosis were inferred based on 211 SV short TE INTERPRET MR spectra obtained at 1.5 T (PRESS or STEAM, 20–32 ms) and automatically pre-processed. Afterwards, the classifiers were tested with 97 spectra, which were subsequently compiled during eTUMOUR.
Results
In our results based on subsequently acquired spectra, accuracies of around 90% were achieved for most of the pairwise discrimination problems. The exception was the glioblastoma versus metastasis discrimination, whose accuracy did not exceed 78%. A clearer definition of metastases may be obtained by other approaches, such as MRSI + MRI.
Conclusions
Predicting the tumor type from in-vivo MRS is possible using classifiers developed from data previously acquired in different hospitals with different instrumentation under the same acquisition protocols. This methodology may find application for assisting in the diagnosis of new brain tumor cases and for the quality control of multicenter MRS databases.
Keywords: Magnetic resonance spectroscopy, Pattern classification, Brain tumors, Decision support systems, Multicenter evaluation study
Introduction
Magnetic resonance spectroscopy (MRS) is slowly becoming an accurate non-invasive complement to magnetic resonance imaging for the initial diagnostic examination of brain masses [1], since it provides useful chemical information about metabolites for characterizing brain tumors [2]. To achieve this status, clinical and pattern recognition (PR)-based classification of brain tumors using MRS data has been thoroughly investigated for more than fifteen years [1,3–13].
Clinical decision-support systems (CDSSs) based on PR should be developed so as to achieve high classification accuracy, interpretability in terms of clinical knowledge, and generalization of performance to new samples obtained subsequently in different clinical centers [14–17]. Standardization of acquisition conditions and protocols should make data from different hospitals compatible and allow the development and evaluation of joint CDSSs. This standardization prevents possible bias from single-center or single-machine studies and, additionally, increases the number of available cases for classifier development and test purposes.
During the INTERPRET project (INTERPRET) [8,18], a protocol was defined to guarantee the compatibility of the signals acquired at different hospitals [19,20]. As a result, studies on automated brain tumor classification were carried out using these data. Hence, in previous studies [7,8,10,21], the ability of automatic classifiers based on short echo time (TE) MRS to discriminate among different brain tumor diagnoses was demonstrated. In addition, in [11,13,21], the automated classification by means of long TE MRS was also studied and demonstrated. Other studies evaluated the extension of the classifiers towards 1H magnetic resonance spectroscopic imaging (MRSI) [12,21–24]. Every study reported above was developed and evaluated using data acquired during the same period of time. In addition, other automated classification studies, such as [2,13,25–28], have been reported on single-center MRS datasets of brain masses.
In order to provide the clinical community with robust results of automatic classification, the extension of the evaluation in time is advisable. Hence, the validation of classifiers through subsequent cases can consolidate the confidence of clinicians in the potential applicability of these classifiers. The multicenter eTUMOUR project (eTUMOUR) [29] (2004–2009) has benefited from the data and expertise gathered by INTERPRET. The INTERPRET acquisition protocols for clinical, radiological, and histopathological data were extended to ex-vivo transcriptomic (DNA microarrays) and metabolomic (HR-MAS) data acquisition in eTUMOUR. Furthermore, the raw MRS data acquired during INTERPRET were incorporated into the eTUMOUR dataset for classifier development. This provides a unique opportunity to evaluate INTERPRET-based models by means of cases from a later date from partly different hospitals with different instrumentation, but obtained using the same or compatible acquisition protocols. The multiproject-multicenter evaluation proposed in this study gives a close-up perspective of the conditions that predictive models may face under different real clinical environments.
In this study, pairwise classifiers covering the six discrimination problems among glioblastoma (GBM), meningioma (MEN), metastasis (MET), and low-grade glial (LGG) diagnoses were developed and tested on single-voxel (SV) short TE MRS signals. Short TE MRS is fast (typically 5 min) and robust, so it is considered to be appropriate for routine clinical studies [1]. Most major hospitals currently use this acquisition protocol for the MRS evaluation of brain tumors. The short TE spectral pattern has been reported to contain more information than long TE spectra, e.g. metabolites and other compounds that are considered useful for classification purposes [1,8,11]. Hence, creatine (Cr) (3.02, 3.92 ppm), choline (Cho) (3.21 ppm), N-acetyl aspartate (NAA) (2.01 ppm), myo-inositol (mI) and glycine (Gly) (3.55 ppm), mI/Taurine (Tau) (3.26 ppm), glutamate/glutamine (Glx) (2.04, 2.46, 3.78 ppm), lactate (Lac) (1.31 ppm), and alanine (Ala) (1.47 ppm) are observed at short TE. Furthermore, macromolecules (MM) (5.4, 2.9, 2.25, 2.05, 1.4 and 0.87 ppm) and mobile lipids (ML) are also well detected at short TE [1,8]. Comparative studies on the use of short TE versus long TE have shown the benefit of using short TE or the combination of both echo times for automatic classification purposes [30].
Based on previous results from [10,11,18,21], good performance of the PR models could be expected for most of the classification problems, except for the discrimination of glioblastoma and metastasis [10]. Our performance estimations of models trained with INTERPRET data and tested over eTUMOUR cases confirmed this behaviour. We observed that pairwise discrimination between glioblastoma, meningioma, metastasis, and low-grade glial achieved an accuracy of around 90%. The exception was the discrimination between glioblastoma and metastasis, which did not perform better than 78%. This study consolidates the results obtained by previous studies in automatic brain tumor classification using MRS. These results may also increase the confidence of the clinical community in the use of CDSSs that incorporate this kind of classifier for the interpretation of MRS biomedical signals and the diagnosis of brain tumors.
Materials and methods
Data acquisition
The training data used for classifier development were SV MRS signals at 1.5 T at short TE (point-resolved spectroscopic sequence (PRESS) or stimulated echo acquisition mode sequence (STEAM), 20–32 ms) that were acquired by international centers in the framework of INTERPRET [18]. The classes considered for inclusion in this study were based on the histological classification of the central nervous system (CNS) tumors set up by the World Health Organization (WHO) [31]: glioblastoma (GBM), MEN, MET, and LGG (Astrocytoma gII, Oligoastrocytoma gII, or Oligodendroglioma gII). The number of cases by class is summarized in Table 1.
Table 1.
Class | INTERPRET | eTUMOUR |
---|---|---|
GBM | 84 | 28 |
MEN | 57 | 17 |
MET | 37 | 32 |
LGG | 33 | 20 |
Total | 211 | 97 |
Short TE 1H MRS data were acquired according to a consensus protocol during the INTERPRET (2000–2002) and eTUMOUR (2004–2009) projects.
A total of 211 SV 1H (nuclear) magnetic resonance (MR) spectra from the INTERPRET database [19] were included. These signals were acquired with Siemens, General Electric (GE), and Philips instruments by six international centers. The acquisition protocols included PRESS or STEAM sequences, with spectral parameters: repetition time (TR) between 1,600 and 2,020 ms, TE of 20 or 30–32 ms, spectral width of 1,000–2,500 Hz, and 512, 1,024, or 2,048 data-points, as described in previous studies [19]. Every training spectrum and diagnosis was validated by the INTERPRET Clinical Data Validation Committee (CDVC) and expert spectroscopists [8].
The test data were provided by eight international institutions in the framework of eTUMOUR [29]. Cases were included if they had an SV short TE (STEAM 20 ms, PRESS 30–32 ms) MRS signal at 1.5 T validated by the expert spectroscopist of eTUMOUR and the original histopathology available before 28 February 2007. Therefore, 97 cases from eTUMOUR were considered for testing in this study. The test cases used to evaluate the performance of the classifiers were acquired from partly different hospitals at later dates than the training cases and using instruments of the three main manufacturers. Table 2 shows that the percentages of cases by manufacturer included in the test data are similar to the percentages in the training data. Table 3 shows the percentage of cases by center included in the training and test datasets. Forty percent of training cases belong to one center that afterwards did not provide test data. Besides, 35% of test cases belong to three new centers that were not providers of training data.
Table 2.
Manufacturer | INTERPRET (%) | eTUMOUR (%) |
---|---|---|
GE | 53.1 | 54.6 |
Siemens | 6.6 | 12.4 |
Philips | 40.3 | 33.0 |
Table 3.
Center | Training from INTERPRET (%) | Test from eTUMOUR (%) |
---|---|---|
UMC Nijmegen | 2.8 | 1.0 |
St. George’s Hospital | 27.0 | 18.6 |
Medical University of LODZ | 3.8 | 10.3 |
FLENI | 1.9 | 6.2 |
IDI-Bellvitge | 40.3 | |
Centre de Diag. Pedralbes | 24.2 | |
Centre de Diag. Pedralbes + IAT | | 28.9 |
IDI-Badalona | | 17.5 |
Univ. de Valencia | | 16.5 |
Hospital Sant Joan de DEU | | 1.0 |
Cases of project-exclusive centers (%) | 40.3 | 35.1 |
The last row indicates the percentage of training cases that belong to centers that did not produce eTUMOUR cases, and the percentage of test cases that belong to centers that did not acquire training data for INTERPRET.
Pre-processing
Each signal was pre-processed according to the INTERPRET protocol. A fully automatic pre-processing pipeline was available for the training data. In addition, a semi-automatic pipeline was defined for some new file formats of the test cases from the GE and Siemens manufacturers. The semi-automatic pipeline was designed to ensure compatibility of its output with that of the automatic one.
Automatic pipeline
The steps of the automatic pre-processing pipeline were: (1) Eddy current correction was applied to the water-suppressed free induction decay (FID) of each case using the Klose algorithm [32]. (2) The residual water resonance was removed using the Hankel-Lanczos singular value decomposition (HLSVD) time-domain selective filtering using ten singular values and a water region of [4.33, 5.07] ppm. (3) An apodization with a Lorentzian function of 1 Hz of damping was applied. (4) Before transforming the signal to the frequency domain using the fast Fourier transform (FFT), an interpolation was needed in order to increase the frequency resolution of the low resolution spectra to the maximum frequency resolution used in the acquisition protocols (see [8] for details in the acquisition conditions and resolutions). This was carried out with the zero-filling procedure. (5) Afterwards, the baseline offset, which was estimated as the mean value of the region [11, 9] ∪ [−2,−1] ppm, was subtracted from the spectrum. (6) The normalization of the spectral data vector to the L2-norm was performed based on the data-points in the region [−2.7, 4.33] ∪ [5.07, 7.1] ppm. (7) Depending on the signal-to-noise ratio (SNR) and the tumor pattern, an additional frequency alignment check of the spectrum was performed by referencing the ppm-axis to (in order of priority) the total Cr at 3.03 ppm or to the Cho containing compounds at 3.21 ppm or the ML at 1.29 ppm. (8) Finally, the region of interest was restricted to [0.5, 4.1] ppm, obtaining a vector of 190 points for each spectrum where, after the pre-processing filters, the resonances of the main metabolites arise and where the contribution of the residual water is expected to be minimal. In summary, 211 INTERPRET cases and 47 cases of the eTUMOUR test dataset (32 from Philips and 15 from GE) were pre-processed with the automatic pipeline.
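Steps 5, 6 and 8 of the pipeline above are simple vector operations, and a minimal sketch may make them concrete. The function below is an illustration only, not the INTERPRET software: the function names, the list-based data layout and the synthetic demo grid are assumptions, while the ppm intervals are the ones quoted in the text. The synthetic grid does not reproduce the 190-point resolution of the real data.

```python
import math

def in_any(ppm, intervals):
    """True if ppm lies inside at least one closed (lo, hi) interval."""
    return any(lo <= ppm <= hi for lo, hi in intervals)

def postprocess(ppm_axis, spectrum):
    """Apply steps 5, 6 and 8 to one real-valued spectrum.

    ppm_axis and spectrum are equal-length lists (hypothetical layout).
    """
    # Step 5: baseline offset = mean over [9, 11] U [-2, -1] ppm.
    base = [y for x, y in zip(ppm_axis, spectrum)
            if in_any(x, [(9.0, 11.0), (-2.0, -1.0)])]
    offset = sum(base) / len(base)
    corrected = [y - offset for y in spectrum]

    # Step 6: L2 norm measured on [-2.7, 4.33] U [5.07, 7.1] ppm,
    # i.e. excluding the water region, then used to scale the vector.
    norm = math.sqrt(sum(y * y for x, y in zip(ppm_axis, corrected)
                         if in_any(x, [(-2.7, 4.33), (5.07, 7.1)])))
    scaled = [y / norm for y in corrected]

    # Step 8: keep only the [0.5, 4.1] ppm region of interest.
    return [(x, y) for x, y in zip(ppm_axis, scaled) if 0.5 <= x <= 4.1]

# Synthetic demo: flat offset of 2.0 with a single peak at 3.0 ppm.
ppm_axis = [i / 20 for i in range(-60, 241)]          # -3.0 ... 12.0 ppm
spectrum = [2.0 + (5.0 if i == 60 else 0.0) for i in range(-60, 241)]
roi = postprocess(ppm_axis, spectrum)
```

In this toy case the offset is removed exactly and the single peak ends up with unit height, because it carries all of the L2 norm.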
Semi-automatic pipeline
Due to limitations of the automatic pre-processing software, 50 test samples were pre-processed by a semi-automatic pipeline that was partially based on the Java Magnetic Resonance User Interface (jMRUI) [33]. The semi-automatic pipeline differed from the automatic pipeline in the following steps: (1) The phase of the water-suppressed FID was mainly corrected with the reference water. Additional manual zero-order and first-order phase correction was performed when needed. (2) Residual water was removed by means of the jMRUI implementation of the Hankel singular value decomposition (HSVD) algorithm [34]. The filter was parametrized as in the automatic pipeline. Steps 3–8 remained equivalent to the automatic pre-processing. As a result, a pre-processing pipeline based on different software implementations but compatible with the automatic one was set up, and comparable signals for testing the PR models were obtained.
Feature extraction
Several feature extraction methods based on PR were applied to the real part of the spectra prior to any classification approach. These methods included direct spectral peak integration (PI) on selected metabolite resonance regions [35], peak height of typical resonances (PPM) [36], principal component analysis (PCA) [37,38], independent component analysis (ICA) [39,40], and wavelet transform (WAV) [41,42]. Finally, some classification approaches were applied to the full region of interest represented by a data vector of 190 points (190). The selected features for the classifiers were derived from previous studies [10,30] or from model validation based on the training dataset. In some approaches, standard normal variate (SNV) scaling was applied to the obtained features. The wavelet basis used in the experiments was coiflet 3 with nine levels [41]. Further information and experimental details about the methods used can be found in “Appendix A” of the on-line Supplementary Material.
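As a rough illustration of two of the simpler operations named above, the sketch below computes peak integrals as plain sums over ppm windows and applies SNV scaling to a feature vector. Everything specific in it is an assumption: the window limits are hypothetical (centred on resonance positions listed in the Introduction with an arbitrary half-width of 0.05 ppm), integration is approximated by summation, and SNV is taken in its usual form (zero mean, unit sample standard deviation per vector). The actual regions and settings are described in the Supplementary Material, not here.

```python
def peak_integrals(ppm_axis, spectrum, windows):
    """Sum the spectrum over each (lo, hi) ppm window; one feature per window."""
    return {name: sum(y for x, y in zip(ppm_axis, spectrum) if lo <= x <= hi)
            for name, (lo, hi) in windows.items()}

def snv(values):
    """Standard normal variate: centre to zero mean, scale to unit sample std."""
    mean = sum(values) / len(values)
    std = (sum((v - mean) ** 2 for v in values) / (len(values) - 1)) ** 0.5
    return [(v - mean) / std for v in values]

# Hypothetical windows, for illustration only.
HYPOTHETICAL_WINDOWS = {
    "Cho": (3.16, 3.26),
    "Cr": (2.97, 3.07),
    "NAA": (1.96, 2.06),
    "Lac": (1.26, 1.36),
}

# Toy example on a three-point "spectrum".
toy = peak_integrals([1.0, 2.0, 3.0], [1.0, 4.0, 2.0],
                     {"a": (0.5, 2.5), "b": (2.5, 3.5)})
toy_scaled = snv([1.0, 2.0, 3.0])
```

A peak-height (PPM) feature would replace the sum by the value at a single ppm position; the windowed sum is less sensitive to small frequency misalignments.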
Classification methods
Ten methods were applied to address the pairwise classifications. These methods included parametric discriminant analysis [43]: linear discriminant analysis (LDA), Fisher’s rank-reduced version of LDA (FLDA) [44], quadratic discriminant analysis (QDA), linear discriminant analysis with diagonal covariance matrix (dLDA) and quadratic discriminant analysis with diagonal covariance matrix (dQDA). Kernel-based models (support vector machines (SVM) [45] and least-squares support vector machines (LS-SVM) [46]) were also applied. Additionally, artificial neural networks (multilayer perceptron (MLP) [47] and bi-directional Kohonen networks (BDK) [27,48]) and single and ensemble [49] classifiers using K-nearest neighbours and local feature reduced by PCA (PCA-KNN) [50,51] were used.
Bayesian strategies for regularization were also applied in some of the classifiers based on LS-SVM [52] and MLP [53]. Further information about these methods can be found in “Appendix B” of the on-line Supplementary Material.
A measure to evaluate unbalanced classifiers: the balanced error rate (BER)
The performance was measured by means of the error rate (ERR) and the balanced error rate (BER). In a binary classifier A versus B, BER is the average of the error rates on the A and B classes [54]. Let $n_A$ be the number of cases of class A and $e_A$ the number of those cases that are misclassified. Let $n_B$ be the number of cases of class B and $e_B$ the number of those cases that are misclassified. While the ERR is defined as $(e_A + e_B)/(n_A + n_B)$, the BER is defined as $\frac{1}{2}\left(e_A/n_A + e_B/n_B\right)$. BER is useful when one class is underrepresented compared to the other class, e.g. GBM versus LGG and GBM versus MET in the INTERPRET dataset and MEN versus GBM and MEN versus MET in the eTUMOUR dataset.
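The two measures can be written directly from their definitions. The short sketch below is illustrative: the function names are ours, the class sizes are the INTERPRET GBM and LGG counts from Table 1, and the error counts are hypothetical values chosen only to show how the two measures diverge on unbalanced classes.

```python
def err(e_a, n_a, e_b, n_b):
    """Error rate: all misclassified cases over all cases."""
    return (e_a + e_b) / (n_a + n_b)

def ber(e_a, n_a, e_b, n_b):
    """Balanced error rate: mean of the two per-class error rates."""
    return 0.5 * (e_a / n_a + e_b / n_b)

# 84 GBM and 33 LGG cases (Table 1); the error counts are hypothetical.
example_err = err(e_a=4, n_a=84, e_b=8, n_b=33)
example_ber = ber(e_a=4, n_a=84, e_b=8, n_b=33)
```

With these numbers ERR is about 0.103 while BER is about 0.145: the poorer performance on the smaller class is diluted in ERR but weighted equally in BER.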
Results and discussion
For each task, different combinations of feature extraction and classification methods were applied in the study. An estimation of the ERR and BER for the INTERPRET dataset using a tenfold cross validation (CV) was carried out for each model. Afterwards, the estimations of the ERR and BER were obtained on the independent test (IT) dataset of eTUMOUR. Table 4 illustrates the results with the best pairwise classifiers based on the IT estimations. A detailed list of the results is available in Sect. 1 of the on-line Supplementary Material.
Table 4.
Task | id | Features | Classif | CV ERR | CV BER | IT ERR | IT BER |
---|---|---|---|---|---|---|---|
GBM versus MEN | 1.6 | 190 | MLP | 0.06 | 0.07 | 0.07 | 0.09 |
GBM versus MET | 2.13 | PI | LDA | 0.33 | 0.40 | 0.22 | 0.21 |
GBM versus LGG | 3.16 | PI | LS-SVM | 0.12 | 0.18 | 0.08 | 0.09 |
MEN versus MET | 4.21 | PCA | MLP | 0.05 | 0.05 | 0.06 | 0.07 |
MEN versus LGG | 5.10 | ICA | LS-SVM | 0.08 | 0.09 | 0.08 | 0.08 |
MET versus LGG | 6.13, 21, 25–26 | PI | LDA/FLDA/MLP/LS-SVM | [0.01, 0.04] | [0.01, 0.04] | 0.06 | 0.07 |
The ERR and BER estimations based on CV over the INTERPRET data and on the eTUMOUR IT set are shown. The columns of the table are: task, classification problem defined by the classes to be discriminated by the classifiers; id, identification of the classifier; features, acronym of the feature extraction method; classif, acronym of the classification method; CV, results estimated by means of a tenfold CV in the INTERPRET database; IT, results estimated by means of the independent test, with the INTERPRET database as training and the eTUMOUR dataset as test; ERR, error rate; and BER, balanced error rate. Values in square brackets give the interval within which every result falls.
The classification problems
Most of the discrimination problems among the four classes were solved with high accuracy in the eTUMOUR dataset. Table 4 shows that most of the best classifiers among GBM, MEN, MET, and LGG achieved an accuracy (1 − ERR) of around 90%. Decision-support methodologies with this level of accuracy may be useful for incorporation into integrated CDSSs for clinical purposes. In contrast, for GBM versus MET, the best result was an accuracy of 78% on the independent test, which is far from the accuracy obtained for the other discrimination problems. Discriminating glioblastoma from metastasis by means of MRS is difficult with the use of SV spectroscopy alone [7,8,55–58]. Other approaches, such as MRSI coupled with magnetic resonance imaging (MRI) or the acquisition of an additional voxel adjacent to the brain mass, should provide relevant additional information for distinguishing between these two types of tumors [57–59].
Figure 1 shows the box-whisker plot of the performance (BER based on IT) for each problem based on the detailed list of the results (Sect. 1 of the on-line Supplementary Material). Note the high deviation of the distribution for the GBM versus MET with respect to the others. In a multiple comparison at a 0.05 α-level based on Tukey’s honestly significant difference criterion for Kruskal-Wallis nonparametric one-way analysis of variance [60], each problem had a mean rank that was significantly different from the GBM versus MET problem. The distributions of the other five discrimination problems overlapped. Nevertheless, the smallest non-outlier observation of the GBM versus LGG problem was higher than the smallest non-outlier observation of the remaining problems. This may indicate that the GBM versus LGG discrimination is more difficult to solve by SV short TE MRS than the other four discrimination problems.
The different approaches obtained good results for the discrimination of the GBM and MEN classes. A multilayer perceptron with the full spectra achieved a BER of 0.09. The mode of the distribution of BER was below 0.20 for the GBM versus MEN problem.
The difficulty of the GBM versus MET discrimination was clearly observed in both the CV and IT estimations (see Fig. 2). In the distribution of the IT results for this problem, the BER mode was 0.5, and the main distribution of the results ranged from 0.4 to 0.55. Some methods achieved a BER of 0.2; nevertheless, the main mass of the distribution was far from this value, which makes it difficult to ensure reproducibility of these performances. These results agree with those already published in previous studies [8,10]. This is most probably due to the similar necrotic profile (high lipid peaks mask the rest of the metabolic information) of the metastasis cases and of most of the glioblastoma cases.
The mode of the BER for the GBM versus LGG problem was 0.2. Nevertheless, there was a set of regularized classifiers that obtained a BER of around 0.09. To be more precise, the best BER corresponded to the Bayesian framework for LS-SVM using peak integration (PI) values. Devos et al. [10] obtained comparable performances for this problem using LDA and standard LS-SVMs. In studies [25,61], significant statistical differences between GBM and LGG and between GBM and astrocytoma grade-III were also found for different metabolite ratios with respect to Cr and/or water. In long TE, Menze et al. [13] observed a better performance with regularized methods than with the standard ones when classifying normal, non-progressive tumors (with radiation injury and stable disease) and brain tumors.
As expected, our results confirm that MEN can be easily discriminated from MET no matter what method is used. Most of the BER probability mass of the results was in the interval from 0.1 to 0.2. The best result achieved a BER of 0.07, which was based on PCA and a neural network with Bayesian regularization. These results are consistent with [10].
LS-SVM and LDA with different feature extraction methods achieved BER of 0.08 and 0.11 for the meningioma versus low-grade glial problem. Most of the results for this problem were in the interval from 0.15 to 0.25, and the mode of the distribution was under 0.2. The low error in MEN versus LGG was also predicted by the CV results on the INTERPRET data. This result is consistent with the performances reported by Tate et al. [7] on a three-class discrimination problem: MEN versus astrocytomas grade II (A2) versus aggressive tumors (AGG) (which is composed of GBM and MET). In that study, the confusion submatrix of MEN versus A2 indicates no misclassifications between them. Identical results were obtained by Tate et al. [8] when extending the three-class classifier to MEN versus LGG versus AGG.
The distribution of BER for MET versus LGG had a clear trend towards the lower values (BER of 0.1), showing good performance for all the methods studied in this problem. PI combined with LDA, FLDA, MLP, or LS-SVM classification methods obtained the best performance for the IT set. The CV estimations of the errors also indicated good performance by the classifiers. These results are also consistent with [10].
The pre-processing techniques
Eight out of 50 semi-automatically pre-processed test cases were misclassified at least once by the pairwise BDK classifiers (GBM versus MET excluded). Likewise, 10 out of 47 automatically pre-processed test cases were misclassified at least once by the same classifiers. Based on these rates, no differences were observed between the classification of automatically and semi-automatically pre-processed signals. The semi-automatic pre-processing pipeline applied to the larger part of the test dataset was consistent with the automatic pipeline applied on the training set. This is an important practical conclusion because it suggests the compatibility of different pre-processing software tools, either in an automatic or a semi-automatic fashion, for automatic classification in CDSSs.
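The two rates quoted above (8/50 and 10/47) can be compared with a simple pooled two-proportion z-test. This check is not part of the original analysis; it is a sketch added for illustration, it relies on the normal approximation, and the function name is ours.

```python
import math

def two_proportion_z(x1, n1, x2, n2):
    """Pooled two-proportion z-test (two-sided, normal approximation)."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return p1, p2, z, p_value

# Semi-automatic: 8 of 50 cases; automatic: 10 of 47 cases.
p_semi, p_auto, z, p_value = two_proportion_z(8, 50, 10, 47)
```

The rates are 16% and about 21%, and under these assumptions the p-value is roughly 0.5, which is in line with the statement that no difference was observed.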
The feature extraction methods
All the feature extraction methods applied in this study were based on PR. Therefore, we could not make any comparison between PR and metabolite quantification approaches. Approaches that take advantage of the combination of different TE [25,26,30,62–64] were not considered in order to ensure that results could be compared with previous analyses of this type of data [7,8,10,12,27,28,65–67]. Furthermore, although a feature extraction evaluation is not the aim of the present study and the setup of this study is not designed specifically for it, some effects of the different feature extraction methods are reported.
Figure 3 shows the box-whisker plot of the performance (BER) for each feature extraction (FE) method. GBM versus MET classifiers are not included because of their large difference in performance with respect to the other classification problems. The distributions of the results for all FE methods overlap, and no statistical differences were observed. Nevertheless, a noteworthy fact is the trend toward low values of the peak integration method compared to other methods. The study of Devos et al. [10] on the same four classes obtained similar performances when comparing full region of interest, peak regions and PI. In [12], Simonetti et al. compared PCA, independent component analysis (ICA), LCModel [67] and PI for feature extraction on short TE MRSI data, and they also obtained the best results with PI. In a single-center study, Opstad et al. [28] reported that the LCModel quantification obtained better results than PCA for two-step LDA classification. In long TE spectra, Lukas et al. [11] observed a better performance using the full region of interest rather than using PI or peak region extraction. Finally, Menze et al. [13] and Luts et al. [68] obtained an improvement when PR approaches (e.g. ICA, PCA, binned peak region and WAV) were used in short or long TE instead of quantification approaches.
The classification methods
The diversity of methods used for classification is broad enough to give a good overview of the effect that this selection has on the performance of the classifiers. Figure 4 shows the box-whisker plot of the performance (BER) for each classification method. Analogously to the analysis of FE methods, GBM versus MET classifiers are not included in the distributions because of their large differences in performance with respect to the other classification problems. As observed in Fig. 4, the distributions overlap, but in general, lower BER values were obtained using a BDK. In [27], BDK was applied to PI values to discriminate tumor grades and other tissues in the INTERPRET multi-voxel dataset. The study of Devos et al. [10] observed similar performances of their LDA and LS-SVM classifiers based on PI and evaluated by the area under the ROC curves. Tate et al. [7,8] based their three-class classifiers on the LDA due to the ability of this method to project the results in a two-dimensional space for visualization. Note that FLDA shows similar results to the other methods on average; however, other methods like LS-SVM and BDK might be preferable for some discrimination problems (e.g. GBM vs. LGG).
Finally, in Fig. 2, we summarize and compare the BER estimation obtained by the CV for the INTERPRET training dataset and the IT consisting of the new eTUMOUR cases. Most of the results are in the (BER(CV)<0.2, BER(IT)<0.3) region, except for the GBM versus MET problem, which had a sparse distribution. The general trend in this region is indicated by the black dashed line. This indicates an underestimation of the BER by the CV evaluation. Such underestimation is typically observed in PR challenges [54], and it is usually produced by the overfitting of the models on the training dataset and the estimation of the error with non-fully independent samples [69]. A noteworthy feature of our study is the evaluation of the predictive models using the new, subsequently acquired multicenter test set, which ensures the independence of the training and test sets. The GBM versus MET results are scattered in regions of larger error. For this problem, some overestimations of the CV error are also observed. This may show the difficulty of the problem and the randomness in the results. The results obtained for the rest of the discrimination problems confirm the expected behaviour of the predictive models.
Use of the study for automatic validation of MRS entries in brain tumour datasets
An intuitive method to compare datasets of signals is the visual inspection of their prototypical patterns. Figure 5 shows plots of the unimodal prototypes of the short TE spectra for the four tumour groups of the training and test datasets. Each prototype is represented by the unsmoothed mean function and the mean function ± the standard deviation function. The view is zoomed to the [0.5, 4.1] ppm region used in our experiments. The observed resonances correspond to the main compounds reported in the “Introduction”. In general, the training and test prototype patterns for GBM, MET and LGG are close to each other, whereas the MEN prototype differs visually more. This may be because of a higher standard deviation on the test dataset around the 3.21 ppm peak with respect to the training dataset. Besides, the variation around 2.2 ppm is higher in the test-set mean than in the training-set mean.
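A prototype of this kind is just a pointwise mean with a band of one standard deviation on each side. The sketch below shows one way to compute it; the function name and the use of the sample standard deviation are assumptions, since the text does not state which estimator was used.

```python
from statistics import mean, stdev

def prototype(spectra):
    """Pointwise mean and mean +/- sample std over equal-length spectra."""
    columns = list(zip(*spectra))          # one tuple per ppm point
    mu = [mean(col) for col in columns]
    sd = [stdev(col) for col in columns]
    lower = [m - s for m, s in zip(mu, sd)]
    upper = [m + s for m, s in zip(mu, sd)]
    return mu, lower, upper

# Toy example: two "spectra" of two points each.
mu, lower, upper = prototype([[1.0, 2.0], [3.0, 2.0]])
```

Computing one prototype per tumour group for the training set and another for the test set, and overlaying them on the same ppm axis, gives the kind of comparison described above.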
A practical result of this study is that cases that are repeatedly misclassified by the different techniques can be flagged as candidates for review for possible problems in voxel positioning, acquisition artifacts, normal-tissue contamination, or limitations in the classification methodology (e.g. patterns replicated in non-tumoral diseases, atypical MRS patterns and underrepresented tumor subtypes). In this way, even in the absence of biopsy, PR techniques can contribute to the automatic validation of cases, assisting the specialists in the detection of potential sources of error in the biomedical data acquired from patients.
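One simple way to turn this idea into a rule is to count, for each case, the fraction of classifiers that misclassified it and to flag cases above a threshold. The sketch below is an illustration under stated assumptions: the record layout, the case and classifier identifiers, and the 0.5 threshold are all hypothetical and do not describe the procedure used in this study.

```python
from collections import Counter

def flag_for_review(results, min_fraction=0.5):
    """Return case ids misclassified by at least min_fraction of the
    classifiers that were evaluated on them.

    results: iterable of (case_id, classifier_id, correct) records.
    """
    tested, wrong = Counter(), Counter()
    for case_id, _classifier_id, correct in results:
        tested[case_id] += 1
        if not correct:
            wrong[case_id] += 1
    return sorted(c for c in tested if wrong[c] / tested[c] >= min_fraction)

# Hypothetical records, for illustration only.
results = [
    ("case_A", "clf_1", False), ("case_A", "clf_2", False), ("case_A", "clf_3", True),
    ("case_B", "clf_1", True), ("case_B", "clf_2", True),
    ("case_C", "clf_1", False), ("case_C", "clf_2", True),
]
flagged = flag_for_review(results)
```

Normalising by the number of classifiers that actually saw a case matters here, because in a pairwise setting each case is only tested by the classifiers whose tasks include its class.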
Figures 6 and 7 show some eTUMOUR misclassified cases which may be interesting to review. The eTUMOUR case et2274 was diagnosed by the original pathologist as oligodendroglioma 9450/3 (grade II, WHO), although a comment was added to the free text section of the eTUMOUR database (eTDB) making reference to the presence of areas of anaplastic oligodendroglioma (grade III, WHO). Still, the final diagnosis proposed was grade II oligodendroglioma. The voxel allocation was carried out following the eTUMOUR acquisition protocol. The ML pattern is uncommon, as the high 0.9 and 1.3 ppm resonances show. The disappearance of these resonances at long TE (136 ms) rules out a significant necrotic contribution (results not shown, but see [30]). This pattern has been observed before [30], for example in the INTERPRET cases I0450 (oligoastrocytoma) and I0179 (oligodendroglioma), which are also misplaced in the short TE latent space of the INTERPRET decision-support system (DSS) 2.0 (http://azizu.uab.es/INTERPRET). In summary, et2274 seems to behave as a class outlier, and its consistent misclassification in our analysis may reflect precisely that. The eTUMOUR case et2206 was originally diagnosed as oligoastrocytoma 9382/3 (grade II, WHO), but there were some discrepancies regarding the glial subtype in the validation done by the pathological committee. It was misclassified by every MET versus LGG classifier, and also by some GBM versus LGG and MEN versus LGG classifiers. Its ML pattern at short TE is also uncommon, having relatively large 0.9, 1.3 and 2.8 ppm peaks that are reduced at long TE (results not shown), which likewise suggests a non-necrotic origin. The eTUMOUR case et2349 is a GBM without clearly visible ML, which was misclassified in every classification problem. The review of the experts did not indicate problems in the location of the voxel, which was mainly positioned in the highly cellular part of the tumour.
The eTUMOUR case et2197 is a MET with a possible MRS pattern contribution from normal brain parenchyma, as could be deduced from the relative difference in size between the voxel used for acquisition and the small brain lesion. Its pattern shows similar Cho and Cr peak heights and relatively high NAA at 2 ppm. However, the appearance of high Lac/ML at 1.3 ppm at the same time suggests abnormality. Nonetheless, it is clearly an uncommon spectral pattern for a MET.
Conclusions
This study describes a multiproject-multicenter evaluation of automated brain tumor classifiers using single-voxel short TE MR spectra. To our knowledge, there is no previous work that evaluates predictive models trained with data acquired from a multicenter project using a new independent test set subsequently acquired from partly different centers. Classifiers were trained with cases acquired by six centers during the 2000–2002 period. They were tested with later cases acquired by eight institutions during the 2004–2007 period. This strategy provides a view that is close to a real environment where similar classifiers, integrated in a clinical decision-support system (CDSS), may be used in multiple hospitals to assist in the diagnosis of new cases.
Our major conclusion is that accurate classification of those new cases is feasible using data acquired in different hospitals, with different instrumentation, but under similar acquisition protocols. Specifically, in our experiments, classifiers developed from the INTERPRET dataset seem to be robust enough for predictive classification of prospective cases from eTUMOUR.
The pairwise discriminations between glioblastoma, meningioma, metastasis, and low-grade glial tumors achieved accuracies of around 90%. However, the discrimination of glioblastoma from metastasis did not exceed 78% accuracy. Our results consolidate the conclusions of previous studies on automatic brain tumor classification using MRS, now with multiproject-multicenter data for training and subsequent testing.
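For readers who wish to reproduce this kind of pairwise evaluation, the sketch below enumerates the pairwise problems among the four classes and computes accuracy together with the balanced error rate (BER, listed in the Abbreviations) on an independent test set. It is an illustrative sketch under stated assumptions: the function names are hypothetical, the BER is taken as the mean of the per-class error rates, and the labels and predictions are invented and are not results from this study.

```python
from itertools import combinations

CLASSES = ["GBM", "MEN", "MET", "LGG"]


def pairwise_problems(classes):
    """All unordered class pairs, e.g. ("GBM", "MEN"), ("GBM", "MET"), ..."""
    return list(combinations(classes, 2))


def accuracy_and_ber(y_true, y_pred):
    """Accuracy and balanced error rate (mean of the per-class error rates)."""
    n_correct = sum(t == p for t, p in zip(y_true, y_pred))
    accuracy = n_correct / len(y_true)

    per_class_error = []
    for label in sorted(set(y_true)):
        indices = [i for i, t in enumerate(y_true) if t == label]
        n_wrong = sum(y_pred[i] != label for i in indices)
        per_class_error.append(n_wrong / len(indices))
    ber = sum(per_class_error) / len(per_class_error)
    return accuracy, ber


problems = pairwise_problems(CLASSES)  # six pairwise problems for four classes

# Invented, unbalanced test set for one pairwise problem (GBM versus MET).
y_true = ["GBM"] * 8 + ["MET"] * 2
y_pred = ["GBM"] * 8 + ["GBM", "MET"]
accuracy, ber = accuracy_and_ber(y_true, y_pred)
print(len(problems), accuracy, ber)
```

In this toy case the accuracy is high although half of the minority class is misclassified, which is why a class-balanced measure is a useful companion to accuracy when class sizes differ.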
A well-defined protocol for the acquisition of MRS (e.g. spectral parameters and voxel localization) and the application of quality controls to MR spectra should make such classification rules reproducible and enable the successful use of decision-support systems (DSSs) in clinical environments.
The classifiers obtained with the methodology of the present study may also serve as “automatic flaggers” to help in the quality control of cases during the eTUMOUR multicenter project and beyond. The approach used in this work could also be useful in studies of pediatric brain tumours [70] aimed at providing predictive information to pediatric neurosurgeons.
Hence, the conclusions obtained in this study are directly applicable to several of the tasks associated with the development of a CDSS for brain tumor diagnosis and prognosis and with its deployment in clinical environments.
Acknowledgments
We would like to thank the INTERPRET and eTUMOUR partners for providing data, particularly Carles Majós (IDI-Bellvitge), John Griffiths and Franklyn Howe (SGUL), Arend Heerschap (RU), Witold Gajewicz (MUL), Jorge Calvar (FLENI), and Antoni Capdevila (H. de Sant Joan de Déu). This work was partially funded by the European Commission: eTUMOUR (contract no. FP6-2002-LIFESCIHEALTH 503094), HEALTHAGENTS (contract no. FP6-2005-IST 027213), and BIOPATTERN (contract no. FP6-2002-IST 508803). The authors appreciate the suggestions from the reviewers that have improved the discussion presented in this work. We also thank the following for their contributions: Programa de Apoyo a la Investigación y Desarrollo, PAID-00-06 UPV; Research Council KUL: GOA-AMBioRICS, Centers-of-excellence optimisation; Belgian Federal Government: DWTC, IUAPV P6/04 (DYSCO 2007-2011). JVR acknowledges the Programa Torres Quevedo from the Ministerio de Educación y Ciencia, co-funded by the European Social Fund (PTQ05-02-03386). JL is a PhD student supported by an IWT grant. DM is supported by the Ministerio de Educación y Ciencia del Gobierno de España through a Ramon y Cajal 2006 Contract. BC and CA gratefully acknowledge the Ministerio de Educación y Ciencia del Gobierno de España (BC: SAF2004-06297 and SAF2007-6547; CA: SAF2005-03650). CIBER-BBN is an initiative of the “Instituto de Salud Carlos III” (ISCIII), Spain.
Open Access
This article is distributed under the terms of the Creative Commons Attribution Noncommercial License which permits any noncommercial use, distribution, and reproduction in any medium, provided the original author(s) and source are credited.
Abbreviations
- A2
Astrocytomas grade II
- AGG
Aggressive tumors
- Ala
Alanine
- BER
Balanced error rate
- BDK
Bi-directional Kohonen networks
- CDSS
Clinical decision-support system
- CDSSs
Clinical decision-support systems
- CDVC
Clinical Data Validation Committee
- Cho
Choline
- CNS
Central nervous system
- Cr
Creatine
- CV
Cross validation
- dLDA
Linear discriminant analysis with diagonal covariance matrix
- dQDA
Quadratic discriminant analysis with diagonal covariance matrix
- DSS
Decision-support system
- DSSs
Decision-support systems
- ERR
Error rate
- eTDB
The eTUMOUR project database
- eTUMOUR
The eTUMOUR project
- FE
Feature extraction
- FID
Free induction decay
- FLDA
Fisher’s rank-reduced version of LDA
- FFT
Fast Fourier transform
- GBM
Glioblastoma
- GE
General Electric
- Gly
Glycine
- Glx
Glutamate/glutamine
- HEALTHAGENTS
The HEALTHAGENTS EC project
- HLSVD
Hankel–Lanczos singular value decomposition
- HSVD
Hankel singular value decomposition
- ICA
Independent component analysis
- INTERPRET
The INTERPRET project
- IT
Independent test
- jMRUI
Java Magnetic Resonance User Interface
- Lac
Lactate
- LDA
Linear discriminant analysis
- LGG
Low-grade glial
- LS-SVM
Least-squares support vector machine
- MEN
Low-grade Meningioma
- MET
Metastasis
- mI
myo-Inositol
- ML
Mobile lipids
- MLP
Multilayer perceptron
- MM
Macromolecules
- MR
(Nuclear) magnetic resonance
- MRI
Magnetic resonance imaging
- MRS
Magnetic resonance spectroscopy
- MRSI
Magnetic resonance spectroscopic imaging
- NAA
N-acetyl aspartate
- PCA
Principal component analysis
- PCA-KNN
K-nearest neighbours and local feature reduced by principal component analysis (PCA)
- PI
Peak integration
- PPM
Peak height of typical resonances
- PR
Pattern recognition
- PRESS
Point-resolved spectroscopic sequence
- QDA
Quadratic discriminant analysis
- SNR
Signal-to-noise ratio
- SNV
Standard normal variate
- STEAM
Stimulated echo acquisition mode sequence
- SV
Single-voxel
- SVM
Support vector machines
- Tau
Taurine
- TE
Echo time
- TR
Repetition time
- WAV
Wavelet transform
- WHO
World Health Organization
Footnotes
Available from http://bmg.webs.upv.es/joomla_rpboys/articulos/mmeval_mrs08.pdf
References
- 1. Howe FA, Opstad KS. 1H MR spectroscopy of brain tumours and masses. NMR Biomed. 2003;16(3):123–131. doi: 10.1002/nbm.822.
- 2. Galanaud D, Nicoli F, Chinot O, Confort-Gouny S, Figarella-Branger D, Roche P, Fuentes S, Le Fur Y, Ranjeva JP, Cozzone PJ. Noninvasive diagnostic assessment of brain tumors using combined in vivo MR imaging and spectroscopy. Magn Reson Med. 2006;55(6):1236–1245. doi: 10.1002/mrm.20886.
- 3. Arnold DL, De Stefano N. Magnetic resonance spectroscopy in vivo: applications in neurological disorders. Ital J Neurol Sci. 1997;18(6):321–329. doi: 10.1007/BF02048235.
- 4. Poptani H, Kaartinen J, Gupta RK, Niemitz M, Hiltunen Y, Kauppinen RA. Diagnostic assessment of brain tumours and non-neoplastic brain disorders in vivo using proton nuclear magnetic resonance spectroscopy and artificial neural networks. J Cancer Res Clin Oncol. 1999;125(6):343–349. doi: 10.1007/s004320050284.
- 5. Moller-Hartmann W, Herminghaus S, Krings T, Marquardt G, Lanfermann H, Pilatus U, Zanella FE. Clinical application of proton magnetic resonance spectroscopy in the diagnosis of intracranial mass lesions. Neuroradiology. 2002;44(5):371–381. doi: 10.1007/s00234-001-0760-0.
- 6. Hagberg G. From magnetic resonance spectroscopy to classification of tumors. A review of pattern recognition methods. NMR Biomed. 1998;11(4-5):148–156. doi: 10.1002/(SICI)1099-1492(199806/08)11:4/5<148::AID-NBM511>3.0.CO;2-4.
- 7. Tate AR, Majos C, Moreno A, Howe FA, Griffiths JR, Arús C. Automated classification of short echo time in in vivo 1H brain tumor spectra: a multicenter study. Magn Reson Med. 2003;49(1):29–36. doi: 10.1002/mrm.10315.
- 8. Tate AR, Underwood J, Acosta DM, Julia-Sape M, Majos C, Moreno-Torres A, Howe FA, van der Graaf M, Lefournier V, Murphy MM, Loosemore A, Ladroue C, Wesseling P, Luc Bosson J, Cabanas ME, Simonetti AW, Gajewicz W, Calvar J, Capdevila A, Wilkins PR, Bell BA, Remy C, Heerschap A, Watson D, Griffiths JR, Arús C. Development of a decision support system for diagnosis and grading of brain tumours using in vivo magnetic resonance single voxel spectra. NMR Biomed. 2006;19(4):411–434. doi: 10.1002/nbm.1016.
- 9. González-Vélez H, Mier M, Julià-Sapé M, Arvanitis T, García-Gómez J, Robles M, Lewis P, Dasmahapatra S, Dupplaw D, Peet A, Arús C, Celda B, Van Huffel S, Lluch-Ariet M (2007) HealthAgents: distributed multi-agent brain tumor diagnosis and prognosis. Appl Intell (Epub ahead of print)
- 10. Devos A, Lukas L, Suykens JAK, Vanhamme L, Tate AR, Howe FA, Majos C, Moreno-Torres A, van der Graaf M, Arús C, Van Huffel S. Classification of brain tumours using short echo time 1H MR spectra. J Magn Reson. 2004;170(1):164–175. doi: 10.1016/j.jmr.2004.06.010.
- 11. Lukas L, Devos A, Suykens JAK, Vanhamme L, Howe FA, Majós C, Moreno-Torres A, Graaf MVD, Tate AR, Arús C, Huffel SV. Brain tumor classification based on long echo proton MRS signals. Artif Intell Med. 2004;31:73–89. doi: 10.1016/j.artmed.2004.01.001.
- 12. Simonetti AW, Melssen WJ, Szabo de Edelenyi F, van Asten JJA, Heerschap A, Buydens LMC. Combination of feature-reduced MR spectroscopic and MR imaging data for improved brain tumor classification. NMR Biomed. 2005;18(1):34–43. doi: 10.1002/nbm.919.
- 13. Menze BH, Lichy MP, Bachert P, Kelm BM, Schlemmer HP, Hamprecht FA. Optimal classification of long echo time in vivo magnetic resonance spectra in the detection of recurrent brain tumors. NMR Biomed. 2006;19(5):599–609. doi: 10.1002/nbm.1041.
- 14. Potts HWW, Wyatt JC, Altman DG (2001) Challenges in evaluating complex decision support systems: lessons from design-a-trial. In: AIME ’01: proceedings of the 8th conference on AI in medicine in Europe, pp 453–456. Springer, London
- 15. Lisboa PJ, Taktak AFG. The use of artificial neural networks in decision support in cancer: a systematic review. Neural Netw. 2006;19(4):408–415. doi: 10.1016/j.neunet.2005.10.007.
- 16. Anagnostou T, Remzi M, Djavan B. Artificial neural networks for decision-making in urologic oncology. Eur Urol. 2003;43(6):596–603. doi: 10.1016/s0302-2838(03)00133-7.
- 17. Perner P. Intelligent data analysis in medicine-recent advances. Artif Intell Med. 2006;37(1):1–5. doi: 10.1016/j.artmed.2005.10.003.
- 18. INTERPRET Consortium. Interpret web site. http://azizu.uab.es/INTERPRET. Accessed 28 April 2008
- 19. Julia-Sape M, Acosta D, Mier M, Arús C, Watson D. A multi-centre, web-accessible and quality control-checked database of in vivo MR spectra of brain tumour patients. Magn Reson Mater Phys. 2006;19(1):22–33. doi: 10.1007/s10334-005-0023-x.
- 20. van der Graaf M, Julia-Sape M, Howe FA, Ziegler A, Majos C, Moreno-Torres A, Rijpkema M, Acosta D, Opstad KS, van der Meulen YM, Arus C, Heerschap A. MRS quality assessment in a multicentre study on MRS-based classification of brain tumours. NMR Biomed. 2008;21(2):148–158. doi: 10.1002/nbm.1172.
- 21. Devos A (2005) Quantification and classification of magnetic resonance spectroscopy data and applications to brain tumour recognition. Ph.D. thesis, Faculty of Engineering, K.U.Leuven
- 22. Simonetti AW, Melssen WJ, van der Graaf M, Postma GJ, Heerschap A, Buydens LMC. A chemometric approach for brain tumor classification using magnetic resonance imaging and spectroscopy. Anal Chem. 2003;75(20):5352–5361. doi: 10.1021/ac034541t.
- 23. Devos A, Simonetti AW, van der Graaf M, Lukas L, Suykens JAK, Vanhamme L, Buydens LMC, Heerschap A, Van Huffel S. The use of multivariate MR imaging intensities versus metabolic data from MR spectroscopic imaging for brain tumour classification. J Magn Reson. 2005;173(2):218–228. doi: 10.1016/j.jmr.2004.12.007.
- 24. Laudadio T, Martinez-Bisbal M, Celda B, Van Huffel S. Fast nosological imaging using canonical correlation analysis of brain data obtained by two-dimensional turbo spectroscopic imaging. NMR Biomed. 2007;21(4):311–321. doi: 10.1002/nbm.1190.
- 25. Martinez-Bisbal MC, Celda B, Marti-Bonmati L, Ferrer P, Revert-Ventura AJ, Piquer J, Molla E, Arana R, Dosda-Munoz R. The contribution of magnetic resonance spectroscopy for the classification of high grade glial tumours. The predictive value of macromolecules. Revista de Neurología. 2002;34:309–313.
- 26. Martinez-Bisbal MC, Ferrer-Luna R, Martinez-Granados B, Monleón D, Esteve V, Piquer J, Revert AJ, Mollá E, Martí-Bonmatí L, Celda B. Glial tumours grading by a combination of (1)H MR short and medium echo time single voxel located by spectroscopic imaging. Magn Reson Mater Phys. 2005;18(S1):S68.
- 27. Melssen W, Wehrens R, Buydens L. Supervised Kohonen networks for classification problems. Chemom Intell Lab Syst. 2006;83(2):99–113. doi: 10.1016/j.chemolab.2006.02.003.
- 28. Opstad KS, Ladroue C, Bell BA, Griffiths JR, Howe FA. Linear discriminant analysis of brain tumour (1)H MR spectra: a comparison of classification using whole spectra versus metabolite quantification. NMR Biomed. 2007;20(8):763–770. doi: 10.1002/nbm.1147.
- 29. eTumour Consortium (2003) eTumour: web accessible MR decision support system for brain tumour diagnosis and prognosis, incorporating in vivo and ex vivo genomic and metabolomic data. Technical report, FP6-2002-LIFESCIHEALTH 503094, VI framework programme, EC. http://www.etumour.net. Accessed 28 April 2008
- 30. García-Gómez JM, Tortajada S, Vidal C, Julia-Sape M, Luts J, Moreno-Torres À, Van Huffel S, Arús C, Robles M (2008) The effect of combining two echo times in automatic brain tumor classification by MRS. NMR Biomed 21 (in press)
- 31. Kleihues P, Burger PC, Scheithauer BW. The new WHO classification of brain tumours. Brain Pathol. 1993;3(3):255–268. doi: 10.1111/j.1750-3639.1993.tb00752.x.
- 32. Klose U. In vivo proton spectroscopy in presence of eddy currents. Magn Reson Med. 1990;14(1):26–30. doi: 10.1002/mrm.1910140104.
- 33. Naressi A, Couturier C, Castang I, de Beer R, Graveron-Demilly D. Java-based graphical user interface for MRUI, a software package for quantitation of in vivo/medical magnetic resonance spectroscopy signals. Comput Biol Med. 2001;31(4):269–286. doi: 10.1016/S0010-4825(01)00006-3.
- 34. Cabanes E, Confort-Gouny S, Le Fur Y, Simond G, Cozzone PJ. Optimization of residual water signal removal by HLSVD on simulated short echo time proton MR spectra of the human brain. J Magn Reson. 2001;150(2):116–125. doi: 10.1006/jmre.2001.2318.
- 35. Hoch JC, Stern AS. NMR data processing. New York: Wiley; 1996.
- 36. Preul MC, Caramanos Z, Collins DL, Villemure JG, Leblanc R, Olivier A, Pokrupa R, Arnold DL. Accurate, noninvasive diagnosis of human brain tumors by using proton magnetic resonance spectroscopy. Nat Med. 1996;2(3):323–325. doi: 10.1038/nm0396-323.
- 37. Burges CJ (2004) Geometric methods for feature extraction and dimensional reduction: a guided tour. Technical report, Microsoft Research, University of Toronto
- 38. Fukunaga K. Introduction to statistical pattern recognition, 2nd edn. San Diego: Academic Press; 1990.
- 39. Comon P. Independent component analysis, a new concept. Signal Process. 1994;36(3):287–314. doi: 10.1016/0165-1684(94)90029-9.
- 40. Cardoso J-F, Souloumiac A. Blind beamforming for non Gaussian signals. IEE Proc F. 1993;140(6):362–370.
- 41. Daubechies I (1992) Ten lectures on wavelets (CBMS–NSF regional conference series in applied mathematics). Society for Industrial and Applied Mathematics
- 42. Panagiotacopulos N, Lertsuntivit S, Savidge L, Lin A, Shic F, Ross B (2000) Wavelet analysis of brain tumors in clinical MRS. In: Advances in physics, electronics and signal processing applications, pp 290–296
- 43. Krzanowski WJ, editor. Principles of multivariate analysis: a user’s perspective. New York: Oxford University Press; 1988.
- 44. Fisher RA. Statistical methods for research workers. Edinburgh: Oliver and Boyd; 1925.
- 45. Vapnik V. The nature of statistical learning theory. NY: Springer; 1995.
- 46. Suykens JAK, Vandewalle J. Least squares support vector machine classifiers. Neural Process Lett. 1999;9(3):293–300. doi: 10.1023/A:1018628609742.
- 47. Rosenblatt F. The perceptron: a probabilistic model for information storage and organization in the brain. Psychol Rev. 1958;65(6):386–408. doi: 10.1037/h0042519.
- 48. Melssen W, Ustun B, Buydens L. SOMPLS: a supervised self-organising map—partial least squares algorithm for multivariate regression problems. Chemom Intell Lab Syst. 2007;86(1):102–120. doi: 10.1016/j.chemolab.2006.08.013.
- 49. Valentini G, Dietterich TG. Bias-variance analysis of support vector machines for the development of SVM-based ensemble methods. J Mach Learn Res. 2004;5:725–775.
- 50. Hastie T, Tibshirani R, Friedman JH. The elements of statistical learning. Heidelberg: Springer; 2001.
- 51. Duda R, Hart P, Stork D. Pattern classification. London: Wiley; 2001.
- 52. Van Gestel T, Suykens JAK, Lanckriet G, Lambrechts A, De Moor B, Vandewalle J. Bayesian framework for least squares support vector machine classifiers, Gaussian processes and Kernel Fisher discriminant analysis. Neural Comput. 2002;14:1115–1147. doi: 10.1162/089976602753633411.
- 53. MacKay DJC. Bayesian interpolation. Neural Comput. 1992;4(3):415–447. doi: 10.1162/neco.1992.4.3.415.
- 54. Guyon I, Alamdari ARSA, Dror G, Buhmann JM (2006) Performance Prediction Challenge. In: IJCNN ’06 international joint conference on neural networks, pp 1649–1656
- 55. Ishimaru H, Morikawa M, Iwanaga S, Kaminogo M, Ochi M, Hayashi K. Differentiation between high-grade glioma and metastatic brain tumor using single-voxel proton MR spectroscopy. Eur Radiol. 2001;11(9):1784–1791. doi: 10.1007/s003300000814.
- 56. Opstad KS, Murphy MM, Wilkins PR, Bell BA, Griffiths JR, Howe FA. Differentiation of metastases from high-grade gliomas using short echo time 1H spectroscopy. J Magn Reson Imaging. 2004;20(2):187–192. doi: 10.1002/jmri.20093.
- 57. Law M, Cha S, Knopp EA, Johnson G, Arnett J, Litt AW. High-grade gliomas and solitary metastases: differentiation by using perfusion and proton spectroscopic MR imaging. Radiology. 2002;222(3):715–721. doi: 10.1148/radiol.2223010558.
- 58. Burtscher IM, Skagerberg G, Geijer B, Englund E, Stahlberg F, Holtas S. Proton MR spectroscopy and preoperative diagnostic accuracy: an evaluation of intracranial mass lesions characterized by stereotactic biopsy findings. AJNR Am J Neuroradiol. 2000;21(1):84–93.
- 59. Laudadio T, Luts J, Martinez-Bisbal M, Celda B, Huffel SV (2008) Differentiation between brain metastasis and glioblastoma using MRI and two-dimensional turbo spectroscopic imaging data. In: Proceedings of the 4th European medical and biomedical engineering congress (in press)
- 60. Hochberg Y, Tamhane AC. Multiple comparison procedures. New York: Wiley; 1987.
- 61. Celda B, Monleon D, Martinez-Bisbal MC, Esteve V, Martinez-Granados B, Pinero E, Ferrer R, Piquer J, Marti-Bonmati L, Cervera J. MRS as endogenous molecular imaging for brain and prostate tumors: FP6 project “eTUMOR”. Adv Exp Med Biol. 2006;587:285–302. doi: 10.1007/978-1-4020-5133-3_22.
- 62. Tortajada S, García-Gómez JM, Vidal C, Arús C, Julià-Sapé M, Moreno A, Robles M (2006) Improved classification by pattern recognition of brain tumours combining long and short echo time 1H-MR spectra. In: SpringerLink (ed) Book of abstracts ESMRMB 2006. J Magn Reson Mater Phys Biol Med 19(suppl 1): 168–169
- 63. García-Gómez JM, Tortajada S, Vicente J, Sáez C, Castells X, Luts J, Julià-Sapé M, Juan-Císcar A, Van Huffel S, Barcelo A, Ariño J, Arús C, Robles M (2007) Genomics and metabolomics research for brain tumour diagnosis based on machine learning. In IWANN: lecture notes in computer sciences, vol 4507/2007, pp 1012–1019
- 64. McIntyre DJO, Charlton RA, Markus HS, Howe FA. Long and short echo time proton magnetic resonance spectroscopic imaging of the healthy aging brain. J Magn Reson Imaging. 2007;26(6):1596–1606. doi: 10.1002/jmri.21198.
- 65. Majos C, Julia-Sape M, Alonso J, Serrallonga M, Aguilera C, Acebes JJ, Arús C, Gili J. Brain tumor classification by proton MR spectroscopy: comparison of diagnostic accuracy at short and long TE. AJNR Am J Neuroradiol. 2004;25(10):1696–1704.
- 66. Julia-Sape M, Acosta D, Majos C, Moreno-Torres A, Wesseling P, Acebes JJ, Griffiths JR, Arús C. Comparison between neuroimaging classifications and histopathological diagnoses using an international multicenter brain tumor magnetic resonance imaging database. J Neurosurg. 2006;105(1):6–14. doi: 10.3171/jns.2006.105.1.6.
- 67. Provencher SW. Automatic quantitation of localized in vivo 1H spectra with LCModel. NMR Biomed. 2001;14(4):260–264. doi: 10.1002/nbm.698.
- 68. Luts J, Poullet JB, Garcia-Gomez JM, Heerschap A, Robles M, Suykens JAK, Van Huffel S. Effect of feature extraction for brain tumor classification based on short echo time 1H MR spectra. Magn Reson Med. 2008;60(2):288–298. doi: 10.1002/mrm.21626.
- 69. Bishop CM. Pattern recognition and machine learning (information science and statistics). Heidelberg: Springer; 2006.
- 70. Davies N, Wilson M, Harris L, Natarajan K, Lateef S, Macpherson L, Sgouros S, Grundy R, Arvanitis T, Peet A. Identification and characterisation of childhood cerebellar tumours by in vivo proton MRS. NMR Biomed. 2008;21(8):908–918. doi: 10.1002/nbm.1283.