A hierarchical spectral clustering and nonlinear dimensionality reduction scheme for detection of prostate cancer from magnetic resonance spectroscopy (MRS)

Pallavi Tiwari; Mark Rosen; Anant Madabhushi

doi:10.1118/1.3180955

2009 Aug 12;36(9):3927–3939.

PMCID: PMC2738739 PMID: 19810465

Abstract

Magnetic resonance spectroscopy (MRS) has been shown to have great clinical potential as a supplement to magnetic resonance imaging in the detection of prostate cancer (CaP). MRS provides functional information in the form of changes in the relative concentration of specific metabolites including choline, creatine, and citrate, which can be used to identify potential areas of CaP. With a view to assisting radiologists in interpretation and analysis of MRS data, some researchers have begun to develop computer-aided detection (CAD) schemes for CaP identification from spectroscopy. Most of these schemes have been centered on identifying and integrating the area under metabolite peaks, which is then used to compute relative metabolite ratios. However, identification of metabolite peaks on the MR spectra, whether manual or via CAD, is a challenging problem due to low signal-to-noise ratio, baseline irregularity, peak overlap, and peak distortion. In this article the authors present a novel CAD scheme that integrates nonlinear dimensionality reduction (NLDR) with an unsupervised hierarchical clustering algorithm to automatically identify suspicious regions on the prostate using MRS and hence avoids the need to explicitly identify metabolite peaks. The methodology comprises two stages. In stage 1, a hierarchical spectral clustering algorithm is used to distinguish between extracapsular and prostatic spectra in order to localize the region of interest (ROI) corresponding to the prostate. Once the prostate ROI is localized, in stage 2, an NLDR scheme, in conjunction with a replicated clustering algorithm, is used to automatically discriminate between three classes of spectra (normal appearing, suspicious appearing, and indeterminate). The methodology was quantitatively and qualitatively evaluated on a total of 18 1.5 T in vivo prostate T2-weighted (w) and MRS studies obtained from the multisite, multi-institutional American College of Radiology Imaging Network (ACRIN) trial.
In the absence of the precise ground truth for CaP extent on the MR imaging for most of the ACRIN studies, probabilistic quantitative metrics were defined based on partial knowledge on the quadrant location and size of the tumor. The scheme, when evaluated against this partial ground truth, was found to have a CaP detection sensitivity of 89.33% and specificity of 79.79%. The results obtained from randomized threefold and fivefold cross validation suggest that the NLDR based clustering scheme has a higher CaP detection accuracy compared to such commonly used MRS analysis schemes as z score and PCA. In addition, the scheme was found to be robust to changes in system parameters. For 6 of the 18 studies an expert radiologist laboriously labeled each of the individual spectra according to a five point scale, with 1∕2 representing spectra that the expert considered normal and 3∕4∕5 being spectra the expert deemed suspicious. When evaluated on these expert annotated datasets, the CAD system yielded an average sensitivity (cluster corresponding to suspicious spectra being identified as the CaP class) and specificity of 81.39% and 64.71%, respectively.

Keywords: magnetic resonance spectroscopy, computer-aided diagnosis, prostate cancer, nonlinear dimensionality reduction, hierarchical clustering, unsupervised classification, k-means, replicated clustering, principal component analysis, locally linear embedding, graph embedding, spectral clustering, z score

INTRODUCTION

Prostatic adenocarcinoma (CaP) is the most commonly occurring malignancy among men with 186 320 new cases and 28 660 deaths estimated to occur in the United States in 2008 (American Cancer Society, 2008). Early detection of CaP offers the best hope of curing it; however, early prostate cancer is usually asymptomatic.¹ Screening for CaP is based on digital rectal examination and monitoring elevated levels of the blood serum prostate specific antigen (PSA). Definitive diagnosis of CaP involves histological examination of biopsy specimens obtained via a blinded sextant transrectal ultrasound (TRUS) directed biopsy for patients with elevated PSA levels. Since prostate ultrasound is limited in its ability to identify CaP, biopsy locations are chosen at random within the prostate sextants. Consequently, the CaP detection accuracy associated with TRUS is only 20%–25% in patients with elevated PSA levels (4–10 ng∕ml).² Recently, in vivo endorectal T2-weighted (w) structural magnetic resonance (MR) imaging (MRI) of the prostate has allowed for greater discrimination between benign and cancerous prostatic structures as compared to TRUS.³ However, structural T2-w MRI by itself has been shown to be limited in its ability to detect small foci of carcinoma, contributing to a relatively low detection specificity.³

Over the past few years, MR spectroscopic (MRS) imaging (MRSI) has emerged as a useful complement to structural MRI for potential screening of CaP.⁴^,⁵ MRSI is a noninvasive technique used to obtain the metabolic concentrations of specific molecular markers and biochemicals in the prostate including citrate, creatine, and choline, and changes in the ratio [choline∕citrate or (choline+creatine)∕citrate] of which have been shown to be linked to presence of CaP.⁶^,⁷^,⁸^,⁹ The spectra are obtained at either single or multiple locations from a rectangular spectral grid placed on a corresponding T2-w MR image. It has been demonstrated previously that the relative concentrations of choline, citrate, and creatine are significantly different in CaP and normal regions within the prostate.⁶^,¹⁰

The relative concentrations of choline, creatine, and citrate are obtained by calculating the area under the peak for these metabolites to assess the presence of CaP at a specific prostate location on the T2-w MRI. Identification of precise location of specific metabolites on the MR spectra is a difficult task for radiologists due to (a) a low signal to noise ratio and (b) the presence of biomedical signal artifacts associated with MR spectra such as peak overlap and peak and baseline distortion. Figure 1a shows a MRS grid superimposed on the corresponding T2-w MRI slice, while Fig. 1b shows the MR spectra acquired from individual MRS voxels from the grid shown in Fig. 1a. Figures 1c, 1d, 1e correspond to representative CaP, noisy and normal spectra obtained from three different locations (shown in red, blue, and green, respectively) within the spectral grid [Fig. 1b]. Note the amount of noise in the MRS in Fig. 1d which, in some cases, can severely limit the ability of a radiologist to accurately identify individual metabolite peaks and quantitate corresponding peak areas. Thus the usefulness of MRSI as an adjunct to MRI as a means of detecting, localizing, and characterizing CaP is highly dependent on the quality of the spectral examinations obtained, owing to the challenges in visually identifying spectroscopic CaP signatures (through the identification of abnormal metabolite peak area ratios) with poor data quality.¹¹

Figure 1. (a) shows a T2-w MRI slice of the prostate with a user-selected 3×7 voxel grid overlaid on the prostate; the MRS spectral grid corresponding to the T2 slice is shown in (b). An abnormal appearing spectrum [red voxel in (a) and (b)] is shown in (c), while (e) shows a normal appearing spectrum [green voxel in (a) and (b)]. (d) shows an additional spectrum [blue voxel in (a) and (b)] with poor signal to noise ratio and with the baseline affected by a tail of broad upfield lipid resonance (2.0–2.5 ppm). Note that lipid contamination in edge voxels is relatively common in prostate MRSI and can affect the accuracy of the calculated metabolite ratios.

Computer-aided diagnosis (CAD) in medical imaging serves as an adjunctive method for image or data analysis, aiding the radiologist in identifying areas of disease or abnormality.¹² The low detection accuracy associated with current TRUS prostate biopsies points to a need for developing an image guided decision support system to direct needle placement in the prostate to increase CaP detection accuracy. While our group¹³^,¹⁴ and others¹⁵ have begun to develop CAD systems for CaP detection from structural and functional MR imaging, corresponding developments in MRS have not been as forthcoming. This is in spite of evidence to suggest that integration of structural and metabolic imaging⁷^,¹⁶ could boost diagnostic yield over that which could be obtained from any individual modality. To date, computer-based approaches for prostate MRSI analysis have focused largely on the use of semiautomated peak area integration to determine metabolite ratios, although a few researchers¹⁷^,¹⁸^,¹⁹^,²⁰^,²¹ explored the use of more sophisticated techniques for automated peak finding.

Previous attempts at computerized analysis of MRS can be classified into two broad categories: (a) signal quantification (model dependent) and (b) pattern recognition (model independent) approaches. Commonly used quantification methods include VARPRO,¹⁷AMARES,¹⁸ and QUEST,¹⁹ which are software utilities where the objective is to minimize the squared distance between the acquired data and a model basis function built on prior knowledge about the metabolic profiles of a typical MR spectrum. However, the performance of these quantification models is usually dependent on (i) the choice of correct number of model components, (ii) optimal choice of prior knowledge (model function), (iii) presence of noise and contributions from nonprostate spectra, (iv) peak overlap owing to contributions from multiple metabolites, and (v) baseline distortion and line broadening.²² In order to avoid the limitations of model and peak detection based approaches for MRS, recently some researchers have begun to explore domain independent techniques such as z score and principal component analysis (PCA). An excellent comprehensive comparison of quantification and pattern recognition schemes used for MRS analysis is provided in Ref. 22.

z score is a statistical quantity obtained as the ratio of the difference between each individual sample’s score and the population mean to the population standard deviation. z score analysis²⁰^,²³ aims to quantify the totality of contributions of all metabolites in the spectral vector. In Refs. 20 and 23, z score was employed for MRS based detection of glioma and CaP, respectively. Another statistical technique, canonical correlation analysis (CCA),²⁴ based on calculating the canonical coefficients to obtain correlated linear relationships between two multidimensional variables, was shown to be useful in successfully classifying prostate MRSI datasets into four classes: aggressive tumor, tumor, mixed tissue, and healthy tissue.

Devos et al.²⁵ used linear discriminant analysis (LDA) and a least squares support vector machine (SVM) classifier to discriminate between different tumor classes on brain MRS. Ma and Sun²⁶ and Simonetti et al.²⁷ explored other linear dimensionality reduction methods such as independent component analysis and PCA, in conjunction with support vector machine (SVM) classifier, to differentiate between brain tissue classes via MRS.

Dimensionality reduction (DR) refers to the projection of high dimensional data into a reduced dimensional feature space without a significant loss in class discriminatory information. The low dimensional representation of the data is easier to visualize and DR algorithms aim to preserve object relationships so that objects that are close to one another in the high dimensional ambient space are mapped to adjacent locations in the resulting low dimensional embedding space. However, linear DR schemes such as PCA assume the original high dimensional data to be inherently linear and hence employ linear projection methods to reduce data dimensionality. Recently several nonlinear dimensionality reduction (NLDR) algorithms have been proposed for the analysis and visualization of nonlinear data.²⁹^,³⁰^,³¹ The objective behind NLDR methods is to nonlinearly map objects, c, d belonging to the same object class and characterized by M-dimensional feature vectors F(c), F(d), to adjacent locations S(c), S(d) in the low dimensional embedding; S(c), S(d) representing the m-dimensional dominant eigenvectors corresponding to c, d (m⪡M). Unlike linear DR schemes that typically employ the Euclidean distance measure to estimate object distances, most NLDR schemes aim at preserving geodesic distances between objects while computing the embedding. We have previously demonstrated that the use of NLDR methods for representation of high dimensional gene- and protein-expression data, results in better classification compared to linear DR based representation (PCA and LDA).²⁸

In this paper, we present a fully automated CAD system for detecting abnormal∕suspicious regions on the prostate using 1.5 T prostate MRSI data. Figure 2 illustrates the organization of our CAD scheme. Our methodology comprises two stages. In stage 1, a novel hierarchical classification scheme is employed to recursively distinguish prostatic from extracapsular (noninformative) spectra via graph embedding (GE),³¹^,³² a well known NLDR scheme, in order to home in on the region of interest (ROI) corresponding to the prostate. In a typical field of view (FOV) of an in vivo endorectal prostate T2-w MR image, the prostate occupies a small percentage [approximately 10% in Fig. 1a] of the total volume within a prostate MRI scene. Since the extracapsular spectra are the most populous, the largest cluster is identified at each iteration as being noninformative and is eliminated. This process is repeated until the number of MR spectra remaining is approximately equal to the number usually contained in the prostate (Θ) [Fig. 1a]. The removal of extracapsular spectra in stage 1 makes it easier to discriminate between suspicious appearing and normal appearing spectra within the prostate. In stage 2, an NLDR scheme is applied to nonlinearly embed the informative spectra into a reduced dimensional space. The individual prostate spectra, now characterized by their low dimensional embedding coordinates, are clustered into distinct classes via a “replicated clustering” scheme. All spectra are aggregated into three classes based on the assumption that they correspond to normal, CaP, and indeterminate classes.

Figure 2. Flowchart showing various system components and methodological overview of the CAD scheme. Hierarchical spectral clustering is performed in stage 1 to automatically obtain the prostate grid, followed in stage 2 by the identification of different tissue classes via NLDR and replicated clustering.

The rest of this paper is organized as follows: In Sec. 2 we provide a detailed description of the feature extraction and clustering schemes employed in this paper. In Sec. 3 we provide the details of our CAD methodology for prostate MRS classification. In Sec. 4 we explain the evaluation scheme employed in this work followed in Sec. 5 by results of qualitative and quantitative evaluation of our scheme on a total of 18 prostate MRI-MRS studies. Concluding remarks and future research directions are presented in Sec. 6.

DESCRIPTION OF FEATURE EXTRACTION AND REPLICATED CLUSTERING METHODS

Feature extraction methods

Linear dimensionality reduction scheme: PCA

PCA is a linear DR method widely used to visualize high dimensional data and discern object relationships in the data by finding orthogonal axes that contain the greatest amount of variance in the data.³³ These orthogonal eigenvectors corresponding to the largest eigenvalues are called “principal components.” To obtain these principal components, each data point c in set C is first centered by subtracting, for each feature u, the mean of that feature over all observations in C from the original feature value f_u(c), u∊{1,…,M}, as shown by

{\bar{f}}_{u} (c) = f_{u} (c) - \frac{1}{∣ C ∣} \sum_{c ∊ C} f_{u} (c), u ∊ {1, . ., M} .

(2.1)

From feature values $\bar{f} (c)$ for each c∊C, a new ∣C∣×M matrix Y is constructed, where ∣C∣ is the cardinality of set C. The matrix Y is then decomposed into corresponding singular values as shown by

Y = U W_{PCA} V^{T},

(2.2)

where via singular value decomposition a ∣C∣×∣C∣ diagonal matrix W_PCA containing the eigenvalues of the principal components, a ∣C∣×∣C∣ left singular matrix U, and an M×∣C∣ matrix V are obtained. The eigenvalues in W_PCA represent the amount of variance for each eigenvector $S_{v}^{PCA}$ , v∊{1,2,…,m}, in matrix V^T and are used to rank the corresponding eigenvectors in the order of greatest variance. Thus the first m eigenvectors that represent a prespecified percentage of the variance in the data are extracted, while the remaining eigenvectors are discarded. Each data sample c∊C is then described by an m-dimensional embedding vector S^PCA(c). In spite of the fact that PCA assumes that the data lie on a linear manifold, it allows for specification of the number of eigenvectors required to explain a prespecified percentage of the variance in the data.
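The centering and decomposition steps of Eqs. 2.1 and 2.2 can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the implementation used in this work: the function name `pca_embed`, the parameter `var_fraction` (the prespecified fraction of variance to retain), and the synthetic data are all hypothetical.

```python
import numpy as np

def pca_embed(F, var_fraction=0.95):
    """Return the m-dimensional PCA embedding of the rows of F (|C| x M).

    var_fraction is a hypothetical name for the prespecified fraction of
    variance that the retained components must explain.
    """
    Y = F - F.mean(axis=0)                              # Eq. (2.1): center each feature
    U, s, Vt = np.linalg.svd(Y, full_matrices=False)    # Eq. (2.2): Y = U diag(s) Vt
    explained = np.cumsum(s ** 2) / np.sum(s ** 2)      # cumulative variance fraction
    m = min(int(np.searchsorted(explained, var_fraction)) + 1, len(s))
    return Y @ Vt[:m].T, m                              # rows are S^PCA(c)

# Synthetic stand-in for spectra: 50 samples, 8 features with unequal spread.
rng = np.random.default_rng(0)
F = rng.normal(size=(50, 8)) * np.array([10.0, 5.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0])
S_pca, m = pca_embed(F, var_fraction=0.90)
```

The sketch uses the economy-size SVD, so the singular values `s` are squared to obtain quantities proportional to the variance along each component.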

z score

For a set of objects C^V⊂C, a mean vector $F^{μ} = [f_{u}^{μ} ∣ u ∊ {1, \dots, M}]$ and the corresponding standard deviation vector $F^{σ} = [f_{u}^{σ} ∣ u ∊ {1, \dots, M}]$ are obtained, where $f_{u}^{μ} = \frac{1}{∣ C^{V} ∣} \sum_{c ∊ C^{V}} f_{u} (c)$ and $f_{u}^{σ} = \sqrt{\frac{1}{∣ C^{V} ∣} \sum_{c ∊ C^{V}} {[f_{u} (c) - f_{u}^{μ}]}^{2}}$ . At each c∊C, the z score S^z(c) is defined as

S^{z} (c) = \frac{{‖ F (c) - F^{μ} ‖}_{2}}{{‖ F^{σ} ‖}_{2}},

(2.3)

where F(c) corresponds to the feature vector at each c and S^z(c) reflects the degree to which the value of an object deviates from the normal based on a statistical linear model. A predefined threshold θ_z is then used to classify each c∊C into one class or the other based on whether S^z(c)⩾θ_z.
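As a hedged illustration of Eq. 2.3, the following sketch computes z scores for a handful of synthetic spectra against a reference set C^V and applies a threshold. The function name `z_scores`, the synthetic data, and the value chosen for θ_z are assumptions made for the example only.

```python
import numpy as np

def z_scores(F, F_ref):
    """Eq. (2.3): S^z(c) = ||F(c) - F^mu||_2 / ||F^sigma||_2.

    F     : (n, M) spectra to score
    F_ref : (n_ref, M) spectra of the reference set C^V
    """
    mu = F_ref.mean(axis=0)        # F^mu
    sigma = F_ref.std(axis=0)      # F^sigma, population form (1 / |C^V|)
    return np.linalg.norm(F - mu, axis=1) / np.linalg.norm(sigma)

rng = np.random.default_rng(1)
F_ref = rng.normal(size=(200, 16))                        # synthetic reference spectra
F_test = np.vstack([rng.normal(size=(5, 16)),             # 5 spectra like the reference
                    rng.normal(loc=4.0, size=(5, 16))])   # 5 strongly shifted spectra
theta_z = 2.0                                             # hypothetical threshold
labels = z_scores(F_test, F_ref) >= theta_z               # True where S^z(c) >= theta_z
```

In this toy setting the shifted spectra score roughly four times higher than the reference-like ones, so the threshold separates the two groups.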

Nonlinear dimensionality reduction methods

In this work we consider two popular NLDR schemes, locally linear embedding (LLE) (Ref. ³⁰) and GE (Ref. ³²), for MRS analysis. We aim to demonstrate that the use of NLDR schemes for representation of MRS data results in superior discrimination between CaP and non-CaP spectra compared to the use of linear DR schemes such as PCA.

Graph embedding. The aim of graph embedding³² is to find an embedding vector S^GE(c_i), ∀c_i∊C, i∊{1,…,∣C∣}, such that the relative ordering of the distances between objects in high dimensional space is maximally preserved in the lower dimensional space. Thus, if locations c_i,c_j∊C, i,j∊{1,…,∣C∣}, are adjacent in the high dimensional feature space, then ‖S^GE(c_i)−S^GE(c_j)‖₂ should be small, where ‖⋅‖₂ represents the Euclidean norm. This will only be true if the distances between all c_i, c_j∊C are preserved in the low dimensional mapping of the data. To compute the optimal embedding, an adjacency matrix W_GE∊R^{∣C∣×∣C∣} is first defined as

W_{GE} (i, j) = e^{- {‖ F (c_{i}) - F (c_{j}) ‖}_{2}}, \forall c_{i}, c_{j} ∊ C, i, j ∊ {1, \dots, ∣ C ∣} .

(2.4)

S^GE(c_i) is then obtained via minimization of the following objective function:

E (X_{GE}) = 2 γ tr [\frac{X_{GE} (D - W_{G E}) X_{GE}^{T}}{X_{GE} D X_{GE}^{T}}],

(2.5)

where X_GE=[S^GE(c₁),S^GE(c₂),…,S^GE(c_n)], n=∣C∣, and γ=∣C∣−1. Additionally, D is a diagonal matrix where ∀c_i∊C, i∊{1,…,∣C∣}, the diagonal element is defined as D(i,i)=∑_jW_GE(i,j). The embedding space is defined by the eigenvectors corresponding to the smallest m eigenvalues of (D−W_GE)X_GE=λDX_GE. The matrix X_GE∊R^{∣C∣×m} of the first m eigenvectors is constructed, and ∀c_i∊C, S^GE(c_i) is defined as row i of X_GE.
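The generalized eigenvalue problem (D−W_GE)X_GE=λDX_GE can be sketched with NumPy alone by converting it to an ordinary symmetric problem. The sketch below is an illustration under assumptions rather than the implementation used in this work: the function name `graph_embed` is hypothetical, the data are synthetic, and the code skips the trivial constant eigenvector (eigenvalue 0), a choice the text does not state explicitly.

```python
import numpy as np

def graph_embed(F, m=2):
    """Graph embedding of the rows of F using the adjacency of Eq. (2.4)."""
    dist = np.linalg.norm(F[:, None, :] - F[None, :, :], axis=2)
    W = np.exp(-dist)                                   # Eq. (2.4)
    d = W.sum(axis=1)                                   # diagonal of D
    # (D - W) x = lambda D x is equivalent to L_sym y = lambda y, x = D^{-1/2} y.
    d_inv_sqrt = 1.0 / np.sqrt(d)
    L_sym = np.eye(len(F)) - d_inv_sqrt[:, None] * W * d_inv_sqrt[None, :]
    evals, evecs = np.linalg.eigh(L_sym)                # eigenvalues in ascending order
    X = d_inv_sqrt[:, None] * evecs                     # back to generalized eigenvectors
    # Assumption: drop the trivial constant eigenvector with eigenvalue 0.
    return X[:, 1:m + 1], evals[1:m + 1]

# Two well separated synthetic groups of 20 points each.
rng = np.random.default_rng(0)
F = np.vstack([rng.normal(0.0, 0.5, size=(20, 2)),
               rng.normal(0.0, 0.5, size=(20, 2)) + np.array([8.0, 0.0])])
S_ge, lam = graph_embed(F, m=2)
```

For two well separated groups, the first retained coordinate takes opposite signs on the two groups, which is what makes the embedding convenient for subsequent clustering.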

Locally linear embedding (LLE). LLE (Ref. ³⁰) operates by assuming that objects within a local neighborhood in a high dimensional feature space are linearly related. Consider the set of high dimensional feature vectors F={F(c₁),F(c₂),…,F(c_n)}, ∀c_i∊C, i∊{1,…,n}. LLE aims to map the set F to the corresponding set X_LLE={S^LLE(c₁),S^LLE(c₂),…,S^LLE(c_n)} of embedding coordinates. Let d⁽¹⁾,…,d^(K) be the K nearest neighbors of c_i and let η^K(c_i) be the indices of the location of the K-nearest neighbors (K-NN) of c_i∊C. The feature vector F(c_i) and its K-NN’s {F(d⁽¹⁾),F(d⁽²⁾),…,F(d^(K))} are assumed to lie on a patch of the manifold that is locally linear, allowing us to use the Euclidean metric to determine distance between neighbors. Each F(c_i) can then be approximated by a weighted sum of its K-NN. The optimal reconstruction weights are given by the sparse matrix W_LLE (subject to the constraint ∑_jW_LLE(i,j)=1) that minimizes

E_{1} (W_{LLE}) = \sum_{i = 1}^{n} {‖ F (c_{i}) - \sum_{r = 1}^{K} W_{LLE} (i, η^{r} (c_{i})) F (d^{(r)}) ‖}_{2}^{2} .

(2.6)

Having determined the weighting matrix W_LLE, the next step is to find a low dimensional representation of the points in F that preserves this weighting. Thus, for each F(c_i) approximated as the weighted combination of its K-NN, its projection S^LLE(c_i) will be the weighted combination of the projections of these same K-NN. The optimal X_LLE in the least squares sense minimizes

E_{2} (X_{LLE}) = \sum_{i = 1}^{n} {‖ S^{LLE} (c_{i}) - \sum_{j = 1}^{n} W_{LLE} (i, j) S^{LLE} (c_{j}) ‖}_{2}^{2} = tr (X_{LLE} L X_{LLE}^{T}),

(2.7)

where X_LLE=[S^LLE(c₁),S^LLE(c₂),…,S^LLE(c_n)], $L = {(I - W_{LLE})}^{T} (I - W_{LLE})$ , and I is the identity matrix. The minimization of Eq. 2.7 subject to the constraint $X_{LLE} X_{LLE}^{T} = I$ (a normalization constraint that prevents the solution X_LLE=0) is an eigenvalue problem whose solutions are the eigenvectors of the Laplacian matrix L. Since the rank of L is n−1, the first eigenvector is ignored and the eigenvector with the second smallest eigenvalue represents the best one-dimensional projection of all the samples. The best two-dimensional projection is given by the eigenvectors with the second and third smallest eigenvalues, and so forth.
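The two LLE steps (reconstruction weights, then embedding) can be sketched as follows. This is a minimal NumPy illustration, not the implementation used in this work: the function name `lle_embed`, the regularization term `reg` (added so the local Gram matrix can be inverted when K exceeds the feature dimension), and the synthetic data are assumptions. With the weights for c_i stored in row i of W, the quadratic form is built as (I−W)ᵀ(I−W).

```python
import numpy as np

def lle_embed(F, K=8, m=2, reg=1e-3):
    """Locally linear embedding of the rows of F (n x M) into m dimensions."""
    n = len(F)
    dist = np.linalg.norm(F[:, None, :] - F[None, :, :], axis=2)
    nbrs = np.argsort(dist, axis=1)[:, 1:K + 1]       # K nearest neighbours, self excluded
    W = np.zeros((n, n))
    for i in range(n):
        Z = F[nbrs[i]] - F[i]                         # neighbours centred on F(c_i)
        G = Z @ Z.T                                   # local Gram matrix (K x K)
        G += reg * np.trace(G) * np.eye(K)            # assumed regularisation for stability
        w = np.linalg.solve(G, np.ones(K))
        W[i, nbrs[i]] = w / w.sum()                   # enforce sum_j W(i, j) = 1
    A = np.eye(n) - W
    L = A.T @ A                                       # quadratic form of the embedding cost
    evals, evecs = np.linalg.eigh(L)                  # eigenvalues in ascending order
    return evecs[:, 1:m + 1], W                       # skip the first (constant) eigenvector

rng = np.random.default_rng(2)
F = rng.normal(size=(60, 5))                          # synthetic feature vectors
S_lle, W_lle = lle_embed(F, K=8, m=2)
```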

Replicated k-means clustering in the reduced feature space

For the DR schemes, ϕ∊{PCA,LLE,GE}, unsupervised replicated clustering is used to classify all objects c∊C into one of the k classes based on S^ϕ(c), the low dimensional representation of F(c). Replicated clustering is a variant of the popular k-means³⁴ clustering scheme. The k-means algorithm is initialized by randomly partitioning the data into k clusters and computing the cluster center for each partition. The distance of each point from each of the k centroids is computed and each object is reassigned to the closest cluster centroid to minimize the intraclass variance. This random initialization may lead to local minima leading, in turn, to different clustering results. The motivation behind replicated clustering is to make the final aggregation results from k-means more deterministic. In replicated clustering, multiple weak clusterings of the data are generated. The optimal clustering solution is then chosen among the various weak clusterings as the one with the least intracluster variance. Below we briefly describe the various steps involved in this algorithm.

Step 1. At each of T iterations, k-means is applied to cluster all objects c∊C into one of the k classes $V_{t}^{1}, V_{t}^{2}, \dots, V_{t}^{k}$ , t∊{1,…,T}, where each c∊C is characterized by a high dimensional feature vector F(c). For each $c ∊ V_{t}^{q}$ , q∊{1,2,…,k}, t∊{1,2,…,T} the centroid of each cluster is determined as

F_{t}^{q} = \frac{1}{∣ V_{t}^{q} ∣} \sum_{c ∊ V_{t}^{q}} F (c) .

(2.8)

Step 2. At each iteration t∊{1,…,T}, the average Euclidean distance between each $F (c) ∊ V_{t}^{q}$ and corresponding cluster center $F_{t}^{q}$ , t∊{1,…,T}, q∊{1,2,…,k}, is then determined as

d_{t}^{q} = \frac{1}{∣ V_{t}^{q} ∣} \sum_{c ∊ V_{t}^{q}} ‖ F (c) - F_{t}^{q} ‖ .

(2.9)

The average intracluster distance over all k clusters is then obtained as

μ_{t}^{d} = \frac{1}{k} \sum_{q} d_{t}^{q} .

(2.10)

Step 3. Finally, the clusterings ${\hat{V}}^{q}$ , q∊{1,2,…,k} within a specific iteration t∊{1,…,T} are identified as the stable clustering result for which $μ_{t}^{d}$ is minimum over all t.

Note that replicated clustering identifies stable clusterings as those that minimize intraclass variance. Note further that while we are not explicitly seeking to increase intercluster distance, our empirical results suggest that replicated k-means clustering tends to also push the cluster centers farther apart.
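Steps 1 to 3 can be sketched as repeated runs of a plain k-means routine followed by selection of the run with the smallest average intracluster distance. The code below is a hedged illustration, not the implementation used in this work: the function names, the reseeding of empty clusters, the rejection of degenerate clusterings, and the synthetic data are assumptions.

```python
import numpy as np

def kmeans(X, k, rng, n_iter=100):
    """Lloyd's k-means started from a random partition of the rows of X."""
    labels = rng.permutation(np.arange(len(X)) % k)        # random, non-empty partition
    for _ in range(n_iter):
        centers = np.stack([
            X[labels == q].mean(axis=0) if np.any(labels == q)
            else X[rng.integers(len(X))]                   # reseed an empty cluster
            for q in range(k)
        ])
        dists = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        new_labels = np.argmin(dists, axis=1)
        if np.array_equal(new_labels, labels):
            break
        labels = new_labels
    return labels

def mean_intracluster_distance(X, labels, k):
    """Eqs. (2.8)-(2.10): average over clusters of the mean distance to the centroid."""
    d = []
    for q in range(k):
        V = X[labels == q]
        if len(V) == 0:
            return np.inf                                  # reject degenerate clusterings
        d.append(np.linalg.norm(V - V.mean(axis=0), axis=1).mean())
    return float(np.mean(d))

def replicated_kmeans(X, k, T=25, seed=0):
    """Run k-means T times and keep the clustering with minimum mu_t^d (step 3)."""
    rng = np.random.default_rng(seed)
    best_mu, best_labels = np.inf, None
    for _ in range(T):
        labels = kmeans(X, k, rng)
        mu = mean_intracluster_distance(X, labels, k)
        if mu < best_mu:
            best_mu, best_labels = mu, labels
    return best_labels, best_mu

# Three tight synthetic groups of 30 points each.
rng = np.random.default_rng(1)
group_centers = np.array([[0.0, 0.0], [10.0, 0.0], [0.0, 10.0]])
X = np.vstack([c + 0.5 * rng.normal(size=(30, 2)) for c in group_centers])
labels, mu = replicated_kmeans(X, k=3)
```

Because each run starts from a different random partition, individual runs may stop in a local minimum; keeping the run with the smallest average intracluster distance is what makes the final result more stable across executions.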

METHODOLOGY

In Sec. 3A we provide a brief description of the notation employed in this paper. In Sec. 3B we provide a brief description of the data sets considered in this study. Details concerning determination of the approximate ground truth for spatial extent of CaP on MRI are provided in Sec. 3C. Methodological details regarding the two-stage CAD system for CaP detection on MRS are detailed in Secs. 3D, 3E, respectively.

Notation

We represent the 3D prostate T2-w MRI scene by $\mathcal{G}=(G,f)$, where G is a 3D grid of voxels g∊G and f(g) is a function that assigns an intensity value to every g∊G. We also define a spectral scene $\mathcal{C}=(C,F)$, where C is a 3D grid of metavoxels, c∊C. Each metavoxel c is associated with a corresponding M-dimensional spectral vector F(c)=[f_u(c)∣u∊{1,…,M}], where f_u(c) represents the MRS signal intensity at each c along the frequency domain. Figure 3 shows the spatial relationship between the MR spectral metavoxel c∊C and T2-w MRI voxel g∊G. Note that the distance between any two adjacent metavoxels c_i,c_j∊C, ‖c_i−c_j‖₂, where ‖⋅‖₂ denotes the L₂ norm, i,j∊{1,…,∣C∣}, and ∣C∣ is the cardinality of C, is roughly 13 times the distance between any two adjacent MRI voxels g_i,g_j∊G, where i,j∊{1,…,∣G∣}. A list of commonly used notations and symbols in this paper is given in Table 1.

Figure 3. Illustration of the spatial relationship between MRS metavoxels c∊C and T2-w MRI voxels g∊G. The spectral grid C comprising 28 metavoxels has been overlaid on a T2-w MRI prostate slice and is shown in white. Note that the region outlined in red on C corresponds to the area occupied by a metavoxel, but may contain multiple MRI voxels (highlighted in red).

Table 1.

List of commonly used notation and symbols in this paper.

Symbol	Description
$\mathcal{G}$	3D MRI scene
G	3D grid of MRI voxels
g	Voxel location in G,g∊G
f_u(c)	MR signal intensity at c
f(g)	MR intensity value at g
M	Number of original high dimensions
ϕ	DR method, ϕ∊{PCA,LLE,GE}
i,j	Two adjacent spatial locations on an image
R	Maximum CaP diameter
C_p	Potential cancer space, C_p⊂C
${\hat{V}}^{ϕ, q}$	Stable clusters obtained for q∊{1,2,…,k}
ΔX,ΔY	Size of metavoxel in X and Y dimensions
α	Threshold parameter for z score, α∊[0,1]
C_s	Set of precise spatial locations of CaP
C_a,o	Spatial locations of C_a outside C_P
$S N (C_{a}^{C a P})$	Sensitivity value associated with CaP class
W_ϕ	Distance matrix for ϕ∊{GE,LLE,PCA}
κ	Locally linear parameter for ϕ∊{LLE}
$\mathcal{C}$	3D MR spectral scene
C	3D grid of metavoxels
c	A metavoxel in C,c∊C
u	Frequency index
F(c)	Vector of spectral content at c
m	Number of reduced dimensions, m⪡M
S^ϕ(c)	Low dimensional embedding vector at c
S^z(c)	z scores at each $c ∊ {\tilde{C}}_{T}$
K_s	Contiguous slices with CaP presence
N_g	Number of CaP voxels in C_P,N_g∊C_P
V_T	Prostate grid
Θ	Threshold for prostate size
θ_z	z score threshold
$C_{a}^{q}$	Set of spatial locations for clusters ${\hat{V}}_{q}, q ∊ {1, 2, \dots, k}$
C_a,i	Spatial locations of C_a inside C_P
$S P (C_{a}^{C a P})$	Specificity value associated with CaP class
η	Confidence estimate associated with SN and SP
u	Dimension parameter for ϕ∊{LLE,GE,PCA}

MRSI data description

A total of 18 deidentified and anonymized prostate cancer MRI∕MRS datasets from the ACRIN trial were chosen randomly for this pilot study. All exams were performed on a 1.5 T magnet (GE Medical Systems, Milwaukee, WI). Oblique axial T2-w images (TR 4000–6000 ms, TE 90–120 ms, slice thickness of 3 mm, acquisition matrix of 256×192, and FOV of 12–14 cm) were first obtained. Using the PROSE (prostate spectroscopy and imaging examination) software package (www.gehealthcare.com/usen/mr/applications/products/prose.html), MRSI (TR 1000 ms, TE 130 ms, spectral width of 1000 Hz, and 512 points) was then prescribed from the oblique axial T2-w images, with outer voxel suppression and oblique suppression planes manually set to exclude the majority of fat about the prostate capsule. A FOV of 11 cm was used for the 16×16 MRSI grid (voxel dimension of 6.875×6.875×3 mm³). MRS was obtained using a PRESS (point resolved spectroscopy) sequence.³⁵ The MRS spectral grid was contained in DICOM image sets, from which the 16×16 grid containing 256 complex spectra per slice was obtained using IDL 6.4 (ITT Visual Information Solutions).
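To connect the acquisition description with the notation F(c), the following sketch arranges a 16×16 grid of 512-point complex spectra into a ∣C∣×M feature matrix. It is illustrative only: the number of slices is a placeholder, the random array stands in for data read from the DICOM sets, and the use of magnitude spectra is an assumption not stated in the text.

```python
import numpy as np

n_slices, rows, cols, M = 4, 16, 16, 512       # n_slices is a placeholder value
rng = np.random.default_rng(0)

# Synthetic stand-in for the complex spectra of each metavoxel.
grid = (rng.normal(size=(n_slices, rows, cols, M))
        + 1j * rng.normal(size=(n_slices, rows, cols, M)))

# One row per metavoxel c, one column per frequency index u.
F = np.abs(grid).reshape(-1, M)                # magnitude spectra (an assumption)

# (slice, row, col) position of each row of F, for mapping results back to the grid.
coords = np.stack(np.unravel_index(np.arange(F.shape[0]),
                                   (n_slices, rows, cols)), axis=1)
```

Keeping `coords` alongside `F` allows cluster labels computed on the rows of `F` to be displayed back on the spectral grid.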

Determining approximate ground truth for spatial extent of CaP on MRI-MRS

Following radical prostatectomy, the gland was fixed in formalin and sectioned per institutional routine (whole mount or standard sections). Sections were then embedded in paraffin, and duplicate slides for each block were prepared for central review. Slides were stained with hematoxylin and eosin and reviewed by a single central pathologist for areas of cancer. MRI-pathology correlation was established through a joint review session of trial imagers and pathologists, who determined slice-by-slice histology-MRI concordance. In order to ensure uniform spectral and histologic sextant assignment, sextant boundaries (i.e., between apex and midgland, and between midgland and base) were based on the MRI slice assignments that had previously been determined by each site radiologist. Using these sextant boundaries, and the best approximation of MRI-histologic concordance, the presence and the diameter of CaP in each sextant were established, with maximum tumor diameter, denoted as R, recorded for all positive sextants.

Potential cancer space

In order to quantitatively evaluate CAD performance in terms of performance metrics such as sensitivity and specificity, the precise spatial location of the target class within C is required. Unfortunately, this information was not readily available for the studies considered in this project. Hence, we define a probabilistic ground truth for CaP that involves first defining a potential cancer space representing a spatial region (C_P) within C within which the tumor is embedded. To appreciate the need for C_P, let us assume the ideal case scenario (Fig. 4) where the precise spatial location of CaP is known a priori and is denoted by the set C_s. If C_a denotes the set of spatial locations corresponding to CaP identified by the CAD system, the true positive (TP) area could be calculated as ∣C_s∩C_a∣. Similarly, the false positive (FP) area for the CAD system is ∣C_a−(C_s∩C_a)∣ and false negative (FN) area is ∣C_s−(C_s∩C_a)∣. For the problem we are considering, C_s is unavailable and hence the need for C_P which is defined as the set of spatial locations within which a total number of N_g CaP locations are contained. The true number of metavoxels c within C_P that represent CaP can be calculated as

N_{g} = K_{s} \times ⌈ \frac{(R^{2})}{Δ X Δ Y} ⌉,

(3.1)

where K_s represents the number of contiguous MR sections containing CaP, ⌈ ⌉ refers to the ceiling operation, and ΔX and ΔY refer to the size of the metavoxel c in the X and Y dimensions. Consider an MRS scene C with known cancer in the left midgland (LM), the prostate being contained in a 3×6 grid and the prostate midgland region extending over two contiguous slices. The 3×6 prostate grid is divided into two equal right and left halves, so the potential cancer space contains ∣C_P∣=18 metavoxels (3×3×2). Given that the tumor has a maximum diameter of R=13.75 mm in LM, with ΔX=ΔY=6.875 mm, Eq. 3.1 gives N_g=2×⌈13.75²∕6.875²⌉=2×4=8 metavoxels corresponding to CaP within C_P. Both C_P and N_g are integral to defining probabilistic estimates of CAD sensitivity, specificity, and positive predictive value (PPV), details of which are provided in Sec. 4.
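The worked example for Eq. 3.1, together with the set definitions of TP, FP, and FN for the ideal case in which C_s is known, can be checked with a short script. The function name `n_cap_metavoxels` and the toy location sets are hypothetical and serve only to illustrate the arithmetic.

```python
import math

def n_cap_metavoxels(K_s, R, dX, dY):
    """Eq. (3.1): N_g = K_s * ceil(R^2 / (dX * dY)), with lengths in mm."""
    return K_s * math.ceil(R ** 2 / (dX * dY))

# Worked example from the text: left midgland, two slices, R = 13.75 mm.
N_g = n_cap_metavoxels(K_s=2, R=13.75, dX=6.875, dY=6.875)
size_C_P = 3 * 3 * 2              # half of the 3x6 grid over two slices

# Ideal case with known C_s: toy (row, col) locations, purely illustrative.
C_s = {(0, 0), (0, 1), (1, 0), (1, 1)}      # true CaP locations
C_a = {(0, 1), (1, 1), (2, 1)}              # locations flagged by the CAD system
TP = len(C_s & C_a)                         # |C_s ∩ C_a|
FP = len(C_a - (C_s & C_a))                 # |C_a − (C_s ∩ C_a)|
FN = len(C_s - (C_s & C_a))                 # |C_s − (C_s ∩ C_a)|
```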

An illustration of the precise ground truth location (C_s) on the prostate and spatial location of the class (C_a) identified as CaP by a CAD system. Note that in this case sensitivity, specificity of CaP detection via CAD can be determined precisely since C_s is known exactly. C_P represents the potential cancer space that needs to be defined when C_s is not available and contains within it N_g CaP metavoxels.

Expert annotations

We also endeavored to obtain a more precise estimate of CaP ground truth for a subset of the studies via expert annotation of individual MR spectra. For 6 of the 18 ACRIN studies, an expert radiologist laboriously annotated each individual spectrum with a label from 1 to 5, with labels 1 and 2 denoting spectra that appeared to be normal and labels 3–5 denoting suspicious appearing spectra. For this expert annotated dataset, our CAD system attempted to classify all spectra as being either normal or suspicious appearing (i.e., to distinguish between spectra labeled as 1∕2 and those labeled as 3∕4∕5).

Stage 1: Localization of prostate using hierarchical spectral clustering of MRS

For the FOV of the studies considered in this work, a majority of the spectra lay outside the prostate. The motivation behind stage 1 of our algorithm is to exploit the differences in spectral characteristics of spectra inside and outside the prostate to home in on the prostate ROI automatically. This is done by identifying and eliminating the dominant cluster at every iteration until a cluster size threshold is attained. The nonlinear DR scheme graph embedding³² is employed to project all spectra into a reduced dimensional embedding S^GE(c), followed by replicated k-means clustering to aggregate all c∊C into two clusters ${\hat{V}}^{1}$ , ${\hat{V}}^{2}$ corresponding to prostatic and extracapsular spectral classes. At each iteration t∊{1,…,T}, a subset of voxels ${\tilde{C}}_{t}$ in C is obtained by eliminating the noninformative extracapsular spectra identified as the dominant cluster $({\hat{V}}^{dom})$ . The approximate number of prostate spectra (Θ) of an MRS grid is learned during the offline training phase. The automatic cascaded scheme stops when the number of remaining spectra in the MRS grid is approximately equal to Θ. The result of the HierarclustMRS algorithm is a spectral grid $({\tilde{C}}_{T})$ containing all the prostate spectra.

Algorithm HierarclustMRS

Input: F(c) for all c∊C, Θ, C.

Output: ${\tilde{C}}_{T}$

begin

0. Initialize ${\tilde{C}}_{0} = C$ , t=0;

1. while $∣ {\tilde{C}}_{t} ∣ > Θ$

2. Apply Graph Embedding (Ref. ³²) to F(c), for all $c ∊ {\tilde{C}}_{t}$ , to obtain $S_{t}^{GE} (c)$ ;

3. Apply replicated k-means clustering on $S_{t}^{GE} (c)$ to obtain two stable clusters ${\hat{V}}_{t}^{1}, {\hat{V}}_{t}^{2}$ ;

4. Identify larger cluster ${\hat{V}}_{t}^{dom} = \arg \max_{w} [{\hat{V}}_{t}^{w}]$ , where w∊{1,2};

5. Create set ${\tilde{C}}_{t + 1} \subset {\tilde{C}}_{t}$ by eliminating all $c ∊ {\hat{V}}_{t}^{dom}$ from ${\tilde{C}}_{t}$ ;

6. t=t+1;

7. endwhile;

8. ${\tilde{C}}_{T} = {\tilde{C}}_{t}$ ;

end


Note that in general the algorithm is terminated when the total number of spectra is marginally greater than (or equal to) Θ, which usually occurs within two to three iterations.
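The control flow of the cascade can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the function `hierarclust_mrs`, the `split` callback, the `max_iter` guard, and the toy splitter `mean_threshold_split` are all hypothetical names. In particular, the toy splitter stands in for graph embedding followed by replicated k-means, which are not reproduced here.

```python
from typing import Callable, Dict, Hashable, List

Spectra = Dict[Hashable, List[float]]                    # voxel id -> feature vector F(c)
TwoWaySplit = Callable[[Spectra], Dict[Hashable, int]]   # voxel id -> cluster label 0 or 1


def hierarclust_mrs(spectra: Spectra, theta: int, split: TwoWaySplit,
                    max_iter: int = 10) -> Spectra:
    """Drop the larger of two clusters until at most theta spectra remain."""
    remaining = dict(spectra)                      # initial grid: all of C
    for _ in range(max_iter):                      # max_iter is a safety guard (assumption)
        if len(remaining) <= theta:                # loop condition: more than theta spectra left
            break
        labels = split(remaining)                  # embedding + two-way clustering (caller supplied)
        sizes = {0: 0, 1: 0}
        for voxel in remaining:
            sizes[labels[voxel]] += 1
        dominant = 0 if sizes[0] >= sizes[1] else 1    # larger cluster is treated as extracapsular
        pruned = {v: f for v, f in remaining.items() if labels[v] != dominant}
        if not pruned:                             # degenerate split: keep the current grid
            break
        remaining = pruned
    return remaining


def mean_threshold_split(spectra: Spectra) -> Dict[Hashable, int]:
    """Toy stand-in for the clustering step: threshold the first feature at its mean."""
    mean = sum(f[0] for f in spectra.values()) / len(spectra)
    return {v: int(f[0] > mean) for v, f in spectra.items()}


# Toy grid: 12 background voxels, 4 intermediate voxels, 3 "prostate" voxels.
toy = {i: [0.0] for i in range(12)}
toy.update({i: [5.0] for i in range(12, 16)})
toy.update({i: [10.0] for i in range(16, 19)})

roi = hierarclust_mrs(toy, theta=3, split=mean_threshold_split)
print(sorted(roi))  # prints "[16, 17, 18]"
```

On the toy data the cascade removes the 12 background voxels in the first pass and the 4 intermediate voxels in the second, then stops because the grid size has reached Θ.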

Stage 2: CaP identification on MRS via NLDR

Following stage 1, we apply more sophisticated analysis to the spectra in ${\tilde{C}}_{T}$ in order to discriminate between different tissue classes in the prostate. In addition to the NLDR schemes considered (LLE and graph embedding), two other feature extraction schemes (z score and PCA) were evaluated in terms of their ability to discriminate between different classes of prostate spectra (abnormal appearing, normal appearing, and indeterminate). Following feature extraction, replicated k-means clustering for the DR schemes (LLE, graph embedding, and PCA) and thresholding for z score were applied to obtain a hard classification of the spectra into CaP and non-CaP categories. For z score evaluation, a predefined threshold θ^z is used to classify each $c ∊ {\tilde{C}}_{T}$ as abnormal or normal appearing based on whether S^z(c)>θ^z or S^z(c)<θ^z. The three DR methods (explained previously in Sec. 3) are applied to the MR spectra in ${\tilde{C}}_{T}$ so that for any $c ∊ {\tilde{C}}_{T}$ , the high dimensional ambient feature vector F(c) is mapped to S^ϕ(c), where ϕ∊{PCA,LLE,GE}. Replicated clustering is then employed to assign each $S^{ϕ} (c), \forall c ∊ {\tilde{C}}_{T}$ , to one of three possible classes, ${\hat{V}}^{ϕ, 1}, {\hat{V}}^{ϕ, 2}, {\hat{V}}^{ϕ, 3}$ , corresponding to abnormal appearing, normal appearing, or indeterminate spectra, with the indeterminate class comprising undefined, equivocal, or noisy spectra. For the manually annotated datasets, replicated clustering was used to identify all analyzable spectra as being either normal or suspicious appearing (two clusters).

EVALUATION METHODS

Identification of cancer cluster

As mentioned previously in Sec. 3C1, a potential CaP space C_P is defined within which the number of CaP locations N_g is determined. Following replicated clustering of S^ϕ(c), $c ∊ {\tilde{C}}_{T}$ , we need to identify which of the k clusters ${\hat{V}}^{1}, {\hat{V}}^{2}, \dots, {\hat{V}}^{k} \subset {\tilde{C}}_{T}$ corresponds to the CaP class. We represent the corresponding sets of spatial locations for clusters ${\hat{V}}^{1}, {\hat{V}}^{2}, \dots, {\hat{V}}^{k}$ as $C_{a}^{1}, C_{a}^{2}, \dots, C_{a}^{k}$ , respectively. With respect to C_P (as illustrated in Fig. 5), some part of $C_{a}^{q}$ may lie within C_P $(C_{a, i}^{q}, q ∊ \{1, 2, \dots, k\})$ and some part outside C_P $(C_{a, o}^{q}, q ∊ \{1, 2, \dots, k\})$ . Thus $C_{a}^{q} = C_{a, i}^{q} \cup C_{a, o}^{q}$ , q∊{1,2,…,k}. The TP, FP, true negative (TN), and FN counts associated with each class $C_{a}^{q}$ , q∊{1,2,…,k}, with respect to C_P and N_g are then obtained. The following heuristic algorithm is then used to identify which of $C_{a}^{q}$ , q∊{1,2,…,k}, represents the CaP class $(C_{a}^{CaP})$ .

Algorithm IdentifyCaPCluster

Input: $C_{a}^{q}$ , q∊{1,2,…,k}, C_P, N_g, C.

Output: $C_{a}^{CaP}$

begin

0. for q=1 to k do

1. $C_{a, i}^{q} = C_{a}^{q} \cap C_{P}$ ;

2. $C_{a, o}^{q} = C_{a}^{q} - C_{a, i}^{q}$ ;

3. if $∣ C_{a, i}^{q} ∣ ⩾ N_{g}$ then TP^q=N_g, $FP^{q} = ∣ C_{a}^{q} ∣ - N_{g}$ , FN^q=0, $TN^{q} = ∣ C - C_{a}^{q} ∣$ ;

4. else $TP^{q} = ∣ C_{a, i}^{q} ∣$ , $FP^{q} = ∣ C_{a, o}^{q} ∣$ , $FN^{q} = N_{g} - ∣ C_{a, i}^{q} ∣$ , $TN^{q} = ∣ C - C_{a, o}^{q} ∣ - N_{g}$ ;

5. endif;

6. endfor;

7. $q^{*} = \arg \max_{q} [\frac{TP^{q}}{FP^{q}}]$ ;

8. $C_{a}^{CaP} = C_{a}^{q^{*}}$ ;

end


Illustration of the potential ground truth space (C_P) containing N_g metavoxels corresponding to CaP. C_a represents the CaP segmentation obtained by the MRS CAD analysis scheme and C_a,o and C_a,i represent those regions of C_a that lie outside and within C_P respectively.

Thus Algorithm IdentifyCaPCluster identifies the CaP cluster as the one with the largest ratio of true positives to false positives, i.e., the cluster that maximizes the true positive fraction while simultaneously minimizing the false positive fraction.
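The counting rules of IdentifyCaPCluster translate directly into set operations. The sketch below is illustrative: the function names and the toy sets are hypothetical, and the treatment of a cluster with zero false positives (ratio taken as infinite when TP is positive, zero otherwise) is an assumption, since the algorithm does not say how to handle division by zero.

```python
from typing import Dict, Hashable, Set, Tuple


def cluster_counts(c_a_q: Set[Hashable], c_p: Set[Hashable], n_g: int,
                   c_all: Set[Hashable]) -> Tuple[int, int, int, int]:
    """Return (TP, FP, FN, TN) for one cluster relative to C_P and N_g."""
    inside = c_a_q & c_p              # part of the cluster within C_P
    outside = c_a_q - inside          # part of the cluster outside C_P
    if len(inside) >= n_g:
        tp = n_g
        fp = len(c_a_q) - n_g
        fn = 0
        tn = len(c_all - c_a_q)
    else:
        tp = len(inside)
        fp = len(outside)
        fn = n_g - len(inside)
        tn = len(c_all - outside) - n_g
    return tp, fp, fn, tn


def identify_cap_cluster(clusters: Dict[Hashable, Set[Hashable]], c_p: Set[Hashable],
                         n_g: int, c_all: Set[Hashable]) -> Hashable:
    """Pick the cluster index q that maximizes TP^q / FP^q."""
    def ratio(q: Hashable) -> float:
        tp, fp, _, _ = cluster_counts(clusters[q], c_p, n_g, c_all)
        if fp == 0:                   # assumption: no false positives is the best case
            return float("inf") if tp > 0 else 0.0
        return tp / fp
    return max(clusters, key=ratio)


# Hypothetical toy scene: 20 locations, C_P covers 6 of them, N_g = 3.
c_all = set(range(20))
c_p = set(range(6))
clusters = {
    1: {0, 1, 10, 11, 12, 13},
    2: {2, 3, 4, 14},
    3: {5, 6, 7, 8, 9, 15, 16, 17, 18, 19},
}
best = identify_cap_cluster(clusters, c_p, n_g=3, c_all=c_all)
print(best, cluster_counts(clusters[best], c_p, 3, c_all))  # prints "2 (3, 1, 0, 16)"
```

In the toy scene cluster 2 wins with TP∕FP = 3, against 0.5 for cluster 1 and roughly 0.11 for cluster 3.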

Performance evaluation metrics

Having identified $C_{a}^{CaP}$ , the corresponding CaP detection sensitivity and specificity values are determined as

S N (C_{a}^{CaP}) = \frac{{TP}_{a}}{{TP}_{a} + {FN}_{a}} \times 100,

(4.1)

S P (C_{a}^{CaP}) = \frac{{TN}_{a}}{{TN}_{a} + {FP}_{a}} \times 100 .

(4.2)

Given that $S N (C_{a}^{CaP})$ and $S P (C_{a}^{CaP})$ are estimates of the sensitivity and specificity (given the probabilistic estimates of the CaP ground truth extent), we also compute a confidence estimate (η) associated with the performance measures as a function of ∣C_P∣ and N_g. η is determined as

η = \frac{N_{g}}{∣ C_{P} ∣} \times 100 .

(4.3)

Thus for each study for which we only have C_P and N_g available as surrogates of CaP ground truth, we report η along with $S N (C_{a}^{CaP})$ and $S P (C_{a}^{CaP})$ for those studies.
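Equations (4.1)–(4.3) reduce to three one-line functions. The sketch below uses hypothetical counts for sensitivity and specificity; only the confidence example (N_g = 8, ∣C_P∣ = 18) is taken from the left midgland case described in Sec. 3.

```python
def sensitivity(tp: int, fn: int) -> float:
    """Eq. (4.1): SN = TP / (TP + FN) * 100."""
    return 100.0 * tp / (tp + fn)


def specificity(tn: int, fp: int) -> float:
    """Eq. (4.2): SP = TN / (TN + FP) * 100."""
    return 100.0 * tn / (tn + fp)


def confidence(n_g: int, c_p_size: int) -> float:
    """Eq. (4.3): eta = N_g / |C_P| * 100."""
    return 100.0 * n_g / c_p_size


# Hypothetical counts, for illustration only.
print(sensitivity(tp=6, fn=2))    # prints 75.0
print(specificity(tn=40, fp=10))  # prints 80.0

# Left midgland example from the text: N_g = 8 within |C_P| = 18.
print(round(confidence(8, 18), 2))  # prints 44.44
```

A value of η near 100% means that C_P is almost entirely tumor, so the estimated sensitivity and specificity approach what would be measured against exact ground truth.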

RESULTS

Qualitative results

Stage 1: Qualitative evaluation of the hierarchical clustering scheme

Figure 6 shows the qualitative results of the hierarchical cascade scheme for distinguishing prostatic from extracapsular spectra. Figure 6a represents spatial maps of the spectral grid ${\tilde{C}}_{0}$ (16×16 spectral voxels) superimposed on the corresponding T2-w MRI scene for one patient study. Every $c ∊ {\tilde{C}}_{0}$ in Fig. 6a is assigned one of two colors (blue and red) corresponding to spectra identified by the algorithm as prostatic or extracapsular. Note that the dominant cluster [spatial locations in red in Fig. 6a] has been eliminated in the second iteration [ ${\tilde{C}}_{1}$ (16×8 spectral voxels)] [Fig. 6b]. The final spectral grid [ ${\tilde{C}}_{2}$ in Fig. 6c] is obtained after elimination of extracapsular spectra (red locations) during the third iteration of the cascade. Figures 6d, 6e, 6f represent the embedding plots [where each original spectral vector F(c), c∊C is plotted in 3D eigenvector space using the three dominant embedding values as coordinates] from ${\tilde{C}}_{0}$ (16×16 spectral voxels) (d) to ${\tilde{C}}_{2}$ (7×4 spectral voxels) (f) for one study at three different levels of the cascade. Note that at the end of the third iteration, the prostate ROI has been accurately identified and the spectral grid accurately overlaid on the prostate. Further note that in Figs. 6a, 6b, 6c, the spectral grid with the pronounced boundary indicates the ROI during the current iteration.

Spectral grids for a single 2D slice of a T2-w MRI scene for a patient at (a) the first cascade level ${\tilde{C}}_{0}$ , (b) second cascade level ${\tilde{C}}_{1}$ , and (c) third cascade level ${\tilde{C}}_{2}$ . Note that the size of the grid reduces from 16×16 metavoxels in (a) to 7×4 in (c) by elimination of extracapsular spectra in the dominant cluster (red). The corresponding clustered embedding plots at each of the cascade levels are also shown in (d)–(f), corresponding, in turn, to the metavoxel grids shown in (a)–(c).

Stage 2: Evaluation of feature extraction schemes for CaP detection

The identification of the prostate grid in stage 1 makes it possible to resolve the three MR spectral classes (abnormal appearing, normal appearing, and indeterminate). The differences between these three spectral classes within the prostate spectra cluster at the higher levels in the cascade $({\tilde{C}}_{1}, {\tilde{C}}_{2}, {\tilde{C}}_{3})$ become discriminable only after the removal of extracapsular spectra. Note that replicated clustering was used to identify only two clusters (suspicious and normal appearing spectra) for the six manually annotated studies. Figure 7 shows the qualitative results of the four feature extraction schemes employed in this work for CaP detection for two different patient studies, each row in Fig. 7 corresponding to a different study. The three colors assigned to the spectral voxels in Fig. 7 correspond to the three clusters obtained via replicated clustering on the reduced dimensional spectra S^ϕ(c), ϕ∊{PCA,LLE,GE} for c∊C. For the z score scheme, each metavoxel was classified as belonging to one of the two classes [red and blue in Figs. 7a, 7d], reflecting abnormal and normal appearing spectra. The white box superposed in Figs. 7a, 7b, 7c, 7d, 7e, 7f shows the potential cancer space (C_P) for corresponding slices. In each of Figs. 7a, 7b, 7c, 7d, 7e, 7f, the red cluster was identified as comprising abnormal appearing spectra using the IdentifyCaPCluster algorithm (Sec. 4), following feature extraction and replicated clustering. Figures 7a, 7d show the results obtained via z score, while Figs. 7b, 7e show the results of PCA in identifying abnormal appearing, normal appearing, and indeterminate spectra. Figures 7c, 7f show the corresponding results for graph embedding and LLE, respectively. For the first study (first row in Fig. 7), graph embedding [Fig. 7c] appears to yield near perfect CaP detection in terms of sensitivity and specificity as only CaP voxels are identified within the white cancer grid (C_P).
The corresponding results for z score [Fig. 7a] and PCA [Fig. 7b] both yield poor detection sensitivity and specificity. Similarly, for the second study shown in Fig. 7, LLE [Fig. 7f] appears to yield higher CaP detection sensitivity and specificity compared to z score [Fig. 7d], and PCA [Fig. 7e]. Figure 8 shows an example of the MR spectral grid with classification labels obtained from graph embedding and replicated clustering plotted back on the individual spectra; the spectra in red corresponding to those identified as abnormal appearing, blue spectra corresponding to normal appearing, and green spectra corresponding to indeterminate.

Qualitative comparison of prostate MRS analysis via four different feature extraction schemes employed for two different studies; each row corresponding to a different study. (a) and (d) represent the suspicious appearing (red voxels) and benign appearing (blue voxels) spectra identified via z score, and (b) and (e) illustrate the clustering results obtained via PCA [red (suspicious appearing), blue (benign appearing), green (indeterminate)]. Results for the NLDR schemes (c) GE and (f) LLE are also shown [the red, blue, and green colors having the same meaning as for PCA in (b) and (e)]. The white box superposed on (a)–(f) shows the locations of the potential cancer space (C_P). In each of (a)–(f) the cluster with the red metavoxels was the one identified as the CaP class based on the IdentifyCaPCluster algorithm (Sec. 4A).

MRS spectral grid plotted with the classification labels (three colors correspond to three different clusters) obtained from graph embedding and replicated clustering. The red spectra correspond to those identified as abnormal appearing, the blue correspond to normal appearing, and the green correspond to indeterminate spectra (possibly noise in this case).

Quantitative results

Quantitative evaluation of stage 1: Hierarchical clustering

At the end of stage 1, the largest rectangular box ${\tilde{C}}_{T}$ that contains all prostate spectra is then overlaid on the T2-w image. Note that in quantitative evaluation of stage 1, the precise spatial extent of the prostate is all that is needed. This ground truth is ascertained by manual placement of a spectral grid $({\tilde{C}}_{g}^{T})$ on the prostate by an expert radiologist. Table 2 shows the average sensitivity, specificity, and PPV in automated identification of the prostate grid ${\tilde{C}}_{T}$ with respect to ${\tilde{C}}_{g}^{T}$ and averaged over 18 studies.

Table 2.

Average sensitivity, specificity, PPV values for automated identification of prostate grid using hierarchical spectral clustering averaged over 18 studies.

Sensitivity	Specificity	PPV
97.66%	98.87%	89.29%


Quantitative evaluation of stage 2: Identifying suspicious appearing spectra

Evaluation of z score via receiver operating characteristic (ROC) curve analysis. In order to define the optimal threshold θ^z for performing a z score-based classification of each c∊C as normal or suspicious, a set of cancerous voxels obtained via annotation by an expert radiologist, C^V⊂C, was defined during an offline training phase. For each c∊C^V, a corresponding S^z(c) was obtained, which was then used to define the average μ^z and standard deviation σ^z of z score values for CaP. The threshold θ^z was then defined as μ^z±ασ^z, where α∊[0,1]. The value of α was varied uniformly over [0,1] and the corresponding values of θ^z were obtained. At each value of θ^z, each metavoxel $c ∊ {\tilde{C}}_{T}$ is identified as belonging to CaP if S^z(c)>θ^z, and as normal otherwise. Thus at each θ^z, the corresponding sensitivity and specificity of CaP detection via z score are computed [Eqs. (4.1) and (4.2)] with respect to C_P and N_g. A curve is then fit to the sensitivity and specificity values to obtain the ROC curve. The optimal threshold ${\hat{θ}}^{z}$ was determined as the operating point on the ROC curve, i.e., the location on the ROC curve closest to 100% sensitivity and 100% specificity. Average sensitivity and specificity at the operating point of the ROC curve were found to be 72.85% and 65.45%, respectively.
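Selecting the operating point amounts to a nearest-point search in the (sensitivity, specificity) plane. The sketch below assumes Euclidean distance to the ideal corner (100%, 100%); the text says only "closest", so the choice of metric is an assumption, and the function name and the three sample points are hypothetical.

```python
import math
from typing import List, Tuple

RocPoint = Tuple[float, float, float]  # (alpha, sensitivity %, specificity %)


def operating_point(roc: List[RocPoint]) -> RocPoint:
    """Return the ROC point closest to 100% sensitivity and 100% specificity.

    Euclidean distance is an assumption; the source text only says "closest".
    """
    return min(roc, key=lambda p: math.hypot(100.0 - p[1], 100.0 - p[2]))


# Hypothetical sweep over alpha, for illustration only.
points = [(0.0, 95.0, 30.0), (0.5, 75.0, 65.0), (1.0, 40.0, 90.0)]
best_alpha, best_sn, best_sp = operating_point(points)
print(best_alpha, best_sn, best_sp)  # prints "0.5 75.0 65.0"
```

The threshold corresponding to the selected α would then be used as the optimal z score threshold.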

Evaluation of DR methods via cross validation using probabilistic ground truth for CaP extent. The ability of a classifier to distinguish between object classes in embedding spaces obtained via DR methods depends on the choice of the number of dimensions (v) of the embedding space in which the data are represented. For the NLDR methods, the low dimensional data representation obtained via LLE (Ref. ³⁰) is also a function of κ, the parameter controlling the size of the local neighborhood within which linearity is assumed. In order to evaluate the parameter sensitivity of the different DR methods, their robustness over different values of κ and v was quantitatively evaluated. LLE was evaluated by varying κ∊{6,7,…,15} and v∊{3,4,…,10}, a total of 80 different combinations of parameter values. PCA and graph embedding were evaluated for eight different values of v∊{3,4,…,10}.

For each of ϕ∊{LLE,GE,PCA}, threefold and fivefold cross validation averaged over 20 iterations were also performed to obtain average sensitivity and specificity in terms of CaP detection performance for all 18 studies. Threefold cross validation was performed by randomly choosing three datasets and calculating average CaP detection sensitivity and specificity across these three studies over the parameter space (80 combinations of κ and v for LLE; eight values of v for PCA and GE). The parameter set (v_max for PCA and GE; v_max and κ_max for LLE) with maximum sensitivity and specificity in this space was then identified as optimal and used for the CAD scheme on the remaining 15 studies. The average CaP detection sensitivity and specificity on these 15 studies was then recorded. On the next trial, 3 random training studies from the 18 were again selected and used to optimize the parameters, and evaluation was again done on the remaining 15 studies. This entire process was repeated a total of 20 times. The mean μ^ϕ and standard deviation σ^ϕ in CaP detection sensitivity and specificity across these 20 iterations are reported in Table 3(a) for ϕ∊{LLE,GE,PCA}. A similar routine was employed when performing fivefold cross validation. The corresponding results are reported in Table 3(b). All NLDR schemes employed in this work were found to have higher sensitivity of CaP detection compared to PCA, with graph embedding performing the best with a sensitivity of almost 90%. In Figs. 9 and 10, bar plots are shown representing CaP detection sensitivity and specificity (μ^ϕ,σ^ϕ) for each of the 18 studies considered in this work via threefold and fivefold cross validation, respectively, over 20 iterations for GE. The confidence estimates (η) associated with the sensitivity and specificity measurements of each study are also shown. Note that while η is low for a majority of the studies, for studies 11 and 18, the confidence associated with CaP detection sensitivity and specificity for graph embedding were almost 90% and 80%, respectively.
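The repeated random-split procedure described above can be sketched as follows. This is an illustration under stated assumptions rather than the authors' code: `repeated_split_validation` and `toy_score` are hypothetical names, and a single scalar score per (study, parameter set) is assumed, whereas the text selects parameters by "maximum sensitivity and specificity" without saying how the two are combined.

```python
import random
import statistics
from typing import Callable, List, Sequence, Tuple

Params = Tuple[int, int]                  # (kappa, v) for LLE
Score = Callable[[int, Params], float]    # (study id, params) -> score in %


def repeated_split_validation(studies: Sequence[int], grid: List[Params], score: Score,
                              n_train: int = 3, n_trials: int = 20,
                              seed: int = 0) -> Tuple[float, float]:
    """Tune on n_train random studies, evaluate on the rest, repeat n_trials times."""
    rng = random.Random(seed)             # fixed seed for reproducibility (assumption)
    trial_means = []
    for _ in range(n_trials):
        train = rng.sample(list(studies), n_train)
        test = [s for s in studies if s not in train]
        # Parameter set with the best average score on the training studies.
        best = max(grid, key=lambda p: statistics.mean(score(s, p) for s in train))
        trial_means.append(statistics.mean(score(s, best) for s in test))
    return statistics.mean(trial_means), statistics.stdev(trial_means)


# LLE grid from the text: kappa in {6,...,15}, v in {3,...,10}, 80 combinations.
lle_grid = [(kappa, v) for kappa in range(6, 16) for v in range(3, 11)]


def toy_score(study: int, params: Params) -> float:
    """Hypothetical score surface peaking at kappa=10, v=5; ignores the study."""
    return 80.0 - abs(params[0] - 10) - abs(params[1] - 5)


mu, sigma = repeated_split_validation(range(1, 19), lle_grid, toy_score)
print(len(lle_grid), mu, sigma)  # prints "80 80.0 0.0"
```

Passing `n_train=5` gives the fivefold variant described in the text, with 13 studies held out per trial.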

Table 3.

Average and standard deviation in CaP detection sensitivity and specificity for different feature extraction methods over κ∊{6,…,15} and v∊{3,…,10} for 18 different studies via (a) threefold cross validation, and (b) fivefold cross validation.

Method	Sensitivity	Specificity
(a)
GE	89.33±1.87	79.79±2.24
LLE	83.70±4.50	81.04±4.60
PCA	78.53±3.20	83.97±2.81
(b)
GE	89.09±2.10	79.22±2.10
LLE	84.93±4.42	81.32±4.05
PCA	78.78±4.31	84.11±2.46


Barplot showing average and standard deviation in CaP detection sensitivity and specificity for the individual 18 studies averaged over 20 iterations of threefold cross validation via graph embedding. The coefficient of confidence (η) associated with the estimation of the performance measures is also shown in green.

Barplot showing the average and standard deviation in CaP detection sensitivity and specificity for the individual 18 studies averaged over 20 iterations for fivefold cross validation via graph embedding. The coefficient of confidence (η) associated with the estimation of the performance measures is also shown in green.

Evaluation of all feature extraction methods against expert annotated spectra. Table 4 shows the average sensitivity and specificity results obtained for ϕ∊{LLE,GE,PCA,z} for the six studies in which each spectrum had been labeled according to the five point scale³⁶ by an expert radiologist. For all methods, we made the assumption that spectra labeled as 1∕2 were all normal appearing and the spectra labeled 3∕4∕5 were suspicious appearing. For each of ϕ∊{LLE,GE,PCA,z}, the IdentifyCaPCluster algorithm was employed to identify the cluster most likely to correspond to CaP spectra. Since we only had six studies available, the mean and standard deviation in CaP detection sensitivity and specificity across all the expert annotated datasets were determined (based on previously determined v_max and κ_max) and are shown in Table 4.

Table 4.

This shows the average and standard deviation in CaP detection sensitivity and specificity for different feature extraction methods obtained via comparison with expert annotations for six ACRIN studies.

Method	Sensitivity	Specificity
GE	79.74±13.18	65.49±7.58
LLE	81.36±5.69	64.71±9.84
PCA	76.68±14.07	48.26±12.95
z score	62.70±18.67	50.73±7.39


CONCLUDING REMARKS

In this paper we have presented a novel application of nonlinear dimensionality reduction and hierarchical clustering for automated identification of (a) the prostate ROI based on the classification of MR spectral data alone and (b) suspicious appearing spectra within the prostate ROI. A total of 18 MRI∕MRS studies from the ACRIN trial was considered for the evaluation of four different feature extraction algorithms (PCA, z score, LLE, GE) in conjunction with replicated clustering in terms of their ability to identify suspicious appearing spectra. Owing to the fact that only limited knowledge regarding the precise spatial extent of CaP was available for the studies considered in this work, we defined a probabilistic ground truth estimate for CaP and a confidence coefficient to assess the degree of certainty associated with the CAD performance measures reported. The high confidence estimates associated with two of the studies (studies 11 and 18) seem to suggest that the consistently high CaP detection sensitivity and specificity measurements for the other studies are not erroneous. For a subset (6) of the 18 studies, a radiologist laboriously annotated each of the spectra according to the five point scale. For these studies, we assumed that the spectra labeled 3∕4∕5 were cancerous. In comparing the four feature extraction schemes on the 18 datasets with partial CaP ground truth estimates, as well as the six studies for which expert annotations were available, the NLDR schemes (GE and LLE) consistently outperformed PCA and z score in terms of both CaP detection sensitivity and specificity. In addition, the nonlinear DR schemes were found to be relatively robust to changes in the value of the system parameters (κ,v). The use of the replicated clustering scheme helped overcome the instability associated with k-means clustering, yielding consistently stable clusters.
Our scheme is also highly efficient with stage 1 only requiring an average of 11.24 s to analyze a 256×256×8 MRI∕MRS 3D 1.5 T scene on a Pentium IV, 2 Gbyte RAM Intel processor machine. Stage 2 took an average of 5.23 s for automated classification of MR spectra as being suspicious, normal appearing, or indeterminate.

Future work will involve integration of our automated CAD MRS scheme with T2-w MRI to incorporate both structural and functional information for more accurate identification of CaP. We also aim to perform more rigorous analysis of the scheme on a larger cohort of data. The availability of more precise knowledge of spatial location of CaP on the MR imagery will help to further confirm and validate the efficacy of our methods.

While replicated clustering is more stable than k-means, it still requires specification of the value of k. In this work we set k=3 on the assumption that all the spectra were either normal appearing, suspicious appearing, or indeterminate. While Jung et al.³⁶ identified five distinct classes of prostate MR spectra (five point scale), in our experiments (owing perhaps to the quality of data) we were unable to find five unique clusters. A future avenue of exploration will be to look at fully unsupervised clustering schemes (e.g., mean shift) which do not require prespecification of the number of data clusters.

ACKNOWLEDGMENTS

This work was made possible via grants from the Wallace H. Coulter Foundation, New Jersey Commission on Cancer Research, National Cancer Institute (Grant Nos. R01CA136535-01, ARRA-NCl-3 R21CA127186–02S1, R21CA127186–01, and R03CA128081-01), the Society for Imaging Informatics in Medicine (SIIM), The Cancer Institute of New Jersey, and the Life Science Commercialization Award from Rutgers University.

References

Catalona W., Smith D., Ratliff T., Dodds K., Coplen D., Yuan J., Petros J., and Andriole G., “Measurement of prostate-specific antigen in serum as a screening test for prostate cancer,” N. Engl. J. Med. 324, 1156–1161 (1991). [DOI] [PubMed] [Google Scholar]
Borboroglu P., Comer S., Riffenburgh R., and Amling C., “Extensive repeat transrectal ultrasound guided prostate biopsy in patients with previous benign sextant biopsies,” J. Urol. (Paris) 163(1), 158–162 (2000). [PubMed] [Google Scholar]
Yu Kyle K., and Hricak H., “Imaging prostate cancer,” Radiol. Clin. North Am. 38(1), 59–85 (2000). 10.1016/S0033-8389(05)70150-0 [DOI] [PubMed] [Google Scholar]
Scheidler J., Hricak H., Vigneron D., Yu K., Sokolov D., Huang R., Zaloudek C., Nelson S., Carroll P., and Kurhanewicz J., “Prostate cancer: Localization with three-dimensional proton MR spectroscopic imaging-clinicopathologic study,” Radiology 213, 473–480 (1999). [DOI] [PubMed] [Google Scholar]
Carroll P., Coakley F., and Kurhanewicz J., “Magnetic resonance imaging and spectroscopy of prostate cancer,” Rev. Neurol. 8(1), S4–S10 (2006). [PMC free article] [PubMed] [Google Scholar]
Ullrich G., Mueller-Lisse , Swanson M., Vigneron D., Hricak H., Bessette A., Males R., Wood P., Noworolski S., Nelson S., Barken I., Carroll P., and Kurhanewicz J., “Time-dependent effects of hormone-deprivation therapy on prostate metabolism as detected by combined magnetic resonance imaging and 3D magnetic resonance spectroscopic imaging,” Magn. Reson. Med. 46, 49–57 (2001). 10.1002/mrm.1159 [DOI] [PubMed] [Google Scholar]
Kurhanewicz J., Swanson M., Nelson S., and Vigneron D., “Combined magnetic resonance imaging and spectroscopic imaging approach to molecular imaging of prostate cancer,” J. Magn. Reson Imaging 16, 451–463 (2002). 10.1002/jmri.10172 [DOI] [PMC free article] [PubMed] [Google Scholar]
Heerschap A., Jager G. J., van der Graaf M., Barentsz J. O., Rosetten J., De La Oosterhof G., Ruijter E., and Ruijs J., “In vivo proton MR spectroscopy reveals altered metabolite content in malignant prostate tissue,” Anticancer Res. 17, 1455–1460 (1997). [PubMed] [Google Scholar]
Kurhanewicz J., Vigneron D. B., Hricak H., Narayan P., Carroll P., and Nelson S. J., “Three-dimensional H-1 MR spectroscopic imaging of the in situ human prostate with high (0.24-0.7-cm3) spatial resolution,” Radiology 198(3), 795–805 (1996). [DOI] [PubMed] [Google Scholar]
Zakian K. L., Eberhardt S., Hricak H., Shukla-Dave A., Kleinman S., Muruganandham M., Sircar K., Kattan M. W., Reuter V. E., Scardino P. T., and Koutcher J. A., “Transition zone prostate cancer: Metabolic characteristics at H MR spectroscopic imaging—initial results,” Radiology 229, 241–247 (2003). [DOI] [PubMed] [Google Scholar]
Wetter A., Engl T., Nadjmabadi D., Fliessbach K., Lehnert T., Gurung J., Beecken W., and Vogl J., “Combined MRI and MR spectroscopy of the prostate before radical prostatectomy,” AJR, Am. J. Roentgenol. 187, 724–730 (2006). [DOI] [PubMed] [Google Scholar]
Madabhushi A. and Metaxas D., “Ultrasound techniques in digital mammography and their application in breast cancer diagnosis,” Medical Imaging Systems Technology: Analysis and Computational Methods (World Scientific, Singapore, 2005), pp. 119–150. [Google Scholar]
Madabhushi A., Feldman M., Metaxas D., Tomaszeweski J., and Chute D., “Automated detection of prostatic adenocarcinoma from high-resolution ex vivo MRI,” IEEE Trans. Med. Imaging 24(12), 1611–1625 (2005). 10.1109/TMI.2005.859208 [DOI] [PubMed] [Google Scholar]
Viswanath S., Rosen M., and Madabhushi A., “A consensus embedding approach for segmentation of high resolution in vivo prostate magnetic resonance imagery,” Society of Photo-Optical Instrumentation Engineers Medical Imaging Conference (SPIE 2008), San Diego (SPIE, 2008), Vol. 6915, 69150U.
Chan I., Wells W., Mulkern R., Haker S., Zhang J., Zou K., Maier S., and Tempany C., “Detection of prostate cancer by integration of line-scan diffusion, T2-mapping and T2-weighted magnetic resonance imaging: A multichannel statistical classifier,” Med. Phys. 30(9), 2390–2398 (2003). 10.1118/1.1593633 [DOI] [PubMed] [Google Scholar]
Kurhanewicz J., Vigneron D. B., Males R. G., Swanson M. G., Yu K. K., and Hricak H., “The prostate: MR imaging and spectroscopy. Present and future,” Radiology 38(1), 115–138 (2000). [DOI] [PubMed] [Google Scholar]
Van der Veen J. W. C., De Beer R., Luyten P. R., and Van Ormondt D., “Accurate quantification of in vivo 31P NMR signals using the variable projection method and prior knowledge,” Magn. Reson. Med. 6(1), 92–98 (1988). 10.1002/mrm.1910060111 [DOI] [PubMed] [Google Scholar]
Vanhamme L., Boogaart A., and Huffel S., “Improved method for accurate and efficient quantification of MRS data with use of prior knowledge,” J. Magn. Reson. 129(1), 35–43 (1998). 10.1006/jmre.1997.1244 [DOI] [PubMed] [Google Scholar]
Ratiney H., Sdika M., Coenradie Y., Cavassila S., van Ormondt D., and Graveron-Demilly D., “Time-domain semiparametric estimation based on a metabolite basis set,” Pediatr. Nephrol. 18(1), 1–13 (2005). [DOI] [PubMed] [Google Scholar]
Zakian K., Sircar K., Hricak H., Chen H., Dave A., Eberhardt S., Muruganandham M., Ebora L., Kattan M., Reuter V., Scardino P., and Koutcher J., “Correlation of proton MR spectroscopic imaging with gleason score based on sept-section pathologic analysis after radical prostatectomy,” Radiology 234, 804–814 (2005). 10.1148/radiol.2343040363 [DOI] [PubMed] [Google Scholar]
Pels P., Ozturk-Isik E., Swanson M., Vanhamme L., Kurhanewicz J., Nelson S., and Huffel S., “Quantification of prostate MRSI data by model-based time domain fitting and frequency domain analysis,”Pediatr. Nephrol. 19, 188–197 (2006). [DOI] [PubMed] [Google Scholar]
Kelm M., Menze B., Zechmann C., Baudendistel K., and Hanprecht F., “Automated estimation of tumor probability in prostate MRSI: Pattern recognition vs. quantification,” Magn. Reson. Med. 57, 150–159 (2006). 10.1002/mrm.21112 [DOI] [PubMed] [Google Scholar]
Quan H. and Bao S., “z score analysis for H-MRSI data of glioma and prostate cancer,” Asia-Oceania Federation of Organizations for Medical Physics, 2004.
Laudadio T., Pels P., Lathauwer L., Hecke P., and Huffel S., “Tissue segmentation and classification of MRSI data using canonical correlation analysis,” Magn. Reson. Med. 54, 1519–1529 (2005). 10.1002/mrm.20710 [DOI] [PubMed] [Google Scholar]
Devos A., Lukas L., Suykens J. A. K., Vanhamme L., Tate A. R., Howe F. A., Majs C., Moreno-Torres A., van der Graaf M., Ars C., and Van Huffel S., “Classification of brain tumours using short echo time 1H MR spectra,” J. Magn. Reson Imaging 170, 164–175 (2004). [DOI] [PubMed] [Google Scholar]
Ma J. and Sun J., “MRS classification based on independent component analysis and support vector machines,” IEEE Fifth International Conference on Hybrid Intelligent Systems (IEEE, Nov. 2005), pp. 81–84.
Simonetti A. W., Melssen W. J., Szabo de Edelenyi F., van Asten J. J., Heerschap A., and Buydens L. M., “Combination of feature-reduced MR spectroscopic and MR imaging data for improved brain tumor classification,” NMR Biomed. 18, 34–43 (2005). doi:10.1002/nbm.919
Lee G., Madabhushi A., and Rodriguez C., “Investigating the efficacy of nonlinear dimensionality reduction schemes in classifying gene and protein expression studies,” IEEE/ACM Trans. Comput. Biol. Bioinf. 5, 368–384 (2008). doi:10.1109/TCBB.2008.36
Tenenbaum J., de Silva V., and Langford J. C., “A global geometric framework for nonlinear dimensionality reduction,” Science 290, 2319–2322 (2000). doi:10.1126/science.290.5500.2319
Roweis S. and Saul L., “Nonlinear dimensionality reduction by locally linear embedding,” Science 290, 2323–2326 (2000). doi:10.1126/science.290.5500.2323
Madabhushi A., Shi J., Rosen M., Tomaszewski J., and Feldman M., “Graph embedding to improve supervised classification: Detecting prostate cancer,” Medical Image Computing and Computer-Assisted Intervention (MICCAI 2005), LNCS, 2005, Vol. 3749, pp. 729–737.
Shi J. and Malik J., “Normalized cuts and image segmentation,” IEEE Trans. Pattern Anal. Mach. Intell. 22(8), 888–905 (2000). doi:10.1109/34.868688
Hotelling H., “Analysis of a complex of statistical variables into principal components,” J. Educ. Psychol. 24, 417–441 (1933). doi:10.1037/h0071325
MacQueen J. B., “Some methods for classification and analysis of multivariate observations,” Proceedings of the Fifth Berkeley Symposium on Mathematical Statistics and Probability (University of California Press, Berkeley, 1967), Vol. 1, pp. 281–297.
Bottomley P. A., “Spatial localization in NMR spectroscopy in vivo,” Ann. N.Y. Acad. Sci. 508, 333–348 (1987). doi:10.1111/j.1749-6632.1987.tb32915.x
Jung J., Coakley F., Vigneron D., Swanson M., Qayyum A., Weinberg V., Jones K., Carroll P., and Kurhanewicz J., “Prostate depiction at endorectal MR spectroscopic imaging: Investigation of a standardized evaluation system,” Radiology 233, 701–708 (2004). doi:10.1148/radiol.2333030672

