Author manuscript; available in PMC: 2024 Feb 16.
Published in final edited form as: IEEE Trans Biomed Eng. 2015 Jan 23;62(6):1585–1594. doi: 10.1109/TBME.2015.2395812

Pharmacokinetic Tumor Heterogeneity as a Prognostic Biomarker for Classifying Breast Cancer Recurrence Risk

Majid Mahrooghy 1, Ahmed B Ashraf 2, Dania Daye 3, Elizabeth S McDonald 4, Mark Rosen 5, Carolyn Mies 6, Michael Feldman 7, Despina Kontos 8,*
PMCID: PMC10870107  NIHMSID: NIHMS1962323  PMID: 25622311

Abstract

Goal:

Heterogeneity in cancer can affect response to therapy and patient prognosis. Histologic measures have classically been used to measure heterogeneity, although a reliable noninvasive measurement is needed both to establish baseline risk of recurrence and monitor response to treatment. Here, we propose using spatiotemporal wavelet kinetic features from dynamic contrast-enhanced magnetic resonance imaging to quantify intratumor heterogeneity in breast cancer.

Methods:

Tumor pixels are first partitioned into homogeneous subregions using pharmacokinetic measures. Heterogeneity wavelet kinetic (HetWave) features are then extracted from these partitions to obtain spatiotemporal patterns of the wavelet coefficients and the contrast agent uptake. The HetWave features are evaluated in terms of their prognostic value using a logistic regression classifier with genetic algorithm wrapper-based feature selection to classify breast cancer recurrence risk as determined by a validated gene expression assay.

Results:

Receiver operating characteristic analysis and area under the curve (AUC) are computed to assess classifier performance using leave-one-out cross validation. The HetWave features outperform other commonly used features (AUC = 0.88 for HetWave versus 0.70 for standard features). The combination of HetWave and standard features further increases classifier performance (AUC = 0.94).

Conclusion:

The rate of the spatial frequency pattern over the pharmacokinetic partitions can provide valuable prognostic information.

Significance:

HetWave could be a powerful feature extraction approach for characterizing tumor heterogeneity, providing valuable prognostic information.

Index Terms: Breast cancer recurrence prediction, breast dynamic contrast-enhanced magnetic resonance imaging (DCE-MRI), feature extraction, gene expression, partitioning, prognostic assessment, tumor heterogeneity

I. Introduction

BREAST cancer is a heterogeneous disease with varying intratumoral molecular expression that can confound targeted therapies and lead to a mixed treatment response. Molecular heterogeneity in breast cancer is well established for both primary and metastatic disease. There is a clinical need for a noninvasive method of assessing heterogeneity to both establish appropriate therapy and monitor response.

Histologic tumor heterogeneity correlates with a poor outcome [1]. While tumor heterogeneity can be determined from surgical specimens, critical therapeutic decisions often need to be made from core biopsy samples, limited to only one region of a tumor and not easily repeated. Imaging is uniquely poised to appropriately capture the entire tumor phenotype. If imaging heterogeneity could have similar prognostic value to pathologic heterogeneity, it would be a powerful tool to aid personalized decisions about cancer treatment.

Since DCE-MRI captures both anatomical and functional tumor characteristics such as perfusion, permeability, and angiogenesis [2]–[4], it could potentially allow for assessing tumor heterogeneity in vivo. Several studies have been performed to obtain biomarkers using DCE-MRI to characterize breast tumors in terms of malignancy, lymph node involvement, tumor grade, response to therapy, and other histopathologic markers [5]–[13]. The most commonly used semiquantitative prognostic features extracted directly from DCE-MR images include kinetic, morphological, and textural measures [5], [14]. Although useful for characterizing tumors, these features share a common limitation: they do not fully capture tumor heterogeneity, as they either rely on simple aggregate measures or tumor “hotspots” [5], [6] and consider the whole tumor as a relatively homogeneous volume.

Limited preliminary studies investigating imaging heterogeneity on DCE-MRI for treatment response prediction have been promising. Parikh et al. found that a mid-treatment decrease in heterogeneity on DCE-MRI during neoadjuvant chemotherapy (NAC) correlated with pathologic response better than standard size criteria [15]. Similarly, Yankeelov et al. noted that spatial information incorporating intratumoral heterogeneity from DCE-MRI could be used to improve response prediction of breast tumors to NAC over standard quantitative MRI parameters [16]. Some studies have also previously analyzed the voxel-wise heterogeneity of the tumor enhancement in DCE-MRI by using independent component analysis (ICA) [12], [13].

This paper builds on prior work looking at neoadjuvant response to query whether baseline pharmacokinetic (PK) imaging heterogeneity can be used to predict prognosis. We introduce a methodology to characterize intratumor PK heterogeneity from breast DCE-MRI data. Our method, which builds upon our previous work [17], [18], uses a two-step approach: first, the tumor pixels are partitioned into groups that act similarly based on PK heterogeneity measures. Wavelet kinetic features are then extracted within each partitioned subregion to obtain the spatiotemporal patterns of the wavelet coefficients and contrast agent uptake. As a result, instead of considering global (i.e., average) spatiotemporal patterns for the whole tumor, and potentially losing information about the more subtle intratumor heterogeneity, we extract localized spatiotemporal features within the obtained tumor pixel partitions, based on specific heterogeneity properties. In addition, by using the powerful wavelet descriptors, we obtain multiresolution kinetic information from the tumor partitions at different spatial frequencies, as well as the rate of their frequency change over time, providing richer characterization of heterogeneity. As a preliminary evaluation, the proposed features are compared with other representative and commonly used DCE-MRI descriptors in classifying breast cancer recurrence risk as determined by a validated gene-expression assay.

Our long-term hypothesis is that imaging could ultimately complement current histopathologic and molecular biomarkers in prognostication and prediction.

II. Methods

A. Pharmacokinetic Heterogeneity-Based Partitioning

In DCE-MRI, a region of interest (ROI) or individual voxel can have a characteristic signal intensity time course which is related to the contrast agent concentration [19]. To model the uptake of the contrast agent by the tissue, several PK models have been proposed based on compartmental modeling (CM), on both the ROI and pixel-wise scales [4], [20]–[22]. The time course of the DCE-MRI signal can be used for extracting physiological parameters of these PK models [19], [23]. These physiological parameters are mainly related to tissue perfusion, vascular permeability, and extracellular volume fraction [3], [19], [23], [24]. Typically, compartment models assume that there are different tissue types contributing to the measured contrast agent concentration. Specifically, in the standard compartment model [4], [25], we have

$$C_{\mathrm{measured}}(i,t)=\sum_{j=1}^{J-1}C_j(i,t)+V_p(i)\,C_p(t) \tag{1}$$

where $C_{\mathrm{measured}}(i,t)$ is the measured contrast agent concentration in the $i$th pixel and $C_j(i,t)$ is the concentration in tissue-type $j$ at the $i$th pixel, which is defined as

$$C_j(i,t)=K_j^{\mathrm{trans}}(i)\,C_p(t)\otimes\exp(-k_{ep,j}t),\quad j=1,\ldots,J-1 \tag{2}$$

where $C_p(t)$ is the tracer concentration in plasma, $K_j^{\mathrm{trans}}$ is the unidirectional volume transfer constant from plasma to tissue-type $j$, $k_{ep,j}$ is the flux rate constant in tissue-type $j$, $V_p$ is the plasma volume in the ROI, and $\otimes$ represents the convolution operation. Letting $Q$ be the number of sampling time points, $F_j(t)=C_p(t)\otimes\exp(-k_{ep,j}t)$, and $F_p(t)=C_p(t)$, we may represent the time series of the measured concentration for the $i$th pixel as follows:

$$C_{\mathrm{measured}}(i,t)=\mathbf{F}^{T}(t)\,\mathbf{K}^{\mathrm{trans}}(i) \tag{3}$$

where $\mathbf{C}_{\mathrm{measured}}(i)=[C_{\mathrm{measured}}(i,t_1),C_{\mathrm{measured}}(i,t_2),\ldots,C_{\mathrm{measured}}(i,t_Q)]$; $\mathbf{F}(t)=[F_1(t),F_2(t),\ldots,F_{J-1}(t),F_p(t)]^{T}$ ($T$ denotes the transpose), and

$$\mathbf{K}^{\mathrm{trans}}(i)=[K_1^{\mathrm{trans}}(i),K_2^{\mathrm{trans}}(i),\ldots,K_{J-1}^{\mathrm{trans}}(i),V_p(i)]^{T} \tag{4}$$

where $C_{\mathrm{measured}}(i,t_l)$ is the contrast-agent concentration at time $t_l$ for pixel $i$; $K_1^{\mathrm{trans}}(i),\ldots,K_{J-1}^{\mathrm{trans}}(i)$ are the local volume transfer constants for tissue-types 1 to $J-1$ at pixel $i$; and $V_p(i)$ is the plasma volume for pixel $i$ [4].
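As a rough illustration of the forward model in (1) and (2), the following sketch simulates the measured concentration for one pixel. It is not the authors' implementation: it assumes a uniform sampling grid with spacing `dt`, approximates the convolution by a simple Riemann sum, and the function names are hypothetical.

```python
import math


def convolve_with_exponential(cp, k_ep, dt):
    """Riemann-sum approximation of F_j(t) = C_p(t) (x) exp(-k_ep * t).

    Assumes `cp` is sampled on a uniform time grid with spacing `dt`.
    """
    out = []
    for l in range(len(cp)):
        acc = 0.0
        for m in range(l + 1):
            acc += cp[m] * math.exp(-k_ep * (l - m) * dt)
        out.append(acc * dt)
    return out


def measured_concentration(cp, k_trans, k_ep, v_p, dt):
    """Eq. (1)-(2): sum_j K_j^trans * F_j(t) + V_p * C_p(t) for one pixel.

    `k_trans` and `k_ep` hold one value per tissue type (J - 1 entries).
    """
    basis = [convolve_with_exponential(cp, k, dt) for k in k_ep]  # F_j(t)
    return [
        sum(kt * f[l] for kt, f in zip(k_trans, basis)) + v_p * cp[l]
        for l in range(len(cp))
    ]


# Illustrative (made-up) plasma curve and parameters for a two-tissue model.
plasma = [0.0, 1.0, 0.6, 0.4]
curve = measured_concentration(plasma, k_trans=[0.3, 0.1],
                               k_ep=[0.8, 0.2], v_p=0.05, dt=3.0)
```

Stacking the `basis` curves with `cp` gives the vector $\mathbf{F}(t)$ of (3), so the same computation can be read as the inner product $\mathbf{F}^{T}(t)\,\mathbf{K}^{\mathrm{trans}}(i)$.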

The accurate estimation of PK parameters such as $K_j^{\mathrm{trans}}$, $k_{ep,j}$, and $V_p$ is challenging because the underlying nonlinear optimization has multiple local optima. Different methods and techniques have been developed to obtain these parameters [4], [23], [26]. In our study, we used the compartment modeling based on convex analysis of mixtures (CM-CAM) technique for estimating these PK parameters. CM-CAM aims at identifying the pure-volume pixels, which only consist of a single compartment tissue type, thereby minimizing partial volume effects, which are created by a mixture of more than one distinct compartment. These pure-volume pixels are identified by finding the corner points of the convex set of the pixel time series, as follows:

$$\mathcal{H}(X)=\left\{\sum_{i=1}^{N}\alpha_i\,\mathbf{x}(i)\;\middle|\;\mathbf{x}(i)\in X,\ \alpha_i\ge 0,\ \sum_{i=1}^{N}\alpha_i=1\right\}. \tag{5}$$

In the previous equation, $\mathbf{x}(i)$ represents the pixel time series, and $\mathcal{H}(X)$ is the convex set. Every data point, $\mathbf{x}(i)$, of the time series can be represented as a convex combination of the corner points $\mathbf{a}_j$ of the convex set as follows:

$$\mathbf{x}(i)=\sum_{j=1}^{J}K_j(i)\,\mathbf{a}_j \tag{6}$$

where

$$x(i,t_l)=\frac{C_{\mathrm{measured}}(i,t_l)}{\sum_{l'=1}^{Q}C_{\mathrm{measured}}(i,t_{l'})},\qquad a_j(t_l)=\frac{F_j(t_l)}{\sum_{l'=1}^{Q}F_j(t_{l'})},$$

$$K_j(i)=\frac{K_j^{\mathrm{trans}}(i)}{\sum_{j'=1}^{J}K_{j'}^{\mathrm{trans}}(i)},\qquad \sum_{j=1}^{J}K_j(i)=1.$$

By applying the standard finite-normal mixture method to cluster data and using convex optimization methods [4], CM-CAM identifies the corner clusters and estimates the corresponding PK parameters [4]. The output of this step is the constant $\mathbf{K}^{\mathrm{trans}}(i)$ for every pixel $i$. In this study, a two-tissue compartment model is considered for the CM.

To capture heterogeneity in terms of the volume transfer constant and plasma volume within the tumor, we analyze $\mathbf{K}^{\mathrm{trans}}(i)$ in (4) as a PK heterogeneity measure (we used the $K_j^{\mathrm{trans}}$ and $V_p$ parameters because they are spatially variant compared to $k_{ep,j}$ and $C_p$ [4], [27]). We first apply fuzzy c-means (FCM) clustering [28] on the pixel-wise $\mathbf{K}^{\mathrm{trans}}(i)$ values to identify corresponding heterogeneity partitions (here we chose three partitions, motivated by the common breast cancer subtypes of Basal, Luminal A, and Luminal B [29]). This allows us to identify subregions within the same tumor that have homogeneous PK properties (i.e., high intracluster homogeneity versus intercluster heterogeneity). For our experiments, we used MATLAB’s (V. 8.1.0.604) standard implementation of the FCM algorithm with default parameters: an exponent of 2 for the partition matrix, a maximum of 100 iterations, and a minimum improvement of $10^{-5}$ in the cost function [28]. To establish a correspondence between partitions for different tumors, the partitions are sorted based on the maximum first contrast uptake value of all pixels within the partition (i.e., the upper bound of the range of uptake values measured in each partition). This ensures that partition 1 always contains the region of the tumor with absolute maximal first contrast uptake, and partition 3’s most enhancing region is less enhancing than those present in partitions 1 and 2. Fig. 1(c) and (d) show examples of the PK partitioning for a low- and high-recurrence risk tumor (where teal, gold, and dark red partitions represent low, medium, and high maximum uptake in the first post-contrast DCE-MR image). Note that in the previous partitioning scheme, the output is a mask image $M$ such that $M_k$ represents the partition number of the $k$th pixel within the image.

Fig. 1.

First post-contrast breast DCE-MR images (first column) and corresponding PK heterogeneity partitioning (second column; teal, gold, and dark red partitions represent low, medium, and high maximum first contrast uptake) for (a), (b) low recurrence risk and (c), (d) high recurrence risk tumor examples.
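The partitioning step can be sketched as follows. This is a minimal pure-Python fuzzy c-means, not MATLAB's implementation; the random initialization, the hard assignment by largest membership, and the function names are assumptions made for illustration. The exponent, iteration limit, and tolerance mirror the settings quoted in the text.

```python
import random


def fuzzy_c_means(points, c=3, m=2.0, max_iter=100, tol=1e-5, seed=0):
    """Cluster per-pixel PK vectors (e.g. [K1_trans, K2_trans, Vp]).

    Returns a hard label per point (index of the largest membership)
    and the cluster centres.
    """
    rng = random.Random(seed)
    n, dim = len(points), len(points[0])
    # Random initial membership matrix; every row sums to 1.
    u = []
    for _ in range(n):
        row = [rng.random() + 1e-9 for _ in range(c)]
        total = sum(row)
        u.append([v / total for v in row])
    power = 1.0 / (m - 1.0)
    prev_cost = float("inf")
    centers = []
    for _ in range(max_iter):
        # Centres are membership-weighted means of the points.
        centers = []
        for j in range(c):
            w = [u[i][j] ** m for i in range(n)]
            wsum = sum(w)
            centers.append([sum(w[i] * points[i][d] for i in range(n)) / wsum
                            for d in range(dim)])
        # Squared distances (small epsilon avoids division by zero).
        d2 = [[sum((points[i][d] - centers[j][d]) ** 2 for d in range(dim)) + 1e-12
               for j in range(c)] for i in range(n)]
        # Membership update: u_ij proportional to d_ij^(-2/(m-1)).
        for i in range(n):
            inv = [(1.0 / d2[i][j]) ** power for j in range(c)]
            s = sum(inv)
            u[i] = [v / s for v in inv]
        cost = sum((u[i][j] ** m) * d2[i][j] for i in range(n) for j in range(c))
        if abs(prev_cost - cost) < tol:
            break
        prev_cost = cost
    labels = [max(range(c), key=lambda j: u[i][j]) for i in range(n)]
    return labels, centers


def relabel_by_max_uptake(labels, first_uptake, c=3):
    """Renumber partitions 1..c so partition 1 has the highest maximum
    first post-contrast uptake and partition c the lowest."""
    max_uptake = {
        j: max((first_uptake[i] for i in range(len(labels)) if labels[i] == j),
               default=float("-inf"))
        for j in range(c)
    }
    order = sorted(range(c), key=lambda j: max_uptake[j], reverse=True)
    new_id = {old: rank + 1 for rank, old in enumerate(order)}
    return [new_id[l] for l in labels]
```

Applying `relabel_by_max_uptake` to the FCM labels yields the mask $M$, with consistent partition numbering across tumors.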

B. Heterogeneity Wavelet Kinetic Features

After obtaining the tumor partitions as described earlier, we use the wavelet transform [30] to characterize the spatial frequency information. In principle, different wavelet families (i.e., bases) can be used for this purpose. Here, we have used the Daubechies 2 (“db2”) as our primary wavelet family, which performed at least as well as the other wavelet families we evaluated (see Table II). We selected two scales to ensure the availability of enough wavelet coefficient samples for different sized tumors, as for partitions of very small tumors (i.e., less than 0.3 cm²), it is not possible to extract enough wavelet coefficients for more than two decomposition levels. Three sets of detail coefficients, corresponding to horizontal, vertical, and diagonal spatial frequencies, are computed along with the approximation coefficients at two decomposition levels (referred to in the rest of the paper as DP1 and DP2).

Since each decomposition image is downsampled from the original image, the corresponding tumor mask is also downsampled to match the wavelet images. To process the dynamic patterns of the wavelet coefficients for each partition, we compute the mean and variance of the approximation and detail (horizontal, vertical, and diagonal) images at DP1 and DP2 within each partition as well as the mean and variance of the pre- and post-contrast images. The mathematical definitions of our HetWave features are detailed next.

Assume the pre- and post-contrast images are defined as $I^t$ ($t=1$ represents the pre-contrast image, while $t=2,3,\ldots,Q$ correspond to the post-contrast images). The approximation coefficients, $C_L^t(k)$, and detail coefficients, $D_L^{t,s}(k)$, are then obtained as follows [30]:

$$C_L^t(k)=\langle I^t,\Phi_{L,k}\rangle \tag{7}$$

$$D_L^{t,s}(k)=\langle I^t,\psi_{L,k}^{s}\rangle. \tag{8}$$

Note that $\langle x,y\rangle$ represents the inner product between two vectors $x$ and $y$, and $I^t$, $\psi_{L,k}^{s}$, and $\Phi_{L,k}$ are the vectorized versions of the image, the wavelet family function, and the corresponding scaling function, respectively [30]; $k$ indexes the pixels in the vectorized images, and $L$ identifies the decomposition level. The detail coefficients $D_L^{t,s}$ can be horizontal, vertical, or diagonal ($s=H,V,D$) [30]. Examples of the wavelet coefficient images from different decomposition levels for low and high recurrence risk tumors are shown in Fig. 2.

Fig. 2.

Low (first row) and high (third row) recurrent tumor contrast images and the corresponding wavelet images from the first post-contrast DCE-MRI scan (second and fourth rows); (a)–(d), pre- and post-contrast images of the lesion; (e) approximate wavelet coefficients at DP1; (f) horizontal wavelet coefficients at DP1; (g) vertical wavelet coefficients at DP1; (h) diagonal wavelet coefficients at DP2; (i)–(p) are similar to the previous description but for a tumor of high risk of recurrence.
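To make the two-level decomposition concrete, the sketch below implements a separable 2-D transform using the standard four-tap Daubechies filter (the one toolboxes label "db2"). It is an illustration only: it assumes periodic boundary extension and even image dimensions at each level, which may differ from the toolbox defaults used in the study, and the function names are hypothetical.

```python
import math

# Four-tap Daubechies scaling filter and its quadrature-mirror wavelet filter.
_S3 = math.sqrt(3.0)
_NORM = 4.0 * math.sqrt(2.0)
LO = [(1 + _S3) / _NORM, (3 + _S3) / _NORM, (3 - _S3) / _NORM, (1 - _S3) / _NORM]
HI = [LO[3], -LO[2], LO[1], -LO[0]]


def dwt1d(signal, filt):
    """Filter and downsample by 2 with periodic extension (even length)."""
    n = len(signal)
    return [sum(filt[k] * signal[(2 * i + k) % n] for k in range(len(filt)))
            for i in range(n // 2)]


def transpose(matrix):
    return [list(col) for col in zip(*matrix)]


def along_columns(matrix, filt):
    """Apply the 1-D transform to every column of `matrix`."""
    return transpose([dwt1d(col, filt) for col in transpose(matrix)])


def dwt2d(image):
    """One decomposition level: returns (cA, cH, cV, cD)."""
    lo_rows = [dwt1d(row, LO) for row in image]
    hi_rows = [dwt1d(row, HI) for row in image]
    c_a = along_columns(lo_rows, LO)   # approximation
    c_h = along_columns(lo_rows, HI)   # horizontal detail
    c_v = along_columns(hi_rows, LO)   # vertical detail
    c_d = along_columns(hi_rows, HI)   # diagonal detail
    return c_a, c_h, c_v, c_d


def two_level_decomposition(image):
    """DP1 from the image, DP2 from the DP1 approximation."""
    dp1 = dwt2d(image)
    dp2 = dwt2d(dp1[0])
    return {"DP1": dp1, "DP2": dp2}
```

Each level halves both image dimensions, which is why the partition mask has to be downsampled to match the coefficient images.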

As described in the previous section, we obtain the heterogeneity partitioning mask $M$ such that $M_k$ represents the membership mapping of pixel $k$ to its respective partition (where $M_k\in\{1,2,\ldots,p\}$ and $p$ is the number of partitions). The mean and variance of the approximation and detail coefficients (horizontal, vertical, and diagonal) at DP1 and DP2 for the $i$th partition at time $t$ are computed as follows:

$$\mu_{cA,L}(i,t)=\frac{\sum_{k=1}^{N}C_L^t(k)\,\delta(M_k=i)}{\sum_{k=1}^{N}\delta(M_k=i)} \tag{9}$$

$$\mu_{cH,L}(i,t)=\frac{\sum_{k=1}^{N}D_L^{t,H}(k)\,\delta(M_k=i)}{\sum_{k=1}^{N}\delta(M_k=i)} \tag{10}$$

where $\mu_{cA,L}$ is the mean of the approximation coefficients at decomposition level $L$, and $\mu_{cH,L}$, $\mu_{cV,L}$, and $\mu_{cD,L}$ are the means of the extracted detail coefficients (horizontal, vertical, and diagonal); $\delta(M_k=i)$ is an indicator function which equals 1 when $M_k=i$, and zero otherwise; $i$ represents the partition number, and $N$ is the total number of tumor pixels. Since the formulas of $\mu_{cV,L}$ and $\mu_{cD,L}$ are similar to $\mu_{cH,L}$, we only show $\mu_{cH,L}$ for simplicity. The variances of the approximation coefficients, $\sigma_{cA,L}^2$, and detail coefficients ($\sigma_{cH,L}^2$, $\sigma_{cV,L}^2$, and $\sigma_{cD,L}^2$) are obtained as follows:

$$\sigma_{cA,L}^2(i,t)=\frac{\sum_{k=1}^{N}\left(C_L^t(k)-\mu_{cA,L}(i,t)\right)^2\delta(M_k=i)}{\sum_{k=1}^{N}\delta(M_k=i)} \tag{11}$$

$$\sigma_{cH,L}^2(i,t)=\frac{\sum_{k=1}^{N}\left(D_L^{t,H}(k)-\mu_{cH,L}(i,t)\right)^2\delta(M_k=i)}{\sum_{k=1}^{N}\delta(M_k=i)}. \tag{12}$$

The mean and variance of the pre- and post-contrast images $I^t$ within each partition are also obtained:

$$\mu_I(i,t)=\frac{\sum_{k=1}^{N}I^t(k)\,\delta(M_k=i)}{\sum_{k=1}^{N}\delta(M_k=i)} \tag{13}$$

$$\sigma_I^2(i,t)=\frac{\sum_{k=1}^{N}\left(I^t(k)-\mu_I(i,t)\right)^2\delta(M_k=i)}{\sum_{k=1}^{N}\delta(M_k=i)}. \tag{14}$$
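The masked statistics in (9) to (14) all share one pattern: select the pixels whose mask value equals the partition number, then take the mean and the population variance. A minimal sketch, with a hypothetical function name and vectorized (flattened) inputs assumed:

```python
def partition_mean_var(values, mask, partition):
    """Mean and population variance of `values` over pixels where
    mask == partition, following the pattern of Eqs. (9)-(14).

    `values` may hold image intensities or wavelet coefficients; `mask`
    must be the (suitably downsampled) partition mask of equal length.
    """
    members = [v for v, m in zip(values, mask) if m == partition]
    if not members:
        raise ValueError("partition %r contains no pixels" % (partition,))
    mean = sum(members) / len(members)
    var = sum((v - mean) ** 2 for v in members) / len(members)
    return mean, var
```

Note that the variance divides by the number of pixels in the partition, as in (11), rather than by that number minus one.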

After computing the previous statistics of wavelet coefficients within each tumor partition, we seek to examine how these statistics change over the enhancement process. Therefore, we compute the signal enhancement ratio (SER) of each statistic. Hylton [31] previously proposed the SER as $\mathrm{SER}=(I_1-I_0)/(I_2-I_0)$, where $I_0$, $I_1$, and $I_2$ represent the signal intensities on the pre-contrast, early post-contrast, and late post-contrast images, respectively.

Following a similar rationale in quantifying enhancement, our proposed heterogeneity wavelet kinetic (HetWave) features are defined as follows:

$$\mathrm{HetWave}_f=\frac{W_1^f-W_0^f}{W_Q^f-W_0^f} \tag{15}$$

where instead of just using intensity as in SER, we use the statistics as detailed earlier for each DCE-MR time point to obtain richer spatiotemporal information. In this definition, $W_t^f$ represents the statistics as defined earlier for the DCE-MR image corresponding to each time point $t\in\{0,1,2,\ldots,Q\}$ (e.g., $W_1^1$ is $\mu_{cA,L}(1,1)$). In our approach, the number of HetWave features depends on the number of the heterogeneity partitions ($p$), the number of the statistical operations ($s$), and the wavelet decomposition levels ($d$). We also have $(4d+1)$ images in total, corresponding to the original input image and the approximate, horizontal, vertical, and diagonal wavelet images. Therefore, the total number of HetWave features is

$$N_{\mathrm{HetWave}}=(p\times s)(4d+1). \tag{16}$$

These features capture the relative change in the mean and variance of the high and low spatial frequency pattern of the partitions as well as contrast agent uptake from the first post-contrast to the last post-contrast time point.

In this study, we used three heterogeneity partitions (p=3), the mean and variance as statistic operations (s=2), and two wavelet decomposition levels (d=2). As a result, our final HetWave feature vector consists of 54 features.
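The ratio in (15) and the count in (16) can be checked with a few lines of code. This is a sketch with hypothetical function names; returning NaN for a zero denominator is an assumption, since the text does not say how that case is handled.

```python
def hetwave(w_pre, w_first, w_last):
    """Eq. (15): change of a statistic from pre-contrast to the first
    post-contrast image, relative to its change up to the last one."""
    denom = w_last - w_pre
    if denom == 0:
        return float("nan")  # assumption: undefined ratio reported as NaN
    return (w_first - w_pre) / denom


def n_hetwave(p, s, d):
    """Eq. (16): partitions x statistics x (original image + 4 sub-bands
    per decomposition level)."""
    return (p * s) * (4 * d + 1)


# Study settings from the text: p = 3 partitions, s = 2 statistics, d = 2 levels.
feature_count = n_hetwave(3, 2, 2)
```

With the study settings this gives $(3\times 2)(4\cdot 2+1)=6\times 9=54$, matching the stated length of the feature vector.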

C. Dataset

Breast DCE-MRI sagittal scans of 56 women diagnosed with invasive breast cancer were collected at our institution during 2007–2010, and retrospectively analyzed per HIPAA and IRB approval. These women had estrogen receptor positive and node negative tumors. The women were imaged prone in a 1.5T scanner (GE LX echo, GE Healthcare, or Siemens Sonata, Siemens); matrix size: 512 × 512; slice thickness: 2.4–4.4 mm; flip angle: 25° or 30°, and T1-weighted. The ages of the women ranged from 37 to 74 years with a mean age of 55.5 years. The images were obtained before and after the administration of gadodiamide (Omniscan) or gadobenate dimeglumine (MultiHance) contrast agent at intervals of 5, 8, and 11 min for three post-contrast time points. For Siemens machines, the repetition time (TR) and echo time (TE) are 14.6 and 3.5 ms, respectively, and the contrast dose ranged from 8 to 18 mL of MultiHance. For GE, TR = 7.5 ms and TE = 1.6 ms, and the contrast dose was 20 mL of Omniscan. All tumors had undergone molecular profiling per standard clinical protocol with a validated gene expression assay (Oncotype DX, Genomic Health Inc) [32]. The assay calculates the risk of breast cancer recurrence by measuring RNA expression of 21 genes from formalin-fixed paraffin-embedded tumor tissue samples [32]. The outcome is a continuous score that predicts the likelihood of breast cancer recurrence within ten years after treatment (risk: low < 18, 18 ≤ medium < 30, high ≥ 30) [32]. We used this validated assay as a surrogate for long-term recurrence outcomes, and evaluated our extracted features for the classification of the different recurrence risk categories, where we considered a score greater than 30 as high risk, and any score less than or equal to 30 as low/medium risk for recurrence. Of the 56 total patients in our study, 27, 19, and 10 are at low, medium, and high risk of recurrence, respectively.
For our analysis, the most representative slice of each tumor was identified by a breast imaging radiologist and manually segmented using the validated ITK-SNAP software [33].
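The score thresholds quoted above translate directly into labels. The sketch below uses hypothetical function names and follows the text literally, so a score of exactly 30 falls in the assay's high category but in the low/medium class of the binary rule.

```python
def risk_category(score):
    """Three-level assay category: low < 18, 18 <= medium < 30, high >= 30."""
    if score < 18:
        return "low"
    if score < 30:
        return "medium"
    return "high"


def binary_label(score):
    """Binary target as stated in the text: 1 (high risk) if score > 30,
    otherwise 0 (low/medium risk)."""
    return 1 if score > 30 else 0
```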

D. Classification Experiments

We use a machine learning algorithm to evaluate the extracted HetWave features as predictors of recurrence compared with the gene expression assay. We apply GA-Wrapper feature selection to reduce the feature dimensionality and find the optimal features. Finally, the selected features are applied to a logistic regression classifier to classify tumor recurrence risk. Fig. 3 shows a block diagram of the machine learning algorithm.

Fig. 3.

Block diagram of the HetWave feature extraction based on tumor heterogeneity partitioning and recurrence risk classification.

We compare our HetWave features against other established DCE-MRI features used in the literature for prognostic assessment including standard kinetic, textural, and morphological features [5], [6], [9], [14], [34]. Briefly, standard kinetic features are obtained which depict the relative enhancement plotted as a function of time for a particular pixel or a representative group of pixels (i.e., “hot-spot”). These features have been previously described in detail in [5], [6], and [14]. Textural features have also been used to characterize heterogeneity at a more global level within the tumor and are typically based on the computation of the gray level co-occurrence matrix per the Haralick method [9]. The morphological and geometric features of the tumors were also calculated as described previously [34]. A summary of all the standard features implemented for comparison is given in Table I.

TABLE I.

DCE-MRI Standard Features Used for Comparison

| Category | Feature |
| --- | --- |
| Kinetic [5], [6], [14] | Peak enhancement (PE) |
| | Time-to-peak (TTP) |
| | Wash-in-slope (WIS) |
| | Washout rate (WOS) |
| | Curve shape index (CSI) |
| | Enhancement at first post-contrast image (EFP) |
| | Enhancement ratio (ER) |
| | Maximal variance in uptake (MVU) |
| | Variance in time to peak (VTTP) |
| | Variance in uptake rate (VUTP) |
| | Variance in washout rate speed (VWOS) |
| Textural [9], [35] | Contrast |
| | Correlation |
| | Energy |
| | Homogeneity |
| | Entropy |
| | Variance |
| | Sum average |
| | Sum variance |
| | Sum entropy |
| | Difference in variance (DV) |
| | Difference in entropy (DE) |
| | Information measure of correlation 1 (IMC1) |
| | Information measure of correlation 2 (IMC2) |
| | Maximal correlation coefficient (MCC) |
| Morphologic [34] | Size |
| | Circularity |
| | Irregularity |
| | Margin sharpness (mean gradient at margin) |
| | Variance in margin sharpness (VMS) |
| | Variance in radial gradient histogram (VRGH) |

In addition to standard DCE-MRI features previously used in prognostic assessment, we also compare our results to the ICA approach, which is a recently proposed method to capture heterogeneity in imaging data [12], [13]. The method has primarily been used in diagnostic assessment to differentiate between malignant and benign tumors [13]. In this approach, voxel-wise heterogeneity of the tumor enhancement is analyzed by ICA, and features such as the enhancement curves of the different tissue types are obtained. In other words, ICA is applied to the DCE-MRI images to unmix the enhancement of every voxel into the amount of enhancement contributed by each tissue type present in the voxel (spatial ICA). We assume $X = AS$, where $X$ is a matrix in which the rows are the pre- and post-contrast subtraction images (subtraction from the pre-contrast), and the columns are the voxels. $S$ is a matrix of $n$ independent components (ICs) corresponding to different tissue types (the rows are the $n$ ICs and the columns are the voxels). $A$ is the mixing matrix, in which the number of rows equals the number of pre- and post-subtraction images, and the number of columns equals the number of ICs. The enhancement curve for each tissue type is obtained from this mixing matrix $A$ (each column of $A$ gives the enhancement curve for the corresponding tissue type). The column of $A$ with the maximum sum is identified as the strongest enhancing curve, and is also considered the tumor enhancement curve [12], [13]. The elements of this column (vector elements of the strongest enhancing curve) are used as features for our classifier.
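The final selection step of the ICA comparison is simple to state in code. The sketch assumes the mixing matrix $A$ has already been estimated by some ICA routine (not shown) and is given as a list of rows; the function name is hypothetical.

```python
def strongest_enhancement_curve(mixing):
    """Return the column of the mixing matrix A with the largest sum.

    `mixing` has one row per subtraction image (time point) and one
    column per independent component (tissue type).
    """
    n_ics = len(mixing[0])
    column_sums = [sum(row[j] for row in mixing) for j in range(n_ics)]
    best = max(range(n_ics), key=lambda j: column_sums[j])
    return [row[best] for row in mixing]
```

The returned vector has one element per subtraction image, and these elements serve as the ICA feature vector for the classifier.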

The GA-wrapper technique, which is a combination of the wrapper method and GA feature subset generation, was applied to the extracted features [36], [37]. Wrapper techniques use a predictive model to score feature subsets [38]. The concept of GA is taken from computational studies of Darwinian evolution and the natural selection process. The algorithm iteratively changes a population of individuals by probabilistically selecting individuals from the current population as parents. In each generation, the fitness of every individual in the population is evaluated by a fitness function. By applying crossover and mutation rules to parents, the children for the next generation are produced as a new population. Therefore, GA is a heuristic search that globally aims to find a feature subset which maximizes a fitness function by using inheritance, mutation, selection, and crossover processes [39]. In our study, the fitness function is the area under the curve (AUC) of the receiver operating characteristic (ROC) of the training data. The GA default parameter settings include a population size (number of subsets in every generation) of 100, a crossover rate of 0.8, a mutation rate of 0.05, a termination tolerance on the fitness function value of 1e–6, a termination tolerance on constraints of 1e–6, and a stall limit of 50 generations, i.e., the number of generations over which the cumulative change in the fitness function value must stay below 1e–6 (these are the default parameters of the MATLAB optimization toolbox [40]). We also used the scattered and the uniform functions as the crossover and mutation functions, respectively [40].
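A compact sketch of a GA over binary feature masks is given below. It is not the MATLAB toolbox algorithm: truncation selection, single-individual elitism, bit-flip mutation, and a fixed generation budget are simplifying assumptions, and `ga_select` is a hypothetical name. The population size, crossover rate, and mutation rate defaults mirror the values quoted above; the `fitness` callable stands in for the training-set ROC AUC.

```python
import random


def ga_select(n_features, fitness, pop_size=100, crossover_rate=0.8,
              mutation_rate=0.05, generations=50, seed=0):
    """Search for a 0/1 feature mask that maximizes `fitness(mask)`."""
    rng = random.Random(seed)
    pop = [[rng.randint(0, 1) for _ in range(n_features)]
           for _ in range(pop_size)]
    best = max(pop, key=fitness)
    for _ in range(generations):
        scored = sorted(pop, key=fitness, reverse=True)
        if fitness(scored[0]) > fitness(best):
            best = scored[0]
        parents = scored[: max(2, pop_size // 2)]  # truncation selection (assumption)
        children = [best[:]]                       # elitism: keep the best mask
        while len(children) < pop_size:
            a, b = rng.choice(parents), rng.choice(parents)
            if rng.random() < crossover_rate:
                # Scattered crossover: a random binary mask picks each gene's parent.
                child = [x if rng.random() < 0.5 else y for x, y in zip(a, b)]
            else:
                child = a[:]
            # Bit-flip mutation applied independently to each gene.
            child = [1 - g if rng.random() < mutation_rate else g for g in child]
            children.append(child)
        pop = children
    final_best = max(pop, key=fitness)
    return final_best if fitness(final_best) > fitness(best) else best
```

A bound on the number of selected features, such as the 7 to 10 range used in the study, could be enforced inside `fitness`, for example by returning a very low score for masks outside that range.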

The GA-wrapper feature selection was performed in each leave-one-out (LOO) cross-validation loop, using only the training set. The GA-wrapper process stops when a stopping criterion is satisfied. Based on the previous parameters, the algorithm stopped when the average relative change in the best fitness function value over 50 generations was less than 1e–6 [40]. We varied the dimensionality (i.e., number of features allowed in the GA) from $K=7$ up to 10. Therefore, the number of selected features varies from 7 to 10 for each LOO loop. We used a maximum of ten features to prevent possible overfitting [41]. Fewer than 7 features would result in loss of valuable information, and this range gives the feature subset with the best ROC AUC in each LOO loop.
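The key point of the protocol is that feature selection happens inside each LOO fold and never sees the held-out case. The sketch below shows that structure together with a rank-based AUC. The selector and the nearest-centroid scorer are toy stand-ins (assumptions) for the GA wrapper and the logistic regression classifier, and all function names are hypothetical.

```python
def roc_auc(scores, labels):
    """AUC as the fraction of (positive, negative) pairs ranked correctly,
    counting ties as one half."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes are required")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def leave_one_out_scores(X, y, select_features, fit, predict):
    """Score each case with a model trained on all the other cases."""
    scores = []
    for i in range(len(X)):
        train_X = X[:i] + X[i + 1:]
        train_y = y[:i] + y[i + 1:]
        cols = select_features(train_X, train_y)   # training data only
        model = fit([[row[c] for c in cols] for row in train_X], train_y)
        scores.append(predict(model, [X[i][c] for c in cols]))
    return scores


# Toy stand-ins: keep every column and score by distance to class centroids.
def select_all(train_X, train_y):
    return list(range(len(train_X[0])))


def fit_centroids(train_X, train_y):
    def centroid(label):
        rows = [r for r, t in zip(train_X, train_y) if t == label]
        return [sum(col) / len(rows) for col in zip(*rows)]
    return centroid(0), centroid(1)


def predict_centroid(model, x):
    neg, pos = model
    dist = lambda a, b: sum((u - v) ** 2 for u, v in zip(a, b)) ** 0.5
    return dist(x, neg) - dist(x, pos)   # larger means closer to the positive class
```

Pooling the held-out scores from all folds and passing them to `roc_auc` gives a single cross-validated AUC.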

A heat-map of unsupervised hierarchical clustering [42] on the most selected features was obtained to visualize the pattern of the most frequently selected features, and to also group the tumors based on their intrinsic DCE-MRI feature pattern. Note that a heat-map displays the numerical values of the features in color instead of numbers for the purpose of more intuitive visualization. The result of the hierarchical clustering is shown as a dendrogram in the heat map.

III. Results

To find the optimal wavelet kinetic features, we computed and compared the ROC AUCs of the HetWave features with different wavelet families such as “Daubechies-Db2,” “Daubechies-Db4,” “Coiflets 1,” “Symlets 2,” and “Biorthogonal 1.1.” The results show that “db2” and “Biorthogonal 1.1” have the same performance, superior to that of the other wavelet families (see Table II).

TABLE II.

Comparing the Performance of Different Wavelet Families Used to Extract the HetWave Features

| Wavelet type | AUC |
| --- | --- |
| Daubechies-Db2 (Haar) | 0.88 |
| Daubechies-Db4 | 0.80 |
| Coiflets 1 | 0.78 |
| Symlets 2 | 0.85 |
| Biorthogonal 1.1 | 0.88 |

We compared the ROC of the standard features, HetWave, and their combination. The result shows ROC AUCs of 0.70, 0.88, and 0.94 for the standard features, HetWave, and their combination, respectively (see Fig. 4).

Fig. 4.

Classifier performance using ROC curves of HetWave features, standard features, and their combination based on PK partitioning.

Note that, so that the standard features also benefit from the PK heterogeneity partitioning, the kinetic and textural features are extracted from each tumor partition separately when standard and HetWave features are combined. The morphological features are calculated from the entire tumor shape.

In addition, the GA-wrapper feature selection technique is similarly employed for obtaining the ROC graphs of the standard features, HetWave, and the combination of HetWave and standard features. In each case, we bounded the number of selected features between 7 and 10.

The most frequently selected features when the HetWave features are used in combination with the standard features for the PK heterogeneity partitioning scheme include Enhancement Ratio in partition 3 (ER3), Curve Shape Index in partition 3 (CSI3), HetWave of the Variance of Vertical Wavelet Detail Coefficients of DP2 in Partition 2 (HetWave_Var_cV_DP2_P2), and HetWave of the Mean of Diagonal Wavelet Detail Coefficients of DP1 in Partition 3 (HetWave_Mean_cD_DP1_P3), which are selected in 78%, 58%, 42%, and 25% of the LOO loops, respectively. We should also note that at least two and up to seven HetWave features are always selected in each LOO loop. Fig. 5 shows the box plots of these most frequently selected features along with the corresponding p-value of the nonparametric Wilcoxon rank sum test.

Fig. 5.

Box plot of the most frequently selected features when the HetWave features are used in combination with standard features.

The heat-map of unsupervised hierarchical clustering with the average linkage method [43] for the seven most selected features of the HetWave and standard feature combination is depicted in Fig. 6. The Pearson correlation was used during clustering as the distance metric for computing pairwise distances between rows and similarly for columns. This figure visualizes the intrinsic patterns of the selected features. Each cell is a color-coded representation of normalized feature values, where red and green represent the higher and lower end of their distributions, respectively, and black represents a feature value close to its mean. The patient numbers and the corresponding gene expression assay scores and recurrence risk categories are shown at the bottom of the heat-map.

Fig. 6.

Heat-map showing intrinsic phenotypic heterogeneity patterns with rows representing the most frequently selected DCE-MRI features based on PK heterogeneity partitioning and columns representing tumors. The corresponding gene expression scores are shown in the colorbar.
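For reference, a correlation-based distance between two feature (or tumor) profiles can be computed as below. Defining the distance as one minus the Pearson correlation coefficient is an assumption about how the metric was applied, and the function name is hypothetical.

```python
import math


def pearson_distance(a, b):
    """1 - Pearson correlation between two equal-length profiles.

    Ranges from 0 (perfectly correlated) to 2 (perfectly anti-correlated).
    """
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    norm_a = math.sqrt(sum((x - mean_a) ** 2 for x in a))
    norm_b = math.sqrt(sum((y - mean_b) ** 2 for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("correlation is undefined for a constant profile")
    return 1.0 - cov / (norm_a * norm_b)
```

Average linkage then merges, at each step, the two clusters with the smallest mean pairwise distance between their members.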

The ROC comparison of HetWave and ICA-based features is shown in Fig. 7. An AUC of 0.72 is obtained by the ICA approach for classifying tumors into high and low recurrence risk. We should note, however, that the HetWave approach differs in principle from the ICA approach in how it captures heterogeneity. In HetWave, the tumors are first partitioned based on certain heterogeneity measures (PK parameters), and then the features are extracted from each partition. In other words, HetWave identifies the heterogeneous areas by partitioning the tumors, while in the ICA method no such areas are identified, and the features are extracted directly by applying ICA to the voxel-wise enhancing curves. Therefore, we cannot combine the ICA features with HetWave features by using the partitioning approach as we did for the comparison to the standard features.

Fig. 7.

Classifier performance using ROC curves of HetWave and ICA features.

IV. Discussion

Overall the superior performance of the HetWave features when compared to other commonly used features suggests that the rate of the spatial frequency pattern (wavelet coefficients) over the PK partitions can provide valuable prognostic information, not captured by established DCE-MRI features.

In addition, the combination of HetWave and standard features provides an improvement in AUC compared to the HetWave or the standard features alone. Thus, it can be inferred that the HetWave and the standard features each provide valuable, and mutually complementary, prognostic information.

In the PK partitioning, the pixels with a similar volume transfer constant and plasma volume are grouped together. In other words, the pixels in a PK partition behave similarly in terms of how the tissue absorbs and distributes the contrast agent. Since the most selected features were extracted from partitions 2 and 3, which have a lower maximum post-contrast uptake than partition 1, it can be inferred that the dynamic patterns of contrast agent distribution and absorption in the tissue types of these partitions are more related to the recurrence risk of the tumors.
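To make the idea of grouping pixels by PK parameters concrete, the following sketch clusters hypothetical (volume transfer constant, plasma volume) pairs into three partitions. Plain k-means is used here purely as an assumed stand-in, since this section does not restate the clustering algorithm used in the study; the pixel values, the function name, and the deterministic initialization are likewise illustrative.

```python
def kmeans(points, k, iters=50):
    """Plain Lloyd's k-means on 2-D points with a deterministic start."""
    ordered = sorted(points)
    # Deterministic initialization: k points spread over the sorted list.
    centers = [ordered[(2 * i + 1) * len(ordered) // (2 * k)] for i in range(k)]
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: nearest center by squared Euclidean distance.
        labels = [min(range(k), key=lambda c: (p[0] - centers[c][0]) ** 2
                                            + (p[1] - centers[c][1]) ** 2)
                  for p in points]
        # Update step: move each center to the mean of its members.
        new_centers = []
        for c in range(k):
            members = [p for p, l in zip(points, labels) if l == c]
            if members:
                new_centers.append((sum(m[0] for m in members) / len(members),
                                    sum(m[1] for m in members) / len(members)))
            else:
                new_centers.append(centers[c])
        if new_centers == centers:
            break
        centers = new_centers
    return labels, centers

# Hypothetical per-pixel PK estimates: (transfer constant, plasma volume).
pixels = [
    (0.10, 0.02), (0.12, 0.03), (0.11, 0.02),
    (0.50, 0.10), (0.52, 0.11), (0.48, 0.09),
    (0.90, 0.20), (0.92, 0.22), (0.88, 0.21),
]
partition_labels, partition_centers = kmeans(pixels, k=3)
```

In practice the two parameters would likely need to be rescaled to comparable ranges before clustering, and the label numbers produced here carry no correspondence to the partition numbering used in the text.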

When combining the HetWave and standard features, the ER and CSI in partition 3 are among the most selected features. That is, the ratio of the initial enhancement (absorption) to the average enhancement, and the change between the initial and the last enhancement in partition 3, may provide important information about tumor recurrence risk. In other words, the pattern of the initial agent absorption in partition 3 might be more closely connected to tumor recurrence risk.

In addition, among the HetWave features, the SER of the high frequency coefficient variance in partition 2 (HetWave_Var_cV_DP2_P2) and the high frequency coefficient mean in partition 3 (HetWave_Mean_cD_DP1_P3) are among the most selected features; this might suggest that the dynamic pattern of spatial variation in agent absorption in these partitions is more related to tumor recurrence risk.

The visualization in Fig. 6 can be interpreted at two levels: the dendrogram of the selected features, and their corresponding pattern visualized by the heat-map. In the dendrogram interpretation, there are two dominant imaging phenotypes within our study population. Phenotype 1 (blue clusters) consists only of tumors with gene expression scores less than or equal to 30, that is, only low and medium recurrence risk tumors, while Phenotype 2 (red clusters) consists mostly of tumors with a gene expression score greater than 30. That is, if we group the tumors into two clusters based on these selected features (i.e., phenotypes), one group consists only of low/medium recurrence risk tumors. The pattern visualized by the heat-map provides important insight into each feature’s ability to discriminate tumors at low versus high risk of recurrence. For the CSI 2 and CSI 3 features, the green pattern (i.e., low feature values) is more related to high risk tumors and the red pattern (i.e., high feature values) is related to low breast cancer recurrence risk. That is, a small (or, respectively, a large) change between the initial and the last enhancement in partitions 2 and 3 is more associated with high (versus low) recurrence risk tumors. For the ER3, EFP2, and HetWave_Var_cV_DP2_P2 features, the green pattern is more representative of low risk tumors while red is more related to tumors at a high risk for recurrence. That is to say, a low (versus high) ratio of the initial enhancement (absorption) to the average tumor enhancement, and a low (versus high) rate of the dynamic pattern of spatial variation in agent absorption in partition 2, are more related to a low (versus high) risk of recurrence.
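The dendrogram-level reading above amounts to cross-tabulating cluster membership against the gene expression score threshold of 30. A minimal sketch of that tabulation follows; the cluster assignments and scores are invented for illustration and do not reproduce the study population.

```python
from collections import Counter

def crosstab(cluster_ids, scores, threshold=30):
    """Count tumors per (cluster, score group) pair."""
    table = Counter()
    for cid, score in zip(cluster_ids, scores):
        group = "<=30" if score <= threshold else ">30"
        table[(cid, group)] += 1
    return table

# Hypothetical phenotype assignments (1 or 2) and gene expression scores.
cluster_ids = [1, 1, 1, 2, 2, 2, 2]
scores = [12, 25, 30, 34, 41, 22, 55]

table = crosstab(cluster_ids, scores)
for key in sorted(table):
    print(key, table[key])
```

In this toy example, cluster 1 contains only tumors with scores of 30 or less, while cluster 2 is mostly, but not exclusively, made up of tumors scoring above 30, which mirrors the kind of pattern described for the two phenotypes.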

There are also some limitations to be noted in our study. First, feature extraction was performed using only the single most representative slice of the tumor (as deemed by an experienced radiologist). Future work will need to extend the proposed features to the entire tumor volume, to potentially obtain further improvement in performance. In addition, some parameters in our study were empirically determined. For example, we considered three heterogeneity partitions based on the rationale of the common cancer subtypes (Basal, Luminal A, and Luminal B) found within breast tumors. Our goal was to attempt to capture phenotypic heterogeneity via imaging using representative PK features. In this regard, we aimed to capture the heterogeneity of contrast agent absorption and distribution across tissue types based on PK features (volume transfer constant and plasma volume). This is a first step to establish proof of concept. While our results are promising, future work will also include extracting and combining features from other types of heterogeneity [44], optimizing the partitioning approach, and using larger datasets of DCE-MRI tumors to better characterize the tumors and prognostic biomarkers. We should note that we cannot validate the heterogeneity partitioning due to the lack of ground-truth data for the biological interpretation of the heterogeneity observed in our study. As such, we can only examine whether the obtained partitions, which are based on imaging phenotypic heterogeneity, classify tumor recurrence risk better than when no partitioning is used.

We ultimately aim to merge imaging biomarkers with histopathology and multigene assays such as Oncotype DX. We believe that imaging and gene expression assays could provide complementary information for recurrence risk classification. As a first step, we evaluate the prognostic value of our HetWave features using Oncotype DX as a validated surrogate for recurrence. Given the promising results obtained by our method, larger studies are warranted to prospectively validate the prognostic value of our features using true recurrence outcomes based on patient follow-up, and in conjunction with the gene expression assay.

V. Summary

We propose a feature extraction method for characterizing PK intratumor heterogeneity in breast DCE-MRI using spatiotemporal wavelet kinetic features. The proposed HetWave features aim to capture tumor heterogeneity by first partitioning the tumor into locally heterogeneous subregions, and then characterizing spatiotemporal patterns of the contrast agent uptake and its spatial frequency information using wavelet coefficients. Pharmacokinetic-based heterogeneity partitioning was evaluated for extracting the HetWave features. A wrapper-based feature selection method using a genetic algorithm and a logistic regression classifier with leave-one-out (LOO) cross validation were used to evaluate the prognostic value of the proposed features in classifying breast cancer recurrence risk as determined by a widely validated gene expression assay. HetWave features provide superior ROC AUC when compared to a wide range of currently established, standard breast DCE-MRI features. The combination of HetWave and standard features can give further classification improvement. This suggests that HetWave could be a powerful feature extraction approach for characterizing tumor heterogeneity, providing valuable prognostic information.
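The evaluation scheme summarized above (a logistic regression classifier scored under LOO cross validation) can be sketched as follows. This is an illustrative outline only: the gradient-descent fitting routine, the learning rate, and the two-feature toy data are assumptions, and the genetic algorithm feature selection step is omitted.

```python
import math

def sigmoid(z):
    # Clamp the input to avoid overflow in math.exp for extreme values.
    z = max(-30.0, min(30.0, z))
    return 1.0 / (1.0 + math.exp(-z))

def fit_logistic(X, y, lr=0.1, epochs=2000):
    """Fit weights and bias by batch gradient descent on the log loss."""
    n_features = len(X[0])
    w, b = [0.0] * n_features, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * n_features, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j in range(n_features):
                grad_w[j] += err * xi[j]
            grad_b += err
        w = [wj - lr * g / len(X) for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / len(X)
    return w, b

def leave_one_out_scores(X, y):
    """Score each case with a model trained on all the other cases."""
    scores = []
    for i in range(len(X)):
        X_train = X[:i] + X[i + 1:]
        y_train = y[:i] + y[i + 1:]
        w, b = fit_logistic(X_train, y_train)
        scores.append(sigmoid(sum(wj * xj for wj, xj in zip(w, X[i])) + b))
    return scores

# Hypothetical, well-separated two-feature data (0 = low risk, 1 = high risk).
X = [[-2.0, -1.8], [-1.6, -2.2], [-2.4, -1.5], [-1.9, -2.1],
     [2.1, 1.7], [1.7, 2.3], [2.3, 1.9], [1.8, 2.0]]
y = [0, 0, 0, 0, 1, 1, 1, 1]

loo_scores = leave_one_out_scores(X, y)
```

Each held-out score is produced by a model that never saw that case, so the resulting score vector can be passed to an ROC analysis without the optimistic bias of scoring on the training data.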

Acknowledgments

This work was supported by funding from the Translational Centers of Excellence program at the Abramson Cancer Center of the University of Pennsylvania and the Institute of Translational Medicine and Therapeutics Transdisciplinary Program in Translational Medicine and Therapeutics under Grant UL1RR024134 from the National Center for Research Resources.

Contributor Information

Majid Mahrooghy, Computational Breast Imaging Group, Department of Radiology, University of Pennsylvania.

Ahmed B. Ashraf, Computational Breast Imaging Group, Department of Radiology, University of Pennsylvania.

Dania Daye, Computational Breast Imaging Group, Department of Radiology, University of Pennsylvania.

Elizabeth S. McDonald, Computational Breast Imaging Group, Department of Radiology, University of Pennsylvania.

Mark Rosen, Computational Breast Imaging Group, Department of Radiology, University of Pennsylvania.

Carolyn Mies, Department of Pathology and Laboratory Medicine, University of Pennsylvania.

Michael Feldman, Department of Pathology and Laboratory Medicine, University of Pennsylvania.

Despina Kontos, Computational Breast Imaging Group, Department of Radiology, University of Pennsylvania, Philadelphia, PA 19104 USA.

References

  • [1] Allison KH and Sledge GW Jr., “Heterogeneity and cancer,” Oncology, vol. 28, pp. 772–778, Sep. 15, 2014.
  • [2] van ‘t Veer LJ et al., “Gene expression profiling predicts clinical outcome of breast cancer,” Nature, vol. 415, pp. 530–536, Jan. 31, 2002.
  • [3] McDonald DM and Choyke PL, “Imaging of angiogenesis: From microscope to clinic,” Nat. Med., vol. 9, pp. 713–725, Jun. 2003.
  • [4] Chen L et al., “Tissue-specific compartmental analysis for dynamic contrast-enhanced MR imaging of complex tumors,” IEEE Trans. Med. Imaging, vol. 30, no. 12, pp. 2044–2058, Dec. 2011.
  • [5] Bhooshan N et al., “Cancerous breast lesions on dynamic contrast-enhanced MR images: Computerized characterization for image-based prognostic markers,” Radiology, vol. 254, pp. 680–690, 2010.
  • [6] Chen W et al., “Computerized interpretation of breast MRI: Investigation of enhancement-variance dynamics,” Med. Phys., vol. 31, pp. 1076–1082, 2004.
  • [7] Meinel LA et al., “Breast MRI lesion classification: Improved performance of human readers with a backpropagation neural network computer-aided diagnosis (CAD) system,” J. Magn. Reson. Imaging, vol. 25, pp. 89–95, 2007.
  • [8] Ashraf A et al., “A multichannel Markov random field framework for tumor segmentation with an application to classification of gene expression-based breast cancer recurrence risk,” IEEE Trans. Med. Imaging, vol. 32, no. 4, pp. 637–648, Apr. 2013.
  • [9] Gibbs P and Turnbull LW, “Textural analysis of contrast-enhanced MR images of the breast,” Magn. Reson. Med., vol. 50, pp. 92–98, 2003.
  • [10] Levman J et al., “Classification of dynamic contrast-enhanced magnetic resonance breast lesions by support vector machines,” IEEE Trans. Med. Imaging, vol. 27, no. 5, pp. 688–696, May 2008.
  • [11] Agliozzo S et al., “Computer-aided diagnosis for dynamic contrast-enhanced breast MRI of mass-like lesions using a multiparametric model combining a selection of morphological, kinetic, and spatiotemporal features,” Med. Phys., vol. 39, pp. 1704–1715, Apr. 2012.
  • [12] Koh TS et al., “Independent component analysis of dynamic contrast-enhanced magnetic resonance images of breast carcinoma: A feasibility study,” J. Magn. Reson. Imaging, vol. 28, pp. 271–277, Jul. 2008.
  • [13] Goebl S et al., “Segmentation and kinetic analysis of breast lesions in DCE-MR imaging using ICA,” in Proc. Int. Conf. Inf. Technol. Bio-Med. Informat., 2014, vol. 8649, pp. 45–59.
  • [14] Chen W et al., “Automatic identification and classification of characteristic kinetic curves of breast lesions on DCE-MRI,” Med. Phys., vol. 33, pp. 2878–2887, 2006.
  • [15] Parikh J et al., “Changes in primary breast cancer heterogeneity may augment midtreatment MR imaging assessment of response to neoadjuvant chemotherapy,” Radiology, vol. 272, pp. 100–112, Jul. 2014.
  • [16] Li X et al., “Analyzing spatial heterogeneity in DCE- and DW-MRI parametric maps to optimize prediction of pathologic response to neoadjuvant chemotherapy in breast cancer,” Transl. Oncol., vol. 7, pp. 14–22, Feb. 2014.
  • [17] Mahrooghy M et al., “Heterogeneity wavelet kinetics from DCE-MRI for classifying gene expression based breast cancer recurrence risk,” in Proc. Med. Image Comput. Comput.-Assisted Intervention Conf., 2013, vol. 8150, pp. 295–302.
  • [18] Ashraf AB et al., “Identification of intrinsic imaging phenotypes for breast cancer tumors: Preliminary associations with gene expression profiles,” Radiology, vol. 272, pp. 374–384, Apr. 4, 2014.
  • [19] Yankeelov TE et al., “Quantitative pharmacokinetic analysis of DCE-MRI data without an arterial input function: A reference region model,” Magn. Reson. Imaging, vol. 23, pp. 519–529, May 2005.
  • [20] Riabkov DY and Di Bella EVR, “Estimation of kinetic parameters without input functions: Analysis of three methods for multichannel blind identification,” IEEE Trans. Biomed. Eng., vol. 49, no. 11, pp. 1318–1327, Nov. 2002.
  • [21] Zhu XP et al., “Quantification of endothelial permeability, leakage space, and blood volume in brain tumors using combined T1 and T2* contrast-enhanced dynamic MR imaging,” J. Magn. Reson. Imaging, vol. 11, pp. 575–585, Jun. 2000.
  • [22] Kelm BM et al., “Estimating kinetic parameter maps from dynamic contrast-enhanced MRI using spatial prior knowledge,” IEEE Trans. Med. Imaging, vol. 28, no. 10, pp. 1534–1547, Oct. 2009.
  • [23] Tofts PS et al., “Estimating kinetic parameters from dynamic contrast-enhanced T(1)-weighted MRI of a diffusable tracer: Standardized quantities and symbols,” J. Magn. Reson. Imaging, vol. 10, pp. 223–232, Sep. 1999.
  • [24] Li KL et al., “Heterogeneity in the angiogenic response of a BT474 human breast cancer to a novel vascular endothelial growth factor-receptor tyrosine kinase inhibitor: Assessment by voxel analysis of dynamic contrast-enhanced MRI,” J. Magn. Reson. Imaging, vol. 22, pp. 511–519, Oct. 2005.
  • [25] Schmid VJ et al., “Bayesian methods for pharmacokinetic models in dynamic contrast-enhanced magnetic resonance imaging,” IEEE Trans. Med. Imaging, vol. 25, no. 12, pp. 1627–1636, Dec. 2006.
  • [26] Zhou Y et al., “A modeling-based factor extraction method for determining spatial heterogeneity of Ga-68 EDTA kinetics in brain tumors,” IEEE Trans. Nucl. Sci., vol. 44, no. 6, pp. 2522–2527, Dec. 1997.
  • [27] Wang Y et al., “Modeling and reconstruction of mixed functional and molecular patterns,” Int. J. Biomed. Imaging, vol. 2006, pp. 29707-1-29707-9, 2006.
  • [28] Bezdek JC and Pal SK, Fuzzy Models for Pattern Recognition. New York, NY, USA: IEEE Press, 1992.
  • [29] Polyak K, “Heterogeneity in breast cancer,” J. Clin. Invest., vol. 121, pp. 3786–3788, Oct. 2011.
  • [30] Gonzalez R and Woods RE, Digital Image Processing, 3rd ed. Englewood Cliffs, NJ, USA: Prentice-Hall, 2007.
  • [31] Hylton N, “MR imaging for assessment of breast cancer response to neoadjuvant chemotherapy,” Magn. Reson. Imaging Clin. North Amer., vol. 14, pp. 383–389, 2006.
  • [32] Paik S et al., “Gene expression and benefit of chemotherapy in women with node-negative, estrogen receptor-positive breast cancer,” J. Clin. Oncol., vol. 24, pp. 3726–3734, 2006.
  • [33] Yushkevich PA et al., “User-guided 3D active contour segmentation of anatomical structures: Significantly improved efficiency and reliability,” NeuroImage, vol. 31, pp. 1116–1128, 2006.
  • [34] Gilhuijs KG et al., “Computerized analysis of breast lesions in three dimensions using dynamic magnetic-resonance imaging,” Med. Phys., vol. 25, pp. 1647–1654, Sep. 1998.
  • [35] Goldhirsch A et al., “Thresholds for therapies: Highlights of the St Gallen International Expert Consensus on the primary therapy of early breast cancer 2009,” Ann. Oncol., vol. 20, pp. 1319–1329, Aug. 2009.
  • [36] Yang J, “Feature subset selection using a genetic algorithm,” IEEE Intell. Syst. Appl., vol. 13, no. 2, pp. 44–49, Mar./Apr. 1998.
  • [37] Huang JJ et al., “A hybrid genetic algorithm for feature selection wrapper based on mutual information,” Pattern Recog. Lett., vol. 28, pp. 1825–1844, Oct. 1, 2007.
  • [38] Kohavi R and John GH, “Wrappers for feature subset selection,” Artif. Intell., vol. 97, pp. 273–324, Dec. 1997.
  • [39] Goldberg D, Genetic Algorithms in Search, Optimization, and Machine Learning. Boston, MA, USA: Addison-Wesley, 1989.
  • [40] The MathWorks, “Genetic algorithm and direct search toolbox for use with MATLAB,” 2004.
  • [41] Trunk GV, “A problem of dimensionality: A simple example,” IEEE Trans. Pattern Anal. Mach. Intell., vol. PAMI-1, no. 3, pp. 306–307, Jul. 1979.
  • [42] Eisen MB et al., “Cluster analysis and display of genome-wide expression patterns,” Proc. Nat. Acad. Sci. USA, vol. 95, pp. 14863–14868, Dec. 8, 1998.
  • [43] Hastie T et al., The Elements of Statistical Learning: Data Mining, Inference, and Prediction, 2nd ed. New York, NY, USA: Springer, 2009.
  • [44] Bhatia S et al., “The challenges posed by cancer heterogeneity,” Nature Biotechnol., vol. 30, pp. 604–610, Jul. 2012.
