Medical Physics. 2010 Mar 29;37(4):1840–1849. doi: 10.1118/1.3314075

Computer aided automatic detection of malignant lesions in diffuse optical mammography

David R Busch 1,a), Wensheng Guo 2, Regine Choe 3, Turgut Durduran 4, Michael D Feldman 5, Carolyn Mies 5, Mark A Rosen 6, Mitchell D Schnall 6, Brian J Czerniecki 7, Julia Tchou 7, Angela DeMichele 8, Mary E Putt 9, Arjun G Yodh 10
PMCID: PMC2864673  PMID: 20443506

Abstract

Purpose: Computer aided detection (CAD) data analysis procedures are introduced and applied to derive composite diffuse optical tomography (DOT) signatures of malignancy in human breast tissue. In contrast to previous optical mammography analysis schemes, the new statistical approach utilizes optical property distributions across multiple subjects and across the many voxels of each subject. The methodology is tested in a population of 35 biopsy-confirmed malignant lesions.

Methods: DOT CAD employs multiparameter, multivoxel, multisubject measurements to derive a simple function that transforms DOT images of tissue chromophores and scattering into a probability of malignancy tomogram. The formalism incorporates both intrasubject spatial heterogeneity and intersubject distributions of physiological properties derived from a population of cancer-containing breasts (the training set). A weighted combination of physiological parameters from the training set defines a malignancy parameter (M), with the weighting factors optimized by logistic regression to separate training-set cancer voxels from training-set healthy voxels. The utility of M is examined, employing 3D DOT images from an additional subject (the test set).

Results: Initial results confirm that the automated technique can produce tomograms that distinguish healthy from malignant tissue. When compared to a gold standard tissue segmentation, this protocol produced an average true positive rate (sensitivity) of 89% and a true negative rate (specificity) of 94% using an empirically chosen probability threshold.

Conclusions: This study suggests that the automated multisubject, multivoxel, multiparameter statistical analysis of diffuse optical data is potentially quite useful, producing tomograms that distinguish healthy from malignant tissue. This type of data analysis may also prove useful for suppression of image artifacts.

Keywords: diffuse optical tomography, breast cancer, breast imaging, computer aided detection, CAD, DOT

INTRODUCTION

Diffuse optics and automated cancer diagnosis

Diffuse optical spectroscopy (DOS) utilizes light in the low absorption window1 between 650 and 950 nm to measure chromophore concentrations and scattering in deep tissues. Information about concentrations of water, lipid, oxy- (HbO2), deoxy- (Hb), and total hemoglobin (Hbt), as well as blood oxygen saturation (StO2) and tissue scattering (i.e., the reduced scattering coefficient, μs), is readily derived with the optical techniques.

Diffuse optical tomography (DOT) utilizes many point measurements on the surface of tissue to reconstruct concentrations of chromophores and scattering parameters in the interior of the tissue. Images are obtained by inverting the heterogeneous diffusion equation. A recent review by Arridge and Schotland2 describes various techniques needed to perform this inversion. The results are three-dimensional (3D) maps of optical properties and chromophore concentrations. The reconstructed parameters have been correlated with the physiological signatures of tumors. For example, optically measured Hbt has been correlated with microvessel density measured by histopathology,3 and μs has been correlated with cellular volume fraction and mean size.4 Leff et al.5 recently reviewed DOT breast tumor contrasts in Hbt and StO2. Some disagreements remain in the diffuse optics community about which optically measured parameters are the most important indicators of malignancy; recent work, for example, on water6 and collagen7 has opened up additional possibilities.

While current incarnations of multiwavelength DOT provide 3D images of several physiological parameters associated with cancer metabolism and growth, unambiguous 3D maps of healthy and malignant tissue are sometimes elusive. DOT images require simultaneous interpretation of multiparameter data at each spatial point, and images sometimes exhibit significant inter- and intrasubject variation in the absolute and relative values of these parameters. Together, these factors limit DOT image analysis to skilled practitioners of the art. In this contribution, we address this issue. In particular, we introduce a novel algorithm for automated identification of malignant and healthy tissue based on a statistical analysis of diffuse optical data from a population of known cancers.

The requirement for skilled readers is not unique to optics; most clinical imaging technologies have similar constraints and various techniques for automated breast cancer detection and diagnosis have been explored to ameliorate this situation. Notably, computer-aided detection (CAD) in x-ray mammography screening relies upon high-spatial-resolution 2D intensity projections to automatically identify tumors in images based on structural features such as spiculation and microcalcification.8, 9, 10

The formalism presented herein for DOT CAD employs multiparameter, multivoxel, multisubject measurements to derive a simple function that transforms DOT images of tissue chromophores and scattering into a “probability of malignancy” tomogram. The formalism incorporates both intrasubject spatial heterogeneity and intersubject distributions of physiological properties derived from a population of cancer-containing breasts (the “training set”). We extract a weighted combination of physiological parameters from the training set to define a “malignancy parameter” (M), with the weighting factors optimized to separate training-set voxels from healthy and cancerous tissue. We then examine the utility of M, employing 3D DOT images from an additional subject (the “test set”).

Limitations of current diffuse optics analysis

To better appreciate the need for DOT CAD, consider the typical analysis scheme currently employed in the field. First, the lesions in each subject are identified, then tissue is divided into “healthy” and “lesion” regions, and average optical properties are computed for each region. Finally, these regionally averaged optical properties and differences thereof are assessed across the population. With this approach, the spatial information from DOT images is reduced to a few numbers. Such an analysis implicitly ignores the spatial heterogeneity of cancers11, 12, 13 and healthy tissues.14

Grand averages of DOT data across multiple studies suggest that malignant lesions can be differentiated from healthy tissue by Hbt.5 In recent work15 using the same data set as in the analysis to be presented herein, we performed such regional averaging from DOT images; we then demonstrated that benign and malignant lesions could be separated with a univariate analysis of the ratio of the mean lesion and healthy tissue values of Hbt, μs, and an empirically derived optical index combining Hbt, μs, and StO2. For example, this work found that the benign and malignant lesions had statistically significantly different ratios of the Hbt region means (e.g., ⟨Hbt⟩Lesion∕⟨Hbt⟩Healthy).

Volume element histograms of rHbt (Hbt∕⟨Hbt⟩Healthy) reveal more information than distribution means. Histograms of two subjects are shown in Fig. 1. Figure 1a shows an optimal subject wherein the lesion is clearly distinguishable from healthy tissue. Figure 1b shows a problematic subject; here, variations in the healthy region (which include possible image reconstruction artifacts) extend the distribution of healthy tissue rHbt into the range of cancer tissue rHbt. The first lesion [Fig. 1a] can be readily identified with the simple normalization procedures described above and a cutoff of rHbt=1.2. The same procedure and cutoff for case two [Fig. 1b], however, would miss the cancer completely; furthermore, adjustment of the cutoff to include the tumor causes incorrect assignment of some healthy regions. These observations suggest a more sophisticated analysis using all available spatial information is desirable. Additionally, simple thresholding in a single optical parameter ignores the potential utility of multiparameter data and ignores possible spatially heterogeneous signatures of cancers and healthy tissues.
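The single-parameter thresholding described above can be sketched as follows. This is a minimal illustration, not the procedure of Ref. 15: the function names and the toy Hbt values are hypothetical, and only the cutoff rHbt=1.2 comes from the text.

```python
from statistics import mean

def relative_hbt(hbt_voxels, healthy_voxels):
    """Normalize each voxel's Hbt by the mean Hbt of the healthy region (rHbt)."""
    healthy_mean = mean(healthy_voxels)
    return [v / healthy_mean for v in hbt_voxels]

def flag_lesion(rhbt_voxels, cutoff=1.2):
    """Flag voxels whose rHbt exceeds a fixed cutoff (1.2 in the text)."""
    return [r > cutoff for r in rhbt_voxels]

# Toy values (not from the study), in arbitrary concentration units.
healthy = [18.0, 20.0, 22.0]
lesion = [27.0, 30.0]
flags = flag_lesion(relative_hbt(healthy + lesion, healthy))
```

A subject like the one in Fig. 1b corresponds to healthy voxels whose rHbt already exceeds the cutoff, which is exactly where a single fixed threshold fails.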

Figure 1.

(Top row) Slices from 3D tomograms from subjects with breast cancer. Lesions are denoted by the thick black line. Histogram of rHbt for healthy (middle row) and cancer (bottom row) voxels, segmented as shown in the top row. Data are normalized as rHbt=Hbt∕⟨Hbt⟩H for each subject. The left column (a) shows a lesion with clearly separated means and distinct distributions; and the right column (b) shows a subject wherein the healthy Hbt distribution overlaps that of the cancer region. This normalization approach is standard in the “typical” DOT analysis, and the example is taken from our previous work (Ref. 15). Note that the “typical DOT” scheme shown in the figure utilizes a normalization that is different from Eq. 1.

Thus far, a few groups have applied statistical analysis techniques to multiparameter optical measurements, including applications to arthritic joints,16 high-risk17 or high mammographic density18, 19 breast tissue, and to various “endoscopic” measurements or excised tissues.20 The data sets employed thus far, however, have limited spatial information and orders of magnitude fewer measurements per subject than, for example, the breast tomograms utilized in the analysis to be presented herein. A few researchers have implemented automated image methods with DOT to identify lesions in a particular subject.21, 22, 23 This per-subject analysis, however, neglects information about the common signatures of cancer across a population. Still other researchers have pursued hypothesis-driven multiparameter optical metrics with DOS24, 25 and DOT.15, 26 Such metrics are dependent on the underlying hypotheses, however, and are often empirically chosen combinations of equally weighted parameters. Chance et al.27 explored two-parameter signatures of breast cancer, graphically identifying malignant lesions, but the separation lines were manually selected for the specific data set.

Thus, previous studies fall into two groups. Some considered spatial variation in cancers, but neglected common signatures across a population. Others considered population signatures, but used only regionally averaged measurements. Furthermore, few, if any, applied statistical optimization techniques to multiparameter optical signatures of cancer across a population. By contrast, the methods we present herein utilize data from many voxels in many subjects to statistically optimize a multiparameter probability of malignancy.

METHODS

Data set used in this analysis

Our data set consists of 3D tomograms of total hemoglobin concentration (Hbt), blood oxygen saturation (StO2), and reduced scattering coefficient (μs) in 35 biopsy-confirmed cancer-bearing breasts. DOT images from these subjects were collected with a parallel plate optical imaging system described in previous works.15, 36 Table 1 contains the demographics of the population used, separated by clinical diagnosis.

Table 1.

Demographic breakdown of cancers in this study. IDC: Invasive ductal carcinoma; DCIS: Ductal carcinoma in situ; ILC: Invasive lobular carcinoma; LCIS: Lobular carcinoma in situ; and BMI: Body mass index. Numeric data are given as mean±standard deviation. 16 subjects were premenopausal and 19 were postmenopausal.

| No. | Diagnosis | Age (yr) | BMI (kg∕m2) | Tumor size (cm3) |
|---|---|---|---|---|
| 8 | IDC | 44±11 | 27±6.2 | 2.9±1.2 |
| 2 | DCIS | 60±4.9 | 29±6.6 | 0.7±0.28 |
| 2 | ILC | 62±3.5 | 22±2 | 1.4±0.35 |
| 22 | IDC & DCIS | 49±10 | 28±7 | 1.8±0.97 |
| 1 | DCIS & LCIS | 39±0 | 19±0 | 5±0 |
| 35 | All | 49±11 | 27±6.5 | 2.1±1.2 |

The cancers in this analysis had an average volume of 6.7±5.2 cm3, corresponding to 841±656 image voxels. The average size of the entire breast was 374±231 cm3, corresponding to 4.7×10⁴±2.9×10⁴ image voxels.37 Note, for each parameter, traditional regional averaging analysis of these data, as described above, reduces these ∼5×10⁴ data points per subject to two numbers (cancer and healthy region averages). Figure 1 shows sample intrasubject spatial heterogeneity of these regions, and Fig. 2a plots the distribution of Hbt for the healthy regions of all subjects.

Figure 2.

Intrasubject data normalization brings intersubject data distribution close to a normal distribution. The left figure (a) shows absolute values of Hbt; and the right figure (b) shows the population distribution of zHbt after intrasubject normalization [see Eq. 1 and note that each subject is normalized individually]. Each trace represents the healthy region of one subject. For clarity of presentation, the vertical axis is normalized to the total number of voxels in each subject.

We demonstrate the new statistical analysis method with a leave-one-out cross-validation (e.g., as described by Hastie et al.28), in which 34 out of our 35 subjects serve as the training set and the remaining subject provides the test data. Permuting these sets, such that each subject serves as the test set once, provides 35 training∕test data combinations and enables estimation of classification accuracy. Note that gold standard segmentation of the DOT images into tumor and healthy regions is required for the training set to train the classifier (i.e., for the logistic regression model) and is required for the test set classification validation (i.e., to assess how well the classifier performed compared to the gold standard in a new data set).
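The leave-one-out permutation described above can be sketched as a simple generator. The function name and the integer subject labels are illustrative assumptions; only the counts (35 subjects, 34 in each training set) come from the text.

```python
def leave_one_out(subject_ids):
    """Yield (training_ids, test_id) pairs, holding out each subject once."""
    for i, test_id in enumerate(subject_ids):
        training_ids = subject_ids[:i] + subject_ids[i + 1:]
        yield training_ids, test_id

subjects = list(range(1, 36))  # 35 subjects, labeled 1..35 for illustration only
splits = list(leave_one_out(subjects))
```

Each of the 35 splits trains on 34 subjects and tests on the one left out, so every subject contributes exactly one set of test results.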

Both training set normalization (described below) and testing of our method require “gold standard” spatial localization of the cancers; a full description of the procedure utilized to identify cancer regions is given in Ref. 15. Briefly, a traditional clinical imaging method, typically MRI, was used to approximately locate each tumor. We then selected nearby regions of high optical contrast as the starting point for a region-growing algorithm to identify the spatial extent of the tumor. A 2 cm border region about the tumor and voxels within 1 cm of the source and detector plane were excluded from the training data; the remainder of the breast is defined as healthy tissue. We exclude these boundary regions to reduce the effect of physiological changes near the tumor, errors in tumor positioning, and optode artifacts. In the training set, we assume perfect segmentation into malignant and healthy tissue. In the test set, gold standard segmentation is used only to determine the accuracy of our malignancy prediction.

Algorithm to calculate probability of malignancy

The image analysis has two parts: First, we determine the probability of malignancy function based on the population of known cancers. Then, we test the resulting function on another data set. We iterate this process, exchanging members of the training and test sets to improve the generalizability of our results.

We chose Hbt, StO2, and μs as our fundamental physiological variables. We also tested the combination of Hb, HbO2, and μs, but little difference was found (results not shown). A schematic of the normalization, analysis, and testing protocol is given in Fig. 3.

Figure 3.

Data processing flow chart for each iteration of the leave-one-out protocol. See text for details.

Intrasubject normalization

The first step of this analysis is to normalize the tomographic data across the training sample, as distributions of optical properties vary significantly between subjects. We carry out this procedure for each physiological parameter in both healthy and cancer tissues. To illustrate, consider the total hemoglobin concentration, Hbt. Figure 2a shows the reconstructed Hbt for all healthy tissue voxels in all subjects. We see that the spread of Hbt values is quite large, complicating the use of these data across the sample. Therefore, for each subject, we compute X=ln[Hbt], the healthy tissue mean ⟨X⟩H, and the healthy tissue standard deviation σXH. Note that ⟨X⟩H and σXH are calculated individually over the healthy regions in each subject, thereby capturing both inter- and intrasubject tissue heterogeneity. A similar set of data (i.e., X=ln[Hbt]) is obtained for malignant tissue in each subject in the training set. Together, these quantities permit calculation of the “Z-score” for each variable in each voxel of each subject, e.g.,

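$$
z\mathrm{Hbt}_{d}^{(s,k)}
= \frac{X_{d}^{(s,k)} - \langle X^{s} \rangle_{H}}{\sigma_{X_H}^{s}}
= \frac{\ln[\mathrm{Hbt}_{d}^{(s,k)}] - \langle \ln[\mathrm{Hbt}^{s}] \rangle_{H}}{\sigma[\ln[\mathrm{Hbt}^{s}]]_{H}}. \qquad (1)
$$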

Here, the subscript index d=H,M specifies healthy (H) and malignant (M) regions; the superscript index s=1:Ns specifies a subject in the training set, and the superscript index k=1:Nv(s,d) specifies voxel number within the healthy or malignant region in each subject. Note that the Z-score for both the healthy and malignant regions depends on the mean and standard deviation of the healthy region.

After the Z-score procedure, the distributions of physiological parameters in each voxel across subjects in the sample are much more similar. Figure 2b shows this for Hbt in the healthy regions across 35 subjects. Note that this intrasubject normalization brings the intersubject voxel chromophore distribution close to a zero-centered Gaussian distribution, permitting us to more sensibly combine data across multiple subjects.
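The intrasubject normalization of Eq. 1 can be sketched for one subject and one parameter as follows. The function name and toy values are hypothetical, and the use of the sample standard deviation is an assumption, since the text does not specify the estimator.

```python
import math
from statistics import mean, stdev

def z_scores(region_values, healthy_values):
    """Z-score of ln(value) for each voxel, using the healthy-region mean and SD."""
    log_healthy = [math.log(v) for v in healthy_values]
    mu = mean(log_healthy)
    sigma = stdev(log_healthy)  # sample SD: an assumption, not stated in the text
    return [(math.log(v) - mu) / sigma for v in region_values]

# Toy Hbt values for a single subject (not from the study).
healthy_hbt = [15.0, 18.0, 20.0, 22.0, 27.0]
tumor_hbt = [30.0, 35.0]
z_healthy = z_scores(healthy_hbt, healthy_hbt)
z_tumor = z_scores(tumor_hbt, healthy_hbt)
```

Both regions are scaled by the healthy-region statistics, so the healthy Z-scores have zero mean and unit spread by construction, while tumor contrast is expressed in units of the healthy-tissue variation.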

A particularly attractive feature of this data normalization is the explicit inclusion in the normalized variable of the subject-dependent characteristic spatial fluctuations in each parameter via the distribution width (σXH). Previous work in the field (e.g., as reviewed by Leff et al.5) used only the ratio of cancer to healthy values for each parameter; the previous approach accounts for the wide variation in mean parameter values between subjects, but ignores the differences in optical parameter distribution widths found in each subject’s healthy tissue. With the normalization scheme described here, lesion contrast is scaled to the variation in healthy tissue. Similarly, we compute the Z-score for StO2 and μs.

For the remainder of this paper, we will be computing and manipulating the normalized physiological variables: zHbtd(s,k), zStO2d(s,k), and zμsd(s,k). Since we use identically sized voxels for all subjects, we expect each tissue type (H,M) in each subject to have a different number of voxels; i.e., Nv(s,d) is not constant. In order to avoid weighting the data unduly toward healthy or malignant tissue, we set Nv(s,d)=Nv=40. Therefore, the same number of voxels randomly selected from each region and each subject is used for the next level of our analysis. We chose Nv=40 because the smallest tumors in our data set had ∼140 voxels and our analysis scheme depends on independent measurements from each voxel. With this choice for Nv, the median voxel separation in the 35 tumor regions was 1.2 cm. We therefore do not expect spatial correlation due to DOT resolution to strongly affect results by reducing the independence between measurements.

Drawing Nv=40 voxels from the tumor and healthy regions in each of 35 subjects provides a set of 1400 tumor and 1400 healthy voxels defined by our gold standard segmentation. We utilize this set of voxels in our leave-one-out protocol, removing all voxels of the test subject from the total data set so that the remaining 1360 voxels from each region serve as the training set for classification. The trained classification rule is then applied to the test set to predict malignancy. The entire procedure is then repeated for the other test subjects. Choosing an equal number of voxels from each region improves our estimate of the accuracy of our classification technique under the leave-one-out cross-validation protocol, which will be discussed in Sec. 3.
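The voxel bookkeeping above can be checked with a short sketch. The helper name, the fixed seed, and the uniform region size of 140 voxels are illustrative assumptions; the counts Nv=40 and 35 subjects come from the text.

```python
import random

def sample_voxels(voxel_indices, n_v=40, rng=None):
    """Randomly draw n_v distinct voxels from one region of one subject."""
    rng = rng or random.Random()
    return rng.sample(voxel_indices, n_v)

rng = random.Random(0)  # fixed seed only so the sketch is repeatable
n_subjects, n_v = 35, 40
# Hypothetical: every tumor region is given 140 voxels, about the smallest size in the data set.
tumor_samples = [sample_voxels(list(range(140)), n_v, rng) for _ in range(n_subjects)]
n_tumor_total = sum(len(s) for s in tumor_samples)  # 35 * 40 tumor voxels
n_tumor_training = n_tumor_total - n_v              # after removing the test subject
```

The same draw applied to the healthy regions gives a matching 1400 healthy voxels, so each training set is balanced between the two classes.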

Training set analysis procedure

The tomographic data of all subjects in the training set, i.e., data from all chromophores in Nv voxels of the healthy and malignant regions of each of Ns subjects, are combined into a single matrix. Using a logistic regression model with the known malignancy status of each voxel as the outcome and the normalized tomographic data as predictors, we fit a weight vector β=[βzHbt,βzStO2,βzμs,β0] and compute the vector Md(s,k), whose elements define a scalar malignancy parameter for each voxel in each region of each subject. For the logistic regression model, M is the log odds of malignancy,

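$$
M_{d}^{(s,k)} =
\begin{bmatrix}
M_{H}^{1,1} \\ \vdots \\ M_{H}^{1,N_v} \\ \vdots \\ M_{H}^{N_s,1} \\ \vdots \\ M_{H}^{N_s,N_v} \\
M_{M}^{1,1} \\ \vdots \\ M_{M}^{1,N_v} \\ \vdots \\ M_{M}^{N_s,1} \\ \vdots \\ M_{M}^{N_s,N_v}
\end{bmatrix}
=
\begin{bmatrix}
{z\mathrm{Hbt}}_{H}^{1,1} & {z\mathrm{StO_2}}_{H}^{1,1} & {z\mu_s}_{H}^{1,1} & 1 \\
\vdots & \vdots & \vdots & \vdots \\
{z\mathrm{Hbt}}_{H}^{1,N_v} & {z\mathrm{StO_2}}_{H}^{1,N_v} & {z\mu_s}_{H}^{1,N_v} & 1 \\
\vdots & \vdots & \vdots & \vdots \\
{z\mathrm{Hbt}}_{H}^{N_s,1} & {z\mathrm{StO_2}}_{H}^{N_s,1} & {z\mu_s}_{H}^{N_s,1} & 1 \\
\vdots & \vdots & \vdots & \vdots \\
{z\mathrm{Hbt}}_{H}^{N_s,N_v} & {z\mathrm{StO_2}}_{H}^{N_s,N_v} & {z\mu_s}_{H}^{N_s,N_v} & 1 \\
{z\mathrm{Hbt}}_{M}^{1,1} & {z\mathrm{StO_2}}_{M}^{1,1} & {z\mu_s}_{M}^{1,1} & 1 \\
\vdots & \vdots & \vdots & \vdots \\
{z\mathrm{Hbt}}_{M}^{1,N_v} & {z\mathrm{StO_2}}_{M}^{1,N_v} & {z\mu_s}_{M}^{1,N_v} & 1 \\
\vdots & \vdots & \vdots & \vdots \\
{z\mathrm{Hbt}}_{M}^{N_s,1} & {z\mathrm{StO_2}}_{M}^{N_s,1} & {z\mu_s}_{M}^{N_s,1} & 1 \\
\vdots & \vdots & \vdots & \vdots \\
{z\mathrm{Hbt}}_{M}^{N_s,N_v} & {z\mathrm{StO_2}}_{M}^{N_s,N_v} & {z\mu_s}_{M}^{N_s,N_v} & 1
\end{bmatrix}
\begin{bmatrix}
\beta_{z\mathrm{Hbt}} \\ \beta_{z\mathrm{StO_2}} \\ \beta_{z\mu_s} \\ \beta_{0}
\end{bmatrix}. \qquad (2)
$$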

In forming Md(s,k), we thus account for measurements taken across multiple subjects and measurements taken across multiple spatial locations in each subject. The rightmost column of 1’s in the Z-matrix relates to β0 and introduces an offset that could, in principle, include effects from additional parameters (e.g., variations in tissue fat content, age, etc.) not considered in the present analysis (see Refs. 29, 30 for more details on this latter point).

From M, we compute a probability of malignancy using the function

$$
P[M_{d}^{(s,k)}] = \frac{1}{1 + e^{-M_{d}^{(s,k)}}}. \qquad (3)
$$

Our goal is to identify a weighting vector β that maximizes the difference in probability between voxels in healthy (P[MH(s,k)]) and malignant (P[MM(s,k)]) regions in our training set. In performing this logistic regression, we assume that each element of M is independent. β is optimized by minimizing the difference between P[M] and the gold standard diagnosis, plotted against M (see Ref. 31 for a more detailed description of logistic regression).38 This results in a β such that P[MH(s,k)]→0 and P[MM(s,k)]→1. Optimized M and P[M] are shown in Fig. 4 for typical training and test sets.
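The training step can be sketched in a few lines, assuming M is the log odds so that the probability is the logistic function of M. The plain gradient ascent used here is a simple stand-in optimizer, not necessarily the fitting routine used in the study, and the function names and toy Z-scores are hypothetical.

```python
import math

def probability(m):
    """Logistic function: maps the log odds M to a probability of malignancy."""
    return 1.0 / (1.0 + math.exp(-m))

def log_odds(beta, z):
    """M = beta_zHbt*zHbt + beta_zStO2*zStO2 + beta_zmus*zmus + beta_0."""
    return sum(b * x for b, x in zip(beta, list(z) + [1.0]))

def fit_logistic(rows, labels, lr=0.1, n_iter=2000):
    """Maximize the log-likelihood by gradient ascent (stand-in optimizer)."""
    beta = [0.0] * (len(rows[0]) + 1)
    for _ in range(n_iter):
        grad = [0.0] * len(beta)
        for z, y in zip(rows, labels):
            err = y - probability(log_odds(beta, z))
            for j, x in enumerate(list(z) + [1.0]):
                grad[j] += err * x
        beta = [b + lr * g / len(rows) for b, g in zip(beta, grad)]
    return beta

# Toy Z-scores (zHbt, zStO2, zmus); label 1 = malignant, 0 = healthy.
rows = [(-0.5, 0.2, -0.8), (0.3, -0.1, 0.1), (-1.0, 0.5, 0.0), (0.4, 0.9, -0.4),
        (1.5, -0.3, 2.0), (2.2, 0.1, 1.4), (0.9, -0.6, 2.5), (1.8, 0.4, 1.1)]
labels = [0, 0, 0, 0, 1, 1, 1, 1]
beta = fit_logistic(rows, labels)
p = [probability(log_odds(beta, z)) for z in rows]
```

Applying `log_odds` and `probability` with the fitted `beta` to every voxel of a held-out subject would give that subject's probability of malignancy tomogram.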

Figure 4.

Example of training (34 subjects, top row) and test (1 subject, bottom row) set M and P[M]. Nv=40 voxels were randomly selected from each region in each subject, as described in Sec. 2B1 (i.e., 1360 voxels from each of the training healthy and tumor regions and 40 voxels from the healthy and tumor regions of the test subject). Left column: Malignancy parameter (M) for each subject (s), voxel in training set (k), and diagnosis Md(s,k). Right column: Optimized probability of malignancy P[Md(s,k)]. The test subject is also shown in Fig. 1a; note the improved separation between the malignant and healthy regions.

Test subject normalization

The output from the logistic regression described in Sec. 2B2 is the parameter weighting vector, β, which we then apply to data from an independent test subject (i.e., DOT data from the “leftover” subject). Such an application derives a predicted probability of malignancy for each voxel in the test subject.

Normalization of the test data set is slightly more complicated than that of the training data, as we must not assume knowledge of the cancer location in the test subject’s breast. We therefore empirically define the healthy region as those voxels in which both Hbt and μs lie within the whole-breast mean and the whole-breast mean plus 2 standard deviations. Note that the results turned out to be only mildly sensitive to the particular choice of healthy criterion, as the cancers usually do not occupy a large fraction of the breast.

RESULTS

Figure 5 shows P vs M for two subjects; example slices through the center of the cancers for the same subjects are shown for rHbt [Fig. 6a] and P[M] [Fig. 6b]. A probability cutoff is readily applied to the data (i.e., a horizontal line in Fig. 5) in order to provide a concrete criterion to create spatial masks of regions that are highly suspicious for malignancy [Fig. 6c]. One can then compare these masks to the gold standard malignant and healthy regions for each test subject.

Figure 5.

Example probability of malignancy (P[M]) calculated for two test subjects. The function P[M] (blue line) is derived from the training set, as described in Sec. 2B2. This function is then applied to the remaining test subject in each case; using the gold standard segmentation described in Sec. 2A, voxels are labeled as healthy (green crosses) or malignant (red dots). Image slices from the same subjects are shown in Fig. 6. For clarity, only every hundredth voxel is plotted. (a) Typical invasive cancer that shows very good separation between healthy and malignant regions. (b) Case study: In situ lesion; this lesion is not as well separated from the background healthy tissue. P[M] for this in situ lesion is more heterogeneous, with a lower average than the invasive lesion and more overlap between malignant and healthy tissue.

Figure 6.

Slices from 3D images of subjects in Fig. 5, showing total hemoglobin concentration [(a) Hbt], probability of malignancy [(b) P[M]], and a binary cancer mask (c) using a cutoff of P[M]=0.95. This in situ lesion provides an interesting case study, with the P[M] falling between the malignant lesions and the healthy regions.

To quantify the quality of probability maps, such as those shown in Fig. 6b, we examine the distributions of M and P[M] for Nv=40 voxels in gold standard healthy and malignant regions across the entire set of test subjects; for this task, we use the same randomly selected voxels, as described in Sec. 2B1. Figure 7a shows a histogram of M drawn from test subject healthy and malignant regions in each iteration of the leave-one-out protocol; Fig. 7b is a box plot of the P[M] distributions. Notice that the voxels from the tumor region are distributed narrowly about P[M]∼1, but the distribution has a small tail of outliers. Similarly, the healthy region voxels are concentrated near P[M]∼0.1, but the central quartiles of the distribution extend from ∼0.01 to ∼0.2 and the fourth quartile extends to ∼0.6, with outliers up to ∼1. This wide distribution of values in the healthy region can also be seen in Fig. 5.

Figure 7.

(a) Histogram of M for the healthy and tumor regions of 35 test subjects, used to generate the box plot in (b). (b) Box plot of probability of malignancy (P[M]) of Nv=40 voxels from all test subjects (bold lines mark median values, boxes denote interquartile range, dashed lines indicate outer two quartiles, and squares mark outliers). See insets in Fig. 8 for a tabulation of mean classification rates at several values of Pcut. Median P[M]=0.998 for tumor voxels and 0.019 for healthy voxels.

We next impose a probability of malignancy cutoff, Pcut: Voxels with probability above Pcut are predicted to be malignant, those below Pcut are predicted to be healthy. Finally, we compare this prediction to the gold standard diagnosis. For each value of Pcut, some test subject voxels are malignant and correctly predicted to be malignant [true positive (TP)]; some test subject voxels are malignant, but incorrectly predicted to be healthy [false negative (FN)]; some test subject voxels are healthy, but incorrectly predicted to be malignant [false positive (FP)]; and some test subject voxels are healthy and correctly predicted to be healthy [true negative (TN)].

These quantities can be expressed as rates by dividing each of these classifications by the total number of voxels in the healthy or malignant regions. We can thus calculate true and false positive rates (TPR=TP∕(TP+FN), FPR=FP∕(TN+FP)) and true and false negative rates (TNR=TN∕(TN+FP), FNR=FN∕(FN+TP)) as functions of Pcut. The receiver operating characteristic (ROC) curve plots TPR against FPR; ROC curves are shown in Fig. 8. Note that these quantities are often referred to as sensitivity (TPR) and specificity (1−FPR). Rates are averaged over 35 permutations of our training∕test subjects. Average TPR, FPR, FNR, and TNR across all test subjects for several values of Pcut are tabulated in the insets of Fig. 8.
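The rate definitions above can be sketched directly. The function name and the toy probabilities are hypothetical; the convention that voxels with P[M] above Pcut are predicted malignant follows the text.

```python
def classification_rates(probabilities, is_malignant, p_cut=0.95):
    """Return (TPR, FPR, TNR, FNR); voxels with P > p_cut are predicted malignant."""
    tp = fp = tn = fn = 0
    for p, truth in zip(probabilities, is_malignant):
        predicted = p > p_cut
        if truth and predicted:
            tp += 1
        elif truth and not predicted:
            fn += 1
        elif predicted:
            fp += 1
        else:
            tn += 1
    return tp / (tp + fn), fp / (fp + tn), tn / (tn + fp), fn / (fn + tp)

# Toy test-subject voxels (not from the study): P[M] and gold standard label.
probs = [0.99, 0.97, 0.96, 0.60, 0.02, 0.10, 0.98, 0.30]
truth = [True, True, True, True, False, False, False, False]
tpr, fpr, tnr, fnr = classification_rates(probs, truth)

# Sweeping the cutoff traces out (FPR, TPR) points of an ROC curve.
roc = [classification_rates(probs, truth, c)[1::-1] for c in (0.0, 0.5, 0.95, 1.0)]
```

In the study these rates are computed per test subject and then averaged over the 35 leave-one-out iterations; the sketch covers a single subject only.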

Figure 8.

(a) Average ROC curve for all healthy and tumor voxels for each of 35 test subjects. [(a), inset] Classification rates calculated using all voxels from cancerous and healthy tissue, as defined by our gold standard. Black diamonds mark TPR[Pcut] vs FPR[Pcut] values for Pcut given by the numeric labels. (b) Rates calculated only using the Nv=40 training voxels from each region in each subject. [(b), inset] Average classification rates calculated over Nv voxels. (Insets) Mean TPRs, FPRs, FNRs, and TNRs as a function of Pcut. Healthy voxels are defined as P[M]≤Pcut. Note that a low FNR is desirable for cancer diagnosis. Rates are averaged over all test subjects used in the leave-one-out protocol and given as percentages.

For the purposes of cancer detection, FNR is critical, as this determines how many cancers are missed. With the sample used in this analysis, a probability cutoff of Pcut=0.95 yields FNR of 11%. At the same cutoff, 89% of the voxels in the malignant region are correctly predicted to be cancerous (TPR) and 6% of the voxels in the healthy region are incorrectly predicted to be cancerous (FPR), while the remaining 94% of the healthy region voxels are correctly labeled. The cutoffs can be tuned to suit particular clinical needs (see insets in Fig. 8 for classification rates corresponding to other values of Pcut).

Since the Z-score is in “units” of standard deviation, we can directly compare the magnitudes of the derived coefficients β to determine the relative importance of each parameter for identifying malignancy. Averaging over 35 combinations of 34 subjects in the training set gives ⟨βzHbt⟩=0.83±0.06, ⟨βzStO2⟩=−0.19±0.04, and ⟨βzμs⟩=2.68±0.12,37 suggesting that of the three optical parameters, the difference in μs offers the strongest evidence of malignancy. It should be noted that the weights are coefficients derived from a logistic regression model and can be interpreted precisely in terms of changes in the log odds of malignancy. For example, a one unit increase in zμs implies an estimated increase of 2.7 in the log odds of malignancy for that voxel or, equivalently, a 14.9-fold increase in the odds of malignancy. However, when β is calculated across all 35 subjects, we find the significance (“p value”) of each element in β to be <0.005, suggesting that all three parameters should be retained in the model.
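As a worked check of the odds interpretation, assume M is the log odds, so that the odds P∕(1−P) equal e^M. A one unit increase in a single Z-score then multiplies the odds by e^β for that parameter. The factors for zHbt and zStO2 below are computed here from the reported mean coefficients and are not quoted from the study.

```latex
\[
\frac{\mathrm{odds}(z\mu_s + 1)}{\mathrm{odds}(z\mu_s)}
= \frac{e^{M + \beta_{z\mu_s}}}{e^{M}}
= e^{\beta_{z\mu_s}} \approx e^{2.7} \approx 14.9,
\qquad
e^{\beta_{z\mathrm{Hbt}}} \approx e^{0.83} \approx 2.3,
\qquad
e^{\beta_{z\mathrm{StO_2}}} \approx e^{-0.19} \approx 0.83 .
\]
```

On this reading, a one standard deviation rise in zStO2 slightly lowers the odds of malignancy, while the same rise in zμs raises them by more than an order of magnitude.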

The analysis described above assumes that each 2×2×2 mm voxel in the 3D tomogram is independent of the others. However, it is well known that the spatial point spread function of light transport in tissue limits DOT spatial resolution to ∼0.5–1 cm (Refs. 32, 33, 34, 35) in breast tissue at biological contrast levels. Furthermore, there are likely to be physiological correlations between different spatial locations in the breast (e.g., all voxels drawn from adipose tissue). We explored a simple model of correlated voxels, in which we identified “clumps” of the same size as the tumor in each breast and then compared the average value of the probability of malignancy in these clumps to that of the tumors. Interestingly, we found the distribution of P[M] to have fewer outliers than Fig. 7 and values in the confusion matrix to be within ∼±5% of the per-voxel values presented in the insets of Fig. 8 at Pcut=0.95.

DISCUSSION

This study suggests that multisubject, multivoxel, multiparameter statistical analysis of diffuse optical data is potentially quite useful. Using a relatively simple statistical classification approach, we have reproduced most of the results described in Ref. 15. Furthermore, a lengthy analysis by a skilled researcher was not required and several different cancer types were included (see Table 1). Initial results also suggest that this type of data analysis may be useful for suppression of image artifacts, but we have not yet systematically studied the issue. Interestingly, in the small number (3, see Table 1) of in situ cancers contained in our sample, we have noted a lower value of the malignancy parameter M than for the more common invasive cancers (see Fig. 5 for an example).

In Sec. 3, we noted |βzμs|>|βzHbt|>|βzStO2|, and therefore zμs is the most important parameter for identification of malignancy, followed by zHbt. The values of βzμs and βzHbt differed little between training sets. ⟨βzStO2⟩, on the other hand, was both smaller than the other coefficients and had a much larger relative variation between training sets (∼20% vs ∼5%). StO2 is thus less important for differentiation of malignant regions in this data set, an observation consistent with earlier results.15

This preliminary study has several limitations. The most significant limitation is its small sample size. The current analysis uses a “leave-one-out” protocol. Future work with larger samples will use “leave-M-out,” for M>1, a better approach to probing the generalizability of the method.
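The leave-one-out protocol and its leave-M-out generalization can be sketched as a generator of training and test splits. The sketch uses lowercase m for the number of held-out subjects to avoid confusion with the malignancy parameter M. Exhaustive enumeration of all held-out combinations is an assumption made here for illustration, and the subject labels 1 to 35 are placeholders.

```python
from itertools import combinations

def leave_m_out_splits(subject_ids, m=1):
    """Yield (training_ids, test_ids) for every way of holding out m subjects."""
    subjects = list(subject_ids)
    for held_out in combinations(subjects, m):
        held = set(held_out)
        training = [s for s in subjects if s not in held]
        yield training, list(held_out)

subjects = list(range(1, 36))  # 35 subjects, labeled 1..35 for illustration
loo_splits = list(leave_m_out_splits(subjects, m=1))

# Each leave-one-out split trains on 34 subjects and tests on the remaining one.
for training, test in loo_splits[:2]:
    print(len(training), "training subjects; held out:", test)
```

With m=1 this gives the 35 combinations of 34 training subjects used here. The number of exhaustive splits grows combinatorially with m (595 for m=2 and 35 subjects), so a larger study might sample splits at random instead.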

A more subtle issue that deserves further exploration was our assumption that each voxel used in the training is independent. As we discussed at the end of Sec. 3, we expect spatial correlations from both biological sources and reconstruction artifacts from DOT mammography. The logistic regression classification scheme assumes independent measurements; future work will utilize more sophisticated classifiers to take advantage of the biological correlation (e.g., between voxels from glandular tissue) and minimize the effects of correlation arising from DOT (e.g., resolution limits). The use of correlated measurements in the training set will result in underestimation of the variance in β; however, our classifier does not use this variance: We only apply the point estimate of β, which is unbiased even when the data are correlated, to the test set. Therefore, the classification rates remain valid even though we did not explicitly take into account potential correlation between adjacent voxels.

We also explored possible effects on this analysis technique from DOT resolution limits by eliminating the smallest tumors from our data set. The smallest tumor included in the sample has a total volume of 1.1 cm³ (139 voxels). The median separation between the Nv=40 randomly chosen voxels in this tumor is 0.85 cm with an interquartile range (IQR) of 0.71 cm; this separation is within the expected ∼0.5–1 cm resolution of the optical reconstruction technique. For comparison, in the same subject, the median separation between the Nv=40 healthy tissue voxels was 4.5 cm and the IQR was 3.5 cm. We then repeated the leave-one-out protocol described above on a sample of the 31 subjects with tumor volume greater than 2 cm³ (250 voxels). We found that the classification rates at Pcut=0.95 changed very little (i.e., ∼1%). Furthermore, β remained consistent within our calculated error. We also tested the possibility of overweighting small tumors through correlated measurements by extracting either a fixed percentage (10%) of the total tumor voxels from each subject or 10% of the total tumor voxels up to a maximum of 40. These shifts in the training set selection changed the classification rates at Pcut=0.95 by only ∼2%. Alternatively, we could select voxels from a sparse grid in the tissue regions to enforce a consistent or minimal voxel separation.
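The separation statistics quoted above (median and IQR of the distances between selected voxels) can be computed along the following lines. This is a minimal sketch: the function name is hypothetical, the quartile convention is Python's default and may differ from the one used in the study, and the block of 139 voxels on a 2 mm grid is synthetic geometry rather than a real tumor.

```python
import math
import random
import statistics
from itertools import combinations

def separation_stats(voxel_coords_cm):
    """Median and interquartile range of all pairwise distances (in cm)."""
    distances = [math.dist(a, b) for a, b in combinations(voxel_coords_cm, 2)]
    q1, _, q3 = statistics.quantiles(distances, n=4)
    return statistics.median(distances), q3 - q1

# Synthetic stand-in for a small tumor: 139 voxel centers on a 0.2 cm grid.
grid = [(0.2 * i, 0.2 * j, 0.2 * k)
        for i in range(5) for j in range(5) for k in range(6)]
tumor_voxels = grid[:139]

rng = random.Random(0)                 # fixed seed so the draw is repeatable
sample = rng.sample(tumor_voxels, 40)  # Nv = 40 voxels without replacement
median_cm, iqr_cm = separation_stats(sample)
print(f"median separation {median_cm:.2f} cm, IQR {iqr_cm:.2f} cm")
```

Comparing the median against the ∼0.5–1 cm resolution limit indicates whether the sampled voxels can reasonably be treated as independent measurements.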

To test the effects of our random voxel selection, we repeated our analysis on the data sample five times, randomly selecting new training voxels with each iteration. The standard deviations of β extracted from the entire training set were σβ0=2%, σβzμs=6%, σβzHbt=11%, and σβzStO2=29%. Recall that βzStO2 had the smallest magnitude and most variation between different training sets (∼20%).

Another potential source of error is the imperfect tissue segmentation we relied upon as the gold standard for assignment of each voxel as healthy or cancerous. This segmentation relies upon both nonconcurrent clinical imaging (e.g., MRI) to locate the tumor and a region growing algorithm on each subject’s optical tomogram to define the tumor boundaries, therefore potentially introducing discrepancies in the tissue segmentation. We excluded a 2 cm thick boundary region about each tumor from the corresponding healthy region to reduce effects of errors in spatial localization of the tumor boundary on the training healthy tissue data.

Although the analysis includes more spatial data than typically used, most of the data were still discarded, i.e., Nv=40≪4.5×10⁴ voxels drawn from the healthy region in each subject. We chose this limit for data selection in order to weight tumor and healthy regions equally and take only ∼30% of voxels from the smallest tumors. Furthermore, this choice permits a more intuitive interpretation of P[M] and improves quantification of classification accuracy. We are currently exploring other weighting schemes which permit use of all or most of the healthy tissue voxels. Additionally, no healthy subjects or benign lesions were included in the present sample; inclusion might raise false positive rates, as the relative optical properties of some benign tumors overlapped those of cancers in our previous work.15
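The equal weighting of tumor and healthy regions described above amounts to drawing the same number of voxels, Nv, from each region of every training subject. The sketch below assumes simple random sampling without replacement; the function name and the integer voxel identifiers are placeholders, and the region sizes are taken from the figures quoted in the text (139 tumor voxels, of order 4.5×10⁴ healthy voxels).

```python
import random

def draw_training_voxels(tumor_voxels, healthy_voxels, n_v=40, seed=None):
    """Draw n_v voxels from each region so both classes carry equal weight."""
    if len(tumor_voxels) < n_v or len(healthy_voxels) < n_v:
        raise ValueError("each region must contain at least n_v voxels")
    rng = random.Random(seed)
    tumor_sample = rng.sample(tumor_voxels, n_v)      # without replacement
    healthy_sample = rng.sample(healthy_voxels, n_v)  # without replacement
    return tumor_sample, healthy_sample

# Placeholder voxel identifiers for the smallest tumor and a healthy region.
tumor_ids = list(range(139))
healthy_ids = list(range(45000))
tumor_sample, healthy_sample = draw_training_voxels(tumor_ids, healthy_ids, seed=1)

tumor_fraction = len(tumor_sample) / len(tumor_ids)        # about 0.29
healthy_fraction = len(healthy_sample) / len(healthy_ids)  # under 0.1%
```

Repeating the draw with different seeds corresponds to the repeated random selections used to test the sensitivity of β to the choice of training voxels.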

Logistic regression is a fairly simple binary classification scheme, which permits use of both continuous and classification variables. The analysis presented here did not include other classification variables (menopausal status, age, etc.), as our total sample is fairly small. For the same reason, we did not attempt to separate cancer types; we will apply such analysis to larger samples in future work. More sophisticated classification techniques could, for example, include more than two output categories (e.g., image artifact, malignant, benign, and healthy regions); again, sufficiently large data sets will permit us to differentiate between types of benign and malignant lesions. Additionally, we will apply a variation of this technique to larger samples (e.g., nonimaging studies of many subjects) and more complete data sets (e.g., concurrent optical∕MR imaging). Finally, future work will implement classification approaches such as support vector machines or neural networks that have demonstrated better predictive capability in other applications, as in Klose et al.16

CONCLUSION

The potential for population-based statistical image processing of diffuse optical data using logistic regression of three optically measured physiological parameters (Hbt, StO2, and μs) and a leave-one-out paradigm for 35 subjects was demonstrated. Our voxel-level diagnosis produced an average TPR of 89% and FPR of 6% with no human interpretation of the test data set required. These results are a starting point for development of diffuse optical tomography CAD algorithms. Such multiparameter optical signatures of cancer may enhance the utility of an adjunct or standalone optical device in the clinical imaging environment.

ACKNOWLEDGMENTS

This work was supported by NIH grants R01-EB002109, K99-CA126187, P41-RR002305, and NTROI 1U54CA105480. The authors thank Norman Butler, and the patient coordinators Kathleen Thomas, Tamara April, Deborah Arnold, Stephanie Damia, Dalton Hance, and Monika Koptyra. They also thank colleagues who have contributed to this DOT breast program, including Britton Chance, John Schotland, Leonid Zubkov, Simon R. Arridge, Martin Schweiger, Joseph P. Culver, Soren D. Konecky, Alper Corlu, Kijoon Lee, Han Y. Ban, Saurav Pathak, and Raghav Puranmalka. Finally, the authors thank their clinical collaborator Douglas L. Fraker.

References

  1. Jacques S. L. and Pogue B. W., “Tutorial on diffuse light transport,” J. Biomed. Opt. 13(4), 041302 (2008). 10.1117/1.2967535
  2. Arridge S. and Schotland J., “Optical tomography: Forward and inverse problems,” Inverse Probl. 25, 123010 (2009). 10.1088/0266-5611/25/12/123010
  3. Srinivasan S., Pogue B. W., Brooksby B., Jiang S., Dehghani H., Kogel C., Wells W. A., Poplack S. P., and Paulsen K. D., “Near-infrared characterization of breast tumors in vivo using spectrally-constrained reconstruction,” Technol. Cancer Res. Treat. 4(5), 513–526 (2005).
  4. Li C., Grobmyer S. R., Massol N., Liang X., Zhang Q., Chen L., Fajardo L. L., and Jiang H., “Noninvasive in vivo tomographic optical imaging of cellular morphology in the breast: Possible convergence of microscopic pathology and macroscopic radiology,” Med. Phys. 35(6), 2493–2501 (2008). 10.1118/1.2921129
  5. Leff D. R., Warren O. J., Enfield L. C., Gibson A., Athanasiou T., Patten D. K., Hebden J., Yang G. Z., and Darzi A., “Diffuse optical imaging of the healthy and diseased breast: A systematic review,” Breast Cancer Res. Treat. 108(1), 9–22 (2008). 10.1007/s10549-007-9582-z
  6. Chung S. H., Cerussi A. E., Klifa C., Baek H. M., Birgul O., Gulsen G., Merritt S. I., Hsiang D., and Tromberg B. J., “In vivo water state measurements in breast cancer using broadband diffuse optical spectroscopy,” Phys. Med. Biol. 53(23), 6713–6727 (2008). 10.1088/0031-9155/53/23/005
  7. Taroni P., Pifferi A., Salvagnini E., Spinelli L., Torricelli A., and Cubeddu R., “Seven-wavelength time-resolved optical mammography extending beyond 1000 nm for breast collagen quantification,” Opt. Express 17(18), 15932–15946 (2009). 10.1364/OE.17.015932
  8. Winsberg F., Elkin M., Macy J., Bordaz V., and Weymouth W., “Detection of radiographic abnormalities in mammograms by means of optical scanning and computer analysis,” Radiology 89, 211–215 (1967).
  9. Vyborny C. J., Giger M. L., and Nishikawa R. M., “Computer-aided detection and diagnosis of breast cancer,” Radiol. Clin. North Am. 38(4), 725–740 (2000). 10.1016/S0033-8389(05)70197-4
  10. Kim S. J., Moon W. K., Cho N., Cha J. H., Kim S. M., and Im J. -G., “Computer-aided detection in full-field digital mammography: Sensitivity and reproducibility in serial examinations,” Radiology 246(1), 71–80 (2007). 10.1148/radiol.2461062072
  11. Fidler I. J., “Tumor heterogeneity and the biology of cancer invasion and metastasis,” Cancer Res. 38(9), 2651–2660 (1978).
  12. Reya T., Morrison S. J., Clarke M. F., and Weissman I. L., “Stem cells, cancer, and cancer stem cells,” Nature (London) 414(6859), 105–111 (2001). 10.1038/35102167
  13. Chen Y., Zheng G., Zhang Z. H., Blessington D., Zhang M., Li H., Liu Q., Zhou L., Intes X., Achilefu S., and Chance B., “Metabolism-enhanced tumor localization by fluorescence imaging: In vivo animal studies,” Opt. Lett. 28(21), 2070–2072 (2003). 10.1364/OL.28.002070
  14. Shah N., Cerussi A. E., Jakubowski D., Hsiang D., Butler J., and Tromberg B. J., “Spatial variations in optical and physiological properties of healthy breast tissue,” J. Biomed. Opt. 9(3), 534–540 (2004). 10.1117/1.1695560
  15. Choe R., Konecky S. D., Corlu A., Lee K., Durduran T., Busch D. R., Czerniecki B. J., Tchou J., Fraker D. L., DeMichele A., Chance B., Arridge S. R., Schweiger M., Culver J. P., Schnall M. D., Putt M. E., Rosen M. A., and Yodh A. G., “Differentiation of benign and malignant breast tumors by in-vivo three-dimensional parallel-plate diffuse optical tomography,” J. Biomed. Opt. 14(2), 024020 (2009). 10.1117/1.3103325
  16. Klose C. D., Klose A. D., Netz U., Beuthan J., and Hielscher A. H., “Multiparameter classifications of optical tomographic images,” J. Biomed. Opt. 13(5), 050503 (2008). 10.1117/1.2981806
  17. Simick M. K., Jong R., Wilson B., and Lilge L., “Non-ionizing near-infrared radiation transillumination spectroscopy for breast tissue density and assessment of breast cancer risk,” J. Biomed. Opt. 9(4), 794–803 (2004). 10.1117/1.1758269
  18. Blyschak K., Simick M., Jong R., and Lilge L., “Classification of breast tissue density by optical transillumination spectroscopy: Optical and physiological effects governing predictive value,” Med. Phys. 31(6), 1398–1414 (2004). 10.1118/1.1738191
  19. Blackmore K. M., Knight J. A., Jong R., and Lilge L., “Assessing breast tissue density by transillumination breast spectroscopy (tibs): An intermediate indicator of cancer risk,” Br. J. Radiol. 80(955), 545–556 (2007). 10.1259/bjr/26858614
  20. Zhu C., Breslin T. M., Harter J., and Ramanujam N., “Model based and empirical spectral analysis for the diagnosis of breast cancer,” Opt. Express 16(19), 14961–14978 (2008). 10.1364/OE.16.014961
  21. Song X., Pogue B. W., Jiang S., Doyley M. M., Dehghani H., Tosteson T. D., and Paulsen K. D., “Automated region detection based on the contrast-to-noise ratio in near-infrared tomography,” Appl. Opt. 43(5), 1053–1062 (2004). 10.1364/AO.43.001053
  22. Pogue B. W., Davis S. C., Song X., Brooksby B. A., Dehghani H., and Paulsen K. D., “Image analysis methods for diffuse optical tomography,” J. Biomed. Opt. 11(3), 033001 (2006). 10.1117/1.2209908
  23. Wang J. Z., Liang X., Zhang Q., Fajardo L. L., and Jiang H., “Automated breast cancer classification using near-infrared optical tomographic images,” J. Biomed. Opt. 13, 044001 (2008). 10.1117/1.2956662
  24. Tromberg B. J., Cerussi A., Shah N., Compton M., Durkin A., Hsiang D., Butler J., and Mehta R., “Imaging in breast cancer: Diffuse optics in breast cancer: Detecting tumors in pre-menopausal women and monitoring neoadjuvant chemotherapy,” Breast Cancer Res. 7(6), 279–285 (2005). 10.1186/bcr1358
  25. Cerussi A., Shah N., Hsiang D., Durkin A., Butler J., and Tromberg B. J., “In vivo absorption, scattering, and physiologic properties of 58 malignant breast tumors determined by broadband diffuse optical spectroscopy,” J. Biomed. Opt. 11, 044005 (2006). 10.1117/1.2337546
  26. Konecky S. D., Choe R., Corlu A., Lee K., Wiener R., Srinivas S. M., Saffer J. R., Freifelder R., Karp J. S., Hajjioui N., Azar F., and Yodh A. G., “Comparison of diffuse optical tomography of human breast with whole-body and breast-only positron emission tomography,” Med. Phys. 35(2), 446–455 (2008). 10.1118/1.2826560
  27. Chance B., Nioka S., Zhang J., Conant E. F., Hwang E., Briest S., Orel S. G., Schnall M. D., and Czerniecki B. J., “Breast cancer detection based on incremental biochemical and physiological properties of breast cancers: A six-year, two-site study,” Acad. Radiol. 12(8), 925–933 (2005). 10.1016/j.acra.2005.04.016
  28. Hastie T., Tibshirani R., and Friedman J., The Elements of Statistical Learning: Data Mining, Inference, and Prediction, 2nd ed. (Springer Science+Business Media, New York, NY, 2009) (corrected 3rd printing).
  29. Pepe M. S., The Statistical Evaluation of Medical Tests for Classification and Prediction (Oxford University Press, New York, NY, 2003).
  30. Breslow N. E., Day N. E., and Davis W., Statistical Methods in Cancer Research (International Agency for Research on Cancer, Lyon, France, 1980).
  31. McCullagh P. and Nelder J. A., Generalized Linear Models (Chapman and Hall/CRC, London and Boca Raton, FL, 1989).
  32. Konecky S. D., Panasyuk G. Y., Lee K., Markel V. A., Yodh A. G., and Schotland J. C., “Imaging complex structures with diffuse light,” Opt. Express 16(7), 5048–5060 (2008). 10.1364/OE.16.005048
  33. Hebden J., Arridge S., and Delpy D., “Optical imaging in medicine: I. experimental techniques,” Phys. Med. Biol. 42, 825–840 (1997). 10.1088/0031-9155/42/5/007
  34. Culver J. P., Ntziachristos V., Holboke M. J., and Yodh A. G., “Optimization of optode arrangements for diffuse optical tomography: A singular-value analysis,” Opt. Lett. 26, 701–703 (2001). 10.1364/OL.26.000701
  35. Markel V. A. and Schotland J. C., “Scanning paraxial optical tomography,” Opt. Lett. 27(13), 1123–1125 (2002). 10.1364/OL.27.001123
  36. This data sample is slightly smaller than that reported by Choe et al. (Ref. 15) (total: 51) as we excluded subjects with multiple (1) or benign (10) lesions from the current analysis. Additionally, a few subjects with very little healthy tissue or large reconstruction artifacts in the optical field of view were excluded from both test and training sets (5).
  37. Mean±standard deviation.
  38. Optimization was performed with the mnrfit function from the MATLAB© statistics toolbox.

Articles from Medical Physics are provided here courtesy of American Association of Physicists in Medicine
