Medical Physics. 2011 Mar 1;38(3):1649–1659. doi: 10.1118/1.3555300

A GMM-based breast cancer risk stratification using a resonance-frequency electrical impedance spectroscopy

Dror Lederman 1,a), Bin Zheng 1, Xingwei Wang 1, Jules H Sumkin 1, David Gur 1
PMCID: PMC3064686  PMID: 21520878

Abstract

Purpose: The authors developed and tested a multiprobe-based resonance-frequency-based electrical impedance spectroscopy (REIS) system. The purpose of this study was to preliminarily assess the performance of this system in classifying younger women into two groups, those ultimately recommended for biopsy during imaging-based diagnostic workups that followed screening and those rated as negative during mammography.

Methods: A seven probe-based REIS system was designed, assembled, and is currently being tested in the breast imaging facility. During an examination, contact is made with the nipple and six concentric points on the breast skin. For each measurement channel between the center probe and one of the six external probes, a set of electrical impedance spectroscopy (EIS) signal sweeps is performed and signal outputs ranging from 200 to 800 kHz at 5 kHz intervals are recorded. An initial subset of 174 examinations from an ongoing prospective clinical study was selected for this preliminary analysis. An initial set of 35 features, 33 of which represented the corresponding EIS signal differences between the left and right breasts, was established. A Gaussian mixture model (GMM) classifier was developed to differentiate between “positive” (biopsy recommended) cases and “negative” (nonbiopsy) cases. An optimal feature set was selected using genetic algorithms, with the area under the receiver operating characteristic curve (AUC) as the fitness criterion.

Results: The recorded EIS signal sweeps showed that, in general, negative (nonbiopsy) examinations have a higher level of electrical impedance symmetry between the two breasts than positive (biopsy) examinations. Fourteen features were selected by the genetic algorithm and used in the optimized GMM classifier. Using a leave-one-case-out test, the GMM classifier yielded a performance level of AUC=0.78, which compared favorably to three other widely used classifiers, namely, a support vector machine, a classification tree, and linear discriminant analysis. These results also suggest that the REIS signal based GMM classifier could be used as a prescreening tool to correctly identify a fraction of younger women at higher risk of developing breast cancer (i.e., 47% sensitivity at 90% specificity).

Conclusions: The study confirms that asymmetry in electrical impedance characteristics between two breasts provides valuable information regarding the presence of a developing breast abnormality; hence, REIS data may be useful in classifying younger women into two groups of “average” and “significantly higher than average” risk of having or developing a breast abnormality that would ultimately result in a later imaging-based recommendation for biopsy.

Keywords: Electrical impedance spectroscopy (EIS), Gaussian mixture model (GMM), resonance frequency, risk stratification, breast cancer, technology assessment

INTRODUCTION

Methods for earlier detection of breast cancer have long been of great interest. Mammography has been widely used for screening of breast cancer. However, conventional screen-film-based mammography suffers from relatively low detection sensitivity and specificity in younger women (i.e., less than 50 yr old), especially due to the low prevalence of breast cancer and denser breast tissue (e.g., in some studies sensitivity levels were as low as 16%–40%).1, 2, 3 To improve detection performance, different approaches (or imaging modalities) have been proposed, tested, and applied. Among these, full-field digital mammography (FFDM), whole breast ultrasound (US), and magnetic resonance imaging (MRI) have been evaluated and used in clinical practice.4 Other new technologies, such as digital breast tomosynthesis and breast cone beam computed tomography, are also being investigated.5, 6 Some of these new breast imaging modalities have shown comparable and/or improved detection performance as compared to mammography,7, 8, 9 resulting in the recommendation by the American Cancer Society (ACS), among others, for periodic breast MRI examinations in women at high risk as an adjunct to mammography.10 All of these imaging modalities suffer from drawbacks, including but not limited to high cost and low accessibility, particularly if applied to a substantial fraction of the population. Therefore, continued efforts are needed to find better approaches. In particular, specific attention should be given to women who are at prescreening age and to women who are older (i.e., >40 yr old in the USA and 50 yr old in most other countries), and therefore qualify for periodic screening, but do not comply with the conventional imaging screening recommendations.
For these women, it is important to develop a risk stratification method, namely, a prescreening approach that can be used to identify the presence or absence of abnormalities requiring an imaging-based follow-up that may lead to invasive intervention (i.e., biopsy).11 The aim is to identify a group of younger women (<50 yr old) who have a higher than average risk for breast cancer due to findings of biopsy-warranting suspicious breast lesions. This group constitutes only a small fraction of the general population, and its members typically do not undergo imaging-based screening examinations due to their younger age and/or other personal reasons for electing not to participate in screening programs. Other advanced imaging examinations (e.g., FFDM and MRI) can then be applied to this group. Using this “rule-in” rather than “rule-out” strategy, the identified group of women at high risk may benefit from early detection and hence, potentially, from an improved prognosis. Therefore, what is required for this purpose is an inexpensive, non-radiation-based, reliable, and easy to use examination modality that could be used during annual physical and/or gynecologic examinations at the physician’s office, with or without, as applicable, the currently customary clinical breast examination (CBE).

Recently, electrical impedance spectroscopy (EIS) has been proposed for the purpose of risk stratification.12, 13, 14, 15, 16, 17, 18, 19, 20 The EIS approach is based on the fact that differences between cancerous and normal tissues are associated with changes in the electrical conductivity and capacitance, probably due to changes in cellular water content, blood flow, amount of extracellular fluid, and membrane proteins.21 It has long been reported that cancer cells exhibit altered local dielectric properties with respect to normal cells.17 For example, the conductivity and capacitance of malignant human breast tissue were shown to be 20–40-fold higher than that of normal breast tissue.12

During EIS measurements, alternating low power electric signals are introduced into the tissue being tested and the corresponding response is measured. One such commercial system (T-Scan 2000, Mirabel Medical Systems, Austin, TX) was approved by the U.S. Food and Drug Administration as an adjunct modality to mammography.20 Unfortunately, at high specificity levels, the system had low sensitivity in the range of 20%–30%.19 As a result, EIS has not been used in routine clinical practice, to date.

We have been developing and testing a markedly modified impedance-measurement-based approach for this purpose. The approach focuses primarily on the analysis of asymmetry, or differences, in the electrical impedance output signals between the two breasts, particularly at and around the resonance frequency, to identify younger women at high risk of having highly suspicious findings requiring breast biopsy and/or of having breast cancer. The current paper describes a continuation of preliminary work previously presented,22, 23, 24 in which a new multiprobe-based resonance-frequency-based electrical impedance spectroscopy (REIS) system was developed and a preliminary clinical assessment was initiated under an Institutional Review Board (IRB) approved protocol. The present work aims to make the following contributions:

  • Development of a non-radiation-based, low cost, and easy to use breast cancer prescreening model (tool) based on analysis of multifeatures extracted at and around the resonance frequency of EIS output signal sweeps.

  • Design and development of a likelihood ratio test (LRT), based on a Gaussian mixture model (GMM) classifier and a mirror-matched feature extraction concept, for differentiation between “positive” cases, in which suspicious breast abnormalities were detected and recommended for biopsy during imaging-based workups, and “negative” cases not recommended for biopsy. The selection of an optimal feature subset is performed using genetic algorithms (GAs).

  • Evaluation of classification performance of the GMM classifier using a dataset of REIS examinations obtained from 174 women and comparison of the performance levels of this classifier with the performance of several widely used classifiers.

This paper proceeds as follows. Section 2 introduces the REIS measurement concept and system, presents its possible application to breast examinations, and describes the features extracted from the REIS examination. The concept and application of the GMM classifier are also presented in this section. Section 3 presents the results, followed by a discussion given in Sec. 4.

MATERIALS AND METHODS

The REIS acquisition system

Under an exclusive agreement between the University of Pittsburgh and Dhurjaty Electronics Consulting LLC (Rochester, NY), a unique multiprobe-based REIS system was designed and assembled (Fig. 1). The system measures and records multichannel EIS output signal sweeps between 200 and 800 kHz. The system consists of a mechanical support, an electronic box with two sensor cups, and a notebook computer that runs the management software for system control and for data acquisition and recording. The two sensor cups, which have different surface curvatures, are mounted on the “front” and “back” of the electronic box and can be easily rotated 180° so that the technologist can use either cup to fit “smaller” and “larger” breasts when conducting the REIS examination. The curvature of the cups and the distance between the probes were based on actual physical measurements using a mock-up device. Each sensor cup has seven mounted chrome plated copper probes: one probe is located at the center of the cup and the other six are mounted uniformly along an “outer” circle with a 60 mm radius at the 12, 2, 4, 6, 8, and 10 o’clock positions, respectively, as shown in Fig. 1. The probes (contacts) protrude approximately 4 mm from the base of the cups to enable good contact with the breast. The breast skin surface and all probes are cleaned using alcohol pads prior to an REIS examination. During the examination, the center probe contacts the nipple and the other six probes contact six fixed points on the breast skin surface at a fixed distance from the nipple (center probe). In the current study, no conducting gel or other elements are used to improve contact between the probes and the breast skin surface. After the REIS system detects adequate contact between each of the seven probes and either the nipple or breast skin, the electrical impedance signal scanning can be initiated. The system applies a voltage of 1.5 V and measures the current, with a maximal current of 30 mA; this level is safe, and to date we have had no indication of any sensation of an electrical current by the participants.23 The complete REIS examination of one breast takes 8 s to acquire and record EIS output signal sweeps between all connected probe pairs.

Figure 1. The multiprobe REIS system installed in our clinical breast imaging facility.

The REIS examination

During a complete REIS examination, the same positioning and scanning procedure is performed on both breasts. The scanning time of the REIS examination for each breast is 8 s. The recorded EIS signal sweeps in each detection channel (a single pair of probes) include three components: an in-phase signal, denoted by $I$, a quadrature signal, denoted by $Q$, and a magnitude, denoted by $M$, such that $M=\sqrt{I^2+Q^2}$. The phase signal is therefore given by $p=\arctan(Q/I)$. Each of the signals includes 121 output values for frequencies ranging between 200 and 800 kHz at 5 kHz increments. At the resonance frequency, the EIS signal magnitude ($M$) reaches a minimum value and the phase signal crosses the $p=0$ line (changing from a negative to a positive value).
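The relationship between the in-phase, quadrature, magnitude, and phase signals can be illustrated with a short Python sketch. It is not the system's acquisition software: the function names and the synthetic sweep are assumptions made for illustration, and the resonance frequency is simply taken as the frequency of minimum magnitude, as described above.

```python
import numpy as np

# Frequency grid described in the text: 200-800 kHz in 5 kHz steps (121 points).
FREQS_KHZ = np.arange(200, 801, 5)

def magnitude_and_phase(i_signal, q_signal):
    """Return the magnitude sqrt(I^2 + Q^2) and the phase arctan(Q / I) in degrees."""
    i_signal = np.asarray(i_signal, dtype=float)
    q_signal = np.asarray(q_signal, dtype=float)
    magnitude = np.hypot(i_signal, q_signal)
    phase_deg = np.degrees(np.arctan(q_signal / i_signal))
    return magnitude, phase_deg

def resonance_frequency(magnitude, freqs=FREQS_KHZ):
    """Resonance frequency taken as the frequency at which the magnitude is minimal."""
    return float(freqs[int(np.argmin(magnitude))])

# Synthetic sweep (illustrative only, not measured data):
# I is constant and Q crosses zero at 500 kHz.
i_demo = np.full(FREQS_KHZ.shape, 2.0)
q_demo = (FREQS_KHZ - 500) / 100.0
m_demo, p_demo = magnitude_and_phase(i_demo, q_demo)
f_res = resonance_frequency(m_demo)
```

In this synthetic sweep the magnitude minimum and the negative-to-positive phase crossing coincide at 500 kHz, which mirrors the behavior described for the recorded signals.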

The REIS database

Under an IRB approved protocol, REIS examinations were performed in our breast imaging clinical facility on consenting women between the ages of 30 and 50 yr who met the inclusion criteria. For the purpose of this study, participants were classified based on the actual diagnostic outcome into positive and negative groups. The positive group included 66 women who had been recommended for biopsy following an imaging-based diagnostic workup; for these women, the relevant REIS examination was performed prior to the biopsy (typically on the same day, approximately half an hour prior to the scheduled biopsy). Among these 66 women, imaging-based examinations (i.e., mammography, additional views, ultrasound, and magnetic resonance imaging) showed that 44 cases depicted suspicious masses, asymmetric density, or architectural distortions, 9 depicted microcalcification clusters alone, and 13 depicted both masses and microcalcification clusters. The results from the biopsies that followed were positive for cancer in 9 cases, and 10 cases were classified as “high risk” abnormalities with a recommendation for surgical excision. All other biopsies (47) were diagnosed as benign. The average mass size in this group as measured by ultrasound was 1.37±1.28 cm (range: 0.4–7 cm), and the pathology reported sizes of malignant masses ranged from 0.85 to 2.0 cm. Among the 66 biopsied cases, 9 lesions were located in the “periareolar” or “retroareolar” regions and 3 were located posteriorly, close to the chest wall. The remaining 54 biopsied lesions were reasonably well distributed in all breast regions.

The negative group included 108 women who either previously had a negative screening examination and were visiting our facility for their annual screening examination or had been recalled for a diagnostic follow-up during a prior screening examination but were later (after the REIS examination and the diagnostic workup) determined not to require a biopsy. The negative status verification in these women (without a recall) included either the negative mammograms of the scheduled screening examination that followed or the results of the diagnostic workup. Although long-term follow-up on these women is not available, the expectation value for future positive findings within 1 yr is extremely low.

Our inclusion criteria for women who had been recalled for a diagnostic follow-up, or those who had been recommended and scheduled for a biopsy, did not include restrictions regarding the type of abnormality in question (e.g., mass, cluster of microcalcifications, or both) or the specific location and/or depth of the suspected abnormality within the breast. In this dataset, the majority of breasts were rated by the radiologists during the imaging-based examinations as “heterogeneously dense” (61.5%) or “extremely dense” (9.8%). Table 1 presents the distribution of breast tissue density BIRADS ratings in the positive and negative groups.

Table 1. Distribution of breast tissue density BIRADS ratings of the 174 participants as subjectively rated by the radiologists during imaging procedures.

Group | BIRADS 1 | BIRADS 2 | BIRADS 3 | BIRADS 4 | Total cases
Positive group | 0 | 10 | 44 | 12 | 66
Negative group | 10 | 30 | 63 | 5 | 108
Entire dataset | 10 | 40 | 107 | 17 | 174

Features extraction

The basic idea behind REIS-based breast risk stratification and/or abnormality detection is analogous to screening mammography, in which negative cases show a higher level of tissue symmetry between the two breasts of a woman than positive cases: contralateral breasts should show a higher level of impedance symmetry in mirror-matched regions in negative cases than in abnormal cases.17 Hence, asymmetry between the two breasts at and around the resonance frequency may indicate a breast abnormality that could lead to a higher risk of having or developing breast cancer.23, 24 Following this concept, the feature set is designed to represent the level of symmetry between EIS features extracted at and around the resonance frequencies measured from the two breasts. For this purpose, geometrically corresponding locations obtained from the two breasts of each participant were analyzed, i.e., the features were extracted in a mirror-matched fashion.

Using Fig. 2 as an example, we describe the definitions, the computational methods, and the resulting 33 EIS signal-difference based features. The initial feature set included a large number of EIS-based features from the six sets of EIS signal sweeps acquired between the center probe and each one of the other probes. Specifically, the following 33 features were extracted. Feature 1 was defined as the absolute difference between the ranges of all six resonance frequencies for each of the breasts as computed for the left (L) and right (R) breasts (F1=|ΔfL−ΔfR|). Features 2–33 were divided into two distinct groups. In the first group, a number of EIS signal values of the six EIS sweeps (including both magnitude and phase) for each breast were averaged. Except for one feature that was computed as an average of two averaged EIS signal values from the two breasts, the remaining 15 features were all computed by subtraction of the two averaged EIS signal values from the two breasts. These 16 features are described as follows:

  • (1) Features 2 and 3: Two averaged resonance frequencies, $\bar{f}_L=\left(\sum_{i=1}^{6} f_i\right)/6$ for the left breast and $\bar{f}_R=\left(\sum_{i=1}^{6} f_i\right)/6$ for the right breast, were used to define the following two features. Feature 2 was the average of the two averaged resonance frequencies from the left and right breasts, $F_2=(\bar{f}_L+\bar{f}_R)/2$; this was the only feature that did not represent a subtracted value but rather the overall resonance frequency value measured in the case. Feature 3 was the absolute difference between the two averaged resonance frequency values, $F_3=|\bar{f}_L-\bar{f}_R|$.

  • (2) Feature 4: By identifying the EIS signal magnitude value at the resonance frequency, $M(f)$, of each sweep and computing the average value over all six EIS sweeps, $\bar{M}=\left(\sum_{i=1}^{6} M(f)_i\right)/6$, feature 4 was defined as $F_4=|\bar{M}_L-\bar{M}_R|$, representing the absolute difference in averaged EIS magnitude values at the respective resonance frequencies.

  • (3) Features 5–10: Near each resonance frequency, six EIS signal magnitude values of an EIS sweep, $M(f_i)$, were extracted at six recorded frequencies: two smaller than the resonance frequency ($f_{-10}=f-10$ kHz and $f_{-5}=f-5$ kHz) and four larger than the resonance frequency (from $f_{5}=f+5$ kHz to $f_{20}=f+20$ kHz at 5 kHz increments). From each of these extracted EIS magnitude values, the EIS signal magnitude value at the resonance frequency of the same EIS sweep was subtracted, $\Delta M_i=M(f_i)-M(f)$. After computing the average EIS signal magnitude differences over all six probe pairs, $\Delta\bar{M}_i=\frac{1}{6}\sum_{k=1}^{6}(\Delta M_i)_k$, for each breast, six features were defined as $F_j=|\Delta\bar{M}(f_i)_L-\Delta\bar{M}(f_i)_R|$, where $j=5,6,\ldots,10$ and $i=-10,-5,+5,+10,+15,+20$ represent the six frequency differences (in kHz) from the resonance frequency of the EIS signal sweeps. Thus, features 5–10 represented the absolute differences of the averaged EIS signal magnitude values between the left and right breasts at these six frequencies.

  • (4) Feature 11: Similar to the computation of feature 3 representing the difference in the value between two frequencies, another frequency difference value between two breasts was computed based on the EIS in-phase signal sweeps. For each EIS signal phase sweep, the frequency at which the in-phase signal value reaches a plateau, $f(p^0)$, was found. The average frequency value of the six EIS signal phase sweeps, $\bar{f}(p^0)=\left(\sum_{i=1}^{6} f(p_i^0)\right)/6$, was computed for the left and the right breasts. Feature 11 was defined as $F_{11}=|\bar{f}(p_L^0)-\bar{f}(p_R^0)|$, representing the absolute difference between the two averaged frequencies at which EIS phase values reached plateaus for the two breasts.

  • (5) Features 12–17: Similar to features 5–10, EIS signal phase value differences were computed rather than EIS signal magnitude value differences for the same set of six frequencies of interest. Features 12–17 were defined as the absolute differences between the two breasts of the averaged phase values at the specific frequencies, namely, $F_j=|\bar{p}(f_i)_L-\bar{p}(f_i)_R|$, where $j=12,13,\ldots,17$ and $i=-10,-5,+5,+10,+15,+20$ kHz from the resonance frequencies, respectively.
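The averaged magnitude-based features listed above (features 2–10) can be sketched in Python as follows. This is a minimal illustration rather than the authors' implementation: the function names, the dictionary layout, and the synthetic sweeps are hypothetical, the resonance frequency of each sweep is taken as the frequency of minimum magnitude, and the resonance is assumed to lie far enough from the sweep edges that all six offsets fall inside the 200–800 kHz range.

```python
import numpy as np

FREQS_KHZ = np.arange(200, 801, 5)        # 121 recorded frequencies
OFFSETS_KHZ = (-10, -5, 5, 10, 15, 20)    # offsets used for features 5-10

def breast_summary(mag):
    """mag: array of shape (6, 121), one magnitude sweep per probe pair."""
    mag = np.asarray(mag, dtype=float)
    rows = np.arange(mag.shape[0])
    idx = mag.argmin(axis=1)              # resonance index of each sweep
    f_res = FREQS_KHZ[idx]                # resonance frequencies (kHz)
    m_res = mag[rows, idx]                # magnitude at resonance
    steps = [o // 5 for o in OFFSETS_KHZ] # offsets expressed in grid steps
    # Magnitude difference from resonance at each offset, averaged over sweeps.
    d_mag = np.array([(mag[rows, idx + s] - m_res).mean() for s in steps])
    return {"f_mean": f_res.mean(), "m_mean": m_res.mean(), "d_mag": d_mag}

def averaged_features(mag_left, mag_right):
    """Features 2-10 as absolute left/right differences (feature 2 is an average)."""
    left, right = breast_summary(mag_left), breast_summary(mag_right)
    feats = {
        "F2": (left["f_mean"] + right["f_mean"]) / 2,
        "F3": abs(left["f_mean"] - right["f_mean"]),
        "F4": abs(left["m_mean"] - right["m_mean"]),
    }
    for j, diff in enumerate(np.abs(left["d_mag"] - right["d_mag"]), start=5):
        feats[f"F{j}"] = float(diff)
    return feats

def synthetic_sweeps(f0_list, base):
    """Illustrative quadratic sweeps with minima at the given frequencies."""
    return np.array([base + ((FREQS_KHZ - f0) / 100.0) ** 2 for f0 in f0_list])

left_demo = synthetic_sweeps([450, 455, 460, 465, 470, 475], base=1.0)
right_demo = synthetic_sweeps([500] * 6, base=1.5)
feats = averaged_features(left_demo, right_demo)
```

For these synthetic sweeps the two breasts differ in mean resonance frequency and in magnitude at resonance, but have the same shape around resonance, so features 3 and 4 are nonzero while features 5–10 vanish.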

Figure 2. Six sets of recorded EIS signal magnitude sweeps [(a) and (b)] and related phases [(c) and (d)] for the left and right breasts from one examination. The lowest, average, and highest resonance frequencies of the six sweeps are shown for comparison in (a) and (b), respectively. The frequencies at which phase signals reach plateaus are shown in (c) and (d), respectively. A mirror-matched pair of EIS magnitude signal sweeps (e) and phase (f) that have the largest difference in resonance frequency between the right and left breasts in (a)–(d) are shown. Note that the presented output values (mV or deg) are amplified by the system.

Features 18–33 were similar to features 2–17 except that only one pair of mirror-matched EIS signal sweeps from the left and the right breasts, rather than averages of all six sweeps, was selected for computing these features. Specifically, for each examination, the resonance frequency differences for the six sets of mirror-matched EIS pairs of sweeps were computed. Then, one matched pair of EIS sweeps which had the maximum resonance frequency difference among all six matched pairs of all EIS sweeps was selected. Table 2 summarizes the distribution of cases that have a maximum resonance frequency difference between the two breasts at each of the six mirror-matched pairs in the dataset used in this study. After selecting the examination specific matched pair, all other (five) pairs were discarded, and the same computation process used to compute features 2–17 (F2–17) was reapplied to compute the second group of 16 EIS signal and phase related features (F18–33).
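The selection of the examination-specific matched pair can be sketched as below. The mirror mapping follows the probe positions listed in Table 2; the function name, the dictionary inputs, and the demo frequencies are hypothetical, and the tie-breaking rule (the first pair encountered wins) is an assumption, since the text does not state one.

```python
# Mirror-matched probe positions (left o'clock -> right o'clock), as in Table 2.
MIRROR = {12: 12, 2: 10, 4: 8, 6: 6, 8: 4, 10: 2}

def select_matched_pair(f_res_left, f_res_right):
    """Pick the mirror-matched pair with the largest resonance frequency difference.

    Inputs are dicts mapping o'clock position -> resonance frequency in kHz.
    Returns (left_position, right_position, absolute_difference).
    """
    def difference(pos):
        return abs(f_res_left[pos] - f_res_right[MIRROR[pos]])

    best = max(MIRROR, key=difference)  # ties resolved by dict order (assumption)
    return best, MIRROR[best], difference(best)

# Illustrative resonance frequencies in kHz (not measured data).
left = {12: 450, 2: 455, 4: 460, 6: 465, 8: 470, 10: 475}
right = {12: 452, 2: 470, 4: 440, 6: 465, 8: 455, 10: 480}

left_pos, right_pos, f19 = select_matched_pair(left, right)
f18 = (left[left_pos] + right[right_pos]) / 2  # average of the selected pair
```

Here the left 8 o'clock and right 4 o'clock probes differ most (30 kHz), so that pair would be used for features 18–33 and the other five pairs would be discarded.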

Table 2. Distribution (number of occurrences) by orientation (probe locations) of the mirror-matched probe pair (out of six pairs) that showed the maximum resonance frequency difference between breasts within a case.

Probe pair position on the left-right breast (o’clock) | 12–12 | 2–10 | 4–8 | 6–6 | 8–4 | 10–2
No. of cases | 44 | 34 | 22 | 26 | 11 | 37

After computing the set of 33 EIS signal related features, we added two non-EIS related features, namely, the participant’s age (F34) and the breast tissue density (F35) as subjectively rated by the radiologists based on four categories defined in American College of Radiology (ACR) recommended density BIRADS: (1) Almost entirely fat (<25% fibroglandular), (2) scattered fibroglandular (25%–50% fibroglandular), (3) heterogeneously dense (51%–75% fibroglandular), and (4) extremely dense (>75% fibroglandular). Since mammography typically has lower performance (both sensitivity and specificity) in younger women with dense breasts,25 it was important to incorporate these features and evaluate their effect, if any, on the performance of our REIS-based classifier. A summary of the initial feature set is given in Table 3.

Table 3. A list of the initial set of 33 EIS signal based features. With the exception of features 2 and 18 that represent an average of two feature values computed from two breasts, all other features represent absolute differences of two corresponding EIS signal values computed from the signals obtained from two breasts.

Feature | Feature description | Feature unit
1 | Difference in the range of six resonance frequencies | kHz
2 | Average of two averaged resonance frequencies | kHz
3 | Difference of two averaged resonance frequencies | kHz
4 | Difference of two averaged EIS signal magnitude values at corresponding resonance frequencies | mV
5–10 | Difference of two averaged EIS signal magnitude values near the resonance frequency ranging from −10 to +20 kHz at 5 kHz increments | mV
11 | Difference of two averaged frequency values when EIS phase signal reaches a plateau | kHz
12–17 | Difference of two averaged EIS signal phase values near the resonance frequency ranging from −10 to +20 kHz at 5 kHz increments | deg
18 | Average of two resonance frequencies exhibiting the maximum resonance frequency difference between mirror-matched pair of probes | kHz
19 | Difference between two resonance frequencies of the selected matched pair of probes | kHz
20 | Difference between two EIS signal magnitude values at the two matched resonance frequencies | mV
21–26 | Difference between two EIS signal magnitude values of the matched probe pair near the resonance frequencies ranging from −10 to +20 kHz at 5 kHz increments | mV
27 | Difference between two frequency values when EIS phase signal reaches a plateau | kHz
28–33 | Difference between two EIS signal phase values of the matched probe pair near the resonance frequencies ranging from −10 to +20 kHz at 5 kHz increments | deg

GMM-based classification

GMM-based classification methods have been applied in speech recognition.26 Mixture models, particularly GMMs, are a common technique for probability density estimation. This is justified by the fact that any density can be approximated to a required degree of accuracy using a finite Gaussian mixture.27 The mathematical properties of GMMs, as well as their flexibility and the availability of efficient estimation algorithms, make them attractive for classification problems.

A popular algorithm for GMM parameter estimation is expectation maximization (EM).28 This algorithm allows for iterative optimization of the mixture parameters with a monotonically nondecreasing likelihood, and has a relatively simple implementation. However, EM suffers from several drawbacks: (i) it requires a priori knowledge of the number of mixing components, (ii) it is highly sensitive to parameter initialization, and (iii) it tends to converge to local maxima. Greedy learning of GMMs, recently proposed,29, 30 overcomes some of the drawbacks of the EM algorithm (e.g., Ref. 31). The greedy learning algorithm estimates the GMM parameters in a greedy fashion and thus inherently estimates the model order. In this study, we used a greedy algorithm with an arbitrarily selected maximum model order (number of Gaussians) of 10 and a threshold of $10^{-5}$. Following the GMM concept, the positive and negative cases were represented by two classes, each of which was modeled by a GMM, i.e., a weighted sum of K Gaussian component densities that serves as a flexible representation of the probability density function (pdf).

A GMM, representing a random process, x, can be expressed as follows:

$$f_K(x)=\sum_{k=1}^{K}\pi_k\,\phi(x;\theta_k), \tag{1}$$

where $\phi(x;\theta_k)$ represents the $k$th Gaussian mixture component and $\pi_k$ represents the mixing weight such that $\sum_{k=1}^{K}\pi_k=1$ and $\pi_j\ge 0\ \forall j$. A multivariate Gaussian mixture is given by the weighted sum in Eq. 1, where the $j$th component $\phi(x;\theta_j)$ is the $d$-dimensional Gaussian density,

$$\phi(x;\theta_j)=(2\pi)^{-d/2}\,|S_j|^{-1/2}\exp\left[-0.5\,(x-m_j)^{T}S_j^{-1}(x-m_j)\right], \tag{2}$$

which is parametrized on the mean $m_j$ and the covariance matrix $S_j$, collectively denoted by the parameter vector $\theta_j$.
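As a concrete illustration of Eqs. 1 and 2, the following Python sketch evaluates a Gaussian mixture density for given weights, means, and covariance matrices. It only evaluates the density; it does not estimate the parameters and is not the greedy learning algorithm used in this study. The function names are hypothetical.

```python
import numpy as np

def gaussian_pdf(x, mean, cov):
    """d-dimensional Gaussian density with mean vector `mean` and covariance `cov`."""
    x = np.asarray(x, dtype=float)
    mean = np.asarray(mean, dtype=float)
    cov = np.asarray(cov, dtype=float)
    d = mean.size
    diff = x - mean
    # Quadratic form (x - m)^T S^{-1} (x - m), computed without an explicit inverse.
    quad = diff @ np.linalg.solve(cov, diff)
    norm = (2 * np.pi) ** (-d / 2) * np.linalg.det(cov) ** -0.5
    return float(norm * np.exp(-0.5 * quad))

def gmm_pdf(x, weights, means, covs):
    """Weighted sum of Gaussian component densities."""
    weights = np.asarray(weights, dtype=float)
    if np.any(weights < 0) or not np.isclose(weights.sum(), 1.0):
        raise ValueError("mixing weights must be nonnegative and sum to 1")
    return sum(w * gaussian_pdf(x, m, s) for w, m, s in zip(weights, means, covs))

# Two identical unit-covariance components in 2-D: the mixture density at the
# origin equals that of a single standard Gaussian, 1 / (2 * pi).
identity = np.eye(2)
p_mix = gmm_pdf([0.0, 0.0], [0.5, 0.5], [[0.0, 0.0], [0.0, 0.0]], [identity, identity])
```

Solving the linear system instead of inverting the covariance matrix is a numerical convenience chosen for this sketch, not something specified in the text.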

The classification criterion for GMM is based on a maximum likelihood decision rule. According to this approach, the decision is made by finding the class, m, which maximizes the likelihood function,

$$\hat{m}=\arg\max_{m=1,2,\ldots,M} f_K(x;H_m), \tag{3}$$

where $f_K(x;H_m)$ is the pdf of the classification features, $x$, under hypothesis $H_m$. These pdfs under the different hypotheses are estimated by the greedy GMM. In the present work, we aimed to discriminate between two classes of cases. Therefore, an LRT was employed as follows:

$$\mathrm{LRT}=\frac{f(x;H_2)}{f(x;H_1)}\gtrless\gamma. \tag{4}$$

The log LRT (LLRT) is given by

$$\mathrm{LLRT}=\ln f(x;H_2)-\ln f(x;H_1)\gtrless\gamma. \tag{5}$$

A receiver operating characteristic (ROC) curve can therefore be calculated by changing the threshold, γ. The optimal threshold, γ, is typically determined in such a manner that the classification error is minimized. In clinical applications, such as the application considered in the present work, the threshold may be determined based on a predefined clinically relevant criterion (e.g., based on a predefined minimal specificity requirement). Since our REIS system is intended to be ultimately used as a breast cancer prescreening “risk assignment” system, only performance at high specificity levels is of interest if the system is to be practical for clinical use. A reasonable choice for specificity thresholds in this group of women would therefore be 80% or 90%.
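Choosing the threshold from a specificity requirement can be illustrated with the following Python sketch. It assumes that a case is called positive when its LLRT score exceeds γ and that the threshold is set from the scores of the negative cases; the function names and the demo scores are hypothetical and are not data from this study.

```python
import numpy as np

def llrt(log_pdf_positive, log_pdf_negative):
    """Log likelihood ratio: ln f(x; H2) - ln f(x; H1)."""
    return np.asarray(log_pdf_positive, float) - np.asarray(log_pdf_negative, float)

def threshold_for_specificity(negative_scores, target_specificity):
    """Smallest observed threshold that keeps the target fraction of negatives at or below it."""
    s = np.sort(np.asarray(negative_scores, dtype=float))
    # Number of negatives that must score at or below the threshold
    # (the small epsilon guards against floating-point round-up).
    k = int(np.ceil(target_specificity * s.size - 1e-9))
    return float(s[max(k, 1) - 1])

def sensitivity_specificity(scores, labels, gamma):
    """Cases with score > gamma are called positive (assumed decision rule)."""
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=int)
    called_positive = scores > gamma
    sensitivity = called_positive[labels == 1].mean()
    specificity = (~called_positive)[labels == 0].mean()
    return float(sensitivity), float(specificity)

# Illustrative LLRT scores (not study data): 10 negative and 5 positive cases.
neg = [-3.0, -2.5, -2.0, -1.5, -1.0, -0.5, 0.0, 0.5, 1.0, 2.0]
pos = [-1.0, 0.8, 1.5, 2.5, 3.0]
scores = np.array(neg + pos)
labels = np.array([0] * len(neg) + [1] * len(pos))

gamma = threshold_for_specificity(neg, 0.90)
sens, spec = sensitivity_specificity(scores, labels, gamma)
```

Sweeping the target specificity (or γ itself) over its range and recording the resulting sensitivity at each value traces out an empirical ROC curve.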

Figure 3 shows a general scheme diagram of the proposed GMM-based classification system. The system is composed of two GMMs, denoted by GMM1 and GMM2, representing the negative and positive cases, respectively, which are estimated using the greedy-GMM algorithm during the learning phase and are incorporated in the LLRT test in Eq. 5.

Figure 3. A general scheme of the proposed greedy-GMM-based classifier.

Feature selection and classifier optimization

Selection of the best feature set for the specific question at hand is an important step in developing any classification system, and the appropriate selection method may depend on the specific application. In previous studies, a number of methods have been proposed and applied to select “optimal” feature sets, including GA,32, 33, 34 floating search methods,35 and others.36, 37 In this work, the GA approach was chosen due to its flexibility and efficiency. Furthermore, GA allows searching for a suboptimal feature set, which can be chosen based on a specific optimization criterion, independent of the classifier. Specifically, the GA was implemented with the following optimization criterion (fitness index):

$$c(m)=A_z(m)-\rho N(m), \tag{6}$$

where Az(m) represents the computed area under the ROC curve (AUC), N(m) is the number of features incorporated in the solution m, and ρ is the penalty constant, set in this work to ρ=0.001. The purpose of the penalty term is to give priority to solutions with a smaller number of incorporated features; namely, when two different solutions yield the same performance score, Az(m), the one with the smaller number of features is preferred. The GA chromosome size was 35, in which each gene of the GA chromosome represents one of the 35 initial features (the 33 EIS signal related features plus the woman’s age and breast density rating). The binary coding method was used to build each GA chromosome, in which 1 indicates that the specific feature in question is selected (activated) in the classifier and 0 means that the feature is discarded. During the GA optimization, a mutation probability of 0.01 and a crossover probability of 0.2 were implemented. The initial GA chromosome was randomly selected by the GA program.
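The fitness index and the binary chromosome coding can be illustrated with a few lines of Python. The function names are hypothetical; the penalty constant is the value reported above, and the example chromosome encodes the 14-feature subset and the AUC of 0.78 reported in the Results, used here purely as an illustration.

```python
RHO = 0.001       # penalty constant reported in the text
N_FEATURES = 35   # 33 EIS features plus age (F34) and breast density (F35)

def fitness(auc, chromosome, rho=RHO):
    """Fitness index: AUC minus rho times the number of selected features."""
    return auc - rho * sum(chromosome)

def selected_features(chromosome):
    """1-based indices of the features whose gene is set to 1."""
    return [i + 1 for i, gene in enumerate(chromosome) if gene == 1]

# Chromosome for the subset reported in the Results:
# F3, F8, F10-11, F14-15, F17-19, F28-31, and F33.
SELECTED = [3, 8, 10, 11, 14, 15, 17, 18, 19, 28, 29, 30, 31, 33]
chromosome = [1 if i in SELECTED else 0 for i in range(1, N_FEATURES + 1)]

score_14 = fitness(0.78, chromosome)         # 0.78 - 0.001 * 14
score_35 = fitness(0.78, [1] * N_FEATURES)   # same AUC, all 35 features
```

With equal AUC values, the 14-feature chromosome receives the higher fitness index, which is the tie-breaking behavior the penalty term is meant to produce.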

Once a GA chromosome was selected, a leave-one-case-out (LOCO) method was used to assess the performance of the GMM classifier due to the size limitation of the REIS dataset. When using the leave-one-case-out method, N−1 cases (i.e., 173 cases in this study) are used to train the classifier, and the resulting classifier is applied to the one remaining test case (i.e., for generating a classification score). This process is repeated N times (N=174) so that each case in the dataset is used once as a test case of the classifier. Once the N classification scores have been generated, the GA computes the fitness index of the chromosome [Eq. 6]. The LOCO method is considered to be a minimally biased optimization method when using limited datasets.38 The GA chromosomes with a higher computed fitness index have higher priority to be selected to generate the new GA chromosomes in the next optimization generation using predetermined crossover and mutation criteria (probabilities). The maximum number of GA optimization iterations was set to 100. The GA chromosome that achieved the highest fitness index, c(m), in this GA optimization process was selected, and the features selected in this GA chromosome were used as the set of optimal features to build the final GMM classifier.
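The leave-one-case-out procedure and the nonparametric AUC computed from its scores can be sketched in Python as follows. The classifier used here is a deliberately simple stand-in (one diagonal Gaussian per class) chosen to keep the example short; it is an assumption of this sketch and not the greedy GMM used in the study. All function names and the toy data are hypothetical.

```python
import numpy as np

def auc_from_scores(scores, labels):
    """Nonparametric AUC: P(positive score > negative score) + 0.5 * P(tie)."""
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=int)
    pos, neg = scores[labels == 1], scores[labels == 0]
    greater = (pos[:, None] > neg[None, :]).sum()
    ties = (pos[:, None] == neg[None, :]).sum()
    return float((greater + 0.5 * ties) / (pos.size * neg.size))

def leave_one_case_out(features, labels, fit, score):
    """Train on N-1 cases, score the held-out case, repeat for every case."""
    features = np.asarray(features, dtype=float)
    labels = np.asarray(labels, dtype=int)
    n = labels.size
    out = np.empty(n)
    for i in range(n):
        train = np.arange(n) != i
        model = fit(features[train], labels[train])
        out[i] = score(model, features[i])
    return out

def fit_stand_in(x, y):
    """Stand-in model: per-class mean and variance (NOT the greedy GMM)."""
    return {c: (x[y == c].mean(axis=0), x[y == c].var(axis=0) + 1e-6) for c in (0, 1)}

def score_stand_in(model, x):
    """Log likelihood ratio of the positive class versus the negative class."""
    def loglik(c):
        mean, var = model[c]
        return -0.5 * np.sum(np.log(2 * np.pi * var) + (x - mean) ** 2 / var)
    return loglik(1) - loglik(0)

# Toy one-feature dataset (illustrative only): well-separated classes.
features = np.array([[0.0], [1.0], [2.0], [3.0], [5.0], [6.0], [7.0], [8.0]])
labels = np.array([0, 0, 0, 0, 1, 1, 1, 1])
scores = leave_one_case_out(features, labels, fit_stand_in, score_stand_in)
auc = auc_from_scores(scores, labels)
```

In a GA setting, `fit` and `score` would be applied only to the feature columns activated by the chromosome under evaluation, and the resulting AUC would feed the fitness index.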

We also compared performance levels of the GMM classifier to several other machine learning classifiers including a support vector machine (SVM), a classification tree (CART), and a linear discriminant analysis (LDA) based classifier when applied to the same dataset used in this study. Each classifier was independently optimized (trained and tested) implementing the same genetic algorithm optimization protocol used to optimize the GMM classifier.

RESULTS

From the original feature pool of 33 EIS signal related features and two non-EIS related features, GA selected a subset of 14 features including F3, F8, F10–11, F14–15, F17–19, F28–31, and F33. The distribution of the selected features shows that four features represented frequency differences (F3, F11, F18, and F19), two represented EIS signal magnitude differences (F8 and F10), and eight represented EIS signal phase differences (F14–15, F17, F28–31, and F33). The two non-REIS-based features, age (F34) and BIRADS rated breast density (F35), were not selected by the GA for inclusion in the classifier, supporting the general expectation that, unlike mammography, REIS examinations are independent of the patient age and/or breast tissue density.

Figure 4 presents the convergence of the GA-based feature selection algorithm. The figure presents the best chromosome score, as well as the average population score, as a function of the GA iteration index (number). The GA-based feature selection algorithm significantly improves the classifier performance. It should be noted that increasing the iteration limit to 150 did not yield further performance improvement.

Figure 4. Performance levels measured by the area under the ROC curve as a function of the iteration number.

Figure 5 presents four nonparametric ROC performance curves generated by the GMM, SVM, CART, and LDA based classifiers. This figure shows that the performance of the GMM classifier is similar to that of the SVM classifier with an AUC of Az=0.78. Both classifiers significantly outperformed (p<0.001) the CART and LDA classifiers, which yielded AUCs of Az=0.73 and Az=0.60, respectively. Despite the comparable overall performance as measured by AUC for the GMM and SVM classifiers, the GMM performed somewhat better (higher detection sensitivity) in the higher specificity region (specificity>90%), which is the clinically relevant range. At a 90% specificity level (0.1 “false-positive” rate), the GMM yielded a true detection rate of 47%, as compared to 40%, 28%, and 8% for the SVM, CART, and LDA classifiers, respectively.
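Reading a sensitivity value at a fixed specificity from a set of classification scores can be sketched as below. This is a minimal Python illustration under stated assumptions: a case is called positive when its score is strictly greater than the threshold, the threshold is placed at an observed negative score, and the function name and toy scores are hypothetical.

```python
import math


def sensitivity_at_specificity(scores_pos, scores_neg, specificity=0.90):
    """Sensitivity at the lowest threshold that achieves at least the
    requested specificity on the negative cases."""
    ordered = sorted(scores_neg)
    # Number of negatives that must fall at or below the threshold.
    need = math.ceil(specificity * len(ordered) - 1e-9)
    threshold = ordered[need - 1] if need > 0 else float("-inf")
    detected = sum(1 for s in scores_pos if s > threshold)
    return detected / len(scores_pos)


# Toy scores (not study data): 10 negative and 4 positive cases.
negatives = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
positives = [9.5, 8, 12, 3]
rate = sensitivity_at_specificity(positives, negatives, 0.90)
```

In this toy example the threshold falls at 9, so one of ten negatives is a false positive (90% specificity) and two of the four positives are detected.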

Figure 5. Performance curves for the four classifiers based on GMM, LDA, SVM, and CART.

Although the primary focus of this work was to assess the potential of a REIS-based system to identify the presence or absence of suspicious breast abnormalities ultimately requiring an invasive intervention (i.e., a breast biopsy), we also assessed the GMM classification performance for different types of abnormal cases. Table 4 summarizes the GMM classification sensitivity levels for different biopsy outcomes, type of abnormality, and breast density BIRADS rating at specificity levels of 80% and 90%.

Table 4. Sensitivity levels for different groups of cases with different biopsy outcomes, type of abnormality, and breast density BIRADS ratings at specificity levels of 80% and 90%.

All 66 biopsy cases                                 No. of cases   Specificity 80%   Specificity 90%
Abnormality       Cancer-verified                   9              5 (56%)           4 (44%)
                  High risk                         10             6 (60%)           6 (60%)
                  Benign-verified                   47             24 (51%)          18 (38%)
                  Masses only                       45             25 (56%)          19 (42%)
                  Calcification only                11             10 (91%)          9 (82%)
                  Both masses and calcification     10             6 (60%)           3 (30%)
Breast density    BIRADS 2                          10             5 (50%)           3 (30%)
                  BIRADS 3                          44             25 (57%)          20 (45%)
                  BIRADS 4                          12             9 (75%)           8 (67%)

DISCUSSION

We presented a substantially modified EIS approach, termed REIS, to prescreen younger women (e.g., <50 years old) and assess their risk for having highly suspicious breast abnormalities requiring imaging-based follow-up resulting in a recommendation for biopsy. Unlike previous EIS systems, the current approach focuses on assessing impedance signal asymmetry between the two breasts at and around the resonance frequency of the tissue being measured. By using multiple probes and measuring impedance between multiple pairs of probes, we increase the sensitivity of the system to abnormalities located in different regions of the breast. The REIS system measures impedance-asymmetry distributions in six concentric, uniformly distributed regions of the breast, and our GMM-based classifier uses the average EIS signal feature differences computed from all six probes as well as feature differences computed from only the one probe pair that shows the maximum resonance frequency difference (asymmetry) between two mirror-matched bilateral breast regions.
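The two kinds of asymmetry measures mentioned here, an average over all six probe pairs and a value taken from the pair with the largest resonance frequency difference, can be sketched in Python as follows. Several details are assumptions made only for illustration: the resonance frequency is taken as the frequency of minimum signal magnitude, the asymmetry is an absolute difference, and the function names and synthetic sweeps are hypothetical. The 200 to 800 kHz sweep at 5 kHz steps follows the Methods description.

```python
def resonance_frequency(freqs_khz, magnitudes):
    """ASSUMPTION: resonance frequency = frequency of minimum magnitude."""
    idx = min(range(len(magnitudes)), key=magnitudes.__getitem__)
    return freqs_khz[idx]


def asymmetry_features(left_sweeps, right_sweeps, freqs_khz):
    """left_sweeps and right_sweeps each hold six magnitude sweeps,
    mirror-matched by index between the two breasts."""
    diffs = [abs(resonance_frequency(freqs_khz, left)
                 - resonance_frequency(freqs_khz, right))
             for left, right in zip(left_sweeps, right_sweeps)]
    max_pair = max(range(len(diffs)), key=diffs.__getitem__)
    return {
        "mean_diff_khz": sum(diffs) / len(diffs),  # average over six pairs
        "max_diff_khz": diffs[max_pair],           # most asymmetric pair
        "max_pair": max_pair,
    }


# Sweep grid: 200 to 800 kHz at 5 kHz intervals (121 points).
freqs = list(range(200, 801, 5))


def v_sweep(f0):
    """Synthetic V-shaped sweep with its minimum at f0 kHz (toy data)."""
    return [abs(f - f0) for f in freqs]


left = [v_sweep(500) for _ in range(6)]
right = [v_sweep(f0) for f0 in (500, 505, 500, 520, 500, 500)]
features = asymmetry_features(left, right, freqs)
```

In this synthetic case the fourth probe pair (index 3) shows the largest resonance frequency difference, 20 kHz, while the mean difference over the six pairs is 25/6 kHz.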

The ultimate intended use of the REIS-based approach introduced in this paper is to provide a fast, cost-effective, easy to use, non-radiation-based, and non-image-based prescreening tool that can be incorporated into physicians’ practices for periodic examinations of younger women who do not participate in annual mammography-based screening because of their age at examination or other personal reasons. Women who are identified by this prescreening tool as having higher risk (i.e., significantly larger electrical impedance signal asymmetry between two breasts) should be “ruled-in,” namely, recommended to be evaluated by imaging (e.g., mammography, ultrasound, or MRI). In this respect, REIS can be seen as an adjunct examination to the CBE (when and if it is performed). Any cancer identified under this scenario would otherwise not have been found until later, possibly at a higher stage. If ultimately successful, this type of approach could have a significant impact on breast screening paradigms, in general, and on earlier detection of breast cancers in younger women, in particular.

The results of this study demonstrate a number of unique characteristics of the proposed REIS approach. First, negative cases had, in general, a higher level of electrical impedance symmetry between the two breasts than positive cases. This supports our hypothesis that symmetry features acquired from REIS examinations of the two contralateral breasts contain significant information that may be used for the purpose of breast cancer risk stratification. Second, we successfully applied a GMM classifier, with a feature set selected by a GA specifically designed for the task at hand. The GMM classifier achieved favorable performance levels as compared to three other classifiers. Third, we confirmed our previous finding24 using both a different classifier and a different dataset. Unlike mammography, the performance of the REIS-based classifier was found to be insensitive to breast tissue density as determined by density BIRADS rating. The results showed that the optimized GMM-based classifier not only excluded the radiologist-rated breast density BIRADS feature (F35) but also yielded a somewhat higher detection sensitivity in cases with denser breasts (Table 4). This may prove to be an important advantage when applying this approach to the intended population of younger women. Fourth, REIS-based classification decisions were consistent not only with biopsy recommendations but also with pathology-verified outcomes. Namely, at a specificity level of 90%, the classifier correctly identified 44% of the cancer cases and 60% of the cases with “precancer high risk lesions,” both higher than the sensitivity level of identifying women with a biopsy-proven benign finding (38%), as shown in Table 4.

The GMM classifier is also computationally efficient. All of the algorithms used in this work were implemented in MATLAB (R2009a). The code was not optimized for real-time processing at the current stage. Nevertheless, using an AMD Phenom II X2 545 3.01 GHz processor with 3.75 GB RAM, feature extraction requires approximately 3–4 s for each signal, the GMM classifier training requires 2–4 min per model, depending on the specific configuration, and the validation process requires 2–3 s for each signal. Hence, ultimately the algorithms can be easily implemented in real time.

Despite these encouraging results, we recognize that this is a very preliminary study that includes a number of limitations and issues that warrant further investigation. First, only a limited number of REIS examinations acquired from the initial phase of a larger prospective study was used. Although we used the LOCO testing method to minimize training and testing biases,38 the robustness of the test results depends on whether the dataset used is diverse enough to adequately cover the intended use population. We are in the process of expanding the REIS dataset as a part of an ongoing prospective study. Further evaluations and validation of the system performance will be conducted as additional data become available. In this regard, the ultimate goal is to provide a tool that would enable examination of unscreened women at physicians’ offices (e.g., PCP and GYN), but we currently recruit women who undergo imaging screening procedures as we need “truth” (or outcome) during the development phase to demonstrate feasibility. Thus, our dataset may not adequately represent the actual intended use (or unscreened) population. Second, unlike other imaging devices (e.g., US), we currently do not use conducting gel during the REIS examination. Whether the use of conductivity gel between the sensor probes and the contacted breast skin surface could improve either the quality or the consistency of the recorded EIS output signal sweeps, and thereby improve detection performance, was not investigated. Third, unlike conventional imaging modalities, the REIS system, as currently configured, does not identify either the type or the location of breast abnormalities. Fourth, we recognize that impedance differences between normal and cancerous tissues as measured on small tissue volumes (e.g., specimens) may not be directly applicable to impedance measurements of the whole breast. Although this and other studies have shown the feasibility of applying EIS type measurements to the whole breast to detect suspicious lesions, the underlying biological mechanisms that allow for generalizing local tissue based measurements to the findings described here have not been adequately investigated, nor are they well understood. Fifth, we did not investigate whether probes made of different materials (e.g., Ag–AgCl electrodes) that have different conductivities would affect the measurements and thereby system performance. Based on our experience to date in the ongoing REIS data acquisition project, the system is able to generate and record smooth EIS output signal sweeps in all connected probe pairs. As the values we use in the classifier are relative (based on differences rather than absolute values), we believe that an impact, if any, due to the use of a specific probe material is likely to be minimal. Last, during the development of our GMM-based classifier, we extracted EIS output signal features from a linear scale system, which is different from the logarithmic scaling system frequently used in similar tasks. The effect of this choice will be explored in future work. Clearly, much additional research work is needed before we can develop an optimal REIS-based technology that can be widely applied in clinical practice.

ACKNOWLEDGMENTS

This work was supported in part by Grant 1R21/R33 CA127169 to the University of Pittsburgh from the National Cancer Institute, National Institutes of Health. The authors also thank the Magee-Women Research Institute & Foundation, Glimmer of Hope Fund, for supporting this effort.

References

  1. Smith R. A., “Breast cancer screening among women younger than age 50: A current assessment of the issues,” Ca-Cancer J. Clin. 50, 312–336 (2000). doi: 10.3322/canjclin.50.5.312
  2. Pisano E. D., Gatsonis C., Hendrick E., Yaffe M., Baum J. K., Acharyya S., Conant E. F., Fajardo L. L., Bassett L., D’Orsi C., Jong R., and Rebner M., “Diagnostic performance of digital versus film mammography for breast cancer screening,” N. Engl. J. Med. 353, 1773–1783 (2005). doi: 10.1056/NEJMoa052911
  3. Fenton J. J., Egger J., Carney P. A., Cutter G., D’Orsi C., Sickles E. A., Fosse J., Abraham L., Taplin S. H., Barlow W., Hendrick R. E., and Elmore J. G., “Reality check: Perceived versus actual performance of community mammographers,” AJR, Am. J. Roentgenol. 187, 42–46 (2006). doi: 10.2214/AJR.05.0455
  4. Berg W. A., Gutierrez L., NessAiver M. S., Carter W. B., Bhargavan M., Lewis R. S., and Ioffe O. B., “Diagnostic accuracy of mammography, clinical examination, US, and MR imaging in preoperative assessment of breast cancer,” Radiology 233, 830–849 (2004). doi: 10.1148/radiol.2333031484
  5. Poplack S. P., Tosteson T. D., Kogel C. A., and Nagy H. M., “Digital breast tomosynthesis: Initial experience in 98 women with abnormal digital screening mammography,” AJR, Am. J. Roentgenol. 189, 616–623 (2007). doi: 10.2214/AJR.07.2231
  6. Yang K., Kwan A. L., Huang S. Y., Packard N. J., and Boone J. M., “Noise power properties of a cone-beam CT system for breast cancer detection,” Med. Phys. 35, 5317–5327 (2008). doi: 10.1118/1.3002411
  7. Warner E., Plewes D. B., Hill K. A., Causer P. A., Zubovits J. T., Jong R. A., Cutrara M. R., DeBoer G., Yaffe M. J., Messner S. J., Meschino W. S., Piron C. A., and Narod M. S. A., “Surveillance of BRCA1 and BRCA2 mutation carriers with magnetic resonance imaging, ultrasound, mammography, and clinical breast examination,” JAMA, J. Am. Med. Assoc. 292, 1317–1325 (2004). doi: 10.1001/jama.292.11.1317
  8. Kriege M., Brekelmans C. T. M., Boetes C., Besnard P. E., Zonderland H. M., Obdeijn I. M., Manoliu R. A., Kok T., Madeleine H. P., Tilanus-Linthorst M. M. A., Muller S. H., Meijer S., Oosterwijk J. C., Beex L. V. A. M., Tollenaar R. A. E. M., de Koning H. J., Rutgers E. J. T., and Klijn J. G. M., “Efficacy of MRI and mammography for breast-cancer screening in women with a familial or genetic predisposition,” N. Engl. J. Med. 351, 427–437 (2004). doi: 10.1056/NEJMoa031759
  9. Leach M. O. et al., “Screening with magnetic resonance imaging and mammography of a UK population at high familial risk of breast cancer: A prospective multicentre cohort study (MARIBS),” Lancet 365, 1769–1778 (2005). doi: 10.1016/S0140-6736(05)66646-9
  10. Saslow D., Boetes C., Burke W., Harms S., Leach M. O., Lehman C. D., Morris E., Pisano E., Schnall M., Sener S., Smith R. A., Warner E., Yaffe M., Andrews K. S., and Russell C. A., “American Cancer Society guidelines for breast screening with MRI as an adjunct to mammography,” Ca-Cancer J. Clin. 57, 75–89 (2007). doi: 10.3322/canjclin.57.2.75
  11. Stojadinovic A., Nissan A., and Shriver C. D., “Electrical impedance scanning as a new breast cancer risk stratification tool for young women,” J. Surg. Oncol. 97, 112–120 (2008). doi: 10.1002/jso.20931
  12. Surowiec A. J., Stuchly S. S., Barr J. B., and Swarup A., “Dielectric properties of breast carcinoma and surrounding tissues,” IEEE Trans. Biomed. Eng. 35, 257–263 (1988). doi: 10.1109/10.1374
  13. Piperno G., Frei G., and Moshitzky M., “Breast cancer screening by impedance measurement,” Med. Biol. Eng. 2, 111–117 (1990).
  14. Malich A., Fritsch T., Anderson R., Boehm T., Freesmeyer M. G., Fleck M., and Kaiser W. A., “Electrical impedance scanning for classifying suspicious breast lesions: First results,” Eur. Radiol. 10, 1555–1561 (2000). doi: 10.1007/s003300000553
  15. Kerner T. E., Paulsen K. D., Hartov A., Soho S. K., and Poplack S. P., “Electrical impedance spectroscopy of the breast: Clinical imaging results in 26 subjects,” IEEE Trans. Med. Imaging 21, 638–645 (2002). doi: 10.1109/TMI.2002.800606
  16. Glickman Y. A., Filo O., Nachaliel U., Lenington S., Amin-Spector S., and Ginor R., “Novel EIS postprocessing algorithm for breast cancer diagnosis,” IEEE Trans. Med. Imaging 21, 710–712 (2002). doi: 10.1109/TMI.2002.800605
  17. Poplack S. P., Paulsen K. D., Hartov A., Meaney P. M., Pogue B. W., Tosteson T. D., Grove M. R., Soho S. K., and Wells W. A., “Electromagnetic breast imaging: Average tissue property values in women with negative clinical findings,” Radiology 231, 571–580 (2004). doi: 10.1148/radiol.2312030606
  18. Sumkin J. H., Stojadinovic A., Huerbin M., and Klym A. H., “Impedance measurements for early detection of breast cancer in younger women: A preliminary assessment,” Proc. SPIE 5034, 197–203 (2003). doi: 10.1117/12.480073
  19. Stojadinovic A., Nissan A., Gallimidi Z., Lenington S., Logan W., Zuley M., Yeshaya A., Shimonov M., Melloul M., Fields S., Allweis T., Ginor R., Gur D., and Shriver C. D., “Electrical impedance scanning for the early detection of breast cancer in young women: Preliminary results of a multicenter prospective clinical trial,” J. Clin. Oncol. 23, 2703–2715 (2005). doi: 10.1200/JCO.2005.06.155
  20. Stojadinovic A., Moskovitz O., Gallimidi G., and Fields S., “Prospective study of electrical impedance scanning for identifying young women at risk for breast cancer,” Breast Cancer Res. Treat. 97, 179–189 (2006). doi: 10.1007/s10549-005-9109-4
  21. Fricke H. and Morse S., “The electric capacity of tumors of the breast,” Cancer Res. 16, 310–376 (1926).
  22. Sumkin J., Zheng B., Gruss M., Drescher J., Leader J., Good W., Lu A., Cohen C., Shah R., Zuley M., and Gur D., “Assembling a prototype resonance electrical impedance spectroscopy system for breast tissue signal detection: Preliminary assessment,” Proc. SPIE 6917, 691716 (2008). doi: 10.1117/12.770457
  23. Gur D., Zheng B., Lederman D., Dhurjaty S., Sumkin J., and Zuley M., “A support vector machine designed to identify breasts at high risk using multi-probe generated REIS signals: A preliminary assessment,” Proc. SPIE 7627, 7627B7127–7646 (2010).
  24. Zheng B., Zuley M. L., Sumkin J. H., Catullo V. J., Abrams G. S., Rathfon G. Y., Chough D. M., Gruss M. Z., and Gur D., “Detection of breast abnormalities using a prototype resonance electrical impedance spectroscopy system: A preliminary study,” Med. Phys. 35, 3041–3048 (2008). doi: 10.1118/1.2936221
  25. Saarenmaa I. et al., “The effect of age and density of the breast on the sensitivity of breast cancer diagnostic by mammography and ultrasonography,” Breast Cancer Res. Treat. 67, 117–123 (2001). doi: 10.1023/A:1010627527026
  26. Reynolds D. A. and Rose R. C., “Robust text-independent speaker identification using Gaussian mixture speaker models,” IEEE Trans. Speech Audio Process. 3, 72–83 (1995). doi: 10.1109/89.365379
  27. Li J. Q. and Barron A. R., “Mixture density estimation,” in Advances in Neural Information Processing Systems, edited by Solla S. A., Leen T. K., and Mueller K-R. (MIT Press, Cambridge, 2000), Vol. 12, pp. 279–285.
  28. Dempster A. P., Laird N. M., and Rubin D. B., “Maximum likelihood from incomplete data via the EM algorithm,” J. R. Stat. Soc. Ser. B (Methodol.) 39, 1–38 (1977).
  29. Verbeek J., Vlassis N., and Krose B., “Efficient greedy learning of Gaussian mixture models,” Neural Comput. 15, 469–485 (2003). doi: 10.1162/089976603762553004
  30. Vlassis N. and Likas A., “A greedy EM algorithm for Gaussian mixture learning,” Neural Process. Lett. 15, 77–87 (2002). doi: 10.1023/A:1013844811137
  31. Bilik I., Tabrikian J., and Cohen A., “GMM-based target classification for ground surveillance Doppler radar,” IEEE Trans. Aerosp. Electron. Syst. 42, 267–278 (2006). doi: 10.1109/TAES.2006.1603422
  32. Zheng B., Chang Y. H., Wang X. H., Good W. F., and Gur D., “Feature selection for computerized mass detection in digitized mammograms by using a genetic algorithm,” Acad. Radiol. 6, 327–332 (1999). doi: 10.1016/S1076-6332(99)80226-8
  33. Zheng B., Chang Y. H., Good W. F., and Gur D., “Performance gain in computer-assisted detection schemes by averaging scores generated from artificial neural networks with adaptive filtering,” Med. Phys. 28, 2302–2308 (2001). doi: 10.1118/1.1412240
  34. Zheng B., Lu A., Hardesty L. A., Sumkin J. H., and Hakim C. M., “A method to improve visual similarity of breast masses for an interactive computer-aided diagnosis environment,” Med. Phys. 33, 111–117 (2006). doi: 10.1118/1.2143139
  35. Pudil P., Novovicova J., and Kittler J., “Floating search methods in feature selection,” Pattern Recogn. Lett. 15, 1119–1125 (1994). doi: 10.1016/0167-8655(94)90127-9
  36. Somol P., Pudil P., and Kittler J., “Fast branch & bound algorithms for optimal feature selection,” IEEE Trans. Pattern Anal. Mach. Intell. 26, 900–912 (2004). doi: 10.1109/TPAMI.2004.28
  37. Peng H., Long F., and Ding C., “Feature selection based on mutual information: Criteria of max-dependency, max-relevance, and min-redundancy,” IEEE Trans. Pattern Anal. Mach. Intell. 27, 1226–1238 (2005). doi: 10.1109/TPAMI.2005.159
  38. Li Q. and Doi K., “Reduction of bias and variance for evaluation of computer-aided diagnostic schemes,” Med. Phys. 33, 868–875 (2006). doi: 10.1118/1.2179750

Articles from Medical Physics are provided here courtesy of American Association of Physicists in Medicine
