Medical Physics. 2015 Feb 2;42(2):1098–1118. doi: 10.1118/1.4905232

A computational model to generate simulated three-dimensional breast masses

Luis de Sisternes 1,a), Jovan G Brankov 1, Adam M Zysk 1, Robert A Schmidt 2, Robert M Nishikawa 3, Miles N Wernick 4,b)
PMCID: PMC4320152  PMID: 25652522

Abstract

Purpose:

To develop algorithms for creating realistic three-dimensional (3D) simulated breast masses and embedding them within actual clinical mammograms. The proposed techniques yield high-resolution simulated breast masses having randomized shapes, with user-defined mass type, size, location, and shape characteristics.

Methods:

The authors describe a method of producing 3D digital simulations of breast masses and a technique for embedding these simulated masses within actual digitized mammograms. Simulated 3D breast masses were generated by using a modified stochastic Gaussian random sphere model to generate a central tumor mass, and an iterative fractal branching algorithm to add complex spicule structures. The simulated masses were embedded within actual digitized mammograms. The authors evaluated the realism of the resulting hybrid phantoms by generating corresponding left- and right-breast image pairs, consisting of one breast image containing a real mass, and the opposite breast image of the same patient containing a similar simulated mass. The authors then used computer-aided diagnosis (CAD) methods and expert radiologist readers to determine whether significant differences can be observed between the real and hybrid images.

Results:

The authors found no statistically significant difference between the CAD features obtained from the real and simulated images of masses with either spiculated or nonspiculated margins. Likewise, the authors found that expert human readers performed very poorly in discriminating their hybrid images from real mammograms.

Conclusions:

The authors’ proposed method permits the realistic simulation of 3D breast masses having user-defined characteristics, enabling the creation of a large set of hybrid breast images containing a well-characterized mass, embedded within real breast background. The computational nature of the model makes it suitable for detectability studies, evaluation of computer aided diagnosis algorithms, and teaching purposes.

Keywords: breast, tumor, phantom, simulation, mammography

1. INTRODUCTION

Computational phantom models are widely used in the evaluation of imaging systems and algorithms, where it is often important to have access to large and diverse sets of image data having known characteristics. In this work, we present a new method for generating realistic three-dimensional (3D) mass phantoms, expanding on our prior work.1 We describe an approach for creating simulated masses and a method for embedding the simulated masses digitally within real clinical mammograms to obtain digital hybrid images.39 We then show that the resulting hybrid images are, for practical purposes, essentially indistinguishable from real mammograms as judged by both expert readers and computer-aided diagnosis (CAD) procedures.

The model introduced in this work employs a Gaussian random sphere (GRS) technique to generate a central mass and an iterative branching algorithm to simulate spicules. The branching process is informed by the principle of minimum work. The random generation of simulated masses is controlled by user-specified parameters. This permits the user to produce an unlimited family of masses having particular characteristics.

Several previous approaches to breast tumor simulation found in the literature (e.g., Refs. 2–4) are based on simple shapes such as ellipsoids and cylinders. Typically, realism of a simulation method is quantified by the ability of readers to distinguish real tumors from simulated ones, with area under the receiver operating characteristic (ROC) curve (AUC) of 0.5 (pure guessing) being the ultimate goal. In a previously published study by Saunders et al.,2 a proposed mass model yielded AUC values of 0.68 ± 0.07 for the benign masses and 0.65 ± 0.07 for the malignant masses. Thus, the results differed significantly from the target AUC value of 0.5. A similar reader study by Berks et al.3 was performed for a different model, resulting in a similarly unsatisfactory AUC result of 0.7 ± 0.09. Bliznakova et al.4 proposed a model based on various geometrical shapes, but readers were easily able to distinguish real masses from the simulated ones, with accuracy exceeding 95%. Rashidnasab et al. proposed several models for mass simulation in mammography using a diffusion limited aggregation algorithm5 and a random walk algorithm.6 Their simulated images were generated using the same approach as previously described by us,1,7 in which healthy mammography images are modified by substituting the effect of a mixture of fatty and fibrous tissue with tumor tissue. The range of structures created by their method could be controlled by a set of parameters and produced realistic-looking results. However, their work did not distinguish between different mass types (such as spiculated or nonspiculated) in the generation process. Further models that have been proposed in the literature8–10 are either simplistic or lack thorough validation. The purpose of this paper is to create breast mass phantoms that are more detailed and realistic than those produced by existing techniques, and to produce different mass types that can be differentiated by a spiculated or nonspiculated margin.

Our work was originally motivated by a specific need for highly detailed simulations for our research on phase contrast imaging11,12 where fine details are clearly seen in the image. Note that the purpose of this paper is not to faithfully replicate the biological processes of tumor growth, but simply to produce simulated masses that appear visually similar to what is seen in medical imagery, so as to provide imaging researchers with a tool for evaluating imaging systems and algorithms. The proposed method may also be useful as a training tool.

In Sec. 2, we discuss modeling of the simulated breast mass, computation of its voxelized volume and projection, and its geometrical interpretation. We also describe a method for creation of hybrid digitized mammograms consisting of a simulated mass that is computationally embedded within real breast background. Experiments to measure the realism of the results, using CAD methods and expert readers, are provided in Sec. 3.A, and discussion follows in Sec. 4.

2. MATERIALS AND METHODS

In the proposed approach, simulated breast masses are modeled according to the steps shown in Fig. 1. Figures 1(a)–1(c) show the process of simulating the central mass, beginning from a shape constructed using GRS,13 and modifying its surface with various irregularities, first by a set of low-frequency modifications referred to colloquially as “bumps” or “spikes,” and then by adding high-frequency modifications to the surface defined as a “fuzzy” surface texture. Next, if a spiculated mass is desired, spiculation structure can be added to the central mass by using an iterative branching algorithm, as shown in Fig. 1(d). Finally, as shown in Figs. 1(e) and 1(f), a projection image of the simulated mass can be embedded within a clinical mammogram to produce a hybrid image (mammogram with simulated mass). The mass is described as a 3D shape model, so it can also be used in simulations of tomographic imaging; however, we did not consider the performance of our method in that setting. The simulated mass is generated in the form of a parametric surface model, so it can be rasterized to form a volumetric image represented on a 3D voxel grid; however, an analytic projection approach has tremendous computational benefits, so we describe the analytic solution for planar mammography in the Appendix. Sections 2.A–2.D provide the details of each of the steps of mass simulation shown in Fig. 1.

FIG. 1.


Process of generating a simulated mass model and embedding it into an existing mammogram.

2.A. Modeling the central mass

2.A.1. GRS model [Fig. 1(a)]

In this section, we explain the first step shown in Fig. 1(a) in which the central mass is simulated. We accomplish this by employing a GRS model, which was originally designed to model planets and comets13 but has also been used in other fields.14 The GRS is a parametric surface model described by the radial distance of the surface from the origin, $r(\theta,\varphi)$, which is given by the following function of spherical coordinates $\theta$ and $\varphi$ (see Fig. 2):

\[ r(\theta,\varphi) = \frac{\alpha}{\sqrt{1+\sigma^{2}}}\,\exp\left[s(\theta,\varphi)\right], \tag{1} \]

in which the logarithmic radius $s(\theta,\varphi)$ (a random function) is a series of spherical harmonics $Y_{lm}(\theta,\varphi)$ truncated to a maximum order $l_{\max}$, defined as

\[ s(\theta,\varphi) = \sum_{l=0}^{l_{\max}} \sum_{m=-l}^{l} s_{lm}\,Y_{lm}(\theta,\varphi). \tag{2} \]

In Eq. (1), $\alpha$ is the mean of $r(\theta,\varphi)$, defining the size of the mass, and $\sigma^{2}$ is the variance of $r(\theta,\varphi)$, defining the degree of irregularity of the surface of the mass.

FIG. 2.


GRS surface with the introduction of two low-frequency modifications: A Gaussian bump (left side) and a spike irregularity (right side), exaggerated here for illustration purposes.

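For $s(\theta,\varphi)$ to be normally distributed with zero mean and an angular covariance function appropriate for a closed surface,11 the expansion coefficients $s_{lm}$ in Eq. (2) are defined by

\[
\begin{aligned}
s_{lm} &= \sqrt{\frac{2\pi C_l}{2l+1}}\left[x_G\sqrt{1+\delta_{m0}} + i\,y_G\sqrt{1-\delta_{m0}}\right], \quad l = 0,\ldots,l_{\max},\; m = 1,\ldots,l, \\
C_0 &= C_1 = 0, \\
C_l &= \tilde{C}\,l^{-v}, \quad l = 2,3,\ldots,l_{\max}, \\
\tilde{C} &= \ln\left(1+\sigma^{2}\right)\left[\sum_{l=0}^{l_{\max}} \frac{1}{l^{v}}\right]^{-1},
\end{aligned}
\tag{3}
\]

where $x_G$ and $y_G$ are Gaussian random variables with zero mean and unit variance, $i=\sqrt{-1}$, and $\delta_{m0}$ is the Kronecker delta function. The exponent $v$ is the power-law index of the covariance function. In this work, we fixed $v = 4$ to correspond to nonfractal shapes without sharp features in their geometry.14 The coefficients for the negative values of $m$ are defined as follows:

\[ s_{l,-m} = (-1)^{m}\,s_{lm}^{*}, \quad l = 0,1,\ldots,l_{\max},\; m = -l,\ldots,0, \qquad \mathrm{Im}(s_{l0}) = 0, \tag{4} \]

where the asterisk denotes complex conjugate and Im(⋅) denotes the imaginary part.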
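As a minimal sketch of Eqs. (3) and (4), and not the MATLAB implementation used in this work, the coefficients could be drawn as follows. The function names are hypothetical. Two assumptions are made: the normalization sum for the constant in Eq. (3) is taken over l ≥ 2 only (the l = 0 term is undefined and C_0 = C_1 = 0), and m = 0 is included when sampling so that the Kronecker-delta factors and the condition Im(s_l0) = 0 take effect.

```python
import math
import random


def covariance_coefficients(sigma2, l_max, v=4.0):
    """Return [C_0, ..., C_lmax] with C_0 = C_1 = 0 and C_l proportional to l**(-v).

    Assumption: the normalization runs over l >= 2, so sum(C_l) = ln(1 + sigma2).
    Requires l_max >= 2.
    """
    norm = sum(l ** (-v) for l in range(2, l_max + 1))
    c_tilde = math.log(1.0 + sigma2) / norm
    return [0.0, 0.0] + [c_tilde * l ** (-v) for l in range(2, l_max + 1)]


def sample_coefficients(c, rng):
    """Draw s_lm for m >= 0 as in Eq. (3), then fill m < 0 using Eq. (4)."""
    s = {}
    for l, c_l in enumerate(c):
        scale = math.sqrt(2.0 * math.pi * c_l / (2 * l + 1))
        for m in range(0, l + 1):
            delta = 1.0 if m == 0 else 0.0
            re = rng.gauss(0.0, 1.0) * math.sqrt(1.0 + delta)
            im = rng.gauss(0.0, 1.0) * math.sqrt(1.0 - delta)  # zero when m == 0
            s[(l, m)] = scale * complex(re, im)
            # Conjugate symmetry keeps the summed series real valued.
            s[(l, -m)] = (-1) ** m * s[(l, m)].conjugate()
    return s
```

Passing a seeded `random.Random` instance as `rng` makes a given shape reproducible, which is convenient when the same central mass must be regenerated with different spiculation settings.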

The statistics of the GRS shapes are controlled by $l_{\max}$ and $\sigma^{2}$. For example, weighting the spectrum toward higher-degree harmonics results in Gaussian spheres with finer surface irregularities. Increasing the variance of the logarithmic radius enhances these irregularities radially. Setting the variance to zero, for example, produces a sphere because all surface locations have the same radial distance in this instance.
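The role of the prefactor in Eq. (1) can be checked with a short calculation. It assumes that the logarithmic radius $s$ is Gaussian with zero mean and variance $\beta^{2} = \ln(1+\sigma^{2})$, which is the normalization implied by the constant $\tilde{C}$ in Eq. (3); the symbol $\beta$ is introduced here only for this check.

```latex
\begin{aligned}
\mathrm{E}\left[e^{s}\right] &= e^{\beta^{2}/2} = \sqrt{1+\sigma^{2}}, \\
\mathrm{E}[r] &= \frac{\alpha}{\sqrt{1+\sigma^{2}}}\,\mathrm{E}\left[e^{s}\right] = \alpha, \\
\mathrm{Var}[r] &= \frac{\alpha^{2}}{1+\sigma^{2}}\,e^{2\beta^{2}} - \alpha^{2} = \alpha^{2}\sigma^{2}.
\end{aligned}
```

Under this reading, $\alpha$ is the mean radius and $\sigma^{2}$ is the variance of $r$ relative to $\alpha^{2}$, so $\sigma^{2} = 0$ recovers a sphere of radius $\alpha$.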

2.A.2. Introducing low-frequency modifications [spikes and bumps; Fig. 1(b)]

In the next step, the GRS model is given an irregular surface at discrete locations by introducing low-frequency modifications we will refer to colloquially as spikes and bumps [see Figs. 1(b) and 2]. Spikes introduce pointy localized surface changes into the GRS model while bumps introduce localized, lobulated surface changes. These modifications were included to allow the central mass to have a greater degree of surface variation and hence greater realism, especially for nonspiculated masses for which these are the only fine surface structures.

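Spikes are introduced as follows. For each modification $j$ of the mass surface, a spherical coordinate pair $(\theta_c^{j},\varphi_c^{j})$ is randomly chosen, defining the axis of revolution of the modification. The initial GRS model $r(\theta,\varphi)$ is then modified in the vicinity of each chosen coordinate pair, according to a specified function as follows. For a given coordinate pair $(\theta_c^{j},\varphi_c^{j})$, values of the random spike radius $r_{\mathrm{spike}}$ and length $l_{\mathrm{spike}}$ are selected and the initial GRS surface is modified using the quadratic ramp function

\[
r'(\theta,\varphi) =
\begin{cases}
r(\theta,\varphi) \pm l_{\mathrm{spike}}\left[\dfrac{r_{\mathrm{spike}} - d\left(r(\theta,\varphi),\,r(\theta_c^{j},\varphi_c^{j})\right)}{r_{\mathrm{spike}}}\right]^{4}, & \text{for } d\left(r(\theta,\varphi),\,r(\theta_c^{j},\varphi_c^{j})\right) \le r_{\mathrm{spike}}, \\[2ex]
r(\theta,\varphi), & \text{for } d\left(r(\theta,\varphi),\,r(\theta_c^{j},\varphi_c^{j})\right) > r_{\mathrm{spike}},
\end{cases}
\tag{5}
\]

in which $r'(\theta,\varphi)$ is the new radial position of the mass surface and $d(\cdot,\cdot)$ is the Euclidean distance between two points on the surface. The symbol ± indicates that these variations can be defined to grow outward (+) or inward (−) from the initial GRS surface.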
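The effect of a single spike at one surface point can be sketched as below. This is only an illustration with a hypothetical function name. It assumes Eq. (5) is read as a fourth-power ramp that equals the spike length on the spike axis and falls to zero at a distance equal to the spike radius.

```python
def spike_radius(r, d, r_spike, l_spike, outward=True):
    """Apply one spike to a single surface point.

    r: current radial distance of the point.
    d: Euclidean distance from the point to the spike's axis point on the surface.
    """
    if d > r_spike:
        return r  # outside the spike footprint: surface unchanged
    ramp = ((r_spike - d) / r_spike) ** 4  # 1 on the axis, 0 at the footprint edge
    sign = 1.0 if outward else -1.0
    return r + sign * l_spike * ramp
```

For example, with `r_spike = 2` and `l_spike = 1`, a point on the axis moves outward by the full length, while a point halfway to the edge moves by only (1/2)^4 of it, which is what gives the modification its pointed profile.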

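Bumps are introduced in the following manner. For a given coordinate pair $(\theta_c^{i},\varphi_c^{i})$, values of the random radius $r_{\mathrm{Gauss}}$ and length $l_{\mathrm{Gauss}}$ are selected. Radial positions $r(\theta,\varphi)$ are altered to produce bump features according to

\[
r'(\theta,\varphi) =
\begin{cases}
r(\theta,\varphi) \pm l_{\mathrm{Gauss}}\left[\exp\left(-\dfrac{d\left(r(\theta,\varphi),\,r(\theta_c^{i},\varphi_c^{i})\right)^{2}}{\left(r_{\mathrm{Gauss}}/2\right)^{2}}\right) - \exp\left(-\dfrac{9}{2}\right)\right], & \text{for } d\left(r(\theta,\varphi),\,r(\theta_c^{i},\varphi_c^{i})\right) \le r_{\mathrm{Gauss}}, \\[2ex]
r(\theta,\varphi), & \text{for } d\left(r(\theta,\varphi),\,r(\theta_c^{i},\varphi_c^{i})\right) > r_{\mathrm{Gauss}},
\end{cases}
\tag{6}
\]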

in a similar manner as for the spike shapes.

2.A.3. Introducing high-frequency modifications [fuzzy surface texture; Fig. 1(c)]

A fuzzy surface is created by modifying the surface profile function as follows:

\[ r''(\theta,\varphi) = r'(\theta,\varphi)\,(1+\alpha n), \tag{7} \]

where $r''$ is the new radial position of the mass surface, $n$ is a standard normal random variable, and $\alpha$ is a user-defined parameter controlling the variance of the surface variations.
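A minimal sketch of Eq. (7) applied to a list of sampled radii is given below. The function name is hypothetical, and drawing an independent normal deviate for every surface sample is an assumption about how the texture is applied.

```python
import random


def fuzzy_surface(radii, alpha, rng):
    """Return r'' = r' * (1 + alpha * n), with an independent standard normal n per sample."""
    return [r * (1.0 + alpha * rng.gauss(0.0, 1.0)) for r in radii]
```

Because `alpha` multiplies a unit-variance deviate, it acts as a relative standard deviation in this sketch; setting it to zero leaves the surface unchanged.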

2.B. Modeling spicule structures

The process of mass generation may conclude at this point; but if a spiculated mass is desired, then the algorithm proceeds to the step illustrated in Fig. 1(d), in which spicules are introduced into the central mass structure. This is accomplished by an iterative fractal branching algorithm that recursively creates a set of segments $b_n$, $n = 1,\ldots,N$, based on a set of growing rules. Each of these segments is characterized by a starting location $p_s^{n}$ in 3D space, an ending location $p_e^{n}$, an initial radius $r_{\mathrm{ini}}^{n}$, and a final radius $r_{\mathrm{fin}}^{n}$, which define a conical frustum with a hemisphere at the end acting as a “cap,” as represented in Fig. 4. Additional variation to the defined geometries is introduced by adding a normal random variable to the distance to the center of each geometry (the revolution axis in the case of the frustum and the center location for the sphere), in a similar way as the high-frequency variations were introduced in the central mass [Eq. (7)]. The growth process is defined by user-selected parameters, including the distribution, bifurcation probability, direction of extension, emerging density, radius, and length of the introduced frustum shapes. Next, we describe the iterative algorithm formulated for the segment set generation.

FIG. 4.


Spatial interpretation of an initial parent segment (superscript 0) and child segments (superscript 1) generated in the branching algorithm.

2.B.1. Iterative branching structure generation

A temporary segment structure at a particular iteration $k$ is defined by the set $S^{k}=\{s_1^{k},\ldots,s_j^{k},\ldots,s_{L_k}^{k}\}$, where $s_j^{k}$ is the $j$th segment in the $k$th iteration of the algorithm, defined by a starting spatial location $q_j^{k}$, a direction of growth defined by angles $(\theta_j^{k},\varphi_j^{k})$, a length $l_j^{k}$, and an initial radius $r_j^{k}$. The initial segment set $S^{0}$ is generated as described in the flowchart shown in Fig. 3. The user specifies the number of initial segments $L_0$ and a number $M_{\mathrm{group}}$ that indicates the number of initial segments clustered within each neighborhood. The latter is used to allow spicules to emerge in bunches. The starting position coordinates $q_i^{0}$ are grouped in neighborhoods centered at $r(\theta_m^{\mathrm{ini}},\varphi_m^{\mathrm{ini}})$, where the coordinates are randomly selected within each neighborhood $m$. The starting position of segment $n$ is added to neighborhood $m$ as $r\left(\theta_m^{\mathrm{ini}}+(\log_2 n+\gamma_1)\pi/50,\ \varphi_m^{\mathrm{ini}}-(\log_2 n+\gamma_2)\pi/50\right)$, where $\gamma_1$ and $\gamma_2$ are uniform random variables. That is, one segment emerges from the center of the neighborhood and the rest are added at increasing angular distances from this neighborhood center, each segment having a random length $l_i^{0}$ and initial radius $r_i^{0}$, which are drawn from a Gaussian distribution with user-defined properties. The direction of growth of these segments is defined in spherical coordinates by

\[ \theta_i^{0} = \theta_m^{\mathrm{ini}} + \delta_0 \tag{8} \]

and

\[ \varphi_i^{0} = \varphi_m^{\mathrm{ini}} + \delta_1, \tag{9} \]

where $\delta_0$ and $\delta_1$ are normal random variables with zero mean and a user-defined standard deviation (std) that controls the possible variance expected in the direction of the emerging segments from the radial direction with respect to the center of the mass (in our work, the standard deviation of $\delta_0$ and $\delta_1$ was set to $15\pi/180$ rad, i.e., 15°).
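The direction sampling of Eqs. (8) and (9) can be sketched in a few lines. The names are hypothetical, and the 15° standard deviation is written with `math.radians` on the assumption that the value quoted above is 15 degrees expressed in radians.

```python
import math
import random

STD_15_DEG = math.radians(15.0)  # assumed setting: 15 degrees in radians


def initial_direction(theta_m, phi_m, std_rad, rng):
    """Perturb the neighborhood's radial direction by zero-mean normal deviates."""
    theta = theta_m + rng.gauss(0.0, std_rad)  # Eq. (8)
    phi = phi_m + rng.gauss(0.0, std_rad)      # Eq. (9)
    return theta, phi
```

A smaller standard deviation keeps the initial segments close to the radial direction, whereas a larger one lets spicules leave the central mass at oblique angles.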

FIG. 3.


Flowchart for the generation of the initial set of simulated spicule segments.

Figure 4 illustrates an example of the first iteration (iteration 0), showing an initial segment (with parameters having superscript 0) and two child segments (denoted by superscript 1). In Fig. 5, example simulations of spicule structures are shown emanating from the same central mass, along with a table of their user-defined parameter values. Figures 5(a)–5(c) show that increasing the number of initial segments $L_0$ produces a larger number of spiculations, and that increasing the number of maximum emerging segments in a neighborhood $M_{\mathrm{group}}$ causes the spicules to emerge from isolated regions in the central mass, while decreasing this parameter results in a more uniform distribution of spicules.

FIG. 5.


An example mass with various spiculation structures added. The parameters defining the spiculation growth for these examples are shown in the table below the images.

A fractal branching algorithm is applied iteratively to generate the complete set of segments forming a spicule structure. At each iteration k, given the segments Sk generated in the previous iteration (parent segments), a new set of segments Sk+1 (child segments) is generated following the flowchart shown in Fig. 6. The child segments act as parent segments in the next iteration of the process, which continues until no new segments are generated per a given set of growing rules.

FIG. 6.


Flowchart for the generation of the iterative branching structure.

Each parent segment produces zero, one, or two children, with user-defined probabilities (Fig. 4 shows a two-child generation event). The generated child segments have a starting point that is the ending point of the parent, a radius less than or equal to the final radius of the parent segment, and a direction of growth that deviates randomly from that of the parent. Child segment diameters are scaled such that the flows within a branch follow the physiological principle of minimum work.9,15 That is, the radii of the parent and child branches can be related by

\[ r_j^{k+1} = r_i^{k}\,d_r^{1/\tau}, \tag{10} \]
\[ r_{j+1}^{k+1} = r_i^{k}\,(1-d_r)^{1/\tau}, \tag{11} \]

where $r_j^{k+1}$ and $r_{j+1}^{k+1}$ are the resulting child segment radii, $r_i^{k}$ is the parent segment radius, $d_r$, the dividing ratio, has a value between 0.5 and 1, and $\tau$ is a constant known as the diameter exponent16 ($\tau = 2.6$ is used here per previous work9). Child segments have equal radii when $d_r = 0.5$ and the radius difference increases as $d_r$ increases.
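Equations (10) and (11) imply that the child radii raised to the power τ sum to the parent radius raised to the same power, which can be checked numerically with a short sketch (the function name is hypothetical):

```python
def child_radii(r_parent, d_r, tau=2.6):
    """Split a parent radius into two child radii using dividing ratio d_r in [0.5, 1]."""
    if not 0.5 <= d_r <= 1.0:
        raise ValueError("d_r must lie in [0.5, 1]")
    r_a = r_parent * d_r ** (1.0 / tau)          # Eq. (10)
    r_b = r_parent * (1.0 - d_r) ** (1.0 / tau)  # Eq. (11)
    return r_a, r_b
```

With `d_r = 0.5` the two children are identical, and with `d_r = 1` the second child vanishes, so the single parameter spans the range from a symmetric split to a pure continuation.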

At the kth iteration, no new segments are added to the structure if a parent segment has a length or initial radius smaller than the maximum defined resolution (in this work, a tenth of the pixel size of the imaging system where the breast mass phantom will be used). Otherwise, one of three possible different scenarios in the branching scheme is followed by each parent segment with defined probabilities:

2.B.1.a. Continuing branch.

One child segment $s_j^{k+1}$ is generated as a continuing segment from the parent $s_i^{k}$, with slightly reduced length and radius, and a slight change in direction. The radius of the child segment is computed as

\[ r_j^{k+1} = \mathrm{decrease}_r\,r_i^{k}, \tag{12} \]

where $\mathrm{decrease}_r$ is a chosen factor that models how fast the radius of each branch decreases along its length in the complete structure, the value of which is discussed later. The length of the continuing child segment is computed in a similar way:

\[ l_j^{k+1} = \mathrm{decrease}_l\,l_i^{k}, \tag{13} \]

where $\mathrm{decrease}_l$ is a chosen factor that models how fast the segments’ lengths decrease along a branch. The orientation of the continuing segment has a slight random variation from the parent's in terms of azimuth and inclination:

\[ \theta_j^{k+1} = \theta_i^{k} + x_1\gamma_{\mathrm{bif}}, \tag{14} \]
\[ \varphi_j^{k+1} = \varphi_i^{k} + x_2\gamma_{\mathrm{bif}} - \frac{x_2\,\varphi_i^{k}}{4}, \tag{15} \]

where $x_1$ and $x_2$ are independent normal random variables and $\gamma_{\mathrm{bif}}$ is an angle bifurcation factor. The factor $\gamma_{\mathrm{bif}}$, defined a priori, describes the angular variation in the direction of child segments with respect to their parents, and has a value between 0 and $\pi$ (smaller values avoid potentially abrupt random changes of the growth direction).

The factor $x_2\varphi_i^{k}/4$ appears in Eq. (15) to give the branches a tendency to be stretched along the XY plane, which simulates the effect of breast compression. Thus, $x_2$ is either defined as an independent normal random variable for the compressed-breast case, or set to zero for the uncompressed-breast case. This stretching effect is illustrated in Fig. 7.
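The continuing-branch update of Eqs. (12)–(15) can be sketched as a single function. The dictionary layout and the names are hypothetical, and the sign of the stretching term follows one reading of Eq. (15).

```python
import random


def continuing_child(parent, decrease_r, decrease_l, gamma_bif, rng, compressed=True):
    """Create one continuing child from a parent given as a dict with keys r, l, theta, phi."""
    x1 = rng.gauss(0.0, 1.0)
    # x2 is a normal deviate for the compressed breast and zero otherwise.
    x2 = rng.gauss(0.0, 1.0) if compressed else 0.0
    return {
        "r": decrease_r * parent["r"],                                    # Eq. (12)
        "l": decrease_l * parent["l"],                                    # Eq. (13)
        "theta": parent["theta"] + x1 * gamma_bif,                        # Eq. (14)
        "phi": parent["phi"] + x2 * gamma_bif - x2 * parent["phi"] / 4.0,  # Eq. (15)
    }
```

In the uncompressed case the inclination of the child equals that of the parent in this sketch, because both terms that depend on `x2` vanish.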

FIG. 7.


Example of a generated mass phantom (a) without horizontal stretching and (b) with horizontal stretching.

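After the continuing child is generated, the parent segment $s_i^{k}$ is added to the complete structure to form a new element $b_n=(p_s^{n},p_e^{n},r_{\mathrm{ini}}^{n},r_{\mathrm{fin}}^{n})$ of the form

\[ p_s^{n} = q_i^{k}, \tag{16} \]
\[ p_e^{n} = q_i^{k+1}, \tag{17} \]
\[ r_{\mathrm{ini}}^{n} = r_i^{k}, \tag{18} \]
\[ r_{\mathrm{fin}}^{n} = \mathrm{decrease}_r\,r_i^{k}. \tag{19} \]

2.B.1.b. Symmetric bifurcation.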

Two continuing child segments $s_j^{k+1}$ and $s_{j+1}^{k+1}$ are generated from the parent, each having similar length and radius, and a similar but opposed change in direction. In this case, the dividing ratio $d_r$ shown in Eqs. (10) and (11) is computed as a uniform random variable taking values from 0.5 to 0.8. The radii of the two child segments are computed as

\[ r_j^{k+1} = \mathrm{decrease}_r\,r_i^{k}\,d_r^{1/2.6}, \tag{20} \]
\[ r_{j+1}^{k+1} = \mathrm{decrease}_r\,r_i^{k}\,(1-d_r)^{1/2.6}. \tag{21} \]

In the same way, the two child segment lengths are defined as

\[ l_j^{k+1} = \mathrm{decrease}_l\,l_i^{k}\,d_r^{1/2.6}, \tag{22} \]
\[ l_{j+1}^{k+1} = \mathrm{decrease}_l\,l_i^{k}\,d_r^{1/2.6}. \tag{23} \]

The two different orientations for the child segments are generated with opposed azimuth Gaussian variation:

\[ \theta_j^{k+1} = \theta_i^{k} + (x_1+1)\,\gamma_{\mathrm{bif}}, \tag{24} \]
\[ \theta_{j+1}^{k+1} = \theta_i^{k} + (x_2-1)\,\gamma_{\mathrm{bif}}, \tag{25} \]
\[ \varphi_j^{k+1} = \varphi_i^{k} + x_3\gamma_{\mathrm{bif}} - \frac{x_5\,\varphi_i^{k}}{4}, \tag{26} \]
\[ \varphi_{j+1}^{k+1} = \varphi_i^{k} + x_4\gamma_{\mathrm{bif}} - \frac{x_6\,\varphi_i^{k}}{4}, \tag{27} \]

in which $x_i$, $i = 1,\ldots,4$, are independent normal random variables. The variables $x_5$ and $x_6$ have the same horizontal stretching interpretation as in the previous scenario [parameter $x_2$ in Eq. (15)], and assume values different from zero when we consider breast compression. Once the two bifurcating children are generated, the parent segment $s_i^{k}$ is added to the complete structure, forming a new $n$th element $b_n=(p_s^{n},p_e^{n},r_{\mathrm{ini}}^{n},r_{\mathrm{fin}}^{n})$. The starting point, ending point, and initial radius are computed in the same way as in the previous scenario [Eqs. (16)–(18)]. The final radius is computed using the dividing ratio for this particular scenario:

\[ r_{\mathrm{fin}}^{n} = \mathrm{decrease}_r\,r_i^{k}\,d_r^{1/2.6}. \tag{28} \]

2.B.1.c. Asymmetric bifurcation.

Two child segments $s_j^{k+1}$ and $s_{j+1}^{k+1}$ are generated: the first forms a continuation of the parent branch; the second (smaller) branch diverges from the parent. The radius and length of the child segments are computed in a similar way as in the symmetric bifurcation scenario, but with a dividing ratio $d_r$ computed as a uniform random variable assuming values from 0.8 to 1. The continuing segment also has a lower angle deviation from the parent segment than the bifurcating segment. Their orientations are described in a manner similar to the previous scenario, but with azimuth variations given by

\[ \theta_j^{k+1} = \theta_i^{k} + x_1\gamma_{\mathrm{bif}}, \tag{29} \]
\[ \theta_{j+1}^{k+1} = \theta_i^{k} + x_2\gamma_{\mathrm{bif}} \pm \frac{\pi}{5}. \tag{30} \]

After the two bifurcating children are generated, the parent segment sik is added to the complete structure forming a new nth element bn, defined in a manner similar to the previous scenario.
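The three scenarios can be combined into a compact sketch of the iterative loop. This is a simplified illustration rather than the implementation used in this work: it tracks only radii and lengths (directions and positions are omitted), it scales child lengths by the same factor as the radii, and the function name, probabilities, and default factors are hypothetical placeholders.

```python
import random


def grow_spicule(r0, l0, min_size, rng, p_continue=0.6, p_symmetric=0.2,
                 decrease_r=0.8, decrease_l=0.8, tau=2.6, max_iter=50):
    """Return a list of (r_initial, r_final, length) tuples for one spicule tree."""
    structure, parents = [], [(r0, l0)]
    for _ in range(max_iter):
        children = []
        for r, l in parents:
            if r < min_size or l < min_size:
                continue  # below the resolution limit: this branch stops growing
            u = rng.random()
            if u < p_continue:                    # continuing branch
                ratios = [1.0]
            elif u < p_continue + p_symmetric:    # symmetric bifurcation
                d_r = rng.uniform(0.5, 0.8)
                ratios = [d_r, 1.0 - d_r]
            else:                                 # asymmetric bifurcation
                d_r = rng.uniform(0.8, 1.0)
                ratios = [d_r, 1.0 - d_r]
            scale = [q ** (1.0 / tau) for q in ratios]
            # The parent joins the structure with a final radius set by its first child.
            structure.append((r, decrease_r * r * scale[0], l))
            children += [(decrease_r * r * s, decrease_l * l * s) for s in scale]
        if not children:
            break
        parents = children
    return structure
```

Because every child radius is at most `decrease_r` times its parent radius, the radii shrink geometrically and the loop stops on its own once all branches fall below `min_size`; `max_iter` is only a safety cap.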

In Figs. 5(c)–5(e), we can observe the effect of adjusting the parameters of the spiculation model, as indicated in the inset table. Comparing Figs. 5(c) and 5(d) shows that larger values for the $\mathrm{decrease}_r$ and $\mathrm{decrease}_l$ parameters produce more-extended spicule structures. Comparing Figs. 5(d) and 5(e) shows that defining a higher continuing branch probability reduces the chances of bifurcation, allowing the spicules to grow more extensively.

2.C. Embedding in real tissue images

To allow us to validate the use of our proposed mass simulation model in the context of mammography, we developed a method of embedding the masses within actual clinical digitized mammograms using published values of breast tissue attenuation17 and a simulated mammography source spectrum.18,19 Our goal was to computationally modify the mammogram to include the simulated tumor projection by substituting the attenuation effect produced by a mixture of fat and fibrous tissue with that produced by tumor tissue. In these experiments, we used digitized film mammograms from the Digital Database for Screening Mammography (DDSM).20,21 We began the embedding process by converting the image values to film optical density values by using the scanner conversion function (as indicated in Refs. 20 and 21). We then computed normalized intensity values from the film optical density values. Assuming a linear relationship between the inverse of the recorded optical density and the logarithm of the intensity, we inverted the optical density values and then inverted the logarithm relationship to compute the normalized intensity image (normalized by the integral of the source intensity, so that pixel values in air equal 1). According to Beer’s law, the normalized intensity image follows:

\[
I(x,y) = \frac{\int_0^{\varepsilon_{\max}} I_d(x,y;\varepsilon)\,d\varepsilon}{\int_0^{\varepsilon_{\max}} S(\varepsilon)\,d\varepsilon}
= \frac{\int_0^{\varepsilon_{\max}} S(\varepsilon)\,e^{-\int_0^{L}\mu_{\mathrm{obj}}(x,y;\varepsilon)\,dz}\,d\varepsilon}{\int_0^{\varepsilon_{\max}} S(\varepsilon)\,d\varepsilon},
\tag{31}
\]

where $I_d(x,y;\varepsilon)$ is the intensity recorded at the detector at x-ray energy $\varepsilon$, $S(\varepsilon)$ is the intensity of the source, $\varepsilon_{\max}$ is the maximum photon energy produced by the source, $L$ is the thickness of the sample in centimeters, and $\mu_{\mathrm{obj}}(x,y;\varepsilon)$ is the energy-dependent attenuation coefficient of the sample in cm$^{-1}$. Since the attenuation coefficients have a nonlinear dependence on energy (Ref. 17), the integrand $I_d(x,y;\varepsilon)$ in Eq. (31) depends not only on the source intensity but also on the breast thickness and composition, which are not known a priori. Therefore, in order to modify the equation to substitute the attenuation effect produced by healthy tissue with that of the simulated mass, we first need to estimate the composition and thickness of the sample. We approximated the sample composition in the form of a ratio of fat to fibrous tissue (we assume that no lesion is located within the embedding area). Once these factors are estimated, the integrand $I_d(x,y;\varepsilon)$ in Eq. (31) can be approximated as

\[
\hat{I}(x,y;\varepsilon) = \hat{S}(\varepsilon)\,e^{-\left[\mu_{\mathrm{fat}}(\varepsilon)\,\mathrm{ratio}_{\mathrm{fat}}(x,y) + \mu_{\mathrm{fib}}(\varepsilon)\,\mathrm{ratio}_{\mathrm{fib}}(x,y)\right]\hat{L}(x,y)},
\tag{32}
\]

where $\hat{S}(\varepsilon)$ is a simulated mammography source spectrum;18,19 $\mu_{\mathrm{fat}}(\varepsilon)$ and $\mu_{\mathrm{fib}}(\varepsilon)$ are the attenuation coefficients (in cm$^{-1}$) of fat and fibrous tissue, respectively; and $\mathrm{ratio}_{\mathrm{fat}}(x,y)$, $\mathrm{ratio}_{\mathrm{fib}}(x,y)$, and $\hat{L}(x,y)$ are approximations of the fat tissue ratio, fibrous tissue ratio, and tissue thickness observed in the mammogram, respectively, computation of which is described shortly. The simulated tumor projection is then embedded within the original mammogram by substituting the effect caused by the mixture of fat and fibrous tissue with tumor tissue:

\[
I_T(x,y) = \frac{\int_0^{\varepsilon_{\max}} \hat{I}(x,y;\varepsilon)\,e^{\left[-\mu_{\mathrm{tum}}(\varepsilon) + \mu_{\mathrm{fat}}(\varepsilon)\,\mathrm{ratio}_{\mathrm{fat}}(x,y) + \mu_{\mathrm{fib}}(\varepsilon)\,\mathrm{ratio}_{\mathrm{fib}}(x,y)\right]T_{\mathrm{tum}}(x,y)}\,d\varepsilon}{\int_0^{\varepsilon_{\max}} \hat{S}(\varepsilon)\,d\varepsilon},
\tag{33}
\]

where $\mu_{\mathrm{tum}}(\varepsilon)$ is the attenuation coefficient (in cm$^{-1}$) of tumorous tissue, either benign or carcinoma tissue, and $T_{\mathrm{tum}}(x,y)$ is the thickness of the projected simulated tumor at location $(x,y)$ (in cm). The resulting hybrid intensity image $I_T(x,y)$ was converted back to optical density values similar to those in the original digitized mammograms by considering the linear relationship between the log-intensity and optical density values and inverting the particular normalization of the scanner used to digitize the original film mammogram.20,21
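For a single pixel, Eqs. (31)–(33) can be illustrated with a spectrum sampled at a few discrete energies, so that the integrals become weighted sums. The function names are hypothetical, and any numerical values passed in are arbitrary placeholders rather than the published attenuation coefficients or source spectrum.

```python
import math


def normalized_intensity(spectrum, mu_fat, mu_fib, ratio_fib, thickness):
    """Discrete form of Eqs. (31)-(32): spectrum-weighted Beer's law for a fat/fibrous mix."""
    num = sum(s * math.exp(-(mf * (1.0 - ratio_fib) + mb * ratio_fib) * thickness)
              for s, mf, mb in zip(spectrum, mu_fat, mu_fib))
    return num / sum(spectrum)


def embed_tumor(spectrum, mu_fat, mu_fib, mu_tum, ratio_fib, thickness, t_tum):
    """Discrete form of Eq. (33): replace t_tum cm of the tissue mix with tumor tissue."""
    num = 0.0
    for s, mf, mb, mt in zip(spectrum, mu_fat, mu_fib, mu_tum):
        mu_mix = mf * (1.0 - ratio_fib) + mb * ratio_fib
        # Original attenuation, then the correction for the substituted path length.
        num += s * math.exp(-mu_mix * thickness) * math.exp((-mt + mu_mix) * t_tum)
    return num / sum(spectrum)
```

With a tumor thickness of zero the hybrid value equals the original one, and when the tumor attenuates more strongly than the tissue it replaces, the hybrid intensity drops, which appears as a denser region in the mammogram.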

The breast thickness considered in Eq. (33) was approximated by considering a secondary image $I_F(x,y)$, which approximately describes the intensity recorded by a breast of similar thickness characteristics to the one presented, but composed entirely of fat tissue. That is, in theory

\[
I_F(x,y) = \frac{\int_0^{\varepsilon_{\max}} S(\varepsilon)\,e^{-\mu_{\mathrm{fat}}(\varepsilon)\,L(x,y)}\,d\varepsilon}{\int_0^{\varepsilon_{\max}} S(\varepsilon)\,d\varepsilon}.
\tag{34}
\]

We generated the image IF(x, y) by fitting a thin plate spline (TPS)22 to a selection of candidate locations in I(x, y) where fat tissue was the main breast component. These candidate locations were chosen by selecting the pixels in I(x, y) that are monotonically decreasing from nipple to chest wall. The rationale behind this is that, assuming that breast tissue thickness is not expected to decrease from nipple to chest wall, higher intensity values in this direction should be observed where fat tissue percentage is the most predominant, since fat is the least-absorbing tissue type in breast. We can observe an example of the resulting fat-tissue approximation image in Fig. 8(b), where the logarithm of the intensity IF(x, y) is shown (showing the image in terms of absorption), for easier comparison to the original mammogram shown in Fig. 8(a). Following Eq. (34), we can find the approximated thickness map of the breast sample using a constrained least-squares minimization process23

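\[
\hat{L}(x,y) = \underset{L(x,y)}{\arg\min}\left[I_F(x,y) - \frac{\int_0^{\varepsilon_{\max}} \hat{S}(\varepsilon)\,e^{-\mu_{\mathrm{fat}}(\varepsilon)\,L(x,y)}\,d\varepsilon}{\int_0^{\varepsilon_{\max}} \hat{S}(\varepsilon)\,d\varepsilon}\right]^{2}.
\tag{35}
\]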

FIG. 8.


Process of embedding a simulated mass in a real mammogram: (a) Original mammogram; (b) Fat tissue approximation obtained using thin-plate splines: the gray values correspond to absorption produced by a fixed attenuation coefficient of fat tissue; (c) Hybrid image with real mammogram containing a simulated tumor; and (d) Simulated tumor projection.

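Once the approximated sample thickness $\hat{L}(x,y)$ is computed, we can obtain the approximate fibrous and fat ratios in Eq. (32) using a similar constrained least-squares process:

\[
\mathrm{ratio}_{\mathrm{fib}}(x,y) = \underset{\mathrm{ratio}_{\mathrm{fib}}(x,y)}{\arg\min}\left[I(x,y) - \frac{\int_0^{\varepsilon_{\max}} \hat{S}(\varepsilon)\,e^{-\left[\mu_{\mathrm{fat}}(\varepsilon)\left(1-\mathrm{ratio}_{\mathrm{fib}}(x,y)\right) + \mu_{\mathrm{fib}}(\varepsilon)\,\mathrm{ratio}_{\mathrm{fib}}(x,y)\right]\hat{L}(x,y)}\,d\varepsilon}{\int_0^{\varepsilon_{\max}} \hat{S}(\varepsilon)\,d\varepsilon}\right]^{2}, \quad 0 \le \mathrm{ratio}_{\mathrm{fib}}(x,y) \le 1,
\tag{36}
\]
\[ \mathrm{ratio}_{\mathrm{fat}}(x,y) = 1 - \mathrm{ratio}_{\mathrm{fib}}(x,y). \tag{37} \]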
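The per-pixel estimates of Eqs. (35)–(37) can be sketched for a discretely sampled spectrum. Instead of a general constrained least-squares solver, this sketch uses bisection, which reaches the same minimizer here because the modeled intensity decreases monotonically with thickness and, provided fibrous tissue attenuates more than fat at every energy, with the fibrous ratio. All names, the 20 cm upper bound, and any numerical inputs are assumptions made for illustration.

```python
import math


def fat_only_intensity(spectrum, mu_fat, thickness):
    """Discrete form of Eq. (34): normalized intensity through fat of a given thickness."""
    return sum(s * math.exp(-m * thickness) for s, m in zip(spectrum, mu_fat)) / sum(spectrum)


def bisect_decreasing(f, target, lo, hi, iters=60):
    """Return x in [lo, hi] minimizing (f(x) - target)**2 for a monotonically decreasing f."""
    if target >= f(lo):
        return lo  # the constraint is active at the lower bound
    if target <= f(hi):
        return hi  # the constraint is active at the upper bound
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if f(mid) > target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def estimate_thickness(i_fat, spectrum, mu_fat, max_cm=20.0):
    """Eq. (35): thickness whose fat-only intensity matches the fitted fat image value."""
    return bisect_decreasing(lambda t: fat_only_intensity(spectrum, mu_fat, t),
                             i_fat, 0.0, max_cm)


def estimate_ratio_fib(i_pixel, spectrum, mu_fat, mu_fib, thickness):
    """Eq. (36): fibrous ratio in [0, 1]; the fat ratio then follows from Eq. (37)."""
    def model(ratio):
        return sum(s * math.exp(-(mf * (1.0 - ratio) + mb * ratio) * thickness)
                   for s, mf, mb in zip(spectrum, mu_fat, mu_fib)) / sum(spectrum)
    return bisect_decreasing(model, i_pixel, 0.0, 1.0)
```

A pixel brighter than the all-fat prediction is clamped to a fibrous ratio of 0, and a pixel darker than the all-fibrous prediction is clamped to 1, which mirrors the constraint in Eq. (36).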

A simulated mass can be embedded at any location within any case so long as it can be accommodated within the thickness and spatial extent of the breast at that location. Figure 8(c) shows an example of the visually realistic results obtained by the proposed embedding procedure. Figure 9 shows mammograms for two patients each having one breast containing a real mass and one normal breast. We embedded a simulated mass into the image of the normal breast having similar visual characteristics (general shape, degree of spiculation, size, and location) to that of the actual mass in the opposite breast. In Fig. 9, the normal breast (left), simulated mass in that breast (center), and actual mass in the other breast (right) are shown.

FIG. 9.


Examples showing the results of embedding the simulated mass phantoms into digitized mammograms. Each of these patients had one normal breast and one breast containing a true mass. The locations of the simulated and actual masses are indicated by white arrows. (a)–(f) The normal breast is shown in the left image; the center image shows the normal image with a simulated mass embedded; the right image shows the opposite breast image containing a real mass. (g) and (h) Detail of the mammograms showing the simulated (left) and actual (right) masses for spiculated and nonspiculated examples, respectively.

2.D. System performance characteristics and variable values

The modeling software was written in MATLAB and executed using a Windows desktop computer (32-bit Microsoft Windows Vista; 2 GHz dual-core processor with 4 GB RAM). The most challenging task given the limitations of processing speed and memory was the computation of a simulated mass projection and volume. Generating the volume and projection of the simulated tumors directly from a rasterized version proved to be computationally intensive; therefore, we developed a more efficient analytical solution (described in the Appendix). The computational time and memory requirements for the generation of the simulated mass structures, volume, and projection are summarized in Table I. In our experiments, the simulated masses were defined within a 1000 × 1000 × 1000 voxelized volume in which each voxel had a size of 50 μm in each direction.

TABLE I.

Computation times and memory requirements for algorithm steps.

| Algorithm step | Computation time (min) | Memory |
| --- | --- | --- |
| Central mass surface modeling | ∼3 | ∼2 MB |
| Spiculation surface modeling | ∼1 | ∼2 MB |
| Convert mass phantom to volume image and form projection image | ∼1 | ∼1.5 MB (projection), ∼320 MB (volume) |
| Convert spiculation phantom to volume image and form projection image | ∼5 | ∼1.5 MB (projection), ∼320 MB (volume) |
| Complete phantom file | | ∼2 MB |

The tumor simulation described by this model is randomized, following a series of user-defined variables describing the angular undulation of central mass, shape, amount, and size of the included modifications, number and density of emerging spiculation structures, and their growth properties. The parameters were chosen so as to match the appearance of breast tumors found in clinical radiology, and were systematically changed in order to produce a lesion that visually looked to be a mass with distinct features (e.g., lobulated margin versus circumscribed margin). This was accomplished using knowledge from published work,9,24–28 and guidance from two of the authors (R.A.S. and R.M.N.), both of whom are experts in breast radiology. The process of adjusting the parameters to produce visually realistic simulated masses was conducted independently from creation of the set of hybrid mammograms in the evaluation, so as to avoid bias. The resulting parameters of the cases included in the evaluation of our method are summarized in Table II. The radius and length factors included in the table indicate a factor of the mean central mass radial distance value α, selected for the tumor generation in the GRS model [Eq. (1)].

TABLE II.

Variable values in the generation of the breast tumor phantoms evaluated.

| Parameter | Nonspiculated masses: mean value | Nonspiculated masses: standard deviation | Spiculated masses: mean value | Spiculated masses: standard deviation |
| --- | --- | --- | --- | --- |
| GRS variance σ² | 0.31 | 0.04 | 0.32 | 0.03 |
| Number of low-frequency modifications in GRS | 611.2 | 70.6 | 688.7 | 61.3 |
| Shape of low-frequency modifications in GRS (0 = spikes, 1 = bumps) | 0.36 | 0.48 | 0.59 | 0.20 |
| Low-frequency modification radius factor in GRS, r_Gauss/α, r_spike/α | 0.229 | 0.073 | 0.216 | 0.046 |
| Low-frequency modification length factor in GRS, l_Gauss/α, l_spike/α | 0.113 | 0.021 | 0.109 | 0.012 |
| Variance for fuzzy surface in GRS (α) | 0.015 | 0 | 0.015 | 0 |
| Number of emerging initial segments (L_0) | 0 | 0 | 1358 | 365.2 |
| Maximum number of segments in neighborhood, M_group | | | 8.98 | 1.89 |
| Emerging segments radius factor, r_i0/α | | | 0.0240 | 0.0053 |
| Segment radius decrease factor, decrease_r | | | 0.89 | 0.31 |
| Emerging segments length factor, l_i0/α | | | 0.173 | 0.018 |
| Segment length decrease factor, decrease_l | | | 0.91 | 0.30 |
| Continuing branch probability | | | 0.717 | 0.057 |
| Symmetric bifurcation probability | | | 0.142 | 0.028 |
| Asymmetric bifurcation probability | | | 0.142 | 0.028 |
| Branching angle variance γ_bif (in degrees) | | | 6.55 | 0.62 |
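Because the radius and length factors in Table II are expressed as multiples of the mean central mass radial distance α, they become physical dimensions only once α is chosen. The following Python sketch illustrates that conversion for the mean spiculated-mass values. The value α = 5 mm and all function and variable names are assumptions made for illustration; they are not taken from the authors' MATLAB software. The sketch also checks that the three branching outcomes listed in the table account for approximately all of the probability.

```python
# Mean factors for spiculated masses, copied from Table II
# (dimensionless multiples of the mean radial distance alpha).
SPICULATED_MEAN_FACTORS = {
    "low_freq_modification_radius": 0.216,
    "low_freq_modification_length": 0.109,
    "emerging_segment_radius": 0.0240,
    "emerging_segment_length": 0.173,
}

# Branching outcome probabilities for spiculated masses (Table II mean values).
BRANCHING_PROBABILITIES = {
    "continue": 0.717,
    "symmetric_bifurcation": 0.142,
    "asymmetric_bifurcation": 0.142,
}

VOXEL_SIZE_MM = 0.05  # 50 um voxels, as stated in Sec. 2.D


def factors_to_mm(factors, alpha_mm):
    """Scale each dimensionless factor by alpha to obtain a length in mm."""
    return {name: factor * alpha_mm for name, factor in factors.items()}


def mm_to_voxels(length_mm, voxel_size_mm=VOXEL_SIZE_MM):
    """Express a physical length as a (fractional) number of voxels."""
    return length_mm / voxel_size_mm


if __name__ == "__main__":
    alpha_mm = 5.0  # hypothetical mean radial distance; not a value from the paper
    for name, length in factors_to_mm(SPICULATED_MEAN_FACTORS, alpha_mm).items():
        print(f"{name}: {length:.3f} mm = {mm_to_voxels(length):.1f} voxels")
    print("probability total:", round(sum(BRANCHING_PROBABILITIES.values()), 3))
```

Under the assumed α = 5 mm, the mean initial spicule radius factor of 0.0240 corresponds to 0.12 mm, or 2.4 voxels at the 50 μm voxel size.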

2.E. Evaluation of hybrid images

Figure 10 illustrates the overall design of the experiments we conducted to validate the realism of our phantom model. A set of clinical digitized mammograms was downloaded from the DDSM,20,21 which contains 2620 cases in which breast masses are described by shape and margin (including spiculated and nonspiculated) and for which proven pathology is provided. We identified 83 cases in which the patient exhibited one normal breast and one breast with a spiculated or nonspiculated mass, visible both in the craniocaudal (CC) and the mediolateral-oblique (MLO) mammogram views. For each of these clinical cases, we selected a simulated mass with size and spiculation level similar to those of the mass that was actually present in the abnormal breast and embedded it in a similar breast location (distribution of parameters shown in Table II). The simulated masses were generated independently of the case and of the location within the breast in which they were later embedded. (In concept, a simulated mass generated by the proposed approach can be embedded at any location in any case, provided that it fits within the spatial extent and thickness of the breast at that location.) The parameter value distribution was determined, and a set of simulations was generated, in a process separate from the evaluation and in consultation with two experts in mammography (authors R.A.S. and R.M.N.). We then digitally embedded each simulated mass within the MLO or CC view (chosen randomly for each pair) of the healthy breast using the proposed embedding scheme. By this approach, we created 83 corresponding left- and right-breast image pairs (a total of 166 images), in which the image of one breast depicted an actual mass, while the opposite breast image contained a similar simulated mass, both in the same view (either CC or MLO).
Of the 83 cases on which our experiments were based, 31 exhibited benign masses that were classified as nonspiculated in the DDSM database; the remaining 52 cases exhibited malignant tumors that were classified as spiculated. Figure 11 presents side-by-side example comparisons of corresponding left- and right-breast image pairs of real and simulated nonspiculated and spiculated tumors in digitized mammograms. Additional examples are shown in Fig. 12, where 3D representations of the simulated masses are displayed alongside corresponding regions of interest of the simulations embedded in real breast tissue and their corresponding opposite breast containing a real mass.

FIG. 10.

FIG. 10.

Overall outline of the validation study, based on the generation of image pairs, with one image exhibiting a real mass and the other containing a matched, simulated mass.

FIG. 11.

FIG. 11.

Corresponding pairs of (a) real and (b) simulated masses for a nonspiculated tumor (left pair) and (c) real and (d) simulated masses for a spiculated tumor (right pair). The tumor locations are indicated by arrows.

FIG. 12.

FIG. 12.

Examples of simulated tumors with matched simulated and real mammogram regions. The first three columns show examples of nonspiculated masses; the last three columns show spiculated masses. Each row shows (from left to right) a 3D representation of the simulated tumor, a region of interest in which the simulated tumor has been embedded, and a corresponding region of interest from the opposite breast containing a real tumor with similar characteristics.

2.E.1. Validation of hybrid phantom images for CAD analysis

To validate the proposed hybrid phantom images in the context of CAD, we applied a well-established set of quantitative CAD features29–31 that are widely used to characterize lesions in digital mammography, the purpose being to observe whether our simulated masses yield the same results as their matched real masses when analyzed using CAD techniques. We only briefly describe the CAD procedures, as the details of these methods can be found in prior publications.

Prior to feature extraction, we segmented each mass in a hybrid image using the region-growing algorithm proposed in Ref. 29. Figure 13 shows an example segmentation of one of our hybrid images, indicating the grown region of the tumor, the tumor margin, an encompassing region, and the surrounding periphery.30 The surrounding periphery is obtained by a morphological opening applied to the grown region.30 The encompassing region and surrounding periphery simply extend the region window by 20 pixels along the left, right, top, and bottom edges of the mass.

FIG. 13.

FIG. 13.

Region of interest in hybrid mammogram, indicating various findings as described in Ref. 30.

Five CAD features are next extracted from the detected regions by the methods described in Refs. 29–31. Two of these features—sharpness and full width at half maximum (FWHM) angular deviation (related to spiculation level)—characterize the tumor margin and are important to distinguishing malignant tumors from benign ones; the other three features—average gray level, contrast, and texture—characterize the density of the mass. For more details about the extracted CAD features, we refer the reader to Refs. 29–31.

2.E.2. Reader study design

Our study was based on 83 pairs of corresponding left- and right-breast images, in which each pair consisted of a clinical digitized mammogram of an abnormal breast with a real mass, and a hybrid image (real mammogram with simulated mass) of the opposite breast in the same patient. Thus, the total data set consisted of 166 mammograms. All 166 images were rated independently and sequentially by five expert radiologists. Images containing real or simulated tumors were shown in random order. Each reader assigned a score for each image expressing his or her confidence that the mass shown was real or simulated. Scoring was done on the following seven-point scale: definitely real (0), probably real (1), possibly real (2), unsure (3), possibly simulated (4), probably simulated (5), or definitely simulated (6).

Customized software was developed to conduct the reader study. The software displays the images sequentially in random order to avoid ordering effects, and indicates the position of the tumor to be evaluated in each image. The observers were able to control the pan, zoom, white balance, and contrast of each image individually. The readers were entirely free to choose the reading pace; however, after an image was scored, the next image was displayed and the readers were not permitted to revisit previous images. The images were displayed on a 5-megapixel mammographic monitor (Totoku ME551i) with 11-bit grayscale, calibrated by the vendor to the DICOM grayscale standard display function. The ratings were saved for each reader independently, and later processed in a multiple-reader multiple-case (MRMC) analysis.32–38

3. RESULTS

3.A. Results of CAD analysis of hybrid images

3.A.1. CAD features

Table III summarizes the mean and standard deviation (std) of each of the features extracted from the real and hybrid mammograms, for both nonspiculated and spiculated tumors. Table III also contains the resulting p-values from a t-test comparing feature values from real and hybrid mammograms for each of the five CAD features. Results for nonspiculated and spiculated tumors were computed independently. The result shown in Table III is that none of the comparisons showed a statistically significant difference between results from the real and hybrid images at the level p < 0.05, i.e., p exceeded 0.05 in every comparison, in many cases by a large margin.

TABLE III.

CAD feature values [mean, (std)] for the mammograms in each category (nonspiculated, spiculated) and p-values from a t-test of the difference in feature values between real and hybrid mammograms.

| CAD feature | Nonspiculated: real | Nonspiculated: hybrid | Nonspiculated: p-value | Spiculated: real | Spiculated: hybrid | Spiculated: p-value |
| --- | --- | --- | --- | --- | --- | --- |
| Sharpness | 44.76 (10.5) × 10³ | 43.82 (9.59) × 10³ | 0.693 | 49.00 (10.16) × 10³ | 44.83 (10.0) × 10³ | 0.084 |
| FWHM | 121.58 (36.89) | 119.12 (28.76) | 0.754 | 174.84 (28.09) | 164.67 (31.97) | 0.156 |
| Average gray level | 42.88 (5.26) × 10³ | 44.78 (4.90) × 10³ | 0.118 | 47.72 (4.45) × 10³ | 47.47 (4.12) × 10³ | 0.807 |
| Contrast | 8.65 (4.68) × 10³ | 7.63 (3.90) × 10³ | 0.318 | 10.60 (5.17) × 10³ | 8.88 (4.08) × 10³ | 0.121 |
| Texture | 18.65 (5.17) × 10³ | 17.78 (4.44) × 10³ | 0.444 | 19.62 (4.09) × 10³ | 18.00 (4.08) × 10³ | 0.097 |

3.A.2. Discrimination power of features

We further explored the generated hybrid images in the context of CAD by determining the extent to which the CAD methods behave similarly on the hybrid images and real mammograms. We accomplished this by applying univariate classifiers to discriminate nonspiculated tumors from spiculated ones, following the approach described in Ref. 30, using either real or hybrid images. Classifier performance was then evaluated using ROC curves, with AUC being used to summarize the performance. In our data set, all the real tumors observed in the mammograms that were classified as nonspiculated were found to be benign at histology, and those classified as spiculated were found to be malignant (histology data acquired from the DDSM database20,21). In this case, therefore, the task of discerning tumors according to their spiculation level was equivalent to discerning benign tumors from malignant ones. The resulting measured mean and 95% confidence interval (CI) of the AUCs are summarized in Table IV, together with the results from a similar analysis obtained from previously published work on real breast tumors,30 demonstrating, in general, good agreement between the real and simulated tumors and the published data. The AUCs for our simulated tumors were all slightly lower than those measured for our real tumors, but within the margin of error.

TABLE IV.

AUC (mean ± 95% CI) of the computer-extracted features in distinguishing between benign and malignant tumors for the real tumor set, the simulated tumor set, and previously published data (Ref. 30).

| Extracted feature | AUC for real tumors | AUC for simulated tumors | Published AUC from real tumors (Ref. 30) |
| --- | --- | --- | --- |
| Sharpness | 0.62 ± 0.14 | 0.54 ± 0.14 | 0.56 |
| FWHM | 0.88 ± 0.08 | 0.86 ± 0.09 | 0.88 |
| Average gray level | 0.76 ± 0.11 | 0.66 ± 0.12 | 0.65 |
| Contrast | 0.63 ± 0.13 | 0.61 ± 0.14 | 0.59 |
| Texture | 0.59 ± 0.12 | 0.55 ± 0.11 | 0.54 |

We compared the performance in terms of AUC obtained from the extracted features using ANOVA analysis, employing the dbm-mrmc software,32–38 testing the hypothesis that each feature’s performance in distinguishing between nonspiculated and spiculated masses is the same for the real masses as for the masses simulated using the proposed model. The mean differences, 95% CI, and p-values obtained by using ANOVA analysis to test this similarity are summarized in Table V. The performance differences were very small and all p-values observed were well above 0.05, showing that no significant differences in performance between the real and simulated masses for the considered extracted features were observed.

TABLE V.

ANOVA results of the AUC differences among real and simulated masses for each extracted feature.

| ANOVA test | Extracted feature | Real vs simulated tumor AUC difference: mean (95% CI) | p-value |
| --- | --- | --- | --- |
| Real vs simulated tumors performance differentiating between nonspiculated and spiculated | Sharpness | 0.081 (−0.075, 0.237) | 0.307 |
| | FWHM | 0.023 (−0.087, 0.135) | 0.674 |
| | Average gray level | 0.098 (−0.017, 0.214) | 0.1 |
| | Contrast | 0.020 (−0.09, 0.13) | 0.722 |
| | Texture | 0.035 (−0.08, 0.156) | 0.561 |

3.B. Reader study to evaluate visual realism

Figure 14 shows the distribution of realism scores assigned by each radiologist (on the seven-point scale), displaying their means and 95% confidence intervals. Table VI provides a statistical comparison of the distribution of these ratings for each radiologist by the Mann–Whitney U-test, testing against the null hypothesis that the scores are drawn from equivalent distributions, similar to comparisons in previous studies where breast tumor simulations were proposed.2,3 The collected data were also used to construct ROC curves for the task of distinguishing real from simulated tumors, both for each radiologist individually and in an MRMC analysis using the dbm-mrmc software available from the University of Chicago.32–38 The resulting AUC values in the MRMC analysis, as well as the independent AUC values for each radiologist, are reported in Table VI, showing that AUC = 0.5 was in the 95% CI for each reader and mass type (nonspiculated and spiculated).
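The U-test and the empirical ROC area computed from the same ratings are closely related: the empirical AUC equals the Mann–Whitney U statistic divided by the number of (real, simulated) image pairs, with tied ratings counted as one half. The short Python sketch below illustrates this relationship for a single reader. It is not the dbm-mrmc software used in the study, the function name is hypothetical, and the ratings are made up for illustration.

```python
def auc_from_ratings(real_scores, simulated_scores):
    """Empirical AUC for telling 'simulated' from 'real' using ordinal ratings.

    Counts, over all (simulated, real) pairs, how often the simulated image
    received the higher 'simulated' rating; ties count one half. The result
    equals the Mann-Whitney U statistic divided by the number of pairs.
    """
    if not real_scores or not simulated_scores:
        raise ValueError("both groups need at least one rating")
    u = 0.0
    for s in simulated_scores:
        for r in real_scores:
            if s > r:
                u += 1.0
            elif s == r:
                u += 0.5
    return u / (len(real_scores) * len(simulated_scores))


if __name__ == "__main__":
    # Made-up ratings on the 0 (definitely real) to 6 (definitely simulated) scale.
    real = [1, 2, 2, 3, 4]
    simulated = [2, 2, 3, 3, 5]
    print(auc_from_ratings(real, simulated))
```

An AUC of 0.5 corresponds to a reader whose ratings do not separate the two groups at all, which is the outcome that supports realism in this study design.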

FIG. 14.

FIG. 14.

Rating distribution for each radiologist in the reader study. Error bars indicate 95% confidence intervals.

TABLE VI.

U-test comparison of ratings between real and simulated tumors for each observer, testing the hypothesis that real and simulated tumors have the same score distributions. AUC values are shown for each individual reader and for an MRMC analysis.

| Reader | Years of experience | U-test p value: nonspiculated | U-test p value: spiculated | Mean AUC (95% CI): nonspiculated | Mean AUC (95% CI): spiculated |
| --- | --- | --- | --- | --- | --- |
| Reader 1 | 2 | 0.3 | 0.861 | 0.57 (0.42, 0.69) | 0.49 (0.38, 0.60) |
| Reader 2 | 8 | 0.752 | 0.006 | 0.52 (0.37, 0.67) | 0.69 (0.58, 0.78) |
| Reader 3 | 4 | 0.670 | 0.164 | 0.47 (0.32, 0.62) | 0.57 (0.47, 0.68) |
| Reader 4 | 4 | 0.709 | 0.049 | 0.47 (0.34, 0.63) | 0.61 (0.50, 0.71) |
| Reader 5 | 39 | 0.009 | 0.136 | 0.68 (0.55, 0.78) | 0.58 (0.47, 0.68) |
| MRMC | | | | 0.54 (0.44, 0.65) | 0.58 (0.50, 0.68) |

4. DISCUSSION

We have presented a method to generate a collection of random simulated 3D breast masses, with a user-defined spiculation degree. We have also described a method for embedding the simulated masses within real digitized mammograms, yielding hybrid images for which the ground truth about the tumor location, extent, and characteristics is known. We evaluated the realism of the simulated masses by measuring the extent to which CAD features extracted from our proposed hybrid images are similar to those extracted from digitized mammograms containing real masses, both in terms of the features’ values as well as their discriminating power (as judged by AUC). We also conducted an expert reader study to evaluate visual realism of the simulated hybrid images, in which we measured the readers’ ability to distinguish real and hybrid images, and found that they were not able to achieve significantly better than random guessing.

The values reported in Table III show that CAD features extracted from the hybrid mammograms resulted in a distribution similar to that obtained from the real mammograms. We found no statistically significant difference for any of the features (at level p < 0.05) between the real and hybrid images. Indeed, many of the comparisons yielded very high p-values, suggesting good correspondence between the features computed from the real and hybrid mammograms. We further evaluated the discrimination power of each feature to discern between nonspiculated and spiculated masses, both for the real and hybrid mammograms, yielding the results reported in Table IV. The features were found to perform similarly on the real and hybrid images, and yielded discrimination power similar to that reported for these specific features in an independent study (Ref. 30). Although differences can be observed in the mean performance in real and hybrid images, further investigation of these differences by ANOVA analysis yielded no evidence of statistically significant differences (Table V).

We also evaluated realism by measuring the ability of five experienced radiologist readers to distinguish the real masses from the simulated ones. The realism scores assigned by the readers (Fig. 14) showed similar distributions for both real and simulated masses, all yielding values around 2–3 on the 0-to-6 realism scale (i.e., on the scale defined in Sec. 2.E.2, the readers rated all masses, whether real or simulated, as “possibly real” or “unsure”). This fact, together with comments received from the observers about the difficulty of the discrimination task, suggests that the hybrid phantoms are indeed realistic.

We quantitatively analyzed the overall performance of the readers, given the collected scores, for the task of discerning real and simulated masses in an MRMC scenario. This analysis resulted in a mean AUC of 0.544, with a 95% CI of (0.44, 0.65), for nonspiculated tumors, and a mean AUC of 0.588, with a 95% CI of (0.50, 0.68), for spiculated tumors, as indicated in Table VI. The independent AUC values for each observer are also indicated in the same table. The results suggest that the readers were not able to perform significantly better than random guessing in distinguishing between the real and simulated tumors. These AUCs are lower than previously published AUCs (Refs. 2–4) and close to the ideal 0.5. The large overlap of confidence intervals may also be due to the relatively small number of readers and to reader variability; statistical significance might have been established if a greater number of simulations or readers had been available. There was also some reader variability, which raises the question of whether less-experienced readers may behave differently from more-experienced ones. Reader 2 presented the highest accuracy in distinguishing real from simulated tumors in the spiculated category, while showing low accuracy in the nonspiculated category. Surprisingly, Reader 5, who has substantially more experience than the others, presented the highest accuracy in the nonspiculated category, but moderate accuracy in the spiculated category. One of the previously published methods with highest accuracy was described in Ref. 5, with mean AUC of 0.55 (but without a distinction between masses according to their margin spiculation level). As the computational embedding approach in that work was based on previous work by our group,1,7 which is expanded in this paper, we hypothesize that the proposed embedding process may play an important role in the realistic appearance of the simulations.

The method for generating an array of random simulations presented here depends on a large set of parameters that were user-defined and describe the shape of the simulated masses. The goal of the project was to develop a method that is capable of simulating the wide variation in appearance of breast masses on mammograms. We were not trying to develop a model from first principles, so we took a more pragmatic approach. With our goal in mind, we varied the parameters to change the appearance of the mass to get the desired appearance. The exact values of the parameters were less important than the appearance of the lesion and how the parameters can be changed to change the appearance of the lesion. The parameter distribution employed in the evaluation of our method was determined by an iterative process guided by authors R.A.S. and R.M.N. Several simulation sets were generated from an array of parameter values, and these were later adjusted after discussion of the appearance of the generated simulations prior to the embedding process, followed by the generation of a new set and repeating the process until the authors were satisfied with the realism of the results. As the parameters reported in Table II correspond to those in our evaluation study, we suggest that they can be used as a guideline to generate successful simulations of breast masses. However, the sensitivity of each of these parameters for the generation of visually realistic results is not presented here and remains a matter to be studied in future work.

It is also important to note that our methodology of embedding simulated mass projections onto real digitized mammograms does not explicitly take into account film blurring or scatter;8 however, the manner in which we used the real background appears to have preserved sufficient blur and noise effects so as to obtain very realistic phantoms, as demonstrated by the CAD and human reader studies. We acknowledge, though, that further consideration of sharpness and noise effects may improve the results of the embedding process, which will be taken up in future work.

Owing to the availability and ease of use of the DDSM database, we based these first experiments on that source of images. However, we anticipate that our 3D modeling approach can be adapted to other data sets and imaging modalities, which we will study in future research. Further validation studies would be required to prove the effectiveness of our method in these new settings.

The benefit of the proposed 3D mass phantom is that it allows a large number of breast masses to be simulated with known characteristics, extent, and location. We anticipate that the proposed method could have useful application in training of radiologists and in the evaluation and optimization of imaging systems and algorithms.

5. CONCLUSIONS

In this work, we have presented a realistic, 3D computational breast mass simulation model exhibiting the fine structures and details observed in mammographic images, and we have described a method of embedding the simulated masses within real clinical digitized mammograms. The model’s versatility allows the creation of a large number of different tumor cases, both benign and malignant, with borders ranging from smooth to highly spiculated. We have demonstrated that computer-aided diagnosis (CAD) algorithms yield very similar results when applied to our hybrid mammograms and real ones, and we have found that expert readers did not perform significantly better than random guessing in discriminating our hybrid mammograms from real ones. This tool may be helpful in detectability studies for modern breast imaging techniques, where a large database of tumors of different sizes and characteristics, with known ground truth for each replicated mass, is a requirement. The method may also be useful for training purposes. We plan to make mass model data available via a web site; in the meantime, interested parties can obtain data examples by contacting the corresponding author.

ACKNOWLEDGMENTS

This research was supported by NIH/NIBIB Grant No. EB009715, NIH/NHLBI Grant No. HL091017, and NIH/NCI Grant Nos. CA111976 and CA132973.

APPENDIX: ANALYTICAL SOLUTION FOR VOXEL POPULATION WITH SPICULE STRUCTURES

The following appendix derives an analytic solution for the population of a voxelized volume with simulated spicules, defined as tubular structures described by a set of segments, as explained earlier in the paper. Considering a position $p=(x_p,y_p)$ in the projection plane, we computed the two intersection points of a line perpendicular to that plane and passing through $p$ with each defined segment in the structure independently. The process was repeated for a discrete grid of locations in the projection plane defined within the limits of each particular segment considered. The voxelized volume was produced by populating the voxels between the intersection points for each location in the defined grid. The correspondence of voxelized volume and projected image follows the assumption that the rays are perpendicular to the projection plane, which is not necessarily true for imaging modalities such as tomosynthesis or computed tomography, but was satisfactory for the simulations presented here. For situations in which this assumption does not hold, the simulated image can be created by projecting a voxelized volume of the simulated tumor using a forward imaging model appropriate to the given imaging modality.
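The column-by-column population described above can be illustrated with the simplest shape involved, a sphere (the segment caps are hemispherical). The Python sketch below is a minimal illustration under the stated parallel-ray assumption: for each grid location it finds the two points where a ray perpendicular to the projection plane crosses the sphere, fills the voxels between them, and records the chord length as the projected thickness. It is not the authors' MATLAB implementation; the function names, the use of voxel centers as sample points, and the small grid size are assumptions chosen to keep the example short.

```python
import math


def sphere_intersections(xp, yp, center, radius):
    """Return (z_bottom, z_top) where the line through (xp, yp), perpendicular
    to the projection plane, crosses the sphere, or None if it misses."""
    cx, cy, cz = center
    d2 = (xp - cx) ** 2 + (yp - cy) ** 2
    if d2 > radius ** 2:
        return None
    half_chord = math.sqrt(radius ** 2 - d2)
    return cz - half_chord, cz + half_chord


def voxelize_sphere(n, voxel_size, center, radius):
    """Fill an n x n x n boolean volume column by column and return it
    together with the projected thickness map (same units as voxel_size)."""
    volume = [[[False] * n for _ in range(n)] for _ in range(n)]
    thickness = [[0.0] * n for _ in range(n)]
    for ix in range(n):
        for iy in range(n):
            # Evaluate the ray at the center of this voxel column.
            xp = (ix + 0.5) * voxel_size
            yp = (iy + 0.5) * voxel_size
            hit = sphere_intersections(xp, yp, center, radius)
            if hit is None:
                continue
            z_bottom, z_top = hit
            thickness[ix][iy] = z_top - z_bottom
            # Populate voxels whose centers lie between the two intersections.
            for iz in range(n):
                zc = (iz + 0.5) * voxel_size
                if z_bottom <= zc <= z_top:
                    volume[ix][iy][iz] = True
    return volume, thickness


if __name__ == "__main__":
    # 20^3 voxels of 0.05 mm (50 um) with a 0.3 mm sphere in the middle.
    vol, thick = voxelize_sphere(20, 0.05, (0.5, 0.5, 0.5), 0.3)
    filled = sum(v for plane in vol for column in plane for v in column)
    print("filled voxels:", filled)
    print("central thickness (mm):", round(thick[10][10], 4))
```

Only the two intersection points per grid location are needed, so the projection (the thickness map) can be formed without ever storing the full volume, which is consistent with the much smaller memory footprint reported for projections in Table I.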

As defined in Sec. 2.B, each segment is described as a conical frustum [defined by parameters $b=(p_s,p_e,r_{ini},r_{fin})$], capped by a hemisphere (see Fig. 15). (Note that we suppress the subscript $n$ defining the segment number to simplify the notation.)

FIG. 15.

FIG. 15.

Example representation of an individual segment and its projection.

Finding the possible intersection of a line at $p$ perpendicular to the projection plane with a sphere is trivial. We define a variable indicating whether there is an intersection with the sphere for the point $p$, $\mathrm{Int}_{sp}(p)$, and the top and bottom intersecting points with the sphere are indicated by $(x_{xspt}(p),y_{xspt}(p),z_{xspt}(p))$ and $(x_{xspb}(p),y_{xspb}(p),z_{xspb}(p))$, respectively. The intersection of a line perpendicular to the projection plane and passing through location $p$ with the conical frustum is computed for those locations inside the area projected by the frustum. The area projected by this tubular structure consists of two ellipses (one produced by its base and one by its top) joined together with the two lines tangent to both of them, as we can observe in Fig. 15. This tubular object is defined by its starting point $p_s=(x_s,y_s,z_s)$, its ending point $p_e=(x_e,y_e,z_e)$, its initial radius $r_{ini}$, and its final radius $r_{fin}$. We find the projection of the starting and ending points in the projected plane at $(x_s,y_s,0)$ and $(x_e,y_e,0)$, respectively. We also compute the direction of the axis of the tubular object: $d_t=p_e-p_s=(d_{tx},d_{ty},d_{tz})$. Then, we can find a vector perpendicular to the axis of the cylinder and parallel to the projection plane: $n_t=(d_{ty},-d_{tx},0)$. A vector contained in the plane formed by the cylinder base (or top) and perpendicular to the vector parallel to the projection plane can be found by the cross product of those two. This vector describes the direction to follow in order to find the highest and lowest points for the base (or top) of the frustum, which will be needed when computing the projection

$$v_t=\frac{d_t\times n_t}{\lVert d_t\times n_t\rVert}=(v_{tx},v_{ty},v_{tz}), \tag{A1}$$

$$p_{bh}=(x_{bh},y_{bh},z_{bh})=\begin{cases}p_s+r_{ini}\,(v_{tx},v_{ty},v_{tz}) & \text{if } v_{tz}\ge 0\\ p_s+r_{ini}\,(-v_{tx},-v_{ty},-v_{tz}) & \text{if } v_{tz}<0,\end{cases} \tag{A2}$$

$$p_{bl}=(x_{bl},y_{bl},z_{bl})=\begin{cases}p_s+r_{ini}\,(-v_{tx},-v_{ty},-v_{tz}) & \text{if } v_{tz}\ge 0\\ p_s+r_{ini}\,(v_{tx},v_{ty},v_{tz}) & \text{if } v_{tz}<0,\end{cases} \tag{A3}$$

$$p_{th}=(x_{th},y_{th},z_{th})=\begin{cases}p_e+r_{fin}\,(v_{tx},v_{ty},v_{tz}) & \text{if } v_{tz}\ge 0\\ p_e+r_{fin}\,(-v_{tx},-v_{ty},-v_{tz}) & \text{if } v_{tz}<0,\end{cases} \tag{A4}$$

$$p_{tl}=(x_{tl},y_{tl},z_{tl})=\begin{cases}p_e+r_{fin}\,(-v_{tx},-v_{ty},-v_{tz}) & \text{if } v_{tz}\ge 0\\ p_e+r_{fin}\,(v_{tx},v_{ty},v_{tz}) & \text{if } v_{tz}<0,\end{cases} \tag{A5}$$

where $p_{bh}$, $p_{bl}$ and $p_{th}$, $p_{tl}$ are the highest and lowest points in the base of the cylinder and the highest and lowest points in the top of the cylinder, respectively. We define the top axis of the cylinder $a_t=p_{th}-p_{bh}=(a_{tx},a_{ty},a_{tz})$ (the highest line going along the cylinder length) and the bottom axis $a_b=p_{tl}-p_{bl}=(a_{bx},a_{by},a_{bz})$ (the lowest line going along the cylinder length). We can then specify the boundaries of the ellipses projected from the base and top of the frustum (Fig. 15). The ellipse projected by its base has a major axis equal to the initial radius, $a_{base}=r_{ini}$, and a minor axis equal to $b_{base}=\sqrt{(x_{bh}-x_{bl})^2+(y_{bh}-y_{bl})^2}/2$. In the same way, the ellipse projected by its top has a major axis equal to the final radius, $a_{top}=r_{fin}$, and a minor axis equal to $b_{top}=\sqrt{(x_{th}-x_{tl})^2+(y_{th}-y_{tl})^2}/2$. The point on the line formed by the projection of the frustum axis in the projection plane that is closest to $p$, which we will call $p_{line}$, can be found as

$$u=\frac{(x_p-x_s,\;y_p-y_s)\cdot(d_{tx},d_{ty})}{d_{tx}^2+d_{ty}^2}, \tag{A6}$$

$$p_{line}=(x_s,y_s)+u\,(d_{tx},d_{ty}). \tag{A7}$$
We should then expect the segment defined between $p$ and $p_{line}$ to be perpendicular to the projection of the cylinder axis. Computing the distance $D$ between those two points lets us set up the rules to decide whether $p$ is inside the area projected by the frustum, distinguishing between the two ellipses formed from its base and top and the rest of the area. This distinction is important, since it affects the way the frustum surface intersects with a line perpendicular to the projection plane

$$D=\lVert p_{line}-p\rVert. \tag{A8}$$
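Equations (A6)–(A8) amount to an orthogonal projection of $p$ onto the projected segment axis. A minimal Python sketch of that step follows; the function name and the explicit handling of a segment whose axis is parallel to the rays (zero projected length) are assumptions added for illustration and are not described in the text.

```python
import math


def project_onto_axis(p, ps_xy, dt_xy):
    """Return (u, p_line, D) for point p and the projected segment axis.

    u      : normalized position along the axis (0 at start, 1 at end), Eq. (A6)
    p_line : point on the projected axis line closest to p, Eq. (A7)
    D      : distance between p and p_line, Eq. (A8)
    """
    xp, yp = p
    xs, ys = ps_xy
    dtx, dty = dt_xy
    denom = dtx ** 2 + dty ** 2
    if denom == 0.0:
        # Axis parallel to the rays: its projection collapses to a point.
        raise ValueError("segment axis has no extent in the projection plane")
    u = ((xp - xs) * dtx + (yp - ys) * dty) / denom
    p_line = (xs + u * dtx, ys + u * dty)
    D = math.hypot(p_line[0] - xp, p_line[1] - yp)
    return u, p_line, D


if __name__ == "__main__":
    # Axis from (0, 0) to (4, 0); the query point sits 3 units off the axis.
    print(project_onto_axis((1.0, 3.0), (0.0, 0.0), (4.0, 0.0)))
```

For the example shown, the point lies a quarter of the way along the axis ($u=0.25$) at a distance $D=3$ from it.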

As we can observe in Fig. 15, the intersection of the frustum with a plane containing $p$ (displayed in red in the figure), perpendicular to the projection plane and perpendicular to the projection of the frustum axis, always forms an ellipse as long as it is not intersecting with the frustum base or top, following the geometric definition of an ellipse in conic theory. We can compute the intersection of such a plane with the top and bottom axes of the cylinder, defined as $a_t$ and $a_b$

$$u_t=\frac{(x_p-x_{bh},\;y_p-y_{bh})\cdot(a_{tx},a_{ty})}{a_{tx}^2+a_{ty}^2}, \tag{A9}$$

$$p_{t\text{-}line}=(x_{bh},y_{bh},z_{bh})+u_t\,(a_{tx},a_{ty},a_{tz})=(x_{t\text{-}line},y_{t\text{-}line},z_{t\text{-}line}), \tag{A10}$$

$$u_b=\frac{(x_p-x_{bl},\;y_p-y_{bl})\cdot(a_{bx},a_{by})}{a_{bx}^2+a_{by}^2}, \tag{A11}$$

$$p_{b\text{-}line}=(x_{bl},y_{bl},z_{bl})+u_b\,(a_{bx},a_{by},a_{bz})=(x_{b\text{-}line},y_{b\text{-}line},z_{b\text{-}line}), \tag{A12}$$

where $p_{t\text{-}line}$ and $p_{b\text{-}line}$ are the intersection points of the defined top and bottom axes with the plane containing $p$ that is perpendicular to the projection plane and to the projection of the axis. This way, the $Z$ coordinates of those two points, $z_{t\text{-}line}$ and $z_{b\text{-}line}$, define the major axis $a_{cyl}$ of the intersecting ellipse between plane and frustum. The middle point between the top and bottom intersecting points specifies the center of such ellipse, $p_{center}$

$$a_{cyl}=z_{t\text{-}line}-z_{b\text{-}line}, \tag{A13}$$

$$p_{center}=\frac{p_{t\text{-}line}+p_{b\text{-}line}}{2}=(x_{center},y_{center},z_{center}). \tag{A14}$$

In order to find the minor axis of the ellipse, we need to find the corresponding frustum radius defined at that middle point, which we call $r_{rad}$. This radius is determined by the point on the center axis of the frustum that is closest to $p_{center}$, found in a way similar to that described before,

$$u_{rad}=\frac{(p_{center}-p_s)\cdot d_t}{d_t\cdot d_t}, \tag{A15}$$

$$p_{rad}=p_s+u_{rad}\,d_t=(x_{rad},y_{rad},z_{rad}), \tag{A16}$$

$$r_{rad}=r_{fin}\,u_{rad}+r_{ini}\,(1-u_{rad}). \tag{A17}$$

Considering that the base of the frustum is spherical, we can then find the value of the minor axis of the intersecting ellipse, $b_{cyl}$

$$b_{cyl}=\sqrt{r_{rad}^2-\lVert p_{center}-p_{rad}\rVert^2}. \tag{A18}$$

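Then, the area covered by the frustum in the projection plane can be determined

$$\mathrm{Int}_{eb}(p)=\begin{cases}1 & \text{if } u\sqrt{d_{tx}^2+d_{ty}^2}\le b_{base} \text{ and } \left(\dfrac{u\sqrt{d_{tx}^2+d_{ty}^2}}{b_{base}}\right)^2+\dfrac{D^2}{a_{base}^2}\le 1\\ 0 & \text{otherwise,}\end{cases} \tag{A19}$$

$$\mathrm{Int}_{et}(p)=\begin{cases}1 & \text{if } (u-1)\sqrt{d_{tx}^2+d_{ty}^2}\le b_{top} \text{ and } \left(\dfrac{(u-1)\sqrt{d_{tx}^2+d_{ty}^2}}{b_{top}}\right)^2+\dfrac{D^2}{a_{top}^2}\le 1\\ 0 & \text{otherwise,}\end{cases} \tag{A20}$$

$$\mathrm{Int}_{ec}(p)=\begin{cases}1 & \text{if } 0\le u\le 1 \text{ and } D\le b_{cyl} \text{ and } \mathrm{Int}_{eb}(x_p,y_p)=0 \text{ and } \mathrm{Int}_{et}(x_p,y_p)=0\\ 0 & \text{otherwise,}\end{cases} \tag{A21}$$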

where Intebp, Intetp, and Intecp indicate if the point p is inside (1) or outside (0) of the projected area of the base, top, and rest of the conical frustum in the projection plane, respectively, as shown in Fig. 15, with the boundaries indicated in yellow. The top and bottom intersecting points, xxcyltp,yxcyltp,zxcyltp and xxcylbp,yxcylbp,zxcylbp, respectively, are then found following the definition of the ellipse formed with the intersection of the frustum and plane:

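$$(x^{p}_{xcylt}, y^{p}_{xcylt}, z^{p}_{xcylt}) = \left(x_p,\; y_p,\; z_{center} + a_{cyl}\sqrt{1 - \frac{D^2}{b_{cyl}^2}}\right), \tag{A22}$$
$$(x^{p}_{xcylb}, y^{p}_{xcylb}, z^{p}_{xcylb}) = \left(x_p,\; y_p,\; z_{center} - a_{cyl}\sqrt{1 - \frac{D^2}{b_{cyl}^2}}\right). \tag{A23}$$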
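
Equations (A22) and (A23) can be evaluated as in the sketch below. It is a minimal sketch under stated assumptions and not the authors' implementation: the function name `frustum_z_crossings` is hypothetical, $D$ is passed in as already computed for the point $p$, and the function returns `None` when $D$ exceeds $b_{cyl}$, which corresponds to the condition $D \le b_{cyl}$ in Eq. (A21) failing.

```python
import math


def frustum_z_crossings(z_center, a_cyl, b_cyl, dist):
    """Z coordinates of the top and bottom crossings, per (A22)-(A23).

    dist plays the role of D. Returns None when the ray through p misses
    the frustum body (b_cyl not positive, or |dist| larger than b_cyl).
    """
    if b_cyl <= 0.0 or abs(dist) > b_cyl:
        return None
    half = a_cyl * math.sqrt(1.0 - (dist / b_cyl) ** 2)
    return z_center + half, z_center - half


# Hypothetical values: ellipse centered at z = 1 with a_cyl = 2, b_cyl = 5.
on_axis = frustum_z_crossings(1.0, 2.0, 5.0, 0.0)
off_axis = frustum_z_crossings(1.0, 2.0, 5.0, 3.0)
missed = frustum_z_crossings(1.0, 2.0, 5.0, 6.0)
```

The X and Y coordinates of both crossings are those of $p$ itself, so only the Z values need to be computed.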

  • 39.Throughout the paper we will describe clinical mammograms exhibiting actual tumors as “real” mammograms, and normal clinical images modified to include simulated masses as “hybrid” images.

Articles from Medical Physics are provided here courtesy of American Association of Physicists in Medicine
