Abstract
Multi-object segmentation with mutual interaction is a challenging task in medical image analysis. We report a novel solution to a segmentation problem in which target objects of arbitrary shape mutually interact with terrain-like surfaces, a configuration that occurs widely in medical imaging. The approach incorporates context information during simultaneous segmentation of multiple objects. The object–surface interaction information is encoded by adding weighted inter-graph arcs to the graph. A globally optimal solution is achieved by solving a single maximum flow problem in low-order polynomial time. The method’s performance was evaluated on the delineation of lung tumors in megavoltage cone-beam CT images in comparison with an expert-defined independent standard. The evaluation showed that our method generated highly accurate tumor segmentations. Compared with the conventional graph-cut method, our new approach provided significantly better results (p < 0.001). The Dice coefficient obtained by the conventional graph-cut approach (0.76 ± 0.10) was improved to 0.84 ± 0.05 when employing our new method for pulmonary tumor segmentation.
1 Introduction
Medical image segmentation allows medical image data to be analyzed quantitatively, which plays a vital role in numerous biomedical applications. Recently, graph-based methods with a global energy optimization property have attracted considerable attention (e.g., [1–3]). Despite their widespread use for image segmentation, they may have problems in the presence of multiple target objects with weak boundaries, similar intensity information, and serious mutual interaction [4].
To solve the problem of multi-object segmentation with mutual interaction, Delong and Boykov [5] developed a multi-region framework based on a graph cut method. An interaction term is added to the energy function, which incorporates the geometric interaction relationship between different regions. Their method is topologically flexible and shares some elegance with the level set methods. Li et al. [3] proposed an approach called “graph search” for globally optimal segmentation of multiple terrain-like surfaces (Fig. 1(a)), which was successfully extended to a wide spectrum of applications [6, 4, 7, 8]. For each target surface a corresponding geometric graph is constructed. The relative position information between pairs of surfaces is enforced by adding weighted arcs between corresponding graphs. Multiple surfaces can be detected simultaneously in a globally optimal fashion by solving a single maximum flow problem in a transformed graph. Compared with the graph cut approach, this method requires no human interaction and provides an easier way to incorporate shape prior information [9]. The major limitation of Li et al.’s family of graph search methods is that it is non-trivial to deal with a region of complicated topology without an approximate segmentation of the region’s surface.
In this paper, we consider a special structure consisting of mutually interacting terrain-like surfaces and regions of “arbitrary shape”. Here “arbitrary shape” means that the target region itself does not have any specific preferred shape. Mutual interaction exists between the pairs of terrain-like surfaces and regions in the sense that their topological information and relative positions are usually known beforehand. Incorporating these interrelations into the segmentation plays a significant role in accurate segmentation when target surfaces and regions lack clear edges and have similar intensity distributions. Fig. 1(b), (c), and (d) show some typical medical imaging examples.
The main idea is to make use of the advantages of both the graph cut method [1] and Li et al.’s graph search method [3]. For regions of arbitrary shape we construct a corresponding graph following the graph cut method, making use of its shape flexibility. For terrain-like surfaces the graph is built based on the graph search approach, which requires no initial seeds and provides good shape control. The key is to introduce an additional interaction term in the energy function, which can be implemented by adding inter-graph arcs between the pairs of different types of graphs, enforcing known interaction information between the target terrain-like surfaces and the regions of arbitrary shape. Then a globally optimal solution can be obtained by solving a single maximum flow problem, yielding the simultaneous segmentation of surfaces and regions.
Our paper is organized as follows. In Section 2 we describe our proposed framework in detail. In Sections 3–5 a quantitative and comparative performance evaluation is performed on Mega-Voltage Cone Beam CT (MVCBCT) images of lung tumors. Section 6 presents concluding remarks.
2 Method
To simplify the presentation and to ease the understanding of our method, let us first consider the task of detecting one terrain-like surface and one region of arbitrary topology that mutually interact. Note that the same principles used for this illustration are directly applicable to multiple pairs of surfaces and regions with interactions between those two kinds of targets.
2.1 Energy function
Consider a volumetric image I of size X × Y × Z. For each (x, y) pair, the voxel subset {v(x, y, z) | 0 ≤ z < Z} forms a column parallel to the z-axis, denoted by p(x, y). Each column has a set of neighboring columns for a certain neighbor setting [3]. A terrain-like surface of particular interest, denoted as ST, is a surface that intersects each column p(x, y) at exactly one voxel. It can be defined as a function ST(x, y), mapping each (x, y) pair to its z-value. The target region of arbitrary shape, denoted as RA, includes all the voxels v inside the region. Fig. 2(a) shows one example 2-D slice from a 3-D image.
To explain the employed cost function, let us start with the energy terms used for detecting the terrain-like surface, which are similar in form to those described in [9]. Suppose an edge-based cost cv is assigned to each voxel v ∈ I, which is inversely related to the likelihood that the desired surface ST indeed contains this voxel. For any pair of neighboring columns (p, q) ∈ Nc, where Nc denotes the set of neighboring column pairs, a convex function penalizing the surface shape change of ST on p and q is expressed as fp,q(ST(p) − ST(q)). Then the energy term Egs takes the form
$$E_{gs}(S_T) = \sum_{v \in S_T} c_v + \sum_{(p,q) \in \mathcal{N}_c} f_{p,q}\big(S_T(p) - S_T(q)\big) \tag{1}$$
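As a minimal sketch of how the graph search energy is evaluated, the following Python fragment scores a candidate surface on a 2-D toy image: it sums the edge-based costs of the voxels on the surface and adds a convex penalty on the height difference between adjacent columns. The function name, the toy cost image, and the quadratic penalty β·h² are illustrative assumptions, not the authors' implementation.

```python
def graph_search_energy(cost, surface, beta):
    """Edge-cost term plus convex shape term for one candidate surface.

    cost[x][z] : edge-based cost of voxel (x, z) (assumed toy layout)
    surface[x] : z-value of the surface on column x
    beta       : weight of the assumed quadratic penalty f(h) = beta * h**2
    """
    # Sum of the costs of the voxels that lie on the surface.
    edge_term = sum(cost[x][z] for x, z in enumerate(surface))
    # Convex penalty on height changes between neighboring columns.
    shape_term = sum(
        beta * (surface[x] - surface[x + 1]) ** 2
        for x in range(len(surface) - 1)
    )
    return edge_term + shape_term


toy_cost = [[5, 1, 5], [5, 2, 5], [1, 5, 5]]  # three columns, three voxels each
print(graph_search_energy(toy_cost, [1, 1, 0], beta=2))
```

Here the neighbor setting is simply "adjacent columns"; in 3-D the second sum would run over all neighboring column pairs.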
For segmentation of the target region RA we employ the well-known binary graph cut energy [1, 5]. Let l denote the binary variables assigned for each voxel, indexed as lv over voxels v ∈ I. In our notation, lv = 1 means that v belongs to the target region RA and lv = 0 means that v belongs to the background. The graph cut energy Egc is expressed as
$$E_{gc}(R_A) = \sum_{v \in I} D_v(l_v) + \sum_{(v_i, v_j) \in \mathcal{N}_v} V_{i,j}(l_i, l_j) \tag{2}$$
where Dv is the data term measuring how well the label lv fits the voxel v given the image data, Nv defines the neighboring relationship between voxels, and the boundary energy term Vi,j(li, lj) is the penalty of assigning the neighboring voxels (vi, vj) to labels li and lj, respectively.
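The binary graph cut energy can be evaluated in the same spirit. The sketch below assumes a boundary term that charges a fixed weight whenever two neighboring voxels receive different labels; that choice, the function name, and the toy numbers are assumptions made for illustration only.

```python
def graph_cut_energy(labels, data_cost, neighbors, boundary_weight):
    """Data term plus boundary term for a binary labeling.

    labels[v]          : 1 if voxel v is in the region, 0 if background
    data_cost[v][l]    : cost of giving label l to voxel v
    neighbors          : list of neighboring voxel index pairs (i, j)
    boundary_weight    : penalty when neighbors disagree (assumed constant)
    """
    data_term = sum(data_cost[v][l] for v, l in enumerate(labels))
    boundary_term = sum(
        boundary_weight for i, j in neighbors if labels[i] != labels[j]
    )
    return data_term + boundary_term


toy_data = [(1, 8), (8, 1), (8, 1), (1, 8)]  # (cost of label 0, cost of label 1)
toy_pairs = [(0, 1), (1, 2), (2, 3)]          # a 1-D chain of four voxels
print(graph_cut_energy([0, 1, 1, 0], toy_data, toy_pairs, boundary_weight=2))
```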
As mentioned above, incorporation of known interrelations between terrain-like surface ST and object RA plays a key role for accurate segmentation. To enforce a priori object–surface interaction information, we add an interaction term Einteraction(ST ,RA) to our energy function and the energy function takes the form
$$E(S) = E_{gs}(S_T) + E_{gc}(R_A) + E_{interaction}(S_T, R_A) \tag{3}$$
Our objective is to find the optimal set S = {ST ,RA} such that the above energy is minimized.
2.2 Incorporation of geometric interactions
In this section, we specify the geometric interactions incorporated in our framework between the target surface ST and the region RA, and show how to enforce them through the interaction energy term Einteraction(ST, RA). We start with the case in which the region RA is expected to lie at least a given distance d below the terrain-like surface ST. Any voxel in RA that violates this expectation incurs a penalty. More specifically, let z(v) denote the z coordinate of voxel v, and let ST(p) be the z-value of the surface ST on column p, representing the “height” of the surface on that column. Then for any voxel v ∈ p, if v ∈ RA and ST(p) − z(v) < d, a penalty wv is given (Fig. 2(b)). Our interaction energy term takes the form
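$$E_{interaction}(S_T, R_A) = \sum_{p} \sum_{\substack{v \in p,\; v \in R_A \\ S_T(p) - z(v) < d}} w_v \tag{4}$$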
For the constraint that RA is a priori expected to be positioned “higher” than the terrain-like surface ST, a similar formulation is employed. For any voxel v ∈ p, if v ∈ RA (lv = 1) and z(v) − ST(p) < d, a penalty wv is given (Fig. 2(c)). Our interaction energy term then takes the form
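$$E_{interaction}(S_T, R_A) = \sum_{p} \sum_{\substack{v \in p,\; v \in R_A \\ z(v) - S_T(p) < d}} w_v \tag{5}$$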
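The two interaction cases differ only in the sign of the gap between the surface and the voxel. The following sketch counts violating region voxels for either case; it assumes a single penalty value for every voxel (the text allows a per-voxel wv), and the function name and data layout are hypothetical.

```python
def interaction_energy(surface, region, d, penalty, region_below=True):
    """Sum of penalties over region voxels that come too close to the surface.

    surface      : dict mapping a column id to the surface height on it
    region       : set of (column id, z) voxels labeled as region
    d            : required minimum distance
    penalty      : penalty per violating voxel (assumed uniform)
    region_below : True if the region should lie below the surface,
                   False if it should lie above
    """
    total = 0
    for column, z in region:
        gap = surface[column] - z if region_below else z - surface[column]
        if gap < d:
            total += penalty
    return total


toy_surface = {0: 4, 1: 4}
toy_region = {(0, 1), (0, 3), (1, 4)}
print(interaction_energy(toy_surface, toy_region, d=2, penalty=5))
print(interaction_energy(toy_surface, toy_region, d=2, penalty=5, region_below=False))
```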
2.3 Graph construction
Our graph transformation scheme formulates the energy minimization problem as a single computation of maximum flow in the graph.
For the graph cut energy term Egc(RA), a sub-graph GR(NR, AR) is built using the method described in [1]. Every voxel has a corresponding node nR ∈ NR. Two additional nodes, the source (object) s and the sink (background) t, are added. Each node nR has one t-link to each of the source and sink, which enforces the data-term energy. Each pair of neighboring nodes is connected by an n-link, which encodes the boundary energy term. The minimum-cost s-t cut divides the graph GR into two parts: all nodes belonging to the target object are included in the source set and all background nodes are in the sink set.
For the graph search energy term Egs(ST), a sub-graph GT(NT, AT) is constructed following the method described in [10, 4]. Every node nT(x, y, z) ∈ NT corresponds to exactly one voxel v(x, y, z). The positions of nodes reflect the positions of corresponding voxels in the image domain. Two types of arcs are added to the graph: (1) the intra-column arcs with +∞ weight serve to enforce the monotonicity of the target surface ST; and (2) the inter-column arcs incorporate the shape-prior penalties fp,q between the neighboring columns p and q. Each node is assigned a weight wn such that the total weight of a closed set in the graph GT equals the edge-cost term in Egs. A typical graph GT is shown in Fig. 3(a). Then, as in [10, 4], each node nT is connected to either the source s by an arc with weight −wn if wn < 0 or the sink t by an arc with weight wn if wn > 0. Note that the source s and the sink t are the same nodes used in GR for the implementation of the graph cut energy term. Using this construction, we merge the two sub-graphs GR and GT into a single s-t graph G. The original energy minimization can be achieved by solving a maximum flow problem in the graph. The target surface ST can be defined by the minimum-cost s-t cut in the graph. All nodes in GT above surface ST belong to the sink set and all nodes on or below ST belong to the source set of the cut [10].
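One common way to obtain node weights with the closed-set property is to difference the voxel costs along each column, so that the weights of the nodes from the bottom of a column up to height k telescope to the cost of the voxel at height k. The sketch below assumes that differencing scheme (the text only states the property, not the formula) and then applies the terminal-arc rule quoted above; function names and the toy costs are illustrative.

```python
def column_node_weights(costs):
    """Weights whose prefix sums reproduce the voxel costs of one column."""
    return [costs[0]] + [costs[z] - costs[z - 1] for z in range(1, len(costs))]


def terminal_arcs(weights):
    """Arcs (tail, head, capacity) linking nodes to the source 's' or sink 't'."""
    arcs = []
    for z, w in enumerate(weights):
        if w < 0:
            arcs.append(("s", z, -w))   # negative weight: arc from the source
        elif w > 0:
            arcs.append((z, "t", w))    # positive weight: arc to the sink
    return arcs


toy_costs = [7, 4, 9, 2, 6]
toy_weights = column_node_weights(toy_costs)
# The closed set {0, ..., k} has total weight equal to the cost of voxel k.
for k, cost in enumerate(toy_costs):
    assert sum(toy_weights[: k + 1]) == cost
print(toy_weights)
print(terminal_arcs(toy_weights))
```

In practice a constant is often subtracted from the bottom node of a column so that the minimum closed set is never empty; that detail is omitted here.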
To incorporate geometric interaction constraints, additional inter-graph arcs are added between the two sub-graphs. We begin with the case in which region RA is expected to lie at least a given distance d below the terrain-like surface ST. If nR(x1, y1, z1) in the sub-graph GR belongs to the source set (labeled as “object”) and nT(x1, y1, z1 + d) in the sub-graph GT belongs to the sink set, which indicates that ST(x1, y1) − z1 < d, a penalty w contributes to the objective energy function. This can be enforced by adding a directed arc with a weight of w from each node nR(x, y, z) to nT(x, y, z + d), as shown in Fig. 3(b).
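To make the role of the inter-graph arcs concrete, here is a self-contained sketch on a single column of six voxels. It builds a region sub-graph (t-links and n-links), a surface sub-graph (differenced node weights and infinite intra-column arcs), and optionally the arcs from nR(z) to nT(z + d), then reads the segmentation off a minimum s-t cut computed with a plain Edmonds-Karp routine. Everything specific here is an assumption for illustration: the toy costs, the node naming, the large finite stand-in for +∞, the offset on the bottom surface node, and routing the arc to the sink when z + d falls outside the column.

```python
from collections import defaultdict, deque

INF = 10**9  # large finite stand-in for an infinite arc weight


def min_cut_source_set(arcs, s, t):
    """Edmonds-Karp max flow; returns the nodes on the source side of a min cut."""
    cap = defaultdict(lambda: defaultdict(int))
    for u, v, c in arcs:
        cap[u][v] += c
        cap[v][u] += 0  # make sure the reverse residual arc exists
    while True:
        parent = {s: None}
        queue = deque([s])
        while queue and t not in parent:  # breadth-first search in the residual graph
            u = queue.popleft()
            for v, c in cap[u].items():
                if c > 0 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        if t not in parent:
            return set(parent)  # nodes still reachable from s
        path, v = [], t
        while parent[v] is not None:
            path.append((parent[v], v))
            v = parent[v]
        bottleneck = min(cap[a][b] for a, b in path)
        for a, b in path:
            cap[a][b] -= bottleneck
            cap[b][a] += bottleneck


def segment_column(c, d_obj, d_bkg, lam, d, w_inter):
    """Joint surface/region segmentation of one column (region below surface)."""
    Z, arcs = len(c), []
    for z in range(Z):  # region sub-graph
        arcs.append(("s", ("R", z), d_bkg[z]))  # cut if voxel is background
        arcs.append((("R", z), "t", d_obj[z]))  # cut if voxel is object
        if z + 1 < Z:
            arcs.append((("R", z), ("R", z + 1), lam))
            arcs.append((("R", z + 1), ("R", z), lam))
    offset = sum(c) + 1  # keeps the surface closed set non-empty (assumption)
    weights = [c[0] - offset] + [c[z] - c[z - 1] for z in range(1, Z)]
    for z, w in enumerate(weights):  # surface sub-graph
        if w < 0:
            arcs.append(("s", ("T", z), -w))
        elif w > 0:
            arcs.append((("T", z), "t", w))
        if z > 0:
            arcs.append((("T", z), ("T", z - 1), INF))  # intra-column arc
    if w_inter > 0:  # inter-graph arcs nR(z) -> nT(z + d)
        for z in range(Z):
            head = ("T", z + d) if z + d < Z else "t"
            arcs.append((("R", z), head, w_inter))
    source = min_cut_source_set(arcs, "s", "t")
    surface = max(z for z in range(Z) if ("T", z) in source)
    region = {z for z in range(Z) if ("R", z) in source}
    return surface, region


surface_cost = [9, 9, 9, 1, 9, 9]   # strong surface evidence at z = 3
object_cost = [8, 1, 1, 1, 8, 8]    # region data term for label 1
backgr_cost = [1, 8, 8, 3, 1, 1]    # region data term for label 0
print(segment_column(surface_cost, object_cost, backgr_cost, 1, 1, 0))
print(segment_column(surface_cost, object_cost, backgr_cost, 1, 1, INF))
```

With these toy numbers the region leaks onto the surface voxel at z = 3 when the two sub-graphs are solved independently, while the hard constraint (d = 1, infinite weight) keeps the surface at z = 3 and removes that voxel from the region.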
To enforce the constraint that RA is expected to lie at least distance d above ST, a “flip” operation is involved. A transformed graph is constructed in which the node at position (x, y, z) corresponds to the voxel v(x, y, Z − z − 1) in the image. The interaction penalty is imposed by adding a directed arc with weight w from nR(x, y, z) to the node of the transformed graph that is associated with the voxel v(x, y, z − d) in the original image. Fig. 3(c) shows the flipped graph.
Once the graph is constructed, a globally optimal solution can be found by solving a single maximum flow problem, which minimizes the total energy E(S) in low-order polynomial time.
3 Application to lung tumor segmentation in MVCBCT images
In this section, we exemplify the application of our method to lung tumor segmentation from MVCBCT images. MVCBCT is a promising technique used in the clinic for daily imaging of patients [11] undergoing lung tumor radiotherapy. Successful segmentation of lung tumors from the respiratory correlated 3-D images reconstructed from the projection data of MVCBCT scans can provide important information about tumor motion and volume changes, which allows better delineation of lung tumors for radiation therapy [12, 13].
Here we mainly focus on the segmentation of the primary lung tumor from the reconstructed MVCBCT images, which is a very challenging task. First, the quality of the MVCBCT images is poor and the images are seriously corrupted by noise. Second, the lung tumor is frequently located next to the lung surface, the adjacent tissues have a similar intensity profile, and no clear boundary exists in between. To overcome those difficulties, our new approach was employed for simultaneous segmentation of the primary lung tumor and the adjacent lung boundary.
3.1 Initialization
As the initialization step, one center point and two radii were required as the user input, from which two spheres were generated. The smaller one was required to be completely inside the tumor and the larger one was required to be completely outside the tumor. Fig. 4(a) shows a typical example. The segmentation was then conducted on the bounding box area of the larger sphere.
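The initialization can be sketched as follows: the bounding box of the larger sphere defines the working volume, and each voxel is classified by its distance to the user-supplied center. Function names, the label strings, and the treatment of voxels lying exactly on a sphere are assumptions made for this sketch.

```python
import math


def bounding_box(center, r_outer, shape):
    """Inclusive voxel bounds of the larger sphere, clipped to the image."""
    lo = tuple(max(0, math.floor(c - r_outer)) for c in center)
    hi = tuple(min(n - 1, math.ceil(c + r_outer)) for c, n in zip(center, shape))
    return lo, hi


def seed_label(voxel, center, r_inner, r_outer):
    """Classify a voxel relative to the two user-defined spheres."""
    dist2 = sum((a - b) ** 2 for a, b in zip(voxel, center))
    if dist2 <= r_inner ** 2:
        return "tumor"        # inside the smaller sphere
    if dist2 > r_outer ** 2:
        return "background"   # outside the larger sphere
    return "unknown"          # left to the optimization


center = (10, 10, 10)
print(bounding_box(center, 5, (128, 128, 128)))
print(seed_label((10, 11, 10), center, 2, 5))
print(seed_label((10, 10, 14), center, 2, 5))
```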
3.2 Simultaneous segmentation of lung boundary and tumor
The graph is constructed using the method described in Section 2.3 and contains two sub-graphs: GT for lung boundary detection and GR for tumor segmentation. To construct GT, the user also needs to interactively define the direction from which the tumor is adjacent to the lung boundary. In certain cases the tumor may be adjacent to the lung surface from two directions, as shown in Fig. 4(b). In this situation, we segment the tumor and two boundary surfaces together. Three sub-graphs are constructed accordingly: GT1 and GT2 for detection of lung boundaries from two directions, and GR for tumor segmentation. The geometric interactions are enforced between two pairs of graphs: (GT1, GR) and (GT2, GR).
Cost design for GT
For lung boundary detection a gradient-based cost function was employed for the edge-based costs. The negative magnitude of the image gradient was computed at each voxel: cv = −|∇I(v)|. Because the intensities inside the lung are generally lower than those of the surrounding tissues, a Sobel kernel was used to favor a bright-to-dark or dark-to-bright transition, depending on the direction of the target surface. A second-order shape prior penalty was employed with the form f(h) = β·h², where β was a constant parameter. The shape prior penalized the change in surface height h = ST(p) − ST(q) between neighboring columns p and q.
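The cost design for the surface graph can be illustrated on a single column. The sketch below replaces the Sobel kernel by a central difference along z, which is a deliberate simplification, and keeps the negative gradient magnitude only where the transition has the requested polarity. The quadratic shape penalty uses β = 5 as its default, the value reported in the experiments; function names are hypothetical.

```python
def directional_edge_costs(column, dark_to_bright=True):
    """Negative gradient magnitude per voxel, kept for one transition polarity.

    column : intensities along one column, ordered by increasing z
    """
    n = len(column)
    costs = []
    for z in range(n):
        lower = column[max(z - 1, 0)]        # replicate the border voxel
        upper = column[min(z + 1, n - 1)]
        gradient = (upper - lower) / 2.0     # central difference along z
        signed = gradient if dark_to_bright else -gradient
        costs.append(-max(signed, 0.0))      # strong wanted edges get low cost
    return costs


def shape_penalty(h, beta=5.0):
    """Quadratic shape prior f(h) = beta * h**2 on the height difference h."""
    return beta * h * h


print(directional_edge_costs([10, 10, 10, 50, 50]))
print(shape_penalty(2))
```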
Cost design for GR
For tumor detection the data term Dv for voxel v was designed as follows. For all voxels lying inside the smaller sphere, which belong to the object, Dv(lv = 1) = 0 and Dv(lv = 0) = +∞. Similarly, for all voxels outside the larger sphere, Dv(lv = 1) = +∞ and Dv(lv = 0) = 0. For all other voxels the tumor intensity was assumed to follow a Gaussian distribution. The mean intensity value and the standard deviation σ were estimated from all the voxels inside the smaller sphere. For voxel v with intensity iv the data term was given as
(6) |
(7) |
A typical cost image for data term is shown in Fig. 4(c).
For the boundary penalty a gradient-based cost function was used with a form similar to that described in [1]
(8) |
where wij corresponds to the weight of the n-link arc between neighboring voxels vi and vj; denotes the squared gradient magnitude.
Cost design for interaction constraint
For the geometric interaction constraint we required that the minimum distance between the target tumor and the lung boundary be at least one voxel. This was enforced as a hard constraint with d = 1 and wv = +∞.
4 Experimental Methods
The performance evaluation of the reported method was carried out on 20 MVCBCT scans obtained from three patients with non-small cell lung cancer. Each set of patient MVCBCT scans was acquired longitudinally over eight weeks of radiation therapy treatment. For each scan two volumetric images were reconstructed, one for the full-exhalation phase and one for the full-inhalation phase, resulting in 40 images [13]. The size of the reconstructed images was 128 × 128 × 128 voxels with cubic voxel sizes of 1.07 × 1.07 × 1.07 mm³. Out of the 40 datasets, 2 were rejected for poor image quality by experts prior to any work reported here [13], and our experiments were conducted on the remaining 38 datasets.
Surfaces of lung tumors were obtained by expert manual tracing as reported in [13] and served as the independent standard for assessing segmentation correctness of our approach. The employed values of the above-described segmentation method parameters were selected empirically. The same parameter values of β = 5, d = 1, and σg = 10 were applied to all analyzed datasets.
The segmentation performance was assessed using the Dice similarity coefficient (DSC). DSC = 2∣Vm⋂Vc∣/(∣Vm∣ + ∣Vc∣), where Vm denotes the volume of the independent standard and Vc denotes the volume of the computer-determined object. All DSC values were computed in 3-D.
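The Dice similarity coefficient defined above is straightforward to compute once both segmentations are available as sets of voxel coordinates. The sketch below assumes that representation; returning 1.0 when both sets are empty is a convention chosen here, not something stated in the text.

```python
def dice_coefficient(reference, computed):
    """DSC = 2 |Vm ∩ Vc| / (|Vm| + |Vc|) for two collections of voxels."""
    reference, computed = set(reference), set(computed)
    if not reference and not computed:
        return 1.0  # convention for two empty segmentations (assumption)
    overlap = len(reference & computed)
    return 2.0 * overlap / (len(reference) + len(computed))


manual = {(0, 0, 0), (0, 0, 1), (0, 1, 0), (0, 1, 1)}
automatic = {(0, 1, 0), (0, 1, 1), (1, 1, 0), (1, 1, 1)}
print(dice_coefficient(manual, automatic))
```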
To compare the performance of our novel surface-and-region segmentation approach with that of a conventional approach, which uses a graph cut method alone to detect the tumor without simultaneously segmenting the associated lung boundary surfaces, both approaches were applied to all 38 MVCBCT images with identical spherical initialization (Fig. 4), and the resulting DSC performance indices were compared. Statistical significance of the observed differences was determined using Student’s t-test, for which a p-value of 0.05 was considered significant.
5 Results
Our simultaneous surface–region segmentation method as well as Boykov’s conventional graph-cut segmentation method were applied to all 38 MVCBCT images for which the independent standard was available. Our approach achieved tumor segmentation correctness characterized by DSC = 0.840 ± 0.049 while the conventional approach yielded statistically significantly lower segmentation performance of DSC = 0.760 ± 0.104 (p < 0.001). Fig. 5(a) displays these overall results graphically as Mean±stdev.
Fig. 5(b) shows the pairwise performance comparisons for all 38 datasets, ordered according to the performance of the conventional approach. Fig. 5(b) thus demonstrates that the performance of the conventional method is very uneven across the analyzed data set. In contrast, our new approach not only shows an overall improvement of the segmentation but also yields very similar segmentation performance for all tumors in the entire set of 38 analyzed images, thus showing a markedly higher robustness resulting from the incorporation of image-based surface context.
Fig. 6 shows a typical outcome of our new tumor segmentation method and gives a visual comparison with the independent standard. Fig. 7 shows tumor segmentation results obtained using the two compared methods in a difficult image, in which the tumor is closely adjacent to the lung surface from two directions. Fig. 7(b) shows a segmentation failure of the conventional approach while Figs. 7(c,d) show a correctly segmented tumor using our new method.
The current non-optimized implementation requires about 5 minutes on a Linux workstation (3 GHz, 32 GB memory). The run time decreases to less than 10 s when working with downsampled images, for which voxel sizes were doubled along each of the x, y, and z directions, with only a limited segmentation performance penalty (DSC2 = 0.826 ± 0.046).
6 Discussion and Conclusion
We presented a novel graph-based framework incorporating surface–region context information to solve the segmentation problem of a special structure in which target objects of arbitrary topology mutually interact with terrain-like surfaces, a configuration that occurs widely in medical imaging. The proposed approach was successfully applied to lung tumor segmentation in MVCBCT, where it was more accurate and more robust than the conventional graph-cut method. To demonstrate the diverse applications of our framework, we also applied our method to lymph node segmentation in CT images. Fig. 8 shows an illustrative result.
Footnotes
This work was supported in part by NSF grants CCF-0830402 and CCF-0844765, and NIH grants R01-EB004640 and K25-CA123112.
References
- 1. Boykov Y, Funka-Lea G. Graph cuts and efficient N-D image segmentation. IJCV. 2006;70(2):109–131.
- 2. Schaap M, et al. Coronary lumen segmentation using graph cuts and robust kernel regression. IPMI. 2009. doi: 10.1007/978-3-642-02498-6_44.
- 3. Li K, Wu X, Chen DZ, Sonka M. Optimal surface segmentation in volumetric images - a graph-theoretic approach. IEEE TPAMI. 2006;28(1):119–134. doi: 10.1109/TPAMI.2006.19.
- 4. Song Q, et al. Graph search with appearance and shape information for 3-D prostate and bladder segmentation. MICCAI. 2010. doi: 10.1007/978-3-642-15711-0_22.
- 5. Delong A, Boykov Y. Globally optimal segmentation of multi-region objects. ICCV. 2009.
- 6. Yin Y, et al. LOGISMOS - Layered Optimal Graph Image Segmentation of Multiple Objects and Surfaces: Cartilage segmentation in the knee joints. IEEE Trans. Medical Imaging. 2010;29(12):2023–2037. doi: 10.1109/TMI.2010.2058861.
- 7. Han D, Wu X, Sonka M. Optimal multiple surfaces searching for video/image resizing - a graph-theoretic approach. ICCV. 2009.
- 8. Han D, et al. Globally optimal tumor segmentation in PET-CT images: a graph-based co-segmentation method. IPMI. 2011. doi: 10.1007/978-3-642-22092-0_21.
- 9. Song Q, Wu X, Liu Y, Haeker M, Sonka M. Simultaneous searching of globally optimal interacting surfaces with convex shape priors. CVPR. 2010.
- 10. Wu X, Chen DZ. Optimal net surface problems with applications. ICALP. 2002:1029–1042.
- 11. Morin O, et al. Megavoltage cone-beam CT: system description and clinical applications. Medical Dosimetry. 2006;31(1):51–61. doi: 10.1016/j.meddos.2005.12.009.
- 12. Chen M, Siochi RA. Diaphragm motion quantification in megavoltage cone-beam CT projection images. Medical Physics. 2010;37:2312–20. doi: 10.1118/1.3402184.
- 13. Chen M, Siochi RA. A clinical feasibility study on respiratory sorted megavoltage cone beam CT. International Workshop on Pulmonary Image Analysis. 2010.