Abstract
We present a fully automated learning-based approach for segmenting knee cartilage in presence of osteoarthritis (OA). The algorithm employs a hierarchical set of two random forest classifiers. The first is a neighborhood approximation forest, the output probability map of which is utilized as a feature set for the second random forest (RF) classifier. The output probabilities of the hierarchical approach are used as cost functions in a Layered Optimal Graph Segmentation of Multiple Objects and Surfaces (LOGISMOS). In this work, we highlight a novel post-processing interaction called just-enough interaction (JEI) which enables quick and accurate generation of a large set of training examples. Disjoint sets of 15 and 13 subjects were used for training and tested on another disjoint set of 53 knee datasets. All images were acquired using double echo steady state (DESS) MRI sequence and are from the osteoarthritis initiative (OAI) database. Segmentation performance using the learning-based cost function showed significant reduction in segmentation errors (p < 0.05) in comparison with conventional gradient-based cost functions.
Keywords: Graph based segmentation, knee MRI, Random forest classifier, Neighborhood approximation forests, Osteoarthritis, Just enough interaction, LOGISMOS
1 Introduction
Osteoarthritis (OA) is one of the most prevalent aging diseases in our society [3]. Early detection of structural differences associated with OA is important for testing interventional drugs in clinical trials. Segmentation is a crucial first step in structural analysis. Manual segmentation takes several hours of effort and is prone to inter/intra-observer variability and operator-induced bias. Reproducible automated segmentation is challenging in presence of severe OA due to cartilage thinning, osteophytes, bone marrow, and cartilage lesions in the MR volume.
Several automated algorithms for knee segmentation have been proposed in the literature such as approximate binary k-NN classifiers [4], deformable active shape models [6], local image patch optimization using Markov random fields [10], hierarchical two stage cartilage classifiers optimized by graph cuts [15], and multi-atlas labeling with locally weighted voting [9]. Many of these methods make use of locally (not globally) optimal techniques to solve the segmentation problem. For example, Wang et al. [15] used multi-label graph cuts to optimize the background combining bone regions with true background and the cartilage classifier outputs. LOGISMOS algorithm [16] can simultaneously segment multiple surfaces in different objects taking contextual information between them to make the segmentation robust with guaranteed global optimality.
Severe OA pathology alters the tissue appearance substantially, limiting the segmentation performance when using simple cost functions. A major limiting factor of learning-based costs (used in [4,7,15,16]) is the time consuming task of curating a large numbers of accurate training examples. Several interactive correction methods were designed to ease the post-processing corrections such as thin plate splines [11], live-wires [12] and active shape model based interactions [13]. In these cases and others, the refinement mainly corrects the surface errors directly to match the object boundaries and/or are prohibitively expensive to be computed in real time.
We present a fully automated LOGISMOS segmentation algorithm based on learned costs using a two-stage hierarchical random forest classifiers (RF). Unlike in [15], our method uses two variations of RF classifier, the first being a neighborhood approximation forests (NAF) [8] followed by an RF classifier on a geometric graph thereby learning from a combination of local and global context and textural features. JEI approach [14] was used to prepare the training data, substantially reducing the interaction time compared to traditional voxel-by-voxel post-processing approaches. This is achieved by interacting with the underlying graph algorithm. While similar to [1], their graph-cut interaction approach for multiple objects does not guarantee global optimality. Live-wires with embedded interaction capabilities have a similar drawback of being unable to maintain global optimality for multiple surfaces and objects. LOGISMOS-JEI handles multiple object interactions while maintaining global optimality.
2 Methods
The proposed segmentation work-flow is outlined in Fig. 1 beginning with an automated LOGISMOS segmentation using gradient-based costs. The optimization finds the minimum closed set on a node weighted graph. The optimized residual graph and the image volume are loaded into the custom built graphical user interface to examine the segmentation quality and perform JEI. Upon completing JEI, the final edited surfaces are used as training examples for the hierarchical classifier system. Upon training, the classifier probability values on a test volume is assigned as cartilage node costs in LOGISMOS graph. For bone surface segmentation, the initially-employed gradient-based costs were very robust and remained unchanged.
2.1 LOGISMOS Segmentation Algorithm
The algorithm segments multiple objects and surfaces simultaneously in a graph based framework. Initialization of the algorithm consists of volume of interest (VOI) detection using an AdaBoost classifier [5] trained on manually identified VOI’s. Training used 9 different 3D Haar-like features applied at different scales. The smaller region localized by the VOI helps reduce the computation time. Further, these VOI bounds are used for fitting the mean shape mesh S0 for each bone (femur and tibia respectively) using affine registration. A patient-specific bone shape S is important since the final segmentation of both the bone and the cartilage surfaces is defined by this shape prior. S is obtained by a single-surface single-object LOGISMOS segmentation using S0.
A geometric graph is constructed representing each surface of the object to be segmented. Non-intersecting graph column construction is a crucial step in ensuring topologically correct segmentation. This is enforced by mimicking the behavior of electrically charged particles resulting in an electric lines of force (ELF) based graph column construction. All graph nodes are assigned unlikeliness costs based on the local image intensity gradient. To represent the segmentation task as a max-flow problem, columns are connected by intra- and inter-column arcs enforcing surface smoothness constraints. The final multi-object multi-surface segmentation add additional arcs to enforce inter-object and inter-surface constraints whose arc construction and final solution are described in [16]. A 1D derivative operator, ∇(x, y, z), along the column direction was used to determine bone-surface costs while the cartilage costs employed weighted (determined empirically) first and second order derivatives: w*∇(x, y, z)+(1−w)*∇2(x, y, z).
2.2 Just Enough Interaction
Following LOGISMOS segmentation, JEI editing on the resulting surfaces is done as needed. The image volume, residual graph and the ELF based geometric graph are loaded into a custom designed GUI. The previously reported JEI methods [14] were extended from a single object, two surface interaction to a multi-object multi-surface interaction for knee MRI. A k-dimensional tree based interaction algorithm was designed for cost modification based on user inputs. Furthermore, a faster max-flow optimization algorithm was utilized for immediate feedback on the interaction. The JEI workflow was as follows: (1) specifying approximately correct boundary points, referred to as nudge points hereafter, on a chosen 2D slice, (2) modification of local graph costs in the entire 3D neighborhood of interaction, (3) max-flow re-computation in 3D, resulting in corrected surfaces within milliseconds. The process was repeated until the results were deemed satisfactory.
Underlying Interaction and Graph Cost Modification
The user specified nudge points form a 3D contour and its intersection (intersecting nodes) with the graph columns is identified by utilizing a k-dimensional tree that stores the coordinates of all graph nodes to enable O(log n) query on N nearest (determined empirically) nodes for any given node. The node costs (i.e. unlikeness) of all columns with intersecting nodes are modified as:
where c(i, j) is the cost of node j on column i, D((i, j), n(i, j)) is the distance between node (i, j) and its nearest intersecting node n(i, j), and Δ is a tolerance. The max-flow optimization then continues from its previous state using the updated costs and consequently produces the updated surfaces.
2.3 Classifier System Design
Two RF based classifiers in hierarchy were used to train cartilage regions. A NAF [8] trained on example image patches was used as the first stage followed by a second RF classifier [2] with features collected along the ELF based geometric graph nodes. The output probability maps of the NAF were not directly input as costs to the graph optimization. They were used with other image based features for training of the second RF classifier. The advantage of this approach is that information is gathered from a larger global neighborhood in combination with local features. The NAF classifier gathers contextual and textual information from a larger neighborhood of 3D image patches. The RF classifier collects local feature information along the geometric search columns of the graph. Disjoint training sets were used to help build a more realistic RF model based on actual NAF performance on unseen images.
Neighborhood Approximation Forests
NAF is based on a random forests framework which approximates the nearest neighbors of image patches based on a user defined distance function to optimize the node split. The pairwise distance function ρ(I, J) between each of the training image patches is defined as ρ(I, J) =||seg(I) − seg(J)||l0 where seg(.) is the segmentation label map for the corresponding image patch. Intuitively it measures similarity between image patches based on the segmentation. The algorithm learns to group image patches which appear similar to each other based on this neighborhood distance criteria. The output probability map of the unseen image (Fig. 2) was used as one of the inputs for a second RF classifier.
Clustered Random Forest Classifier
The second RF was trained on features collected at each node of the geometric graph. JEI edited bone mesh surfaces were used for geometric graph construction during training. Positive example labels corresponded to the nearest cartilage mesh intersection along each search column. The different features collected at each node point are shown in Table 1 with feature values interpolated to the search path points from corresponding feature volumes. To handle the large variability of cartilage intensities in the volume, a k-means clustering algorithm was applied to the S0 mesh of femur and tibia respectively resulting in spatial parcellation of the pre-segmented mesh surfaces into 40 clusters each (total 80). The clustering trained regionally-specific appearance models by accounting for the surrounding menisci, muscle, bone and other anatomies. The probability response to the features along the search nodes in the testing datasets provided the node costs for graph optimization instead of gradient based costs.
Table 1.
Index | Description |
---|---|
1–9 | 3 eigenvalues of Hessian matrices on intensity image at σ = 0.5, 1.0, 2.0 mm |
10–15 | 1st Gaussian gradient on intensity and NAF probability volumes at σ = 0.36, 0.7, 1.4 mm |
16–18 | Intensity, Gaussian smoothed intensity, and NAF probability volumes |
19–20 | Laplacian derivative of intensity volume at σ = 0.36, 0.7 mm |
21 | Gabor texture feature |
22–25 | Intensity statistics: mean, variance, skewness and kurtosis of a 2 mm3 region centered around each graph node |
26–28 | Haar features (1.5 mm kernel) along horizontal, vertical & diagonal directions |
29–30 | 1D directional gradient along the search column direction on NAF probability and intensity volume |
3 Experimental Methods
MRI volumes with independent standards are available from the OAI. All subjects were imaged using a DESS protocol with a voxel resolution of 0.36 × 0.36 × 0.7 mm3. All MR volumes in this study were from diseased subjects. The data were divided into two training sets with 15 and 13 which were used to train the NAF and the second RF classifier. The data-sets used for training the clustered RF classifier were first inspected and JEI edited. 53 data-sets were used for testing.
All image volumes were first LOGISMOS segmented using gradient costs. The geometric graph had 8006 and 8002 graph columns for the femur and tibia objects respectively. The graph parameters are listed in Table 2.
Table 2.
Inter-surface max (nodes) | Inter-object max (nodes) | Smoothness (nodes) | Column size (nodes) | Node spacing (mm) | |
---|---|---|---|---|---|
Learned cost | 40 | 120 | 4 | 121 | 0.15 |
Gradient cost | 20 | 60 | 2 | 61 | 0.20 |
The NAF features consisted of image patches sampled over 15 data-sets with 1521 sample points per patch. Because of the highly imbalanced ratio between the negative and positive labels, we considered a neighborhood around the cartilage labels and marked them as negative examples. The image patches collected consisted of all the positive and the surrounding negative labels. We trained a set of 200 trees with 40,000 images patches as inputs to each tree.
The second set of RF classifier was trained on 13 JEI-corrected data-sets with 30 features (see Table 1) along with the ELF search path for each node. 80 (40 × 2) RF classifiers were trained with each one representing the given cluster with 800 trees per forest.
4 Results
Surface positioning errors (compared against independent standard) achieved by hierarchical classifier, gradient cost and single stage RF classifier are listed in Table 3. It shows a significant reduction in signed errors for the femur using learned costs (p < 0.001). While the signed tibia errors were not statistically significantly different (p = 0.62, difference between means of 1/18 voxel), the error variance was substantially reduced showing that larger errors were avoided by the new strategy. The reduction in unsigned surface positioning errors was significant for both the femur and tibia (p < 0.001). Table 3 also shows the benefits of the hierarchical classifier when compared with the single stage RF classifier. Although the single stage RF improves the segmentation accuracy, the addition of the hierarchical NAF stage further improves the accuracy with the largest reduction in error seen in the tibial regions.
Table 3.
NAF+RF (Proposed) | Gradient | p-value | RF only | |
---|---|---|---|---|
Femur signed | 0.03±0.19 | −0.31±0.28 | ≪ 0.001 | −0.06±0.18 |
Femur unsigned | 0.55±0.11 | 0.68±0.20 | ≪ 0.001 | 0.56±0.11 |
Tibia signed | 0.10±0.17 | 0.08±0.35 | 0.62 | 0.16±0.24 |
Tibia unsigned | 0.61±0.14 | 0.81±0.18 | ≪ 0.001 | 0.65±0.17 |
Figure 3 qualitatively compares the segmentation accuracies between the two methods and the independent standard. Both the femur and tibia are shown with their respective bone and cartilage segmentations showing good agreement between learning-based segmentation and the independent standard.
5 Conclusion
A novel fully automated learning-based algorithm for designing cost functions used in LOGISMOS segmentation was presented. Cost function was designed by optimizing a hierarchical set of classifiers, each associated with one of two different learning methods. We also demonstrated the use of JEI for acquiring training examples. The presented method was highly accurate when compared to pre-JEI results. Understanding the effect of feature selection and the mutual combination of parameters from both classifiers remains as future work.
Acknowledgments
OAI support for providing the MRI volumes and manual segmentation gratefully acknowledged. This research was supported by NIH grant - R01EB004640. Computation support through Helium cluster provided by University of Iowa.
References
- 1.Boykov YY, Jolly MP. Interactive graph cuts for optimal boundary & region segmentation of objects in nd images. ICCV. 2001:105–112. [Google Scholar]
- 2.Breiman L. Random forests. Mach Learn. 2001;45(1):5–32. [Google Scholar]
- 3.Centre for Disease Control: Arthritis prevalence and activity limitations United States, 1990. JAMA. 1994;272(5):346. [PubMed] [Google Scholar]
- 4.Folkesson J, Dam BE, Olsen Fogh O, Pettersen CP, Christiansen C. Segmenting articular cartilage automatically using a voxel classification approach. IEEE TMI. 2007;26(1):106–115. doi: 10.1109/TMI.2006.886808. [DOI] [PubMed] [Google Scholar]
- 5.Freund Y, Schapire RE. A decision theoretic generalization of on-line learning and an application to boosting. Comput Syst Sci. 1997;57(1):119–139. [Google Scholar]
- 6.Fripp J, Crozier S, Warfield SK, Ourselin S. Automatic segmentation and quantitative analysis of the articular cartilages from magnetic resonance images of the knee. IEEE TMI. 2010;29(1):55–64. doi: 10.1109/TMI.2009.2024743. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 7.Kashyap S, Yin Y, Sonka M. Automated analysis of cartilage morphology. Proceedings - ISBI. 2013:1300–1303. [Google Scholar]
- 8.Konukoglu E, Glocker B, Zikic D, Criminisi A. Neighbourhood approximation using randomized forests. Med Image Anal. 2013;17(7):790–804. doi: 10.1016/j.media.2013.04.013. [DOI] [PubMed] [Google Scholar]
- 9.Lee JG, Gumus S, Moon CH, Kwoh CK, Bae KT. Fully automated segmentation of cartilage from the MR images of knee using a multi-atlas and local structural analysis method. Med Phys. 2014;41(9):092303. doi: 10.1118/1.4893533. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 10.Lee S, Park SH, Shim H, Yun ID, Lee SU. Optimization of local shape and appearance probabilities for segmentation of knee cartilage in 3-D MR images. Comput Vis Image Underst. 2011;115(12):1710–1720. [Google Scholar]
- 11.Ross JC, Estépar RSJ, Díaz A, Westin CF, Kikinis R, Silverman EK, Washko GR. Lung extraction, lobe segmentation and hierarchical region assessment for quantitative analysis on high res. CT images. MICCAI. 2009:690–698. doi: 10.1007/978-3-642-04271-3_84. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 12.Schenk A, Prause G, Peitgen H-O. Efficient semiautomatic segmentation of 3D objects in medical images. In: Delp SL, DiGoia AM, Jaramaz B, editors. MICCAI 2000. LNCS. Vol. 1935. Springer; Heidelberg: 2000. pp. 186–195. [DOI] [Google Scholar]
- 13.Schwarz T, Heimann T, Tetzlaff R, Rau AM, Wolf I, Meinzer HP. Interactive surface correction for 3D shape based segmentation. SPIE. 2008 [Google Scholar]
- 14.Sun S, Sonka M, Beichel RR. Graph-based IVUS segmentation with efficient computer-aided refinement. IEEE TMI. 2013;32(8):1536–1549. doi: 10.1109/TMI.2013.2260763. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 15.Wang Q, Wu D, Lu L, Liu M, Boyer KL, Zhou SK. Semantic context forests for learning-based knee cartilage segmentation in 3D MR images. In: Menze B, Langs G, Montillo A, Kelm M, Müller H, Tu Z, editors. MCV 2013. LNCS. Vol. 8331. Springer; Heidelberg: 2014. pp. 105–115. [DOI] [Google Scholar]
- 16.Yin Y, Zhang X, Williams R, Wu X, Anderson DD, Sonka M. LOGISMOS-layered optimal graph image segmentation of multiple objects and surfaces: cartilage segmentation in the knee joint. IEEE TMI. 2010;29(12):2023–2037. doi: 10.1109/TMI.2010.2058861. [DOI] [PMC free article] [PubMed] [Google Scholar]