Author manuscript; available in PMC: 2015 Jul 1.
Published in final edited form as: Med Image Anal. 2014 Apr 24;18(5):752–771. doi: 10.1016/j.media.2014.04.003

Body-Wide Hierarchical Fuzzy Modeling, Recognition, and Delineation of Anatomy in Medical Images

Jayaram K Udupa a, Dewey Odhner a, Liming Zhao a, Yubing Tong a, Monica MS Matsumoto a, Krzysztof C Ciesielski a,b, Alexandre X Falcao d, Pavithra Vaideeswaran a, Victoria Ciesielski a, Babak Saboury a, Syedmehrdad Mohammadianrasanani a, Sanghun Sin e, Raanan Arens e, Drew A Torigian c
PMCID: PMC4086870  NIHMSID: NIHMS589264  PMID: 24835182

Abstract

To make Quantitative Radiology (QR) a reality in radiological practice, computerized body-wide automatic anatomy recognition (AAR) becomes essential. With the goal of building a general AAR system that is not tied to any specific organ system, body region, or image modality, this paper presents an AAR methodology for localizing and delineating all major organs in different body regions based on fuzzy modeling ideas and a tight integration of fuzzy models with an Iterative Relative Fuzzy Connectedness (IRFC) delineation algorithm. The methodology consists of five main steps: (a) gathering image data for both building models and testing the AAR algorithms from patient image sets existing in our health system; (b) formulating precise definitions of each body region and organ and delineating them following these definitions; (c) building hierarchical fuzzy anatomy models of organs for each body region; (d) recognizing and locating organs in given images by employing the hierarchical models; and (e) delineating the organs following the hierarchy. In Step (c), we explicitly encode object size and positional relationships into the hierarchy and subsequently exploit this information in object recognition in Step (d) and delineation in Step (e). Modality-independent and dependent aspects are carefully separated in model encoding. At the model building stage, a learning process is carried out for rehearsing an optimal threshold-based object recognition method. The recognition process in Step (d) starts from large, well-defined objects and proceeds down the hierarchy in a global to local manner. A fuzzy model-based version of the IRFC algorithm is created by naturally integrating the fuzzy model constraints into the delineation algorithm.

The AAR system is tested on three body regions – thorax (on CT), abdomen (on CT and MRI), and neck (on MRI and CT) – involving a total of over 35 organs and 130 data sets (the total used for model building and testing). The training and testing data sets are divided equally in size in all cases except for the neck. Overall, the AAR method achieves a mean accuracy of about 2 voxels in localizing non-sparse blob-like objects and most sparse tubular objects. The delineation accuracy in terms of mean false positive and negative volume fractions is 2% and 8%, respectively, for non-sparse objects, and 5% and 15%, respectively, for sparse objects. The two object groups achieve mean boundary distance relative to ground truth of 0.9 and 1.5 voxels, respectively. Some sparse objects – venous system (in the thorax on CT), inferior vena cava (in the abdomen on CT), and mandible and naso-pharynx (in the neck on MRI, but not on CT) – pose challenges at all levels, leading to poor recognition and/or delineation results. The AAR method fares quite favorably when compared with methods from the recent literature for liver, kidneys, and spleen on CT images. We conclude that separation of modality-independent from dependent aspects, organization of objects in a hierarchy, encoding of object relationship information explicitly into the hierarchy, optimal threshold-based recognition learning, and fuzzy model-based IRFC are effective concepts which allowed us to demonstrate the feasibility of a general AAR system that works in different body regions on a variety of organs and on different modalities.

Keywords: anatomy modeling, fuzzy models, object recognition, image segmentation, fuzzy connectedness, quantitative radiology

1. INTRODUCTION

1.1 Background

Since the birth of radiology in 1895, the emphasis in clinical radiology has been on human visualization of internal structures. Although various tomographic image modalities evolved subsequently for deriving anatomic, functional, and molecular information about internal structures, the emphasis on human visualization continued and the practice of clinical radiology has remained mostly descriptive and subjective. Quantification is amply employed in clinical research in radiology; however, it is not common in clinical radiological practice. In the qualitative mode, quantifiable and/or subtle image information is underutilized, interpretations remain subjective, and subtle changes at early disease stages or due to therapeutic intervention may be underestimated or missed (Torigian et al. 2007). It is generally believed now that if Quantitative Radiology (QR) can be brought to routine clinical practice, numerous advances can be made, including: improved sensitivity, specificity, accuracy, and precision of early disease diagnosis; more objective and standardized response assessment of disease to treatment; improved understanding of what is “normal”; increased ease of disease measurement and reporting; and discovery of new disease biomarkers.

To make QR a reality, we believe that computerized Automatic Anatomy Recognition (AAR) during radiological image interpretation becomes essential. To facilitate AAR, and hence eventually QR, and focusing only on the anatomic aspects of shape, geography, and architecture of organs, while keeping the larger goal in mind, we present in this paper a novel fuzzy strategy for building body-wide anatomic models, and for utilizing these models for automatically recognizing and delineating body-wide anatomy in given patient images.

1.2 Related work

Image segmentation – the process of recognizing and delineating objects in images – has a rich literature spanning over five decades. From the perspective of the direction in which this field is headed, it is useful to classify the methods developed to date into three groups: (a) Purely image-based, or pI approaches (Beucher 1992, Boykov et al. 2001, Kass et al. 1987, Malladi et al. 1995, Mumford and Shah 1989, Udupa and Samarasekera 1996), wherein segmentation decisions are made based entirely on information derived from the given image; (b) Object model-based, or OM approaches (Ashburner and Friston 2009, Cootes et al. 2001, Heimann and Meinzer 2009, Pizer et al. 2003, Shattuck et al. 2008, Staib and Duncan 1992), wherein known object shape and image appearance information over a population are first codified in a model and then utilized on a given image to bring constraints into the segmentation process; (c) Hybrid approaches (Chen and Bagci 2011, Hansegrad et al. 2007, Horsfield et al. 2007, Liu and Udupa 2009, Rousson and Paragios 2008, Shen et al. 2011, van der Lijn et al. 2012, Zhou and Bai 2007), wherein the delineation strengths of the pI methods are combined synergistically with the global object recognition capabilities of the OM strategies. pI algorithms predate other approaches, and they still continue to seek new frontiers. OM approaches go by various names such as statistical models and probabilistic atlases, and continue to be pursued aggressively. Particularly, atlas-based techniques have gained popularity in brain MR image segmentation and analysis (Cabezas et al. 2011). Hybrid approaches hold much promise for AAR and QR and are currently very actively investigated. Since our focus in this paper is the body torso, and since the nature of the images, the objects, and the challenges encountered are different for these regions (from, for example, the brain), our review below will focus mainly on methods developed for the torso.

Since the simultaneous consideration of multiple objects offers better constraints, in recent years, multi-object strategies have been studied under all three groups of approaches to improve segmentation. Under pI approaches, the strategy sets up a competition among objects for delineating their regions/boundaries (e.g., Bogovic et al. 2013, Saha and Udupa 2001). In OM approaches, the strategy allows including inter-relationships among objects in the model to influence their localization and delineation (e.g., Cerrolaza et al. 2012, Duta and Sonka 1998). In hybrid approaches, multi-object strategies try to strengthen segmentability by incorporating relevant information in model building, object recognition/localization, and subsequently also in delineation via the pI counterpart of the synergistic approach (Chen et al. 2012, Chu et al. 2013, Linguraru et al. 2012, Lu et al. 2012, Meyer et al. 2011, Okada et al. 2008, Shen et al. 2011, Tsechpenakis and Chatzis 2011). Motivated by applications (such as semantic navigation) where the focus is just locating objects in image volumes and not delineating them, a separate group of methods has been emerging (Criminisi et al. 2013, Zhou et al. 2005, Zhou et al. 2013). They use features characterizing the presence of whole organs or specific anatomic aspects of organs (such as the femoral neck and head) combined with machine learning techniques to locate objects in image volumes by finding the size, location, and orientation of rectangular bounding boxes that just enclose the anatomic entities.

The state-of-the-art in image segmentation seems to leave several gaps that hinder the development of a body-wide AAR system. First, while multi-object strategies have clearly shown superior performance for all approaches, in all published works they have been confined to only a few (three to five) selected objects and have not taken into account an entire body region or all of its major organs, the only exception being (Baiker et al. 2010), whose focus was whole body segmentation of mice on micro CT images. Second, and as a result, there is no demonstrated single method that operates on different body regions, on all major organs in each body region, and at different modalities. Third, all reported modeling strategies have a statistical framework, either as statistical models of shape and intensity pattern of appearance of objects in the image or as atlases, and none taking a fuzzy approach, except (Zhou and Rajapakse 2005) and our previous work (Miranda et al. 2008, Miranda et al. 2009), both in the brain only. Fuzzy set concepts have been used extensively otherwise in image processing and 3D visualization. Fuzzy modeling approaches allow bringing anatomic information in an all-digital form into graph theoretic frameworks designed for object recognition and delineation, obviating the need for (continuous) assumptions made otherwise in statistical approaches about shapes, random variables, their independence, functional form of density distributions, etc. They also allow capturing information about uncertainties at the patient level (e.g., blur, partial volume effects) and population level, and codification of this information within the model. Fourth, objects have complex inter-relationships in terms of their geographic layout. Learning this information over a population and encoding this explicitly in an object hierarchy can facilitate object localization considerably. Although several multi-object methods have accounted for this relationship indirectly, its direct incorporation into modeling, object recognition, and delineation in an anatomic hierarchical order has not been attempted. The AAR approach presented in this paper is designed to help overcome these gaps.

1.3 Outline of paper and approach

We start off by describing a novel hierarchical fuzzy modeling framework for codifying prior population information about object assemblies in Section 2. In Section 3, we delineate methods for automatically recognizing objects in given patient images that employ these hierarchical models. We present fuzzy-connectedness-based object delineation techniques in Section 4 that employ the modified fuzzy models found at recognition as constraints in delineation. We demonstrate and evaluate the applicability of the AAR methodology in Section 5 on three different body regions – thorax, abdomen, and neck – on different modalities. A comparison to methods from recent literature, the lessons learned, our conclusions, and the challenges we encountered are examined in Section 6. The AAR approach has five unique characteristics: (1) direct hierarchical codification of the prior object geographic and geometric relationship information; (2) a “what-you-see-is-what-you-get” entirely digital fuzzy modeling strategy; (3) hierarchical object recognition strategies that go from a broader gestalt to narrower specifics in locating objects; (4) demonstrated generality of applicability of the same approach to different organ systems, body regions, and modalities; and (5) adaptability of the system to different applications.

The AAR approach is graphically summarized in Figure 1. The body is divided into body regions B1, …, BK. Models are built for each specific body region ℬ ∈ {B1, …, BK} and each population group G (whatever way G is defined). Throughout this paper, ℬ and G are treated as variables, and each body region is considered separately and independent of other body regions. In Section 6, we will discuss briefly the issue of linking body regions for considering the whole body for the AAR schema. The three main blocks in Figure 1 correspond to model building, object recognition, and object delineation. A fuzzy model FM(Ol) is built separately for each of the L objects Ol in ℬ, and these models are integrated into a hierarchy chosen for ℬ. The output of the first step is a fuzzy anatomic model FAM(ℬ, G) of the body region ℬ for group G. This model is utilized in recognizing objects in a given patient image I of ℬ belonging to G in the second step. The hierarchical order is followed in this process. The output of this step is the set of transformed fuzzy models FMT(Ol) corresponding to the state when the objects are recognized in I. These modified models and the image I form the input to the third step of object delineation which also follows the hierarchical order. The final output is in the form of delineated objects O1D, …, OLD, where each OlD is a binary image.

Figure 1. A schematic representation of the AAR schema. The three main steps of model building, object recognition, and object delineation are explained in Sections 2, 3, and 4.

Very preliminary versions of some of the contents of this paper appeared in SPIE Medical Imaging conference proceedings in 2011, 2012, and 2013. Those papers did not contain the full details presented here on model building. More importantly, based on earlier experience many improvements are reported in this paper, none of which appeared earlier. Further, the recognition and delineation methods presented here have many novel elements. As a result, the entire AAR approach has changed substantially. Additional differences include comprehensive evaluation and the demonstration of the AAR scheme on multiple body regions.

2. BUILDING FUZZY MODEL OF BODY REGION ℬ

Notation

We will use the following notation throughout this paper.
G: the population group under consideration.
ℬ: the body region of focus.
O1, …, OL: the L objects or organs of ℬ (such as esophagus, pericardium, etc. for ℬ = Thorax).
I = {I1, …, IN}: the set of images of ℬ for G from N subjects which are used for model building and for training the parameters of the AAR algorithms.
In,l: the binary image representing the true delineation of object Ol in the image In ∈ I.
Ib = {In,l : 1 ≤ n ≤ N & 1 ≤ l ≤ L}: the set of all binary images used for model building.
FM(Ol): fuzzy model of object Ol derived from the set of all binary images Ibl = {In,l : 1 ≤ n ≤ N} of Ol.
FAM(ℬ, G): fuzzy anatomy model of the whole object assembly in ℬ with its hierarchy.
FMT(Ol): transformed (adjusted) FM(Ol) corresponding to the state when Ol is recognized in a given patient image I.
OlD: delineation of Ol in I represented as a binary image.
Any image I will be represented by a pair I = (C, f), where C denotes a 3D rectangular array of voxels, and f is a mapping f: C → I, where I is a set of integers denoting the image intensities. For any binary image J = (C, fb), we will use PAS(J) to denote the principal axes system derived from the set X of voxels of J with value 1. PAS(J) is described by the geometric center of X and the eigenvectors derived from X via principal component analysis.
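As an illustrative sketch only (not part of the paper's methods), PAS(J) can be computed from the foreground voxels of a binary volume by principal component analysis; the function name and the use of NumPy are assumptions made for this example.

import numpy as np

def principal_axes_system(binary_volume):
    """Sketch of PAS(J): geometric center and principal axes of the set X of
    voxels with value 1 in a binary image, obtained via PCA."""
    coords = np.argwhere(binary_volume > 0).astype(float)   # voxel coordinates of X
    center = coords.mean(axis=0)                             # geometric center of X
    cov = np.cov((coords - center).T)                        # 3 x 3 covariance matrix
    eigvals, eigvecs = np.linalg.eigh(cov)                   # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1]                        # largest principal axis first
    return center, eigvecs[:, order], eigvals[order]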

Our description in the rest of Section 2 will follow the schematic of Figure 1. Table 1 in Appendix lists brief anatomic definitions of all objects from all three body regions considered in this paper.

2.1 Gathering image database for ℬ and G

This retrospective study was conducted following approval from the Institutional Review Board at the Hospital of the University of Pennsylvania along with a Health Insurance Portability and Accountability Act (HIPAA) waiver. The basic premise of our AAR approach is that the fuzzy anatomic model of ℬ for G should reflect near normal anatomy. Consequently, the cleanest way of gathering image data for model building will be to prospectively acquire image data in a well-defined manner from subjects in group G who are certified to be near normal. Such an approach would be expensive and may involve radiation exposure (in the case of CT imaging). For developing the concepts and testing the feasibility of AAR, therefore, we have taken a vastly less expensive and simpler approach of utilizing existing human subject image data sets. For the thoracic and abdominal body regions, a board certified radiologist (co-author DAT) selected all image data sets (CT) from our health system patient image database in such a manner that the images appeared radiologically normal for the body region considered, with the exception of minimal incidental focal abnormalities such as cysts, small pulmonary nodules, etc. Images with severe motion/streak artifacts or other limitations were excluded from consideration. For these two body regions, the population groups considered have an age range of approximately 50–60 years. This age range was selected to maximize our chances of finding a sufficient number of near normal images. For the neck body region, we have utilized image data (MRI) previously acquired from normal subjects for the study of pediatric upper airway disorders. G in this instance is female subjects in the age range of 7–18. Our modeling schema is such that the population variables can be defined at any desired “resolution” in the future and the model can then be updated when more data are added.

Some organs in ℬ are better defined in a slice plane different from the slice plane used for imaging others. For example, for ℬ = neck, the best plane for slice imaging is sagittal for tongue and soft palate, while for the upper airways and other surrounding organs, axial slices are preferred. Our AAR methodology automatically handles organs defined in images with different orientations of digitization by representing image and object data in a fixed and common scanner coordinate system of reference.

2.2 Delineating objects of ℬ in the images in the database

There are two aspects to this task – forming an operational definition of both ℬ and the organs in ℬ in terms of their precise anatomic extent, and then delineating the objects following the definition. These considerations are important for building consistent and reliable models, and, in the future, for combining, exchanging, and standardizing similar efforts and results for body-wide models.

Definition of body regions and objects

Each body region is defined consistently in terms of a starting and ending anatomic location. For axial slice data, these locations are determined in terms of transverse slice positions. For example, for ℬ = Thorax, the body region is considered to extend axially from 5 mm below the base of the lungs to 15 mm above the apex of the lungs. Arms are not included in this study. For other orientations of slice planes in slice imaging, the same definitions are applied but translated into other planes. Similarly, each object included in ℬ is defined precisely irrespective of whether it is open-ended – because it straddles body regions (for example, esophagus) – or closed and contained within ℬ but is contiguous with other objects (for example, liver with hepatic portal vein, common hepatic artery, and bile duct). For each body region, we have created a document that delineates its precise definition and the specification of the components and boundaries of its objects. This document is used as a reference by all involved in generating data sets for model building. These definitions are summarized in the table included in Appendix.

Each body region is carved out manually, following its definition, from the data sets gathered for it. In our notation, I denotes the resulting set of such standard images that precisely cover ℬ as per definition. We assume the scanner coordinate system, SCS, as a common reference system with respect to which all coordinates will be expressed.

Delineation of objects

The objects of ℬ are delineated in the images of I, adhering to their definition, by a combination of methods including live wire, iterative live wire (Souza and Udupa 2006), thresholding, and manual painting, tracing and correction. To minimize human labor and to maximize precision and accuracy, algorithms in terms of a proper combination of these methods and the order in which objects are delineated are devised first, all of which operate under human supervision and interaction. For illustration, in the abdomen, to delineate subcutaneous adipose tissues (SAT) as an object, the skin outer boundary ASkn (as an object) is first segmented by using the iterative live wire method. Iterative live wire is a version of live wire in which, once the object is segmented in one slice, the user issues a command to move to the next slice, the live wire then operates automatically in the next slice, and the process is continued until automatic tracing fails, at which point the user resorts to interactive live wire again, and so on. Subsequently, the interface between the subcutaneous and visceral adipose compartments is delineated, also by using the iterative live wire method. Once these two object boundaries are delineated, the subcutaneous and visceral components are delineated automatically by using thresholding and morphological operations. On MR images, the same approach works if background non-uniformity correction and intensity standardization (Nyul and Udupa 1999) are applied first to the images in I. If direct delineation by manual tracing, or even by using live wire, were employed, the process would become complicated (because of the complex shape of the adipose and visceral compartments) and much more labor intensive.

Because of the enormity of this task, a number of trainees, some with medical and biomedical backgrounds and some with engineering backgrounds, were involved in accomplishing it. All tracings were examined for accuracy by several checks – 3D surface renditions of objects from each subject in various object combinations as well as a slice-by-slice verification of the delineations overlaid on the gray images for all images. The set of binary images generated in this step for all objects is denoted by Ib = {In,l : 1 ≤ n ≤ N & 1 ≤ l ≤ L}. The set of binary images generated just for object Ol is denoted by Ibl = {In,l : 1 ≤ n ≤ N}.

2.3 Constructing fuzzy object models

The Fuzzy Anatomy Model FAM(ℬ, G) of any body region ℬ for group G is defined to be a quintuple:

FAM(ℬ, G) = (H, M, ρ, λ, η). (1)

Briefly, the meaning of the five elements of FAM(ℬ, G) is as follows. H is a hierarchy, represented as a tree, of the objects in ℬ; see Figure 2. M is a collection of fuzzy models, one model per object in ℬ. ρ describes the parent-to-offspring relationship in H over G. λ is a set of scale factor ranges indicating the size variation of each object Ol over G. η represents a set of measurements pertaining to the objects in ℬ. A detailed description of these elements and the manner in which FAM(ℬ, G) is derived from I and Ib are presented below.

Figure 2. (a) Hierarchy for whole body WB. (b) Hierarchy for Thorax. TSkn: Outer boundary of thoracic skin as an object; RS: Respiratory System; TSk: Thoracic Skeleton; IMS: Internal Mediastinum; RPS, LPS: Right & Left Pleural Spaces; TB: Trachea & Bronchi; E: Esophagus; PC: Pericardium; AS, VS: Arterial & Venous Systems. (c) Hierarchy for Abdomen. ASkn: Outer boundary of abdominal skin; ASk: Abdominal Skeleton; Lvr: Liver; ASTs: Abdominal Soft Tissues; SAT & VAT: Subcutaneous and Visceral Adipose Tissues; Kd: Kidneys; Spl: Spleen; Msl: Muscle; AIA: Aorta and Iliac arteries; IVC: Inferior Vena Cava; RKd & LKd: Right and Left Kidneys. (d) Hierarchy for Neck. NSkn: Outer boundary of skin in neck; A&B: Air & Bone; FP: Fat Pad; NSTs: Soft Tissues in neck; Mnd: Mandible; Phrx: Pharynx; Tnsl: Tonsils; Tng: Tongue; SP: Soft Palate; Ad: Adenoid; NP & OP: Nasopharynx and Oropharynx; RT & LT: Right and Left Tonsils.

Hierarchy H

This element describes the way the objects of ℬ are considered ordered anatomically as a tree structure. This order currently specifies the inclusion of an offspring object Ok anatomically in the parent object Ol. While each ℬ has its own hierarchy, ℬ itself forms the offspring of a root denoting the whole body, WB, as shown in Figure 2. The hierarchies devised for the three body regions studied are shown in Figure 2. An object that is exactly a union of its offspring will be referred to as a composite object. Examples: RS, Fat, Kd, etc. Note that none of the skin objects is a composite object since the full body region inside the skin is not fully accounted for by the union of the offspring objects. The notion of composite objects is useful in combining objects of similar characteristics at a higher level of the hierarchy, which may make object recognition (and delineation) more effective. Thin tubular objects will be called sparse objects: TB, E, AS, VS, AIA, IVC, Phrx, NP, and OP. Compact, blob-like objects will be referred to as non-sparse: TSkn, RS, IMS, LPS, RPS, PC, ASkn, Fat, SAT, VAT, Lvr, Spl, Kd, RKd, LKd, NSkn, FP, NSTs, Tnsl, Tng, SP, Ad, RT, and LT. Some objects are a hybrid between these two types, exhibiting features of both. Examples: TSk, ASk, ASTs, A&B, and Mnd.
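As a minimal illustration of how H might be encoded, the sketch below spells out the Thorax hierarchy of Figure 2(b) as a parent-to-offspring map together with the breadth-first order in which AAR later visits it; the dictionary representation and the exact parent assignments, as we read them from the figure, are assumptions of this example rather than a prescribed data structure.

# Thorax hierarchy of Figure 2(b), encoded as parent -> offspring lists (illustrative).
THORAX_HIERARCHY = {
    "TSkn": ["RS", "TSk", "IMS"],
    "RS": ["LPS", "RPS", "TB"],
    "IMS": ["E", "PC", "AS", "VS"],
    "TSk": [], "LPS": [], "RPS": [], "TB": [], "E": [], "PC": [], "AS": [], "VS": [],
}

def breadth_first_order(hierarchy, root="TSkn"):
    """Order in which objects are visited (root first, then level by level)."""
    order, queue = [], [root]
    while queue:
        node = queue.pop(0)
        order.append(node)
        queue.extend(hierarchy.get(node, []))
    return order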

Fuzzy model set M

The second element M in the description of FAM(ℬ, G) represents a set of fuzzy models, M = {FM(Ol): 1 ≤ l ≤ L}, where FM(Ol) is expressed as a fuzzy subset of a reference set Ωl ⊂ Z3 defined in the SCS; that is, FM(Ol) = (Ωl, μl). The membership function μl(v) defines the degree of membership of voxel v ∈ Ωl in the model of object Ol. Ideally, for any l, 1 ≤ l ≤ L, we would like the different samples of Ol in different subjects to differ by a transformation An,l involving translation, rotation, and isotropic scaling. Our idea behind the concept of the fuzzy model of an object is to codify the spatial variations in form from this ideal that may exist among the N samples of the object as a spatial fuzzy set, while also retaining the spatial relationship among objects in the hierarchical order.

Given the training set of binary images Ibl of object Ol, we determine An,l, μl, and FM(Ol) for Ol as follows. We permit only such alignment operations, mimicking An,l, among the members of Ibl, that are executed precisely without involving search and that avoid the uncertainties of local optima associated with optimization-based full-fledged registration schemas. In this spirit, we handle the translation, rotation, and scaling components of An,l in the following manner.

For translation and rotation, for each manifestation In,l of Ol in Ibl, we determine, within SCS, the principal axes system PAS(In,l) of Ol. Subsequently, all samples are aligned to the mean center and principal axes. The scale factor estimation is based on a linear size estimate (in mm) of each sample of Ol, and all samples are resized to the mean size. The size of Ol in In,l is determined from (e1+e2+e3), where e1, e2, and e3 are the eigenvalues corresponding to the principal components of Ol in In,l.

After aligning the members of Ibl via An,l, a distance transform is applied to each transformed member for performing shape-based interpolation (Raya and Udupa 1990, Maurer et al. 2003), the distances are averaged over all members, and converted through a sigmoid function to obtain the membership values μl and subsequently FM(Ol).
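The following sketch shows one way the membership assignment just described could be realized for a set of already aligned binary samples; the use of SciPy's Euclidean distance transform, the signed-distance formulation, and the sigmoid slope beta are assumptions of this example (the shape-based interpolation step cited above is not reproduced here).

import numpy as np
from scipy import ndimage

def fuzzy_model_from_aligned_samples(aligned_binaries, beta=1.0):
    """Sketch: signed distance transform of each aligned sample, averaging over
    samples, and a sigmoid mapping to obtain memberships mu_l in (0, 1)."""
    signed_dists = []
    for b in aligned_binaries:
        inside = ndimage.distance_transform_edt(b)        # distance to boundary, inside the object
        outside = ndimage.distance_transform_edt(1 - b)   # distance to boundary, outside the object
        signed_dists.append(inside - outside)             # positive inside, negative outside
    mean_dist = np.mean(signed_dists, axis=0)
    return 1.0 / (1.0 + np.exp(-beta * mean_dist))        # sigmoid -> fuzzy membership values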

Parent-to-offspring relationship ρ

This element describes the parent-to-offspring spatial relationship in H for all objects in ℬ. Since each object Ok has a unique parent, this relationship is represented by ρ = {ρk : 1 ≤ k ≤ L}. For each Ok, ρk codifies the mean position as well as the orientation relationship between Ok and its parent over N samples. We adopt the convention that ρ1 denotes the relationship of the root object of ℬ relative to SCS. Let GCn,l be the geometric center of Ol in In,l. Then, the mean positional relationship Pl,k between Ol and Ok is considered to be the mean of the vectors in the set {GCn,k − GCn,l : 1 ≤ n ≤ N}. To find the mean orientation Ql,k, we make use of the eigenvectors E1n,l, E2n,l, and E3n,l of the shape of Ol in each In,l. We take an average of each Ein,l over the N samples for i = 1, 2, 3. However, for some n and i, Ein,l may be more than 90 degrees from the average, in which case we replace Ein,l by −Ein,l while simultaneously replacing Ejn,l by −Ejn,l for some j different from i so as to keep the system right-handed. We then recalculate the average, and repeat until the eigenvector is within 90 degrees of the average. Then, starting from either the first or the third eigenvector, whichever has the eigenvalue farther from the second, we normalize and make the others orthogonal to it. Ql,k is then taken to be the transformation that aligns the eigenvector system of the parent Ol with that mean orientation. This method guarantees a robust orientation estimate despite the 180-degree switching property of eigenvectors.

In order not to corrupt ρk by the differences in size among subjects, before estimating ρk, the parent Ol and all offspring objects Ok of Ol are scaled with respect to the center GCn,l of Ol as per a common scale factor, estimated for Ol via the method described above. The reasoning behind this scaling strategy is that an object and its entire offspring should be scaled similarly to retain their positional relationship information correctly.
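As a small sketch of the positional part of ρk, the mean and spread of the offspring-to-parent center offsets over the N (size-normalized) training samples could be computed as below; the function name and array layout are assumptions of this example.

import numpy as np

def mean_positional_relationship(parent_centers, offspring_centers):
    """Sketch of P_{l,k}: mean of GC_{n,k} - GC_{n,l} over the N samples.
    Inputs are N x 3 arrays of geometric centers; the per-coordinate spread
    is also returned since it later bounds the recognition search region."""
    offsets = np.asarray(offspring_centers) - np.asarray(parent_centers)
    return offsets.mean(axis=0), offsets.std(axis=0)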

Scale range λ

The fourth element λ of FAM(ℬ, G) is a set of scale factor ranges, λ = {λl = [λbl, λhl] : 1 ≤ l ≤ L}, indicating the size variation of each object Ol over its family Ibl. This information is used in recognizing Ol in a given image to limit the search space for its pose; see Section 3.

Measurements η

This element represents a set of measurements pertaining to the object assembly in ℬ. Its purpose is to provide a database of normative measurements for future use. We are not exploring this aspect in this paper. However, this element also serves to improve our knowledge about object relationships (in form, geographical layout, etc. in ℬ) and thence in constructing better hierarchies for improving AAR. We will discuss this briefly in Section 5.

There are several parameters related to object recognition (Section 3) and delineation (Section 4), some of which are image modality specific. (They are identified by T1m and Thl in Section 3 and σψO, mϕO, mϕB, σϕO, and σϕB in Section 4.) The values of these parameters are also considered part of the description of η. The definition of these parameters and the process of their estimation are described at relevant places in Sections 3 and 4 for ease of reading, although their actual estimation is done at the model building stage.

The fuzzy anatomy model FAM(ℬ, G) output by the model building process is used in performing AAR on any image I of ℬ for group G as described in Sections 3 and 4.

3. RECOGNIZING OBJECTS

We think of the process of what is usually referred to as “segmenting an object in an image” as consisting of two related phenomena – object recognition (or localization) and object delineation. Recognition is a high-level process of determining the whereabouts of the object in the image. Given this information for the object, its delineation is the meticulous low-level act of precisely indicating the space occupied by the object in the image. The design of the entire AAR methodology is influenced by this conceptual division. We believe that without achieving acceptably accurate recognition it is impossible to obtain good delineation accuracy. The hierarchical concept of organizing the objects for AAR evolved from an understanding of the difficulty involved in automatic object recognition. Once good recognition accuracy is achieved, several avenues for locally confined accurate delineation become available, as we discuss in Section 4. The goal of recognition in AAR is to output the pose (translation, rotation, and scaling) of FM(Ol), or equivalently the pose-adjusted fuzzy model FMT(Ol), for each Ol in a given test image I of ℬ such that FMT(Ol) matches the information about Ol present in I optimally.

The recognition process proceeds hierarchically as outlined in the procedure AAR-R presented below. In Step R1, the root object is recognized first by calling algorithm R-ROOT. Then, proceeding down the tree represented by H in the breadth-first order, other objects are recognized by calling algorithm R-OBJECT. The latter makes essential use of the parent fuzzy model and the parent-to-offspring relationship ρ encoded in FAM(ℬ, G).

Procedure AAR-R
Input:  An image I of ℬ, FAM(ℬ, G).
Output: FMT(Ol), l = 1, …, L.
Begin
R1. Call R-ROOT to recognize the root object in H;
R2. Repeat
R3.  Find the next offspring Ok to recognize in H (see text);
R4.  Knowing FMT(Ol), ρk, and λk, call R-OBJECT to recognize Ok;
R5. Until all objects are covered in H;
R6. Output FMT(Ol), l = 1, …, L;
End

Two strategies are described here for each of algorithms R-ROOT and R-OBJECT. The first, a global approach, does not involve searching for the best pose. We call this the One-Shot Method since the model pose is determined directly by combining the prior information stored in FAM(ℬ, G) and information quickly gathered from the given image I. The one-shot method is used as initialization for a more refined second method called Thresholded Optimal Search.

One-Shot Method

A threshold interval Th1 corresponding to the root object O1 is applied to I followed by a morphological opening operation to roughly segment O1 to produce a binary image J. The purpose of the morphological operation is to exclude as much as possible any significant extraneous material, such as the scanner table and patient clothing, from J. Then the transformed model FMT(O1) is found by applying a transformation T1m to FM(O1). T1m is devised to express the mean relationship between the roughly segmented O1 and the true segmentation of O1 represented in the binary images In,1 ∈ Ib. The estimation of T1m is done at the model building stage of AAR as mentioned in Section 2.3. To determine T1m, similar thresholding and morphological operations are performed on each gray image In in the training set to obtain a rough segmentation of O1, denoted Jn,1, in In. The relationship between this rough segmentation Jn,1 and the true segmentation In,1 of O1 in Ib is found as a transformation Tn,1 that maps PAS(Jn,1) to PAS(In,1). The mean, denoted T1m, of such transformations over all training images is then found.

Once the root object O1 is recognized, the poses for other objects in I in the hierarchy H are determined by combining (in the sense of composition) T1m with the parent to offspring relationship information stored in ρk for each parent-offspring pair. The transformed models FMT(Ol) are then found from this information.
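A minimal sketch of the rough root-object segmentation used by the one-shot method is given below; the choice of structuring element, the number of opening iterations, and the function names are assumptions for illustration only.

import numpy as np
from scipy import ndimage

def rough_root_segmentation(image, th_low, th_high, opening_iterations=2):
    """Sketch: threshold the image at the root object's interval Th_1, then apply a
    morphological opening to discard extraneous material such as the scanner table
    and patient clothing, yielding the rough binary segmentation J."""
    mask = (image >= th_low) & (image <= th_high)
    structure = ndimage.generate_binary_structure(3, 1)     # 6-connected structuring element
    return ndimage.binary_opening(mask, structure=structure, iterations=opening_iterations)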

Thresholded Optimal Search

This is a strategy to refine the results obtained from the one-shot method. Its premise is that the overall image intensity of the objects in ℬ can be characterized by threshold intervals such that, at the model’s pose corresponding to the best match of the model with an underlying object in the given test image I, the mismatch between the thresholded result and the model is minimal. For this approach to make sense on MR images, it is essential to first correct for background intensity nonuniformities and then apply intensity standardization (Nyul and Udupa 1999).

Suppose that at the model building stage, the optimal threshold interval Thl for each object Ol has already been determined automatically from the training image set. We will explain below how this is accomplished. Then, at the recognition stage, the threshold for Ol is fixed at this learned value Thl. Starting from the initial pose found by the one-shot method, a search is made within the pose space for an optimal pose p* of the fuzzy model over I that yields the smallest sum of the volume of false positive and false negative regions, where the model itself is taken as the reference for defining false positive and negative regions. Specifically, let FMp(Ol) denote the fuzzy model of Ol at any pose p, expressed as an image, and let J denote the binary image resulting from thresholding I at Thl. Then,

p* ∈ argminp (|FMp(Ol) − J| + |J − FMp(Ol)|). (2)

Image subtraction here is done in the sense of fuzzy logic, and |x| denotes the fuzzy cardinality of x, meaning that it represents the sum total of the membership values in x. The search space to find p* is limited to a region around the initial pose. This region is determined from knowledge of ρk and its variation and the scale factor range λk. For the positional vector, we search in an ellipsoid with its axes in the coordinate axis directions and with length four times the standard deviation of the corresponding coordinate. When searching in orientation space, we search in an ellipsoid with its axes in the direction of the eigenvectors of the rotation vector distribution (covariance matrix) and with length four times the square root of the corresponding eigenvalue. (A rotation vector has magnitude equal to the angle of rotation and direction along the axis of right-handed rotation. The rotation referred to is the rotation of Ql,k required to bring it into coincidence with Ein,l.) For the scale factor, we search in an interval of size four times the standard deviation of the scale factor.
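A sketch of the mismatch measure minimized in (2) at each candidate pose is shown below; interpreting the fuzzy subtraction as the bounded difference max(a − b, 0) is an assumption of this example, as are the function and argument names.

import numpy as np

def pose_mismatch(fuzzy_model_at_pose, thresholded_binary):
    """Sketch of the objective in Eq. (2): |FM^p(O_l) - J| + |J - FM^p(O_l)|,
    where |x| is the fuzzy cardinality (sum of memberships) and subtraction is
    taken here as the bounded difference max(a - b, 0)."""
    fm = np.asarray(fuzzy_model_at_pose, dtype=float)    # model memberships in [0, 1]
    j = np.asarray(thresholded_binary, dtype=float)      # thresholded image J (0/1)
    false_pos = np.maximum(fm - j, 0.0).sum()            # model mass not supported by J
    false_neg = np.maximum(j - fm, 0.0).sum()            # J mass not covered by the model
    return false_pos + false_neg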

Determining Thl at the model building stage

To estimate Thl, we run a rehearsal of the recognition method described above as follows, essentially for attempting to learn the recognition process. Imagine we already built M and estimated ρ and λ. Suppose that we now run the recognition process on the training images. Since we do not know the optimal threshold but have the true segmentations, the idea behind this learning of the recognition process is to test recognition efficacy for each of a number of threshold intervals t and then select the interval Thl that yields the best match of the model with the known true segmentations for each Ol. That is, if Jn(t) is the binary image resulting from thresholding the training image In at t, then

Thl ∈ argminp,t Σn [|(Jn(t) × FMp(Ol)) − In,l| + |In,l − (Jn(t) × FMp(Ol))|]. (3)

Here, × denotes fuzzy intersection. In words, the optimal threshold Thl is found by searching over the pose space over all training data sets and all thresholds the best match between the true segmentation of Ol with the result of thresholding In restricted to the model. In our implementation, 81 different values of the intervals are searched (9 for each end of the interval). The 9 positions for the lower end are the 5th, 10th, …, 45th percentile values of the cumulative object intensity histogram determined from the training image set. Similarly, for the upper end, the positions are 55th to 95th percentile values.
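The candidate intervals just described can be generated as in the short sketch below; the function name and the use of NumPy percentiles are assumptions of this example.

import numpy as np

def candidate_threshold_intervals(object_intensities):
    """Sketch: 9 x 9 = 81 candidate intervals, with lower ends at the 5th-45th
    and upper ends at the 55th-95th percentiles of the object intensity
    distribution pooled from the training set."""
    lows = np.percentile(object_intensities, np.arange(5, 50, 5))
    highs = np.percentile(object_intensities, np.arange(55, 100, 5))
    return [(lo, hi) for lo in lows for hi in highs]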

To summarize, the thresholded optimal search method starts the search process from the initial pose found by the one-shot method. It uses the optimal threshold values Thl determined at the training stage for each object Ol and finds the best pose for the fuzzy model of Ol in the given image I by optimally matching the model with the thresholded version of I. The only parameters involved in the entire recognition process are the thresholds Thl, one threshold interval per object, and T1m. Their values are automatically determined in the model building stage from image and binary image sets I and Ib and they become part of the model FAM(ℬ, G) itself.

4. DELINEATING OBJECTS

Once the recognition process is completed and the adjusted models FMT(Ol) are output for a given image I of ℬ, delineation of objects is performed on I in the hierarchical order as outlined in the procedure AAR-D presented below. As in recognition, in Step D1, the root object is first delineated by calling D-ROOT. AAR-D then proceeds in the breadth-first order to delineate other objects by calling D-OBJECT.

Procedure AAR-D
Input:  An image I of ℬ, FAM(ℬ, G), FMT(Ol), l = 1, …, L.
Output: OlD, l = 1, …, L.
Begin
D1. Call D-ROOT to delineate the root object in H;
D2. Repeat
D3.  Traverse H and find the next offspring Ok to delineate in H;
D4.  Knowing delineation of Ol, call D-OBJECT to delineate Ok in I;
D5. Until all objects are covered in H;
D6. Output OlD, l = 1, …, L;
End

For D-ROOT and D-OBJECT, we have chosen an algorithm from the fuzzy connectedness (FC) family in view of the natural and intimate adaptability of the FC methods to prior information coming in the form of fuzzy sets. In particular, since we focus on the problem of delineating one object at a time, for both Steps D1 and D4, we have selected the linear-time Iterative Relative FC (IRFC) algorithm of (Ciesielski et al. 2012) for separating each object Ol from its background. Our novel adaptations are in incorporating fuzzy model information into the IRFC formulation and in making the latter fully automatic. These modifications are described below.

Fuzzy model-based IRFC (FMIRFC)

There are two aspects that need to be addressed to fully describe the FMIRFC algorithm: affinity function and seed specification. Affinity is a local concept indicating the degree of connectedness of voxels locally in terms of their spatial and intensity nearness. In the FC family, this local property is grown into a global phenomenon of object connectedness through the notion of path strengths.

Affinity function

The FC framework (Udupa and Samarasekera 1996, Ciesielski et al. 2012) is graph-based. An ordered graph (C, α) is associated with the given image I = (C, f) where α is an adjacency relation on C such as 6-, 18-, or 26-adjacency. Each ordered pair (c, d) of adjacent voxels in α is assigned an affinity value κ(c, d) which constitutes the weight assigned to arc (c, d) in the graph. To each path π in the graph (or equivalently in I) in the set of all possible paths Πa,b between two voxels a and b of C, a strength of connectedness K(π) is determined, which is the minimum of the affinities along the path. The connectivity measure K*(a, b) between a and b is then defined to be K*(a, b) = max{K(π): π ∈ Πa,b}. The notion of connectivity measure can be generalized to the case of “between a set A and a voxel b” by a slight modification: K*(A, b) = max{K(π): π ∈ Πa,b & a ∈ A}. By using a fast algorithm to compute K*(A, b), the machinery of FC allows a variety of approaches to define and compute “objects” in images by specifying appropriate affinity functions and seed sets. In particular, in IRFC, two seed sets AO and AB are indicated for an object O and its background B, respectively. Then the object indicated by AO is separated optimally from the background indicated by AB by an iterative competition in connectivity measure between AO and every voxel c ∈ C and between AB and c. In published IRFC methods, AO and AB are specified usually with human interaction.

In FMIRFC, affinities κO(c, d) and κB(c, d) for O and B are designed separately. Subsequently they are combined into a single affinity κ by taking a fuzzy union of κO and κB. Each of κO and κB has three components. The description below is for κO. The same applies to κB.

κO(c,d)=w1ψO(c,d)+w2ϕO(c,d)+w3γO(c,d). (4)

Here, ψO(c, d) represents a homogeneity component of affinity, meaning, the more similar image intensities f(c) and f(d) are at c and d, the greater is this component of affinity between c and d. As commonly done in the FC literature, we set

ψO(c, d) = exp[−(f(c) − f(d))² / 2σψO²], (5)

where σψO is a homogeneity parameter that indicates the standard deviation of intensities within object O. ϕO(c, d), the object feature component, on the other hand, describes the “degree of nearness” of the intensities at c and d to the intensity mϕO expected for the object O under consideration. Denoting the standard deviation of object intensity by σϕO, this nearness is expressed by

ϕO(c, d) = exp[−max{(f(c) − mϕO)², (f(d) − mϕO)²} / 2σϕO²]. (6)

The third component γO incorporates fuzzy model information into affinity by directly taking the larger of the two fuzzy model membership values μO(c) and μO(d) at c and d for the object,

γO(c,d)=max{μO(c),μO(d)}. (7)

Finally, a combined single affinity κ on I is constructed by

κ(c,d)=max{κO(c,d),κB(c,d)}. (8)

The weights in (4) are chosen equal and such that they add up to 1. The homogeneity parameter is set equal for object and background (σψO = σψB) and estimated from uniform regions in the training images (after leaving out high gradient regions), as commonly done in the FC literature (Saha and Udupa 2001). The remaining parameters (σϕO, σϕB, mϕO, mϕB) are estimated automatically from the training data sets from the knowledge of O and B regions for each object.
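A compact sketch of the affinity computation of Eqs. (4)-(8) for a single adjacent voxel pair is given below; the equal weights follow the text, while the function signatures are assumptions of this example.

import numpy as np

def object_affinity(fc, fd, mu_c, mu_d, m_phi, sigma_phi, sigma_psi, w=(1/3, 1/3, 1/3)):
    """Sketch of kappa_O(c, d) in Eq. (4): homogeneity (Eq. 5), object feature
    (Eq. 6), and fuzzy model (Eq. 7) components combined with weights summing to 1."""
    psi = np.exp(-(fc - fd) ** 2 / (2 * sigma_psi ** 2))                              # homogeneity component
    phi = np.exp(-max((fc - m_phi) ** 2, (fd - m_phi) ** 2) / (2 * sigma_phi ** 2))   # nearness to expected object intensity
    gamma = max(mu_c, mu_d)                                                           # fuzzy model component
    return w[0] * psi + w[1] * phi + w[2] * gamma

def combined_affinity(kappa_o, kappa_b):
    """Sketch of Eq. (8): fuzzy union of object and background affinities."""
    return max(kappa_o, kappa_b)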

Seed specification

Seed sets AO and AB are found by a joint criterion of a threshold for image intensity and for model membership for each of O and B. The threshold interval ThO for O is the same as the one used for recognition, namely Thl. The threshold interval ThB for background is a union of similar threshold intervals for the background objects. (In principle, all objects other than O can be considered to be background objects of O; however, in practice, only the anatomic neighbors of O matter.) The only new parameters are ThOM and ThBM used as model thresholds for indicating AO and AB, respectively. These parameters are used as follows:

AO = {v ∈ C : f(v) ∈ ThO & μO(v) ∈ ThOM}, AB = {v ∈ C : f(v) ∈ ThB & μB(v) ∈ ThBM}. (9)
Algorithm FMIRFC
Input:  Image I of ℬ, FAM(ℬ, G), FMT(Ol) at recognition. Below, we assume O = Ol.
Output: OlD.
Begin
FC1. Determine background B of O;
FC2. Retrieve affinities κO and κB from FAM(ℬ, G);
FC3. Compute combined affinity κ;
FC4. Retrieve thresholds ThO, ThB, ThOM, and ThBM from FAM(ℬ, G) and determine seed sets AO and AB in I via (9);
FC5. Call the IRFC delineation algorithm with κ, AO, AB, and I as arguments;
FC6. Output image OlD returned by the IRFC algorithm;
End

In our implementation, ThOM is fixed at [0, 0.9] and [0, 0.5] for non-sparse and sparse objects, respectively, and ThBM is set to [0, 0].
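A sketch of the seed selection of Eq. (9) is given below; the default model thresholds follow the non-sparse setting quoted above, and the function name and argument layout are assumptions of this example.

import numpy as np

def seed_sets(image, mu_o, mu_b, th_o, th_b, th_om=(0.0, 0.9), th_bm=(0.0, 0.0)):
    """Sketch of Eq. (9): object (background) seeds are voxels whose intensity lies in
    Th_O (Th_B) and whose model membership lies in Th_OM (Th_BM). Intervals are
    (low, high) pairs; mu_o and mu_b are the model membership images."""
    a_o = (image >= th_o[0]) & (image <= th_o[1]) & (mu_o >= th_om[0]) & (mu_o <= th_om[1])
    a_b = (image >= th_b[0]) & (image <= th_b[1]) & (mu_b >= th_bm[0]) & (mu_b <= th_bm[1])
    return a_o, a_b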

Finally, we summarize the FMIRFC algorithm as shown in the box display.

5. ILLUSTRATIONS, EXPERIMENTS, RESULTS, AND DISCUSSION

We will describe the image data sets in Section 5.1, present model-construction related results in Section 5.2, and illustrate and evaluate recognition and delineation results in Sections 5.3 and 5.4.

5.1 Image data

The data sets used for the three body regions are summarized in Table 2.

Table 2.

Summary of data sets used in the experiments.

Data Identifier | Body Region ℬ | Group G (age) | Number of Subjects N | Image Modality | Imaging Protocol Details | Image Information
DS1 | Thorax | 50–60, male | 50 (normal) | CT | Contrast-enhanced, axial, breath-hold | 512 × 512 × 51–69, 0.9 × 0.9 × 5 mm3
DS2 | Abdomen | 50–60, male | 50 (normal) | CT | Contrast-enhanced, axial, breath-hold | 512 × 512 × 38–55, 0.9 × 0.9 × 5 mm3
DS3 | Neck | 8–17, male & female | 15 (normal) | MRI | T2-weighted, axial & T1- & T2-weighted sagittal. T2: TR/TE = 8274.3/82.6 msec, T1: TR/TE = 517.7/7.6 msec | 400 × 400 × 35–50, 0.5 × 0.5 × 3.3 mm3
DS4 | Abdomen | 8–17, male & female | 14 (6 normal, 8 obese patients) | MRI | T2-weighted, axial. TR/TE = 1556.9/84 msec | 400 × 400 × 45–50, 0.7 × 0.7 × 6 mm3

Data sets DS1 and DS2 are from CT and are selected from our hospital patient image database, and were verified to be of acceptable quality and radiologically normal, with the exception of minimal incidental focal abnormalities, in the body regions for which they are chosen. Note the typical clinical resolution for pixel size (~ 1 mm) and slice spacing (5 mm) in these data sets and hence the challenge for object recognition and delineation. Our goal in focusing on these data was to challenge the AAR system to perform on typical clinical data sets. DS3 is from an ongoing research project investigating the association of Polycystic Ovary Syndrome with Obstructive Sleep Apnea in obese pediatric female subjects (Arens et al. 2011). It consists of both axial and sagittal acquisitions and a mix of T1- and T2-weighted images. DS1–DS3 represent the three body regions for which the hierarchy of organs was depicted in Figure 2. DS4 (Wagshul et al. 2013), however, is used for testing the ability of the AAR method to rapidly prototype an application by using existing models for the same body region. In this case, models built from DS2 from CT are deployed on DS4 from MRI.

In all data sets, any extra slices falling outside the body region ℬ as per definition are removed manually first. Note the variation in the size of the body region in Table 2 (expressed roughly as slice spacing × number of slices). In the case of MRI, the resulting images are processed, first to suppress background non-uniformities and subsequently to standardize the image intensities (Nyul and Udupa 1999). Standardization is a post-acquisition image processing technique which significantly minimizes the inter-subject and intra- and inter-scanner image intensity variations for the same tissue and achieves tissue-specific numeric meaning for MR images. It has been shown to significantly improve the accuracy of delineation algorithms (Zhuge and Udupa 2009). It is done separately for each MRI protocol and body region. For DS1 and DS2, one half of the image data sets were used for model building, which included the estimation of the parameters of the recognition and delineation algorithms (T1m, Thl, σψO, mϕO, mϕB, σϕO, and σϕB), and the remaining data sets were used for testing the methods. For DS3, the train-test sets were set up as 11 and 4, and this was repeated 30 times for different choices of 11 and 4 data sets. For DS4, all data sets were used for testing, and model building was based on one half of the data sets in DS2. This provided an interesting challenge scenario for the AAR method, in that models built from normal CT data sets for one patient group were used for performing AAR on MRI data sets from normal subjects and patients from another group.

5.2 Model building

In Figure 3, the organs defined in the image of one of the subjects employed in model building are displayed for each body region in different combinations of the organs. We have examined all data sets under DS1–DS3 in this manner which has helped us in properly understanding the organ relationships. This is crucial for devising effective hierarchies, recognition strategies, and delineation algorithms.

Figure 3. Organs from one training set for each body region are displayed via surface rendering. For each row, objects in one picture are listed as {..}. Top row: Thorax. 3rd picture: {RPS, TB, E, AS, VS, PC}. Middle row: Abdomen. 3rd picture: {ASk, Lvr, LKd, IVC, AIA, Spl, SAT, Msl}. Bottom row: Neck. 5th picture: {Mnd, Tng, NP, OP, Ad, FP, Tnsl}.

Figure 4 displays fuzzy models FM(Ol) of objects in various combinations for the three body regions. Since the volumes are fuzzy, they are volume rendered by using an appropriate opacity function. Note that although the models appear blurred, they portray the overall shape of the objects they represent and the object relationships. From consideration of the difficulties in model building, recognition, and delineation, we divided objects in the body into sparse, non-sparse, and hybrid groups. Sparse objects pose special challenges for recognition and delineation, stemming mostly from difficulties in model building. We will come back to these issues in Sections 5.3 and 5.4. Variations in the form, shape, and orientation of sparse objects cause them to overlap far less, or often not at all, compared to non-sparse objects, when forming the model by gathering fuzzy overlap information. In other words, the models tend to become diffuse or too fuzzy. For example, in AS (thorax), the descending aortic portion extends from superior to inferior. However, this part is often either bent from the vertical or is crooked, and the pattern of the brachiocephalic and subclavian arteries arising from the aortic arch varies among subjects. If the variation is in orientation only, then aligning by orientation may produce sharper models. But the issue is not one of producing less fuzzy models but of building models that have the right amount of fuzziness so that the recognition process will be least misguided by the model. We will say more on this in Section 6. To study the effect of orientation alignment, we display in Figure 5 models created without and with orientation adjustment, for several sparse as well as non-sparse objects from all three body regions. The volume renditions were created with exactly the same settings for each object for its two versions of models. Orientation adjustment does not produce any dramatic difference in the models created, although close scrutiny reveals that the model definition improves slightly; examine especially LPS, AIA, AS, and Lvr.

Figure 4. Volume renditions of fuzzy models of objects in different combinations for the three body regions. For each row, objects in one picture are listed as {..}. Top row: Thorax. 5th picture: {LPS, AS, TB}. Middle row: Abdomen. 3rd picture: {ASk, Lvr, LKd, RKd, AIA, IVC, Spl}. Bottom row: Neck: 5th picture: {Mnd, Tng, NP, OP, Ad, FP}.

Figure 5. Volume renditions of fuzzy models created without (Rows 1 and 3) and with (Rows 2 and 4) orientation alignment for several non-sparse (Rows 1 and 2) and sparse (Rows 3 and 4) objects. Row 1: PC, RPS, LKd, Lvr. Row 3: AS, E, AIA, IVC, TB.

Relating to the fifth element η of FAM(ℬ, G), we show in Tables 3–5 correlations among objects in their size for the three body regions. Object size is determined as explained in Section 2.3. As may be expected, bilateral organs, such as LPS and RPS, LKd and RKd, and LT and RT, are strongly correlated in size. That is, their sizes go together, whatever way they may be related to the subject’s body size. There are also other interesting strong, poor (or no), and even weak negative, correlations, as highlighted in the tables; for example, TSk with RS and RPS; VS with TB, PC, and E; ASkn with ASTs, SAT and Msl; ASTs with SAT and Msl; Msl with SAT; NSkn with A&B; Ad with NSkn, FP, NP, and SP. Although we have not explored the utility of such information in this paper, we envisage that this and other information will be useful in devising hierarchies more intelligently than guided by just anatomy, and hence in building better FAM(ℬ, G).

Table 3.

Size correlation among objects of the Thorax.

TSkn RS TSk IMS RPS TB LPS PC E AS VS
TSkn 1
RS 0.76 1
TSk 0.76 0.93 1
IMS 0.48 0.76 0.71 1
RPS 0.6 0.92 0.88 0.75 1
TB 0.06 0.41 0.5 0.56 0.59 1
LPS 0.64 0.93 0.87 0.74 0.96 0.57 1
PC 0.47 0.51 0.45 0.65 0.28 0.11 0.3 1
E 0.42 0.65 0.56 0.58 0.72 0.58 0.78 0.18 1
AS 0.44 0.53 0.49 0.71 0.54 0.24 0.51 0.35 0.35 1
VS 0.3 0.31 0.35 0.34 0.34 0.09 0.34 −.01 0.05 0.42 1

Table 5.

Size correlation among objects of the Neck.

NSkn A&B FP Mnd NP OP Tng SP Ad LT RT
NSkn 1
A&B 0.89 1
FP 0.76 0.81 1
Mnd 0.75 0.96 0.83 1
NP 0.39 0.12 −.06 −.12 1
OP 0.63 0.59 0.44 0.54 0.14 1
Tng 0.83 0.75 0.76 0.66 0.19 0.65 1
SP 0.5 0.27 0.23 0.14 0.46 0.26 0.37 1
Ad −.2 0.61 −.19 0.1 −.29 −.06 −.07 −.19 1
LT 0.61 0.56 0.58 0.48 0.28 0.5 0.64 0.25 −.1 1
RT 0.61 0.56 0.58 0.48 0.28 0.5 0.64 0.25 −.1 1 1

5.3 Object recognition

Results for recognition are summarized in Figures 6–8 and Tables 6–9 for the different body regions. Figures 6–8 and Tables 6–8 illustrate recognition results for the three body regions for the best setup, which involves orientation adjustment applied selectively to different objects. The alignment strategy used for the different objects in these results is given in (10).

Figure 6.

Figure 6

Sample recognition results for Thorax for the alignment strategy shown in (10). Cross sections of the model are shown overlaid on test image slices. Left to right: TSkn, TSk, LPS, TB, RPS, E, PC, AS, VS.

Figure 8.

Figure 8

Sample recognition results for Neck for the alignment strategy shown in (10). Cross sections of the model are shown overlaid on test image slices. Left to right: NSkn, FP, Mnd, NP (note that NP is a combination of nasal cavity and nasopharynx), Ad, OP, RT, LT, Tng, SP.

Table 6.

Recognition results (mean, standard deviation) for Thorax for the strategy in (10). (“Mean” excludes VS.)

TSkn RS TSk IMS LPS TB RPS E PC AS VS Mean

Location Error (mm) 3.9 5.5 9.0 5.6 6.3 11.6 10.4 9.8 8.6 10.7 31.8 8.1
1.5 2.3 5.0 3.5 3.1 5.0 4.7 4.8 5.0 5.4 12.0 4.0

Size error 1.0 0.99 0.96 0.95 0.97 0.91 0.98 0.9 0.95 1.01 0.77 0.96
0.01 0.02 0.05 0.05 0.03 0.06 0.04 0.14 0.05 0.08 0.06 0.06

Table 9.

Recognition results for Thorax with no orientation alignment. (“Mean” excludes VS.)

TSkn RS TSk IMS LPS TB RPS E PC AS VS Mean

Location Error (mm) 3.9 5.5 9 5.6 6.3 8 10.4 14.2 8.6 8.1 33.6 8.0
1.5 2.3 5 3.5 3.1 6.5 4.7 10.5 5 7.5 15.1 4.9

Size error 1.01 0.99 0.96 0.95 0.97 0.83 0.98 0.85 0.95 0.99 0.77 0.95
0.01 0.02 0.05 0.05 0.03 0.08 0.04 0.12 0.05 0.08 0.06 0.05

Table 8.

Recognition results (mean, standard deviation) for Neck for the strategy in (10).

NSkn A&B FP NSTs Mnd Phrx Tnsl Tng SP Ad NP OP RT LT Mean

Location Error (mm) 3 7.8 4.2 4.8 12.5 10.4 2.8 4.9 5.1 1.8 11.1 10 2.9 2.3 5.96
1.2 3.8 2.1 2.1 3.7 4.5 1.8 2.8 1.8 0.8 6.8 8.7 2.2 2.1 1.96

Size error 1 0.9 1 0.92 0.74 0.8 1 1.02 0.93 0.9 0.65 0.74 0.92 0.9 0.93
0.01 0.03 0.03 0.06 0.05 0.04 0.1 0.06 0.24 0.12 0.07 0.2 0.11 0.12 0.04
Non-sparse & hybrid objects: RS, LPS, RPS, IMS, TSk, ASk, Kd, Spl, Msl, LKd, RKd, A&B, FP, NSTs, Mnd, Tnsl, Tng, SP, Ad, RT, LT – no orientation alignment. Sparse objects: TB, E, AS, VS, AIA, IVC, Phrx, NP, OP – orientation alignment by all axes. (10)

The recognition accuracy is expressed in terms of position and size. The position error is defined as the distance between the geometric center of the known true object in Ib and the center of the adjusted fuzzy model FMT(Ol). The size error is expressed as the ratio of the estimated size of the object at recognition to its true size. Values of 0 and 1 for the two measures, respectively, indicate perfect recognition. Note in Figures 6–8 that, since the model is fuzzy, it bleeds into adjacent tissue regions with some membership value. This should not be construed as wrong segmentation. The main issue is whether the model placement via recognition is accurate enough to yield good delineation. Similarly, owing to the slice visualization mode, sparse object components may appear to be missed or to present with low membership values.
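A minimal sketch of these two measures, assuming the true object and the placed model are available as numpy volumes on the same grid (variable names are hypothetical, and the fuzzy-model center is taken here as the membership-weighted centroid, which is one reasonable reading of "geometric center"):

import numpy as np

def position_error_mm(true_mask, placed_model, spacing):
    """Distance (mm) between the centers of the true object and the placed model."""
    def center(vol):
        idx = np.argwhere(vol > 0)                  # voxel coordinates of the support
        w = vol[vol > 0].astype(np.float64)         # weights: 1 for binary, membership for fuzzy
        return (idx * w[:, None]).sum(axis=0) / w.sum()
    offset = (center(true_mask) - center(placed_model)) * np.asarray(spacing, dtype=np.float64)
    return float(np.linalg.norm(offset))

def size_error(estimated_size, true_size):
    """Ratio of the size estimated at recognition to the true size (1 is perfect)."""
    return estimated_size / true_size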

Although we have not conducted extensive experiments to test all possible arrangements for orientation alignment of non-sparse and sparse objects, we generally found that orientation adjustment for non-sparse objects does not improve recognition results. In some cases, such as PC, it may actually lead to deterioration of results. In our experience, the setup in (10) turned out to be an excellent compromise between accuracy of results and efficiency. For comparison, we list in Table 9 recognition results for the thorax with no orientation adjustment for any object in either model building or recognition.

Size error is always close to 1 for all body regions and objects. Generally, recognition results for non-sparse objects are excellent, with a positional error of mostly 1–2 voxels. Note that for DS1 and DS2, voxels are quite large11. We observed that the positional accuracy within the slice plane is better than across slices. In other words, the errors listed in the tables are mostly in the third dimension, in which voxel size is large. Orientation adjustment improves recognition somewhat for some sparse objects, but has a negligible effect for non-sparse objects, at least in the thorax.

The recognition results for the MRI data set DS4 are shown in Figure 9 and Table 10. Again, since the model is fuzzy, it encroaches into adjacent tissue regions with some membership value. Since our goal here was just to measure subcutaneous adiposity, the hierarchy was simplified as shown in Figure 9. Again the position error is 1–2 voxels. These results are particularly noteworthy since they are generated by using models built from image data sets acquired on a different modality, namely CT, and from a subject group differing in age by about 40 years and in gender. This underscores the importance of understanding the dichotomy between recognition and delineation. Recognition is a high-level and rough process which provides anatomic context. The models do not have to be, and we argue should not be, so detailed as to attempt to capture fine detail. Obtaining the anatomic context is a necessary step for achieving accurate delineation. It is important to note that, for this cross-modality operation to work, the MR image intensities must be standardized (Nyul and Udupa 1999).

Figure 9.

Figure 9

The hierarchy used (left) and sample recognition results for DS4 (right) with model cross section overlaid on test image slices for ASkn and SAT.

Table 10.

Recognition accuracy for the objects shown in Figure 9.

ASkn SAT

Position Error (mm) 4.6 12.97
2.5 5.3

Size Error 1.01 1
0.05 0.03

5.4 Object delineation

Sample delineation results are displayed in Figures 10–13 for DS1–DS4. Delineation accuracy statistics for these data sets, expressed as false positive and false negative volume fractions (FPVF, FNVF) as well as mean Hausdorff distance (HD) between the true and delineated boundary surfaces, are listed in Tables 11–14. The HD measure is defined as the mean over all test subjects of the median of the distances of the points on the delineated object boundary surface from the true object boundary surface.
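A minimal sketch of these delineation measures for a single test subject, assuming binary numpy volumes on a common grid (names are hypothetical; FPVF and FNVF are normalized here by the true object volume, which is one common convention, and the paper's exact normalization for FPVF may differ):

import numpy as np
from scipy import ndimage

def fpvf_fnvf(delineation, truth):
    """False positive and false negative volume fractions, normalized by the true volume."""
    d, t = delineation.astype(bool), truth.astype(bool)
    true_volume = t.sum()
    return (d & ~t).sum() / true_volume, (~d & t).sum() / true_volume

def median_boundary_distance_mm(delineation, truth, spacing):
    """Median distance (mm) of delineated boundary voxels from the true boundary."""
    def boundary(mask):
        mask = mask.astype(bool)
        return mask & ~ndimage.binary_erosion(mask)
    # Distance of every voxel to the nearest true-boundary voxel, in mm.
    dist_to_true_boundary = ndimage.distance_transform_edt(~boundary(truth), sampling=spacing)
    return float(np.median(dist_to_true_boundary[boundary(delineation)]))

# The HD value in Tables 11-14 is then the mean of this per-subject median
# distance over all test subjects.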

Figure 10.

Figure 10

Sample delineation results for Thorax. Left to Right: TSkn, IMS, LPS, AS, RPS, PC, TB, E.

Figure 13.

Figure 13

Sample delineation results for DS4. ASkn (left) and SAT (right).

Table 11.

Delineation results for Thorax (mean & standard deviation).

TSkn RS TSk IMS LPS RPS E PC TB AS

FPVF 0.02 0.0 0.19 0.03 0.01 0.01 0.0 0.01 0.01 0.01
0.02 0.0 0.05 0.01 0.03 0.02 0.0 0.00 0.00 0.00

FNVF 0.05 0.06 0.13 0.07 0.04 0.04 0.49 0.09 0.16 0.17
0.06 0.04 0.07 0.07 0.02 0.02 0.19 0.06 0.14 0.17

HD (mm) 3.6 1.24 10.6 6.2 2.9 2.1 3.1 3.5 5.2 5.3
4.5 0.42 2.4 1.8 8.8 4.7 0.87 1.3 1.8 2.5

Table 14.

Delineation results for DS4.

ASkn SAT
FPVF 0.0 0.06
FNVF 0.03 0.01
HD (mm) 1.7 3.9

Delineation results for VS (Thorax) are not presented since the recognition accuracy for VS is not adequate for reliable delineation. We note that the delineation of 21 non-sparse objects achieves a mean FPVF and FNVF of 0.02 and 0.08, respectively, and a mean HD of 0.9 voxels, which are generally considered to be excellent. Six sparse objects also achieve good delineation outcomes, with the above mean measures reading 0.05, 0.15, and 1.5 voxels, respectively. However, the sparse objects VS, E, IVC, Mnd, and NP pose challenges for effective delineation. Often, even when their recognition is effective, it is difficult to guarantee placement of the seed sets AO and AB appropriately within and outside these objects because of their sparse nature. In DS3 (MR images of the neck), it is very difficult to properly delineate Mnd, NP, and OP because of their poor definition in the image. To test the effectiveness of the models created from these data (DS3) in segmenting the same objects on CT data of a group of three different pediatric subjects, we devised a simple hierarchy with NSkn as the root and with Mnd, NP, and OP as its offspring objects. The delineation results obtained for these four objects were excellent, with mean FPVFs of 0, 0.01, 0, and 0.02, and mean FNVFs of 0.01, 0.01, 0.02, and 0.1, respectively.

5.5 Comparison with a non-hierarchical approach

To study the effect of the hierarchy and the knowledge encoded in it on recognition, we list in Table 15 the recognition performance of a non-hierarchical approach. The results are shown for the Thorax, wherein each object is recognized on its own by using the same fuzzy models FM(Ol) as used in the hierarchical AAR system. The initial pose for the search is taken to be the center of the image, the search range covers roughly the whole body region, and the scale factor range is the same as that for the hierarchical approach. In comparison to the hierarchical approach (Tables 6 and 9), it is clear that non-hierarchical recognition performance is much worse.

Table 15.

Recognition results for Thorax: non-hierarchical approach (mean & standard deviation).

TSkn RS TSk IMS LPS TB RPS E PC AS VS Mean

Location error (mm) 10.5 12.9 21.1 27.7 91.4 53.3 72.3 42.4 45.5 23.1 82.2 43.8
9.5 13.1 21.8 9.8 10.8 20.9 12.9 34.5 12.5 15.2 33.8 17.7

Size error 1.0 1.01 0.96 0.92 0.8 0.82 0.8 0.86 0.9 0.97 0.81 0.9
0.02 0.09 0.08 0.07 0.09 0.06 0.07 0.14 0.06 0.11 0.08 0.08

5.6 Computational considerations

Program execution times were estimated on a Dell computer with the following specifications: 4-core Intel Xeon 3.6 GHz CPU with 8 GB RAM, running the Linux-jb18 3.7.10–1.16 operating system. Mean computational times for the AAR steps are listed in Table 16. Model building includes the construction of fuzzy models and the estimation of ρ, λ, and all parameters related to recognition and delineation, including the optimal threshold parameters Thl. This latter step takes about 12 seconds per object. As seen from Table 16, each of the three main operations takes under 1 minute per object. Among these operations, only the time for model building depends on the number of training data sets; recognition and delineation are independent of this factor. On average, model building times per object per training data set for the Thorax, Abdomen, and Neck are, respectively, 1.4 sec, 1.7 sec, and 1 sec. In statistical atlas-based methods, the computational time for image registration becomes the bottleneck. Our calculation, taking Elastix as a representative registration toolkit (Klein et al. 2010), indicates that the creation of a single atlas for each of the 11 objects of the Thorax at a reduced image resolution of 2.5 × 2.5 × 2.5 mm3 for the 25 training data sets of DS1 would take about 23.5 hours, compared to 6.4 min for the AAR system. The time per object for recognition and delineation can also take several minutes for these methods. Even with 100 data sets for training and 15 objects in a body region, the total time needed for the AAR model building step would be about 40 minutes (see the estimate below), whereas atlas building may take days to complete, especially when multi-atlas strategies are used.
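The 40-minute estimate follows directly from the per-object, per-training-data-set model building times quoted above; taking ~1.6 sec as a representative value within the observed 1.0–1.7 sec range:

100 training data sets × 15 objects × ~1.6 sec per object per data set ≈ 2,400 sec ≈ 40 min.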

Table 16.

Mean computational time in seconds per object for different operations and body regions.

Operation Thorax Abdomen Neck
Model building 35 42 24
Object recognition 30 46 6
Object delineation 47 56 24

5.7 Comparison with other methods

The publications reporting work most directly related in spirit to ours are (Baiker et al. 2010, Chu et al. 2013, Criminisi et al. 2013, Lu et al. 2012, Linguraru et al. 2012, Okada et al. 2008, Zhou et al. 2012). In Table 17, we present a comparison of our AAR system with these methods based on the results reported in those works. We note that a strict quantitative grading of the methods is impossible since the data sets used, acquisition protocols and resolutions, objects considered, training and test data set subdivisions, cross-validation strategies, and computing platforms all differ among these methods. Interestingly, a commonality among them is that they all focused on CT image data sets.

Table 17.

A comparison with the current methods from the literature that are related to our work. Unknown and irrelevant entries are indicated by “~”.

Method Objects Voxel size (mm3) Training-to-test data proportion Location error (mm) Region overlap (Dice, Jaccard Index (JI), etc.)
Lu et al. 2012 Prostate, bladder, rectum ~ × ~ × 0.8 to 5 141 to 47, 4-fold 2.4 to 4.2 ~
Linguraru et al. 2012 Liver, spleen, kidneys (0.5 to 0.9)2 × 1 to 5 27 to 1, 28-fold 0.8 to 1.2 90.9% to 94.8%
Okada et al. 2008 Liver, vena cava, gallbladder 0.7 × 0.7 × 2.5 20 to 8 (for liver) 1.5 to 2.8 88%
Chu et al. 2013 Liver, spleen, pancreas, kidneys (0.55 to 0.82)2 × 0.7 to 1 (estimated) 90 to 10, 10-fold ~ 56% (pancreas-JI) to 95.2% (liver-Dice)
Criminisi et al. 2013 26 anatomic structures in the torso (0.5 to 1)2 × 1 to 5 318 to 82 9.7 to 19.1 (mean for each structure) ~
Zhou et al. 2012 12 organ regions in thorax, abdomen, pelvis (0.6 to 0.7)3 300 to 1000 6 to 14 for mode locations ~
Baiker et al. 2010 Brain, heart, kidneys, lungs, liver, skeleton (0.332)3 MOBY atlas, 26 datasets ~ 47% to 73%

Among these methods, (Chu et al. 2013, Linguraru et al. 2012, Lu et al. 2012, Okada et al. 2008) comprise one group wherein the body region of focus was the pelvis or abdomen, with 3–5 objects considered for segmentation. They all employ an object localization step, which is achieved either through an atlas (Chu et al. 2013, Linguraru et al. 2012, Okada et al. 2008), statistical shape models (Okada et al. 2008), or machine learning techniques (Lu et al. 2012), followed by a delineation step that uses graph cuts (Chu et al. 2013, Linguraru et al. 2012), information theory (Lu et al. 2012), or MAP or ML estimation (Chu et al. 2013, Okada et al. 2008). In the second group (Criminisi et al. 2013, Zhou et al. 2012), the aim is only to locate the objects via machine learning techniques. The third group is constituted by (Baiker et al. 2010), the only work that considered body-wide organs, although in mice, using a kinematic model of the skeletal joints to localize objects relative to different skeletal components.

We observe that, for the same objects (liver, kidneys, and spleen), our results are comparable to, and often better than, the current results from the literature, especially considering the 5 mm slice spacing and the equal training-to-test data set proportion in our evaluation. We conclude that a general AAR system that can be readily applied and adapted to different body regions, multitudes of organs, and different modalities has not yet been demonstrated. Perhaps some of the above methods can be made to work in this general manner. However, we believe that this may require considerable further development and innovation.

6. CONCLUDING REMARKS

In this paper, we presented a general body of methods for automatic anatomy recognition and delineation whose principles are not tied to any specific body region, organ system, or imaging modality. We took a fuzzy approach to building the models and attempted to harness as much specific anatomic information as possible to embed into the fuzzy anatomic model. We demonstrated the generality of the approach by examining the performance of the same AAR system on three different body regions using CT and MR image data sets. We also illustrated the potential of the system for rapid prototyping by demonstrating its adaptability to a new application on a different modality (DS4). Our system is set up to operate fully automatically. All image modality-specific parameters needed – threshold intervals for the objects in B for recognition and affinity parameters for delineation – are estimated automatically from the training data sets. When a new application is sought at a modality different from those considered in the anatomy model FAM(B, G), a few sample segmentations of the objects of interest and the matching images are needed for relearning these image intensity-related parameter values (specifically, Thl and the affinity parameters). All other, modality-independent, aspects of the model do not need retraining. In the case of MRI, images from each separate MRI protocol have to be standardized for image intensity so that setting these parametric values becomes sensible. Separation of modality-independent from dependent aspects, organization of objects in a hierarchy, encoding of object relationship information into the hierarchy, optimal threshold-based recognition learning, and fuzzy model-based IRFC are novel and powerful concepts with consequences in recognition and delineation, as we demonstrated in this paper.

While the above strengths of this AAR system are quite unique as revealed in our literature review, the system has some limitations at present. First, we have not studied the performance of the system on patient images that contain significant pathology. However, we note that DS4 does include image data sets of patients who are obese. Note also that these image data sets are from a very different age and gender group and from a different imaging modality than those used to build FAM(B, G). We believe that it is essential to make the system operate satisfactorily on normal or near-normal images before testing it on images with diverse pathologies. As such, we are currently testing the system on organs and organ systems with significant pathology in all three body regions, focusing on specific disease processes.

Second, the accuracy is inadequate for some sparse objects in recognition (VS, IVC) and delineation (E, Mnd, NP). Also, we have not considered in this paper other important and challenging sparse objects such as the adrenal glands, pancreas, and spinal cord. If recognition is inadequate, delineation becomes unacceptable because it is then impossible to appropriately initialize the delineation process and to exploit the model to make up for missing boundary information in the image. When we closely examined these cases, it became clear that there are fundamental challenges in the model building stage itself for sparse objects. Generally, we found that sparse objects have much greater variation than their non-sparse counterparts in form, topology, and geographic layout relative to their size. As an example, consider AS and VS (Thorax). The descending aortic portion of AS is often straight and directed vertically downward, while in some subjects it may be inclined, curved, or even tortuous, with other portions, especially the aortic arch, not varying much. The branching pattern of the left and right brachiocephalic veins and the course of the azygos vein in VS also vary considerably. In view of such difficulties, we have come to the realization that sparse objects should not be modeled directly from their precise shape information in the binary image set Ib; instead, only their rough super form (such as a minimal superset that subsumes such variations) should be utilized in model building. We are exploring the use of rough sets (Maji and Pal 2012) for this purpose.

The AAR methodology seems to have definite computational advantages over atlas-based approaches. Further, in atlas-based methods, it is perhaps much more challenging to incorporate the extensive object-level knowledge that the AAR approach exploits at various stages for recognition and delineation. These incorporations constitute highly non-linear and discontinuous phenomena which are effected in intensity, geometric, and topological spaces. The kinematic model employed in (Baiker et al. 2010) is a good example of how object relationships may be encoded via a model in ways that are difficult to emulate through continuous and smooth image/atlas deformations.

Some of the avenues we are currently exploring for the proposed AAR approach are delineated below.

In this paper, we did not address the problem of automatically determining, within a given data set, the extent of the body region B following the definition of B. As demonstrated in (Chen et al. 2012), it is possible to determine the slices delimiting a body region B automatically based on slice profiles. Furthermore, the information about the relationship between B and WB can also be encoded into the hierarchy, as illustrated in Figure 2(a), for each B.

The use of composite objects often leads to better recognition accuracy. This is because the multiple objects contained in a composite object offer tighter constraints in the recognition search. How objects can be grouped to achieve optimum recognition results needs investigation. A related topic is how to devise optimal hierarchies for a given body region. The hierarchies we have considered so far are anatomically motivated. Perhaps there are "optimal" hierarchies from the viewpoint of achieving the best recognition (and hence, delineation) results. In such an investigation, how objects should be grouped and how they should be ordered in the hierarchy can both be addressed simultaneously using graph optimization techniques.

We have set up the AAR-R and AAR-D procedures in a general way. Recognition and delineation algorithms other than those we have tested can be used independently for R-ROOT and R-OBJECT and for D-ROOT and D-OBJECT within the same hierarchical setup. Similar to composite object recognition, delineation done simultaneously for multiple objects, unlike the one-object-at-a-time approach of AAR-D, may improve overall accuracy.

Computationally, there are three expensive operations in the AAR system – image interpolation, distance transform, and the delineation algorithm (FMIRFC). To make recognition and delineation operate in practical time in a clinical setting, implementations of these operations will have to be sped up. Toward this goal, we are studying GPU implementations of these operations. GPU implementations of some fuzzy connectedness algorithms have already been published (Zhuge et al. 2011, Zhuge et al. 2013).

Finally, along the lines of the study underlying DS4, we are exploring the adaptation of the AAR system to several clinical applications.

Figure 7.

Figure 7

Sample recognition results for Abdomen for the alignment strategy shown in (10). Cross sections of the model are shown overlaid on test image slices. Left to right: ASkn, ASk, SAT, Lvr, RKd, LKd, Spl, Msl, AIA, IVC.

Figure 11.

Figure 11

Sample delineation results for Abdomen. Left to Right: ASkn, SAT, Lvr, SAT, RKd, LKd, Spl, Msl, AIA.

Figure 12.

Figure 12

Sample delineation results for Neck. Left to Right: NSkn, FP, NP, OP, RT, LT, Tng, SP, Ad.

Table 4.

Size correlation among objects of the Abdomen.

ASkn ASk ASTs Lvr SAT Msl Spl RKd LKd AIA IVC
ASkn 1
ASk 0.68 1
ASTs 0.9 0.8 1
Lvr 0.61 0.48 0.58 1
SAT 1 0.69 0.92 0.61 1
Msl 0.91 0.79 0.99 0.63 0.94 1
Spl 0.62 0.43 0.61 0.51 0.65 0.62 1
RKd 0.53 0.64 0.57 0.61 0.51 0.6 0.34 1
LKd 0.53 0.56 0.52 0.51 0.49 0.54 0.34 0.87 1
AIA 0.6 0.85 0.7 0.27 0.58 0.68 0.49 0.51 0.5 1
IVC 0.32 0.58 0.47 0.29 0.32 0.46 0.3 0.38 0.36 0.67 1

Table 7.

Recognition results (mean, standard deviation) for Abdomen for the strategy in (10).

ASkn SAT ASk Lvr ASTs Kd Spl Msl AIA IVC RKd LKd Mean

Location Error (mm) 5.9 20.2 11.7 7.9 7.2 10.6 11.6 7.7 8.2 8.7 11.3 7.3 9.8
3.4 8.5 7.9 5.4 3.0 9.8 13.9 3.6 2.8 7.2 11.6 7.4 7

Size error 1.0 0.97 0.96 0.93 1.0 0.94 1.2 1.01 1.1 1.15 0.97 0.93 1.01
0.02 0.03 0.06 0.07 0.02 0.09 0.19 0.03 0.13 0.1 0.1 0.08 0.07

Table 12.

Delineation results for Abdomen (mean & standard deviation).

ASkn ASk Lvr ASTs SAT RKd LKd Spl Msl AIA

FPVF 0.01 0.06 0.04 0.12 0.05 0.00 0.01 0.0 0.13 0.01
0.00 0.01 0.02 0.05 0.03 0.00 0.01 0.0 0.03 0.0

FNVF 0.05 0.14 0.1 0.15 0.12 0.13 0.1 0.13 0.09 0.13
0.08 0.09 0.05 0.09 0.02 0.04 0.02 0.03 0.08 0.03

HD (mm) 1.7 6.9 5.3 1.74 1.6 2.4 5.4 6.8 2.5 5.6
2.7 1.5 1.6 1.0 0.8 1.1 4.8 6.0 1.1 1.8

Table 13.

Delineation results for Neck (mean & standard deviation).

NSkn FP Mnd NP OP RT LT Tng SP Ad

FPVF 0.0 0.0 0.01 0.01 0.0 0.01 0.01 0.02 0.01 0.0
0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.01 0.01 0.0

FNVF 0.0 0.1 0.49 0.32 0.2 0.06 0.06 0.02 0.08 0.07
0.01 0.05 0.08 0.2 0.02 0.02 0.01 0.01 0.01 0.04

HD (mm) 2.8 0.83 3.3 3.8 7.6 3.3 3.2 8.4 8.03 2.2
0.06 0.53 0.56 1.01 2.4 0.62 1.4 1.92 4.0 0.3

HIGHLIGHTS.

  • Fuzzy hierarchical modeling of all major organs in a body region

  • Detailed object relationship information encoded into models

  • Hierarchical object recognition and delineation

  • Optimal threshold-based recognition strategy

  • Demonstration of the same method on several body regions and different modalities

Acknowledgments

The research reported in this paper is partly supported by a DHHS grant HL 105212. Data set DS3 was acquired under support from an NIH grant R01-HD-53693-01A2. A. X. Falcao was supported by a Brazilian grant CNPq Proc. 303673/2010-9.

APPENDIX

Table 1.

Anatomic definitions of organs considered in this paper.

Thoracic objects Acronym Definition of object
Thoracic skin TSkn The outer boundary of the thoracic skin (arms excluded). The interior region constitutes the entire thoracic body region. The inferior boundary is defined to be 5 mm below the base of the lungs and the superior boundary is defined to be 15 mm above the lung apices.
Thoracic skeleton TSk All skeletal structures contained in the thoracic body region, including the spine, ribs, sternum, and the portions of the scapulae and clavicles that are inside the body region.
Respiratory system RS Grouping of RPS, LPS, and TB.
Right lung RPS The outer boundary of the right lung along the right pleura.
Left lung LPS The outer boundary of the left lung along the left pleura.
Trachea and bronchi TB The outer boundary of the trachea and bronchi from the superior thoracic trachea to the distal main stem bronchi.
Internal mediastinum IMS Grouping of PC, E, AS, and VS.
Pericardial region PC Region within the boundary of pericardial sac. The superior aspect is defined by the branching of the main pulmonary artery.
Esophagus E The outer boundary of the esophagus from the superior aspect of thorax to the level of gastric cardia.
Arterial system AS The outer boundary of the ascending aorta, aortic arch, descending thoracic aorta, pulmonary arteries, innominate artery, proximal left common carotid artery, and proximal left subclavian artery. The superior aspect is defined by the branching of the innominate artery.
Venous system VS The outer boundary of the superior vena cava, right and left brachiocephalic veins, and azygos vein.
Abdominal objects Acronym Definition of object
Abdominal skin ASkn The outer boundary of the abdominal skin. The interior region constitutes the entire abdominal body region. The superior boundary is defined by the superior aspect of the liver. The inferior boundary is defined by the bifurcation of the abdominal aorta into the common iliac arteries.
Abdominal skeleton ASk All skeletal structures contained in the abdominal body region, including lumbar spine and portion of the inferior ribs within the body region.
Soft tissue ASTs Grouping of Kd, Spl, Msl, AIA, IVC.
Kidneys Kd Grouping of RKd and LKd.
Right kidney RKd The outer boundary of the right kidney. All external blood vessels are excluded.
Left kidney LKd The outer boundary of the left kidney. All external blood vessels are excluded.
Spleen Spl The outer boundary of the spleen. All external blood vessels are excluded.
Muscle Msl The outer boundaries of the abdominal musculature, including the rectus abdominis, abdominal oblique, psoas, and paraspinal muscles.
Abdominal aorta AIA The outer boundary of the abdominal aorta. The superior and inferior slices of AIA are the same as those of the abdominal region.
Inferior vena cava IVC The outer boundary of the inferior vena cava. The superior and inferior slices of IVC are the same as those of the abdominal region.
Liver Lvr The outer boundary of the liver. The intrahepatic portal veins and hepatic arteries are included in this region.
Fat Fat Grouping of SAT and VAT.
Subcutaneous adipose tissue SAT Adipose tissue in the subcutaneous region in the abdomen.
Visceral adipose tissue VAT Adipose tissue internal to the abdominal musculature.
Neck objects Acronym Definition of object
Head and Neck skin NSkn The outer boundary of the head and neck skin, where the interior region constitutes the entire head and neck body region. The superior boundary is defined by a level 6.6 mm above the superior aspect of the globes. The inferior boundary is defined by a level 6.6 mm inferior to the inferior aspect of the mandible.
Air and Bone A&B Grouping of Mnd and Phrx.
Mandible Mnd The outer boundary of the mandible.
Pharynx Phrx Grouping of NP and OP.
Nasopharyngeal airway NP The outer contour of the nasal and nasopharyngeal air cavity, extending to the inferior aspect of the soft palate.
Oropharyngeal airway OP The outer contour of the oropharyngeal air cavities, extending from the inferior aspect of the soft palate to the superior aspect of the epiglottis.
Fat pad FP The outer boundary of the parapharyngeal fat pad.
Neck soft tissues NSTs Grouping of Tnsl, Tng, SP, Ad.
Palatine tonsils Tnsl Grouping of RT and LT.
Right palatine tonsil RT The outer boundary of the right palatine tonsil.
Left palatine tonsil LT The outer boundary of the left palatine tonsil.
Tongue Tng The outer boundary of the tongue.
Soft palate SP The outer boundary of the soft palate.
Adenoid tissue Ad The outer boundary of the adenoid tissue.

Footnotes

1

Except when we deal with fuzzy sets, which are also expressed as images for computational purposes, in which case I is a set of real numbers.

2

However, as discussed in Section 6, other arrangements are possible for H.

3

In our empirical investigations of the AAR system, we have studied the construction and use of fuzzy models both with and without orientation alignment. See Section 5.

4

Among several size measures we tested, such as volume3, largest eigenvalue, the length of the diagonal of the enclosing box etc., this measure turned out to be the most robust.

5

It also encodes WB to body region relationships, although this is not taken into account in our current implementation. See comments in Section 6.

6

We assume that the field of view in I fully encloses the root object. For the hierarchies shown in Figure 2, the root object is the skin outer boundary which is typically more-or-less, although not perfectly, fully included within the imaging field of view. See also Section 6 for further comments.

7

All thresholds are assumed to represent intervals in this paper unless specified otherwise.

8

Since arg min is a set, “∈” means one of the values chosen from the set is assigned to p*.

9

This dilemma of the disconnection between model building and recognition is common to all model/atlas-based methods and is the real challenge in automatic recognition of sparse and hybrid objects.

10

For this analysis, we have used all image data sets since the information provided by this analysis does not influence at present the testing of AAR algorithms for recognition and delineation.

11

Since recognition results do not improve much with finer discretization of the model but only increase computation for recognition, we construct models with isotropic voxels of side equal to one half of the largest dimension of the voxels in the original data. Thus for DS1 and DS2, the model voxels are of size 2.5 × 2.5 × 2.5 mm3.


References

  1. Arens R, Sin S, Nandalike K, Rieder J, Khan UI, Freeman K, Wylie-Rosett J, Lipton ML, Wootton DM, McDonough JM, Shifteh K. Upper Airway Structure and Body Fat Composition in Obese Children with Obstructive Sleep Apnea Syndrome. American Journal of Respiratory and Critical Care Medicine. 2011;183:782–787. doi: 10.1164/rccm.201008-1249OC. [DOI] [PMC free article] [PubMed] [Google Scholar]
  2. Ashburner J, Friston KJ. Computing average shaped tissue probability templates. Neuroimage. 2009;45:333–341. doi: 10.1016/j.neuroimage.2008.12.008. [DOI] [PubMed] [Google Scholar]
  3. Baiker M, Milles J, Dijkstra J, Henning TD, Weber AW, Que I, Kaijzel EL, Lowik CWGM, Reiber JHC, Lelieveldt BPF. Atlas-based whole-body segmentation of mice from low-contrast Micro-CT data. Medical Image Analysis. 2010;14:723–737. doi: 10.1016/j.media.2010.04.008. [DOI] [PubMed] [Google Scholar]
  4. Beucher S. The Watershed Transformation applied to image segmentation. 10th Pfefferkorn Conference on Signal and Image Processing in Microscopy and Microanalysis; 1992. pp. 299–314. [Google Scholar]
  5. Bogovic JA, Prince JL, Bazin PL. A multiple object geometric deformable model for image segmentation. Computer Vision and Image Understanding. 2013;117:145–157. doi: 10.1016/j.cviu.2012.10.006. [DOI] [PMC free article] [PubMed] [Google Scholar]
  6. Boykov Y, Veksler O, Zabih R. Fast approximate energy minimization via graph cuts. IEEE Transactions on Pattern Analysis and Machine Intelligence. 2001;23:1222–1239. [Google Scholar]
  7. Cabezas M, Oliver A, Llado X, Freixenet J, Cuadra MB. A review of atlas-based segmentation for magnetic resonance brain images. Comput Methods Programs Biomed. 2011;104:158–177. doi: 10.1016/j.cmpb.2011.07.015. [DOI] [PubMed] [Google Scholar]
  8. Cerrolaza JJ, Villanueva A, Cabeza R. Hierarchical Statistical Shape Models of Multiobject Anatomical Structures: Application to Brain MRI. IEEE Transactions on Medical Imaging. 2012;31:713–724. doi: 10.1109/TMI.2011.2175940. [DOI] [PubMed] [Google Scholar]
  9. Chen XJ, Bagci U. 3D automatic anatomy segmentation based on iterative graph-cut-ASM. Medical Physics. 2011;38:4610–4622. doi: 10.1118/1.3602070. [DOI] [PMC free article] [PubMed] [Google Scholar]
  10. Chen XJ, Udupa JK, Bagci U, Zhuge Y, Yao J. Medical image segmentation by combining graph cut and oriented active appearance models. IEEE Transactions on Image Processing. 2012;21(4):2035–2046. doi: 10.1109/TIP.2012.2186306. [DOI] [PMC free article] [PubMed] [Google Scholar]
  11. Chu C, Oda M, Kitasaka T, Misawa K, Fujiwara M, Hayashi Y, Wolz R, Rueckert D, Mori K. Multi-organ segmentation from 3D abdominal CT images using patient-specific weighted-probabilistic atlas. SPIE Medical Imaging. SPIE; 2013. pp. 86693Y-86691–86693Y-86697. [Google Scholar]
  12. Ciesielski KC, Udupa JK, Falcao AX, Miranda PAV. Fuzzy Connectedness Image Segmentation in Graph Cut Formulation: A Linear-Time Algorithm and a Comparative Analysis. Journal of Mathematical Imaging and Vision. 2012;44:375–398. [Google Scholar]
  13. Cootes TF, Edwards GJ, Taylor CJ. Active appearance models. IEEE Transactions on Pattern Analysis and Machine Intelligence. 2001;23:681–685. [Google Scholar]
  14. Criminisi A, Robertson D, Konukoglu E, Shotton J, Pathak S, White S, Siddiqui K. Regression forests for efficient anatomy detection and localization in computed tomography scans. Med Image Anal. 2013;17:1293–1303. doi: 10.1016/j.media.2013.01.001. [DOI] [PubMed] [Google Scholar]
  15. Duta N, Sonka M. Segmentation and interpretation of MR brain images: An improved active shape model. IEEE Transactions on Medical Imaging. 1998;17:1049–1062. doi: 10.1109/42.746716. [DOI] [PubMed] [Google Scholar]
  16. Hansegard J, Urheim S, Lunde K, Rabben SI. Constrained active appearance models for segmentation of triplane echocardiograms. IEEE Transactions on Medical Imaging. 2007;26:1391–1400. doi: 10.1109/TMI.2007.900692. [DOI] [PubMed] [Google Scholar]
  17. Heimann T, Meinzer HP. Statistical shape models for 3D medical image segmentation: A review. Medical Image Analysis. 2009;13:543–563. doi: 10.1016/j.media.2009.05.004. [DOI] [PubMed] [Google Scholar]
  18. Horsfield MA, Bakshi R, Rovaris M, Rocca MA, Dandamudi VSR, Valsasina P, Judica E, Lucchini F, Guttmann CRG, Sormani MP, Filippi M. Incorporating domain knowledge into the fuzzy connectedness framework: Application to brain lesion volume estimation in multiple sclerosis. IEEE Transactions on Medical Imaging. 2007;26:1670–1680. doi: 10.1109/tmi.2007.901431. [DOI] [PubMed] [Google Scholar]
  19. Kass M, Witkin A, Terzopoulos D. Snakes - Active Contour Models. International Journal of Computer Vision. 1987;1:321–331. [Google Scholar]
  20. Klein S, Staring M, Murphy K, Viergever MA, Pluim JPW. Elastix: a toolbox for intensity based medical image registration. IEEE Transactions on Medical Imaging. 2010;29:196–205. doi: 10.1109/TMI.2009.2035616. [DOI] [PubMed] [Google Scholar]
  21. Linguraru MG, Pura JA, Pamulapati V, Summers RM. Statistical 4D graphs for multi-organ abdominal segmentation from multiphase CT. Med Image Anal. 2012;16:904–914. doi: 10.1016/j.media.2012.02.001. [DOI] [PMC free article] [PubMed] [Google Scholar]
  22. Liu JM, Udupa JK. Oriented Active Shape Models. IEEE Transactions on Medical Imaging. 2009;28:571–584. doi: 10.1109/TMI.2008.2007820. [DOI] [PubMed] [Google Scholar]
  23. Lu C, Zheng Y, Birkbeck N, Zhang J, Kohlberger T, Tietjen C, Boettger T, Duncan JS, Zhou SK. Precise segmentation of multiple organs in CT volumes using learning-based approach and information theory. Med Image Comput Comput Assist Interv. 2012;15:462–469. doi: 10.1007/978-3-642-33418-4_57. [DOI] [PubMed] [Google Scholar]
  24. Maji P, Pal SK. Rough-Fuzzy Pattern Recognition: Applications in Bioinformatics and Medical Imaging. John Wiley & Sons, Inc; New York: 2012. [Google Scholar]
  25. Malladi R, Sethian JA, Vemuri BC. Shape Modeling with Front Propagation - a Level Set Approach. IEEE Transactions on Pattern Analysis and Machine Intelligence. 1995;17:158–175. [Google Scholar]
  26. Maurer CR, Qi RS, Raghavan V. A linear time algorithm for computing exact Euclidean distance transforms of binary images in arbitrary dimensions. IEEE Transactions on Pattern Analysis and Machine Intelligence. 2003;25:265–270. [Google Scholar]
  27. Meyer C, Peters J, Weese J. Fully automatic segmentation of complex organ systems: Example of trachea, esophagus and heart segmentation in CT images. SPIE Medical Imaging SPIE. 2011:796216-796211–796216-796211. [Google Scholar]
  28. Miranda PAV, Falcao AX, Udupa JK. Clouds: A model for synergistic image segmentation. ISBI. 2008:209–212. [Google Scholar]
  29. Miranda PAV, Falcao AX, Udupa JK. Cloud Bank: A multiple clouds model and its use in MR brain image segmentation. ISBI. 2009:506–509. [Google Scholar]
  30. Mumford D, Shah J. Optimal Approximations by Piecewise Smooth Functions and Associated Variational-Problems. Communications on Pure and Applied Mathematics. 1989;42:577–685. [Google Scholar]
  31. Nyul LG, Udupa JK. On standardizing the MR image intensity scale. Magnetic Resonance in Medicine. 1999;42:1072–1081. doi: 10.1002/(sici)1522-2594(199912)42:6<1072::aid-mrm11>3.0.co;2-m. [DOI] [PubMed] [Google Scholar]
  32. Okada T, Yokota K, Hori M, Nakamoto M, Nakamura H, Sato Y. Construction of hierarchical multi-organ statistical atlases and their application to multi-organ segmentation from CT images. Med Image Comput Comput Assist Interv. 2008;11:502–509. doi: 10.1007/978-3-540-85988-8_60. [DOI] [PubMed] [Google Scholar]
  33. Pizer SM, Fletcher PT, Joshi S, Thall A, Chen JZ, Fridman Y, Fritsch DS, Gash AG, Glotzer JM, Jiroutek MR, Lu CL, Muller KE, Tracton G, Yushkevich P, Chaney EL. Deformable M-reps for 3D medical image segmentation. International Journal of Computer Vision. 2003;55:85–106. doi: 10.1023/a:1026313132218. [DOI] [PMC free article] [PubMed] [Google Scholar]
  34. Raya SP, Udupa JK. Shape-Based Interpolation of Multidimensional Objects. IEEE Transactions on Medical Imaging. 1990;9:32–42. doi: 10.1109/42.52980. [DOI] [PubMed] [Google Scholar]
  35. Rousson M, Paragios N. Prior knowledge, level set representations & visual grouping. International Journal of Computer Vision. 2008;76:231–243. [Google Scholar]
  36. Saha PK, Udupa JK. Relative fuzzy connectedness among multiple objects: Theory, algorithms, and applications in image segmentation. Computer Vision and Image Understanding. 2001;82:42–56. doi: 10.1016/j.cviu.2006.10.005. [DOI] [PMC free article] [PubMed] [Google Scholar]
  37. Shattuck DW, Mirza M, Adisetiyo V, Hojatkashani C, Salamon G, Narr KL, Poldrack RA, Bilder RM, Toga AW. Construction of a 3D probabilistic atlas of human cortical structures. Neuroimage. 2008;39:1064–1080. doi: 10.1016/j.neuroimage.2007.09.031. [DOI] [PMC free article] [PubMed] [Google Scholar]
  38. Shen TA, Li HS, Huang XL. Active Volume Models for Medical Image Segmentation. IEEE Transactions on Medical Imaging. 2011;30:774–791. doi: 10.1109/TMI.2010.2094623. [DOI] [PubMed] [Google Scholar]
  39. Souza A, Udupa JK. Iterative live wire and live snake: New user-steered 3D image segmentation paradigms. SPIE Medical Imaging SPIE. 2006:1159–1165. [Google Scholar]
  40. Staib LH, Duncan JS. Boundary Finding with Parametrically Deformable Models. IEEE Transactions on Pattern Analysis and Machine Intelligence. 1992;14:1061–1075. [Google Scholar]
  41. Torigian DA, Alavi A. The evolving role of structural and functional imaging in assessment of age-related changes in the body. Semin Nucl Med. 2007;37:64–68. doi: 10.1053/j.semnuclmed.2006.10.001. [DOI] [PubMed] [Google Scholar]
  42. Tsechpenakis G, Chatzis SP. Deformable probability maps: Probabilistic shape and appearance-based object segmentation. Computer Vision and Image Understanding. 2011;115:1157–1169. [Google Scholar]
  43. Udupa JK, Samarasekera S. Fuzzy connectedness and object definition: Theory, algorithms, and applications in image segmentation. Graphical Models and Image Processing. 1996;58:246–261. [Google Scholar]
  44. van der Lijn F, de Bruijne M, Klein S, den Heijer T, Hoogendam YY, van der Lugt A, Breteler MMB, Niessen WJ. Automated Brain Structure Segmentation Based on Atlas Registration and Appearance Models. IEEE Transactions on Medical Imaging. 2012;31:276–286. doi: 10.1109/TMI.2011.2168420. [DOI] [PubMed] [Google Scholar]
  45. Wagshul ME, Sin S, Lipton ML, Shifteh K, Arens R. Novel retrospective, respiratory-gating method enables 3D, high resolution, dynamic imaging of the upper airway during tidal breathing. Magn Reson Med. 2013 doi: 10.1002/mrm.24608. [DOI] [PMC free article] [PubMed] [Google Scholar]
  46. Zhou J, Rajapakse JC. Segmentation of subcortical brain structures using fuzzy templates. Neuro Image. 2005;28:915–924. doi: 10.1016/j.neuroimage.2005.06.037. [DOI] [PubMed] [Google Scholar]
  47. Zhou X, Yamaguchi S, Zhou X, Chen H, Hara T, Yokoyama R, Kanematsu M, Fujita H. Automatic organ localization on 3D CT images by using majority-voting of multiple 2D detections based on local binary patterns and Haar-like features. SPIE Medical Imaging SPIE. 2013:86703A-86701–86703A-86707. [Google Scholar]
  48. Zhou X, Wang S, Chen H, Hara T, Yokoyama R, Kanematsu M, Fujita H. Automatic localization of solid organs on 3D CT images by a collaborative majority voting decision based on ensemble learning. Computerized Medical Imaging and Graphics. 2012;36:304–313. doi: 10.1016/j.compmedimag.2011.12.004. [DOI] [PubMed] [Google Scholar]
  49. Zhou YX, Bai J. Atlas-based fuzzy connectedness segmentation and intensity nonuniformity correction applied to brain MRI. IEEE Transactions on Biomedical Engineering. 2007;54:122–129. doi: 10.1109/TBME.2006.884645. [DOI] [PubMed] [Google Scholar]
  50. Zhuge Y, Cao Y, Udupa JK, Miller RW. Parallel fuzzy connected image segmentation on GPU. Medical Physics. 2011;38(7):4365–4371. doi: 10.1118/1.3599725. [DOI] [PMC free article] [PubMed] [Google Scholar]
  51. Zhuge Y, Ciesielski KC, Udupa JK, Miller RW. GPU-based relative fuzzy connectedness image segmentation. Medical Physics. 2013;40 (1):011903-1–011903-10. doi: 10.1118/1.4769418. [DOI] [PMC free article] [PubMed] [Google Scholar]
  52. Zhuge Y, Udupa JK. Intensity standardization simplifies brain MR image segmentation. Computer, Vision and Image Understanding. 2009;113:1095–1103. doi: 10.1016/j.cviu.2009.06.003. [DOI] [PMC free article] [PubMed] [Google Scholar]
