Author manuscript; available in PMC: 2015 Jun 15.
Published in final edited form as: Med Image Comput Comput Assist Interv. 2014;17(Pt 1):299–306. doi: 10.1007/978-3-319-10404-1_38

Hierarchical Label Fusion with Multiscale Feature Representation and Label-Specific Patch Partition

Guorong Wu 1, Dinggang Shen 1
PMCID: PMC4467207  NIHMSID: NIHMS697143  PMID: 25333131

Abstract

Recently, patch-based label fusion methods have achieved considerable success in medical imaging. After registering atlas images to the target image, the label at each target image point can be determined by checking the patchwise similarities between the underlying target image patch and all atlas image patches. The definition of patchwise similarity is therefore critical in label fusion. However, current methods often simply use the entire image patch with a fixed patch size throughout the label fusion procedure, which can be insufficient to distinguish the complex shape/appearance patterns of anatomical structures in medical images. In this paper, we address these limitations in three ways. First, we equip each image patch with a multiscale feature representation so that both local and semi-local image information is encoded, increasing the robustness of the patchwise similarity measure in label fusion. Second, since multiple variable neighboring structures can be present in one image patch, computing patchwise similarity on the entire image patch is not specific to the particular structure being labeled and can easily be misled by the surrounding variable structures in the same patch. We therefore partition each atlas patch into a set of new label-specific atlas patches according to the existing label information in the atlas images. These label-specific atlas patches are more specific and flexible for label fusion than the entire image patch, since the complex image patch has now been semantically divided into several distinct patterns. Finally, to correct possible mis-labeling, we hierarchically improve the label fusion result in a coarse-to-fine manner by iteratively repeating the label fusion procedure with gradually reduced patch size. Our hierarchical label fusion method with multiscale feature representations upon label-specific atlas patches achieves more accurate label fusion results.

1 Introduction

Many medical imaging studies demand accurate segmentation of anatomical structures in order to quantitatively measure structural differences across individuals or between groups. To this end, automatic ROI (Region of Interest) labeling has become an active topic in medical image processing, as evidenced by the hundreds of labeling and label fusion methods developed to improve both segmentation accuracy and robustness.

To deal with high structural variation across a population, multiple atlases with delineated labels are commonly used for labeling the latent ROIs of the target image [1]. The basic assumption behind multi-atlas segmentation is that the target image should bear the same label as an atlas image wherever the two present similar shape/appearance. Thus, all atlas images must be registered to the target image before label fusion. To alleviate possible mis-registration, patch-based label fusion techniques [1, 2] measure the patchwise similarity at each point. Intuitively, the higher the similarity between the target image and a particular atlas image, the more confidently we transfer the label from that atlas to the target image.

The patchwise similarity is clearly the key to patch-based label fusion methods. Most current state-of-the-art methods use a fixed patch size throughout the entire label fusion procedure; for example, 7×7×7 or 9×9×9 cubic patches are common in the literature. To make label fusion robust to noise, image patches must be large enough to capture sufficient image content. However, a large image patch raises a critical issue when labeling small anatomical structures, since the patchwise similarity can be dominated by the surrounding large structures in the patch. The root of this dilemma is that the simple use of the whole image patch lacks the high-level knowledge needed to distinguish complex appearance patterns in medical imaging data.

Many efforts have been made to improve the discriminative power of image patches. For instance, a sparse dictionary learning technique is used in [3] to find the best feature representations for label fusion. However, the dictionary is still confined to the whole image patch with a fixed size. In this paper, we address the above limitations from a new perspective: developing hierarchical, high-level feature representations for the image patch. Our contribution is threefold.

First, we propose to treat each image point within the image patch adaptively by designing a multi-scale feature representation for the patch. We argue that image points close to the patch center should use fine-scale features to characterize the details around the center, while the image features should gradually turn from fine to coarse as the distance to the patch center increases. To this end, we equip the conventional image patch with a layerwise multi-scale feature representation that adaptively captures image features at a different scale in each layer.

Second, the to-be-segmented ROI, e.g., the hippocampus, is very commonly surrounded by other complex structures. These surrounding variable structures may mislead the patchwise similarity measurement. In computer vision, recognizing an object is much easier if the foreground pattern can be separated from the background clutter [4]. In light of this, we present a new concept of label-specific patch partition to enhance the discriminative power of each atlas patch in label fusion. Specifically, since each atlas patch bears well-determined labels, this information provides a valuable heuristic about anatomical structures and can be used to guide the splitting of each atlas patch into a set of new, complementary label-specific (or structure-specific) image patches. It is worth noting that each label-specific image patch carries image information only at the locations sharing the same label. Therefore, our label-specific partition not only enriches the representation of each atlas patch but also encapsulates high-level label information. To the best of our knowledge, such important label information is poorly used in current label fusion methods. A sparsity constraint is further used in our proposed label fusion method to deal with the increased number of label-specific image patches.

Third, current label fusion methods fix the patch size throughout the entire label fusion procedure. Here, we go one step further and propose to iteratively refine the labeling results by gradually reducing the patch size as label fusion progresses. Specifically, we use large image patches in the beginning to make label fusion robust. A sparsity constraint allows only a small number of atlas patches to join the label fusion. Then, for those selected atlas patches, we reduce their patch size and repeat the label fusion procedure to refine their respective weights in the final label fusion.

We comprehensively evaluate the performance of our new label fusion method on segmenting the hippocampus in the ADNI dataset and labeling 54 ROIs in the LPBA40 dataset. More accurate labeling results are achieved compared with state-of-the-art label fusion methods.

2 Methods

Given the target image $T$, the goal of label fusion is to automatically determine a label map $L_T$ for $T$. To achieve this, we first register all atlas images, as well as their label maps, to the target image space. Here, we use $I = \{I_s \mid s = 1, \dots, N\}$ and $L = \{L_s \mid s = 1, \dots, N\}$ to denote the $N$ registered atlases and label maps, respectively. For each target image point $x$ ($x \in T$), all atlas patches¹ within a certain search neighborhood $n(x)$, denoted as $\vec{\beta}_{s,y}$ ($\vec{\beta}_{s,y} \subset I_s$, $y \in n(x)$), are used to compute the patchwise similarities w.r.t. the target image patch $\vec{\alpha}_{T,x}$ ($\vec{\alpha}_{T,x} \subset T$). It is worth noting that we arrange each patch, $\vec{\beta}_{s,y}$ and $\vec{\alpha}_{T,x}$, into a column vector.

Next, a label fusion strategy, e.g., non-local averaging, can be used to calculate the weighting vector $\vec{w} = [w_{s,y}]_{s=1,\dots,N,\, y \in n(x)}$ over the atlas patches $\vec{\beta}_{s,y}$. As we will explain in Section 2.2, we adopt a sparsity constraint in our method by regarding the label fusion procedure as the problem of finding the optimal combination of atlas patches $\{\vec{\beta}_{s,y}\}$ that represents the target image patch $\vec{\alpha}_{T,x}$ [5, 6]:

$$\hat{w} = \arg\min_{\vec{w}} \left\| \vec{\alpha}_{T,x} - B\vec{w} \right\|_2^2 + \lambda \left\| \vec{w} \right\|_1, \quad \text{s.t.} \ \vec{w} > 0 \tag{1}$$

where the scalar $\lambda$ controls the strength of the sparsity constraint and $B$ is the matrix assembling all column vectors $\{\vec{\beta}_{s,y}\}$ column by column. Assuming there are $M$ possible labels $\{l_1, \dots, l_m, \dots, l_M\}$ in the atlases, the label at target image point $x$ can then be efficiently determined by:

$$\hat{L}_T(x) = \arg\max_{m=1,\dots,M} \sum_{s=1}^{N} \sum_{y \in n(x)} \left[ w_{s,y} \cdot \delta\!\left(L_s(y), l_m\right) \right] \tag{2}$$

where the Dirac function $\delta(L_s(y), l_m)$ equals 1 when $L_s(y)$ bears the label $l_m$ and 0 otherwise.
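To make the pipeline concrete, below is a minimal Python sketch of Eqs. 1 and 2. It is a hedged illustration rather than the authors' implementation: labels are assumed to be encoded as integer indices, scikit-learn's nonnegative Lasso stands in for the unspecified sparse solver (its data-term scaling differs from Eq. 1 by a constant factor), and all function and variable names are ours.

```python
# A minimal sketch of sparse patch-based label fusion (Eqs. 1 and 2).
import numpy as np
from sklearn.linear_model import Lasso

def sparse_label_fusion(alpha_Tx, B, center_labels, num_labels, lam=0.1):
    """alpha_Tx: target patch as a vector of length d.
    B: (d, K) matrix whose columns are the atlas patches beta_{s,y}.
    center_labels: (K,) label L_s(y) at the center of each atlas patch.
    Returns the fused label index for the target point x."""
    # Eq. 1: nonnegative sparse coding of the target patch over atlas patches.
    # Note sklearn's Lasso scales the data term by 1/(2d), so its alpha is a
    # rescaled version of the paper's lambda.
    solver = Lasso(alpha=lam, positive=True, fit_intercept=False, max_iter=5000)
    solver.fit(B, alpha_Tx)
    w = solver.coef_                      # sparse nonnegative weights w_{s,y}
    # Eq. 2: accumulate each patch's weight into its label's bin and vote.
    votes = np.zeros(num_labels)
    for k, label in enumerate(center_labels):
        votes[label] += w[k]              # delta(L_s(y), l_m) selects the bin
    return int(np.argmax(votes))
```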

Clearly, the image intensities of the entire image patch are used in label fusion (Eq. 1). Since one image patch may contain more than one anatomical structure and the to-be-segmented ROI may have a very complex shape/appearance pattern, current patch-based label fusion methods run a high risk of being misled by patchwise similarities computed on the entire image patch. In the following, we propose three ways to improve label fusion accuracy: (1) substantially upgrading the discriminative power of the features by using multi-scale feature representations (Section 2.1); (2) adaptively building label-specific atlas patches by using the existing label information in the atlases (Section 2.2); and (3) hierarchically improving label fusion accuracy in a coarse-to-fine manner by gradually reducing the patch size (Section 2.3).

2.1 Multi-scale Feature Representations

In current patch-based label fusion methods, every point in the image patch contributes its own intensity value, and all points contribute equally to the patchwise similarity. Here, we allow each point to use an adaptive scale for capturing local appearance characteristics. Specifically, we first partition the whole image patch into several nested, non-overlapping layers spreading from the center point to the boundary of the patch. Next, we use a small scale to capture fine-scale features in the layer closest to the patch center, and gradually larger scales to capture coarser information as the distance to the patch center increases. Although an advanced image pyramid technique could be applied for multiscale feature representation, for computational efficiency we choose a simpler scheme: replacing each intensity value with the average intensity over a certain neighborhood. For example, for points in the first layer, which is closest to the patch center (the patch center and its 6 immediate neighbors), we keep the original intensities. For each point in the second layer, we replace its intensity with the average intensity over its 3×3×3 neighborhood. Similarly, we use the intensity average over progressively larger neighborhoods as the feature representation for image points beyond the second layer. In this way, the image patch is equipped with a multi-scale feature representation. Hereafter, $\vec{\alpha}_{T,x}$ and $\vec{\beta}_{s,y}$ denote the image patches after replacing the original intensities with the multi-scale feature representations.
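The following Python sketch illustrates one plausible reading of this layerwise scheme, under stated assumptions: layers are indexed by Chebyshev distance to the patch center (a simplification of the paper's innermost layer, which is the center plus its 6 immediate neighbors), box averages implement the neighborhood means, and the patch center is assumed to lie far enough from the volume border.

```python
# A minimal sketch of the layerwise multi-scale patch representation
# (Section 2.1); helper names and the Chebyshev-layer convention are ours.
import numpy as np
from scipy.ndimage import uniform_filter

def multiscale_patch(image, center, radius):
    """Extract a (2*radius+1)^3 patch in which each voxel's intensity is
    replaced by a box average whose size grows with the voxel's layer."""
    # Precomputed box-averaged volumes: size 1 (raw), 3x3x3, 5x5x5, ...
    smoothed = [image] + [uniform_filter(image, size=2 * k + 1)
                          for k in range(1, radius)]
    cx, cy, cz = center
    side = 2 * radius + 1
    patch = np.empty((side, side, side), dtype=float)
    for dx in range(-radius, radius + 1):
        for dy in range(-radius, radius + 1):
            for dz in range(-radius, radius + 1):
                layer = max(abs(dx), abs(dy), abs(dz))  # nested layer index
                # Layers 0-1 keep the raw intensity (the paper's first layer);
                # layer 2 uses the 3x3x3 average, layer 3 the 5x5x5, etc.
                scale = max(layer - 1, 0)
                patch[dx + radius, dy + radius, dz + radius] = \
                    smoothed[scale][cx + dx, cy + dy, cz + dz]
    return patch.ravel()                                # column-vector form
```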

2.2 Label-Specific Atlas Patch Partition

Since atlas image patches carry label information, we can partition each atlas patch into a set of new label-specific atlas patches that encode this information. Given the atlas patch $\vec{\beta}_{s,y}$, we use $\vec{\gamma}_{s,y}$ to denote its associated labels. Suppose there are $M$ kinds of labels in $\vec{\gamma}_{s,y}$. Then the proposed label-specific atlas patch set $\mathcal{P}_{s,y}$ consists of $M$ label-specific atlas patches, i.e., $\mathcal{P}_{s,y} = \{\vec{p}_{s,y}^{\,m} \mid m = 1, \dots, M\}$, where each $\vec{p}_{s,y}^{\,m}$ is a column vector. Each element $u$ of $\vec{p}_{s,y}^{\,m}$ keeps the intensity value $\vec{\beta}_{s,y}(u)$ if and only if $\vec{\gamma}_{s,y}(u)$ has label $l_m$; otherwise, $\vec{p}_{s,y}^{\,m}(u) = 0$. Mathematically, $\vec{p}_{s,y}^{\,m}(u) = \vec{\beta}_{s,y}(u) \cdot \delta(\vec{\gamma}_{s,y}(u), l_m)$, where $\delta(\cdot,\cdot)$ is the same Dirac function as in Eq. 2.
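Since $\vec{p}_{s,y}^{\,m}(u) = \vec{\beta}_{s,y}(u) \cdot \delta(\vec{\gamma}_{s,y}(u), l_m)$ is a simple masking operation, it translates directly into code; the sketch below (variable names are ours) builds all $M$ label-specific columns for one atlas patch.

```python
# A minimal sketch of the label-specific patch partition (Section 2.2).
import numpy as np

def partition_atlas_patch(beta, gamma, labels):
    """beta: (d,) atlas patch intensities; gamma: (d,) label of each element;
    labels: the label values l_1..l_M to split on.
    Returns a (d, M) matrix whose m-th column is p_{s,y}^m."""
    # p^m(u) = beta(u) * delta(gamma(u), l_m): keep intensities only at
    # locations carrying label l_m, zero everywhere else.
    return np.stack([beta * (gamma == lm) for lm in labels], axis=1)
```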

Note that the number of image patches increases significantly after we partition each atlas patch into a label-specific atlas patch set. We therefore use the sparsity constraint again in label fusion, in order to select only a small number of label-specific atlas patches $\vec{p}_{s,y}^{\,m}$ to represent the target image patch $\vec{\alpha}_{T,x}$. By replacing each conventional atlas patch with its label-specific atlas patches, the matrix of atlas patches $B$ in Eq. 1 expands to $P = [\mathcal{P}_{s,y}]_{s=1,\dots,N,\, y \in n(x)}$. The new energy function for label fusion can then be reformulated as:

$$\hat{\xi} = \arg\min_{\vec{\xi}} \left\| \vec{\alpha}_{T,x} - P\vec{\xi} \right\|_2^2 + \lambda \left\| \vec{\xi} \right\|_1, \quad \text{s.t.} \ \vec{\xi} > 0, \tag{3}$$

where $\vec{\xi} = [\xi_{s,y}^{m}]$ is the weighting vector over the label-specific atlas patches $\vec{p}_{s,y}^{\,m}$. Since each $\vec{p}_{s,y}^{\,m}$ is related only to a particular label $l_m$, each element $\xi_{s,y}^{m}$ of $\vec{\xi}$ represents the probability of labeling the center point $x$ of the target image patch $\vec{\alpha}_{T,x}$ with label $l_m$. Therefore, the labeling result at target image point $x$ is obtained by:

$$\hat{L}_T(x) = \arg\max_{m=1,\dots,M} \sum_{s=1}^{N} \sum_{y \in n(x)} \xi_{s,y}^{m} \tag{4}$$
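Putting Eqs. 3 and 4 together, a minimal sketch of the label-specific fusion step might look as follows; it reuses partition_atlas_patch from above, and the choice of scikit-learn's nonnegative Lasso as the solver is again our assumption.

```python
# A minimal sketch of Eqs. 3 and 4, built on partition_atlas_patch above.
import numpy as np
from sklearn.linear_model import Lasso

def label_specific_fusion(alpha_Tx, atlas_patches, atlas_label_patches,
                          labels, lam=0.1):
    """atlas_patches / atlas_label_patches: per candidate patch beta_{s,y},
    its (d,) intensity vector and (d,) label vector gamma_{s,y}."""
    M = len(labels)
    # Assemble P: M label-specific columns per atlas patch (Section 2.2).
    P = np.concatenate([partition_atlas_patch(b, g, labels)
                        for b, g in zip(atlas_patches, atlas_label_patches)],
                       axis=1)                          # shape (d, K*M)
    # Eq. 3: nonnegative sparse coding of the target patch over P.
    solver = Lasso(alpha=lam, positive=True, fit_intercept=False, max_iter=5000)
    solver.fit(P, alpha_Tx)
    xi = solver.coef_
    # Eq. 4: column k*M + m belongs to label l_m, so summing each label's
    # columns over all patches gives the per-label vote.
    votes = xi.reshape(-1, M).sum(axis=0)
    return labels[int(np.argmax(votes))]
```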

Fig. 1 demonstrates the construction of the label-specific atlas patch set $P$ for a case with only two labels, i.e., $M = 2$. As displayed in Fig. 1(a), each atlas patch $\vec{\beta}_{s,y}$ is split into two label-specific atlas patches $\vec{p}_{s,y}^{\,1}$ and $\vec{p}_{s,y}^{\,2}$, where black denotes zero elements. For example, the zero elements in $\vec{p}_{s,y}^{\,1}$ carry the label $l_2$ rather than $l_1$. The objective function in Eq. 1 minimizes the appearance difference between $\vec{\alpha}_{T,x}$ and $B\vec{w}$. In our method, we instead first divide each whole atlas patch into several label-specific patches and then recognize the structural patterns in $\vec{\alpha}_{T,x}$ in a label-by-label manner. In this way, our method makes the representation of $\vec{\alpha}_{T,x}$ more selective and flexible.

Fig. 1. (a) Construction of the label-specific atlas patch set and (b) the advantage in label fusion

The advantage of using label-specific atlas patches is demonstrated by the toy example in Fig. 1(b), where red and blue denote two different labels and the numbers represent intensity values. For simplicity, only two atlas patches are used. The first atlas patch (first column of $B$) and $\vec{\alpha}_{T,x}$ clearly belong to the same structure, since their intensity values are both in ascending order. If we estimate the weighting vector $\vec{w}$ from the entire atlas patches via Eq. 1 ($\lambda = 0.01$), the weights for the first and second atlas patches are 0.43 and 0.49, respectively; according to Eq. 2, we would then assign the target point the blue (incorrect) label. In our method, we first extend the matrix $B$ to the label-specific atlas patch set $P$, as shown at the bottom of Fig. 1(b), and then solve for the new weighting vector $\vec{\xi}$ via Eq. 3. As suggested by $\vec{\xi}$, the overall weights for the red and blue labels are 0.885 (0.88 + 0.005) and 0.800 (0.69 + 0.11), respectively. Therefore, we correctly assign the target point the red label. This example demonstrates the power of our method.

2.3 Hierarchical Patch-Based Label Fusion

At the beginning of patch-based label fusion, we use a large patch size to capture global image information. Since we use the sparsity constraint when solving the weighting vector $\vec{\xi}$, only a small number of image patches are selected to represent the target image patch $\vec{\alpha}_{T,x}$; many weights in $\vec{\xi}$ are zero or nearly zero. After discarding the non-selected atlas patches, we can confidently reduce the patch size of the selected atlas patches and repeat the label fusion procedure of Sections 2.1 and 2.2 using more detailed local features. In this way, our label fusion method iteratively improves the labeling results in a hierarchical way.
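A rough sketch of this coarse-to-fine loop is given below; the pruning threshold, the helper functions extract and fuse, and the patch-radius schedule (radius 5 then 2, matching the 11×11×11 and 5×5×5 patches used in the experiments) are all illustrative assumptions.

```python
# A rough sketch of the hierarchical coarse-to-fine loop (Section 2.3).
# extract(loc, r) and fuse(alpha, patches) are assumed helpers: the former
# returns the multi-scale patch vector of radius r at a location, the
# latter solves Eq. 3 and reports one aggregate weight per candidate patch.
def hierarchical_fusion(target_loc, candidates, extract, fuse,
                        radii=(5, 2), keep_thresh=1e-4):
    weights = []
    for r in radii:                        # e.g. 11x11x11 then 5x5x5 patches
        alpha = extract(target_loc, r)
        patches = [extract(loc, r) for loc in candidates]
        weights = fuse(alpha, patches)
        # Drop the atlas patches the sparse code did not select, then
        # refine the survivors with a smaller, more local patch.
        keep = [i for i, w in enumerate(weights) if w > keep_thresh]
        candidates = [candidates[i] for i in keep]
        weights = [weights[i] for i in keep]
    return candidates, weights
```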

3 Experiments

In the following experiments, we compare our label fusion method (Eqs. 3 and 4) with the sparse patch-based label fusion method (Eqs. 1 and 2). To label the target image, we first use FLIRT in the FSL package to linearly register all atlas images onto the target image, and then use diffeomorphic Demons [7] to compute the remaining local deformations². After parameter optimization, $\lambda$ is set to 0.1 for both label fusion methods. The patch size is 5×5×5 for the sparse patch-based label fusion method. In our method, the patch size is initialized to 11×11×11 in the first iteration and reduced to 5×5×5 in the second iteration. We use the Dice ratio on each ROI to measure labeling accuracy.
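For reference, the Dice ratio used throughout the experiments is the standard overlap measure; a minimal sketch on binary numpy masks:

```python
# A minimal sketch of the Dice ratio between two binary segmentations.
import numpy as np

def dice_ratio(seg, ref):
    """Dice = 2 * |A & B| / (|A| + |B|), reported here in percent."""
    seg = np.asarray(seg, dtype=bool)
    ref = np.asarray(ref, dtype=bool)
    inter = np.logical_and(seg, ref).sum()
    return 100.0 * 2.0 * inter / (seg.sum() + ref.sum())
```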

3.1 Evaluation on Hippocampus Labeling

In this experiment, we randomly select 66 elderly brains from the ADNI dataset³, in which the hippocampus has been manually labeled for each brain. Besides comparing with the baseline sparse patch-based label fusion method (Sparse PBL), we evaluate the contribution of each component of our label fusion method by further comparing against three degraded versions of our method: (1) Degraded_1, using only the multi-scale feature representation (with patch size 11×11×11); (2) Degraded_2, using only the label-specific atlas patches (with patch size 11×11×11); and (3) Degraded_3, using only the hierarchical labeling mechanism.

We evaluate each of the above five label fusion methods with leave-one-out cross-validation. Over all 66 leave-one-out cases, the mean and standard deviation of the Dice ratios on the hippocampus, together with the surface distance, are reported in Table 1. It is clear that: (1) our full method achieves the highest Dice ratio and lowest surface distance among the five methods, with almost a 1.2% improvement over the baseline Sparse PBL method; and (2) each component of our label fusion method contributes to the labeling accuracy, as evidenced by the 0.6%, 0.9%, and 0.3% Dice ratio increases over the baseline Sparse PBL achieved by Degraded_1, Degraded_2, and Degraded_3, respectively. All degraded methods also show statistically significant improvements over the baseline method under a paired t-test.
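The significance check reported here is a paired t-test over the per-subject scores from identical leave-one-out splits; a minimal sketch using scipy (the helper name and threshold argument are ours):

```python
# A minimal sketch of the paired t-test over per-subject Dice scores.
from scipy.stats import ttest_rel

def significantly_better(dice_method, dice_baseline, p_thresh=0.05):
    """Both arrays hold Dice scores from the same leave-one-out splits."""
    t_stat, p_value = ttest_rel(dice_method, dice_baseline)
    return (t_stat > 0) and (p_value < p_thresh)
```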

Table 1. The statistics of Dice ratios in hippocampus labeling by 5 different methods

|                  | Sparse PBL | Degraded_1 | Degraded_2 | Degraded_3 | Our method |
|------------------|------------|------------|------------|------------|------------|
| Dice Ratio (%)   | 87.3±3.4   | 87.9±3.0   | 88.2±2.5   | 87.6±2.9   | 88.5±2.2   |
| Surf. Dist. (mm) | 0.38       | 0.35       | 0.34       | 0.35       | 0.33       |

3.2 Evaluation on LPBA40 Dataset

The LPBA40 dataset⁴ consists of 40 MR brain images, each with 54 manually labeled ROIs. We randomly select 20 images as atlases and the other 20 as target images. The statistics of the overall Dice ratio across the 54 ROIs are given in Table 2, where our full method achieves a 1.5% improvement over the baseline Sparse PBL method. Again, each component of our proposed label fusion method contributes to the improved label fusion results. Fig. 2 shows the Dice ratio on each left-and-right-combined ROI for Sparse PBL (in blue) and our full method (in red); we observe significant improvements in 12 of the 27 ROIs ('*' denoting a significant improvement confirmed by a paired t-test, p < 0.05).

Table 2. The statistics of Dice ratios in labeling 54 ROIs on the LPBA40 dataset by 5 different methods

|                | Sparse PBL | Degraded_1 | Degraded_2 | Degraded_3 | Our method |
|----------------|------------|------------|------------|------------|------------|
| Dice Ratio (%) | 80.3±3.2   | 81.1±2.5   | 81.5±2.4   | 80.6±3.0   | 81.8±2.1   |

Fig. 2. The Dice ratios of 27 ROIs (left and right combined) in the LPBA40 dataset by Sparse PBL (in blue) and our method (in red)

4 Conclusion

In this paper, we explore a new perspective for substantially enhancing the discriminative power of the conventional image patch widely used in label fusion. Specifically, we assign each atlas patch a multi-scale feature representation, and further develop label-specific atlas patches according to the existing label information in the atlases, making each atlas patch more flexible during label fusion. Moreover, we present a hierarchical label fusion mechanism that iteratively improves the labeling results by gradually reducing the patch size. Promising labeling results are obtained on the ADNI and LPBA40 datasets in comparison with state-of-the-art methods.

Footnotes

1. Some label fusion methods use patch pre-selection to discard the less similar patches.

2. The main parameters for running diffeomorphic Demons are 15, 10, and 5 iterations at low, middle, and high resolution, respectively; the smoothing kernel size is 2.0.

Contributor Information

Guorong Wu, Email: grwu@med.unc.edu.

Dinggang Shen, Email: dgshen@med.unc.edu.

References

1. Coupé P, Manjón JV, Fonov V, Pruessner J, Robles M, Collins DL. Patch-based segmentation using expert priors: Application to hippocampus and ventricle segmentation. NeuroImage. 2011;54(2):940–954. doi: 10.1016/j.neuroimage.2010.09.018.
2. Rousseau F, Habas PA, Studholme C. A supervised patch-based approach for human brain labeling. IEEE Trans Med Imaging. 2011;30(10):1852–1862. doi: 10.1109/TMI.2011.2156806.
3. Tong T, Wolz R, Coupé P, Hajnal J, Rueckert D. Segmentation of MR images via discriminative dictionary learning and sparse coding: Application to hippocampus labeling. NeuroImage. 2013;76:11–23. doi: 10.1016/j.neuroimage.2013.02.069.
4. Li L-J, Su H, Xing E, Fei-Fei L. Object Bank: A high-level image representation for scene classification and semantic feature sparsification. In: Proceedings of Neural Information Processing Systems (NIPS); 2010.
5. Tong T, et al. Segmentation of brain images via sparse patch representation. In: MICCAI Workshop on Sparsity Techniques in Medical Imaging; Nice, France; 2012.
6. Zhang D, Guo Q, Wu G, Shen D. Sparse patch-based label fusion for multi-atlas segmentation. In: Yap P-T, Liu T, Shen D, Westin C-F, Shen L, editors. MBIA 2012. LNCS, vol. 7509. Heidelberg: Springer; 2012. pp. 94–102.
7. Vercauteren T, Pennec X, Perchant A, Ayache N. Diffeomorphic demons: Efficient non-parametric image registration. NeuroImage. 2009;45(suppl 1):S61–S72. doi: 10.1016/j.neuroimage.2008.10.040.
