Abstract
Medical image segmentations annotated by experts provide the labeled data sets for many scientific studies. However, because experts differ in experience and the number of patients with a given disease is limited, such labeled data sets are not only small but also vary widely in quality and consistency, and can be ambiguous. In practice, these segmentations are usually taken as the ground truth for scientific studies, which may undermine the trustworthiness of the resulting findings. Therefore, it is meaningful to consider how to form a more unified opinion from the annotations of different experts. In this paper, a novel approach to form normal distributions of segmentations is proposed, based on multiple doctors’ annotations for the same patient. The proposed approach is developed through the following steps: (1) utilize a framework [7] of averaging images to construct an averaged annotation from the given annotations; (2) determine the image registration deformations from the averaged annotation to the given annotations; (3) build a joint multivariate Gaussian distribution over the logarithm of the Jacobian determinants and the curls of the registration deformations; lastly, (4) simulate a normal distribution of segmentations from the joint Gaussian distribution of registration deformations. This work translates the problem of forming a normal distribution of image segmentations into the problem of forming a joint Gaussian distribution of image registration deformations, which can be characterized by the Jacobian determinant (which models the local size of pixel cells) and the curl (which models the local rotation of pixel cells). In the following sections, a detailed walk-through of the proposed approach is provided, along with its mathematical analysis and numerical examples of its effectiveness.
A synthetic example with 3 manually defined label images shows how to construct a mean label image, and an example of a real cancer image annotated by 3 doctors demonstrates the formation of the normal distribution and the effectiveness of the proposed method.
Keywords: averaging diffeomorphism, Jacobian determinant, curl, image segmentation, normal distribution
1. INTRODUCTION
In this era of surging data, many data-driven scientific fields have found fertile soil for growth, including medical imaging and image processing. Image segmentation is one of the fundamental problems in these areas. In deep learning for medical imaging [3,5], the segmented image is usually paired with the raw image and treated as labeled data for training. In general, the more labeled data are available, the better the research that can be done. However, segmented medical image data are often scarce and amount to small samples for deep learning based research. In reality, only a limited number of patients are digitally archived and annotated by experts. Moreover, when experts annotate medical images for segmentation, their annotations often differ because of variability among the experts. Such annotations are usually taken as the ground truth in supervised segmentation research, even though their quality is quite irregular. This may put the trustworthiness of the research findings at risk. To resolve this, a method that is capable of unifying these segmentation opinions is needed. The proposed approach aims to integrate the different annotations into a more unified opinion, along with a normal distribution based on this unified opinion. In this paper, we utilize an image averaging framework [7] to construct the mean label image as the unified label, then provide a formulation to construct a normal distribution of label shapes that takes the unified label image as its mean. Furthermore, in case extra annotations are needed for the same image, simulated samples can be drawn from the constructed normal distribution.
The structure of this paper is as follows. The proposed approach to form a normal distribution of medical segmentations is presented in 4 sections organized into two parts. The first part, which describes the computational strategy for building an averaged label, comprises sections 2, 3 and 4. Section 2 introduces the overall computational strategy with Alg 1. Sections 3 and 4, which respectively cover the Variational Principle for grid generation and an optimal control nonrigid image registration, present the two key computational tools and explain how they fit into the proposed approach. Section 5 is the second part, in which an example of a real cancer image annotated by three doctors shows how to construct the joint Gaussian distribution and how to draw samples from it.
2. COMPUTATION STRATEGY TO CONSTRUCT THE AVERAGED LABEL

The diagram visualizes Alg 1. Suppose images {Ii, i = 1,…,n} are given with similar label information. Im is the initial template for the algorithm as well as the moving image in section 4. A dashed arrow means performing the nonrigid image registration of section 4, and {Ti, i = 1,…,n} are the resulting registration deformations. This is reflected in steps 1–2 of the algorithm. Next, a solid arrow means averaging {Ti, i = 1,…,n} by the Variational Principle for grid generation in sections 3.1 and 3.2, which is reflected in steps 3–4 of the algorithm.

The red arrow means deforming the initial template by the averaged deformation Tavg to construct the desired averaged image Iavg, i.e., the claimed averaged label image. Alg 1 can be repeated by taking the averaged image Iavg as the new initial template Im. The number of repeats is up to the user, say k times. Once the repeats are done, the averaged image from the last repeat may be taken as the averaged label.
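The register / average / deform cycle of Alg 1 can be sketched as a short loop. This is a minimal sketch under stated assumptions, not the authors' implementation: `register`, `average` and `warp` are hypothetical callables standing in for the registration of section 4, the VP-based averaging of section 3.2 and the image deformation step, and seeding each repeat with the previous averaged image is an assumption about how the repeats are chained.

```python
from typing import Callable, Sequence, TypeVar

Image = TypeVar("Image")
Deformation = TypeVar("Deformation")


def averaged_label(
    labels: Sequence[Image],
    template: Image,
    register: Callable[[Image, Image], Deformation],
    average: Callable[[Sequence[Deformation]], Deformation],
    warp: Callable[[Image, Deformation], Image],
    repeats: int = 1,
) -> Image:
    """Repeat the register / average / deform cycle `repeats` times."""
    current = template
    for _ in range(repeats):
        # Steps 1-2: register the current template to every given label.
        deformations = [register(current, label) for label in labels]
        # Steps 3-4: average the deformations into a single deformation.
        t_avg = average(deformations)
        # Deform the template; the result seeds the next repeat (assumption).
        current = warp(current, t_avg)
    return current
```

Passing the three operations in as callables keeps the loop independent of any particular registration or grid-generation code, so it can be exercised with toy stand-ins (for example, plain numbers with subtraction as "registration") before real image routines are plugged in.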

In Fig 1, (a,b,c) are 3 widely varying synthetic labels {L1,2,3}, and (d,e,f) are the averaged labels {L1avg,2avg,3avg} that were constructed independently by Alg 1, with L1, L2 and L3 respectively chosen as the initial template and with 4 repeats. They all deformed towards an unknown but common shape. This observation indicates that Alg 1 is potentially independent of the choice of initial template, which is important in practice because each label may have its own bias. A robust method that produces unbiased or low-bias labels is desirable.
Figure 1.

Synthetic labels & their averaged labels
3. AVERAGING DIFFEOMORPHISMS
The first key tool for Alg 1 is a method for generating non-folding grids (diffeomorphisms), named the Variational Principle [6] (VP), whose idea was first introduced by Chen and Liao [1]. It takes prior information on the Jacobian determinant and curl of the desired grid, then simulates or regenerates a grid. When constructing an unknown grid, so long as the Jacobian determinant and curl can be precisely prescribed, VP can find a meaningful solution accordingly. Here, VP is briefly described and the averaging diffeomorphism method based on VP is presented.
3.1. Variational Principle for Grid Generation
Given ϕo on a fixed and bounded domain Ω, find ϕ = ϕm ○ ϕo = ϕm(ϕo), where ϕm = id + u is an intermediate transformation on Ω with u = 0 on ∂Ω, that minimizes (see Appendix 1)
| (1) |
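For orientation, a functional consistent with the residuals P = det∇ϕ − fo and Q = ∇ × ϕ − go used in Appendix 1 can be written as below. This is a hedged sketch only: the name Loss_1, the factor 1/2 and the exact form of the constraint on the control function F are assumptions inferred from the appendices, not the authors' stated equation.

```latex
% Assumed sum-of-squared-differences form; P and Q are as defined in Appendix 1.
\mathrm{Loss}_1(\phi)
  = \frac{1}{2} \int_{\Omega}
    \Big[ \big(\det\nabla\phi - f_o\big)^2 + \big|\nabla\times\phi - g_o\big|^2 \Big] \, dx ,
\qquad
\text{subject to } \Delta u = F \ \text{in } \Omega, \quad u = 0 \ \text{on } \partial\Omega .
```

Under this reading, fo prescribes the local cell size and go the local rotation of the desired grid, and the minimization is carried out over the control function F.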
3.2. Averaging Diffeomorphisms by VP
The original idea [2] of averaging diffeomorphisms is based on taking arithmetic means of both the Jacobian determinants and the curls of the given diffeomorphisms. Motivated by the findings in VP [6], it was realized that the geometric mean better fits the properties of the Jacobian determinant for the purposes of Alg 1. So, here, a modified definition of the averaged diffeomorphism is presented that uses the geometric mean for the Jacobian determinants and the arithmetic mean for the curls. Given n diffeomorphisms {Ti, i = 1,…,n}, compute the geometric mean of the Jacobian determinants {det∇Ti, i = 1,…,n} and the arithmetic mean of the curls {∇ × Ti, i = 1,…,n}, namely,
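$$ f_o = \Big( \prod_{i=1}^{n} \det\nabla T_i \Big)^{1/n}, \qquad g_o = \frac{1}{n} \sum_{i=1}^{n} \nabla \times T_i \tag{2} $$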
Then, the averaged diffeomorphism Tavg can be generated by feeding fo and go to VP. In Fig 2, T is given as the ground truth, and T1 and T2 are manually generated such that det∇T is the geometric mean of det∇T1 and det∇T2 and ∇ × T is the arithmetic mean of ∇ × T1 and ∇ × T2, so that one can check whether the averaged diffeomorphism Tavg is close to the ground truth T. Indeed, as Figure 2(d) shows, the averaged diffeomorphism Tavg in red almost overlaps the ground truth T in black.
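The two averaged prescriptions fed to VP can be computed with a few lines of NumPy. The sketch below is illustrative rather than the authors' code: it assumes 2D deformations stored as arrays of shape (2, H, W) holding the x- and y-components, unit grid spacing, finite differences from `np.gradient`, and strictly positive Jacobian determinants so that the logarithm is defined. The function names are hypothetical.

```python
import numpy as np


def jacobian_det_and_curl(T):
    """Return (det grad T, curl T) for a 2D deformation T of shape (2, H, W).

    T[0] is the x-component and T[1] the y-component; array axis 0 runs
    along y (rows) and axis 1 along x (columns), with unit grid spacing.
    """
    Tx, Ty = T
    dTx_dy, dTx_dx = np.gradient(Tx)
    dTy_dy, dTy_dx = np.gradient(Ty)
    det = dTx_dx * dTy_dy - dTx_dy * dTy_dx
    curl = dTy_dx - dTx_dy
    return det, curl


def averaged_prescriptions(deformations):
    """Geometric mean of Jacobian determinants and arithmetic mean of curls."""
    dets, curls = zip(*(jacobian_det_and_curl(T) for T in deformations))
    dets = np.stack(dets)
    curls = np.stack(curls)
    # Geometric mean computed in log space; requires det > 0 everywhere.
    f_o = np.exp(np.mean(np.log(dets), axis=0))
    g_o = np.mean(curls, axis=0)
    return f_o, g_o
```

Working in log space makes the geometric mean an ordinary average of log-determinants: a uniform scaling by 2 (determinant 4) and one by 1/2 (determinant 1/4) average to determinant 1, whereas the arithmetic mean would give 2.125.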
Figure 2.

Averaging T1 and T2 to get Tavg
4. AN OPTIMAL CONTROL NONRIGID IMAGE REGISTRATION (IR)
The second key tool for Algorithm 1 is a nonrigid image registration [4] method purposely designed as an optimal control problem under the same constraints as in section 3.1, so that the solutions found in section 4 are in the same category as the solutions of section 3.1. Here, a modified version (see Appendix 2) of this nonrigid image registration is briefly described. Let Im be a moving image that is to be registered to a fixed image If on a fixed and bounded domain Ω. The goal is to minimize Loss2 over solutions of the form ϕ = id + u on Ω with u = 0 on ∂Ω,
| (3) |
In Fig 3, (a) is moved to the fixed image (b) by (c), the solution ϕ found by (3); (d) is the moved/registered image, which looks quite similar to (b). In Table 2, the positive value of the minimum Jacobian determinant min(det∇ϕ) indicates that the solution is diffeomorphic. JSC is the Jaccard similarity coefficient and DICE is the Dice score. Both take values in [0, 1], and higher values indicate better registration results.
Figure 3.

Register a J-shaped label to a V-shaped label
Table 2.
Performance of Fig. 3
| Ω | ratio = Loss2(ϕ)/Loss2(id) | time (sec) | iterations | max(det∇ϕ) | min(det∇ϕ) | JSC | DICE |
|---|---|---|---|---|---|---|---|
| [1, 128]² | 0.0028 | 29.95 | 16028 | 6.2849 | 0.15603 | 0.9167 | 0.9565 |
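The two overlap scores in Table 2 can be computed from binary masks as in the following sketch. It uses the standard definitions (intersection over union for JSC, twice the intersection over the sum of the mask sizes for DICE); the function name and the convention of returning 1.0 for two empty masks are assumptions made for illustration.

```python
import numpy as np


def jaccard_and_dice(mask_a, mask_b):
    """Return (JSC, DICE) for two binary masks of the same shape."""
    a = np.asarray(mask_a, dtype=bool)
    b = np.asarray(mask_b, dtype=bool)
    intersection = np.logical_and(a, b).sum()
    union = np.logical_or(a, b).sum()
    size_sum = a.sum() + b.sum()
    # Two empty masks are treated as a perfect match (assumed convention).
    jsc = intersection / union if union else 1.0
    dice = 2.0 * intersection / size_sum if size_sum else 1.0
    return float(jsc), float(dice)
```

The two scores are tied by DICE = 2·JSC / (1 + JSC). The values reported in Table 2 are consistent with this identity: 2 × 0.9167 / 1.9167 ≈ 0.9565.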
5. NORMAL DISTRIBUTION BASED ON MULTIPLE DOCTORS’ ANNOTATIONS
This section demonstrates the construction of normal distributions of real cancer tissue segmentations based on 3 expert annotations. In Fig 4, an image I0 of a pancreatic cancer is given along with {L1,2,3}, the labels annotated by three doctors; {S1,2,3} are the corresponding segmentations of the cancer tissue based on {L1,2,3}.
Figure 4.

Annotations given by 3 different Doctors based on I0
Firstly, L1,2,3 are fed to Alg 1 to construct {L1avg,2avg,3avg}, with L1, L2 and L3 each chosen in turn as Im in Alg 1. In Fig 5, the averaged labels {L1avg,2avg,3avg} are, as expected, close to an unknown but common shape. {S1avg,2avg,3avg} are the segmentations produced by {L1avg,2avg,3avg}. Visually, they appear very close.
Figure 5.

Averaged labels found by Algorithm 1
Secondly, to form a distribution based on an averaged label, we find registration deformations {ψ1,2,3} by performing the IR of section 4, for instance from Im = L1avg to If = L1,2,3, individually. Next, we compute the Jacobian determinants {det∇ψi, i = 1,…,n} and curls {∇ × ψi, i = 1,…,n} of these registration deformations, so that the mean and the covariance matrix of the desired Gaussian distribution are defined as
| (4) |
Note that the covariance matrix is 2-by-2 for 2D images and 4-by-4 for 3D images. fo is defined through the geometric mean of {det∇ψi, i = 1,…,n}, so the constructed distribution over the Jacobian determinant is log-normal, i.e., the distribution over the logarithm of the Jacobian determinant is normal. Point-wise, (4) forms a multivariate normal distribution; globally, it should then form a joint multivariate normal distribution of diffeomorphisms with mean id when n is large enough.
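A point-wise mean vector and 2-by-2 covariance matrix over (log det∇ψ, ∇ × ψ) can be estimated from n deformations as sketched below for the 2D case. This is an illustrative sketch, not the authors' formula: the array layout, the function name and the choice of the unbiased normalization (`ddof=1`) are assumptions.

```python
import numpy as np


def pointwise_gaussian(log_dets, curls, ddof=1):
    """Estimate a per-pixel Gaussian over (log Jacobian determinant, curl).

    log_dets, curls: arrays of shape (n, H, W), one slice per deformation.
    Returns mu of shape (H, W, 2) and cov of shape (H, W, 2, 2).
    """
    samples = np.stack([log_dets, curls], axis=-1)  # (n, H, W, 2)
    mu = samples.mean(axis=0)
    centered = samples - mu
    n = samples.shape[0]
    # Per-pixel sum of outer products, normalized by n - ddof (assumed).
    cov = np.einsum("nhwi,nhwj->hwij", centered, centered) / (n - ddof)
    return mu, cov
```

The first component of `mu` is the mean of the log-determinants, which equals the logarithm of their geometric mean, in line with the log-normal reading of the Jacobian determinant. With only n = 3 annotations the covariance estimate is necessarily rough, which matches the paper's remark that n should be large.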
Based on our computations, the distributions constructed from L1avg, L2avg and L3avg are quite similar to each other. To see this, we show the pairwise difference images of {L1avg,2avg,3avg} in Fig 6(a,b,c), where a darker color indicates a better result. We also computed the pairwise, point-wise Kullback-Leibler divergences of these distributions, displayed in Fig 6(d,e,f). The Kullback-Leibler divergence takes values in [0, ∞), and the closer it is to 0, the more similar one distribution is to the other. The values recorded in this simulation occur near the bright dots in Fig 6(d,e,f). This indicates that these three distributions are very close to each other, which fits the expectation that the result is independent of the choice of initial template among the given label images.
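For two multivariate Gaussians, the Kullback-Leibler divergence has a closed form, so a point-wise divergence map can be computed directly from the per-pixel means and covariances. The sketch below assumes the distributions are stored as arrays of shape (..., k) and (..., k, k) with invertible covariances; the function name is hypothetical and the code is not taken from the paper.

```python
import numpy as np


def gaussian_kl(mu0, cov0, mu1, cov1):
    """KL( N(mu0, cov0) || N(mu1, cov1) ), evaluated over leading axes.

    mu*: shape (..., k); cov*: shape (..., k, k), assumed invertible.
    """
    k = mu0.shape[-1]
    inv1 = np.linalg.inv(cov1)
    diff = (mu1 - mu0)[..., None]  # column vectors, shape (..., k, 1)
    trace_term = np.trace(inv1 @ cov0, axis1=-2, axis2=-1)
    maha = (np.swapaxes(diff, -1, -2) @ inv1 @ diff)[..., 0, 0]
    _, logdet0 = np.linalg.slogdet(cov0)
    _, logdet1 = np.linalg.slogdet(cov1)
    return 0.5 * (trace_term + maha - k + logdet1 - logdet0)
```

The divergence is zero exactly when the two Gaussians coincide and is not symmetric in its arguments, so a pairwise comparison of three distributions should state which distribution is taken as the reference in each pair.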
Figure 6.

Pairwise difference images of {L1avg,2avg,3avg} & KL divergences
Consider the joint multivariate normal distribution of diffeomorphisms constructed from L1avg. One may draw a sample of det∇ψ and ∇ × ψ from this distribution, then reconstruct ψ by VP and deform L1avg by ψ to get L1avg(ψ), which is a sample label image. The collection of such L1avg(ψ) forms the distribution of label images. Fig 7, Fig 8 and Fig 9 provide the corresponding visualizations. In Fig 10, 7 samples are randomly drawn within the interval [−1.5σ, 1.5σ] of the distribution.
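Drawing one sample of (det∇ψ, ∇ × ψ) from per-pixel Gaussians can be sketched as follows for the 2D case. Several choices here are assumptions rather than the paper's procedure: a single standard-normal vector is shared by all pixels so that the sampled fields vary smoothly with the mean and covariance fields, the interval [−1.5σ, 1.5σ] is read as a bound on each standardized coordinate, and a small `jitter` is added to keep the Cholesky factorization defined for near-singular covariances. Reconstructing ψ from the sampled fields by VP is not shown.

```python
import numpy as np


def sample_det_and_curl(mu, cov, rng, bound=1.5, jitter=1e-9):
    """Draw one (Jacobian determinant, curl) field pair in 2D.

    mu: (H, W, 2) means of (log det, curl); cov: (H, W, 2, 2) covariances.
    """
    k = mu.shape[-1]
    chol = np.linalg.cholesky(cov + jitter * np.eye(k))
    # One standardized draw shared by all pixels (assumption), redrawn
    # until every coordinate lies within [-bound, bound].
    z = rng.standard_normal(k)
    while np.any(np.abs(z) > bound):
        z = rng.standard_normal(k)
    x = mu + (chol @ z[:, None])[..., 0]
    # Exponentiate the log-determinant so the sampled determinant is positive.
    return np.exp(x[..., 0]), x[..., 1]
```

Because the determinant is sampled in log space, every draw is strictly positive, which is the property needed for the reconstructed deformation to remain a candidate diffeomorphism.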
Figure 7.

Visualizing and
Figure 8.

Visualizing and
Figure 9.

Visualizing and
Figure 10.

Samples drawn from the segmentation distribution
Each of these sampled segmentation images is clearly a valid labeled image paired with the raw image, so the method can be used to generate labeled data for medical image segmentation. To form a more accurate probability distribution, having more experts annotate the same medical image is encouraged, so that the number of given annotations n may be closer to sufficiently large.
6. CONCLUSION
The proposed method forms a normal distribution of segmentation images based on multiple doctors’ annotations. The effectiveness of the proposed method is demonstrated with 2D real cancer label images. The proposed method can be used as a generator of segmentation label data. To form a more accurate distribution, this method encourages having more experts annotate the same patient rather than having more patients annotated.
Table 1.
Performance of Fig 2
| Ω | | sec | iteration | max difference in Fig. 2(d), ‖Tavg − T‖2 |
|---|---|---|---|---|
| [1, 101]² | 0.00009 | 775.04 | 16028 | 8.7348 × 10⁻⁶ |
ACKNOWLEDGMENT
Liao is partially supported by the grant R03MH120627 as the grant PI and Zhou is an approved collaborator. Appreciation goes to Prof. Xiaoqun Zhang and her Medical Imaging research group in Shanghai Jiao Tong University, where many brilliant ideas were shared and explored.
Appendix 1
Gradient with respect to control function F in section 3.1
To solve the problem in section 3.1, we need its variational gradient with respect to the control function F, derived as follows. Denote P = det∇ϕ − fo and Q = ∇ × ϕ − go; then, for all δF vanishing on ∂Ω,
Here, the “big vectors” are denoted by Ai, where i = 1, 2, 3. By Green’s identities with the fixed boundary condition, and for some Bi such that ΔBi = −∇ · Ai, i = 1, 2, 3, this leads to
| (5) |
Appendix 2
Gradient with respect to control F in section 4
To solve the problem in section 4, we need to derive the variational gradient with respect to the control function F, where δΔϕ = δΔu = δF. For all δF vanishing on ∂Ω and by Green’s identities with the fixed boundary condition,
again, by Green’s identities with fixed boundary condition,
| (6) |
Hessian Matrix with respect to control function F in section 4
In case a Newton optimization scheme is applicable, one can derive, based on (6), the Hessian matrix H with respect to F as follows,
where ,
| (7) |
Gradients with respect to control functions f and g in section 4
To ensure that section 4 produces diffeomorphic solutions controlled by Jmin ∈ (0, 1), instead of optimizing along F by (6), we need to optimize along f and g. Since it is known that δΔu = δF = δ(∇f − ∇ × g), this leads to
| (8) |
REFERENCES
- [1]. Chen X and Liao G: New Variational Method of Grid Generation with Prescribed Jacobian Determinant and Prescribed Curl. arxiv.org/pdf/1507.03715, 2015.
- [2]. Chen X and Liao G: New Method of Averaging Diffeomorphisms Based on Jacobian Determinant and Curl Vector. https://arxiv.org/abs/1611.03946, 2016.
- [3]. Dalca A, Yu E, Golland P, Fischl B, Sabuncu M and Iglesias J: Unsupervised Deep Learning for Bayesian Brain MRI Segmentation. arXiv:1904.11319v2, 2019.
- [4]. Hsiao H-Y, Hsieh C-Y, Chen X, Gong Y, Luo X and Liao G: New Development of Nonrigid Registration. ANZIAM Journal, vol. 55, pp. 289–297, 2014. doi: 10.1017/S1446181114000091
- [5]. Kohl S, Romera-Paredes B, Meyer C, De Fauw J, Ledsam J, Maier-Hein K, Eslami A, Rezende D and Ronneberger O: A Probabilistic U-Net for Segmentation of Ambiguous Images. https://arxiv.org/abs/1806.05034
- [6]. Zhou Z and Liao G: Construction of Diffeomorphisms with Prescribed Jacobian Determinant and Curl. https://arxiv.org/abs/2105.09302, May 2021.
- [7]. Zhou Z: Image Analysis Based on Differential Operators with Applications to Brain MRIs. PhD Dissertation, University of Texas at Arlington, 2019.
