Abstract
Purpose
In emission tomography, the EM (expectation maximization) algorithm is easy to use, with only one parameter to adjust: the number of iterations. On the other hand, the EM algorithms for transmission tomography are not so user-friendly; they are complicated to compute and slow to converge. This paper develops a new transmission algorithm similar to the emission EM algorithm.
Methods
This paper develops a family of emission-EM-look-alike algorithms by expressing the emission EM algorithm in the additive form and changing the weighting factor. One of the family members can be applied to transmission tomography such as the x-ray CT (computed tomography).
Results
Computer simulations are performed and compared with a similar algorithm by a different group using the transmission CT noise model. Our algorithm has the same convergence rate as theirs and provides a better contrast-to-noise ratio for lesion detection.
Conclusions
For any noise variance function, an emission EM look-alike algorithm can be derived. Such an algorithm preserves many properties of the emission EM algorithm, such as multiplicative update, non-negativity, faster convergence for bright objects, and ease of implementation.
Keywords: EM algorithm, Iterative image reconstruction, Transmission tomography, Convergence rate, X-ray CT
I. INTRODUCTION
The EM (expectation maximization) methodology is a general approach to computing maximum likelihood estimates iteratively.1 There are many EM algorithms. The most famous EM algorithm in the medical imaging community is the one for emission tomography.2–5 The emission EM algorithm uses a multiplicative form to update the image; it has a built-in property that enforces image non-negativity and matches the Poisson noise nature of the data. It is efficient to implement and stable, and it has no adjustable parameters other than the number of iterations. It is safe to state that it is the most popular iterative algorithm in nuclear medicine.
An EM algorithm for transmission tomography was developed in Reference 5 by Lange and Carson. Unlike its emission tomography counterpart, this EM algorithm for transmission tomography has many drawbacks: it is complicated to compute and slow to converge.6 There are other versions of the transmission EM algorithm, but they offer no fundamental improvements.7–10
In 2001, Nuyts et al proposed a method to scale the measurements so that the scaled data behave somewhat like emission data with Poisson noise, in the sense that the variance approximately equals the mean.11 Our proposed method is very similar, except that their scaling factor is a function of the measured sinogram value, while our proposed scaling factor is a function of the forward projection of the image reconstructed at the previous iteration.
Since the original transmission EM algorithm, many user-friendly algorithms have been developed.12,13 A faster algorithm may not be in the form of an EM algorithm.14 An EM algorithm can also be applied to other applications such as simultaneous estimation of the emission activity and the attenuation map in PET (positron emission tomography).15
This paper develops a family of emission-EM-look-alike algorithms. They are iterative algorithms with a multiplicative image update, which intrinsically enforces image non-negativity. The distinctive feature of this family is that the scaling factor is formed from the forward projection of the image reconstructed at the previous iteration, mimicking the "E-step" of an EM algorithm. Each member of the family has its own noise model, as explained in detail in the next section.
II. Methods
2.1 The emission EM algorithm
The starting point of our development is the emission EM algorithm
x_i^{(k+1)} = \frac{x_i^{(k)}}{\sum_j a_{ji}} \sum_j a_{ji} \frac{p_j}{\sum_m a_{jm} x_m^{(k)}}    (1)
where x_i^{(k)} is the ith image pixel at the kth iteration, p_j is the jth line-integral (ray-sum) measurement value, and a_{ji} is the contribution of the ith image pixel to the jth measurement. The summation over the second index is the projector and the summation over the first index is the backprojector. Expression (1) is in the form of a multiplicative image update, and it can be re-written in the form of an additive image update
x_i^{(k+1)} = x_i^{(k)} + \frac{x_i^{(k)}}{\sum_j a_{ji}} \sum_j a_{ji} \frac{p_j - \sum_m a_{jm} x_m^{(k)}}{\sum_m a_{jm} x_m^{(k)}}
            = x_i^{(k)} + \alpha_i^{(k)} \sum_j a_{ji} w_j^{(k)} \left( p_j - \sum_m a_{jm} x_m^{(k)} \right)    (2)
The last line of (2) is in the form of the iterative Landweber algorithm, which is a gradient descent algorithm. In (2),
\alpha_i^{(k)} = \frac{x_i^{(k)}}{\sum_j a_{ji}}    (3)
is the relaxation parameter, which is also known as the step size, and
w_j^{(k)} = \frac{1}{\sum_m a_{jm} x_m^{(k)}}    (4)
is the weighting factor for the jth projection ray. For the Poisson noise model, the noise variance equals the mean value of the ray sum. The denominator of the right-hand side of (4) is the "E-step" estimate of the sinogram variance for emission tomography, obtained by using the forward projection.
We can make two observations from (3) and (4). First, the EM algorithm’s step size is scaled by the image pixel value at the kth iteration. The step is larger for objects with larger image values. Therefore, the bright lesions converge faster than the dark lesions.
Second, the weighting factor is the reciprocal of the estimated mean value of the jth ray by the reconstruction at the kth iteration. The mean value is the same as the variance for Poisson noise.
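To make the update concrete, here is a minimal NumPy sketch of Eq. (1); the dense system matrix A, the function name, and the small guard eps are our own illustrative choices, not part of the paper.

```python
import numpy as np

def em_emission_update(x, A, p, eps=1e-12):
    """One multiplicative ML-EM update, Eq. (1).

    x : current image estimate (non-negative), x[i] = x_i^(k)
    A : system matrix, A[j, i] = a_ji
    p : measured sinogram, p[j] = p_j
    """
    fp = A @ x                                     # forward projection (A x)_j
    ratio = p / np.maximum(fp, eps)                # p_j / (A x)_j
    sens = np.maximum(A.T @ np.ones(len(p)), eps)  # sensitivity: sum_j a_ji
    return x * (A.T @ ratio) / sens                # multiplicative update
```

Starting from a positive image, the update keeps every pixel non-negative, and a noiseless consistent sinogram p = A x is a fixed point.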
2.2 Modification of the emission EM algorithm
We propose to introduce a new scaling factor c_j^{(k)} in the backprojector for each projection ray j. Thus the original emission EM algorithm (1) becomes
x_i^{(k+1)} = \frac{x_i^{(k)}}{\sum_j a_{ji} c_j^{(k)}} \sum_j a_{ji} c_j^{(k)} \frac{p_j}{\sum_m a_{jm} x_m^{(k)}}    (5)
As a consequence, the additive version (2) becomes
x_i^{(k+1)} = x_i^{(k)} + \tilde{\alpha}_i^{(k)} \sum_j a_{ji} \tilde{w}_j^{(k)} \left( p_j - \sum_m a_{jm} x_m^{(k)} \right)    (6)
with a new step size \tilde{\alpha}_i^{(k)} (also known as a relaxation parameter) and a new weighting factor \tilde{w}_j^{(k)}:
\tilde{\alpha}_i^{(k)} = \frac{x_i^{(k)}}{\sum_j a_{ji} c_j^{(k)}}    (7)
and
\tilde{w}_j^{(k)} = \frac{c_j^{(k)}}{\sum_m a_{jm} x_m^{(k)}}    (8)
Eq. (5) represents a family of emission-EM-look-alike algorithms depending on the definition of the new scaling factor c_j^{(k)}, as explained in the following special cases (or examples).
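As a hedged sketch of the family (5), the scaling factor can be supplied as a function of the forward projection of the current image; the names em_lookalike_update and scale_fn are ours. With scale_fn returning all ones, the update reduces to the emission EM algorithm (1).

```python
import numpy as np

def em_lookalike_update(x, A, p, scale_fn, eps=1e-12):
    """One update of the family in Eq. (5).

    scale_fn maps the forward projection (A x)_j of the current
    image to the per-ray scaling factor c_j^(k); each choice of
    scale_fn gives one member of the family.
    """
    fp = A @ x                                  # forward projection (A x)_j
    c = scale_fn(fp)                            # scaling factor c_j
    num = A.T @ (c * p / np.maximum(fp, eps))   # sum_j a_ji c_j p_j / (A x)_j
    den = np.maximum(A.T @ c, eps)              # sum_j a_ji c_j
    return x * num / den
```

Note that, for any positive scaling factor, a consistent sinogram p = A x remains a fixed point; the choice of c_j only changes how noise is weighted.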
2.3 Special case 1: No weighting
Let us consider a hypothetical imaging system in which the noise in the sinogram is identically distributed, with the same variance for every ray, so no noise weighting should be used in the image reconstruction algorithm. Thus we require the weighting factor (8) to be
\tilde{w}_j^{(k)} = 1    (9)
which results in
c_j^{(k)} = \sum_m a_{jm} x_m^{(k)}    (10)
Substituting (10) into (5) yields
x_i^{(k+1)} = x_i^{(k)} \frac{\sum_j a_{ji} p_j}{\sum_j a_{ji} \sum_m a_{jm} x_m^{(k)}}    (11)
In (11), \sum_j a_{ji} p_j is the backprojection of the measured sinogram and \sum_j a_{ji} \sum_m a_{jm} x_m^{(k)} is the backprojection of the forward projection of the current estimate of the image.
2.4 Special case 2: Modified Emission
Let’s consider another hypothetical imaging system, in which the noise in the sinogram is not quite Poisson distributed, but the variance is given as
\mathrm{Var}(p_j) = (\bar{p}_j)^{\gamma}    (12)
where p̄j is the mean value of the sinogram pj and γ is a constant. When γ = 1, this case is the classic situation of emission tomography using a Poisson noise model.
Using the EM strategy of estimating p̄_j by the forward projection \sum_m a_{jm} x_m^{(k)}, the weighting factor (8) for our modified emission case is given as
\tilde{w}_j^{(k)} = \frac{1}{\left( \sum_m a_{jm} x_m^{(k)} \right)^{\gamma}}    (13)
which results in
c_j^{(k)} = \left( \sum_m a_{jm} x_m^{(k)} \right)^{1-\gamma}    (14)
Substituting (14) into (5) yields
x_i^{(k+1)} = \frac{x_i^{(k)}}{\sum_j a_{ji} \left( \sum_m a_{jm} x_m^{(k)} \right)^{1-\gamma}} \sum_j a_{ji} \frac{p_j}{\left( \sum_m a_{jm} x_m^{(k)} \right)^{\gamma}}    (15)
As expected, when γ = 1, (15) is the famous emission EM algorithm.
2.5 Special case 3: Transmission
In transmission tomography, the sinogram variance is proportional to the exponential function of the sinogram’s mean value, that is,16
\mathrm{Var}(p_j) \propto e^{\bar{p}_j}    (16)
The weighting factor can be chosen as the reciprocal of the variance. Using the EM strategy of estimating p̄_j by the forward projection \sum_m a_{jm} x_m^{(k)}, the weighting factor (8) for the transmission tomography case is given as
\tilde{w}_j^{(k)} = e^{-\sum_m a_{jm} x_m^{(k)}}    (17)
which results in
c_j^{(k)} = \left( \sum_m a_{jm} x_m^{(k)} \right) e^{-\sum_m a_{jm} x_m^{(k)}}    (18)
Substituting (18) into (5) yields
x_i^{(k+1)} = \frac{x_i^{(k)}}{\sum_j a_{ji} \left( \sum_m a_{jm} x_m^{(k)} \right) e^{-\sum_m a_{jm} x_m^{(k)}}} \sum_j a_{ji}\, e^{-\sum_m a_{jm} x_m^{(k)}}\, p_j    (19)
Eq. (19) is the main result of this paper. It is an emission-EM-look-alike algorithm for transmission tomography. It is in the form of a multiplicative image update, has an intrinsic non-negativity constraint, and weights the sinogram with (17). Most importantly, (19) is user-friendly and easy to implement. The initial image values for Algorithm (19) must be positive.
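A minimal NumPy sketch of the transmission look-alike update (19), under our reading of the scaling factor (18); the function name and the guard eps are our own choices.

```python
import numpy as np

def em_transmission_update(x, A, p, eps=1e-12):
    """One update of the transmission look-alike algorithm, Eq. (19).

    Each ray is weighted by exp(-(A x)_j), the reciprocal of the
    transmission noise variance model (16).
    """
    fp = A @ x                             # forward projection (A x)_j
    w = np.exp(-fp)                        # noise weighting, Eq. (17)
    num = A.T @ (w * p)                    # sum_j a_ji e^{-(A x)_j} p_j
    den = np.maximum(A.T @ (w * fp), eps)  # sum_j a_ji e^{-(A x)_j} (A x)_j
    return x * num / den
```

Starting from positive initial values, all iterates stay positive, and a consistent sinogram p = A x is a fixed point.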
2.6 Special case 4: Modified Transmission
Again, this is another hypothetical imaging system, in which the noise in the sinogram has a variance given by
\mathrm{Var}(p_j) \propto e^{\gamma \bar{p}_j}    (20)
where p̄j is the mean value of the sinogram pj and γ is a constant. When γ = 1, this case is the classic situation of transmission tomography using a Poisson noise model for its pre-log data.
According to the noise model (20), the weighting factor can be chosen as

\tilde{w}_j^{(k)} = e^{-\gamma \sum_m a_{jm} x_m^{(k)}}    (21)

which results in the scaling factor

c_j^{(k)} = \left( \sum_m a_{jm} x_m^{(k)} \right) e^{-\gamma \sum_m a_{jm} x_m^{(k)}}

Substituting this scaling factor into (5) yields

x_i^{(k+1)} = \frac{x_i^{(k)}}{\sum_j a_{ji} \left( \sum_m a_{jm} x_m^{(k)} \right) e^{-\gamma \sum_m a_{jm} x_m^{(k)}}} \sum_j a_{ji}\, e^{-\gamma \sum_m a_{jm} x_m^{(k)}}\, p_j    (22)
When γ = 1, this degenerates to the usual transmission tomography case (19). When γ = 0, this is the no-weighting case (11).
From the examples above, a large family of emission EM look-alike image reconstruction algorithms can be developed, as long as a noise variance function is provided.
2.7 Computer Simulations
In this paper, computer simulations are used to compare the performance of our proposed algorithm with the similar algorithm developed by Nuyts et al, using the transmission CT noise model. Gaussian noise was added to the sinogram data. The computer simulations use a two-dimensional (2D) uniform elliptical phantom containing 3 small hot lesions and 2 small cold lesions, as shown in Figure 1. The large ellipse (with a linear attenuation coefficient of 0.002/unit) had a horizontal semi-axis of 204.8 and a vertical semi-axis of 128. The 3 small circular hot lesions (with a linear attenuation coefficient of 0.004/unit) had radii of 10.24, 5.12, and 2.56, respectively. The 2 small circular cold lesions (with a linear attenuation coefficient of 0.0004/unit) had radii of 5.12 and 2.56, respectively. The unit is the pixel size. The detector bin size was the same as the image pixel size. The large ellipse is also referred to as the background.
Figure 1.

The computer generated lesion phantom with 3 hot lesions and 2 cold lesions.
The performance is characterized by the lesion contrast recovery coefficient (CRC), which is defined as18
\mathrm{CRC} = \frac{(M_{les}/M_{back})_{rec} - 1}{(M_{les}/M_{back})_{phan} - 1}    (23)
where M_les and M_back represent the mean of the lesion and the mean of the background, and the subscripts 'rec' and 'phan' denote the reconstruction and the true phantom, respectively. In the computer simulations, we calculated the CRC for each of the five lesions for every reconstruction using the noiseless projections. The lesion value M_les was the value at the center of the lesion. The background value M_back was the value at the center of the large ellipse. For the hot lesions, (M_les/M_back)_phan − 1 = (0.004/0.002) − 1 = 1. For the cold lesions, (M_les/M_back)_phan − 1 = (0.0004/0.002) − 1 = −0.8.
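For reference, the CRC of Eq. (23) is a one-line computation; the argument names below are our own.

```python
def crc(m_les_rec, m_back_rec, m_les_phan, m_back_phan):
    """Contrast recovery coefficient, Eq. (23):
    CRC = [(M_les/M_back)_rec - 1] / [(M_les/M_back)_phan - 1]."""
    return (m_les_rec / m_back_rec - 1.0) / (m_les_phan / m_back_phan - 1.0)
```

A perfect reconstruction gives CRC = 1 for hot and cold lesions alike; partial contrast recovery gives a value between 0 and 1.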
The image noise was evaluated using the normalized standard deviation of the noise in the central 70 × 70 square region of the large disc reconstructed with the noisy data as
\sigma = \sqrt{ \frac{1}{N} \sum_{i=1}^{N} \left( \frac{x_i - \bar{x}_i}{\bar{x}_i} \right)^2 }    (24)
where σ is the standard deviation, N is the number of image elements used in the calculation, xi is the value of the ith pixel of the image reconstructed from noisy data, and x̄i is the expected mean value of the ith pixel. The purpose of the normalization is to eliminate the influence on the noise measurement of non-uniform values of the image within the regions that are supposed to be uniform.
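One plausible reading of Eq. (24), matching the stated purpose of the normalization, divides each pixel deviation by its expected mean before forming the standard deviation; the function below is our sketch under that assumption.

```python
import numpy as np

def normalized_noise_std(x, x_mean):
    """Normalized noise standard deviation, one reading of Eq. (24).

    x      : pixels reconstructed from noisy data (e.g., the 70 x 70 region)
    x_mean : expected mean value of each pixel
    Dividing by x_mean removes the influence of smooth non-uniformity
    within a region that is supposed to be uniform.
    """
    x = np.asarray(x, float)
    x_mean = np.asarray(x_mean, float)
    r = (x - x_mean) / x_mean          # relative deviation per pixel
    return float(np.sqrt(np.mean(r ** 2)))
```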
A parallel-beam imaging geometry is assumed. The image array was 512 × 512, the number of views was 400 over 180°, and the number of detection channels was 512. To avoid the inverse crime, the projector used to generate the projection sinograms is different from the projector used for image reconstruction. In sinogram data generation, the image was first up-sampled 5 times and the detector was also up-sampled 5 times.17 The noiseless sinogram data were then down-sampled 5 times and noise was introduced. Negative sinogram values were set to zero. The transmission CT noise model was used with extremely low counts, in which the sinogram variance is proportional to the exponential function of the sinogram value.
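The noise injection described above can be sketched as follows; the reference count level n0 and the function name are our own notation (smaller n0 means lower counts), not values from the paper.

```python
import numpy as np

def add_transmission_noise(p_clean, n0=1.0e4, rng=None):
    """Add zero-mean Gaussian noise whose variance follows the
    transmission CT model: var(p_j) proportional to exp(p_j).
    Negative sinogram values are set to zero, as in the simulations."""
    rng = np.random.default_rng(rng)
    sigma = np.sqrt(np.exp(p_clean) / n0)              # std per ray
    noisy = p_clean + sigma * rng.normal(size=p_clean.shape)
    return np.maximum(noisy, 0.0)                      # clip negatives
```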
III. Results
Three algorithms have been implemented for the transmission CT noise model with extremely low counts, because in high-count situations the performance of similar algorithms is similar. Two of the three algorithms are our proposed algorithms, and the third algorithm was developed by Nuyts et al.11 The only difference among these three algorithms is the calculation of the scaling factor c_j^{(k)} in (8). In one of our algorithms, the scaling factor is calculated using the forward projection of the current image, that is,
c_j^{(k)} = \left( \sum_m a_{jm} x_m^{(k)} \right) e^{-\sum_m a_{jm} x_m^{(k)}}    (25)
In the algorithm by Nuyts et al, the scaling factor is calculated using the measured sinogram, that is,
c_j = \max(p_j, \varepsilon)\, e^{-p_j}    (26)
where the purpose of the small threshold ε = 0.01 is to prevent the scaling factor from being zero.
There is a third algorithm presented in this Results section, to show that neither of the above algorithms is optimal in the sense of lesion contrast-to-noise ratio, because the third algorithm outperforms both of them. This third algorithm was obtained by trial-and-error and is a hybrid combination of the above two methods. For this third method (referred to as the "Mix" method), the scaling factor is given as
c_j^{(k)} = \left( \sum_m a_{jm} x_m^{(k)} \right) e^{-p_j}    (27)
It is important to note that (27) cannot be changed to
c_j^{(k)} = p_j\, e^{-\sum_m a_{jm} x_m^{(k)}}    (28)
because when p_j = 0, the scaling factor (28) vanishes and the measurement p_j is never used in the image reconstruction.
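As we have written Eqs. (25)–(28), the three scaling strategies differ only in where the forward projection fp_j = (A x^{(k)})_j and the measurement p_j enter; the helper names below are ours. The rejected form (28) vanishes whenever p_j = 0, so that ray would never contribute.

```python
import numpy as np

EPS = 0.01  # threshold of Eq. (26)

def c_proposed(fp, p):      # Eq. (25): forward projection only
    return fp * np.exp(-fp)

def c_nuyts(fp, p):         # Eq. (26): measured sinogram only
    return np.maximum(p, EPS) * np.exp(-p)

def c_mix(fp, p):           # Eq. (27): forward-projection amplitude,
    return fp * np.exp(-p)  # measured-sinogram exponential

def c_rejected(fp, p):      # Eq. (28): zero whenever p_j = 0
    return p * np.exp(-fp)
```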
These three algorithms gave almost the same images, as shown in Figure 2, all at iteration number 50. However, the differences in performance are better revealed by the CRC curves for each lesion, shown in Figures 3, 4, 5, 6, and 7, respectively. These curves are produced by varying the number of iterations. The "good points" in the curves are close to the upper left corner, where the contrast is high and the noise is low. The "bad points" in the figures are close to the lower right corner, where the contrast is low and the noise is high. We observe that our proposed emission-EM look-alike method outperforms the Nuyts method, and the mixed method is the best of the three, in terms of lesion contrast-to-noise ratio.
Figure 2.
Images reconstructed from sinogram with transmission CT noise model. Left image: 25 iterations of the proposed algorithm. Right image: 25 iterations of the algorithm by Nuyts et al. Middle image: 25 iterations of our “mix” method expressed by (27). Each image is displayed from its own minimum to its own maximum.
Figure 3.
Contrast recovery coefficient (CRC) vs. normalized noise standard deviation curves for three algorithms. The contrast is measured at the largest hot lesion. The Mix method gives the best performance and the Nuyts gives the poorest performance.
Figure 4.
Contrast recovery coefficient (CRC) vs. normalized noise standard deviation curves for three algorithms. The contrast is measured at the mid-size hot lesion. The Mix method gives the best performance and the Nuyts gives the poorest performance.
Figure 5.
Contrast recovery coefficient (CRC) vs. normalized noise standard deviation curves for three algorithms. The contrast is measured at the small hot lesion. The Mix method gives the best performance and the Nuyts gives the poorest performance.
Figure 6.
Contrast recovery coefficient (CRC) vs. normalized noise standard deviation curves for three algorithms. The contrast is measured at the small cold lesion. The Mix method gives the best performance and the Nuyts gives the poorest performance.
Figure 7.
Contrast recovery coefficient (CRC) vs. normalized noise standard deviation curves for three algorithms. The contrast is measured at the mid-size cold lesion. The Mix method gives the best performance and the Nuyts gives the poorest performance.
Convergence rate is an essential property of an iterative algorithm. Figure 8 shows the learning curves of the three algorithms. A learning curve is the common logarithm of the Euclidean norm of the discrepancy, defined as the difference between the measurements and the forward projection values, plotted as a function of the iteration number. The shape of the learning curve represents the convergence rate of an iterative algorithm. It is observed from Figure 8 that all three algorithms have approximately the same convergence rate. This observation is expected, because the three scaling factors have essentially the same magnitude; the differences are small fluctuations, which do not affect the convergence rate as measured in the sinogram domain. Due to noise, the sinogram-domain discrepancy does not reduce to zero, and the converged discrepancy value is different for different scaling strategies.
Figure 8.
Learning curves for three algorithms. The three algorithms have approximately the same convergence rate.
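The learning curve described above can be computed as follows; for brevity this sketch iterates the no-weighting family member (11), and all names are our own.

```python
import numpy as np

def learning_curve(A, p, x0, n_iter, eps=1e-12):
    """Return log10 ||p - A x^(k)|| per iteration (the learning curve),
    using the no-weighting family member, Eq. (11)."""
    x, curve = x0.copy(), []
    for _ in range(n_iter):
        fp = A @ x                                       # forward projection
        curve.append(np.log10(np.linalg.norm(p - fp)))   # discrepancy norm
        x = x * (A.T @ p) / np.maximum(A.T @ fp, eps)    # Eq. (11) update
    return np.array(curve), x
```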
Our algorithm has the same convergence rate as that of the other group, and our algorithm provides better contrast-to-noise ratio for lesion detection.
The emission-EM lookalike algorithms use small and conservative update step sizes. They use the multiplicative image update scheme x_i^{(k+1)} = x_i^{(k)} f_i^{(k)}, where f_i^{(k)} is the multiplicative update factor; therefore they can be sped up by raising the factor to a power, x_i^{(k+1)} = x_i^{(k)} (f_i^{(k)})^{\gamma}, with a proper γ > 1. This speedup strategy may destroy a good property of the emission EM algorithm, which conserves the total forward-projection counts at every iteration. One remedy is to re-normalize the image according to the total count at every iteration.19,20
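A hedged sketch of this speedup: raise the emission EM multiplicative factor to a power γ > 1, then renormalize the image so that the total forward-projection count matches the total measured count, following the remedy above; γ = 1.5 and all names are our illustrative choices.

```python
import numpy as np

def em_update_accelerated(x, A, p, gamma=1.5, eps=1e-12):
    """Emission EM update with the multiplicative factor raised to
    gamma > 1 (a bigger step), followed by a renormalization that
    restores sum_j (A x)_j = sum_j p_j (total count conservation)."""
    fp = A @ x
    sens = np.maximum(A.T @ np.ones(len(p)), eps)      # sum_j a_ji
    factor = (A.T @ (p / np.maximum(fp, eps))) / sens  # EM update factor
    x_new = x * factor ** gamma                        # accelerated step
    fp_new = A @ x_new
    return x_new * (p.sum() / max(fp_new.sum(), eps))  # count renormalization
```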
IV. Discussion and conclusions
The emission EM algorithm has wide applications in nuclear medicine due to its good noise control property, intrinsic non-negativity constraint, and ease of implementation. The emission EM algorithm is a nonlinear algorithm and its convergence rate is non-stationary. If written in the additive update form, it is easy to observe that the step size of the algorithm depends on the image intensity: it converges faster for bright objects and slower for dark objects. Its noise weighting factor is estimated from the previous iteration of the image solution, while many other algorithms (such as the similar algorithm developed by Nuyts et al11) use the sinogram data to calculate the weighting factors.
All these properties mentioned above are preserved by our newly suggested emission-EM look-alike algorithms, which are developed in an ad hoc manner. For any noise variance function, an emission-EM look-alike algorithm can be easily devised by following the steps in Section II.
One important application is the algorithm for transmission tomography. According to our computer simulations, the proposed method's convergence rate is comparable to that of the algorithm by Nuyts et al, and the proposed method's lesion contrast-to-noise ratio is better. The images reconstructed by the Nuyts method are noisier, possibly because noise propagates from the scaling factor into the reconstructed image. In fact, Nuyts et al already recognized this problem and suggested a remedy of applying a lowpass filter to smooth the scaling factor. Our proposed method does not need this lowpass filter.
We do not claim that the emission-EM look-alike algorithms are optimal. The "mix" algorithm implemented in the Results section, whose scaling factor is a hybrid combination of the forward projection of the previous-iteration image and the sinogram measurement, actually gives better results. Therefore, finding the optimal noise weighting strategy is still an open research problem.
Acknowledgments
Research reported in this publication was supported by the National Institute Of Biomedical Imaging And Bioengineering of the National Institutes of Health under Award Number R15EB024283. The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Institutes of Health.
Footnotes
The authors have no conflicts to disclose.
References
1. Dempster AP, Laird NM, Rubin DB. Maximum likelihood from incomplete data via the EM algorithm. Journal of the Royal Statistical Society: Series B. 1977;39(1):1–38.
2. Richardson WH. Bayesian-based iterative method of image restoration. JOSA. 1972;62(1):55–59. doi:10.1364/JOSA.62.000055
3. Lucy LB. An iterative technique for the rectification of observed distributions. Astronomical Journal. 1974;79(6):745–754. doi:10.1086/111605
4. Shepp LA, Vardi Y. Maximum likelihood reconstruction for emission tomography. IEEE Trans Med Imag. 1982;1:113–122. doi:10.1109/TMI.1982.4307558
5. Lange K, Carson R. EM reconstruction algorithms for emission and transmission tomography. J Comp Ass Tomogr. 1984;8:302–316.
6. Lange K, Fessler JA. Globally convergent algorithms for maximum a posteriori transmission tomography. IEEE Trans Imag Proc. 1995;4:1430–1438. doi:10.1109/83.465107
7. Kent JT, Wright C. Some suggestions for transmission tomography based on the EM algorithm. In: Barone MP, Frigessi A, editors. Stochastic Models, Statistical Methods, and Algorithms in Image Analysis. Lecture Notes in Statistics, vol 74. New York: Springer; 1992:219–232.
8. Browne JA, Holmes TJ. Developments with maximum likelihood x-ray computed tomography. IEEE Trans Med Imag. 1992;12:40–52. doi:10.1109/42.126909
9. Ollinger JM, Johns G. The use of maximum a-posteriori and maximum likelihood transmission images for attenuation correction in PET. Proc IEEE Nucl Sci Symp Med Imag Conf. 1992;2:1185–1187.
10. Ollinger JM. Maximum likelihood reconstruction of transmission images in emission computed tomography via the EM algorithm. IEEE Trans Med Imag. 1994;13:89–101. doi:10.1109/42.276147
11. Nuyts J, Michel C, Dupont P. Maximum-likelihood expectation-maximization reconstruction of sinograms with arbitrary noise distribution using NEC-transformations. IEEE Trans Med Imag. 2001;20(5):365–375. doi:10.1109/42.925290
12. Lange K, Fessler JA. Globally convergent algorithms for maximum a posteriori transmission tomography. IEEE Trans Imag Proc. 1995;4(10):1430–1438. doi:10.1109/83.465107
13. Fessler JA, Ficaro EP, Clinthorne NH, Lange K. Grouped-coordinate ascent algorithms for penalized-likelihood transmission image reconstruction. IEEE Trans Med Imag. 1997;16(2):166–175. doi:10.1109/42.563662
14. Yu Z, Thibault JB, Bouman CA, Sauer KD, Hsieh J. Fast model-based X-ray CT reconstruction using spatially nonhomogeneous ICD optimization. IEEE Trans Imag Proc. 2011;20(1):161–175. doi:10.1109/TIP.2010.2058811
15. Mihlin A, Levin CS. An expectation maximization method for joint estimation of emission activity distribution and photon attenuation map in PET. IEEE Trans Med Imag. 2017;36(1):214–224. doi:10.1109/TMI.2016.2602339
16. Zeng GL, Wang W. Noise weighting with an exponent for transmission CT. Biomedical Physics & Engineering Express. 2016;2:045004. doi:10.1088/2057-1976/2/4/045004
17. Kaipio J, Somersalo E. Statistical inverse problems: discretization, model reduction and inverse crimes. Journal of Computational and Applied Mathematics. 2007;198(2):493–504.
18. Liow JS, Strother SC. Practical tradeoffs between noise, quantitation, and number of iterations for maximum likelihood-based reconstructions. IEEE Trans Med Imaging. 1991;10:563–571. doi:10.1109/42.108591
19. Hwang DS, Zeng GL. Convergence study of an accelerated ML-EM algorithm using bigger step size. Phys Med Biol. 2006;51:237–252. doi:10.1088/0031-9155/51/2/004
20. Tanaka E, Nohara N, Tomitani T, Yamamoto M. Utilization of non-negativity constraints in reconstruction of emission tomograms. Proc 9th Conference on Information Processing in Medical Imaging. 1985:379–393.