Abstract
In medical imaging, it is widely recognized that image quality should be objectively evaluated based on performance in clinical tasks. To evaluate performance in signal-detection tasks, the ideal observer (IO) is optimal but also challenging to compute in clinically realistic settings. Markov Chain Monte Carlo (MCMC)-based strategies have demonstrated the ability to compute the IO using pre-computed projections of an anatomical database. To evaluate image quality in clinically realistic scenarios, the observer performance should be measured for a realistic patient distribution. This implies that the anatomical database should also be derived from a realistic population. In this manuscript, we propose to advance the MCMC-based approach towards achieving these goals. We then use the proposed approach to study the effect of anatomical database size on IO computation for the task of detecting perfusion defects in simulated myocardial perfusion SPECT images. Our preliminary results provide evidence that the size of the anatomical database affects the computation of the IO.
Keywords: Image-quality evaluation, SPECT, Markov chain Monte Carlo
1. INTRODUCTION
There is wide recognition in the medical imaging community that image quality should be evaluated based on performance in clinical tasks, such as detection and estimation [1]. Detection tasks are performed by an observer, who decides whether the signal to be detected is present or absent. The term observer typically brings to mind a trained radiologist, but an observer can also be mathematical, in which case it is referred to as a model observer [2]. Of the various observers, the one that uses all the statistical information available regarding the task to maximize task performance is referred to as the ideal observer (IO) [3]. This observer provides the best possible performance on the detection task, as quantified by the area under the receiver operating characteristic (ROC) curve (AUC). The use of the IO is recommended for optimizing system instrumentation and acquisition protocols to ensure that the measured data carry the maximum possible information for the detection task [1, 4]. However, the IO requires complete knowledge of the distribution of the image data. Such a distribution is very high dimensional and thus very difficult, if not impossible, to obtain. To address this challenge, various approaches have been proposed [5, 6], but computing IO performance in clinically realistic settings remains difficult. Thus, there is an important need for strategies to compute the IO in clinically realistic scenarios.
A seminal contribution towards IO computation was a Markov Chain Monte Carlo (MCMC)-based technique proposed by Kupinski et al. [7]. The technique demonstrated the ability to compute the IO for parametric object models, in particular, a lumpy background model. This technique was later advanced for a parametric description of the human cardiac region [8]. To improve the efficiency of the IO computation, projection data of an anatomical database were precomputed. To increase the realism of this approach, Ghaly et al. [9] proposed a strategy that used anthropomorphic extended cardiac and torso (XCAT) phantoms [10] that simulated the anatomical variability in patient populations. The above studies focused on IO computation for a patient population with anatomical parameters sampled from a uniform distribution. However, in clinical settings, anatomical parameters are unlikely to be uniformly distributed. For example, patient heights typically follow normal distributions [11]. For accurate computation of the IO, and in general for rigorous image-quality evaluation studies, the anatomical variability needs to be sampled from realistic patient populations. Thus, there is a need to advance these approaches so that anatomical parameters can be sampled from non-uniform distributions.
Another important consideration in using MCMC-based approaches for IO computation is the size of the anatomical database. He et al. [8] indicated that this size may affect the calculation of the IO. Further, they suggested that the number of anatomies required to accurately compute the IO may be a function of system resolution. Thus, the effect of the size of the anatomical database on the computation of the IO also needs to be investigated.
In this paper, we advance the previously developed MCMC-based methods towards the goal of performing IO computation with clinically realistic populations. In this proof-of-concept study, we show the ability of the proposed method to sample from an anatomical database in which the distributions of height and heart size are described by clinically realistic normal distributions. We then validate the use of this method to compute the IO performance. We further use the proposed method to investigate the effect of the size of the anatomical database on the accuracy of the IO computation. We implement and evaluate the method in the context of computing the IO for the task of detecting cardiac defects from myocardial perfusion SPECT images.
2. THEORY
Consider a myocardial perfusion SPECT system imaging a radiotracer distribution, denoted by the infinite-dimensional vector f, that, we assume, lies in the Hilbert space $\mathbb{L}_2(\mathbb{R}^3)$. Denote the SPECT imaging system by the Hilbert space operator $\mathcal{H}$. Next, denote the projection data obtained by the SPECT system by the M-dimensional vector g, that, we assume, lies in the Hilbert space $\mathbb{E}^M$. Thus the SPECT system can be described by the mapping $\mathcal{H}: \mathbb{L}_2(\mathbb{R}^3) \rightarrow \mathbb{E}^M$, and the imaging-system equation is given by
$$g = \mathcal{H} f + n, \tag{1}$$
where n denotes the M-dimensional noise vector. We write the object f as a combination of the signal of interest, denoted by fs, and the rest of the object (referred to as the background object), denoted by fb. Thus
$$f = f_s + f_b. \tag{2}$$
In a clinical setting, fs could denote the region with abnormal uptake, such as a lesion, while the anatomical and physiological variability in the rest of the patient would be denoted by fb. Finally, let the noise-free image corresponding to the background object be denoted by b, i.e., $b = \mathcal{H} f_b$.
In a detection task performed on the projection data g, the goal is to determine, from the projection data, whether the underlying signal of interest is present or absent. Let pr(g | H1) and pr(g | H0) denote the probability distribution function of the image given that the signal is present and absent, respectively. To compute the IO, the following test statistic, referred to as the likelihood ratio, and denoted by Λ(g), is computed:
$$\Lambda(g) = \frac{\mathrm{pr}(g \mid H_1)}{\mathrm{pr}(g \mid H_0)}. \tag{3}$$
From this equation, we observe that computing the IO requires complete knowledge of the distribution of g under both the signal-present and signal-absent hypotheses.
When the to-be-detected signal fs and background fb are both known exactly, this test statistic can be computed using the knowledge of the noise statistics. We denote this test statistic by ΛBKE(g | b), where the argument b makes the dependence on the known background explicit. However, in a clinically realistic setting, where both the signal and the background vary, the probabilities in Eq. (3) can be difficult, if not impossible, to define. A simplification is obtained when we consider the detection task where the properties of the signal are known, while the rest of the patient properties are variable. For this detection task, referred to as the signal-known-exactly/background-known-statistically (SKE/BKS) task, the expression for the IO is given by [7]:
$$\Lambda(g) = \int \mathrm{d}b \; \Lambda_{\mathrm{BKE}}(g \mid b)\, \mathrm{pr}(b \mid g, H_0). \tag{4}$$
If we could sample backgrounds from the posterior distribution pr(b|g, H0), then this integral can be computed through a Monte Carlo integration procedure:
$$\hat{\Lambda}(g) = \frac{1}{J} \sum_{j=1}^{J} \Lambda_{\mathrm{BKE}}(g \mid b_j), \tag{5}$$
where bj denotes the jth of J realizations of the background. However, b is a very high-dimensional vector (on the order of $128^3$ elements for SPECT), so sampling from this distribution is challenging. An alternative is to consider a parametric representation of the background [7, 8], but human anatomies are challenging to represent through parametric models. To address this issue, Ghaly et al. [9] proposed the use of XCAT phantoms. To the best of our knowledge, their anatomical database size was limited to Nd = 54, whereas larger databases may be needed to conduct rigorous image-quality evaluation studies. Moreover, in their approach, the anatomical parameters were sampled from a uniform distribution, which, as mentioned above, has limitations in modeling realistic populations.
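The Monte Carlo average in Eq. (5) can be illustrated with a minimal Python sketch. It assumes independent Poisson noise in each projection bin, so that the BKE likelihood ratio has a closed form, and it assumes that the noise-free signal projection s and strictly positive background samples are already available. The function names `log_lambda_bke` and `lambda_mc` are hypothetical and are not taken from the cited works.

```python
import numpy as np

def log_lambda_bke(g, b, s):
    """Log of the BKE likelihood ratio for independent Poisson bins.

    g : measured counts per bin
    b : noise-free background mean per bin (assumed > 0)
    s : noise-free signal mean per bin (assumed b + s > 0; s may be negative
        for a cold defect)
    """
    g, b, s = (np.asarray(x, dtype=float) for x in (g, b, s))
    # log[ Poisson(g; b + s) / Poisson(g; b) ] summed over bins
    return float(np.sum(g * np.log1p(s / b) - s))

def lambda_mc(g, backgrounds, s):
    """Average the BKE likelihood ratio over background samples b_j."""
    log_terms = np.array([log_lambda_bke(g, b, s) for b in backgrounds])
    m = log_terms.max()  # factor out the largest term for numerical stability
    return float(np.exp(m) * np.mean(np.exp(log_terms - m)))

# Toy usage with made-up numbers: three background samples, four bins.
rng = np.random.default_rng(0)
backgrounds = rng.uniform(5.0, 10.0, size=(3, 4))
signal = np.full(4, -0.5)               # weak cold signal
counts = rng.poisson(backgrounds[0])    # signal-absent data
print(lambda_mc(counts, backgrounds, signal))
```

In the actual method the background samples would come from the Markov chain targeting pr(b | g, H0), not from an arbitrary generator as in this toy usage.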
In this study, we advance the above MCMC-based approach to sample from a more realistic patient population distribution. Similar to He et al. [8], we parameterize the anatomy of each sample of the patient population using two parameters, namely the radius of the left ventricle (LV) of the heart and the body size. These are denoted by θh and θb, respectively. The activity uptake in the different organs, including the heart, liver and lungs, is parameterized by a q-dimensional vector θact. Denote θ = {θact, θh, θb} and the ith element of θ in iteration j by $\theta_i^{j}$. Mathematically, at each iteration, we sample from a one-dimensional proposal distribution given by:
$$\tilde{\theta}_i \sim \mathcal{N}(\theta_i^{j}, \sigma_i^2), \tag{6}$$
where $\mathcal{N}(\theta_i^{j}, \sigma_i^2)$ denotes a normal distribution with mean $\theta_i^{j}$ and standard deviation σi. Further, $\tilde{\theta}_i$ denotes the newly proposed sample of the ith parameter. Note that for the jth iteration,
$$\tilde{\theta}_k = \theta_k^{j} \quad \text{for all } k \neq i, \tag{7}$$
$$i \sim \mathcal{U}\{1, q+2\}, \tag{8}$$
where $\mathcal{U}\{1, q+2\}$ denotes sampling from a discrete uniform distribution over the range 1 to q + 2 with a step size of 1. As the anatomical database contains only discrete values of LV radius and body size, the anatomy proposed at each iteration is chosen such that its LV radius and body size are closest to the proposed LV radius and body size. This strategy provides a way to guide the sampling process through a large phantom population and allows the prior information about the LV radius and body size to be accounted for in the calculation of the acceptance ratio.
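The proposal step described above can be sketched as follows. This is a minimal illustration and not the authors' implementation. The text does not specify the metric used to find the "closest" anatomy, so the sketch assumes a Euclidean distance in which each axis is scaled by its proposal standard deviation. The function name `propose` and the array layout (activities first, then LV radius, then body size) are also assumptions.

```python
import numpy as np

def propose(theta, sigma, db_radius, db_size, rng):
    """Perturb one randomly chosen parameter and snap to the nearest anatomy.

    theta     : length q+2 array [activities..., LV radius, body size]
    sigma     : length q+2 array of proposal standard deviations
    db_radius : LV radius of every anatomy in the database
    db_size   : body size of every anatomy in the database
    Returns (theta_new, k, i): proposed parameters, index of the matched
    anatomy, and 0-based index of the perturbed parameter.
    """
    theta = np.asarray(theta, dtype=float)
    db_radius = np.asarray(db_radius, dtype=float)
    db_size = np.asarray(db_size, dtype=float)
    theta_new = theta.copy()
    # Pick one of the q+2 parameters uniformly at random.
    i = int(rng.integers(0, theta.size))
    # One-dimensional Gaussian proposal centred on the current value.
    theta_new[i] = rng.normal(theta[i], sigma[i])
    # Assumed metric: scaled squared distance in (radius, size) space.
    d2 = ((db_radius - theta_new[-2]) / sigma[-2]) ** 2 \
        + ((db_size - theta_new[-1]) / sigma[-1]) ** 2
    k = int(np.argmin(d2))
    # Replace the proposed anatomy values by those of the matched anatomy.
    theta_new[-2], theta_new[-1] = db_radius[k], db_size[k]
    return theta_new, k, i

# Toy usage: q = 2 activities and a database of three anatomies.
rng = np.random.default_rng(0)
db_r = np.array([3.0, 3.5, 4.0])
db_s = np.array([30.0, 35.0, 40.0])
theta0 = np.array([1.0, 2.0, 3.5, 35.0])
sigma0 = np.array([0.1, 0.1, 0.2, 2.0])
print(propose(theta0, sigma0, db_r, db_s, rng))
```

Snapping the proposed LV radius and body size to the matched anatomy keeps the chain on values that actually exist in the database, which is one way to read the description above.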
3. IMPLEMENTATION AND EVALUATION OF THE PROPOSED APPROACH
3.1. Anthropomorphic phantom model and projection-data generation
We used the XCAT phantom [10] for generating patient populations. To obtain realistic variability in the phantom population, the body size, LV length and radius, and body height were sampled from independent Gaussian distributions, with means and variances based on the Emory PET dataset, as listed in Ghaly et al. [12]. For each anatomy, we considered only the transaxial slice that contained the centroid of the defect in the defect-present case. The phantom was generated over a 256 × 256 grid with a pixel size of 2 mm. The defect-present case was generated by introducing a cold defect with 25% severity and 10% extent in the anterior wall. A total of 10,000 pairs (defect present/absent) of anatomies were generated, 5000 pairs each for male and female patients.
Projection data for the phantom population were generated with a simulated 2-D SPECT system that modeled the collimator-detector response and noise in SPECT systems. The projection data were acquired in 64 projection bins and at 3 projection angles. The size of each projection bin was 8 mm. As in [8] and [9], we pre-calculated the organ-specific projections of the 10,000 anatomies for the defect-absent case. The defect-only projection was also computed, for use in the SKE/BKS task and to generate the defect-present projections used in the observer study. The organs considered were the heart, lung, liver and the background. This strategy of obtaining organ-specific projections substantially reduced the computational requirements of the MCMC method.
3.2. Generating the population for observer study
We used the generated projection data from the 10,000 anatomies and the defect-only projection for each anatomy to generate 2000 pairs of projection data, g, using a similar strategy as in [8]. In brief, for organ i and anatomy index k, denote the noise-free projection data as bi,k. Denote the LV radius and body size associated with anatomy index k as θh,k and θb,k, respectively. Then the noise-free phantom projection data is given by
$$b(\theta_{\mathrm{act}}, \theta_{h,k}, \theta_{b,k}) = \sum_{i=1}^{q} [\theta_{\mathrm{act}}]_i \, b_{i,k}. \tag{9}$$
We refer to this dataset as the test set for computing IO performance.
We scaled the background projection data such that, on average, the total number of counts in the projection data was 5000. To obtain the final projection data, g, we added Poisson noise to the scaled projection data. For each anatomy, we sampled the organ activities from the distributions described in [12].
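The generation of the test set can be sketched in Python as below. The sketch assumes that unit-activity organ projections are stored in an array of shape (anatomies, organs, bins), that the defect-only projection is added to the background before scaling, and that a single global scale factor sets the population-average total counts. These layout and scaling choices, along with the function names, are illustrative assumptions and are not details confirmed by the text.

```python
import numpy as np

def noise_free_backgrounds(organ_proj, activity):
    """Activity-weighted sum of organ projections for every anatomy.

    organ_proj : (n_anat, n_organs, n_bins) unit-activity projections
    activity   : (n_anat, n_organs) organ activities
    """
    return np.einsum("ko,kob->kb", activity, organ_proj)

def simulate_pairs(organ_proj, defect_proj, activity, mean_counts, rng):
    """Return noisy defect-absent and defect-present data and the scaled means."""
    b = noise_free_backgrounds(organ_proj, activity)
    # One global factor so that the average total background count is mean_counts.
    scale = mean_counts / b.sum(axis=1).mean()
    mean_absent = scale * b
    # A cold defect has a negative projection; clip to keep Poisson means valid.
    mean_present = np.clip(scale * (b + defect_proj), 0.0, None)
    return rng.poisson(mean_absent), rng.poisson(mean_present), mean_absent

# Toy usage with synthetic arrays: 20 anatomies, 4 organs, 64 bins x 3 angles.
rng = np.random.default_rng(1)
organ_proj = rng.uniform(0.5, 1.5, size=(20, 4, 192))
activity = rng.uniform(1.0, 2.0, size=(20, 4))
defect_proj = -0.05 * organ_proj[:, 0, :]   # made-up cold defect in organ 0
g_absent, g_present, mean_absent = simulate_pairs(
    organ_proj, defect_proj, activity, 5000, rng)
print(g_absent.shape, mean_absent.sum(axis=1).mean())
```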
3.3. The MCMC simulation
We initialized the activity parameters by randomly sampling from the distribution given in [12]. The initial anatomy parameter was sampled uniformly from the anatomical database. Then, at each iteration of the MCMC, we randomly picked a single parameter to modify. For proposing a new activity parameter [θact]i, we used a Gaussian proposal density with standard deviation set to one-tenth of the standard deviation of the prior distribution of that organ activity. However, to sample the anatomy parameter k, we made use of the distribution of LV radius and body size. Based on the previous iteration's anatomy $k^{j}$, we obtained the corresponding LV radius and body size. For proposing a new LV radius, we set the proposal density to a Gaussian distribution whose standard deviation was one-fifth of the standard deviation of the clinically observed LV-radius distribution. A similar strategy was used for proposing a new body size. Using these proposed parameters, we selected the anatomy whose LV radius and body size were the closest match. As the proposal density is symmetric, we accepted the newly proposed parameter set $\tilde{\theta}$ over the previous parameter set $\theta^{j}$ with probability
$$\alpha = \min\left\{1, \frac{\mathrm{pr}(g \mid \tilde{\theta}, H_0)\, \mathrm{pr}(\tilde{\theta})}{\mathrm{pr}(g \mid \theta^{j}, H_0)\, \mathrm{pr}(\theta^{j})}\right\}. \tag{10}$$
Note that the prior probabilities can again be modeled by Gaussian distributions derived from the patient data.
Challenges in implementing the MCMC technique include cases where some projection bins of the noise-free projection data for the proposed or the previous parameter set are null. This can be addressed by discarding the contribution of such bins when calculating the acceptance probability. To obtain a reliable IO estimate, we also discard the initial 400,000 burn-in iterations, over which the ΛBKE estimates were observed to be non-stationary.
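The acceptance test, including the handling of null bins, can be sketched as follows. The sketch assumes a Poisson likelihood for the projection data under H0 and independent Gaussian priors on the parameters, and it works in the log domain to avoid underflow. The function names are hypothetical.

```python
import numpy as np

def gaussian_log_prior(theta, mean, std):
    """Log of an independent Gaussian prior, up to an additive constant."""
    theta, mean, std = (np.asarray(x, dtype=float) for x in (theta, mean, std))
    return float(-0.5 * np.sum(((theta - mean) / std) ** 2))

def log_acceptance(g, b_new, b_old, log_prior_new, log_prior_old):
    """Log acceptance probability for a symmetric proposal.

    g            : measured counts per bin
    b_new, b_old : noise-free projections for the proposed and previous
                   parameter sets
    Bins that are null in either projection are excluded from the ratio.
    """
    g, b_new, b_old = (np.asarray(x, dtype=float) for x in (g, b_new, b_old))
    valid = (b_new > 0) & (b_old > 0)
    # Poisson log-likelihood ratio; the log(g!) terms cancel.
    log_like_ratio = np.sum(
        g[valid] * (np.log(b_new[valid]) - np.log(b_old[valid]))
        - (b_new[valid] - b_old[valid]))
    return min(0.0, float(log_like_ratio) + log_prior_new - log_prior_old)

def accept(log_alpha, rng):
    """Draw the accept/reject decision."""
    return bool(rng.random() < np.exp(log_alpha))

# Toy usage with made-up numbers.
rng = np.random.default_rng(2)
g = np.array([12.0, 9.0, 0.0])
la = log_acceptance(g, np.array([11.0, 10.0, 0.0]),
                    np.array([10.0, 10.0, 0.0]), -0.3, -0.5)
print(la, accept(la, rng))
```

Because the prior enters only through a ratio, its normalizing constant can be omitted, which is why `gaussian_log_prior` returns the log density only up to a constant.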
4. RESULTS
In Fig. 1(a), we show the convergence of the IO estimate when the size of the anatomical database, Nd, was 10,000. We observed that the estimate converged after around 4 million iterations. We also divided the full set of iterations into consecutive blocks and show the estimate of the IO test statistic calculated from each block, as in [8]. The block size was set to 500 iterations. We observed that the block estimates follow a log-normal distribution, i.e., the log of the estimate follows a normal distribution (Fig. 1(b)).
Fig. 1.

(a) The estimate of the IO test statistic as a function of the total number of iterations used for the MCMC simulation. (b) The distribution of the block estimates of the log of the IO test statistic.
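The running estimate and the block estimates described above can be computed from a stored chain of ΛBKE values as in the following sketch. It assumes the chain is available as a one-dimensional array. The function names are hypothetical, and the default block size of 500 follows the text.

```python
import numpy as np

def running_estimate(lam_chain, burn_in):
    """Cumulative mean of the chain after discarding the burn-in samples."""
    kept = np.asarray(lam_chain, dtype=float)[burn_in:]
    return np.cumsum(kept) / np.arange(1, kept.size + 1)

def block_estimates(lam_chain, burn_in, block_size=500):
    """Mean of each consecutive, non-overlapping block after burn-in."""
    kept = np.asarray(lam_chain, dtype=float)[burn_in:]
    n_blocks = kept.size // block_size
    # Drop the incomplete trailing block, if any.
    trimmed = kept[: n_blocks * block_size]
    return trimmed.reshape(n_blocks, block_size).mean(axis=1)

# Toy usage on a synthetic log-normal chain (not real MCMC output).
rng = np.random.default_rng(3)
chain = rng.lognormal(mean=0.0, sigma=0.5, size=20_000)
blocks = block_estimates(chain, burn_in=2_000)
print(running_estimate(chain, burn_in=2_000)[-1], np.log(blocks).std())
```

A histogram of `np.log(blocks)` is one simple way to inspect whether the block estimates look approximately log-normal.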
We also validated the proposed MCMC method by comparing the empirical and theoretical values of ⟨Λ | H0⟩ and ⟨Λ | H1⟩ − Var(Λ | H0). Theoretically, both these terms should be equal to one [8]. From Fig. 2, we observed that these quantities reach 0.913 and 1.03, respectively, for Nd = 10,000. The closeness of these summary statistics to 1 provides evidence in the direction of validating this method. The minor deviation could be due to the unique nature of the distribution of the anatomical phantom population and needs further investigation. Overall, however, the results in Figs. 1 and 2 provide evidence in support of validating the MCMC technique.
Fig. 2.

(a) ⟨Λ | H0⟩ and (b) ⟨Λ | H1⟩ − Var(Λ | H0) as a function of the number of image pairs used for the observer study.
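The theoretical value of one for both summary statistics follows directly from the definition of the likelihood ratio in Eq. (3). A short derivation, written with integrals over the data space, is:

```latex
\begin{align}
\langle \Lambda \mid H_0 \rangle
  &= \int \mathrm{d}g \, \frac{\mathrm{pr}(g \mid H_1)}{\mathrm{pr}(g \mid H_0)} \, \mathrm{pr}(g \mid H_0)
   = \int \mathrm{d}g \, \mathrm{pr}(g \mid H_1) = 1, \\
\langle \Lambda \mid H_1 \rangle
  &= \int \mathrm{d}g \, \Lambda(g) \, \mathrm{pr}(g \mid H_1)
   = \int \mathrm{d}g \, \Lambda^2(g) \, \mathrm{pr}(g \mid H_0)
   = \langle \Lambda^2 \mid H_0 \rangle \\
  &= \mathrm{Var}(\Lambda \mid H_0) + \langle \Lambda \mid H_0 \rangle^2
   = \mathrm{Var}(\Lambda \mid H_0) + 1 .
\end{align}
```

Hence ⟨Λ | H1⟩ − Var(Λ | H0) = 1, so both quantities serve as consistency checks on the estimated test statistic.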
We next investigated the effect of the size of the anatomical database on IO computation using the MCMC technique. For this purpose, we varied the size of the anatomical database and, for each setting, computed the IO performance. The AUC value as a function of anatomical database size is shown in Fig. 3. We observed that increasing the size of the anatomical database resulted in an increase in the AUC value. This suggests that a large anatomical database is required to obtain a reliable estimate of the IO performance through the MCMC approach for anthropomorphic phantoms.
Fig. 3.

AUC values as a function of size of anatomical database. The error bar indicates 95% confidence interval.
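Given the IO test statistics for the defect-present and defect-absent images, the AUC can be estimated nonparametrically, as in the sketch below. The text does not state how the AUC or its confidence interval was computed, so both the Mann-Whitney estimator and the percentile bootstrap used here are assumptions, and the function names are hypothetical.

```python
import numpy as np

def empirical_auc(lam_present, lam_absent):
    """Mann-Whitney estimate of the AUC; ties count as one half."""
    p = np.asarray(lam_present, dtype=float)[:, None]
    a = np.asarray(lam_absent, dtype=float)[None, :]
    return float(np.mean((p > a) + 0.5 * (p == a)))

def bootstrap_ci(lam_present, lam_absent, n_boot=1000, level=0.95, seed=0):
    """Percentile bootstrap interval for the AUC (assumed CI method)."""
    rng = np.random.default_rng(seed)
    p = np.asarray(lam_present, dtype=float)
    a = np.asarray(lam_absent, dtype=float)
    # Resample each class independently with replacement.
    aucs = [empirical_auc(rng.choice(p, p.size), rng.choice(a, a.size))
            for _ in range(n_boot)]
    lo, hi = np.quantile(aucs, [(1 - level) / 2, (1 + level) / 2])
    return float(lo), float(hi)

# Toy usage on synthetic test statistics (not results from the study).
rng = np.random.default_rng(4)
lam_h1 = rng.lognormal(0.3, 0.5, size=200)
lam_h0 = rng.lognormal(0.0, 0.5, size=200)
print(empirical_auc(lam_h1, lam_h0), bootstrap_ci(lam_h1, lam_h0, n_boot=200))
```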
5. DISCUSSIONS AND CONCLUSION
In this paper, an MCMC-based method was proposed in the context of computing IO performance in clinically realistic settings. We observed that the proposed method operates with the anthropomorphic XCAT phantoms and is able to sample the anatomical descriptors of this phantom from a clinically realistic normal distribution. Our investigation of the effect of the size of the anatomical database on IO computation with this method showed that a large anatomical database is required to accurately compute the IO performance. These findings thus suggest the use of approaches that can generate such large databases.
The proposed approach has multiple applications in addition to computing the IO. These include generating phantom populations for virtual clinical trials and other image-quality evaluation studies [13, 14, 15, 16, 17]. Another application is in simulation-guided deep-learning approaches, which offer the advantage of patient populations with known ground truth that can be used for training [18]. The proposed method may be used to generate such a patient population.
In conclusion, we advanced an MCMC-based strategy with the eventual goal of sampling from a clinically realistic patient population. We then demonstrated the application of this strategy to sample from an anatomical database with height and heart sizes described by clinically realistic normal distributions. The strategy was validated in the context of evaluating IO performance for defect-detection tasks in myocardial perfusion SPECT. Our analysis provides evidence that a large anatomical database is needed to reliably compute IO performance.
7. ACKNOWLEDGMENTS
This work was supported by the National Institute of Biomedical Imaging and Bioengineering of the National Institutes of Health under grants R21-EB024647, R01 EB031051, and R56 EB028287.
Footnotes
COMPLIANCE WITH ETHICAL STANDARDS
This is a numerical simulation study for which no ethical approval was required.
8. REFERENCES
- [1] Barrett Harrison H, Myers Kyle J, Hoeschen Christoph, Kupinski Matthew A, and Little Mark P, “Task-based measures of image quality and their relation to radiation dose and patient risk,” Phys. Med. Biol., vol. 60, no. 2, pp. R1, 2015.
- [2] Barrett Harrison H, Yao Jie, Rolland Jannick P, and Myers Kyle J, “Model observers for assessment of image quality,” Proceedings of the National Academy of Sciences, vol. 90, no. 21, pp. 9758–9765, 1993.
- [3] Barrett Harrison H, Abbey Craig K, and Clarkson Eric, “Objective assessment of image quality. III. ROC metrics, ideal observers, and likelihood-generating functions,” JOSA A, vol. 15, no. 6, pp. 1520–1535, 1998.
- [4] Jha Abhinav K, Myers Kyle J, Obuchowski Nancy A, Liu Ziping, Rahman Md Ashequr, Saboury Babak, Rahmim Arman, and Siegel Barry A, “Objective Task-Based Evaluation of Artificial Intelligence-Based Medical Imaging Methods: Framework, Strategies, and Role of the Physician,” PET Clinics, vol. 16, no. 4, pp. 493–511, 2021.
- [5] Clarkson Eric and Shen Fangfang, “Fisher information and surrogate figures of merit for the task-based assessment of image quality,” JOSA A, vol. 27, no. 10, pp. 2313–2326, 2010.
- [6] Jha Abhinav K, Clarkson Eric, and Kupinski Matthew A, “An ideal-observer framework to investigate signal detectability in diffuse optical imaging,” Biomed. Opt. Express, vol. 4, no. 10, pp. 2107–2123, 2013.
- [7] Kupinski Matthew A, Hoppin John W, Clarkson Eric, and Barrett Harrison H, “Ideal-observer computation in medical imaging with use of Markov-chain Monte Carlo techniques,” JOSA A, vol. 20, no. 3, pp. 430–438, 2003.
- [8] He Xin, Caffo Brian S, and Frey Eric C, “Toward realistic and practical ideal observer (IO) estimation for the optimization of medical imaging systems,” IEEE Trans. Med. Imaging, vol. 27, no. 10, pp. 1535–1543, 2008.
- [9] Ghaly Michael, Links Jonathan M, and Frey Eric, “Optimization of energy window and evaluation of scatter compensation methods in myocardial perfusion SPECT using the ideal observer with and without model mismatch and an anthropomorphic model observer,” J. Med. Imaging, vol. 2, no. 1, pp. 015502, 2015.
- [10] Segars W Paul, Sturgeon G, Mendonca S, Grimes Jason, and Tsui Benjamin MW, “4D XCAT phantom for multimodality imaging research,” Med. Phys., vol. 37, no. 9, pp. 4902–4915, 2010.
- [11] Komlos John and Han Kim Joo, “Estimating trends in historical heights,” Historical Methods: A Journal of Quantitative and Interdisciplinary History, vol. 23, no. 3, pp. 116–120, 1990.
- [12] Ghaly Michael, Du Yong, Fung George SK, Tsui Benjamin MW, Links Jonathan M, and Frey Eric, “Design of a digital phantom population for myocardial perfusion SPECT imaging research,” Phys. Med. Biol., vol. 59, no. 12, pp. 2935, 2014.
- [13] Barrett Harrison H, Furenlid Lars R, Freed Melanie, Hesterman Jacob Y, Kupinski Matthew A, Clarkson Eric, and Whitaker Meredith K, “Adaptive SPECT,” IEEE Trans. Med. Imaging, vol. 27, no. 6, pp. 775–788, 2008.
- [14] Ghanbari Nasrin, Clarkson Eric, Kupinski Matthew, and Li Xin, “Optimization of an adaptive SPECT system with the scanning linear estimator,” IEEE Trans. Rad. Plas. Med. Sci., vol. 1, no. 5, pp. 435–443, 2017.
- [15] Yu Zitong, Rahman Md Ashequr, Schindler Thomas, Laforest Richard, and Jha Abhinav K, “A physics and learning-based transmission-less attenuation compensation method for SPECT,” in Proc. SPIE Med. Imag. International Society for Optics and Photonics, 2021, vol. 11595, p. 1159512.
- [16] Liu Ziping, Mhlanga Joyce C, Laforest Richard, Derenoncourt Paul-Robert, Siegel Barry A, and Jha Abhinav K, “A Bayesian approach to tissue-fraction estimation for oncological PET segmentation,” Phys. Med. Biol., vol. 66, no. 12, pp. 124002, 2021.
- [17] Yu Zitong, Rahman Md Ashequr, Schindler Thomas, Gropler Robert, Laforest Richard, Wahl Richard, and Jha Abhinav, “AI-based methods for nuclear-medicine imaging: Need for objective task-specific evaluation,” J. Nucl. Med., vol. 61, no. supplement 1, pp. 575, 2020.
- [18] Leung Kevin H, Marashdeh Wael, Wray Rick, Ashrafinia Saeed, Pomper Martin G, Rahmim Arman, and Jha Abhinav K, “A physics-guided modular deep-learning based automated framework for tumor segmentation in PET,” Phys. Med. Biol., vol. 65, no. 24, pp. 245032, 2020.
