Author manuscript; available in PMC: 2018 Feb 11.
Published in final edited form as: Proc SPIE Int Soc Opt Eng. 2017 Mar 10;10136:101360L. doi: 10.1117/12.2255902

No-gold-standard evaluation of image-acquisition methods using patient data

Abhinav K Jha 1, Eric Frey 1
PMCID: PMC5459320  NIHMSID: NIHMS860208  PMID: 28596636

Abstract

Several new and improved modalities, scanners, and protocols, together referred to as image-acquisition methods (IAMs), are being developed to provide reliable quantitative imaging. Objective evaluation of these IAMs on the clinically relevant quantitative tasks is highly desirable. Such evaluation is most reliable and clinically decisive when performed with patient data, but that requires a gold standard, which is rarely available. While no-gold-standard (NGS) techniques have been developed to clinically evaluate quantitative imaging methods, these techniques require that each patient be scanned using all the IAMs, which is expensive, time consuming, and could lead to increased radiation dose. A more clinically practical scenario is one where different sets of patients are scanned using different IAMs. We have developed an NGS technique that compares IAMs using patient data in which different patient sets are imaged using different IAMs. The technique posits a linear relationship, characterized by a slope, bias, and noise standard-deviation term, between the true and measured quantitative values. Under the assumption that the true quantitative values have been sampled from a unimodal distribution, a maximum-likelihood procedure was developed that estimates these linear-relationship parameters for the different IAMs. Figures of merit can be estimated using these linear-relationship parameters to evaluate the IAMs on the basis of accuracy, precision, and overall reliability. The proposed technique has several potential applications, such as protocol optimization, quantifying differences in system performance, and system harmonization using patient data.

Keywords: No-gold-standard evaluation, Quantitative imaging, Imaging system evaluation

1. INTRODUCTION

Quantitative imaging, i.e. the measurement and use of numerical/statistical features from medical images to facilitate clinical decision making1,2, is finding applications in many diagnostic and therapeutic procedures. Examples include the absorbed organ doses measured using single-photon emission computed tomography (SPECT) for dosimetry3, imaging biomarkers to predict and monitor cancer response such as metabolic tumor volume and total lesion glycolysis measured from positron emission tomography (PET)4–7, tumor size measured with magnetic resonance imaging (MRI)8, apparent diffusion coefficient measured with diffusion MRI9–12, myocardial blood flow measured using PET to diagnose cardiac diseases13,14, and bone-mineral density estimated using dual x-ray absorptiometry for diagnosing osteoporosis15.

Given the variety of applications, several new and improved image-acquisition methods (IAMs) are being developed. In this context, an IAM refers to any combination of modalities, instrumentation, or protocols. However, these IAMs often have design tradeoffs that affect the reliability (i.e. accuracy and precision) of the quantitative feature measured from the image. For example, consider the task of measuring the radiotracer uptake in a tumor using a SPECT scanner for a dosimetry study. For this task, a scanner that can provide both high resolution (to accurately capture the uptake at the edges, leading to accurate measurements) and high signal-to-noise ratio (SNR) (leading to precise measurements) is ideal. To obtain high resolution, a SPECT scanner with a long fine-bore collimator is required, but such a collimator would allow only very few photons to pass through, yielding images with poor SNR, as in Fig. 1. Conversely, if a collimator with large-bore diameters is used, the resolution of the images will be degraded. Due to such tradeoffs, the choice of the best IAM is often not obvious, and studies comparing the IAMs on the quantitative task are required.

Figure 1.


Two different sets of patients scanned using different SPECT scanners. The imaging procedure is performed to measure some quantitative value about the patient (e.g. the tumor uptake or tumor volume). The objective of the NGS framework is to rank such IAMs based on their reliability in measuring the quantitative value, in the absence of any gold standard.

It is highly desirable to compare the above IAMs using patient data, since eventually these IAMs must be used with patients. However, comparing IAMs using patient data is often impossible or impractical due to the lack, imprecision, or inaccuracy of available gold standards. Thus, animal, phantom and realistic simulation studies are instead used. However, animal studies are not definitive since the organ sizes and geometries in animals are different from humans. Similarly, physical phantoms often do not model anatomy, physiology, or patient variability well, and simulations may not model some aspects of the biology or instrumentation. These limitations reduce the confidence of physicians in the output of these studies, and thus the clinical translation and acceptance of new and improved IAMs is delayed. A no-gold-standard (NGS) evaluation technique (Fig. 1) to compare the performance of IAMs on quantitative tasks with patient data is thus highly desirable.

NGS techniques have been previously developed for evaluating quantitative imaging methods. Kupinski et al16,17, in a seminal paper, proposed an NGS technique, referred to as the regression-without-truth (RWT) technique, to evaluate quantitative imaging methods in the absence of ground truth in the context of ranking different cardiac-ejection fraction methods18. The technique was used to evaluate segmentation methods for cardiac cine MR images19. We then extended the NGS technique to a larger range of quantitative imaging tasks and extensively validated the method, both using numerical experiments and by evaluating reconstruction methods for quantitative SPECT20,21 and segmentation methods for diffusion MRI22,23. Further, to address the practical difficulties in applying the NGS technique to patient data, statistical methods were developed, yielding an NGS framework24. This framework was applied to evaluate tumor-segmentation methods for PET images of patients with head-and-neck cancer24. However, each of these existing NGS techniques requires that, for each evaluated method, data be available from all the patients. Thus, using these methods to evaluate IAMs would require that all the patients be scanned using each of the different IAMs. This is expensive, time consuming, and can lead to increased radiation dose, for example in CT imaging. A more clinically practical scenario is one where different sets of patients have been scanned using different IAMs. Our objective in this study was to design an NGS technique that could use patient data, where different sets of patients were scanned using different IAMs, to objectively evaluate the different IAMs on the quantitative task.

2. METHODS

2.1 Basis of proposed approach

Consider a clinical study to measure a quantitative value that describes some underlying physical process/anatomical feature about the patient. A different set of patients is imaged using each IAM. The objective of the NGS evaluation algorithm, as illustrated by an example in Fig. 1, is to use these patient images to compute figures of merit that can evaluate the different IAMs based on how reliably they estimate the unknown true quantitative value.

Assume that the true values, denoted by $a_p$ for the $p$th patient, have been sampled from an unknown distribution. We assume that this unknown distribution is unimodal, which is reasonable, especially after the patients have been stratified according to factors such as age, disease type, and obesity. We model this true distribution using the beta distribution (parameterized by two unknown shape parameters $\alpha$ and $\beta$), which, as illustrated in Fig. 2, allows us to model different possible shapes of the true value distribution. Also, consider that the $p$th patient was scanned using the $k$th IAM, yielding the measurement $\hat{a}_{p,k}$.

Fig. 2.


A schematic illustrating that various values for α and β in a beta distribution could model different shapes for the distribution of true values.
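The flexibility of the beta family can be checked numerically. The following is a minimal sketch, assuming the true values have been rescaled to the interval (0, 1); the function names and the shape pairs are illustrative choices, not the ones plotted in Fig. 2.

```python
import math

def beta_pdf(a, alpha, beta):
    """Density of Beta(alpha, beta) at a point a in the open interval (0, 1)."""
    log_norm = math.lgamma(alpha + beta) - math.lgamma(alpha) - math.lgamma(beta)
    return math.exp(log_norm + (alpha - 1) * math.log(a) + (beta - 1) * math.log(1 - a))

def beta_mode(alpha, beta):
    """Location of the single interior peak; defined only for alpha > 1 and beta > 1."""
    if alpha <= 1 or beta <= 1:
        raise ValueError("an interior mode requires alpha > 1 and beta > 1")
    return (alpha - 1) / (alpha + beta - 2)

# Illustrative shape pairs (assumed values): symmetric, right-skewed, left-skewed.
for alpha, beta in [(2, 2), (2, 5), (5, 2)]:
    print(alpha, beta, round(beta_mode(alpha, beta), 3), round(beta_pdf(0.5, alpha, beta), 3))
```

Changing the two shape parameters moves the peak and the skew of the distribution while keeping it unimodal, which is the property the model relies on.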

The measured quantitative values are the result of a specific image-formation and quantitation process applied to the object being imaged. Thus, a statistical relationship between the true and measured values using the different IAMs is expected. Often, due to the linearity of the image-formation and quantitation process, it is justified to assume that this relationship is linear25. This linearity has been observed in realistic simulation and phantom studies in quantitative SPECT20, diffusion MRI26, DaTScan imaging27, PET imaging28, and CT imaging29,30. Also, we have observed in clinical FDG-PET data of patients with head-and-neck cancer that the tumor-volume measurements using different imaging methods are consistent with the linearity assumption24. Linearity between true and measured quantitative values is highly desirable25,29,30 since it ensures that any changes in the true value are proportionately reflected in the measured value. The assumption of linearity also simplifies the computation of precision31. For all these reasons, imaging-system designers strive for linearity. Thus, we focus on quantitative applications where the assumption of linearity between true and measured values could be justified. The linear relationship is described as follows:

$$\hat{a}_{p,k} = u_k a_p + v_k + \mathcal{N}(0, \sigma_k^2), \tag{1}$$

where $u_k$, $v_k$, and $\mathcal{N}(0, \sigma_k^2)$ denote the slope, the bias, and a normally distributed zero-mean noise term with variance $\sigma_k^2$, respectively. As Fig. 3 illustrates, under the linearity assumption, the method with the highest noise standard deviation would be considered the most imprecise, and the method with the highest bias term would be considered the most inaccurate. Further, the differences between the noise standard deviations or the biases of different IAMs could be used to compare the IAMs quantitatively. Thus, under the assumption of this linear relationship, if we could estimate these terms, we could compute figures of merit to compare the different IAMs.

Fig. 3.


A scatter plot of the true vs. measured quantitative values for three different IAMs. The corresponding linear-relationships for the three IAMs are superimposed on the scatter plot, and demonstrate that IAMs that are most noisy (Method 1) and most inaccurate (Method 3) have the highest noise standard deviation term and bias term, respectively.
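The measurement model of Eq. 1 can be made concrete with a small simulation. This is a sketch under stated assumptions: true values are drawn from a beta distribution on (0, 1), and the function name, parameter values, and seeds are hypothetical choices for illustration only.

```python
import random

def simulate_measurements(n_patients, alpha, beta, u, v, sigma, seed=0):
    """Draw true values a_p ~ Beta(alpha, beta) and apply the linear model of Eq. 1."""
    rng = random.Random(seed)
    true_vals = [rng.betavariate(alpha, beta) for _ in range(n_patients)]
    # measured value = slope * true value + bias + zero-mean Gaussian noise
    measured = [u * a + v + rng.gauss(0.0, sigma) for a in true_vals]
    return true_vals, measured

# Two hypothetical IAMs imaging different patient sets of different sizes.
_, iam_1 = simulate_measurements(120, 2.0, 5.0, u=1.0, v=0.05, sigma=0.05, seed=1)
_, iam_2 = simulate_measurements(80, 2.0, 5.0, u=0.8, v=0.00, sigma=0.10, seed=2)
print(len(iam_1), len(iam_2))
```

In a real study only the measured lists would be available; the true values are returned here solely so that a simulation can be checked against them.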

Before describing the mathematical formalism for estimating these parameters, we first provide an intuition into how these parameters can be estimated. The intuition is very similar to that provided for the RWT and existing NGS techniques in Jha et al24. First note, as illustrated in Fig. 4a, that the distribution of measured values for the $k$th IAM can be described by the parameters $\{\alpha, \beta, u_k, v_k, \sigma_k\}$.

Fig. 4.


Fig. 4a: Schematic illustrating the parameterized form for the distribution of measured values

Fig. 4b: Schematic illustrating the intuition behind how the NGS technique can estimate the model parameters

We could therefore design a technique to estimate $\{\alpha, \beta, u_k, v_k, \sigma_k\}$ that maximizes the likelihood of the data from all the measurements made with the $k$th IAM (a maximum-likelihood technique). However, we observe via numerical experiments that the problem is ill-posed, in that the terms are not estimated uniquely. These parameters become more identifiable, or uniquely defined, when we consider images from different IAMs, provided the true values for each IAM were drawn from the same underlying distribution of true values. This is because we now have different sets of independent measurements, each of which is characterized by the same parameters $\{\alpha, \beta\}$ but a different set of $\{u_k, v_k, \sigma_k\}$, as illustrated in Fig. 4b.
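One informal way to see why a single IAM constrains the parameters poorly is to look at the first two moments of the measured values. This is offered only as a heuristic, not as the authors' argument; it assumes the noise in Eq. 1 is independent of the true value and uses the standard mean and variance of a beta distribution:

```latex
\mathrm{E}[\hat{a}_{p,k}] = u_k \,\frac{\alpha}{\alpha+\beta} + v_k,
\qquad
\mathrm{Var}[\hat{a}_{p,k}] = u_k^2 \,\frac{\alpha\beta}{(\alpha+\beta)^2(\alpha+\beta+1)} + \sigma_k^2 .
```

With one IAM, these two summary statistics involve five unknowns. Each additional IAM adds three unknowns of its own but shares $\alpha$ and $\beta$ with the others. Higher-order moments also carry information, so this counting is only suggestive.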

This premise has been previously used to develop RWT and similar NGS evaluation techniques16,17,20 for cases where all patients have been imaged using all methods. A general version of this same idea can be used to perform NGS evaluation in the absence of the true values even when different sets of patients are scanned via different IAMs, as we describe in the next sub-section.

2.2 Theory

From Eq. 1, the value measured using the $k$th IAM, $\hat{a}_{p,k}$, is normally distributed with mean $u_k a_p + v_k$ and variance $\sigma_k^2$, i.e.

$$\mathrm{pr}(\hat{a}_{p,k} \mid a_p, u_k, v_k, \sigma_k) \sim \mathcal{N}(u_k a_p + v_k, \sigma_k^2), \tag{2}$$

where $\mathrm{pr}(x)$ denotes the probability density of a random variable $x$. However, the above distribution depends on $a_p$, which is not known. To circumvent this issue, we assume that $a_p$ has been sampled from a parametric distribution. We choose this distribution to be a beta distribution characterized by parameters $\Omega = \{\alpha, \beta\}$.

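Next, denoting the linear-relationship parameters of the $k$th IAM by $\Theta_k = \{u_k, v_k, \sigma_k\}$, note that the joint distribution of $\hat{a}_{p,k}$ and $a_p$ can be written as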

$$\mathrm{pr}(\hat{a}_{p,k}, a_p \mid \Theta_k, \Omega) = \mathrm{pr}(\hat{a}_{p,k} \mid a_p, \Theta_k)\, \mathrm{pr}(a_p \mid \Omega).$$

Marginalizing both sides over the random variable $a_p$ yields

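$$\mathrm{pr}(\hat{a}_{p,k} \mid \Theta_k, \Omega) = \int da_p \, \mathrm{pr}(\hat{a}_{p,k} \mid a_p, \Theta_k)\, \mathrm{pr}(a_p \mid \Omega).$$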

After the marginalization, the distribution of $\hat{a}_{p,k}$ is no longer dependent on $a_p$. Finally, under the assumption that the true values are independent of each other, the joint distribution of all the measurements, i.e. $\mathrm{pr}(\{\hat{a}_{p,k}\}, p = 1, 2, \ldots, P)$, can be written simply as the product of the individual distributions of the $\hat{a}_{p,k}$, i.e.

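$$\mathrm{pr}(\{\hat{a}_{p,k}\} \mid \{\Theta_k\}, \Omega) = \prod_{p=1}^{P} \int da_p \, \mathrm{pr}(\hat{a}_{p,k} \mid a_p, \Theta_k)\, \mathrm{pr}(a_p \mid \Omega), \tag{3}$$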

where $\prod$ denotes the product of the individual distributions over the patients, and where $\mathrm{pr}(\hat{a}_{p,k} \mid a_p, \Theta_k)$ is given by Eq. 2.
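The marginalization in Eq. 3 has no closed form in general, but it is a one-dimensional integral that can be approximated numerically. The sketch below uses a midpoint rule on (0, 1); the quadrature choice, grid size, and function names are assumptions made for illustration and are not taken from the paper.

```python
import math

def beta_pdf(a, alpha, beta):
    """pr(a_p | Omega): Beta(alpha, beta) density on the open interval (0, 1)."""
    log_norm = math.lgamma(alpha + beta) - math.lgamma(alpha) - math.lgamma(beta)
    return math.exp(log_norm + (alpha - 1) * math.log(a) + (beta - 1) * math.log(1 - a))

def normal_pdf(x, mean, sigma):
    """pr(a_hat | a_p, Theta_k) from Eq. 2."""
    z = (x - mean) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

def marginal_likelihood(a_hat, theta, omega, n_grid=2000):
    """Integrate the unknown true value out of the joint density (midpoint rule)."""
    u, v, sigma = theta
    alpha, beta = omega
    h = 1.0 / n_grid
    total = 0.0
    for i in range(n_grid):
        a = (i + 0.5) * h  # midpoints avoid the endpoints 0 and 1
        total += normal_pdf(a_hat, u * a + v, sigma) * beta_pdf(a, alpha, beta)
    return total * h

def log_likelihood(measurements_by_iam, thetas, omega):
    """Log of Eq. 3; each IAM may contribute a different number of patients."""
    return sum(math.log(marginal_likelihood(a_hat, theta, omega))
               for a_hats, theta in zip(measurements_by_iam, thetas)
               for a_hat in a_hats)
```

Because every patient contributes a single term tied to the IAM that imaged them, nothing in this computation requires the same patient to appear under more than one IAM.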

From Eq. 3, we have the likelihood of all the measurements, parameterized in terms of $\{\Theta_k\}$ and $\Omega$, and with no dependency on $a_p$. Given the likelihood, we can use the maximum-likelihood (ML) procedure to estimate $\{\Theta_k\}$ and $\Omega$, i.e.

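$$\{\{\hat{\Theta}_k\}, \hat{\Omega}\}_{\mathrm{ML}} = \underset{\{\Theta_k\},\, \Omega}{\arg\max}\; \mathrm{pr}(\{\hat{a}_{p,k}\} \mid \{\Theta_k\}, \Omega), \tag{4}$$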

where $\arg\max_y f(y)$ is the value of $y$ at which the function $f(y)$ is maximized. Note that the ML estimator has several important properties that make it well suited to estimating these parameters. For example, if an efficient estimator exists, the ML estimator is efficient, i.e. it is unbiased and attains the lower bound on the variance of any unbiased estimator, also referred to as the Cramér-Rao lower bound.

From the ML estimates of the linear-relationship parameters, we can compute different figures of merit (FoMs) as required by the clinical application. For example, the accuracy, precision, and reliability (accuracy and precision) of the different IAMs are quantified using the bias term $v_k$, the standard deviation $\sigma_k$, and the mean square error (MSE) $(v_k^2 + \sigma_k^2)$, respectively. These FoMs could be used to rank the IAMs, which is required in the context of optimizing IAM performance. They could also be used to quantify the difference in IAM performance on the basis of the bias, standard deviation, and mean square error of the quantitative values.
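As a small worked example of these FoMs, the sketch below computes the bias, standard deviation, and MSE from estimated parameters and orders the IAMs by each. The parameter values and IAM names are hypothetical, and ranking by absolute value of the FoM is an assumption made here so that negative biases are handled sensibly.

```python
def figures_of_merit(v_k, sigma_k):
    """Bias, noise standard deviation, and MSE (v_k^2 + sigma_k^2) for one IAM."""
    return {"bias": v_k, "std": sigma_k, "mse": v_k ** 2 + sigma_k ** 2}

def rank_iams(estimates, fom):
    """Return IAM names ordered from best to worst (smallest magnitude first)."""
    scores = {name: abs(figures_of_merit(v, s)[fom]) for name, (v, s) in estimates.items()}
    return sorted(scores, key=scores.get)

# Hypothetical ML estimates (v_k, sigma_k) for two IAMs.
estimates = {"IAM-A": (0.02, 0.10), "IAM-B": (-0.08, 0.04)}
for fom in ("bias", "std", "mse"):
    print(fom, rank_iams(estimates, fom))
```

In this made-up case the two IAMs swap places depending on the FoM, which is why the choice of FoM should follow the clinical application.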

2.3 Implementation

From the above derivation, the NGS evaluation problem reduces to an optimization task. The optimization is simplified by taking the logarithm of both sides of Eq. 3, which turns the product into a sum. The ML solution is then found using a quasi-Newton optimization technique (we use the interior-point algorithm in MATLAB®).
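A rough sketch of this optimization in Python is given below. It is not the authors' implementation: SciPy's bounded L-BFGS-B routine is swapped in for the MATLAB interior-point algorithm, the integral over the true value is approximated with a midpoint rule, and the starting point, bounds, log-parameterization, and simulated data are all assumptions chosen for illustration.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.stats import beta as beta_dist, norm

GRID = (np.arange(200) + 0.5) / 200  # midpoints on (0, 1) for integrating over a_p

def neg_log_likelihood(params, data):
    """params = [log alpha, log beta, then (u_k, v_k, log sigma_k) for each IAM]."""
    prior = beta_dist.pdf(GRID, np.exp(params[0]), np.exp(params[1]))  # pr(a_p | Omega)
    nll = 0.0
    for k, a_hat in enumerate(data):
        u, v, log_sigma = params[2 + 3 * k: 5 + 3 * k]
        # pr(a_hat | a_p, Theta_k) for every (patient, grid point) pair
        cond = norm.pdf(a_hat[:, None], loc=u * GRID[None, :] + v, scale=np.exp(log_sigma))
        marginal = (cond * prior[None, :]).mean(axis=1)  # midpoint rule on (0, 1)
        nll -= np.sum(np.log(marginal + 1e-300))         # guard against log(0)
    return nll

def fit_ngs(data):
    """Estimate Omega and {Theta_k} from a list of 1-D arrays, one array per IAM."""
    n_iams = len(data)
    x0 = np.concatenate([[np.log(2.0)] * 2, np.tile([1.0, 0.0, np.log(0.1)], n_iams)])
    bounds = [(-2.0, 4.0)] * 2 + [(0.2, 3.0), (-1.0, 1.0), (np.log(0.02), 0.0)] * n_iams
    res = minimize(neg_log_likelihood, x0, args=(data,), method="L-BFGS-B",
                   bounds=bounds, options={"maxiter": 200})
    thetas = [(float(res.x[2 + 3 * k]), float(res.x[3 + 3 * k]),
               float(np.exp(res.x[4 + 3 * k]))) for k in range(n_iams)]
    omega = (float(np.exp(res.x[0])), float(np.exp(res.x[1])))
    return {"omega": omega, "thetas": thetas, "nll": float(res.fun), "x0": x0}

# Simulated data: two IAMs, different patient sets of different sizes (assumed values).
rng = np.random.default_rng(0)
data = [1.0 * rng.beta(2, 5, 120) + 0.05 + rng.normal(0, 0.05, 120),
        0.8 * rng.beta(2, 5, 80) + 0.00 + rng.normal(0, 0.10, 80)]
fit = fit_ngs(data)
print(fit["omega"], fit["thetas"])
```

Optimizing the logarithms of the positive parameters keeps them positive without extra constraints. With patient sets this small the estimates can be far from the simulated values, consistent with the technique's need for a large number of patients.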

The above treatment, although based on the same idea as existing NGS techniques16,17,20, is general in that at no point does it require image data where the same patient has been scanned using all the available IAMs. Further, no constraints are placed on the number of measurements available from a given IAM. Thus, even if a different number of patient studies is available from each IAM, a very realistic scenario, the NGS technique can still be used to estimate the model parameters. All these factors make the proposed NGS technique very general and practical.

3. EXPERIMENTS AND RESULTS

The proposed NGS technique was validated in the context of evaluating different IAMs on the task of estimating activity concentration in a known region of interest. Here we present preliminary results from this validation study.

Validating the proposed technique on the task of evaluating IAMs requires a study in which the true quantitative values are known for reference. For this purpose, we conducted highly realistic simulation studies using anthropomorphic phantoms. SPECT imaging of an I-131 labeled anti-CD20 antibody used for radio-immunotherapy of non-Hodgkin’s lymphoma was simulated. The object database consisted of a 42-patient digital phantom population. These phantoms were based on the NURBS-based Cardiac-Torso (NCAT) phantom, with organ time-activity curves and organ volumes based on measurements from patient data. A Philips Precedence SPECT system with a 9.525 mm thick NaI crystal and a high-energy general-purpose collimator was simulated. Low-noise projections of eight organs were generated from photons with emission energies of 364, 637, and 722 keV and the appropriate abundances. The SimSET software, in conjunction with angular response function tables, which accurately modeled the collimator and detector effects32, was used to simulate the projections. The software to simulate the SPECT system was validated by comparison with experimental data32. Using this software, low-noise projections were scaled and summed according to the emission abundance and activity distributions in the 42-patient phantom population, yielding 42 low-noise projection datasets. Subsequently, 50 independent noisy projection datasets were generated from each low-noise dataset using a Poisson pseudo-random-number generator.
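The last step, generating independent noisy realizations from a low-noise projection, can be sketched as follows. This assumes NumPy's Poisson generator and treats each low-noise pixel value as the mean count; the function name, array shape, and count level are hypothetical and do not reproduce the SimSET pipeline.

```python
import numpy as np

def noisy_realizations(low_noise_projection, n_realizations=50, seed=0):
    """Draw independent Poisson-noise copies of one low-noise projection dataset."""
    rng = np.random.default_rng(seed)
    mean_counts = np.asarray(low_noise_projection, dtype=float)
    # Each pixel is an independent Poisson variable with the low-noise value as its mean.
    return rng.poisson(mean_counts, size=(n_realizations,) + mean_counts.shape)

# Toy 4 x 4 "projection" with 100 expected counts per pixel (assumed values).
noisy = noisy_realizations(np.full((4, 4), 100.0))
print(noisy.shape)
```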

The patient population was split into two halves, and the projection data from each half were reconstructed using a different method. This simulated the case where different sets of patients are scanned using different IAMs. Both reconstruction methods were OSEM-based but provided different levels of compensation. The projection data corresponding to the first set were reconstructed using the AGS method, which compensated for attenuation (A), scatter (S), and geometric response (G) of the SPECT system. The second set was reconstructed using the ADS.DWN method, which additionally compensated for collimator-detector response (D) and down-scatter from high-energy photons (DWN)32. The activity concentrations in eight different VOIs corresponding to the eight organs were estimated assuming knowledge of the true organ VOIs.

The proposed NGS technique was used to rank these two IAMs based on the task of quantifying activity concentration in the different organs. Fifty trials of this process were conducted. In each trial, data from different patient subsets were reconstructed using the two reconstruction methods, thus simulating population variability. The entire process was repeated for the 50 different noise realizations of the data. Thus, in total, 2500 (50 × 50) trials of the NGS technique were conducted.
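The bookkeeping behind this trial design can be sketched as below. The text does not state how the patient subsets were chosen, so the random split used here is an assumption, as are the function names and seeds.

```python
import random

def split_population(n_patients=42, seed=0):
    """Randomly assign patients to two disjoint halves, one per reconstruction method."""
    rng = random.Random(seed)
    indices = list(range(n_patients))
    rng.shuffle(indices)
    half = n_patients // 2
    return sorted(indices[:half]), sorted(indices[half:])

def enumerate_trials(n_splits=50, n_noise=50):
    """Pair every patient split with every noise realization."""
    return [(split, noise) for split in range(n_splits) for noise in range(n_noise)]

first_half, second_half = split_population(seed=1)
print(len(first_half), len(second_half), len(enumerate_trials()))
```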

Since this was a simulation study, the true values of the various FoMs for the two IAMs were determined from the available ground truth. The rankings on the basis of accuracy obtained from the metrics estimated using the NGS technique matched the true rankings in more than 90% of the trials, as illustrated in Fig. 5.

Fig. 5.


The mean bias for each of the 50 noise realizations when (a) ground truth was known and (b) the NGS technique was used assuming no knowledge of true values. The error bars represent 95% confidence intervals. The rankings obtained with the NGS technique using the bias FoM were the same as the true rankings for more than 90% of the 2500 trials.
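The agreement figure quoted above is a fraction of trials in which the NGS ranking matches the true ranking. A minimal sketch of that computation follows; the bias values are made up for illustration and are not data from the study, and ranking by absolute bias is an assumption.

```python
def ranking(values):
    """Indices of the IAMs ordered from smallest to largest absolute bias."""
    return sorted(range(len(values)), key=lambda k: abs(values[k]))

def agreement_fraction(true_bias_per_trial, ngs_bias_per_trial):
    """Fraction of trials in which the NGS ranking equals the true ranking."""
    matches = sum(ranking(t) == ranking(e)
                  for t, e in zip(true_bias_per_trial, ngs_bias_per_trial))
    return matches / len(true_bias_per_trial)

# Hypothetical biases of two IAMs over four trials.
true_bias = [(0.05, 0.01), (0.06, 0.02), (0.04, 0.01), (0.05, 0.02)]
ngs_bias = [(0.07, 0.02), (0.05, 0.03), (0.01, 0.02), (0.06, 0.01)]
print(agreement_fraction(true_bias, ngs_bias))
```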

4. CONCLUSIONS

A no-gold-standard technique has been developed to evaluate image-acquisition methods (IAMs) when different sets of patients are scanned using different IAMs. Preliminary results demonstrate the potential of this technique in ranking the imaging methods on the basis of accuracy. In the future, we propose to conduct several additional studies to comprehensively validate the performance of this technique. The proposed technique has several potential applications, including protocol optimization for multi-center trials and scanner harmonization. A limitation of the technique is that it requires data from a large number of patients, which might not always be available. We are currently developing strategies that could help us overcome this requirement33.

Acknowledgments

This work was supported by the National Institutes of Health under grant numbers R01-EB016231, R01-CA109234 and U01-CA140204.

References

  • 1. Abramson RG, Burton KR, Yu J-PJ, Scalzetti EM, Yankeelov TE, et al. Methods and Challenges in Quantitative Imaging Biomarker Development. Academic Radiology. 2015;22(1):25–32. doi: 10.1016/j.acra.2014.09.001.
  • 2. Gillies RJ, Kinahan PE, Hricak H. Radiomics: Images Are More than Pictures, They Are Data. Radiology. 2016;278(2):563–77. doi: 10.1148/radiol.2015151169.
  • 3. Bailey DL, Willowson KP. An evidence-based review of quantitative SPECT imaging and potential clinical applications. J Nucl Med. 2013;54(1):83–89. doi: 10.2967/jnumed.112.111476.
  • 4. Naqa I. The role of quantitative PET in predicting cancer treatment outcomes. Clinical and Translational Imaging. 2014;2(4):305–320.
  • 5. Wahl RL, Jacene H, Kasamon Y, Lodge MA. From RECIST to PERCIST: Evolving Considerations for PET response criteria in solid tumors. J Nucl Med. 2009;50(Suppl 1):122s–50s. doi: 10.2967/jnumed.108.057307.
  • 6. Mena E, Sheikhbahaei S, Taghipour M, Jha AK, Vicente E, et al. 18F-FDG PET/CT Metabolic Tumor Volume and Intratumoral Heterogeneity in Pancreatic Adenocarcinomas: Impact of Dual-Time Point and Segmentation Methods. Clin Nucl Med. 2016. doi: 10.1097/RLU.0000000000001446.
  • 7. Mena E, Taghipour M, Sheikhbahaei S, Jha AK, Solnes L, et al. Value of intra-tumoral metabolic heterogeneity and quantitative 18F-FDG PET/CT parameters to predict prognosis, in patients with HPV-positive primary oropharyngeal squamous cell carcinoma. Clin Nucl Med. 2016. doi: 10.1097/RLU.0000000000001578. accepted.
  • 8. Tirkes T, Hollar MA, Tann M, Kohli MD, Akisik F, et al. Response criteria in oncologic imaging: review of traditional and new criteria. Radiographics. 2013;33(5):1323–41. doi: 10.1148/rg.335125214.
  • 9. Stephen RM, Jha AK, Roe DJ, Trouard TP, Galons JP, et al. Diffusion MRI with Semi-Automated Segmentation Can Serve as a Restricted Predictive Biomarker of the Therapeutic Response of Liver Metastasis. Magn Reson Imaging. 2015. doi: 10.1016/j.mri.2015.08.006.
  • 10. Xu QG, Xian JF. Role of Quantitative Magnetic Resonance Imaging Parameters in the Evaluation of Treatment Response in Malignant Tumors. Chinese Medical Journal. 2015;128(8):1128–1133. doi: 10.4103/0366-6999.155127.
  • 11. Jha AK, Rodriguez JJ, Stopeck AT. A maximum-likelihood method to estimate a single ADC value of lesions using diffusion MRI. Magn Reson Med. 2016. doi: 10.1002/mrm.26072.
  • 12. Jha AK. ADC Estimation in Diffusion-Weighted Images. Tucson, Arizona: Dept. of ECE, University of Arizona; 2009.
  • 13. Rahmim A, Tahari AK, Schindler TH. Towards quantitative myocardial perfusion PET in the clinic. Journal of the American College of Radiology: JACR. 2014;11(4):429–432. doi: 10.1016/j.jacr.2013.12.018.
  • 14. Schindler TH, Schelbert HR, Quercioli A, Dilsizian V. Cardiac PET imaging for the detection and monitoring of coronary artery disease and microvascular health. JACC Cardiovasc Imaging. 2010;3(6):623–40. doi: 10.1016/j.jcmg.2010.04.007.
  • 15. Link TM. Osteoporosis imaging: state of the art and advanced imaging. Radiology. 2012;263(1):3–17. doi: 10.1148/radiol.12110462.
  • 16. Kupinski MA, Hoppin JW, Clarkson E, Barrett HH, Kastis GA. Estimation in medical imaging without a gold standard. Academic Radiology. 2002;9(3):290–297. doi: 10.1016/s1076-6332(03)80372-0.
  • 17. Hoppin JW, Kupinski MA, Kastis GA, Clarkson E, Barrett HH. Objective Comparison of Quantitative Imaging Modalities Without the Use of a Gold Standard. IEEE Transactions on Medical Imaging. 2002;21(5):441–449. doi: 10.1109/TMI.2002.1009380.
  • 18. Kupinski MA, Hoppin JW, Krasnow J, Dahlberg S, Leppo JA, et al. Comparing cardiac ejection fraction estimation algorithms without a gold standard. Academic Radiology. 2006;13(3):329–337. doi: 10.1016/j.acra.2005.12.005.
  • 19. Lebenberg J, Buvat I, Lalande A, Clarysse P, Casta C, et al. Nonsupervised Ranking of Different Segmentation Approaches: Application to the Estimation of the Left Ventricular Ejection Fraction From Cardiac Cine MRI Sequences. IEEE Transactions on Medical Imaging. 2012;31(8):1651–1660. doi: 10.1109/TMI.2012.2201737.
  • 20. Jha AK, Caffo B, Frey EC. A no-gold-standard technique for objective assessment of quantitative nuclear-medicine imaging methods. Phys Med Biol. 2016;61(7):2780–800. doi: 10.1088/0031-9155/61/7/2780.
  • 21. Jha AK, Song N, Caffo B, Frey EC. Objective evaluation of reconstruction methods for quantitative SPECT imaging in the absence of ground truth. Proc SPIE Int Soc Opt Eng. 2015;9416:94161K–94161K-8. doi: 10.1117/12.2081286.
  • 22. Jha AK, Kupinski MA, Rodriguez JJ, Stephen RM, Stopeck AT. Evaluating segmentation algorithms for diffusion-weighted MR images: a task-based approach. Proc SPIE Int Soc Opt Eng. 2010:7627. doi: 10.1117/12.845515.
  • 23. Jha AK, Kupinski MA, Rodriguez JJ, Stephen RM, Stopeck AT. Task-based evaluation of segmentation algorithms for diffusion-weighted MRI without using a gold standard. Physics in Medicine and Biology. 2013;58(1):183–183. doi: 10.1088/0031-9155/57/13/4425.
  • 24. Jha AK, Mena E, Caffo B, Ashrafinia S, Rahmim A, et al. Practical no-gold-standard evaluation framework for quantitative imaging methods: application to lesion segmentation in positron emission tomography. Journal of Medical Imaging. 2017;4(1):011011–011011. doi: 10.1117/1.JMI.4.1.011011.
  • 25. Kessler LG, Barnhart HX, Buckler AJ, Choudhury KR, Kondratovich MV, et al. The emerging science of quantitative imaging biomarkers terminology and definitions for scientific studies and regulatory submissions. Stat Methods Med Res. 2015;24(1):9–26. doi: 10.1177/0962280214537333.
  • 26. Jha AK, Kupinski MA, Rodriguez JJ, Stephen RM, Stopeck AT. Task-based evaluation of segmentation algorithms for diffusion-weighted MRI without using a gold standard. Phys Med Biol. 2012;57(13):4425–46. doi: 10.1088/0031-9155/57/13/4425.
  • 27. Tossici-Bolt L, Dickson JC, Sera T, de Nijs R, Bagnara MC, et al. Calibration of gamma camera systems for a multicentre European 123I-FP-CIT SPECT normal database. Eur J Nucl Med Mol Imaging. 2011;38(8):1529–40. doi: 10.1007/s00259-011-1801-5.
  • 28. Liow JS, Strother SC. The convergence of object dependent resolution in maximum likelihood based tomographic image reconstruction. Phys Med Biol. 1993;38(1):55–70. doi: 10.1088/0031-9155/38/1/005.
  • 29. Li Q, Gavrielides MA, Sahiner B, Myers KJ, Zeng R, et al. Statistical analysis of lung nodule volume measurements with CT in a large-scale phantom study. Medical Physics. 2015;42(7):3932–3947. doi: 10.1118/1.4921734.
  • 30. Obuchowski NA, Buckler A, Kinahan P, Chen-Mayer H, Petrick N, et al. Statistical Issues in Testing Conformance with the Quantitative Imaging Biomarker Alliance (QIBA) Profile Claims. Academic Radiology. 2016;23(4):496–506. doi: 10.1016/j.acra.2015.12.020.
  • 31. Obuchowski NA, Barnhart HX, Buckler AJ, Pennello G, Wang XF, et al. Statistical issues in the comparison of quantitative imaging biomarker algorithms using pulmonary nodule volume as an example. Stat Methods Med Res. 2015;24(1):107–40. doi: 10.1177/0962280214537392.
  • 32. Song N, Du Y, He B, Frey EC. Development and evaluation of a model-based down-scatter compensation method for quantitative I-131 SPECT. Med Phys. 2011;38(6):3193–3204. doi: 10.1118/1.3590382.
  • 33. Jha AK, Frey E. Incorporating prior information in a no-gold-standard technique to assess quantitative SPECT reconstruction methods. International Meeting on Fully 3D reconstruction in Radiology and Nuclear Medicine. 2015:47–51.
