Correspondence/Findings
To assess the accuracy and precision of positron emission tomography-computed tomography (PET-CT) carotid standardized uptake values (SUV) of 18F–fluorodeoxyglucose (18FDG) as an inflammatory biomarker for cerebrovascular diseases such as stroke, methodological and statistical issues must be taken into account; otherwise, such research will chiefly yield misleading messages. In brief, conflating accuracy with precision is the main source of such misleading conclusions.
I read with interest the paper by Giannotti and colleagues published in the December 2017 issue of EJNMMI Res [1]. Positron emission tomography-computed tomography (PET-CT) carotid standardized uptake values (SUV) of 18F–fluorodeoxyglucose (18FDG) have been proposed as an inflammatory biomarker for cerebrovascular diseases such as stroke, and consideration of varying methodological approaches and software packages is critical to the calculation of accurate SUVs in cross-sectional and longitudinal patient studies. The authors aimed to investigate whether carotid atherosclerotic plaque SUVs are consistent and reproducible between software packages [1]. 18FDG-PET SUVs of the carotids were obtained in 101 patients using two different software packages [1]. Data from five to seven anatomical sites were measured, with a total of ten regions of interest (ROI) drawn at each site. Statistically significant differences in SUV measurements between the two software packages, ranging from 9 to 21.8% depending on ROI location, were found. In 79% (n = 23) of the ROI locations, the differences between the SUV measurements from the two packages were statistically significant. The authors highlighted the importance of standardizing all aspects of the methodological approach to ensure accuracy and reproducibility.
However, reproducibility (precision, repeatability, reliability, or interchangeability) and accuracy (validity) are two completely different methodological issues [2–8], and the approaches and statistical estimates used to assess them differ accordingly. For reliability, the analysis should be individual based: for continuous variables, the intraclass correlation coefficient (ICC) for absolute agreement, single measure, should be used. The reported 9 to 21.8% statistically significant differences in SUV measurements between the two software packages indicate that the authors did not apply this approach. Instead, they used a global-average approach for reliability, a common mistake; comparisons of group means are usually applied to assess the accuracy of a test against a gold standard. It is crucial to recognize that a test can be accurate without being reliable, and vice versa. Moreover, statistical significance should not be relied upon in reproducibility analyses, because it depends dramatically on the sample size [2–8]. Finally, confusing precision with accuracy will mainly produce misleading messages.
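To illustrate the individual-based approach advocated above, the sketch below computes a single-measure, absolute-agreement ICC (ICC(2,1) in the Shrout–Fleiss scheme) from a subjects-by-raters matrix via the standard two-way ANOVA decomposition. The function name and toy data are illustrative assumptions, not values from the study under discussion.

```python
def icc_absolute_single(data):
    """Single-measure, absolute-agreement ICC (two-way random effects).

    data: n_subjects rows x k_raters columns, e.g. one SUV per patient
    from each of two software packages.
    """
    n = len(data)      # subjects
    k = len(data[0])   # raters (here: software packages)
    grand = sum(sum(row) for row in data) / (n * k)
    row_means = [sum(row) / k for row in data]
    col_means = [sum(data[i][j] for i in range(n)) / n for j in range(k)]
    # Mean squares from the two-way ANOVA decomposition
    ms_rows = k * sum((m - grand) ** 2 for m in row_means) / (n - 1)
    ms_cols = n * sum((m - grand) ** 2 for m in col_means) / (k - 1)
    sse = sum(
        (data[i][j] - row_means[i] - col_means[j] + grand) ** 2
        for i in range(n) for j in range(k)
    )
    ms_err = sse / ((n - 1) * (k - 1))
    return (ms_rows - ms_err) / (
        ms_rows + (k - 1) * ms_err + k * (ms_cols - ms_err) / n
    )

# A systematic +1 offset between packages leaves each patient's ranking
# intact, yet lowers absolute agreement below 1 — exactly the kind of
# inter-software disagreement a group-mean comparison can hide.
perfect = [[1, 1], [2, 2], [3, 3]]
biased = [[1, 2], [2, 3], [3, 4]]
print(icc_absolute_single(perfect))  # → 1.0
print(icc_absolute_single(biased))   # → ~0.667
```

Note that a consistency-type ICC would score the biased data as perfect agreement; the absolute-agreement form penalizes the systematic offset, which is why it is the appropriate estimate for interchangeability between software packages.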
Competing interests
The author declares that he/she has no competing interests.
Publisher’s Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Footnotes
This reply refers to the comment available at: https://doi.org/10.1186/s13550-017-0309-9.
References
- 1. Giannotti N, O'Connell MJ, Foley SJ, Kelly PJ, McNulty JP. Carotid atherosclerotic plaques standardised uptake values: software challenges and reproducibility. EJNMMI Res. 2017;7(1):39. doi: 10.1186/s13550-017-0285-0.
- 2. Szklo M, Nieto FJ. Epidemiology: beyond the basics. 2nd ed. Manhattan: Jones and Bartlett; 2007.
- 3. Sabour S. Reproducibility of 18F-fluoromisonidazole intratumour distribution in non-small cell lung cancer; methodological issues to avoid mismanagement of the patients. EJNMMI Res. 2017;7(1):23. doi: 10.1186/s13550-017-0270-7.
- 4. Sabour S. Reproducibility of semi-automatic coronary plaque quantification in coronary CT angiography with sub-mSv radiation dose; common mistakes. J Cardiovasc Comput Tomogr. 2016;10(5):e21–e22. doi: 10.1016/j.jcct.2016.07.002.
- 5. Sabour S, Ghassemi F. The validity and reliability of a signal impact assessment tool: statistical issue to avoid misinterpretation. Pharmacoepidemiol Drug Saf. 2016;25(10):1215–1216. doi: 10.1002/pds.4061.
- 6. Sabour S. Myocardial blood flow quantification by Rb-82 cardiac PET/CT: methodological issues on reproducibility study. J Nucl Cardiol. 2016;6 [Epub ahead of print].
- 7. Sabour S, Ghassemi F. Accuracy and reproducibility of the ETDRS visual acuity chart: methodological issues. Graefes Arch Clin Exp Ophthalmol. 2016;254(10):2073–2074. doi: 10.1007/s00417-016-3420-0.
- 8. Sabour S, Farzaneh F, Peymani P. Evaluation of the sensitivity and reliability of primary rainbow trout hepatocyte vitellogenin expression as a screening assay for estrogen mimics: methodological issues. Aquat Toxicol. 2015;164:175–176. doi: 10.1016/j.aquatox.2015.05.003.
