Radiology. 2020 Jan 7;294(3):647–657. doi: 10.1148/radiol.2019191882

The QIBA Profile for FDG PET/CT as an Imaging Biomarker Measuring Response to Cancer Therapy

Paul E Kinahan, Eric S Perlman, John J Sunderland, Rathan Subramaniam, Scott D Wollenweber, Timothy G Turkington, Martin A Lodge, Ronald Boellaard, Nancy A Obuchowski, Richard L Wahl
PMCID: PMC7053216  PMID: 31909700

Abstract

The Quantitative Imaging Biomarkers Alliance (QIBA) Profile for fluorodeoxyglucose (FDG) PET/CT imaging was created by QIBA to both characterize and reduce the variability of standardized uptake values (SUVs). The Profile provides two complementary claims on the precision of SUV measurements. First, tumor glycolytic activity as reflected by the maximum SUV (SUVmax) is measurable from FDG PET/CT with a within-subject coefficient of variation of 10%–12%. Second, a measured increase in SUVmax of 39% or more, or a decrease of 28% or more, indicates that a true change has occurred with 95% confidence. Two applicable use cases are clinical trials and following individual patients in clinical practice. Other components of the Profile address the protocols and conformance standards considered necessary to achieve the performance claim. The Profile is intended for use by a broad audience; applications can range from discovery science through clinical trials to clinical practice. The goal of this report is to provide a rationale and overview of the FDG PET/CT Profile claims as well as its context, and to outline future needs and potential developments.

© RSNA, 2020

Online supplemental material is available for this article.

See also the editorial by Ulaner in this issue.




Summary

This article provides a summary of the claims of precision of standardized uptake value measurements and the context of the technically confirmed fluorodeoxyglucose PET/CT Profile by the Quantitative Imaging Biomarkers Alliance.

Key Results

  • In controlled test-retest PET/CT scans, the within-subject coefficient of variation of tumor fluorodeoxyglucose (FDG) uptake measured with the maximum standardized uptake value (SUVmax) is 10%–12%.

  • Equivalently, a measured increase in SUVmax of 39% or more, or a decrease of 28% or more, indicates that a true change has occurred with a 95% level of confidence.

  • For the above statements to be true, imaging is best performed with the same scanner and protocol, the FDG-avid lesions should be more than 2 cm in diameter with a baseline SUVmax greater than or equal to 4 g/mL, and the FDG uptake durations for baseline and follow-up scans (nominally 60 minutes) should be within 10 minutes.

Introduction

Despite the challenges of quantitative imaging of fluorine 18 (18F)–fluorodeoxyglucose (FDG) with PET combined with x-ray CT, or FDG PET/CT, the potential to monitor response to therapy or disease progression by comparing serial FDG uptake measurements has motivated efforts to both characterize and improve the precision of measurements from FDG PET/CT scans. In 2007, the Radiological Society of North America organized the Quantitative Imaging Biomarkers Alliance (QIBA), whose mission is “to improve the value and practicality of quantitative imaging biomarkers by reducing variability across devices, patients and time” (1,2).

The primary outputs of the QIBA initiative are standards-based quantitative imaging documents called Profiles, which are based on a similar process used by the Integrating the Healthcare Enterprise, or IHE, initiative (3). A Profile makes statistically based performance claims about a quantitative imaging biomarker. It also describes the clinical context and specifies the performance requirements and appropriate compliance procedures to achieve the claims.

The FDG PET/CT Profile describes the measurement accuracy of FDG PET/CT imaging and also defines the technical and behavioral performance levels and quality control specifications for whole-body FDG PET/CT scans used in single- and multicenter clinical trials of oncologic therapies.

The goal of this report is to provide a concise summary of the Profile and its claims, as well as to describe how the claims were derived. In addition, we review the relationships with other initiatives and proposed criteria. We also describe the time course of the Profile development, the knowledge gaps that were identified, and the “groundwork” projects undertaken to bridge those gaps. We discuss the implications for patient care in both clinical trials and clinical care, current and potential uses, how barriers to implementation are being addressed, and future directions.

The Roles and Challenges of Quantitative FDG PET/CT Imaging

PET/CT imaging is widely used for diagnosis, staging, restaging, treatment planning, and response assessment for several diseases, as well as a research tool for developing new therapies and basic research (4). PET/CT imaging is most commonly used for cancer imaging with the radiolabeled glucose analog 2-(18F) FDG across a wide range of solid tumor cancers (5). The rationale for the use of FDG in oncology is based on the increased rate of glycolysis in many tumor types compared with normal tissue (6). In contrast to normal differentiated cells, which rely primarily on mitochondrial oxidative phosphorylation to generate the energy needed for cellular processes, most cancer cells have greatly increased rates of glycolysis, a phenomenon termed the Warburg effect and considered a hallmark of cancer (7). FDG is transported into cells following the glucose pathway, but the slight structural difference with glucose results in FDG being trapped in the cell and accumulating at a rate proportional to glucose utilization (6). This biochemistry, combined with the favorable physics of detecting positrons emitted by the 18F radioisotope, and the advent of combined PET/CT scanners, makes FDG PET/CT imaging an effective method for imaging cancer. Starting in 1998, the Centers for Medicare & Medicaid Services approved Medicare reimbursement for FDG PET imaging in multiple indications.

The spatially dependent measurement made by PET scanners is the in vivo radioactivity concentration, that is, kilobecquerels per milliliter. The most common form of reporting uptake is a relative uptake measure called the standardized uptake value (SUV), defined as the measured radioactivity concentration divided by the decay-corrected injected dose per unit of patient weight. The resulting units are typically grams per milliliter, although variations exist. Use of SUVs minimizes confounding effects due to variations in the amount of FDG injected and the biodistribution (eg, patient size). Typically, only the maximum SUV (SUVmax) is reported, because this is a more repeatable measurement than averages within a defined region of interest (8,9).
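As a minimal sketch of the body-weight SUV definition above (it is not the vendor-neutral pseudocode distributed with the Profile), the following Python function computes an SUV from a measured activity concentration. The function name, the argument names, and the fluorine 18 half-life constant of 109.77 minutes are assumptions introduced for illustration, and the sketch assumes that the image values are already decay-corrected to the scan reference time.

```python
import math

F18_HALF_LIFE_MIN = 109.77  # assumed physical half-life of fluorine 18, in minutes


def suv_body_weight(conc_kbq_per_ml, injected_mbq, weight_kg, minutes_since_assay):
    """Return the body-weight SUV in g/mL (hypothetical helper for illustration)."""
    if injected_mbq <= 0 or weight_kg <= 0:
        raise ValueError("injected activity and weight must be positive")
    # Decay-correct the assayed activity to the scan reference time.
    decay = math.exp(-math.log(2) * minutes_since_assay / F18_HALF_LIFE_MIN)
    injected_kbq = injected_mbq * 1000.0 * decay
    weight_g = weight_kg * 1000.0
    # SUV = concentration / (decay-corrected dose per gram of body weight).
    return conc_kbq_per_ml / (injected_kbq / weight_g)


if __name__ == "__main__":
    # Example values are made up: 370 MBq assayed 60 min before the scan, 70-kg patient.
    print(round(suv_body_weight(20.0, 370.0, 70.0, 60.0), 2))
```

Dividing by the dose per gram, rather than by dose and weight separately, is what yields units of grams per milliliter.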

Longitudinal changes in tumor FDG accumulation during therapy often can predict clinical outcomes earlier than changes in standard anatomic measurements (10). The TBCRC026 trial of neoadjuvant therapy for patients with stage II or III, estrogen receptor–negative, human epidermal growth factor receptor 2–positive breast cancer has recently shown the ability of FDG SUVmax to be an early predictor of pathologic complete response (11). The TBCRC026 trial followed the QIBA FDG PET/CT Profile as described next.

Therefore, tumor metabolic response or progression as determined by FDG uptake can serve as a pharmacodynamic end point in well-controlled phase I and phase IIA studies, as well as an efficacy end point in phase II and phase III studies. In experimental tumor therapy settings where the preceding phase II trials have shown a statistically significant relationship between FDG PET response and an independent measure of outcome, changes in tumor FDG activity may serve as the primary efficacy end point for regulatory approval in drug registration trials. Recent U.S. Food and Drug Administration guidelines for imaging in clinical trials note that “[S]tandardization, while important for all clinically used measures, becomes essential for an imaging endpoint used in a clinical trial to reduce variability and to ensure interpretability of the results” (12).

In clinical practice, the Food and Drug Administration has recently proposed draft guidelines for quantitative imaging in device premarket submissions and noted “[T]he utility of any quantitative imaging value is greatest if the performance of the quantitative imaging function is well characterized and users have sufficient information to understand and interpret the quantitative values being reported” (13).

In clinical practice, SUVs are commonly reported, but without the context of their known or potential variability. This limits the ability to predict clinical outcomes.

The key issues identified as barriers to the use of SUVs as a quantitative imaging biomarker for FDG PET/CT imaging have been defined by Boellaard and others (14–16). In brief, these are a result of scanner features (spatial resolution, image scatter and noise), patient factors (blood glucose, renal function, biologic variability), imaging protocols (injected activity, duration of uptake period, respiratory motion), and data correction and image reconstruction algorithms. The multiple sources of potential error reduce the reproducibility of SUV measurements and constrain the use of a change in SUV to determine response to therapy or disease progression. These analyses were used in the construction of the FDG PET/CT Profile and its claims as described next.

The Profile Claim

Under several constraints described next, the QIBA Profile summarizes the test-retest variability of SUVs measured from FDG PET/CT images in two equivalent claims:

  1. Tumor glycolytic activity as reflected by the SUVmax is measurable from FDG PET/CT with a within-subject coefficient of variation (wCV) of 10%–12%.

  2. A measured increase in SUVmax of 39% or more, or a decrease of 28% or more, indicates that a true change has occurred with 95% confidence.

It is important to note that these claims do not address the bias or variance of an SUV measurement from a single scan, but rather the variance of measured SUVs for the same lesion in the same patient under the assumption that there have been no systematic changes in the lesion, the patient, the scan protocol, or the analysis. The main reasons the claims focused on this specific measurement task were the relevance for clinical trials and clinical practice, as well as the availability of sufficient published test-retest studies in humans to justify the claims statistically.

The practical implication of the Profile claims is that the measurement precision of test-retest FDG PET SUVmax values is established as a measurement assay. Essentially this is describing the technical performance as an assessment of how precise the measurement is in participants under controlled conditions (17). Specifically, if the conditions below are met, then the wCV is 10%–12%. This can also be expressed as predicting the likelihood of whether there has been a true change in FDG tracer uptake in longitudinal PET scans, as described in part 2 of the claim. This expression of the claim can be used as a basis for defining or modifying response criteria, as discussed next in the section entitled “Relation to Criteria Determining Response.”

Conditions for the Profile Claim to Be Valid

Several conditions are considered necessary for the claim to be valid:

  1. The claims only apply to tumors that are considered evaluable with PET. In practice, this means tumors of a minimum size and baseline SUVmax (4,18). The Profile recommends a minimum baseline SUVmax of 4 g/mL and a minimum diameter of 2 cm but notes that for some clinical trials or clinical use cases, higher or lower values may be appropriate (19,20).

  2. In the derivation of the claim, it was assumed that the repeatability of SUVmax could be described by a fixed percentage of the baseline measurement. This assumption may not be applicable over the full range of clinically relevant SUVs, and combinations of relative and absolute SUV changes have been proposed (18).

  3. The claim is applicable for single-center studies using the same scanner and protocol. However, for studies where a patient may be imaged with different PET/CT scanners, Kurland et al (21) have shown that the claim can be met provided that the QIBA Profile is followed, the same make and model scanner is used, and the scanners are cross calibrated. This same study also showed that test-retest studies with different scanners had a much larger wCV (increasing from 9% to 22%; see Supplementary Table 4 in Kurland et al [21]), consistent with the results of Kumar et al (22), which emphasized that the variance of SUVs for FDG PET/CT imaging is greater in clinical practice than under controlled settings.

  4. An additional condition is that the FDG uptake durations (time between injection and scanning) for baseline and follow-up scans (nominally 60 minutes) are within 10 minutes. These and other requirements in the Profile are summarized as concise checklists for the imaging sites and equipment manufacturers. These checklists are included in Appendix E1 (online).
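The conditions above can be sketched as a simple screening function. This is an illustration only and not the Profile's conformance checklist: the record type, the field names, and the function name are hypothetical, the default thresholds are the values quoted in the list, and the uptake-time condition is interpreted here (as an assumption) to mean that the baseline and follow-up uptake durations differ by no more than 10 minutes.

```python
from dataclasses import dataclass


@dataclass
class ScanPair:
    """Hypothetical record of one lesion at baseline and follow-up."""
    baseline_suvmax: float        # g/mL
    lesion_diameter_cm: float     # longest diameter from anatomic imaging
    baseline_uptake_min: float    # injection-to-scan time at baseline
    followup_uptake_min: float    # injection-to-scan time at follow-up
    same_scanner_and_protocol: bool


def unmet_conditions(pair, min_suvmax=4.0, min_diameter_cm=2.0, max_uptake_diff_min=10.0):
    """Return a list of the claim conditions that the scan pair does not meet."""
    problems = []
    if pair.baseline_suvmax < min_suvmax:
        problems.append("baseline SUVmax below minimum")
    if pair.lesion_diameter_cm < min_diameter_cm:
        problems.append("lesion diameter below minimum")
    # Assumed interpretation: uptake durations must agree to within the tolerance.
    if abs(pair.followup_uptake_min - pair.baseline_uptake_min) > max_uptake_diff_min:
        problems.append("uptake durations differ by too much")
    if not pair.same_scanner_and_protocol:
        problems.append("different scanner or protocol")
    return problems


if __name__ == "__main__":
    pair = ScanPair(6.2, 2.5, 60.0, 66.0, True)
    print(unmet_conditions(pair))  # an empty list means no condition was flagged
```

The thresholds are keyword arguments because the Profile notes that higher or lower values may be appropriate for some trials or clinical use cases.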

The rationale for the threshold of a minimum baseline SUVmax of 4 g/mL (condition 1 above) is described in section 3.6.5.3 of the Profile, which in turn is based on section 10.2.1.1 of the Uniform Protocol for Imaging in Clinical Trials (UPICT) described next. For this, the UPICT and QIBA working groups consolidated available literature and clinical trial procedures to form the following consensus statement:

“A minimum FDG-avidity is required and should be specified in the clinical trial protocol. … For a general cutoff, a SUVmax of 4 is suggested for all target lesions, although in some settings a lower minimum SUVmax may be acceptable, such as in the lung or breast” (23).

Similarly, the size threshold of 2 cm, and its process for measurement using anatomic imaging, have the following consensus statements:

“[sizes from] FDG-avid anatomically measurable lesion(s) are preferable to FDG-avid lesion(s) that are not anatomically measurable” (20).

and

“Evaluation of lesion size (eg, longest diameter) may be difficult. Lesions subject to partial volume effect of SUV measurement … should be excluded” (23).

The consensus statements above are accompanied by extensive discussions in the primary documents (20,23). We note that the anatomic measurement is intended to reflect the longest dimension and is agnostic of nodal versus extranodal status.

Although the thresholds for change in SUVmax of −28% and 39% are the 95% confidence limits, caution should be used in interpreting changes larger than that as meaningful response or progression because that will be dependent on the clinical scenario.

There are limitations to the claim. As discussed next, this Profile and the claims should be reassessed for technological changes, such as point spread function–based reconstruction or time-of-flight imaging that were not used in the initial published test-retest studies. In addition, although the claim was informed by a review of the literature, it is currently a technically confirmed claim that has not been independently verified by a prospective study that conforms to the Profile. These points are discussed later in the “Future Developments and Conclusions” section.

Derivation of the Claim

A number of publications report test-retest repeatability for tumor SUV measurements with FDG PET (24–32). Table 1 lists these publications and summarizes the repeatability measurement results. Comparing repeatability measurements from the various reports is complicated by the different methodologies used in each study and also the different metrics used to characterize repeatability.

Table 1:

Publications on Test-Retest Variability That Were Available and Included in the Derivation of the Profile Claims


The region-of-interest or volume-of-interest methodology varied between publications. Minn et al (27) report SUV calculated by using the mean value of a region of interest placed over the FDG-avid lesion, or SUVmean, derived from a fixed-size 1.2 × 1.2-cm region of interest. Weber et al (31) report SUVmean derived from a volume of interest defined by a 50% isocontour. The remaining articles report SUVmax, although data for multiple types of region-of-interest definitions were sometimes reported.

Nahmias and Wahl (28) report SUVmax but unlike the other publications, the wCV was not evaluable. Direct comparison with the other reports was therefore impossible. Kamibayashi et al (25) compared the repeatability of SUVs measured with different scanner systems, whereas the other reports involve test-retest studies with the same scanner. For this reason, the Kamibayashi data were considered not comparable with the other articles. The study by Kumar et al (22) used different scanners for test and retest scans (seeking to answer the question of repeatability when imaging systems were interchanged), and so was not comparable to the other studies. The remaining publications (24,26,29,30,32) (the last five rows of Table 1) allow a direct comparison because they report the repeatability of SUVmax, with test and retest studies both performed with the same scanner system using the same protocol.

A further complication when comparing reports is the different metrics used to characterize repeatability. In Table 1, we translate the reported repeatability measurements to a wCV to allow a more direct comparison. Details of how these so-called inferred wCV values were derived are listed in Table 2. The last five rows of Table 1 (24,26,29,30,32) allowed direct comparison of the wCV values, which ranged from 10% to 12%. This was used as the basis for the Profile claims. We note that these data are for a variety of tumor types, including breast, lung, esophageal, and gastrointestinal cancers.

Table 2:

Relationships Used to Compare Repeatability Metrics Found in the Literature


Claim 2 is based on observations that the test-retest SUVmax differences shown in Table 1 do not follow a normal distribution. However, several publications demonstrated that the differences (d) of the logarithms of the test-retest SUVmax values did follow a normal distribution with a standard deviation (SD) of SD(d) (30,32,33). The 95% repeatability limits on the logarithmic scale (ie, ±1.96 ∙ SD[d]) are thus symmetric about the mean. It follows that when these limits are converted back to relative SUV changes by exponentiation, 100 {exp [±1.96 ∙ SD(d)] − 1}, the resulting SUV repeatability limits are necessarily asymmetric. In addition, we note that if the SD is not large compared with the level of the measurement, then it can be shown that 100 {exp [SD(d)/sqrt(2)] − 1} is approximately equal to the wCV, expressed as a percent (30,34). If we assume a wCV of 12% for SUVmax, then the repeatability coefficients are −28% and 39%, consistent with the findings by Velasquez et al (30) for advanced gastrointestinal malignancies, and those of Weber et al (32) for non–small cell lung cancer.
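The arithmetic behind these limits can be sketched in a few lines of Python. The function names are hypothetical, and the sketch takes SD(d) ≈ sqrt(2) × wCV, a further small-value simplification of the relationship above that is assumed here because it reproduces the quoted −28% and 39% figures for a wCV of 12%.

```python
import math


def repeatability_limits(wcv, z=1.96):
    """Return (lower, upper) percent-change limits for a wCV given as a fraction."""
    # Assumed small-value approximation: SD(d) ~= sqrt(2) * wCV on the log scale.
    sd_d = math.sqrt(2.0) * wcv
    lower = 100.0 * (math.exp(-z * sd_d) - 1.0)
    upper = 100.0 * (math.exp(z * sd_d) - 1.0)
    return lower, upper


def exceeds_limits(baseline_suvmax, followup_suvmax, wcv=0.12):
    """Return True if the percent change from baseline falls outside the limits."""
    change = 100.0 * (followup_suvmax - baseline_suvmax) / baseline_suvmax
    lower, upper = repeatability_limits(wcv)
    return change <= lower or change >= upper


if __name__ == "__main__":
    lower, upper = repeatability_limits(0.12)
    print(round(lower), round(upper))  # -28 39
```

Because the limits are symmetric only on the logarithmic scale, the upper limit is farther from zero than the lower limit once converted back to percent change.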

Conceptually these asymmetric limits of repeatability can be thought of as being related to the reference SUV used to calculate the relative change. Lodge (35) notes that use of a single baseline SUV as the reference leads to a skewing of the data that necessitates asymmetric limits. This can be illustrated by supposing two test-retest measurements are SUVmax1 of 7.0 and SUVmax2 of 9.75, with a net change in SUVmax of 2.75, then:

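$$\frac{\mathrm{SUV}_{\max 2} - \mathrm{SUV}_{\max 1}}{\mathrm{SUV}_{\max 1}} = \frac{9.75 - 7.0}{7.0} \approx +39\%$$

$$\frac{\mathrm{SUV}_{\max 1} - \mathrm{SUV}_{\max 2}}{\mathrm{SUV}_{\max 2}} = \frac{7.0 - 9.75}{9.75} \approx -28\%$$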

In other words, a true net change in SUVmax of 2.75 in either direction represents an absolute change or physiologic difference. As an increase, it is a change of 39%, and as a decrease, it is a change of −28%. The article by Lodge also provides more detailed background information.

Since the derivation of the claims listed above, a confirmatory study by Kurland et al (21) found a wCV for SUVmax of 9% for test-retest studies with the same scanner (see Supplementary Table 4 in Kurland et al [21]). In addition, a recent study by Fraum et al (36) found a wCV for SUVmax of 8.5%–12.8%, depending on the choice of image reconstruction method.

Relation to Criteria Determining Response

The European Organization for Research and Treatment of Cancer (EORTC) response criteria for PET were proposed in 1999 (37). Although anatomic response criteria such as those proposed by the World Health Organization and the Response Evaluation Criteria in Solid Tumors (38) had been in use, the EORTC expert consortium recognized that subclinical metabolic response seen early after treatment at PET, but not seen anatomically, was likely to be important.

As a side note on imaging protocols, the EORTC recommendations also included a set of consensus standards for protocols for clinical trials using quantitative PET imaging. Proposed standards for clinical trial protocols were also published by a National Institutes of Health–led consensus group (39); the Netherlands consortium (40); the European Association of Nuclear Medicine procedure guidelines for tumor imaging, version 2 (41); and the UPICT (19,20).

The EORTC response criteria were expanded and modified by Wahl and colleagues (4) in 2009 to form the Positron Emission Tomography Response Criteria in Solid Tumors, or PERCIST. A detailed comparison between the EORTC and PERCIST criteria is provided by Wahl et al (4), as well as justifications for the differences in the PERCIST criteria. Here we summarize the main points of difference between the EORTC and PERCIST criteria and the QIBA Profile claims. It is important to note that the EORTC and PERCIST response criteria are intended to provide information on disease response, stability, or progression, whereas the QIBA Profile claims provide information about the statistical variability of SUVs under the assumption of no true biologic change (17). Nonetheless, the areas of overlap are shown in Table 3.

Table 3:

Comparison of EORTC and PERCIST Response Criteria and QIBA Profile Claims


Profile Structure, Protocols, and Compliance

The overall structure of the technically confirmed FDG PET/CT Profile is listed in Table 4. Each section of the Profile describes the conditions that must be met for FDG PET/CT imaging to achieve the repeatability performance described in the claims. We do not go into detail, but rather provide a brief overview and point out relevant connections with other activities or initiatives.

Table 4:

High-Level Structure of the QIBA FDG PET/CT Profile


Profile Details (Protocols)

Section 3 on Profile Details (now called Protocols in the revised QIBA protocol template) addresses the steps illustrated in Figure 1. This section of the Profile organizes acquisition, reconstruction and processing, analysis, and interpretation as steps in a pipeline that extracts quantitative imaging biomarker data from images.

Figure 1:

Image shows use of fluorine 18 (18F)–fluorodeoxyglucose (FDG) PET/CT imaging process as assay method for computing and interpreting tumor metabolic activity as pipeline using either one or two or more scan sequences. Standardized uptake values (SUVs) are used to reduce variations caused by differences in patient biodistribution and amount of FDG used. Measure SUVx refers to one of several possible SUV measures such as SUVmax, SUVmean, or SUVpeak, where max, mean, and peak refer to value calculated from region placed on image of FDG-avid lesion. Biodistribution normalization is by body weight or lean body mass.

Each subsection contains one or more tables of the key normative components (ie, requirements) necessary to achieve the claims. One example is in Figure 2 illustrating the three components of parameter, entity or actor, and specification. In addition to the normative text, there are extensive descriptive text sections that are meant to be informative (ie, not requirements).

Figure 2:

Image shows excerpt of normative text requirements from fluorodeoxyglucose (FDG) PET/CT Profile from section 3.2.1.1 entitled "Timing of Image Data Acquisition".

A convention that was adopted for this Profile was the introduction of “intended future requirements,” which are indicated by a shaded box, as shown in Figure 3. These are for requirements that were considered necessary, but are not universally available at this time. In the example shown in Figure 3, the intended future requirement is that PET scanners and imaging site clocks will “[P]rovide time synchronization as per the IHE Consistent Time Integration Profile.”

Figure 3:

Image shows sample “intended future requirements” (shaded box) from fluorodeoxyglucose (FDG) PET/CT Profile from section 3.6.3.1.4 entitled “Clocks and timing devices.” Eur = Europe, NA = North America, IHE = Integrating the Healthcare Enterprise.

Relationship to the FDG PET/CT UPICT

Many of the protocol requirements in section 3 of the Profile are based on publicly available requirements and in particular on the FDG PET/CT UPICT (19,20). The UPICT working group, operating in parallel and in conjunction with the FDG PET/CT Profile writing group, developed a detailed template to guide the performance of whole-body FDG PET/CT studies in the context of single- and multicenter clinical trials of oncologic therapies (20).

The FDG PET/CT UPICT document is as extensive (more than 70 pages) and detailed as the QIBA FDG PET/CT Profile and also followed an IHE-style development cycle with a public comment period. Although there is overlap between the QIBA Profile and UPICT documents, the emphasis and goals of the two standards are different. The FDG PET/CT Profile is a document that includes a claim on the measurement performance of FDG PET/CT imaging, if certain conditions are met. These conditions include a subset of the UPICT components listed below, as well as requirements on equipment, software, and other components of the imaging chain. The imaging protocol components of the UPICT protocol are mapped into section 3 of the Profile as shown in Figure 4. The UPICT protocol uses three tiers of specifications: acceptable, target, and ideal. The intent is to provide goals for improving the conduct of clinical trials. For the protocol component of the QIBA Profile, however, only the acceptable criteria were used.

Figure 4:

Diagram shows relationship between Uniform Protocol for Imaging in Clinical Trials (UPICT) Protocol and Quantitative Imaging Biomarkers Alliance (QIBA) Profile. FDG = fluorodeoxyglucose.

This Profile is complementary to recently published clinical guidelines by the Society of Nuclear Medicine and Molecular Imaging and the European Association of Nuclear Medicine (41), as well as the well-established requirements for clinical accreditation by the American College of Radiology (42) that are focused primarily on clinical FDG PET/CT imaging. Efforts were made to harmonize the UPICT document with the then-current European Association of Nuclear Medicine FDG protocol standard (43).

Conformance (Compliance)

The Conformance section (now called Compliance in the revised QIBA protocol template) contains requirements for the inherent capabilities of the equipment and the imaging site. These are distinct from the protocol requirements in the Profile Details section, which describe imaging protocol requirements.

For example, when tested at an imaging site, a PET/CT scanner is expected to generate images with a specified level of quantitative accuracy and precision consistent with clinical or clinical trial studies (Fig 5). These requirements are included in the Profile Details section.

Figure 5:

Image shows excerpt from section 3.6.4 entitled "Phantom Imaging" describing maximum tolerable bias in standardized uptake value (SUV) measurements by using PET/CT scanner. This is example of requirements from Profile Details section for imaging protocols. Shaded box is intended future specification. ACR = American College of Radiology, ROI = region of interest.

However, when tested at the factory with sufficient time and resources allocated for careful measurement, PET/CT scanners can achieve a higher level of accuracy. This can be considered as an inherent capability of the system (ie, as shipped), as opposed to accuracy achieved under clinical operating conditions. The differences between operational and inherent capabilities are illustrated by comparing Figures 5 and 6. The imaging protocol requirement is a maximum tolerable bias of plus or minus 10% in SUV (Fig 5), whereas the equipment conformance requirement (under ideal testing conditions) is plus or minus 2% (Fig 6) for the same test. Similarly, the inherent requirements for other necessary equipment and staff are listed in the Conformance section.

Figure 6:

Image shows excerpt from section 4.2 entitled "PET/CT Acquisition Device" describing maximum tolerable bias in standardized uptake value (SUV) measurements by using PET/CT scanner. This is example of requirements from Conformance section, describing inherent capability of the system. ACR = American College of Radiology, ROI = region of interest, FDG = fluorodeoxyglucose.
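The difference between the two tolerances can be illustrated with a short Python sketch. The function names are hypothetical, and the reference SUV of 1.0 for a uniform phantom is an assumption made for the example; the ±10% and ±2% tolerances are the values quoted in the text above.

```python
def percent_bias(measured_suv, expected_suv=1.0):
    """Return the percent bias of a measured phantom SUV against its expected value."""
    # expected_suv=1.0 is an assumed reference for a uniform phantom.
    return 100.0 * (measured_suv - expected_suv) / expected_suv


def within_tolerance(measured_suv, tolerance_pct, expected_suv=1.0):
    """Return True if the absolute percent bias does not exceed the tolerance."""
    return abs(percent_bias(measured_suv, expected_suv)) <= tolerance_pct


if __name__ == "__main__":
    measured = 1.06  # made-up phantom measurement
    print(within_tolerance(measured, 10.0))  # imaging protocol requirement
    print(within_tolerance(measured, 2.0))   # equipment conformance requirement
```

A measurement with a bias of about 6% would satisfy the imaging protocol requirement but not the stricter equipment conformance requirement.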

Appendices and Checklists

The appendices contain critical information that supports the claim, as well as lists of conventions and definitions. Of particular note, Appendix B contains the analysis approach that was used for the claims as described above. Appendix D contains illustrative manufacturer scanner model-specific instructions and parameters for quality assurance and quality control procedures. Appendix H is a careful analysis of the accepted formula for computing lean-body-mass normalization for SUVs. This was necessary to correct an error in some of the literature for lean-body-mass normalization for PET SUV calculation (44).

As noted above, the normative statements are the requirements for the Profile claim to be met. These are accompanied by descriptive text to provide context. To facilitate testing for conformance, the normative statements were condensed into two concise checklists for imaging sites and equipment and software manufacturers. These are Appendix I in the Profile and are shown in Tables E1 and E2 (online).

Development and Context of the FDG PET/CT Profile Stages

The FDG PET/CT Profile started development as a gap analysis of the bias and precision of quantitative FDG uptake measures in clinical trials. The first draft of the current form of the Profile was developed in the fall of 2011. The writing of the Profile was performed by the QIBA FDG-PET/CT Biomarker Committee, with over 100 volunteer participants including representatives from radiopharmaceutical producers, scanner manufacturers, third-party image analysis workstation developers, the U.S. Food and Drug Administration, the National Institute of Standards and Technology, the National Cancer Institute, the National Institute of Biomedical Imaging and Bioengineering, clinical trial groups, contract research organizations, professional and international organizations and societies including the Radiological Society of North America, Society of Nuclear Medicine and Molecular Imaging, European Association of Nuclear Medicine, American Association of Physicists in Medicine, Medical Imaging and Technology Alliance (formerly National Electrical Manufacturers Association), and Digital Imaging and Communications in Medicine, or DICOM, among others. Disclosures of interest were required from QIBA steering committee members and all patents relevant to the Profile were required to be disclosed. Lists of all participants, as well as all policies, procedures, and meeting minutes are publicly available on the QIBA wiki (2).

Major Results from a Gap Analysis of Quantitative FDG PET/CT Imaging

The draft Profile was based on published data and existing standards whenever possible. The gap analysis identified several missing components considered necessary for quantitative PET imaging with FDG, from which we list the four most substantial findings:

  1. The lack of a DICOM image data element to contain the plasma glucose level. This was addressed by modifying the DICOM standard for FDG PET to add a data field to contain the plasma glucose concentration in International System of Units–derived units (millimoles per liter).

  2. The lack of a methodology to test the accuracy of SUV values reported by display station software. This was addressed by creating and validating a novel digital reference object (45). In a multicenter study evaluating 21 different PET/CT display software packages at 16 sites, errors in the reported maximum SUV ranged up to 38% for an isolated voxel, and errors in the reported mean SUV ranged up to 100% for a region with controlled noise. There was also a range of errors in the minimum and SD for different regions. The variability of computed SUVs between different software packages is substantial enough to warrant the introduction of a reference standard for medical image viewing workstations. The PET SUV digital reference object and instructions for use are publicly available (46) and detailed in Appendix F of the Profile. Anecdotal evidence indicates that the FDG PET/CT digital reference object is heavily used by industry.

  3. The lack of a publicly available common terminology and framework for SUV calculations from the DICOM image files that takes into account the variations in implementation of the DICOM standard by different manufacturers. This was addressed by collaborating with multiple vendors to create a “vendor-neutral pseudocode” to illustrate SUV calculations, which is included in an appendix to the Profile. The pseudocode and supporting documents are also publicly available on the QIBA wiki (2).

  4. Errors were found in some of the lean-body mass formulas and calculations published in peer-reviewed literature. In addition, improved lean-body mass calculation methods were found in pharmacology peer-reviewed literature. These two gaps were addressed by adding Appendix H to the Profile entitled “Consensus Formula for Computing Lean-Body-Mass Normalization for SUVs.” This appendix carefully traces the development of lean-body-mass calculation methods and their introduction into peer-reviewed literature describing SUV calculations. In addition, the DICOM standard was updated to allow for both the corrected default method and the more accurate improved lean-body-mass calculation method.
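
The SUV calculation and lean-body-mass normalization discussed in items 3 and 4 can be illustrated with a minimal Python sketch. This is not the Profile's vendor-neutral pseudocode: the function names and example inputs are hypothetical, the decay correction assumes the image has already been decay-corrected to the scan start time, and the lean-body-mass coefficients are those commonly attributed to Janmahasatian et al (44) and should be checked against Appendix H of the Profile before any real use.

```python
from datetime import datetime, timedelta

# Physical half-life of fluorine-18 (109.77 min), in seconds.
F18_HALF_LIFE_S = 6586.2


def decay_corrected_dose_bq(injected_bq, injection_time, scan_start_time,
                            half_life_s=F18_HALF_LIFE_S):
    """Decay the injected activity from injection time to scan start.

    Assumption: the image voxels are decay-corrected to scan start, so the
    dose must be referenced to that same time point.
    """
    elapsed_s = (scan_start_time - injection_time).total_seconds()
    if elapsed_s < 0:
        raise ValueError("scan start precedes injection time")
    return injected_bq * 2.0 ** (-elapsed_s / half_life_s)


def lean_body_mass_kg(weight_kg, height_m, sex):
    """Lean body mass with coefficients attributed to reference 44 (assumed)."""
    bmi = weight_kg / height_m ** 2
    if sex == "M":
        return 9270.0 * weight_kg / (6680.0 + 216.0 * bmi)
    if sex == "F":
        return 9270.0 * weight_kg / (8780.0 + 244.0 * bmi)
    raise ValueError("sex must be 'M' or 'F'")


def suv(concentration_bq_per_ml, dose_bq, mass_kg):
    """SUV in g/mL: tissue concentration divided by dose per gram of mass."""
    return concentration_bq_per_ml * (mass_kg * 1000.0) / dose_bq


if __name__ == "__main__":
    # Hypothetical patient and scan values, for illustration only.
    injection = datetime(2020, 1, 7, 9, 0, 0)
    scan_start = injection + timedelta(minutes=60)
    dose = decay_corrected_dose_bq(370e6, injection, scan_start)
    lbm = lean_body_mass_kg(70.0, 1.75, "M")
    print("SUV (body weight):", suv(5000.0, dose, 70.0))
    print("SUV (lean body mass):", suv(5000.0, dose, lbm))
```

Passing either total body weight or lean body mass as the `mass_kg` argument keeps the normalization choice explicit, which is the kind of ambiguity the common terminology in the Profile is meant to remove.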

If needed data were not available during the development of the QIBA Profile, then so-called groundwork studies were performed to collect the necessary data. Over a 6-year period, these studies were supported as part of a series of funded contracts from the National Institute of Biomedical Imaging and Bioengineering. One such example was the development of the digital reference object described above. If groundwork projects were not feasible, then expert consensus opinion was used for specifications following a Delphi-like process.

The version for public comment was released on January 17, 2013. The public comment phase ran through May 24, 2013, during which 96 comments, suggestions, and questions from several dozen individuals and organizations were received. The Profile was revised over the period ending November 2013 and all meeting minutes, responses, and recommendations are publicly available on the QIBA wiki site (2). The revised publicly reviewed version of the Profile was released as version 1.05 on December 11, 2013.

The FDG PET/CT Profile then went through a process of feasibility or field testing to progress to a so-called technically confirmed version (Table 3). The technical confirmation process was performed at over a dozen sites, ranging from academic sites through community hospitals, using systems from all manufacturers to assure that conformance was both reasonable and achievable. The feasibility testing also involved direct feedback from all PET/CT scanner manufacturers. During the technical confirmation phase, aspects of the Profile that can be tested, short of patient studies, were evaluated. In addition, systematic and extensive checklists for each actor (ie, entity) were created to facilitate assessment of conformance to the Profile. Important feedback from both imaging sites and manufacturers centered around the need for a simplification of the process, and more emphasis on a checklist approach. Many specific recommendations were provided. These recommendations were reviewed and used to modify the Profile and provide more concise checklists, which are included as Appendix I in the Profile.

The revised FDG PET/CT Profile achieved the level of technically confirmed as version 1.13 on November 18, 2016, after review and voting by QIBA members (23). All meeting minutes, queries, responses, and recommendations are publicly available on the QIBA wiki site (2).

During the course of its development, a QIBA Profile is intended to go through multiple stages as listed in Figure 7. The FDG PET/CT Profile is currently in the technically confirmed stage and plans are in development to move it to the so-called claim confirmed level. Profiles are expected to evolve due to the introduction of new technologies, new procedures, and more accurate methods or information. To achieve the level of a clinically confirmed profile, conformant data must be obtained and shown to achieve a statistically significant confirmation of the claim, which may require the equivalent of a multicenter clinical trial.

Figure 7:

Quantitative Imaging Biomarkers Alliance (QIBA) Profile levels of development.

Future Developments and Conclusions

There are updates under development or consideration for the current technically confirmed version of the FDG-PET/CT Profile:

  1. The next stage of the Profile will be “claim confirmed” (Fig 7), which requires multicenter testing with patient scans. Planning for this process is currently underway and includes protocol design and securing needed funding.

  2. The multicenter testing is designed to test inclusion of lesions smaller than 2.0 cm and with SUVs less than 4 g/mL. It will also include reassessment of the claim for technological advancements that have not been used in published test-retest studies but are now being used clinically, including scanners with improved resolution and sensitivity, point spread function–based reconstruction, time-of-flight imaging, radiation treatment planning using FDG PET/CT, and dual-mode PET/MRI scanners. A recent study estimated the wCV of SUVmax measured by using PET/MRI to be 6.6%–8.7% across all PET reconstructions (36).

  3. Future technology developments: Machine learning or deep learning–based image reconstruction algorithms, and extended axial field-of-view scanners including total body PET scanners.

  4. The FDG PET/CT Profile can be extended for other 18F-based radiotracers, as well as other positron emitting tracers in general. This is currently underway with the development of the QIBA Profile for 18F-labeled PET tracers targeting amyloid as an imaging biomarker (47).
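
To put the PET/MRI repeatability estimate in item 2 in the same terms as the Profile claims, the following Python sketch converts a wCV into asymmetric 95% limits for a true change. It assumes that SUVs are log-normally distributed and that both scans share the same wCV. This is a common approach for quantitative imaging biomarkers, but it is an assumption here rather than a quotation of Appendix B, and the function name is hypothetical. Under these assumptions, a wCV of 12% reproduces the thresholds of the Profile claim (an increase of 39% or a decrease of 28%).

```python
import math


def change_limits_from_wcv(wcv, z=1.96):
    """Return (lower, upper) fractional change limits for a given wCV.

    Assumptions (not taken from the Profile text): measurements are
    log-normal, the two scans are independent, and both have the same wCV.
    """
    # Standard deviation on the natural-log scale implied by the wCV.
    sigma_log = math.sqrt(math.log(1.0 + wcv ** 2))
    # The difference of two log measurements has variance 2 * sigma_log**2.
    half_width = z * math.sqrt(2.0) * sigma_log
    # Back-transform to fractional change; the limits are asymmetric.
    return math.exp(-half_width) - 1.0, math.exp(half_width) - 1.0


if __name__ == "__main__":
    # 0.12 is the upper end of the Profile claim; 0.066 and 0.087 are the
    # PET/MRI estimates quoted from reference 36.
    for wcv in (0.12, 0.066, 0.087):
        lower, upper = change_limits_from_wcv(wcv)
        print(f"wCV {wcv:.1%}: decrease beyond {lower:.1%}, "
              f"increase beyond {upper:.1%}")
```

The limits printed for the PET/MRI values are illustrative only and are not a claim of the Profile; establishing such thresholds is the purpose of the planned multicenter testing.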

It is anticipated that the technically confirmed version of the fluorodeoxyglucose (FDG) PET/CT Profile is now stable enough for implementation by all parties concerned with quantitative FDG PET/CT imaging. Indeed, claims of Quantitative Imaging Biomarkers Alliance (QIBA) conformance have started to appear in manufacturers' data sheets for product technical specifications in PET imaging, although this is currently a voluntary attestation. The QIBA process is still evolving and welcomes contributions from all interested parties. The process developed with FDG PET also provides a template for development and advancement of other QIBA profiles and, we believe, provides a solid foundation for moving quantitative imaging into clinical practice.

APPENDIX

Appendix E1, Tables E1-E2 (PDF)

Acknowledgments

We thank all the current and past members of the FDG-PET/CT Biomarker Committee (listed in Appendix A of the Profile), in particular previous co-chairs Richard Frank, MD and Ling Shao, PhD. We also thank the individuals and groups that provided the extensive comments for the publicly reviewed version of the FDG-PET/CT Profile. We also want to acknowledge the QIBA leadership and RSNA staff that have diligently supported the QIBA efforts, including Daniel Sullivan, MD; Edward Jackson, PhD; Kevin O’Donnell, MASc; and Andrew Buckler, MS.

Study supported in part by the National Institute of Biomedical Imaging and Bioengineering, National Institutes of Health, Department of Health and Human Services (HHSN268201000050C, HHSN268201300071C, HHSN268201500021C).

P.E.K. supported by the National Cancer Institute (U01CA148131, R01CA169072, U01CA190254). R.S. supported by the National Institutes of Health, Quantitative Imaging Biomarkers Alliance, and Radiological Society of North America. M.A.L. and N.A.O. supported by the Radiological Society of North America.

Disclosures of Conflicts of Interest: P.E.K. Activities related to the present article: disclosed no relevant relationships. Activities not related to the present article: has grants/grants pending with GE Healthcare. Other relationships: disclosed no relevant relationships. E.S.P. Activities related to the present article: disclosed no relevant relationships. Activities not related to the present article: is employed by Perlman Advisory Group. Other relationships: disclosed no relevant relationships. J.J.S. disclosed no relevant relationships. R.S. Activities related to the present article: disclosed no relevant relationships. Activities not related to the present article: is a board member and consultant for Blue Earth Diagnostics. Other relationships: disclosed no relevant relationships. S.D.W. Activities related to the present article: disclosed no relevant relationships. Activities not related to the present article: is employed by GE Healthcare. Other relationships: disclosed no relevant relationships. T.G.T. Activities related to the present article: disclosed no relevant relationships. Activities not related to the present article: is a consultant for Data Spectrum. Other relationships: disclosed no relevant relationships. M.A.L. disclosed no relevant relationships. R.B. disclosed no relevant relationships. N.A.O. Activities related to the present article: disclosed no relevant relationships. Activities not related to the present article: is a board member of Eastern Cooperative Oncology Group-American College of Radiology Imaging Network. Other relationships: disclosed no relevant relationships. R.L.W. Activities related to the present article: disclosed no relevant relationships. Activities not related to the present article: is a board member of Clarity; is a consultant for Clarity and Nihon Medi-Physics; is employed by Washington University; has grants/grants pending with the National Institutes of Health; received payment for lectures including service on speakers bureaus from Siemens. Other relationships: disclosed no relevant relationships.

Abbreviations:

EORTC: European Organization for Research and Treatment of Cancer
FDG: fluorodeoxyglucose
QIBA: Quantitative Imaging Biomarkers Alliance
SUV: standardized uptake value
SUVmax: maximum SUV
UPICT: Uniform Protocol for Imaging in Clinical Trials
wCV: within-subject coefficient of variation

References

  1. Jackson EF. Quantitative Imaging: The Translation from Research Tool to Clinical Practice. Radiology 2018;286(2):499–501.
  2. QIBA. Quantitative Imaging Biomarkers Alliance (QIBA). https://www.rsna.org/QIBA. Accessed August 8, 2019.
  3. IHE. IHE Radiology User’s Handbook. ACC/HIMSS/RSNA. https://www.ihe.net/wp-content/uploads/2018/07/ihe_radiology_users_handbook_2005edition.pdf. Published 2005. Accessed August 8, 2019.
  4. Wahl RL, Jacene H, Kasamon Y, Lodge MA. From RECIST to PERCIST: Evolving Considerations for PET response criteria in solid tumors. J Nucl Med 2009;50(Suppl 1):122S–150S.
  5. Fletcher JW, Djulbegovic B, Soares HP, et al. Recommendations on the use of 18F-FDG PET in oncology. J Nucl Med 2008;49(3):480–508.
  6. Kelloff GJ, Hoffman JM, Johnson B, et al. Progress and promise of FDG-PET imaging for cancer patient management and oncologic drug development. Clin Cancer Res 2005;11(8):2785–2808.
  7. Ward PS, Thompson CB. Metabolic reprogramming: a cancer hallmark even Warburg did not anticipate. Cancer Cell 2012;21(3):297–308.
  8. Hatt M, Cheze Le Rest C, Albarghach N, Pradier O, Visvikis D. PET functional volume delineation: a robustness and repeatability study. Eur J Nucl Med Mol Imaging 2011;38(4):663–672.
  9. Nestle U, Kremp S, Schaefer-Schuler A, et al. Comparison of different methods for delineation of 18F-FDG PET-positive tissue for target volume definition in radiotherapy of patients with non-small cell lung cancer. J Nucl Med 2005;46(8):1342–1348.
  10. Weber WA. Assessing tumor response to therapy. J Nucl Med 2009;50(Suppl 1):1S–10S.
  11. Connolly RM, Leal JP, Solnes L, et al. TBCRC026: Phase II Trial Correlating Standardized Uptake Value With Pathologic Complete Response to Pertuzumab and Trastuzumab in Breast Cancer. J Clin Oncol 2019;37(9):714–722. [Retracted]
  12. FDA. Clinical Trial Imaging Endpoint Process Standards Guidance for Industry. Silver Spring, Md: FDA, 2018; 1–31.
  13. FDA. Technical Performance Assessment of Quantitative Imaging in Device Premarket Submissions: Draft Guidance for Industry and Food and Drug Administration Staff. Silver Spring, Md: FDA, 2019; 1–17.
  14. Adams MC, Turkington TG, Wilson JM, Wong TZ. A systematic review of the factors affecting accuracy of SUV measurements. AJR Am J Roentgenol 2010;195(2):310–320.
  15. Boellaard R. Standards for PET image acquisition and quantitative data analysis. J Nucl Med 2009;50(Suppl 1):11S–20S.
  16. Kinahan PE, Fletcher JW. Positron emission tomography-computed tomography standardized uptake values in clinical practice and assessing response to therapy. Semin Ultrasound CT MR 2010;31(6):496–505.
  17. Sullivan DC, Obuchowski NA, Kessler LG, et al. Metrology Standards for Quantitative Imaging Biomarkers. Radiology 2015;277(3):813–825.
  18. de Langen AJ, Vincent A, Velasquez LM, et al. Repeatability of 18F-FDG uptake measurements in tumors: a metaanalysis. J Nucl Med 2012;53(5):701–708.
  19. Graham MM, Wahl RL, Hoffman JM, et al. Summary of the UPICT Protocol for 18F-FDG PET/CT Imaging in Oncology Clinical Trials. J Nucl Med 2015;56(6):955–961.
  20. FDG-PET/CT UPICT Subgroup. FDG-PET/CT UPICT V 2.0. Publically-reviewed version. Uniform Protocols for Imaging in Clinical Trials. Quantitative Imaging Biomarkers Alliance. http://qibawiki.rsna.org/index.php/Profiles. Published 2014. Accessed September 28, 2019.
  21. Kurland BF, Peterson LM, Shields AT, et al. Test-Retest Reproducibility of 18F-FDG PET/CT Uptake in Cancer Patients Within a Qualified and Calibrated Local Network. J Nucl Med 2019;60(5):608–614.
  22. Kumar V, Nath K, Berman CG, et al. Variance of SUVs for FDG-PET/CT is greater in clinical practice than under ideal study settings. Clin Nucl Med 2013;38(3):175–182.
  23. FDG-PET/CT Biomarker Committee. FDG-PET/CT as an Imaging Biomarker Measuring Response to Cancer Therapy Profile. Version 1.13. Technically Confirmed Version. Quantitative Imaging Biomarkers Alliance. http://qibawiki.rsna.org/images/1/1f/QIBA_FDG-PET_Profile_v113.pdf. Published 2016. Accessed August 8, 2019.
  24. Hatt M, Cheze-Le Rest C, Aboagye EO, et al. Reproducibility of 18F-FDG and 3′-deoxy-3′-18F-fluorothymidine PET tumor volume measurements. J Nucl Med 2010;51(9):1368–1376.
  25. Kamibayashi T, Tsuchida T, Demura Y, et al. Reproducibility of semi-quantitative parameters in FDG-PET using two different PET scanners: influence of attenuation correction method and examination interval. Mol Imaging Biol 2008;10(3):162–166.
  26. Krak NC, Boellaard R, Hoekstra OS, Twisk JW, Hoekstra CJ, Lammertsma AA. Effects of ROI definition and reconstruction method on quantitative outcome and applicability in a response monitoring trial. Eur J Nucl Med Mol Imaging 2005;32(3):294–301.
  27. Minn H, Zasadny KR, Quint LE, Wahl RL. Lung cancer: reproducibility of quantitative measurements for evaluating 2-[F-18]-fluoro-2-deoxy-D-glucose uptake at PET. Radiology 1995;196(1):167–173.
  28. Nahmias C, Wahl LM. Reproducibility of standardized uptake value measurements determined by 18F-FDG PET in malignant tumors. J Nucl Med 2008;49(11):1804–1808.
  29. Nakamoto Y, Zasadny KR, Minn H, Wahl RL. Reproducibility of common semi-quantitative parameters for evaluating lung cancer glucose metabolism with positron emission tomography using 2-deoxy-2-[18F]fluoro-D-glucose. Mol Imaging Biol 2002;4(2):171–178.
  30. Velasquez LM, Boellaard R, Kollia G, et al. Repeatability of 18F-FDG PET in a multicenter phase I study of patients with advanced gastrointestinal malignancies. J Nucl Med 2009;50(10):1646–1654.
  31. Weber WA, Ziegler SI, Thödtmann R, Hanauske AR, Schwaiger M. Reproducibility of metabolic measurements in malignant tumors using FDG PET. J Nucl Med 1999;40(11):1771–1777.
  32. Weber WA, Gatsonis CA, Mozley PD, et al. Repeatability of 18F-FDG PET/CT in Advanced Non-Small Cell Lung Cancer: Prospective Assessment in 2 Multicenter Trials. J Nucl Med 2015;56(8):1137–1143.
  33. Thie JA, Hubner KF, Smith GT. The diagnostic utility of the lognormal behavior of PET standardized uptake values in tumors. J Nucl Med 2000;41(10):1664–1672.
  34. Bland JM, Altman DG. Measurement error proportional to the mean. BMJ 1996;313(7049):106.
  35. Lodge MA. Repeatability of SUV in Oncologic 18F-FDG PET. J Nucl Med 2017;58(4):523–532.
  36. Fraum TJ, Fowler KJ, Crandall JP, et al. Measurement Repeatability of 18F-FDG PET/CT Versus 18F-FDG PET/MRI in Solid Tumors of the Pelvis. J Nucl Med 2019;60(8):1080–1086.
  37. Young H, Baum R, Cremerius U, et al. Measurement of clinical and subclinical tumour response using [18F]-fluorodeoxyglucose and positron emission tomography: review and 1999 EORTC recommendations. European Organization for Research and Treatment of Cancer (EORTC) PET Study Group. Eur J Cancer 1999;35(13):1773–1782.
  38. Eisenhauer EA, Therasse P, Bogaerts J, et al. New response evaluation criteria in solid tumours: revised RECIST guideline (version 1.1). Eur J Cancer 2009;45(2):228–247.
  39. Shankar LK, Hoffman JM, Bacharach S, et al. Consensus recommendations for the use of 18F-FDG PET as an indicator of therapeutic response in patients in National Cancer Institute Trials. J Nucl Med 2006;47(6):1059–1066.
  40. Boellaard R, Oyen WJ, Hoekstra CJ, et al. The Netherlands protocol for standardisation and quantification of FDG whole body PET studies in multi-centre trials. Eur J Nucl Med Mol Imaging 2008;35(12):2320–2333.
  41. Boellaard R, Delgado-Bolton R, Oyen WJ, et al. FDG PET/CT: EANM procedure guidelines for tumour imaging: version 2.0. Eur J Nucl Med Mol Imaging 2015;42(2):328–354.
  42. MacFarlane CR; American College of Radiologists. ACR accreditation of nuclear medicine and PET imaging departments. J Nucl Med Technol 2006;34(1):18–24.
  43. Boellaard R, O’Doherty MJ, Weber WA, et al. FDG PET and PET/CT: EANM procedure guidelines for tumour PET imaging: version 1.0. Eur J Nucl Med Mol Imaging 2010;37(1):181–200.
  44. Janmahasatian S, Duffull SB, Ash S, Ward LC, Byrne NM, Green B. Quantification of lean bodyweight. Clin Pharmacokinet 2005;44(10):1051–1065.
  45. Pierce LA 2nd, Elston BF, Clunie DA, Nelson D, Kinahan PE. A Digital Reference Object to Analyze Calculation Accuracy of PET Standardized Uptake Value. Radiology 2015;277(2):538–545.
  46. University of Washington Imaging Research Laboratory. The PET/CT Digital Reference Object. http://depts.washington.edu/petctdro/DROsuv_main.html. Accessed August 8, 2019.
  47. PET-Amyloid Biomarker Committee. 18F-labeled PET tracers targeting Amyloid as an Imaging Biomarker. Consensus version. Quantitative Imaging Biomarkers Alliance. http://qibawiki.rsna.org/images/3/3f/QIBA_AmyloidPET_20June2018_consensus.pdf. Published 2018. Accessed August 8, 2019.

Articles from Radiology are provided here courtesy of Radiological Society of North America
