The British Journal of Radiology. 2016 Aug 19;89(1065):20160113. doi: 10.1259/bjr.20160113

Pulmonary imaging after stereotactic radiotherapy—does RECIST still apply?

Sarah A Mattonen 1, Aaron D Ward 1,2, David A Palma 2,3
PMCID: PMC5124920  PMID: 27245137

Abstract

The use of stereotactic ablative radiotherapy (SABR) for the treatment of primary lung cancer and metastatic disease is rapidly increasing. However, the presence of benign fibrotic changes on CT imaging makes response assessment following SABR a challenge, as these changes develop with an appearance similar to tumour recurrence. Misclassification of benign fibrosis as local recurrence has resulted in unnecessary interventions, including biopsy and surgical resection. Response evaluation criteria in solid tumours (RECIST) are widely used as a universal set of guidelines to assess tumour response following treatment. However, in the context of non-spherical and irregular post-SABR fibrotic changes, the RECIST criteria can have several limitations. Positron emission tomography can also play a role in response assessment following SABR; however, false-positive results in regions of inflammatory lung post-SABR can be a major clinical issue and optimal standardized uptake values to distinguish fibrosis and recurrence have not been determined. Although validated CT high-risk features show a high sensitivity and specificity for predicting recurrence, most recurrences are not detected until more than 1-year post-treatment. Advanced quantitative radiomic analysis on CT imaging has demonstrated promise in distinguishing benign fibrotic changes from local recurrence at earlier time points, and more accurately, than physician assessment. Overall, the use of RECIST alone may prove inferior to novel metrics of assessing response.


Stereotactic ablative radiotherapy (SABR) has become a standard treatment option for patients with early stage (T1/T2 N0) non-small-cell lung cancer who refuse surgery or are considered medically inoperable.1,2 The use of SABR, which is also known as stereotactic body radiation therapy, for curative-intent treatment of non-small-cell lung cancer has been rapidly increasing over the last decade.3 SABR differs from conventional radiotherapy techniques in that it delivers high doses per fraction (approximately 18 Gy per fraction vs 2 Gy per fraction) over a shorter treatment time (typically 1–2 weeks vs 4–6 weeks). These high doses are achievable with the use of highly conformal treatment plans, which include precise planning, targeting and treatment delivery.

The effectiveness of SABR in local tumour control has been well established. Reported 3-year local control rates often exceed 90%.4,5 These high rates of local control have led to suggestions that SABR may be as effective as surgery and should be considered for use in patients who are operable.6 Three randomized trials comparing resection vs SABR have closed owing to poor accrual. A pooled analysis of the accrued patients from two trials has been completed, and although the sample size was small, results showed the two treatment options to be comparable.4 SABR was better tolerated (10% grade 3 toxicity with SABR vs 44% grade 3–4 toxicity with surgery), with better post-treatment quality of life.7 SABR achieved better overall survival (OS) than surgery (3-year OS 95% vs 79%; p = 0.037); however, larger studies are needed to confirm these findings.

In addition to treatment of primary lung cancer, the use of SABR has also been rapidly increasing for oligometastatic disease.8,9 Several single-institution studies have demonstrated high rates of local control, with favourable comparisons with surgery in OS outcomes.10,11 However, for colorectal cancer, rates of local control after SABR may be lower than that of other histologies, approximately 70–80%.11 The impact of SABR on OS in patients with oligometastatic disease is currently being evaluated in randomized trials (NCT01446744 and NCT02364557).12

COMMON APPEARANCES ON CT IMAGING AFTER STEREOTACTIC ABLATIVE RADIOTHERAPY

With the increase in the number of patients receiving SABR for primary lung cancer or metastatic disease, determining the appropriate follow-up and management of patients is critical.13 With a shift towards the use of SABR for patients declining surgery, or borderline operative candidates, modern cohorts receiving SABR are fitter, with longer life expectancies. As a result, surgical or non-surgical salvage opportunities are available if failure occurs.14–16

After SABR treatment, patients are typically followed with physical examination and CT imaging every 3–6 months for the first 3 years following treatment.17 The development of radiographic radiation-induced lung injury (RILI) on CT imaging is common. This is a direct result of the highly ablative and conformal doses delivered with SABR, which can result in these changes appearing similar to a recurring tumour (Figure 1). The total dose, fractionation, treatment delivery technology and tumour size are all factors which may affect the degree of radiographic lung injury.18,19

Figure 1.

Planning CT image for stereotactic ablative radiotherapy treatment and subsequent follow-up imaging after radical treatment for early stage primary lung cancer. m, months; y, years.

The appearance and patterns of RILI can vary across follow-up time intervals. Radiation pneumonitis is typically seen in the acute setting within 6 months of treatment, following which it is classified as fibrosis. In the acute setting, common CT patterns include consolidative and ground-glass opacity changes. Late findings include modified conventional, mass-like or scar-like patterns.20,21 A modified conventional pattern has been described, defining a fibrosis pattern that is larger than the original tumour size, may be associated with ground-glass opacity, and may include consolidation, volume loss and bronchiectasis that is similar to or less extensive than conventional radiation fibrosis.20,22 These radiographic changes can persist and continue to evolve even after 2 years following treatment.

These changes on CT can result in a major clinical dilemma with respect to accurately distinguishing patients with local recurrence from those with benign RILI, especially in cases with mass-like changes.23 Misclassification of benign RILI as recurrence can result in patients undergoing unnecessary biopsy or surgical intervention for what is only benign disease. On the other hand, if a local recurrence is missed, patients may lose the opportunity for timely salvage intervention. Several groups have reported patients with suspicious findings on CT and/or fludeoxyglucose (FDG) positron emission tomography (PET) imaging who underwent salvage lung resection, only for pathology to show no viable tumour cells.24–26 In most cases, persistent CT findings do not indicate recurrence: a recent study determining the fate of residual masses after SABR found that in 50 patients with masses present for more than 1 year following treatment, only 8 patients developed local recurrence.27

TIMING AND FREQUENCY OF RECURRENCES AFTER STEREOTACTIC ABLATIVE RADIOTHERAPY

Outcomes following SABR are favourable, with recent studies demonstrating 5-year local and regional control rates of 90% and 87%, respectively.17 Local recurrences, usually defined as failure within the treated area, typically manifest at a median time of 15 months post-SABR, but they may present up to 5 years following treatment.17 Despite high rates of local control, patients still remain at risk of lobar recurrence: the multicentre Radiation Therapy Oncology Group 0236 trial demonstrated a 5-year primary tumour recurrence rate of 7%, but an involved lobar recurrence rate of 20%.28,29 However, lobar recurrence after SABR may be difficult to distinguish from development of second primary lung cancers. Regardless of the classification as recurrence or second primary lung cancers, many patients with lobar recurrence can be salvaged with surgical resection.4,15,16,30

Many factors have been identified in the literature as predictive of local control based on Cox multivariable analysis. These include both dose factors, including the biologically effective dose and minimum planning target volume dose, as well as tumour factors including T-stage and gross tumour volume size.31–33

RESPONSE EVALUATION CRITERIA IN SOLID TUMOURS

Response evaluation criteria in solid tumours (RECIST) are the standard measure of imaging response in oncology. The RECIST, first published in 2000, have been widely adopted by many institutions and provide a clear set of guidelines to perform unidimensional measurements for the overall evaluation of tumour response. In 2009, the RECIST guidelines were updated to version 1.1,34 in which specific criteria are used to determine tumour response based on measurement of the sum of longest diameters of all target lesions. The baseline sum of longest diameters is used as the reference to characterize response. A complete response denotes the disappearance of all target lesions. A partial response is at least a 30% decrease in sum of longest diameters of the target lesions (reference being the baseline sum of longest diameters). Progressive disease (PD) is at least a 20% increase in sum of longest diameters of the target lesions (reference being the smallest sum of longest diameters since the start of the treatment) or the appearance of one or more new lesions. Lastly, stable disease has neither sufficient shrinkage to be considered a partial response nor sufficient increase in size to be considered PD (<30% decrease and <20% increase in the sum of longest diameters), again taking as reference the smallest sum of longest diameters since the start of the treatment.
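As an illustration only, the thresholds described above can be written as a short Python sketch. The function name, argument names and return labels are hypothetical. The 5 mm minimum absolute increase for PD is part of RECIST version 1.1 but is not restated in the text above, so it is exposed as a parameter. Non-target lesions and lymph node rules are ignored.

```python
def recist_response(baseline_sum_mm, nadir_sum_mm, current_sum_mm,
                    new_lesions=False, min_abs_increase_mm=5.0):
    """Classify target-lesion response from sums of longest diameters (mm).

    baseline_sum_mm: sum of longest diameters before treatment.
    nadir_sum_mm:    smallest sum recorded since the start of treatment.
    current_sum_mm:  sum at the assessment being classified.
    Returns one of "CR", "PR", "SD" or "PD" (hypothetical labels).
    """
    if baseline_sum_mm <= 0 or nadir_sum_mm < 0 or current_sum_mm < 0:
        raise ValueError("sums must be non-negative and baseline must be > 0")

    # Any new lesion is progressive disease regardless of measurements.
    if new_lesions:
        return "PD"

    # Progression is judged against the nadir, not the baseline.
    growth_mm = current_sum_mm - nadir_sum_mm
    if nadir_sum_mm > 0:
        relative_growth_ok = growth_mm / nadir_sum_mm >= 0.20
    else:
        relative_growth_ok = growth_mm > 0  # regrowth after disappearance
    if relative_growth_ok and growth_mm >= min_abs_increase_mm:
        return "PD"

    # Complete response: all target lesions have disappeared.
    if current_sum_mm == 0:
        return "CR"

    # Partial response is judged against the baseline sum.
    if (baseline_sum_mm - current_sum_mm) / baseline_sum_mm >= 0.30:
        return "PR"

    return "SD"
```

For example, a baseline sum of 30 mm that shrinks to 20 mm is classified as a partial response, whereas regrowth from a nadir of 20 mm to 25 mm is classified as PD.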

Response is determined through measurement of the longest diameter of the target lesion within the imaging plane (axial for CT imaging). In the event of isotropic reconstructions, measurements can be made on the reconstructed images in the non-imaging planes. However, since not all radiology sites are capable of producing isotropic reconstructions, caution must be taken to avoid the undesirable situation in which measurements are taken on different imaging planes at subsequent assessments. It is worth noting that for CT scans of the chest, in which typical slice thicknesses of 5 mm are used, target lesions should have a minimum size of 10 mm to be considered measurable. There are also several other CT image acquisition parameters which should be taken into account for consistency when evaluating lesions using RECIST. These include the anatomic coverage, contrast administration, slice thickness and reconstruction interval, which can all impact the evaluation of lesion response.34

LIMITATIONS OF RESPONSE EVALUATION CRITERIA IN SOLID TUMOURS

Although RECIST provides a clear set of guidelines for response assessment, it has several limitations.35,36 Response assessment based on RECIST relies on the physician measurement of lesion diameter. It has been well described that variability exists in the measurement of target lesion diameter, and this can have an impact on accurately assessing response.37–39 Interobserver variability is greater than intraobserver variability, and measurement differences are greatest when there is an irregular edge or spiculated lesion.39 For consistent measurements, one should consider having a single observer measure the target lesion response across the course of follow-up. The specification of non-measurable disease when the lesion diameter is <10 mm can be a major limitation after SABR for small lung nodules.35 The requirement that measurements be taken in the imaging plane can also be a limitation in the context of post-SABR response assessment, since craniocaudal growth may be a major predictor of recurrence and is measured in the sagittal/coronal plane.40

RECIST can also be challenging for patients receiving new agents such as targeted drugs or immunotherapy. Following targeted therapy, stable disease might be the best response rate observed across follow-up.41 In the case of immunotherapy, initial pseudoprogression can result in judgment of PD according to the response criteria.42

In the context of response assessment following SABR, the presence of benign fibrotic changes within the high-dose region on CT can affect the ability to accurately assess response.43 When measuring the longest axial diameter of post-SABR changes, it can be unknown whether these changes represent viable tumour cells or benign fibrotic tissue. Another limitation of RECIST is the difficulty and inherent variability in measuring non-spherical lesions. This is particularly important in patients treated with SABR, as the appearance and morphology of post-SABR changes can be quite irregular with pleural attachment (Figure 1). This makes accurately determining local lesion response very difficult in light of the significant fibrotic changes following SABR. An example of RECIST failure in a patient treated with SABR is shown in Figure 2. The ability of size measurements to predict local recurrence at 3 or 6 months post-SABR was investigated and showed a poor performance, with areas under the receiver-operating characteristic curve of 0.65–0.72.44 However, longest axial diameter measurements were significantly different between recurrence and injury patient groups further from treatment (after 15 months post-SABR), when changes become more salient on imaging.45

Figure 2.

Demonstration of response evaluation criteria in solid tumours (RECIST) failure in a patient who received stereotactic radiotherapy for stage I non-small-cell lung cancer: radiation planning scan (a) showing as concentric lines, from inner to outer, the prescribed dose (54 Gy in 3 fractions), 50% of prescribed dose and 25% of prescribed dose. The 3-month scan (b) showing a large area of consolidation meeting the RECIST criteria for progressive disease, but the patient was observed. Ongoing observation at 6 months (c) and 40 months (d) showed development of fibrosis with no progression.

LIMITATIONS OF PHYSICIAN ASSESSMENT

In light of these limitations, physician assessment of response is difficult.46 In a blinded study, 3 thoracic radiation oncologists and 3 thoracic radiologists were asked to independently score all follow-up images for 45 patients treated with SABR. A custom-made interface was developed to mimic the clinical assessment of sequential scans over time. Physicians were asked to score each image as either local recurrence or benign injury/no recurrence, indicate their certainty level of the assessment and recommend their immediate next step for follow-up.

Physicians had a median sensitivity of 83.8% (range 67–100%) and median specificity of 75.0% (range 67–87%), with only a moderate level of agreement across 6 observers, for detecting local recurrence at any time point during follow-up. This indicates that many patients are still being misclassified, which can result in a recommendation for further intervention with PET, biopsy or immediate salvage treatment for patients with only benign disease. Local recurrences were also typically not detected until more than 1-year post-SABR. Radiologists were generally able to detect the recurrence earlier (mean of 13.4 months) than radiation oncologists (mean of 18.2 months), but also had a lower specificity (mean of 70% vs 82% for radiation oncologists), i.e. a higher rate of false-positive assessments. Physicians were also typically not suspicious of recurrence within 6 months of SABR, and this finding is consistent with the published literature on early post-SABR imaging. Daly et al47 found that, of 62 patients with changes on images within 6 months of treatment, only 3% were ultimately diagnosed with early recurrence.
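The relationship between the accuracy measures quoted in this section can be sketched in a few lines of Python. The function name and the counts in the usage example are hypothetical and are not taken from the study; the sketch only shows how sensitivity, specificity, false-positive rate, false-negative rate and overall error are derived from the same four counts.

```python
def diagnostic_rates(tp, fp, tn, fn):
    """Derive accuracy measures from confusion-matrix counts.

    tp: recurrences scored as recurrence   fn: recurrences scored as benign
    tn: benign cases scored as benign      fp: benign cases scored as recurrence
    """
    positives = tp + fn  # cases with true local recurrence
    negatives = tn + fp  # cases with benign injury only
    if positives == 0 or negatives == 0:
        raise ValueError("need at least one recurrence and one benign case")
    return {
        "sensitivity": tp / positives,
        "specificity": tn / negatives,
        # The false-positive rate is the complement of specificity.
        "false_positive_rate": fp / negatives,
        # The false-negative rate is the complement of sensitivity.
        "false_negative_rate": fn / positives,
        "error": (fp + fn) / (positives + negatives),
    }
```

With hypothetical counts of 10 true positives, 6 false positives, 24 true negatives and 2 false negatives, `diagnostic_rates(10, 6, 24, 2)` gives a specificity of 0.80 and therefore a false-positive rate of 0.20.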

POTENTIAL ALTERNATIVES TO RESPONSE EVALUATION CRITERIA IN SOLID TUMOURS

High-risk CT features

A series of high-risk features (HRFs) on CT imaging have been identified for the detection of local recurrence following SABR. These include the presence of an enlarging opacity, enlargement after 1 year, sequential enlargement from one scan to the next, bulging margin, linear margin disappearance and air bronchogram loss.48 These HRFs were identified based on a systematic review of the literature and then validated in a blinded study of patients with pathologic proof of recurrence.40 Patients with recurrence were matched 1 : 2 to patients without local recurrence according to baseline factors. A new HRF of craniocaudal growth was identified in this cohort. All HRFs were significantly associated with local recurrence, and the odds of recurrence increased fourfold for each additional HRF.40 A recent validation of these features was performed on an independent patient cohort and it demonstrated bulging margin, linear margin disappearance and craniocaudal growth to be the best predictors (Table 1).49 Combining specific HRFs was also shown to yield higher sensitivities and specificities than simply counting the number of HRFs present.

Table 1.

High-risk features for recurrence prediction on CT imaging40

High-risk feature                        Sensitivity (%)                 Specificity (%)
(by study)                               Huang et al40   Peulen et al49  Huang et al40   Peulen et al49
Enlarging opacity at primary site        92              100             67              31
Sequential enlargement                   67              62              100             77
Enlargement after 12 months              100             92              83              50
Bulging margin                           83              85              83              100
Linear margin disappearance              42              85              100             100
Loss of air bronchogram                  67              15              96              100
Craniocaudal growth of ≥5 mm and ≥20%    92              100             83              50

However, not all studies have found all HRFs to be useful. A study by Halpenny et al50 examined the predictive value of qualitative CT features for predicting local recurrence following SABR. 8 patients with local recurrence and 83 patients without local recurrence were evaluated for the following signs of local recurrence on CT: a new bulging margin, opacification of air bronchograms, a new or enlarging pleural effusion, a new or enlarging mass or increase in lung density in the irradiated field. They found that the only feature significantly associated with local recurrence was a new bulging margin at the treatment site.

The use of HRFs is subject to limitations. Early detection of local recurrence is difficult, as many HRFs require sequential assessments (i.e. sequential enlargement, loss of air bronchograms and loss of linear margin) and their detection may vary depending on the frequency of scanning. One HRF, enlargement after 1 year, by definition cannot be detected until more than 1 year following treatment. Interobserver and intraobserver variability in detecting HRFs is not well established.

A systematic workflow for imaging follow-up post-SABR has been published based on HRFs and maximum standardized uptake value (SUVmax) on PET imaging.40 This workflow classifies patients as having a low, intermediate or high risk of recurrence. A more rigorous follow-up might be indicated in patients with a higher likelihood of disease recurrence, including those patients with larger tumours or suboptimal radiation doses.51,52 As more data become available in the management of patients following SABR, applicability of this follow-up recommendation is expected to change.

Radiomics

Radiomics is an emerging area of study which aims to extract more information from medical images.53,54 The use of radiomics and texture analysis in oncology, and specifically radiation oncology, has been rapidly expanding over the past decade to quantify tumour heterogeneity and predict response.55 Radiomics has the potential to tailor a patient's radiotherapy treatment based on predicted response on pre-treatment imaging, or to detect treatment failure at an earlier time point post-treatment.56 This can involve the extraction of quantitative image features from regions of interest on either pre- or post-treatment images. To undertake radiomics analyses, a region of interest is defined and within it a series of radiomic image features can be calculated (Figure 3).57 Such features include first-order statistics based on the distribution of the intensity histogram (e.g. mean, median and standard deviation). Second-order texture features take into account the neighbouring relationships of voxels within the region of interest. These features can include grey-level co-occurrence matrix texture features as well as grey-level run length matrix texture features.58–61 Size- and shape-based features of the region can also be calculated. Size can be quantified by measures such as three-dimensional volume and surface area.62 Shape-based features can include the sphericity, roughness or spiculation to characterize shape complexity.63

Figure 3.

Radiomics involves image acquisition and region of interest delineation. An example CT image and corresponding region of interest are shown. Within the region of interest, several image features can be extracted, including first-order statistics, second-order texture and size- and shape-based features. These features can be used to predict patient outcomes.
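To make the distinction between first-order and second-order features concrete, a minimal NumPy sketch follows. It is not the pipeline used in any of the cited studies. It assumes a small rectangular two-dimensional patch rather than an arbitrary region of interest, assumes intensities have already been discretized to integer grey levels from 0 to `levels - 1`, and uses a single voxel offset. All function names are hypothetical.

```python
import numpy as np


def first_order_features(values):
    """First-order statistics of the intensity distribution."""
    v = np.asarray(values, dtype=float)
    return {
        "mean": float(v.mean()),
        "median": float(np.median(v)),
        "std": float(v.std()),  # population standard deviation
    }


def glcm(image, levels, offset=(0, 1)):
    """Symmetric, normalized grey-level co-occurrence matrix for one offset.

    Entry (i, j) is the probability that a voxel with grey level i has a
    neighbour with grey level j at the given (row, column) offset.
    """
    img = np.asarray(image, dtype=int)
    if img.min() < 0 or img.max() >= levels:
        raise ValueError("grey levels must lie in [0, levels)")
    d_row, d_col = offset
    rows, cols = img.shape
    counts = np.zeros((levels, levels), dtype=float)
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + d_row, c + d_col
            if 0 <= r2 < rows and 0 <= c2 < cols:
                # Count the pair in both directions to keep the matrix symmetric.
                counts[img[r, c], img[r2, c2]] += 1
                counts[img[r2, c2], img[r, c]] += 1
    total = counts.sum()
    return counts / total if total > 0 else counts


def glcm_features(p):
    """Three common texture features computed from a normalized GLCM."""
    i, j = np.indices(p.shape)
    return {
        "contrast": float(np.sum(p * (i - j) ** 2)),
        "energy": float(np.sum(p ** 2)),
        "homogeneity": float(np.sum(p / (1.0 + (i - j) ** 2))),
    }
```

A uniform patch and a checkerboard patch can share similar first-order statistics yet differ sharply in contrast and homogeneity, which is the kind of spatial information that second-order features add.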

In patients receiving SABR, the performance of radiomics for early prediction of recurrence has been compared with physician assessment on CT imaging.46 On follow-up CT images, two regions of interest intended to sample regions of post-SABR changes were semi-automatically generated: consolidative and surrounding periconsolidative regions. First-order statistics, second-order grey-level co-occurrence matrix texture features and size- and shape-based features were calculated within the regions. Feature selection and machine learning were performed to generate and evaluate a radiomic signature's ability to predict local recurrence. A radiomic signature consisting of five image appearance features in post-SABR consolidative and periconsolidative regions could predict recurrence within 6 months post-SABR with an error of 23.7%, false-positive rate (FPR) of 24.0% and false-negative rate (FNR) of 23.1% on a data set of 45 patients. At the same time point following SABR, physicians assessed the majority of images as benign injury/no recurrence, with an FNR of 99%. These findings require validation prior to clinical implementation and several unanswered questions remain, including determining how physicians will perform when provided with the radiomics decision support tool.

The use of quantitative and CT texture analysis has also been applied to quantify radiation-induced lung damage. Quantification of lung density changes has been investigated for patient-specific susceptibility of radiation-induced lung damage following SABR.64 A recent study by Ghobadi et al65 showed that the combination of mean density changes with the standard deviation of the density was a more sensitive and specific method to assess radiation-induced lung damage than measuring differences in mean density. Predictive modelling of radiation pneumonitis using texture analysis on CT has been studied following definitive radiation for lung and oesophageal cancer.66,67 Future work integrating radiomics and genomics (radiogenomics) could aid in characterizing tumour phenotypes and genotypes to associate with outcomes.68

Fludeoxyglucose positron emission tomography

The use of FDG-PET in the context of response assessment post-SABR has been well studied; however, the data are quite heterogeneous.69–72 Some studies have shown that the SUVmax73–76 and residual standardized uptake value (SUV) 12 weeks post-treatment77 are strong predictors of local recurrence. Additional work has found that a pre-treatment SUVmax ≥ 5, post-treatment SUVmax ≥ 2 or a reduction in SUVmax < 2.55 were associated with a higher risk of distant failure.78 Although optimal SUV cut-offs vary across studies, an SUVmax > 5, or greater than the pre-treatment value, appears to be most indicative of recurrence.48,76 However, many of these studies are subject to Type I errors, as multiple SUV cut-offs were assessed for statistical significance.
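The Type I error concern can be made concrete with a small calculation. The sketch below assumes, purely for illustration, that each SUV cut-off is tested independently at the same significance level. Tests of neighbouring cut-offs on the same patients are correlated, so the true inflation is typically smaller, but the direction of the effect is the same. The function names are hypothetical.

```python
def familywise_error_rate(alpha, n_tests):
    """Probability of at least one false-positive result among n_tests
    independent tests, each performed at significance level alpha."""
    if not 0 < alpha < 1 or n_tests < 1:
        raise ValueError("alpha must be in (0, 1) and n_tests must be >= 1")
    return 1.0 - (1.0 - alpha) ** n_tests


def bonferroni_alpha(alpha, n_tests):
    """Per-test significance level that keeps the familywise error rate
    at or below alpha (Bonferroni correction)."""
    if not 0 < alpha < 1 or n_tests < 1:
        raise ValueError("alpha must be in (0, 1) and n_tests must be >= 1")
    return alpha / n_tests
```

Under these assumptions, testing five cut-offs at a level of 0.05 gives roughly a 23% chance of at least one spurious significant result, compared with 5% for a single pre-specified cut-off.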

A limitation of PET is that an inflammatory reaction in areas of the lung receiving high doses from SABR can result in an elevated uptake on PET imaging, resulting in false-positive findings.79,80 Another limitation of the use of FDG-PET imaging is in regard to the standardization of image acquisition across scanners and institutions, which must be considered in the context of these studies.81 PET is also more costly than standard CT imaging and may not be a routine post-treatment investigation at some institutions.

Radiomics can also be used to quantify the appearance of FDG-PET uptake. Characterizing regional variations in SUV, and using them to predict outcomes in this patient population following SABR treatment, may be an important area of future study. However, all such studies need to ensure standardization of methodology, as discretization of SUVs can have a major impact on the resultant texture features.82

Other novel imaging methods

CT perfusion imaging has been investigated for response assessment in pulmonary metastases treated with SABR.83 Although changes in perfusion data were not statistically significant, a qualitative trend consisting of an early increase followed by a decrease in tumour perfusion was noted. Validation on a larger data set is required to determine the role of CT perfusion in response assessment post-SABR. Enhancement patterns have also been investigated following SABR and have shown that patients with recurrence have a more rapid washin and washout phenomenon, compared with the continuous enhancement observed in RILI.84

Although several studies aiming to improve the assessment of response post-SABR have been completed, they must be considered within the context of their limitations. Pathologic proof of recurrence is very uncommon in most studies owing to the fact that many patients are unable to have confirmatory biopsy for recurrence. Many of these patients are therefore defined to have “recurrence” based on imaging alone, using observations of serial CT enlargement or FDG avidity on PET. The use of patients without histologic proof of recurrence may overinflate imaging study results, since patients diagnosed solely on imaging findings may actually have benign fibrosis.

CONCLUSION AND FUTURE DIRECTIONS

Response assessment following SABR can be difficult. The current clinical standard use of RECIST has many limitations in the context of post-SABR follow-up imaging. Validated HRFs may provide a more rigorous framework for assessment of response. Promising novel advanced imaging options are being developed to aid in response assessment in the context of SABR. Quantitative radiomic analysis of CT has shown promise in predicting recurrence within 6 months of SABR treatment. The use of FDG-PET imaging can also aid in the detection of recurrence; however, false-positive findings due to post-SABR inflammation have been observed and optimal SUV cut-offs for defining recurrence are lacking. Standardization of image acquisition, specifically in PET imaging, and radiomic analysis must also be considered to support generalizability to other scanners and institutions. Ongoing studies aim to correlate imaging findings with pathological specimens in patients undergoing SABR plus surgical resection (NCT02136355). In the interim, assessment of response after SABR in patients with fibrotic changes should be conducted by a multidisciplinary team. The use of HRFs and, if appropriate, PET/CT scanning should be considered rather than merely relying on RECIST measurements.

CONFLICT OF INTEREST

AD Ward, DA Palma and SA Mattonen have a pending US patent for image feature analysis for stereotactic ablative radiotherapy response assessment (non-commercialized).

FUNDING

The authors would like to acknowledge the following sources of funding: Natural Sciences and Engineering Research Council of Canada, the Ontario Institute for Cancer Research and Cancer Care Ontario.

Contributor Information

Sarah A Mattonen, Email: smattone@uwo.ca.

Aaron D Ward, Email: aaron.ward@uwo.ca.

David A Palma, Email: david.palma@lhsc.on.ca.
