Radiology: Imaging Cancer. 2020 Mar 27;2(2):e190021. doi: 10.1148/rycan.2020190021

Performance of the Vancouver Risk Calculator Compared with Lung-RADS in an Urban, Diverse Clinical Lung Cancer Screening Cohort

Abraham Kessler 1, Robert Peng 1, Edward Mardakhaev 1, Linda B Haramati 1, Charles S White 1
PMCID: PMC7983652  PMID: 33778703

Abstract

Purpose

To compare the performance of the Vancouver risk calculator (VRC) with the American College of Radiology’s Lung CT Screening Reporting and Data System (Lung-RADS) for a lung cancer screening cohort in an urban, diverse clinical setting.

Materials and Methods

This study included a total of 486 patients with lung nodules (mean age, 63 years ± 5.2 [standard deviation]; 261 female patients), 448 of whom had nodules that were subsequently classified as benign and 38 of whom had nodules that were classified as malignant. The mean follow-up time was 40.0 months ± 14. Institutional review board approval was obtained for this Health Insurance Portability and Accountability Act–compliant retrospective study, and a waiver of informed consent was received. The study population comprised all patients undergoing lung cancer screening whose initial baseline screening CT, performed between December 2012 and June 2016, demonstrated a nodule and who had at least 1 year of follow-up. Each examination was assigned a Lung-RADS score between 2 and 4B, with 4A and 4B considered as showing positive results. The VRC estimates the risk of cancer from nine variables related to patient and imaging characteristics, and a positive result is defined by a risk threshold. Analysis was performed per patient based on the largest nodule. Lung-RADS and VRC using the 5% threshold were compared to assess diagnostic performance in determining the risk of developing lung cancer in a patient with a nodule found at screening CT. The McNemar test was used to compare differences in performance between Lung-RADS and VRC.

Results

Lung-RADS resulted in nine false-positive and 16 false-negative findings, whereas VRC with a 5% threshold resulted in 29 false-positive and 10 false-negative findings. Overall sensitivity and specificity were 58.0% and 98.0% for Lung-RADS and 73.7% and 93.5% for VRC with a 5% threshold (P = .313 and P < .001 for the differences in sensitivity and specificity, respectively).

Conclusion

The VRC performs well in an urban, diverse lung cancer screening program. Further studies may be directed at determining whether its use in conjunction with Lung-RADS leads to improved lung cancer detection.

Keywords: CT, Lung, Thorax

© RSNA, 2020


Summary

The Vancouver risk calculator performed well among patients in a diverse clinical lung cancer screening program, with a trend toward higher sensitivity and modestly lower specificity as compared with the Lung CT Screening Reporting and Data System.

Key Points

  • The Vancouver risk calculator (VRC), using a 5% threshold, had a modestly lower specificity and trended toward higher sensitivity as compared with the Lung CT Screening Reporting and Data System (Lung-RADS) when a score of 4A or 4B was deemed as a positive finding.

  • The VRC, using a 10% threshold, had comparable accuracy, sensitivity, and specificity compared with Lung-RADS, with a score of 4A or 4B deemed as a positive finding, which warrants further investigation into using both scoring systems in conjunction for lung cancer prediction.

Introduction

Lung cancer is the leading cause of cancer mortality in the world. In 2011, the National Lung Screening Trial (NLST) showed a 20% decrease in lung cancer mortality by using low-dose CT as compared with radiography (1), because low-dose CT provides improved diagnostic value and thus allows earlier intervention. Lung cancer screening was subsequently given a grade B recommendation by the U.S. Preventive Services Task Force, received approval by Medicare, and is currently in clinical use. The decrease in mortality demonstrated by the NLST came at the expense of a high false-positive rate. The American College of Radiology developed the Lung CT Screening Reporting and Data System (Lung-RADS) in 2014 to standardize reporting and stratify lung cancer risk (2). Lung-RADS incorporates CT findings such as nodule size, type, and change over time to assign a risk score, corresponding to follow-up or diagnostic recommendations. Limited data suggest that Lung-RADS is valid in clinical settings, increasing positive predictive value without increasing false-negative results as compared with the schema used in the NLST (3).

A concurrent proposal for distinguishing malignant nodules from benign ones was published by McWilliams et al (4) in 2013 and included patients’ clinical characteristics in addition to CT findings to assign a malignancy likelihood score using a lung cancer risk calculator called the Vancouver risk calculator (VRC). In contrast to the Lung-RADS score, which is assigned on a per-patient, per-screening CT basis, the VRC calculates the lung cancer risk for each nodule. The VRC was based on data from two Canadian screening cohorts and has been shown to perform well when applied in clinical practice (5,6). The VRC has demonstrated good predictive value when applied to a subset of the NLST (5) and has been demonstrated to have greater diagnostic accuracy, sensitivity, and specificity than Lung-RADS in the NLST population, both on a per-nodule and per-patient basis (6).

The lung cancer screening program at our multicenter academic institution cares for a population that differs in several ways from the NLST cohort, with greater sex and ethnic diversity, higher current smoker rate, and more patient comorbidities (7). In fact, our institution’s county had the worst health outcomes in New York state in 2017 (8). Although it was questionable whether lung cancer screening would be as effective in such a population as it was in the NLST, our preliminary data suggest that it performs well, with fewer false-positive findings than NLST when using Lung-RADS as a standardized reporting system (9). However, the VRC has demonstrated higher accuracy than Lung-RADS when using data from a research trial population (10). We hypothesized that the VRC might further improve diagnostic accuracy for our clinical screening population as well, as it incorporates clinical attributes in stratifying risk. The goal of the present pragmatically designed study was to compare VRC with Lung-RADS in assessing the risk of lung cancer on a per-patient level in an urban, diverse, socioeconomically poor, clinical lung cancer screening cohort.

Materials and Methods

Study Design

Institutional review board approval was obtained for ongoing retrospective study of the institution’s clinical lung cancer screening program (Einstein Montefiore institutional review board identifier 2014–3310) with waiver of informed consent. Preliminary data and Lung-RADS performance from the same screening program have been previously published for an earlier group of patients; those analyses included all patients screened, as opposed to only those in whom a nodule was detected at the initial screening examination (7,9). Recruitment to the program required local physician referral. NLST and, subsequently, Centers for Medicare and Medicaid Services inclusion criteria were used to determine eligibility for screening (1,11). A total of 486 patients were included in this study (mean age, 63 years ± 5 [standard deviation]; 261 women). Approximately 90% of the screening CT examinations were interpreted by five fellowship-trained, board-certified chest radiologists, with the remainder interpreted by an additional two fellowship-trained chest radiologists and by cardiothoracic imaging fellows in the final quarter of their training (9). One author (L.B.H.) with approximately 25 years of clinical experience interpreted some of the examinations. All lung cancer screening patients who underwent their initial baseline (T0) screening CT between December 2012 and June 2016 and were prospectively assigned a Lung-RADS score of 2–4B or its equivalent from a local score based on the Breast Imaging Reporting and Data System (BI-RADS; see Scoring Systems) (7,9), indicating the presence of one or multiple nodules, were potentially eligible for inclusion in the study population. Only the T0 screening CT scan was considered because the VRC was designed and validated for only the prevalence (T0) scan.

To ensure adequate follow-up, the study cohort included only patients with at least 1 year of follow-up from the index CT or with a diagnosis of cancer if detected sooner. Follow-up was defined as histologic evaluation, follow-up imaging, or clinical evaluation. Follow-up time was calculated as the amount of time from the baseline screening CT to the most recent clinical, radiologic, or histopathologic follow-up at the institution. Patients without any of these clinical activities within the institution health system were searched for in the New York State Cancer Registry and the National Death Index as of May 2017, and phone calls were made to patients. Seven patients with highly suspicious imaging findings had their final diagnosis adjudicated case by case, with cancer status assigned by a senior chest radiologist (C.S.W., 28 years of experience) who was not clinically affiliated with the screening program.

CT Image Acquisition

Low-dose CT scans were acquired with 64–detector row scanners (LightSpeed VCT and Optima 660; GE Healthcare, Chicago, Ill). A low-dose technique was used at all sites. If body mass index information was not available in the electronic medical record, technologists estimated patient body size on the basis of visual inspection; large patients underwent helical scans at 120 kVp and 100 mA, whereas average-sized patients underwent helical scans at 100 kVp and 75 mA. Further details of CT image acquisition in the institution’s lung cancer screening program have been previously described by Milch et al (7).

Scoring Systems

Lung-RADS: Studies were assigned a Lung-RADS score at the time of interpretation or, for studies performed before development of Lung-RADS in 2014, were retrospectively converted to Lung-RADS scores from a local BI-RADS–based score that was assigned at the time of initial interpretation (9). Lung-RADS version 1.0 was the most up-to-date version available and was therefore the one used for score assignment.

VRC: Data necessary for the risk calculator described by McWilliams et al (4) were gathered from the CT report and the electronic medical record for all baseline screening studies demonstrating the presence of a nodule, defined as a Lung-RADS score of 2 or more. These data comprised nine variables: age, sex, family history of lung cancer, presence of emphysema, size of largest nodule, largest nodule type (solid, semisolid, ground-glass, or nonsolid), upper-lobe location, spiculation, and nodule count. In some patients with multiple small nodules, the size and number of nodules were not specified. In these cases, to calculate the VRC, nodule size was assigned as 4 mm and nodule count was assigned as four, corresponding respectively to the maximum size and lowest number of nodules that would not typically be individually described in our clinical practice. Nodule malignancy risk was calculated using the above values with the VRC using the Brock University cancer prediction website (12).

Patient information was obtained from a questionnaire that was completed at the time of recruitment, from the electronic medical record, and from a bilingual (English and Spanish) screening coordinator who called patients when they were due for follow-up or diagnostic studies. Smoking status and ethnicity were self-reported. Patients whose initial questionnaires were missing information on family history of lung cancer had family history determined from provider notes in the electronic medical record, and family history was assumed to be negative when not reported. Body mass index data were obtained from the electronic medical record. Cancer information was recorded and confirmed through institutional and state cancer registries.
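The data-gathering rules described above can be sketched in code. The following is a minimal illustration, not the study's actual workflow: the class name `VrcInputs`, its field names, and the helper `fill_missing` are hypothetical, and only the fallback values (4 mm, four nodules, negative family history when not reported) come from the text. The risk itself was obtained from the Brock University website and is not reproduced here.

```python
from dataclasses import dataclass, replace
from typing import Optional

# Fallback values stated in the text for nodules not individually described.
DEFAULT_NODULE_SIZE_MM = 4.0
DEFAULT_NODULE_COUNT = 4


@dataclass(frozen=True)
class VrcInputs:
    """The nine VRC variables; None marks a value missing from the record.

    Field names are illustrative assumptions, not taken from the calculator.
    """
    age_years: float
    is_female: bool
    family_history_lung_cancer: Optional[bool]
    emphysema: bool
    nodule_size_mm: Optional[float]   # size of the largest nodule
    nodule_type: str                  # e.g. "solid", "semisolid", "nonsolid"
    upper_lobe: bool
    spiculation: bool
    nodule_count: Optional[int]


def fill_missing(inputs: VrcInputs) -> VrcInputs:
    """Apply the stated assumptions for values that were not reported."""
    size = inputs.nodule_size_mm
    count = inputs.nodule_count
    return replace(
        inputs,
        # Family history was assumed negative when not reported.
        family_history_lung_cancer=bool(inputs.family_history_lung_cancer),
        nodule_size_mm=DEFAULT_NODULE_SIZE_MM if size is None else size,
        nodule_count=DEFAULT_NODULE_COUNT if count is None else count,
    )


# Hypothetical patient with scattered small nodules and no recorded family history.
patient = VrcInputs(64, True, None, True, None, "solid", False, False, None)
filled = fill_missing(patient)
print(filled.nodule_size_mm, filled.nodule_count, filled.family_history_lung_cancer)
```

Keeping missing values explicit as `None` until a single imputation step makes it easy to count how many patients relied on each assumption.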

Statistical Analysis

Descriptive statistics were used to evaluate the cohort demographics and cancer characteristics. Sensitivity, specificity, accuracy, and positive and negative predictive values were measured for Lung-RADS and VRC. These measurements were calculated using Lung-RADS 4A and 4B scores as positive (malignant) results. A secondary analysis was performed using Lung-RADS scores of 3 and 4 as positive results. The VRC threshold used to define a positive result was 5%, which was independent of study data, as this correlates with the malignancy probability of a Lung-RADS score of 4A (2), with 1.5% and 10% thresholds secondarily analyzed. The McNemar test and resultant P values were used to determine whether the differences in performance between the two tests were statistically significant. A receiver operating characteristic curve and area under the receiver operating characteristic curve (AUC) for the VRC were calculated with MedCalc Statistical Software, version 19.1 (MedCalc, Ostend, Belgium).
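As a rough sketch of the positivity rules and the paired comparison described above, the following code classifies each patient under both systems and applies a two-sided exact (binomial) McNemar test to the discordant pairs. The text does not state whether MedCalc's exact or chi-square form of the test was used, so the exact form is an assumption; the function names, the treatment of a risk exactly at the threshold as positive, and the five example patients are likewise hypothetical.

```python
from math import comb

PRIMARY_POSITIVE = {"4A", "4B"}          # primary analysis
SECONDARY_POSITIVE = {"3", "4A", "4B"}   # secondary analysis


def lung_rads_positive(category: str, include_category_3: bool = False) -> bool:
    """Positive screening result under the primary or secondary definition."""
    positive = SECONDARY_POSITIVE if include_category_3 else PRIMARY_POSITIVE
    return category in positive


def vrc_positive(risk: float, threshold: float = 0.05) -> bool:
    """Positive when the VRC risk meets the threshold (boundary handling assumed)."""
    return risk >= threshold


def mcnemar_exact(only_a_positive: int, only_b_positive: int) -> float:
    """Two-sided exact McNemar P value from the two discordant-pair counts."""
    n = only_a_positive + only_b_positive
    if n == 0:
        return 1.0
    k = min(only_a_positive, only_b_positive)
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)


# Hypothetical paired results for five patients with cancer:
# (Lung-RADS category, VRC risk). Sensitivity is compared within this group.
pairs = [("4A", 0.12), ("3", 0.07), ("2", 0.02), ("4B", 0.30), ("3", 0.06)]
only_lung_rads = sum(lung_rads_positive(c) and not vrc_positive(r) for c, r in pairs)
only_vrc = sum(vrc_positive(r) and not lung_rads_positive(c) for c, r in pairs)
print(only_lung_rads, only_vrc, mcnemar_exact(only_lung_rads, only_vrc))
```

The same procedure applied to patients with benign nodules would compare specificity; only patients on whom the two systems disagree contribute to the test.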

Results

The study cohort comprised 486 patients who had a nodule at T0 screening CT between December 2012 and June 2016. The cohort was 53.7% (261 of 486) female, with a total mean age of 63 years ± 5. Sixty-seven percent (324 of 486) of participants were current smokers, with an overall mean of 53.4 pack-years ± 30. The most prevalent self-reported race and/or ethnicities were black at 28% (136 of 486), Hispanic at 33% (159 of 486), and white at 21% (104 of 486). Their baseline characteristics are shown in Table 1. The screening program at our institution used Hispanic ethnicity in the same category of identifiers as other ethnicities and races, whereas the NLST used Hispanic ethnicity as a separate identifier from race.

Table 1:

Baseline Characteristics of Patients


Mean follow-up was 40.0 months ± 14, with a median follow-up time of 40.4 months (interquartile range: 29.9–52.2 months). Of 486 patients, 448 (92.2%) had lung nodules that were categorized as benign. A total of 7.8% (38 of 486) of the cohort were diagnosed with lung cancer. With regard to cancer subtype, 36.8% (14 of 38) were diagnosed as adenocarcinoma; 23.7% (nine of 38) squamous cell carcinoma; 13.2% (five of 38) small cell carcinoma; 5.3% (two of 38) non–small cell lung carcinoma, not otherwise specified; and 21.1% (eight of 38) had another diagnosis. Of the patients with cancer, 50% (19 of 38) of the cancers were stage I, 10.5% (four of 38) of the cancers were stage II, 26.3% (10 of 38) of the cancers were stage III, 10.5% (four of 38) of the cancers were stage IV, and 2.6% (one of 38) of the cancers were of an unknown stage. A total of 1.4% (seven of 486) of patients had CT findings suspicious for lung cancer without histologic confirmation. Of the seven, one underwent treatment with radiation therapy, one subsequently underwent right upper lobectomy that showed multiple cancer subtypes, and two were adjudicated to have lung cancer on the basis of subsequent imaging (all included in the “other” cancer subtype count).

Next, the nodules were assessed with both scoring systems and then compared with one another. Independent assessments were made using Lung-RADS such that category 4A or 4B were deemed as indicating positive results (Table 2) and were also made for VRC performance at three different thresholds: 1.5%, 5%, and 10% (Table 3). Overall sensitivity and specificity for Lung-RADS, with 4A or 4B indicating a positive result, were 58.0% (95% confidence interval [CI]: 40.8%, 73.7%) and 98.0% (95% CI: 96.2%, 99.1%), and overall sensitivity and specificity for VRC with a 5% composite risk threshold were 73.7% (95% CI: 56.9%, 86.6%) and 93.5% (95% CI: 90.8%, 95.6%), respectively (P = .313 and P < .001, respectively). A secondary analysis was performed with a Lung-RADS score of 3 and 4A or 4B together considered as indicating a positive result, for which the overall sensitivity and specificity were 84.2% (95% CI: 68.1%, 93.4%) and 79.2% (95% CI: 75.1%, 82.8%), respectively. Further descriptive statistical measures were compared between both the Lung-RADS and VRC with three different thresholds and presented in Table 4. The receiver operating characteristic curves for Lung-RADS and VRC are shown in the Figure. Lung-RADS demonstrated an AUC of 0.868 (95% CI: 0.835, 0.897), and VRC demonstrated an AUC of 0.879 (95% CI: 0.847, 0.907).
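The reported sensitivity and specificity can be checked against the false-positive and false-negative counts given in the Abstract (38 cancers, 448 benign). The sketch below recomputes them and adds an exact (Clopper-Pearson) confidence interval found by bisection. That the published intervals are Clopper-Pearson intervals is an assumption, and the predictive values and accuracy printed here are derived from those counts rather than copied from Table 4. From the counts, Lung-RADS sensitivity works out to 22 of 38, or about 57.9% (reported as 58.0% in the text).

```python
from math import comb
from typing import Callable, Dict, Tuple


def screening_metrics(tp: int, fp: int, fn: int, tn: int) -> Dict[str, float]:
    """Standard per-patient diagnostic measures from a 2 x 2 table."""
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
        "accuracy": (tp + tn) / (tp + fp + fn + tn),
    }


def binom_cdf(k: int, n: int, p: float) -> float:
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p ** i * (1 - p) ** (n - i) for i in range(k + 1))


def _solve_increasing(g: Callable[[float], float], target: float) -> float:
    """Bisection on [0, 1] for an increasing function g with g(p) = target."""
    lo, hi = 0.0, 1.0
    for _ in range(60):
        mid = (lo + hi) / 2
        if g(mid) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def exact_ci(k: int, n: int, alpha: float = 0.05) -> Tuple[float, float]:
    """Clopper-Pearson interval for the proportion k / n."""
    def upper_tail(p: float, j: int) -> float:
        return 1 - binom_cdf(j, n, p)  # P(X > j), increasing in p

    lower = 0.0 if k == 0 else _solve_increasing(lambda p: upper_tail(p, k - 1), alpha / 2)
    upper = 1.0 if k == n else _solve_increasing(lambda p: upper_tail(p, k), 1 - alpha / 2)
    return lower, upper


CANCERS, BENIGN = 38, 448
lung_rads = screening_metrics(tp=CANCERS - 16, fp=9, fn=16, tn=BENIGN - 9)
vrc_5pct = screening_metrics(tp=CANCERS - 10, fp=29, fn=10, tn=BENIGN - 29)
for name, m in (("Lung-RADS 4A/4B", lung_rads), ("VRC 5%", vrc_5pct)):
    print(name, {key: round(value, 3) for key, value in m.items()})
print("Lung-RADS sensitivity CI:", exact_ci(CANCERS - 16, CANCERS))
```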

Table 2:

Stratification of Nodules by Lung-RADS Category 4A or 4B as Positive Result


Table 3:

Nodule Classification Results per VRC Risk Threshold


Table 4:

Statistical Comparison of Lung-RADS and VRC Scoring Systems


Figure: Receiver operating characteristic curves for Lung CT Screening Reporting and Data System (lung_rad) and Vancouver risk calculator (vrc). The green line represents the VRC curve, and the blue line represents the Lung-RADS curve.
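For readers less familiar with the AUC values summarized by these curves, the area can be computed directly as the probability that a randomly chosen malignant case receives a higher score than a randomly chosen benign case, with ties counted as one half. The sketch below illustrates that definition only; the study used MedCalc, and the function name and the example risk values are hypothetical. Ordinal Lung-RADS categories could be handled the same way after coding them as increasing integers.

```python
from typing import Sequence


def auc_from_scores(malignant: Sequence[float], benign: Sequence[float]) -> float:
    """AUC via pairwise comparison of malignant and benign scores (ties = 0.5)."""
    wins = 0.0
    for m in malignant:
        for b in benign:
            if m > b:
                wins += 1.0
            elif m == b:
                wins += 0.5
    return wins / (len(malignant) * len(benign))


# Hypothetical VRC risks, not study data.
malignant_risks = [0.30, 0.12, 0.04]
benign_risks = [0.01, 0.02, 0.04, 0.06]
auc = auc_from_scores(malignant_risks, benign_risks)
print(auc)
```

The pairwise form is quadratic in the number of patients, which is acceptable for a cohort of a few hundred; a rank-based formulation gives the same value more efficiently for larger data sets.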

Discussion

The aim of the present study was to compare the efficacy of Lung-RADS, which is currently in clinical use in the United States, to the efficacy of VRC in assessing the risk of lung cancer. To our knowledge, the two have not been compared previously in a clinical population, as opposed to a research study population such as the NLST or Pan-Canadian Early Detection of Lung Cancer (PanCan) cohorts (13). Because of the limitations of our retrospective study, a per-patient outcome analysis was performed, rather than a per-nodule analysis. This contrasts with the application of VRC to the PanCan data, in which it was used to determine the likelihood of any specific nodule developing into lung cancer.

We hypothesized that the VRC would demonstrate better performance characteristics than Lung-RADS because it incorporates several clinical attributes in addition to CT imaging findings. We found that the VRC trended toward higher sensitivity, although it did not meet criteria for statistical significance, and had lower specificity than Lung-RADS in our clinical screening population. In screening populations, particularly those in resource-poor communities that struggle with lower follow-up rates, high sensitivity is a crucial metric to minimize cancers that would have otherwise progressed past the point of treatment.

The AUC for the VRC using our cohort demonstrated good discriminant value between benign and malignant nodules in our population. However, the AUC for VRC for our cohort was lower than that found when applying the VRC to the NLST data (5): 0.879 versus 0.963, respectively. This, in addition to the greater diagnostic capability found by White et al (6) within the NLST population, may be due in part to differences in the study populations. Our population contained a higher proportion of women, current smokers, and nonwhite patients than both the NLST and PanCan cohorts. As this is a clinical population, there are lower rates of follow-up compared with trial populations, which can lead to missing patients who ultimately develop cancer and make subsequent statistical analyses difficult, as was the case in our study.

In the present study, we used the VRC in a manner similar to how Lung-RADS was used, assessing the cancer probability on a per-patient basis. We used the largest nodule found at the screening CT for determining the VRC score for the patient. The present results from our clinical program demonstrated lower sensitivity and higher specificity for both VRC and Lung-RADS compared with the results of VRC and Lung-RADS applied to the NLST on a per-patient basis (6). We attribute this to each nodule being considered independently in prior studies using the NLST cohort, even among patients with multiple benign nodules, which may have subsequently led to improved sensitivity and accuracy.

It is important to note that in our primary analysis, we designated a Lung-RADS score of 3 as a negative screening result. Assigning Lung-RADS score of 3 as positive would have increased sensitivity but decreased specificity. The American College of Radiology reports that a Lung-RADS score of 3 corresponds to a 1%–2% probability of malignancy, whereas the equivalent rates for 4A and 4B are 5%–15% and greater than 15%, respectively (2). A Lung-RADS score of 3 indicates about the same cancer probability as the VRC 1.5% threshold, and we in fact found that the sensitivity and specificity values were very similar when a Lung-RADS score of 3 was designated as positive.

Our study had several limitations and challenges. The relatively low number of patients with cancer (38 of 486) was one limitation. One of the challenges we faced when calculating the VRC was missing data points. The questionnaire used for the screening program from its inception was not designed to accommodate a risk calculator such as the VRC. For example, family history of lung cancer was not requested; to obtain this information, we examined each patient’s medical provider notes, which may have resulted in underestimation. Other missing data points involved patients with scattered small nodules, in whom nodule size and count were assigned as 4 mm and four nodules on the basis of our clinical practice that 4 mm was the largest size that would not be measured and that fewer than four nodules would be individually enumerated in the report. Additionally, very small nodules have been found to have a less than 1% risk of lung cancer (4,14,15). These assumptions may have also led to under- or overestimation of nodule size and count in some instances. An additional limitation was our use of the largest nodule for determining the VRC score for a patient, despite the possibility that a smaller nodule with a different location and characteristics could have a higher VRC score. Similar to White et al (6), we did not account for the difference in nodule measurement methodology (ie, the longest transverse diameter for the VRC and mean transverse diameter for Lung-RADS). Because most nodules are nearly round, the variation in measurement probably led to only a slight increase in size for the VRC on average. The pragmatic design we used to study a real-world clinical program in a resource-poor environment has inherent limitations. Despite this, the authors believe that this population merits particular attention in the literature.

In conclusion, the VRC with a 5% threshold performed well in our urban clinical lung cancer screening population, with a trend toward higher sensitivity and modestly lower specificity compared with Lung-RADS. Future assessments could include using Lung-RADS and VRC in conjunction to determine whether improved lung cancer predictions can be made. Three-dimensional volumetric risk assessment of lung nodules, as described in the Dutch-Belgian Randomized Lung Cancer Screening Trial (the Dutch NELSON trial), which demonstrated a lung cancer mortality reduction of 26% in men and between 39% and 61% in women (16), and the recent update in Lung-RADS (17) have the potential to incrementally improve risk assessment in lung cancer screening. Further efforts to develop a risk calculator specifically tailored to underserved populations may improve specificity and overall clinical utility for use in lung cancer screening.

Authors declared no funding for this work.

Disclosures of Conflicts of Interest: A.K. disclosed no relevant relationships. R.P. disclosed no relevant relationships. E.M. disclosed no relevant relationships. L.B.H. Activities related to the present article: disclosed no relevant relationships. Activities not related to the present article: spouse is a board member for Kyron. All other authors have no conflicts of interest related to the material discussed in this article. Other relationships: disclosed no relevant relationships. C.S.W. disclosed no relevant relationships.

Abbreviations:

AUC: area under the receiver operating characteristic curve
CI: confidence interval
Lung-RADS: Lung CT Screening Reporting and Data System
NLST: National Lung Screening Trial
VRC: Vancouver risk calculator

References

