Author manuscript; available in PMC: 2023 Apr 25.
Published in final edited form as: Cancer Epidemiol Biomarkers Prev. 2023 Mar 6;32(3):329–336. doi: 10.1158/1055-9965.EPI-22-0532

Improving Lung Cancer Diagnosis with Computed Tomography Radiomics and Serum Histoplasmosis Testing

Hannah N Marmor 1, Stephen A Deppen 1,2, Valerie Welty 3, Michael N Kammer 4, Caroline M Godfrey 1, Khushbu Patel 4, Fabien Maldonado 4, Heidi Chen 3, Sandra L Starnes 5, David O Wilson 6, Ehab Billatos 7, Eric L Grogan 1,2
PMCID: PMC10128087  NIHMSID: NIHMS1884770  PMID: 36535650

Abstract

Background:

Indeterminate pulmonary nodules (IPNs) are a diagnostic challenge in regions where pulmonary fungal disease and smoking prevalence are high. We aimed to determine the impact of a combined fungal and imaging biomarker approach compared to a validated prediction model (Mayo) to rule out benign disease and diagnose lung cancer.

Methods:

Adults aged 40-90 years with 6-30 mm IPNs were included from four sites. Serum samples were tested for Histoplasmosis IgG and IgM antibodies by enzyme immunoassay and a computed tomography based risk score was estimated from a validated radiomic model. Multivariable logistic regression models including Mayo score, radiomics score, and IgG and IgM Histoplasmosis antibody levels were estimated. The areas under the receiver-operating characteristics curves (AUC) of the models were compared among themselves and to Mayo. Bias-corrected clinical net reclassification index (cNRI) was estimated to assess clinical reclassification using a combined biomarker model.

Results:

We included 327 patients; 157 from Histoplasmosis-endemic regions. The combined biomarker model including radiomics, Histoplasmosis serology, and Mayo score demonstrated improved diagnostic accuracy when endemic Histoplasmosis was accounted for (AUC 0.84, 95% CI 0.79-0.88, p<0.0001 compared to 0.73, 95% CI 0.67-0.78 for Mayo). The combined model demonstrated improved reclassification with cNRI of 0.18 among malignant nodules.

Conclusions:

Fungal and imaging biomarkers may improve diagnostic accuracy and meaningfully reclassify IPNs. The endemic prevalence of Histoplasmosis and cancer impact model performance when using disease related biomarkers.

Impact:

Integrating a combined biomarker approach into the diagnostic algorithm of IPNs could decrease time to diagnosis.

Keywords: biomarker, Histoplasmosis, radiomics, pulmonary, nodule

Introduction:

Approximately 1.5 million indeterminate pulmonary nodules (IPNs) are identified each year in the United States(1). While most are benign, they often require costly and invasive testing to rule out malignancy. IPN diagnosis can be especially challenging in regions where both smoking prevalence is high and pulmonary fungal diseases, such as Histoplasmosis, induce lung granulomas and mimic lung cancer on imaging(2). We have shown that the specificity of 18F-fluoro-deoxyglucose positron emission tomography (FDG-PET), which is often recommended for intermediate risk IPNs, is reduced in Histoplasmosis endemic geographic regions, further limiting non-invasive diagnostic options(3).

Current management of an IPN depends on its pretest probability of malignancy, which can be ascertained through clinical judgment or a validated lung cancer prediction model such as the Mayo Clinic Model (Mayo)(4). Low risk IPNs can be followed with surveillance imaging, while high risk nodules are usually referred for surgical biopsy or resection(5,6). The majority of IPNs fall into the intermediate risk category, which poses the greatest diagnostic challenge and often requires costly and invasive procedures to obtain a diagnosis. For these nodules, current guidelines recommend FDG-PET, biopsy, or surveillance imaging(4,7). The greatest number of invasive biopsies for benign disease occur within this risk category(8).

Non-invasive biomarkers are needed to improve the evaluation and management of IPNs. Quantitative imaging analysis (radiomics) can serve as a digital biomarker by extracting hundreds of features from computed tomography (CT) scans to develop risk models for evaluating pulmonary nodules(9). Radiomic approaches have been shown to enhance lung cancer prediction accuracy(10-14). The enzyme immunoassay (EIA) for Histoplasma immunoglobulin G (IgG) and immunoglobulin M (IgM) was recently validated for diagnosing benign IPNs in regions endemic for Histoplasmosis with a positive predictive value of 100% for patients with both positive IgM and IgG levels(15).

To improve IPN diagnosis, we used a highly specific fungal biomarker for benign disease that, when positive, rules out malignancy. However, because a negative fungal test offers minimal diagnostic information, we combined it with radiomic and clinical risk factors (Mayo) to improve cancer risk estimation. We hypothesized that this combined biomarker approach would improve the diagnostic accuracy of IPN evaluation and meaningfully reclassify intermediate risk nodules. Additionally, we hypothesized that population Histoplasmosis and cancer prevalence would impact model performance when fungal and imaging biomarkers were used.

Methods:

Following a prospective specimen collection, retrospective blinded evaluation study design, serum and clinical information were identified from patients with newly detected 6-30 mm IPNs(16). The population included in this study was derived from four cohorts and contained patients with incidentally discovered and lung cancer screening detected IPNs. The first cohort consisted of patients from Vanderbilt University Medical Center and the Tennessee Valley VA Healthcare System in Nashville, Tennessee (VUMC, n=111) consented for research between 2003 and 2017. Approximately 15% of the VUMC cohort consisted of patients from a lung screening population, and 85% had incidentally discovered nodules. The second cohort was derived from the University of Cincinnati in Cincinnati, Ohio (UoC, n=46), consented for research between 2015 and 2019 from a surgical clinic with incidentally discovered nodules. The third cohort consisted of patients with IPNs from the University of Pittsburgh Medical Center’s lung screening program (UPMC, n=71) consented for research between 2006 and 2015. The fourth cohort consisted of patients from the Detection of Early Cancer Among Military Personnel consortium (DECAMP, n=99), which includes screening and incidentally discovered nodules from 12 Veterans Affairs sites (Supplementary Figure S1 and Table S1). The DECAMP participants were made available as a case-control cohort consented for research between 2013 and 2017(17). All patients were enrolled under separate IRB protocols in accordance with the Belmont Report or US Common Rule guidelines.

Patients were included in the study if they had an IPN between 6-30 mm in largest axial diameter, were 40–90 years old at the time of enrollment, had prospectively collected treatment-naive serum and CT scans (with slice thickness ≤ 3 mm), and had a definitive cancer or no-cancer diagnosis. In patients with multiple nodules, the IPN of greatest clinical concern was selected for analysis. Diagnosis was biopsy-proven for malignant nodules. Benign nodules were diagnosed through biopsy, at least two-year longitudinal imaging follow-up from detection time showing no signs of growth (extended past three years for subsolid or suspicious nodules), or benign evidence on radiology (e.g., nodule shrinking or benign calcification pattern seen on CT). Exclusion criteria were non-solid nodules, known metastatic or benign disease at time of enrollment, lung cancer diagnosis within two years of enrollment, histology showing non-lung primary malignancy, > 90 days between CT and serum collection (> 120 days in the VUMC cohort if the medical record confirmed no treatment within this window), and missing radiomics scores, Histoplasmosis EIA, or definitive IPN diagnosis. This study was approved by the VUMC IRB (#030763 and #000616) and by separate IRBs at participating sites. All patients consented to have biological specimens and clinical data used for future research. Written informed consent was obtained from patients.

Histoplasmosis Serology

Frozen serum was shipped to MiraVista Diagnostics (Indianapolis, IN) who performed serologic testing. All biological specimen collection, storage, and processing followed existing Early Detection Research Network’s (EDRN) protocols(18). Specimens were tested separately for Histoplasma IgG and IgM antibodies by enzyme immunoassay (EIA). The methodology for serologic testing has been previously reported (19). MiraVista was blinded to all clinical data.

Radiomics

A previously derived and validated radiomic score from Kammer, et al. was used in this study (12). Pulmonary nodules were segmented and quantitative features extracted from CT scans using the HealthMyne picture archiving and communication system (Madison, WI). Briefly, morphological features were extracted, including measurements of nodule size, shape, and location. Textural features, including measurements of heterogeneity and density distribution and wavelet features, including 3D transforms for multi-dimensional spectral content, were extracted for each segmented nodule. These features were collated and used in statistical analysis, such as analyzing variability and building predictive models. A total of 856 quantitative imaging features across the three types were extracted using HealthMyne. The 10 most informative features related to nodule size, shape, and texture were identified and combined into a single lung cancer risk score. This radiomic risk score was obtained using regression shrinkage and subset selection by the LASSO method. The methodology for radiomic score development has been described in more detail elsewhere(12). Investigators performing imaging analysis were blinded to all clinical data other than nodule location and CT related features.

Mayo Clinic Model

The Mayo Clinic Model is a validated logistic regression lung cancer prediction model, commonly used in clinical practice for predicting risk of IPN malignancy. It includes a combination of patient and nodule characteristics(20). In this study, the Mayo model was calculated for each patient using fixed coefficients for the variables age, smoking history, history of extra-thoracic cancer ≥ 5 years prior, nodule diameter, location, and spiculation.
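As an illustration of what "fixed coefficients" means here, the sketch below evaluates a logistic model of this form for one patient. The coefficient values are the ones commonly cited for the Mayo Clinic model; they are not stated in this manuscript and should be checked against reference (20) before any use. The function and parameter names are hypothetical.

```python
import math

# Coefficients as commonly cited for the Mayo Clinic model.
# ASSUMPTION: these values are not given in this manuscript; verify
# them against reference (20) before relying on them.
MAYO_COEFFICIENTS = {
    "intercept": -6.8272,
    "age_years": 0.0391,
    "ever_smoker": 0.7917,
    "prior_extrathoracic_cancer": 1.3388,  # diagnosed >= 5 years earlier
    "diameter_mm": 0.1274,
    "spiculated": 1.0407,
    "upper_lobe": 0.7838,
}


def mayo_probability(age_years, ever_smoker, prior_extrathoracic_cancer,
                     diameter_mm, spiculated, upper_lobe):
    """Return the predicted probability of malignancy (between 0 and 1)."""
    c = MAYO_COEFFICIENTS
    log_odds = (
        c["intercept"]
        + c["age_years"] * age_years
        + c["ever_smoker"] * int(bool(ever_smoker))
        + c["prior_extrathoracic_cancer"] * int(bool(prior_extrathoracic_cancer))
        + c["diameter_mm"] * diameter_mm
        + c["spiculated"] * int(bool(spiculated))
        + c["upper_lobe"] * int(bool(upper_lobe))
    )
    # Inverse logit turns the linear predictor into a probability.
    return 1.0 / (1.0 + math.exp(-log_odds))


# Hypothetical patient: 65-year-old smoker, 15 mm non-spiculated upper-lobe nodule.
example_risk = mayo_probability(65, True, False, 15, False, True)
```

Because the coefficients are fixed, the score is a deterministic function of the six inputs and is not re-estimated in the study cohorts.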

Statistical Analysis

Patient demographics and nodule characteristics were summarized using descriptive statistics. Analyses were performed using Stata version 16 (College Station, TX) and R version 3.6.3. P values <0.05 were considered statistically significant.

Prediction Modeling

Multivariable logistic regression models were estimated to predict the presence of lung cancer. The binary outcome for each model was cancer versus benign disease. Predictor variables were continuous and included Mayo score, radiomics score, histoplasmosis IgG EIA, and histoplasmosis IgM EIA levels. Radiomics score and IgM EIA level were assumed to have a linear relationship with the outcome, while Mayo score and IgG EIA level took a flexible functional form via restricted cubic splines to allow for nonlinearity. This selection was based on visual assessment of variable effect plots and of overall model calibration plots. Results presented in this study were derived from models including Mayo score and IgG as restricted cubic splines with three knots.
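The restricted cubic spline terms can be made concrete with a small sketch. It builds the standard restricted (natural) cubic spline basis, which is cubic between knots and constrained to be linear beyond the outer knots. With three knots it yields one linear and one nonlinear column per predictor. The knot locations shown are hypothetical, since the manuscript does not report them, and the normalisation by the squared knot range is one common convention rather than a detail confirmed by the source.

```python
def rcs_basis(x, knots):
    """Restricted cubic spline basis for a single value x.

    Returns [x, s_1, ..., s_(k-2)] for k knots, so three knots give
    one linear column and one nonlinear column.
    """
    t = sorted(knots)
    k = len(t)
    if k < 3:
        raise ValueError("a restricted cubic spline needs at least three knots")

    def cube_plus(u):
        # Truncated cubic: u^3 when u is positive, otherwise 0.
        return u ** 3 if u > 0 else 0.0

    scale = (t[-1] - t[0]) ** 2   # keeps the nonlinear column on the scale of x
    gap = t[-1] - t[-2]
    columns = [float(x)]
    for j in range(k - 2):
        term = (
            cube_plus(x - t[j])
            - cube_plus(x - t[-2]) * (t[-1] - t[j]) / gap
            + cube_plus(x - t[-1]) * (t[-2] - t[j]) / gap
        )
        columns.append(term / scale)
    return columns


# Hypothetical knots for a Mayo score expressed in percent.
example_knots = [10, 40, 70]
example_columns = rcs_basis(50, example_knots)
```

In a model such as the one described above, each column of the basis would receive its own regression coefficient, which is how a predictor is allowed a nonlinear effect on the log odds of cancer.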

Only patients with complete data (Mayo score, radiomics risk score, Histoplasmosis IgM and IgG EIA levels, and definitive nodule diagnosis) were included in the analysis. Area under the receiver-operating characteristics curve (AUC) and calibration plots were used to compare the diagnostic performance of prediction models to assess the combined performance of the radiomic marker and Histoplasmosis markers versus their individual improvements upon the Mayo model. This study investigated two prediction modeling approaches outlined in Figure 1. The initial simplified model (approach 1) demonstrated the need for an interaction term to address the varying effect of Histoplasmosis endemicity on predictor variables (Supplementary Table S2 and Figure S2). Both the methods and results of this modeling approach can be found with more detail in the supplementary text.
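For readers less familiar with the AUC, the following minimal sketch computes it directly from its definition: the probability that a randomly chosen cancer case receives a higher predicted risk than a randomly chosen benign case, with ties counted as one half. It illustrates the metric only; it does not reproduce the confidence intervals or the statistical test used to compare models in this study, and the function name is hypothetical.

```python
def auc(scores, labels):
    """Area under the ROC curve from pairwise comparisons.

    scores: predicted risks; labels: 1 for cancer, 0 for benign.
    """
    cancer = [s for s, y in zip(scores, labels) if y == 1]
    benign = [s for s, y in zip(scores, labels) if y == 0]
    if not cancer or not benign:
        raise ValueError("AUC needs at least one cancer and one benign case")
    wins = 0.0
    for c in cancer:
        for b in benign:
            if c > b:
                wins += 1.0
            elif c == b:
                wins += 0.5   # ties count as half a concordant pair
    return wins / (len(cancer) * len(benign))


# Toy example: three of the four cancer/benign pairs are ranked correctly.
example_auc = auc([0.8, 0.4, 0.6, 0.2], [1, 1, 0, 0])
```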

Figure 1. Statistical Modeling Approaches.


Two modeling approaches were investigated in this study. One approach trained and externally validated lung cancer prediction models using multivariable logistic regression (details can be found in supplementary material under Initial Simplified Lung Cancer Prediction Models (Approach 1), Table S2, and Figure S2). The other approach estimated and internally validated prediction models including a statistical interaction term to account for the endemic presence of Histoplasmosis.

In order to account for varying fungal prevalence in different populations, and the potential differential effect of radiomics and Histoplasmosis antibody levels on predicting cancer, a second modeling approach (approach 2) was investigated. An additional variable was added to the model indicating the endemic presence of Histoplasmosis along with interaction terms between all predictor variables except Mayo. These interaction terms were explored in relation to the radiomic risk score, histoplasmosis IgG levels, and histoplasmosis IgM levels. The model was developed and assessed in the combined dataset (n=327) and allowed for differential effects to be represented by binary interaction terms for the endemic presence of Histoplasmosis in the region in which the sample was collected. Model performance was also assessed in the endemic (VUMC and UoC, n=157) and non-endemic (UPMC and DECAMP, n=170) subgroups. Mayo and Histoplasmosis IgG were included as non-linear restricted cubic spline terms, based on prior nonlinearity assessments and given that calibration plots demonstrated slightly improved model calibration. Model coefficients were fixed when assessing the apparent AUC in both the combined dataset and in the subgroups. Additionally, a bootstrapping approach was used to internally validate the models and provide optimism-corrected assessments of AUC(21,22).
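The structure of the approach 2 predictors can be sketched as one row of a design matrix. The sketch follows the description above: Mayo enters as a main effect only, while the radiomic score, IgG, and IgM each receive an interaction with the binary endemic indicator. The column order, the function name, and the example values are assumptions made for illustration; the spline columns for Mayo and IgG are taken as already computed.

```python
def design_row(mayo_terms, radiomics, igg_terms, igm, endemic):
    """Build the predictor row for one patient.

    mayo_terms, igg_terms: spline basis values (lists) for Mayo score and IgG.
    radiomics, igm: single continuous values modelled linearly.
    endemic: True if the sample came from a Histoplasmosis-endemic region.
    """
    e = 1.0 if endemic else 0.0
    row = list(mayo_terms)                 # Mayo: main effect only, no interaction
    row.append(radiomics)
    row.extend(igg_terms)
    row.append(igm)
    row.append(e)                          # endemic main effect
    row.append(e * radiomics)              # endemic x radiomics
    row.extend(e * g for g in igg_terms)   # endemic x IgG (each spline column)
    row.append(e * igm)                    # endemic x IgM
    return row


# Hypothetical patient values, shown for an endemic and a non-endemic site.
row_endemic = design_row([0.3, 0.01], 0.5, [1.2, 0.1], 0.4, endemic=True)
row_non_endemic = design_row([0.3, 0.01], 0.5, [1.2, 0.1], 0.4, endemic=False)
```

For a non-endemic patient every interaction column is zero, so the biomarker effects reduce to their main-effect coefficients; for an endemic patient the interaction coefficients shift those effects.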

Reclassification

Reclassification of IPNs out of the intermediate risk category was calculated using the combined biomarker predictions (Mayo, radiomics, and Histoplasmosis EIA) with interaction terms compared to the baseline Mayo prediction. All four cohorts were pooled together, and IPNs separated according to malignant (n=192) or benign (n=135) diagnosis. Nodules were classified into low, intermediate, or high-risk categories using decision thresholds of 10% and 70% (<10% indicating low risk, 10-70% intermediate risk, and >70% high risk) per British Thoracic Society guidelines(23). After grouping nodules by these cutoffs, the bias-corrected cNRI was calculated for cancer and benign disease separately. This method of obtaining a bias-corrected cNRI adjusts for expected reclassification based on random movements alone, and can be used to investigate the clinical utility of a combined biomarker prediction model.
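The category thresholds and the direction of reclassification can be sketched as follows. This computes an uncorrected clinical NRI among nodules that the baseline model places at intermediate risk; the bias correction applied in this study is not reproduced, and the function names are hypothetical.

```python
def risk_category(probability):
    """Classify a predicted probability using the 10% and 70% thresholds."""
    if probability < 0.10:
        return "low"
    if probability > 0.70:
        return "high"
    return "intermediate"


def uncorrected_cnri(baseline, updated, is_cancer):
    """Return (cNRI for cancers, cNRI for benign nodules), uncorrected.

    Only nodules at intermediate risk under the baseline model are counted.
    Moving up is correct for cancers; moving down is correct for benign nodules.
    """
    tallies = {True: [0, 0, 0], False: [0, 0, 0]}   # [n, moved up, moved down]
    for b, u, c in zip(baseline, updated, is_cancer):
        if risk_category(b) != "intermediate":
            continue
        tally = tallies[bool(c)]
        tally[0] += 1
        new_category = risk_category(u)
        if new_category == "high":
            tally[1] += 1
        elif new_category == "low":
            tally[2] += 1
    n_c, up_c, down_c = tallies[True]
    n_b, up_b, down_b = tallies[False]
    cnri_cancer = (up_c - down_c) / n_c if n_c else float("nan")
    cnri_benign = (down_b - up_b) / n_b if n_b else float("nan")
    return cnri_cancer, cnri_benign
```

As a worked example using counts reported later in this paper, and assuming the 17 benign nodules moved to low risk are among the same 99 benign nodules at intermediate Mayo risk as the 19 moved to high risk, the uncorrected benign value is (17 − 19)/99 ≈ −0.02. The reported bias-corrected value of 0.009 differs because it also adjusts for the movement expected by chance.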

Data Availability

The data generated in this study are not publicly available as no public database with this information exists, but are available upon reasonable request from the corresponding author.

Results:

Cohort Characteristics

A total of 157 patients with complete data were included in the Histoplasmosis-endemic cohort (VUMC and UoC), and 170 patients were included in the non-endemic cohort (UPMC and DECAMP). Population Histoplasmosis prevalence was >90% in VUMC, >70% in UoC, and <10% in the non-endemic cohort from UPMC and DECAMP (24). Within our endemic cohort, 22% of patients were Histoplasmosis IgG positive (35/157). Of these 35 IgG positive patients, 22 were from VUMC. Likewise, within the endemic cohort, 6% of patients were Histoplasmosis IgM positive (10/157). Of these 10 IgM positive patients, 7 were from VUMC. Within our non-endemic cohort, 8% of patients were IgG positive (14/170). Of these 14 IgG positive patients, 8 were from the DECAMP cohort. Furthermore, 4% were IgM positive (7/170). Of these 7 positive patients, 3 were from DECAMP. From the entire cohort, 6 patients were dual IgG and IgM positive. Of these 6 patients, 4 were from VUMC, 1 was from UoC, and 1 was from DECAMP. Of note, none of the patients who were both IgG and IgM positive had cancer. Cancer prevalence was 75% in the endemic cohort and 44% in the non-endemic cohort (Table 1). Characteristics of each site can be found in the supplement (Supplementary Table S1).

Table 1.

Histoplasmosis-endemic and non-endemic cohort characteristics

| Characteristic | Endemic (VUMC [a] and UoC [b], n=157): Benign | Endemic: Cancer | Non-endemic (UPMC and DECAMP [c], n=170): Benign | Non-endemic: Cancer |
|---|---|---|---|---|
| Count, No. (%) | 40 (25.5) | 117 (74.5) | 95 (56) | 75 (44) |
| Age, median (IQR [d]), y | 61 (53-68) | 70 (64-75) | 68 (62-72) | 68 (63-76) |
| Current/Former Smoker, No. (%) | 36 (90) | 111 (95) | 95 (100) | 75 (100) |
| Previous Cancer, No. (%) | 6 (15) | 41 (35) | 17 (18) | 16 (21) |
| Located in Upper Lobe, No. (%) | 26 (65) | 72 (62) | 47 (49) | 41 (55) |
| Size, median (IQR [d]), mm | 16 (12-20) | 18 (13-22) | 11 (8-14) | 17 (13-21) |
| Spiculated, No. (%) | 11 (28) | 50 (43) | 27 (28) | 26 (35) |
| Gender (Male), No. (%) | 19 (48) | 60 (51) | 67 (71) | 53 (71) |
| Mayo Model Risk | | | | |
| Risk score, median (IQR [d]) | 33 (19-57) | 60 (36-76) | 26 (14-47) | 44 (31-67) |
| Cancer Histology, No. (%) | | | | |
| Adenocarcinoma | | 74 (63) | | 39 (52) |
| Squamous Cell | | 25 (21) | | 15 (20) |
| Small Cell | | 11 (9) | | 5 (7) |
| Large Cell | | - | | 5 (7) |
| Carcinoid | | 2 (2) | | 1 (1) |
| NSCLC | | 1 (1) | | 6 (8) |
| Other [e] | | 4 (3) | | 4 (5) |

[a] Vanderbilt University Medical Center

[b] University of Cincinnati

[c] University of Pittsburgh Medical Center and Detection of Early Lung Cancer Among Military Personnel

[d] IQR: interquartile range

[e] Other: adenosquamous, adenomatous hyperplasia, neuroendocrine, mucoepidermoid, schwannoma, adenocarcinoma and squamous cell carcinoma simultaneous primary, mixed large cell/small cell neuroendocrine

Full Prediction Model with Interaction Term

The following results are presented in Table 2, Figure 2, and Supplementary Figure S3. The AUC for Mayo alone in the Histoplasmosis-endemic cohort was 0.71 (95% CI 0.61-0.80). In the non-endemic cohort, the AUC for Mayo alone was similar (0.70, 95% CI 0.62-0.78). When both cohorts were combined, the AUC for Mayo was 0.73 (95% CI 0.67-0.78).

Table 2.

Interaction term prediction model diagnostic characteristics in endemic, non-endemic, and combined cohorts

| Model | Endemic [a] (n=157): AUC [c] (95% CI [d]) | Endemic: p-value [e] | Non-endemic [b] (n=170): AUC [c] (95% CI [d]) | Non-endemic: p-value [e] | Combined (n=327): AUC [c] (95% CI [d]) | Combined: p-value [e] |
|---|---|---|---|---|---|---|
| Mayo | 0.71 (0.61, 0.80) | | 0.70 (0.62, 0.78) | | 0.73 (0.67, 0.78) | |
| Bootstrap (0.632) optimism corrected | 0.70 (0.60, 0.82) | | 0.70 (0.61, 0.78) | | 0.72 (0.66, 0.78) | |
| Mayo and Histoplasmosis | 0.79 (0.70, 0.88) | 0.05 | 0.71 (0.63, 0.79) | 0.46 | 0.80 (0.75, 0.85) | <0.001 |
| Bootstrap (0.632) optimism corrected | 0.79 (0.69, 0.89) | | 0.69 (0.59, 0.78) | | 0.77 (0.71, 0.82) | |
| Mayo and Radiomics | 0.73 (0.64, 0.82) | 0.44 | 0.81 (0.74, 0.87) | <0.001 | 0.80 (0.76, 0.85) | <0.001 |
| Bootstrap (0.632) optimism corrected | 0.73 (0.63, 0.82) | | 0.80 (0.73, 0.87) | | 0.79 (0.73, 0.84) | |
| Combined biomarker model [f] | 0.81 (0.73, 0.89) | 0.03 | 0.81 (0.75, 0.87) | <0.001 | 0.84 (0.79, 0.88) | <0.001 |
| Bootstrap (0.632) optimism corrected | 0.80 (0.71, 0.89) | | 0.79 (0.72, 0.87) | | 0.80 (0.74, 0.85) | |

[a] Endemic for Histoplasmosis

[b] Not endemic for Histoplasmosis

[c] AUC, Area under the receiver-operating characteristics curve

[d] 95% CI, 95% confidence interval

[e] For each model vs. Mayo

[f] Includes radiomics, Histoplasmosis EIA, Mayo, interaction terms

Figure 2. Receiver Operating Characteristics Curve Comparing Lung Cancer Prediction Models with Interaction Terms Among Combined (2a), Histoplasmosis-Endemic (2b), and Non-Endemic Cohorts (2c).


Figure 2a displays the apparent AUCs for lung cancer prediction models among the combined cohort (n=327). The combined biomarker model including Mayo, Histoplasmosis EIA, and radiomics exhibits the greatest AUC. Figure 2b displays apparent AUCs for lung cancer prediction models among the Histoplasmosis-endemic cohort (VUMC and UoC, n=157). The combined biomarker model including Mayo, Histoplasmosis EIA, and radiomics exhibits the greatest AUC. Figure 2c displays apparent AUCs for lung cancer prediction models among the non-endemic cohort (UPMC and DECAMP, n=170). The combined biomarker model including Mayo, Histoplasmosis EIA, and radiomics as well as the model including Mayo and radiomics exhibit the greatest AUCs.

AUC, Area under the receiver-operating characteristics curve

VUMC, Vanderbilt University Medical Center

UoC, University of Cincinnati

UPMC, University of Pittsburgh Medical Center

DECAMP, Detection of Early Lung Cancer Among Military Personnel

Histoplasmosis with Interaction Term

When Histoplasmosis IgG and IgM were added to Mayo including a variable accounting for endemic Histoplasmosis (interaction term), the endemic cohort’s AUC increased to 0.79 (95% CI 0.70-0.88, p=0.05). However, in the non-endemic cohort, the AUC for this model did not significantly improve (0.71, 95% CI 0.63-0.79, p=0.46). When the two groups were combined, the AUC was 0.80 (95% CI 0.75–0.85, p=0.0006).

Radiomics with Interaction Term

When radiomics score was added to Mayo with an interaction term, diagnostic accuracy was not significantly improved in the Histoplasmosis-endemic cohort with an AUC of 0.73 (95% CI 0.64-0.82, p=0.44). However, in the non-endemic set, diagnostic accuracy of this model did improve with an AUC of 0.81 (95% CI 0.74-0.87, p=0.0007). When both groups were combined, the AUC was 0.80 (95% CI 0.76-0.85, p=0.0005).

Combined Biomarker Model with Interaction Term

The combined biomarker model included Mayo, radiomics, Histoplasmosis IgG and IgM, and interaction terms. This model demonstrated improved diagnostic accuracy over Mayo alone with an AUC of 0.81 (95% CI 0.73-0.89, p=0.03) in the endemic group. Likewise, this model exhibited improved performance in the non-endemic group with an AUC of 0.81 (95% CI 0.75-0.87, p=0.0007). Diagnostic improvement remained when both groups were combined with an AUC of 0.84 (95% CI 0.79-0.88, p<0.0001). This model demonstrated a sensitivity of 90%, specificity of 61%, and positive predictive value of 77% based on Youden’s index.
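To make the operating point concrete, the sketch below selects the threshold that maximises Youden's index, J = sensitivity + specificity − 1, and converts a sensitivity and specificity pair into a positive predictive value for a given case mix. Function names and the toy data are hypothetical, and treating a score at or above the threshold as test-positive is an assumption.

```python
def youden_threshold(scores, labels):
    """Return (threshold, sensitivity, specificity) maximising Youden's J.

    A score at or above the threshold counts as test-positive.
    """
    n_cancer = sum(1 for y in labels if y == 1)
    n_benign = len(labels) - n_cancer
    if n_cancer == 0 or n_benign == 0:
        raise ValueError("need at least one cancer and one benign case")
    best = None
    for t in sorted(set(scores)):
        true_pos = sum(1 for s, y in zip(scores, labels) if y == 1 and s >= t)
        true_neg = sum(1 for s, y in zip(scores, labels) if y == 0 and s < t)
        sensitivity = true_pos / n_cancer
        specificity = true_neg / n_benign
        j = sensitivity + specificity - 1.0
        if best is None or j > best[0]:
            best = (j, t, sensitivity, specificity)
    return best[1], best[2], best[3]


def positive_predictive_value(sensitivity, specificity, n_cancer, n_benign):
    """PPV implied by sensitivity and specificity for a given case mix."""
    true_pos = sensitivity * n_cancer
    false_pos = (1.0 - specificity) * n_benign
    return true_pos / (true_pos + false_pos)


# Toy data in which a threshold of 0.7 separates the classes perfectly.
toy_cut = youden_threshold([0.1, 0.2, 0.3, 0.7, 0.8, 0.9], [0, 0, 0, 1, 1, 1])

# Consistency check against the figures reported above, assuming they
# refer to the combined cohort of 192 malignant and 135 benign nodules.
implied_ppv = positive_predictive_value(0.90, 0.61, 192, 135)
```

Under that assumption, a sensitivity of 90% and specificity of 61% imply 172.8 true positives and 52.65 false positives, giving a PPV of about 77%, which matches the reported value.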

Internal Validation

Each of the above models was internally validated in the combined population (n=327) as well as the endemic and non-endemic subgroups. Bootstrap optimism corrected AUCs are provided in Table 2. Model optimism was minimal across different groupings and models, indicating overestimation of model performance was minimal to none. Calibration plots demonstrated similar findings with improved model calibration when Mayo and Histoplasmosis IgG were allowed to have a flexible non-linear relationship (Supplementary Figure S4).
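The idea behind optimism correction can be sketched with a plain optimism bootstrap. This is a simplified stand-in and not the procedure used in this study: Table 2 labels its estimates as a 0.632 bootstrap, whose weighting is not reproduced here. The `fit` argument is a hypothetical callable that trains a model on the supplied rows and returns a function mapping rows to risk scores.

```python
import random


def auc(scores, labels):
    """Pairwise AUC with ties counted as one half."""
    cancer = [s for s, y in zip(scores, labels) if y == 1]
    benign = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if c > b else 0.5 if c == b else 0.0
               for c in cancer for b in benign)
    return wins / (len(cancer) * len(benign))


def optimism_corrected_auc(rows, labels, fit, n_boot=200, seed=0):
    """Apparent AUC minus the average bootstrap optimism."""
    rng = random.Random(seed)
    n = len(rows)
    apparent = auc(fit(rows, labels)(rows), labels)
    optimism = []
    while len(optimism) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        boot_rows = [rows[i] for i in idx]
        boot_labels = [labels[i] for i in idx]
        if len(set(boot_labels)) < 2:
            continue   # AUC is undefined without both classes; redraw
        model = fit(boot_rows, boot_labels)
        # Optimism: performance on the data used for fitting minus
        # performance of the same fitted model on the original data.
        optimism.append(auc(model(boot_rows), boot_labels)
                        - auc(model(rows), labels))
    return apparent - sum(optimism) / len(optimism)
```

A model that is refit in every resample and overfits will score better on its own bootstrap sample than on the original data, and that average gap is what gets subtracted. As noted in the limitations, any modeling step left outside `fit`, such as choosing linear versus nonlinear terms, is not captured by this correction.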

Reclassification Results

Reclassification was calculated using predictions from the combined biomarker model including interaction terms versus the Mayo model. We observed improved reclassification among malignant nodules with the combined biomarker model (cNRI=0.18). There were no nodules incorrectly reclassified as low risk (false negatives) when the combined model was used. There were 19 benign nodules incorrectly reclassified as high risk (false positives), and we observed no significant difference in the ability of the combined model to reclassify benign nodules compared to the Mayo model (cNRI=0.009) (Figure 3).

Figure 3. Risk Reclassification for Benign and Malignant Nodules Using Combined Biomarker Model (CBM).


Reclassification of benign (n=135, left) and malignant (n=192, right) IPNs by the CBM (x-axis: Mayo score, y-axis: CBM score). Vertical and horizontal lines represent 10% and 70% risk thresholds (<10% low, >70% high). Green boxes represent IPNs correctly reclassified as low risk for benign (true negatives) and high risk for cancer (true positives) using the CBM. Red boxes represent IPNs incorrectly reclassified as high risk for benign (false positives) and low risk for cancer (false negatives) using the CBM. Within the benign nodules, 17 patients were correctly reclassified as low risk while 19 patients were incorrectly reclassified as high risk. Since an almost equal number of patients were shifted into these respective categories, the cNRI is close to zero (0.009) meaning there was no significant difference in the ability to reclassify benign nodules. Importantly, no patients with malignant nodules were incorrectly reclassified as low risk which could lead to missed or delayed cancer diagnoses.

CBM, combined biomarker model: Mayo, Histoplasmosis, and Radiomics

Discussion:

IPNs constitute a diagnostic and financial burden in healthcare. This is especially true in regions endemic for Histoplasmosis, where the rate of positive lung screening results can be triple the rate in non-endemic regions(25). Although most are benign, many IPNs require expensive and invasive procedures to obtain a diagnosis. This study demonstrated an improvement in the diagnostic accuracy of IPNs when a combined biomarker model including Histoplasmosis antibodies and radiomics was used compared to the current standard of risk estimation (Mayo). Furthermore, the combined biomarker model exhibited a strong ability to reclassify malignant nodules from intermediate risk to high risk, illustrating its potential as a clinically useful rule-in test for cancer.

This study also demonstrated the importance of understanding populations when building clinical risk prediction models. Four sites were used to train and validate lung cancer prediction models in this study. VUMC and UoC had high cancer prevalence and were located in Histoplasmosis-endemic regions, while UPMC and DECAMP were not. Additionally, these sites included both screening and nodule clinic populations. We observed that Histoplasmosis and cancer prevalence in a population impact model performance when these biomarkers are used. When the combined biomarker model was created in the VUMC cohort (initial simplified prediction model), it demonstrated variable results in the validation cohorts due to population differences. This showed that a single risk model could not perform consistently across different types of populations, perhaps due to varying degrees of cancer and fungal prevalence.

A model including the effect of Histoplasmosis endemicity was needed to account for these population differences. When the combined biomarker model including this endemic interaction term was applied to cohorts grouped by fungal prevalence, it highlighted the differential effects seen from radiomics and Histoplasmosis. When radiomics was added, there was a significant increase in the AUC within the non-endemic cohort. When Histoplasmosis was added, the AUC increased significantly within the endemic cohort. Internal validation supported the robustness of these results. Similar trends were noted in our first simplified modeling approach when VUMC served as a training cohort. While the Mayo model includes certain variables related to nodule features such as diameter, location, and presence of spiculation, the radiomics score includes far more complex image derived morphological as well as textural features. When these additional quantitative imaging features were used, there was an improvement in the diagnostic accuracy for IPNs among the combined cohort. However, optimization and validation of the radiomic model is needed for endemic regions.

The combined biomarker model exhibited a strong ability to reclassify malignant nodules out of the intermediate risk category. Notably, 19 of 99 benign Mayo-calculated intermediate risk nodules were incorrectly reclassified as high risk, representing this model's limitation in reclassifying benign disease. Since 17 benign nodules were correctly reclassified as low risk, an almost equal number of patients were shifted into these respective categories. Thus, mathematically the cNRI is close to zero (0.009) meaning the model showed no significant difference in the ability to reclassify benign nodules. This is likely attributable to the high cancer prevalence seen in the Histoplasmosis endemic populations comprising this study. Even so, these results suggest the potential for a combined biomarker approach to serve as a clinically meaningful rule-in test for cancer.

There are several strengths in this study including the addition of Histoplasmosis antibodies to a combined biomarker lung cancer prediction model. While studies have demonstrated an improvement in cancer risk prediction using combinations of biomarkers and clinical risk factors, the addition of a fungal biomarker is a novel approach(12,26,27). The Histoplasmosis serology testing with EIA used in this study was developed by a company with extensive peer reviewed publication on its use and is a strength of this work(2,19,28). MiraVista Diagnostics is a CLIA and CAP certified laboratory which conducts diagnostic testing for fungal disease exclusively. This laboratory is accessible to multiple institutions, allows for a variety of sample media (serum, plasma, urine, sputum, and BAL) and various settings including clinics, hospitals, universities, and larger laboratories. The test is also highly robust in signal detection in frozen samples irrespective of age of storage, assuming few freeze-thaw cycles. By using this diagnostic company, we attempted to ensure quality control of Histoplasmosis EIA testing. Additional strengths in this study include the blinding of specimens and imaging which minimized the chance a diagnosis affected the interpretation of results. Our study population consisted of adults with IPNs, which is the setting for which this combined biomarker model is intended. Both lung screening and surgical clinic (incidentally detected nodule) populations from various regions were included in the study which increases generalizability. With at least 85% of lung cancer associated with smoking behaviors and 100% of the screening population having a significant smoking history, the vast majority of our overall population had some smoking history which would be a potential benefit if used in the screening setting(29-31). 
Finally, our prediction models accounted for non-linearity among variables and included a specific variable addressing population differences in Histoplasmosis prevalence. Of note, we did not include an interaction term with the Mayo variable as the Mayo score was developed for the general population and thus no differential effect was expected.

This study also has several limitations. Our initial aim was to train and validate a combined biomarker model in various populations (approach 1, initial simplified model). Although the model was trained at VUMC and then externally validated, the endemic validation cohort from UoC was small and contained little benign disease. To develop a more optimized prediction model, we next included the effect of Histoplasmosis endemicity (approach 2, full prediction model with interaction term). Including the endemic interactions allowed for differential effects of the newly considered biomarkers in these populations. We did not include the interaction in the well-established and validated Mayo model. As such, some of the improvement in discriminative ability of the new models compared with Mayo may be due to the effect of Histoplasmosis endemicity on the predicted cancer risk. The results of Mayo with the endemic main effect term can be found in the supplementary text (Table S3). While the model including endemic interaction terms was internally validated, the bootstrap did not account for all modeling steps (namely, the selection of linear vs nonlinear terms), which could result in an underestimation of the model optimism. Further, this more advanced model requires external validation. The majority of cohorts were derived from tertiary care referral centers, which could contribute to population or selection bias. Thus, the prevalence of cancer seen in this study may not represent that of the general population, and model performance may therefore be less generalizable. Additionally, while there was a minimal difference in patient age between benign and malignant disease in the non-endemic cohort, the age difference between benign and malignant diagnoses was larger in the endemic cohort. As age is a variable within the Mayo model, this difference is a potential confounder in our study.

Our study included the Mayo model because it is a commonly used lung cancer prediction model in clinical practice. The nodule radiomic features are derived from the segmented nodule only and not from features outside the nodule, such as nodal or splenic calcifications, which is a limitation of the combined model’s radiomic feature set.(12) Not all variables for newer risk models such as the Brock model were available. Finally, the prediction models investigated in this study focused on Histoplasmosis antibodies as fungal biomarkers. Future work should include other mycotic diseases and their biomarkers as well.
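
The bootstrap limitation noted above can be illustrated with a minimal sketch, which is not the analysis code used in this study: an optimism-corrected AUC in which the choice between a linear and a nonlinear term is repeated inside every bootstrap replicate. All names (`build_model`, `optimism_corrected_auc`), the simulated single predictor, the quadratic term standing in for a nonlinear form, and the AIC-based selection rule are assumptions made for illustration.

```python
import math
import random

def auc(scores, labels):
    """Mann-Whitney estimate of the AUC; tied scores count as one half."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def linear_predictor(w, row):
    z = w[0] + sum(wj * xj for wj, xj in zip(w[1:], row))
    return max(-30.0, min(30.0, z))  # clamp so exp() stays well behaved

def fit_logistic(X, y, steps=200, lr=0.5):
    """Crude gradient-ascent logistic fit (illustration only)."""
    w = [0.0] * (len(X[0]) + 1)  # intercept plus one weight per column
    for _ in range(steps):
        grad = [0.0] * len(w)
        for row, yi in zip(X, y):
            err = yi - 1.0 / (1.0 + math.exp(-linear_predictor(w, row)))
            grad[0] += err
            for j, xj in enumerate(row):
                grad[j + 1] += err * xj
        w = [wj + lr * g / len(y) for wj, g in zip(w, grad)]
    return w

def log_likelihood(w, X, y):
    total = 0.0
    for row, yi in zip(X, y):
        z = linear_predictor(w, row)
        total += yi * z - math.log1p(math.exp(z))
    return total

def expand(x, degree):
    """degree 1 -> [x]; degree 2 -> [x, x^2] (assumed nonlinear form)."""
    return [[xi ** d for d in range(1, degree + 1)] for xi in x]

def build_model(x, y):
    """Whole pipeline: fit both forms, keep the one with the lower AIC."""
    fits = []
    for degree in (1, 2):
        X = expand(x, degree)
        w = fit_logistic(X, y)
        fits.append((2 * len(w) - 2 * log_likelihood(w, X, y), degree, w))
    _, degree, w = min(fits)
    return lambda xs: [linear_predictor(w, row) for row in expand(xs, degree)]

def optimism_corrected_auc(x, y, n_boot=30, seed=0):
    rng = random.Random(seed)
    n = len(y)
    apparent = auc(build_model(x, y)(x), y)
    optimism, done = 0.0, 0
    while done < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        xb, yb = [x[i] for i in idx], [y[i] for i in idx]
        if len(set(yb)) < 2:
            continue  # a resample needs both classes to compute an AUC
        model = build_model(xb, yb)  # selection is redone in each replicate
        optimism += auc(model(xb), yb) - auc(model(x), y)
        done += 1
    return apparent, apparent - optimism / n_boot

# Simulated data: one predictor with a truly linear effect on the log-odds.
rng = random.Random(1)
x = [rng.gauss(0.0, 1.0) for _ in range(80)]
y = [1 if rng.random() < 1.0 / (1.0 + math.exp(-1.5 * xi)) else 0 for xi in x]
apparent, corrected = optimism_corrected_auc(x, y)
print(f"apparent AUC {apparent:.3f}, optimism-corrected AUC {corrected:.3f}")
```

If the selection step were instead run once on the full data and only the chosen form refit in each replicate, the optimism estimate would tend to be too small, which is the underestimation described above.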

In conclusion, this study demonstrated improved diagnostic accuracy for IPNs when a combined biomarker model including Histoplasmosis antibodies, radiomics score, and clinical risk factors (Mayo) was used and the modeling included interaction terms to account for population differences. While further validation is required, and radiomic models need recalibration and optimization in regions endemic for Histoplasmosis, integrating a combined biomarker approach into the diagnostic algorithm for IPNs could increase diagnostic accuracy and reclassification and therefore improve clinical management.

Supplementary Material

Supplementary Content

Financial Support:

This work is supported by U01CA152662 (E.L. Grogan and S.A. Deppen) and T32CA106183-18 (H.N. Marmor).

Footnotes

Disclosures: The authors declare no potential conflicts of interest.

References:

1. Gould MK, Tang T, Liu ILA, Lee J, Zheng C, Danforth KN, et al. Recent Trends in the Identification of Incidental Pulmonary Nodules. Am J Respir Crit Care Med. 2015 Nov 15;192(10):1208–14.
2. Deppen SA, Massion PP, Blume J, Walker RC, Antic S, Chen H, et al. Accuracy of a Novel Histoplasmosis Enzyme Immunoassay to Evaluate Suspicious Lung Nodules. Cancer Epidemiol Biomarkers Prev. 2019 Feb;28(2):321–6.
3. Grogan EL, Deppen SA, Ballman KV, Andrade GM, Verdail FC, Aldrich MC, et al. Accuracy of FDG-PET to diagnose lung cancer in the ACOSOG Z4031 trial. JCO [Internet]. 2012 May 20 [cited 2021 Jun 14];30(15_suppl):7008–7008. Available from: http://ascopubs.org/doi/10.1200/jco.2012.30.15_suppl.7008
4. Choi HK, Ghobrial M, Mazzone PJ. Models to Estimate the Probability of Malignancy in Patients with Pulmonary Nodules. Ann Am Thorac Soc. 2018 Oct;15(10):1117–26.
5. Gould MK, Donington J, Lynch WR, Mazzone PJ, Midthun DE, Naidich DP, et al. Evaluation of individuals with pulmonary nodules: when is it lung cancer? Diagnosis and management of lung cancer, 3rd ed: American College of Chest Physicians evidence-based clinical practice guidelines. Chest. 2013 May;143(5 Suppl):e93S–e120S.
6. Callister MEJ, Baldwin DR, Akram AR, Barnard S, Cane P, Draffan J, et al. British Thoracic Society guidelines for the investigation and management of pulmonary nodules. Thorax. 2015 Aug;70 Suppl 2:ii1–54.
7. National Comprehensive Cancer Network. Non-Small Cell Lung Cancer. NCCN Clinical Practice Guidelines in Oncology, 2021. Version 4.2021. p. 16–24.
8. Massion PP, Walker RC. Indeterminate pulmonary nodules: risk for having or for developing lung cancer? Cancer Prev Res (Phila). 2014 Dec;7(12):1173–8.
9. Paez R, Kammer MN, Massion P. Risk stratification of indeterminate pulmonary nodules. Current Opinion in Pulmonary Medicine [Internet]. 2021 Jul [cited 2022 Jan 3];27(4):240–8. Available from: https://journals.lww.com/10.1097/MCP.0000000000000780
10. Lee G, Park H, Bak SH, Lee HY. Radiomics in Lung Cancer from Basic to Advanced: Current Status and Future Directions. Korean J Radiol. 2020 Feb;21(2):159–71.
11. Liu Y, Balagurunathan Y, Atwater T, Antic S, Li Q, Walker RC, et al. Radiological Image Traits Predictive of Cancer Status in Pulmonary Nodules. Clin Cancer Res. 2017 Mar 15;23(6):1442–9.
12. Kammer MN, Lakhani DA, Balar AB, Antic SL, Kussrow AK, Webster RL, et al. Integrated Biomarkers for the Management of Indeterminate Pulmonary Nodules. Am J Respir Crit Care Med. 2021 Aug 31; 1306–1316.
13. Maldonado F, Varghese C, Rajagopalan S, Duan F, Balar AB, Lakhani DA, et al. Validation of the BRODERS classifier (Benign versus aggRessive nODule Evaluation using Radiomic Stratification), a novel HRCT-based radiomic classifier for indeterminate pulmonary nodules. Eur Respir J. 2021 Apr;57(4):2002485.
14. Massion PP, Antic S, Ather S, Arteta C, Brabec J, Chen H, et al. Assessing the Accuracy of a Deep Learning Method to Risk Stratify Indeterminate Pulmonary Nodules. Am J Respir Crit Care Med. 2020 Jul 15;202(2):241–9.
15. Shipe ME, Deppen SA, Sullivan S, Kammer M, Starnes SL, Wilson DO, et al. Validation of Histoplasmosis Enzyme Immunoassay to Evaluate Suspicious Lung Nodules. Ann Thorac Surg. 2021 Feb;111(2):416–20.
16. Pepe MS, Feng Z, Janes H, Bossuyt PM, Potter JD. Pivotal evaluation of the accuracy of a biomarker used for classification or prediction: standards for study design. J Natl Cancer Inst. 2008 Oct 15;100(20):1432–8.
17. Billatos E, Duan F, Moses E, Marques H, Mahon I, Dymond L, et al. Detection of early lung cancer among military personnel (DECAMP) consortium: study protocols. BMC Pulm Med. 2019 Mar 7;19(1):59.
18. Tuck MK, Chan DW, Chia D, Godwin AK, Grizzle WE, Krueger KE, et al. Standard operating procedures for serum and plasma collection: early detection research network consensus statement standard operating procedure integration working group. J Proteome Res. 2009 Jan;8(1):113–7.
19. Richer SM, Smedema ML, Durkin MM, Herman KM, Hage CA, Fuller D, et al. Improved Diagnosis of Acute Pulmonary Histoplasmosis by Combining Antigen and Antibody Detection. Clin Infect Dis. 2016 Apr 1;62(7):896–902.
20. Swensen SJ, Silverstein MD, Ilstrup DM, Schleck CD, Edell ES. The probability of malignancy in solitary pulmonary nodules. Application to small radiologically indeterminate nodules. Arch Intern Med. 1997 Apr 28;157(8):849–55.
21. Efron B. Estimating the Error Rate of a Prediction Rule: Improvement on Cross-Validation. Journal of the American Statistical Association [Internet]. 1983 Jun [cited 2022 Jan 24];78(382):316–31. Available from: http://www.tandfonline.com/doi/abs/10.1080/01621459.1983.10477973
22. Harrell FE. Regression Modeling Strategies: With Applications to Linear Models, Logistic Regression, and Survival Analysis. 2nd edition. Springer; 2015.
23. Baldwin DR, Callister MEJ, Guideline Development Group. The British Thoracic Society guidelines on the investigation and management of pulmonary nodules. Thorax. 2015 Aug;70(8):794–8.
24. Edwards LB, Acquaviva FA, Livesay VT, Cross FW, Palmer CE. An atlas of sensitivity to tuberculin, PPD-B, and histoplasmin in the United States. Am Rev Respir Dis. 1969 Apr;99(4):Suppl:1–132.
25. Starnes SL, Reed MF, Meyer CA, Shipley RT, Jazieh AR, Pina EM, et al. Can lung cancer screening by computed tomography be effective in areas with endemic histoplasmosis? J Thorac Cardiovasc Surg. 2011 Mar;141(3):688–93.
26. Ajona D, Remirez A, Sainz C, Bertolo C, Gonzalez A, Varo N, et al. A model based on the quantification of complement C4c, CYFRA 21–1 and CRP exhibits high specificity for the early diagnosis of lung cancer. Transl Res. 2021 Jul;233:77–91.
27. Silvestri GA, Tanner NT, Kearney P, Vachani A, Massion PP, Porter A, et al. Assessment of Plasma Proteomics Biomarker’s Ability to Distinguish Benign From Malignant Lung Nodules: Results of the PANOPTIC (Pulmonary Nodule Plasma Proteomic Classifier) Trial. Chest. 2018 Sep;154(3):491–500.
28. Connolly PA, Durkin MM, Lemonte AM, Hackett EJ, Wheat LJ. Detection of histoplasma antigen by a quantitative enzyme immunoassay. Clin Vaccine Immunol. 2007 Dec;14(12):1587–91.
29. Siegel DA, Fedewa SA, Henley SJ, Pollack LA, Jemal A. Proportion of Never Smokers Among Men and Women With Lung Cancer in 7 US States. JAMA Oncol. 2021 Feb 1;7(2):302–4.
30. US Preventive Services Task Force, Krist AH, Davidson KW, Mangione CM, Barry MJ, Cabana M, et al. Screening for Lung Cancer: US Preventive Services Task Force Recommendation Statement. JAMA. 2021 Mar 9;325(10):962–70.
31. Warren GW, Cummings KM. Tobacco and lung cancer: risks, trends, and outcomes in patients with cancer. Am Soc Clin Oncol Educ Book. 2013;359–64.

Data Availability Statement

The data generated in this study are not publicly available as no public database with this information exists, but are available upon reasonable request from the corresponding author.
