Abstract
Background
Evidence-based guidelines recommend management strategies for malignant pleural effusions (MPEs) based on life expectancy. Existing risk-prediction rules do not provide precise individualized survival estimates.
Research Question
Can a newly developed continuous risk-prediction survival model for patients with MPE and known metastatic disease provide precise survival estimates?
Study Design and Methods
Single-center retrospective cohort study of patients with proven malignancy, pleural effusion, and known metastatic disease undergoing thoracentesis from 2014 through 2017. The outcome was time from thoracentesis to death. Risk factors were identified using Cox proportional hazards models. Effect-measure modification (EMM) was tested using the Mantel-Cox test and was addressed by using disease-specific models (DSMs) or interaction terms. Three DSMs and a combined model using interactions were generated. Discrimination was evaluated using Harrell’s C-statistic. Calibration was assessed by observed-minus-predicted probability graphs at specific time points. Models were validated using patients treated from 2010 through 2013. Using LENT (pleural fluid lactate dehydrogenase, Eastern Cooperative Oncology Group performance score, neutrophil-to-lymphocyte ratio and tumor type) variables, we generated both discrete (LENT-D) and continuous (LENT-C) models, assessing discrete vs continuous predictors’ performances.
Results
The development and validation cohorts included 562 and 727 patients, respectively. The Mantel-Cox test demonstrated interactions between cancer type and neutrophil-to-lymphocyte ratio (P < .0001), pleural fluid lactate dehydrogenase (P = .029), and bilateral effusion (P = .002). DSMs for lung, breast, and hematologic malignancies showed C-statistics of 0.72, 0.72, and 0.62, respectively; the combined model’s C-statistic was 0.67. LENT-D (C-statistic, 0.60) and LENT-C (C-statistic, 0.65) models underperformed.
Interpretation
EMM is present between cancer type and other predictors; thus, DSMs outperformed the models that failed to account for this. Discrete risk-prediction models lacked enough precision to be useful for individual-level predictions.
Key Words: malignant pleural effusions, prediction model, survival analysis
Abbreviations: BLESS, Breast and Lung Effusion Survival Score; CI, Harrell’s C-statistic including ties; CRP, C-reactive protein; DSM, disease-specific model; ECOG, Eastern Cooperative Oncology Group; EMM, effect-measure modification; GCMM, general cancer multivariate model; HR, hazard ratio; LDH, lactate dehydrogenase; LENT, pleural fluid lactate dehydrogenase, ECOG performance score, neutrophil-to-lymphocyte ratio and tumor type; LENT-C, continuous LENT model; LENT-D, discrete LENT model; MPE, malignant pleural effusion; NLR, neutrophil to lymphocyte ratio
FOR EDITORIAL COMMENT, SEE PAGE 805
Malignant pleural effusion (MPE) is the leading cause of pleural exudates among the approximately 1.5 million new pleural effusions diagnosed yearly in the United States.1,2 MPEs are associated with poor quality of life and a median survival of 4 to 7 months from the time of diagnosis.1,3,4
Evidence-based guidelines suggest using temporizing treatments such as thoracentesis in patients with limited life expectancy, whereas they suggest offering patients with a better prognosis definitive interventions such as indwelling pleural catheters, active pleurodesis, or both.5,6 This requires a method to predict survival in MPE, given that clinical judgment often is wrong in these cases.7
Several prediction models have been developed, but most have not been validated externally.8, 9, 10, 11, 12, 13, 14 The LENT score is a discrete risk-prediction model that estimates survival for MPE based on pleural fluid lactate dehydrogenase (LDH), Eastern Cooperative Oncology Group (ECOG) score, blood neutrophil to lymphocyte ratio (NLR), and tumor type.10 It assigns patients a low, moderate, or high risk of death, and it has been adopted widely because of its simplicity.15,16 The PROMISE score predicts 3-month risk of death using previous chemotherapy or radiotherapy history, hemoglobin level, white blood cell count, C-reactive protein (CRP) level, ECOG score, and cancer type.14
However, because in LENT only three possible predicted median survival times are included (319, 130, and 44 days for low-, moderate-, and high-risk patients, respectively), it lacks precision and has significant limitations when used to inform decisions in individual patients.17 In contrast, continuous risk-prediction models could offer individualized predictions precise enough to inform clinical decisions effectively in different contexts. Another limitation of both LENT and PROMISE is that they consolidate all cancer types into just three groups. In LENT, patients with MPE were stratified by their underlying cancer type, and those who showed similar mortality were clustered together.10 This simplifying assumption disregards the biological variability of different tumors (eg, lung and urothelial cancers, which include bladder, prostate, testis, and penile malignancies, are grouped together). Consequently, the magnitude of the risk associated with a variable, which may depend on the cancer type, could be overlooked.18,19
Our primary objective was to develop a continuous risk-prediction model for patients with MPE to provide more precise individualized estimates of survival probability. Our secondary objectives were to validate the model, to validate the LENT score externally, and to compare LENT with our disease-specific models. We hypothesized that predicting survival in MPE requires separate models for each cancer type because the hazard ratios (HRs) associated with a given risk factor would change significantly depending on the underlying cancer present.
Methods
This was a single-center retrospective cohort study. We included consecutive patients 18 years of age or older with a known active cancer and prior evidence of metastases who underwent a first thoracentesis for symptomatic pleural effusions from January 2014 through July 2017. Purely diagnostic thoracenteses without symptoms were not included because guidelines were developed for symptomatic MPE.20,21 Patients were identified by the Current Procedure Terminology billing code for thoracentesis and their charts were reviewed. Diagnosis of MPE was based on pleural fluid cytologic findings, pleural biopsy results, or an exudative pleural effusion in patients with histologically proven metastatic disease and no other identifiable cause after a thorough workup.3,4,22 Exclusion criteria were: (1) lost to follow-up immediately after the thoracentesis; (2) lack of radiographic imaging within 14 days from first thoracentesis; (3) loculated pleural effusions on chest radiography; (4) previous pleural fluid drainage, including chest tube placement; (5) no histology-proven active cancer; and (6) absence of biopsy-proven or high clinical suspicion of metastatic disease. The model is designed for patients with known metastatic disease because this clinical scenario is both common and reproducible and avoids mixing populations with different survival probabilities together. A patient with a primary lung cancer and proven distant metastases to bones who also harbors an MPE carries a different prognosis than a patient with just an MPE, and this is reflected in the staging system (M1b or M1c for distant metastasis vs M1a for MPE alone).23 Loculated effusions were excluded from the study because they follow a different management algorithm, and therefore do not inform management of the current patient population.6
The primary outcome was time from first thoracentesis to death during the subsequent 6 months. Patients alive at 180 days or lost to follow-up were censored. For all patients, we recorded date of death or of last follow-up, age, sex, cancer type, blood absolute neutrophil and lymphocyte counts, previous diagnosis of congestive heart failure, high clinical suspicion of pneumonia, ascites, and chemotherapy or radiation within 30 days of thoracentesis. From the first thoracentesis, we recorded ECOG score, pleural fluid volume, and LDH, protein, cholesterol, and triglyceride levels. Laboratory values that were not measured routinely, like pleural fluid pH and CRP level, were not included to reduce missing values and to maintain sample size. Size of effusion was estimated on chest radiography within 14 days before thoracentesis using a previously developed classification (e-Fig 1).24 If patients underwent chest CT imaging within 30 days before thoracentesis, it was used to evaluate lung nodules or masses and pleural thickening or nodularity.
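The outcome definition above (death within 6 months, with censoring at 180 days or at loss to follow-up) can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: the function name `time_and_event`, the example dates, and the choice to count a death on day 180 itself as an event are all assumptions.

```python
from datetime import date

HORIZON_DAYS = 180  # follow-up window stated in the text


def time_and_event(thoracentesis, last_contact, died):
    """Return (days_at_risk, event) with administrative censoring at 180 days.

    event is 1 only if death was observed within the horizon. Patients alive
    at 180 days, or lost to follow-up earlier, are censored (event = 0).
    Assumption: a death on day 180 itself is counted as an event.
    """
    days = (last_contact - thoracentesis).days
    if died and days <= HORIZON_DAYS:
        return days, 1
    return min(days, HORIZON_DAYS), 0


# Hypothetical patients (dates invented for illustration only)
died_early = time_and_event(date(2015, 1, 10), date(2015, 3, 1), True)
died_late = time_and_event(date(2015, 1, 10), date(2016, 1, 10), True)
lost = time_and_event(date(2015, 1, 10), date(2015, 2, 9), False)
print(died_early, died_late, lost)
```

A patient who dies after day 180 contributes 180 days of follow-up and no event, which is what restricts the analysis to 6-month survival.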
For continuous variables (age, NLR, and pleural fluid LDH, protein, cholesterol, and triglycerides levels), we tested whether the association with survival time was linear using log-log plots. If evidence of a nonlinear relationship was found, the continuous variable was transformed into a categorical variable. This was necessary only for pleural fluid triglycerides in the lung cancer cohort; the three categories created were: < 26 mg/dL, 26 to 105 mg/dL, and > 105 mg/dL.
Potential risk factors for mortality were chosen based on literature review.12, 13, 14,17,24,25 We used univariate Cox models to evaluate the association between these factors and mortality. We prespecified that potential risk factors with a P value of < .2 on univariate analysis qualified as candidates for multivariate model building, checking for proportional hazards assumption violations using a linear regression of the scaled Schoenfeld residuals and log-log plots. We then used backward selection with the candidate variables to build the multivariate model, using a P value of ≤ .05 as the cutoff to remain in the model.
Effect-Measure Modification
Effect-measure modification (EMM) refers to the situation in which a measure of effect between an outcome variable and an exposure variable changes over values of some other variable.26 Previous studies suggested that the associations of NLR and LDH with mortality differ by tumor type.27 To check for EMM, we used the Mantel-Cox (log-rank) test (see e-Appendix 1).28 Significant interactions (P < .05) were analyzed graphically using forest plots.29 We tested for interactions between continuous exposure variables and dichotomous effect modifiers by using factorial interactions and graphs of the relative hazard difference.
After EMM was detected, we addressed it with two different methods. First, we created a multivariate Cox model with interaction terms to address EMM by cancer type. Only the three most common cancers in our cohort (lung, breast, and hematologic) were included. We call this the general cancer multivariate model (GCMM) because it is a single equation applicable to any patient with these three malignancies. Second, we created a disease-specific model (DSM) for each of these three malignancies, allowing the flexibility for each cancer type to have its own baseline survival function and independently estimated HR (see e-Appendix 1).
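The difference between the two approaches can be written schematically as follows. The notation is an assumption introduced here for illustration (the exact parametrization is given in e-Appendix 1): $c$ indexes cancer type, $x_j$ are predictors without interaction, and $z_k$ are predictors whose effect is allowed to vary by cancer type.

```latex
% GCMM: one shared baseline hazard h_0(t); cancer-specific effects enter
% through interaction terms \gamma_{c,k}.
h(t \mid x, z, c) = h_0(t)\,
  \exp\!\Big( \sum_j \beta_j x_j + \sum_k \gamma_{c,k}\, z_k \Big)

% DSM: each cancer type c has its own baseline hazard h_{0,c}(t) and its own
% independently estimated coefficients \beta_{c,j}.
h_c(t \mid x) = h_{0,c}(t)\,
  \exp\!\Big( \sum_j \beta_{c,j}\, x_j \Big)
```

In both forms the hazard ratio for a one-unit increase in a predictor is the exponential of its coefficient; the DSM additionally lets the baseline hazard differ across cancer types.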
Model Validation
All models were validated temporally using an independent database of patients having their first thoracentesis between January 2010 and December 2013,24 selecting only patients with lung, breast, or hematologic malignancies. We generated predicted values for each patient using the GCMM and DSM models and two variants of the LENT model. One was the original LENT score with discrete categorical risk predictors (LENT-D). The other used the same LENT variables, but generated a continuous risk predictor (LENT-C) (see e-Appendix 1).10
Assessment of Model Performance
We tested discrimination and calibration for each model. For discrimination, we used Harrell’s C-statistic, Somers’ D statistic, and time-dependent areas under the receiver operating characteristic curve. Although all these methods measure concordance, they lead to different interpretations (see e-Appendix 1).30 To evaluate the effect of ties on concordance, Harrell’s C-statistic was calculated both including ties (CI) and excluding ties. Ties are rarely observed if the risk predictor and the outcome are both continuous, so the C-statistics calculated with and without ties are the same. Conversely, ties are common in discrete models with few levels, such as LENT-D. Their inclusion moves the C-statistic toward 0.5.31 CI is a better indicator of how clinically useful a prediction model is because it reflects how well the model discriminates across all cases, which is what clinicians deal with in practice (see e-Appendix 1). C-statistics were compared by estimating the differences between Harrell’s C-statistic results for each model and generating a P value using the methods described by Newson.32,33 We compared time-dependent areas under the receiver operating characteristic curve at 30, 90, and 180 days for all models.
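The effect of ties on Harrell's C-statistic can be illustrated with a small, self-contained Python sketch. This is not the Stata or R routine used in the study: the function `harrell_c`, the toy data, and the thresholds used to collapse the continuous score into three levels are hypothetical, and tied survival times are simply skipped.

```python
def harrell_c(times, events, risks):
    """Return (C including ties, C excluding ties) for a risk score.

    A pair (i, j) is usable when patient i died strictly before patient j's
    observed time. Pairs with equal risk scores count as half-concordant
    when ties are included and are dropped when ties are excluded.
    """
    concordant = discordant = tied = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            if events[i] == 1 and times[i] < times[j]:
                if risks[i] > risks[j]:
                    concordant += 1
                elif risks[i] < risks[j]:
                    discordant += 1
                else:
                    tied += 1
    usable = concordant + discordant + tied
    c_including = (concordant + 0.5 * tied) / usable
    c_excluding = concordant / (concordant + discordant)
    return c_including, c_excluding


# Toy cohort (invented): survival in days, 1 = died, 0 = censored at 180 days
times = [20, 45, 90, 150, 180, 180]
events = [1, 1, 1, 1, 0, 0]
continuous_risk = [0.9, 0.7, 0.8, 0.4, 0.3, 0.2]
# Same score collapsed into three levels (hypothetical cutoffs 0.75 and 0.35)
discrete_risk = [3 if r >= 0.75 else 2 if r >= 0.35 else 1 for r in continuous_risk]

print("continuous:", harrell_c(times, events, continuous_risk))
print("discrete:  ", harrell_c(times, events, discrete_risk))
```

With the continuous score there are no tied predictions, so both versions coincide. After collapsing to three levels, the version that includes ties is lower than the version that excludes them, which is the pull toward 0.5 described above.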
Both calibration-in-the-large and strong calibration were evaluated for all models (see e-Appendix 1). For calibration-in-the-large, we compared Kaplan-Meier graphs for mean predicted survival and observed survival for each cohort. Predicted survival curves were calculated for each patient, based on their covariates, by using the estimated HR and estimated baseline hazard function. For strong calibration, we used the difference between observed and predicted survival, and the smoothed residuals were graphed at 30, 60, 90, 120, 150, and 180 days.34 At the same time points, we also obtained the predicted survival probability by comparing time-specific observed vs predicted graphs.35 We further quantified prediction accuracy of the final Breast and Lung Effusion Survival Score (BLESS) model by measuring accuracy on a per-patient basis. Consistent with prior studies,36 we defined an individual patient’s prediction as accurate if the predicted median survival for that patient was within 33% of the actual survival time (see e-Appendix 1).
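The per-patient accuracy criterion can be sketched as a simple check. The sketch below assumes that "within 33%" is measured relative to the actual survival time and ignores how censored patients are handled (see e-Appendix 1); the function names and the example values are hypothetical.

```python
def is_accurate(predicted_median_days, actual_days, tolerance=0.33):
    """True if the predicted median survival lies within +/-33% of the
    observed survival time (tolerance measured relative to the actual time)."""
    return abs(predicted_median_days - actual_days) <= tolerance * actual_days


def accuracy_rate(pairs):
    """Fraction of (predicted, actual) pairs that meet the 33% criterion."""
    hits = sum(is_accurate(predicted, actual) for predicted, actual in pairs)
    return hits / len(pairs)


# Invented (predicted median, actual survival) pairs in days
example_pairs = [(100, 90), (44, 130), (130, 120), (319, 200)]
print(accuracy_rate(example_pairs))
```

A model that can output only a few fixed medians will miss this window for most patients whose actual survival falls between those values.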
Calculating Predicted Survival Probabilities
When Cox models are used at different institutions to estimate survival probability, the baseline survivor function for the model must be provided. Otherwise, only qualitative descriptions (eg, low or high risk) can be derived, but not quantitative estimates of survival probabilities (eg, 45% at 90 days). Previously published Cox models do not provide this information, which limits their applicability and makes external validation difficult.10,14 To facilitate calculating predicted survival probabilities for individual patients with lung, breast, and hematologic malignancies, we provided the baseline survivor function for a reference patient for each model along with an explanation of how to use it (e-Appendix 1, e-Tables 1-8) and generated an online calculator (available at: https://biostatistics.mdanderson.org/shinyapps/BLESS). A web-based interface was used for data collection (Research Electronic Data Capture). All variables were defined before data abstraction. All statistical analysis was carried out using Stata version 14 software (StataCorp LLC)37 or R version 3.6.1 software (The R Foundation for Statistical Computing).38 This study was reported using the Transparent Reporting of a Multivariable Prediction Model for Individual Prognosis or Diagnosis guidelines.39
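The role of the baseline survivor function can be made concrete with the standard Cox relationship S(t | x) = S0(t)^exp(Σ ln(HR_k) × (x_k − x_ref,k)). The sketch below is illustrative only: the baseline value, hazard ratios, and variable names are hypothetical placeholders, not the values in e-Tables 1-8 or the online calculator.

```python
import math


def predicted_survival(baseline_survival, hazard_ratios, patient, reference):
    """Survival probability at a fixed time point under a Cox model.

    baseline_survival: S0(t) for the reference patient at that time point
    hazard_ratios:     HR per unit increase for each predictor
    patient/reference: predictor values for the patient and reference patient
    """
    linear_predictor = sum(
        math.log(hr) * (patient[name] - reference[name])
        for name, hr in hazard_ratios.items()
    )
    return baseline_survival ** math.exp(linear_predictor)


# Hypothetical numbers for illustration only
s0_90_days = 0.60  # assumed baseline survival at 90 days
hrs = {"bilateral_effusion": 2.0, "pleural_protein_g_dl": 0.6}
reference = {"bilateral_effusion": 0, "pleural_protein_g_dl": 3.0}
patient = {"bilateral_effusion": 1, "pleural_protein_g_dl": 3.0}

print(round(predicted_survival(s0_90_days, hrs, patient, reference), 3))
```

Without S0(t), the same hazard ratios can only rank patients by risk; with it, the exponent converts a relative hazard into an absolute survival probability at the chosen time point.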
Results
The initial development cohort included 937 patients (Fig 1), 372 of whom (40%) were alive at 180 days, 523 of whom (56%) died before that, and 42 of whom (4%) were lost to follow-up (e-Appendix 1, e-Table 9). On univariate analysis, the variables associated with time to death (P < .2) were: ECOG score, cancer type, congestive heart failure, ascites, high suspicion of pneumonia, prior radiation, bilateral effusion, size of effusion, lung nodularity, NLR, and pleural fluid cytologic, LDH, protein, cholesterol, and triglycerides results (e-Table 9). Analysis using Schoenfeld residuals suggested that the variables bilateral effusion and ECOG score might violate the proportional hazards assumption. However, tests using log-log plots demonstrated that the curves never intersected, suggesting that the proportional hazard assumption was reasonable.
Figure 1.
Flow chart showing patient selection for the development cohort. CXR = chest radiography.
Effect-Measure Modification
The Mantel-Cox test indicated that interactions were present between cancer type and NLR (P < .0001), pleural fluid LDH (P = .029), and bilateral effusion (P = .002). To address these interactions, we restricted the cohort to patients with the three most common cancer types (lung, breast, and hematologic malignancies), yielding a final model development cohort of 562 patients (Table 1, e-Fig 2).
Table 1.
Final Cohort Characteristics and Univariate Analysis
Covariate | Entire Group (N = 562) | Alive (n = 224) | Dead (n = 296) | Lost to Follow-up (n = 42) | Univariate Hazard Ratio | Univariate 95% CI | Univariate P Value |
---|---|---|---|---|---|---|---|
Sex | | | | | 1.03 | 0.82-1.30 | .802 |
Female | 325 (57.83) | 128 (57.14) | 172 (58.11) | 25 (59.52) | |||
Male | 237 (42.17) | 96 (42.86) | 124 (41.89) | 17 (40.48) | |||
Age, y | | | | | 1.00 | 0.99-1.01 | .461 |
Median (IQR) | 62 (53-70) | 61 (51.5-69) | 63 (54-70) | 63.5 (52-72) | |||
Mean (SD) | 60.7 (13.73) | 59.8 (13.6) | 61.1 (13.7) | 62.3 (14.6) | |||
ECOG performance status | | | | | | | < .001 |
0 | 31 (5.92) | 20 (9.62) | 5 (1.81) | 6 (15.38) | Reference | ... | ... |
1 | 113 (21.56) | 65 (31.25) | 41 (14.8) | 7 (17.95) | 2.52 | 0.99-6.38 | .051 |
2 | 183 (34.92) | 55 (26.44) | 113 (40.79) | 15 (38.46) | 5.52 | 2.25-13.53 | < .001 |
3 | 152 (29.01) | 57 (27.4) | 86 (31.05) | 9 (23.08) | 5.86 | 2.38-14.44 | < .001 |
4 | 45 (8.59) | 11 (5.29) | 32 (11.55) | 2 (5.13) | 8.83 | 3.44-22.69 | < .001 |
Cancer type | | | | | | | .674 |
Lung | 182 (32.38) | 71 (31.7) | 96 (32.43) | 15 (35.71) | Reference | ... | ... |
Breast | 138 (24.56) | 59 (26.34) | 71 (23.99) | 8 (19.05) | 0.97 | 0.71-1.31 | .821 |
Hematologic | 242 (43.06) | 94 (41.96) | 129 (43.58) | 19 (45.24) | 0.94 | 0.73-1.23 | .672 |
CHF present | |||||||
No | 511 (90.93) | 208 (92.86) | 266 (89.86) | 37 (88.10) | Reference | ... | ... |
Yes | 51 (9.07) | 16 (7.14) | 30 (10.14) | 5 (11.90) | 1.33 | 0.91-1.94 | .142 |
Ascites present | |||||||
No | 521 (92.7) | 217 (96.88) | 266 (89.86) | 38 (90.48) | Reference | ... | ... |
Yes | 41 (7.3) | 7 (3.13) | 30 (10.14) | 4 (9.52) | 1.91 | 1.31-2.79 | .001 |
High clinical suspicion of pneumonia | |||||||
No | 348 (61.92) | 154 (68.75) | 168 (56.76) | 26 (61.9) | Reference | ... | ... |
Yes | 214 (38.08) | 70 (31.25) | 128 (43.24) | 16 (38.1) | 1.36 | 1.08-1.71 | .009 |
Chemotherapy within 30 d before tap | |||||||
No | 260 (46.26) | 110 (49.11) | 129 (43.58) | 21 (50) | Reference | ... | ... |
Yes | 302 (53.74) | 114 (50.89) | 167 (56.42) | 21 (50) | 1.08 | 0.86-1.36 | .504 |
Radiation within 30 d before tap | |||||||
No | 506 (90.04) | 208 (92.86) | 261 (88.18) | 37 (88.10) | Reference | ... | ... |
Yes | 56 (9.96) | 16 (7.14) | 35 (11.82) | 5 (11.90) | 1.45 | 1.02-2.06 | .040 |
Surgery within 30 d before first thoracentesis | |||||||
No | 553 (98.4) | 223 (99.55) | 288 (97.30) | 42 | Reference | ... | ... |
Yes | 9 (1.6) | 1 (0.45) | 8 (2.70) | 0 | 2.40 | 1.19-4.86 | .014 |
Side of pleural effusion of interest | |||||||
Right | 308 (54.8) | 122 (54.46) | 166 (56.08) | 20 (47.62) | Reference | ... | ... |
Left | 254 (45.2) | 102 (45.54) | 130 (43.92) | 22 (52.38) | 0.95 | 0.76-1.20 | .679 |
Bilateral effusion present | |||||||
No | 259 (46.09) | 108 (48.21) | 133 (44.93) | 18 (42.86) | Reference | ... | ... |
Yes | 303 (53.91) | 116 (51.79) | 163 (55.07) | 24 (57.14) | 1.12 | 0.89-1.40 | .350 |
Size of effusion on most recent chest radiographs within 2 wk before first thoracentesis | | | | | | | .302 |
Up to apex diaphragm | 63 (12.23) | 28 (13.21) | 26 (9.77) | 9 (24.32) | Reference | ... | ... |
Up to vascular pedicle | 228 (44.27) | 87 (41.04) | 120 (45.11) | 21 (56.76) | 1.14 | 0.75-1.72 | .542 |
Up to top of cardiac silhouette | 155 (30.1) | 68 (32.08) | 84 (31.58) | 3 (8.11) | 1.23 | 0.80-1.88 | .350 |
Up to top of aortic arch | 36 (6.99) | 14 (6.6) | 19 (7.14) | 3 (8.11) | 1.26 | 0.70-2.26 | .434 |
Higher than aortic arch | 33 (6.41) | 15 (7.08) | 17 (6.39) | 1 (2.70) | 1.24 | 0.68-2.27 | .477 |
Lung or pleural nodularity thickening or mass > 1 cm during study within 30 d before first tap? | |||||||
No | 260 (46.26) | 111 (49.55) | 132 (44.59) | 17 (40.48) | Reference | ... | ... |
Yes | 302 (53.74) | 113 (50.45) | 164 (55.41) | 25 (59.52) | 1.14 | 0.91-1.43 | .267 |
Malignant pleural fluid cytologic examination | |||||||
No | 251 (44.74) | 110 (49.33) | 114 (38.51) | 27 (64.29) | Reference | ... | ... |
Yes | 310 (55.26) | 113 (50.67) | 182 (61.49) | 15 (35.71) | 1.53 | 1.21-1.94 | < .001 |
Neutrophil to lymphocyte ratio, K/μL | 6.11 (2.94-12.54) | 5.19 (2.80-9.74) | 7.52 (3.51-15.04) | 4.77 (2.74-9.08) | 1.01 | 1.00-1.01 | .009 |
Amount of pleural fluid drained, mL | 1,000 (700-1,300) | 1,000 (750-1,300) | 1,000 (725-1,300) | 775 (550-1,250) | 1.00 | 1.00-1.00 | .454 |
Pleural fluid LDH/100, U/L | 5.89 (3.27-11.92) | 4.65 (3.05-9.36) | 7.51 (3.95-14.83) | 3.91 (2.74-9.02) | 1.00 | 1.00-1.00 | .136 |
Pleural fluid protein, g/dL | 3.4 (2.6-4.1) | 3.8 (2.9-4.4) | 3.2 (2.4-3.8) | 3.4 (2.7-4) | 0.69 | 0.61-0.78 | < .001 |
Pleural fluid cholesterol, mg/dL | 59 (50-78.5) | 65 (50-84) | 53 (50-73) | 52.5 (50-76) | 0.99 | 0.98-0.99 | < .001 |
Pleural fluid triglycerides, mg/dL | 26 (26-39) | 26 (26-39) | 27.5 (26-38) | 26 (26-31) | 1.00 | 1.00-1.00 | .042 |
Data are presented as No. (%) or median (interquartile range), unless otherwise indicated. CHF = congestive heart failure; ECOG = Eastern Cooperative Oncology Group; HR = hazard ratio; IQR = interquartile range; LDH/100 = pleural fluid lactate dehydrogenase/100.
General Cancer Multivariate Model
Because EMM was found, univariate analysis was repeated for the final cohort (Table 1). Backward selection with candidate variables and interaction terms yielded the final GCMM (Table 2). In the final model, cancer type interacted with four other variables: NLR, pleural fluid LDH, pleural fluid protein, and bilateral effusion.
Table 2.
General Cancer Cox Model Using Interaction Terms
Interaction Term | HR | 95% CI | P Value |
---|---|---|---|
ECOG performance status | |||
0 | Reference | ... | ... |
1 | 2.35 | 0.92-6.01 | .075 |
2 | 4.42 | 1.78-10.95 | .001 |
3 | 4.27 | 1.70-10.71 | .002 |
4 | 5.78 | 2.19-15.23 | < .001 |
Malignant pleural fluid cytologic findings | 1.91 | 1.44-2.55 | < .001 |
Interaction between cancer type and NLR | |||
Lung cancer × NLR | 1.01 | 0.99-1.03 | .284 |
Hematologic malignancy × NLR | 1.00 | 0.99-1.01 | .616 |
Breast cancer × NLR | 1.05 | 1.02-1.08 | .001 |
Interaction between cancer type and bilateral effusion | |||
Lung cancer and unilateral effusion | 1.00 (Reference) | ... | ... |
Lung cancer and bilateral effusion | 2.28 | 1.48-3.53 | < .001 |
Hematologic malignancies and unilateral effusion | 1.55 | 0.41-5.91 | .522 |
Hematologic malignancies and bilateral effusion | 0.87 | 0.25-2.98 | .819 |
Breast cancer and unilateral effusion | 1.04 | 0.19-5.64 | .964 |
Breast cancer and bilateral effusion | 0.66 | 0.14-3.15 | .603 |
Interaction between cancer type and pleural fluid LDH/100 | |||
Lung cancer × LDH/100 | 1.02 | 1.00-1.03 | .016 |
Hematologic malignancies × LDH/100 | 1.00 | 0.99-1.00 | .826 |
Breast cancer × LDH/100 | 1.01 | 1.00-1.02 | .078 |
Interaction between cancer type and pleural fluid protein | |||
Lung cancer × pleural fluid protein | 0.61 | 0.47-0.79 | < .001 |
Hematologic malignancies × pleural fluid protein | 0.66 | 0.51-0.84 | .001 |
Breast cancer × pleural fluid protein | 0.64 | 0.46-0.88 | .007 |
ECOG = Eastern Cooperative Oncology Group; HR = hazard ratio; LDH/100 = pleural fluid lactate dehydrogenase/100; NLR = blood neutrophil to lymphocyte ratio.
Disease-Specific Models
Lung, breast, and hematologic malignancies were the only cancer types with sample size large enough for individual DSM development, and a different one was built for each of them (Table 3, Table 4, Table 5). The most significant variables in the DSMs were part of the GCMM. All of them included ECOG score, malignant pleural fluid cytologic findings, and pleural fluid protein. NLR was not significant in the multivariate analysis in the lung cancer DSM, despite being significant in univariate analysis (P < .001). Significant interaction between cancer type and bilateral effusion was observed in the GCMM (e-Fig 3), but bilateral effusion showed no effect (and was discarded) in the breast cancer DSM (P = .70) and showed opposite effects in the lung cancer and hematologic malignancy DSMs. In the former, it was associated with an increased risk of death (HR, 2.27; P < .001), and in the latter, it was protective (HR, 0.62; P = .02).
Table 3.
Lung Cancer Model Univariate and Multivariate Analysis
Parameter | Univariate HR | Univariate 95% CI, Lower | Univariate 95% CI, Upper | Univariate P Value | Multivariate HR (a) | Multivariate 95% CI | Multivariate P Value |
---|---|---|---|---|---|---|---|
Age | 0.99 | 0.98 | 1.01 | .643 | ... | ... | ... |
Male sex | 1.05 | 0.71 | 1.57 | .797 | ... | ... | ... |
ECOG | |||||||
0 | 1 | ... | ... | ... | ... | ... | ... |
1 | 1.48 | 0.42 | 5.18 | .543 | 1.67 | 0.45-6.14 | .44 |
2 | 4.27 | 1.31 | 13.91 | .016 | 3.76 | 1.11-12.66 | .03 |
3 | 5.27 | 1.62 | 17.23 | .006 | 3.44 | 0.96-12.24 | .06 |
4 | 6.89 | 1.86 | 25.56 | .004 | 4.45 | 1.09-18.11 | .04 |
CHF present | 1.44 | 0.67 | 3.12 | .349 | ... | ... | ... |
Ascites present | 2.79 | 0.88 | 8.83 | .082 | ... | ... | ... |
High clinical suspicion of pneumonia | 1.10 | 0.68 | 1.79 | .689 | ... | ... | ... |
Chemotherapy within 30 d prior | 0.99 | 0.66 | 1.49 | .983 | ... | ... | ... |
Radiation within 30 d before tap | 1.09 | 0.63 | 1.89 | .760 | ... | ... | ... |
Surgery within 30 d before tap | 2.48 | 1.01 | 6.11 | .049 | 2.24 | 0.77-6.53 | .14a |
Right-sided effusion present | 0.99 | 0.66 | 1.49 | .974 | ... | ... | ... |
Bilateral effusion present | 2.27 | 1.52 | 3.40 | < .001 | 2.25 | 1.44-3.53 | < .001 |
Size of the effusion on chest radiograph | |||||||
Up to apex diaphragm | 1 | ... | ... | ... | ... | ... | ... |
Up to vascular pedicle | 1.28 | 0.59 | 2.76 | .53 | ... | ... | ... |
Up to top of cardiac silhouette | 1.53 | 0.69 | 3.38 | .29 | ... | ... | ... |
Up to top of aortic arch | 1.16 | 0.42 | 3.20 | .78 | ... | ... | ... |
Higher than aortic arch | 1.48 | 0.57 | 3.82 | .42 | ... | ... | ... |
Lung or pleural nodularity thickening or mass on study within 30 d before first tap | 1.28 | 0.72 | 2.30 | .403 | ... | ... | ... |
Malignant pleural fluid on cytologic analysis | 1.61 | 1.04 | 2.51 | .034 | 1.86 | 1.14-3.05 | .01 |
Tumor type | |||||||
Adenocarcinoma | 1 | ... | ... | ... | ... | ... | ... |
Squamous cell carcinoma | 0.79 | 0.41 | 1.55 | .493 | ... | ... | ... |
Small cell carcinoma | 1.09 | 0.47 | 2.52 | .845 | ... | ... | ... |
Unspecified non-small cell | 1.44 | 0.86 | 2.41 | .166 | ... | ... | ... |
Neutrophil to lymphocyte ratio | 1.03 | 1.02 | 1.05 | < .001 | ... | ... | ... |
Amount of fluid drained | 1.00 | 1.00 | 1.00 | .442 | ... | ... | ... |
Pleural fluid LDH/100 | 1.01 | 1.00 | 1.02 | .031 | 1.02 | 1.004-1.03 | .01 |
Pleural fluid protein | 0.51 | 0.40 | 0.65 | < .001 | 0.57 | 0.43-0.76 | < .001 |
Pleural fluid cholesterol | 0.99 | 0.98 | 1.00 | .039 | ... | ... | ... |
Pleural fluid triglycerides, mg/dL | |||||||
< 26 | 1 | ... | ... | ... | ... | ... | ... |
27-105 | 1.08 | 0.72 | 1.64 | .71 | ... | ... | ... |
> 105 | 1.11 | 0.27 | 4.55 | .89 | ... | ... | ... |
CHF = congestive heart failure; ECOG = Eastern Cooperative Oncology Group; HR = hazard ratio; LDH/100 = pleural fluid lactate dehydrogenase/100.
(a) Backward selection (n = 167) retained surgery within 30 d (HR, 4.67; 95% CI, 1.21-18.06; P = .03), so surgery qualified for the model. After the risk factors were identified, the model was refit using only those factors. Because variables such as neutrophil to lymphocyte ratio were no longer used, more patients had complete data and the sample size increased (n = 172), which changed the HR slightly.
Table 4.
Breast Cancer Model Univariate and Multivariate Analysis
Parameter | Univariate HR | Univariate 95% CI | Univariate P Value | Multivariate HR | Multivariate 95% CI | Multivariate P Value |
---|---|---|---|---|---|---|
Age | 1.00 | 0.98-1.02 | .626 | ... | ... | ... |
ECOG score | ... | ... | .010a | ... | ... | .04 |
0-1 | 1 | ... | ... | ... | ... | ... |
2-4 | 2.14 | 1.20-3.82 | ... | 1.91 | 1.04-3.52 | ... |
CHF present | 0.65 | 0.16-2.65 | .546 | ... | ... | ... |
Ascites present | 3.04 | 1.22-7.61 | .017a | ... | ... | ... |
High clinical suspicion of pneumonia | 1.30 | 0.78-2.19 | .317 | ... | ... | ... |
Chemotherapy within 30 d prior | 0.78 | 0.49-1.24 | .297 | ... | ... | ... |
Radiation within 30 d before tap | 1.68 | 0.88-3.20 | .113a | ... | ... | ... |
Surgery within 30 d before tap | 1.67 | 0.41-6.81 | .477 | ... | ... | ... |
Right-sided effusion present | 1.04 | 0.65-1.69 | .862 | ... | ... | ... |
Bilateral effusion present | 0.91 | 0.57-1.45 | .697 | ... | ... | ... |
Size of the effusion on chest radiograph | ||||||
Up to apex diaphragm | 1 | ... | ... | ... | ... | ... |
Up to vascular pedicle | 0.96 | 0.29-3.17 | .948 | ... | ... | ... |
Up to top of cardiac silhouette | 1.45 | 0.44-4.76 | .544 | ... | ... | ... |
Up to top of aortic arch | 1.17 | 0.26-5.23 | .837 | ... | ... | ... |
Higher than aortic arch | 0.26 | 0.03-2.55 | .246 | ... | ... | ... |
Lung or pleural nodularity thickening or mass of 1 cm during study within 30 d before first tap? | 1.17 | 0.74-1.87 | .504 | ... | ... | ... |
Malignant pleural fluid cytologic findings | 2.56 | 1.11-5.92 | .028a | 2.68 | 1.07-6.74 | .04 |
Neutrophil to lymphocyte ratio | 1.07 | 1.03-1.10 | < .001a | 1.06 | 1.02-1.09 | .001 |
Amount of fluid drained | 1.00 | 1.00-1.00 | .456 | ... | ... | ... |
Pleural fluid LDH/100 | 1.01 | 1.00-1.02 | .060a | 1.01 | 1.002-1.020 | .02 |
Pleural fluid protein | 0.70 | 0.54-0.92 | .009a | 0.68 | 0.51-0.93 | .01 |
Pleural fluid cholesterol | 0.99 | 0.97-1.00 | .033a | ... | ... | ... |
Pleural fluid triglycerides | 1.00 | 0.99-1.00 | .392 | ... | ... | ... |
CHF = congestive heart failure; ECOG = Eastern Cooperative Oncology Group; HR = hazard ratio; LDH/100 = pleural fluid lactate dehydrogenase/100.
Because of variable distribution within this cohort, ECOG score was split into only two categories.
Table 5.
Hematologic Malignancies Model Univariate and Multivariate Analysis
Parameter | Univariate HR | Univariate 95% CI | Univariate P Value | Multivariate HR | Multivariate 95% CI | Multivariate P Value |
---|---|---|---|---|---|---|
Age | 1.01 | 1.00-1.02 | .115a | ... | ... | ... |
Male sex | 1.04 | 0.73-1.47 | .843 | ... | ... | ... |
ECOG score | ... | ... | < .001a | ... | ... | .003 |
0-1 | 1 | ... | ... | ... | ... | ... |
2-4 | 2.71 | 1.58-4.67 | ... | 2.31 | 1.33-4.03 | ... |
CHF present | 1.47 | 0.92-2.35 | .107a | ... | ... | ... |
Ascites present | 1.78 | 1.12-2.81 | .014a | ... | ... | ... |
High clinical suspicion of pneumonia | 1.77 | 1.23-2.56 | .002a | 1.65 | 1.10-2.48 | .02 |
Chemotherapy within 30 d prior | 1.46 | 1.00-2.14 | .053a | ... | ... | ... |
Radiation within 30 d before tap | 2.11 | 1.07-4.15 | .032a | ... | ... | ... |
Surgery within 30 d before tap | 7.00 | 0.96-51.25 | .055a | ... | ... | ... |
Right-sided effusion present | 1.10 | 0.78-1.55 | .601 | ... | ... | ... |
Bilateral effusion present | 0.78 | 0.54-1.11 | .166a | 0.59 | 0.39-0.88 | .01 |
Size of the effusion on chest radiograph | ||||||
Up to apex diaphragm | 1 | ... | ... | ... | ... | ... |
Up to vascular pedicle | 1.14 | 0.66-1.95 | .64 | ... | ... | ... |
Up to top of cardiac silhouette | 0.91 | 0.50-1.65 | .76 | ... | ... | ... |
Up to top of aortic arch | 1.49 | 0.64-3.45 | .35 | ... | ... | ... |
Higher than aortic arch | 2.05 | 0.85-4.95 | .11a | ... | ... | ... |
Lung or pleural nodularity thickening or mass of 1 cm during study within 30 d before first tap | 1.08 | 0.75-1.56 | .687 | ... | ... | ... |
Malignant pleural fluid cytologic findings | 1.56 | 1.09-2.23 | .015a | 2.07 | 1.40-3.05 | < .001 |
Neutrophil to lymphocyte ratio | 1.00 | 1.00-1.01 | .524 | ... | ... | ... |
Amount of fluid drained | 1.00 | 1.00-1.00 | .981 | ... | ... | ... |
Pleural fluid LDH/100 | 1.00 | 1.00-1.01 | .709 | ... | ... | ... |
Pleural fluid protein | 0.69 | 0.56-0.85 | < .001a | 0.67 | 0.52-0.85 | .001 |
Pleural fluid cholesterol | 0.98 | 0.97-0.99 | .002a | ... | ... | ... |
Pleural fluid triglycerides | 1.00 | 1.00-1.00 | .076a | ... | ... | ... |
CHF = congestive heart failure; ECOG = Eastern Cooperative Oncology Group; HR = hazard ratio; LDH/100 = pleural fluid lactate dehydrogenase/100.
Because of variable distribution within this cohort, ECOG score was split into only two categories.
Assessing Model Performance
The validation cohort included 727 patients (228 patients with lung cancer, 214 patients with breast cancer, and 285 patients with hematologic malignancies) (e-Table 10). Predicted values were obtained using the GCMM, the appropriate DSM for each patient, LENT-D (discrete risk classifiers: low, moderate, and high risk), and LENT-C (continuous) models. HRs used to generate the LENT-C predictions are available in e-Table 11.
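As an illustration of how such predicted values arise from a Cox model, the following minimal Python sketch computes an individual survival probability as S(t | x) = S0(t) raised to the power exp(sum of log(HR) × x). It uses the multivariate HRs for the binary predictors of the hematologic malignancies DSM (Table 5); pleural fluid protein is omitted for simplicity. The baseline 90-day survival value, the 0/1 covariate coding, and all function and variable names are assumptions made for illustration and are not taken from the study.

```python
import math

# Multivariate HRs for binary predictors, copied from Table 5
# (hematologic malignancies DSM). Pleural fluid protein is left out here.
HAZARD_RATIOS = {
    "ecog_2_to_4": 2.31,
    "pneumonia_suspected": 1.65,
    "bilateral_effusion": 0.59,
    "malignant_cytology": 2.07,
}

# HYPOTHETICAL baseline 90-day survival (all predictors absent).
# The study's baseline survival function is not reported in this section.
BASELINE_SURVIVAL_90D = 0.80


def linear_predictor(covariates):
    """Sum of log(HR) * covariate value over the predictors that are present."""
    return sum(math.log(HAZARD_RATIOS[name]) * value
               for name, value in covariates.items())


def predicted_survival(covariates, baseline_survival=BASELINE_SURVIVAL_90D):
    """Cox prediction: S(t | x) = S0(t) ** exp(linear predictor)."""
    return baseline_survival ** math.exp(linear_predictor(covariates))


# Two hypothetical patients (1 = predictor present).
patient_a = {"ecog_2_to_4": 1, "malignant_cytology": 1}
patient_b = {"bilateral_effusion": 1}

for label, patient in (("A", patient_a), ("B", patient_b)):
    print(label, round(predicted_survival(patient), 3))
```

Because the linear predictor is continuous, each combination of predictor values yields its own survival probability rather than one of a few risk categories.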
By CI statistic, LENT-C, GCMM, and DSMs outperformed LENT-D in discrimination (Table 6), particularly the GCMM and breast cancer DSM. The differences in time-dependent area under the receiver operating characteristic curve between these two models and LENT-D also were significant at all time points, whereas the differences between LENT-C and the GCMM or DSMs were not (e-Figs 4-9, e-Table 12).
Table 6.
Summary of Discrimination Statistics for All Models in the Validation Cohort
Model or Comparison | Harrell’s Ca (CI) | Harrell’s Ca (CE) | Somers’ D | Time-Dependent AUC (Including Ties), 1 mo | Time-Dependent AUC (Including Ties), 3 mo | Time-Dependent AUC (Including Ties), 6 mo |
---|---|---|---|---|---|---|
General cancer cohort | ... | ... | ... | ... | ... | ... |
GCMM | 0.67 | 0.67 | 0.34 | 0.75 | 0.71 | 0.69 |
LENT-D | 0.60 | 0.71 | 0.21 | 0.66 | 0.64 | 0.64 |
P value, GCMM vs LENT-D | < .001 | ... | ... | < .001 | < .001 | .01 |
LENT-C | 0.65 | 0.65 | 0.29 | 0.72 | 0.68 | 0.69 |
P value, GCMM vs LENT-C | .158 | ... | ... | .209 | .135 | .776 |
Lung cancer cohort | ... | ... | ... | ... | ... | ... |
DSM-lung | 0.70 | 0.70 | 0.38 | 0.75 | 0.75 | 0.74 |
LENT-D | 0.64 | 0.78 | 0.33 | 0.72 | 0.72 | 0.72 |
P value, DSM-lung vs LENT-D | .243 | ... | ... | .345 | .392 | .528 |
LENT-C | 0.72 | 0.72 | 0.43 | 0.79 | 0.77 | 0.78 |
P value, DSM-lung vs LENT-C | .241 | ... | ... | .402 | .428 | .160 |
Breast cancer cohort | ... | ... | ... | ... | ... | ... |
DSM-breast | 0.72 | 0.72 | 0.41 | 0.83 | 0.74 | 0.72 |
LENT-D | 0.59 | 0.77 | 0.19 | 0.66 | 0.61 | 0.72 |
P value, DSM-breast vs LENT-D | < .001 | ... | ... | < .001 | < .001 | .005 |
LENT-C | 0.67 | 0.67 | 0.34 | 0.79 | 0.69 | 0.70 |
P value, DSM-breast vs LENT-C | .156 | ... | ... | .469 | .127 | .580 |
Hematologic malignancies cohort | ... | ... | ... | ... | ... | ... |
DSM-hematologic | 0.62 | 0.62 | 0.21 | 0.68 | 0.64 | 0.61 |
LENT-D | 0.58 | 0.74 | 0.17 | 0.61 | 0.62 | 0.60 |
P value, DSM-hematologic vs LENT-D | .231 | ... | ... | .104 | .565 | .727 |
LENT-C | 0.59 | 0.59 | 0.20 | 0.64 | 0.62 | 0.60 |
P value, DSM-hematologic vs LENT-C | .442 | ... | ... | .221 | .580 | .655 |
AUC = area under the receiver operating characteristic curve; CE = Harrell’s C-statistic excluding ties; CI = Harrell’s C-statistic including ties; DSM = disease-specific model; GCMM = general cancer multivariate model; LENT-C = continuous LENT model; LENT-D = discrete LENT model.
For continuous risk predictor models with a continuous outcome, no difference exists between CI and CE. When comparing models that make predictions on the same population, the CI statistic should be used.
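The distinction between CI and CE can be made concrete with a small pairwise computation. The sketch below is a simplified Python illustration of Harrell's C for right-censored data, not the Stata routines used in the study; the toy data, the handling of tied survival times (skipped), and all names are assumptions. Tied predictions count as one-half when ties are included (CI) and are dropped from the denominator when ties are excluded (CE).

```python
from itertools import combinations


def harrell_c(times, events, risk_scores, include_ties=True):
    """Harrell's C from all usable patient pairs.

    A pair is usable when the patient with the shorter time had an event.
    A pair is concordant when that patient also has the higher risk score.
    """
    concordant = 0.0
    usable = 0
    for i, j in combinations(range(len(times)), 2):
        if times[i] == times[j]:
            continue  # simplification: pairs with tied survival times are skipped
        first, second = (i, j) if times[i] < times[j] else (j, i)
        if not events[first]:
            continue  # earlier time is censored, so the ordering is unknown
        if risk_scores[first] == risk_scores[second]:
            if include_ties:
                concordant += 0.5
                usable += 1
            continue
        usable += 1
        if risk_scores[first] > risk_scores[second]:
            concordant += 1.0
    return concordant / usable


# HYPOTHETICAL toy cohort: survival in days, 1 = died, 0 = censored.
times = [30, 60, 90, 200, 400, 600]
events = [1, 1, 1, 1, 0, 1]
continuous_risk = [0.9, 0.7, 0.8, 0.4, 0.3, 0.1]  # unique value per patient
discrete_risk = [3, 3, 3, 2, 2, 1]                # high, moderate, low strata

for name, scores in (("continuous", continuous_risk), ("discrete", discrete_risk)):
    ci = harrell_c(times, events, scores, include_ties=True)
    ce = harrell_c(times, events, scores, include_ties=False)
    print(name, round(ci, 3), round(ce, 3))
```

In this toy cohort the continuous predictor has no tied predictions, so CI and CE coincide, whereas the three-level predictor produces many ties and CE exceeds CI. This is the pattern seen for LENT-D in Table 6.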
Kaplan-Meier plots for predicted vs observed median survival demonstrated good calibration-in-the-large for all models (e-Figs 10, 11). In strong calibration, as assessed by observed-minus-predicted probability smoothed residuals (Figs 2-5, e-Figs 12-17), GCMM and DSMs outperformed LENT-C. For instance, in the lung cancer DSM at 60 days (Fig 3A), the difference between observed and predicted survival is fairly small across the range of predicted probabilities, whereas LENT-C at the same time point (Fig 3B) shows observed survival significantly higher than predicted survival for patients at higher risk (> 50% predicted mortality). This indicates that LENT-C is poorly calibrated and overly pessimistic in this context.
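The idea behind an observed-minus-predicted residual can be sketched in a few lines of Python. The example below is a deliberately simplified stand-in for the smoothed-residual approach used in the study: it groups patients into equal-width bins of predicted mortality and assumes complete follow-up to the time point of interest (no censoring). The data, the binning scheme, and the function name are hypothetical.

```python
def calibration_residuals(predicted_mortality, died_by_t, n_bins=4):
    """Observed minus mean predicted event probability within each risk bin.

    Returns one residual per bin (None for empty bins). A negative residual
    means fewer deaths were observed than predicted, that is, the model is
    pessimistic in that range of predicted risk.
    """
    bins = [[] for _ in range(n_bins)]
    for probability, died in zip(predicted_mortality, died_by_t):
        index = min(int(probability * n_bins), n_bins - 1)
        bins[index].append((probability, died))

    residuals = []
    for members in bins:
        if not members:
            residuals.append(None)
            continue
        mean_predicted = sum(p for p, _ in members) / len(members)
        observed = sum(d for _, d in members) / len(members)
        residuals.append(observed - mean_predicted)
    return residuals


# HYPOTHETICAL predictions of mortality by day 60 and observed outcomes.
predicted = [0.1, 0.2, 0.6, 0.7, 0.8, 0.9]
died = [0, 0, 0, 1, 0, 1]

low_risk_residual, high_risk_residual = calibration_residuals(predicted, died, n_bins=2)
print(round(low_risk_residual, 2), round(high_risk_residual, 2))
```

In this toy example both residuals are negative and the larger deviation is in the high-risk bin, which corresponds to the pattern described for LENT-C, where observed survival exceeded predicted survival among patients with > 50% predicted mortality.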
Figure 2.
A–C, Line graphs showing the smoothed residuals of observed minus predicted event probability. Perfect calibration would have no difference between observed and predicted, as indicated by the horizontal axis with a difference of 0. A, The general cancer multivariate model is a continuous risk predictor, so it produces a unique prediction for each patient. B, Because the LENT-D model uses discrete predictors, it generates one of three predictions (low, moderate, or high) for a given patient. No true intermediate prediction exists between them. The line connecting the three points is just the weighted average of the two adjacent points, but a given patient can have only one prediction. Although the predictions are connected by a line, they are not continuous. C, LENT-C model, which uses continuous predictors, generates a unique prediction for each patient.
Figure 3.
Line graphs showing the observed minus predicted mortality probability smoothed residuals for the disease-specific model for lung cancer in the validation cohort at t = 30, 60, 90, 120, 150, and 180 days vs the LENT-C model in the same cohort. The red line indicates the line of unity representing perfect calibration; the blue line indicates the observed probability minus model predicted probability.
Figure 4.
Line graphs showing the observed minus predicted mortality probability smoothed residuals for the disease-specific model for breast cancer in the validation cohort at t = 30, 60, 90, 120, 150, and 180 days vs the LENT-C model in the same cohort. The red line indicates the line of unity representing perfect calibration; the blue line indicates the observed probability minus model predicted probability.
Figure 5.
Line graphs showing the observed minus predicted mortality probability smoothed residuals for the disease-specific model for hematologic malignancies in the validation cohort at t = 30, 60, 90, 120, 150, and 180 days vs the LENT-C model in the same cohort. The red line indicates the line of unity representing perfect calibration; the blue line indicates the observed probability minus model predicted probability.
Although LENT-D Kaplan-Meier analysis shows good separation (e-Fig 18), an examination of the box plots shows wide interquartile ranges within strata, indicating that LENT-D lacks precision because it is a discrete predictor (Fig 6). Converting LENT-D to a continuous predictor improves precision somewhat, but calibration of LENT-C is not as good as that of GCMM (Fig 2) or DSMs (Figs 3-5).
Figure 6.
Box plots showing predicted survival by LENT score in the development and validation cohorts up to 1,000 days. Only patients with lung, breast, or hematologic malignancies were included. Note the wide interquartile ranges for the low- and moderate-risk groups. ∗Graph is truncated and does not show the entire plot distribution. In the development cohort, the median is not reached in the low-risk group, and the 95th percentile and the upper bound of the 95% CI are truncated in the moderate-risk group. In the validation cohort, the 75th percentile is not reached in the low-risk group, and the 95th percentile is truncated in the moderate-risk group. ∗∗Parameter was not reached before end of follow-up. Median survival was not reached in the low-risk group for the development cohort, and the 75th percentile was not reached in the same group for the validation cohort.
In the validation cohort, the lung cancer DSM accurately predicted survival 44% of the time, whereas the breast cancer DSM accurately predicted survival 45% of the time. In the lung cancer validation cohort, LENT-D was accurate 47% of the time and LENT-C was accurate 38% of the time. For the breast cancer validation cohort, LENT-D was accurate 52% of the time and LENT-C was accurate 36% of the time. The high accuracy of the LENT-D models is explained by the median survival of the moderate- and low-risk groups exceeding the censoring limit (e-Appendix 1).
Discussion
Our main finding was that, to obtain models with accurate individualized predictions for clinical applications, DSMs using continuous risk predictors are necessary. This is because of EMM between predictors. If survival depends on tumor type,1 and tumor type impacts the relationship between other predictors and survival (ie, EMM), then models that do not account for EMM are expected to be inaccurate, and our findings support this principle. EMM may be addressed with interaction variables or by developing DSMs for each cancer. It is usually difficult to use interaction terms because the independent effect of some variables on outcomes is tied to the presence or absence of other factors. For interaction variables, a single HR cannot be interpreted in isolation because the overall impact is contingent on the HR for the other variables. Because tumor type is known at the time of MPE diagnosis, DSMs may be the best choice for MPE survival prediction modeling, so herein we introduce our BLESS models.
The impact of EMM on model performance and interpretation is illustrated by the relationship between tumor type and bilateral effusion. After stratifying by cancer type, bilateral effusion was associated with decreased mortality in hematologic malignancies, but increased mortality in lung cancer, whereas it had no effect in breast cancer. Thus, the relationship between bilateral effusion and mortality risk varied according to cancer type. This makes good clinical sense: bilateral effusion in lung cancer may indicate bilateral malignant disease, which carries a worse prognosis, whereas in hematologic malignancies, bilateral effusion may indicate a more treatable problem such as fluid overload.
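The way an interaction term changes the interpretation of a single HR can be shown with a short Python sketch. All coefficients below are hypothetical and were chosen only to mirror the direction of the effects described above (higher mortality with bilateral effusion in lung cancer, no effect in breast cancer, lower mortality in hematologic malignancies); they are not estimates from the study, and the choice of lung cancer as the reference category is an assumption.

```python
import math

# HYPOTHETICAL log-hazard coefficients for a combined Cox model with
# lung cancer as the reference cancer type.
COEFFICIENTS = {
    "bilateral": math.log(1.5),                # main effect (applies to lung)
    "bilateral_x_breast": math.log(1 / 1.5),   # interaction: breast vs lung
    "bilateral_x_hematologic": math.log(0.4),  # interaction: hematologic vs lung
}


def bilateral_hazard_ratio(cancer_type):
    """HR for bilateral effusion within one cancer type.

    The main effect and the matching interaction term are added on the
    log-hazard scale, which is the same as multiplying the two HRs.
    """
    log_hr = COEFFICIENTS["bilateral"]
    log_hr += COEFFICIENTS.get(f"bilateral_x_{cancer_type}", 0.0)
    return math.exp(log_hr)


for cancer in ("lung", "breast", "hematologic"):
    print(cancer, round(bilateral_hazard_ratio(cancer), 2))
```

In this sketch the interaction HR for hematologic malignancies (0.4) is not the effect of bilateral effusion in that group; the effect is the product of the main-effect and interaction HRs (1.5 × 0.4 = 0.6). A DSM avoids this step because each model reports the within-disease HR directly.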
EMM impacts identification of risk factors, too. For instance, searching for mortality risk factors in MPE using cohorts with different proportions of lung cancer and hematologic malignancy, without considering EMM,8,10,13,24,25 could lead to very different results, because bilateral effusions may have an increased, decreased, or no effect on mortality. Another example of the importance of EMM is the interaction between NLR and cancer type and how it impacts LENT score validation. In patients with solid tumors without MPEs, a meta-analysis of 66 studies demonstrated that high baseline NLR is associated with increased mortality risk.27,40 Hematologic malignancies were not covered in the meta-analysis because NLR varies widely in those cases. In patients with chronic myeloid leukemia, the median neutrophil count is approximately 100,000/mm³, yielding a very high NLR.41 In contrast, in chronic lymphocytic leukemia, blood lymphocytes are usually > 5,000/mm³ and NLR is low.42 In the LENT development cohort, most patients had mesothelioma and just a few had hematologic malignancies,10 so the HR for NLR was significant. As long as the validation cohort included a small fraction of hematologic malignancies, survival could be predicted accurately (ie, acceptable calibration-in-the-large), but LENT should underperform in cohorts with higher proportions of hematologic malignancies, and it did in ours.
The performance of LENT in our study was similar to that observed in PROMISE (CI of 0.60 in our study and C-statistic of 0.62 in PROMISE). No explanation of the inclusion or exclusion of ties was given in PROMISE. Also, PROMISE used the TIME-1 (first therapeutic intervention in malignant pleural effusion trial), TIME-2, and TIME-3 cohorts to develop the model, which include few hematologic malignancies (it was an exclusion criterion in TIME-1 and only 1 patient in TIME-2 had a hematologic malignancy).14,43,44 Thus, the PROMISE model is probably valid only for the solid tumors used in the development cohort, and it is unknown how it might perform in patients with hematologic malignancies. We could not test the PROMISE score because we do not measure CRP routinely.
Models that lump very different diseases together likely will be inaccurate. Indeed, of our three DSMs, the one for hematologic malignancies is the least accurate and has significant residual heterogeneity. Hematologic malignancies include aggressive diseases (eg, acute myeloid leukemia; median survival, 5-10 months45) and indolent diseases (eg, chronic lymphocytic leukemia; median survival, 2-13 years46). In addition, the distributions of variables like NLR and LDH are probably very different among hematologic malignancies. Development of prediction models should consider both EMM and grouping of sufficiently similar underlying diseases. For hematologic malignancy-associated MPE, more narrowly defined DSMs could be more appropriate, accurate, and practical. This approach has been tried in patients with lymphoma and pleural effusion,25,47 but because of a small sample size for each subtype of hematologic malignancy, we could not do so here. Conversely, in other cancers (eg, lung cancer), different histologic subtypes (eg, adenocarcinoma vs squamous cell carcinoma) may be sufficiently similar (ie, insignificant EMM) and could be grouped together, but specific tumor mutations (like those in the epidermal growth factor receptor) may need to be evaluated separately.11,48 This highlights how understanding and capturing the underlying biology is vital to model building.
The models developed in this study build on previous models.8, 9, 10, 11, 12, 13 Although LENT was validated and describes risk in discrete categories, terms like moderate risk often are too qualitative to be useful for individual patients. Our lung and breast cancer DSMs use a continuous risk predictor and provide a quantitative survival probability for individuals that is clearer and easier to apply.
The adequate calibration-in-the-large of LENT-D (e-Fig 11) makes it useful when predicting median survival for a population and for clinical research,49 but its individual predictions are not necessarily accurate. Within each risk stratum for LENT-D, significant variability in survival exists (Fig 6), as indicated by wide interquartile ranges (eg, 46-466 days for moderate risk in the validation cohort). This is a consequence of using a discrete risk classifier when survival has a high degree of dispersion. Even a perfect discrete risk classifier with only three levels will lack precision. For a model of MPE survival to be useful clinically, precise individual survival time predictions are required; recommendations for pleural interventions likely would not be the same for a patient with a 46-day predicted survival vs a patient with a 466-day survival.5,6,50 Discrete risk classifiers often will lump these patients into a single moderate risk group. The simplification into a discrete risk predictor system also weakens the strong calibration of LENT (Fig 2). Strong calibration refers to the ability of the model to predict individual survival times accurately, and it is considered to be more informative and useful clinically, although harder to achieve.49
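The loss of precision within a single stratum can be illustrated with a few lines of Python. The survival times below are hypothetical and were constructed only so that the quartiles reproduce the 46-466 day interquartile range quoted above; they are not patient data, and the use of the stratum median as the stratum-level prediction is an assumption made for illustration.

```python
import statistics

# HYPOTHETICAL survival times (days) for patients in one discrete risk
# stratum, chosen so that the quartiles match the quoted 46-466 day range.
moderate_risk_survival_days = [20, 46, 150, 466, 900]


def interquartile_range(values):
    """Return the first and third quartiles (inclusive method)."""
    q1, _median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return q1, q3


q1, q3 = interquartile_range(moderate_risk_survival_days)

# A discrete classifier assigns every patient in the stratum one prediction.
stratum_prediction = statistics.median(moderate_risk_survival_days)

print(f"Single stratum prediction: {stratum_prediction} days")
print(f"Interquartile range: {q1:.0f}-{q3:.0f} days ({q3 / q1:.1f}-fold spread)")
```

Every patient in the stratum receives the same prediction even though survival at the third quartile is roughly tenfold that at the first quartile, which is the imprecision that a continuous predictor is intended to reduce.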
Coefficients for the LENT score used in our study were obtained from our development cohort, so the performance of the model was optimized to this dataset and represents an overly optimistic best-case scenario (see e-Appendix 1). If our modified LENT score were applied externally, its performance might be worse.
The impact of discrete vs continuous risk predictors was evaluated by comparing the performance of LENT-D vs LENT-C. Indeed, CI does improve with LENT-C. Furthermore, discrimination across cohorts is similar between LENT-C and DSMs, suggesting that the only limitation of the LENT model could be the use of discrete risk predictors. However, because LENT-C does not consider EMM, its calibration suffers across all cancer types, and it shows worse calibration than DSMs at all time points.
The BLESS models potentially are useful in several clinical scenarios for patients with lung or breast cancer who have metastatic disease and are presenting with an initial MPE. Patient and family counseling for available treatment and management options, such as hospice care, is guided by predicted survival. However, physicians’ predictions of survival in cancer patients often are inaccurate.36 In studies in which predictive accuracy was defined as predicted survival being within ± 33% of the actual survival time, physician estimates were accurate 22.8% to 35% of the time.51, 52, 53 Two studies also reported discrimination, with C-statistics of 0.58 and 0.62.51,54 In contrast, using the same definition of accuracy, BLESS predictions were accurate 45% and 44% of the time and C-statistics were 0.72 and 0.70 in breast and lung cancer, respectively. Although BLESS predictions were more accurate than physician predictions from the literature, it is important to remember that physician accuracy estimates were for cancer patients in general, whereas BLESS deals with a very specific subset of cancer patients.
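The ± 33% definition of accuracy can be written out explicitly. The following Python sketch assumes that the tolerance is taken relative to the actual survival time and that all survival times are fully observed; the handling of censored patients (see e-Appendix 1) is not reproduced here. The example values and function names are hypothetical.

```python
def within_tolerance(predicted_days, actual_days, tolerance=0.33):
    """True when predicted survival is within +/- tolerance of actual survival."""
    return abs(predicted_days - actual_days) <= tolerance * actual_days


def accuracy(predictions, actuals, tolerance=0.33):
    """Proportion of patients whose prediction falls within the tolerance."""
    hits = sum(within_tolerance(p, a, tolerance)
               for p, a in zip(predictions, actuals))
    return hits / len(actuals)


# HYPOTHETICAL predicted and actual survival times (days) for four patients.
predicted_survival_days = [100, 60, 300, 45]
actual_survival_days = [90, 120, 280, 200]

print(accuracy(predicted_survival_days, actual_survival_days))
```

With these values, the first and third predictions fall within ± 33% of actual survival and the second and fourth do not, so the accuracy is 50%.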
Guidelines also recommend using predicted survival to inform the decision of how to manage recurrent MPE, whether it is with repeat thoracentesis vs indwelling pleural catheter (IPC) vs thoracoscopic pleurodesis.5,6,50 Shorter predicted survival favors a repeat thoracentesis, whereas longer predicted survival favors IPC or thoracoscopic pleurodesis. More recent decision analyses suggest that IPC plus talc pleurodesis may be the most cost-effective method, but this again depends on survival. When survival was < 40 days, IPC with symptomatic drainage was more cost-effective than IPC plus talc.50 Survival probability therefore is a key variable for making these clinical decisions, so better survival predictions should lead to more optimal decisions that in turn will help to maximize quality-adjusted survival while providing high-value care. Optimal decisions in this context require integration of individual survival predictions with other factors, such as patient preferences, social supports, and local resources.
The BLESS models also can be used for clinical research. Some trials, especially intervention trials like the Australasian Malignant Pleural Effusion (AMPLE) and TIME trials, use a predicted survival of more than a threshold value (eg, 90 days) to enroll patients.4,43,44,55 AMPLE and TIME used clinician estimates. However, clinician judgment is qualitative, and physicians may differ in what constitutes a threshold probability of being “likely” to survive 90 days. Is that 51%, or 99%, or something in between? In each case, accurate and consistent prediction is important because it impacts the power of the study, recruitment time, and study efficiency.
Although this study advances our ability to predict survival in individual patients with lung and breast cancer-related effusions, it has important limitations. First, the results have been validated temporally but not externally, so they need to be validated prospectively in multicenter studies. Future studies using prospective, observational, multicenter data for validation have been planned. Second, we included only patients with metastases in addition to the MPE, so our predictions should not be applied to patients whose only manifestation of metastatic disease is an MPE. This is not a limitation of the model per se; rather, it is important to apply the model to the population for which it was developed, namely, patients with known metastatic disease in addition to an MPE. Third, we included patients with an exudative effusion that could not be explained by another condition.3,4,22 This is similar to LENT, but other models have focused solely on patients with cytologic examination-proven malignancy, so comparisons should be made with caution. Including fluid cytologic analysis as a variable in the BLESS models allows them to be applied to this broader group of patients. Given that recurrent symptomatic exudative effusions with negative cytologic results often require interventional procedures such as indwelling pleural catheters,3,4,22 being able to predict survival in this population is clinically relevant. Fourth, the superior performance of the BLESS models is limited to lung and breast cancer. The DSM for hematologic malignancies lacks accuracy because residual EMM is present. Future studies probably should use more narrowly defined groups (eg, only acute myeloid leukemia).
Fifth, some reported risk factors (eg, CRP and molecular markers) were not collected routinely in the cohorts, and it is possible that they could improve model performance.14 Sixth, our model was generated in a specialized tertiary hospital with treatments (eg, targeted therapy) that improve patient outcomes. Our model may not be generalizable to centers without access to these therapies. However, referral bias also could contribute because more complex cases may be more likely to be referred, and this may not be captured in our model. The direction of any such bias is difficult to determine.
Interpretation
In summary, we demonstrated that EMM by cancer type impacts survival prediction in patients with MPE and that continuous risk classifiers outperform discrete risk classifiers such as those in the LENT score. To address these issues, we developed the BLESS models for predicting survival in individual patients with pleural effusions in the context of metastatic lung and breast cancer. These models were validated temporally and demonstrated adequate discrimination and calibration.
Take-home Points.
Study Question: Can a newly developed continuous risk-prediction survival model for patients with malignant pleural effusions and known metastatic disease provide precise survival estimates?
Results: The Mantel-Cox test demonstrated effect-measure modification (EMM) between cancer type and neutrophil to lymphocyte ratio (P < .0001), pleural fluid LDH (P = .029), and bilateral effusion (P = .002). Disease-specific models (DSMs) for lung, breast, and hematologic malignancies were generated and showed C-statistics of 0.72, 0.72, and 0.62, respectively; a discrete LENT model (C-statistic, 0.60) and a continuous LENT model (C-statistic, 0.65) underperformed.
Interpretation: Because EMM is present, continuous DSMs outperformed previous discrete risk-prediction models that failed to account for interactions and lacked enough precision to be useful for individual-level predictions.
Acknowledgments
Author contributions: D. E. O. was the principal investigator for this study and was responsible for project conception, oversight, organization, data collection and auditing, statistical analysis, and manuscript writing. S. M., G. M.-Z., and P. V. S. were involved in data collection, statistical analysis, and manuscript writing. H. B. G. was involved in data collection and manuscript revision. R. A. was involved in manuscript writing and editing. L. L. and C. H. L. were the primary biostatisticians for the project, constructed models and analysis, and contributed to writing.
Financial/nonfinancial disclosures: None declared.
Role of sponsors: The sponsor had no role in the design of the study, the collection and analysis of the data, or the preparation of the manuscript.
Additional information: The e-Appendix, e-Figures, and e-Tables can be found in the Supplemental Materials section of the online article.
Footnotes
FUNDING/SUPPORT: Statistical analysis was supported in part by the National Cancer Institute [Cancer Center Support Grant P30 CA016672].
References
- 1.Feller-Kopman D., Light R. Pleural disease. N Engl J Med. 2018;378(8):740–751. doi: 10.1056/NEJMra1403503. [DOI] [PubMed] [Google Scholar]
- 2.Heffner J.E. Diagnosis and management of malignant pleural effusions. Respirology. 2008;13(1):5–20. doi: 10.1111/j.1440-1843.2007.01154.x. [DOI] [PubMed] [Google Scholar]
- 3.Ost D.E., Jimenez C.A., Lei X. Quality-adjusted survival following treatment of malignant pleural effusions with indwelling pleural catheters. Chest. 2014;145(6):1347–1356. doi: 10.1378/chest.13-1908. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 4.Thomas R., Fysh E.T.H., Smith N.A. Effect of an indwelling pleural catheter vs talc pleurodesis on hospitalization days in patients with malignant pleural effusion: the AMPLE randomized clinical trial. JAMA. 2017;318(19):1903–1912. doi: 10.1001/jama.2017.17426. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 5.Bibby A.C., Dorn P., Psallidas I. ERS/EACTS statement on the management of malignant pleural effusions. Eur Respir J. 2018;52(1):1800349. doi: 10.1183/13993003.00349-2018. [DOI] [PubMed] [Google Scholar]
- 6.Feller-Kopman D.J., Reddy C.B., DeCamp M.M. Management of malignant pleural effusions. An official ATS/STS/STR Clinical Practice Guideline. Am J Respir Crit Care Med. 2018;198(7):839–849. doi: 10.1164/rccm.201807-1415ST. [DOI] [PubMed] [Google Scholar]
- 7.Bhatnagar R., Piotrowska H.E.G., Laskawiec-Szkonter M. Effect of thoracoscopic talc poudrage vs talc slurry via chest tube on pleurodesis failure rate among patients with malignant pleural effusions: a randomized clinical trial. JAMA. 2020;323(1):60–69. doi: 10.1001/jama.2019.19997. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 8.Abrao F.C., Peixoto R.D., de Abreu I.R. Prognostic factors in patients with malignant pleural effusion: is it possible to predict mortality in patients with good performance status? J Surg Oncol. 2016;113(5):570–574. doi: 10.1002/jso.24168. [DOI] [PubMed] [Google Scholar]
- 9.Anevlavis S., Kouliatsis G., Sotiriou I. Prognostic factors in patients presenting with pleural effusion revealing malignancy. Respiration. 2014;87(4):311–316. doi: 10.1159/000356764. [DOI] [PubMed] [Google Scholar]
- 10.Clive A.O., Kahan B.C., Hooper C.E. Predicting survival in malignant pleural effusion: development and validation of the LENT prognostic score. Thorax. 2014;69(12):1098–1104. doi: 10.1136/thoraxjnl-2014-205285. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 11.Kasapoglu U.S., Arinc S., Gungor S. Prognostic factors affecting survival in non-small cell lung carcinoma patients with malignant pleural effusions. Clin Respir J. 2016;10(6):791–799. doi: 10.1111/crj.12292. [DOI] [PubMed] [Google Scholar]
- 12.Lee Y.S., Nam H.S., Lim J.H. Prognostic impact of a new score using neutrophil-to-lymphocyte ratios in the serum and malignant pleural effusion in lung cancer patients. BMC Cancer. 2017;17(1):557. doi: 10.1186/s12885-017-3550-8. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 13.Zamboni M.M., da Silva C.T., Jr., Baretta R., Cunha E.T., Cardoso G.P. Important prognostic factors for survival in patients with malignant pleural effusion. BMC Pulm Med. 2015;15:29. doi: 10.1186/s12890-015-0025-z. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 14.Psallidas I., Kanellakis N.I., Gerry S. Development and validation of response markers to predict survival and pleurodesis success in patients with malignant pleural effusion (PROMISE): a multicohort analysis. Lancet Oncol. 2018;19(7):930–939. doi: 10.1016/S1470-2045(18)30294-8. [DOI] [PubMed] [Google Scholar]
- 15.Bibby A.C., Dorn P., Psallidas I. ERS/EACTS statement on the management of malignant pleural effusions. Eur J Cardiothorac Surg. 2019;55(1):116–132. doi: 10.1093/ejcts/ezy258. [DOI] [PubMed] [Google Scholar]
- 16.Koegelenberg C.F.N., Shaw J.A., Irusen E.M., Lee Y.C.G. Contemporary best practice in the management of malignant pleural effusion. Ther Adv Respir Dis. 2018;12 doi: 10.1177/1753466618785098. 1753466618785098. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 17.MDCalc LENT prognostic score for malignant pleural effusion. 2019. MDCalc website. https://www.mdcalc.com/lent-prognostic-score-malignant-pleural-effusion
- 18.Rozman A., Mok T.S.K. Is the LENT score already outdated? Respiration. 2018;96(4):303–304. doi: 10.1159/000491678. [DOI] [PubMed] [Google Scholar]
- 19.Faiz S.A., Pathania P., Song J. Indwelling pleural catheters for patients with hematologic malignancies. A 14-year, single-center experience. Ann Am Thorac Soc. 2017;14(6):976–985. doi: 10.1513/AnnalsATS.201610-785OC. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 20.Roberts M.E., Neville E., Berrisford R.G., Antunes G., Ali N.J. Management of a malignant pleural effusion: British Thoracic Society pleural disease guideline 2010. Thorax. 2010;65(suppl 2):ii32. doi: 10.1136/thx.2010.136994. [DOI] [PubMed] [Google Scholar]
- 21.Simoff M.J., Lally B., Slade M.G. Symptom management in patients with lung cancer: diagnosis and management of lung cancer, 3rd ed: American College of Chest Physicians evidence-based clinical practice guidelines. Chest. 2013;143(5 suppl):e455S–e497S. doi: 10.1378/chest.12-2366. [DOI] [PubMed] [Google Scholar]
- 22.Wahidi M.M., Reddy C., Yarmus L. Randomized trial of pleural fluid drainage frequency in patients with malignant pleural effusions. The ASAP trial. Am J Respir Crit Care Med. 2017;195(8):1050–1057. doi: 10.1164/rccm.201607-1404OC. [DOI] [PubMed] [Google Scholar]
- 23.Eberhardt W.E., Mitchell A., Crowley J. The IASLC Lung Cancer Staging Project: proposals for the revision of the M descriptors in the forthcoming eighth edition of the TNM Classification of Lung Cancer. J Thorac Oncol. 2015;10:1515–1522. doi: 10.1097/JTO.0000000000000673. [DOI] [PubMed] [Google Scholar]
- 24.Grosu H.B., Molina S., Casal R. Risk factors for pleural effusion recurrence in patients with malignancy. Respirology. 2019;24(1):76–82. doi: 10.1111/resp.13362. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 25.Özyurtkan M.O., Balcı A.E., Çakmak M. Predictors of mortality within three months in the patients with malignant pleural effusion. Eur J Intern Med. 2010;21(1):30–34. doi: 10.1016/j.ejim.2009.09.012. [DOI] [PubMed] [Google Scholar]
- 26.Rothman K.J. Oxford University Press; 2012. Epidemiology: An Introduction. [Google Scholar]
- 27.Templeton A.J., McNamara M.G., Seruga B. Prognostic role of neutrophil-to-lymphocyte ratio in solid tumors: a systematic review and meta-analysis. J Natl Cancer Inst. 2014;106(6):dju124. doi: 10.1093/jnci/dju124. [DOI] [PubMed] [Google Scholar]
- 28.Clark T.G., Bradburn M.J., Love S.B., Altman D.G. Survival analysis part IV: further concepts and methods in survival analysis. Br J Cancer. 2003;89(5):781–786. doi: 10.1038/sj.bjc.6601117. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 29.Barthel F.M.S., Royston P. Graphical representation of interactions. Stata Journal. 2006;6(3):348–363. [Google Scholar]
- 30.Guo C., Yo S., Jang W. Evaluating predictive accuracy of survival models with PROC PHREG. In: Proceedings of the SAS Global Forum 2017 Conference; April 2-5, 2017. https://support.sas.com/resources/papers/proceedings17/SAS0462-2017.pdf
- 31. Heller G., Mo Q. Estimating the concordance probability in a survival analysis with a discrete number of risk groups. Lifetime Data Anal. 2016;22(2):263–279. doi: 10.1007/s10985-015-9330-3.
- 32. Newson R. Confidence intervals for rank statistics: Somers’ D and extensions. Stata Journal. 2006;6(3):309–334.
- 33. Newson R. Comparing the predictive powers of survival models using Harrell’s C or Somers’ D. Stata Journal. 2010;10:339–358.
- 34. Royston P. Tools for checking calibration of a Cox model in external validation: approach based on individual event probabilities. Stata Journal. 2014;14(4):738–755.
- 35. Royston P., Altman D.G. External validation of a Cox prognostic model: principles and methods. BMC Med Res Methodol. 2013;13:33. doi: 10.1186/1471-2288-13-33.
- 36. Chu C., Anderson R., White N., Stone P. Prognosticating for adult patients with advanced incurable cancer: a needed oncologist skill. Curr Treat Options Oncol. 2020;21(1):5. doi: 10.1007/s11864-019-0698-2.
- 37. StataCorp. Stata statistical software: release 14 [computer program]. College Station, TX: StataCorp LP; 2015.
- 38. R Foundation for Statistical Computing. R: a language and environment for statistical computing [computer program]. Vienna, Austria: R Foundation for Statistical Computing; 2019.
- 39. Moons K.G., Altman D.G., Reitsma J.B. Transparent Reporting of a multivariable prediction model for Individual Prognosis or Diagnosis (TRIPOD): explanation and elaboration. Ann Intern Med. 2015;162(1):W1–W73. doi: 10.7326/M14-0698.
- 40. Mei Z., Shi L., Wang B. Prognostic role of pretreatment blood neutrophil-to-lymphocyte ratio in advanced cancer survivors: a systematic review and meta-analysis of 66 cohort studies. Cancer Treat Rev. 2017;58:1–13. doi: 10.1016/j.ctrv.2017.05.005.
- 41. Thompson P.A., Kantarjian H.M., Cortes J.E. Diagnosis and treatment of chronic myeloid leukemia in 2015. Mayo Clin Proc. 2015;90(10):1440–1454. doi: 10.1016/j.mayocp.2015.08.010.
- 42. Nabhan C., Rosen S.T. Chronic lymphocytic leukemia: a clinical review. JAMA. 2014;312(21):2265–2276. doi: 10.1001/jama.2014.14553.
- 43. Rahman N.M., Pepperell J., Rehal S. Effect of opioids vs NSAIDs and larger vs smaller chest tube size on pain control and pleurodesis efficacy among patients with malignant pleural effusion: the TIME1 randomized clinical trial. JAMA. 2015;314(24):2641–2653. doi: 10.1001/jama.2015.16840.
- 44. Davies H.E., Mishra E.K., Kahan B.C. Effect of an indwelling pleural catheter vs chest tube and talc pleurodesis for relieving dyspnea in patients with malignant pleural effusion: the TIME2 randomized controlled trial. JAMA. 2012;307(22):2383–2389. doi: 10.1001/jama.2012.5535.
- 45. Döhner H., Weisdorf D.J., Bloomfield C.D. Acute myeloid leukemia. N Engl J Med. 2015;373(12):1136–1152. doi: 10.1056/NEJMra1406184.
- 46. Kipps T.J., Stevenson F.K., Wu C.J. Chronic lymphocytic leukaemia. Nat Rev Dis Primers. 2017;3:16096. doi: 10.1038/nrdp.2016.96.
- 47. Bielsa S., Salud A., Martínez M. Prognostic significance of pleural fluid data in patients with malignant effusion. Eur J Intern Med. 2008;19(5):334–339. doi: 10.1016/j.ejim.2007.09.014.
- 48. Abisheganaden J., Verma A., Dagaonkar R.S., Light R.W. An observational study evaluating the performance of LENT score in the selected population of malignant pleural effusion from lung adenocarcinoma in Singapore. Respiration. 2018;96(4):308–313. doi: 10.1159/000489315.
- 49. Van Calster B., Nieboer D., Vergouwe Y., De Cock B., Pencina M.J., Steyerberg E.W. A calibration hierarchy for risk models was defined: from utopia to empirical data. J Clin Epidemiol. 2016;74:167–176. doi: 10.1016/j.jclinepi.2015.12.005.
- 50. Shafiq M., Simkovich S., Hossen S., Feller-Kopman D.J. Indwelling pleural catheter drainage strategy for malignant effusion: a cost-effectiveness analysis. Ann Am Thorac Soc. 2020;17(6):746–753. doi: 10.1513/AnnalsATS.201908-615OC.
- 51. Vasista A., Stockler M., Martin A. Accuracy and prognostic significance of oncologists’ estimates and scenarios for survival time in advanced gastric cancer. Oncologist. 2019;24(11):e1102–e1107. doi: 10.1634/theoncologist.2018-0613.
- 52. Urahama N., Sono J., Yoshinaga K. Comparison of the accuracy and characteristics of the prognostic prediction of survival of identical terminally ill cancer patients by oncologists and palliative care physicians. Jpn J Clin Oncol. 2018;48(7):695–698. doi: 10.1093/jjco/hyy080.
- 53. Amano K., Maeda I., Shimoyama S. The accuracy of physicians’ clinical predictions of survival in patients with advanced cancer. J Pain Symptom Manage. 2015;50(2):139–146.e131. doi: 10.1016/j.jpainsymman.2015.03.004.
- 54. Farinholt P., Park M., Guo Y., Bruera E., Hui D. A comparison of the accuracy of clinician prediction of survival versus the palliative prognostic index. J Pain Symptom Manage. 2018;55(3):792–797. doi: 10.1016/j.jpainsymman.2017.11.028.
- 55. Muruganandan S., Azzopardi M., Fitzgerald D.B. Aggressive versus symptom-guided drainage of malignant pleural effusion via indwelling pleural catheters (AMPLE-2): an open-label randomised trial. Lancet Respir Med. 2018;6(9):671–680. doi: 10.1016/S2213-2600(18)30288-1.
Associated Data