Abstract
Background
Circulating tumor DNA (ctDNA) in cerebrospinal fluid (CSF) has become a promising surrogate for genomic profiling of central nervous system tumors. However, suboptimal ctDNA detection rates from CSF limit its clinical utility. Precise selection of suitable patients is therefore needed to maximize the clinical benefit.
Patients and methods
Between February 2017 and December 2020, 66 newly diagnosed non-small-cell lung cancer (NSCLC) patients with brain parenchymal metastases were prospectively enrolled as a training cohort and 30 additional patients were enrolled as an external validation cohort. CSF samples and matched primary tumor tissues were collected before treatment and subjected to next-generation sequencing (NGS). The imageological characteristics of patients’ brain tumors were evaluated by radiologists using enhanced magnetic resonance imaging. The clinical and imageological characteristics were evaluated by complete subsets regression with the Akaike information criterion and the Bayesian information criterion to establish the prediction model. A nomogram was then built to predict CSF ctDNA detection.
Results
The somatic mutation detection rate of genes covered by our targeted NGS panel was significantly lower in CSF ctDNA (59.09%) than in tumor tissue (91.84%). Tsize (diameter of the largest intracranial lesion) and LVDmin (minimum lesion–ventricle distance for all intracranial lesions) were significantly associated with positive CSF ctDNA detection and were therefore selected to establish the prediction model, which achieved an area under the receiver operating characteristic (ROC) curve (AUC) of 0.819 and an accuracy of 0.800. The model’s predictive ability was further validated in the independent external cohort (AUC of 0.772, accuracy of 0.767) and by internal cross-validation. Among patients selected by the model, the CSF ctDNA detection rate improved significantly, from 58.18% (32/55) to 81.81% (27/33) (P = 0.022).
Conclusions
This study developed a regression model that uses the phenotypic characteristics of metastatic brain lesions in NSCLC patients to predict the probability of detecting CSF ctDNA, thereby maximizing the benefits of CSF liquid biopsies.
Key words: brain metastases, CSF ctDNA, prediction model
Highlights
- Intracranial tumor size and distance to nearest ventricle were significantly correlated with positive CSF ctDNA detection.
- A prediction model incorporating Tsize and LVDmin was developed and validated to evaluate the odds of CSF ctDNA positivity.
- The CSF ctDNA detection rate was significantly improved in patients after model selection.
Introduction
Brain metastases are common in patients with advanced non-small-cell lung cancer (NSCLC) and are associated with poor prognosis.1 It is increasingly acknowledged that brain metastases can differ genetically from their primary tumors and can respond differently to treatment, and recognizing these differences is crucial for diagnosis and optimal treatment.2,3 Recently, cerebrospinal fluid (CSF) has been explored as a promising source of circulating tumor DNA (ctDNA) to characterize brain malignancies, as obtaining intracranial tumor samples is invasive, particularly in cases with multiple brain metastases.4,5 Several studies showed that ctDNA is more abundant in CSF than in plasma and that CSF ctDNA represents the genomic alterations of brain tumors more comprehensively.6, 7, 8 However, the suboptimal ctDNA detection rate from CSF, which is 50%-60% according to recent studies,7,8 may limit its clinical applicability. The collection of CSF through a lumbar puncture is minimally invasive but carries multiple risks, including nerve damage, infection, and pain during and after the procedure. Therefore, there is an urgent clinical need to identify, prior to CSF collection, the patients whose ctDNA analyses are likely to be informative. To our knowledge, no study has explored the effects of clinical factors on ctDNA detection rates from CSF. In this study, we sought to investigate the association of clinical and imageological features of brain lesions with the probability of detecting CSF ctDNA in lung cancer patients with brain metastasis. We then developed a model to predict CSF ctDNA detection, with the aim of maximizing clinical benefit by avoiding unnecessary health care costs and examinations.
Methods
Study design and participants
This was a prospectively designed, observational study (NCT03257735) that included 66 newly diagnosed NSCLC patients with parenchymal brain metastases confirmed by magnetic resonance imaging (MRI) at Sun Yat-sen University Cancer Center between February 2017 and December 2019. Patients were eligible to participate in the study if they met the following criteria: (1) 18-75 years of age and an ECOG PS (Eastern Cooperative Oncology Group Performance Status) ≤2; (2) histologically confirmed NSCLC; (3) parenchymal brain metastases confirmed by enhanced brain MRI at the primary diagnosis; (4) treatment naïve (no previous systemic anticancer treatment or radiotherapy for brain metastases); (5) no contraindication for lumbar puncture. Patients were excluded if they had previously received treatment, had obvious central nervous system (CNS) symptoms that required local treatment, developed brain metastasis during treatment, had any contraindication for lumbar puncture, or had severe uncontrolled systemic disease. An additional independent cohort of 30 NSCLC patients with parenchymal brain metastases was recruited between January 2020 and December 2020 to validate the model. This study was approved by the Ethics Committee of the Guangdong Association Study of Thoracic Oncology (GASTO ID:1028, Approval No. A2017-003). All patients provided written informed consent to participate in the study and to provide samples for research purposes.
Collection of clinical and imageological characteristics
Patients’ clinical data, including sex, age, and smoking status, were analyzed. CNS symptoms were rated by the treating physicians. Patients’ enhanced MRI images were interpreted by two experienced radiologists (LZ and YL) who were blinded to patients’ clinical information. In cases of disagreement, the images were re-evaluated until the radiologists reached a consensus. Imageological parameters, including Tnum (total number of intracranial lesions), Tsize (diameter of the largest intracranial lesion), LLVD (shortest lesion–ventricle distance for the largest intracranial lesion), and LVDmin (minimum lesion–ventricle distance for all intracranial lesions), were used for further analyses (Figure 1).
Next-generation sequencing and data processing
CSF samples and tumor tissues were collected before treatment. Within 4 h of CSF collection (3-5 ml), the cellular fraction was removed using two-step centrifugation at 4°C (1900g for 10 min followed by 16 000g for 10 min). The supernatants were stored at −80°C until further analysis. ctDNA was extracted using the QIAamp Circulating Nucleic Acid Kit (Qiagen) and analyzed using hybridization-capture–based targeted next-generation sequencing (NGS) of 425 cancer-related genes in a Clinical Laboratory Improvement Amendments and College of American Pathologists accredited testing laboratory (Nanjing Geneseeq Technology, Jiangsu, China), as previously described.9, 10, 11 Genomic DNA from patients with available extracranial tumor biopsy samples (5-10 formalin-fixed paraffin-embedded slides) was extracted using the QIAamp DNA FFPE Tissue Kit (Qiagen) and sequenced using the NGS panel of 425 cancer-related genes. Targeted enriched libraries were sequenced on the Illumina HiSeq 4000 platform (Illumina), with variant calling and data analyses performed as previously described.9, 10, 11
Feature selection
Associations between features, including patients’ clinical and MRI imageological characteristics (Tsize, LLVD, and LVDmin), were analyzed using Pearson’s correlation in R version 3.3.2 (R Foundation for Statistical Computing, Vienna, Austria). Two features with a correlation coefficient >0.75 were considered highly correlated. Independent features were then assessed using both univariate logistic regression (R package rms) and variable importance estimation (R package caret).
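The correlation screen described above can be sketched as follows. This is a minimal illustration in Python rather than the R workflow the study used; the function names, the use of the absolute value of the coefficient, and the toy measurements are all assumptions made for the example and are not study data.

```python
from itertools import combinations
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def highly_correlated_pairs(features, threshold=0.75):
    """Return (name_a, name_b, r) for every pair with |r| above the threshold.

    ASSUMPTION: the threshold is applied to the absolute value of r.
    """
    flagged = []
    for (name_a, col_a), (name_b, col_b) in combinations(features.items(), 2):
        r = pearson_r(col_a, col_b)
        if abs(r) > threshold:
            flagged.append((name_a, name_b, round(r, 3)))
    return flagged


# Hypothetical measurements in mm for six patients (NOT study data).
toy_features = {
    "Tsize": [10.0, 18.0, 25.0, 12.0, 30.0, 21.0],
    "LLVD": [22.0, 5.0, 14.0, 30.0, 9.0, 17.0],
    "LVDmin": [20.0, 1.0, 16.0, 3.0, 8.0, 2.0],
}
print(highly_correlated_pairs(toy_features))
```

An empty list means no pair crossed the threshold, in which case all features would be carried forward as independent.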
Logistic regression model and nomogram construction
Complete subset regression over all combinations of features was used to select the best-fitting model (R package leaps). The Akaike information criterion (AIC) and the Bayesian information criterion (BIC) were used to compare candidate models.
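For reference, the two criteria in their standard textbook form are given below; the source does not restate them, and the symbols are chosen here for illustration. $\hat{L}$ denotes the maximized likelihood of a candidate model, $k$ its number of estimated parameters, and $n$ the number of patients used for fitting. Lower values indicate a better trade-off between fit and complexity.

```latex
\mathrm{AIC} = 2k - 2\ln\hat{L},
\qquad
\mathrm{BIC} = k\ln n - 2\ln\hat{L}
```

With the 55 patients included for modeling, $\ln n \approx 4.0$, so BIC charges roughly twice as much per extra parameter as AIC and tends to favor smaller models.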
Model performance was further validated in three ways: (i) in an external independent validation cohort; (ii) by simple random sampling cross-validation in the training cohort, in which patients were randomly partitioned in a 7:3 ratio 100 times, with 70% of the samples used for training and the remaining 30% for validation; and (iii) by leave-one-out cross-validation in the training cohort. Predictive performance was evaluated using the area under the receiver operating characteristic (ROC) curve (AUC), accuracy, sensitivity, and specificity (R package pROC). A predictive nomogram was constructed from the fine-tuned model using the R package rms. All analyses were conducted using R version 3.3.2.
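The resampling schemes and the AUC metric can be sketched in a few lines. This is an illustrative Python outline, not the pROC-based R code used in the study: the function names, the fixed random seed, and the choice to round the training size down are assumptions, and the model-fitting step that would run inside each split is omitted.

```python
import random


def auc(labels, scores):
    """AUC as the probability that a positive case outscores a negative one.

    Ties count as half a win (Mann-Whitney formulation).
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0 for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


def repeated_splits(n_patients, n_repeats=100, train_fraction=0.7, seed=0):
    """Yield (train_indices, validation_indices) for repeated random partitions.

    ASSUMPTION: the training size is rounded down; the source does not say.
    """
    rng = random.Random(seed)
    n_train = int(n_patients * train_fraction)
    for _ in range(n_repeats):
        order = list(range(n_patients))
        rng.shuffle(order)
        yield order[:n_train], order[n_train:]


def leave_one_out(n_patients):
    """Yield (train_indices, held_out_index) for leave-one-out cross-validation."""
    for i in range(n_patients):
        yield [j for j in range(n_patients) if j != i], i


splits = list(repeated_splits(55))
print(len(splits), len(splits[0][0]), len(splits[0][1]))
```

In a full pipeline, a logistic regression would be refitted on each training partition and `auc` would be applied to its predictions on the matching validation partition, with the results averaged across the repeats.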
Results
Patients’ clinical and MRI imageological characteristics
The study CONSORT (Consolidated Standards of Reporting Trials) diagram is shown in Figure 2. The median age of the training cohort was 56 years (range, 28-72 years). Approximately 62% (41/66) of patients were male and over half of the cohort (39/66) had no history of smoking. Most patients (79%, 52/66) were diagnosed with lung adenocarcinoma and 53% (35/66) of patients exhibited CNS symptoms at diagnosis. The imageological characteristics of brain lesions are summarized in Table 1. A total of 55 patients who had confirmed values for Tsize, LLVD, and LVDmin were included for modeling. The median Tsize was 18 mm (range, 6-38 mm). In 22 of the 55 patients, LLVD and LVDmin were equal because the largest lesion was also the one closest to the ventricle. The median LLVD and LVDmin were 17.7 and 4.6 mm, respectively.
Table 1.
Characteristics | Training cohort (n = 66) | Validation cohort (n = 30) | P value |
---|---|---|---|
Age (years), median (range) | 56 (28-72) | 60 (32-74) | 0.213 |
Sex, n (%) | | | 0.284 |
Male | 41 (62) | 22 (73) | |
Female | 25 (38) | 8 (27) | |
Smoking status, n (%) | | | 0.271 |
Current or former | 24 (36) | 15 (40) | |
Never | 39 (59) | 15 (50) | |
Unknown | 3 (5) | 0 (0) | |
Histology type, n (%) | | | 0.182 |
Adenocarcinoma | 52 (79) | 27 (90) | |
Others | 14 (21) | 3 (10) | |
CNS symptoms, n (%) | | | 0.962 |
Yes | 35 (53) | 16 (53) | |
No | 28 (42) | 13 (44) | |
NA | 3 (5) | 1 (3) | |
Primary tumor sample, n (%) | | | 0.665 |
Available | 49 (74) | 21 (70) | |
Unavailable | 17 (26) | 9 (30) | |
Tnum^a, n (%) | | | 0.012 |
=1 | 5 (8) | 3 (10) | |
>1 | 41 (62) | 26 (87) | |
NA | 20 (30) | 1 (3) | |
Tsize (mm)^a | | | 0.616 |
Median (range) | 18 (6-38) | 19.3 (4-47.4) | |
LLVD (mm)^a | | | 0.187 |
Median (range) | 17.7 (0-39.4) | 18.5 (0-40.4) | |
LVDmin (mm)^a | | | 0.194 |
Median (range) | 4.6 (0-31.9) | 3.95 (0-31.5) | |
CNS, central nervous system; LLVD, shortest lesion–ventricle distance for the largest lesion; LVDmin, minimum lesion–ventricle distance for all intracranial lesions; NA, not available; Tnum, total number of intracranial lesions; Tsize, the maximum diameter of the largest intracranial lesion.
^a Patients’ imageological characteristics of brain tumors were evaluated by radiologists using magnetic resonance imaging.
CSF ctDNA detection in brain metastatic NSCLC patients
In the training cohort (N = 66), CSF samples from all patients were subjected to genetic profiling by NGS. Forty-nine of those patients also had matched primary tumor tissues available for analysis. The detection rate of somatic mutations in CSF ctDNA (59.09%) was significantly lower than that in tumor tissues (91.84%). For patients with matched CSF ctDNA and tumor tissue samples, a high degree of genetic heterogeneity was observed between the primary tumor tissue and the brain metastases, as profiled by CSF liquid biopsy. More specifically, approximately half of the patients (26/49) had detectable somatic mutations in both sample types, while 19 patients had mutations only in the tumor tissue, and 2 patients had mutations in their CSF ctDNA only. The remaining two patients had no mutations in either sample (Supplementary Figure S1 and Table S1, available at https://doi.org/10.1016/j.esmoop.2021.100305).
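The matched-sample counts above can be checked with a short calculation. The sketch below is illustrative Python using only the four counts reported in the text; the variable and function names are invented here. Note that the 59.09% CSF rate refers to all 66 CSF samples, whereas the CSF rate computed below is restricted to the 49 matched patients.

```python
# Counts for the 49 patients with matched samples, as reported in the text.
both_positive = 26   # mutations detected in CSF ctDNA and in tumor tissue
tissue_only = 19     # mutations detected in tumor tissue only
csf_only = 2         # mutations detected in CSF ctDNA only
both_negative = 2    # no mutations detected in either sample

matched_total = both_positive + tissue_only + csf_only + both_negative


def percent(numerator, denominator):
    """Percentage rounded to two decimal places."""
    return round(100.0 * numerator / denominator, 2)


tissue_rate = percent(both_positive + tissue_only, matched_total)
csf_rate_matched = percent(both_positive + csf_only, matched_total)

print(matched_total, tissue_rate, csf_rate_matched)
```

The four categories sum to the 49 matched patients, and the tissue rate obtained this way agrees with the 91.84% reported for tumor tissue.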
Feature selection
To establish a model for CSF ctDNA detection, we first examined the correlation between patients’ clinical and imageological features, including age, sex, CNS symptom, Tsize, LLVD, and LVDmin (Figure 3A). Notably, Tnum was not applicable for patients who had dispersed tumor cells in the brain. The detailed feature information of each patient is provided in Supplementary Table S2, available at https://doi.org/10.1016/j.esmoop.2021.100305. All features were considered independent according to the Pearson correlation coefficients (r² < 0.75). In univariate analyses of all factors, Tsize, LLVD, and LVDmin were significantly associated with detectable CSF ctDNA, and Tsize and LVDmin remained significant in multivariate logistic regression (Table 2). Variable importance analyses using the R package caret, based on 10 repeats of fivefold cross-validation, also ranked Tsize, LLVD, and LVDmin as the three most important features associated with CSF ctDNA detection (Figure 3B).
Table 2.
Factor | Univariate OR (95% CI) | Univariate P value | Multivariate^a OR (95% CI) | Multivariate^a P value |
---|---|---|---|---|
Age (≥56 versus <56 years) | 0.76 (0.28-2.03) | 0.59 | NA | NA |
Sex (male versus female) | 1.23 (0.44-3.38) | 0.69 | NA | NA |
Histology (LUAD versus others) | 1.11 (0.32-3.65) | 0.87 | NA | NA |
Smoking (ever versus never) | 1.08 (0.39-3.08) | 0.88 | NA | NA |
CNS symptom (yes versus no) | 1.69 (0.62-4.71) | 0.31 | NA | NA |
Tsize | 1.08 (1.02-1.17) | 0.02^b | 1.11 (1.03-1.22) | 0.01^b |
LLVD | 0.90 (0.84-0.96) | 0.002^b | 0.94 (0.87-1.01) | 0.12 |
LVDmin | 0.88 (0.81-0.95) | 0.002^b | 0.87 (0.78-0.96) | 0.009^b |
CI, confidence interval; CNS, central nervous system; CSF, cerebrospinal fluid; ctDNA, circulating tumor DNA; LLVD, shortest lesion–ventricle distance for the largest lesion; LUAD, lung adenocarcinoma; LVDmin, minimum lesion–ventricle distance for all intracranial lesions; NA, not available; OR, odds ratio; Tsize, the maximum diameter of the largest intracranial lesion.
^a NA: These variables were eliminated in the multivariate logistic regression model, and thus, the OR and P values are not available.
^b P < 0.05.
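To make the multivariate odds ratios in Table 2 easier to read, the sketch below shows how they translate into a relative change in odds between two patients. It is illustrative Python, not part of the study's analysis. It assumes that each odds ratio applies per 1 mm increase in the continuous variable, which the source does not state explicitly, and the function name is invented here. Because the model intercept is not reported, only relative odds can be computed, not absolute probabilities.

```python
# Odds ratios taken from the multivariate columns of Table 2.
# ASSUMPTION: each OR is the change in odds per 1 mm increase.
OR_TSIZE = 1.11
OR_LVDMIN = 0.87


def odds_multiplier(delta_tsize_mm, delta_lvdmin_mm):
    """Factor by which the odds of CSF ctDNA detection differ between two
    patients whose Tsize and LVDmin differ by the given amounts (in mm)."""
    return (OR_TSIZE ** delta_tsize_mm) * (OR_LVDMIN ** delta_lvdmin_mm)


# Largest lesion 10 mm bigger, same distance to the ventricle.
print(round(odds_multiplier(10, 0), 2))
# Same lesion size, nearest lesion 5 mm farther from the ventricle.
print(round(odds_multiplier(0, 5), 2))
```

Under this reading, larger lesions raise the odds of detection and greater distance from the ventricle lowers them, in line with the direction of the associations reported in the table.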
Logistic regression model and nomogram
AIC and BIC were used to examine model performance based on the complete subset regressions of all factors. The combination of Tsize and LVDmin provided the best predictive ability using the BIC method. However, the trio of Tsize, LLVD, and LVDmin was preferable using the AIC method (Figure 3C). Thus, we eventually included the two factors (Tsize and LVDmin) in the multivariate model. The final prediction model with the full training cohort yielded an AUC of 0.819, an accuracy of 0.800, a sensitivity of 0.844, and a specificity of 0.739 (Figure 3D).
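The training-cohort metrics reported above are mutually consistent with a single confusion matrix at one decision threshold. The counts in the sketch below are a reconstruction, not figures given in the source: they assume 32 ctDNA-positive and 23 ctDNA-negative patients among the 55 modeled (from the reported 32/55 detection rate) and 27 detected cases among 33 model-selected patients (from the reported 27/33). The code is illustrative Python with invented names.

```python
# Reconstructed counts (an inference from the reported rates, NOT source data).
true_pos, false_neg = 27, 5   # the 32 patients with detectable CSF ctDNA
true_neg, false_pos = 17, 6   # the 23 patients without detectable CSF ctDNA


def metrics(tp, fn, tn, fp):
    """Sensitivity, specificity, accuracy, and positive predictive value."""
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "accuracy": (tp + tn) / (tp + fn + tn + fp),
        "ppv": tp / (tp + fp),
    }


for name, value in metrics(true_pos, false_neg, true_neg, false_pos).items():
    print(f"{name}: {value:.3f}")
```

If the reconstruction is right, the positive predictive value is the same quantity as the detection rate among model-selected patients.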
The model was validated using an independent external cohort of another 30 lung cancer patients with brain metastases. There were no significant differences in the baseline characteristics between the training and validation cohorts, except for Tnum (Table 1). The prediction performance on the validation cohort included an AUC of 0.772, an accuracy of 0.767, a sensitivity of 0.809, and a specificity of 0.600 (Figure 3D). Model performance was also evaluated in the training cohort by two internal cross-validation methods. The simple random sampling method (a ratio of 7:3, 100 times) demonstrated an average AUC of 0.78 and an accuracy of 0.75 (Figure 3E). Moreover, the leave-one-out cross-validation yielded an AUC of 0.77, an accuracy of 0.78, a sensitivity of 0.81, and a specificity of 0.74. Together, these approaches demonstrated the robustness and considerable accuracy of the model based on the Tsize and LVDmin of the intracranial lesions.
A nomogram was also generated from the logistic regression model for individual prediction of CSF ctDNA detection in routine practice (Figure 3F). Patients with a predicted detection probability >0.5 may be considered for CSF ctDNA sequencing. Applying this threshold significantly increased the proportion of patients with detectable CSF ctDNA, from 58.18% (32/55) to 81.81% (27/33) (P = 0.022).
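The source does not state which test produced the P value for this comparison. As one plausible reading, the sketch below applies a Pearson chi-square test without continuity correction to the two proportions, treating the unselected and model-selected groups as independent samples even though the selected patients are a subset of the 55. Both choices are assumptions made for illustration, and the function name is invented here. The code is plain Python using only the standard library.

```python
from math import erfc, sqrt


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square test (no continuity correction) for the 2x2 table
    [[a, b], [c, d]]. Returns (statistic, p_value) with 1 degree of freedom."""
    n = a + b + c + d
    statistic = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # For 1 degree of freedom, P(X > x) equals erfc(sqrt(x / 2)).
    p_value = erfc(sqrt(statistic / 2.0))
    return statistic, p_value


# Rows: unselected (32 detected, 23 not) and model-selected (27 detected, 6 not).
stat, p = chi_square_2x2(32, 23, 27, 6)
print(f"chi2 = {stat:.3f}, p = {p:.3f}")
```

Under these assumptions the result appears to land close to the reported P value; a Fisher exact test or a continuity-corrected test would give a somewhat different number.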
Discussion
In this study, we enrolled what is, to our knowledge, the largest cohort of lung cancer patients with brain metastases studied for this purpose and built a prediction model based on their clinical phenotypic characteristics to identify patients with a high likelihood of detectable CSF ctDNA. Recently, several studies showed that CSF is more enriched for brain tumor-derived ctDNA than peripheral blood due to the existence of the blood–brain barrier.12,13 De Mattos-Arruda et al.12 reported that CSF is relatively accurate in representing the genomic mutations in brain tumors and that the ctDNA levels fluctuate over time, following the changes in brain tumor burden. However, the detection rate of somatic mutations in CSF ctDNA is relatively low, which might limit its clinical utility. In our study, the detection rate of CSF ctDNA mutations was only 59.09% in the training cohort, which was similar to that found in previous studies. These data suggest an urgent clinical need to identify, prior to CSF collection, the patients whose ctDNA analyses are likely to be informative. Wang et al.14 previously reported that high-grade gliomas abutting a ventricle were associated with high levels of CSF ctDNA. In this study, we demonstrated that intracranial tumor size and the minimum lesion–ventricle distance were significantly correlated with the probability of detecting CSF ctDNA. We also developed a predictive model to estimate the probability of CSF ctDNA detection. The model can be applied routinely using MRI images. We validated model performance in an independent external cohort, which confirmed the accuracy of the model based on Tsize and LVDmin (AUC of 0.772, accuracy of 0.767). Compared with traditional predictive models, nomograms are more visual and easier to interpret, and can therefore be more readily applied in clinical decision making.
The proportion of detectable CSF ctDNA in the model-selected cohort was significantly higher than that of the unselected cohort, suggesting that the developed nomogram is a valuable and cost-effective method for maximizing the benefits of CSF liquid biopsies from NSCLC patients with brain metastases and for reducing unnecessary diagnostic risks and costs.
To our knowledge, this was the first proof-of-concept study to use the phenotypic characteristics of metastatic brain tumors to predict the probability of detecting CSF ctDNA in NSCLC patients. This study also had limitations. First, the study population was small. Second, the study did not recruit patients who had leptomeningeal metastases.
Conclusion
In summary, we established a regression model and a nomogram for predicting the detection of CSF ctDNA using the phenotypic characteristics of metastatic brain lesions in NSCLC patients. This model can improve the probability of detecting CSF ctDNA, thereby avoiding unnecessary examinations and maximizing the benefits of CSF liquid biopsies.
Acknowledgements
The authors thank Drs Qiuxiang Ou, Hua Bao, Xue Wu, and Fufeng Wang from Nanjing Geneseeq Technology Inc. for their insightful comments and suggestions on the manuscript.
Funding
This study was supported by a grant (No. 82072559) from the National Natural Science Foundation of China.
Disclosure
YM and YS are employees of Nanjing Geneseeq Technology Inc. The other authors have no conflicts of interest to declare.
Contributor Information
Y. Mou, Email: mouyg@sysucc.org.cn.
L. Chen, Email: chenlk@sysucc.org.cn.
Ethics approval and consent to participate
This study was approved by the ethics committee of the Guangdong Association Study of Thoracic Oncology (GASTO ID:1028, Approval No. A2017-003). Written consent was obtained from patients.
References
- 1. Mamon H.J., Yeap B.Y., Janne P.A., et al. High risk of brain metastases in surgically staged IIIA non-small-cell lung cancer patients treated with surgery, chemotherapy, and radiation. J Clin Oncol. 2005;23(7):1530–1537. doi: 10.1200/JCO.2005.04.123.
- 2. Gerlinger M., Rowan A.J., Horswell S., et al. Intratumor heterogeneity and branched evolution revealed by multiregion sequencing. N Engl J Med. 2012;366(10):883–892. doi: 10.1056/NEJMoa1113205.
- 3. Kuukasjarvi T., Karhu R., Tanner M., et al. Genetic heterogeneity and clonal evolution underlying development of asynchronous metastasis in human breast cancer. Cancer Res. 1997;57(8):1597–1604.
- 4. Siravegna G., Geuna E., Mussolin B., et al. Genotyping tumour DNA in cerebrospinal fluid and plasma of a HER2-positive breast cancer patient with brain metastases. ESMO Open. 2017;2(4):e000253. doi: 10.1136/esmoopen-2017-000253.
- 5. McEwen A.E., Leary S.E.S., Lockwood C.M. Beyond the blood: CSF-derived cfDNA for diagnosis and characterization of CNS tumors. Front Cell Dev Biol. 2020;8:45. doi: 10.3389/fcell.2020.00045.
- 6. Ma C., Yang X., Xing W., Yu H., Si T., Guo Z. Detection of circulating tumor DNA from non-small cell lung cancer brain metastasis in cerebrospinal fluid samples. Thorac Cancer. 2020;11(3):588–593. doi: 10.1111/1759-7714.13300.
- 7. Zhao Y., He J.Y., Zou Y.L., et al. Evaluating the cerebrospinal fluid ctDNA detection by next-generation sequencing in the diagnosis of meningeal Carcinomatosis. BMC Neurol. 2019;19(1):331. doi: 10.1186/s12883-019-1554-5.
- 8. Martinez-Ricarte F., Mayor R., Martinez-Saez E., et al. Molecular diagnosis of diffuse gliomas through sequencing of cell-free circulating tumor DNA from cerebrospinal fluid. Clin Cancer Res. 2018;24(12):2812–2819. doi: 10.1158/1078-0432.CCR-17-3800.
- 9. Shu Y., Wu X., Tong X., et al. Circulating tumor DNA mutation profiling by targeted next generation sequencing provides guidance for personalized treatments in multiple cancer types. Sci Rep. 2017;7(1):583. doi: 10.1038/s41598-017-00520-1.
- 10. Xia H., Xue X., Ding H., et al. Evidence of NTRK1 fusion as resistance mechanism to EGFR TKI in EGFR+ NSCLC: results from a large-scale survey of NTRK1 fusions in Chinese patients with lung cancer. Clin Lung Cancer. 2021;21(3):247–254. doi: 10.1016/j.cllc.2019.09.004.
- 11. Jiang B.Y., Li Y.S., Guo W.B., et al. Detection of driver and resistance mutations in leptomeningeal metastases of NSCLC by next-generation sequencing of cerebrospinal fluid circulating tumor cells. Clin Cancer Res. 2017;23(18):5480–5488. doi: 10.1158/1078-0432.CCR-17-0047.
- 12. De Mattos-Arruda L., Mayor R., Ng C.K.Y., et al. Cerebrospinal fluid-derived circulating tumour DNA better represents the genomic alterations of brain tumours than plasma. Nat Commun. 2015;6:8839. doi: 10.1038/ncomms9839.
- 13. Pentsova E.I., Shah R.H., Tang J., et al. Evaluating cancer of the central nervous system through next-generation sequencing of cerebrospinal fluid. J Clin Oncol. 2016;34(20):2404–2415. doi: 10.1200/JCO.2016.66.6487.
- 14. Wang Y., Springer S., Zhang M., et al. Detection of tumor-derived DNA in cerebrospinal fluid of patients with primary tumors of the brain and spinal cord. Proc Natl Acad Sci U S A. 2015;112(31):9704–9709. doi: 10.1073/pnas.1511694112.