Author manuscript; available in PMC: 2022 Apr 1.
Published in final edited form as: J Thorac Oncol. 2021 Feb 2;16(4):537–545. doi: 10.1016/j.jtho.2021.01.1616

Biomarker Discovery and Validation: Statistical Considerations

Fang-Shu Ou 1, Stefan Michiels 2, Yu Shyr 3, Alex A Adjei 4, Ann L Oberg 1
PMCID: PMC8012218  NIHMSID: NIHMS1669144  PMID: 33545385

Abstract

Biomarkers have various applications including disease detection, diagnosis, prognosis, prediction of response to intervention, and disease monitoring. In this era of precision medicine, having validated biomarkers to inform clinical decision making is more important than ever. In this article, we discuss best practices and potential issues in biomarker discovery and validation. We encourage team science partnerships to bring cutting edge discovery from bench to bedside, leading to improved patient care and outcomes.

Keywords: biomarker, exploratory analysis, confirmation analysis, clinical trial

INTRODUCTION

A biological marker (biomarker) is “a defined characteristic that is measured as an indicator of normal biological processes, pathogenic processes, or biological responses to an exposure or intervention, including therapeutic interventions” 1. Biomarkers have various applications such as risk estimation, disease screening and detection, diagnosis, estimation of prognosis, prediction of benefit from therapy, and disease monitoring (Figure 1). In oncology, biomarker candidates often consist of biological molecules found in cancer cells. The most common biomarkers are cancer-associated proteins, gene mutations, deletions, rearrangements and extra copy numbers of genes. These molecules are sometimes secreted into the circulation and so may be detected by blood-based assay, while others are present in cancer cells and so require a biopsy to obtain tissue for testing. An ideal biomarker satisfies the following properties: it should be either binary (i.e. present or absent) or quantifiable without subjective assessments; the result should be generated by an assay that is adaptable to routine clinical practice and have a timely turnaround (i.e. in a matter of days rather than weeks); the biomarker assay should be sensitive and specific; and most importantly, the biomarker should be detectable using easily accessible specimens.

Figure 1. Use of biomarkers in relation to the course of disease

Molecular biomarkers are used together with clinical information to achieve precision medicine, that is, to customize prevention, screening, and treatment strategies for groups of patients with similar characteristics (Figure 1). Risk stratification biomarkers identify patients at higher than usual risk of disease who should be monitored more closely than the general population, e.g. smoking increases the risk of lung cancer 2. Disease screening and detection biomarkers are used to detect diseases before symptoms manifest, when therapy has a greater likelihood of success, e.g. low-dose computed tomography (LDCT) screening is recommended for patients at high risk of lung cancer 2. Diagnostic biomarkers detect the presence of disease, e.g. biopsies can be used in the diagnosis of lung cancer 2. Prognostic biomarkers provide information about overall expected clinical outcomes for a patient, regardless of therapy or treatment selection, e.g. sarcomatoid mesothelioma has a poor outcome regardless of therapy 3. Predictive biomarkers inform the overall expected clinical outcome based on treatment decisions in biomarker-defined patients only. The most important predictive biomarkers found for non-small cell lung cancer (NSCLC), for example, are mutations in the epidermal growth factor receptor (EGFR), B-Raf proto-oncogene (BRAF), or MET proto-oncogene (MET) genes, as well as rearrangements involving the anaplastic lymphoma kinase (ALK), ROS proto-oncogene 1 (ROS1), ret proto-oncogene (RET), and NTRK family genes 4; various targeted therapies are available for patients identified by most of these biomarkers.

A biomarker’s journey from discovery to clinical use is long and arduous, but can be broken into phases or steps 5-8. Biomarker discovery efforts have increased with the emergence of technologies for gathering relevant data; for example, single-cell next-generation sequencing (NGS), liquid biopsy (blood sample) for circulating tumor DNA (ctDNA), microbiomics, radiomics, and other types of high-throughput technologies have exploded in popularity in recent years, due to their ability to produce an enormous volume of data quickly and at relatively low cost. Across the continuum of biomarker data capture and utilization, however, many more challenges lie ahead—from analysis of high-throughput biomarker data to maximum exploitation of the electronic health record (EHR), and to the ultimate goal of biomarker-driven clinical practice. Biomarker discovery and validation are essential steps in establishing biomarkers in all applications across the disease course. In this article we discuss best practices for biomarker discovery and validation from a statistical perspective (Figure 2).

Figure 2. Simplified schematic of biomarker development.

PRoBE: prospective-specimen-collection, retrospective-blinded-evaluation.

BIOMARKER DISCOVERY

The intended use of a biomarker (e.g. risk stratification, screening, etc.) 1 and the target population to be tested need to be defined early in the development process. The use of a biomarker in relation to the course of a disease and specific clinical contexts should also be pre-specified (Figure 1). The patients and specimens should both directly reflect the target population and intended use.

Key considerations for biomarker discovery

Key considerations for conducting discovery studies using archived specimens are the patient population represented by the specimen archive, the power of the study (determined by the number of samples and number of events), the prevalence of the disease, the analytical validity of the biomarker test, and the pre-specified analysis plan 9. The most reliable setting for such retrospective studies uses specimens and data collected during prospective trials, and the results of one study need to be reproduced in another. Definitions for levels of evidence have been developed to evaluate the clinical utility of biomarkers in oncology and in medicine 9,10.

Bias, a systematic shift from truth, is one of the greatest causes of failure in biomarker validation studies 11. Bias can enter a study during patient selection, specimen collection, specimen analysis, and patient evaluation. Randomization and blinding are two of the most important tools for avoiding bias. Randomization in biomarker discovery should be carried out to control for non-biological experimental effects due to changes in reagents, technicians, machine drift, etc. that can result in batch effects 12. Specimens from controls and cases should be assigned to arrays, testing plates, or batches at random, ensuring that cases, controls, and specimen ages are evenly distributed across them 13. Blinding can be carried out by keeping the individuals who generate the biomarker data from knowing the clinical outcomes; it prevents the bias induced by unequal assessment of biomarker results 14. Randomization and blinding should be used in the process of biomarker data generation and should be incorporated at every stage of a study when possible.
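As an illustration of the random assignment described above, the following is a minimal sketch of allocating specimens to batches so that cases and controls are spread evenly. The function name, the specimen identifiers, and the fixed seed are hypothetical; the sketch balances only case/control status, and a real study would also need to balance specimen age and other factors.

```python
import random
from collections import Counter

def assign_to_batches(specimens, n_batches, seed=2021):
    """Randomly assign specimens to batches, balancing cases and controls.

    `specimens` is a list of (specimen_id, status) tuples, with status
    "case" or "control". Each status group is shuffled and then dealt
    out in turn, so batches differ by at most one specimen per group.
    """
    rng = random.Random(seed)  # fixed seed so the layout can be reproduced
    assignment = {}
    offset = 0
    for status in ("case", "control"):
        group = [sid for sid, s in specimens if s == status]
        rng.shuffle(group)
        for position, sid in enumerate(group):
            assignment[sid] = (position + offset) % n_batches
        offset += len(group)  # continue dealing where the last group stopped
    return assignment

# Hypothetical example: 12 cases and 12 controls across 4 plates.
specimens = [(f"S{i:02d}", "case" if i < 12 else "control") for i in range(24)]
layout = assign_to_batches(specimens, n_batches=4)
status_of = dict(specimens)
per_batch = Counter((layout[sid], status_of[sid]) for sid in layout)
```

Blinding can be layered on top of such a layout by giving laboratory staff only the specimen identifiers and batch numbers, never the case/control column.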

Prognostic and predictive biomarker identification

A prognostic biomarker can be identified in case-control studies, single-arm trials, and properly conducted retrospective studies that do not rely solely on convenience samples but use biospecimens prospectively collected from a cohort that represents the target screening population. A prognostic biomarker is identified through a main effect test of association between the biomarker and the outcome in a statistical model. An example of a prognostic biomarker is the STK11 mutation, which is associated with poorer outcome in non-squamous NSCLC 15. Tissue samples were collected from a consecutive series of patients with non-squamous NSCLC who underwent curative-intent surgical resection from 2001 to 2006 at two hospitals. An a priori power calculation was performed to ensure a sufficient number of overall survival (OS) events to provide adequate statistical power to assess five candidate biomarkers. Even though convenience samples were used, the prognostic effect was validated in two external datasets, which strengthened the validity of the discovery.

A predictive biomarker needs to be identified in secondary analyses using data from a randomized clinical trial, through an interaction test between the treatment and the biomarker in a statistical model. Secondary analyses refer to subsequent correlative studies which may or may not be pre-defined as a protocol objective. An example of predictive biomarker identification is the IPASS study 16. The IPASS study enrolled patients with advanced pulmonary adenocarcinoma who were nonsmokers or former light smokers and randomly assigned patients to receive gefitinib or carboplatin plus paclitaxel (CP). Patients’ EGFR mutation status was not known at the time of enrollment and was determined retrospectively. The interaction between treatment and EGFR mutation was highly statistically significant (P<.001), and indicated that among patients who have EGFR mutated tumors, progression-free survival (PFS) was significantly longer (hazard ratio [HR], 0.48; 95% confidence interval [CI], 0.36 to 0.64) for those receiving gefitinib compared to those receiving CP. In contrast, among patients who have EGFR wildtype tumors, PFS was significantly shorter (HR, 2.85; 95% CI, 2.05 to 3.98) for those receiving gefitinib compared to those receiving CP 16.
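To make the interaction idea concrete, the sketch below compares the two subgroup hazard ratios quoted above using only the published summary statistics. This is an approximation that assumes the two subgroups are independent and that the log hazard ratios are roughly normal; it is not the model-based interaction test reported for IPASS, which requires patient-level data. The function names are hypothetical.

```python
import math

def log_hr_and_se(hr, ci_low, ci_high, z_crit=1.959964):
    """Recover log(HR) and its standard error from a reported 95% CI."""
    log_hr = math.log(hr)
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * z_crit)
    return log_hr, se

def interaction_from_subgroups(hr_pos, ci_pos, hr_neg, ci_neg):
    """Approximate z-test comparing two independent subgroup hazard ratios."""
    log_pos, se_pos = log_hr_and_se(hr_pos, *ci_pos)
    log_neg, se_neg = log_hr_and_se(hr_neg, *ci_neg)
    diff = log_pos - log_neg                     # log of the ratio of HRs
    se_diff = math.sqrt(se_pos ** 2 + se_neg ** 2)
    z = diff / se_diff
    p_two_sided = math.erfc(abs(z) / math.sqrt(2))
    return {"ratio_of_hrs": math.exp(diff), "z": z, "p": p_two_sided}

# Hazard ratios for PFS (gefitinib vs CP) as quoted in the text:
# EGFR-mutated 0.48 (0.36 to 0.64); EGFR wild-type 2.85 (2.05 to 3.98).
result = interaction_from_subgroups(0.48, (0.36, 0.64), 2.85, (2.05, 3.98))
```

Under these assumptions the ratio of hazard ratios is roughly 0.17 and the p-value falls well below .001, which agrees in direction and magnitude with the reported interaction. A prognostic-only marker would instead show similar hazard ratios in both subgroups and a ratio near 1.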

Analytical methods

Analytical methods should be chosen to address study specific goals and hypotheses. Data-driven analyses and the resulting findings are less likely to be reproducible in an independent set of data. Thus, the analytical plan should be written and agreed upon by all members of the research team prior to receiving data in order to avoid the data influencing an analysis. This includes defining the outcomes of interest, hypotheses that will be tested, and criteria for success. Control of multiple comparisons should be implemented when multiple biomarkers are evaluated; a measure of false discovery rate (FDR) is especially useful when using large scale genomic or other high dimensional data for biomarker discovery 17. During biomarker discovery, evaluation of associations between a biomarker and disease status, demographic or clinical characteristics such as age, sex, BMI, or in diseased patients, stage or other disease characteristics, can inform design of future validation studies. Metrics useful for evaluating biomarkers (Table 1) include differences between groups, sensitivity, specificity, positive and negative predictive value, discrimination (i.e. receiver operating characteristic [ROC] area under the curve [AUC]), calibration, clinical validity and utility 10,18-21. The appropriate metric depends upon the study goals, and should be determined by a study team including clinicians, scientists, statisticians, and epidemiologists.
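As one way to control the false discovery rate when many candidate biomarkers are tested, the sketch below implements the Benjamini-Hochberg step-up adjustment. This is a swapped-in, simpler relative of the q-value approach cited above, chosen because it needs no tuning; the p-values are hypothetical.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])  # indices by ascending p
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

# Hypothetical p-values for six candidate biomarkers.
candidate_p = [0.0002, 0.004, 0.019, 0.03, 0.21, 0.6]
adjusted_p = benjamini_hochberg(candidate_p)
discoveries = [i for i, q in enumerate(adjusted_p) if q <= 0.05]
```

With these illustrative inputs, the first four candidates pass a 5% FDR threshold. The threshold itself belongs in the pre-specified analysis plan rather than being chosen after seeing the results.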

Table 1. Metrics useful for evaluating biomarker performance

| Metric | Description |
| --- | --- |
| Sensitivity | The proportion of cases that test positive |
| Specificity | The proportion of controls that test negative |
| Positive predictive value | Proportion of test positive patients who actually have the disease; is a function of disease prevalence |
| Negative predictive value | Proportion of test negative patients who truly do not have the disease; is a function of disease prevalence |
| Receiver Operating Characteristic (ROC) Curve | Plot of sensitivity (true positive rate) versus 1-specificity (false positive rate), with a data point calculated for every value of the marker in the dataset |
| Discrimination | How well the marker distinguishes cases from controls; often measured by the area under the ROC curve; ranges from 0 to 1, with 0.5 indicating performance equivalent to a coin flip, 1 corresponds to perfect ability to distinguish |
| Calibration | How well a marker estimates the risk of disease or of the event of interest |
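
Two of the points in Table 1 can be sketched in a few lines: predictive values depend on prevalence even when sensitivity and specificity are fixed, and the area under the ROC curve equals the probability that a randomly chosen case has a higher marker value than a randomly chosen control. The function names are hypothetical, and the sketch assumes that higher marker values indicate disease.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) from sensitivity, specificity, and prevalence."""
    true_pos = sensitivity * prevalence
    false_pos = (1 - specificity) * (1 - prevalence)
    false_neg = (1 - sensitivity) * prevalence
    true_neg = specificity * (1 - prevalence)
    return true_pos / (true_pos + false_pos), true_neg / (true_neg + false_neg)

def auc(case_values, control_values):
    """Area under the ROC curve by pairwise comparison (ties count one half)."""
    wins = 0.0
    for case in case_values:
        for control in control_values:
            if case > control:
                wins += 1.0
            elif case == control:
                wins += 0.5
    return wins / (len(case_values) * len(control_values))

# Same assay, two hypothetical settings: screening (1%) vs referral clinic (30%).
ppv_screening, npv_screening = predictive_values(0.90, 0.90, 0.01)
ppv_clinic, npv_clinic = predictive_values(0.90, 0.90, 0.30)
```

With 90% sensitivity and specificity, the positive predictive value is only about 8% at a prevalence of 1%, which is why the intended use and target population need to be fixed before a metric is chosen.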

It is often the case that information from a panel of multiple biomarkers will be required to achieve better performance than a single biomarker, in spite of the added potential measurement errors that come from multiple assays. Using each biomarker in its continuous state instead of a dichotomized version retains maximal information for model development, and in turn, greater improvement in panel performance; dichotomization for clinical decision making is best left for later studies. The optimal analytical strategy for combining multiple biomarkers and for choosing which biomarkers to combine depends on both sample size and clinical context. Incorporation of some form of variable selection, such as shrinkage, during model estimation generally minimizes overfitting and maximizes the likelihood of validation; hundreds to thousands of patients are generally required to incorporate nonlinear functions such as interactions, smoothing splines, or machine learning and artificial intelligence algorithms without overfitting. It is useful to generate pilot data for use in simulations to inform sample size calculations and plan the appropriate analytical strategy 18-20,22,23.

Missing data can lead to biased results. Thus, the analysis plan should include an assessment of the mechanism responsible for the missingness and an approach to handling it that minimizes the potential for bias in the analysis 24.

The EQUATOR network assembles an important collection of guidelines for the design and reporting of diagnostic and prognostic modelling studies (https://www.equator-network.org/).

BIOMARKER VALIDATION

Validation is “a process to establish that the performance of a test, tool, or instrument is acceptable for its intended purpose” 1. Internal validation establishes a biomarker’s performance in the data in which the biomarker was developed and should be assessed via resampling methods such as bootstrapping or cross-validation to provide realistic expectations 18. External validation establishes a biomarker’s performance in a completely independent dataset not used during development; it must be established using data from different timeframes, institutions, or geographic regions, as discussed in subsequent paragraphs. Analytical validation and clinical validation are two distinct aspects of biomarker validation. Use of specimens collected prospectively from the target population before knowing patient outcomes is a critical design feature of all validation studies that minimizes the influence of bias.
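The resampling idea behind internal validation can be sketched with k-fold cross-validation. In this toy setting the only thing "developed" from the data is a cutpoint chosen to maximize Youden's J (sensitivity + specificity - 1); each fold is then classified with a cutpoint selected without it. The data are simulated, and all names are hypothetical.

```python
import random

def youden_cutpoint(values, labels):
    """Pick the cutpoint (marker >= cut is 'positive') maximising Youden's J."""
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    best_cut, best_j = None, -2.0
    for cut in sorted(set(values)):
        sens = sum(1 for v, y in zip(values, labels) if y == 1 and v >= cut) / n_pos
        spec = sum(1 for v, y in zip(values, labels) if y == 0 and v < cut) / n_neg
        if sens + spec - 1 > best_j:
            best_cut, best_j = cut, sens + spec - 1
    return best_cut, best_j

def cross_validated_j(values, labels, k=5, seed=0):
    """Youden's J when every fold is scored with a cutpoint chosen without it."""
    rng = random.Random(seed)
    fold = [0] * len(values)
    for cls in (0, 1):  # stratify so folds keep the case/control mix
        members = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(members)
        for position, i in enumerate(members):
            fold[i] = position % k
    true_pos = true_neg = 0
    for held_out in range(k):
        train = [i for i in range(len(values)) if fold[i] != held_out]
        cut, _ = youden_cutpoint([values[i] for i in train],
                                 [labels[i] for i in train])
        for i in range(len(values)):
            if fold[i] == held_out:
                if labels[i] == 1 and values[i] >= cut:
                    true_pos += 1
                elif labels[i] == 0 and values[i] < cut:
                    true_neg += 1
    n_pos = sum(labels)
    return true_pos / n_pos + true_neg / (len(labels) - n_pos) - 1

# Simulated (not real) marker: cases shifted one standard deviation upward.
rng = random.Random(42)
labels = [1] * 100 + [0] * 100
values = ([rng.gauss(1.0, 1.0) for _ in range(100)]
          + [rng.gauss(0.0, 1.0) for _ in range(100)])
apparent_j = youden_cutpoint(values, labels)[1]  # tuned and scored on same data
cv_j = cross_validated_j(values, labels)         # more realistic expectation
```

The apparent value is typically the more optimistic of the two because the cutpoint was tuned on the same patients it is scored on. Neither number replaces external validation in an independent dataset.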

Analytical validation

Analytical validation aims to establish the performance characteristics of a biomarker, including sensitivity, specificity, accuracy, precision, inter-laboratory reproducibility, and other relevant characteristics, following a pre-specified protocol. The statistical analysis methods used for analytical validation are similar to the methods mentioned in biomarker discovery (Table 1). The goal of analytical validation is to demonstrate a biomarker’s technical performance (i.e. that the assay provides consistent measurements that are close to the unknown true values) and not its usefulness.

Clinical validation

Clinical validation aims to establish an association between the biomarker and the endpoint of interest (i.e. clinical validity per Teutsch et al. 10) and to demonstrate the usefulness of the biomarker (i.e. clinical utility per Teutsch et al. 10). Clinical validation relies on external validation and can be done by retrospective use of clinical trial data or by prospective clinical trials. Retrospective use of clinical trial data is a form of external clinical validation where the biomarker evaluation is not part of the original study design.

Establishing clinical utility or usefulness generally requires a prospective clinical trial, a form of external validation, to demonstrate that use of the biomarker to guide patient care translates into improved health outcomes. An example is pembrolizumab, which received the first tissue-agnostic approval granted by the United States Food and Drug Administration (FDA) 25. In the KEYNOTE-016 study, patients with microsatellite instability-high (MSI-H) tumors treated with pembrolizumab showed higher overall response rates than patients with microsatellite stable (MSS) tumors, regardless of tumor origin. The regulatory approval was based on data from 5 different trials (N=149) where MSI-H patients were retrospectively identified from 2 prospective studies (N=14) and prospectively identified from 3 studies (N=135). The objective response rate was 39.6% (7% with complete response) among 149 patients with MSI-H tumors spanning 15 different tumor types, which was considered clinically meaningful (compared to an objective response rate of 0% among colorectal cancer patients with MSS tumors in KEYNOTE-016 26). At the time of the approval, no companion in vitro diagnostic device was available. Patients were enrolled predominantly based on PCR-based tests for MSI-H and IHC-based tests for deficient mismatch repair (dMMR) available in the community as laboratory-developed tests. The FDA determined that the risk to patients with “false positive” tumors was low in this setting and, given the efficacy observed, approved pembrolizumab for this use 27. Merck committed to developing, post-marketing, a companion diagnostic test for detection of MSI-H and deficient mismatch repair across all cancers.

Study designs for biomarker validation

Though costly, biomarker evaluation efforts are enhanced by biobanks of specimens collected prospectively from an observational cohort that represents the target population intended for the biomarker 28. A PRoBE (prospective-specimen-collection, retrospective-blinded-evaluation) design 29 can be performed in such a setting to validate screening, diagnostic, and prognostic biomarkers. Specimens and clinical data are collected without knowing the patient outcome. Case patients and control patients are then randomly selected based on their outcome status. The biomarker data are generated for the selected patients, blinded to clinical and outcome information. An example of such a design is the MILD study 30. The MILD trial, a randomized prospective clinical trial, enrolled 4,099 current or former smokers without a history of cancer and randomized them to low-dose computed tomography versus observation. Whole blood was collected at enrollment and at subsequent follow-up. Retrospectively, one thousand consecutive plasma samples collected from June 2009 to July 2010 among lung cancer-free individuals enrolled onto the trial were used for validation of a microRNA signature classifier. The classifier was pre-specified with predefined cut points; risk scores were generated for individual participants blinded to clinical outcome and submitted to an independent research center. Data analysis was completed according to a pre-specified statistical analysis plan by the independent research center. This validation study intentionally utilized the full cohort rather than a random subset of patients to maximize the study power.

There are several prospective clinical trial designs intended to validate the clinical utility of a predictive biomarker in a clinical setting. Enrichment designs screen all patients for the biomarker but only enroll and randomize those with the desired molecular features. A treatment is evaluated within the biomarker-defined subgroup only. Enrichment designs are advantageous when the biomarker prevalence is low (<15-20%). An example of such a design is the EURTAC trial 31, which led to the FDA’s approval of erlotinib for the first-line treatment of patients with metastatic NSCLC harboring EGFR mutations. The EURTAC trial screened 1227 patients and then randomized 174 patients with EGFR mutations to receive erlotinib or standard chemotherapy (Figure 3a).
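The screening burden of an enrichment design follows from simple arithmetic, sketched below. The function and the `eligible_fraction` parameter are hypothetical. The EURTAC yield computed here is the overall screened-to-randomized fraction, which reflects biomarker prevalence together with eligibility and consent, not prevalence alone.

```python
def expected_screened(n_randomized, biomarker_prevalence, eligible_fraction=1.0):
    """Expected number of patients to screen to randomize `n_randomized`.

    `eligible_fraction` is the share of biomarker-positive patients who
    go on to be randomized (an assumption, not a figure from the text).
    """
    if not 0 < biomarker_prevalence <= 1:
        raise ValueError("biomarker_prevalence must be in (0, 1]")
    if not 0 < eligible_fraction <= 1:
        raise ValueError("eligible_fraction must be in (0, 1]")
    return n_randomized / (biomarker_prevalence * eligible_fraction)

# Observed yield in EURTAC, using the counts quoted in the text.
eurtac_yield = 174 / 1227  # about 14% of screened patients were randomized

# Hypothetical planning scenarios for a 200-patient enrichment trial.
screen_at_10_percent = expected_screened(200, 0.10)
screen_at_10_percent_half_eligible = expected_screened(200, 0.10, 0.5)
```

The lower the prevalence, the more patients must be screened for each one randomized, so screening cost and assay turnaround time should be budgeted when planning such a trial.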

Figure 3. Trial design schema

All-comer (stratified by biomarker status) designs screen all patients for the biomarker then enroll and randomize patients with a valid biomarker result. The randomization can be stratified by the biomarker status (if the turnaround time of biomarker testing is short) and the test of treatment by biomarker interaction is included in the pre-specified analysis plans. All-comer designs are appropriate when the treatment benefit needs to be better understood in both patients who test positive and in those who test negative. An example of such a design is the MARVEL trial (N0723, NCT00738881). The MARVEL trial planned to enroll 1,196 patients with advanced NSCLC after first-line therapy and patients’ EGFR expression via Fluorescence in situ Hybridization (FISH) was evaluated by central pathology review. After the FISH result was available, patients were randomized to receive pemetrexed versus erlotinib, stratified by the FISH status and other factors. The goal was to identify 287 FISH positive patients and 670 (70%) FISH negative patients to evaluate whether there are differences in progression-free survival due to treatment with erlotinib compared to pemetrexed for subsets defined by FISH positivity versus negativity (Figure 3b).

Subgroup designs validate a predictive biomarker in a specific subgroup of patients as well as in the overall population using a multiple-hypothesis design 32. In this design, all patients with a particular disease are randomized to experimental therapy versus standard-of-care, but co-primary objectives are defined to test the superiority of the experimental therapy in the subgroup of patients selected by the biomarker, as well as for all enrolled patients. This design is advantageous when there is evidence that the experimental therapy will be most effective in patients with the biomarker of interest, but could also have a broad impact in the general disease population. An example is SWOG S0819 33 which was designed to test the hypothesis that EGFR amplification can identify patients most likely to benefit from EGFR antibodies in combination with chemotherapy in patients with advanced NSCLC. S0819 randomized 1313 eligible patients to chemotherapy with cetuximab versus chemotherapy alone. EGFR-FISH status was not required to be known at trial enrollment and was evaluated at each interim analysis. Co-primary endpoints were progression-free survival in patients with EGFR-FISH positive cancer and overall survival in the entire population (Figure 3c).

Platform-type trial designs, such as umbrella trials (histology specific) and basket trials (biomarker specific and agnostic to histology), can be advantageous in biomarker validation as well 34.

Analytical validity, clinical validity, and clinical utility are all established under a common requirement: a pre-specified protocol that covers the specifics of the validation process, such as specimen collection, specimen handling and storage procedures, biomarker and clinical outcomes of interest, the purposes of the biomarker, and the potential benefits and risks associated with the use of the biomarker.

CONCLUSION

In this article, we discussed statistical perspectives on best practices for biomarker discovery and validation. One aspect that we omitted was the biomarker qualification process with the regulatory agencies 7. Readers should note that the FDA requires biomarker candidates to undergo clinical validation and be assessed as a companion diagnostic before receiving regulatory approval. The biomarkers used to direct therapies need to be generated by an assay that is performed in a CLIA (Clinical Laboratory Improvement Amendments) certified laboratory which will be the first step towards clinical validation. We encourage investigators to reach out to health authorities early to discuss potential biomarkers of interest.

We would also like to take this opportunity to urge oncologists to resist the temptation of adopting unvalidated biomarker findings into practice. Attempts to discover biomarkers have accelerated through advanced technology in generating relevant data. The potential biomarkers discovered should be considered as hypothesis generating and the biomarkers need to be validated (both analytically and clinically) before adoption. An example would be the STK11 and KEAP1 mutations which appeared to be predictive with emerging data showing patients with STK11 and KEAP1 mutations do not respond to immunotherapy. However, an exploratory analysis using clinical trial data demonstrated that pembrolizumab monotherapy was associated with improved overall response rates compared with chemotherapy regardless of STK11 and KEAP1 mutational status, i.e., these mutations were prognostic 35. Additionally, an analysis using real world evidence also showed that STK11 and KEAP1 mutations are prognostic biomarkers and unlikely to be predictive biomarkers for anti-PD-1/anti-PD-L1 therapy 36. STK11 and KEAP1 remain unvalidated predictive biomarkers and clinicians’ treatment decisions should not be swayed by the mutation status of these two genes.

The discovery and validation of biomarkers require thorough planning and cross-disciplinary collaboration among clinicians, scientists, statisticians, and epidemiologists. A cohesive and effective team of collaborative scientists is crucial for biomarker development, and we promote such partnerships to accelerate the translation of cutting edge scientific discoveries from bench to bedside, leading to improved patient care and outcomes.

ACKNOWLEDGMENTS:

This work was partially supported by the National Institutes of Health Grant P30CA15083 (Mayo Comprehensive Cancer Center Grant; FSO, ALO, AAA), NCI Grant P50CA136393 (Mayo Clinic SPORE in Ovarian Cancer grant; ALO), NCI Grant P50CA102701 (Mayo Clinic SPORE in Pancreatic Cancer; ALO), NCI Grant U10CA180882 (Alliance Statistics and Data Management Center; FSO, ALO), NIH grant U24CA213274 (YS), NIH grant U54TR002243 (YS), and NIH grant P30CA068485 (YS)

Footnotes

CONFLICTS OF INTEREST:

FSO / SM / ALO/ AAA: None

Dr. Shyr reports personal fees from AstraZeneca, Eisai, Janssen, Novartis, Pfizer, Roche, outside the submitted work.

REFERENCES

  • 1. FDA-NIH Biomarker Working Group: BEST (Biomarkers, EndpointS, and other Tools) Resource, 2020
  • 2. National Comprehensive Cancer Network. Lung cancer screening. 2020. version. https://www.nccn.org/patients/guidelines/content/PDF/lung-early-stage-patient.pdf. Accessed 04JAN2021.
  • 3. Nicholson AG, Sauter JL, Nowak AK, et al.: EURACAN/IASLC Proposals for Updating the Histologic Classification of Pleural Mesothelioma: Towards a More Multidisciplinary Approach. J Thorac Oncol 15:29–49, 2020
  • 4. Remon J, Ahn MJ, Girard N, et al.: Advanced-Stage Non-Small Cell Lung Cancer: Advances in Thoracic Oncology 2018. J Thorac Oncol 14:1134–1155, 2019
  • 5. Pepe MS, Etzioni R, Feng Z, et al.: Phases of biomarker development for early detection of cancer. J Natl Cancer Inst 93:1054–61, 2001
  • 6. IOM (Institute of Medicine): in Micheel CM, Ball JR (eds): Evaluation of Biomarkers and Surrogate Endpoints in Chronic Disease. Washington (DC), 2010
  • 7. Food and Drug Administration, Biomarker qualification: evidentiary framework guidance for industry and FDA staff, https://www.fda.gov/regulatory-information/search-fda-guidance-documents/biomarkerqualification-evidentiary-framework, Accessed 04JAN2021.
  • 8. Rudin M: Imaging readouts as biomarkers or surrogate parameters for the assessment of therapeutic interventions. Eur Radiol 17:2441–57, 2007
  • 9. Simon RM, Paik S, Hayes DF: Use of archived specimens in evaluation of prognostic and predictive biomarkers. J Natl Cancer Inst 101:1446–52, 2009
  • 10. Teutsch SM, Bradley LA, Palomaki GE, et al.: The Evaluation of Genomic Applications in Practice and Prevention (EGAPP) Initiative: methods of the EGAPP Working Group. Genet Med 11:3–14, 2009
  • 11. Ransohoff DF, Gourlay ML: Sources of bias in specimens for research about molecular markers for cancer. J Clin Oncol 28:698–704, 2010
  • 12. Leek JT, Scharpf RB, Bravo HC, et al.: Tackling the widespread and critical impact of batch effects in high-throughput data. Nat Rev Genet 11:733–9, 2010
  • 13. Qin LX, Zhou Q, Bogomolniy F, et al.: Blocking and randomization to improve molecular biomarker discovery. Clin Cancer Res 20:3371–8, 2014
  • 14. Ransohoff DF: Bias as a threat to the validity of cancer molecular-marker research. Nat Rev Cancer 5:142–9, 2005
  • 15. Pecuchet N, Laurent-Puig P, Mansuet-Lupo A, et al.: Different prognostic impact of STK11 mutations in non-squamous non-small-cell lung cancer. Oncotarget 8:23831–23840, 2017
  • 16. Mok TS, Wu YL, Thongprasert S, et al.: Gefitinib or carboplatin-paclitaxel in pulmonary adenocarcinoma. N Engl J Med 361:947–57, 2009
  • 17. Storey JD, Tibshirani R: Statistical significance for genomewide studies. Proceedings of the National Academy of Sciences (PNAS) 100:9440–9445, 2003
  • 18. Harrell FE Jr.: Regression Modeling Strategies With Applications to Linear Models, Logistic and Ordinal Regression, and Survival Analysis (ed 2), Springer, 2015
  • 19. Steyerberg EW: Clinical Prediction Models: A Practical Approach to Development, Validation, and Updating (ed 2), Springer International Publishing, 2019
  • 20. Pepe M: The Statistical Evaluation of Medical Tests for Classification and Prediction. New York, NY, Oxford University Press, 2003
  • 21. Mahar AL, Compton C, McShane LM, et al.: Refining Prognosis in Lung Cancer: A Report on the Quality and Relevance of Clinical Prognostic Tools. J Thorac Oncol 10:1576–89, 2015
  • 22. Hastie T, Tibshirani R, Friedman J: The Elements of Statistical Learning: Data Mining, Inference, and Prediction (ed 2), Springer Series in Statistics, Springer, 2009
  • 23. James G, Witten D, Hastie T, et al.: An Introduction to Statistical Learning: with Applications in R. New York, Springer, 2013
  • 24. Little RJ, Rubin DB: Statistical analysis with missing data, John Wiley & Sons, 2019
  • 25. Boyiadzis MM, Kirkwood JM, Marshall JL, et al.: Significance and implications of FDA approval of pembrolizumab for biomarker-defined disease. J Immunother Cancer 6:35, 2018
  • 26. Le DT, Uram JN, Wang H, et al.: Programmed death-1 blockade in mismatch repair deficient colorectal cancer. Journal of Clinical Oncology 34:103–103, 2016. 26628472
  • 27. Marcus L, Lemery SJ, Keegan P, et al.: FDA Approval Summary: Pembrolizumab for the Treatment of Microsatellite Instability-High Solid Tumors. Clin Cancer Res 25:3753–3758, 2019
  • 28. Andre F, McShane LM, Michiels S, et al.: Biomarker studies: a call for a comprehensive biomarker study registry. Nat Rev Clin Oncol 8:171–6, 2011
  • 29. Pepe MS, Feng Z, Janes H, et al.: Pivotal evaluation of the accuracy of a biomarker used for classification or prediction: standards for study design. J Natl Cancer Inst 100:1432–8, 2008
  • 30. Sozzi G, Boeri M, Rossi M, et al.: Clinical utility of a plasma-based miRNA signature classifier within computed tomography lung cancer screening: a correlative MILD trial study. J Clin Oncol 32:768–73, 2014
  • 31. Rosell R, Carcereny E, Gervais R, et al.: Erlotinib versus standard chemotherapy as first-line treatment for European patients with advanced EGFR mutation-positive non-small-cell lung cancer (EURTAC): a multicentre, open-label, randomised phase 3 trial. Lancet Oncol 13:239–46, 2012
  • 32. Hoering A, Leblanc M, Crowley JJ: Randomized phase III clinical trial designs for targeted agents. Clin Cancer Res 14:4358–67, 2008
  • 33. Herbst RS, Redman MW, Kim ES, et al.: Cetuximab plus carboplatin and paclitaxel with or without bevacizumab versus carboplatin and paclitaxel with or without bevacizumab in advanced NSCLC (SWOG S0819): a randomised, phase 3 study. Lancet Oncol 19:101–114, 2018
  • 34. Ou FS, An MW, Ruppert AS, et al.: Discussion of Trial Designs for Biomarker Identification and Validation Through the Use of Case Studies. JCO Precis Oncol 3, 2019
  • 35. Collingridge D: 2020 ASCO Virtual Annual Meeting. The Lancet Oncology, 2020
  • 36. Papillon-Cavanagh S, Doshi P, Dobrin R, et al.: STK11 and KEAP1 mutations as prognostic biomarkers in an observational real-world lung adenocarcinoma cohort. ESMO Open 5, 2020
