arising from Cheung et al. npj Precision Oncology 10.1038/s41698-023-00351-6 (2023)
AACR Project GENIE is an open-source, international, pan-cancer registry of real-world clinico-genomic data built through data sharing among a network of academic tertiary referral centers. We write in response to a recent report by Cheung and colleagues1, who analyzed the distribution of race and ethnicity in GENIE and benchmarked those distributions against 2017 U.S. cancer incidences from CDC WONDER (http://wonder.cdc.gov/cancer-v2017.html). On that basis, they claim that “GENIE is not sufficiently powered to detect small yet potentially clinically meaningful differences between white and non-white patients in even the most common cancer types.”
We disagree with Cheung et al.’s emphasis on powering comparisons for a Cohen’s h of less than 0.2. As per Cohen’s guidelines2, an h of 0.2 is the benchmark for a small effect size. Using this definition, comparisons of Black, Asian, and Hispanic primary tumor samples versus white samples from three of the five most common cancers are adequately powered to detect an effect of this size. This benchmark is also met in the metastatic setting for comparisons of white samples versus: Black and Asian non-small cell lung cancer samples; Black, Asian, and Hispanic breast cancer samples; and Black and Hispanic colorectal cancer samples.
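For readers less familiar with Cohen’s h, the following is a minimal sketch of how the effect size and the corresponding power can be computed for a comparison of two proportions. It is not the code used for the analyses described here. It assumes a two-sided test at α = 0.05 and a target power of 80%, conventional choices that are not stated in this letter, and it uses the normal approximation to the arcsine-transformed proportions. The function names and the sample counts in the usage example are hypothetical.

```python
from math import asin, ceil, sqrt
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal, mean 0 and sd 1


def cohens_h(p1: float, p2: float) -> float:
    """Cohen's h: difference between arcsine-transformed proportions."""
    return 2 * asin(sqrt(p1)) - 2 * asin(sqrt(p2))


def power_two_proportions(h: float, n1: int, n2: int, alpha: float = 0.05) -> float:
    """Approximate two-sided power to detect effect size h with group sizes n1, n2.

    Assumption: normal approximation; each transformed proportion has variance 1/n.
    """
    z_crit = _STD_NORMAL.inv_cdf(1 - alpha / 2)
    shift = abs(h) * sqrt(n1 * n2 / (n1 + n2))
    return _STD_NORMAL.cdf(shift - z_crit) + _STD_NORMAL.cdf(-shift - z_crit)


def n_per_group(h: float, alpha: float = 0.05, power: float = 0.80) -> int:
    """Approximate equal group size needed to detect h at the given alpha and power."""
    z_crit = _STD_NORMAL.inv_cdf(1 - alpha / 2)
    z_power = _STD_NORMAL.inv_cdf(power)
    return ceil(2 * ((z_crit + z_power) / abs(h)) ** 2)


if __name__ == "__main__":
    # Hypothetical counts, for illustration only (not GENIE data).
    n_white, n_other = 5000, 250
    print("Equal group size needed for h = 0.2:", n_per_group(0.2))
    print("Power with unequal groups:", round(power_two_proportions(0.2, n_white, n_other), 3))
```

Under these assumptions, roughly 390 samples per group are needed to detect h = 0.2 when the groups are equal in size. When one group is much larger than the other, as is typical for comparisons against white patient samples, power is driven mainly by the size of the smaller group.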
Cheung et al. used the v9.1 public release (January 2021) for their analysis. GENIE has updated public releases every 6 months, which include both new patients and samples. Using the most recent release from January 2023 (v13.0-public)3, several previously underpowered comparisons with white patient samples (Black primary prostate cancer samples, Hispanic primary pancreatic cancer and metastatic non-small cell lung cancer, and Asian metastatic colorectal cancer samples) are now sufficiently powered for Cohen’s h of 0.2. As GENIE continues to collect data and as new centers with currently underrepresented patient populations join the consortium, each release will include more patients across all race and ethnicity groups. This will allow for smaller effect sizes to be detected, as well as for additional comparisons to be adequately powered.
We agree with Cheung and colleagues on the many challenges to collecting and analyzing self-reported race and ethnicity data. AACR Project GENIE uses the standards established by the North American Association of Central Cancer Registries (NAACCR) whenever possible to define specific data elements, including self-reported race and ethnicity (https://www.synapse.org/#!Synapse:syn50678640). Collection of these data at international GENIE institutions is further complicated by varying European Member State laws4. Given the complexities and missingness in self-reported race and ethnicity data, GENIE is currently building infrastructure to impute genetic ancestry from off-target next-generation sequencing (NGS) panel reads5.
Clinical research datasets have inherent biases that may limit their generalizability. It is important to consider these limitations when evaluating the appropriateness of a dataset for its intended use6. Inclusion in the GENIE Registry requires that a patient’s tumor undergo NGS testing. As such, the data reflect patient populations and practice patterns at participating institutions, which may not represent the broader population of patients who are diagnosed with and treated for cancer7. Biobanking studies demonstrate that participation bias can lead to false-positive inferences about genetic associations and phenotype8,9.
The GENIE Consortium recently issued an open call to expand its membership by adding institutions whose clinical and genomic data come from cancer patient populations that are consistent with the national average of minority and underrepresented patients treated for cancer, or that are drawn from rural areas10,11. Four institutions were selected and are currently being onboarded, with the first release incorporating data from these patients anticipated in January 2024. The GENIE Consortium is also pursuing several parallel pathways to link select patient- and area-level social determinants of health variables, which should allow for a more comprehensive evaluation of factors that influence variation in outcomes.
The members of the AACR Project GENIE Consortium fully believe in the need for high-quality real-world clinico-genomic data, and agree with Cheung et al. on the importance of racial and ethnic representation in such databases. GENIE and its partnering institutions will continue to adapt in conjunction with changes to practice and health policy so that clinico-genomic data can be captured for as broad and representative a population of patients as is feasible. We look forward to continuing to serve the research community for years to come by providing a publicly available source of high-quality clinico-genomic data.
Author contributions
S.M.S. - was the lead author/writer on the paper. J.A.L. - provided statistical support and expertise for the manuscript. H.E.F. - provided statistical support and expertise for the manuscript. J.A.L. - provided writing support, editing, and referencing. S.B. - provided statistical support and expertise for the manuscript. K.S.P. - provided statistical support and expertise for the manuscript. C.L.S. - provided writing support, editing, and referencing. P.L.B. - provided writing support, editing, and referencing.
Competing interests
S.M.S. is the Senior Director of AACR Project GENIE and has an immediate family member who works for ConcertAI. AACR Project GENIE received research funding from Amgen, Inc., Bristol-Myers Squibb Company, Bayer HealthCare Pharmaceuticals Inc., Merck Sharp & Dohme Corp., Pfizer, AstraZeneca UK Limited, Genentech, Novartis, Boehringer Ingelheim, and Janssen Pharmaceuticals, Inc. A portion of J.A.L.’s support is paid by AACR Project GENIE. A portion of H.E.F.’s support is paid by AACR Project GENIE. J.A.L. owns stock in Abbott and Gilead Sciences. J.A.L. is the Associate Director of AACR Project GENIE. A portion of S.B.’s support is paid by AACR Project GENIE. K.S.P. owns stock in Adicet Bio, Codexis, Chinook Therapeutics, T2 Biosystems, Vincerx Pharma, and 23andMe. A portion of K.S.P.’s support is paid by AACR Project GENIE. C.L.S. serves on the Board of Directors of Novartis, is a co-founder of ORIC Pharmaceuticals, and is a co-inventor of enzalutamide and apalutamide. He is a science advisor to Arsenal, Beigene, Blueprint, Column Group, Foghorn, Housey Pharma, Nextech, KSQ, and PMV. P.L.B. serves as science advisor to Seattle Genetics, Eli Lilly and Co., Amgen, Inc., Gilead Sciences, Bristol-Myers Squibb Company, Pfizer, AstraZeneca UK Limited, Genentech/Roche, GlaxoSmithKline, Novartis, Merck Sharp & Dohme Corp., Bicara, Zymeworks, Medicenna, and Bayer HealthCare Pharmaceuticals Inc.
Footnotes
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
References
- 1. Cheung ATM, et al. Racial and ethnic disparities in a real-world precision oncology data registry. NPJ Precis. Oncol. 2023;7:7. doi: 10.1038/s41698-023-00351-6.
- 2. Cohen J. Statistical Power Analysis for the Behavioral Sciences, 2nd edn (Routledge, 1988).
- 3. AACR Project GENIE. Public Release v13.0. doi: 10.7303/syn50678294 (2023).
- 4. Farkas L. Analysis and comparative review of equality data collection practices in the European Union: data collection in the field of ethnicity (Publications Office of the European Union, 2020).
- 5. Carrot-Zhang J, et al. Genetic ancestry contributes to somatic mutations in lung cancers from admixed Latin American populations. Cancer Discov. 2021;11:591–598. doi: 10.1158/2159-8290.CD-20-1165.
- 6. Framework for FDA’s Real-World Evidence Program (U.S. Food & Drug Administration, 2018).
- 7. Levine DM, et al. Community-academic health center partnerships for underserved minority populations: one solution to a national crisis. JAMA. 1994;272:309–311. doi: 10.1001/jama.1994.03520040071043.
- 8. Pirastu N, et al. Genetic analyses identify widespread sex-differential participation bias. Nat. Genet. 2021;53:663–671. doi: 10.1038/s41588-021-00846-7.
- 9. Schoeler T, et al. Correction for participation bias in the UK Biobank reveals non-negligible impact on genetic associations and downstream analyses. Nat. Hum. Behav. 2023;7:1216–1227. doi: 10.1038/s41562-023-01579-9.
- 10. Cancer Facts and Figures for African Americans 2019-2021 (American Cancer Society, 2019).
- 11. Siegel RL, Miller KD, Fuchs HE, Jemal A. Cancer statistics, 2021. CA Cancer J. Clin. 2021;71:7–33. doi: 10.3322/caac.21654.