We would first like to clarify that “sequencing coverage” and “sequencing depth” were used interchangeably throughout our manuscript; we therefore made no attempt to explain coverage in terms of depth. We did not attribute the lower sequencing coverage of 3 cancer types (Prostate Adenocarcinoma [PRAD], Lung Adenocarcinoma [LUAD], and Colon Adenocarcinoma [COAD]) to racial bias in the reference genome. Our study concluded that “For the other 3 cancers, the reasons of lower sequencing coverage of ancestrally African exomes remain unknown.”
Beyond overall sequencing depth (coverage), the second disparity we reported was that, even at comparable sequencing depth, exomes from ancestrally African patients were statistically significantly enriched for positions with insufficient coverage compared with exomes from ancestrally European patients. We discussed racial bias in the human reference genome as a potential factor in this context. Mitr et al. (1) correctly pointed out that a single male contributed approximately 70% of the sequences used to construct the reference genome, which makes the racial makeup of Buffalo, NY, USA, irrelevant. The genetic admixture of this individual was approximately 50% African and 50% European (2-4). DNA from an individual with primarily East Asian ancestry represents the second largest fraction of the reference (5.5% of the total), and the remaining sequences originated from donors with primarily European ancestry (3). In total, sequences of African ancestry constitute approximately one-third of the human reference genome, which makes the European bias of the reference not as substantial as we suggested but still a factor. Sequences of European ancestry account for approximately 60% of the reference genome, and therefore the capture kit design is substantially biased.
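As a rough check of the fractions quoted above, the arithmetic can be written out as follows. This is only an illustrative sketch: it treats the approximate percentages (70%, 50%/50%, 5.5%) as exact and assumes that all remaining sequence is wholly of European ancestry, a simplification that is not a figure taken from references 2-4.

```latex
\begin{align*}
\text{African}    &\approx 0.70 \times 0.50 = 0.35 \\
\text{East Asian} &\approx 0.055 \\
\text{European}   &\approx 0.70 \times 0.50 + (1 - 0.70 - 0.055) = 0.35 + 0.245 = 0.595
\end{align*}
```

Under these assumptions the African share is about 35% (roughly one-third) and the European share is about 60%, in line with the approximate values stated in the text.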
An additional factor that potentially contributed to our second observation is the greater genetic diversity among Africans relative to other ancestral groups. For example, among the 11 global populations in the HapMap project, those with African ancestry exhibit the greatest number of low-frequency and copy-number variants (5). Similarly, African populations within the 1000 Genomes Project have more novel variants than any other ancestral group (6). In addition, a pan-African genome assembled from 910 ancestrally African individuals contained 10% more DNA than the GRCh38 human reference genome build because of unique DNA sequences present in this cohort (7). Our study found that among loci within the common capture region of the 3 NimbleGen kits, the 1000 Genomes African population had statistically significantly higher minor allele frequencies than the 1000 Genomes European population, further underscoring the greater genetic diversity present among individuals of African descent. These facts indicate that the African-derived sequences in the human reference genome fail to adequately represent the heterogeneity of African ancestry, which likely contributed to the statistically significantly higher number of low-coverage positions in African exomes than in European exomes at comparable total sequencing depth.
Funding
Dr Yan Asmann’s base budget from Mayo Clinic Research Committee funded the writing of this response.
Notes
Role of the funder: The funder had no role in the writing of this response or the decision to submit it for publication.
Disclosures: The authors declare that they have no competing interests. M.E.S., who is a JNCI Associate Editor and co-author on this Response, was not involved in the editorial review or decision to publish the manuscript.
Author contributions: D.P.W.: writing, original draft. Y.W.A.: writing, original draft; writing, review and editing. M.E.S., D.C.R., and A.S.M.: writing, review and editing.
Contributor Information
Daniel P Wickland, Department of Quantitative Health Sciences, Mayo Clinic, Jacksonville, FL, USA.
Mark E Sherman, Department of Quantitative Health Sciences, Mayo Clinic, Jacksonville, FL, USA.
Derek C Radisky, Department of Cancer Biology, Mayo Clinic, Jacksonville, FL, USA.
Aaron S Mansfield, Division of Medical Oncology, Department of Oncology, Mayo Clinic, Rochester, MN, USA; Precision Cancer Therapeutics, Mayo Clinic Center for Individualized Medicine, Rochester, MN, USA.
Yan W Asmann, Department of Quantitative Health Sciences, Mayo Clinic, Jacksonville, FL, USA; Precision Cancer Therapeutics, Mayo Clinic Center for Individualized Medicine, Rochester, MN, USA.
Data Availability
No new data were generated or analyzed for this editorial. The data availability statement for the original article can be found in the authors’ original publication at https://doi.org/10.1093/jnci/djac054.
References
1. Mitr R, Pollack JR. RE: Lower exome sequencing coverage of ancestrally African patients in The Cancer Genome Atlas. J Natl Cancer Inst. 2022. doi: 10.1093/jnci/djac132.
2. Green RE, Krause J, Briggs AW, et al. A draft sequence of the Neandertal genome. Science. 2010;328(5979):710-722. doi: 10.1126/science.1188021.
3. Aganezov S, Yan SM, Soto DC, et al. A complete reference genome improves analysis of human genetic variation. Science. 2022;376(6588):eabl3533. doi: 10.1126/science.abl3533.
4. Schneider VA, Graves-Lindsay T, Howe K, et al. Evaluation of GRCh38 and de novo haploid genome assemblies demonstrates the enduring quality of the reference assembly. Genome Res. 2017;27(5):849-864. doi: 10.1101/gr.213611.116.
5. Altshuler DM, Gibbs RA, Peltonen L, et al.; International HapMap 3 Consortium. Integrating common and rare genetic variation in diverse human populations. Nature. 2010;467:52-58. doi: 10.1038/nature09298.
6. Auton A, Brooks LD, Durbin RM, et al.; 1000 Genomes Project Consortium. A global reference for human genetic variation. Nature. 2015;526(7571):68-74. doi: 10.1038/nature15393.
7. Sherman RM, Forman J, Antonescu V, et al. Assembly of a pan-genome from deep sequencing of 910 humans of African descent. Nat Genet. 2019;51(1):30-35. doi: 10.1038/s41588-018-0273-y.
