Abstract
Symposium 3 of the 53rd Annual Meeting of the Japanese Environmental Mutagen and Genome Society (JEMS), entitled “Potential for Computational Genotoxicity,” was held at Shujitsu University, Okayama, Japan, on December 8, 2024. The symposium discussed how advanced informatics technologies, such as (quantitative) structure-activity relationship ((Q)SAR) modeling and error-corrected next-generation sequencing (ecNGS), can be applied to genotoxicity assessment within the framework of computational genotoxicity. We invited three scientists who are global leaders in the field of computational genotoxicity. This report summarizes the key presentations and discussions from the symposium. The organizers hope this summary will increase awareness of computational genotoxicity.
Keywords: (Q)SAR, ecNGS, Computational genotoxicity, CPCA, AI, Regulatory application, Genotoxicity prediction, Domain knowledge, Carcinogenesis, Hazard identification
Background
The 53rd Annual Meeting of the Japanese Environmental Mutagen and Genome Society (JEMS) was held on December 7 and 8, 2024 in Okayama, Japan, with the theme “Saving the Environment & Genome in the Future.” In recent years, genotoxicity studies have increasingly emphasized the use of (quantitative) structure-activity relationship ((Q)SAR) modeling for predicting Ames mutagenicity [1, 2] and error-corrected next-generation sequencing (ecNGS) for detecting low-frequency gene mutations based on the principles of double-strand consensus sequencing [3–10]. Symposium 3 brought together experts from cheminformatics and bioinformatics to discuss the latest advances in informatics and ecNGS within the framework of computational genotoxicity and their regulatory applications. The symposium aimed to create a computational genotoxicity platform for a more data-informed approach to genotoxicity assessment for drug products and various environmental mutagens. The symposium program was as follows:
Potential for Computational Genotoxicity (Introduction), Naoki KOYAMA, Safety and Bioscience Research Dept., Translational Research Division, Chugai Pharmaceutical Co., Ltd.
US FDA Experience in the Regulatory Application of (Q)SAR, Naomi Louise KRUHLAK, US Food and Drug Administration, Center for Drug Evaluation and Research.
Toward Fully Automated Genotoxicity Prediction, Nicolas Ken SHINADA, SBX Corporation.
Importance of Domain Knowledge in Data Analysis: ecNGS Analysis, Kazuki IZAWA, Division of Genome Safety Science, National Institute of Health Sciences.
Here, we summarize the symposium presentations and discuss future perspectives for research in computational toxicology.
Opening Address
Naoki KOYAMA (Chugai Pharmaceutical Co., Ltd.)
Dr. Naoki Koyama, the symposium chair, opened the event and introduced the basic concepts of computational genotoxicity. He first distinguished between “computational toxicology” and “computational genotoxicity”. While computational toxicology is a recognized field concerned with using computer-based models to predict the interactions of biological organisms (at population, individual, cellular, and molecular levels) with chemical agents [11], “computational genotoxicity” has not yet been formally defined. Dr. Koyama described computational genotoxicity as a field that integrates a wide variety of informatics data to elucidate, analyze, and predict genotoxicity-related phenomena from multiple perspectives. He emphasized how advanced informatics technologies, such as (Q)SAR and ecNGS, can be applied within this framework and adapted to the field of genotoxicity.
US FDA Experience in the Regulatory Application of (Q)SAR
Naomi Louise KRUHLAK (US Food and Drug Administration, Center for Drug Evaluation and Research)
Dr. Naomi Kruhlak, a pioneer of (Q)SAR research in genetic toxicology, delivered a comprehensive presentation on the methodology and its practical application in the regulation of pharmaceuticals. She explained that (Q)SAR computational models can predict toxicity based solely on a chemical’s structure. The U.S. Food and Drug Administration’s Center for Drug Evaluation and Research (FDA/CDER) uses (Q)SAR models to provide predictions for chemicals under review, such as drug impurities, when robust experimental data are unavailable [12, 13]. Dr. Kruhlak highlighted that the M7 guideline of the International Council for Harmonisation of Technical Requirements for Pharmaceuticals for Human Use (ICH), first published in 2014, recommends (Q)SAR approaches as alternatives to the Ames test for evaluating the mutagenicity of impurities [14]. This recommendation has led to growing interest in (Q)SAR modeling for drug development and regulation [12, 13]. Dr. Kruhlak also presented her most recent work on modeling the carcinogenic potency of nitrosamines, a sub-class of mutagenic impurities, which resulted in the regulatory implementation of a categorical SAR model, the Carcinogenic Potency Categorization Approach (CPCA), to determine acceptable intake limits [15, 16]. In closing, she shared practical considerations for the regulatory use of (Q)SAR models to promote transparency and enhance predictive performance [17, 18].
Toward Fully Automated Genotoxicity Prediction
Nicolas Ken SHINADA (SBX Corporation)
Dr. Nicolas Ken Shinada addressed the limitations of traditional genotoxicity assessment methods, such as the Ames test, which are expensive, time-consuming, and labor-intensive. He emphasized that reliable in silico methods, such as artificial intelligence (AI) tools, are fundamental to the drug design process. Dr. Shinada presented his recent research utilizing machine learning (ML) and natural language processing (NLP) to develop cost-effective and scalable in silico mutagenicity prediction models [19]. However, constructing reliable models for genotoxicity remains challenging because substantial training data are needed. While structured resources such as the Hansen mutagenicity benchmark dataset [20] and (Q)SAR databases offer some data for Ames mutagenicity, a vast amount of valuable information embedded in the scientific literature remains underutilized, primarily because extracting structured data from unstructured text is complex. He presented a novel text-mining framework using a transformer-based model, MutaPredBERT [21], which reformulates the task as a biomedical question-answering problem to extract mutagenicity information directly from scientific abstracts. The model achieved a macro F1-score of 0.88 on a curated dataset. Additionally, Dr. Shinada highlighted a 2022 ML study on optimizing molecular fingerprint-based models through systematic feature selection, in which techniques such as descriptor aggregation and recursive feature elimination significantly improved prediction performance [19]. He emphasized the importance of a multidisciplinary strategy combining cheminformatics and NLP to modernize mutagenicity assessment.
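For readers unfamiliar with the metric, the macro F1-score reported for the text-mining model is the unweighted mean of the per-class F1-scores, so a minority class (for example, mutagenic compounds) counts as much as the majority class. The following is a minimal sketch of that calculation only; the function name `macro_f1` and the example labels are illustrative assumptions and are not taken from the presented work.

```python
from collections import Counter


def macro_f1(y_true, y_pred):
    """Return the unweighted mean of per-class F1 scores.

    Illustrative helper (hypothetical name), not the evaluation code
    used in the studies summarized in this report.
    """
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("y_true and y_pred must be non-empty and equal in length")

    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1          # correct call for this class
        else:
            fp[pred] += 1           # predicted class gains a false positive
            fn[truth] += 1          # true class gains a false negative

    labels = set(y_true) | set(y_pred)
    scores = []
    for label in labels:
        denom = 2 * tp[label] + fp[label] + fn[label]
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class is never hit
        scores.append(2 * tp[label] / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Made-up labels for five abstracts (not real data)
y_true = ["mutagenic", "mutagenic", "non-mutagenic", "non-mutagenic", "non-mutagenic"]
y_pred = ["mutagenic", "non-mutagenic", "non-mutagenic", "non-mutagenic", "mutagenic"]
print(round(macro_f1(y_true, y_pred), 3))  # 0.583
```

In this toy case the per-class F1-scores are 0.5 (mutagenic) and about 0.667 (non-mutagenic), and the macro average weights them equally regardless of class size.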
Importance of Domain Knowledge in Data Analysis: ecNGS Analysis
Kazuki IZAWA (Division of Genome Safety Science, National Institute of Health Sciences)
Dr. Kazuki Izawa gave a presentation on ecNGS-based data analysis. He focused on ecNGS, an advanced informatics technology capable of detecting genome-wide low-frequency mutations by double-strand consensus sequencing. Dr. Izawa emphasized that domain knowledge is crucial for developing ecNGS technologies. This includes a deep understanding of chemical mutagens, genome structure, mutation types, and potential artifacts in the data analysis pipeline. Specifically, he explained that DNA synthesis steps in library preparation, such as end-repair, introduce artifacts that remain as residual errors in low-frequency mutation detection. He demonstrated that identifying and filtering these artifacts is essential for accurate mutation quantification. Dr. Izawa also introduced research analyzing genome-wide mutation profiles associated with specific chemical mutagens, which provides a broader view of genomic effects than single-locus analysis [22]. His presentation highlighted the critical synergy between data-driven science and human expertise, emphasizing the need for genotoxicity researchers to collaborate with informatics scientists to fully harness the potential of these technologies.
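The principle of double-strand consensus sequencing can be illustrated with a toy sketch: reads are grouped by their original DNA molecule and strand, a consensus is called for each strand, and a base is accepted only when both strand consensuses agree. The code below is a simplified illustration under stated assumptions, not the pipeline presented at the symposium or any published tool. The function names, the tuple layout `(molecule_tag, strand, sequence)`, and the thresholds are hypothetical, and reads are assumed to be equal in length and already oriented to the reference strand.

```python
from collections import Counter, defaultdict


def strand_consensus(reads, min_fraction=0.9):
    """Call a per-position majority base; emit 'N' where agreement is weak."""
    consensus = []
    for column in zip(*reads):
        base, count = Counter(column).most_common(1)[0]
        consensus.append(base if count / len(column) >= min_fraction else "N")
    return "".join(consensus)


def duplex_consensus(reads, min_reads_per_strand=2):
    """Build a duplex consensus per molecule from (tag, strand, sequence) tuples.

    strand must be '+' or '-'. Molecules lacking enough reads on either
    strand are skipped. All thresholds are illustrative assumptions.
    """
    families = defaultdict(lambda: {"+": [], "-": []})
    for tag, strand, seq in reads:
        families[tag][strand].append(seq)

    result = {}
    for tag, family in families.items():
        if min(len(family["+"]), len(family["-"])) < min_reads_per_strand:
            continue  # cannot confirm the molecule on both strands
        top = strand_consensus(family["+"])
        bottom = strand_consensus(family["-"])
        # Keep a base only if both strands support it; otherwise mask with 'N'
        result[tag] = "".join(
            a if a == b and a != "N" else "N" for a, b in zip(top, bottom)
        )
    return result


# Made-up reads: m2 carries a change seen on one strand only; m3 has one strand
reads = [
    ("m1", "+", "ACGT"), ("m1", "+", "ACGT"), ("m1", "-", "ACGT"), ("m1", "-", "ACGT"),
    ("m2", "+", "ACTT"), ("m2", "+", "ACTT"), ("m2", "-", "ACGT"), ("m2", "-", "ACGT"),
    ("m3", "+", "ACGT"), ("m3", "+", "ACGT"),
]
print(duplex_consensus(reads))  # {'m1': 'ACGT', 'm2': 'ACNT'}
```

This toy check only masks changes supported by a single strand. It does not address errors that end up on both strands, which is one reason the artifact identification and filtering described above remain necessary.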
Conclusion
The 53rd Annual Meeting of JEMS in 2024 was a significant success, attracting over 200 participants. Symposium 3 effectively brought together key stakeholders and researchers from various sectors to discuss the current state of computational genotoxicity. Participants gained an understanding of computational genotoxicity, a new and rapidly evolving field. The symposium showed how informatics technologies like (Q)SAR and ecNGS are revolutionizing genotoxicity prediction. Dr. Koyama introduced the foundational concepts, while Dr. Kruhlak shared practical regulatory experience with (Q)SAR. Dr. Shinada presented cutting-edge research on ML prediction models, and Dr. Izawa highlighted the indispensable role of domain knowledge in interpreting complex ecNGS data. The symposium demonstrated that computational genotoxicity is an emerging interdisciplinary field that integrates informatics and biological expertise to enable more efficient and accurate genotoxicity evaluation.
The symposium organizers, Dr. Naoki Koyama and Dr. Ayako Furuhama, extend their sincere thanks to Professor Masahiko Watanabe, the President of the 53rd Annual Meeting of JEMS, the organizing committee, and the staff of Senkyo Co., Ltd., for their support and contributions to the symposium’s success. As the organizers, we extend our sincere gratitude to all who participated in this symposium.
Acknowledgements
We would like to express our sincere thanks to everyone who supported and attended the symposium.
Abbreviations
- AI: Artificial intelligence
- CPCA: Carcinogenic Potency Categorization Approach
- ecNGS: Error-corrected next-generation sequencing
- ICH: International Council for Harmonisation of Technical Requirements for Pharmaceuticals for Human Use
- JEMS: The Japanese Environmental Mutagen and Genome Society
- ML: Machine learning
- NLP: Natural language processing
- (Q)SAR: (Quantitative) structure-activity relationship
Author contributions
The symposium was organized and chaired by NK and AF. NK, NLK, NS, and KI gave presentations. The manuscript was drafted by NK and AF and was reviewed and revised by all authors.
Funding
JEMS provided financial support for the symposium.
Data availability
No datasets were generated or analysed during the current study.
Declarations
Ethics approval and consent to participate
Not applicable.
FDA disclaimer
This manuscript reflects the views of the authors and should not be construed to represent FDA’s views or policies.
Consent for publication
Not applicable.
Competing interests
The authors declare no competing interests.
Footnotes
Publisher’s note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Naoki Koyama and Ayako Furuhama contributed equally to this work.
References
- 1. Landry C, Kim MT, Kruhlak NL, Cross KP, Saiakhov R, Chakravarti S, Stavitskaya L. Transitioning to composite bacterial mutagenicity models in ICH M7 (Q)SAR analyses. Regul Toxicol Pharmacol. 2019;109:1–13.
- 2. Honma M, Kitazawa A, Cayley A, Williams RV, Barber C, Hanser T, Saiakhov R, Chakravarti S, Myatt GJ, Cross KP, Benfenati E, Raitano G, Mekenyan O, Petkov P, Bossa C, Benigni R, Battistelli CL, Giuliani A, Tcheremenskaia O, DeMeo C, Norinder U, Koga H, Jose C, Jeliazkova N, Kochev N, Paskaleva V, Yang C, Daga PR, Clark RD, Rathman J. Improvement of quantitative structure-activity relationship (QSAR) tools for predicting Ames mutagenicity: outcomes of the Ames/QSAR International Challenge Project. Mutagenesis. 2019;34:3–16.
- 3. Kinde I, Wu J, Papadopoulos N, Kinzler KW, Vogelstein B. Detection and quantification of rare mutations with massively parallel sequencing. Proc Natl Acad Sci U S A. 2011;108:9530–5.
- 4. Schmitt MW, Kennedy SR, Salk JJ, Fox EJ, Hiatt JB, Loeb LA. Detection of ultra-rare mutations by next-generation sequencing. Proc Natl Acad Sci U S A. 2012;109:14508–13.
- 5. Kennedy SR, Schmitt MW, Fox EJ, Kohrn BF, Salk JJ, Ahn EH, et al. Detecting ultralow-frequency mutations by duplex sequencing. Nat Protoc. 2014;9:2586–606.
- 6. Hoang ML, Kinde I, Tomasetti C, McMahon KW, Rosenquist TA, Grollman AP, et al. Genome-wide quantification of rare somatic mutations in normal human tissues using massively parallel sequencing. Proc Natl Acad Sci U S A. 2016;113:9846–51.
- 7. Matsumura S, Sato H, Otsubo Y, Tasaki J, Ikeda N, Morita O. Genome-wide somatic mutation analysis via Hawk-Seq™ reveals mutation profiles associated with chemical mutagens. Arch Toxicol. 2019;93:2689–701.
- 8. You X, Thiruppathi S, Liu W, Cao Y, Naito M, Furihata C, et al. Detection of genome-wide low-frequency mutations with paired-end and complementary consensus sequencing (PECC-Seq) revealed end-repair-derived artifacts as residual errors. Arch Toxicol. 2020;94:3475–85.
- 9. Abascal F, Harvey LMR, Mitchell E, Lawson ARJ, Lensing SV, Ellis P, et al. Somatic mutation landscapes at single-molecule resolution. Nature. 2021;593:405–10.
- 10. Marchetti F, Cardoso R, Chen CL, Douglas GR, Elloway J, Escobar PA, et al. Error-corrected next generation sequencing - Promises and challenges for genotoxicity and cancer risk assessment. Mutat Res Rev Mutat Res. 2023;792:108466.
- 11. Nature Portfolio. Computational toxicology articles from across Nature Portfolio. 2025. [accessed 9 Feb 2025] Available: https://www.nature.com/subjects/computational-toxicology.
- 12. Rouse R, Kruhlak N, Weaver J, Burkhart K, Patel V, Strauss D. Translating New Science Into the Drug Review Process: The US FDA’s Division of Applied Regulatory Science. Ther Innov Regul Sci. 2017;52:244–55.
- 13. Chiu K, Racz R, Burkhart K, Florian J, Ford K, Garcia MI, Geiger RM, Howard KE, Hyland PL, Ismaiel OA, Kruhlak NL, Li Z, Matta MK, Prentice KW, Shah A, Stavitskaya L, Volpe DA, Weaver JL, Wu WW, Rouse R, Strauss DG. New science, drug regulation, and emergent public health issues: The work of FDA’s Division of Applied Regulatory Science. Front Med. 2023;9:1109541.
- 14. ICH M7 (R2); ICH Harmonized Guideline. Assessment and control of DNA reactive (mutagenic) impurities in pharmaceuticals to limit potential carcinogenic risk. Adopted on 3 April 2023. https://database.ich.org/sites/default/files/ICH_M7%28R2%29_Guideline_Step4_2023_0216_0.pdf
- 15. Kruhlak NL, Schmidt M, Froetschl R, Graber S, Haas B, Horne I, Horne S, King ST, Koval IA, Kumaran G, Langenkamp A, McGovern TJ, Peryea T, Sanh A, Siqueira Ferreira A, van Aerts L, Vespa A, Whomsley R. Determining recommended acceptable intake limits for N-nitrosamine impurities in pharmaceuticals: Development and application of the Carcinogenic Potency Categorization Approach (CPCA). Regul Toxicol Pharmacol. 2024;150:105640.
- 16. US Food and Drug Administration. Recommended acceptable intake limits for nitrosamine drug substance-related impurities (NDSRIs): Guidance for Industry. 2023. https://www.fda.gov/regulatory-information/search-fda-guidance-documents/recommended-acceptable-intake-limits-nitrosamine-drug-substance-related-impurities
- 17. Jayasekara PS, Skanchy SK, Kim MT, Kumaran G, Mugabe BE, Woodard LE, Yang J, Zych AJ, Kruhlak NL. Assessing the impact of expert knowledge on ICH M7 (Q)SAR predictions. Is expert review still needed? Regul Toxicol Pharmacol. 2021;125:105006.
- 18. Hasselgren C, Bercu J, Cayley A, Cross KP, Glowienke S, Kruhlak NL, Muster W, Nicolette J, Reddy MV, Saiakhov R, Dobo K. Management of pharmaceutical ICH M7 (Q)SAR predictions – the impact of model updates. Regul Toxicol Pharmacol. 2020;118:104807.
- 19. Shinada NK, Koyama N, Ikemori M, Nishioka T, Hitaoka S, Hakura A, Asakura S, Matsuoka Y, Palaniappan SK. Optimizing machine-learning models for mutagenicity prediction through better feature selection. Mutagenesis. 2022;37:191–202.
- 20. Hansen K, Mika S, Schroeter T, Sutter A, ter Laak A, Steger-Hartmann T, Heinrich N, Müller KR. Benchmark data set for in silico prediction of Ames mutagenicity. J Chem Inf Model. 2009;49:2077–81.
- 21. Acharya S, Shinada NK, Koyama N, Ikemori M, Nishioka T, Hitaoka S, Hakura A, Asakura S, Matsuoka Y, Palaniappan SK. Asking the right questions for mutagenicity prediction from BioMedical text. NPJ Syst Biol Appl. 2023;18(1):9.
- 22. Izawa K, Tsuda M, Suzuki T, Honma M, Sugiyama KI. Detection of in vivo mutagenicity in rat liver samples using error-corrected sequencing techniques. Genes Environ. 2023;45:30.
