Author manuscript; available in PMC: 2025 Nov 26.
Published in final edited form as: Regul Toxicol Pharmacol. 2025 Nov 8;164:105985. doi: 10.1016/j.yrtph.2025.105985

APPLICATION OF ERROR-CORRECTED SEQUENCING TECHNOLOGIES FOR IN VIVO REGULATORY MUTAGENICITY ASSESSMENT

Carole L Yauk 1, Anthony M Lynch 2, Vasily N Dobrovolsky 3, Maik Schuler 4, Stephanie L Smith-Roe 5, Devon Fitzgerald 6, Jake Higgins 6, Naveed Honarvar 7, Frank Le Curieux 8, Shoji Matsumura 9, Sheroy Minocherhomji 10, Leslie Recio 11, Jesse J Salk 12, Kei-ichi Sugiyama 13, Takayoshi Suzuki 13, John W Wills 2, Francesco Marchetti 14,*
PMCID: PMC12645987  NIHMSID: NIHMS2124110  PMID: 41213378

Abstract

Error-corrected sequencing (ECS) is a transformative method for in vivo mutagenicity assessment, enabling direct, highly sensitive measurement of mutation frequency and spectrum. ECS addresses key limitations of the transgenic rodent (TGR) assay, including lack of integration into standard toxicity studies, restricted model availability, and limited alignment with the 3R principles. To support regulatory acceptance, an expert workgroup of the International Workshop on Genotoxicity Testing (IWGT) reviewed ECS technologies and developed consensus recommendations for their inclusion in Organisation for Economic Co-operation and Development (OECD) test guidelines. The workgroup agreed that ECS produces results that are concordant with validated TGR assays and can be incorporated into standard ≥28-day repeat-dose toxicity studies, and that data interpretation should be based on overall mutation frequency compared with concurrent vehicle controls. The workgroup emphasized harmonized data reporting aligned with OECD principles and endorsed study designs that enable quantitative risk assessment. Overall, the workgroup agreed that ECS offers a significant advancement over current mutagenicity assays by enabling the use of diverse models beyond conventional TGR systems described in OECD test guideline 488. The workgroup fully supports the application of ECS to generate in vivo mutagenicity data for regulatory submissions and recommends its inclusion in future OECD test guidelines.

Keywords: Duplex sequencing, Hawk-seq, Jade-seq, PECC sequencing, SMM-seq, HiFi-seq, mutation

1. EXECUTIVE SUMMARY

There is broad agreement in the scientific community that error-corrected sequencing (ECS) is poised to transform mutagenesis research for chemical hazard identification and risk assessment. ECS offers a promising alternative to conventional approaches by enabling mutation analyses across tissues and experimental designs without reliance on transgenic rodent (TGR) models. This flexibility is particularly important in the context of evolving regulatory and ethical demands, including the global push to reduce animal use in testing. To support the inclusion of ECS in the Organisation for Economic Co-operation and Development (OECD) Test Guidelines (TG), an expert working group of the International Workshop on Genotoxicity Testing (IWGT) reviewed existing ECS technologies to evaluate their readiness for regulatory application. The present manuscript reports the unanimous consensus reached by the expert working group on the proper conduct of ECS studies, focusing on available ECS methodologies, experimental design, and evaluation and interpretation of results.

The working group reached the following conclusions on ECS technologies:

  1. There is sufficient evidence to indicate that ECS produces concordant results with those obtained using TGR assays.

  2. The input for the library preparation portion of the assay should be genomic DNA, ideally in a quantity that is readily attainable from commonly assessed tissues.

  3. The assay may be non-targeted (genome-scale) or target-based (defined regions of the genome) but should be capable of detecting mutation frequencies (MF) on the order of 10⁻⁷ or lower in vehicle controls.

  4. The assay must effectively account for, and remove, germline variants.

  5. The assay and bioinformatic tools/settings should be transferable to enable inter-laboratory reproducibility as per an OECD reporting framework.

The working group reached consensus on key aspects of ECS study design, emphasizing:

  1. Integration of ECS within a standard 28-day repeat-dose rodent toxicity study constitutes a valid mutagenicity test. Longer exposure durations (i.e., 90 or 180 days) are also acceptable.

  2. Exposure durations of at least 28 days do not require an expression time except for germ cells when the 28-day exposure duration is employed. In the latter case, an expression time of 28 days, as recommended in TG 488, should be used.

  3. Exposure durations shorter than 28 days represent a valid test, provided they produce a positive result. A study with a minimum 28-day exposure duration is required to demonstrate the absence of mutagenicity.

  4. The number of animals per group should be chosen to enable the detection of a 2-fold change in MF with 80% power.

  5. The use of a positive control is required during establishment of proficiency in the conduct of the ECS assay.
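The power criterion in item 4 above can be illustrated with a rough sample-size sketch. It assumes a two-group, two-sided comparison of log-transformed MF and uses the standard normal-approximation formula; the function name `animals_per_group` and the between-animal standard deviations are hypothetical placeholders, not values or methods from the working group. The normal approximation tends to underestimate group sizes when n is small, so a dedicated power analysis on laboratory data would still be needed.

```python
import math
from statistics import NormalDist


def animals_per_group(sigma_log_mf: float, fold_change: float = 2.0,
                      alpha: float = 0.05, power: float = 0.80) -> int:
    """Animals per group for a two-sided, two-group comparison of ln(MF).

    Normal approximation:
        n = 2 * (z_(1 - alpha/2) + z_power)^2 * sigma^2 / ln(fold_change)^2
    where sigma is the between-animal standard deviation of ln(MF).
    """
    if sigma_log_mf <= 0 or fold_change <= 1:
        raise ValueError("sigma must be positive and fold_change greater than 1")
    standard_normal = NormalDist()
    z_alpha = standard_normal.inv_cdf(1 - alpha / 2)
    z_power = standard_normal.inv_cdf(power)
    n = 2 * (z_alpha + z_power) ** 2 * sigma_log_mf ** 2 / math.log(fold_change) ** 2
    return math.ceil(n)


# Hypothetical between-animal SDs of ln(MF); replace with laboratory data.
for sigma in (0.3, 0.4, 0.5):
    print(f"sigma = {sigma}: {animals_per_group(sigma)} animals per group")
```

The required group size grows with the square of the between-animal variability, which is one reason historical control data are useful when planning a study.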

Regarding the evaluation and interpretation of results, the working group concluded that:

  1. The overall MF and the concurrent vehicle control should be used for statistical analysis and data interpretation of positive and negative results.

  2. Historical negative control (HNC) data sets are valuable for understanding if the ECS method used by a laboratory is “under control”.

  3. For target-based approaches, MF should be calculated using the assumption that the same mutation present in multiple consensus reads within a sample arose from a clonal expansion event, and therefore, only one instance of the mutation should be counted.

  4. Using appropriate statistical analysis with multiple testing correction, dose-dependent changes observed for one mutation subtype (using simple spectrum) can result in a positive call, even if the overall MF does not meet the criteria for a positive call (e.g., equivocal calls).

  5. More research is needed to determine if dose-dependent increases in MF at a single target, or a small set of targets, should lead to a positive call even if the overall MF does not meet criteria for a positive call.

  6. A data reporting framework for ECS studies should be developed using the “OECD Omics Reporting Framework for Transcriptomics and Metabolomics in Regulatory Toxicology” as a template.

  7. Although mutagenicity testing applies an experimental design aimed at hazard identification, the working group recommends the adoption of experimental designs appropriate for quantitative dose-response analysis (e.g., Benchmark dose modelling) when such designs would improve the applicability of the data to inform human health risk assessment.

The conclusions and recommendations presented in this position paper lay the foundation for integrating ECS into regulatory mutagenicity testing. Adoption of ECS will expand the scope of in vivo mutagenicity studies beyond the limitation of conventional TGR systems and improve data quality, flexibility and relevance to human health risk assessment. The expert working group encourages continued international collaboration to develop standardized guidance, reporting frameworks and validation studies to support integration of ECS into OECD TG.

2. INTRODUCTION

Genetic toxicology testing is a critical component of drug development and cancer risk assessment. In vivo genotoxicity testing is typically required by regulatory authorities when in vitro data demonstrate a chemical’s mutagenic potential. Its purpose is to determine whether the mutagenic effects observed in cultured bacteria and/or in vitro mammalian cells also occur in whole organisms.

Over the last five decades, the Organisation for Economic Co-operation and Development (OECD) has established various test guidelines (TGs) for evaluating in vivo mutagenicity. These guidelines encompass a range of mammalian tests that detect DNA damage and chromosomal aberrations, such as the alkaline comet assay (TG 489) (OECD, 2016e), the erythrocyte micronucleus test (TG 474) (OECD, 2016a), the bone marrow chromosome aberration test (TG 475) (OECD, 2016b), the dominant lethal test (TG 478) (OECD, 2016c), and the spermatogonial chromosomal aberration test (TG 483) (OECD, 2016d). The transgenic rodent (TGR) somatic and germ cell test (TG 488) (OECD, 2025) and the erythrocyte Pig-a gene mutation test (TG 470) (OECD, 2022) are the only two existing TGs that measure gene mutations in vivo. While effective for hazard identification, these tests come with significant limitations (Marchetti et al., 2023b). The TGR assay measures mutations in bacterial reporter transgenes that may not fully reflect endogenous mammalian gene responses (Cosentino and Heddle, 1999; Cosentino and Heddle, 2000; Monroe et al., 1998; Skopek et al., 1995), while the Pig-a gene assay detects only mature and immature erythrocytes with a mutant phenotype. Furthermore, as routinely conducted, neither of these tests provides information on the types of induced mutations to elucidate mechanism of action. Thus, there is a pressing need for improved in vivo mutagenicity testing methods that: (a) detect a broader range of mutations across different genomic contexts (e.g., transcribed vs non-transcribed sequences); and (b) can be integrated into standard non-clinical toxicity studies, aligning with the 3R principles of animal use in research (i.e., Replacement, Reduction, Refinement).

Error-corrected sequencing (ECS) has emerged as a transformative technology for mutagenicity assessment. By repeatedly sequencing the same DNA molecules, ECS enables the identification and removal of sequencing errors, reducing error rates to 1 × 10⁻⁷ to 1 × 10⁻⁸, which is comparable to baseline somatic and germline mutation frequencies (MF). There is growing enthusiasm for adopting ECS to complement and replace conventional methods for mutagenicity testing (Marchetti et al., 2023a; Marchetti et al., 2023b). Empirical evidence increasingly supports that several ECS technologies are useful for regulatory testing applications. To facilitate their acceptance, the International Workshop on Genotoxicity Testing (IWGT) established a working group to evaluate the use of existing ECS technologies for in vivo mutagenicity testing, identify knowledge gaps, and provide recommendations for assay implementation to advance regulatory acceptance of ECS approaches in genetic toxicology. In the next section, we provide further details about currently available ECS methodologies.

3. ECS APPROACHES FOR DETECTING IN VIVO MUTATIONS

Since the development of Safe-Seq to improve sequencing fidelity of individual reads on the Illumina platform (Kinde et al., 2011), multiple methods have emerged for detecting rare mutations (Menon and Brash, 2023; Salk and Kennedy, 2020; Salk et al., 2018). Most ECS methods enhance accuracy by matching sequences of the forward and reverse strands of the original DNA fragments to build a consensus sequence, as true mutations will be present on both strands. For some methods this is achieved by attaching unique molecular identifiers (UMIs) to both strands. Other methods rely solely on shear point alignments to the reference genome to uniquely identify mutations. Adapter asymmetry informs strand orientation for all methods.

ECS methods can be categorized into targeted and genome-scale approaches. For example, duplex sequencing (DS) uses representative panels of endogenous loci as a target set (LeBlanc et al., 2022; Smith-Roe et al., 2023). This offers simplicity, consistency, and high molecular depth to enable identification of germline variants and low-level clonal expansions (see Box 1 for definitions); however, panel-based approaches may not capture mutations across a sufficiently broad genomic landscape. In contrast, genome-scale approaches provide a more comprehensive distribution of mutational events but often require supplemental whole genome sequencing (WGS) to identify germline variants. For genome-scale approaches, the fraction of the genome covered is proportional to the number of original DNA molecules input to the PCR step (Hoang et al., 2016; Maslov et al., 2022) or to the sequencing step for PCR-free protocols (Matsumura et al., 2019). Increasing DNA input increases the average molecular depth for all ECS methods, but sufficient sequencing reads must be allocated to generate the number of sequence copies per molecule required to make a consensus. Each method requires empirical characterization of the relationship between DNA input, sequencing requirements, and molecular depth, as the efficiency with which original DNA molecules are converted to consensus sequence data will vary. Although sequencing reads can theoretically scale up, considerations such as cost and computational demands present practical limitations, especially for genome-scale methods that require supplemental WGS. Even with supplemental WGS, identification of modest levels of clonal mutations with genome-scale methods remains challenging. ECS methods are generally amenable to different sample types across species and tissues, as long as double-stranded DNA can be obtained and a high-quality reference genome is available.

Box 1.

Germline variant:

A DNA sequence change that is inherited and therefore is present in every cell of the organism. Germline variants must be distinguished from new somatic or germ cell mutations in ECS analyses. In genome-scale approaches, this is typically done by comparing to a matched whole genome sequence or population databases. In targeted sequencing approaches, where very high sequencing depth is achieved at select loci, germline variants are identified and filtered based on their variant allele frequency (VAF).

Variant Allele Frequency (VAF) = (Number of molecules with the variant) / (Total informative consensus basecalls covering the position).

VAF is a key parameter used to distinguish germline variants from somatic mutations. Because germline variants are present in nearly all cells, they generally appear at or near a VAF of 50% (heterozygous) or 100% (homozygous), whereas somatic mutations typically appear at much lower VAFs.
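As a minimal sketch of the VAF definition above, the following Python computes VAF from consensus counts and applies an illustrative cut-off to separate likely germline variants from candidate somatic mutations. The function names and the 0.3 threshold are assumptions made for illustration only; they are not taken from any ECS pipeline, and each laboratory would need to justify its own filter.

```python
def variant_allele_frequency(variant_molecules: int, informative_basecalls: int) -> float:
    """VAF = molecules carrying the variant / informative consensus basecalls at the position."""
    if informative_basecalls <= 0:
        raise ValueError("position has no informative consensus basecalls")
    if not 0 <= variant_molecules <= informative_basecalls:
        raise ValueError("variant count must lie between 0 and the total basecalls")
    return variant_molecules / informative_basecalls


# Hypothetical cut-off: variants at or above this VAF are treated as germline.
GERMLINE_VAF_THRESHOLD = 0.3


def classify_variant(variant_molecules: int, informative_basecalls: int,
                     threshold: float = GERMLINE_VAF_THRESHOLD) -> str:
    """Label a variant 'germline' or 'candidate_somatic' from its VAF alone."""
    vaf = variant_allele_frequency(variant_molecules, informative_basecalls)
    return "germline" if vaf >= threshold else "candidate_somatic"


print(classify_variant(5100, 10000))  # heterozygous-like VAF of 0.51
print(classify_variant(1, 10000))     # single mutant molecule, VAF of 1e-4
```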

Informative consensus basecall (or base):

A double-strand consensus basecall of A, T, C, or G. If an unambiguous consensus basecall cannot be made due to disagreement between raw reads or strands, that position is called as “N” in that particular DNA molecule. Ns are not included in VAF or mutation frequency (MF) calculations (at least for duplex sequencing).

Unique mutation:

A base substitution or indel detected at a distinct genomic position. Multiple identical mutations within a sample can arise from a single ‘unique’ mutational event. Thus, these multiple identical mutations are counted as one ‘unique mutation’ regardless of how many times they are observed. This definition is essential for estimating mutation burden without the influence of clonal expansion.

Clonal expansion:

The process by which a single cell that is carrying a mutation undergoes proliferation, resulting in a population of descendant cells (a clone) that inherit the same mutation. In sequencing data, this is observed as a mutation appearing in multiple independent DNA molecules from the same sample.

Clonal mutation:

A mutation that is present in multiple DNA molecules, possibly due to clonal expansion of the mutated cell.

MFmin (Mutation Frequency Minimum):

The number of unique mutations per base sequenced, calculated by dividing the number of unique mutations by the total number of informative consensus bases. This metric assumes the minimal mutation burden by collapsing clonal mutation events into a single count, providing a conservative estimate of MF.

MFmax (Mutation Frequency Maximum):

The total number of mutations observed per base sequenced, including clonal mutations. MFmax is calculated by dividing the total number of mutation counts by the total number of informative consensus bases. This metric reflects the maximal detectable mutation burden and is sensitive to both mutation frequency and clonal proliferation.
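The two definitions above differ only in whether repeated observations of the same mutation are collapsed. The sketch below shows that difference on a made-up sample; the record layout, the function name `mutation_frequencies`, and the example counts are illustrative assumptions rather than the output format of any specific ECS tool.

```python
from collections import Counter
from typing import Iterable, Tuple

# One observation = one consensus molecule carrying a mutation,
# described as (chromosome, position, reference allele, alternate allele).
Mutation = Tuple[str, int, str, str]


def mutation_frequencies(observations: Iterable[Mutation],
                         informative_bases: int) -> Tuple[float, float]:
    """Return (MFmin, MFmax) for one sample."""
    if informative_bases <= 0:
        raise ValueError("informative_bases must be positive")
    counts = Counter(observations)
    unique_mutations = len(counts)          # identical mutations collapsed to one
    total_mutations = sum(counts.values())  # every observation counted
    return unique_mutations / informative_bases, total_mutations / informative_bases


# Hypothetical sample: one mutation seen three times (a possible clone), two seen once.
obs = [("chr1", 100, "C", "T")] * 3 + [("chr2", 200, "G", "A"), ("chr5", 50, "T", "A")]
mf_min, mf_max = mutation_frequencies(obs, informative_bases=100_000_000)
print(f"MFmin = {mf_min:.1e}, MFmax = {mf_max:.1e}")
```

In this example the three identical observations contribute one count to MFmin and three counts to MFmax, so a large gap between the two values points to clonal expansion in the sample.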

Simple mutation spectrum:

The relative frequencies of each of the six possible single base substitution types (C>A, C>G, C>T, T>A, T>C, T>G), without considering adjacent sequence context. Useful for high-level comparisons of mutational processes. Note that substitutions are expressed only from a pyrimidine reference (for example, C>A and G>T represent the same change in a double-stranded molecule).
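A short sketch of how substitutions can be collapsed to the pyrimidine reference and tallied into a simple spectrum is given below. The helper names are hypothetical and the input is a made-up list of (reference, alternate) base pairs; real pipelines would read these from variant call files.

```python
from collections import Counter
from typing import Dict, Iterable, Tuple

COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}
SUBSTITUTION_TYPES = ("C>A", "C>G", "C>T", "T>A", "T>C", "T>G")


def to_pyrimidine_reference(ref: str, alt: str) -> str:
    """Express a single base substitution from the pyrimidine strand, e.g. G>T becomes C>A."""
    ref, alt = ref.upper(), alt.upper()
    if ref not in COMPLEMENT or alt not in COMPLEMENT or ref == alt:
        raise ValueError(f"not a single base substitution: {ref}>{alt}")
    if ref in ("A", "G"):  # purine reference: report the complementary strand instead
        ref, alt = COMPLEMENT[ref], COMPLEMENT[alt]
    return f"{ref}>{alt}"


def simple_spectrum(substitutions: Iterable[Tuple[str, str]]) -> Dict[str, float]:
    """Relative frequency of each of the six substitution types."""
    counts = Counter(to_pyrimidine_reference(r, a) for r, a in substitutions)
    total = sum(counts.values())
    return {t: (counts[t] / total if total else 0.0) for t in SUBSTITUTION_TYPES}


print(simple_spectrum([("G", "T"), ("C", "A"), ("C", "T"), ("A", "G")]))
```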

All ECS approaches generate error-corrected base calls, compare the number of mutant bases to total consensus bases to generate MF, and characterize the mutation spectrum. All methods are effective at capturing single nucleotide variants (SNVs) and small insertions and deletions. Detection of structural variants (SVs) remains challenging, although progress is being made (Heid et al., 2024; Wilson et al., 2023) and therefore these lesions may become part of an integrated ECS assessment in due course. In the meantime, ECS can be integrated with structural and numerical cytogenetic assessments in the same animal model (e.g., bone marrow micronucleus test). Moreover, generated ECS data can be leveraged to perform mutation signature analyses to identify chemical-specific fingerprints, map tissue-specific susceptibilities and provide mechanistic insights (Chawanthayatham et al., 2017; LeBlanc et al., 2022; Matsumura et al., 2019; Schuster et al., 2024). All ECS methods require significant computing resources and bioinformatics expertise. Both proprietary and open platforms exist, and these are available from a range of commercial and academic institutions. Below we summarize the ECS approaches that have been used to assess chemical mutagenesis. Strengths and weaknesses of each platform are summarized in Table 1.

Table 1.

Strengths and weaknesses of Error-Corrected Sequencing approaches

Duplex Sequencing

Strengths:

  • Extensively used and validated

  • Well-characterized panels for mouse, rat, and human; representative of the genome

  • Effective filtering of clonal and germline variants

  • Enables characterization of locus-specific MF and clonal expansion events

  • High sensitivity and specificity; concordant with TGR and Pig-a assays

  • Strong inter-laboratory reproducibility (r > 0.96)

  • Power analyses completed

  • Compatible with standard Illumina workflows

  • Commercially available, user-friendly, cloud-based analysis suite (DNAnexus), facilitating standardized and accessible data processing

Weaknesses:

  • Currently not commercially available (as of manuscript submission)

  • Requires complex library prep and bioinformatics processing

  • Limited to panel regions (~48 kb); broader genome coverage requires redesign

  • Target-based approach requiring species-specific panels

  • Not capable of detecting large structural variants (SVs)

Hawk-Seq

Strengths:

  • Second most extensively validated by comparison with the TGR assay

  • Genome-scale ECS using shear point mapping; no need for UMIs

  • Uses widely available Illumina kits with minimal modifications

  • Open-source tools used for analysis

  • Demonstrated reproducibility in inter-laboratory comparisons

Weaknesses:

  • Requires supplemental WGS or SNP databases to remove germline variants

  • Cannot reliably detect SVs or low-frequency clonal expansions

  • Current bioinformatic workflows require custom scripting and expertise, limiting broader adoption and standardization

Jade-Seq

Strengths:

  • Enhanced error correction via S1 Nuclease removes end-repair artifacts

  • Improved sensitivity vs. Hawk-Seq, especially for G:C mutations

  • Maintains compatibility with standard Illumina workflows

  • Good for detection of low-frequency SNVs across the genome

Weaknesses:

  • Limited use to date in complex mammalian in vivo studies

  • Current bioinformatic workflows require custom scripting and expertise, limiting broader adoption and standardization

  • Requires supplemental WGS or SNP databases to remove germline variants

PECC-Seq

Strengths:

  • No requirement for exogenous UMIs (simpler library prep than DS)

  • PCR-free method reduces amplification bias

  • Consensus achieved by pair-end overlap

  • Maintains compatibility with standard Illumina workflows

  • Effective detection of genome-wide SNVs with good mutation spectra

Weaknesses:

  • Short fragment generation and recovery of complementary strands can be challenging

  • Lower genome coverage compared to other genome-scale ECS methods

  • Requires supplemental WGS or SNP databases to remove germline variants

  • Current bioinformatic workflows require custom scripting and expertise, limiting broader adoption and standardization

SMM-seq

Strengths:

  • Enables genome-scale mutation detection with improved accuracy through rolling circle amplification (RCA)

  • Amplifies both strands linearly before PCR, reducing early-cycle PCR artifacts and improving variant confidence

  • Compatible with standard Illumina platforms and optimized for high-throughput workflows

  • Commercially available as a fee-for-service (MutagenTech), increasing accessibility for labs without in-house sequencing capacity

Weaknesses:

  • Limited independent validation or application in regulatory mutagenesis studies to date

  • No publicly available, user-friendly analysis pipeline; bioinformatics expertise is required

  • Detection of structural variants still in early stages and not yet validated for mutagenicity testing

  • Protocol complexity may present technical challenges for new users (e.g., managing RCA and hairpin adapter ligation)

  • Requires supplemental WGS or SNP databases to remove germline variants

HiFi sequencing

Strengths:

  • Long-read technology enables broad genomic target coverage

  • No requirement for exogenous UMIs (simpler library prep than DS)

  • High consensus accuracy from multiple circular passes

  • No PCR amplification to reduce artifact generation

  • Capable of detecting SNVs, small indels, and some SVs

  • Off-the-shelf commercial kits and sequencing services available

Weaknesses:

  • Requires supplemental whole genome sequencing (WGS) or SNP databases to remove germline variants

  • Lower throughput and data yield vs. short-read Illumina platforms

  • Not currently adapted for analysis of both unique and clonally expanded mutations

  • Current bioinformatic workflows require custom scripting and expertise, limiting broader adoption and standardization

3.1. Duplex Sequencing (DS)

DS uniquely tags original DNA duplexes to track both strands through PCR, sequencing, and bioinformatic analysis (Kennedy et al., 2014; Schmitt et al., 2012). This process generates duplex consensus sequences (DCS), identifying true mutations as complementary changes occurring on both strands of the original DNA molecule (Figure 1A). Duplex error correction reduces the theoretical error rate to <10⁻⁹ as independent, complementary PCR errors must occur at the same position on both strands to be mistaken for a true mutation (Kennedy et al., 2014; Schmitt et al., 2012). Originally commercialized by TwinStrand Biosciences, DS represents the most broadly adopted ECS method to date.
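The core idea of duplex error correction can be sketched in a few lines: build a consensus for each strand from its PCR copies, then keep a base only where the two strand consensuses agree. The code below is a simplified illustration, not the published DS pipeline; the function names and the 0.9 agreement fraction are assumptions, and reads are assumed to be already grouped by tag and strand, equal in length, and oriented to the same reference strand.

```python
from collections import Counter
from typing import List


def single_strand_consensus(reads: List[str], min_fraction: float = 0.9) -> str:
    """Per-position majority call across copies of one strand; 'N' if support is weak."""
    if not reads or len({len(r) for r in reads}) != 1:
        raise ValueError("reads must be non-empty and of equal length")
    consensus = []
    for column in zip(*reads):
        base, count = Counter(column).most_common(1)[0]
        consensus.append(base if count / len(column) >= min_fraction else "N")
    return "".join(consensus)


def duplex_consensus(top_reads: List[str], bottom_reads: List[str]) -> str:
    """Keep a base only where both strand consensuses agree; otherwise call 'N'.

    Assumes bottom-strand reads were already converted to top-strand orientation.
    """
    top = single_strand_consensus(top_reads)
    bottom = single_strand_consensus(bottom_reads)
    if len(top) != len(bottom):
        raise ValueError("strand consensuses differ in length")
    return "".join(t if t == b and t != "N" else "N" for t, b in zip(top, bottom))


# A change present on one strand only (e.g., an early PCR error) is masked as 'N'.
print(duplex_consensus(["ACTT"] * 3, ["ACGT"] * 3))
# A change present on both strands survives as a candidate true mutation.
print(duplex_consensus(["ACTT"] * 3, ["ACTT"] * 3))
```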

Figure 1 – Methods for error-corrected Next Generation Sequencing.

(A) Duplex Sequencing (DS). Adapters containing DS tags are ligated onto the ends of double-stranded DNA fragments to uniquely label each strand of the original molecule (i-ii) such that both strands can be tracked throughout amplification and sequencing (iii-iv). Tagged source DNA molecules (ii) are amplified by PCR and the resulting sequencing reads are grouped by unique tag and strand (iii). Top and bottom strands are compared to eliminate errors generated during PCR and sequencing, resulting in a duplex consensus sequence (iv). (B) Hypothesis alignment with weak overlap sequencing (Hawk-Seq). (i) The protocol involves preparing sequencing libraries using a standard Illumina PCR-based method with minor modifications to PCR conditions. After paired-end sequencing, read pairs originating from the same double-stranded DNA (dsDNA) fragment are identified based on their mapping positions on the reference genome. (ii) Consensus sequences are generated if at least one read pair from each strand of the dsDNA fragment is obtained, and variants are detected from these consensus sequences. Hawk-Seq does not require external molecular barcodes. (C) Justifies Analyte DNA sEquence (Jade-Seq). (i) Sequencing errors can be caused by the end-repair process during library preparation. (ii) The method removes these errors by digesting single-stranded regions in DNA fragments using S1 Nuclease. (D) Paired-end complementary consensus sequencing (PECC-Seq). Mutations are identified by matching variants in forward and reverse reads of DNA fragments sequenced on an Illumina platform. (i) Genomic DNA is sheared into ~150-bp double-stranded fragments, and a PCR-free library is prepared. (ii) After paired-end sequencing, data are aligned to a reference genome, and forward and reverse strands are identified by mapping coordinates. (iii) Each DNA fragment generates four reads (two paired reads per strand). Variants are classified as mutations if present in all four reads, ensuring high-confidence detection. (E) Single-Molecule Mutation Sequencing (SMM-seq). After DNA fragmentation with specific nucleases (i), the method employs hairpin adapters (ii) and rolling circle amplification (RCA) (iii) to linearly amplify both strands of each DNA duplex while preserving the physical linkage between the original strands and their copies (iv). True variants are identified by consensus sequences (v). This approach enhances analytical efficiency by minimizing the risk of strand separation during amplification. (F) HiFi sequencing. The methodology is built on Single Molecule Real-Time (SMRT) technology. (i) High-molecular-weight genomic DNA is randomly sheared into 5–10 kbp double-stranded fragments, and SMRTbell adapters are ligated to both ends, converting the fragments into circularized single-stranded templates. A sequencing primer is annealed to the circularized template, and a DNA polymerase is loaded onto the DNA/primer complex. The polymerase extends the primer along the circular template in the presence of fluorescently labeled nucleotide triphosphates. (ii) Forward and reverse consensus sequences are generated and aligned to the reference genome, and variants are identified when present in both consensus sequences.

DS library preparation begins with genomic DNA fragmentation, originally by mechanical methods but updated to an enzymatic method that reduces 5’ end artifacts. DNA fragments (~300 bp) are then ligated with adapters containing UMIs and strand-defining sequences. Sample indexes and Illumina adapter sequences are added by PCR followed by hybrid capture of target regions. Final DS libraries are compatible with standard Illumina workflows for 150 bp paired-end sequencing. Hybrid capture mutagenesis panels have been developed for human (Cho et al., 2023; Wang et al., 2021), mouse (Ashford et al., 2025; Dodge et al., 2023; LeBlanc et al., 2022), and rat (Bercu et al., 2023; Smith-Roe et al., 2023) genomes. Each mutagenesis panel covers 20 regions of ~2.4 kb (48 kb total) of non-repetitive sequence, distributed across most autosomes. Targets represent the whole genome in terms of GC content together with nucleotide and trinucleotide composition. Selected regions have no known role in cancer and are unlikely to be under strong positive or negative selection.

The target-based approach of DS allows very high coverage (typically > 10,000x molecular depth) across loci, enabling precise differentiation between germline and somatic variants within a single DS library. Additionally, clonal variants can be “collapsed” or filtered out to more accurately estimate MF (Dodge et al., 2023). DS has been widely used to assess chemical mutagenicity in transgenic rodents (Armijo et al., 2023; Ashford et al., 2025; Bercu et al., 2023; Chawanthayatham et al., 2017; Dodge et al., 2023; LeBlanc et al., 2022; LeBlanc et al., 2025; Schuster et al., 2024; Valentine et al., 2020; Xia et al., 2025; Zhang et al., 2025a; Zhang et al., 2024), wild-type rodents (Luzadder et al., 2025; Minko et al., 2024; Sahib et al., 2024; Simon et al., 2025; Smith-Roe et al., 2023) and cultured cells (Armijo et al., 2023; Cho et al., 2023; Huliganga et al., 2025; Wang et al., 2021; Zhivagui et al., 2023). Baseline MF in untreated rodent tissues range from approximately 4×10⁻⁸ to 2×10⁻⁷, depending on the tissue and the fragmentation method. DS shows high concordance with established mutagenicity assays such as TGR (Ashford et al., 2025; Bercu et al., 2023; Dodge et al., 2023; LeBlanc et al., 2022; Valentine et al., 2020) and Pig-a (Smith-Roe et al., 2023). Furthermore, a good concordance between induced MF in the DS mutagenesis gene panel and the lacZ transgene panel has been reported (Ashford et al., 2025). Multiple studies have demonstrated high inter-laboratory reproducibility with Pearson correlation coefficients (r) > 0.96 (Cho et al., 2023; LeBlanc et al., 2022; Zhang et al., 2025b).
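To put the depth and baseline MF figures in perspective, the arithmetic below estimates how many mutant basecalls a control sample might yield. It is a back-of-the-envelope sketch that assumes a 48 kb panel, exactly 10,000x molecular depth, and an informative consensus basecall at every covered position; real libraries will deviate from all three, and the function name is a hypothetical helper.

```python
PANEL_SIZE_BP = 48_000    # 20 targets of ~2.4 kb, as described for the mutagenesis panels
MOLECULAR_DEPTH = 10_000  # illustrative value; the text cites "typically > 10,000x"


def expected_mutations(mutation_frequency: float,
                       panel_size_bp: int = PANEL_SIZE_BP,
                       molecular_depth: int = MOLECULAR_DEPTH) -> float:
    """Expected mutant basecalls = MF x informative bases (panel size x molecular depth)."""
    informative_bases = panel_size_bp * molecular_depth
    return mutation_frequency * informative_bases


# Lower bound, midpoint-like value, and upper bound of the baseline MF range in the text.
for mf in (4e-8, 1e-7, 2e-7):
    print(f"MF {mf:.0e}: about {expected_mutations(mf):.1f} mutations expected")
```

Under these assumptions a single library covers 4.8 × 10⁸ informative bases, so baseline samples would be expected to contain on the order of tens of mutations, which is why total informative bases matter for the precision of an MF estimate.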

A strength of DS is the availability of the TwinStrand Biosciences DuplexSeq Mutagenesis App, a user-friendly, cloud-based analysis suite on DNAnexus, facilitating standardized and accessible data processing. A weakness of DS is the requirement for species-specific panels that focus on a small portion of the genome. In addition, it is not currently able to detect large structural variants. Library preparation is a complex, multi-day protocol. Finally, at the time of manuscript submission, DS kits and services are not commercially available.

3.2. Hawk-Seq

Similarly to other ECS methods, Hawk-Seq, named after “hypothesis alignment with weak overlap”, detects true mutations by comparing sequences from both DNA strands (Matsumura et al., 2019; Otsubo et al., 2021). Library preparation follows a standard Illumina PCR-based protocol with random fragmentation by sonication and minor modifications of PCR conditions. After paired-end sequencing, data analysis is performed using both open-source and custom bioinformatics tools. Read pairs derived from the same DNA fragment are grouped based on shear point mapping positions in the reference genome; thus, exogenous UMIs are not necessary. A consensus sequence is created only if at least one read pair from each strand of an original DNA fragment is obtained (Figure 1B) and variants must be present in both strands of the consensus sequence. Although identical alignment of read ends from different DNA fragments is possible, this is minimized by controlling the input DNA amount for PCR (IDAP) per unit genome length during library preparation (Matsumura et al., 2019). In the absence of sample-paired WGS, known genetic variant data registered in dbSNP (https://www.ncbi.nlm.nih.gov/snp/) have been used to filter out genetic polymorphisms present in mice (Matsumura et al., 2019).
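The grouping step described above can be sketched as follows: read pairs that share both mapped end coordinates are treated as copies of one original fragment, and a fragment qualifies for consensus building only if both strands are represented. This is an illustrative outline, not the Hawk-Seq software; the `ReadPair` record, its fields, and the function names are hypothetical, and strand assignment (which in practice comes from adapter orientation) is taken as given.

```python
from collections import defaultdict
from typing import Dict, List, NamedTuple, Tuple


class ReadPair(NamedTuple):
    chrom: str
    start: int     # leftmost mapped coordinate of the fragment
    end: int       # rightmost mapped coordinate of the fragment
    strand: str    # '+' or '-': which original strand the pair derives from
    sequence: str  # fragment sequence in reference orientation


FragmentKey = Tuple[str, int, int]


def group_by_shear_points(pairs: List[ReadPair]) -> Dict[FragmentKey, Dict[str, List[ReadPair]]]:
    """Group read pairs sharing both mapped end coordinates, split by strand of origin."""
    groups: Dict[FragmentKey, Dict[str, List[ReadPair]]] = defaultdict(
        lambda: {"+": [], "-": []})
    for pair in pairs:
        groups[(pair.chrom, pair.start, pair.end)][pair.strand].append(pair)
    return dict(groups)


def fragments_with_both_strands(pairs: List[ReadPair]) -> List[FragmentKey]:
    """Fragments eligible for consensus: at least one read pair from each strand."""
    return [key for key, by_strand in group_by_shear_points(pairs).items()
            if by_strand["+"] and by_strand["-"]]


pairs = [
    ReadPair("chr1", 1000, 1180, "+", "ACGT"),
    ReadPair("chr1", 1000, 1180, "-", "ACGT"),
    ReadPair("chr1", 5000, 5200, "+", "TTGA"),  # only one strand recovered
]
print(fragments_with_both_strands(pairs))
```

Because the grouping key is the pair of mapped coordinates alone, two different fragments with identical shear points would be merged, which is why the text notes that input DNA amount is controlled to keep such collisions rare.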

A strength of Hawk-Seq is the potential for genome-scale mutation analysis and the use of commercially available Illumina kits without the need for exogenous UMIs. It requires only slight modifications (i.e., adjusting input DNA amount) in the PCR steps (Matsumura et al., 2019). A weakness is that no complete, standardized bioinformatics package is currently available, although the method utilizes open-source tools for analysis. Hawk-Seq has been used to detect induced mutations in Salmonella typhimurium TA100 exposed to 15 mutagens and in transgenic gpt delta mice exposed to five mutagens (Matsumura et al., 2019; Otsubo et al., 2021).

3.3. Jade-Seq

Jade-Seq, named after “Justifies Analyte DNA sEquence”, is a genome-scale ECS method. Like other ECS methods, Jade-Seq is based on detecting mutations on both strands of a DNA fragment and it is an improvement over Hawk-Seq (Otsubo et al., 2022). Jade-Seq reduces sequencing errors caused by end-repair during library preparation (Figure 1C). These errors typically occur near sequencing read ends (You et al., 2020) and are efficiently eliminated before sequencing via digestion of single-strand-specific regions in DNA fragments using S1 Nuclease. Otsubo et al. evaluated several single-strand-specific nucleases and found S1 Nuclease to be the most effective in reducing these artifacts (Otsubo et al., 2022). Compared to Hawk-Seq, Jade-Seq reduced G:C > T:A and G:C > C:G MF in the same DNA sample from 1.0×10⁻⁷ and 1.5×10⁻⁷ to 4.1×10⁻⁸ and 2.1×10⁻⁸ bp, respectively. This increased accuracy was particularly important for detecting rare mutations induced by polycyclic aromatic hydrocarbons in S. typhimurium TA100 (Otsubo et al., 2022).

Materials required for Jade-Seq (i.e., Illumina library kit and S1 Nuclease) are widely available. The laboratory protocol requires an additional digestion step prior to end-repair processing during Illumina library preparation. A possible limitation is biased genomic representation due to the sequence specificity of S1 Nuclease; however, Otsubo et al. concluded that this had no significant influence on coverage of the S. typhimurium genome. Data analysis utilizes open-source tools (Otsubo et al., 2022) and is similar to Hawk-Seq (Matsumura et al., 2019), but no complete pipeline package is available.

3.4. PECC Sequencing

Paired-end complementary consensus sequencing (PECC-Seq) identifies mutations by matching complementary sequences in forward and reverse reads of double-stranded DNA fragments sequenced on the Illumina platform (Izawa et al., 2023; You et al., 2023; You et al., 2020). The method requires randomly fragmenting genomic DNA to ~150-bp fragments and library preparation using a PCR-free protocol. With such short inserts, paired-end reads mostly or completely overlap, enabling highly accurate mutation detection. After standard paired-end sequencing, the sequencing data are aligned to the reference genome, and complementary DNA strands are matched by shear point mapping coordinates (Figure 1D). Since low molecular depth is generally sufficient for unique identification, PECC-Seq does not require exogenous UMIs.

Each double-stranded DNA insert in the original library can produce four sequencing reads: two paired reads for the forward strand and two reads for the reverse strand. Variants are classified as ‘putative’ mutations if they are present in both strands of the sequenced DNA fragment and appear in all four sequencing reads. After eliminating the terminal 10 bp of inserted fragments to reduce artifacts caused by the end-repair process, a mutation is considered ‘true’ only if it meets strict criteria: it is unique; it is present only within a single DNA fragment; and, it does not appear in any overlapping reads in the same or other sequenced samples. During these filtering steps, possible single nucleotide polymorphisms (SNPs) can be eliminated without prior information by WGS. However, if SNP information is already available for the same strain, it can also be used to support their elimination.
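The filtering criteria above can be sketched as a short function. This is an illustrative simplification, not the published PECC-Seq pipeline: the `Observation` record and `true_mutations` function are hypothetical, and "does not appear in any overlapping reads" is approximated by requiring that the variant be observed exactly once across all fragments and samples supplied.

```python
from collections import Counter
from typing import NamedTuple


class Observation(NamedTuple):
    """Hypothetical record: one variant seen in one sequenced fragment."""
    sample: str
    chrom: str
    frag_start: int        # first base of the insert (inclusive)
    frag_end: int          # last base of the insert (inclusive)
    pos: int
    ref: str
    alt: str
    supporting_reads: int  # how many of the four reads carry the variant


def true_mutations(observations, end_trim=10):
    """Apply the three filters described in the text to variant observations."""
    # Count every sighting of a variant, in any fragment of any sample.
    seen = Counter((o.chrom, o.pos, o.ref, o.alt) for o in observations)
    kept = []
    for o in observations:
        if o.supporting_reads < 4:
            continue  # not present in all four reads of the fragment
        if o.pos - o.frag_start < end_trim or o.frag_end - o.pos < end_trim:
            continue  # within the terminal 10 bp (end-repair artifacts)
        if seen[(o.chrom, o.pos, o.ref, o.alt)] != 1:
            continue  # recurrent variant: possible SNP or clonal event
        kept.append(o)
    return kept


obs = [
    Observation("s1", "chr1", 1000, 1149, 1050, "G", "T", 4),  # passes all filters
    Observation("s1", "chr1", 2000, 2149, 2005, "C", "A", 4),  # too close to an end
    Observation("s1", "chr1", 3000, 3149, 3075, "A", "G", 3),  # only three reads
    Observation("s1", "chr2", 500, 649, 560, "T", "C", 4),     # also seen in s2
    Observation("s2", "chr2", 480, 629, 560, "T", "C", 2),
]
kept = true_mutations(obs)
```

Counting sightings before applying the other filters means that even a weakly supported observation in another sample is enough to exclude a variant, which mirrors the conservative intent of the criteria.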

A strength of PECC-Seq is genome-scale mutation detection, producing representative, detailed mutational spectra without requiring specialized reagents. Weaknesses include challenges in generating short DNA fragments for library construction, recovery of the complementary strand, and potentially reduced genome sequencing coverage. Existing bioinformatic pipelines for PECC-Seq are not yet available as a commercial package. PECC-Seq has been successfully applied to detect mutations induced by prototypical mutagens in both in vitro (You et al., 2020) and in vivo models (Izawa et al., 2023; You et al., 2023).

3.5. SMM-seq

Single-Molecule Mutation Sequencing (SMM-seq) is an efficient ECS method at genome-scale (Maslov et al., 2022). Specifically optimized for high-throughput sequencing on Illumina platforms, this method features a distinctive library preparation process that utilizes hairpin adapters and rolling circle amplification (RCA). RCA is employed to linearly amplify both strands of each DNA duplex while temporarily maintaining physical linkage of the original DNA strands and their copies (Figure 1E). This is followed by a standard PCR step, during which strands and copies separate prior to sequencing.

A strength of SMM-seq is performing RCA prior to standard PCR; this effectively increases the number of templates representing each original molecule, which theoretically improves the accuracy of mutation detection by reducing the risk of early round PCR errors that could cause sequence artifacts. A possible limitation is biased genomic representation due to reliance on restriction endonucleases Alu I and Mlu CI for DNA fragmentation. SMM-seq typically analyzes about 200 Mb of the mammalian genome and has been successfully applied in a variety of research settings, including studies quantifying increases in MF induced by N-ethyl-N-nitrosourea (ENU) in vitro, and age-related mutational changes in human tissues (Maslov et al., 2022). A modified SMM-seq approach to detect SVs has been developed (Heid et al., 2024). At the time of manuscript preparation, SMM-seq is available as a commercial fee-for-service platform from MutagenTech (https://mutagentech.com).

3.6. HiFi Sequencing

High-fidelity (HiFi) Sequencing detects mutations using Single DNA Molecule Real-Time (SMRT) sequencing developed by Pacific Biosciences of California, Inc. (PacBio). Like other ECS methods, HiFi Sequencing relies on detecting mutations in both forward and reverse DNA strands. Library preparation and sequencing use PacBio’s off-the-shelf reagents and kits. High-molecular weight genomic DNA is randomly sheared to 5–10 kb double-stranded fragments and SMRTbell adapters are ligated on both sides circularizing the original double-stranded fragments into single-stranded templates (Figure 1F). A sequencing primer is annealed to the circularized template, and a DNA polymerase is loaded onto the DNA/primer complex where it can extend the primer along the circular template in the presence of fluorescent nucleotide triphosphates. PacBio’s strand displacing DNA polymerase allows continuous reading of forward and reverse strands of the circular template, producing highly accurate forward and reverse consensus sequences from repeat passes. Bioinformatic processing aligns consensus sequences to a reference genome, identifying variants as potential mutations. While SMRT sequencing can produce an accurate single consensus sequence, generating separate forward and reverse consensus sequences and matching the variants present in both produces a more precise list of putative mutations (Matsuda et al., 2015).

A strength of HiFi Sequencing is its broad genomic target for mutation detection, which facilitates detailed mutation spectra analysis. PCR is not used, reducing the risk of early round PCR errors causing artifactual mutations. HiFi has been used for chemical mutagenicity analyses in simple (e.g., bacterial) genomes (Matsuda et al., 2015; Miranda et al., 2023; Revollo et al., 2021) and complex genomes, including nematodes (Miranda et al., 2022a) and mammals (Dobrovolsky et al., 2023; Miranda et al., 2022a; Miranda and Revollo, 2024; Seo et al., 2024). Weaknesses include the relatively high cost for generating sufficient depth of coverage and the potential requirement for additional WGS, especially when working with outbred rodent strains.

3.7. Other relevant ECS approaches

A few other ECS approaches have been published that have error frequencies at or below the spontaneous MF. Although these methods have yet to be applied in chemical mutagenesis studies, the working group acknowledges their significant potential in advancing this field. They are briefly summarized below.

Concatenating Original Duplex for Error Correction (CODEC) detects mutations by physically linking the forward and reverse strands of a double-stranded DNA fragment and sequencing them on the Illumina platform using paired-end technology (Bae et al., 2023). An advantage of CODEC is that since complementary strands are physically linked together, each read pair represents both original strands, and no bioinformatic tools are required for locating them. Mutation identification follows a standard bioinformatics protocol, including the alignment to a reference genome and ensuring that mutations are present in both strands. CODEC has shown comparable performance to DS in detecting mutations in human DNA (Bae et al., 2023). An updated version that allows simultaneous detection of mutations and single-base methylation resolution has been published (Liu et al., 2025).

Nanorate Sequencing (NanoSeq) is an ECS method that improves mutation detection by removing library preparation steps, particularly end-repair after DNA fragmentation and nick extension, that introduce errors between strands (Abascal et al., 2021). NanoSeq has been used to investigate somatic mutation rates in humans (Abascal et al., 2021) and to study cancer driver mutation evolution in human tissue samples (Lawson et al., 2025; Neville et al., 2025). A comprehensive NanoSeq bioinformatic pipeline is available (https://github.com/cancerit/NanoSeq), involving preprocessing of BAM files, contamination estimation, base substitution and indel calling, as well as estimation of several quality metrics. NanoSeq data are output in standard formats compatible with other analysis pipelines.

Hairpin Duplex Enhanced Fidelity sequencing (HiDEF-seq) is a HiFi-related method that can detect base substitutions and modifications, e.g., cytosine deamination (Liu et al., 2024). HiDEF-seq can be performed with either random or restriction enzyme-based fragmentation, incorporates an updated A-tailing step, and employs smaller insert sizes, which enable a greater number of passes per DNA molecule on the PacBio platform. These innovations significantly improve the detection of dsDNA single base substitution mutations, as well as single-strand mutations related to DNA mismatch and damage. However, additional bioinformatic filtering steps are required to fully harness its potential for precision mutation analysis.

3.8. Summary of ECS methodologies

The development of ECS, exemplified by the platforms described above, represents a natural evolution of sequencing technologies that have become essential in precision oncology for accurate diagnosis and tailored treatment of various cancers (Hussen et al., 2022) and rare diseases (Fernandez-Marmiesse et al., 2018). Beyond clinical applications, ECS has transformed our ability to investigate somatic mutation rates, providing key insights into, for example, human somatic mosaicism (Martincorena, 2019) and clonal selection in cancer evolution (Kakiuchi and Ogawa, 2021).

Building on these advances, ECS holds immense potential to significantly improve mutagenicity testing. By accurately identifying rare mutations across the genome, including those present in only a small subset of cells, ECS enables more reliable, high-resolution detection of mutagenic effects. Its improved accuracy increases the understanding of chemical-induced genetic alterations to support regulatory decision making. As ECS continues to evolve, new platforms are likely to emerge, and existing ones will be refined. To remain at the forefront, it is imperative for the discipline of genetic toxicology to keep pace with these scientific advances and adopt ECS technologies, as has already occurred in clinical and diagnostic medicine. The next section summarizes progress in applying ECS in assessing pharmaceutical, consumer product, and environmentally induced mutagenesis in vivo and its potential to serve as a substitute for traditional mutagenicity tests, such as the TGR and Pig-a assays.

4. IN VIVO MUTAGENESIS STUDIES WITH ECS

We mined the literature (PubMed and BioRxiv up to June 30, 2025) for studies where ECS has been used for assessing chemical mutagenesis. First, we focused on those studies where ECS has been used in tandem with other in vivo mutation assessments, i.e., the TGR assay (OECD, 2025) or the Pig-a assay (OECD, 2022). We identified 35 experiments with 11 chemicals that quantified MF using these two OECD tests and ECS within the same animals and experiment (Table 2). There were 25 experiments using DS, eight using Hawk-Seq, and two using PECC-Seq, across various transgenic rodents (Gpt delta mice, Big Blue® mice and rats, MutaMouse), and Sprague Dawley rats. The chemicals tested were mostly established mutagens, including a variety of alkylating agents and chemicals causing bulky DNA adducts. The studies cover seven tissues of varying metabolic capacity, site-of-contact and distal tissues, as well as fast and slow proliferating tissues.

Table 2.

In vivo experiments examining the concordance between OECD mutagenicity tests and Error-Corrected Sequencing.

| Exposure | Rodent System | Tissue | OECD Assay | OECD Assay Result | ECS Method | ECS Result | Concordance^b | Reference |
|---|---|---|---|---|---|---|---|---|
| Aflatoxin B1 | Gpt delta mouse | Liver | TGR (OECD TG 488) | + | Duplex Sequencing | + | + | Chawanthayatham et al. (2017) |
| Aristolochic acid | Gpt delta mouse | Kidney | TGR (OECD TG 488) | + | Hawk-Seq | + | + | Matsumura et al. (2019) |
| | Gpt delta mouse | Kidney | TGR (OECD TG 488) | + | PECC-Seq | + | + | You et al. (2023) |
| Benzo[a]pyrene | Big Blue® mouse | Bone marrow | TGR (OECD TG 488) | + | Duplex Sequencing | + | + | Valentine et al. (2020) |
| | Big Blue® mouse | Liver | TGR (OECD TG 488) | + | Duplex Sequencing | + | + | Valentine et al. (2020) |
| | MutaMouse | Bone marrow | TGR (OECD TG 488) | + | Duplex Sequencing | + | + | LeBlanc et al. (2022) |
| | Gpt delta mouse | Bone marrow | TGR (OECD TG 488) | + | Hawk-Seq | + | + | Matsumura et al. (2019) |
| | Gpt delta mouse | Liver | TGR (OECD TG 488) | + | Hawk-Seq | + | + | Matsumura et al. (2019) |
| | Big Blue® rat | Liver | TGR (OECD TG 488) | + | Duplex Sequencing | + | + | Xia et al. (2025) |
| N-Ethyl-N-nitrosourea | SD^a rat | Bone marrow | Pig-a (OECD TG 470) | + | Duplex Sequencing | + | + | Smith-Roe et al. (2023) |
| | SD^a rat | Blood | Pig-a (OECD TG 470) | + | Duplex Sequencing | + | + | Smith-Roe et al. (2023) |
| | SD^a rat | Liver | Pig-a (OECD TG 470) | + | Duplex Sequencing | + | + | Smith-Roe et al. (2023) |
| | SD^a rat | Stomach | Pig-a (OECD TG 470) | + | Duplex Sequencing | + | + | Smith-Roe et al. (2023) |
| | Big Blue® mouse | Bone marrow | TGR (OECD TG 488) | + | Duplex Sequencing | + | + | Valentine et al. (2020) |
| | Big Blue® mouse | Liver | TGR (OECD TG 488) | + | Duplex Sequencing | + | + | Valentine et al. (2020) |
| | Gpt delta mouse | Bone marrow | TGR (OECD TG 488) | + | Hawk-Seq | + | + | Matsumura et al. (2019) |
| | Gpt delta mouse | Liver | TGR (OECD TG 488) | + | Hawk-Seq | + | + | Matsumura et al. (2019) |
| | MutaMouse | Germ cells | TGR (OECD TG 488) | + | Duplex Sequencing | + | + | LeBlanc et al. (2025) |
| | Big Blue® mouse | Kidney | TGR (OECD TG 488) | + | Duplex Sequencing | + | + | Zhang et al. (2025) |
| | Big Blue® mouse | Liver | TGR (OECD TG 488) | + | Duplex Sequencing | + | + | Zhang et al. (2025) |
| Methylnitrosourea | Gpt delta mouse | Bone marrow | TGR (OECD TG 488) | + | Hawk-Seq | + | + | Matsumura et al. (2019) |
| | Gpt delta mouse | Liver | TGR (OECD TG 488) | + | Hawk-Seq | + | + | Matsumura et al. (2019) |
| N-Nitrosodiethylamine | Big Blue® rat | Liver | TGR (OECD TG 488)^b | + | Duplex Sequencing | + | + | Bercu et al. (2023) |
| | Big Blue® mouse | Liver | TGR (OECD TG 488)^b | + | Duplex Sequencing | + | + | Zhang et al. (2024) |
| | Big Blue® mouse | Bone marrow | TGR (OECD TG 488)^b | − | Duplex Sequencing | − | + | Zhang et al. (2024) |
| | Gpt delta mouse | Liver | TGR (OECD TG 488) | + | Hawk-Seq | + | + | Matsumura et al. (2019) |
| | Gpt delta rat | Liver | TGR (OECD TG 488) | + | PECC-Seq | + | + | Izawa et al. (2023) |
| N-Nitrosodimethylamine | Gpt delta mouse | Liver | TGR (OECD TG 488) | + | Duplex Sequencing | + | + | Armijo et al. (2023) |
| | Gpt delta mouse | Lung | TGR (OECD TG 488) | + | Duplex Sequencing | + | + | Armijo et al. (2023) |
| | MutaMouse | Liver | TGR (OECD TG 488) | + | Duplex Sequencing | + | + | Ashford et al. (2025) |
| N-nitroso-ethylisopropylamine | Big Blue® rat | Liver | TGR (OECD TG 488) | + | Duplex Sequencing | + | + | Xia et al. (2025) |
| N-Nitrosomorpholine | Big Blue® mouse | Kidney | TGR (OECD TG 488) | + | Duplex Sequencing | + | + | Zhang et al. (2025) |
| | Big Blue® mouse | Liver | TGR (OECD TG 488) | + | Duplex Sequencing | + | + | Zhang et al. (2025) |
| N-Nitroso reboxetine | Big Blue® mouse | Liver | TGR (OECD TG 488) | − | Duplex Sequencing | + | − | Zhang et al. (2025) |
| Procarbazine hydrochloride | MutaMouse | Bone marrow | TGR (OECD TG 488) | + | Duplex Sequencing | + | + | Dodge et al. (2023) |

^a SD = Sprague-Dawley.

^b ECS and TGR hazard calls for mutagenicity were in overall agreement.

Analysis of these data shows 97% concordance in hazard calls between OECD in vivo mutagenesis tests and ECS (34/35 experiments), with 33/34 positive calls for mutagenicity in both assays (97% sensitivity). The only discordant result was for N-nitroso reboxetine, which was negative using the TGR assay but positive by DS in mouse liver (Zhang et al., 2025a). Within this study, the authors showed that this compound was also positive using the comet assay in liver and positive in the Ames test. Thus, the data strongly suggest that the TGR assay result for this compound is a false negative. There is also one expected negative result for N-nitrosodiethylamine (NDEA) in the bone marrow by both DS and the TGR assay, due to the bone marrow’s inability to metabolically activate NDEA (Zhang et al., 2024). Correlation analyses between the TGR assay and DS yielded strong r values of 0.85 (Dodge et al., 2023), 0.89 (Valentine et al., 2020), and 0.94 (LeBlanc et al., 2022). Furthermore, dose-response analyses showed very similar Benchmark Dose (BMD) confidence intervals (Bercu et al., 2023; Dodge et al., 2023; LeBlanc et al., 2022; Zhang et al., 2024). Concordant hazard calls and BMDs are also observed in liver from MutaMouse males exposed to N-nitrosodimethylamine (NDMA) and analyzed with the DS mouse mutagenesis and the lacZ transgene panels (Ashford et al., 2025). These data indicate that hazard calls and points of departure are highly concordant between ECS and the TGR assay, and that assessment based on ECS would produce identical conclusions to those using TGR data.
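The percentages quoted above follow from simple counts, which can be checked with the short sketch below. The counts are taken from the text (35 experiments, one discordant call, one concordant negative); the function name and the label "positive agreement" are illustrative choices, not terms used by the workgroup.

```python
def concordance_summary(both_pos, both_neg, ecs_only_pos, oecd_only_pos):
    """Summarize agreement between paired hazard calls from two assays."""
    total = both_pos + both_neg + ecs_only_pos + oecd_only_pos
    any_pos = both_pos + ecs_only_pos + oecd_only_pos
    return {
        "total": total,
        # Share of experiments where both assays gave the same call.
        "overall_concordance": (both_pos + both_neg) / total,
        # Share of experiments with any positive call that were positive in both.
        "positive_agreement": both_pos / any_pos,
    }


# 33 concordant positives, 1 concordant negative (NDEA, bone marrow),
# 1 ECS-positive/TGR-negative result (N-nitroso reboxetine, liver).
summary = concordance_summary(both_pos=33, both_neg=1, ecs_only_pos=1, oecd_only_pos=0)
print(f"overall concordance: {summary['overall_concordance']:.0%}")  # 34/35
print(f"positive agreement:  {summary['positive_agreement']:.0%}")   # 33/34
```

Both ratios round to 97%, which is why the same figure appears twice in the paragraph even though the denominators differ.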

Table 3 lists 29 additional ECS experiments that were conducted mostly in wild type rodents without concurrent TGR or Pig-a data. There were 13 experiments with DS, four with PECC-Seq, and 12 with HiFi sequencing that quantified MF for 13 chemicals in six tissues. Seven of these chemicals do not overlap with those listed in Table 2 and include the non-mutagenic carcinogen methapyrilene (Sahib et al., 2024). Analysis of these 29 additional experiments shows 100% concordance in hazard calls with expected results based on available data and knowledge about the chemicals. Considering Tables 2 and 3, there are 18 chemicals that have been tested thus far, supporting the ability of ECS to correctly identify mutagenic compounds.

Table 3.

Additional in vivo experiments using Error-Corrected Sequencing for assessing chemical mutagenesis

| Exposure | Rodent System | Tissue | ECS Method | Expected result^a | ECS Result | Reference |
|---|---|---|---|---|---|---|
| 7,12-dimethylbenz[a]anthracene | C57BL/6NHsd mice | Kidney, Liver, Lung, Spleen | HiFi | + | + | Miranda et al. (2024) |
| Aflatoxin B1 | C57BL/6J mice | Liver | Duplex Sequencing | + | + | Minko et al. (2024) |
| | C57BL/6J mice | Liver | Duplex Sequencing | + | + | Luzadder et al. (2025) |
| Aristolochic Acid | Fischer 344 rats | Liver | Duplex Sequencing | + | + | Sahib et al. (2024) |
| Benzo[a]pyrene | C57BL/6J mice | Liver | PECC-Seq | + | + | You et al. (2023) |
| | Crl:NMRI (Han) mice | Bone Marrow, Liver | Duplex Sequencing | + | + | Simon et al. (2025) |
| Benzo[b]fluoranthene | MutaMouse | Bone Marrow, Liver | Duplex Sequencing | + | + | Schuster et al. (2024) |
| Methapyrilene | Fischer 344 rats | Liver | Duplex Sequencing | − | − | Sahib et al. (2024) |
| N-Ethyl-N-nitrosourea | C57BL/6J mice | Liver | PECC-Seq | + | + | You et al. (2023) |
| N-Nitroso-bisoprolol | Crl:NMRI (Han) mice | Bone Marrow, Liver | Duplex Sequencing | + | + | Simon et al. (2025) |
| N-Nitrosodiethylamine | C57BL/6J mice | Liver | PECC-Seq | + | + | You et al. (2023) |
| N-Propyl-N-nitrosourea | C57BL/6NHsd mice | Kidney, Liver, Lung, Spleen | HiFi | + | + | Miranda et al. (2024) |
| Procarbazine hydrochloride | C57BL/6NHsd mice | Kidney, Liver, Lung, Spleen | HiFi | + | + | Miranda et al. (2024) |
| Quinoline | DBA2 mice | Liver | PECC-Seq | + | + | You et al. (2023) |
| Urethane | Tg-rasH2 mice | Blood, Lung, Spleen | Duplex Sequencing | + | + | Valentine et al. (2020) |

^a Based on previously published data with TGR models.

An important consideration is whether ECS detects increases in MF at the same or lower doses than the TGR assay. This is particularly critical as the genetic toxicology community is increasingly emphasizing quantitative assessments of mutagenicity data. Evidence to date indicates that ECS is equally or more sensitive than conventional tests in 28-day oral exposure designs; for example, a study with NDEA in Big Blue® male rats showed a ‘lowest observed genotoxic effect level’ (LOGEL) for MF in liver at 0.1 mg/kg/day by DS, whereas the LOGEL for the TGR cII mutation assay in the same rats was 1 mg/kg/day (Bercu et al., 2023). Similarly, Zhang et al. (2024) observed a LOGEL for NDEA in male Big Blue® mouse liver MF at 0.1 mg/kg/day with DS, while the LOGEL for the cII TGR assay was 1 mg/kg/day. Comparative dose response analysis with another N-nitrosamine, NDMA, has shown that DS detects a LOGEL below that of the lacZ assay in MutaMouse male livers (Ashford et al., 2025). Dodge et al. (2023) analyzed bone marrow of MutaMouse males exposed to procarbazine hydrochloride (PRC), an anti-neoplastic agent that is a more potent clastogen than a mutagen. They found a statistically significant increase in MF at the lowest dose of PRC by DS but not with the TGR lacZ assay. In all these cases, the ability of DS to detect increases in MF at doses that are negative with the TGR assay results from its reduced intragroup variability. Finally, the only study so far available in germ cells (LeBlanc et al., 2025) reported a significant increase in ENU-induced mutations only at the high dose with DS, while the TGR assay showed significant effects at both the high and middle doses. Thus, while the published studies demonstrate that ECS detects chemically induced mutations at lower doses than the TGR assay in somatic tissues, additional research is needed to determine whether this is the case in germ cells.

A current gap in the available ECS data is the very limited analysis of non-mutagenic chemicals (expected negatives) other than methapyrilene (Sahib et al., 2024) and the six solvent vehicles that have been used (Table 4). The use of vehicle controls as non-mutagenic chemicals to evaluate assay performance was an approach reported in the Detailed Review Paper for the TGR assay and accepted by the OECD (OECD, 2009). Moreover, simulation analyses of DS data from rat vehicle control liver samples have estimated rates of false positives by comparing random groups of control samples to each other (unpublished data generously provided by Maik Schuler and Shaofei Zhang, Pfizer, and analyzed by Andrew Williams, Health Canada). These simulations identified a false discovery rate of ~0.05 (Supplementary Table 1), which aligns with the conventional threshold for statistical significance (α = 0.05) used in hypothesis testing. Overall, the results suggest that there is a 95% probability (adjusted for multiple comparisons) of not seeing a significant difference when randomly comparing groups of samples treated with non-mutagenic agents.
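The idea of comparing random groups of control samples can be illustrated with a small simulation. This is a sketch under stated assumptions, not the analysis behind Supplementary Table 1: the control MF values are simulated from a lognormal distribution with hypothetical parameters (median near 0.5 ×10−7), and an exact permutation test on group means is swapped in because the statistical model used by the workgroup is not described here.

```python
import itertools
import random


def permutation_p_value(group_a, group_b):
    """Exact two-sided permutation test on the difference in group means."""
    pooled = group_a + group_b
    n_a, n_b = len(group_a), len(group_b)
    total = sum(pooled)
    observed = abs(sum(group_a) / n_a - sum(group_b) / n_b)
    extreme = 0
    count = 0
    for idx in itertools.combinations(range(len(pooled)), n_a):
        sum_a = sum(pooled[i] for i in idx)
        diff = abs(sum_a / n_a - (total - sum_a) / n_b)
        # Small tolerance guards against floating-point ties.
        if diff >= observed - 1e-12:
            extreme += 1
        count += 1
    return extreme / count


def false_positive_rate(control_pool, group_size=4, n_sim=2000, alpha=0.05, seed=1):
    """Fraction of random control-vs-control comparisons called significant."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_sim):
        drawn = rng.sample(control_pool, 2 * group_size)
        p = permutation_p_value(drawn[:group_size], drawn[group_size:])
        hits += p <= alpha
    return hits / n_sim


# Hypothetical pool of 30 vehicle-control MF values, in units of 1e-7 per bp.
pool_rng = random.Random(0)
control_pool = [pool_rng.lognormvariate(-0.7, 0.2) for _ in range(30)]
rate = false_positive_rate(control_pool)
print(f"estimated false positive rate: {rate:.3f}")
```

Because both groups are drawn from the same pool, any significant call is by construction a false positive, so the estimated rate should not exceed the chosen α apart from simulation noise.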

Table 4.

Vehicle controls and spontaneous mutation frequencies in Error-Corrected Sequencing studies.

| Exposure | Rodent System | Tissue | ECS Method | DNA fragmentation method | Mutation Frequency (×10−7 ± SD) | Reference |
|---|---|---|---|---|---|---|
| 0.5% Methylcellulose in Deionized Water | Big Blue® rat | Liver | Duplex Sequencing | Sonication | 0.94 ± 0.08 | Bercu et al. (2023) |
| | Big Blue® mouse | Bone marrow; Liver | Duplex Sequencing | Enzymatic | 0.49 ± 0.07; 0.41 ± 0.09 | Zhang et al. (2024) |
| | Big Blue® mouse | Kidney; Liver; Liver | Duplex Sequencing | Enzymatic | 0.30 ± 0.08; 0.52 ± 0.18; 0.41 ± 0.07 | Zhang et al. (2025) |
| | Crl:NMRI (Han) mice | Bone marrow; Liver | Duplex Sequencing | Enzymatic | 0.62 ± 0.14; 0.96 ± 0.21 | Simon et al. (2025) |
| | Big Blue® rat | Liver | Duplex Sequencing | Enzymatic | 0.53 ± 0.16 | Xia et al. (2025) |
| | MutaMouse | Liver | Duplex Sequencing | Enzymatic | 0.50 ± 0.10 | Ashford et al. (2025) |
| Corn oil | Gpt delta rat | Liver | PECC-Seq | Sonication | 10.03 ± 5.34^b | Izawa et al. (2023) |
| Dimethyl sulfoxide (DMSO) | Gpt delta mouse | Liver | Duplex Sequencing | Sonication | 2.7^c | Chawanthayatham et al. (2017) |
| | C57BL/6J | Liver | Duplex Sequencing | Sonication | 0.60 ± 0.05 | Minko et al. (2024) |
| | C57BL/6J | Liver | Duplex Sequencing | Sonication | 0.40 ± 0.06 | Luzadder et al. (2025) |
| Olive oil | Big Blue® mouse | Bone Marrow; Liver | Duplex Sequencing | Sonication | 2.09 ± 0.76; 1.10 ± 0.49 | Valentine et al. (2020) |
| | MutaMouse | Bone Marrow | Duplex Sequencing | Sonication | 1.25 ± 0.25 | LeBlanc et al. (2022) |
| | MutaMouse | Bone Marrow | Duplex Sequencing | Sonication | 1.31 ± 0.19 | Dodge et al. (2023) |
| | MutaMouse | Bone Marrow; Liver | Duplex Sequencing | Enzymatic | 0.41 ± 0.27; 0.53 ± 0.15 | Schuster et al. (2024) |
| | Gpt delta mouse | Bone Marrow; Kidney | Hawk-Seq | Sonication | 2.25 ± 0.20; 2.04 ± 0.20 | Matsumura et al. (2019) |
| | Gpt delta mouse | Liver | PECC-Seq | Sonication | 0.28 ± 0.04 | You et al. (2023) |
| | C57BL/6J | Liver | PECC-Seq | Sonication | 0.69 ± 0.05 | You et al. (2023) |
| | DBA2 | Liver | PECC-Seq | Sonication | 0.49 ± 0.11 | You et al. (2023) |
| Phosphate buffered saline | SD^a rat | Blood; Bone Marrow; Liver; Stomach | Duplex Sequencing | Sonication | 1.42 ± 0.19; 1.70 ± 0.53; 1.02 ± 0.15; 0.47 ± 0.09 | Smith-Roe et al. (2023) |
| | Tg-rasH2 mice | Blood; Lung; Spleen | Duplex Sequencing | Sonication | 1.25 ± 0.45; 0.74 ± 0.21; 1.00 ± 0.29 | Valentine et al. (2020) |
| Saline | Gpt delta mouse | Bone marrow; Liver | Hawk-Seq | Sonication | 1.72 ± 0.08; 2.86 ± 0.20 | Matsumura et al. (2019) |
| | Gpt delta mouse | Kidney | PECC-Seq | Sonication | 0.37 ± 0.09 | You et al. (2023) |
| | Gpt delta mouse | Liver; Lung | Duplex Sequencing | Sonication | 3.64 ± 0.68; 0.40 ± 0.03 | Armijo et al. (2023) |
| | Fischer 344 rats | Liver | Duplex Sequencing | Sonication | 0.50 ± 0.07 | Sahib et al. (2024) |
| Water (Milli-Q; autoclaved) | C57BL/6N Hsd mice | Kidney; Liver; Lung; Spleen | PacBio HiFi | Mechanical (g-tubes) | 0.50 ± 0.04; 0.79 ± 0.12; 0.43 ± 0.03; 0.71 ± 0.07 | Miranda and Revollo (2024) |

Where a row lists several tissues, the mutation frequencies are given in the same order.

^a SD = Sprague-Dawley.

^b Shift to enzymatic fragmentation reduced the mutation frequency to 1–3 ×10−7 (Suzuki, personal communication).

^c Standard deviation not available.

Regulatory acceptance of ECS depends on technical reproducibility within and across laboratories. Two studies have conducted DS library preparations and sequencing in different facilities to explore cross-laboratory reproducibility of rodent MF (Table 5). LeBlanc et al. (2022) analyzed mutations in bone marrow DNA from mice orally exposed to benzo[a]pyrene (BaP) and vehicle controls, achieving an r value of 0.97 between libraries prepared at Health Canada and those prepared at TwinStrand Biosciences. Similarly, Smith-Roe et al. (2023) reported r values of 0.99 for a variety of tissues in rats orally exposed to ENU or vehicle controls, comparing libraries prepared at Inotiv (Research Triangle Park, NC) with libraries prepared at TwinStrand Biosciences. The Health and Environmental Sciences Institute (HESI) ECS working group conducted the most comprehensive inter-laboratory reproducibility study to date using DS (Zhang et al., 2025b). DNA samples from livers of an untreated Sprague Dawley rat, or rats treated with either 100 mg/kg/day of BaP for 10 days or 40 mg/kg/day ENU for three days, were mixed to construct simulated test samples with target MF increases of 1.2-, 1.5-, and 2-fold above the untreated controls. Eight laboratories in North America and Europe prepared libraries for sequencing these samples. The measured MF and spectra were virtually identical across the laboratories (Zhang et al., 2025b). A high level of technical and biological reproducibility has also been reported using in vitro samples, with high correlation (r = 0.97) in independent experiments on ENU-treated human TK6 cells from two separate laboratories (Health Canada and Inotiv), both analyzed by TwinStrand Biosciences (Cho et al., 2023). Finally, a collaborative study conducted within the Japanese Environmental Mutagen and Genome Society (JEMS) – Mammalian Mutagenicity Study Group demonstrated high inter-laboratory reproducibility (r² ≥ 0.97) of Hawk-Seq using mouse DNA samples across three laboratories (Matsumura et al., 2025). Collectively, these findings underscore the high reproducibility of results produced using ECS, establishing a robust foundation for regulatory acceptance and facilitating adoption across laboratories globally.

Table 5.

Interlaboratory reproducibility studies applying the same Error-Corrected Sequencing assay on the same rodent tissue samples in two different laboratories.

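| Treatment | Rodent Model System | Tissue | Lab 1 | Lab 1 Result | Lab 2 | Lab 2 Result | Correlation of Results (r) | Concordance | Reference |
|---|---|---|---|---|---|---|---|---|---|
| Benzo[a]pyrene (BaP) | MutaMouse | Bone marrow | TwinStrand | + | Health Canada | + | 0.94 | + | LeBlanc et al. (2022) |
| N-ethyl-N-nitrosourea (ENU) | SD^a Rat | Stomach, blood, bone marrow, liver | TwinStrand | + | Inotiv-RTP | + | 0.99 | + | Smith-Roe et al. (2023) |

The Lab 1, Lab 2, and correlation columns together make up the "Interlaboratory Reproducibility" column group of the original table.

^a SD = Sprague-Dawley.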

Several studies have conducted power analyses to determine the sample sizes required to detect different effect sizes using ECS. For example, Dodge et al. (2023) used data from PRC-treated mice to determine that three animals per dose group, with a minimum of 500 million consensus bases per sample, are sufficient to detect a 1.5-fold increase in mutations with >80% power. Esina et al. (2024) collected data from mouse DNA libraries prepared using enzymatic fragmentation, which reduces the background MF relative to sonication. They used simulated liver and bone marrow mouse DS data to explore different sigmas and duplex read depths, again with 500 ng DNA input. They found that a sample size of four animals per group was sufficient to obtain >80% power to detect a 2-fold change in MF relative to baseline, even when modelling the largest amount of animal-to-animal variability they considered. The HESI ECS working group also conducted simulation studies based on parameters derived from DS libraries prepared using 1000 ng input of enzymatically fragmented DNA in their eight-laboratory ring trial described above (Zhang et al., 2025b). They concluded that DS has 80% power to detect a 1.5-fold increase in MF with animal group sizes of N = 4. The OECD test guidelines for the TGR and Pig-a assays recommend 5 and 6 animals per dose group, respectively. Thus, although they are based on studies conducted with potent mutagens, these power analyses indicate that DS studies could use smaller animal group sizes to detect similar increases in MF compared with conventional in vivo mutagenesis assays. Given the reported MF and inter-animal variability observed with other ECS platforms, studies utilizing these alternative platforms are expected to achieve sufficient power when employing the recommended animal group sizes currently used for in vivo mutagenicity assessment (note that repeated-dose toxicity studies in rodents typically use group sizes of n = 10 per sex). More work is needed to determine power for other platforms, although the working group expects that the other platforms will demonstrate similar power.
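One reason sequencing depth enters these power calculations is that MF is estimated from a finite count of mutations per sample. The sketch below works through that arithmetic for a single illustrative case. It assumes that mutation counts follow a Poisson distribution (an assumption made for this example, not a statement about the models used in the cited studies), and it combines the 500 million consensus bases mentioned above with the vehicle-control MF listed for Dodge et al. (2023) in Table 4. Function names are hypothetical.

```python
import math


def expected_mutations(mf_per_bp, duplex_bases):
    """Expected number of mutations observed at a given sequencing depth."""
    return mf_per_bp * duplex_bases


def poisson_cv(expected_count):
    """Relative counting noise (SD / mean) if counts are Poisson-distributed."""
    return 1.0 / math.sqrt(expected_count)


background_mf = 1.31e-7  # per bp; vehicle control in Table 4 (Dodge et al., 2023)
duplex_bases = 500e6     # minimum consensus bases per sample quoted in the text

control = expected_mutations(background_mf, duplex_bases)
treated = expected_mutations(1.5 * background_mf, duplex_bases)  # 1.5-fold increase

print(f"expected mutations, control:  {control:.1f}")
print(f"expected mutations, 1.5-fold: {treated:.1f}")
print(f"counting CV, control:         {poisson_cv(control):.1%}")
```

Under these assumptions a control sample yields on the order of 65 mutations, so counting noise alone is roughly 12% per sample before any animal-to-animal variability is added; lower backgrounds or shallower sequencing increase that noise, which is why depth and group size are considered together.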

Overall, the available data indicate strong concordance between ECS in vivo mutagenicity results and OECD TGs, particularly in liver and bone marrow. The existing OECD tests were validated using a wide range of chemicals covering a broad spectrum of chemical structures, mechanisms of mutagenicity, and potency. Sanger sequencing was used to verify that the mutations were the cause of the mutant phenotypes observed in the conventional tests. It is thus debatable whether ECS, a sequencing technology, requires the same level of validation to demonstrate equivalence. Although future evaluation across a wider range of chemicals, encompassing additional mechanisms of mutagenicity and weak mutagens, would be useful, this should not be an impediment to the deployment of ECS’s capabilities in a regulatory setting.

Liver, gastrointestinal tract, and bone marrow are the tissues most frequently used in gene mutation testing in the TGR assay (Lambert et al., 2005). Expanding the scope of ECS to additional tissues beyond liver and bone marrow would be beneficial, but again this should not be an impediment to ECS deployment: available TGR data for stomach, kidney, and blood clearly demonstrate consistency with ECS results (Table 2), and where no data currently exist, concurrent vehicle control assessments would provide sufficient comparison. Moreover, it would take considerable time and resources to generate ECS data in a wide range of tissues up-front, and this could unnecessarily impede ECS adoption. There is also a need for more published studies on compounds expected to yield negative in vivo genotoxicity results to support a balanced assay performance assessment. Although a substantial number of nitrosamines tested are negative for mutagenicity using DS, these findings have yet to be published (Schuler, personal communication). These data will be critical for assessing the specificity and false positive rates of ECS technologies. Finally, these initial studies suggest that technical sources of variation are minimal compared to biological variation; however, this observation would benefit from further examples across different ECS platforms beyond DS.

While further studies would help to confirm the robustness and reproducibility of these technologies, particularly in the context of regulatory applications, it is important to recognize the standard elements that are shared across all in vivo mutagenicity assays (e.g., animal treatments, tissue sampling, DNA preparation) and every ECS platform (e.g., DNA sequencing and bioinformatics pipelines). Inter-animal variability exists for all in vivo studies. The main differentiators among ECS platforms are the library preparation strategies used to identify rare mutations and the bioinformatics pipelines. However, as described below, bioinformatics pipelines can be subjected to quality controls and standardization to ensure transparency and reproducibility of data analyses as, for example, proposed for toxicogenomics (Harrill et al., 2021). Available data acquired using DS clearly provide sufficient proof of concept and establish a framework for other ECS platforms. As such, the potential of ECS as a viable alternative to traditional in vivo mutagenicity tests is self-evident and its adoption for regulatory testing is fully supported by the IWGT working group.

5. CONSENSUS ON THE USE OF ECS FOR REGULATORY MUTAGENICITY TESTING

During the development of OECD TGs 488 and 470, confirming the genotypic basis of the mutant phenotypes through DNA sequencing (the “gold standard”) represented a critical validation step (Lambert et al., 2005; OECD, 2009; OECD, 2020). It could be argued that the application of ECS in mutagenicity testing is a natural evolution of these approaches. As such, it could be said that the technology has come “full circle”. Thus, extensive validation of ECS for well-characterized mutagens may be unnecessary as the various ECS platforms reviewed in Section 3 simply replicate the diversity of assays already used to assess mutagenesis and offer a molecular (DNA-level) lens on mutagenesis rather than a phenotypic one. Nonetheless, it is important to understand how specific ECS platforms and assay conditions influence performance to ensure confidence in their regulatory application.

In the following sections, we present and discuss the consensus statements (highlighted in bold) reached by the working group on available ECS methodologies, recommended experimental design, and evaluation and interpretation of results for studies aimed at generating mutagenicity data for regulatory submissions. The working group recognizes the diversity of available platforms for mutation detection by ECS. Consensus recommendations that are independent of any specific method are briefly discussed below.

5.1. Comparison of ECS and TGR results

Based on the available data (see section 4), the working group agrees that there is sufficient evidence to indicate that ECS produces results concordant with those obtained using TGR assays. The available data come from experiments using well-characterized mutagens (Tables 2 and 3); thus, no conclusion can be made at this time about the specificity of ECS. Data with expected negative compounds would be needed to evaluate the ability of ECS to correctly detect non-mutagenic compounds. Expanding the analysis to include more chemicals (mutagens with diverse mechanisms and potencies and non-mutagens) would also be valuable. However, the expected negative results observed with methapyrilene (Sahib et al., 2024), a non-genotoxic carcinogen, and NDEA in bone marrow, which lacks the appropriate metabolic capacity, contrasted with the positive results in liver (Zhang et al., 2024), support the utility of ECS in correctly characterizing mutagenic activity. In addition, comparison of mutation frequencies in animals receiving vehicle controls versus naïve animals could provide further confidence in the ability of ECS to correctly identify non-mutagens. It is noted that the evidence is more extensive for DS and Hawk-seq than for other ECS approaches. Nevertheless, the working group expects that other ECS assays will work as well, provided that the recommendations outlined in the present document are adhered to. Testing across different ECS platforms to compare sensitivity and specificity in detecting mutations in vivo should primarily focus on demonstrating platform equivalence relative to current TG methods (i.e., TGR and Pig-a assays). Given the strong evidence from DS, such comparative studies should be conducted only as a prerequisite for regulatory acceptance rather than as a fundamental validation of ECS technologies. Indeed, the use of tissues/DNA samples from existing comparative studies is encouraged, where possible.

5.2. ECS library preparation

OECD TG 488 does not describe methods for isolating DNA from tissues. Similarly, the working group proposes that upstream sample processing, such as DNA extraction, should not be considered part of the TG for ECS. In addition, while it is theoretically possible to study mutations using RNA (Jessen et al., 2021), the working group recommends that the input for the library preparation portion of the assay should be genomic DNA, ideally in a quantity that is readily attainable from commonly assessed tissues. It is desirable to use quality contract research services and/or off-the-shelf commercially available kits manufactured according to appropriate quality standards and certifications for preparing libraries and sequencing. Independently of whether the ECS assay is conducted using a kit or a service, protocols and all reagents should be defined, and service laboratories should demonstrate proficiency for each assay element. During the establishment of assay proficiency, reference DNA samples with established performance metrics and known MF should be included as technical controls for evaluating library preparation, sequencing, and bioinformatic processing. Additionally, the physical portion of the assay should be version-controlled, with any change to a component or handling step clearly documented and resulting in a new version number.

5.3. Spontaneous mutation frequencies

The working group agrees that the assay may be non-targeted (genome-scale) or target-based (defined regions of the genome) but should be capable of detecting MF in the order of 10⁻⁷ or lower in vehicle controls. Available data show that, with one exception, rodent tissues in vehicle control groups have MF in the range of 0.28 to 2.86×10⁻⁷ (Table 4); thus, the assay should be able to accurately quantify MF in the expected range, though factors including age, tissue, and fragmentation approach can affect baseline MF. The total consensus nucleotide coverage for determining MF should be sufficiently high to allow derivation of mutational spectra and signatures. A range from 5×10⁸ to 1×10⁹ analyzable consensus bases per sample may be a good starting point, though more may be required if controls have extremely low MF or if granular trinucleotide spectra are desired, and fewer may be required for strong mutagens or large groups. The relationship between DNA input, sequencing allocation, and total consensus bases should be determined empirically for each assay. The use of non-targeted or target-based approaches should be clearly described in upstream documentation and in reports, as the approaches require different strategies for variant filtering and interpretation.
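As a rough illustration of why coverage in this range is a plausible starting point, the expected number of mutations observed in a sample is the product of MF and the number of analyzable consensus bases. The sketch below assumes that mutation counts are approximately Poisson-distributed, which is an assumption made here for illustration and not a workgroup recommendation; the function names are hypothetical.

```python
import math


def expected_mutations(mf: float, consensus_bases: float) -> float:
    """Expected mutation count = MF (per base) x analyzable consensus bases."""
    return mf * consensus_bases


def prob_at_least(k: int, mean: float) -> float:
    """P(X >= k) for a Poisson count with the given mean (illustrative assumption)."""
    cdf_below_k = sum(math.exp(-mean) * mean**i / math.factorial(i) for i in range(k))
    return 1.0 - cdf_below_k


# Vehicle-control MF range quoted in the text, at the suggested coverage range.
for bases in (5e8, 1e9):
    for mf in (0.28e-7, 1e-7, 2.86e-7):
        mean = expected_mutations(mf, bases)
        print(f"MF={mf:.2e}, bases={bases:.0e}: expected mutations = {mean:.0f}")
```

Under these assumptions, a control sample at the low end of the reported range (0.28×10⁻⁷) sequenced to 5×10⁸ consensus bases would be expected to yield about 14 mutations, which shows why more coverage may be needed when a detailed spectrum is the goal.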

5.4. Identification of germline variants

Germline variants can artificially inflate the observed MF if they are counted as mutations. Accounting for germline SNPs is particularly critical for outbred models commonly used in safety assessment. For example, significant population divergence has been observed across colonies of Charles River and Harlan Sprague Dawley rats, even within the same facility (Gileta et al., 2022), millions of novel variants have been identified in CD-1 mice (Jung et al., 2023), and distinct differences have been reported between separate MutaMouse colonies (Meier et al., 2017). Thus, the working group agrees that the assay must effectively account for, and remove, germline variants. Germline variants will generally be present at variant allele frequency (VAF) of approximately 0.5 or 1.0 with some variation, and somatic mutations arising during development are present at varying levels (Kim et al., 2022; Muyas et al., 2020). If the assay generates sufficient molecular depth, as targeted assays generally do, germline variants can be distinguished from induced mutations in a single assay. In tests with DS, somatic mutations induced by toxic exposures are generally present in one or a relatively small number of molecules out of a molecular depth of tens of thousands, easily distinguishing them from germline or moderately clonal variants.

Filtering of germline variants is particularly important for genome-scale assays, as the data may contain many more SNPs than chemically induced mutations. If the assay is performed at shallow molecular depth, germline genotyping with WGS can be performed in parallel to identify germline variants to be filtered from ECS data. Studies in outbred strains or humans may require WGS to be run on every sample. Public SNP databases or published SNP datasets, mutation data from different animals, different organs within the same animal, or different reads in the same sample are all approaches that could be useful in identifying possible germline variants. Future studies should confirm that germline filtering using representative samples or public datasets sufficiently reduces background mutation frequencies for analyses using outbred strains.
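The two filtering routes described in this section (VAF at sufficient molecular depth, and an external germline call set such as parallel WGS or a public SNP database) can be sketched as follows. This is a minimal illustration only: the `Variant` record, the function name, and the 0.3 VAF cutoff are hypothetical choices made here, not values recommended by the working group, and a real pipeline would tune the threshold to the assay's depth and to clonal variants.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Variant:
    chrom: str
    pos: int
    ref: str
    alt: str
    alt_count: int  # consensus molecules supporting the alternate allele
    depth: int      # total consensus molecules covering the position


def split_germline(variants, vaf_threshold=0.3, known_germline=frozenset()):
    """Separate likely germline variants from candidate somatic mutations.

    A variant is treated as likely germline if its VAF is at or above the
    (hypothetical) threshold, or if its (chrom, pos, ref, alt) key appears in
    a known-germline set supplied by the caller.
    """
    somatic, germline = [], []
    for v in variants:
        vaf = v.alt_count / v.depth if v.depth else 0.0
        key = (v.chrom, v.pos, v.ref, v.alt)
        if vaf >= vaf_threshold or key in known_germline:
            germline.append(v)
        else:
            somatic.append(v)
    return somatic, germline
```

Returning both lists, rather than discarding the germline calls, keeps every variant accounted for, in line with the traceability expected of the bioinformatics pipeline.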

5.5. Bioinformatics pipeline

ECS approaches require refined and transparent bioinformatics pipelines for OECD acceptance. The working group recommends that the assay and bioinformatic tools/settings should be transferable to enable inter-laboratory reproducibility as per an OECD reporting framework. Software for computer analyses of raw sequencing data, filtering strategy, metrics of mutation classification, and detection of clonal expansion should be packaged into a complete, version-controlled software suite. The suite may contain open-source tools and/or customized tools or algorithms. This suite should be part of the regulated assay. A change to any component in the suite should result in a new version number for the entire suite. Analysis steps should be transparent and preferably vetted by the community for accuracy. This will enable appropriate quality control for the bioinformatics portion of the assay and facilitate portability and reproducibility. Justification for each filtering criterion should be scientifically sound. Filtering steps should be accounted for such that a user can combine the final variant call set with the filtered variants to obtain a variant set matching the pre-filtered set.
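The accounting requirement in the last sentence can be expressed as a simple consistency check. The sketch below assumes each variant call is represented as a hashable tuple such as (chromosome, position, reference, alternate); the function name is hypothetical and the check is illustrative, not part of any specific ECS software suite.

```python
from collections import Counter


def filters_are_accounted_for(pre_filter, retained, filtered_out):
    """Return True if retained + filtered_out reconstructs the pre-filter call set.

    Counter comparison treats the call sets as multisets, so a call that is
    silently dropped or duplicated by a filtering step is detected.
    """
    return Counter(retained) + Counter(filtered_out) == Counter(pre_filter)


pre = [("chr1", 100, "A", "G"), ("chr1", 200, "C", "T"), ("chr2", 300, "G", "A")]
kept = [("chr1", 200, "C", "T")]
removed = [("chr1", 100, "A", "G"), ("chr2", 300, "G", "A")]
print(filters_are_accounted_for(pre, kept, removed))
```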

6. CONSENSUS ON THE EXPERIMENTAL DESIGN FOR ECS STUDIES

OECD TG 470 and 488 provide a well-established framework for assessing in vivo mutagenicity, offering standardized methodologies, validation criteria, and regulatory acceptance. Recommendations in these guidelines can be leveraged to streamline the integration of ECS by aligning it with current best practices. Key elements, such as study design, data interpretation, and statistical thresholds, can be adopted to ensure consistency and regulatory relevance while addressing the unique capabilities of ECS methods.

6.1. Exposure duration

One of the most significant advances that ECS provides is the ability to conduct mutation studies without requiring specialized transgenic rodents. Given that OECD TG 488 recommends a 28-day exposure, integrating mutagenicity testing within OECD TG 407 (i.e., the 28-day repeated dose toxicity test) is both logical and feasible. The working group agrees that integration of ECS within a standard 28-day repeat-dose rodent toxicity study constitutes a valid mutagenicity test. Longer exposure durations (i.e., 90- or 180-days) are also acceptable. Dose-setting and study design elements for these TGs are highly aligned: the approaches used are based on maximum feasible dose, maximum tolerated dose, or clinical considerations. Tissue selection may be based on TG 488 guidelines or target tissues identified in other studies. Indeed, it is practical to integrate both the in vivo micronucleus assay (OECD, 2016a) and mutagenicity testing within OECD repeated dose toxicity studies. This would provide a “one stop” in vivo assay covering the salient endpoints of concern for genetic toxicology (i.e., mutagenicity, clastogenicity and aneuploidy), dramatically increasing efficiencies in testing and reducing animal use. We recommend that this be trialed and promoted for adoption as soon as possible.

Theoretically, ECS provides the opportunity to integrate mutagenicity assessment into any of the sub-chronic and chronic duration rodent toxicology studies, such as the repeated dose 90-day oral toxicity study (OECD, 2018a), chronic toxicity studies (OECD, 2018c), and reproductive toxicity studies such as the combined repeated dose toxicity study with the reproduction/developmental toxicity screening test (OECD, 2016g). These tests may provide ideal test systems within which to conduct mutagenicity studies as we transition to more quantitative applications of genetic toxicology and increase our understanding of chronic duration exposures that are more relevant to human pharmaceutical, consumer product, and environmental exposures. TG 488 cautions that exposure durations longer than eight weeks may produce an apparent increase in mutant frequency through clonal expansion. However, unlike TGR models, ECS can identify and correct for clonal expansion given sufficient molecular depth (Dodge et al., 2023; LeBlanc et al., 2022). A DS study using MutaMouse males orally exposed to benzo[b]fluoranthene, a mutagenic polycyclic aromatic hydrocarbon (Long et al., 2016; Schuster et al., 2024), for extended durations (up to 180 days) demonstrated strong mutagenic responses following the longer exposure durations and lower BMDs with prolonged exposure for unique (not clonally expanded) mutations (Yauk, Marchetti, personal communication). Thus, empirical evidence supports the ability to integrate ECS within longer duration (> 28 day) and even chronic studies where warranted.

6.2. Sampling time for somatic tissues and germ cells

Gene mutation studies typically include an expression time, often referred to as sampling time, after exposure to allow the manifestation of mutations (Thybaud et al., 2003). Each tissue may have an optimal sampling time based on its proliferation rate; however, it is neither practical nor desirable to have different sampling times for different tissues in regulatory testing primarily due to considerations related to cost and the 3R principles (Diaz et al., 2020). In line with this, the working group agreed that exposure durations of at least 28 days do not require an expression time except for germ cells when the 28-day exposure duration is employed. In the latter case, an expression time of 28 days, as recommended in TG 488, should be used. For somatic cells, the recommendation not to require an expression time is aligned with TG 470, but not with TG 488, which requires at least three days of expression time. While expression time allows for unrepaired DNA lesions to be fixed into stable mutations, the impact of measuring mutations the day after a 28-day exposure, rather than three days after as per TG 488, is expected to be small and should not adversely affect the ability to correctly identify mutagenic compounds (Hori et al., 2019). However, it would be beneficial to have empirical data demonstrating this.

For germ cells, a 28-day sampling time after a 28-day repeated dose study is necessary to ensure that the population of germ cells that can be collected from the seminiferous tubules has received sufficient exposure during the proliferating phase of spermatogenesis when mutations are fixed (Marchetti et al., 2018). However, this does not preclude integration with standard toxicity tests as they normally include animals from satellite groups that are maintained for monitoring the reversibility, persistence, or delayed occurrence of toxic effects. Animals from these satellite groups could be used for assessing mutagenicity in germ cells at the required sampling time.

6.3. Alternative experimental designs

Although it is possible to identify mutagenic chemicals in shorter-term studies, there are insufficient data, even from the TGR assay, to have confidence in a negative call in these tests. The working group agrees that exposure durations shorter than 28 days represent a valid test, provided they produce a positive result. A study with a minimum 28-day exposure duration is required to demonstrate the absence of mutagenicity. Evidence suggests that genetically neutral transgene mutations accumulate linearly with the number of daily or weekly treatments up to 90 days of exposure (Heddle et al., 2000). The same may be true for mutations detected by ECS methods. Consequently, the more treatments are given, the more mutations are likely to be induced, suggesting that ECS could be integrated into long-term studies. However, longer treatments are often more toxic than short-term treatments, which means that sometimes a compromise between treatment duration and dosage needs to be made. In addition, the ideal sampling time is highly dependent on the tissue, the mutagen and the impact of cytotoxicity (Heddle et al., 2003). This is especially true when considering the best timepoints for slowly proliferating tissues like the liver or rapidly proliferating tissues like the bone marrow.

6.4. Number of animals

In vivo OECD TGs for genotoxicity testing generally recommend 5–6 animals per dose group and three doses plus controls, with a 2-fold increase with respect to controls generally considered a biologically meaningful response (OECD, 2017). The working group concurs that the number of animals per group should be chosen to enable the detection of a 2-fold change in MF with 80% power. Existing data, at least from studies employing DS, demonstrate that this is achievable with ≤ 5 animals per group (see Section 4). For the positive control group, three animals are sufficient to demonstrate that the assay is working properly (see below). In addition, each laboratory should conduct power analyses during the establishment of proficiency to ensure that their experimental design aligns with these goals. Notably, the number of animals required to conduct an ECS study is less than what is normally used in standard repeat dose toxicity tests, further supporting the integration of mutagenicity testing within other toxicity endpoints.
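A laboratory's power analysis could start from a simple approximation such as the one sketched below. It assumes that log2-transformed MF is roughly normally distributed with a common inter-animal standard deviation, and it uses a two-sided normal approximation; both are simplifications introduced here for illustration. The function names and the example SD of 0.4 log2 units are hypothetical, and the normal approximation is optimistic for very small groups, so a t-based or simulation-based analysis on the laboratory's own control data would be preferable in practice.

```python
import math
from statistics import NormalDist


def power_fold_change(n_per_group, sd_log2, fold_change=2.0, alpha=0.05):
    """Approximate power to detect a fold change in MF between two groups.

    Assumes log2(MF) is normal with common SD `sd_log2` (per animal) and uses
    a two-sided normal approximation for the two-sample comparison.
    """
    z = NormalDist()
    delta = math.log2(fold_change)                    # effect size on log2 scale
    se = sd_log2 * math.sqrt(2.0 / n_per_group)       # SE of the group difference
    z_crit = z.inv_cdf(1.0 - alpha / 2.0)
    return z.cdf(delta / se - z_crit) + z.cdf(-delta / se - z_crit)


def min_animals(sd_log2, target_power=0.80, max_n=50, **kwargs):
    """Smallest group size (2..max_n) reaching the target power, or None."""
    for n in range(2, max_n + 1):
        if power_fold_change(n, sd_log2, **kwargs) >= target_power:
            return n
    return None


# Hypothetical inter-animal SD of 0.4 log2 units, five animals per group.
print(round(power_fold_change(5, 0.4), 3))
```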

6.5. Use of positive controls

In mutagenicity testing, the inclusion of a positive control group is used to demonstrate that the method is working properly and capable of detecting chemically induced mutations. This is even more critical when the test substance does not induce mutations, as the positive control serves as the sole evidence of the method’s ability to detect mutagenicity. This practice is an integral part of all OECD mutagenicity tests. The working group agrees that the use of a positive control is required during establishment of proficiency in the conduct of the ECS assay. However, once proficiency is established as described in OECD TG 488, a concurrent or non-concurrent positive control is not required for every test and several strategies could be used to control the performance of the assay. Existing OECD TGs for in vivo mutagenicity testing allow the possibility to omit the routine in-life phase for positive control groups, primarily for animal welfare reasons. For example, the in vivo micronucleus TG (OECD, 2016a) permits the use of previously generated positive control samples to confirm the accuracy of the scoring process. Similarly in TG 488, positive control tissue samples from previous studies can serve as concurrent positive controls to verify the laboratory’s ability to lyse and successfully package DNA into phages and infect bacteria. The working group believes this strategy is equally applicable to ECS studies.

Another factor that should not be overlooked is that, in contrast with other endpoint readouts, ECS has integrated quality measures that ensure the overall quality of the data set. These quality measures include evaluating the quality of the DNA used for library preparation and assessing sequencing read quality using, for example, the Phred quality score (Q score) (Richterich, 1998). The Q score assesses the accuracy of the sequencing platform and its base calling. Further bioinformatics processes, such as determining the number of consensus sequences and error correction, also contribute to the overall quality assessment. Nevertheless, standard DNA with known mutations (internal standards) should be included as technical controls in each experiment at least during the proof-of-proficiency stage. These internal standards provide an additional layer of quality assurance in addition to the parameters already integrated into ECS. Ideally, this DNA sample can be shared among the library preparations, since it should give consistent results.
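For readers less familiar with the Q score, it is defined as Q = −10·log10(P), where P is the estimated probability that a base call is wrong. The short sketch below shows the conversion in both directions; the function names are illustrative.

```python
import math


def phred_to_error_prob(q: float) -> float:
    """Convert a Phred quality score to the base-call error probability."""
    return 10 ** (-q / 10)


def error_prob_to_phred(p: float) -> float:
    """Convert a base-call error probability to a Phred quality score."""
    return -10 * math.log10(p)


for q in (20, 30, 40):
    print(f"Q{q}: error probability = {phred_to_error_prob(q):.0e}")
```

A Q score of 30 corresponds to an error probability of 1 in 1,000 per base call, several orders of magnitude above the spontaneous MF discussed in Section 5.3, which is why read-level quality metrics complement, rather than replace, consensus-based error correction.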

7. CONSENSUS ON THE EVALUATION OF ECS RESULTS

In OECD TG 488, mutant frequency is the measure by which results are identified as positive, negative, or equivocal. The sequencing of transgenic DNA is conducted secondarily, on a case-by-case basis, to identify whether shifts in mutation spectrum could aid interpretation of an equivocal result, or to identify jackpot mutations that could influence whether a result is positive. In contrast, with ECS, DNA is directly sequenced and mutations are identified, providing a data-rich approach to evaluating mutagenesis. In addition to MF, ECS readily provides mutation spectrum and other mechanistic information that can contribute to determining whether results are positive. The mechanistic information that is intrinsic to ECS provides a foundation for interpreting in vivo mutagenicity findings within a broader biological context. Furthermore, the management of sequencing data and the appropriate reporting of metadata for regulatory purposes must be carefully considered. Lastly, the quantitative nature of ECS and the ability to directly sequence mammalian genomes invite closer application of these data to quantitative human health risk assessments, and therefore adoption of experimental designs that are optimal for BMD response analysis where applicable.

7.1. Evaluation Criteria

The working group agrees that the overall MF and the concurrent vehicle control should be used for statistical analysis and data interpretation of positive and negative results. Both the magnitude of the change and the p-value relative to concurrent vehicle control should be reported. Consistent with OECD TG 470 and 488, the working group considers that the analysis of the ECS data should determine whether: (i) at least one treatment group exhibits a statistically significant increase in MF compared with the concurrent vehicle control; and (ii) the MF responses are dose related. If both criteria are met in any studied tissue, the test chemical is considered mutagenic. Conversely, a test chemical is considered negative in the studied tissues if: (i) no treatment group shows a statistically significant increase in MF compared with the concurrent vehicle control group; (ii) the MF response is not dose-related; and, (iii) it is confirmed that exposure to the test chemical and/or its metabolites occurred using approaches such as those described in OECD TG 488. If a result is not positive or negative as defined above, expert judgement should be applied. Data such as mutation spectrum and/or locus-specific responses can help to conclude whether the result is positive or negative (see consensus statements 7.4 and 7.5).
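The decision logic in this consensus statement can be summarized as a small function. The sketch below is one reading of the criteria, written for illustration: the names are hypothetical, the three Boolean inputs are assumed to come from the upstream statistical analysis and exposure assessment, and the chemical-level rule for mixed tissue outcomes (defaulting to expert judgement) is an assumption made here.

```python
from enum import Enum


class Call(Enum):
    POSITIVE = "positive"
    NEGATIVE = "negative"
    EXPERT_JUDGEMENT = "expert judgement"


def evaluate_tissue(any_group_significant: bool,
                    dose_related: bool,
                    exposure_confirmed: bool) -> Call:
    """Apply the positive/negative criteria to one tissue."""
    if any_group_significant and dose_related:
        return Call.POSITIVE
    if not any_group_significant and not dose_related and exposure_confirmed:
        return Call.NEGATIVE
    # Neither set of criteria is fully met: expert judgement is required.
    return Call.EXPERT_JUDGEMENT


def evaluate_chemical(tissue_calls) -> Call:
    """Positive if any tissue is positive; negative only if all tissues are negative."""
    calls = list(tissue_calls)
    if any(c is Call.POSITIVE for c in calls):
        return Call.POSITIVE
    if calls and all(c is Call.NEGATIVE for c in calls):
        return Call.NEGATIVE
    return Call.EXPERT_JUDGEMENT
```

For example, a significant increase in one dose group without a dose-related trend falls through to `Call.EXPERT_JUDGEMENT`, which is where mutation spectrum or locus-specific data would be brought in.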

7.2. Use of historical negative controls

In line with the findings of a previous IWGT working group (Dertinger et al., 2023), the working group considers that historical negative control (HNC) data sets are valuable for understanding if the ECS method used by a laboratory is “under control.” Generally, the HNC is considered to be “under control” when animal-to-animal variability accounts for the majority of the observed variation across experiments, together with stable mean MF and variance (Dertinger et al., 2023). HNC can be used as a proxy for “normal biological variation” only if robust HNC data were generated (at least 20 independent studies and 100 individual animals as per Annex 2 of TG 488). When robust HNC data are available, the concurrent vehicle control can be compared to the HNC to show that it is acceptable and that the ECS test performed is “under control”. However, as laboratories may perform tests using different mammalian (usually rodent) species, strains, sexes, tissues, and ECS technologies, it is not feasible to develop a robust HNC database for all experimental conditions. A practical solution is for the laboratory to establish a data set with 10 vehicle controls and 10 positive controls to confirm that they are obtaining mutation frequencies in the range expected (consistent with Annex 2 of TG 488). In addition, generating MF in vehicle control samples that fall within the expected range (see section 5.3) provides further evidence that the assay is performing as expected.

7.3. Calculation of mutation frequencies

An acknowledged limitation of TGR models is that they do not control for the impact of clonal expansion on mutant frequencies as they are routinely conducted. As stated in OECD TG 488, clonal amplification can artificially inflate MF in individual tissues. DNA sequencing to correct for clonal expansion in TGR studies can reveal cases where a specific mutational type significantly increased in a single tissue sample from an individual animal (Recio and Meyer, 1995; Sisk et al., 1994) and provides more power to detect an effect (Beal et al., 2015; Besaratinia et al., 2012).

Correction of clonally expanded mutations is feasible with ECS since it allows the characterization of each individual mutation. However, targeted and genome-scale approaches differ in their ability to resolve clonal expansions. For example, DS with its deep coverage across 20 targeted genomic sites readily allows direct assessment of clonal expansion of mutant cells. This enables MF to be calculated using two alternative approaches: MFmin where the repeated occurrence of the same mutation within a sample is considered the clonal expansion of a single initial event; and MFmax, where multiple occurrences of the same mutation are considered independent events. The working group agrees that for target-based approaches, MF should be calculated using the assumption that the same mutation present in multiple consensus reads within a sample arose from a clonal expansion event, and therefore, only one instance of the mutation should be counted. Conversely, as noted in Section 2, genome-scale ECS approaches typically lack the resolution to distinguish clonally expanded mutations, unless the expansion is large. In practice, genome-scale ECS detects all mutations as single counts, making the distinction between MFmin and MFmax unnecessary. Thus, for non-target ECS approaches a single MF should be reported and used to determine whether the test chemical induced a significant increase in mutations, as recommended in consensus statement 7.1.
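The difference between the two calculations can be illustrated with a short sketch. It assumes that each mutant consensus molecule in a sample is recorded as a (chromosome, position, reference, alternate) tuple and that the denominator is the total number of analyzable consensus bases for that sample; the function name and the example values are hypothetical.

```python
from collections import Counter


def mutation_frequencies(observations, consensus_bases):
    """Return (MFmin, MFmax) for one sample.

    MFmin counts each distinct mutation once (repeat observations are treated
    as clonal expansion of a single event); MFmax counts every observation as
    an independent event.
    """
    counts = Counter(observations)
    mf_min = len(counts) / consensus_bases
    mf_max = sum(counts.values()) / consensus_bases
    return mf_min, mf_max


# Three distinct mutations, one of which is seen in four consensus molecules.
obs = [
    ("chr1", 1000, "C", "A"),
    ("chr1", 1000, "C", "A"),
    ("chr1", 1000, "C", "A"),
    ("chr1", 1000, "C", "A"),
    ("chr5", 2500, "G", "T"),
    ("chr9", 730, "A", "G"),
]
mf_min, mf_max = mutation_frequencies(obs, consensus_bases=1e9)
print(f"MFmin = {mf_min:.1e}, MFmax = {mf_max:.1e}")
```

In this example MFmax is twice MFmin because a single mutation accounts for four of the six observations, which is the kind of inflation that the recommendation to count each mutation once is intended to avoid.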

7.4. Analysis of mutation subtypes

OECD TG 488 recommends sequencing mutant plaques or colonies in cases where mutant frequencies alone are not sufficient to determine whether a result is positive or negative and to support weakly positive results. Dose-dependent increases in mutation subtypes provide mechanistic evidence for mutagenesis. Unlike the TGR assay, ECS provides the mutation spectrum without additional work to aid the interpretation of results. Thus, the working group agrees that using appropriate statistical analysis with multiple testing correction, dose-dependent changes observed for one mutation subtype (using simple spectrum) can result in a positive call, even if the overall MF does not meet criteria for a positive call. When changes in mutation subtypes are observed, supporting data such as DNA adduct formation or the mutational signatures of related compounds should be considered if available, as this information may support the biological significance of such changes. Note that ECS methods may differ in their ability to detect certain mutation subtypes, such as SNVs, small indels, and SVs. Laboratory procedures such as the method of DNA fragmentation or end repair may affect the error rate for specific base substitutions (Otsubo et al., 2022), or bioinformatic pipeline components such as the aligner or variant caller could restrict SV detection. Further analyses would be useful to determine the power to detect different mutation subtypes for the various ECS methods.

7.5. Analysis of response at individual targets

DS uses 20 target sites spread across the genome to measure MF. Notably, the 2.4 kb DS target sites are comparable in size to transgenic loci, which range from about 0.2 kb (cII gene) to 3 kb (lacZ gene), although transgenic loci are present in multiple, tandem copies integrated into a single site of a rodent genome. The DS panel was designed to include loci that broadly represent certain genomic features, such as the composition of trinucleotide sequences and GC base pair content across the genome (see Section 3.1). Thus far, the species-specific target sites exhibit variation in responsiveness to mutagens with different mechanisms of action in both in vivo (Bercu et al., 2023; Dodge et al., 2023; LeBlanc et al., 2022; Schuster et al., 2024; Smith-Roe et al., 2023; Zhang et al., 2024) and in vitro models (Cho et al., 2023). Considering that MF can vary by target site, it may be possible to encounter data sets in which dose-dependent increases in MF occur in only one or a few target sites. These effects may be diluted in analyses of the overall MF. An important question for targeted ECS methods is thus whether each target should be treated as a biological replicate. However, the working group agrees that more research is needed to determine if dose-dependent increases in MF at a single target, or a small set of targets, should lead to a positive call even if the overall MF does not meet criteria for a positive call.

Targets in the DS panels are distributed between genic and intergenic locations, and reduced mutagenesis at genic sites compared to intergenic sites has been shown for several mutagens (Dodge et al., 2023; LeBlanc et al., 2022; Schuster et al., 2024). This suggests that DNA lesions induced by the test article can be removed by transcription-coupled repair (TCR) before the lesion is converted to a mutation. Indeed, recent data demonstrate a role for TCR in modulating MF in genic targets (Luzadder et al., 2025; Minko et al., 2024). It is not expected that all DNA lesions would effectively trigger TCR; for example, no difference was observed with NDEA between genic and intergenic targets (Bercu et al., 2023; Zhang et al., 2024). Nevertheless, a clear difference in MF for genic versus intergenic sites would add to the biological plausibility of mutation induction, which may aid interpretation of ECS data. Taken together, more research is needed with different test articles, testing regimens, and in vivo and in vitro models to better inform expert judgement regarding the biological relevance of selective increases in MF at target sites and differences in MF at genic and intergenic targets. The need for research in this area should not diminish the acceptance of ECS as a regulatory test for mutagenicity.

7.6. Data reporting

The OECD has developed a reporting framework for transcriptomics and metabolomics in regulatory toxicology (OECD, 2023) that provides a structured approach for data standardization, quality control, and analysis to ensure reproducibility and facilitate regulatory acceptance of data. The working group agrees that a data reporting framework for ECS studies should be developed using the “OECD Omics Reporting Framework for Transcriptomics and Metabolomics in Regulatory Toxicology” as a template for creating an ECS-specific module. A reporting framework for ECS provides a starting point in establishing best practices and standardized data analysis pipelines for SNP profiling, clonal expansion correction, and MF reporting, enhancing data comparability across studies. By incorporating best practices for reporting wet and dry experimental protocols, an OECD ECS framework would support its integration into regulatory decision-making, ensuring robust evaluations of mutagenicity and genomic stability. To begin, the working group has drafted a “Data Acquisition and Processing Reporting Module (DAPRM)” and a “Data Analysis Reporting Module (DARM)” for ECS methods (Modules 3.2 and 41 in Supplementary file 1, respectively).

At a high level, an assay performance report should list total informative consensus bases and other metrics generated per sample so the user can determine whether all individual samples meet the minimum data requirement, and whether sufficient data were generated to detect a certain MF fold change given the baseline MF in controls and the number of samples per group. Data acceptance criteria, such as the minimum sequence amount, maximum spontaneous MF, or identity of spontaneous mutation spectra, should be clearly specified. The report should address version control with documented design history and validation, including library preparation reagent formulations, lab protocol, and a deterministic bioinformatics analysis pipeline. Previous versions should remain available for some time after an update. The assay’s documentation should list genomic positions, variant types, and thresholds for allele frequencies or counts that are automatically filtered from the final dataset. Documentation of additional manual filters should be appended to individual datasets as necessary. Finally, the bioinformatics pipeline should output separate variant call files for variants retained in the final dataset vs. those filtered out, such that all initial consensus variant calls are accounted for.
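
The per-sample checks described above could be sketched as follows. This is an illustration under stated assumptions, not a prescribed implementation: the field names, the thresholds (`min_bases`, `max_control_mf`), and the simplification that each retained call counts as one mutation are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class SampleMetrics:
    """Per-sample summary (hypothetical field names)."""
    sample_id: str
    group: str              # "control" or a dose-group label
    informative_bases: int  # total informative consensus bases
    initial_calls: int      # consensus variant calls before filtering
    retained_calls: int     # calls kept in the final dataset
    filtered_calls: int     # calls written to the filtered-out file


def check_sample(m, min_bases, max_control_mf):
    """Return a list of acceptance problems; an empty list means the sample passes."""
    problems = []
    if m.informative_bases < min_bases:
        problems.append("below minimum informative bases")
    # Every initial call must appear in exactly one of the two output files.
    if m.retained_calls + m.filtered_calls != m.initial_calls:
        problems.append("variant calls not fully accounted for")
    if m.group == "control" and m.informative_bases > 0:
        # Simplification: one retained call is treated as one mutation.
        mf = m.retained_calls / m.informative_bases
        if mf > max_control_mf:
            problems.append("control MF above acceptance limit")
    return problems


# Illustrative values and thresholds only (assumed for this sketch).
sample_a = SampleMetrics("A1", "control", 600_000_000, 70, 60, 10)
sample_b = SampleMetrics("B1", "high dose", 400_000_000, 300, 280, 15)

for s in (sample_a, sample_b):
    issues = check_sample(s, min_bases=500_000_000, max_control_mf=2e-7)
    print(s.sample_id, "PASS" if not issues else issues)
```

Returning the full list of problems, rather than stopping at the first one, keeps the report useful for deciding whether a sample needs resequencing or exclusion.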

7.7. Quantitative dose-response analyses

As indicated in Section 3, quantitative analyses of genotoxicity data are gaining acceptance among the scientific and regulatory communities. The BMD approach works by fitting curves to dose-response data to enable estimation of the ‘benchmark dose’ most likely to cause a small, predetermined effect-size relative to control-group levels. Compared to pairwise statistical testing approaches used to determine NOGEL / LOGEL values, BMD modelling is advantageous because the derived point-of-departure is not limited to being one of the dose-groups employed in the study design. Moreover, information from all animals across all dose-groups contributes to the BMD estimate and its confidence interval (MacGregor et al., 2015).

Although mutagenicity testing applies an experimental design aimed at hazard identification, the working group recommends the adoption of experimental designs appropriate for quantitative dose-response analysis (e.g., benchmark dose modelling) when such designs would improve the applicability of the data to inform human health risk assessment. Precise BMD estimation is best enabled by experimental data that capture the complete shape of the dose-response relationship. Consistently, this has been demonstrated to be better achieved (for the same total number of animals) using greater numbers of dose-groups each containing fewer animals (Slob, 2014a; Slob, 2014b; Wills et al., 2016). Thus, compared to the experimental design proposed in TG 488, experiments designed to best enable BMD analyses would employ more dose-groups with fewer animals per dose-group. If pairwise statistical testing to determine NOGEL / LOGEL values is to be carried out alongside BMD analyses, the number of animals per dose-group should be maintained at a level sufficient to demonstrate 80% power to detect a 2-fold effect (i.e., aligned with consensus statement 5.4). Provided that sufficient animal numbers are used, the working group suggests that it is appropriate to first test high doses to determine if a test is positive before proceeding to establish responses to lower doses when a precise BMD estimate is also required. This tiered approach supports 3Rs initiatives by not adding animals until needed.
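
To make the 80% power criterion concrete, the sketch below estimates animals per group for a two-group comparison of log-transformed MF using the standard normal-approximation sample-size formula. The between-animal standard deviation on the natural-log scale (`sd_log`) is an assumed input that would have to come from historical control data, and the function name is hypothetical. The normal approximation tends to underestimate the number needed when groups are small, so a t-based or simulation-based power analysis would be preferable for a real study design.

```python
import math
from statistics import NormalDist


def animals_per_group(fold_change, sd_log, alpha=0.05, power=0.80):
    """Approximate animals per group to detect a fold change in MF.

    Assumes log(MF) is roughly normal with equal standard deviation
    (sd_log) in both groups and uses a two-sided test at level alpha.
    """
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)
    z_power = z.inv_cdf(power)
    delta = math.log(fold_change)  # effect size on the log scale
    n = 2 * ((z_alpha + z_power) * sd_log / delta) ** 2
    return math.ceil(n)


# sd_log = 0.4 is an illustrative assumption, not a value from the source.
for fc in (2.0, 1.5):
    print(f"fold change {fc}: about {animals_per_group(fc, sd_log=0.4)} animals per group")
```

The same function shows why smaller effect sizes are costly: under these assumptions, moving from a 2-fold to a 1.5-fold target increases the required group size substantially.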

Available studies indicate that BMDs derived using ECS data agree with those derived using TGR methods (Ashford et al., 2025; Dodge et al., 2023; LeBlanc et al., 2022; Zhang et al., 2024). However, more results will be required to establish consensus and define the most appropriate critical effect-size for use with ECS data. Until sufficient data are available to enable empirical determination of an endpoint-specific critical effect-size for ECS data, a 50% increase over the concurrent vehicle control should be utilized, which is in line with recommendations for other in vivo mutagenicity endpoints (White et al., 2025).
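
The idea of a BMD at a 50% critical effect-size can be illustrated with a deliberately simplified sketch. It assumes a two-parameter exponential model, MF = a · exp(b · dose), fitted by ordinary least squares on log-transformed MF, so that the BMD is ln(1.5)/b. The function names and data are hypothetical. Regulatory BMD analyses are normally run in dedicated software that compares several model families and reports a confidence interval for the BMD; none of that is attempted here.

```python
import math


def fit_exponential(doses, mfs):
    """Least-squares fit of ln(MF) = ln(a) + b * dose; returns (a, b)."""
    logs = [math.log(y) for y in mfs]
    n = len(doses)
    mean_x = sum(doses) / n
    mean_y = sum(logs) / n
    sxx = sum((x - mean_x) ** 2 for x in doses)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(doses, logs))
    b = sxy / sxx
    a = math.exp(mean_y - b * mean_x)
    return a, b


def benchmark_dose(b, ces=0.5):
    """Dose at which fitted MF equals (1 + ces) times the fitted control MF."""
    if b <= 0:
        raise ValueError("no increasing trend; BMD is undefined for this model")
    return math.log(1 + ces) / b


# Illustrative per-animal data (dose in arbitrary units, MF per base); assumed values.
doses = [0, 0, 0, 1, 1, 1, 3, 3, 3, 10, 10, 10]
mfs = [1.0e-7, 1.2e-7, 0.9e-7, 1.3e-7, 1.1e-7, 1.4e-7,
       1.9e-7, 1.6e-7, 2.1e-7, 7.0e-7, 8.5e-7, 6.4e-7]

a, b = fit_exponential(doses, mfs)
bmd50 = benchmark_dose(b, ces=0.5)
print(f"background MF = {a:.2e}, slope = {b:.3f}, BMD(50%) = {bmd50:.2f}")
```

Because every animal contributes to the fitted slope, the estimate is not restricted to one of the tested doses, which is the advantage over NOGEL / LOGEL values noted above.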

8. ADDITIONAL OPPORTUNITIES

As previously suggested (Marchetti et al., 2023a; Marchetti et al., 2023b), the working group considers integration of ECS within TG 488 and TG 407 as the most pragmatic initial steps toward its adoption for regulatory use. In addition to other TGs already identified in section 5.1, the working group recognizes that potential applications of ECS extend beyond these TGs, with opportunities to enhance in vivo and in vitro assessments. A few examples are discussed in the following sections.

8.1. Integration of ECS with reproductive toxicity studies

Table 6 lists potential OECD TGs where ECS could be integrated. Neurotoxicity and dermal studies were omitted due to differences in dose selection and possible limitations in systemic exposure, respectively. The OECD TGs considered here all use an approach similar to OECD TG 488 for determining the high dose. Many of the recently revised TGs refer to the parameters described in OECD guidance document 19 (OECD, 2002), which includes consideration for dose selection. The age and number of animals used in these TGs do not pose an issue as they all use the same or larger numbers of animals per group and both sexes, and the age of animals is typically about 8 weeks. In some cases (e.g., subchronic and chronic studies), younger animals are used; however, considering the duration of the study, the use of younger animals at the start of the treatment can be justified.

Table 6.

Summary of key study design parameters in different OECD test guidelines and consensus on whether they are amenable to integration with mutation analysis by Error-Corrected Sequencing.

OECD TG # Title Approach for selection of the top dose Age at the start of administration Number of animals per group/sex Study duration (days) TG applicable for ecNGS assessment? Germ cell mutation assessment possible
488 (2022) Transgenic somatic and germ cell gene mutation assays The highest dose should be the dose that will be tolerated without evidence of study limiting toxicity, relative to the duration of the study period, i.e., inducing toxic effects but not death or evidence of pain, suffering or distress necessitating humane killing. 8–12 weeks 5 28-day exposure + 28-day sampling time Yes Yes
407 (2008) Repeated dose 28-day oral toxicity study in rodents The highest dose level should be chosen with the aim of inducing toxic effects but not death or severe suffering. Young healthy adult 5 28 (optional: additional 14-day recovery period) Yes No. However, if a recovery period of at least 28 days is also considered within the study, these recovery groups would be suitable.
408 (2018) Repeated dose 90-day oral toxicity study in rodents The highest dose level should be chosen with the aim to induce toxicity but not death or severe suffering (see OECD Series on Testing and Assessment No. 19). Young healthy adult 10 90 (optional: additional 14-day recovery period) Yes Yes
412 (2018) Subacute Inhalation toxicity: 28-day study The high concentration level should result in a clear level of toxicity but not cause lethality or persistent signs that might lead to lethality or prevent a meaningful evaluation of the results. When testing aerosols, the high concentration may be the maximally achievable level that can be reached while meeting the particle size distribution standard. 8–10 5 28 days (5 or 7 days/week) Yes No
413 (2018) Subchronic inhalation toxicity study: 90-day study The high concentration level should result in a clear level of toxicity but not cause lethality or persistent signs that might lead to lethality or prevent a meaningful evaluation of the results. When testing aerosols, the high concentration may be the maximally achievable level that can be reached while meeting the particle size distribution standard 8–10 10 90 (5 or 7 days/week) Yes Yes
414 Prenatal development toxicity study Not in full accordance with the protocol required for gene mutation analysis using ECS. Nevertheless, ECS data from the F1 animals might provide information regarding the mutagenic potential of in utero exposures.
415 One-generation reproduction toxicity study Not used for regulatory purposes.
416 (2001) Two-generation reproduction toxicity study The highest dose level should be chosen with the aim to induce toxicity but not death or severe suffering. In case of unexpected mortality, studies with a mortality rate of less than approximately 10% in the parental (P) animals would normally still be acceptable. 5–9 Weeks At least 20 Males for one complete spermatogenic cycle Yes (currently males only) Yes
421 (2016) Reproduction/developmental toxicity screening test The highest dose level should be chosen with the aim to induce toxicity but not death or severe suffering. Healthy young adult 10 animals At least 4 weeks Yes (currently males only) No
422 (2016) Combined repeated dose toxicity study with reproduction/developmental toxicity screening test The highest dose level should be chosen with the aim of inducing toxic effects but not death nor obvious suffering. Healthy young adult 10–12 weeks 10 males Yes (currently males only) No
443 (2018) Extended one-generation reproductive toxicity study The highest dose level should be chosen with the aim to induce toxicity but not death or severe suffering. Approx. 11 weeks At least 20 At least 10 weeks Yes (currently males only) No
451 (2018) Carcinogenicity studies The highest dose level should be chosen to identify the principal target organs and toxic effects while avoiding suffering, severe toxicity, morbidity, or death. <8 weeks At least 50 24 months Yes Yes
452 (2018) Chronic toxicity studies The highest dose level should be chosen to identify the principal target organs and toxic effects while avoiding suffering, severe toxicity, morbidity, or death. < 8 weeks At least 20 12 months Yes Yes
453 Combined chronic toxicity/carcinogenicity studies The highest dose level should be chosen to identify the principal target organs and toxic effects while avoiding suffering, severe toxicity, morbidity, or death. < 8 weeks At least 50 12 & 24 months Yes Yes

Integration of ECS in reproductive toxicity studies, such as the two-generation (OECD TG 416) (OECD, 2001) or extended one-generation (OECD TG 443) (OECD, 2018b) studies, as well as the screening versions OECD TG 421 (OECD, 2016f) and TG 422 (OECD, 2016g), is also feasible. Although further investigation may be needed for tissues from female animals that undergo pregnancy during the treatment phase (dose inducing maternal toxicity might differ in non-pregnant females), tissues from male animals from such studies follow the same treatment regimen as in repeated dose toxicity studies (e.g., OECD TG 407) (OECD, 2008). Therefore, male tissues from these studies can be directly used for ECS assessments. Thus, for routine somatic cell mutagenicity testing, ECS could readily be integrated into numerous OECD TGs that are widely and routinely used. For germ cell analyses, only a selected number of OECD TGs may be used, as indicated in Table 6.

While the incorporation of tissues from females used in reproductive toxicity tests for ECS analysis needs further attention, the adult offspring from such studies provide valuable insights on the mutagenic effects of in utero exposures. The high rate of cell proliferation during organogenesis is hypothesized to enhance susceptibility to chemically induced mutations and clonal expansion of cells carrying these mutations, leading to mutation propagation to large portions of organs and tissues (i.e., somatic mosaicism) (Godschalk et al., 2020). Exposure to genotoxic chemicals during sensitive periods of organogenesis may lead to enhanced mutagenicity compared to adult exposures and impact both somatic and germ cell mutagenesis in male and female offspring (Chawanthayatham et al., 2015; Luderer et al., 2019; Mei et al., 2005; Meier et al., 2017).

8.2. Application of ECS for in vitro mutagenesis

The focus of this position paper is on the use of ECS as an alternative to in vivo mutagenicity testing (TGR and Pig-a models); however, the working group acknowledges that ECS is also valuable for in vitro mutagenicity testing with mammalian cells (Armijo et al., 2023; Cho et al., 2023; Huliganga et al., 2025; Maslov et al., 2022; Miranda et al., 2022a; Miranda et al., 2022b; Seo et al., 2024; Wang et al., 2021; Zhivagui et al., 2023).

The regulatory genetic toxicology test battery includes bacterial, in vitro and in vivo mammalian cell mutagenicity tests. A positive response in this battery can eliminate a candidate compound from further development or trigger costly and time-consuming animal testing (e.g., TG 470, TG 488, a 6-month rasH2 mouse cancer bioassay, or a 2-year rat carcinogenicity bioassay). Globally, regulatory toxicology is shifting toward New Approach Methods (NAMs), which often utilize human-based in vitro systems to enhance predictive accuracy, increase efficiency and support the 3Rs principles. However, regulatory acceptance of NAMs is hindered by validation challenges; specifically, the evidence that a test can accurately predict the responses known to occur in traditional regulatory testing and, more importantly, human-relevant outcomes.

Validated, human-relevant genetic toxicity NAMs are urgently needed to reduce reliance on costly and time-consuming in vivo animal models and improve on existing human-relevant in vitro tests. Some regulatory assays, such as the mouse lymphoma assay and Hprt (hypoxanthine-guanine-phosphoribosyl-transferase) assay, use cells (mouse lymphoma L5178Y cells or Chinese Hamster V79 or CHO cells, respectively) that lack a functional Tp53 (Fowler et al., 2012), a key DNA damage response pathway, which can lead to inflated false-positive rates. This has prompted expert recommendations to transition towards Tp53-proficient human cell lines for improved accuracy (e.g., HPRT assay in human primary peripheral blood lymphocytes). Furthermore, the development of human-tissue derived NAMs, including skin, lung, liver, and colon models, enables more physiologically relevant analyses. These model systems can also be used in dose-response study designs to derive BMDs for quantitative risk assessment. Traditional in vitro mutagenicity assays rely on phenotypic clonal selection using single-gene locus tests developed over 50 years ago (Liber et al., 1989; McGregor et al., 1996; Moore and Clive, 1982). Quantifying rare mutational events (1 in 100,000 – 1,000,000 cells) by clonal selection techniques requires large numbers of cells (i.e., up to 2 × 10⁷ cells per exposure condition) and involves labor-intensive subculturing for mutant selection. These methods are poorly suited to 3D human tissue NAMs. By contrast, ECS provides rapid assessment of mutation frequency and spectrum by directly examining DNA after mutation fixation (i.e., two rounds of DNA replication), bypassing lengthy phenotypic expression and clonal expansion periods, paving the way for more accurate and human-relevant genotoxicity testing (Seo et al., 2024; Wang et al., 2021).

Rodent models have well-recognized limitations in predicting human responses to chemical exposures. There is notable variability in animal response and uncertainties regarding the biological alignment of the high maximum tolerated dose used in rodent studies compared to the relatively low human exposures from pharmaceuticals or chemical exposures typically encountered in the environment. In addition, rodent models may not adequately capture the spectrum of potential effects and the broad genetic variability among human responses to xenobiotic exposures, limiting the ability to accurately assess human health hazards and potential risk (National Academies of Sciences and Medicine, 2023). Human-relevant NAMs combined with measurements of cellular health status, xenobiotic metabolism and DNA repair capacity, toxicogenomic responses and genetic toxicology endpoints, integrated with in vitro to in vivo extrapolation (IVIVE) modeling, can move genetic toxicology beyond hazard identification to quantitative analyses to predict potential human risk as part of next-generation risk assessments.

Overall, ECS represents a major leap in genome analysis, enabling the quantification of mutational changes in human cell lines, including metabolically competent hepatocyte models, and a broad spectrum of physiologically relevant NAMs (Seo et al., 2024).

9. CONCLUSIONS

ECS has significantly advanced the ability to detect rare mutations with high accuracy. Empirical evidence provides strong support for the use of these technologies to generate in vivo mutagenicity data for regulatory submissions and ultimately replace the TGR and Pig-a assays. To facilitate the acceptance of ECS by the broader regulatory community, the genetic toxicology community is encouraged to share samples (specifically, in vivo derived samples for reducing unnecessary animal use) among the groups developing different ECS methods. At the time of writing, most ECS methods have not been directly compared to others using shared DNA samples to precisely characterize strengths and weaknesses of each method. Methods may differ quantitatively in cost, data yield, complexity, baseline MF, ease of standardization, or other factors. Although they appear qualitatively similar, it is possible that different ECS assays may be fit for different regulatory questions and purposes. Collaborative work should clarify the fit-for-purpose use of ECS in different safety evaluation contexts of use. This is critical because the current lack of availability of TwinStrand kits and services highlights the danger of relying on a single technology. Development of methods that are easily portable to other sequencing platforms (besides Illumina and PacBio) is encouraged.

Despite considerable progress, challenges remain in balancing cost, coverage, and the identification of subtle mutations, particularly for genome-scale applications. As the field progresses, future innovations will likely focus on optimizing sequencing strategies and bioinformatics pipelines to further refine mutation detection. Nevertheless, it is the opinion of this working group that ECS technologies represent a significant advance with respect to the existing in vivo mutagenesis approaches codified in OECD test guidelines. An effort is in progress to develop a detailed review paper for ECS technologies as a first step toward inclusion of ECS in OECD test guidelines. However, formal acceptance could take several years. The working group believes that there is no reason to wait for an approved test guideline before ECS technologies are routinely used to perform chemical mutagenicity testing for regulatory use.

Looking ahead, the precision and universality of ECS enable mutation analyses in virtually any organism, tissue, or cell type, opening new frontiers in mechanistic research and toxicity testing. With this capability, ECS can be used to dissect the biological determinants of mutagenesis and carcinogenicity, such as species-specific differences in xenobiotic metabolism and DNA repair. Furthermore, ECS can facilitate cross-species comparisons of mutagenicity and characterization of dose-response relationships, which are key factors in determining the relevance of rodent MF data to human carcinogenesis. These applications position ECS not only as a regulatory tool, but also as a transformative research platform to elucidate the molecular underpinnings of mutagenesis and strengthen quantitative risk assessment.

Supplementary Material

1
2
  • ECS enables ultra-sensitive detection of mutation frequency and spectra

  • Results closely mirror validated transgenic rodent (TGR) assay outcomes

  • Integrates seamlessly within standard ≥28-day toxicity study designs

  • Advances 3Rs goals and expands models beyond traditional TGR systems

  • IWGT workgroup supports ECS integration into OECD test guidelines

FUNDING INFORMATION

The work was conducted under the auspices of the International Workshop on Genotoxicity Testing. All authors provided in kind time to this work.

Footnotes

Publisher's Disclaimer: This is a PDF file of an unedited manuscript that has been accepted for publication. As a service to our customers we are providing this early version of the manuscript. The manuscript will undergo copyediting, typesetting, and review of the resulting proof before it is published in its final form. Please note that during the production process errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal pertain.

DISCLAIMER

This presentation reflects the views of the authors and does not necessarily reflect the views of the U.S. Food and Drug Administration, the U.S. National Institute of Environmental Health Sciences, Health Canada, or the European Chemicals Agency. Any mention of commercial products is for clarification only and is not intended as approval, endorsement, or recommendation.

CONFLICT OF INTEREST

Jesse J. Salk is a founder, former employee and minority equity-holder of TwinStrand Biosciences Inc. He is a named author on Duplex Sequencing-related patents owned by TwinStrand. He is a named author on Duplex Sequencing patents owned by the University of Washington and licensed to TwinStrand, for which he receives royalties. Devon Fitzgerald is a former employee and equity holder of TwinStrand. She is a named inventor on a pending patent related to Duplex Sequencing for which she is not expected to gain financial benefits. Jake Higgins is an equity holder and former employee of TwinStrand. Shoji Matsumura is an employee of Kao Corporation, which has applied for the patent for Hawk-Seq and Jade-Seq. Naveed Honarvar is an Editorial Board member of this journal. All other authors declare no conflict of interest.

Carole Yauk, Anthony Lynch, Vasily Dobrovolsky, Maik Schuler, Stephanie Smith-Roe, Frank Le Curieux, Sheroy Minocherhomji PhD, ERT, FRSB, Les Recio, Kei-ichi Sugiyama, Takayoshi Suzuki, Ph. D., John W. Wills, Francesco Marchetti—The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Devon Fitzgerald—The authors declare the following financial interests/personal relationships which may be considered as potential competing interests: I was employed by and held equity in TwinStrand Biosciences within the last three years, but not at the time of manuscript submission. I am listed as an inventor on a pending patent related to TwinStrand’s duplex sequencing technologies, but do not stand to gain any financial benefits from this intellectual property.

Jake Higgins—The authors declare the following financial interests/personal relationships which may be considered as potential competing interests: JH is an equity holder and former employee of TwinStrand Biosciences. TwinStrand products are described in this manuscript.

Naveed Honarvar—The author is an Editorial Board Member/Editor-in-Chief/Associate Editor/Guest Editor for this journal and was not involved in the editorial review or the decision to publish this article.

Shoji Matsumura—The authors declare the following financial interests/personal relationships which may be considered as potential competing interests: SM is an employee of Kao Corporation, which has applied for the patent for Hawk-Seq and Jade-Seq.

Jesse Salk—The authors declare the following financial interests/personal relationships which may be considered as potential competing interests: Jesse Salk is a founder, former employee and minority equity-holder in TwinStrand Biosciences Inc. He is a named author on Duplex Sequencing-related patents owned by TwinStrand. He is a named author on Duplex Sequencing patents owned by the University of Washington and licensed to TwinStrand, for which he receives royalties.

CRediT AUTHOR STATEMENT

Carole L. Yauk: Conceptualization, Supervision, Investigation, Writing – original draft, Writing – review & editing; Anthony M. Lynch: Conceptualization, Supervision, Investigation, Writing – original draft, Writing – review & editing; Vasily N. Dobrovolsky: Investigation, Writing – original draft, Writing – review & editing; Maik Schuler: Investigation, Writing – original draft, Writing – review & editing; Stephanie L. Smith-Roe: Investigation, Writing – original draft, Writing – review & editing; Devon Fitzgerald: Investigation, Writing – original draft, Writing – review & editing; Jake Higgins: Investigation, Writing – original draft, Writing – review & editing; Naveed Honarvar: Investigation, Writing – original draft, Writing – review & editing; Frank Le Curieux: Investigation, Writing – original draft, Writing – review & editing; Shoji Matsumura: Investigation, Writing – original draft, Writing – review & editing; Sheroy Minocherhomji: Investigation, Writing – original draft, Writing – review & editing; Leslie Recio: Investigation, Writing – original draft, Writing – review & editing; Jesse J. Salk: Investigation, Writing – original draft, Writing – review & editing; Kei-ichi Sugiyama: Investigation, Writing – original draft, Writing – review & editing; Takayoshi Suzuki: Investigation, Writing – original draft, Writing – review & editing; John W. Wills: Investigation, Writing – original draft, Writing – review & editing; Francesco Marchetti: Conceptualization, Supervision, Project administration, Investigation, Writing – original draft, Writing – review & editing.

REFERENCES

  1. Abascal F, et al., 2021. Somatic mutation landscapes at single-molecule resolution. Nature. 593, 405–410.
  2. Armijo AL, et al., 2023. Molecular origins of mutational spectra produced by the environmental carcinogen N-nitrosodimethylamine and S(N)1 chemotherapeutic agents. NAR Cancer. 5, zcad015.
  3. Ashford AL, et al., 2025. Alignment between Duplex Sequencing and transgenic rodent mutation assay data in the assessment of in vivo NDMA-induced mutagenesis. Arch Toxicol. 99, 4227–4242.
  4. Bae JH, et al., 2023. Single duplex DNA sequencing with CODEC detects mutations with high sensitivity. Nat Genet. 55, 871–879.
  5. Beal MA, et al., 2015. Characterizing Benzo[a]pyrene-induced lacZ mutation spectrum in transgenic mice using next-generation sequencing. BMC Genomics. 16, 812.
  6. Bercu JP, et al., 2023. Comparison of the transgenic rodent mutation assay, error corrected next generation duplex sequencing, and the alkaline comet assay to detect dose-related mutations following exposure to N-nitrosodiethylamine. Mutat Res Genet Toxicol Environ Mutagen. 891, 503685.
  7. Besaratinia A, et al., 2012. A high-throughput next-generation sequencing-based method for detecting the mutational fingerprint of carcinogens. Nucleic Acids Res. 40, e116.
  8. Chawanthayatham S, et al., 2015. Prenatal exposure of mice to the human liver carcinogen aflatoxin B1 reveals a critical window of susceptibility to genetic change. Int J Cancer. 136, 1254–62.
  9. Chawanthayatham S, et al., 2017. Mutational spectra of aflatoxin B1 in vivo establish biomarkers of exposure for human hepatocellular carcinoma. Proc Natl Acad Sci U S A. 114, E3101–E3109.
  10. Cho E, et al., 2023. Error-corrected duplex sequencing enables direct detection and quantification of mutations in human TK6 cells with strong inter-laboratory consistency. Mutat Res Genet Toxicol Environ Mutagen. 889, 503649.
  11. Cosentino L, Heddle JA, 1999. A comparison of the effects of diverse mutagens at the lacZ transgene and Dlb-1 locus in vivo. Mutagenesis. 14, 113–9.
  12. Cosentino L, Heddle JA, 2000. Differential mutation of transgenic and endogenous loci in vivo. Mutat Res. 454, 1–10.
  13. Dertinger SD, et al., 2023. Assessing the quality and making appropriate use of historical negative control data: A report of the International Workshop on Genotoxicity Testing (IWGT). Environ Mol Mutagen. doi: 10.1002/em.22541.
  14. Diaz L, et al., 2020. Ethical Considerations in Animal Research: The Principle of 3R’s. Rev Invest Clin. 73, 199–209.
  15. Dobrovolsky VN, et al., 2023. Whole-genome high-fidelity sequencing: A novel approach to detecting and characterization of mutagenicity in vivo. Mutat Res Genet Toxicol Environ Mutagen. 891, 503691.
  16. Dodge AE, et al., 2023. Duplex sequencing provides detailed characterization of mutation frequencies and spectra in the bone marrow of MutaMouse males exposed to procarbazine hydrochloride. Arch Toxicol. 97, 2245–2259.
  17. Esina E, et al., 2024. Power analyses to inform Duplex Sequencing study designs for MutaMouse liver and bone marrow. Environ Mol Mutagen. 65, 234–242.
  18. Fernandez-Marmiesse A, et al., 2018. NGS Technologies as a Turning Point in Rare Disease Research, Diagnosis and Treatment. Curr Med Chem. 25, 404–432.
  19. Fowler P, et al., 2012. Reduction of misleading (“false”) positive results in mammalian cell genotoxicity assays. I. Choice of cell type. Mutat Res. 742, 11–25.
  20. Gileta AF, et al., 2022. Genetic characterization of outbred Sprague Dawley rats and utility for genome-wide association studies. PLoS Genet. 18, e1010234.
  21. Godschalk RWL, et al., 2020. In utero Exposure to Genotoxicants Leading to Genetic Mosaicism: An Overlooked Window of Susceptibility in Genetic Toxicology Testing? Environ Mol Mutagen. 61, 55–65.
  22. Harrill JA, et al., 2021. Progress towards an OECD reporting framework for transcriptomics and metabolomics in regulatory toxicology. Regul Toxicol Pharmacol. 125, 105020.
  23. Heddle JA, et al., 2000. In vivo transgenic mutation assays. Environ Mol Mutagen. 35, 253–9.
  24. Heddle JA, et al., 2003. Treatment and sampling protocols for transgenic mutation assays. Environ Mol Mutagen. 41, 1–6.
  25. Heid J, et al., 2024. Detection of genome structural variation in normal cells and tissues by single molecule sequencing. bioRxiv. doi: 10.1101/2024.08.08.607188.
  26. Hoang ML, et al., 2016. Genome-wide quantification of rare somatic mutations in normal human tissues using massively parallel sequencing. Proc Natl Acad Sci U S A. 113, 9846–51.
  27. Hori H, et al., 2019. Integration of micronucleus tests with a gene mutation assay in F344 gpt delta transgenic rats using benzo[a]pyrene. Mutat Res Genet Toxicol Environ Mutagen. 837, 1–7.
  28. Huliganga E, et al., 2025. Adverse Outcome Pathway-Informed Integrated Testing to Identify Chemicals Causing Genotoxicity Through Oxidative DNA Damage: Case Study on 4-Nitroquinoline 1-Oxide. Environ Mol Mutagen. 66, 185–198.
  29. Hussen BM, et al., 2022. The emerging roles of NGS in clinical oncology and personalized medicine. Pathol Res Pract. 230, 153760.
  30. Izawa K, et al., 2023. Detection of in vivo mutagenicity in rat liver samples using error-corrected sequencing techniques. Genes Environ. 45, 30.
  31. Jessen E, et al., 2021. Determining mutational burden and signature using RNA-seq from tumor-only samples. BMC Med Genomics. 14, 65.
  32. Jung YH, et al., 2023. Characterization of a strain-specific CD-1 reference genome reveals potential inter- and intra-strain functional variability. BMC Genomics. 24, 437.
  33. Kakiuchi N, Ogawa S, 2021. Clonal expansion in non-cancer tissues. Nat Rev Cancer. 21, 239–256.
  34. Kennedy SR, et al., 2014. Detecting ultralow-frequency mutations by Duplex Sequencing. Nat Protoc. 9, 2586–606.
  35. Kim JH, et al., 2022. Analysis of low-level somatic mosaicism reveals stage and tissue-specific mutational features in human development. PLoS Genet. 18, e1010404.
  36. Kinde I, et al., 2011. Detection and quantification of rare mutations with massively parallel sequencing. Proc Natl Acad Sci U S A. 108, 9530–5.
  37. Lambert IB, et al., 2005. Detailed review of transgenic rodent mutation assays. Mutat Res. 590, 1–280.
  38. Lawson ARJ, et al., 2025. Somatic mutation and selection at epidemiological scale. Nature. doi: 10.1038/s41586-025-09584-w.
  39. LeBlanc DPM, et al., 2022. Duplex sequencing identifies genomic features that determine susceptibility to benzo(a)pyrene-induced in vivo mutations. BMC Genomics. 23, 542.
  40. LeBlanc DPM, et al., 2025. Duplex sequencing identifies unique characteristics of ENU-induced mutations in male mouse germ cells. Biol Reprod. 112, 1015–1027.
  41. Liber HL, et al. , 1989. A comparison of mutation induction at the tk and hprt loci in human lymphoblastoid cells; quantitative differences are due to an additional class of mutations at the autosomal tk locus. Mutat Res. 216, 9–17. [DOI] [PubMed] [Google Scholar]
  42. Liu MH, et al. , 2024. DNA mismatch and damage patterns revealed by single-molecule sequencing. Nature. 630, 752–761. [DOI] [PMC free article] [PubMed] [Google Scholar]
  43. Liu R, et al. , 2025. Methyl-CODEC enables simultaneous methylation and duplex sequencing. Nucleic Acids Res. 10.1093/nar/gkaf482. [DOI] [Google Scholar]
  44. Long AS, et al. , 2016. Tissue-specific in vivo genetic toxicity of nine polycyclic aromatic hydrocarbons assessed using the MutaMouse transgenic rodent assay. Toxicol Appl Pharmacol. 290, 31–42. [DOI] [PMC free article] [PubMed] [Google Scholar]
  45. Luderer U, et al. , 2019. In Utero Exposure to Benzo[a]pyrene Induces Ovarian Mutations at Doses That Deplete Ovarian Follicles in Mice. Environ Mol Mutagen. 60, 410–420. [DOI] [PMC free article] [PubMed] [Google Scholar]
  46. Luzadder MM, et al. , 2025. The Distinct Roles of NEIL1 and XPA in Limiting Aflatoxin B1-Induced Mutagenesis in Mice. Mol Cancer Res. 23, 46–58. [DOI] [PMC free article] [PubMed] [Google Scholar]
  47. MacGregor JT, et al. , 2015. IWGT report on quantitative approaches to genotoxicity risk assessment I. Methods and metrics for defining exposure-response relationships and points of departure (PoDs). Mutat Res Genet Toxicol Environ Mutagen. 783, 55–65. [DOI] [PubMed] [Google Scholar]
  48. Marchetti F, et al. , 2018. Simulation of mouse and rat spermatogenesis to inform genotoxicity testing using OECD test guideline 488. Mutat Res. 832–833, 19–28. [Google Scholar]
  49. Marchetti F, et al. , 2023a. Error-corrected next-generation sequencing to advance nonclinical genotoxicity and carcinogenicity testing. Nat Rev Drug Discov. 22, 165–166. [DOI] [PubMed] [Google Scholar]
  50. Marchetti F, et al. , 2023b. Error-corrected next generation sequencing - Promises and challenges for genotoxicity and cancer risk assessment. Mutat Res Rev Mutat Res. 792, 108466. [DOI] [PubMed] [Google Scholar]
  51. Martincorena I, 2019. Somatic mutation and clonal expansions in human tissues. Genome Med. 11, 35. [DOI] [PMC free article] [PubMed] [Google Scholar]
  52. Maslov AY, et al. , 2022. Single-molecule, quantitative detection of low-abundance somatic mutations by high-throughput sequencing. Sci Adv. 8, eabm3259. [DOI] [PMC free article] [PubMed] [Google Scholar]
  53. Matsuda T, et al. , 2015. Mutation assay using single-molecule real-time (SMRT™) sequencing technology. Genes Environ. 37, 15. [DOI] [PMC free article] [PubMed] [Google Scholar]
  54. Matsumura S, et al. , 2019. Genome-wide somatic mutation analysis via Hawk-Seq reveals mutation profiles associated with chemical mutagens. Arch Toxicol. 93, 2689–2701. [DOI] [PubMed] [Google Scholar]
  55. Matsumura S, et al. , 2025. Whole genome mutagenicity evaluation using Hawk-Seq demonstrates high inter-laboratory reproducibility and concordance with the transgenic rodent gene mutation assay. Genes Environ. 47, 13. [DOI] [PMC free article] [PubMed] [Google Scholar]
  56. McGregor DB, et al. , 1996. Mutagenic responses of L5178Y mouse cells at the tk and hprt loci. Toxicol In Vitro. 10, 643–7. [DOI] [PubMed] [Google Scholar]
  57. Mei N, et al. , 2005. Age-dependent sensitivity of Big Blue transgenic mice to the mutagenicity of N-ethyl-N-nitrosourea (ENU) in liver. Mutat Res. 572, 14–26. [DOI] [PMC free article] [PubMed] [Google Scholar]
  58. Meier MJ, et al. , 2017. In Utero Exposure to Benzo[a]Pyrene Increases Mutation Burden in the Soma and Sperm of Adult Mice. Environ Health Perspect. 125, 82–88. [DOI] [PMC free article] [PubMed] [Google Scholar]
  59. Menon V, Brash DE, 2023. Next-generation sequencing methodologies to detect low-frequency mutations: “Catch me if you can”. Mutat Res Rev Mutat Res. 792, 108471. [DOI] [PMC free article] [PubMed] [Google Scholar]
  60. Minko IG, et al. , 2024. Frequencies and spectra of aflatoxin B1-induced mutations in liver genomes of NEIL1-deficient mice as revealed by duplex sequencing. NAR Mol Med. 1, ugae006. [DOI] [PMC free article] [PubMed] [Google Scholar]
  61. Miranda JA, et al. , 2022a. Genome-wide detection of ultralow-frequency substitution mutations in cultures of mouse lymphoma L5178Y cells and Caenorhabditis elegans worms by PacBio sequencing. Environ Mol Mutagen. 63, 68–75. [DOI] [PubMed] [Google Scholar]
  62. Miranda JA, et al. , 2023. Unbiased whole genome detection of ultrarare off-target mutations in genome-edited cell populations by HiFi sequencing. Environ Mol Mutagen. 64, 374–381. [DOI] [PubMed] [Google Scholar]
  63. Miranda JA, et al. , 2022b. Evaluation of the mutagenic effects of Molnupiravir and N4-hydroxycytidine in bacterial and mammalian cells by HiFi sequencing. Environ Mol Mutagen. 63, 320–328. [DOI] [PubMed] [Google Scholar]
  64. Miranda JA, Revollo JR, 2024. Assessment of in vivo chemical mutagenesis by long-read sequencing. Toxicol Sci. 202, 96–102. [DOI] [PubMed] [Google Scholar]
  65. Monroe JJ, et al. , 1998. A comparative study of in vivo mutation assays: analysis of hprt, lacI, and cII/cI as mutational targets for N-nitroso-N-methylurea and benzo[a]pyrene in Big Blue mice. Mutat Res. 421, 121–36. [DOI] [PubMed] [Google Scholar]
  66. Moore MM, Clive D, 1982. The quantitation of TK−/− and HGPRT- mutants of L5178Y/TK+/− mouse lymphoma cells at varying times post-treatment. Environ Mutagen. 4, 499–519. [DOI] [PubMed] [Google Scholar]
  67. Muyas F, et al. , 2020. The rate and spectrum of mosaic mutations during embryogenesis revealed by RNA sequencing of 49 tissues. Genome Med. 12, 49. [DOI] [PMC free article] [PubMed] [Google Scholar]
  68. National Academies of Sciences, Engineering, and Medicine, 2023. Building Confidence in New Evidence Streams for Human Health Risk Assessment: Lessons Learned from Laboratory Mammalian Toxicity Tests. The National Academies Press, Washington, DC. [Google Scholar]
  69. Neville MDC, et al. , 2025. Sperm sequencing reveals extensive positive selection in the male germline. Nature. 10.1038/s41586-025-09448-3. [DOI] [Google Scholar]
  70. OECD, 2001. Test No. 416: Two-Generation Reproduction Toxicity. OECD Publishing, Paris. [Google Scholar]
  71. OECD, 2002. Guidance Document on the Recognition, Assessment and Use of Clinical Signs as Humane Endpoints for Experimental Animals Used in Safety Evaluation. OECD Publishing, Paris. [Google Scholar]
  72. OECD, 2008. Test 407: Repeated dose 28-day oral toxicity study in rodents. OECD Publishing, Paris. [Google Scholar]
  73. OECD, 2009. Detailed Review Paper on Transgenic Rodent Mutations Assay. Paris. [Google Scholar]
  74. OECD, 2016a. Test 474: Mammalian erythrocyte micronucleus test. OECD Publishing, Paris. [Google Scholar]
  75. OECD, 2016b. Test 475: Mammalian Bone Marrow Chromosomal Aberration Test. OECD Publishing, Paris. [Google Scholar]
  76. OECD, 2016c. Test 478: Rodent dominant lethal test. OECD Publishing, Paris. [Google Scholar]
  77. OECD, 2016d. Test 483: Mammalian spermatogonial chromosomal aberration test. OECD Publishing, Paris. [Google Scholar]
  78. OECD, 2016e. Test 489: In vivo mammalian alkaline comet assay. OECD Publishing, Paris. [Google Scholar]
  79. OECD, 2016f. Test No. 421: Reproduction/Developmental Toxicity Screening Test. OECD Publishing, Paris. [Google Scholar]
  80. OECD, 2016g. Test No. 422: Combined Repeated Dose Toxicity Study with the Reproduction/Developmental Toxicity Screening Test. OECD Publishing, Paris. [Google Scholar]
  81. OECD, 2017. Overview of the set of OECD Genetic Toxicology Test Guidelines and updates performed in 2014–2015. OECD Publishing, Paris. [Google Scholar]
  82. OECD, 2018a. Test 408: Repeated dose 90-day oral toxicity study in rodents. OECD Publishing, Paris. [Google Scholar]
  83. OECD, 2018b. Test No. 443: Extended One-Generation Reproductive Toxicity Study. OECD Publishing, Paris. [Google Scholar]
  84. OECD, 2018c. Test No. 452: Chronic Toxicity Studies. OECD Publishing, Paris. [Google Scholar]
  85. OECD, 2020. The in vivo erythrocyte Pig-a gene mutation assay – Part 2 – Validation report. Paris. [Google Scholar]
  86. OECD, 2022. Test 470: Mammalian Erythrocyte Pig-a Gene Mutation Assay. OECD Publishing, Paris. [Google Scholar]
  87. OECD, 2023. OECD Omics Reporting Framework (OORF): Guidance on reporting elements for the regulatory use of omics data from laboratory-based toxicology studies. OECD Publishing, Paris. [Google Scholar]
  88. OECD, 2025. Test No. 488: Transgenic Rodent Somatic and Germ Cell Gene Mutation Assays.
  89. Otsubo Y, et al. , 2021. Hawk-Seq differentiates between various mutations in Salmonella typhimurium TA100 strain caused by exposure to Ames test-positive mutagens. Mutagenesis. 36, 245–254. [DOI] [PMC free article] [PubMed] [Google Scholar]
  90. Otsubo Y, et al. , 2022. Single-strand specific nuclease enhances accuracy of error-corrected sequencing and improves rare mutation-detection sensitivity. Arch Toxicol. 96, 377–386. [DOI] [PMC free article] [PubMed] [Google Scholar]
  91. Recio L, Meyer KG, 1995. Increased frequency of mutations at A:T base pairs in the bone marrow of B6C3F1 lacI transgenic mice exposed to 1,3-butadiene. Environ Mol Mutagen. 26, 1–8. [DOI] [PubMed] [Google Scholar]
  92. Revollo JR, et al. , 2021. PacBio sequencing detects genome-wide ultra-low-frequency substitution mutations resulting from exposure to chemical mutagens. Environ Mol Mutagen. 62, 438–445. [DOI] [PubMed] [Google Scholar]
  93. Richterich P, 1998. Estimation of errors in “raw” DNA sequences: a validation study. Genome Res. 8, 251–9. [DOI] [PMC free article] [PubMed] [Google Scholar]
  94. Sahib S, et al. , 2024. Application of duplex sequencing to evaluate mutagenicity of aristolochic acid and methapyrilene in Fisher 344 rats. Food Chem Toxicol. 185, 114512. [DOI] [PubMed] [Google Scholar]
  95. Salk JJ, Kennedy SR, 2020. Next-Generation Genotoxicology: Using Modern Sequencing Technologies to Assess Somatic Mutagenesis and Cancer Risk. Environ Mol Mutagen. 61, 135–151. [DOI] [PMC free article] [PubMed] [Google Scholar]
  96. Salk JJ, et al. , 2018. Enhancing the accuracy of next-generation sequencing for detecting rare and subclonal mutations. Nat Rev Genet. 19, 269–285. [DOI] [PMC free article] [PubMed] [Google Scholar]
  97. Schmitt MW, et al. , 2012. Detection of ultra-rare mutations by next-generation sequencing. Proc Natl Acad Sci U S A. 109, 14508–13. [DOI] [PMC free article] [PubMed] [Google Scholar]
  98. Schuster DM, et al. , 2024. Dose-Related Mutagenic and Clastogenic Effects of Benzo[b]fluoranthene in Mouse Somatic Tissues Detected by Duplex Sequencing and the Micronucleus Assay. Environ Sci Technol. 58, 21450–21463. [DOI] [PMC free article] [PubMed] [Google Scholar]
  99. Seo JE, et al. , 2024. Evaluating the mutagenicity of N-nitrosodimethylamine in 2D and 3D HepaRG cell cultures using error-corrected next generation sequencing. Arch Toxicol. 98, 1919–1935. [DOI] [PMC free article] [PubMed] [Google Scholar]
  100. Simon S, et al. , 2025. Deriving safe limits for N-nitroso-bisoprolol by error-corrected next-generation sequencing (ecNGS) and benchmark dose (BMD) analysis, integrated with QM modeling and CYP-docking analysis. Arch Toxicol. 99, 3935–3962. [DOI] [PMC free article] [PubMed] [Google Scholar]
  101. Sisk SC, et al. , 1994. Molecular analysis of lacI mutants from bone marrow of B6C3F1 transgenic mice following inhalation exposure to 1,3-butadiene. Carcinogenesis. 15, 471–7. [DOI] [PubMed] [Google Scholar]
  102. Skopek TR, et al. , 1995. Relative sensitivity of the endogenous hprt gene and lacI transgene in ENU-treated Big Blue B6C3F1 mice. Environ Mol Mutagen. 26, 9–15. [DOI] [PubMed] [Google Scholar]
  103. Slob W, 2014a. Benchmark dose and the three Rs. Part I. Getting more information from the same number of animals. Crit Rev Toxicol. 44, 557–67. [DOI] [PubMed] [Google Scholar]
  104. Slob W, 2014b. Benchmark dose and the three Rs. Part II. Consequences for study design and animal use. Crit Rev Toxicol. 44, 568–80. [DOI] [PubMed] [Google Scholar]
  105. Smith-Roe SL, et al. , 2023. Adopting duplex sequencing technology for genetic toxicity testing: A proof-of-concept mutagenesis experiment with N-ethyl-N-nitrosourea (ENU)-exposed rats. Mutat Res Genet Toxicol Environ Mutagen. 891, 503669. [DOI] [PMC free article] [PubMed] [Google Scholar]
  106. Thybaud V, et al. , 2003. In vivo transgenic mutation assays. Mutat Res. 540, 141–51. [DOI] [PubMed] [Google Scholar]
  107. Valentine CC 3rd, et al. , 2020. Direct quantification of in vivo mutagenesis and carcinogenesis using duplex sequencing. Proc Natl Acad Sci U S A. 117, 33414–33425. [DOI] [PMC free article] [PubMed] [Google Scholar]
  108. Wang Y, et al. , 2021. Genetic toxicity testing using human in vitro organotypic airway cultures: Assessing DNA damage with the CometChip and mutagenesis by Duplex Sequencing. Environ Mol Mutagen. 62, 306–318. [DOI] [PMC free article] [PubMed] [Google Scholar]
  109. White PA, et al. , 2025. Benchmark Response (BMR) Values for In Vivo Mutagenicity Endpoints. Environ Mol Mutagen. 66, 172–184. [DOI] [PMC free article] [PubMed] [Google Scholar]
  110. Wills JW, et al. , 2016. Empirical analysis of BMD metrics in genetic toxicology part II: in vivo potency comparisons to promote reductions in the use of experimental animals for genetic toxicity assessment. Mutagenesis. 31, 265–75. [DOI] [PubMed] [Google Scholar]
  111. Wilson TE, et al. , 2023. svCapture: efficient and specific detection of very low frequency structural variant junctions by error-minimized capture sequencing. NAR Genom Bioinform. 5, lqad042. [DOI] [PMC free article] [PubMed] [Google Scholar]
  112. Xia L, et al. , 2025. N-nitroso-ethylisopropylamine mutagenicity in rat liver using the cII transgenic mutation assay and duplex sequencing analysis of genomic DNA. Chem Biol Interact. 418, 111603. [DOI] [PubMed] [Google Scholar]
  113. You X, et al. , 2020. Detection of genome-wide low-frequency mutations with Paired-End and Complementary Consensus Sequencing (PECC-Seq) revealed end-repair-derived artifacts as residual errors. Arch Toxicol. 94, 3475–3485. [DOI] [PubMed] [Google Scholar]
  114. You X, et al. , 2023. Genome-wide direct quantification of in vivo mutagenesis using high-accuracy paired-end and complementary consensus sequencing. Nucleic Acids Res. 51, e109. [DOI] [PMC free article] [PubMed] [Google Scholar]
  115. Zhang S, et al. , 2024. Assessing the genotoxicity of N-nitrosodiethylamine with three in vivo endpoints in male Big Blue® transgenic and wild-type C57BL/6N mice. Environ Mol Mutagen. 65, 190–202. [DOI] [PubMed] [Google Scholar]
  116. Zhang S, et al. , 2025a. Re-Evaluating Acceptable Intake: A Comparative Study of N-Nitrosomorpholine and N-Nitroso Reboxetine Potency. Environ Mol Mutagen. 66, 80–98. [DOI] [PMC free article] [PubMed] [Google Scholar]
  117. Zhang S, et al. , 2025b. Transferability, Reproducibility and Sensitivity of Mutation Quantification by Duplex Sequencing. Environ Mol Mutagen. [Google Scholar]
  118. Zhivagui M, et al. , 2023. DNA damage and somatic mutations in mammalian cells after irradiation with a nail polish dryer. Nat Commun. 14, 276. [DOI] [PMC free article] [PubMed] [Google Scholar]
