BMC Pharmacology & Toxicology. 2026 Jan 29;27:39. doi: 10.1186/s40360-026-01096-1

Novel mechanisms of atrazine endocrine disruption: an integrated approach reveals progesterone and glucocorticoid receptor targeting

Chaoyuan Jin 1,2,3,#, Ruijinlin Hao 4,#, Xingxing Ren 5, Jie Shen 1,2,3
PMCID: PMC12924616  PMID: 41612433

Abstract

Background

Atrazine is a widely used herbicide with reported endocrine-disrupting effects, yet the molecular targets and pathway-level mechanisms remain to be fully elucidated. We aimed to build an integrated, hypothesis-generating workflow to prioritize candidate atrazine targets by combining literature curation, network analysis, machine-learning–based prioritization, and molecular docking.

Methods

Candidate endocrine-related genes were compiled from the Comparative Toxicogenomics Database and PubMed literature screening, yielding 78 genes. A protein–protein interaction (PPI) network was constructed and analyzed to identify a connected subnetwork of 26 genes. Gene Ontology (GO) and KEGG enrichment analyses were performed to characterize overrepresented biological processes and pathways. To prioritize candidate genes within the 26-gene subnetwork, we engineered composite network features and trained a weakly supervised model using three literature-labeled seed positives (ESR1, CYP19A1, and AR) versus the remaining genes as putative negatives, with internal cross-validation for exploratory assessment. Molecular docking was conducted for selected receptors to evaluate potential binding signals.

Results

Network topology and enrichment analyses highlighted endocrine-related and steroid hormone–associated biological functions within the atrazine-associated gene set. In exploratory internal cross-validation on the 26-gene subnetwork, an ensemble model showed high apparent discrimination. The prioritization step ranked NR3C1 and PGR among the top candidates for follow-up. Docking simulations suggested moderate binding affinities between atrazine and PGR (−6.495 kcal/mol) and NR3C1 (−6.248 kcal/mol), providing complementary in silico evidence consistent with these candidates and motivating investigation of progesterone- and glucocorticoid-related signaling.

Conclusions

This integrative in silico workflow supports a network-informed prioritization of potential atrazine endocrine targets and highlights NR3C1 and PGR as candidates warranting further investigation. Because the machine-learning component is trained on a very limited literature-labeled set and no external test set is available, the findings should be interpreted as hypothesis-generating and require validation in biological models (in vitro and/or in vivo).

Supplementary Information

The online version contains supplementary material available at 10.1186/s40360-026-01096-1.

Keywords: Atrazine, Endocrine disruptor, Network toxicology, Machine learning, Molecular docking

Introduction

Endocrine disrupting chemicals (EDCs) represent a class of environmental contaminants that can interfere with hormone systems, potentially leading to adverse health effects in wildlife and humans [1]. Among these compounds, atrazine (2-chloro-4-ethylamino-6-isopropylamino-s-triazine) has received considerable attention due to its widespread use as an herbicide and persistent presence in aquatic environments [2]. Despite regulatory restrictions in some countries, atrazine remains one of the most commonly detected pesticides in surface and ground waters globally [3, 4].

Numerous studies have demonstrated atrazine’s potential to disrupt endocrine function across various species. In amphibians, atrazine exposure has been associated with gonadal abnormalities and feminization of male frogs at environmentally relevant concentrations [5]. In mammals, atrazine has been shown to affect reproductive development, alter hormone levels, and potentially increase cancer risk in hormone-responsive tissues [6]. These effects appear to be mediated through multiple mechanisms, including alterations in steroidogenesis, particularly through effects on aromatase (CYP19A1) [7], and possible interactions with hormone receptors such as ESR1 (encoding estrogen receptor alpha, ERα) and androgen receptor (AR) [8]. However, the effects of atrazine on progesterone and glucocorticoid signaling have not been thoroughly investigated.

Despite extensive research, the precise molecular mechanisms and key targets mediating atrazine’s endocrine disrupting effects remain to be fully elucidated [9]. Traditional toxicological approaches have typically focused on individual pathways or a limited set of predefined targets, potentially overlooking important molecular interactions and novel mechanisms [10]. Additionally, the complex interplay between multiple pathways affected by atrazine necessitates a more comprehensive systems-level analysis [11].

Recent advances in computational toxicology offer promising approaches to address these limitations. Network toxicology provides a framework to analyze chemical-gene-disease relationships at a systems level, revealing emergent properties not apparent through reductionist approaches [12, 13]. Machine learning techniques can effectively integrate diverse data types to identify patterns and make predictions about chemical toxicity. Molecular docking simulations complement these approaches by providing structural insights into chemical-protein interactions [14, 15].

The integration of these computational methods presents an opportunity to more comprehensively characterize atrazine’s endocrine-disrupting mechanisms [16]. By combining network analysis to identify potential molecular targets, machine learning to prioritize key genes, and molecular docking to evaluate predicted interactions in silico, we can develop a more complete understanding of atrazine’s biological effects. This study aims to identify novel candidate molecular targets and mechanisms of atrazine-induced endocrine disruption through an integrative computational approach, offering insights to enhance risk assessment and regulatory strategies for this common herbicide.

Materials and methods

Data collection and preprocessing

An overview of the study design and analytical workflow is presented in Fig. 1.

Fig. 1

Overall workflow of the study. Atrazine-related endocrine genes were identified through database and literature mining, followed by PPI network construction and extraction of a connected 26-gene sub-network. Network and pathway features were extracted and used for weakly supervised machine-learning–based prioritization. In silico evaluation was performed using molecular docking, resulting in prioritized candidate genes and putative receptor targets for further investigation. ML denotes machine learning

Identification of atrazine-related genes

Atrazine-related genes were identified through a two-pronged strategy integrating database mining and literature-based evidence (Fig. 2).

Fig. 2

Identification and selection of atrazine-related genes for network analysis. Atrazine-associated genes (9,554) and endocrine disruptor–related genes (158) were retrieved from the Comparative Toxicogenomics Database, yielding 77 overlapping genes. Six genes (ESR1, CYP19A1, AR, NR3C1, NR5A1 [SF-1], and KISS1) were identified through structured PubMed literature curation and merged with the database-derived set, with duplicates removed. The final gene set (n = 78) was used for subsequent network and machine learning analyses

First, genes associated with atrazine exposure and endocrine disruption were retrieved from the Comparative Toxicogenomics Database (CTD; http://ctdbase.org/). Queries were performed using “atrazine” and “endocrine disruptor” as search terms. This procedure yielded 9,554 atrazine-associated genes and 158 endocrine disruptor–related genes. Genes appearing in both categories were retained as database-derived candidates, resulting in 77 overlapping genes.

Second, a structured literature review was conducted to identify experimentally supported genes and signaling pathways involved in atrazine-associated endocrine regulation. PubMed was searched from database inception to December 2024 using the query string “atrazine AND endocrine disruptor AND gene”, supplemented by additional keywords related to endocrine signaling, steroidogenesis, and hormone regulation. Studies were included if they were original research articles employing in vivo vertebrate models and reported molecular, biochemical, or functional endocrine-related evidence following atrazine exposure. Reviews, invertebrate-only studies, and articles lacking gene-level or pathway-level evidence were excluded. Based on these criteria, 18 primary studies were retained for evidence extraction (Supplementary Table S1).

Across the selected studies, six genes or signaling pathways—CYP19A1, NR5A1 (SF-1), ESR1, AR, NR3C1, and KISS1 (kisspeptin)—were consistently implicated in atrazine-associated endocrine regulation. These literature-supported genes were merged with the CTD-derived candidate genes, and duplicates were removed, resulting in a final set of 78 atrazine-related genes for downstream analyses.
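The two-step assembly described above amounts to a set intersection followed by a set union. The sketch below illustrates that logic only; the two CTD lists are small toy stand-ins (the real lists contain 9,554 and 158 genes), and the function name `assemble_gene_set` is hypothetical.

```python
# Toy stand-ins for the two CTD query results (NOT the real gene lists).
atrazine_genes = {"ESR1", "CYP19A1", "AR", "NR3C1", "PGR", "POMC", "TP53"}
edc_genes = {"ESR1", "CYP19A1", "AR", "NR3C1", "PGR", "POMC", "KISS1R"}

# Literature-curated genes named in the text.
literature_genes = {"CYP19A1", "NR5A1", "ESR1", "AR", "NR3C1", "KISS1"}


def assemble_gene_set(atrazine, edc, literature):
    """Intersect the two database lists, then merge literature genes without duplicates."""
    database_candidates = atrazine & edc             # genes present in both CTD lists
    final_set = database_candidates | literature     # union removes duplicates automatically
    newly_added = literature - database_candidates   # literature genes absent from the overlap
    return database_candidates, final_set, newly_added


db, final, added = assemble_gene_set(atrazine_genes, edc_genes, literature_genes)
print(len(db), len(final), sorted(added))
```

Because the merge is a union, the final count depends on how many literature genes were already in the database overlap, which is why six curated genes need not add six entries.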

Protein-protein interaction network construction

The STRING database (version 11.5, https://string-db.org/) was used to construct a protein-protein interaction (PPI) network for the identified genes. Parameters were set to include only interactions with high confidence scores (≥ 0.7) and restricted to Homo sapiens as the reference organism. The resulting network data were exported in TSV format, containing information on nodes (genes/proteins) and edges (interactions) for subsequent network analysis.
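As a rough illustration of the confidence filter, the following sketch reads a tab-separated edge list and keeps interactions with a combined score of at least 0.7. The column names (`node1`, `node2`, `combined_score`) and the toy rows are assumptions for illustration and may not match the exported file. The component search shows one possible way to separate connected groups of nodes; the paper performed its refinement in Cytoscape.

```python
import csv
import io

# Hypothetical excerpt of an exported edge table; column names are assumed.
TSV = (
    "node1\tnode2\tcombined_score\n"
    "ESR1\tAR\t0.999\n"
    "ESR1\tPGR\t0.98\n"
    "AR\tTP53\t0.55\n"
    "NR3C1\tPOMC\t0.91\n"
)


def load_edges(handle, min_score=0.7):
    """Return undirected edges whose combined score meets the confidence cutoff."""
    edges = set()
    for row in csv.DictReader(handle, delimiter="\t"):
        if float(row["combined_score"]) >= min_score:
            edges.add(frozenset((row["node1"], row["node2"])))
    return edges


def connected_components(edges):
    """Group nodes into connected components with an iterative depth-first search."""
    adjacency = {}
    for edge in edges:
        a, b = tuple(edge)
        adjacency.setdefault(a, set()).add(b)
        adjacency.setdefault(b, set()).add(a)
    seen, components = set(), []
    for start in adjacency:
        if start in seen:
            continue
        stack, component = [start], set()
        while stack:
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(adjacency[node] - component)
        seen |= component
        components.append(component)
    return components


edges = load_edges(io.StringIO(TSV))
components = connected_components(edges)
largest = max(components, key=len)
print(len(edges), sorted(largest))
```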

Network toxicology analysis

Topological analysis

Network topology analysis was performed on the refined 26-gene sub-network using Cytoscape (version 3.9.1). Multiple centrality measures were calculated to characterize the importance of each node within the network:

Degree: the number of direct connections a node has to other nodes.

Betweenness Centrality: the frequency with which a node lies on the shortest path between other nodes.

Closeness Centrality: the reciprocal of the sum of the shortest distances to all other nodes.

Clustering Coefficient: the proportion of actual connections among a node’s neighbors relative to the potential connections.

Primary hubs were defined as the top-ranked nodes by degree centrality, while secondary hubs comprised the subsequent tier of highly connected nodes.
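The four measures defined above can be computed directly from an adjacency structure. The sketch below uses a small toy graph (the edges are illustrative, not the study's network) and follows the definitions as worded in the text; Cytoscape may apply different normalizations, so absolute values are not expected to match its output.

```python
from collections import deque

# Toy undirected graph; edges are illustrative only.
EDGES = [("ESR1", "AR"), ("ESR1", "PGR"), ("AR", "PGR"),
         ("ESR1", "NR3C1"), ("NR3C1", "POMC")]


def build_adjacency(edges):
    adjacency = {}
    for a, b in edges:
        adjacency.setdefault(a, set()).add(b)
        adjacency.setdefault(b, set()).add(a)
    return adjacency


def path_counts(adjacency, source):
    """BFS distances and number of shortest paths from source to each reachable node."""
    dist, sigma = {source: 0}, {source: 1}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for nb in adjacency[node]:
            if nb not in dist:
                dist[nb] = dist[node] + 1
                sigma[nb] = 0
                queue.append(nb)
            if dist[nb] == dist[node] + 1:
                sigma[nb] += sigma[node]
    return dist, sigma


def degree(adjacency, node):
    return len(adjacency[node])


def closeness(adjacency, node):
    """Reciprocal of the summed shortest distances to all other reachable nodes."""
    total = sum(path_counts(adjacency, node)[0].values())
    return 1.0 / total if total else 0.0


def clustering_coefficient(adjacency, node):
    """Fraction of neighbor pairs that are themselves connected."""
    nbrs = list(adjacency[node])
    k = len(nbrs)
    if k < 2:
        return 0.0
    links = sum(1 for i in range(k) for j in range(i + 1, k)
                if nbrs[j] in adjacency[nbrs[i]])
    return links / (k * (k - 1) / 2)


def betweenness(adjacency, node):
    """Unnormalized share of shortest paths between other node pairs passing through node."""
    nodes = sorted(adjacency)
    info = {n: path_counts(adjacency, n) for n in nodes}
    total = 0.0
    for i, s in enumerate(nodes):
        for t in nodes[i + 1:]:
            if node in (s, t):
                continue
            dist_s, sigma_s = info[s]
            dist_t, sigma_t = info[t]
            if t not in dist_s or node not in dist_s or node not in dist_t:
                continue
            if dist_s[node] + dist_t[node] == dist_s[t]:
                total += sigma_s[node] * sigma_t[node] / sigma_s[t]
    return total


adj = build_adjacency(EDGES)
print(degree(adj, "ESR1"), closeness(adj, "ESR1"),
      clustering_coefficient(adj, "ESR1"), betweenness(adj, "ESR1"))
```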

Pathway enrichment analysis

Functional characterization of the network was conducted through pathway enrichment analysis: KEGG pathway enrichment was performed using the KEGG database (https://www.genome.jp/kegg/) with Benjamini–Hochberg adjusted p < 0.05.

Gene Ontology (GO) enrichment analysis was conducted in R (version 4.3.2) using the clusterProfiler package (version 4.10.1), examining Biological Process (BP), Cellular Component (CC), and Molecular Function (MF) categories. Significantly enriched pathways and GO terms were visualized and interpreted to identify biological processes potentially affected by atrazine exposure.
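Over-representation analyses of this kind are commonly based on a hypergeometric tail probability followed by Benjamini–Hochberg adjustment. The sketch below illustrates that calculation in plain Python; it is not the clusterProfiler implementation, and the function names and example numbers are hypothetical.

```python
from math import comb


def hypergeom_pvalue(k, n, K, N):
    """P(X >= k): k hits among n query genes, K annotated genes in a background of N."""
    upper = min(n, K)
    numerator = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return numerator / comb(N, n)


def benjamini_hochberg(pvalues):
    """Return BH-adjusted p-values in the original input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical example: 5 of 26 query genes fall in a 100-gene term, background 20,000.
p = hypergeom_pvalue(k=5, n=26, K=100, N=20000)
adj_p = benjamini_hochberg([0.01, 0.04, 0.03, 0.5])
print(p, adj_p)
```

A term would pass the threshold stated above when its adjusted value is below 0.05.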

Terminology: “high-degree nodes” refer to nodes ranked among the top four by degree centrality within the 26-gene connected sub-network, as an operational network-based definition.

“Model-prioritized candidates” denote genes ranked highly by the supervised learning framework based on network-derived features, reflecting relative prioritization under a weakly supervised setting in which three literature-labeled genes were used as seed positives, rather than confirmed biological involvement. “Targets” refer to proteins (estrogen receptor alpha [ERα], androgen receptor [AR], aromatase [encoded by CYP19A1], glucocorticoid receptor [encoded by NR3C1], and progesterone receptor [PGR]) included in molecular docking analyses as putative in silico atrazine-binding candidates.

Network feature–assisted gene prioritization framework

Feature engineering

To characterize gene properties within the atrazine-associated interaction network, we constructed a feature set comprising conventional network metrics, pathway-derived attributes, and several composite indices designed to integrate topological and functional information.

Traditional network metrics included degree, betweenness centrality, closeness centrality, and clustering coefficient.

Pathway-related features were derived from pathway enrichment analyses and included the minimum nominal p-value (min_pvalue), minimum adjusted p-value (min_p_adjust), minimum q-value (min_qvalue), and the number of enriched pathways in which a given gene appeared (pathway_count).

In addition, we defined several engineered composite features to summarize multiple aspects of network topology and pathway involvement:

  • (i) Degree–Clustering Index, defined as the product of degree and clustering coefficient: Degree–Clustering Index = Degree × Clustering Coefficient. It is intended to jointly capture global connectivity and local neighborhood density.

  • (ii) Centrality Composite Index, defined as a weighted combination of normalized degree, betweenness centrality, and closeness centrality: CentralityIndex = ω₁·(Degree/max(Degree)) + ω₂·(Betweenness/max(Betweenness)) + ω₃·(Closeness/max(Closeness)). Candidate weight triples (ω₁, ω₂, ω₃) satisfying ω₁ + ω₂ + ω₃ = 1 were evaluated using a constrained grid search with a step size of 0.05. Model performance under each weighting scheme was assessed using leave-one-out cross-validation based on the three literature-labeled seed positives. The selected weights (0.4, 0.3, 0.3) were adopted for subsequent analyses and should be interpreted as heuristic and context-specific rather than optimal or generalizable.

  • (iii) Connection Density, defined as the ratio of clustering coefficient to degree: ConnectionDensity = Clustering Coefficient / Degree. It describes local clustering relative to the number of connections.

  • (iv) Pathway Significance Index, defined as the negative logarithm of the minimum pathway p-value: PathwaySignificance = −log₁₀(min_pvalue). It is applied to stabilize the scale of enrichment significance measures.

  • (v) Network–Pathway Integration Index, defined as NetPathIndex = CentralityIndex / (pathway_count + 1), integrating network centrality with pathway participation.
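The five composite indices can be written out as a short worked example. The sketch below applies the formulas above to one hypothetical gene record (all input values are invented for illustration) and enumerates the weight triples for a 0.05-step grid. Whether the original search allowed zero weights is not stated, so including them here is an assumption.

```python
from math import log10


def composite_features(gene, max_degree, max_betweenness, max_closeness,
                       weights=(0.4, 0.3, 0.3)):
    """Compute the five engineered indices for one gene from its raw metrics."""
    w1, w2, w3 = weights
    centrality = (w1 * gene["degree"] / max_degree
                  + w2 * gene["betweenness"] / max_betweenness
                  + w3 * gene["closeness"] / max_closeness)
    return {
        "degree_clustering": gene["degree"] * gene["clustering"],
        "centrality_index": centrality,
        "connection_density": gene["clustering"] / gene["degree"],
        "pathway_significance": -log10(gene["min_pvalue"]),
        "net_path_index": centrality / (gene["pathway_count"] + 1),
    }


def weight_grid(step=0.05):
    """All non-negative triples on the step grid that sum to 1 (zero weights allowed)."""
    n = round(1 / step)
    return [(i / n, j / n, (n - i - j) / n)
            for i in range(n + 1) for j in range(n + 1 - i)]


# Hypothetical gene record; values are invented, not taken from the study.
toy_gene = {"degree": 4, "betweenness": 10.0, "closeness": 0.5,
            "clustering": 0.5, "min_pvalue": 0.001, "pathway_count": 3}
features = composite_features(toy_gene, max_degree=8,
                              max_betweenness=20.0, max_closeness=1.0)
grid = weight_grid()
print(features, len(grid))
```

Note that Connection Density is only defined for nodes with degree of at least 1, which holds for every node in a connected sub-network.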

For methodological comparison, these features were evaluated using a small set of literature-supported endocrine-related genes (ESR1, CYP19A1, AR, NR3C1, NR5A1, and KISS1). Comparative performance between conventional network metrics and composite features is reported in the Results section.

Dataset preparation and class imbalance handling

Based on the literature evidence and enrichment patterns, we defined a working label set in which three genes (ESR1, CYP19A1, and AR) were treated as literature-labeled seed positives for supervised prioritization. The candidate pool was derived from the initial set of 78 atrazine-related genes identified in Section “Identification of atrazine-related genes”: the protein-protein interaction (PPI) network (see Section “Protein-protein interaction network construction”) retained 33 genes with high-confidence interactions (combined score ≥ 0.7), and removing disconnected nodes in Cytoscape (version 3.9.1) yielded a connected sub-network of 26 genes. The three seed positives were chosen based on multiple lines of evidence:

ESR1: Multiple studies have demonstrated that atrazine exposure affects estrogen signaling. Hayes et al. [5] reported feminization effects in male frogs following atrazine exposure, while Roberge et al. [17] investigated atrazine’s effects on estrogen-responsive genes. Although direct binding of atrazine to ERα (encoded by ESR1) remains controversial, its disruption of estrogen-dependent pathways is well documented.

CYP19A1 (Aromatase): Strong experimental evidence has established that atrazine increases aromatase expression and activity. Sanderson et al. [18] demonstrated that atrazine induces aromatase activity in H295R human adrenocortical carcinoma cells, while Fan et al. [7] elucidated the SF-1 dependent mechanism by which atrazine upregulates aromatase expression, increasing estrogen production through enhanced testosterone conversion.

AR (Androgen Receptor): Hayes et al. [19] provided evidence supporting atrazine’s anti-androgenic effects, while Solomon et al. [20] discussed atrazine’s impact on androgen-dependent developmental processes across multiple species.

Additionally, these three genes showed significant enrichment in hormone signaling pathways in our network analysis and ranked among the top nodes by multiple centrality measures. While other genes also showed associations with atrazine exposure, these three candidates had the strongest combined evidence from both experimental studies and our network analysis.

These literature-labeled positives (seed positives) served as positive examples for our machine learning models, while the remaining 23 genes were treated as unlabeled/putative negatives for model training, acknowledging that some may represent true positives that are currently under-studied. Accordingly, the supervised learning step should be interpreted as weakly supervised prioritization that may inherit literature bias; therefore, downstream docking and future biological experiments are necessary to substantiate the newly prioritized candidates.

Machine learning framework

Given the extremely small and imbalanced labeled set (3 literature-labeled seed positives versus 23 unlabeled genes), classifier performance was evaluated using gene-level leave-one-out cross-validation (LOOCV; K = 26). In each iteration, one gene was held out for evaluation and the remaining 25 genes were used for model training; this procedure was repeated until every gene had served once as the held-out test sample. For reproducibility, all stochastic learners were run with a fixed random seed (random_state = 2024). Any operations that could introduce information leakage were fitted exclusively on each training set and subsequently applied to the held-out gene. For confusion matrices, predicted probabilities were converted to class labels using a pre-specified threshold of 0.5 (i.e., no data-driven threshold optimization was performed).
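The leave-one-out protocol, including fitting preprocessing on the training fold only, can be sketched as follows. The nearest-centroid scorer is a deliberately simple stand-in for the study's learners, feature standardization is an assumed example of a leakage-prone step, and the one-dimensional toy data are invented.

```python
def fit_scaler(rows):
    """Per-feature mean and standard deviation, estimated on training rows only."""
    columns = list(zip(*rows))
    means = [sum(c) / len(c) for c in columns]
    stds = []
    for c, m in zip(columns, means):
        sd = (sum((v - m) ** 2 for v in c) / len(c)) ** 0.5
        stds.append(sd if sd > 0 else 1.0)
    return means, stds


def apply_scaler(row, means, stds):
    return [(v - m) / s for v, m, s in zip(row, means, stds)]


def centroid_score(train_X, train_y, x):
    """Stand-in scorer in [0, 1]: higher when x lies closer to the positive centroid."""
    def centroid(label):
        rows = [r for r, l in zip(train_X, train_y) if l == label]
        return [sum(c) / len(c) for c in zip(*rows)]

    def dist(a, b):
        return sum((p - q) ** 2 for p, q in zip(a, b)) ** 0.5

    d_pos, d_neg = dist(x, centroid(1)), dist(x, centroid(0))
    return 0.5 if d_pos + d_neg == 0 else d_neg / (d_pos + d_neg)


def loocv(X, y, threshold=0.5):
    """Hold out each sample once; scale and score using the remaining samples only."""
    predictions = []
    for held_out in range(len(X)):
        train_X = [row for i, row in enumerate(X) if i != held_out]
        train_y = [label for i, label in enumerate(y) if i != held_out]
        means, stds = fit_scaler(train_X)  # no information from the held-out gene
        train_Z = [apply_scaler(r, means, stds) for r in train_X]
        test_z = apply_scaler(X[held_out], means, stds)
        score = centroid_score(train_Z, train_y, test_z)
        predictions.append(int(score >= threshold))
    return predictions


# Invented toy data: three positives with high feature values, four negatives.
X = [[9.0], [10.0], [11.0], [0.0], [1.0], [2.0], [1.0]]
y = [1, 1, 1, 0, 0, 0, 0]
preds = loocv(X, y)
print(preds)
```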

Four models were implemented: three base learners and one ensemble. Random Forest (RF) models were trained using 100 decision trees with a maximum depth of 5, with √(n_features) randomly selected features considered at each split. Gradient Boosting Trees (GBT) were implemented with 100 estimators, a learning rate of 0.1, and a maximum tree depth of 3. Support Vector Machine (SVM) models used a radial basis function kernel (C = 1.0), with class weights adjusted to account for class imbalance. The ensemble model was constructed by combining predicted probabilities from the three base learners using weighted averaging (RF: 0.4, GBT: 0.4, SVM: 0.2) to leverage complementary algorithmic strengths. Model performance was assessed using accuracy, precision, recall, F1-score, and the area under the receiver operating characteristic curve (AUC). Feature importance was evaluated using permutation importance, defined as the decrease in model performance following random shuffling of individual features.
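The ensemble step reduces to a weighted average of the three base-learner probabilities followed by the fixed 0.5 threshold. A minimal sketch, using the weights stated above and invented probabilities for two hypothetical genes:

```python
# Ensemble weights as stated in the text.
ENSEMBLE_WEIGHTS = {"rf": 0.4, "gbt": 0.4, "svm": 0.2}


def ensemble_probability(probabilities, weights=ENSEMBLE_WEIGHTS):
    """Weighted average of per-model positive-class probabilities."""
    return sum(weights[name] * probabilities[name] for name in weights)


def to_label(probability, threshold=0.5):
    """Convert a probability to a class label with a pre-specified threshold."""
    return int(probability >= threshold)


# Invented per-model probabilities for two hypothetical genes.
gene_a = {"rf": 0.9, "gbt": 0.8, "svm": 0.3}
gene_b = {"rf": 0.2, "gbt": 0.3, "svm": 0.9}

p_a = ensemble_probability(gene_a)
p_b = ensemble_probability(gene_b)
print(p_a, to_label(p_a), p_b, to_label(p_b))
```

With these weights a confident SVM alone cannot push a gene over the threshold, as the second toy gene shows.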

Molecular docking analysis

To analyze the binding affinities and modes of interaction between atrazine and its predicted molecular targets, AutoDock Vina (version 1.2.2) was employed for in silico protein–ligand docking. The molecular structure of atrazine was retrieved from PubChem (CID: 2256). The 3D coordinates of estrogen receptor alpha (ERα; PDB ID: 1A52; resolution: 2.8 Å), CYP19A1 (PDB ID: 3EQM; resolution: 2.9 Å), androgen receptor (AR; PDB ID: 1E3G; resolution: 2.0 Å), NR3C1 (PDB ID: 1M2Z; resolution: 2.5 Å), and progesterone receptor (PGR; PDB ID: 1A28; resolution: 1.8 Å) were downloaded from the RCSB Protein Data Bank. Protein and ligand structures were prepared and converted to PDBQT format by removing crystallographic water molecules, adding polar hydrogen atoms, and assigning AutoDock atom types as required by Vina. Docking was performed with a rigid receptor model. For blind docking, the search space was centered at the geometric center of each receptor, and the grid box dimensions were set to 30 Å × 30 Å × 30 Å. Unless otherwise specified, Vina default search parameters were used (exhaustiveness = 8, num_modes = 9, and energy_range = 3 kcal/mol), and the random seed was not explicitly fixed (default behavior).
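The blind-docking setup can be illustrated by computing a receptor's geometric center and writing the corresponding Vina configuration. In the sketch below, the geometric center is taken as the unweighted mean of atom coordinates (one reasonable reading of the text), the file names are hypothetical, and the two toy atoms are generated by a helper rather than read from a real structure.

```python
def pdb_atom_line(serial, x, y, z):
    """Build a minimal fixed-width ATOM record (hypothetical helper for toy input)."""
    return f"ATOM  {serial:5d}  CA  ALA A{serial:4d}    {x:8.3f}{y:8.3f}{z:8.3f}"


def geometric_center(pdb_lines):
    """Unweighted mean of atom coordinates read from PDB columns 31-54."""
    coords = []
    for line in pdb_lines:
        if line.startswith(("ATOM", "HETATM")):
            coords.append((float(line[30:38]), float(line[38:46]), float(line[46:54])))
    n = len(coords)
    return tuple(sum(axis) / n for axis in zip(*coords))


def vina_config(receptor, ligand, center, box=30.0):
    """Render a Vina config using the box size and search defaults stated in the text."""
    cx, cy, cz = center
    lines = [
        f"receptor = {receptor}",
        f"ligand = {ligand}",
        f"center_x = {cx:.3f}",
        f"center_y = {cy:.3f}",
        f"center_z = {cz:.3f}",
        f"size_x = {box}",
        f"size_y = {box}",
        f"size_z = {box}",
        "exhaustiveness = 8",
        "num_modes = 9",
        "energy_range = 3",
    ]
    return "\n".join(lines)


toy_pdb = [pdb_atom_line(1, 0.0, 0.0, 0.0), pdb_atom_line(2, 10.0, 20.0, 30.0)]
center = geometric_center(toy_pdb)
# File names below are placeholders, not the files used in the study.
config = vina_config("receptor.pdbqt", "atrazine.pdbqt", center)
print(config)
```

Because the random seed was left at its default, repeated runs of such a configuration may return slightly different poses and scores.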

Results

Network construction and topological analysis

The protein-protein interaction (PPI) network was constructed from the 78 atrazine-related genes using high-confidence interactions in the STRING database and refined in Cytoscape to a connected sub-network of 26 genes and 36 edges (Fig. 3b). Within this sub-network, the major high-degree nodes were ESR1, NR3C1, AR, and POMC (in red), and the secondary hubs were CYP19A1, PGR, CASP3, MAPK1, and others (in orange). This structure reveals functional clustering of nuclear receptor proteins and their signaling molecules.

Fig. 3

Network analysis and pathway enrichment of atrazine-related genes. (a) Protein–protein interaction network of atrazine-related genes (confidence score ≥ 0.7). (b) Connected 26-gene subnetwork highlighting high-degree nodes, with primary hubs (ESR1, NR3C1, AR, POMC) shown in red and secondary hubs (CYP19A1, PGR, CASP3, MAPK1, etc.) in orange. (c) GO enrichment analysis of biological processes, cellular components, and molecular functions, with enrichment observed in steroid metabolism, reproductive development, and hormone receptor activity. (d) KEGG pathway enrichment analysis shown as a bubble plot, where bubble size represents gene count and color indicates significance; enriched pathways include phospholipase D signaling, prolactin signaling, and neuroactive ligand–receptor interaction

The network exhibits a sparse yet organized structure, with major hubs like POMC and ESR1 showing high connectivity, primarily linked to hormone signaling pathways, especially nuclear receptor-mediated signaling. Betweenness centrality analysis identified ESR1, AR, and MAPK1 as key bridge nodes, facilitating interactions across functional modules. This organization supports atrazine’s potential endocrine-disrupting mechanisms via hormone signaling pathways.

Pathway enrichment analysis

GO enrichment analysis (Fig. 3c) revealed significant enrichment in biological processes (orange bars), cellular components (green bars), and molecular functions (blue bars). In total, we identified 892 significantly enriched GO terms (including 774 Biological Process, 44 Cellular Component, and 74 Molecular Function terms) and 50 significantly enriched KEGG pathways under the predefined threshold.

The top enriched biological processes included “steroid metabolic process,” “gonadal development,” “reproductive development,” and “sex differentiation,” directly relevant to endocrine and reproductive functions. Key enriched molecular functions included “steroid hormone receptor activity,” “hormone binding,” and “ligand-activated transcription factor activity,” further supporting the endocrine-related mechanisms of atrazine.

KEGG pathway enrichment analysis (Fig. 3d) identified several significantly enriched pathways, visualized as a bubble plot where bubble size represents gene count and color intensity indicates p-value significance. The top enriched pathways included “Phospholipase D signaling pathway,” “Long-term depression,” “Amphetamine addiction,” “Prolactin signaling pathway,” and “Neuroactive ligand-receptor interaction.” The enrichment of hormone-related pathways, particularly prolactin signaling, and neuroactive ligand-receptor interaction pathways, is consistent with atrazine’s reported effects on reproductive and neuroendocrine function.

These enrichment results collectively indicate that the identified atrazine-related genes are primarily involved in hormone signaling, reproductive development, and transcriptional regulation, supporting their relevance to atrazine’s endocrine disrupting effects. The combined GO and KEGG enrichment patterns suggest that atrazine may affect multiple aspects of endocrine function through various signaling pathways.

Machine learning–assisted candidate target prioritization

Model performance

Given the extremely small and imbalanced labeled set (3 literature-labeled seed positives vs. 23 unlabeled negatives), the cross-validated results should be interpreted as exploratory estimates of feature-space separability rather than evidence of generalizable predictive performance (Fig. 4a). Under gene-level LOOCV, the Gradient Boosting Trees model achieved the highest AUC (0.95) and recall (1.00) for the three seed positives. However, because the positive class comprises only three genes, these metrics are inherently unstable and may overestimate performance.

Fig. 4

Machine learning–assisted prioritization of candidate genes potentially involved in atrazine-associated endocrine disruption. (a) Performance comparison of four machine learning models under internal LOOCV; metrics are shown to illustrate relative feature-space separability in a weakly supervised setting rather than generalizable predictive accuracy. (b) Receiver operating characteristic (ROC) curves summarizing internal discrimination performance (AUC range: 0.87–0.95) under cross-validation. (c) Permutation-based feature importance analysis indicating the Degree–Clustering Index and Centrality Composite Index as the most influential features for candidate prioritization. (d) Model-derived prioritization scores highlighting literature-labeled seed positives (ESR1, CYP19A1, AR) and model-prioritized candidate genes (NR3C1 and PGR). (e) Feature distribution comparison between model-prioritized candidates and lower-ranked genes across key network-derived metrics. (f) Confusion matrices summarizing internal cross-validation results for each model. (g) Two-dimensional visualization of genes in the feature space defined by the Centrality Composite Index and Degree–Clustering Index, illustrating separation between literature-labeled seed positives, model-prioritized candidates, and lower-ranked genes

The ensemble model yielded the highest precision (1.00) and accuracy (0.95) during cross-validation, but these values reflect performance within a small internal resampling scheme and should not be interpreted as proof of out-of-sample reliability. ROC curve analysis (Fig. 4b) showed that Gradient Boosting exhibited the strongest overall discrimination (AUC = 0.95), followed by the Ensemble Model (AUC = 0.92), Random Forest (AUC = 0.88), and SVM (AUC = 0.87).

Confusion matrix analysis highlighted complementary performance characteristics among models (Fig. 4f). The Gradient Boosting Trees model correctly identified all three literature-labeled seed positives (sensitivity = 1.00) but generated one false positive. In contrast, the ensemble model achieved perfect specificity with no false positives, but missed one seed positive (false negative; ESR1). This complementarity motivated the use of an ensemble strategy to balance sensitivity and specificity in the weakly supervised setting. For completeness, accuracy, precision, recall, F1-score, and AUC for each model under gene-level LOOCV are reported in Supplementary Table S2.
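The trade-off described above can be made concrete by computing sensitivity, precision, and specificity from confusion-matrix counts. The counts below are inferred from the narrative (3 seed positives, 23 putative negatives, one false positive for GBT, one false negative for the ensemble) and are not copied from Supplementary Table S2.

```python
def confusion_metrics(tp, fp, fn, tn):
    """Sensitivity (recall), precision, and specificity from confusion-matrix counts."""
    return {
        "recall": tp / (tp + fn),
        "precision": tp / (tp + fp),
        "specificity": tn / (tn + fp),
    }


# Counts inferred from the text, assuming 3 positives and 23 putative negatives.
gbt = confusion_metrics(tp=3, fp=1, fn=0, tn=22)
ensemble = confusion_metrics(tp=2, fp=0, fn=1, tn=23)
print(gbt)
print(ensemble)
```

With only three positives, a single misclassified gene shifts recall by a third, which illustrates why these metrics are unstable in this setting.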

Feature importance analysis

Permutation importance analysis indicated that engineered composite features were more influential than traditional single network metrics in the supervised prioritization model (Fig. 4c). The Degree–Clustering Index emerged as the most influential feature (importance score: 0.26), followed by the Centrality Composite Index (0.22). Among traditional metrics, Degree showed moderate importance (0.16), while other single features contributed less substantially.

This finding underscores that model-prioritized candidate genes tend to exhibit a combination of high connectivity and strategic positioning within densely connected network modules, rather than being distinguished by any single network property in isolation.
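Permutation importance, as defined in the Methods, is the drop in performance after shuffling one feature column. The sketch below shows the mechanics with a toy threshold model and invented data; the scoring metric (accuracy), the number of repeats, and the function names are illustrative assumptions rather than the study's settings.

```python
import random


def accuracy(model, X, y):
    return sum(model(row) == label for row, label in zip(X, y)) / len(y)


def permutation_importance(model, X, y, feature, n_repeats=20, seed=2024):
    """Mean drop in accuracy after shuffling one feature column across samples."""
    rng = random.Random(seed)
    baseline = accuracy(model, X, y)
    drops = []
    for _ in range(n_repeats):
        column = [row[feature] for row in X]
        rng.shuffle(column)
        shuffled = [row[:feature] + [value] + row[feature + 1:]
                    for row, value in zip(X, column)]
        drops.append(baseline - accuracy(model, shuffled, y))
    return sum(drops) / n_repeats


def toy_model(row):
    """Toy classifier that uses only feature 0 and ignores feature 1."""
    return int(row[0] > 0.5)


# Invented data: feature 0 is informative, feature 1 is noise.
X = [[0.9, 3], [0.8, 1], [0.1, 2], [0.2, 5], [0.3, 4], [0.7, 0]]
y = [1, 1, 0, 0, 0, 1]

importance_informative = permutation_importance(toy_model, X, y, feature=0)
importance_noise = permutation_importance(toy_model, X, y, feature=1)
print(importance_informative, importance_noise)
```

A feature the model never consults receives an importance of exactly zero, while shuffling an informative feature degrades accuracy.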

Machine learning–based candidate gene prioritization

The models successfully identified all three literature-labeled positives (seed positives; ESR1, CYP19A1, and AR) with high consistency across different algorithms as shown in Fig. 4d. This supports our initial literature-based selection, although it should be noted that the machine learning models were trained using these genes as positive examples. More interestingly, two novel candidates were identified with substantial prediction scores: NR3C1 and PGR.

Notably, both newly prioritized candidates belong to the nuclear receptor superfamily, which also includes the literature-labeled seed positives estrogen receptor alpha (ERα) and androgen receptor (AR). This coherence lends biological plausibility to the predictions, because genes within the same family often share functional and regulatory characteristics, and it suggests that the machine learning approach may be detecting biologically meaningful patterns beyond the training examples.

Feature distribution analysis revealed consistent and substantial differences between model-prioritized candidate genes and lower-ranked genes across all metrics (Fig. 4e). The most pronounced disparity was observed in the Degree-Clustering Index and Degree, with model-prioritized candidates exhibiting higher values than lower-ranked genes.

The two-dimensional visualization of genes in the feature space defined by the Centrality Composite Index and Degree–Clustering Index showed clear separation between model-prioritized candidates and lower-ranked genes (Fig. 4g). Literature-labeled seed positives (red) clustered in the high-value region of the plot, while the newly prioritized candidates NR3C1 and PGR (orange) were positioned adjacent to this cluster, consistent with their prioritized status.

Molecular docking results

Molecular docking simulations were conducted to examine predicted binding interactions between atrazine and five proteins selected from the network and prioritization analyses (four nuclear receptors and aromatase). Predicted binding affinities, expressed as binding free energy (ΔG), were estimated for each protein–ligand complex (Fig. 5a–e).

Fig. 5

Molecular docking of atrazine to candidate target proteins. Predicted binding poses of atrazine in five key proteins: (a) ERα (estrogen receptor alpha, binding energy −5.97 kcal/mol), (b) CYP19A1 (aromatase, −5.553 kcal/mol), (c) AR (androgen receptor, −5.728 kcal/mol), (d) NR3C1 (glucocorticoid receptor, −6.248 kcal/mol), and (e) PGR (progesterone receptor, −6.495 kcal/mol). Each panel shows the overall protein structure (left) and magnified binding site (right), with atrazine shown in red, key interacting amino acid residues in green, and hydrogen bonds as yellow dashed lines. The most favorable predicted binding energies were observed for PGR and NR3C1, suggesting these receptors as candidate targets that warrant further experimental investigation

Atrazine showed heterogeneous predicted affinities across the examined proteins, with the strongest binding predicted for the progesterone receptor (PGR, −6.495 kcal/mol), followed by the glucocorticoid receptor (NR3C1, −6.248 kcal/mol), estrogen receptor alpha (ERα, −5.97 kcal/mol), androgen receptor (AR, −5.728 kcal/mol), and aromatase (CYP19A1, −5.553 kcal/mol).
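To give a sense of scale for these scores, a predicted ΔG can be converted to an apparent dissociation constant through ΔG = RT·ln(Kd). The sketch below applies that textbook relation at an assumed temperature of 298.15 K. Docking scores are empirical estimates rather than measured free energies, so the resulting values are rough order-of-magnitude illustrations only and were not reported in the study.

```python
from math import exp

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)


def apparent_kd(delta_g, temperature=298.15):
    """Apparent dissociation constant (mol/L) implied by delta_g via dG = RT ln(Kd)."""
    return exp(delta_g / (R_KCAL * temperature))


# Docking scores reported in the text (kcal/mol).
SCORES = {"PGR": -6.495, "NR3C1": -6.248, "ERa": -5.97,
          "AR": -5.728, "CYP19A1": -5.553}

kd = {name: apparent_kd(score) for name, score in SCORES.items()}
fold_pgr_vs_cyp19a1 = kd["CYP19A1"] / kd["PGR"]
for name, value in kd.items():
    print(f"{name}: {value * 1e6:.1f} micromolar (apparent)")
```

Under these assumptions all five scores map to the micromolar range, and the spread between the best and worst score corresponds to only a few-fold difference, consistent with describing the affinities as moderate.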

Binding pose analysis suggested that atrazine may localize within the ligand-binding regions of these proteins. In ERα (Fig. 5a), atrazine was predicted to occupy the hydrophobic pocket with an interaction involving LEU-346. In CYP19A1 (Fig. 5b), atrazine was positioned near the catalytic region, with a predicted interaction involving THR-310. In AR (Fig. 5c), the docked conformation placed atrazine within the receptor binding cavity, with interactions involving MET-894.

For NR3C1 (Fig. 5d), atrazine was predicted to form multiple interactions within the ligand-binding domain, including hydrogen bonds with TYR-735 and THR-739. The most favorable predicted binding was observed for PGR (Fig. 5e), where atrazine interacted with ASN-719 and LEU-718, consistent with its comparatively lower predicted binding free energy.

Overall, these docking results suggest that atrazine may engage hormone receptor binding pockets across multiple nuclear receptors, with relatively stronger predicted interactions for progesterone and glucocorticoid receptors. However, these findings are based solely on in silico docking and do not establish competitive binding or functional agonistic or antagonistic effects. Accordingly, the observed interaction patterns should be interpreted as structural hypotheses that require experimental validation.

Discussion

This study employed an integrated approach combining network toxicology, machine learning, and molecular docking to more comprehensively characterize the molecular mechanisms underlying atrazine’s endocrine disrupting effects. Our systematic analysis pipeline, spanning gene identification, network-based feature engineering, and molecular docking, provided multi-level evidence for atrazine’s mechanisms of action. Our findings not only support previously established targets but also nominate underexplored pathways, particularly involving progesterone and glucocorticoid signaling via the PGR and NR3C1 receptors.

Network analysis reveals hormone signaling as primary target

The protein-protein interaction network analysis highlighted the centrality of hormone signaling pathways in atrazine’s biological effects. The identification of ESR1, AR, and CYP19A1 as high-degree nodes is consistent with previous studies implicating estrogen and androgen signaling in atrazine toxicity. Fan et al. [7] demonstrated atrazine’s ability to induce aromatase expression, potentially increasing estrogen production, while Suzawa and Ingraham [21] reported effects on estrogen receptor signaling.

The enrichment of multiple hormone-related pathways, including estrogen, progesterone, prolactin, and GnRH signaling, provides a systems-level view of atrazine’s potential impacts. This multi-pathway effect may explain the diverse reproductive abnormalities observed across different experimental models, from amphibians to mammals [5, 7].

Machine learning identifies novel candidate genes

Our study introduces composite network features for candidate target prioritization in toxicological networks. The Degree-Clustering Index (importance score: 0.26) and Centrality Composite Index (0.22) received higher importance scores than traditional metrics such as degree alone (0.16), suggesting that the model preferentially ranks genes characterized by joint patterns of high connectivity and dense local clustering. Unlike conventional methods relying on single centrality measures, our approach integrates network topology with pathway enrichment, tailoring it for toxicant target identification [22]. This is critical as toxicants like atrazine often target functional modules rather than isolated proteins [23].
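The idea of a feature that rewards joint connectivity and local clustering can be sketched on a toy graph. The exact formula for the Degree-Clustering Index is not restated in this section, so the definition used below (max-normalized degree multiplied by the local clustering coefficient) is an assumption made purely for illustration, and the node names and edges are invented rather than drawn from the study network.

```python
from itertools import combinations

# Toy undirected network; node names and edges are invented for illustration.
EDGES = [("A", "B"), ("A", "C"), ("B", "C"), ("A", "D"), ("D", "E")]


def build_adjacency(edges):
    """Return a dict mapping each node to the set of its neighbors."""
    adjacency = {}
    for u, v in edges:
        adjacency.setdefault(u, set()).add(v)
        adjacency.setdefault(v, set()).add(u)
    return adjacency


def local_clustering(adjacency, node):
    """Fraction of neighbor pairs of `node` that are themselves connected."""
    neighbors = adjacency[node]
    k = len(neighbors)
    if k < 2:
        return 0.0
    links = sum(1 for u, v in combinations(neighbors, 2) if v in adjacency[u])
    return 2.0 * links / (k * (k - 1))


def degree_clustering_index(adjacency):
    """Hypothetical composite: max-normalized degree times local clustering."""
    max_degree = max(len(nbrs) for nbrs in adjacency.values())
    return {
        node: (len(nbrs) / max_degree) * local_clustering(adjacency, node)
        for node, nbrs in adjacency.items()
    }


adjacency = build_adjacency(EDGES)
scores = degree_clustering_index(adjacency)
for node in sorted(scores, key=scores.get, reverse=True):
    print(node, round(scores[node], 3))
```

In this toy graph, node A has the highest degree but scores below B and C, because only one of its three neighbor pairs is connected. This is the sense in which a composite of this kind can rank genes differently from degree alone.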

These features consistently predicted our literature-labeled positives (seed positives; ESR1, CYP19A1, and AR) across models, supporting the internal consistency of the prioritization framework. More notably, they prioritized NR3C1 and PGR as novel candidate mediators, potentially extending atrazine’s known effects beyond estrogen and androgen signaling. However, PGR is a well-recognized estrogen-responsive gene within estrogen receptor signaling contexts; therefore, its prioritization here may reflect downstream coupling within endocrine networks rather than an entirely independent receptor axis, and should be interpreted accordingly. While these pathways are less studied, indirect evidence supports NR3C1’s role, with Zimmerman et al. [24] linking atrazine to aldosterone production via NR3C1. These findings highlight progesterone and glucocorticoid signaling as underexplored mechanisms warranting experimental validation.

Molecular docking provides structural support for computational predictions

Molecular docking provided structural support for the computational predictions, indicating possible in silico interactions between atrazine and several nuclear receptors. Among these, PGR (−6.495 kcal/mol) and NR3C1 (−6.248 kcal/mol) showed the strongest predicted binding affinities, nominating progesterone and glucocorticoid receptor–related pathways as candidate pathways warranting further experimental investigation.

Binding pose analysis suggested that atrazine may occupy ligand-binding pockets of these receptors. In PGR, atrazine was predicted to interact with ASN-719 and LEU-718, while in NR3C1, interactions involved TYR-735 and THR-739—residues implicated in hormone recognition.

Consistent with prior reports, docking results also supported potential interactions between atrazine and classical endocrine-related targets, including ERα, CYP19A1, and AR. Collectively, these docking predictions suggest that atrazine may be structurally compatible with multiple endocrine-related proteins and that its endocrine-disrupting effects may not be limited to hormone synthesis alone. However, these findings are based on in silico analyses and require validation in biological systems.

Implications for understanding atrazine’s biological effects

The identification of PGR and NR3C1 as potential atrazine-associated targets provides insight into possible biological pathways affected by atrazine exposure. Progesterone and glucocorticoid signaling are involved in key physiological processes, including reproduction, metabolism, immune regulation, and development [25]. As PGR is an estrogen-responsive gene downstream of ERα signaling, perturbation of estrogen-related pathways by atrazine may also indirectly influence progesterone signaling. In this context, docking analysis suggests PGR as a candidate receptor with putative structural compatibility, warranting further experimental validation. Alterations in these signaling pathways may contribute to adverse outcomes reported in association with atrazine exposure, such as reproductive, metabolic, immune, and developmental effects.

The predicted ability of atrazine to associate with multiple receptors in docking analyses suggests a broad receptor-related interaction profile, which may help explain its diverse endocrine-related effects observed across experimental models. The differences in predicted binding affinity may also help explain the dose-dependent nature of atrazine’s effects, with different pathways potentially affected at different exposure levels.

Limitations and future directions

Several limitations should be noted. First, molecular docking reflects binding potential but does not capture protein–ligand dynamics or downstream signaling; molecular dynamics simulations and in vitro receptor assays are therefore required to validate the predicted interactions and clarify potential agonistic or antagonistic effects.

Second, the analysis was restricted to protein-coding genes and did not consider non-coding RNA or epigenetic mechanisms, which may also contribute to atrazine toxicity. Incorporating these layers could provide additional insight.

Third, the feature engineering and weighting scheme was optimized for this dataset, given the very limited number of literature-labeled seed positives, and should be regarded as heuristic. Although the general framework of integrating network topology and pathway information is conceptually transferable, parameter tuning and external validation are needed before broader application.

Fourth, supervised prioritization relied solely on internal cross-validation of a small gene set, with no independent test set or external toxicants available; thus, generalizability remains uncertain.

Finally, predictions were based on human proteins, and species-specific differences in receptor structure and function may affect atrazine responses across organisms.
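The constraint described in the fourth limitation can be made concrete with a small calculation. The sketch below uses only the counts given in the text (26 genes, 3 seed positives); the scores are synthetic and are not study data, and the helper `rank_auc` is a plain pairwise implementation written for this illustration rather than the evaluation code used in the study.

```python
def rank_auc(scores, labels):
    """AUC as the fraction of positive-negative pairs ranked correctly (ties count 0.5)."""
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


N_GENES = 26      # size of the connected subnetwork (from the text)
N_POSITIVES = 3   # seed positives: ESR1, CYP19A1, AR (from the text)
N_NEGATIVES = N_GENES - N_POSITIVES

labels = [1] * N_POSITIVES + [0] * N_NEGATIVES

# Synthetic scores (not study data): every positive outranks every negative.
perfect_scores = [0.9, 0.8, 0.7] + [0.5 - 0.01 * i for i in range(N_NEGATIVES)]

# Same scores, except one positive is pushed below the two top negatives.
perturbed_scores = list(perfect_scores)
perturbed_scores[2] = 0.485

majority_accuracy = N_NEGATIVES / N_GENES  # predict "negative" for every gene

print(f"pairs compared:        {N_POSITIVES * N_NEGATIVES}")
print(f"AUC, perfect ranking:  {rank_auc(perfect_scores, labels):.3f}")
print(f"AUC, one gene shifted: {rank_auc(perturbed_scores, labels):.3f}")
print(f"all-negative accuracy: {majority_accuracy:.3f}")
```

With three positives and 23 putative negatives, the AUC rests on only 69 pairwise comparisons, so moving a single positive two places down the ranking still leaves an AUC above 0.97, and a rule that labels every gene negative already reaches about 88% accuracy. High apparent discrimination in this setting is therefore weak evidence of generalizability.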

Future studies should experimentally evaluate PGR and NR3C1 using binding and functional assays, examine potential interactions across multiple receptor pathways, explore species-specific effects, and further validate and refine the proposed feature engineering strategy using additional environmental toxicants.

Conclusion

This study presents an integrative framework combining network toxicology, machine learning, and molecular docking to explore potential endocrine-disrupting mechanisms of atrazine. The results are consistent with reported effects on estrogen- and androgen-related pathways and nominate the progesterone receptor (PGR) and glucocorticoid receptor (NR3C1) as candidate targets based on relatively higher predicted binding affinities. Overall, this framework provides a hypothesis-generating approach that may be extended to the toxicological assessment of other environmental chemicals.

Supplementary Information

Below is the link to the electronic supplementary material.

Supplementary Material 1 (78.5KB, pdf)

Author contributions

Chaoyuan Jin: Conceptualization, Data curation, Formal analysis, Methodology, Visualization, Writing - original draft. Ruijinlin Hao: Conceptualization, Data curation, Formal analysis, Investigation, Methodology, Validation, Visualization. Xingxing Ren: Project administration, Funding acquisition, Investigation. Jie Shen: Supervision, Writing - review & editing.

Funding

This work was supported by the National Natural Science Foundation of China (Grant No. 82000821) and the Shanghai Sailing Program (20YF1406300).

Data availability

Data were sourced from public databases (CTD, STRING, KEGG, PDB, PubChem; identifiers in Methods). Software used is publicly available. Supporting data generated in this study are available from the corresponding author upon reasonable request.

Declarations

Ethics approval and consent to participate

Not applicable.

Competing interests

The authors declare no competing interests.

Footnotes

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Chaoyuan Jin and Ruijinlin Hao contributed equally to this work.

Contributor Information

Xingxing Ren, Email: ren.xingxing@zs-hospital.sh.cn.

Jie Shen, Email: shen2018jie@163.com.

References

  • 1. Gore AC, Chappell VA, Fenton SE, Flaws JA, Nadal A, Prins GS, et al. EDC-2: the Endocrine Society’s second scientific statement on endocrine-disrupting chemicals. Endocr Rev. 2015;36:E1–150. doi: 10.1210/er.2015-1010.
  • 2. de Albuquerque FP, de Oliveira JL, Moschini-Carlos V, Fraceto LF. An overview of the potential impacts of atrazine in aquatic environments: perspectives for tailored solutions based on nanotechnology. Sci Total Environ. 2020;700:134868. doi: 10.1016/j.scitotenv.2019.134868.
  • 3. De Rosa E, Montuori P, Di Duca F, De Simone B, Scippa S, Nubi R, et al. Assessment of atrazine contamination in the Sele river estuary: spatial distribution, human health risks, and ecological implications in Southern Europe. Environ Sci Eur. 2024;36:115. doi: 10.1186/s12302-024-00941-6.
  • 4. Graymore M, Stagnitti F, Allinson G. Impacts of atrazine in aquatic ecosystems. Environ Int. 2001;26:483–95. doi: 10.1016/s0160-4120(01)00031-9.
  • 5. Hayes TB, Khoury V, Narayan A, Nazir M, Park A, Brown T, et al. Atrazine induces complete feminization and chemical castration in male African clawed frogs (Xenopus laevis). Proc Natl Acad Sci U S A. 2010;107:4612–7. doi: 10.1073/pnas.0909519107.
  • 6. Cooper RL, Laws SC, Das PC, Narotsky MG, Goldman JM, Lee Tyrey E, et al. Atrazine and reproductive function: mode and mechanism of action studies. Birth Defects Res B Dev Reprod Toxicol. 2007;80:98–112. doi: 10.1002/bdrb.20110.
  • 7. Fan W, Yanase T, Morinaga H, Gondo S, Okabe T, Nomura M, et al. Atrazine-induced aromatase expression is SF-1 dependent: implications for endocrine disruption in wildlife and reproductive cancers in humans. Environ Health Perspect. 2007;115:720–7. doi: 10.1289/ehp.9758.
  • 8. Wirbisky SE, Freeman JL. Atrazine exposure and reproductive dysfunction through the Hypothalamus-Pituitary-Gonadal (HPG) axis. Toxics. 2015;3:414–50. doi: 10.3390/toxics3040414.
  • 9. Kucka M, Pogrmic-Majkic K, Fa S, Stojilkovic SS, Kovacevic R. Atrazine acts as an endocrine disrupter by inhibiting cAMP-specific phosphodiesterase-4. Toxicol Appl Pharmacol. 2012;265:19–26. doi: 10.1016/j.taap.2012.09.019.
  • 10. Spurgeon DJ, Jones OAH, Dorne J-LCM, Svendsen C, Swain S, Stürzenbaum SR. Systems toxicology approaches for understanding the joint effects of environmental chemical mixtures. Sci Total Environ. 2010;408:3725–34. doi: 10.1016/j.scitotenv.2010.02.038.
  • 11. Galbiati V, Buoso E, d’Emmanuele DV, Bianca R, Paola RD, Morroni F, Nocentini G, et al. Immune and nervous systems interaction in endocrine disruptors toxicity: the case of atrazine. Front Toxicol. 2021;3:649024. doi: 10.3389/ftox.2021.649024.
  • 12. Zhao S, Iyengar R. Systems pharmacology: network analysis to identify multiscale mechanisms of drug action. Annu Rev Pharmacol Toxicol. 2012;52:505–21. doi: 10.1146/annurev-pharmtox-010611-134520.
  • 13. Harrold J, Ramanathan M, Mager D. Network-based approaches in drug discovery and early development. Clin Pharmacol Ther. 2013;94:651–8. doi: 10.1038/clpt.2013.176.
  • 14. Meng X-Y, Zhang H-X, Mezei M, Cui M. Molecular docking: a powerful approach for structure-based drug discovery. Curr Comput Aided Drug Des. 2011;7:146–57.
  • 15. Agu PC, Afiukwa CA, Orji OU, Ezeh EM, Ofoke IH, Ogbu CO, et al. Molecular docking as a tool for the discovery of molecular targets of nutraceuticals in diseases management. Sci Rep. 2023;13:13398. doi: 10.1038/s41598-023-40160-2.
  • 16. Li Y, Zhou T, Liu Z, Zhu X, Wu Q, Meng C, et al. Air pollution and prostate cancer: unraveling the connection through network toxicology and machine learning. Ecotoxicol Environ Saf. 2025;292:117966. doi: 10.1016/j.ecoenv.2025.117966.
  • 17. Roberge M, Hakk H, Larsen G. Atrazine is a competitive inhibitor of phosphodiesterase but does not affect the estrogen receptor. Toxicol Lett. 2004;154:61–8. doi: 10.1016/j.toxlet.2004.07.005.
  • 18. Sanderson JT, Seinen W, Giesy JP, van den Berg M. 2-Chloro-s-triazine herbicides induce aromatase (CYP19) activity in H295R human adrenocortical carcinoma cells: a novel mechanism for estrogenicity? Toxicol Sci. 2000;54:121–7. doi: 10.1093/toxsci/54.1.121.
  • 19. Hayes TB, Stuart AA, Mendoza M, Collins A, Noriega N, Vonk A, et al. Characterization of atrazine-induced gonadal malformations in African clawed frogs (Xenopus laevis) and comparisons with effects of an androgen antagonist (cyproterone acetate) and exogenous estrogen (17beta-estradiol): support for the demasculinization/feminization hypothesis. Environ Health Perspect. 2006;114(Suppl 1):134–41. doi: 10.1289/ehp.8067.
  • 20. Solomon KR, Carr JA, Du Preez LH, Giesy JP, Kendall RJ, Smith EE, et al. Effects of atrazine on fish, amphibians, and aquatic reptiles: a critical review. Crit Rev Toxicol. 2008;38:721–72. doi: 10.1080/10408440802116496.
  • 21. Suzawa M, Ingraham HA. The herbicide atrazine activates endocrine gene networks via non-steroidal NR5A nuclear receptors in fish and mammalian cells. PLoS ONE. 2008;3:e2117. doi: 10.1371/journal.pone.0002117.
  • 22. Barel G, Herwig R. Network and pathway analysis of toxicogenomics data. Front Genet. 2018;9:484. doi: 10.3389/fgene.2018.00484.
  • 23. Law JN, Orbach SM, Weston BR, Steele PA, Rajagopalan P, Murali TM. Computational construction of toxicant signaling networks. Chem Res Toxicol. 2023;36:1267–77. doi: 10.1021/acs.chemrestox.2c00422.
  • 24. Zimmerman AD, Mackay L, Kemppainen RJ, Jones MA, Read CC, Schwartz D, et al. The herbicide atrazine potentiates angiotensin II-induced aldosterone synthesis and release from adrenal cells. Front Endocrinol (Lausanne). 2021;12. doi: 10.3389/fendo.2021.697505.
  • 25. Kadmiel M, Cidlowski JA. Glucocorticoid receptor signaling in health and disease. Trends Pharmacol Sci. 2013;34:518–30. doi: 10.1016/j.tips.2013.07.003.

Articles from BMC Pharmacology & Toxicology are provided here courtesy of BMC
