Environmental Health Perspectives. 2024 Apr 3;132(4):047002. doi: 10.1289/EHP12886

High-Throughput Transcriptomics of Nontumorigenic Breast Cells Exposed to Environmentally Relevant Chemicals

Kimberley E Sala-Hamrick 1,*, Anagha Tapaswi 1,*, Katelyn M Polemi 1, Vy K Nguyen 1,3, Justin A Colacino 1,2,4
PMCID: PMC10990114  PMID: 38568856

Abstract

Background:

A suite of chemicals, including metals, pesticides, and personal care product compounds, is commonly detected at high levels in chemical biomarker screens from the US Centers for Disease Control and Prevention’s National Health and Nutrition Examination Survey (NHANES). Whether these chemicals influence the development of breast cancer is not well understood.

Objectives:

The objectives were to perform an unbiased concentration-dependent assessment of these chemicals, to quantify differences in cancer-specific genes and pathways, to describe if these differences occur at human population–relevant concentrations, and to specifically test for differences in markers of stemness and cellular plasticity.

Methods:

We treated nontumorigenic mammary epithelial cells, MCF10A, with 21 chemicals at four concentrations (25 nM, 250 nM, 2.5 μM, and 25 μM) for 48 h. We conducted RNA-sequencing for these 408 samples, adapting the plexWell plate-based RNA-sequencing method to analyze differences in gene expression. We calculated gene and biological pathway-specific benchmark concentrations (BMCs) using BMDExpress3, identifying differentially expressed genes and generating the best fit benchmark concentration models for each chemical across all genes. We identified enriched biological processes and pathways for each chemical and tested whether chemical exposures change predicted cell type distributions. We contextualized benchmark concentrations relative to human population biomarker concentrations in NHANES.

Results:

We detected chemical concentration–dependent differences in gene expression for thousands of genes. Enrichment and cell type distribution analyses showed that benchmark concentration responses correlated with differences in breast cancer–related pathways, such as induction of basal-like characteristics for some chemicals, including arsenic, lead, copper, and methyl paraben. Comparison of benchmark data to NHANES chemical biomarker (urine or blood) concentrations indicated an overlap between exposure levels and levels sufficient to cause a gene expression response.

Discussion:

These analyses revealed that many of these 21 chemicals resulted in differences in genes and pathways involved in breast cancer in vitro at human exposure–relevant concentrations. https://doi.org/10.1289/EHP12886

Introduction

An estimated 2.3 million new cases of breast cancer are diagnosed globally, and >685,000 people die from breast cancer each year.1 Breast cancer is a heterogeneous disease: tumors can be grouped into subtypes based on their molecular signature and expression of estrogen receptor (ER), progesterone receptor (PR), and human epidermal growth factor receptor 2 (HER2).2 Prognosis and outcomes vary based on subtype. Triple-negative breast cancers (TNBCs), those which do not express ER, PR, or HER2, have the worst overall survival rates.3 Identifying the factors that promote aggressive breast cancers, such as triple-negative breast cancers, is a high public health priority. In general, high penetrance genetic risk factors are thought to explain only approximately 5%–10% of all breast cancers.4 As such, there is likely a large impact of the environment on aggressive breast cancer risk. There is an urgent need to identify potentially modifiable risk factors, such as chemical exposures, that promote aggressive breast cancers so that we can develop new strategies for prevention.

As aggressive breast cancers form, they acquire the hallmarks of cancer, including sustained proliferative signaling, the activation of invasion pathways, immortality, the evasion of growth suppression, the reactivation of stem cell pathways, and the acquisition of cellular plasticity.5–7 As an example, triple-negative breast cancers are linked to an inflammatory microenvironment and enriched for expression of stem cell–associated pathways.8–12 Triple-negative breast cancers also often present with a basal-like phenotype.13 Experimental data surprisingly suggest that these basal-like breast cancers develop from dysregulated luminal cells, where these cells acquire cell state phenotypic plasticity and shift from a luminal-like to a basal-like expression pattern.14,15 Identifying chemicals that impact the development of aggressive breast cancers through the promotion of these various hallmarks, including altered stemness and cell state phenotypic plasticity pathways, would provide novel insights into the mechanisms by which environmental exposures impact breast cancer risk and outcomes.

There are hundreds or thousands of chemical exposures that could impact breast cancer risk, and a systematic and rapid method for evaluation is essential to identify those which may alter aggressive breast cancer hallmarks at human-relevant concentrations.4 Here, we developed a high-throughput transcriptomic platform for the unbiased assessment of chemical effects in vitro. We quantified the concentration-dependent effects of 21 chemical exposures in the commonly used in vitro breast carcinogenesis model, nontumorigenic MCF10A cells. We prioritized the chemicals because they are commonly detected in chemical biomarker screens, show documented exposure disparities across demographic groups, and have putative links to breast cancer.4,16–22 We had four goals, as follows: first, to assess and report concentration-dependent gene expression differences in an unbiased fashion; second, to test the hypothesis that these gene expression differences will be enriched in established breast cancer hallmark pathways; third, to contextualize the concentrations at which gene expression differences occur relative to biomarker concentrations in a representative sample of the US population; and, fourth, to test the hypothesis that treatment with these chemicals will lead to differences in cell state phenotypes consistent with a luminal-to-basal transition. Overall, we evaluated the effects of putative chemical contributors at human-relevant concentrations on biological pathways associated with breast cancers.

Methods

Cell Culture

The nontumorigenic mammary epithelial cell line, MCF10A, was obtained from American Type Culture Collection (ATCC). Cells were cultured and maintained in Dulbecco’s modified Eagle’s medium/Ham’s F12 50/50 mix (Corning) supplemented with 5% horse serum (Thermo Fisher), 2.5 mg/mL HEPES (Thermo Fisher), 5 μg/mL insulin (Gibco/Thermo Fisher), 96 μg/mL hydrocortisone (StemCell), 100 μg/mL cholera toxin (Sigma-Aldrich), and 20 ng/mL epidermal growth factor (StemCell). The cells were maintained at 37°C in a humidified atmosphere with 5% CO2.

Compound Preparation

The 21 chemicals [sodium (meta) arsenite, cadmium chloride, copper (II) chloride, lead (II) acetate trihydrate, mercury (II) chloride, phenanthrene, p,p′-DDE (dichlorodiphenyldichloroethylene), thiram, 1,4-dichlorobenzene, 2,5-dichlorophenol, methylparaben, propylparaben, perfluorodecanoic acid (PFDA), perfluorononanoic acid (PFNA), perfluorooctanoic acid (PFOA), perfluorooctanesulfonic acid (PFOS), polychlorinated biphenyl 153 (PCB153), polychlorinated biphenyl 187 (PCB187), bisphenol A, bisphenol F, bisphenol S] were purchased from Sigma-Aldrich and Cayman Chemicals (Table S1). The chemicals were weighed and dissolved at a concentration of 5 mg/mL, most in dimethyl sulfoxide (DMSO) and some in sterile water, and stored at −20°C for long-term storage. For dosing, we prepared an intermediate stock concentration of 5 mM for each of the chemicals, which was further diluted into final concentrations for dosing the MCF10A cells: 25 μM, 2.5 μM, 0.25 μM, and 0.025 μM. These final concentrations were prepared fresh in MCF10A media for each experiment. For chemicals dissolved in DMSO, the final DMSO concentration was 0.5%, as it was in the vehicle controls. For chemicals dissolved in water, an equivalent amount of water was included in the vehicle control.
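The dosing arithmetic above can be checked with a short sketch. It shows that a direct 200-fold dilution of the 5 mM intermediate stock gives 25 μM with 0.5% solvent, which matches the reported DMSO concentration. The text does not say how 0.5% DMSO was maintained at the three lower concentrations; the "200x working stock" scheme below is an assumption for illustration only, as are all function names.

```python
# Sketch of the dosing arithmetic; the 200x working-stock scheme is an assumption,
# not a step described in the Methods.
STOCK_UM = 5000.0                     # 5 mM intermediate stock, in micromolar
FINAL_UM = [25.0, 2.5, 0.25, 0.025]   # final dosing concentrations from the text
TARGET_VEHICLE_PERCENT = 0.5          # final DMSO concentration from the text


def dilution_factor(stock_um, final_um):
    """Fold dilution needed to go from a stock to a final concentration."""
    return stock_um / final_um


def vehicle_percent(fold):
    """Percent (v/v) of solvent carried into the medium by a given fold dilution."""
    return 100.0 / fold


def working_stock_um(final_um, target_percent=TARGET_VEHICLE_PERCENT):
    """Working-stock concentration that yields final_um at the target solvent percent."""
    return final_um * 100.0 / target_percent


for final in FINAL_UM:
    fold = dilution_factor(STOCK_UM, final)
    print(f"{final} uM: direct dilution is {fold:g}x ({vehicle_percent(fold):g}% solvent); "
          f"a 0.5% scheme would need a {working_stock_um(final):g} uM working stock")
```

Under this assumed scheme, each working stock would be prepared in the same solvent as the chemical and then diluted 200-fold into medium, so every well, including vehicle controls, receives the same solvent fraction.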

Cell Plating

MCF10A cells were cultured and expanded at passage 105 for all rounds of cell culture, and different flasks were used for different plates. Cells were plated in black, clear flat bottom 384-well plates (Corning) using a 2.5- to 125-μL multichannel pipette. First, 10 μL of plain medium was added to prewarmed plates. Five hundred cells in 30 μL of medium were then added to each well at a slow speed to make up a final volume of 40 μL. The cells were left undisturbed to allow them to attach and distribute uniformly over the well. Each microplate was incubated at 37°C in a humidified incubator with 5% CO2 for 24 h prior to dosing. Cells were then exposed to test chemicals for 48 h with four biological replicates per concentration, including four replicates for vehicle controls (DMSO or water). Experiments were run in two batches to include experimental replicates for all chemical–concentration combinations.

plexWell RNA-Sequencing Overview

To conduct high-throughput RNA-sequencing, we developed an adapted version of the plexWell (SeqWell) plate-based sequencing method to make the method applicable for high-throughput in vitro toxicological analysis. In the plexWell protocol, sequencing reactions were prepared in custom 96-well plates, where each well contained a unique oligo barcode. Plates were also assigned a unique oligo barcode. Cells were lysed and cDNA was directly prepared from the cellular lysate. A tagmentation reaction added the well-specific barcode to the cDNA from each well. The tagmented cDNAs were then pooled prior to library preparation, where plate-specific barcodes can also be incorporated for ultrahigh sample multiplexing prior to sequencing. For example, 384 wells of treated samples would be processed across four custom 96-well plates, where each of the samples received a well-specific barcode and each of the four plates received a plate-specific barcode. This led to each of the 384 samples receiving a unique combination of well and plate barcodes, which can be bioinformatically deconvoluted following sequencing. This protocol decreased the cost of library preparation by approximately 90% relative to traditional RNA-sequencing, enabling unbiased high-throughput toxicological analyses.
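The two-level barcoding scheme can be illustrated with a minimal sketch: 96 well barcodes combined with 4 plate barcodes give 384 unique pairs, and each read is assigned to a sample by looking up its pair. Integer barcode IDs stand in for the actual oligo sequences, and the function names and example reads are hypothetical; this is not the demultiplexing software used in the study.

```python
from collections import defaultdict
from itertools import product

N_WELL_BARCODES = 96    # one well-specific barcode per well of a 96-well plate
N_PLATE_BARCODES = 4    # one plate-specific barcode per 96-well plate


def build_sample_index(n_plates=N_PLATE_BARCODES, n_wells=N_WELL_BARCODES):
    """Map every (plate_barcode, well_barcode) pair to a unique sample ID."""
    pairs = product(range(n_plates), range(n_wells))
    return {pair: sample_id for sample_id, pair in enumerate(pairs)}


def demultiplex(reads, sample_index):
    """Group read IDs by sample, given (read_id, plate_barcode, well_barcode) tuples."""
    by_sample = defaultdict(list)
    for read_id, plate_bc, well_bc in reads:
        sample = sample_index.get((plate_bc, well_bc))
        if sample is not None:  # reads with an unknown barcode pair are dropped
            by_sample[sample].append(read_id)
    return dict(by_sample)


sample_index = build_sample_index()
# Hypothetical reads; the last one carries a plate barcode that was never assigned.
reads = [("r1", 0, 0), ("r2", 3, 95), ("r3", 0, 0), ("r4", 9, 1)]
groups = demultiplex(reads, sample_index)
print(len(sample_index), groups)
```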

Our key modification to the manufacturer protocol to streamline the process was to avoid the step of RNA purification prior to cDNA preparation. The plexWell protocol calls for the input of 1 μL of RNA for cDNA preparation. To eliminate the step of RNA extraction, we devised a method of lysing the cells directly in the 384-well microplate. After 48 h of incubation with the test chemicals, the cells were washed twice with 40 μL of 1× Hanks’ balanced salt solution (HBSS). After the second wash, 5 μL of 1× HBSS was left behind to avoid cell loss. We added 10 μL of plexWell lysis buffer (with RNase inhibitor) to each well using a 2.5- to 125-μL multichannel micropipette. The buffer was pipetted up and down 10 times to mix, and the cell lysates were collected in 1 mL Eppendorf tubes on ice to avoid RNA degradation. Cell lysis was confirmed via examination by microscopy.

plexWell cDNA Preparation

The plexWell manufacturer’s protocol was followed to prepare cDNA for subsequent library preparation. In brief, 1 μL of cell lysate was used as an input for cDNA preparation and oligo dT annealing in a PCR 96-well plate (Dot Scientific). The cDNA synthesis reaction was run for 12 cycles on a Bio-Rad CFX96 Touch real-time PCR detection system. An equivalent amount of MAGwise paramagnetic beads was added to the amplified cDNA in each well of the 96-well plate. The cDNA was allowed to bind for 5 min. The plate was placed on a magnet to allow the beads to pellet on the inner walls of each well. Supernatant was discarded, and the beads were washed once using 80% ethanol. The cDNA was eluted using 20 μL of 10 mM Tris solution. Purified cDNA was stored at −20°C for short-term storage.

PicoGreen Assay

As per the plexWell protocol, we quantified the cDNA concentration of a subset of samples using the Quant-iT PicoGreen double stranded DNA (dsDNA) assay kit (Thermo Fisher). Standards ranging from 25 pg/mL to 25 ng/mL were prepared using the Lambda Standard DNA (100 μg/mL) provided in the kit. One microliter of prepared dsDNA was diluted in 99 μL of 1× TE buffer to get a 1:100 dilution. A 200-fold dilution of Quant-iT PicoGreen reagent was prepared in 1× TE buffer as an aqueous working solution at a volume of 100 μL/well. The standards and samples were plated in a flat bottom Corning 96-well plate using a 125- to 1,250-μL multichannel pipette. Quant-iT PicoGreen working solution was added at 100 μL/well using a 125- to 1,250-μL multichannel pipette. The samples were incubated at room temperature for 5 min and protected from light. The fluorescence from the Quant-iT PicoGreen working solution was read on a SpectraMax M5e microplate reader (Molecular Devices). The preset protocol in SoftMaxPro software version 5.4 for the PicoGreen assay for nucleic acids was used for analysis.
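For readers unfamiliar with fluorescence-based quantification, the sketch below shows the general idea of reading a sample concentration off a linear standard curve and correcting for the 1:100 dilution. It is a generic illustration, not the SoftMaxPro preset protocol; the fluorescence readings are invented, and the assumption of a purely linear response across the standard range is ours.

```python
def fit_line(x, y):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def sample_concentration(signal, slope, intercept, dilution=100):
    """Invert the standard curve and undo the sample dilution (result in ng/mL)."""
    return (signal - intercept) / slope * dilution


# Standards span 25 pg/mL to 25 ng/mL as in the text; fluorescence values are hypothetical.
standards_ng_per_ml = [0.025, 0.25, 2.5, 25.0]
fluorescence = [40.0 * c + 10.0 for c in standards_ng_per_ml]

slope, intercept = fit_line(standards_ng_per_ml, fluorescence)
print(sample_concentration(50.0, slope, intercept))  # hypothetical sample reading of 50 units
```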

As per the plexWell protocol, six samples per 96-well plate were analyzed on an Agilent high-sensitivity DNA bioanalyzer (up to 15,000 base pairs) at the University of Michigan Advanced Genomics Core. The analyzed data showed electropherograms with a summary of fragment sizes. The electropherograms for submitted samples were compared to the example electropherograms provided in the plexWell protocol to check for fragment size discrepancies. A pool of 22 random samples was submitted for Qubit analysis at the Advanced Genomics Core per round of experiment to obtain an average concentration. According to the plexWell protocol, the reagents in the kit were formulated to tolerate up to a 10-fold difference in sample input across 96 samples. After fragment distribution and concentration checks, we calculated a global dilution factor using the Qubit concentrations (one per library preparation) as described in the plexWell protocol. This global dilution factor gives the volume for dilution of cDNA prior to tagmentation. Four microliters of prepared cDNA was diluted in the calculated Tris-HCl volume.

Library Preparation

Post dilution, 4 μL of the diluted cDNA per sample was used for library preparation. Each sample was individually barcoded in a hard skirted Sample Barcode (SB) plate provided in the plexWell 384 Rapid Single Cell RNA library prep kit (SeqWell). Each sample was labeled with an i7 index, referred to as a Sample Barcode, using a tagmentation reaction. Post i7 tagging, the 96 samples were pooled to a final volume of 800–860 μL. An equivalent amount of MAGwise paramagnetic beads was added to the pooled SB reactions. The cDNA was allowed to bind to the beads for 5 min. The beads then formed a pellet on the inner wall of the tube as it was placed on a magnetic stand. The pellet was washed two times with 80% ethanol. Forty microliters of 10 mM Tris was used to elute the purified SB reaction pool. Next, the purified SB reaction pool was labeled with an i5 index, referred to as a Pool Barcode (PB), using a tagmentation reaction. Post i5 tagging, an equivalent amount of MAGwise paramagnetic beads was added to the PB reaction. The DNA was allowed to bind to the beads. The beads then formed a pellet on the magnetic stand, and the pellet was washed two times using 80% ethanol. Purified PB product was amplified using a Bio-Rad CFX96 Touch real-time PCR detection system for 12 cycles. Post amplification, the total volume for the reaction was measured, and the DNA was diluted to a total volume of 205 μL with 10 mM Tris. Two hundred microliters of the diluted product was transferred to new 1.5-mL LoBind tubes for purification. Five microliters of the unpurified product was stored as a control. MAGwise paramagnetic beads (0.8 equivalents) were added to the PB product, and DNA was allowed to bind to the beads. The beads were allowed to form a pellet on the inner wall of the tube. The pellet was washed two times using 80% ethanol. Purified library was eluted using 32 μL of 10 mM Tris. Twenty-eight microliters of the purified product was transferred to a new 1.5-mL LoBind tube.
Four libraries were prepared, each having a unique Pool Barcode. Purified libraries were stored at −20°C for short-term storage. Library quality control (QC) was done on the Agilent Bioanalyzer (High-Sensitivity DNA 5,000 kit) at the Advanced Genomics Core.

RNA-Sequencing and Data Processing

Libraries were sequenced on the Illumina NovaSeq 6000 at the University of Michigan Advanced Genomics Core. FASTQ reads were demultiplexed back into individual samples based on the i7 and i5 indices. Sequencing data were transferred to the University of Michigan Great Lakes high-performance computing cluster for analysis. Sequencing read quality was assessed via FastQC and MultiQC. Reads were aligned to a splice junction aware build of the human genome (GRCh38) using STAR. Aligned reads were assigned to genes using featureCounts, where multimapping and multi-overlapping reads were not counted.

Differential Gene Expression Testing

Read count matrices from featureCounts were loaded into edgeR, and samples with fewer than 1,000,000 mapped reads were excluded from downstream analysis. Genes with low expression were excluded from analysis using the edgeR filterByExpr() function with default settings. Normalization factors and dispersion were calculated prior to generating a log2-transformed normalized counts per million (cpm) matrix for downstream analysis. Differential gene expression between each treatment (average of four replicates) and the relevant controls (average of four replicates) was calculated using quasi-likelihood negative binomial generalized log-linear modeling in edgeR. Genes were considered differentially expressed between a treatment and control at a false discovery rate (FDR) adjusted p-value of <0.05.
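The sample-level read filter and the log2 counts-per-million transformation can be sketched as follows. This is a simplified stand-in written in Python, not the edgeR implementation: it omits normalization factors and the filterByExpr() gene filter, and the pseudocount of 1 is an assumption chosen to keep the arithmetic transparent. The sample names and counts are hypothetical.

```python
import math

MIN_MAPPED_READS = 1_000_000  # sample-level threshold stated in the text


def keep_samples(counts_by_sample, min_reads=MIN_MAPPED_READS):
    """Drop samples whose total mapped read count is below the threshold."""
    return {sample: genes for sample, genes in counts_by_sample.items()
            if sum(genes.values()) >= min_reads}


def log2_cpm(gene_counts, pseudocount=1.0):
    """log2(counts per million + pseudocount) for one sample, without normalization factors."""
    library_size = sum(gene_counts.values())
    return {gene: math.log2(count * 1e6 / library_size + pseudocount)
            for gene, count in gene_counts.items()}


# Hypothetical count data for two samples and two genes.
counts = {
    "sample_1": {"GENE_A": 3, "GENE_B": 999_997},   # exactly 1,000,000 reads: kept
    "sample_2": {"GENE_A": 10, "GENE_B": 20},       # far too few reads: excluded
}
kept = keep_samples(counts)
print({sample: log2_cpm(genes) for sample, genes in kept.items()})
```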

Benchmark Concentration Analyses

Gene-specific benchmark concentrations and best fit benchmark concentration models were identified using BMDExpress version 3.2, a free software package developed by the National Institute of Environmental Health Sciences (NIEHS) and available for download (https://github.com/auerbachs/BMDExpress-3/releases). Dose-response modeling for each chemical followed the best practices in the BMDExpress online documentation (https://bmdexpress-2.readthedocs.io/en/feature-readthedocs/basic_workflow/). Normalized counts per million reads were imported into BMDExpress and prefiltered using one-way analysis of variance (ANOVA) for significance at a p-value of <0.05 to identify genes showing significant increasing or decreasing concentration responses. To determine concentration–response relationships, the filtered data were modeled in BMDExpress with Hill, power, linear, polynomial (2°, 3°, 4°), and exponential (3 and 5) models and the best fit model was chosen. The benchmark response (BMR) was set to 1 standard deviation relative to control response, maximum iterations of the model was set to 250, a confidence level of 0.95 was used, and power was restricted to 1 for the power model. Best fit models were chosen for each concentration–response relationship per gene using a nested chi-square test followed by the lowest Akaike information criterion (AIC). Hill models were flagged if their “k” parameter was smaller than one-third of the lowest positive concentration, and flagged Hill models were included in the data. Directionality in response (upregulated/downregulated) was derived from the “Best adverseDirection” of the winning model for a given gene.
We conducted “Defined Category” analyses to additionally assess the concentration–response relationship to the Molecular Signature Database “hallmark” gene sets,23 embryonic stem cell genes,24 breast cancer molecular signatures,25 and breast stem cell genes.26 Benchmark concentrations with values above the highest tested concentration (25 μM) were excluded from individual gene and functional classification analyses. Defined category results were filtered for significance at a Fisher’s exact two-tailed p-value of <0.05 for BMDExpress testing, and results were visualized by heatmap using the R package pheatmap and the heatmap.2 function from the R package gplots. All original expression data, ANOVA filtered results, BMC results, and subsequent BMDExpress analysis for all chemicals are included in a .bm2 file available in supplementary information.
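To make the benchmark definition concrete, the sketch below works through the simplest case, a linear concentration–response model, where the BMC is the concentration at which the fitted response has moved 1 control standard deviation from baseline. It also encodes the Hill "k" flag and the 25 μM exclusion rule described above. This is an illustration of the definitions only, not BMDExpress code: BMDExpress fits several model families and selects among them, and all function names and data here are hypothetical.

```python
from statistics import stdev

HIGHEST_TESTED_UM = 25.0      # BMCs above this were excluded (from the text)
LOWEST_POSITIVE_UM = 0.025    # lowest nonzero concentration tested (from the text)


def fit_slope(conc, resp):
    """Least-squares slope of response versus concentration."""
    n = len(conc)
    mean_c = sum(conc) / n
    mean_r = sum(resp) / n
    sxx = sum((c - mean_c) ** 2 for c in conc)
    sxy = sum((c - mean_c) * (r - mean_r) for c, r in zip(conc, resp))
    return sxy / sxx


def linear_bmc(conc, resp, control_resp):
    """BMC for a linear model with BMR = 1 SD of controls; None if flat or out of range."""
    slope = fit_slope(conc, resp)
    if slope == 0:
        return None
    bmc = stdev(control_resp) / abs(slope)
    return bmc if bmc <= HIGHEST_TESTED_UM else None


def hill_k_flagged(k, lowest_positive=LOWEST_POSITIVE_UM):
    """Flag a Hill fit whose k parameter is below one-third of the lowest positive dose."""
    return k < lowest_positive / 3.0


# Hypothetical expression values for one gene.
print(linear_bmc([0, 1, 2, 3], [1.0, 3.0, 5.0, 7.0], control_resp=[0.0, 2.0]))
```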

For hypergeometric enrichment testing of BMC data, we used the R package HypeR with two user-defined gene sets (one Breast Cancer Screen or “BCScreen” gene set and one gene set from Pal et al.).27–29 The background for enrichment testing included all human genes (n = 23,467). Hypergeometric enrichment results were filtered for significance at an FDR of <0.05.
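The hypergeometric test underlying this enrichment analysis asks how likely it is to see at least the observed overlap between a gene set and a list of responsive genes if the list were drawn at random from the background. A minimal sketch of that upper-tail probability is given below, using the background size from the text. It is not the HypeR implementation and omits the FDR adjustment; the gene set size, list size, and overlap in the example are invented.

```python
from math import comb

BACKGROUND_GENES = 23467  # background size stated in the text


def hypergeom_pvalue(overlap, set_size, n_hits, background=BACKGROUND_GENES):
    """P(X >= overlap) when n_hits genes are drawn from a background containing set_size set members."""
    upper = min(set_size, n_hits)
    favorable = sum(comb(set_size, k) * comb(background - set_size, n_hits - k)
                    for k in range(overlap, upper + 1))
    return favorable / comb(background, n_hits)


# Hypothetical example: a 200-gene set, 500 responsive genes, 15 genes in common.
print(hypergeom_pvalue(overlap=15, set_size=200, n_hits=500))
```

With roughly 4 overlapping genes expected by chance in this example (500 × 200 / 23,467), an overlap of 15 would yield a small p-value; in practice, p-values across all tested gene sets would then be adjusted for multiple testing.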

Comparison of NHANES Exposure Levels to Benchmark Concentrations

We compared bioactive concentrations of the assayed chemicals to human biomarker concentrations measured as part of the National Health and Nutrition Examination Survey (NHANES) following our previously established methodology.20 NHANES is designed to understand the health status of the US population, including a substantial number of chemical biomarker measurements in urine, blood, and serum. For this analysis, we used our curated NHANES dataset, which contains information on chemical biomarker concentrations of the 21 assessed chemicals in up to 57,786 women recruited from 1999–2018.30 Linkages between chemicals used in vitro and their corresponding exposure biomarkers in NHANES are shown in Table S2. Chemical biomarker concentrations were converted to molarity units and then compared to the benchmark concentrations identified through the benchmark concentration (BMC) analyses of the RNA-seq data using boxplots. These comparisons are designed to help us understand whether dysregulated gene expression due to toxicant exposure may be occurring at population-relevant levels. Overlap between NHANES biomarker concentrations and BMCs indicates that chemical biomarker concentrations measured in US women in NHANES fall within the range of estimated benchmark concentrations.
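The unit conversion and overlap comparison can be sketched in a few lines. A mass concentration in μg/L divided by molar mass in g/mol gives μmol/L (μM) directly. The example uses the molar mass of bisphenol A (228.29 g/mol); the biomarker and BMC ranges are invented for illustration, NHANES reporting units vary by analyte and would need to be harmonized first, and the function names are hypothetical.

```python
BPA_MOLAR_MASS = 228.29  # g/mol, bisphenol A


def ug_per_l_to_um(conc_ug_per_l, molar_mass_g_per_mol):
    """Convert a mass concentration in ug/L to micromolar (umol/L)."""
    return conc_ug_per_l / molar_mass_g_per_mol


def ranges_overlap(low_a, high_a, low_b, high_b):
    """True if the closed intervals [low_a, high_a] and [low_b, high_b] intersect."""
    return low_a <= high_b and low_b <= high_a


# Hypothetical biomarker range (ug/L) and hypothetical BMC range (uM).
biomarker_um = [ug_per_l_to_um(x, BPA_MOLAR_MASS) for x in (0.5, 20.0)]
bmc_um = (0.05, 10.0)
print(biomarker_um, ranges_overlap(biomarker_um[0], biomarker_um[1], *bmc_um))
```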

Secondary Analysis of Normal Breast Single Cell Data for Cell Type Composition Prediction

To quantify whether treatments resulted in differences in the cellular composition of our samples, we performed bioinformatic deconvolution of our bulk RNA-seq data based on a single cell atlas of the normal human mammary gland,28 where 10 normal mammary gland samples were processed for single cell analysis using the 10x Genomics Chromium platform. Sample-specific count matrices of the single cell RNA-seq data were downloaded from the Gene Expression Omnibus and processed via Seurat (version 4.1). Briefly, we excluded cells that expressed fewer than 200 or more than 5,000 genes, along with cells with >15% mitochondrial reads. Data were log-normalized and scaled and then clustered with 0.02 resolution, following Pal et al.’s methods, to identify four cell clusters.28 Clusters were then assigned to “luminal progenitor” (CD49f+EpCAM+), “mature luminal” (CD49f−EpCAM+), and “myoepithelial” (CD49f+EpCAMlo/−) identities based on marker gene expression assigned by Pal et al.28 Gene expression signatures for each of the cell types were calculated using the FindAllMarkers() function in Seurat.
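The cell-level quality control thresholds above translate into a simple filter. The sketch below expresses them in plain Python rather than Seurat; the per-cell summaries and field names are hypothetical.

```python
MIN_GENES = 200         # cells expressing fewer genes were excluded
MAX_GENES = 5000        # cells expressing more genes were excluded
MAX_PCT_MITO = 15.0     # cells with >15% mitochondrial reads were excluded


def passes_qc(n_genes, pct_mito):
    """Apply the gene-count and mitochondrial-read thresholds to one cell."""
    return MIN_GENES <= n_genes <= MAX_GENES and pct_mito <= MAX_PCT_MITO


# Hypothetical per-cell summaries: (cell ID, genes detected, percent mitochondrial reads).
cells = [
    ("cell_1", 150, 2.0),     # too few genes
    ("cell_2", 2500, 4.5),    # passes
    ("cell_3", 6200, 3.0),    # too many genes
    ("cell_4", 1800, 22.0),   # too many mitochondrial reads
]
kept_cells = [cell_id for cell_id, n_genes, pct_mito in cells if passes_qc(n_genes, pct_mito)]
print(kept_cells)
```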

We then used these single cell gene expression data to estimate cell type proportions in our data using the multisubject single-cell (MuSiC) deconvolution method version 1.0.0 in R.31 MuSiC predicts cell type proportions in bulk RNA-seq data based on annotated single cell RNA expression data from multiple donors. MuSiC weights genes that show cross-subject and cross-cell type consistency in the single cell RNA-seq data and then transfers cell type-specific gene expression information from single cell RNA-seq datasets to bulk RNA-seq datasets. Cell type proportions in bulk RNA-seq data are predicted using a nonnegative least squares regression-based approach with these weighted gene expression signatures, with the constraints that cell type proportions cannot be negative and must sum to 1. Using MuSiC, we estimated the cell type proportions in each bulk RNA-seq sample to test the hypothesis that chemical treatments may lead to cellular plasticity or differences in cell type marker expression. We compared two MuSiC estimates for our data: one where proportions of “mature luminal,” “luminal progenitor,” and “myoepithelial” cells were estimated and one where only “luminal progenitor” and “myoepithelial” cells were estimated. Based on model fit statistics (R²), the model prediction that included only “luminal progenitor” and “myoepithelial” cells was better (median R²=0.29 vs. median R²=0.22 when including the three cell populations). The decreased model performance when including “mature luminal” cells also aligns with biological expectations of the MCF10A cell line, which, as an immortalized line, is unlikely to contain a significant proportion of mature luminal cells when grown in two dimensions (2D). Thus, we moved forward with predicting “luminal progenitor” and “myoepithelial” proportions in our bulk RNA-seq data.
Differences in predicted cell type proportions for each chemical by concentration were visualized using ggboxplot() in R, with differences between each concentration and control for a given chemical tested using a t-test via the stat_compare_means() function from the ggpubr package.
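The constrained regression at the core of the deconvolution can be illustrated for the two-cell-type case used here. With proportions p and 1 − p, the least-squares problem has a closed-form solution that is then clipped to the interval [0, 1]. The sketch below is an unweighted simplification: it leaves out the gene weighting that distinguishes MuSiC, and the expression signatures are invented.

```python
def luminal_progenitor_fraction(bulk, luminal_sig, myo_sig):
    """Least-squares p for bulk ~ p * luminal_sig + (1 - p) * myo_sig, with 0 <= p <= 1."""
    diff = [a - b for a, b in zip(luminal_sig, myo_sig)]
    denom = sum(d * d for d in diff)
    if denom == 0:
        raise ValueError("signatures are identical; proportions are not identifiable")
    numer = sum((y - b) * d for y, b, d in zip(bulk, myo_sig, diff))
    return min(1.0, max(0.0, numer / denom))


# Hypothetical signatures over three genes, and a bulk sample mixed at 30% / 70%.
luminal_sig = [10.0, 0.0, 5.0]
myo_sig = [0.0, 8.0, 5.0]
bulk = [0.3 * a + 0.7 * b for a, b in zip(luminal_sig, myo_sig)]

p = luminal_progenitor_fraction(bulk, luminal_sig, myo_sig)
print(p, 1.0 - p)
```

Because the problem is one-dimensional and convex, clipping the unconstrained solution to [0, 1] gives the constrained optimum; with three or more cell types a general nonnegative least squares solver would be needed instead.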

Data Availability

The sequencing data have been deposited at the Gene Expression Omnibus (accession number GSE220051).

Results

To understand the potential impacts of our chemicals of interest on dysregulated breast cancer biology, we treated nontumorigenic MCF10A cells with chemicals that represent exposures from metals, pesticides, personal care products, and other industrial and environmental sources, including per- and polyfluoroalkyl substances (see Methods and Table S1 for a list of the chemicals used). Although the chemicals differ in cytotoxicity, we treated with the same four concentrations (25 nM, 250 nM, 2.5 μM, 25 μM) of each chemical for 48 h for consistency and to obtain a broad understanding of potential bioactivity. Genes were considered differentially expressed between a treatment and control at a false discovery rate (FDR) adjusted p-value of <0.05. Several chemicals, particularly arsenic, cadmium, thiram, and DDE, showed distinct gene expression responses across concentrations, suggestive of chemical-specific transcriptional effects, even at high concentrations (Figures S1 and S2). Fold change and significance of differential gene expression varied substantially by chemical and by concentration (Figure 1 and Table 1; Figure S2 and Table S3). For example, p,p′-DDE showed over 700 differentially expressed genes (DEGs) at all four concentrations, whereas mercury had more than 10 DEGs only at the highest concentration of 25 μM. Arsenic and cadmium both had DEGs at all four concentrations, with more than 8,000 and 4,000 DEGs at 25 μM, respectively. As an example of the differences in effect of each chemical at a given concentration, volcano plots showing the results of differential expression analysis comparing the 2.5 μM concentration to control are shown in Figure 1, with the comparisons for the other concentrations compared to control included in Figure S2. Sequencing data can be found at the Gene Expression Omnibus (accession number GSE220051).

Figure 1.

[Figure 1: twenty-one volcano plots, one per chemical (arsenic, cadmium, copper, lead, mercury, phenanthrene, p,p′-DDE, thiram, 1,4-dichlorobenzene, 2,5-dichlorophenol, methylparaben, propylparaben, PFDA, PFNA, PFOA, PFOS, PCB153, PCB187, bisphenol A, bisphenol F, and bisphenol S), each comparing 2.5 μM versus control and plotting −log10(false discovery rate) on the y-axis against log(fold change) on the x-axis.]

Differential gene expression for 21 chemicals under assessment at 2.5μM. Differential gene expression between each chemical concentration and controls was calculated using quasi-likelihood negative binomial generalized log-linear modeling in edgeR. Red dotted lines mark a false discovery rate (FDR) adjusted p-value of 0.05. Cells were exposed to each chemical for 48 h with four biological replicates per concentration. Table S3 contains raw differential gene expression data across all chemicals and concentrations. Note: BPA, bisphenol A; BPF, bisphenol F; BPS, bisphenol S; DCP, dichlorophenol; DCB, dichlorobenzene; DDE, dichlorodiphenyldichloroethylene; FC, fold change; MPB, methylparaben; PCB, polychlorinated biphenyl; PFDA, perfluorodecanoic acid; PFNA, perfluorononanoic acid; PFOA, perfluorooctanoic acid; PFOS, perfluorooctanesulfonic acid; PPB, propylparaben.

Table 1.

Differentially expressed genes (DEGs) (FDR <0.05 relative to control) by concentration for each chemical.

Chemical      25 nM vs. control   250 nM vs. control  2.5 μM vs. control  25 μM vs. control   Total unique DEGs
              Down      Up        Down      Up        Down      Up        Down      Up
Arsenic       8         13        41        34        137       211       3,604     4,701     8,367
Cadmium       1         4         0         7         0         67        2,539     2,087     4,627
Copper        9         17        72        53        246       182       288       467       918
Lead          0         15        29        48        78        85        235       343       604
Mercury       0         0         0         1         0         4         6         10        19
Phenanthrene  48        2         37        0         0         1         1         0         633
p,p′-DDE      467       312       852       572       645       562       960       749       2,311
Thiram        0         1         3         5         0         32        2,521     2,549     5,071
1,4-DCB       0         0         0         0         1         0         1         0         1
2,5-DCP       0         0         0         0         0         1         0         2         3
MPB           8         12        30        39        32        24        111       60        239
PPB           5         21        5         23        3         10        1         0         52
PFDA          24        0         10        12        12        0         36        1         72
PFNA          0         1         5         2         83        3         102       8         159
PFOA          11        4         15        5         17        6         10        5         35
PFOS          4         1         5         2         7         3         27        4         45
PCB153        1         0         1         50        0         0         1         0         52
PCB187        1         1         0         0         0         0         4         1         6
BPA           52        13        20        20        98        16        12        1         152
BPF           6         15        12        9         0         22        5         2         55
BPS           20        16        3         10        2         3         5         4         44
Total                                                                                         12,609

Note: —, no data; BPA, bisphenol A; BPF, bisphenol F; BPS, bisphenol S; DCP, dichlorophenol; DCB, dichlorobenzene; DDE, dichlorodiphenyldichloroethylene; FC, fold change; MPB, methylparaben; PCB, polychlorinated biphenyl; PFDA, perfluorodecanoic acid; PFNA, perfluorononanoic acid; PFOA, perfluorooctanoic acid; PFOS, perfluorooctanesulfonic acid; PPB, propylparaben.
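Table 1 counts genes that pass an FDR-adjusted p-value threshold of 0.05. As a rough illustration of that thresholding step only, the Python sketch below applies a Benjamini-Hochberg adjustment to a list of p-values and tallies down- and upregulated genes. The study itself used edgeR; Benjamini-Hochberg is assumed here as the FDR procedure (the text says only "FDR adjusted"), and the gene names, fold changes, and p-values are invented.

```python
def bh_adjust(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


def count_degs(results, fdr_cutoff=0.05):
    """results: list of (gene, log2_fold_change, p_value) tuples.
    Returns (number down, number up) among genes passing the cutoff."""
    adj = bh_adjust([p for _, _, p in results])
    down = sum(1 for (_, lfc, _), q in zip(results, adj)
               if q < fdr_cutoff and lfc < 0)
    up = sum(1 for (_, lfc, _), q in zip(results, adj)
             if q < fdr_cutoff and lfc > 0)
    return down, up


# Hypothetical values for illustration only (not data from Table S3).
example = [
    ("GENE_A", 1.8, 0.0002),
    ("GENE_B", -2.1, 0.004),
    ("GENE_C", 0.3, 0.03),
    ("GENE_D", -0.1, 0.60),
]
down, up = count_degs(example)
```

Counting "Down" and "Up" separately per concentration, as in this sketch, mirrors the column layout of Table 1; the "Total unique DEGs" column would additionally require taking the union of gene names across concentrations.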

Genes related to breast cancer were commonly differentially expressed by this panel of chemicals at various concentrations. Table S4 lists genes differentially expressed across the 21 chemicals assessed, sorted by the number of chemicals for which expression differed at one or more of the four concentrations tested. Of note, serum amyloid A1 and A2 (SAA1 and SAA2) showed expression differences compared to controls for 14 and 15 chemical exposures, respectively. The cytokines S100A8 and S100A9 were also differentially expressed after treatment with 16 and 11 chemicals, respectively. Differences in G protein alpha subunit (GNAS) expression were observed across 13 of the 21 chemicals. Additionally, expression of the aldo-keto reductases AKR1C1, AKR1C2, and AKR1C3 differed across 15, 12, and 9 chemicals, respectively, and expression of the keratins KRT14, KRT15, and KRT17 differed across 5, 6, and 11 chemicals, respectively. Serine protease inhibitor, clade E member 1 (SERPINE1) also showed differences in expression across nine chemicals.

To characterize the concentration–response relationship of each exposure on gene expression, we performed benchmark concentration modeling to define the range of benchmark concentrations for the different chemicals (Figure 2). Best benchmark concentrations were estimated from a set of concentration–response models for each gene for each chemical (Figure 2A). The total number of genes filtered by ANOVA and fit to benchmark models differed by chemical, with some chemicals (arsenic, cadmium, and thiram) inducing concentration–responsive differences in over 1,000 different genes. Concentration–response fit was assessed by examining the best fit concentration–response curves for individual gene/chemical combinations. Two example gene BMC curves are displayed in Figure 2B,C: the expression of ARC across a range of arsenic concentrations, for which the best fit model was linear (Figure 2B), and the expression of HSPA6 across a range of thiram concentrations, for which the best fit model was a Hill model (Figure 2C). All benchmark concentration response data for each dysregulated gene for every chemical tested are included in Table S5 and in the supplementary bm2 file. Of the BMC modeled data, 14 genes were dysregulated by 10 or more chemicals and 1,239 were dysregulated by 5 or more chemicals; genes dysregulated by multiple chemicals often showed the same directionality of response (Table S5).
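To make the idea of a model-derived BMC concrete, the following Python sketch computes the concentration at which a fitted curve departs from baseline by a benchmark response (BMR). It is an illustration under stated assumptions, not BMDExpress's fitting procedure: the BMR is assumed to be a fixed multiple of the control standard deviation (the factor 1.349 is an assumption, not a setting reported in this section), the linear and Hill forms are simplified, and the expression values are invented.

```python
import statistics


def fit_line(concs, responses):
    """Ordinary least-squares fit of response = intercept + slope * conc."""
    mean_x = statistics.fmean(concs)
    mean_y = statistics.fmean(responses)
    sxx = sum((x - mean_x) ** 2 for x in concs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(concs, responses))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def bmc_linear(slope, bmr):
    """Concentration at which a linear model departs from baseline by bmr."""
    if slope == 0:
        return None
    return bmr / abs(slope)


def bmc_hill(v_max, k, n, bmr):
    """Solve v_max * c**n / (k**n + c**n) = bmr for c (magnitudes only).
    Returns None when the curve never reaches the benchmark response."""
    if bmr >= abs(v_max):
        return None
    return k * (bmr / (abs(v_max) - bmr)) ** (1.0 / n)


# Hypothetical log2 expression values for one gene (illustration only):
# four control replicates followed by one value per treated concentration.
concs = [0, 0, 0, 0, 0.025, 0.25, 2.5, 25]
log2_expr = [5.0, 5.2, 4.8, 5.0, 5.0, 5.1, 5.4, 8.1]
control_sd = statistics.stdev(log2_expr[:4])
bmr = 1.349 * control_sd  # assumed BMR definition
intercept, slope = fit_line(concs, log2_expr)
bmc = bmc_linear(slope, bmr)
```

In practice, lower and upper confidence bounds (BMCL and BMCU) come from the uncertainty in the fitted parameters, which this sketch does not attempt to estimate.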

Figure 2.

Figure 2A is a line graph plotting number of genes impacted (y-axis, log scale from 1 to 10,000) against median benchmark concentration in micromolar (x-axis, 1e−10 to 1e+02), with one line per chemical. Figures 2B and 2C are line graphs titled Best Fit Model: Linear and Best Fit Model: Hill, plotting log2 ARC expression against arsenic concentration in micromolar (B) and log2 HSPA6 expression against thiram concentration in micromolar (C), with the benchmark concentration and its lower and upper bounds marked on the x-axis; the legends show input data and model (B) and input data, model, NOTEL, and LOTEL (C).

Benchmark concentration accumulation plots and individual benchmark concentration modeling graphs. Benchmark concentration modeling was performed using BMDExpress software. In the accumulation plot of median best benchmark concentrations (A), each plotted point represents a differentially expressed gene and each colored line represents an individual chemical. On the right, example individual best fit benchmark concentration graphs are shown for the gene ARC in arsenic-treated cells (B) and for the gene HSPA6 in thiram-treated cells (C). For B and C, experimental data are shown in red and the best fit model in blue; the lower bound of the 95% confidence interval of the benchmark concentration (BMCL), the benchmark concentration (BMC), and the upper bound of the 95% confidence interval of the benchmark concentration (BMCU) are labeled on each graph. The supplemental bm2 file contains all benchmark concentration modeling data. Note: BPA, bisphenol A; BPF, bisphenol F; BPS, bisphenol S; DCP, dichlorophenol; DCB, dichlorobenzene; DDE, dichlorodiphenyldichloroethylene; FC, fold change; MPB, methylparaben; PCB, polychlorinated biphenyl; PFDA, perfluorodecanoic acid; PFNA, perfluorononanoic acid; PFOA, perfluorooctanoic acid; PFOS, perfluorooctanesulfonic acid; PPB, propylparaben.

We next wanted to contextualize the concentrations required in vitro to dysregulate gene expression with biomarker levels detectable in women in the United States. We compared benchmark concentration results from our in vitro experiments to National Health and Nutrition Examination Survey (NHANES) biomarker concentration data for each chemical, plotting the range of blood or urine chemical biomarker concentrations in US women compared to the range of BMCs for each chemical from our in vitro data (Figure 3). For all chemicals, there were at least a few genes predicted to change within the range of human population exposures. For highly exposed individuals in NHANES, biomarker concentrations for some chemicals (arsenic, copper, lead, p,p′-DDE, thiram, 2,5-dichlorophenol, methyl paraben, and propyl paraben) exceeded the median predicted benchmark concentrations derived from the RNA-seq data.

Figure 3.

Figure 3 is a box and whiskers plot with the 21 chemicals on the y-axis and concentration in micromolar on the x-axis (log scale, 1e−11 to 1e−01), showing two concentration types for each chemical: benchmark concentration and National Health and Nutrition Examination Survey biomarker concentration.

Comparison of benchmark concentration modeling results with National Health and Nutrition Examination Survey (NHANES) exposure biomarker levels. Median benchmark concentration (BMC) for each differentially expressed gene (red/textured) and NHANES chemical biomarker concentration data, converted to molarity units, for female participants (blue/solid) are plotted for each chemical. Each box represents the first quartile, median, and third quartile; the whiskers represent 1.5×interquartile range; and outliers are plotted as black dots. Table S6 contains all numerical values for the box and whiskers plots. Note: BPA, bisphenol A; BPF, bisphenol F; BPS, bisphenol S; DCP, dichlorophenol; DCB, dichlorobenzene; DDE, dichlorodiphenyldichloroethylene; FC, fold change; MPB, methylparaben; PCB, polychlorinated biphenyl; PFDA, perfluorodecanoic acid; PFNA, perfluorononanoic acid; PFOA, perfluorooctanoic acid; PFOS, perfluorooctanesulfonic acid; PPB, propylparaben.
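The caption notes that NHANES biomarker concentrations were converted to molarity before comparison with BMCs. A minimal Python sketch of such a conversion follows, assuming a biomarker reported in μg/L (or in μg/dL, as blood lead commonly is) and a known molar mass. The function names and the example blood lead level are hypothetical, and the molar mass of lead (207.2 g/mol) is a standard value rather than something taken from this paper.

```python
def ug_per_l_to_um(conc_ug_per_l, molar_mass_g_per_mol):
    """Convert ug/L to umol/L (uM): (ug/L) / (g/mol) = umol/L."""
    return conc_ug_per_l / molar_mass_g_per_mol


def ug_per_dl_to_um(conc_ug_per_dl, molar_mass_g_per_mol):
    """Convert ug/dL to uM; 1 dL = 0.1 L, so multiply by 10 first."""
    return ug_per_l_to_um(conc_ug_per_dl * 10.0, molar_mass_g_per_mol)


def exceeds_bmc(biomarker_um, median_bmc_um):
    """True when a biomarker level is at or above a median BMC."""
    return biomarker_um >= median_bmc_um


LEAD_MOLAR_MASS = 207.2  # g/mol

# Hypothetical blood lead level of 2.0 ug/dL (not an NHANES value).
blood_lead_um = ug_per_dl_to_um(2.0, LEAD_MOLAR_MASS)
```

Once both quantities are in μM, a biomarker level can be compared directly with a median BMC, for example with `exceeds_bmc(blood_lead_um, some_median_bmc)`, where `some_median_bmc` stands for a value taken from the modeling output.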

We next evaluated differences in biological processes after chemical exposure and characterized the concentrations at which these processes may be dysregulated. BMC modeled genes were assessed using MSigDB’s hallmark gene sets, which comprise 50 biological processes. The supplemental bm2 file contains benchmark concentration gene data for every hallmark category for each chemical, organized under “functional classifications.” Figure 4A displays an accumulation plot of the median BMC at which hallmark gene sets were impacted (at least one gene in that gene set with effects significantly different from control). The accumulation plot shows low concentration effects for many of the hallmarks. This accumulation plot of pathway effects (Figure 4A) differs noticeably from the accumulation plot of gene expression results (Figure 2A), highlighting potential patterns in gene expression results. Figure 4B shows a heatmap of significantly different hallmark processes: 20 of the chemicals showed significant enrichment or downregulation for 1 or more of the hallmark gene sets (Figure 4B is filtered for significant enrichment, whereas Figure 4A is not filtered for significance). Notably, many of these processes are related to carcinogenesis, including the epithelial mesenchymal transition and reactive oxygen species pathway. Compared to controls, genes related to epithelial to mesenchymal transition were upregulated by bisphenol A (BPA), bisphenol F (BPF), bisphenol S (BPS), copper, lead, methyl paraben, PFOA, and thiram at median BMCs of 2.56μM, 3.21μM, 5.23μM, 0.32μM, 0.38μM, 0.64μM, 1.54μM, and 7.15μM, respectively. Genes related to reactive oxygen species were upregulated by BPF, lead, and mercury at median BMCs of 11.19μM, 0.24μM, and 1.81μM, respectively.
Copper, p,p′-DDE, lead, PFDA, PFNA, and phenanthrene upregulated cell cycle–related genes, including E2F targets and G2M checkpoint (median BMCs of 2.11μM, 0.02μM, 1.12μM, 0.28μM, 1.54μM, and 2.09μM for E2F target genes, respectively, and 3.21μM, 0.02μM, 1.09μM, 0.30μM, 1.65μM, and 2.25μM for G2M checkpoint genes, respectively). E2F targets were downregulated by arsenic, cadmium, and thiram (median BMCs of 5.00μM, 8.38μM, and 7.16μM, respectively), and G2M checkpoint genes were downregulated by cadmium and thiram at median BMCs of 8.22μM and 8.30μM, respectively, highlighting the potential for distinct mechanisms of promoting carcinogenesis depending on the chemical involved (BMC data for hallmark processes in Table S7).
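The accumulation plot in Figure 4A orders gene sets by their median BMC. The short Python sketch below shows one way such a curve can be tabulated, assuming per-gene BMCs and gene-set membership are already available. Gene names, set names, and BMC values are invented, and this is not BMDExpress's category analysis.

```python
from statistics import median


def median_bmc_by_set(gene_bmcs, gene_sets):
    """Median BMC of the modeled genes in each gene set.
    Sets with no modeled genes are skipped."""
    medians = {}
    for name, genes in gene_sets.items():
        values = [gene_bmcs[g] for g in genes if g in gene_bmcs]
        if values:
            medians[name] = median(values)
    return medians


def accumulation(medians):
    """(median BMC, cumulative number of sets impacted) pairs, sorted by BMC."""
    ordered = sorted(medians.values())
    return [(bmc, i + 1) for i, bmc in enumerate(ordered)]


# Hypothetical per-gene BMCs in uM and gene-set membership.
gene_bmcs = {"G1": 0.2, "G2": 0.6, "G3": 3.0, "G4": 8.0}
gene_sets = {
    "SET_X": ["G1", "G2", "G3"],
    "SET_Y": ["G3", "G4"],
    "SET_Z": ["G9"],  # no modeled genes, so it is dropped
}
medians = median_bmc_by_set(gene_bmcs, gene_sets)
curve = accumulation(medians)
```

Because a set counts as impacted when even one of its genes has a fitted BMC, a curve built this way can look quite different from the per-gene accumulation plot, which is consistent with the contrast the text draws between Figures 4A and 2A.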

Figure 4.

Figure 4A is a line graph plotting number of hallmarks impacted (y-axis, log scale from 1 to 30) against median benchmark concentration in micromolar (x-axis, 1e−04 to 1e+00), with one line per chemical. Figure 4B is a heatmap of the MSigDB hallmark gene sets (y-axis) by chemical (x-axis), colored as up, conflict, or down. Figure 4C is a heatmap of breast cancer and stemness gene sets (Wong embryonic stem cell core; Smid breast cancer relapse in liver, brain, and bone up; Smid breast cancer normal-like, luminal B, luminal A, ERBB2, and basal up; Lim mammary stem cell up; Lim mammary luminal mature up) on the y-axis by chemical (x-axis), colored as up, conflict, or down.

Enrichment analyses. All of the Molecular Signatures Database’s (MSigDB’s) “hallmark” biological processes showing differences in expression of at least one gene for each chemical, plotted by median benchmark concentration (BMC), with each median BMC summarizing the BMC modeling results for the gene(s) affected in that hallmark process (A). Twenty chemicals showing significant enrichment or downregulation for 1 or more of 50 biological processes defined by hallmark gene sets (B). Nineteen chemicals showing significant differences in breast cancer–specific or stemness-specific gene pathways as defined by some of MSigDB’s “Chemical and Genetic Perturbations” sets (C). For panels B and C, “conflict” indicates gene expression differences correlated with both up- and downregulation of the pathways of interest. BMDExpress-defined category analysis was used for enrichment analysis. Table S7 contains the enrichment analysis results for Figures 4B,C. Note: BPA, bisphenol A; BPF, bisphenol F; BPS, bisphenol S; DCP, dichlorophenol; DCB, dichlorobenzene; DDE, dichlorodiphenyldichloroethylene; FC, fold change; MPB, methylparaben; PCB, polychlorinated biphenyl; PFDA, perfluorodecanoic acid; PFNA, perfluorononanoic acid; PFOA, perfluorooctanoic acid; PFOS, perfluorooctanesulfonic acid; PPB, propylparaben.

To assess specific pathways related to breast cancer, we performed additional enrichment analyses on breast cancer–associated biological pathways in MSigDB’s Chemical and Genetic Perturbations (CGP) gene sets (Figure 4C), using gene sets from three studies related to breast cancer and stemness.25–27 Nineteen chemicals differentially regulated these gene sets at various concentrations. Genes upregulated in embryonic stem cells were upregulated by 1,4-dichlorobenzene, p,p′-DDE, PFDA, PFNA, and PFOS (median BMCs of 7.75μM, 0.02μM, 1.43μM, 1.64μM, and 1.82μM, respectively). Additionally, genes upregulated in basal subtypes of breast cancer were upregulated by BPF, copper, and 1,4-dichlorobenzene (median BMCs of 3.18μM, 1.38μM, and 2.50μM, respectively). BPA, lead, and PCB187 also upregulated genes that are upregulated in normal-like breast cancers (median BMCs of 10.79μM, 1.34μM, and 8.35μM, respectively). BMC data for CGP processes are in Table S7.

To assess potential breast cancer–specific gene expression differences regulated by our chemicals of interest, we tested for gene enrichment with a publicly available breast carcinogenesis gene panel, BCScreen.29 The overlapping genes within different categories of breast carcinogenesis are shown in Table S8. The gene panel included 500 genes in total, divided into 14 categories, with 33 genes per category other than the “mammary” category, which includes 71 genes. Cells exposed to nine of the chemicals (arsenic, cadmium, copper, p,p′-DDE, thiram, 2,5-dichlorophenol, PFDA, PFNA, and BPF) showed differential expression of at least three panel genes in one or more categories, with the cell cycle, genotoxicity, and mammary categories showing the largest numbers of dysregulated genes.
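The panel arithmetic and the "at least three genes in one or more categories" criterion can be checked with a short Python sketch. The category contents and the DEG list below are placeholders, not BCScreen data; only the category counts (13 categories of 33 genes plus 71 "mammary" genes) come from the text.

```python
# Panel size as described in the text: 14 categories, 13 of them with
# 33 genes each and the "mammary" category with 71 genes.
panel_total = 13 * 33 + 71


def categories_hit(degs, panel, min_genes=3):
    """Return {category: overlapping genes} for categories in which at
    least min_genes differentially expressed genes appear."""
    deg_set = set(degs)
    hits = {}
    for category, genes in panel.items():
        overlap = sorted(deg_set & set(genes))
        if len(overlap) >= min_genes:
            hits[category] = overlap
    return hits


# Placeholder panel and DEG list (not real BCScreen content).
panel = {
    "cell cycle": ["A1", "A2", "A3", "A4"],
    "genotoxicity": ["B1", "B2", "B3"],
    "mammary": ["C1", "C2", "C3", "C4", "C5"],
}
degs = ["A1", "A3", "A4", "B2", "C1", "C5", "Z9"]
hits = categories_hit(degs, panel)
```

In this toy input only the "cell cycle" category reaches the three-gene threshold; a chemical would count toward the nine reported in the text if `hits` were non-empty for its DEG list.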

We next tested the hypothesis that exposure to the chemicals under assessment results in differences consistent with a luminal-to-basal transition, a process likely important in the development of basal breast cancers. We performed cell type deconvolution based on a normal breast single cell RNA-seq reference28 to predict the proportions of myoepithelial and luminal progenitor cells in each of our exposures (Figure 5; Figures S3 and S4). The initial proportion of cell types varied between controls because the cell population of MCF10As is heterogeneous. Only cells exposed to PFNA showed significantly lower predicted myoepithelial cell proportions, at the highest concentration of 25μM (Figure S3). Cells exposed to arsenic and lead showed a concentration-dependent, significantly higher proportion of myoepithelial cells and a corresponding lower proportion of luminal progenitor cells across the three highest concentrations compared to control; those exposed to copper showed a significantly higher proportion of myoepithelial cells across all concentrations, and those exposed to methyl paraben showed a significantly higher myoepithelial cell proportion at all concentrations other than 2.5μM (Figure 5). To assess differences in the proliferation profiles of chemically treated cells, we examined expression of three cell cycle–associated genes (PCNA, MKI67, and MCM2) (Table S10). PCNA was upregulated by four chemical treatments (arsenic, copper, p,p′-DDE, and lead) and downregulated by one (cadmium); MKI67 was upregulated by four (arsenic, copper, p,p′-DDE, and PFDA) and downregulated by one (cadmium); and MCM2 was upregulated by one (p,p′-DDE) and downregulated by three (arsenic, PFOS, and thiram).

Figure 5.

Figure 5 is a set of four box and whiskers plots titled Arsenic, Lead, Copper, and Methylparaben, plotting proportion of myoepithelial cells (y-axis, 0.00 to 1.00) against concentration in micromolar (x-axis: 0, 0.025, 0.25, 2.5, and 25).

Multisubject single-cell (MuSiC) deconvolution results. Predicted proportions of myoepithelial cells in MCF10A cultures treated with four concentrations of arsenic, lead, copper, and methyl paraben, respectively. Each box represents the first quartile, median, and third quartile; the whiskers represent 1.5×interquartile range; and outliers are plotted as black dots. Table S9 contains the values for the box and whiskers plots. Note: MPB, methylparaben.
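Deconvolution of this kind estimates cell type proportions by expressing a bulk expression profile as a mixture of reference cell type profiles. The Python sketch below is a deliberately simplified two-cell-type, unweighted least-squares version of that idea. MuSiC itself uses a weighted non-negative least squares approach informed by multiple subjects, which is not reproduced here, and the marker gene values are hypothetical.

```python
def estimate_myoepithelial_fraction(bulk, myo_ref, lum_ref):
    """Least-squares estimate of p in
        bulk ~ p * myo_ref + (1 - p) * lum_ref,
    clipped to the interval [0, 1]."""
    diff = [m - l for m, l in zip(myo_ref, lum_ref)]
    denom = sum(d * d for d in diff)
    if denom == 0:
        raise ValueError("reference profiles are identical")
    numer = sum((b - l) * d for b, l, d in zip(bulk, lum_ref, diff))
    return min(1.0, max(0.0, numer / denom))


# Hypothetical mean expression of four marker genes in each reference type.
myo_ref = [10.0, 8.0, 1.0, 0.0]
lum_ref = [0.0, 2.0, 9.0, 6.0]

# A synthetic bulk sample mixed at 25% myoepithelial, 75% luminal progenitor.
bulk = [0.25 * m + 0.75 * l for m, l in zip(myo_ref, lum_ref)]
p_myo = estimate_myoepithelial_fraction(bulk, myo_ref, lum_ref)
```

With only two cell types the luminal progenitor fraction is simply `1 - p_myo`, which is why a higher predicted myoepithelial proportion is accompanied by a correspondingly lower luminal progenitor proportion.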

Discussion

Across all 21 chemicals, which represent exposures commonly affecting women in the United States,18 a total of 12,609 genes showed expression differences in exposed breast cells compared to controls. While many of these differences were found at the highest concentration, lower concentration effects were also seen, with BMCs falling within or near the range of NHANES biomarker levels for all chemicals. This suggests that the gene expression effects seen in this study may be biologically relevant at human exposure levels, and so our analysis may link chemical exposures to the incidence and characteristics of breast cancer, although further study beyond our model is needed to establish human relevance. Interestingly, of the chosen set of chemicals, only ten (arsenic, cadmium, lead, p,p′-DDE, 1,4-dichlorobenzene, 2,5-dichlorophenol, PFOA, PFOS, PCB187, and PCB153) have been classified by the International Agency for Research on Cancer (IARC) as possible, probable, or definite carcinogens to humans.32 This study adds new data on the concentration-dependent effects of this suite of common exposures on nontumorigenic breast cells and identifies potential relevance to breast carcinogenesis.

Specific gene differences included some commonalities across chemical treatments, and many of these were related to breast cancer progression. These included SAA1 and SAA2, S100A8 and S100A9, GNAS, aldo-keto reductases, keratins, and SERPINE1. SAA1 has been associated with the accumulation of suppressive neutrophils, and SAA1 and SAA2 have been associated with the progression of breast cancer as well as poor prognosis, particularly in triple-negative breast cancer (TNBC).33–35 Overexpression of the proinflammatory cytokines S100A8 and S100A9 was associated with reprogramming of normal breast cells (MCF10As) into tumorigenic cells.36,37 GNAS has been implicated in breast cancer cell proliferation and epithelial to mesenchymal transition via modulation of the PI3K/AKT/Snail1/E-cadherin signaling axis.38 Aldo-keto reductases were shown to be involved in estrogen and progesterone production and thus in pro-proliferative signaling of hormone-responsive tumors.39 Dysregulation of keratin expression levels has also been associated with breast cancer, with increased KRT14 expression associated with invasiveness, decreased KRT15 expression associated with poor prognosis, and increased KRT17 associated with the development of TNBC.40–42 SERPINE1 was found to be associated with resistance of TNBC to paclitaxel, and knockdown of SERPINE1 reversed resistance of breast cancer to paclitaxel through downregulation of vascular endothelial growth factor A (VEGFA).43

In addition to common gene expression differences across chemical treatments, our enrichment analyses suggested that chemicals may alter breast cell biology by several distinct mechanisms. Treatment with metals including copper, lead, and mercury was found to result in enrichment of genes associated with epithelial to mesenchymal transition and reactive oxygen species pathways, consistent with previous literature on metal-induced carcinogenesis that places particular emphasis on oxidative stress–related carcinogenesis.44–46 In breast cancer, arsenic and cadmium have been associated with oxidative stress–related pathways of malignant transformation.47 While these metals, as well as many of the other chemicals in our analysis, have been shown to play an endocrine-disrupting role that could lead to breast tumorigenesis, these findings suggest alternative routes of carcinogenesis.22,48,49 p,p′-DDE upregulated gene signatures related to cell cycle processes in our study. Previous literature has found that a mixture of organochlorine pesticides including p,p′-DDE had the potential to modulate proliferation of breast cancer cell lines, partially through induction of cell cycle entry.50

Differences in stemness features were also seen across cells exposed to fifteen of the chemicals analyzed. This is important because the cancer stem cell hypothesis states that a small subset of cancer stem cells drives cancer initiation and invasion, and so differences in stemness features can lead to increased tumorigenicity and metastasis.51 We found stemness signatures enhanced by 1,4-dichlorobenzene, p,p′-DDE, PFDA, PFNA, and PFOS, findings that could indicate novel mechanisms of chemically induced carcinogenesis.

These findings may merit consideration in the context of breast cancer disparities. We previously identified that in the US, non-Hispanic black women were disproportionately exposed to many of the chemicals assayed here.18 For example, non-Hispanic black women, on average, had 140% higher levels of urinary methyl paraben, 75% higher levels of blood p,p′-DDE, 93% higher levels of urinary bisphenol S, 39% higher levels of blood mercury, and 17% higher levels of blood lead compared to non-Hispanic white women.18 Non-Hispanic black women were also three times more likely to be diagnosed with basal-like triple-negative breast cancers compared to white women.52,53 Triple-negative breast cancers often have a basal-like phenotype and may derive from a luminal to basal cell state transition.13–15 Here, using enrichment analysis, we found upregulation of basal cancer subtype genes in cells exposed to copper, 1,4-dichlorobenzene, and BPF, and cell type proportion analysis (based on gene expression patterns consistent with myoepithelial vs. luminal features) found increased predicted myoepithelial proportions with arsenic, copper, lead, and methyl paraben and a lower basal cell proportion at the highest concentration of PFNA. The findings for PFNA also indicate an area for future study to understand potential divergence in predictions by different methods. We further assessed proliferation markers to determine whether an increase in myoepithelial proportion was indicative of a change in proliferation. For some chemicals that induced a higher proportion of myoepithelial markers, such as arsenic, there was an increase in proliferation markers, while for others, such as methyl paraben, there was no change in proliferation markers (Table S10).
Previous literature has shown that arsenic is capable of inducing a transition from luminal to basal features, but this association has not been studied in detail for the other chemicals.54 These findings highlight an increased need to examine chemical exposures as a modifiable risk factor for aggressive breast cancers, particularly in the context of breast cancer disparities.

One major limitation of the present study is that all exposures were done on an acute timescale (48 h), while actual human exposures to the chemicals analyzed here are likely chronic, over the course of years or a lifetime. The vehicle in which chemicals were dissolved, and against which they were compared, also differed between chemicals (water vs. DMSO); this represents another limitation, as DMSO could itself affect gene expression. Another limitation is that only one cell type, nontumorigenic immortalized breast cells, was used for this study. Ongoing work is considering treatment on a more chronic timescale as well as using different breast cell lines representing diverse donors.

Additionally, MCF10As are an imperfect model of human breast cancer, as a cell line does not capture the complex milieu of breast tissue. Future work could consider treatment of diverse breast cancer cell lines, ranging from more basal to more luminal tumor types to investigate differences in luminal or basal features related to chemical exposure. Future studies can also assess effects of a broader range of concentrations of each chemical to further understand effects across relevant ranges, including low concentration effects. Another limitation in our analyses integrating benchmark concentration levels with NHANES biomarker concentrations is that the NHANES chemical biomarkers are measured in urine and blood. These levels may or may not reflect mammary tissue levels. As additional research is conducted on biomonitoring of mammary tissues or physiologically based toxicokinetic models reflecting mammary gland concentrations are developed, these data and methods will substantially inform our understanding of how chemical exposures impact breast cancer risk.

Overall, our transcriptomic analysis of chemically treated normal breast cells uncovers molecular mechanisms that may be associated with more aggressive forms of breast cancer. This includes both features related to known carcinogenic properties of these chemicals as well as potentially novel mechanisms, opening the door to future avenues of investigation to elucidate how different exposures relate to breast carcinogenesis and breast cancer disparities.

Supplementary Material

ehp12886.s001.acco.pdf (25.8MB, pdf)

Acknowledgments

This work was supported by grants from the National Institutes of Health (R01 ES028802, R01 AG072396, T32 ES 007062, P30 ES017885, P30 CA046592).

Conclusions and opinions are those of the individual authors and do not necessarily reflect the policies or views of EHP Publishing or the National Institute of Environmental Health Sciences.

References

  • 1.Arnold M, Morgan E, Rumgay H, Mafra A, Singh D, Laversanne M, et al. 2022. Current and future burden of breast cancer: global statistics for 2020 and 2040. Breast 66:15–23, PMID: , 10.1016/j.breast.2022.08.010. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 2.Johnson KS, Conant EF, Soo MS. 2021. Molecular subtypes of breast cancer: a review for breast radiologists. J Breast Imaging 3:12–24. [DOI] [PubMed] [Google Scholar]
  • 3.Lord SJ, Bahlmann K, O’Connell DL, Kiely BE, Daniels B, Pearson SA, et al. 2022. De novo and recurrent metastatic breast cancer—a systematic review of population-level changes in survival since 1995. EClinicalMedicine 44:101282, PMID: , 10.1016/j.eclinm.2022.101282. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 4.Rodgers KM, Udesky JO, Rudel RA, Brody JG. 2018. Environmental chemicals and breast cancer: an updated review of epidemiological literature informed by biological mechanisms. Environ Res 160:152–182, PMID: , 10.1016/j.envres.2017.08.045. [DOI] [PubMed] [Google Scholar]
  • 5. Hanahan D, Weinberg RA. 2000. The hallmarks of cancer. Cell 100(1):57–70, 10.1016/s0092-8674(00)81683-9.
  • 6. Hanahan D, Weinberg RA. 2011. Hallmarks of cancer: the next generation. Cell 144(5):646–674, 10.1016/j.cell.2011.02.013.
  • 7. Hanahan D. 2022. Hallmarks of cancer: new dimensions. Cancer Discov 12(1):31–46, 10.1158/2159-8290.CD-21-1059.
  • 8. Colacino JA, Azizi E, Brooks MD, Harouaka R, Fouladdel S, McDermott SP, et al. 2018. Heterogeneity of human breast stem and progenitor cells as revealed by transcriptional profiling. Stem Cell Reports 10(5):1596–1609, 10.1016/j.stemcr.2018.03.001.
  • 9. Thong T, Wang Y, Brooks MD, Lee CT, Scott C, Balzano L, et al. 2020. Hybrid stem cell states: insights into the relationship between mammary development and breast cancer using single-cell transcriptomics. Front Cell Dev Biol 8:288, 10.3389/fcell.2020.00288.
  • 10. Spike BT, Engle DD, Lin JC, Cheung SK, La J, Wahl GM. 2012. A mammary stem cell population identified and characterized in late embryogenesis reveals similarities to human breast cancer. Cell Stem Cell 10(2):183–197, 10.1016/j.stem.2011.12.018.
  • 11. Giraddi RR, Chung C-Y, Heinz RE, Balcioglu O, Novotny M, Trejo CL, et al. 2018. Single-cell transcriptomes distinguish stem cell state changes and lineage specification programs in early mammary gland development. Cell Rep 24(6):1653–1666, 10.1016/j.celrep.2018.07.025.
  • 12. Malta TM, Sokolov A, Gentles AJ, Burzykowski T, Poisson L, Weinstein JN, et al. 2018. Machine learning identifies stemness features associated with oncogenic dedifferentiation. Cell 173(2):338–354, 10.1016/j.cell.2018.03.034.
  • 13. Dent R, Trudeau M, Pritchard KI, Hanna WM, Kahn HK, Sawka CA, et al. 2007. Triple-negative breast cancer: clinical features and patterns of recurrence. Clin Cancer Res 13(15):4429–4434, 10.1158/1078-0432.CCR-06-3045.
  • 14. Lim E, Vaillant F, Wu D, Forrest NC, Pal B, Hart AH, et al. 2009. Aberrant luminal progenitors as the candidate target population for basal tumor development in BRCA1 mutation carriers. Nat Med 15(8):907–913, 10.1038/nm.2000.
  • 15. Rädler PD, Wehde BL, Triplett AA, Shrestha H, Shepherd JH, Pfefferle AD, et al. 2021. Highly metastatic claudin-low mammary cancers can originate from luminal epithelial cells. Nat Commun 12(1):3742, 10.1038/s41467-021-23957-5.
  • 16. CDC (Centers for Disease Control and Prevention). 2022. National Report on Human Exposure to Environmental Chemicals. https://www.cdc.gov/exposurereport/ [accessed 31 January 2023].
  • 17. Nguyen VK, Colacino J, Patel CJ, Sartor M, Jolliet O. 2022. Identification of occupations susceptible to high exposure and risk associated with multiple toxicants in an observational study: National Health and Nutrition Examination Survey 1999–2014. Exposome 2(1):osac004, 10.1093/exposome/osac004.
  • 18. Nguyen VK, Kahana A, Heidt J, Polemi K, Kvasnicka J, Jolliet O, et al. 2020. A comprehensive analysis of racial disparities in chemical biomarker concentrations in United States women, 1999–2014. Environ Int 137:105496, 10.1016/j.envint.2020.105496.
  • 19. Nguyen VK, Colacino JA, Arnot JA, Kvasnicka J, Jolliet O. 2019. Characterization of age-based trends to identify chemical biomarkers of higher levels in children. Environ Int 122:117–129, 10.1016/j.envint.2018.10.042.
  • 20. Polemi KM, Nguyen VK, Heidt J, Kahana A, Jolliet O, Colacino JA. 2021. Identifying the link between chemical exposures and breast cancer in African American women via integrated in vitro and exposure biomarker data. Toxicology 463:152964, 10.1016/j.tox.2021.152964.
  • 21. Zeinomar N, Oskar S, Kehm RD, Sahebzeda S, Terry MB. 2020. Environmental exposures and breast cancer risk in the context of underlying susceptibility: a systematic review of the epidemiological literature. Environ Res 187:109346, 10.1016/j.envres.2020.109346.
  • 22. Wan MLY, Co VA, El-Nezami H. 2022. Endocrine disrupting chemicals and breast cancer: a systematic review of epidemiological studies. Crit Rev Food Sci Nutr 62(24):6549–6576, 10.1080/10408398.2021.1903382.
  • 23. Liberzon A, Birger C, Thorvaldsdóttir H, Ghandi M, Mesirov JP, Tamayo P. 2015. The molecular signatures database (MSigDB) hallmark gene set collection. Cell Syst 1(6):417–425, 10.1016/j.cels.2015.12.004.
  • 24. Wong DJ, Liu H, Ridky TW, Cassarino D, Segal E, Chang HY. 2008. Module map of stem cell genes guides creation of epithelial cancer stem cells. Cell Stem Cell 2(4):333–344, 10.1016/j.stem.2008.02.009.
  • 25. Smid M, Wang Y, Zhang Y, Sieuwerts AM, Yu J, Klijn JGM, et al. 2008. Subtypes of breast cancer show preferential site of relapse. Cancer Res 68(9):3108–3114, 10.1158/0008-5472.CAN-07-5644.
  • 26. Lim E, Wu D, Pal B, Bouras T, Asselin-Labat M-L, Vaillant F, et al. 2010. Transcriptome analyses of mouse and human mammary cell subpopulations reveal multiple conserved genes and pathways. Breast Cancer Res 12(2):R21, 10.1186/bcr2560.
  • 27. Federico A, Monti S. 2020. hypeR: an R package for geneset enrichment workflows. Bioinformatics 36(4):1307–1308, 10.1093/bioinformatics/btz700.
  • 28. Pal B, Chen Y, Vaillant F, Capaldo BD, Joyce R, Song X, et al. 2021. A single-cell RNA expression atlas of normal, preneoplastic and tumorigenic states in the human breast. EMBO J 40(11):e107333, 10.15252/embj.2020107333.
  • 29. Grashow RG, De La Rosa VY, Watford SM, Ackerman JM, Rudel RA. 2018. BCScreen: a gene panel to test for breast carcinogenesis in chemical safety screening. Comput Toxicol 5:16–24, 10.1016/j.comtox.2017.11.003.
  • 30. Nguyen VK, Middleton L, Huang L, Verly E, Kvasnicka J, Sagers L, et al. 2023. NHANES 1988–2018. Kaggle. 10.34740/KAGGLE/DSV/4856629 [accessed 19 December 2023].
  • 31. Wang X, Park J, Susztak K, Zhang NR, Li M. 2019. Bulk tissue cell type deconvolution with multi-subject single-cell expression reference. Nat Commun 10(1):380, 10.1038/s41467-018-08023-x.
  • 32. WHO (World Health Organization). 2024. List of Classifications – IARC Monographs on the Identification of Carcinogenic Hazards to Humans. https://monographs.iarc.who.int/list-of-classifications/ [accessed 30 November 2023].
  • 33. Niu X, Yin L, Yang X, Yang Y, Gu Y, Sun Y, et al. 2022. Serum amyloid A 1 induces suppressive neutrophils through the toll-like receptor 2–mediated signaling pathway to promote progression of breast cancer. Cancer Sci 113(4):1140–1153, 10.1111/cas.15287.
  • 34. Yang M, Liu F, Higuchi K, Sawashita J, Fu X, Zhang L, et al. 2016. Serum amyloid A expression in the breast cancer tissue is associated with poor prognosis. Oncotarget 7(24):35843–35852, 10.18632/oncotarget.8561.
  • 35. Ignacio RMC, Gibbs CR, Kim S, Lee ES, Adunyah SE, Son DS. 2019. Serum amyloid A predisposes inflammatory tumor microenvironment in triple negative breast cancer. Oncotarget 10(4):511–526, 10.18632/oncotarget.26566.
  • 36. Jo SH, Heo WH, Son H-Y, Quan M, Hong BS, Kim JH, et al. 2021. S100A8/A9 mediate the reprograming of normal mammary epithelial cells induced by dynamic cell–cell interactions with adjacent breast cancer cells. Sci Rep 11(1):1337, 10.1038/s41598-020-80625-2.
  • 37. Song R, Struhl K. 2021. S100A8/S100A9 cytokine acts as a transcriptional coactivator during breast cellular transformation. Sci Adv 7(1):eabe5357, 10.1126/sciadv.abe5357.
  • 38. Jin X, Zhu L, Cui Z, Tang J, Xie M, Ren G. 2019. Elevated expression of GNAS promotes breast cancer cell proliferation and migration via the PI3K/AKT/Snail1/E-cadherin axis. Clin Transl Oncol 21(9):1207–1219, 10.1007/s12094-019-02042-w.
  • 39. Penning TM, Byrns MC. 2009. Steroid hormone transforming aldo-keto reductases and cancer. Ann NY Acad Sci 1155:33–42, 10.1111/j.1749-6632.2009.03700.x.
  • 40. Zhong P, Shu R, Wu H, Liu Z, Shen X, Hu Y. 2021. Low KRT15 expression is associated with poor prognosis in patients with breast invasive carcinoma. Exp Ther Med 21(4):305, 10.3892/etm.2021.9736.
  • 41. Zhang H, Zhang Y, Xia T, Lu L, Luo M, Chen Y, et al. 2022. The role of Keratin17 in human tumours. Front Cell Dev Biol 10:818416, 10.3389/fcell.2022.818416.
  • 42. Hanley CJ, Henriet E, Sirka OK, Thomas GJ, Ewald AJ. 2020. Tumor resident stromal cells promote breast cancer invasion through regulation of the basal phenotype. Mol Cancer Res 18(11):1615–1622, 10.1158/1541-7786.MCR-20-0334.
  • 43. Zhang Q, Lei L, Jing D. 2020. Knockdown of SERPINE1 reverses resistance of triple-negative breast cancer to paclitaxel via suppression of VEGFA. Oncol Rep 44(5):1875–1884, 10.3892/or.2020.7770.
  • 44. Harris GK, Shi X. 2003. Signaling by carcinogenic metals and metal-induced reactive oxygen species. Mutat Res 533(1–2):183–200, 10.1016/j.mrfmmm.2003.08.025.
  • 45. Eckstein M, Rea M, Fondufe-Mittendorf YN. 2017. Transient and permanent changes in DNA methylation patterns in inorganic arsenic-mediated epithelial-to-mesenchymal transition. Toxicol Appl Pharmacol 331:6–17, 10.1016/j.taap.2017.03.017.
  • 46. Chen QY, DesMarais T, Costa M. 2019. Metals and mechanisms of carcinogenesis. Annu Rev Pharmacol Toxicol 59:537–554, 10.1146/annurev-pharmtox-010818-021031.
  • 47. Zimta A-A, Schitcu V, Gurzau E, Stavaru C, Manda G, Szedlacsek S, et al. 2019. Biological and molecular modifications induced by cadmium and arsenic during breast and prostate cancer development. Environ Res 178:108700, 10.1016/j.envres.2019.108700.
  • 48. Davey JC, Bodwell JE, Gosse JA, Hamilton JW. 2007. Arsenic as an endocrine disruptor: effects of arsenic on estrogen receptor-mediated gene expression in vivo and in cell culture. Toxicol Sci 98(1):75–86, 10.1093/toxsci/kfm013.
  • 49. Bimonte VM, Besharat ZM, Antonioni A, Cella V, Lenzi A, Ferretti E, et al. 2021. The endocrine disruptor cadmium: a new player in the pathophysiology of metabolic diseases. J Endocrinol Invest 44(7):1363–1377, 10.1007/s40618-021-01502-x.
  • 50. Aubé M, Larochelle C, Ayotte P. 2011. Differential effects of a complex organochlorine mixture on the proliferation of breast cancer cell lines. Environ Res 111(3):337–347, 10.1016/j.envres.2011.01.010.
  • 51. Kakarala M, Wicha MS. 2008. Implications of the cancer stem-cell hypothesis for breast cancer prevention and therapy. J Clin Oncol 26(17):2813–2820, 10.1200/JCO.2008.16.3931.
  • 52. McCarthy AM, Friebel-Klingner T, Ehsan S, He W, Welch M, Chen J, et al. 2021. Relationship of established risk factors with breast cancer subtypes. Cancer Med 10(18):6456–6467, 10.1002/cam4.4158.
  • 53. Carey LA, Perou CM, Livasy CA, Dressler LG, Cowan D, Conway K, et al. 2006. Race, breast cancer subtypes, and survival in the Carolina Breast Cancer Study. JAMA 295(21):2492–2502, 10.1001/jama.295.21.2492.
  • 54. Danes JM, de Abreu ALP, Kerketta R, Huang Y, Palma FR, Gantner BN, et al. 2020. Inorganic arsenic promotes luminal to basal transition and metastasis of breast cancer. FASEB J 34(12):16034–16048, 10.1096/fj.202001192R.

Associated Data

Supplementary Materials

ehp12886.s001.acco.pdf (25.8MB, pdf)

Data Availability Statement

The sequencing data have been deposited at the Gene Expression Omnibus (accession number GSE220051).


Articles from Environmental Health Perspectives are provided here courtesy of National Institute of Environmental Health Sciences
