Nucleic Acids Research. 2026 Feb 23;54(4):gkag118. doi: 10.1093/nar/gkag118

Isoform-specific single-cell perturb-seq reveals distinct functions of alternative promoters in drug response

Helen E King 1,2, Savannah O’Connell 3, Daisy Kavanagh 4,5, Sofia Mason 6,7, Cerys McCool 8,9, Javier Fernandez-Chamorro 10, Christine L Chaffer 11,12, Susan J Clark 13,14, Helaine Graziele S Vieira 15,c, Timothy Sterne-Weiler 16,17,c, Robert J Weatheritt 18,19,✉,c
PMCID: PMC12926921  PMID: 41728950

Abstract

CRISPR interference (CRISPRi) screens have emerged as powerful tools for dissecting gene function, yet their application to genes with multiple promoters, which comprise over 60% of human genes, remains poorly understood. Here, we demonstrate that CRISPR-dCas9-based screens exhibit widespread promoter specificity, with untargeted promoters often showing compensatory upregulation to maintain gene expression. Leveraging this selective targeting of individual promoters within the same gene, we developed Isoform-Specific single-cell Perturb-Seq to systematically analyse alternative promoter function. Our analysis revealed that alternative promoters in 51.6% of targeted genes drive distinct transcriptional programs. This suggests that promoter selection represents a fundamental mechanism for generating cellular diversity rather than mere transcriptional redundancy. In breast cancer models, this promoter-specific targeting revealed differential effects on drug sensitivity, where distinct estrogen receptor (ESR1) promoters showed opposing influences on tamoxifen response and patient survival. These findings demonstrate the necessity of promoter-level analysis in functional genomics and suggest new strategies for therapeutic intervention through promoter-specific targeting.

Graphical Abstract

Introduction

CRISPR interference (CRISPRi) has revolutionized functional genomics by enabling the precise control of gene expression through sequence-specific targeting of promoter regions [1]. This technology has uncovered complex cellular networks and mechanisms, ranging from basic biological processes [2–4] to therapeutic applications [5–8]. The scalability of these screens, combined with their ability to tune gene expression rather than completely ablate it, has opened new avenues for understanding complex biological processes and developing more effective therapeutic strategies [1, 4]. However, a critical limitation has emerged: current approaches predominantly target a single promoter per gene [1–4, 6, 7, 9], potentially missing crucial regulatory dynamics in the over 60% of human genes that utilize multiple promoters [10, 11].

Alternative promoters serve as molecular switches, enabling genes to generate distinct transcript isoforms in response to cellular demands and environmental signals. Each promoter region integrates a wide array of regulatory signals, including from distal enhancers, chromatin modifications, and transcription factor binding, to control gene expression in different cellular contexts [12, 13]. Growing evidence suggests these alternative promoters play crucial roles in development, disease progression, and therapeutic response, yet their systematic functional analysis has remained technically challenging.

The CRISPRi machinery typically influences transcription within ∼1000 nucleotides of the guide RNA binding site, suggesting that distally separated alternative promoters might be independently targetable [14]. This spatial specificity, combined with the prevalence of alternative promoters separated by >1000 nucleotides, presents both a challenge and an opportunity: while current CRISPRi libraries may miss critical regulatory elements, the technology could potentially enable the systematic analysis of alternative promoter function.

Here, we leverage the spatial specificity of CRISPRi to develop isoform-resolved single-cell Perturb-Seq, a screening approach that enables systematic analysis of alternative promoter function. We hypothesized that alternative promoters are independent regulatory units capable of driving distinct cellular phenotypes and drug responses. By combining promoter-specific targeting with single-cell transcriptional profiling, we uncover widespread functional divergence between alternative promoters of the same gene and demonstrate their potential as therapeutic targets.

Materials and methods

Publicly available datasets

We utilized publicly available genome-wide single-cell CRISPR Perturb-Seq data [4] for the cell lines RPE1 and K562. This comprehensive screen utilizes over 2 million cells. Replogle et al. performed and analysed the screen [4]: each 10× channel (GEM group) was z-normalized to aggregate the whole screen into a gene-by-cell barcode count matrix, in which each cell barcode was associated with a gene knockdown and separated into pseudo-bulk populations of cells used in subsequent analyses. We used other publicly available datasets detailed in the table below.

Cell line | Description | Identification number | Reference
RPE1 | Single-cell CRISPR Perturb-Seq data | 170 | [4]
K562 | Single-cell CRISPR Perturb-Seq data | | [4]
RPE1 and K562 | CRISPRa growth data | Supplementary data from publication | [15]
Tissue-wide | Dolcetto optimized sgRNA library sequences | | [9]
Tissue-wide | FANTOM-5 CAGE-seq Peaks | hg19.cage_peak_phase1and2combined_tpm_ann.osc.txt.gz | [10]
K562 | Long Read RNA-seq | ENCFF516IEG | [16]
MCF-7 | Tamoxifen Treatment RNA-seq | SRP116398 | [17]
MCF-7 | Tamoxifen Resistance RNA-seq | SRP262641 | [18]
MCF-7 | H3K4me3 ChIP-seq | ENCSR000DWJ, GSM945269 | [19]
MCF-7 | IDR from CAGE-seq | ENCSR000CJO, GSM849364 | [20]
MCF-7 | Long-read Pacbio CCS data | See citation | [21]
MCF-7 | ESR1 and Androgen Resistance ChIP-seq | GSM1187117 | [22, 23]

CRISPR library and promoter analysis

To identify the promoters used across tissues for each gene, we downloaded RLE-normalized CAGE peak expression profiles from FANTOM5 (hg19.cage_peak_phase1and2combined_tpm_ann.osc.txt.gz) [10] and used liftOver (hg19ToHg38.over.chain.gz) to convert hg19 genome coordinates to hg38. We defined distal alternative promoters as those separated by >1000 nucleotides. Only promoters contributing >20% of total gene expression in at least one tissue type were considered expressed. CRISPR library single guide RNAs (sgRNAs) within 1000 nt of a promoter were considered to target that promoter.
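
The promoter rules in this paragraph can be sketched in a few lines of Python. This is a minimal illustration and not the authors' pipeline: the function names, the input layout (TSS coordinates on one chromosome mapped to per-tissue expression fractions) and the greedy left-to-right grouping are assumptions. Only the >1000 nt separation, the >20% expression cutoff and the 1000 nt sgRNA window are taken from the text.

```python
from typing import Dict, List

DISTANCE_NT = 1000    # separation that defines distinct (distal) promoters
MIN_FRACTION = 0.20   # share of gene expression required in at least one tissue


def expressed_promoters(fractions: Dict[int, List[float]]) -> List[int]:
    """Keep TSS positions contributing >20% of gene expression in any tissue.

    `fractions` maps a TSS coordinate to its per-tissue share of total gene
    expression (hypothetical input layout).
    """
    return sorted(tss for tss, per_tissue in fractions.items()
                  if max(per_tissue) > MIN_FRACTION)


def group_distal(tss_positions: List[int]) -> List[List[int]]:
    """Greedily merge TSSs lying within 1000 nt of the previous TSS."""
    groups: List[List[int]] = []
    for tss in sorted(tss_positions):
        if groups and tss - groups[-1][-1] <= DISTANCE_NT:
            groups[-1].append(tss)   # proximal: same promoter group
        else:
            groups.append([tss])     # distal: start a new promoter group
    return groups


def targeted_groups(groups: List[List[int]], sgrna_pos: int) -> List[int]:
    """Indices of promoter groups with any TSS within 1000 nt of the sgRNA."""
    return [i for i, group in enumerate(groups)
            if any(abs(sgrna_pos - tss) <= DISTANCE_NT for tss in group)]
```

With this layout, a gene whose expressed TSSs fall into two or more groups would count as having distal alternative promoters, and an sgRNA is assigned only to the groups it lies near.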

To analyse the correlation between knockdown efficiency and the number of alternative promoters, we downloaded processed count data from long-read nanopore sequencing of K562 cells from the ENCODE database [16]. Promoters separated by >1000 nt were treated as separate promoters; for promoters within this range, the count data were combined.

For the promoter expression analysis, BAM files were obtained for each knockdown from the Replogle et al. RPE1 dataset [4, 24], converted to FASTQ and analysed using the transcript analysis program Whippet. Transcripts with promoters within 1000 nt of a P1 or P2 sgRNA were considered targeted, and their transcripts per million (TPM) values were combined. Transcripts with promoters outside this range were discarded (for the main figure) or included in additional calculations. To estimate a null distribution of expression variability, we used transcriptomic data from cells transduced with non-targeting control (NTC) sgRNAs (10% of the library) to define baseline variability in promoter expression in the absence of targeted perturbation.

Analysis of RNA-seq datasets

Whippet was used with an index constructed from the hg38 genome with annotation from Ensembl v102, using default settings. The Ensembl v102 annotations were supplemented with RefSeq annotations for ESR1. Whippet-quant was run with --biascorrect and otherwise default settings. Relative promoter expression was calculated by summing transcript TPM for P1- and P2-associated transcripts relative to total gene expression.
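
As a worked illustration of the relative promoter expression calculation, the sketch below sums TPM per promoter and divides by the gene total. The function name, the dictionary inputs and the "other" label for unassigned transcripts are hypothetical; the source only states that P1 and P2 transcript TPMs are expressed relative to total gene expression.

```python
from typing import Dict


def relative_promoter_expression(tpm_by_transcript: Dict[str, float],
                                 promoter_of: Dict[str, str]) -> Dict[str, float]:
    """Return each promoter's share of total gene TPM.

    `promoter_of` maps transcript IDs to a promoter label such as "P1" or
    "P2"; transcripts without a label are pooled under "other" (assumption).
    """
    total = sum(tpm_by_transcript.values())
    if total == 0:
        return {}   # gene not expressed: no meaningful fractions
    summed: Dict[str, float] = {}
    for transcript, tpm in tpm_by_transcript.items():
        label = promoter_of.get(transcript, "other")
        summed[label] = summed.get(label, 0.0) + tpm
    return {label: value / total for label, value in summed.items()}
```

For example, a gene with 8 TPM from P1 transcripts and 2 TPM from P2 transcripts would give P1 = 0.8 and P2 = 0.2.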

Deposited datasets

Description | Link | Information
Raw sequencing data from Perturb-Seq | EBI ArrayExpress E-MTAB-14567 | FASTQ files from CRISPR library and transcriptome
Processed data from Perturb-Seq | EBI ArrayExpress E-MTAB-14567 | BAM files output from cellranger

Cloning of isoform-specific CRISPRi dual guide library for capture

The Isoform-Specific Perturb-seq library (PromCRISPRi) was generated using a dual guide RNA cloning method, as outlined in Replogle et al. 2020 [24] and 2022 [4]. This method reduces library size and maximizes knockdown by placing two guides, designed to target the same promoter, on a single vector. Briefly, oligonucleotides were synthesized in a pool format (Twist Biosciences). sgRNA protospacer sequences were separated by a BsmBI enzyme site and flanked by BstXI/BlpI sites, followed by polymerase chain reaction (PCR) adapters: 5′-PCR adaptor – CCACCTTGTTG – protospacer sequence A – gtttcagagcgagacgtgcctgcaggatacgtctcagaaacatg – protospacer sequence B – GTTTAAGAGCTAAGCTG – PCR adaptor-3′ (Supplementary Table S2).
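
The oligo layout described above can be assembled programmatically. The sketch below is illustrative only: the function name and validation are assumptions, and the PCR adaptor sequences are not given in the text (they are listed in Supplementary Table S2), so they are left as empty placeholder arguments. The three constant sequences are copied from the layout in the paragraph above.

```python
# Constant sequences copied from the oligo layout described in the text.
LEFT_FLANK = "CCACCTTGTTG"
SPACER = "gtttcagagcgagacgtgcctgcaggatacgtctcagaaacatg"
RIGHT_FLANK = "GTTTAAGAGCTAAGCTG"


def build_dual_guide_oligo(protospacer_a: str, protospacer_b: str,
                           adaptor_5: str = "", adaptor_3: str = "") -> str:
    """Concatenate adaptor, flank, protospacer A, spacer, protospacer B, flank, adaptor.

    `adaptor_5` and `adaptor_3` are placeholders for the PCR adaptors,
    which are not specified in the main text.
    """
    for name, seq in (("A", protospacer_a), ("B", protospacer_b)):
        if not seq or set(seq.upper()) - set("ACGT"):
            raise ValueError(f"protospacer {name} must be a non-empty ACGT string")
    return (adaptor_5 + LEFT_FLANK + protospacer_a.upper() + SPACER
            + protospacer_b.upper() + RIGHT_FLANK + adaptor_3)
```

The lower-case spacer carries the BsmBI recognition sequence in both orientations (gagacg and cgtctc), which is consistent with the later step in which the library plasmid is opened with BsmBI to receive the pJR89 fragment.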

The oligo pool was PCR-amplified, digested with BstXI/BlpI, purified and ligated into pJR85 (Addgene #140095) previously digested with the same enzymes. The newly generated library plasmid was digested with BsmBI, and a second plasmid, pJR89 (Addgene #140096), was digested with the same enzyme to release the CR3/hU6 promoter insert. The released fragment from pJR89 was purified and ligated into the library plasmid to produce the final library. The final PromCRISPRi library and its guide RNA distribution were confirmed by next generation sequencing (NGS) [25].

Cell culture and maintenance

Human embryonic kidney (HEK) 293T, MCF-7 wild-type and MCF-7 dCas9-KRAB cell lines were cultivated in Dulbecco's modified Eagle medium (DMEM) supplemented with 10% (v/v) fetal bovine serum (FBS) and 1% penicillin/streptomycin. Cells were maintained at 37°C in a humidified atmosphere containing 5% CO2. For all experiments, low-passage cells were used, and a mycoplasma test was conducted.

Lentivirus production and transduction

To produce lentivirus, HEK293T cells growing in a T-25 flask at 80%–90% confluency were co-transfected with 2 µg of pCAG-VSVG and 4 µg of psPAX2 (Addgene #35616, #12260), together with either the dCas9 plasmid pHR-SFFV-KRAB-dCas9-P2A-mCherry (Addgene #60954) or the IsoCRISPRi library, using Lipofectamine 3000 transfection reagent (Thermo Fisher Scientific) following the manufacturer’s recommendations. Media was replaced 24 h after transfection, and 48 h later, supernatants containing viruses were collected by centrifugation and filtration through a 0.45-μm PVDF filter (Millex Millipore – 0890). The viruses were either used immediately or stored in aliquots at −80°C.

To generate a stable MCF-7 KRAB dCas9 cell line, MCF-7 wild-type cells in a T-25 flask at 60%–70% confluency were transduced for 24 h with lentivirus containing pHR-SFFV-KRAB-dCas9-P2A-mCherry. Polyclonal populations of mCherry-positive cells were sorted using a Fluorescence-Activated Cell Sorting (FACS) Aria II Cell Sorter (BD Biosciences). The purity of the recovered populations was >98%, and dCas9 protein expression levels were verified by Western blot (Supplementary Fig. S2F).

After reaching 60%–70% confluency, stable MCF-7 KRAB dCas9 cells in a T75 flask were transduced with a lentivirus containing the PromCRISPRi library at a low multiplicity of infection (0.1) for 24 h. Subsequently, the cells were washed 10 times with warm 1× PBS to eliminate residual viruses, then trypsinized and divided into two T75 flasks. The parental CRISPR guide vector pJR85 used for generating the PromCRISPRi library has two selectable markers, puromycin resistance and blue fluorescent protein (BFP), enabling two rounds of selection to obtain pure populations of cells expressing dCas9 and harbouring the PromCRISPRi library. The first selection was carried out 3 days post-transduction: the percentage of cells expressing dCas9 (mCherry+) and dual guide RNA (BFP+) was determined, and cells were sorted to near purity using a FACS Aria II (BD Biosciences) (Supplementary Fig. S2E). Following sorting, double-positive cells (mCherry+, BFP+) were pelleted and seeded in a T-25 flask with DMEM, 10% FBS, 1% penicillin/streptomycin, and 1 μg/ml puromycin. By the eighth day of drug selection, >70% of the MCF-7 KRAB dCas9 population expressed the PromCRISPRi library (BFP+), and a final cell sorting was conducted for single-cell direct capture Perturb-seq.

Single-cell direct capture perturb-seq and sequencing

By the eighth day post-transduction, >70% of the MCF-7 KRAB dCas9 population expressed the PromCRISPRi library (BFP+), and a final cell sorting (Aria II Cell Sorter – BD Biosciences) was conducted prior to single-cell direct capture Perturb-Seq (Supplementary Fig. S2E). PromCRISPRi cells with >90% viability were prepared as a single-cell suspension and loaded into droplet emulsions on two lanes of a Chromium Single Cell Chip K, aiming to recover ∼20 000 cells per GEM group (∼40 000 in total). Gene expression (GEX) and PromCRISPRi libraries were prepared following the 10× Genomics Chromium Single Cell 5′ Kit User Guide v2 (Dual Index) with Feature Barcode technology for CRISPR Screening (CG000510 Rev B). Sequencing was performed on a NovaSeq 6000 (Illumina) according to the 10× Genomics User Guide.

ESR1 promoter-specific knockdown validation and proliferation assay

The same cloning strategy, lentivirus production, and cell transduction used for the Promoter Specific Perturb-seq library were also applied to ESR1 gene promoter-specific knockdown and non-targeting sgRNA controls. The correct sequence of the ESR1 dual guide CRISPRi plasmids was confirmed by Sanger sequencing. Following transduction, cells expressing gRNAs (BFP+) and dCas9 protein (mCherry+) were FACS sorted as previously described and cultivated under puromycin drug selection. After 6 days, the cells were harvested for total RNA extraction using the RNeasy Mini Plus Kit (74134 – Qiagen) according to the manufacturer’s instructions. Complementary DNA (cDNA) was synthesized using the RevertAid RT Reverse Transcription kit (K1691 – Thermo Fisher Scientific) following the manufacturer’s protocol. Quantitative Real-Time Reverse Transcription PCR (qRT-PCR) was performed using primers for ESR1 P1 and P2 (Supplementary Fig. S8) and GAPDH, with PCR Master Mix Power SYBR Green (4367659 – Thermo Fisher Scientific), on a QuantStudio 7 Flex Real-Time PCR System (Applied Biosystems). Relative expression levels were calculated using the 2^−ΔΔCT method with GAPDH as the endogenous control and non-target control sgRNA dCas9 as the reference sample. This was calculated across two biological replicates and two technical replicates (two cell populations of two dual guide sgRNA–dCas9 targeting P1 or P2), with an NTC population for each. Significance was tested with the Wilcoxon–Mann–Whitney test. qRT-PCR primer sequences are provided in Supplementary Table S3.
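
For readers unfamiliar with the 2^−ΔΔCT calculation, a minimal sketch follows. The function and argument names are illustrative; the roles of GAPDH (endogenous control) and the NTC sgRNA population (reference sample) follow the text, and the Ct values in the usage note are invented.

```python
def relative_expression(ct_target_sample: float, ct_gapdh_sample: float,
                        ct_target_ntc: float, ct_gapdh_ntc: float) -> float:
    """Relative expression by the 2^-ddCt method.

    dCt  = Ct(target) - Ct(GAPDH), computed per sample
    ddCt = dCt(knockdown sample) - dCt(NTC reference)
    """
    delta_sample = ct_target_sample - ct_gapdh_sample
    delta_ntc = ct_target_ntc - ct_gapdh_ntc
    return 2.0 ** -(delta_sample - delta_ntc)
```

With made-up values, a target that amplifies two cycles later in the knockdown sample than in the NTC (at equal GAPDH Ct) gives a relative expression of 0.25, that is, a 75% reduction.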

To measure the impact of ESR1 P1 and P2 promoter knockdown on cellular proliferation, cells were seeded after FACS sorting at 10% confluency in a 96-well plate and cultivated in DMEM supplemented with 10% FBS and 1% penicillin/streptomycin. The media was replaced for the samples treated with 5 mM tamoxifen, and the drug was added 24 h after cell seeding. Cell proliferation was assessed using a live-cell Incucyte S3 (Sartorius) imaging system, capturing images at four defined points per well every 4 h over 5 days. Two independent experiments were conducted with three technical replicates (wells) per condition. Analysis was performed using the Incucyte S3 Analysis System. Relative phase object confluence is the phase object confluence after subtraction of the confluence at 0 h. The lmer package was used to fit a second-degree polynomial model to each condition, and significant differences were assessed from the confidence intervals (CIs; Supplementary Table S14). Doubling time was calculated as shown below (Supplementary Table S15).

[Equation image TM0001.gif: doubling-time formula, not recoverable from the extracted text]

Promoter identification and guide design

A total of 42 candidate genes, identified as ideal cases of alternative promoter usage for knockdown with Perturb-seq, were targeted in the MCF-7 cell line. Library design comprised three stages: (i) promoter identification, (ii) filtering for ideal candidates, and (iii) guide design.

First, promoters were identified by integrating MCF-7 transcriptomic data: Irreproducible Discovery Rate (IDR) peaks from CAGE-seq [20] and proActiv [12] analysis of short-read and long-read PacBio CCS data [21], annotated with known promoters from the refTSS database [26]. Next, the top 42 candidates were selected on the basis of high expression in MCF-7 (TPM > 2), promoters > 1000 bp apart, annotation as RNA transcription factors [27], enrichment of H3K4me3 ChIP-seq signal [19], and a major promoter located upstream of the minor promoter. Finally, guides were designed using FlashFry [28] against the first 100 bp of the promoter of interest. For each promoter (e.g. STAT3 P1), four guides were chosen, two per dual guide vector. Non-targeting sgRNAs composed 5% of the total library. The top four guides for each site of interest were prioritized if they were found in the Horlbeck et al. 2016 hCRISPRiv2.1 guide library or identified with a complementary guide design tool, CRISPRDO [29], and had the highest on-target (Moreno-Mateos) and off-target (Doench) scores (Supplementary Table S1 and Supplementary Fig. S2C) [26].

Gene-level quantification

Using Cell Ranger v8.8.0, the transcriptomic library was aligned to reference GRCh38 v77 using “count”, and the feature library was aligned to the supplied list of guides (Supplementary Table S2). For accurate guide calling, the unique molecular identifier (UMI) quantification in the molecule_info.h5 file was reprocessed using a guide-calling pipeline [8] based on read coverage per UMI. Cells carrying the two unique sgRNAs expressed from a single lentiviral vector were assigned as 'ideal'.

Transcript quantification

Cells were separated into pseudo-bulk populations, with one BAM file per gene knockdown. Picard MarkDuplicates was used to remove PCR duplicate reads, and the BAM files were then converted into FASTQ. Whippet was used to quantify the transcriptome against GRCh38. Transcripts within 800 bp of a promoter were assigned to that promoter and their TPM summed. The percentage knockdown (KD) was calculated between the knockdown of the promoter of interest and the NTC guides. A minimum of 0.5 TPM was required for both P1- and P2-driven isoforms within the NTC populations.
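
The assignment and knockdown steps can be sketched as follows. The 800 bp window and the 0.5 TPM floor come from the text; the function names, the (TSS, TPM) input layout and, in particular, the knockdown convention 100 × (1 − KD/NTC) are assumptions made for illustration and may differ from the formula the authors applied.

```python
from typing import List, Optional, Tuple

WINDOW_BP = 800      # transcripts starting within 800 bp are assigned to a promoter
MIN_NTC_TPM = 0.5    # expression floor required in the NTC population


def promoter_tpm(transcripts: List[Tuple[int, float]], promoter_pos: int) -> float:
    """Sum TPM over transcripts whose start lies within 800 bp of the promoter.

    `transcripts` is a list of (TSS coordinate, TPM) pairs (hypothetical layout).
    """
    return sum(tpm for tss, tpm in transcripts
               if abs(tss - promoter_pos) <= WINDOW_BP)


def percent_knockdown(ntc_tpm: float, kd_tpm: float) -> Optional[float]:
    """Percent reduction relative to NTC (assumed convention).

    Returns None when NTC expression is below the 0.5 TPM floor.
    """
    if ntc_tpm < MIN_NTC_TPM:
        return None
    return 100.0 * (1.0 - kd_tpm / ntc_tpm)
```

Under this convention a promoter with 10 TPM in NTC cells and 2.5 TPM after knockdown scores 75%, and an untargeted promoter that is upregulated yields a negative value.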

Quantifying transcriptomic differences

Count matrices were filtered for genes with a minimum of 20 cells and genes with a minimum of 1000 reads; the top 2000 highly variable genes were assigned with Seurat_v3. From expression data normalized to total UMI content and log-scaled, a neighborhood graph was computed (using scanpy.pp.neighbors with n_neighbors = 30, method = 'umap', metric = 'correlation', and n_pcs = 20), followed by Uniform Manifold Approximation and Projection (UMAP) embedding (using scanpy.tl.umap with default parameters). scPerturb [30] was used to quantify the e-statistic using Euclidean distance on X_pca. The e-test was performed on the e-distance with alpha = 0.05 and runs = 200, using the P2-targeted cell population as the control relative to P1. N-terminus change was calculated for every gene according to whether the canonical ATG resides between the P1 and P2 defined co-ordinates (Supplementary Table S1).
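
To make the e-distance and its permutation test concrete, here is a small standard-library sketch. It is not scPerturb's implementation: the function names are hypothetical, the within-group terms average over distinct pairs (one of several conventions), and the points stand in for cells in PCA space. Only the use of Euclidean distance and the 200 permutation runs mirror the text.

```python
import math
import random
from itertools import combinations
from typing import List, Sequence

Point = Sequence[float]


def _mean_between(xs: List[Point], ys: List[Point]) -> float:
    """Mean Euclidean distance over all cross-group pairs."""
    return sum(math.dist(x, y) for x in xs for y in ys) / (len(xs) * len(ys))


def _mean_within(xs: List[Point]) -> float:
    """Mean Euclidean distance over distinct pairs in one group (needs >= 2 points)."""
    pairs = list(combinations(xs, 2))
    return sum(math.dist(a, b) for a, b in pairs) / len(pairs)


def e_distance(xs: List[Point], ys: List[Point]) -> float:
    """Energy distance: twice the between-group term minus both within-group terms."""
    return 2.0 * _mean_between(xs, ys) - _mean_within(xs) - _mean_within(ys)


def permutation_p_value(xs: List[Point], ys: List[Point],
                        runs: int = 200, seed: int = 0) -> float:
    """Monte Carlo p-value: share of label shuffles with e-distance >= observed."""
    rng = random.Random(seed)
    observed = e_distance(xs, ys)
    pooled = list(xs) + list(ys)
    hits = 0
    for _ in range(runs):
        rng.shuffle(pooled)
        if e_distance(pooled[:len(xs)], pooled[len(xs):]) >= observed:
            hits += 1
    return (hits + 1) / (runs + 1)   # add-one correction avoids p = 0
```

In this framing, `xs` and `ys` would hold the PCA coordinates of P1- and P2-knockdown cells for one gene, and a small p-value indicates that the two populations differ more than random relabelling would produce.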

The Rand score measures how well unsupervised clusters recapitulate the P1 and P2 labels. HDBSCAN, a hierarchical clustering method, was used to identify and label clusters, with hyperparameter optimization. Clusters were assigned with algorithm = 'best', cluster_selection_method = 'leaf', min_cluster_size = 10, metric = 'manhattan', alpha = 0.6, and allow_single_cluster = False.
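
The idea behind the Rand score can be shown with a short standard-library function. This sketch computes the unadjusted Rand index; whether the authors used the unadjusted or the chance-adjusted variant is not stated, and the function name is illustrative. Noise points that HDBSCAN labels −1 would simply be treated here as one more cluster.

```python
from itertools import combinations
from typing import Hashable, Sequence


def rand_index(labels_true: Sequence[Hashable],
               labels_pred: Sequence[Hashable]) -> float:
    """Fraction of cell pairs on which two labelings agree.

    A pair agrees when both labelings put the two cells together, or both
    put them apart.
    """
    if len(labels_true) != len(labels_pred):
        raise ValueError("label sequences must have the same length")
    pairs = list(combinations(range(len(labels_true)), 2))
    if not pairs:
        raise ValueError("at least two cells are required")
    agree = sum(
        (labels_true[i] == labels_true[j]) == (labels_pred[i] == labels_pred[j])
        for i, j in pairs
    )
    return agree / len(pairs)
```

A value of 1.0 means the clusters reproduce the P1/P2 split exactly, regardless of how the cluster IDs are numbered.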

Neighbourhood gene KD and differential gene expression

Using the reference annotation, genes neighbouring each target gene were identified. Input count matrices were separated into two pseudo-bulk populations, the targeted cell population and the non-targeting gRNA control, and compared with a statsmodels z-test with a null hypothesis of no difference in means and standard deviation between the two populations (ddof = 1). Differentially expressed genes were identified with the scanpy rank_genes_groups_df t-test method and filtered for genes with P-value < 0.05 and log fold change > 0.5.

Nanopore sequencing

MCF-7-dCas9 cell lines were transfected with sgRNAs for ESR1-P1, ESR1-P2, and NTC. The sequencing was undertaken by the Garvan Nanopore Facility using two PromethION flow cells after cDNA conversion and multiplexing. Guppy was used to demultiplex samples.

Nanopore analysis

Nanopore long-read sequencing data were aligned to the hg38 genome using minimap2 (-ax splice). Bambu [31] was used to quantify the BAM files produced from the minimap2 alignment [32], with annotation from the RefSeq database (March 2025, release 229) downloaded from the UCSC database. Plotbambu was used to visualize results. Differentially expressed genes were identified using DESeq2 [33] [adj-P-value < 0.05, abs(logFC) > 0.5].

Gene pathway enrichment, FUCCI-cell cycle assignment and CNV score

Spectra [34] was applied to find supervised gene programs without factor analysis biases. The enrichment used 2000 highly variable genes for assignment of gene programs using globally annotated enrichments tested over 1000 epochs, with unassigned cell type, lambda = 0.1, delta = 0.001, rho = 0.001, and kappa = None. Complementary gene ontology enrichment was performed using pygsea [67]: the prerank function was first used to order log fold-change DEGs from each guide population, and gene enrichment was then tested against GO:BP, with significant pathways defined as NOM P-value < 0.05 and false discovery rate (FDR) q-value < 0.25. Cell-cycle assignment was trained using the GSE146773 dataset, which pairs cells with known FUCCI-cell states and single-cell transcriptomic data [35]. The input count matrix was log normalized and subsetted for highly variable genes (mean range 0.0125–3 and minimum dispersion of 0.5) regulated by the cell cycle, as identified previously [36]. Dimensionality was then reduced with UMAP (30 principal components). The reduced representation was split into 0.3 testing and 0.7 training data for classification with a scikit-learn kNN model. Hyperparameter optimization identified a best score of 0.8438 for the model (n_neighbors = 11, Euclidean metric, and uniform weights). A Receiver Operating Characteristic (ROC) curve was used to compare this classifier with the scanpy default cell cycle assignment method, score_genes_cell_cycle, using the known S phase and G2M genes described previously.

inferCNV [37] was used to detect evidence of chromosomal copy number changes. The filtered, normalized matrix across all genes (not just highly variable genes) was used to compute rolling average gene expression changes for windows of 100 genes, with a dynamic threshold of 3 standard deviations for noise filtering (using infercnvpy.tl.infercnv). Genes containing oestrogen or androgen binding sites were identified by overlapping ESR1 and AR ChIP-seq peaks [22, 23] with annotated protein-coding genes.
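
The rolling-average and noise-filtering idea behind this step can be illustrated without the library. The sketch below is a simplified stand-in and does not reproduce infercnvpy's internals: the function names are hypothetical, genes are assumed to be already ordered by genomic position, and values are assumed to be expression changes centred on zero.

```python
import statistics
from typing import List


def rolling_mean(values: List[float], window: int) -> List[float]:
    """Mean of every run of `window` consecutive genes (genes ordered by position)."""
    if window < 1 or window > len(values):
        raise ValueError("window must be between 1 and len(values)")
    running = sum(values[:window])
    out = [running / window]
    for i in range(window, len(values)):
        running += values[i] - values[i - window]   # slide the window by one gene
        out.append(running / window)
    return out


def suppress_noise(values: List[float], n_sd: float = 3.0) -> List[float]:
    """Set values with magnitude below n_sd standard deviations to zero."""
    cutoff = n_sd * statistics.pstdev(values)
    return [v if abs(v) >= cutoff else 0.0 for v in values]
```

In the analysis described above the window would be 100 genes and the threshold 3 standard deviations; the smaller window in a toy example only keeps the arithmetic easy to follow.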

All figures were plotted using Seaborn and matplotlib in Python or ggpubr in R.

Survival curve

The pyGEPIA2 package [38] was used to subset 415 Luminal A patients and 192 Luminal B patients from the TCGA/GTEx data source. The hazard ratio was calculated with the Cox PH model with a 95% CI, using the group median (50%) cutoff for overall survival from transcripts driven by P1 and P2 (Supplementary Fig. S8A).

Software

Results

CRISPRi shows evidence of promoter-specific activity

To understand the scope of alternative promoter regulation in the human genome, we first analysed CAGE-seq data across >50 tissue types. This revealed that 14.8% (1444/9770) of human protein-coding genes have alternative promoters >1000 nucleotides from the major promoter. Given that the CRISPRi machinery has been shown to repress a region of ∼1000 nucleotides surrounding the sgRNA binding site [14, 39], we hypothesized that CRISPRi targeting of these genes might exhibit promoter specificity rather than gene-wide suppression.

To test this hypothesis, we evaluated the relationship between alternative promoter usage and the reported gene knockdown efficiency in published CRISPRi Perturb-Seq datasets from K562 cells [4, 24]. Using long-read Nanopore sequencing data from K562 cells [40], we quantified actively expressed promoters (> 3 supporting reads) for all protein-coding genes. Our analysis revealed a significant inverse correlation between gene knockdown efficiency [4] and the number of alternative promoters expressed per gene (Fig. 1A; P < 5.58 × 10^−37, Mann–Whitney U-test). This finding suggested that genes with multiple promoters show reduced apparent knockdown efficiency due to continued expression from untargeted alternative promoters.

Figure 1.

Genome-wide CRISPRi screens overlook alternative promoters. (A) Boxplot with outliers removed showing percentage gene knockdown from single-cell CRISPRi-dCas9 Perturb-Seq data binned by the number of expressed promoters per gene (one versus two, ***P < 1.04 × 10^−15, Mann–Whitney U-test; one versus “more than two”, ***P < 5.58 × 10^−37) across single (n = 5571), dual (n = 2291), and multi (n = 778) promoter genes. (B) Heatmap showing the relative expression of transcripts initiating from targeted promoters (1 and 2) and non-targeted promoters compared to a negative control from pseudo-bulk populations. Heatmaps on the left and right display percentage knockdown for sgRNAs targeting P1 and P2 promoters of named genes, respectively. (C) Cumulative bar plot showing the fraction of genes targeted by the Weissman sgRNA library binned into alternative promoters ≤1000 nt from the major promoter. (D) Histogram comparing the maximum genomic distribution of sgRNAs from the Weissman sgRNA library and promoters in a gene body.

CRISPRi exhibits complex regulatory dynamics at alternative promoters

Detailed examination of transcript-level expression revealed that this reduced efficiency stemmed from two distinct phenomena: persistent expression from untargeted promoters and compensatory upregulation (Fig. 1B). While targeted promoters showed consistent suppression, we found that 27.5% of untargeted promoters maintained stable expression (logFC < 0.5, relative to NTC cells). More strikingly, 11.3% of untargeted promoters displayed increased expression (logFC > 0.5), suggesting active compensation for the loss of transcripts from the targeted promoter.
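
The stable versus increased categories can be expressed as a simple thresholding rule. The sketch below is an illustration under stated assumptions: the 0.5 threshold comes from the text, while the log base (2), the pseudocount of 1 TPM, the symmetric treatment of decreases and the function names are all assumptions.

```python
import math


def log2_fold_change(kd_tpm: float, ntc_tpm: float, pseudocount: float = 1.0) -> float:
    """log2 ratio of knockdown to NTC expression, with a pseudocount (assumed)."""
    return math.log2((kd_tpm + pseudocount) / (ntc_tpm + pseudocount))


def classify_untargeted(log_fc: float, threshold: float = 0.5) -> str:
    """Label an untargeted promoter by its change relative to NTC cells."""
    if log_fc > threshold:
        return "increased"   # candidate compensatory upregulation
    if log_fc < -threshold:
        return "decreased"
    return "stable"
```

For instance, an untargeted promoter going from 1 TPM in NTC cells to 3 TPM after knockdown of the other promoter has a log2 fold change of 1.0 under these assumptions and would be labelled increased.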

To validate these findings independently, we analysed a CRISPRa dataset, where sgRNAs specifically target individual promoters for activation. Consistent with results from CRISPRi data, we observed an inverse correlation between the number of alternative promoters expressed per gene and the strength of the growth phenotype inhibition (Supplementary Fig. S1A, P < 1.18 × 10^−4, Mann–Whitney U-test). This parallel finding in both activation and repression contexts is consistent with promoter-specific rather than gene-wide effects of CRISPR-based transcriptional modulation.

Genome-wide CRISPRi libraries overlook alternative promoters

Given these findings, we evaluated how effectively current genome-wide CRISPRi libraries capture alternative promoter diversity by assessing whether promoters identified from CAGE-sequencing data separated by >1000 nucleotides are targeted in CRISPRi screens. Analysis of the distribution of sgRNAs from two highly cited genome-wide libraries (Dolcetto [9] and Replogle [4]) revealed systematic gaps in coverage of distal alternative promoters. While these libraries effectively targeted proximal alternative promoters, they failed to target most alternative promoters separated by >1000 nucleotides (Fig. 1C and D, and Supplementary Fig. S1B and C). This includes alternative promoters within 11.5% (n = 1337) of tested genes that display a tissue-specific switch in major promoter usage (Supplementary Fig. S1D–F). These data suggest that tissue or cell line variability in gene knockdown efficiency may result from alternative promoter usage and that analysis pipelines for CRISPRi/a screens should incorporate alternative promoter usage.

Isoform-specific perturb-seq enables large-scale transcript-specific analysis

To systematically investigate how alternative promoter usage influences cellular phenotypes, we developed a promoter-specific knockdown screen termed Isoform-Specific Perturb-Seq. We focused our analysis on transcription factors, chromatin modifiers, and RNA binding proteins, reasoning that these regulatory proteins would substantially impact cellular gene expression programs (Fig. 2A and Supplementary Fig. S2A), and only selected genes with promoters >1000 nucleotides apart (Fig. 2B). Additionally, 10% of the library was assigned to NTC sgRNAs, serving as the control comparison and establishing the baseline of promoter expression for the experiment and subsequent analyses.

Figure 2.

Isoform-Specific Perturb-Seq enables large-scale transcript-specific knockdown. (A) A schematic of the Promoter-Specific Perturb-Seq pipeline, from promoter identification (left), through generation of dual guides for the respective P1 and P2 promoters and creation of the MCF-7 stable cell line (centre), to 10× single-cell sequencing (right). (B) Plot of genomic and transcribed genome distances between P1 and P2 promoters for the candidate genes targeted in the screen. The transcribed genome distance represents the linear distance between the transcription start sites of P1 and P2 along the exon-only canonical Ensembl transcript annotation. This distance reflects the effective transcribed span separating the promoters within the reference transcript structure. (C) (Top) A genome browser track of ESR1 with P1 (green) and P2 (pink) promoters annotated. The top track displays the targeted primers. Only transcripts with >5 TPM from bulk RNA-seq are shown. (Bottom) Quantitative RT-qPCR results across sgRNA P1 and P2 from primers targeted to a region common to all estrogen receptor (ESR1) transcripts (left) and primers specific to P1 transcripts (right), pooled from two technical and two biological replicates. Relative expression (ΔΔCt) is shown relative to GAPDH, a housekeeping gene (from left to right, P = 0.0108, P = 0.221, P = 0.934, and P = 0.591, one-sample t-test). The blue dotted line represents the NTC sgRNA relative expression. Significance codes: P > 0.05, n.s. (not significant); P < 0.05, *. (D) Violin plots of percentage knockdown of transcripts associated with targeted promoters relative to NTCs, for sgRNAs targeting P1 promoters, comparing transcripts produced from targeted and non-targeted promoters (left, n = 35, NTC TPM > 0.5, ***P < 6.3 × 10^−5, two-sided Welch’s t-test) and P2 promoters (right, n = 35, NTC TPM > 0.5, *P < 5 × 10^−2, two-sided Welch’s t-test) (%KD = 100 × (sum TPM NTC/sum TPM KD − 1)). (E) Violin plots showing a statistical quantification of transcriptome distances between single cells with successful P1- and P2-sgRNA knockdowns (n = 31). The top plot shows adjusted P-values (Monte Carlo permutation test with Holm–Sidak multiple test correction). The bottom plot shows putative changes to the mRNA or protein isoform, defined by whether the canonical ATG resides between P1 and P2. (F) UMAP plots for BRCA1-associated C-terminal helicase 1 (BRIP1), Myb-binding protein 1A (MYBBP1A), Estrogen Receptor 1 (ESR1) and proteasome 26S subunit (PSMC5) showing different clustering of single cells with knockdown of P1 and P2 promoters. (G) A dot plot displaying the differential gene expression (DEG) analysis for P1 and P2 promoters against NTCs. Numbers of P1 and P2 differential genes for each gene are shown, with a central number displaying the number of overlapping significantly differentially expressed genes (pval < 0.05, log FC > 0.5, genes expressed in > 10% of cells; yellow lettering signifies significant e-distance) between P1 and P2.

To implement our Isoform-Specific Perturb-Seq approach, we generated MCF-7 breast cancer cells stably expressing dCas9-CRISPRi. We selected MCF-7 cells due to their well-characterized alternative promoter usage and clinical relevance in breast cancer studies. To ensure robust knockdown, we employed a dual guide RNA strategy, targeting each promoter with two sgRNAs (Fig. 2A and B, Supplementary Fig. S2, and Supplementary Tables S1 and S2; see the ‘Materials and methods’ section). Initial validation using quantitative PCR demonstrated successful promoter-specific knockdown of estrogen receptor 1 (ESR1), a gene with characterized alternative promoters (Fig. 2C and Supplementary Table S3). We also noted that targeting the P1 promoter of ESR1 significantly reduced the expression of P1-specific transcripts.

Our Isoform-Specific Perturb-Seq screen achieved high-quality single-cell transcriptome data (Median nUMIs per cell = 13 971, Median nGene per cell = 3718) across 27 063 cells (Supplementary Figs S3 and S4). After filtering out lowly expressed transcripts in the background NTC (NTC > 0.5 TPM), we observed a significant reduction in transcripts associated with targeted promoters while maintaining expression from untargeted promoters (Fig. 2D; n = 35, P1: P < 6.3 × 10−5; P2: P < 5 × 10−2, Welch’s t-test; Supplementary Fig. S5A and B, and Supplementary Table S4). This validated our ability to target individual promoters across a large-scale single-cell screen.

Alternative promoters drive distinct gene expression programs

Analysis of single-cell transcriptomes revealed that alternative promoters within the same gene can drive subtly different expression programs. Due to the sparsity and zero-inflation characteristic of single-cell RNA-seq data, volcano plots are not well suited to summarize DEG at single-cell resolution. Instead, we applied dimensionality reduction and e-distance metrics that better capture global transcriptomic divergence between promoter knockdowns. For the genes that showed significant on-target knockdown and continued off-target expression (Quantile P1 and P2 KD < 0.9, n = 31), we used e-distance metrics to quantify transcriptional divergence [30]. We found that 51.6% (16/31) of genes with alternative promoters showed significant differences, albeit with small effect size, in cellular transcriptomes when comparing P1 versus P2 knockdown (Fig. 2E), with 19.3% (6/31) showing stronger effects on the transcriptome and the rest exhibiting more subtle changes (Fig. 2E). These differences were independent of the strength of targeted KD (Supplementary Fig. S6H, Pearson’s correlation P1: r = −0.21 and P2: r = 0.02) and were reproducible between replicates (Supplementary Fig. S5E), across orthogonal techniques (Supplementary Fig. S6C and D, and Supplementary Tables S5 and S6) and relative to NTC transcriptomes (Supplementary Fig. S6), supporting their biological significance.
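
As a minimal sketch of how an e-distance comparison with a Monte Carlo permutation test can be computed, the example below uses the standard energy distance on plain Euclidean distances. The embedding (two hypothetical coordinates per cell), the number of permutations and the function names are assumptions for illustration only; the e-distance implementation cited in the text [30] may differ, for example in the distance used or the space in which it is evaluated.

```python
import math
import random


def mean_distance(xs, ys):
    """Mean Euclidean distance over all pairs (x, y), x from xs and y from ys."""
    total = sum(math.dist(x, y) for x in xs for y in ys)
    return total / (len(xs) * len(ys))


def e_distance(xs, ys):
    """Energy distance: 2*E|X-Y| - E|X-X'| - E|Y-Y'| (V-statistic form)."""
    return 2 * mean_distance(xs, ys) - mean_distance(xs, xs) - mean_distance(ys, ys)


def permutation_p_value(xs, ys, n_perm=199, seed=0):
    """Monte Carlo p-value: share of label shuffles with E >= the observed E."""
    rng = random.Random(seed)
    observed = e_distance(xs, ys)
    pooled = list(xs) + list(ys)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        if e_distance(pooled[:len(xs)], pooled[len(xs):]) >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly above zero.
    return (hits + 1) / (n_perm + 1)


# Hypothetical 2-D embeddings (not real data) for two knockdown populations.
p1_cells = [(0.1 * i, 0.0) for i in range(8)]
p2_cells = [(5.0 + 0.1 * i, 0.0) for i in range(8)]
print(e_distance(p1_cells, p2_cells), permutation_p_value(p1_cells, p2_cells))
```

When many genes are tested, the per-gene p-values would additionally need a multiple-testing correction, such as the Holm–Sidak procedure named in the Fig. 2E legend.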

We next examined whether the relative usage of each promoter could account for the transcriptional changes observed after promoter-specific CRISPRi knockdown. For each gene, we plotted the promoter usage fraction (promoter-specific TPM divided by total gene TPM) against two measures of downstream impact: the global transcriptomic divergence—quantified by the e-distance of P1- or P2-perturbed transcriptomes relative to NTC—and the total number of differentially expressed genes. Across all comparisons, promoter usage explained only a minimal proportion of the variance in downstream effects, and weakly used promoters did not consistently produce weaker perturbation outcomes. Measures of the global response, including e-distance (Supplementary Fig. S6I; P1: r = –0.16, P2: r = 0.38, Pearson correlation) and the number of significant DEGs (Supplementary Fig. S6I; P1: r = –0.14, P2: r = 0.41), showed no meaningful correlation with promoter activity. Together, these results indicate that the extent of transcriptomic divergence induced by promoter-specific CRISPRi is largely independent of both the baseline activity of the targeted promoter and the degree of on-target knockdown.
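
The promoter usage fraction and its correlation with a downstream measure can be sketched as follows. This is an illustrative example only: the input values are hypothetical, and the helper names `usage_fraction` and `pearson_r` are not taken from the published code.

```python
from math import sqrt


def usage_fraction(promoter_tpm, gene_tpm):
    """Promoter-specific TPM divided by total gene TPM."""
    if gene_tpm <= 0:
        raise ValueError("total gene TPM must be positive")
    return promoter_tpm / gene_tpm


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


# Hypothetical values: (promoter TPM, total gene TPM, e-distance versus NTC).
toy = [(2.0, 40.0, 0.8), (15.0, 30.0, 0.3), (6.0, 24.0, 1.1), (9.0, 10.0, 0.5)]
fractions = [usage_fraction(p, g) for p, g, _ in toy]
r = pearson_r(fractions, [e for _, _, e in toy])
print(f"r = {r:.2f}, variance explained r^2 = {r * r:.2f}")
```

Squaring the coefficients reported above gives the share of variance explained: roughly 0.03 (r = −0.16), 0.14 (r = 0.38), 0.02 (r = −0.14) and 0.17 (r = 0.41).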

Analysis of screen-positive genes revealed that the most functionally impactful promoter knockdowns involved transcripts contributing at least 5% of the total gene expression in MCF-7 cells. Notable examples include ESR1 and Myb-binding protein 1A (MYBBP1A), where knockdown of P1 versus P2 promoters produced distinct transcriptional landscapes (Fig. 2F), suggesting that these promoters drive distinct gene expression programs within the total expression landscape of each gene. It is important to note that in single-cell datasets, biologically meaningful differences often manifest as gradients within clusters rather than forming entirely distinct clusters [41, 42]. Further supporting this observation, differential expression analysis revealed minimal overlap between genes affected by P1 versus P2 knockdown, suggesting that the two promoters regulate largely distinct downstream pathways (Fig. 2G, Supplementary Fig. S6J–L, and Supplementary Tables S7 and S8). For the small subset of differentially expressed genes regulated by both P1 and P2 knockdown, the direction and magnitude of transcriptional changes were highly correlated, indicating a coherent shared regulatory component (Supplementary Fig. S6J–L and Supplementary Table S8). This functional divergence between promoters of the same gene highlights a previously underappreciated layer of transcriptional regulation.
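
The overlap comparison between P1 and P2 differential expression results can be illustrated with a short sketch. The thresholds follow the Fig. 2G legend (pval < 0.05, log FC > 0.5, expressed in > 10% of cells); applying the fold-change cutoff to the absolute value is an assumption, and the gene names and numbers are placeholders rather than results from the screen.

```python
from typing import NamedTuple


class DegRow(NamedTuple):
    gene: str
    pval: float
    log_fc: float
    frac_cells: float  # fraction of cells expressing the gene


def significant_genes(rows, max_p=0.05, min_abs_lfc=0.5, min_frac=0.10):
    """Genes passing the p-value, fold-change and expression-fraction filters."""
    return {
        r.gene
        for r in rows
        if r.pval < max_p and abs(r.log_fc) > min_abs_lfc and r.frac_cells > min_frac
    }


def overlap_summary(p1_rows, p2_rows):
    """Counts of P1-only, shared and P2-only DEGs plus the Jaccard index."""
    p1, p2 = significant_genes(p1_rows), significant_genes(p2_rows)
    shared, union = p1 & p2, p1 | p2
    return {
        "p1_only": len(p1 - p2),
        "shared": len(shared),
        "p2_only": len(p2 - p1),
        "jaccard": len(shared) / len(union) if union else 0.0,
    }


# Placeholder DEG tables (each knockdown versus NTC); not real data.
p1_rows = [
    DegRow("GENE_A", 0.001, 1.2, 0.40),
    DegRow("GENE_B", 0.030, -0.8, 0.25),
    DegRow("GENE_C", 0.200, 2.0, 0.50),  # fails the p-value filter
    DegRow("GENE_D", 0.010, 0.9, 0.05),  # expressed in too few cells
]
p2_rows = [
    DegRow("GENE_B", 0.004, -0.7, 0.25),
    DegRow("GENE_E", 0.020, 0.6, 0.30),
    DegRow("GENE_F", 0.010, 0.3, 0.60),  # fold change too small
]
summary = overlap_summary(p1_rows, p2_rows)
print(summary)
```

A low shared count relative to the P1-only and P2-only counts corresponds to the minimal overlap described above.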

To orthogonally validate the promoter-specific knockdown, we undertook nanopore long-read sequencing of MCF-7 cells transduced with sgRNAs targeting P1 and P2 promoters, as well as an NTC. This analysis validated the promoter-specific knockdown by CRISPRi of transcripts initiated by the respective promoters (Supplementary Fig. S6F). Furthermore, DEG analysis confirmed the minimal overlap of genes regulated by P1 versus P2 knockdown (Supplementary Fig. S6G), further supporting that promoters often drive distinct downstream pathways.

Alternative promoters regulate distinct cellular phenotypes

To initially compare the functional consequences of promoter usage, we performed GO Biological Process enrichment separately for P1 and P2 knockdowns for each gene. Across targets, P1 and P2 generally enriched distinct pathways with limited overlap, and promoter-specific differences were quantified using normalized enrichment score differences, highlighting gene-specific divergence in downstream biological programs (Supplementary Fig. S7 and Supplementary Table S9). To understand the biological significance of promoter-specific regulation more accurately, we employed Spectra [34] to identify coordinated gene expression programs associated with each promoter. Spectra analyzes single-cell RNA-seq data to calculate pathway activity scores for individual cells, enabling comparison of pathway activation across different promoter knockdowns. These cell-level pathway scores are then statistically evaluated to identify differentially activated biological processes, revealing pathway alterations specific to each promoter knockdown. This approach revealed that alternative promoters can control distinct biological processes, particularly in cell cycle regulation and proliferation pathways (Fig. 3A and Supplementary Table S10).

Figure 3.

Alternative promoters drive distinct functional gene expression programs. (A) A boxplot of the top enriched functional programs (n = 10) identified by Spectra across the MCF-7 cell line used in the screen. (B) A heatmap of the mean Spectra enrichment score for P1 and P2 promoter cell populations of target gene candidates (n = 15, successful KD and significantly different e-distance scores) across G2M (left) and G1S (right) transition (two-sided Welch’s t-test, *P-value < .05). In the Spectra cell score plot, higher scores indicate a stronger presence or activity level of a specific gene program within a cell. For context, the absolute values of latent-space scores are method-specific and not directly comparable across models; their significance lies in the relative differences observed between conditions. (C) (Top) A scatterplot displaying the percentage of cells in particular cell cycle stages [G1 (purple), S (red), G2M (green)] for genes targeted in the screen. Dashed line fitted to negative control guides (two-sided Welch’s t-test, Benjamini–Hochberg FDR). (Bottom) A barplot showing minimum percentage change, relative to NTC, of genes with promoters with differential impacts in G2M cellular populations. ESR1 is highlighted in bold. (D) (Left) Violin plots with boxplots for putative copy number variants inferred from infercnvpy for pseudo-bulk populations of ESR1 sgP1, ESR1 sgP2 and NTC (NTC versus sgP2, ***P = 1.1 × 10−7; NTC versus sgP1, n.s. P = 5.5 × 10−2; sgP1 versus sgP2, n.s. P = 8.3 × 10−1). (Right) Visualization of copy number variants per chromosome, with heatmaps highlighting individual genes by Relative CNV score and Rolling Average CNV score.

The cell cycle state for each cell was determined based on the expression of marker genes associated with G1, S, and G2/M phases [35, 43]. Cells were assigned to a specific phase by scoring gene expression profiles against established cell cycle markers (Supplementary Fig. S7). This analysis demonstrated that 34.38% of alternative promoters significantly altered cell cycle state distributions compared to controls (Fig. 3B; P < .05, two-sided Welch’s t-test, Benjamini–Hochberg FDR). The most dramatic effects were observed with ESR1, where P1 and P2 promoter knockdowns produced opposing effects on cell cycle progression (Fig. 3A and B and Supplementary Table S11). This suggests that the knockdown of alternative promoters can lead to differences in cell cycle progression.
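
Marker-based phase assignment of this kind can be sketched in a simplified form. The scoring below (mean marker expression minus the mean over all genes in the cell) is a deliberate simplification and not the scoring used in the cited methods [35, 43]; the marker lists are small illustrative subsets, and the expression values are invented.

```python
from collections import Counter

# Small illustrative marker subsets; a real analysis would use full published lists.
S_MARKERS = ["MCM2", "PCNA"]
G2M_MARKERS = ["TOP2A", "MKI67"]


def marker_score(cell, markers):
    """Mean marker expression minus the mean over all genes measured in the cell."""
    present = [cell[g] for g in markers if g in cell]
    if not present:
        return 0.0
    background = sum(cell.values()) / len(cell)
    return sum(present) / len(present) - background


def assign_phase(cell):
    """G1 when neither score is positive, otherwise the phase with the higher score."""
    s = marker_score(cell, S_MARKERS)
    g2m = marker_score(cell, G2M_MARKERS)
    if s <= 0 and g2m <= 0:
        return "G1"
    return "S" if s > g2m else "G2M"


def phase_fractions(cells):
    """Fraction of cells assigned to each phase."""
    counts = Counter(assign_phase(c) for c in cells)
    return {phase: counts[phase] / len(cells) for phase in ("G1", "S", "G2M")}


# Invented expression values for three cells (gene -> normalized expression).
cell_g1 = {"MCM2": 1, "PCNA": 1, "TOP2A": 1, "MKI67": 1, "ACTB": 4, "GAPDH": 4}
cell_s = {"MCM2": 5, "PCNA": 6, "TOP2A": 1, "MKI67": 1, "ACTB": 3, "GAPDH": 3}
cell_g2m = {"MCM2": 1, "PCNA": 1, "TOP2A": 6, "MKI67": 5, "ACTB": 3, "GAPDH": 3}
print(phase_fractions([cell_g1, cell_g1, cell_s, cell_g2m]))
```

Comparing such per-population phase fractions between a promoter knockdown and the NTC population is the kind of contrast summarized in the cell cycle analysis above.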

These transcriptional differences manifested in measurable phenotypic changes. Using inferCNV (https://github.com/broadinstitute/inferCNV), a computational tool that analyzes single-cell RNA sequencing (scRNA-seq) data to infer large-scale chromosomal copy number variations (CNVs), we identified significant increases in CNVs among cell cycle-related genes specifically in the P2 knockdown population, compared to P1 and control (Fig. 3D, Supplementary Tables S12 and S13, and Supplementary Fig. S8D; P < 1.1 × 10−7, Wilcoxon rank sum test), with CNVs identified in multiple cell cycle-related genes (Fig. 3D, right). Live cell imaging confirmed these findings, showing that P2 knockdown significantly reduced proliferation rates (Fig. 4B; P < 1.62 × 10−10, t = −6.614, linear-mixed model, Supplementary Tables S14 and S15), while P1 knockdown had minimal effect on cell growth (Fig. 4B; P < 0.0798, t = 1.757, linear-mixed model, Supplementary Tables S14 and S15). These findings suggest that the two ESR1 promoters exert divergent effects on cellular proliferation (Fig. 3B and Supplementary Fig. S9A).

Figure 4.

Promoters of the estrogen receptor display tamoxifen-specific responses. (A) Domain structures of the two major transcript isoforms of ESR1 based on RNA-seq data. AF1 = activation function-1 domain; DBD = DNA-binding domain; AF2 = activation function-2 domain. (B) Cellular proliferation results using Incucyte for untreated control, P1- and P2-knockdown cells (linear mixed model, sgP1 versus Control n.s. P < 0.0798, sgP2 versus Control P < 1.62 × 10−10; grey showing CI; Supplementary Table S14). (C) Kaplan–Meier (KM) plots displaying survival curves for patients with Luminal-A breast cancer, for transcripts originating from ESR1 promoter 1 (P1) on the left and ESR1 promoter 2 (P2) on the right, split into low (<median) and high (>median) transcript expression (hazard ratio, Cox PH Model). (D) Line plot showing cellular proliferation results in MCF-7 breast cancer cells for control, P1- and P2-knockdown lines upon tamoxifen treatment (5 µM) [linear mixed model, sgP1 versus Control P < 1.86 × 10−5, sgP2 versus Control P < 2.00 × 10−16; grey showing CI; Supplementary Table S14].

ESR1 promoter specificity determines patient outcomes

Given the clinical importance of estrogen receptor signalling in breast cancer [44], we investigated the relationship between ESR1 promoter usage and patient outcomes. Analysis of nanopore long read RNA-seq data [45, 46] across a range of MCF-7 cells revealed coordinated splicing between the P1 promoter and an alternative last exon [47] (Supplementary Fig. S8E). This results in the P1 and P2 promoters driving the production of distinct protein isoforms with coding differences in the activation function-2 (AF2) domain (Fig. 4A), a region responsible for binding estrogen, transcriptional coactivators, and selective estrogen receptor modulators (SERMs) [48]. The P1 isoform is predicted to contain a disordered C-terminal domain that potentially modifies interactions with estrogen and SERMs, while the P2 isoform produces the canonical ESR1 protein.

Analysis of TCGA data from 299 breast cancer patients [49] revealed a striking promoter-specific relationship with survival. High expression from the P1 promoter strongly correlated with decreased survival in luminal-A breast cancer patients (Hazard Ratio (HR) = 1.9, P <.015, Fig. 4C), while P2 promoter expression showed no significant association (HR = 1.2, P = 0.43) (Fig. 4C). This relationship was specific to the luminal-A subtype [50], as no significant correlation was observed in luminal-B breast cancer (Supplementary Fig. S9E), indicating a potential subtype-specific association for ESR1 promoter usage.
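
The median split and survival-curve estimation behind a Kaplan–Meier comparison of this kind can be sketched as below. The follow-up times, event flags and expression values are invented for illustration, the function names are hypothetical, and the hazard ratios reported above come from a Cox proportional hazards model, which this sketch does not implement.

```python
from statistics import median


def split_by_median(expression):
    """Label each value 'low' (< median) or 'high' (> median); ties get None."""
    cut = median(expression)
    return ["low" if v < cut else "high" if v > cut else None for v in expression]


def kaplan_meier(times, events):
    """Kaplan-Meier estimate as a list of (time, survival) at each event time.

    events[i] is 1 if the event was observed at times[i], 0 if censored.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = removed = 0
        # Group all subjects sharing the same time point.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= removed
    return curve


# Invented follow-up data: promoter expression, time (months), event flag.
expression = [1.0, 5.0, 3.0, 9.0]
times = [40, 12, 55, 20]
events = [0, 1, 1, 1]
groups = split_by_median(expression)
for label in ("low", "high"):
    idx = [i for i, g in enumerate(groups) if g == label]
    print(label, kaplan_meier([times[i] for i in idx], [events[i] for i in idx]))
```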

Differential drug response of ESR1 promoter isoforms

To understand the mechanistic basis for these clinical observations, we examined ESR1 promoter usage in the context of drug response. RNA-seq data analysis from tamoxifen-resistant MCF-7 cells [18] revealed a significant activation of P1 promoter usage compared to parental cells (Supplementary Fig. S9B and C). Resistant cells showed a 2.3-fold increase in P1 promoter usage (P <.01) with a corresponding decrease in P2 promoter activity (Supplementary Fig. S9B and C). This suggests that cells may become resistant to tamoxifen by altering their ESR1 promoter usage.

To directly test the functional impact of promoter-specific expression on drug response, we performed tamoxifen sensitivity assays in cells with P1 or P2 knockdown. Strikingly, P1 and P2 knockdown produced opposing effects on tamoxifen sensitivity. P2 knockdown cells showed heightened sensitivity to tamoxifen, with a significant reduction in proliferation rate compared to control cells (Fig. 4D; P < 2.00 × 10−16, t = −10.998, linear-mixed model, Supplementary Table S14). In contrast, P1 knockdown cells exhibited increased proliferation in the presence of tamoxifen, growing significantly faster than control cells (Fig. 4D; P < 1.86 × 10−5, t = 4.38, linear-mixed model, Supplementary Tables S14 and S15). These divergent responses were specific to tamoxifen treatment, as baseline proliferation rates showed different patterns (Fig. 4B), indicating a potential association between this promoter and isoform expression in drug resistance.

Notably, the P1 promoter shows tissue-specific expression patterns (Supplementary Fig. S9D), with high activity in breast tissue. This tissue specificity, combined with its selective role in drug resistance, indicates that targeting P1-specific transcripts could provide a more precise therapeutic strategy than current approaches that broadly target all ESR1 isoforms.

Discussion

Limitations of genome-wide functional screens and the importance of promoter-level analysis

Our findings advance our understanding of transcriptional regulation by systematically demonstrating that alternative promoters within the same gene can drive distinct cellular programs. This work exposes a critical limitation in current CRISPR-dCas9 functional screens [14, 39], where the regulatory complexity and potential compensatory mechanisms between alternative promoters are often overlooked. The discovery that 52% of surveyed transcription and chromatin factors drive distinct transcriptional programs suggests that current screening approaches may miss important regulatory dynamics, particularly in the context of gene regulation. An important note though is that differential expression in single-cell data must be interpreted cautiously due to the inherent sparsity, variability, and lack of true biological replicates. While we adopted conservative thresholds consistent with field standards, small fold changes may reflect meaningful shifts in regulatory programs rather than strong individual promoter effects. Our use of a large NTC, empirical null distributions, and phenotypic validation supports the robustness of our conclusions.

In our screen, when assessing alternative promoter usage of transcription factors in an isogenic cell line, we noted that even modest shifts in promoter usage can produce widespread effects across the transcriptome. This is consistent with findings from gene knockdown studies [51–55], where small perturbations in the expression of key transcription factors such as Oct4 in stem cells can both promote and inhibit pluripotency [54]. Conversely, in other contexts, substantial changes in gene expression may elicit minimal downstream effects, reflecting mechanisms such as stochastic buffering [56], enhancer redundancy [57], or threshold-dependent transcriptomic switching [58]. Collectively, these observations highlight the value of our Perturb-seq-based screens for initially distinguishing biological function from transcriptional noise.

Moreover, we demonstrate that untargeted promoters can exhibit compensatory upregulation, maintaining overall gene expression despite successfully targeting individual promoters. This compensation mechanism suggests that traditional measures of knockdown efficiency may underestimate the effectiveness of CRISPR-based perturbations when applied to genes with multiple promoters. These findings emphasize the need to incorporate promoter-level analysis into the design and interpretation of functional genomics studies.

Role of alternative promoters as drivers of unique cellular phenotypes

Our Isoform-Specific Perturb-Seq approach reveals a remarkable autonomy of alternative promoters in driving distinct cellular phenotypes and is complementary to recent efforts to use engineered zinc fingers to drive differential activation of alternative promoters [59]. This autonomy enables cells to generate diverse transcriptional landscapes without altering their genomic sequence, a feature particularly evident in our analysis of ESR1, where distinct promoters showed opposing effects on cell cycle progression and drug response. The ability of alternative promoters to integrate diverse regulatory signals, from transcription factors to chromatin modifications, positions them as sophisticated regulatory nodes that fine-tune cellular responses to environmental cues [11, 12, 60]. This functional divergence suggests that promoter selection represents a fundamental mechanism for generating cellular diversity rather than mere transcriptional redundancy.

Therapeutic opportunities of targeting alternative promoters in cancer

The dysregulation of alternative promoter usage is emerging as a critical mechanism in cancer progression and therapeutic resistance [12, 61]. The clinical relevance of promoter-specific regulation is particularly evident in our analysis of breast cancer, where we discovered that ESR1 promoters differentially impact tamoxifen sensitivity, with P1 knockdown conferring resistance while P2 knockdown enhanced drug sensitivity. This finding aligns with clinical data showing that P1 overexpression correlates with decreased survival in luminal-A breast cancer patients, suggesting that promoter usage patterns could serve as predictive biomarkers for treatment outcomes. Although P1-driven transcripts represent a substantial proportion of total ESR1 expression in tamoxifen-resistant cells, these data alone do not establish that P1 expression is functionally causal in resistance. The observed association may reflect broader transcriptional reprogramming or serve as a surrogate marker for other upstream regulatory changes. Functional studies beyond transcript-level perturbation and cellular proliferation assays are required to determine the phenotypic relevance of P1 expression in this context. Nevertheless, these results suggest that, more generally, monitoring promoter usage patterns could help stratify patients for targeted therapies.

Our findings also open other therapeutic possibilities. For example, our analysis revealed that the P1 promoter exhibits tissue-specific expression patterns suggesting selective targeting could reduce systemic side effects commonly associated with current estrogen receptor-targeted therapies [44]. This expands on the idea that with isoform-specific targeting, we may achieve more precise therapeutic control while minimizing off-target effects compared to broadly inhibiting gene function [62].

Moreover, our observation that specific promoters can drive drug resistance suggests new therapeutic strategies. For example, in breast cancer, targeting specific ESR1 promoters in combination with standard endocrine therapy might improve treatment outcomes by preventing the emergence of resistance through promoter switching [48, 63, 64]. Understanding compensatory promoter activation could also inform strategies to prevent or overcome drug resistance, potentially through combination therapies that target both primary and compensatory mechanisms [48, 63, 64].

Future directions and technological implications

Our work establishes a framework for systematic analysis of alternative promoter function, but several important questions remain unexplored. Future research should examine the tissue-specific nature of alternative promoter usage and its implications for drug development. Understanding the mechanisms governing compensatory promoter activation and developing improved CRISPR libraries that comprehensively target alternative promoters will be crucial. Additionally, investigating how alternative promoter usage changes during disease progression and treatment could provide valuable insights for therapeutic development. An important limitation of our work is that we did not impose a minimum expression cutoff during guide design. In hindsight, applying a threshold, such as requiring promoters to contribute at least 5% of total gene expression, may have been a valuable refinement, as the most phenotypically active promoters in our screen exceeded this level.

The integration of promoter-level analysis into drug discovery pipelines could significantly enhance our ability to develop more effective and precise therapeutic strategies. This is particularly relevant for diseases where isoform diversity plays a crucial role, including cancer, neurodegenerative disorders, and autoimmune conditions [65–67].

Conclusion

In conclusion, our Isoform-Specific Perturb-Seq approach reveals a critical yet underappreciated role of alternative promoters in generating cellular diversity. This suggests that targeting a single promoter per gene may obscure critical regulatory dynamics or yield biologically misleading outcomes. This insight implies that certain dCas9 screens may have missed important regulatory elements, potentially overlooking the complexity of gene expression regulation. Our results also highlight the potential of leveraging dCas9 technology to develop isoform-specific therapies [62], enabling precise modulation of individual protein isoform expression and thus paving the way for more targeted and effective therapeutic strategies.

Acknowledgements

We acknowledge the Garvan-Weizmann Centre for Cellular Genomics team for their assistance with the single-cell assay. Special thanks to Chia-Ling Chan, Winnie Luu, Yasmin Husaini, Eric Lam, Hanjie Wubvand, and Dominik Kaczorowski. We also acknowledge the UNSW Restech HPC Scheme DOI: 10.26190/PMN5-7J50 for computational support. We are grateful to all members of the Weatheritt lab past and present for their critical assistance with this paper and all their support.

Author contributions: Conceptualization (Robert Weatheritt), Resource (Robert Weatheritt), Data Curation (Helen King, Tim Sterne-Weiler and Robert Weatheritt), Software (Helen King), Formal analysis (Helen King, Daisy Kavanagh, Savannah O’Connell, Tim Sterne-Weiler and Robert Weatheritt), Supervision (Helaine Graziele S. Vieira, Timothy Sterne-Weiler, and Robert J Weatheritt), Funding acquisition (Robert Weatheritt), Validation (Helen King, Savannah O’Connell, Daisy Kavanagh, Sofia Mason, Cerys McCool, Javier Fernandez-Chamorro, Christine L Chaffer, Susan J Clark, Helaine Graziele S. Vieira, Timothy Sterne-Weiler, and Robert J Weatheritt), Investigation (Helen King, Helaine Graziele S. Vieira, Timothy Sterne-Weiler, and Robert J Weatheritt), Visualization (Helen King, Robert Weatheritt), Writing – original draft (Robert Weatheritt), Project administration (Helaine Graziele S. Vieira and Robert Weatheritt), Writing – review & editing (all authors)

Contributor Information

Helen E King, EMBL Australia, Garvan Institute of Medical Research, Sydney, 2010, New South Wales, Australia; St. Vincent Clinical School, University of New South Wales, Darlinghurst, 2010, New South Wales, Australia.

Savannah O’Connell, EMBL Australia, Garvan Institute of Medical Research, Sydney, 2010, New South Wales, Australia.

Daisy Kavanagh, EMBL Australia, Garvan Institute of Medical Research, Sydney, 2010, New South Wales, Australia; School of Biotechnology and Biomolecular Sciences, University of New South Wales, Sydney, 2033, Australia.

Sofia Mason, St. Vincent Clinical School, University of New South Wales, Darlinghurst, 2010, New South Wales, Australia; Cancer Ecosystems Program, Garvan Institute of Medical Research, Sydney, 2010, Australia.

Cerys McCool, St. Vincent Clinical School, University of New South Wales, Darlinghurst, 2010, New South Wales, Australia; Cancer Ecosystems Program, Garvan Institute of Medical Research, Sydney, 2010, Australia.

Javier Fernandez-Chamorro, EMBL Australia, Garvan Institute of Medical Research, Sydney, 2010, New South Wales, Australia.

Christine L Chaffer, St. Vincent Clinical School, University of New South Wales, Darlinghurst, 2010, New South Wales, Australia; Cancer Ecosystems Program, Garvan Institute of Medical Research, Sydney, 2010, Australia.

Susan J Clark, St. Vincent Clinical School, University of New South Wales, Darlinghurst, 2010, New South Wales, Australia; Cancer Ecosystems Program, Garvan Institute of Medical Research, Sydney, 2010, Australia.

Helaine Graziele S Vieira, EMBL Australia, Garvan Institute of Medical Research, Sydney, 2010, New South Wales, Australia.

Timothy Sterne-Weiler, Computational Biology & Translation, Genentech, 9, South San Francisco, 94080, United States; Discovery Oncology, Genentech, South San Francisco, 94080, United States.

Robert J Weatheritt, EMBL Australia, Garvan Institute of Medical Research, Sydney, 2010, New South Wales, Australia; School of Biotechnology and Biomolecular Sciences, University of New South Wales, Sydney, 2033, Australia.

Supplementary data

Supplementary data is available at NAR online.

Conflict of interest

None declared.

Funding

We gratefully acknowledge funding from the Australian Research Council Discovery Project Grant (DP250103133) and Future Fellowship (FT210100355), EMBL Australia, Scrimshaw Foundation, NSW Health, NSW Cancer Council and E.P. Oldham - Viertel Foundation. Funding to pay the Open Access publication charges for this article was provided by EMBL Australia.

Data availability

All single cell data produced for this paper is available at E-MTAB-14567. Source code is available at https://github.com/theheking/isoform_specific_perturb_seq and https://doi.org/10.5281/zenodo.18275435.

References

  • 1. Kampmann  M. CRISPRi and CRISPRa screens in mammalian cells for precision biology and medicine. ACS Chem Biol. 2018;13:406–16. 10.1021/acschembio.7b00657. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 2. Du  D, Roguev  A, Gordon  DE  et al.  Genetic interaction mapping in mammalian cells using CRISPR interference. Nat Methods. 2017;14:577–80. 10.1038/nmeth.4286. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 3. Gilbert  LA, Horlbeck  MA, Adamson  B  et al.  Genome-scale CRISPR-mediated control of gene repression and activation. Cell. 2014;159:647–61. 10.1016/j.cell.2014.09.029. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 4. Replogle  JM, Saunders  RA, Pogson  AN  et al.  Mapping information-rich genotype-phenotype landscapes with genome-scale perturb-seq. Cell. 2022;185:2559–2575.e28. 10.1016/j.cell.2022.05.013. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 5. Jost  M, Chen  Y, Gilbert  LA  et al.  Combined CRISPRi/a-based chemical genetic screens reveal that Rigosertib is a microtubule-destabilizing agent. Mol Cell. 2017;68:210–223.e6. 10.1016/j.molcel.2017.09.012. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 6. le Sage  C, Lawo  S, Panicker  P  et al.  Dual direction CRISPR transcriptional regulation screening uncovers gene networks driving drug resistance. Sci Rep. 2017;7:17693. 10.1038/s41598-017-18172-6. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 7. Lou  K, Steri  V, Ge  AY  et al.  KRAS(G12C) inhibition produces a driver-limited state revealing collateral dependencies. Sci Signal. 2019;12. 10.1126/scisignal.aaw9450. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 8. Mukhopadhyay  S, Huang  HY, Lin  Z  et al.  Genome-wide CRISPR screens identify multiple synthetic lethal targets that enhance KRASG12C inhibitor efficacy. Cancer Res. 2023;83:4095–111. 10.1158/0008-5472.CAN-23-2729. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 9. Sanson KR, Hanna RE, Hegde M et al. Optimized libraries for CRISPR-Cas9 genetic screens with multiple modalities. Nat Commun. 2018;9:5416. 10.1038/s41467-018-07901-8.
  • 10. FANTOM Consortium and the RIKEN PMI and CLST (DGT), Forrest AR et al. A promoter-level mammalian expression atlas. Nature. 2014;507:462–70.
  • 11. Davuluri RV, Suzuki Y, Sugano S et al. The functional consequences of alternative promoter use in mammalian genomes. Trends Genet. 2008;24:167–77. 10.1016/j.tig.2008.01.008.
  • 12. Demircioglu D, Cukuroglu E, Kindermans M et al. A pan-cancer transcriptome analysis reveals pervasive regulation through alternative promoters. Cell. 2019;178:1465–1477.e17. 10.1016/j.cell.2019.08.018.
  • 13. Hu Z, Tee WW. Enhancers and chromatin structures: regulatory hubs in gene expression and diseases. Biosci Rep. 2017;37. 10.1042/BSR20160183.
  • 14. Davies R, Liu L, Taotao S et al. CRISPRi enables promoter-specific loss-of-function screens and identification of gastric cancer-specific isoform dependencies. Genome Biol. 2021;22:47. 10.1186/s13059-021-02266-6.
  • 15. Horlbeck MA, Gilbert LA, Villalta JE et al. Compact and highly active next-generation libraries for CRISPR-mediated gene repression and activation. eLife. 2016;5:e19760. 10.7554/eLife.19760.
  • 16. Reese F, Williams B, Balderrama-Gutierrez G et al. The ENCODE4 long-read RNA-seq collection reveals distinct classes of transcript structure diversity. bioRxiv, 10.1101/2023.05.15.540865, 16 May 2023, preprint: not peer reviewed.
  • 17. Wu Y, Zhang Z, Cenciarini ME et al. Tamoxifen resistance in breast cancer is regulated by the EZH2-ERalpha-GREB1 transcriptional axis. Cancer Res. 2018;78:671–84. 10.1158/0008-5472.CAN-17-1327.
  • 18. De Angelis C, Fu X, Cataldo ML et al. Activation of the IFN signaling pathway is associated with resistance to CDK4/6 inhibitors and immune checkpoint activation in ER-positive breast cancer. Clin Cancer Res. 2021;27:4870–82. 10.1158/1078-0432.CCR-19-4191.
  • 19. Zhang J, Lee D, Dhiman V et al. An integrative ENCODE resource for cancer genomics. Nat Commun. 2020;11:3696. 10.1038/s41467-020-14743-w.
  • 20. Lizio M, Abugessaisa I, Noguchi S et al. Update of the FANTOM web resource: expansion to provide additional transcriptome atlases. Nucleic Acids Res. 2019;47:D752–8. 10.1093/nar/gky1099.
  • 21. Anvar SY, Allard G, Tseng E et al. Full-length mRNA sequencing uncovers a widespread coupling between transcription initiation and mRNA processing. Genome Biol. 2018;19:46. 10.1186/s13059-018-1418-0.
  • 22. Campbell TM, Castro MAA, de Santiago I et al. FGFR2 risk SNPs confer breast cancer risk by augmenting oestrogen responsiveness. Carcinogenesis. 2016;37:741–50. 10.1093/carcin/bgw065.
  • 23. Fletcher MN, Castro MA, Wang X et al. Master regulators of FGFR2 signalling and breast cancer risk. Nat Commun. 2013;4:2464. 10.1038/ncomms3464.
  • 24. Replogle JM, Norman TM, Xu A et al. Combinatorial single-cell CRISPR screens by direct guide RNA capture and targeted sequencing. Nat Biotechnol. 2020;38:954–61. 10.1038/s41587-020-0470-y.
  • 25. Horlbeck MA, Xu A, Wang M et al. Mapping the genetic landscape of human cells. Cell. 2018;174:953–967.e22. 10.1016/j.cell.2018.06.010.
  • 26. Abugessaisa I, Noguchi S, Hasegawa A et al. refTSS: a reference data set for human and mouse transcription start sites. J Mol Biol. 2019;431:2407–22. 10.1016/j.jmb.2019.04.045.
  • 27. Lambert SA, Jolma A, Campitelli LF et al. The human transcription factors. Cell. 2018;172:650–65. 10.1016/j.cell.2018.01.029.
  • 28. McKenna A, Shendure J. FlashFry: a fast and flexible tool for large-scale CRISPR target design. BMC Biol. 2018;16:74. 10.1186/s12915-018-0545-0.
  • 29. Ma J, Koster J, Qin Q et al. CRISPR-DO for genome-wide CRISPR design and optimization. Bioinformatics. 2016;32:3336–8. 10.1093/bioinformatics/btw476.
  • 30. Peidli S, Green TD, Shen C et al. scPerturb: harmonized single-cell perturbation data. Nat Methods. 2024;21:531–40. 10.1038/s41592-023-02144-y.
  • 31. Chen Y, Sim A, Wan YK et al. Context-aware transcript quantification from long-read RNA-seq data with Bambu. Nat Methods. 2023;20:1187–95. 10.1038/s41592-023-01908-w.
  • 32. Li H. New strategies to improve minimap2 alignment accuracy. Bioinformatics. 2021;37:4572–4. 10.1093/bioinformatics/btab705.
  • 33. Love MI, Huber W, Anders S. Moderated estimation of fold change and dispersion for RNA-seq data with DESeq2. Genome Biol. 2014;15:550. 10.1186/s13059-014-0550-8.
  • 34. Kunes RZ, Walle T, Land M et al. Supervised discovery of interpretable gene programs from single-cell data. Nat Biotechnol. 2024;42:1084–95. 10.1038/s41587-023-01940-3.
  • 35. Mahdessian D, Cesnik AJ, Gnann C et al. Spatiotemporal dissection of the cell cycle with single-cell proteogenomics. Nature. 2021;590:649–54. 10.1038/s41586-021-03232-9.
  • 36. Kowalczyk MS, Tirosh I, Heckl D et al. Single-cell RNA-seq reveals changes in cell cycle and differentiation programs upon aging of hematopoietic stem cells. Genome Res. 2015;25:1860–72. 10.1101/gr.192237.115.
  • 37. Puram SV, Tirosh I, Parikh AS et al. Single-cell transcriptomic analysis of primary and metastatic tumor ecosystems in head and neck cancer. Cell. 2017;171:1611–1624.e24. 10.1016/j.cell.2017.10.044.
  • 38. Tang Z, Kang B, Li C et al. GEPIA2: an enhanced web server for large-scale expression profiling and interactive analysis. Nucleic Acids Res. 2019;47:W556–60. 10.1093/nar/gkz430.
  • 39. Horlbeck MA, Witkowsky LB, Guglielmi B et al. Nucleosomes impede Cas9 access to DNA in vivo and in vitro. eLife. 2016;5. 10.7554/eLife.12677.
  • 40. ENCODE Project Consortium. An integrated encyclopedia of DNA elements in the human genome. Nature. 2012;489:57–74. 10.1038/nature11247.
  • 41. Becht E, McInnes L, Healy J et al. Dimensionality reduction for visualizing single-cell data using UMAP. Nat Biotechnol. 2018;37:38–44.
  • 42. Jerber J, Seaton DD, Cuomo ASE et al. Population-scale single-cell RNA-seq profiling across dopaminergic neuron differentiation. Nat Genet. 2021;53:304–12. 10.1038/s41588-021-00801-6.
  • 43. Hao Y, Stuart T, Kowalski MH et al. Dictionary learning for integrative, multimodal and scalable single-cell analysis. Nat Biotechnol. 2024;42:293–304. 10.1038/s41587-023-01767-y.
  • 44. Riggs BL, Hartmann LC. Selective estrogen-receptor modulators – mechanisms of action and application to clinical practice. N Engl J Med. 2003;348:618–29. 10.1056/NEJMra022219.
  • 45. Wiechens E, Vigliotti F, Siniuk K et al. Gene regulation by convergent promoters. Nat Genet. 2025;57:206–17. 10.1038/s41588-024-02025-w.
  • 46. Chen Y, Davidson NM, Wan YK et al. A systematic benchmark of Nanopore long-read RNA sequencing for transcript-level analysis in human cell lines. Nat Methods. 2025;22:801–12.
  • 47. Alfonso-Gonzalez C, Legnini I, Holec S et al. Sites of transcription initiation drive mRNA isoform selection. Cell. 2023;186:2438–2455.e22. 10.1016/j.cell.2023.04.012.
  • 48. Guan J, Zhou W, Hafner M et al. Therapeutic ligands antagonize estrogen receptor function by impairing its mobility. Cell. 2019;178:949–963.e18. 10.1016/j.cell.2019.06.026.
  • 49. Cancer Genome Atlas Research Network, Weinstein JN, Collisson EA, Mills GB et al. The cancer genome atlas pan-cancer analysis project. Nat Genet. 2013;45:1113–20. 10.1038/ng.2764.
  • 50. Gao JJ, Swain SM. Luminal A breast cancer and molecular assays: a review. Oncologist. 2018;23:556–65. 10.1634/theoncologist.2017-0535.
  • 51. Domingo J, Minaeva M, Morris JA et al. Nonlinear transcriptional responses to gradual modulation of transcription factor dosage. eLife. 2026;13:RP100555. 10.7554/eLife.100555.
  • 52. Liu W, Saelens W, Rainer P et al. Dissecting the impact of transcription factor dose on cell reprogramming heterogeneity using scTF-seq. Nat Genet. 2025;57:2522–35. 10.1038/s41588-025-02343-7.
  • 53. Nadig A, Replogle JM, Pogson AN et al. Transcriptome-wide analysis of differential expression in perturbation atlases. Nat Genet. 2025;57:1228–37. 10.1038/s41588-025-02169-3.
  • 54. Niwa H, Miyazaki J, Smith AG. Quantitative expression of Oct-3/4 defines differentiation, dedifferentiation or self-renewal of ES cells. Nat Genet. 2000;24:372–6. 10.1038/74199.
  • 55. St Laurent G, Shtokalo D, Tackett MR et al. On the importance of small changes in RNA expression. Methods. 2013;63:18–24. 10.1016/j.ymeth.2013.03.027.
  • 56. Kaern M, Elston TC, Blake WJ et al. Stochasticity in gene expression: from theories to phenotypes. Nat Rev Genet. 2005;6:451–64. 10.1038/nrg1615.
  • 57. Kvon EZ, Waymack R, Gad M et al. Enhancer redundancy in development and disease. Nat Rev Genet. 2021;22:324–36. 10.1038/s41576-020-00311-x.
  • 58. Kitano H. Biological robustness. Nat Rev Genet. 2004;5:826–37. 10.1038/nrg1471.
  • 59. de Mendoza A, Nguyen TV, Ford E et al. Large-scale manipulation of promoter DNA methylation reveals context-specific transcriptional responses and stability. Genome Biol. 2022;23:163. 10.1186/s13059-022-02728-5.
  • 60. Kjer-Hansen P, Weatheritt RJ. The function of alternative splicing in the proteome: rewiring protein interactomes to put old functions into new contexts. Nat Struct Mol Biol. 2023;30:1844–56. 10.1038/s41594-023-01155-9.
  • 61. Zhang M, Sjostrom M, Cui X et al. Integrative analysis of ultra-deep RNA-seq reveals alternative promoter usage as a mechanism of activating oncogenic programmes during prostate cancer progression. Nat Cell Biol. 2024;26:1176–86. 10.1038/s41556-024-01438-3.
  • 62. Kjer-Hansen P, Phan TG, Weatheritt RJ. Protein isoform-centric therapeutics: expanding targets and increasing specificity. Nat Rev Drug Discov. 2024;23:759–79. 10.1038/s41573-024-01025-z.
  • 63. Kumar P, Wu Q, Chambliss KL et al. Direct interactions with G alpha i and G betagamma mediate nongenomic signaling by estrogen receptor alpha. Mol Endocrinol. 2007;21:1370–80. 10.1210/me.2006-0360.
  • 64. Nagel A, Szade J, Iliszko M et al. Clinical and biological significance of ESR1 gene alteration and estrogen receptors isoforms expression in breast cancer patients. Int J Mol Sci. 2019;20:1881. 10.3390/ijms20081881.
  • 65. Chen C, Yue D, Lei L et al. Promoter-operating targeted expression of gene therapy in cancer: current stage and prospect. Mol Ther Nucleic Acids. 2018;11:508–14. 10.1016/j.omtn.2018.04.003.
  • 66. Kahles A, Lehmann KV, Toussaint NC et al. Comprehensive analysis of alternative splicing across tumors from 8,705 patients. Cancer Cell. 2018;34:211–224.e6. 10.1016/j.ccell.2018.07.001.
  • 67. Sterne-Weiler T, Weatheritt RJ, Best AJ et al. Efficient and accurate quantitative profiling of alternative splicing patterns of any complexity on a laptop. Mol Cell. 2018;72:187–200.e6. 10.1016/j.molcel.2018.08.018.

Associated Data

Supplementary Materials

gkag118_Supplemental_Files

Data Availability Statement

All single-cell data produced for this paper are available under accession E-MTAB-14567. Source code is available at https://github.com/theheking/isoform_specific_perturb_seq and https://doi.org/10.5281/zenodo.18275435.


Articles from Nucleic Acids Research are provided here courtesy of Oxford University Press
