Large scale compound selection guided by cell painting reveals activity cliffs and functional relationships

Maxime Sanchez; Nicolas Bourriez; Ihab Bendidi; Ethan Cohen; Ivan Svatko; Elaine Del Nery; Hamza Tajmouati; Guillaume Bollot; Laurence Calzone; Auguste Genovesio

doi:10.1038/s42003-025-09500-y

2026 Jan 13;9:225. doi: 10.1038/s42003-025-09500-y

PMCID: PMC12901981 PMID: 41530519

Abstract

Traditional structure-based pre-screen compound selection relies on the assumption that chemical similarity implies similar biological activity. This paradigm narrows the exploration of chemical space and often fails to account for functional convergence, where structurally diverse compounds act through distinct targets to produce similar phenotypic effects. As a result, compounds with therapeutic potential may be overlooked. To overcome this constraint, we introduce a training-free, transfer learning-based method for large scale compound preselection that leverages deep phenotypic profiling of human cells. Notably, this enables robust pairwise comparison of phenotypic signatures across any source of the entire JUMP-CP, the largest publicly available cell painting dataset (112,480 compounds), preserving biological signals while mitigating batch effects. Validated across 65 high-throughput assays—including in vitro and in cellulo systems—our method provides efficient pre-screen enrichment of biologically active compounds, bypassing the blind spots of structure-centric approaches. Interestingly, because it is large scale, it also allows for a comprehensive analysis of structure–phenotypic activity relationships, revealing potentially thousands of compound activity cliffs, where minimal chemical changes in structure may result in profound phenotypic shifts. We show that these cliffs capture subtle, atom-level determinants of bioactivity that cannot be accessed by structure-based models. Furthermore, we demonstrate that structurally diverse compounds targeting different genes in the same biological pathway can induce either convergent or opposite phenotypes—a phenomenon validated across 30 pathways, hundreds of genes, and thousands of compounds. Finally, to support the broader community, we propose Phenoseeker, a web-based tool enabling instant retrieval of JUMP-CP compounds with similar phenotypic profiles. 
Together, these findings position phenotypic profiling not merely as a complementary tool, but as a transformative and scalable framework for navigating chemical space through a biological lens. By capturing rich morphological signatures that reflect functional outcomes—regardless of structural similarity—this approach enables the discovery of bioactive compounds, novel mechanisms of action, and unexpected target-pathway relationships. Applied at the scale of the JUMP-CP dataset, phenotypic profiling emerges as a powerful strategy for prioritizing compounds, illuminating activity cliffs, and accelerating the identification of therapeutically relevant candidates across diverse biological contexts.

Subject terms: Chemical libraries, Cellular signalling networks, Phenotypic screening, Image processing, Cellular imaging

Using DINOv2-derived batch-corrected phenotypic embeddings from 112k compounds, Phenoseeker enables structure-independent compound selection and systematic discovery of activity cliffs and pathway-consistent phenotypic convergence.

Introduction

Over the past decades, phenotypic drug discovery (PDD) has re-emerged as a powerful strategy for identifying novel therapeutic agents^1,2. In contrast to target-based approaches that focus solely on predetermined molecular targets, such as specific proteins, PDD harnesses observable cellular or organismal phenotypes². Although both strategies have produced effective therapeutics, growing evidence indicates that PDD is more successful at delivering first-in-class drugs³. Its target-agnostic nature is especially advantageous for addressing polygenic diseases or conditions linked to undruggable targets. Moreover, PDD offers the simultaneous assessment of many potential targets⁴ and the filtering of compounds based on key criteria such as solubility, toxicity, and cellular accessibility⁵. It also entails drawbacks, including the need for subsequent target deconvolution and the challenge of deciphering off-target interactions from direct effects. Importantly, the success of a drug discovery campaign highly depends on the relevance of the compound library, which, in the case of PDD, is hard to predict. In fact, an efficient pre-screening compound selection method for PDD should select compounds that are likely to reproduce the desired phenotype in the considered assay.

A relevant approach toward this goal is the Cell Painting assay⁶, a standardized, multiplexed imaging technique introduced in 2013⁷. This method produces high-content, high-resolution images that capture compound- or genetically induced cellular changes by labeling nonspecific organelles (including nuclei, nucleoli, mitochondria, actin, and tubulin networks) chosen for their generic nature, unrelated to any specific phenotype. As a result, it provides an unbiased, holistic view of the cell state. Alone or in combination with other omics data, Cell Painting has helped to decipher cell death pathways⁸, to assess chemical exposures in non-tumorigenic breast cells⁹, to study gene roles in glaucoma¹⁰, and to predict drug-induced toxicity in muscle cells¹¹. Initially applied to U2OS cells, the assay has now been adapted for various cell lines¹², live imaging¹³, human induced pluripotent stem cells (hiPSC)¹⁴, and even 3D cell spheroids¹⁵, demonstrating its impact in both fundamental research and therapeutic discovery¹⁶.

In parallel, the resurgence of deep learning¹⁷ has revolutionized biotechnology, as highlighted by AlphaFold’s breakthrough in protein structure prediction¹⁸. Researchers have integrated ideas and architectures from diverse deep learning areas, such as representation learning¹⁹, multimodal learning²⁰, and generative models²¹ to reveal more nuanced and distinct phenotypic patterns in cell imaging data. This progress has led to novel approaches, such as InfoAlign²², Cell Painting CNN²³, or CLOOME²⁴, which can predict molecular properties or drug mechanisms of action^25–28. It has even helped uncover biases in CRISPR-Cas9 genome editing²⁵ and has been applied in genetic perturbation studies^29,30. In short, deep learning models can be used to convert microscopy images of cells to a vector called a deep phenotypic profile which can further be exploited to compute similarities between phenotypic responses to various perturbations.

If the hypothesis is made that two compound perturbations producing the same cell phenotype are likely to impact the same function, then phenotypic profiles computed from Cell Painting images could be used as a proxy to perform an efficient compound library pre-selection. We recently validated this hypothesis with a small proof of concept on 3 screens using a first Cell Painting screen of 10,000 compounds^26,31. Other works have used cell phenotypes to identify candidates most likely to exhibit a desired bioactivity (e.g., protein inhibition, cell death, or pathway activation)^27,28,31–38. Recent methods using supervised classifiers require hundreds of compounds with known bioactivities³⁸. Some works use handcrafted cell features (sometimes these features are combined with chemical structures³⁷ or genetic profiles), while other approaches use unimodal^28,33,34 or multimodal²⁶ deep encoders. Altogether, these methods require a training procedure (i.e., training data), and thus are not easily transferable to drug discovery projects. A transfer learning strategy such as the one we proposed in Cohen et al.³¹ does not necessitate any training step and is straightforward to use as it only requires a positive control reference. However, this preselection has so far been limited in scale due to the unavailability of large Cell Painting datasets.

Scaling our pre-selection method to a much larger library is now within reach, thanks to the introduction of the JUMP-CP dataset in 2023³⁹, which contains images of over 112,000 unique compound perturbations. However, to assemble this massive dataset, it was co-produced by 12 laboratories around the world, spread over many experimental batches and acquired with different microscopes. Consequently, it exhibits considerable non-biological variability, known as “batch effects”⁴⁰. Altogether, scaling compound selection to a large library distributed across multiple sites and multiple batches such as JUMP-CP is not straightforward, and it is not yet clear what combination of deep encoder, normalization strategy and profile replicate aggregation would be optimal at this scale to produce phenotypic profiles robust enough for an efficient compound selection. For this reason, to our knowledge, there is no work to date that benefits from an exhaustive view of the dataset's compound perturbations for this task.

In this work, we identified an effective encoding-normalization-aggregation strategy that makes possible a robust comparison of any pair of compounds through their phenotypic profiles across the entire JUMP-CP dataset. We then devised a training-free transfer learning approach to perform an efficient preselection of compounds prior to a screen. We thoroughly validated its efficiency across 65 diverse screens, from internal and public databases of high-throughput assays, for a total of 40k+ readouts, covering both in vitro and in cellulo bioactivities. Interestingly, we identified that our method performs large “jumps” in the vast chemical space (10³³ drug-like compounds⁴¹) using phenotypic similarity as a biological proxy. These results show a greater chemical diversity among the selected compounds, confirming preliminary observations made using a small Cell Painting dataset from one of the twelve JUMP-CP laboratories³⁸. We further demonstrated its ability to uncover activity cliffs that are typically missed by traditional structure-based selection approaches⁴². Our findings also show that compounds targeting the same signaling pathway tend to induce related phenotypic profiles, a trend we validated on 30 pathways spanning hundreds of genes and thousands of compounds. In order to facilitate real-world applications and accelerate drug discovery, we created a website (https://www.phenoseeker.bio.ens.psl.eu/) where researchers can retrieve in one click a large preselection from the 112,480 JUMP-CP compounds that are most likely to exhibit a bioactivity that is similar to the positive control query.

Results

Building phenotypic profiles that enhance signal and reduce batch effect over the whole JUMP-CP

Being able to compare cell phenotypes induced by all the compound perturbations across the whole JUMP-CP dataset required several processing steps (encoding, aggregating and normalizing). However, no clear state-of-the-art method has yet emerged for the whole JUMP-CP. We have developed a flexible pipeline (Fig. 1a) that includes various approaches for each of these steps, some of which can be found in previous works^29,40. Our goal was to identify the best combination that would minimize batch effects while preserving biologically relevant signals. The first step, extracting a feature vector from multiple 5-channel images acquired using various microscope-imaging configurations or plate formats (i.e., 384 or 1536 wells), can be achieved either using hand-crafted methods (e.g., CellProfiler⁴³) or various deep-learning encoders. These encoders may be pretrained on natural images (e.g., ResNet50⁴⁴, DINOv2⁴⁵) or on microscopy images (e.g., ChAda-ViT⁴⁶, OpenPhenom⁴⁷) (see “Methods”). Following this, the feature vectors of all fields of view from one well need to be aggregated in order to obtain one profile per experimental sample. Wells distributed over different plates in different batches from different laboratories then need to be aligned through a normalization step. For instance, one can use mock treatment control units in each batch to center and scale features or perform a Typical Variation Normalization (TVN)⁴⁸, which aligns both first-order statistics and covariance structures. Other methods from the field of transcriptomics, such as neural network–based scVI⁴⁹, mixture-model Harmony⁵⁰, or nearest neighbor Scanorama⁵¹, as well as dedicated training losses^52,53 and deep-learning architectures⁵² were considered, but these necessitate a training step which we aimed to avoid for robustness. Finally, replicate wells need to be aggregated into a single feature profile per condition referred to as “phenotypic profile”.

To identify which combination of these steps best preserves biological signal while mitigating batch effects, two metrics were used (Fig. 1b). The first metric captures the preservation of biological signal, as measured by the mean average precision (mAP) for eight positive control compounds present on every experimental plate. We seek to obtain a high value since it indicates that profiles induced by the same compound treatment are close in phenotypic space. The second metric captures batch effect removal as measured by the mAP on experimental plate labels (see “Methods”). We seek to approach the low value of a random retrieval, indicating that profiles from the same plate (therefore from the same batch and laboratory) are not particularly closer than those from different plates. By performing a thorough sweep of more than 5000 combinations (see “Methods”) of the processing steps described above, we identified a relevant combination of steps that optimizes our criteria. First, we determined that DINOv2-Giant provides a better encoding than other deep encoders (including those trained on microscopy images) or handcrafted solutions such as CellProfiler⁴³ (Fig. 1b). Following this, we determined that the most effective normalization approach consisted of fitting a whitening matrix to the DMSO negative control wells of each plate, applying it to all the sample wells of this plate⁵⁴, and finally applying an inverse normal transformation (INT) to ensure each feature follows a normal distribution across the plate (Fig. 1). Before normalization, wells from the same laboratory tend to cluster together, reflecting strong batch effects and making phenotypic profile comparison across experimental batches irrelevant. After normalization, these clusters become more intermixed while, on the contrary, positive control replicates across laboratories cluster together, indicating a reduction in batch variation and an enhancement of biological signal over experimental noise (Fig. 1d). This step enables the comparison of any pair of samples, regardless of their origin, thus making the JUMP-CP large compound library fully usable for further applications.
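The per-plate normalization described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the pipeline used in this work: the whitening variant (ZCA), the regularizer `eps`, the rank-based form of the INT, and all function names are hypothetical choices made for the example.

```python
from statistics import NormalDist

import numpy as np


def fit_whitening(controls, eps=1e-6):
    """Fit a ZCA whitening transform on the control wells of one plate.

    `eps` is an assumed regularizer that keeps the inverse square root stable.
    """
    mean = controls.mean(axis=0)
    cov = np.cov(controls, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(cov)
    w = eigvecs @ np.diag(1.0 / np.sqrt(eigvals + eps)) @ eigvecs.T
    return mean, w


def inverse_normal_transform(x):
    """Rank-based INT applied to each feature (column) across the plate."""
    n = x.shape[0]
    ranks = x.argsort(axis=0).argsort(axis=0) + 1  # ranks 1..n per column
    quantiles = (ranks - 0.5) / n                  # strictly inside (0, 1)
    inv_cdf = np.vectorize(NormalDist().inv_cdf)
    return inv_cdf(quantiles)


def normalize_plate(profiles, is_control):
    """Whiten all wells with the control-fitted transform, then apply the INT."""
    mean, w = fit_whitening(profiles[is_control])
    whitened = (profiles - mean) @ w
    return inverse_normal_transform(whitened)
```

With high-dimensional embeddings and only a few control wells per plate, the control covariance is rank-deficient, so the regularization term matters in practice; the value used here is arbitrary.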

Large-scale cell painting-based selection boosts active compound yield

By leveraging this representation, which allows comparisons across a large range of perturbations, we propose a practical compound selection method using similarity of Cell Painting phenotypic profiles as a proxy. Indeed, we hypothesized that two different compound perturbations exhibiting the same phenotypic response share similar bioactivity. Our pipeline then uses cosine similarity (see “Methods”) to compare the phenotypic profiles of 112,480 compounds in JUMP-CP to the positive control (Fig. 2a), a compound that is known to induce the desired bioactivity. Because our approach is limited to compounds contained in the JUMP-CP dataset, all validation analyses rely on the overlap between these 112,480 compounds and those tested in external cell-based assays. In the following, we name the cosine similarity between two phenotypic profiles of compounds the “phenotypic similarity” between those two compounds. This approach enables rapid and efficient selection of bioactive compounds without requiring any machine learning training or database of compounds with known bioactivities. Designed for simplicity and usability, our method allows any researcher to identify compounds with a desired bioactivity using only a single positive control. If the phenotypic profile of this positive control is not already available in the JUMP-CP dataset, it can be generated by imaging the control with the Cell Painting assay and computing a normalized phenotypic profile with our pipeline. We validated our method using 65 screens, 16 of which were previously performed at the BioPhoenix platform at Institut Curie and 49 of which were collected from ChEMBL open-source data, as well as 5 targets from the Lit-PCBA benchmark⁵⁵ (see “Methods”). To evaluate the quality of our compound selection for these screens, we employed the normalized enrichment factor (nEF), defined as the enrichment factor (EF) divided by the maximum possible EF (Fig. 2a, “Methods”). As shown in Fig. 2b, c, d, nEF values were calculated for the selection of the top 5% compounds with the highest phenotypic similarity to the positive control. We observed comparable results at other selection percentages (see Supplementary Fig. 1), and the raw EF values are provided in Supplementary Fig. 2. Because the compounds tested in the JUMP-CP dataset and in each evaluation screen only partially overlapped, we lacked perfect positive controls for each of them. To address this, we systematically treated each identified hit from Institut Curie, ChEMBL and Lit-PCBA screens (i.e., bioactive compound) as a potential positive control, calculating the nEF for each of them. We then considered two evaluation scenarios: a pessimistic scenario where the nEF is averaged over all hits, thus accounting for false positives or weaker hits (Fig. 2b, c, d, green bars); and an optimistic scenario where the highest nEF obtained was considered, reflecting a more realistic configuration where the positive control is a tool compound with a strong phenotype and a well-characterized bioactivity (Fig. 2b, c, d, blue bars). In most tested screens, selecting the top 5% compounds most similar to the positive control, under the pessimistic scenario, outperformed (or sometimes matched) the hit rate of the original selection (Fig. 2b, c, d, purple bars). Importantly, our method almost never yields a worse hit ratio than the original screens, even when suboptimal or false positive compounds were taken into account. Moreover, when a highly effective positive control is considered (the optimistic scenario), our method always drastically improves hit enrichment, occasionally achieving the best possible selection of compounds and, at the very least, consistently doubling the proportion of active compounds compared to the initial selection.
Notably, the method’s strong performance on 5 target proteins from the Lit-PCBA benchmark—which is specifically designed to assess target-dependent effects—demonstrates that it does not merely capture false positives or off-target signals, a common pitfall in phenotypic drug discovery. This target-specific validation confirms that our approach reliably enriches the library for compounds with relevant bioactivity, thereby enhancing its practical utility in drug discovery.
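The selection and its evaluation can be sketched as follows. This is a minimal illustration, not the code behind Fig. 2: the function names are hypothetical, and the EF is assumed to take its usual form (hit rate in the selection divided by hit rate in the whole screened set), since the exact expression is given only in “Methods”.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity between two phenotypic profiles."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm


def select_top_fraction(query, profiles, fraction=0.05):
    """Return the ids of the compounds most similar to the positive control."""
    ranked = sorted(
        profiles,
        key=lambda cid: cosine_similarity(query, profiles[cid]),
        reverse=True,
    )
    n_selected = max(1, math.ceil(fraction * len(ranked)))
    return ranked[:n_selected]


def normalized_enrichment_factor(selected, hits, n_total):
    """EF of the selection divided by the best EF achievable at that size."""
    n_sel = len(selected)
    base_rate = len(hits) / n_total
    ef = (len(set(selected) & set(hits)) / n_sel) / base_rate
    max_ef = (min(n_sel, len(hits)) / n_sel) / base_rate
    return ef / max_ef
```

Under this assumed definition, the nEF reduces to the number of hits recovered divided by the largest number of hits the selection could contain, so it lies between 0 and 1 regardless of the screen's base hit rate.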

Fig. 2 — a Schematic of our selection method where the top k% closest JUMP-CP neighbors (dotted sphere) of a screen positive control (red dot) are selected for screening. To evaluate the capability of this selection process to increase screening hit rate, we applied it to past screens and thus computed the fraction of hits that fall into this selection (orange dots). b–d Evaluation of our method across diverse screens and sources. Mean (green bars) and maximum (blue bars) normalized Enrichment Factor (nEF) are displayed, comparing our method with the original selection (purple bar). There are 49 screens from ChEMBL (b), 5 targets from Lit-PCBA (c), and 16 screens from the Curie Screening Platform (d). nEF is determined based on the top 5% compounds with the highest phenotypic similarity to the positive control.

Phenotypic-based compound selection increases the chemical diversity of biologically active compounds

We further investigated the structural diversity of the compounds selected by our approach. To encode the structural topology of a chemical compound, we use Morgan fingerprints, a type of circular molecular descriptor widely used in cheminformatics⁵⁶. Using the Tanimoto similarity⁵⁷, we measured the structural similarity between the positive control from each of the Institut Curie and ChEMBL screens and the compounds selected through our biological proxy (see “Methods”). This structural similarity remained consistently low (Fig. 3a) and was statistically significantly lower than the structural similarity between the positive control and the top 5% compounds most similar in structure (see “Methods” and Supplementary Table 1 for statistical details). Remarkably, the structural similarity between the positive control and the entire pool of screened compounds was comparable to that between the positive control and the top 5% of compounds with the highest phenotypic similarity (Fig. 3a). We extended this finding for each compound of the entire JUMP-CP dataset by computing the Tanimoto similarity between its top 5% with highest phenotypic similarity and its top 5% most structurally similar (Fig. 3a). To illustrate this, we generated UMAP⁵⁸ embeddings of both phenotypic profiles and Morgan fingerprints (Fig. 3b) for compounds tested in a given cell-based fluorescence assay at the BioPhoenix platform. The positive control (in red) and the compounds selected with our method (in blue) appear in distinct regions of chemical space, further illustrating our approach’s capacity to identify chemically diverse compounds that may share similar bioactivities. Our method identifies compounds with different chemical scaffolds that produce phenotypic profiles similar to the positive control (Fig. 3c). 
This broadens the chemical diversity of selected compounds, and therefore future hits, and creates opportunities to discover compounds with potentially improved ADME-Toxicity profiles or fewer intellectual property constraints. Conversely, we observed instances where compounds with highly similar structures produced very different phenotypes (Fig. 3d).
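The structural comparison can be sketched as follows. This is a minimal illustration that represents each fingerprint as the set of its "on" bit indices; in practice the Morgan fingerprints would be computed with a cheminformatics toolkit, which is not shown here. The function names and bit values are hypothetical.

```python
def tanimoto(bits_a, bits_b):
    """Tanimoto similarity between two fingerprints given as sets of on-bits."""
    a, b = set(bits_a), set(bits_b)
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def mean_similarity_to_reference(reference, fingerprints, compound_ids):
    """Mean Tanimoto similarity between a reference and a group of compounds."""
    total = sum(tanimoto(reference, fingerprints[cid]) for cid in compound_ids)
    return total / len(compound_ids)


# Hypothetical fingerprints: compare a positive control with two groups.
fingerprints = {"x": {1, 2, 3, 4}, "y": {1, 2, 3, 9}, "z": {7, 8, 9}}
control = {1, 2, 3, 4}
structurally_close = mean_similarity_to_reference(control, fingerprints, ["x", "y"])
phenotypically_close = mean_similarity_to_reference(control, fingerprints, ["z"])
```

Applying the same function to the phenotypically selected group, the structurally selected group and the whole screened set would give the three quantities compared in Fig. 3a.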

Fig. 3 — a For illustration, the Mean Tanimoto structure similarity between the positive control of a screen from Institut Curie and three groups of compounds: all compounds tested in the screen (blue), the top 5% phenotypically similar compounds (green), and the top 5% structurally similar compounds (red). The same computation is then performed for a screen from ChEMBL and the whole JUMP-CP using each compound as a positive control. Means (green triangle), medians (yellow line), quartiles are computed from : 16 samples (one per screen) for *Curie Institute*, 49 samples (one per screen) for *ChEMBL*, 10000 samples (one per compound) for *CP-JUMP*. b On the left, *UMAP*⁵⁸ *of the phenotypic profiles* for compounds tested in the screen from Institut Curie. On the right, *UMAP of the structural fingerprints* for the same compounds. Positive control is shown in red; the 5% phenotypically closest compounds are shown in blue. Arrows point to illustration compounds from (c, d). c *Examples of selected compounds* (close to the positive control in phenotypic space but distant in structural space). d *Examples of pairs of compounds* that are close in structural space but distant in phenotypic space. For each compound, chemical structures and Cell Painting images are shown. Scale bars = 50 µm.

Clustering analogs per phenotype reveals activity cliffs

The last results motivated us to explore whether the JUMP-CP dataset could also reveal structural relationships between compounds explaining variation in their phenotypic profiles. Specifically, we aimed to examine whether compounds that share the same core scaffold, even with minor chemical variations, can produce distinct phenotypic profiles, denoting “phenotypic activity cliffs.” An activity cliff is a phenomenon in which two structurally similar compounds show a significant difference in biological activity.

First, compounds from the JUMP-CP dataset were grouped according to their Bemis-Murcko scaffold⁵⁹, resulting in 532 distinct chemical series with 6 compounds or more (Fig. 4a, “Methods”). Then, in each of these series, compounds were split into two clusters of phenotypes (defined as the highest level of a hierarchical clustering of their phenotypic profiles) and the intra- and inter-cluster phenotypic and structural similarities were computed (Fig. 4a). In total, 81 series (2277 compounds, see Supplementary Fig. 3 for examples of scaffolds) showed a substantially higher intra-cluster phenotypic similarity compared to their inter-cluster phenotypic similarity (Fig. 4c, “Methods”) while showing minimal structural differences by the same metrics (Fig. 4d), thus defining potentially thousands of compound activity cliffs. We repeated this experiment using hierarchical clustering with 3, 4, and 5 clusters, consistently obtaining similar results (see Supplementary Fig. 4). This finding suggests that phenotypic profiles reveal activity cliffs that are difficult to predict by quantitative structure-activity relationship (QSAR) methods⁴². This illustrates the advantage of PDD and the value of using Cell Painting phenotypes to complement the structural description of compounds, enhancing chemical space exploration in drug discovery projects.
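The series construction and the intra- versus inter-cluster comparison can be sketched as follows. This is a minimal illustration with hypothetical function names. The scaffold of each compound and its cluster label are taken as given; they would come, respectively, from a cheminformatics toolkit and from a hierarchical clustering cut at two clusters, neither of which is shown.

```python
import math
from collections import defaultdict
from itertools import combinations


def group_series(scaffold_of, min_size=6):
    """Group compounds by scaffold and keep series with at least `min_size`."""
    series = defaultdict(list)
    for compound, scaffold in scaffold_of.items():
        series[scaffold].append(compound)
    return {s: members for s, members in series.items() if len(members) >= min_size}


def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))


def intra_inter_similarity(profiles, labels):
    """Mean pairwise cosine similarity within and between phenotype clusters."""
    intra, inter = [], []
    for a, b in combinations(profiles, 2):
        sim = cosine(profiles[a], profiles[b])
        (intra if labels[a] == labels[b] else inter).append(sim)
    return sum(intra) / len(intra), sum(inter) / len(inter)
```

A series would be flagged as a candidate for phenotypic activity cliffs when the intra-cluster value is substantially higher than the inter-cluster value while the same computation on Tanimoto similarities shows little difference; the threshold used to decide "substantially" is described in “Methods” and is not reproduced here.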

Fig. 4 — a JUMP-CP compounds were grouped by Murcko scaffold. b For each group of compounds, a hierarchical clustering based on their phenotypic profiles was performed. Phenotypic (cosine, red) and Structural (Tanimoto, blue) similarity within and between these clusters were computed. c Intra- versus inter-cluster phenotypic similarity. 81 scaffold-based groups exposing a substantially higher intra- than inter-cluster phenotypic similarity are highlighted in red. d Intra- versus inter-cluster structural similarities. The 81 scaffolds highlighted in red on panel c are highlighted in blue here. The green squares represent the example of the chemical series illustrated in (b).

In this analysis, compounds were initially grouped by their Bemis–Murcko scaffolds to reflect the practical context of medicinal chemistry, where optimization often proceeds within a chemical series defined by a common scaffold. However, Bemis–Murcko scaffolds capture only the molecular core, and compounds sharing a scaffold can still be chemically distinct. To validate and extend our observations, we adapted the Structure–Activity Landscape Index (SALI)⁶⁰ to phenotypic differences. The distribution of SALI scores across the dataset, as well as examples of phenotypic activity cliffs identified in this manner, are presented in Supplementary Fig. 5.
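One way such an adaptation could look is sketched below. The classical SALI divides the activity difference of a compound pair by one minus their structural similarity. Here the activity difference is replaced by a phenotypic distance, assumed for illustration to be one minus the cosine similarity of the two phenotypic profiles. This choice, the guard `eps` and the function name are assumptions, and the formulation actually used for Supplementary Fig. 5 may differ.

```python
def phenotypic_sali(phenotypic_similarity, structural_similarity, eps=1e-6):
    """SALI-like score for one compound pair.

    phenotypic_similarity: cosine similarity of the two phenotypic profiles.
    structural_similarity: Tanimoto similarity of the two fingerprints.
    eps: assumed guard against division by zero for identical fingerprints.
    """
    phenotypic_distance = 1.0 - phenotypic_similarity
    return phenotypic_distance / max(1.0 - structural_similarity, eps)


# A near-identical pair with opposite phenotypes scores far higher than
# a dissimilar pair with the same phenotypic distance.
cliff_like = phenotypic_sali(phenotypic_similarity=-1.0, structural_similarity=0.75)
ordinary = phenotypic_sali(phenotypic_similarity=-1.0, structural_similarity=0.0)
```

High scores single out pairs that are structurally close yet phenotypically distant, without requiring the two compounds to share a Bemis-Murcko scaffold.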

Clustering analogs per phenotype reveals key chemical functions

To expand on our previous finding, we cherry-picked examples of chemical series where compounds exhibit distinct phenotypic profiles (Fig. 5, Supplementary Fig. 6, Supplementary Fig. 7, Supplementary Fig. 8). For instance, one cluster in such a series consisted of three compounds (C1, C2, C3; Fig. 5b) that kill the cells (Fig. 5a). Interestingly, they share three specific moieties: (1) a bromine atom, (2) a methyl group, and (3) an sp3 carbon chain linked to the nitrogen atom (Fig. 5b, in green). These three compounds are also significantly different from the negative DMSO control (Fig. 5a). In contrast, compounds C4 through C11 induce different phenotypes, both from C1–C3 and from one another (Fig. 5c). These differences may stem from the protonation of the scaffold’s nitrogen (green highlight), as well as other structural variations (purple highlights). For instance, compound C9, which is the Murcko scaffold reference of this series, induces a phenotype similar to DMSO (Fig. 5a), indicating no significant phenotypic effect. Meanwhile, adding different chemical substitutions to the Murcko scaffold induces subtle changes in the cellular phenotype, yet these differences remain clearly distinguishable with our phenotypic profiles.

To find more target-specific examples, we compared the list of compound pairs displaying activity cliffs extracted from ChEMBL by Zhang et al.⁶¹ with the compounds available in the JUMP-CP dataset. Out of the 6 pairs of compounds present in both datasets, we found one example of a specific activity change (inhibition of Carbonic anhydrase VII) that is clearly reflected in the phenotypic profiles. This example is shown in Supplementary Fig. 9.

Compounds reproducing the positive control phenotype tend to hit targets in the same pathway

We further investigated whether compound perturbations reproducing the phenotype of a positive control were modulating the same biological pathways. A biological or signaling pathway is a series of biomolecular interactions, such as biochemical reactions or signaling events, through which cells respond to internal or external stimuli. Pathways regulate critical cellular processes including metabolism, gene expression, and cell signaling, enabling coordinated cellular behavior. To illustrate this hypothesis, we considered data from a fluorescence cell-based assay screen to identify inhibitors of the EGFR pathway (CHEMBL1613808). We chose Dezmapimod, a non-specific inhibitor of mitogen-activated protein kinases (MAPK), as our positive control (Fig. 6a). Then, we considered the top 5% compounds (15 compounds out of 300 that appeared both in CHEMBL1613808 and JUMP-CP) with highest phenotypic similarity to Dezmapimod. Among these 15 compounds (Fig. 6b), three were known to specifically inhibit proteins within the same pathway: EGFR for AG1478, BRD4 for Sb-202190, and MAPK8 for SP600125 (Fig. 6c). We explored a prior-knowledge database to connect these genes into a signaling pathway. Among all the existing databases, we focused on SIGNOR⁶² (see “Methods”). Although BRD4 was not part of the initial SIGNOR signaling pathway, it is known to modulate MYC activity and vice versa^63,64, and was therefore added to the network. These findings on a specific example suggest that compounds targeting the same pathway, even when they act on different proteins, can produce a similar phenotype.

Fig. 6 — a UMAP⁵⁸ of phenotypic profiles of compounds from the CHEMBL1613808 screen, designed to identify *in cellulo* inhibitors of the EGFR pathway. Positive control (***Dezmapimod***) is shown in red, while compounds selected by our method are shown in blue. b Examples of selected compounds (SC) include known EGFR pathway inhibitors, acting through distinct mechanisms of action and targeting different genes involved in the pathway. c The EGFR signaling pathway extracted from the SIGNOR database. Targets of the selected compounds are highlighted in blue, illustrating their diverse role within the pathway and confirming why structural similarity between compounds is not always expected. Scale bars = 50 µm.

Compounds targeting the same pathway tend to induce related phenotypes

We then aimed to evaluate the extent to which the previous observations, made on a single pathway and given a specific screen positive control, generalize to any signaling pathway. To this end, we cross-referenced genes associated with the 30 pathways from MSigDB that had more than 10 known compound–gene interactions in BindingDB involving compounds found in JUMP-CP. For each pathway, we identified a few hundred compounds targeting up to a few tens of genes (see “Methods”). We then compared the phenotypic similarities between those compounds against the phenotypic similarity between random compounds in JUMP-CP for each pathway (Fig. 7a). As a result, we observed that compounds targeting the same pathway exhibit a significantly higher frequency of extreme phenotypic similarities, both highly similar and almost opposite, compared to random pairs. This result is consistent across every one of the 30 studied pathways (see “Methods”, Supplementary Fig. 10 and Table 1 for all distributions and statistical details). On one hand, these findings confirmed the hypothesis that compounds targeting the same pathway could be found through their phenotypic similarity. On the other hand, interestingly, they showed that a high dissimilarity could also be used as a criterion for selection. One explanation for the latter is that compounds targeting the same pathway may trigger opposite downstream effects, leading to opposite cellular responses because of the role of the target in the pathway. As an illustration on the G2M pathway (Fig. 7), the previously described procedure revealed a significantly high number of compound pairs with either highly similar or highly dissimilar phenotypes (Fig. 7a). We then examined the effect of four well-characterized compounds known to interact with specific targets in this pathway and present in the JUMP-CP dataset (SCHEMBL1578316, Reversine, NSC-625987, and Gnf-5; Fig. 7b) that display both similar and opposite phenotypic profiles (in terms of cosine similarities, see Fig. 7c). To this end, we constructed the interaction network of the 21 targeted G2M pathway genes using OmniPath interactions via the NeKo tool⁶¹ and considered the subnetwork with only genes linked to those compounds (Fig. 7d). We found that the direction of their phenotypic similarities, either similar or opposite phenotypic profiles (Fig. 7c), is coherent with their downstream effects on the interaction network. For instance, Reversine inhibits an inhibitor (AURKB) of cell proliferation (BIRC5), leading to a pro-proliferative outcome, whereas NSC-625987 inhibits CDK4, blocking proliferation. These two compounds target the same pathway with opposite effects. Even if these observations cannot yet be generalized to any pair of compounds, the visualization of their effect on networks highlights which mechanisms each of these compounds targets and can explain the observed phenotypes. The presence of this bimodal effect (highly similar/highly opposite phenotypic profiles)³⁰ in our representation of compounds displays the power and the complexity of biological information captured by phenotypic profiles.

Fig. 7 — a Violin plot of phenotypic similarity distributions: in green, similarities between all pairs within the 530 compounds active against the G2M pathway (280900 samples); in red, similarities between the same number of compound pairs not known to be active against the G2M pathway (280900 samples). b Cellular image (Phenotypes) and chemical structures of the four compounds. Scale bars = 50 µm. c Cosine similarity matrix showing the phenotypic similarity among the four compounds, each targeting different G2M pathway genes. Negative signs show opposite phenotypic profiles, whereas positive signs show similar phenotypic profiles. d Network of gene–gene interactions among G2M pathway genes targeted by JUMP-CP compounds. G2M pathway genes were obtained from MSigDB, interactions from OmniPath, and the network was constructed using NeKo. Red edges indicate inhibition, green edges indicate stimulation, and purple edges a bimodal effect.

Table 1.

Statistical comparison of phenotypic similarities between pathway-targeting and random compound pairs

Pathway	Number of Genes	Number of Impacted Genes	Number of Compounds Active on the Pathway	Pairwise Similarity Variance of Active Compounds	Pairwise Similarity Variance of Other Compounds	Statistical Significance of Differences Between Variances	Fraction of Active Pairs with Cosine Similarity Below -0.2	Fraction of Other Pairs with Cosine Similarity Below -0.2	Statistical Significance of Differences Between Fractions (Low)	Fraction of Active Pairs with Cosine Similarity Above 0.2	Fraction of Other Pairs with Cosine Similarity Above 0.2	Statistical Significance of Differences Between Fractions (High)
Apoptosis	161	24	488	0.053	0.012	*****	11.79%	3.19%	*****	14.55%	3.27%	*****
Allograft rejection	200	21	588	0.0787	0.0121	*****	13.16%	3.26%	*****	22.84%	3.27%	*****
Androgen response	101	10	120	0.1093	0.0119	*****	14.46%	3.15%	*****	36.6%	3.22%	*****
Apical junction	200	12	395	0.0917	0.0119	*****	21.08%	3.11%	*****	25.89%	3.17%	*****
Coagulation	138	13	144	0.0331	0.0122	*****	9.25%	3.26%	*****	7.84%	3.21%	*****
Complement	200	22	486	0.0569	0.012	*****	11.12%	3.15%	*****	15.23%	3.19%	*****
E2F targets	200	17	424	0.1232	0.012	*****	16.26%	3.19%	*****	37.21%	3.22%	*****
Estrogen response early	200	15	398	0.0711	0.0121	*****	10.51%	3.26%	*****	22.9%	3.31%	*****
Estrogen response late	200	15	475	0.0726	0.0117	*****	9.73%	3.05%	*****	23.45%	3.1%	*****
G2M checkpoint	200	21	530	0.1183	0.0118	*****	13.98%	3.08%	*****	35.96%	3.2%	*****
Glycolysis	200	12	420	0.106	0.0121	*****	15.47%	3.18%	*****	31.92%	3.24%	*****
Hypoxia	200	11	267	0.0815	0.012	*****	13.55%	3.16%	*****	23.28%	3.19%	*****
IL2 stat5 signaling	199	20	484	0.0332	0.0119	*****	7.57%	3.16%	*****	9.53%	3.23%	*****
IL6 JAK STAT3 signaling	87	10	148	0.0698	0.012	*****	12.98%	3.18%	*****	18.75%	3.29%	*****
Inflammatory response	200	23	434	0.0486	0.0121	*****	13.18%	3.23%	*****	14.27%	3.25%	*****
Interferon gamma response	200	21	495	0.0377	0.0123	*****	8.57%	3.33%	*****	10.89%	3.37%	*****
KRAS signaling DN	200	13	180	0.083	0.0119	*****	15.9%	3.14%	*****	21.16%	3.15%	*****
KRAS signaling UP	200	13	153	0.0608	0.0116	*****	14.7%	3.0%	*****	16.3%	3.09%	*****
Mitotic spindle	199	18	568	0.1096	0.0122	*****	13.84%	3.27%	*****	30.83%	3.28%	*****
MTORC1 signaling	200	14	474	0.0452	0.0119	*****	8.32%	3.1%	*****	12.73%	3.16%	*****
MYC TARGETS V1	200	11	237	0.1296	0.0121	*****	15.52%	3.23%	*****	35.19%	3.26%	*****
Myogenesis	200	12	387	0.0787	0.0119	*****	19.88%	3.16%	*****	23.03%	3.17%	*****
P53 pathway	200	19	306	0.0919	0.0123	*****	16.25%	3.31%	*****	24.37%	3.31%	*****
PI3K AKT MTOR Signaling	105	18	906	0.0485	0.0121	*****	9.11%	3.25%	*****	14.31%	3.33%	*****
Spermatogenesis	135	11	342	0.1266	0.0117	*****	14.9%	3.06%	*****	42.48%	3.13%	*****
TGF BETA signaling	54	10	235	0.1151	0.012	*****	13.22%	3.17%	*****	32.08%	3.27%	*****
TNFA signaling VIA NFKB	200	16	248	0.0461	0.0118	*****	12.11%	3.13%	*****	13.05%	3.18%	*****
UV response DN	144	17	323	0.0696	0.0116	*****	18.75%	3.0%	*****	18.98%	3.02%	*****
UV response UP	158	16	654	0.0549	0.0123	*****	13.96%	3.33%	*****	16.43%	3.33%	*****
Xenobiotic metabolism	200	11	91	0.0785	0.0119	*****	14.97%	3.14%	*****	21.2%	3.18%	*****


Discussion

QSAR-based library selection methods rely on the premise that chemically similar compounds often share biological activities. While valuable, this approach inherently limits the chemical diversity of selected compounds to those resembling the reference compound in structure. Through an extensive evaluation of deep image encoding and data processing methods, we identified an approach that enables robust pairwise comparison of phenotypic profiles of compounds across the entire JUMP-CP dataset while preserving biological signals and mitigating batch effects. Leveraging this approach, we were able, for the first time, to harness this vast dataset to develop a training-free transfer learning approach for the efficient selection of bioactive compounds for screening assay. Our method was extensively validated across 65 diverse high-throughput assays from both internal and public databases, covering in cellulo and in vitro bioactivities, underscoring the broad applicability and robustness of Cell Painting phenotypic profiling across biological systems. Importantly, because it is training-free, this approach is straightforward to use, as demonstrated by our web-based tool Phenoseeker, which allows researchers to simply submit the name of a positive control from their drug discovery project and retrieve, with a single click, a list of JUMP-CP compounds most likely to exhibit similar bioactivity.

Interestingly, we enabled, for the first time, a structure-activity relationship analysis spanning the largest publicly available dataset. On one hand, we systematically identified activity cliffs, where slight modifications in compound structure result in significant phenotypic variations, highlighting crucial compound properties that would remain undetected by structure-based approaches alone. By doing so, we demonstrated that deep phenotypic profiles can provide subtle yet critical information at the atomic level. On the other hand, we found that compounds with entirely different structures can target genes in the same pathway, inducing either significantly similar or distinct phenotypic profiles. This was validated across 30 pathways, hundreds of genes, and thousands of compounds. We illustrated this through examples leveraging known compound-to-gene and gene-to-gene interaction networks, reinforcing why phenotypic similarity and dissimilarity represent meaningful metrics for selecting bioactive compounds with a diverse range of molecular structures.

While we demonstrated the effectiveness of the proposed approach, some limitations should be acknowledged. First, although JUMP-CP is the largest publicly available dataset of its kind, it provides a phenotypic view based on a single cell line: U2OS. This constraint means that compounds targeting genes not expressed in U2OS may not produce the relevant phenotypic profiles, potentially leading to missed selections. An investigation of “what can be seen” in terms of specific pathways, gene functions, or gene families was performed in a recent work³⁰. Another limitation is that compound selection is restricted to the 112,480 compounds in the JUMP-CP library. While substantial, this represents only a small subset of available compounds and an even smaller fraction of the theoretical chemical space, or even of the smaller drug-like chemical space. A possible solution to this limitation would be to train a cross-modal model, which could predict a phenotypic profile for any given compound structure, allowing us to identify similar JUMP-CP phenotypes even for novel compounds. We recently trained such a model based on CLIP on the full JUMP-CP dataset⁵³, but it has not yet been extensively evaluated as a compound selection tool in existing screens. Moreover, learning the complex relationship between chemical structure and cell phenotype remains challenging due to the discontinuities introduced by the activity cliffs and plateaus we observe in this work. While such a cross-modal model could estimate the phenotype of a given structure, it would not generate new structures likely to induce a desired phenotype, an ultimate goal in drug discovery. To achieve this, a generative model should be conditioned on target phenotypic profiles to design compounds with targeted bioactivity, and the approach we proposed here could provide a valuable foundation for such training.

Our findings underscore the power and scalability of the proposed compound pre-selection approach in both academic research and drug discovery. By harnessing phenotypic response as a biological proxy of compound bioactivity, our method transcends the constraints of structural similarity, venturing into unexplored regions of chemical space to uncover novel bioactive compounds. This leap beyond traditional QSAR methods not only enhances structural diversity but also reveals drug-like candidates that might otherwise remain hidden. While structure-based approaches rely on chemical resemblance to predict bioactivity, inherently restricting exploration, our method captures diverse biological mechanisms and drug-target interactions, offering a powerful and complementary alternative to conventional selection strategies.

Methods

Cell painting dataset

Cell Painting is a multiplexed high-content imaging assay designed to capture comprehensive phenotypic signatures of cellular responses. This method employs fluorescent dyes targeting multiple cellular compartments—including nuclei, cytoskeleton, mitochondria, endoplasmic reticulum, and Golgi apparatus.

We used the JUMP-CP dataset from the eponymous consortium³⁹, which encompasses more than 136,000 chemical and genetic perturbations represented by over 5 million images acquired across 12 laboratories. These experiments were performed following the standardized Cell Painting assay protocol, in which each compound is tested at 10 µM and U2OS cells are exposed to the compound for 48 h. For more details, see Bray et al.⁶. Each Cell Painting image consists of five grayscale channels, each highlighting one or several key cellular organelles. Each assay plate includes negative controls (DMSO mock treatment) with 32 replicates on 384-well plates and 128 replicates on 1536-well plates, and positive controls comprising eight selected compounds (Supplementary Table 2), each with 4 replicates on 384-well plates and 16 replicates on 1536-well plates.

Specifically, we used the compressed version from Watkinson et al.⁵³. It encompasses 112,480 unique chemical compounds.

Raw feature extraction from images

We utilized three main categories of models to extract features from the 5-channel Cell Painting images:

Handcrafted feature extraction: We used CellProfiler⁴³, a software designed to extract handcrafted morphological features at the single-cell level. After segmenting individual cells, features were computed per cell and then aggregated at the well level. For this study, we relied on precomputed and normalized features made publicly available by the Broad Institute (https://broadinstitute.github.io/jump_hub/howto/11_retrieve_profiles.html).
Deep learning models pretrained on natural images: Two neural network architectures originally trained on natural image datasets were adapted to our Cell Painting dataset. To accommodate the five-channel format of Cell Painting, the first convolutional layer’s weights initially trained on the RGB channels were duplicated: channels 4 and 5 used copies of the weights from the red (R) and green (G) channels, respectively. Note that the specific choice of the order of Cell Painting channels does not affect the model performance, as long as consistency is maintained during inference.

ResNet50⁴⁴, a convolutional deep neural network architecture that leverages residual connections widely used to extract features with robust performance across diverse tasks.
DINOv2⁴⁵, a state-of-the-art self-supervised Vision Transformer (ViT) that employs a teacher-student framework to learn robust, transferable representations from unlabeled data, thereby enabling efficient feature extraction across a range of tasks. We evaluated both the Small (DINOv2-S, 21 M parameters) and the Giant (DINOv2-G, 1300 M parameters) versions of DINOv2 to assess the impact of model depth and complexity on feature representation.

Deep learning models pretrained on microscopy images: We also explored two architectures specifically trained on microscopy data, offering potential advantages due to domain-specific pretraining:

ChAda-ViT (Channel Adaptive Vision Transformer)⁴⁶, a ViT model tailored for microscopy images with heterogeneous input channels (ranging from 1 to 10). ChAda-ViT employs cross-channel attention mechanisms to adaptively learn joint representations, making it well-suited to multi-channel assays like Cell Painting.
OpenPhenom⁴⁷, an open-source Masked Autoencoder (MAE)⁶⁵ architecture explicitly trained on Cell Painting images. Developed using millions of Cell Painting images, OpenPhenom learns robust and generalizable representations through a self-supervised reconstruction task.
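
The first-layer adaptation described above for the models pretrained on natural images can be sketched as follows. This is a minimal illustration using NumPy arrays rather than an actual deep learning framework; the function name `expand_first_conv` and the example weight shape are assumptions made for the example and are not taken from the original pipeline.

```python
import numpy as np


def expand_first_conv(weights_rgb: np.ndarray) -> np.ndarray:
    """Extend first-layer weights of shape (out, 3, kH, kW) to 5 input channels.

    Channels 4 and 5 receive copies of the red and green filters, respectively.
    """
    if weights_rgb.ndim != 4 or weights_rgb.shape[1] != 3:
        raise ValueError("expected weights of shape (out_channels, 3, kH, kW)")
    red = weights_rgb[:, 0:1]    # keep the channel axis so concatenation works
    green = weights_rgb[:, 1:2]
    return np.concatenate([weights_rgb, red, green], axis=1)


# Hypothetical first-layer shape (64 filters of size 7x7), used only for illustration.
rng = np.random.default_rng(0)
w_rgb = rng.normal(size=(64, 3, 7, 7))
w_5ch = expand_first_conv(w_rgb)
```

The same channel ordering must then be used for every image at inference time, in line with the consistency requirement noted above.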

A detailed summary of the compound selection results for all models tested is provided in the supplementary material (Supplementary Fig. 11).

Several notable observations emerged from our analysis. Although CellProfiler shows excellent performance for retrieving replicates of identical perturbations and effectively reducing batch effects, this may partly result from differences in normalization procedures: CellProfiler features provided by the Broad Institute were normalized using all available negative controls from the original JUMP-CP dataset, whereas our analysis relied on approximately 70% of the wells and 6 instead of 9 images per well, as we used the compressed JUMP-CP dataset version published by Watkinson et al. Interestingly, neural networks pre-trained on natural images show strong performance, despite never having encountered Cell Painting data during their training. Among these, DINOv2 stands out by producing highly discriminative phenotypic features. ChAda-ViT achieves good results, considering its relatively small training set of ~100k microscopy images (including around 55k fluorescence microscopy images). Overall, DINOv2-Giant appears to be the optimal model for our selection method, slightly outperforming OpenPhenom.

Features alignment on DMSO

We evaluated several feature-wise alignment methods to standardize phenotypic profiles, aiming to minimize batch effects while preserving biological signals. The tested methods included:

Rescaling (Res)

Each feature $x_{i}$ ($i \in \{1, \dots, n_{\text{features}}\}$, with $n_{\text{features}}$ being 192, 384, 1515 or 2048 depending on the encoder) is linearly scaled to a defined range, either [0, 1] or [-1, 1], using:

For the range [0, 1]:

$$x_{i,\text{scaled}} = \frac{x_{i} - x_{i,\min}}{x_{i,\max} - x_{i,\min}}$$

For the range [-1, 1]:

$$x_{i,\text{scaled}} = 2\,\frac{x_{i} - x_{i,\min}}{x_{i,\max} - x_{i,\min}} - 1$$

where $x_{i,\min}$ and $x_{i,\max}$ are the minimum and maximum of feature $i$ across all samples.

Removal (Rem)

Features with low variability across samples were removed, based either on standard deviation (std) across samples, or mean absolute deviation (MAD) across samples, using thresholds.

with

$$\mathrm{MAD}(x_{i}) = \mathrm{median}\left(\left| x_{i} - \mathrm{median}(x_{i}) \right|\right)$$

Z-score normalization (Z)

Each feature $x_{i}$ is standardized to zero mean and unit variance using its mean across samples ($\mu_{i}$) and standard deviation across samples ($\sigma_{i}$):

$$x_{i,\text{Zscore}} = \frac{x_{i} - \mu_{i}}{\sigma_{i}}$$

Robust Z-score normalization (rZ)

A robust variant using the median across samples and the median absolute deviation across samples, which is less sensitive to outliers:

$$x_{i,\text{RobustZscore}} = \frac{x_{i} - \mathrm{median}(x_{i})}{\mathrm{MAD}(x_{i})}$$
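
As an illustration of the two standardizations above, the following minimal sketch applies them to a single feature. The optional `reference` argument (for example, the DMSO wells of a plate) and the use of the population standard deviation are assumptions made for this example; the function names are hypothetical.

```python
from statistics import mean, median, pstdev


def mad(values):
    """Median absolute deviation: median(|x - median(x)|)."""
    med = median(values)
    return median([abs(v - med) for v in values])


def z_score(values, reference=None):
    """Standardize `values` with the mean and std of `reference` (default: the values)."""
    ref = values if reference is None else reference
    mu, sigma = mean(ref), pstdev(ref)  # population std is an assumption of this sketch
    if sigma == 0:
        raise ValueError("feature has zero standard deviation")
    return [(v - mu) / sigma for v in values]


def robust_z_score(values, reference=None):
    """Standardize `values` with the median and MAD of `reference` (default: the values)."""
    ref = values if reference is None else reference
    med, spread = median(ref), mad(ref)
    if spread == 0:
        raise ValueError("feature has zero MAD")
    return [(v - med) / spread for v in values]
```

With the values `[1, 2, 3, 4, 100]`, the robust variant centers on the median (3) and scales by a MAD of 1, so the single outlier does not inflate the scale applied to the other samples.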

Inverse normal transform (Int)

Features are mapped onto a standard normal distribution by applying the rank-based inverse normal transformation:

$$x_{i,\text{INT}} = \Phi^{-1}\left(\frac{\mathrm{rank}(x_{i}) - 0.5}{N}\right)$$

where $\Phi^{-1}$ is the inverse cumulative normal distribution and $N$ is the number of samples.
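
The rank-based transformation above can be sketched with the Python standard library alone. Ranks start at 1, and ties are broken by order of appearance; the tie-handling rule is an assumption of this example, as it is not specified in the text.

```python
from statistics import NormalDist


def inverse_normal_transform(values):
    """Map one feature onto a standard normal via (rank - 0.5) / N."""
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])  # indices sorted by value
    ranks = [0] * n
    for rank, index in enumerate(order, start=1):
        ranks[index] = rank
    inv_cdf = NormalDist().inv_cdf  # inverse cumulative standard normal
    return [inv_cdf((r - 0.5) / n) for r in ranks]


transformed = inverse_normal_transform([10.0, 30.0, 20.0])
```

Because only ranks are used, the output is unchanged by any monotonic rescaling of the input feature.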

Sphering (Sph)

We whiten each plate’s feature matrix $X \in \mathbb{R}^{n \times d}$ (with $n$ samples, $d$ features) by centering and then multiplying by the inverse square root of its covariance. Concretely:

$$X_{\text{Centered}} = X - \mu$$

where $\mu$ is the feature-wise mean vector.

$$\Sigma = \frac{1}{n - 1} X_{\text{Centered}}^{T} X_{\text{Centered}} = V \Lambda V^{T} \quad \text{(eigen-decomposition)}$$

where $\Sigma$ is the covariance matrix, $V$ contains its eigenvectors, and $\Lambda = \mathrm{diag}(\lambda_{1}, \dots, \lambda_{d})$ its eigenvalues.

Two whitening methods were tried:

PCA-whitening: $X_{\text{Sph-PCA}} = X_{\text{Centered}} V (\Lambda + \epsilon I)^{-\frac{1}{2}} \sqrt{n - 1}$
ZCA-whitening: $X_{\text{Sph-ZCA}} = X_{\text{Centered}} V (\Lambda + \epsilon I)^{-\frac{1}{2}} V^{T} \sqrt{n - 1}$

where $\epsilon = 10^{-6}$ is a small “fudge factor” to avoid division by zero on near-rank-deficient plates.

ZCA preserves each sample’s orientation in the original feature space, while PCA rotates onto the principal-component axes.
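
A minimal NumPy sketch of the two whitening variants is given below. The function name `sphere` is hypothetical, and the constant $\sqrt{n - 1}$ factor of the formulas above is left out in this sketch (an assumption made so that the output has identity covariance under the $1/(n-1)$ convention); a global rescaling of this kind does not change cosine similarities between profiles.

```python
import numpy as np


def sphere(X: np.ndarray, method: str = "ZCA", eps: float = 1e-6) -> np.ndarray:
    """Whiten the rows of X (n samples x d features) with PCA or ZCA whitening."""
    if method not in ("PCA", "ZCA"):
        raise ValueError("method must be 'PCA' or 'ZCA'")
    n = X.shape[0]
    X_centered = X - X.mean(axis=0)
    cov = X_centered.T @ X_centered / (n - 1)
    eigvals, V = np.linalg.eigh(cov)          # cov = V diag(eigvals) V^T
    inv_sqrt = 1.0 / np.sqrt(eigvals + eps)   # diagonal of (Lambda + eps I)^(-1/2)
    W = V * inv_sqrt                          # same as V @ diag(inv_sqrt)
    if method == "ZCA":
        W = W @ V.T                           # rotate back to the original axes
    return X_centered @ W
```

To fit the transform on negative controls only, the mean and covariance would be estimated on the DMSO rows and the resulting `W` applied to all wells of the plate; that variant is not shown here.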

Each method was evaluated individually as well as in various combinations. For Z-score and Sphering methods, transformations were fitted either solely on negative controls (DMSO wells) or across all wells on a given plate. All normalization procedures were performed on a plate-wise and feature-wise basis.

Metric used to compare phenotypic profiles

In this article, we used cosine similarity to compare phenotypic profiles due to its ability to identify biologically related samples based on the similarity in their patterns of change rather than on the magnitude of these changes⁶⁶. Additionally, cosine similarity provides directional information, capturing the relationship orientation between two phenotypic profiles.

The cosine similarity is defined as:

$$\mathrm{CosineSimilarity}(a, b) = \frac{a \cdot b}{\lVert a \rVert \, \lVert b \rVert}$$

where $a \cdot b$ is the scalar product between $a$ and $b$, and $\lVert \cdot \rVert$ is the Euclidean norm of the vector.
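
For illustration, the definition above translates directly into a few lines of Python. The toy vectors in the usage lines are placeholders, not real phenotypic profiles.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two profiles of equal length."""
    if len(a) != len(b):
        raise ValueError("profiles must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


same_direction = cosine_similarity([1.0, 2.0, 3.0], [2.0, 4.0, 6.0])
opposite = cosine_similarity([1.0, 0.0], [-1.0, 0.0])
orthogonal = cosine_similarity([1.0, 0.0], [0.0, 1.0])
```

The first pair differs only in magnitude and is therefore maximally similar, while the second pair points in opposite directions, which is the situation referred to as opposite phenotypic profiles in the text.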

Metrics and data used to compare profiling methods

We employed specific metrics tailored to our downstream application—bioactive compound selection through ranking and retrieval. Although these metrics are not flawless, they effectively capture our primary objectives. In particular, we utilized the mean average precision (mAP)⁶⁷ metric, a standard evaluation measure widely employed in information retrieval and ranking tasks. Specifically, the average precision (AP) for a given query is calculated by averaging the precision values at each position in the ranked list where a relevant item appears. The mAP is then obtained by averaging AP scores across all queries, effectively capturing both the quality of retrieved items and their ranking order. Higher mAP scores thus indicate superior retrieval performance and accuracy.

For a given query (the phenotypic profile of a well), the Average Precision (AP) is defined as:

$$AP = \frac{1}{R} \sum_{k = 1}^{n_{\text{well profiles}}} P(k) \cdot \mathbb{I}\{\text{label}_{k} = \text{label}_{\text{query}}\}$$

where:

$P(k)$ is the precision at rank $k$ (here, $\frac{\text{number of true positives in the top } k \text{ wells}}{k}$),
$R$ is the total number of relevant well profiles (i.e., those with the same label as the query),
$\mathbb{I}\{\cdot\}$ is the indicator function (1 if the condition holds, 0 otherwise).
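
The AP and mAP definitions above can be sketched as follows, assuming that each query comes with the labels of all other wells already sorted by decreasing cosine similarity to the query. The function names and the toy labels are hypothetical.

```python
def average_precision(ranked_labels, query_label):
    """AP of one query, given the labels of the retrieved wells in ranked order."""
    hits = 0
    precision_sum = 0.0
    for k, label in enumerate(ranked_labels, start=1):
        if label == query_label:
            hits += 1
            precision_sum += hits / k  # P(k), counted only at relevant ranks
    if hits == 0:
        return 0.0  # convention of this sketch when no relevant well exists
    return precision_sum / hits        # hits equals R once the full list is ranked


def mean_average_precision(queries):
    """mAP over (ranked_labels, query_label) pairs."""
    scores = [average_precision(ranked, label) for ranked, label in queries]
    return sum(scores) / len(scores)


ap_example = average_precision(["A", "B", "A", "C"], "A")
```

In this toy ranking the relevant wells sit at ranks 1 and 3, so the AP is the mean of the precisions 1 and 2/3.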

To quantify batch effect reduction, mAP was computed using phenotypic profiles from both negative (DMSO) and positive control wells, with experimental batch identifiers serving as labels. Conversely, to assess the preservation of biological signals, mAP was computed exclusively on positive control wells, using compound identities as labels. The mAP was not computed with sample wells to avoid potential bias due to sample phenotypic profiles distributions.

We can observe that the 8 positive controls do not all have the same phenotypic impact on cells (Supplementary Fig. 12, Supplementary Fig. 13). Indeed, some have strong effects (JCP2022_046054, JCP2022_037716 or JCP2022_035095), some have weaker effects (JCP2022_012818 or JCP2022_064022) and some are not distinguishable from the DMSO negative control (JCP2022_085227, JCP2022_050797). This explains the large differences in mAP values between those compounds, and why the average mAP across the 8 compounds is only around 0.5 for the best models and normalizations. The strong batch effect observed in Supplementary Fig. 13 also explains the mAP values.

For computational efficiency, we initially evaluated all possible combinations of normalization methods (over 5,000 in total) using a subset of five plates per laboratory, each selected from distinct experimental batches. Subsequently, the 100 best-performing combinations were re-evaluated using approximately half of the JUMP-CP dataset (~600 plates), explicitly chosen to maximize stratification across experimental batches and laboratories. This evaluation process was conducted independently for each feature extractor: ChAda-ViT, ResNet50, DINOv2-G, DINOv2-S, and OpenPhenom.

Figure 1 shows the performance results for all tested combinations, while Supplementary Fig. 14 presents detailed results for the 100 best-performing combinations evaluated on the larger subset (~600 plates). Importantly, results remained consistent from the initial smaller set of plates (45 plates) to the expanded validation (600 plates), supporting the robustness of our normalization approach.

Compound structural representation using Morgan fingerprints

To describe the structural topology of chemical compounds, we computed Morgan fingerprints⁵⁶ using the RDKit toolkit. Specifically, Morgan fingerprints were generated with a radius of 2 and a bit-vector length of 1024. This fingerprinting method encodes structural information by iteratively considering atom neighborhoods within the specified radius, capturing both local chemical environments and global topological properties of the compound. The similarity between compounds was quantified using the Tanimoto similarity⁵⁷, a widely adopted measure for comparing molecular fingerprints. The Tanimoto similarity assesses the overlap of structural features between pairs of compounds.

It is defined as:

$$T(A, B) = \frac{c}{a + b - c}$$

where:

$a$ and $b$ are the number of bits set to 1 in the fingerprints of compounds $A$ and $B$, respectively,
$c$ is the number of bits set to 1 in both fingerprints.
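
As a small worked example of this formula, the sketch below represents each fingerprint by the set of its on-bit indices. The bit indices are made up for illustration; in practice they would come from the Morgan fingerprints computed with RDKit, which is not used here.

```python
def tanimoto(bits_a: set, bits_b: set) -> float:
    """Tanimoto similarity between two fingerprints given as sets of on-bit indices."""
    a = len(bits_a)            # bits set in fingerprint A
    b = len(bits_b)            # bits set in fingerprint B
    c = len(bits_a & bits_b)   # bits set in both fingerprints
    if a + b - c == 0:
        return 0.0             # convention of this sketch for two empty fingerprints
    return c / (a + b - c)


# Hypothetical on-bit indices within a 1024-bit fingerprint.
fp_a = {1, 5, 9, 42}
fp_b = {5, 9, 100}
similarity = tanimoto(fp_a, fp_b)
```

Here $a = 4$, $b = 3$ and $c = 2$, giving $T = 2 / (4 + 3 - 2) = 0.4$.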

Chemical diversity of JUMP-CP

We compute a UMAP projection of the Morgan fingerprint of all compounds in JUMP-CP alongside those from ChEMBL and DrugBank to assess the extent of the chemical space covered by JUMP-CP compounds relative to known or previously studied compounds (see Supplementary Fig. 15). These UMAP projections show that JUMP-CP compounds broadly overlap with both ChEMBL and DrugBank while also occupying distinct regions in chemical space, underscoring the shared and novel structural diversity within the JUMP-CP dataset.

Assays description

Screen

A high-throughput experimental procedure to rapidly evaluate the activity of a library of compounds.

Assay

A standardized test that quantifies or qualifies the interaction, activity, or presence of a compound or biomolecule, used to validate screening results.

Institut Curie BioPhoenix screens (Supplementary Fig. 16). We included assays from the Institut Curie’s BioPhoenix screening platform based on specific criteria. Assays were selected if at least 100 compounds from the JUMP-CP dataset were screened, and if at least five active compounds (hits) were identified in the original assay. These screens were all conducted using cellular models, primarily employing imaging-based readouts, although some relied on fluorescence-based methods such as CellTiter-Glo. Many assays featured overlapping sets of compounds, typically originating from shared chemical libraries like the Prestwick library. Despite this overlap, hits identified varied significantly from one assay to another, even when the same compounds were tested. In total, 16 BioPhoenix screens met these criteria.

ChEMBL screens (https://www.ebi.ac.uk/chembl/) (Supplementary Fig. 17). We also incorporated assays from the ChEMBL database, applying the same criteria as those used for BioPhoenix assays. Assays included had at least 5 active and 100 inactive compounds also present in the JUMP-CP dataset. To focus specifically on challenging screening tasks, we retained only assays with a hit rate lower than 35%. The selected ChEMBL assays represent diverse experimental methodologies and biological contexts, covering in vitro and in cellulo systems. Overall, this resulted in 44 selected ChEMBL screens.

Lit-PCBA benchmark⁵⁵ (Supplementary Fig. 18). Finally, we utilized the Lit-PCBA dataset for additional benchmarking. From Lit-PCBA, we retained only targets with at least 10 active compounds and a minimum of 100 total compounds present in the JUMP-CP dataset. This filtering resulted in five targets meeting our criteria: ALDH1, MAPK1, PKM2, CDR, and ESR1-antagonist.

Collectively, the Lit-PCBA targets and the ChEMBL and BioPhoenix assays encompassed significant chemical diversity, covering more than 10,000 unique compounds from the JUMP-CP dataset, representing around 10% of the dataset in its entirety.

A supplementary table for each source (BioPhoenix, ChEMBL, Lit-PCBA) provides detailed information including the screen or target identifiers, assay type, hit rate, and total number of tested compounds.

Additionally, Supplementary Fig. 19 illustrates the proportion of the JUMP-CP dataset evaluated across all selected assays.

Normalized enrichment factor

To evaluate the ability of our method to select bioactive compounds, we compute enrichment factors⁶⁸. We chose a threshold of 5% to select at least 5 compounds for all screens where we have 100 tested compounds (but results are consistent across thresholds, see Supplementary Fig. 1). Those values were normalized to help the comparison across screens with a very diverse number of compounds and hit rate. A normalized enrichment factor (nEF) of 100% means the best possible selection of compounds.

The enrichment factor at 5% is computed as:

$$EF_{5\%} = \frac{a / n}{A / N} = \frac{a \cdot N}{n \cdot A}$$

where:

$a$ is the number of active compounds in the top 5% selected compounds,
$n$ is the total number of compounds selected (5% of $N$),
$A$ is the total number of actives in the dataset,
$N$ is the total number of compounds tested.

The Normalized Enrichment Factor (nEF) is then defined as:

$$nEF_{5\%} = \frac{EF_{5\%}}{EF_{\text{ideal},5\%}} \cdot 100$$

where $EF_{\text{ideal},5\%}$ is the theoretical maximum enrichment at 5% (i.e., when all selected compounds are active or all active compounds are selected).
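
The two quantities above can be sketched as follows, assuming the screened compounds are already ranked by decreasing phenotypic similarity to the positive control and encoded as 1 (active) or 0 (inactive). The function names, the rounding rule used for $n$, and the toy ranking are assumptions of this example.

```python
def enrichment_factor(ranked_is_active, fraction=0.05):
    """EF at the given fraction for a ranked list of 0/1 activity flags."""
    N = len(ranked_is_active)
    A = sum(ranked_is_active)
    if A == 0:
        raise ValueError("no active compound in the screen")
    n = max(1, int(round(fraction * N)))  # rounding rule is an assumption
    a = sum(ranked_is_active[:n])
    return (a * N) / (n * A)


def normalized_enrichment_factor(ranked_is_active, fraction=0.05):
    """nEF in percent: EF divided by the best EF reachable at this fraction."""
    N = len(ranked_is_active)
    A = sum(ranked_is_active)
    n = max(1, int(round(fraction * N)))
    ef = enrichment_factor(ranked_is_active, fraction)
    ef_ideal = (min(n, A) * N) / (n * A)  # all selected active, or all actives selected
    return 100.0 * ef / ef_ideal


# Toy screen: 100 compounds, 10 actives, 3 of which are ranked in the top 5.
ranking = [1, 1, 1, 0, 0] + [1] * 7 + [0] * 88
ef_5 = enrichment_factor(ranking)
nef_5 = normalized_enrichment_factor(ranking)
```

In this toy case $EF_{5\%} = (3 \cdot 100) / (5 \cdot 10) = 6$ and the ideal value is 10, so the nEF is 60%.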

Bemis-Murcko scaffolds and phenotypic subclusters

To systematically group compounds according to their structural similarity, we computed molecular scaffolds using RDKit’s implementation of the Bemis–Murcko scaffold definition. Note that RDKit’s implementation differs slightly from the original Bemis–Murcko definition⁵⁹: it retains the first atom of exocyclic substituents attached via double bonds, distinguishing certain structural motifs (e.g., differentiating between C1CC1=O and C1CC1). Further details about these differences are documented in the RDKit community discussion (https://github.com/rdkit/rdkit/discussions/6844).

Once scaffolds were generated, we performed hierarchical clustering within each scaffold-defined group to further subdivide compounds into distinct phenotypic subgroups. We retained groups where each cluster contained at least three compounds to ensure sufficient representation for meaningful analysis. To explore the robustness of our approach, we systematically tested scenarios with two, three, four, or five clusters per scaffold, resulting in corresponding minimal scaffold group sizes of 6, 9, 12, or 15 compounds.

Detailed statistical results from these different clustering configurations are provided in Supplementary Fig. 4.

Scaffolds were selected as exposing subclusters with substantially different phenotypes when intra-cosine phenotypic similarity was higher than inter-cosine phenotypic similarity by at least 0.2 (in Fig. 4). The value 0.2 corresponds to a cosine similarity greater than 3 standard deviations from the mean of the whole dataset (see Supplementary Fig. 20 for distribution of similarity across all JUMP-CP compounds).

Pathways and networks construction

The EGFR pathway used in Fig. 6 corresponds to the signaling network provided by The SIGnaling Network Open Resource (SIGNOR 3.0, https://signor.uniroma2.it), keeping only interactions with a confidence score above 0.2. Additionally, the Brd4 gene was manually included based on evidence from previous literature⁶³,⁶⁴.

For Fig. 7 and Table 1, gene sets were obtained from The Molecular Signatures Database (MSigDB, https://www.gsea-msigdb.org/), a curated collection of annotated gene sets derived from diverse biological sources, including published experimental data, expert-curated pathways, and computational predictions. We used the Hallmark gene sets, which focus on cancer, although the analysis can be generalized to other gene set collections. Compound-target relationships were retrieved from BindingDB (https://www.bindingdb.org/), a comprehensive database of experimentally validated compound–gene interactions. We retained only compounds present in both BindingDB and the JUMP-CP dataset, along with genes targeted by these remaining compounds. Detailed statistics regarding genes, filtered genes, and remaining compounds are presented in Table 1.

The gene interaction network shown in Fig. 7 was constructed using the NeKo tool⁶⁶, which aims at reconstructing connected networks from curated prior-knowledge databases (i.e., OmniPath (https://omnipathdb.org/), a comprehensive resource for intra- and intercellular signaling knowledge). We built the network specifically with the 21 genes from the G2M pathway targeted by JUMP-CP compounds and focused on the SIGNOR database only. The complete NeKo-generated network is available in Supplementary Fig. 21. The subnetwork in Fig. 7d highlights interactions relevant to the four selected compounds.

Pathways and phenotypes table and statistics

For each gene set corresponding to a given biological pathway, we retained only those pathways in which at least 10 distinct genes were targeted by at least 50 compounds from the JUMP-CP dataset, resulting in 30 pathways for further analysis. For each selected pathway, we computed pairwise cosine similarities between phenotypic profiles of all compounds known to target genes within that pathway. As a comparative control, we also computed cosine similarities among a random set of non-targeting compounds, sampled at a ratio of 10 non-targeting compounds per active compound. The resulting distributions of cosine similarities for each pathway, including both active and random compound pairs, are shown in Supplementary Fig. 10.

Statistics and reproducibility

All metrics for batch effect removal, signal conservation (mean Average Precision) and bioactive compound retrieval (Enrichment Factors) are deterministic. Replicates are defined at the field-of-view (FOV) or well level, as FOVs or wells that have been exposed to the same chemical compound perturbation (or no perturbation). FOV replicates are technical replicates, and well replicates are biological replicates. All metrics for batch effect removal and signal conservation are evaluated with biological replicates only.

To compare the distributions of Tanimoto similarities in Fig. 3, we first checked the normality of the paired differences using the Shapiro–Wilk test at a significance level of α = 0.05. Because most differences did not follow a normal distribution (p < 0.05, see Supplementary Table 1), we employed the nonparametric Wilcoxon signed-rank test with a one-sided alternative hypothesis (e.g., “Top 5% Structures” > “All Tested Compounds”). This approach provides a robust evaluation of whether one distribution exhibits systematically higher (or lower) Tanimoto similarity than another.
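
The two-step test can be sketched with SciPy as follows. This is an illustrative sketch under stated assumptions: the function name `compare_paired` is hypothetical, and the inputs are assumed to be two aligned arrays in which position i of each array holds the values for the same paired unit.

```python
import numpy as np
from scipy import stats

def compare_paired(a, b, alpha=0.05):
    """Shapiro-Wilk on the paired differences, then a one-sided Wilcoxon
    signed-rank test of the alternative hypothesis a > b."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    diff = a - b
    shapiro_p = stats.shapiro(diff).pvalue
    # The signed-rank test is reported regardless of the normality result,
    # mirroring the choice of a single nonparametric test for all comparisons.
    res = stats.wilcoxon(a, b, alternative="greater")
    return {
        "shapiro_p": float(shapiro_p),
        "differences_look_normal": bool(shapiro_p >= alpha),
        "wilcoxon_stat": float(res.statistic),
        "wilcoxon_p": float(res.pvalue),
    }
```

Swapping the two arguments tests the opposite direction ("lower" instead of "higher").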

The resulting p-values were then binned by significance threshold (p < 0.05, < 0.01, < 0.001, etc.) and depicted in the box plots using star notation (e.g., ∗, ∗∗, ∗∗∗, etc.). All test statistics and raw p-values are reported in Supplementary Table 1, and the number of samples in Supplementary Fig. 19.
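
For illustration, the mapping from p-values to star labels might look like the sketch below. Only the three thresholds named in the text are included. The "ns" label for non-significant results and the function name are assumptions, and any further levels implied by "etc." are left out rather than guessed.

```python
def p_to_stars(p, thresholds=((0.001, "***"), (0.01, "**"), (0.05, "*"))):
    """Return the star label of the strictest threshold that p falls below."""
    for cutoff, label in thresholds:
        if p < cutoff:
            return label
    return "ns"  # assumed label for p >= 0.05
```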

Activity cliff detection is deterministic.

To determine whether active compounds exhibited more extreme similarity or dissimilarity than random compound pairs, we first tested for a difference in variance between the two distributions using Levene’s test. Additionally, to assess whether active compound pairs were significantly enriched in highly similar pairs (cosine similarity > mean + 3 standard deviations) or highly dissimilar pairs (cosine similarity < mean − 3 standard deviations; see Supplementary Fig. 20 for the distribution of similarity across all JUMP-CP compounds), we calculated the percentage of pairs meeting these thresholds in both the active and random distributions, and assessed statistical significance using Fisher’s exact test. Detailed statistical results, including test statistics and p-values, are provided in Table 1.
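
The sketch below shows one way to carry out these two tests with SciPy. Several choices are assumptions made for the example rather than details taken from the text: the function name, passing the reference mean and standard deviation in explicitly (here imagined as coming from the all-compound similarity distribution), SciPy's default median-centered variant of Levene's test, and the one-sided direction of Fisher's exact test.

```python
import numpy as np
from scipy import stats

def extreme_pair_stats(active_sims, random_sims, ref_mean, ref_std, k=3.0):
    """Compare active and random cosine-similarity distributions.

    ref_mean, ref_std : mean and standard deviation of a reference
                        similarity distribution (assumed input)
    k                 : number of standard deviations defining "extreme"
    """
    active = np.asarray(active_sims, dtype=float)
    random = np.asarray(random_sims, dtype=float)

    # Difference in spread between the two distributions.
    levene_stat, levene_p = stats.levene(active, random)
    result = {"levene_stat": float(levene_stat), "levene_p": float(levene_p)}

    upper = ref_mean + k * ref_std
    lower = ref_mean - k * ref_std
    masks = {
        "similar": (active > upper, random > upper),
        "dissimilar": (active < lower, random < lower),
    }
    for name, (a_mask, r_mask) in masks.items():
        a_hit, r_hit = int(a_mask.sum()), int(r_mask.sum())
        # 2x2 table: rows = active / random, columns = extreme / not extreme.
        table = [[a_hit, active.size - a_hit], [r_hit, random.size - r_hit]]
        # One-sided test for enrichment among active pairs (assumed direction).
        _, fisher_p = stats.fisher_exact(table, alternative="greater")
        result[name] = {
            "active_pct": 100.0 * a_hit / active.size,
            "random_pct": 100.0 * r_hit / random.size,
            "fisher_p": float(fisher_p),
        }
    return result
```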

Reporting summary

Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.

Supplementary information

Transparent Peer Review file (PDF, 221.6 KB)

Supplementary Information (PDF, 52.2 MB)

Reporting Summary (PDF, 1.4 MB)

Acknowledgements

This work has received support under the program « Investissements d’Avenir » launched by the French Government and implemented by the ANR, with the references ANR-10-LABX-54 MEMO LIFE and ANR-11-IDEX-0001-02 PSL* Research University. M.S. was funded by Iktos and ANRT under the CIFRE program. This work was granted access to the HPC resources of IDRIS under the allocation 2020-AD011011495 made by GENCI. We thank Gabriel Watkinson for helping with the JUMP-CP dataset, the computing service of IBENS, and Mary-Ann Letellier for editing the manuscript.

Author contributions

M.S. designed the computational framework and performed all the analyses. N.B., I.B., E.C., and I.S. helped with the analyses. E.D.N. provided screening data and expertise. H.T. and G.B. contributed to the discussion. A.G. and L.C. conceived the project and supervised the work. M.S. and A.G. wrote the manuscript. All authors revised the manuscript.

Peer review

Peer review information

Communications Biology thanks the anonymous reviewers for their contribution to the peer review of this work. Primary Handling Editor: Laura Rodríguez Pérez. A peer review file is available.

Data availability

The JUMP-CP dataset is publicly available (see: https://broadinstitute.github.io/jump_hub/). Screen data from ChEMBL⁶⁹ are publicly available; we used version 35 of this database (see: https://www.ebi.ac.uk/chembl/). Screen data from BioPhoenix are not publicly available. Compound–target interactions are publicly available (see: https://www.bindingdb.org/). The gene sets are publicly available (see: https://www.gsea-msigdb.org/).

Code availability

We provide scripts at https://github.com/mxfly14/2025_sanchez_phenoseeker⁷⁰ for the automated preprocessing of the data, the testing and evaluation of normalization methods, and the reproduction of the figures and plots. The code to download the compressed version of JUMP-CP is available at https://github.com/gwatkinson/jump_download.

Competing interests

The authors declare the following competing interests: M.S. and H.T. are employees at Iktos. All other authors declare no competing interests.

Footnotes

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Contributor Information

Laurence Calzone, Email: laurence.calzone@curie.fr.

Auguste Genovesio, Email: auguste.genovesio@ens.psl.eu.

Supplementary information

The online version contains supplementary material available at 10.1038/s42003-025-09500-y.

References

1. Vincent, F. et al. Phenotypic drug discovery: recent successes, lessons learned and new directions. Nat. Rev. Drug Discov. 21, 899–914 (2022).
2. Swinney, D. C. Phenotypic vs. target-based drug discovery for first-in-class medicines. Clin. Pharmacol. Ther. 93, 299–301 (2013).
3. Swinney, D. C. & Anthony, J. How were new medicines discovered? Nat. Rev. Drug Discov. 10, 507–519 (2011).
4. Reisen, F. et al. Linking phenotypes and modes of action through high-content screen fingerprints. ASSAY Drug Dev. Technol. 13, 415–427 (2015).
5. Brodin, P., DelNery, E. & Soleilhac, E. Criblage phénotypique à haut contenu pour la chémobiologie et ses enjeux. Médecine/Sci. 31, 187–196 (2015).
6. Bray, M.-A. et al. Cell Painting, a high-content image-based assay for morphological profiling using multiplexed fluorescent dyes. Nat. Protoc. 11, 1757–1774 (2016).
7. Gustafsdottir, S. M. et al. Multiplex cytological profiling assay to measure diverse cellular states. PLoS ONE 8, e80999 (2013).
8. Schorpp, K. et al. CellDeathPred: a deep learning framework for ferroptosis and apoptosis prediction based on cell painting. Cell Death Discov. 9, 277 (2023).
9. Tapaswi, A., Cemalovic, N., Polemi, K. M., Sexton, J. Z. & Colacino, J. A. Applying cell painting in non-tumorigenic breast cells to understand impacts of common chemical exposures. Preprint at 10.1101/2024.04.30.591893 (2024).
10. Greatbatch, C. J. et al. High-throughput functional profiling of genes at intraocular pressure loci reveals distinct networks for glaucoma. Hum. Mol. Genet. 33, 739–751 (2024).
11. Lambert, R. et al. Drug-induced cytotoxicity prediction in muscle cells, an application of the Cell Painting assay. Preprint at 10.1101/2024.02.08.579439 (2024).
12. Lejal, V., Rouquié, D. & Taboureau, O. Cell morphology and gene expression: tracking changes and complementarity across time and cell lines. Preprint at 10.1101/2024.08.30.610494 (2024).
13. Garcia-Fossa, F. et al. Live Cell Painting: image-based profiling in live cells using Acridine Orange. Preprint at 10.1101/2024.08.28.610144 (2024).
14. Platani, M. et al. Screening for variable drug responses using human iPSC cohorts. PLoS ONE 20, e0323953 (2025).
15. Ringers, C. et al. High-content morphological profiling by Cell Painting in 3D spheroids. Preprint at 10.1101/2025.02.05.636642 (2025).
16. Seal, S. et al. Cell Painting: a decade of discovery and innovation in cellular imaging. Nat. Methods 22, 254–268 (2025).
17. LeCun, Y., Bengio, Y. & Hinton, G. Deep learning. Nature 521, 436–444 (2015).
18. Jumper, J. et al. Highly accurate protein structure prediction with AlphaFold. Nature 596, 583–589 (2021).
19. Kim, V. et al. Self-supervision advances morphological profiling by unlocking powerful image representations. Sci. Rep. 15, 4876 (2025).
20. Zheng, S. et al. Cross-modal graph contrastive learning with cellular images. Adv. Sci. 11, 2404845 (2024).
21. Lamiable, A. et al. Revealing invisible cell phenotypes with conditional generative modeling. Nat. Commun. 14, 6386 (2023).
22. Liu, G. et al. Learning Molecular Representation in a Cell. Preprint at 10.48550/arXiv.2406.12056 (2024).
23. Moshkov, N. et al. Learning representations for image-based profiling of perturbations. Nat. Commun. 15, 1594 (2024).
24. Sanchez-Fernandez, A., Rumetshofer, E., Hochreiter, S. & Klambauer, G. CLOOME: contrastive learning unlocks bioimaging databases for queries with chemical structures. Nat. Commun. 14, 7339 (2023).
25. Lazar, N. H. et al. High-resolution genome-wide mapping of chromosome-arm-scale truncations induced by CRISPR–Cas9 editing. Nat. Genet. 56, 1482–1493 (2024).
26. Bray, M.-A. et al. A dataset of images and morphological profiles of 30000 small-molecule treatments using the Cell Painting assay. GigaScience 6, 1–5 (2017).
27. Herman, D. et al. Leveraging cell painting images to expand the applicability domain and actively improve deep learning quantitative structure–activity relationship models. Chem. Res. Toxicol. 36, 1028–1036 (2023).
28. Hofmarcher, M., Rumetshofer, E., Clevert, D.-A., Hochreiter, S. & Klambauer, G. Accurate prediction of biological assays with high-throughput microscopy images and convolutional networks. J. Chem. Inf. Model. 59, 1163–1171 (2019).
29. Celik, S. et al. Building, benchmarking, and exploring perturbative maps of transcriptional and morphological data. PLoS Comput. Biol. 20, e1012463 (2024).
30. Chandrasekaran, S. N. et al. Morphological map of under- and overexpression of genes in human cells. Nat. Methods 22, 1742–1752 (2025).
31. Cohen, E. et al. Cell painting transfer increases screening hit rate. Biol. Imaging 3, e4 (2023).
32. Wang, S. et al. PhenoScreen: a dual-space contrastive learning framework-based phenotypic screening method by linking chemical perturbations to cellular morphology. Preprint at 10.1101/2024.10.23.619752 (2024).
33. Simm, J. et al. Repurposing high-throughput image assays enables biological activity prediction for drug discovery. Cell Chem. Biol. 25, 611–618.e3 (2018).
34. Seal, S. et al. Merging bioactivity predictions from cell morphology and chemical fingerprint models using similarity to training data. J. Cheminformatics 15, 56 (2023).
35. Trapotsi, M.-A. et al. Comparison of chemical structure and cell morphology information for multitask bioactivity predictions. J. Chem. Inf. Model. 61, 1444–1456 (2021).
36. Ha, S. V. et al. Low concentration cell painting images enable the identification of highly potent compounds. Sci. Rep. 14, 24403 (2024).
37. Moshkov, N. et al. Predicting compound activity from phenotypic profiles and chemical structures. Nat. Commun. 14, 1967 (2023).
38. Fredin Haslum, J. et al. Cell Painting-based bioactivity prediction boosts high-throughput screening hit-rates and compound diversity. Nat. Commun. 15, 3470 (2024).
39. Chandrasekaran, S. N. et al. JUMP Cell Painting dataset: morphological impact of 136,000 chemical and genetic perturbations. Preprint at 10.1101/2023.03.23.534023 (2023).
40. Arevalo, J. et al. Evaluating batch correction methods for image-based cell profiling. Nat. Commun. 15, 6516 (2024).
41. Polishchuk, P. G., Madzhidov, T. I. & Varnek, A. Estimation of the size of drug-like chemical space based on GDB-17 data. J. Comput. Aided Mol. Des. 27, 675–679 (2013).
42. van Tilborg, D., Alenicheva, A. & Grisoni, F. Exposing the limitations of molecular machine learning with activity cliffs. J. Chem. Inf. Model. 62, 5938–5951 (2022).
43. Stirling, D. R. et al. CellProfiler 4: improvements in speed, utility and usability. BMC Bioinforma. 22, 433 (2021).
44. He, K., Zhang, X., Ren, S. & Sun, J. Deep Residual Learning for Image Recognition. In 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) 770–778 (2016). 10.1109/CVPR.2016.90.
45. Oquab, M. et al. DINOv2: learning robust visual features without supervision. Preprint at 10.48550/arXiv.2304.07193 (2023).
46. Bourriez, N. et al. ChAda-ViT: channel adaptive attention for joint representation learning of heterogeneous microscopy images. In Proc. IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR, 2024).
47. Kraus, O. et al. Masked autoencoders for microscopy are scalable learners of cellular biology. In Proc. IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR, 2024).
48. Ando, D. M., McLean, C. Y. & Berndl, M. Improving phenotypic measurements in high-content imaging screens. Preprint at 10.1101/161422 (2017).
49. Lopez, R., Regier, J., Cole, M. B., Jordan, M. I. & Yosef, N. Deep generative modeling for single-cell transcriptomics. Nat. Methods 15, 1053–1058 (2018).
50. Korsunsky, I. et al. Fast, sensitive and accurate integration of single-cell data with Harmony. Nat. Methods 16, 1289–1296 (2019).
51. Hie, B., Bryson, B. & Berger, B. Efficient integration of heterogeneous single-cell transcriptomes using Scanorama. Nat. Biotechnol. 37, 685–691 (2019).
52. Yan, C. et al. Triple-effect correction for Cell Painting data with contrastive and domain-adversarial learning. Nat. Commun. 16, 6886 (2025).
53. Watkinson, G. et al. Weakly supervised cross-modal learning in high-content screening. In Proc. IEEE International Symposium on Biomedical Imaging 1–5 (2024).
54. Kessy, A., Lewin, A. & Strimmer, K. Optimal whitening and decorrelation. Am. Stat. 72, 309–314 (2018).
55. LIT-PCBA: An Unbiased Data Set for Machine Learning and Virtual Screening. J. Chem. Inf. Model. https://pubs.acs.org/doi/10.1021/acs.jcim.0c00155.
56. Morgan, H. L. The generation of a unique machine description for chemical structures-a technique developed at chemical abstracts service. J. Chem. Doc. 5, 107–113 (1965).
57. Bajusz, D., Rácz, A. & Héberger, K. Why is Tanimoto index an appropriate choice for fingerprint-based similarity calculations? J. Cheminformatics 7, 20 (2015).
58. Healy, J. & McInnes, L. Uniform manifold approximation and projection. Nat. Rev. Methods Prim. 4, 1–15 (2024).
59. Bemis, G. W. & Murcko, M. A. The Properties of Known Drugs. 1. Molecular Frameworks. J. Med. Chem. 39, 2887–2893 (1996).
60. Guha, R. Exploring Uncharted Territories - Predicting Activity Cliffs in Structure-Activity Landscapes. J. Chem. Inf. Model. 52, 2181–2191 (2012).
61. Zhang, Z., Zhao, B., Xie, A., Bian, Y. & Zhou, S. Activity Cliff Prediction: Dataset and Benchmark. Preprint at 10.48550/arXiv.2302.07541 (2023).
62. Lo Surdo, P. et al. SIGNOR 3.0, the SIGnaling network open resource 3.0: 2022 update. Nucleic Acids Res. 51, D631–D637 (2023).
63. Devaiah, B. N. et al. MYC protein stability is negatively regulated by BRD4. Proc. Natl. Acad. Sci. USA 117, 13457–13467 (2020).
64. Kotekar, A., Singh, A. K. & Devaiah, B. N. BRD4 and MYC: power couple in Transcription and Disease. FEBS J. 290, 4820–4842 (2023).
65. He, K. et al. Masked Autoencoders Are Scalable Vision Learners. In 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 15979–15988 (2022). 10.1109/CVPR52688.2022.01553.
66. Ruscone, M. et al. NeKo: a tool for automatic network construction from prior knowledge. Preprint at 10.1101/2024.10.14.618311 (2024).
67. Kalinin, A. A. et al. A versatile information retrieval framework for evaluating profile strength and similarity. Nat. Commun. 16, 5181 (2025).
68. Lopes, J. C. D., dos Santos, F. M., Martins-José, A., Augustyns, K. & De Winter, H. The power metric: a new statistically robust enrichment-type metric for virtual screening applications with early recovery capability. J. Cheminformatics 9, 7 (2017).
69. Gaulton, A. et al. ChEMBL: a large-scale bioactivity database for drug discovery. Nucleic Acids Res. 40, D1100–D1107 (2012).
70. mxfly14. mxfly14/2025_sanchez_phenoseeker: Publication DOI. Zenodo 10.5281/zenodo.17914525 (2025).

Associated Data

This section collects any data citations, data availability statements, or supplementary materials included in this article.

Supplementary Materials

Transparent Peer Review file^{(221.6KB, pdf)}

Supplementary Information^{(52.2MB, pdf)}

Reporting Summary^{(1.4MB, pdf)}

Data Availability Statement

[CR1] 1.Vincent, F. et al. Phenotypic drug discovery: recent successes, lessons learned and new directions. Nat. Rev. Drug Discov.21, 899–914 (2022). [DOI] [PMC free article] [PubMed] [Google Scholar]

[CR2] 2.Swinney, D. C. Phenotypic vs. target-based drug discovery for first-in-class medicines. Clin. Pharmacol. Ther.93, 299–301 (2013). [DOI] [PubMed] [Google Scholar]

[CR3] 3.Swinney, D. C. & Anthony, J. How were new medicines discovered?. Nat. Rev. Drug Discov.10, 507–519 (2011). [DOI] [PubMed] [Google Scholar]

[CR4] 4.Reisen, F. et al. Linking phenotypes and modes of action through high-content screen fingerprints. ASSAY Drug Dev. Technol.13, 415–427 (2015). [DOI] [PubMed] [Google Scholar]

[CR5] 5.Brodin, P., DelNery, E. & Soleilhac, E. Criblage phénotypique à haut contenu pour la chémobiologie et ses enjeux. m.édecine/Sci.31, 187–196 (2015). [DOI] [PubMed] [Google Scholar]

[CR6] 6.Bray, M.-A. et al. Cell Painting, a high-content image-based assay for morphological profiling using multiplexed fluorescent dyes. Nat. Protoc.11, 1757–1774 (2016). [DOI] [PMC free article] [PubMed] [Google Scholar]

[CR7] 7.Gustafsdottir, S. M. et al. Multiplex cytological profiling assay to measure diverse cellular states. PLoS ONE8, e80999 (2013). [DOI] [PMC free article] [PubMed] [Google Scholar]

[CR8] 8.Schorpp, K. et al. CellDeathPred: a deep learning framework for ferroptosis and apoptosis prediction based on cell painting. Cell Death Discov. 9, 277 (2023). [DOI] [PMC free article] [PubMed]

[CR9] 9.Tapaswi, A., Cemalovic, N., Polemi, K. M., Sexton, J. Z. & Colacino, J. A. Applying cell painting in non-tumorigenic breast cells to understand impacts of common chemical exposures. 2024.04.30.591893 Preprint at 10.1101/2024.04.30.591893 (2024). [DOI] [PubMed]

[CR10] 10.Greatbatch, C. J. et al. High-throughput functional profiling of genes at intraocular pressure loci reveals distinct networks for glaucoma. Hum. Mol. Genet. 33, 739–751 (2024). [DOI] [PMC free article] [PubMed]

[CR11] 11.Lambert, R. et al. Drug-induced cytotoxicity prediction in muscle cells, an application of the Cell Painting assay. 2024.02.08.579439 Preprint at 10.1101/2024.02.08.579439 (2024). [DOI] [PMC free article] [PubMed]

[CR12] 12.Lejal, V., Rouquié, D. & Taboureau, O. Cell morphology and gene expression: tracking changes and complementarity across time and cell lines. 2024.08.30.610494 Preprint at 10.1101/2024.08.30.610494 (2024). [DOI] [PubMed]

[CR13] 13.Garcia-Fossa, F. et al. Live Cell Painting: image-based profiling in live cells using Acridine Orange. 2024.08.28.610144 Preprint at 10.1101/2024.08.28.610144 (2024). [DOI] [PMC free article] [PubMed]

[CR14] 14.Platani, M. et al. Screening for variable drug responses using human iPSC cohorts. PLOS ONE20, e0323953 (2025). [DOI] [PMC free article] [PubMed]

[CR15] 15.Ringers, C. et al. High-content morphological profiling by Cell Painting in 3D spheroids. 2025.02.05.636642 Preprint at 10.1101/2025.02.05.636642 (2025).

[CR16] 16.Seal, S. et al. Cell Painting: a decade of discovery and innovation in cellular imaging. Nat. Methods22, 254–268 (2025). [DOI] [PMC free article] [PubMed]

[CR17] 17.LeCun, Y., Bengio, Y. & Hinton, G. Deep learning. Nature521, 436–444 (2015). [DOI] [PubMed] [Google Scholar]

[CR18] 18.Jumper, J. et al. Highly accurate protein structure prediction with AlphaFold. Nature596, 583–589 (2021). [DOI] [PMC free article] [PubMed] [Google Scholar]

[CR19] 19.Kim, V. et al. Self-supervision advances morphological profiling by unlocking powerful image representations. Sci. Rep.15, 4876 (2025). [DOI] [PMC free article] [PubMed]

[CR20] 20.Zheng, S. et al. Cross-modal graph contrastive learning with cellular images. Adv. Sci.11, 2404845 (2024). [DOI] [PMC free article] [PubMed] [Google Scholar]

[CR21] 21.Lamiable, A. et al. Revealing invisible cell phenotypes with conditional generative modeling. Nat. Commun.14, 6386 (2023). [DOI] [PMC free article] [PubMed] [Google Scholar]

[CR22] 22.Liu, G. et al. Learning Molecular Representation in a Cell. Preprint at 10.48550/arXiv.2406.12056 (2024).

[CR23] 23.Moshkov, N. et al. Learning representations for image-based profiling of perturbations. Nat. Commun.15, 1594 (2024). [DOI] [PMC free article] [PubMed] [Google Scholar]

[CR24] 24.Sanchez-Fernandez, A., Rumetshofer, E., Hochreiter, S. & Klambauer, G. CLOOME: contrastive learning unlocks bioimaging databases for queries with chemical structures. Nat. Commun.14, 7339 (2023). [DOI] [PMC free article] [PubMed] [Google Scholar]

[CR25] 25.Lazar, N.H. et al. High-resolution genome-wide mapping of chromosome-arm-scale truncations induced by CRISPR–Cas9 editing. Nat. Genet. 56, 1482–1493 (2024). [DOI] [PMC free article] [PubMed]

[CR26] 26.Bray, M.-A. et al. A dataset of images and morphological profiles of 30000 small-molecule treatments using the Cell Painting assay. GigaScience6, 1–5 (2017). [DOI] [PMC free article] [PubMed] [Google Scholar]

[CR27] 27.Herman, D. et al. Leveraging cell painting images to expand the applicability domain and actively improve deep learning quantitative structure–activity relationship models. Chem. Res. Toxicol.36, 1028–1036 (2023). [DOI] [PMC free article] [PubMed] [Google Scholar]

[CR28] 28.Hofmarcher, M., Rumetshofer, E., Clevert, D.-A., Hochreiter, S. & Klambauer, G. Accurate prediction of biological assays with high-throughput microscopy images and convolutional networks. J. Chem. Inf. Model.59, 1163–1171 (2019). [DOI] [PubMed] [Google Scholar]

[CR29] 29.Celik, S. et al. Building, benchmarking, and exploring perturbative maps of transcriptional and morphological data. PLoS Comput. Biol. 20, e1012463 (2024). [DOI] [PMC free article] [PubMed]

[CR30] 30.Chandrasekaran, S. N. et al. Morphological map of under- and overexpression of genes in human cells. Nat. Methods22, 1742–1752 (2025). [DOI] [PMC free article] [PubMed] [Google Scholar]

[CR31] 31.Cohen, E. et al. Cell painting transfer increases screening hit rate. Biol. Imaging3, e4 (2023). [DOI] [PMC free article] [PubMed] [Google Scholar]

[CR32] 32.Wang, S. et al. PhenoScreen: a dual-space contrastive learning framework-based phenotypic screening method by linking chemical perturbations to cellular morphology. 2024.10.23.619752 Preprint at 10.1101/2024.10.23.619752 (2024).

[CR33] 33.Simm, J. et al. Repurposing high-throughput image assays enables biological activity prediction for drug discovery. Cell Chem. Biol.25, 611–618.e3 (2018). [DOI] [PMC free article] [PubMed] [Google Scholar]

[CR34] 34.Seal, S. et al. Merging bioactivity predictions from cell morphology and chemical fingerprint models using similarity to training data. J. Cheminformatics15, 56 (2023). [DOI] [PMC free article] [PubMed] [Google Scholar]

[CR35] 35.Trapotsi, M.-A. et al. Comparison of chemical structure and cell morphology information for multitask bioactivity predictions. J. Chem. Inf. Model.61, 1444–1456 (2021). [DOI] [PubMed] [Google Scholar]

[CR36] 36.Ha, S. V. et al. Low concentration cell painting images enable the identification of highly potent compounds. Sci. Rep.14, 24403 (2024). [DOI] [PMC free article] [PubMed] [Google Scholar]

[CR37] 37.Moshkov, N. et al. Predicting compound activity from phenotypic profiles and chemical structures. Nat. Commun.14, 1967 (2023). [DOI] [PMC free article] [PubMed] [Google Scholar]

[CR38] 38.Fredin Haslum, J. et al. Cell Painting-based bioactivity prediction boosts high-throughput screening hit-rates and compound diversity. Nat Commun15, 3470 (2024). [DOI] [PMC free article] [PubMed]

[CR39] 39.Chandrasekaran, S. N. et al. JUMP Cell Painting dataset: morphological impact of 136,000 chemical and genetic perturbations. 2023.03.23.534023 Preprint at 10.1101/2023.03.23.534023 (2023).

[CR40] 40.Arevalo, J. et al. Evaluating batch correction methods for image-based cell profiling. Nat. Commun. 15, 6516 (2024). [DOI] [PMC free article] [PubMed]

[CR41] 41.Polishchuk, P. G., Madzhidov, T. I. & Varnek, A. Estimation of the size of drug-like chemical space based on GDB-17 data. J. Comput. Aided Mol. Des.27, 675–679 (2013). [DOI] [PubMed] [Google Scholar]

[CR42] 42.van Tilborg, D., Alenicheva, A. & Grisoni, F. Exposing the limitations of molecular machine learning with activity cliffs. J. Chem. Inf. Model.62, 5938–5951 (2022). [DOI] [PMC free article] [PubMed] [Google Scholar]

[CR43] 43.Stirling, D. R. et al. CellProfiler 4: improvements in speed, utility and usability. BMC Bioinforma.22, 433 (2021). [DOI] [PMC free article] [PubMed] [Google Scholar]

[CR44] 44.He, K., Zhang, X., Ren, S. & Sun, J. Deep Residual Learning for Image Recognition. In 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) 770–778. 10.1109/CVPR.2016.90. (2016)

[CR45] 45.Oquab, M. et al. DINOv2: learning robust visual features without supervision. Preprint at 10.48550/arXiv.2304.07193 (2023).

[CR46] 46.Bourriez, N. et al. ChAda-ViT: channel adaptive attention for joint representation learning of heterogeneous microscopy images. In Proc. IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR, 2024).

[CR47] 47.Kraus, O. et al. Masked autoencoders for microscopy are scalable learners of cellular biology. In Proc. IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR, 2024).

[CR48] 48.Ando, D. M., McLean, C. Y. & Berndl, M. Improving phenotypic measurements in high-content imaging screens. 161422 Preprint at 10.1101/161422 (2017).

[CR49] 49.Lopez, R., Regier, J., Cole, M. B., Jordan, M. I. & Yosef, N. Deep generative modeling for single-cell transcriptomics. Nat. Methods15, 1053–1058 (2018). [DOI] [PMC free article] [PubMed] [Google Scholar]

[CR50] 50.Korsunsky, I. et al. Fast, sensitive and accurate integration of single-cell data with Harmony. Nat. Methods16, 1289–1296 (2019). [DOI] [PMC free article] [PubMed] [Google Scholar]

[CR51] 51.Hie, B., Bryson, B. & Berger, B. Efficient integration of heterogeneous single-cell transcriptomes using Scanorama. Nat. Biotechnol.37, 685–691 (2019). [DOI] [PMC free article] [PubMed] [Google Scholar]

[CR52] 52.Yan, C. et al. Triple-effect correction for Cell Painting data with contrastive and domain-adversarial learning. Nat. Commun.16, 6886 (2025). [DOI] [PMC free article] [PubMed]

[CR53] 53.Watkinson, G. et al. Weakly supervised cross-modal learning in high-content screening. In Proc. IEEE International Symposium on Biomedical Imaging 1–5 (2024).

[CR54] 54.Kessy, A., Lewin, A. & Strimmer, K. Optimal whitening and decorrelation. Am. Stat.72, 309–314 (2018). [Google Scholar]

[CR55] 55.LIT-PCBA: An Unbiased Data Set for Machine Learning and Virtual Screening | Journal of Chemical Information and Modeling. https://pubs.acs.org/doi/10.1021/acs.jcim.0c00155. [DOI] [PubMed]

[CR56] 56.Morgan, H. L. The generation of a unique machine description for chemical structures-a technique developed at chemical abstracts service. J. Chem. Doc.5, 107–113 (1965). [Google Scholar]

[CR57] 57.Bajusz, D., Rácz, A. & Héberger, K. Why is Tanimoto index an appropriate choice for fingerprint-based similarity calculations?. J. Cheminformatics7, 20 (2015). [DOI] [PMC free article] [PubMed] [Google Scholar]

[CR58] 58.Healy, J. & McInnes, L. Uniform manifold approximation and projection. Nat. Rev. Methods Prim.4, 1–15 (2024). [Google Scholar]

[CR59] 59.Bemis, G. W. & Murcko, M. A. The Properties of Known Drugs. 1. Molecular Frameworks. J. Med. Chem.39, 2887–2893 (1996). [DOI] [PubMed] [Google Scholar]

[CR60] 60.Guha, R. Exploring Uncharted Territories - Predicting Activity Cliffs in Structure-Activity Landscapes. J. Chem. Inf. Model.52, 2181–2191 (2012). [DOI] [PMC free article] [PubMed] [Google Scholar]

[CR61] 61.Zhang, Z., Zhao, B., Xie, A., Bian, Y. & Zhou, S. Activity Cliff Prediction: Dataset and Benchmark. Preprint at 10.48550/arXiv.2302.07541 (2023).

[CR62] 62.Lo Surdo, P. et al. SIGNOR 3.0, the SIGnaling network open resource 3.0: 2022 update. Nucleic Acids Res51, D631–D637 (2023). [DOI] [PMC free article] [PubMed] [Google Scholar]

[CR63] 63.Devaiah, B. N. et al. MYC protein stability is negatively regulated by BRD4. Proc. Natl. Acad. Sci. USA117, 13457–13467 (2020). [DOI] [PMC free article] [PubMed] [Google Scholar]

[CR64] 64.Kotekar, A., Singh, A. K. & Devaiah, B. N. BRD4 and MYC: power couple in Transcription and Disease. FEBS J.290, 4820–4842 (2023). [DOI] [PMC free article] [PubMed] [Google Scholar]

[CR65] 65.He, K. et al. Masked Autoencoders Are Scalable Vision Learners. In 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 15979–15988. 10.1109/CVPR52688.2022.01553. (2022)

[CR66] 66.Ruscone, M. et al. NeKo: a tool for automatic network construction from prior knowledg. Preprint at 10.1101/2024.10.14.618311 (2024). [DOI] [PMC free article] [PubMed]

[CR67] 67.Kalinin, A. A. et al. A versatile information retrieval framework for evaluating profile strength and similarity. Nat. Commun. 16, 5181 (2025). [DOI] [PMC free article] [PubMed]

[CR68] 68.Lopes, J. C. D., dos Santos, F. M., Martins-José, A., Augustyns, K. & De Winter, H. The power metric: a new statistically robust enrichment-type metric for virtual screening applications with early recovery capability. J. Cheminformatics9, 7 (2017). [DOI] [PMC free article] [PubMed] [Google Scholar]

[CR69] 69.Gaulton, A. et al. ChEMBL: a large-scale bioactivity database for drug discovery. Nucleic Acids Res40, D1100–D1107 (2012). [DOI] [PMC free article] [PubMed] [Google Scholar]

[CR70] 70. mxfly14. mxfly14/2025_sanchez_phenoseeker: Publication DOI. Zenodo 10.5281/zenodo.17914525 (2025).
