SUMMARY
Genetic variants produce complex phenotypic effects that confound current assays and predictive models. We developed Variant in situ sequencing (VIS-seq), a pooled, image-based method measuring variant effects on molecular and cellular phenotypes in diverse cell types. Applying VIS-seq to ~3,000 LMNA and PTEN variants yielded high-dimensional morphological profiles capturing changes in protein abundance, localization, activity and cell architecture. VIS-seq identified a subset of linker subdomain LMNA variants that increase nuclear circularity, in contrast to aggregating or low abundance rod subdomain variants that decrease circularity. VIS-seq also identified autism-associated PTEN variants that mislocalize, and accurately distinguished autism-linked from tumor syndrome-linked and gnomAD control variants. Most variants impacted a multidimensional phenotypic continuum not recapitulated by any single functional readout. By linking variants to cell images at scale, VIS-seq illuminates how variant effects cascade from molecules to subcellular structures to cells, providing a framework for resolving the complexity of variant function.
Keywords: Optical pooled screening, Functional genomics, Genetic variation, Pleiotropy, Multiplexed assays of variant effect
In Brief:
Variant in situ sequencing (VIS-seq) links genetic variants to cell images, revealing how variants affect molecules, subcellular structures, and cells at scale. Applied to thousands of LMNA and PTEN variants, VIS-seq illuminated how variants impacted a multidimensional phenotypic continuum not recapitulated by any single functional readout.
Graphical Abstract

INTRODUCTION
Genetic variants change the sequence of transcripts and peptides deterministically, but their effect on molecular or cellular phenotypes is difficult to predict. For example, protein-coding variants can alter protein structure, function or localization; disrupt macromolecular complexes and subcellular processes; or alter cell internal structure, morphology and behavior. Deep mutational scanning enabled the systematic, pooled evaluation of thousands of protein-coding variants in a single experiment for some of these molecular or cellular phenotypes1. However, the vast majority of deep mutational scans and other multiplexed experiments have focused on simple, one-dimensional phenotypes2 like cell growth3,4 or protein abundance5,6 in cultured human cell lines or non-human models. These experiments generally provide limited insight into variant pathomechanism, largely fail to parse complex gene-disease relationships and cannot reveal cell-type-specific effects. Consequently, genetic variants are often viewed as affecting one phenotype and classified as either benign or pathogenic. In reality, variants have complex, multidimensional phenotypic consequences. Thus, scalable methods are needed that can measure variant effects on generalizable, interpretable, and information-dense molecular and cellular phenotypes in specialized cell types.
Building on in situ sequencing methods for image-based, pooled CRISPR screens7–13, we massively expanded the scope of deep mutational scanning by enabling pooled measurement of variant effects using cell imaging. Because imaging can spatially resolve biomolecules, subcellular structures and cell morphology using fluorescent proteins, in situ hybridization, antibodies, and cell-staining dyes, it yields insights into the effects of diverse perturbations. For example, imaging of cells with dyes that stain subcellular compartments14 revealed the effects of chemical15, CRISPR16, and ORF overexpression17 perturbations and illustrated that image-based profiling is at least as powerful as RNA-based profiling for clustering these perturbations18. Arrayed, image-based measurement of the localization of ~3,000 human pathogenic variant proteins expressed in a cell line also demonstrated that mislocalization is a common defect19. We overcame longstanding technical challenges to develop Variant in situ sequencing (VIS-seq), a method based on a circular RNA barcode that yields a thirteen-fold increase in reads per cell compared to a linear construct. We also created a transgene expression cassette that prevents silencing, enabling a ten-fold increase in scale compared to previous multiplexed experiments in human induced pluripotent stem (iPS) cells20–23 and their derivatives. Thus, VIS-seq is a scalable platform that uses imaging of cells to quantify the effect of variants on diverse phenotypes in disease-relevant cell types.
We used VIS-seq to derive morphological profiles for ~3,000 synonymous, missense, nonsense and frameshift variants of LMNA and PTEN from ~11.4 million cell images. We chose these genes because variants in each are associated with multiple diseases, and the underlying molecular and cellular mechanisms linking variants to each disease remain poorly understood. Morphological profiles of LMNA variants expressed in U2OS24 cells revealed extensive effects on the lamin A protein’s abundance, localization, aggregation, and ability to maintain nuclear shape. LMNA profiles enabled us to discover gain-of-function variants that increase nuclear circularity and revealed a direct connection between lamin A’s three-dimensional protein structure, localization and effect on cells. Morphological profiles of PTEN variants expressed in both iPS cells and iPS-derived, NGN2-induced neuron-like cells comprehensively mapped the interdependence of the PTEN protein’s abundance, activity, and localization, and revealed a nuclear localization defect enriched among autism spectrum disorder-associated variants. LMNA and PTEN morphological profiles identified known pathogenic variants with high accuracy and, in the case of PTEN, enabled us to discriminate between autism and tumor syndrome-associated variants. Thus, VIS-seq expands our ability to deeply phenotype genetic variants in different cell contexts, illuminating how variant effects propagate from molecules to subcellular structures to cells. More fundamentally, we show that variants perturb a phenotype space larger and more complex than previously explored, spanning a multidimensional continuum rather than a pathogenic-benign binary and highlighting the limitations of current one-dimensional experimental or predictive methods.
RESULTS
Barcoded circular RNAs and a universal expression cassette enable Variant in situ Sequencing
Variant in situ Sequencing (VIS-seq) is a platform for pooled, image-based profiling of thousands of transgenically expressed protein-coding variants (Figure 1A). VIS-seq comprises a cassette with a promoter expressing the protein variant and a second promoter expressing an abundant circular RNA25,26 containing one or more barcodes that are sequenced in situ to reveal the identity of the variant expressed in each cell (Figure 1B). The VIS-seq expression cassette includes flanking insulators27, a ubiquitous chromatin opening element28, and an intron29,30, along with a viral regulatory element31. Together, these elements prevent transgene silencing in iPS cells before and after differentiation, eliminating a longstanding barrier to transgene-based experiments in stem cells and their derivatives32. The cassette sustained expression of WT mEGFP33-tagged histone H1.4 over 14 days of passaging in human iPS cells and after 56 days in differentiated iPS-derived cardiomyocytes, whereas expression from a lentiviral construct was silenced (Figure 1C). A linear RNA barcode yielded zero or one in situ sequencing reads in 56% of all cells and 28% of non-silenced cells, preventing effective barcode calling (Figure 1D), so we developed a circularizing, in situ sequencing-compatible RNA barcode25,26. 93% of cells expressing this circularizing RNA barcode had two or more reads, and reads per cell increased ~13-fold in all cells and ~5-fold in non-silenced cells.
Figure 1. Transgenic variant expression and in situ sequencing with VIS-seq.

(A) VIS-seq uses fluorescent in situ sequencing of abundant circular RNA barcodes to genotype cells expressing protein variants. (1) A variant library in the VIS-seq expression cassette is integrated into cells via the piggyBac transposase. (2) Cells are fixed; barcodes are reverse transcribed, captured with a padlock probe and amplified; (3) cells are stained and imaged; (4) barcodes are sequenced in situ; (5) single cell phenotype-genotype pairs are determined using STARCall; and (6) features for each cell are extracted using CellProfiler.
(B) VIS-seq expression cassette (PB = piggyBac34; PD = padlock probe130 binding site; UCOE = universal chromatin opening element28; HBB IVS2 = hemoglobin subunit beta intervening sequence 229,30; IRES = internal ribosome entry site131; WPRE = woodchuck hepatitis virus post-transcriptional regulatory element31; cHS4 = chicken beta globin locus control region hypersensitive site 427).
(C) Time-series plot of mEGFP positivity of human WTC11 iPS cells and derived cardiomyocytes. mEGFP-tagged WT histone H1.4 is expressed via the VIS-seq cassette (blue) or lentivirus (red). Two replicates are plotted.
(D) Violin plot of the number of rolling circle amplification colonies per cell after one cycle of sequencing for VIS-seq expression cassette (blue) and lentivirus (red) cells from (C) at day 14. Silenced cells were excluded. Two replicates are shown.
(E) Matched images of mEGFP-tagged lamin A variants in U2OS cells, stained with DAPI, phalloidin, WGA, and mitoprobe (left), with corresponding first base in situ sequencing; magenta=guanine, blue=thymidine, yellow=adenine, and red=cytosine (right). Silver dashed lines depict the borders of cells. Scale bar indicates 10 μm.
(F) Matched images of mEGFP-tagged PTEN library in human WTC11 PTEN-KO inducible-NGN2 iPS cells, stained with DAPI and phalloidin-CF568 (left, top row) and first base in situ sequencing (right, top row). Matched images of mEGFP-tagged PTEN library in neuron-like cells, seven days after induction of NGN2, stained with DAPI and anti-MAP2 AF568 (left, bottom row) and first base in situ sequencing (right, bottom row). Silver dashed lines depict the borders of iPS cells (top) or cell bodies of NGN2-induced neuron-like cells (bottom). Nucleobase coloring identical to (E). Scale bar indicates 10 μm.
VIS-seq starts with integration of a library-bearing expression cassette via piggyBac transposition34 at low multiplicity of integration (MOI) (Figures 1A and S1A). Cells are fixed and barcodes are reverse transcribed, captured with a padlock probe and amplified7,35. Cells are stained with the desired combination of fluorescent antibodies, RNA FISH probes, and dyes14. Images are collected and stains and probes are removed or bleached10,36,37. A sequencing primer is hybridized to the amplified barcodes, which are then sequenced in situ7. Images are aligned and stitched in a single step using STitching, Alignment and Read Calling for in situ sequencing (STARCall), software we developed to reduce the need to maintain precise registration between rounds of phenotyping and in situ sequencing images38 (available at https://github.com/FowlerLab/starcall-workflow). Barcode sequences are extracted and mapped to protein variants, then cell and nuclear borders are segmented39,40. Lastly, morphological features are extracted using CellProfiler41, generating a single cell matrix of image features (Figure 1A). We demonstrated the VIS-seq workflow on four genes with variant mislocalization phenotypes19,42 (LMNA, PTEN, RPS19, and HIST1H1E) in U2OS or human iPS or iPS-derived cells (Figures 1E, 1F, S1B, and S1C).
VIS-seq reveals morphological effects of thousands of LMNA variants
We used VIS-seq to phenotype variants in the rod domain of the intermediate filament protein lamin A, encoded by LMNA. Lamin A assembles into a poorly understood meshwork lining the inner nuclear membrane, maintaining nuclear shape and mechanical stability43,44, connecting the nucleus to the cytoskeleton45,46, and regulating gene expression47,48. LMNA missense variants cause disparate diseases, collectively called laminopathies47–53. Immunostaining of lamin A in patient fibroblasts54 and transient transfection of lamin A variants into cell lines42,55 revealed altered localization and morphology, with some pathogenic rod domain variants leading to lamin A aggregates, changes in nuclear shape, and other nuclear defects. However, not all pathogenic LMNA variants form aggregates, and the relationship between these molecular and cellular phenotypes and clinical phenotypes remains poorly understood42.
We created a library of >1,700 mEGFP-tagged LMNA synonymous, missense and frameshift variants of the central rod domain between amino acid positions 178 and 273, a hotspot for pathogenic variants. We integrated this library into U2OS LMNA-KO cells and imaged ~10 million fixed, library-expressing cells after staining chromatin (DAPI), microfilaments (phalloidin), plasma membrane (wheat germ agglutinin, WGA)14, and mitochondria (anti-12S and 16S mitochondrial rRNA FISH probes10, Figures 2A and S1D). In situ sequencing of the 12-base variant barcode followed by removal of cells with multiple barcodes yielded thousands of cells for most variants as well as ~900,000 cells expressing wild-type (WT) lamin A (Figures 2B and S2A–S2C). We used CellProfiler41 to extract 1,077 replicable morphological features for each cell that measured the intensity, distribution, texture, and granularity of the mEGFP-lamin A and stain channels, as well as correlations between channels (Figures 2C and 2D, Table S1). Features that reflected biologically irrelevant information (e.g., pixel coordinates, cell orientation)56 were removed. Thus, we obtained image and feature information for 1,767 variants expressed in ~6.6 million cells (Figures 2A and S2D).
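The barcode-filtering step described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the STARCall implementation: the function name `assign_variants`, the `min_reads` threshold, the choice to ignore reads that match no library barcode, and the toy barcodes are all hypothetical.

```python
from collections import Counter


def assign_variants(cell_reads, barcode_to_variant, min_reads=2):
    """Map each cell to a variant, dropping ambiguous cells.

    cell_reads: dict of cell id -> list of barcode reads called in that cell.
    barcode_to_variant: dict of library barcode -> variant name.
    min_reads: hypothetical minimum number of supporting reads (an assumption).
    """
    assignments = {}
    for cell_id, reads in cell_reads.items():
        # Count only reads that match a barcode in the library lookup.
        counts = Counter(r for r in reads if r in barcode_to_variant)
        if len(counts) != 1:
            continue  # no known barcode, or more than one barcode in the cell
        (barcode, n_reads), = counts.items()
        if n_reads >= min_reads:
            assignments[cell_id] = barcode_to_variant[barcode]
    return assignments


# Toy example with made-up 12-base barcodes.
lookup = {"ACGTACGTACGT": "p.Arg190Gln", "TTGCATTGCATT": "WT"}
reads = {
    "cell_1": ["ACGTACGTACGT"] * 3,              # one barcode, three reads
    "cell_2": ["ACGTACGTACGT", "TTGCATTGCATT"],  # two barcodes: removed
    "cell_3": ["TTGCATTGCATT"],                  # too few reads: removed
}
print(assign_variants(reads, lookup))
```

Cells carrying two different library barcodes are discarded rather than resolved by majority vote, which mirrors the removal of multi-barcode cells described in the text.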
Figure 2. Morphological profiles of 1,767 LMNA variants identify variants that alter function, including aggregating and pathogenic variants.

(A) Morphological profile analysis workflow. (1) A matrix of ~6.6 million single cells each with 1,077 CellProfiler-derived features, a sequenced barcode and associated lamin A variant was generated. (2) Feature medians and earth-mover distances were computed among all cells expressing each variant; feature selection removed features that were highly correlated, had low variance, or were biologically irrelevant. (3) Two metrics were computed to quantify the overall effect of each variant. Morphological impact score was defined as the cosine distance between a variant morphological profile and the median synonymous profile. Distinguishability score was defined as the area under a receiver operating characteristic curve (AUROC) of a classifier trained to distinguish a variant from WT using single cell feature profiles. (4) Variant embeddings were visualized using UMAP following dimensionality reduction with PCA. Heatmaps of position by substitution effects were visualized for each feature.
(B) Boxplots of the number of profiled single cells per variant of the indicated class across both replicates.
(C) Median over cells of variant mean nuclear intensity in the mEGFP-lamin A channel for variants in both replicates colored by variant type as in (B), with Pearson’s r shown.
(D) Stacked barplot showing proportion of all features, selected features which generate variant profiles, and hit features (defined in (E)) by imaging channel, cellular compartment, or method of computing the feature. The number of features in each set is indicated. Color legends are shown below.
(E) Volcano plot of the median of LMNA feature median z-scores across all 1,767 variants (x-axis) against geometric-mean p-values (y-axis, Kolmogorov-Smirnov test against the WT distribution of feature values). Both the median effect (x-axis) and p-values (y-axis) are computed over all profiled variants. Red dashed lines show the Bonferroni-corrected p<0.01 threshold and median effect thresholds used to define hit features. Point area is proportional to the number of variants that pass thresholds. Coloring is by imaging channel, see (D) for color legend.
(F) Boxplots of morphological impact scores for variants of the indicated class. Aggregating controls indicate independently-measured variants that aggregated in >15% of HEK 293T cells42 (ClinVar variants of uncertain significance=VUS and likely pathogenic/pathogenic variants=P/LP). *** indicates Mann-Whitney p<0.001.
(G) Boxplots of distinguishability scores for variant classes from (F). *** indicates Mann-Whitney p<0.001.
(H) UMAP visualization of PCA-transformed morphological profiles for each variant. Circles indicate synonymous (green), missense (grey) and frameshift (purple) variants, triangles indicate ClinVar likely pathogenic/pathogenic (P/LP, red).
(I) UMAP visualization from (H). Colored symbols indicate independently measured fraction of cells (“+” symbols, >15%; “x” symbols, <15%) with lamin A aggregates in HEK 293T cells.
To summarize feature heterogeneity across cells, we computed the median and earth-mover distance (EMD)57,58 from WT for each feature for every variant (Figure 2A). We removed 745 features that had low variance, high correlation to other features, or had EMDs with low reproducibility58, leaving 332 “selected features.” Selected features were more balanced than all features across channels, compartments, and types of measurement; they comprise each variant’s “morphological profile” used to score, visualize, and cluster variants (Figure 2D). Independently, we identified 407 “hit” features that were the most consistently perturbed across variants (Figures 2E and S2E, Table S1). Hit features were overwhelmingly derived from the mEGFP-lamin A channel and were used to interpret variant effects.
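To make the per-feature summaries concrete, the sketch below computes a median and a one-dimensional earth-mover distance from WT for a single feature. It is illustrative only: the function names are hypothetical, the EMD here is the plain unsigned Wasserstein-1 distance between empirical distributions, and any scaling or signing the authors applied is not reproduced.

```python
from bisect import bisect_right
from statistics import median


def emd_1d(sample_a, sample_b):
    """Earth-mover (Wasserstein-1) distance between two 1-D samples.

    Computed as the area between the two empirical CDFs.
    """
    a, b = sorted(sample_a), sorted(sample_b)
    grid = sorted(set(a) | set(b))
    total = 0.0
    for left, right in zip(grid, grid[1:]):
        cdf_a = bisect_right(a, left) / len(a)
        cdf_b = bisect_right(b, left) / len(b)
        total += abs(cdf_a - cdf_b) * (right - left)
    return total


def summarize_feature(variant_cells, wt_cells):
    """Return (median, EMD from WT) for one feature of one variant."""
    return median(variant_cells), emd_1d(variant_cells, wt_cells)


# Toy values: the variant distribution is the WT distribution shifted by 1.
wt = [0.0, 1.0, 2.0, 3.0]
variant = [1.0, 2.0, 3.0, 4.0]
print(summarize_feature(variant, wt))
```

The median captures a shift in the typical cell, whereas the EMD also responds to changes in the shape of the single-cell distribution, such as a subpopulation of affected cells.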
Morphological profiles detect variant aggregation and summarize impact and pathogenicity
We summarized the dissimilarity between each variant’s profile and WT via two metrics: first, a morphological impact score describing the magnitude of a variant’s effect on its morphological profile (Figure 2F, Table S2); and second, a distinguishability score describing the performance of a gradient-boosted decision tree model trained for every variant. Each model is trained to discriminate between single cells expressing the variant and cells expressing WT based on all features. The distinguishability score for a variant is defined as the area under the receiver operating characteristic curve (AUROC) computed on a test set of variant and WT cell features (Figures 2G and S2F, Table S2). Morphological impact (Pearson’s r=0.88) and distinguishability scores (r=0.92) were highly concordant between replicates (Figures S2G and S2H). Both scores were significantly lower for synonymous variants than for missense and frameshift variants, with distinguishability scores better discriminating different classes of variants (Figures 2F and 2G). LMNA variant features, morphological profiles, and scores can be explored at https://visseq.gs.washington.edu/.
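The two summary scores can be illustrated with a short sketch. Following the definition in Figure 2A, the impact score is taken as the cosine distance between a variant profile and the median synonymous profile. The AUROC is computed from classifier scores that are assumed to be already available; training the gradient-boosted model is omitted, and all function names and toy numbers are hypothetical.

```python
from math import sqrt
from statistics import median


def cosine_distance(u, v):
    """1 - cosine similarity of two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = sqrt(sum(x * x for x in u))
    norm_v = sqrt(sum(y * y for y in v))
    return 1.0 - dot / (norm_u * norm_v)


def median_profile(profiles):
    """Per-feature median across a list of profiles."""
    return [median(column) for column in zip(*profiles)]


def impact_score(variant_profile, synonymous_profiles):
    """Cosine distance from the median synonymous profile."""
    return cosine_distance(variant_profile, median_profile(synonymous_profiles))


def auroc(variant_scores, wt_scores):
    """AUROC as the probability that a variant cell outscores a WT cell.

    Ties count as one half. Quadratic in the number of cells, so it is
    suitable for illustration only.
    """
    wins = sum(
        1.0 if v > w else 0.5 if v == w else 0.0
        for v in variant_scores
        for w in wt_scores
    )
    return wins / (len(variant_scores) * len(wt_scores))


# Toy profiles with two features, and toy held-out classifier scores.
synonymous = [[1.0, 0.1], [0.9, 0.0], [1.1, -0.1]]
print(impact_score([0.2, 1.0], synonymous))
print(auroc([0.9, 0.7, 0.6], [0.2, 0.4, 0.7]))
```

An AUROC near 0.5 means single cells expressing the variant cannot be told apart from WT cells, while values near 1 indicate a consistently distinct morphology.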
Next, we validated the morphological profiles and summary scores using control aggregating and pathogenic variants. The 14 known aggregating variants42 segregated from synonymous variants on the UMAP visualization of variant profiles, and had significantly higher morphological impact and distinguishability scores, illustrating the power of unbiased morphological profiling to detect this known lamin A phenotype (Mann-Whitney U p<0.001, Figures 2F–2I). All pathogenic (P) or likely pathogenic (LP) ClinVar59 variants except p.Leu204Arg, a single-submitter variant in an individual with dilated cardiomyopathy, occupied regions of the UMAP distinct from nearly all synonymous variants and had markedly elevated morphological impact and distinguishability scores compared to synonymous variants and variants of uncertain significance (VUS, Mann-Whitney U p<0.001). Impact scores computed with both median and EMD features modestly outperformed EMDs alone but substantially outperformed medians alone at separating aggregating or clinical control variants (Figures S2I and S2J). Thus, unbiased summarization of morphological profiles at both variant- and single-cell resolution accurately differentiated aggregating and pathogenic control variants from synonymous variants (distinguishability score AUC=0.99 for aggregating controls and 1 for ClinVar controls, impact score AUC=0.94 and 0.88, Figure S2J).
LMNA variant morphological profiles form clusters with explainable effects
We used morphological profile similarity to group variants into ten clusters (Figures 3A and S3A). Within-cluster profiles were more similar than between-cluster profiles, and clusters were robust to resampling and reclustering (Figures S3B and S3C). To understand the effects of variants in each cluster, we focused on four hit features that are interpretable and highly perturbed. These “landmark features” are: lamin A intensity in the nucleus, lamin A intensity at the nuclear boundary, lamin A nuclear granularity, and nuclear circularity (Figures S2E, S3D, and S4A, Tables S1 and S2). 76 (90%) synonymous variants occupied clusters 5, 6, and 7, which, along with 519 (31%) missense variants, had little effect on landmark features and were similar to WT (Figures 3A, 3B, S3A, and S3B). The known aggregating variants were exclusively present in clusters 8, 9, and 10, along with 380 (24%) other missense variants42 (Figure 3A). Variants in these clusters had low nuclear and boundary lamin A abundance and high granularity, presumably because of relocalization from the nuclear boundary to aggregates (Figures 3B and S4B). These variants also exhibited the largest decrease in nuclear circularity, leading to both more elliptical and more concave nuclei and indicating that these variants abrogate lamin A’s ability to maintain nuclear shape43,46,60 (Figures 3B and S4A–S4C). The pathogenic variant p.Asn195Lys50, along with six other pathogenic variants (47% total), were also found in these clusters (Figures 2H and 3A). To better understand the effects of variants on nuclear boundary abundance, we interrogated lamin A intensity as a function of distance from the nuclear center (Figures 3C and S4D). In clusters 8 and 9, variants consistently reduced laminar abundance, but variants in cluster 10 had more heterogeneous effects.
Figure 3. Clustering LMNA variant morphological profiles yields rich, interpretable phenotypes.

(A) Center: UMAP of LMNA variant profiles (Figure 2H) colored by Louvain128-derived clusters. Periphery: images of variants and description of nuclear morphology, lamin A abundance and lamin A localization associated with each cluster. Silver dashed lines depict the borders of cells, blue=DAPI, green=lamin A-mEGFP. Scale bar indicates 5 μm. Graphical representations of each cluster indicate nuclear envelope morphology (black), lamin A localization and abundance (green) and nuclear circularization (red arrows). UMAP point and text color indicate cluster identity, depicted in (B).
(B) Heatmap of landmark feature medians and EMDs for each cluster, colored by z-score versus the synonymous variant distribution. The nuclear mEGFP-lamin A intensity feature measures the average lamin A pixel intensity in the nucleus; the boundary mEGFP-lamin A intensity feature measures the average lamin A pixel intensity within two pixels of the nuclear boundary. The nuclear mEGFP-lamin A granularity 1 feature is related to the presence of lamin A aggregates, with high values indicating aggregates. The nuclear circularity feature is the normalized ratio of nuclear area to nuclear perimeter squared, with high values indicating a circular nucleus.
(C) Plot of the mEGFP-lamin A total intensity in radial deciles of distance from the centroid of the nucleus, normalized to the DAPI total intensity in each decile. Bin 1 corresponds to the innermost radial decile and bin 10 to the outermost (see diagram below plot). mEGFP-laminA/DAPI median intensity for clusters 1, 3, 4, and 8 z-scored versus the synonymous variant distribution are shown. The remaining clusters are shown in Figure S4C.
The 454 (28%) missense variants in clusters 2, 3, and 4 had smaller overall decreases in nuclear and boundary abundance. Cluster 3 variants produced aggregates of varied sizes and were lower in abundance across the nucleus, whereas cluster 4 variants did not aggregate and were lower in abundance at the nuclear boundary (Figures 3B, 3C, S4B, and S4D). The pathogenic and known partially-aggregating42 variant p.Arg190Gln, along with five other pathogenic variants (40% total), were also found in these clusters (Figures 2H and 3A). Surprisingly, the 292 (18%) missense variants grouped in cluster 1 drove large increases in nuclear circularity and an increase in lamin abundance in the nuclear interior, suggesting that these variants, including p.Glu228Val, a pathogenic variant associated with multiple overlapping clinical phenotypes61, may exert their effects via a gain-of-function mechanism (Figures 3B and 3C).
In addition to defining clusters based on variant morphological profiles, we explored the relationship between landmark features (Figure S3D). 44% of profiled missense variants perturbed nuclear circularity whereas only 20% drove changes in nuclear lamin A granularity, despite aggregation being readily appreciable in images of patient-derived cells42,55,62. Thus, VIS-seq grouped variants by their effects on lamin A intensity, localization, aggregation, nuclear shape and other features, and revealed a class of variants that increase nuclear circularization.
VIS-seq reveals the structural basis of LMNA variant effects
Next, we compared lamin A morphological profiles to prior crystal structures63–65 and models66 to understand the structural basis for the segregation of variants into clusters. Lamin A forms a soluble dimer in the cytoplasm that assembles into a multimeric filament and creates a meshwork at the lamina. The region of lamin A we mutagenized spans the rod domain comprising two α-helical coiled coil subdomains, coils 1B and 2A, separated by the linker 12 subdomain (Figure 4A). In other intermediate filament protein structures, linker 12 is flexible and adopts non-helical conformations, but in crystal structures of partial lamin A filaments, linker 12 is α-helical63,64,67. An alternate model suggests that the functional state of linker 12 is flexible and non-α-helical66,68. To explore the relationship between these lamin A structures and function, we created variant effect maps for each landmark feature alongside UMAPs colored by feature scores (Figures 4A, 4B, and S4E–S4G). These heatmaps revealed differences between variant effects at proposed coil and linker positions and profound effects of proline substitutions, which disrupt α-helices. Moreover, coil and linker substitutions had similar morphological impact scores but different morphological effects (Figures S5A–S5C). At coil positions, missense substitutions were enriched in clusters 3 and 8, characterized by aggregation, and cluster 4, characterized by low laminar abundance (Fisher’s exact test q<0.01–0.001, Figure S5B). However, at linker 12 positions, missense substitutions were strongly enriched in cluster 1, characterized by increased circularity and preserved lamin A abundance (Fisher’s exact test q<0.001, Figure S5B). Proline substitutions had the highest morphological impact scores of all, but their morphological effects depended on their location (Mann-Whitney U p<0.001, Figures 4A, 4C, S4E–S4G, and S5D).
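The enrichment tests reported above can be illustrated with a minimal sketch of a one-sided Fisher's exact test on a 2×2 table, followed by a multiple-testing correction. Several choices here are assumptions rather than details from the text: the one-sided alternative, the use of the Benjamini-Hochberg procedure to obtain q-values, the function names, and the toy counts.

```python
from math import comb


def fisher_enrichment_p(in_both, group_only, cluster_only, neither):
    """One-sided Fisher's exact p-value for over-representation.

    in_both:      variants in the subdomain and in the cluster
    group_only:   variants in the subdomain but not in the cluster
    cluster_only: variants in the cluster but not in the subdomain
    neither:      all remaining variants
    """
    group = in_both + group_only
    cluster = in_both + cluster_only
    total = group + cluster_only + neither
    # Hypergeometric upper tail: P(overlap >= in_both).
    tail = sum(
        comb(group, k) * comb(total - group, cluster - k)
        for k in range(in_both, min(group, cluster) + 1)
    )
    return tail / comb(total, cluster)


def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    qvalues = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):  # walk from the largest p-value down
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        qvalues[i] = running_min
    return qvalues


# Toy counts: is one subdomain over-represented in each of three clusters?
tables = [(30, 10, 20, 140), (5, 35, 45, 115), (8, 32, 40, 120)]
pvals = [fisher_enrichment_p(*t) for t in tables]
print(benjamini_hochberg(pvals))
```

Testing each subdomain against each cluster produces many comparisons, which is why adjusted q-values rather than raw p-values are the appropriate quantity to threshold.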
To further explore the effects of proline substitutions, we clustered them by their morphological profiles, yielding linker, linker-proximal, and α-helical position groups (Figure S5E, Table S2). Of linker 12 positions, 17 of 19 (89%) were in the linker group, and all positions in the α-helical group were in the coil subdomains (Figure 4C). By revealing that variants at flexible linker positions, linker-proximal positions, and α-helical coil positions can have distinct effects, our results support the alternate structural model66.
Figure 4. Disruption of lamin A multimerization dictates effects on aggregation and abundance.

(A) Map of lamin A missense variant effects on the nuclear circularity feature (grey boxes=missing variants, black dots=synonymous substitutions). Blue to red coloring indicates feature z-score versus the synonymous variant distribution. Above the map, positions are annotated by their participation in multimer contacts (first row, defined below), α-helical average substitution profile (second row, defined below and in Figure S5F), and proline-substitution profile (third row, defined below and in Figure S5E). Linker or coil subdomains are shown directly above the heatmap63. Position-averaged scores are below and amino acid substitution marginal feature z-score distributions are on the right (points depict variants; black lines depict medians).
(B) UMAP of variant profiles (Figure 2H) colored by nuclear circularity score as in (A).
(C) Proline substitutions are shown as triangles on the UMAP shown in (Figure 2H), colored by clustering of the effects of proline substitutions as defined in Figure S5E with legend above.
(D) The number of proline substitutions is plotted for each subdomain63 colored as in (C) with legend above. *** indicates χ2 p<0.001.
(E) The number of α-helical positions (defined in Figure S5E) plotted by subdomain63 and colored based on the effect of variants on aggregation or abundance at the position (defined in Figure S5F), with legend above. ** indicates χ2 p<0.01.
(F) The number of α-helical positions (defined in Figure S5E) plotted by their participation in structural contacts ((G), see Methods), colored as in (E). Color legend shown above. ** indicates χ2 p<0.01.
(G) A11 tetramer structure 6JLB63 (bottom) and the A22 tetramer structure64 (top) with positions colored as in (E) (magenta=aggregation-sensitive, light blue=abundance-sensitive, dark blue=low impact, grey=position not profiled, black=position not clustered).
(H) Graphical representation of lamin A processing, localization, aggregation, and degradation, annotated with their relationship to dimerization, multimerization and phenotypic effects as well as variant clusters.
Figure 5. Morphological profiles of >1,200 PTEN variants in iPS cells and NGN2-induced neuron-like cells.

(A) Image of PTEN library at positions 112–172 expressed in human PTEN-KO iPS cells carrying dox-inducible NGN2 (left) and neuron-like cells 7 days after induction of NGN2 (right). MAP2=anti-MAP2 antibody, ConA=concanavalin A. Scale bar indicates 10 μm.
(B) UMAP visualization of 1,228 PTEN variant morphological profiles collected in iPS cells (left) and NGN2-induced neurons (right). Point color indicates variant type: synonymous (green), missense (grey), 3-nt deletions (blue), nonsense (red), and frameshift (purple).
(C,E,G) Missense variant effect maps for iPS cell PTEN landmark features (grey boxes=missing variants, black dots=synonymous substitutions). Blue to red coloring indicates feature z-score versus the synonymous variant distribution. Linker or coil subdomains are shown directly above the heatmap. Position-averaged scores are below and amino acid substitution marginal feature z-score distributions are on the right (points depict variants; black lines depict medians).
(D,F,H) UMAPs of PTEN variant morphological profiles in iPS cells (left) and NGN2-induced neurons (right) colored by landmark features. Blue indicates low and red indicates high z-score versus the synonymous variant distribution, as in (C,E,G).
(I) Barplot of fraction of variants that perturb landmark features in iPS cells and NGN2-induced neurons. For each variant, a feature was considered perturbed if |z-score|>2.5 and KS-test Bonferroni-corrected p<0.01. *** indicates χ2 p<0.001.
(J) Graphic depicting the relationship between PTEN localization (cell membrane, cytoplasm, or nucleus), activity and protein abundance, annotated with changes to landmark features.
We further investigated coil positions. Helices in the coiled coils can make contacts with adjacent helices across a dimer-forming interface in coil 1B and multimer-forming interfaces in both coil 1B and 2A63,64. Prior work has linked aggregating or pathogenic variants to dimer-forming contacts42,69, or examined the effect of a few variants on multimer formation63. To systematically probe the nature of these contacts, we clustered α-helical positions based on the average morphological profile across all substitutions at each position (Figure S5F, Table S2). We identified three groups of positions where substitutions were either aggregation-sensitive, abundance-sensitive or low impact, with corresponding significant differences in morphological impact scores and variant cluster segregation (Mann-Whitney U p<0.001, Figures S5G–S5J). The disparate morphologies characterizing substitutions in each of the three groups of positions were dictated by whether variants occurred at dimer or multimer interfaces (Figures 4E–4G and S5H). For example, 67% of dimer-facing positions, which occur only in coil 1B, were in the abundance-sensitive group, whereas 100% of multimer-facing positions, which occur in both coil 1B and 2A, were in the aggregation-sensitive group. 86% of non-interacting residues, as well as the synonymous variant average profile, were in the low impact group (Figures 4E–4G). The proportion of amino acid substitutions also differed significantly across clusters. For example, substitutions to acidic residues, enriched in variant clusters 8 and 9, drove high impact and distinguishability scores and low abundance and circularity at α-helical positions. Substitutions to aromatic amino acids, enriched in variant clusters 1, 3, 9, and 10, drove aggregation, especially at aggregation-sensitive α-helical positions (Fisher’s exact test, q<0.001–0.05, Figures S5K and S5L).
Thus, unsupervised clustering of LMNA variant morphological profiles yielded insight into lamin A structure. The effects of variants in linker 12, enriched in the high-circularity and preserved nuclear abundance variant cluster 1, suggest that linker 12 is not a static coiled coil but, as postulated by the alternate structural model, is flexible66 (Fisher’s exact test q<0.001, Figures 4H and S5J). The effects of variants at α-helical positions were largely dictated by their location in dimer or multimer interfaces. Here, disruption of dimer interfaces at abundance-sensitive positions, disproportionately by low abundance variants in cluster 4, apparently did not prevent formation of lamin A multimers needed to circularize the nucleus (Fisher’s exact test q<0.001). Disruption of multimer interfaces at aggregation-sensitive positions, disproportionately by variants in clusters 3, 8, and 9, was characterized by lamin A aggregation and profound loss of nuclear circularity (Fisher’s exact test q<0.01–0.001). Rather than a simple continuum of loss of function based on protein folding or stability, disruption of different aspects of lamin A structure and assembly led to diverse effects. VIS-seq enabled us to trace these effects as they cascade from molecular phenotypes like protein abundance and localization to cellular phenotypes like nuclear morphology.
Morphological profiling of PTEN variants in iPS cells and NGN2-induced neuron-like cells
We next focused on Phosphatase and TENsin homolog (PTEN), a lipid and protein phosphatase that negatively regulates the PI3K-AKT-mTOR pathway and plays roles in cell growth, cell migration, and genome stability70–75. Pathogenic germline PTEN variants give rise to a spectrum of autosomal dominant tumor predisposition phenotypes76 and are a cause of autism spectrum disorder, often co-occurring with macrocephaly and developmental delay77. Somatic PTEN mutations occur frequently in cancer70,78. Patients with PTEN variants can present with a complex mix of these phenotypes, and the relationship between changes in PTEN molecular function and disease remains unclear79,80. To explore this relationship, we collected morphological profiles for 1,228 PTEN catalytic domain synonymous, missense, nonsense, 3-nt deletion and frameshift variants in >2.8 million human iPS cells and >2 million NGN2-induced neuron-like cells (hereafter NGN2-induced neurons) in two replicate PTEN-knockout clonal cell lines81 (Figures 5A, S1E, and S6A–S6E). iPS cells were imaged with dyes staining chromatin (DAPI), microfilaments (phalloidin), and endoplasmic reticulum (concanavalin-A)14 as well as mEGFP-tagged PTEN and an antibody staining phospho-Thr308-AKT (pAKT) which reports PTEN lipid phosphatase activity82. NGN2-induced neurons were imaged with the same dyes and antibodies except that a marker of neuronal differentiation83 (anti-MAP2 antibody) replaced phalloidin.
Feature selection led to variant morphological profiles comprising 1,174 feature medians or EMDs for iPS cells and 1,036 feature medians or EMDs57,58 for NGN2-induced neurons. Due to the difficulty of robustly segmenting neurites, all NGN2-induced neuron features were derived from segmented nuclei and cell bodies. Hit features in both cell types were disproportionately from mEGFP-PTEN and pAKT channels (Figures S6F–S6I, Table S3). UMAP visualization of the iPS cell and NGN2-induced neuron variant profiles demonstrated clear separation of synonymous from 3-nt deletion and nonsense variants (Figure 5B). Likewise, PTEN variant morphological impact and distinguishability scores were higher for 3-nt deletion, frameshift, and nonsense variants than for synonymous and missense variants in both iPS cells and NGN2-induced neurons (Mann-Whitney U p<0.001, Figures S6J and S6K), and replicate impact scores were strongly correlated (Pearson’s r=0.88, 0.89, Figure S6L, Table S4). Impact (Pearson’s r=0.95) and distinguishability scores (r=0.90) were highly concordant between iPS cells and NGN2-induced neurons, suggesting that variants tended to be similarly disruptive in both cell types (Figure S6M). PTEN variant features and morphological profiles can be explored at https://visseq.gs.washington.edu/.
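The two per-feature summaries named above, medians and earth mover's distances (EMDs), can be illustrated with a minimal sketch. It assumes that each feature is summarized per variant by comparing that variant's single-cell values with a reference population, for example cells expressing synonymous variants. The function names and the choice of reference are hypothetical and are not taken from the authors' pipeline.

```python
from bisect import bisect_right
from statistics import median


def emd_1d(sample, reference):
    """One-dimensional earth mover's distance between two sets of values.

    Computed as the area between the two empirical cumulative
    distribution functions, which equals the 1D EMD.
    """
    a, b = sorted(sample), sorted(reference)
    if not a or not b:
        raise ValueError("both populations must contain at least one cell")
    grid = sorted(set(a) | set(b))
    total = 0.0
    for left, right in zip(grid, grid[1:]):
        cdf_a = bisect_right(a, left) / len(a)
        cdf_b = bisect_right(b, left) / len(b)
        total += abs(cdf_a - cdf_b) * (right - left)
    return total


def summarize_feature(variant_cells, reference_cells):
    """Return the two summaries for one feature of one variant (hypothetical layout)."""
    return {
        "median": median(variant_cells),
        "emd": emd_1d(variant_cells, reference_cells),
    }
```

The median captures a shift in the typical cell, whereas the EMD also responds to changes in the shape of the single-cell distribution, such as a subpopulation of strongly affected cells.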
PTEN variants alter abundance, lipid phosphatase activity, and localization in a cell type-specific manner
To explore variant-specific differences between cell types, we focused on three interpretable, highly-perturbed landmark features: Steady-state PTEN abundance as reflected by mEGFP-PTEN intensity, lipid phosphatase activity as reflected by pAKT intensity, and nucleocytoplasmic localization as reflected by pixel-level DAPI-PTEN correlation (Figures 5C–5H, S6F–S6I, and S7A–S7D, Tables S3 and S4). Both PTEN abundance and lipid phosphatase activity exhibited a steep gradient across variants in the UMAP in both iPS cells and NGN2-induced neurons (Figures 5D and 5F). Our PTEN abundance and activity measurements were correlated with previous measurements of activity in a yeast fitness assay (Pearson’s r=−0.84 versus iPS cell pAKT intensity and −0.86 versus NGN2-induced neuron) and abundance in a human cell line (r=0.78 versus iPS cell mEGFP-PTEN intensity, 0.77 versus NGN2-induced neuron, Figure S7E)5,84. As expected, abundance-sensitive PTEN positions were concentrated in the core of the protein71 (e.g., β-sheet positions 118–121, α-helix positions 137–141, Figures 5C and S7F–S7H). Some variants had cell type-specific abundance effects; 43% of variants altered PTEN abundance in iPS cells versus 52% in NGN2-induced neurons (χ2 p<0.001, Figures 5I and S8A). Activity-sensitive positions were concentrated at the catalytic surface (e.g., P-loop71 positions 123–130 and 171) and in second/third-shell positions (e.g., 155 and 160, Figures 5E and S7G). Unlike for abundance, variants appeared to have similar effects on activity in each cell type (Figures 5I and S8A).
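The landmark-feature scoring and the perturbation call behind Figure 5I (|z-score|>2.5 and KS-test Bonferroni-corrected p<0.01) can be sketched as follows. This is an illustration under stated assumptions: the z-score is taken against the mean and standard deviation of the synonymous variant values, the KS-test p-value is supplied precomputed, and the Bonferroni correction multiplies the p-value by the number of tests. All function names are hypothetical.

```python
from statistics import mean, stdev


def z_versus_synonymous(value, synonymous_values):
    """z-score of one variant's feature value against the synonymous distribution."""
    return (value - mean(synonymous_values)) / stdev(synonymous_values)


def is_perturbed(z, ks_p, n_tests, z_cut=2.5, alpha=0.01):
    """Apply both thresholds: effect size and Bonferroni-corrected KS p-value."""
    p_adjusted = min(1.0, ks_p * n_tests)  # Bonferroni correction (assumed form)
    return abs(z) > z_cut and p_adjusted < alpha


def fraction_perturbed(calls):
    """Fraction of variants called perturbed for one feature in one cell type."""
    return sum(calls) / len(calls)
```

Requiring both thresholds keeps a variant from being called perturbed on the basis of a large but noisy shift, or of a statistically significant but negligible one.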
The third landmark feature, DAPI-PTEN correlation, is a sensitive readout of nucleocytoplasmic localization (Figures 5G and S8B). PTEN is found in the nucleus, where it dephosphorylates nuclear proteins and may have non-enzymatic functions72,74,75,80. The DAPI-PTEN correlation feature formed a gradient across variants in the UMAP, with less consistency than abundance or activity (Figure 5H). 67% of variants altered DAPI-PTEN correlation in iPS cells versus 48% in NGN2-induced neurons (χ2 p<0.001, Figures 5G, 5I, S7C, and S8C). However, PTEN nucleocytoplasmic localization was related to changes in both abundance and lipid phosphatase activity: low abundance variants had low DAPI-PTEN correlation levels and thus cytoplasmic localization compared to WT, whereas low lipid phosphatase activity variants had high DAPI-PTEN correlation levels and thus nuclear localization (Figures 5C–5H and S8D). We used kernel ridge regression to model DAPI-PTEN correlation as a function of abundance and activity and computed the residual as a mislocalization score (see Methods). This score quantified each variant’s effect on localization relative to WT independent of abundance and activity, and was reproducible (Pearson’s r=0.68, Figures S8E and S8F). The mislocalization score revealed variants that shifted PTEN into the nucleus without altering pAKT levels. We hypothesized that such variants would occur at the membrane-binding positions 163 and 16485 and the substrate specificity-determining positions 167 and 16886. Indeed, at these positions, 43 (72%) of the profiled variants had elevated mislocalization scores yet WT-like lipid phosphatase activity. By contrast, only 18 (27%) of the variants at four adjacent activity-sensitive positions (165, 166, 170, and 171) had elevated mislocalization scores (Figures S8E, S8G, and S8H). Thus, analyzing the relationships between multiple interpretable VIS-seq features enabled us to isolate variant effects on specific phenotypes.
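The residual-based mislocalization score described above can be sketched with a closed-form kernel ridge regression. This is a minimal illustration, not the authors' implementation (see Methods for that): the RBF kernel, the `gamma` and `ridge` values, the use of in-sample residuals, and the assumption that inputs are already z-scored are all choices made here for the example.

```python
import numpy as np


def rbf_kernel(a, b, gamma):
    """Gaussian (RBF) kernel matrix between the rows of a and the rows of b."""
    sq_dist = ((a[:, None, :] - b[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-gamma * sq_dist)


def mislocalization_scores(abundance, activity, localization, gamma=1.0, ridge=1.0):
    """Residual of localization after regressing it on abundance and activity.

    Kernel ridge regression solves (K + ridge * I) alpha = y, and the
    fitted values are K @ alpha. The residual y - K @ alpha is the part of
    the localization feature that abundance and activity do not explain.
    """
    X = np.column_stack([abundance, activity]).astype(float)
    y = np.asarray(localization, dtype=float)
    K = rbf_kernel(X, X, gamma)
    alpha = np.linalg.solve(K + ridge * np.eye(len(y)), y)
    return y - K @ alpha
```

A variant with a large positive residual is more nuclear than its abundance and activity would predict, which is the pattern described for positions 163, 164, 167, and 168.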
Overall, PTEN missense variants impacted diverse combinations of landmark features in iPS cells and NGN2-induced neurons, most frequently altering all three features in both cell types (20%, Figure S8I). Variants that reduced lipid phosphatase activity, concentrated at catalytically important positions, were more likely to be distinguishable in iPS cells whereas variants that reduced abundance, concentrated in buried positions, were more likely to be distinguishable in NGN2-induced neurons (Figures S8J and S8K). Thus, morphological profiles enabled us to dissect the relationship between PTEN abundance, lipid phosphatase activity, and nucleocytoplasmic localization in a cell-type-specific manner (Figure 5J).
PTEN variant morphological profiles are validated by clinical phenotypes and reveal disease-specific variant effects
We curated 5 likely benign (LB) and 62 likely pathogenic or pathogenic (LP/P) missense variants from ClinVar59, and these clinical control variants were separated on the UMAP in both iPS cells and NGN2-induced neurons (Figure S9A). LP/P variant morphological impact scores were significantly elevated as compared to both synonymous and LB/B variant impact scores (Mann-Whitney U p<0.001, Figure S9B). To explore the relationship between PTEN variant morphological profiles and autism spectrum disorder/developmental delay (ASD/DD), PTEN Hamartoma Tumor Syndrome (PHTS), or somatic cancer, we curated 66 germline variants in 541 probands for PHTS and ASD/DD phenotypes (Table S5) as well as 22 variants from the COSMIC database87 (Figure 6A, Table S4). 31 variants in this combined set were associated with only ASD/DD, PHTS or somatic cancer, and 41 were associated with multiple clinical phenotypes. Distinguishability scores separated P/LP variants or curated variants associated with all three clinical phenotypes from LB and gnomAD88 controls better than yeast fitness or human cell abundance scores (Figures 6B and S9C). Additionally, morphological impact and distinguishability scores were significantly increased for variants associated with each clinical phenotype when compared to synonymous or gnomAD88 variants (Mann-Whitney U p<0.001, Figures 6C, S9D, and S9E). However, variants associated with ASD/DD had significantly lower morphological impact scores compared to variants associated with PHTS or somatic cancer (Mann-Whitney U p<0.01–0.001). Thus, morphological profiles were superior to single phenotype measurements in capturing overall PTEN missense variant pathogenicity and in relating variants to clinical phenotypes.
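The comparison of scores as univariate zero-shot classifiers can be illustrated with a rank-based area under the ROC curve, in which a score is used directly, without training, to order case variants above control variants. This is a minimal sketch; the function name is hypothetical, and scores for which lower values indicate damage (for example abundance) are assumed to be negated before use.

```python
def roc_auc(case_scores, control_scores):
    """Area under the ROC curve for a single score.

    Equals the probability that a randomly chosen case variant scores
    higher than a randomly chosen control variant, counting ties as half.
    """
    if not case_scores or not control_scores:
        raise ValueError("both groups must contain at least one variant")
    wins = 0.0
    for case in case_scores:
        for control in control_scores:
            if case > control:
                wins += 1.0
            elif case == control:
                wins += 0.5
    return wins / (len(case_scores) * len(control_scores))
```

An AUC of 0.5 corresponds to a score that does not separate the two groups, and 1.0 to a score that ranks every case variant above every control.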
Figure 6. PTEN variant morphological profiles predict clinical phenotypes and reveal disease-specific variant effects.

(A) Venn diagram of clinical phenotypes associated with curated variants (see Methods and Table S5). Each clinical phenotype was associated with a variant if it occurred in at least one proband with that variant.
(B) Receiver operating characteristic (ROC) curves for univariate zero-shot models predicting ClinVar pathogenicity or each clinical phenotype (ASD = autism spectrum disorder, DD = developmental delay, PHTS = PTEN hamartoma tumor syndrome; see Methods for curation criteria) from iPS cell and NGN2-induced neuron-like cell distinguishability scores (this publication, solid lines), yeast fitness scores84 (dashed line), VAMPseq scores5 (dashed line), or AlphaMissense91 or EVE92 scores (dot-dashed lines). Area under the curve (AUC) scores are shown for each model.
(C) Morphological impact scores for PTEN variants in iPS cells plotted by association with clinical phenotypes. gnomAD v4.1 (blue) and synonymous (green) variants are plotted for comparison. *** indicates Mann-Whitney U p<0.001 and ** indicates p<0.01.
(D) UMAPs of iPS cell (left) and NGN2-induced neuron (right) PTEN variant profiles highlighting curated variants from (B). Triangles indicate association with clinical phenotypes (colors match (D)). gnomAD v4.1 (blue) variants are also plotted. Circles represent other variants, colored green (synonymous) or grey (non-synonymous).
(E) Mislocalization scores (see Methods) for PTEN variants in iPS cells plotted by association with clinical phenotypes. gnomAD v4.1 (blue) and synonymous (green) variants are plotted for comparison. *** indicates Mann-Whitney U p<0.001 and ** indicates p<0.01.
(F) Volcano plots of median PTEN variant effects on iPS cell feature median z-scores (x-axis) against geometric-mean KS-test p-values (y-axis, tested against WT feature distributions) for variants associated with different clinical phenotypes (see (B)). Points are features with area proportional to the number of variants that pass thresholds and colored by imaging channel (see right for legend). iPS cell landmark features are highlighted. Red dashed lines show the Bonferroni-corrected p<0.01 threshold and median effect thresholds.
(G) Macro-averaged multiclass ROC curve for PHTS-associated, ASD/DD-associated, and gnomAD or ClinVar LB/B variants. Curves show one-vs-rest performance for univariate zero-shot models from (B) as well as multivariate support vector classifier models trained on landmark features measured in iPS cells or NGN2-induced neurons (this publication, dotted line). Corresponding macro-averaged AUCs are shown.
We hypothesized that, rather than being points on a single dimension of pathogenicity, ASD/DD, PHTS, and cancer-associated variants might instead perturb distinct molecular and cellular phenotypes. Indeed, variants associated with each clinical phenotype occupied distinct regions of the UMAP, implying that they had disparate molecular and cellular effects (Figure 6D). ASD/DD-associated variants have been previously described as hypomorphic89,90, and they do have lower overall iPS cell morphological impact as well as milder abundance and lipid phosphatase activity phenotypes compared to PHTS and somatic cancer-associated variants (Figures S9F and S9G). However, ASD/DD-associated variants also shifted aberrantly into the nucleus in iPS cells, unlike PHTS and somatic cancer-associated variants (Mann-Whitney U p<0.01, Figure 6E). DAPI-PTEN correlation was the hit feature with the largest effect across ASD/DD-associated variants, unlike for PHTS-associated variants (Figures 6F and S9H). ASD/DD-associated variants perturbed mEGFP-PTEN channel features most strongly, whereas PHTS-associated variants perturbed pAKT channel features most strongly (Figures 6F and S9H, Table S3). Based on these observations, we trained a linear support vector classifier that could accurately discriminate between gnomAD control, PHTS-associated, and ASD/DD-associated variants using the landmark features (iPS cell landmarks macro-averaged five-fold cross-validated AUC=0.92 versus 0.76 for the yeast fitness assay and 0.62 for the human cell abundance assay, Figure 6G). Thus, we dissected the relationship between PTEN variant molecular, cellular, and clinical phenotypes, identifying molecular phenotypes that differentiate ASD/DD- from PHTS-associated variants and revealing that ASD/DD-associated variants cannot be understood solely as hypomorphic.
Variant effect predictors perform poorly on many molecular and cellular phenotypes
State-of-the-art variant effect predictors leverage conservation and protein structure to make accurate inferences of overall variant effects. We compared three leading predictors, AlphaMissense91, EVE92, and REVEL93, to our morphological profiles. Variant effect predictions correlated poorly with lamin A morphological impact scores as compared to impact score replicability (Spearman’s ρ=0.27 to 0.35, Figures S2G, S10A, and S10B) and more strongly with PTEN impact scores (Spearman’s ρ=0.69 to 0.84 in iPS cells and 0.64 to 0.81 in NGN2-induced neurons, Figures S6L and S10C–S10F). Similar correlations were observed between other variant effect data and predictions91–94.
The correlation between individual lamin A or PTEN landmark features and predictions was more variable. For example, predictions were moderately correlated with lamin A variant abundance (Spearman’s ρ=−0.35 to −0.46) but poorly correlated with lamin A variant aggregation (ρ=0.23 to 0.25) and nuclear circularity (ρ=−0.26 to −0.37, Figure S10A). Similarly, predictions were moderately correlated with PTEN lipid phosphatase activity (ρ=0.57 to 0.79) but poorly correlated with PTEN abundance (ρ=−0.19 to −0.52) or iPS cell PTEN localization (ρ=0.05 to 0.32, Figures S10C and S10E). Predictions performed similarly to VIS-seq distinguishability scores in separating PTEN pathogenic variants from gnomAD controls (Figures 6B and S9C). However, one-dimensional predictions or assays performed poorly compared to multidimensional VIS-seq landmark features at discriminating between gnomAD control, PHTS-associated, and ASD/DD-associated variants (VIS-seq macro-averaged five-fold cross-validated AUC=0.92 versus 0.76 for AlphaMissense and EVE, Figure 6G). These results highlight the strengths and limitations of current predictors and illustrate how the multimodal data provided by VIS-seq could be used to develop or constrain a next generation of phenotype-aware predictors.
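The rank correlations reported above between predictor scores and VIS-seq measurements can be reproduced in principle with a Spearman correlation over variants scored by both. The sketch below is a generic implementation with average ranks for ties. It assumes that variants missing from either source have already been dropped, and the function names are hypothetical.

```python
from math import sqrt


def average_ranks(values):
    """Rank values from 1..n, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=values.__getitem__)
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman correlation: the Pearson correlation of the rank vectors."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must be paired and contain at least two variants")
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / sqrt(var_x * var_y)
```

Because only ranks are used, the statistic is insensitive to the different scales of predictor scores and image-derived features, which is why a sign change (for example for abundance) reflects only the direction of the score.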
DISCUSSION
We introduce Variant in situ Sequencing (VIS-seq), a scalable single-cell method that uses in situ sequencing of circular RNA barcodes to simultaneously measure the effects of thousands of protein variants on features derived from cell images. VIS-seq includes an expression cassette engineered to prevent silencing and enable transgene assays in specialized or hard-to-transduce cell types where silencing and low expression limit current approaches32. Thus, VIS-seq expands multiplexed assays of variant effect beyond cell growth, protein abundance or reporter readouts in cancer cell lines or non-human cells by enabling morphological profiling that captures many biologically relevant, spatially-resolved molecular and cellular phenotypes.
We used VIS-seq to create morphological profiles for ~3,000 variants in LMNA and PTEN, genes with complex genotype-disease relationships. Variant morphological profiles for these genes comprise a high-dimensional space which can be interpreted using multiple landmark features with biological meaning. Prior work linked individual LMNA variants to aggregation42 and nuclear shape defects54,55 but without a clear link to rod-domain structure and assembly. VIS-seq demonstrated that rather than generic loss-of-function, variants at linker 12, dimer-forming interfaces, and multimer-forming interfaces drove distinct cell morphologies defined by increased nuclear circularity, low abundance, or aggregation, respectively. Prior multiplexed datasets have measured PTEN variant effects on abundance and lipid phosphatase activity. VIS-seq in multiple cell types disentangled abundance, pAKT signaling, and localization, revealing a mislocalization phenotype enriched among autism-associated variants. Morphological profiles, which comprise a large set of simple measurements of the intensity, distribution and shape of different markers in cells, outperformed single phenotype measurements at discerning variant pathogenicity. They also illuminated how variant effects cascade from molecules to subcellular structures to cells in a fashion not captured by current experimental or predictive methods. Thus, we show that variant effects span a multidimensional continuum of function that transcends a pathogenic versus benign binary or a one-dimensional spectrum, highlighting the power of multimodal variant effect assays like VIS-seq.
Because VIS-seq can measure many imageable phenotypes in human cell lines and iPS-derived cell types, it will be broadly applicable, particularly to the ~5,400 disease-associated human genes95. While other multiplexed assays of variant effect can measure cell growth3,4,96, protein abundance5,6 or transcriptomic changes97,98, none offer the spatial information or multimodality of VIS-seq. For example, most genes are incompatible with cell growth assays99–101, which do not generally provide mechanistic insight. Loss of protein abundance is a primary mechanism of missense variant pathogenicity5,6,102,103, but fails to capture mislocalization or loss of catalytic activity104. Lastly, missense variants do not always produce large transcriptomic changes, particularly in utilitarian cancer cell lines97, instead exerting their primary effect on proteins or cells. VIS-seq leverages the extraordinary catalog of imaging-compatible stains and affinity reagents to overcome these limitations by enabling simultaneous measurement of a vast swath of molecular phenotypes like protein abundance, mislocalization, and protein function as well as organelle and cell phenotypes. Thus, because many aspects of variant dysfunction can be made visible in cells, VIS-seq can be applied broadly.
Limitations of the study
Despite its promise, VIS-seq has important limitations. Although our platform provides robust expression in many cell types, such exogenous expression systems create the possibility of expression-related artifacts. We used fluorescent protein fusions to visualize lamin A and PTEN, which changes lamin A function and can generally alter stability and other phenotypes105. Although VIS-seq reagent costs are low (~$0.001/cell), a barrier to scaling is manual in situ sequencing which requires intensive hands-on time (~1 hour for imaging and ~1 hour for reagent exchange per base). Thus, sequencing a 12 base pair barcode takes at least 3 days. Two replicates of a typical human protein with ~8,000 missense variants, performed in a tiled fashion, would require ~4 weeks. Automation of reagent exchange and heating would reduce labor and speed data acquisition considerably. Customization of markers and stains to the target protein’s function creates informative morphological profiles but requires optimization of each new marker. Lastly, although image-derived features are useful for clustering and variant scoring, mapping profiles to cellular phenotypes requires manual feature and cell image interpretation. And unlike genes, proteins, or genomic elements which are quantified in single-cell -omics, most image features do not have straightforward biological relevance.
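The sequencing-time estimate above follows from simple arithmetic, sketched here. The per-base times come from the text (~1 hour of imaging and ~1 hour of reagent exchange). The 8-hour working day is an assumption introduced for this example and is not stated in the source.

```python
import math


def hands_on_hours(barcode_length, imaging_h_per_base=1.0, exchange_h_per_base=1.0):
    """Total hands-on time to sequence a barcode in situ, one base per cycle."""
    return barcode_length * (imaging_h_per_base + exchange_h_per_base)


def working_days(hours, hours_per_day=8.0):
    """Number of bench days needed, assuming a fixed working day (assumption)."""
    return math.ceil(hours / hours_per_day)


hours = hands_on_hours(12)   # 12 bases x 2 hours per base
days = working_days(hours)   # 24 hours spread over 8-hour days
```

Under these assumptions a 12-base barcode requires 24 hands-on hours, or 3 working days, consistent with the "at least 3 days" stated above.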
These limitations suggest productive avenues for future work. VIS-seq could be combined with time-lapse imaging or 3D organoid models106 to capture dynamic or tissue-like phenotypes. Incorporating protein-tagging strategies for endogenous loci107,108 with barcodes in the 3’-UTR would mitigate expression-related artifacts. Untaggable proteins could be immunostained for visualization. And, with rapidly advancing deep-learning-based pipelines for embedding image phenotypes109,110, the bottleneck of measuring and interpreting morphology at scale will continue to diminish. In summary, VIS-seq represents a significant step forward in our capacity to assess variant effects on complex phenotypes. By uniting multiplexed assays with the power of imaging, our work serves as a blueprint for mapping how variant effects propagate from molecules to cells.
RESOURCE AVAILABILITY
Lead contact
Further information and requests for resources and reagents should be directed to and will be fulfilled by the lead contact, Douglas Fowler (dfowler@uw.edu).
Materials availability
Plasmids generated in this study are publicly available through Addgene (252243, 252244, and 252245). Variant libraries and other unique reagents generated in this study are available from the lead contact upon reasonable request.
Data and code availability
Variant morphological impact scores, distinguishability scores and landmark feature values can be found on MaveDB111 (LMNA: https://mavedb.org/experiments/urn:mavedb:00001243-a, PTEN iPS cells: https://mavedb.org/experiments/urn:mavedb:00001244-b, and PTEN NGN2-induced neurons: https://mavedb.org/experiments/urn:mavedb:00001244-a).
Tabular data is provided as a supplement or, for larger files, is available via Zenodo at doi.org/10.5281/zenodo.15787684.
Image data, single-cell feature profiles, and CellProfiler pipelines are publicly available via the BioImage Archive at doi.org/10.6019/S-BIAD3095.
All code necessary to reproduce our analyses, starting at the cell by features matrix or from summary data available via Zenodo, can be found at https://github.com/FowlerLab/visseq. STARCall code used to generate the cells by features matrix from phenotyping and genotyping images can be found at https://github.com/FowlerLab/starcall-workflow. Code used to generate trained XGBoost models and variant-level distinguishability scores can be found at https://github.com/FowlerLab/fisseqtools.
Any additional information required to reanalyze the data reported in this paper is available from the lead contact upon request.
STAR★METHODS
EXPERIMENTAL MODEL AND STUDY PARTICIPANT DETAILS
General cell culture
U2OS cells, derived from an osteosarcoma in a female (ATCC, HTB-96), and HEK 293T cells, derived from the kidney of a female fetus (ATCC, CRL-3216), were cultured in media composed of DMEM (Gibco, 11965–092) with 10% fetal bovine serum (FBS; Hyclone, SH30071.03) and 1% Pen/Strep (Gibco, 15140–122). Cells were passaged as follows: cells were washed with DPBS (Gibco, 14190–144), incubated with 0.05% Trypsin (Gibco, 25300–054) for 5 minutes (HEK 293T) or 15 minutes (U2OS), quenched with media, pelleted at 300 g for 5 minutes, and then resuspended in media and replated. Cells were tested routinely for Mycoplasma. For VIS-seq experiments, U2OS cells were plated at a density of 800,000 per well on all wells of a glass-bottom 6-well plate (Cellvis, P06–1.5H-N) and fixed ~12 hours after plating.
iPS cell culture
WTC11 iPS cells, derived from healthy adult male fibroblasts (Coriell, GM25256), were cultured on matrigel-coated six-well plates in complete mTeSR Plus (STEMCELL Technologies, 100–0276) media. The media was changed at least every other day. The plates were coated by diluting matrigel (Corning, 354234) to a final concentration of 80 μg/mL in DMEM-F12 (Gibco, 11320–033), adding 1 mL to each well (or 5 mL to a 10-cm plate), and incubating the plates at room temperature for 1 hour. iPS cells were passaged as follows: cells were washed with DPBS, incubated with Versene (Gibco, 15400–054) for 3–5 minutes until they dissociated, and then resuspended in mTeSR Plus with 10 μM Y-27632 (SelleckChem, S1049) and replated. Media changes without re-plating used mTeSR Plus only. Cells were tested routinely for Mycoplasma. For VIS-seq experiments, iPS cells were plated on all wells of a matrigel-coated glass-bottom 6-well plate (Cellvis, P06–1.5H-N) at 600,000 per well and fixed using 4% paraformaldehyde 6 hours after plating.
NGN2-induced neural differentiation of iPS cells
NWTC11.G3-WT112 iPS cells, derived from WTC11 by Dr. Li Gan (Gladstone Institute, obtained with permission), were differentiated into NGN2-induced neuron-like cells using doxycycline-inducible NGN2 expression following a protocol reported previously83. On day −3, iPS cells were dissociated using Versene and then resuspended in Pre-Differentiation medium with 10 μM Y-27632. Pre-Differentiation medium contains: KnockOut DMEM/F-12 (Gibco 12660–012), 1X MEM Non-Essential Amino Acids (Gibco 11140–050), 1X N-2 Supplement (Gibco 17502–048), 10 ng/mL NT-3 (Peprotech, 450–03) and BDNF (Peprotech, 450–02), 1 μg/mL Laminin protein (Gibco 23017–015), and 2 μg/mL doxycycline (Sigma, D9891). Cells were plated at 800k/well using 1 mL of media per well in 12-well plates coated in matrigel, and media was changed to Pre-Differentiation medium on days −2 and −1. On day −1, a glass-bottom 6-well plate was coated using 15 μg/mL poly-L-ornithine (Sigma, P3655) overnight and then washed with DPBS 3 times the following day followed by air drying at room temperature. On day 0, pre-differentiated cells were dissociated using Accutase (STEMCELL Technologies, 07922) and then pelleted at 200 g for 5 minutes, followed by resuspension in Maturation medium. Maturation medium contains: 50% Neurobasal-A medium base (Gibco 12349–015), 50% DMEM/F-12 base, 1X MEM Non-Essential Amino Acids, 0.5X GlutaMAX Supplement (Gibco 35050–061), 0.5X N-2 Supplement, 0.5X B-27 Supplement (Gibco 17504–044), 10 ng/mL NT-3 and BDNF, 1 μg/mL Laminin protein, and 2 μg/mL doxycycline. Cells were plated at 600k/well into poly-L-ornithine coated 6-well plates in 2 mL of Maturation medium per well. NGN2-induced neurons were fixed and processed for VIS-seq on day 4 (day 7 after doxycycline addition).
To confirm differentiation, NGN2-induced neurons were fixed (see below, VIS-seq NGN2-induced neuron fixation protocol) and stained using DAPI (Fisher Scientific, D1306) and rabbit anti-NCAM1 primary antibody conjugated to Alexa647 (CellSignal, 50831S) using the VIS-seq staining mix (see below, Cell Painting Buffer + 0.25% Triton-X 100) and imaged using the same settings as used in the VIS-seq phenotype imaging (Figure S1B).
Cardiomyocyte differentiation of iPS cells
WTC11 iPS cells were cultured on 10 μg/mL Vitronectin XF (STEMCELL Technologies, 100–0763) prior to cardiomyocyte differentiation. Small molecule directed differentiation was performed as previously described with modifications22. iPS cells were plated in 15 μg/mL Vitronectin XF-coated 24-well plates at 7.5×10⁴ cells per well in mTeSR Plus supplemented with 10 μM Y-27632. Media was changed to fresh mTeSR Plus without Y-27632 the next day. Directed differentiation was initiated (D0) when cells reached 60–70% confluency by aspirating mTeSR Plus, washing with DPBS, and replacing media with RBA media: RPMI (Invitrogen, 11875135), 0.5 mg/mL bovine serum albumin (Sigma, A9418), 0.213 mg/mL ascorbic acid (Sigma, A8960) supplemented with 4.5 μM CHIR-99021 (TOCRIS, 4423). After two days (D2), CHIR-99021-containing media was removed and replaced with RBA supplemented with 2 μM Wnt-C59 (Selleck, S7037). On D4, Wnt-C59-containing media was replaced with RBA. On D6 (and every other day afterwards), media was replaced with cardiomyocyte media: RPMI, B27 plus insulin (Invitrogen, 17504044). After D12, cardiomyocyte media was removed and cardiomyocytes were dissociated to single cells with TrypLE Select 10X (Gibco, A12177–01) and replated on Matrigel-coated 6-well plates at 3×10⁶ cells per well in cardiomyocyte media supplemented with 10% FBS and 10 μM Y-27632. At D20, cardiomyocytes were purified by replacing cardiomyocyte media with lactate selection media for four days: DMEM, no glucose (Gibco, 11966–025) with 4 mM sodium-L-lactate (Sigma, 71718). Lastly, cardiomyocytes were dissociated to single cells using TrypLE Select 10X on D56 and then mEGFP-positivity was measured by flow cytometry (see end of next section).
METHOD DETAILS
CRISPR/Cas9 knockout and cell-line preparation
For knockout of LMNA/C, U2OS C11 cells113 were electroporated on the Lonza 4D Nucleofector using protocol CM104 in a single well of a Lonza 4D-Nucleofector SE Kit S (Lonza, V4XC-1032). The electroporation mix contained 1 million cells resuspended in 30 μL of supplemented Lonza SE buffer with 1 μL of SpCas9 (Synthego) and 1 μL of LMNA-KO gRNA (Synthego). Cells were then recovered and seeded into 96-well plates at a dilution of 1 cell per 5 wells. Clones were grown over a month, passaged, and LMNA-knockout was confirmed by Western blot (see below). U2OS C3 LMNA-KO was the clone selected for both replicates of VIS-seq on LMNA variants.
For knockout of PTEN, NWTC11.G3-WT112 iPS cells were electroporated on the Lonza 4D Nucleofector using protocol CM137 in a single well of a Lonza 4D-Nucleofector P3 Kit S (Lonza, V4XP-3032). The electroporation mix contained 200,000 cells per well resuspended in 30 μL of supplemented Lonza P3 buffer with 1 μL of SpCas9 and 1 μL of PTEN-KO gRNA (Synthego). Cells were then recovered and seeded into 96-well plates at a dilution of 1 cell per 5 wells. Clones were grown over 10 days, passaged, and then PTEN-knockout was confirmed by Western blot. PTEN-KO C18 was used for the first replicate of iPS cell and neuron VIS-seq on PTEN variants, and PTEN-KO C2 was used for second replicates.
VIS-seq expression construct cloning
VIS-seq expression constructs for expressing N-terminal mEGFP-tagged prelamin A (PBv2b-laminA) and PTEN (PBv2c-NSapI-PTEN) libraries were prepared by Gibson assembly114 using NEBuilder HiFi DNA Assembly Master Mix (New England Biolabs, E2621L) followed by Golden Gate115 insertion of barcodes and then variant libraries.
First, an intermediate plasmid was cloned which contains Esp3I/BsmBI sites for installation of a padlock-BC sequence in a future step. We cloned this plasmid by digesting pJBL_05125 (gift from the Jay Shendure lab) with NotI-HF (New England Biolabs, R3189S) and SacII (New England Biolabs, R0157S) and performing Gibson assembly with the PD-BsmBIsites gene fragment (Twist, see Table S6 for DNA sequences). Then, the PBv2b expression plasmid was prepared by Gibson assembly of the following five fragments: intermediate backbone from the previous step digested with NheI-HF (New England Biolabs, R3131L) and MluI-HF (New England Biolabs, R3198S), UCOE and EF1a promoter/intron amplified from pHR-UCOE-EF1a-Zim3-dCas9-BFP116 (Addgene, 188775), and gene fragments containing mEGFP, MluI and EcoRI restriction sites, a truncated portion of HBB intron 2, an IRES, a puromycin resistance gene, and WPRE (mEGFP-HBBIVS2trunc-v1 and IRES-PuroR-WPRE-v1, Twist). All PCR amplifications were performed using Q5 Master Mix (New England Biolabs, M0492L) with denaturation at 98 °C, annealing at 60 °C, and extension at 72 °C, and were stopped prior to saturation. The PBv2b expression plasmid was subsequently prepped and sequence-confirmed.
PBv2b-NSapI, PBv2b-CSapI, and PBv2c-NSapI expression plasmids were cloned using similar Gibson assemblies. The corresponding gene fragment was replaced with one containing SapI sites either upstream of mEGFP (mEGFP-HBBIVS2trunc-v2C, Twist) for PBv2b-CSapI or downstream of mEGFP (mEGFP-HBBIVS2trunc-v2N, Twist) for PBv2b-NSapI and PBv2c-NSapI. For PBv2c-NSapI, we also substituted a CAG promoter amplified from pCAG-NLS-HA-Bxb1117,118 for the EF1a promoter.
All VIS-seq backbone plasmids (PBv2b, PBv2b-NSapI, PBv2b-CSapI, and PBv2c-NSapI) were barcoded using Golden Gate assembly. Klenow fill-in (New England Biolabs, M0212L) of VISseq_GG_BColigo (IDT) using primer VS_01 (IDT) yielded a double-stranded oligo with a 12-bp degenerate barcode used to barcode PBv2b, PBv2b-NSapI, or PBv2b-CSapI, and low-cycle PCR of VISseq_GG_BColigoX2 (IDT) with primers VS_02 (IDT) and VS_03 (IDT) yielded a double-stranded oligo with two 8-bp barcodes used to barcode PBv2c-NSapI. The barcoding reaction was performed in eight simultaneous reactions each with 500 ng of input plasmid, a 2:1 molar ratio of barcode, 1.5 μL of Esp3I (New England Biolabs, R0734L), 1 μL of T4 ligase (New England Biolabs, M0202M), and 2.5 μL of 10X T4 ligase buffer with water added to 25 μL. These reactions were cycled between 37 °C and 16 °C 45 times, post-digested with 1 μL of BsmBI-v2 (New England Biolabs, R0739S) and then pooled, concentrated, and electroporated into 25 μL of NEB 10-beta electrocompetent E. coli (New England Biolabs, C3020K). Bacteria were plated onto BioAssay dishes (Fisher Scientific, 07–200-600) containing LB + 100 μg/mL carbenicillin (GoldBio, C-103–250) and grown at 30 °C overnight. Colonies were scraped and plasmid DNA was prepped the following day. Three electroporations were used for each plate. Each barcoded VIS-seq expression vector had a minimum of 25 million colonies per plate for 2 plates, which were pooled 1:1.
VIS-seq library cloning
The LMNA (prelamin A sequence, derived from NM_170707.4) variant cDNA library was purchased from Twist as an SSVL library and delivered in an arrayed format; positions 178 to 273 were pooled and restriction cloned into the VIS-seq PBv2b expression plasmid. PTEN (positions 112–172, derived from NM_000314.8), RPS19 (positions 52–109, derived from NM_001022.4), and HIST1H1E (positions 54–113, derived from NM_005321.3) variant cDNA libraries were cloned in-house using BsaI Golden Gate assembly, and then inserted into the VIS-seq expression plasmids with SapI Golden Gate assembly, similar to a previously published method119.
First, the plasmid pHSG299 (Takara Bio, 3299) was modified to contain two BsaI sites. Oligo libraries containing tiles of gene variants were amplified from the purchased Twist oligo pool and then assembled with the flanking gene blocks coding for the remainder of the cDNA (PTEN_tile3_s1/2 or RPS19_s1/2 or H1.4_s1 from Twist) and the modified pHSG299-BsaI. This Golden Gate reaction was performed using 100 ng of pHSG299-BsaI, a 2:1 molar ratio of gene blocks to backbone, a 2:1 molar ratio of the amplified oligo library to backbone, 1 μL of BsaI-HF-v2 (New England Biolabs, R3733L), 0.5 μL of T4 ligase, and 2.5 μL of 10X T4 ligase buffer with water added to 25 μL total volume. For Figure 1C, WT H1.4 cDNA was cloned by substituting the amplified oligo tile with a wildtype gene block (Twist). These reactions were incubated at 37 °C for 18 hours, then concentrated and electroporated into NEB 10-beta electrocompetent E. coli, followed by plasmid prep and sequence-confirmation.
Full cDNA variant libraries were cloned into corresponding VIS-seq expression plasmids. The prelamin A variant library (positions 178 to 273; Twist) and barcoded PBv2b VIS-seq expression plasmid were digested with MluI-HF and EcoRI-HF (New England Biolabs, R3101S) and then ligated at 16 °C overnight using 0.5 μL T4 ligase and T4 ligase buffer at a final 1X concentration in a 20 μL reaction. Other genes were assembled into barcoded PBv2b-NSapI (HIST1H1E), PBv2b-CSapI (RPS19), or PBv2c-NSapI (PTEN) using Golden Gate assembly: 100 ng of barcoded plasmid and a 2:1 molar ratio of variant library plasmid, 1.5 μL of SapI (New England Biolabs, R0569L), 0.5 μL of T4 ligase, and 2.5 μL of 10X T4 ligase buffer in a 25 μL reaction cycled between 16 °C and 37 °C for 3 hours. After assembly, reactions were post-digested with an additional 1 μL of SapI. All reactions were concentrated and then electroporated into NEB 10-beta electrocompetent E. coli followed by growth of dilutions in LB overnight, prep, and sequence confirmation. For all libraries, the dilution that corresponded most closely to ~20-fold barcode coverage over variants was selected the following day for plasmid preparation.
PacBio sequencing
Barcoded PBv2b-Prelamin A and PBv2c-NSapI-PTEN libraries were digested with NotI-HF. PacBio libraries were then generated following the instructions for the SMRTbell Prep Kit 3.0 (Pacific Biosciences, 102–182-700). Libraries were sequenced using the Pacific Biosciences Revio sequencer. Pacybara120 was used to process reads to generate variant-barcode association tables (LMNA) or variant-double barcode association tables (PTEN).
Low-MOI piggyBac integration of VIS-seq libraries
For low-MOI integration of the VIS-seq LMNA library, U2OS C3 LMNA-KO cells were electroporated using the Lonza 4D Nucleofector with protocol CM104 in 6 separate wells of a Lonza 4D-Nucleofector SE Kit S. Each well contained 1 million cells resuspended in 30 μL of supplemented Lonza SE buffer with 200 ng of VIS-seq LMNA variant library and 50 ng of hyPBase121 expression plasmid (gift from the Jay Shendure lab). Cells were then recovered in a 6-well plate in media (DMEM + 10% FBS + 1% Pen/Strep), using one well per electroporation well, for 4 days and then pooled, and integrants were selected with 2 μg/mL puromycin (Gibco, A11138–03) for 7 days. After selection, cells were recovered for at least 2 days prior to freezing in media with 5% v/v added DMSO (Sigma, D8418–50 mL). Library-expressing cells were thawed, washed and resuspended in media and recovered for 2 days prior to VIS-seq experiments.
For low-MOI integration of VIS-seq libraries into iPS cells, cells were electroporated in 2 cuvettes of Lonza 4D-Nucleofector P3 Kit X (Lonza, V4XP-3024) using the Lonza 4D Nucleofector with protocol CM104 (U2OS) or CM137 (iPS cells). The PTEN library was integrated into PTEN-null clones of NWTC11.G3-WT112 iPS cells and other libraries were integrated into WTC11 iPS cells. Each cuvette contained 3 million cells resuspended in 100 μL of supplemented Lonza P3 buffer with 400 ng of VIS-seq expression plasmid and 100 ng of hyPBase121 expression plasmid. Cells were then recovered for 4 days in a 10-cm plate per cuvette and then pooled, and integrants were selected with 1 μg/mL puromycin for 3 days or until cells were >90% mEGFP expressing. After selection, cells were recovered for at least 2 days in mTeSR Plus and then frozen in liquid nitrogen after resuspension in CryoStor CS10 Cell Freezing Medium (STEMCELL Technologies, 100–1061). Library-expressing cells were thawed, washed and resuspended in mTeSR Plus with 10 μM Y-27632 and recovered for 2 days prior to VIS-seq experiments or NGN2-induced neuronal differentiation.
VIS-seq expression cassette silencing and in situ sequencing compared to lentiviral construct
The lentiviral plasmid pLenti_EF1a_mEGFP-H1.4 was cloned via a Golden Gate assembly using Esp3I of four fragments: backbone derived from PCR of lentiGuide-Puro122, EF1a-mEGFP-H1.4 fragment from amplifying the assembled PBv2b-NSapI-H1.4(WT) plasmid, a double stranded 12-bp barcode flanked by padlock sequences generated by Klenow fill-in of Lenti_GG_BColigo (IDT) with primer VS_04 (IDT), and an IRES-Puro-WPRE-SV40 terminator fragment derived from amplifying the assembled PBv2b-NSapI-H1.4(WT) plasmid. The plasmid was subsequently prepped and sequence-confirmed.
To generate lentivirus, HEK293T cells were seeded into a 6 well plate at 900,000 cells per well in the evening. The next morning, TransIT-293 (Mirus, MIR 2700) suspensions were made by adding 0.625 μg pMD2.G (Addgene, 12259), 1.25 μg psPAX2 (Addgene, 12260), and 0.625 μg pLenti_EF1a_mEGFP-H1.4 to 250 μL OPTI-MEM (Gibco, 31985–070) for every well to be transfected. After vortexing, 7.5 μL Transit reagent was added per well and mixed by gently flicking the tube, followed by a 15 minute incubation at room temperature. After the incubation, the suspension was distributed dropwise onto the plated cells and allowed to sit for 24 hours. The following day the media was replaced with fresh DMEM and placed back in the incubator for 48 hours. Following the incubation, the media was removed from the cells and spun at 1,000g for 5 minutes. The supernatant was filtered through a 0.45 μm syringe filter (VWR, 76479–012) and 1 mL aliquots frozen at −80 °C until use.
To generate iPS cells with integrated lentivirus expressing mEGFP-H1.4 and linear RNA barcodes, 10 μL of filtered lentiviral supernatant was added to a 6-well plate of WTC11 iPS cells plated at a concentration of 100,000 per well for two replicates, followed by puromycin selection after 4 days until >90% of cells were GFP-positive. To generate iPS cells with integrated VIS-seq cassette expressing mEGFP-H1.4 and circular RNA barcodes, the protocol for low-MOI integration of the VIS-seq plasmid PBv2b-NSapI-H1.4(WT) was followed (see above) for two replicates.
After selection, both iPS cell populations were passaged and monitored for mEGFP-positivity using the BD FACSSymphony A3 Analyzer over 14 days. Cells were taken to flow cytometry on days 0 (immediately after selection), 4, 9 and 14. mEGFP expression was measured for at least 10,000 cells using the 488 nm laser and the BB515 filter. Nontransduced iPS cells were used as a control to set the mEGFP-positive gate. Day 14 transduced and VIS-seq cassette-expressing iPS cell populations were fixed, in situ libraries were generated, and the first base of in situ sequencing was performed (see below for protocol). Only mEGFP-expressing cells were analyzed and STARCall38 was used to count the number of rolling circle colonies (rolonies) for each condition.
Cardiomyocytes were differentiated from transduced and VIS-seq-expressing iPS cell populations after selection (see above for protocol). On day 56 after start of differentiation, mEGFP-positivity of cardiomyocytes was measured using the BD Symphony S6 Cell Sorter with 488 nm laser and BB515 filter.
Western blots
Approximately 1 million cells per sample were incubated with 50 μL 1x RIPA buffer (Abcam, ab156034) supplemented with Halt protease inhibitor cocktail (ThermoFisher, 1860932) for 30 minutes on ice. Samples were spun down at 10,000g and the supernatant moved to new chilled tubes. The supernatants were assayed by BCA assay (ThermoFisher, Pierce BCA protein assay kit 23225) to determine the total protein concentration. Lysates were normalized and mixed with 2x Laemmli sample buffer (BioRad, 1610737) supplemented with fresh β-mercaptoethanol (BioRad, 1610710) and boiled for 5 minutes at 95 °C. Boiled samples were spun at 10,000g for 1 minute immediately before loading 30 μg total protein per well for each sample into an 8–16% Criterion TGX Stain-Free Protein Gel (BioRad, 5678103). Gels were run with 10x Tris/Glycine/SDS Running Buffer (BioRad, 1610732) diluted to 1x with type I ultrapure water for 35 minutes at 200 V. After running, the gels were imaged for total protein on a BioRad Chemidoc MP. Transfer was done using Trans-Blot Turbo Midi 0.2 μm PVDF Transfer Packs (BioRad, 1704157) on the Trans-Blot Turbo Transfer System using the TURBO settings. Membrane blocking was done in 5% BSA (Sigma-Aldrich, A9418) in Tris buffered saline with Tween 20 (TBS-T; 50 mM Tris-HCl, 150 mM NaCl, 0.05% Tween-20) with rocking for 1 hour at room temperature. Primary antibody staining was done in TBS-T with 1% BSA, rocking overnight at 4 °C. Primary antibodies used were anti-phospho-(Thr308)-AKT rabbit mAb (Cell Signaling, 13038), anti-pan-AKT mouse mAb (Cell Signaling, 2920), anti-PTEN rabbit mAb (Cell Signaling, 9559), and anti-lamin A/C mouse mAb (BioLegend, 600001), all used at a 1:1000 dilution from stock. Following primary staining, blots were washed three times for 10 minutes with TBS-T, rocking at room temperature. Secondary staining was done in TBS-T with 0.01% SDS and 1% BSA.
Secondary antibodies used were StarBright blue 520 goat anti-mouse IgG (BioRad, 12005866) and StarBright blue 700 goat anti-rabbit IgG (BioRad, 12004162) both used at a 1:5000 dilution. hFAB rhodamine anti-actin primary antibody (BioRad, 12004164) was used to stain for β-actin as a loading control in the secondary solution at a dilution of 1:5000. Secondary staining was performed on a rocker for one hour at room temperature followed by three 10 minute washes in TBS-T and developed on the BioRad Chemidoc MP.
VIS-seq molecular biology
In situ sequencing protocols are similar to those published previously35. However, we detail them here, documenting the changes we made. All in situ sequencing experiments were performed using all wells of a glass-bottom #1.5H 6-well plate (Cellvis, P06–1.5H-N) with cells plated at 800,000 cells per well (U2OS) or 600,000 cells per well (iPS cell/NGN2-induced neuron). Unless indicated otherwise, wash steps used 2 mL per well.
U2OS and iPS cells were first washed 3 times with DPBS prior to fixation. Then 1 mL of 4% PFA (Electron Microscopy Sciences, 15713-S) in DPBS was added to each well. NGN2-induced neurons were fixed in media, by removing 50% of media (leaving 1 mL remaining in well) and adding 250 μL of 20% PFA dropwise for a final concentration of 4% PFA, followed by swirling gently to mix PFA into media. All cells were incubated during fixation in the dark for 30 minutes at room temperature and then washed 3X with DPBS prior to enzyme steps.
Cells were permeabilized with 1 mL per well of 0.1% Triton-X 100 (Sigma, Y8787–50ML) in DPBS for 15 minutes at room temperature. This was followed by 2 washes of 0.1% Tween-20 (Sigma, P1279–500ML) in DPBS (henceforth called PBS-T). A 5’-amino-linked RT primer containing locked nucleic acid (LNA) bases was then hybridized by incubation with 600 μL per well of 1 μM CROP_RT2_LNA_5Am7,9 (IDT) for 15 minutes at room temperature followed by 3 PBS-T washes. Then cells were fixed with 1 mL per well of 3% PFA and 0.1% glutaraldehyde (Electron Microscopy Sciences, 16210) in PBS-T for 30 minutes at room temperature, followed by 3 PBS-T washes. Next, 750 μL per well of reverse transcription reaction mix was added to each well; this mix contained final concentrations of 1x Revertaid RT buffer, 250 μM dNTPs (New England Biolabs, N0447L), 400 μg/mL Ultrapure BSA (Thermo Scientific, AM2616), 0.5 U/μL RiboLock (Thermo Scientific, EO0381), and 4.8 U/μL Revertaid RT enzyme (Thermo Scientific, EP0451) in Nuclease-free water (Thomas Scientific, C001X09). Water was added to the spaces between wells and the plate was sealed with foil. The plate was incubated at 37 °C while shaking at 80 RPM overnight.
Following reverse transcription, cells were washed three times with PBS-T and then fixed with 1 mL per well of 3% PFA and 0.1% glutaraldehyde in PBS-T for 30 minutes at room temperature. Cells were then washed five times with PBS-T to reduce the dNTP concentration. A previously reported optimized padlock probe9 was annealed onto the sample, the gap in the padlock probe was filled in, and the circle was ligated shut in a single reaction. This reaction was performed using 600 μL per well of the gap-fill mix containing final concentrations of 1X Ampligase buffer (Biosearch Technologies, SS000015-D3), 200 μg/mL Ultrapure BSA, 100 nM padlock probe (BESTMIP1 for PTEN, BESTMIP2_3ntdel for LMNA), 0.02 U/μL Taq-IT polymerase (Qiagen, P7620L), 50 nM dNTP, 0.1 U/μL RNaseH (New England Biolabs, M0297L) and 0.25 U/μL Ampligase (Biosearch Technologies, E0001–100D2) in Nuclease-free water. The low dNTP concentration is crucial and dNTP dilutions were re-prepped each experiment. The plate was incubated at 37 °C for 5 minutes and then 45 °C for 90 minutes.
The cells were washed twice with PBS-T and then 600 μL of RCA mix containing final concentrations of 1X Phi29 buffer (Thermo Scientific, EP0094), 250 μM dNTPs, 200 μg/mL Ultrapure BSA, 5% glycerol (Fisher Scientific, G33–1), and 0.25 U/μL Phi29 enzyme (Thermo Scientific, EP0094) in Nuclease-free water was added to each well. The plate was sealed with foil. The plate was incubated at 30 °C while shaking at 80 RPM overnight.
The cells were washed twice with PBS-T for the LMNA experiments and three times for the PTEN experiments and then stained. Different staining mixes were used for U2OS cells, iPS cells, and NGN2-induced neurons. U2OS cells were first stained with 600 μL per well of Mitoprobe10 staining mix containing final concentrations of 10% formamide (BP227–500), 250 nM 12S and 16S Mitoprobe10 (IDT), and 2X SSC (Invitrogen, 15557036) in Nuclease-free water. They were incubated at 37 °C for 20 minutes.
Following Mitoprobe staining, Cell Painting14 was performed on U2OS cells in a manner similar to previously described. Cell Painting buffer was prepared as 1X HBSS (Gibco, 14065–056), 1% BSA and 0.01% sodium azide (Sigma, S-8032). U2OS cells were then washed twice with PBS-T and 600 μL of Cell Painting mix containing 0.2 ng/μL DAPI (Fisher Scientific, D1306), 0.9 μL of WGA-CF555 (Biotium 29076–1), 0.75 μL of Phalloidin-CF568 (Biotium 00044-T), and 0.3 U/μL RiboLock in Cell Painting buffer was added. Dye concentrations were individually titrated per lot. Cells were incubated at room temperature for 30 minutes. Then U2OS cells were washed five times with PBS-T.
iPS cells and NGN2-induced neurons were first stained with 750 μL of a modified Cell Painting/immunostaining mix that contained 0.2 ng/μL DAPI, 0.75 μL of Concanavalin A-AF750 (Thermo Scientific, C56127), 0.94 μL of Phalloidin-CF568 (iPS cells only), 0.25% Triton-X 100, 1.5 μL of anti-(Thr308)-pAKT Antibody (CellSignal, 13038T), and 1.5 μL of anti-MAP-2 Alexa594-labeled primary antibody (NGN2-induced neurons only; BioLegend, 801802) in Cell Painting buffer. As with U2OS cells, dye and antibody concentrations were individually titrated per lot. Both primary and secondary staining were incubated at room temperature while shaking at 80 RPM for 1 hour.
After painting and primary staining, iPS cells or NGN2-induced neurons were washed three times with PBS-T. Then they were incubated with 750 μL per well of Goat anti-Rabbit Alexa-647-labeled secondary antibody (Thermo Scientific, A-21244) at a concentration of 0.5 U/μL in Cell Painting buffer with added 0.25% Triton-X 100 for 1 hour while shaking at 80 RPM. Then cells were washed four times with PBS-T. Cells were kept in the dark during and after staining and prior to imaging.
Once the U2OS cells, iPS cells, and NGN2-induced neurons had been stained, 2 mL of Imaging mix containing 2X SSC, 10 mM ascorbic acid (Sigma, A92902–100G) and 0.2 ng/μL DAPI in Nuclease-free water were added to each well prior to imaging.
After phenotype imaging, dyes were removed differently for U2OS cells (LMNA) or iPS cells and NGN2-induced neurons (PTEN). U2OS cells were permeabilized with 2 mL 70% ethanol per well for 15 minutes at room temperature, which was sufficient to remove phalloidin; mitoprobe was removed later during sequencing steps by heating10. 1 mL PBS-T was added and 2 mL was removed followed by adding and removing 1 mL PBS-T four times to titrate the ethanol concentration below 1%. Then U2OS cells were washed twice with 2 mL PBS-T per well.
iPS cells and NGN2-induced neurons were washed twice with 2 mL PBS-T per well prior to chemical bleaching with 1 mL of 1.74 mg/mL sodium borohydride (Sigma, 452882) in DPBS incubated for 15 minutes at room temperature. The borohydride solution was used immediately after mixing. Then, cells were washed three times with PBS-T. After phenotyping dyes were removed or chemically bleached, we proceeded with in situ sequencing.
We then added 600 μL of 0.5 μM sequencing primer (using previously reported ISS1 (IDT) for PTEN experiments and ISS2 (IDT)9 for LMNA experiments) diluted in hybridization buffer of 2X SSC + 10% formamide to each well and incubated at room temperature for 30 minutes. We washed cells twice with PBS-T and then once with MiSeq Incorporation Buffer (IB, Illumina, MS-103–1003). We then proceeded to the base cycles: 600 μL of Illumina MiSeq reagent #1 (Illumina, MS-103–1003) was added to each well and the 6-well plate was incubated at 60 °C for 3 minutes. We then added and removed 1 mL IB four times until the incorporation mix was diluted >50-fold. We washed each well with 5X SSC + 0.5% Tween-20 (a wash-step replacement for Incorporation Buffer) and incubated wash steps at 55 °C (reduced temperature to prevent amplicon unbinding) for 5 minutes while shaking at 300 RPM. We repeated this wash and incubation four more times for a total of 5 washes to remove background base incorporation. We then performed base (genotyping) imaging (see below) in Imaging mix (2X SSC, 10 mM ascorbic acid, 0.2 ng/μL DAPI in Nuclease-free water).
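The >50-fold dilution quoted above is consistent with simple exchange arithmetic. The sketch below assumes, for illustration only, that each well starts at 600 μL of incorporation mix and that every removal takes out exactly the 1 mL that was just added; the function name and arguments are hypothetical.

```python
def fold_dilution(start_ul: float, add_ul: float, remove_ul: float, rounds: int) -> float:
    """Cumulative fold dilution after repeated add/remove buffer exchanges."""
    volume = start_ul
    fold = 1.0
    for _ in range(rounds):
        # Adding buffer dilutes the reagent by (new volume / old volume).
        fold *= (volume + add_ul) / volume
        # Removing liquid changes the volume but not the concentration.
        volume = volume + add_ul - remove_ul
    return fold


# Assumed volumes: 600 uL reagent, four exchanges of 1 mL Incorporation Buffer.
fold = fold_dilution(start_ul=600, add_ul=1000, remove_ul=1000, rounds=4)
print(f"{fold:.1f}-fold")
```

Under these assumptions each exchange dilutes the mix by 1600/600, so four exchanges give (8/3)^4, just above 50-fold.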
After imaging was completed, we added 700 μL Illumina MiSeq reagent #4 (Illumina, MS-103–1003) to each well, incubated at 60 °C for 6 minutes, and washed once with 2 mL IB per well. We then added 2 mL 5X SSC with 0.5% Tween-20 to each well and incubated at 55 °C for 1 minute while shaking at 300 RPM. We repeated this wash and incubation two more times for a total of 3 washes. We then washed once with 2 mL IB before adding Illumina MiSeq reagent #1 to incorporate another base. We repeated this process until 8 (PTEN) or 12 (LMNA) bases were imaged.
VIS-seq imaging
All VIS-seq imaging was performed using a Nikon Ti2-E inverted epifluorescence microscope with a motorized stage and motorized objective and filter cube turrets, similar to previously reported configurations for in situ sequencing35. A Hamamatsu ORCA-Fusion scientific CMOS camera was used to collect images. A Lumencor Celesta light engine with 7 laser lines (408, 445, 473, 518, 545, 635, and 750 nm) was used for fluorescence illumination. We also used a Finger Lakes Instrumentation HS-625 high speed emission filter wheel with 5 bandpass filters (499–529, 553–577, 604–644, 662–691, and 698–766 nm), and enabled hardware-level triggering between these filters. The phenotype imaging was performed using a 0.75 NA Plan Apochromat Lambda 20X objective, whereas base cycles were performed using a 0.45 NA Plan Apochromat Lambda 10X objective. Laser power was set to 1% for the 405 nm laser (DAPI), 20% for other phenotyping lasers, and 100% for in situ sequencing lasers, and exposure time was maintained at 50 ms for both phenotyping and in situ sequencing imaging.
Confocal imaging of protein variant libraries in iPS cells and NGN2-induced neurons
Imaging of PTEN, HIST1H1E, and RPS19 libraries in iPS cells or NGN2-induced neurons for Figure S1C (top level) was performed using a Nikon A1R HD25 laser scanning confocal microscope. The 405, 488, and 561 nm lines of a Nikon LU-N4 laser unit were used to image DAPI, mEGFP-tagged protein libraries, and Phalloidin-CF568, respectively. Imaging was performed using a Nikon Apochromat Lambda S LWD 40x water-immersion objective (NA 1.15). A Nikon galvano scanner and Nikon A1-DUG-2 GaAsP photomultiplier tubes were used to acquire 1024×1024 12-bit color images of single z-planes.
Stitching and alignment of VIS-seq images
Stitching, alignment, and read calling were performed using STARCall38. Stitching and alignment are required for processing VIS-seq data because the features being detected are small (<3 pixels) and must align to one another across all imaging cycles. STARCall stitches the raw microscope images across positions and cycles simultaneously using an extended MIST algorithm123 incorporating improvements from ASHLAR124.
Using stage positional information, a graph was constructed representing images as nodes with edges connecting every overlapping image pair. This graph contained all imaging cycles, with edges between the same tile in one cycle to the respective tile in all other cycles. For every edge on this graph, both inter-cycle and intra-cycle, we performed the phase cross-correlation algorithm125 to find the alignment that maximizes the correlation between the two images. The alignment for each edge was scored by calculating the zero-normalized cross-correlation (ZNCC) on the overlapping region of the two images. The ZNCC was computed by normalizing the overlapping pixel intensities to unit vectors with mean zero and computing their dot product. To remove erroneous offsets, we calculated a lower-bound for the ZNCC scores of offsets, using the 95th percentile of ZNCC scores calculated on a random selection of non-overlapping images. Any pair of images that had a ZNCC score below this threshold was removed from the graph. Removing these pairs caused some portions of the graph to become disconnected, especially if cells in some regions of the well were sparsely plated thus yielding few features for alignment. To reconnect these portions, we trained a linear model on the remaining offsets, predicting their offset given their stage positions. This model was then used to replace edges on disconnected images.
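The pairwise alignment and scoring step can be sketched as follows. This is a minimal illustration, not the STARCall implementation: it assumes two equally sized images related by a circular integer shift (real tiles overlap only partially), and the function names `phase_shift` and `zncc` are hypothetical.

```python
import numpy as np


def phase_shift(ref: np.ndarray, mov: np.ndarray) -> tuple:
    """Integer (dy, dx) such that np.roll(mov, (dy, dx), axis=(0, 1)) matches ref."""
    cross_power = np.fft.fft2(ref) * np.conj(np.fft.fft2(mov))
    cross_power /= np.abs(cross_power) + 1e-12  # keep phase information only
    corr = np.fft.ifft2(cross_power).real
    peak = np.unravel_index(np.argmax(corr), corr.shape)
    # Peaks past the midpoint correspond to negative shifts (FFT wrap-around).
    return tuple(int(p - n) if p > n // 2 else int(p) for p, n in zip(peak, corr.shape))


def zncc(a: np.ndarray, b: np.ndarray) -> float:
    """Zero-normalized cross-correlation: dot product of mean-centred unit vectors."""
    a = a.astype(float).ravel() - a.mean()
    b = b.astype(float).ravel() - b.mean()
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(a @ b / denom) if denom > 0 else 0.0


rng = np.random.default_rng(0)
ref = rng.random((64, 64))
mov = np.roll(ref, (3, -5), axis=(0, 1))  # synthetic misalignment

shift = phase_shift(ref, mov)
score = zncc(ref, np.roll(mov, shift, axis=(0, 1)))

# Null distribution: ZNCC between unrelated images; its 95th percentile is the cutoff.
null = [zncc(rng.random((64, 64)), rng.random((64, 64))) for _ in range(200)]
threshold = np.percentile(null, 95)
keep_edge = score >= threshold
```

Edges whose score falls below `threshold` would be dropped from the graph, mirroring the filtering described above.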
We then constructed a linear system of equations on the global (x,y) positions of the images, where each offset in the graph creates two equations (one per axis) relating the difference in position of the two images to the offset. This overdetermined system was then solved to find the optimal global position for each image. Rather than minimizing the mean squared error, we minimized the mean absolute error (MAE); this solution can be found quickly with linear programming and has the additional benefit of limiting the influence of outliers. All images were then combined to form composite images for each cycle, with overlapping regions between images combined using a weighted average. The code for STARCall and instructions on how to use it can be found on GitHub (see Data and code availability section).
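To illustrate how an MAE objective can be posed as a linear program, the sketch below solves for image positions along one axis (x and y separate, so each can be solved independently). It uses `scipy.optimize.linprog` as an assumed solver; the function name, the choice to pin image 0 at the origin, and the toy offsets are all hypothetical and not taken from STARCall.

```python
import numpy as np
from scipy.optimize import linprog


def solve_positions_mae(n_images: int, edges: list) -> np.ndarray:
    """edges: (i, j, offset) tuples meaning pos[j] - pos[i] should equal offset."""
    m = len(edges)
    A = np.zeros((m, n_images))
    d = np.zeros(m)
    for row, (i, j, off) in enumerate(edges):
        A[row, i], A[row, j], d[row] = -1.0, 1.0, off
    # Variables: n positions followed by m slack terms t_k >= |A p - d|_k.
    cost = np.concatenate([np.zeros(n_images), np.ones(m)])
    A_ub = np.block([[A, -np.eye(m)], [-A, -np.eye(m)]])
    b_ub = np.concatenate([d, -d])
    # Pin image 0 at the origin to remove the global translation ambiguity.
    bounds = [(0, 0)] + [(None, None)] * (n_images - 1) + [(0, None)] * m
    res = linprog(cost, A_ub=A_ub, b_ub=b_ub, bounds=bounds, method="highs")
    return res.x[:n_images]


# Four tiles spaced 100 px apart, with one grossly wrong pairwise offset.
edges = [(0, 1, 100), (1, 2, 100), (2, 3, 100), (0, 2, 200), (1, 3, 200), (0, 3, 450)]
positions = solve_positions_mae(4, edges)
print(np.round(positions, 3))
```

In this toy case the single erroneous offset (450 instead of 300) is absorbed entirely by its own slack term, whereas a squared-error fit would spread it across all positions.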
Read calling of genotyping images
We defined the base reads as bright, high frequency features present in only one of the channels and changing frequently across bases, best characterized by a small point spread function. In addition, we characterized common image features that should be filtered out: low-intensity cell background, diffuse features that are present in most channels and increase monotonically across cycles, and cell debris, which presents as extremely bright, high frequency features present in all channels and not fluctuating between cycles. With the objects of interest characterized, we developed a procedure for selecting only the fluorescent reads, similar to previous approaches used for in situ sequencing datasets7. First, we subtracted a small Gaussian blur (σ=4), which left only small, high frequency features, filtering out general and cell background. We then z-score normalized the resulting image, which corrected for differences between channels in intensity. Next, we subtracted the second-maximal channel on a per-pixel basis, which did not affect the fluorescent dots present in any one channel but greatly reduced the intensity of cell background and cell debris present in all channels. After these filtering steps we applied a small (σ=1) Gaussian blur to smooth out noise we introduced, then clipped any negative values to zero. Finally, we calculated the standard deviation on a per-pixel level across imaging cycles, which resulted in higher values for features that change between bases, filtering out background and debris that do not change between cycles. After summing across channels we obtained a grayscale image containing the features we wanted to extract, which we believed to be only fluorescent reads. The Laplacian of Gaussian blob detection algorithm was applied to find these features at the expected size (1≤σ≤3).
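The filtering steps above can be sketched with NumPy and `scipy.ndimage`. This is an illustrative reading of the description, not the STARCall code: the array layout (cycles, channels, height, width), the per-plane z-scoring, and the function name `read_saliency` are assumptions, and the final Laplacian of Gaussian blob detection is omitted.

```python
import numpy as np
from scipy.ndimage import gaussian_filter


def read_saliency(stack: np.ndarray, bg_sigma: float = 4.0, smooth_sigma: float = 1.0) -> np.ndarray:
    """stack: (cycles, channels, H, W) -> 2D image that is bright where reads are."""
    stack = stack.astype(float)
    # 1. High-pass: subtract a Gaussian-blurred copy of every plane.
    hp = stack - gaussian_filter(stack, sigma=(0, 0, bg_sigma, bg_sigma))
    # 2. Z-score each cycle/channel plane (assumed normalization unit).
    mean = hp.mean(axis=(2, 3), keepdims=True)
    std = hp.std(axis=(2, 3), keepdims=True)
    z = (hp - mean) / np.where(std > 0, std, 1.0)
    # 3. Subtract the second-highest channel at each pixel.
    z = z - np.sort(z, axis=1)[:, -2:-1]
    # 4. Light smoothing, then clip negatives to zero.
    z = np.clip(gaussian_filter(z, sigma=(0, 0, smooth_sigma, smooth_sigma)), 0, None)
    # 5. Per-pixel standard deviation across cycles, summed over channels.
    return z.std(axis=0).sum(axis=0)


# Synthetic data: one read whose lit channel changes every cycle, one debris spot.
rng = np.random.default_rng(1)
stack = rng.normal(0, 1, size=(4, 4, 32, 32))
for cycle in range(4):
    stack[cycle, cycle, 10, 10] += 100   # read: one channel per cycle
stack[:, :, 20, 20] += 100               # debris: all channels, all cycles

saliency = read_saliency(stack)
peak = np.unravel_index(saliency.argmax(), saliency.shape)
```

In this toy example the read dominates the saliency image while the debris spot, bright in every channel and cycle, is suppressed by steps 3 and 5.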
The positions of the blobs were used across all cycles to extract the base reads, which are defined using the highest channel intensity in each cycle for that position. We extracted these values from the images after we subtracted the small Gaussian blur and z-score normalized them, additionally applying a dilation morphological filter with a footprint of 2 pixels to ensure the maximum value of each dot is sampled. We associated barcodes (LMNA) or pairs of barcodes (PTEN) to the nearest genotype, making no call if two genotypes are equally close by edit distance. Later, we filtered cells on their least edit distance barcode.
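Nearest-genotype assignment with a no-call on ties can be sketched in plain Python. The text does not state which edit distance was used, so Levenshtein distance is assumed here; the function names and toy barcodes are hypothetical.

```python
from typing import Iterable, Optional


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance via the standard two-row dynamic programme."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                # deletion
                           cur[j - 1] + 1,             # insertion
                           prev[j - 1] + (ca != cb)))  # substitution or match
        prev = cur
    return prev[-1]


def assign_barcode(read: str, known: Iterable[str], max_dist: int) -> Optional[str]:
    """Return the closest known barcode, or None if too far or ambiguous."""
    ranked = sorted((edit_distance(read, bc), bc) for bc in known)
    best_dist, best_bc = ranked[0]
    if best_dist > max_dist:
        return None                      # too far from any known barcode
    if len(ranked) > 1 and ranked[1][0] == best_dist:
        return None                      # two genotypes equally close: no call
    return best_bc


library = ["ACGTACGTACGT", "TTTTACGTACGA", "GGGGCCCCAAAA"]
call = assign_barcode("ACGTACGTACGA", library, max_dist=1)   # one mismatch
tie = assign_barcode("AAAT", ["AAAA", "AATT"], max_dist=1)   # ambiguous
far = assign_barcode("CCCCCCCCCCCC", library, max_dist=1)    # too distant
```

The `max_dist` argument stands in for the downstream filter on the least edit distance per cell.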
Segmentation of cells and nuclei
Cell segmentation was performed using Cellpose39 version 2.2.1 for cell border segmentation and StarDist40 version 0.8.5 for nuclear segmentation. Cellpose requires the estimated diameter of cells to be specified to rescale images before inference. The method for cell segmentation may be adjusted depending on the cell type and imaging setup. We used diameter values of d=50 for U2OS cells and d=35 for iPS cells and NGN2-induced neurons. For U2OS cells we provided DAPI as the nuclear channel and wheat germ agglutinin (WGA) and phalloidin as the cytoplasmic channel to Cellpose. For iPS cells we provided DAPI as the nuclear channel and WGA as the cytoplasmic channel. Due to difficulties segmenting NGN2-induced neurons, we used only nuclear segmentation, converting it into an estimated cell segmentation by expanding the nuclear labels out to neighboring cells, with a maximum expansion distance of 25 pixels.
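One way to expand nuclear labels into approximate cell labels is a nearest-label assignment capped at a maximum distance. The sketch below uses `scipy.ndimage.distance_transform_edt`; whether the original pipeline used this exact approach is not stated, and the function name `expand_nuclei` is hypothetical.

```python
import numpy as np
from scipy.ndimage import distance_transform_edt


def expand_nuclei(labels: np.ndarray, max_distance: float = 25) -> np.ndarray:
    """Grow each nuclear label outward until it meets a neighbour or max_distance."""
    # For every background pixel: distance to, and index of, the nearest labelled pixel.
    dist, idx = distance_transform_edt(labels == 0, return_indices=True)
    nearest = labels[tuple(idx)]
    return np.where(dist <= max_distance, nearest, 0)


nuclei = np.zeros((100, 100), dtype=int)
nuclei[20, 20] = 1                       # isolated nucleus
nuclei[20, 80] = 2                       # isolated nucleus
nuclei[60, 40], nuclei[60, 50] = 3, 4    # two close neighbours

cells = expand_nuclei(nuclei, max_distance=25)
```

Pixels farther than 25 px from any nucleus stay unlabelled, and pixels between close neighbours go to whichever nucleus is nearer.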
Intensity normalization and feature extraction
We performed intensity normalization separately for each well’s phenotype images to adjust for uneven illumination of the field of view by subtracting, for each pixel, the 5th percentile value of that pixel across all fields in the well. After stitching and registering phenotype images with genotypes, we partitioned each well’s phenotype image into a 20-by-20 grid and passed images, cell segmentations from Cellpose39, and nuclear segmentations from StarDist40 into CellProfiler41 version 4.2.6 for all grid positions that contained genotyped cells. We then merged all of these grid positions to generate a well-level cells-by-features matrix. In the case of the PTEN VIS-seq experiments, we adjusted the pAKT channel (stained with AF647) for bleed-through from the ConA channel (stained with AF750) by subtraction. The CellProfiler41 pipelines for feature extraction used in PTEN and LMNA VIS-seq experiments are available in the STARCall GitHub repository (see Data and code availability section).
Generation of variant-level morphological profiles
We filtered out cells whose consensus barcodes were at an edit distance of >1 (LMNA) or >2 (PTEN) from the nearest barcode in the barcode-variant lookup table, that had no variant called, that contained barcodes (LMNA) or combinations of 2 barcodes (PTEN) present in <10 cells, or that contained variants with <5 barcodes. We then extracted variant-level feature medians and earth mover’s distance (EMD) values57,58 relative to WT for all variants and features. We filtered out feature EMDs that were non-reproducible in a manner similar to Pearson et al.58 Briefly, we randomly partitioned WT cells into two batches and computed scores equal to the EMD values between the batches for each feature, normalized to the median absolute deviation. After repeating this 25 times, we removed features above an average score threshold of 1.5 times the interquartile range plus the lower quartile average score.
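A minimal sketch of the per-feature EMD and the split-half reproducibility score is given below. It assumes a one-dimensional Wasserstein-1 distance between empirical distributions and assumes that the normalizing median absolute deviation is taken over all WT cells for that feature. All function names are hypothetical, and the final thresholding on the average scores is not shown.

```python
import random
import statistics


def emd_1d(a, b):
    """Earth mover's (Wasserstein-1) distance between two 1-D samples."""
    a, b = sorted(a), sorted(b)
    points = sorted(set(a) | set(b))
    i = j = 0
    total = 0.0
    for left, right in zip(points, points[1:]):
        # Advance both pointers to the empirical CDF values at `left`.
        while i < len(a) and a[i] <= left:
            i += 1
        while j < len(b) and b[j] <= left:
            j += 1
        total += abs(i / len(a) - j / len(b)) * (right - left)
    return total


def split_half_score(wt_values, rng):
    """EMD between two random halves of WT cells, scaled by the WT MAD."""
    shuffled = list(wt_values)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    med = statistics.median(wt_values)
    mad = statistics.median([abs(v - med) for v in wt_values])
    return emd_1d(shuffled[:half], shuffled[half:]) / mad


def mean_null_score(wt_values, n_repeats=25, seed=0):
    """Average split-half score over repeated random partitions."""
    rng = random.Random(seed)
    return statistics.mean(
        [split_half_score(wt_values, rng) for _ in range(n_repeats)]
    )
```

A feature whose WT cells give a large average score under random splitting varies substantially even without any variant effect, which is the rationale for removing it. The sketch does not guard against a zero median absolute deviation.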
We then used Pycytominer56 version 1.2.1 to perform variant-level median aggregation, followed by normalization and selection of variant-level medians and EMDs, to generate morphological profiles. This normalization was a z-scoring of each feature median using the synonymous variant mean and standard deviation for that feature in the LMNA VIS-seq experiment, and a z-scoring over all variants for the PTEN VIS-seq experiments. We first removed non-biologically-relevant features such as orientational or positional features, yielding 1077 features for LMNA and 3222 for PTEN. We then ran pycytominer56 feature_select() using both “correlation threshold” and “variance threshold” to generate the variant-level morphological profiles.
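The two normalization schemes differ only in the reference set used for the mean and standard deviation. The sketch below illustrates this for a single feature. It does not call Pycytominer; the function name, the use of the sample standard deviation, and the variant labels are assumptions.

```python
import statistics


def zscore_feature(feature_medians, reference_ids=None):
    """Z-score one feature across variants.

    `feature_medians` maps variant id -> feature median. If `reference_ids`
    is given (for example, the synonymous variants), the mean and standard
    deviation come from that subset; otherwise from all variants.
    """
    ids = list(reference_ids) if reference_ids is not None else list(feature_medians)
    reference = [feature_medians[v] for v in ids]
    mu = statistics.mean(reference)
    sd = statistics.stdev(reference)  # sample standard deviation (assumed)
    return {v: (x - mu) / sd for v, x in feature_medians.items()}


# Placeholder variant labels and made-up feature medians.
medians = {"syn_1": 1.0, "syn_2": 2.0, "syn_3": 3.0, "mis_1": 5.0}
print(zscore_feature(medians, reference_ids=["syn_1", "syn_2", "syn_3"]))
print(zscore_feature(medians))
```

Using a synonymous reference expresses each variant in units of synonymous-variant spread, whereas z-scoring over all variants expresses it relative to the library as a whole.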
Clinical variant curation
All clinical variant analysis excluded variants with predicted splicing impact defined by SpliceAI126 scores>0.2, since these assays do not detect splicing effects. LMNA ClinVar59 variants were accessed on 6/18/2024 and LP/P variants were curated. Conflicting variants were labeled as variants of uncertain significance (VUS). PTEN ClinVar variants were accessed on 12/19/2024; conflicting variants were treated as their most recent classification to generate a putative list of profiled VUS, LB/B and LP/P variants.
PTEN missense variants in gnomAD88 v4.1 were accessed on 5/14/2025 and used as a control set. PTEN missense variants present in 10 or more patient cancers in the Catalogue Of Somatic Mutations In Cancer (COSMIC)87, accessed on 12/12/2024, were used to define somatic PTEN cancer-associated variants.
For curation of PTEN clinical variant-phenotype associations for PHTS and ASD/DD, 170 publications containing a total of 541 probands with PTEN missense variants were identified using HGMD and ClinVar (accessed 5/13/2025). The clinical features for each individual with a PTEN missense variant were curated manually by a molecular genetic pathologist blinded to the functional assay results (AEM) and included demographics, clinical diagnosis (Cowden syndrome (including Cowden-like syndrome), Bannayan-Riley-Ruvalcaba syndrome, or PTEN hamartoma tumor syndrome), and the presence of autism spectrum disorder (ASD) or developmental delay/intellectual disability (DD/ID). For the analysis, probands reported to have ASD, DD, or ID were considered to be affected by the neurodevelopmental phenotype. These phenotypes were combined due to inconsistencies in reporting practices. A large language model (Claude Sonnet 4, Anthropic) was used to assemble the data table (Table S5), with the output verified manually. Any variant linked to a clinical phenotype in at least one affected proband was then associated with that clinical phenotype. See Figure 6A for the number of curated variants in each clinical category.
Variant effect predictor curation and comparison
We curated scores from three leading in silico variant effect predictors: AlphaMissense91, EVE92, and REVEL93. We compared these predictor scores to VIS-seq readouts by plotting VIS-seq-derived impact or distinguishability scores (or landmark feature z-scores; x-axis) against predictor scores (y-axis), and by visualizing VIS-seq-derived UMAP embeddings colored by predictor scores (Figure S10). Spearman correlation coefficients between predictor scores and VIS-seq readouts are shown. ClinVar variants are shown on the plot for comparison.
QUANTIFICATION AND STATISTICAL ANALYSIS
Impact scoring and clustering of variant-level morphological profiles
We performed principal component analysis (PCA) for dimensionality reduction and uniform manifold approximation and projection (UMAP)127 for visualization of morphological profiles. For PCA, we used the minimum number of PCA dimensions that could explain 60% (PTEN) or 70% (LMNA) of the dataset variance. For PTEN NGN2-induced neuron profiles, we noticed that three variants with poor inter-replicate correlation were outlying in PC1, and they were removed prior to rerunning PCA and further analysis. We then performed UMAP on PCA-reduced variant profiles using cosine similarity to visualize the local structure of the phenotype manifold.
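Selecting the minimum number of principal components for a target variance fraction amounts to a cumulative sum over the explained variance ratios. The sketch below assumes those ratios are already available, sorted in decreasing order; the function name and the example ratios are hypothetical.

```python
def n_components_for_variance(explained_variance_ratio, target):
    """Smallest number of leading components whose ratios sum to >= target."""
    cumulative = 0.0
    for k, ratio in enumerate(explained_variance_ratio, start=1):
        cumulative += ratio
        if cumulative >= target:
            return k
    raise ValueError("target exceeds the total explained variance")


# Made-up explained variance ratios for illustration.
ratios = [0.5, 0.125, 0.125, 0.125]
print(n_components_for_variance(ratios, 0.60))  # PTEN-style 60% target
print(n_components_for_variance(ratios, 0.70))  # LMNA-style 70% target
```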
To generate morphological impact scores, we computed the cosine similarity between a variant’s morphological profile and the median synonymous variant morphological profile, which was computed by taking the median over each selected feature. We then defined the morphological impact score as the scaled cosine distance, ½ × (1 − cosine similarity), which ranges from 0 to 1. Morphological impact scores were compared between groups of variants using boxplots with the median and inter-quartile range (IQR) shown, with n shown below the label indicating the number of variants in each group, and Mann-Whitney U p-value cutoffs reported (Figures 2F, 6C, S5C, S5D, S5I, S6J, S9B, and S9D).
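The impact score definition can be written out directly. The sketch below is illustrative, with hypothetical function names and toy two-feature profiles; it assumes that neither profile is the zero vector.

```python
import math
import statistics


def median_profile(profiles):
    """Feature-wise median over a list of equal-length profiles."""
    return [statistics.median(column) for column in zip(*profiles)]


def impact_score(profile, reference):
    """Scaled cosine distance: 0.5 * (1 - cosine similarity), in [0, 1]."""
    dot = sum(p * r for p, r in zip(profile, reference))
    norm = (math.sqrt(sum(p * p for p in profile))
            * math.sqrt(sum(r * r for r in reference)))
    return 0.5 * (1.0 - dot / norm)


# Toy synonymous profiles with two features each (made-up values).
synonymous = [[1.0, 0.0], [1.0, 0.0], [3.0, 0.0]]
reference = median_profile(synonymous)

print(impact_score([2.0, 0.0], reference))   # same direction as the reference
print(impact_score([0.0, 1.0], reference))   # orthogonal to the reference
print(impact_score([-1.0, 0.0], reference))  # opposite direction
```

Because the score depends only on the angle between profiles, a variant that scales every feature of the reference profile by the same positive factor receives a score of 0.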
To perform Louvain clustering128 on the LMNA profiles, we generated an unweighted k-nearest-neighbor graph (k=25) from the PCA-reduced variant profiles using cosine distance. Then, we used the Louvain community detection algorithm to partition variants into clusters. To demonstrate separation by profiles, we plotted variants organized by cluster against the selected features used to generate profiles (Figure S3A). We plotted cosine similarities between pairs of variants within a cluster and between different clusters to demonstrate that clusters were distinct, with Mann-Whitney U p-value cutoffs reported (Figure S3B). To demonstrate cluster robustness, we repeated the clustering 100 times, each time randomly removing 10% of variants, and plotted the fraction of cluster co-occurrence (Figure S3C). For each cluster, the z-scored ratio between lamin A and DAPI intensity across radial deciles is shown with Mann-Whitney U p-value cutoffs reported (Figures 3C and S4D). Lastly, the distributions of variants across clusters by subdomain, position group, and amino acid substitution are shown with Fisher’s exact test p-value cutoffs reported (Figures S5B, S5J, and S5L).
Training of variant-specific classifiers and distinguishability scoring
To characterize the morphological distinguishability of each variant from WT using single-cell image data, we trained a collection of binary classifiers, each tasked with distinguishing a single variant from the wild type (Figure S2F). The input to each variant-specific model was the set of CellProfiler-derived41 features (1,077 for LMNA, 3,222 for PTEN iPS cells, and 3,227 for PTEN neurons), extracted from wild type cells and the cells corresponding to that variant. To avoid any data leakage, no feature selection was performed. Features were not z-scored. The output was a binary label indicating whether each cell is wild type or variant. To prepare the training data, we first removed any variants that had fewer than 150 cells across both replicates, after which we divided the filtered training data into an 8:1:1 train/validation/test split, stratified by variant type. For each variant, we used the train and validation sets to train a single binary classifier using the Extreme Gradient Boosting (XGBoost) algorithm129, where the loss on the validation set was used to determine when to stop training. Finally, after training variant-specific classifiers, we defined each variant’s distinguishability score as the area under the receiver operating characteristic curve (AUROC) on the held-out test set (Tables S2 and S4). This test set was not used during training, providing an unbiased estimate of how well each model generalizes to unseen data. Distinguishability scores were compared between groups of variants with n reporting the number of variants in each group and Mann-Whitney U p-value cutoffs reported (Figures 2G, S5I, S6K, and S9E).
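For reference, the AUROC used as the distinguishability score equals the probability that a randomly chosen variant cell receives a higher classifier score than a randomly chosen WT cell, with ties counted as one half. The sketch below computes it by direct pairwise comparison. The function name and the toy labels and scores are hypothetical, and classifier training is not shown.

```python
def auroc(labels, scores):
    """Area under the ROC curve via pairwise comparison.

    `labels` are 1 for variant cells and 0 for WT cells; `scores` are the
    classifier outputs for the same held-out cells.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy held-out cells: two WT (0) and two variant (1), made-up scores.
print(auroc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))
```

A score of 0.5 corresponds to a classifier that cannot separate variant from WT cells, and 1.0 to perfect separation. The pairwise form is quadratic in the number of cells and is meant for illustration rather than large test sets.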
For PTEN variants, the distinguishability score difference was defined as the percentile difference in distinguishability scores between iPS cells and NGN2-induced neurons (Figure S8J). iPS cell or NGN2-induced neuron features were ranked by Spearman correlation with the distinguishability score difference. Spearman correlation values were converted to Student’s t-test statistics and then Bonferroni-corrected p-values were reported (Figure S8K).
Volcano analysis of features
To generate variant-level feature p-values for each replicate, we performed a Kolmogorov-Smirnov (KS) test between each variant and WT for each feature to generate a matrix of p-values with dimensions of (number of variants: 1,767 for LMNA and 1,228 for PTEN) by (number of features: 1,077 for LMNA, 3,222 for PTEN iPS cells, and 3,227 for PTEN neurons). Distributions of feature values over single cells were tested. See Tables S2 and S4 for the number of cells in each replicate for each variant. P-values from each replicate were combined using Fisher’s method. Next, we generated a volcano plot in which each feature was summarized by the median z-score effect across all variants (x-axis) and the geometric mean, across variants, of the KS-test p-values for that feature (y-axis). This analysis was repeated for both feature medians and feature EMDs. See Tables S1 and S3 for the geometric mean p-values and median effects on each feature. We then called features as hits if they met both criteria: Bonferroni-corrected p-value < 0.01 and |median z-score| > 0.25 for LMNA (or > 1.0 for PTEN) (Figures 2D, S2E, S6G, and S6I). For both LMNA and PTEN, landmark features were interpretable hit features that were highly significant and strongly perturbed. For curated variant-specific volcano plots (Figure 6F), we summarized each feature by the median z-score effect across variants in the curated group (x-axis) and by the geometric mean of the corresponding p-values across variants in that same group (y-axis).
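The p-value handling in this analysis can be sketched in three small steps: combining replicate p-values with Fisher’s method, summarizing a feature by the geometric mean across variants, and applying the hit criteria. The function names are hypothetical. The Bonferroni correction is assumed here to multiply by the number of features tested, and p-values are assumed to be strictly positive.

```python
import math


def fisher_combine(p_values):
    """Fisher's method: -2 * sum(ln p) follows chi-square with 2k d.o.f.

    For even degrees of freedom the chi-square survival function has the
    closed form exp(-x/2) * sum_{i<k} (x/2)^i / i!.
    """
    k = len(p_values)
    half_x = -sum(math.log(p) for p in p_values)
    term, tail = 1.0, 1.0
    for i in range(1, k):
        term *= half_x / i
        tail += term
    return math.exp(-half_x) * tail


def geometric_mean_p(p_values):
    """Geometric mean of p-values, computed in log space."""
    return math.exp(sum(math.log(p) for p in p_values) / len(p_values))


def is_hit(geo_mean_p, median_z, n_features, z_cutoff):
    """Apply the Bonferroni-corrected p < 0.01 and |median z| criteria."""
    corrected = min(1.0, geo_mean_p * n_features)
    return corrected < 0.01 and abs(median_z) > z_cutoff


# Two replicate p-values for one variant and one feature (made-up values).
print(fisher_combine([0.05, 0.05]))
```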
Structural analysis
We used the lamin A A11 tetramer structure63 (PDB: 6JLB), the recently published A22 four-coil interaction structure64, and the PTEN structure71 (PDB: 1D5R) to perform structural analysis of morphological profiles. Structures were visualized and residues colored with position-averaged feature z-scores using PyMOL v3.1. Lamin A rod subdomains were defined using annotations in Ahn et al.63 and PTEN P-loop and TI-loop annotations were defined using annotations in Lee et al.71
To call lamin A dimer-facing and multimer-facing positions, we plotted the inter-chain minimum β-carbon to β-carbon distance for each lamin A position (except glycines) on a given chain. We identified residues at local minima of this inter-chain distance, and any residues within 1.5 Å of the local minimum distance, as interacting with an opposing chain (Figure S5H). Interactions were labeled as dimer-facing or multimer-facing depending on the geometry of the chains: coil 1B forms the A11 lateral structure and therefore contains both types of interacting residues, whereas coil 2A forms the A22 four-helix bundle and therefore contains only multimer-facing residues.
Groups of LMNA positions defined by clustering (Figures S5E and S5F) were compared by lamin A subdomain (Figures 4D and 4E) or by participation in dimer or multimer contacts (Figure 4F), with n equal to the number of such positions shown on the y-axis and χ2 goodness-of-fit p-value cutoffs reported.
Mislocalization scoring
We noticed that PTEN variant localization, as measured by DAPI-PTEN correlation, was related to variant abundance as measured by mEGFP-PTEN intensity and lipid phosphatase activity as measured by pAKT intensity (Figure S8D). To isolate the effect of variants on localization, we computed a mislocalization score for each variant. For each variant, we fit a kernel ridge regression model with a radial basis function kernel (α = 1.0, γ = 1.0) to predict DAPI–PTEN correlation from abundance and activity features. The residual from this model, defined as the observed DAPI–PTEN correlation minus the model-predicted value, was used as the mislocalization score. Positive residuals indicate greater nuclear localization than expected given abundance and activity, whereas negative residuals indicate less nuclear localization than expected. iPS cell mislocalization scores are reported in this publication (Figure S8E). Mislocalization scores are shown using boxplots illustrating median and IQR for curated clinical PTEN variant categories with n indicating the number of variants and Mann-Whitney U p-value cutoffs reported (Figure 6E).
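The residual-based score can be illustrated with a small kernel ridge regression written from first principles. This sketch makes several assumptions: a single model is fit across all variants, the abundance and activity features are already on comparable scales, there is no intercept term, and a plain Gaussian-elimination solver stands in for a numerical library. The function names and toy data are hypothetical.

```python
import math


def rbf(u, v, gamma):
    """Radial basis function kernel exp(-gamma * ||u - v||^2)."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(u, v)))


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        known = sum(M[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (M[r][n] - known) / M[r][r]
    return x


def mislocalization_scores(X, y, alpha=1.0, gamma=1.0):
    """Residuals (observed minus fitted) of kernel ridge regression.

    X holds [abundance, activity] per variant; y holds the observed
    DAPI-PTEN correlation per variant.
    """
    n = len(y)
    K = [[rbf(X[i], X[j], gamma) for j in range(n)] for i in range(n)]
    A = [[K[i][j] + (alpha if i == j else 0.0) for j in range(n)]
         for i in range(n)]
    dual = solve(A, y)  # dual coefficients (K + alpha * I)^-1 y
    fitted = [sum(K[i][j] * dual[j] for j in range(n)) for i in range(n)]
    return [obs - fit for obs, fit in zip(y, fitted)]


# Toy variants with made-up abundance/activity features and correlations.
X = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [2.0, 2.0]]
y = [0.1, 0.5, -0.2, 0.9]
print(mislocalization_scores(X, y))
```

With this sign convention a positive score means the observed DAPI-PTEN correlation is higher than the model predicts from abundance and activity, matching the interpretation given in the text.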
PTEN VIS-seq score or feature effects for curated clinical variants
For comparisons of VIS-seq-derived scores or features between curated PTEN variant categories, gnomAD88 variants that were not ClinVar LP or P were used as a control set. Scores or features are shown using boxplots illustrating median and IQR with n indicating the number of variants in each clinical category and Mann-Whitney U p-value cutoffs reported (Figures 6C, 6E, and S9D–S9G).
Assay or predictor performance on curated clinical variants
For LMNA, we used VIS-seq-derived variant-level impact scores (computed from feature medians, feature EMDs, or both) and distinguishability scores for zero-shot binary classification of either ClinVar pathogenicity or aggregation42 versus synonymous controls (Figure S2J). High-frequency LMNA variants in gnomAD are often pathogenic and are not an appropriate control set. Since these models are zero-shot, no training was performed and the test set is the entire dataset of variants. Because protein-level variant effect predictors do not provide meaningful scores for synonymous variants, these predictors were excluded from this analysis.
For PTEN, we used VIS-seq-derived variant-level distinguishability scores (from iPS cells and NGN2-induced neurons), together with yeast fitness84 scores, VAMP-seq5 scores, and in silico predictors (AlphaMissense91 and EVE92), for zero-shot binary classification of ClinVar pathogenicity or curated disease-associated variant classes versus ClinVar likely benign or gnomAD variants. gnomAD variants that were not ClinVar LP or P or present in the clinical variant-phenotype associations we curated were used as a control set. REVEL93 was trained and validated on ClinVar and therefore was excluded from this analysis. Since these models are zero-shot, no training was performed and the test set is the entire dataset of variants. ROC curves and corresponding AUC values for these zero-shot models are shown in Figures 6B and S9C.
For three-class prediction among (i) gnomAD controls, (ii) PHTS variants not associated with ASD/DD, and (iii) ASD/DD variants not associated with PHTS, we first excluded variants not belonging to exactly one of these classes. Then, we trained linear support vector classifiers (SVC) using either all three VIS-seq landmark features from iPS cells or NGN2-induced neurons, or one of the one-dimensional variant-level scores used in the binary analyses mentioned above. Classes were reweighted to address class imbalance. We used stratified five-fold cross-validation to generate the out-of-fold probabilities used for ROC and AUC. For each model, multiclass ROC curves were computed by macro-averaging one-vs-rest sensitivity and specificity across the three classes, weighting each class equally (Figure 6G).
Supplementary Material
Figure S1. piggyBac MOI titration and knockout Western blots, related to Figure 1
(A) Fraction of cells with 0, 1, 2, or 3 integrations, determined by in situ sequencing, after co-transfection of U2OS cells with different quantities (in nanograms) of LMNA VIS-seq library DNA and a plasmid encoding piggyBac transposase (at a 4-fold lower mass), followed by puromycin selection.
(B) Imaging of day-7 neuron-like cells derived from clonal NGN2-inducible PTEN knockout lines stained for DAPI (blue) and NCAM1 (red). Scale bar indicates 20 μm.
(C) Unmatched images of mEGFP-tagged H1.4, RPS19, and PTEN libraries in human WTC11 iPS cells or PTEN library in NGN2-induced neuron-like cells (right), stained with DAPI and phalloidin-CF568 (top row) and first base in situ sequencing (bottom row). Some cells express variants with localization defects (HIST1H1E=chromatin binding, RPS19=nucleolar-cytoplasmic, PTEN=nucleo-cytoplasmic). Nucleobase coloring identical to Figure 1E. Scale bar indicates 20 μm.
(D) Western blot of U2OS cells showing parental line (left) and LMNA knockout line clone 3 (right) stained for lamin A protein. Two replicate lanes are shown. Clone 3 was used in subsequent LMNA VIS-seq experiments.
(E) Western blot of NGN2-inducible iPS cells showing parental line (left) and PTEN knockout clonal lines (right) stained for PTEN and phospho-AKT protein. Clonal lines C2 and C18 were used for PTEN VIS-seq replicate 1 and 2, respectively.
Figure S2. LMNA VIS-seq replication, scoring, and feature analysis, related to Figure 2
(A) 12-base pair barcode sequences on circular RNAs were read by in situ sequencing-by-synthesis in each cell and then mapped to corresponding LMNA variants using a barcode-to-variant dictionary made by long-read sequencing. Example cells with reads in all 12 cycles are shown.
(B) Histogram of the total number of 12-base pair reads per sequenced cell in a single well of LMNA replicate 2 experiment.
(C) Histogram of the edit distance between consensus cell-level 12-base pair read and nearest library barcode in a single well of LMNA replicate 2 experiment. Red line indicates that cells with edit distance of 0 or 1 were used if they matched to a unique barcode.
(D) Number of cells genotyped for each LMNA variant in both replicates of VIS-seq screen colored by variant type, with Pearson’s r shown.
(E) Volcano plots of median LMNA variant effect on feature median z-scores (x-axis, left) or feature EMD z-scores (x-axis, right) against geometric-mean KS-test p-values (y-axis, tested against WT feature distributions). Both the feature effect size (x-axis) and p-values (y-axis) are computed over all profiled variants. Points are features with area proportional to the number of variants that pass thresholds and coloring is by imaging channel (top) or cellular compartment (bottom) (see right for legend). Red dashed lines show the Bonferroni-corrected p<0.01 threshold and effect thresholds used to define hit features.
(F) Flowchart describing training a binary classifier for each variant to distinguish images of cells expressing that variant from corresponding WT cell images. For each variant, a distinguishability score summarizing classifier performance was computed as the area under a ROC curve on a test set of single cells. 0.5 indicates random classifier performance and 1 indicates perfect discrimination between variant and WT single cells. For a full description, see Methods.
(G) Morphological impact score of variants in both replicates of the LMNA VIS-seq experiment colored by variant type as in (D), with Pearson’s r shown.
(H) Distinguishability score of variants in both replicates of the LMNA VIS-seq experiment colored by variant type as in (D), with Pearson’s r shown.
(I) Boxplots of morphological impact scores computed from only median, only EMD, or median and EMD-based variant profiles for the same groups of variants as Figure 2F. Points represent individual variant impact scores.
(J) Receiver operating characteristic (ROC) curves are plotted for univariate zero-shot models predicting LMNA aggregating control variants (left) or ClinVar P/LP variants (right) from synonymous variants, using impact scores shown in (I) or distinguishability scores. AUC is shown on the right for each model.
Figure S3. LMNA variant profile clustering is robust and separates variants by selected and landmark features, related to Figure 3
(A) Heatmap of LMNA variants organized by their Louvain-derived cluster (y-axis, colored according to Figure 3A) versus selected feature medians (left portion of heatmap, x-axis) or selected feature EMDs (right portion of heatmap, x-axis) organized by hierarchical clustering. Each feature median and EMD is z-scored using the synonymous distribution of the feature. Imaging channel and compartment are annotated for each feature on the top, with the clustering dendrogram shown (see legend on the right). Landmark feature medians, EMDs, and impact and distinguishability scores are shown separately on the right. Nuclear mEGFP-lamin A intensity and granularity features, nuclear shape features, and nuclear correlation features between DAPI and mEGFP-lamin A channels are annotated below.
(B) Boxplot depicting cosine similarity computed for pairs of variants using either selected feature medians or EMDs. Pairs of variants were chosen in different ways to assess the robustness of the Louvain-derived clusters (see Figure 3A). To assess the baseline cosine similarity between the morphological profiles of variants, pairs were chosen from all variants (“global”, left column, grey boxes). To assess the similarity between variants in each cluster, pairs were drawn from the cluster (“within”, right columns, blue boxes). To assess the similarity between variants within a cluster compared to variants outside the cluster, one variant was drawn from the cluster and one was drawn from outside the cluster (“between”, right columns, orange). Each boxplot represents the distribution of cosine similarities computed for all pairs of the indicated type. Louvain cluster colors are shown above. *** indicates Mann-Whitney U FDR-corrected q<0.001.
(C) To assess the stability of our Louvain clustering (see Figure 3A), the clustering was performed 100 times, each time omitting a randomly selected 10% of variants. The fraction of samples in which each pair of variants was located in their original Louvain cluster is shown. Variants are organized by their original Louvain cluster labels shown on the left.
(D) UpSet plot showing the number of missense LMNA variants that impact each combination of landmark features defined in Figure 3B. Features were considered impacted if EMD z-score>2.5 and KS-test Bonferroni-corrected p<0.01. Sets, represented by bars and dots in the plot, are colored by the most frequently impacted feature in the set (no impacted features=black, circularity=orange, granularity=purple, intensity/boundary intensity=teal).
Figure S4. LMNA VIS-seq landmark feature analysis, related to Figure 3
(A) Randomly-selected cells from the 10th and 90th percentile in the landmark feature distributions are shown. mEGFP-tagged lamin A channel is shown in green, and DAPI in blue. Silver dashed lines depict the borders of cells. Scale bar indicates 5 μm.
(B) Plot showing variant effects on nuclear mEGFP-lamin A granularity 1, 2 and 3 features by Louvain cluster of residence. Z-scores are versus the synonymous distribution. Synonymous variants are shown on the left. Granularity scale by pixel size is shown in the legend.
(C) LMNA feature median z-scores for mEGFP-lamin A boundary intensity, nuclear eccentricity, nuclear solidity, and nuclear eccentricity are plotted against each other for all profiled variants. Z-scores are versus the synonymous variant distribution. Variant type is colored according to: synonymous variants (green), missense (grey), and frameshift variants (purple).
(D) Plot over nuclear radial deciles of the mEGFP-lamin A total intensity in each decile normalized to the DAPI total intensity in that decile. Bin 1 corresponds to the innermost radial decile and bin 10 to the outermost. The data for the remainder of the clusters and a diagram of nuclear radial deciles are shown in Figure 3C.
(E) Map of lamin A missense variant effects on the lamin A nuclear intensity feature (left; grey boxes=missing variants, black dots=synonymous substitutions) and on the UMAP visualization of variant profiles (right, see Figure 2H). Blue to red coloring indicates feature z-score versus the synonymous variant distribution. Above the map, positions are annotated by their participation in multimer contacts (first row, defined below), α-helical average substitution profile (second row, defined below and in Figure S5F), and proline-substitution profile (third row, defined below and in Figure S5E). Linker or coil subdomains are shown in the thicker annotation bar directly above the heatmap63. Shown below the heatmap are position-averaged scores, and on the right are amino acid substitution marginal feature z-score distributions (points depict variants; black lines depict medians).
(F) Heatmap and UMAP for missense substitution effects on the lamin A nuclear boundary intensity feature, colored and annotated as in (E).
(G) Heatmap and UMAP for missense substitution effects on the lamin A nuclear granularity 1 feature, colored and annotated as in (E).
Figure S5. VIS-seq profiles separate lamin A residues by structural and functional properties, related to Figure 4
(A) UMAP visualization of LMNA synonymous (dark green) or missense variant profiles colored by lamin A subdomain.
(B) Heatmap of proportion of missense variants in coil or linker 12 subdomains present in each Louvain cluster (see Figure 3A). Proportion is indicated inside each box. Color scale for heatmap shown on the right. *** indicates Fisher’s exact test FDR-corrected q<0.001 and ** indicates q<0.01.
(C) Morphological impact score of missense substitutions plotted by domain, compared with synonymous variants (green). *** indicates Mann-Whitney U-test p<0.001.
(D) Morphological impact score of lamin A missense substitutions plotted by amino acid, compared with synonymous variants (green). *** indicates Mann-Whitney U-test p<0.001.
(E) Pearson’s correlation coefficients between LMNA proline substitution morphological profiles in PCA space (light color=positive correlation, dark=negative correlation). Bars on top indicate each position’s subdomain and participation in multimer contacts (see (G), Methods, color legend shown above and (E)). Bars on right indicate dendrogram-derived grouping into three clusters: linker, linker-proximal and α-helical.
(F) Pearson’s correlation coefficients between the α-helical (defined in (D)) PCA space, position-averaged variant morphological profiles (light color=positive correlation, dark=negative correlation). Synonymous variants were averaged into a single PCA vector and included (green annotation, top). Bars on top indicate each position’s subdomain63 (color legend above and (D)) and participation in multimer contacts63,64 (see (G), Methods). Bars on right indicate dendrogram-derived grouping into three clusters: strongly aggregating, low abundance, and low impact.
(G) Contour plots indicate location on the UMAP shown in (Figure 2H) for missense substitutions at each group of positions defined in (E), labeled according to their effects on lamin A.
(H) Intra-dimer (left) and inter-dimer coil 1B (middle) and coil 2A (right) minimum β-carbon distances derived from A1163 and A2264 multimer structures. Dimer-facing residues are indicated in blue (left) and multimer-facing residues are indicated in red for A1163 and A2264 structures (middle, right, respectively). See Methods for how these residues are defined.
(I) Morphological impact score (left) or distinguishability score (right) of missense substitutions at α-helical position groups as defined in (E) by clustering amino acid positions. *** indicates Mann-Whitney U-test p<0.001.
(J) Heatmap of proportion of missense variants in each position cluster from (E) and (F) present in each Louvain cluster (see Figure 3A). Proportion is indicated inside each box. Color scale for heatmap shown on the right. *** indicates Fisher’s exact test FDR-corrected q<0.001, ** indicates q<0.01, and * indicates q<0.05.
(K) Heatmap of average effects on scores and landmark features of substitutions to amino acids in proline substitution profile-defined position groups (see (E)) and α-helical position groups (see (F)). Lprox indicates linker-proximal, αH indicates α-helical, Agg indicates aggregation-sensitive, Abund indicates abundance-sensitive, and LI indicates low impact position groups. Color scales for each heatmap shown on the bottom.
(L) Heatmap of proportion of missense variants to groups of amino acids present in each Louvain cluster (see Figure 3A). Proportion is indicated inside each box. Color scale for heatmap shown on the right. *** indicates Fisher’s exact test FDR-corrected q<0.001, ** indicates q<0.01, and * indicates q<0.05.
Figure S6. PTEN VIS-seq replication, scoring, and feature analysis in iPS cells and NGN2-induced neurons, related to Figure 5
(A) Histogram of total read count per genotyped cell in a single well of PTEN iPS cell (blue) or NGN2-induced neuron-like cell (green) replicate 2 experiments. Each genotyped cell has two 8-bp barcodes.
(B) Total (summed) edit distance between consensus cell-level 16-base pair double barcode reads and nearest library double barcode in a single well of PTEN iPS cell (blue) or NGN2-induced neuron (green) replicate 2 experiments. Red line indicates the edit distance ≤ 2 filtering threshold we used for matching of sequenced barcodes to our barcode library.
(C) Total edit distance between consensus cell-level 16-base pair double barcode reads and nearest library double barcode (x) or second nearest library double barcode (y) in a single well of PTEN iPS cell replicate 2 experiment. Red line indicates that cells with total edit distance < 3 were used if they matched to a unique library double barcode.
(D) Boxplots of the number of profiled single cells per variant of the indicated class for iPS cells (left) or NGN2-induced neurons (right) containing single PTEN variants over both replicates.
(E) Number of cells genotyped for each PTEN variant in iPS cell (x) and NGN2-induced neuron (y) of independently transfected and differentiated replicate 1 (left) and replicate 2 (right) of VIS-seq experiment, with Pearson’s r shown.
(F) Stacked barplot showing proportion of all features measured in iPS cells, selected features which generate variant profiles, and hit features (defined in (G)) by imaging channel, cellular compartment, or method of computation of feature. The number of features in each set is indicated. Color legends are shown below.
(G) Volcano plots of median variant effect on iPS cell feature median z-scores (x-axis, left) or feature EMD z-scores (x-axis, right) against geometric-mean KS-test p-values (y-axis). Both the feature effect size (x-axis) and p-values (y-axis) are computed over all profiled variants. Points are features with area proportional to the number of variants that pass thresholds and coloring is by imaging channel (top) or cellular compartment (bottom) (see (F) for legend). Red dashed lines show the Bonferroni-corrected p<0.01 threshold and effect thresholds. Hit features are defined by median effect thresholds (left).
(H) Stacked barplot showing proportion of all features measured in NGN2-induced neurons as in (F).
(I) Volcano plots of NGN2-induced neuron features as in (G). See (H) for legend.
(J) Morphological impact score for PTEN variants in iPS cells (left) and NGN2-induced neurons (right), plotted by variant type. *** indicates Mann-Whitney p<0.001.
(K) Distinguishability scores for PTEN variants in iPS cells (left) or NGN2-induced neurons (right), plotted by variant type. Shown comparisons (top) have Mann-Whitney p<0.001.
(L) Morphological impact scores for variants in each replicate of PTEN VIS-seq experiment in iPS cell (left) and NGN2-induced neurons (right), colored by variant type as in (E), with Pearson’s r shown.
(M) Morphological impact scores (top) and distinguishability scores (bottom) for PTEN variants in both iPS cells and NGN2-induced neurons are compared, colored by variant type as in (B). Pearson’s r shown.
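The barcode matching rule described in panels (B) and (C) can be illustrated with a minimal Python sketch. This is not the authors' pipeline. It assumes that a 16-base double barcode is represented as two 8-base halves, that "edit distance" means Levenshtein distance, and that a match is "unique" when no other library barcode ties for the minimum total distance. The names `edit_distance`, `total_distance`, and `match_barcode` are hypothetical.

```python
from typing import Optional, Sequence, Tuple

# Assumption: a double barcode is a pair of barcode halves.
DoubleBarcode = Tuple[str, str]


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance (substitutions, insertions, deletions)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1]


def total_distance(read: DoubleBarcode, ref: DoubleBarcode) -> int:
    """Summed edit distance over both halves of a double barcode."""
    return edit_distance(read[0], ref[0]) + edit_distance(read[1], ref[1])


def match_barcode(read: DoubleBarcode,
                  library: Sequence[DoubleBarcode],
                  max_distance: int = 2) -> Optional[DoubleBarcode]:
    """Return the nearest library barcode, or None if the read is rejected.

    A read is accepted only if its nearest library barcode is within
    max_distance and no second library barcode is equally close.
    """
    ranked = sorted((total_distance(read, ref), ref) for ref in library)
    if not ranked:
        return None
    best_distance, best_ref = ranked[0]
    if best_distance > max_distance:
        return None  # too far from every library barcode
    if len(ranked) > 1 and ranked[1][0] == best_distance:
        return None  # ambiguous: two library barcodes tie for nearest
    return best_ref
```

In this sketch, comparing the nearest and second-nearest distances corresponds to the x and y axes of panel (C): a read is kept only when the nearest distance is at most 2 and strictly smaller than the second-nearest distance.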
Figure S7. Landmark features in VIS-seq PTEN experiments, related to Figure 5
(A,B,C) Missense variant effect maps for NGN2-induced neuron-like cell PTEN landmark features (grey boxes=missing variants, black dots=synonymous substitutions). Blue to red coloring indicates feature z-score versus the synonymous variant distribution. Linker or coil subdomains are shown in the thicker annotation bar directly above the heatmap. Shown below the heatmap are position-averaged scores, and on the right are amino acid substitution marginal feature z-score distributions (points depict variants; black lines depict medians).
(D) Two randomly-selected iPS cells (top) or NGN2-induced neurons (bottom) expressing PTEN WT or variants are shown. mEGFP-tagged PTEN channel is shown in green, DAPI in blue, phalloidin (top) or MAP2 (bottom) in yellow, and pAKT in red. Silver dashed lines depict the borders of cells. Scale bar indicates 5 μm.
(E) PTEN abundance score for PTEN variants in iPS cells (upper left) and NGN2-induced neurons (upper right) plotted against VAMP-seq5 score, with best-fit line and Pearson’s r shown. pAKT intensity score for PTEN variants in iPS cell (bottom left) and NGN2-induced neuron-like cells (bottom right) plotted against yeast fitness84 score, with Pearson’s r shown.
(F) iPS cell mEGFP-PTEN intensity score positional averages are used to color the PTEN crystal structure (1D5R)71. Solvent-exposed surface is colored by the exposed residue’s z-score. Blue indicates low-intensity positions. The bound tartrate molecule is shown in green. Residue color scale shown to the right of (H).
(G) iPS cell pAKT intensity score positional averages are used to color the PTEN crystal structure (1D5R)71.
(H) iPS cell DAPI-PTEN correlation score positional averages are used to color the PTEN crystal structure (1D5R)71.
Figure S8. Relationships between PTEN landmark features and scores in iPS cells and NGN2-induced neuron-like cells, related to Figure 5
(A) iPS cell landmark feature z-scores are plotted against NGN2-induced neuron-like cell landmark feature z-scores for all profiled variants, colored by variant type. Z-scores are versus the synonymous variant distribution.
(B) iPS cell nucleus to cytoplasm PTEN intensity ratio z-scores are plotted against iPS cell DAPI-PTEN correlation z-scores for all profiled variants, colored by variant type as in (A).
(C) Violin plots of iPS cell and NGN2-induced neuron PTEN abundance (left) and DAPI-PTEN correlation z-scores (right). iPS cell-specific nuclear-localized variants are indicated. *** indicates KS p<0.001.
(D) iPS cell PTEN landmark feature z-scores are plotted against each other for all profiled variants. Variants are colored by type as in (A). iPS cell-specific lipid phosphatase-active nuclear-localized variants are indicated.
(E) The mislocalization score is defined as the variant-level residual when DAPI-PTEN correlation is regressed against PTEN intensity and pAKT intensity (see Methods). Positional heatmap of iPS cell mislocalization scores for missense, stop gain, or 3-nucleotide deletion variant profiles. Positional average of scores shown on the bottom and amino acid substitution marginal distribution shown on the right.
(F) Mislocalization score in iPS cells computed independently for replicates 1 and 2 are plotted. Synonymous (green) and missense (grey) variants are shown. Pearson’s r shown on the top left.
(G) PTEN structure (1D5R)71 with spheres centered at α-carbon atoms with radii indicating the number of lipid-phosphatase active (defined as pAKT z-score<2.5) and nuclear-localized (defined as DAPI-PTEN correlation z-score>2.5) variants at that position. The top four positions indicating nuclear localization are highlighted, with the number of variants at each of these positions indicated.
(H) iPS cell mislocalization score positional averages are used to color the PTEN crystal structure (1D5R)71. Blue indicates aberrantly cytoplasmic-localizing positions and red indicates aberrantly nuclear-localizing positions. Color scale shown on the right.
(I) UpSet plot showing the number of PTEN missense variants that perturb each combination of landmark features in iPS cells and NGN2-induced neurons. Landmark features for each variant were considered perturbed if feature |z-score|>2.5 and KS-test Bonferroni-corrected p<0.01. Sets, represented by bars and dots in the plot, are colored by combinations of impacted features (neither pAKT nor PTEN abundance impacted in both cell types = black, pAKT impacted in both cell types = orange, PTEN abundance impacted in both cell types = teal, both pAKT and PTEN abundance impacted in both cell types = purple).
(J) The distinguishability score difference is defined as the variant-level difference in distinguishability score percentile between iPS cells and neurons. Positional heatmap of distinguishability score difference for missense, stop gain, or 3-nucleotide deletion variant profiles in iPS cells. Positional average of scores shown on the bottom and amino acid substitution marginal distribution shown on the right.
(K) Waterfall plots of all iPS cell features (top) or neuron features (bottom) ranked by Spearman’s correlation with distinguishability score difference. Positive values indicate that larger feature values are associated with higher distinguishability in iPS cells relative to neurons. Landmark features are indicated and tracks are shown below the plot indicating feature imaging channel and cellular compartment. Lines above and below the plot labeled by *** indicate which features have Student’s t-test Bonferroni-corrected p<0.001.
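The residual-based mislocalization score defined in panel (E) can be illustrated with a short Python sketch. It assumes an ordinary least-squares fit with an intercept, which the legend does not specify (see Methods for the actual procedure). The function names `solve_linear_system` and `regression_residuals` are hypothetical, and the sketch uses only the standard library.

```python
from typing import List, Sequence


def solve_linear_system(a: Sequence[Sequence[float]],
                        b: Sequence[float]) -> List[float]:
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [list(row) + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("predictors are collinear")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def regression_residuals(y: Sequence[float],
                         predictors: Sequence[Sequence[float]]) -> List[float]:
    """Residuals of an OLS fit of y on the predictors plus an intercept.

    Assumed usage: y is the per-variant DAPI-PTEN correlation score and
    predictors are the per-variant PTEN intensity and pAKT intensity scores.
    """
    rows = [[1.0] + [p[i] for p in predictors] for i in range(len(y))]
    k = len(rows[0])
    # Normal equations: (X^T X) beta = X^T y
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    beta = solve_linear_system(xtx, xty)
    return [yi - sum(bj * rj for bj, rj in zip(beta, r))
            for r, yi in zip(rows, y)]
```

Under these assumptions, a positive residual marks a variant whose DAPI-PTEN correlation is higher than its abundance and pAKT scores would predict, and a negative residual marks the opposite, matching the red and blue directions described in panel (H).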
Figure S9. PTEN VIS-seq profiles predict pathogenicity and clinical phenotype, related to Figure 6
(A) UMAP visualization of iPS cell (left) and NGN2-induced, neuron-like cell (right) PTEN variant profiles with triangles indicating variants in ClinVar classified as likely benign (LB, blue), likely pathogenic (LP, red), pathogenic (P, dark red), or variant of uncertain significance (VUS, yellow). All profiled variants are plotted in the background colored green (synonymous) or grey (otherwise) for comparison.
(B) Morphological impact scores for PTEN iPS cell (left) or NGN2-induced neuron (right) profiles are plotted by ClinVar label. Synonymous variants (green) are included for comparison. *** indicates Mann-Whitney U p<0.001.
(C) Receiver operating characteristic (ROC) curves are plotted for univariate zero-shot models predicting ClinVar pathogenicity or each clinical phenotype (ASD = autism spectrum disorder, DD = developmental delay, PHTS = PTEN hamartoma tumor syndrome; see Methods for curation criteria) from iPS cell or NGN2-induced neuron distinguishability score (this publication, solid lines), yeast fitness scores84 (dashed line), VAMP-seq scores5 (dashed line), AlphaMissense91 or EVE92 scores (dot-dashed lines). Area under the curve (AUC) scores are shown in the box for each model.
(D) Morphological impact scores for PTEN variants in NGN2-induced neurons are plotted by variant association with clinical phenotypes. gnomAD v4.1 (light blue) variants and synonymous variants (green) are plotted for comparison.
(E) Distinguishability scores for PTEN variants in iPS cells (top) or NGN2-induced neurons (bottom) are plotted by variant association with clinical phenotypes. gnomAD v4.1 (light blue) variants and synonymous variants (green) are plotted for comparison.
(F) Feature scores for PTEN variants in iPS cells are plotted by variant association with clinical phenotypes. gnomAD v4.1 (light blue) variants and synonymous variants (green) are plotted for comparison. Features plotted include PTEN intensity (left) or pAKT intensity (right). *** indicates Mann-Whitney U-test p<0.001.
(G) Feature scores for PTEN variants in NGN2-induced neurons are plotted by variant association with clinical phenotypes. gnomAD v4.1 (light blue) variants and synonymous variants (green) are plotted for comparison. Features plotted include PTEN intensity (left), pAKT intensity (middle), or mislocalization score (right). *** indicates Mann-Whitney U-test p<0.001.
(H) Plot of PTEN median effects on feature z-scores in the ASD/DD-only variants (x-axis) against PHTS-only variants (y-axis), in iPS cells (left) or NGN2-induced neurons (right). Points are features colored by imaging channel (see left for legend). Landmark features are highlighted. Red dashed lines show median effect thresholds at synonymous z-scores of 1 or −1.
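The univariate zero-shot evaluation in panel (C) uses each score directly as a ranking, with no model fitting. A minimal Python sketch of the corresponding AUC calculation follows. It uses the rank-based (Mann-Whitney) formulation of the area under the ROC curve, and the function name `roc_auc` is hypothetical. It assumes that higher scores indicate the positive class, so a score where lower values indicate damage would need to be negated first.

```python
from typing import Sequence


def roc_auc(scores: Sequence[float], labels: Sequence[bool]) -> float:
    """Area under the ROC curve for a single score used as a classifier.

    Computed as the fraction of (positive, negative) pairs in which the
    positive example has the higher score; ties count as half.
    """
    positives = [s for s, is_pos in zip(scores, labels) if is_pos]
    negatives = [s for s, is_pos in zip(scores, labels) if not is_pos]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))
```

An AUC of 0.5 corresponds to a score that ranks positives and negatives no better than chance, and 1.0 to a score that separates them perfectly.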
Figure S10. VIS-seq measurements elaborate computational variant effect predictions, related to Figure 6
(A) Predictor scores for AlphaMissense91, EVE92, and REVEL93 are plotted against LMNA VIS-seq variant impact and distinguishability scores as well as landmark feature medians. ClinVar P/LP variants are shown with red triangles. Predictor-specified pathogenic and benign thresholds are shown by horizontal dotted lines. Pearson’s r between logit-transformed predictor scores and VIS-seq scores are shown on the bottom left.
(B) Predictor scores for AlphaMissense91, EVE92, and REVEL93 are colored on the UMAP visualization of LMNA VIS-seq profiles.
(C) Predictor scores for AlphaMissense91, EVE92, and REVEL93 are plotted against PTEN VIS-seq iPS cell variant impact and distinguishability scores as well as landmark feature medians. ClinVar P/LP variants are shown with red triangles and B/LB variants are shown with blue triangles. Predictor-specified pathogenic and benign thresholds are shown by horizontal dotted lines. Pearson’s r between logit-transformed predictor scores and VIS-seq scores are shown on the bottom left.
(D) Predictor scores for AlphaMissense91, EVE92, and REVEL93 are colored on the UMAP visualization of PTEN iPS cell VIS-seq profiles.
(E) Predictor scores for AlphaMissense91, EVE92, and REVEL93 are plotted against PTEN VIS-seq NGN2-induced neuron-like cell variant scores and landmark features as in (C).
(F) Predictor scores for AlphaMissense91, EVE92, and REVEL93 are colored on the UMAP visualization of PTEN NGN2-induced neuron VIS-seq profiles as in (D).
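The correlations reported in panels (A), (C), and (E) are computed between logit-transformed predictor scores and VIS-seq scores. A minimal Python sketch of that calculation is shown below. It assumes predictor scores lie in [0, 1] and clips them by a small `eps` so that the logit stays finite at 0 or 1. The clipping value and the function names `logit` and `pearson_r` are assumptions made for illustration, not details taken from the Methods.

```python
import math
from typing import Sequence


def logit(p: float, eps: float = 1e-6) -> float:
    """Log-odds of p, with p clipped to [eps, 1 - eps] (assumed clipping)."""
    p = min(max(p, eps), 1.0 - eps)
    return math.log(p / (1.0 - p))


def pearson_r(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    dx = [xi - mean_x for xi in x]
    dy = [yi - mean_y for yi in y]
    var_x = sum(d * d for d in dx)
    var_y = sum(d * d for d in dy)
    if var_x == 0.0 or var_y == 0.0:
        raise ValueError("correlation is undefined for a constant sequence")
    covariance = sum(a * b for a, b in zip(dx, dy))
    return covariance / math.sqrt(var_x * var_y)


def logit_correlation(predictor_scores: Sequence[float],
                      vis_seq_scores: Sequence[float]) -> float:
    """Pearson's r between logit(predictor score) and a VIS-seq score."""
    return pearson_r([logit(p) for p in predictor_scores], vis_seq_scores)
```

The logit transform spreads out predictor scores that cluster near 0 or 1, which makes a linear correlation with an unbounded experimental score more meaningful than it would be on the raw probability scale.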
Table S1. LMNA feature-level volcano analysis, related to Figure 2
Table S4. PTEN variant-level landmark features, scores, and clinical annotations, related to Figures 5 and 6
Table S5. Curated PTEN missense variant–phenotype associations from the literature (PHTS and ASD/DD), related to Figure 6
Table S6. DNA sequences used in this publication, related to STAR Methods
(A) Primers and oligos used for cloning
(B) Libraries
(C) Gene fragments used for cloning
(D) Plasmids
(E) Guide RNAs
(F) Oligos for in situ sequencing
KEY RESOURCES TABLE.
| REAGENT OR RESOURCE | SOURCE | IDENTIFIER |
|---|---|---|
| Antibodies | ||
| Anti-NCAM1 Rabbit mAb, Alexa Fluor 647 conjugate | Cell Signaling Technology | Cat#50831S; RRID: AB_3717968 |
| Anti-MAP2 Mouse mAb, Alexa Fluor 594 conjugate | BioLegend | Cat#801802; RRID: AB_2721382 |
| Anti-phospho-AKT (Thr308) Rabbit mAb | Cell Signaling Technology | Cat#13038T; RRID: AB_2629447 |
| Goat Anti-Rabbit IgG H&L cross-adsorbed secondary antibody, Alexa Fluor 647 conjugate | Thermo Scientific | Cat#A-21244; RRID: AB_2535812 |
| Anti-pan-AKT Mouse mAb | Cell Signaling Technology | Cat#2920; RRID: AB_1147620 |
| Anti-PTEN Rabbit mAb | Cell Signaling Technology | Cat#9559; RRID: AB_390810 |
| Anti-lamin A/C Mouse mAb | BioLegend | Cat#600001; RRID: AB_2810655 |
| StarBright Blue 520 Goat Anti-Mouse IgG | Bio-Rad | Cat#12005866; RRID: AB_2934034 |
| StarBright Blue 700 Goat Anti-Rabbit IgG | Bio-Rad | Cat#12004162; RRID: AB_3712168 |
| hFAB Rhodamine Anti-Actin Primary Antibody | Bio-Rad | Cat#12004164; RRID: AB_2861334 |
| Bacterial and virus strains | ||
| NEB 10-beta electrocompetent cells | New England Biolabs | Cat#C3020K |
| Chemicals, peptides, and recombinant proteins | ||
| DMEM | Gibco | Cat#11965-092 |
| DMEM/F-12 | Gibco | Cat#11320-033 |
| Fetal Bovine Serum (FBS) | Hyclone | Cat#SH30071.03 |
| Penicillin/Streptomycin | Gibco | Cat#15140-122 |
| OPTI-MEM | Gibco | Cat#31985-070 |
| DPBS | Gibco | Cat#14190-144 |
| Versene | Gibco | Cat#15400-054 |
| Trypsin-EDTA 0.05%, Phenol Red | Gibco | Cat#25300-054 |
| TransIT-293 | Mirus | Cat#MIR 2700 |
| Dimethyl Sulfoxide (DMSO) | Sigma-Aldrich | Cat#D8418-50mL |
| CryoStor CS10 Cell Freezing Medium | STEMCELL Technologies | Cat#100-1061 |
| Matrigel Matrix | Corning | Cat#354234 |
| mTeSR Plus | STEMCELL Technologies | Cat#100-0276 |
| ROCK Inhibitor Y-27632 | SelleckChem | Cat#S1049 |
| Puromycin Dihydrochloride | Gibco | Cat#A11138-03 |
| 0.45 μm Syringe Filter | VWR | Cat#76479-012 |
| SpCas9 2NLS Nuclease (300 pmol) | Synthego | N/A |
| Poly-L-ornithine | Sigma-Aldrich | Cat#P3655 |
| Accutase | STEMCELL Technologies | Cat#07922 |
| Doxycycline hyclate | Sigma-Aldrich | Cat#D9891 |
| Recombinant Human NT-3 | Peprotech | Cat#450-03 |
| Recombinant Human BDNF | Peprotech | Cat#450-02 |
| GlutaMAX Supplement 100X | Gibco | Cat#35050-061 |
| MEM Non-Essential Amino Acid Solution 100X | Gibco | Cat#11140-050 |
| B-27 Supplement 50X | Gibco | Cat#17504-044 |
| N-2 Supplement 100X | Gibco | Cat#17502-048 |
| KnockOut DMEM/F-12 | Gibco | Cat#12660-012 |
| Neurobasal-A Medium | Gibco | Cat#12349-015 |
| Laminin Mouse Protein | Gibco | Cat#23017-015 |
| RPMI 1640 Medium | Invitrogen | Cat#11875135 |
| DMEM, no glucose | Gibco | Cat#11966-025 |
| Sodium-L-Lactate | Sigma-Aldrich | Cat#71718 |
| Vitronectin XF | STEMCELL Technologies | Cat#100-0763 |
| CHIR-99021 | TOCRIS | Cat#4423 |
| Wnt-C59 | Selleck Chemicals | Cat#S7037 |
| Bovine Serum Albumin (BSA) | Sigma-Aldrich | Cat#A9418 |
| Ascorbic Acid | Sigma-Aldrich | Cat#A8960 |
| B27 + Insulin | Invitrogen | Cat#17504044 |
| TrypLE Select 10X | Gibco | Cat#A12177-01 |
| Paraformaldehyde 20% Solution | Electron Microscopy Sciences | Cat#15713-S |
| Triton X-100 | Sigma-Aldrich | Cat#Y8787-50ML |
| Tween-20 | Sigma-Aldrich | Cat#P1279-500ML |
| Glutaraldehyde 25% Solution | Electron Microscopy Sciences | Cat#16210 |
| Deoxynucleotide (dNTP) Solution Mix 10 mM | New England Biolabs | Cat#N0447L |
| Ultrapure BSA | Thermo Scientific | Cat#AM2616 |
| Ribolock RNase Inhibitor | Thermo Scientific | Cat#EO0381 |
| RevertAid H Minus Reverse Transcriptase | Thermo Scientific | Cat#EP0451 |
| Ampligase Buffer | Biosearch Technologies | Cat#SS000015-D3 |
| Taq-IT Polymerase | Qiagen | Cat#P7620L |
| Ampligase | Biosearch Technologies | Cat#E0001-100D2 |
| RNase H | New England Biolabs | Cat#M0297L |
| Phi29 DNA Polymerase | Thermo Scientific | Cat#EP0094 |
| Glycerol | Fisher Scientific | Cat#G33-1 |
| Formamide | Fisher Scientific | Cat#BP227-500 |
| Nuclease-Free Water | Thomas Scientific | Cat#C001X09 |
| 20X SSC | Invitrogen | Cat#15557036 |
| 10X HBSS | Gibco | Cat#14065-056 |
| Bovine Serum Albumin (fraction V) | Sigma-Aldrich | Cat#A2153-100G |
| Sodium azide | Sigma-Aldrich | Cat#S-8032 |
| Ascorbic Acid | Sigma-Aldrich | Cat#A92902-100G |
| 4’,6-Diamidino-2-phenylindole (DAPI) | Fisher Scientific | Cat#D1306 |
| Sodium borohydride | Sigma-Aldrich | Cat#452882 |
| Phalloidin-CF568 conjugate | Biotium | Cat#00044-T |
| Wheat Germ Agglutinin (WGA)-CF555 conjugate | Biotium | Cat#29076-1 |
| Concanavalin A-Alexa Fluor 750 conjugate | Thermo Scientific | Cat#C56127 |
| Carbenicillin Disodium, USP Grade | GoldBio | Cat#C-103-250 |
| Kanamycin | Sigma-Aldrich | Cat#K1537-25G |
| Q5 High-Fidelity 2X Master Mix | New England Biolabs | Cat#M0492L |
| Klenow Fragment, 3’-5’ exo-minus | New England Biolabs | Cat#M0212L |
| NEBuilder HiFi DNA Assembly Master Mix | New England Biolabs | Cat#E2621L |
| T4 DNA Ligase | New England Biolabs | Cat#M0202M |
| BsaI-HFv2 | New England Biolabs | Cat#R3733L |
| SapI | New England Biolabs | Cat#R0569L |
| Esp3I | New England Biolabs | Cat#R0734L |
| BsmBI-v2 | New England Biolabs | Cat#R0739S |
| NheI-HF | New England Biolabs | Cat#R3131L |
| MluI-HF | New England Biolabs | Cat#R3198S |
| EcoRI-HF | New England Biolabs | Cat#R3101S |
| NotI-HF | New England Biolabs | Cat#R3189S |
| SacII | New England Biolabs | Cat#R0157S |
| RIPA Buffer | Abcam | Cat#ab156034 |
| Halt Protease Inhibitor Cocktail | ThermoFisher Scientific | Cat#1860932 |
| Laemmli Sample Buffer | Bio-Rad | Cat#1610737 |
| β-Mercaptoethanol | Bio-Rad | Cat#1610710 |
| 8–16% Criterion TGX Stain-Free Protein Gel | Bio-Rad | Cat#5678103 |
| Tris/Glycine/SDS Running Buffer | Bio-Rad | Cat#1610732 |
| Trans-Blot Turbo Midi 0.2 μm PVDF Transfer Packs | Bio-Rad | Cat#1704157 |
| Critical commercial assays | ||
| Glass 6-well Plate | Cellvis | Cat#P06-1.5H-N |
| Corning Untreated 245 mm Square BioAssay Dishes | Fisher Scientific | Cat#07-200-600 |
| Lonza 4D Nucleofector SE Kit S | Lonza | Cat#V4XC-1032 |
| Lonza 4D Nucleofector P3 Kit S | Lonza | Cat#V4XP-3032 |
| Lonza 4D Nucleofector P3 Kit X | Lonza | Cat#V4XP-3024 |
| Illumina MiSeq v2 Nano 500 Kit | Illumina | Cat#MS-103-1003 |
| Pierce BCA Protein Assay Kit | ThermoFisher Scientific | Cat#23225 |
| SMRTbell Prep Kit 3.0 | Pacific Biosciences | Cat#102-182-700 |
| Deposited Data | ||
| Impact scores, distinguishability scores, landmark features | MaveDB; Rubin et al. | URN#s: 00001243-a, 00001244-a, 00001244-b |
| Variant-level features, EMDs, p-values, scores | Zenodo | Cat#15787685; https://zenodo.org/records/15787685 |
| Phenotype image data, single-cell features and genotypes, CellProfiler pipelines | BioImage Archive | Accession#S-BIAD3095; https://www.ebi.ac.uk/biostudies/bioimages/studies/S-BIAD3095 |
| Experimental models: Cell lines | ||
| Human embryonic kidney 293T (HEK 293T) | ATCC | Cat#CRL-3216; RRID: CVCL_0063 |
| U2OS cells | ATCC | Cat#HTB-96; RRID: CVCL_0042 |
| U2OS C11 cells | Hasle et al. | N/A |
| WTC11 iPS cells | Coriell | Cat#GM25256; RRID: CVCL_Y803 |
| NGN2-inducible WTC11 iPS cells (NWTC11.G3-WT) | Gladstone Institute; gift from Li Gan lab | N/A |
| Oligonucleotides | ||
| See Table S6 | ||
| Recombinant DNA | ||
| pHSG299 | Takara Bio | Cat#3299 |
| pHR-UCOE-EF1a-Zim3-dCas9-BFP | Addgene; Replogle et al. | Cat#188775; RRID: Addgene_188775 |
| pJBL_051 | Gift from Jay Shendure lab; Lalanne et al. | N/A |
| lentiGuide-Puro | Addgene; Sanjana et al. | Cat#52963; RRID: Addgene_52963 |
| pCAG-NLS-HA-Bxb1 | Addgene; Hermann et al. | Cat#51271; RRID: Addgene_51271 |
| Hyperactive piggyBac transposase expression vector (hyPBase) | Gift from Jay Shendure lab; Yusa et al. | N/A |
| pMD2.G | Addgene | Cat#12259; RRID: Addgene_12259 |
| psPAX2 | Addgene | Cat#12260; RRID: Addgene_12260 |
| PBv2b | This publication | VIS-seq expression vector. N/A |
| PBv2b-NSapI | This publication | VIS-seq expression vector. Cat#252243; RRID: Addgene_252243 |
| PBv2b-CSapI | This publication | VIS-seq expression vector. Cat#252244; RRID: Addgene_252244 |
| PBv2c-NSapI | This publication | VIS-seq expression vector. Cat#252245; RRID: Addgene_252245 |
| Software and algorithms | ||
| Cellpose v2.2.1 | Stringer et al. | https://www.cellpose.org/; RRID: SCR_021716 |
| StarDist v0.8.5 | Schmidt et al. | https://github.com/stardist/stardist |
| STARCall | Bradley et al. | https://github.com/FowlerLab/starcall-workflow |
| CellProfiler v4.2.6 | Carpenter et al. | https://cellprofiler.org/; RRID: SCR_007358 |
| Pycytominer v1.2.1 | Serrano et al. | https://github.com/cytomining/pycytominer |
| SciPy v1.14.1 | Virtanen et al. | https://scipy.org/; RRID: SCR_008058 |
| NumPy v1.24.4 | Harris et al. | https://numpy.org/; RRID: SCR_008633 |
| Scikit-learn v1.5.2 | Pedregosa et al. | https://scikit-learn.org/; RRID: SCR_002577 |
| Scikit-image v0.24.0 | van der Walt et al. | https://scikit-image.org/; RRID: SCR_021142 |
| Matplotlib v3.9.2 | Hunter et al. | https://matplotlib.org/; RRID: SCR_008624 |
| NetworkX v3.4.2 | Hagberg et al. | https://networkx.org/; RRID: SCR_016864 |
| Snakemake v7.32.4 | Mölder et al. | https://snakemake.readthedocs.io/; RRID: SCR_003475 |
| Pandas v2.2.3 | McKinney et al. | https://pandas.pydata.org/; RRID: SCR_018214 |
| Python v3.11 | Python Software Foundation | https://www.python.org/; RRID: SCR_008394 |
| XGBoost v3.0.1 | Chen and Guestrin | https://xgboost.ai/; RRID: SCR_021361 |
| PyMOL v3.1 | Schrödinger LLC | https://www.pymol.org; RRID: SCR_000305 |
| Pacybara | Weile et al. | https://github.com/rothlab/pacybara |
Highlights.
VIS-seq enables optical measurement of variant effects in diverse cell types at scale
LMNA variants can drive aggregation or low abundance, modulating nuclear circularity
PTEN localization distinguishes autism from tumor syndrome variants
VIS-seq reveals how complex variant effects can cascade from molecules to cells
ACKNOWLEDGMENTS
We would like to acknowledge Brian Beliveau, Andrew Stergachis, Hao Yuan Kueh, and Sanjay Srivatsan for their scientific guidance. For assistance with microscopy and executing VIS-seq experiments, we acknowledge Wai Pang-Chan at the University of Washington Biology Imaging Facility as well as Tony Cooke. For assistance with flow cytometry, we acknowledge scientists and staff at the DLMP Flow Cytometry Core at the University of Washington. Funding was provided by the National Institutes of Health (RM1HG010461 to DMF, LMS, and FPR; R35GM152106 to DMF; R01HG013025 to LMS, DMF, AEM, and AFR; K99HL177347 to CEF; R01HL171174 to KCY; R01HL164675 to FPR), the Chan Zuckerberg Initiative (CZIF2024-010284 to DMF and LMS; CP-2-1-Fowler to DMF), the Brotman Baty Institute for Precision Medicine (CC28 to AEM), the Department of Veterans Affairs Biomedical Laboratory Research and Development Service (I01BX006428 and IK2BX004642 to KCY), and the Novo Nordisk Foundation (DMF). AEM was supported by an Early Career Award from the Alex’s Lemonade Stand for Childhood Cancer and RUNX1 foundation (21-25037). RKG was supported by the Washington Research Foundation Postdoctoral Fellowship.
Footnotes
DECLARATION OF INTERESTS
FPR is an advisor and shareholder in Constantiam Biosciences.
REFERENCES
- 1. Fowler DM, and Fields S (2014). Deep mutational scanning: a new style of protein science. Nat. Methods 11, 801–807. 10.1038/nmeth.3027.
- 2. Maes S, Deploey N, Peelman F, and Eyckerman S (2023). Deep mutational scanning of proteins in mammalian cells. Cell Rep. Methods 3, 100641. 10.1016/j.crmeth.2023.100641.
- 3. Findlay GM, Daza RM, Martin B, Zhang MD, Leith AP, Gasperini M, Janizek JD, Huang X, Starita LM, and Shendure J (2018). Accurate classification of BRCA1 variants with saturation genome editing. Nature 562, 217–222. 10.1038/s41586-018-0461-z.
- 4. Olvera-León R, Zhang F, Offord V, Zhao Y, Tan HK, Gupta P, Pal T, Robles-Espinoza CD, Arriaga-González FG, Matsuyama LSAS, et al. (2024). High-resolution functional mapping of RAD51C by saturation genome editing. Cell 187, 5719–5734.e19. 10.1016/j.cell.2024.08.039.
- 5. Matreyek KA, Starita LM, Stephany JJ, Martin B, Chiasson MA, Gray VE, Kircher M, Khechaduri A, Dines JN, Hause RJ, et al. (2018). Multiplex assessment of protein variant abundance by massively parallel sequencing. Nat. Genet. 50, 874–882. 10.1038/s41588-018-0122-z.
- 6. Beltran A, Jiang X, Shen Y, and Lehner B (2024). Site saturation mutagenesis of 500 human protein domains reveals the contribution of protein destabilization to genetic disease. Preprint at bioRxiv. 10.1101/2024.04.26.591310.
- 7. Feldman D, Singh A, Schmid-Burgk JL, Carlson RJ, Mezger A, Garrity AJ, Zhang F, and Blainey PC (2019). Optical Pooled Screens in Human Cells. Cell 179, 787–799.e17. 10.1016/j.cell.2019.09.016.
- 8. Funk L, Su K-C, Ly J, Feldman D, Singh A, Moodie B, Blainey PC, and Cheeseman IM (2022). The phenotypic landscape of essential human genes. Cell 185, 4634–4653.e22. 10.1016/j.cell.2022.10.017.
- 9. Labitigan RLD, Sanborn AL, Hao CV, Chan CK, Belliveau NM, Brown EM, Mehrotra M, and Theriot JA (2024). Mapping variation in the morphological landscape of human cells with optical pooled CRISPRi screening. 10.7554/elife.94964.1.
- 10. Sivanandan S, Leitmann B, Lubeck E, Sultan MM, Stanitsas P, Ranu N, Ewer A, Mancuso JE, Phillips ZF, Kim A, et al. (2023). A Pooled Cell Painting CRISPR Screening Platform Enables de novo Inference of Gene Function by Self-supervised Deep Learning. Preprint at bioRxiv. 10.1101/2023.08.13.553051.
- 11. Kudo T, Meireles AM, Moncada R, Chen Y, Wu P, Gould J, Hu X, Kornfeld O, Jesudason R, Foo C, et al. (2024). Multiplexed, image-based pooled screens in primary cells and tissues with PerturbView. Nat. Biotechnol. 10.1038/s41587-024-02391-0.
- 12. Kanfer G, Sarraf SA, Maman Y, Baldwin H, Dominguez-Martin E, Johnson KR, Ward ME, Kampmann M, Lippincott-Schwartz J, and Youle RJ (2021). Image-based pooled whole-genome CRISPRi screening for subcellular phenotypes. J. Cell Biol. 220, e202006180. 10.1083/jcb.202006180.
- 13. Kremitzki C, Waligorski J, Bachman G, Ali LM, Bramley J, Vakaki M, Chandrasekaran V, Patel P, Mathur D, Hime P, et al. (2025). Pathogenic morphological signatures of perturbations in mitochondrial-related genes revealed by pooled imaging assay. npj Imaging 3, 35. 10.1038/s44303-025-00097-9.
- 14. Bray M-A, Singh S, Han H, Davis CT, Borgeson B, Hartland C, Kost-Alimova M, Gustafsdottir SM, Gibson CC, and Carpenter AE (2016). Cell Painting, a high-content image-based assay for morphological profiling using multiplexed fluorescent dyes. Nat. Protoc. 11, 1757–1774. 10.1038/nprot.2016.105.
- 15. Chandrasekaran SN, Cimini BA, Goodale A, Miller L, Kost-Alimova M, Jamali N, Doench JG, Fritchman B, Skepner A, Melanson M, et al. (2024). Three million images and morphological profiles of cells treated with matched chemical and genetic perturbations. Nat. Methods 21, 1114–1121. 10.1038/s41592-024-02241-6.
- 16. Ramezani M, Weisbart E, Bauman J, Singh A, Yong J, Lozada M, Way GP, Kavari SL, Diaz C, Leardini E, et al. (2025). A genome-wide atlas of human cell morphology. Nat. Methods, 1–13. 10.1038/s41592-024-02537-7.
- 17. Rohban MH, Singh S, Wu X, Berthet JB, Bray M-A, Shrestha Y, Varelas X, Boehm JS, and Carpenter AE (2017). Systematic morphological profiling of human gene and allele function via Cell Painting. eLife 6, e24060. 10.7554/eLife.24060.
- 18. Way GP, Natoli T, Adeboye A, Litichevskiy L, Yang A, Lu X, Caicedo JC, Cimini BA, Karhohs K, Logan DJ, et al. (2022). Morphology and gene expression profiling provide complementary information for mapping cell state. Cell Syst. 13, 911–923.e9. 10.1016/j.cels.2022.10.001.
- 19. Lacoste J, Haghighi M, Haider S, Reno C, Lin Z-Y, Segal D, Qian WW, Xiong X, Teelucksingh T, Miglietta E, et al. (2024). Pervasive mislocalization of pathogenic coding variants underlying human disorders. Cell 187, 6725–6741.e13. 10.1016/j.cell.2024.09.003.
- 20. Takahashi K, and Yamanaka S (2006). Induction of Pluripotent Stem Cells from Mouse Embryonic and Adult Fibroblast Cultures by Defined Factors. Cell 126, 663–676. 10.1016/j.cell.2006.07.024.
- 21. Friedman CE, Fayer S, Pendyala S, Chien W-M, Loiben A, Tran L, Chao LS, McKinstry A, Ahmed D, Farris SD, et al. (2024). Multiplexed functional assessments of MYH7 variants in human cardiomyocytes. Circ. Genom. Precis. Med. 17, e004377. 10.1161/CIRCGEN.123.004377.
- 22. Friedman CE, Fayer S, Pendyala S, Chien W-M, Loiben A, Tran L, Chao LS, McKinstry A, Ahmed D, Karbassi E, et al. (2023). CRaTER enrichment for on-target gene editing enables generation of variant libraries in hiPSCs. J. Mol. Cell. Cardiol. 179, 60–71. 10.1016/j.yjmcc.2023.03.017.
- 23. Lv W, Qiao L, Petrenko N, Li W, Owens AT, McDermott-Roe C, and Musunuru K (2018). Functional annotation of TNNT2 variants of uncertain significance with genome-edited cardiomyocytes. Circulation 138, 2852–2854. 10.1161/CIRCULATIONAHA.118.035028.
- 24. Ponten J, and Saksela E (1967). Two established in vitro cell lines from human mesenchymal tumours. Int. J. Cancer 2, 434–447. 10.1002/ijc.2910020505.
- 25. Lalanne J-B, Regalado SG, Domcke S, Calderon D, Martin BK, Li X, Li T, Suiter CC, Lee C, Trapnell C, et al. (2024). Multiplex profiling of developmental cis-regulatory elements with quantitative single-cell expression reporters. Nat. Methods 21, 983–993. 10.1038/s41592-024-02260-3.
- 26. Litke JL, and Jaffrey SR (2019). Highly efficient expression of circular RNA aptamers in cells using autocatalytic transcripts. Nat. Biotechnol. 37, 667–675. 10.1038/s41587-019-0090-6.
- 27. Chung JH, Whiteley M, and Felsenfeld G (1993). A 5′ element of the chicken β-globin domain serves as an insulator in human erythroid cells and protects against position effect in Drosophila. Cell 74, 505–514. 10.1016/0092-8674(93)80052-G.
- 28. Antoniou M, Harland L, Mustoe T, Williams S, Holdstock J, Yague E, Mulcahy T, Griffiths M, Edwards S, Ioannou PA, et al. (2003). Transgenes encompassing dual-promoter CpG islands from the human TBP and HNRPA2B1 loci are resistant to heterochromatin-mediated silencing. Genomics 82, 269–279. 10.1016/S0888-7543(03)00107-1.
- 29. Seczynska M, Bloor S, Cuesta SM, and Lehner PJ (2022). Genome surveillance by HUSH-mediated silencing of intronless mobile elements. Nature 601, 440–445. 10.1038/s41586-021-04228-1.
- 30. Lawn RM, Fritsch EF, Parker RC, Blake G, and Maniatis T (1978). The isolation and characterization of linked δ- and β-globin genes from a cloned library of human DNA. Cell 15, 1157–1174. 10.1016/0092-8674(78)90043-0.
- 31. Zufferey R, Donello JE, Trono D, and Hope TJ (1999). Woodchuck Hepatitis Virus Posttranscriptional Regulatory Element Enhances Expression of Transgenes Delivered by Retroviral Vectors. J. Virol. 73, 2886–2892.
- 32. Cabrera A, Edelstein HI, Glykofrydis F, Love KS, Palacios S, Tycko J, Zhang M, Lensch S, Shields CE, Livingston M, et al. (2022). The sound of silence: Transgene silencing in mammalian cell engineering. Cell Syst. 13, 950–973. 10.1016/j.cels.2022.11.005.
- 33. Zacharias DA, Violin JD, Newton AC, and Tsien RY (2002). Partitioning of lipid-modified monomeric GFPs into membrane microdomains of live cells. Science 296, 913–916. 10.1126/science.1068539.
- 34. Ding S, Wu X, Li G, Han M, Zhuang Y, and Xu T (2005). Efficient Transposition of the piggyBac (PB) Transposon in Mammalian Cells and Mice. Cell 122, 473–483. 10.1016/j.cell.2005.07.013.
- 35. Feldman D, Funk L, Le A, Carlson RJ, Leiken MD, Tsai F, Soong B, Singh A, and Blainey PC (2022). Pooled genetic perturbation screens with image-based phenotypes. Nat. Protoc. 17, 476–512. 10.1038/s41596-021-00653-8.
- 36. Vaughan JC, Jia S, and Zhuang X (2012). Ultrabright photoactivatable fluorophores created by reductive caging. Nat. Methods 9, 1181–1184. 10.1038/nmeth.2214.
- 37. Radtke AJ, Chu CJ, Yaniv Z, Yao L, Marr J, Beuschel RT, Ichise H, Gola A, Kabat J, Lowekamp B, et al. (2022). IBEX: an iterative immunolabeling and chemical bleaching method for high-content imaging of diverse tissues. Nat. Protoc. 17, 378–401. 10.1038/s41596-021-00644-9.
- 38. Bradley NJ, Pendyala S, Partington K, and Fowler DM (2025). STARCall integrates image stitching, alignment, and read calling to enable scalable analysis of in situ sequencing data. Preprint at bioRxiv, 2025.10.31.685785. 10.1101/2025.10.31.685785.
- 39. Pachitariu M, and Stringer C (2022). Cellpose 2.0: how to train your own model. Nat. Methods 19, 1634–1641. 10.1038/s41592-022-01663-4.
- 40.Weigert M, Schmidt U, Haase R, Sugawara K, and Myers G (2020). Star-convex Polyhedra for 3D Object Detection and Segmentation in Microscopy. In 2020 IEEE Winter Conference on Applications of Computer Vision (WACV), pp. 3655–3662. 10.1109/WACV45572.2020.9093435. [DOI] [Google Scholar]
- 41.Stirling DR, Swain-Bowden MJ, Lucas AM, Carpenter AE, Cimini BA, and Goodman A (2021). CellProfiler 4: improvements in speed, utility and usability. BMC Bioinformatics 22, 433. 10.1186/s12859-021-04344-9. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 42.Anderson CL, Langer ER, Routes TC, McWilliams SF, Bereslavskyy I, Kamp TJ, and Eckhardt LL (2021). Most myopathic lamin variants aggregate: a functional genomics approach for assessing variants of uncertain significance. NPJ Genom. Med. 6, 1–11. 10.1038/s41525-021-00265-x. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 43.Lammerding J, Fong LG, Ji JY, Reue K, Stewart CL, Young SG, and Lee RT (2006). Lamins A and C but not lamin B1 regulate nuclear mechanics. J. Biol. Chem. 281, 25768–25780. 10.1074/jbc.M513511200. [DOI] [PubMed] [Google Scholar]
- 44.Lammerding J, Schulze PC, Takahashi T, Kozlov S, Sullivan T, Kamm RD, Stewart CL, and Lee RT (2004). Lamin A/C deficiency causes defective nuclear mechanics and mechanotransduction. J. Clin. Invest. 113, 370–378. 10.1172/jci200419670. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 45.Lee JSH, Hale CM, Panorchan P, Khatau SB, George JP, Tseng Y, Stewart CL, Hodzic D, and Wirtz D (2007). Nuclear lamin A/C deficiency induces defects in cell mechanics, polarization, and migration. Biophys. J. 93, 2542–2552. 10.1529/biophysj.106.102426. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 46.Broers JLV, Peeters EAG, Kuijpers HJH, Endert J, Bouten CVC, Oomens CWJ, Baaijens FPT, and Ramaekers FCS (2004). Decreased mechanical stiffness in LMNA−/− cells is caused by defective nucleo-cytoskeletal integrity: implications for the development of laminopathies. Hum. Mol. Genet. 13, 2567–2580. 10.1093/hmg/ddh295. [DOI] [PubMed] [Google Scholar]
- 47.Hutchison CJ (2002). Lamins: building blocks or regulators of gene expression? Nat. Rev. Mol. Cell Biol. 3, 848–858. 10.1038/nrm950. [DOI] [PubMed] [Google Scholar]
- 48.van Steensel B, and Belmont AS (2017). Lamina-associated domains: Links with chromosome architecture, heterochromatin, and gene repression. Cell 169, 780–791. 10.1016/j.cell.2017.04.022. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 49.Barletta M.R. di, Ricci E, Galluzzi G, Tonali P, Mora M, Morandi L, Romorini A, Voit T, Orstavik KH, Merlini L, et al. (2000). Different Mutations in the LMNA Gene Cause Autosomal Dominant and Autosomal Recessive Emery-Dreifuss Muscular Dystrophy. Am. J. Hum. Genet. 66, 1407–1412. 10.1086/302869. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 50.Fatkin D, MacRae C, Sasaki T, Wolff MR, Porcu M, Frenneaux M, Atherton J, Vidaillet HJ, Spudich S, Girolami UD, et al. (1999). Missense Mutations in the Rod Domain of the Lamin A/C Gene as Causes of Dilated Cardiomyopathy and Conduction-System Disease. N. Engl. J. Med. 341, 1715–1724. 10.1056/NEJM199912023412302. [DOI] [PubMed] [Google Scholar]
- 51.Eriksson M, Brown WT, Gordon LB, Glynn MW, Singer J, Scott L, Erdos MR, Robbins CM, Moses TY, Berglund P, et al. (2003). Recurrent de novo point mutations in lamin A cause Hutchinson–Gilford progeria syndrome. Nature 423, 293–298. 10.1038/nature01629. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 52.Csoka AB, Cao H, Sammak PJ, Constantinescu D, Schatten GP, and Hegele RA (2004). Novel lamin A/C gene (LMNA) mutations in atypical progeroid syndromes. J. Med. Genet. 41, 304–308. 10.1136/jmg.2003.015651. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 53.Sandre-Giovannoli AD, Chaouch M, Kozlov S, Vallat J-M, Tazir M, Kassouri N, Szepetowski P, Hammadouche T, Vandenberghe A, Stewart CL, et al. (2002). Homozygous Defects in LMNA, Encoding Lamin A/C Nuclear-Envelope Proteins, Cause Autosomal Recessive Axonal Neuropathy in Human (Charcot-Marie-Tooth Disorder Type 2) and Mouse. Am. J. Hum. Genet. 70, 726–736. 10.1086/339274. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 54.Muchir A, Medioni J, Laluc M, Massart C, Arimura T, Kooi AJVD, Desguerre I, Mayer M, Ferrer X, Briault S, et al. (2004). Nuclear envelope alterations in fibroblasts from patients with muscular dystrophy, cardiomyopathy, and partial lipodystrophy carrying lamin A/C gene mutations. Muscle Nerve 30, 444–450. 10.1002/mus.20122. [DOI] [PubMed] [Google Scholar]
- 55.Cowan J, Li D, Gonzalez-Quintana J, Morales A, and Hershberger RE (2010). Morphological Analysis of 13 LMNA Variants Identified in a Cohort of 324 Unrelated Patients With Idiopathic or Familial Dilated Cardiomyopathy. Circ. Cardiovasc. Genet. 3, 6–14. 10.1161/CIRCGENETICS.109.905422. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 56.Serrano E, Chandrasekaran SN, Bunten D, Brewer KI, Tomkinson J, Kern R, Bornholdt M, Fleming SJ, Pei R, Arevalo J, et al. (2025). Reproducible image-based profiling with Pycytominer. Nat. Methods 22, 677–680. 10.1038/s41592-025-02611-8. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 57.Stossi F, Singh PK, Marini M, Safari K, Szafran AT, Rivera Tostado A, Candler CD, Mancini MG, Mosa EA, Bolt MJ, et al. (2024). SPACe: an open-source, single-cell analysis of Cell Painting data. Nat. Commun. 15, 10170. 10.1038/s41467-024-54264-4. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 58.Pearson YE, Kremb S, Butterfoss GL, Xie X, Fahs H, and Gunsalus KC (2022). A statistical framework for high-content phenotypic profiling using cellular feature distributions. Commun. Biol. 5, 1–15. 10.1038/s42003-022-04343-3. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 59.Landrum MJ, Chitipiralla S, Brown GR, Chen C, Gu B, Hart J, Hoffman D, Jang W, Kaur K, Liu C, et al. (2020). ClinVar: improvements to accessing data. Nucleic Acids Res. 48, D835–D844. 10.1093/nar/gkz972. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 60.Pho M, Berrada Y, Gunda A, and Stephens AD (2024). Nuclear shape is affected differentially by loss of lamin A, lamin C, or both lamin A and C. MicroPubl. Biol. 2024. 10.17912/micropub.biology.001103. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 61.Ambrosi P, Kreitmann B, Lepidi H, Habib G, Levy N, Philip N, and De Sandre-Giovannoli A (2016). A novel overlapping phenotype characterized by lipodystrophy, mandibular dysplasia, and dilated cardiomyopathy associated with a new mutation in the LMNA gene. Int. J. Cardiol. 209, 317–318. 10.1016/j.ijcard.2016.02.113. [DOI] [PubMed] [Google Scholar]
- 62.Ghosh DK, Pande S, Kumar J, Yesodharan D, Nampoothiri S, Radhakrishnan P, Reddy CG, Ranjan A, and Girisha KM (2022). The E262K mutation in Lamin A links nuclear proteostasis imbalance to laminopathy-associated premature aging. Aging Cell 21, e13688. 10.1111/acel.13688. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 63.Ahn J, Jo I, Kang S-M, Hong S, Kim S, Jeong S, Kim Y-H, Park B-J, and Ha N-C (2019). Structural basis for lamin assembly at the molecular level. Nat. Commun. 10, 3757. 10.1038/s41467-019-11684-x. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 64.Ahn J, Jo I, Jeong S, Lee J, and Ha N-C (2023). Lamin Filament Assembly Derived from the Atomic Structure of the Antiparallel Four-Helix Bundle. Mol. Cells 46, 309–318. 10.14348/molcells.2023.2144. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 65.Strelkov SV, Schumacher J, Burkhard P, Aebi U, and Herrmann H (2004). Crystal structure of the human lamin A coil 2B dimer: implications for the head-to-tail association of nuclear lamins. J. Mol. Biol. 343, 1067–1080. 10.1016/j.jmb.2004.08.093. [DOI] [PubMed] [Google Scholar]
- 66.Makarov AA, Zou J, Houston DR, Spanos C, Solovyova AS, Cardenal-Peralta C, Rappsilber J, and Schirmer EC (2019). Lamin A molecular compression and sliding as mechanisms behind nucleoskeleton elasticity. Nat. Commun. 10, 3056. 10.1038/s41467-019-11063-6. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 67.Eldirany SA, Lomakin IB, Ho M, and Bunick CG (2021). Recent insight into intermediate filament structure. Curr. Opin. Cell Biol. 68, 132–143. 10.1016/j.ceb.2020.10.001. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 68.Herrmann H, and Aebi U (2016). Intermediate filaments: Structure and assembly. Cold Spring Harb. Perspect. Biol. 8, a018242. 10.1101/cshperspect.a018242. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 69.Lin EW, Brady GF, Kwan R, Nesvizhskii AI, and Omary MB (2020). Genotype-phenotype analysis of LMNA-related diseases predicts phenotype-selective alterations in lamin phosphorylation. FASEB J. 34, 9051–9073. 10.1096/fj.202000500R. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 70.Li J, Yen C, Liaw D, Podsypanina K, Bose S, Wang SI, Puc J, Miliaresis C, Rodgers L, McCombie R, et al. (1997). PTEN, a Putative Protein Tyrosine Phosphatase Gene Mutated in Human Brain, Breast, and Prostate Cancer. Science 275, 1943–1947. 10.1126/science.275.5308.1943. [DOI] [PubMed] [Google Scholar]
- 71.Lee J-O, Yang H, Georgescu M-M, Cristofano AD, Maehama T, Shi Y, Dixon JE, Pandolfi P, and Pavletich NP (1999). Crystal Structure of the PTEN Tumor Suppressor: Implications for Its Phosphoinositide Phosphatase Activity and Membrane Association. Cell 99, 323–334. 10.1016/S0092-8674(00)81663-3. [DOI] [PubMed] [Google Scholar]
- 72.Davidson L, Maccario H, Perera NM, Yang X, Spinelli L, Tibarewal P, Glancy B, Gray A, Weijer CJ, Downes CP, et al. (2010). Suppression of cellular proliferation and invasion by the concerted lipid and protein phosphatase activities of PTEN. Oncogene 29, 687–697. 10.1038/onc.2009.384. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 73.Tamura M, Gu J, Danen EHJ, Takino T, Miyamoto S, and Yamada KM (1999). PTEN Interactions with Focal Adhesion Kinase and Suppression of the Extracellular Matrix-dependent Phosphatidylinositol 3-Kinase/Akt Cell Survival Pathway *. J. Biol. Chem. 274, 20693–20703. 10.1074/jbc.274.29.20693. [DOI] [PubMed] [Google Scholar]
- 74.Fan X, Kraynak J, Knisely JPS, Formenti SC, and Shen WH (2020). PTEN as a guardian of the genome: Pathways and targets. Cold Spring Harb. Perspect. Med. 10, a036194. 10.1101/cshperspect.a036194. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 75.Sun H, Lesche R, Li DM, Liliental J, Zhang H, Gao J, Gavrilova N, Mueller B, Liu X, and Wu H (1999). PTEN modulates cell cycle progression and cell survival by regulating phosphatidylinositol 3,4,5,-trisphosphate and Akt/protein kinase B signaling pathway. Proc. Natl. Acad. Sci. U. S. A. 96, 6199–6204. 10.1073/pnas.96.11.6199. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 76.Pilarski R, Burt R, Kohlman W, Pho L, Shannon KM, and Swisher E (2013). Cowden syndrome and the PTEN hamartoma tumor syndrome: systematic review and revised diagnostic criteria. J. Natl. Cancer Inst. 105, 1607–1616. 10.1093/jnci/djt277. [DOI] [PubMed] [Google Scholar]
- 77.Butler MG, Dasouki MJ, Zhou X-P, Talebizadeh Z, Brown M, Takahashi TN, Miles JH, Wang CH, Stratton R, Pilarski R, et al. (2005). Subset of individuals with autism spectrum disorders and extreme macrocephaly associated with germline PTEN tumour suppressor gene mutations. J. Med. Genet. 42, 318–321. 10.1136/jmg.2004.024646. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 78.Bailey MH, Tokheim C, Porta-Pardo E, Sengupta S, Bertrand D, Weerasinghe A, Colaprico A, Wendl MC, Kim J, Reardon B, et al. (2018). Comprehensive Characterization of Cancer Driver Genes and Mutations. Cell 173, 371–385.e18. 10.1016/j.cell.2018.02.060. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 79.Post KL, Belmadani M, Ganguly P, Meili F, Dingwall R, McDiarmid TA, Meyers WM, Herrington C, Young BP, Callaghan DB, et al. (2020). Multi-model functionalization of disease-associated PTEN missense mutations identifies multiple molecular mechanisms underlying protein dysfunction. Nat. Commun. 11, 2073. 10.1038/s41467-020-15943-0. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 80.Hasle N, Matreyek KA, and Fowler DM (2019). The impact of genetic variants on PTEN molecular functions and cellular phenotypes. Cold Spring Harb. Perspect. Med. 9, a036228. 10.1101/cshperspect.a036228. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 81.Ran FA, Hsu PD, Wright J, Agarwala V, Scott DA, and Zhang F (2013). Genome engineering using the CRISPR-Cas9 system. Nat. Protoc. 8, 2281–2308. 10.1038/nprot.2013.143. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 82.Papa A, Wan L, Bonora M, Salmena L, Song MS, Hobbs RM, Lunardi A, Webster K, Ng C, Newton RH, et al. (2014). Cancer-associated PTEN mutants act in a dominant-negative manner to suppress PTEN protein function. Cell 157, 595–610. 10.1016/j.cell.2014.03.027. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 83.Zhang Y, Pak C, Han Y, Ahlenius H, Zhang Z, Chanda S, Marro S, Patzke C, Acuna C, Covy J, et al. (2013). Rapid Single-Step Induction of Functional Neurons from Human Pluripotent Stem Cells. Neuron 78, 785–798. 10.1016/j.neuron.2013.05.029. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 84.Mighell TL, Evans-Dutson S, and O’Roak BJ (2018). A Saturation Mutagenesis Approach to Understanding PTEN Lipid Phosphatase Activity and Genotype-Phenotype Relationships. Am. J. Hum. Genet. 102, 943–955. 10.1016/j.ajhg.2018.03.018. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 85.Das S, Dixon JE, and Cho W (2003). Membrane-binding and activation mechanism of PTEN. Proc. Natl. Acad. Sci. U. S. A. 100, 7491–7496. 10.1073/pnas.0932835100. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 86.Leitner MG, Hobiger K, Mavrantoni A, Feuer A, Oberwinkler J, Oliver D, and Halaszovich CR (2018). A126 in the active site and TI167/168 in the TI loop are essential determinants of the substrate specificity of PTEN. Cell. Mol. Life Sci. 75, 4235–4250. 10.1007/s00018-018-2867-z. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 87.Sondka Z, Dhir NB, Carvalho-Silva D, Jupe S, Madhumita, McLaren, K., Starkey, M., Ward, S., Wilding, J., Ahmed, M., et al. (2024). COSMIC: a curated database of somatic variants and clinical data for cancer. Nucleic Acids Res. 52, D1210–D1217. 10.1093/nar/gkad986. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 88.Chen S, Francioli LC, Goodrich JK, Collins RL, Kanai M, Wang Q, Alföldi J, Watts NA, Vittal C, Gauthier LD, et al. (2024). A genomic mutational constraint map using variation in 76,156 human genomes. Nature 625, 92–100. 10.1038/s41586-023-06045-0. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 89.Leslie NR, and Longy M (2016). Inherited PTEN mutations and the prediction of phenotype. Semin. Cell Dev. Biol. 52, 30–38. 10.1016/j.semcdb.2016.01.030. [DOI] [PubMed] [Google Scholar]
- 90.Mighell TL, Thacker S, Fombonne E, Eng C, and O’Roak BJ (2020). An Integrated Deep-Mutational-Scanning Approach Provides Clinical Insights on PTEN Genotype-Phenotype Relationships. Am. J. Hum. Genet. 106, 818–829. 10.1016/j.ajhg.2020.04.014. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 91.Cheng J, Novati G, Pan J, Bycroft C, Žemgulytė A, Applebaum T, Pritzel A, Wong LH, Zielinski M, Sargeant T, et al. (2023). Accurate proteome-wide missense variant effect prediction with AlphaMissense. Science 381, eadg7492. 10.1126/science.adg7492. [DOI] [PubMed] [Google Scholar]
- 92.Frazer J, Notin P, Dias M, Gomez A, Min JK, Brock K, Gal Y, and Marks DS (2021). Disease variant prediction with deep generative models of evolutionary data. Nature 599, 91–95. 10.1038/s41586-021-04043-8. [DOI] [PubMed] [Google Scholar]
- 93.Ioannidis NM, Rothstein JH, Pejaver V, Middha S, McDonnell SK, Baheti S, Musolf A, Li Q, Holzinger E, Karyadi D, et al. (2016). REVEL: An ensemble method for predicting the pathogenicity of rare missense variants. Am. J. Hum. Genet. 99, 877–885. 10.1016/j.ajhg.2016.08.016. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 94.Livesey BJ, and Marsh JA (2025). Variant effect predictor correlation with functional assays is reflective of clinical classification performance. Genome Biol. 26, 104. 10.1186/s13059-025-03575-w. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 95.DiStefano MT, Goehringer S, Babb L, Alkuraya FS, Amberger J, Amin M, Austin-Tse C, Balzotti M, Berg JS, Birney E, et al. (2022). The Gene Curation Coalition: A global effort to harmonize gene-disease evidence resources. Genet. Med. 24, 1732–1742. 10.1016/j.gim.2022.04.017. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 96.Weile J, Sun S, Cote AG, Knapp J, Verby M, Mellor JC, Wu Y, Pons C, Wong C, van Lieshout N, et al. (2017). A framework for exhaustively mapping functional missense variants. Mol. Syst. Biol. 13, 957. 10.15252/msb.20177908. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 97.Ursu O, Neal JT, Shea E, Thakore PI, Jerby-Arnon L, Nguyen L, Dionne D, Diaz C, Bauman J, Mosaad MM, et al. (2022). Massively parallel phenotyping of coding variants in cancer with Perturb-seq. Nat. Biotechnol. 40, 896–905. 10.1038/s41587-021-01160-7. [DOI] [PubMed] [Google Scholar]
- 98.Xu H, Chen L, Sun M, Jean-Baptiste K, Cao D, Zhou X, Wong S, Xiao C, Liu T, Quijano V, et al. (2023). Single cell sequencing as a general variant interpretation assay. Preprint at bioRxiv. 10.1101/2023.12.12.571130. [DOI] [Google Scholar]
- 99.DepMap B (2024). DepMap 24Q4 Public. (Figshare+). 10.25452/FIGSHARE.PLUS.27993248.V1. [DOI] [Google Scholar]
- 100.Tsherniak A, Vazquez F, Montgomery PG, Weir BA, Kryukov G, Cowley GS, Gill S, Harrington WF, Pantel S, Krill-Burger JM, et al. (2017). Defining a cancer dependency map. Cell 170, 564–576.e16. 10.1016/j.cell.2017.06.010. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 101.Blomen VA, Májek P, Jae LT, Bigenzahn JW, Nieuwenhuis J, Staring J, Sacco R, van Diemen FR, Olk N, Stukalov A, et al. (2015). Gene essentiality and synthetic lethality in haploid human cells. Science 350, 1092–1096. 10.1126/science.aac7557. [DOI] [PubMed] [Google Scholar]
- 102.Cagiada M, Jonsson N, and Lindorff-Larsen K (2024). Decoding molecular mechanisms for loss-of-function variants in the human proteome. Preprint at bioRxiv. 10.1101/2024.05.21.595203. [DOI] [Google Scholar]
- 103.Redler RL, Das J, Diaz JR, and Dokholyan NV (2016). Protein destabilization as a common factor in diverse inherited disorders. J. Mol. Evol. 82, 11–16. 10.1007/s00239-015-9717-5. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 104.Fayer S, Horton C, Dines JN, Rubin AF, Richardson ME, McGoldrick K, Hernandez F, Pesaran T, Karam R, Shirts BH, et al. (2021). Closing the gap: Systematic integration of multiplexed functional data resolves variants of uncertain significance in BRCA1, TP53, and PTEN. Am. J. Hum. Genet. 108, 2248–2258. 10.1016/j.ajhg.2021.11.001. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 105.Odell J, and Lammerding J (2024). N-terminal tags impair the ability of lamin A to provide structural support to the nucleus. J. Cell Sci. 137, jcs262207. 10.1242/jcs.262207. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 106.de Medeiros G, Ortiz R, Strnad P, Boni A, Moos F, Repina N, Challet Meylan L, Maurer F, and Liberali P (2022). Multiscale light-sheet organoid imaging framework. Nat. Commun. 13, 4864. 10.1038/s41467-022-32465-z. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 107.Roberts B, Haupt A, Tucker A, Grancharova T, Arakaki J, Fuqua MA, Nelson A, Hookway C, Ludmann SA, Mueller IA, et al. (2017). Systematic gene tagging using CRISPR/Cas9 in human stem cells to illuminate cell organization. Mol. Biol. Cell 28, 2854–2874. 10.1091/mbc.e17-03-0209. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 108.Sansbury SE, Serebrenik YV, Lapidot T, Smith DG, Burslem GM, and Shalem O (2025). Pooled tagging and hydrophobic targeting of endogenous proteins for unbiased mapping of unfolded protein responses. Mol. Cell 85, 1868–1886.e12. 10.1016/j.molcel.2025.04.002. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 109.Kobayashi H, Cheveralls KC, Leonetti MD, and Royer LA (2022). Self-supervised deep learning encodes high-resolution features of protein subcellular localization. Nat. Methods 19, 995–1003. 10.1038/s41592-022-01541-z. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 110.Gupta A, Wefers Z, Kahnert K, Hansen JN, Leineweber W, Cesnik A, Lu D, Axelsson U, Navarro FB, Karaletsos T, et al. (2024). SubCell: Vision foundation models for microscopy capture single-cell biology. Preprint at bioRxiv. 10.1101/2024.12.06.627299. [DOI] [Google Scholar]
- 111.Rubin AF, Stone J, Bianchi AH, Capodanno BJ, Da EY, Dias M, Esposito D, Frazer J, Fu Y, Grindstaff SB, et al. (2025). MaveDB 2024: a curated community database with over seven million variant effects from multiplexed functional assays. Genome Biol. 26, 13. 10.1186/s13059-025-03476-y. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 112.Karch CM, Kao AW, Karydas A, Onanuga K, Martinez R, Argouarch A, Wang C, Huang C, Sohn PD, Bowles KR, et al. (2019). A comprehensive resource for induced pluripotent stem cells from patients with primary tauopathies. Stem Cell Reports 13, 939–955. 10.1016/j.stemcr.2019.09.006. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 113.Hasle N, Cooke A, Srivatsan S, Huang H, Stephany JJ, Krieger Z, Jackson D, Tang W, Pendyala S, Monnat RJ, Jr, et al. (2020). High-throughput, microscope-based sorting to dissect cellular heterogeneity. Mol. Syst. Biol. 16, e9442. 10.15252/msb.20209442. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 114.Gibson DG, Young L, Chuang R-Y, Venter JC, Hutchison CA, and Smith HO (2009). Enzymatic assembly of DNA molecules up to several hundred kilobases. Nat. Methods 6, 343–345. 10.1038/nmeth.1318. [DOI] [PubMed] [Google Scholar]
- 115.Engler C, Kandzia R, and Marillonnet S (2008). A One Pot, One Step, Precision Cloning Method with High Throughput Capability. PLoS One 3, e3647. 10.1371/journal.pone.0003647. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 116.Replogle JM, Bonnar JL, Pogson AN, Liem CR, Maier NK, Ding Y, Russell BJ, Wang X, Leng K, Guna A, et al. (2022). Maximizing CRISPRi efficacy and accessibility with dual-sgRNA libraries and optimal effectors. Elife 11, e81856. 10.7554/eLife.81856. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 117.Matreyek KA, Stephany JJ, and Fowler DM (2017). A platform for functional assessment of large variant libraries in mammalian cells. Nucleic Acids Res. 45, e102–e102. 10.1093/nar/gkx183. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 118.Hermann M, Stillhard P, Wildner H, Seruggia D, Kapp V, Sánchez-Iranzo H, Mercader N, Montoliu L, Zeilhofer HU, and Pelczar P (2014). Binary recombinase systems for high-resolution conditional mutagenesis. Nucleic Acids Res. 42, 3894–3907. 10.1093/nar/gkt1361. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 119.Macdonald CB, Nedrud D, Grimes PR, Trinidad D, Fraser JS, and Coyote-Maestas W (2023). DIMPLE: deep insertion, deletion, and missense mutation libraries for exploring protein variation in evolution, disease, and biology. Genome Biol. 24, 36. 10.1186/s13059-023-02880-6. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 120.Weile J, Ferra G, Boyle G, Pendyala S, Amorosi C, Yeh C-L, Cote AG, Kishore N, Tabet D, van Loggerenberg W, et al. (2024). Pacybara: accurate long-read sequencing for barcoded mutagenized allelic libraries. Bioinformatics 40, btae182. 10.1093/bioinformatics/btae182. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 121.Yusa K, Zhou L, Li MA, Bradley A, and Craig NL (2011). A hyperactive piggyBac transposase for mammalian applications. Proc. Natl. Acad. Sci. U. S. A. 108, 1531–1536. 10.1073/pnas.1008322108. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 122.Sanjana NE, Shalem O, and Zhang F (2014). Improved vectors and genome-wide libraries for CRISPR screening. Nat. Methods 11, 783–784. 10.1038/nmeth.3047. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 123.Chalfoun J, Majurski M, Blattner T, Bhadriraju K, Keyrouz W, Bajcsy P, and Brady M (2017). MIST: Accurate and Scalable Microscopy Image Stitching Tool with Stage Modeling and Error Minimization. Sci. Rep. 7, 4988. 10.1038/s41598-017-04567-y. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 124.Muhlich JL, Chen Y-A, Yapp C, Russell D, Santagata S, and Sorger PK (2022). Stitching and registering highly multiplexed whole-slide images of tissues and tumors using ASHLAR. Bioinformatics 38, 4613–4621. 10.1093/bioinformatics/btac544. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 125.Kuglin C, and Hines D (1975). The phase correlation image alignment method. In Proceedings of the 1975 IEEE International Conference on Cybernetics and Society, pp. 163–165. [Google Scholar]
- 126.Jaganathan K, Panagiotopoulou SK, McRae JF, Darbandi SF, Knowles D, Li YI, Kosmicki JA, Arbelaez J, Cui W, Schwartz GB, et al. (2019). Predicting Splicing from Primary Sequence with Deep Learning. Cell 176, 535–548.e24. 10.1016/j.cell.2018.12.015. [DOI] [PubMed] [Google Scholar]
- 127.McInnes L, Healy J, and Melville J (2020). UMAP: Uniform Manifold Approximation and Projection for Dimension Reduction. Preprint at arXiv. 10.48550/arXiv.1802.03426. [DOI] [Google Scholar]
- 128.Blondel VD, Guillaume J-L, Lambiotte R, and Lefebvre E (2008). Fast unfolding of communities in large networks. J. Stat. Mech. 2008, P10008. 10.1088/1742-5468/2008/10/P10008. [DOI] [Google Scholar]
- 129.Chen T, and Guestrin C (2016). XGBoost: A Scalable Tree Boosting System. Preprint at arXiv [cs.LG]. [Google Scholar]
- 130.Larsson C, Koch J, Nygren A, Janssen G, Raap AK, Landegren U, and Nilsson M (2004). In situ genotyping individual DNA molecules by target-primed rolling-circle amplification of padlock probes. Nat. Methods 1, 227–232. 10.1038/nmeth723. [DOI] [PubMed] [Google Scholar]
- 131.Jang SK, Kräusslich HG, Nicklin MJ, Duke GM, Palmenberg AC, and Wimmer E (1988). A segment of the 5’ nontranslated region of encephalomyocarditis virus RNA directs internal entry of ribosomes during in vitro translation. J. Virol. 62, 2636–2643. 10.1128/JVI.62.8.2636-2643.1988. [DOI] [PMC free article] [PubMed] [Google Scholar]
Supplementary Materials
Figure S1. piggyBac MOI titration and knockout Western blots, related to Figure 1
(A) Fraction of cells with 0, 1, 2, or 3 integrations, determined by in situ sequencing, after U2OS cells were co-transfected with different quantities (in nanograms) of LMNA VIS-seq library DNA and a plasmid encoding piggyBac transposase (at a 4-fold lower mass), followed by puromycin selection.
(B) Imaging of day-7 neuron-like cells derived from clonal NGN2-inducible PTEN knockout lines stained for DAPI (blue) and NCAM1 (red). Scale bar indicates 20 μm.
(C) Unmatched images of mEGFP-tagged H1.4, RPS19, and PTEN libraries in human WTC11 iPS cells or PTEN library in NGN2-induced neuron-like cells (right), stained with DAPI and phalloidin-CF568 (top row) and first base in situ sequencing (bottom row). Some cells express variants with localization defects (HIST1H1E=chromatin binding, RPS19=nucleolar-cytoplasmic, PTEN=nucleo-cytoplasmic). Nucleobase coloring identical to Figure 1E. Scale bar indicates 20 μm.
(D) Western blot of U2OS cells showing parental line (left) and LMNA knockout line clone 3 (right) stained for lamin A protein. Two replicate lanes are shown. Clone 3 was used in subsequent LMNA VIS-seq experiments.
(E) Western blot of NGN2-inducible iPS cells showing parental line (left) and PTEN knockout clonal lines (right) stained for PTEN and phospho-AKT protein. Clonal lines C2 and C18 were used for PTEN VIS-seq replicate 1 and 2, respectively.
Figure S2. LMNA VIS-seq replication, scoring, and feature analysis, related to Figure 2
(A) 12-base pair barcode sequences on circular RNAs were read by in situ sequencing-by-synthesis in each cell and then mapped to corresponding LMNA variants using a barcode-to-variant dictionary made by long-read sequencing. Example cells with reads in all 12 cycles are shown.
(B) Histogram of the total number of 12-base pair reads per sequenced cell in a single well of LMNA replicate 2 experiment.
(C) Histogram of the edit distance between consensus cell-level 12-base pair read and nearest library barcode in a single well of LMNA replicate 2 experiment. Red line indicates that cells with edit distance of 0 or 1 were used if they matched to a unique barcode.
(D) Number of cells genotyped for each LMNA variant in both replicates of VIS-seq screen colored by variant type, with Pearson’s r shown.
(E) Volcano plots of median LMNA variant effect on feature median z-scores (x-axis, left) or feature EMD z-scores (x-axis, right) against geometric-mean KS-test p-values (y-axis, tested against WT feature distributions). Both the feature effect size (x-axis) and p-values (y-axis) are computed over all profiled variants. Points are features with area proportional to the number of variants that pass thresholds and coloring is by imaging channel (top) or cellular compartment (bottom) (see right for legend). Red dashed lines show the Bonferroni-corrected p<0.01 threshold and effect thresholds used to define hit features.
(F) Flowchart describing training a binary classifier for each variant to distinguish images of cells expressing that variant from corresponding WT cell images. For each variant, a distinguishability score summarizing classifier performance was computed as the area under a ROC curve on a test set of single cells. 0.5 indicates random classifier performance and 1 indicates perfect discrimination between variant and WT single cells. For a full description, see Methods.
(G) Morphological impact score of variants in both replicates of the LMNA VIS-seq experiment colored by variant type as in (D), with Pearson’s r shown.
(H) Distinguishability score of variants in both replicates of the LMNA VIS-seq experiment colored by variant type as in (D), with Pearson’s r shown.
(I) Boxplots of morphological impact scores computed from only median, only EMD, or median and EMD-based variant profiles for the same groups of variants as Figure 2F. Points represent individual variant impact scores.
(J) Receiver operating characteristic (ROC) curves are plotted for univariate zero-shot models predicting LMNA aggregating control variants (left) or ClinVar P/LP variants (right) from synonymous variants, using impact scores shown in (I) or distinguishability scores. AUC is shown on the right for each model.
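The distinguishability score in (F) is the area under a ROC curve computed on held-out single cells. The following is a minimal sketch of that final step only, assuming per-cell classifier scores are already available. The function name `roc_auc` and the toy scores are hypothetical, and the rank-based (Mann-Whitney) formulation is used for brevity; it is not taken from the authors' code.

```python
def roc_auc(wt_scores, variant_scores):
    """Area under the ROC curve via the Mann-Whitney formulation.

    Equals the probability that a randomly chosen variant cell receives a
    higher classifier score than a randomly chosen WT cell (ties count 0.5).
    """
    if not wt_scores or not variant_scores:
        raise ValueError("both classes need at least one cell")
    wins = 0.0
    for v in variant_scores:
        for w in wt_scores:
            if v > w:
                wins += 1.0
            elif v == w:
                wins += 0.5
    return wins / (len(wt_scores) * len(variant_scores))


# Hypothetical held-out classifier outputs (predicted probability of "variant").
wt = [0.1, 0.2, 0.3, 0.4]
variant = [0.35, 0.6, 0.8, 0.9]
score = roc_auc(wt, variant)  # 15 of 16 variant/WT pairs are ordered correctly
```

A score of 0.5 corresponds to a classifier that cannot separate variant from WT cells, and 1 to perfect separation, matching the interpretation given in (F).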
Figure S3. LMNA variant profile clustering is robust and separates variants by selected and landmark features, related to Figure 3
(A) Heatmap of LMNA variants organized by their Louvain-derived cluster (y-axis, colored according to Figure 3A) versus selected feature medians (left portion of heatmap, x-axis) or selected feature EMDs (right portion of heatmap, x-axis) organized by hierarchical clustering. Each feature median and EMD is z-scored using the synonymous distribution of the feature. Imaging channel and compartment are annotated for each feature on the top, with clustering dendrogram shown (see legend on the right). Landmark feature medians, EMDs, and impact and distinguishability scores are shown separately on the right. Nuclear mEGFP-lamin A intensity and granularity features, nuclear shape features, and nuclear correlation features between DAPI and mEGFP-lamin A channels are annotated below.
(B) Boxplot depicting cosine similarity computed for pairs of variants using either selected feature medians or EMDs. Pairs of variants were chosen in different ways to assess the robustness of the Louvain-derived clusters (see Figure 3A). To assess the baseline cosine similarity between the morphological profiles of variants, pairs were chosen from all variants (“global”, left column, grey boxes). To assess the similarity between variants in each cluster, pairs were drawn from the cluster (“within”, right columns, blue boxes). To assess the similarity between variants within a cluster compared to variants outside the cluster, one variant was drawn from the cluster and one was drawn from outside the cluster (“between”, right columns, orange). Each boxplot represents the distribution of cosine similarities computed for all pairs of the indicated type. Louvain cluster colors are shown above. *** indicates Mann-Whitney U FDR-corrected q<0.001.
(C) To assess the stability of our Louvain clustering (see Figure 3A), the clustering was performed 100 times, each time omitting a randomly selected 10% of variants. The fraction of samples in which each pair of variants was located in their original Louvain cluster is shown. Variants are organized by their original Louvain cluster labels shown on the left.
(D) UpSet plot showing the number of missense LMNA variants that impact each combination of landmark features defined in Figure 3B. Features were considered impacted if EMD z-score>2.5 and KS-test Bonferroni-corrected p<0.01. Sets, represented by bars and dots in the plot, are colored by the most frequently impacted feature in the set (no impacted features=black, circularity=orange, granularity=purple, intensity/boundary intensity=teal).
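The pair-sampling scheme in (B) can be illustrated with a short sketch. It assumes each variant profile is a plain list of feature values and that cluster labels are given per variant; the names `cosine`, `pair_similarities`, and the toy variants `v1` to `v3` are hypothetical and serve only to show how "within" and "between" pairs are formed.

```python
import math
from itertools import combinations


def cosine(u, v):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    if norm == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / norm


def pair_similarities(profiles, clusters, cluster):
    """Return (within, between) cosine similarities for one cluster.

    within:  all pairs with both variants inside the cluster.
    between: all pairs with one variant inside and one outside the cluster.
    """
    inside = [v for v in profiles if clusters[v] == cluster]
    outside = [v for v in profiles if clusters[v] != cluster]
    within = [cosine(profiles[a], profiles[b]) for a, b in combinations(inside, 2)]
    between = [cosine(profiles[a], profiles[b]) for a in inside for b in outside]
    return within, between


# Toy profiles: v1 and v2 point in the same direction, v3 is orthogonal.
profiles = {"v1": [1.0, 0.0], "v2": [2.0, 0.0], "v3": [0.0, 1.0]}
clusters = {"v1": 0, "v2": 0, "v3": 1}
within, between = pair_similarities(profiles, clusters, 0)
```

The "global" baseline in (B) would correspond to applying `cosine` to pairs drawn from all variants regardless of cluster label.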
Figure S4. LMNA VIS-seq landmark feature analysis, related to Figure 3
(A) Randomly-selected cells from the 10th and 90th percentile in the landmark feature distributions are shown. mEGFP-tagged lamin A channel is shown in green, and DAPI in blue. Silver dashed lines depict the borders of cells. Scale bar indicates 5 μm.
(B) Plot showing variant effects on nuclear mEGFP-lamin A granularity 1, 2 and 3 features by Louvain cluster of residence. Z-scores are versus the synonymous distribution. Synonymous variants are shown on the left. Granularity scale by pixel size is shown in the legend.
(C) LMNA feature median z-scores for mEGFP-lamin A boundary intensity, nuclear eccentricity, nuclear solidity, and nuclear eccentricity are plotted against each other for all profiled variants. Z-scores are versus the synonymous variant distribution. Variants are colored by type: synonymous (green), missense (grey), and frameshift (purple).
(D) Plot over nuclear radial deciles of the mEGFP-lamin A total intensity in each decile normalized to the DAPI total intensity in that decile. Bin 1 corresponds to the innermost radial decile and bin 10 to the outermost. The data for the remainder of the clusters and a diagram of nuclear radial deciles are shown in Figure 3C.
(E) Map of lamin A missense variant effects on the lamin A nuclear intensity feature (left; grey boxes=missing variants, black dots=synonymous substitutions) and on the UMAP visualization of variant profiles (right, see Figure 2H). Blue to red coloring indicates feature z-score versus the synonymous variant distribution. Above the map, positions are annotated by their participation in multimer contacts (first row, defined below), α-helical average substitution profile (second row, defined below and in Figure S5F), and proline-substitution profile (third row, defined below and in Figure S5E). Linker or coil subdomains are shown in the thicker annotation bar directly above the heatmap63. Shown below the heatmap are position-averaged scores, and on the right are amino acid substitution marginal feature z-score distributions (points depict variants; black lines depict medians).
(F) Heatmap and UMAP for missense substitution effects on the lamin A nuclear boundary intensity feature, colored and annotated as in (E).
(G) Heatmap and UMAP for missense substitution effects on the lamin A nuclear granularity 1 feature, colored and annotated as in (E).
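Feature z-scores throughout these panels are expressed relative to the synonymous variant distribution. A minimal sketch of that normalization follows, assuming one summary value per variant for a single feature. The function name `synonymous_z_scores`, the variant identifiers, and the use of the sample standard deviation are assumptions made for illustration; see Methods for the estimator actually used.

```python
import statistics


def synonymous_z_scores(feature_by_variant, synonymous_ids):
    """Z-score every variant against the synonymous variants for one feature.

    feature_by_variant: {variant_id: feature value (e.g. a median or EMD)}
    synonymous_ids:     identifiers of the synonymous variants
    """
    syn = [feature_by_variant[v] for v in synonymous_ids]
    if len(syn) < 2:
        raise ValueError("need at least two synonymous variants")
    mean = statistics.mean(syn)
    sd = statistics.stdev(syn)  # sample SD; an assumption, not from the source
    if sd == 0:
        raise ValueError("synonymous distribution has zero variance")
    return {v: (x - mean) / sd for v, x in feature_by_variant.items()}


# Hypothetical feature values: three synonymous variants and one missense.
values = {"syn1": 9.0, "syn2": 10.0, "syn3": 11.0, "mis1": 13.0}
z = synonymous_z_scores(values, ["syn1", "syn2", "syn3"])
```

In this toy example the synonymous mean is 10 and the sample standard deviation is 1, so the missense variant sits three synonymous standard deviations above the synonymous mean.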
Figure S5. VIS-seq profiles separate lamin A residues by structural and functional properties, related to Figure 4
(A) UMAP visualization of LMNA synonymous (dark green) or missense variant profiles colored by lamin A subdomain.
(B) Heatmap of proportion of missense variants in coil or linker 12 subdomains present in each Louvain cluster (see Figure 3A). Proportion is indicated inside each box. Color scale for heatmap shown on the right. *** indicates Fisher’s exact test FDR-corrected q<0.001 and ** indicates q<0.01.
(C) Morphological impact score of missense substitutions plotted by domain, compared with synonymous variants (green). *** indicates Mann-Whitney U-test p<0.001.
(D) Morphological impact score of lamin A missense substitutions plotted by amino acid, compared with synonymous variants (green). *** indicates Mann-Whitney U-test p<0.001.
(E) Pearson’s correlation coefficients between LMNA proline substitution morphological profiles in PCA space (light color=positive correlation, dark=negative correlation). Bars on top indicate each position’s subdomain and participation in multimer contacts (see (G), Methods, color legend shown above and (E)). Bars on right indicate dendrogram-derived grouping into three clusters: linker, linker-proximal and α-helical.
(F) Pearson’s correlation coefficients between the α-helical (defined in (D)) PCA space, position-averaged variant morphological profiles (light color=positive correlation, dark=negative correlation). Synonymous variants were averaged into a single PCA vector and included (green annotation, top). Bars on top indicate each position’s subdomain63 (color legend above and (D)) and participation in multimer contacts63,64 (see (G), Methods). Bars on right indicate dendrogram-derived grouping into three clusters: strongly aggregating, low abundance, and low impact.
(G) Contour plots indicate location on the UMAP shown in (Figure 2H) for missense substitutions at each group of positions defined in (E), labeled according to their effects on lamin A.
(H) Intra-dimer (left) and inter-dimer coil 1B (middle) and coil 2A (right) minimum β-carbon distances derived from A1163 and A2264 multimer structures. Dimer-facing residues are indicated in blue (left) and multimer-facing residues are indicated in red for A1163 and A2264 structures (middle, right, respectively). See Methods for how these residues are defined.
(I) Morphological impact score (left) or distinguishability score (right) of missense substitutions at α-helical position groups as defined in (E) by clustering amino acid positions. *** indicates Mann-Whitney U-test p<0.001.
(J) Heatmap of proportion of missense variants in each position cluster from (E) and (F) present in each Louvain cluster (see Figure 3A). Proportion is indicated inside each box. Color scale for heatmap shown on the right. *** indicates Fisher’s exact test FDR-corrected q<0.001, ** indicates q<0.01, and * indicates q<0.05.
(K) Heatmap of average effects on scores and landmark features of substitutions to amino acids in proline substitution profile-defined position groups (see (E)) and α-helical position groups (see (F)). Lprox indicates linker-proximal, αH indicates α-helical, Agg indicates aggregation-sensitive, Abund indicates abundance-sensitive, and LI indicates low impact position groups. Color scales for each heatmap shown on the bottom.
(L) Heatmap of proportion of missense variants to groups of amino acids present in each Louvain cluster (see Figure 3A). Proportion is indicated inside each box. Color scale for heatmap shown on the right. *** indicates Fisher’s exact test FDR-corrected q<0.001, ** indicates q<0.01, and * indicates q<0.05.
Figure S6. PTEN VIS-seq replication, scoring, and feature analysis in iPS cells and NGN2-induced neurons, related to Figure 5
(A) Histogram of total read count per genotyped cell in a single well of PTEN iPS cell (blue) or NGN2-induced neuron-like cell (green) replicate 2 experiments. Each genotyped cell has two 8-bp barcodes.
(B) Total (summed) edit distance between consensus cell-level 16-base pair double barcode reads and nearest library double barcode in a single well of PTEN iPS cell (blue) or NGN2-induced neuron (green) replicate 2 experiments. Red line indicates the edit distance ≤ 2 filtering threshold we used for matching of sequenced barcodes to our barcode library.
(C) Total edit distance between consensus cell-level 16-base pair double barcode reads and nearest library double barcode (x) or second nearest library double barcode (y) in a single well of PTEN iPS cell replicate 2 experiment. Red line indicates that cells with total edit distance < 3 were used if they matched to a unique library double barcode.
(D) Boxplots of the number of profiled single cells per variant of the indicated class for iPS cells (left) or NGN2-induced neurons (right) containing single PTEN variants over both replicates.
(E) Number of cells genotyped for each PTEN variant in iPS cells (x) and NGN2-induced neurons (y) for independently transfected and differentiated replicate 1 (left) and replicate 2 (right) of the VIS-seq experiment, with Pearson’s r shown.
(F) Stacked barplot showing proportion of all features measured in iPS cells, selected features which generate variant profiles, and hit features (defined in (G)) by imaging channel, cellular compartment, or method of computation of feature. The number of features in each set is indicated. Color legends are shown below.
(G) Volcano plots of median variant effect on iPS cell feature median z-scores (x-axis, left) or feature EMD z-scores (x-axis, right) against geometric-mean KS-test p-values (y-axis). Both the feature effect size (x-axis) and p-values (y-axis) are computed over all profiled variants. Points are features with area proportional to the number of variants that pass thresholds and coloring is by imaging channel (top) or cellular compartment (bottom) (see (F) for legend). Red dashed lines show the Bonferroni-corrected p<0.01 threshold and effect thresholds. Hit features are defined by median effect thresholds (left).
(H) Stacked barplot showing proportion of all features measured in NGN2-induced neurons as in (F).
(I) Volcano plots of NGN2-induced neuron features as in (G). See (H) for legend.
(J) Morphological impact score for PTEN variants in iPS cells (left) and NGN2-induced neurons (right), plotted by variant type. *** indicates Mann-Whitney p<0.001.
(K) Distinguishability scores for PTEN variants in iPS cells (left) or NGN2-induced neurons (right), plotted by variant type. Shown comparisons (top) have Mann-Whitney p<0.001.
(L) Morphological impact scores for variants in each replicate of the PTEN VIS-seq experiment in iPS cells (left) and NGN2-induced neurons (right), colored by variant type as in (E), with Pearson’s r shown.
(M) Morphological impact scores (top) and distinguishability scores (bottom) for PTEN variants in both iPS cells and NGN2-induced neurons are compared, colored by variant type as in (B). Pearson’s r shown.
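The double-barcode filter described in (B) and (C) can be sketched as follows. This is an illustration under stated assumptions: edit distance is taken to be Levenshtein distance, the total is the sum over the two 8-bp barcodes, and a read is kept only when the nearest library entry has total distance ≤ 2 and no other entry ties with it. The function names and the toy library are hypothetical; the actual genotyping code is the STARCall workflow referenced in the Data Availability Statement.

```python
def edit_distance(a, b):
    """Levenshtein distance between two strings (row-by-row dynamic programming)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            curr.append(min(prev[j] + 1,                 # deletion
                            curr[j - 1] + 1,             # insertion
                            prev[j - 1] + (ca != cb)))   # substitution or match
        prev = curr
    return prev[-1]


def match_double_barcode(read, library, max_total=2):
    """Return the matched variant, or None if the read is filtered out.

    read:    (barcode1, barcode2) consensus read for one cell
    library: {(barcode1, barcode2): variant}
    """
    totals = sorted(
        (edit_distance(read[0], bc1) + edit_distance(read[1], bc2), variant)
        for (bc1, bc2), variant in library.items()
    )
    if not totals:
        return None
    best, variant = totals[0]
    if best > max_total:
        return None  # too far from every library entry
    if len(totals) > 1 and totals[1][0] == best:
        return None  # ambiguous: two library entries are equally close
    return variant


# Hypothetical two-entry library of 8-bp barcode pairs.
library = {
    ("ACGTACGT", "TTGGCCAA"): "variant_A",
    ("GGGGAAAA", "CCCCTTTT"): "variant_B",
}
hit = match_double_barcode(("ACGTACGA", "TTGGCCAA"), library)   # one mismatch
miss = match_double_barcode(("TTTTTTTT", "TTGGCCAA"), library)  # too distant
```

Comparing the nearest and second-nearest totals, as plotted in (C), is what allows ambiguous reads to be discarded rather than assigned arbitrarily.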
Figure S7. Landmark features in VIS-seq PTEN experiments, related to Figure 5
(A,B,C) Missense variant effect maps for NGN2-induced neuron-like cell PTEN landmark features (grey boxes=missing variants, black dots=synonymous substitutions). Blue to red coloring indicates feature z-score versus the synonymous variant distribution. Linker or coil subdomains are shown in the thicker annotation bar directly above the heatmap. Shown below the heatmap are position-averaged scores, and on the right are amino acid substitution marginal feature z-score distributions (points depict variants; black lines depict medians).
(D) Two randomly-selected iPS cells (top) or NGN2-induced neurons (bottom) expressing PTEN WT or variants are shown. mEGFP-tagged PTEN channel is shown in green, DAPI in blue, phalloidin (top) or MAP2 (bottom) in yellow, and pAKT in red. Silver dashed lines depict the borders of cells. Scale bar indicates 5 μm.
(E) PTEN abundance score for PTEN variants in iPS cells (upper left) and NGN2-induced neurons (upper right) plotted against VAMP-seq5 score, with best-fit line and Pearson’s r shown. pAKT intensity score for PTEN variants in iPS cells (bottom left) and NGN2-induced neuron-like cells (bottom right) plotted against yeast fitness84 score, with Pearson’s r shown.
(F) iPS cell mEGFP-PTEN intensity score positional averages are used to color the PTEN crystal structure (1D5R)71. Solvent-exposed surface is colored by the exposed residue’s z-score. Blue indicates low-intensity positions. The bound tartrate molecule is shown in green. Residue color scale shown to the right of (H).
(G) iPS cell pAKT intensity score positional averages are used to color the PTEN crystal structure (1D5R)71.
(H) iPS cell DAPI-PTEN correlation score positional averages are used to color the PTEN crystal structure (1D5R)71.
Figure S8. Relationships between PTEN landmark features and scores in iPS cells and NGN2-induced neuron-like cells, related to Figure 5
(A) iPS cell landmark feature z-scores are plotted against NGN2-induced neuron-like cell landmark feature z-scores for all profiled variants, colored by variant type. Z-scores are versus the synonymous variant distribution.
(B) iPS cell nucleus to cytoplasm PTEN intensity ratio z-scores are plotted against iPS cell DAPI-PTEN correlation z-scores for all profiled variants, colored by variant type as in (A).
(C) Violin plots of iPS cell and NGN2-induced neuron PTEN abundance (left) and DAPI-PTEN correlation z-scores (right). iPS cell-specific nuclear-localized variants are indicated. *** indicates KS p<0.001.
(D) iPS cell PTEN landmark feature z-scores are plotted against each other for all profiled variants. Variants are colored by type as in (A). iPS cell-specific lipid phosphatase-active nuclear-localized variants are indicated.
(E) The mislocalization score is defined as the variant-level residuals when DAPI-PTEN correlation is regressed against PTEN intensity and pAKT intensity (see Methods). Positional heatmap of mislocalization scores for missense, stop gain, or 3-nucleotide deletion variant profiles in iPS cells. Positional average of scores shown on the bottom and amino acid substitution marginal distribution shown on the right.
(F) Mislocalization score in iPS cells computed independently for replicates 1 and 2 are plotted. Synonymous (green) and missense (grey) variants are shown. Pearson’s r shown on the top left.
(G) PTEN structure (1D5R)71 with spheres centered at α-carbon atoms with radii indicating the number of lipid-phosphatase active (defined as pAKT z-score<2.5) and nuclear-localized (defined as DAPI-PTEN correlation z-score>2.5) variants at that position. The top four positions indicating nuclear localization are highlighted, with the number of variants at each of these positions indicated.
(H) iPS cell mislocalization score positional averages are used to color the PTEN crystal structure (1D5R)71. Blue indicates aberrantly cytoplasmic-localizing positions and red indicates aberrantly nuclear-localizing positions. Color scale shown on the right.
(I) UpSet plot showing the number of PTEN missense variants that perturb each combination of landmark features in iPS cells and NGN2-induced neurons. Landmark features for each variant were considered perturbed if feature |z-score|>2.5 and KS-test Bonferroni-corrected p<0.01. Sets, represented by bars and dots in the plot, are colored by combinations of impacted features (neither pAKT nor PTEN abundance impacted in both cell types = black, pAKT impacted in both cell types = orange, PTEN abundance impacted in both cell types = teal, both pAKT and PTEN abundance impacted in both cell types = purple).
(J) The distinguishability score difference is defined as the variant-level difference in distinguishability score percentile between iPS cells and neurons. Positional heatmap of distinguishability score difference for missense, stop gain, or 3-nucleotide deletion variant profiles in iPS cells. Positional average of scores shown on the bottom and amino acid substitution marginal distribution shown on the right.
(K) Waterfall plots of all iPS cell features (top) or neuron features (bottom) ranked by Spearman’s correlation with distinguishability score difference. Positive values indicate that larger values of the feature are associated with higher distinguishability in iPS cells than in neurons. Landmark features are indicated and tracks are shown below the plot indicating feature imaging channel and cellular compartment. Lines above and below the plot labeled by *** indicate which features have Student’s t-test Bonferroni-corrected p<0.001.
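The residual-based mislocalization score defined in (E) can be illustrated with a small sketch. It assumes ordinary least squares with an intercept and two predictors, solved in closed form on centered data; the function name `ols_residuals` and the toy values are hypothetical, and the regression details in Methods take precedence over this illustration.

```python
def ols_residuals(y, x1, x2):
    """Residuals of an ordinary least-squares fit y ~ 1 + x1 + x2.

    Here y would be the DAPI-PTEN correlation, x1 the PTEN intensity and
    x2 the pAKT intensity, one value per variant.
    """
    n = len(y)
    if not (n == len(x1) == len(x2)) or n < 3:
        raise ValueError("need three equal-length inputs with at least 3 values")
    my, m1, m2 = sum(y) / n, sum(x1) / n, sum(x2) / n
    yc = [v - my for v in y]
    a = [v - m1 for v in x1]
    b = [v - m2 for v in x2]
    # Entries of the 2x2 normal equations on centered data.
    s11 = sum(u * u for u in a)
    s22 = sum(u * u for u in b)
    s12 = sum(u * v for u, v in zip(a, b))
    s1y = sum(u * v for u, v in zip(a, yc))
    s2y = sum(u * v for u, v in zip(b, yc))
    det = s11 * s22 - s12 * s12
    if det == 0:
        raise ValueError("predictors are collinear or constant")
    beta1 = (s22 * s1y - s12 * s2y) / det
    beta2 = (s11 * s2y - s12 * s1y) / det
    return [yi - beta1 * ai - beta2 * bi for yi, ai, bi in zip(yc, a, b)]


# Toy data built as y = 1 + 2*x1 - x2 + e, with e orthogonal to both predictors.
x1 = [0.0, 1.0, 0.0, 1.0]
x2 = [0.0, 0.0, 1.0, 1.0]
y = [2.0, 2.0, -1.0, 3.0]
residuals = ols_residuals(y, x1, x2)
```

Under this construction, a positive residual marks a variant whose DAPI-PTEN correlation is higher than its PTEN and pAKT intensities would predict, consistent with the aberrantly nuclear-localizing positions colored red in (H).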
Figure S9. PTEN VIS-seq profiles predict pathogenicity and clinical phenotype, related to Figure 6
(A) UMAP visualization of iPS cell (left) and NGN2-induced, neuron-like cell (right) PTEN variant profiles with triangles indicating variants in ClinVar classified as likely benign (LB, blue), likely pathogenic (LP, red), pathogenic (P, dark red), or variant of uncertain significance (VUS, yellow). All profiled variants are plotted in the background colored green (synonymous) or grey (otherwise) for comparison.
(B) Morphological impact scores for PTEN iPS cell (left) or NGN2-induced neuron (right) profiles are plotted by ClinVar label. Synonymous variants (green) are included for comparison. *** indicates Mann-Whitney U p<0.001.
(C) Receiver operating characteristic (ROC) curves are plotted for univariate zero-shot models predicting ClinVar pathogenicity or each clinical phenotype (ASD = autism spectrum disorder, DD = developmental delay, PHTS = PTEN hamartoma tumor syndrome; see Methods for curation criteria) from iPS cell or NGN2-induced neuron distinguishability score (this publication, solid lines), yeast fitness scores84 (dashed line), VAMPseq scores5 (dashed line), AlphaMissense91 or EVE92 scores (dot-dashed lines). Area under the curve (AUC) scores are shown in the box for each model.
(D) Morphological impact scores for PTEN variants in NGN2-induced neurons are plotted by variant association with clinical phenotypes. gnomAD v4.1 (light blue) variants and synonymous variants (green) are plotted for comparison.
(E) Distinguishability scores for PTEN variants in iPS cells (top) or NGN2-induced neurons (bottom) are plotted by variant association with clinical phenotypes. gnomAD v4.1 (light blue) variants and synonymous variants (green) are plotted for comparison.
(F) Feature scores for PTEN variants in iPS cells are plotted by variant association with clinical phenotypes. gnomAD v4.1 (light blue) variants and synonymous variants (green) are plotted for comparison. Features plotted include PTEN intensity (left) or pAKT intensity (right). *** indicates Mann-Whitney U-test p<0.001.
(G) Feature scores for PTEN variants in NGN2-induced neurons are plotted by variant association with clinical phenotypes. gnomAD v4.1 (light blue) variants and synonymous variants (green) are plotted for comparison. Features plotted include PTEN intensity (left), pAKT intensity (middle), or mislocalization score (right). *** indicates Mann-Whitney U-test p<0.001.
(H) Plot of PTEN median effects on feature z-scores in the ASD/DD-only variants (x-axis) against PHTS-only variants (y-axis), in iPS cells (left) or NGN2-induced neurons (right). Points are features colored by imaging channel (see left for legend). Landmark features are highlighted. Red dashed lines show median effect thresholds at synonymous z-scores of 1 or −1.
Figure S10. VIS-seq measurements elaborate computational variant effect predictions, related to Figure 6
(A) Predictor scores for AlphaMissense91, EVE92, and REVEL93 are plotted against LMNA VIS-seq variant impact and distinguishability scores as well as landmark feature medians. ClinVar P/LP variants are shown with red triangles. Predictor-specified pathogenic and benign thresholds are shown by horizontal dotted lines. Pearson’s r between logit-transformed predictor scores and VIS-seq scores are shown on the bottom left.
(B) Predictor scores for AlphaMissense91, EVE92, and REVEL93 are colored on the UMAP visualization of LMNA VIS-seq profiles.
(C) Predictor scores for AlphaMissense91, EVE92, and REVEL93 are plotted against PTEN VIS-seq iPS cell variant impact and distinguishability scores as well as landmark feature medians. ClinVar P/LP variants are shown with red triangles and B/LB variants are shown with blue triangles. Predictor-specified pathogenic and benign thresholds are shown by horizontal dotted lines. Pearson’s r between logit-transformed predictor scores and VIS-seq scores are shown on the bottom left.
(D) Predictor scores for AlphaMissense91, EVE92, and REVEL93 are colored on the UMAP visualization of PTEN iPS cell VIS-seq profiles.
(E) Predictor scores for AlphaMissense91, EVE92, and REVEL93 are plotted against PTEN VIS-seq NGN2-induced neuron-like cell variant scores and landmark features as in (C).
(F) Predictor scores for AlphaMissense91, EVE92, and REVEL93 are colored on the UMAP visualization of PTEN NGN2-induced neuron VIS-seq profiles as in (D).
Table S1. LMNA feature-level volcano analysis, related to Figure 2
Table S4. PTEN variant-level landmark features, scores, and clinical annotations, related to Figures 5 and 6
Table S5. Curated PTEN missense variant–phenotype associations from the literature (PHTS and ASD/DD), related to Figure 6
Table S6. DNA sequences used in this publication, related to STAR Methods
(A) Primers and oligos used for cloning
(B) Libraries
(C) Gene fragments used for cloning
(D) Plasmids
(E) Guide RNAs
(F) Oligos for in situ sequencing
Data Availability Statement
Variant morphological impact scores, distinguishability scores and landmark feature values can be found on MaveDB111 (LMNA: https://mavedb.org/experiments/urn:mavedb:00001243-a, PTEN iPS cells: https://mavedb.org/experiments/urn:mavedb:00001244-b, and PTEN NGN2-induced neurons: https://mavedb.org/experiments/urn:mavedb:00001244-a).
Tabular data is provided as a supplement or, for larger files, is available via Zenodo at doi.org/10.5281/zenodo.15787684.
Image data, single-cell feature profiles, and CellProfiler pipelines are publicly available via the BioImage Archive at doi.org/10.6019/S-BIAD3095.
All code necessary to reproduce our analyses, starting at the cell by features matrix or from summary data available via Zenodo, can be found at https://github.com/FowlerLab/visseq. STARCall code used to generate the cells by features matrix from phenotyping and genotyping images can be found at https://github.com/FowlerLab/starcall-workflow. Code used to generate trained XGBoost models and variant-level distinguishability scores can be found at https://github.com/FowlerLab/fisseqtools.
Any additional information required to reanalyze the data reported in this paper is available from the lead contact upon request.
