Donor-matched iPSC model reveals context-dependent T2D genetic signals in fibro-adipogenic progenitors

Christa Ventresca; Arushi Varshney; Peter Orchard; Ha TH Vu; Yao-chang Tsan; Andre Monteiro da Rocha; Michael R Erdos; Leena Kinnunen; Timo A Lakka; Jouko Saramies; Markku Laakso; Jaakko Tuomilehto; Karen L Mohlke; Michael Boehnke; Laura J Scott; Heikki A Koistinen; Francis S Collins; Todd Herron; Stephanie Bielas; Stephen C J Parker

doi:10.64898/2026.02.04.702388

This is a preprint.

It has not yet been peer reviewed by a journal.

The National Library of Medicine is running a pilot to include preprints that result from research funded by NIH in PMC and PubMed.

[Preprint]. 2026 Feb 6:2026.02.04.702388. [Version 1] doi: 10.64898/2026.02.04.702388

Donor-matched iPSC model reveals context-dependent T2D genetic signals in fibro-adipogenic progenitors

Christa Ventresca ^1,², Arushi Varshney ², Peter Orchard ², Ha TH Vu ², Yao-chang Tsan ¹, Andre Monteiro da Rocha ³, Michael R Erdos ⁴, Leena Kinnunen ^5,^6,^7,⁸, Timo A Lakka ^9,^10,¹¹, Jouko Saramies ¹², Markku Laakso ^13,¹⁴, Jaakko Tuomilehto ¹⁵, Karen L Mohlke ¹⁶, Michael Boehnke ¹⁷, Laura J Scott ¹⁷, Heikki A Koistinen ^5,^6,^7,⁸, Francis S Collins ¹⁸, Todd Herron ¹⁹, Stephanie Bielas ^1,²⁰, Stephen C J Parker ^1,^2,^17,^21,^*

¹ Department of Human Genetics, University of Michigan, Ann Arbor, MI, USA

² Gilbert S. Omenn Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI, USA

³ Internal Medicine, University of Michigan Medical School, Ann Arbor, MI, USA

⁴ Molecular Genetics Section, NHGRI, NIH

⁵ work reported here performed during former affiliation at Finnish Institute for Health and Welfare, Helsinki, Finland

⁶ Faculty of Medicine, Clinical and Molecular Metabolism, University of Helsinki, Helsinki, Finland

⁷ Department of Medicine, University of Helsinki and Helsinki University Hospital, Helsinki Finland

⁸ Minerva Foundation Institute for Medical Research, Helsinki, Finland

⁹ Institute of Biomedicine, School of Medicine, University of Eastern Finland, Kuopio Campus, Finland

¹⁰ Department of Clinical Physiology and Nuclear Medicine, Kuopio University Hospital, Kuopio, Finland

¹¹ Foundation for Research in Health Exercise and Nutrition, Kuopio Research Institute of Exercise Medicine, Kuopio, Finland

¹² South Karelia Social and Health Care District, Wellbeing services county of South Karelia, Finland

¹³ Institute of Clinical Medicine, Internal Medicine, University of Eastern Finland, Kuopio, Finland

¹⁴ Department of Medicine, Kuopio University Hospital, Kuopio, Finland

¹⁵ Department of Public Health, University of Helsinki, Helsinki, Finland

¹⁶ Department of Genetics, University of North Carolina, Chapel Hill, NC, USA

¹⁷ Department of Biostatistics and Center for Statistical Genetics, University of Michigan, Ann Arbor, MI, USA

¹⁸ Currently unaffiliated; NHGRI work reported here preceded retirement on March 1, 2025

¹⁹ Greenstone Biosciences, Palo Alto, CA, USA

²⁰ Pediatrics, University of Michigan, Ann Arbor, MI, USA

²¹ Lead contact

Author contributions

CV led the experiments, analysis, and manuscript preparation. AV performed the FUSION FAP subtype identification and led the FUSION analyses used here. PO assisted with the sn-multiomic analysis including developing pipelines used. HV assisted with the iPSC-FAP nuclei QC and subtype identification. YT, AMR, TH, and SB provided essential mentorship and guidance for cultivating iPSCs and differentiating to FAPs, with SB providing additional guidance for forming the cell village and performing the insulin stimulated glucose uptake assay. ML, JS, JT, TAL, LK, and HAK contributed biopsy samples. KLM, MB, JLS, and FSC provided supervision and mentoring. SCJP organized the project, provided mentorship, and supervised all aspects of data generation, analysis, and interpretation.

Correspondence: scjp@umich.edu

PMCID: PMC12889671 PMID: 41676558

Summary:

Fibro-adipogenic progenitors (FAPs) in skeletal muscle have been implicated in type 2 diabetes (T2D) risk, yet their heterogeneity and context-dependent regulation remain poorly understood. Here, we establish induced pluripotent stem cell (iPSC)-derived FAPs as a faithful model of primary FAPs by leveraging a unique resource: iPSC lines and skeletal muscle biopsies obtained from the same 30 individuals. Donor-matched comparisons reveal that iPSC-FAPs recapitulate the transcriptome, epigenome, and subtype composition of muscle tissue FAPs. Using single-nucleus multiomics, we show that high-insulin exposure drives iPSC-FAPs toward an adipogenic fate - and that this adipogenic subtype is enriched for T2D GWAS signals, an enrichment undetectable under baseline conditions. We map the T2D-associated rs3814707 non-coding signal to LTBP3, a gene that influences FAP adipogenic differentiation. These findings reveal how disease-relevant regulatory mechanisms can be masked in unstimulated cells and establish iPSC-FAPs as a powerful platform for dissecting the state-dependent biology of complex metabolic disease.

Introduction:

Type 2 diabetes (T2D) affects approximately 462 million individuals, representing 6% of the global population, and arises from a combination of genetic, environmental, and behavioral factors.^1–3 Genome-wide association studies (GWAS) have identified over 1,200 independent signals associated with T2D, more than 90% of which localize to non-coding regions of the genome and individually have small effect sizes.^4,5 Many of these variants map to regulatory elements that are active only in specific cell types,^4–12 indicating that identifying the relevant cellular contexts is critical for interpreting GWAS signals and understanding disease mechanisms.

Skeletal muscle is a tissue where T2D-relevant regulatory activity has been identified; notably muscle regulatory elements are enriched for GWAS signals associated with fasting insulin levels.^13,14 Within skeletal muscle, fibro-adipogenic progenitors (FAPs) have emerged as a cell population of particular interest. Genetic variants that regulate chromatin accessibility in FAPs colocalize with GWAS signals for T2D and related traits, suggesting that FAP-specific regulatory elements may harbor causal variants.^13,14 FAPs are a heterogeneous population comprising multiple subtypes that reflect distinct differentiation trajectories: multipotent progenitors capable of following any trajectory, adipogenic cells poised to differentiate into adipocytes, and fibrogenic cells that promote fibrosis.^15–19 The relative proportions of these subtypes shift in response to injury,^15,16 suggesting that FAP composition is dynamically regulated by the cellular environment. However, the contribution of specific FAP subtypes in T2D pathophysiology has not yet been explored.

Beyond cell type identity, the cellular environment and cell state may also shape the regulatory landscape and thereby influence disease development. A previous study of immune cells demonstrated that exposure to different stimuli alters chromatin architecture, transcription factor (TF) binding, and gene expression.²⁰ Notably, some regulatory elements exist in a “primed’ configuration that enables stimulus-specific transcriptional responses, indicating that chromatin structure can predetermine how cells respond to environmental cues. This principle suggests that studying FAPs under T2D-relevant conditions - rather than at baseline alone - may reveal disease-relevant regulatory mechanisms that would otherwise remain hidden.

To investigate how FAP subtypes and their regulatory pathways respond to a diabetogenic environment, we leveraged the Finland-United States Investigation of NIDDM Genetics (FUSION) Tissue Biopsy Study.¹³ This cohort includes skeletal muscle biopsies from 287 Finnish individuals with matched genotyping and single-nucleus multiomic profiling (snATAC-seq and snRNA-seq).^13,21 Because FAPs constitute only a small fraction of cells in muscle biopsies, we used induced pluripotent stem cells (iPSCs) as an alternative platform. We differentiated iPSCs from 30 FUSION participants into FAPs and exposed them to a high-insulin environment, a metabolic milieu characteristic of T2D. We then performed single-nucleus multiomic profiling to characterize gene regulatory mechanisms across basal and high-insulin conditions. We found that a high-insulin state promotes the formation of the adipogenic FAP subtype and that chromatin accessibility peaks across FAP subtypes are differentially enriched for GWAS signals, establishing the relevance of this model system to T2D and related traits.

Results:

FAP abundance in skeletal muscle biopsy and gene expression is associated with T2D status and FAP subtypes.

We used skeletal muscle single-nucleus gene expression and chromatin accessibility profiles from 287 individuals in the FUSION Tissue study¹³ (Figure 1A) and identified 13 distinct cell types using the gene expression modality, of which 4.6% were annotated as FAPs (Figure 1B). We investigated whether FAP abundance was associated with donor traits including age, BMI, fasting plasma glucose, fasting serum insulin, sex, and T2D status. We modeled FAP counts across individuals while accounting for batch effects and donor traits using a negative binomial generalized linear model (NB-GLM) as implemented in DESeq2 (Figure 1C). Our analysis showed that increased FAP abundance was associated with higher fasting insulin, lower fasting plasma glucose levels, male sex, and T2D status (FDR<0.05).

Figure 1: — A – Schematic of the FUSION Tissue Study, a study of 287 Finnish individuals’ skeletal muscle biopsies.

B – UMAP of cell types from the FUSION Tissue Study, fibro-adipogenic progenitors (FAPs) make up 4.6% of the nuclei based on gene expression.

C – Comparison of FAP abundance with donor metadata, red indicates association with a decrease in FAP abundance and blue indicates an association with an increase in FAP abundance. DESeq2 was used for analysis and significance. Data represented as mean ± SEM.

D – A diagram of the neighborhood workflow.

E – UMAP of FUSION data indicating neighborhoods and colored by log2 fold change of differential abundance of T2D status.

F – Beeswarm plots colored by log2 fold change of differential abundance for specifically the FAPs.

G – Heatmap of differentially expressed genes within FAP neighborhoods based on differential abundance of T2D status.

H – Clustering of FUSION FAPs revealed progenitors, fibrogenic, and adipogenic FAP subtypes.

I – Multiple genes identified in G are revealed to have differential expression amongst the FAP subtypes.

J – UCSC screenshot of FUSION skeletal muscle biopsy cell types, with the FAPs additionally separated by subtype. The promoter peak at *MME* has significantly different levels of chromatin accessibility amongst the FAP subtypes (DESeq2, normalized counts per 10 million reads).

We next asked whether analyzing all cell types within a skeletal muscle biopsy, agnostic of discrete cell-type cluster annotations, would recapitulate the observed FAP differential abundance. Using Milo²², we constructed a k-nearest neighborhood (knn) graph from the snRNA-seq data, assigned nuclei into 7,332 partially overlapping neighborhoods, and performed differential abundance testing across neighborhoods using a NB-GLM framework.²² We identified FAP-enriched neighborhoods with significantly (p<0.05) increased association with T2D status (80 neighborhoods), decreased fasting plasma glucose (49 neighborhoods), and association with sex (94 neighborhoods) (Figures 1E and F), consistent with our results shown in Figure 1C.

To further elucidate distinctions among FAP neighborhoods, we investigated which genes that were differentially expressed between neighborhoods with differential abundance and those without differential abundance. Specifically, we analyzed the top 2,000 most variable genes within the FAP neighborhoods, and identified the ten most significantly differentially expressed genes for each trait based on neighborhood-based differential abundance. In FAP neighborhoods with significantly higher T2D prevalence, we identified lower expression of HSPG2, COL6A3, LAMA2, MME, ARHGAP24, COL15A1, SCN7A, and LRRTM4 and higher expression of RSPO3 (FDR<0.05, Figure 1G, Figure S1). Among the ten most significantly differentially expressed genes, three - HSPG2, COL6A3, and RSPO3 have been previously implicated in the development of T2D and diabetic neuropathy.^23–27 Notably, MME is a regulator of cell growth and adipogenic differentiation.²⁸ These observations reinforce the association between FAP abundance and T2D status and fasting plasma glucose, and highlight the potential contribution of FAP functional states, such as adipogenic differentiation, to disease-relevant phenotypes.

We next investigated whether in vivo FAPs comprised distinct functional subtypes. By clustering the FUSION FAP snRNA data, we identified three subtypes: fibrogenic, adipogenic, and progenitors, consistent with findings in previous studies^15–17,19 (Figure 1H, Figure S1K). Notably, genes that were differentially expressed in FAP neighborhoods with high T2D-occurrence, including LAMA2, COL6A3, and MME (Figure 1G) also showed differential expression across FAP subtypes (Figure 1I). In addition to gene expression, we assessed the chromatin accessibility of the FAP subtypes and observed distinct FAP chromatin profiles, including at the promoters of key marker genes involved in FAP differentiation (Figure S1L–N). In particular, MME was not only differentially expressed among FAP neighborhoods and FAP subtypes, but also exhibited significant differences in chromatin accessibility profiles between the FAP subtypes (FDR<0.05; DESeq2). Specifically, chromatin accessibility differed between adipogenic and fibrogenic subtypes (nominal p-value = 3e-19), adipogenic and progenitor subtypes (nominal p-value = 2e-31), whereas no significant difference was observed between fibrogenic and progenitor subtypes (nominal p-value = 0.2) (Figure 1J). Collectively, these findings suggest that FAP-specific gene regulatory mechanisms contribute to T2D etiology. Further investigation will enable us to pinpoint the pathways and FAP subtypes involved, offering deeper insight into their roles in the development of T2D.

Induced pluripotent stem cell lines can be differentiated to fibro-adipogenic progenitors.

We had access to 30 iPSC lines from individuals in the FUSION Tissue study. We differentiated all 30 iPSC lines into FAPs (Figure 2A, see Methods). We monitored cellular morphology throughout the differentiation process by capturing images every hour. Initially, the iPSCs were small and formed compact colonies; as the differentiation progressed, the cells transitioned to the elongated morphology characteristic of FAPs (Figure 2B–D). We further assessed the differentiations using flow cytometry to measure expression of an iPSC (Tra-1–60) and a FAP marker (NT5E) over the course of the differentiation (Figure 2E, Figure S2C). We observed a progressive decrease in expression of the iPSC marker concomitant with an increase in expression of the FAP marker gene, as shown for four pilot differentiations using pools of ten lines. Representative examples from individual lines are shown in Figures S2A and B. Together, these results demonstrated a robust model system for studying FAP biology across many individuals.

We further explored these differentiations by examining the chromatin accessibility in iPSC-derived FAPs and assessing their similarity to in vivo FAPs. To this end, we generated single-nucleus chromatin accessibility (snATAC) data for a pooled sample of 10 lines at three timepoints during differentiation: early (day −2), intermediate (day 6), and late (day 21). To compare the chromatin accessibility profiles of the differentiated iPSC-FAPs with those of in vivo skeletal muscle cell types, we performed a logistic regression using ATAC peaks from the terminal time point and corresponding peaks from the cell types profiled in FUSION skeletal muscle biopsies (Figure 2F).¹⁴ We observed that the iPSC-FAPs were the most similar to in vivo FAPs compared with other skeletal muscle cell types (Figure 2G).

We next evaluated which differentiation stage of iPSC-FAPs most closely resembled biopsy-derived FAPs by computing Jaccard similarity scores using the top 200,000 most significant peaks for each iPSC-FAP timepoint, ranked by p-value. As expected, peak overlap increased progressively with differentiation stage, the early time point (day −2) showed the fewest overlapping peaks (41,230), increasing to 77,704 peaks at the terminal time point (day 21, Figure 2H). This corresponds to the emergence of at least 36,474 new, shared accessible regions over the course of the differentiation. For example, we observed a chromatin accessibility peak (blue box; chr4:12,884,865–12,885,328) present at early stages that gradually diminished as differentiation advanced (Figure 2I). Conversely, a nearby peak (red box; chr4:12,889,003–12,889,441) emerged during differentiation and ultimately recapitulated the accessibility profile seen in biopsy-derived FAPs (Figure 2I); examples of individual lines are shown in Figure S2D.

iPSC-FAPs respond to insulin stimulation.

To confirm the identities of the 30 iPSC lines used for FAP data generation, we performed low-pass whole-genome sequencing, which verified that all lines matched their respective donor reference data with high confidence (Figure 3A). We then differentiated all 30 lines into FAPs and evaluated the proportion of cells expressing FAP marker genes at the terminal time point (day 21). An average of 74% of cells expressed the FAP marker gene NT5E based on FACS analysis, confirming that the majority of lines successfully differentiated to FAPs (Figure 3B).

To assess insulin responsiveness, we subjected all 30 iPSC-derived lines to an insulin-stimulated glucose uptake assay (Figure 3C, Figure S3A). In response to the dose of insulin, we observed a significant increase in luminescence indicating enhanced glucose uptake (Figure 3D, Figure S3B). Next, we examined how donor-specific metadata traits such as age, BMI, FACS-based differentiation efficiency, sex, and T2D status correlated with cellular glucose uptake using a linear mixed model (Figure 3B). We found that BMI is inversely correlated with glucose uptake under basal conditions (p<0.05, Figure 3E and G) consistent with previous linking higher BMI to impaired glucose uptake.^29,30 Under high-insulin conditions, male sex emerged as a significant predictor of decreased glucose uptake, but only following insulin stimulation (p<0.05, Figure 3F and H). Previous studies have suggested that men are more likely to develop insulin resistance than women, and subsequent T2D.³¹ These results highlight how changes in stimulatory context between basal and high-insulin environments modulate cellular responses to the same stimulus. Overall, our present experiment provides line-specific physiological response data, confirms differential responses to environmental perturbations, and establishes a foundation for identifying associated gene regulatory mechanisms underlying insulin responsiveness.

The cell village of 30 donors demonstrates responsiveness to high-insulin state.

Maintaining all 30 iPSC-FAP lines in separate plates can result in plate-specific effects that may confound downstream analyses; therefore, we pooled cells from all 30 individuals into a single “cell village” to minimize this source of technical variation.³² The cell village included 12 males and 18 females, including 12 individuals with T2D and 18 non-T2D controls (Figure 4A). We then divided the cell village into two pools, one maintained under basal conditions and the other exposed to a high-insulin environment. To assess the representation of each donor within the cell village prior to omics profiling, we performed low-pass whole genome sequencing (WGS) and applied Census-Seq to quantify donor representation (Figure 4B). Donor representation across basal and high-insulin environments did not differ with at least 25 individuals each contributing ≥1% of nuclei per sample, enabling robust comparisons between conditions.

Figure 4: — A – Schematic of the 30 donors included in the cell village.

B – Bar plot of line representation in basal and high-insulin states as indicated by color and assessed by light whole genome sequencing of the cell village.

C – Bar plot of line representation in basal and high-insulin states as indicated by color and assessed by sn-multi-omics data.

D – Clustering of the snRNA data, labelled by environment.

E – UCSC screenshot of an FAP marker gene for basal ATAC, basal RNA, insulin ATAC, and insulin RNA modalities (normalized counts per 10 million reads).

F – Volcano plot of gene expression comparing insulin conditions to basal as assessed by MAST.

G – Gene set enrichment pathways using KEGG analysis within the differential gene expression data comparing the high-insulin state to the basal state.

H – UCSC browser screenshot of an extended view of a differential peak (left side). This peak is near the gene *DSP* (ENSG00000096696), which is known to be involved in the cytoskeleton in muscle cells and is included in Table S1 and in the pathways in Fig 4G (normalized counts per 10 million reads).

I – A volcano plot comparing differential peaks between high-insulin and basal states.

J – Plot of log2 SMAD3 motif enrichment (percent of reads with motif in target sequences divided by percent of reads with motif in background sequences) in differential peaks colored by subset of peaks (basal or insulin). Significance determined by both HOMER analysis of motif enrichment and Fisher’s exact test of counts.

Next, we performed single-nucleus gene expression and chromatin accessibility profiling using the 10x Genomics Epi Multiome ATAC + Gene Expression platform to investigate transcriptional and regulatory features of the iPSC-FAPs across different environmental conditions. We used demuxlet to assess donor representation in the single nucleus data and observed similar representation levels (Figure 4C), consistent with previous estimates from WGS-based donor quantification (Figure S4A). After quality control, we jointly clustered 1,990 high-quality basal nuclei and 3,038 high-insulin nuclei, which integrated homogeneously across conditions (Figure 4D). To assess similarity between iPSC-FAPs and in vivo FAPs at the transcriptional level, we computed Spearman correlations on normalized TPM values. We observed the highest correlation between the iPSC-FAPs and in vivo FAPs, indicating strong concordance in gene expression profiles (Figure S4B). We further evaluated whether canonical FAP marker genes were robustly expressed and exhibited accessible promoter chromatin in iPSC-FAPs. FAP marker genes showed high expression and open promoter chromatin (Figure 4E, Figure S4C and D), confirming that the cell village recapitulated expected FAP molecular features.

To explore differences in gene expression between the basal and high-insulin environments, we performed differential gene expression analysis, identifying 738 up-regulated genes (higher expression in high-insulin compared with basal conditions) and 899 down-regulated genes (Figure 4F). Up-regulated genes were enriched for pathways including “insulin resistance,” “PI3K-Akt signaling pathway,” and “ECM-receptor interaction” (FDR<0.05), consistent with previous findings characterizing cellular responses to insulin.^33–35 We also observed enrichments of the “cytoskeleton in muscle cells” pathway among up-regulated genes, suggesting continued differentiation under high-insulin conditions. Conversely, downregulated genes were enriched for “spliceosome” and “oxidative phosphorylation” pathways (Figure 4G, full results in Table S1). Reduced oxidative phosphorylation is a well-established feature of high-insulin states,³⁶ and prior work has shown that spliceosome components are misregulated - and specifically downregulated - in T2D, particularly in pancreatic beta cells.³⁷ Additional GO term enrichment analysis, presented in the supplemental data, were consistent with KEGG pathway results (Figure S4E).

Next, we used a Negative Binomial Generalized Linear Mixed Model (NBGLMM) to identify differentially accessible chromatin between basal and high-insulin environments (Figure 4H). We identified 1,958 (0.9%) peaks with greater accessibility in the basal state, and 2,385 (1%) peaks with greater accessibility under high-insulin conditions (FDR<0.05, Figure 4I). Notably, a regulatory element near DSP, a gene involved in the cytoskeleton in muscle cells KEGG pathway (Table S1), showed increased accessibility in the high-insulin condition (Figure 4H). To identify transcription factors that may mediate insulin-responsive chromatin remodeling, we tested whether differentially accessible chromatin peaks were enriched for known transcription factor binding motifs. We identified 139 motifs significantly enriched in peaks more accessible under high-insulin conditions and 151 peaks more accessible under basal conditions. (FDR<0.05, Table S2 and S3). Peaks with more accessible chromatin under high insulin were strongly enriched for the SMAD3 binding motif (Figure 4J, Table S2, HOMER p = 1e-12; Fisher’s exact test comparing basal vs. insulin enrichment, p = 9e-13). Consistent with this finding, SMAD3 expression was also upregulated under high-insulin conditions (Figure 4F). Given the established role in adipogenesis,³⁸ these results suggest that SMAD3 may link insulin exposure to the adipogenic fate in iPSC-FAPs. Taken together, these analyses established a high-quality multi-omics dataset for dissecting how iPSC-FAPs respond to high-insulin through coordinated gene expression and chromatin accessibility.

The high-insulin state induces the FAPs to cultivate the adipogenic subtype.

We also explored the FAP subtypes in our population of iPSC-FAPs. Consistent with the FUSION FAPs, we identified three distinct sub-populations characterized by the expression of E2F1, SEMA3C, and THY1 markers (Figure 5A, Figure S5A–C). Additionally, we performed pseudotime analysis to further refine subtype identification by leveraging comparisons between nuclei in the gene expression modality to assemble trajectories of development. This analysis revealed developmental trajectories corresponding to FAP progenitors, adipogenic cells, and a fibrogenic cluster (Figure 5B). Interestingly, when pseudotime analysis was conducted separately on basal and high-insulin samples, we observed that the basal condition comprised only progenitor and fibrogenic subtypes, while the high-insulin condition encompassed progenitor, fibrogenic, and adipogenic subtypes (Figure S5D and E).

Figure 5: — A – Clustering of the snRNA data, labelled by subtype.

B – Pseudotime trajectory within the clustering. The FAP subtypes of progenitors, adipogenic, and fibrogenic are labelled.

C – UMAP of iPSC-FAP data indicating milo neighborhoods and colored by log fold change of differential abundance of environment (basal or insulin).

D – Beeswarm plot from Milo colored by log2 fold change of differential abundance for environment.

E – UCSC screenshot of iPSC-FAP subtypes. The promoter peak at *MME* (same locus as Fig 1J) has different levels of chromatin accessibility amongst the FAP subtypes (normalized counts per 10 million reads).

F – Plot of enrichment by FAP subtype chromatin accessibility peaks, looking at Suzuki et al., Nature, 2024 loci. Color indicates sample (basal or high-insulin).

G – UCSC screenshot showing the *LTBP3* locus along with 50 kb to either side. The dark yellow line indicates the location of rs3814707.

H – UCSC screenshot showing the *LTBP3* locus and its location on chromatin accessibility peaks of the iPSC-FAP subtypes. Dark yellow line indicates location of rs3814707, dark yellow dashed line shows proposed chromatin looping with p-value from SCENT analysis.

I – UCSC screenshot showing the *LTBP3* locus and its location on chromatin accessibility peaks of the iPSC-FAP subtypes. The dark yellow line indicates the location of rs3814707.

J – Bar plot showing p-values of correlation between chromatin accessibility peak shown in 5H and expression of genes within 50 kb based on SCENT analysis (two genes were removed from analysis due to low expression in iPSC-FAPs) in the basal sample and high-insulin sample. The orange dashed line indicates p-value = 0.05.

Nuclei from the basal and high-insulin samples integrated well; however, the relative contributions of nuclei from each condition differed across subtypes. We therefore hypothesize that the high-insulin sample would contain more nuclei within the adipogenic FAP subtype. Indeed, the neighborhood analyses on the joint clustering of basal and insulin samples revealed that adipogenic neighborhoods were enriched for high-insulin nuclei (Figure 5C and D). Furthermore it showed that the adipogenic subtype contained significantly more high-insulin than basal nuclei (binomial test, p-value = 4e-10, Figure S5F). These findings are consistent with previous studies showing that insulin serves as a key trigger for the adipogenic pathway in FAPs.^41,42 Collectively these findings highlight how changes in the cellular environment can alter FAP differentiation trajectories, with the high-insulin conditions favoring adipogenic differentiation. We also identified chromatin accessibility peaks specific to each FAP subtype. Interestingly, the pronounced peak identified in the adipogenic subtype in Figure 1J was successfully recapitulated in our iPSC-FAPs (Figure 5E). Together, these results demonstrate that iPSC-FAPs not only closely resemble in vivo FAPs as a whole, but also accurately reproduce distinct FAP subtypes, underscoring the robustness of iPSC-FAPs as a model system for studying human FAP biology.

To assess the overlap of iPSC-FAP subtype-specific chromatin accessibility peaks with existing T2D GWAS data, we performed an enrichment analysis which revealed that only the adipogenic subtype is significantly enriched for T2D GWAS signals (Figure 5F).^4,43 These findings highlight the particular relevance of the adipogenic subtype to T2D, and underscore the importance of studying cells under varied environmental conditions. Notably, without the addition of insulin to the cellular environment, our samples would contain significantly fewer nuclei belonging to the adipogenic cluster (p-value = 4e-10, Figure S5F), limiting our ability to detect this enrichment. Consistent with this observation, when the basal and high-insulin FAP subtypes were tested separately for enrichment only high-insulin adipogenic FAPs showed significant enrichment for T2D GWAS signals (Figure 5F).

Within this subset of peaks overlapping a T2D GWAS loci, the rs3814707 signal emerged as a notable example. This signal is located within 50 kb of seven genes (Figure 5G). To determine possible gene interactions, we leveraged co-accessibility analyses from in vivo FAPs, which indicated that this peak is connected to the nearby gene LTBP3.¹³ Additionally, using SCENT we assessed correlations between chromatin accessibility at this peak and expression of genes within 50 kb. Among these genes, only LTPB3 showed a significant association with accessibility at this locus, with proposed chromatin looping shown (Figure 5H). Notably, this GWAS signal resides at the summit of an adipogenic subtype-specific accessibility peak (Figure 5I). Importantly this association was only significant when testing high-insulin nuclei (Figure 5J), under basal conditions, while no genes within 50 kb are significantly associated with this peak (Figure 5J). LTBP3 is known to regulate the extracellular matrix and cellular growth, and previously has been implicated in the regulation of adipogenic differentiation in FAPs.^44,45

Discussion:

This study establishes two main findings. First, we demonstrate that iPSC-derived FAPs faithfully recapitulate the molecular and cellular properties of in vivo FAPs isolated from skeletal muscle. This validation is uniquely rigorous, because our iPSC lines and muscle biopsies were derived from the same individuals, enabling direct donor-matched comparisons that have not been done in previous studies. Second, we show that the adipogenic subtype of FAPs is enriched for T2D GWAS signals, and that this enrichment emerges only when cells are exposed to a high-insulin environment. Together, these findings establish iPSC-FAPs as a robust model system for studying FAP biology and demonstrate how disease-relevant regulatory mechanisms can be uncovered through context-specific profiling.

The donor-matched design of our data allowed us to benchmark iPSC-FAPs against in vivo FAPs with unprecedented precision. iPSC-FAPs closely resemble their in vivo counterparts in morphology, marker gene expression, and chromatin accessibility profiles. Critically, the same major FAP subtypes - progenitors, adipogenic, and fibrogenic - are identifiable in both systems using the same marker genes. This molecular and cellular concordance demonstrates that iPSC differentiation captures the essential features of FAP biology, rather than merely producing a superficial resemblance. Our data resolve this uncertainty and provide a foundation for using iPSC-FAPs to investigate aspects of FAP biology that are impractical to study using primary tissue, including responses to controlled environmental perturbations.

Beyond establishing a robust model system, this study reveals that cellular context is critical for identifying disease-relevant regulatory mechanisms. The adipogenic FAP subtype showed significant enrichment for T2D GWAS signals only under high-insulin conditions, while this enrichment was not detectable at baseline. iPSC-FAPs are functionally responsive to insulin, as evidenced by increased glucose uptake and a shift in subtype composition toward the adipogenic fate. This state dependence is consistent with evidence from other systems that regulatory elements can exist in “primed” configurations, enabling context-specific transcriptional responses.²⁰ Our findings reinforce a broader principle: GWAS interpretation efforts focused exclusively on baseline cellular states may overlook regulatory mechanisms active only under disease-relevant conditions.

FAP abundance in skeletal muscle is associated with T2D status, fasting glucose, and fasting insulin, consistent with a role for this cell population in disease etiology. Our results support a model in which T2D risk variants act through FAPs to promote adipogenic differentiation in response to elevated circulating insulin levels, demonstrated by the specific enrichment of the adipogenic iPSC-FAP subtype under high-insulin conditions. One locus of particular interest is rs3814707, where a T2D GWAS signal lead SNP resides within a chromatin accessibility peak in the adipogenic subtype and chromatin looping analyses indicate this peak interacts with the gene LTBP3. Although LTBP3 has not been directly linked to T2D prior to this study, we propose that such variants may bias FAP differentiation toward adipogenesis under high-insulin conditions, thereby contributing to increased intramuscular adipose tissue in individuals with T2D. This model generates testable hypotheses regarding how non-coding variants influence T2D pathophysiology through effects on progenitor cell fate decisions.

Our study has some limitations. Although iPSC-derived FAPs closely resemble in vivo FAPs, they are not identical to primary cells. Our donor-matched design mitigates this limitation by enabling direct comparison to skeletal muscle biopsies from the same individuals; however, iPSC-FAP findings should ultimately be validated in primary tissue where feasible. Additionally, all participants in our study are of Finnish ancestry, and our findings may not generalize to other populations. Extending this work to cohorts of diverse ancestries will be important for external validity and assessing broader applicability.

Several questions remain for future investigation. Future studies should investigate the downstream transcriptional consequences of GWAS-implicated regulatory variants in FAPs to elucidate the pathways through which they contribute to the disease development. It will also be of interest to explore additional stimulatory conditions to determine whether other FAP subtypes exhibit GWAS enrichment in distinct environmental conditions. The iPSC-FAP platform we established here provides a tractable system for addressing these questions through systematic profiling across diverse environmental contexts.

Lead Contact

Requests for further information and resources should be directed to by Stephen C. J. Parker (scjp@umich.edu).

Materials Availability

This study did not generate new unique reagents.

Methods

EXPERIMENTAL MODEL AND STUDY PARTICIPANT DETAILS

FUSION Tissue Study Individuals

The FUSION Tissue Study consists of 287 individuals aged 40 to 80 years, including 123 female participants, all of whom are of Finnish ancestry. The FUSION study was approved by the coordinating ethics committee of the Hospital District of Helsinki and Uusimaa and written informed consent was obtained from all participants.

From this cohort, a subset of 30 was selected for the cell village and iPSC experiments. The age range of this subset was 51–69 years and included 12 male and 18 female participants. We included sex as a factor in all analyses ran.

iPSC Growth Conditions

We grew induced pluripotent stem cell (iPSC) lines on 6 well Matrigel plates with mTeSR Plus (Stem Cell Technologies) media replaced on a daily basis. Cell lines were passaged once a week by incubating the wells in 1 mL of Versene and either incubating at room temperature for ten minutes or 37 °C for 5 minutes. Cells were then scraped off of the plates and added at a 1:3 ratio to the new plates. Cells were incubated at 37 °C, 5% CO2 overnight.

QUANTIFICATION AND STATISTICAL ANALYSIS

P-values are reported within figures, with values <0.05 indicating statistical significance for correlations and differences. Details of statistical methods used and sample sizes are provided in the figure legends, and plots display mean ± SEM. Sex was included as a covariate in all analyses.

METHOD DETAILS

Percent FAP per skeletal muscle biopsy analysis

In order to compare changes in the number of nuclei per FAPs in relation to donor traits in the FUSION Tissue study skeletal muscle biopsies, DESeq2 (v. 1.42.1) was used. We used raw nucleus counts per cell type as input and the abundance of FAPs was then compared to donor metadata variables. The model included batch, sex, BMI, age, T2D status (all donors were encoded as a one if diagnosed with T2D and zero otherwise), fasting glucose, and fasting insulin.

Neighborhood analysis

To perform neighborhood analyses, MiloR (v. 2.1.3) was used.²² The model used included age, BMI, batch, fasting glucose at time of biopsy, fasting insulin at time of biopsy, T2D status (1 if an individual was diagnosed with T2D at time of biopsy, 0 otherwise), and sex. In order to identify differentially expressed genes between neighborhoods, we used Milo’s function testDiffExp and pinpointed genes whose expression levels were associated with the metadata of interest. The testDiffExp function tests this association by using a negative binomial GLM from edgeR. Milo’s function plotNhoodExpressionGroups was then used to plot the top 10 most significantly differently expressed genes (based on adjusted p-values) that differ between neighborhoods with differential abundance within the FAPs.

Differential Chromatin Accessibility Across in vivo FAP Subtypes

Differential peaks in Figure 1J were compared using DESeq2 (v. 1.42.1). Once the FAP subtypes were identified, we generated a consensus peak set by merging the narrowPeak files from all three subtypes. Specifically, we used bedtools⁴⁶ (v. 2.30.0) to consolidate overlapping features into a single region that encompasses the combined peaks. We then used featureCounts⁴⁷ (v. 2.0.3) to quantify the read counts in peaks per individual per subtype. This was used as input for DESeq2, where a paired analysis was performed, and the MME promoter peak was examined.

Differentiation of iPSC-FAPs

We differentiated iPSCs to FAPs by plating and growing as iPSCs for 2 days (day −2 to day 0). Mesenchymal Induction media (Stem Cell Technologies) was then added, and cells were grown in that for 4 days (day 0 to day 4). After that cells were then grown in FAP media (86% alpha-MEM, 13% FBS, and 1% L-glutamine). On day 6, cells are passaged to MatrixPlus plates (StemBioSys). Cells are grown in FAP media and on MatrixPlus plates until day 21 when they are then considered to be iPSC-FAPs. FAPs are passaged using the ACF Cell Dissociation Kit (Stem Cell Technologies).

FACS analysis

For flow cytometry analysis, we dissociated cells using the previously mentioned passaging methods. Cells are put through a 70 um filter and divided into a negative control and a stained sample. Cells are then centrifuged down, media is aspirated, and the positive sample is stained with antibodies for Tra-1–60 (iPSC marker) (Miltenyi Biotec Cat# 130–122-921, RRID:AB_2801969) and NT5E (FAP marker) (Miltenyi Biotec Cat# 130–111-913, RRID:AB_2784275) for 10 minutes. Cells are centrifuged down and then fixed in 3% PFA for 15 minutes before getting resuspended in PBS. All reagents are kept on ice throughout until FACS data collection.

Nuclei isolation and snATAC data generation

We pooled 10 iPSC lines together for a joint differentiation and split the cells into two replicates after day 6. Cells were frozen down into cryovials of 500,000 cells. For nuclear isolation, cells were thawed and pelleted at 500 RCF at 4°C for 5 min in a fixed-angle centrifuge. Cells were then resuspended in 50 ul of ATAC-Resuspension Buffer (RSB) (1M Tris-HCl pH 7.4, 5M NaCl, 1M MgCl2, and water to 50 mL) containing 0.1% NP40, 0.1% Tween-20, and 0.01% Digitonin and pipetted up and down 3 times before incubating on ice for 3 minutes. ATAC-RSB was then washed out with 1 mL of ATAC-RSB containing 0.1% Tween-20, and tube was inverted 3 times to mix. Nuclei are then pelleted at 500 RCF for 10 min at 4°C in a fixed-angle centrifuge, then resuspended in 1000ul PBS + 0.1% Tween-20 and 1% BSA with a 1:1000 dilution of 7-AAD and incubated for 30 minutes on ice before sorting with a Miltenyi Tyto machine. Five thousand nuclei were then loaded onto the 10x platform in two replicates to generate snATAC data using the 10x multi-omics kit.

Comparing chromatin accessibility profiles with a Jaccard statistic

To compare chromatin accessibility peaks between the iPSC-FAPs and in vivo FAPs, we used bedtools⁴⁶ (v. 2.30.0) function “bedtools jaccard” to calculate the Jaccard similarity statistic. For this analysis, we restricted the iPSC-FAPs to the top 200,000 strongest peaks based on p-value ranking within the broadPeak file and defined intersection as peaks overlapping at least 75% of the iPSC-FAP input peak (bedtools jaccard -f 0.75).

Comparing chromatin accessibility profiles with a logistic regression

We used a logistic regression approach to compare cell-type-specific chromatin accessibility profiles between iPSC-FAPs from the eight most abundant individuals at the end time point and corresponding peaks from the cell types from the FUSION skeletal muscle biopsies. For each of the 8 individuals, we took the day −2 peaks (called separately on each individual) and day 21 peaks (called separately for each individual) and generated a master peak list by merging them (taking the union, peaks called using MACS2 v. 2.2.4). We then filtered to TSS-distal master peaks (peaks not overlapping the 5kb window upstream of a gene) and restricted to master peaks present in only one of the two timepoints. Next, for each of the 13 FUSION cell types, we fit a logistic regression model:

l o g i t (p (x)) = β + β x

where $p (x)$ is the probability that the master peak is accessible at day 21, and $x$ is the probability that the master peak overlaps a peak from the fusion cell type. We then divided the 13 coefficients by the maximum coefficient to obtain get a peak “similarity score” capped at 1.

Insulin-stimulated glucose uptake assay

To investigate the response of the FAPs to insulin, we used an insulin-stimulated glucose uptake assay. To do this, we incubated cells in 100 mM of insulin for 2 hours or kept them at basal levels (constituting the basal and high-insulin samples) prior to using the Promega insulin-stimulated glucose uptake kit. Each line was measured in triplicate in both of the stimulatory states. Lines were plated in replicates on four 96-well plates (20 lines were assessed in triplicate, 8 in duplicate, and 2 as singlets based on cell availability at the time). Within each environment, one plate was given an additional dose of 100 nM of insulin for 30 minutes before all wells were given fluorescent glucose, and luminescence was measured to assess glucose uptake, allowing us to compare glucose uptake in response to insulin under basal and high-insulin conditions.

Statistical analysis

After measuring luminescence levels, we subtracted the background (averaged from empty wells) from each well. We normalized luminescence levels to the number of cells, measured at plating, and performed inverse rank normalization. We used a paired t-test to assess the significance of the difference in fold change between samples that received additional insulin from the Promega kit or not.

Response to stimulatory state modeling

To compare normalized luminescence values to donor metadata variables, we used a linear mixed model. The model included age, BMI, differentiation efficiency (from the FACS analysis of that specific line’s differentiation), sex, and T2D status.

DNA isolation

We isolated DNA for light whole genome sequencing from cell lines using the DNeasy Blood & Tissue Kit (Qiagen). In brief, cells were frozen down in aliquots of 500,000 cells that were then thawed and centrifuged for 5 minutes at 300g. Pellet was resuspended in 200 ul PBS and Proteinase K (from kit). 200 ul of Buffer AL (from kit) was added before vortexing and incubating at 56 C for 10 minutes. 200 ul of ethanol was added before vortexing and adding the mixture to a DNeasy spin column. The spin column was centrifuged at 6000g for 1 pm, flow-through was discarded, and the spin column was placed in a new collection tube. 500 ul of Buffer AW1 (from the kit) was added before centrifuging for 1 minute at 6000g, flow through was again discarded, and a new collection tube was added. 500 ul of Buffer AW2 (from kit) was then added, and centrifuged for 3 minutes at 20,000g, flow-through was again discarded. The spin column was then placed in a 1.5 mL tube, 200 ul of water was added before incubating at room temperature for 1 minute, and then centrifuging at 6000g for 1 minute to elute the DNA.

Census-seq

For census-seq (v. 2.5.1) analysis,⁴⁸ we trimmed reads using cta (v. 0.1.2, https://github.com/ParkerLab/cta), then aligned to the hg38 reference genome using BWA (v. 0.7.17). We then marked duplicates using Picard Tools MarkDuplicates (v. 3.4.0, https://broadinstitute.github.io/picard/) and removed them prior to running Census-Seq, Roll Call, and CSI Analysis from the Census-Seq tools.⁴⁸ Specifically, census-seq’s Roll Call analysis was used to determine the fraction match of each individual iPSC line’s low pass whole genome sequencing to reference SNP Chip and Imputation data for each donor. This tool compares individual reads from whole-genome sequencing to the reference data for each donor and determines a fraction match based on how many reads of the whole genome sequencing match the reference.

Nuclei isolation and multi-omics data generation

The cell village was frozen down into aliquots of 200,000 cells. The cells were thawed on ice for 30 minutes before being transferred to 1 mL of pre-warmed alpha-MEM and centrifuged at 300 rcf for 5 minutes. The cells were then resuspended in 1 mL DPBS with 0.04% BSA before proceeding. To isolate nuclei, cells were pelleted at 500 rcf at 4 C for 5 minutes in a fixed-angle centrifuge. 600 ul of lysis buffer (10 mM Tris-HCl pH 7.4, 10 mM NaCl, 3 mM MgCl2, 0.1% NP40, 1% of 10% BSA, 0.1% Tween-20, 1 mM DTT, 50 mg/mL 7-AAD, 1 U/ul RNase inhibitor, and water to volume) was added before transferring to a pre-chilled dounce homogenizer and homogenizing with 5 strokes. The mixture was then incubated on ice for 7 minutes, with pipetting once per minute. Lysis was then washed out with 1.3 mL of wash buffer (1% of 10% BSA, 0.1% Tween-20, 1 mM DTT, 0.5 mg/mL 7-AAD, and DPBS to volume) before pelleting at 500 rcf for 5 minutes at 4 C in a swinging bucket centrifuge. Nuclei were resuspended in 0.1% Tween-20, 1% BSA, 1 mM DTT, 1 U/ul RNase Inhibitor, 1 mg/mL 7-AAD, and DPBS to the desired volume then incubated for 15 minutes. Due to issues with the Miltenyi Tyto machine, the basal sample was sorted through the Tyto, and the insulin sample was pelleted at 500 rcf for 5 minutes at 4 C in a swinging bucket centrifuge, then the nuclei were resuspended in 450 ul of wash buffer before spinning down at 500 rcf for 5 minutes in a fixed-angle centrifuge. The resuspension in wash buffer and the spin-down was repeated once more. Finally, both samples were resuspended in 1 mL nuclear permeabilization buffer (10 mM Tris-HCl pH 7.5, 10 mM NaCl, 3 mM MgCl2, 0.1% Tween-20, 0.1% IGEPAL, 0.01% Digitonin, and water to volume) before pipetting 10 times. From there, 10,000 basal nuclei and 20,000 insulin nuclei were loaded onto the 10x Genomics multi-omics kit for data generation.

Multi-omics analysis

snATAC-seq processing

The chromatin accessibility (ATAC) component of the multiome library was processed using the pipeline at https://github.com/porchard/snATACseq-NextFlow (commit b81ce91) and the TOPMed version of the hg38 reference genome (described here: https://github.com/broadinstitute/gtex-pipeline/blob/master/TOPMed_RNAseq_pipeline.md). In brief, we trimmed sequencing adapters using cta (v. 0.1.2) and mapped reads to hg38 with bwa⁴⁹ (v.0.7.15; bwa mem -I 200,200,5000 -M). Cell barcodes were corrected using a custom implementation of the barcode correction algorithm described on the 10X Genomics website. Duplicates were marked using picardtools (http://broadinstitute.github.io/picard/; v 2.26.9; MarkDuplicates READ_ONE_BARCODE_TAG=CB READ_TWO_BARCODE_TAG=CB VALIDATION_STRINGENCY=LENIENT). BAM files were filtered to properly mapped, non-duplicate autosomal read pairs using samtools (samtools view -h -b -f 3 -F 4 -F 8 -F 256 -F 1024 -F 2048 -q 30 chr{1..22}). Peaks were called using MACS2⁵⁰ (macs2 callpeak -t $bed --outdir . --SPMR -f BED -g hs --nomodel --shift −100 --extsize 200 -B --broad --keep-dup all, with a bed file generated using the bedtools⁴⁶ bamtobed command) and filtered against the ENCODE blacklist (https://www.encodeproject.org/files/ENCFF356LFX/). QC metrics were generated using ataqv⁵¹ (v. 1.5.0; --ignore-read-groups --nucleus-barcode-tag CB, with the ENCODE blacklist.

snRNA-seq processing

The gene expression (RNA) component of the multiome library was processed using the pipeline at https://github.com/porchard/snRNAseq-NextFlow (commit baea274). In brief, reads were mapped and gene count matrices generated using STARsolo⁵² (v. 2.7.10a, with hg38 fasta file and GENCODE v30-derived genome annotations from the TOPMed consortium (https://github.com/broadinstitute/gtex-pipeline/blob/master/TOPMed_RNAseq_pipeline.md); default options except --soloBarcodeReadLength 0 --outSAMattributes NH HI nM AS CR CY CB UR UY UB sM GX GN --outSAMtype BAM SortedByCoordinate --outSAMunmapped Within KeepPairs --soloType CB_UMI_Simple --soloUMIlen 12 --soloFeatures GeneFull_ExonOverIntron --soloUMIfiltering MultiGeneUMI --soloCBmatchWLtype 1MM_multi_pseudocounts --soloCellFilter None --soloCBwhitelist 10X-whitelist.txt). We removed ambient RNA using cellbender ⁵³ (v. 0.3.0, options --fpr 0.05) and emptyDrops⁵⁴, and calculated per-nucleus QC metrics using a custom python script.

Multiome QC

After processing the ATAC and RNA components as described above, we selected pass-QC nuclei by applying the following thresholds for each nucleus. For the basal sample, we required a minimum of 499 UMIs for the RNA modality, maximum 15 percent mitochondrial reads from the RNA modality, minimum 3000 properly paired and mapped non-duplicate autosomal reads where mapping quality is at or above 30 in the ATAC modality, minimum TSS enrichment of 2 in the ATAC modality, and minimum CellBender cell probability of 0.99 for the RNA modality. For the high-insulin sample, we required a minimum of 1445 UMIs for the RNA modality, maximum 15 percent mitochondrial reads from the RNA modality, minimum 6672 properly paired and mapped non-duplicate autosomal reads where mapping quality is at or above 30 in the ATAC modality, minimum TSS enrichment of 2 in the ATAC modality, and minimum CellBender cell probability of 0.99 for the RNA modality.

Multiome doublet detection

We relied on both the RNA and the ATAC modalities to detect doublets. The multiome library’s doublets detection was processed using the pipeline at https://github.com/porchard/Multiome-Doublet-Detection-NextFlow (commit 72df9ea). For both modalities we ran demuxlet⁵⁵ for detecting doublets and assigning singlets to donors. If a barcode was found to be a doublet based on either method or modality (or if the inferred donor identity differed between the ATAC and RNA modalities), it was removed prior to clustering. Clustering was performed using Seurat v5.⁵⁶ After subsetting to pass-QC nuclei and adding metadata, the nuclei were split into different layers based on sample (basal or insulin). scTransform was then used to normalize counts (vars.to.regress = c(“donor”, “age”, “sex”, “bmi”, “ogtt”). The layers were then integrated using Harmony. 10 PCs were included in the final clustering (any more would lead to donor-specific clusters) with a resolution of 0.5 used in FindClusters. An additional separate cluster of less than 100 nuclei was additionally removed to limit analyses to strictly FAPs. A binomial test was used to assess the composition of different subtypes for significance.

Pseudotime analysis

To assess pseudotime trajectories in the gene expression data of the iPSC-FAPs, Monocle3 (v. 1.3.7) was used.⁵⁷

Spearman correlation

To assess the similarity of the iPSC-FAPs to FUSION skeletal muscle cell types, TPM values were calculated from average counts per gene for each cell type. The comparison was then restricted to genes shared between the cell types, and a Spearman correlation was used to quantify similarities.

Differential gene expression

Differential gene expression was performed using MAST (v. 1.28.0) using a model that took into account donor metadata traits of sex, age, BMI, and T2D status.⁵⁸ Genes expressed in less than 0.35 of nuclei were removed prior to analysis. Pathway analysis was then conducted by clusterProfiler (v. 4.16.0).⁵⁹

Differential peak analysis

To identify differential chromatin accessibility peaks between the basal and high-insulin samples, a Negative Binomial Generalized Linear Mixed Model (NBGLMM) was used. First, peak lists were compiled using snapATAC2 (v. 2.7.0).⁶⁰ Differential peaks were then identified by the NBGLMM controlling for donor traits and testing for basal vs insulin samples (called “sample” in the below models) at an FDR level of 5% (using Storey Lab’s qvalue package, v. 2.15.0). Two models were ran, both with and without the variable of interest:

model_with <- as.formula(“PEAK ~ sample + sex + age + ogtt + bmi + tss_enrichment + offset(log(total_counts)) + (1|donor)”)

model_without <- as.formula(“PEAK ~ sex + age + ogtt + bmi + tss_enrichment + offset(log(total_counts)) + (1|donor)”)

After running the models, likelihood ratio tests identified differentially accessible peaks between the basal and high-insulin samples.

Motif enrichment analysis

Transcription factor motif enrichment was performed using HOMER (v. 4.11.1).⁶¹ The command findMotifsGenome.pl was used with default settings separately on bed files of peaks more accessible under basal and high-insulin conditions.

GWAS Enrichment analysis

To perform enrichment analyses comparing chromatin accessibility peaks of the iPSC-FAP subtypes to GWAS data, fgwas (v. 0.3.6) was used.⁴³ In this case, summary statistics from the 2024 T2D GWAS (specifically the multi-ancestry analysis) were used for the enrichment.⁴ The summary statistics were formatted as required by the program, and default parameters were used.

Chromatin looping analysis

To identify genes our peak of interest was likely interacting with, we used SCENT (v. 1.0.1).⁶² Genes within 50 kb of the selected peak were tested with covariates log(UMI), fraction mitochondrial reads, sample (basal/insulin), donor, sex, BMI, T2D status, and age.

Supplementary Material

Supplement 1

Supplemental Table 1: All significant pathways based on KEGG enrichment analysis.

media-1.docx^{(29.2KB, docx)}

Supplement 2

Supplemental Table 2: Enriched transcription factors for peaks more accessible under high-insulin conditions.

media-2.docx^{(43.4KB, docx)}

Supplement 3

Supplemental Table 3: Enriched transcription factors for peaks more accessible under basal conditions.

media-3.docx^{(44.3KB, docx)}

Supplement 4

Supplemental Figure 1

A – UMAP of FUSION data indicating neighborhoods and colored by log fold change of differential abundance of age.

B – Heatmap of differentially expressed genes within FAP neighborhoods based on differential abundance of age.

C – UMAP of FUSION data indicating neighborhoods and colored by log fold change of differential abundance of BMI.

D – Heatmap of differentially expressed genes within FAP neighborhoods based on differential abundance of BMI.

E – UMAP of FUSION data indicating neighborhoods and colored by log fold change of differential abundance of fasting glucose.

F – Heatmap of differentially expressed genes within FAP neighborhoods based on differential abundance of fasting glucose.

G – UMAP of FUSION data indicating neighborhoods and colored by log fold change of differential abundance of fasting insulin.

H – Heatmap of differentially expressed genes within FAP neighborhoods based on differential abundance of fasting insulin.

I – UMAP of FUSION data indicating neighborhoods and colored by log fold change of differential abundance of sex.

J – Heatmap of differentially expressed genes within FAP neighborhoods based on differential abundance of sex.

K – UMAP of FUSION FAPs, colored by marker genes for the fibrogenic, adipogenic, and progenitor subclusters.

L – UCSC screenshot of FUSION FAP subtypes at COL6A3 (normalized counts per 10 million reads). Promoter peak is indicated by red box.

M – UCSC screenshot of FUSION FAP subtypes at ARHGAP24 (normalized counts per 10 million reads). Promoter peak is indicated by red box.

N – UCSC screenshot of FUSION FAP subtypes at COL15A1 (normalized counts per 10 million reads). Promoter peak is indicated by red box.

media-4.jpg^{(771.4KB, jpg)}

Supplement 5

Supplemental Figure 2:

A – Flow cytometry examining an iPSC marker gene and a FAP marker gene over the course of one differentiation for one individual.

B - Flow cytometry examining an iPSC marker gene and a FAP marker gene over the course of one differentiation for one individual.

C – Flow cytometry examining an iPSC marker gene and a FAP marker gene over the course of four different differentiations of a pooled sample containing ten different individuals.

D - UCSC browser screen shot of chromatin accessibility data from the beginning (day −2), middle (day 6), and end (day 21) of a pooled sample of ten individuals across the FAP differentiation as well as two individual chromatin accessibility profiles from the mix. At the bottom is the chromatin accessibility data from the FUSION FAPs (normalized counts per 10 million reads).

media-5.jpg^{(341.1KB, jpg)}

Supplement 6

Supplemental Figure 3:

A - Alternate representation of the insulin stimulated glucose uptake assay.

B - Normalized luminescence from insulin stimulated glucose uptake assay for single donor in basal, basal with a small dose of insulin, high-insulin, and high-insulin with a small additional dose of insulin environments.

media-6.jpg^{(313.8KB, jpg)}

Supplement 7

Supplemental Figure 4:

A – Scatterplot comparing line representation from light WGS (Figure 4B) and line representation from single-nucleus multi-omics (Figure 4C) of cell village.

B – Heatmap of Spearman correlation comparing gene expression TPM values between the iPSC-FAPs and the FUSION skeletal muscle cell types.

C – UCSC screenshot of the FAP marker gene FBN1 for basal ATAC, basal RNA, insulin ATAC, and insulin RNA modalities (normalized counts per 10 million reads).

D – UCSC screenshot of the FAP marker gene PDGFRA for basal ATAC, basal RNA, insulin ATAC, and insulin RNA modalities (normalized counts per 10 million reads).

E – Gene set enrichment pathways using GO term analysis within the differential gene expression data comparing the high-insulin state to the basal state.

media-7.jpg^{(402KB, jpg)}

Supplement 8

Supplemental Figure 5:

A – Violin plot of a marker gene of FAP progenitor subtype, p-value from MAST differential expression analysis.

B – Violin plot of a marker gene of FAP fibrogenic subtype, p-value from MAST differential expression analysis.

C – Violin plot of a marker gene of FAP adipogenic subtype, p-value from MAST differential expression analysis.

D – Basal sample pseudotime analysis with subtypes labelled.

E – Insulin sample pseudotime analysis with subtypes labelled.

F – Table of sample composition by subtype, with binomial test p-values comparing the make-up of the subtype to the whole sample.

media-8.jpg^{(341.9KB, jpg)}

Acknowledgements

The authors acknowledge several undergraduate researchers who helped cultivate and differentiate the iPSC-FAPs: Usman Siddiqui, Ashna Atukuri, Chelsea Kehoe, Benjamin Means, and Elena Wood.

We thank the Advanced Genomics Core at the University of Michigan for their assistance in generating the data used throughout this study.

CV was supported by T32GM149391 (Michigan Predoctoral Training in Genetics).

This research was supported by National Institutes of Health (NIH) grants F31DK135342, DK062370, and ZIAHG000024. The content is solely the responsibility of the authors and does not necessarily represent the official views of the NIH.

Footnotes

Declaration of Interests

SCJP has a research grant from Pfizer.

References

1.Thévenod F. (2008). Pathophysiology of Diabetes Mellitus Type 2: Roles of Obesity, Insulin Resistance and Β-Cell Dysfunction. Diabetes Cancer 19, 1–18. 10.1159/000152019. [DOI] [Google Scholar]
2.Tremblay J., and Hamet P. (2019). Environmental and genetic contributions to diabetes. Metabolism 100, 153952. 10.1016/j.metabol.2019.153952. [DOI] [Google Scholar]
3.Khan M.A.B., Hashim M.J., King J.K., Govender R.D., Mustafa H., and Al Kaabi J. (2020). Epidemiology of Type 2 Diabetes – Global Burden of Disease and Forecasted Trends. J. Epidemiol. Glob. Health 10, 107–111. 10.2991/jegh.k.191028.001. [DOI] [PMC free article] [PubMed] [Google Scholar]
4.Suzuki K., Hatzikotoulas K., Southam L., Taylor H.J., Yin X., Lorenz K.M., Mandla R., Huerta-Chagoya A., Melloni G.E.M., Kanoni S., et al. (2024). Genetic drivers of heterogeneity in type 2 diabetes pathophysiology. Nature, 1–11. 10.1038/s41586-024-07019-6. [DOI] [Google Scholar]
5.Mahajan A. (2022). Multi-ancestry genetic study of type 2 diabetes highlights the power of diverse populations for discovery and translation. Nat Genet 54. 10.1038/s41588-022-01058-3. [DOI] [Google Scholar]
6.Rai V., Quang D.X., Erdos M.R., Cusanovich D.A., Daza R.M., Narisu N., Zou L.S., Didion J.P., Guan Y., Shendure J., et al. (2020). Single-cell ATAC-Seq in human pancreatic islets and deep learning upscaling of rare cells reveals cell-specific type 2 diabetes regulatory signatures. Mol. Metab. 32, 109–121. 10.1016/j.molmet.2019.12.006. [DOI] [PMC free article] [PubMed] [Google Scholar]
7.Udler M.S., Kim J., Grotthuss M. von, Bonàs-Guarch S., Cole J.B., Chiou J., Isgc, C.D.A. on behalf of M. and the, Boehnke M., Laakso M., Atzmon G., et al. (2018). Type 2 diabetes genetic loci informed by multi-trait associations point to disease mechanisms and subtypes: A soft clustering analysis. PLOS Med. 15, e1002654. 10.1371/journal.pmed.1002654. [DOI] [PMC free article] [PubMed] [Google Scholar]
8.Maynard A.G., Bhardwaj R., Jones T.R., and Claussnitzer M. (2025). Bridging the variant-to-function gap in type 2 diabetes: advances and challenges. Diabetologia. 10.1007/s00125-025-06600-6. [DOI] [Google Scholar]
9.Chen J. (2021). The trans-ancestral genomic architecture of glycemic traits. Nat Genet 53. 10.1038/s41588-021-00852-9. [DOI] [Google Scholar]
10.Parker S.C.J. (2013). Chromatin stretch enhancer states drive cell-specific gene regulation and harbor human disease risk variants. Proc Natl Acad Sci USA 110. 10.1073/pnas.1317023110. [DOI] [Google Scholar]
11.Quang D.X., Erdos M.R., Parker S.C.J., and Collins F.S. (2015). Motif signatures in stretch enhancers are enriched for disease-associated genetic variants. Epigenetics Chromatin 8, 23. 10.1186/s13072-015-0015-7. [DOI] [PMC free article] [PubMed] [Google Scholar]
12.Thurner M., van de Bunt M., Torres J.M., Mahajan A., Nylander V., Bennett A.J., Gaulton K.J., Barrett A., Burrows C., Bell C.G., et al. (2018). Integration of human pancreatic islet genomic data refines regulatory mechanisms at Type 2 Diabetes susceptibility loci. eLife 7, e31977. 10.7554/eLife.31977. [DOI] [PMC free article] [PubMed] [Google Scholar]
13.Varshney A., Manickam N., Orchard P., Tovar A., Ventresca C., Zhang Z., Feng F., Mears J., Erdos M.R., Narisu N., et al. (2024). Population-scale skeletal muscle single-nucleus multi-omic profiling reveals extensive context specific genetic regulation. Preprint at bioRxiv, https://doi.org/10.1101/2023.12.15.571696 10.1101/2023.12.15.571696. [DOI] [Google Scholar]
14.Orchard P., Manickam N., Ventresca C., Vadlamudi S., Varshney A., Rai V., Kaplan J., Lalancette C., Mohlke K.L., Gallagher K., et al. (2021). Human and rat skeletal muscle single-nuclei multi-omic integrative analyses nominate causal cell types, regulatory elements, and SNPs for complex traits. Genome Res. 31, 2258–2275. 10.1101/gr.268482.120. [DOI] [PMC free article] [PubMed] [Google Scholar]
15.Garcia S.M., Diaz A., Lau J., Chi H., Lizarraga M., Davies M.R., Liu X., and Feeley B.T. (2023). Distinct human stem cell subpopulations drive adipogenesis and fibrosis in musculoskeletal injury. Preprint at bioRxiv, https://doi.org/10.1101/2023.07.28.551038 10.1101/2023.07.28.551038. [DOI] [Google Scholar]
16.He Q., Lu J., Liang Q., Yao L., Sun T., Wang H., Duffy M., Jiang X., Lin Y., Lee J.-H., et al. (2025). Prg4+ fibro-adipogenic progenitors in muscle are crucial for bone fracture repair. Preprint at bioRxiv, https://doi.org/10.1101/2025.05.14.654160 10.1101/2025.05.14.654160. [DOI] [Google Scholar]
17.Dai Q., Wan C., Xu Y., Fei K., Olivere L.A., Garrett B., Akers L., Peters D., Otto J.C., Kontos C.D., et al. (2024). Vcam1+ Fibro-adipogenic Progenitors Mark Fatty Infiltration in Chronic Limb Threatening Ischemia. Preprint at bioRxiv, https://doi.org/10.1101/2024.07.08.602430 10.1101/2024.07.08.602430. [DOI] [Google Scholar]
18.Giuliani G., Rosina M., and Reggio A. (2022). Signaling pathways regulating the fate of fibro/adipogenic progenitors (FAPs) in skeletal muscle regeneration and disease. FEBS J. 289, 6484–6517. 10.1111/febs.16080. [DOI] [PubMed] [Google Scholar]
19.Anderson S.E., Hymel L.A., Zhang H., McKinney J.M., Turner T.C., Mohiuddin M., Han W.M., Lee N.H., Choi J.J., Jeong G., et al. (2025). Aberrant Fibro-Adipogenic Progenitor Subpopulations Drive Volumetric Muscle Loss-Induced Fibrosis. Preprint at bioRxiv, https://doi.org/10.1101/2025.05.11.653339 10.1101/2025.05.11.653339. [DOI] [Google Scholar]
20.Alasoo K., Rodrigues J., Mukhopadhyay S., Knights A.J., Mann A.L., Kundu K., Hale C., Dougan G., and Gaffney D.J. (2018). Shared genetic effects on chromatin and gene expression indicate a role for enhancer priming in immune response. Nat. Genet. 50, 424–431. 10.1038/s41588-018-0046-7. [DOI] [PMC free article] [PubMed] [Google Scholar]
21.Varshney A., Scott L.J., Welch R.P., Erdos M.R., Chines P.S., Narisu N., Albanus R.D., Orchard P., Wolford B.N., Kursawe R., et al. (2017). Genetic regulatory signatures underlying islet gene expression and type 2 diabetes. Proc. Natl. Acad. Sci. 114, 2301–2306. 10.1073/pnas.1621192114. [DOI] [PMC free article] [PubMed] [Google Scholar]
22.Missarova A., Dann E., Rosen L., Satija R., and Marioni J. (2024). Leveraging neighborhood representations of single-cell data to achieve sensitive DE testing with miloDE. Genome Biol. 25, 189. 10.1186/s13059-024-03334-3. [DOI] [PMC free article] [PubMed] [Google Scholar]
23.Johnson B.B., Cosson M.-V., Tsansizi L.I., Holmes T.L., Gilmore T., Hampton K., Song O.-R., Vo N.T.N., Nasir A., Chabronova A., et al. (2024). Perlecan (HSPG2) promotes structural, contractile, and metabolic development of human cardiomyocytes. Cell Rep. 43, 113668. 10.1016/j.celrep.2023.113668. [DOI] [PubMed] [Google Scholar]
24.Dankel S.N., Svärd J., Matthä S., Claussnitzer M., Klöting N., Glunk V., Fandalyuk Z., Grytten E., Solsvik M.H., Nielsen H.-J., et al. (2014). COL6A3 expression in adipocytes associates with insulin resistance and depends on PPARγ and adipocyte size. Obes. Silver Spring Md 22, 1807–1813. 10.1002/oby.20758. [DOI] [Google Scholar]
25.Dankel S.N., Grytten E., Bjune J.-I., Nielsen H.J., Dietrich A., Blüher M., Sagen J.V., and Mellgren G. (2020). COL6A3 expression in adipose tissue cells is associated with levels of the homeobox transcription factor PRRX1. Sci. Rep. 10, 20164. 10.1038/s41598-020-77406-2. [DOI] [PMC free article] [PubMed] [Google Scholar]
26.Sreekumar R., Halvatsiotis P., Schimke J.C., and Nair K.S. (2002). Gene Expression Profile in Skeletal Muscle of Type 2 Diabetes and the Effect of Insulin Treatment. Diabetes 51, 1913–1920. 10.2337/diabetes.51.6.1913. [DOI] [PubMed] [Google Scholar]
27.Loh N.Y., Minchin J.E.N., Pinnick K.E., Verma M., Todorčević M., Denton N., Moustafa J.E.-S., Kemp J.P., Gregson C.L., Evans D.M., et al. (2020). RSPO3 impacts body fat distribution and regulates adipose cell biology in vitro. Nat. Commun. 11, 2797. 10.1038/s41467-020-16592-z. [DOI] [PMC free article] [PubMed] [Google Scholar]
28.Fitzgerald G., Turiel G., Gorski T., Soro-Arnaiz I., Zhang J., Casartelli N.C., Masschelein E., Maffiuletti N.A., Sutter R., Leunig M., et al. (2023). MME+ fibro-adipogenic progenitors are the dominant adipogenic population during fatty infiltration in human skeletal muscle. Commun. Biol. 6, 111. 10.1038/s42003-023-04504-y. [DOI] [PMC free article] [PubMed] [Google Scholar]
29.Stolic M., Russell A., Hutley L., Fielding G., Hay J., MacDonald G., Whitehead J., and Prins J. (2002). Glucose uptake and insulin action in human adipose tissue—influence of BMI, anatomical depot and body fat distribution. Int. J. Obes. 26, 17–23. 10.1038/sj.ijo.0801850. [DOI] [Google Scholar]
30.Power C., and Thomas C. (2011). Changes in BMI, Duration of Overweight and Obesity, and Glucose Metabolism: 45 Years of Follow-up of a Birth Cohort. Diabetes Care 34, 1986–1991. 10.2337/dc10-1482. [DOI] [PMC free article] [PubMed] [Google Scholar]
31.Tramunt B., Smati S., Grandgeorge N., Lenfant F., Arnal J.-F., Montagner A., and Gourdy P. (2020). Sex differences in metabolic regulation and diabetes susceptibility. Diabetologia 63, 453–461. 10.1007/s00125-019-05040-3. [DOI] [PMC free article] [PubMed] [Google Scholar]
32.Mitchell J.M., Nemesh J., Ghosh S., Handsaker R.E., Mello C.J., Meyer D., Raghunathan K., Rivera H. de, Tegtmeyer M., Hawes D., et al. (2020). Mapping genetic effects on cellular phenotypes with “cell villages.” Preprint at bioRxiv, https://doi.org/10.1101/2020.06.29.174383 10.1101/2020.06.29.174383. [DOI] [Google Scholar]
33.Luo Y.E., Abe-Teh Z., Alsaghir T., Kuo L.-Y., Yu F., Stoker B.E., Appu A.B., Zhou Y., Yue F., Kopinke D., et al. (2025). Fibro-Adipogenic Progenitors require autocrine IGF-I in homeostatic and regenerating skeletal muscle. Preprint at bioRxiv, https://doi.org/10.1101/2025.04.11.648330 10.1101/2025.04.11.648330. [DOI] [Google Scholar]
34.Johansen V.B.I., Lund J., Romero-Lado M.J., Breum A.W., Svendsen C., Jørgensen K.S., Miller R.L., Fritzen A.M., Kilpeläinen T.O., Schjoldager K.T., et al. (2025). Genetic and preclinical evidence implicating chondroitin sulfate as a matritherapeutic target for the treatment of type 2 diabetes. Preprint at bioRxiv, https://doi.org/10.1101/2025.11.12.688143 10.1101/2025.11.12.688143. [DOI] [Google Scholar]
35.Huang T., Yang M., Grant C., Kelly K., O’Connor A.J., Kalionis B., and Heath D.E. (2025). Decellularized Extracellular Matrix Produced by iPSC-Derived MSCs Promotes iPSC-MSC Proliferation and Differentiation and Regulates Secreted Factors. J. Biomed. Mater. Res. A 113, e70010. 10.1002/jbma.70010. [DOI] [PubMed] [Google Scholar]
36.Stump C.S., Short K.R., Bigelow M.L., Schimke J.M., and Nair K.S. (2003). Effect of insulin on human skeletal muscle mitochondrial ATP production, protein synthesis, and mRNA transcripts. Proc. Natl. Acad. Sci. U. S. A. 100, 7996–8001. 10.1073/pnas.1332551100. [DOI] [PMC free article] [PubMed] [Google Scholar]
37.Bernardo E., Vas M.G.D., Balboa D., Cuenca-Ardura M., Bonàs-Guarch S., Planas-Fèlix M., Mollandin F., Torrens-Dinarès M., Maestro M.A., García-Hurtado J., et al. (2025). HNF1A and A1CF coordinate a beta cell transcription-splicing axis that is disrupted in type 2 diabetes. Cell Metab. 37, 1870–1889.e10. 10.1016/j.cmet.2025.07.007. [DOI] [PMC free article] [PubMed] [Google Scholar]
38.Das R., Giri J., K. Paul P., Froelich N., Chinnadurai R., McCoy S., Bushman W., and Galipeau J. (2022). A STAT5-Smad3 dyad regulates adipogenic plasticity of visceral adipose mesenchymal stromal cells during chronic inflammation. Npj Regen. Med. 7, 41. 10.1038/s41536-022-00244-5. [DOI] [PMC free article] [PubMed] [Google Scholar]
39.Lin H.-M., Lee J.-H., Yadav H., Kamaraju A.K., Liu E., Zhigang D., Vieira A., Kim S.-J., Collins H., Matschinsky F., et al. (2009). Transforming Growth Factor-β/Smad3 Signaling Regulates Insulin Gene Transcription and Pancreatic Islet β-Cell Function. J. Biol. Chem. 284, 12246–12257. 10.1074/jbc.M805379200. [DOI] [PMC free article] [PubMed] [Google Scholar]
40.Pujar M.K., Vastrad B., and Vastrad C. (2019). Integrative Analyses of Genes Associated with Subcutaneous Insulin Resistance. Biomolecules 9, 37. 10.3390/biom9020037. [DOI] [PMC free article] [PubMed] [Google Scholar]
41.Reggio A., Spada F., Rosina M., Massacci G., Zuccotti A., Fuoco C., Gargioli C., Castagnoli L., and Cesareni G. (2019). The immunosuppressant drug azathioprine restrains adipogenesis of muscle Fibro/Adipogenic Progenitors from dystrophic mice by affecting AKT signaling. Sci. Rep. 9, 4360. 10.1038/s41598-019-39538-y. [DOI] [PMC free article] [PubMed] [Google Scholar]
42.Reggio A., Rosina M., Palma A., Cerquone Perpetuini A., Petrilli L.L., Gargioli C., Fuoco C., Micarelli E., Giuliani G., Cerretani M., et al. (2020). Adipogenesis of skeletal muscle fibro/adipogenic progenitors is affected by the WNT5a/GSK3/β-catenin axis. Cell Death Differ. 27, 2921–2941. 10.1038/s41418-020-0551-y. [DOI] [PMC free article] [PubMed] [Google Scholar]
43.Pickrell J.K. (2014). Joint analysis of functional genomic data and genome-wide association studies of 18 human traits. Am. J. Hum. Genet. 94, 559–573. 10.1016/j.ajhg.2014.03.004. [DOI] [PMC free article] [PubMed] [Google Scholar]
44.Singh K., Sachan N., Ene T., Dabovic B., and Rifkin D. (2022). Latent transforming growth factor β binding protein 3 controls adipogenesis. Matrix Biol. J. Int. Soc. Matrix Biol. 112, 155–170. 10.1016/j.matbio.2022.08.001. [DOI] [Google Scholar]
45.Wu M., Wu S., Chen W., and Li Y.-P. (2024). The roles and regulatory mechanisms of TGF-β and BMP signaling in bone and cartilage development, homeostasis and disease. Cell Res. 34, 101–123. 10.1038/s41422-023-00918-9. [DOI] [PMC free article] [PubMed] [Google Scholar]
46.Quinlan A.R., and Hall I.M. (2010). BEDTools: a flexible suite of utilities for comparing genomic features. Bioinformatics 26, 841–842. 10.1093/bioinformatics/btq033. [DOI] [PMC free article] [PubMed] [Google Scholar]
47.Liao Y., Smyth G.K., and Shi W. (2019). The R package Rsubread is easier, faster, cheaper and better for alignment and quantification of RNA sequencing reads. Nucleic Acids Res. 47, e47. 10.1093/nar/gkz114. [DOI] [PMC free article] [PubMed] [Google Scholar]
48.Wells M.F., Nemesh J., Ghosh S., Mitchell J.M., Salick M.R., Mello C.J., Meyer D., Pietilainen O., Piccioni F., Guss E.J., et al. (2023). Natural variation in gene expression and viral susceptibility revealed by neural progenitor cell villages. Cell Stem Cell 30, 312–332.e13. 10.1016/j.stem.2023.01.010. [DOI] [PMC free article] [PubMed] [Google Scholar]
49.Li H., and Durbin R. (2009). Fast and accurate short read alignment with Burrows-Wheeler transform. Bioinformatics 25, 1754–1760. [DOI] [PMC free article] [PubMed] [Google Scholar]
50.Zhang Y., Liu T., Meyer C.A., Eeckhoute J., Johnson D.S., Bernstein B.E., Nussbaum C., Myers R.M., Brown M., Li W., et al. (2008). Model-based Analysis of ChIP-Seq (MACS). Genome Biol. 9, R137. 10.1186/gb-2008-9-9-r137. [DOI] [PMC free article] [PubMed] [Google Scholar]
51.Orchard P., Kyono Y., Hensley J., Kitzman J.O., and Parker S.C.J. (2020). Quantification, Dynamic Visualization, and Validation of Bias in ATAC-Seq Data with ataqv. Cell Syst. 10, 298–306.e4. 10.1016/j.cels.2020.02.009. [DOI] [PMC free article] [PubMed] [Google Scholar]
52.Dobin A., Davis C.A., Schlesinger F., Drenkow J., Zaleski C., Jha S., Batut P., Chaisson M., and Gingeras T.R. (2012). STAR: ultrafast universal RNA-seq aligner. Bioinformatics 29, 15–21. [DOI] [PMC free article] [PubMed] [Google Scholar]
53.Fleming S.J., Chaffin M.D., Arduini A., Akkad A.-D., Banks E., Marioni J.C., Philippakis A.A., Ellinor P.T., and Babadi M. (2023). Unsupervised removal of systematic background noise from droplet-based single-cell experiments using CellBender. Nat. Methods, 1–13. 10.1038/s41592-023-01943-7. [DOI] [PubMed] [Google Scholar]
54.Lun A.T.L., Riesenfeld S., Andrews T., Dao T.P., Gomes T., and Marioni J.C. (2019). EmptyDrops: distinguishing cells from empty droplets in droplet-based single-cell RNA sequencing data. Genome Biol. 20, 1–9. 10.1186/s13059-019-1662-y. [DOI] [PMC free article] [PubMed] [Google Scholar]
55.Kang H.M., Subramaniam M., Targ S., Nguyen M., Maliskova L., McCarthy E., Wan E., Wong S., Byrnes L., Lanata C.M., et al. (2018). Multiplexed droplet single-cell RNA-sequencing using natural genetic variation. Nat. Biotechnol. 36, 89–94. 10.1038/nbt.4042. [DOI] [PMC free article] [PubMed] [Google Scholar]
56.Hao Y., Stuart T., Kowalski M.H., Choudhary S., Hoffman P., Hartman A., Srivastava A., Molla G., Madad S., Fernandez-Granda C., et al. (2024). Dictionary learning for integrative, multimodal and scalable single-cell analysis. Nat. Biotechnol. 42, 293–304. 10.1038/s41587-023-01767-y. [DOI] [PMC free article] [PubMed] [Google Scholar]
57.Cao J., Spielmann M., Qiu X., Huang X., Ibrahim D.M., Hill A.J., Zhang F., Mundlos S., Christiansen L., Steemers F.J., et al. (2019). The single-cell transcriptional landscape of mammalian organogenesis. Nature 566, 496–502. 10.1038/s41586-019-0969-x. [DOI] [PMC free article] [PubMed] [Google Scholar]
58.Finak G., McDavid A., Yajima M., Deng J., Gersuk V., Shalek A.K., Slichter C.K., Miller H.W., McElrath M.J., Prlic M., et al. (2015). MAST: a flexible statistical framework for assessing transcriptional changes and characterizing heterogeneity in single-cell RNA sequencing data. Genome Biol. 16, 278. 10.1186/s13059-015-0844-5. [DOI] [PMC free article] [PubMed] [Google Scholar]
59.Yu G., Wang L.-G., Han Y., and He Q.-Y. (2012). clusterProfiler: an R Package for Comparing Biological Themes Among Gene Clusters. OMICS J. Integr. Biol. 16, 284–287. 10.1089/omi.2011.0118. [DOI] [Google Scholar]
60.Zhang K., Zemke N.R., Armand E.J., and Ren B. (2024). A fast, scalable and versatile tool for analysis of single-cell omics data. Nat. Methods 21, 217–227. 10.1038/s41592-023-02139-9. [DOI] [PMC free article] [PubMed] [Google Scholar]
61.S H., C B., N S., E B., Yc L., P L., Jx C., C M., H S., and Ck G. (2010). Simple combinations of lineage-determining transcription factors prime cis-regulatory elements required for macrophage and B cell identities. Mol. Cell 38. 10.1016/j.molcel.2010.05.004. [DOI] [Google Scholar]
62.Sakaue S., Weinand K., Isaac S., Dey K.K., Jagadeesh K., Kanai M., Watts G.F.M., Zhu Z., Brenner M.B., McDavid A., et al. (2024). Tissue-specific enhancer–gene maps from multimodal single-cell data identify causal disease alleles. Nat. Genet. 56, 615–626. 10.1038/s41588-024-01682-1. [DOI] [PMC free article] [PubMed] [Google Scholar]

Associated Data

This section collects any data citations, data availability statements, or supplementary materials included in this article.

Supplementary Materials