Author manuscript; available in PMC: 2025 May 7.
Published in final edited form as: Cell Metab. 2024 Mar 20;36(5):1105–1125.e10. doi: 10.1016/j.cmet.2024.02.015

Transcriptomic, epigenomic and spatial metabolomic cell profiling redefines regional human kidney anatomy

Haikuo Li 1, Dian Li 1, Nicolas Ledru 1, Qiao Xuanyuan 1, Haojia Wu 1, Amish Asthana 2, Lori N Byers 2, Stefan G Tullius 3, Giuseppe Orlando 2, Sushrut S Waikar 4, Benjamin D Humphreys 1,5,6,*
PMCID: PMC11081846  NIHMSID: NIHMS1979282  PMID: 38513647

SUMMARY

A large-scale multimodal atlas that includes major kidney regions is lacking. Here, we employed simultaneous high-throughput single-cell ATAC/RNA-sequencing (SHARE-seq) and spatially resolved metabolomics to profile 54 human samples from distinct kidney anatomical regions. We generated transcriptomes of 446,267 cells and chromatin accessibility profiles of 401,875 cells and developed a package to analyze 408,218 spatially resolved metabolomes. We find that cells of the same type, including thin limb cells, thick ascending limb cells of the loop of Henle and principal cells, display distinct transcriptomic, chromatin accessibility and metabolomic signatures depending on anatomic location. Surveying metabolism-associated gene profiles revealed non-overlapping metabolic signatures between nephron segments and dysregulated lipid metabolism in diseased proximal tubule (PT) cells. Integrating multimodal omics with clinical data identified PLEKHA1 as a disease marker, and its in vitro knockdown increased expression of genes involved in PT differentiation, suggesting possible pathogenic roles. This study highlights previously underrepresented cellular heterogeneity underlying the human kidney anatomy.

Keywords: Multiomics, single-cell combinatorial indexing, SHARE-seq, spatial metabolomics, MALDI-MS, chronic kidney disease, acute kidney injury, anatomy, metabolism, lipid metabolism

In brief

Li et al. profile human kidney anatomical structures with split-pool barcoding simultaneous RNA/ATAC-sequencing and spatially resolved metabolomics. They report that the same kidney cell types have distinct transcriptomic, epigenomic and metabolomic profiles depending on anatomic location. This multimodal and regionally stratified human kidney atlas reveals anatomical heterogeneity and cell type-specific metabolism.

INTRODUCTION

The human kidney maintains fluid and electrolyte balance, removes waste and regulates blood pressure. In the healthy adult kidney, 180 liters of plasma are filtered daily into nephrons whose tubules span the kidney cortex, medulla and papilla, the major anatomical regions of the kidney.1 Kidney diseases affect over 10% of the population worldwide and there are relatively few treatment options to slow disease progression.2,3 The transcriptional, epigenetic and metabolomic landscape of the human kidney has not been studied across all anatomic regions, limiting our understanding of cell type-specific functionality and potentially the identification of new therapeutic targets.

The human kidney has about 1 million nephrons, each composed of a renal corpuscle containing the renal glomerulus in the kidney cortex and the renal tubules that exist throughout all kidney regions. Three categories of nephrons, cortical, short-looped and long-looped nephrons, are classified based on the anatomical locations where a U-shaped loop (i.e., the loop of Henle) turns back, in the cortex, medulla and papilla, respectively (Figure 1A).4,5 This characteristic makes the kidney a unique organ where the same tubular epithelial cell type, originating from the same developmental structure (i.e., the ureteric bud for collecting ducts and the metanephric mesenchyme for the other tubular cells1), may be positioned within distinct anatomical regions. Whether these cells possess a differential molecular signature when located in different regions remains undefined.

Figure 1. A multimodal and anatomically stratified single-cell atlas of the human kidney.

(A) The structure of human kidney anatomy. DCT, distal convoluted tubule. Renal collecting ducts are not shown for the convenience of visualization.

(B) Study overview. Figure created with BioRender.com.

(C) PAS and trichrome histology staining on human kidney regions.

(D) Pseudobulk analysis.

(E) Visualization of region-specific gene expression/activity in the pseudobulk analysis.

(F-G) UMAP presentations of 446,267 single-cell transcriptomes (F) and 401,875 chromatin accessibility profiles (G). The surrounding circular layouts indicate the cell number of each population (log10-transformed scale bar), major cell clusters (outer layout), and distributions of 5 anatomical regions in cell cluster (inner layout; color legend same as Figure 1D). PT_dediff, dedifferentiated PT; PT_VCAM1, VCAM1-expressing PT; TAL, thick ascending limb of loop of Henle; CNT, connecting tubule; PC, principal cell of collecting duct; ICA/ICB, type A/B intercalated cell of collecting duct; POD, podocyte; PEC, parietal epithelial cell; JGA, juxtaglomerular apparatus; ENDO, endothelial cell; Fib, fibroblast; Ma, macrophage; B/T, B and T cells; Uro, urothelium; SMC, smooth muscle cell.

(H) Spatially resolved metabolomics of human kidney regions. For each sample, PAS staining (left) and metabolomics Leiden clustering (right) are shown. Zoom-in regions are shown below.

Single-cell RNA sequencing (scRNA-seq) has been widely applied to study tissue heterogeneity and cell fate transitions in both healthy and diseased kidneys.6–11 More recently, joint scRNA-seq and assay for transposase-accessible chromatin with sequencing (snATAC-seq) has enabled multimodal characterization of both gene expression and epigenetic regulatory mechanisms in mouse and human kidneys or kidney organoids.12–15 The most widely adopted single-cell platform is based upon droplet microfluidics. Limitations such as relatively low throughput and high reagent cost hinder the utilization of this technology for analyzing a large number of samples.16,17 Although a few large consortia have managed to analyze human kidney samples at scale,18–20 comparative multimodal analysis of different kidney anatomical structures is lacking. In addition, a comprehensive human kidney atlas would require metabolomic profiling to elucidate the cell type-specific metabolic functions. Although imaging mass spectrometry (IMS)-based spatially resolved metabolomics has been increasingly used in multiple biomedical fields,21–24 an automated pipeline for IMS data analysis and metabolomic clustering is still lacking, which limits its adoption in large-scale human kidney research.

Here, we employed split-pool barcoding-based simultaneous high-throughput ATAC and RNA expression with sequencing (SHARE-seq25) to profile 48 adult human kidney samples from cortex, medulla, papilla, renal artery and ureter. Our optimized SHARE-seq protocol generated transcriptomes of 446,267 cells and open chromatin accessibility profiles of 401,875 cells at low experimental cost. We developed MALDIpy, an open-source package for analyzing IMS data and used it to analyze 408,218 spatially resolved metabolomes generated from 6 adult human kidney cortex, medulla and papilla samples using the matrix-assisted laser desorption/ionization (MALDI) method.26 With this large-scale, multimodal and anatomically stratified human kidney atlas, we describe transcriptional, epigenomic and metabolomic heterogeneity across the anatomical regions, including identification of a cell type less characterized in previous studies: thin limb of loop of Henle (tL) cells enriched in kidney medulla and papilla. Cells of tL, thick ascending limb of loop of Henle (TAL) and principal cells (PC) have distinct signatures when located in different anatomical regions. We identified metabolism-associated gene profiles that are unique to each nephron segment and anatomical region, and dysregulated lipid metabolism including decreased fatty acid oxidation and increased lipid accumulation gene expression in a diseased state of proximal tubule (PT) cells. Finally, we integrated the clinical data from enrolled patients and identified Pleckstrin Homology Domain-Containing A1 (PLEKHA1) as a disease marker gene potentially involved in kidney disease development.

RESULTS

Human kidney sample collection and study overview

The kidney cortex comprises the outer region of the kidney lying under the kidney capsule. The renal medulla, surrounded by kidney cortex, consists of triangular-shaped renal pyramids. The cone-shaped papilla is located at the tip of a renal pyramid and projects into the ureter (Figure 1A–B).27 Here, we obtained adult human kidney samples from multiple anatomical regions including cortex, medulla, papilla, renal artery and ureter (Figure 1B). To confirm the accuracy of sample dissection, sections of the same tissues from the five anatomical regions were processed for Periodic acid–Schiff (PAS) and trichrome histological staining (Figure 1C). As expected, glomeruli were only observed in the kidney cortex. The renal tubules span multiple anatomical regions including cortex, medulla and papilla, reflecting diverse subpopulations of tubular epithelial cells. By contrast, smooth muscle can be identified in both the renal artery and ureter, which suggests high tissue elasticity. To investigate the underlying difference of these anatomical regions at the multimodal and single-cell level, we employed SHARE-seq25 to profile transcriptomes and chromatin accessibility profiles of the same cells from 48 kidney samples (24 cortex, 10 medulla, 7 papilla, 3 renal artery and 4 ureter samples; Table S1) (Figure 1D–G; detailed in Figure 2) and used IMS-based spatially resolved metabolomics to analyze 6 kidney samples (tissue sections of distinct anatomical regions from 2 donors; Table S2) (representative data in Figure 1H; detailed in Figure 3).

Figure 2. SHARE-seq multiomics analysis identifies anatomical heterogeneity of the human kidney.

(A) Dot plot showing cluster-specific marker gene expression and bar plot showing the number of cells.

(B) Dot plot showing cluster-specific gene activities and bar plot showing the number of cells.

(C) Cell type composition of each kidney region.

(D) Regional distribution of each cell type.

(E) Heat map showing the proportion of snATAC-seq cells with identical cluster annotations as annotated by snRNA-seq analysis.

(F) Integrative analysis of both modalities with WNN analysis.

(G) Heat map showing cell type-specific motif activities.

(H) Coverage plots of three tL marker genes.

(I) Immunostaining of SLC44A5, UMOD and AQP2 on kidney medullary sections. Scale bars on the left panels: 100 μm.

Figure 3. Spatially resolved metabolomics highlights anatomical heterogeneity of the human kidney.

(A) MALDIpy, a package for IMS data analysis.

(B) UMAP presentation of 408,218 spatially resolved metabolomes. Glom, glomerulus; CoD, collecting duct; LoH, loop of Henle; Matrix, MALDI technical background matrix.

(C) Dot plot showing cluster-specific metabolomics. Each feature is labeled by its common name and chemical formula.

(D) Spatially resolved metabolomics profiling of 6 human kidney samples, colored by metabolomics Leiden clustering. Scale bars: 300 μm. Donor #1 samples are highlighted in Figure 1H and zoom-in regions are shown with specialized kidney structures highlighted.

(E) Spatial feature plot of metabolite SM(d18:1/16:0), with the chemical structure shown. The value on the color bar indicates IMS intensity post total ion count (TIC) normalization (see Methods). See also Mendeley Data.

(F) Biological functions of enzymes of interest.

(G) Gene expression of CERS6 and CDS2 in the kidney cortex.

(H) Spatial feature plot of metabolite PA(18:1(9Z)/15:0), with the chemical structure shown.

(I) Spatial feature plot of metabolite LysoPC(22:4(7Z,10Z,13Z,16Z)), with the chemical structure shown.

(J) Immunostaining of CDS2 and LTL (Lotus tetragonolobus lectin) in the human kidney cortex, with specialized kidney structures highlighted. Scale bars: 25 μm.

(K) Gene expression of LYPLA1/2 in the snRNA-seq data. Uro1/2 cells are combined for the convenience of visualization.

Transcriptional and epigenomic profiling with SHARE-seq

SHARE-seq25 employs a split-pool barcoding strategy and enables co-measurement of snRNA-seq and snATAC-seq readouts in the same cells/nuclei. In SHARE-seq, the nuclei are extracted from frozen tissues, fixed, transposed with Tn5 transposase, reverse transcribed with biotin-tagged poly(T) primers followed by three rounds of ligation-based 96-well split-pool barcoding (Figure 1B). Preliminary results revealed several challenges encountered when applying the original SHARE-seq protocol on the human kidneys, including low yield in nuclei isolation from these primary samples, non-uniform Tn5 transposition efficiency and reduced sensitivity for gene detection. Therefore, we optimized each of these major procedures, with changes detailed in Table S3 and STAR Methods, including the use of a hyperactive Tn5 variant28 (Mendeley Data). With this optimized protocol, we profiled 3–4 samples per SHARE-seq batch leveraging its sample multiplexing capacity, completing the processing of 48 samples within 15 batches at high throughput (~50,000 raw cells per batch) and reagent costs ~20 times lower than the commercialized 10X Genomics multiome kit (~$0.01–0.02 per cell with both modalities) (Table S4).
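The combinatorial capacity of three rounds of 96-well barcoding can be checked with a short calculation. The sketch below is illustrative arithmetic only, not part of the published protocol: it assumes each round assigns one of 96 well barcodes uniformly at random and independently, uses the approximate 50,000 raw cells per batch quoted above, and ignores any additional sub-library indexing or sample-specific barcode allocation. The function name is hypothetical.

```python
# Illustrative arithmetic for split-pool barcoding capacity (not the authors' code).
# Assumption: each of the 3 ligation rounds assigns one of 96 well barcodes
# uniformly at random and independently of the other rounds.

WELLS_PER_ROUND = 96
ROUNDS = 3


def expected_collision_fraction(n_cells: int, n_barcodes: int) -> float:
    """Expected fraction of cells that share a barcode combination with another cell."""
    # Probability that none of the other (n_cells - 1) cells drew the same combination.
    p_unique = (1.0 - 1.0 / n_barcodes) ** (n_cells - 1)
    return 1.0 - p_unique


n_barcodes = WELLS_PER_ROUND ** ROUNDS   # 96^3 = 884,736 combinations
cells_per_batch = 50_000                 # approximate raw cells per batch (from the text)
collision = expected_collision_fraction(cells_per_batch, n_barcodes)

print(f"{n_barcodes:,} barcode combinations")
print(f"expected barcode-collision fraction: {collision:.1%}")
```

Under these simplifying assumptions the barcode space is much larger than the number of cells per batch, which is the property that lets several samples be processed together in one experiment.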

Pseudobulk analysis was performed to check quality by combining cells of the same sample into a pseudo-cell for each modality (Mendeley Data). Gene activities were calculated for snATAC-seq pseudo-cells as a measurement of chromatin accessibility profiles, which were co-embedded with snRNA-seq pseudo-cells (Figure 1D and S1A; Mendeley Data). This analysis revealed low variation between samples of the same regions, compared to variation between samples of different anatomic regions (Figure 1D), which was confirmed by analyzing snRNA-seq or snATAC-seq pseudo-cells alone (Figure S1B). Correlation analysis suggested kidney medulla showed higher transcriptomic similarity with the papilla compared with the cortex, and high similarity between renal artery and ureter samples (Figure S1C). We identified upregulation of SLC22A8 (encoding a PT transporter) in kidney cortex, RNF24 (specific to the Loop of Henle and distal nephron29) in medulla and papilla and MYH11 [marker of smooth muscle cells (SMCs)] in the renal artery and ureter samples, confirming the sample quality (Figure 1E and S1D).
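The pseudobulk step described above, in which all cells of a sample are combined into one pseudo-cell, can be sketched as follows. This is a minimal illustration on hypothetical toy counts, not the pipeline used in the study; the function names are invented for the example, and the counts-per-million scaling is an assumption, since the normalization applied to pseudo-cells is not stated in this section.

```python
# Minimal pseudobulk sketch on toy data (not the authors' pipeline).
from collections import defaultdict


def pseudobulk(counts, sample_of_cell):
    """Sum per-cell gene counts into one pseudo-cell per sample.

    counts: {cell_id: {gene: count}}
    sample_of_cell: {cell_id: sample_id}
    """
    totals = defaultdict(lambda: defaultdict(int))
    for cell, gene_counts in counts.items():
        sample = sample_of_cell[cell]
        for gene, n in gene_counts.items():
            totals[sample][gene] += n
    return {sample: dict(genes) for sample, genes in totals.items()}


def cpm(profile):
    """Scale a pseudo-cell to counts per million (assumed normalization)."""
    total = sum(profile.values())
    return {gene: 1e6 * n / total for gene, n in profile.items()}


# Hypothetical toy input: two cortex cells and one ureter cell.
counts = {
    "cell1": {"SLC22A8": 5, "MYH11": 0},
    "cell2": {"SLC22A8": 3, "MYH11": 1},
    "cell3": {"SLC22A8": 0, "MYH11": 9},
}
sample_of_cell = {"cell1": "cortex_1", "cell2": "cortex_1", "cell3": "ureter_1"}

bulk = pseudobulk(counts, sample_of_cell)
normalized = {sample: cpm(profile) for sample, profile in bulk.items()}
```

Collapsing cells this way trades single-cell resolution for a per-sample profile, which makes between-sample and between-region comparisons straightforward.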

After sample demultiplexing, doublet removal and quality control (STAR Methods), we obtained a total of 446,267 single-cell transcriptomes measured by snRNA-seq, with an average of 3,084 unique molecular identifiers (UMIs) and 1,051 genes per cell (Figure 1F and S1E), which is comparable to current commercialized assays on primary kidney tissues.30 We also processed 401,875 single-cell open chromatin accessibility profiles measured by snATAC-seq, with an average of 4,729 fragments and 1,866 peaks per cell (Figure 1G and S1F) (Mendeley Data). Dimension reduction and unsupervised single-cell clustering analysis identified a total of 29 major cell clusters for the snRNA-seq module and 21 cell clusters for the snATAC-seq module (Figure 1F–G) (data visualizer available at http://humphreyslab.com/SingleCell/).

Human kidney anatomical heterogeneity revealed by SHARE-seq multiomics analysis

The large size of this dataset and inclusion of samples from different kidney anatomical structures allowed us to not only benchmark previously described cell types/states of the human kidney, such as a PT cell state exhibiting a profibrotic and proinflammatory signature characterized by expression of VCAM1 (PT_VCAM1),6,31 but also to identify clusters not examined in depth by prior single-cell studies including tL cells, subclusters of TAL and PC cells (discussed later) and multiple populations of fibroblasts, SMCs and urothelial cells (Figure 1F–G), which all exhibited distinct gene expression and gene activity signatures (Figure 2A and 2B). Gene module scoring analysis revealed upregulation of extracellular matrix (ECM)-encoding genes in fibroblasts and SMCs (Figure S1G) and pathway activity inference suggested high TGF-β pathway activity in fibroblasts and endothelial cells, confirming previous results (Figure S1H).32,33 Biological term enrichment analysis indicated myogenesis activity specific to SMCs (Figure S1I) and immune functions such as inflammatory responses in macrophages (Figure S1J).

We first analyzed the cell type composition of each anatomical region (Figure 2C, S2A and S2B). As expected, PT comprised the major population in the kidney cortex (Figure 2C) and cells of the glomerulus, distal convoluted tubule (DCT) and connecting tubule (CNT) are mainly present in the cortex (Figure 2D). Notably, type B intercalated cells (ICB) are primarily identified in the cortex (90.7% of ICB) while type A intercalated cells (ICA) can be found in all cortex, medulla and papilla (Figure 2C and 2D), consistent with existing knowledge.34 Cells of the loop of Henle are highly abundant in the medullary and papillary regions, SMCs are enriched in the renal artery and ureter, and urothelial cells are mostly identified in the ureter (Figure 2C and 2D). Our snRNA-seq clustering identified Uro1 and Uro2 as two separate clusters of urothelial cells, both expressing markers reported in a recent study35 including various uroplakin and keratin genes (Mendeley Data). We found Uro1 cells are highly enriched in the ureter samples (Figure 2C and 2D) and therefore represent the ureteric urothelium, but Uro2 cells are more abundant in the kidney medulla and papilla (Figure 2D), indicating a renal pelvis identity. We confirmed there are no sample-specific cell clusters (Figure S2C) and all analyses had consistent results between snRNA-seq and snATAC-seq clustering (Mendeley Data).
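The two summaries used above, cell type composition of each region (as in Figure 2C) and regional distribution of each cell type (as in Figure 2D), are the same count table normalized in two directions. The sketch below illustrates this on hypothetical toy labels; the function name and the numbers are assumptions for illustration and do not reproduce the study's data.

```python
# Two normalizations of one region-by-cell-type count table (toy data).
from collections import Counter


def composition(cells):
    """cells: iterable of (region, cell_type) pairs, one per cell.

    Returns (by_region, by_type):
      by_region[region][cell_type] = fraction of that region's cells of this type
      by_type[cell_type][region]   = fraction of that type's cells found in this region
    """
    pair_counts = Counter(cells)
    per_region = Counter()
    per_type = Counter()
    for (region, cell_type), n in pair_counts.items():
        per_region[region] += n
        per_type[cell_type] += n

    by_region, by_type = {}, {}
    for (region, cell_type), n in pair_counts.items():
        by_region.setdefault(region, {})[cell_type] = n / per_region[region]
        by_type.setdefault(cell_type, {})[region] = n / per_type[cell_type]
    return by_region, by_type


# Hypothetical toy labels, not the study's cell numbers.
cells = (
    [("cortex", "PT")] * 6
    + [("cortex", "ICB")] * 3
    + [("medulla", "ICB")] * 1
    + [("medulla", "TAL")] * 4
)
by_region, by_type = composition(cells)
print(by_type["ICB"]["cortex"])   # share of all ICB cells that come from cortex
```

A statement such as "90.7% of ICB" corresponds to the second normalization. As computed in this sketch, that quantity depends on how many cells were profiled from each region, so it is best read alongside the per-region composition.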

We next analyzed cells for which both snRNA-seq and snATAC-seq readouts met quality control metrics. About 80.8% of the snATAC-seq cells (n = 324,701) were also identified in our snRNA-seq analysis after cell identity matching. The gene expression levels showed a strong correlation with corresponding gene activities in these cells (Figure S2D; Mendeley Data). Cell annotations in our snATAC-seq clustering analysis were highly consistent with annotations in the snRNA-seq data (Figure 2E and S2E). Integrative analysis of both modalities with the weighted-nearest neighbor (WNN) analysis36 on the 324,701 cells revealed the same pattern of cell clustering as identified in snRNA-seq and snATAC-seq analysis, further validating the data quality (Figure 2F) (Mendeley Data). The WNN analysis also enabled us to impute cell type-specific transcription factor binding motifs (Figure 2G).37 For example, HNF4A and HNF4G showed PT cell-specific gene expression and motif activity, and upregulated expression and motif activity of FOXP1 were identified in ICA and ICB, confirming previous results in mice (Figure 2G and S2F).14,38 Overall, we present a human kidney cell atlas which enables identification of transcriptomic and epigenomic heterogeneity across kidney anatomical regions.
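The cell identity matching step amounts to intersecting the sets of cell barcodes that pass quality control in each modality. The sketch below shows this on hypothetical barcodes and then checks the reported fraction from the cell numbers given in the text; the function name and the toy barcodes are assumptions for illustration.

```python
# Matching cells across modalities by shared barcode (toy barcodes; illustrative only).

def match_modalities(rna_barcodes, atac_barcodes):
    """Return the shared barcodes and the fraction of ATAC cells that are also in RNA."""
    rna = set(rna_barcodes)
    atac = set(atac_barcodes)
    shared = rna & atac
    return shared, len(shared) / len(atac)


# Hypothetical barcodes passing QC in each modality.
rna_cells = ["AAA", "BBB", "CCC", "DDD"]
atac_cells = ["AAA", "CCC", "EEE"]
shared, fraction = match_modalities(rna_cells, atac_cells)

# Check of the reported numbers: 324,701 matched cells out of 401,875 snATAC-seq cells.
reported_fraction = 324_701 / 401_875
print(f"{reported_fraction:.1%}")
```

Only the matched cells carry both readouts, so joint analyses such as WNN are restricted to this intersection.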

Multimodal characterization of thin limb loop of Henle (tL) cells

Notably, besides the aforementioned well-characterized cell types, our snRNA-seq and snATAC-seq clustering analysis both identified two new cell clusters (tL1 and tL2) (Figure 1F–G) which could not be mapped back to previously published studies on human kidneys. Enrichment analysis on differentially expressed genes (DEGs) of these two clusters presented terms including nephric duct development, regulation of nephron tubule epithelial cell differentiation and epithelial cell proliferation, suggesting they are tubular epithelial cells. In addition, these cells do not express marker genes of TAL (e.g., UMOD) and PC (e.g., AQP2), and are highly abundant in kidney medulla and papilla compared to cortex (Figure 2C and 2D), suggesting they are likely cells of thin limb of loop of Henle which forms a U-shaped loop in the medulla and papilla. Previous single-cell sequencing studies on human kidneys typically profiled samples only from kidney cortex due to sample accessibility, largely explaining the underrepresentation of these cells in previous studies.

To further validate the identity of tL cells, we leveraged a published analysis of microdissected rat renal tubules (including the tL segment) by RNA-seq (GSE56743).39 We found genes that were upregulated in tL1/2 cells in our human snRNA-seq data also exhibited an expression pattern specific to tL cells, including both thin descending limb (DTL) and thin ascending limb (ATL), in this rat data (Figure S2G and S2H). Integration with a recently published human kidney snRNA-seq dataset20 also validated a co-clustering pattern between tL1/2 cells and DTL/ATL cells (Mendeley Data).

Our dataset provides a unique opportunity to explore the transcriptomic and epigenomic profiles of tL cells. We found aquaporin 1 (AQP1), which is expressed in PT and DTL but absent in ATL,40 showed higher gene expression in tL1 than tL2 cells (Figure S2I), suggesting that tL1 and tL2 cells have DTL and ATL identities, respectively. In addition, we identified SATB2 (Special AT-rich sequence-binding protein 2) as an upregulated DEG of tL1, SH3GL3 (Endophilin-A3) as an upregulated DEG of tL2, and SLC44A5 (Choline Transporter-Like Protein 5) as commonly expressed in both clusters (Figure 2A). Importantly, these genes also showed gene activities specific to the corresponding cell clusters (Figure 2B) and we observed a higher open chromatin accessibility at the promoter or intragenic region in tL cells compared with the other cell types (Figure 2H; Mendeley Data). Choline transporter-like proteins (SLC44 family) are important for phospholipid metabolism41 and we confirmed SLC44A5 protein expression by immunofluorescence (Figure 2I). Co-staining of SLC44A5 with UMOD and AQP2 on kidney medullary sections revealed that SLC44A5+ tubules do not colocalize with TAL and collecting duct cells in the kidney medulla, further confirming their tL identity. This analysis also indicated that the SLC44A5 transporter mainly localizes at the basolateral membrane of tL cells (Figure 2I).

Analyzing human kidney anatomy by imaging mass spectrometry (IMS)

Next, we aimed to study the unique metabolomic signature of distinct human kidney anatomical regions with IMS-based spatially resolved metabolomics. Recent technological advances have increased its resolution to 10-μm pixel size, allowing metabolomic analysis at near single-cell resolution (each “cell” refers to a 10-μm pixel metabolome). Here, we generated IMS data for 6 human kidney tissue sections with the MALDI method (10-μm pixel size; positive ion mode) (STAR Methods), obtaining a total of 408,218 spatially resolved metabolomes with 588 features detected which mainly include metabolites and small molecules (Figure S3A–S3C; Mendeley Data). The 6 tissue samples are: one outer cortex, one medulla, one papilla from Donor #1 (Figure 1H), and one inner cortex, one medulla including the corticomedullary junction region, another medulla including the pelvis region from Donor #2 (Figure S3D) [Donor #1: healthy control; Donor #2: acute kidney injury (Table S2)], therefore covering all major anatomical structures along the kidney corticopapillary axis. Each post-IMS MALDI section was processed for PAS staining for tissue co-registration.

Although some current platforms (e.g., METASPACE42) enable metabolite annotation and basic data exploration, there exist no open-source packages for advanced feature visualization, metabolomic clustering and integrative multi-sample analysis. Therefore, in this work, we first developed a Python-based package that we call MALDIpy, which enables analysis of IMS data at the single-pixel resolution (Figure 3A). Compared with an existing spatial omics Python package, MALDIpy achieved more than 10-fold faster processing while consuming less memory when analyzing data of this study (Figure S3E). MALDIpy can project clusters back onto the tissue section so that the spatially resolved metabolomic clusters can be co-registered with histology staining. The package incorporates a linear normalization on total ion counts and data integration when used to analyze multiple samples (STAR Methods). Moreover, we provide the protein associations for each metabolite based on the Human Metabolome Database,43 to search for links between metabolites and proteins that are potentially associated with the metabolites.
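Linear normalization on total ion counts (TIC) can be illustrated with a few lines of Python. This is a generic sketch of the technique on toy intensities, not MALDIpy's interface: the function name, the `scale` parameter and the handling of empty pixels are assumptions made for the example.

```python
# Generic total ion count (TIC) normalization sketch (not the MALDIpy API).

def tic_normalize(pixels, scale=1.0):
    """Divide each pixel's intensities by that pixel's total ion count.

    pixels: list of per-pixel intensity lists (one value per m/z feature).
    scale:  optional constant multiplier (assumed; set to taste).
    Pixels with zero total signal are returned as all zeros.
    """
    normalized = []
    for spectrum in pixels:
        tic = sum(spectrum)
        if tic == 0:
            normalized.append([0.0] * len(spectrum))
        else:
            normalized.append([scale * value / tic for value in spectrum])
    return normalized


# Hypothetical toy data: three pixels, two features each.
toy_pixels = [[1, 3], [0, 0], [2, 2]]
toy_normalized = tic_normalize(toy_pixels)
```

After this step each non-empty pixel sums to `scale`, so differences in overall signal between pixels or samples no longer dominate comparisons of individual features.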

Using MALDIpy, we performed dimension reduction and clustering analysis on the 408,218 spatially resolved metabolomes and identified 12 clusters, each exhibiting unique metabolomic features (Figure 3B and 3C). For example, Cluster #1 showed high abundance of a species of sphingomyelin, SM(d18:1/16:0) (Figure 3C), which was previously reported as a glomerular marker.18 Cluster #10 showed specific accumulation of palmitoylcarnitine and oleoylcarnitine (Figure 3C), which are key intermediates of fatty acid β-oxidation, a hallmark of kidney PT cells.44,45 Visualizing the metabolomic clusters on the tissue sections revealed distinct metabolomic signatures in different kidney anatomical regions (Figure 3D and Figure S3F–G) (Mendeley Data). We performed cluster annotation with a combination of two approaches, by (1) manually surveying the differential features of each cluster and checking cell identity through literature search or (2) co-registration of the spatially resolved metabolomic clusters with histology staining. For example, by comparing with PAS staining on the same section region, we found that Cluster #1 clearly indicates the kidney glomerulus located in the cortex (Figure 1H, left panel; Figure 3D, left panel) and Cluster #12 is only present at the renal pelvis region suggesting its urothelial identity (Figure 3D, right panel; Figure S3H) (Mendeley Data).
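Projecting cluster labels back onto the tissue section, as done for the co-registration with PAS staining, reduces to writing each pixel's label into a two-dimensional grid at its coordinates. The sketch below shows the idea on hypothetical coordinates; it is not MALDIpy's implementation, and the function name and the background value are assumptions.

```python
# Writing per-pixel cluster labels into a 2D image grid (illustrative only).

def project_clusters(coords, labels, background=-1):
    """Build a row-major label image from pixel coordinates.

    coords: list of (x, y) integer pixel indices, zero-based.
    labels: cluster label for each pixel, in the same order as coords.
    Grid positions without a measured pixel keep the background value.
    """
    width = max(x for x, _ in coords) + 1
    height = max(y for _, y in coords) + 1
    image = [[background] * width for _ in range(height)]
    for (x, y), label in zip(coords, labels):
        image[y][x] = label
    return image


# Hypothetical toy section: three measured pixels on a 2 x 2 grid.
toy_coords = [(0, 0), (1, 0), (1, 1)]
toy_labels = [1, 1, 10]
label_image = project_clusters(toy_coords, toy_labels)
```

The resulting grid can be rendered with any image viewer and overlaid on the histology image of the same section, which is what makes annotation by co-registration possible.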

We hypothesized the cell type-specific metabolomic signature should also be reflected at the transcriptomic level, such as unique expression of genes encoding enzymes that are associated with metabolites of interest. As mentioned above, SM(d18:1/16:0) displayed specific accumulation at the glomerulus region in the kidney cortex (Figure 3E; Mendeley Data). CERS6 (Ceramide Synthase 6) can convert palmitoyl(C16:0)-coenzyme A (CoA) to ceramide that is a precursor of SM(d18:1/16:0) (Figure 3F).46,47 Consistently, CERS6 gene is specifically expressed in podocytes in the kidney cortex as indicated by our SHARE-seq data (Figure 3G). Besides SM(d18:1/16:0), we described a species of phosphatidic acid, PA(18:1(9Z)/15:0), as another Cluster #1 differential metabolite and it is only present in the glomerulus but poorly detected in the kidney medulla and papilla (Figure 3H; Mendeley Data). With MALDIpy, we identified CDS2 (CDP-Diacylglycerol Synthase 2) as an associated enzyme responsible for catalyzing phosphatidic acids to CDP-diacylglycerol (DAG) (Figure 3F). CDP-DAG is a key intermediate of phosphatidylinositol metabolism, which is essential for normal glomerular morphology and function.48,49 CDS2 is highly abundant in juxtaglomerular apparatus (JGA) and also expressed in podocytes (Figure 3G), and we validated its protein expression in kidney glomeruli, especially at JGA, with immunofluorescence on human cortex samples (Figure 3J). The expression pattern of CERS6 and CDS2 was also confirmed by the Human Protein Atlas (Mendeley Data).50 In addition, we identified lysophosphatidylcholine LysoPC(22:4(7Z,10Z,13Z,16Z)) as a Cluster #12 marker accumulated in the renal pelvis region (Figure 3I) and the associated enzymes, Lysophospholipase1/2 (LYPLA1/2), are both specifically expressed in urothelial cells (Figure 3F and 3K), further validating its cluster identity.

Thus, in addition to the transcriptomic and epigenomic heterogeneity revealed by our SHARE-seq analysis, we further highlight the metabolomic diversity of the human kidney anatomy and demonstrate that MALDIpy can be used for analysis of IMS data at the sub-structure resolution.

The same tubular epithelial cell (TEC) type can have distinct transcriptomic and epigenomic signatures depending on regional location

Next, we aimed to leverage the SHARE-seq multiomics data and the spatially-resolved metabolomics data to study whether different anatomical locations may lead to differential molecular signatures even for the same terminally differentiated TECs. In the aforementioned SHARE-seq analysis, we found that clusters of the same TEC (tL, PC and TAL) can be stratified by their regional locations (Figure 1F–G). For example, TAL3 cells are mainly identified in the cortex, TAL2 cells are mostly derived from the medulla and TAL1 cells are most abundant in the papilla (Figure 2D and S2A). Very recent snRNA-seq studies also reported region-specific clusters of TAL20 and PC51 cells, but several questions remain, including whether differential transcriptomic signatures are accompanied by changes in chromatin accessibility or metabolomics and whether there are regulatory elements such as transcription factors that are commonly differentially expressed regardless of TEC type.

First, we confirmed that the same TEC showed differential transcriptomic signatures depending on anatomical regions by subclustering analysis of tL (Figure 4A), distal nephron (Figure 4B) and TAL cells (Figure 4C). We identified two tL subclusters that are highly abundant in the medulla (tL-M1/2) and two subclusters more specific to the kidney papilla (tL-P1/2) (Figure 4A and S4A). Mapping these cells back to the whole SHARE-seq dataset indicated that tL-M1/2 and tL-P1 cells mainly represented tL1 cells and tL-P2 cells are mostly derived from the tL2 cluster. These subclusters showed the unique expression of genes such as higher CD200 expression in tL-M1/2 and tL-P1 than tL-P2 cells, as well as higher CA8 (carbonic anhydrase 8) expression in papillary tL than medullary tL cells (Figure 4D and S4B). We identified increased chromatin accessibility at the promoter or intragenic regions of these genes in tL cells (Figure 4E and S4C), suggesting a differential epigenomic signature.

Figure 4. tL, PC and TAL cells have distinct signatures depending on regional locations.

(A-C) Subclustering analysis of tL (A), distal nephron (B) and TAL cells (C). Cells are colored by either cluster annotations (left) or sample origins (right).

(D) Dot plots showing cluster-specific gene expression.

(E) Coverage plots of marker genes of tL, PC and TAL cells. Differential accessible regions are highlighted.

(F) ANK2 expression in TAL cells.

(G) Immunostaining of ANK2, UMOD and LTL on kidney cortex and medulla samples. Scale bars: 25 μm.

(H) TFs sorted by regulatory score on TAL-M vs. TAL-C (left) and TAL-P vs. TAL-M (right).

(I) Immunostaining of AQP2 (green) and UMOD (red) on serial MALDI sections of cortex (top), medulla (middle) and papilla (bottom) tissues. See Mendeley Data for scale bars and whole-area scanning.

(J) Comparative analysis between immunostaining, metabolomics clustering and features. Regions of interest are highlighted in Figure 4I. 1st row: AQP2+ collecting ducts are indicated by arrows; UMOD+ TAL cells are indicated by triangles. 2nd row: Leiden clustering, with color scheme same as Figure 3B. 3rd/4th rows: Visualization of metabolites C45H78NO7PNa and C42H80NO8PNa. 5th row: Concurrent visualization indicates the two metabolites do not colocalize with each other.

(K) UMAP presentations of two features, C45H78NO7PNa and C42H80NO8PNa, with chemical structures shown.

Subclustering analysis of distal nephron cells presented one cluster of PC cells more abundant in the cortex and medulla (PC-M) and another enriched in papilla (PC-P) (Figure 4B and S4D), each showing unique DEGs and consistent chromatin accessibility variations (Figure 4D and S4E–F). For example, two type IV collagen encoding genes COL4A3 and COL4A4 have a head-to-head conformation in the human genome,52 and we found both genes showed increased chromatin accessibility at the promoter region and higher gene expression in PC-P than PC-M (Figure 4D and S4F). A recent snRNA-seq study51 reported a lower expression of RALYL (RNA-Binding Raly-Like Protein) in papillary PC, and we further identified a decreased chromatin accessibility of this gene in PC-P, compared with PC-M (Figure 4E).

Using a similar approach, we defined TAL cells in the kidney cortex, medulla and papilla (TAL-C/M/P), respectively (Figure 4C and S4G). TAL-C cells exhibited higher expression of PAPPA2, accompanied by increased chromatin accessibility at its promoter region, compared with TAL-M and TAL-P cells (Figure 4DE and S4H), concordant with a previous study on rat kidneys.53 It is known that TAL cells located in different anatomical regions have distinct morphology and metabolic functions, such as different ion reabsorption preferences.5 Therefore, we surveyed ion transporter-related genes among the DEGs identified between TAL subclusters. We found that ANK2 (Ankyrin 2) was upregulated in TAL-M/P compared to TAL-C cells (q<0.0001) (Figure 4F and S4J) and also showed increased intragenic chromatin accessibility in the medullary and papillary regions (Figure S4I). Ankyrin 2 is important for maintaining the spectrin-actin cytoskeleton and for membrane stabilization of many ion transporters in various cell types.54 We performed immunofluorescence by co-staining ANK2 with UMOD, a well-known TAL marker regardless of regional location (Mendeley Data), and validated that ANK2 was lowly expressed in cortical UMOD+ cells, whereas most medullary UMOD+ cells co-expressed ANK2 (Figure 4G).

Next, we aimed to identify potential transcription factors (TFs) that are associated with kidney region-specific transcriptomic and epigenomic signatures. We surveyed the lists of DEGs and differentially accessible regions across the aforementioned subclusters of tL, PC and TAL cells, and identified that RUNX1 (Runt Related Transcription Factor 1) showed increased gene expression along the corticopapillary axis regardless of TEC type (i.e., higher expression in tL-P vs. tL-M, PC-P vs. PC-M and TAL-P vs. TAL-M vs. TAL-C) (Figure 4D), as well as consistently varied chromatin accessibility at the intragenic region (Figure S4C, S4F, S4I). To further screen TF drivers that potentially shape the gene regulatory networks in different kidney regions, we leveraged a recently described computational method13 which simultaneously predicts associations of genes with both cis-regulatory elements and TFs. This analysis on TAL-P vs. TAL-M vs. TAL-C (Figure 4H), tL-P vs. tL-M (Figure S4K) and PC-P vs. PC-M (Figure S4L) all indicated RUNX1 as a regulatory TF in maintaining the identity of these TECs in different anatomical regions. Since TFs bind to specific genomic sequences (i.e., motifs) to regulate target gene expression, we performed snATAC-seq motif inference analysis37 and further validated an increased RUNX1 motif activity in the papilla, compared to the medulla or cortex, for cells of tL, TAL and PC (Figure S4M).

To summarize, we comprehensively analyzed the transcriptomic and epigenomic landscapes of three types of TECs, tL, TAL and PC, and identified that these cells showed different signatures depending on regional locations. Computational TF inference analysis prioritized RUNX1 as a potential regulatory driver in maintaining the region-specific cell identity.

Distinct metabolomic profiles in cortical and medullary TAL cells

In addition to the distinct transcriptomic and epigenomic profiles in TECs at different kidney regions, we asked whether regional locations might also determine the metabolomic signature. Here, we co-stained UMOD and AQP2, two markers of TAL and PC cells, respectively, with immunofluorescence on serial tissue sections of samples used in our IMS experiment (Figure 4I and first row of Figure 4J), to study whether there are region-dependent metabolomic changes in TAL and PC cells. We found that almost all AQP2+ cells were annotated as Cluster #5 (colored in brown) in our spatially resolved metabolomics analysis, regardless of regional location (first two rows of Figure 4J; marked by arrows), suggesting that PC cells in different anatomical regions share relatively similar metabolomic signatures. By contrast, although most UMOD+ TAL cells colocalized with metabolomic Cluster #5 in the kidney cortex, in the medulla they were mainly assigned to Cluster #7 (colored in blue) (Figure 4J; marked by triangles). We confirmed this finding on all 6 samples analyzed in this study (Mendeley Data). For example, on the medullary tissue which covers a corticomedullary junction (CMJ) region, Cluster #7 metabolomes are more abundant in the medullary than the cortical area (Figure S3H).

We computed differential features of Clusters #5 and #7 and identified C45H78NO7PNa, a phosphatidylethanolamine PE(P-18:0/22:6), as a marker specific to Cluster #5, and C42H80NO8PNa, a phosphatidylcholine PC(20:2/14:0), as a Cluster #7 marker (Figure 4K). TAL cells in the cortex sample exhibited a notable abundance of C45H78NO7PNa, while medullary TAL cells showed significantly higher abundance of C42H80NO8PNa (last three rows of Figure 4J; Figure S4N). On the other hand, AQP2+ PC cells consistently showed C45H78NO7PNa abundance regardless of regional location. Although these two metabolites have rarely been studied, we searched for potential enzymes that are related to the two classes of molecules, which suggested an association between C45H78NO7PNa and phospholipase D2 (PLD2, which preferentially utilizes phosphatidylethanolamine), as well as an association between C42H80NO8PNa and sphingomyelin synthase 2 (SGMS2, which mediates reversible conversion from phosphatidylcholine to ceramide). Surveying the expression of PLD2 and SGMS2 in our SHARE-seq data indicated that both genes showed higher expression in TAL-C than TAL-M cells, and higher in TAL-M than TAL-P cells, but there were no differences in expression levels between PC-M and PC-P cells (Figure S4O), which further reinforced the associations between the genes and the two metabolites.

In summary, we combined our spatially resolved metabolomics analysis with immunostaining of TAL cells to show that cortical and medullary TAL cells have distinct metabolomic signatures.

Unique metabolism-associated gene profiles between nephron segments

We hypothesize that the unique transcriptomic, epigenomic and metabolomic signatures of each kidney cell type, including TECs located in different anatomical regions mentioned above, should be associated with distinct metabolic events in the cells. To survey all potential metabolic events in an unbiased manner with our SHARE-seq data, we leveraged an approach described in a recent study55 in which only a subset of genes associated with metabolic processes were retained for single-cell analysis.

Based on the 446,267 single-cell transcriptomes, we processed a total of 2,111 genes (gene list adapted from the Reactome56 database; available in Mendeley Data) and performed a second-step quality control by retaining a total of 88,113 cells with over 150 genes detected. We projected the annotations obtained from analysis of the 446,267 cells onto these 88,113 cells (Figure 5A; Mendeley Data). Interestingly, we were able to stratify all major kidney cell types with this subset of 2,111 metabolic genes (Figure 5A), with very few exceptions including DCT cells showing a higher similarity with TAL cells (Figure S5A) compared to that at the whole transcriptome level (Mendeley Data). Each kidney cell type presented distinct metabolism-associated gene signatures (Figure S5B) such as the unique expression of lipid metabolism-related gene ACSM2A (an acyl-CoA synthetase) in PT cells and higher expression of CHST11 (a carbohydrate sulfotransferase) in immune cells than the other cell types.
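The gene subsetting and second-step quality control described above can be sketched as follows. This is a minimal illustration in plain Python, not the pipeline used in this study: the function name `filter_cells_by_metabolic_genes`, the toy counts and the toy threshold in the demo are assumptions, and only the rule itself (restrict each cell to the metabolic gene list, then retain cells with more than 150 of those genes detected) is taken from the text.

```python
def filter_cells_by_metabolic_genes(counts, metabolic_genes, min_genes=150):
    """Restrict each cell to the metabolic gene list and keep only cells
    with more than `min_genes` of those genes detected (count > 0).

    counts: dict mapping cell ID -> dict of gene -> raw count (hypothetical layout).
    """
    kept = {}
    for cell_id, gene_counts in counts.items():
        # Keep only metabolic genes that were actually detected in this cell.
        subset = {
            gene: count
            for gene, count in gene_counts.items()
            if gene in metabolic_genes and count > 0
        }
        if len(subset) > min_genes:
            kept[cell_id] = subset
    return kept


# Toy example (illustrative values only; threshold lowered so the demo is readable).
metabolic_genes = {"ACSM2A", "CHST11", "PCK1", "ALDOB"}
counts = {
    "cell_1": {"ACSM2A": 3, "PCK1": 1, "ALDOB": 2, "UMOD": 5},
    "cell_2": {"CHST11": 1, "UMOD": 7},
}
kept = filter_cells_by_metabolic_genes(counts, metabolic_genes, min_genes=2)
print(sorted(kept))
```

In this toy run only `cell_1` passes, because it has three metabolic genes detected while `cell_2` has one; non-metabolic genes such as UMOD are dropped from the retained profile.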

Figure 5. Unique metabolism-associated profiles between nephron segments.

(A) UMAP presentation of cells analyzed by a subset of metabolic genes.

(B-C) Heat maps showing metabolic gene expression (B) and activity profiles (C) of each TEC.

(D) Bar plots showing metabolism-associated gene module scores across kidney cortex (C), medulla (M) and papilla (P). Values are shown as mean with a 95% confidence interval error bar.

(E-H) Bar plots showing metabolism-associated gene module scores.

(I-J) Bar plots showing the osmotic stress score across TECs (I) and across kidney regions (J).

(K) Bar plots showing the osmotic stress score in the subclustering analysis.

(L) SCCPDH gene expression specific to tL and collecting duct cells.

(M) Immunostaining of SCCPDH, AQP2 and UMOD on a human kidney medullary section. Scale bars: 10 μm.

(N) Immunostaining of SCCPDH and SLC44A5 on a medullary section. Scale bars: 12.5 μm.

Enrichment analysis on metabolism-associated DEGs of each cell type identified unique and non-overlapping metabolic terms across these cells (Figure 5B). For example, besides high activity of fatty acid oxidation (FAO), PT cells also showed increased expression of genes involved in gluconeogenesis, such as aldolase-encoding ALDOB and PCK1/2 which encodes phosphoenolpyruvate carboxykinases.57,58 tL cells exhibited higher expression of genes related to glutamine metabolism and glycolysis, supporting previous results summarized in a recent review.59 Glycolysis was also an enriched term for PC cells, though with a distinct set of genes such as PGK1 (phosphoglycerate kinase 1).60 Importantly, all these metabolic genes also exhibited statistically significant cell type-specific gene activities as indicated by our snATAC-seq data (Figure 5C).

Surveying the metabolic scores across different anatomical regions indicated a higher activity of FAO and gluconeogenesis in the kidney cortex, but increased usage of glycolysis and glutamine metabolism in the kidney papilla (Figure 5D). At the cell-type level, we identified PT cells as having the highest scores of FAO and gluconeogenesis (Figure 5EF). On the other hand, tL and PC cells, which are highly abundant in the kidney medulla and papilla, both exhibited low gluconeogenesis scores but relatively high activity of glycolysis and glutamine metabolism (Figure 5GH), explaining the scoring differences across anatomical regions (Figure S5CE). Reanalyzing RNA-seq data from microdissected rat renal tubules39 with the same gene lists also validated PT-specific FAO and gluconeogenesis gene expression and indicated the collecting duct as the segment with the highest activity of glycolysis and glutamine metabolism (Mendeley Data).

Next, we asked which physiological conditions shape this unique metabolic gene signature in medulla/papilla-enriched tL and PC cells. Over the corticopapillary axis, there is a significant gradient in oxygen pressure and osmolarity, leading to increased osmotic cellular stress.61 Therefore, we defined an osmotic stress score by averaging expression of genes related to osmotic cellular stress62 and validated that tL and PC cells are the two TEC types with the highest scores (Figure 5I). The osmotic stress score is higher in the medulla and papilla than in the cortex (Figure 5J). Surveying the score across all kidney cell types indicated that urothelial cells also showed high activity of osmotic stress (Figure S5F). Since we identified that tL and PC cells have distinct transcriptomic and epigenomic profiles when located in different anatomical regions (Figure 4), we wondered whether there is a correlation between region-specific signatures and osmotic cellular stress. We computed the osmotic stress score based on the aforementioned subclustering analysis, and confirmed a higher score of osmotic stress in papillary PC (PC-P) than cortical and medullary PC (PC-M), and in papillary tL (tL-P1/2) than medullary tL (tL-M1/2) cells (Figure 5K).
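A score defined by averaging expression over a gene set, and its comparison between subclusters, can be sketched as below. This is a simplified illustration under stated assumptions rather than the code used in this study: the gene names (`GENE_A` etc.), the expression values and the function names are placeholders, and the score is a plain mean with undetected genes counted as zero, without the background correction that some single-cell toolkits apply.

```python
from collections import defaultdict


def module_score(cell_expression, gene_set):
    """Average normalized expression over a gene set for one cell.
    Genes not detected in the cell are counted as 0."""
    genes = list(gene_set)
    return sum(cell_expression.get(gene, 0.0) for gene in genes) / len(genes)


def mean_score_by_group(cells, labels, gene_set):
    """Mean module score per group label (for example, per subcluster)."""
    totals = defaultdict(float)
    n_cells = defaultdict(int)
    for cell_id, expression in cells.items():
        group = labels[cell_id]
        totals[group] += module_score(expression, gene_set)
        n_cells[group] += 1
    return {group: totals[group] / n_cells[group] for group in totals}


# Placeholder genes and values; the real gene list comes from the cited reference.
stress_genes = ["GENE_A", "GENE_B", "GENE_C"]
cells = {
    "c1": {"GENE_A": 2.0, "GENE_B": 4.0},
    "c2": {"GENE_A": 1.0},
    "c3": {"GENE_A": 3.0, "GENE_B": 3.0, "GENE_C": 3.0},
}
labels = {"c1": "PC-P", "c2": "PC-M", "c3": "PC-P"}
scores = mean_score_by_group(cells, labels, stress_genes)
print(scores)
```

With these toy values the group labeled PC-P receives the higher mean score, which mirrors the direction of the comparison reported in the text but carries no quantitative meaning.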

Due to the high similarity of metabolic gene profiles between tL and PC cells, we next looked for metabolic genes commonly expressed in both cell types. We identified that SCCPDH (putative saccharopine dehydrogenase) was specifically expressed in tL and PC cells and showed lower expression in the other TECs (Figure 5L and S5G). We performed immunostaining and confirmed that in kidney medulla and papilla, SCCPDH+ intracellular particles were mainly located in AQP2+ PC cells (Figure 5M and S5H) and SLC44A5+ tL cells (Figure 5N and S5I), but were poorly detected in the other TECs such as UMOD+ TAL cells.

Overall, we identified unique metabolic gene profiles across nephron segments. Our data identified high expression of genes involved in osmotic cellular stress in tL and PC cells and suggested osmotic cellular stress is potentially associated with region-specific transcriptional programs in these cells.

Multimodal characterization of lipid metabolism in PT cells

Lipid metabolism is highly active in PT cells and can be dysregulated during kidney injury, which is associated with poor disease outcomes.44,63–65 Although the importance of lipid metabolism in PT cells has been increasingly studied, it remains unclear which specific lipid species PT cells utilize and whether transcriptomic variations involved in lipid metabolism are associated with an epigenomic change.

In our spatially resolved metabolomics data, we identified a cluster of PT metabolomes (Cluster #10) with elevated abundance of long-chain acylcarnitines (LCACs) including palmitoylcarnitine and oleoylcarnitine (Figure 3BC, Figure 6AB). LCACs serve as carriers to transport the corresponding long-chain acyl-CoAs from the cytoplasm to the outer mitochondrial membrane, a step catalyzed by carnitine palmitoyltransferase I (CPT1), and are therefore recognized as an indicator of FAO. We detected 23 species of LCACs in this dataset and found that most of them showed specific accumulation in the PT cluster (Figure 6C and S6A), such as linoelaidylcarnitine and tetradecanoylcarnitine (Figure S6BC). Computing the average abundance of all LCAC species (i.e., acylcarnitine score) confirmed that LCAC accumulation was a unique signature of the PT cluster (Figure 6D and S6D). Among the 23 LCAC species, 18 features showed significantly higher abundance in the PT cluster than in the other clusters (Figure 6C). The 18 LCAC species have carbon lengths from 19 to 25 and can be classified into three categories: saturated, unsaturated and hydroxylated LCACs (Figure 6E). We identified palmitoylcarnitine (23-C) and oleoylcarnitine (25-C with one double bond) as the most abundant saturated and unsaturated LCAC species, respectively, and hydroxylated LCACs were relatively less enriched (Figure 6E).
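The carbon counts quoted above (23-C for palmitoylcarnitine, 25-C for oleoylcarnitine) appear to refer to total carbons in the acylcarnitine molecule, i.e., the fatty acyl chain plus the seven carbons of carnitine. Under that assumption, a short sketch relates acyl chain length, unsaturation and hydroxylation to the neutral molecular formula; the function name is hypothetical and the formula rule covers only straight-chain acylcarnitines.

```python
def acylcarnitine_formula(acyl_carbons, double_bonds=0, hydroxyl_groups=0):
    """Neutral molecular formula of a straight-chain acylcarnitine.

    Carnitine contributes 7 carbons; each double bond removes 2 hydrogens;
    each hydroxyl group adds 1 oxygen and leaves the hydrogen count unchanged.
    """
    carbons = acyl_carbons + 7
    hydrogens = 2 * acyl_carbons + 13 - 2 * double_bonds
    oxygens = 4 + hydroxyl_groups
    return f"C{carbons}H{hydrogens}NO{oxygens}"


# Palmitoyl chain (16:0) and oleoyl chain (18:1).
print(acylcarnitine_formula(16))      # C23H45NO4
print(acylcarnitine_formula(18, 1))   # C25H47NO4
```

On this reading, the reported range of 19 to 25 total carbons would correspond to acyl chains of 12 to 18 carbons.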

Figure 6. Surveying proximal tubule lipid metabolism in health and disease.

(A-B) Spatial feature plots of metabolites palmitoylcarnitine and oleoylcarnitine, with the chemical structures shown.

(C) Dot plot showing expression of acylcarnitine species across clusters.

(D) UMAP presentation showing the acylcarnitine score is specific to Cluster #10.

(E) Composition of PT-specific acylcarnitine species with carbon chain length and relative abundance presented.

(F) Pseudotemporal ordering of PT cells with a subset of metabolic genes.

(G) Relative gene expression (top) or activity (bottom) of genes involved in FAO (left) or lipid accumulation (right) across PT subclusters. Data are shown as mean ± SEM. All genes are dysregulated with statistical significance (p<0.0001).

(H) Bar plots showing FAO score across PT subclusters. Data are shown as mean with a 95% confidence interval error bar.

(I) Expression of CYP4A11 over pseudotime (left), chromatin accessibility (middle) and gene expression (right) in PT and PT_VCAM1 cells.

(J) qPCR analysis of RPTEC/TERT1 cells treated with TNF-α.

(K) Seahorse analysis of RPTEC/TERT1 cells treated with TNF-α.

(L) Bar plots showing lipid accumulation score across PT subclusters.

(M) Expression of FAAH2 over pseudotime (left), chromatin accessibility (middle) and gene expression (right) in PT and PT_VCAM1 cells.

(N) Quantification of ORO staining in primary RPTECs treated with oleate or palmitate fatty acids.

(O) qPCR analysis of lipid accumulation genes on primary RPTECs. *p < 0.05, **p < 0.01, ***p < 0.001 and ****p < 0.0001 by Student’s t test.

In our SHARE-seq dataset, we identified PT cells both in health and in the diseased states (PT_dediff and PT_VCAM1) (Figure 1F). These cell clusters have been characterized by prior studies6,13,14,31 and showed reduced expression of healthy PT marker genes such as LRP2 and increased injury marker expression such as VCAM1 in PT_VCAM1 cells (Figure 2A). Computational TF inference with joint multiomics analysis (STAR Methods) identified PPARA and HNF4A as healthy PT-promoting TFs, and TCF12 and NFAT5 as TFs driving the PT_VCAM1 cell state (Figure S6E), consistent with a recent study.13 Although the profibrotic and proinflammatory signature of PT_VCAM1 cells has been previously studied, how genes involved in lipid metabolism are dysregulated in these cells remains unexplored. In our single-cell analysis with a subset of 2,111 metabolic genes, PT, PT_dediff and PT_VCAM1 could still be clearly stratified on the reduced dimension space (Figure 5A), suggesting distinct metabolism-associated gene profiles.

We validated the unique metabolic gene profiles between healthy and diseased PT cells by trajectory inference analysis with the 2,111 metabolic genes, and identified a trajectory from PT to PT_VCAM1 cells, with PT_dediff cells located in between (Figure 6F and S6F). We found genes involved in FAO, including β-oxidation, ω-oxidation and regulation of FAO activity, were significantly downregulated in PT_dediff and PT_VCAM1, compared to PT cells (Figure 6G, left panel). Gene module scoring analysis also identified the lowest FAO activity in PT_VCAM1, and PT_dediff cells showed an intermediate FAO score (Figure 6H). In addition, these FAO-related genes all exhibited significantly reduced gene activities in PT_VCAM1 cells in our snATAC-seq analysis (Figure 6G, left panel), indicating chromatin accessibility variations (Figure S6G; Mendeley Data). For example, CYP4A11, which catalyzes fatty acid ω-oxidation, showed reduced gene expression over the PT→PT_VCAM1 pseudotime and decreased chromatin accessibility at the promoter region (Figure 6I).

TNF-α treatment can induce VCAM1 upregulation in PT cells through activation of NF-κB signaling.66,67 Consistent with this, in our SHARE-seq analysis, PT_VCAM1 cells showed increased expression of genes involved in the TNF-α/NF-κB signaling complex and increased NF-κB pathway activity, compared with PT cells (Mendeley Data). Therefore, to validate compromised FAO in PT cells with VCAM1 upregulation, we treated hTERT-renal proximal tubule epithelial cells (RPTECs) with TNF-α. This significantly increased expression of NF-κB subunit genes NFKB1 and RELB, as well as injury marker genes VCAM1 and HAVCR1 (Figure 6J and S6H).31 After TNF-α treatment, we identified significantly reduced expression of FAO genes such as CPT1A, ACOX1 and SLC27A2 (Figure 6J). We measured the real-time oxygen consumption rate (OCR) and extracellular acidification rate (ECAR) with the Seahorse metabolic assay, which revealed significantly decreased OCR, but not ECAR, at the baseline condition (Figure 6K and S6I), supporting reduced FAO activity.

Sustained inhibition of FAO leads to increased lipid accumulation in PT cells.7,63 In the above enrichment analysis, we observed that biological terms related to lipid accumulation, including lipid transport, lipid droplet localization and ether lipid biosynthesis, were activated in PT_dediff and PT_VCAM1 cells compared to PT, and the associated genes also presented significantly increased chromatin accessibility as indicated by gene activities (Figure 6G, right panel; Mendeley Data). In a similar fashion, we defined the lipid accumulation score, which remained low in healthy PT cells, but was elevated in the PT_dediff and PT_VCAM1 clusters (Figure 6L). We identified increased expression of FAAH2 (encoding a lipid hydrolysis enzyme) and ABCC1/3 (encoding ATP-binding cassette transporters for lipids) as a function of pseudotime, as well as an increased promoter chromatin accessibility (Figure 6M and S6J).

Next, we asked whether the increased gene expression was a consequence of increased lipid deposition. We recently described BSA-conjugated fatty acid exposure on primary human RPTECs as an in-vitro model of lipid accumulation.7 Therefore, we treated RPTECs with BSA-oleate/palmitate fatty acids, which led to significantly increased intracellular lipids marked by Oil Red O (ORO) staining (Figure 6N and S6K). We identified significantly increased expression of FAAH2 and ABCC3 upon treatment with either fatty acid species (Figure 6O) and increased ABCC1 expression after BSA-oleate exposure (Figure S6L). These results suggest that intracellular lipid accumulation is sufficient to induce upregulation of these genes.

To summarize, we identified enrichment of acylcarnitine species in PT cells with spatially resolved metabolomics. Transcriptomic and epigenomic analysis on metabolic gene profiles revealed dysregulated lipid metabolism, including reduced FAO and increased lipid accumulation, in the diseased PT cell state.

Clinical data integration identified new pathogenic candidate genes

Next, we aimed to demonstrate the use of our SHARE-seq data for translational applications by integrating clinical data of the enrolled donors, especially disease indicators such as serum creatinine and blood urea nitrogen (BUN) levels. Donors with clear classifications [i.e., either healthy control, acute kidney injury (AKI) or chronic kidney disease (CKD)] were processed for this analysis (Table S1).

We first confirmed the correlation between kidney disease and expression of well-known disease markers. By analyzing the PT cell subset for each sample, we observed significantly increased expression of PT injury marker genes (HAVCR1, VCAM1 and TNIK) as well as decreased expression of differentiation markers such as HNF4A, in AKI or CKD donors compared with controls (Figure 7AB and S7A). Genes associated with FAO exhibited reduced expression in CKD samples (Figure S7B), supporting our previous notion that lipid metabolism is dysregulated in disease. We found the expression levels of HAVCR1 and VCAM1 were positively correlated with patient creatinine concentrations (Figure 7CD), though there was no significant correlation with BUN levels (Figure S7C). The expression of collagen genes increases as a function of donor age (Figure 7E). In addition, AKI and CKD samples were associated with increased expression of ECM-encoding genes (Figure 7F and Figure S7DE) and we identified an increased proportion of fibroblasts in CKD samples (Figure S7F), concordant with existing knowledge.
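The correlation of per-donor marker expression with creatinine concentration, using a Pearson coefficient and a simple linear regression fit, can be sketched in plain Python as below. The function name and the toy numbers are assumptions for illustration and are not data from this study; the p-value computation is omitted.

```python
from math import sqrt


def pearson_and_linear_fit(x, y):
    """Return (r, slope, intercept) for a least-squares line y = slope * x + intercept."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    r = sxy / sqrt(sxx * syy)          # Pearson correlation coefficient
    slope = sxy / sxx                  # ordinary least-squares slope
    intercept = mean_y - slope * mean_x
    return r, slope, intercept


# Hypothetical values: creatinine (mg/dL) and mean marker expression per donor.
creatinine = [0.8, 1.0, 1.5, 2.2, 3.0]
expression = [0.10, 0.12, 0.20, 0.31, 0.40]
r, slope, intercept = pearson_and_linear_fit(creatinine, expression)
print(f"r={r:.3f}, slope={slope:.3f}, intercept={intercept:.3f}")
```

Each point in such an analysis is one donor, so the number of donors, not the number of cells, determines the effective sample size of the correlation.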

Figure 7. Clinical data integration identified target genes in disease progression.

(A-B) Violin plots showing expression of VCAM1 (A) and HAVCR1 (B) in all PT cells across donors of control, AKI and CKD. *p < 0.05, **p < 0.01 and ***p < 0.001 by Mann-Whitney U test.

(C-D) Correlation between patient creatinine levels and expression of VCAM1 (C) and HAVCR1 (D) in PT cells with a simple linear regression fit. Pearson correlation coefficient (r) and p-value (p) are shown. Color scheme presented in Figure 7E.

(E) Correlation between patient ages and expression of collagen genes in kidney cortex.

(F) Violin plot showing expression of COL1A1.

(G) 13 genes associated with diseased PT cell state with clinical significance were selected based on four criteria.

(H) Correlation between patient creatinine levels and expression of two candidate genes PPFIBP1 (left) and PLEKHA1 (right) in PT cells.

(I) PPFIBP1 or PLEKHA1 were knocked down in primary RPTECs.

(J-K) Heat maps showing dysregulated genes after siPPFIBP1 (J) or siPLEKHA1 (K) treatments with gene ontology associations presented. CPM, counts per million.

Next, we aimed to identify candidate genes that are potentially important for cell state transition of diseased PT cells. By intersecting genes that (1) are upregulated in AKI/CKD compared to control samples, (2) are upregulated in PT_VCAM1 compared to PT cells, (3) show significantly higher expression in patients with high creatinine levels (≥1.3 mg/dL) than low-creatinine samples and (4) show significantly increased expression as a function of creatinine levels in the linear regression model (Figure 7G), we identified 13 candidate genes meeting all criteria (Figure S7G–S7H). These included Liprin-β1 (PPFIBP1)68 and Pleckstrin Homology Domain-Containing A1 (PLEKHA1)69 as two genes with no prior connection to kidney in published work (Figure 7H). With immunofluorescence staining, we validated that PPFIBP1 expression was mainly identified in PT in the kidney cortex and its expression was increased in PT cells expressing VCAM1 (Mendeley Data). Reanalyzing existing microarray datasets70,71 identified upregulation of PPFIBP1 and PLEKHA1 in patients with CKD, including diabetic nephropathy, focal segmental glomerulosclerosis and lupus nephritis, compared with controls (Mendeley Data). Surveying recently published single-cell studies further validated upregulation of PPFIBP1 and PLEKHA1 in PT cells in mouse models of kidney injury,7 and in human autosomal dominant polycystic kidney disease72 and diabetic kidney disease (Mendeley Data).29 Using a similar approach, we were able to identify disease-related candidate genes that are specifically expressed in other cell types such as TAL and DCT (Figure S7H; Mendeley Data), further demonstrating the utility of this single-cell human kidney atlas for translational studies.
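The four-criteria selection described above amounts to intersecting four gene lists, with samples first split at the creatinine threshold of 1.3 mg/dL. A minimal sketch follows; the function names, sample IDs and the placeholder genes (`GENE_X` etc.) are hypothetical, and the statistical tests that would produce each list are not shown.

```python
def split_by_creatinine(creatinine_by_sample, threshold=1.3):
    """Split sample IDs into high (>= threshold, mg/dL) and low creatinine groups."""
    high = sorted(s for s, v in creatinine_by_sample.items() if v >= threshold)
    low = sorted(s for s, v in creatinine_by_sample.items() if v < threshold)
    return high, low


def intersect_criteria(*gene_sets):
    """Genes that satisfy every criterion, returned in sorted order."""
    result = set(gene_sets[0])
    for genes in gene_sets[1:]:
        result &= set(genes)
    return sorted(result)


# Hypothetical creatinine values (mg/dL) for three samples.
high, low = split_by_creatinine({"s1": 0.9, "s2": 1.3, "s3": 2.4})
print(high, low)

# Illustrative gene lists; only PPFIBP1 and PLEKHA1 are named in the text.
up_in_aki_ckd = {"PPFIBP1", "PLEKHA1", "GENE_X", "GENE_Y"}
up_in_pt_vcam1 = {"PPFIBP1", "PLEKHA1", "GENE_X"}
up_in_high_creatinine = {"PPFIBP1", "PLEKHA1", "GENE_Y"}
positive_regression = {"PPFIBP1", "PLEKHA1", "GENE_Z"}
candidates = intersect_criteria(
    up_in_aki_ckd, up_in_pt_vcam1, up_in_high_creatinine, positive_regression
)
print(candidates)
```

Requiring all four criteria is deliberately conservative: a gene that passes only the cell-state comparison or only the clinical comparisons is excluded.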

To further examine potential roles for PPFIBP1 and PLEKHA1 in kidney disease progression, we performed in vitro gene siRNA knockdown on primary human RPTECs (Figure 7I; Mendeley Data) and analyzed the cells by bulk RNA-seq. Compared with non-targeting control (siNT), PPFIBP1 and PLEKHA1 siRNA (siPPFIBP1/siPLEKHA1) treatments decreased PPFIBP1 and PLEKHA1 expression by 88.0% and 91.4%, respectively, confirming successful gene knockdown. siPPFIBP1 treatment led to significant upregulation of 448 genes and downregulation of 457 genes, and siPLEKHA1 treatment resulted in dysregulation of a broader repertoire of genes (2,062 upregulated genes and 2,022 downregulated genes). siPPFIBP1 had no significant impact on PLEKHA1 expression, but siPLEKHA1 decreased PPFIBP1 expression by 26.0% (Figure 7JK). Enrichment analysis on the DEGs after siPPFIBP1 treatment identified increased gene expression in lipid metabolic processes [e.g., PPARA, ACAT2, SPTLC2 (serine palmitoyltransferase long chain base subunit 2)] and ECM organization, as well as reduced expression of genes related to apoptotic signaling and negative regulation of stress-activated signaling pathways (Figure 7J), suggesting a role of PPFIBP1 in lipid metabolism and stress responses. On the other hand, siPLEKHA1 treatment resulted in downregulation of genes related to cell proliferation (e.g., MKI67), cell cycle regulation and cell division, while upregulating differentiated PT markers such as HNF4A and SLC6A8 and genes involved in hypoxia response (Figure 7K), indicating a pro-differentiation effect after PLEKHA1 knockdown in PT cells.

Next, we studied the impact of PPFIBP1/PLEKHA1 knockdown in cells of injured states by TNF-α stimulation on siNT/siPPFIBP1/siPLEKHA1-treated cells. We identified significantly reduced HAVCR1 and VCAM1 expression after PLEKHA1 knockdown compared with siNT controls (Figure S7I; Mendeley Data), further supporting the pathogenic role of PLEKHA1, although expression of FAO genes such as CPT1A and ACOX1 was not changed significantly. PPFIBP1 knockdown upregulated VCAM1 and moderately increased SPTLC2 expression (Figure S7I), though it did not significantly affect HAVCR1, CPT1A and ACOX1 expression. Therefore, clinical data integration enabled us to identify candidate genes associated with kidney disease, and our RNA-seq and in-vitro analysis suggested PLEKHA1 could be a possible therapeutic target, as its knockdown could increase gene expression in PT differentiation and reduce HAVCR1 and VCAM1 expression.

DISCUSSION

The human kidney medulla and papilla play important roles in maintaining fluid and electrolyte homeostasis. They are vulnerable to various kidney diseases such as ischemic injury, bacterial infections, medullary cystic kidney diseases, renal hypodysplasia and papillary necrosis.73–75 However, due to sample availability, human kidney cortical samples are still the major sample source of translational studies. For example, 85 out of 89 kidney samples enrolled in the large Genotype-Tissue Expression (GTEx) project76 were derived from the cortex. Our work here provides a valuable resource of single-cell multimodal characterization of major kidney anatomical regions.

Nephron epithelia from the same segment differentiate from common progenitors during development,1 and therefore share a common gene expression signature. By contrast, growing evidence suggests that these TECs exhibit different physiological functions, biophysical properties and morphologies based on their regional locations,5 suggesting the presence of region-specific signatures within a single cell type. Here, we defined such distinct transcriptomic and epigenomic region-specific profiles for tL, TAL and PC cells by SHARE-seq analysis. Incorporation of spatially resolved metabolomics data further demonstrated a distinct metabolomic profile between cortical and medullary TAL cells. In addition, by analyzing metabolism-associated genes, our study identified unique metabolic gene profiles between these TECs and highlighted an association with region-specific environmental characteristics such as osmotic pressure gradient. Of note, the regional heterogeneity of endothelial cells and its association with osmotic stress has been described in mouse kidneys.77

PT cells have high baseline energy requirements and predominantly utilize lipids as the major ATP-generating fuel source.44,63,78 In this study, we not only defined the composition of diverse acylcarnitine species in PT cells using spatially resolved metabolomics, but also identified PT in the diseased cell state with SHARE-seq analysis, which exhibited reduced FAO and increased lipid accumulation gene profiles. Defective lipid metabolism involves reduced CPT1 activity, resulting in decreased acylcarnitine turnover rate. Interestingly, we observed a less robust detection of acylcarnitines in the cortex sample of Donor #2, a patient with AKI, compared to the healthy control Donor #1 who had relatively normal kidney function (creatinine levels described in Table S2). This observation is consistent with the known downregulation of FAO in chronic kidney disease.63

The presented SHARE-seq protocol25 is based upon split-pool barcoding.15,79–82 SHARE-seq does not require advanced equipment and its protocol is highly modifiable. Here, we optimized the original protocol specifically for primary human kidney samples (Table S3). For example, with our optimized tissue homogenization method, nearly 500 thousand paraformaldehyde-fixed nuclei can be extracted from a kidney biopsy sample as small as 1-cm length×16-gauge thickness. Our libraries were sequenced at a depth relatively lower than the commercialized 10X Chromium assays (~18,000 and 20,000 raw reads per cell for snRNA-seq and snATAC-seq libraries, respectively), but the gene detection sensitivity was still sufficient to stratify major kidney cell types.

We developed MALDIpy as a Python-based package for efficient analysis of IMS data. Detailed tutorials of MALDIpy installation and usage are available on our GitHub page (Data and Code Availability). Here, with a combination of methods including co-registration of histology staining on the same MALDI sections, co-registration of immunostaining on serial sections and literature search, we successfully annotated multiple kidney cell types and structures such as cells of glomerulus, PT, TAL, PC and renal pelvis. Metabolomic identification of all kidney cell types would require a larger sample cohort, a higher imaging mass spectrometry profiling resolution and the use of highly multiplexed imaging methods on serial sections. For example, in this work, the metabolomic Cluster #1 cells are not only identified in kidney glomerulus, but also detected in the medullary ray regions in Donor #2, and many low-abundance interstitial cell types remain to be further stratified.

Recent human kidney cell atlases11,20 demonstrated the utility of integrating single-cell multiomics with spatially resolved transcriptomics to reveal kidney heterogeneity and disease phenotypes. For example, Abedini et al.11 included a significant number of patients with diabetic kidney disease and hypertensive kidney disease, and identified the fibrotic microenvironment in the human kidney by integrating single-cell and spatial transcriptomics data. Compared with these studies, this work places a special emphasis on molecular characterization of different kidney anatomical structures and on joint transcriptomic and epigenomic profiling in the same cells, integrated with spatially resolved metabolomics analysis.

In summary, we leveraged high-throughput SHARE-seq and high-resolution spatially resolved metabolomics technologies to comprehensively study transcriptomic, epigenomic and metabolomic differences across human kidney anatomical regions. We reveal that tL, TAL and PC cells exhibit distinct signatures depending on their regional locations, highlighting kidney anatomical heterogeneity. Nephron segments showed unique and non-overlapping metabolic gene profiles and we presented dysregulated lipid metabolism in diseased PT cells. Integrating this multimodal dataset with clinical information identified candidate genes associated with kidney disease progression. This multiomic single cell atlas will serve as a resource to advance our understanding of human kidney physiology and disease.

Limitations of study

Though we have profiled all major kidney anatomical regions (cortex, medulla, papilla, renal artery, ureter) in this work, other kidney structures were not included due to sampling difficulty, such as the renal vein, major and minor calyx and capsule. Renal artery samples may contain contaminating cells from kidney cortex during dissection. The sequencing depth of SHARE-seq libraries, especially the snATAC-seq libraries, can be further increased to promote genome coverage and gene detection sensitivity. In addition, spatially resolved metabolomics is limited in the number of features detected per section. Future MALDI-MS studies should further increase the sample size and include highly multiplexed imaging technologies for co-registration analysis to annotate all kidney cell types.

STAR METHODS

RESOURCE AVAILABILITY

Lead Contact

Further information and requests for resources and reagents should be directed to and will be fulfilled by the lead contact, Benjamin D. Humphreys (humphreysbd@wustl.edu).

Materials Availability

This study did not generate new unique reagents.

Data and Code Availability

Raw data (.fastq), processed data (count matrix .h5 files and fragment .bed files), sublibrary primer sequences and metadata of the SHARE-seq data have been deposited in NCBI’s Gene Expression Omnibus and are available through GEO Series accession number GSE234788. Raw and processed bulk RNA-seq data on RPTECs are available through GEO Series accession number GSE240639. Supporting information, including supporting figures and matrices of IMS data, are available in Mendeley Data. The imaging mass spectrometry data can be explored online at https://metaspace2020.eu/project/maldipy_kidney.

Scripts for pipelines of object generation, pseudobulk analysis, single-cell analysis, subclustering analysis and generation of all major figures in this study were written mostly in Python and R, with code available at https://github.com/TheHumphreysLab/SHARE-seq-kidney. Our package for IMS data analysis, MALDIpy, is documented at https://github.com/TheHumphreysLab/MALDIpy (code is also available at https://pypi.org/project/MALDIpy/ and https://doi.org/10.6084/m9.figshare.25254688). Any additional information required to reanalyze the data reported in this paper is available from the lead contact upon request.

Data S1. Unprocessed data underlying the display items in the manuscript, related to Figures 2, 6, S2–4, S6 and S7.

EXPERIMENTAL MODEL AND SUBJECT DETAILS

This research complies with all relevant ethical regulations and has been approved by the Washington University Institutional Review Board. For SHARE-seq library generation, discarded human kidney samples (n=36) were obtained from deceased organ donors under an established Institutional Review Board protocol approved by Washington University in St. Louis. Additional human living kidney donor biopsy samples were obtained from patients at Brigham and Women’s Hospital (Boston, MA) (n=4) and Wake Forest Baptist (Winston-Salem, NC) (n=8). Overall, samples of different kidney regions were derived from a total of 24 donors, including 13 males and 11 females with ages ranging from 19 to 71 (Table S1). For imaging mass spectrometry, 3 samples of different kidney regions were derived from one 28-year-old male and one 52-year-old male organ donor, respectively (n=6 in total; Table S2), under the protocol approved by Washington University in St. Louis. All living donor participants provided written informed consent in accordance with the Declaration of Helsinki, including publication of demographics. Histologic sections were reviewed by a renal pathologist and laboratory data was abstracted from the medical record.

Primary human renal proximal tubule epithelial cells (RPTECs) (CC-2553, Lonza) were cultured in Renal Epithelial Cell Growth Medium (CC-3190, Lonza) with supplements provided in the kit. Primary RPTECs were used in early passage. hTERT immortalized RPTEC cells (RPTEC/TERT1) (CRL-4031, ATCC) were cultured in DMEM:F12 medium supplemented with 0.2% FBS (F4135, Sigma), 0.1 mg/mL G418 (10131035, Gibco) and growth factors provided in the kit (ACS-4007, ATCC). All cells were maintained in a humidified 5% CO2 atmosphere at 37°C unless otherwise specified.

METHOD DETAILS

Sample processing for histology analysis

For nuclei preparation in SHARE-seq, human kidney samples were snap-frozen with liquid nitrogen. For immunofluorescence, kidneys were fixed with 4% paraformaldehyde (PFA) (15714, Electron Microscopy Sciences) in phosphate-buffered saline (PBS) at 4°C overnight, immersed in 30% sucrose at 4°C overnight and embedded in optimum cutting temperature compound (4583, Sakura) to cut sections. For PAS and trichrome staining, kidneys were fixed with 10% formalin at 4°C overnight, which was then switched to 70% ethanol for overnight incubation at 4°C. 5-μm tissue sections were cut from paraffin blocks and processed for histological analysis following the manufacturer’s instructions (Cat#9162A, Newcomer Supply, for PAS staining; Cat#87020, Epredia, for trichrome staining). PAS and trichrome staining images were captured with a Zeiss Axio Scan Z1 light microscope and examined in a blinded fashion.

Immunofluorescence

6-μm tissue sections were fixed with 4% PFA for 10 minutes, rinsed in running water and washed with PBS. Permeabilization was performed with 0.3% Triton X-100 in PBS for 15 minutes at room temperature. Blocking was performed with 1% Bovine Serum Albumin (BSA) in PBS for 30 minutes at room temperature. For each target, sections were stained with the desired primary antibody for 1 hour at room temperature or overnight at 4°C. Sections were washed with PBS (three times; 5 minutes each) and stained with the desired secondary antibody for 1 hour at room temperature. After three washes with PBS for 5 minutes each, sections were counterstained with DAPI and mounted with Prolong Gold. Supplier information of primary antibodies used in this work can be found in Key Resources Table. All images were captured and processed with a confocal microscope (Eclipse Ti, Nikon) and examined in a blinded fashion.

Key Resources Table.
REAGENT or RESOURCE | SOURCE | IDENTIFIER
Antibodies
Sheep anti-UMOD | BioRad | Cat#8595-0054; RRID: AB_620425
Goat anti-AQP2 | Santa Cruz | sc-9882; RRID: AB_2289903
Mouse anti-ANK2 | Santa Cruz | sc-12718; RRID: AB_626673
Rabbit anti-CDS2 | Novus | NBP1-86435; RRID: AB_11019952
FITC-conjugated Lotus Tetragonolobus Lectin (LTL) | Vector Laboratories | FL-1321-2; RRID: AB_2336559
Biotinylated LTL | Vector Laboratories | B-1325-2; RRID: AB_2336558
Rabbit anti-SLC44A5 | Novus | NBP2-30814
Rabbit anti-SCCPDH | Novus | NBP2-97630
Mouse anti-PPFIBP1 | Santa Cruz | sc-514575
Chemicals, peptides, and recombinant proteins
NaCl 5M | Invitrogen | AM9759
MgCl2 1M | Invitrogen | AM9530G
Tris-HCl pH 7.5 1M | Invitrogen | Cat#15567-027
Tris-HCl pH 8.0 1M | Invitrogen | Cat#15568-025
Triton X-100 | Sigma | T8787
Tween 20 | Sigma | P9416
Paraformaldehyde | Electron Microscopy Sciences | Cat#15714
EDTA (0.5 M, pH 8.0) | Boston BioProducts | BM-150-DR
NP-40 | Thermo Scientific | Cat#28324
Pierce Dimethylformamide (DMF) | Thermo Scientific | Cat#20673
Protease Inhibitor Cocktail | Sigma | P8340
Tris Acetate Buffer pH 7.8 | Bioworld | Cat#40120265
Magnesium acetate | Sigma | Cat#63052
Potassium acetate | Sigma | Cat#95843
Polyethylene glycol 6000 (PEG6000) | Sigma | Cat#528877
10% Sodium dodecyl sulfate (SDS) | Sigma | Cat#71736
100 mM Phenylmethanesulfonyl fluoride (PMSF) | Sigma | Cat#93482
Renal Epithelial Cell Growth Medium | Lonza | CC-3190
Recombinant Human TNF-alpha Protein | R&D Systems | 210-TA/CF
Oleate fatty acid | Cayman Chemical | Cat#29557
Palmitate fatty acid | Cayman Chemical | Cat#29558
Lipofectamine Reagent | Thermo Scientific | Cat#13778150
Nuclei EZ Lysis Buffer | Sigma | NUC101
EDTA-free protease inhibitor tablets | Roche | Cat#5892791001
RNasin Ribonuclease Inhibitor | Promega | N2615
SUPERase•In RNase Inhibitor | Thermo Scientific | AM2696
Beverly RNase Inhibitor | Qiagen | Y9240L
NxGen RNAse Inhibitor | LGC Biosearch | Cat#30281-2
BSA | NEB | B9000S
7.5% BSA | Sigma | A8412
dNTP | NEB | N0447L
Maxima H Minus Reverse Transcriptase | Thermo Scientific | EP0753
T4 DNA Ligase Reaction Buffer | NEB | B0202S
T4 DNA Ligase | NEB | M0202L
Proteinase K | Invitrogen | Cat#25530049
Dynabeads MyOne Streptavidin C1 | Invitrogen | Cat#65002
DNA Clean & Concentrator-5 | Zymo | D4014
Select-a-Size DNA Clean & Concentrator MagBead Kit | Zymo | D4084
NEBnext 2× master mix | NEB | M0541L
KAPA HiFi HotStart ReadyMix | Roche | Cat#7958935001
iTaq Universal SYBR® Green Supermix | Biorad | Cat#1725121
AMPure XP Reagent | Beckman Coulter | A63881
20% Ficoll, Type 400 | Sigma | F5415
EvaGreen Dye, 20X | Biotium | Cat#31000-T
Elution buffer | Qiagen | Cat#19086
Critical commercial assays
ImmPRESS® HRP Horse AntiRabbit IgG Polymer Detection Kit, Peroxidase | Vector Laboratories | MP-7401
High-Capacity cDNA Reverse Transcription Kit | Applied Biosystems | Cat#4368813
Seahorse XFe96 FluxPaks | Agilent | Cat#102416-100
Deposited data
Mendeley Data | This study | https://doi.org/10.17632/ydskdv2vrz
Raw and processed SHARE-seq data | This study | GSE234788
Imaging mass spectrometry data matrix | This study | https://metaspace2020.eu/project/maldipy_kidney (exported matrices available in Mendeley Data)
Raw and processed RNA-seq data | This study | GSE240639
Kidney Interactive Transcriptomics | The Humphreys Lab | http://humphreyslab.com/SingleCell/
Experimental models: Cell lines
Primary human renal proximal tubule epithelial cells | Lonza | CC-2553
RPTEC/TERT1 | ATCC | CRL-4031
Software and algorithms
MALDIpy | This study | https://github.com/TheHumphreysLab/MALDIpy; https://doi.org/10.6084/m9.figshare.25254688
Scanpy | Wolf et al.83 | https://scanpy.readthedocs.io/en/stable/
Seurat | Stuart et al.84 | https://satijalab.org/seurat/
Signac | Stuart et al.85 | https://stuartlab.org/signac/
SHARE-seq preprocessing | Ma et al.25 | https://github.com/masai1116/SHARE-seq-alignment and https://github.com/masai1116/SHARE-seq-alignmentV2/
STAR | Dobin et al.86 | https://github.com/alexdobin/STAR
Bowtie2 | Langmead et al.87 | https://bowtie-bio.sourceforge.net/bowtie2/manual.shtml
featureCounts | Liao et al.88 | https://subread.sourceforge.net/
Picard | Broad Institute | https://broadinstitute.github.io/picard/
RSeQC | Wang et al.89 | https://rseqc.sourceforge.net/
Tabix | Li et al.90 | http://www.htslib.org/doc/tabix.html
MACS2 | Zhang et al.91 | https://github.com/macs3-project/MACS
Monocle3 | Cao et al.80 | https://cole-trapnell-lab.github.io/monocle3/
Scrublet | Wolock et al.92 | https://github.com/swolock/scrublet
plot1cell | Wu et al.30 | https://github.com/TheHumphreysLab/plot1cell
Harmony | Korsunsky et al.93 | https://portals.broadinstitute.org/harmony/
DecoupleR | Badia-I-Mompel et al.94 | https://decoupler-py.readthedocs.io/en/latest/index.html
Progeny | Holland et al.95 | https://saezlab.github.io/progeny/
DoRothEA | Holland et al.95 | https://saezlab.github.io/dorothea/
METASPACE | Palmer et al.42 | https://metaspace2020.eu/
chromVAR | Schep et al.37 | http://greenleaflab.github.io/chromVAR/index.html

To co-stain primary antibodies of the same host species (rabbit anti-SLC44A5 and rabbit anti-SCCPDH in this work), a Tyramide signal amplification staining method was used. Briefly, after permeabilization, blocking was performed with 2.5% horse serum (MP-7401, Vector Laboratories) for 1 hour at room temperature. The first primary antibody (1:2000 diluted in 0.1% BSA) was added and incubated overnight at 4°C. After PBS washes, ImmPRESS HRP Horse anti-rabbit IgG polymer reagent (MP-7401, Vector Laboratories) was added and incubated for 30 minutes at room temperature. Sections were washed with PBS (three times; 5 minutes each) and stained with Tyramide-AF488 (B40953, Invitrogen) (1:100 diluted in 0.0015% H2O2 in 10 mM Tris pH 7.5) for 10 minutes at room temperature. After PBS washes, a second blocking was performed with 1% BSA in PBS for 1 hour at room temperature. The second primary antibody (1:100 diluted in 0.1% BSA) was incubated overnight at 4°C and the subsequent steps were performed as described above.

Nuclei isolation

Nuclei isolation from fresh-frozen human kidney samples was performed as previously described31,96 with minor modifications. Briefly, Nuclei EZ Lysis Buffer (NUC101, Sigma) was used and supplemented with EDTA-free protease inhibitor tablets (5892791001, Roche) and RNase inhibitors (N2615, Promega; AM2696, Thermo Scientific). Tissues were minced with a razor blade and homogenized with Dounce Tissue Grinders (885303-0002, Kimble) in ice-cold lysis buffer. The homogenate was filtered through a 200-μm strainer (43-50200-03, pluriSelect) and homogenized again with the Dounce Tissue Grinder. The homogenate was incubated in the buffer for 3 minutes and then filtered again through a 40-μm strainer (43-50040-51, pluriSelect). The homogenate was centrifuged at 500×g for 4 minutes at 4°C and the pellet was resuspended in the lysis buffer with gentle pipetting. After 5-minute incubation, the suspension was centrifuged at 500×g for 4 minutes at 4°C and the nuclei pellet was resuspended with Nuclei Suspension Buffer (NSB), which was freshly prepared by supplementing nuclei buffer (10 mM Tris-HCl pH 7.5, 10 mM NaCl, 3 mM MgCl2) with 1% RNase inhibitor (AM2696, Thermo Scientific) and 1% BSA (B9000S, NEB). For small kidney biopsy sample processing, the strainers were replaced with Cat#43-10200-60 and Cat#43-10040-60 (pluriSelect).

Tn5 transposome preparation

The hyperactive Tn5 construct and the protocol for in-house protein preparation were described previously.28 Briefly, the construct contains two point mutations E54K/L372P and the final protein product was diluted to 1 mg/mL (around 18.7 μM) in 25 mM Tris pH 7.5, 800 mM NaCl, 0.1 mM EDTA, 1 mM DTT, and 50% glycerol.

To prepare Tn5 transposome used in tagmentation in RNA-seq library generation, 100 μM Read1 oligos (TCGTCGGCAGCGTCAGATGTGTATAAGAGACAG) and 100 μM Blocked_ME_complement oligos (/5Phos/C*T*G*T*C*T*C*T*T*A*T*A*C*A*/3ddC/) were mixed at the same volume. The oligo mixture was first heated at 85°C for 2 minutes and slowly cooled down to 20°C at a rate of −1°C/minute. Next, 1 mg/mL naked Tn5 was mixed with an equal volume of dilution buffer (50 mM Tris pH 7.5, 100 mM NaCl, 0.1 mM EDTA, 1 mM DTT, 0.1% NP-40, and 50% glycerol). The diluted Tn5 (2X dilution) was mixed with an equal volume of Read1/Blocked_ME_complement annealed oligos (4X dilution). The mixture was incubated at 25°C for 45 minutes with gentle shaking. Dilution buffer was added to generate a 20X stock. To prepare transposome used in transposition in ATAC-seq library generation, Read1/Blocked_ME_complement annealed oligos and Read2/Blocked_ME_complement annealed oligos were prepared in a similar approach (Read2 primer: /5Phos/GTCTCGTGGGCTCGGAGATGTGTATAAGAGACAG), mixed and added to the same volume of 2X Tn5. After incubation at 25°C for 45 minutes with gentle shaking, dilution buffer was added to generate a 36X stock. All Tn5 transposomes were stored at −20°C before use.
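The annealing ramp and the two successive 1:1 mixes described above can be checked with a few lines of arithmetic. The following is a minimal sketch; the helper names `ramp_minutes` and `dilute` are illustrative and are not part of any published script.

```python
# Sketch: annealing ramp duration and Tn5 concentration after each 1:1 mix.
# Numbers are taken from the protocol text; function names are hypothetical.

def ramp_minutes(start_c: float, end_c: float, rate_c_per_min: float) -> float:
    """Time needed to cool from start_c to end_c at a constant rate."""
    return (start_c - end_c) / rate_c_per_min


def dilute(conc: float, sample_vol: float, diluent_vol: float) -> float:
    """Concentration after mixing sample_vol of stock with diluent_vol."""
    return conc * sample_vol / (sample_vol + diluent_vol)


anneal_time = ramp_minutes(85, 20, 1)   # 85 C down to 20 C at 1 C per minute
tn5_2x = dilute(1.0, 1, 1)              # naked Tn5 (1 mg/mL) + equal volume of buffer
tn5_4x = dilute(tn5_2x, 1, 1)           # diluted Tn5 + equal volume of annealed oligos

print(anneal_time, tn5_2x, tn5_4x)
```

Under these numbers the cooling ramp alone takes a little over one hour, and the Tn5 protein is at one quarter of its starting concentration before the final dilution to the working stock.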

SHARE-seq library generation

Generation of SHARE-seq libraries was performed following a previously described protocol25 with several modifications. Briefly, three types of plates for three rounds of ligation-based split-pool barcoding were pre-prepared by mixing link strand oligos and barcoding oligos (Table S5) into each well of LoBind PCR Plates (0030129512, Eppendorf), with final concentrations of oligos described previously.25 Oligo annealing was performed by heating the plates at 95°C for 2 minutes and then slowly cooling it down to 20°C at a rate of −1°C/minute. The annealed oligos were aliquoted at 10 μL per well to several plates and stored at −20°C for future use.

After nuclei isolation, the nuclei suspension was diluted to 1 million nuclei/mL. Light fixation was performed by adding PFA (15714, Electron Microscopy Sciences) at a final concentration of 0.2% and incubating at room temperature for 5 minutes. Fixation was stopped by adding 56.1x μL 2.5 M glycine, 50x μL 1 M Tris-HCl pH 8.0, and 13.3x μL 7.5% BSA (A8412, Sigma), where x refers to the volume of the nuclei suspension processed. The suspension was incubated at room temperature for 5 minutes and then centrifuged at 500×g for 4 minutes at 4°C. The nuclei pellet was washed twice with 500 μL NB1 buffer [pre-cold NB buffer (10 mM Tris pH 7.5, 10 mM NaCl, 3 mM MgCl2, 0.1% NP-40) supplemented with 1% BSA and 0.5% RNase inhibitor (Y9240L, Qiagen)] and finally resuspended in 100 μL NB1. The nuclei concentration was counted and the suspension was diluted to around 4,000 nuclei/μL. The nuclei suspension was aliquoted into a total of 24 1.5-mL tubes (20,000 nuclei per tube) for subsequent transposition reactions.
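The quench volumes scale linearly with x, and the aliquoting step fixes the nuclei input per transposition reaction. A minimal sketch of both calculations follows. It assumes that x is expressed in mL, which the text does not state explicitly; the function name is illustrative.

```python
# Sketch: scale the fixation quench reagents with the processed volume x.
# ASSUMPTION: x is in mL (the text only says "the volume of the nuclei suspension").

QUENCH_UL_PER_X = {
    "2.5 M glycine": 56.1,
    "1 M Tris-HCl pH 8.0": 50.0,
    "7.5% BSA": 13.3,
}


def quench_volumes_ul(x: float) -> dict:
    """Return the volume (uL) of each quench reagent for a suspension volume x."""
    return {name: round(per_x * x, 1) for name, per_x in QUENCH_UL_PER_X.items()}


volumes = quench_volumes_ul(2.0)

# Nuclei input per transposition reaction and across all 24 tubes,
# using 5 uL of a ~4,000 nuclei/uL suspension per reaction.
nuclei_per_tube = 4000 * 5
total_nuclei = nuclei_per_tube * 24

print(volumes, nuclei_per_tube, total_nuclei)
```

With 24 tubes, one experiment therefore starts from roughly 480,000 fixed nuclei entering transposition.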

A mixture of acetates (Ace-Mix) was pre-prepared by mixing 9.7 mL 200 mM Tris-acetate, 776 μL 5 M K-acetate, 590 μL 1 M Mg-acetate, 600 μL 10% NP-40 and 27.699 mL H2O. Transposition buffer was prepared by adding 282 μL DMF, 7.1 μL Protease inhibitor cocktail (P8340, Sigma) and 20 μL RNase inhibitors (Y9240L, Qiagen and AM2696, Thermo Scientific) to 1181 μL Ace-Mix. For each transposition reaction, 55 μL transposition buffer was added to 5 μL nuclei and incubated at room temperature for 10 minutes. Then, 5 μL 36X assembled Tn5 transposome was added to each reaction and transposition was performed by incubating at 37°C for 1 hour with shaking at 1000 rpm. Next, all reactions were combined and centrifuged at 1000×g for 4 minutes at 4°C. The nuclei pellet was washed with NB2 buffer [pre-cold NB buffer supplemented with 1% RNase inhibitor cocktail (N2615, Promega; AM2696, Thermo Scientific; Y9240L, Qiagen)] and finally resuspended by 180 μL NB2.

For reverse transcription, 720 μL RT mix was prepared by mixing 45 μL 10 mM dNTP (N0447L, NEB), 90 μL 100 μM poly-T primers (Table S5), 112.5 μL Maxima H Minus Reverse Transcriptase (EP0753, Thermo Scientific), 180 μL 5X RT buffer, 270 μL 50% PEG6000 and 22.5 μL RNase inhibitor cocktail (AM2696, Thermo Scientific; Y9240L, Qiagen). Then, 180 μL nuclei suspension was mixed with 720 μL RT mix. The reverse transcription was performed by heating at 50°C for 10 minutes, a total of 4 thermal cycles (8°C for 12 s, 15°C for 45 s, 20°C for 45 s, 30°C for 30 s, 42°C for 120 s and 50°C for 180 s) and final incubation at 50°C for 5 minutes. Next, 1.6 mL NB2 buffer was added to the reaction and the mixture was centrifuged at 1000×g for 4 minutes at 4°C. The pellet was washed with NB2 and finally resuspended by 4.3 mL hybridization mix [430 μL 10X T4 ligation buffer (B0202S, NEB) supplemented with 1075 μL NB2 buffer, 2682 μL H2O, 43 μL 10% Triton X-100 and 70 μL RNase inhibitor cocktail (AM2696, Thermo Scientific; Y9240L, Qiagen)].
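As a sanity check on the volumes listed above, the component volumes can be summed and converted to final concentrations in the 900 μL reaction (720 μL RT mix plus 180 μL nuclei). This is a bookkeeping sketch only; the dictionary keys and helper names are illustrative.

```python
import math

# Component volumes (uL) as listed in the protocol text.
RT_MIX_UL = {
    "10 mM dNTP": 45.0,
    "100 uM poly-T primer": 90.0,
    "Maxima H Minus RT": 112.5,
    "5X RT buffer": 180.0,
    "50% PEG6000": 270.0,
    "RNase inhibitor cocktail": 22.5,
}
HYB_MIX_UL = {
    "10X T4 ligation buffer": 430.0,
    "NB2 buffer": 1075.0,
    "H2O": 2682.0,
    "10% Triton X-100": 43.0,
    "RNase inhibitor cocktail": 70.0,
}


def total_ul(mix: dict) -> float:
    """Sum of all component volumes in a mix."""
    return math.fsum(mix.values())


def final_conc(stock_conc: float, stock_ul: float, total: float) -> float:
    """Final concentration, in the same unit as stock_conc."""
    return stock_conc * stock_ul / total


rt_total = total_ul(RT_MIX_UL)                         # RT mix volume
reaction_ul = rt_total + 180.0                         # plus nuclei suspension
dntp_mM = final_conc(10.0, 45.0, reaction_ul)          # final dNTP (mM)
peg_percent = final_conc(50.0, 270.0, reaction_ul)     # final PEG6000 (%)
rt_buffer_x = final_conc(5.0, 180.0, reaction_ul)      # final RT buffer (X)
hyb_total = total_ul(HYB_MIX_UL)                       # hybridization mix volume

print(rt_total, reaction_ul, dntp_mM, peg_percent, rt_buffer_x, hyb_total)
```

The listed components add up to the stated 720 μL and 4.3 mL, and the RT buffer ends at 1X in the final reaction, which suggests that the recipe is internally consistent.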

Next, for ligation-based split-pool barcoding, 40 μL nuclei suspension was distributed to each well of the pre-prepared 96-well plate containing 10 μL annealed round 1 linker strand and barcoding oligos. The reaction was incubated for 30 minutes at room temperature at 250 rpm. Then, 10 μL 22 μM blocking oligos (CCCATGATCGTCCGAGTCTCGTGGGCTCGG) were added to each well and the reaction was incubated for 30 minutes at room temperature at 250 rpm. The nuclei were then pooled together and re-distributed to the second pre-prepared 96-well plate in which 10 μL 26.4 μM blocking oligos (ATCCACGTGCTTGAGCGCGCTGCATACTTG) were added after incubation. A split-pool barcoding procedure was performed again on the third 96-well plate in which 10 μL 23 μM blocking oligos (GTGGCCGATGTTTCGCATCGGCGTACGACT) were added after incubation. Then, the nuclei were pooled and centrifuged at 1000×g for 4 minutes at 4°C. The pellet was washed twice with NB2 buffer and resuspended in 500 μL ligation mix [1X T4 ligation buffer (B0202S, NEB), 5% T4 DNA ligase (M0202L, NEB), 0.1% Triton X-100, 20% NB2 buffer and 1.5% RNase inhibitor cocktail (N2615, Promega; Y9240L, Qiagen)]. The ligation was performed by incubating at room temperature for 30 minutes at 300 rpm.

After ligation, the nuclei were centrifuged at 1000×g for 4 minutes at 4°C, washed with NB2 buffer and thoroughly resuspended in 100 μL NB2. Cell number was counted and around 20,000 nuclei were distributed to each 1.5-mL tube, where each tube refers to one SHARE-seq sublibrary. NB2 buffer was added to bring the total volume to 47 μL in each tube. 50 μL 2X reverse crosslinking buffer (1mL buffer prepared by mixing 100 μL 1M Tris pH 8, 20 μL 5M NaCl, 40 μL 10% SDS and 840 μL H2O), 2 μL proteinase K (25530049, Invitrogen) and 1 μL RNase inhibitor (AM2696, Thermo Scientific) were added to each reaction. The mixture was incubated at 55°C for 1 hour. Next, 5 μL 100 mM PMSF (93482, Sigma) was added to each reaction to inactivate proteinase K and incubated at room temperature for 10 minutes.
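Three rounds of 96-well barcoding followed by sublibraries of about 20,000 nuclei determine how often two nuclei are expected to carry the same barcode combination. The sketch below estimates that rate under two simplifying assumptions that are not stated in the text: all 96 wells are used in every round, and nuclei are distributed uniformly at random.

```python
# Sketch: expected barcode collision rate within one sublibrary.
# ASSUMPTIONS: 96 usable barcodes per round, uniform random distribution of nuclei.


def barcode_space(wells_per_round: int = 96, rounds: int = 3) -> int:
    """Number of distinct barcode combinations after split-pool barcoding."""
    return wells_per_round ** rounds


def collision_fraction(n_nuclei: int, n_barcodes: int) -> float:
    """Probability that a given nucleus shares its barcode with at least one other."""
    return 1.0 - (1.0 - 1.0 / n_barcodes) ** (n_nuclei - 1)


n_barcodes = barcode_space()
frac = collision_fraction(20_000, n_barcodes)

print(n_barcodes, round(frac, 4))
```

Because each sublibrary later receives its own indexed Ad1 primer, collisions only matter within a sublibrary, which is why limiting each tube to about 20,000 nuclei keeps the expected rate at a few percent.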

To separate RNA-seq sublibraries from ATAC-seq sublibraries, MyOne C1 Dynabeads (65002, Invitrogen) were used. Briefly, the beads were washed twice with 1X B&W-T buffer (5 mM Tris pH 8.0, 1 M NaCl, 0.5 mM EDTA, and 0.05% Tween 20) and washed once with 1X B&W-T buffer supplemented with 5% RNase inhibitor (Y9240L, Qiagen). Then, the beads were resuspended with 2X B&W buffer (10 mM Tris pH 8.0, 2 M NaCl, 1 mM EDTA) supplemented with 10% RNase inhibitor cocktail (N2615, Promega; AM2696, Thermo Scientific; Y9240L, Qiagen). 100 μL beads were mixed with each sample and the mixture was rotated on a rotator at room temperature at 8 rpm for 1 hour. The samples were placed on a magnetic stand to separate the supernatants (snATAC-seq sublibraries) and beads (snRNA-seq sublibraries). The supernatant samples were purified with DNA Clean & Concentrator-5 (D4014, Zymo) and eluted in 20 μL elution buffer. The sublibraries can be stored at −80°C for subsequent library construction.

SHARE-seq snATAC-seq sublibrary construction

The 20 μL snATAC-seq fragment samples were mixed with 25 μL 2X PCR mix (M0541L, NEB), 2.5 μL 10 μM indexed Ad1 primers (Table S5) and 2.5 μL 10 μM P7 primers (CAAGCAGAAGACGGCATACGAGAT). Each sublibrary should be constructed with a unique Ad1 primer. The first PCR reaction was performed by heating samples at 72°C for 5 minutes, 98°C for 30 seconds, a total of 6 thermal cycles (98°C for 10 s, 65°C for 30 s, 72°C for 60 s) and final incubation at 72°C for 5 minutes. Next, right-sided size selection was performed with Select-a-Size DNA Clean & Concentrator MagBead Kit (D4084, Zymo) by mixing 26.4 μL beads with 50 μL PCR products. The supernatant was obtained, mixed with 63.6 μL beads and finally eluted in 21 μL elution buffer.

Then, qPCR was performed to determine the number of PCR cycles needed for library amplification. Briefly, 1 μL eluted samples were mixed with 5 μL iTaq SuperMix (1725121, BioRad), 0.5 μL 10 μM corresponding indexed Ad1 primers, 0.5 μL 10 μM P7 primers and 3 μL H2O. qPCR reactions were performed by heating samples at 98°C for 30 s and 20 thermal cycles (98°C for 10 s, 65°C for 30 s, 72°C for 60 s). The number of extra cycles was determined as the number of qPCR cycles to reach 1/3 of saturation.25 Next, the second PCR reaction was performed with the same cycler condition as described above. Finally, the PCR products were purified with 1.7X Ampure XP beads or the Zymo MagBeads and eluted with 20 μL elution buffer.
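The rule of amplifying for the number of cycles needed to reach 1/3 of saturation can be read off an amplification curve programmatically. The following is one possible reading of that rule, not the authors' script: it takes the first cycle at which the baseline-subtracted signal reaches one third of the plateau. The fluorescence values are made up for illustration.

```python
# Sketch: number of extra PCR cycles from a qPCR amplification curve.
# ASSUMPTION: "1/3 of saturation" is measured on the baseline-subtracted signal
# and the first cycle at or above that level is used.


def cycles_to_fraction(fluorescence: list, fraction: float = 1 / 3) -> int:
    """Return the first 1-based cycle whose signal reaches `fraction` of the plateau."""
    if not fluorescence:
        raise ValueError("empty amplification curve")
    baseline = min(fluorescence)
    plateau = max(fluorescence)
    threshold = baseline + fraction * (plateau - baseline)
    for cycle, signal in enumerate(fluorescence, start=1):
        if signal >= threshold:
            return cycle
    raise ValueError("threshold never reached; check the fraction argument")


# Hypothetical 20-cycle curve (arbitrary fluorescence units).
curve = [0.0, 0.0, 0.1, 0.2, 0.4, 0.8, 1.6, 3.0, 5.0, 7.0,
         8.5, 9.4, 9.8, 10.0, 10.0, 10.0, 10.0, 10.0, 10.0, 10.0]
extra_cycles = cycles_to_fraction(curve)

print(extra_cycles)
```

Stopping well below the plateau is the usual motivation for this kind of rule, since it limits over-amplification and PCR duplicates in the final library.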

SHARE-seq snRNA-seq sublibrary construction

The beads were washed three times with 1X B&W-T buffer and once with STE buffer (10 mM Tris pH 8, 50 mM NaCl, and 1 mM EDTA), both supplemented with 5% RNase inhibitor (30281-2, LGC Biosearch). For each sample, the beads were resuspended with 100 μL template switch mix [20 μL 20% Ficoll solution, 30 μL 50% PEG6000, 1.8 μL H2O, 5 μL RT Maxima H Minus Reverse Transcriptase, 20 μL 5X RT buffer, 10 μL 10 mM dNTP, 3.2 μL 100 μM TSO oligo (Table S5) and 10 μL RNase inhibitor (30281-2, LGC Biosearch)]. The bead suspensions were rotated on a rotator at 8 rpm for 30 minutes at room temperature and then incubated at 300 rpm for 90 minutes at 42°C, during which the beads were resuspended every 30 minutes by pipetting. 200 μL STE buffer was added to each sample and the supernatant was removed after placing the sample on a magnetic stand. The beads were briefly washed with STE and resuspended in 55 μL PCR mix containing 27.5 μL 2X Kapa HiFi PCR mix (7958935001, Roche), 5.5 μL 4 μM P7 primers, 5.5 μL 4 μM RNA PCR primers (AAGCAGTGGTATCAACGCAGAGT) and 16.5 μL H2O. The first PCR reaction was performed by heating samples at 95°C for 3 minutes, a total of 6 thermal cycles (98°C for 30 s, 65°C for 45 s, 72°C for 3 min) and final incubation at 72°C for 5 minutes. Then, the supernatant was obtained on a magnetic stand.

qPCR was performed to determine the number of PCR cycles needed for library amplification. Briefly, 2.5 μL eluted samples were mixed with 3.75 μL Kapa HiFi PCR mix, 1 μL 4 μM P7 primers, 1 μL 4 μM RNA PCR primers, 0.5 μL 20X EvaGreen dye (31000-T, Biotium) and 1.25 μL H2O. qPCR reactions were performed by heating samples at 95°C for 3 minutes and 20 thermal cycles (98°C for 30 s, 65°C for 20 s, 72°C for 3 min). The number of extra cycles was determined as the number of qPCR cycles to reach 1/3 of saturation.25 Next, the second PCR reaction was performed with the same cycler condition as described above. The cDNA products were then purified with 31.5 μL Ampure XP beads and eluted with 40 μL elution buffer. The concentration of cDNA was quantified on the 4200 TapeStation System (Agilent).

For tagmentation, 50–120 ng cDNA was incubated in 100 μL tagmentation mix containing 50 μL 2X TD buffer (20 mM Tris HCl pH 7.5, 10 mM MgCl2, 20% DMF) and 10 μL pre-prepared 20X Tn5 transposome at 55°C at 250 rpm. For 50–80 ng cDNA, the sample was incubated for 10 minutes. For 80–120 ng cDNA, the sample was incubated for 15 minutes. Next, the sheared cDNA was purified with DNA Clean & Concentrator-5 (D4014, Zymo) and eluted in 20 μL elution buffer. The purified products were mixed with 25 μL 2X PCR mix (M0541L, NEB), 2.5 μL 10 μM P7 primers and 2.5 μL 10 μM indexed Ad1 primers (Table S5). Each sublibrary should be constructed with a unique Ad1 primer. The PCR reaction was performed by heating samples at 72°C for 5 minutes, 98°C for 30 seconds, a total of 7 thermal cycles (98°C for 10 s, 65°C for 30 s, 72°C for 1 min) and final incubation at 72°C for 5 minutes. The PCR products were purified with 0.7X Ampure XP beads and eluted with 20 μL elution buffer.

SHARE-seq library quantification and next-generation sequencing

All SHARE-seq snATAC-seq and snRNA-seq sublibraries were quantified and visualized on the 4200 TapeStation System (Agilent) and balanced to the same molarity for next-generation sequencing. In this study, a total of 15 SHARE-seq experiments (batches) were performed to profile 48 human kidney samples (Table S4). A total of 74 high-quality paired snRNA-seq and snATAC-seq libraries generated from the 15 SHARE-seq batches were selected and sequenced on one paired-end 200-cycle NovaSeq 6000 S2 flow cell and two paired-end 200-cycle NovaSeq 6000 S4 flow cells (Illumina). snRNA-seq and snATAC-seq sublibraries were pooled and sequenced on different lanes of the flow cells, with sequencing parameters listed as: Read1: 50 cycles; Index1: 99 cycles; Index2: 8 cycles; Read2: 50 cycles.
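Balancing sublibraries to the same molarity requires converting a mass concentration and mean fragment size into nanomolar units. A minimal sketch using the standard approximation of 660 g/mol per base pair of double-stranded DNA is shown below; the library names and readings are hypothetical and do not come from the study.

```python
# Sketch: convert TapeStation-style readings to molarity and derive the volume
# each sublibrary contributes to an equimolar pool.
# ASSUMPTION: double-stranded DNA at ~660 g/mol per base pair.


def library_nanomolar(conc_ng_per_ul: float, mean_size_bp: float) -> float:
    """Molarity in nM (equivalently fmol/uL) of a dsDNA library."""
    return conc_ng_per_ul * 1e6 / (660.0 * mean_size_bp)


def pooling_volumes_ul(libraries: dict, fmol_per_library: float) -> dict:
    """Volume (uL) of each library that contains fmol_per_library of DNA."""
    return {
        name: fmol_per_library / library_nanomolar(conc, size)
        for name, (conc, size) in libraries.items()
    }


# Hypothetical readings: (concentration in ng/uL, mean fragment size in bp).
example_libraries = {"ATAC_sub1": (6.6, 500), "RNA_sub1": (3.3, 500)}
pool = pooling_volumes_ul(example_libraries, fmol_per_library=20.0)

print(pool)
```

In this example the less concentrated library contributes twice the volume, so both end up equally represented in the pool.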

SHARE-seq data pre-processing

Pre-processing SHARE-seq sequencing data (.fastq) was performed with previously described scripts (available at https://github.com/masai1116/SHARE-seq-alignmentV2/).25 Briefly, the human hg19 reference genome was created from GRCh37 Release 39. For snRNA-seq data, reads were trimmed and aligned to the hg19 genome with STAR (v2.7.7a).86 Reads were demultiplexed based on barcodes introduced in split-pool barcoding where one mismatched base was allowed and annotated to both exonic and intronic regions of each gene with featureCounts (v2.0.1).88 Barcodes with fewer than 100 reads were ignored. For snATAC-seq data, genome alignment was performed with Bowtie 2 (v2.3.5.1)87 and reads that were unmapped, mapped to Y chromosome and mapped to mitochondria were discarded from analysis. Duplicates were identified and removed with Picard (v2.14.1) (http://broadinstitute.github.io/picard/). For quality check, read distribution was checked using RSeQC.89 This pipeline ultimately generated a count matrix (.h5 file) for snRNA-seq and a fragment profile (.bed file) for snATAC-seq.
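Demultiplexing with one mismatched base allowed amounts to matching each observed barcode against the known barcode list within a Hamming distance of 1. The sketch below illustrates the idea only; it is not the code of the SHARE-seq alignment pipeline, and the 8-nt whitelist is hypothetical.

```python
# Sketch: assign an observed barcode to a whitelist entry, tolerating one mismatch.
# The whitelist and function names are illustrative, not taken from the pipeline.
from typing import Optional


def hamming(a: str, b: str) -> int:
    """Number of mismatched positions between two equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(x != y for x, y in zip(a, b))


def match_barcode(observed: str, whitelist: set, max_mismatch: int = 1) -> Optional[str]:
    """Return the unique whitelist barcode within max_mismatch, else None."""
    if observed in whitelist:
        return observed
    hits = [bc for bc in whitelist
            if len(bc) == len(observed) and hamming(observed, bc) <= max_mismatch]
    # Ambiguous or unmatched barcodes are rejected rather than guessed.
    return hits[0] if len(hits) == 1 else None


WHITELIST = {"ACGTACGT", "TTGCAAGC", "GGATCCTA"}  # hypothetical barcodes

print(match_barcode("ACGTACGA", WHITELIST))  # one mismatch
print(match_barcode("ACGAACGA", WHITELIST))  # two mismatches
```

Rejecting ambiguous matches is a design choice in this sketch; barcode sets are typically built so that a single sequencing error cannot produce a second valid barcode.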

To process the .h5 files for snRNA-seq analysis, Scanpy (v1.7.2)83 was used with the scanpy.read_10x_h5 function. Count matrices of data generated from different NovaSeq flow cells were combined with the anndata.concat function. To process the .bed files for snATAC-seq analysis, fragment profiles of data generated from different NovaSeq flow cells were combined, sorted and indexed with Tabix (v1.9).90 Then, a combined fragment object was created in Signac (v1.6.0)85 with the CreateFragmentObject function. Peak calling was performed with the CallPeaks function (extsize=150) which implemented MACS2 (v2.1.4)91 and a count matrix was created with the FeatureMatrix function. A total of 189,184 features were identified, with output files available in Mendeley Data. In all these analyses, meta information was retained for further batch effect correction.

snRNA-seq data analysis

First, cells with fewer than 200 or more than 5,000 genes detected, cells with fewer than 300 or more than 20,000 reads, and genes present in fewer than 30 cells were removed from the gene count matrix. Cells with barcode sequences not introduced in the split-pool barcoding design (accounting for 0.4% of the total cells) were also removed from analysis. The percentage of mitochondrial reads was calculated for each cell and cells with over 4% mitochondrial reads were removed. Cell doublets were estimated with Scrublet (v0.2.3)92 using the scanpy.external.pp.scrublet function with the expected overall doublet rate set to 0.06 and the number of neighbors set to 30. Cells with a doublet score over 0.2 were annotated as expected doublets and discarded from analysis. A second quality control step was performed by filtering out genes present in fewer than 50 cells. Then, the data were normalized and log-transformed. Highly variable genes (n=5038) were identified with the scanpy.pp.highly_variable_genes function (min_mean = 0.0125, max_mean = 3, min_disp = 0.5) and the effects of total counts per cell and percentage of mitochondrial reads per cell were regressed out using the scanpy.pp.regress_out function. Next, the data were scaled and Principal Component Analysis (PCA) was performed for dimensionality reduction using the scanpy.tl.pca function (svd_solver = ‘arpack’).
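The per-cell thresholds listed above can be collected into a single predicate. The following is a minimal sketch in plain Python rather than the Scanpy calls used in the study; it assumes the gene-count, read-count, mitochondrial and doublet filters are applied independently, and the class and function names are illustrative.

```python
# Sketch: per-cell snRNA-seq quality control using the thresholds from the text.
# ASSUMPTION: each filter is applied independently; names are hypothetical.
from dataclasses import dataclass


@dataclass
class CellStats:
    n_genes: int          # genes detected in the cell
    n_reads: int          # reads assigned to the cell
    pct_mito: float       # percentage of mitochondrial reads
    doublet_score: float  # Scrublet doublet score


def passes_rna_qc(cell: CellStats) -> bool:
    """True if the cell survives all four filters described in the text."""
    return (
        200 <= cell.n_genes <= 5000
        and 300 <= cell.n_reads <= 20_000
        and cell.pct_mito <= 4.0
        and cell.doublet_score <= 0.2
    )


cells = [
    CellStats(n_genes=1200, n_reads=2500, pct_mito=1.1, doublet_score=0.05),  # kept
    CellStats(n_genes=150, n_reads=400, pct_mito=0.5, doublet_score=0.02),    # too few genes
    CellStats(n_genes=3000, n_reads=9000, pct_mito=6.3, doublet_score=0.10),  # high mito
    CellStats(n_genes=4800, n_reads=18000, pct_mito=2.0, doublet_score=0.35), # doublet
]
kept = [passes_rna_qc(c) for c in cells]

print(kept)
```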

Next, Harmony (Harmonypy v0.0.6)93 was used to eliminate potential batch effects across the 15 SHARE-seq batches with the scanpy.external.pp.harmony_integrate function. Then, a neighborhood graph of cells was computed using the scanpy.pp.neighbors function with the number of neighbors set as 30 (metric = ‘cosine’). The neighborhood graph was embedded in two dimensions using Uniform Manifold Approximation and Projection (UMAP) with the scanpy.tl.umap function, in which the effective minimum distance between embedded points was set as 0.1. Leiden clustering was performed with the scanpy.tl.leiden function and marker genes of each Leiden cluster were identified using the scanpy.tl.rank_genes_groups function (method = ‘wilcoxon’). For customized UMAP visualization, the plot1cell package was used (https://github.com/TheHumphreysLab/plot1cell).30 For single-cell cluster annotation, a list of marker genes (summarized in Table S6) was curated from existing cell atlas datasets.6,11,18,20,29,31,35,72,97 Each marker gene was first qualitatively visualized on the UMAP space and then its differential expression pattern was quantitatively evaluated, in which an adjusted p value lower than 0.01 in the Mann-Whitney-U test with the Benjamini-Hochberg correction was considered significant. For cell type subclustering, cells annotated as the subtype in the above analysis were extracted and re-analyzed following a similar procedure (see Code Availability). All analyses were executed with Scanpy (v1.7.2)83 unless otherwise specified.
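As an illustration of the significance criterion, the Benjamini-Hochberg adjustment can be sketched in a few lines of plain Python. The benjamini_hochberg helper and the p values below are hypothetical; in the study the adjusted values come from Scanpy.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p values in the input order."""
    m = len(pvalues)
    # Indices of the p values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p value to the smallest so that adjusted
    # values never increase as the rank decreases (monotonicity step).
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Made-up raw p values for five marker genes.
raw = [0.001, 0.008, 0.039, 0.041, 0.5]
adj = benjamini_hochberg(raw)
significant = [a < 0.01 for a in adj]  # threshold used in the text
print(adj, significant)
```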

snATAC-seq data analysis

A chromatin assay was created from the count matrix with the CreateChromatinAssay function in Signac (v1.6.0),85 which was then converted to a Seurat object with the CreateSeuratObject function in Seurat (v4.1.0).84 The strength of nucleosome signal, the transcription start site (TSS) enrichment score, the fraction of reads in peaks (FRiP) and the fraction of counts overlapping the hg19 genome blacklist were calculated for each cell with the NucleosomeSignal, TSSEnrichment, FRiP and FractionCountsInRegion functions. For quality control, cells with over 400 peaks and fewer than 50,000 peaks detected, nucleosome signal value less than 2.5, TSS enrichment score over 1, FRiP value over 0.1 and blacklist fraction less than 0.05 were retained for analysis. Cells identified as doublets in snRNA-seq analysis were also removed from snATAC-seq analysis. Then, data was normalized with the term frequency-inverse document frequency (TF-IDF) method using the RunTFIDF function. Linear dimension reduction was performed by calculating singular value decomposition (SVD) on the TF-IDF matrix with the RunSVD function.
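To make the TF-IDF step concrete, the sketch below implements one widely used variant, log(1 + TF × IDF × scale factor), in plain Python. The text does not state which variant or scale factor was used, so the formula choice, the scale factor of 10,000 and the tf_idf helper are assumptions for illustration only.

```python
import math


def tf_idf(counts, scale_factor=10_000):
    """TF-IDF normalize a peak-by-cell count matrix.

    counts: list of rows (one per peak), each a list with one value per cell.
    TF  = count / total counts in that cell
    IDF = number of cells / total counts of that peak across cells
    """
    n_peaks = len(counts)
    n_cells = len(counts[0])
    cell_totals = [sum(counts[p][c] for p in range(n_peaks)) for c in range(n_cells)]
    peak_totals = [sum(row) for row in counts]
    normalized = []
    for p, row in enumerate(counts):
        idf = n_cells / peak_totals[p] if peak_totals[p] else 0.0
        out_row = []
        for c, x in enumerate(row):
            tf = x / cell_totals[c] if cell_totals[c] else 0.0
            out_row.append(math.log1p(tf * idf * scale_factor))
        normalized.append(out_row)
    return normalized


# Toy matrix: peak 0 is open in every cell, peak 1 only in the first cell.
toy = [
    [1, 1, 1, 1],
    [1, 0, 0, 0],
]
weights = tf_idf(toy)
print(weights)
```

In this toy matrix the rare peak receives a larger weight than the ubiquitous one in the cell where both are open, which is the behavior the normalization is meant to provide before SVD.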

Next, Harmony93 was used to eliminate potential batch effects across the 15 SHARE-seq batches with the RunHarmony function. Graph-based clustering, non-linear dimension reduction and UMAP visualization were performed with the RunUMAP (dims = 2:30, min.dist = 0.1, n.neighbors = 50), FindNeighbors and FindClusters functions. We emphasize that cells with relatively lower FRiP values should be carefully removed during the snATAC-seq analytic pipeline to avoid FRiP-low artefactual clusters. For customized UMAP visualization, the plot1cell package was used.30 Gene annotation was performed by calculating counts per cell in gene body and promoter regions with the GeneActivity function. Individual clusters were annotated by manually inspecting the activity of lineage-specific genes, integration with the snRNA-seq data of this work (see below) and comparison with existing cell atlas datasets. Peak calling was performed again on each individual cluster and a total of 397,601 features were identified, with output files available in Mendeley Data.

All aforementioned snATAC-seq analyses were performed based on the peak-by-cell matrix mentioned earlier. We also compared the results with analysis based on a genomic bin-by-cell matrix with ArchR98 or SnapATAC99 but did not see improved clustering results.

Cross-modality integration of snRNA-seq and snATAC-seq

After quality control, cells identified in both snRNA-seq and snATAC-seq (n= 324,701) were selected for integration of the two modalities with the Weighted Nearest Neighbor (WNN) method.36 Briefly, the subsets of snRNA-seq and snATAC-seq count matrices were combined into one Seurat object. The snRNA-seq data was normalized and scaled and the effects of total counts per cell, gene counts per cell and percentage of mitochondrial reads per cell were regressed out. PCA dimensionality reduction was performed and batch effects were corrected with Harmony (Harmony_RNA) as described above. The snATAC-seq data was normalized and linear dimension reduction was performed, with batch effect correction with Harmony (Harmony_peaks) as described above. Next, a WNN graph was calculated on Harmony_RNA and Harmony_peaks with the FindMultiModalNeighbors function [dims.list = list(1:50, 1:50), k.nn = 20]. The computed WNN graph was then processed for UMAP visualization and clustering (min.dist = 0.1, n.neighbors = 50).

Pseudobulk analysis

Pseudobulk clustering analysis was performed using decoupleR (v1.2.0).94 For either snRNA-seq (gene expression) or snATAC-seq (gene activity, computed as described above), each cell in the cell-by-feature count matrix was assigned to its original human kidney sample (n=48) based on the barcodes. Then, a sample-by-gene count matrix was generated from the cell-by-feature count matrix by aggregating cells that originate from the same sample with the decoupleR.get_pseudobulk function. A total of 8,836 genes (snRNA-seq) or 16,284 genes (snATAC-seq) were detected in all 48 samples and retained for analysis. The pseudobulk data was normalized, log-transformed and scaled. With Scanpy, PCA dimensionality reduction was performed and a neighborhood graph was computed and embedded in two dimensions using UMAP. To co-embed the two modalities on the same UMAP, the pseudobulk matrices of raw data were combined where the intersected genes were selected for analysis [anndata.concat(join=“inner”)]. A similar dimension reduction and clustering pipeline was performed.
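The aggregation step can be pictured with a small plain-Python sketch. It sums counts per sample and keeps genes detected in every sample; summation as the aggregation mode, the helper names and the toy barcodes are assumptions for illustration and do not reproduce the decoupleR implementation.

```python
from collections import defaultdict


def pseudobulk(cell_counts, cell_to_sample):
    """Sum per-cell gene counts into per-sample totals.

    cell_counts:    dict barcode -> dict gene -> count
    cell_to_sample: dict barcode -> sample identifier
    """
    bulk = defaultdict(lambda: defaultdict(int))
    for barcode, genes in cell_counts.items():
        sample = cell_to_sample[barcode]
        for gene, n in genes.items():
            bulk[sample][gene] += n
    return {sample: dict(genes) for sample, genes in bulk.items()}


def genes_detected_in_all(bulk):
    """Return genes with a nonzero count in every sample."""
    gene_sets = [{g for g, n in genes.items() if n > 0} for genes in bulk.values()]
    return set.intersection(*gene_sets) if gene_sets else set()


# Toy input: three cells from two samples (values are made up).
cell_counts = {
    "cell1": {"LRP2": 3, "UMOD": 0},
    "cell2": {"LRP2": 1, "UMOD": 2},
    "cell3": {"LRP2": 5},
}
cell_to_sample = {"cell1": "S1", "cell2": "S1", "cell3": "S2"}

bulk = pseudobulk(cell_counts, cell_to_sample)
shared = genes_detected_in_all(bulk)
print(bulk, shared)
```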

For sample-based pseudobulk functional analysis, the decoupleR.get_contrast function was used to compute differentially expressed genes. Pathway activity and transcription factor activity inferences were performed with the decoupleR.run_consensus function based on the Progeny and DoRothEA databases.95 For cell type-based pseudobulk analysis, activity inferences were implemented with the Multivariate Linear Model (MLM) using the decoupleR.run_mlm function. Enrichment score was computed with the Over Representation Analysis (ORA) method using the decoupleR.get_ora_df function where the Molecular Signatures Database100 was retrieved.

Gene sets and module scoring

Unless otherwise specified, scoring analysis of all major gene modules was performed with the scanpy.tl.score_genes function. Briefly, for each cell, the average expression of a set of selected genes is calculated, and the average expression of a reference set of genes, randomly sampled from all features, is subtracted from it; the resulting difference is used as the “score”.101 The gene set used for ECM scoring was obtained from the Matrisome Project.102 The gene sets used for scoring of lipid accumulation, FAO, gluconeogenesis, OXPHOS, glycolysis, polyol metabolism and osmotic stress were obtained from the database of Harmonizome.62 The gene sets used for scoring of glutamine and platelet activating factor metabolic processes were obtained from Gene Ontology.103 A list of genes associated with metabolic processes was obtained from Reactome.56 All gene lists and their references can be found in Table S7.
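The scoring idea described above can be sketched for a single cell as follows. This is a simplified illustration: it samples the reference genes uniformly at random, whereas scanpy.tl.score_genes additionally matches reference genes by expression level. The module_score helper, its n_reference default and the toy expression values are hypothetical.

```python
import random


def module_score(expr, gene_set, n_reference=50, seed=0):
    """Score one cell: mean(gene set) minus mean(random reference genes).

    expr:     dict gene -> normalized expression in this cell
    gene_set: genes of the module being scored
    """
    targets = [g for g in gene_set if g in expr]
    excluded = set(gene_set)
    # Reference genes are drawn from everything outside the module.
    pool = sorted(g for g in expr if g not in excluded)
    if not targets or not pool:
        raise ValueError("need at least one target gene and one reference gene")
    rng = random.Random(seed)
    reference = rng.sample(pool, min(n_reference, len(pool)))

    def mean(genes):
        return sum(expr[g] for g in genes) / len(genes)

    return mean(targets) - mean(reference)


# Toy cell with made-up expression values.
cell = {"CPT1A": 4.0, "ACOX1": 2.0, "GENE_X": 1.0, "GENE_Y": 1.0}
score = module_score(cell, ["CPT1A", "ACOX1"])
print(score)
```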

Single-cell trajectory analysis

All PT cells and genes associated with metabolic processes were extracted from the snRNA-seq count matrix. Then, cells with fewer than 150 genes detected and cells with mitochondrial counts accounting for over 5% of the total read counts were removed from analysis, which ultimately generated a count matrix of 21,375 PT cells and >2000 metabolism-associated genes. Single-cell trajectory analysis was performed on Monocle3.80 Briefly, the matrix was normalized and pre-processed with the preprocess_cds function (method = “PCA”, num_dim = 20). Batch effects were subtracted with the align_cds function. Next, dimensionality reduction was performed with the reduce_dimension function (reduction_method=‘UMAP’, umap.n_neighbors = 30). Next, a principal graph from the reduced dimension space was generated using reversed graph embedding with the learn_graph function (close_loop=FALSE) and the cells were ordered in pseudotime with the order_cells function.

Cell type-specific regulator analysis

For the multiome object containing 324,701 cells with the WNN graph calculated as mentioned above, its ATAC assay was used to find enriched TF binding motifs’ activities in each cell type annotated by the RNA modality. A motif position weight matrix (PWM) was extracted from the JASPAR2020104 database’s CORE collection and was added to the assay using the Signac AddMotifs function. The Signac wrapper function RunChromVAR was then called to create a chromVAR37 assay containing motif activity per cell. Cell-type-specific motifs were identified by the Seurat function FindAllMarkers using a logistic regression (LR) test with the ‘logfc.threshold’ argument set to 0. Motif activities were extracted from the chromVAR assay using the Seurat function AverageExpression for heatmap visualization. TF expression levels corresponding to the target motifs were extracted from the RNA assay using AverageExpression.

Cis-co-accessibility networks (CCANs) analysis

The Cicero algorithm105 was used to link regulatory DNA elements to their target genes for the ATAC assay of the multiome object containing 324,701 cells with the WNN graph calculated as mentioned above. The ‘counts’ slot of the ATAC assay was first converted to CellDataSet format using the Cicero function make_atac_cds. Then detect_genes, estimate_size_factors, preprocess_cds, Embeddings, and make_cicero_cds functions from Monocle380 were applied sequentially to create a Cicero CellDataSet object containing UMAP reduction. The run_cicero function was then called to find peak connections by running the primary functions of the Cicero pipeline with default parameters. Cis-co-accessibility networks (CCANs) were later extracted from the connections object using the generate_ccans function with coaccess_cutoff_override set to 0.2.

Imaging mass spectrometry data acquisition

Unbiased imaging mass spectrometry profiling was performed with a publicly established protocol at the Mass Spectrometry Technology Access Center at Washington University School of Medicine. Briefly, a total of 6 fresh-frozen human kidney cortex, medulla and papilla samples (Table S2) were sectioned at 10 μm thickness on MALDI IntelliSlides (Bruker). Samples of the same donor were sectioned onto the same slide to reduce potential batch effects. For each sample, one extra 10-μm serial tissue section was also obtained for immunofluorescence staining analysis. Data acquisition was performed in positive ion mode at a pixel size of 10 × 10 μm² over m/z range 100 – 1500. The matrix used was 2,5-dihydroxybenzoic acid (DHB; 15 mg/mL in 90% acetonitrile with 0.1% trifluoroacetic acid), with spraying parameters of 60 °C nozzle temperature, a flow rate of 0.1 mL/min, 1200 mm/min velocity, a track spacing of 3 mm, moving pattern CC, 14 passes and an N2 pressure of 10 psi. Matrix application was performed with the HTX M5 Sprayer (HTX Technologies) and data acquisition was performed on the timsTOF fleX MALDI-2 instrument (Bruker), where the laser was rastered over a tissue section and mass spectra were recorded in each spot. Our pilot efforts with the negative-mode MALDI method resulted in significant artifactual background at the tissue boundary and a lower number of features detected, and the resulting data are therefore not presented in this study (data available at Mendeley Data).

Imaging mass spectrometry data analysis

Unless otherwise specified, all analyses of imaging mass spectrometry data were performed with the MALDIpy package described in this work, with details of methodology documented in https://github.com/TheHumphreysLab/MALDIpy (codes are also available at https://pypi.org/project/MALDIpy/). Briefly, metabolite annotation was performed with the METASPACE database of core mammalian metabolites and lipids (https://metaspace2020.eu/)42 and the raw data (FDR < 20%; with hot-spot removal) post total ion count (TIC) normalization were exported. A MALDIpy object was created for each tissue section with the msi_data function, which allows downstream feature visualization, data matrix conversion and single-cell analysis, where each pixel was recognized as an individual cell. The MALDIpy object was converted to an AnnData object with the to_adata function, which enables Scanpy-based single-cell data quality control, normalization, dimensional reduction and clustering. In quality control, cells with fewer than 40,000 counts and fewer than 30 features detected were removed from analysis. After normalization with the scanpy.pp.normalize_total function, the effect of total counts was regressed out and the data was scaled. Then, PCA was performed for dimensionality reduction (svd_solver=‘arpack’). For integration of the 6 tissue samples, Harmony (Harmonypy v0.0.6)93 was used to eliminate the batch effects. Then, a neighborhood graph of cells was computed using the scanpy.pp.neighbors function with the number of neighbors set as 30 (n_pcs = 20). The neighborhood graph was embedded onto a UMAP space with the scanpy.tl.umap function, in which the effective minimum distance between embedded points was set as 0.2. Leiden clustering was performed and marker features of each Leiden cluster were identified using the scanpy.tl.rank_genes_groups function (method = ‘wilcoxon’). The Leiden clusters were visualized on the tissue sections with the MALDIpy.projection.umap_projection function.

To visualize a feature on multiple tissue sections at the same time, a linear normalization was performed:

$$N_{ij} = C_{ij} \times \frac{\sum_{n}\left(\sum_{ij} C_{ij}\right)}{n \times \sum_{ij} C_{ij}}$$

in which Cij is the feature intensity of a pixel and i and j define the location of the pixel on the 2-dimension space; n is the number of tissue sections for co-visualization; Nij is the feature intensity after this normalization.
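One reading of this linear normalization is that each section is rescaled so that its summed feature intensity equals the mean summed intensity across the n sections. The sketch below follows that reading in plain Python; the normalize_sections helper and the toy matrices are hypothetical and do not reproduce the MALDIpy implementation.

```python
def normalize_sections(sections):
    """Rescale one feature across several tissue sections.

    sections: list of 2D lists, one per tissue section, holding the
              intensity C_ij of a single feature at pixel (i, j).
    Each pixel is multiplied by
        (sum of all section totals) / (n * total of its own section).
    """
    n = len(sections)
    totals = [sum(sum(row) for row in section) for section in sections]
    grand_total = sum(totals)
    normalized = []
    for section, total in zip(sections, totals):
        factor = grand_total / (n * total) if total else 0.0
        normalized.append([[c * factor for c in row] for row in section])
    return normalized


# Two toy sections: the second is uniformly twice as intense as the first.
section_a = [[1.0, 1.0], [1.0, 1.0]]
section_b = [[2.0, 2.0], [2.0, 2.0]]
norm_a, norm_b = normalize_sections([section_a, section_b])
print(norm_a, norm_b)
```

After rescaling, both toy sections carry the same total intensity, so a shared color scale no longer favors the brighter section.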

To co-project multiple features on the same tissue section with distinct color schemes, the intensity matrix of a feature was normalized and converted to an array of color codes. The matrices of different features were summed up and converted to an array of 8-bit RGB color codes and visualized with the PIL.Image module.
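A minimal sketch of this co-projection, assuming each feature is scaled to its own maximum, tinted with one color and then summed with clipping at 255. The helper names and colors are hypothetical, and the result is a nested list of RGB tuples rather than a PIL image.

```python
def to_rgb_layer(matrix, color):
    """Scale a 2D intensity matrix to [0, 1] and tint it with an (R, G, B) color."""
    peak = max(max(row) for row in matrix)
    scale = 1.0 / peak if peak > 0 else 0.0
    return [[tuple(v * scale * ch for ch in color) for v in row] for row in matrix]


def co_project(layers):
    """Sum tinted layers pixel by pixel and clip to 8-bit RGB values."""
    height, width = len(layers[0]), len(layers[0][0])
    image = []
    for i in range(height):
        row = []
        for j in range(width):
            summed = [sum(layer[i][j][k] for layer in layers) for k in range(3)]
            row.append(tuple(min(255, int(round(x))) for x in summed))
        image.append(row)
    return image


# Two toy features on a 2 x 2 section, shown in red and green.
feature_red = [[0.0, 10.0], [5.0, 0.0]]
feature_green = [[0.0, 4.0], [0.0, 2.0]]
image = co_project([
    to_rgb_layer(feature_red, (255, 0, 0)),
    to_rgb_layer(feature_green, (0, 255, 0)),
])
print(image)
```

Pixels where both features peak appear yellow (255, 255, 0), which is how overlapping signals become visible in a single image.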

To obtain meta-information for each metabolite feature, the Human Metabolome Database (HMDB)43 was used to survey the identification and taxonomy of each feature. To connect the metabolomics with genomics, HMDB was surveyed to identify protein associations of each feature, which identified a total of 96,524 general protein-feature associations including 36,716 enzyme-feature associations based on the 588 features analyzed.

In vitro induction of lipid metabolism dysregulation

Primary RPTECs cultured on plastic resemble an injured cell state.13 Therefore, to induce VCAM1 upregulation and reduced FAO gene expression, RPTEC/TERT1 cells were treated with TNF-α (210-TA/CF, R&D Systems) at a final concentration of 20 ng/mL for 24 hours. TNF-α stimulation following siRNA treatment was also performed on RPTEC/TERT1 cells. In vitro lipid accumulation induction was performed on human primary RPTECs as previously described.7 Briefly, when RPTECs reached 70%–80% confluence, cells were starved overnight by culturing in Renal Epithelial Cell Growth Medium without growth supplements. Then, cells were switched to complete growth medium and treated with 2% BSA-conjugated oleate fatty acid (29557, Cayman Chemical) or palmitate fatty acid (29558, Cayman Chemical) for 6 hours (final concentration: 100 μM).

Metabolic measurement

Metabolic measurement of RPTEC/TERT1 cells was performed as previously described7,63 with a Seahorse XFe96 Analyzer (Agilent). Briefly, 20,000 cells were seeded into each well of an XF96 cell culture microplate (102416-100, Agilent). The culture medium was switched to Seahorse XF DMEM assay medium (103680-100, Agilent) supplemented with 2 mM glucose and 0.5 mM L-carnitine (C0283, Sigma) one hour before the real-time measurement. Palmitate fatty acids, etomoxir (11969, Cayman Chemical) and oligomycin (O4876, Sigma) were injected during the Seahorse run at final concentrations of 100 μM, 40 μM and 1 μM, respectively.

Oil Red O staining

Cells were first fixed on a slide with 4% PFA and washed with PBS. Cells were incubated with Oil Red O Solution (ab150678, Abcam) overnight at room temperature, followed by the remaining steps according to the manufacturer’s instructions. Fiji106 was used to quantify the ORO-positive area in each captured image.

Quantitative polymerase chain reaction (qPCR) analysis

RNA was extracted with RNeasy Mini Kits (74104, Qiagen) following the manufacturer’s instructions. Complementary DNA (cDNA) was obtained by reverse transcribing the extracted RNA (~2 μg) with the High-Capacity cDNA Reverse Transcription Kit (4368813, Applied Biosystems), and then processed for qPCR using the iTaq Universal SYBR Green Supermix (1725121, BioRad). Gene expression was normalized to ACTB expression and quantified with the 2^(−ΔΔCt) method. Primer sequences can be found in Table S5.
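As a worked example of the ΔΔCt quantification, the sketch below computes a relative fold change from four Ct values. The function name and the Ct values are hypothetical; the reference gene stands in for ACTB.

```python
def fold_change_ddct(ct_target_treated, ct_ref_treated,
                     ct_target_control, ct_ref_control):
    """Relative expression by the 2^(-ddCt) method.

    dCt  = Ct(target) - Ct(reference gene), computed per condition
    ddCt = dCt(treated) - dCt(control)
    """
    delta_treated = ct_target_treated - ct_ref_treated
    delta_control = ct_target_control - ct_ref_control
    return 2 ** -(delta_treated - delta_control)


# Made-up Ct values: the target crosses threshold two cycles earlier
# in treated cells, while the reference gene is unchanged.
fold = fold_change_ddct(
    ct_target_treated=24.0, ct_ref_treated=18.0,
    ct_target_control=26.0, ct_ref_control=18.0,
)
print(fold)
```

Here ΔCt is 6 in treated cells and 8 in control cells, so ΔΔCt is −2 and the fold change is 2² = 4.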

Integrative analysis with clinical data

The sample origin of each cell was identified according to the indexing barcodes introduced during the SHARE-seq library preparation. Next, the donor identity as well as disease categories, BUN and creatinine levels were added to the metadata for each cell in the AnnData object (metadata available on GSE234788). The Mann-Whitney U nonparametric test was used to make comparisons across control/AKI/CKD samples in clinical data analysis. Simple linear regression was used to calculate the Pearson correlation coefficients and statistics between expression of a gene and a clinical variable.
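The regression step can be illustrated with a short plain-Python sketch that returns the slope, intercept and Pearson correlation coefficient for paired observations. The helper name and the toy values (a clinical variable against mean expression of a gene per sample) are hypothetical, and p values are omitted.

```python
import math


def simple_linear_regression(x, y):
    """Least-squares fit y = slope * x + intercept, plus Pearson r."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r = sxy / math.sqrt(sxx * syy)
    return slope, intercept, r


# Made-up per-sample values: clinical variable (x) and gene expression (y).
clinical = [1.0, 2.0, 3.0, 4.0]
expression = [2.0, 4.0, 6.0, 8.0]
slope, intercept, r = simple_linear_regression(clinical, expression)
print(slope, intercept, r)
```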

In vitro gene knockdown and RNA-seq

PPFIBP1 or PLEKHA1 expression in human primary RPTECs was knocked down with a lipid-based siRNA transfection method. Briefly, Lipofectamine Reagent (13778150, Thermo Scientific) and PPFIBP1 Smartpool siRNA (L-011483-01-0005, Dharmacon) or PLEKHA1 Smartpool siRNA (L-018037-00-0005) were mixed for 5 minutes and added to cells following the manufacturer’s instructions. ON-TARGETplus Non-targeting Control Pool (D-001810-10-05, Dharmacon) was used as control. All siRNA molecules were used at a final concentration of 10 nM. Cells were processed for subsequent treatments 2 days after siRNA transfection.

For bulk RNA-seq, RNA was extracted with RNeasy Kits (74104, Qiagen) following the manufacturer’s instructions. Libraries were generated with the RNase-H method using RiboErase kits (Kapa Biosystems) for ribosomal RNA removal and sequenced on the NovaSeq 6000 S4 platform (2×150 bp) at a target of 30 million reads per library. RNA-seq reads were aligned to the human reference genome Ensembl GRCh38.101 and quantified with STAR (v2.7.9a1). Differential expression analysis was performed with the exactTest function of edgeR v3.34.1.107 Genes with FDR < 0.05 were considered differentially expressed.

QUANTIFICATION AND STATISTICAL ANALYSIS

Unless otherwise specified, p values presented in biochemical assays were generated by unpaired two-tailed Student’s t-tests with GraphPad Prism 8.0. FDR values presented in bulk RNA-seq differential expression analysis were generated by exact tests on two groups of negative binomial random variables with edgeR v3.34.1. Unless otherwise specified, the Mann-Whitney-U test with the Benjamini-Hochberg correction was used for differential expression analysis of single-cell data with Scanpy v1.7.2. All statistical parameters can be found in repositories mentioned in Data and Code Availability.

Highlights.

  • SHARE-seq analyzes single-cell multiomics of distinct human kidney anatomical regions.

  • The same tubular cell types have distinct signatures depending on regional location.

  • Nephron segments possess distinct and non-overlapping metabolic signatures.

  • MALDIpy, a computational package for analyzing spatially resolved metabolomics data.

Acknowledgments

These experiments were funded by NIDDK R01DK103740 to BDH. The authors acknowledge the Washington University Genome Technology Access Center and Center for Genome Sciences & Systems Biology for next-generation sequencing support. The authors acknowledge the Washington University Mass Spectrometry Technology Access Center for imaging mass spectrometry technology support.

Inclusion and diversity

We support inclusive, diverse, and equitable conduct of research.

Footnotes

Declaration of interests

B.D.H. is a consultant for Janssen Research & Development, LLC, Pfizer and Chinook Therapeutics, held equity in Chinook Therapeutics and grant funding from Chinook Therapeutics, Pfizer and Janssen Research & Development, LLC; all interests are unrelated to the current work.

Publisher's Disclaimer: This is a PDF file of an unedited manuscript that has been accepted for publication. As a service to our customers we are providing this early version of the manuscript. The manuscript will undergo copyediting, typesetting, and review of the resulting proof before it is published in its final form. Please note that during the production process errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal pertain.

References

  • 1. McMahon AP (2016). Development of the Mammalian Kidney. Curr. Top. Dev. Biol. 117, 31. 10.1016/BS.CTDB.2015.10.010.
  • 2. Kalantar-Zadeh K, Jafar TH, Nitsch D, Neuen BL, and Perkovic V (2021). Chronic kidney disease. Lancet 398, 786–802. 10.1016/S0140-6736(21)00519-5.
  • 3. Hill NR, Fatoba ST, Oke JL, Hirst JA, O’Callaghan CA, Lasserson DS, and Hobbs FDR (2016). Global Prevalence of Chronic Kidney Disease – A Systematic Review and Meta-Analysis. PLoS One 11, e0158765. 10.1371/journal.pone.0158765.
  • 4. Kriz W, Bankir L, Bulger RE, Burg MB, Goncharevskaya OA, Imai M, Kaissling B, Maunsbach AB, Moffat DB, Morel F, et al. (1988). A standard nomenclature for structures of the kidney. Kidney Int. 33, 1–7. 10.1038/ki.1988.1.
  • 5. Bankir L, Figueres L, Prot-Bertoye C, Bouby N, Crambert G, Pratt JH, and Houillier P (2020). Medullary and cortical thick ascending limb: Similarities and differences. Am. J. Physiol. - Ren. Physiol. 312, F422–F442. 10.1152/AJPRENAL.00261.2019.
  • 6. Kirita Y, Wu H, Uchimura K, Wilson PC, and Humphreys BD (2020). Cell profiling of mouse acute kidney injury reveals conserved cellular responses to injury. Proc. Natl. Acad. Sci. U. S. A. 117, 15874–15883. 10.1073/pnas.2005477117.
  • 7. Li H, Dixon EE, Wu H, and Humphreys BD (2022). Comprehensive single-cell transcriptional profiling defines shared and unique epithelial injury responses during kidney fibrosis. Cell Metab. 34, 1977–1998.e9. 10.1016/j.cmet.2022.09.026.
  • 8. Park J, Shrestha R, Qiu C, Kondo A, Huang S, Werth M, Li M, Barasch J, and Suszták K (2018). Single-cell transcriptomics of the mouse kidney reveals potential cellular targets of kidney disease. Science 360. 10.1126/science.aar2131.
  • 9. Kuppe C, Ibrahim MM, Kranz J, Zhang X, Ziegler S, Perales-Patón J, Jansen J, Reimer KC, Smith JR, Dobie R, et al. (2021). Decoding myofibroblast origins in human kidney fibrosis. Nature 589. 10.1038/s41586-020-2941-1.
  • 10. Gerhardt LMS, Liu J, Koppitch K, Cippà PE, and McMahon AP (2021). Single-nuclear transcriptomics reveals diversity of proximal tubule cell states in a dynamic response to acute kidney injury. Proc. Natl. Acad. Sci. U. S. A. 118. 10.1073/PNAS.2026684118.
  • 11. Abedini A, Ma Z, Frederick J, Dhillon P, Balzer MS, Shrestha R, Liu H, Vitale S, Devalaraja-Narashimha K, Grandi P, et al. (2022). Spatially resolved human kidney multi-omics single cell atlas highlights the key role of the fibrotic microenvironment in kidney disease progression. bioRxiv, 2022.10.24.513598. 10.1101/2022.10.24.513598.
  • 12. Yoshimura Y, Muto Y, Ledru N, Wu H, Omachi K, Miner JH, and Humphreys BD (2023). A single-cell multiomic analysis of kidney organoid differentiation. Proc. Natl. Acad. Sci. 120, e2219699120. 10.1073/PNAS.2219699120.
  • 13. Ledru N, Wilson PC, Muto Y, Yoshimura Y, Wu H, Asthana A, Tullius SG, Waikar SS, Orlando G, and Humphreys BD (2022). Predicting regulators of epithelial cell state through regularized regression analysis of single cell multiomic sequencing. bioRxiv, 2022.12.29.522232. 10.1101/2022.12.29.522232.
  • 14. Gerhardt LMS, Koppitch K, van Gestel J, Guo J, Cho S, Wu H, Kirita Y, Humphreys BD, and McMahon AP (2023). Lineage Tracing and Single-Nucleus Multiomics Reveal Novel Features of Adaptive and Maladaptive Repair after Acute Kidney Injury. J. Am. Soc. Nephrol. 34. 10.1681/ASN.0000000000000057.
  • 15. Cao J, Cusanovich DA, Ramani V, Aghamirzaie D, Pliner HA, Hill AJ, Daza RM, McFaline-Figueroa JL, Packer JS, Christiansen L, et al. (2018). Joint profiling of chromatin accessibility and gene expression in thousands of single cells. Science 361, 1380–1385. 10.1126/science.aau0730.
  • 16. Li H, and Humphreys BD (2021). Single Cell Technologies: Beyond Microfluidics. Kidney360 2, 1196–1204. 10.34067/kid.0001822021.
  • 17. Muto Y, Li H, and Humphreys BD (2022). Single Cell Transcriptomics. In Innovations in Nephrology: Breakthrough Technologies in Kidney Disease Care (Springer, Cham), pp. 87–102. 10.1007/978-3-031-11570-7_5.
  • 18. Hansen J, Sealfon R, Menon R, Eadon MT, Lake BB, Steck B, Anjani K, Parikh S, Sigdel TK, Zhang G, et al. (2022). A reference tissue atlas for the human kidney. Sci. Adv. 8, 4965. 10.1126/sciadv.abn4965.
  • 19. The human body at cellular resolution: the NIH Human Biomolecular Atlas Program (2019). Nature 574, 187–192. 10.1038/s41586-019-1629-x.
  • 20. Lake BB, Menon R, Winfree S, Hu Q, Ferreira RM, Kalhor K, Barwinska D, Otto EA, Ferkowicz M, Diep D, et al. (2023). An atlas of healthy and injured cell states and niches in the human kidney. Nature, 2021.07.28.454201. 10.1038/s41586-023-05769-3.
  • 21. Wang G, Heijs B, Kostidis S, Mahfouz A, Rietjens RGJ, Bijkerk R, Koudijs A, van der Pluijm LAK, van den Berg CW, Dumas SJ, et al. (2022). Analyzing cell-type-specific dynamics of metabolism in kidney repair. Nat. Metab. 2022, 1–10. 10.1038/s42255-022-00615-8.
  • 22. Rietjens RGJ, Wang G, van der Velden AIM, Koudijs A, Avramut MC, Kooijman S, Rensen PCN, van der Vlag J, Rabelink TJ, Heijs B, et al. (2023). Phosphatidylinositol metabolism of the renal proximal tubule S3 segment is disturbed in response to diabetes. Sci. Reports 13, 1–12. 10.1038/s41598-023-33442-2.
  • 23. Conroy LR, Clarke HA, Allison DB, Valenca SS, Sun Q, Hawkinson TR, Young LEA, Ferreira JE, Hammonds AV, Dunne JB, et al. (2023). Spatial metabolomics reveals glycogen as an actionable target for pulmonary fibrosis. Nat. Commun. 14, 1–18. 10.1038/s41467-023-38437-1.
  • 24. Zheng P, Zhang N, Ren D, Yu C, Zhao B, and Zhang Y (2023). Integrated spatial transcriptome and metabolism study reveals metabolic heterogeneity in human injured brain. Cell Reports Med. 4, 101057. 10.1016/J.XCRM.2023.101057.
  • 25. Ma S, Zhang B, LaFave LM, Earl AS, Chiang Z, Hu Y, Ding J, Brack A, Kartha VK, Tay T, et al. (2020). Chromatin Potential Identified by Shared Single-Cell Profiling of RNA and Chromatin. Cell 183, 1103–1116.e20. 10.1016/j.cell.2020.09.056.
  • 26. Tuck M, Grélard F, Blanc L, and Desbenoit N (2022). MALDI-MSI Towards Multimodal Imaging: Challenges and Perspectives. Front. Chem. 10, 904688. 10.3389/FCHEM.2022.904688.
  • 27. Rayner HC, Thomas ME, and Milford DV (2020). Kidney Anatomy and Physiology: The Basis of Clinical Nephrology. In Understanding Kidney Diseases, pp. 1–9. 10.1007/978-3-030-43027-6_1.
  • 28. Hennig BP, Velten L, Racke I, Tu CS, Thoms M, Rybin V, Besir H, Remans K, and Steinmetz LM (2018). Large-scale low-cost NGS library preparation using a robust Tn5 purification and tagmentation protocol. G3 Genes, Genomes, Genet. 8, 79–89. 10.1534/G3.117.300257.
  • 29. Wilson PC, Muto Y, Wu H, Karihaloo A, Waikar SS, and Humphreys BD (2022). Multimodal single cell sequencing implicates chromatin accessibility and genetic background in diabetic kidney disease progression. Nat. Commun. 13, 1–20. 10.1038/s41467-022-32972-z.
  • 30. Wu H, Villalobos RG, Yao X, Reilly D, Chen T, Rankin M, Myshkin E, Breyer MD, and Humphreys BD (2022). Mapping the single-cell transcriptomic response of murine diabetic kidney disease to therapies. Cell Metab. 34, 1064–1078.e6. 10.1016/j.cmet.2022.05.010.
  • 31. Muto Y, Wilson PC, Ledru N, Wu H, Dimke H, Waikar SS, and Humphreys BD (2021). Single cell transcriptional and chromatin accessibility profiling redefine cellular heterogeneity in the adult human kidney. Nat. Commun. 12, 1–17. 10.1038/s41467-021-22368-w.
  • 32. Strutz F, Zeisberg M, Renziehausen A, Raschke B, Becker V, Van Kooten C, and Müller GA (2001). TGF-beta 1 induces proliferation in human renal fibroblasts via induction of basic fibroblast growth factor (FGF-2). Kidney Int. 59, 579–592. 10.1046/J.1523-1755.2001.059002579.X.
  • 33. Gewin L (2019). The Many Talents of TGF-β in the Kidney. Curr. Opin. Nephrol. Hypertens. 28, 203. 10.1097/MNH.0000000000000490.
  • 34. Chen L, Clark JZ, Nelson JW, Kaissling B, Ellison DH, and Knepper MA (2019). Renal-Tubule Epithelial Cell Nomenclature for Single-Cell RNA-Sequencing Studies. J. Am. Soc. Nephrol. 30, 1358–1364. 10.1681/ASN.2019040415.
  • 35. Fink EE, Sona S, Tran U, Desprez PE, Bradley M, Qiu H, Eltemamy M, Wee A, Wolkov M, Nicolas M, et al. (2022). Single-cell and spatial mapping Identify cell types and signaling Networks in the human ureter. Dev. Cell 57, 1899–1916.e6. 10.1016/J.DEVCEL.2022.07.004.
  • 36. Hao Y, Hao S, Andersen-Nissen E, Mauck WM, Zheng S, Butler A, Lee MJ, Wilk AJ, Darby C, Zager M, et al. (2021). Integrated analysis of multimodal single-cell data. Cell 184, 3573–3587.e29. 10.1016/J.CELL.2021.04.048.
  • 37. Schep AN, Wu B, Buenrostro JD, and Greenleaf WJ (2017). chromVAR: inferring transcription-factor-associated accessibility from single-cell epigenomic data. Nat. Methods 14, 975–978. 10.1038/nmeth.4401.
  • 38. Chen L, Lee JW, Chou CL, Nair AV, Battistone MA, Păunescu TG, Merkulova M, Breton S, Verlander JW, Wall SM, et al. (2017). Transcriptomes of major renal collecting duct cell types in mouse identified by single-cell RNA-seq. Proc. Natl. Acad. Sci. U. S. A. 114, E9989–E9998. 10.1073/PNAS.1710964114.
  • 39. Lee JW, Chou CL, and Knepper MA (2015). Deep sequencing in microdissected renal tubules identifies nephron segment-specific transcriptomes. J. Am. Soc. Nephrol. 26, 2669–2677. 10.1681/ASN.2014111067.
  • 40. Su W, Cao R, Zhang XY, and Guan Y (2020). Aquaporins in the kidney: Physiology and pathophysiology. Am. J. Physiol. - Ren. Physiol. 318, F193–F203. 10.1152/AJPRENAL.00304.2019.
  • 41. Taylor A, Grapentine S, Ichhpuniani J, and Bakovic M (2021). Choline transporter-like proteins 1 and 2 are newly identified plasma membrane and mitochondrial ethanolamine transporters. J. Biol. Chem. 296. 10.1016/J.JBC.2021.100604.
  • 42. Palmer A, Phapale P, Chernyavsky I, Lavigne R, Fay D, Tarasov A, Kovalev V, Fuchser J, Nikolenko S, Pineau C, et al. (2016). FDR-controlled metabolite annotation for high-resolution imaging mass spectrometry. Nat. Methods 14, 57–60. 10.1038/nmeth.4072.
  • 43. Wishart DS, Feunang YD, Marcu A, Guo AC, Liang K, Vázquez-Fresno R, Sajed T, Johnson D, Li C, Karu N, et al. (2018). HMDB 4.0: the human metabolome database for 2018. Nucleic Acids Res. 46, D608. 10.1093/NAR/GKX1089.
  • 44.Gewin LS (2021). Sugar or fat? Renal tubular metabolism reviewed in health and disease. Nutrients 13, 1580. 10.3390/nu13051580. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 45.Console L, Scalise M, Giangregorio N, Tonazzi A, Barile M, and Indiveri C (2020). The Link Between the Mitochondrial Fatty Acid Oxidation Derangement and Kidney Injury. Front. Physiol. 11. 10.3389/FPHYS.2020.00794. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 46.Mizutani Y, Kihara A, and Igarashi Y (2005). Mammalian Lass6 and its related family members regulate synthesis of specific ceramides. Biochem. J. 390, 263–271. 10.1042/BJ20050291. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 47.Lahiri S, Lee H, Mesicek J, Fuks Z, Haimovitz-Friedman A, Kolesnick RN, and Futerman AH (2007). Kinetic characterization of mammalian ceramide synthases: Determination of Km values towards sphinganine. FEBS Lett. 581, 5289–5294. 10.1016/J.FEBSLET.2007.10.018. [DOI] [PubMed] [Google Scholar]
  • 48.Staiano L, and De Matteis MA (2019). Phosphoinositides in the kidney. J. Lipid Res. 60, 287. 10.1194/JLR.R089946. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 49.Harris DP, Vogel P, Wims M, Moberg K, Humphries J, Jhaver KG, DaCosta CM, Shadoan MK, Xu N, Hansen GM, et al. (2023). Requirement for Class II Phosphoinositide 3-Kinase C2α in Maintenance of Glomerular Structure and Function. 10.1128/MCB.00468-1031,63-80.10.1128/MCB.00468-10. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 50.Uhlén M, Fagerberg L, Hallström BM, Lindskog C, Oksvold P, Mardinoglu A, Sivertsson Å, Kampf C, Sjöstedt E, Asplund A, et al. (2015). Tissue-based map of the human proteome. Science (80-.). 347. 10.1126/SCIENCE.1260419/SUPPL_FILE/1260419_UHLEN.SM.PDF. [DOI] [PubMed] [Google Scholar]
  • 51.Canela VH, Bowen WS, Ferreira RM, Syed F, Lingeman JE, Sabo AR, Barwinska D, Winfree S, Lake BB, Cheng Y-H, et al. (2023). A spatially anchored transcriptomic atlas of the human kidney papilla identifies significant immune injury in patients with stone disease. Nat. Commun. 2023 141 14, 1–17. 10.1038/s41467-023-38975-8. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 52.Lu W, Phillips CL, Killen PD, Hlaing T, Harrison WR, Elder FFB, Miner JH, Overbeek PA, and Meisler MH (1999). Insertional mutation of the collagen genes Col4a3 and Col4a4 in a mouse model of Alport syndrome. Genomics 61, 113–124. 10.1006/GENO.1999.5943. [DOI] [PubMed] [Google Scholar]
  • 53.Cowley AW, Yang C, Kumar V, Lazar J, Jacob H, Geurts AM, Liu P, Dayton A, Kurth T, and Liang M (2016). Pappa2 is linked to salt-sensitive hypertension in Dahl S rats. Physiol. Genomics 48, 62–72. 10.1152/PHYSIOLGENOMICS.00097.2015/ASSET/IMAGES/LARGE/ZH70011640620007.JPEG. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 54.York NS, Sanchez-Arias JC, McAdam ACH, Rivera JE, Arbour LT, and Swayne LA (2022). Mechanisms underlying the role of ankyrin-B in cardiac and neurological health and disease. Front. Cardiovasc. Med. 9. 10.3389/FCVM.2022.964675. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 55.Wang G, Heijs B, Kostidis S, Rietjens RGJ, Koning M, Yuan L, Tiemeier GL, Mahfouz A, Dumas SJ, Giera M, et al. (2022). Spatial dynamic metabolomics identifies metabolic cell fate trajectories in human kidney differentiation. Cell Stem Cell 29, 1580–1593.e7. 10.1016/j.stem.2022.10.008. [DOI] [PubMed] [Google Scholar]
  • 56.Jassal B, Matthews L, Viteri G, Gong C, Lorente P, Fabregat A, Sidiropoulos K, Cook J, Gillespie M, Haw R, et al. (2020). The reactome pathway knowledgebase. Nucleic Acids Res. 48, D498–D503. 10.1093/NAR/GKZ1031. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 57.Yañez AJ, Ludwig HC, Bertinat R, Spichiger C, Gatica R, Berlien G, Leon O, Brito M, Concha II, and Slebe JC (2005). Different involvement for aldolase isoenzymes in kidney glucose metabolism: aldolase B but not aldolase A colocalizes and forms a complex with FBPase. J. Cell. Physiol. 202, 743–753. 10.1002/JCP.20183. [DOI] [PubMed] [Google Scholar]
  • 58.Yu S, Meng S, Xiang M, and Ma H (2021). Phosphoenolpyruvate carboxykinase in cell metabolism: Roles and mechanisms beyond gluconeogenesis. Mol. Metab. 53, 101257. 10.1016/J.MOLMET.2021.101257. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 59.Miguel V, and Kramann R (2023). Metabolic reprogramming heterogeneity in chronic kidney disease. FEBS Open Bio. 10.1002/2211-5463.13568. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 60.Jin C, Zhu X, Wu H, Wang Y, and Hu X (2020). Perturbation of phosphoglycerate kinase 1 (PGK1) only marginally affects glycolysis in cancer cells. J. Biol. Chem. 295, 6425. 10.1074/JBC.RA119.012312. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 61.Scholz H, Boivin FJ, Schmidt-Ott KM, Bachmann S, Eckardt KU, Scholl UI, and Persson PB (2021). Kidney physiology and susceptibility to acute kidney injury: implications for renoprotection. Nat. Rev. Nephrol. 2021 175 17, 335–349. 10.1038/s41581-021-00394-7. [DOI] [PubMed] [Google Scholar]
  • 62.Rouillard AD, Gundersen GW, Fernandez NF, Wang Z, Monteiro CD, McDermott MG, and Ma’ayan A (2016). The harmonizome: a collection of processed datasets gathered to serve and mine knowledge about genes and proteins. Database 2016. 10.1093/DATABASE/BAW100. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 63.Kang HM, Ahn SH, Choi P, Ko Y-A, Han SH, Chinga F, Park ASD, Tao J, Sharma K, Pullman J, et al. (2014). Defective fatty acid oxidation in renal tubular epithelial cells has a key role in kidney fibrosis development. Nat. Med. 2014 211 21, 37–46. 10.1038/nm.3762. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 64.Mori Y, Ajay AK, Chang JH, Mou S, Zhao H, Kishi S, Li J, Brooks CR, Xiao S, Woo HM, et al. (2021). KIM-1 mediates fatty acid uptake by renal tubular cells to promote progressive diabetic kidney disease. Cell Metab. 33, 1042–1061.e7. 10.1016/J.CMET.2021.04.004. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 65.Schaub JA, Venkatachalam MA, and Weinberg JM (2021). Proximal Tubular Oxidative Metabolism in Acute Kidney Injury and the Transition to CKD. Kidney360 2, 355. 10.34067/KID.0004772020. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 66.Tu Z, Kelley VR, Collins T, and Lee FS (2001). I kappa B kinase is critical for TNF-alpha-induced VCAM1 gene expression in renal tubular epithelial cells. J. Immunol. 166, 6839–6846. 10.4049/JIMMUNOL.166.11.6839. [DOI] [PubMed] [Google Scholar]
  • 67.Hayden MS, and Ghosh S (2014). Regulation of NF-κB by TNF family cytokines. Semin. Immunol. 26, 253–266. 10.1016/J.SMIM.2014.05.004. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 68.Kriajevska M, Fischer-Larsen M, Moertz E, Vorm O, Tulchinsky E, Grigorian M, Ambartsumian N, and Lukanidin E (2002). Liprin β1, a member of the family of LAR transmembrane tyrosine phosphatase-interacting proteins, is a new target for the metastasis-associated protein S100A4 (Mts1). J. Biol. Chem. 277, 5229–5235. 10.1074/jbc.M110976200. [DOI] [PubMed] [Google Scholar]
  • 69.Marshall AJ, Krahn AK, Ma K, Duronio V, and Hou S (2002). TAPP1 and TAPP2 Are Targets of Phosphatidylinositol 3-Kinase Signaling in B Cells: Sustained Plasma Membrane Recruitment Triggered by the B-Cell Antigen Receptor. Mol. Cell. Biol. 22, 5479–5491. 10.1128/mcb.22.15.5479-5491.2002. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 70.Nakagawa S, Nishihara K, Miyata H, Shinke H, Tomita E, Kajiwara M, Matsubara T, Iehara N, Igarashi Y, Yamada H, et al. (2015). Molecular Markers of Tubulointerstitial Fibrosis and Tubular Cell Damage in Patients with Chronic Kidney Disease. PLoS One 10, e0136994. 10.1371/JOURNAL.PONE.0136994. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 71.Ju W, Nair V, Smith S, Zhu L, Shedden K, Song PXK, Mariani LH, Eichinger FH, Berthier CC, Randolph A, et al. (2015). Tissue transcriptome-driven identification of epidermal growth factor as a chronic kidney disease biomarker. Sci. Transl. Med. 7. 10.1126/SCITRANSLMED.AAC7071. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 72.Muto Y, Dixon EE, Yoshimura Y, Wu H, Omachi K, Ledru N, Wilson PC, King AJ, Eric Olson N, Gunawan MG, et al. (2022). Defining cellular complexity in human autosomal dominant polycystic kidney disease by multimodal single cell analysis. Nat. Commun. 13, 1–19. 10.1038/s41467-022-34255-z. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 73.Berry MR, Mathews RJ, Ferdinand JR, Jing C, Loudon KW, Wlodek E, Dennison TW, Kuper C, Neuhofer W, and Clatworthy MR (2017). Renal Sodium Gradient Orchestrates a Dynamic Antibacterial Defense Zone. Cell 170, 860–874.e19. 10.1016/J.CELL.2017.07.022. [DOI] [PubMed] [Google Scholar]
  • 74.Song R, and Yosypiv IV (2012). Development of the kidney medulla. Organogenesis 8, 10–17. 10.4161/ORG.19308. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 75.Haug S, Muthusamy S, Li Y, Stewart G, Li X, Treppner M, Köttgen A, and Akilesh S (2023). Multi-omic analysis of human kidney tissue identified medulla-specific gene expression patterns. Kidney Int. 0. 10.1016/J.KINT.2023.10.024. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 76.Lonsdale J, Thomas J, Salvatore M, Phillips R, Lo E, Shad S, Hasz R, Walters G, Garcia F, Young N, et al. (2013). The Genotype-Tissue Expression (GTEx) project. Nat. Genet. 2013 456 45, 580–585. 10.1038/ng.2653. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 77.Dumas SJ, Meta E, Borri M, Goveia J, Rohlenova K, Conchinha NV, Falkenberg K, Teuwen LA, de Rooij L, Kalucka J, et al. (2020). Single-cell RNA sequencing reveals renal endothelium heterogeneity and metabolic adaptation to water deprivation. J. Am. Soc. Nephrol. 31, 118–138. 10.1681/ASN.2019080832. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 78.Dhillon P, Park J, Hurtado del Pozo C, Li L, Doke T, Huang S, Zhao J, Kang HM, Shrestra R, Balzer MS, et al. (2021). The Nuclear Receptor ESRRA Protects from Kidney Disease by Coupling Metabolism and Differentiation. Cell Metab. 33, 379–394.e8. 10.1016/J.CMET.2020.11.011. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 79.Martin BK, Qiu C, Nichols E, Phung M, Green-Gladden R, Srivatsan S, Blecher-Gonen R, Beliveau BJ, Trapnell C, Cao J, et al. (2022). Optimized single-nucleus transcriptional profiling by combinatorial indexing. Nat. Protoc. 2022, 1–20. 10.1038/s41596-022-00752-0. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 80.Cao J, Spielmann M, Qiu X, Huang X, Ibrahim DM, Hill AJ, Zhang F, Mundlos S, Christiansen L, Steemers FJ, et al. (2019). The single-cell transcriptional landscape of mammalian organogenesis. Nature 566, 496–502. 10.1038/s41586-019-0969-x. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 81.Cao J, O’Day DR, Pliner HA, Kingsley PD, Deng M, Daza RM, Zager MA, Aldinger KA, Blecher-Gonen R, Zhang F, et al. (2020). A human cell atlas of fetal gene expression. Science (80-.). 370. 10.1126/SCIENCE.ABA7721. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 82.Zhu C, Yu M, Huang H, Juric I, Abnousi A, Hu R, Lucero J, Behrens MM, Hu M, and Ren B (2019). An ultra high-throughput method for single-cell joint analysis of open chromatin and transcriptome. Nat. Struct. Mol. Biol. 26, 1063–1070. 10.1038/s41594-019-0323-x. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 83.Wolf FA, Angerer P, and Theis FJ (2018). SCANPY: large-scale single-cell gene expression data analysis. Genome Biol. 2018 191 19, 1–5. 10.1186/S13059-017-1382-0. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 84.Stuart T, Butler A, Hoffman P, Hafemeister C, Papalexi E, Mauck WM, Hao Y, Stoeckius M, Smibert P, and Satija R (2019). Comprehensive Integration of Single-Cell Data. Cell 177, 1888–1902.e21. 10.1016/J.CELL.2019.05.031. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 85.Stuart T, Srivastava A, Madad S, Lareau CA, and Satija R (2021). Single-cell chromatin state analysis with Signac. Nat. Methods 18, 1333–1341. 10.1038/S41592-021-01282-5. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 86.Dobin A, Davis CA, Schlesinger F, Drenkow J, Zaleski C, Jha S, Batut P, Chaisson M, and Gingeras TR (2013). STAR: ultrafast universal RNA-seq aligner. Bioinformatics 29, 15–21. 10.1093/BIOINFORMATICS/BTS635. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 87.Langmead B, and Salzberg SL (2012). Fast gapped-read alignment with Bowtie 2. Nat. Methods 9, 357–359. 10.1038/NMETH.1923. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 88.Liao Y, Smyth GK, and Shi W (2014). featureCounts: an efficient general purpose program for assigning sequence reads to genomic features. Bioinformatics 30, 923–930. 10.1093/BIOINFORMATICS/BTT656. [DOI] [PubMed] [Google Scholar]
  • 89.Wang L, Wang S, and Li W (2012). RSeQC: quality control of RNA-seq experiments. Bioinformatics 28, 2184–2185. 10.1093/BIOINFORMATICS/BTS356. [DOI] [PubMed] [Google Scholar]
  • 90.Li H (2011). Tabix: fast retrieval of sequence features from generic TAB-delimited files. Bioinformatics 27, 718–719. 10.1093/BIOINFORMATICS/BTQ671. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 91.Zhang Y, Liu T, Meyer CA, Eeckhoute J, Johnson DS, Bernstein BE, Nussbaum C, Myers RM, Brown M, Li W, et al. (2008). Model-based analysis of ChIP-Seq (MACS). Genome Biol. 9, 1–9. 10.1186/GB-2008-9-9-R137/FIGURES/3. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 92.Wolock SL, Lopez R, and Klein AM (2019). Scrublet: Computational Identification of Cell Doublets in Single-Cell Transcriptomic Data. Cell Syst. 8, 281–291.e9. 10.1016/J.CELS.2018.11.005. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 93.Korsunsky I, Millard N, Fan J, Slowikowski K, Zhang F, Wei K, Baglaenko Y, Brenner M, Loh P. ru, and Raychaudhuri S. (2019). Fast, sensitive and accurate integration of single-cell data with Harmony. Nat. Methods 2019 1612 16, 1289–1296. 10.1038/s41592-019-0619-0. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 94.Badia-I-Mompel P, Vélez Santiago J, Braunger J, Geiss C, Dimitrov D, Müller-Dott S, Taus P, Dugourd A, Holland CH, Ramirez Flores RO, et al. (2022). decoupleR: ensemble of computational methods to infer biological activities from omics data. Bioinforma. Adv. 2. 10.1093/BIOADV/VBAC016. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 95.Holland CH, Tanevski J, Perales-Patón J, Gleixner J, Kumar MP, Mereu E, Joughin BA, Stegle O, Lauffenburger DA, Heyn H, et al. (2020). Robustness and applicability of transcription factor and pathway analysis tools on single-cell RNA-seq data. Genome Biol. 2020 211 21, 1–19. 10.1186/S13059-020-1949-Z. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 96.Li H, and Humphreys BD (2022). Mouse kidney nuclear isolation and library preparation for single-cell combinatorial indexing RNA sequencing. STAR Protoc. 3, 101904. 10.1016/j.xpro.2022.101904. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 97.Hinze C, Kocks C, Leiz J, Karaiskos N, Boltengagen A, Cao S, Skopnik CM, Klocke J, Hardenberg JH, Stockmann H, et al. (2022). Single-cell transcriptomics reveals common epithelial response patterns in human acute kidney injury. Genome Med. 14, 1–18. 10.1186/s13073-022-01108-9. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 98.Granja JM, Corces MR, Pierce SE, Bagdatli ST, Choudhry H, Chang HY, and Greenleaf WJ (2021). ArchR is a scalable software package for integrative single-cell chromatin accessibility analysis. Nat. Genet. 53, 403–411. 10.1038/s41588-021-00790-6. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 99.Zhang K, Zemke NR, Armand EJ, and Ren B (2023). SnapATAC2: a fast, scalable and versatile tool for analysis of single-cell omics data. bioRxiv, 2023.09.11.557221. 10.1101/2023.09.11.557221. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 100.Liberzon A, Birger C, Thorvaldsdóttir H, Ghandi M, Mesirov JP, and Tamayo P (2015). The Molecular Signatures Database (MSigDB) hallmark gene set collection. Cell Syst. 1, 417–425. 10.1016/J.CELS.2015.12.004. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 101.Satija R, Farrell JA, Gennert D, Schier AF, and Regev A (2015). Spatial reconstruction of single-cell gene expression data. Nat. Biotechnol. 2015 335 33, 495–502. 10.1038/nbt.3192. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 102.Shao X, Taha IN, Clauser KR, Gao Y (Tom), and Naba, A. (2020). MatrisomeDB: the ECM-protein knowledge database. Nucleic Acids Res. 48, D1136–D1144. 10.1093/NAR/GKZ849. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 103.Carbon S, Ireland A, Mungall CJ, Shu S, Marshall B, Lewis S, Lomax J, Mungall C, Hitz B, Balakrishnan R, et al. (2009). AmiGO: online access to ontology and annotation data. Bioinformatics 25, 288–289. 10.1093/BIOINFORMATICS/BTN615. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 104.Fornes O, Castro-Mondragon JA, Khan A, Van Der Lee R, Zhang X, Richmond PA, Modi BP, Correard S, Gheorghe M, Baranašić D, et al. (2020). JASPAR 2020: update of the open-access database of transcription factor binding profiles. Nucleic Acids Res. 48, D87–D92. 10.1093/NAR/GKZ1001. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 105.Pliner HA, Packer JS, McFaline-Figueroa JL, Cusanovich DA, Daza RM, Aghamirzaie D, Srivatsan S, Qiu X, Jackson D, Minkina A, et al. (2018). Cicero Predicts cis-Regulatory DNA Interactions from Single-Cell Chromatin Accessibility Data. Mol. Cell 71, 858–871.e8. 10.1016/J.MOLCEL.2018.06.044. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 106.Schindelin J, Arganda-Carreras I, Frise E, Kaynig V, Longair M, Pietzsch T, Preibisch S, Rueden C, Saalfeld S, Schmid B, et al. (2012). Fiji: an open-source platform for biological-image analysis. Nat. Methods 2012 97 9, 676–682. 10.1038/nmeth.2019. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 107.Robinson MD, McCarthy DJ, and Smyth GK (2010). edgeR: a Bioconductor package for differential expression analysis of digital gene expression data. Bioinformatics 26, 139–140. 10.1093/BIOINFORMATICS/BTP616. [DOI] [PMC free article] [PubMed] [Google Scholar]

Associated Data

Data Availability Statement

Raw data (.fastq), processed data (count matrix .h5 files and fragment .bed files), sublibrary primer sequences and metadata of the SHARE-seq data have been deposited in NCBI’s Gene Expression Omnibus and are available through GEO Series accession number GSE234788. Raw and processed bulk RNA-seq data on RPTECs are available through GEO Series accession number GSE240639. Supporting information, including supporting figures and matrices of imaging mass spectrometry (IMS) data, is available in Mendeley Data. The IMS data can be explored online at https://metaspace2020.eu/project/maldipy_kidney.

Scripts for the pipelines for object generation, pseudobulk analysis, single-cell analysis, subclustering analysis and generation of all major figures in this study were written mostly in Python and R; the code is available at https://github.com/TheHumphreysLab/SHARE-seq-kidney. Our package for IMS data analysis, MALDIpy, is documented at https://github.com/TheHumphreysLab/MALDIpy (the code is also available at https://pypi.org/project/MALDIpy/ and https://doi.org/10.6084/m9.figshare.25254688). Any additional information required to reanalyze the data reported in this paper is available from the lead contact upon request.

Data S1. Unprocessed data underlying the display items in the manuscript, related to Figures 2, 6, S24, S6 and S7.
