ST-GEARS: Advancing 3D downstream research through accurate spatial information recovery

Tianyi Xia; Luni Hu; Lulu Zuo; Lei Cao; Yunjia Zhang; Mengyang Xu; Qin Lu; Lei Zhang; Taotao Pan; Bohan Zhang; Bowen Ma; Chuan Chen; Junfu Guo; Chang Shi; Mei Li; Chao Liu; Yuxiang Li; Yong Zhang; Shuangsang Fang

doi:10.1038/s41467-024-51935-0

2024 Sep 6;15:7806. doi: 10.1038/s41467-024-51935-0

PMCID: PMC11379900 PMID: 39242563

Abstract

Three-dimensional Spatial Transcriptomics has revolutionized our understanding of tissue regionalization, organogenesis, and development. However, existing approaches overlook either spatial information or experiment-induced distortions, leading to significant discrepancies between reconstruction results and in vivo cell locations and, in turn, to unreliable downstream analysis. To address these challenges, we propose ST-GEARS (Spatial Transcriptomics GEospatial profile recovery system through AnchoRS). By incorporating innovative Distributive Constraints into the Optimization scheme, ST-GEARS retrieves, with high precision, anchors that connect the closest spots across sections in vivo. Guided by the anchors, it first rigidly aligns sections, then solves and denoises Elastic Fields to counteract distortions. Through mathematically proved Bi-sectional Fields Application, it finally recovers the original spatial profile. Studying ST-GEARS across different numbers of sections, sectional distances and sequencing platforms, we observed outstanding performance at tissue, cell, and gene levels. ST-GEARS provides precise and well-explainable ‘gears’ between in vivo situations and in vitro analysis, powerfully fueling the potential for biological discoveries.

Subject terms: Computational models, Software, Transcriptomics, Bioinformatics, Data processing

Existing 3D Spatial Transcriptomics reconstruction approaches often overlook spatial information or experiment-induced distortions. Here, authors propose ST-GEARS to bridge the gap between in vivo cell locations and in vitro analysis, accurately recovering spatial profiles.

Introduction

Spatial transcriptomics (ST) is an omics technology that fuels biological research by measuring gene expression at each position-recorded spot across sliced tissues^1–3. A range of methods has been developed. In situ sequencing (ISS)⁴ platforms such as Barcoded Anatomy Resolved by Sequencing (BARseq)⁵ and Spatially-resolved Transcript Amplicon Readout Mapping (STARmap)⁶ rely on amplification, hybridization and imaging to capture gene expression information. Next Generation Sequencing (NGS)⁷ platforms such as Visium¹, Stereo-seq⁸ and Slide-Seq2⁹ use spatial barcoding and capturing in their implementations. These methods offer sequencing resolutions ranging from 100 µm^10,11 to 500 nm⁸, and can measure thousands⁵ to tens of thousands⁸ of genes simultaneously.

Single-slice ST studies have unleashed discoveries and facilitated our understanding in diverse biological and medical fields^9,12–15. Consequently, numerous processing pipelines and analysis models have been developed for ST data on a single section^16–21. However, to capture transcriptomics in its real-world context, three-dimensional (3D) ST was designed to recover biological states and processes in real-world dimensions, without the restriction to isolated planes that single-section ST studies impose. Various studies have used the power of 3D ST to uncover insights into homeostasis, development, and disease. Among them, Wang et al. ²² uncovered spatial cell state dynamics of Drosophila larval testis and revealed potential regulons of transcription factors. Mohenska et al. ²³ revealed complex spatial patterns in Murine heart and identified novel markers for cardiac subsections. Vickovic et al. ²⁴ explored cell type localizations in Human rheumatoid arthritis synovium. The large variety of downstream 3D research has created the need for a reliable and automatic method to recover the in vivo spatial profile.

However, the collection process of ST data poses significant challenges to the accurate reconstruction of 3D ST, and these challenges have not been overcome by current approaches. Specifically, in 3D ST experiments, individual slices are cross-sectioned in a consistent direction, then manually placed on different chips or slides^14,25. This operation introduces a different geospatial reference system for each section, and coordinates are distorted compared to their in vivo states. The distortions arise from squeezing and stretching during the picking, holding, and relocation of the sections. Such differing geospatial systems and distortions complicate the recovery of the in vivo 3D profile. Among current recovery approaches, STUtility²⁶ realizes multi-section alignment through the registration of histology images, without considering either the geospatial or the molecular profile of mRNA, which compromises accuracy. The recently published method PASTE²⁷ and its second version PASTE2²⁸ achieve alignment using both gene expression and coordinate information, by optimizing a mapping between individual spots across sections. These methods yield inaccurate mappings and produce rotational misalignments, owing to their nonadaptive regularization factors and to the uniform sum of probability they assign to all spots even when some spots have no actual anchors. All of the above approaches consider only rigid alignment and neglect the correction of shape distortions, resulting in shape inconsistency across registered sections. The published method Gaussian Process Spatial Alignment (GPSA)²⁹ considers shape distortions in its alignment, yet it does not include structural consistency in its loss function, which can cause the model to overfit to local gene expression similarities and introduce mistaken distortions of spatial information. Moreover, its hypothesis space involves readout prediction in addition to coordinate alignment, causing uncertainty in the direction of gradient descent and vulnerability to input noise. Another alignment approach, Spatial-linked alignment tool (SLAT)³⁰, also focuses on anchor construction between sections, yet it does not provide a methodology to construct a 3D transcriptomics profile. Other tools, such as Spateo³¹, VT3D³² and StereoPy³³, focus on analysis and visualization of 3D data.

To address these limitations, we introduce ST-GEARS, a 3D geospatial profile recovery approach designed for ST experiments. By formulating the problem in the framework of Fused Gromov-Wasserstein (FGW) Optimal Transport (OT)³⁴, ST-GEARS incorporates both gene expression and structural similarity into the Optimization process to retrieve cross-sectional mappings of spots with the same in vivo planar positions, also referred to as ‘anchors’. During this process, we introduce innovative Distributive Constraints that allow different emphasis on distinct spot groups. The strategy emphasizes expression-consistent groups and prevents inconsistent groups from disturbing the optimization, and hence increases anchor accuracy compared to current approaches. ST-GEARS first uses the retrieved anchors to perform rigid alignment of sections. It then introduces an Elastic Field, guided by the anchors, that represents the deformation and the correction needed at each spot’s location. To enhance the quality of the field, Gaussian Smoothing is applied for denoising. ST-GEARS then uses Bi-sectional Application to correct each section’s spatial profile based on the denoised fields calculated with its neighboring sections. With its validity proved mathematically, Bi-sectional Application eliminates distortions of sections, resulting in the successful recovery of a 3D in vivo spatial profile.

To understand the effects of ST-GEARS, we first studied its two innovative components, anchor retrieval and elastic registration, on Human dorsolateral prefrontal cortex (DLPFC)³⁵ and Drosophila larva²², respectively. We found a higher anchor accuracy for ST-GEARS than for other available methods that involve the concept of anchors, and identified Distributive Constraints as the reason behind the improvement. We validated the effectiveness of the elastic registration process of ST-GEARS on both tissue shape smoothness and cross-sectional consistency. Then, we studied the output of ST-GEARS and other methods on the reconstruction of Mouse hippocampus tissues³⁶, a Drosophila embryo individual²² and a complete Mouse brain³⁷. The results were studied at morphological, cell and gene levels. ST-GEARS was found to be the only method that reconstructed correctly in all cases regardless of cross-sectioning distance, number of sections, and sequencing platform, and it was found to output the most accurate spatial information as judged by annotation type or clustering information and by hybridization evidence.

Results

ST-GEARS algorithm

ST-GEARS takes ST data as its input, including mRNA expression, spatial coordinates, and approximate grouping information such as the clustering or annotation of each observation. It then recovers the 3D geospatial profile in the following steps (Fig. 1).

Fig. 1 — a The automatic pipeline of ST-GEARS, which recovers 3D in vivo spatial information by ordered steps including Fused Gromov Wasserstein (FGW) Optimal Transport (OT) problem parameter computing, problem formulating and solving, which outputs probabilistic anchors across sections, rigid registration through Procrustes Analysis, which solves optimal positional alignment using the anchors, and finally elastic registration. The input of the method is Unique molecular identifier (UMI) counts and the location of each spot measured by ST technology, along with their annotations or cross-section clustering result. The output of the method is the recovered 3D in vivo spatial information of the experimented tissue or sample. b FGW OT problem parameter computing, which assigns nonuniform weights to spots in preparation for problem formulating, based on cross-sectional similarity of annotation types or clusters. c FGW OT problem formulating, whose setting aims to solve probabilistic anchors joining spots with highest in vivo proximity, through optimizing the combination of gene expression and structural similarity³⁴. FGW OT problem solving, which is implemented based on the Conditional Gradient (CG) method, leading to retrieved probabilistic anchors. d Elastic registration, which uses the anchors again to compute and denoise distortion fields that guide the elimination of distortions, then applies the fields bi-sectionally to positionally aligned sections, leading to the recovered 3D in vivo spatial information.

(1) Optimization problem formulation under the scheme of FGW OT, enhanced by Distributive Constraints. The FGW OT formulation is established to enable the solving of ‘anchors’, each of which joins a pair of spots with the same in vivo planar position. Notably, each solved anchor carries a probability that describes its strength of connection, and each spot can have zero to multiple anchors. For each pair of sections, section-specific groups of spots and section-specific genes are first excluded from the formulation to avoid disturbing anchor computation. Considering that connected spots are spatially closer and more similar in gene expression because of shared cell identity^38,39, FGW was adopted to combine the gene expression and structural terms in optimization, enabling the highest gene expression similarity between mapped spots while keeping similar spot positions relative to their sections. Moreover, an innovative Distributive Constraints setting is designed and integrated into the FGW OT formulation, to place higher emphasis on spots or cells whose annotation or cluster shows high similarity across sections, and lower emphasis otherwise. Distributive Constraints lead registration to rely more on expression-consistent regions of sections, hence largely enhancing both the accuracy of anchors and the precision of the subsequent rigid and elastic registration.

(2) Optimization problem solving utilizing self-adaptive regularization and conditional gradient descent. Our designed Self-adaptive Regularization strategy automatically determines the relative importance between gene expression and structural terms in the optimization problem. This strategy leads to an optimal regularization factor across different section distances, spot sizes, extent of distortions, and data quality such as level of diffusion. Conditional Gradient³⁴ is adopted as optimizer, which updates anchors iteratively towards higher expression and structural similarity with each iteration. The efficacy of Conditional Gradient has been demonstrated through its convergence to a local optimal point⁴⁰, thereby ensuring the robustness and effectiveness of our approach.

(3) Rigid registration by Procrustes Analysis⁴¹. After filtering out anchors with relatively low probabilities, the optimal translation and rotation of each section are analytically solved through Procrustes Analysis, which minimizes the summed spatial distances of spots anchored to each other. With the translation and rotation applied, sections are positionally aligned.

(4) Elastic registration guided by anchors. Based on the rigid registration result and the anchors solved by FGW OT, elastic registration is implemented through elastic field inference, 2D Gaussian denoising, and bi-sectional fields application. For each rigidly registered section, elastic fields are inferred from the location differences between its own spots and its anchored spots on the anterior and posterior neighbor sections. An elastic field is a 2D displacement distribution, describing how displacement values are distributed across different locations. Making use of the continuity of deformation at local scales, 2D Gaussian Denoising is convolved over the fields to reduce noise. With the denoised fields, our designed Bi-sectional Fields Application corrects each section’s deformation according to its fields calculated with the anterior and posterior neighbor sections. The bi-sectional correction method is mathematically proved to approximately recover each section’s spatial profile to its original state.
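Step (3) above can be illustrated with a minimal sketch. It assumes 2D spot coordinates and a list of retained anchors given as matched coordinate pairs with their probabilities as weights; the function names `weighted_procrustes_2d` and `apply_rigid_2d` are hypothetical and are not taken from the ST-GEARS code. In 2D the weighted Procrustes rotation has a closed form, which is used here in place of the more general SVD-based solution, and reflections are not handled.

```python
import math

def weighted_procrustes_2d(src, dst, weights):
    """Return (theta, tx, ty) minimizing sum_k w_k * |R(theta) src_k + t - dst_k|^2.

    src, dst: lists of (x, y) tuples for anchored spot pairs (assumed input format).
    weights:  anchor probabilities used as non-negative weights.
    """
    w_sum = sum(weights)
    # Weighted centroids of both point sets.
    cxs = sum(w * p[0] for w, p in zip(weights, src)) / w_sum
    cys = sum(w * p[1] for w, p in zip(weights, src)) / w_sum
    cxd = sum(w * p[0] for w, p in zip(weights, dst)) / w_sum
    cyd = sum(w * p[1] for w, p in zip(weights, dst)) / w_sum
    # Weighted dot and cross products of the centered coordinates.
    dot = cross = 0.0
    for w, (xs, ys), (xd, yd) in zip(weights, src, dst):
        xs, ys, xd, yd = xs - cxs, ys - cys, xd - cxd, yd - cyd
        dot += w * (xs * xd + ys * yd)
        cross += w * (xs * yd - ys * xd)
    # The optimal rotation angle maximizes cos(theta)*dot + sin(theta)*cross.
    theta = math.atan2(cross, dot)
    c, s = math.cos(theta), math.sin(theta)
    # The translation maps the rotated source centroid onto the target centroid.
    tx = cxd - (c * cxs - s * cys)
    ty = cyd - (s * cxs + c * cys)
    return theta, tx, ty

def apply_rigid_2d(points, theta, tx, ty):
    """Rotate points by theta about the origin, then translate by (tx, ty)."""
    c, s = math.cos(theta), math.sin(theta)
    return [(c * x - s * y + tx, s * x + c * y + ty) for x, y in points]
```

In use, one would compute `theta, tx, ty` from the anchored pairs of two neighboring sections and then call `apply_rigid_2d` on all spots of the moving section, not only the anchored ones.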

Enhancement of anchor retrieval accuracy through distributive constraints

As described above, ST-GEARS is an algorithmic workflow that combines probabilistic anchor computation with spatial information recovery. Hence, to validate the effectiveness of our method and demonstrate its underlying design philosophy, we conducted comprehensive studies of the two components using real-world data. To begin, we used the DLPFC dataset³⁵ to study our anchor retrieval accuracy, with emphasis on the effect of the Distributive Constraints design.

To assess the effects of Distributive Constraints on anchor accuracy, we compared ST-GEARS with and without this setting, and with other methods that involve constraints, including PASTE, PASTE2 and SLAT. We investigated the constraint values assigned by these methods, as well as the number of anchors and the maximum anchor probability they solved for each spot. Furthermore, we examined the annotation types that were considered connected based on the computed anchors to assess the accuracy of anchors. Among the methods we compared, ST-GEARS with Distributive Constraints was found to assign different constraint values to spots within different neuron layers, while the others assigned uniform constraints to all layers (Fig. 2a, Supplementary Fig. 1). The results of ST-GEARS showed that both the number of anchors and the anchors’ maximum probabilities for each spot were lower in Layer 2 and Layer 4 than in the thicker layers. This pattern was not observed in methods without the Distributive Constraints setting (Fig. 2a, Supplementary Fig. 1). To illustrate the impact of this strategy on anchor accuracy, we tagged each spot with the annotation of the spot connected to it by the anchor with highest probability. We then compared this result to the tagged spot’s original annotation (Fig. 2a, Supplementary Fig. 1). Under Distributive Constraints, ST-GEARS achieved a significantly closer match between annotations than PASTE and our method without Distributive Constraints. PASTE2 also led to closely matching annotations, but it anchored multiple spots to spots from neighboring layers, particularly those near layer boundaries. SLAT also mapped multiple spots to spots from different tissue layers, particularly spots located in Layers 2, 4 and 6.
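The tagging step described above, in which each spot receives the annotation of the spot joined to it by its strongest anchor, can be sketched as follows. This is an illustration only: it assumes the anchors are available as a dense matrix `pi` (rows are spots of one section, columns are spots of the other), and the function name `project_annotations` is hypothetical. Spots with no anchor at all are tagged `None`, since a spot may have zero anchors.

```python
def project_annotations(pi, labels_other):
    """Tag each row spot with the label of its highest-probability anchored spot.

    pi:           list of rows; pi[i][j] is the anchor probability between
                  spot i of this section and spot j of the other section.
    labels_other: annotation of each spot j on the other section.
    """
    projected = []
    for row in pi:
        # Index of the strongest anchor of this spot.
        best_j = max(range(len(row)), key=row.__getitem__)
        # A spot whose strongest anchor has zero probability has no anchor.
        projected.append(labels_other[best_j] if row[best_j] > 0 else None)
    return projected
```

Comparing the projected labels with each spot's original annotation then gives the visual check used in Fig. 2a.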

Fig. 2 — a (from left to right) 1st and 2nd human dorsolateral prefrontal cortex (DLPFC) section of patient #3 by Maynard et al.³⁵ with their provided annotations and our anchors showcase, (of the same section pair) probabilistic constraints settings in Optimal Transport (OT) problem formulating, no. of anchors computed on each spot, max. anchor probability value computed for each spot, and annotation type mapped back to spots through computed anchors; (from top to bottom) respectively by PASTE, PASTE2, SLAT, ours without distributive constraints setting, and ours. The boundaries between annotation types on the 1st section are marked by dotted lines. Mapping accuracy is used to measure the accuracy of anchors and is marked alongside the respective annotation type mapping visualizations. b Mapping accuracy measured on anchors of the section pairs used in (a) by PASTE, PASTE2, SLAT, and ST-GEARS. c Comparison of no. of anchors histograms between ST-GEARS and ST-GEARS without distributive constraints, for section pairs of 1st and 2nd, 2nd and 3rd, and 3rd and 4th sections. The Probability Density Function (PDF) estimated by Gaussian kernel is plotted in dotted lines with the same color as the histograms, to highlight the distribution differences. Source data are provided as a Source Data file.

To evaluate the precision of anchors, we conducted a comparison with the Mapping accuracy index introduced by PASTE²⁷. This index measures the weighted percentage $\sum_{i,j:\, l(i) = l(j)} \pi_{ij}$ of anchors that connect spots with the same annotation. As a result, ST-GEARS outperformed PASTE2 and SLAT, and reached a score that was over 0.5 (out of 1) higher than both PASTE and our method without Distributive Constraints (Fig. 2a, b, Supplementary Fig. 1).
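As a small worked illustration of the index, the sketch below sums the anchor probabilities over all spot pairs whose annotations agree. It assumes the anchors are given as a dense matrix `pi` whose entries sum to 1, so that the result lies between 0 and 1; the function name `mapping_accuracy` is hypothetical.

```python
def mapping_accuracy(pi, labels_a, labels_b):
    """Sum of pi[i][j] over all pairs (i, j) with labels_a[i] == labels_b[j]."""
    return sum(
        p
        for i, row in enumerate(pi)
        for j, p in enumerate(row)
        if labels_a[i] == labels_b[j]
    )

# Two spots per section: 0.4 + 0.5 of the probability mass joins spots
# that share an annotation, 0.1 joins spots with different annotations.
pi = [[0.4, 0.1],
      [0.0, 0.5]]
score = mapping_accuracy(pi, ["Layer1", "Layer2"], ["Layer1", "Layer2"])
```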

These phenomena can be explained as follows. As the functional areas lying between thicker neocortical layers, thinner neocortical layers are about as similar in gene expression to their adjacent layers as to their own annotation type^1,35. This implies that, in contrast to thicker layers, thinner layers tend to introduce more disturbances during anchor computation. The Distributive Constraints, however, suppressed these annotation types by assigning a smaller sum of probability to each of their spots. The suppression is reflected in the results above, where each spot in Layer 2 and Layer 4 has fewer assigned anchors and a lower maximum probability (Fig. 2a, Supplementary Fig. 1). Further analysis of all spots in the DLPFC reveals that a certain percentage of spots were suppressed in anchor generation due to the Distributive Constraints (Fig. 2c, Supplementary Fig. 2).
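The suppression can be read off the marginal constraints of the transport problem. As a sketch in standard OT notation, and assuming that the constraints of ST-GEARS take the usual marginal form (the symbols $p$, $q$ and $n$ are introduced here for illustration; the exact weighting rule is given in Methods), the anchor matrix $\pi$ satisfies

$$\pi_{ij} \ge 0, \qquad \sum_{j} \pi_{ij} = p_i, \qquad \sum_{i} \pi_{ij} = q_j, \qquad \sum_{i} p_i = \sum_{j} q_j = 1.$$

A uniform setting corresponds to $p_i = 1/n$ for all $n$ spots of a section. Under a nonuniform setting, a spot in a group with low cross-sectional similarity receives a smaller $p_i$, which caps the total probability of all its anchors and therefore also its maximum anchor probability, $\max_j \pi_{ij} \le p_i$.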

Recovery of in vivo shape profile through elastic registration

We then used Drosophila larva data to investigate the spatial profile recovery effect of ST-GEARS, with an emphasis on our elastic registration. We first applied rigid registration to the Drosophila larva sections and observed a visually aligned configuration of individual sections (Supplementary Fig. 3). When cell annotations were mapped back to the previous sections according to the strongest anchor of each spot, the projected annotations visually matched the original ones (Supplementary Fig. 4). The match between projected and original annotations was quantified by Mapping accuracy (Supplementary Fig. 5). These findings validated that ST-GEARS produced reliable anchors and accurately aligned sections through rigid registration. However, when stacking the sections together, we observed an inconsistency on the edge of the lateral cross-section of the rigid result (Supplementary Fig. 6). This inconsistency does not conform to the known intra-tissue and overall structural continuity of Drosophila larvae.

After applying elastic registration to the rigidly-aligned larva, we observed a notable improvement in the continuity of the cross section above, indicating that spatial information closer to the real state was retrieved. To further understand the effect of the elastic operation on the dataset, we compared the changes in area of the complete body and three individual tissues (trachea, central nervous system (CNS), and fat body) across all sections. We observed an enhanced smoothness in the curves of elastically registered sections, which agrees with the continuous morphology expected of the larva. To quantify the smoothing effect, we calculated the Scale-independent Standard Deviation of Differences ( $\mathrm{SI\text{-}STD\text{-}DI} = \mathrm{STD}(\{s_{i} - s_{i-1} : i \in [1, 2, \ldots, I-1]\}) / |\mathrm{mean}(\{s_{i} - s_{i-1} : i \in [1, 2, \ldots, I-1]\})|$ ) of the curves, which measures the smoothness of area changes along the sectioning direction (Fig. 3a and Methods). A decrease of SI-STD-DI on all tissues and the body provided empirical evidence for the improved smoothness. To further investigate the recovery of internal structures, we introduced Mean Structural Similarity (MSSIM). MSSIM takes structurally consistent sections as input, and measures the pairwise internal similarity of the reconstructed result using annotations or clustering information (Supplementary Fig. 7; see Methods for details). An improved MSSIM was noticed on all 4 sections, indicating that elastic registration further recovers internal geospatial continuity on the basis of the rigid operation (Fig. 3b). By comparing the registration effect on individual sections, we also observed that the elastic process successfully rectified a bending flaw along the edge of the third section (Fig. 3c). The shape fixing highlighted that ST-GEARS not only yielded a more structurally consistent 3D volume, but also provided a more accurate morphology for single sections. The improved smoothness, the recovered structural continuity, and the shape fixing collectively demonstrate that elastic registration effectively recovers the geospatial profile.
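The SI-STD-DI index can be computed from a sequence of per-section areas as in the following sketch. It assumes the areas are ordered along the sectioning direction and uses the population standard deviation, which this section does not specify (see Methods for the authors' definition); the function name `si_std_di` is hypothetical.

```python
import statistics

def si_std_di(areas):
    """Scale-independent standard deviation of consecutive area differences.

    areas: tissue (or body) area on each section, ordered along the
           sectioning direction. Lower values mean smoother area changes.
    """
    # Differences between each section and its predecessor.
    diffs = [b - a for a, b in zip(areas, areas[1:])]
    mean_diff = statistics.fmean(diffs)
    if mean_diff == 0:
        raise ValueError("mean difference is zero; the index is undefined")
    # Assumption: population standard deviation (divides by the number of differences).
    return statistics.pstdev(diffs) / abs(mean_diff)
```

Dividing by the absolute mean difference is what makes the index independent of scale: multiplying every area by a constant leaves the value unchanged, so curves of tissues with very different sizes can be compared.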

Fig. 3 — a A comparison of area changes of 3 tissues and the complete body of *Drosophila* Larva, between the result of rigid registration and the result of elastic registration appended to rigid registration. The areas are calculated based on the recovered spot positions of different tissues along the cross-sectioning direction. Scale-independent Standard Deviation of Differences (SI-STD-DI), quantifying the smoothness, is marked alongside each curve. b A comparison of structural accuracy, measured by Mean Structural Similarity (MSSIM), of selected section pairs from *Drosophila* Larva (L3), between the result of rigid registration only and the result of elastic registration appended to rigid registration. The chosen section pairs are the structurally consistent ones. c Comparison of individual sections recovered by rigid registration only and by elastic registration appended to rigid registration, for the 1st to 5th sections of *Drosophila* Larva (L3). Shape correction of the bent area in the 3rd section, and increased cross-sectional consistency on the 4th and 5th sections, are highlighted by blue arrows. Source data are provided as a Source Data file.

With the elastic process validated and applied to the rigid registration result, the recovery of spatial information was completed. Stacking the individual sections of the elastic result generated a complete geospatial profile of the larva (Supplementary Fig. 8), visualizing the ability of ST-GEARS to recover in vivo spatial information.

Application to sagittal sections of Mouse hippocampus

After validating the component phases of ST-GEARS, we proceeded to apply the method to multiple real-world problems to recover geospatial profiles. We first focused on two sagittal sections of Mouse hippocampus³⁶ (Supplementary Fig. 9) that were 10 μm apart, accounting for 1–2 layers of Cornu Ammonis (CA) 1 neurons⁴². Considering the proximity of these sections, we assumed no structural differences between them.

To compare the registration effect among methods, we extracted CA fields and dentate gyrus (DG) beads (Supplementary Fig. 10), then stacked the two sections for a more obvious contrast (Fig. 4a). PASTE2 failed to perform the registration, leaving the sections unaligned. With GPSA, the sections’ positions were aligned, yet the 2nd section was squeezed into a narrower region than the first one, leading to contradictory region locations. The ‘narrowing’ phenomenon may be caused by overfitting of the GPSA model on gene expression similarity, since it does not include structural similarity between registered sections in its loss function. The scale on the horizontal and vertical axes was distorted due to the equal scale range strategy adopted in GPSA’s preprocessing. STalign also misaligned the sections, leaving an obvious angle between the two slices in the registration result. This may be due to the method’s processing of ST data into images, which relies completely on gene expression abundance to decide pixel intensities. On the sagittal section of Mouse hippocampus, the abundance difference between regions may not provide the structural information required by registration. In the comparison between PASTE and ST-GEARS, our method shows a more accurate centerline overlap of CA fields and DG. This indicates an enhanced recovery of spatial structure consistency and an improved registration effect. To quantitatively evaluate these findings, we used the MSSIM index as a measure of structural consistency and compared it among PASTE, PASTE2, GPSA, STalign and ST-GEARS (Fig. 4b). Consistent with the centerline results, ST-GEARS achieved a higher MSSIM score than GPSA and PASTE, and surpassed PASTE2 and STalign by >0.2 out of 1. In terms of memory efficiency, ST-GEARS and PASTE used ~1 GB less memory than PASTE2, GPSA and STalign, and the peak memory of ST-GEARS and PASTE was almost the same (Supplementary Fig. 11). In terms of time efficiency, registration with ST-GEARS, STalign, GPSA and PASTE was much faster than with PASTE2.

Fig. 4 — a Stacked projections of Cornu Ammonis (CA) fields and dentate gyrus (DG), of pre-registered and registered result of *Mouse* hippocampus sagittal sections with 10 µm distance, respectively by PASTE, PASTE2, GPSA, STalign and ST-GEARS. b A comparison of both MSSIM measuring structural accuracy and Mapping accuracy measuring anchor accuracy of the 2 registered sections, across PASTE, PASTE2, GPSA, STalign and ST-GEARS. c Stacked projections of region-specific annotation types including DG, Neurogenesis, subiculum, CA1, CA2 and CA3, registered by ST-GEARS. Each column highlights the stacked projection of a single annotation type. Source data are provided as a Source Data file.

To understand the reasons behind our enhancement, we thoroughly examined the anchors generated by PASTE, PASTE2 and ST-GEARS, as well as the effects of our elastic registration. By mapping cluster information of the 2nd section to the 1st, and of the 1st to the 2nd, through anchors, we found correspondences between the projected and original annotations (Supplementary Fig. 12). Accordingly, our Mapping accuracy was over 0.25 higher than that of PASTE and over 0.45 higher than that of PASTE2 (Fig. 4b), indicating our exceptional anchor accuracy. To understand and further substantiate this advantage, we visualized the probabilistic constraints and the resulting anchor probabilities (Supplementary Fig. 13a). It is worth noting that ST-GEARS implemented Distributive Constraints, in contrast to the uniform distributions used by PASTE. As a result, a certain percentage of spots were found to be suppressed in anchor connection by ST-GEARS (Supplementary Fig. 13b) compared to PASTE, leaving the registration to rely more on spots with higher cross-sectional similarity and fewer computational disturbances, and hence leading to a higher anchor accuracy. We excluded Distributive Constraints from ST-GEARS, and noticed an obvious decrease of mapping accuracy on the hippocampus dataset (Supplementary Fig. 14), indicating the contribution of Distributive Constraints to anchor accuracy. In the study of the elastic effect, we found that the centerlines of the overlaid CA fields and DG overlapped more after elastic registration than after the rigid operation only (Fig. 4b). Quantitatively, by MSSIM, the cross-sectional similarity was found to be increased by elastic registration (Supplementary Fig. 15). These findings suggest that the combination of Distributive Constraints and the elastic process contributed to the enhanced registration of the Mouse hippocampus.

To explore the potential impact of our registration on downstream analysis, we extracted region-specific annotation types from the sections, and analyzed their overlap by stacking the registered sections together (Fig. 4c). In all annotation types, including DG, Neurogenesis, subiculum, CA1, CA2 and CA3, the distribution regions from both sections were nearly identical. The overlapping result shows that ST-GEARS integrated the spatial profiles of the same cell subpopulations, enabling a convenient and accurate downstream analysis of multiple sections.

Application to 3D reconstruction of Drosophila embryo

Besides the tissue-level registration of Mouse hippocampus, to evaluate the performance of ST-GEARS in reconstructing an individual from multiple sections, we further tested it on a Drosophila embryo. The transcriptomics of the embryo was measured by Stereo-seq, with a 7 μm cross-sectioning distance²². By quantifying the registration effect of spatial information recovery and comparing it to PASTE, PASTE2, GPSA and STalign, we found that ST-GEARS achieved the highest MSSIM in five out of the six structurally consistent pairs (Fig. 5a). On the pair where ST-GEARS did not reach the highest MSSIM, it surpassed PASTE, and achieved a score similar to that of PASTE2. By comparing area changes, with SI-STD-DI quantification, of the complete section and three individual tissues including epidermis, midgut and foregut, ST-GEARS yielded higher smoothness on all regions than all other approaches, both visually and quantitatively (Fig. 5b).

Fig. 5 — a A comparison of Mean Structural Similarity (MSSIM) measuring structural similarity, of section pairs that are structurally consistent from *Drosophila* Embryo (E14-16h), between reconstruction results of PASTE, PASTE2, GPSA, STalign and ST-GEARS. b A comparison of area changes of 3 tissues and complete body of *Drosophila* Embryo, along cross-sectioning direction, between reconstruction results of PASTE, PASTE2, GPSA, STalign and ST-GEARS. Scale-independent Standard Deviation of Differences (SI-STD-DI), which measures structural consistency, is marked alongside each curve to quantify the smoothness. The smoothness differences of ST-GEARS compared to PASTE, PASTE2 and STalign are highlighted by orange rectangles. c Reconstructed individual sections with recovered spatial location of each spot. In the result of PASTE, the incorrect flipping on the 15th section is highlighted in orange. In the result of PASTE2, gradual rotations are marked by the approximate symmetry axes of the 1st, 5th, 9th, 13th and 16th sections, with the symmetry axis of the 1st section replicated onto the 16th for angle comparison. In the result of GPSA, mistakenly distorted sections are marked by purple arrows. In the result of STalign, the incorrect flipping on the 13th section is highlighted in orange. In the result of ST-GEARS, the fix of the dissecting area on the 15th section is marked by a blue arrow. d Dorsal view of 3D reconstructed *Drosophila* embryo by PASTE, PASTE2, GPSA, STalign and ST-GEARS. The inaccurate regionalization of midgut is circled and pointed to with an arrow in orange. The resulting extruding part of a single section by PASTE2 is circled and pointed to in blue. e Mapping accuracy of all section pairs by PASTE, PASTE2 and ST-GEARS. f By dorsal view, regionalization of marker genes *Cpr56F* and *Osi7* by PASTE, PASTE2, GPSA, STalign and ST-GEARS, and their comparison with hybridization results from the Berkeley *Drosophila* Genome Project (BDGP) database. The gathering expression regions are highlighted by dotted lines. Source data are provided as a Source Data file.

To compare the reconstruction effect, we studied both the registered individual sections and the reconstructed 3D volume. Among the methods compared, PASTE produced a wrong flipping on the 15th section along the A-P axis (Fig. 5c). Stacking sections back to 3D and investigating the dorsal view, the wrong flipping caused a false regionalization of foregut, circled in orange (Fig. 5d). Along the first to last section registered by PASTE2, a gradual rotation was witnessed (Fig. 5c), leading to over 20 degrees of angular misalignment between the first and the last section. Similar to PASTE, this misalignment also caused the wrong regionalization of foregut in the 3D map (Fig. 5d). Also induced by the rotation, sections were found to extrude in the 3D result, circled in blue, breaking the round overall morphology of the embryo. GPSA caused false distortion of 8 out of 16 sections, as pointed out by purple arrows (Fig. 5c), and the stacked sections formed a dorsal view of an isolated circle and an inner region (Fig. 5d). This phenomenon may be due to its overfitting onto expressions, which is caused by the contradiction between its hypothesis of consistent readout across sections and the large readout variation across the 16 sections in this application. Similar to PASTE, STalign also produced a wrong flipping, on the 13th section along the A-P axis (Fig. 5c). Stacking the projections back to 3D, a mistaken regionalization of foregut, caused by the wrong flipping, was circled in orange (Fig. 5d). In contrast, ST-GEARS avoided all of these mistakes in its results (Fig. 5c). From the perspective of individual section profiles, noticeably in the 15th section, we observed a significant reduction in the dissecting region between two parallel lines, indicating the successful fixing of flaws in the section. By comparing time usage across all methods, ST-GEARS achieved the 2nd lowest time consumption in registration (Supplementary Fig. 11).
In terms of memory consumption, ST-GEARS, PASTE and STalign used much less memory than PASTE2 and GPSA. The three most memory-efficient methods used almost identical peak memory, with a value fluctuation of <7%.

To comprehend the rationale behind our improvement, we analyzed the anchors generated by the three methods and the impact of our elastic registration. In the investigation of anchor accuracy, we discovered that ST-GEARS achieves the highest mapping accuracy among all section pairs (Fig. 5e), suggesting its advanced ability to generate precise anchors, which forms the basis for precise spatial profile recovery. To understand this advancement, the probabilistic constraints and the resulting anchor distributions (Supplementary Fig. 16, Supplementary Fig. 17) were studied. With Distributive Constraints (Supplementary Fig. 16a), ST-GEARS generated different maximum probabilities on different annotation types (Supplementary Fig. 16b), which indicates that annotation types with higher cross-sectional consistency were prioritized in anchor generation. This selection led to reduced computational disturbances, and hence higher accuracy of anchors. We also compared anchor accuracy with and without Distributive Constraints adopted, and noticed an increase of mapping accuracy on each pair of sections when they were adopted (Supplementary Fig. 18). In the final registration result, ST-GEARS without Distributive Constraints failed to fix the experimental flaw on the 15th section (Supplementary Fig. 19), in contrast to the result with the constraints adopted (Fig. 5c). The above findings validate the contribution of Distributive Constraints to our method. In the study of the effect of elastic registration on shape smoothness, we witnessed an increased level of smoothness of the epidermis, foregut and midgut tissues, as well as of the complete section, through area changes quantified by the SI-STD-DI index (Supplementary Fig. 20). In terms of internal structure, an increased MSSIM of structurally consistent pairs was noticed (Supplementary Fig. 21). An experimental flaw on the 15th section was also fixed by elastic registration (Supplementary Fig. 22).
The above findings indicate that the enhancement of registration accuracy on the Drosophila embryo was induced by Distributive Constraints and the elastic process.

By mapping spots back to 3D space, we further investigated the effect of the different methods on downstream analysis, from the perspective of gene expression (Fig. 5f). Cpr56F and Osi7 were selected as marker genes, which were found to be highly expressed in the foregut, and in the foregut plus epidermis region, respectively²². Investigating Cpr56F expression by ST-GEARS from the dorsal view, we noticed three highly expressing regions, at the anterior end, front region, and posterior end of the embryo. The finding matches the hybridization result of stage 13-16 Drosophila embryo extracted from the Berkeley Drosophila Genome Project (BDGP) database. In contrast, none of PASTE, PASTE2, GPSA and STalign presented high expression at all three locations. When analyzing the distribution of Osi7 by PASTE, PASTE2 and STalign, we noticed a sharp decrease in expression from the inner region to the outer layer marked by purple arrows, contradicting the prior knowledge of high expression in the epidermis. This is probably because PASTE and PASTE2 do not consider distortion correction as part of their methods, leaving section edges non-coincident and marker genes not obviously highly expressed in the outermost region. Though it involves distortion correction, STalign lost a certain amount of structural information by transforming ST data to an image utilizing only information of regional gene expression abundance. Without the support of sufficient structural information, its registration did not adequately correct distortion. Similarly, PASTE2 failed to capture expression in outer layers and instead revealed a high expression in one inter-connected area, which did not correspond to the separate expression regions observed in the hybridization result. No spatial pattern was witnessed when analyzing the distribution of Osi7 by GPSA, which forms an obvious contrast to its hybridization evidence. By comparison, none of these violations was shown in the result of ST-GEARS.
The comparison of spatial distributions indicates the potential of ST-GEARS to better support downstream gene-related analysis.

Application to Mouse brain reconstruction

The design of 3D experiments involves various levels of sectioning distances^22,36,37. To further investigate the applicability of ST-GEARS on ST data with larger slice intervals, we applied the method to a complete Mouse brain hemisphere dataset, which consists of 40 coronal sections (Supplementary Fig. 23a), with a sectioning distance of 200 μm³⁷. The transcriptomics data was measured by BARseq, which includes sequencing data and its cross-modal histology images. Each observation represents captured transcriptomics surrounded by the boundary of a cell.

By applying PASTE, PASTE2, GPSA, STalign and ST-GEARS to the dataset, we observed multiple misaligned sections produced by PASTE, PASTE2, GPSA and STalign (Supplementary Fig. 23b, Supplementary Fig. 23c, Supplementary Fig. 23d, Fig. 6a). By PASTE, these misalignments include 2 sections with ~ 180° angular misalignment (Supplementary Fig. 23b). By PASTE2, 4 rotational misalignments and 8 positional misalignments were noticed (Supplementary Fig. 23d). By GPSA, 12 sections were observed to be rotationally misaligned, and 3 sections were mistakenly distorted (Supplementary Fig. 23b), probably due to its overfitting onto expressions discussed in the analysis of the Drosophila embryo. The scale on the horizontal and vertical axes was also distorted, possibly for a reason similar to that analyzed for the Mouse hippocampus. And by STalign, 7 rotational misalignments were generated (Supplementary Fig. 23e). In clear contrast, our algorithm correctly aligned all 40 sections with 200 μm intervals (Supplementary Fig. 23f). To more accurately assess the result of our registration, we employed the direction of the cutting lines induced during tissue processing³⁷, and compared the consistency of tilt angles of these lines in the 20th, 25th, 26th, 27th, 33rd, 34th and 37th slices where these lines are visible. Notably, neither visual angle differences nor cutting line curving were observed, indicating that the sections were properly aligned by ST-GEARS (Fig. 6a, Supplementary Fig. 23f). To quantify the registration accuracy in terms of structural continuity, we calculated MSSIM scores of 11 section pairs that are structurally consistent (Fig. 6b). Consistent with the visual observations, PASTE2 presented a much larger score range than other methods, which reflects its instability across sections in this dataset, and GPSA exhibited the lowest median MSSIM score, indicating its suboptimal average performance.
By comparison, PASTE yielded a higher median score and a smaller variation, while ST-GEARS resulted in the highest median score and the smallest variation among all methods. In terms of computational efficiency, ST-GEARS achieved the 2nd lowest time consumption and lowest peak memory consumption across all methods (Supplementary Fig. 11).

Fig. 6 — a Reconstructed individual sections with recovered spatial location of each spot from the 25th to 36th section. Positional misalignments are marked by arrows of green, and angular misalignments are marked by arrows of orange. Visible cutting lines by ST-GEARS are marked by dotted lines. b A comparison of Mean Structural Similarity (MSSIM) score of 11 section pairs that are structurally consistent, between result of PASTE, PASTE2, GPSA, STalign and our method. The 11 biological replicates were studied, which were derived from different closest section pairs with each section pair representing smallest unit of study. Non control group was used as a MSSIM close to 1 is assumed to the idealized similarity value of the structurally similar pairs, hence a higher MSSIM value indicates higher reconstruction accuracy. The red lines positions show median score; the box extends from the first quartile (Q1) to the third quartile (Q3) of scores; the lower whisker is at the lowest datum above Q1 − 0.5 * (Q3-Q1), and the upper whisker is at the highest datum below Q3 + 0.5*(Q3-Q1); scores out of whiskers range are marked by circles. c Perspective, Lateral and Anterior view of reconstructed Mouse brain hemisphere. d Anterior view of layer annotation types distribution of reconstructed *Mouse* brain hemisphere. Source data are provided as a Source Data file.

To understand the reasons behind our progress, we examined how anchor accuracy changes with the regularization factor during ST-GEARS computation (Supplementary Fig. 24). Out of 39 section pairs, we observed a change in mapping accuracy >0.1 (out of 1) in 12 pairs. Self-adaptive Regularization, which was designed to cope with varying data characteristics including varying section distances, selected the regularization factor that leads to optimal mapping accuracy, resulting in increased anchor accuracy in these 12 section pairs. Notably, among these 12 pairs, pairs 29th & 30th, 31st & 32nd and 32nd & 33rd were correctly aligned by ST-GEARS but misaligned by PASTE, which does not adopt any self-adaptive regularization strategy.

After validating the registration result, we investigated the recovered cell-types’ distribution in the 3D space to assess the effectiveness of the reconstruction and its impact on further analysis. We observed that the complete morphology of hemisphere was recovered by ST-GEARS, with clear distinction of different tissues on perspective, lateral and anterior views (Fig. 6c). We further studied the distribution of separate annotation types within cortex layers and found that 3D regionalization of each annotation type was recovered by ST-GEARS (Fig. 6d). The reconstructed result indicated the adaptability of ST-GEARS across various scales of sectioning intervals, and its applicability on both bin-level, and cell-level datasets on which histology information is incorporated.

Discussion

We introduce ST-GEARS, a 3D geospatial profile recovery approach for ST experiments. Leveraging the formulation of FGW OT, ST-GEARS utilizes both gene expression and structural similarities to retrieve cross-sectional mappings of spots with the same in vivo planar coordinates, referred to as ‘anchors’. To further enhance the accuracy of these anchors, it applies our Distributive Constraints. It then rigidly aligns sections utilizing the anchors, before finally eliminating section distortions using Gaussian-denoised Elastic Fields and their Bi-sectional Application.

We validate the components of ST-GEARS, including anchors retrieval and elastic registration, on the DLPFC and Drosophila larva datasets, respectively. In the validation of anchors retrieval, through Mapping accuracy evaluation of retrieved anchors, ST-GEARS consistently outperformed PASTE and PASTE2 across all section pairs. We show that Distributive Constraints are the reason behind this performance: they effectively suppress the generation of anchors between spot groups with low cross-sectional similarity while enhancing their generation among groups with higher similarity. To investigate the effectiveness of the elastic registration process, we evaluate its effects on tissue area changes and cross-sectional similarity using the Drosophila larva dataset. Both the smoother tissue area curves and the higher similarity observed between structurally consistent sections confirm the efficacy of the elastic process of ST-GEARS.

We demonstrate ST-GEARS’s advanced accuracy of reconstruction compared to current approaches including PASTE, PASTE2 and GPSA, and its positive impact on downstream analysis compared to existing approaches. Our evaluation encompasses diverse application cases, including registration of two adjacent sections of Mouse hippocampus tissue measured by Slide-seq, reconstruction of 16 sections of Drosophila embryo individual measured by Stereo-seq, and reconstruction of a complete Mouse brain measured by BARseq, including 40 sections with sectioning interval as far as 200 μm. Among the methods, registered result by ST-GEARS exhibited the highest intra-structural consistency measured by MSSIM for two hippocampus sections separated by a single layer of neurons. On 16 sections of a Drosophila embryo individual, our method’s outstanding accuracy is indicated by both MSSIM and smoothness of tissue area changes. Importantly, ST-GEARS provides more reliable embryo morphology, precise tissue regionalization, and accurate marker gene distribution under hybridization evidence compared to existing approaches. This suggests that ST-GEARS provides higher quality tissues, cells, and genes information. On Mouse brain sections with large intervals of 200 μm, ST-GEARS avoided positional and angular misalignments that occur in result of PASTE and PASTE2. The improvement was quantified by a higher MSSIM. Both hemisphere morphology and cortex layer regionalization were reflected in the result of 3D reconstruction by ST-GEARS. The successful representation of important structural and functional features in the aforementioned studies collectively underscores ST-GEARS’ reliability and capability for advancing 3D downstream research, enabling more comprehensive and insightful analysis of complex biological systems.

To further enhance and extend our method, opportunities in several aspects remain to be explored. Firstly, algorithmic aspects including hyperparameter sensitivity and scalability can be further explored to improve method performance. Though recommended values are provided for two of its hyperparameters, method performance is still affected by parameter values, raising the potential issue of overfitting and sensitivity, which can be further studied. Regarding scalability, ST-GEARS incurs an obvious increase in computational cost when dealing with large-scale datasets. Though the strategy of Granularity adjusting was devised to reduce complexity, the opportunity of improving robustness on increasing scales of data is expected to be further explored. Secondly, tasks aimed at improving data preprocessing, including but not limited to batch effect removal and diffusion correction, are expected to be integrated into our method, considering their coupling with the registration task itself: inaccuracies in input data introduce perturbations to anchors optimization, while the spatial information recovered by our method may assist data quality enhancement by providing registered sections. Thirdly, the Distributive Constraints of ST-GEARS take rough grouping information as input, which may potentially introduce computational burden during the reconstruction process. To address this, an automatic step is expected to be developed to reliably cluster spots while maintaining computational efficiency of the overall process. This step can be integrated into our method either as preprocessing, or as a coupling task, similarly to our expectation of data quality enhancement. Finally, we envision incorporating a wider scope of anchors applications into our existing framework, such as information integration of sections across time, across modalities and even across species.
With interpretability, robustness and accuracy provided by ST-GEARS, we anticipate its applications and extension in various areas of biological and medical research. We believe that our method can help address a multitude of questions regarding growth and development, disease mechanisms, and evolutionary processes.

Methods

FGW OT description

Fused Gromov Wasserstein (FGW) Optimal Transport (OT) is the modeling of spot-wise or cell-wise similarity between two sections, with the purpose of solving optimal mappings between the spots or cells, with mappings also called ‘anchors’. By FGW OT, the optimal group of mappings enables highest gene expression similarity between mapped spots, at the same time keeping similar positions relative to their located sections.

The required input of FGW OT includes gene expression, spot or cell locations before registration, and constraint values which assign different weights to the optimization on different spots or cells. For gene expression, we introduce $A \in R^{n_{A}, m}$ for section A, to describe the normalized count of unique molecular identifiers (UMIs) of different genes of each cell or spot, where n_A denotes the number of spots in section A, and m denotes the number of genes that are captured in both sections. Similarly, we describe gene expression on section B as $B \in R^{n_{B}, m}$, with genes arranged in the same order as in A. For spot or cell locations, we introduce $X_{A} \in R^{n_{A}, 2}$ to describe spot locations of section A, with the 1st column storing horizontal coordinates and the 2nd storing vertical coordinates. Similarly, we have $X_{B} \in R^{n_{B}, 2}$ to describe spot locations in section B. Spots are arranged in the same order in the gene expression and location matrices. Constraint values are discussed in the section on Distributive Constraints.

FGW OT solves:

$$\pi = \arg\min_{\pi \in \Pi(a, b)} \langle (1 - \alpha) M_{AB}^{2} + \alpha L^{2}(C_{A}, C_{B}) \otimes \pi, \pi \rangle = \arg\min_{\pi \in \Pi(a, b)} \left( (1 - \alpha) \langle M_{AB}^{2}, \pi \rangle + \alpha \langle L^{2}(C_{A}, C_{B}) \otimes \pi, \pi \rangle \right) \tag{1}$$

$$\mathrm{s.t.} \quad \sum_{j} \pi_{i, j} = W_{i}^{(A)}, \quad \sum_{i} \pi_{i, j} = W_{j}^{(B)}$$

Here, $M_{AB} \in R^{n_{A}, n_{B}}$ describes the expression dissimilarity of each pair of spots on sections A and B respectively (a lower value indicating higher similarity), formulated as $M_{i, j}^{(AB)} = KL(A_{i, :}, B_{j, :})$. Note that $M_{i, j}^{(AB)}$ still refers to M_AB, with the section code AB moved to a parenthesized superscript for clarity, since the subscript position is taken by the spot indices i, j. KL denotes Kullback-Leibler (KL) divergence⁴³. $C_{A} \in R^{n_{A}, n_{A}}$ describes spot-wise distance within section A, with $C_{i, j}^{(A)} = dis(X_{i, :}^{(A)}, X_{j, :}^{(A)})$, and dis denoting the Euclidean distance measure. Likewise, $X_{i, :}^{(A)}$ and $X_{j, :}^{(A)}$ still refer to the spot locations X_A, and $C_{i, j}^{(A)}$ to the spot-wise distance C_A, with the section code A moved to a parenthesized superscript because the subscript position is taken by the spot indices i and j. Similarly, $C_{B} \in R^{n_{B}, n_{B}}$ describes spot-wise distance of section B. $L \in R^{n_{A}, n_{B}, n_{A}, n_{B}}$ defines the difference between all spot pair distances on sections A and B respectively, with $L_{i, j, k, l} = | C_{i, k}^{(A)} - C_{j, l}^{(B)} |$. ⊗ denotes tensor-matrix multiplication, $(L \otimes \pi)_{i, j} = \sum_{k, l} L_{i, j, k, l} \pi_{k, l}$; 〈,〉 denotes the matrix inner product, i.e. the sum of element-wise products, so that the objective is a scalar.

The adjacency matrix $\pi \in R^{n_{A}, n_{B}}$ to be optimized stores the strength of anchors between spots from the two sections, with the row index representing spots on section A, and the column index representing spots on section B. The sum of the elements of π is 1. With $\langle M_{AB}^{2}, \pi \rangle$, the expression similarity of mapped spots is measured. With $\langle L^{2}(C_{A}, C_{B}) \otimes \pi, \pi \rangle$, the similarity between the distances of spot pairs on section A and the distances of their anchored spot pairs on section B is measured; that is, this term describes the similarity between spatial structures under the anchors’ connection. α ∈ [0,1] denotes the regularization factor, which specifies the relative importance of structure similarity compared to expression similarity. W_A and W_B are constraint values that are introduced in the section on Distributive Constraints.

With the formulation above, FGW OT solves optimal anchors between the spots, or cells, which enables maximum weighted combination of gene expression similarity and position similarity of mapped spots or cells.
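As a minimal sketch of how the objective in the formulation above can be evaluated for a given coupling, the following NumPy code builds the KL-based expression term and the distance-based structure term and combines them with the regularization factor. The function names `kl_cost`, `pairwise_dist` and `fgw_objective`, the row normalization of expression, and the small constant `eps` are illustrative assumptions and are not taken from the ST-GEARS implementation; the dense 4-way tensor is only practical for toy-sized sections, and no optimization over π is performed here.

```python
import numpy as np

def kl_cost(A, B, eps=1e-12):
    """Pairwise KL divergence between rows of A and rows of B.
    Assumption: each row is normalized to a distribution; eps avoids log(0)."""
    P = A / A.sum(axis=1, keepdims=True) + eps
    Q = B / B.sum(axis=1, keepdims=True) + eps
    # KL(P_i || Q_j) = sum_g P_ig log P_ig - sum_g P_ig log Q_jg
    return (P * np.log(P)).sum(axis=1)[:, None] - P @ np.log(Q).T

def pairwise_dist(X):
    """Euclidean distance between every pair of spots within one section."""
    diff = X[:, None, :] - X[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))

def fgw_objective(M, C_A, C_B, pi, alpha):
    """(1 - alpha) <M^2, pi> + alpha <L^2 (x) pi, pi> for a given coupling pi."""
    expr_term = np.sum(M ** 2 * pi)
    # L[i, j, k, l] = |C_A[i, k] - C_B[j, l]|
    L = np.abs(C_A[:, None, :, None] - C_B[None, :, None, :])
    # Tensor-matrix product with pi, then inner product with pi.
    struct_term = np.einsum("ijkl,kl,ij->", L ** 2, pi, pi)
    return (1 - alpha) * expr_term + alpha * struct_term

# Toy usage: a section compared with itself under two candidate couplings.
rng = np.random.default_rng(0)
A = rng.random((5, 3)) + 0.1           # hypothetical expression matrix
X = rng.random((5, 2))                 # hypothetical spot coordinates
M, C = kl_cost(A, A), pairwise_dist(X)
pi_identity = np.eye(5) / 5            # each spot anchored to itself
pi_shifted = np.roll(np.eye(5), 1, axis=1) / 5  # each spot anchored to another spot
cost_identity = fgw_objective(M, C, C, pi_identity, alpha=0.5)
cost_shifted = fgw_objective(M, C, C, pi_shifted, alpha=0.5)
```

In this toy case the identity coupling yields a lower objective than the shifted one, which is the behavior the minimization relies on.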

Distributive constraints

To provide the constraint values in FGW OT, we introduce Distributive Constraints, which assign different emphasis to spots or cells in the optimization. Distributive Constraints utilize cell type component information to differentiate the emphasis: if an annotation or cluster expresses high similarity across sections, its corresponding spots or cells are assigned a relatively high sum of probability, and vice versa. With a higher sum of probability, more anchors and anchors with higher strength are generated, while fewer anchors are produced on spots with a lower sum of probability. This operation leads registration to rely more on expression-consistent regions of sections, hence largely enhancing both the accuracy of anchors and the precision of the following rigid and elastic registration.

The required inputs of Distributive Constraints include $G_{A} \in R^{n_{A}}$ and $G_{B} \in R^{n_{B}}$, which store the grouping information such as annotation type or cluster of each spot in section A and B. We then summarize the repeated annotations or clusters from G_A and G_B, and put the unique values in $g \in R^{n_{group}}$. n_group is the number of unique annotation types or clusters. As implemented in ST-GEARS, for each annotation type or cluster g_i, we calculate the average gene expression across spots:

$$\begin{matrix} avg_{A} = \frac{1}{| I_{A} |} 1_{n_{A}} A_{i \in I_{A}, :} \\ avg_{B} = \frac{1}{| I_{B} |} 1_{n_{B}} B_{i \in I_{B}, :} \end{matrix}$$

where

$$\begin{matrix} I_{A} = \{ i' \in \{1, 2, \dots, n_{A}\} \mid G_{i'}^{(A)} = g_{i} \} \\ I_{B} = \{ i' \in \{1, 2, \dots, n_{B}\} \mid G_{i'}^{(B)} = g_{i} \} \end{matrix}$$

Note that $G_{i'}^{(A)}$ and $G_{i'}^{(B)}$ still refer to the grouping information G_A and G_B, with the section codes A and B moved to a parenthesized superscript for clarity, since the subscript position is taken by the spot index i′. $1_{n_{A}}$ and $1_{n_{B}}$ are both row vectors of ones.

With average gene expression of each annotation type or cluster, with the form of distribution, we measure its difference across sections by KL divergence. Then the calculated distance is mapped by logistic kernel, to further emphasize differences between relatively consistent annotations or clusters.

$$dis = KL(avg_{A}, avg_{B})$$

$dis_{map} = f_{logistic}(dis)$, where $f_{logistic}(x) = \frac{1}{1 + e^{- x}} - 0.5$. Putting the scalar value $dis_{map}$ of each annotation or cluster together, we have a vector $DIS_{map} \in R^{n_{group}}$. Finally, we transform the distance to similarity, and map the similarity result back to each spot according to its annotation or cluster:

$$\begin{matrix} sim = - 1 \times DIS_{map} \\ {W\_raw}_{\{ i' \mid G_{i'}^{(A)} = g_{i} \}}^{(A)} = sim_{i} \\ {W\_raw}_{\{ i' \mid G_{i'}^{(B)} = g_{i} \}}^{(B)} = sim_{i} \end{matrix}$$

We further apply normalization on the result:

$$\begin{matrix} W_{A} = \frac{1}{\sum W\_raw^{(A)}} (W\_raw^{(A)} - \min (W\_raw^{(A)}) \times 1_{n_{A}}) \\ W_{B} = \frac{1}{\sum W\_raw^{(B)}} (W\_raw^{(B)} - \min (W\_raw^{(B)}) \times 1_{n_{B}}) \end{matrix}$$

W_A and W_B are constraints values applied in (1). Since the values are computed based on similarity measure using cell composition information, weight of FGW OT is automatically redistributed, with higher emphasis on more consistent regions across sections, and less emphasis on less consistent area. Enhanced anchor accuracy hence registration accuracy is then achieved.
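The steps of this section can be sketched in NumPy as follows. This is an illustration under stated assumptions rather than the ST-GEARS implementation: the function names are hypothetical; average expression is normalized to a distribution before the KL divergence; a group that is present in only one section is treated as maximally dissimilar; and the shifted weights are divided by their own sum so that each weight vector sums to 1, which is one reading of the normalization step that matches the marginal constraints of FGW OT. A uniform fallback is used when all groups are equally similar.

```python
import numpy as np

def _kl(p, q, eps=1e-12):
    """KL divergence between two average-expression vectors.
    Assumption: vectors are normalized to distributions; eps avoids log(0)."""
    p = p / p.sum() + eps
    q = q / q.sum() + eps
    return float(np.sum(p * np.log(p / q)))

def _normalize(W_raw):
    """Shift by the minimum, then scale so that the weights sum to 1 (assumption)."""
    shifted = W_raw - W_raw.min()
    total = shifted.sum()
    if total == 0:  # all groups equally similar: fall back to uniform weights
        return np.full(len(W_raw), 1.0 / len(W_raw))
    return shifted / total

def distributive_constraints(A, B, G_A, G_B):
    """Per-spot weights W_A, W_B from group-level cross-section similarity."""
    G_A, G_B = np.asarray(G_A), np.asarray(G_B)
    W_raw_A, W_raw_B = np.empty(len(G_A)), np.empty(len(G_B))
    for g in np.union1d(G_A, G_B):
        I_A, I_B = G_A == g, G_B == g
        if I_A.any() and I_B.any():
            dis = _kl(A[I_A].mean(axis=0), B[I_B].mean(axis=0))
            dis_map = 1.0 / (1.0 + np.exp(-dis)) - 0.5   # logistic kernel
        else:
            dis_map = 0.5  # assumption: group absent from one section is maximally dissimilar
        W_raw_A[I_A] = -dis_map   # sim = -1 * dis_map, mapped back to spots
        W_raw_B[I_B] = -dis_map
    return _normalize(W_raw_A), _normalize(W_raw_B)

# Toy usage: group "a" is consistent across sections, group "b" is not.
A = np.array([[10.0, 1.0, 1.0], [9.0, 1.0, 1.0], [1.0, 10.0, 1.0]])
B = np.array([[10.0, 1.0, 1.0], [1.0, 1.0, 10.0], [1.0, 1.0, 10.0]])
W_A, W_B = distributive_constraints(A, B, ["a", "a", "b"], ["a", "b", "b"])
```

In the toy example, spots of the consistent group "a" receive all of the weight, while spots of the least consistent group receive a weight of zero after the minimum shift.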

Self-adaptive regularization

In the FGW OT formulation, a regularization factor is included to specify the relative importance of structural similarity compared to expression similarity during optimization. ST-GEARS includes a self-adaptive regularization method that determines the factor value inducing the highest overall accuracy of anchors under varying situations. These situations include but are not limited to section distances, spot sizes, extent of distortions, and data quality such as level of diffusion.

In practice, our method tries factors on multiple scales: 0.8, 0.4, 0.2, 0.1, 0.05, 0.025, 0.013, and 0.006. The candidate values vary exponentially, so that ST-GEARS can find the optimal factor regardless of scale differences between the expression and structural terms in (1). The accuracy of each set of anchors optimized under each regularization factor is evaluated by measuring the weighted percentage $\sum_{G_{i}^{(A)} = G_{j}^{(B)}} \pi_{i, j}$ of anchors that join spots with the same annotation type or cluster. Note that $G_{i}^{(A)}$ and $G_{j}^{(B)}$ still refer to the grouping information G_A and G_B, respectively, with the section codes A and B moved to a parenthesized superscript for clarity, since the subscript position is taken by the spot indices i and j. The regularization factor value that achieves the highest accuracy is then adopted by our method.
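The selection loop described above can be sketched as follows. The names `mapping_accuracy` and `select_alpha`, and the `solve_anchors` callable that maps a regularization factor to an optimized coupling π, are hypothetical placeholders; the actual FGW OT solver is not reproduced here, and the tie-breaking rule (the first candidate wins) is an assumption.

```python
import numpy as np

# Candidate regularization factors listed in the text.
CANDIDATE_ALPHAS = (0.8, 0.4, 0.2, 0.1, 0.05, 0.025, 0.013, 0.006)

def mapping_accuracy(pi, G_A, G_B):
    """Sum of pi[i, j] over anchors joining spots with the same group label."""
    same = np.asarray(G_A)[:, None] == np.asarray(G_B)[None, :]
    return float(pi[same].sum())

def select_alpha(solve_anchors, G_A, G_B, alphas=CANDIDATE_ALPHAS):
    """Try each factor and keep the one whose anchors are most accurate.
    solve_anchors: hypothetical callable, alpha -> coupling matrix pi."""
    best = None
    for alpha in alphas:
        pi = solve_anchors(alpha)
        acc = mapping_accuracy(pi, G_A, G_B)
        if best is None or acc > best[1]:   # assumption: first candidate wins ties
            best = (alpha, acc, pi)
    return best

# Toy usage with a stand-in solver that is only correct for alpha = 0.05.
G = ["x", "y"]
good = np.eye(2) / 2
bad = np.fliplr(np.eye(2)) / 2

def toy_solver(alpha):
    return good if alpha == 0.05 else bad

best_alpha, best_acc, best_pi = select_alpha(toy_solver, G, G)
```

Since π sums to 1, the returned accuracy lies between 0 and 1, consistent with the mapping accuracy reported out of 1 in the Results.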

Elastic field inference

Finding spots with highest probability

After rigid registration, elastic fields are inferred based on the anchors with the highest probability for each spot or cell. For elastic field to be applied on each section, it is calculated using its anchors with closest sections, as well as spatial coordinates of sections after rigid registration. Along cross-sectioning order, each section in the middle has two closest sections, respectively on its anterior and posterior sides. Exceptionally, if a section is on anterior or posterior end, it has only one closest section.

Specifically, for a section in the middle with N spots, we calculate $I_{pre} \in Z^{N}$ and $I_{next} \in Z^{N}$, which store the mapped spots on the anterior and posterior neighbor sections for each of its spots. The calculation takes as input the adjacency matrix π_pre, which stores anchors with the anterior neighbor section output by FGW OT, and π_next, which stores anchors with the posterior section.

$$\begin{matrix} I_{pre}^{n} = \arg\max_{i \in \{0, \dots, N_{pre} - 1\}} \pi_{:, n}^{(pre)} \\ I_{next}^{n} = \arg\max_{j \in \{0, \dots, N_{next} - 1\}} \pi_{n, :}^{(next)} \end{matrix}$$

Note that $\pi_{:, n}^{(pre)}$ and $\pi_{n, :}^{(next)}$ still refer to the adjacency matrices π_pre and π_next, with the direction codes pre and next moved to a parenthesized superscript for clarity, since the subscript position is taken by the spot index n.

Notably, not every spot in a selected section has its own anchored spot, due to multiple strategies including distributive constraint and anchors filtration, hence their corresponding element in I_pre and I_next are null. For section located on posterior end, only I_next is applicable; and for section located on anterior end, only $I_{pre}^{n}$ is applicable.
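A minimal NumPy sketch of this lookup is given below. The function name `strongest_anchors` and the use of -1 as the null marker for spots without any anchor are assumptions made for illustration; π_pre is taken to have one row per spot of the anterior neighbor and one column per spot of the current section, and π_next one row per spot of the current section, following the index convention of the equations above.

```python
import numpy as np

def strongest_anchors(pi_pre, pi_next, null=-1):
    """Index of the highest-probability anchored spot on each neighbor.
    pi_pre:  (N_pre, N) anchors between anterior neighbor and current section.
    pi_next: (N, N_next) anchors between current section and posterior neighbor.
    Spots with no anchor (all-zero column or row) get the null marker (assumption)."""
    I_pre = np.where(pi_pre.max(axis=0) > 0, pi_pre.argmax(axis=0), null)
    I_next = np.where(pi_next.max(axis=1) > 0, pi_next.argmax(axis=1), null)
    return I_pre, I_next

# Toy usage: a current section with 3 spots and two neighbors with 2 spots each.
pi_pre = np.array([[0.2, 0.0, 0.0],
                   [0.1, 0.3, 0.0]])
pi_next = np.array([[0.0, 0.5],
                    [0.0, 0.0],
                    [0.4, 0.1]])
I_pre, I_next = strongest_anchors(pi_pre, pi_next)
```

Here the third spot has no anchor on the anterior neighbor and the second spot has none on the posterior neighbor, so both are marked as null.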

Elastic field establishment

After specifying spots with highest probability, ST-GEARS calculates location displacements between the spots, then establishes elastic fields for each section. An elastic field is a 2D displacement distribution, describing how displacement values are distributed across different locations. And it is established to enable ST-GEARS to benefit from further denoising functions to reduce elastic operation outliers and improve elastic effect consistency across regions.

For each section located in the middle, 4 elastic fields are generated. Two of those represent the section’s horizontal and vertical displacement distribution compared to anterior neighbor section, denoted as 2D matrix F^(x_pre) and F^(y_pre), while the other two represent its horizontal and vertical displacement distribution compared to posterior neighbor, denoted as F^(x_next) and F^(y_next). To initialize F^(x_pre), F^(y_pre), F^(x_next) and F^(y_next) for the section, the shape of the matrix is first decided. Its height denoted by Height and width denoted by Width are calculated by gridding the spot locations using a fixed step. Height and Width are shared across the 4 matrices:

$$\begin{matrix} Height = \lceil (\max_{i \in \{0, \dots, N\}} X_{i, 0} - \min_{i \in \{0, \dots, N\}} X_{i, 0}) / psize \rceil \\ Width = \lceil (\max_{i \in \{0, \dots, N\}} X_{i, 1} - \min_{i \in \{0, \dots, N\}} X_{i, 1}) / psize \rceil \end{matrix}$$

As input, $X \in R^{N, 2}$ denotes the spot locations of the current section after rigid registration. For a single section, we prepare $X^{(pre)} \in R^{N_{pre}, 2}$ and $X^{(next)} \in R^{N_{next}, 2}$ as the spot locations of its anterior and posterior sections after rigid alignment, respectively. psize represents the average distance between closest spot or cell centers, and it is to be input by users. At this step, the matrices do not yet contain any filled values.

To fill in the fields, we first transform spot locations into the coordinate system of the field. With $X\_shifted \in R^{N, 2}$ and $X\_pixel \in R^{N, 2}$:

$$\begin{matrix} {X\_shifted}_{i, :} = X_{i, :} - [\min_{i \in \{0, \dots, N\}} X_{i, 0}, \min_{i \in \{0, \dots, N\}} X_{i, 1}]^{T} \\ {X\_pixel}_{i, j} = \lceil {X\_shifted}_{i, j} / psize \rceil \end{matrix}$$

We then calculate location displacements between each of its spots and their anchored spots with highest probability, on both anterior and posterior neighbors. With $X\_corres \in R^{N, 2}$ and $X\_delta \in R^{N, 2}$:

$$\begin{matrix} {X\_corres}_{n, :}^{(pre)} = X_{I_{pre}^{n}, :}^{(pre)} \\ {X\_corres}_{n, :}^{(next)} = X_{I_{next}^{n}, :}^{(next)} \\ {X\_delta}_{n, :}^{(pre)} = {X\_corres}_{n, :}^{(pre)} - X_{n, :} \\ {X\_delta}_{n, :}^{(next)} = {X\_corres}_{n, :}^{(next)} - X_{n, :} \end{matrix}$$

With the spot locations in field coordinates and the displacement values above, we fill in corresponding elements of the elastic field:

$$\begin{matrix} F^{(x\_pre)} [X\_pixel] = {X\_delta}_{:, 0}^{(pre)} \\ F^{(y\_pre)} [X\_pixel] = {X\_delta}_{:, 1}^{(pre)} \\ F^{(x\_next)} [X\_pixel] = {X\_delta}_{:, 0}^{(next)} \\ F^{(y\_next)} [X\_pixel] = {X\_delta}_{:, 1}^{(next)} \end{matrix}$$

By the end of Eqs. (2), 4 elastic fields for each section in the middle is established. However, some elements in the matrix are still empty, because of absence of spots or cells located in the grid of location. To address this problem, 2d nearest interpolation method⁴⁴ was adopted, which fills in every empty element, with the displacement value of its neighboring elements:

\begin{matrix} \begin{matrix} F^{(x_pre)} = f_{i n t e r p_g r i d} (X_pixel, {X_delta}_{:, 0}^{(pre)}, mes h_{trans}) \\ F^{(y_pre)} = f_{i n t e r p_g r i d} (X_pixel, {X_delta}_{:, 1}^{(pre)}, mes h_{trans}) \end{matrix} \\ F^{(x_next)} = f_{i n t e r p_g r i d} (X_pixel, {X_delta}_{:, 0}^{(next)}, mes h_{trans}) \\ F^{(y_next)} = f_{i n t e r p_g r i d} (X_pixel, {X_delta}_{:, 1}^{(next)}, mes h_{trans}) \end{matrix}

where $mesh_{trans} \in \mathbb{N}^{n_{grids} \times 2}$ denotes the grid coordinates of the designed field, with $n_{grids} = Height \times Width$, and $f_{interp\_grid}$ denotes the nearest-neighbor interpolation method.

For the section located at the posterior end, which has only an anterior neighbor, only $F^{(x\_pre)}$ and $F^{(y\_pre)}$ are applicable; for the section located at the anterior end, only $F^{(x\_next)}$ and $F^{(y\_next)}$ are applicable.
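Filling and interpolating a field can be sketched with `scipy.interpolate.griddata` using `method="nearest"`. This is an illustrative stand-in for $f_{interp\_grid}$, not necessarily the routine used by ST-GEARS. The helper name `build_field`, the choice to treat the first pixel column as the row index, and the example values are all assumptions.

```python
import numpy as np
from scipy.interpolate import griddata

def build_field(X_pixel, delta, height, width):
    """Spread sparse per-spot displacements over a full (height, width) field.

    X_pixel : (N, 2) integer pixel coordinates of the spots (row, column)
    delta   : (N,) one displacement component, e.g. X_delta[:, 0] for an x field
    """
    rows, cols = np.meshgrid(np.arange(height), np.arange(width), indexing="ij")
    mesh_trans = np.column_stack([rows.ravel(), cols.ravel()])  # (n_grids, 2)
    # Every grid cell takes the value of its nearest spot, including occupied cells
    values = griddata(X_pixel, delta, mesh_trans, method="nearest")
    return values.reshape(height, width)

# Hypothetical example: two spots on a 3 x 4 field
X_pixel = np.array([[0, 0], [2, 3]])
delta_x = np.array([1.0, -2.0])
F_x = build_field(X_pixel, delta_x, height=3, width=4)
```

One field is built per displacement component and per neighbor, which gives the four fields of a middle section and two fields for a section at either end.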

2D Gaussian denoising

Because they are caused by exerted force, displacement (elastic) fields are expected to be constant or to vary smoothly across locations^45–47. ST-GEARS uses this property to smooth the field and to reduce errors introduced by any upstream process, such as noise in the raw data and inaccuracy in anchor computation. Gaussian filtering^48,49 is adopted to implement the denoising, similarly to image denoising processes^50,51, which generates the denoised elastic fields.

Gaussian filtering replaces the value of each element with a weighted average over its neighboring region:

\begin{matrix} F^{(x\_pre)} = f_{gaussian\_filter} (F^{(x\_pre)}) \\ F^{(y\_pre)} = f_{gaussian\_filter} (F^{(y\_pre)}) \\ F^{(x\_next)} = f_{gaussian\_filter} (F^{(x\_next)}) \\ F^{(y\_next)} = f_{gaussian\_filter} (F^{(y\_next)}) \end{matrix}

where $f_{gaussian\_filter}$ denotes the method of Gaussian filtering.
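A minimal sketch of the denoising step, assuming `scipy.ndimage.gaussian_filter` as the filter and a hypothetical kernel width `sigma` (the text does not state the value used). Each field is filtered independently.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def denoise_fields(fields, sigma=1.0):
    """Apply a 2D Gaussian filter to every elastic field in a dict.

    fields : mapping from a field name (e.g. "x_pre") to a 2D array
    sigma  : standard deviation of the Gaussian kernel, in pixels (assumed value)
    """
    return {name: gaussian_filter(F, sigma=sigma) for name, F in fields.items()}

# Hypothetical example: a flat field, and a field with one spurious displacement
flat = np.full((9, 9), 2.0)
spiky = np.zeros((9, 9))
spiky[4, 4] = 9.0
smoothed = denoise_fields({"x_pre": flat, "y_pre": spiky}, sigma=1.0)
```

A constant field passes through unchanged, while an isolated outlier is spread over its neighborhood and strongly attenuated. This is the behavior the smoothness assumption relies on.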

Bi-sectional fields application

Bi-sectional fields application plan

With the elastic fields generated and denoised, ST-GEARS uses them as guidance to correct the distortion of each section. Querying the elastic fields with the spatial location of each spot returns the displacement to be applied. For a section in the middle, the elastic fields calculated with both its anterior and its posterior neighbor section are queried, and the guidance provided by both is applied to the rigid alignment result; this is called ‘Bi-sectional Fields Application’. After the application, the distortion of the section is corrected and the elastic registration result is generated.

Specifically, the denoised elastic fields are first queried, returning the displacement to be implemented:

\begin{matrix} {X\_delta}_{i, 0}^{(pre\_final)} = F^{(x\_pre)} [{X\_pixel}_{i, :}] \\ {X\_delta}_{i, 1}^{(pre\_final)} = F^{(y\_pre)} [{X\_pixel}_{i, :}] \\ {X\_delta}_{i, 0}^{(next\_final)} = F^{(x\_next)} [{X\_pixel}_{i, :}] \\ {X\_delta}_{i, 1}^{(next\_final)} = F^{(y\_next)} [{X\_pixel}_{i, :}] \end{matrix}

Next, the average of the displacements returned by the anterior and the posterior section is applied to the rigid registration result, leading to the final elastic registration result ${X\_final} \in \mathbb{R}^{N \times 2}$:

{X\_final} = X + \frac{1}{2} {X\_delta}^{(pre\_final)} + \frac{1}{2} {X\_delta}^{(next\_final)}

For the section located at the posterior end,

{X\_final} = X + {X\_delta}^{(pre\_final)}

For the section located at the anterior end,

{X\_final} = X + {X\_delta}^{(next\_final)}

The validity of this plan is proved in the section: Proof of validity of Bi-sectional Fields Application.
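The query-and-average rule can be sketched as follows. The function name `apply_fields`, the `(Fx, Fy)` tuple layout, and the convention of passing `None` for a missing neighbor are illustrative assumptions, not the package's API. Pixel coordinates are assumed to index the fields as (row, column).

```python
import numpy as np

def apply_fields(X, X_pixel, F_pre=None, F_next=None):
    """Correct spot locations with the denoised elastic fields.

    X       : (N, 2) rigidly aligned spot coordinates
    X_pixel : (N, 2) integer pixel coordinates of the same spots
    F_pre   : tuple (Fx, Fy) of 2D fields from the anterior neighbor, or None
    F_next  : tuple (Fx, Fy) of 2D fields from the posterior neighbor, or None
    """
    r, c = X_pixel[:, 0], X_pixel[:, 1]
    deltas = []
    for fields in (F_pre, F_next):
        if fields is not None:
            Fx, Fy = fields
            # Look up the x and y displacement at each spot's pixel
            deltas.append(np.column_stack([Fx[r, c], Fy[r, c]]))
    if not deltas:
        return X.copy()
    # Two neighbors: the mean is 1/2 + 1/2; one neighbor: that displacement alone
    return X + np.mean(deltas, axis=0)

# Hypothetical example on a 2 x 2 field
X = np.array([[0.0, 0.0], [2.0, 2.0]])
X_pixel = np.array([[0, 0], [1, 1]])
F_pre = (np.full((2, 2), 1.0), np.zeros((2, 2)))
F_next = (np.full((2, 2), 3.0), np.full((2, 2), 2.0))
X_middle = apply_fields(X, X_pixel, F_pre, F_next)
X_end = apply_fields(X, X_pixel, F_pre=F_pre)
```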

Proof of validity of Bi-sectional fields application

Bi-sectional Fields Application recovers the spatial profile before distortion by averaging and applying the displacements guided by both the anterior and the posterior neighbor section. This effect is proved mathematically as follows:

Take sections A, B, and C as an example of a sequence of sections, with $X_{A}$, $X_{B}$ and $X_{C}$ denoting the spatial information of their spots after rigid alignment, and $X_{A\_insitu}$, $X_{B\_insitu}$ and $X_{C\_insitu}$ denoting their in vivo spatial information. The distortions that occurred to the slices during experiments are denoted as $X_{A\_dis}$, $X_{B\_dis}$ and $X_{C\_dis}$.

According to Bi-sectional Fields Application, the corrected spatial information is:

X_{B\_cor} = X_{B} + \frac{1}{2} (X_{A} - X_{B}) + \frac{1}{2} (X_{C} - X_{B}) = \frac{1}{2} (X_{A} + X_{C})

where

X_{A} = X_{A\_insitu} + X_{A\_dis}

X_{C} = X_{C\_insitu} + X_{C\_dis}

Hence,

X_{B\_cor} = \frac{1}{2} X_{A\_insitu} + \frac{1}{2} X_{C\_insitu} + \frac{1}{2} (X_{A\_dis} + X_{C\_dis})

Based on the in vivo morphological consistency across sections, spatial information of section B can be approximated by an average of information of A and C, written as

X_{B\_insitu} = \frac{1}{2} (X_{A\_insitu} + X_{C\_insitu})

Given that $X_{A\_dis}$ and $X_{C\_dis}$ can be seen as independent and identically distributed sets of variables,

X_{A\_dis} + X_{C\_dis} \sim N (\mu_{ABC}, \Sigma_{ABC})

where $\mu_{ABC}$ is the universal mean, and $\Sigma_{ABC}$ is the variance of the 2D displacement information.

Substituting these two expressions into the expression for $X_{B\_cor}$ above gives

X_{B\_cor} = X_{B\_insitu} + \frac{1}{2} N (\mu_{ABC}, \Sigma_{ABC}) = X_{B\_insitu} + o (X_{B\_insitu}) \to X_{B\_insitu}

indicating the proximity of corrected spatial information to in vivo spatial information.
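The argument can be checked numerically on synthetic data. The sketch below is purely illustrative and rests on assumptions beyond the text: distortions are drawn as zero-mean Gaussian noise with a hypothetical standard deviation `sigma`, anchors are taken to be ideal (spot n in B is anchored to spot n in A and C), and the morphology changes linearly from A to C so that B is exactly their in vivo average.

```python
import numpy as np

rng = np.random.default_rng(0)
n, sigma = 5000, 2.0  # number of spots and distortion scale (assumed values)

# In vivo locations: B lies halfway between A and C
X_A_insitu = rng.uniform(0.0, 100.0, size=(n, 2))
X_C_insitu = X_A_insitu + 1.0
X_B_insitu = 0.5 * (X_A_insitu + X_C_insitu)

# Observed locations after rigid alignment = in vivo location + i.i.d. distortion
X_A = X_A_insitu + rng.normal(0.0, sigma, size=(n, 2))
X_B = X_B_insitu + rng.normal(0.0, sigma, size=(n, 2))
X_C = X_C_insitu + rng.normal(0.0, sigma, size=(n, 2))

# Bi-sectional Fields Application with ideal anchors
X_B_cor = X_B + 0.5 * (X_A - X_B) + 0.5 * (X_C - X_B)

def rms_error(X, X_ref):
    """Root-mean-square distance between two sets of 2D locations."""
    return float(np.sqrt(np.mean(np.sum((X - X_ref) ** 2, axis=1))))

err_before = rms_error(X_B, X_B_insitu)
err_after = rms_error(X_B_cor, X_B_insitu)
```

In this setting the corrected error comes from averaging two independent distortions, so it is smaller than the error of the uncorrected section. The residual term only vanishes to the extent that the mean distortion is close to zero, which is what the zero-mean assumption in the sketch encodes.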

Evaluation metrics

We evaluated the accuracy of anchors with the Mapping Accuracy index, and measured the reconstruction effect with MSSIM and SI-STD-DI, in both the elastic effect study and the overall methodology comparison.

Mapping accuracy

Designed and adopted by PASTE²⁷, Mapping Accuracy calculates the weighted percentage of anchors joining spots with the same annotation:

\mathrm{Mapping\ Accuracy} = \sum_{i, j, \; l (i) = l (j)} \pi_{ij}
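A minimal sketch of this sum, assuming $\pi_{ij}$ is stored as a matrix of anchor weights between spot i of one section and spot j of the other, and that $l(\cdot)$ is given as one array of annotation labels per section. The function and argument names are hypothetical.

```python
import numpy as np

def mapping_accuracy(pi, labels_a, labels_b):
    """Sum of anchor weights that join spots carrying the same annotation.

    pi       : (N, M) anchor weights between two sections (assumed to sum to 1)
    labels_a : length-N annotations of the first section
    labels_b : length-M annotations of the second section
    """
    pi = np.asarray(pi, dtype=float)
    same = np.asarray(labels_a)[:, None] == np.asarray(labels_b)[None, :]
    return float(pi[same].sum())

# Hypothetical example: 70% of the anchor weight joins matching layers
pi = np.array([[0.3, 0.2],
               [0.1, 0.4]])
acc = mapping_accuracy(pi, ["Layer1", "Layer2"], ["Layer1", "Layer2"])
```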

MSSIM index

MSSIM measures the accuracy of registration, based on the assumption that in some sectioning positions, tissue morphology remains almost consistent across slices. The method quantifies the accuracy, by measuring the similarity of annotation type distribution of such section pairs.

To implement the quantification, first, structurally consistent section pairs are selected among all sections arranged in sequence.

Next, each section of the pair is transformed from individual spots into a complete image, by gridding the rectangular area that surrounds the tissue and assigning each grid a value that represents the annotation type occurring most frequently in it. The resulting image describes the annotation type distribution of the section.
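The gridding step might look like the sketch below. It assumes annotations are encoded as positive integers with 0 reserved for empty grids; the function name `annotation_image` and the `grid_size` parameter are hypothetical.

```python
import numpy as np
from collections import Counter

def annotation_image(X, labels, grid_size):
    """Rasterize spots into an image of majority annotation codes.

    X         : (N, 2) spot coordinates
    labels    : length-N positive integer annotation codes
    grid_size : edge length of one grid, in the unit of X
    """
    X = np.asarray(X, dtype=float)
    idx = np.floor((X - X.min(axis=0)) / grid_size).astype(int)
    height, width = idx.max(axis=0) + 1
    image = np.zeros((height, width), dtype=int)  # 0 marks an empty grid
    buckets = {}
    for (r, c), lab in zip(idx.tolist(), labels):
        buckets.setdefault((r, c), []).append(lab)
    for (r, c), labs in buckets.items():
        # Most frequent code; ties resolve to the code encountered first
        image[r, c] = Counter(labs).most_common(1)[0][0]
    return image

# Hypothetical example: three spots share one grid, a fourth sits in the next one
X = np.array([[0.0, 0.0], [0.2, 0.1], [0.4, 0.3], [1.5, 0.2]])
image = annotation_image(X, labels=[1, 2, 2, 3], grid_size=1.0)
```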

Finally, the similarity between each pair of images is measured by the MSSIM index⁵². The method generates a window of fixed size, slides it simultaneously over both images, and compares the two framed parts in terms of intensity, contrast, and structure. The intensity difference is measured by the difference of average pixel values, the contrast difference by comparing the variances of the two sets of framed pixel values, and the structure difference by their covariance. A Structural Similarity of Images (SSIM) index is calculated for each position of the window using $SSIM (X, Y) = \frac{(2 \mu_{x} \mu_{y} + c_{1}) (2 \sigma_{xy} + c_{2})}{(\mu_{x}^{2} + \mu_{y}^{2} + c_{1}) (\sigma_{x}^{2} + \sigma_{y}^{2} + c_{2})}$, where $\mu_{x}$ and $\mu_{y}$ denote the average pixel values of the two frames, $\sigma_{x}^{2}$ and $\sigma_{y}^{2}$ denote their variances, and $\sigma_{xy}$ denotes the covariance of the two frames. $c_{1}$ and $c_{2}$ are constants that prevent a zero divisor. Averaging the SSIM values across all windows gives the final MSSIM result of the two sections.
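A compact NumPy sketch of the sliding-window computation is given below, using the standard SSIM form of Wang et al.⁵². For brevity it uses uniform (unweighted) windows. The window size `win`, the `data_range` parameter and the constants derived from it follow common defaults and are assumptions here, not values stated in the text.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def mssim(img_x, img_y, win=7, data_range=1.0):
    """Mean SSIM over all positions of a sliding win x win window."""
    c1 = (0.01 * data_range) ** 2  # stabilizes the intensity term
    c2 = (0.03 * data_range) ** 2  # stabilizes the contrast/structure term
    x = sliding_window_view(np.asarray(img_x, dtype=float), (win, win))
    y = sliding_window_view(np.asarray(img_y, dtype=float), (win, win))
    axes = (-2, -1)                # reduce over the pixels inside each window
    mu_x, mu_y = x.mean(axis=axes), y.mean(axis=axes)
    var_x, var_y = x.var(axis=axes), y.var(axis=axes)
    cov_xy = (x * y).mean(axis=axes) - mu_x * mu_y
    ssim = ((2 * mu_x * mu_y + c1) * (2 * cov_xy + c2)) / (
        (mu_x ** 2 + mu_y ** 2 + c1) * (var_x + var_y + c2)
    )
    return float(ssim.mean())

# Hypothetical example: an annotation image compared with itself and a shifted copy
rng = np.random.default_rng(0)
img_a = rng.integers(1, 5, size=(20, 20))
img_b = np.roll(img_a, 3, axis=1)
score_same = mssim(img_a, img_a)
score_shifted = mssim(img_a, img_b)
```

Identical images score 1 up to floating-point error, and any mismatch in the annotation layout lowers the score.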

SI-STD-DI

SI-STD-DI measures the smoothness of area changes across sections along a fixed axis, by calculating the standard deviation of the area changes between each pair of adjacent sections and scaling the result by dividing it by average area.

\mathrm{SI\text{-}STD\text{-}DI} = \frac{\mathrm{STD} (\{ s_{i} - s_{i - 1} : i \in [1, 2, \ldots, I - 1] \})}{\left| \mathrm{mean} (\{ s_{i} - s_{i - 1} : i \in [1, 2, \ldots, I - 1] \}) \right|}
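The index can be sketched directly from the formula as printed, reading $s_{i}$ as the area of section i and I as the number of sections (an interpretation, since the symbols are not defined in the text). The sketch normalizes by the absolute mean of the area differences, exactly as the formula is written, and uses the population standard deviation, which is an assumption.

```python
import numpy as np

def si_std_di(areas):
    """Standard deviation of adjacent-section area changes over their absolute mean.

    areas : section areas ordered along the sectioning axis
    """
    diffs = np.diff(np.asarray(areas, dtype=float))  # s_i - s_{i-1}
    return float(np.std(diffs) / abs(np.mean(diffs)))

# Hypothetical examples: evenly growing areas versus uneven growth
smooth = si_std_di([10.0, 12.0, 14.0, 16.0])
uneven = si_std_di([10.0, 12.0, 16.0, 18.0])
```

A lower value indicates a smoother change of area from section to section. The ratio is undefined when the mean area change is exactly zero.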

Software and code

Data analysis

All software packages used to analyze data in this study are open-source Python packages, including anndata = 0.9.2, numpy = 1.22.4, pandas = 1.4.3, scipy = 1.10.1, matplotlib = 3.5.2, and k3d = 2.15.3.

Statistics and reproducibility

No statistical method was used to predetermine sample size. No data were excluded from the analyses. The experiments were not randomized. The investigators were not blinded to allocation during experiments and outcome assessment.

Reporting summary

Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.

Supplementary information

Supplementary Information (3.4 MB, pdf)

Peer Review File (28.6 MB, pdf)

Reporting Summary (1.5 MB, pdf)

Source data

Source Data (320.5 MB, zip)

Acknowledgements

This work is part of the “SpatioTemporal Omics Consortium” (STOC) paper package. A list of STOC members is available at: http://sto-consortium.org. We acknowledge the Stomics Cloud platform (https://cloud.stomics.tech/) for providing convenient ways of analyzing spatial omics datasets. We acknowledge the CNGB Nucleotide Sequence Archive (CNSA) of China National GeneBank DataBase (CNGBdb) for maintaining the Drosophila database. This work is supported by the National Natural Science Foundation of China (32300526 to S. F., 32100514 to M. X.). We thank Weizhen Xue for the inspirational discussion on the design of Distributive Constraints. We thank Yating Ren for her advice on a more efficient code implementation. We thank Dr. Xiaojie Qiu and Dr. Yinqi Bai for the discussion on the registration topic and their advice on our work.

Author contributions

Tianyi Xia was responsible for method design, analysis design and implementation, as well as drafting of this manuscript. Dr. Luni Hu participated in the structure design of the applications. Lulu Zuo took part in the design of the 3D visualizations, and helps maintain our online repository. Tianyi Xia, Lei Cao, Lulu Zuo and Dr. Luni Hu conducted experiments and analysis for the reply to peer review. Dr. Yunjia Zhang provided insights into the interpretation of anchor results on the DLPFC dataset, and into the accuracy analysis of the mouse brain dataset. Dr. Mengyang Xu revised this article. Lei Zhang and Bowen Ma offered numerous suggestions to enhance computational efficiency, in both memory and time. Taotao Pan and Chuan Chen provided suggestions on data preprocessing. Qin Lu, Bohan Zhang, Junfu Guo, Chang Shi and Mei Li provided suggestions for this study. Dr. Shuangsang Fang supervised this study in structure and analysis design, and revised this article. Chao Liu, Yuxiang Li and Yong Zhang supervised this study.

Peer review

Peer review information

Nature Communications thanks Jun Ding, Xiangyu Luo and the other, anonymous, reviewer(s) for their contribution to the peer review of this work. A peer review file is available.

Data availability

All data used in this research were collected from published sources. DLPFC data were obtained from the study Transcriptome-scale Spatial Gene Expression in the Human Dorsolateral Prefrontal Cortex, and can be downloaded from http://research.libd.org/spatialLIBD/index.html. Drosophila embryo and Drosophila larva data were collected from High-resolution 3d Spatiotemporal Transcriptomic Maps of Developing Drosophila Embryos and Larvae, with the dataset available at https://db.cngb.org/stomics/datasets/STDS0000060. Mouse brain data were collected from the study Modular cell type organization of cortical areas revealed by in situ sequencing, and can be downloaded from https://data.mendeley.com/datasets/8bhhk7c5n9/1. All datasets were generated on spatial transcriptomics platforms: DLPFC data by the Visium technology of 10x Genomics, mouse brain data by BARseq of Cold Spring Harbor Laboratory, and Drosophila embryo and larva data by the Stereo-seq technology of BGI. Source data are provided with this paper.

Code availability

ST-GEARS is packaged and distributed as an open-source, publicly available repository at https://github.com/STOmics/ST-GEARS⁵³.

Competing interests

The authors declare no competing interests.

Footnotes

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Contributor Information

Chao Liu, Email: liuchao3@genomics.cn.

Yuxiang Li, Email: liyuxiang@genomics.cn.

Yong Zhang, Email: zhangyong2@genomics.cn.

Shuangsang Fang, Email: fangshuangsang@genomics.cn.

Supplementary information

The online version contains supplementary material available at 10.1038/s41467-024-51935-0.

References

1. Marx, V. Method of the year: spatially resolved transcriptomics. Nat. Methods 18, 9–14 (2021). 10.1038/s41592-020-01033-y
2. Yue, L. et al. A guidebook of spatial transcriptomic technologies, data resources and analysis approaches. Comput. Struct. Biotechnol. J. 21, 940–955 (2023).
3. Park, H.-E. et al. Spatial transcriptomics: technical aspects of recent developments and their applications in neuroscience and cancer research. Adv. Sci. 10, 2206939 (2023). 10.1002/advs.202206939
4. Gyllborg, D. et al. Hybridization-based in situ sequencing (hybiss) for spatially resolved transcriptomics in human and mouse brain tissue. Nucleic Acids Res. 48, 112–112 (2020). 10.1093/nar/gkaa792
5. Chen, X. et al. High-throughput mapping of long-range neuronal projection using in situ sequencing. Cell 179, 772–786 (2019). 10.1016/j.cell.2019.09.023
6. Wang, X. et al. Three-dimensional intact-tissue sequencing of single-cell transcriptional states. Science 361, 5691 (2018). 10.1126/science.aat5691
7. Qin, D. Next-generation sequencing and its clinical application. Cancer Biol. Med. 16, 4 (2019). 10.20892/j.issn.2095-3941.2018.0055
8. Chen, A. et al. Large field of view-spatially resolved transcriptomics at nanoscale resolution. bioRxiv 10.1101/2021.01.17.427004 (2021).
9. Stickels, R. R. et al. Highly sensitive spatial transcriptomics at near-cellular resolution with slide-seqv2. Nat. Biotechnol. 39, 313–319 (2021). 10.1038/s41587-020-0739-1
10. Moses, L. & Pachter, L. Museum of spatial transcriptomics. Nat. Methods 19, 534–546 (2022). 10.1038/s41592-022-01409-2
11. Moor, A. E. & Itzkovitz, S. Spatial transcriptomics: paving the way for tissue-level systems biology. Curr. Opin. Biotechnol. 46, 126–133 (2017). 10.1016/j.copbio.2017.02.004
12. Zhou, R., Yang, G., Zhang, Y. & Wang, Y. Spatial transcriptomics in development and disease. Mol. Biomed. 4, 32 (2023). 10.1186/s43556-023-00144-0
13. Li, Z. & Peng, G. Spatial transcriptomics: New dimension of understanding biological complexity. Biophys. Rep. 8, 119 (2022). 10.52601/bpr.2021.210037
14. Williams, C. G., Lee, H. J., Asatsuma, T., Vento-Tormo, R. & Haque, A. An introduction to spatial transcriptomics for biomedical research. Genome Med. 14, 1–18 (2022). 10.1186/s13073-022-01075-1
15. Walker, B. L., Cang, Z., Ren, H., Bourgain-Chang, E. & Nie, Q. Deciphering tissue structure and function using spatial transcriptomics. Commun. Biol. 5, 220 (2022). 10.1038/s42003-022-03175-5
16. Atta, L. & Fan, J. Computational challenges and opportunities in spatially resolved transcriptomic data analysis. Nat. Commun. 12, 5283 (2021). 10.1038/s41467-021-25557-9
17. Velten, B. et al. Identifying temporal and spatial patterns of variation from multimodal data using Mefisto. Nat. Methods 19, 179–186 (2022). 10.1038/s41592-021-01343-9
18. Townes, F. W. & Engelhardt, B. E. Nonnegative spatial factorization applied to spatial genomics. Nat. Methods 20, 229–238 (2023). 10.1038/s41592-022-01687-w
19. Verma, A. & Engelhardt, B. A Bayesian nonparametric semi-supervised model for integration of multiple single-cell experiments. bioRxiv 10.1101/2020.01.14.906313 (2020).
20. Svensson, V., Teichmann, S. A. & Stegle, O. Spatialde: identification of spatially variable genes. Nat. Methods 15, 343–346 (2018). 10.1038/nmeth.4636
21. Dries, R. et al. Giotto: a toolbox for integrative analysis and visualization of spatial expression data. Genome Biol. 22, 1–31 (2021). 10.1186/s13059-021-02286-2
22. Wang, M. et al. High-resolution 3d spatiotemporal transcriptomic maps of developing drosophila embryos and larvae. Dev. Cell 57, 1271–1283 (2022). 10.1016/j.devcel.2022.04.006
23. Mohenska, M. et al. 3d-cardiomics: a spatial transcriptional atlas of the mammalian heart. J. Mol. Cell. Cardiol. 163, 20–32 (2022). 10.1016/j.yjmcc.2021.09.011
24. Vickovic, S. et al. Three-dimensional spatial transcriptomics uncovers cell type localizations in the human rheumatoid arthritis synovium. Commun. Biol. 5, 129 (2022). 10.1038/s42003-022-03050-3
25. Rao, A., Barkley, D., França, G. S. & Yanai, I. Exploring tissue architecture using spatial transcriptomics. Nature 596, 211–220 (2021). 10.1038/s41586-021-03634-9
26. Bergenstråhle, J., Larsson, L. & Lundeberg, J. Seamless integration of image and molecular analysis for spatial transcriptomics workflows. BMC Genom. 21, 1–7 (2020). 10.1186/s12864-020-06832-3
27. Zeira, R., Land, M., Strzalkowski, A. & Raphael, B. J. Alignment and integration of spatial transcriptomics data. Nat. Methods 19, 567–575 (2022). 10.1038/s41592-022-01459-6
28. Liu, X., Zeira, R. & Raphael, B. Paste2: Partial alignment of multi-slice spatially resolved transcriptomics data. In Research in Computational Molecular Biology: 27th Annual International Conference, 210 (Springer Nature, 2023).
29. Jones, A., Townes, F. W., Li, D. & Engelhardt, B. E. Alignment of spatial genomics data using deep gaussian processes. Nat. Methods 20, 1379–1387 (2023). 10.1038/s41592-023-01972-2
30. Xia, C.-R., Cao, Z.-J., Tu, X.-M. & Gao, G. Spatial-linked alignment tool (slat) for aligning heterogenous slices properly. bioRxiv 10.1101/2023.04.07.535976 (2023).
31. Qiu, X. et al. Spateo: multidimensional spatiotemporal modeling of single-cell spatial transcriptomics. bioRxiv 10.1101/2022.12.07.519417 (2022).
32. Guo, L. et al. Vt3d: a visualization toolbox for 3d transcriptomic data. J. Genetics Genom. 50, 713–719 (2023).
33. Fang, S. et al. Stereopy: modeling comparative and spatiotemporal cellular heterogeneity via multi-sample spatial transcriptomics. bioRxiv 10.1101/2023.12.04.569485 (2023).
34. Titouan, V., Courty, N., Tavenard, R. & Flamary, R. Optimal transport for structured data with application on graphs. Int. Conf. Mach. Learn. 91, 6275–6284 (2019).
35. Maynard, K. R. et al. Transcriptome-scale spatial gene expression in the human dorsolateral prefrontal cortex. Nat. Neurosci. 24, 425–436 (2021). 10.1038/s41593-020-00787-0
36. Rodriques, S. G. et al. Slide-seq: A scalable technology for measuring genome-wide expression at high spatial resolution. Science 363, 1463–1467 (2019). 10.1126/science.aaw1219
37. Chen, X., Fischer, S., Zhang, A., Gillis, J. & Zador, A. Modular cell type organization of cortical areas revealed by in situ sequencing. bioRxiv 10.1101/2022.11.06.515380 (2022).
38. Abdolhosseini, F. et al. Cell identity codes: understanding cell identity from gene expression profiles using deep neural networks. Sci. Rep. 9, 2342 (2019). 10.1038/s41598-019-38798-y
39. Efroni, I., Ip, P.-L., Nawy, T., Mello, A. & Birnbaum, K. D. Quantification of cell identity from single-cell gene expression profiles. Genome Biol. 16, 1–12 (2015). 10.1186/s13059-015-0580-x
40. Lacoste-Julien, S. Convergence rate of frank-wolfe for non-convex objectives. arXiv 10.48550/arXiv.1607.00345 (2016).
41. Wahba, G. A least squares estimate of satellite attitude. SIAM Rev. 7, 409–409 (1965). 10.1137/1007077
42. Lanjakornsiripan, D. et al. Layer-specific morphological and molecular differences in neocortical astrocytes and their dependence on neuronal layers. Nat. Commun. 9, 1623 (2018). 10.1038/s41467-018-03940-3
43. Csiszár, I. I-divergence geometry of probability distributions and minimization problems. Ann. Probab. 3, 146–158 (1975).
44. Schoenberg, I. J. Contributions to the problem of approximation of equidistant data by analytic functions. In I. J. Schoenberg Selected Papers. Contemporary Mathematicians. (ed. de Boor, C.) 3–57 (Birkhäuser, Boston, 1988).
45. Zhou, H. & Jayender, J. Smooth deformation field-based mismatch removal in real-time. arXiv 10.1101/7.08553 (2020).
46. Li, X. & Hu, Z. Rejecting mismatches by correspondence function. Int. J. Comput. Vis. 89, 1–17 (2010). 10.1007/s11263-010-0318-x
47. Li, X., Larson, M. & Hanjalic, A. Pairwise geometric matching for large-scale object retrieval. In Proc. IEEE Conference on Computer Vision and Pattern Recognition, 5153–5161 (IEEE, 2015).
48. Bergholm, F. Edge focusing. In IEEE Transactions on Pattern Analysis and Machine Intelligence, 726–741 (IEEE, 1987).
49. Marr, D. & Hildreth, E. Theory of edge detection. Proc. R. Soc. Lond. Ser. B. Biol. Sci. 207, 187–217 (1980).
50. Mafi, M. et al. A comprehensive survey on impulse and gaussian denoising filters for digital images. Signal Process. 157, 236–260 (2019). 10.1016/j.sigpro.2018.12.006
51. Saxena, C. & Kourav, D. Noises and image denoising techniques: a brief survey. Int. J. Emerg. Technol. Adv. Eng. 4, 14878–14885 (2014).
52. Wang, Z., Bovik, A. C., Sheikh, H. R. & Simoncelli, E. P. Image quality assessment: from error visibility to structural similarity. IEEE Trans. Image Process. 13, 600–612 (2004). 10.1109/TIP.2003.819861
53. Xia, T. et al. ST-GEARS: Advancing 3d downstream research through accurate spatial information recovery. GitHub. 10.5281/zenodo.13131713 (2024).

Associated Data

This section collects any data citations, data availability statements, or supplementary materials included in this article.

Supplementary Materials

Supplementary Information^{(3.4MB, pdf)}

Peer Review File^{(28.6MB, pdf)}

Reporting Summary^{(1.5MB, pdf)}

Source Data^{(320.5MB, zip)}

Data Availability Statement

The methods of ST-GEARS is packaged, and distributed as an open-source, publicly available repository at https://github.com/STOmics/ST-GEARS⁵³.

[CR1] 1.Marx, V. Method of the year: spatially resolved transcriptomics. Nat. Methods18, 9–14 (2021). 10.1038/s41592-020-01033-y [DOI] [PubMed] [Google Scholar]

[CR2] 2.Yue, L. et al. A guidebook of spatial transcriptomic technologies, data resources and analysis approaches. Comput. Struct. Biotechnol. J. 21, 940–955 (2023) [DOI] [PMC free article] [PubMed]

[CR3] 3.Park, H.-E. et al. Spatial transcriptomics: technical aspects of recent developments and their applications in neuroscience and cancer research. Adv. Sci.10, 2206939 (2023). 10.1002/advs.202206939 [DOI] [PMC free article] [PubMed] [Google Scholar]

[CR4] 4.Gyllborg, D. et al. Hybridization-based in vivo sequencing (hybiss) for spatially resolved transcriptomics in human and mouse brain tissue. Nucleic acids Res.48, 112–112 (2020). 10.1093/nar/gkaa792 [DOI] [PMC free article] [PubMed] [Google Scholar]

[CR5] 5.Chen, X. et al. High-throughput mapping of long-range neuronal projection using in vivo sequencing. Cell179, 772–786 (2019). 10.1016/j.cell.2019.09.023 [DOI] [PMC free article] [PubMed] [Google Scholar]

[CR6] 6.Wang, X. et al. Three-dimensional intact-tissue sequencing of single-cell transcriptional states. Science361, 5691 (2018). 10.1126/science.aat5691 [DOI] [PMC free article] [PubMed] [Google Scholar]

[CR7] 7.Qin, D. Next-generation sequencing and its clinical application. Cancer Biol. Med.16, 4 (2019). 10.20892/j.issn.2095-3941.2018.0055 [DOI] [PMC free article] [PubMed] [Google Scholar]

[CR8] 8.Chen, A. et al. Large field of view-spatially resolved transcriptomics at nanoscale resolution. BioRxiv10.1101/2021.01.17.427004 (2021).

[CR9] 9.Stickels, R. R. et al. Highly sensitive spatial transcriptomics at near-cellular resolution with slide-seqv2. Nat. Biotechnol.39, 313–319 (2021). 10.1038/s41587-020-0739-1 [DOI] [PMC free article] [PubMed] [Google Scholar]

[CR10] 10.Moses, L. & Pachter, L. Museum of spatial transcriptomics. Nat. Methods19, 534–546 (2022). 10.1038/s41592-022-01409-2 [DOI] [PubMed] [Google Scholar]

[CR11] 11.Moor, A. E. & Itzkovitz, S. Spatial transcriptomics: paving the way for tissue-level systems biology. Curr. Opin. Biotechnol.46, 126–133 (2017). 10.1016/j.copbio.2017.02.004 [DOI] [PubMed] [Google Scholar]

[CR12] 12.Zhou, R., Yang, G., Zhang, Y. & Wang, Y. Spatial transcriptomics in development and disease. Mol. Biomed.4, 32 (2023). 10.1186/s43556-023-00144-0 [DOI] [PMC free article] [PubMed] [Google Scholar]

[CR13] 13.Li, Z. & Peng, G. Spatial transcriptomics: New dimension of understanding biological complexity. Biophys. Rep.8, 119 (2022). 10.52601/bpr.2021.210037 [DOI] [PMC free article] [PubMed] [Google Scholar]

[CR14] 14.Williams, C. G., Lee, H. J., Asatsuma, T., Vento-Tormo, R. & Haque, A. An introduction to spatial transcriptomics for biomedical research. Genome Med.14, 1–18 (2022). 10.1186/s13073-022-01075-1 [DOI] [PMC free article] [PubMed] [Google Scholar]

[CR15] 15.Walker, B. L., Cang, Z., Ren, H., Bourgain-Chang, E. & Nie, Q. Deciphering tissue structure and function using spatial transcriptomics. Commun. Biol.5, 220 (2022). 10.1038/s42003-022-03175-5 [DOI] [PMC free article] [PubMed] [Google Scholar]

[CR16] 16.Atta, L. & Fan, J. Computational challenges and opportunities in spatially resolved transcriptomic data analysis. Nat. Commun.12, 5283 (2021). 10.1038/s41467-021-25557-9 [DOI] [PMC free article] [PubMed] [Google Scholar]

[CR17] 17.Velten, B. et al. Identifying temporal and spatial patterns of variation from multimodal data using Mefisto. Nat. Methods19, 179–186 (2022). 10.1038/s41592-021-01343-9 [DOI] [PMC free article] [PubMed] [Google Scholar]

[CR18] 18.Townes, F. W. & Engelhardt, B. E. Nonnegative spatial factorization applied to spatial genomics. Nat. Methods20, 229–238 (2023). 10.1038/s41592-022-01687-w [DOI] [PMC free article] [PubMed] [Google Scholar]

[CR19] 19.Verma, A. & Engelhardt, B. A Bayesian nonparametric semi-supervised model for integration of multiple single-cell experiments. bioRxiv10.1101/2020.01.14.906313 (2020).

[CR20] 20.Svensson, V., Teichmann, S. A. & Stegle, O. Spatialde: identification of spatially variable genes. Nat. Methods15, 343–346 (2018). 10.1038/nmeth.4636 [DOI] [PMC free article] [PubMed] [Google Scholar]

[CR21] 21.Dries, R. et al. Giotto: a toolbox for integrative analysis and visualization of spatial expression data. Genome Biol.22, 1–31 (2021). 10.1186/s13059-021-02286-2 [DOI] [PMC free article] [PubMed] [Google Scholar]

[CR22] 22.Wang, M. et al. High-resolution 3d spatiotemporal transcriptomic maps of developing drosophila embryos and larvae. Dev. Cell57, 1271–1283 (2022). 10.1016/j.devcel.2022.04.006 [DOI] [PubMed] [Google Scholar]

[CR23] 23.Mohenska, M. et al. 3d-cardiomics: a spatial transcriptional atlas of the mammalian heart. J. Mol. Cell. Cardiol.163, 20–32 (2022). 10.1016/j.yjmcc.2021.09.011 [DOI] [PubMed] [Google Scholar]

[CR24] 24.Vickovic, S. et al. Three-dimensional spatial transcriptomics uncovers cell type localizations in the human rheumatoid arthritis synovium. Commun. Biol.5, 129 (2022). 10.1038/s42003-022-03050-3 [DOI] [PMC free article] [PubMed] [Google Scholar]

[CR25] 25.Rao, A., Barkley, D., França, G. S. & Yanai, I. Exploring tissue architecture using spatial transcriptomics. Nature596, 211–220 (2021). 10.1038/s41586-021-03634-9 [DOI] [PMC free article] [PubMed] [Google Scholar]

[CR26] 26.Bergenstråhle, J., Larsson, L. & Lundeberg, J. Seamless integration of image and molecular analysis for spatial transcriptomics workflows. BMC Genom.21, 1–7 (2020). 10.1186/s12864-020-06832-3 [DOI] [PMC free article] [PubMed] [Google Scholar]

[CR27] 27.Zeira, R., Land, M., Strzalkowski, A. & Raphael, B. J. Alignment and integration of spatial transcriptomics data. Nat. Methods19, 567–575 (2022). 10.1038/s41592-022-01459-6 [DOI] [PMC free article] [PubMed] [Google Scholar]

[CR28] 28.Liu, X., Zeira, R. & Raphael, B. Paste2: Partial alignment of multi-slice spatially resolved transcriptomics data. In Research in Computational Molecular Biology: 27th Annual International Conference, 210 (Springer Nature, 2023)

[CR29] 29.Jones, A., Townes, F. W., Li, D. & Engelhardt, B. E. Alignment of spatial genomics data using deep gaussian processes. Nat. Methods20, 1379–1387 (2023). 10.1038/s41592-023-01972-2 [DOI] [PMC free article] [PubMed] [Google Scholar]

[CR30] 30.Xia, C.-R., Cao, Z.-J., Tu, X.-M. & Gao, G. Spatial-linked alignment tool (slat) for aligning heterogenous slices properly. bioRxiv10.1101/2023.04.07.535976 (2023). [DOI] [PMC free article] [PubMed]

[CR31] 31.Qiu, X., et al. Spateo: multidimensional spatiotemporal modeling of single-cell spatial transcriptomics. BioRxiv10.1101/2022.12.07.519417 (2022).

[CR32] 32.Guo, L. et al. Vt3d: a visualization toolbox for 3d transcriptomic data. J. Genetics Genom. 50, 713–719 (2023). [DOI] [PubMed]

[CR33] 33.Fang, S. et al. Stereopy: modeling comparative and spatiotemporal cellular heterogeneity via multi-sample spatial transcriptomics. bioRxiv10.1101/2023.12.04.569485 (2023).

[CR34] 34.Titouan, V., Courty, N., Tavenard, R. & Flamary, R. Optimal transport for structured data with application on graphs. Int. Conf. Mach. Learn.91, 6275–6284 (2019).

[CR35] 35.Maynard, K. R. et al. Transcriptome-scale spatial gene expression in the human dorsolateral prefrontal cortex. Nat. Neurosci.24, 425–436 (2021). 10.1038/s41593-020-00787-0 [DOI] [PMC free article] [PubMed] [Google Scholar]

[CR36] 36.Rodriques, S. G. et al. Slide-seq: A scalable technology for measuring genome-wide expression at high spatial resolution. Science363, 1463–1467 (2019). 10.1126/science.aaw1219 [DOI] [PMC free article] [PubMed] [Google Scholar]

[CR37] 37.Chen, X., Fischer, S., Zhang, A., Gillis, J. & Zador, A. Modular cell type organization of cortical areas revealed by in vivo sequencing. BioRxiv10.1101/2022.11.06.515380 (2022).

[CR38] 38.Abdolhosseini, F. et al. Cell identity codes: understanding cell identity from gene expression profiles using deep neural networks. Sci. Rep.9, 2342 (2019). 10.1038/s41598-019-38798-y [DOI] [PMC free article] [PubMed] [Google Scholar]

[CR39] 39.Efroni, I., Ip, P.-L., Nawy, T., Mello, A. & Birnbaum, K. D. Quantification of cell identity from single-cell gene expression profiles. Genome Biol.16, 1–12 (2015). 10.1186/s13059-015-0580-x [DOI] [PMC free article] [PubMed] [Google Scholar]

[CR40] 40.Lacoste-Julien, S. Convergence rate of frank-wolfe for non-convex objectives. arXiv10.48550/arXiv.1607.00345 (2016).

[CR41] 41.Wahba, G. A least squares estimate of satellite attitude. SIAM Rev.7, 409–409 (1965). 10.1137/1007077 [DOI] [Google Scholar]

[CR42] 42.Lanjakornsiripan, D. et al. Layer-specific morphological and molecular differences in neocortical astrocytes and their dependence on neuronal layers. Nat. Commun.9, 1623 (2018). 10.1038/s41467-018-03940-3 [DOI] [PMC free article] [PubMed] [Google Scholar]

[CR43] 43. Csiszár, I. I-divergence geometry of probability distributions and minimization problems. Ann. Probab. 3, 146–158 (1975).

[CR44] 44. Schoenberg, I. J. Contributions to the problem of approximation of equidistant data by analytic functions. In I. J. Schoenberg Selected Papers. Contemporary Mathematicians (ed. de Boor, C.) 3–57 (Birkhäuser, Boston, 1988).

[CR45] 45. Zhou, H. & Jayender, J. Smooth deformation field-based mismatch removal in real-time. arXiv 10.1101/7.08553 (2020).

[CR46] 46. Li, X. & Hu, Z. Rejecting mismatches by correspondence function. Int. J. Comput. Vis. 89, 1–17 (2010). 10.1007/s11263-010-0318-x

[CR47] 47. Li, X., Larson, M. & Hanjalic, A. Pairwise geometric matching for large-scale object retrieval. In Proc. IEEE Conference on Computer Vision and Pattern Recognition, 5153–5161 (IEEE, 2015).

[CR48] 48. Bergholm, F. Edge focusing. In IEEE Transactions on Pattern Analysis and Machine Intelligence, 726–741 (IEEE, 1987).

[CR49] 49. Marr, D. & Hildreth, E. Theory of edge detection. Proc. R. Soc. Lond. Ser. B. Biol. Sci. 207, 187–217 (1980).

[CR50] 50. Mafi, M. et al. A comprehensive survey on impulse and gaussian denoising filters for digital images. Signal Process. 157, 236–260 (2019). 10.1016/j.sigpro.2018.12.006

[CR51] 51. Saxena, C. & Kourav, D. Noises and image denoising techniques: a brief survey. Int. J. Emerg. Technol. Adv. Eng. 4, 14878–14885 (2014).

[CR52] 52. Wang, Z., Bovik, A. C., Sheikh, H. R. & Simoncelli, E. P. Image quality assessment: from error visibility to structural similarity. IEEE Trans. Image Process. 13, 600–612 (2004). 10.1109/TIP.2003.819861

[CR53] 53. Xia, T. et al. ST-GEARS: Advancing 3D downstream research through accurate spatial information recovery. GitHub. 10.5281/zenodo.13131713 (2024).
