Biophysical Journal. 2023 Oct 27;122(22):4451–4466. doi: 10.1016/j.bpj.2023.10.023

Extract latent features of single-particle trajectories with historical experience learning

Yongyu Zhang 1, Feng Ge 1, Xijian Lin 1, Jianfeng Xue 1, Yuxin Song 1, Hao Xie 2,∗, Yan He 1,∗∗
PMCID: PMC10698327  PMID: 37885178

Abstract

Single-particle tracking has enabled real-time, in situ quantitative studies of complex systems. However, inferring dynamic state changes from noisy and undersampled trajectories is challenging. Here, we introduce a data-driven method for extracting features of subtrajectories with historical experience learning (Deep-SEES), where a single-particle tracking analysis pipeline based on a self-supervised architecture automatically searches for the latent space, allowing effective segmentation of the underlying states from noisy trajectories without prior knowledge of the particle dynamics. We validated our method on a variety of noisy simulated and experimental data. Our results showed that the method can faithfully capture both stable states and their dynamic switching. In highly random systems, our method outperformed commonly used unsupervised methods in inferring motion states, which is important for understanding nanoparticles interacting with living cell membranes, active enzymes, and liquid-liquid phase separation. Self-generating latent features of trajectories could potentially improve the understanding, estimation, and prediction of many complex systems.

Significance

Classifying local dynamics from single-particle tracking (SPT) offers insights into subtle features of probe motion over time, such as the heterogeneity of the interplaying environment (e.g., cell membrane regions with different local interactions or apparent viscosities) and different transport states of the probe (e.g., molecular motors with different activation). However, inferring dynamic changes and rapid state transitions from noisy and undersampled subtrajectories is challenging. Here, rather than presuming a certain feature space to estimate particle motion patterns, we provide a data-driven approach that automatically and autonomously searches for the diffusion properties of subtrajectories in the latent space for feature clustering, enabling segmentation of underlying states in a noise-reduced manner to better estimate and predict the behavior of many different complex systems.

Introduction

Single-particle tracking (SPT) has enabled the quantitative analyses of complex systems using in situ probes with nanoscale spatial localization precision and millisecond temporal resolution, capable of uncovering dynamics previously masked in the ensemble average (1,2). By decoding the diffusion patterns of individual particle trajectories in a medium, researchers have obtained key insight into membrane transport (3), liquid-liquid phase separation (4,5), chaotic movement (6), and active matter (7,8), and gained more knowledge on dynamic interactions in different processes. However, resolving the multiple dynamic states and rapid transitions exhibited by probe particles in nonequilibrium and nonstationary processes is challenging, because the trajectories of probe particles exploring complicated systems are inevitably degraded by noise from the local environments, the particle probes, the signal measurements, and the signal transduction (9).

Different categories of algorithms have been developed to segment the SPT data. The most widely used methods focus on the changes in diffusion properties between subtrajectories with sufficient length. In particular, the mean-square displacement (MSD) over time, ⟨x²⟩ ∼ t^α, and the scaling coefficient α obtained from the fitting are used to characterize the diffusion pattern: 0 < α < 1 for subdiffusion, α = 1 for Brownian motion, and α > 1 for superdiffusion (in particular, α = 0 corresponds to immobile particles and α = 2 corresponds to ballistic motion) (10). In addition to α, other features such as the diffusion constant (11,12), velocity, and slope of the moment scaling spectrum (MSS) (13) have been used to differentiate fragments based on some invariant parameters. However, for highly complicated spatiotemporally heterogeneous systems, the reliable application of stationary statistical analysis requires sampling with an ultrahigh spatial and temporal resolution (14), which is not achievable in most imaging systems.

The second category involves inferring dynamic changes in SPT data through statistical modeling of complex time series, driven by rapid advances in machine learning. Typically, the spatiotemporal correlations of different diffusion modes are captured in a feature space through various techniques such as hidden Markov models (HMMs) (15,16,17), Bayesian inference (18,19,20,21), and likelihood ratio tests (22,23,24), which often involve optimization or estimation procedures to find the best model parameters or make statistical decisions. However, these methodologies rely on analytically identifying or comparing against specific types of diffusion models or characteristics of interest, which may not always be appropriate. Detecting motion types that exhibit patterns over long timescales can be challenging when relying on single-time-step HMMs (15,16) or probability density functions (25). The Bayesian framework allows for nonparametric identification of the most significant long-term patterns across various stochastic processes (18,21), but its primary dependence on properties relevant to Brownian motion may constrain its capability to capture the heterogeneity in anomalous diffusion (26).

To enhance the detection of subtrajectory differences, it is natural to consider alternative data-driven approaches that involve mapping raw features into a latent space rather than relying on fixed rules to process the original data stream. Consequently, all subtrajectories, not just those that are chronologically contiguous, are evaluated together. In some cases, the machine learning tools are applied with prior knowledge of SPT. Predefined motion types (e.g., subdiffusion, normal diffusion, and superdiffusion) or established models (e.g., CTRW, fractional Brownian motion [fBm], and ATTM) have been used to decipher the diffusion behavior in a certain type of particle-environment interaction (27,28,29,30). Rolling windows across multiple time points, or detection of abrupt shifts in diffusion type, can be used to segment time traces (31,32). By integrating additional handcrafted features, a more comprehensive classification of trajectory characteristics can be achieved. For example, a fingerprinting approach utilizes a predefined set of 17 isolated features to generate a dictionary of diffusional traits across multiple systems. These traits are then classified using linear discriminant analysis (33). The mobility of particles, interpreted through the MSS, can be accurately evaluated by employing a recurrent neural network (34). These approaches achieve good performance in selected data sets. However, they require prior knowledge and supervised learning, such as the applied model, the selected features, and the designated target, which are dependent on the particular complex phenomenon under investigation.

An ideal unsupervised SPT analysis algorithm aimed at classifying and segmenting dynamic states should be able to learn a mapping function that automatically represents global features as probability densities or as a mixture of feature preferences. Previously, we have proposed an unsupervised learning approach named SEES (from "succession of experience" to the "experience of succession"), which uses nonprior historical information as the feature vectors, and global k-means to extract the feature differences from the selected representation subspace (35,36). When using cluster centroids as different histories to label the original trajectories, SEES reveals the underlying distinctions of particle dynamics as intuitive colored patterns. However, the unsupervised learning method is sensitive to the trajectory noise, and the patterns extracted may not necessarily be relevant to informative traits of interest.

Inspired by deep generative architectures that reconstruct raw data and learn the true data distribution along with crucial latent features (37,38), we introduce a self-supervised method called Deep-SEES. This approach utilizes variational deep learning of subtrajectories to extract latent features for the purpose of classifying and segmenting the inherently complex diffusion of probes without the need for prior ground truth or high-quality data sets. It employs an SPT analysis pipeline including a variational autoencoder (VAE) architecture with two long short-term memory (LSTM) networks, known as LSTM-VAE (39,40,41). This architecture automatically searches for the latent space, unifying the denoising and segmentation steps. The LSTM network captures sequential patterns over the longer term of time series, while the VAE framework extracts main features from the latent space. Since the slight compression in the latent space leads to local smoothing while minimizing the effects of time-independent factors such as random noise or Brownian fluctuations, the separability of the embedded dynamics is enhanced, resulting in greatly improved accuracy and robustness in identifying and segmenting the underlying states.

To validate this approach, we first assessed the effectiveness of Deep-SEES on highly random SPT trajectories, and our results showed the z-latent features preserve not only the statistical properties of diffusive motions but also their temporal correlation. These z-latent features were shown to be very effective in segmentation and classification tasks for inferring motion states, especially in the presence of high noise. Compared with the commonly used unsupervised methods, our method yields significant enhancements, with an increase of 19% in the F1 score for classification and a 26% improvement in root mean-square error (RMSE) for segmentation. Moreover, we applied Deep-SEES to capture the dynamics of nanoparticles interacting with living cell membranes, including translational and rotational trajectories. The detailed motion state transition dynamics, as well as the location of the cell-entry rare events, were successfully recovered. Finally, through high-speed imaging, the dynamics of nanoparticles in an active enzyme system and a liquid-liquid phase separation system were well described despite the highly noisy trajectories. The capability of our self-supervising approach in self-generating z-latent features of trajectories would expand our ability to understand, estimate, and predict the behavior of many different complex systems.

Materials and methods

This section details the methods, including the architectural design of Deep-SEES, modeling parameters for various systems, and underlying state evaluation criteria. The numerical and experimental setups are concisely defined.

Materials availability

Previously published data were used for this work (Fernex et al. (42), Vega et al. (13), Lin et al. (43), and Xue et al. (44)). The details are presented in the paper and/or the supporting material.

Deep-SEES frameworks

Deep-SEES is an SPT analysis pipeline designed to automatically extract nonlinear features of subtrajectories. Given a multivariate time series of trajectories x(t) = s(t) + w(t) ∈ ℝ^m, where x(t) contains time-dependent system dynamics s(t) and time-independent white noise w(t) from m-dimensional channels at time t, the Deep-SEES learning results in the optimal reconstruction value ŝ(t) and the corresponding labels of underlying states c(t).

In the first step, trajectories are decomposed into a successive combination of overlaid subsequences, and the subtrajectories are represented in the latent space by a VAE framework, with two LSTM networks acting as the encoder and the decoder, respectively (39). The data set of subtrajectories or historical vectors is generated by applying a rolling window over the time series of the trajectories. For example, X(t) := [x(t−L+1), x(t−L+2), ..., x(t)] denotes a historical vector of length L having the leading point at time t. In systems primarily characterized by diffusion, especially those using SPT, the analysis focuses on the local displacement of particles rather than their coordinates. A processing strategy involving rotation and mirroring is applied to the input sequence: all trajectories are rotated to ensure that the initial position is at the origin and the final position is on the x axis. If the y coordinate (or z coordinate) with the maximum absolute value of the trajectory is negative, the trajectory is mirrored along the x axis.
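As an illustration only, a minimal sketch of this decomposition and alignment for 2D trajectories is given below (NumPy; the function names and the (N, 2) array layout are our assumptions, not part of the released code):

```python
import numpy as np

def historical_vectors(xy, L=30):
    """Decompose an (N, 2) trajectory into overlapping subtrajectories of length L."""
    return np.stack([xy[t - L + 1:t + 1] for t in range(L - 1, len(xy))])

def align_subtrajectory(sub):
    """Shift the first point to the origin, rotate the last point onto the x axis,
    and mirror across the x axis if the largest-|y| point lies below it."""
    sub = sub - sub[0]
    angle = np.arctan2(sub[-1, 1], sub[-1, 0])
    c, s = np.cos(-angle), np.sin(-angle)
    rot = np.array([[c, -s], [s, c]])
    sub = sub @ rot.T                          # final position now on the x axis
    if sub[np.argmax(np.abs(sub[:, 1])), 1] < 0:
        sub[:, 1] *= -1                        # mirror along the x axis
    return sub
```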

An autoencoder (AE) learns to reconstruct input data using three layers: an input layer (encoder), a hidden layer (latent space), and an output layer (decoder). The AE is forced to learn just the most important latent features, while the reconstruction avoids nonessential sources of variation such as random noise, typically according to a mean-square error (MSE) loss function. The VAE framework additionally places a probabilistic distribution over the hidden layer (Gaussian latent) to maximize the likelihood of the data set X of subtrajectories in Equation 1.

$$p(X,\theta)=\int_Z p(X \mid Z,\theta)\,p(Z,\theta)\,dZ \tag{1}$$

Here, to be specific, two LSTMs as VAE encoders and decoders are used to capture long- and short-term sequential patterns in time series. The LSTM cell consists of forget gate f_t, input gate i_t, output gate o_t, hidden state h_t, and cell state c_t in Equation 2.

$$\begin{aligned} f_t &= \sigma(W_f x_t + U_f h_{t-1} + b_f)\\ i_t &= \sigma(W_i x_t + U_i h_{t-1} + b_i)\\ o_t &= \sigma(W_o x_t + U_o h_{t-1} + b_o)\\ c_t &= f_t \odot c_{t-1} + i_t \odot \sigma(W_c x_t + U_c h_{t-1} + b_c)\\ h_t &= o_t \odot \sigma(c_t) \end{aligned} \tag{2}$$

The LSTM encoder embeds the segments of time series X into latent space Z, taking the x, y coordinates as inputs frame by frame. At each time step, the hidden-state output of the previous LSTM cell and the current x, y coordinates are fed into the next LSTM cell. By compressing the last hidden state into two latent vectors μ and σ with a fully connected layer, a Gaussian distribution in the latent space is sampled to yield the latent variable Z in Equation 3. Note that the initial hidden state is set randomly.

$$Z \sim q(Z \mid X,\theta)=\mathcal{N}(\mu,\sigma^2) \tag{3}$$

With the same hyperparameters as the LSTM encoder, the LSTM decoder decompresses the latent variable Z to reconstruct the original subtrajectories, where Z serves as the initial hidden state and a sequence of zeros serves as the input.
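The following condensed PyTorch sketch illustrates this encoder/decoder pair (dimensions follow the implementation section; the linear map from z to the decoder's initial hidden state, and the class itself, are simplifications rather than the authors' released code):

```python
import torch
import torch.nn as nn

class LSTMVAE(nn.Module):
    def __init__(self, in_dim=2, hidden=128, z_dim=20, layers=2):
        super().__init__()
        self.encoder = nn.LSTM(in_dim, hidden, layers, batch_first=True)
        self.to_mu = nn.Linear(hidden, z_dim)
        self.to_logvar = nn.Linear(hidden, z_dim)
        self.z_to_h = nn.Linear(z_dim, hidden)      # assumed mapping of z to the decoder state
        self.decoder = nn.LSTM(in_dim, hidden, layers, batch_first=True)
        self.out = nn.Linear(hidden, in_dim)

    def forward(self, x):                           # x: (batch, L, in_dim) subtrajectories
        _, (h, _) = self.encoder(x)
        mu, logvar = self.to_mu(h[-1]), self.to_logvar(h[-1])
        z = mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)   # reparameterization trick
        h0 = self.z_to_h(z).unsqueeze(0).repeat(self.decoder.num_layers, 1, 1)
        c0 = torch.zeros_like(h0)
        zeros = torch.zeros_like(x)                 # decoder input is a sequence of zeros
        dec, _ = self.decoder(zeros, (h0, c0))
        return self.out(dec), mu, logvar, z
```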

To enhance the inherent suitability of the learned representations for feature clustering, the total loss is formulated into two main components: the loss function derived from standard VAE implementations to maximize the evidence lower bound (ELBO), and the β-weighted k-means loss (45) in Equation 4.

$$\mathcal{L}=\mathcal{L}_{\mathrm{ELBO}}+\beta\,\mathcal{L}_{\mathrm{Kmeans}} \tag{4}$$

The ELBO loss is presented in Equation 5: the expectation term, evaluated as an MSE, captures the important dynamics, while the KL divergence measures the additional information required to express the posterior relative to the prior, with a closed-form solution for Gaussian distributions.

$$\mathcal{L}_{\mathrm{ELBO}}=\mathbb{E}_{q(Z \mid X,\varphi)}\!\left[\log p(X \mid Z,\theta)\right]-\mathrm{KL}\!\left(q(Z \mid X,\varphi)\,\|\,p(Z,\theta)\right) \tag{5}$$

To promote the development of cluster structures and to obtain cluster-specific representations, we further consider a k-means loss function (45) in Equation 6.

$$\mathcal{L}_{\mathrm{Kmeans}}=\mathrm{Tr}\!\left(Z^{T}Z\right)-\mathrm{Tr}\!\left(F^{T}Z^{T}ZF\right) \tag{6}$$

where Tr represents the matrix trace, and Z ∈ ℝ^{m×N} is the latent variable data matrix. The k-means minimization can be treated as a trace maximization problem of the Gram matrix Z^T Z. Here, F ∈ ℝ^{N×k} denotes the cluster indicator matrix, where the closed-form solution for F is derived by composing the first k singular vectors of Z, in accordance with the Ky Fan theorem. Moreover, the term β functions as a regularization component within the model, promoting model simplicity and generalizability.
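A hedged sketch of how the two loss terms might be combined in PyTorch is shown below (the closed-form F is recomputed from the batch Gram matrix each step and detached for simplicity; this is an illustration of Equations 4–6, not the reference implementation):

```python
import torch

def elbo_loss(x, x_hat, mu, logvar):
    """Negative ELBO (Equation 5): MSE reconstruction plus closed-form Gaussian KL."""
    recon = torch.mean((x_hat - x) ** 2)
    kl = -0.5 * torch.mean(1 + logvar - mu.pow(2) - logvar.exp())
    return recon + kl

def kmeans_loss(z, k):
    """Equation 6: Tr(Z^T Z) - Tr(F^T Z^T Z F), with F the top-k singular vectors (Ky Fan)."""
    gram = z @ z.t()                   # z: (N, z_dim) batch of z-latent vectors
    _, _, vh = torch.linalg.svd(gram)
    f = vh[:k].t().detach()            # gradients through the SVD are ignored here
    return torch.trace(gram) - torch.trace(f.t() @ gram @ f)

def total_loss(x, x_hat, mu, logvar, z, k=5, beta=0.05):
    """Equation 4: L = L_ELBO + beta * L_Kmeans."""
    return elbo_loss(x, x_hat, mu, logvar) + beta * kmeans_loss(z, k)
```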

In the second step, using the latent variable Z (z-latent), we can extract parameters of subtrajectories that represent the regions explored locally by the particles on both temporal and spatial scales. Differences in z-latent vectors show underlying states and their transitions along the full trajectory range, and these z-latent vectors are clustered by k-means, in which the similarity between z-latent vectors is evaluated using cosine distances. The minimum number of centroids K is determined by the Elbow method or silhouette coefficients (46), while the optimal number of centroids K is determined empirically by the fact that the cluster centers (recovered through the decoder as trajectory average in the original data space) do not overlap and the average duration of each state (defined by the length of the same color fragment after trajectory reconstruction) is close to the turning point (reduction of the duration from fast to slow as the K number increases, Notes S4 and S5). The original data space is where we observe the trajectories of particles over time, while other variables (such as polar angle) could be included as additional dimensions. We can also choose a certain cluster to drop or recluster to achieve hierarchical clustering. Moreover, to evaluate the k-means separation directly, the t-distributed stochastic neighbor embedding (t-SNE) function in the scikit-learn python package is used to visualize z-latent separation in 2D embedding space. The PCA is used as an initialization step in t-SNE to achieve a more stable and reliable embedding compared with those initialized randomly.
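A minimal scikit-learn sketch of this clustering and visualization step follows (normalizing the z-latent vectors makes Euclidean k-means act on cosine similarity; the function names are illustrative):

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.manifold import TSNE

def cluster_latents(z, k):
    """Cluster (N, z_dim) z-latent vectors; returns point labels and cluster centers."""
    z_unit = z / np.linalg.norm(z, axis=1, keepdims=True)   # cosine-like similarity
    km = KMeans(n_clusters=k, n_init=10).fit(z_unit)
    return km.labels_, km.cluster_centers_

def embed_2d(z):
    """2D t-SNE embedding with PCA initialization for a more stable layout."""
    return TSNE(n_components=2, init="pca").fit_transform(z)
```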

In the last step, visualization of the reconstructed dynamics and underlying states can be further exploited to extract more specific information according to existing models on particle motion. We attribute different colors to different z-latent clusters, which are mapped back to the subtrajectory space. An inverse rolling-window operation is applied to create a complete reconstructed trajectory, with each point being colored accordingly. Since each data point appears simultaneously on multiple overlapping subtrajectories attributed to different classifications, we choose the color label that appears most often (Note S6).
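The majority-vote relabeling over overlapping subtrajectories might look like the following sketch (our own helper; subtrajectory i is assumed to cover trajectory points i to i+L−1):

```python
import numpy as np

def relabel_trajectory(n_points, sub_labels, L=30):
    """Assign each trajectory point the most frequent cluster label among the
    overlapping length-L subtrajectories that contain it."""
    votes = [[] for _ in range(n_points)]
    for i, lab in enumerate(sub_labels):
        for t in range(i, i + L):
            votes[t].append(lab)
    return np.array([np.bincount(v).argmax() for v in votes])
```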

Implementation

The LSTM-VAE of Deep-SEES was implemented in the Python open source framework PyTorch. We generated a subtrajectory data set with L = 30, which was then normalized by the mean absolute value of the inputs. The encoder and decoder each had a two-layer LSTM cell with 128 hidden neurons, while the latent layer consisted of 20 neurons, about a third of the input size. We applied an Adam optimizer with a learning rate of 0.0005. For all data sets, gradient values were clipped to 5, the batch size was set to 50, and the dropout rate was set to 0.2. The network was trained for 200 epochs, and the computations were run on a GeForce RTX 2080 Ti GPU with a total training time of about 0.5 h. After training, all subtrajectories in the data set were fed to the LSTM-VAE without shuffling. Clustering was performed with scikit-learn's k-means algorithm using Euclidean distance on the normalized z-latent vectors, which is equivalent to clustering on cosine similarity, and the choice of K was explored within the range of 2–20. The cluster centers were multiplied by the average norm of their corresponding z-latent vectors and fed to the LSTM decoder to display the average trajectories. MATLAB software (The MathWorks, Natick, MA) was used to preprocess and visualize the data. Note that, during the hyperparameter selection process, we split the data into training sets and evaluation sets, with the latter used for hyperparameter tuning and with the loss function containing the ELBO and k-means losses. We also evaluated the F1 scores when the ground truth of the simulated data was available. Once an optimal set of hyperparameters was determined, we trained 20 Deep-SEES models and selected the 5 best-performing networks based on the average results of their outputs. We chose a setup in which the number of top eigenvalues k in the latent space generally exceeded the actual number of clusters, aiming to strengthen the model's ability to avoid overfitting. The weight of the k-means loss β was set to 0.05.
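Putting the pieces together, the training configuration described above corresponds roughly to the sketch below (using the illustrative LSTMVAE and total_loss definitions from the previous section; `subtrajectories` is an assumed (N, 30, 2) tensor, and dropout is omitted for brevity):

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

model = LSTMVAE(in_dim=2, hidden=128, z_dim=20, layers=2)
optimizer = torch.optim.Adam(model.parameters(), lr=5e-4)
loader = DataLoader(TensorDataset(subtrajectories), batch_size=50, shuffle=True)

for epoch in range(200):
    for (x,) in loader:
        x_hat, mu, logvar, z = model(x)
        loss = total_loss(x, x_hat, mu, logvar, z, k=5, beta=0.05)
        optimizer.zero_grad()
        loss.backward()
        torch.nn.utils.clip_grad_norm_(model.parameters(), 5.0)   # clip gradients at 5
        optimizer.step()
```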

Simulation and data set

Slow-fast variants

The slow-fast speed-switching trajectories are constructed using an HMM with four normal diffusion states and two different sets of occupancy probabilities of the diffusion constant (33), where state i transitions to state j with probability p_ij, denoted by s_j = s_i p_ij. The diffusion constant was chosen from a set of four values (0.006, 0.025, 0.1, 0.4 μm2/s), with associated probabilities for slow diffusion (0.5, 0.45, 0.05, 0.0) and fast diffusion (0, 0, 0.5, 0.5). At each frame update step, the diffusion constant was either recalculated with a probability of 0.1 or left unchanged with a probability of 0.9. The transition probability matrix, P, was defined as:

$$P=\begin{pmatrix}0.9+0.1p_1 & 0.1p_2 & 0.1p_3 & 0.1p_4\\ 0.1p_1 & 0.9+0.1p_2 & 0.1p_3 & 0.1p_4\\ 0.1p_1 & 0.1p_2 & 0.9+0.1p_3 & 0.1p_4\\ 0.1p_1 & 0.1p_2 & 0.1p_3 & 0.9+0.1p_4\end{pmatrix}$$

To update the x-y displacement, we sampled from a normal distribution, N(0, 2D_iΔt), where D_i was the diffusion constant of the current state i and Δt = 1/30 s was the frame length. Observational noise sampled from N(0, σ_p²) was then added to the coordinates, with the noise level defined as r = σ_p/√(2D̄Δt), where the mean diffusion constant, denoted by D̄, was calculated as the average of the four distinct diffusion constant values. This procedure generated a fast and a slow variant of 20,000 frames each, for a total of 40,000 frames. The localization error noise level r ranged from 0 to 2. Furthermore, for segmentation, change points were introduced along the trajectory at random intervals, covering a range from 20 to 180. Each set of parameters corresponded to 200 trajectories with 200 frames.
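A compact re-implementation of this generator, following the description above (a sketch with our own parameter names, not the authors' original simulation code):

```python
import numpy as np

def simulate_slow_fast(n_frames=20000, occupancy=(0.5, 0.45, 0.05, 0.0),
                       D=(0.006, 0.025, 0.1, 0.4), dt=1/30, r=1.0, seed=0):
    """Four-state diffusion; the state is redrawn from the occupancy with probability 0.1."""
    rng = np.random.default_rng(seed)
    p = np.asarray(occupancy) / np.sum(occupancy)
    state = rng.choice(4, p=p)
    xy = np.zeros((n_frames, 2))
    for t in range(1, n_frames):
        if rng.random() < 0.1:                          # recalculate the state (prob. 0.1)
            state = rng.choice(4, p=p)
        xy[t] = xy[t - 1] + rng.normal(0.0, np.sqrt(2 * D[state] * dt), size=2)
    sigma_p = r * np.sqrt(2 * np.mean(D) * dt)          # r = sigma_p / sqrt(2 * Dbar * dt)
    return xy + rng.normal(0.0, sigma_p, size=xy.shape) # add localization noise
```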

fBm

The self-similar fBm model was used to simulate different confinement modes, which is a stochastic process characterized by long-range correlations among its increments (28). In fBm, the position x(t) corresponded to a Gaussian process with stationary increments, which was characterized by a symmetric distribution with mean ⟨x⟩ = 0. The ensemble-averaged MSD (EA-MSD) for this process demonstrated the scaling behavior ⟨x²⟩ = 2K_H t^(2H). Here, K_H represented the diffusion coefficient, and H denoted the Hurst exponent, which was associated with the anomalous diffusion exponent via H = α/2. The fBm featured two distinct regimes: a superdiffusive regime with positive noise correlation (0.5 < H < 1, 1 < α < 2) and a subdiffusive regime with negative noise correlation (0 < H < 0.5, 0 < α < 1). For H = 0.5 (α = 1), the noise was uncorrelated, and the fBm converged to the standard Brownian motion. The fBm samples were generated by employing the fgn method from the fbm Python package (https://pypi.org/project/fbm/). Furthermore, the time correlation for fBm was extended to two and three dimensions by considering the simple composition of independent motions along orthogonal axes.

Considering highly random SPT systems with time-varying statistical properties, we generated a joint trajectory of anomalous diffusion containing three regions with different confinement: a subdiffusion (α = 0.5), a normal diffusion (α = 1.0), and a superdiffusion (α = 1.5). The parameters of fBm were the Hurst index H = α/2, a sampling rate of 60 Hz, and a time span of approximately 333.33 s in each region. The step length of the three trajectories was normalized by its corresponding variance. The reflected fBm (47) was used to describe different confinement of rotational motion, and we generated polar angles corresponding to the three regions. The parameters of the reflected fBm were α = 0.5, α = 1.0, and α = 0.05, respectively, all with 90° acting as the bound. The step length of the polar angles was likewise normalized by its corresponding variance.
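For reference, one such anomalous-diffusion component can be generated with the fgn function of the fbm package, roughly as sketched below (the per-region normalization here simply rescales by the step standard deviation):

```python
import numpy as np
from fbm import fgn   # fractional Gaussian noise increments

def fbm_track_2d(n=20000, alpha=0.5, dt=1/60):
    """2D fBm trajectory with anomalous exponent alpha (Hurst index H = alpha/2)."""
    H = alpha / 2
    steps = np.column_stack([fgn(n=n, hurst=H, length=n * dt, method="daviesharte")
                             for _ in range(2)])
    steps /= steps.std()              # normalize the step length across regions
    return np.cumsum(steps, axis=0)
```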

Varied motion types and transitions

We used the data sets from Vega et al., which contained trajectories of switching between different diffusion modes: immobile, free diffusion, confined diffusion, and directed diffusion (13). For free diffusion tracks, x- and y-displacements were sampled from a normal distribution, N(0, 2DΔt), where D represented the diffusion constant and Δt denoted the time step. Confined diffusion tracks involved free diffusion tracks reflected off the boundary of a circle with radius R. Directed diffusion tracks were determined by vcos(θ)Δt and vsin(θ)Δt, where v was the drift velocity, and θ was the drift angle sampled from a uniform distribution. Immobile tracks were simulated using a normal distribution, N(0, σ_p²), with mean 0 and standard deviation σ_p, taking into account the position deviation (localization error). Confinement and drift were each described by a two-parameter diffusion model: the diffusion coefficient D and the confinement radius R, or the diffusion coefficient D and the velocity v, respectively. The strengths of switching between free diffusion and drift diffusion and between free diffusion and confined diffusion were characterized by DE = vΔt/√(4DΔt) and CE = √(4DΔt)/R, respectively.

To compare the segmentation and motion classification performance for transient mobility analysis over multiple short trajectories, we performed simulations of particle trajectories for 200 frames, which involved a single switch between three distinct diffusion modes: free diffusion, confined diffusion, and directed diffusion. The diffusion coefficient used for all modes was set to D = 2 pixels2/frame. The CE ranged from 0 to 1, the DE ranged from 0 to 2, and the localization error noise level r = σ_p/√(2DΔt) ranged from 0 to 2. The location of change points along the trajectory was introduced at random intervals, covering a range from 20 to 180. Each set of parameters corresponded to 200 trajectories.
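A simplified sketch of one such trajectory with a single switch is given below (the reflection at the confinement boundary and the parameter names are our own choices for illustration):

```python
import numpy as np

def simulate_switch(n=200, change=100, mode="directed", D=2.0, dt=1.0,
                    CE=0.7, DE=0.8, r=0.75, seed=0):
    """Free diffusion before the change point, then confined or directed diffusion."""
    rng = np.random.default_rng(seed)
    xy = np.zeros((n, 2))
    R = np.sqrt(4 * D * dt) / CE               # confinement radius from CE = sqrt(4*D*dt)/R
    v = DE * np.sqrt(4 * D * dt) / dt          # drift speed from DE = v*dt/sqrt(4*D*dt)
    theta = rng.uniform(0, 2 * np.pi)          # drift angle
    for t in range(1, n):
        step = rng.normal(0.0, np.sqrt(2 * D * dt), size=2)
        if t >= change and mode == "directed":
            step += v * dt * np.array([np.cos(theta), np.sin(theta)])
        xy[t] = xy[t - 1] + step
        if t >= change and mode == "confined":
            disp = xy[t] - xy[change]          # reflect back inside a circle of radius R
            dist = np.linalg.norm(disp)
            if dist > R:
                xy[t] = xy[change] + disp * (2 * R - dist) / dist
    sigma_p = r * np.sqrt(2 * D * dt)          # localization error, r = sigma_p/sqrt(2*D*dt)
    return xy + rng.normal(0.0, sigma_p, size=xy.shape)
```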

High-speed imaging of nanoparticles

We used data sets from our previous publications by Lin et al. (43) and Xue et al. (44), which consisted of trajectories obtained from an active enzyme system and a liquid-liquid phase separation (LLPS) process, respectively. We obtained approximately 100 trajectories, with sampling rates between 1000 and 2000 Hz and an acquisition time of 0.5 s. High-speed imaging was employed using a PCO Dimax HS4 CMOS camera. The high-speed dark-field imaging system was used to simultaneously monitor both the translational and rotational motion of PEGylated Au nanorods (AuNRs) (85 × 40 nm in size). A birefringent prism separated the scattered light from a single AuNR into two orthogonal polarization components, enabling the detection of translational and rotational movements of AuNRs through the displacement of two-spot centers and changes in two-spot intensities, respectively. The time series of intensity channels Ix, Iy served as the 2D trajectories in the analysis of rotational motion.

Comparison baselines

We evaluated the performance of the Deep-SEES method by comparing it with four established features for different motion types and transitions, including slow-fast variants, fBm, and dynamic switch systems. The features included raw x-y trajectories, filtered x-y trajectories, velocity, and MSDs. For segmentation and classification, we implemented k-means clustering using the Euclidean distance metric while maintaining the same subtrajectory length and noise level as the Deep-SEES process. To eliminate local spatial information, the raw x-y subtrajectories were preprocessed with techniques such as rotation and mirroring operations. The filtered subtrajectories reconstructed by Deep-SEES were subjected to the same preprocessing steps. We computed the velocity for each frame within the subtrajectories and extracted the MSDs feature with duration lags equal to half the subtrajectory length.

We benchmarked the Deep-SEES method against two commonly used methods for inferring transient states: divide-and-conquer MSS (DC-MSS) (13) and HMMs (15). DC-MSS characterizes molecular motion by calculating multiple moments of the displacement distribution, while HMM represents a particular state of motion with a specific diffusion coefficient and velocity from single-step displacements. The comparison focused on segmentation and motion classification performance for transient mobility analysis, that is, identifying switches between free diffusion, confined diffusion, and directed diffusion in multiple short trajectories. Note that immobility in DC-MSS was classified as a form of confined diffusion.

We used the macro average F1 score to evaluate assigning trajectories (or segments) to one of three diffusion motions. The F1 score was calculated by 2TP/(2TP + FP + FN), where TP, FP, and FN represent true positives, false positives, and false negatives calculated over the whole data set, respectively. Specifically, for each motion state class, we computed F1 scores based on each point of the output label and its corresponding ground truth. Then the macro average F1 score was calculated by taking the arithmetic mean of all the F1 scores for each class, which assigned equal importance to each class.
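The per-class and macro-averaged F1 computation described above corresponds to the following sketch (equivalent to scikit-learn's f1_score with average="macro"):

```python
import numpy as np

def macro_f1(y_true, y_pred):
    """Per-class F1 = 2TP / (2TP + FP + FN), then the arithmetic mean over classes."""
    scores = []
    for c in np.unique(y_true):
        tp = np.sum((y_pred == c) & (y_true == c))
        fp = np.sum((y_pred == c) & (y_true != c))
        fn = np.sum((y_pred != c) & (y_true == c))
        scores.append(2 * tp / (2 * tp + fp + fn) if (2 * tp + fp + fn) else 0.0)
    return np.mean(scores)
```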

The precision of transition-point identification was evaluated by RMSE. When there is only one transition point, we define an optimal segmentation point that maximizes the class difference on both sides. Suppose the left side of the segmentation point represents class 1, and the right side represents class 2. We first computed the duration time (frequency) of output labels for each class. Specifically, we denoted the duration time for label 1 on the left as τ11, label 2 on the left as τ12, label 1 on the right as τ21, and label 2 on the right as τ22, respectively. Likewise, we can use the macro-averaged F1 score to evaluate the assignment of duration to one of the two segments. For the first segment (class 1), we denote the true positive TP as τ11, the false negative FN as τ12, and the false positive FP as τ21. For the second segment (class 2), we denote the true positive TP as τ22, the false negative FN as τ21, and the false positive FP as τ12. The F1 scores were calculated by 2TP/(2TP + FP + FN), and then the macro average F1 score was calculated by taking the arithmetic mean of the F1 scores for the two segments (Fig. S25). The optimal segmentation point thus identifies different diffusion behaviors on either side of the segmentation point, each lasting for the longest continuous period.

We used the autocorrelation function ⟨X(t)X(t+τ)⟩ to evaluate the reconstruction performance, where the root mean-square error (RMSE) at zero time lag, R(0), was used to evaluate the denoising performance and the RMSE of the autocorrelation curve (AC) was used to evaluate the long-range behaviors. The AC took the part at lags greater than 2τ, where τ satisfied R(τ) = R(0)/2, and then the corresponding average was subtracted.
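The autocorrelation estimate used for this comparison can be written, in simplified form, as follows (a sketch for a 1D coordinate series; the 2τ cutoff and averaging step described above are omitted):

```python
import numpy as np

def autocorrelation(x, max_lag=None):
    """Estimate <X(t) X(t+tau)> for tau = 0 .. max_lag-1 after removing the mean."""
    x = np.asarray(x, dtype=float) - np.mean(x)
    n = len(x)
    max_lag = max_lag or n // 2
    return np.array([np.mean(x[:n - tau] * x[tau:]) for tau in range(max_lag)])
```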

Ablation experiments

To gain further insight into the features extracted through the self-supervised strategy, we simulated 2D trajectories consisting of three distinct motion patterns: Brownian, circular, and jump, each with an overlapping velocity distribution and lasting for a short segment. The short segments were based on the step displacement: for Brownian motion, x- and y-displacements were sampled from a normal distribution, N(0, 2DΔt), where D represented the diffusion constant, and Δt denoted the time step (1 frame). Circular motion tracks were determined by Rcos(ωΔt + θ) and Rsin(ωΔt + θ), where R represents the radius, ω corresponds to the angular velocity with half a circular motion completed in 30 frames, and θ denotes the drift angle of the last frame velocity. Jump motion tracks were defined by vcos(θ)Δt and vsin(θ)Δt, where v represents the drift velocity, and θ refers to the drift angle of the last frame velocity. The duration and type of each segment were randomly selected from a range of 30–90 frames, with an equal probability of 1/3 assigned to each of the three motion patterns. During each state update step, the pattern was either recalculated with a probability of 2/3 or left unchanged with a probability of 1/3. The total length was set to 10,000 frames for each trajectory, we generated 20 trajectories, and the subtrajectories were set to a historical length L = 30. It is important to note that 4DΔt = R²ω²Δt² = v²Δt² was kept throughout to maintain the same velocity distribution across different patterns. The detailed algorithm can be found in the supporting material (Note S1).

The SEES operator (35) distinguished various patterns by combining different experience vector construction methods and differentiation metrics with different aspects of the time series. The SEES/U-ED operators used the velocity feature with Euclidean distance. To avoid identity transformation, we fed the LSTM with the original trajectory coordinates except for the endpoint as input and the endpoint as the output, with the last hidden state functioning as the latent feature. The LSTM-AE employed two LSTMs as encoder and decoder, respectively, with the first LSTM's last hidden layer acting as the compressed z-latent vectors. The two-layer LSTM cell with 128 hidden neurons was implemented in the LSTM and LSTM-AE, while z-latent vectors were clustered by k-means (K = 5) with cosine distance. The results exhibited the reconstructed input segments, cluster centers, and separation with t-SNE. The performance of the classifiers was evaluated with an emphasis on the identification of changes between motion modes. The number of cluster centroids was set to K = 3, and a confusion matrix was generated to evaluate classification accuracy.

Nanoparticle transmembrane process

To assess the effectiveness of Deep-SEES when used with experimental data, we applied it to the analysis of the dynamic behavior of AuNRs moving through cellular membranes. The setup was similar to what we have described before (48). The cetyltrimethylammonium-bromide-stabilized AuNRs (NR-40–650) with a dimension of 85 × 40 nm were purchased from NanoSeedz (Hong Kong, China). U87 MG cells were obtained from American Type Culture Collection (Manassas, Virginia). Dulbecco’s phosphate-buffered solution and trypsin (0.25%, with EDTA) were purchased from Corning (New York, New York). High minimum essential medium with and without phenol red, fetal bovine serum, and penicillin-streptomycin solution were purchased from Gibco (Waltham, Massachusetts).

AuNRs and cells were imaged using a dark-field imaging system built on a Nikon 80i with a 100 W tungsten halogen lamp, an oil immersion dark-field condenser (NA 1.43–1.20), a 60× oil immersion objective (NA 0.7–1.25), and a color CMOS camera (DP74, Olympus, Tokyo, Japan). The system provided a total magnification of 60×, yielding an effective pixel size of approximately 0.098 μm/pixel. Polar angles of AuNRs were extracted from the dual-channel difference between the red and green channels.

In plastic cell culture dishes, we cultured highly viable third- to tenth-passage U87 MG cells on clean, sterile coverslips and maintained them in the high minimum essential medium without phenol red, 10% fetal bovine serum, and 1% penicillin-streptomycin solution at 37°C, 5% CO2, and a humidified atmosphere. When the number of U87 MG cells on the coverslip reached 40–60% confluence, we added 40 μL cetyltrimethylammonium-bromide-AuNRs and coincubated with the cells for 10 min. The coverslip was then removed and placed upside down on 100 μL of AuNR medium on top of the microscope grooved glass slide. This was done before viewing with the dark-field microscope.

TrackMate (49,50), a plugin of ImageJ commonly used in SPT research, was used to extract the trajectories of AuNRs. Generally, the estimated blob diameters of the AuNRs in the difference of the Gaussian detector were set to 0.8 μm with a threshold of 1. We used a simple LAP linker with a maximum link distance of 1–2 μm, and a gap tolerance usually set to 2 frames. To ensure the quality of the extracted trajectories, we typically applied a filter based on the number of spots in the track (>200).

Results

Deep-SEES

SPT analysis infers heterogeneous states of dynamical processes by analyzing the local diffusion properties of particle probes (Fig. 1 a) (26). While conventional techniques preset some feature space to estimate particle motion patterns, we extended this idea by learning diffusion features of subtrajectories in latent spaces, where the features embedded in the trajectories are approximated from the input data by minimizing a cost function.

Figure 1.

The general workflow of the self-supervised SPT trajectory analysis. (a) A typical data set consists of SPT data obtained by tracking particles' localization at each frame. (b) The detailed pipeline characterizing a trajectory: (i) a trajectory’s reconstructed position time series (blue) is decomposed into subtrajectories by a concatenation of moving windows (historical vector). (ii) The subtrajectories are fed to LSTM-VAE, and the extraction parameters contain the reconstructed trajectory (red) and the underlying states in the latent space (colors). (iii) Visualization of the reconstructed trajectories and different underlying states together provides insights into the corresponding heterogeneous properties. (c) The LSTM-VAE extracts SPT dynamics. The LSTM-Encoder compresses the noisy subtrajectories into the z-latent vectors and the LSTM-Decoder decompresses the z-latent vectors to recover the subtrajectories. These z-latent vectors contain information about each subtrajectory and are clustered by k-means. Each LSTM cell consists of a forget gate, input gate, output gate, hidden state, and cell state, as shown on the right. To see this figure in color, go online.

Generally, the SPT analysis is summarized in three steps: first, the reconstructed trajectories are decomposed into the concatenation of overlaid subsequences. Then, the parameters of each subsequence are extracted, such as velocity, diffusion coefficient Dt, scaling coefficient α, and autocorrelation. Finally, the spatiotemporal variations are visualized, with each parameter mapped to its corresponding states (Fig. 1 b). A key aspect of SPT analysis is finding appropriate functions that embed the original data manifold into the latent space, which typically requires physical models (28). However, we want to automatically and autonomously learn a mapping function that is independent of the presumed diffusion characteristics of the subtrajectories of the complex phenomenon.

We provide a data-driven approach called Deep-SEES, which employs an SPT analysis pipeline based on an LSTM-VAE (39,40) architecture to automatically search for the latent space (Fig. 1 c). The LSTM network captures sequential patterns over the longer term of time series, while the VAE framework extracts main features from the latent space. LSTM-VAE is a standard VAE that reconstructs the input itself by compressing the trajectory into a low-dimensional representation (Encoder) and reconstructing the trajectory from the latent representation (z-latent) while minimizing the reconstruction error (Decoder). The loss function ELBO is associated with capturing dynamics from observations while keeping the agreement between the latent and original data spaces.

The LSTM-VAE directly analyzes the original multivariate time series to bypass the requirement of prior feature selection and weight setting within and among the series that could introduce subjective bias. When the noisy time series fragments as input pass the LSTM-VAE block of Deep-SEES, the compressed z-latent feature is generated after learning. These z-latent features primarily play two roles: the first is reconstructing time series of nonlinear complicated dynamics, and the second is capturing underlying states (Fig. 1 b (ii), (iii)). Compared with the uncompressed vector spaces in the previous SEES method (35), the z-latent formed here, with only moderate compression but adequate for noise reduction, encodes a highly informative representation of the input subtrajectories and is therefore advantageous for state classification and interpretation. After training, the z-latent is partitioned into clusters using the k-means algorithm. The center of each cluster captures the dynamics of its respective state and can be subsequently mapped to the original trajectory space. Each trajectory point is assigned the cluster label corresponding to a unique color, which facilitates a more understandable representation of changes between dynamic states. Furthermore, the inclusion of a k-means loss function, in conjunction with the standard VAE loss function, enhances the separability of latent features throughout the training process (45).

To gain further insight into the features extracted through the self-supervised strategy, we illustrated the effect of each component in LSTM-VAE through an ablation study (Fig. S1; Note S1). For example, we used our previously developed SEES/U-ED (velocity with Euclidean distance, length L = 30, cluster centroids K = 5) algorithm to assess the simulated trajectories with three patterns (Brownian motion, Jump, and Circle), all with the same average velocity. The cluster centers resulting from SEES showed little difference in amplitude, and the dispersion performance of t-SNE (51) was poor due to inappropriate feature selection and the equal weight within historical fragments. In our pipeline, first we implemented the LSTM function, which applies the forget mechanism on the subtrajectories to model temporal correlations and multilevel lags between notable events (52). The LSTM was fed with original trajectory fragments except for the endpoint as input, and the output was the endpoint. The last hidden state retained long-term memory from the previous cell state with appropriate weighting, which contained the trend and shape of the time series. The clustering of these hidden states achieved separation in the low-dimensional manifold with satisfactory t-SNE performance. However, only one LSTM network alone after the input failed in subtrajectory reconstruction since it tended to overweight the information from the latest point. Therefore, we incorporated the VAE architecture for self-supervised learning. The AE is forced to learn the most important latent characteristics, while the reconstruction avoids nonessential sources of variation such as random noise (53). By generating a prior probabilistic distribution over the z-latent, the VAE framework ensured continuous mapping from the data manifold to the latent space. The cluster center with denoise behaviors exhibited significant separation of three patterns (Brownian motion, Jump, and Circle). The AE architecture alone cannot be used to assign probabilities or sample existing subtrajectories since the latent representations do not transition smoothly from one to the other (39). Subsequently, decoding is required to serve as an essential element to evaluate the quality and effectiveness of learned representations in preserving relevant features while minimizing information loss. For proof-of-concept, we evaluated the classification of different motion patterns using the confusion matrix of z-latent features derived from LSTM, LSTM-AE, LSTM-VAE, and LSTM-VAE without k-means loss, where the number of cluster centroids K was set to 3. When considering the number of cluster centroids, K, the separability of the subtrajectory average in the original space recovered from these centroids obtained in the latent space provides a criterion for choosing an appropriate K. As shown in the results, only the features derived from the LSTM-VAE framework demonstrate a combination of both interpretability and accuracy (Figs. S2 and S3).

Extraction dynamic states of noisy SPT system

Initially, we assessed the effectiveness and accuracy of Deep-SEES by using simulated trajectories with ground-truth states. We showed that the method could robustly preserve the statistical properties of these trajectories during the compression and decompression process through a latent space. First, we generated a slow-fast speed-switching trajectory concatenated by two portions with different diffusion coefficients. The data points were generated using an HMM consisting of four normal diffusion states, a 10% transition probability, and two sets of different occupancy probabilities. This process produced a fast and a slow variant, each with 20,000 frames and a localization error noise level r = 1 (Fig. 2 a). The LSTM-VAE neural network showed robustness with respect to hyperparameter selection (Fig. S4). We determined the optimal L by balancing reconstruction noise and dynamic state differentiation. Short subtrajectories (approximately 30 frames) were used to capture the temporal variation of the states under conditions of undersampling. This closely mimicked experimental systems used in single-particle studies, where the durations of the particle states were relatively short and different states may overlap.

Figure 2.

Deep-SEES applied to highly random single-particle trajectories. (a–d) A concatenated trajectory of slow-fast variation. (a) The reconstructed (red) and raw (blue) trajectory. The raw trajectory concatenated by a slow region (I) and a fast region (II) is generated by a four-state diffusion model with two different HMM occupancy probabilities, each containing 20,000 frames, with the localization error noise level r = 1. (b) The autocorrelation functions of the raw data, the reconstructed data, and the ground truth. The preprocessing strategy excludes local spatial information by rotating and mirroring operations T, as illustrated in the inset. (c) Differentiation of the two underlying states. Left: portions (∼1000 frames each) of the reconstructed trajectory. Top-right: average trajectories (L = 30) of the cluster centers of the slow (red) and the fast (purple). Bottom-right: histogram of the slow/fast state assignment in the two portions of the reconstructed trajectory. (d) Confusion matrix and F1 score for predicting the slow and fast motions. (e–h) A concatenated trajectory of three different confinement regimes. (e) The reconstructed (red) and raw (blue) trajectory. The raw trajectory, with the localization error noise level r = 1, is generated by the fBm model and contains three regions of different confinement: α = 0.5 (I), α = 1.0 (II), and α = 1.5 (III), all with a similar step size distribution (20,000 frames for each region, while only 2000 frames are displayed for III). (f) The autocorrelation functions of the raw data (blue), the reconstructed data (red dot), and the ground truth (black). (g) Differentiation of the three underlying states. Left: portions (∼1000 frames each) of the reconstructed trajectory. Top-right: average trajectories (L = 40) of the cluster centers of the confined (red), the Brownian-like (green), and the directed (purple). Bottom-right: histogram of the state assignment in the three portions of the reconstructed trajectory. (h) Confusion matrix and F1 score for predicting the confined, Brownian-like, and directed behaviors. Note that the noise level, denoted by r = σ/√(2DΔt), is defined as the ratio of the standard deviation of additional Gaussian white noise N(0, σ) to the square root of twice the product of the average diffusion coefficient, D, and the time interval, Δt. To see this figure in color, go online.

Upon completion of the learning process, we wanted to capture and evaluate the spatiotemporal shape of the filtered trajectory rather than the particle moving direction in the x-y plane. To focus on the shape variation of the subtrajectories while eliminating the directional stochasticity of local diffusion, we first implemented a preprocessing strategy for the input time sequence. In brief, we rotated all subtrajectories to ensure that the final position was situated on the x axis, and mirrored them along the x axis if the largest y coordinate value of the trajectory was negative (Fig. 2 b, inset). In this way, trajectories were decomposed into two components: a continuously varying term and a noise term. After decoding, the first continuously varying term was reconstructed, preserving the time-dependent statistical properties. The autocorrelation of the trajectory reconstructed by Deep-SEES almost exactly overlapped with that of the original trajectory, indicating that the main long-term-varying dynamics were properly captured (Fig. 2 b). Indeed, the MSD curves of the original and reconstructed data are similar, indicating that the reconstructed trajectory preserved the statistical characteristics of nanoparticle motions within the region explored by the particle. The reconstructed MSD displayed a slight bias compared with the original MSD, as some time-independent white noise was removed in the highly random processes (Fig. S5, a and b). In contrast, when applying Deep-SEES directly without additional preprocessing, the features extracted from the z-latent layer primarily represented x-y coordinate variations of the single particle, rather than the desired subtrajectory shape differences or local dynamic interactions (Fig. S5, c and d).

The optimal choice of the number of centroids K depended on the separability of the latent features (see materials and methods for details). In the case of the system with fast and slow variants, for example, the z-latent vectors were clustered with K = 2 centroids. When a higher number of centroids was used, such as K = 4, the overlap of cluster centers reduced the discrimination between different states (Figs. S4 i and S6). The trajectories of the cluster centers (trajectory averages) of the slow motion and the fast motion were displayed in Fig. 2 c (top right). Each point on the recovered trajectory was colored according to the state assignment of their subtrajectories, and the representative portions were shown (Fig. 2 c, left). Despite the short-range stochasticity of the particle motion, the majority of data points in the slow and fast variants were accurately assigned to their corresponding states (Fig. 2 c, bottom-right). The classification accuracy could also be evaluated using a confusion matrix (28), where the rows represent the ground truth and the columns correspond to the predicted labels. Our results showed an F1 score of 94.03, with classification accuracy of 96.22 for the fast variant and 91.83 for the slow variant (Fig. 2 d). Furthermore, compared with traditional features, such as the velocity and MSD, the approach demonstrated improved accuracy in pattern recognition (Fig. S7) and maintained robust performance in high-noise environments (Fig. S4 f).

Next, we demonstrated the Deep-SEES method in highly random particle confinement trajectories. We generated a joint trajectory of anomalous diffusion containing three regions with different types of confinement: a subdiffusion (α = 0.5, region I), a normal diffusion (α = 1.0, region II), and a superdiffusion (α = 1.5, region III). Using the three given values of α, the trajectory was simulated by fBm (28) with similar step size and velocity distributions (Fig. S8). For each region, the trajectories were sampled at 60 Hz with a total length of 20,000 points, and the historical vector length L = 40 was set to obtain subtrajectory fragments.

After learning, the global behavior or the spatiotemporal shape of the trajectory was captured (Fig. 2, e and f). We selected K = 3 to illustrate the underlying states of these subtrajectories (Figs. 2 g, S9, and S10). The cluster centers displaying confined, Brownian-like, and directed behavior were denoted in red, green, and purple, respectively (Fig. 2 g, top right). All the data points on the recovered trajectory were colored according to the state assignment of their subtrajectory, and the representative parts were displayed (Fig. 2 g, left). In spite of the stochasticity of the particle motion within a short range, the majority of points in the subdiffusion region (I) and the superdiffusion region (III) were correctly assigned to the corresponding states of constrained and directed motion, respectively (Fig. 2 g, bottom right). In contrast, the subtrajectories in the normal diffusion region (II) were assigned to three mixed states, consistent with frequent transitions between the three confinement states of a Brownian particle (Fig. 2 h).

The accuracy of classification could be improved if an additional dimension was introduced. Here, we fed into Deep-SEES a joint time series of the trajectory of the particle translational motion mentioned above plus the trajectory of the particle rotational angles generated by reflected fBm (47): I (α = 0.5), II (α = 1), and III (α = 0.05), with similar angular velocity distributions (Fig. S11 a). The averaged trajectories showed notable separation in x-y polar coordinates, while the cluster probability distribution showed a mapping of the three clusters to the regions (Fig. S11 b). For comparison, we utilized k-means with the same K = 3 centroids and L = 40 historical length parameters to cluster the MSD and the polar angle features separately. Although region III contained many restricted components, MSD showed little difference between region I and region II. The polar angles also showed similar patterns, with little difference between region I and region III, while region II contained many low angle values (Fig. S12).

Finally, we showed that the Deep-SEES method outperformed commonly used methods in inferring transient states in SPT trajectories: DC-MSS (13) and HMMs (15). DC-MSS characterizes molecular motion by calculating multiple moments of the displacement distribution, while HMM represents specific states of motion with a particular diffusion coefficient and velocity based on single-step displacements. Segmentation and classification performance for transient motion analysis was evaluated on multiple short trajectories (200 frames) simulated with transitions between free, confined, and directed diffusion (Fig. 3, a–c), similar to the setup described before (13). First, the F1 score (28), which quantifies the accuracy of assigning segments to the actual motion types, was calculated for motion switches between free diffusion and drift diffusion (DE = vΔt/√(4DΔt), Fig. 3 a), as well as motion switches between free diffusion and confined diffusion (CE = √(4DΔt)/R, Fig. 3 c). The results showed that Deep-SEES can distinguish other motion types from free diffusion over a wider range of parameters, which suggested its ability to identify the inherent long-range correlation within trajectories. We then evaluated the sensitivity of our method to noise at a level of r = 0.75, using multiple trajectories simulated at CE = 0.7 and DE = 0.8, with random intervals for transition points (Fig. 3 b). Compared with DC-MSS, Deep-SEES was more accurate in identifying motion types in the presence of noise, particularly confined and directed motions, as shown by the confusion matrix predicting the three motion types (Fig. 3, d and e). The F1 score for classification of diffusion patterns in the first and second segments (Fig. 3 f) and the precision of transition-point identification by RMSE (Fig. 3 g) were also evaluated. The results showed that Deep-SEES had higher accuracy and was more sensitive to turning points than DC-MSS. In contrast, HMM was highly sensitive to noise and could only recognize one class. Finally, the F1 scores and RMSE validated the ability to segment and classify motion types under different noise levels, and the results showed that Deep-SEES could accurately identify transitions between different types of diffusion patterns (Figs. S13–S21).

Figure 3.

Evaluation of segmentation and motion classification performance for transient mobility analysis. (a–c) F1 scores for Deep-SEES (red), DC-MSS (green), and HMM (purple) for identifying switches between free diffusion, confined diffusion, and directed diffusion in multiple short trajectories (200 frames). (a and c) The results of switching between free diffusion and drift diffusion (DE = vΔt/√(4DΔt)), and switching between free diffusion and confined diffusion (CE = √(4DΔt)/R), respectively. Note that the switching point is at the midpoint of each trajectory (100/100). (b) The data set consisting of multiple trajectories simulated at CE = 0.7 and DE = 0.8, with a diffusion coefficient of D = 2 pixels2/frame, and with transition points occurring at random intervals. (d–g) Evaluation of the segmentation and classification of the results with localization error N(0, σ_p) at noise level r = σ_p/√(2DΔt) = 0.75. (d and e) The confusion matrix for predicting confined diffusion, free diffusion, and directed diffusion with Deep-SEES and DC-MSS, respectively. HMM only recognizes one class, so its results are not listed here. (f) F1 score of the diffusion pattern classification for the first and second segments, as a function of the location of the change point along the trajectory. (g) The root mean-square error (RMSE) for the optimal change point along the trajectory. (h and i) The F1 score and RMSE are used to validate the segmentation and classification ability under varying noise levels (change point ∼60–140). These results demonstrate that Deep-SEES can accurately identify transitions between different types of diffusion modes over a wider range of parameters and noise than DC-MSS and HMM. To see this figure in color, go online.

Nanoparticle transmembrane process

To evaluate the utility of Deep-SEES on experimental data, we applied it to analyze the dynamics of AuNRs moving on cell membranes. As a typical endocytosis process, nanoparticle transmembrane transport is strongly associated with the targeted transport of drugs, such as in cancer immunotherapy (54). The experimental SPT setup is shown in Fig. 4 a, as we have described before (48): a dark-field imaging system (55) was used to record sequences of color images of AuNRs (85 × 40 nm). We simultaneously monitored the translational and rotational motion of the nanoprobes interacting with U87 MG cells (adherent human brain astrocytoma cells), where the polar angle of rotation was calculated from the dual-channel difference between the red and green channels (48). The sampling rate was 30 Hz, the total time range was approximately 500 s, and the history length L was 30.

Figure 4

Nanoparticle transmembrane process. (a) Dark-field microscopy setup for simultaneous monitoring of the translation and rotation of AuNRs moving across cell membranes. (b) The cluster centers (average trajectories) of the subtrajectories. The colors indicate the main motion patterns of the system. (c) Map of the subtrajectory labels in the MSD feature space. Error bars represent the standard deviation of MSD values for different subtrajectories within the same label. The inset shows subtrajectories with different labels. (d) Left: coloring the trajectory by label reveals the detailed diffusion on the membrane and the long-distance directional movement. Right: the label sequence of the cluster centers over time. Note that gray represents subtrajectories filtered out by the Brownian mask. (e) Tracks labeled blue in (b), i.e., confined motion with small polar angles, are further clustered into 13 groups. The results show a clearly distinguished straight red track with small polar angles. (f) Combined with their durations, the distinct line-like patterns correspond to the highlighted parts of the reconstructed time series, which are identified as the transmembrane site. To see this figure in color, go online.

Previously, in the SEES method (35), we examined the temporal variations of velocity and polar angle independently. Here, we performed a direct analysis on the raw multivariate time series, which allowed simultaneous evaluation of different parameters from the trajectories, such as x-y displacements, velocity, and rotation angle, minimizing subjective bias. Because the unsupervised clustering is performed in the latent space, the effectiveness of the clustering can be visualized with a t-SNE-like dimensionality reduction approach, providing a comprehensive view of the analysis.
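As a minimal illustration of this step, the sketch below assembles such a multivariate series from a single track and projects precomputed z-latent vectors to two dimensions with t-SNE; the feature columns and the variable names (xy, polar_angle, z, labels) are illustrative assumptions rather than the published pipeline.

```python
# Hedged sketch: build a multivariate series for one track and visualize latent
# clusters with t-SNE. The feature columns and variable names are illustrative.
import numpy as np
from sklearn.manifold import TSNE
import matplotlib.pyplot as plt

def multivariate_series(xy, polar_angle):
    """Stack x-y displacements, speed, and polar angle into a (T-1, 4) array."""
    disp = np.diff(xy, axis=0)                                  # per-frame x, y displacements
    speed = np.linalg.norm(disp, axis=1, keepdims=True)
    return np.hstack([disp, speed, polar_angle[1:, None]])

def plot_latent(z, labels):
    """Project z-latent vectors to 2D with t-SNE and color points by cluster label."""
    emb = TSNE(n_components=2, init="pca", perplexity=30).fit_transform(z)
    plt.scatter(emb[:, 0], emb[:, 1], c=labels, s=5, cmap="tab10")
    plt.xlabel("t-SNE 1")
    plt.ylabel("t-SNE 2")
    plt.show()
```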

However, unbalanced data can still pose challenges; for example, a dominant diffusion mode acting as a background reduces both the separability of the latent features and the resolution of hierarchical structure information. In particular, when a single particle was used to probe a heterogeneous interface, such as the live cell membrane, the obtained time series contained several fragments of white-noise-like Brownian motion of the probe due to the discontinuity of the particle-interface interactions. As many trajectories were dominated by Brownian motion, direct clustering in the latent space suffered from reduced segmentation accuracy. Here, we proposed a Brownian mask to focus on the non-Brownian processes. In brief, we estimated the scaling coefficient α and the power-law fitting error of each subtrajectory, both obtained from its MSD curve. Before clustering the z-latent vectors, we applied manually selected thresholds to the two parameters and excluded the out-of-bounds subtrajectories (Fig. S22; Note S2).
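A minimal sketch of such a mask is shown below; the lag range, the α band, and the fitting-error threshold are hypothetical placeholders chosen for illustration, not the values used in the paper.

```python
# Hedged sketch of the Brownian mask: estimate alpha and the power-law fitting
# error from each subtrajectory's MSD, then exclude windows that look like pure
# Brownian motion. The thresholds below are illustrative placeholders.
import numpy as np

def msd(xy, max_lag):
    """Time-averaged MSD of a (T, 2) subtrajectory for lags 1..max_lag."""
    return np.array([np.mean(np.sum((xy[lag:] - xy[:-lag]) ** 2, axis=1))
                     for lag in range(1, max_lag + 1)])

def brownian_mask(subtrajs, dt, max_lag=10, alpha_band=(0.8, 1.2), max_fit_err=0.05):
    """Return True for windows to KEEP (non-Brownian) before latent clustering."""
    keep = []
    lags = np.arange(1, max_lag + 1) * dt
    for xy in subtrajs:
        m = msd(xy, max_lag)
        coeffs, residuals, *_ = np.polyfit(np.log(lags), np.log(m), 1, full=True)
        alpha = coeffs[0]                                        # slope of log-log MSD fit
        fit_err = residuals[0] / max_lag if residuals.size else 0.0
        is_brownian = (alpha_band[0] < alpha < alpha_band[1]) and (fit_err < max_fit_err)
        keep.append(not is_brownian)
    return np.array(keep)
```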

In practical applications, the transmembrane process represented a significant dynamic change during the particle-cell interaction. However, the transmembrane event is rare, and its exact site is difficult to identify on a long trajectory spanning the entire process of a nanoparticle entering the live cell, because it is hidden within the lengthy on-membrane diffusion stage. To pinpoint the entry site, a cascading segmentation strategy was used to reveal the features of the nanoparticle transmembrane process sequentially. This approach combines an initial coarse segmentation with a subsequent fine reclustering of selected regions to capture more intricate details of the nanoparticle motion on the cell membrane. First, we analyzed the entire 15,000-frame trajectory with an optimally chosen K = 4 centroids; the average trajectories exhibited four modes: superdiffusion, directed motion, confined motion with a large polar angle, and confined motion with a small polar angle (Fig. 4 b). The modes were compared by mapping the labels back onto the original trajectories in the MSD feature space, and the α values agree with the diffusion patterns of the average trajectories in the translational domain (Fig. 4 c). Here, directed motion and superdiffusion shared a common α = 1.26 at shorter time lags; at longer time lags, however, the slope of the superdiffusive mode decreases. Both confined modes exhibited α = 0.74 but manifested two distinct patterns based on the associated polar angles: one with relatively large polar angles, all greater than 50°, and the other with relatively small polar angles, all less than 40°. The color sequence mainly shows the diffusion across the membrane after the initial adsorption and the subsequent intracellular long-distance transport (Fig. 4 d). The long-distance transport stages exhibit directional motion, while diffusion on the membrane alternates between superdiffusion, confined motion, and directional motion before the membrane-crossing event.

Subsequently, in situations where the signal/noise ratio is high enough, the transmembrane site can be located visually by searching for a narrow region where the nanorod undergoes confined motion with minimal polar angles, or assumes an orientation perpendicular to the local membrane surface. For the noisy data recorded here, after applying Deep-SEES with K = 4 to the entire long trajectory and obtaining the denoised reconstructed time series, we could confirm that the transmembrane site lies within the confined region with small polar angles. This region is composed of 15 short segments discretely distributed throughout the original long trajectory, whose durations vary from about 1 s up to 50 s and in which the total displacement of the nanorod is less than 1 μm. However, locating the transmembrane site among the 15 segments remained difficult because the differentiation power of the algorithm was insufficient at such a small K. Therefore, we reclustered these confined, small-polar-angle subtrajectories using a larger K of 10–15, such as K = 13 (Figs. 4 e and S23), and obtained a small area clearly distinguished from all others. Projection of the distinct line-like patterns back onto the reconstructed time series matched the transmembrane site with high precision (Figs. 4 f and S24). The transmembrane displacement of the particle is extremely small, and the polarization angle is close to 0°, consistent with the previous report (56).
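The two-stage clustering can be sketched as follows; the latent array z, the target coarse label, and the default K values are illustrative assumptions, with K = 4 and K = 13 taken from the analysis above only as defaults.

```python
# Hedged sketch of the cascading segmentation: coarse k-means on the latent
# vectors, then re-cluster only the windows assigned to the coarse state of
# interest with a larger K. The array z and target_label are illustrative.
import numpy as np
from sklearn.cluster import KMeans

def cascade_segment(z, coarse_k=4, fine_k=13, target_label=0):
    """Two-stage clustering of latent vectors z with shape (n_windows, latent_dim)."""
    coarse = KMeans(n_clusters=coarse_k, n_init=10, random_state=0).fit_predict(z)
    sel = np.where(coarse == target_label)[0]       # e.g., confined motion with small polar angle
    fine = KMeans(n_clusters=fine_k, n_init=10, random_state=0).fit_predict(z[sel])
    return coarse, sel, fine
```

Mapping the fine labels of the selected windows back onto the reconstructed time series then highlights the small, well-separated group that marks the candidate transmembrane site.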

High-speed imaging of nanoparticles

To further evaluate the utility of Deep-SEES on noisy data, we applied it to analyze the translation and rotation of AuNRs under high-speed imaging. High-speed imaging allows us to capture the transient diffusion of nanoparticles in free solution, but the low photon flux also leads to a high noise level. Here, we tested the performance of Deep-SEES on noisy trajectories obtained in two different systems: an active enzyme system and a protein LLPS system. For signal measurement, a birefringent prism separated the scattering of a single AuNR into two orthogonal polarization components, so that the translation and rotation of the AuNR could be retrieved from the movement of the two spot centers and the change in the two spot intensities, respectively. In each case, ∼100 trajectories were recorded at a sampling rate of 1000–2000 Hz for 0.5 s (Fig. 5 a).
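A minimal sketch of this two-spot readout is given below; it only forms the particle position from the spot midpoints and a normalized, orientation-sensitive intensity signal, while the conversion to an absolute angle follows refs (43,44) and is not reproduced here. All array names are illustrative.

```python
# Hedged sketch of the two-spot readout: particle position from the midpoint of
# the two polarization spots and normalized, orientation-sensitive intensity
# signals. Conversion to an absolute angle is omitted; names are illustrative.
import numpy as np

def two_spot_readout(spot_x, spot_y, Ix, Iy):
    """spot_x, spot_y: (T, 2) coordinates of the two spots; Ix, Iy: (T,) spot intensities."""
    position = np.stack([spot_x.mean(axis=1), spot_y.mean(axis=1)], axis=1)  # translation
    total = Ix + Iy
    return position, Ix / total, Iy / total, (Ix - Iy) / total               # normalized channels
```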

Figure 5

Classification of nanoparticle rotation and translation in high-speed imaging. (a) Optical setup of the high-speed laser dark-field microscopy (HSLDFM). The scattered light from a single AuNR is separated into two orthogonal polarization components by a birefringent prism. The movement of the two spot centers and the change of the two spot intensities represent the translation and rotation, respectively. (b) The cluster centers of translational trajectories show the main patterns in the catalytic enzyme system. (c) More ballistic motion (purple) is observed in the reconstructed trajectories than in the original trajectories (inset). Scale bar, 5 μm. (d) The intensities of the two channels are used as trajectories, showing the rotation in solution and in liquid-liquid phase separation (LLPS) droplets. The dynamics inferred from Ix and Iy are shown as time series and subtrajectories (L = 30), where high-frequency components are represented by their standard deviations σ. The scale bar of normalized intensity is 0.1. (e) The cluster centers of the Ix, Iy, and σ time series show three distinct patterns: fast rotation (purple), medium rotation (green), and slow rotation (red). (f) The histogram shows that the slowly rotating fraction increases over time, indicating that the LLPS condensates become more viscous. Error bars represent the standard deviation of classification percentages for different samples. To see this figure in color, go online.

In the study of the active enzyme system (43), enhanced diffusion and even intermittent ballistic motion of single AuNRs were successfully revealed in the reconstructed trajectories. We selected K = 3 to classify the underlying states of the subtrajectories into confined (red), Brownian-like (green), and directed (purple) motion (Fig. 5 b). The reconstructed trajectories clearly revealed more details of the enhanced translation than the original ones (Fig. 5 c).

In the LLPS system, protein aggregation in solution leads to the formation of membrane-free granular structures, and the rotation of the AuNRs encapsulated inside the droplets provides rich information on the internal heterogeneous dynamics and structure-function relationships (44). To extract the rotational states, we first fed the original data into our network to obtain smoothed two-channel time series. We also calculated the standard deviation σ of the difference between the reconstructed and original time series to capture rapid fluctuations in the solution (Fig. 5 d). Then we used the three time series (Ix, Iy, and σ) to retrieve the dynamics and underlying states during the LLPS process with no prescribed models. Here, we studied the LLPS of PGL protein droplets after 10, 60, and 240 min of external thermal stress. In Fig. 5 e, we show the recovery of three patterns: fast rotation, medium rotation, and slow rotation. Compared with the predominantly fast rotation in the control solution, the fraction of slow rotation increases with time, indicating that the membrane-free condensates become more viscous during LLPS (Fig. 5 f).
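A minimal sketch of this step follows; the rolling-window residual standard deviation, the windowing, and the clustering call are illustrative assumptions about how the (Ix, Iy, σ) features could be assembled, not the published implementation.

```python
# Hedged sketch: rolling standard deviation of the reconstruction residual as a
# fast-fluctuation feature, then k-means on windowed (Ix, Iy, sigma) features.
# Window length L = 30 and K = 3 follow the main text; the rest is illustrative.
import numpy as np
from sklearn.cluster import KMeans

def rotational_states(I_orig, I_recon, L=30, k=3):
    """I_orig, I_recon: (T, 2) original and reconstructed two-channel intensities."""
    resid = I_orig - I_recon
    T = len(resid)
    sigma = np.array([resid[max(0, t - L):t + 1].std() for t in range(T)])  # rolling residual std
    feats = np.column_stack([I_recon, sigma])                               # (T, 3): Ix, Iy, sigma
    n_win = T // L
    windows = feats[:n_win * L].reshape(n_win, L * feats.shape[1])          # non-overlapping windows
    labels = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(windows)
    return sigma, labels
```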

Discussion

Classifying the local dynamics of particle trajectories offers insights into subtle features of probe motion over time, such as the heterogeneity of the interplaying environment (e.g., cell membrane regions with different viscosities) and different transport states of the probe (e.g., molecular motors with different activation). Here, we propose a data-driven methodology, called Deep-SEES, a generic framework based on an LSTM-VAE that extracts latent features from single-particle trajectories without prior knowledge of the ground truth or the need for high-quality data sets. Our results suggest that the latent space may serve as an important bridge to model or distinguish different types of diffusive motion and enhances the capability of detecting changes in probe dynamics in noisy environments.
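To make the architecture concrete, a minimal LSTM-VAE sketch in PyTorch is given below. The layer sizes, single-layer LSTMs, and loss weighting are assumptions chosen for brevity and are not the published Deep-SEES settings.

```python
# A minimal LSTM-VAE sketch, included only to make the architecture concrete.
# Layer sizes and the loss weighting are illustrative assumptions.
import torch
import torch.nn as nn

class LSTMVAE(nn.Module):
    def __init__(self, n_vars=2, hidden=64, latent=8):
        super().__init__()
        self.encoder = nn.LSTM(n_vars, hidden, batch_first=True)
        self.to_mu = nn.Linear(hidden, latent)
        self.to_logvar = nn.Linear(hidden, latent)
        self.from_z = nn.Linear(latent, hidden)
        self.decoder = nn.LSTM(hidden, hidden, batch_first=True)
        self.to_out = nn.Linear(hidden, n_vars)

    def forward(self, x):                        # x: (batch, L, n_vars) subtrajectory windows
        _, (h, _) = self.encoder(x)              # final hidden state summarizes the window
        h = h[-1]
        mu, logvar = self.to_mu(h), self.to_logvar(h)
        z = mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)   # reparameterization trick
        dec_in = self.from_z(z).unsqueeze(1).repeat(1, x.size(1), 1)
        out, _ = self.decoder(dec_in)
        return self.to_out(out), mu, logvar      # reconstruction and latent statistics

def vae_loss(x, x_hat, mu, logvar, beta=1.0):
    recon = nn.functional.mse_loss(x_hat, x)     # reconstruction term
    kld = -0.5 * torch.mean(1 + logvar - mu.pow(2) - logvar.exp())  # KL regularizer
    return recon + beta * kld
```

In such a sketch, the per-window mean vectors mu would serve as the z-latent features that are subsequently clustered.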

Compared with direct trajectory filtering and subtrajectory classification, the latent space is a highly informative representation of the compressed data: high-dimensional subtrajectories of similar shape lie closer together in the vector space, and the cluster centroids are related to statistical motion features such as the scaling coefficient α and the diffusion coefficient Dt. During moderate compression and the subsequent decompression, Deep-SEES removes the time-independent components as noise through optimal reconstruction while preserving the time-dependent components of long-term time series with different fine structures, even in highly stochastic trajectories. Many types of noise inherent in experimental systems (e.g., camera noise, localization errors, and tracking errors) reduce the resolution of particle motion and make it difficult to infer states accurately over a short duration. Noise has an obvious impact on the analysis of statistical features such as velocity, diffusion coefficient, and α, leading to significant degradation in the performance of DC-MSS (13) and HMMs (15). In addition, unsupervised techniques on unlabeled data, especially k-means (46), are sensitive to noise. Herein, we show that, by using Deep-SEES, the separability of the latent features can be improved by removing the time-independent components, which are treated as noise. We performed a comparison and ablation study on the slow and fast variants as well as on the Vega data set (13), focusing on segmentation tasks. Only the features derived from the LSTM-VAE achieved a balance between interpretability and accuracy (Figs. S17–S21; Tables S1 and S2).
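One rough way to quantify this separability gain is sketched below, comparing silhouette scores of k-means partitions on the raw windows and on the latent vectors; silhouette values computed in different feature spaces are only indicative, and the array names and K value are illustrative.

```python
# Hedged sketch: compare cluster separability of raw subtrajectory windows and
# latent vectors via silhouette scores of k-means partitions. Only indicative.
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score

def separability(raw_windows, z, k=4):
    """raw_windows: (n, L*n_vars) flattened subtrajectories; z: (n, latent_dim)."""
    scores = {}
    for name, feats in (("raw", raw_windows), ("latent", z)):
        labels = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(feats)
        scores[name] = silhouette_score(feats, labels)   # higher ~ better-separated clusters
    return scores
```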

While the performance of classification and segmentation using direct k-means clustering does not show substantial improvement before and after direct filtering of the original trajectories (Fig. S7, a and b), the latent space representations provide notable advantages. Compared with the original positions or traditional features, the latent representations have reduced dimensionality and noise (Figs. S7, c and d, S12, and S16), making it possible to extract nonlinear essential components associated with the entanglement of multiple variables. Regarding the latent dimension (Figs. S4, S9, and S14), we observed that accurate classification and segmentation can be achieved if the dimensionality is not reduced excessively; in our case, a compression factor ranging from approximately 2 to 8 was applicable. We have performed a comparison between our approach and DC-MSS (13); however, a direct comparison with the deep learning version, DL-MSS (34), is not possible because DL-MSS uses pretraining from simulations to define immobile, slow, and fast states, which gives it the advantage of supervised learning. Our modeling approach is purely data driven and relies on the raw trajectory data alone. The minimum data requirement is modest: a total of at least about 20,000 frames, which corresponds to roughly 100 trajectories of about 200 frames each. For example, the fast-slow variants case used about 40,000 frames, the different-confinement case used about 60,000 frames, and the transmembrane experiment used about 30,000 frames. Moreover, high-fidelity reconstruction was accomplished for complex systems exhibiting significant periodicity and continuous dynamics at noise levels as high as r = 0.6, such as the Lorenz and Roessler systems, ECG measurements, and Kolmogorov flow dissipation data with added noise (Figs. S26–S32; Note S3). The reconstructed time series successfully capture both steady and transient features, suggesting the potential of this method to handle a wide range of complicated dynamics without oversimplifying or overfitting.

The history length L is an important guiding parameter in Deep-SEES, as it affects the accuracy of trajectory reconstruction and the preservation of stochasticity in the system. Although the history length has little influence on the overall shape of the reconstructed trajectory, choosing too short a history length leads to noisy results, while choosing too long a history length oversimplifies the dynamics and leads to a loss of stochastic properties (Figs. S4, S9, and S14). Furthermore, the choice of L should be based on the local length scales of the trajectory that effectively separate the underlying states. The parameters CE and DE in Fig. 3 quantify the switching between confined diffusion and free diffusion and between drift diffusion and free diffusion, respectively. In our case, for systems exhibiting significant stochasticity, the relationship MSD(tL)/(2DtL) > 2 had to be satisfied to achieve robust dynamic change detection. However, if L was longer than half of the duration of a state, the large overlap between windows would obscure the differences among underlying states, decreasing the accuracy of classification and segmentation. The optimal number of centroids K in the latent space is governed by the trade-off between the separability of the latent features and the average duration of each state. The selection should ensure that the cluster centroids, which appear as average subtrajectories in the original space, do not overlap, and that K lies close to the fast-to-flat turning point of the curve of average state duration versus K (Figs. S3, S6, S10, and S13; Notes S4 and S5). Such a choice helps to improve the performance and interpretability of the resulting cluster centroids. In fact, when evaluating the false alarm rate of segmentation, i.e., the detection of change points when no actual change point is present, the natural randomness of the system still produced a few distinct states. To improve the effectiveness of the segmentation process, the best transition point can be selected by finding the point with the largest difference between the classes on either side of the change point (Figs. S14 and S15). Moreover, in multiparameter analysis, classification and segmentation suffer from unbalanced data (e.g., a large number of trajectories dominated by Brownian motion), where the distance within one class can be greater than that between other classes. We show the flexibility of our method by implementing constraints to reveal features of interest, such as a cascading segmentation strategy that initially employs coarse segmentation and subsequently focuses on fine examination or reclustering of selected regions (Figs. S16 and S22).
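The two heuristics above can be sketched as simple checks; the function names, the interpretation of the MSD criterion, and the curvature-based proxy for the fast-to-flat turning point are illustrative assumptions.

```python
# Hedged sketch of the two heuristics: (i) accept a history length L only if
# MSD(t_L) / (2 D t_L) > 2, and (ii) pick K near the fast-to-flat turning point
# of the mean state duration versus K. D, the MSD curve, and the duration curve
# are assumed to be estimated elsewhere.
import numpy as np

def history_length_ok(msd_curve, dt, D, L):
    """msd_curve[lag - 1] is the time-averaged MSD at lag * dt."""
    t_L = L * dt
    return msd_curve[L - 1] / (2 * D * t_L) > 2

def elbow_k(mean_durations):
    """mean_durations: dict {K: average state duration}. Return K at the largest
    change of slope, a simple proxy for the fast-to-flat turning point."""
    ks = sorted(mean_durations)
    d = np.array([mean_durations[k] for k in ks], dtype=float)
    curvature = np.diff(np.diff(d))                 # second difference of the duration curve
    return ks[int(np.argmax(curvature)) + 1]
```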

A limitation of Deep-SEES stems from its rolling-window methodology, whose window-length selection requires a balance between the sensitivity of motion-state switch detection and the classification accuracy. Thus, the direct identification of distinct, potentially short-lived states can be challenging in stochastic systems, given the history length L requirement. Furthermore, while Deep-SEES enables robust dynamic change detection based on latent-space clustering in high-noise environments, bypassing the need for ground truths or model categories, this segmentation process inherits the limitations of the k-means clustering technique. Manual intervention, such as parameter selection and thresholding, remains essential to obtain meaningful insights and to emphasize features of interest. Overcoming this shortcoming may be possible with more advanced clustering algorithms in the future.

Conclusions

Here, we propose a data-driven methodology, called Deep-SEES, which enables the segmentation of underlying states from single-particle trajectories without prior ground truth or the need for high-quality data sets. Given the intrinsic noise and undersampling in single-particle tracking, inferring underlying states from short subtrajectories is challenging. Conventional methods need to first filter the noise from the real signal and then classify the signal into states accordingly. Generally, these methods encounter two main challenges. First, unsupervised classification algorithms are sensitive to noise, but the stochastic nature of single-particle tracking makes it infeasible to fully separate the noise from the real signal. Second, supervised algorithms need data sets that include different models, but it is impossible to enumerate all models in the infinite model space.

Our pipeline is a generic framework that reconstructs the trajectory from the raw data while learning the true data distribution to extract crucial latent features. The VAE architecture guarantees a smooth mapping between the trajectory space and the latent representations, and the z-latent space encodes the differences between trajectories as well as the correlations among multiple variables. As a result, the reduced-dimensional z-latent space retains the statistical properties of the original trajectory and can be applied for accurate and robust trajectory classification and segmentation.

We have focused on the inherent correlations within the multivariate time series of an individual trajectory, but a more sophisticated analysis is possible by examining the correlations between multiple trajectories obtained simultaneously. The self-generating z-latent features of trajectories expand our ability to better understand, estimate, and predict the behavior of many different complex systems.

Data and code availability

The data sets and the implementation of Deep-SEES are available in the GitHub repository at https://github.com/EdwardZX/Deep-SEES.

Author contributions

Y.Z. and Y.H. conceived this project. Y.H. and H.X. supervised this research. Y.Z. designed detailed implementations and processed the data. X.L., J.X., and Y.S. set up the imaging system and performed experiments. All authors participated in critical discussions about the results and the writing of the paper.

Acknowledgments

We thank Hansen Zhao and Qi Pan for helpful discussions and suggestions on the paper. This work was supported by the National Natural Science Foundation of China grant nos. 22127807, 21425519, 21621003, and 91853105.

Declaration of interests

The authors declare that they have no competing interests.

Editor: Gerhard Schütz.

Footnotes

Supporting material can be found online at https://doi.org/10.1016/j.bpj.2023.10.023.

Contributor Information

Hao Xie, Email: xiehao@tsinghua.edu.cn.

Yan He, Email: yanhe2021@mail.tsinghua.edu.cn.

Supporting material

Document S1. Figures S1–S32, Tables S1 and S2, and Notes S1–S6
mmc1.pdf (9.3MB, pdf)
Document S2. Article plus supporting material
mmc2.pdf (12.7MB, pdf)

References

1. Montiel D., Yang H. Real-time three-dimensional single-particle tracking spectroscopy for complex systems. Laser Photon. Rev. 2010;4:374–385.
2. Chenouard N., Smal I., et al. Meijering E. Objective comparison of particle tracking methods. Nat. Methods. 2014;11:281–289. doi: 10.1038/nmeth.2808.
3. Shen H., Tauzin L.J., et al. Landes C.F. Single Particle Tracking: From Theory to Biophysical Applications. Chem. Rev. 2017;117:7331–7376. doi: 10.1021/acs.chemrev.6b00815.
4. Pan Q., Sun D., et al. He Y. Real-Time Study of Protein Phase Separation with Spatiotemporal Analysis of Single-Nanoparticle Trajectories. ACS Nano. 2021;15:539–549. doi: 10.1021/acsnano.0c05486.
5. Ray S., Singh N., et al. Maji S.K. α-Synuclein aggregation nucleates through liquid–liquid phase separation. Nat. Chem. 2020;12:705–716. doi: 10.1038/s41557-020-0465-9.
6. Ouellette N.T., O’Malley P.J.J., Gollub J.P. Transport of Finite-Sized Particles in Chaotic Flow. Phys. Rev. Lett. 2008;101. doi: 10.1103/PhysRevLett.101.174504.
7. Bechinger C., Di Leonardo R., et al. Volpe G. Active particles in complex and crowded environments. Rev. Mod. Phys. 2016;88.
8. Khadka U., Holubec V., et al. Cichos F. Active particles bound by information flows. Nat. Commun. 2018;9:3864. doi: 10.1038/s41467-018-06445-1.
9. Godoy B.I., Lin Y., Andersson S.B. A Time-Varying Approach to Single Particle Tracking with a Nonlinear Observation Model. 2020; held in Denver, CO; pp. 5151–5156.
10. Ernst D., Köhler J. Measuring a diffusion coefficient by single-particle tracking: statistical analysis of experimental mean squared displacement curves. Phys. Chem. Chem. Phys. 2013;15:845–849. doi: 10.1039/c2cp43433d.
11. Calderon C.P. Motion blur filtering: A statistical approach for extracting confinement forces and diffusivity from a single blurred trajectory. Phys. Rev. E. 2016;93. doi: 10.1103/PhysRevE.93.053303.
12. Ashley T.T., Andersson S.B. Method for simultaneous localization and parameter estimation in particle tracking experiments. Phys. Rev. E. 2015;92. doi: 10.1103/PhysRevE.92.052707.
13. Vega A.R., Freeman S.A., et al. Jaqaman K. Multistep Track Segmentation and Motion Classification for Transient Mobility Analysis. Biophys. J. 2018;114:1018–1025. doi: 10.1016/j.bpj.2018.01.012.
14. Riahi M.K., Qattan I.A., et al. Homouz D. Identifying short- and long-time modes of the mean-square displacement: An improved nonlinear fitting approach. AIP Adv. 2019;9.
15. Monnier N., Barry Z., et al. Bathe M. Inferring transient particle transport dynamics in live cells. Nat. Methods. 2015;12:838–840. doi: 10.1038/nmeth.3483.
16. Persson F., Lindén M., et al. Elf J. Extracting intracellular diffusive states and transition rates from single-molecule tracking data. Nat. Methods. 2013;10:265–269. doi: 10.1038/nmeth.2367.
17. Das R., Cairo C.W., Coombs D. A Hidden Markov Model for Single Particle Tracks Quantifies Dynamic Interactions between LFA-1 and the Actin Cytoskeleton. PLoS Comput. Biol. 2009;5. doi: 10.1371/journal.pcbi.1000556.
18. Heckert A., Dahal L., et al. Darzacq X. Recovering mixtures of fast-diffusing states from short single-particle trajectories. Elife. 2022;11. doi: 10.7554/eLife.70169.
19. Hines K.E., Bankston J.R., Aldrich R.W. Analyzing Single-Molecule Time Series via Nonparametric Bayesian Inference. Biophys. J. 2015;108:540–556. doi: 10.1016/j.bpj.2014.12.016.
20. van de Meent J.-W., Bronson J.E., et al. Gonzalez R.L. Empirical Bayes Methods Enable Advanced Population-Level Analyses of Single-Molecule FRET Experiments. Biophys. J. 2014;106:1327–1337. doi: 10.1016/j.bpj.2013.12.055.
21. Bosch P.J., Kanger J.S., Subramaniam V. Classification of Dynamical Diffusion States in Single Molecule Tracking Microscopy. Biophys. J. 2014;107:588–598. doi: 10.1016/j.bpj.2014.05.049.
22. Wilson H., Wang Q. Joint Detection of Change Points in Multichannel Single-Molecule Measurements. J. Phys. Chem. B. 2021;125:13425–13435. doi: 10.1021/acs.jpcb.1c08869.
23. Li H., Yang H. Statistical Learning of Discrete States in Time Series. J. Phys. Chem. B. 2019;123:689–701. doi: 10.1021/acs.jpcb.8b10561.
24. Song N., Yang H. Parallelization of Change Point Detection. J. Phys. Chem. A. 2017;121:5100–5109. doi: 10.1021/acs.jpca.7b04378.
25. Saxton M.J. Single-particle tracking: the distribution of diffusion coefficients. Biophys. J. 1997;72:1744–1753. doi: 10.1016/S0006-3495(97)78820-9.
26. Metzner C., Mark C., et al. Fabry B. Superstatistical analysis and modelling of heterogeneous random walks. Nat. Commun. 2015;6:7516. doi: 10.1038/ncomms8516.
27. Granik N., Weiss L.E., et al. Shechtman Y. Single-Particle Diffusion Characterization by Deep Learning. Biophys. J. 2019;117:185–192. doi: 10.1016/j.bpj.2019.06.015.
28. Muñoz-Gil G., Volpe G., et al. Manzo C. Objective comparison of methods to decode anomalous diffusion. Nat. Commun. 2021;12:6253. doi: 10.1038/s41467-021-26320-w.
29. Kowalek P., Loch-Olszewska H., Szwabiński J. Classification of diffusion modes in single-particle tracking data: Feature-based versus deep-learning approach. Phys. Rev. E. 2019;100. doi: 10.1103/PhysRevE.100.032410.
30. Janczura J., Kowalek P., et al. Weron A. Classification of particle trajectories in living cells: Machine learning versus statistical testing hypothesis for fractional anomalous diffusion. Phys. Rev. E. 2020;102. doi: 10.1103/PhysRevE.102.032402.
31. Bo S., Schmidt F., et al. Volpe G. Measurement of anomalous diffusion using recurrent neural networks. Phys. Rev. E. 2019;100. doi: 10.1103/PhysRevE.100.010102.
32. Matsuda Y., Hanasaki I., et al. Niimi T. Estimation of diffusive states from single-particle trajectory in heterogeneous medium using machine-learning methods. Phys. Chem. Chem. Phys. 2018;20:24099–24108. doi: 10.1039/c8cp02566e.
33. Pinholt H.D., Bohr S.S.R., et al. Hatzakis N.S. Single-particle diffusional fingerprinting: A machine-learning framework for quantitative analysis of heterogeneous diffusion. Proc. Natl. Acad. Sci. USA. 2021;118. doi: 10.1073/pnas.2104624118.
34. Arts M., Smal I., et al. Meijering E. Particle Mobility Analysis Using Deep Learning and the Moment Scaling Spectrum. Sci. Rep. 2019;9. doi: 10.1038/s41598-019-53663-8.
35. Zhao H., Ge F., et al. He Y. Reveal heterogeneous motion states in single nanoparticle trajectory using its own history. Sci. China Chem. 2021;64:302–312.
36. Zhao H., Ge F., et al. He Y. Uncover Single Nanoparticle Dynamics on Live Cell Membrane with Data-Driven Historical Experience Analysis. Anal. Chem. 2021;93:9559–9567. doi: 10.1021/acs.analchem.1c01666.
37. Muñoz-Gil G., Guigo i Corominas G., Lewenstein M. Unsupervised learning of anomalous diffusion data: an anomaly detection approach. J. Phys. Math. Theor. 2021;54.
38. Verdier H., Laurent F., et al. Masson J.-B. Simulation-based inference for non-parametric statistical comparison of biomolecule dynamics. PLoS Comput. Biol. 2023;19. doi: 10.1371/journal.pcbi.1010088.
39. Bowman S.R., Vilnis L., et al. Bengio S. Generating Sentences from a Continuous Space. Association for Computational Linguistics; 2016; pp. 10–21.
40. Rákos O., Aradi S., Szalay Z., et al. Compression of vehicle trajectories with a variational autoencoder. Appl. Sci. 2020;10.
41. Pandarinath C., O’Shea D.J., et al. Sussillo D. Inferring single-trial neural population dynamics using sequential auto-encoders. Nat. Methods. 2018;15:805–815. doi: 10.1038/s41592-018-0109-9.
42. Fernex D., Noack B.R., Semaan R. Cluster-based network modeling—From snapshots to complex dynamical systems. Sci. Adv. 2021;7. doi: 10.1126/sciadv.abf5006.
43. Lin X., He Y. Study Enhanced Enzyme Diffusion with High-Speed Single Nanoparticle Rotational and Translational Tracking. Anal. Chem. 2022;94:7158–7163. doi: 10.1021/acs.analchem.2c00363.
44. Xue J., Wang Z., et al. He Y. Viscosity Measurement in Biocondensates Using Deep-Learning-Assisted Single-Particle Rotational Analysis. J. Phys. Chem. B. 2022;126:7541–7551. doi: 10.1021/acs.jpcb.2c03243.
45. Ma Q., Zheng J., et al. Cottrell G.W. Learning representations for time series clustering. Adv. Neural Inf. Process. Syst. 2019;32.
46. Danny Matthew S., Daniel S., Liniyanti D.O. Effect of Distance Metrics in Determining K-Value in K-Means Clustering Using Elbow and Silhouette Method. Atlantis Press; 2020; held in Palembang, Indonesia; pp. 341–346.
47. Vojta T., Halladay S., et al. Metzler R. Reflected fractional Brownian motion in one and higher dimensions. Phys. Rev. E. 2020;102. doi: 10.1103/PhysRevE.102.032108.
48. Ge F., Xue J., et al. He Y. Real-time observation of dynamic heterogeneity of gold nanorods on plasma membrane with darkfield microscopy. Sci. China Chem. 2019;62:1072–1081.
49. Schindelin J., Arganda-Carreras I., et al. Cardona A. Fiji: an open-source platform for biological-image analysis. Nat. Methods. 2012;9:676–682. doi: 10.1038/nmeth.2019.
50. Tinevez J.-Y., Perry N., et al. Eliceiri K.W. TrackMate: An open and extensible platform for single-particle tracking. Methods. 2017;115:80–90. doi: 10.1016/j.ymeth.2016.09.016.
51. Van der Maaten L., Hinton G. Visualizing data using t-SNE. J. Mach. Learn. Res. 2008;9.
52. Gers F.A., Schmidhuber J., Cummins F. Learning to Forget: Continual Prediction with LSTM. Neural Comput. 2000;12:2451–2471. doi: 10.1162/089976600300015015.
53. Eraslan G., Simon L.M., et al. Theis F.J. Single-cell RNA-seq denoising using a deep count autoencoder. Nat. Commun. 2019;10:390. doi: 10.1038/s41467-018-07931-2.
54. Behzadi S., Serpooshan V., et al. Mahmoudi M. Cellular uptake of nanoparticles: journey inside the cell. Chem. Soc. Rev. 2017;46:4218–4244. doi: 10.1039/c6cs00636a.
55. Zhou R., Zhou H., et al. Yeung E.S. Pericellular Matrix Enhances Retention and Cellular Uptake of Nanoparticles. J. Am. Chem. Soc. 2012;134:13404–13409. doi: 10.1021/ja304119w.
56. Zeng Z.-p., Xie H., et al. Xi P. Computational methods in super-resolution microscopy. Frontiers Inf. Technol. Electronic Eng. 2017;18:1222–1235.
