Abstract
The motion of nanoparticles in complex environments can provide us with a detailed understanding of interactions occurring at the molecular level. Liquid phase transmission electron microscopy (LPTEM) enables us to probe and capture the dynamic motion of nanoparticles directly in their native liquid environment, offering real-time insights into nanoscale motion and interaction. However, linking motion to interactions to decode the underlying mechanisms of motion and interpret the interactive forces at play is challenging, particularly when closed-form Langevin-based equations are not available to model the motion. Herein, we present LEONARDO, a deep generative model that leverages a physics-informed loss function and an attention-based transformer architecture to learn the stochastic motion of nanoparticles in LPTEM. We demonstrate that LEONARDO successfully captures statistical properties suggestive of the heterogeneity and viscoelasticity of the liquid cell environment surrounding the nanoparticles.
Subject terms: Transmission electron microscopy, Imaging techniques, Computational science
Understanding nanoparticle motion in complex liquids via liquid phase transmission electron microscopy is limited by the difficulty of linking observed dynamics to underlying interaction forces without tractable physical models. The authors introduce a physics-informed transformer-based generative model that learns and reveals the stochastic, heterogeneous motion of nanoparticles, capturing key viscoelastic features of their environment.
Introduction
Nature operates under the influence of stochasticity. This stochasticity, which shows up in the motion of nanoparticles, e.g., quantum dots in polymer matrices1, proteins on lipid membranes2–4, and vesicles in cells5, is closely related to the nanoparticles’ interactions with the complex environments surrounding them. Probing such nanoscale motion and interaction is challenging using conventional microscopy techniques due to the nanoscale, spatially heterogeneous, and dynamic nature of interactive forces, which require spatiotemporal resolution beyond what these methods can offer. In situ liquid phase transmission electron microscopy (LPTEM) is a new microscopy method that enables direct imaging of the motion of nanoparticles in their native liquid environment inside a microfluidic liquid cell chamber using a transmission electron microscope. The motion of nanoparticles in a liquid and in interaction with the window membrane of the LPTEM microfluidic chamber is stochastic and complex (Fig. 1a)6–14. The lack of a computational model to capture the complex motion and interaction of nanoparticles in LPTEM has limited the application of LPTEM for single-particle imaging. This is despite the unprecedented spatial and temporal resolution (nanometer and millisecond) that LPTEM offers for resolving nanoscale motion and interaction15–17.
Fig. 1. Overview of LEONARDO workflow.
a Schematic overview of our workflow for extracting single particle trajectories from LPTEM movies. By imaging the stochastic motion of single gold nanorods in water as they move and interact with the window of the liquid cell microfluidic chamber of LPTEM, we collect a large dataset of single-particle trajectories from LPTEM experiments. b Schematic of the LEONARDO model, a transformer-VAE with self-attention mechanisms in the encoder and decoder, mapping input trajectories to a low-dimensional latent space and reconstructing them with the aid of a physics-informed loss function. c Latent space representation of unseen trajectories encoded by the trained model, showing distinct clustering and overlaps indicative of different diffusion behaviors. d Demonstration of the generative power of LEONARDO in simulating new synthetic LPTEM trajectories by sampling from different regions of the latent space.
Several ideal stochastic processes with closed-form equations exist that can describe particles’ motion in certain types of environments. For example, Brownian motion18 captures the non-correlated random displacements of a particle’s trajectory due to the presence of thermal noise, fractional Brownian motion (FBM)19 describes the short- and long-range correlations of the displacements of a trajectory moving in a highly viscoelastic environment, and continuous time random walk (CTRW)20 models the trapping and escaping events of a particle as it moves over a random energy landscape with potential wells of various depths. In many experimental systems, including LPTEM, motion is a complex and hybrid mixture of various stochastic processes in space and time6,21–26. This complex motion is due to the nature of interactions that encompass different types of forces, which are nanoscopic and heterogeneous, that is, spatially varying and often hierarchical. The mathematical framework of the generalized Langevin equation that relates the position of particles to the stochastic noise present in the system through the fluctuation-dissipation theorem can potentially model more complex types of stochastic motion27,28. However, implementing this framework requires specifying the type of noise and the memory kernel term describing the past history of the system29. Computing the memory kernel in complex environments, including those with heterogeneous energy landscapes, using approaches such as molecular dynamics simulations is extremely difficult and often requires assumptions about the functional form of the memory kernel30–32. Therefore, obtaining a mechanistic understanding and modeling of the motion of particles in environments with heterogeneous interaction energy landscapes have been elusive at the nanoscale.
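As a point of reference, the generalized Langevin equation and the fluctuation-dissipation relation described above are commonly written in the following schematic form (a standard textbook statement, shown here for orientation rather than as the specific formulation of refs. 27–29):

```latex
% Generalized Langevin equation for a particle of mass m and velocity v(t),
% with memory kernel K(t) encoding the history dependence of the friction:
m \dot{v}(t) = -\int_{0}^{t} K(t - t')\, v(t')\, \mathrm{d}t' + \eta(t)

% Fluctuation-dissipation theorem relating the stochastic noise \eta(t)
% to the same memory kernel at temperature T:
\langle \eta(t)\, \eta(t') \rangle = k_{B} T\, K(|t - t'|)
```

Specifying the noise η(t) and the kernel K(t) is precisely the step that becomes intractable in heterogeneous environments, which motivates the data-driven approach taken here.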
Recent advancements in artificial intelligence (AI) have led to the development of supervised machine learning methods that correlate experimental observations with different classes of ideal stochastic processes33,34. For example, supervised deep convolutional neural networks trained on simulated ideal stochastic processes have been used to classify the underlying mechanism of motion associated with single nanoparticle trajectories from LPTEM experiments6. Other studies have also used similar supervised learning methods with different neural network architectures to classify experimental single-particle trajectories into categories of known stochastic processes35–39. Unsupervised learning methods have also been employed to characterize the underlying mechanism of motion by training deep learning models on simulated data. For example, it was shown that standard autoencoders can be used to reconstruct ideal stochastic diffusion processes and identify their physical parameters40. Additionally, variational autoencoders (VAEs) with a probabilistic decoder were used to model ideal stochastic diffusion for Gaussian processes41. However, capturing the wide array of Gaussian and non-Gaussian diffusion characteristics exhibited by particles in experimental data is not achievable through models trained solely on ideal stochastic processes.
The rise of generative AI has provided new approaches to developing data-driven models that learn hidden features of training data and creating black-box simulators with the ability to generate entirely new synthetic data. For example, deep generative models such as VAEs have proven to be effective in learning low-dimensional latent space representations of complex data in different domains and generating synthetic data42. VAEs achieve this task by mapping high-dimensional data to a lower-dimensional probabilistic latent space through a process of encoding, sampling, and decoding, thereby capturing the essential characteristics of the input data42. Through this process, the model learns latent variables, each following a prior distribution (e.g., Gaussian), that serve as the parameters of the black-box simulation model.
Herein, we pose the question: can generative AI learn and model the complex surface diffusion of nanoparticles from LPTEM experiments? If so, how can that be achieved? To answer these questions, we introduce a VAE deep learning model, named LEONARDO, with an attention-based transformer architecture43, and a customized physics-informed loss function to learn the complex diffusion of nanoparticles in LPTEM from tens of thousands of short single-particle trajectories (Fig. 1b). The self-attention mechanism captures complex temporal dependencies in time series data and has proven to be effective in a wide range of applications, such as the grammar of natural languages44,45, the intricate molecular structures of chemical species46, and the complex patterns of musical sequences47. The physics-informed loss function is designed to quantify the deviations between key statistical features of the input and generated trajectories. These features, such as the moments of the distribution, are canonically used in defining and characterizing stochastic diffusion processes, and therefore, aid the model in learning important attributes of trajectories. Leveraging the power of the self-attention mechanism, LEONARDO learns the temporal dependencies within single nanoparticle trajectories from LPTEM. We demonstrate that LEONARDO captures key physical properties related to the interaction energy landscape, including the non-Gaussianity of the distribution of displacements of a trajectory and their temporal correlations, which relate to the heterogeneity of potential wells in the environment surrounding the particle and their caging effect due to viscoelasticity. The generative power of LEONARDO enables us to use it as a black-box simulator by generating synthetic trajectories that are similar to real experimental data and cover the entire observed regime of behavior at different electron beam dose rates of the microscope and particle size (Fig. 1c, d). 
This feature provides an opportunity to generate ample particle trajectory data for downstream tasks, particularly in automating electron microscopes, where large and diverse datasets are essential for training algorithms that control the microscope48.
Results and discussion
Imaging and learning single nanoparticle trajectories from LPTEM experiments using the LEONARDO framework
To learn the stochastic motion of the nanoparticles in LPTEM, we used a model system of gold nanorods diffusing in water. We first curated a large dataset of single-particle trajectories of gold nanorods dispersed in water (see Methods for sample preparation details) and moving near the silicon nitride (SiNx) membrane window of the liquid cell chamber of LPTEM. A single nanoparticle trajectory from LPTEM, although highly stochastic in nature, is essentially a sequence of steps that the nanoparticle takes in time, containing all embedded time dependencies and, therefore, the information on its interactions with the surrounding environment over time. Figure 1a illustrates the workflow used here to collect single nanoparticle trajectories from LPTEM experiments. In situ movies of the stochastic motion of gold nanorods in LPTEM were recorded in real time. The collected movies were processed and analyzed as described in the Methods section to arrive at a large dataset of single nanoparticle trajectories (Fig. 1a). The normalized x and y coordinates of the two-dimensional (2-D) trajectories (see Supplementary Information for the normalization procedure) were used to formulate the training data for the neural network model with the architecture illustrated in Fig. 1b and Supplementary Fig. S3. To generalize the model to all experimental conditions applicable to LPTEM experiments, a diverse training dataset was prepared by collecting 38,279 short (200-frame) experimental trajectories at different camera frame rates and a broad range of electron beam dose rates of the microscope (2 to 60 e−/Å2 ⋅ s) for different sizes of gold nanorods (20 to 60 nanometers (nm) long). The trajectory length of 200 frames is a balanced choice to capture sufficient particle dynamics and accommodate datasets of varying sizes collected across different video frame rates. 
Longer experimental trajectories can be segmented into shorter fixed-length pieces to capture local dynamics, while subsampling of long trajectories can be used to study longer-range statistics with LEONARDO.
The input trajectories, after passing through a convolutional layer to increase the embedding dimension from 1 to 128, were transformed in the encoder network that features two sequential multiheaded self-attention blocks inspired by the transformer architecture43 to capture the temporal dependencies inherent in the trajectories (see the Supplementary Information and Methods section for details). These blocks feed into a convolutional encoder that compresses the output into a 12-dimensional latent vector, z, where each dimension follows a prior standard Gaussian distribution, with the help of a relative-entropy loss function term42. The latent vector is subsequently expanded back to the original trajectory length after passing through the decoder and the final convolutional layer (Fig. 1b). Through this transforming, compressing, and expanding process, the model learns to encode the input data into 12 independent Gaussian distributions (with mean μ and standard deviation σ, Fig. 1b); samples from those distributions are subsequently used to reconstruct the input data in the output.
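The sampling step between the encoder and decoder is the standard VAE reparameterization: the latent vector is drawn as a deterministic function of the encoder outputs plus external noise. A minimal numpy sketch of this step (the function name and batch shapes are illustrative, not LEONARDO's implementation):

```python
import numpy as np

rng = np.random.default_rng(seed=0)

def reparameterize(mu, log_var, rng):
    # z = mu + sigma * eps, with eps ~ N(0, I): sampling is written as a
    # deterministic function of (mu, log_var) plus external noise, so
    # gradients can flow through the encoder outputs during training.
    sigma = np.exp(0.5 * log_var)
    eps = rng.standard_normal(mu.shape)
    return mu + sigma * eps

# Hypothetical encoder outputs for a batch of 4 trajectories, mirroring
# LEONARDO's 12-dimensional latent vector z.
mu = np.zeros((4, 12))
log_var = np.zeros((4, 12))  # log-variance 0 => sigma = 1
z = reparameterize(mu, log_var, rng)
```

In training, the relative-entropy (KL) loss term pushes each latent dimension toward the standard Gaussian prior, so that at generation time new trajectories can be produced by sampling z directly from N(0, I) and decoding.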
Physics-informed loss function
Given the stochastic nature of the trajectories, it is impractical to aim for an exact reconstruction after passing through a low-dimensional latent space using standard loss functions such as mean-squared error (MSE) and relative-entropy loss, as previously noted in the literature41. Therefore, the aim of this work is not to reconstruct the trajectories exactly; rather, it is to reconstruct their statistical properties and to learn the underlying physics related to those statistical properties. Thus, we designed a physics-informed loss function that includes terms customized to learn stochastic trajectories in addition to the standard loss terms in a VAE and we assigned a lower weight to the standard MSE-based reconstruction term to minimize its contribution to the total loss (see Supplementary Information for details of each loss term). Physics-informed machine learning has recently gained traction in applications where partial knowledge of the underlying physics of the input data is incorporated into the model’s loss function49. This approach constrains the learning process with physical laws, ensuring that the model captures the underlying physics of the system.
The new loss function proposed here includes additional terms composed of errors between statistical moments of the distribution of displacements, Ψ(δr(t)), where δr(t) = r(t) − r(t − 1) is the nanoparticles’ displacement between two consecutive frames, and r(t) = (x(t), y(t)) is the position vector of the nanoparticle at time t, with x(t) and y(t) representing the x and y coordinates of the particle’s position, respectively. The first two terms are the mean, 〈δr(t)〉, and variance, 〈(δr(t) − 〈δr(t)〉)²〉, (first two moments) of the displacement distribution, Ψ(δr(t)), respectively, where 〈 ⋅ 〉 denotes the mean value over frames of a single trajectory. Trajectories of nanoparticles moving in LPTEM are often characterized by non-Gaussian displacement distributions6,8. This is notable during events when particles escape potential wells after being locally trapped with minimal displacements, resulting in large displacements in either negative or positive directions, which makes the distribution heavy-tailed and skewed. To accommodate these events, we incorporated skewness, 〈(δr(t) − 〈δr(t)〉)³〉/σδr³, and kurtosis, 〈(δr(t) − 〈δr(t)〉)⁴〉/σδr⁴ (related to the third and fourth moments of the distribution, respectively), into the loss function, where σδr is the standard deviation of Ψ(δr(t)). Skewness addresses the asymmetry of the distribution, while kurtosis provides insight into the heaviness of the tail of the distribution. The four moments together can describe displacement distributions of trajectories with Gaussian or non-Gaussian statistics.
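The four moment terms above can be estimated directly from a coordinate series. A minimal numpy sketch for one coordinate component (the function name is illustrative, not LEONARDO's implementation):

```python
import numpy as np

def displacement_moments(x):
    """Mean, variance, skewness, and kurtosis of the per-frame
    displacement distribution of a 1-D coordinate series x(t)."""
    dr = np.diff(x)              # delta r(t) = r(t) - r(t-1)
    mean = dr.mean()
    sigma = dr.std()
    var = sigma ** 2
    skew = np.mean((dr - mean) ** 3) / sigma ** 3
    kurt = np.mean((dr - mean) ** 4) / sigma ** 4  # Gaussian => 3
    return mean, var, skew, kurt
```

For a Gaussian random walk, skewness is near 0 and kurtosis near 3; the heavy-tailed, skewed displacement distributions described above push these values away from the Gaussian reference.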
Two other key statistical measures that we incorporated into the loss function are the velocity autocorrelation function and positional autocorrelation function of the trajectories, Cv(τ) and Cr(τ), respectively. Here, Cv(τ) = 〈v(t) ⋅ v(t + τ)〉, where v(t) and v(t + τ) are the velocities of the particle at time t and t + τ, respectively, and 〈 ⋅ 〉 denotes the mean value calculated over time t for each time delay, τ, for a single trajectory. Cr(τ) is defined as Cr(τ) = 〈r(t) ⋅ r(t + τ)〉, where r(t) and r(t + τ) represent the position vectors at time t and t + τ, respectively. The Cv(τ) captures dynamics not evident from the displacement distribution alone. The presence of the velocity autocorrelation function term in the loss function ensures that the model captures the anticorrelated motion of nanoparticles interacting with viscoelastic environments, such as those in crowded media, a well-known mechanism leading to anomalous motion50. Anticorrelation in particle trajectories is characterized by a negative value of the autocorrelation function at short time delays, τ. We also calculate the ensemble velocity autocorrelation for a batch of trajectories to quantify the degree to which particle displacements are temporally correlated. The positional autocorrelation function term, Cr(τ), quantifies the spatial correlation between particle positions at different time delays. The inclusion of this term was motivated by the distinct behaviors observed in our experimental trajectories, where particles often transition abruptly between positions and remain localized in the new positions for extended periods. These dynamics introduce long-term spatial correlations that are not fully captured by displacement-based metrics. In addition to the autocorrelations, it is necessary to incorporate a term that accounts for the correlations between the x and y components of the 2-D trajectories.
This term measures the correlation coefficient between displacements in the x and y directions, ensuring that the model captures any anisotropy or coupling between orthogonal motion components (see Supplementary Information for more details).
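The velocity autocorrelation term can be sketched with a simple unnormalized estimator for one coordinate component (an illustrative estimator, assumed here; not necessarily the exact form used in the loss function):

```python
import numpy as np

def velocity_autocorrelation(x, max_lag):
    """Cv(tau) = <v(t) v(t + tau)> averaged over t, for a 1-D
    coordinate series x(t); v(t) is the per-frame displacement."""
    v = np.diff(x)
    n = len(v)
    return np.array([np.mean(v[: n - tau] * v[tau:])
                     for tau in range(max_lag + 1)])
```

For uncorrelated (Brownian-like) steps, Cv(τ) ≈ 0 for τ ≥ 1, whereas a particle that stays localized (e.g., caged in a viscoelastic medium) shows the negative Cv at short lags described above.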
To further enhance LEONARDO’s ability to capture the dynamics of experimental trajectories from LPTEM, we investigated the addition of other statistical terms inspired by the extensive set of statistical features previously discussed in the literature51 for characterizing anomalous diffusion processes. Among these, which include the four moments of the displacement distribution, the median of displacements, moments of the discrete Fourier transform, power spectral densities, and wavelet coefficients, we selected the median of displacements as an additional term in the loss function due to its positive effect on model performance based on the metrics discussed in Section 2.2. Detailed derivations of each term in the loss function are provided in the Supplementary Information, along with the training and validation losses of all terms at each epoch (see Fig. S1).
LEONARDO model performance
A key goal of this study is to evaluate how well LEONARDO generates trajectories that resemble the experimental trajectories from LPTEM. For this purpose, we designed two performance metrics to quantitatively assess the similarity between LEONARDO-generated trajectories and experimental trajectories. These metrics not only provide a way to assess LEONARDO’s performance but also offer a quantitative measure of the similarity between any two stochastic processes.
Fréchet Distance between learned feature vectors
The first metric we designed is based on the Fréchet Inception Distance (FID), a measure originally introduced by Heusel et al.52 to evaluate the quality of a given generative model in generating new image data and widely recognized for its ability to compare distributions of multiple data points in a high-dimensional feature space. FID calculates the Wasserstein-2 distance between two multivariate Gaussian distributions, one representing real data and the other representing model-generated data. By comparing distributions of multiple samples, FID provides a global perspective on the statistical similarities between datasets. The Fréchet distance measures the mean and covariance differences between two distributions (see the Methods section for details), with lower scores indicating greater similarity between the two distributions. In the context of images, the distributions used in the FID method are the features extracted from the second-last layer of the Inception-v3 classifier53, which is a 2048-dimensional vector representing high-level features of images, such as shapes, textures, and relationships of objects in images, rather than low-level pixel-level details. For our case of spatiotemporal trajectories, we utilized MoNet, a deep learning-based classifier developed specifically to analyze and characterize anomalous diffusion processes6. MoNet uses a dilated convolutional neural network to extract high-level features that differentiate between diffusion classes, making it a suitable substitute for Inception-v3 in this domain. The original MoNet was designed to classify three diffusion classes6. Herein, we adapted it into MoNet2.0 (Fig. 2c), which now classifies seven different diffusion classes and ensures capturing a broader range of stochastic processes, analogous to Inception-v3.
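Between two Gaussians fit to feature sets, the Fréchet distance has a closed form. A numpy-only sketch following the FID formula of Heusel et al. (the trace of the matrix square root is evaluated via the eigenvalues of the covariance product; this is an illustrative implementation, not the paper's code):

```python
import numpy as np

def frechet_distance(feat_a, feat_b):
    """Squared Frechet (Wasserstein-2) distance between Gaussians fit
    to two feature sets (rows = samples, columns = feature dims)."""
    mu_a, mu_b = feat_a.mean(axis=0), feat_b.mean(axis=0)
    cov_a = np.cov(feat_a, rowvar=False)
    cov_b = np.cov(feat_b, rowvar=False)
    # Tr(sqrtm(cov_a @ cov_b)) equals the sum of square roots of the
    # eigenvalues of cov_a @ cov_b, which are real and nonnegative for
    # a product of two symmetric positive-semidefinite matrices.
    eigvals = np.linalg.eigvals(cov_a @ cov_b)
    tr_sqrt = np.sum(np.sqrt(np.clip(eigvals.real, 0.0, None)))
    diff = mu_a - mu_b
    return float(diff @ diff + np.trace(cov_a) + np.trace(cov_b)
                 - 2.0 * tr_sqrt)
```

Identical feature sets give a distance of (numerically) zero; shifting one set's mean adds the squared mean difference, which is the dominant term when two trajectory classes occupy different regions of the feature space.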
Fig. 2. Evaluation of the model performance of LEONARDO.
a Representative trajectories of different diffusion classes, including Brownian Motion (BM), five anomalous diffusion classes (FBM, CTRW, LW, SBM, and ATTM), and experimental trajectories from LPTEM. b Schematic illustrating the generation of trajectories by LEONARDO. Trajectories are sampled from a Gaussian distribution in LEONARDO's latent space and decoded to produce synthetic trajectories. c Schematic of the MoNet2.0 classifier architecture, trained using the trajectories from (a). To evaluate the performance of LEONARDO, trajectories from both (a) and (b) are input into the trained MoNet2.0 classifier. d Confusion matrix summarizing the classification performance of the MoNet2.0 model on benchmark diffusion classes, demonstrating high accuracy across most classes. e Schematic and results of the Fréchet Distance (FD) calculation. The second-last layer output of MoNet2.0 is used to compute the FD scores between pairs of diffusion class distributions. The lower triangular matrix shows that the FD score between LPTEM and LEONARDO-generated trajectories is significantly lower than the scores between other diffusion classes. f UMAP of the second-last layer of MoNet2.0 showing significant overlap between LPTEM and LEONARDO-generated trajectories. g Classification results of LEONARDO-generated trajectories by MoNet2.0 show that over 95% of LEONARDO-generated trajectories are classified as LPTEM, with smaller fractions classified as ATTM (3.53%) and FBM (0.34%).
To train MoNet2.0, we used a diverse dataset comprising LPTEM trajectories, Brownian Motion (BM), and five classes of anomalous diffusion processes simulated using the models from the Anomalous Diffusion (AnDi) challenge33 without any additional noise (see Methods Section for more details): FBM, CTRW, Annealed Transient Time Motion (ATTM), Scaled Brownian Motion (SBM), and Lévy Walk (LW) (Fig. 2a). We trained MoNet2.0 on 8000 trajectories per class, achieving a high accuracy across most classes as shown in Fig. 2d, with a high F1 score of 0.88 (a metric that balances precision and recall; see the Methods section for the complete definition).
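For reference, the F1 score is the harmonic mean of precision and recall, computed here for a single class from hypothetical counts (not the paper's data):

```python
def f1_score(tp, fp, fn):
    """F1 score from true-positive, false-positive, and false-negative
    counts for one class: the harmonic mean of precision and recall."""
    precision = tp / (tp + fp)   # fraction of predictions that are correct
    recall = tp / (tp + fn)      # fraction of true instances recovered
    return 2.0 * precision * recall / (precision + recall)

# 80 correct detections, 20 false alarms, 20 misses:
# precision = recall = 0.8, so F1 = 0.8
score = f1_score(80, 20, 20)
```

For a multi-class problem such as the seven diffusion classes here, per-class F1 scores are typically averaged (e.g., macro-averaged) to obtain a single number.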
Using the trained MoNet2.0 model, we input 3202 LPTEM and 3202 LEONARDO-generated trajectories and extracted the second-last layer output of MoNet2.0 for each case, a 128-dimensional feature vector, analogous to the 2048-dimensional feature vector of Inception-v3 (Fig. 2b, c, e). We then calculated the Fréchet distance (FD) between these vectors. The resulting score of 7.88 demonstrates a high degree of fidelity between the LEONARDO-generated and experimental LPTEM trajectories. For context, the scores between different diffusion classes ranged from ~14.48 to ~57.58 (Fig. 2e). The scores computed between two independent batches of the same diffusion classes (e.g., FBM vs. FBM, CTRW vs. CTRW, etc.) range from 0.19 to 1.74 (see Fig. S4), providing a baseline for interpreting the FD values. These values highlight that a distribution of LEONARDO-generated trajectories is statistically closer to a distribution of LPTEM trajectories than to other classes of diffusion, and that the similarity between LPTEM and LEONARDO-generated trajectories is greater than the similarity between any other anomalous diffusion classes, further validating LEONARDO’s generative performance.
Although we developed this FD metric specifically for evaluating the performance of LEONARDO, this metric provides a robust and versatile framework for quantifying the similarity between time-series processes, making it broadly applicable for the community to evaluate the performance of generative AI models for time-series data.
We also visualized the 128-dimensional feature vectors using Uniform Manifold Approximation and Projection (UMAP) in Fig. 2f. The UMAP embedding shows that the trajectories generated by LEONARDO and the real LPTEM experimental data occupy overlapping regions in the feature space, reinforcing the statistical similarity between the real experimental and the LEONARDO-generated data.
Classification of generated trajectories
While the FD metric evaluates the similarity between the distributions of multiple trajectories, we also used the predicted probabilities of different classes of diffusion as outputted by the MoNet2.0 classifier as a second performance metric. By examining how each trajectory is classified by MoNet2.0, we gain insight into whether the generated trajectories capture the distinguishing features of the real experimental LPTEM trajectories. To evaluate this, we passed LEONARDO-generated trajectories through the trained MoNet2.0 classifier and recorded their predicted diffusion classes. More than 96% of the LEONARDO-generated trajectories were classified as LPTEM (Fig. 2g). This result highlights the strong resemblance between the generated and experimental trajectories, indicating that the LEONARDO-generated trajectories are more similar to LPTEM than any other class of diffusion, while a small percentage (less than 4%) of trajectories exhibit more resemblance to ATTM and FBM than LPTEM.
We note that while both the classifier and the FD are derived from the same MoNet2.0 model, they serve different purposes and are not expected to align exactly. The classifier assigns labels based on the most discriminative features in the learned feature space, and therefore highlights whether two trajectories can be separated by a sharp decision boundary. In contrast, FD measures the global similarity between two distributions across the entire 128-dimensional feature space. As a result, two classes may be close in terms of overall feature distributions (i.e., low FD), yet still be confidently separable by the classifier. This is seen, for example, in the case of Lévy Walk and Brownian Motion, which have a relatively low FD but are never confused in the classification task (Fig. 2d). Taken together, the FD and classification metrics offer complementary perspectives, and both indicate that LEONARDO-generated trajectories closely resemble the experimental LPTEM trajectories.
Quantitative comparison of statistical properties
In addition to FD and classification-based evaluation metrics, we further evaluated the statistical fidelity of LEONARDO’s outputs by comparing both reconstructed and generated trajectories to experimental LPTEM trajectories. Specifically, we computed the ensemble-averaged values of each statistical metric used in the physics-informed loss function over 3202 trajectories (validation dataset size) and calculated the squared differences between these averages. For LEONARDO-generated trajectories sampled randomly from the latent space, we measured how their average statistical properties differ from those of experimental LPTEM trajectories. As a reference point, we reported the same difference between the LEONARDO-generated trajectories and six different diffusion classes: Brownian motion, FBM, CTRW, Lévy Walk, ATTM, and SBM. We then evaluated LEONARDO-reconstructed trajectories, obtained by inputting experimental LPTEM trajectories into the trained model, and compared them statistically to the original LPTEM inputs. As shown in Table S1, LEONARDO-generated trajectories exhibit lower total weighted and unweighted squared errors against LPTEM compared to any reference diffusion class. The comparison between LEONARDO-reconstructed and original LPTEM input data demonstrates that LEONARDO can accurately recover the statistical properties of experimental trajectories, yielding significantly lower weighted and unweighted squared differences compared to reference comparisons between LEONARDO-generated trajectories and experimental LPTEM data, or simulated diffusion classes.
Learning the underlying physics of trajectories
To investigate how LEONARDO’s latent space encodes key statistical properties of the experimental data related to the interaction energy landscape of nanoparticles with the LPTEM environment, we analyzed two specific metrics: the non-Gaussianity of the displacement distribution at τ = 1 and the velocity autocorrelation at τ = 1 (Fig. 3). The non-Gaussianity parameter, ξ(τ), of a given displacement distribution54 is defined in terms of the fourth and second time-averaged moments for an ensemble of trajectories over a time delay window of size τ as ξ(τ) = 〈δr⁴(τ)〉/(3〈δr²(τ)〉²) − 1 (ref. 55), where 〈 ⋅ 〉 denotes the average over an ensemble of trajectories. Non-Gaussianity in the displacement distribution can arise in systems where particles interact with heterogeneous potential wells, leading to localized trapping and intermittent escape events. A negative velocity autocorrelation, Cv(τ), at short time delays is often indicative of the caging effect, where viscoelastic forces in the medium impose directional constraints on particle motion over short timescales. These statistical properties, although not direct evidence, serve as potential indicators of the complexity of the environment, offering a window into the heterogeneous and viscoelastic nature of the surroundings. Figure 3a shows the coefficient of determination (R2) for the relationship between each latent variable and the non-Gaussianity and velocity autocorrelation of the x and y components of the trajectories at τ = 1 (see Methods section for details about the calculation of R2). We generated trajectories by modulating each latent variable between − 3 and + 3 (corresponding to ± 3 standard deviations from the mean), while all other latent variables were kept constant at their initial values sampled from a Gaussian distribution with a mean of zero and standard deviation of one. The statistical properties were then calculated for all trajectories. 
To capture potential non-linear relationships between latent variables and statistical properties, a second-degree polynomial was then fit to the data to calculate the R2 values that quantify these relationships. The results in Fig. 3a suggest that latent variables z1, z4, and z5 have strong relationships with the desired statistical properties, as quantified by the R2 values, which are further explored in Fig. 3c, d, f, g, i, and j.
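The non-Gaussianity parameter for one coordinate component can be estimated directly from a trajectory; a minimal numpy sketch (the factor of 3 in the denominator corresponds to the one-dimensional Gaussian reference, matching the per-component analysis above):

```python
import numpy as np

def non_gaussianity(x, tau=1):
    """1-D non-Gaussian parameter xi(tau): approximately 0 for Gaussian
    displacement statistics, > 0 for heavy-tailed distributions."""
    dr = x[tau:] - x[:-tau]      # displacements over a lag of tau frames
    return np.mean(dr ** 4) / (3.0 * np.mean(dr ** 2) ** 2) - 1.0
```

A Gaussian random walk gives ξ ≈ 0, while heavy-tailed step distributions (e.g., Laplace-distributed steps, whose kurtosis is 6) give ξ ≈ 1, illustrating the range from near 0 to ~6 reported for the generated trajectories.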
Fig. 3. Analysis of the physics learned by LEONARDO’s latent variables.
a Matrix showing the coefficient of determination (R2) for a second-degree polynomial fit between each latent variable zi (i = 1, . . . , 12) and key statistical features of the trajectories: non-Gaussianity (ξ(τ = 1)) of the x and y components (rows 1 and 2, respectively) and velocity autocorrelation (Cv(τ = 1)) of the x and y components (rows 3 and 4, respectively). b Probability density functions (PDF) of displacement distributions of trajectories generated by LEONARDO by modulating the sampled value of latent variable z1 from − 3 to + 3, corresponding to ± 3 standard deviations away from the mean of the latent variable distribution. The PDFs of the x and y components are overlaid with Gaussian fits (dashed gray line for x and solid gray line for y). c Scatter plot of non-Gaussianity (ξ(τ = 1)) versus latent variable z1. The black dashed and solid lines represent second-degree polynomial fits to the x and y components, respectively. d Scatter plot of velocity autocorrelation (Cv(τ = 1)) versus latent variable z1, with second-degree polynomial fits as in (c). e–g Same analysis as (b–d), but for latent variable z4. h–j Same analysis as (b–d), but for latent variable z5.
For latent variable z1, as its sampled value is modulated from −3 to + 3, the probability density function (PDF) of displacement distributions for the y-components of the generated trajectories (1000 trajectories per case) transitions from negatively skewed to positively skewed (Fig. 3b). The presence of heavy tails on the skewed side of the distributions contributes to an increase in non-Gaussianity. This change is reflected in ξ(τ) going from 6, indicating a non-Gaussian distribution, to near 0, representing a more Gaussian distribution, and back to 6 as the value of z1 increases (Fig. 3c). A similar trend is observed for the latent variable z5 in the x-component, as shown in Fig. 3h, i. The high non-Gaussianity of displacement distributions is a common characteristic of trajectories where particles traverse a heterogeneous interaction energy landscape, experiencing prolonged entrapment within potential wells and infrequent escape events50.
In Fig. 3d, j, corresponding to z1 and z5, there is little to no apparent relationship between the values of the latent variables and the velocity autocorrelation at τ = 1. This indicates that z1 and z5 do not capture temporal correlations. In contrast, Fig. 3g, corresponding to z4, exhibits a clear linear trend in both x and y components, highlighting a strong correlation between z4 and the velocity autocorrelation at τ = 1. However, Fig. 3e, f show that z4 has a weak relationship with non-Gaussianity, suggesting that z4 predominantly captures temporal correlations of trajectories.
Thus, z1 and z5 primarily learn the non-Gaussianity at τ = 1, which is suggestive of the heterogeneity of the energy landscape that the particles traverse. Similarly, z4 captures the velocity autocorrelation at τ = 1, which is indicative of the viscoelasticity of the environment surrounding the particles. These results demonstrate how LEONARDO’s latent variables capture statistical properties that relate to the physical processes governing particle motion in LPTEM. This capability allows LEONARDO to probe how experimental factors, such as particle size and electron beam dose rate, influence the interaction energy landscape in the liquid cell, as discussed in the next section. While we focused on three latent variables due to their relevance to the heterogeneity of the energy landscape and the viscoelasticity of the LPTEM environment, other latent variables also encode statistical properties important for capturing the full diversity of the physical processes in LPTEM. The complete table of the variances for all latent variables, computed from the encoder outputs on the test set, is provided in Table S2 of the Supplementary Information.
The effect of particle size and electron beam dose rate of the microscope on the interaction energy landscape in LPTEM
Two key physical features that LEONARDO learned about the LPTEM trajectories through the latent space encoding of the trajectories are closely related to the interaction energy landscape that the nanoparticles traverse as they move near the surface of the SiNx membrane window of the liquid cell. As shown in Fig. 3, latent variables z1, z4, and z5 are correlated with non-Gaussianity, related to the heterogeneity of this landscape, and with velocity anticorrelation, related to the viscoelasticity of the environment. Here, we leveraged LEONARDO’s capabilities to explore the nature of the interaction energy landscape within the liquid cell environment. Specifically, we investigated how this landscape evolves with the electron beam dose rate of the microscope and the nanoparticle size on new, unseen experimental data. To achieve this, we encoded trajectories from different experimental conditions and analyzed the distributions of the means (μ) of the latent variables z1, z4, and z5, denoted μ1, μ4, and μ5, respectively. Figure 4 shows the analysis for the trajectories collected at low and high dose rates (20 and 35 e−/Å2 ⋅ s) and for two different nanoparticle sizes (40 nm and 60 nm in length), where the resulting distributions of latent μ were plotted to examine how experimental factors influence the learned representations. The experimental data consisted of eight 4000-frame trajectories for three separate experimental conditions: (1) Dose 20, Size 60 (Fig. 4a–b, e–f, and i–k), (2) Dose 35, Size 60 (Fig. 4a–n), and (3) Dose 35, Size 40 (Fig. 4c–d, g–h, l–n). Each experimental trajectory was segmented into short (200-frame) pieces and encoded into LEONARDO to obtain the mean, μ, of the latent variables for each piece. The distributions of these μ values were plotted for each experimental condition (Fig. 4a–d).
Furthermore, we plotted the corresponding distributions of non-Gaussianity and velocity autocorrelation at τ = 1 for the same datasets, which shows the relationship between the latent variables and the statistical properties for these experimental data (Fig. 4e–h). To provide a global view of how the different experimental conditions are encoded in the latent space, we also show the UMAP embeddings of the mean values of all twelve latent variables in Fig. 4i–n.
Fig. 4. Characterization of the interaction energy landscape in LPTEM using LEONARDO.
a Probability densities of the values of 〈μ1, μ5〉 for experimental trajectories collected at electron beam dose rates of 20 and 35 e−/Å2 ⋅ s for particle size of 60 nm. b Probability densities of the values of μ4 for the same experimental trajectories as in (a). c Probability densities of the values of 〈μ1, μ5〉 for experimental trajectories collected at particle sizes of 40 nm and 60 nm at an electron beam dose rate of 35 e−/Å2 ⋅ s. d Probability densities of the values of μ4 for the same experimental trajectories as in (c). e Probability densities of the values of non-Gaussianity (averaged over the x and y components) at τ = 1 for the same experimental trajectories as in (a). f Probability densities of the values of velocity autocorrelation (averaged over the x and y components) at τ = 1 for the same experimental trajectories as in (b). g Probability densities of the values of non-Gaussianity at τ = 1 for the same experimental trajectories as in (c). h Probability densities of the values of velocity autocorrelation at τ = 1 for the same experimental trajectories as in (d). i UMAP embeddings of all twelve latent variables for the same experimental trajectories as in (a) and (e), color-coded by dose rate. j UMAP embeddings of all twelve latent variables for the same experimental trajectories as in (a), (e), and (i), color-coded by non-Gaussianity (τ = 1) on a symmetric logarithmic scale (SymLogNorm). k UMAP embeddings of all twelve latent variables for the same experimental trajectories as in (a), (e), and (i), color-coded by velocity autocorrelation (τ = 1). l UMAP embeddings of all twelve latent variables for the same experimental trajectories as in (c) and (g), color-coded by particle size.
m UMAP embeddings of all twelve latent variables for the same experimental trajectories as in (c), (g), and (l), color-coded by non-Gaussianity (τ = 1) on a symmetric logarithmic scale (SymLogNorm). n UMAP embeddings of all twelve latent variables for the same experimental trajectories as in (c), (g), and (l), color-coded by velocity autocorrelation (τ = 1).
Figure 4a shows the distribution of the average of μ1 and μ5, 〈μ1, μ5〉, for trajectories collected at dose rates of 20 and 35 e−/Å2 ⋅ s for a fixed particle size of 60 nm. Since both μ1 and μ5 encode non-Gaussianity in the y and x components, respectively (Fig. 3c, i shows this connection), we take their average, 〈μ1, μ5〉, to capture the overall non-Gaussianity of the trajectories. At the lower dose rate of 20 e−/Å2 ⋅ s, the distribution is narrower with mainly lower values of 〈μ1, μ5〉. In contrast, at a higher dose rate of 35 e−/Å2 ⋅ s, the distribution broadens, encompassing both higher and lower values of 〈μ1, μ5〉. This observation is supported by the corresponding non-Gaussianity, ξ(τ = 1), distributions in Fig. 4e, where we show the computed non-Gaussianity averaged over the x and y components of trajectories. At the lower dose rate of 20 e−/Å2 ⋅ s, this distribution is narrower with lower values, while at the higher dose rate of 35 e−/Å2 ⋅ s, it becomes broader with both low and high values. This alignment reinforces the connection between 〈μ1, μ5〉 and non-Gaussianity. Figure 4c compares the distribution of 〈μ1, μ5〉 for particle sizes of 40 nm and 60 nm, both collected at a dose rate of 35 e−/Å2 ⋅ s. For the smaller particle size, the distribution is significantly narrower compared to the larger particle size, with most values concentrated around smaller magnitudes. However, interestingly, there are outliers reaching values less than −2 for the 40 nm-long particles. This finding is further corroborated by the corresponding non-Gaussianity distributions in Fig. 4g, where most values remain low for the smaller particle size, but rare outliers extend to higher non-Gaussianity. This consistency further reinforces the relationship between 〈μ1, μ5〉 and non-Gaussianity. 
The observation of high values of non-Gaussianity at the higher dose rate of 35 e−/Å2 ⋅ s for both particle sizes suggests that dose rate plays a critical role in driving these extreme values, potentially contributing more to a heterogeneous energy landscape than the particle size.
We next investigated the latent variable z4, which is strongly correlated with the velocity anticorrelation of both the x and y components of trajectories, as shown in Fig. 3g. Figure 4b compares the distributions of μ4 for the dose rates of 20 and 35 e−/Å2 ⋅ s and a fixed particle size of 60 nm. At the lower dose rate, the distribution is narrower and shifted toward higher values, indicating stronger anticorrelation at lower dose rates. In contrast, at the higher dose rate of 35 e−/Å2 ⋅ s, the distribution broadens, spanning a wider range of values. This shift in the peak position suggests the presence of anticorrelation indicative of higher viscoelasticity at lower electron beam dose rates, consistent with previous observations reported in the literature6,9. The trend of anticorrelation with the electron beam dose rate is further corroborated by the distribution of the velocity autocorrelation (averaged over x and y components of trajectories) in Fig. 4f, which exhibits a similar shift. Figure 4d examines the effect of particle size on z4 at a fixed dose rate of 35 e−/Å2 ⋅ s. For the smaller particles, the distribution is much narrower and centered around a μ4 value of 2, suggesting a much higher anticorrelation. This finding is supported by the distribution of velocity autocorrelation in Fig. 4h.
To further visualize how these dose-dependent and size-dependent effects manifest in the latent space, we examined the UMAP embeddings of all twelve latent variables. Figure 4i shows the UMAP representation color-coded by the electron beam dose rate, demonstrating a clear separation between the two experimental conditions. Figure 4j, k shows that this separation aligns with trends in non-Gaussianity and velocity autocorrelation, reinforcing the connection between these statistical properties and the learned latent variables. Similarly, Fig. 4l–n present the UMAP embeddings color-coded by particle size, non-Gaussianity, and velocity autocorrelation, respectively. The UMAP embeddings reveal a clear distinction between the trajectories of 40 nm and 60 nm particles and their statistical properties, demonstrating that LEONARDO encodes differences in particle size through its learned statistical features. To further illustrate the relationships between the latent variables, pairwise scatter plots of μ1, μ4, and μ5 under the two dose rates and particle sizes are provided in Supplementary Fig. S6.
The possible viscoelasticity effect observed at lower electron beam dose rates may originate from the higher extent of elastic interactions with the SiNx membrane of the liquid cell chamber. These interactions arise due to multiple forces on nanometer length scales, including van der Waals forces, electrostatic interactions, and adhesion to the membrane, which create interaction potential wells with binding stiffness k that locally trap or cage nanoparticles, opposing their motion and restoring them to their current positions56. A similar but stronger trend is observed for particle size, where smaller particles exhibit narrower distributions of μ4 and higher anticorrelation, suggesting a more localized trapping effect. In contrast, larger particles display a broader range of μ4 values, indicating weaker caging effects and a wider distribution of elastic interactions. These observations point to a plausible relationship between particle size, dose rate, and viscoelasticity: the extent of caging and trapping is modulated both by the size of the particle and by the dose rate of the electron beam, with smaller particles and lower dose rates experiencing more pronounced viscoelastic effects.
Overall, these results demonstrate how LEONARDO’s latent variables z1, z4, and z5 capture the effects of experimental parameters, such as electron beam dose rate and particle size, on the underlying statistical properties of trajectories, which are potentially related to the interaction energy landscape of the liquid cell environment. High electron beam dose rates are associated with greater non-Gaussianity and, therefore, increased heterogeneity, regardless of particle size. In contrast, smaller particles and lower electron beam dose rates exhibit stronger anticorrelation, indicative of enhanced viscoelastic effects within the liquid cell environment.
In summary, we reported the development of a new data-driven model for learning the stochastic motion of nanoparticles in LPTEM experiments: a transformer-based VAE named LEONARDO. The novelty of LEONARDO lies in a customized loss function informed by the physics of nanoparticle trajectories and an attention-based transformer that enables learning the time dependencies of particle trajectories. LEONARDO was trained on a large dataset of trajectories from LPTEM experiments collected across a range of electron beam dose rates and for short and long nanorods. The model captures the physics related to the interaction energy landscape surrounding the particle by probabilistically mapping the data into a low-dimensional space. We demonstrated that LEONARDO identifies characteristics of the interaction energy landscape in LPTEM by learning statistical properties associated with the energy landscape via three of the latent variables.
By applying LEONARDO to an unseen set of experimental LPTEM trajectories collected at varying electron beam dose rates and for nanoparticles of different sizes, we report that both dose rate and particle size jointly modulate key statistical features, such as non-Gaussianity and anticorrelation, which are indicative of heterogeneity and viscoelasticity in the liquid cell environment. These results highlight LEONARDO’s ability to uncover statistical and physical insights into the stochastic motion of nanoparticles and its potential as a tool for analyzing complex particle-environment interactions. LEONARDO is capable of producing single particle trajectories, which can serve as a simulator model for LPTEM experiments, offering a cost-effective alternative to acquiring experimental data. We envision that such synthetic data can be used to train models for automating electron microscope-based workflows in the future.
Methods
Experimental method
Liquid phase TEM silicon nitride chip preparation
Microfabricated silicon nitride chips from Protochips Inc. with 550 μm × 50 μm and 550 μm × 20 μm windows and with 50 nm and 150 nm spacers were used for encapsulating 0.5 microliters of the gold nanorod solutions. The chips were first cleaned by immersing them in an acetone bath, followed by dipping them into an ethanol bath, and subsequently dried by blowing nitrogen gas parallel to the surface. The surface of the chips was then made hydrophilic using easiGlow, the glow discharge cleaning system by PELCO, to ensure that the liquid solution spreads evenly across the surface of the chip. The settings used for easiGlow were 0.39 mBar pressure and a glow discharge time of 45 seconds. The chips were then assembled inside the Poseidon Select holder from Protochips before imaging on the TEM.
TEM imaging
The LPTEM experiments were performed on the FEI Tecnai F30 TEM of the Materials Characterization Facility of the Institute for Matter and Systems of the Georgia Institute of Technology operating at 200 kV. The gold nanorods were visualized using the Gatan OneView in situ camera, and videos of nanorod motion were recorded using the Digital Micrograph software. The electron beam dose rate used for each experiment was calibrated for camera magnifications of 19.5 kx, 25 kx, and 29.5 kx using a custom-built Digital Micrograph script. Various dose rates of 2 to 60 e−/Å2 ⋅ s were used to capture a range of different diffusion regimes of the gold nanorods. Videos were recorded at various camera exposures from 0.005 s to 0.1 s, corresponding to frame rates of 200 to 10 frames per second. Camera resolutions of 512 × 512, 1024 × 1024, and 2048 × 2048 pixels were used.
Processing of in situ videos
The in situ videos (time series of frames) collected from the LPTEM experiments were processed to extract the positions of single particles. First, a custom MATLAB script was employed to extract a region of interest (ROI) from each video frame, which contained the complete length of a single particle trajectory. Subsequently, the series of ROIs from each video were processed using a thresholding-based algorithm to obtain the x and y coordinates of the centroid of each nanorod in each frame. Details on further processing of trajectories to generate the training dataset can be found in the Supplementary Information.
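The thresholding-based centroid step can be illustrated with a minimal numpy sketch; the frame, threshold value, and helper name are hypothetical (the actual pipeline used custom MATLAB scripts for ROI extraction):

```python
import numpy as np

def centroid_from_frame(frame, thresh):
    """Thresholding-based centroid: nanorods appear dark in bright-field TEM,
    so take all pixels below `thresh` and average their pixel coordinates."""
    ys, xs = np.nonzero(frame < thresh)
    return float(xs.mean()), float(ys.mean())

# Mock 64x64 frame with a dark rectangular "nanorod".
frame = np.full((64, 64), 200.0)
frame[20:30, 40:50] = 10.0
x_c, y_c = centroid_from_frame(frame, thresh=100.0)  # -> (44.5, 24.5)
```

Applying this per frame yields the (x, y) centroid time series that forms a trajectory.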
LEONARDO model
LEONARDO (Learning Electron microscopy Of NAnopaRticle Diffusion via an attention netwOrk) is a transformer-based variational autoencoder whose encoder and decoder blocks use multi-headed attention mechanisms to learn the time-dependent dynamics of the trajectories. In doing so, LEONARDO maps a 200-frame input trajectory, r = (r1, r2, ⋯ , rT=200), where rt = (xt, yt) represents the position vector of the nanoparticle at time t, with xt and yt denoting the x and y coordinates of the particle’s position, respectively, to a continuous latent representation z = (z1, ⋯ , z12), whose components are the latent variables of the model. Given z, the decoder generates an output trajectory r̂ = (r̂1, ⋯ , r̂T). The attention block in the encoder uses 8 heads and 2 layers with an embedding size of 128. The output of the attention block goes into a convolutional encoder, which consists of two convolutional layers with kernel sizes of 7 and 2, respectively. Following the convolutional layers, a dense layer feeds the output into the latent space with a dimension of 12. The latent space is then upsampled using a dense layer and a transpose convolutional layer. The output from the transpose convolutional layer is input into a multi-headed attention block in the decoder before being reshaped into the final output trajectory using a convolutional layer, which serves to preserve temporal correlations in the trajectories.
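The overall encoder-latent-decoder flow can be sketched in PyTorch. The dimensions (8 heads, 2 transformer layers, embedding size 128, latent dimension 12, 200-frame inputs) follow the text, but the exact layer arrangement here is an illustrative guess, not the published implementation:

```python
import torch
import torch.nn as nn

class TinyTrajVAE(nn.Module):
    """Illustrative transformer VAE for (T=200, 2) trajectories. Hyperparameters
    follow the text; the dense decoder here stands in for the transpose-conv /
    attention decoder of the actual model (an assumption for brevity)."""
    def __init__(self, T=200, d_model=128, latent=12):
        super().__init__()
        self.embed = nn.Linear(2, d_model)                    # (x, y) -> d_model
        layer = nn.TransformerEncoderLayer(d_model, nhead=8, batch_first=True)
        self.attn = nn.TransformerEncoder(layer, num_layers=2)
        self.to_mu = nn.Linear(T * d_model, latent)
        self.to_logvar = nn.Linear(T * d_model, latent)
        self.decode = nn.Sequential(
            nn.Linear(latent, T * 32), nn.GELU(),
            nn.Unflatten(1, (T, 32)), nn.Linear(32, 2))       # back to (T, 2)

    def forward(self, r):
        h = self.attn(self.embed(r)).flatten(1)
        mu, logvar = self.to_mu(h), self.to_logvar(h)
        z = mu + torch.exp(0.5 * logvar) * torch.randn_like(mu)  # reparameterize
        return self.decode(z), mu, logvar

model = TinyTrajVAE()
out, mu, logvar = model(torch.randn(4, 200, 2))   # batch of 4 trajectories
```

Sampling z from a standard normal and calling `model.decode(z)` generates synthetic trajectories, mirroring how LEONARDO is used as a simulator.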
The training dataset consisted of 38,279 LPTEM trajectories, the validation dataset consisted of 3202 LPTEM trajectories, while the test dataset included 5934 LPTEM trajectories. The model was trained for 200 epochs, during which the validation set was used to tune hyperparameters, such as determining the optimal combination of loss function components based on model performance metrics. The final model performance metrics were reported using the test dataset.
Figure S1 of the Supplementary Information shows the plots of training and validation losses, and Figure S3 presents a detailed architecture of the LEONARDO model.
Fréchet Distance
The Fréchet Distance, d, measures the mean and covariance differences between two distributions using the formula:
1 | d² = ‖μr − μg‖² + Tr(Σr + Σg − 2(ΣrΣg)^(1/2))
where μr and Σr are the mean and covariance of the real data features, and μg and Σg are the mean and covariance of the generated data features, with lower scores indicating greater similarity between the two distributions.
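A minimal implementation of this distance, assuming Gaussian statistics of the feature sets (scipy's `sqrtm` computes the matrix square root; the function name is ours):

```python
import numpy as np
from scipy.linalg import sqrtm

def frechet_distance(feat_real, feat_gen):
    """Frechet distance between Gaussian fits to two feature sets (rows = samples)."""
    mu_r, mu_g = feat_real.mean(axis=0), feat_gen.mean(axis=0)
    sig_r = np.cov(feat_real, rowvar=False)
    sig_g = np.cov(feat_gen, rowvar=False)
    covmean = sqrtm(sig_r @ sig_g)
    if np.iscomplexobj(covmean):           # discard tiny imaginary parts from sqrtm
        covmean = covmean.real
    d2 = np.sum((mu_r - mu_g) ** 2) + np.trace(sig_r + sig_g - 2.0 * covmean)
    return float(np.sqrt(max(d2, 0.0)))

rng = np.random.default_rng(1)
a = rng.normal(size=(2000, 3))
d_same = frechet_distance(a, a)            # identical sets -> distance ~ 0
d_shift = frechet_distance(a, a + 5.0)     # mean shift of 5 per dimension
```

Shifting every feature by 5 changes only the mean term, so the distance approaches √(3 · 25) ≈ 8.66.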
Theoretical stochastic diffusion processes and their simulation
To train MoNet2.0, we generated a diverse dataset comprising six stochastic diffusion processes: Brownian Motion (BM), Fractional Brownian Motion (FBM), Continuous Time Random Walk (CTRW), Annealed Transient Time Motion (ATTM), Scaled Brownian Motion (SBM), and Lévy Walk (LW), which are subsequently normalized according to equation S1 in the Supplementary Information. Below, we provide a brief theoretical description for each stochastic process and describe the methods used to simulate each process. For full theoretical details of simulating the anomalous diffusion processes (FBM, CTRW, ATTM, SBM, and LW), readers are referred to the Anomalous Diffusion (AnDi) Challenge and the corresponding article33.
Brownian motion
Brownian motion describes a purely random process, where the particles undergo thermal motion. The fundamental equation for Brownian Motion is:
2 | ∂P(x, t)/∂t = D ∂²P(x, t)/∂x²
where P(x, t) is the probability density of finding the particle at position x at time t, and D is the diffusion coefficient that is characteristic of the particle and its surrounding environment (e.g., particle geometry and temperature). Solving this equation with an initial condition of x = 0 at t = 0 (i.e., Δxi = xi) with unbounded x and t results in:
3 | P(x, t) = (4πDt)^(−1/2) exp(−x²/(4Dt))
The first moment (mean) of this distribution is zero, 〈x(t)〉 = 0, and its second moment, i.e., the variance 〈x²(t)〉, which is also the ensemble-averaged mean squared displacement (e-MSD) of trajectories drawn from this distribution, has the form:
4 | 〈x²(t)〉 = 2Dt
Trajectories were generated as discrete realizations of Brownian motion by summing Gaussian-distributed random displacements at each time step. For each trajectory, the displacements were sampled from a normal distribution with a zero mean and a variance proportional to the time step. The simulation iteratively updates the position as:
5 | xi+1 = xi + ηx,i,  yi+1 = yi + ηy,i,  with ηx,i, ηy,i ∼ N(0, 2DΔt)
where xi and yi are the current positions, and the displacements are sampled independently for x and y directions.
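This update rule amounts to a cumulative sum of Gaussian steps. A minimal sketch with an e-MSD sanity check (function name and parameter values are illustrative):

```python
import numpy as np

def simulate_bm(T=200, D=1.0, dt=1.0, seed=0):
    """2D Brownian trajectory: cumulative sum of N(0, 2*D*dt) displacements
    sampled independently for the x and y directions."""
    rng = np.random.default_rng(seed)
    steps = rng.normal(0.0, np.sqrt(2.0 * D * dt), size=(T, 2))
    return np.cumsum(steps, axis=0)

traj = simulate_bm(T=50_000)
# e-MSD at lag 1: expect 2*D*dt per component, i.e. 4 in 2D for D = dt = 1.
msd1 = np.mean(np.sum((traj[1:] - traj[:-1]) ** 2, axis=1))
```

The lag-1 MSD estimate converges to 4DΔt for a 2D walk, matching 〈x²(t)〉 = 2Dt per component.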
Fractional Brownian motion
FBM generalizes Brownian motion by introducing memory effects in the step increments, characterized by the Hurst exponent H ∈ (0, 1). The mean squared displacement (MSD) for particle position x scales as:
6 | 〈x²(t)〉 ∼ t^α,  with α = 2H
where α < 1 corresponds to subdiffusion and α > 1 to superdiffusion. FBM trajectories are generated using fractional Gaussian noise for both x and y directions. The process ensures that the increments Δx and Δy are correlated in time according to H:
7 | xi+1 = xi + ηx,i(H),  yi+1 = yi + ηy,i(H)
where ηx,i(H) and ηy,i(H) are fractional Gaussian noise processes with memory effects determined by H. To train MoNet2.0, we sampled values of α (or equivalently 2H) uniformly in the subdiffusive range of [0.1, 1].
Continuous time random walk
CTRW introduces waiting times between particle steps, where the waiting times τ follow a power-law distribution ψ(τ):
8 | ψ(τ) ∼ τ^(−1−α),  0 < α < 1
This results in subdiffusive behavior, as particles remain trapped in localized regions for long periods before stepping. To simulate CTRW trajectories, waiting times are drawn from the power-law distribution. After each waiting time, the particle position is updated with independent Gaussian random steps in both the x- and y-directions. The cumulative positions are then regularized to equally spaced time intervals. Specifically, the particle positions at time frame i + 1 are iteratively updated as:
9 | xi+1 = xi + ηx,i,  yi+1 = yi + ηy,i
where ηx,i and ηy,i are independent Gaussian-distributed random steps applied after each waiting time. The resulting trajectories exhibit subdiffusion due to the broad distribution of waiting times. To train MoNet2.0, we sampled the values of α uniformly in the range of [0.1, 1].
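A compact sketch of this scheme (the minimum one-frame wait and the use of a Pareto-tailed sampler are discretization assumptions on our part):

```python
import numpy as np

def simulate_ctrw(T=200, alpha=0.7, seed=0):
    """2D CTRW: power-law waiting times (tail ~ tau^-(1+alpha)), Gaussian jumps,
    then regularization onto an equally spaced time grid."""
    rng = np.random.default_rng(seed)
    t = 0.0
    times, positions = [0.0], [np.zeros(2)]
    while t < T:
        t += rng.pareto(alpha) + 1.0          # wait at least one frame
        positions.append(positions[-1] + rng.normal(size=2))
        times.append(t)
    # Piecewise-constant trajectory on the grid: hold the last visited position.
    idx = np.searchsorted(times, np.arange(T), side="right") - 1
    return np.asarray(positions)[idx]

traj = simulate_ctrw()
```

The regularization step is what produces the long flat stretches (trapping) characteristic of subdiffusive CTRW trajectories.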
Lévy walk
Lévy walks are stochastic processes that introduce long flights of constant velocity. The MSD scales as:
10 | 〈x²(t)〉 ∼ t^α
Here, α controls the distribution of step durations, leading to superdiffusive behavior when 1 < α < 2. To simulate Lévy walk trajectories, the step durations τi are drawn from a power-law distribution:
11 | ψ(τ) ∝ τ^(−1−σ)
with the exponent defined by σ = 3 − α (using a random σ when α = 2). For each flight, a constant velocity, v, is sampled and a single flight direction, θ, is drawn uniformly from [0, 2π); the flight is discretized into an integer number of time steps, during which the particle positions are updated as
12 | xi+1 = xi + v cos(θ)Δt,  yi+1 = yi + v sin(θ)Δt
To train MoNet2.0, values of α were sampled uniformly from [1, 2].
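A minimal sketch of this scheme; drawing flight durations with a Pareto tail of exponent −(1 + σ) is one concrete sampling choice consistent with σ = 3 − α:

```python
import numpy as np

def simulate_levy_walk(T=200, alpha=1.5, v=1.0, dt=1.0, seed=0):
    """2D Levy walk: power-law flight durations, constant speed v, and a
    single fixed direction within each flight."""
    rng = np.random.default_rng(seed)
    sigma = 3.0 - alpha
    traj = [np.zeros(2)]
    while len(traj) < T:
        n_steps = int(rng.pareto(sigma)) + 1          # flight length in frames
        theta = rng.uniform(0.0, 2.0 * np.pi)         # one direction per flight
        step = v * dt * np.array([np.cos(theta), np.sin(theta)])
        for _ in range(n_steps):
            traj.append(traj[-1] + step)
    return np.asarray(traj[:T])

traj = simulate_levy_walk()
```

Unlike CTRW, every frame moves the particle by exactly v·Δt; superdiffusion comes from the long, directionally persistent flights.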
Annealed transient time model
ATTM describes a process in which the diffusion coefficient D fluctuates over discrete time intervals, leading to subdiffusive dynamics. For a given Di, sampled from a power-law distribution P(D) ∝ D^(σ−1), the duration of each diffusive interval is defined as
13 | τi = Di^(−γ)
and the particle position is updated as
14 | xi+1 = xi + √(2DiΔt) ηx,i,  yi+1 = yi + √(2DiΔt) ηy,i
where ηx,i and ηy,i are independent standard Gaussian random variables. To train MoNet2.0, values of α were sampled uniformly from [0.1, 1].
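A sketch under one common ATTM prescription, which is an assumption on our part (D drawn from P(D) ~ D^(sigma−1) on (0, 1], regime duration τ = D^(−γ), anomalous exponent α = σ/γ), not necessarily the authors' exact scheme:

```python
import numpy as np

def simulate_attm(T=200, sigma=0.7, gamma=1.0, seed=0):
    """2D ATTM sketch: piecewise-constant D with power-law-linked durations.
    With gamma = 1, alpha = sigma (here 0.7, subdiffusive)."""
    rng = np.random.default_rng(seed)
    traj = [np.zeros(2)]
    while len(traj) < T:
        D = rng.random() ** (1.0 / sigma)             # inverse-CDF sample of P(D)
        tau = min(int(D ** (-gamma)) + 1, T)          # regime duration in frames
        for _ in range(tau):
            traj.append(traj[-1] + np.sqrt(2.0 * D) * rng.normal(size=2))
    return np.asarray(traj[:T])

traj = simulate_attm()
```

Slow regimes (small D) last longer than fast ones, which is what produces the subdiffusive scaling.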
Scaled Brownian motion
SBM describes nonstationary diffusion in which the diffusion coefficient varies with time as
15 | D(t) = αD0t^(α−1)
so that the mean squared displacement scales as
16 | 〈x²(t)〉 = 2D0t^α
To simulate SBM, the variance of the particle position at time t is given by σ²t^α; thus, if the variance at time step i is σ²i^α, the position is updated as
17 | xi+1 = xi + σ√((i + 1)^α − i^α) ηx,i,  yi+1 = yi + σ√((i + 1)^α − i^α) ηy,i
where ηx,i and ηy,i are independent standard Gaussian random variables. To train MoNet2.0, values of α were sampled uniformly from [0.1, 1].
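The update rule above can be sketched and checked against the expected variance scaling (function name and parameter defaults are illustrative):

```python
import numpy as np

def simulate_sbm(T=200, alpha=0.5, sigma=1.0, seed=0):
    """2D scaled Brownian motion: position variance sigma^2 * i^alpha at frame i,
    so increment i has variance sigma^2 * ((i+1)^alpha - i^alpha)."""
    rng = np.random.default_rng(seed)
    i = np.arange(T, dtype=float)
    inc_std = sigma * np.sqrt((i + 1.0) ** alpha - i ** alpha)
    return np.cumsum(rng.normal(size=(T, 2)) * inc_std[:, None], axis=0)

# Ensemble check: Var[x] after 100 steps should be ~ sigma^2 * 100^alpha = 10.
xs = np.array([simulate_sbm(seed=s)[99, 0] for s in range(600)])
```

The per-step variances telescope, so the cumulative variance follows σ²t^α exactly, matching the MSD scaling above.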
Supplementary Fig. S5 shows the UMAP plot of the latent space of LEONARDO with AnDi simulated trajectories and experimental LPTEM trajectories encoded.
Calculation of coefficient of determination (R2)
To quantify the relationship between latent variables and statistical features, we used the coefficient of determination (R2). This metric evaluates how well a model fits the data by comparing the variance explained by the model to the total variance of the data. Specifically, R2 is calculated as:
18 | R² = 1 − SSres/SStot
where SSres is the sum of squared residuals:
19 | SSres = ∑i (yi − yfit,i)²
and SStot is the total sum of squares:
20 | SStot = ∑i (yi − 〈y〉)²
Here, yi represents the observed data, yfit,i is the predicted value from the second-degree polynomial fit, and 〈y〉 is the mean of the observed data. R2 values range from 0 to 1, with higher values indicating a stronger relationship between the variable of interest and the fit. This approach allows us to quantitatively assess the extent to which each latent variable captures the statistical properties of the trajectories in Section 2.3.
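A minimal sketch of this calculation applied to a toy quadratic latent-property relationship (the data below are synthetic, for illustration only):

```python
import numpy as np

def r_squared(y, y_fit):
    """Coefficient of determination R^2 = 1 - SS_res / SS_tot."""
    ss_res = np.sum((y - y_fit) ** 2)
    ss_tot = np.sum((y - np.mean(y)) ** 2)
    return 1.0 - ss_res / ss_tot

# Quadratic trend plus small noise, fit with a second-degree polynomial
# as in the latent-variable analysis.
z = np.linspace(-3.0, 3.0, 50)
y = 0.4 * z**2 - 0.1 * z + 0.05 * np.random.default_rng(2).normal(size=z.size)
r2 = r_squared(y, np.polyval(np.polyfit(z, y, deg=2), z))
```

Because the noise is small relative to the quadratic signal, the fit explains nearly all the variance and R² is close to 1.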
Calculation of F1 score
The F1 score is a metric that combines precision and recall as follows:
21 | F1 = 2 ⋅ (precision ⋅ recall)/(precision + recall)
where precision is given by:
22 | precision = TP/(TP + FP)
and recall is given by:
23 | recall = TP/(TP + FN)
Here, TP refers to the number of true positives, FP to false positives, and FN to false negatives. The final F1 score for MoNet2.0 was calculated by averaging the F1 scores across all diffusion classes as follows:
24 | F1 = ∑_{i=1}^{N} wi F1i
where N is the total number of classes, F1i is the F1 score for class i, and wi is the weight for class i, given by:
25 | wi = ni/∑_{j=1}^{N} nj
where ni is the number of trajectories in class i.
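These formulas combine into a short weighted-F1 helper; the two-class counts below are a toy example, not MoNet2.0's actual confusion matrix:

```python
import numpy as np

def weighted_f1(tp, fp, fn, counts):
    """Class-wise F1 combined with weights w_i = n_i / sum_j n_j."""
    tp, fp, fn = map(np.asarray, (tp, fp, fn))
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2.0 * precision * recall / (precision + recall)
    w = np.asarray(counts) / np.sum(counts)
    return float(np.sum(w * f1))

# Toy example: perfect class 0 (F1 = 1.0), imperfect class 1 (F1 = 0.8),
# equal class sizes -> weighted F1 = 0.9.
score = weighted_f1(tp=[10, 8], fp=[0, 2], fn=[0, 2], counts=[10, 10])
```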
Supplementary information
Source data
Acknowledgements
This research was supported by the NSF, Division of Chemical, Bioengineering, Environmental, and Transport Systems under award 2338466, the American Chemical Society Petroleum Research Fund under award 67239-DNI5, the Exponential Electronics seed grant of the Institute for Matter and Systems at Georgia Tech, and Georgia Institute of Technology start-up funds. N.G. acknowledges the support from the President’s Undergraduate Research Award (PURA) at Georgia Tech. The authors thank Dr. Amirali Aghazadeh, Dr. Cory Hargus, and Daniel Saeedi for insightful discussions. The authors thank Dr. Nasrin Houshmand and the Laser Dynamics Laboratory and Dr. Brettmann and her group at Georgia Tech for allowing us to use their wet lab space for part of our nanoparticle synthesis. The authors also thank Dr. Yong Ding, Dr. Ben Miller, and Mr. Thomas Zhang for their help in implementing our dose rate control code with the Oneview camera at Georgia Tech. The authors acknowledge the support of the Material Characterization Facility and the Electron Microscopy Facility of the Institute for Matter and Systems at Georgia Tech, a member of the National Nanotechnology Coordinated Infrastructure (NNCI), which is supported by the National Science Foundation (ECCS-2025462).
Author contributions
Z.S. and V.J. conceived the project and designed the experiments and the neural network architecture. Z.S. collected the data, developed the model, and wrote and implemented the code. N.G. implemented the image analysis code. P.A.N. synthesized and characterized nanoparticles. Z.S., N.G. and V.J. analyzed the data and generated the figures. Z.S., N.G., P.A.N. and V.J. wrote the manuscript. V.J. supervised the project.
Peer review
Peer review information
Nature Communications thanks Gorka Munoz-Gil, and the other, anonymous, reviewer for their contribution to the peer review of this work. A peer review file is available.
Data availability
The gold nanoparticle trajectory data used in this study for training, validating, and testing the LEONARDO model, as well as the experimental datasets used in the electron beam dose and size study (Section 2.4), are deposited in the HuggingFace repository under accession code 10.57967/hf/5786. Source data are provided with this paper.
Code availability
The LEONARDO and MoNet2.0 source code is available on GitHub at https://github.com/JamaliLab/LEONARDO and can be accessed and referenced via Zenodo at 10.5281/zenodo.15708218 (ref. 57). The trained LEONARDO model is available on HuggingFace at 10.57967/hf/5787.
Competing interests
The authors declare no competing interests.
Footnotes
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Supplementary information
The online version contains supplementary material available at 10.1038/s41467-025-61632-1.
References
- 1. Rose, K. A. et al. Shape Anisotropy Enhances Nanoparticle Dynamics in Nearly Homogeneous Hydrogels. Macromolecules 55, 8514–8523 (2022).
- 2. Barkai, E., Garini, Y. & Metzler, R. Strange kinetics of single molecules in living cells. Physics Today 65, 29–35 (2012).
- 3. Mazaheri, M., Ehrig, J., Shkarin, A., Zaburdaev, V. & Sandoghdar, V. Ultrahigh-speed imaging of rotational diffusion on a lipid bilayer. Nano Letters 20, 7213–7219 (2020).
- 4. Moringo, N. A. et al. A mechanistic examination of salting out in protein-polymer membrane interactions. Proceedings of the National Academy of Sciences 116, 22938–22945 (2019).
- 5. Tabaei, S. R., Gillissen, J. J. J., Vafaei, S., Groves, J. T. & Cho, N.-J. Size-dependent, stochastic nature of lipid exchange between nano-vesicles and model membranes. Nanoscale 8, 13513–13520 (2016).
- 6. Jamali, V. et al. Anomalous nanoparticle surface diffusion in LCTEM is revealed by deep learning-assisted analysis. Proceedings of the National Academy of Sciences 118, e2017616118 (2021).
- 7. Jamali, V. & Alivisatos, A. P. Studying diffusion of colloidal nanoparticles in solution using liquid phase TEM and machine learning. Microscopy and Microanalysis 28, 142–143 (2022).
- 8. Chee, S. W., Anand, U., Bisht, G., Tan, S. F. & Mirsaidov, U. Direct Observations of the Rotation and Translation of Anisotropic Nanoparticles Adsorbed at a Liquid-Solid Interface. Nano Letters 19, 2871–2878 (2019).
- 9. Woehl, T. J. & Prozorov, T. The Mechanisms for Nanoparticle Surface Diffusion and Chain Self-Assembly Determined from Real-Time Nanoscale Kinetics in Liquid. The Journal of Physical Chemistry C 119, 21261–21269 (2015).
- 10. Bakalis, E. et al. Complex Nanoparticle Diffusional Motion in Liquid-Cell Transmission Electron Microscopy. The Journal of Physical Chemistry C 124, 14881–14890 (2020).
- 11. Zheng, H., Claridge, S. A., Minor, A. M., Alivisatos, A. P. & Dahmen, U. Nanocrystal diffusion in a liquid thin film observed by in situ transmission electron microscopy. Nano Letters 9, 2460–2465 (2009).
- 12. Yesibolati, M. N. et al. Unhindered Brownian Motion of Individual Nanoparticles in Liquid-Phase Scanning Transmission Electron Microscopy. Nano Letters 20, 7108–7115 (2020).
- 13. Verch, A., Pfaff, M. & de Jonge, N. Exceptionally slow movement of gold nanoparticles at a solid/liquid interface investigated by scanning transmission electron microscopy. Langmuir 31, 6956–6964 (2015).
- 14. Chen, Q. et al. Interaction potentials of anisotropic nanocrystals from the trajectory sampling of particle motion using in situ liquid phase transmission electron microscopy. ACS Central Science 1, 33–39 (2015).
- 15. Ross, F. M. Opportunities and challenges in liquid cell electron microscopy. Science 350, aaa9886 (2015).
- 16. Alcorn, F. M., Jain, P. K. & van der Veen, R. M. Time-resolved transmission electron microscopy for nanoscale chemical dynamics. Nature Reviews Chemistry 7, 256–272 (2023).
- 17. Cho, H., Moreno-Hernandez, I. A., Jamali, V., Oh, M. H. & Alivisatos, A. P. In situ quantification of interactions between charged nanorods in a predefined potential energy landscape. Nano Letters 21, 628–633 (2020).
- 18. Einstein, A. On the movement of small particles suspended in stationary liquids required by the molecular-kinetic theory of heat. Annalen der Physik 17, 549–560 (1905).
- 19. Mandelbrot, B. B. & Van Ness, J. W. Fractional Brownian Motions, Fractional Noises and Applications. SIAM Review 10, 422–437 (1968).
- 20. Scher, H. & Montroll, E. W. Anomalous transit-time dispersion in amorphous solids. Physical Review B 12, 2455–2477 (1975).
- 21. Weigel, A. V., Simon, B., Tamkun, M. M. & Krapf, D. Ergodic and nonergodic processes coexist in the plasma membrane as observed by single-molecule tracking. Proceedings of the National Academy of Sciences 108, 6438–6443 (2011).
- 22. Sarfati, R. & Schwartz, D. K. Temporally Anticorrelated Subdiffusion in Water Nanofilms on Silica Suggests Near-Surface Viscoelasticity. ACS Nano 14, 3041–3047 (2020).
- 23. Wang, B., Anthony, S. M., Bae, S. C. & Granick, S. Anomalous yet Brownian. Proceedings of the National Academy of Sciences 106, 15160–15164 (2009).
- 24. Wang, B., Kuo, J., Bae, S. C. & Granick, S. When Brownian diffusion is not Gaussian. Nature Materials 11, 481–485 (2012).
- 25. Kang, S. et al. Real-space imaging of nanoparticle transport and interaction dynamics by graphene liquid cell TEM. Science Advances 7, eabi5419 (2021).
- 26. Vitali, S. et al. Langevin equation in complex media and anomalous diffusion. Journal of the Royal Society Interface 15, 20180282 (2018).
- 27. Zwanzig, R. Nonequilibrium Statistical Mechanics (Oxford University Press, 2001).
- 28. Kubo, R. The fluctuation-dissipation theorem. Reports on Progress in Physics 29, 306 (1966).
- 29. Lei, H., Baker, N. A. & Li, X. Data-driven parameterization of the generalized Langevin equation. Proceedings of the National Academy of Sciences 113, 14183–14188 (2016).
- 30. Berkowitz, M., Morgan, J. D., Kouri, D. J. & McCammon, J. A. Memory kernels from molecular dynamics. The Journal of Chemical Physics 75, 2462–2463 (1981).
- 31. Fricks, J., Yao, L., Elston, T. C. & Forest, M. G. Time-Domain Methods for Diffusive Transport in Soft Matter. SIAM Journal on Applied Mathematics 69, 1277–1308 (2009).
- 32. Chen, M., Li, X. & Liu, C. Computation of the memory functions in the generalized Langevin models for collective dynamics of macromolecules. The Journal of Chemical Physics 141, 064112 (2014).
- 33. Muñoz-Gil, G. et al. Objective comparison of methods to decode anomalous diffusion. Nature Communications 12, 6253 (2021).
- 34. Cichos, F., Gustavsson, K., Mehlig, B. & Volpe, G. Machine learning for active matter. Nature Machine Intelligence 2, 94–103 (2020).
- 35. Granik, N. et al. Single-Particle Diffusion Characterization by Deep Learning. Biophysical Journal 117, 185–192 (2019).
- 36. Muñoz-Gil, G., Garcia-March, M. A., Manzo, C., Martín-Guerrero, J. D. & Lewenstein, M. Single trajectory characterization via machine learning. New Journal of Physics 22, 013010 (2020).
- 37. Kowalek, P., Loch-Olszewska, H. & Szwabiński, J. Classification of diffusion modes in single-particle tracking data: Feature-based versus deep-learning approach. Physical Review E 100, 032410 (2019).
- 38. Bo, S., Schmidt, F., Eichhorn, R. & Volpe, G. Measurement of anomalous diffusion using recurrent neural networks. Physical Review E 100, 010102 (2019).
- 39. Requena, B. et al. Inferring pointwise diffusion properties of single trajectories with deep learning. Biophysical Journal 122, 4360–4369 (2023).
- 40. Muñoz-Gil, G., Guigo i Corominas, G. & Lewenstein, M. Unsupervised learning of anomalous diffusion data: an anomaly detection approach. Journal of Physics A: Mathematical and Theoretical 54, 504001 (2021).
- 41. Fernández-Fernández, G., Manzo, C., Lewenstein, M. & Dauphin, A. Learning minimal representations of stochastic processes with variational autoencoders. Physical Review E 110, L012102 (2024).
- 42. Kingma, D. P. & Welling, M. Auto-encoding variational Bayes. arXiv preprint arXiv:1312.6114 (2013).
- 43. Vaswani, A. et al. Attention is all you need. Advances in Neural Information Processing Systems 30 (2017).
- 44. Zhao, K., Ding, H., Ye, K. & Cui, X. A Transformer-Based Hierarchical Variational AutoEncoder Combined Hidden Markov Model for Long Text Generation. Entropy 23, 1277 (2021).
- 45. Wang, T. & Wan, X. T-CVAE: Transformer-Based Conditioned Variational Autoencoder for Story Completion. Proceedings of the Twenty-Eighth International Joint Conference on Artificial Intelligence, 5233–5239 (2019).
- 46. Dollar, O., Joshi, N., Beck, D. A. C. & Pfaendtner, J. Attention-based generative models for de novo molecular design. Chemical Science 12, 8362–8372 (2021).
- 47. Jiang, J., Xia, G. G., Carlton, D. B., Anderson, C. N. & Miyakawa, R. H. Transformer VAE: A Hierarchical Model for Structure-Aware and Interpretable Music Representation Learning. ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 516–520 (2020).
- 48. Spurgeon, S. R. et al. Towards data-driven next-generation transmission electron microscopy. Nature Materials 20, 274–279 (2021).
- 49. Karniadakis, G. E. et al. Physics-informed machine learning. Nature Reviews Physics 3, 422–440 (2021).
- 50. Metzler, R., Jeon, J.-H., Cherstvy, A. G. & Barkai, E. Anomalous diffusion models and their properties: non-stationarity, non-ergodicity, and ageing at the centenary of single particle tracking. Physical Chemistry Chemical Physics 16, 24128–24164 (2014).
- 51. Gentili, A. & Volpe, G. Characterization of anomalous diffusion classical statistics powered by deep learning (CONDOR). Journal of Physics A: Mathematical and Theoretical 54, 314003 (2021).
- 52. Heusel, M., Ramsauer, H., Unterthiner, T., Nessler, B. & Hochreiter, S. GANs trained by a two time-scale update rule converge to a local nash equilibrium. Advances in Neural Information Processing Systems 30 (2017).
- 53. Szegedy, C., Vanhoucke, V., Ioffe, S., Shlens, J. & Wojna, Z. Rethinking the inception architecture for computer vision. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2818–2826 (2016).
- 54. Lushnikov, P. M., Šulc, P. & Turitsyn, K. S. Non-Gaussianity in single-particle tracking: Use of kurtosis to learn the characteristics of a cage-type potential. Physical Review E 85, 051905 (2012).
- 55. Höfling, F. & Franosch, T. Anomalous transport in the crowded world of biological cells. Reports on Progress in Physics 76, 046602 (2013).
- 56. Goychuk, I. Viscoelastic subdiffusion: From anomalous to normal. Physical Review E 80, 046125 (2009).
- 57. Shabeeb, Z. & Jamali, V. Learning the diffusion of nanoparticles in liquid phase TEM via physics-informed generative AI. Zenodo https://zenodo.org/records/15708218 (2025).
Data Availability Statement
The gold nanoparticle trajectory data used in this study for training, validating, and testing the LEONARDO model, as well as the experimental datasets used in the electron beam dose and size study (Section 2.4), are deposited in the HuggingFace repository under accession code 10.57967/hf/5786. Source data are provided with this paper.
The LEONARDO and MoNet2.0 source code is available on GitHub at https://github.com/JamaliLab/LEONARDO and can be accessed and referenced via Zenodo at 10.5281/zenodo.15708218 (ref. 57). The trained LEONARDO model is available on HuggingFace at 10.57967/hf/5787.