Abstract
Fluorescence time traces are used to report on dynamical properties of molecules. The basic unit of information in these traces is the arrival time of individual photons, which carry instantaneous information from the molecule, from which they are emitted, to the detector on timescales as fast as microseconds. Thus, it is theoretically possible to monitor molecular dynamics at such timescales from traces containing only a sufficient number of photon arrivals. In practice, however, traces are stochastic and in order to deduce dynamical information through traditional means–such as fluorescence correlation spectroscopy (FCS) and related techniques–they are collected and temporally autocorrelated over several minutes. So far, it has been impossible to analyze dynamical properties of molecules on timescales approaching data acquisition without collecting long traces under the strong assumption of stationarity of the process under observation or assumptions required for the analytic derivation of a correlation function. To avoid these assumptions, we would otherwise need to estimate the instantaneous number of molecules emitting photons and their positions within the confocal volume. As the number of molecules in a typical experiment is unknown, this problem demands that we abandon the conventional analysis paradigm. Here, we exploit Bayesian nonparametrics that allow us to obtain, in a principled fashion, estimates of the same quantities as FCS but from the direct analysis of traces of photon arrivals that are significantly smaller in size, or total duration, than those required by FCS.
Keywords: Single photon detection, Spectroscopy, Confocal microscopy, FCS, Bayesian nonparametrics
I. INTRODUCTION
Methods to capture static molecular structures, such as super-resolution microscopy [36, 51, 67], provide only snapshots of life in time. Yet life is dynamical and obtaining a picture of life in action–one that captures diffraction-limited biomolecules as they move, assemble into and disassemble from larger bimolecular complexes–remains an important challenge [60]. In fact, the creative insights directly leading to fluorescence correlation spectroscopy (FCS) [30, 69]–and related methods such as FCS-FRET [84, 104] and FCCS [92]–have shown that deciphering dynamical information from molecules, often biomolecules, does not demand spatial resolution or spatial localization. Rather, the key is to inhomogeneously illuminate a sample over a small volume.
As fluorescently-labeled molecules diffuse across this inhomogeneously illuminated volume, they emit photons (i.e., they fluoresce) in a way that is proportional to the illumination at their respective locations [57]. Single photon detectors, often photo-multiplier tubes or avalanche photodiodes, are then used to record these photons. In principle, with the appropriate electronics, photons can be recorded within μs-ms. This suggests that information on the molecules’ motion could be drawn from the data on fast timescales that approach data acquisition, i.e. no more than a few μs-ms.
The fundamental quantities measured in a confocal optical setup are individual photon arrival times, from which photon inter-arrival times, i.e., the intervals between adjacent photon arrivals, can be readily obtained [59]. When imaging molecules fixed in space and under homogeneous (uniform) illumination, these inter-arrival times–excluding other experimental and label artifacts such as detector noise, background photons, and label photo-physical kinetics–are independent and identically distributed and so uncorrelated with each other. However, inter-arrival times measured in conventional confocal experiments encode the number of molecules in the vicinity of the confocal volume, their diffusion dynamics, their position with respect to the confocal center in addition to an array of experiment specific artifacts such as detector characteristics and label photokinetics. Consequently, inter-arrival times are correlated with each other and, in principle, these correlations can be exploited to characterize the dynamics of the underlying molecular system.
Thus far, correlations in the inter-arrival times are exploited by collecting photons over long periods [85] and temporally autocorrelating the resulting fluorescence intensity measurements [15, 30, 35, 69]. For sufficiently long intensity traces, the stochasticity in the number of labeled molecules contributing photons, as well as their positions in the illuminated volume and their instantaneous photon emission rates, are averaged out. As such, the mathematical expression for the fluorescence intensity time-autocorrelation function takes a simple form that–under strong assumptions on the illuminated volume’s geometry and the molecules’ photon emission rate–can be summarized in analytic formulas that are fitted on the acquired measurements.
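As a minimal sketch of the conventional pipeline described above, the snippet below bins photon arrival times into an intensity trace and computes a normalized autocorrelation, G(lag) = ⟨δI(t) δI(t + lag)⟩ / ⟨I⟩². The function names, the normalization, and the bin handling are illustrative assumptions and are not taken from the authors' implementation.

```python
def bin_arrivals(arrival_times, bin_width, duration):
    """Downsample photon arrival times (s) into photon counts per bin."""
    n_bins = int(round(duration / bin_width))
    counts = [0] * n_bins
    for t in arrival_times:
        i = int(t / bin_width)
        if 0 <= i < n_bins:
            counts[i] += 1
    return counts


def autocorrelation(counts, max_lag):
    """Normalized autocorrelation of a binned trace for lags 1..max_lag (in bins)."""
    n = len(counts)
    mean = sum(counts) / n
    if mean == 0:
        raise ValueError("trace contains no photons")
    g = []
    for lag in range(1, max_lag + 1):
        m = n - lag  # number of bin pairs separated by this lag
        acc = sum((counts[i] - mean) * (counts[i + lag] - mean) for i in range(m))
        g.append(acc / (m * mean * mean))
    return g
```

With few photons per bin, the estimate at each lag is dominated by counting noise, which is why long traces are needed before a fit becomes meaningful.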
However, despite the elegance and simplicity of the mathematics involved in the derivation of the time-autocorrelation function [15, 30, 35, 69], a critical limitation of autocorrelative methods, including all those within the FCS framework, remains the stark timescale separation between data collection (e.g., typical time between successive photon arrivals) and the timescale required to deduce a meaningful dynamical interpretation (e.g., typical duration between first and last photon arrivals used); see Fig. (1). A method that takes direct advantage of single photon arrivals, without using intensity traces (i.e., downsampled photon arrivals), has the potential to reveal dynamical information on timescales several orders of magnitude faster than traditional FCS analysis. As a result, rapid or non-equilibrium processes and, as such, abrupt changes in molecular chemistry, could be studied. Furthermore, provided such a method can utilize substantially shorter traces, the total duration of experiments can be shortened and the phototoxic damage induced on biological samples can be reduced substantially [67, 70, 83, 103]. This is especially relevant for in vivo FCS applications [28, 80, 93, 105].
FIG. 1. Photon arrival times can characterize dynamical properties of molecules on fast, photon-detection, timescales.
(A) Schematic of an illuminated confocal volume (blue) with fluorescent molecules emitting photons based on their location within that volume. (B) Synthetic trace containing ≈ 1500 photon arrivals produced by 4 molecules diffusing at 1 μm²/s for a total time of 30 ms under background and molecule photon emission rates of 10³ photons/s and 4 × 10⁴ photons/s, respectively. (C) Autocorrelation curve, G(τ), of the trace in (B), binned at 100 μs. On account of the limited data available in the trace, any reasonable fit is impossible. Normally, in FCS analysis, much longer traces are used to generate smoother G(τ) that are fitted to determine a diffusion coefficient. In Fig. A1 of the Appendix, we show that the quality of the fit does not improve considerably by fitting to a semi-logarithmic curve. (D) Comparison between diffusion coefficient estimates using our proposed method (detailed later) and FCS as a function of the number of photon arrivals in the analyzed trace. Since by 1.5 × 10⁴ photon arrivals our method has converged, we avoid analyzing larger traces.
Previously proposed methods to analyze single photon measurements [2, 4, 40, 41, 45, 75, 76, 82, 108] make assumptions that render them inappropriate for imaging molecules moving through inhomogeneously illuminated volumes [40]. For example, for the analysis of single molecule fluorescence resonance energy transfer (FRET), existing methods assume that the photon inter-arrival times reflect only biomolecular conformational transitions [40–42] but not diffusive motion of the entire biomolecule [26, 27, 39, 41], and so are appropriate only for experiments on immobilized molecules. Along the same lines, existing methods combine FRET with FCS [90] to quantify ns dynamics; however, they do not directly exploit single photon measurements. Rather, they operate on downsampled measurements, achieved through binning, similar to traditional FCS, and therefore inherit the same limitations and drawbacks.
To be able to use single photon arrival times to estimate the diffusion coefficient of labeled molecules in a confocal experiment, as in most biological applications, we must be able to determine the particular number of molecules responsible for the observed photon arrival time trace. Otherwise, naively, many molecules with low diffusion coefficients emitting photons at the periphery of the illuminated confocal volume could be mistaken for fewer molecules with higher diffusion coefficients in the center region which is most illuminated. As we illustrate in Fig. (2), misidentifying the number of molecules, or incorrectly assessing their positions, may give rise to incorrect diffusion coefficient estimates.
FIG. 2. Estimates of diffusion coefficients from photon arrival traces strongly depend on the number of molecules assumed to be contributing to the trace.
The trace analyzed contained ≈ 1800 photon arrivals produced by 4 molecules diffusing at 1 μm²/s for a total time of 30 ms under background and molecule photon emission rates of 10³ photons/s and 4 × 10⁴ photons/s, respectively. To estimate D parametrically, we assumed a fixed number of molecules, N = 1 (A); N = 2 (B); N = 3 (C); N = 4 (D); and N = 5 (E). The correct estimate in (D), and the mismatch in all others, underscores why it is critical to estimate the number of molecules contributing to the trace to deduce quantities such as diffusion coefficients from single photon arrivals.
More concretely, to obtain quantitative estimates of the diffusion coefficient, we need to formulate a likelihood [9, 37, 107]. In turn, to formulate a likelihood for photon arrival data demands that we know the number of molecules contributing photons as well as their locations across time. As the number of molecules instantaneously located within the confocal volume is unknown, all reasonable possibilities need to be considered and rank-ordered using expensive pre- or post-processing model selection heuristics [60, 100]. This has not been achieved yet, in part, because of the prohibitive computational cost it entails. Analyzing single photon arrivals from a confocal setup to derive dynamical information therefore demands fundamentally new tools.
The conceptually novel framework that we propose in this study can winnow down infinite possibilities (i.e., infinite populations of molecules potentially contributing photons) to a finite, computationally manageable, number in a mathematically exact manner. Such a framework avoids compromising temporal resolution, as it requires no intensity trace to be formed (i.e., no downsampling), and allows us to directly deduce dynamical quantities, such as diffusion coefficients, efficiently from raw single photon arrivals. The underlying theory, Bayesian nonparametrics (BNPs) [34], is a powerful set of tools still under active development and largely unknown to the Physical Sciences [19, 49, 53, 60, 79, 95–98, 100].
Mathematical devices within BNPs, such as the beta-Bernoulli process [1, 16, 77], allow us to place priors not only on parameters themselves, as traditional parametric Bayesian methods do, but also on distributions over an infinite number of candidate models to which parameters are associated [50]. Concretely, for the case of our single photon time traces, BNPs and in particular beta-Bernoulli processes can be used to assign posterior probabilities over an array of quantities including all possible numbers of molecules responsible for producing the data and their associated locations at each photon arrival time. With these devices, as we describe below, we turn the otherwise difficult problem of model selection (that is, determining how many molecules contribute photons) into a parameter estimation problem that remains computationally tractable [1, 16, 77].
II. MATERIALS AND METHODS
Here, we describe the mathematical formulation of our BNPs method for the analysis of confocal single photon data. We begin with the overall input, which consists of photon inter-arrival times, $\Delta t = (\Delta t_1, \Delta t_2, \ldots, \Delta t_{K-1})$, where $\Delta t_k$ represents the time interval between adjacent observations of photons, which occur at times $t_k$ with $k = 1, \ldots, K$. We also use as input the illuminated confocal volume’s shape and background photon emission rate, which we can determine separately through calibration [13].
To derive estimates for the diffusion coefficient from Δt, we need to determine intermediate quantities which include: i) photon emission rates of molecular labels; and, most importantly, ii) the unknown number of molecules contributing photons to the trace Δt, as well as their location with respect to the center of the confocal volume.
A graphical summary of our formulation is shown in Fig. (3). Below, we explain briefly each step involved. More details, and an implementation of the whole method, are available in the Appendix. In addition, source code and a GUI version of our implementation are provided through the Supplementary Materials.
FIG. 3. BNP formulation used for the analysis of photon arrival traces.
Molecules, indexed n = 1,2,..., evolve over the experimental time course which is indexed by k = 1,2,...,K. Here, $(x_k^n, y_k^n, z_k^n)$ indicates the location of molecule n at time $t_k$. During the experiment, only a single observation (inter-arrival time) $\Delta t_k$ is recorded at each step, thereby combining photon emissions from every molecule and the background. The diffusion coefficient D determines the evolution of the molecular positions, which influence the photon emission rates and eventually the recorded $\Delta t_k$. The indicator variables $b^n$ are introduced to infer the unknown molecule population size. In the graphical model, the measured data are highlighted by grey shaded circles and the model variables, which require priors, are designated by blue circles.
A. Model Formulation
We begin with the distribution according to which the kth observation, Δtk, is derived
$$\Delta t_k \mid \mu_k \sim \text{Exponential}(\mu_k). \tag{1}$$
Accordingly, Δtk follows an exponential probability distribution [42, 81] with rate μk. In fact, the rate μk gathers the photon emission rates of all molecules which depend on their respective locations relative to the confocal center (see below) [13]. In addition to the molecule photon emissions rates, μk also includes background photons
$$\mu_k = \mu_{\text{back}} + \sum_n \mu_k^n, \tag{2}$$
where $\sum_n \mu_k^n$ is the sum over photon emission rates gathered from the individual molecules, which we index with n = 1,2,..., and $\mu_{\text{back}}$ is the background photon emission rate. In our formulation, $\mu_k^n$ and $\mu_{\text{back}}$ are the emission rates of photons that reach our detectors which, due to optical and detector limitations, are typically lower than the rates of actual photon emissions [71, 72].
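Because eqs. (1) and (2) fully specify the observation model once the per-molecule rates are known, the log-likelihood of a trace can be sketched in a few lines. The function below is an illustration under that assumption; its name and the layout of `molecule_rates` (one list of per-molecule rates for each inter-arrival time) are hypothetical and not part of the authors' code.

```python
import math


def log_likelihood(inter_arrivals, molecule_rates, mu_back):
    """Log-likelihood of inter-arrival times under eqs. (1)-(2).

    inter_arrivals : list of dt_k in seconds
    molecule_rates : list (one entry per k) of lists of per-molecule rates mu_k^n
    mu_back        : background photon emission rate in photons/s
    """
    total = 0.0
    for dt, rates_k in zip(inter_arrivals, molecule_rates):
        mu_k = mu_back + sum(rates_k)        # eq. (2): total detected rate
        total += math.log(mu_k) - mu_k * dt  # log of mu_k * exp(-mu_k * dt), eq. (1)
    return total
```

The difficulty highlighted in the text is visible here: the likelihood can only be evaluated once the number of molecules and their rates at every photon arrival are specified.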
Next, we incorporate the dependency of the emission rate on location [5, 22, 111] with other effects such as camera pinhole shape and size, the laser intensity, laser wavelength, and quantum yield [13] into a characteristic point spread function (PSF). To be more precise, a PSF characterizes the optical response of an imaging system [12, 59, 114]. Although this term is mostly used for wide-field microscopes to describe the emission PSF, here, we follow the FCS literature, and use it to describe the confocal microscope, i.e., both emission and detection PSFs. Consistent with FCS [15, 30, 35, 69], we assume a 3D Gaussian geometry [57]
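$$\mu_k^n = \mu_{\text{mol}} \exp\left(-2\left[\frac{(x_k^n)^2 + (y_k^n)^2}{\omega_{xy}^2} + \frac{(z_k^n)^2}{\omega_z^2}\right]\right), \tag{3}$$
where $(x_k^n, y_k^n, z_k^n)$ is the position of the nth molecule at time $t_k$, $\omega_{xy}$ and $\omega_z$ set the lateral and axial size of the confocal volume, and the parameter $\mu_{\text{mol}}$ indicates the brightness of a single molecule. This is the rate of detected photon emissions achieved when the molecule is at the center of the confocal volume where illumination is highest.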
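The position dependence of the emission rate can be illustrated with a short function. This is a sketch that assumes the standard FCS 3D Gaussian form with a factor of 2 in the exponent (1/e² radii); the function names are hypothetical, and the default widths of 0.3 μm and 1.5 μm are the values quoted for the synthetic data in the Results section.

```python
import math


def gaussian_psf(x, y, z, w_xy=0.3, w_z=1.5):
    """3D Gaussian PSF (assumed 1/e^2 convention); positions and widths in micrometers."""
    return math.exp(-2.0 * ((x * x + y * y) / w_xy ** 2 + z * z / w_z ** 2))


def molecule_rate(x, y, z, mu_mol, w_xy=0.3, w_z=1.5):
    """Detected photon emission rate (photons/s) of one molecule at (x, y, z)."""
    return mu_mol * gaussian_psf(x, y, z, w_xy, w_z)
```

At the center of the volume the rate equals the molecular brightness, and it falls by a factor of e² at a lateral distance equal to the assumed width.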
Finally, for a molecule diffusing along one direction, the probability distribution p(x,t) of its position x at time t satisfies the diffusion equation [6, 21, 55]
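$$\frac{\partial p(x,t)}{\partial t} = D \frac{\partial^2 p(x,t)}{\partial x^2}. \tag{4}$$
To solve this equation, we assume that the molecule is located at $x_{k-1}$ at time $t_{k-1}$ and we obtain
$$p(x,t) = \frac{1}{\sqrt{4\pi D (t - t_{k-1})}} \exp\left(-\frac{(x - x_{k-1})^2}{4 D (t - t_{k-1})}\right), \tag{5}$$
which is the probability density of a normal random variable with mean $x_{k-1}$ and variance $2(t - t_{k-1})D$. Therefore, at time $t = t_k$, we write
$$x_k \mid x_{k-1}, D \sim \text{Normal}\left(x_{k-1},\, 2 D \Delta t_{k-1}\right), \tag{6}$$
where $\Delta t_{k-1} = t_k - t_{k-1}$ and D is the molecule’s diffusion coefficient. Similarly, solving the diffusion equation for molecules following isotropic diffusion in free space along all three Cartesian directions, we obtain
$$x_k^n \mid x_{k-1}^n, D \sim \text{Normal}\left(x_{k-1}^n,\, 2 D \Delta t_{k-1}\right), \tag{7}$$
$$y_k^n \mid y_{k-1}^n, D \sim \text{Normal}\left(y_{k-1}^n,\, 2 D \Delta t_{k-1}\right), \tag{8}$$
$$z_k^n \mid z_{k-1}^n, D \sim \text{Normal}\left(z_{k-1}^n,\, 2 D \Delta t_{k-1}\right). \tag{9}$$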
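Taken together, the observation and motion models above define a generative procedure: compute the total rate from the current positions, draw an inter-arrival time, then move every molecule by a Gaussian step whose variance is 2D times that interval. The sketch below follows this recipe under two stated assumptions: the rate is held constant between successive photons, and the PSF uses the 1/e² Gaussian convention. The function name and argument layout are hypothetical; units are micrometers and seconds.

```python
import math
import random


def simulate_trace(n_photons, positions, D, mu_mol, mu_back,
                   w_xy=0.3, w_z=1.5, rng=None):
    """Draw inter-arrival times and update molecule positions step by step."""
    rng = rng or random.Random(0)
    positions = [list(p) for p in positions]  # copy so the input is not modified
    inter_arrivals = []
    for _ in range(n_photons):
        # Total detected rate: background plus PSF-weighted molecular brightness.
        mu_k = mu_back + sum(
            mu_mol * math.exp(-2.0 * ((x * x + y * y) / w_xy ** 2 + z * z / w_z ** 2))
            for x, y, z in positions)
        dt = rng.expovariate(mu_k)            # exponential inter-arrival time
        inter_arrivals.append(dt)
        sigma = math.sqrt(2.0 * D * dt)       # std of a free-diffusion step
        for p in positions:
            for axis in range(3):
                p[axis] += rng.gauss(0.0, sigma)
    return inter_arrivals, positions
```

For example, `simulate_trace(2000, [(0.0, 0.0, 0.0)] * 4, 1.0, 4e4, 1e3)` draws 2000 inter-arrival times for four molecules that start at the center of the volume.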
B. Model Inference
All quantities which we need to infer (such as the diffusion coefficient, D, locations of molecules through time, and the molecule photon emission rate $\mu_{\text{mol}}$) are formulated as model variables. We estimate these variables within the Bayesian paradigm [37, 60, 100]. The model parameters such as D and $\mu_{\text{mol}}$ require priors. Additionally, we have to consider priors on the initial molecule locations, i.e., at the time of the very first photon arrival, $(x_1^n, y_1^n, z_1^n)$. Options for these priors are straightforward and, for computational convenience, we adopt the distributions described in the Appendix.
Meanwhile, before we proceed any further with our BNPs formulation, we need to revise eq. (3) as follows
$$\mu_k^n = b^n \mu_{\text{mol}} \exp\left(-2\left[\frac{(x_k^n)^2 + (y_k^n)^2}{\omega_{xy}^2} + \frac{(z_k^n)^2}{\omega_z^2}\right]\right). \tag{10}$$
The variables $b^n$, defined for each model molecule, take only values 1 or 0. Specifically, we have $b^n = 0$ when the nth model molecule does not contribute photons to the measurements, as in this case the molecule is decoupled from the overall photon emission rate $\mu_k$. This indicator variable allows us to operate on an arbitrarily large population of model molecules; technically, an infinite population. The ability to recruit, from a potentially infinite pool of model molecules, the precise number that contributes to the measured trace $\Delta t$ is the chief reason we abandon the parametric Bayesian paradigm and adopt BNPs. After introducing the indicators $b^n$, we can estimate the number of molecules that contribute photons, i.e., those molecules where $b^n = 1$, simultaneously with the remaining parameters simply by treating each $b^n$ as a separate parameter and estimating its value.
To estimate bn, we consider a Bernoulli prior with a beta hyper-prior
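$$b^n \mid q^n \sim \text{Bernoulli}(q^n), \tag{11}$$
$$q^n \sim \text{Beta}(A_q, B_q), \tag{12}$$
where $q^n$ is the prior probability that $b^n = 1$, and $A_q$ and $B_q$ are (hyper-hyper-)parameters specifically chosen to allow for n → ∞. In this limit, eqs. (11) and (12) can be combined resulting in a beta-Bernoulli process [1, 16, 77]; see Appendix for more details.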
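To illustrate how such a prior keeps the number of active molecules finite as the pool of model molecules grows, the sketch below draws indicators from a finite approximation. The scaling A_q = α/N and B_q = β(N − 1)/N is a common choice for beta-Bernoulli constructions and is an assumption here; the values actually used are those given in the Appendix. The function names and the symbols α and β are hypothetical.

```python
import random


def expected_active(n_models, alpha=1.0, beta=1.0):
    """Prior mean number of active molecules, N * A_q / (A_q + B_q)."""
    a_q = alpha / n_models
    b_q = beta * (n_models - 1) / n_models
    return n_models * a_q / (a_q + b_q)


def sample_indicators(n_models, alpha=1.0, beta=1.0, rng=None):
    """Draw q^n ~ Beta(A_q, B_q), then b^n ~ Bernoulli(q^n), for n = 1..N (N >= 2)."""
    rng = rng or random.Random(0)
    a_q = alpha / n_models
    b_q = beta * (n_models - 1) / n_models
    q = [rng.betavariate(a_q, b_q) for _ in range(n_models)]
    b = [1 if rng.random() < q_n else 0 for q_n in q]
    return b, q
```

Under this scaling the prior mean number of active molecules tends to α/β as N grows, so enlarging the pool does not inflate the expected population.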
With the specified priors, we can now form a joint posterior probability including all unknown variables which we seek to determine. Nevertheless, the nonlinear dependence of the PSF on the molecules’ positions and the nonparametric prior on the indicators (bn)n exclude a closed form for our posterior. For this reason, we develop a Markov Chain Monte Carlo scheme [37, 58, 86] that exploits results from the theory of Computational Statistics and Non-linear filtering to generate pseudo-random samples from this posterior that we use in obtaining our estimates [37, 86]. A technical description of this scheme can be found in the Appendix and a ready-to-use implementation is available through the Supplementary Materials.
C. Data Acquisition
1. Acquisition of Synthetic Data for Figs. (4)–(7)
FIG. 4. A higher number of total photon arrivals provides more information and sharper diffusion coefficient estimates.
(A1) Instantaneous molecule photon emission rates $\mu_k^n$, normalized by $\mu_{\text{mol}}$. (A2) Photon arrival trace resulting from combining photon emissions from every molecule and the background. This synthetic trace contains ≈ 2000 photon arrivals produced by 4 molecules diffusing at 1 μm²/s for a total time of 30 ms under background and molecule photon emission rates of 10³ photons/s and 4 × 10⁴ photons/s, respectively. The dashed lines show the initial 30%, 50%, 80%, and 100% portions of the original trace containing ≈ 600, ≈ 1000, ≈ 1600, ≈ 2000 photon arrivals, respectively. (B1-B4) Posterior probability distributions drawn from traces with differing length (shown in (A2)). As expected, for the longer traces, the peak of the posterior matches with the exact value of D (dashed line). Gradually, as we decrease the total number of photon arrivals analyzed, the estimation becomes less reliable.
FIG. 7. A higher molecule photon emission rate provides more photons per unit time and sharper diffusion coefficient estimates.
(A, B, C, D) Posterior probability distributions drawn from traces produced by 4 molecules diffusing at 1 μm²/s for a total time of 30 ms under background photon emission rate of 10³ photons/s and molecule photon emission rates 4 × 10⁵, 4 × 10⁴, 4 × 10³, 10³ photons/s, respectively. As expected, under higher molecule photon emission rates, the peak of the posterior matches sharply with the exact value of D (dashed line). Gradually, as we decrease the molecule photon emission rate, the estimation becomes less reliable.
We acquire the synthetic data shown in the Results section by computer simulations [8, 33, 44, 47, 52] that represent Brownian motion of point molecules moving through a typical illuminated confocal volume. We provide finer details and complete parameter choices in the Appendix.
2. Acquisition of Experimental Data for Figs. (8)–(12)
FIG. 8. Higher molecular concentrations in experimental traces provide more photons per unit time resulting in sharper diffusion coefficient estimates.
Estimates shown are drawn from experimental traces with a low (100 pM) (A) and high (1 nM) (B) concentration of Cy3 dye molecules and 75% glycerol at a fixed laser power of 100 μW. Similarly to Fig. (1), we compare our method’s diffusion coefficient estimate (green dots) to FCS (blue asterisks) as a function of the number of photons used in the analysis. Since by 1.5 × 10⁴ photon arrivals our method has converged, we avoid analyzing larger traces. The red dashed line is the FCS estimate produced from the entire 5 min trace containing ≈ 3 × 10⁶ photon arrivals.
FIG. 12. Diffusion coefficient estimates of labeled protein.
Estimates shown are drawn from experimental traces with fixed concentration 1 nM of Cy3-labeled streptavidin molecules and laser power 100 μW. Similarly to Fig. (1), we compare our method’s diffusion coefficient estimate (green dots) to FCS (blue asterisks) as a function of the number of photons used in the analysis. Since by 1.5 × 10⁴ photon arrivals our method has converged, we avoid analyzing larger traces. The red dashed line is the FCS estimate obtained from the entire 5 min trace containing ≈ 3 × 10⁶ photon arrivals.
For these experiments we used Cy3 fluorescent dyes. Solutions were made by suspending Cy3 dye in glycerol/buffer (pH 7.5, 10 mM Tris-HCl, 100 mM NaCl and 10 mM KCl, 2.5 mM CaCl₂) at various v/v, to a final concentration of either 100 pM or 1 nM. The solution was placed in a glass-bottomed fluid-cell, assembled on a custom designed confocal microscope [63] and a 532 nm laser beam was focused to a diffraction-limited spot on the glass coverslip of the fluid-cell using a 60x, 1.42 N.A., oil-immersion objective (Olympus). In our setup, the laser beam is focused at the glass-water/glycerol interface and the beam is refocused by visual inspection at the beginning of every measurement. Emitted fluorescence was collected from the same objective and focused onto a Single Photon Avalanche Diode (SPAD, Micro Photon Devices) with a maximum count rate of 11.8 Mc/s. A bandpass filter placed in front of the detector blocked all back-scattered excitation light and relayed only fluorescence from Cy3. Individual photon arrivals on the detector triggered TTL pulses and were both timestamped and registered at 80 MHz. This was achieved using a field programmable gate array (FPGA, NI Instruments) and custom LabVIEW software [89].
3. Acquisition of Experimental Data for Fig. (13)
FIG. 13. Diffusion coefficient estimates of 5-TAMRA dye.
Estimates shown are drawn from experimental traces with fixed concentration 20 nM of 5-TAMRA dye molecules. Similarly to Fig. (1), we compare our method’s diffusion coefficient estimate (green dots) to FCS (blue asterisks) as a function of the number of photons used in the analysis. Since by 1.5 × 10⁴ photon arrivals our method has converged, we avoid analyzing larger traces. The red dashed line is the FCS estimate obtained from the entire 10 min trace containing ≈ 6 × 10⁶ photon arrivals.
For these experiments we used 5-TAMRA fluorescent dyes. The excitation source was a supercontinuum fiber laser Fianium WhiteLase SC480 (NKT Photonics, Birkerod, Denmark) operating at a repetition rate of 40 MHz. The excitation wavelength (550 nm) was selected by an acousto-optic tunable filter (AOTF), and the exiting beam was collimated and expanded by approximately a factor of three to slightly overfill the back aperture of the objective lens. The light was reflected into the objective lens (Zeiss EC Plan-Neofluar 100x oil, 1.3 NA pol M27, Thornwood, NY, USA) by a dichroic mirror (Chroma 89016bs). The same objective was used to collect the fluorescence from the sample, and passed through a band pass filter (Chroma ET575/50m) before being focused into a position motorized pinhole wheel set at 25 μm. The output of the pinhole was focused on a multimode hybrid fiber optic patch cable (M18L01, Thorlabs, NJ, USA) which was coupled to a single-photon avalanche diode (SPCM AQRH-14, Excelitas Technologies, Quebec, Canada). The detected photons were recorded by a TimeHarp 200 time-correlated single photon counting board (PicoQuant, Berlin, Germany) operating in T3 mode. The sample (≈50 μL) was contained in a perfusion chamber gasket (CoverWell) adhered on a glass coverslip. The sample was 20 nM 5-Carboxytetramethylrhodamine (5-TAMRA, purchased from Sigma-Aldrich, USA) dissolved in doubly distilled water at room temperature.
III. RESULTS
Our goal is to characterize quantities that describe molecular dynamics, especially dynamics encountered in biological samples, such as diffusion coefficients, at the data-acquisition timescales of conventional single-focus confocal setups. Our input consists of: i) the measured photon inter-arrival times $\Delta t = (\Delta t_1, \Delta t_2, \ldots, \Delta t_{K-1})$; ii) the background photon emission rate; and iii) the geometry of the illuminated volume specified through a characteristic PSF.
As we explain in the Methods section, in order to estimate the molecules’ diffusion coefficient, D, we also estimate intermediate quantities (namely, molecule photon emission rates, molecule positions over time and the molecule numbers in the first place). These intermediate quantities demand that we use BNPs to determine quantities that a priori may be arbitrarily large such as the number of molecules contributing photons to our datasets Δt.
Within the Bayesian paradigm [60, 107], our estimates take the form of posterior probability distributions over the unknown quantities. These distributions combine parameter values, probabilistic relations among different parameters, as well as the associated uncertainties. According to the common statistical interpretation [37, 107], the sharper the posterior, the more conclusive (and certain) the estimate. To quantify the uncertainty, we compute a posterior variance and use the square root of this variance to construct error-bars (i.e., credible intervals) [37, 107]. In Table I in the Appendix, we summarize the mean values and error bars of our analyses.
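The point estimate and error bar described above reduce to the mean and standard deviation of the posterior samples. The snippet below is a minimal sketch that assumes the samples are draws of a single parameter (for example D) from the sampler; the function name is hypothetical and the use of the unbiased (n − 1) variance is an illustrative choice.

```python
import math


def summarize_posterior(samples):
    """Return (posterior mean, posterior standard deviation) from samples."""
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two samples")
    mean = sum(samples) / n
    variance = sum((s - mean) ** 2 for s in samples) / (n - 1)
    return mean, math.sqrt(variance)
```

A sharper posterior yields a smaller standard deviation and therefore a tighter error bar around the mean.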
TABLE I.
Here, we list point estimates of our analyses, which we obtain from the marginal posterior probability distributions p(D|Δt) and p(μmol|Δt). Estimates are listed according to figure.
| Figure | D mean (μm²/s) | D std (μm²/s) | μmol mean (photons/s) | μmol std (photons/s) |
|---|---|---|---|---|
| Fig. (2A) | 4.54 | 4.49 | - | - |
| Fig. (2B) | 4.17 | 4.11 | - | - |
| Fig. (2C) | 1.14 | 1.12 | - | - |
| Fig. (2D) | 1.02 | 1.01 | - | - |
| Fig. (2E) | 4.75 | 4.64 | - | - |
| Fig. (4B1) | 1.03 | 0.25 | - | - |
| Fig. (4B2) | 0.95 | 0.63 | - | - |
| Fig. (4B3) | 0.75 | 0.68 | - | - |
| Fig. (4B4) | 0.45 | 0.77 | - | - |
| Fig. (5A3) | 1.01 | 0.27 | - | - |
| Fig. (5B3) | 1.09 | 0.51 | - | - |
| Fig. (5C3) | 1.65 | 1.59 | - | - |
| Fig. (6) | 1.05 × 10⁻² | 0.22 × 10⁻² | - | - |
| | 1.21 × 10⁻¹ | 0.34 × 10⁻¹ | - | - |
| | 1.06 | 0.19 | - | - |
| | 9.87 | 2.33 | - | - |
| | 117.62 | 35.13 | - | - |
| Fig. (7A) | 0.99 | 0.34 | - | - |
| Fig. (7B) | 0.96 | 0.51 | - | - |
| Fig. (7C) | 2.92 | 2.68 | - | - |
| Fig. (7D) | 3.26 | 2.95 | - | - |
| Fig. (A3A3) | 10.02 | 1.17 | - | - |
| Fig. (A3B3) | 9.96 | 2.19 | - | - |
| Fig. (A4A3) | - | - | 4.11 × 10⁴ | 1.61 × 10³ |
| Fig. (A4B3) | - | - | 4.37 × 10⁵ | 2.84 × 10³ |
| Fig. (A4C3) | - | - | 1.28 × 10⁵ | 1.25 × 10⁴ |
Below, we first validate our method on synthetic data where the ground truth is available. For these, we use a confocal volume of typical size ωxy = 0.3 μm and ωz = 1.5 μm [59]. We then test our method on experimental data collected in two labs utilizing different FCS setups. For the latter cases, we demonstrate the advantages of our method by comparing our results to the results obtained from autocorrelative methods used in FCS analysis.
A. Method Validation using Simulated Data
To demonstrate the robustness of our approach, we simulate raw single photon arrival traces under a broad range of: i) total photon arrivals, Fig. (4); ii) concentrations of labeled molecules, Fig. (5); iii) diffusion coefficients, Fig. (6); and iv) molecule photon emission rates, Fig. (7). The parameters not varied are held fixed at the following baseline values: diffusion coefficient of 1 μm²/s which is typical of slower in vivo conditions [7, 68, 88, 105], molecule photon emission rates of 4 × 10⁴ photons/s [62, 82], and 4 as the number of labeled molecules contributing photons. We chose 4, a small number of molecules (as opposed to a larger number of molecules), because this scenario presents the greatest analysis challenge as very few photons, and thus little data, are gathered to aid the analysis.
FIG. 5. A higher molecular concentration provides more photons per unit time and sharper diffusion coefficient estimates.
(A1, B1, C1) Instantaneous molecule photon emission rates $\mu_k^n$, normalized by $\mu_{\text{mol}}$. (A2, B2, C2) Photon arrival traces resulting from combining photon emissions from every molecule and the background. These contain ≈ 3000 photon arrivals from 10 molecules (A2), ≈ 2000 photon arrivals from 4 molecules (B2), and ≈ 1000 photon arrivals from 1 molecule (C2), all diffusing at 1 μm²/s for a total time of 30 ms under background and molecule photon emission rates of 10³ photons/s and 4 × 10⁴ photons/s, respectively. (A3, B3, C3) Posterior probability distributions drawn from traces with differing numbers of molecules (shown in (A2, B2, C2)). As expected, for the traces with a higher number of molecules, the peak of the posterior matches with the exact value of D (dashed line). Gradually, as we decrease the total number of molecules, the estimation becomes less reliable.
FIG. 6. A lower diffusion coefficient provides more photons per unit time and sharper diffusion coefficient estimates.
Posterior probability distributions drawn from traces containing ≈ 2000 photon arrivals produced by 4 molecules diffusing at D = 0.01, 0.1, 1, 10 μm²/s for a total time of 30 ms under background and molecule photon emission rates of 10³ photons/s and 4 × 10⁴ photons/s, respectively. For molecules diffusing at D = 100 μm²/s, under similar conditions, we used a trace containing ≈ 3000 photons for a total time of 50 ms, since we needed a longer trace to gather sufficient information for drawing a posterior.
As illustrated in Fig. (1), a critical and recurring point throughout this section is that the traces we analyze are shorter than those that could be meaningfully analyzed using FCS. While we focus on the diffusion coefficient estimation here, we note that our framework supports more detailed parameter estimation which we provide in the Appendix.
1. Total Photon Arrivals
We evaluate the robustness of our method with respect to the length of the trace (i.e., the total number of photon arrivals recorded) at a fixed number of molecules, diffusion coefficient, and molecule photon emission rates. The first important finding is that, for the values of parameters selected, we need 2 orders of magnitude less data than FCS; see Fig. (1D). For instance, to obtain an estimate of the diffusion coefficient within 10% of the ground truth value, we require ≈ 103 photons (directly emitted from the labeled molecule), while FCS requires ≈ 105 photons. Under our simulated scenario, these correspond to traces of total duration 30 − 50 ms and 50 s, respectively. To determine our error, we chose the mean value of the diffusion coefficient’s marginal posterior, p(D|Δt), and measure the percentage difference of this mean value to the ground truth.
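The error measure used here is a simple percentage difference between the posterior mean and the ground truth. As a small sketch (the function name is hypothetical):

```python
def percent_error(posterior_mean, ground_truth):
    """Percentage difference of a posterior mean from the known ground truth."""
    return 100.0 * abs(posterior_mean - ground_truth) / abs(ground_truth)
```

For instance, a posterior mean of 1.03 μm²/s against a ground truth of 1 μm²/s gives an error of about 3%, which falls within the 10% criterion quoted above.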
In general, the precise photon numbers demanded by our method and traditional FCS depend on a broad range of experimental parameter settings. For this reason, we explore different settings in subsequent subsections as well as in the Appendix.
An important overarching concept is that of a photon arrival as a unit of information. The more photon arrivals we have in the analyzed trace, the sharper our diffusion coefficient estimates become. This holds, as we see in Fig. (1D) and Fig. (4), for increasing total photon arrivals. Similarly, as we see in subsequent subsections, we also collect more photons as we increase the concentration of labeled molecules (and thus the number of molecules contributing photons to the trace), increase the photon emission rates of the molecular labels, or decrease the diffusion coefficients of the molecules. In the latter case, a lower diffusion coefficient gives each molecule more time to traverse the illuminated region, in turn resulting in more photon arrivals.
2. Molecule Concentration
To test the robustness of our method under different concentrations of labeled molecules at fixed diffusion coefficient and molecule photon emission rates, we simulate molecules diffusing at 1 μm²/s for a total time of 30 ms with: i) average concentrations of 10 molecules/μm³, Fig. (5A1, A2); ii) 4 molecules/μm³, Fig. (5B1, B2); and iii) 1 molecule/μm³, Fig. (5C1, C2). The molecule and background photon emission rates are taken to be 4×10⁴ photons/s and 10³ photons/s, respectively, which are typical of confocal imaging [82].
Figure (5) summarizes our results and suggests that posteriors over diffusion coefficients are broader, and thus the accuracy with which we can pinpoint the diffusion coefficient drops, when the concentration of labeled molecules is lower. Intuitively, we expect this result as fewer molecules within the confocal volume provide fewer photon arrivals.
3. Diffusion Coefficients
We repeat the simulations of the previous subsection to demonstrate, using synthetic data, the robustness of our method with respect to the diffusion coefficient magnitude at fixed number of molecules, and molecule photon emission rates; see Fig. (6). Intuitively, and again on the basis of the fact that photon arrivals carry information, we expect that faster moving molecules give rise to broader posterior distributions as these emit fewer photons, and thus provide less information, while they traverse the confocal volume.
4. Molecule Photon Emission Rates
Figure (7) illustrates the robustness of our method with respect to the molecule photon emission rates (i.e., set by the laser power used in the experimental setting and the choice of fluorescent label) by fixing the number of molecules, diffusion coefficient (1 μm²/s), and background emission (10³ photons/s). To accomplish this, we simulate progressively dimmer molecules until the molecule signature is effectively lost in the background. As expected, dimmer molecules lead to broader posterior estimates over diffusion coefficients as these traces are associated with higher uncertainty.
B. Estimation of Physical Parameters from Experimental Data
To evaluate our BNPs method on real data, we used experimental single photon traces collected under a broad range of conditions. That is, we used measurements from two different experimental setups and different fluorescent dyes, that are commonly used in labeling biological samples, as well as diffusing labeled proteins. Additional differences between the setups include different numerical apertures (NA), laser powers, and overall detection instrumentation as detailed in the Methods section.
Figures (8)-(11) were collected using the Cy3 dye and these results were used to benchmark the robustness of our method on dye concentration, diffusion coefficients, and laser power. Moreover, to evaluate the proposed approach beyond free dyes, in Fig. (12), we used labeled proteins, namely freely diffusing streptavidin labeled with Cy3. For Fig. (13), photon arrivals were collected using 5-TAMRA dye in order to test the robustness of our method on a different fluorophore.
FIG. 11. Background photon emission rates are artificially added to experimental traces yielding challenging imaging conditions and broader diffusion coefficient estimates.
Experimental traces with fixed concentration 1 nM of Cy3 dye molecules and 67% glycerol and fixed laser power 100 μW. The same total number of photons is analyzed under differing (artificially increased) background photon emission rates (0 (A1), 500 (B1), 1000 (C1) photons/s). (A2, B2, C2) Similarly to Fig. (1), we compare our method’s diffusion coefficient estimate (green dots) to FCS (blue asterisk) as a function of the number of photons used in the analysis. Since by 1.5×10⁴ photon arrivals our method has converged, we avoid analyzing larger traces. The red dashed line is the FCS estimate obtained from the entire, 5 min, trace containing ≈ 3 × 10⁶ photon arrivals.
1. Benchmarking on Experimental Data using Cy3
We begin by verifying our method on mixtures of water and glycerol. While we only use short segments in our analysis, the collected traces are long enough (≈5 min each) to be meaningfully analyzed, for the sake of comparison, by the traditional autocorrelative analysis used in FCS. The FCS analysis of the full trace yields a diffusion coefficient that we treat as an effective ground truth. We then ask how long a trace our method requires, as compared to FCS, in order for our diffusion coefficient estimate to converge to this ground truth.
Our strategy addresses the following complication: we anticipate that the PSF may be distorted from the idealized shape assumed especially with increasing amounts of glycerol [38]. However, the same (possibly incorrect) PSF is used in both FCS and our method in order to compare both methods head-to-head. Thus, concretely, we are asking: how many photon arrivals do we need to converge to the same result as FCS (irrespective of whether the FCS result is affected by PSF distortion artifacts)?
Our single photon traces are obtained under a range of conditions, namely different: i) dye concentrations, Fig. (8); ii) diffusion coefficients, Fig. (9); and iii) laser powers, Fig. (10). As before, longer traces, higher concentrations, lower diffusion coefficients, and higher laser powers result, on average, in sharper estimates, with the results in Figs. (8), (9), and (10) still converging with at least 2 orders of magnitude fewer photon arrivals than FCS for equal accuracy. We mention “on average” as individual traces are stochastic. Thus, some traces under higher concentrations of fluorescent molecules may happen to have fewer molecules contributing photons than traces from experiments with lower concentrations.
FIG. 9. Lower diffusion coefficients in experimental traces provide more photons per unit time and sharper diffusion coefficient estimates.
Estimates shown are drawn from experimental traces with 99% glycerol (A), 94% glycerol (B), 75% glycerol (C), 67% glycerol (D), 50% glycerol (E), and 0% glycerol (F) with fixed concentration 1 nM of Cy3 dye molecules and laser power of 100 μW. Similarly to Fig. (1), we compare our method’s diffusion coefficient estimate (green dots) to FCS (blue asterisk) as a function of the number of photons used in the analysis. Since by 1.5 × 10⁴ photon arrivals our method has converged, we avoid analyzing larger traces. The red dashed line is the FCS estimate produced from the entire 5 min trace containing ≈ 3 × 10⁶ photon arrivals.
FIG. 10. Higher laser powers in experimental traces provide more photons per unit time and sharper diffusion coefficient estimates.
Estimates shown are drawn from experimental traces with high (100 μW) (A) and low (25 μW) (B) laser power with fixed concentration 1 nM of Cy3 dye molecules and 75% glycerol. Similarly to Fig. (1), we compare our method’s diffusion coefficient estimate (green dots) to FCS (blue asterisk) as a function of the number of photons used in the analysis. Since by 1.5 × 10⁴ photon arrivals our method has converged, we avoid analyzing larger traces. The red dashed line is the FCS estimate produced from the entire 5 min trace containing ≈ 3 × 10⁶ photon arrivals.
Figure (8) recapitulates our expectations derived from the synthetic data shown earlier (Fig. (5)): low dye concentrations yield a wider posterior over the diffusion coefficient, and higher concentrations yield correspondingly sharper posteriors. Here, similarly to Fig. (1), we compare our method’s diffusion coefficient estimate to FCS as a function of the number of photon arrivals used in the analysis, Fig. (8A) and Fig. (8B); both are in good agreement with the FCS estimates produced from the entire traces, which are ≈ 10³ times longer.
As in the analysis of synthetic data, comparing different diffusion coefficients shows that the slower a molecule diffuses, the more time it spends within the confocal volume and the more photons are collected, providing a sharper posterior estimate of its diffusion coefficient (see Fig. (9)).
Similarly to the synthetic data shown earlier (Fig. (7)), Fig. (10) illustrates the robustness of our method to lower laser power which, as expected, yields a wider posterior for our diffusion coefficient and correspondingly sharper posteriors for higher laser power. Here, we compare our method’s diffusion coefficient estimate to FCS as a function of the number of photon arrivals used in the analysis, Fig. (10A) and Fig. (10B), both in good agreement with FCS estimates, produced from the entire trace.
As further controls, Fig. (11) shows a set of analyses in which background photons are artificially added to real data. In these cases, we test the limits of our method under more challenging imaging conditions. Furthermore, we repeat our analysis on single photon traces produced by a labeled biomolecule. Specifically, in Fig. (12), we use streptavidin proteins labeled with Cy3.
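How the background was added to the experimental traces is not detailed in this section. One natural way to do it, shown here purely as a sketch under that assumption, is to superimpose a homogeneous Poisson process of the desired rate onto the recorded arrival times and then recompute the inter-arrival times. The function names and the example arrival times below are hypothetical.

```python
import random


def add_background(arrival_times, rate, t_total, seed=0):
    """Merge arrival times (s) with a homogeneous Poisson background of the given rate (photons/s)."""
    merged = list(arrival_times)
    if rate > 0.0:
        rng = random.Random(seed)
        t = rng.expovariate(rate)  # waiting time to the first background photon
        while t < t_total:
            merged.append(t)
            t += rng.expovariate(rate)
    return sorted(merged)


def inter_arrival_times(arrival_times):
    """Differences between successive (sorted) arrival times."""
    return [b - a for a, b in zip(arrival_times, arrival_times[1:])]


# Hypothetical arrival times (s), for illustration only.
original = [0.001, 0.004, 0.010, 0.020]
noisy = add_background(original, rate=500.0, t_total=0.03)
noisy_dt = inter_arrival_times(noisy)
```

To keep the total number of analyzed photons fixed across background levels, the merged trace would then be truncated to the same photon count before analysis.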
2. Benchmarking on Experimental Data using 5-TAMRA
Finally, we switch to a different dye, different setup and acquisition electronics as detailed in the Methods section. Our sample contained 20 nM of 5-TAMRA dissolved in water. As previously, we successfully benchmark our estimates of the diffusion coefficient versus the value obtained from FCS on much longer (≈10 min) traces, see Fig. (13).
IV. DISCUSSION
A single photon arriving at a detector mounted to a confocal microscope encodes information that reports on the fastest timescale achievable for spectroscopic and imaging applications [59, 73]. Directly exploiting this information can help uncover the dynamics of physical or biological systems at fast timescales with accuracy superior to that obtained from derived quantities such as down-sampled intensity traces.
Our method takes a Bayesian nonparametrics (BNPs) approach to tackling single photon arrival data to characterize dynamical quantities from as few as hundreds to thousands of datapoints from confocal imaging. This is by contrast to conventional autocorrelative methods used in FCS [30, 69, 84, 104] that require dramatically more data, i.e., datasets several orders of magnitude larger in either total duration or total number of photon arrivals, to characterize dynamical quantities with similar accuracy.
There have been partial solutions to the challenge of interpreting single molecule data at the single photon level, often outside FCS applications. However, existing methods make assumptions that render them inapplicable to diffusion through inhomogeneously illuminated volumes. For example, they assume uniform illumination [40, 82], apply downsampling or binning and thereby reduce temporal resolution to exploit existing mathematical frameworks such as the hidden Markov model [2, 17, 41, 76, 106], or focus on immobile molecules [26, 27, 39, 41]. More recently, fluorescence-based nanosecond FCS approaches, in which the data are still correlated under the assumption that the time trace reports on processes at equilibrium, have been used to obtain information on rapid fluctuations in proteins [90]. As such, correlative methods largely continue to dominate confocal data analysis almost half a century beyond their inception [30, 57, 69].
To take full advantage of single photon data, new Mathematics are required. These must treat the inherent non-stationarity between photon arrivals arising due to molecular diffusion in an inhomogeneously illuminated volume and the stochastic number of molecules contributing photons. In particular, analyzing data derived from mobile molecules within an illuminated confocal region breaks down the perennial parametric Bayesian paradigm that has been the workhorse of data analysis [17, 46, 60, 65, 76, 100, 105]. We argue here that BNPs–which provide principled extensions of the Bayesian methodology [34, 102]–show promise in Physics [48, 54, 60, 95, 96, 98, 100] and give us a working solution to fundamental parametric challenges.
Our new tools open up the possibility to explore at the single photon level non-equilibrium processes resolved on fast timescales [3, 74], reaching ms or even below, that have been the focus of recent attention [25]. Moreover, and of immediate relevance for biophysical applications, if a single molecule photobleaches after emitting just a few hundred photons, then our novel method can still provide a diffusion coefficient estimate. Additionally, by analyzing single photon data pointwise, as we do in this study, we obtain a better handle on error bars than analyzing post-processed, such as correlated, data where the error bars can become difficult to compute or interpret [56, 87]. As such, a sharp diffusion coefficient posterior may not only suggest a good estimate of the diffusion coefficient but also suggest that the underlying model, such as normal diffusion, is appropriate and vice versa a broad posterior may suggest a poor estimate or an inappropriate motion model.
Furthermore, armed with a transformative framework, founded upon rigorous Statistics, it is now possible to extend this proof-of-principle study to treat effects that lie beyond the current scope of this work. In particular, we can extend our framework to treat multiple color imaging [24], triplet effects and complex molecule photophysics [43] (such as molecular blinking [97, 113] and photobleaching [61, 99]), more complex molecule motion models [101, 112] other than free diffusion [53], distorted or aberrated PSF models [32], or even incorporate chemical reactions among the molecules [10, 109]. As our BNP framework explicitly represents the instantaneous position of each involved molecule throughout the experiment’s time course, these are extensions that require modest modifications.
Supplementary Material
TABLE II.
Summary of notation.
| Description | Variable | Units |
|---|---|---|
| Diffusion coefficient | D | μm²/s |
| α parameter of the diffusion coefficient prior | αD | - |
| β parameter of the diffusion coefficient prior | βD | μm²/s |
| Photon inter-arrival time | Δt | s |
| Total trace duration | Ttotal | s |
| Molecule photon emission rate (maximum) | μmol | photons/s |
| α parameter of the molecule photon emission rate’s prior | αmol | - |
| β parameter of the molecule photon emission rate’s prior | βmol | photons/s |
| Emission rate of molecule n at time tk | $\mu_k^n$ | photons/s |
| Combined photon emission rate at time tk | μk | photons/s |
| Background photon emission rate | μback | photons/s |
| Minor semi-axis of confocal PSF (focal plane) | ωxy | μm |
| Major semi-axis of confocal PSF (optical axis) | ωz | μm |
| Location of molecule n at time tk in x-coordinate | $x_k^n$ | μm |
| Location of molecule n at time tk in y-coordinate | $y_k^n$ | μm |
| Location of molecule n at time tk in z-coordinate | $z_k^n$ | μm |
| Recorded photon inter-arrival time between tk and tk−1 | Δtk | s |
| Indicator variable for molecule n | bn | - |
| Prior weight for bn | qn | - |
| α parameter of prior weight qn | αq | - |
| β parameter of prior weight qn | βq | - |
| Upper bound for the number of model molecules | N | - |
| Mean value of initial molecule position’s prior in the xy-plane | μxy | μm |
| Mean value of initial molecule position’s prior on the z-axis | μz | μm |
| Variance of the initial molecule position’s prior in the xy-plane | $\sigma_{xy}^2$ | μm² |
| Variance of the initial molecule position’s prior on the z-axis | $\sigma_z^2$ | μm² |
| Periodic boundary in the xy-plane | Lxy | μm |
| Periodic boundary on the z-axis | Lz | μm |
TABLE III.
Probability distributions used and their densities. Here, the corresponding random variables are denoted by x. We use “;” to separate random variables from parameters. For example, Normal(x;μ,σ²) means that x is the random variable (e.g. ∫ dx Normal(x;μ,σ²) = 1), and μ and σ² are parameters characterizing this density.
| Distribution | Notation | Probability density function | Mean | Variance |
|---|---|---|---|---|
| Normal | Normal(μ,σ²) | | μ | σ² |
| Symmetric Normal | SymNormal(μ,σ²) | | 0 | μ² + σ² |
| Exponential | Exponential(μ) | | | |
| Chi-square | χ²(α,2) | | α | 2α |
| Gamma | Gamma(α,β) | | | |
| Inverse-Gamma | InvGamma(α,β) | | | |
| Beta | Beta(α,β) | | | |
| Bernoulli | Bernoulli(q) | | | |
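For reference, the standard textbook forms of the densities named in Table III can be written as below. The parametrization is an assumption inferred from the surrounding text rather than taken from the table itself: μ is treated as a rate for the Exponential (the simulation section describes exponential variables "of rate μk"), β as a scale for the Gamma and Inverse-Gamma (consistent with the units of βmol and βD in Table II), α as the degrees of freedom of the chi-square (consistent with the listed mean α and variance 2α), and the Symmetric Normal as an equal mixture of Normal(μ,σ²) and Normal(−μ,σ²) (consistent with the listed mean 0 and variance μ² + σ²).

```latex
\begin{align*}
\mathrm{Normal}(x;\mu,\sigma^2) &= \frac{1}{\sqrt{2\pi\sigma^2}}
    \exp\!\left(-\frac{(x-\mu)^2}{2\sigma^2}\right) \\
\mathrm{SymNormal}(x;\mu,\sigma^2) &= \tfrac{1}{2}\,\mathrm{Normal}(x;\mu,\sigma^2)
    + \tfrac{1}{2}\,\mathrm{Normal}(x;-\mu,\sigma^2) \\
\mathrm{Exponential}(x;\mu) &= \mu\, e^{-\mu x}, \qquad x \ge 0 \\
\chi^2(x;\alpha) &= \frac{x^{\alpha/2-1}\, e^{-x/2}}{2^{\alpha/2}\,\Gamma(\alpha/2)}, \qquad x > 0 \\
\mathrm{Gamma}(x;\alpha,\beta) &= \frac{x^{\alpha-1}\, e^{-x/\beta}}{\Gamma(\alpha)\,\beta^{\alpha}}, \qquad x > 0 \\
\mathrm{InvGamma}(x;\alpha,\beta) &= \frac{\beta^{\alpha}}{\Gamma(\alpha)}\, x^{-\alpha-1}\, e^{-\beta/x}, \qquad x > 0 \\
\mathrm{Beta}(x;\alpha,\beta) &= \frac{\Gamma(\alpha+\beta)}{\Gamma(\alpha)\,\Gamma(\beta)}\,
    x^{\alpha-1}(1-x)^{\beta-1}, \qquad 0 \le x \le 1 \\
\mathrm{Bernoulli}(x;q) &= q^{x}(1-q)^{1-x}, \qquad x \in \{0,1\}
\end{align*}
```

Under these conventions the Exponential has mean 1/μ and variance 1/μ², and the Gamma has mean αβ and variance αβ².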
TABLE IV.
Parameter values used in the generation of the synthetic traces. Choices are listed according to figures.
| | Lxy | Lz | ωxy | ωz | N | D | μmol | μback | Ttotal |
|---|---|---|---|---|---|---|---|---|---|
| Units | μm | μm | μm | μm | - | μm²/s | photons/s | photons/s | s |
| Fig. (2A) | 1 | 2 | 0.3 | 1.5 | 4 | 1 | 4 × 10⁴ | 10³ | 0.03 |
| Fig. (2B) | 1 | 2 | 0.3 | 1.5 | 4 | 1 | 4 × 10⁴ | 10³ | 0.03 |
| Fig. (2C) | 1 | 2 | 0.3 | 1.5 | 4 | 1 | 4 × 10⁴ | 10³ | 0.03 |
| Fig. (2D) | 1 | 2 | 0.3 | 1.5 | 4 | 1 | 4 × 10⁴ | 10³ | 0.03 |
| Fig. (2E) | 1 | 2 | 0.3 | 1.5 | 4 | 1 | 4 × 10⁴ | 10³ | 0.03 |
| Fig. (4A) | 1 | 2 | 0.3 | 1.5 | 4 | 1 | 4 × 10⁴ | 10³ | 0.03 |
| Fig. (5A) | 1 | 2 | 0.3 | 1.5 | 10 | 1 | 4 × 10⁴ | 10³ | 0.03 |
| Fig. (5C) | 1 | 2 | 0.3 | 1.5 | 4 | 1 | 4 × 10⁴ | 10³ | 0.03 |
| Fig. (5E) | 1 | 2 | 0.3 | 1.5 | 1 | 1 | 4 × 10⁴ | 10³ | 0.03 |
| Fig. (6) | 1 | 2 | 0.3 | 1.5 | 4 | 10⁻² | 4 × 10⁴ | 10³ | 0.03 |
| Fig. (6) | 1 | 2 | 0.3 | 1.5 | 4 | 10⁻¹ | 4 × 10⁴ | 10³ | 0.03 |
| Fig. (6) | 1 | 2 | 0.3 | 1.5 | 4 | 1 | 4 × 10⁴ | 10³ | 0.03 |
| Fig. (6) | 1 | 2 | 0.3 | 1.5 | 4 | 10 | 4 × 10⁴ | 10³ | 0.03 |
| Fig. (6) | 1 | 2 | 0.3 | 1.5 | 4 | 100 | 4 × 10⁴ | 10³ | 0.03 |
| Fig. (7A) | 1 | 2 | 0.3 | 1.5 | 4 | 1 | 4 × 10⁵ | 10³ | 0.03 |
| Fig. (7B) | 1 | 2 | 0.3 | 1.5 | 4 | 1 | 4 × 10⁴ | 10³ | 0.03 |
| Fig. (7C) | 1 | 2 | 0.3 | 1.5 | 4 | 1 | 4 × 10³ | 10³ | 0.03 |
| Fig. (7D) | 1 | 2 | 0.3 | 1.5 | 4 | 1 | 10³ | 10³ | 0.03 |
| Fig. (A3A) | 1 | 2 | 0.3 | 1.5 | 10 | 10 | 4 × 10⁵ | 10³ | 0.05 |
| Fig. (A3C) | 1 | 2 | 0.3 | 1.5 | 10 | 10 | 4 × 10⁴ | 10³ | 0.05 |
| Fig. (A4A) | 1 | 2 | 0.3 | 1.5 | 10 | 10 | 4 × 10⁵ | 10³ | 0.05 |
| Fig. (A4C) | 1 | 2 | 0.3 | 1.5 | 10 | 10 | 4 × 10⁴ | 10³ | 0.05 |
| Fig. (A4E) | 1 | 2 | 0.3 | 1.5 | 10 | 10 | 4 × 10³ | 10³ | 0.05 |
TABLE V.
Parameter values used in the analyses of the traces. Choices are listed according to figures.
| ωxy | ωz | N | αD | βD | αmol | βmol | αq | βq | μxy | μz | |||
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Units | μm | μm | - | - | μm2/s | - | phts/s | - | - | - | μm | μm | μm 2 |
| Fig. (2A) | 0.3 | 1.5 | - | 1 | 1 | 1 | 105 | - | - | 0.1 | 0.1 | 1 | 1 |
| Fig. (2B) | 0.3 | 1.5 | - | 1 | 1 | 1 | 105 | - | - | 0.1 | 0.1 | 1 | 1 |
| Fig. (2C) | 0.3 | 1.5 | - | 1 | 1 | 1 | 105 | - | - | 0.1 | 0.1 | 1 | 1 |
| Fig. (2D) | 0.3 | 1.5 | - | 1 | 1 | 1 | 105 | - | - | 0.1 | 0.1 | 1 | 1 |
| Fig. (2E) | 0.3 | 1.5 | - | 1 | 1 | 1 | 105 | - | - | 0.1 | 0.1 | 1 | 1 |
| Fig. (4) | 0.3 | 1.5 | 20 | 1 | 1 | 1 | 105 | 1 | 1 | 0.1 | 0.1 | 1 | 1 |
| Fig. (5A3) | 0.3 | 1.5 | 20 | 1 | 1 | 1 | 105 | 1 | 1 | 0.1 | 0.1 | 1 | 1 |
| Fig. (5B3) | 0.3 | 1.5 | 20 | 1 | 1 | 1 | 105 | 1 | 1 | 0.1 | 0.1 | 1 | 1 |
| Fig. (5C3) | 0.3 | 1.5 | 20 | 1 | 1 | 1 | 105 | 1 | 1 | 0.1 | 0.1 | 1 | 1 |
| Fig. (6) | 0.3 | 1.5 | 20 | 1 | 1 | 1 | 105 | 1 | 1 | 0.1 | 0.1 | 1 | 1 |
| Fig. (6) | 0.3 | 1.5 | 20 | 1 | 1 | 1 | 105 | 1 | 1 | 0.1 | 0.1 | 1 | 1 |
| Fig. (6) | 0.3 | 1.5 | 20 | 1 | 1 | 1 | 105 | 1 | 1 | 0.1 | 0.1 | 1 | 1 |
| Fig. (6) | 0.3 | 1.5 | 20 | 1 | 100 | 1 | 105 | 1 | 1 | 0.1 | 0.1 | 1 | 1 |
| Fig. (7A) | 0.3 | 1.5 | 20 | 1 | 1 | 1 | 105 | 1 | 1 | 0.1 | 0.1 | 1 | 1 |
| Fig. (7B) | 0.3 | 1.5 | 20 | 1 | 1 | 1 | 105 | 1 | 1 | 0.1 | 0.1 | 1 | 1 |
| Fig. (7C) | 0.3 | 1.5 | 20 | 1 | 1 | 1 | 105 | 1 | 1 | 0.1 | 0.1 | 1 | 1 |
| Fig. (7D) | 0.3 | 1.5 | 20 | 1 | 1 | 1 | 105 | 1 | 1 | 0.1 | 0.1 | 1 | 1 |
| Fig. (8) | 0.23 | 0.55 | 20 | 1 | 1 | 1 | 105 | 1 | 1 | 0.1 | 0.1 | 1 | 1 |
| Fig. (9) | 0.23 | 0.55 | 20 | 1 | 1 | 1 | 105 | 1 | 1 | 0.1 | 0.1 | 1 | 1 |
| Fig. (10) | 0.23 | 0.55 | 20 | 1 | 1 | 1 | 105 | 1 | 1 | 0.1 | 0.1 | 1 | 1 |
| Fig. (11) | 0.23 | 0.55 | 20 | 1 | 1 | 1 | 105 | 1 | 1 | 0.1 | 0.1 | 1 | 1 |
| Fig. (12) | 0.27 | 4.51 | 20 | 1 | 1 | 1 | 105 | 1 | 1 | 0.1 | 0.1 | 1 | 1 |
| Fig. (13) | 0.22 | 3.90 | 20 | 1 | 100 | 1 | 105 | 1 | 1 | 0.1 | 0.1 | 1 | 1 |
| Fig. (A3) | 0.3 | 1.5 | 20 | 1 | 1 | 1 | 105 | 1 | 1 | 0.1 | 0.1 | 1 | 1 |
| Fig. (A4) | 0.3 | 1.5 | 20 | 1 | 1 | 1 | 105 | 1 | 1 | 0.1 | 0.1 | 1 | 1 |
| Fig. (A5) | 0.23 | 0.55 | 20 | 1 | 1 | 1 | 105 | 1 | 1 | 0.1 | 0.1 | 1 | 1 |
| Fig. (A6) | 0.22 | 3.90 | 20 | 1 | 100 | 1 | 105 | 1 | 1 | 0.1 | 0.1 | 1 | 1 |
ACKNOWLEDGEMENTS
S.P. acknowledges support from the NIH NIGMS (R01GM130745) for supporting early efforts in nonparametrics and NIH NIGMS (R01GM134426) for supporting single photon efforts. S.S. acknowledges support from the NIH NIGMS (R01GM121885).
APPENDIX
FIG. A1. FCS curves resulting from exceedingly short traces (same synthetic data as Fig. 1) with linear (A) and semi-logarithmic (B) binning.
Due to the limited data, the quality of the fitted autocorrelation curve, G(τ), does not improve considerably for (B) as compared to (A).
Additional Analysis Results
In Fig. (A2) we illustrate the weakness of FCS analysis when applied on limited datasets such as those in the scope of our method. Additionally, using synthetic data, in Fig. (A3) we estimate diffusion coefficients faster than those in the Results section and in Fig. (A4) we estimate photon emission rates. Finally, using experimental data of Cy3 and 5-TAMRA dyes obtained as described in the Methods section, in Figs. (A5) and (A6), respectively, we benchmark the same estimates on real data.
FIG. A2. FCS curves resulting from exceedingly short traces.
Shown are autocorrelation curves, G(τ), of 5-TAMRA experimental traces, binned at 10 μs, for 100 ms and ≈ 500 photon arrivals (A); 200 ms and ≈ 1000 photon arrivals (B); 300 ms and ≈ 3000 photon arrivals (C); 2 s and ≈ 15000 photon arrivals (D); 30 s and ≈ 15 × 10⁵ photon arrivals (E); 100 s and ≈ 15 × 10⁶ photon arrivals (F). Even a visual inspection illustrates how poorly FCS applies to traces as short as those analyzed by our BNP method.
FIG. A3. A larger molecule photon emission rate provides more photons per unit time and sharper diffusion coefficient estimates.
(A1, B1) Instantaneous molecule photon emission rates normalized by μmol. (A2, B2) Photon arrival traces resulting from combining photon emissions from every molecule and the background. These traces are produced by 10 molecules diffusing at 10 μm²/s for a total time of 50 ms under a background photon emission rate of 10³ photons/s and a molecule photon emission rate of 4×10⁵ photons/s containing ≈ 3000 photon arrivals (A2), and a molecule photon emission rate of 4 × 10⁴ photons/s containing ≈ 2000 photon arrivals (B2). (A3, B3) Posterior probability distributions drawn from traces with differing molecule photon emission rates (shown in (A2, B2)). As expected, for the traces with higher molecule photon emission rate, the peak of the posterior sharply matches the exact value of D (dashed line). As we decrease the molecule photon emission rate, the estimate gradually becomes less reliable.
FIG. A4. A higher molecule photon emission rate provides more photons per unit time and sharper emission rate estimates.
(A1, B1, C1) Instantaneous molecule photon emission rates, normalized by μmol. (A2, B2, C2) Photon arrival traces resulting from combining photon emissions from every molecule and the background. These traces are produced by 10 molecules diffusing at 10 μm²/s for a total time of 50 ms under a background photon emission rate of 10³ photons/s and a molecule photon emission rate of 4 × 10⁵ photons/s containing ≈ 3000 photon arrivals (A2), a molecule photon emission rate of 4 × 10⁴ photons/s containing ≈ 2000 photon arrivals (B2), and a molecule photon emission rate of 4 × 10³ photons/s containing ≈ 1000 photon arrivals (C2). (A3, B3, C3) Posterior probability distributions drawn from traces with differing molecule photon emission rates (shown in (A2, B2, C2)). As expected, for the traces with higher molecule photon emission rate, the peak of the posterior sharply matches the exact value of μmol (dashed line). As we decrease the molecule photon emission rate, the estimate gradually becomes less reliable.
FIG. A5. Estimation of the diffusion coefficient and molecule photon emission rate for Cy3 dyes.
(A) Experimental intensity trace (binned at 100 μs) with concentration 1 nM of Cy3 dye molecules and 61% glycerol. A background photon emission rate of 600 photons/s is known from calibration. (B) Analyzed portion of the trace containing ≈ 3000 photon arrivals. (C) Posterior probability distributions and the value (red dashed line) of the molecule photon emission rate determined by the photon counting histogram (PCH) method on the entire trace [22]. (D) Similarly to Fig. (1), we compare our method’s diffusion coefficient estimate (green dots) to FCS (blue asterisk) as a function of the number of photons used in the analysis. Since by 1.5×10⁴ photon arrivals our method has converged, we avoid analyzing larger traces. The red dashed line is the FCS estimate obtained from the entire, 5 min, trace containing ≈ 3 × 10⁶ photon arrivals.
FIG. A6. Estimation of the diffusion coefficient and molecule photon emission rate for 5-TAMRA dyes.
(A) Experimental intensity trace (binned at 10 μs) with concentration 20 nM of 5-TAMRA dye molecules. A background photon emission rate of 300 photons/s is known from calibration. (B) Analyzed portion of the trace containing ≈ 8000 photon arrivals. (C) Posterior probability distributions and the value (red dashed line) of the molecule photon emission rate determined by the photon counting histogram (PCH) method on the entire trace [22]. (D) Similarly to Fig. (1), we compare our method’s diffusion coefficient estimate (green dots) to FCS (blue asterisk) as a function of the number of photons used in the analysis. Since by 1.5 × 10⁴ photon arrivals our method has converged, we avoid analyzing larger traces. The red dashed line is the FCS estimate obtained from the entire, 10 min, trace containing ≈ 6 × 10⁶ photon arrivals.
Detailed Methods Description
Description of Fluorescence Correlation Spectroscopy (FCS)
In FCS the primary quantity of interest is the spontaneously fluctuating fluorescence intensity [29, 91]. Correlations in fluorescence intensities are used to determine physical parameters such as diffusion coefficients. The normalized time autocorrelation function of the fluorescence intensity is defined as
$$G(\tau) = \frac{\langle \delta I(t)\, \delta I(t+\tau) \rangle}{\langle I(t) \rangle^{2}}$$
where I(t) is the fluorescence intensity, δI(t) is the intensity fluctuation at time t, and τ is the lag time. The intensity fluctuations are defined as the deviations from the average of the intensity, $\delta I(t) = I(t) - \langle I(t) \rangle$. For freely diffusing molecules in a 3D Gaussian confocal volume, the autocorrelation, which we use in this study, is
$$G(\tau) = \frac{1}{\langle N \rangle} \left(1 + \frac{4 D \tau}{\omega_{xy}^{2}}\right)^{-1} \left(1 + \frac{4 D \tau}{\omega_{z}^{2}}\right)^{-1/2}$$
where $\langle N \rangle$ is the average number of molecules in the confocal volume, D is the diffusion coefficient, and ωxy and ωz are the confocal volume axes along the xy and z directions. Further details on correlative analysis are contained in the cited literature [15, 29, 30, 35, 69, 91].
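To make the two quantities above concrete, the following sketch computes an empirical normalized autocorrelation from a binned intensity trace and evaluates the free-diffusion model. It assumes the standard closed form for a 3D Gaussian confocal volume, G(τ) = (1/⟨N⟩)(1 + 4Dτ/ωxy²)⁻¹(1 + 4Dτ/ωz²)⁻¹ᐟ²; the function names and the toy trace are hypothetical, and a practical FCS analysis would use a multi-tau correlator and a least-squares fit rather than this direct loop.

```python
import math


def fcs_model(tau, n_avg, d, w_xy, w_z):
    """Model G(tau) for free 3D diffusion through a 3D Gaussian confocal volume (assumed form)."""
    lateral = 1.0 + 4.0 * d * tau / w_xy ** 2
    axial = 1.0 + 4.0 * d * tau / w_z ** 2
    return 1.0 / (n_avg * lateral * math.sqrt(axial))


def autocorrelation(intensity, lag):
    """Empirical <dI(t) dI(t+lag)> / <I>^2 for a binned intensity trace; lag is in bins."""
    n_pairs = len(intensity) - lag
    if lag < 0 or n_pairs <= 0:
        raise ValueError("lag must satisfy 0 <= lag < len(intensity)")
    mean = sum(intensity) / len(intensity)
    cov = sum((intensity[i] - mean) * (intensity[i + lag] - mean)
              for i in range(n_pairs)) / n_pairs
    return cov / mean ** 2


# Toy binned trace (photon counts per bin), for illustration only.
toy_trace = [1, 3, 1, 3]
g0 = autocorrelation(toy_trace, 0)
g1 = autocorrelation(toy_trace, 1)
model_at_zero = fcs_model(0.0, n_avg=4, d=1.0, w_xy=0.3, w_z=1.5)
```

At τ = 0 the model reduces to 1/⟨N⟩, which is why the amplitude of the correlation curve reports on the average number of molecules in the confocal volume.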
Explanation of Data Simulation
To generate synthetic traces we simulate molecules moving through a three dimensional illuminated volume. The number of moving molecules, N, is predefined in each simulation. We apply periodic boundaries to our volume of Lxy and Lz parallel to the focal plane and optical axis, respectively, to keep a relatively stable concentration of molecules near the confocal volume.
We denote the locations of the molecules as $x_k^n$, $y_k^n$, and $z_k^n$, where k labels time levels and n = 1,2,...,N labels molecules. The total trace duration, Ttotal = tK − t0, is predefined. The time intervals between successive recorded photons, Δtk, are generated through pseudo-random computer simulations and recorded for subsequent analysis.
The locations of the molecules, $x_0^n$, $y_0^n$, $z_0^n$, at the first evaluation time t0 are randomly sampled from the uniform distribution with borders identical to the boundaries ±Lxy and ±Lz of the prescribed simulation region. Locations $x_k^n$, $y_k^n$, $z_k^n$, for k = 1,...,K, at times tk are generated according to the diffusion model explained above under a predefined diffusion coefficient D.
We obtain photon inter-arrival times, Δtk, by simulating exponential random variables of rate μk. For independent background and molecule photon emissions, the corresponding exponential rates μk depend on a Gaussian PSF as in eqs. (2)–(3). Both the background photon emission rate, μback, and the molecule photon emission rate, μmol, are predefined.
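The simulation procedure described above can be sketched as follows. This is a minimal illustration rather than the code used for the figures, and it rests on several assumptions: the PSF is taken to be the 3D Gaussian exp(−2(x²+y²)/ωxy² − 2z²/ωz²), the combined rate is held constant between successive photon arrivals, and molecules take Gaussian steps of variance 2DΔtk per axis. All function names are hypothetical; default geometry values follow Table IV.

```python
import math
import random


def wrap(coord, half_width):
    """Map a coordinate back into [-half_width, half_width) (periodic boundary)."""
    return (coord + half_width) % (2.0 * half_width) - half_width


def psf(x, y, z, w_xy, w_z):
    """ASSUMED 3D Gaussian PSF, equal to 1 at the center of the confocal volume."""
    return math.exp(-2.0 * (x * x + y * y) / w_xy ** 2 - 2.0 * z * z / w_z ** 2)


def simulate_trace(n_mol, d, mu_mol, mu_back, t_total,
                   l_xy=1.0, l_z=2.0, w_xy=0.3, w_z=1.5, seed=0):
    """Return photon inter-arrival times (s) for n_mol freely diffusing molecules."""
    if mu_back <= 0.0:
        raise ValueError("mu_back must be positive in this sketch")
    rng = random.Random(seed)
    half_widths = (l_xy, l_xy, l_z)
    # Initial locations: uniform over the periodic simulation region.
    pos = [[rng.uniform(-h, h) for h in half_widths] for _ in range(n_mol)]
    elapsed, inter_arrivals = 0.0, []
    while True:
        # Combined rate: background plus each molecule weighted by the PSF.
        mu_k = mu_back + mu_mol * sum(psf(x, y, z, w_xy, w_z) for x, y, z in pos)
        dt = rng.expovariate(mu_k)  # exponential waiting time of rate mu_k
        if elapsed + dt > t_total:
            return inter_arrivals
        elapsed += dt
        inter_arrivals.append(dt)
        # Free diffusion over dt: Gaussian step of variance 2*D*dt along each axis.
        step_sd = math.sqrt(2.0 * d * dt)
        for p in pos:
            for axis in range(3):
                p[axis] = wrap(p[axis] + rng.gauss(0.0, step_sd), half_widths[axis])


# Parameters in the spirit of the Fig. (4A) row of Table IV.
trace = simulate_trace(n_mol=4, d=1.0, mu_mol=4e4, mu_back=1e3, t_total=0.03)
```

Holding the rate fixed between arrivals is an approximation that is reasonable when molecules move little during one inter-arrival time; a thinning scheme would be needed to sample arrivals exactly under a continuously varying rate.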
Definition of Molecule Photon Emission Rate
In this study, we use the emission rate of detected photons for a single fluorophore at position (x,y,z), denoted μ(x,y,z). This is formulated as the product of the following factors: the maximum excitation intensity μ0, the efficiency of the photon collection at the center of the confocal volume, the efficiency of the detector, the quantum efficiency of the fluorophore, the fluorophore absorption cross-section σ, the excitation profile EXC(x,y,z), and the detection profile CEF(x,y,z) [31]. By revising the definition of μ(x,y,z), we obtain μ(x,y,z) = μmol PSF(x,y,z), where μmol collects the position-independent factors and PSF(x,y,z) = EXC(x,y,z) CEF(x,y,z).
To relate our single molecule photon emission rate μmol to the average photon count rate typically determined in bulk experiments, we compute a spatial average
where V denotes our PSF’s effective volume [59, 85] which is equal to . As a result, our molecule photon emission rate μmol is related to according to .
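As a worked example of this spatial average, the derivation below assumes a 3D Gaussian PSF of the form exp(−2(x²+y²)/ωxy² − 2z²/ωz²) and the effective volume V = π³ᐟ²ωxy²ωz that is commonly used in FCS. Both are assumptions made for illustration; the relation obtained depends on these conventions and may differ from the one used in this study by a constant factor.

```latex
\begin{align*}
\mathrm{PSF}(x,y,z) &= \exp\!\left(-\frac{2(x^2+y^2)}{\omega_{xy}^2} - \frac{2z^2}{\omega_z^2}\right)
    && \text{(assumed form)} \\
\int \mathrm{PSF}(x,y,z)\, dx\, dy\, dz
    &= \left(\omega_{xy}\sqrt{\tfrac{\pi}{2}}\right)^{2} \left(\omega_{z}\sqrt{\tfrac{\pi}{2}}\right)
     = \left(\frac{\pi}{2}\right)^{3/2} \omega_{xy}^2\, \omega_z \\
V &= \pi^{3/2}\, \omega_{xy}^2\, \omega_z
    && \text{(assumed effective volume)} \\
\langle \mu \rangle &= \frac{\mu_{mol}}{V} \int \mathrm{PSF}(x,y,z)\, dx\, dy\, dz
     = 2^{-3/2}\, \mu_{mol}
\end{align*}
```

Under these assumptions the peak rate exceeds the volume-averaged rate by a factor of 2³ᐟ² ≈ 2.83.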
Description of Wilson-Hilferty Approximation
To perform the necessary computations of the next section, we use a Wilson-Hilferty transform [110] to approximate the probability density of exponential random variables. We use this approximation to sample the locations of the molecules within our overall Gibbs sampler (see next).
To apply the Wilson-Hilferty approximation, we first transform our observation random variable Δtk to a new random variable ρk, where $\rho_k = 2 \mu_k \Delta t_k$. A change of variables indicates that $\Delta t_k \sim \text{Exponential}(\mu_k)$ implies $\rho_k \sim \chi^2(2)$, where χ²(2) denotes the chi-square probability distribution with 2 degrees of freedom. By applying another transformation, $(\rho_k/2)^{1/3}$, according to [110], the transformed variable follows an approximately normal probability distribution, $(\rho_k/2)^{1/3} \sim \text{Normal}(8/9, 1/9)$. Therefore, since $\rho_k/2 = \mu_k \Delta t_k$, we establish the approximation
$$(\mu_k \Delta t_k)^{1/3} \sim \text{Normal}\left(\tfrac{8}{9}, \tfrac{1}{9}\right)$$
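The quality of this approximation can be checked numerically. For a chi-square variable X with ν degrees of freedom, the Wilson-Hilferty result states that (X/ν)^(1/3) is approximately Normal(1 − 2/(9ν), 2/(9ν)); with ν = 2 and X = 2μkΔtk, this gives (μkΔtk)^(1/3) approximately Normal(8/9, 1/9). The sketch below, with a hypothetical function name and an arbitrary rate, compares empirical moments of the transformed variable against those values.

```python
import random


def wilson_hilferty_moments(mu_k, n_samples=100_000, seed=1):
    """Empirical mean and variance of (mu_k * dt)^(1/3) for dt ~ Exponential(rate=mu_k)."""
    rng = random.Random(seed)
    values = [(mu_k * rng.expovariate(mu_k)) ** (1.0 / 3.0) for _ in range(n_samples)]
    mean = sum(values) / n_samples
    var = sum((v - mean) ** 2 for v in values) / n_samples
    return mean, var


# Arbitrary rate (photons/s); the transformed variable does not depend on its value.
mean, var = wilson_hilferty_moments(mu_k=4e4)
```

The exact moments of the transformed variable are Γ(4/3) ≈ 0.893 and Γ(5/3) − Γ(4/3)² ≈ 0.105, so the normal approximation with mean 8/9 ≈ 0.889 and variance 1/9 ≈ 0.111 is close but not exact.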
Detailed Description of the Inference Framework
Prior Probability Distributions
Within the Bayesian paradigm, all unknown model parameters need priors. These parameters are: the diffusion coefficient D; the molecule photon emission rate μmol; the initial molecule locations $x_0^n$, $y_0^n$, $z_0^n$; as well as the indicator prior weights qn.
Prior on the Diffusion Coefficient
To make sure that D sampled in our formulation attains only positive values, we choose an Inverse-Gamma prior
$$D \sim \text{InvGamma}(\alpha_D, \beta_D) \tag{A1}$$
This prior is conjugate to the motion model which simplifies the computations shown below.
Priors on Molecule Photon Emission Rate
To guarantee that μmol sampled in our formulation also attains only positive values, we choose a Gamma prior
$$\mu_{mol} \sim \text{Gamma}(\alpha_{mol}, \beta_{mol}) \tag{A2}$$
Priors on Initial Molecule Locations
Because of the symmetries inherent to the confocal volume, e.g., a molecule at a location (x,y,z) gives rise to the same photon emission rate as a molecule at location (−x, −y, −z), we use priors on the initial locations that respect these symmetries. To simplify the computations, we use independent symmetric normal distributions
| (A3) |
| (A4) |
| (A5) |
Priors and Hyperpriors for the Indicators
To simplify the computations described in the next section, we use a finite, but large, model population of N molecules that contains both contributing and noncontributing ones. These molecules are collectively indexed by n = 1,2,...,N. As described in the Methods section, inferring how many molecules are actually warranted by the data analyzed is the same as estimating how many of those N molecules are active, i.e., bn = 1, while the rest are inactive, i.e., bn = 0, and so have no influence on the measurements and are included only for computational reasons.
We use a Bernoulli prior of weight qn to make sure that each indicator bn takes only values 0 or 1. Moreover, on each weight qn, we assign a beta hyperprior
| (A6) |
| (A7) |
To make sure that the resulting formulation avoids overfitting, we make the specific selections and . For these choices [1, 16, 77, 78], and in the limit that (that is, when the assumed molecule population is large), this prior/hyperprior choice converges to a non-parametric beta-Bernoulli process. Therefore, for , the posterior is well defined and becomes independent of the selected value of N. In other words, provided N is large enough, its effect on the results is negligible; while its precise value has only computational implications.
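The insensitivity to N can be illustrated with a short sketch. It assumes the finite beta-Bernoulli approximation commonly used in the cited literature, qn ∼ Beta(α/N, β(N − 1)/N); the hyperparameter names `alpha` and `beta` are placeholders, since the specific selections are not recoverable from the text above. Under this assumption, the prior expected number of active molecules tends to α/β as N grows.

```python
import numpy as np

def prior_expected_active(N, alpha=1.0, beta=1.0):
    """Prior expected number of active molecules among N model molecules.

    Assumes q_n ~ Beta(A, B) with A = alpha / N and B = beta * (N - 1) / N
    (an assumed parameterization), and b_n ~ Bernoulli(q_n).
    """
    A = alpha / N
    B = beta * (N - 1) / N
    return N * A / (A + B)   # N * E[q_n]

def sample_indicators(N, alpha, beta, rng):
    """Draw the weights q_n and the indicators b_n from the prior."""
    q = rng.beta(alpha / N, beta * (N - 1) / N, size=N)
    return (rng.random(N) < q).astype(int)

for N in (10, 100, 10_000, 1_000_000):
    print(N, prior_expected_active(N, alpha=2.0, beta=4.0))
```

The printed values approach α/β = 0.5, which is the sense in which the choice of N has only computational implications once it is large enough.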
Description of the Computational Implementation
Here, is the joint probability distribution of our framework where molecular trajectories and measurements are gathered in
Posterior samples are generated according to Gibbs sampling [37, 66, 86, 100, 107]. We achieve this by sampling a variable conditioned on all other variables and the given photon inter-arrival times Δt. Conceptually, the steps in the generation of each posterior sample are:
- For each n of the active molecules:
  - Update the x trajectory of active molecule n
  - Update the y trajectory of active molecule n
  - Update the z trajectory of active molecule n
- Update jointly the x, y, and z trajectories for all n of the inactive molecules
- Update the diffusion coefficient D
- Update jointly the prior weights qn for all model molecules and simultaneously update jointly the indicators bn for all model molecules
- Update the molecule photon emission rate μmol
Sampling Active Molecules Locations
To sample the location of an active molecule (, , ), we use forward filtering and backward sampling [11, 20, 53, 94]. In particular, we update each dimension sequentially from the following full conditional probability distributions , , and . Below, we show in detail the calculation only for sampling , since the calculations for sampling and are similar.
To sample the trajectory , we rely on the factorization
and, according to this factorization, we sample individual locations sequentially
where, k = 1,...,K − 1. However, to be able to perform these steps, we first need to compute the involved probability distributions. We describe below a computationally efficient way to do so that proceeds in a forward filtering and a backward sampling step.
Before we start the sampling of the locations, we determine each one of the individual probability distributions that are needed. To do this in a computationally tractable manner [14, 20], we compute filter distributions . In our case, both the dynamic (eqs. (7)–(9)) and observation (eq. (1)) probability distributions provide equal probabilities for and . Therefore, the filter distribution consists of two modes symmetrically placed across the origin [53]. Accordingly, we compute an approximate bimodal symmetric filter of the form
where SymNormal describes the symmetric normal distribution. The filter, that is the values of and , is updated iteratively according to
| (A8) |
To carry out these computations efficiently, similar to [53], we work on an approximate model where the exponential emission equation, eq. (1), is replaced by a normal one using the Wilson-Hilferty approximation discussed earlier. Our approximate emission equation is
where is given in eq. (2); while , and are given by the Wilson-Hilferty approximation [110]. As explained earlier, the approximation is given by , , and . Because of the specific choices of our problem (i.e., diffusive molecules, symmetric normal filter at the preceding time, and normal likelihood), eq. (A8) reduces to
| (A9) |
Finally, to obtain the values of and , we linearize the product in eq. (A9) as described next. From eq. (A9), we have
| (A10) |
The density in eq. (A10) consists of two modes, one on the positive semi-axis of x and one on the negative semi-axis of x. Considering and and linearizing them around the previous filter’s mode, or , the modes of eq. (A10) are approximated by
Combining both approximations, the density of eq. (A10), is approximated by
| (A11) |
where , and .
Equation (A11) describes a symmetric normal distribution. Equating this distribution with our filter, i.e., SymNormal , we obtain , and . These apply for k = 2,...,K and are used to update the filter. To begin, we use eqs. (A3)–(A5) and set and .
Having computed the filter distributions above, we are able to sample the individual locations by starting from and moving backward towards . In particular, by applying Bayes’ rule, each one of the individual distributions factorizes as
| (A12) |
The first term is given by the filter distribution which is replaced by our approximate SymNormal , and the second term is our motion model Normal , all of which are known at this stage. Therefore, backward sampling starts at and continues for with
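The structure of forward filtering and backward sampling can be sketched on a deliberately simplified stand-in model. The code below is not the symmetric-normal filter described above: it assumes a one-dimensional random walk with increments of variance 2DΔt and a linear normal observation of the position with variance `r`, so that every filter is a single normal. The backward step multiplies the stored filter by the motion model, mirroring the factorization in eq. (A12). All names are illustrative.

```python
import numpy as np

def ffbs_random_walk(y, dts, D, r, m0, v0, rng):
    """Forward filtering, backward sampling for a 1D random walk.

    Assumed model (simplified): x_1 ~ Normal(m0, v0),
    x_{k+1} | x_k ~ Normal(x_k, 2 * D * dts[k]), y_k | x_k ~ Normal(x_k, r).
    Returns one sampled trajectory and the filter means and variances.
    """
    y = np.asarray(y, dtype=float)
    q = 2.0 * D * np.asarray(dts, dtype=float)   # motion variances, length K - 1
    K = y.size
    m = np.empty(K)
    v = np.empty(K)

    # Forward pass: predict with the motion model, then update with y_k.
    mp, vp = m0, v0
    for k in range(K):
        if k > 0:
            mp, vp = m[k - 1], v[k - 1] + q[k - 1]
        gain = vp / (vp + r)
        m[k] = mp + gain * (y[k] - mp)
        v[k] = (1.0 - gain) * vp

    # Backward pass: sample x_K from the last filter, then each x_k from
    # the product of its filter and the motion model towards x_{k+1}.
    x = np.empty(K)
    x[-1] = rng.normal(m[-1], np.sqrt(v[-1]))
    for k in range(K - 2, -1, -1):
        prec = 1.0 / v[k] + 1.0 / q[k]
        mean = (m[k] / v[k] + x[k + 1] / q[k]) / prec
        x[k] = rng.normal(mean, np.sqrt(1.0 / prec))
    return x, m, v
```

In the setting of the text, the single normal filter would be replaced by the symmetric normal filter and the observation update by the linearized Wilson-Hilferty likelihood; the forward and backward recursions keep the same shape.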
Sampling Inactive Molecule Trajectories
To update the trajectories of the inactive molecules, we sample from the corresponding conditionals . Since the inactive molecules are not associated with the observations in , these reduce to which we simulate as standard 3D Brownian motion [64].
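A minimal sketch of such a simulation follows, assuming independent normal increments of variance 2DΔt in each dimension between successive photon arrivals. The starting position is passed in by the caller (in the framework above it would be drawn from the priors on the initial locations), and the function name is a placeholder.

```python
import numpy as np

def simulate_brownian_3d(start, dts, D, rng):
    """Simulate a 3D Brownian trajectory on an irregular time grid.

    start : initial position, length 3
    dts   : time intervals between successive time levels, length K - 1
    D     : diffusion coefficient
    Returns an array of shape (K, 3).
    """
    start = np.asarray(start, dtype=float)
    dts = np.asarray(dts, dtype=float)
    # Each increment is Normal(0, 2 * D * dt) independently per dimension.
    steps = rng.normal(size=(dts.size, 3)) * np.sqrt(2.0 * D * dts)[:, None]
    return np.vstack([start, start + np.cumsum(steps, axis=0)])

rng = np.random.default_rng(1)
traj = simulate_brownian_3d([0.1, -0.2, 0.3], np.full(1000, 1e-3), D=1.0, rng=rng)
print(traj.shape)
```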
Sampling the Diffusion Coefficient
We sample the diffusion coefficient from the conditional probability distribution . Because of the specific dependencies of the variables in this formulation, e.g., eq. (A1) and eqs. (7)–(9), the conditional distribution simplifies to . Using Bayes’ rule, this distribution becomes where and .
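The conjugate update can be sketched as follows, assuming a motion model with independent normal increments of variance 2DΔt per dimension and an Inverse-Gamma prior with shape `alpha0` and scale `beta0` (placeholder names). Under these assumptions, each of the n increments adds 1/2 to the shape and (Δx)²/(4Δt) to the scale.

```python
import numpy as np

def diffusion_posterior_params(trajs, dts, alpha0, beta0):
    """Posterior Inverse-Gamma parameters for D.

    trajs : array of shape (n_molecules, K, 3) holding all model trajectories
    dts   : array of shape (K - 1,) holding the time intervals
    """
    trajs = np.asarray(trajs, dtype=float)
    dts = np.asarray(dts, dtype=float)
    sq = np.diff(trajs, axis=1) ** 2                  # squared increments
    alpha = alpha0 + 0.5 * sq.size                    # one half per increment
    beta = beta0 + np.sum(sq / (4.0 * dts[None, :, None]))
    return alpha, beta

def sample_diffusion_coefficient(trajs, dts, alpha0, beta0, rng):
    """Draw D from its Inverse-Gamma full conditional."""
    alpha, beta = diffusion_posterior_params(trajs, dts, alpha0, beta0)
    # If G ~ Gamma(shape=alpha, rate=beta), then 1 / G ~ InvGamma(alpha, beta).
    return 1.0 / rng.gamma(shape=alpha, scale=1.0 / beta)
```

Because the update uses the trajectories of active and inactive molecules alike, the number of increments grows with the model population, while only the active molecules are informed by the photon arrivals.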
Sampling Molecule Indicators
For each molecule n we sample its indicator prior weight, qn, from the corresponding conditional distribution , which simplifies to . For this we use eq. (A7) and eq. (A6). According to Bayes’ rule, the latter distribution becomes where and . Subsequently, we update the indicators bn by sampling from the corresponding conditional distribution using a Metropolis-Hastings algorithm [18, 23]. For this, we use a proposal . With this proposal, the acceptance ratio becomes
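The update of the prior weights described at the start of this subsection is a standard beta-Bernoulli conjugate step. A minimal sketch, assuming qn ∼ Beta(A, B) and bn ∼ Bernoulli(qn) with placeholder hyperparameter names `A` and `B`, is given below; it does not cover the Metropolis-Hastings update of the indicators.

```python
import numpy as np

def posterior_weight_params(b, A, B):
    """Beta parameters of q_n given b_n, for a Beta(A, B) prior on q_n."""
    b = np.asarray(b)
    return A + b, B + 1 - b

def sample_prior_weights(b, A, B, rng):
    """Draw all weights q_n jointly; they are independent given the indicators."""
    a_post, b_post = posterior_weight_params(b, A, B)
    return rng.beta(a_post, b_post)

rng = np.random.default_rng(4)
indicators = np.array([1, 0, 0, 1, 0])
print(sample_prior_weights(indicators, A=0.2, B=0.8, rng=rng))
```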
Sampling the Molecule Photon Emission Rate
In the last step, after updating the locations and indicators of the molecules, we sample the molecule photon emission rate from the corresponding conditional distribution . To sample this distribution, we also use a Metropolis-Hastings step. For this, we use proposals of the form where denotes the current sampled value. Using both eqs. (1) and (2), the acceptance ratio becomes
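A Metropolis-Hastings step of this kind can be sketched under explicit assumptions, since the proposal and the acceptance ratio are not recoverable from the text above. The sketch assumes that the photon emission rate at time level k has the form `mu_back + mu_mol * s[k]`, where `s[k]` is a precomputed sum over active molecules of their position-dependent detection profile, that the inter-arrival times are exponentially distributed with these rates, and that proposals are drawn from a Gamma distribution whose mean is the current value. All names are hypothetical.

```python
import math
import numpy as np

def log_gamma_pdf(x, shape, scale):
    """Log density of a Gamma(shape, scale) distribution at x > 0."""
    return ((shape - 1.0) * math.log(x) - x / scale
            - shape * math.log(scale) - math.lgamma(shape))

def log_target(mu_mol, dts, s, mu_back, prior_shape, prior_scale):
    """Log of prior times exponential inter-arrival likelihood (up to a constant)."""
    rates = mu_back + mu_mol * s
    loglik = np.sum(np.log(rates) - rates * dts)
    return loglik + log_gamma_pdf(mu_mol, prior_shape, prior_scale)

def mh_step_mu(mu_old, dts, s, mu_back, prior_shape, prior_scale, prop_shape, rng):
    """One Metropolis-Hastings update of mu_mol with a Gamma proposal."""
    mu_new = rng.gamma(shape=prop_shape, scale=mu_old / prop_shape)
    args = (dts, s, mu_back, prior_shape, prior_scale)
    # Target ratio plus the Hastings correction for the asymmetric proposal.
    log_ratio = (log_target(mu_new, *args) - log_target(mu_old, *args)
                 + log_gamma_pdf(mu_old, prop_shape, mu_new / prop_shape)
                 - log_gamma_pdf(mu_new, prop_shape, mu_old / prop_shape))
    if rng.random() < math.exp(min(0.0, log_ratio)):
        return mu_new
    return mu_old
```

A larger `prop_shape` gives narrower proposals and a higher acceptance rate at the cost of slower exploration; the value would need tuning for a given data set.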
Contributor Information
Meysam Tavakoli, Department of Physics, Indiana University-Purdue University Indianapolis, IN 46202.
Sina Jazani, Center for Biological Physics, Department of Physics, Arizona State University, Tempe, AZ 85287.
Ioannis Sgouralis, Center for Biological Physics, Department of Physics, Arizona State University, Tempe, AZ 85287.
Omer M. Shafraz, Department of Biomedical Engineering, University of California, Davis, CA 95616
Sanjeevi Sivasankar, Department of Biomedical Engineering, University of California, Davis, CA 95616.
Bryan Donaphon, Biodesign Institute, Arizona State University, Tempe, AZ 85287.
Marcia Levitus, Center for Biological Physics, Department of Physics, Arizona State University, Tempe, AZ 85287; Biodesign Institute, Arizona State University, Tempe, AZ 85287 and School of Molecular Sciences, Arizona State University, Tempe, AZ 85287.
Steve Pressé, Center for Biological Physics, Department of Physics, Arizona State University, Tempe, AZ 85287 and School of Molecular Sciences, Arizona State University, Tempe, AZ 85287.
References
- [1] Labadi Luai Al and Zarepour Mahmoud. On approximations of the beta process in latent feature models: Point processes approach. Sankhya A, 80(1):59–79, 2018.
- [2] Andrec Michael, Levy Ronald M, and Talaga David S. Direct determination of kinetic rates from single-molecule photon arrival trajectories using hidden markov models. The Journal of Physical Chemistry A, 107(38):7454–7464, 2003.
- [3] Andrews JO, Conway W, Cho W-K, Narayanan A, Spille J-H, Jayanth N, Inoue T, Mullen S, Thaler J, and Cissé II. qSR: A quantitative super-resolution analysis tool reveals the cell-cycle dependent organization of RNA polymerase I in live human cells. Scientific Reports, 8(1):7424, 2018.
- [4] Antonik Matthew, Felekyan Suren, Gaiduk Alexander, and Seidel Claus AM. Separating structural heterogeneities from stochastic variations in fluorescence resonance energy transfer distributions via photon distribution analysis. The Journal of Physical Chemistry B, 110(13):6970–6978, 2006.
- [5] Axelrod Daniel, Koppel DE, Schlessinger J, Elson Ei, and Webb Watt W. Mobility measurement by analysis of fluorescence photobleaching recovery kinetics. Biophysical Journal, 16(9):1055–1069, 1976.
- [6] Barato Andre C and Seifert Udo. Cost and precision of brownian clocks. Physical Review X, 6(4):041053, 2016.
- [7] Benda A, Beneš M, Marecek V, Lhotskỳ A, Hermens W Th, and Hof M. How to determine diffusion coefficients in planar phospholipid systems by confocal fluorescence correlation spectroscopy. Langmuir, 19(10):4120–4126, 2003.
- [8] Berg Howard C. Random walks in biology. Princeton University Press, 1993.
- [9] Berglund Andrew J. Statistics of camera-based single-particle tracking. Physical Review E, 82(1):011917, 2010.
- [10] Best Robert B and Hummer Gerhard. Reaction coordinates and rates from transition paths. Proceedings of the National Academy of Sciences, 102(19):6732–6737, 2005.
- [11] Bishop Christopher M. Pattern recognition and machine learning. Springer, 2006.
- [12] Born Max and Wolf Emil. Principles of optics: electromagnetic theory of propagation, interference and diffraction of light. Elsevier, 2013.
- [13] Brakenhoff GJ, Visscher K, and Van der Voort HTM. Size and shape of the confocal spot: control and relation to 3D imaging and image processing. In Handbook of biological confocal microscopy, pages 87–91. Springer, 1990.
- [14] Briers Mark, Doucet Arnaud, and Maskell Simon. Smoothing algorithms for state–space models. Annals of the Institute of Statistical Mathematics, 62(1):61, 2010.
- [15] Bright Gary R, Fisher Gregory W, Rogowska Jadwiga, and Taylor D Lansing. Fluorescence ratio imaging microscopy. Methods in Cell Biology, 30:157–192, 1989.
- [16] Broderick Tamara, Jordan Michael I, Pitman Jim, et al. Beta processes, stick-breaking and power laws. Bayesian Analysis, 7(2):439–476, 2012.
- [17] Bronson Jonathan E, Fei Jingyi, Hofman Jake M, Gonzalez Ruben L Jr, and Wiggins Chris H. Learning rates and states from biophysical time series: a Bayesian approach to model selection and single-molecule FRET data. Biophysical Journal, 97(12):3196–3205, 2009.
- [18] Calderhead Ben. A general construction for parallelizing metropolis-hastings algorithms. Proceedings of the National Academy of Sciences, 111(49):17408–17413, 2014.
- [19] Calderon Christopher P and Bloom Kerry. Inferring latent states and refining force estimates via hierarchical dirichlet process modeling in single particle tracking experiments. PloS one, 10(9):e0137633, 2015.
- [20] Cappé Olivier, Moulines Eric, and Rydén Tobias. Inference in hidden markov models. In Proceedings of EUSFLAT Conference, pages 14–16, 2009.
- [21] Chechkin Aleksei V, Seno Flavio, Metzler Ralf, and Sokolov Igor M. Brownian yet non-gaussian diffusion: from superstatistics to subordination of diffusing diffusivities. Physical Review X, 7(2):021002, 2017.
- [22] Chen Yan, Müller Joachim D, So Peter TC, and Gratton Enrico. The photon counting histogram in fluorescence fluctuation spectroscopy. Biophysical Journal, 77(1):553–567, 1999.
- [23] Chib Siddhartha and Greenberg Edward. Understanding the metropolis-hastings algorithm. The American Statistician, 49(4):327–335, 1995.
- [24] Chiu Daniel T, Jeon Noo Li, Huang Sui, Kane Ravi S, Wargo Christopher J, Choi Insung S, Ingber Donald E, and Whitesides George M. Patterned deposition of cells and proteins onto surfaces by using three-dimensional microfluidic systems. Proceedings of the National Academy of Sciences, 97(6):2408–2413, 2000.
- [25] Cho Won-Ki, Spille Jan-Hendrik, Hecht Micca, Lee Choongman, Li Charles, Grube Valentin, and Cisse Ibrahim I. Mediator and RNA polymerase II clusters associate in transcription-dependent condensates. Science, 361(6400):412–415, 2018.
- [26] Chung Hoi Sung and Gopich Irina V. Fast single-molecule FRET spectroscopy: theory and experiment. Physical Chemistry Chemical Physics, 16(35):18644–18657, 2014.
- [27] Chung Hoi Sung, Meng Fanjie, Kim Jae-Yeol, McHale Kevin, Gopich Irina V, and Louis John M. Oligomerization of the tetramerization domain of p53 probed by two-and three-color single-molecule FRET. Proceedings of the National Academy of Sciences, page 201700357, 2017.
- [28] Dittrich Petra, Malvezzi-Campeggi Flaminia, Jahnz Michael, and Schwille Petra. Accessing molecular dynamics in cells by fluorescence correlation spectroscopy. Biological Chemistry, 382(3):491–494, 2001.
- [29] Elson Elliot L. Fluorescence correlation spectroscopy: past, present, future. Biophysical Journal, 101(12):2855–2870, 2011.
- [30] Elson Elliot L and Magde Douglas. Fluorescence correlation spectroscopy. I. conceptual basis and theory. Biopolymers, 13(1):1–27, 1974.
- [31] Enderlein Jörg and Ambrose W Patrick. Optical collection efficiency function in single-molecule detection experiments. Applied Optics, 36(22):5298–5302, 1997.
- [32] Enderlein Jorg, Gregor Ingo, Patra Digambara, and Fitter Jorg. Art and artefacts of fluorescence correlation spectroscopy. Current Pharmaceutical Biotechnology, 5(2):155–161, 2004.
- [33] Erban Radek and Chapman S Jonathan. Stochastic modelling of reaction–diffusion processes: algorithms for bimolecular reactions. Physical Biology, 6(4):046001, 2009.
- [34] Ferguson Thomas S. A Bayesian analysis of some nonparametric problems. The Annals of Statistics, pages 209–230, 1973.
- [35] Fitzpatrick James AJ and Lillemeier Björn F. Fluorescence correlation spectroscopy: linking molecular dynamics to biological function in vitro and in situ. Current Opinion in Structural Biology, 21(5):650–660, 2011.
- [36] Gahlmann Andreas and Moerner WE. Exploring bacterial cell biology with single-molecule tracking and super-resolution imaging. Nature Reviews Microbiology, 12(1):9, 2014.
- [37] Gelman Andrew, Carlin John B, Stern Hal S, Dunson David B, Vehtari Aki, and Rubin Donald B. Bayesian data analysis, volume 2. CRC press Boca Raton, FL, 2014.
- [38] Gibson Sarah Frisken and Lanni Frederick. Experimental test of an analytical model of aberration in an oil-immersion objective lens used in three-dimensional light microscopy. JOSA A, 9(1):154–166, 1992.
- [39] Gopich Irina V. Accuracy of maximum likelihood estimates of a two-state model in single-molecule FRET. The Journal of Chemical Physics, 142(3):034110, 2015.
- [40] Gopich Irina V and Szabo Attila. Single-molecule FRET with diffusion and conformational dynamics. The Journal of Physical Chemistry B, 111(44):12925–12932, 2007.
- [41] Gopich Irina V and Szabo Attila. Decoding the pattern of photon colors in single-molecule FRET. The Journal of Physical Chemistry B, 113(31):10965–10973, 2009.
- [42] Gopich Irina V and Szabo Attila. Theory of the energy transfer efficiency and fluorescence lifetime distribution in single-molecule FRET. Proceedings of the National Academy of Sciences, 109(20):7747–7752, 2012.
- [43] Ha Taekjip, Ting Alice Y, Liang Joy, Caldwell W Brett, Deniz Ashok A, Chemla Daniel S, Schultz Peter G, and Weiss Shimon. Single-molecule fluorescence spectroscopy of enzyme conformational dynamics and cleavage mechanism. Proceedings of the National Academy of Sciences, 96(3):893–898, 1999.
- [44] Haile JM, Johnston Ian, Mallinckrodt A John, McKay Susan, et al. Molecular dynamics simulation: elementary methods. Computers in Physics, 7(6):625–625, 1993.
- [45] Hajdziona Marta and Molski Andrzej. Maximum likelihood-based analysis of single-molecule photon arrival trajectories. The Journal of Chemical Physics, 134(5):054112, 2011.
- [46] He Jun, Guo Syuan-Ming, and Bathe Mark. Bayesian approach to the analysis of fluorescence correlation spectroscopy data i: theory. Analytical Chemistry, 84(9):3871–3879, 2012.
- [47] Higham Desmond J. An algorithmic introduction to numerical simulation of stochastic differential equations. SIAM Review, 43(3):525–546, 2001.
- [48] Hines Keegan E. A primer on Bayesian inference for biophysical systems. Biophysical Journal, 108(9):2103–2113, 2015.
- [49] Hines Keegan E, Bankston John R, and Aldrich Richard W. Analyzing single-molecule time series via nonparametric Bayesian inference. Biophysical Journal, 108(3):540–556, 2015.
- [50] Hjort Nils Lid, Holmes Chris, Müller Peter, and Walker Stephen G. Bayesian nonparametrics, volume 28. Cambridge University Press, 2010.
- [51] Huang Bo, Bates Mark, and Zhuang Xiaowei. Super-resolution fluorescence microscopy. Annual Review of Biochemistry, 78:993–1016, 2009.
- [52] Ibe Oliver C. Elements of Random Walk and Diffusion Processes. John Wiley & Sons, 2013.
- [53] Jazani Sina, Sgouralis Ioannis, and Pressé Steve. A method for single molecule tracking using a conventional single-focus confocal setup. The Journal of Chemical Physics, 150(11):114108, 2019.
- [54] Jazani Sina, Sgouralis Ioannis, Shafraz Omer M, Levitus Marcia, Sivasankar Sanjeevi, and Pressé Steve. An alternative framework for fluorescence correlation spectroscopy. Nature Communications, 10, 2019.
- [55] Johnson Margaret E and Hummer Gerhard. Free-propagator reweighting integrator for single-particle dynamics in reaction-diffusion models of heterogeneous protein-protein interaction systems. Physical Review X, 4(3):031037, 2014.
- [56] Kask Peet, Günther Rolf, and Axhausen Peter. Statistical accuracy in fluorescence fluctuation experiments. European Biophysics Journal, 25(3):163–169, 1997.
- [57] Krichevsky Oleg and Bonnet Grégoire. Fluorescence correlation spectroscopy: the technique and its applications. Reports on Progress in Physics, 65(2):251, 2002.
- [58] Lacasa Lucas, Mariño Inés P, Miguez Joaquin, Nicosia Vincenzo, Roldán Édgar, Lisica Ana, Grill Stephan W, and Gómez-Gardeñes Jesús. Multiplex decomposition of non-markovian dynamics and the hidden layer reconstruction problem. Physical Review X, 8(3):031038, 2018.
- [59] Lakowicz Joseph R. Principles of fluorescence spectroscopy. Springer, 2006.
- [60] Lee Antony, Tsekouras Konstantinos, Calderon Christopher, Bustamante Carlos, and Pressé Steve. Unraveling the thousand word picture: An introduction to super-resolution data analysis. Chemical Reviews, 2017.
- [61] Lee Sang-Hyuk, Shin Jae Yen, Lee Antony, and Bustamante Carlos. Counting single photoactivatable fluorescent molecules by photoactivated localization microscopy (PALM). Proceedings of the National Academy of Sciences, 109(43):17436–17441, 2012.
- [62] Lerner Eitan, Ingargiola Antonino, and Weiss Shimon. Characterizing highly dynamic conformational states: the transcription bubble in RNAP-promoter open complex as an example. The Journal of Chemical Physics, 148(12):123315, 2018.
- [63] Li Hui, Yen Chi-Fu, and Sivasankar Sanjeevi. Fluorescence axial localization with nanometer accuracy and precision. Nano Letters, 12(7):3731–3735, 2012.
- [64] Lin Lawrence C-L and Brown Frank LH. Brownian dynamics in fourier space: membrane simulations over long length and time scales. Physical Review Letters, 93(25):256001, 2004.
- [65] Liu Hongchao, Jiu Bo, Liu Hongwei, and Bao Zheng. Superresolution isar imaging based on sparse bayesian learning. IEEE Transactions on Geoscience and Remote Sensing, 52(8):5005–5013, 2014.
- [66] Liu Huan and Motoda Hiroshi. Computational methods of feature selection. CRC Press, 2007.
- [67] Liu Zhe, Lavis Luke D, and Betzig Eric. Imaging live-cell dynamics and structure at the single-molecule level. Molecular cell, 58(4):644–659, 2015.
- [68] Macháň Radek and Hof Martin. Lipid diffusion in planar membranes investigated by fluorescence correlation spectroscopy. Biochimica et Biophysica Acta (BBA)-Biomembranes, 1798(7):1377–1391, 2010.
- [69] Magde Douglas, Elson Elliot L, and Webb Watt W. Fluorescence correlation spectroscopy. II. an experimental realization. Biopolymers, 13(1):29–61, 1974.
- [70] Magidson Valentin and Khodjakov Alexey. Circumventing photodamage in live-cell microscopy. In Methods in cell biology, volume 114, pages 545–560. Elsevier, 2013.
- [71] Meyn Sean P and Tweedie Richard L. Stability of markovian processes II: Continuous-time processes and sampled chains. Advances in Applied Probability, 25(3):487–517, 1993.
- [72] Meyn Sean P and Tweedie Richard L. Stability of markovian processes III: Foster–lyapunov criteria for continuous-time processes. Advances in Applied Probability, 25(3):518–548, 1993.
- [73] Michalet X, Siegmund OHW, Vallerga JV, Jelinsky P, Millaud JE, and Weiss S. Detectors for single-molecule fluorescence imaging and spectroscopy. Journal of Modern Optics, 54(2–3):239–281, 2007.
- [74] Narayanan Arjun, Meriin Anatoli B, Sherman Michael Y, and Cisse Ibrahim I. A first order phase transition underlies the formation of sub-diffractive protein aggregates in mammalian cells. bioRxiv, page 148395, 2017.
- [75] Nir Eyal, Michalet Xavier, Hamadani Kambiz M, Laurence Ted A, Neuhauser Daniel, Kovchegov Yevgeniy, and Weiss Shimon. Shot-noise limited single-molecule fret histograms: comparison between theory and experiments. The Journal of Physical Chemistry B, 110(44):22103–22124, 2006.
- [76] Okamoto Kenji and Sako Yasushi. Variational bayes analysis of a photon-based hidden markov model for single-molecule FRET trajectories. Biophysical Journal, 103(6):1315–1324, 2012.
- [77] Paisley John and Carin Lawrence. Nonparametric factor analysis with beta process priors. In Proceedings of the 26th Annual International Conference on Machine Learning, pages 777–784. ACM, 2009.
- [78] Paisley John and Jordan Michael I. A constructive definition of the beta process. arXiv preprint arXiv:1604.00685, 2016.
- [79] Palla Konstantina, Knowles David A, and Ghahramani Zoubin. A reversible infinite HMM using normalised random measures. In International Conference on Machine Learning, 2014.
- [80] Phair Robert D and Misteli Tom. Kinetic modelling approaches to in vivo imaging. Nature Reviews Molecular Cell Biology, 2(12):898, 2001.
- [81] Phillips Rob, Theriot Julie, Kondev Jane, and Garcia Hernan. Physical biology of the cell. Garland Science, 2012.
- [82] Pirchi Menahem, Tsukanov Roman, Khamis Rashid, Tomov Toma E, Berger Yaron, Khara Dinesh C, Volkov Hadas, Haran Gilad, and Nir Eyal. Photon-by-photon hidden markov model analysis for microsecond single-molecule FRET kinetics. The Journal of Physical Chemistry B, 120(51):13065–13075, 2016.
- [83] Purschke Martin, Rubio Noemi, Held Kathryn D, and Redmond Robert W. Phototoxicity of hoechst 33342 in time-lapse fluorescence microscopy. Photochemical & Photobiological Sciences, 9(12):1634–1639, 2010.
- [84] Remaut Katrien, Lucas Bart, Braeckmans Kevin, Sanders NN, Smedt SC De, and Demeester Jo. FRET-FCS as a tool to evaluate the stability of oligonucleotide drugs after intracellular delivery. Journal of Controlled Release, 103(1):259–271, 2005.
- [85] Rigler Rudolf and Elson Elliot S. Fluorescence correlation spectroscopy: theory and applications, volume 65. Springer Science & Business Media, 2012.
- [86] Robert Christian and Casella George. Introducing Monte Carlo Methods with R. Springer Science & Business Media, 2009.
- [87] Saffarian Saveez and Elson Elliot L. Statistical analysis of fluorescence correlation spectroscopy: the standard deviation and bias. Biophysical Journal, 84(3):2030–2042, 2003.
- [88] Scherfeld Dag, Kahya Nicoletta, and Schwille Petra. Lipid dynamics and domain formation in model membranes composed of ternary mixtures of unsaturated and saturated phosphatidylcholines and cholesterol. Biophysical Journal, 85(6):3758–3768, 2003.
- [89] Schmidt Patrick D, Reichert Benjamin H, Lajoie John G, and Sivasankar Sanjeevi. Method for high frequency tracking and sub-nm sample stabilization in single molecule fluorescence microscopy. Scientific Reports, 8(1):13912, 2018.
- [90] Schuler Benjamin. Perspective: Chain dynamics of unfolded and intrinsically disordered proteins from nanosecond fluorescence correlation spectroscopy combined with single-molecule FRET. The Journal of Chemical Physics, 149(1):010901, 2018.
- [91] Schwille Petra and Haustein Elke. Fluorescence correlation spectroscopy: an introduction to its concepts and applications. Biophysics textbook online, 1(3), 2001.
- [92] Schwille Petra, Meyer-Almes Franz-Josef, and Rigler Rudolf. Dual-color fluorescence cross-correlation spectroscopy for multicomponent diffusional analysis in solution. Biophysical Journal, 72(4):1878–1886, 1997.
- [93] Schwille Petra, Haupts Ulrich, Maiti Sudipta, and Webb Watt W. Molecular dynamics in living cells observed by fluorescence correlation spectroscopy with one-and two-photon excitation. Biophysical Journal, 77(4):2251–2265, 1999.
- [94] Scott Steven L. Bayesian methods for hidden markov models: Recursive computing in the 21st century. Journal of the American Statistical Association, 97(457):337–351, 2002.
- [95] Sgouralis Ioannis and Pressé Steve. ICON: an adaptation of infinite hmms for time traces with drift. Biophysical Journal, 112(10):2117–2126, 2017.
- [96] Sgouralis Ioannis and Pressé Steve. An introduction to infinite HMMs for single-molecule data analysis. Biophysical Journal, 112(10):2021–2029, 2017.
- [97] Sgouralis Ioannis, Madaan Shreya, Djutanta Franky, Kha Rachael, Hariadi Rizal F, and Pressé Steve. A bayesian nonparametric approach to single molecule forster resonance energy transfer. The Journal of Physical Chemistry B, 123(3):675–688, 2018.
- [98] Sgouralis Ioannis, Whitmore Miles, Lapidus Lisa, Comstock Matthew J, and Pressé Steve. Single molecule force spectroscopy at high data acquisition: A bayesian nonparametric analysis. The Journal of Chemical Physics, 148(12):123320, 2018.
- [99] Stricker Jesse, Maddox Paul, Salmon ED, and Erickson Harold P. Rapid assembly dynamics of the escherichia coli FtsZ-ring demonstrated by fluorescence recovery after photobleaching. Proceedings of the National Academy of Sciences, 99(5):3171–3175, 2002.
- [100] Tavakoli Meysam, Taylor J Nicholas, Li Chun-Biu, Komatsuzaki Tamiki, and Pressé Steve. Single molecule data analysis: an introduction. arXiv preprint arXiv:1606.00403, 2016.
- [101] Tavakoli Meysam, Tsekouras Konstantinos, Day Richard, Dunn Kenneth W, and Pressé Steve. Quantitative kinetic models from intravital microscopy: A case study using hepatic transport. The Journal of Physical Chemistry B, 2019.
- [102] Teh Yee W, Jordan Michael I, Beal Matthew J, and Blei David M. Sharing clusters among related groups: Hierarchical dirichlet processes. In Advances in neural information processing systems, pages 1385–1392, 2005.
- [103] Tinevez Jean-Yves, Dragavon Joe, Baba-Aissa Lamya, Roux Pascal, Perret Emmanuelle, Canivet Astrid, Galy Vincent, and Shorte Spencer. A quantitative method for measuring phototoxicity of a live cell imaging microscope. In Methods in enzymology, volume 506, pages 291–309. Elsevier, 2012.
- [104] Torres Tedman and Levitus Marcia. Measuring conformational dynamics: a new FCS-FRET approach. The Journal of Physical Chemistry B, 111(25):7392–7400, 2007.
- [105] Tsekouras Konstantinos, Siegel Amanda P, Day Richard N, and Pressé Steve. Inferring diffusion dynamics from FCS in heterogeneous nuclear environments. Biophysical Journal, 109(1):7–17, 2015.
- [106] Uphoff Stephan, Gryte Kristofer, Evans Geraint, and Kapanidis Achillefs N. Improved temporal resolution and linked hidden markov modeling for switchable single-molecule FRET. ChemPhysChem, 12(3):571–579, 2011.
- [107] Von Toussaint Udo. Bayesian inference in physics. Reviews of Modern Physics, 83(3):943, 2011.
- [108] Waligórska Marta and Molski Andrzej. Maximum likelihood-based analysis of photon arrival trajectories in single-molecule FRET. Chemical Physics, 403:52–58, 2012.
- [109] Weiss Shimon. Fluorescence spectroscopy of single biomolecules. Science, 283(5408):1676–1683, 1999.
- [110] Wilson Edwin B and Hilferty Margaret M. The distribution of chi-square. Proceedings of the National Academy of Sciences, 17(12):684–688, 1931.
- [111] Wohland Thorsten, Rigler Rudolf, and Vogel Horst. The standard deviation in fluorescence correlation spectroscopy. Biophysical Journal, 80(6):2987–2999, 2001.
- [112] Wu Guanghua, Ji Haifeng, Hansen Karolyn, Thundat Thomas, Datar Ram, Cote Richard, Hagan Michael F, Chakraborty Arup K, and Majumdar Arunava. Origin of nanomechanical cantilever motion generated from biomolecular interactions. Proceedings of the National Academy of Sciences, 98(4):1560–1564, 2001.
- [113] Wu Shiwei, Han Gang, Milliron Delia J, Aloni Shaul, Altoe Virginia, Talapin Dmitri V, Cohen Bruce E, and Schuck P James. Non-blinking and photostable upconverted luminescence from single lanthanide-doped nanocrystals. Proceedings of the National Academy of Sciences, 106(27):10917–10921, 2009.
- [114] Zhang Bo, Zerubia Josiane, and Olivo-Marin Jean-Christophe. Gaussian approximations of fluorescence microscope point-spread function models. Applied Optics, 46(10):1819–1829, 2007.