Abstract
Single molecule localization microscopy (SMLM) has enormous potential for resolving subcellular structures below the diffraction limit of light microscopy: Localization precision in the low single-digit nanometer regime has been shown to be achievable. In order to record localization microscopy data, however, sample fixation is inevitable to prevent molecular motion during the rather long recording times of minutes up to hours. Eventually, preservation of the sample’s ultrastructure during fixation becomes the limiting factor. We propose here a workflow for data analysis, which is based on SMLM performed at cryogenic temperatures. Since molecular dipoles of the fluorophores are fixed at low temperatures, such an approach offers the possibility to use the orientation of the dipole as additional information for image analysis. In particular, assignment of localizations to individual dye molecules becomes possible with high reliability. We quantitatively characterized the new approach based on the analysis of simulated oligomeric structures. Side lengths can be determined with a relative error of less than 1% for tetramers with a nominal side length of 5 nm, even if the assumed localization precision for single molecules is more than 2 nm.
Introduction
In the last decade, super-resolution microscopy techniques have paved the way for resolving cellular structures in unprecedented detail [1]. The assembly of biomolecules at the nanoscale plays a crucial role in their functionality and hence, is key to our understanding of cellular processes. In particular, the technique of single molecule localization microscopy (SMLM) appears well suited for structural biology, as it is based on localization coordinates of individual molecules rather than on pixelated images of recorded fluorescence intensities. In SMLM, dye molecules are linked to the biomolecule of interest and imaged under conditions, where only a small subset of dye molecules is visible at any time-point. From the movies containing thousands of images of the very same region, one can determine the positions of these dye molecules very accurately down to a precision of a few nanometers [2], which allows for establishing localization maps.
The increased spatial resolution of SMLM, however, comes at the cost of temporal resolution, as image acquisition takes several minutes up to hours. Thorough sample fixation is thus a crucial prerequisite for high resolution SMLM recordings, as any residual diffusion of molecules [3] will lead to distortions of the obtained localization maps. Since such residual motion is likely uncorrelated within the sample, it cannot be corrected by standard drift correction methods. Importantly, the chosen fixation method further needs to conserve the ultrastructure of the sample under investigation, which is typically not the case using chemical fixatives [4]. Novel cryo-fixation approaches [5] combined with SMLM at cryogenic temperatures (cryo-SMLM) [6–8] promise to resolve both points, thereby opening up SMLM to questions from structural biology.
Two aspects of SMLM, however, hamper the direct ultrastructural interpretation of localization maps: On the one hand, insufficient labeling and/or detection efficiency leads to undercounting; on the other hand, multiple detections of individual molecules result in overcounting [9]. Therefore, some parts of a particular biomolecular structure may not be visible at all, while other parts may be heavily overrepresented.
In principle, particle averaging approaches allow for circumventing the issue of statistical distortions in SMLM. Similarly to single particle reconstruction methods used in cryo-electron microscopy (cryo-EM), hundreds to thousands of identical copies of the same particle are imaged, and subsequently combined to yield an averaged super-particle [10, 11]. In case of unknown structures, template-free registration methods have to be employed. Two possible approaches are pyramid registration, where particles are registered pairwise in consecutive steps [12], or all-to-all registration, where all particles are registered to all others simultaneously [13]. Any knowledge of particle symmetry may be included in the registration process in order to increase the quality of the reconstruction [13]. To improve the registration process under realistic imaging conditions, the Bhattacharyya distance makes it possible to account for missing labels, different numbers of localizations of individual molecules and anisotropic localization uncertainty [12, 13]. In addition, for accurate reconstruction of semi-flexible structures, Shi et al. recently suggested an approach for deformed alignment [14]. Note that up to now these approaches were successfully applied only in case of rather large structures with sizes of tens of nanometers [14], or imaging conditions yielding tens to hundreds of localizations per label site [13]. Quite often, however, the cell-biological context of an experiment is in conflict with these requirements, in many cases impeding particle reconstruction. In such cases, template-based registration methods may recover superresolution analysis, or provide superior results. In principle, template-based registration allows the point sets acquired from each particle to be registered onto the template. In a pioneering study, Szymborska et al. [15] used a circular template to study the arrangement of molecules in the nuclear pore complex (NPC), allowing the determination of its radius with a precision of 0.1 nm. More elaborate analysis employing the eight-fold symmetry of the NPC made it possible to analyze the single-molecule labeling efficiency [16] or reconstruct a more detailed view of the NPC structure [12].
As an alternative to coordinate-based registration, reconstruction can be performed based on algorithms developed for cryo-EM data. In this case, the obtained localization maps first need to be converted to localization images (e.g. based on localization densities or localization uncertainties), since EM-algorithms expect continuous intensity distributions instead of a list of coordinates. Using this approach combined with imaging at cryogenic temperatures, Weisenburger et al. reported a resolution on the Ångström scale for imaging of the GtCitA Pasc domain dimer and the streptavidin homotetramer [8]. Of note, imaging modalities of EM and SMLM differ quite substantially and EM-algorithms might not fully account for SMLM specifics.
Currently, however, it is difficult to assign localizations to specific dye molecules. We propose here a new approach for the analysis of oligomeric protein complexes, which is tailored to the conditions of cryo-SMLM. Measuring at cryogenic temperatures has two key advantages, which shall be exploited here: first, it ensures supreme fixation and conservation of the sample’s ultrastructure [5]; second, also rotational diffusion of the fluorophores’ excitation and emission dipoles during illumination is prevented at least over time-scales of hours [8]. The second aspect allows for establishing a unique characteristic for each dye molecule, based on the orientation of its dipole moment at the time point of freezing. In this paper, we propose to infer this characteristic from imaging sequences, in which samples are alternately excited with linearly polarized light with polarization vectors rotated by 90° (Fig 1A). Thereby, assignment of localizations to individual molecules becomes possible, which substantially enhances fitting results. We showcase the performance of the approach by determining the size of regular oligomeric structures, based on the analysis of thousands of simulated oligomers.
Results
In this manuscript, we consider the analysis of oligomeric protein structures consisting of n protomers, which can be represented by regular polygons consisting of n corners. If not mentioned otherwise, we consider tetramers with a side length of 5 nm. Each protomer shall be labeled by exactly one dye molecule. This can be achieved experimentally, e.g. by using tags or unnatural amino acids as labels [17, 18]. The aim of our study is to develop a template-based analysis approach to determine the distances between individual protomers, by making use of the correct assignment of localizations to individual protomers. The latter shall be enabled by exploiting the linear dichroism observable in the single molecule brightness, when fixed dipoles are recorded with linearly polarized light.
At cryogenic temperatures, the dipole orientation of a fluorophore is fixed. When exciting such a fluorophore with linearly polarized light, the absorption probability depends on the scalar product between the fluorophore’s dipole orientation and the polarization vector of the excitation light (see Eqs (2) and (3)). Exciting the fluorophore consecutively with light of orthogonal polarization directions parallel to the x- and y-axis, respectively (Fig 1A), yields characteristic brightness changes depending on the fluorophore’s dipole orientation. Note that the orientation of the x, y-coordinate system in the image plane can be arbitrary. Here and in the following, we used an analytical representation of the number of localizations per molecule m, which closely reflects experimental data (see Methods/Simulations); for convenience, we used here data recorded at room temperature. For the single molecule brightness we considered a maximum number of photons per single molecule signal, Nmax, as it would be recorded if the dye’s dipole moment was aligned with the polarization of the excitation light. The actual signals, as they would be recorded for arbitrary dipole angles, were calculated according to Eqs (2) and (3), and were subjected to photon shot noise.
Heydarian and colleagues published a template-free approach to analyze the underlying structure of an unknown oligomer based on the obtained localization maps [13]. For high single molecule localization precision characterized by Nmax = 10⁵ photons, the method indeed yields satisfactory results and clearly reveals the tetrameric arrangement of the individual dye molecules (see S1C Fig in S1 File). With decreasing photon numbers and increasing localization error, however, localization maps become more difficult to analyze; eventually at Nmax = 10⁴ photons per dye molecule, no substructures can be identified. In our manuscript we propose to additionally include information about the assignment of localizations to the individual dye molecules, which becomes available when performing the experiment at cryogenic temperature. As we will show in the following, this assignment not only makes it possible to tackle challenging imaging conditions at low photon numbers, it also yields highly precise estimates of the oligomer size.
In Fig 1B we plot the signal intensities Nx, Ny for the two polarization directions for four exemplary fluorophore dipole orientations, as they could occur for a fully labelled tetramer. In this case, discrimination of the four dye molecules is straightforward, and we can group all localizations that belong to each single molecule (indicated by color in Fig 1B and 1C) (see Methods section Assignment of Blinks to Specific Molecules). In principle, brightness values can cover the whole region (S2 Fig in S1 File) confined by Nx > 0, Ny > 0 and Nx + Ny < Nmax, with a slight dip in the center of the region. We only accepted sufficiently bright signals with Nx + Ny ≥ Nmin, which would yield a localization error below a user-defined threshold Δx (see Methods section Simulations for the relation between Δx and Nmin, and S2B Fig in S1 File). Note that the point clouds corresponding to each dye molecule can be elliptically distorted due to differences in the Poisson noise along the x- and y-axis (see the red point cloud in Fig 1B).
For convenience, we assumed throughout our manuscript the following procedure for determining the localization of single molecule signals: The two recordings corresponding to the two polarization channels are added up, irrespective of the signal intensities in the two channels, and the localizations are determined on the sum image. Considering the situation of fixed dipole moments, a substantial fraction of molecules will show dipoles characterized by an elevation angle close to the optical axis. Such molecules will produce rather faint signals, which yield large localization errors. In consequence, a rather broad distribution of localization errors can be expected. Of note, we assumed here subsequent illumination of the sample with different polarizations but detection of the two corresponding images on the very same region of the camera chip; hence, no image registration problems occur.
In Fig 1C and 1D we show the obtained localization map of the exemplary tetramer, both with (C) and without (D) localization assignment. Apparently, without localization assignment there is no realistic chance to identify any structural organization of the oligomer. To facilitate the analysis, one may include prior knowledge e.g. by assuming the oligomer to be represented by a regular polygonal structure. In this case, all corners of the simulated tetramer would lie on the perimeter of the circumscribed circle. However, even under this assumption, the circular fit does not yield satisfactory results (dashed line in Fig 1D); in this particular case, the size of the tetramer is substantially overestimated. Localization assignment substantially improves the situation (C). In this case, all localizations assigned to single dye molecules can be averaged (indicated by colored circles in Fig 1C). Taking these averaged positions as input for the fit yields the circle indicated by the dashed line, which is fairly close to the ground truth (dotted line). Importantly, a circle fit shows an inherent bias towards larger sizes [19] (see Methods Eqs (12) & (34)). This is intuitively plausible, as on average more data points lie outside the circle and hence contribute with a higher statistical weight. Correcting for the bias with Eq (39) yields an improved fit result that is shown by the solid line in Fig 1C.
In the following, we provide a quantitative evaluation of the proposed method; specifically, we assess the estimation of oligomer side length from a large number of recorded identical oligomers. We assume here that the oligomers shall be sufficiently separated from each other, so that a standard 2D clustering algorithm can be applied in order to group localizations belonging to individual oligomers. Such clustering algorithms can be found e.g. in refs. [20, 21]. As first step, we group the localizations of each oligomer based on the obtained intensities Nx and Ny. As an eligibility criterion, all oligomers which yield n distinct groups of localizations are taken for further analysis, all others are neglected. This criterion particularly rejects scenarios, where two or more groups of localizations overlap and hence would be interpreted as one spurious position at the weighted average of the detected localizations. If not mentioned otherwise, we assumed full labelling of all protomers. In Fig 2A we analyzed the assignment process (gray) and the eligibility criterion (black) for different single molecule brightness levels Nmax. With increasing brightness, we observed an increasing percentage of oligomers for which all localizations were assigned correctly to the individual protomers. The reason for this is the reduced spread of the brightness clusters in the Nx—Ny representation, which improves the performance of the applied clustering algorithm. Along a similar line, also the fraction of eligible oligomers increases with Nmax, partly due to improved assignment, partly due to the reduced influence of the detection threshold (S2 Fig in S1 File). We also analyzed the fraction of eligible oligomers which contained incorrectly assigned localizations, yielding negligible contributions (dashed line in Fig 2A).
Secondly, for each group of localizations we calculate the mean position; these mean positions are used to fit the circle that minimizes Eq (10), with the fits performed for all oligomers separately. From the fit results we determine the corrected radii using Eq (39) in order to calculate the n-mer side lengths via l = 2R sin(π/n), where R denotes the corrected radius. Exemplary fitting results for a simulated data set of 500000 tetramers with a side length of 5 nm are shown in Fig 2B. As this distribution is slightly positively skewed, it seems reasonable to consider the median of the obtained histogram as an estimator of the underlying tetramer side length. Indeed, for this particular case, the median (blue line) outperforms the mean (red line).
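The fitting step described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: it uses an algebraic (Kåsa-type) least-squares circle fit as a stand-in for the minimization of Eq (10), it omits the bias correction of Eq (39), and the function names `fit_circle` and `side_length` are hypothetical. The conversion from radius to side length uses the geometry of a regular n-gon, l = 2R sin(π/n).

```python
import math

def fit_circle(points):
    """Algebraic least-squares circle fit (stand-in, not the paper's Eq (10)).

    points: list of (x, y) mean positions, one per dye molecule.
    Returns (x_center, y_center, radius).
    """
    n = len(points)
    # Work in centroid coordinates, which decouples the normal equations.
    mx = sum(p[0] for p in points) / n
    my = sum(p[1] for p in points) / n
    u = [p[0] - mx for p in points]
    v = [p[1] - my for p in points]
    suu = sum(a * a for a in u)
    svv = sum(b * b for b in v)
    suv = sum(a * b for a, b in zip(u, v))
    rhs_u = 0.5 * sum(a ** 3 + a * b * b for a, b in zip(u, v))
    rhs_v = 0.5 * sum(b ** 3 + b * a * a for a, b in zip(u, v))
    det = suu * svv - suv * suv
    if abs(det) < 1e-12:
        raise ValueError("points are collinear or too few for a circle fit")
    # Solve the 2x2 linear system for the circle center (Cramer's rule).
    uc = (rhs_u * svv - rhs_v * suv) / det
    vc = (suu * rhs_v - suv * rhs_u) / det
    radius = math.sqrt(uc * uc + vc * vc + (suu + svv) / n)
    return mx + uc, my + vc, radius

def side_length(radius, n):
    """Side length of a regular n-gon with circumradius `radius`."""
    return 2.0 * radius * math.sin(math.pi / n)
```

For noise-free corner positions of a tetramer with 5 nm side length, this sketch recovers the circumradius 5/√2 nm and hence the nominal side length; with noisy input, the bias discussed in the text would still have to be corrected separately.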
We next estimated the influence of the number of simulated oligomers available for the analysis. The results for maximum photon numbers of both Nmax = 10⁴ and Nmax = 10⁵ photons are shown in S3A Fig in S1 File. Here and in subsequent plots, we quantified errors by calculating
$$\varepsilon_l = \frac{\hat{l} - l}{l} \qquad (1)$$
where $\hat{l}$ and l denote the determined and nominal side length, respectively. The side length estimation gives rather robust values, which are independent of the number of analyzed oligomers (S3A Fig in S1 File). A marginal bias towards too large or too small values was observed for Nmax = 10⁴ and Nmax = 10⁵ photons, respectively. As expected, the standard error of the median decreases with increasing number of oligomers (S3B Fig in S1 File). For all subsequent plots we used a total number of 500000 simulated oligomers for the analysis. In this plot, we further compared results obtained from taking the median or the mean values of the individual data sets; comparison shows a much better performance of the median, which was hence taken in all subsequent analyses.
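As a small illustration of the error measure and of the two estimators compared here, the following Python sketch computes the signed relative error of the median and of the mean of a list of fitted side lengths. It assumes that the relative error is defined as (determined − nominal)/nominal, so that overestimation is positive; the function names and the toy values are hypothetical and not taken from the simulations.

```python
import statistics

def relative_error(estimated, nominal):
    """Signed relative error; positive values mean overestimation (assumed convention)."""
    return (estimated - nominal) / nominal

def summarize(side_lengths, nominal):
    """Return the relative errors of the median and of the mean estimator."""
    median_estimate = statistics.median(side_lengths)
    mean_estimate = statistics.fmean(side_lengths)
    return (relative_error(median_estimate, nominal),
            relative_error(mean_estimate, nominal))

# Toy, positively skewed sample of fitted side lengths in nm (illustrative only).
toy_side_lengths = [4.8, 4.9, 5.0, 5.1, 6.2]
eps_median, eps_mean = summarize(toy_side_lengths, nominal=5.0)
```

In this toy sample the single large value pulls the mean upwards while the median stays at the nominal value, which mirrors the qualitative argument for preferring the median on a positively skewed distribution.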
We further analyzed the dependence of εl on the obtained photon numbers by varying the maximal number of photons Nmax from 10⁴ to 10⁵ (Fig 3A). Note that in this figure, we show a symmetric logarithmic plot, which shows positive and negative relative errors on the positive and negative y-axis, respectively. Relative errors |εl| < 10⁻³ are shown on a linear scale. The median generally gives very precise results with relative errors below 5 ‰, corresponding to 0.025 nm. For large photon numbers, Nmax ≥ 2 ⋅ 10⁴, the side length is slightly underestimated (indicated by red color). Again, the mean estimator performs less well (full symbols). Of note, the average localization error for single molecules Δx would yield 2.30 and 0.78 nm for Nmax = 10⁴ and Nmax = 10⁵, respectively.
Up to now, we did not consider background noise for the analysis. A real life experiment, however, inevitably contains contributions from camera noise and sample background noise. The main consequences of including noise in the analysis are increased localization errors. We investigated the influence of background noise on the side length estimation by increasing its standard deviation up to b = 300 photon counts, which would be an exceptionally high value for cellular background (Fig 3B and S4 Fig in S1 File). Background noise mainly impacted the results for low photon numbers, where its relative contribution is higher. For high photon numbers, background only had a slight effect on the results.
An important issue with any fluorescence labeling technique is labeling efficiency, leaving some of the protomers within an oligomeric structure undetectable. Experimentally, this may be due to incomplete maturation of fluorescent proteins, prebleaching of dye molecules, or incomplete conjugation of the dye to the protomer. Generally, incomplete labeling compromises registration methods. However, in cases where the template is known and assignment of localizations to individual protomers is possible, one may filter the data and use only oligomers with the correct number of dyes n for analysis. To assess the effects of incomplete labeling on our method, we varied the effective labeling efficiency and quantified the eligibility of oligomers. As expected, reduced labeling efficiency massively reduces the number of eligible oligomers (Fig 4A). Importantly, however, the labeling efficiency does not have a large influence on the side length estimation (Fig 4B); only the standard error of the median increases with decreasing labeling efficiency due to the reduced number of eligible oligomers (S5 Fig in S1 File).
We next were interested in the performance of our method for extremely small oligomers. When varying the side length between 10 and 1 nm, we observed that relative errors εl were negligible for side lengths l ≥ 2 nm and l ≥ 5 nm for Nmax = 10⁵ and 10⁴ photons, respectively (Fig 5). At shorter side lengths, however, errors increased strongly, yielding an overestimation of the oligomer size by up to a factor of 2, likely reflecting increasingly unstable fit results in case of high single molecule localization errors.
Further, we investigated the performance of our method for tri-, tetra-, penta- and hexamers, i.e. oligomers consisting of n = 3, 4, 5, 6 protomers (Fig 6). All oligomers were simulated as regular polygons. For this, we set the radius of each oligomer type to the fixed value of 4 nm. This leads to different side lengths for each oligomer type. The resulting relative error εl of the fitting procedure is shown in Fig 6B. For both Nmax = 10⁴ photons and Nmax = 10⁵ photons, we observed improved performance with increasing degree of oligomerization. The main reason for this is an increased number of localizations for higher n. The number of eligible oligomers is somewhat reduced for increasing n due to increased ambiguities in the localization assignments, and a higher likelihood for missing one of the corners (S6 Fig in S1 File). Importantly, for virtually all simulations we observed very small errors εl ≪ 10⁻².
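To make the statement about different side lengths concrete, the following short Python sketch evaluates the side length of a regular n-gon with a circumradius of 4 nm for n = 3 to 6, using the geometric relation l = 2R sin(π/n). The variable and function names are illustrative only.

```python
import math

RADIUS_NM = 4.0  # fixed circumradius used for all oligomer types

def side_length_from_radius(radius, n):
    """Side length of a regular n-gon with the given circumradius."""
    return 2.0 * radius * math.sin(math.pi / n)

# Side length in nm for trimers, tetramers, pentamers and hexamers.
side_lengths_nm = {n: side_length_from_radius(RADIUS_NM, n) for n in (3, 4, 5, 6)}

for n, length in side_lengths_nm.items():
    print(f"n = {n}: side length = {length:.3f} nm")
```

The side length thus shrinks from about 6.93 nm for the trimer to 4 nm for the hexamer, so higher-order oligomers in this comparison also have more closely spaced corners.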
In a realistic scenario, it may be difficult to ensure coplanarity between the plane of focus and the plane of orientation of the oligomeric structure. We were hence interested to what extent a tilt of tetramers out of the focal plane influences the results. Fig 7 shows that up to 10 degrees tilt the relative errors stay below 1%. Surprisingly, even massive tilts of 40 degrees only lead to a 10% underestimation of the obtained tetramer size.
Finally, we were interested in the performance of our method with respect to runtime (S7 Fig in S1 File). For this, we compared the analysis of different numbers of tetramers on a standard personal computer (see Methods). As input we used localization maps, which were already assigned to individual oligomers. Analysis of 500000 tetramers, as used throughout this manuscript, takes approximately three minutes. As expected, the runtime scales linearly with the number of tetramers, which can become a massive advantage for the analysis of large data-sets compared to template-free methods.
Discussion
In this manuscript, we describe a workflow for the quantitative analysis of regular oligomeric structures based on single molecule localization microscopy data, that were obtained with polarization-sensitive cryo-fluorescence microscopy. Performing experiments at cryogenic temperatures has a strong advantage over room-temperature measurements, as it solves the fixation problem. Standard fixation methods using chemical fixatives often do not preserve the ultrastructure of the sample [4], and even show residual mobility of biomolecules [3]. The problem becomes massive when SMLM shall be applied to questions from structural biology, where structure sizes down to a few nanometers shall be resolved. In contrast, cryo-fixation is considered as the gold standard and hardly affects the ultrastructure of the sample even below nanometer length scales [5].
Measuring at cryogenic temperatures further offers the possibility to exploit polarization effects due to the fixation of the fluorophore’s transition dipole. This allows to assign localizations to the individual dye molecules via their characteristic brightness upon excitation with differently polarized light. On top of that, also differences in the local environment of each fluorophore may additionally accentuate the recorded brightness values, thereby further improving discrimination. In principle, this enables the identification of partially labeled oligomers, which hence can be rejected from the analysis. A further advantage of cryogenic measurements is reduced photobleaching kinetics. In practice, one may hence expect even more precise estimates of oligomer sizes due to a higher number of localizations recorded per molecule.
A few requirements need to be fulfilled in order to fully capitalize on the strength of the method:
The fluorophores should be located in the focal plane. Due to the fixed orientation of the dipoles, the corresponding PSF will generally be tilted against the optical axis. Even slight defocusing may hence substantially displace the obtained localizations from the true fluorophore position [22]. Azimuthal filtering [23] or polarization-resolved imaging [24] have been described as solutions to obviate this effect.
The number of protomers per oligomer should be known. This requirement ensures that only correct, i.e. fully labeled, oligomers are taken for the analysis. While for a substantial number of proteins the degree of oligomerization is known, many interesting cases lack information on the oligomerization. In principle, this information can be extracted from the SMLM data by taking the maximum of the number of localization clusters per oligomer.
The labeling efficiency should be close to one. The higher the labeling efficiency, the larger the fraction of eligible oligomers (see Fig 4A). In consequence, fewer experiments would be required to achieve the same quality of the results. Importantly, lower labeling efficiencies do not introduce a bias in the obtained oligomer size, as contributions from incompletely labeled molecules would be rejected before the analysis (see Fig 4B).
The individual chromophores on each oligomer need to be mutually independent. Dyes in close proximity of a few nanometers may well exhibit coupling between their singlet and triplet levels [25], thereby affecting each other’s blinking rates. In the worst case, point-spread functions between different dyes would overlap. As long as reverse intersystem crossing processes are sufficiently slow, however, they can be filtered out in the respective single molecule trajectories based on abrupt jumps in the single molecule orientation. Of note, such effects were not observed in the pioneering study by Weisenburger et al. [8].
The oligomeric structure should be a regular polygon. While the analysis of irregular polygonal structures is in principle feasible, such a treatment goes beyond the scope of this manuscript.
The population should be homogeneous. Heterogeneous sample compositions are a challenging scenario for any particle averaging approach. In case of heterogeneities in the degree of oligomerization, our approach would yield the size of the oligomers of highest degree present in the sample.
The mutual distance between different oligomers should be large. To avoid localizations overlapping between different oligomers, it is critical to ensure that the mutual distance d of two neighboring oligomers is much larger than the spread of localizations belonging to one oligomer, i.e. d ≫ R + Δx. This can be achieved by reducing expression levels and/or by increasing the localization precision.
Oligomerization should occur only in a plane perpendicular to the optical axis. Two scenarios may be discriminated: First, the oligomerization plane may be tilted against the optical axis. An example would be oligomerization of proteins within the plasma membrane, which is not perfectly flat. In consequence, the obtained structures are distorted (Fig 7). For tilt angles up to 40 degrees, our approach yields oligomer sizes with surprisingly high precision. In more extreme cases, however, one may revert to alternative strategies. For example, a straightforward solution would be the rejection of oligomers which show localization maps deviating from a regular polygon. Slightly distorted structures could still be accounted for by including a deformation matrix in the model [14]. Secondly, biomolecules may also oligomerize in three dimensions. In this case, tomographic approaches [8] may be preferable.
A straightforward application of our method would be the study of the protomer arrangement within oligomeric structures. Quite often it is not clear in which orientation protomers are assembled or how particular domains of the protein are arranged. If site-specifically labeled protomers are available, the resulting side length would depend on the position of the label: Labels facing towards the inside of the oligomeric structure would yield smaller side lengths than labels facing the outside of the oligomer. Positioning labels on specific sites of the protein hence allows for unravelling the protomer orientation. Similar approaches proved to be successful for the analysis of larger structures such as nuclear pore complexes [15] or endocytic sites [26].
In order to fully exploit the potential of our method it is critical to choose the labeling strategy wisely: Labels should be sufficiently small to report on the actual position of the target site on the protein, and exactly one dye molecule should be linked to the target site. These constraints disqualify fluorescently labeled antibodies. Appropriate possibilities include small tags [17] and unnatural amino acids [18]. In principle, also switchable fluorescent proteins can be used for the analysis of oligomeric structures which are large compared to the size of the fluorescent protein.
Conclusion
Taken together, we have presented and quantitatively characterized a method for polarization-sensitive cryo-SMLM. We found remarkable precision for the determination of the side length of regular oligomeric structures with relative errors of less than 1%, which would be of sufficient quality to ascribe subunit positions in multi-protein complexes. We believe that our method provides a good basis for opening up structural biology applications to cryo-SMLM approaches.
Methods
Simulations
First, we simulated the positions of the protomers. For this, n protomers were assigned to each n-mer (n = 3, 4, 5, 6). Individual protomers belonging to one oligomer were arranged around the oligomer’s center position in the shape of a regular polygon with fixed side length, but random in-plane orientation. If not specified otherwise, we simulated oligomers for each analyzed data set.
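The placement of protomers can be sketched as follows. This is a minimal illustration of the description above, not the simulation code used for the manuscript; the function name `simulate_oligomer` and its parameters are assumptions. The circumradius follows from the side length via R = l / (2 sin(π/n)).

```python
import math
import random

def simulate_oligomer(n, side_length, center=(0.0, 0.0), rng=random):
    """Corner positions of a regular n-gon with random in-plane orientation.

    n: number of protomers; side_length: nominal side length (e.g. in nm);
    center: oligomer center position; rng: object providing uniform().
    """
    radius = side_length / (2.0 * math.sin(math.pi / n))
    phase = rng.uniform(0.0, 2.0 * math.pi)  # random in-plane rotation
    return [
        (center[0] + radius * math.cos(phase + 2.0 * math.pi * k / n),
         center[1] + radius * math.sin(phase + 2.0 * math.pi * k / n))
        for k in range(n)
    ]

# Example: one tetramer with 5 nm side length and a reproducible orientation.
tetramer = simulate_oligomer(4, 5.0, rng=random.Random(1))
```

Passing a seeded `random.Random` instance keeps individual runs reproducible, which is convenient when comparing analysis settings on the same simulated data.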
Second, each protomer was assumed to be labeled with exactly one dye molecule. In order to account for recordings at cryogenic conditions, a random but fixed dipole orientation was assigned to each dye molecule. The inherent brightness Nmax was considered to be the same for all dye molecules.
To simulate blinking, we assigned a random number of detections to each dye molecule, which was drawn from artificial blinking statistics following a log-normal distribution (as in [27]). The mean of the log-normal distribution was set to 6.4 localizations and the standard deviation to 5 localizations. These values correspond to previously reported blinking characteristics of fluorescent probes under realistic experimental conditions (compare [28]).
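A possible way to draw such blinking numbers is sketched below in Python. It assumes that 6.4 and 5 refer to the mean and standard deviation of the log-normally distributed variable itself (not of its logarithm), and that the drawn value is rounded to the nearest integer; both choices, as well as the function names, are assumptions made for illustration. The parameters of the underlying normal distribution then follow from σ² = ln(1 + s²/m²) and μ = ln(m) − σ²/2.

```python
import math
import random

def lognormal_params(mean, std):
    """Convert mean/std of a log-normal variable to (mu, sigma) of its logarithm."""
    sigma_squared = math.log(1.0 + (std / mean) ** 2)
    mu = math.log(mean) - sigma_squared / 2.0
    return mu, math.sqrt(sigma_squared)

def draw_detections(rng, mean=6.4, std=5.0):
    """Draw an integer number of detections for one dye molecule.

    Rounding to the nearest integer is an assumption; a result of 0 would
    correspond to a molecule that is never detected.
    """
    mu, sigma = lognormal_params(mean, std)
    return round(rng.lognormvariate(mu, sigma))

rng = random.Random(0)
detections = [draw_detections(rng) for _ in range(10)]
```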
Fluorophores were simulated to be excited alternatingly with differently polarized excitation light. The coordinate system was aligned with the orthogonal polarization directions x and y, which are orthogonal to the optical axis z. The absorption probability of a fluorophore depends on the angle between its dipole orientation and the polarization of the excitation light. Hence, w.l.o.g. the effective number of photons Nx, Ny for the two polarizations of excitation light can be calculated as
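$$ N_x = N_{\mathrm{max}} \cos^2\theta \, \cos^2\phi \tag{2} $$
$$ N_y = N_{\mathrm{max}} \cos^2\theta \, \sin^2\phi \tag{3} $$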
where θ and ϕ are the elevation and azimuth angle of the fluorophore’s dipole relative to the x–axis, respectively (see Fig 1), and Nmax is the number of photons emitted if the dipole orientation coincides with the polarization vector of the excitation light. For all simulations, we assumed random distributions of θ and ϕ on a sphere. The resulting probability density for detecting (Nx, Ny) photons is given by (see Note 1 in S1 File)
(4) |
Photon shot noise was included by drawing the observed number of photons from Poisson distributions with mean Nx and Ny, respectively.
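The excitation step can be sketched as follows. The sketch assumes the usual cos² dependence of absorption on the angle between dipole and polarization, i.e. Nx = Nmax cos²θ cos²ϕ and Ny = Nmax cos²θ sin²ϕ with θ measured from the x-y plane, and it assumes dipole orientations drawn uniformly on the sphere. The function names are hypothetical.

```python
import numpy as np

def random_dipoles(n, rng):
    """Draw n orientations uniformly on the sphere (elevation, azimuth)."""
    phi = rng.uniform(0.0, 2.0 * np.pi, size=n)
    # Uniform on the sphere: sin(theta) is uniform in [-1, 1]
    theta = np.arcsin(rng.uniform(-1.0, 1.0, size=n))
    return theta, phi

def expected_photons(theta, phi, n_max):
    """Mean photon numbers for x- and y-polarized excitation."""
    in_plane = n_max * np.cos(theta) ** 2
    return in_plane * np.cos(phi) ** 2, in_plane * np.sin(phi) ** 2

def observed_photons(theta, phi, n_max, rng):
    """Add photon shot noise by drawing from Poisson distributions."""
    nx, ny = expected_photons(theta, phi, n_max)
    return rng.poisson(nx), rng.poisson(ny)
```

Under these assumptions, Nx + Ny = Nmax cos²θ, so out-of-plane dipoles appear dimmer in both channels.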
The error in intensity estimation was distributed according to a normal distribution with mean 0 and variance (ΔN)2. The variance (ΔN)2 was set to the best possible variance of an unbiased estimator, which corresponds to the Cramér-Rao lower bound (CRLB) and is given as follows [29]:
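$$ (\Delta N)^2 = N \left( 1 + 4\tau + \sqrt{\frac{\tau}{14\,(1 + 2\tau)}} \right), \qquad \tau = \frac{2\pi \sigma_a^2 b}{N a^2}, \qquad \sigma_a^2 = \sigma_{\mathrm{PSF}}^2 + \frac{a^2}{12}, \tag{5} $$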
where a is the pixel size, b the background noise, N the signal photon count (i.e. Nx, Ny) and σPSF the standard deviation of the point-spread function (PSF). If not mentioned otherwise, background noise was set to b = 0. We assumed a pixel size of 100 nm and a standard deviation of the PSF of 160 nm.
Determination of the single molecule positions was assumed to be performed based on the combined images acquired by excitation with differently polarized light. The total intensity was calculated as Ntotal = Nx + Ny. The uncertainty of the localization procedure is hence given as [29]:
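$$ (\Delta x)^2 = \frac{\sigma_a^2}{N_{\mathrm{total}}} \left( 1 + 4\tau + \sqrt{\frac{2\tau}{1 + 4\tau}} \right), \qquad \sigma_a^2 = \sigma_{\mathrm{PSF}}^2 + \frac{a^2}{12}. \tag{6} $$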
As the background noise of the two individual frames combines, the combined background of both frames was used in place of b in the calculation of τ. Localization coordinates were displaced from the true protomer position by adding a random localization error according to the localization precision Δx. Any detections with a localization uncertainty Δx larger than 10 nm were discarded. Together with the given values of background noise b, pixel size a and the standard deviation of the PSF σPSF, this defines a minimum number of photons Nmin required to detect a single molecule signal.
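To make the link between the 10 nm cut-off and Nmin concrete, the sketch below evaluates a widely used closed-form approximation of the localization uncertainty, (Δx)² = σa²/N · (1 + 4τ + √(2τ/(1 + 4τ))) with σa² = σPSF² + a²/12 and τ = 2πσa²b/(N a²). Whether this is exactly the expression used by the authors should be checked against [29]; the function names and the brute-force search are illustrative assumptions.

```python
import math

def localization_precision(n_photons, background=0.0, pixel=100.0, sigma_psf=160.0):
    """Approximate localization uncertainty (same length unit as pixel)."""
    sigma_a2 = sigma_psf ** 2 + pixel ** 2 / 12.0
    tau = 2.0 * math.pi * sigma_a2 * background / (n_photons * pixel ** 2)
    var = sigma_a2 / n_photons * (
        1.0 + 4.0 * tau + math.sqrt(2.0 * tau / (1.0 + 4.0 * tau))
    )
    return math.sqrt(var)

def min_photons(max_uncertainty=10.0, **kwargs):
    """Smallest integer photon number whose uncertainty is within the cut-off."""
    n = 1
    while localization_precision(n, **kwargs) > max_uncertainty:
        n += 1
    return n

# With b = 0, a = 100 nm and sigma_PSF = 160 nm this gives 265 photons
n_min = min_photons()
```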
In order to simulate tilted tetramers, without loss of generality we assumed a tilt around the x-axis. To this end, we transformed the y-coordinates of the single molecule positions according to y′ = y ⋅ cos(α), where α denotes the tilt angle of the oligomerization plane with respect to the focal plane.
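The projection of a tilted oligomer onto the focal plane amounts to a scaling of the y-coordinates. A minimal sketch (the function name `tilt_about_x` is hypothetical, and the angle is given in degrees for convenience):

```python
import numpy as np

def tilt_about_x(xy, alpha_deg):
    """Project positions tilted by alpha about the x-axis onto the focal plane."""
    out = np.array(xy, dtype=float)  # copy, so the input stays unchanged
    out[:, 1] *= np.cos(np.deg2rad(alpha_deg))  # y' = y * cos(alpha)
    return out
```

For α = 0 the positions are unchanged, whereas for α approaching 90° the projected oligomer collapses onto the x-axis.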
Mathematical analysis
In this mathematical part, we will use the following notation. We assume that all oligomers are equilateral polygons and have the same number of corners n. We will need to distinguish between the different dye molecules constituting an oligomer, which we will index by i ∈ {1, ⋯, n}, and different localizations corresponding to dye molecule i, which we will index by j ∈ {1, ⋯, mi}, where mi specifies the total number of localizations of dye molecule i. The position of the individual dye molecule will be denoted by
$$ p_i = (x_i, y_i)^T, \tag{7} $$
whereas we will write $p_{i,j} = (x_{i,j}, y_{i,j})^T$ for the positions of the blinks. A superscript T as in (x, y)T denotes the transpose of a vector or matrix, and (⋅)x and (⋅)y yield the x- and y-component of a vector, respectively. The value $\mathbb{E}(\rho)$ denotes the expectation value of a random variable ρ, and its variance is defined by $\mathrm{Var}(\rho) = \mathbb{E}\big((\rho - \mathbb{E}(\rho))^2\big)$. Moreover, the addition of a hat, as in $\hat{\rho}$, is used for the estimator of a certain quantity ρ. Note that an estimator is called unbiased if $\mathbb{E}(\hat{\rho}) = \rho$. If the estimator is biased, the difference $\mathbb{E}(\hat{\rho}) - \rho$ is called the bias.
Throughout the manuscript we make use of the following nomenclature: R, L denote the ground truth radius and side length, respectively, of a regular polygon of n corners, which are related via L = 2R sin(π/n). For each oligomer i with given dipole orientations of the dye molecules, the variables and denote the estimators for radius and side length, as they are obtained from the circle fit (described in section Method for Minimization). In particular, they are not corrected for the fitting bias (described in section Identification of the Bias). Note that and are randomly distributed due to the presence of localization errors. The variables and denote the bias-corrected estimators for radius and side length of all oligomers , which happen to be eligible for analysis (see section on Assignment of Blinks to Specific Molecules). As discussed in the main text of this paper, we calculated the estimator via the mean or median value of all .
Identification of individual oligomeric structures
First of all, we assume that the individual oligomers, as well as the corresponding measured localizations, are well separated from those of every other oligomer. That is, measurements of different oligomers do not overlap. If that is the case, we are able to cluster the given data spatially in order to identify the localizations belonging to individual oligomers. This can be done effectively with standard two-dimensional clustering techniques. We use a straightforward approach: we sort the data and take the differences in coordinates in order to identify adjacent blinks that are closer to one another than a certain prescribed distance. All such localizations are then grouped into one cluster. In the simulations, however, we know which protomer belongs to which oligomer, so we omit this step and use the given information in order to avoid unlikely errors in this regard.
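One possible reading of this sort-and-difference approach is sketched below: localizations are first split along x wherever the gap between sorted neighbours exceeds the prescribed distance, and each resulting group is then split along y in the same way. The exact procedure is not given in the text, so the two-pass scheme, the function names and the requirement of non-empty input are assumptions.

```python
import numpy as np

def split_by_gaps(values, max_gap):
    """Label 1D values; sorted neighbours closer than max_gap share a label."""
    values = np.asarray(values, dtype=float)
    order = np.argsort(values, kind="stable")
    gaps = np.diff(values[order])
    sorted_labels = np.concatenate(([0], np.cumsum(gaps > max_gap)))
    labels = np.empty(len(values), dtype=int)
    labels[order] = sorted_labels
    return labels

def cluster_xy(points, max_gap):
    """Cluster 2D points by gap-splitting along x, then along y."""
    points = np.asarray(points, dtype=float)
    labels_x = split_by_gaps(points[:, 0], max_gap)
    labels = np.full(len(points), -1)
    next_label = 0
    for lx in np.unique(labels_x):
        idx = np.flatnonzero(labels_x == lx)
        labels_y = split_by_gaps(points[idx, 1], max_gap)
        labels[idx] = next_label + labels_y
        next_label += labels_y.max() + 1
    return labels
```

Such a scheme only works under the stated assumption that oligomers are well separated; for dense samples a general-purpose clustering method would be more appropriate.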
Assignment of blinks to specific molecules
This task is performed by taking advantage of the measured polarization of the dipole. In general, the spatial variance of the distribution of blinks makes a reliable clustering (using only the spatial data) impossible. However, the polarization property discriminates effectively between all molecules in one oligomer, provided the polarization of each protomer is sufficiently different from that of every other protomer. Note that we do not have access to the full dipole orientation of these molecules, but only to its projection onto the illumination plane. Similarly, a sign change in the polarization cannot be detected.
Concretely, we cluster the estimated intensities of the two polarization directions, which is again a clustering in 2D, as done above in the spatial domain (if performed). We consider the oligomer to be well resolved if the collection of blinks corresponding to that oligomer can be clustered in n groups, where the distance between any two groups has to be larger than a given parameter δp. Empirical tests suggest δp = 300 + Nmax/100 to be a feasible choice for the difference in the number of photons in order to assign the localizations properly. If the cluster is not well resolved in polarization, we simply discard it from our further computations. Otherwise, we call the oligomer eligible and proceed to estimate the distance between its individual protomers.
See also Fig 1 for visualization, where one specific example (tetramer) is shown and the corresponding blinks are assigned to their respective protomer. As we can see, a simple spatial clustering of the blinks is not applicable.
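The eligibility test can be illustrated with a simple single-linkage grouping in the (Nx, Ny) plane: blinks whose intensity vectors are closer than δp are merged, and the oligomer counts as eligible if exactly n groups remain. Interpreting the "distance between two groups" as the nearest-neighbour distance is an assumption, as are the function names.

```python
import numpy as np

def intensity_threshold(n_max):
    """Empirical separation threshold delta_p = 300 + N_max / 100 (photons)."""
    return 300.0 + n_max / 100.0

def single_linkage_labels(features, delta):
    """Merge all points that are connected by steps no longer than delta."""
    features = np.asarray(features, dtype=float)
    m = len(features)
    dist = np.linalg.norm(features[:, None, :] - features[None, :, :], axis=-1)
    labels = np.arange(m)
    for i in range(m):
        for j in range(i + 1, m):
            if dist[i, j] <= delta and labels[i] != labels[j]:
                old, new = labels[j], labels[i]
                labels[labels == old] = new  # merge the two groups
    _, labels = np.unique(labels, return_inverse=True)
    return labels

def assign_blinks(nx, ny, n_protomers, n_max):
    """Return per-blink group labels and whether the oligomer is eligible."""
    delta = intensity_threshold(n_max)
    labels = single_linkage_labels(np.column_stack((nx, ny)), delta)
    eligible = bool(labels.max() + 1 == n_protomers)
    return labels, eligible
```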
Estimation of distance between single protomers
Given the data of one individual oligomer, i.e., the localizations (blinks) contained in one spatial cluster, we are now interested in the distance l between the individual protomers. Since we assume the oligomeric structure to be a regular polygon, the distance between two adjacent protomers is supposed to be constant. That is, assuming that the corners are ordered, for all i, we have
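$$ \lVert p_{i+1} - p_i \rVert = l, \qquad i = 1, \dots, n, \quad \text{with } p_{n+1} := p_1. \tag{8} $$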
For regular polygons, the radius R and the distance l between adjacent protomers are directly related via l = 2R sin(π/n). In order to find an estimation for this distance, we use a geometric circle fit. Intuitively, we could minimize the mean square distance between the data points and the fitted circle, which more precisely means solving
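$$ \min_{(a, b, R)} \sum_{i=1}^{n} \sum_{j=1}^{m_i} \left( \sqrt{(x_{i,j} - a)^2 + (y_{i,j} - b)^2} - R \right)^2, \tag{9} $$
where $(x_{i,j}, y_{i,j})$ are the coordinates of the j-th localization of dye molecule i and (a, b) is the center of the circle.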
However, this turns out to yield unsatisfactory results, as we can see in Fig 1A. Since we are able to determine which blinks belong to which protomer, we instead fit the circle to the centers of mass of these individual clusters of blinks (see Fig 1B). That means, instead of the above we actually solve
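$$ \min_{(a, b, R)} \sum_{i=1}^{n} \left( \sqrt{(\bar{x}_i - a)^2 + (\bar{y}_i - b)^2} - R \right)^2, \tag{10} $$
where the means are given by
$$ \bar{x}_i = \frac{1}{m_i} \sum_{j=1}^{m_i} x_{i,j}, \qquad \bar{y}_i = \frac{1}{m_i} \sum_{j=1}^{m_i} y_{i,j}. \tag{11} $$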
Note that the minimization takes place over the center (a, b) of the circle as well as its radius R. The different fits in the example of Fig 1 show that assigning the blinks to their respective protomers improves the estimation of both the center and the radius.
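The centers of mass that enter the circle fit, and the conversion from a fitted radius to a side length, can be sketched as follows. The function names are hypothetical, and the labels are assumed to run from 0 to n − 1 with at least one blink per protomer.

```python
import numpy as np

def protomer_centroids(xy, labels):
    """Mean position and number of blinks for each protomer label."""
    xy = np.asarray(xy, dtype=float)
    labels = np.asarray(labels)
    counts = np.bincount(labels)
    cx = np.bincount(labels, weights=xy[:, 0]) / counts
    cy = np.bincount(labels, weights=xy[:, 1]) / counts
    return np.column_stack((cx, cy)), counts

def side_length_from_radius(radius, n):
    """Side length of a regular n-gon: l = 2 R sin(pi / n)."""
    return 2.0 * radius * np.sin(np.pi / n)
```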
Although the standard (affine fit) least squares problem is strictly convex and has a unique solution, this one is not convex and can have several minima, which might correspond to unrealistic solutions [30]. Moreover, this procedure will always yield an estimate of the radius that possesses a certain bias, which depends on the variance of the localization errors and the radius R of the circle (see [19, 31, 32]). For example, Fig 2 shows the histogram of the estimated side lengths for individual tetramers of 5 nm side length. As we can see, the mean of the distribution is higher than the ground truth, and the median, although also overestimated, is closer to the real value.
Identification of the bias
Suppose the localizations of the blinks are identically and independently distributed (iid) normal random variables with zero expectation (centered) and constant variance σ2. Depending on these parameters, we define the random variable r which describes the radius of the circle fitting those blinks (by minimizing (10)) and we denote by the estimator of R. As described in [19], the bias of this estimator is in this case essentially given by
$$ \mathbb{E}(\hat{R}) - R \approx \frac{\sigma^2}{2R}. \tag{12} $$
In our setup, however, the variance of the random variables is not constant since there are different deviations for the localizations belonging to each individual protomer due to the polarization of the light. For the moment let us consider one fixed spatial cluster of blinks. That is, the data corresponding to one individual oligomer. For each protomer , i = 1, …, n, contained in the oligomeric structure, the coordinates of the blinks , j = 1, …, mi, are mathematically actually realizations of
(13) |
with centered independent identically distributed (iid) normal random variables with variance , i.e., . Let the mean values of these variables be denoted by
(14) |
Obviously, we obtain expectations and, consequently, the corresponding variances are given by
(15) |
since the and are iid. In order to have an estimation of the position of the protomer pi we use the center of mass of the random blinks which is simply given by the mean
(16) |
For each measured blink we are given an estimation for its standard deviation, denoted as , in x as well as in y-direction. That is, the variance in each random coordinate of , which means the variance of and , can be estimated by the (sample) variance
(17) |
Consequently, the estimator possesses a variance which is simply the variance of the mean coordinate deviations and as given in Eq (15). Eventually, this can be estimated by
(18) |
Due to the fact that the variance is different for every , we cannot simply use the bias given in (12). Consulting the considerations in [19], we obtain the following. In our case of a geometric circle fit, we have the parameter vector Θ = (a, b, R) and denote its estimator by . Moreover, we can write
(19) |
with given angles φi ∈ [0, 2π[for i = 1, …, n. Then, we denote
(20) |
such that for all i = 1, …, n and we define
(21) |
Hence, with , the first order expansion (in σ) of the minimization problem for the geometric circle fit can be written as
(22) |
with and which has the solution
(23) |
when the terms are neglected. Then the covariance of the statistical error is (to the leading order) given by
(24) |
where
(25) |
Note that in the case of a constant variance σ2 and single measurements per point (i.e., without averaging as in (14)), the latter simplifies to σ2 I such that which coincides with the result obtained in [19].
To improve the estimation of the bias, we use the second order Taylor expansion of (10) for our parameters Θ = (a, b, R), which we write for the sake of brevity
(26) |
(27) |
(28) |
where the first order terms are linear combinations of the variables and already known as they are contained in the Matrix given in (24). That means, we need to know the second order terms which are quadratic forms of the latter random variables. After expansion and simplification in (10), this yields another minimization problem which reads
(29) |
where the values hi for i = 1, …, n are given by
(30) |
Setting h = (h1, …, hn)T and , we find the solution
(31) |
where terms are neglected. Hence, we obtain
(32) |
In contrast to the case of a constant variance the latter is now a three-component vector which in general does not contain any zeros. That means, even the estimation of the center of the circle is going to be biased. On the other hand, this perfectly makes sense since some of the points to be fitted are simply measured more accurately than others, which indeed has an impact on the choice of the center. However, since the center is not of particular importance for our computations, we ignore the fact that the center is biased and only correct the estimation of the radius. Without loss of generality, we set for i = 1, …, n since we know that the underlying oligomeric structure is an equilateral polygon. We can always rotate our coordinate system or reindex such that this is fulfilled. Eventually, this gives us the matrices U, V as well as W. Moreover, we replace Z in (24) by its estimator
(33) |
according to Eqs (17) and (18) in order to approximate the matrix given by (24). With these values we are able to compute the vector h as well as which gives us the desired bias for the particular oligomeric setup (tetramer, pentamer, hexamer etc). The bias relevant for our computations writes
(34) |
Comparing again with the situation in [19] of M single iid measurements with variance σ2, we obtain in analogy to Eq (12) the bias
(35) |
where the second summand is neglected in [19] since it vanishes asymptotically for M tending to infinity (the paper [19] expands asymptotically with the number of measurements growing to infinity). Hence, the first summand, which does not depend on the number of measurements taken, is the so-called essential bias. In our case, however, the second summand in (34) must not be neglected since our n is small (usually n = 4, 5, 6). On the other hand, it is easy to see that the variances of the mean protomer positions become smaller the more measurements mi we have for a single protomer, such that the entire bias tends to zero for mi → ∞, i = 1, …, n. Therefore, more measurements lead to more accurate estimations of the radius and, in turn, of the edge lengths of the oligomers. This is not the case in the setup described in [19]. The effect of the bias is shown in Figs 1 and 2.
An exact solution of Eq (34), considering , is given by
(36) |
such that we eventually find
(37) |
In case of , the correct solution is given by
(38) |
Since is experimentally not accessible, we can replace it by the random variable . Hence, a bias-corrected estimator for the oligomer radius is given by
(39) |
An estimator of the radius based on the analysis of all eligible oligomers is obtained via
(40) |
If not indicated otherwise, the median was taken for analysis.
Method for minimization
In order to solve (10), we have to provide an initial guess (a0, b0) for the center as well as r0 for the radius. While the minimization is not too sensitive to the guessed radius, the initial coordinates for the center ought to be not too far away from the ground truth. For that purpose, we use the center of mass as an initial guess for the center point. If we would not assign the different blinks to their individual protomers, that is the situation of M iid measurements, we simply compute the overall center of mass of all blinks in that particular spatial cluster. In our notation, that means with we would have the center
as our initial guess for the minimization. However, we are able to assign blinks to their respective protomer such that we actually obtain
(41) |
as the center of mass of our oligomer, which will in general differ from the former. For the radius we simply choose a value R0 which is sufficiently small such that R0 < Rtruth. Hence, with the centers of mass of the individual clusters of blinks (clustered in intensity), which are the estimations for the positions of the single protomers, given by the means and (see Eq 11), we eventually obtain the initial guess
(42) |
The method we use in order to perform the minimization in (10) is based on the Levenberg-Marquardt algorithm (LM, see [33, 34]). The LM method is an iterative optimization algorithm to solve non-linear least square problems (like the one above). In most applications it tends to be slower than a classical Gauß-Newton approach but, on the other hand, it is more robust. Essentially, LM is a Gauß-Newton ansatz which incorporates a regularization term which forces a decay of function values in the process. To be more specific, let us introduce the following notions. Let the function be defined by
such that the sum in (10) is given by
(43) |
In each iteration of the LM method the initial guess (a0, b0, R0) is now replaced by values (ak, bk, Rk)T = (ak−1, bk−1, Rk−1)T + d with d = (dx, dy, dR)T and . Let us further define the matrix D whose rows consist of the derivatives
(44) |
Hence, for sufficiently small d, we can use the linearized approximation
which means the sum in Eq (43) can be approximately written as
(45) |
(46) |
Taking the derivative with respect to dx, dy, and dr, setting the gradient equal to zero, and using matrix/vector notation, we now obtain with the vectors that
(47) |
has to hold for the minimum of S. In the case of the LM algorithm this system of equations is numerically stabilized by the addition of the identity matrix I with factor λ > 0, which means (47) is replaced by
(48) |
The regularization parameter λ can be changed in every iteration in order to speed up the convergence of the algorithm.
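The damped iteration described above can be sketched compactly. The code below is an illustrative implementation, not the authors' code: it minimizes the sum of squared residuals f_i = sqrt((x̄_i − a)² + (ȳ_i − b)²) − R, starts from the center of mass and a small radius, and solves (DᵀD + λI) d = −Dᵀf in every iteration. The rule of dividing λ by 10 after a successful step and multiplying it by 10 otherwise is an assumption, since the text only states that λ may change between iterations.

```python
import numpy as np

def fit_circle_lm(points, r0=1.0, lam=1e-3, max_iter=200, tol=1e-15):
    """Geometric circle fit with a basic Levenberg-Marquardt iteration."""
    pts = np.asarray(points, dtype=float)
    # Initial guess: center of mass of the points and a small radius r0
    theta = np.array([pts[:, 0].mean(), pts[:, 1].mean(), r0])

    def residuals(t):
        return np.hypot(pts[:, 0] - t[0], pts[:, 1] - t[1]) - t[2]

    f = residuals(theta)
    cost = f @ f
    for _ in range(max_iter):
        dx, dy = pts[:, 0] - theta[0], pts[:, 1] - theta[1]
        dist = np.hypot(dx, dy)
        # Rows of D: derivatives of f_i with respect to (a, b, R)
        D = np.column_stack((-dx / dist, -dy / dist, -np.ones(len(pts))))
        d = np.linalg.solve(D.T @ D + lam * np.eye(3), -D.T @ f)
        f_new = residuals(theta + d)
        cost_new = f_new @ f_new
        if cost_new < cost:
            improvement = cost - cost_new
            theta, f, cost = theta + d, f_new, cost_new
            lam *= 0.1  # accepted step: reduce the damping
            if improvement < tol:
                break
        else:
            lam *= 10.0  # rejected step: increase the damping
    return tuple(theta)
```

Applied to the four centers of mass of a tetramer, the returned radius R can be converted to a side length via l = 2R sin(π/4). The sketch assumes at least three non-collinear points, none of which coincides with the current center estimate.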
Since we can estimate the localization precision (standard deviation in nm) from the measured intensity for each individual blink of one protomer, we are able to discard low-quality data. Hence, one could ignore measurements whose localization uncertainty exceeds a certain threshold in order to improve results.
Calculation of error bars
All error bars were calculated based on 1000 bootstrap samples, which were drawn from the individual data sets, and represent the 95% confidence intervals of the mean (or median).
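A bootstrap confidence interval of this kind could be computed as sketched below. The text does not state how the interval was derived from the bootstrap samples; the percentile method used here is an assumption, and `bootstrap_ci` is a hypothetical name.

```python
import numpy as np

def bootstrap_ci(values, statistic=np.median, n_boot=1000, level=0.95, rng=None):
    """Percentile bootstrap confidence interval for a statistic."""
    rng = np.random.default_rng() if rng is None else rng
    values = np.asarray(values, dtype=float)
    # Resample with replacement: one row per bootstrap sample
    idx = rng.integers(0, len(values), size=(n_boot, len(values)))
    stats = statistic(values[idx], axis=1)
    lower = 100.0 * (1.0 - level) / 2.0
    upper = 100.0 * (1.0 + level) / 2.0
    lo, hi = np.percentile(stats, [lower, upper])
    return lo, hi
```

Passing `statistic=np.mean` gives the corresponding interval for the mean.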
Runtime analysis
For analysis of the runtime shown in S7 Fig in S1 File we used a standard personal computer model XPS 15 9570 with an Intel Core i7-8750H processor.
Supporting information
Acknowledgments
We thank Hamidreza Heydarian and Bernd Rieger for assistance with the code to generate S1 Fig in S1 File.
Data Availability
All relevant data are within the manuscript and its Supporting information files.
Funding Statement
OS: F6807-N36, FWF, https://www.fwf.ac.at/ OS: I3661-N27, FWF, https://www.fwf.ac.at/ GJS: F6809-N36, FWF, https://www.fwf.ac.at/ The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.
References
- 1. Sigal YM, Zhou R, Zhuang X. Visualizing and discovering cellular structures with super-resolution microscopy. Science. 2018;361:880–887. doi:10.1126/science.aau1044
- 2. Smith CS, Joseph N, Rieger B, Lidke KA. Fast, single-molecule localization that achieves theoretically minimum uncertainty. Nat Methods. 2010;7(5):373–5. doi:10.1038/nmeth.1449
- 3. Tanaka KA, Suzuki KG, Shirai YM, Shibutani ST, Miyahara MS, Tsuboi H, et al. Membrane molecules mobile even after chemical fixation. Nat Methods. 2010;7(11):865–6. doi:10.1038/nmeth.f.314
- 4. Li Y, Almassalha LM, Chandler JE, Zhou X, Stypula-Cyrus YE, Hujsak KA, et al. The effects of chemical fixation on the cellular nanostructure. Exp Cell Res. 2017;358(2):253–259. doi:10.1016/j.yexcr.2017.06.022
- 5. Tsang TK, Bushong EA, Boassa D, Hu J, Romoli B, Phan S, et al. High-quality ultrastructural preservation using cryofixation for 3D electron microscopy of genetically labeled tissues. Elife. 2018;7. doi:10.7554/eLife.35524
- 6. Li W, Stein SC, Gregor I, Enderlein J. Ultra-stable and versatile widefield cryo-fluorescence microscope for single-molecule localization with sub-nanometer accuracy. Opt Express. 2015;23(3):3770–83. doi:10.1364/OE.23.003770
- 7. Tuijtel MW, Koster AJ, Jakobs S, Faas FGA, Sharp TH. Correlative cryo super-resolution light and electron microscopy on mammalian cells using fluorescent proteins. Sci Rep. 2019;9(1):1369. doi:10.1038/s41598-018-37728-8
- 8. Weisenburger S, Boening D, Schomburg B, Giller K, Becker S, Griesinger C, et al. Cryogenic optical localization provides 3D protein structure data with Angstrom resolution. Nat Methods. 2017;14(2):141–144. doi:10.1038/nmeth.4141
- 9. Deschout H, Shivanandan A, Annibale P, Scarselli M, Radenovic A. Progress in quantitative single-molecule localization microscopy. Histochem Cell Biol. 2014;142(1):5–17. doi:10.1007/s00418-014-1217-y
- 10. Cheng Y, Grigorieff N, Penczek PA, Walz T. A primer to single-particle cryo-electron microscopy. Cell. 2015;161(3):438–449. doi:10.1016/j.cell.2015.03.050
- 11. Fortun D, Guichard P, Hamel V, Sorzano COS, Banterle N, Gonczy P, et al. Reconstruction From Multiple Particles for 3D Isotropic Resolution in Fluorescence Microscopy. IEEE Trans Med Imaging. 2018;37(5):1235–1246. doi:10.1109/TMI.2018.2795464
- 12. Broeken J, Johnson H, Lidke DS, Liu S, Nieuwenhuizen RP, Stallinga S, et al. Resolution improvement by 3D particle averaging in localization microscopy. Methods Appl Fluoresc. 2015;3(1):014003. doi:10.1088/2050-6120/3/1/014003
- 13. Heydarian H, Schueder F, Strauss MT, van Werkhoven B, Fazel M, Lidke KA, et al. Template-free 2D particle fusion in localization microscopy. Nat Methods. 2018;15(10):781–784. doi:10.1038/s41592-018-0136-6
- 14. Shi X, Garcia rG, Wang Y, Reiter JF, Huang B. Deformed alignment of super-resolution images for semi-flexible structures. PLoS One. 2019;14(3):e0212735. doi:10.1371/journal.pone.0212735
- 15. Szymborska A, de Marco A, Daigle N, Cordes VC, Briggs JAG, Ellenberg J. Nuclear Pore Scaffold Structure Analyzed by Super-Resolution Microscopy and Particle Averaging. Science. 2013;341. doi:10.1126/science.1240672
- 16. Thevathasan JV, Kahnwald M, Cieslinski K, Hoess P, Peneti SK, Reitberger M, et al. Nuclear pores as versatile reference standards for quantitative superresolution microscopy. Nat Methods. 2019;16(10):1045–1053. doi:10.1038/s41592-019-0574-9
- 17. Erdmann RS, Baguley SW, Richens JH, Wissner RF, Xi Z, Allgeyer ES, et al. Labeling Strategies Matter for Super-Resolution Microscopy: A Comparison between HaloTags and SNAP-tags. Cell Chem Biol. 2019;26(4):584–592.e6. doi:10.1016/j.chembiol.2019.01.003
- 18. Jakob L, Gust A, Grohmann D. Evaluation and optimisation of unnatural amino acid incorporation and bioorthogonal bioconjugation for site-specific fluorescent labelling of proteins expressed in mammalian cells. Biochem Biophys Rep. 2019;17:1–9. doi:10.1016/j.bbrep.2018.10.011
- 19. Al-Sharadqah A, Chernov N. Error Analysis for Circle Fitting Algorithms. Electronic Journal of Statistics. 2009;3:886–911. doi:10.1214/09-EJS419
- 20. Baumgart F, Arnold AM, Rossboth BK, Brameshuber M, Schutz GJ. What we talk about when we talk about nanoclusters. Methods Appl Fluoresc. 2018;7(1):013001. doi:10.1088/2050-6120/aaed0f
- 21. Khater IM, Nabi IR, Hamarneh G. A Review of Super-Resolution Single-Molecule Localization Microscopy Cluster Analysis and Quantification Methods. Patterns. 2020;1(3). doi:10.1016/j.patter.2020.100038
- 22. Backlund MP, Lew MD, Backer AS, Sahl SJ, Moerner WE. The role of molecular dipole orientation in single-molecule fluorescence microscopy and implications for super-resolution imaging. Chemphyschem. 2014;15(4):587–99. doi:10.1002/cphc.201300880
- 23. Backlund MP, Arbabi A, Petrov PN, Arbabi E, Saurabh S, Faraon A, et al. Removing Orientation-Induced Localization Biases in Single-Molecule Microscopy Using a Broadband Metasurface Mask. Nat Photonics. 2016;10:459–462. doi:10.1038/nphoton.2016.93
- 24. Nevskyi O, Tsukanov R, Gregor I, Karedla N, Enderlein J. Fluorescence polarization filtering for accurate single molecule localization. APL Photonics. 2020;5(6). doi:10.1063/5.0009904
- 25. Tinnefeld P, Buschmann V, Weston KD, Sauer M. Direct Observation of Collective Blinking and Energy Transfer in a Bichromophoric System. Journal of Physical Chemistry A. 2003;107(3). doi:10.1021/jp026565u
- 26. Mund M, van der Beek JA, Deschamps J, Dmitrieff S, Hoess P, Monster JL, et al. Systematic Nanoscale Analysis of Endocytosis Links Efficient Vesicle Formation to Patterned Actin Nucleation. Cell. 2018. doi:10.1016/j.cell.2018.06.032
- 27. Arnold AM, Schneider MC, Husson C, Sablatnig R, Brameshuber M, Baumgart F, et al. Verifying molecular clusters by 2-color localization microscopy and significance testing. Sci Rep. 2020;10(1):4230. doi:10.1038/s41598-020-60976-6
- 28. Rossboth B, Arnold AM, Ta H, Platzer R, Kellner F, Huppa JB, et al. TCRs are randomly distributed on the plasma membrane of resting antigen-experienced T cells. Nat Immunol. 2018;19(8):821–827. doi:10.1038/s41590-018-0162-7
- 29. Rieger B, Stallinga S. The lateral and axial localization uncertainty in super-resolution light microscopy. Chemphyschem. 2014;15(4):664–70. doi:10.1002/cphc.201300711
- 30. Chernov N, Lesort C. Least Squares Fitting of Circles. Journal of Mathematical Imaging and Vision. 2005;23:239–251. doi:10.1007/s10851-005-0482-8
- 31. Chernov N, Lesort C. Statistical Efficiency of Curve Fitting Algorithms. Computational Statistics and Data Analysis. 2004;47:713–728. doi:10.1016/j.csda.2003.11.008
- 32. Kanatani K. Statistical Optimization of Geometric Computation: Theory and Practice. Elsevier Science; 2004.
- 33. Levenberg K. A Method for the Solution of Certain Non-Linear Problems in Least Squares. Quarterly of Applied Mathematics. 1944;2:164–168. doi:10.1090/qam/10666
- 34. Marquardt D. An Algorithm for Least-Squares Estimation of Nonlinear Parameters. SIAM Journal on Applied Mathematics. 1963;11:431–441. doi:10.1137/0111030