Decoupling global biases and local interactions between cell biological variables

Assaf Zaritsky; Uri Obolski; Zhuo Gan; Carlos R Reis; Zuzana Kadlecova; Yi Du; Sandra L Schmid; Gaudenz Danuser

doi:10.7554/eLife.22323

. 2017 Mar 13;6:e22323. doi: 10.7554/eLife.22323

Decoupling global biases and local interactions between cell biological variables

Assaf Zaritsky ^1,², Uri Obolski ³, Zhuo Gan ^1,², Carlos R Reis ¹, Zuzana Kadlecova ¹, Yi Du ², Sandra L Schmid ¹, Gaudenz Danuser ^1,^2,^*

Editor: Fabrice Cordelières⁴

PMCID: PMC5413353 PMID: 28287393

Abstract

Analysis of coupled variables is a core concept of cell biological inference, with co-localization of two molecules as a proxy for protein interaction being a ubiquitous example. However, external effectors may influence the observed co-localization independently from the local interaction of two proteins. Such global bias, although biologically meaningful, is often neglected when interpreting co-localization. Here, we describe DeBias, a computational method to quantify and decouple global bias from local interactions between variables by modeling the observed co-localization as the cumulative contribution of a global and a local component. We showcase four applications of DeBias in different areas of cell biology, and demonstrate that the global bias encapsulates fundamental mechanistic insight into cellular behavior. The DeBias software package is freely accessible online via a web-server at https://debias.biohpc.swmed.edu.

DOI: http://dx.doi.org/10.7554/eLife.22323.001

Research Organism: Other

eLife digest

Cell biologists often use microscopes to look closely at cells and see what is happening during an experiment. Cell biology experiments typically involve measuring more than one aspect of the cells, for example, the forces a cell is experiencing and the direction it is moving, or the locations of two different components in the cell. The task is then to decipher the interactions between these independent variables to better understand the inner workings of a living cell.

This task, however, can be challenging because other variables can mask the interactions between the pairs of variables being studied. For example, it is difficult to know if two components of cells overlap in microscopy images because they directly interact or simply because the overall organization of the cell makes it more likely they will end up in the same place. Several statistical methods have been developed to estimate and eliminate such confounding effects to reveal specific interactions. However, these confounding effects – termed “global biases” – may themselves contain valuable information about how cells work. Ignoring the possible roles of global biases may limit our understanding of biological processes.

Zaritsky et al. have now addressed this issue by developing an algorithm called DeBias that takes global bias into account by decoupling it from true interactions between two variables. DeBias uses global bias as a second measurement to understand how two biological variables interact via confounding variables. Zaritsky et al. then used DeBias with data from four different cell biology experiments to show that the algorithm works. For example, animal cells contain several proteins that form filaments, which give them their shape and help them move. So-called vimentin filaments and microtubules had previously been seen to occur in the same place within cells but it was not clear whether their alignment was due to local interactions or determined by the overall shape of the cell. DeBias was used to analyze cells that were either still or moving in a particular direction. The algorithm could tease apart the effect of this movement and showed that co-alignment of vimentin filaments and microtubules was caused more by the movement and shape of the cell and less by specific interactions.

Overall, DeBias is a new mathematical tool for cell biologists that is freely accessible online. Other researchers can now use this tool in future studies to identify local interactions and global biases in a wide range of cell biology experiments and interpret the data in a meaningful way.

DOI: http://dx.doi.org/10.7554/eLife.22323.002

Introduction

Interpretation of the relations among coupled variables is a classic problem in many areas of cell biology. One example is the spatiotemporal co-localization of molecules – a critical clue to interactions between molecular components; another example is alignment of molecular structures, such as filamentous networks. However, co-localization or alignment may also occur because the observed components are associated with external effectors. For example, the internal components of a polarized cell are organized along the polarization axis, making it difficult to quantify how much of the observed alignment between two filamentous networks is related to common organizational constraints, and how much of it is indeed caused by direct interaction between filaments. Another example is introduced with protein co-localization, where their intensity distributions may be heavily biased to specific levels regulated by the cell state. The combined effects of global bias with local interactions are manifest in the joint distribution of the spatially coupled variables. The contribution of global bias to this joint distribution can be recognized from the deviation of the marginal distributions of each of the two variables from an (un-biased) uniform distribution.

Although global bias can significantly mislead the interpretation of co-localization and co-orientation measurements, most studies do not account for this effect (Adler and Parmryd, 2010; Bolte and Cordelières, 2006; Costes et al., 2004; Das et al., 2015; Dunn et al., 2011; Kalaidzidis et al., 2015; Rizk et al., 2014; Serra-Picamal et al., 2012; Tambe et al., 2011). Previous approaches indirectly assessed spatial correlations (e.g., [Drew et al., 2015; Karlon et al., 1999]), variants of mutual information (e.g., [Krishnaswamy et al., 2014; Reshef et al., 2011]) or spatial biases (Helmuth et al., 2010) but did not explicitly quantify the contribution of the global bias to the observed joint distribution. These methods approach the global bias as a confounding factor (VanderWeele and Shpitser, 2013) that must be eliminated for more accurate assessment of the true local interaction, but ignore the possibility that the global bias contains by-itself valuable mechanistic information to cell behavior.

Here, we present DeBias as an algorithm to decouple the global bias (represented by a global index) from the bona fide local interaction (represented by a local index) in co-localization and co-orientation of two independently-measured spatial variables. The decoupling enables simultaneous investigation of processes that drive global bias and local interactions between spatially-matched variables. Our method is dubbed DeBias because it Decouples the global Bias from local interactions between two variables.

To highlight its capabilities, DeBias was applied to data from four different areas in cell biology, ranging in scale from macromolecular to multicellular: (1) alignment of vimentin fibers and microtubules in the context of polarized cells; (2) alignment of cell velocity and traction stress during collective migration; (3) ﬂuorescence resonance energy transfer of Protein Kinase C; and (4) recruitment of transmembrane receptors to clathrin-coated pits during endocytosis. These examples demonstrate the generalization of the method and underline the potential of extracting global bias as an independent functional measurement in the analysis of multiplex biological variables.

Results

Similarity of observed co-orientation originating from different mechanisms

The issue of separating contributions from global bias and local interactions is best illustrated with the alignment of two sets of variables that carry orientational information. Examples of co-orientation include the alignment of two filament networks (Drew et al., 2015; Gan et al., 2016; Nieuwenhuizen et al., 2015), or the alignment of cell velocity and traction stress, a phenomenon referred to as plithotaxis (Das et al., 2015; Tambe et al., 2011; Trepat and Fredberg, 2011). In these systems, global bias imposes a preferred axis of orientation on the two variables, which is independent of the local interactions between the two variables (Figure 1A).

Similar observed alignments may arise from different levels of global bias and local interactions. This is demonstrated by simulation of two independent random variables X and Y, representing orientations (Figure 1B, left), from which pairs of samples x_i and y_i are drawn to form an alignment angle θ_i (Figure 1B, middle). Then, a local interaction between the two variables is modeled by co-aligning θ_i by ζ_i degrees, resulting in two variables x_i’ and y_i’ with an observed alignment θ_i - ζ_i (Figure 1B, right).

We show the joint distribution of X, Y for four simulations (Figure 1C) where X and Y are normally distributed with identical means but different standard deviations (σ), truncated to $[- 90^{\circ}, 90^{\circ}]$ , and different magnitudes of local interactions (ζ). The latter is defined as ζ = αθ (Figure 1B, α = 1 for perfect alignment). Throughout the simulations both σ and α are gradually increased (Figure 1C, left-to-right), implying that the global bias in the orientational variables is reduced while their local interactions increase. As a result, all simulations display similar observed alignments (mean values, 18.9°−19.5°). Figure 1D visualizes 100 samples from each of the two most distinct scenarios: low σ and no local interaction (σ = 17°, α = 0) leads to tendency of X and Y to align independently to one direction (left); higher variance together with increased interaction (σ = 40°, α = 0.5) leads to more diverse orientations of X and Y (right), while maintaining similar mean alignment. This simple example highlights the possibility of observing similar alignments arising from different mechanisms. While the described properties are well known and many others have used statistical post-processing to eliminate confounding factors for accurate assessment of local interactions (Drew et al., 2015; Helmuth et al., 2010; Karlon et al., 1999; Krishnaswamy et al., 2014; Reshef et al., 2011), we aim at directly quantifying the global bias, with the goal of extracting encapsulated information that is fundamental to the biological question.

DeBias: a method to assess the global and local contribution to observed co-alignment

DeBias models the observed marginal distributions X’ and Y’ as the sum of contributions by a common effector, i.e., the global bias, and by local interactions that effect the co-alignment of the two variables in every data point (Figure 2A).

Figure 2. — (A) Underlying assumption: the observed relation between two variables is a cumulative process of a global bias and a local interaction component. (B) Quantifying local and global indices: sample from the marginal distributions X, Y to construct the resampled distribution. The global index (GI) is calculated as the Earth Movers Distance (EMD) between the uniform and the resampled distributions. The local index (LI) is calculated as the subtraction of the GI from the EMD between the uniform and the observed distribution. (C) Local and global indices calculated for the examples from Figure 1C. Black circles represent the (GI,LI) value for the corresponding example in Figure 1C, bars represent the relative contribution of the local (green) and global (red) index to the observed alignment. (**D-E**) Simulation using a constant interaction parameter α = 0.2 and varying standard deviations of X, Y, σ = 50° to 5°. (D) Joint distributions. Correlation between X and Y is (subjectively) becoming less obvious for increasing global bias (decreasing σ). (E) GI and LI are negatively correlated: decreased σ enhances GI and reduces LI. The change in GI is ~4 fold larger compared to the change in LI indicating that the GI has a limited effect on LI values. Inset: stretched LI emphasizes the negative correlation. (F) Both LI and GI are needed to discriminate between simulations with different interaction parameters. α = 0.2 (red) or 0.25 (cyan), σ is drawn from a normal distribution (mean = 25°, standard deviation = 4°). Number of simulations = 40, for each parameter setting. Inset: stretched LI emphasizes the discrimination. Number of histogram bins, K = 15, for all simulations.

**DOI:** http://dx.doi.org/10.7554/eLife.22323.004

Figure 2—figure supplement 1. — (A) Underlying assumption: the observed relation between two variables is a cumulative process of a global bias and a local interaction component. (B) Quantifying local and global indices: sample from the marginal distributions X, Y to construct the resampled distribution. The global index (GI) is calculated as the Earth Movers Distance (EMD) between the uniform and the resampled distributions. The local index (LI) is calculated as the subtraction of the GI from the EMD between the uniform and the observed distribution. (C) Local and global indices calculated for the examples from Figure 1C. Black circles represent the (GI,LI) value for the corresponding example in Figure 1C, bars represent the relative contribution of the local (green) and global (red) index to the observed alignment. (**D-E**) Simulation using a constant interaction parameter α = 0.2 and varying standard deviations of X, Y, σ = 50° to 5°. (D) Joint distributions. Correlation between X and Y is (subjectively) becoming less obvious for increasing global bias (decreasing σ). (E) GI and LI are negatively correlated: decreased σ enhances GI and reduces LI. The change in GI is ~4 fold larger compared to the change in LI indicating that the GI has a limited effect on LI values. Inset: stretched LI emphasizes the negative correlation. (F) Both LI and GI are needed to discriminate between simulations with different interaction parameters. α = 0.2 (red) or 0.25 (cyan), σ is drawn from a normal distribution (mean = 25°, standard deviation = 4°). Number of simulations = 40, for each parameter setting. Inset: stretched LI emphasizes the discrimination. Number of histogram bins, K = 15, for all simulations.

**DOI:** http://dx.doi.org/10.7554/eLife.22323.004

In a scenario without any global bias or local interaction between X’ and Y’, the observed alignment would be uniformly distributed (denoted uniform). Hence any deviation from the uniform distribution would reflect contributions from both the global bias and the local interactions. To extract the contribution of the global bias we constructed a resampled alignment distribution (denoted resampled) from independent samples of the marginal distributions X’ and Y’, which decouples matched pairs (x_i’, y_i’), and thus excludes their local interactions. The global bias is defined as the dissimilarity between the uniform and resampled distributions and accordingly describes to what extent elements of X’ and Y’ are aligned without local interaction (Figure 2B). If a local interaction exists then the distribution of the observed alignment angles will differ from the independently resampled alignment distribution. Hence, the uniform distribution will be less similar to the experimentally observed alignment distribution (denoted observed) than to the resampled distribution. Accordingly, the local interaction is defined by the difference of dissimilarity between the observed and uniform distributions and dissimilarity between the resampled and uniform distributions (Figure 2B).

The Earth Mover's Distance (EMD) (Peleg et al., 1989; Rubner et al., 2000) was used to calculate dissimilarities between distributions. The EMD of 1-dimensional distributions is defined as the minimal 'cost' to transform one distribution into the other (Kantorovich and Rubinstein, 1958). This cost is proportional to the minimal accumulated number of moving observations to adjacent histogram bins needed to complete the transformation. Formally, we calculate $E M D (A, B) = \sum_{i = 1, \dots, K} | \sum_{j = 1, \dots, i} a_{j} - \sum_{j = 1, \dots, i} b_{j} |$ , with K - number of histogram bins, a_j and b_j - fraction of observations in bin j for distributions A and B correspondingly. Introducing the EMD defines scalar values for the dissimilarities and allows us to define the EMD between resampled and uniform alignment distributions as the global index (GI) and the local index (LI) as the difference of EMD between observed and uniform and the GI (Figure 2B).Figure 2C, demonstrates how the GI and LI recognize the global bias and local interactions between the matched variable pairs (x_i’, y_i’) established in Figure 1C. For a scenario with no local interaction (α = 0) DeBias correctly reports LI~0 and GI~3. For a scenario with gradually wider distributions X,Y, i.e., less global bias, and gradually stronger local interactions (α > 0), the LI increases while the GI decreases. Additional simulations showed that similar properties apply for negative local interactions ζ = αθ (Figure 1B) were α < 0 (Figure 2—figure supplement 1).

In the previous illustrations, changes in spread of the distributions X and Y were compensated by changes in the local interactions. When leaving the interaction parameter α constant while changing the spread of X and Y, a weak, but intrinsically negative correlation between LI and GI becomes apparent (Figure 2D–E). Thus, while DeBias can correctly distinguish scenarios with substantial shifts from global bias to local interactions, the precise numerical values estimating the contribution of LI varies between scenarios with a low versus high global bias. To address this issue we propose to exploit the variation between experiments for modeling the relation between LI and GI. This is demonstrated by comparing two distinct values of the interaction parameter, α, emulating different experimental settings (Figure 2F). Within experiments variation was obtained by drawing multiple values of σ from a normal distribution. Due to the negative correlation between LI and GI the experimental patterns can only be discriminated by combining LI and GI into a two-dimensional descriptor (Figure 2F). This point will be further demonstrated in one of the following case studies and in the Discussion.

Theory and limiting cases of DeBias

To characterize the properties of DeBias we used theoretical statistical reasoning. The first limiting case is set by the scenario in which observations from X and Y are independent. The expected values of the observed and resampled alignments are identical; accordingly, LI converges to 0 for large N (Appendix 1, Theorem 1). The second limiting case is set by the scenario in which X and Y are both uniform. The corresponding resampled alignment is also uniform; accordingly, GI converges to 0 for a large N (Appendix 1, Theorem 2). The third limiting case occurs with perfect alignment, i.e., x_i = y_i for all i. In this case the observed alignment distribution is concentrated in the bin containing θ = 0. We examine two scenarios of perfect alignment: (1) When all the locally matched measurements are identical (x_i = y_j for all i, j), the resampled distribution is also concentrated in the bin θ = 0 implying that LI = 0 and GI assumes the maximal possible value: $G I = \frac{1}{K} \sum_{i = 1, .., K} (i - 1) = \frac{K - 1}{2}$ , where K is the number of quantization bins (Appendix 1, Theorem 3.I). (2) When X, Y are uniform (and x_i = y_i for all i), the resampled distribution is uniform, thus GI = 0 and LI reaches its maximum value: $L I = \frac{1}{K} \sum_{i = 1, .., K} (i - 1) = \frac{K - 1}{2}$ , (Appendix 1, Theorem 3.II). Generalizing this case, we prove that LI is a lower bound for the actual contribution of the local interaction to the observed alignment (Appendix 1, Theorem 4). Complementarily, GI is an upper bound for the contribution of the global bias to the observed alignment.

Last, we show that when X and Y are truncated normal distributions, or when the alignment distribution is truncated normal, GI reduces to a limit of 0 as σ → ∞, when σ is the standard deviation of the normal distribution before truncation (Appendix 1, Theorem 5). Simulations complement this result demonstrating that σ and GI are negatively associated, i.e., GI decreases with increasing σ (Figure 2E). This final property is intuitive, because resampling from more biased distributions (smaller σ) tends to generate high agreement between (x_i, y_i) leading to reduced alignment angles and increased GI.

The modeling of the observed alignment as the sum of GI and LI allowed us to assess the performance of DeBias from synthetic data. By using a constant local interaction parameter ζ (ζ = c), we were able to retrieve the portion of the observed alignment that is attributed to the local interaction and to compare it with the true predefined ζ (Appendix 2, Appendix 2—figure 1). These simulations demonstrated again the need for a GI-dependent interpretation of LI (first shown in Figure 2E–F). Simulations were also performed to assess how the choice of the quantization parameter K (i.e., number of histogram bins) and number of observations N affect GI and LI (Appendix 2, Appendix 2—figures 2–3), and this was also verified in our experimental data (Figure 3—figure supplement 1). In summary, by combining theoretical considerations and simulations we demonstrated the properties and limiting cases of DeBias in decoupling paired matching variables from orientation data.

Local alignment of vimentin and microtubule filaments

We applied DeBias to investigate the degree of alignment between vimentin intermediate filaments and microtubules in polarized cells. Recent work using genome-edited Retinal Pigment Epithelial (RPE) cells with endogenous levels of fluorescently tagged vimentin and α-tubulin showed that vimentin provides a structural template for microtubule growth, and through this maintains cell polarity (Gan et al., 2016). The effect was strongest in cells at the wound front where both vimentin and microtubule networks collaboratively align with the direction of migration (Figure 3A–C). An open question remains as to how much of this alignment is caused by the extrinsic directional bias associated with the collective migration of cells into the wound as opposed to a local interaction between the two cytoskeleton systems.

Figure 3. — (A) RPE cells expressing TagRFP α-tubulin (MT) and mEmerald-vimentin (VIM) at endogenous levels during a wound healing assay. Scale bar 100 μm. (B) Zooming in on cells in different locations in respect to the wound edge. Right-most column, computer segmented filaments of both cytoskeleton systems. Top row, cells located at the wound edge (‘Front’); Middle row, cells located 2–3 rows away from the wound edge (’Back’); Bottom row, cells located at the wound edge partially with shRNA knock-down of vimentin. Scale bar 10 μm. (C) Orientation distribution of microtubules (left column) and vimentin filaments (middle columns) for the cells outlined in B. Vimentin-microtubule alignment distributions (right column). (D) Scatterplot of GI versus LI derived by DeBias. The GI is significantly higher in WT cells at the wound edge (‘Front’, n = 12) compared to cells inside the monolayer (‘Back’, n = 12, fold change = 4.8, p-value < 0.0001); or compared to vimentin-depleted cells at the wound edge (‘VIM KD’, n = 7, fold change = 5.2, p-value < 0.0001). Statistics based on Wilcoxon rank-sum test. All DeBias analyses performed with K = 15. (E) Polarization of RPE cells at the wound edge at different time points after scratching. Scale bar 10 μm. (F) Representative experiment showing the progression of LI and GI as a function of time after scratching (see color code). Correlation between GI and time ~0.90, p-value < 10⁻³⁰ (n time points = 83). N = 5 independent experiments were conducted of which four experiments showed a gradual increase in GI with increased observed polarity. All DeBias analyses performed with K = 15.

**DOI:** http://dx.doi.org/10.7554/eLife.22323.006

Figure 3—figure supplement 1. — (A) RPE cells expressing TagRFP α-tubulin (MT) and mEmerald-vimentin (VIM) at endogenous levels during a wound healing assay. Scale bar 100 μm. (B) Zooming in on cells in different locations in respect to the wound edge. Right-most column, computer segmented filaments of both cytoskeleton systems. Top row, cells located at the wound edge (‘Front’); Middle row, cells located 2–3 rows away from the wound edge (’Back’); Bottom row, cells located at the wound edge partially with shRNA knock-down of vimentin. Scale bar 10 μm. (C) Orientation distribution of microtubules (left column) and vimentin filaments (middle columns) for the cells outlined in B. Vimentin-microtubule alignment distributions (right column). (D) Scatterplot of GI versus LI derived by DeBias. The GI is significantly higher in WT cells at the wound edge (‘Front’, n = 12) compared to cells inside the monolayer (‘Back’, n = 12, fold change = 4.8, p-value < 0.0001); or compared to vimentin-depleted cells at the wound edge (‘VIM KD’, n = 7, fold change = 5.2, p-value < 0.0001). Statistics based on Wilcoxon rank-sum test. All DeBias analyses performed with K = 15. (E) Polarization of RPE cells at the wound edge at different time points after scratching. Scale bar 10 μm. (F) Representative experiment showing the progression of LI and GI as a function of time after scratching (see color code). Correlation between GI and time ~0.90, p-value < 10⁻³⁰ (n time points = 83). N = 5 independent experiments were conducted of which four experiments showed a gradual increase in GI with increased observed polarity. All DeBias analyses performed with K = 15.

**DOI:** http://dx.doi.org/10.7554/eLife.22323.006

Analysis of the GI and LI revealed that most of the discrepancy in vimentin-microtubule alignment originated from a shift in the global bias (Figure 3D), suggesting that the local interaction between the two cytoskeletons is unaffected by the cell position or knock-down of vimentin. Instead, the reduced alignment between the two cytoskeletons is caused by a loss of cell polarity in cells away from the wound edge, probably associated with the reduced geometric constraints imposed by the wound edge. In a similar fashion, reduction of vimentin expression relaxes global cell polarity cues that tend to impose alignment.

To corroborate our conclusion that the global state of cell polarity is encoded by the GI, we performed a live cell imaging experiment, in which single cells at the edge of a freshly inflicted wound in a RPE monolayer were monitored for 80 min after scratching. DeBias was applied to calculate a time sequence of LI and GI. Cells at the wound edge tended to gradually increase their polarity and started migrating during the imaging time frame (Figure 3E, Video 1). Accordingly, the GI increased over time (Figure 3F). We also used this data set to verify that the reported shifts in GI are independent of the number of data points in and the binning of the distribution (Figure 3—figure supplement 1). This demonstrates the capacity of DeBias to distinguish fundamentally different effectors of cytoskeleton alignment.

Video 1. Polarization of RPE cells at the monolayer edge over time.

Download video file^{(2.3MB, mp4)}

DOI: 10.7554/eLife.22323.008

Open in a new tab

Please note several occasions (44 and 46 min, 65 and 67 min, 73 and 75 min) of focus drift followed by automated focus correction.

DOI: http://dx.doi.org/10.7554/eLife.22323.008

Identifying molecular factors in alignment of cell velocity and mechanical forces during collective cell migration

Collective cell migration requires intercellular coordination, achieved by mechanical and chemical information transfer between cells. One mechanism for cell-cell communication is plithotaxis, the tendency of individual cells to align their velocity with the maximum principal stress orientation (He et al., 2015; Tambe et al., 2011; Zaritsky et al., 2015). As in the previous example of vimentin and microtubule interaction, much of this alignment is associated with a general directionality of velocity and stress field parallel to the axis of collective migration (Zaritsky et al., 2015).

Using a wound healing assay, Das et al. (Das et al., 2015) screened 11 tight-junction proteins to identify pathways that promote motion-stress alignment (Figure 4A). Knockdown of Merlin, Claudin1, Patj and Angiomotin (Amot) reduced the alignment of velocity direction and stress orientation (Das et al., 2015). Further inspection of these hits showed that the stress orientation remained stable upon depletion of these proteins, but the velocity direction distribution was much less biased towards the wound edge (Zaritsky et al., 2015). Here, we further analyze this data to demonstrate the capacity of DeBias to pinpoint tight-junction proteins that alter specifically the global or local components that induce velocity-stress alignment.

Figure 4. — (A) Assay illustration. Wound healing assay of MDCK cells. Particle image velocimetry was applied to calculate velocity vectors (red) and monolayer stress microscopy to reconstruct stresses (blue). Alignment of velocity direction and stress orientation was assessed. (B) Mini-screen that includes depletion of 11 tight-junction proteins and Merlin. Data from (Das et al., 2015), where effective depletion was demonstrated. Shown are GI and LI values; molecular conditions are sorted by the LI values (control is ranked sixth, pointed by the black arrow). Each dot was calculated from accumulation of 3 independent experiments (N = 925–1539 for each condition). Three groups of tight junction proteins are highlighted by dashed rectangles: red - low LI and GI compared to control, purple – different GI but similar LI, orange – high LI. All DeBias analyses were performed with K = 15. (C) Pair-wise statistical significance for LI values. P-values were calculated via a permutation-test on the velocity and stress data (Materials and methods). Red – no significant (p ≥ 0.05) change in LI values, blue – highly significant (< 0.01) change in LI values. (D) Highlighted hits: Claudin1, Claudin2, Merlin and ZO1. Top: Distribution of stress orientation (top), velocity direction (middle) and motion-stress alignment (bottom). Bottom: table of mean alignment angle, LI and GI. Claudin1 and Claudin2 have similar mechanisms for transforming stress to aligned velocity. ZO1 depletion enhances alignment of velocity by stress.

**DOI:** http://dx.doi.org/10.7554/eLife.22323.009

By distinguishing GI and LI we generated a refined annotation of the functional alteration that depletion of these tight-junction components caused in mechanical coordination of collectively migrating cells (Figure 4B–C). First, we confirmed that the four hits reported by (Das et al., 2015) massively reduced the GI, consistent with the notion that absence of these proteins diminished the general alignment of velocity to the direction induced by the migrating sheet (Figure 4B, red dashed rectangle). Merlin, Patj and Angiomotin reduced the LI to values close to 0, suggesting that the local dependency between stress orientation and velocity direction was lost. Depletion of Claudin1, or of its paralog Claudin2, which was not reported as a hit in the Das et al. screen, reduced the LI to a lesser extent, similarly for both proteins, but had very different effects on the GI (Figure 4B, purple dashed rectangle). This suggested that the analysis by (Das et al., 2015) missed effects that do not alter the general alignment of stress or motion, and implied the existence of a local velocity-stress alignment mechanism that does not immediately change the collective aspect of cell migration.

When assessing the marginal distributions of stress orientation and velocity direction we observed that depletion of Claudin1 reduced the organization of stress orientations and of velocity direction, while Claudin2 reduced only the latter. The LI values of depletion conditions were similar and lower than control (Figure 4D). Merlin depletion is characterized by an even lower LI and marginal distributions with aligned stress orientation and almost uniform alignment distribution (Figure 4D). Since we think that aligned stress is transformed to aligned motion (He et al., 2015; Zaritsky et al., 2015), we propose that in this data the LI quantifies the effect of local mechanical communication on parallelizing the velocity among neighboring cells. Accordingly, stress-motion transmission mechanism is impaired to a similar extent by reduction of Claudin1 and Claudin2, albeit less than by reduction of Merlin.

Using LI as a discriminative measure also allowed us to identify a group of new hits (Figure 4C). ZO1, ZO2, ZO3, Occludin and ZONAB are all characterized by small reductions in GI but a substantial increase in LI relative to control (Figure 4B, orange dashed rectangle). A quantitative comparison of control and ZO1 depleted cells provides a good example for the type of information DeBias can extract: both conditions yield similar observed alignment distributions with nearly identical means, yet ZO1 depletion has an 83% increase in LI and 8% reduction in GI, i.e., the mild loss in the marginal alignment of velocity or stress is compensated by enhanced local alignment in ZO1 depleted cells (Figure 4D). This might point to a mechanism, in which stress orientation is reduced by tight-junction depletion, but enhanced by transmission of stress orientation into motion orientation, leading to comparable alignment. Notably, all paralogs, ZO1, ZO2 and ZO3 fall into the same cluster of elevated LI and slightly reduced GI relative to control experiments. This phenotype is in agreement with the outcomes of a screen that found ZO1 depletion to increase both motility and cell-junctional forces (Bazellières et al., 2015).

Using DeBias to assess protein-protein co-localization

Protein-protein co-localization is another ubiquitous example of correlating spatially matched variables in cell biology. To quantify GI and LI for protein-protein co-localization, we normalized each channel to intensity values between 0 and 1. The 'alignment' θ_i of matched observations (x_i,y_i) was replaced by the difference in normalized fluorescent intensities x_i - y_i (Materials and methods). Simulations demonstrated that stronger interactions in co-localization are translated to larger LI values and validated that the choice of K (number of histogram bins) and N (number of observations) marginally affect GI and LI (Appendix 3, Appendix 3—figure 1). While LI could serve as a measure to assess co-localization, the interpretation of GI is less intuitive. In the following, we present two examples of applying DeBias for protein-protein co-localization, and demonstrate the type of information that can be extracted from the combined GI and LI analysis.

PKC FRET: a simple example of pixel-based protein-protein co-localization

To test the potential of DeBias in quantification of pixel-based co-localization, we analyzed the effect of ﬂuorescence resonance energy transfer (FRET) in the C kinase activity reporter (CKAR), which reversibly responds to PKC activation and deactivation (Violin et al., 2003). Reduced PKC activity leads to energy transfer from CFP to YFP_CFP, resulting in reduced FRET ratio ( $\frac{C F P}{Y F P_{C F P}}$ ) (Figure 5A). Assuming that the CFP signal is dominant (CFP > YFP_CFP), this alteration should reduce the difference between the CFP and YFP_CFP channels, which would in DeBias yield an increased LI (Figure 5A, Materials and methods).

Figure 5. — (A) PKC inhibition is expected to lead to elevated LI for cells with dominant CFP signal (CFP > YFP_CFP). Upon FRET, CFP signal is locally transferred to YFP_CFP, reducing the difference in normalized intensity between the two channels, which increases LI. (B) hTERT-RPE-1 cells imaged with the CKAR reporter. A cell before (top) and after (bottom) PKC inhibition. Region of interest was manually annotated and the ratio $\frac{CFP}{{YFP}_{CFP}}$ was calculated within it. (C) Pixel distribution of differences in normalized fluorescent intensities CFP^norm - YFP^norm_CFP before and after PKC inhibition for the cell from panel B. PKC inhibition shifted the average absolute difference from 0.054 to 0.042 and the LI from 2.25 to 2.84. (**D–F**) PKC inhibition experiment. N = 8 cells. Statistics based on Wilcoxon sign-rank test. (D) The FRET ratio $\frac{CFP}{{YFP}_{CFP}}$ decreased (p-value < 0.008), (E) LI increased (p-value < 0.008), and (F) GI decreased (p-value < 0.008) after PKC inhibition. (G) Marginal distribution of CFP and YFP_CFP before (top) and after (bottom) PKC inhibition. (H) Control experiment. N = 7 cells. hTERT-RPE-1 cells expressing cytoplasmic GFP and mCherry before and after PKC inhibition. No significant change in LI or GI was observed. All DeBias analyses were performed with K = 19.

**DOI:** http://dx.doi.org/10.7554/eLife.22323.010

To test this we labeled hTERT-RPE-1 cells with CKAR and imaged CFP and YFP_CFP channels before and after specific inhibition of PKC with HA-100 dihydrochloride (Figure 5B, Materials and methods), leading to reduced pixel differences in their normalized fluorescent intensities (Figure 5C). As expected, the $\frac{C F P}{Y F P_{C F P}}$ ratio decreased (Figure 5D), LI values increased (Figure 5E) and seemed more sensitive to the FRET. Surprisingly, DeBias indicated a shift in the GI values (Figure 5F), reflected in a more homogeneous marginal distribution of both channels before inhibition (Figure 5G). Control experiments with cytoplasmic GFP and mCherry expression did not show the shifts observed in LI or GI (Figure 5H). Thus, we conclude that PKC inhibition changes the localization of PKC towards a more random spatial distribution. One possible mechanism for this behavior is that deactivation releases the kinase from the substrate. This example illustrates DeBias’ capabilities to simultaneously quantify changes in local interaction and global bias in pixel-based co-localization.

Inferring co-localization of molecular cargo and clathrin-coated pits during endocytosis

Clathrin-mediated endocytosis (CME) is the major pathway for entry of cargo receptors into eukaryotic cells. Cargo receptor composition plays an important role in regulating clathrin-coated pit (CCP) initiation and maturation (Liu et al., 2010; Loerke et al., 2009). The clustering of transferrin receptors (TfnR), the classic cargo receptor used to study CME, promotes CCP initiation, in concert with clathrin and adaptor proteins (Liu et al., 2010). Recent evidence suggests a diversity of mechanisms regulating endocytic trafficking, including cross-talks between signaling receptors and components of the endocytic machinery (Di Fiore and von Zastrow, 2014). For example, the oncogenic protein kinase Akt has been shown to play an important role in mediating CME in cancer cells (Liberali et al., 2014; Reis et al., 2015), but not in normal epithelial cells (Reis et al., 2015). Here we tested how the decoupling by DeBias of global and local contributions to the overall intensity alignment of clathrin and TfnR, can be used to simultaneously investigate co-localization and predict CCP dynamics, using fixed cell fluorescence imaging.

We used fluorescence images of fixed non‐small lung cancer cells (H1299) or untransformed human retinal pigment epithelial cells (ARPE-19) expressing clathrin light chain A fused to eGFP (eGFP-CLCa) as a CCP marker (Figure 6A–B). Cells were either treated with DMSO or with an AKT inhibitor (Akt inhibitor X, ‘ten’), and imaged by Total Internal Reﬂection Fluorescence Microscopy (TIRFM). CCPs were reported in the eGFP-CLCa channel and TfnR was visualized by immunofluorescence in a second channel (Materials and methods). For single cells, the location of fluorescent signals of CLCa and TfnR were recorded and the data were pooled and processed by DeBias (Materials and methods).

Figure 6. — (A) H1299 cells expressing CLCa and TfnR ligands. Top row, representative WT cell (TfnR ligand, GI = 4.6, LI = 1.6). Bottom row, representative AKT-inhibited cell (TfnR ligand, GI = 6.0, LI = 0.3). Scale bar 10 μm. (B) ARPE-19 cells. Top row, representative WT cell (TfnR ligand, GI = 4.3, LI = 0.8). Bottom row, representative AKT-inhibited cell (TfnR ligand, GI = 4.6, LI = 1.6). (**C–D**) LI and GI of CLCa-TfnR co-localization for Ctrl (red) and Akt^inh. cells (cyan). Every data point represents the LI and GI values for a single cell. Statistical analyses performed with the Wilcoxon rank-sum test. All DeBias analyses were performed with K = 40. (C) H1299: N number of cells Ctrl = 30, Akt^inh. = 30; number of CCPs per cell: Ctrl = 455.5, Akt^inh. = 179.5. GI p-value < 0.0001, LI p-value < 0.0001. (D) ARPE-19: N number of cells Ctrl = 30, Akt^inh. = 20; number of CCPs per cell: Ctrl = 958.8, Akt^inh. = 1138.2. GI p-value < 0.002, LI p-value < 0.008. (**E–F**) Receiver Operating Characteristic (ROC) curves showing the true positive rates as a function of false-positive rates for single cell classification, higher curves correspond to enhanced discrimination (Materials and methods). Black – LI, orange – (GI,LI). Statistics via permutation test (Materials and methods). (E) H1299 AUC: (GI,LI) = 0.96 versus LI = 0.88, p-value ≤ 0.003. (F) ARPE-19 AUC: (GI,LI) = 0.83 versus LI = 0.72, p-value ≤ 0.048. (**G–H**) Joint distributions of CLCa (x-axis) and TfnR (y-axis) for H1299 (G) and ARPE-19 (H) cells. (**I–J**) Marginal distributions of CLCa (left) and TfnR (right) for H1299 (I) and ARPE-19 (J) cells. (**K–L**) Combined CCP lifetime distribution for 50 Ctrl (red) and Akt^inh. (cyan) cells. Statistics with Wilcoxon rank-sum test (Materials and methods). (K) H1299: p-value < 0.006 (mean EMD: Ctrl = 29.3 versus Akt^inh. = 43.6); number of cells: 50 (Ctrl), 11 (Akt^inh.). (L) H1299: p-value n.s. (mean EMD: Ctrl = 36.1 versus Akt^inh. = 38.0); number of cells: 12 (Ctrl), 12 (Akt^inh.). (**M–N**) Percentage of TfnR uptake: Ctrl versus Akt^inh. (whiskers - standard deviation). Statistics via two-tailed Student’s t-test. (M) H1299: p-value < 0.005; N = 3 independent experiments. (N) ARPE-19: p-value n.s.; N = 3 independent experiments.

**DOI:** http://dx.doi.org/10.7554/eLife.22323.011

Figure 6—figure supplement 1. — (A) H1299 cells expressing CLCa and TfnR ligands. Top row, representative WT cell (TfnR ligand, GI = 4.6, LI = 1.6). Bottom row, representative AKT-inhibited cell (TfnR ligand, GI = 6.0, LI = 0.3). Scale bar 10 μm. (B) ARPE-19 cells. Top row, representative WT cell (TfnR ligand, GI = 4.3, LI = 0.8). Bottom row, representative AKT-inhibited cell (TfnR ligand, GI = 4.6, LI = 1.6). (**C–D**) LI and GI of CLCa-TfnR co-localization for Ctrl (red) and Akt^inh. cells (cyan). Every data point represents the LI and GI values for a single cell. Statistical analyses performed with the Wilcoxon rank-sum test. All DeBias analyses were performed with K = 40. (C) H1299: N number of cells Ctrl = 30, Akt^inh. = 30; number of CCPs per cell: Ctrl = 455.5, Akt^inh. = 179.5. GI p-value < 0.0001, LI p-value < 0.0001. (D) ARPE-19: N number of cells Ctrl = 30, Akt^inh. = 20; number of CCPs per cell: Ctrl = 958.8, Akt^inh. = 1138.2. GI p-value < 0.002, LI p-value < 0.008. (**E–F**) Receiver Operating Characteristic (ROC) curves showing the true positive rates as a function of false-positive rates for single cell classification, higher curves correspond to enhanced discrimination (Materials and methods). Black – LI, orange – (GI,LI). Statistics via permutation test (Materials and methods). (E) H1299 AUC: (GI,LI) = 0.96 versus LI = 0.88, p-value ≤ 0.003. (F) ARPE-19 AUC: (GI,LI) = 0.83 versus LI = 0.72, p-value ≤ 0.048. (**G–H**) Joint distributions of CLCa (x-axis) and TfnR (y-axis) for H1299 (G) and ARPE-19 (H) cells. (**I–J**) Marginal distributions of CLCa (left) and TfnR (right) for H1299 (I) and ARPE-19 (J) cells. (**K–L**) Combined CCP lifetime distribution for 50 Ctrl (red) and Akt^inh. (cyan) cells. Statistics with Wilcoxon rank-sum test (Materials and methods). (K) H1299: p-value < 0.006 (mean EMD: Ctrl = 29.3 versus Akt^inh. = 43.6); number of cells: 50 (Ctrl), 11 (Akt^inh.). (L) H1299: p-value n.s. (mean EMD: Ctrl = 36.1 versus Akt^inh. = 38.0); number of cells: 12 (Ctrl), 12 (Akt^inh.). (**M–N**) Percentage of TfnR uptake: Ctrl versus Akt^inh. (whiskers - standard deviation). Statistics via two-tailed Student’s t-test. (M) H1299: p-value < 0.005; N = 3 independent experiments. (N) ARPE-19: p-value n.s.; N = 3 independent experiments.

**DOI:** http://dx.doi.org/10.7554/eLife.22323.011

LI values, indicative of the co-localization between TfnR and CLCa, were significantly lower in Akt-inhibited H1299 compared to control cells (Figure 6C). In contrast, Akt inhibition increased the LI values in ARPE-19 cells but this effect was less prominent (Figure 6D). Akt inhibition resulted in increased GI values for both cell lines, to a much greater degree in H1299 cells (Figure 6C–D). To test whether GI enhances the ability to distinguish between control and Akt-inhibited cells, we applied Linear Discriminative Analysis (LDA) classification to calculate the true positive rate versus the false positive rate for LI alone (black lines, Figure 6E–F) or the pair (GI, LI), (orange lines, Figure 6E–F). The area under these curves (AUC) provided a direct measure of the ability of each method to accurately classify the experimental condition of single cells. AUC for the (GI, LI) representation was superior to using LI alone for both cell lines (H1299: 0.96 versus 0.88, Figure 6E; ARPE-19: 0.83 versus 0.72, Figure 6F). Similar benefit in discrimination was achieved when using the GI to complement Pearson’s correlation as an alternative to LI for measuring local interaction (Figure 6—figure supplement 1A–F). Such improved discrimination is indicative of distinct molecular processes that were altered upon Akt inhibition. We also used this data set to experimentally validate the independence of GI and LI of the number of observations (N, Figure 6—figure supplement 1G,H) and the choice of the number of histogram bins (K, Figure 6—figure supplement 1I,J).

To interpret the increased GI values for Akt-inhibited cells, we examined the joint and marginal distributions of CLCa and TfnR. Upon Akt-inhibition, the joint distributions were more biased toward regions of low TfnR intensities (Figure 6G–H). This was clearly observed in the marginal distributions (Figure 6I–J). Hence, although the CLCa distribution appeared not to change upon AKT inhibition, the frequency of CCPs with fewer TfnRs increased. Given the positive relation between TfnR cargo quantities and CCPs maturation (Loerke et al., 2009), we wondered whether the increased frequencies of CCPs containing less TfnR might alter CCPs dynamics. Indeed, live-imaging of H1299 cells showed a higher frequency of CCPs with shorter lifetimes upon Akt-inhibition, which was not seen in normal ARPE-19 cells (Figure 6K–L, Materials and methods). It has previously been shown that Akt inhibition reduces the rate of TfnR CME uptake in H1299 cells, but not in ARPE-19 cells ([Reis et al., 2015], see also Figure 6M–N); therefore, these findings indicate that the reduced levels of TfnR in CCPs upon Akt inhibition results in an increase in short-lived, most likely abortive events, and hence a decrease in CME efficiency.

Altogether, DeBias could distinguish alterations in the regulation of CME between two cell types. The decoupling to GI and LI indicated that upon Akt inhibition, both untransformed and cancer cells showed a global bias towards CCPs with lowered TfnR intensities. This conclusion could not have been reached by considering only the LI, which increased for normal and decreased for transformed cell lines.

Discussion

We introduce DeBias as a new method to assess global bias and local interactions between coupled cellular variables. Although the method is generic, we show here specific examples of DeBias analysis in co-orientation and co-localization studies. The source code is available, https://github.com/DanuserLab/DeBias, as well as via a web-based platform, https://debias.biohpc.swmed.edu. The website also provides detailed instructions for the operation of the user interface.

DeBias defines a generalizable framework for eliminating confounding factors in the analysis of interacting variables. Our examples demonstrate that the distinction of global and local contributions to the level of variable coupling eliminates much of the global confounder bias in the analysis of more direct interactions and can unearth in the form of global bias mechanisms that are missed by a single parameter analysis (Figures 1–2). In the example of vimentin-microtubule alignment (Figure 3), the significant decrease in GI as opposed to the LI upon partial vimentin knock-down indicated that the reduction in alignment between the two cytoskeleton systems is associated with a reduction of cell polarity as the global cue. In the example of stress-velocity alignment (Figure 4), depletion of some tight junction proteins increased LI, suggestive of enhanced local stress-motion transmission; knock-down of others decreased GI indicating an overall impaired alignment of velocity in the direction of wound closure. In the example of FRET experiments (Figure 5), PKC inhibition lead to increased LI, validating the FRET response, while a reduced GI was indicative of weaker interactions of the inactivated kinase with its substrates. In the example of Tfn receptor (TfnR) co-localization with CCPs during CME (Figure 6), the increased GI in response to Akt inhibition related to a higher fraction of CCPs containing less TfnR. Moreover, Akt inhibition induced opposite shifts in LI for normal and cancer cells, reflecting differential alterations in co-localization between cell types. Thus, DeBias provided insight into the regulation of cargo-pit association by kinase activity that depended on a proper deconvolution of local and global effects on the interaction of the clathrin and receptor signal. We then validated our conclusions by further analyses of the marginal distributions, live-imaging and uptake assays (Figure 6). Overall, the four applications shown in this work first emphasize the general need for a confounder analysis when dealing with coupled biological variables and second indicate that the global bias may be linked to mechanistically meaningful properties of the studied system. These properties were either ignored or eliminated by previous methods, and now can be assessed directly by DeBias.

Other approaches have been used to address global confounders for assessment of local interactions between biological variables. For the specific example of object-based co-localization, Helmuth et al. simulated the spatial distribution of objects in the absence of local interactions to calibrate co-localization measurement (Helmuth et al., 2010). Other methods mostly used second-order spatial statistics on distances between neighbor points to exclude confounders for better co-localization sensitivity (reviewed in Lagache et al., 2015). Importantly, we show applications of DeBias on co-localization that do not require initial object detection (Figure 5). While the phenomenon of confounder bias is independent of object- versus pixel-based co-localization, we distinguish the peculiarities of the two scenarios in Appendix 4.

An important and more general approach to revealing local interactions masked by global biases was recently proposed by (Krishnaswamy et al., 2014), using applications to single cell mass cytometry data as examples. The authors developed a measure referred to as conditional-Density Resampled Estimate of Mutual Information (DREMI) to quantify the influence of a protein X on protein Y based on the conditional probability P(Y|X). DREMI takes advantage of the abundant mass cytometry data to equally weigh data at different intervals along the range of X values using > 10,000 cells per experimental condition. This approach is less reliable when limited data is available, because of the low confidence in the conditional probability of observations with low data abundance. Thus, DREMI is not well suited for image data, which typically has fewer observations.

DeBias estimates GI and LI assuming a constant global bias and local interaction for all observations. Moreover, its quantification power is relative. For example, a two-fold increase in the direct interaction of two variables would not necessarily result in a two-fold increase in LI. Another limitation is the absence of complete orthogonality of GI and LI values (Figure 2E–F, Appendix 3—figure 1), which complicates the interpretation of GI and LI in certain scenarios. These three limitations apply to all current approaches for quantifying interactions between coupled variables. The main conceptual advance DeBias seeks to make relates to the explicit integration of confounding factors in the analysis of coupled variables, which implies an expansion of the coupling metric from a scalar to a two-dimensional score. A forth limitation in the current implementation of DeBias is the linear normalization of multiple intensity variables in co-localization applications. Future versions may include non-linear normalization methods, although such normalization is usually highly specific to a particular data set. Last, the mechanism encoded by the GI is not always obvious. Sometimes it requires additional experiments to unveil the information contained by the GI. For example, we combined fixed cell dual-color imaging with live-imaging and uptake assays to show that shifts in the GI encode a shift in the relative populations of short- and long-lived CCPs between conditions (Figure 6). Despite some of the discussed complexities, DeBias offers a simple means to quantify and interpret mechanisms that alter confounders in the coupling of two variables and to largely exclude such global biases from the quantification of direct interactions.

Materials and methods

DeBias procedure

The DeBias procedure is depicted in Figure 2A. The marginal distributions X and Y are estimated from the experimental data, $\forall i, x_{i}, y_{i} \in [0, 90 °]$ . The experimentally observed alignment distribution (denoted observed) is calculated from the alignment angles θ_i of matched (x_i,y_i) paired variables, for all i.

θ_{i} = {\begin{matrix} | x_{i} - y_{i} | & | x_{i} - y_{i} | \leq 90 \\ 180 - | x_{i} - y_{i} | & | x_{i} - y_{i} | > 90 \end{matrix}

The resampled alignment distribution (denoted resampled) is constructed by independent sampling from X and Y. N random observations (where N = |X| is the original sample size) from X and Y are independently sampled with replacement, arbitrarily matched and their alignment angles calculated to define the resampled alignment. This type of resampling precludes the local dependencies between the originally matched (x_i,y_i) paired variables.

The uniform alignment distribution (denoted uniform) is used as a baseline for comparison between distributions. This is the expected alignment distribution when neither global bias (reflected by uniform X, Y distributions) nor local interactions exist. The Earth Mover's Distance (EMD) (Peleg et al., 1989; Rubner et al., 2000) was used as a distance metric between alignment distributions. The EMD for two distributions, A and B, is defined as follows:

$E M D (A, B) = \sum_{j = 1, \dots, K} | \sum_{j = 1, \dots, i} a_{j} - \sum_{j = 1, \dots, i} b_{j} |$ , where $a_{j}$ and $b_{j}$ are the frequencies of observations in bins $j$ of the histograms of distributions A and B, respectively, each containing K bins.

The global index (GI) is defined as the EMD between the uniform distribution and the resampled alignment:

GI = EMD(uniform,resampled)

The local index is determined by subtraction of the global index from the EMD between the uniform distribution and the experimentally observed alignment distribution:

LI = EMD(uniform,observed) - global index

DeBias for protein-protein co-localization

The following adjustments to this procedure are implemented to allow DeBias to quantify protein-protein co-localization:

Levels of fluorescence are not comparable between different channels due to different expression levels and imaging parameters. Thus, each channel is normalized to [0,1] by the fifth and 95th percentiles of the corresponding fluorescence intensities to achieve a stable and robust distribution.
The alignment angle θ_i of the matched observation (x_i,y_i) is calculated as the difference in normalized fluorescence intensities x_i - y_i and the alignment distribution is thus defined on the interval [−1,1].

The number of histogram bins for the alignment distributions (observed, resampled and uniform) was K = 15 for orientational data, 19 for PKC and 40 for CME co-localization data.

Automated selection of number of histogram bins, K

The Freedman-Diaconis rule (Freedman and Diaconis, 1981) was used to automate the selection of histogram bin width: $b i n s i z e = \frac{Q_{3} (x) - Q_{1} (x)}{\sqrt[3]{n}}$ , where Q_i is the i^th quartile of the empirical distribution x and n is the number of data points contained. A function to calculate K is included in our publicly available source code and this functionality was also integrated to the web-server implementation. Importantly, GI and LI across experiments can be compared only when evaluated with the same K value and this is enforced by the web-server. It is the responsibility of the source-code user to validate using the same K values when comparing different experimental conditions.

Simulating synthetic data

Simulating co-alignment data

Let us define $X$ and $Y$ as the angular probability distribution functions, with angle instances denoted $x_{i}$ and $y_{i}$ , respectively. When simulating local relations, for each pair of angles, one of the angles will be shifted towards the other by $ζ$ degrees (Figure 1B), unless $| x_{i} - y_{i} | < ζ$ , in which case it will be shifted by $| x_{i} - y_{i} |$ degrees. The angle to be shifted (either $x_{i}$ or $y_{i}$ ) is chosen by a Bernoulli random variable, $p$ , with probability 0.5. The observed angles for pixel $i$ will therefore be

x_{i}^{'} = {\begin{matrix} x_{i} & p = 1 \\ max (x_{i} - ζ, y_{i}) & y_{i} \leq x_{i} \land p = 0 \\ min (x_{i} + ζ, y_{i}) & y_{i} > x_{i} \land p = 0 \end{matrix}

and

y_{i}^{'} = {\begin{matrix} y_{i} & p = 0 \\ max (y_{i} - ζ, x_{i}) & x_{i} \leq y_{i} \land p = 1 \\ min (y_{i} + ζ, x_{i}) & x_{i} > y_{i} \land p = 1 \end{matrix}

The alignment of angles at pixel $i$ will be:

θ_{i} = {\begin{matrix} | x_{i}^{'} - y_{i}^{'} | & | x_{i}^{'} - y_{i}^{'} | \leq 90 \\ 180 - | x_{i}^{'} - y_{i}^{'} | & | x_{i}^{'} - y_{i}^{'} | > 90 \end{matrix}

For example, for our simulations we choose $X, Y$ to be truncated normal distributions on $(- 90, 90)$ with $μ = 0$ and varying values of $σ$ .

$ζ$ is modeled in two ways: either as a constant value, e.g. $ζ = 5 °$ (Appendix 2—figures 1–3), or as a varying value dependent on $| x_{i} - y_{i} |$ (Figures 1–2). For the latter, $ζ$ is defined as a fraction $0 < α < 1$ from $| x_{i} - y_{i} |$ for each pair of observations; namely, $ζ_{i} = α | x_{i} - y_{i} |$ (see Figure 1B). Note, that the observed marginal distributions X’, Y’ may be slightly different from X, Y.

Simulating co-localization data

Let us define $X$ as a probability distribution function, with instances denoted $x_{i}$ . Local interactions were simulated as $y_{i} = x_{i} ζ_{i}$ , where $ζ_{i}$ is an instance of a probability distribution $Z$ . For our simulations we chose $X$ to be a truncated normal distribution on $[0, 1]$ with μ_x = 0.5 and $Z$ to be a normal distribution with μ_ζ = 1. This model of interaction assumes on average a one-to-one interaction between X and Y, deviation of $ζ_{i}$ from one implies reduced interaction. $y_{i}$ samples were also truncated to [0,1]. This ensures that $\forall i, 0 \leq y_{i}, x_{i} \leq 1$ and accordingly, $\forall i, - 1 \leq x_{i} - y_{i}, \leq 1$ , making unnecessary the normalization step in DeBias co-localization calculation.

When simulating scenarios where only sub-groups of the observations undergo interactions, we sampled the none-interacting observations $y_{i}$ from Y the truncated normal distribution on $(0, 1)$ , with μ_y = 0.5 and σ_y = σ_x (same as X).

Vimentin and microtubule filaments experiments and analysis

Cell model

hTERT-RPE-1 cells (ATCC, RRID: CVCL_4388) were TALEN-genome edited to endogenously label vimentin with mEmerald and α-tubulin with mTagRFPt, and validated for protein expression levels (Gan et al., 2016). Cells were stably transfected with shRNA against vimentin to knock down vimentin and the knockdown efficiency was validated as ~75% (Gan et al., 2016). The cell line has been tested negative for mycoplasma contamination.

Fixed cell imaging

hTERT-RPE-1 mEmerald-vimentin/mTagRFPt-α-tubulin cells expressing shRNA-VIM or scrambled control shRNA Scr were plated into MatTek (Ashland, MA) 35 mm glass-bottom dishes (P35G-0–20 C) coated with 5 µg/mL fibronectin. Cells were incubated overnight to allow them to adhere and form monolayers. Monolayers were scratched with a pipette tip to form a wound. Cells were incubated for 90 min, washed briefly and fixed with methanol at −20°C for 15 min. Cells were imaged at the wound edge (denoted ‘front’ cells), and at 2–3 cell rows from the wound edge (denoted ‘back’ cells, only for control condition). Images were acquired using a Nikon Eclipse Ti microscope, equipped with a Nikon Plan Apo Lambda 100x/1.45 N.A. objective. Images were recorded with a Hamamatsu ORCA Flash 4.0 with 6.45 μm pixel size (physical pixel size: 0.0645 × 0.0645 μm). All microscope components were controlled by Micro-manager software.

Live cell imaging

hTERT-RPE-1 mEmerald-vimentin/mTagRFPt-α-tubulin cells expressing scrambled control shRNA Scr were plated into MatTek (Ashland, MA) 35 mm glass-bottom dishes (P35G-0–20 C) coated with 5 µg/mL fibronectin. Cells were incubated overnight to allow them to adhere and form monolayers. Monolayers were scratched with a pipette tip to form a wound. Imaging started 30 min after scratching with an Andor Revolution XD spinning disk microscope mounted on a Nikon Eclipse Ti stand equipped with Perfect Focus, a Nikon Apo 60 × 1.49 N.A. oil objective and a 1.5x optovar for further magnification. Images were recorded with an Andor IXON Ultra EMCCD camera with 16 μm pixel size (physical pixel size: 0.16 × 0.16 μm). Lasers with 488 nm and 561 nm light emission were used for exciting mEmerald and mTagRFPt, respectively. The output powers of the 488 nm and 561 nm lasers were set to 10% and 20% of the maximal output (37 mW and 23 mW, respectively). The exposure time was 300 ms per frame for both channels and images were collected at a frame rate of 1 frame per minute. During acquisition, cells were kept in an onboard environmental control chamber. All microscope components were controlled by Metamorph software.

Filament extraction and spatial matching

We applied the filament reconstruction algorithm reported in (Gan et al., 2016). Briefly, multi-scale steerable filtering is used to enhance curvilinear image structures, centerlines of candidate filament fragments are detected, clustered to high and low confidence sets and iterative graph matching is applied to connect fragments into complete filaments. Each filament is represented by an ordered chain of pixels and the local filament orientation derived from the steerable filter response. Spatial matching was performed as follows: each pixel belonging to a filament detected in the MT channel is recorded to the closest pixel that belongs to a filament in the VIM channel. If the distance between the two pixels is less than 20 pixels, then the pair of VIM and MT orientations at this pixel is recorded for analysis. The same process is repeated to record matched pixels from VIM to MT filaments.

Collective cell migration experiments and analysis

Coupled measurements of velocity direction and stress orientation were taken from the data originally published by Tamal Das et al. (Das et al., 2015). Particle image velocimetry (PIV) was applied to calculate velocity vectors, and monolayer stress microscopy (Tambe et al., 2011) was used to extract stress orientations. Velocity and stress measurements were recorded 3 hr after collective migration was induced by lifting off the culture-insert in which the cells have grown to confluence. Validated siRNAs were used for gene screening. Detailed experimental settings can be found in Das et al., 2015.

Statistical test

We devised a permutation test to determine statistical significance of differences in LI values between experiments (conditions) (Figure 4C). For every pair of conditions (i,j), the following procedure was repeated for 100 iterations. 50% of the velocity-stress observations were randomly selected for each condition and the LI (and GI) were calculated from this subsampling. Without loss of generality, for LI_i < LI_j (based on Figure 4B) the p-value was recorded as the fraction of iterations in which the subsampled LI value for condition i was greater than the LI value for condition j. A fraction of 0 thus implies p-value < 0.01.

PKC FRET experiments

hTERT-RPE-1 cells (ATCC, RRID: CVCL_4388) expressing GFP and mCherry (for the control experiment) or C kinase activity reporter (CKAR, for the PKC activation experiment) (Violin et al., 2003) were plated with DMEM/F12 medium containing 10% fetal bovine serum and 1% penicillin-streptomycin in fibronectin-coated 35 mm MatTek plates (P35G-0–10 c). The cell line has been tested negative for mycoplasma contamination. Cells were incubated overnight and imaged with a custom-built DeltaVision OMX widefield microscope (GE healthcare life sciences, Chicago, IL) equipped with an Olympus PLAN 60 × 1.42 N.A. oil objective and CoolSNAP HQ2 interline CCD cameras. FRET experiments were performed with a 445 nm laser, and control experiments were performed with 488 and 561 nm lasers. 478/35, 541/22, 528/48 and 609/37 emission bandpass filters were used for the CFP, YFP, GFP and mCherry channels, respectively. The output powers of the lasers were set to 10% the maximal output (100 mW). The exposure time was 200 ms per frame for both channels. 10 µM of the PKC inhibitor HA-100 dihydrochloride (Santa Cruz Biotechnology, Dallas, TX) was added to the media after the first image was recorded and a second image was recorded 20 min later.

Single cells were manually selected for analysis. In particular, cells with higher intensities in the CFP channel were found to provide reproducible changes in their FRET intensity. Each cell was manually annotated and analyzed with the Biosensor Processing Software 2.1 to produce the ratio images (Hodgson et al., 2010). Briefly, the field of view was corrected for uneven illumination, background was subtracted, the image was masked with the single cell annotation, and the ratio image was calculated as CFP/YFP_CFP. Statistics was determined using the non-parametric Wilcoxon signed-rank test.

Clathrin mediated endocytosis experiments

Cells, cell culture and chemicals

ARPE-19 (retinal pigment epithelial cells) (ATCC, RRID: CVCL_0145) stably expressing eGFP-CLCa were grown under 5% CO2 at 37°C in DMEM high glucose medium (Life Technologies, ), supplemented with 20 mM HEPES, 10 mg/ml streptomycin, 66 ug/ml penicillin and 10% (v/v) fetal calf serum (FCS, HyClone). H1299 (non-small cell lung carcinoma) (RRID: CVCL_0060, a generous gift from Dr. J. Minna at the UT Southwestern Medical Center) stably expressing eGFP-CLCa were grown under 5% CO₂ at 37°C in RPMI, supplemented with 20 mM HEPES, 10 mg/ml streptomycin, 66 ug/ml penicillin and 5% (v/v) fetal calf serum (FCS, HyClone). STR profiling was performed to ensure cell identity. No mycoplasma contamination was found. The AKT inhibitor (Akt inhibitor X, ‘ten’) was purchased from Calbiochem.

Transferrin receptor internalization

TfnR uptake was measured by an ‘in-cell’ ELISA assay using the anti-TfnR monoclonal antibody, HTR-D65 (Schmid and Smythe, 1991), as ligand, exactly as previously described (Reis et al., 2015). Internalized D65 was expressed as the percentage of the total surface-bound D65 at 4°C (i.e., without acid wash step), measured in parallel.

Fixed cell imaging

Transferrin receptor (TfnR) surface levels were measured using the anti-TfnR mAb (HTR-D65). ARPE-19 and H1299 cells (1.5 × 105 cells per well in a 6-well plate) were grown overnight on glass cover slips and further pre-incubated with 4 ug/ml of D65 in TfnR assay buffer (PBS4+: PBS supplemented with 1 mM MgCl2, 1 mM CaCl2, 5 mM glucose and 0.2% bovine serum albumin) at 4°C for 30 min. After being washed with PBS⁴⁺, cells were fixed in 4% PFA for 30 min at 37°C, permeabilized with 0.1% Triton X-100 for 5 min and further blocked with Q-PBS (2% BSA, 0.1% lysine, pH 7.4) for 30 min. After three washes with PBS, cells were incubated with a 1:500 dilution of goat anti-mouse Alexa-568 labelled secondary antibody (Life Technologies) for 30 min, washed an additional three times with PBS before TIRFM imaging using a 100 × 1.49 NA Apo TIRF objective (Nikon, Japan) mounted on a Ti-Eclipse inverted microscope equipped with the Perfect Focus System (Nikon). Images were acquired with an exposure time of 150 ms for both channels using a pco-edge 5.5 sCMOS camera with 6.5 um pixel size. For inhibition studies, cells were initially pre-incubated in the presence of Akt inhibitor X (10 uM) for 30 min at 37°C, followed by pre-incubation with 4 ug/ml of D65 at 4°C for 30 min, in continued presence of the inhibitor.

Live-cell imaging and analysis

During TIRFM imaging, cells were maintained in DMEM lacking phenol red and supplemented with 2.5% fetal calf serum. Time-lapse image sequences were acquired at a frame rate of 1 frame/s with exposure time of 150 ms using a pco-edge 5.5 sCMOS camera with 6.5 um pixel size. CCP detection, tracking and construction of life-time distributions were performed with the custom CME analysis software described in (Aguet et al., 2013). Lifetime distribution was defined at the resolution of 1 s and limited to 160 s. Longer CCP trajectories were excluded from the analysis. To compare lifetime distributions for single field of views (FOVs) between WT and AKT-inhibited cells we measured the heterogeneity as the EMD distance to the uniform distribution. FOV’s score were compared between the different experimental conditions using the non-parametric Wilcoxon rank-sum test.

Image analysis for fixed cell experiments

Single cell masks were manually annotated in each FOV. We applied the approach described in (Aguet et al., 2013) to automatically detect CCPs from the CLC channel. Briefly, CLC ﬂuorescence was modeled as a two-dimensional Gaussian approximation of the microscope PSF above a spatially varying local background. CCP candidates were first detected via filtering, followed by a model-fitting for sub-pixel localization. The fluorescent intensity of the CLC and any other acquired channel were recorded in the detection coordinates to define the matched observations for DeBias. GI and LI were calculated independently for each single cell. Linear Discriminant Analysis (LDA) classification (Fisher, 1936) was applied to assess single cell classification accuracy. Every cell constituted an observation, a label was assigned based on the experimental condition and the representation was either the LI or the pair (GI,LI). The LDA classifier was trained on a labeled dataset consisting of WT and AKT-inhibition for H1299 or ARPE19 cells. The area under the Receiver Operating Characteristic (ROC) curve was recorded to assess and compare the discriminative accuracy of different measures. The true-positive rate (TPR) is the percentage of control cells classified correctly. The false-positive rate (FPR) is the percent of AKT inhibited cells classified as control. When comparing the potential accuracy of several classification algorithms, a measure that has higher true-positive rate for any fixed false-positive rate values is proved to be the better one. Thus, higher curves (larger areas under the ROC curve, or AUC) correspond to more discriminative measures. Statistical significance for comparing classification performance of LDA classifiers that were trained for scalar measures with or without the GI was calculated by bootstrapping (Figure 6E–F). The following process was repeated 1000 times and the frequency for which the scalar-based classifier outperformed the classifier trained on pairs of measures was reported as the p-value. Random resampling with replacement was performed to obtain a sample size identical to that of the observed dataset. The area under the curve (AUC) of the competing pre-trained LDA classifiers was assessed for this resampled dataset and recorded when the model that was trained without the GI predicted better.

Webserver

The DeBias code was implemented in Matlab, compiled with Matlab complier SDK and transferred to a web-based platform to allow public access for all users at https://debias.biohpc.swmed.edu. The graphical user interface (GUI) was designed to be simple and easy to use. The user uploads one or more datasets to the DeBias webserver and selects the mode of operation (co-localization/orientation). GI and LI values are calculated and the results are displayed and emailed to the user. ‘DeBias Analyst’ enables to group experiments into two experimental conditions (usually control versus treatment), visualizes and outputs statistics on the alterations of GI and LI. The software’s flow chart and a detailed user manual are available in the online user manual. Source code is publicly available, https://github.com/DanuserLab/DeBias Danuser, 2017 (with a copy archived at https://github.com/elifesciences-publications/DeBias).

Acknowledgements

We thank the BioHPC team at UTSW and especially Liqiang Wang for the help in implementing the DeBias web-server and making it freely available for public use. We are grateful to Tamal Das and Joachim Spatz for providing us with the motion and stress data from their tight-junction screen. We thank Andrew Jamieson for helping to package the source code and creating the repository. We thank Dana Reed for assistance with cell characterization and mycoplasma testing. We thank Sangyoon Han, Claudia Schaefer, Meghan Driscoll, Erik Welf, Phillippe Roudot and Marcel Mettlen for critically reading the manuscript and Marcel Mettlen for fruitful discussions and advice. This work was supported by the Cancer Prevention and Research Institute of Texas (CPRIT R1225 to GD), by NIH P01 GM103723, P01 GM096971 (to GD) and by GM713165 (to SLS and GD). UO was supported by an EMBO Long-Term fellowship. The funders had no role in study design, data collection and analysis, decision to publish or preparation of the manuscript.

Appendix 1

Theoretical results for co-orientation data

Terms and definitions

Let X, Y be the distribution functions of two random variables representing angles on $[- 90 °, 90 °]$ . Spatially matched random variables from these distributions are denoted $x_{i}$ and $y_{i}$ , $i = 1, 2 \dots N$ , where $N$ is the number of observations. $x_{i}^{*}$ and $y_{i}^{*}$ are random variables sampled from X and Y independently (without considering the spatial matching). The observed and resampled alignment distributions are denoted $A$ and $A^{*}$ , respectively. The alignment distributions represent angles on $[0 °, 90 °]$ , and are functions of $X, Y$ , the interaction between $X$ and $Y$ , and $N$ . Random variables from $A$ are denoted $θ_{i}$ , $i = 1, 2 \dots N$ . Random variables from $A^{*}$ are denoted $θ_{i}^{*}$ . Let $K$ be the number of bins in the alignment histogram. Histogram bins are denoted $b i n_{i}, i = 0, \dots, K - 1$ where $b i n_{0}$ contains the lowest values (including 0°) and $b i n_{k - 1}$ the highest values (including 90°). $U$ denotes the uniform distribution on $[0 °, 90 °]$ with the same K bins as $A, A^{*}$ .

Theorem 1: Local index of independent variables

If $X, Y$ are independent, then $E (L I (X, Y)) = 0$

Proof:

\begin{array}{ll} E (L I) & = E (E M D (A, U) - E M D (A^{*}, U)) \\ = E (E M D (A, U)) - E (E M D (A^{*}, U)) \underset{*}{\underset{⏟}{=}} E (E M D (A, U)) - E (E M D (A, U)) = 0 \end{array}

* Resampling does not change the expectation of the difference of two independent variables.

Theorem 2: Global index of uniform distributions

Let $X, Y$ be uniform distributions, then ${l i m}_{N \to \infty} G I = 0$ .

Proof:

Since $X$ and $Y$ are uniform distributions, the density of the resampled distributions are $f_{x_{i}^{*}} (ω) = f_{y_{i}^{*}} (ω) = \frac{1}{180}, - 90 < ω < 90$ . We can compute the distribution of the difference by $\begin{aligned} f_{x_{i}^{*} - y_{i}^{*}} (ω) = \int_{- \infty}^{\infty} f_{x_{i}^{*}} (x) f_{y_{i}^{*}} (x - ω) d x = \int_{- \infty}^{\infty} \frac{1}{180^{2}} I_{x \in (- 90, 90)} I_{x - ω \in (- 90, 90)} d x = \\ {\begin{matrix} \frac{1}{180^{2}} \int_{- 90}^{90 + ω} 1 d x & 0 < ω < 180 \\ \frac{1}{180^{2}} \int_{- 90 + ω}^{90} 1 d x & 0 < ω < 180 \end{matrix} = {\begin{matrix} \frac{1}{180^{2}} (180 + θ) & - 180 < ω < 0 \\ \frac{1}{180^{2}} (180 - θ) & 0 < ω < 180 \end{matrix} \end{aligned}$

We conclude that the distribution of the difference is

f_{x_{i}^{*} - y_{i}^{*}} (ω) = {\begin{matrix} \frac{1}{180} (1 + \frac{ω}{180}) & - 180 < ω < 0 \\ \frac{1}{180} (1 - \frac{ω}{180}) & 0 < ω < 180 \end{matrix}

Therefore, when taking the absolute value of the angle difference we get:

f_{| x_{i}^{*} - y_{i}^{*} |} (ω) = {\begin{cases} \frac{1}{90} (1 - \frac{ω}{180}) & ω \geq 0 \\ 0 & ω < 0 \end{cases}

Finally, since the alignment is limited to $[0^{\circ}, 90^{\circ}]$ we apply the function $g (ω) = {\begin{array}{lc} 180 - ω & 90 < ω \leq 180 \\ ω & 0 < ω \leq 90 \end{array}$ so that

\begin{array}{ll} f_{g (| x_{i}^{*} - y_{i}^{*} |)} (ω) & = {\begin{matrix} \frac{1}{90} (1 - \frac{w}{180}) + \frac{1}{90} (1 - \frac{180 - ω}{180}) & 0 \leq ω \leq 90 \\ 0 & e l s e \end{matrix} \\ = {\begin{matrix} \frac{1}{90} & 0 \leq ω \leq 90 \\ 0 & e l s e \end{matrix} \end{array}

Therefore $f_{g (| x_{i}^{*} - y_{j}^{*} |)} = A^{*}$ is a uniform distribution.

For $N$ sufficiently large, the histogram has approximately $\frac{1}{K}$ of the observations in each of the $K$ equally spaced intervals between 0 and 90 and thus ${lim}_{N \to \infty} G I = {lim}_{N \to \infty} E M D (A^{*}, U) = 0$ .

Theorem 3: Perfect alignment

If $\forall i, j, x_{i} = y_{j}$ then $G I = \frac{(K - 1)}{2}, lim_{N \to \infty} L I = 0$
If $X$ and $Y$ are uniform distributions, and $\forall i, x_{i} = y_{i}$ then $lim_{N \to \infty} G I = 0, lim_{N \to \infty} L I = \frac{(K - 1)}{2}$

Proof:

(I)

$A = A^{*}$ because $\forall i, j a_{i} = a_{j}^{*} = 0$ . For large $N$ the random variable drawn from the alignment distribution $A$ will be approximately:

θ_{i} = {\begin{matrix} 1 & θ_{i} \in b i n_{0} \\ 0 & e l s e \end{matrix}, \forall i .

The EMD of the alignment from the uniform distribution is therefore simply 'moving' $\frac{1}{K}$ observations from $b i n_{0}$ to every other bin, which sums up to

\begin{array}{l} \underset{N \to \infty}{l i m} E M D (A, U) = \frac{1}{K} * 1 + \frac{1}{K} * 2 + \dots + \frac{1}{K} * (K - 1) = \frac{1}{K} * (K - 1) * \frac{1 + K - 1}{2} = \frac{K - 1}{2} \end{array}

Therefore, for large $N$ , $L I = E M D (A, U) - E M D (A^{*}, U) = 0$ , $G I = E M D (A^{*}, U) = \frac{K - 1}{2}$ .

(II)

Since $\forall i, x_{i} = y_{i}$ we get that, similarly to part (I), $E M D (A, U) = \frac{K - 1}{2}$ .

On the other hand, since $X$ and $Y$ are uniform distributions, we get from theorem 2 that ${lim}_{N \to \infty} E M D (A^{*}, U) = 0$ .

Therefore, for infinite observations, $L I = \frac{K - 1}{2}$ , $G I = 0$ .

Theorem 4: $L I$ is a lower bound for the local contribution to the observed alignment

Assuming that the observed alignment distribution $A$ is cumulatively explained by a global bias and a local interaction, we construct a new alignment distribution $A_{- ζ}$ encoding the true cumulative local contribution to the observed alignment and demonstrate that $L I \leq E M D (U, A_{- ζ}) - G I$ to conclude that LI is a lower bound for the local contribution to the observed alignment.

Proof:

We first define $A^{-}$ , the alignment distribution corresponding to $A$ that does not include any local interaction. Thus, $A^{-}$ , can be interpreted as an alignment distribution constructed from $X^{-}$ and $Y^{-}$ , denoting $X$ and $Y$ after elimination of the (unknown) alignment correction due to local interactions between the observations $(x_{i}, y_{i})$ . The construction of $A_{- ζ}$ is based on the corresponding matching pairs $(x_{i}^{-} \in X^{-}, y_{i}^{-} \in Y^{-})$ with alignment correction by the local interaction $ζ_{i}$ (see Figure 1B for as a schematic depiction). Such local interaction exists in our model (although it might not be explicitly known) and can be represented as a vector $ζ \in ℝ^{N}, ζ_{i} \geq 0 \forall i$ . Note, that this construction supports different $ζ_{i}$ values for every observation $i$ and thus can provide a more detailed platform than the single measure LI that DeBias outputs (which assumes $ζ_{i} = ζ_{j} \forall i, j$ ). Also note, that when $ζ_{i} > θ_{i}^{-}$ ( $θ_{i}^{-}$ is the alignment angle between $(x_{i}^{-}, y_{i}^{-}$ )), then the observed alignment $θ_{i}^{-} - ζ_{i} < 0$ .

Accordingly, $A_{- ζ}$ is defined as the alignment distribution of $θ_{i}^{-} - ζ_{i}$ . As described above, $A_{- ζ}$ can contain negative values for $ζ_{i} > θ_{i}^{-}$ . A, the experimentally observed alignment, thus can be generated from $A_{- ζ}$ as well, by truncating the ‘saturated’ observations (where $ζ_{i} > θ_{i}^{-}$ ) to the value 0. More formally, the elements in A are defined by

\begin{array}{ll} θ_{i}^{-} - ζ_{i} & θ_{i}^{-} > ζ_{i} \\ 0 & θ_{i}^{-} \leq ζ_{i} \end{array}

We can get an upper bound for $E M D (A, A^{-})$ in the form of:

$\begin{matrix} E M D (A, A^{-}) \leq E M D (A_{- ζ}, A^{-}) \leq \sum_{i = 1}^{N} \frac{1}{N} ⌈ \frac{ζ_{i}}{| b i n |} ⌉ \end{matrix}$ .

Where $| b i n |$ defines the size of the angular interval of a bin in the alignment histogram.

This equation is intuitively interpreted as every observation i is locally aligned by $ζ_{i}$ , and therefore is translocated $⌈ \frac{ζ_{i}}{| b i n |} ⌉$ bins, at most.

Note that a decreased bin size reduces this bound as close as needed to the value of $E M D (A_{- ζ}, A^{-})$ .

Finally,

L I \underset{*}{\underset{⏟}{\leq}} E M D (A, A^{*}) \approx E M D (A, A^{-}) \leq E M D (A_{- ζ}, A^{-}) \leq \sum_{i = 1}^{N} \frac{1}{N} ⌈ \frac{ζ_{i}}{| b i n |} ⌉

Thus, the LI is a lower bound on the contribution of the direct interaction between $X$ and $Y$ on the alignment distribution.

Additionally, we get that

\begin{aligned} G I = E M D (A^{*}, U) = E M D (A, U) - L I \underset{*}{\underset{⏟}{\geq}} E M D (A, U) - E M D (A^{*}, A) \\ \geq E M D (A, U) - \sum_{i = 1}^{N} \frac{1}{N} ⌈ \frac{ζ_{i}}{| b i n |} ⌉ \end{aligned}

Implying that the GI is an upper bound of the contribution of the global bias.

* by corollary 2

Corollary 2: For any alignment distribution A, $L I \leq E M D (A, A^{*})$

Proof:

Let $A_{i}, A_{i}^{*}, U_{i}$ denote the relative frequency of observations in $b i n_{i},$ $0 \leq i \leq k - 1$ for $A, A^{*}, U$ , respectively.

\begin{array}{ll} E M D (A, A^{*}) & = \sum_{0 \leq i \leq K - 1} | A_{i} - A_{i}^{*} | \\ = \sum_{0 \leq i \leq K - 1} | A_{i} - U_{i} + U_{i} - A_{i}^{*} | \underset{*}{\underset{⏟}{\geq}} \sum_{0 \leq i \leq K - 1} (| A_{i} - U_{i} | - | A_{i}^{*} - U_{i} |) \\ = \sum_{0 \leq i \leq K - 1} | A_{i} - U_{i} | - \sum_{0 \leq i \leq K - 1} | A_{i}^{*} - U_{i} | = E M D (A, U) - E M D (A^{*}, U) = L I \end{array}

* triangle inequality

Theorem 5: GI limits for highly variant truncated normal distributions

${lim}_{\binom{σ \to \infty}{N \to \infty}} G I = 0$ for the following scenarios:

The resampled alignment is a truncated normal distribution with variance parameter $σ^{2}$ .
$X$ and $Y$ are truncated normal distributions, each with variance parameter $σ^{2}$ .

Proof:

(I)

Let $A_{σ}^{*}$ be the truncated normal resampled alignment distribution, defined by the parameters $μ = 0$ , $σ$ , with the support interval $(a, b)$ , such that $a \leq μ \leq b$ . Let $ϕ^{σ} (x)$ be the probability density function (PDF) of the truncated normal distribution and $u (x)$ , $U$ respectively, the PDF and CDF (cumulative distribution function)of the uniform distribution function on $(a, b)$ . The PDF and CDF of the normal distribution function is denoted in the standard notation of $ϕ$ and $Φ$ respectively.

First we prove that $\underset{σ \to \infty}{l i m} ϕ^{σ} (x) = u (x)$ and use this to conclude that ${lim}_{\binom{σ \to \infty}{N \to \infty}} E M D (A_{σ}^{*}, U) = {lim}_{\binom{σ \to \infty}{N \to \infty}} GI = 0$ .

\begin{array}{l} \forall x_{1}, x_{2} \in (a, b), lim_{σ \to \infty} \frac{ϕ^{σ} (x_{1})}{ϕ^{σ} (x_{2})} = lim_{σ \to \infty} \frac{\frac{ϕ (\frac{x_{1}}{σ})}{σ (Φ (\frac{b}{σ}) - Φ (\frac{a}{σ}))}}{\frac{ϕ (\frac{x_{2}}{σ})}{σ (Φ (\frac{b}{σ}) - Φ (\frac{a}{σ}))}} = lim_{σ \to \infty} \frac{ϕ (\frac{x_{1}}{σ})}{ϕ (\frac{x_{2}}{σ})} = lim_{σ \to \infty} \frac{e^{- \frac{x_{1}^{2}}{2 σ^{2}}}}{e^{- \frac{x_{2}^{2}}{2 σ^{2}}}} \\ = lim_{σ \to \infty} e^{\frac{x_{2}^{2} - x_{1}^{2}}{2 σ^{2}}} = 1 \end{array}

Therefore, ${lim}_{σ \to \infty} ϕ_{t}^{σ} (x) = C o n s t a n t$ . Since the support of $ϕ_{t}^{σ}$ is $(a, b)$ , the only constant satisfying that ${lim}_{σ \to \infty} ϕ_{t}^{σ} (x)$ is the probability distribution, $\frac{1}{b - a} = u (x) .$ Therefore, ${lim}_{\binom{σ \to \infty}{N \to \infty}} E M D (A_{σ}^{*}, U) = {lim}_{\binom{σ \to \infty}{N \to \infty}} GI = 0$ .

(II)

Let $X$ , $Y$ be truncated normal distributions. In part (I) we prove that ${lim}_{σ \to \infty} X = {lim}_{σ \to \infty} Y = u (x)$ . Theorem 2 implies that when $X$ and $Y$ are uniform distributions ${lim}_{\binom{σ \to \infty}{N \to \infty}} G I = 0$ .

Appendix 2

Simulations with constant ζ

To assess the performance of DeBias we tested its ability to retrieve a pre-determined local interaction parameter $ζ$ (see Figure 1B) from simulated synthetic data. $X$ and $Y$ were modeled as truncated normal distributions on (−90, 90), with $μ$ =0 and changing $σ_{x}, σ_{y}$ . Pairs of ( $X_{i}, Y_{i}$ ) were sampled from $X, Y$ and shifted towards each other by $ζ$ degrees (similar to Figure 1B, but with a constant cumulative $ζ$ ) to construct the observed alignment angles. To avoid confusion we denote $X, Y, σ_{x}, σ_{y}$ as the observed values post-simulation. For a given constant $ζ$ , we exhaustively explored the $σ_{x}, σ_{y}$ space. For each $σ_{x}, σ_{y}$ , we performed 20 independent simulations with N = 1600 observations ( $X_{i}, Y_{i}$ ). For each simulation we constructed the resampled distribution 10 times based on 400 observations drawn from the marginal $X, Y$ distributions, and used the mean GI, LI. The final recorded GI, LI were averaged over the independent simulations.

The expected mean alignment when neither global bias nor local interactions exist is 45°. We begin by examining the deviation of the mean observed alignment from this value (45° - θ_mean). Better alignment is reflected by higher 45° - θ_mean values implying a larger deviation from the unbiased and no-interactions scenario. Low standard deviations $σ$ , correspond to better alignment, improving with growing $ζ$ , as expected (Appendix 2—figure 1A). The GI follows a similar pattern and remains relatively stable for small changes in $ζ$ (Appendix 2—figure 1B). The similar patterns between Appendix 2—figure 1A and B indicate that the global bias has a prominent role in determining the observed alignment.

The LI grows with $ζ$ (Appendix 2—figure 1C) and its relative contribution to the observed alignment grow with increasing $ζ$ (Appendix 2—figure 1D, quantified by LI/(LI+GI)), as expected. This relative contribution can be harnessed to restore an estimated $ζ$ as the corresponding fraction from (45° - θ_mean) (Appendix 2—figure 1E). The estimated $ζ$ is a lower bound for the actual value (Appendix 1, Theorem 4). Estimation is more accurate for larger $ζ$ and for large $σ$ (Appendix 2—figure 1F). These results again highlight the importance of exploiting the GI for better interpretation of the LI (first introduced in Figure 2D–E).

We also investigated the effect of the choice of the number of bins K, used for sampling and computation of the EMD between distributions. Increased K induces linear growth in LI and GI values, as expected (Appendix 1 Theorem 5, Appendix 2—figure 2A–B) and stabilized its accuracy in predicting $ζ$ for K ≥ 11 (Appendix 2—figure 2C–E). Large K will require more observations to estimate the true distribution. Using a constant K for a specific application assures fair comparison between different cases. Varying N, the number of observations, did not have a major effect on these measurements (Appendix 2—figure 3A–D), but increasing N reduced the noise which increased the accuracy in predicting ζ (Appendix 2—figure 3E).

Appendix 2—figure 2. — $X, Y$ normal distributions with different $σ_{x}, σ_{y}$ and constant ζ = 15°. K = 4, 7, 11, 15, 22 were examined. (**A–B**) Global (A) and local (B) indices grow with K. (**C–E**) Relative contribution of LI to the observed alignment (C), Retrieved estimated $ζ$ (D) and accuracy of estimated $ζ$ (E) stabilizes for K ≥ 11.

**DOI:** http://dx.doi.org/10.7554/eLife.22323.014

Appendix 2—figure 3. — N = 100, 200, 400, 800, 1600 were examined. $X, Y$ normal distributions with different $σ_{x}, σ_{y}$ and constant $ζ$ = 15°. All measures provide similar information but are noisier for lower N. (A) Global index. (B) Local index. (C) Relative contribution of LI to the observed alignment. (D) Retrieved estimated ζ. (E) Accuracy of estimated ζ.

**DOI:** http://dx.doi.org/10.7554/eLife.22323.015

Appendix 3

Simulating protein-protein co-localization

Co-localization of molecules is commonly used to predict potential local interactions under the assumption that the stoichiometry of interacting molecules remains constant across all observations. We demonstrated this by simulating a random variable $X$ , representing molecular counts, and a random variable $Z$ , representing the local interaction. Pairs of samples $x_{i}$ from $X$ and $ζ_{i}$ from $Z$ were drawn, and $y_{i} = x_{i} ζ_{i}$ represented the molecular counts of the co-localized molecule (Appendix 3—figure 1A). The joint distributions of $X, Y$ for four simulations are shown in Appendix 3—figure 1B. X is normally distributed with mean $μ$ = 0.5 and standard deviations $σ$ = 0.2, truncated to [0,1]. Z is normally distributed with mean $μ$ = 1 and different standard deviations, simulating gradually increasing local interactions (Appendix 3—figure 1B, left-to-right). Applying DeBias on these data demonstrated that increased local interactions (reduced $σ_{ζ}$ ) are translated to increased LI (Appendix 3—figure 1C).

We next simulated a scenario, in which a partial subset of observations $y_{i}$ interacted with $x_{i}$ ; the remaining $y_{i}$ were drawn independently from the distribution $Y = X$ . We found that LI values increased with the fraction of interacting observations (Appendix 3—figure 1D).

Finally, we assessed how variation in the quantization parameter (K) and number of observations (N) alter GI and LI (Appendix 3—figure 1E-F) and this was also demonstrated in our experimental data (Figure 6—figure supplement 1G–J).

Appendix 4

Pixel- versus object-based co-localization

Many methods have been developed to quantify protein-protein co-localization. Pixel-based methods measure pixel-wise correlation coefficients (Adler and Parmryd, 2010; Bolte and Cordelières, 2006; Costes et al., 2004; Manders et al., 1993; Pearson, 1901), exploiting the notion that fluorescence levels of co-localized proteins are correlated. They suffer from background signal where no co-localization exists. Object-based methods first detect objects of interest and then assess co-localization based on second-order statistics of the spatial distributions of the detections (Helmuth et al., 2010; Kalaidzidis et al., 2015; Lagache et al., 2015; Rizk et al., 2014; Semrau et al., 2011). Object-based methods remove the background pixels but lose the information contained by the fluorescence levels at the detected objects. Thus, object-based methods are best applicable for the co-localization of binary signals (Lagache et al., 2015), but less suited for applications in which co-localization accounts for coupling of molecular counts on a continuous spectrum. Moreover, object-based methods require detection of objects in both channels, which often limits their applicability. Pros and cons for using either of these methods are presented in Appendix 4—Table 1. In the examples of receptor-CCP co-localization we implemented a hybrid of the two approaches: co-localization analysis by DeBias focused on the intensity of fluorescent readouts within detected CCPs to decompose the coupling of the two intensity variables into LI and GI (Figure 6). The same decomposition was demonstrated for a diffuse signal without objects in the PKC-FRET example (Figure 5).

Appendix 4 – Table 1.

Object- versus pixel-based colocalization.

Object-based methods are best applicable for the colocalization of binary signals, but not/less for applications, in which colocalization accounts for coupling of molecular counts on a continuous spectrum.

DOI: http://dx.doi.org/10.7554/eLife.22323.017

Pixel based	Object based
Pixel-wise correlation	Spatial colocalization
	Limited to object-detectable data
Suffer from background signal
	Highly altered by detection accuracy
Affected by confounding factors (e.g., CCP size)	Lose the information at the detected objects

Open in a new tab

Funding Statement

The funders had no role in study design, data collection and interpretation, or the decision to submit the work for publication.

Funding Information

This paper was supported by the following grants:

EMBO postdoctoral fellowship to Uri Obolski.
National Institute of General Medical Sciences R01 GM073165 to Gaudenz Danuser, Sandra L Schmid.
Cancer Prevention and Research Institute of Texas R1225 to Gaudenz Danuser.
National Institute of General Medical Sciences P01 GM103723 to Gaudenz Danuser.
National Institute of General Medical Sciences P01 GM096971 to Gaudenz Danuser.

Additional information

Competing interests

The authors declare that no competing interests exist.

Author contributions

AZ, Conceptualization, Software, Formal analysis, Investigation, Methodology, Writing—original draft, Writing—review and editing.

UO, Formal analysis, Investigation, Methodology, Writing—review and editing.

ZG, Investigation, Writing—review and editing.

CRR, Investigation, Writing—review and editing.

ZK, Helped in troubleshooting and optimizing the analysis, Assisted with preparing the manuscript.

YD, Resources, Software.

SLS, Supervision, Writing—review and editing.

GD, Conceptualization, Supervision, Writing—review and editing.

References

Adler J, Parmryd I. Quantifying colocalization by correlation: the pearson correlation coefficient is superior to the Mander's overlap coefficient. Cytometry Part A. 2010;77:733–742. doi: 10.1002/cyto.a.20896. [DOI] [PubMed] [Google Scholar]
Aguet F, Antonescu CN, Mettlen M, Schmid SL, Danuser G. Advances in analysis of low signal-to-noise images link dynamin and AP2 to the functions of an endocytic checkpoint. Developmental Cell. 2013;26:279–291. doi: 10.1016/j.devcel.2013.06.019. [DOI] [PMC free article] [PubMed] [Google Scholar]
Bazellières E, Conte V, Elosegui-Artola A, Serra-Picamal X, Bintanel-Morcillo M, Roca-Cusachs P, Muñoz JJ, Sales-Pardo M, Guimerà R, Trepat X. Control of cell-cell forces and collective cell dynamics by the intercellular adhesome. Nature Cell Biology. 2015;17:409–420. doi: 10.1038/ncb3135. [DOI] [PMC free article] [PubMed] [Google Scholar]
Bolte S, Cordelières FP. A guided tour into subcellular colocalization analysis in light microscopy. Journal of Microscopy. 2006;224:213–232. doi: 10.1111/j.1365-2818.2006.01706.x. [DOI] [PubMed] [Google Scholar]
Costes SV, Daelemans D, Cho EH, Dobbin Z, Pavlakis G, Lockett S. Automatic and quantitative measurement of protein-protein colocalization in live cells. Biophysical Journal. 2004;86:3993–4003. doi: 10.1529/biophysj.103.038422. [DOI] [PMC free article] [PubMed] [Google Scholar]
Danuser G. DeBias decouples global bias from direct interactions in co-orientation and co-localization data. ab3106fb9adac71ea6e91ef36068faa8acdbf292https://github.com/DanuserLab/DeBias GitHub. 2017
Das T, Safferling K, Rausch S, Grabe N, Boehm H, Spatz JP. A molecular mechanotransduction pathway regulates collective migration of epithelial cells. Nature Cell Biology. 2015;17:276–287. doi: 10.1038/ncb3115. [DOI] [PubMed] [Google Scholar]
Di Fiore PP, von Zastrow M. Endocytosis, signaling, and beyond. Cold Spring Harbor Perspectives in Biology. 2014;6:a016865. doi: 10.1101/cshperspect.a016865. [DOI] [PMC free article] [PubMed] [Google Scholar]
Drew NK, Eagleson MA, Baldo Jr. DB, Parker KK, Grosberg A. Metrics for assessing cytoskeletal orientational correlations and consistency. PLoS Computational Biology. 2015;11:e1004190. doi: 10.1371/journal.pcbi.1004190. [DOI] [PMC free article] [PubMed] [Google Scholar]
Dunn KW, Kamocka MM, McDonald JH. A practical guide to evaluating colocalization in biological microscopy. AJP: Cell Physiology. 2011;300:C723–C742. doi: 10.1152/ajpcell.00462.2010. [DOI] [PMC free article] [PubMed] [Google Scholar]
Fisher RA. The use of multiple measurements in taxonomic problems. Annals of Eugenics. 1936;7:179–188. doi: 10.1111/j.1469-1809.1936.tb02137.x. [DOI] [Google Scholar]
Freedman D, Diaconis P. On the histogram as a density estimator: l 2 theory. Probability Theory and Related Fields. 1981;57:453–476. [Google Scholar]
Gan Z, Ding L, Burckhardt CJ, Lowery J, Zaritsky A, Sitterley K, Mota A, Costigliola N, Starker CG, Voytas DF, Tytell J, Goldman RD, Danuser G. Vimentin intermediate filaments template microtubule networks to enhance persistence in cell polarity and directed migration. Cell Systems. 2016;3:252–263. doi: 10.1016/j.cels.2016.08.007. [DOI] [PMC free article] [PubMed] [Google Scholar]
He S, Liu C, Li X, Ma S, Huo B, Ji B. Dissecting collective cell behavior in polarization and alignment on micropatterned substrates. Biophysical Journal. 2015;109:489–500. doi: 10.1016/j.bpj.2015.06.058. [DOI] [PMC free article] [PubMed] [Google Scholar]
Helmuth JA, Paul G, Sbalzarini IF. Beyond co-localization: inferring spatial interactions between sub-cellular structures from microscopy images. BMC Bioinformatics. 2010;11:372. doi: 10.1186/1471-2105-11-372. [DOI] [PMC free article] [PubMed] [Google Scholar]
Hodgson L, Shen F, Hahn K. Biosensors for characterizing the dynamics of rho family GTPases in living cells. Current Protocols in Cell Biology. 2010;Chapter 14:14.11. 11–14.11.14. doi: 10.1002/0471143030.cb1411s46. [DOI] [PMC free article] [PubMed] [Google Scholar]
Kalaidzidis Y, Kalaidzidis I, Zerial M. A probabilistic method to quantify the colocalization of markers on intracellular vesicular structures visualized by light microscopy. AIP Conference Proceedings; 158; 2015. [DOI] [Google Scholar]
Kantorovich LV, Rubinstein GS. On a space of completely additive functions. Vestnik Leningrad. Univ. 1958;13:52–59. [Google Scholar]
Karlon WJ, Hsu PP, Li S, Chien S, McCulloch AD, Omens JH. Measurement of orientation and distribution of cellular alignment and cytoskeletal organization. Annals of Biomedical Engineering. 1999;27:712–720. doi: 10.1114/1.226. [DOI] [PubMed] [Google Scholar]
Krishnaswamy S, Spitzer MH, Mingueneau M, Bendall SC, Litvin O, Stone E, Pe'er D, Nolan GP. Systems biology. conditional density-based analysis of T cell signaling in single-cell data. Science. 2014;346:1250689. doi: 10.1126/science.1250689. [DOI] [PMC free article] [PubMed] [Google Scholar]
Lagache T, Sauvonnet N, Danglot L, Olivo-Marin JC. Statistical analysis of molecule colocalization in bioimaging. Cytometry Part A. 2015;87:568–579. doi: 10.1002/cyto.a.22629. [DOI] [PubMed] [Google Scholar]
Liberali P, Snijder B, Pelkmans L. A hierarchical map of regulatory genetic interactions in membrane trafficking. Cell. 2014;157:1473–1487. doi: 10.1016/j.cell.2014.04.029. [DOI] [PubMed] [Google Scholar]
Liu AP, Aguet F, Danuser G, Schmid SL. Local clustering of transferrin receptors promotes clathrin-coated pit initiation. The Journal of Cell Biology. 2010;191:1381–1393. doi: 10.1083/jcb.201008117. [DOI] [PMC free article] [PubMed] [Google Scholar]
Loerke D, Mettlen M, Yarar D, Jaqaman K, Jaqaman H, Danuser G, Schmid SL. Cargo and dynamin regulate clathrin-coated pit maturation. PLoS Biology. 2009;7:e1000057. doi: 10.1371/journal.pbio.1000057. [DOI] [PMC free article] [PubMed] [Google Scholar]
Manders EMM, Verbeek FJ, Aten JA. Measurement of co-localization of objects in dual-colour confocal images. Journal of Microscopy. 1993;169:375–382. doi: 10.1111/j.1365-2818.1993.tb03313.x. [DOI] [PubMed] [Google Scholar]
Nieuwenhuizen RP, Nahidiazar L, Manders EM, Jalink K, Stallinga S, Rieger B. Co-Orientation: quantifying simultaneous Co-Localization and orientational alignment of filaments in light microscopy. PLoS One. 2015;10:e0131756. doi: 10.1371/journal.pone.0131756. [DOI] [PMC free article] [PubMed] [Google Scholar]
Pearson RA. Section i, social and economic science. Science. 1901;14:912–926. doi: 10.1126/science.14.363.912. [DOI] [PubMed] [Google Scholar]
Peleg S, Werman M, Rom H. A unified approach to the change of resolution: space and gray-level. IEEE Transactions on Pattern Analysis and Machine Intelligence. 1989;11:739–742. doi: 10.1109/34.192468. [DOI] [Google Scholar]
Reis CR, Chen PH, Srinivasan S, Aguet F, Mettlen M, Schmid SL. Crosstalk between akt/GSK3β signaling and dynamin-1 regulates clathrin-mediated endocytosis. The EMBO Journal. 2015;34:2132– 2146. doi: 10.15252/embj.201591518. [DOI] [PMC free article] [PubMed] [Google Scholar]
Reshef DN, Reshef YA, Finucane HK, Grossman SR, McVean G, Turnbaugh PJ, Lander ES, Mitzenmacher M, Sabeti PC. Detecting novel associations in large data sets. Science. 2011;334:1518–1524. doi: 10.1126/science.1205438. [DOI] [PMC free article] [PubMed] [Google Scholar]
Rizk A, Paul G, Incardona P, Bugarski M, Mansouri M, Niemann A, Ziegler U, Berger P, Sbalzarini IF. Segmentation and quantification of subcellular structures in fluorescence microscopy images using squassh. Nature Protocols. 2014;9:586–596. doi: 10.1038/nprot.2014.037. [DOI] [PubMed] [Google Scholar]
Rubner Y, Tomasi C, Guibas LJ. The earth mover's distance as a metric for image retrieval. International Journal of Computer Vision. 2000;40:99–121. doi: 10.1023/A:1026543900054. [DOI] [Google Scholar]
Schmid SL, Smythe E. Stage-specific assays for coated pit formation and coated vesicle budding in vitro. The Journal of Cell Biology. 1991;114:869–880. doi: 10.1083/jcb.114.5.869. [DOI] [PMC free article] [PubMed] [Google Scholar]
Semrau S, Holtzer L, González-Gaitán M, Schmidt T. Quantification of biological interactions with particle image cross-correlation spectroscopy (PICCS) Biophysical Journal. 2011;100:1810–1818. doi: 10.1016/j.bpj.2010.12.3746. [DOI] [PMC free article] [PubMed] [Google Scholar]
Serra-Picamal X, Conte V, Vincent R, Anon E, Tambe DT, Bazellieres E, Butler JP, Fredberg JJ, Trepat X. Mechanical waves during tissue expansion. Nature Physics. 2012;8:628–634. doi: 10.1038/nphys2355. [DOI] [Google Scholar]
Tambe DT, Hardin CC, Angelini TE, Rajendran K, Park CY, Serra-Picamal X, Zhou EH, Zaman MH, Butler JP, Weitz DA, Fredberg JJ, Trepat X. Collective cell guidance by cooperative intercellular forces. Nature Materials. 2011;10:469–475. doi: 10.1038/nmat3025. [DOI] [PMC free article] [PubMed] [Google Scholar]
Trepat X, Fredberg JJ. Plithotaxis and emergent dynamics in collective cellular migration. Trends in Cell Biology. 2011;21:638–646. doi: 10.1016/j.tcb.2011.06.006. [DOI] [PMC free article] [PubMed] [Google Scholar]
VanderWeele TJ, Shpitser I. On the definition of a confounder. The Annals of Statistics. 2013;41:196–220. doi: 10.1214/12-AOS1058. [DOI] [PMC free article] [PubMed] [Google Scholar]
Violin JD, Zhang J, Tsien RY, Newton AC. A genetically encoded fluorescent reporter reveals oscillatory phosphorylation by protein kinase C. The Journal of Cell Biology. 2003;161:899–909. doi: 10.1083/jcb.200302125. [DOI] [PMC free article] [PubMed] [Google Scholar]
Zaritsky A, Welf ES, Tseng YY, Rabadán MA, Serra-Picamal X, Trepat X, Danuser G. Seeds of locally aligned motion and stress coordinate a collective cell migration. Biophysical Journal. 2015;109:2492–2500. doi: 10.1016/j.bpj.2015.11.001. [DOI] [PMC free article] [PubMed] [Google Scholar]

eLife. 2017 Mar 13;6:e22323. doi: 10.7554/eLife.22323.019

Decision letter

Editor: Fabrice Cordelières¹

In the interests of transparency, eLife includes the editorial decision letter and accompanying author responses. A lightly edited version of the letter sent to the authors after peer review is shown, indicating the most substantive concerns; minor comments are not usually included.

Thank you for submitting your article "Decoupling global biases and local interactions between cell biological variables" for consideration by eLife. Your article has been favorably evaluated by Naama Barkai (Senior Editor) and three reviewers, one of whom, Fabrice Cordelières (Reviewer #1) served as Reviewing Editor. The following individuals involved in review of your submission have agreed to reveal their identity: Chong Zhang (Reviewer #2); Perrine Paul-Gilloteaux (Reviewer #3).

The reviewers have discussed the reviews with one another and the Reviewing Editor has drafted this decision to help you prepare a revised submission.

Summary:

Zaritsky et al. propose a new approach to characterise interactions between two variables as a result of two influences: a global effect, independent of the two variables and influencing both, and a local effect that depends on their "direct" dependency. This paper raises the point of differentiating situations where correlations are observed, not due to a direct link between variables, but rather being the result of an overall behaviour. The authors built a framework in which both effects are reflected by two indicators: global and local indexes (GI and LI). The paper is divided into a theoretical part, where simulations are used to setup the framework, and a series of 4 biological examples where the "DeBias" method is applied.

Essential revisions:

1) The authors have performed simulation in which the bin widths of distributions or the number of observations have been progressively increased. Conclusions seem to be that a 15 quantization bins and a number of samples around 100 is acceptable. However, how should these parameters be set depending on the nature of the data to analyse? How should these parameters be set, depending on the distributions to analyse? Is there a generic rule? If so, this should be explained. If not, why did the author make those choices? The Methods section includes information about the number of quantization bins: why is it 89 for angular data, where as the value is 10 for PKC experiments and 39 for co-localization? From theoretical results, it seems LI or GI tends to (K-1)/2 when the other indicator is zero, K being the number of quantization bins. How would the authors recommend proceeding when the K parameter is not the same from one experiment to another? I fear we are missing proper guidelines.

2) For example of lack of usability: I did test it also with my own data of spot colocalization (simulation) and interestingly the GIs were inversely related to the spot density (but which was also related to the image size or N), and LIs was less obvious (my N was 100^2 and 256^2 with 100 spots each time with no colocalisation or with 50% of them colocalizing). I’ve found

GI no coloc 50% coloc

(squareroot size) 100 4.47 4.32

(squareroot size)256 3.67 3.62

LI no coloc 50% coloc

(squareroot size)100 1.59 2.24

(squareroot size)256 1.69 1.57

Before any conclusion, interpretation would require more characterisation work, or maybe provided with a simulation script in order to better apprehend the significance of it. I would have had to repeat the experiment, too laborious from the server, and in addition I do not know which number of bins was used in that case and if it was constant.

3) In terms of local interaction ζ, it is simulated as a ratio of θ by α, is this a generic assumption that local interaction is a variable linearly related to the alignment?

4) Subsection “Simulating synthetic data”, the authors demonstrated in the simulation the effectiveness of GI and LI. While it seems true by simulating local interaction of alignment by shifting one of the angles towards the other, how would GI and LI behave when the local interaction is not alignment, i.e., by shifting one away from the other? Would GI and LI show the opposite effect so as to be able to discriminate opposite local interactions?

5) Subsection “DeBias procedure”, for the adjustment #1 for the colocalization quantification, such normalization implicitly assumes the channel signals/intensities have linear relationship. This could be a strong assumption. What would be the adjustment for those signals that do not have linear relationship?

6) Probably some additional validation to highlight the method's advantage over other conventional intensity based colocalization methods would be useful. Such additional evaluation is probably not necessarily within the scope of a "Tools" submission, but that if such data were available, it would be a valuable scientific addition to the paper. Or at least some kind of generic guidelines about under which situations when DeBias may or may not be applied would be instructive.

7) The quantification power is actually relative: in the same conditions (meaning here same K, and potentially same N see point 2), the GIs and LIs would actually give relative information, but a twofold interaction factor would not show a twofold LI. In addition each experiment shows very different range of LIs and GIs. These values are theoretically in the range of [0; (k-1)/2], k been the number of bins, and does not denotes directly to the strength of the interaction, and have to be compared. Normalizing these descriptors by their upper bound may help to interpret.

8) In each of these experiments, the conclusions were drawn with a different path of reasoning and different constructions of statistical tests to compare and assess the significance of observed differences between these 2 descriptors for different conditions. I do believe that the manuscript would gain in impact by providing a general workflow (maybe a scheme with different possibilities), including the assessment of significance of the descriptors differences. Otherwise, the potential user of this powerful technique may struggle with the analysis and be led to erroneous conclusions.

9) The Discussion of this article is quite disappointing. The first part is a summary of previous conclusions, the second part being a short comparison to already published, approaching methods. The manuscript should benefit from this comparison being included into the proper section of the manuscript, namely the co-localisation one.

eLife. 2017 Mar 13;6:e22323. doi: 10.7554/eLife.22323.020

Author response

Essential revisions:

Setting K is a tradeoff between the number of observations and the range of observed values. In the manuscript, the narrow dynamic range in the PKC imaging and wider range in the CME co-localization experiments implied different selection of K.

In response to these valid questions by the reviewers, we have further validated that changing K does not alter the interpretation of simulated or experimental data analysis for both co-orientation and co-localization scenarios (Figure 3—figure supplement 1D-E, Figure 6—figure supplement 1I-J, Appendix 3—figure 1F).

The Freedman-Diaconis rule is designed to calculate the histogram bin width so to minimize the difference between the area under the observed and the theoretical distribution (D Freedman, P Diaconis et al., 1981). Following the reviewers’ comments we implemented and integrated this rule to our source code and web-server implementation as a means to determine a default value of K and the marginal distributions’ bin width.

K must remain constant for a given application, for testing the alterations of LI/GI as response to different perturbations / experimental conditions. However, different applications may use different K values. This is emphasized in the Methods section (“Importantly, GI and LI across experiments can be compared only when evaluated with the same K value and this is enforced by the web-server.”).

2) For example of lack of usability: I did test it also with my own data of spot colocalization (simulation) and interestingly the Gis was inversely related to the spot density (but which was also related to the image size or N), and Lis was less obvious ((my N was 100^2 and 256^2 with 100 spots each time with no colocalisation or with 50% of them colocalizing. I’ve found

GI no coloc 50%coloc

(squareroot size) 100 4.47 4.32

(squareroot size)256 3.67 3.62

LI no coloc 50%coloc

(squareroot size)100 1.59 2.24

(squareroot size)256 1.69 1.57before any conclusion, interpretation would require more characterisation work, or maybe provided with a simulation script in order to better apprehend the significance of it. I would have had to repeat the experiment, too laborious from the server, and in addition I do not know which number of bins was used in that case and if it was constant.

It is not clear to us how these simulations were performed. Below we list possible explanation to the reviewer’s observations:

GI is inversely associated with the density of co-localized pixels. If the background pixels are drawn from a uniform distribution then more “background” pixels (in this case 6.5 fold) imply reduced GI, as observed.

LI is not increased when N = 256^2 (assuming 100 colocalized pixels). A fraction of ~0.0015 (100/255^2) co-localized pixels could be too subtle for DeBias to identify. In the case N = 100^2 and corresponding fraction of 1% co-localized pixels, this is easily identified (LI: 1.59 → 2.24).

In our new co-localization simulations we have mixed interacting and non-interacting observations and demonstrated that LI associates with an increased fraction of locally- interacting observations (Appendix 3—figure 1D). In these simulations, the non- interacting observations from the 2^nd channel (Y) were drawn from the distribution of the first channel (X). Following the reviewer concern, we have also simulated very low fractions of interacting observations of 0%, 0.1%, 0.5% and 1% to highlight the sensitivity of DeBias. See Author response image 1 (parameters were: σ_ζ = 0.1, σ_x = 0.2). We did not include these results in the manuscript because Appendix 3—figure 1D already captures a wide range of fractions of interacting observations. Its purpose was to demonstrate DeBias’ ability to identify scenarios of partial indications and this was achieved with the new figure panel.

Author response image 1. — **DOI:** http://dx.doi.org/10.7554/eLife.22323.018

Following the reviewers’ suggestion, we have made our co-localization and co-orientation simulation publicly available.

3) In terms of local interaction ζ, it is simulated as a ratio of θ by α, is this a generic assumption that local interaction is a variable linearly related to the alignment?

The calculation of LI and GI is based on the assumption that the local interaction is constant, implemented by modeling the LI as an additive constant by complementing GI (calculated from the marginal distributions) to the observed co-alignment / co-localization distribution. This assumption might be unrealistic in many biological systems. As DeBias is the first direct measure to quantify both local interactions and global bias, we simplified the estimation of LI and used it as a proxy to complex and more biological-realistic interactions. Our co-alignment simulations modeled the local interactions as a fraction of the alignment θ. We were able to show, even with this simplifying assumption, that we can discriminate and estimate ζ accurately (Appendix 2—figure 1E-F). We have now also included a simulation for co-localization, where the local interactions are modeled as associations between the channels (y = αx). These results are shown in Appendix 3. Adjusting DeBias to non-constant bias is out-of-scope of this manuscript and will be pursued as future work.

4) Subsection “Simulating synthetic data”, the authors demonstrated in the simulation the effectiveness of GI and LI. While it seems true by simulating local interaction of alignment by shifting one of the angles towards the other, how would GI and LI behave when the local interaction is not alignment, i.e. by shifting one away from the other? Would GI and LI show the opposite effect so as to be able to discriminate opposite local interactions?

We have implemented simulations of two locally repulsive vector variables and demonstrated that yields negative LI values, indeed (Figure 2—figure supplement 1).

This assumption was made to adjust the channels’ intensity to the [0,1] range. Indeed, this is a simplifying assumption that is now discussed in the new section on DeBias’ limitations in the Discussion. Adjusting the normalization to possibly non-linear interactions is out-of-scope of this manuscript.

We already demonstrated that including GI increases discriminative abilities for the experimental CME co-localization data (Figure 6E-F) and for simulated co-alignment data (Figure 2F – only subjectively). We now complement this by demonstrating enhanced discrimination performance of Pearson correlation by including the GI as a coupled measure (Figure 6—figure supplement 1C-F). We also demonstrated the high association of LI and Pearson’s correlations (Pearson’s coefficient > 0.95), validating LI as a measure for local interactions.

7) The quantification power is actually relative: in the same conditions (meaning here same K, and potentially same N see point 2), the GIs and Lis would actually give relative information, but a twofold interaction factor would not show a twofold Li. In addition each experiment shows very different range of Lis and Gis. These values are theoretically in the range of [0; (k-1)/2], k been the number of bins, and does not denotes directly to the strength of the interaction, and have to be compared. Normalizing these descriptors by their upper bound may help to interpret.

It is true that the quantification power is relative. K can be set independently for every application to discriminate between different experimental conditions. It is also true that a two-fold increase in the direct interaction would not necessarily imply a two-fold increase in LI – this requires a priori knowledge of the nature of the interaction and adjustment of the calculation of LI. Tackling this issue for all of the four application areas seems outside the scope of this manuscript. When absolute values are needed, we will in the future refer to the DeBias framework and calibrate the LI change on an application-by- application basis. It should be noted that alternative measures for pixel-based co- localization are relative in nature as well. Normalizing by the upper bound will not solve the second concern (two-fold increased interaction mapped to a two-fold increase in LI). Since dividing by the constant K(K-1) will induce very low numbers for LI and GI (e.g., for K = 15 and LI = 1 → < 0.005) we have decided to avoid it in our implementation.

We excluded the possibility that N has a major effect on the calculated GI and LI values (point #2).

We included a general assessment of discrimination and integrated it into our source code and web-server implementation: the p-value of a two-sided Wilcoxon rank sum test for LI and GI of two distinct experimental conditions. In the web-server this part is now referred as ‘DeBias Analyst’.

In the manuscript, the only data assessed differently were the velocity-stress alignment measures (Figure 4) – because we had a single observation per condition and thus could not perform statistical analysis between independent experimental instances. This data was not produced in our lab (previously published data) so we could not retrieve more replicates and thus had to come up with a different analysis. Other methods were also proposed in the manuscript (e.g., classification accuracy, Figure 6E-F) but are not provided in our code distribution.

We feel that the summary part starting the Discussion is necessary because of the nature of the work. Showcasing DeBias on four quite distinct examples emphasizes the generalization of the method, especially, in extracting insightful mechanistic information that was not available in earlier co-localization quantification methods. This point is now emphasized in the ‘summary’ of the Discussion. Another reason for a summary is that many people read only the Discussion before they decide whether to invest time in reading the full text.

We did not follow the suggestion to move the comparison between pixel and object- based co-localization to the co-localization part of the manuscript. We feel that it will be confusing to follow a co-localization discussion in the Results section immediately before going into a general discussion of the features of DeBias analysis in the Discussion section. Having the comparison at the beginning of the co-localization results would interfere with the flow of the manuscript. Instead, we have rewritten the Discussion part addressing this issue and inserted a new Appendix 4 that discusses co-localization approaches in greater depth.

We also added a section on limitations of DeBias to the Discussion. It includes the issues raised by the reviewers: linearity assumption for co-localization normalization, adjusting DeBias to non-constant bias, relative quantification power, LI and GI are slightly coupled and that in some cases it takes additional means to interpret what biological properties are encoded in the GI.

[bib1] Adler J, Parmryd I. Quantifying colocalization by correlation: the pearson correlation coefficient is superior to the Mander's overlap coefficient. Cytometry Part A. 2010;77:733–742. doi: 10.1002/cyto.a.20896. [DOI] [PubMed] [Google Scholar]

[bib2] Aguet F, Antonescu CN, Mettlen M, Schmid SL, Danuser G. Advances in analysis of low signal-to-noise images link dynamin and AP2 to the functions of an endocytic checkpoint. Developmental Cell. 2013;26:279–291. doi: 10.1016/j.devcel.2013.06.019. [DOI] [PMC free article] [PubMed] [Google Scholar]

[bib3] Bazellières E, Conte V, Elosegui-Artola A, Serra-Picamal X, Bintanel-Morcillo M, Roca-Cusachs P, Muñoz JJ, Sales-Pardo M, Guimerà R, Trepat X. Control of cell-cell forces and collective cell dynamics by the intercellular adhesome. Nature Cell Biology. 2015;17:409–420. doi: 10.1038/ncb3135. [DOI] [PMC free article] [PubMed] [Google Scholar]

[bib4] Bolte S, Cordelières FP. A guided tour into subcellular colocalization analysis in light microscopy. Journal of Microscopy. 2006;224:213–232. doi: 10.1111/j.1365-2818.2006.01706.x. [DOI] [PubMed] [Google Scholar]

[bib5] Costes SV, Daelemans D, Cho EH, Dobbin Z, Pavlakis G, Lockett S. Automatic and quantitative measurement of protein-protein colocalization in live cells. Biophysical Journal. 2004;86:3993–4003. doi: 10.1529/biophysj.103.038422. [DOI] [PMC free article] [PubMed] [Google Scholar]

[bib6] Danuser G. DeBias decouples global bias from direct interactions in co-orientation and co-localization data. ab3106fb9adac71ea6e91ef36068faa8acdbf292https://github.com/DanuserLab/DeBias GitHub. 2017

[bib7] Das T, Safferling K, Rausch S, Grabe N, Boehm H, Spatz JP. A molecular mechanotransduction pathway regulates collective migration of epithelial cells. Nature Cell Biology. 2015;17:276–287. doi: 10.1038/ncb3115. [DOI] [PubMed] [Google Scholar]

[bib8] Di Fiore PP, von Zastrow M. Endocytosis, signaling, and beyond. Cold Spring Harbor Perspectives in Biology. 2014;6:a016865. doi: 10.1101/cshperspect.a016865. [DOI] [PMC free article] [PubMed] [Google Scholar]

[bib9] Drew NK, Eagleson MA, Baldo Jr. DB, Parker KK, Grosberg A. Metrics for assessing cytoskeletal orientational correlations and consistency. PLoS Computational Biology. 2015;11:e1004190. doi: 10.1371/journal.pcbi.1004190. [DOI] [PMC free article] [PubMed] [Google Scholar]

[bib10] Dunn KW, Kamocka MM, McDonald JH. A practical guide to evaluating colocalization in biological microscopy. AJP: Cell Physiology. 2011;300:C723–C742. doi: 10.1152/ajpcell.00462.2010. [DOI] [PMC free article] [PubMed] [Google Scholar]

[bib11] Fisher RA. The use of multiple measurements in taxonomic problems. Annals of Eugenics. 1936;7:179–188. doi: 10.1111/j.1469-1809.1936.tb02137.x. [DOI] [Google Scholar]

[bib12] Freedman D, Diaconis P. On the histogram as a density estimator: l 2 theory. Probability Theory and Related Fields. 1981;57:453–476. [Google Scholar]

[bib13] Gan Z, Ding L, Burckhardt CJ, Lowery J, Zaritsky A, Sitterley K, Mota A, Costigliola N, Starker CG, Voytas DF, Tytell J, Goldman RD, Danuser G. Vimentin intermediate filaments template microtubule networks to enhance persistence in cell polarity and directed migration. Cell Systems. 2016;3:252–263. doi: 10.1016/j.cels.2016.08.007. [DOI] [PMC free article] [PubMed] [Google Scholar]

[bib14] He S, Liu C, Li X, Ma S, Huo B, Ji B. Dissecting collective cell behavior in polarization and alignment on micropatterned substrates. Biophysical Journal. 2015;109:489–500. doi: 10.1016/j.bpj.2015.06.058. [DOI] [PMC free article] [PubMed] [Google Scholar]

[bib15] Helmuth JA, Paul G, Sbalzarini IF. Beyond co-localization: inferring spatial interactions between sub-cellular structures from microscopy images. BMC Bioinformatics. 2010;11:372. doi: 10.1186/1471-2105-11-372. [DOI] [PMC free article] [PubMed] [Google Scholar]

[bib16] Hodgson L, Shen F, Hahn K. Biosensors for characterizing the dynamics of rho family GTPases in living cells. Current Protocols in Cell Biology. 2010;Chapter 14:14.11. 11–14.11.14. doi: 10.1002/0471143030.cb1411s46. [DOI] [PMC free article] [PubMed] [Google Scholar]

[bib17] Kalaidzidis Y, Kalaidzidis I, Zerial M. A probabilistic method to quantify the colocalization of markers on intracellular vesicular structures visualized by light microscopy. AIP Conference Proceedings; 158; 2015. [DOI] [Google Scholar]

[bib18] Kantorovich LV, Rubinstein GS. On a space of completely additive functions. Vestnik Leningrad. Univ. 1958;13:52–59. [Google Scholar]

[bib19] Karlon WJ, Hsu PP, Li S, Chien S, McCulloch AD, Omens JH. Measurement of orientation and distribution of cellular alignment and cytoskeletal organization. Annals of Biomedical Engineering. 1999;27:712–720. doi: 10.1114/1.226. [DOI] [PubMed] [Google Scholar]

[bib20] Krishnaswamy S, Spitzer MH, Mingueneau M, Bendall SC, Litvin O, Stone E, Pe'er D, Nolan GP. Systems biology. conditional density-based analysis of T cell signaling in single-cell data. Science. 2014;346:1250689. doi: 10.1126/science.1250689. [DOI] [PMC free article] [PubMed] [Google Scholar]

[bib21] Lagache T, Sauvonnet N, Danglot L, Olivo-Marin JC. Statistical analysis of molecule colocalization in bioimaging. Cytometry Part A. 2015;87:568–579. doi: 10.1002/cyto.a.22629. [DOI] [PubMed] [Google Scholar]

[bib22] Liberali P, Snijder B, Pelkmans L. A hierarchical map of regulatory genetic interactions in membrane trafficking. Cell. 2014;157:1473–1487. doi: 10.1016/j.cell.2014.04.029. [DOI] [PubMed] [Google Scholar]

[bib23] Liu AP, Aguet F, Danuser G, Schmid SL. Local clustering of transferrin receptors promotes clathrin-coated pit initiation. The Journal of Cell Biology. 2010;191:1381–1393. doi: 10.1083/jcb.201008117. [DOI] [PMC free article] [PubMed] [Google Scholar]

[bib24] Loerke D, Mettlen M, Yarar D, Jaqaman K, Jaqaman H, Danuser G, Schmid SL. Cargo and dynamin regulate clathrin-coated pit maturation. PLoS Biology. 2009;7:e1000057. doi: 10.1371/journal.pbio.1000057. [DOI] [PMC free article] [PubMed] [Google Scholar]

[bib25] Manders EMM, Verbeek FJ, Aten JA. Measurement of co-localization of objects in dual-colour confocal images. Journal of Microscopy. 1993;169:375–382. doi: 10.1111/j.1365-2818.1993.tb03313.x. [DOI] [PubMed] [Google Scholar]

[bib26] Nieuwenhuizen RP, Nahidiazar L, Manders EM, Jalink K, Stallinga S, Rieger B. Co-Orientation: quantifying simultaneous Co-Localization and orientational alignment of filaments in light microscopy. PLoS One. 2015;10:e0131756. doi: 10.1371/journal.pone.0131756. [DOI] [PMC free article] [PubMed] [Google Scholar]

[bib27] Pearson RA. Section i, social and economic science. Science. 1901;14:912–926. doi: 10.1126/science.14.363.912. [DOI] [PubMed] [Google Scholar]

[bib28] Peleg S, Werman M, Rom H. A unified approach to the change of resolution: space and gray-level. IEEE Transactions on Pattern Analysis and Machine Intelligence. 1989;11:739–742. doi: 10.1109/34.192468. [DOI] [Google Scholar]

[bib29] Reis CR, Chen PH, Srinivasan S, Aguet F, Mettlen M, Schmid SL. Crosstalk between akt/GSK3β signaling and dynamin-1 regulates clathrin-mediated endocytosis. The EMBO Journal. 2015;34:2132– 2146. doi: 10.15252/embj.201591518. [DOI] [PMC free article] [PubMed] [Google Scholar]

[bib30] Reshef DN, Reshef YA, Finucane HK, Grossman SR, McVean G, Turnbaugh PJ, Lander ES, Mitzenmacher M, Sabeti PC. Detecting novel associations in large data sets. Science. 2011;334:1518–1524. doi: 10.1126/science.1205438. [DOI] [PMC free article] [PubMed] [Google Scholar]

[bib31] Rizk A, Paul G, Incardona P, Bugarski M, Mansouri M, Niemann A, Ziegler U, Berger P, Sbalzarini IF. Segmentation and quantification of subcellular structures in fluorescence microscopy images using squassh. Nature Protocols. 2014;9:586–596. doi: 10.1038/nprot.2014.037. [DOI] [PubMed] [Google Scholar]

[bib32] Rubner Y, Tomasi C, Guibas LJ. The earth mover's distance as a metric for image retrieval. International Journal of Computer Vision. 2000;40:99–121. doi: 10.1023/A:1026543900054. [DOI] [Google Scholar]

[bib33] Schmid SL, Smythe E. Stage-specific assays for coated pit formation and coated vesicle budding in vitro. The Journal of Cell Biology. 1991;114:869–880. doi: 10.1083/jcb.114.5.869. [DOI] [PMC free article] [PubMed] [Google Scholar]

[bib34] Semrau S, Holtzer L, González-Gaitán M, Schmidt T. Quantification of biological interactions with particle image cross-correlation spectroscopy (PICCS) Biophysical Journal. 2011;100:1810–1818. doi: 10.1016/j.bpj.2010.12.3746. [DOI] [PMC free article] [PubMed] [Google Scholar]

[bib35] Serra-Picamal X, Conte V, Vincent R, Anon E, Tambe DT, Bazellieres E, Butler JP, Fredberg JJ, Trepat X. Mechanical waves during tissue expansion. Nature Physics. 2012;8:628–634. doi: 10.1038/nphys2355. [DOI] [Google Scholar]

[bib36] Tambe DT, Hardin CC, Angelini TE, Rajendran K, Park CY, Serra-Picamal X, Zhou EH, Zaman MH, Butler JP, Weitz DA, Fredberg JJ, Trepat X. Collective cell guidance by cooperative intercellular forces. Nature Materials. 2011;10:469–475. doi: 10.1038/nmat3025. [DOI] [PMC free article] [PubMed] [Google Scholar]

[bib37] Trepat X, Fredberg JJ. Plithotaxis and emergent dynamics in collective cellular migration. Trends in Cell Biology. 2011;21:638–646. doi: 10.1016/j.tcb.2011.06.006. [DOI] [PMC free article] [PubMed] [Google Scholar]

[bib38] VanderWeele TJ, Shpitser I. On the definition of a confounder. The Annals of Statistics. 2013;41:196–220. doi: 10.1214/12-AOS1058. [DOI] [PMC free article] [PubMed] [Google Scholar]

[bib39] Violin JD, Zhang J, Tsien RY, Newton AC. A genetically encoded fluorescent reporter reveals oscillatory phosphorylation by protein kinase C. The Journal of Cell Biology. 2003;161:899–909. doi: 10.1083/jcb.200302125. [DOI] [PMC free article] [PubMed] [Google Scholar]

[bib40] Zaritsky A, Welf ES, Tseng YY, Rabadán MA, Serra-Picamal X, Trepat X, Danuser G. Seeds of locally aligned motion and stress coordinate a collective cell migration. Biophysical Journal. 2015;109:2492–2500. doi: 10.1016/j.bpj.2015.11.001. [DOI] [PMC free article] [PubMed] [Google Scholar]

PERMALINK

Decoupling global biases and local interactions between cell biological variables

Assaf Zaritsky

Uri Obolski

Zhuo Gan

Carlos R Reis

Zuzana Kadlecova

Yi Du

Sandra L Schmid

Gaudenz Danuser

Roles

Abstract

eLife digest

Introduction

Results

Similarity of observed co-orientation originating from different mechanisms

Figure 1. Illustration of global bias and local interaction using the alignment of two orientational variables.

DeBias: a method to assess the global and local contribution to observed co-alignment

Figure 2. DeBias algorithm.

Figure 2—figure supplement 1. Simulations demonstrating the negative local interactions induce negative local indices.

Theory and limiting cases of DeBias

Local alignment of vimentin and microtubule filaments

Figure 3. : Alignment of microtubule and vimentin intermediate filaments in the context of cell polarity.

Figure 3—figure supplement 1. LI and GI are independent of the number of observations (N) and the number of histogram bins (K) – experimental evidence from the data in Figure 3E–F (time evolution of microtubule-vimentin alignment).

Video 1. Polarization of RPE cells at the monolayer edge over time.

Identifying molecular factors in alignment of cell velocity and mechanical forces during collective cell migration

Figure 4. : Alignment of stress orientation and velocity direction during collective cell migration.

Using DeBias to assess protein-protein co-localization

PKC FRET: a simple example of pixel-based protein-protein co-localization

Figure 5. : PKC inhibition alters LI and GI.

Inferring co-localization of molecular cargo and clathrin-coated pits during endocytosis

Figure 6. : AKT inhibition differentially alters recruitment of TfnR to CCPs during CME for different cell lines.

Figure 6—figure supplement 1. : GI encodes information that is distinct from local interactions; Experimental validations of DeBias for co-localization.

Discussion

Materials and methods

DeBias procedure

DeBias for protein-protein co-localization

Automated selection of number of histogram bins, K

Simulating synthetic data

Simulating co-alignment data

Simulating co-localization data

Vimentin and microtubule filaments experiments and analysis

Cell model

Fixed cell imaging

Live cell imaging

Filament extraction and spatial matching

Collective cell migration experiments and analysis

Statistical test

PKC FRET experiments

Clathrin mediated endocytosis experiments

Cells, cell culture and chemicals

Transferrin receptor internalization

Fixed cell imaging

Live-cell imaging and analysis

Image analysis for fixed cell experiments

Webserver

Acknowledgements

Appendix 1

Theoretical results for co-orientation data

Terms and definitions

Theorem 1: Local index of independent variables

Theorem 2: Global index of uniform distributions

Theorem 3: Perfect alignment

Theorem 4: LI is a lower bound for the local contribution to the observed alignment

Theorem 5: GI limits for highly variant truncated normal distributions

Appendix 2

Simulations with constant ζ

Appendix 2—figure 1. Simulations for X, Y normal distributions with different σx, σy and constant ζ = 0°, 5°, 10°, 15°.

Appendix 2—figure 2. Simulations for different values of K, the number of bins in the alignment distribution.

Appendix 2—figure 3. Simulations for different N, the number of observations.

Appendix 3

Simulating protein-protein co-localization

Appendix 3—figure 1. Simulating co-localization.

Appendix 4

Pixel- versus object-based co-localization

Appendix 4 – Table 1.

Funding Statement

Funding Information

Additional information

Competing interests

Theorem 4: $L I$ is a lower bound for the local contribution to the observed alignment

Appendix 2—figure 1. Simulations for X, Y normal distributions with different $σ$ _x, $σ$ _y and constant ζ = 0°, 5°, 10°, 15°.