Author manuscript; available in PMC: 2009 Oct 21.
Published in final edited form as: Langmuir. 2008 Sep 24;24(20):11577–11586. doi: 10.1021/la801186w

Bayesian Analysis of Heterogeneity in the Distribution of Binding Properties of Immobilized Surface Sites

Inna I Gorshkova 1, Juraj Svitel 1,§, Faezeh Razjouyan 1, Peter Schuck 1,#
PMCID: PMC2574969  NIHMSID: NIHMS64525  PMID: 18816013

Abstract

Once a homogeneous ensemble of a protein ligand is taken from solution and immobilized to a surface, for many reasons the resulting ensemble of surface binding sites may be heterogeneous. For example, this can be due to the intrinsic surface roughness causing variations in the local microenvironment, non-uniform density distribution of polymeric linkers, or non-uniform chemical attachment producing different protein orientations and conformations. We have previously described a computational method for determining the distribution of affinity and rate constants of surface sites from the analysis of experimental surface binding data. It fully exploits the high signal/noise ratio and reproducibility provided by optical biosensor technology, such as surface plasmon resonance. Since the computational analysis is ill-conditioned, the previous approach used a regularization strategy assuming a priori all binding parameters to be equally likely, resulting in the broadest possible parameter distribution consistent with the experimental data. We have now extended this method in a Bayesian approach to incorporate the opposite assumption, i.e. that the surface sites a priori are expected to be uniform (as one would expect in free solution). This results in a distribution of binding parameters as close to monodispersity as possible, given the experimental data. Using several model protein systems immobilized on a carboxymethyl dextran surface and probed with surface plasmon resonance, we show micro-heterogeneity of the surface sites, in addition to broad populations of significantly altered affinity. The distributions obtained are highly reproducible. Immobilization conditions and the total surface density of immobilized sites can have a substantial impact on the functional distribution of the binding sites.

Keywords: protein surface immobilization, protein interactions, binding kinetics, binding affinity, Fredholm integral equations, size-distribution, regularization, Bayesian analysis

INTRODUCTION

The study of protein binding and immobilization to surfaces is of great interest in many fields, including, for example, the design of biosensors and micro-arrays, and the design of surfaces that are biocompatible or specifically functionalized. It is also of fundamental interest for understanding the physical chemistry of protein adsorption to surfaces and protein/cell surface interactions. Yet, it is frequently non-trivial to determine the binding affinity and kinetic parameters of analyte binding to sites located at a surface. Even when highly purified proteins are covalently attached to the surface to serve as binding sites, one common observation is that experimental binding data do not conform to the model of a single class of sites (with or without considering mass transport to the surface in the binding kinetics). This is particularly striking in studies of protein surface binding with surface plasmon resonance (SPR) optical biosensors, which exhibit very high signal-to-noise ratios and reproducibility. Typical SPR data display this non-conformity with simple binding models in the majority of experimental systems 1–3.

One explanation could be the intrinsic heterogeneity of the surface sites. The much-cited aphorism “God made the solid state, he left the surface for the devil” (attributed to Wolfgang Pauli 4 and Enrico Fermi 5) appears to hold true similarly for the behavior of molecules free in liquid state versus at a surface: (1) The detailed structure of the surface may exhibit many local microenvironments with heterogeneous chemical and physical properties 6–9. These microenvironments are likely to modulate the conformation in which a protein is adsorbed or chemically immobilized, as well as its interaction potential for soluble analytes 8–15. This effect is likely exacerbated when using random immobilization chemistries 16. (2) When using a polymeric immobilization support, such as the common carboxymethyl dextran brush, a density distribution perpendicular to the surface will arise that, in addition to presenting a kinetic obstacle for analyte transport 17, 18, may also lead to a gradient in the intrinsic binding properties of the proteins through local variations in the electrostatic interactions, pH, steric constraints, or repulsive hydration/fluctuation forces 19. In addition, due to the intrinsic flexibility of the polymer anchors sampling a range of conformations, the resulting interaction potential of immobilized surface sites for soluble analytes can be expected to be distributed over a continuum of states 19. (3) As may be theoretically expected 20 and observed by AFM imaging 9, the vicinity of the surface can promote clustering of proteins. As a consequence of all these factors, proteins that prior to immobilization are uniform and mono-disperse free in solution should be expected to exhibit a dispersion of physical binding parameters across the ensemble of surface sites offered to a mobile analyte probing the surface 21. Further, the degree of polydispersity in binding constants would be expected to depend strongly on the particular protein, its interaction with the particular surface during immobilization, and the detailed immobilization conditions.

The analysis of experimental data with models allowing for heterogeneity of binding affinity has been suggested by Sips 22 in the form of a parameterized isotherm, and later several groups have explored the idea of modeling arbitrary distributions of binding parameters to experimental surface binding data 23–26. We have reported a method for the deconvolution of experimental kinetic surface binding and dissociation traces obtained at a set of different analyte concentrations, resulting in two-dimensional distributions of rate constants and affinity constants 27. More recently, we have extended this model to incorporate first-order corrections for mass transport limitation 18. By considering simultaneously the two major experimental complications in the interpretation of SPR binding data, heterogeneity and mass transport limitation 2, for the first time this model consistently yields qualities of fit to experimental data on the order of the noise of data acquisition and experimental reproducibility (and without the need to invoke complex binding mechanisms).

One critical question is how reliably the details of the calculated distributions can be interpreted. The concern arises from the well-known ill-posed nature of exponential fitting, and the fact that the deconvolution method resembles in principle a generalized two-dimensional Laplace transform. In order to numerically stabilize the results against noise amplification, Tikhonov-Phillips or maximum entropy regularization is frequently used 18, 23, 24, 27. This technique can be understood from the background that in such ill-posed problems there is usually a large set of different distributions that fit the data with statistically indistinguishable quality of fit. Regularization picks from those the one distribution that adheres most to a predefined property: the minimum total curvature in Tikhonov-Phillips regularization and the minimal information content in maximum entropy regularization, respectively. These methods are well-established in many biophysical disciplines, and they show excellent performance in extracting the essential features of the distribution while being stable against experimental noise 28. In the current context, one important aspect of the regularization used so far is that it implies a priori an equal probability for all binding constants. As a consequence, the broadest possible distribution is obtained (and a flat distribution is the result of an analysis of data consisting only of noise and not carrying information). Although this reflects the information content of the data alone, it is not entirely satisfactory for the purpose of examining to what extent dispersion of the binding parameters occurs in comparison to the usually uniform ensemble in solution.

Therefore, in the present work we approach regularization with the opposite assumption: that the distribution should consist of a single sharp peak. This can be implemented in a Bayesian framework 29 such that the a priori probabilities resemble a δ-function 30. This is fundamentally different from a simple discrete single-site model (which will usually not fit the data), in that the result from the Bayesian approach will, if necessary, add features to the distribution that may contradict the prior expectation but are necessary to honor the data (i.e. give a fit quality equal to that of the standard regularization). With this approach, the data analysis will result in the narrowest distribution possible, given the information content of the experimental data. In particular, any deviation from a δ-function in the resulting distribution can be trusted to be an essential feature of the data.

In the present paper, using this new tool, we demonstrate microheterogeneity of the binding sites of immobilized antigens and antibodies on the carboxymethyl dextran surface of a surface plasmon resonance biosensor. We further examined the reproducibility of the surface site distribution, supporting the results from the distribution analysis. Finally, we asked the question whether immobilizing to a higher surface density would lead to similar surface site distribution or selectively increase subpopulations of sites. For different proteins, both cases can be observed.

METHODS

Surface Plasmon Resonance Biosensing

Biosensor experiments were conducted in a Biacore 3000 surface plasmon resonance (SPR) instrument equipped with microfluidic sample delivery (GE Healthcare, Piscataway NJ). Customarily, the binding signal is measured in units termed “RU”, for ‘resonance units’, approximately equivalent to a refractive index change near the surface of 10⁻⁶. Sensor chips CM3 with ‘short’ carboxy-methyl dextran matrix were used. Standard amine coupling with N-hydroxysuccinimide and N-ethyl-N′-(3-dimethylaminopropyl)carbodiimide was used, and unreacted sites were quenched with ethanolamine HCl 16, 31. Hepes-buffered saline with 3 mM EDTA and 0.005 % P20 was used as a running buffer. A number of different pairs of interacting molecules were studied, as indicated in the Results and Figure legends. Association and dissociation kinetics were recorded at a range of analyte concentrations, adjusting the contact time to achieve close to steady-state binding, and extending the dissociation time to allow monitoring significant curvature in the dissociation trace, where possible. Depending on the interacting pair of molecules, surface regeneration was achieved with a pulse of high salt solution, detergents, or by waiting for natural dissociation.

Calculating Surface Site Distributions

The experimental binding signals sxp(c, t) from several association and dissociation traces obtained at different analyte concentrations were fitted with a model considering a continuous distribution of binding sites P(koff, KD) with a range of chemical off-rate constants koff and equilibrium dissociation constants KD. P(koff*, KD*) dkoff dKD is defined as the population of the class of surface sites (in signal units) with an off-rate constant between koff* and koff* + dkoff, and an equilibrium constant between KD* and KD* + dKD 18, 27. The analyte was assumed to be homogeneous. The kinetic traces at each concentration include an association phase, where at initial time t0 an analyte at concentration c is brought in contact with the sensor surface for a duration tc, and a dissociation phase, where the analyte is removed from the vicinity of the sensor surface. For simplicity, the absence of mass transport limitation is assumed. An extension to the mass transport limited case has been described in 18, and it can be extended to the Bayesian approach analogously. Binding to each class of sites is assumed to proceed independently with the following pseudo-first order rate equation

$$\frac{ds}{dt} = k_{on}\, c\, (S - s) - k_{off}\, s \tag{1}$$

where kon is the chemical on-rate constant with kon = koff/KD, and S is the saturating signal for this class of sites. The analytical solution of Eq. 1 consists of well-known sets of exponentials

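$$s_1(k_{off}, K_D, c, t) = (1 + K_D/c)^{-1} \begin{cases} 1 - e^{-(k_{on} c + k_{off})(t - t_0)} & t_0 < t \le t_0 + t_c \\ \left[ 1 - e^{-(k_{on} c + k_{off})(t_c - t_0)} \right] e^{-k_{off}(t - t_c)} & t > t_0 + t_c \end{cases} \tag{2}$$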

with the normalized signal per unit surface sites s1 = s/S. Considering that we have a distribution of binding sites, the total signal can be expressed as a Fredholm integral equation

$$s_{tot}(c,t) = \int_{K_{D,min}}^{K_{D,max}} \int_{k_{off,min}}^{k_{off,max}} s_1(k_{off}, K_D, c, t)\, P(k_{off}, K_D)\, dk_{off}\, dK_D + \delta(c) \tag{3}$$

considering also baseline offsets δ(c). Discretized in a grid of (koff,i, KD,i) values, this can be approximated as

$$s_{tot}(c,t) = \sum_{i=1}^{N} P_i(k_{off,i}, K_{D,i})\, s_1(k_{off,i}, K_{D,i}, c, t)\, \Delta k_{off}\, \Delta K_D + \delta(c) \tag{4a}$$

with the index i enumerating all surface species with associated pairs of parameter values (koff,i, KD,i), or short

$$s_{tot}(c,t) = \sum_i P_i\, s_1^{(i)} + \delta(c) \tag{4b}$$

The finite grid spacings Δkoff and ΔKD are here absorbed into the normalization of the discrete populations Pi. The task of modeling the experimental data can now be formulated as a least-squares problem

$$\min_{P_i,\, \delta(c)} \left\{ \sum_{c,t} \left( s_{xp}(c,t) - \sum_i P_i\, s_1^{(i)}(c,t) - \delta(c) \right)^2 \right\} \tag{5}$$

Since the effective start times t0 of the binding experiments are frequently not known with sufficient accuracy (due to instrumental and flow-based delays), the t0 values are typically also included as fitting parameters for each curve, within the bounds of the experimental uncertainty. The range of the distribution is chosen in a series of analyses with the grid spanning different koff and KD ranges, selecting the smallest range that does not constrain the quality of fit. The discretization is performed on a logarithmic scale of parameter values, with typically 3 – 4 grid points per decade 18.
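The discretized model of Eqs. 2, 4b, and 5 can be illustrated with a minimal sketch. This is not the EVILFIT implementation; the function names (`s1`, `log_grid`, `design_matrix`, `sum_of_squares`), the grid ranges, the contact time, and the time sampling are hypothetical choices made for illustration. Baseline offsets δ(c) and the fitting of t0 are omitted, and the dissociation phase is taken to begin at t0 + tc, which coincides with Eq. 2 for t0 = 0.

```python
import numpy as np

def s1(koff, KD, c, t, t0=0.0, tc=100.0):
    """Normalized signal per unit surface sites for one class of sites (cf. Eq. 2).

    Assumption: analyte is applied at t0 for a contact duration tc.
    """
    t = np.asarray(t, dtype=float)
    kon = koff / KD                       # kon = koff / KD
    kobs = kon * c + koff                 # observed rate during association
    s_eq = 1.0 / (1.0 + KD / c)           # steady-state fractional occupancy
    # Elapsed association time, clipped to [0, tc]; zero signal before t0
    dt_assoc = np.clip(t - t0, 0.0, tc)
    s = s_eq * (1.0 - np.exp(-kobs * dt_assoc))
    # Elapsed dissociation time, zero until the end of the contact phase
    dt_diss = np.clip(t - (t0 + tc), 0.0, None)
    return s * np.exp(-koff * dt_diss)

def log_grid(lo, hi, per_decade=4):
    """Logarithmically spaced grid with a fixed number of points per decade."""
    n = int(round(np.log10(hi / lo) * per_decade)) + 1
    return np.logspace(np.log10(lo), np.log10(hi), n)

def design_matrix(koff_grid, KD_grid, concs, t, tc):
    """One column per (koff, KD) grid pair; rows are the stacked (c, t) points."""
    pairs = [(k, K) for K in KD_grid for k in koff_grid]   # koff varies fastest
    cols = [np.concatenate([s1(k, K, c, t, tc=tc) for c in concs])
            for k, K in pairs]
    return np.column_stack(cols), pairs

def sum_of_squares(P, A_mat, y):
    """Objective of Eq. 5 without baseline offsets."""
    r = y - A_mat @ P
    return float(r @ r)

# Illustrative grid and experiment layout (all values are assumptions)
koff_grid = log_grid(1e-4, 1e-1, per_decade=4)    # in 1/sec
KD_grid = log_grid(1e-10, 1e-6, per_decade=4)     # in M
concs = [1e-9, 1e-8, 1e-7]                        # in M
t = np.arange(0.0, 400.0, 2.0)                    # in sec
A_mat, pairs = design_matrix(koff_grid, KD_grid, concs, t, tc=200.0)
```

With the model written as a matrix, the total signal of Eq. 4b is simply `A_mat @ P`, which makes the linear character of the least-squares problem in Eq. 5 explicit.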

In the form of Eq. 5, the optimization can be highly susceptible to the amplification of experimental noise, which can come to dominate the features in the calculated distribution 28. Previously, we have described the use of Tikhonov regularization, which stabilizes Eq. 5 by introducing a penalty term for the sum of the second derivatives in both the KD and koff directions 27:

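$$\min_{P_i,\, \delta(c)} \left\{ \sum_{c,t} \left( s_{xp}(c,t) - \sum_i P_i\, s_1^{(i)}(c,t) - \delta(c) \right)^2 + \alpha \sum_{i,j} P_j\, H_{ij}\, P_i \right\} \tag{6}$$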

with Hij denoting the square of the (N×N) second derivative matrix. The penalty term is a constraint, unavoidably decreasing the quality of the fit. However, the nuisance parameter α is iteratively adjusted such that the quality of fit, measured by F-statistics, remains indistinguishable from the overall best fit with α = 0 on a pre-defined confidence level (usually one or two standard deviations). This ensures that we obtain the most parsimonious distribution possible for the given data. This is a powerful approach that was also taken by other groups in the computation of adsorption energy distributions 23, 24, and is applied routinely in many other biophysical disciplines, such as dynamic light scattering (CONTIN) 32, and sedimentation velocity (SEDFIT) 33.
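The structure of Eq. 6 can be sketched as follows. The sketch assumes that H is built as DᵀD from second-difference matrices D acting along the koff and KD grid directions, with grid points flattened so that koff varies fastest; all function names are hypothetical. Baseline offsets and any constraints on the sign of P are left out, and `max_ratio` is a simple stand-in for the F-statistics criterion rather than the criterion itself.

```python
import numpy as np

def second_diff(n):
    """(n-2) x n second-difference matrix for a one-dimensional grid."""
    D = np.zeros((n - 2, n))
    for r in range(n - 2):
        D[r, r:r + 3] = [1.0, -2.0, 1.0]
    return D

def penalty_matrix(n_koff, n_KD):
    """H = D^T D, summed over the koff and KD directions (koff varies fastest)."""
    Dk = np.kron(np.eye(n_KD), second_diff(n_koff))    # curvature along koff
    DK = np.kron(second_diff(n_KD), np.eye(n_koff))    # curvature along KD
    return Dk.T @ Dk + DK.T @ DK

def tikhonov_fit(A, y, H, alpha):
    """Minimize ||y - A P||^2 + alpha * P^T H P via the normal equations."""
    return np.linalg.solve(A.T @ A + alpha * H, A.T @ y)

def largest_acceptable_alpha(A, y, H, alphas, max_ratio):
    """Largest alpha whose sum of squares stays within max_ratio of the
    fit obtained with the smallest alpha (stand-in for the F-test)."""
    def ssr(a):
        r = y - A @ tikhonov_fit(A, y, H, a)
        return float(r @ r)
    alphas = sorted(alphas)
    reference = ssr(alphas[0])
    return max(a for a in alphas if ssr(a) <= max_ratio * reference)

# Synthetic demonstration with a random, well-conditioned model matrix
rng = np.random.default_rng(0)
n_koff, n_KD = 4, 3
H = penalty_matrix(n_koff, n_KD)
A = rng.normal(size=(60, n_koff * n_KD))
y = rng.normal(size=60)
P_weak = tikhonov_fit(A, y, H, alpha=1e-6)
P_strong = tikhonov_fit(A, y, H, alpha=1e3)
alpha_best = largest_acceptable_alpha(A, y, H, [1e-6, 1e-3, 1.0, 1e3], 1.05)
```

Increasing α trades quality of fit for smoothness: the curvature term `P @ H @ P` can only decrease as α grows, while the sum of squared residuals can only increase.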

In a Bayesian refinement of this approach, we can define a prior expectation x(koff, KD) for the distribution to rescale our measure of parsimony. This is a well-known strategy to analyze new data in the light of pre-existing probabilities 29. Following a strategy that we have recently shown to significantly increase resolution in the context of macromolecular size-and-shape distributions by sedimentation velocity 30, we define the parsimony term on the basis of the derivative of the ratio of P relative to the prior x. The modified minimization

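$$\min_{P_i,\, \delta(c)} \left\{ \sum_{c,t} \left( s_{xp}(c,t) - \sum_i P_i\, s_1^{(i)}(c,t) - \delta(c) \right)^2 + \alpha \sum_{i,j} P_j\, \frac{H_{ij}}{x_i x_j}\, P_i \right\} \tag{7}$$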

with sxp(c, t) denoting the experimentally measured family of association and dissociation progress curves at a set of analyte concentrations c will now describe the distribution that deviates minimally and most parsimoniously from the prior expectation, given the experimental data 30. It should be noted that the optimization in Eq. 7 has the degrees of freedom to add features to the solution (even those contradicting the prior knowledge) to the extent that they are essential to explain the new data, again using F-statistics on a pre-defined confidence level as a criterion.
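The effect of rescaling the penalty by the prior in Eq. 7 can be seen in a small one-dimensional sketch. It assumes, as above, that H is DᵀD for a second-difference matrix D; the names `second_diff_penalty`, `bayesian_penalty`, and `bayesian_fit`, and the toy prior with a single elevated grid point, are hypothetical and serve only as illustration.

```python
import numpy as np

def second_diff_penalty(n):
    """H = D^T D for the second-difference matrix D on a 1D grid of n points."""
    D = np.zeros((n - 2, n))
    for r in range(n - 2):
        D[r, r:r + 3] = [1.0, -2.0, 1.0]
    return D.T @ D

def bayesian_penalty(H, x):
    """Penalty matrix of Eq. 7 with elements H_ij / (x_i * x_j)."""
    x = np.asarray(x, dtype=float)
    return H / np.outer(x, x)

def bayesian_fit(A, y, H, x, alpha):
    """Minimize ||y - A P||^2 + alpha * sum_ij P_j H_ij/(x_i x_j) P_i."""
    return np.linalg.solve(A.T @ A + alpha * bayesian_penalty(H, x), A.T @ y)

# Toy prior: uniform background of 1 with one grid point raised to 100
n = 9
H = second_diff_penalty(n)
x = np.ones(n)
x[4] = 100.0
Hx = bayesian_penalty(H, x)

P_like_prior = 3.0 * x            # same shape as the prior
P_flat = np.full(n, 3.0)          # flat distribution
pen_prior = float(P_like_prior @ Hx @ P_like_prior)
pen_flat = float(P_flat @ Hx @ P_flat)
```

Because the penalty acts on the ratio P/x, a distribution proportional to the prior carries essentially no penalty, whereas a flat distribution is now penalized. This is the sense in which the regularization favors the expected peak while the data term remains free to demand other features.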

For constructing the prior, we use a hierarchical approach in which a preliminary conventional analysis with Eq. 6 is the first step. From the resulting distribution we visually discern the major peak whose binding parameters are presumed to reflect un-impaired interactions as they occur in solution. This peak (or region) is then integrated on a logarithmic scale to determine the best estimate for these parameters koff* and KD*. Next, the prior distribution is constructed as:

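$$x_i = A \begin{cases} p\,q & k_{off,i} < k_{off}^* < k_{off,i+1} \ \text{and}\ K_{D,i} < K_D^* < K_{D,i+1} \\ p\,(1-q) & k_{off,i-1} < k_{off}^* < k_{off,i} \ \text{and}\ K_{D,i} < K_D^* < K_{D,i+1} \\ (1-p)\,q & k_{off,i} < k_{off}^* < k_{off,i+1} \ \text{and}\ K_{D,i-1} < K_D^* < K_{D,i} \\ (1-p)(1-q) & k_{off,i-1} < k_{off}^* < k_{off,i} \ \text{and}\ K_{D,i-1} < K_D^* < K_{D,i} \\ 1/A & \text{else} \end{cases} \tag{8a}$$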

with

$$\lg p = \frac{\lg K_D^* - \lg K_D^+}{\lg K_D^- - \lg K_D^+}, \qquad \lg q = \frac{\lg k_{off}^* - \lg k_{off}^+}{\lg k_{off}^- - \lg k_{off}^+} \tag{8b}$$

and the closest neighboring grid points KD+ = min{KD,i > KD*}, KD− = max{KD,i < KD*}, koff+ = min{koff,i > koff*}, and koff− = max{koff,i < koff*}. This is a finite grid approximation of a delta-function Aδ(k − koff*, K − KD*), with total area A and signal average parameters equal to koff* and KD*, added on a constant distribution. The amplitude A reflects the height of the peak relative to a uniform background, which we empirically set to a value of 100 or 1000 (such as to effect an influence of the prior on the result of the analysis). Finally, Eq. 7 was evaluated with the prior Eq. 8a.
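The construction of the prior can be sketched as follows, under two stated assumptions. First, the right-hand sides of Eq. 8b are used directly as log-scale interpolation weights p and q between 0 and 1 for the lower neighboring grid points. Second, the peak is added onto the unit background, following the description "added on a constant distribution", which keeps every prior value positive; a literal reading of Eq. 8a would assign the four cells around (koff*, KD*) only the peak term. The function names are hypothetical.

```python
import numpy as np

def interpolation_weight(grid, star):
    """Index of the lower neighboring grid point and its log-scale weight."""
    grid = np.asarray(grid, dtype=float)
    if not (grid[0] < star <= grid[-1]):
        raise ValueError("star value must lie inside the grid range")
    hi = int(np.searchsorted(grid, star))    # first grid point >= star
    lo = hi - 1
    w = ((np.log10(star) - np.log10(grid[hi]))
         / (np.log10(grid[lo]) - np.log10(grid[hi])))
    return lo, w

def delta_prior(koff_grid, KD_grid, koff_star, KD_star, A=100.0):
    """Grid approximation of A * delta(koff*, KD*) on a uniform background of 1.

    Returns an array indexed as x[KD index, koff index].
    """
    iK, p = interpolation_weight(KD_grid, KD_star)
    ik, q = interpolation_weight(koff_grid, koff_star)
    x = np.ones((len(KD_grid), len(koff_grid)))
    x[iK, ik] += A * p * q                        # lower KD, lower koff
    x[iK, ik + 1] += A * p * (1 - q)              # lower KD, upper koff
    x[iK + 1, ik] += A * (1 - p) * q              # upper KD, lower koff
    x[iK + 1, ik + 1] += A * (1 - p) * (1 - q)    # upper KD, upper koff
    return x

# Illustrative grid with one point per decade (values are assumptions)
KD_grid = np.logspace(-9, -6, 4)       # in M
koff_grid = np.logspace(-4, -1, 4)     # in 1/sec
x = delta_prior(koff_grid, KD_grid, koff_star=10**-2.5, KD_star=10**-8.5)

peak = x - 1.0                          # peak contribution above background
mean_lgKD = np.sum(peak * np.log10(KD_grid)[:, None]) / peak.sum()
mean_lgkoff = np.sum(peak * np.log10(koff_grid)[None, :]) / peak.sum()
```

Distributing the amplitude over the four surrounding grid points in this way preserves both the total area A and the signal-average values of lg KD and lg koff, so the prior remains centered on (koff*, KD*) even when these values fall between grid points.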

All computational methods have been implemented in the public domain software EVILFIT, which has a graphical user interface running on the MATLAB platform (The MathWorks, Natick MA), and is freely available on request (from P.S.). The practical use of EVILFIT is part of the semi-annual data analysis workshops at NIH.

RESULTS

One can demonstrate the properties of the Bayesian distribution analysis best by comparing the results from the application to data of different information content. Figure 1 shows simulated kinetic association and dissociation traces mimicking those that would be obtained in a typical SPR biosensor for a single class of sites with KD = 10 nM and pseudo-first order reaction kinetics with koff = 0.05/sec, in the presence of 0.3 RU normally distributed noise. The left column of panels reflects the analysis of data at low surface site density (Figure 1A) (similar to that recommended in 34), the right column the equivalent analyses for a higher density surface (Figure 1B), both at the same noise levels. For the low signal/noise data, the conventional distribution analysis using Tikhonov-Phillips regularization (Figure 1C) leads to a very broad peak, by design reflecting the limited information content of these data (which was noted previously by Ober & Ward 35). Essentially, only the order of magnitude of KD and koff can be estimated. (The diagonal shape of the broad peak reflects the fact that kon is somewhat better determined from the data than koff). The introduction of the Bayesian prior expectation (Figure 1E) that the surface sites should more likely have binding parameters with KD* = 10 nM and koff* = 0.05/sec – the true parameters underlying this simulation – acts to focus the distribution to the expected value. This is true whether or not the expected parameter values supplied in the Bayesian prior are correct: distributions obtained with twofold and threefold higher values for both KD* and koff* are shown in Figure 1G and 1I. The data do not have sufficient information content to contradict the prior. Instead, the tails in the distribution compensate for the erroneous peak values. It should be noted that all analyses shown have the same quality of fit, and are therefore statistically indistinguishable. The interpretation of these data is therefore ambiguous: which one is the true distribution cannot be decided from the data alone, and depends on our belief in the prior.
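Data of the kind described here can be generated with a short simulation. The following is a sketch using the stated parameters (KD = 10 nM, koff = 0.05/sec, 0.3 RU noise, concentrations of 1 to 100 nM, and capacities of 8 and 100 RU); the contact time of 300 sec, the dissociation time of 300 sec, the 1 sec sampling interval, and the random seed are assumptions, since they are not specified in the text.

```python
import numpy as np

def s1(koff, KD, c, t, tc):
    """Normalized single-site signal with analyte contact from t = 0 to t = tc."""
    t = np.asarray(t, dtype=float)
    kobs = (koff / KD) * c + koff
    assoc = (1.0 - np.exp(-kobs * np.clip(t, 0.0, tc))) / (1.0 + KD / c)
    return assoc * np.exp(-koff * np.clip(t - tc, 0.0, None))

KD, koff = 10e-9, 0.05                        # true parameters (M, 1/sec)
concs = [1e-9, 3e-9, 10e-9, 30e-9, 100e-9]    # analyte concentrations (M)
t = np.arange(0.0, 600.0, 1.0)                # assumed sampling times (sec)
tc = 300.0                                    # assumed contact time (sec)
rng = np.random.default_rng(0)

def simulate(capacity_RU, noise_RU=0.3):
    """Noisy binding traces, one row per analyte concentration."""
    clean = np.array([capacity_RU * s1(koff, KD, c, t, tc) for c in concs])
    return clean + rng.normal(0.0, noise_RU, clean.shape)

low_density = simulate(8.0)       # low surface site density
high_density = simulate(100.0)    # high surface site density
```

Both data sets carry the same absolute noise, so the low-density traces have a more than tenfold lower signal/noise ratio, which is what limits their information content.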

Figure 1.

Effect of Bayesian prior knowledge on the estimated surface site distribution for data of different signal/noise ratio. Data are simulated to mimic those from SPR biosensors for a single class of sites with KD = 10 nM and koff = 0.05/sec with 0.3 RU normally distributed noise. Panels A and B show the binding data and residuals of the fit for a total surface capacity of 8 RU and 100 RU, respectively. The concentrations were 1, 3, 10, 30, and 100 nM (blue to green), respectively. The red line shows the best-fit distribution from the analyses in C and D, respectively, with the residuals joined in the plot below the binding data. In this figure, the left column of panels refers to the analysis of data in A, and the right column to the analysis of the data in B. Panels C and D are the koff-KD-distributions calculated with standard Tikhonov-Phillips regularization on a confidence level of two standard deviations, corresponding to a uniform prior, not emphasizing the likelihood of any particular surface site parameters. The distributions are presented as a contour plot with the color temperature interpolated from the population at the (KD,i, koff,i) grid points (shown as small black circles). The height of the distribution values can be read from the color bar at the right. The grid spacing is chosen logarithmically in both KD and koff direction, such that lines of constant kon are diagonal. Also plotted in the contour plots as vertical grey lines are the analyte concentrations at which the experimental binding data are collected. The horizontal grey line represents the inverse of the longest time-constant for which the experimental data would permit observing a (1/e)-fold decay. These lines are a guide for the order of magnitude of (KD, koff)-values that are characterized best by the experimental data. Panels E and F are the distributions obtained when using as a prior expectation the correct binding parameters underlying the simulation. The parameters emphasized in the prior are indicated as a cross, and the contour line where the discretized δ-function of the prior distribution reaches 10% of the maximum value is indicated as a red dashed line. Panels G and H are obtained under the same conditions, but using an impostor prior, expecting both KD* and koff* to be twofold higher than the true value. Panels I and J are obtained with prior expectations reflecting parameters threefold higher than the true value.

A different situation is observed in the analysis of data with higher surface site density (and therefore better signal/noise ratio). Here, the standard regularization already leads to a narrowly confined region of affinity and rate constants (Figure 1D), which does not seem to be significantly improved in the Bayesian analysis supplying the prior with the correct parameters (Figure 1F). Importantly, the analyses with the impostor prior (Figure 1H and 1J) result in vanishing estimated population of the sites with the wrong parameter estimates. Clearly, the data are sufficiently informative to override wrong expectations inconsistent with the data.

A second interesting question is to what extent the degrees of freedom introduced in the distribution analysis can allow fitting data from interactions with single sites but more complex reaction kinetics, a case where the assumption of parallel pseudo-first order reactions in the current distribution model would fail. This is addressed in Figure 2, where we simulated binding data for a single class of surface sites with ligand-induced conformational change leading to tighter ligand binding. At low surface densities (10 RU) where the data have a low signal/noise ratio, the quality of fit with the impostor distribution model is very good (root-mean-square deviation (rmsd) of 0.33 RU vs simulated noise in the data at 0.30 RU), hardly distinguishable from the best-fit model. In contrast, however, from the higher surface site density (100 RU) one can discern a very significant mismatch of the best-fit binding traces, resulting in an rmsd of 1.87 RU. This can be taken as a criterion to reject this model. As a consequence, the distribution associated with this model should not be interpreted.
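The rejection criterion used here compares the root-mean-square deviation of the fit with the known noise level. A minimal sketch of this comparison, using the values quoted above, is given below; the helper names and the use of a simple variance ratio (instead of a formal F-test with degrees of freedom) are simplifying assumptions.

```python
import math

def rmsd(residuals):
    """Root-mean-square deviation of a sequence of fit residuals."""
    return math.sqrt(sum(r * r for r in residuals) / len(residuals))

def variance_ratio(rmsd_fit, noise_sd):
    """Ratio of the residual variance to the variance of the measurement noise."""
    return (rmsd_fit / noise_sd) ** 2

# Values quoted in the text for the impostor distribution model
ratio_low = variance_ratio(0.33, 0.30)     # 10 RU surface
ratio_high = variance_ratio(1.87, 0.30)    # 100 RU surface
```

A ratio close to 1 (about 1.2 for the 10 RU surface) means the misfit is barely larger than the noise, whereas the ratio of almost 39 for the 100 RU surface indicates systematic deviations far beyond the noise, which is the basis for rejecting the model.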

Figure 2.

Analysis of surface binding data from a single class of sites with ligand-induced conformational change with an impostor model of a koff-KD-distribution of sites with pseudo-first order kinetics. Data are simulated to mimic those from SPR biosensors for surface binding to a molecule with a low-affinity conformation (kon,1 = 2×10⁵ M⁻¹sec⁻¹, koff,1 = 0.02/sec, KD,1 = 100 nM) and a high-affinity conformation (kon,2 = 1×10⁵ M⁻¹sec⁻¹, koff,2 = 0.0005/sec, KD,2 = 5 nM), initially in an equilibrium favoring by 10:1 the low-affinity state, but upon ligand binding shifting with a rate constant of 0.01/sec to the high-affinity state. Two surface site densities were employed, 10 RU (Panel A) and 100 RU (Panel B), soluble ligand concentrations were 30, 100, 300, and 1000 nM (blue to green), and 0.3 RU normally distributed noise was added. The analysis with the impostor model assuming a distribution of sites each with simple pseudo-first order kinetics results in best-fit binding traces as shown in red in Panels A and B, and the residuals of the fit in Panels C and D, respectively. To facilitate the critical inspection of the residuals, they are shown twice on different scales. The distributions resulting from the analysis of the data of Panels A and B are shown in Panels E and F, respectively (in the same representation as described in the legend of Figure 1). The root-mean-square deviation of the fit to the data in A is 0.33 RU, and for the fit to the data in B it is 1.87 RU.

Previously we have used experimental data from the binding of β2-microglobulin to an immobilized monoclonal antibody as a model system to illustrate the properties of the distribution model (e.g., Figure 1 in 18). Accordingly, we applied the Bayesian analysis strategy outlined above to these data (Figure 3A). After integration of the main peak from the conventional analysis (as shown in Figure 3C), we asked the question whether this peak can be described as a single site. The resulting distribution (Figure 3D) shows indeed a focused main peak. However, the main peak comprises only about half of all sites. A broad distribution of secondary sites is still observed, although now with a slightly more homogeneous distribution. These include a tail of lower affinity sites in the µM range, a population of sites approximately 10-fold weaker than the main peak, as well as significant contributions from higher-affinity and more slowly reversible sites.

Figure 3.

Example for the application of the Bayesian analysis to an antibody-antigen interaction. Data show binding of soluble β2-microglobulin to a surface-immobilized, monoclonal IgG. Panel A: Experimental binding traces at analyte concentrations of 0.1, 1, 10, and 100 nM (blue to green), and best-fit traces from the Bayesian distribution model shown in Panel D (red lines). Panel B: Residuals of the fit, which has an rmsd of 0.31 RU. To facilitate the critical inspection of the residuals, they are shown twice on different scales. Panel C: Distribution calculated with conventional regularization (in the same representation as described in the legend of Figure 1). The bold red line indicates the region integrated to determine KD* = 1.22 nM and koff* = 1.45×10⁻³/sec. Panel D: Bayesian analysis using the prior expectation that the region enclosed by the red line in Panel C consists of a single class of sites. The maximum value of the distribution is 35 RU at the peak value, which comprises a total of 64 RU from a total estimated binding capacity of 125 RU.

It is interesting to note that the main peak in Figure 3D could not be completely compressed into the expected single class of sites, leaving a slight shadow. The question whether it is possible, in principle, to detect microheterogeneity within the main peak is investigated best with data at a very high signal/noise ratio. Figure 4A shows kinetic binding traces of soluble B5R antigen binding to the immobilized antibody 19C2 on a long-chain carboxy-methyl dextran surface (CM5, Biacore). As we have shown previously 36, the data fit very poorly to a model with a single class of sites (Figure 4.6A in 36). However, it was not clear whether the reason for this is the population of low-affinity and poorly reversible sites (visible as grey spots at the right and lower boundary of the distribution), or possible heterogeneity in the region of the main peak indicated in the distribution calculated with the standard approach (Figure 4C). With the Bayesian analysis tool we can now hypothesize that the main region may reflect a single class of sites (with binding parameters determined as the signal-average affinity and rate constant over the main region), and test whether an analysis using such a prior expectation will result in a distribution consistent with this prior. As shown in Figure 4D, a clear separation between the two subpopulations at KD values of 4.8 nM and 20 nM remains, showing that this distinction is an essential feature of the information content of the data.

Figure 4.

Microheterogeneity of surface immobilized antibody. Binding of an antigen to its monoclonal antibody immobilized on a short-chain carboxy-methyl dextran surface. Panel A: Experimental binding traces at analyte concentrations of 1, 10, and 100 nM (blue to green), and best-fit traces from the Bayesian distribution model shown in Panel D (red lines). Panel B: Residuals of the fit, which has an rmsd of 0.31 RU. Panel C: Distribution calculated with conventional regularization (in the same representation as described in the legend of Figure 1). The bold red line indicates the region integrated to determine KD* = 8.6 nM and koff* = 2.4×10⁻⁴/sec. The dots indicate the grid-points for the calculated surface site parameter distribution (a slightly higher density than in 36 was employed, leading to a better separation of the two partial peaks). Panel D: Bayesian analysis using the prior expectation that the region encompassing the main peaks in Panel C consists of a single class of sites.

Figure 6.

Dependence of the affinity and kinetic rate distributions on the total surface site density achieved in the experiments. The distributions from Bayesian analysis are shown in the same representation as described in the legend of Figure 1. Panels A and B: Binding of soluble monoclonal Fab (different from that in Figure 5) to its immobilized antigen protein at a low surface density of 1,400 RU leading to a total binding capacity of ~340 RU (A) and high surface density of 3,800 RU leading to a total binding capacity of ~700 RU (B), respectively. Panels C and D: Binding of a third soluble monoclonal Fab to its antigen protein immobilized at low surface density of 920 RU leading to a total binding capacity of 381 RU (C), and immobilized at a higher surface density (1500 RU) leading to a total binding capacity of 464 RU (D).

Next, we examined the question how reproducible the details of the calculated distributions are. For this purpose, replicate experiments were performed with two identical immobilization protocols from the same antigen protein batch. As shown in Figure 5 A and B, this led to very similar surfaces regarding their binding capacities and binding signals, with the surface B reaching ~20% larger signal. The surface binding sites were probed by applying solutions of a monoclonal Fab fragment serially at the same set of concentrations (Figure 5). A complex pattern of major and minor classes of binding sites was identified from the distribution analysis with Tikhonov regularization. Strikingly, a very similar pattern appeared from both surfaces. Small quantitative differences exist in the width and precise location of some peaks, but all classes of sites seem to be reported from both surfaces. The most significant difference appears to be the relative magnitude of the two major, high-affinity peaks. A control analysis using a Bayesian prior, probing the hypothesis that the two most populated peaks may really stem from a single class of sites, again supported the heterogeneity of the surface sites (Figure 5 Panels E and F). Similar to Figure 4, this single-site expectation is not supported by the data, as can be observed by the lack of a peak at the position of the average binding parameters (Panel E). For the replicate experiment in Panel F, the slightly different signal/noise ratio, average affinity and relative magnitude of the two peaks allow the distribution to partially accommodate the prior, but without eliminating the peaks from the Tikhonov regularization, showing, again, that the single-site assumption is insufficient to explain the data.

Figure 5.

Reproducibility of the affinity and kinetic rate distributions. Shown are experimental binding data of a monoclonal Fab fragment specific for an immobilized target antigen protein. Two sets of surfaces were prepared with the same protein batch immobilized under identical conditions to a CM3 carboxymethyl dextran surface, yielding virtually the same density of active surface sites. The same analyte solutions were applied serially to both surfaces at concentrations of 2, 20, 200, and 2000 nM, yielding the surface binding traces shown in Panels A and B, respectively. The corresponding results of the distribution analysis with Tikhonov-Phillips regularization are shown in Panels C and D, respectively (in the same representation as described in the legend of Figure 1). The binding parameters for the two highest peaks were KD,1 = 40 nM with koff,1 = 3.2×10−4/sec (27 RU) and KD,2 = 560 nM with koff,2 = 8.2×10−4/sec (100 RU) in Panel C, and KD,1 = 61 nM with koff,1 = 3.2×10−4/sec (48 RU) and KD,2 = 620 nM with koff,2 = 8.4×10−4/sec (112 RU) in Panel D, respectively. The Bayesian distribution analysis hypothesizing that the two peaks may represent a single site is shown in Panels E and F, respectively.

Finally, we asked whether surfaces that were treated differently to yield different densities of surface sites can be assumed to have similar affinity distributions, or whether different immobilization conditions would disproportionately favor selected classes of sites. We studied this question with different immobilized proteins exhibiting different types of behavior. Figure 6 Panels A and B show the affinity distributions of a monoclonal Fab fragment binding to the immobilized antigen molecule at approximately twofold different surface densities. The higher surface site density was achieved by applying a higher protein antigen concentration to the activated surface during the immobilization step. Only minor differences in the peaks can be discerned. Similar results were found when generating different surfaces by exposing the activated surface to the same protein concentration but for longer times (data not shown). In contrast, a different behavior was observed for a different molecule, shown in Figure 6 Panels C and D. Here, achieving a higher surface density resulted in an increased population of a higher-affinity class of sites. The high-affinity site at 220 – 250 nM is populated at a signal amplitude of 71 RU for the low density surface, but 148 RU for the high density surface. Interestingly, this difference (~77 RU) is of the same magnitude as the total difference in the binding capacity between the two surfaces (~80 RU). For this pair of interacting molecules, we found that the generation of the higher affinity peak did not depend on whether the higher surface density was achieved with longer exposure times of the activated surface, or for the same time but with higher protein concentrations (data not shown).

DISCUSSION

In the present paper, we have described an extension of the previously introduced approach of interpreting families of surface binding and dissociation traces as the result of populations of surface sites with a continuous distribution of binding parameters 18, 27, 37–41. Many man-made or biological surfaces naturally offer heterogeneous populations of surface sites for soluble molecules. But this is true even for specifically functionalized surfaces: proteins free in solution may be mono-disperse and exist in an ensemble of conformations that can be well characterized in three dimensions by a single set of affinity and chemical rate constant values. After immobilization to an intrinsically physically and chemically heterogeneous surface, in potentially multiple geometric configurations, one may no longer be able to make the idealizing assumption that the resulting surface sites are homogeneous in their analyte binding function.
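As a minimal sketch of the model underlying this interpretation, the following Python fragment builds a binding trace as a superposition of independent 1:1 (pseudo-first-order) site classes. The function names, the contact and dissociation times, and the reading of the reported peak amplitudes as saturation capacities are assumptions made for illustration. The two example classes use the Panel C values from the legend of Figure 5; the actual analysis uses a continuous distribution on a grid of binding parameters rather than two discrete classes.

```python
import math

def single_site_trace(kd, koff, conc, t_assoc, t_dissoc, dt=1.0):
    """Signal per unit capacity of one site class for ideal 1:1 binding.

    kd in M, koff in 1/s, conc in M, times in s. Hypothetical helper.
    """
    kon = koff / kd                   # definition KD = koff / kon
    kobs = kon * conc + koff          # observed association rate constant
    s_eq = conc / (conc + kd)         # fractional saturation at equilibrium
    n_a = int(t_assoc / dt)
    n_d = int(t_dissoc / dt)
    # Association phase: exponential approach to the equilibrium level
    trace = [s_eq * (1.0 - math.exp(-kobs * i * dt)) for i in range(n_a + 1)]
    s_end = trace[-1]
    # Dissociation phase: exponential decay with koff
    trace += [s_end * math.exp(-koff * j * dt) for j in range(1, n_d + 1)]
    return trace

def distribution_trace(populations, conc, t_assoc, t_dissoc, dt=1.0):
    """Sum the traces of all site classes, each weighted by its capacity (RU).

    populations: list of (kd, koff, capacity_RU) tuples.
    """
    n_points = int(t_assoc / dt) + int(t_dissoc / dt) + 1
    total = [0.0] * n_points
    for kd, koff, capacity in populations:
        trace = single_site_trace(kd, koff, conc, t_assoc, t_dissoc, dt)
        total = [acc + capacity * s for acc, s in zip(total, trace)]
    return total

# Two major peaks reported for Figure 5, Panel C (amplitudes treated as capacities)
panel_c = [(40e-9, 3.2e-4, 27.0), (560e-9, 8.2e-4, 100.0)]

# Assumed contact time of 300 s and dissociation time of 600 s
traces = {c_nM: distribution_trace(panel_c, c_nM * 1e-9, 300.0, 600.0)
          for c_nM in (2, 20, 200, 2000)}
```

Each concentration yields one trace; fitting all of them simultaneously with a shared set of site populations is what constrains the distribution.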

There is not only substantial structural and functional evidence from numerous biophysical studies using a variety of techniques supporting this view, but also evidence from experimental binding data. We have used an SPR biosensor as an experimental tool to study the binding of soluble analyte molecules to purified proteins immobilized on a carboxymethyl dextran layer at the surface. The physical structure and chemical properties of this surface are rather complex. Commonly, when fitting experimental SPR data with single site models, a poor match between the data and the predicted traces is observed (unless the time and concentration range is artificially truncated or experiments are conducted at extremely low signal/noise ratio). In contrast, with the model allowing for a continuous distribution of surface site binding parameters, very good fits are usually obtained, consistently providing rational explanations of the experimental data commensurate with their signal/noise ratio and excellent reproducibility. In our experience of applying the model to many different systems, we found cases where single peaks appear and other cases where a relatively complex pattern of multiple peaks emerges. In the present paper, we have tried to critically examine to what extent the resulting distributions are a true reflection of the populations of surface sites.

It is important to note that the continuous distribution model is not so degenerate as to indiscriminately fit any binding traces. We have shown previously that under suitable conditions the continuous surface site model cannot be force-fit to binding data containing unrecognized mass transport limitation 27 (although terms for mass-transport limited binding can be incorporated 18). Another model that is of particular interest in some immunological receptor-ligand reactions and antibody-antigen reactions is that of a ligand-induced conformational change, which provides an alternative mechanism to possibly explain multi-phasic binding behavior. In the present work, we have demonstrated that, given data of moderate or high signal/noise ratio, the conformational change model displays large deviations from the superposition of parallel reactions in the continuous model developed here, at least if the conformational changes are slow enough to be observable on the time-scale of the surface binding experiment. This provides an opportunity to rationally discriminate between the two models on the basis of the fit quality. As a consequence, for systems where complex binding mechanisms appear to be at work, one could strengthen the evidence for a complex binding mechanism by ruling out that the data can be fit, alternatively, with a model accounting for the multi-phasic signal through polydispersity of the immobilized sites.
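To make the alternative mechanism concrete, the sketch below integrates the rate equations of a two-state scheme A + B ⇌ AB ⇌ AB* with a simple explicit Euler step. It is only an illustration of the kind of model referred to above, not the simulation used in this work; the function name and all rate constants, concentrations, and times are hypothetical values chosen so that the conformational step is slow enough to be visible on the time-scale of the trace.

```python
def two_state_trace(kon, koff, kf, kr, conc, bmax, t_assoc, t_dissoc, dt=0.1):
    """Simulate the signal AB + AB* for A + B <-> AB <-> AB* (explicit Euler).

    kon in 1/(M s); koff, kf, kr in 1/s; conc in M; bmax in RU; times in s.
    """
    ab, ab_star = 0.0, 0.0
    signal = [0.0]
    n_a = int(round(t_assoc / dt))
    n_d = int(round(t_dissoc / dt))
    for step in range(n_a + n_d):
        c = conc if step < n_a else 0.0      # analyte present only during contact
        free = bmax - ab - ab_star           # unoccupied surface sites
        d_ab = kon * c * free - koff * ab - kf * ab + kr * ab_star
        d_star = kf * ab - kr * ab_star      # conformational interconversion
        ab += d_ab * dt
        ab_star += d_star * dt
        signal.append(ab + ab_star)
    return signal

# Hypothetical parameters for illustration only
params = dict(kon=1e5, koff=0.02, kf=0.01, kr=0.001, conc=200e-9, bmax=100.0)

short = two_state_trace(t_assoc=30.0, t_dissoc=300.0, **params)
long_ = two_state_trace(t_assoc=600.0, t_dissoc=300.0, **params)

# Fraction of the end-of-contact signal that remains after 300 s of dissociation
frac_short = short[-1] / short[300]      # index 300 = end of the 30 s contact
frac_long = long_[-1] / long_[6000]      # index 6000 = end of the 600 s contact
```

Traces generated this way can be passed to a distribution analysis to check whether a superposition of parallel 1:1 reactions reproduces them within the noise of the data.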

Our data show that even relatively rich patterns of surface sites such as in Figure 5CD and Figure 6AB are usually highly reproducible. Therefore, we believe that the particular pattern of binding sites is a consequence of potentially complex interactions of the particular surface, the immobilized protein, the soluble analyte molecule, and the immobilization process.

One simple question is whether an increase in the total amount of surface immobilized material will generally result in a proportional increase of all binding site populations. This would suggest that the probability of ligand attachment in a certain conformation and spatial configuration remains constant throughout the immobilization time. Alternatively, one could imagine a more dynamic process where the probabilities change with time or with ligand concentration. Obviously, the latter will be true as the surface approaches saturation. However, still far from this regime, we found examples of both types of behavior for different molecules. This could reflect differences in the chemical properties and/or self-association behavior of the ligand (considering the relatively high local concentrations at the surface even at low RU signals). Our observation in Figure 6CD of a higher affinity population emerging at higher surface densities could be related to the finding by Zacher & Wischerhoff (with two-color SPR) suggesting a front of immobilization moving with time from the outside to the inside closer to the surface 42. One could speculate that the interior of the carboxymethyl dextran matrix with its higher dextran density may present a different microenvironment promoting association for this particular system. In any case, these data show that one should be cautious with the expectation that different surface immobilization levels would lead to just more sites with the same properties. Unfortunately, this assumption has been silently made frequently in the literature (including our own work 2, 3), for example, where variations in the surface capacity are employed to evaluate the presence of mass transport limitation, or to attempt to observe effects of multi-valent interactions. With the distribution model making more details available on the functional distribution of surface sites, this assumption can now be critically assessed. It seems that other tests for the presence of mass transport limitation, such as the variation of the flow rate, or the injection of a soluble competitor in the dissociation phase suppressing rebinding, may be better strategies that could be applied without requiring multiple functionalized surfaces.

In order to probe the data for more information on the true presence of microheterogeneity, we have implemented a Bayesian tool that will report, without compromising the quality of fit, the distribution that is closest to a single class of sites. This improves on the property of the standard Tikhonov or maximum entropy regularization to always provide the broadest peaks possible, which becomes a limitation when asking whether or not the data themselves provide positive evidence of microheterogeneity. For the study of analyte binding to surface immobilized proteins that are uniform in solution, using the Bayesian prior expectation that the surface sites be uniform might be considered a better implementation of Occam’s razor. A key feature of this approach is that the pre-existing knowledge is not entered as a fixed constraint (which would virtually always lead to a low quality of fit that would statistically need to be rejected), but that the resulting distribution does have the freedom to add features to the solution (even those contradicting the prior knowledge) to the extent that they are essential to explain the new data. Further, the assignment of a Bayesian prior expectation for the surface site properties allows one to actively probe the space of possible distributions that are consistent with the data 30.
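The contrast between the two regularization strategies can be illustrated with a toy calculation. The sketch below assumes one plausible form of a prior-weighted penalty, in which each squared amplitude is divided by a prior weight; this form, the function name, the five-point grid, and the weights are assumptions for illustration and are not taken from the implementation used here. Both candidate distributions are assumed to fit the data equally well, so that only the penalty decides between them.

```python
def regularization_penalty(amplitudes, prior=None):
    """Tikhonov-like penalty: sum of squared amplitudes divided by prior weights.

    With prior=None all grid points are equally likely a priori (flat prior).
    """
    if prior is None:
        prior = [1.0] * len(amplitudes)
    return sum(a * a / w for a, w in zip(amplitudes, prior))

# Two hypothetical distributions with the same total capacity (100 RU)
narrow = [0.0, 0.0, 100.0, 0.0, 0.0]     # single class of sites
broad = [20.0, 20.0, 20.0, 20.0, 20.0]   # maximally spread out

# Flat prior: the broad distribution carries the smaller penalty
flat_narrow = regularization_penalty(narrow)
flat_broad = regularization_penalty(broad)

# Prior expecting a single class at the central grid point (assumed weights):
# amplitude away from the expected position is penalized more strongly
single_site_prior = [0.01, 0.01, 1.0, 0.01, 0.01]
bayes_narrow = regularization_penalty(narrow, single_site_prior)
bayes_broad = regularization_penalty(broad, single_site_prior)
```

In an actual fit the penalty is balanced against the goodness of fit, so amplitude at disfavored grid points survives only where the data require it, which is the behavior described above.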

When we applied this new tool to the surface binding data in the present study, we invariably observed that a single class of sites is insufficient to explain the binding data, even allowing for contaminating side reactions with far different binding parameters to be accounted for. This indicates that the environment of the immobilized sites is heterogeneous, and/or the ensemble of conformations of the attached protein experiences a dispersion when attached to the surface.

In conclusion, we believe that the method developed here will be useful to study in more detail than previously possible fundamental questions of protein surface immobilization, and protein binding to surface sites. Also, we believe that it provides a tool to address in more detail biological systems with naturally heterogeneous ensembles of molecules. Further studies are warranted, for example, to relate different populations of sites or the presence of microheterogeneity with the changes in the surface chemistry and polymer support, in order to optimize the properties of the functionalized surface.

Acknowledgment

We thank Drs. Zhaochun Chen and Robert Purcell for providing the antibodies used in this study. This research was supported by the Intramural Research Program of the National Institute of Biomedical Imaging and Bioengineering, NIH.

REFERENCES

  • 1. Karlsson R, Roos H, Fägerstam L, Persson B. Methods: A companion to Methods in Enzymology. 1994;6:99–110.
  • 2. Schuck P. Ann. Rev. Biophys. Biomol. Struct. 1997;26:541–566. doi: 10.1146/annurev.biophys.26.1.541.
  • 3. Schuck P. Curr. Opin. Biotechnology. 1997;8:498–502. doi: 10.1016/s0958-1669(97)80074-2.
  • 4. Hoefer U. Science. 1998;279:190–191.
  • 5. Cass T, Ligler FS. Immobilized Biomolecules in Analysis. Oxford University Press; 1998.
  • 6. Heinz WF, Hoh JH. Biophys J. 1999;76:528–538. doi: 10.1016/S0006-3495(99)77221-8.
  • 7. Heinz WF, Hoh JH. Trends Biotechnol. 1999;17:143–150. doi: 10.1016/s0167-7799(99)01304-9.
  • 8. Huang YW, Gupta VK. J Chem Phys. 2004;121:2264–2271. doi: 10.1063/1.1768155.
  • 9. Xu H, Zhao X, Grant C, Lu JR, Williams DE, Penfold J. Langmuir. 2006;22:6313–6320. doi: 10.1021/la0532454.
  • 10. Tobias DJ, Mar W, Blasie JK, Klein ML. Biophys J. 1996;71:2933–2941. doi: 10.1016/S0006-3495(96)79497-3.
  • 11. Kloss AA, Lavrik N, Yeung C, Leckband DE. Langmuir. 2000;16:3414–3421.
  • 12. Qian W, Yao D, Yu F, Xu B, Zhou R, Bao X, Lu Z. Clin Chem. 2000;46:1456–1463.
  • 13. Tuttle PVt, Rundell AE, Webster TJ. Int J Nanomedicine. 2006;1:497–505. doi: 10.2147/nano.2006.1.4.497.
  • 14. Runge AF, Mendes SB, Saavedra SS. J Phys Chem B. 2006;110:6732–6739. doi: 10.1021/jp056049e.
  • 15. Hodgkinson GN, Hlady V. Croatica Chemica Acta. 2007;80:405–420.
  • 16. O’Shannessy DJ, Brigham-Burke M, Peck K. Anal. Biochem. 1992;205:132–136. doi: 10.1016/0003-2697(92)90589-y.
  • 17. Schuck P. Biophys. J. 1996;70:1230–1249. doi: 10.1016/S0006-3495(96)79681-9.
  • 18. Svitel J, Boukari H, Van Ryk D, Willson RC, Schuck P. Biophys. J. 2007;92:1742–1758. doi: 10.1529/biophysj.106.094615.
  • 19. Leckband DE. Ann. Rev. Biophys. Biomol. Struct. 2000;29:1–26. doi: 10.1146/annurev.biophys.29.1.1.
  • 20. Minton AP. Biophys Chem. 2000;86:239–247. doi: 10.1016/s0301-4622(00)00151-4.
  • 21. Vijayendran RA, Leckband DE. Anal Chem. 2001;73:471–480. doi: 10.1021/ac000523p.
  • 22. Sips R. J. Chem. Phys. 1948;16:490–495.
  • 23. Haber-Pohlmeier S, Pohlmeier A. J. Colloid. Interface Sci. 1997;188:377–386. doi: 10.1016/j.jcis.2004.02.047.
  • 24. Puziy AM, Matynia T, Gawdzik B, Poddubnaya OI. Langmuir. 1999;15:6016–6025.
  • 25. Gun'ko VM, Leboda R, Turov VV, Villieras F, Skubiszewska-Zieba J, Chodorowski S, Marciniak M. J Colloid Interface Sci. 2001;238:340–356. doi: 10.1006/jcis.2001.7512.
  • 26. Rosovitz MJ, Schuck P, Varughese M, Chopra AP, Mehra V, Singh Y, McGinnis LM, Leppla SH. J Biol Chem. 2003;278:30936–30944. doi: 10.1074/jbc.M301154200.
  • 27. Svitel J, Balbo A, Mariuzza RA, Gonzales NR, Schuck P. Biophys. J. 2003;84:4062–4077. doi: 10.1016/S0006-3495(03)75132-7.
  • 28. Provencher SW. Comp. Phys. Comm. 1982;27:213–227.
  • 29. Sivia DS. Data Analysis. A Bayesian Tutorial. Oxford: Oxford University Press; 1996.
  • 30. Brown PH, Balbo A, Schuck P. Biomacromolecules. 2007;8:2011–2024. doi: 10.1021/bm070193j.
  • 31. Schuck P, Boyd LF, Andersen PS. In: Current Protocols in Protein Science. Coligan JE, Dunn BM, Ploegh HL, Speicher DW, Wingfield PT, editors. Vol. 2. New York: John Wiley & Sons; 1999. pp. 20.22.21–20.22.21.
  • 32. Provencher SW. Makromol. Chem. 1979;180:201–209.
  • 33. Schuck P. Biophys. J. 2000;78:1606–1619. doi: 10.1016/S0006-3495(00)76713-0.
  • 34. Myszka DG. J Mol Recognit. 1999;12:279–284. doi: 10.1002/(SICI)1099-1352(199909/10)12:5<279::AID-JMR473>3.0.CO;2-3.
  • 35. Ober RJ, Ward ES. Anal. Biochem. 1999;273:49–59. doi: 10.1006/abio.1999.4185.
  • 36. Sundberg EJ, Andersen PS, Gorshkova II, Schuck P. In: Protein Interactions: Biophysical Approaches for the Study of Complex Reversible Systems. Schuck P, editor. Vol. 5. New York: Springer; 2007. pp. 97–141.
  • 37. Vorup-Jensen T, Carman CV, Shimaoka M, Schuck P, Svitel J, Springer TA. Proc Natl Acad Sci U S A. 2005;102:1614–1619. doi: 10.1073/pnas.0409057102.
  • 38. Chen Z, Earl P, Americo J, Damon I, Smith SK, Zhou YH, Yu F, Sebrell A, Emerson S, Cohen G, Eisenberg RJ, Svitel J, Schuck P, Satterfield W, Moss B, Purcell R. Proc Natl Acad Sci U S A. 2006;103:1882–1887. doi: 10.1073/pnas.0510598103.
  • 39. Chen Z, Moayeri M, Zhou YH, Leppla S, Emerson S, Sebrell A, Yu F, Svitel J, Schuck P, St Claire M, Purcell R. J Infect Dis. 2006;193:625–633. doi: 10.1086/500148.
  • 40. Chen Z, Earl P, Americo J, Damon I, Smith SK, Yu F, Sebrell A, Emerson S, Cohen G, Eisenberg RJ, Gorshkova I, Schuck P, Satterfield W, Moss B, Purcell R. J Virol. 2007;81:8989–8995. doi: 10.1128/JVI.00906-07.
  • 41. Stapulionis R, Pinto Oliveira CL, Gjelstrup MC, Pedersen JS, Hokland ME, Hoffmann SV, Poulsen K, Jacobsen C, Vorup-Jensen T. J Immunol. 2008;180:3946–3956. doi: 10.4049/jimmunol.180.6.3946.
  • 42. Zacher T, Wischerhoff E. Langmuir. 2002;18:1748–1759.
