Assembly Reactions of HBV Capsid Protein into Capsid Nanoparticles Follow a Narrow Path Through a Complex Reaction Landscape

Roi Asor; Lisa Selzer; Christopher John Schlicksup; Zhongchao Zhao; Adam Zlotnick; Uri Raviv

doi:10.1021/acsnano.9b00648

Author manuscript; available in PMC: 2020 May 22.

Published in final edited form as: ACS Nano. 2019 Jun 25;13(7):7610–7626. doi: 10.1021/acsnano.9b00648

PMCID: PMC7243059 NIHMSID: NIHMS1581441 PMID: 31173689

Abstract

For many viruses, capsids (biological nanoparticles) assemble to protect genetic material and dissociate to release their cargo. To understand these contradictory properties, we analyzed capsid assembly for Hepatitis B virus, an endemic pathogen with an icosahedral, 120-homodimer capsid. We used solution X-ray scattering to examine trapped and equilibrated assembly reactions. To fit experimental results, we generated a library of unique intermediates, selected by umbrella sampling of Monte Carlo simulations. The number of possible capsid intermediates is immense, ~ 10³⁰, yet assembly reactions are rapid and completed with high fidelity. If the huge number of possible intermediates were actually present, maximum entropy analysis shows that assembly reactions would be blocked by an entropic barrier, resulting in incomplete nanoparticles. When an energetic term was applied to select the stable species that dominated the reaction mixture, we found only a few hundred intermediates, mapping out a narrow path through the immense reaction landscape. This is a solution to a viral application of the Levinthal paradox. With the correct energetic term, the match between predicted intermediates and scattering data was striking. The grand canonical free energy landscape for assembly, calibrated by our experimental results, supports a detailed analysis of this complex reaction. There is a narrow range of energies that supports on-path assembly. If the association energy is too weak or too strong, progressively more intermediates will be entropically blocked, spilling into paths leading to dissociation or trapped incomplete nanoparticles, respectively. These results are relevant to many viruses and provide a basis for simplifying assembly models and identifying new targets for antiviral intervention. They also provide a basis for understanding and designing biological and abiological self-assembly reactions.

Introduction

Large macromolecular complexes are ubiquitous in nature, yet the assembly paths that lead to thermodynamically stable products have rarely been determined. A challenging example is the assembly of viral capsids (biological nanoparticles). In about half of known virus families, the capsid is a spherical/icosahedral protein complex. Depending on the virus, capsids may package nucleic acid during assembly and/or remain empty, typically as a Trojan horse, a storage form, or an immature particle for subsequent nucleic acid packaging. In many cases, purified capsid protein (CP) molecules can be induced to spontaneously assemble into virus-like capsids. Paradoxically, capsids that protect their genome can be induced to dissociate to release their genetic cargo. The interactions between CP subunits should therefore support the formation of capsid structures that are both stable and dynamic.^1-5

To understand this reaction, many groups have examined assembly in vitro.^6-8 The simplest case is where CP subunits, under suitable buffer conditions, assemble into empty capsids by a nucleation-elongation mechanism.^9-11 Virus-like particles (VLPs) can also form around synthetic polymers, RNA, or DNA.^12-17 From a materials perspective, understanding the thermodynamic stability of virus capsid nanoparticles or VLPs is of great interest owing to their potential use in bionanomedical and bionanomaterial applications.¹⁸ VLPs may serve as protein nanocapsules, nanocarriers for nanoparticles, drug or gene delivery vehicles, biosensors, scaffolds for display of antigens for vaccine applications, or as nanoreactors for catalysis.^19-25

In vivo, the nucleoprotein core of Hepatitis B virus (HBV) assembles in the cytoplasm from 120 CP dimers, either as empty nanoparticles or while packaging viral RNA and viral reverse transcriptase. The reverse transcriptase becomes active after assembly, synthesizing a circular dsDNA genome from the linear ssRNA "pregenome" within the T = 4 icosahedral capsid (Figure 1). The mature DNA-filled core (and empty cores) can then be secreted from a cell while acquiring a protein and lipid envelope. In vitro, the dimers of the 149-residue assembly domain (cp149) can be driven to assemble into virus-like capsids by increasing ionic strength, reaction temperature, and cp149 concentration.^1-3,26 In vitro and in vivo,²⁷ there is a small population of T = 3 nanoparticles composed of 90 dimers (Figure 1). This assembly reaction has been characterized under a wide range of conditions. The results showed that cp149 protein is a well-behaved and tractable model system.^1-4 More recently, mass spectrometry and nanofluidic resistive-pulse sensing have provided improved single-particle and temporal resolution to study intermediates and capsid formation along the assembly pathway.^7,8,28,29

Figure 1: Atomic model (red and green) and graph representation (blue) of T = 3 (left) and T = 4 (right) icosahedral capsid symmetries. Each vertex of the graph is at the geometric center of a dimer subunit, whereas each edge corresponds to a contact between adjacent dimers. Red and green colors represent the two dimer positions. The three types of interactions (the edges of the graph) are indicated by white letters. T is the triangulation number, defined as the square of the distance between two adjacent five-fold vertices. 60T is the number of monomer subunits comprising the icosahedral structure. T = 3 contains 90 dimers (180 monomers) and T = 4 has 120 dimers (240 monomers). The outer diameters of T = 3 and T = 4 are 28 and 32 nm, respectively.

To attain a comprehensive description of the ensemble of species in HBV capsid assembly reactions, we collected solution small-angle X-ray scattering (SAXS) data and deconvolved them to a library of model structures derived from umbrella sampling Monte Carlo (MC) simulations of assembly. Using our home-developed software, D+ (https://scholars.huji.ac.il/uriraviv/software/d-software),^30,31 we computed the solution scattering curve contribution from the atomic model of each intermediate. Because of the extraordinary number of possible intermediates, unrestrained selection of intermediates using a maximum entropy approach^32,33 yielded a chemically and experimentally unrealistic result, a recapitulation of the Levinthal paradox.³⁴ By incorporating thermodynamic constraints on the assembly reaction, we dramatically restricted the number of products and obtained compelling interpretations of the data.

We found that a narrow range of dimer-dimer association free energy led to assembly without intermediates. Under those conditions, the path of minimum grand canonical free energy starts at the reactants and ends at the final global minimum energy products, passing through a downhill energy landscape with minimal potential for pathways that could lead to out-of-equilibrium metastable, kinetically trapped intermediates. Under- or over-shooting the ideal energy opened up reaction pathways that led to progressively more kinetically trapped intermediates.

Results and Discussion

Determining capsid assembly reaction products

At low temperature and low ionic strength, cp149 is exclusively dimer

The initial state of the capsid protein solution is critical for the study of its assembly process. Partial assemblies and nonspecific aggregates are likely to affect the assembly path or yield. Figure 2 shows the absolute scattering intensity curve (see Section 1a in the supporting information, SI) from the initial capsid protein solution. Using D+ software (https://scholars.huji.ac.il/uriraviv/software/d-software),^30,31 the curve was fitted to the atomic model of a solvated capsid protein dimeric subunit, where the electron density and the thickness of the hydration shell were fitted to the scattering data. The excellent agreement with the calculated intensity indicates that the initial state for the assembly reaction was pure cp149 dimer. Within the limited resolution of the SAXS curve (about 1 nm), the conformation of the dimer in its soluble state was similar to its conformation in the capsid. Significant conformational differences between the free dimer and capsid states are still possible.³⁵ The structure of the solvated dimer, used for fitting the data in Figure 2 (measured at 9 °C), served as the basic subunit for constructing the entire configurational space of intermediates, which was later used for fitting the data at all temperatures. Additionally, we used these data to set the hydration layer parameters of all the other intermediate structural models, eliminating the need to fit additional parameters in the subsequent structural analyses. The assumption in this approximation was that, in the intermediate assemblies, the contribution to the scattering intensity from the overlap between the hydration shells of the dimers was negligible. This assumption was validated by verifying that the computed scattering curve from the hydrated capsid was similar to the curve of a capsid that was computed from hydrated dimers (see Figure S1 and Section 1b in the SI).

Figure 2: Cp149 is dimeric. Azimuthally integrated background-subtracted absolute small angle X-ray scattering intensity (see Section 1a in the SI) from 25 μM (0.9 mg/mL) cp149, at 50 mM HEPES, pH 7.5, at 9 °C (blue symbols) fit to a hydrated model dimer (red line). Aggregates were avoided by applying the protocol explained in Sample Preparation in Materials and Methods. Measurement error bars are shown in gray. The modeled intensity was calculated using the atomic coordinates of cp149 dimer (PDB ID 2G33). Hydrogen atoms were added to the PDB file using the MolProbity server³⁶ and the hydration layer was fitted using D+ software. The best fitted values for the thickness and electron density of the hydration layer were 2 Å and 363 e/nm³, respectively.

The NaCl concentration for the formation of capsids decreases with temperature

SAXS data recapitulated the ionic strength and temperature dependence of assembly observed in earlier studies^1,2,4,8,26,37,38 but provided new insights. Assembly of 20 μM cp149 was initiated by mixing 25 μM cp149 in 50 mM HEPES at pH 7.5 with a concentrated buffered NaCl solution, to final NaCl concentrations between 0 and 1 M, in 50 mM HEPES at pH 7.5. Figure 3 shows the scattering intensity curves measured after several hours of incubation. At 9 or 25 °C, 300 mM NaCl was needed to form capsids. At 36 °C, capsid-like particles were already evident at 50 mM NaCl.

Figure 3: Equilibrated or trapped cp149 assembly reactions at different temperatures and NaCl concentrations. The figure shows the background-subtracted signals (blue symbols), the standard deviations (gray bars), and fitted models (red curves). Each curve was modeled by fixing the total protein concentration to the experimental value and fitting the interdimer association free energy for T = 4 capsids and the relative association energy for T = 3 capsids (Equations 26 and 27). The 9 °C panel presents the atomic structure of cp149 dimer (bottom left) and T = 4 capsid (upper right), not at the same scale. The values of the best fitted interdimer association free energies are provided in Figure 7 and Table S1.

Enumeration of the "en-route" intermediate sample space during the assembly process

To analyze SAXS data for an assembly reaction where there is likely to be a broad ensemble of intermediates, we created an appropriate series of models. $T_{n}^{s,c}$ is an icosahedral capsid or intermediate, whose triangulation number is n, and is made of s capsid protein dimer subunits, held together by c dimer-dimer contacts. An intact T = 4 particle would be $T_{4}^{120, 240}$ . A SAXS curve is influenced by both the size and the shape of the objects that contribute to the scattering intensity as well as by the distribution of the total protein mass between these objects (see Section 1d in the SI). In this enumeration the size and shape depend on s, c, and the manner by which the assembled dimers are connected to each other. There may be many different arrangements of subunits that match a given s,c index.

To construct the sample space of the possible "en-route" capsid intermediates, we examined the configuration space in the s-c plane. The configuration space includes information about the range of possible intermediates (with c values between c_min (s) and c_max (s) for each intermediate size, s), and about the degeneracy for each $T_{n}^{s,c}$ intermediate type (see Materials and Methods). Once the configuration space is defined, the probability of finding a specific structure for a given experimental condition can be estimated from the relevant partition function. Figure 4a shows the distribution of states (Ω_4,s,c) of T = 4 intermediates based on MC simulation where the sampling was not biased by the stability of intermediates (see Materials and Methods). T = 4 and T = 3 configurations had similar distributions of intermediates with a peak in heterogeneity above the midpoint in size (Figures S9 and S10 and Sections 2a and 2b in the SI). A similar distribution was seen for smaller geometric figures (Figure S13 and Section 2d in the SI).⁴ Note that the distribution is asymmetric, with a peak when ≈ 2/π of the capsid is complete (Figure S10), because only connected structures were allowed. As the occupancy increased, the connectivity constraint excluded fewer states and shifted the maximum number of allowed intermediates towards larger sizes. Viewed from a different perspective, large intermediates have a broad range of holes.⁴
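The idea of a degeneracy Ω_s,c over connected structures can be illustrated on a graph small enough to enumerate exhaustively. The sketch below is not the umbrella-sampling MC procedure used for the capsid; it is a brute-force count on a hypothetical toy contact graph (the corners and edges of a cube), and it counts labelled configurations without any symmetry reduction.

```python
from itertools import combinations
from collections import Counter

# Toy contact graph (assumption, not a capsid): the 8 corners of a cube.
# Two corners are in contact when their 3-bit labels differ in exactly one bit.
VERTICES = list(range(8))
EDGES = [(i, j) for i in VERTICES for j in VERTICES
         if i < j and bin(i ^ j).count("1") == 1]

def is_connected(subset, edges):
    """Depth-first search restricted to the occupied vertices."""
    subset = set(subset)
    start = next(iter(subset))
    seen, stack = {start}, [start]
    while stack:
        v = stack.pop()
        for a, b in edges:
            if a == v and b in subset and b not in seen:
                seen.add(b)
                stack.append(b)
            elif b == v and a in subset and a not in seen:
                seen.add(a)
                stack.append(a)
    return seen == subset

def count_states(vertices, edges):
    """Return Omega[(s, c)]: number of connected vertex subsets of size s
    whose occupied vertices share c contacts (edges)."""
    omega = Counter()
    for s in range(1, len(vertices) + 1):
        for subset in combinations(vertices, s):
            if is_connected(subset, edges):
                occupied = set(subset)
                c = sum(1 for a, b in edges if a in occupied and b in occupied)
                omega[(s, c)] += 1
    return omega

omega = count_states(VERTICES, EDGES)
for (s, c), n in sorted(omega.items()):
    print(f"s={s} c={c} Omega={n}")
```

Exhaustive enumeration scales as 2^N in the number of sites, which is why a 120-site capsid graph requires sampling rather than counting.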

Figure 4b maps the distribution of states from the c − s plane to the D_C − s plane, where

$$D_{C} \equiv \frac{c(s) - c_{\min}(s)}{c_{\max}(s) - c_{\min}(s)} \tag{1}$$

is the degree of connectivity. In this representation the most compact structures (highest degree of connectivity) are along the D_C(s) = 1 line and the structures with the lowest degree of connectivity are found along the D_C(s) = 0 line. The dimer and full capsid were placed at (s, D_C) = (1, 1) and (s, D_C) = (120, 1), respectively.
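As a minimal sketch of Equation 1, the function below maps a contact count onto the degree of connectivity. The value c_max = 145 for a 76-mer is a hypothetical placeholder chosen only for illustration; c_min = s − 1 follows from the requirement that the structure be connected. Returning 1 when c_min = c_max mirrors the placement of the dimer and the full capsid at D_C = 1.

```python
def degree_of_connectivity(c, c_min, c_max):
    """Equation 1: D_C = (c - c_min) / (c_max - c_min).

    When only one contact count is possible (c_min == c_max), the state is
    placed at D_C = 1, as done for the dimer and the full capsid.
    """
    if c_max == c_min:
        return 1.0
    return (c - c_min) / (c_max - c_min)

# Hypothetical 76-mer: c_min = 75 (a chain of dimers), c_max = 145 (placeholder value).
for c in (75, 110, 140, 145):
    print(c, round(degree_of_connectivity(c, 75, 145), 3))
```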

Most intermediate states were based on extended strings of subunits or had many holes (D_C ≃ 0.3). The states with the smallest number of configurations correspond to the most compact structures, which also have the maximal number of contacts for a given size (and minimal number of holes). This is demonstrated in Figure 4c for intermediates of size 76, which pass through the peak of the distribution. The total number of possible states, Ω_total = Σ_n Σ_s Σ_c Ω_n,s,c, based on simulations, was ≃ 1.82 × 10³⁰. The distribution of states sampled by our MC simulation provides new information on the configurational degeneracy of the capsid assembly process.

The zero energy probabilities alone, however, were insufficient to determine the composition of intermediates of the assembly reaction at equilibrium. Attempts to fit SAXS data using a maximum entropy approach, leading to the broadest intermediate distribution that the scattering data do not exclude,^32,33 failed because of the overwhelming number of incomplete capsid states whose calculated scattering curves are similar to that of a T = 4 particle (see Figures S2-S4 and Section 1c in the SI for a discussion about maximum entropy analysis). A fit with this nonrepresentative library suggests a large concentration of incomplete intermediates, contrary to experimental observations by charge detection mass spectrometry and nanofluidic resistive-pulse sensing,^8,28,29 which showed that T = 4 capsid is the major reaction product in mild assembly conditions. The unedited library is essentially the Levinthal prediction that the number of possible intermediates is so great that only a small fraction can reasonably be sampled.³⁴ In retrospect, the unedited library of intermediates requires the unrealistic assumption that all intermediates are equally accessible. A more rigorous weighting of the library is based on the complete grand canonical partition function, which describes the thermodynamics of macromolecular self-assembly processes. The important caveat of this weighting is that it assumes reactions are approximately equilibrated.

Figure 4d demonstrates the effect of biasing the distribution of intermediates of size 76 by the association free energy. At equilibrium, when the association free energy is included, the conformational space is confined to intermediates with the highest degree of connectivity. For the 76-mers, intermediates with <140 contacts are predicted to make a negligible contribution to the reaction products. The incorporation of association energy into the thermodynamic model (the exponential term in Equation 2) reduces the number of intermediates used to model the reaction products to a few hundred.

Experimental SAXS data show excellent agreement with the thermodynamic analysis

Making the assumption that intermediates are limited to a few hundred on-path structures, we can determine the association energy for T = 4 and T = 3 capsids by obtaining the best fit of the Boltzmann weighting factor to the observed distribution of species. The two free energies per contact, ΔF₄ and ΔF₃, for the T = 4 and T = 3 capsid symmetries, respectively, are the only adjustable terms in the equation. In practical terms, minimizing the total free energy of the grand canonical ensemble (see Equation 16 in Materials and Methods) yielded the molar fraction of dimer subunits, X_n,s,c, in each of the possible $T_{n}^{s,c}$ intermediate structures in the configurational space:

$$X_{n,s,c} = s \cdot \exp\left(-\frac{c\,\Delta F_{n} - k_{B} T \ln \Omega_{n,s,c} - s \cdot \mu_{1}}{k_{B} T}\right) \tag{2}$$

Equation 2 (derived in Materials and Methods) shows that the total mole fraction of protein in each intermediate, X_n,s,c, is a function of (i) the configurational entropy term, k_BT ln Ω_n,s,c, determined from our MC simulations, (ii) the chemical potential of free dimer μ₁, set by the total protein concentration and the law of mass conservation, and (iii) the two free energies per contact, ΔF₄ and ΔF₃. As previously noted, s is the number of dimeric subunits, c is the number of dimer-dimer contacts, and k_BT is Boltzmann’s constant multiplied by the absolute temperature.
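To make the role of mass conservation concrete, the sketch below evaluates Equation 2 for a small list of species and solves for μ₁ by bisection so that the mole fractions sum to the total. It is an illustration under stated assumptions, not the fitting code used in this work: energies are expressed in units of k_BT, the species list contains only the free dimer and the full T = 4 capsid with ln Ω = 0, the value ΔF₄ = −7.5 k_BT is hypothetical, and the total mole fraction is taken as 20 μM divided by 55.5 M.

```python
import math

def log_fractions(species, dF_kT, mu1_kT):
    """ln X for each (s, c, ln_omega) from Equation 2, energies in units of k_B T."""
    return [math.log(s) - (c * dF_kT - ln_omega - s * mu1_kT)
            for s, c, ln_omega in species]

def log_sum_exp(values):
    """Numerically stable ln(sum(exp(v))) to avoid overflow for large capsids."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))

def solve_mu1(species, dF_kT, x_total, lo=-100.0, hi=0.0, iters=200):
    """Bisection on mu_1: the summed mole fraction grows monotonically with mu_1."""
    target = math.log(x_total)
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if log_sum_exp(log_fractions(species, dF_kT, mid)) > target:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)

# Hypothetical two-species library: free dimer and full T = 4 capsid.
species = [(1, 0, 0.0), (120, 240, 0.0)]
x_total = 20e-6 / 55.5            # total dimer mole fraction (assumed conversion)
dF4_kT = -7.5                     # hypothetical contact free energy in k_B T

mu1 = solve_mu1(species, dF4_kT, x_total)
fractions = [math.exp(v) for v in log_fractions(species, dF4_kT, mu1)]
print("mu_1 / k_B T =", mu1)
print("mass fraction in dimer, capsid:", [x / x_total for x in fractions])
```

Working with ln X rather than X keeps the 240-contact exponent from overflowing while the bisection is still far from the solution.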

Figure 3 shows the experimental and the fitted scattering intensity models (see Equation 26 in Materials and Methods) for the data set measured at a protein concentration of 20 μM. The results show remarkable agreement with the thermodynamic model throughout the entire range of conditions and scattering angles. Small deviations from the model were only observed under extreme experimental conditions and can be seen at the low q range (NaCl concentration of 0.5 M or higher; and temperature of 25 °C or higher; see Figure S14 and Section 3 in the SI).

Figure 5 maps the temperature - NaCl concentration phase space at 20 μM cp149, based on the level of agreement in Figure 3 between the data and our thermodynamic model. At high salt and temperature, deviations were classified into two groups according to the difference between the data and the best fitted model. Nonetheless, under a broad range of conditions, the thermodynamic model provided an excellent prediction of the composition of the capsid assembly reaction products at equilibrium. The quality of the fit indicates that most of the observed intermediates closely resembled fragments of T = 4 and T = 3 capsids with appropriate radii of curvature; only under extreme conditions did a small fraction of off-path and/or aggregated products accumulate. Figures S20 - S22 suggest that, using slow dialysis, the borders (broken curves) in Figure 5 could have been extended to somewhat higher salt concentrations and temperatures.³⁸

At equilibrium, our thermodynamic analysis predicts that the main products of our assembly reactions were dimer subunits and full capsids (Figure 6). Large intermediates were not present at detectable concentrations. Throughout the data set, more than 99% of the protein mass was found as dimer and full capsids. Most of the rest of the protein mass comprised tetramers (dimers of dimers) and hexamers (trimers of dimers). The mass fractions of other intermediate structures were lower than the detection limit in our SAXS measurements and modeling (see Sections 1d and 1e in the SI). At high ionic strength the major assembly product was T = 4 capsid. The entire data set followed a universal curve (Figure 6d).³⁹

Figure 6: Mass fractions of the major assembly reaction products (free dimers, T = 3 capsids, and T = 4 capsids), as a function of NaCl concentration at (a) 9 °C, (b) 25 °C, and (c) 36 °C. The insets show mass fractions of dimer of dimers (D₂), and trimer of dimers (D₃). Standard deviations in the mass fractions were calculated by allowing a deviation of 10% in the measured total protein concentration (in the thermodynamic model, the total protein concentration was a constraint). (d) Universal curve of the law of mass action,³⁹ given by the fraction of capsids vs. the supersaturation parameter ( $X_{Total}/X_{1}$ ) for all the measured NaCl concentrations and temperatures. X_Total and X₁ are the measured total protein and dimer molar fractions, respectively.

In Section 1d in the SI, we analyzed our detection limit for cases in which the thermodynamic model fitted the SAXS data well. Figure S5 shows that we cannot exclude the presence of ~ 2% or less of small intermediates (60-mer or less). Charge detection mass spectrometry (CDMS) measurements at 1 M NaCl showed that a fraction of incomplete capsids, containing between 90 and 120 dimers, may accumulate.⁴⁰ Figure S6 shows that we cannot exclude the presence of the CDMS incomplete capsid intermediates with a total mass fraction of ~ 5% or less. As we shall show later, the grand canonical free energy landscape of the entire conformational space does not predict local free energy minima for the CDMS large intermediates. These intermediates could have been kinetically trapped. Clustering algorithm analysis (see Materials and Methods and Section 1e in the SI) was applied to our intermediate library of computed scattering curves. Figure S7 shows that the full capsid mass fraction could have included the mass fraction of nearly complete capsids, missing between 1 and 5 dimers. As the intermediate size decreased, the clusters became narrower. Selected electron micrographs and size exclusion chromatography measurements lend additional support to the identity and mass fraction of the structures inferred from our SAXS data analysis (Figure S8).

Effect of temperature and salt on the association free energy

In all the reaction conditions, the SAXS data could be well fitted to ensembles of assembly products that were predicted from our thermodynamic analysis, which only required two fitting parameters for a given temperature and ionic strength: ΔF₄ and the association energy ratio $α \equiv Δ F_{3}/Δ F_{4}$ . Under mild, low ionic strength assembly conditions, T = 4 particles were predominant and more stable, α ~ 0.995. Under conditions where α ~ 0.996, T = 3 particles accumulated (see Table S1 and Section 4 in the SI), suggesting that slight variation in the association free energy per contact in the T = 4 and T = 3 symmetries can significantly change the preferred reaction products (see Figure S16 and the α sensitivity analysis in Section 4.1 in the SI). The effects of small changes in α are magnified because association energies are defined per dimer-dimer interaction, while capsids are based on 180 or 240 such interactions for T = 3 and T = 4 symmetry, respectively. This rationale, however, does not explain why T = 3 capsids appear much earlier in a reaction than do T = 4 capsids.³⁷
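The magnification of small changes in α can be checked with a short calculation based on Equation 2. The sketch below compares the Boltzmann weight of a complete T = 3 capsid (180 contacts) for α = 0.995 and α = 0.996, holding μ₁, Ω, and ΔF₄ fixed. The value ΔF₄ = −4.5 kcal/mol and the temperature of 298.15 K are illustrative assumptions, and in a real reaction μ₁ would readjust through mass conservation.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol K)

def t3_fraction_ratio(alpha_a, alpha_b, dF4_kcal, temp_K, contacts=180):
    """X(T=3; alpha_b) / X(T=3; alpha_a) from Equation 2 at fixed mu_1 and Omega."""
    d_energy = contacts * (alpha_b - alpha_a) * dF4_kcal  # kcal/mol per capsid
    return math.exp(-d_energy / (R_KCAL * temp_K))

# Hypothetical inputs: dF4 = -4.5 kcal/mol per contact, room temperature.
ratio = t3_fraction_ratio(0.995, 0.996, dF4_kcal=-4.5, temp_K=298.15)
print(f"T = 3 weight changes by a factor of {ratio:.2f}")
```

Under these assumptions, a shift of 0.001 in α amounts to about 0.8 kcal/mol per capsid, which changes the T = 3 weight roughly fourfold.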

Figures 7 and S17 show the association free energy per contact of T = 4 capsids, ΔF₄, as a function of temperature at different salt concentrations. At 50 mM NaCl, the variation in the free energy with increasing temperature (15 ± 0.4 cal/(mol·K)) is about half the value at high NaCl concentrations (32.6 ± 0.8 and 33.6 ± 0.3 cal/(mol·K) for 300 and 500 mM NaCl, respectively). These values correspond to the entropy gained when a contact forms between two dimers. Positive entropy is consistent with the known entropy-driven assembly reactions. The gained entropy increased with NaCl concentration, as previously reported.²⁶ This increase was attributed to a salt-induced dimer conformational change that increased the total hydrophobic surface area participating in the interdimer interaction.
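The entropy estimate follows from treating ΔF₄ as linear in temperature, ΔF₄ = ΔH − TΔS, so that the slope of ΔF₄ versus T equals −ΔS. The sketch below recovers ΔS from a least-squares line. The data points are synthetic, generated from hypothetical values (ΔH = 5.0 kcal/mol, ΔS = 32.6 cal/(mol·K)) at the three measurement temperatures; they are not the measured free energies.

```python
def linear_fit(xs, ys):
    """Ordinary least-squares line y = slope * x + intercept."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

# Synthetic illustration (hypothetical dH and dS, not measured data).
temps_K = [282.15, 298.15, 309.15]          # 9, 25, and 36 degrees C
dH_kcal, dS_kcal = 5.0, 32.6e-3             # kcal/mol and kcal/(mol K)
dF_kcal = [dH_kcal - t * dS_kcal for t in temps_K]

slope, intercept = linear_fit(temps_K, dF_kcal)
entropy_cal = -slope * 1000.0               # cal/(mol K)
print(f"dS = {entropy_cal:.1f} cal/(mol K), dH = {intercept:.2f} kcal/mol")
```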

Figure 7: The association free energy per contact of T = 4 capsids, ΔF₄, as a function of temperature at different NaCl concentrations (or Debye screening lengths, κ⁻¹, at room temperature), as indicated. Data are shown if the quality of the fit to the thermodynamic model was classified in Figure 5 as good (blue or orange symbols) and three temperatures were measured. If fewer temperatures were measured, the values of ΔF₄ can be found in Figure S17, which plots ΔF₄ as a function of κ⁻¹. The left y-axis corresponds to the extracted free energies based on Equation 2. The right y-axis was shifted by 1.2 kcal/mol to simplify the comparison of our association free energies to an earlier work,²⁶ where the free energies were calculated on the molar concentration scale rather than the molar fraction scale used here, leading to a difference of 0.5 k_BT ln 55.5 per contact. The solid lines were obtained from a linear fit to the data. The slopes estimate the contribution of the entropy to the binding free energy of two dimers: 15 ± 0.4, 32.6 ± 0.8, and 33.6 ± 0.25 cal/(mol·K) for 50, 300, and 500 mM NaCl, respectively.
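The 1.2 kcal/mol axis shift can be verified directly from the stated difference of 0.5 k_BT ln 55.5 per contact. The short check below assumes room temperature is 298.15 K and uses the molar gas constant in place of k_B to obtain a per-mole value.

```python
import math

R_KCAL = 1.987204e-3     # gas constant in kcal/(mol K)
T_ROOM_K = 298.15        # assumed room temperature
WATER_MOLARITY = 55.5    # mol/L, converts molar to mole-fraction scale

shift_kcal = 0.5 * R_KCAL * T_ROOM_K * math.log(WATER_MOLARITY)
print(f"scale shift per contact = {shift_kcal:.2f} kcal/mol")
```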

Conversely, the decrease of the dimer-dimer association free energy with NaCl concentration is attributed to stronger screening of the Coulomb electrostatic repulsion between the charged dimers.³⁹ The association free energy, however, was nearly constant when the screening length, κ⁻¹, was 0.95 nm or higher (at κ⁻¹ < 0.95 nm the free energy decreased with increasing ionic strength, see Figure S17 and Section 5 in the SI). Figure S18 and Section 6 in the SI show our measurement sensitivity at high κ⁻¹.

Attempting to fit the full range of κ⁻¹ (Figure S17) with a temperature-dependent hydrophobic attraction and screened Coulomb repulsion model³⁹ was not successful, especially in the middle range of ionic strengths. Furthermore, there is no obvious reason that ionic strength should change the ΔF₃/ΔF₄ ratio. These results indicate that a more complex analysis will be necessary to understand the salt dependence of cp149 assembly.

Deviations from the "en-route" equilibrium model led to higher mole fractions of small assemblies and accumulation of large aggregates

At high temperature and high ionic strength the scattering data slightly deviated from the "en-route" thermodynamic model (Figure 5). The deviations predominantly appeared at the low q range, where the measured scattered intensity was higher than the model. Higher intensity at low q indicates that particles with molecular weights higher than full capsid were present.²⁹ Alternatively, interactions between particles could have led to positional correlations or pairs of incomplete capsids, for example. In addition, the intensity oscillations that characterized the full capsid were slightly smeared, consistent with a higher proportion of T = 3 particles, which was also observed at high ionic strength (Figure 6). At higher protein concentration, similar behavior was observed but with a tendency for off-path particles to accumulate at lower NaCl concentration (Figure S19a and Section 7 in the SI). The higher protein concentration improved the signal-to-noise ratio and extended the fit of the atomic model to a higher resolution (Figure S19a).

To determine whether the sudden jump in ionic strength was leading to kinetic traps, we examined the effect of slowly increasing the ionic strength from 100 to 1000 mM. In agreement with a recent gel electrophoresis analysis,³⁸ we found that when a gradual, four-step dialysis protocol was used, particles with diameters other than T = 4 did not accumulate (Figures S20 - S22 and Section 8 in the SI). More specifically, Figure S21 shows that more particles with an outer diameter of about 28 nm accumulated when the ionic strength was rapidly increased. The signal-to-noise ratio, however, prevented us from determining whether T = 3 and/or partially collapsed T = 4 capsids also accumulated.^29,41 It has been shown that, owing to hysteresis, there is very little dissociation when lowering the salt concentration.⁴²

Thermodynamic stability analysis provides insight into successful assembly conditions

Our experimental results show that successful assembly reactions, which start with pure dimer and end with complete T = 4 and T = 3 capsids, have a narrow optimum of dimer-dimer association free energies. Though assembly was largely successful over a broad range of conditions, when the free energy of association was weaker than the lower bound, ΔF_min, no capsids were detected. At association free energies stronger than the upper bound, ΔF_max, off-path assembly products appeared and the yield of T = 4 particles decreased. The boundaries for 20 μM cp149 were between −4.2 and −5.4 kcal/mol. These limits were sensitive to protein concentration. The association free energy values were similar to an earlier report,²⁶ after rescaling the chemical potentials from the molar scale (in the earlier study) to the molar fraction scale applied in this study (see Figure 7 and Equation 24).

Additional insight into the thermodynamic driving force for assembly can be obtained from the grand canonical free energy landscape at the onset of assembly. The free energy landscape is given by

$$\Theta(\Delta F_{n}, \mu_{1,t=0}) = c \cdot \Delta F_{n} - k_{B} T \ln \Omega_{n,s,c} - s \cdot \mu_{1,t=0} \tag{3}$$

Eq. 3 describes the supersaturation state of the dimer solution at the initial time (t = 0) of the assembly reaction.⁴³ At the onset of assembly, dimer concentration and chemical potential (μ_1,t) are maximal. Figure 8 plots the heat map of the initial grand canonical free energy in the size-degree of connectivity (s − D_C) plane for 20 μM cp149 at three different association free energies. Figures S23 - S25 show the same energy landscapes in the s − c plane. Below ΔF_min, the free energy landscape shows that the dimeric form is the most stable state (Figure 8a, purple color). At this association energy, a low energy path to the full capsid must cross several free energy barriers, shown by the contour lines along the red arrow, hence little assembly is likely to take place.

When the association free energy is slightly higher than ΔF_min (Figure 8b), the global minimum shifts to the complete capsid state. In this case, the lowest energy path that connects the local minimum at the dimeric state and the global minimum at the full capsid state crosses a wide and flat free energy barrier (between s = 20 and s = 85). As the contour lines are roughly parallel to the lowest energy path throughout most of the assembly path from dimer to capsid, the probability of going off-path (in other words, moving away from the lowest energy path and crossing multiple contour-line barriers, as illustrated by the black arrow) is very low. This kind of free energy landscape maximizes successful assembly.

As the association free energy is further strengthened to ~ ΔF_max (Figure 8c), the contour lines deviate from the lowest energy path (for example, the contour marked by the black arrow), which increases the probability of sampling unstable structures (low D_C). Once mistakes are made, strong association energy creates high barriers for returning to the lowest energy path, which may require simultaneous dissociation of several dimer subunits. Also, the concentration of free dimers may at this stage be too low for filling holes in the structure. Similar kinetically trapped structures may form at high protein concentrations (Figure S19b and c).

The apparent narrow range of inter-subunit association free energy that can lead to successful assembly without accumulation of kinetic traps is consistent with earlier predictions.^4,10,44,45 Theoretical studies using either coarse grained molecular dynamics simulations to follow the formation of T = 3 capsids⁴⁴ or a combination of simulations and theory to elucidate possible mechanisms in kinetically trapped viral and general self-assembly processes^45,46 discussed the crucial role of weak and reversible interactions for avoiding kinetic traps. In a reversible process, mistakes may be corrected during the growth of the assembled cluster, keeping the assembly path close to its most stable state (for a given cluster size). At the same time, the subunit concentration slowly decays, allowing the system to attain equilibrium. Under those conditions, the reaction path may be described by considering a small number of intermediate structures at the lowest energy path, as predicted by our grand canonical free energy landscape (Figure 8).

Figure 9a shows the minimum grand canonical free energy path at the onset of the assembly reaction (red arrows in Figure 8), predicted by our model (choosing only c_max (s) in Equation 3), as a function of size, s, at different ΔF₄ values. As ΔF₄ becomes more negative the barrier for assembly decreases from 50 to 12 k_BT and the critical assembly size, beyond which assembly continues downhill in the energy landscape, decreases from 43 to 13 dimers. Notably, our free energy landscapes predict that no local minima are expected beyond the assembly barrier. At high ΔF₄, local minima exist at smaller intermediates, where the most dominant and stable intermediate is a complete face of a T = 4 capsid, comprising 10 dimers. These results show that at strong association free energy, both the thermodynamic barrier for assembly and the barriers for creating less stable intermediates are very small and therefore the assembly process is expected to be kinetically controlled and sensitive to solution conditions and initial protein concentration. Aggressive assembly conditions can lead to a crossover from successful assembly to kinetically trapped reactions.⁴⁷ As the assembly barrier is relatively flat, we expect a fast depletion of dimers and increase in the concentrations of a broad range of off-pathway intermediates. The higher probability for accumulation of unstable intermediates and the low concentration of free dimer may lead to misassembled particles, as detected in our SAXS measurements.

Finally, Figures 9b and c compare the minimum grand canonical free energy paths at the onset of the assembly of T = 3 and T = 4 capsids when the association free energies are moderate (Figure 8b) or strong (Figure 8c). The free energy per dimer (Figure 9b) and the free energy per particle (Figure 9c) are similar as long as the intermediates are smaller than the size of the intermediates at the highest free energy barrier. Beyond the barrier, however, the slope of the T = 3 free energy landscape is steeper³⁷ but eventually complete T = 4 is more stable than complete T = 3.

Materials and Methods

Sample preparation

The N-terminal truncated dimer, cp149, was expressed in E. coli using a pET 11-based vector and purified as described.⁵⁰ Stored samples of cp149 dimer often had small concentrations of large complexes. To start the assembly reaction from pure capsid protein dimer, samples of cp149 were treated with urea to dissociate complexes; the samples were desalted and could be kept for several days at 4 °C and low ionic strength. Prior to SAXS data collection, purified cp149 dimer was dialyzed for 1.5 h three times at 4 °C, once against 3 M urea in 50 mM Tris-HCl buffer at pH 7.5, and then twice against 50 mM HEPES buffer at pH 7.5. Figure 2 shows the scattering from 25 μM cp149 that was measured after the third dialysis. Protein concentration was determined by UV-vis absorbance using an extinction coefficient of 60,900 M⁻¹cm⁻¹. Cp149 dimer and 5× salt solution (5× NaCl and 50 mM HEPES buffer at pH 7.5) were mixed at volume fractions of 4:1, respectively. This ratio minimized protein dilution and allowed a better signal-to-noise ratio in scattering measurements. The mixed solution was equilibrated at 4 °C, room temperature, or 36 °C for ~5 hours prior to SAXS data collection. The scattering from the pure dimer solution, shown in Figure 2, was measured at 9 °C immediately following dialysis. For the measurements presented in Figure S20, cp149 was treated with 2 mM dithiothreitol (DTT) prior to the urea treatment, and the final buffer exchange to 50 mM HEPES pH 7.5 was done using a PD-10 desalting column. For these measurements, 56 μM of cp149 protein was either mixed with 5× NaCl solution or gradually dialyzed against NaCl solution to reach the final salt concentration. The gradual dialysis protocol included four dialysis steps to slowly increase the NaCl concentration to 1 M: 0 to 100 mM, 100 to 300 mM, 300 to 500 mM, and 500 to 1000 mM.

SAXS Measurement Setup and Data Reduction.

Solution small-angle X-ray scattering (SAXS) measurements of capsid assembly, shown in Figures 2, 3 and S19, were performed at the P12 EMBL BioSAXS Beamline (headed by D. Svergun) in PETRA III (DESY, Hamburg).⁵¹ Measurements were taken using an automated sample changer setup⁵² in which samples were stored on a temperature controlled plate and injected into a 2 mm diameter quartz capillary that was previously equilibrated at a desired temperature. The wavelength of the incident X-ray beam was 0.124 nm and the scattering intensity was recorded on a single-photon 2D PILATUS 2M pixel detector (DECTRIS). The products of the NaCl induced assembly reactions were measured at temperatures of 9, 25 and 36 °C to prevent samples from re-equilibrating. The sample-to-detector distance was 3 m, resulting in q_min = 0.025 nm⁻¹ and q_max = 4.8 nm⁻¹. For each assembly condition, between 30 and 35 μL of sample were injected into the measurement cell and between 30 and 35 frames were measured while the sample was flowing through the capillary. The exposure time per frame was 50 ms. Measurements shown in Figure S20 were performed at the ID02 beamline (headed by T. Narayanan) in ESRF (Grenoble). These measurements were taken using the flow-cell setup, which included a temperature controlled, 2 mm thick, quartz capillary.⁵³ The wavelength of the incident beam was 0.995 nm and the scattered intensity was recorded on a FReLoN 16M Kodak CCD detector. Measurements were taken at 25 °C at a sample-to-detector distance of 2.5 m that resulted in a q-range of 0.03-2.7 nm⁻¹. For each experiment, between 40 and 50 μL of sample was injected into the measurement cell and 30 frames were recorded with exposure times that lasted between 50 and 100 ms. Background measurements were taken before and after each sample. The background solutions contained the same solution as the sample apart from the protein and were measured under identical conditions.
Frames of data were normalized to the intensity of the transmitted beam, and azimuthally averaged to yield the scattering intensity curve as a function of the magnitude of the scattering vector, q.⁵⁴ Background scattering curves were averaged and the averaged background signal was subtracted from the averaged sample signal to give the final scattering intensity curve of the assembly reactions, as explained in our earlier papers.^30,55,56 Measurements were repeated at the SWING beamline (J. Perez) in the Soleil synchrotron (Gif-sur-Yvette). Detailed experimental descriptions of these setups were provided elsewhere.^57-60

Graph representation of capsids

$T_{n}^{s,c}$ is an icosahedral capsid intermediate whose triangulation number is n and which is made of s dimer subunits that form c dimer-dimer contacts. The capsid intermediate configurations were represented as weighted undirected graphs. Each vertex of the graph represented the geometric center of a dimeric subunit, whereas each edge corresponded to an inter-dimer interaction. In our case, each dimer was in either T = 3 or T = 4 symmetry and could interact with four neighboring dimers. The degree of each vertex in our graphs was therefore four. As not all the inter-dimer interactions were equivalent, we weighted the edges according to the center-to-center distance between the two interacting dimers that defined an edge. Within this representation, the complete capsids were considered as weighted 4-regular graphs with 90 vertices and 180 edges, in the case of T = 3, and 120 vertices and 240 edges, in the case of T = 4. According to the crystal structure, there were two types of dimer positions for both the T = 3 and the T = 4 symmetries, which formed three types of inter-dimer interactions in the T = 4 symmetry and two types in the T = 3 symmetry. We defined the two positions as position A and position B and therefore the edges could be of type AA, AB, or BB. In the simulations, each $T_{n}^{s,c}$ intermediate structure was represented by an s × 4 matrix, where each row corresponded to a vertex and contained the indices of its neighboring vertices. Based on the matrix, we mapped the occupied vertices and edges of each intermediate.

The weighted undirected full capsid graphs were then represented by G_{T=n} = (V, E, E_T), where V, E and E_T are the sets of vertices (v₁, …, $v_{n \cdot 60 / 2}$), edges (e₁, …, e_{n·60}) and edge types (et₁, …, et_{n·60}), respectively, where the possible edge types were et_i ∈ {AA, AB, BB}. $T_{n}^{s,c}$ intermediate structures were represented by subgraphs (see below). Figure 1 shows the reduction of the two capsid geometries to their graph representations.

Monte Carlo simulations

Using our graph representation of capsids, we performed MC simulations to obtain the canonical partition function, Z_n, of $T_{n}^{s,c}$ capsid intermediates, in the size-contact phase space:

Z_{n} = \sum_{s} \sum_{c} Ω_{n, s, c} \exp (- Δ F_{n, c} / k_{B} T) .

(4)

ΔF_n,c is the association free energy of an intermediate complex with c contacts in a T = n symmetry. Ω_n,s,c is the number of different possible ways to create $T_{n}^{s,c}$ intermediates of size s with c inter-subunit contacts in a T = n symmetry. The distribution of microstates that dictates the values of Ω_n,s,c was computed explicitly from the simulation.

Simulation phase space

The simulations were based on the graph representations of T = 3 or T = 4 capsid intermediates. $G_{T = n}^{i} = (V_{i}, E_{i}, E_{T i})$ represented a subgraph of G_T=n at the i-th iteration of the MC simulation. V_i, E_i, and E_Ti were subsets of V, E, and E_T, respectively. The size, s_i (or number of occupied vertices), and connectivity, c_i (or number of contacts between dimers), of the i-th iteration were calculated from the sizes of the V_i and E_i subsets. As each vertex in the subgraph corresponded to a specific dimer, we could represent the structure of each intermediate as a binary occupancy vector of size n · 60/2, in which 1 corresponded to an occupied position and 0 to an unoccupied position.
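The bookkeeping described above can be sketched as follows. The octahedron graph used here is only a small 4-regular stand-in for a full capsid graph, and the names `neighbors` and `size_and_contacts` are hypothetical; the sketch assumes that a contact exists whenever two occupied vertices share an edge of the full graph.

```python
def size_and_contacts(occupancy, neighbors):
    """Return (s, c) for a binary occupancy vector.

    neighbors[v] lists the vertices adjacent to v in the full graph
    (four per vertex in a complete capsid graph).
    """
    s = sum(occupancy)
    c = 0
    for v, occupied in enumerate(occupancy):
        if occupied:
            # Count each edge once by looking only at higher-indexed neighbours.
            c += sum(1 for u in neighbors[v] if u > v and occupancy[u])
    return s, c

# Toy 4-regular graph (octahedron): vertex v is adjacent to every vertex
# except itself and its opposite vertex, v ^ 1.
neighbors = [[u for u in range(6) if u != v and u != (v ^ 1)] for v in range(6)]

print(size_and_contacts([1, 0, 1, 0, 1, 0], neighbors))  # three mutually adjacent vertices
print(size_and_contacts([1] * 6, neighbors))             # the complete toy graph
```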

Umbrella sampling method

A simple MC procedure could not properly sample the distribution of intermediate states of Hepatitis B capsids, because the number of states increases combinatorially with the number of subunits. In simulations with low dimer-dimer association energies, the fast increase in the number of states with intermediate size, s, precluded sampling of energetically favorable states, because energetically favored states occupy a negligible part of the size (s) - contact (c) phase space. Higher association energies generated high dissociation energy barriers that confined the simulation sampling to a small subspace and excluded kinetically favorable intermediates. To correctly explore the entire distribution of intermediate states, we performed MC simulations with zero association energy and used an umbrella sampling procedure.⁶¹ We separately scanned the s coordinate and the c coordinate, using two sets of umbrella potentials. For the size coordinate,

E^{j} (s, c) = \begin{cases} 0, & \text{for } s_{min}^{j} \leq s \leq s_{\max}^{j} \\ \infty, & \text{otherwise} \end{cases}

(5)

where $s_{\max}^{j}$ and $s_{min}^{j}$ are the maximal and minimal intermediate sizes, respectively, at umbrella potential E^j. The width of the umbrella potential windows, given by $s_{\max}^{j} - s_{min}^{j}$, was between 5 and 7 for different j values along the s coordinate. For a given size, s_i, where s_i ∈ (1, …, n · 60/2), we scanned the number of contacts, c, axis with another umbrella potential defined as

E_{i}^{j} (s_{i}, c) = \begin{cases} 0, & \text{for } c_{min}^{j, i} \leq c \leq c_{\max}^{j, i} \\ \infty, & \text{otherwise} \end{cases}

(6)

where $c_{min}^{j, i}$ and $c_{\max}^{j, i}$ are the minimum and maximum number of contacts, respectively, of intermediates of size s_i. The width of the umbrella potential windows was between 4 and 6 contacts. Within each umbrella potential, the two dimensional size-contact probability distribution was constructed. As the contact association free energy was set to zero, the probability distributions per umbrella potential, E^j, were defined as $P^{j} (s) = \sum_{c} Ω_{s, c}^{j} / Ω_{total}^{j}$ for the size axis, and $P_{i}^{j} (s_{i}, c) = Ω_{s_{i}, c}^{j} / Ω_{total}^{j}$ for the contact axis at a given size, s_i, where $Ω_{total}^{j}$ was the total number of recorded frames along the simulation at umbrella potential $E_{i}^{j}$.

Allowed steps, acceptance criteria and detailed balance

The Metropolis algorithm⁶² was used in our MC simulations, where the graph representation of capsid intermediates defined the discrete set of possible steps. As described, one simulation set scanned the s coordinate, from which the intermediate size distribution was obtained. The second simulation set scanned the c coordinate, within subgraphs with s occupied vertices, from which the connectivity distribution at a given s value was obtained.

Simulation along the s axis.

The allowed steps in the simulations were either adding or removing a single vertex. The simulation used the following procedure: Given the current state, $G_{T = n}^{i} = (V_{i}, E_{i}, E_{T i})$, of the system, a list of all the possible positions for adding or removing a subunit was created based on the nearest neighbors of each vertex. The size of this list gave the total number of allowed steps, N_i. In addition, a list of all the articulation points (APs) of the subgraph (in other words, vertices whose removal disconnects the subgraph) was generated using a depth-first search (DFS) algorithm.⁶³
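For illustration, articulation points can be found with a single depth-first search that tracks discovery times and low-link values. The sketch below is a generic textbook version written in plain Python, not the implementation used in our simulations; the function name `articulation_points` and the toy graphs `chain` and `ring` are hypothetical.

```python
def articulation_points(adj):
    """Articulation points of an undirected simple graph, found by DFS.

    adj maps each vertex to an iterable of its neighbours.
    """
    disc, low, aps = {}, {}, set()
    counter = [0]

    def dfs(v, parent):
        disc[v] = low[v] = counter[0]
        counter[0] += 1
        children = 0
        for u in adj[v]:
            if u == parent:
                continue
            if u in disc:
                # Back edge: v can reach an earlier-discovered vertex.
                low[v] = min(low[v], disc[u])
            else:
                children += 1
                dfs(u, v)
                low[v] = min(low[v], low[u])
                # No back edge from the subtree of u passes above v.
                if parent is not None and low[u] >= disc[v]:
                    aps.add(v)
        # A DFS root is an articulation point only if it has several subtrees.
        if parent is None and children > 1:
            aps.add(v)

    for v in adj:
        if v not in disc:
            dfs(v, None)
    return aps

chain = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}        # linear tetramer
ring = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [2, 0]}   # closed ring
print(articulation_points(chain), articulation_points(ring))
```

In the linear chain the two inner vertices cannot be removed without splitting the cluster, whereas any single vertex of the ring can.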

At a given state of the simulation, μ_s, a random vertex position, v, was chosen from the list of the N_i allowed positions. If v was an empty vertex, the simulation tried to add a new vertex to the current subgraph. If v was an occupied vertex, the simulation tried to remove the vertex from the subgraph. To maintain detailed balance, if v was an occupied articulation point, any attempt of the simulation to change the current state was denied, as we allowed only addition or removal of a single vertex at each iteration. Removal of an articulation point would lead to dissociation of multisubunit oligomers, which had to be denied because the reverse process was also not allowed. The new state was then given by ν_s′, where s′ is the number of occupied vertices of the new state.

To maintain the condition of detailed balance at equilibrium, the probability to proceed from the current state, μ_s, to the new state, ν_s′, had to satisfy⁶⁴

\frac{P (μ_{s} \to ν_{s^{'}})}{P (ν_{s^{'}} \to μ_{s})} = \frac{C (ν_{s^{'}}) A (μ_{s} \to ν_{s^{'}})}{C (μ_{s}) A (ν_{s^{'}} \to μ_{s})} = exp (- \frac{E_{ν_{s^{'}}} - E_{μ_{s}}}{k_{B} T})

(7)

where C(ν_s′) is the probability to select the new state, ν_s′, and A(μ_s → ν_s′) is the probability to accept the step from state μ_s to ν_s′. A similar convention was applied to the reverse process. As within each umbrella potential E_{ν_s′} = E_{μ_s} = 0 (Equation 5), we got:

C (ν_{s^{'}}) A (μ_{s} \to ν_{s^{'}}) = C (μ_{s}) A (ν_{s^{'}} \to μ_{s}) .

(8)

If s′ > s, we defined A(ν_s′ → μ_s) ≡ 1, hence $A (μ_{s} \to ν_{s^{'}}) = C (μ_{s}) / C (ν_{s^{'}})$, with $C (μ_{s}) = 1 / N_{ν_{s^{'}}}$ and $C (ν_{s^{'}}) = 1 / N_{μ_{s}}$, where N_{μ_s} and N_{ν_s′} are the total numbers of allowed steps at states μ_s and ν_s′, respectively. We can summarize the acceptance criteria as follows:

P (μ_{s} \to ν_{s^{'}}) = \begin{cases} 0, & \text{for } s^{'} < s \text{ and } p \in APs \\ 1, & \text{for } s^{'} < s \text{ and } p \notin APs \\ \frac{N_{μ_{s}}}{N_{ν_{s^{'}}}}, & \text{for } s^{'} > s \end{cases}

(9)
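A direct transcription of Eq. 9 could look like the sketch below. The function names and arguments are hypothetical, and the comparison against a uniform random number is the usual Metropolis convention, assumed here rather than taken from the text; under that convention a ratio larger than one simply means the step is always accepted.

```python
import random

def acceptance_probability(s_old, s_new, n_steps_old, n_steps_new,
                           is_articulation_point=False):
    """Acceptance rule of Eq. 9 for single-vertex moves at zero association energy.

    n_steps_old and n_steps_new are the numbers of allowed steps,
    N_mu and N_nu, at the current and proposed states.
    """
    if s_new < s_old:
        # Removal: denied for articulation points, otherwise always accepted.
        return 0.0 if is_articulation_point else 1.0
    # Addition: ratio of the numbers of allowed steps.
    return n_steps_old / n_steps_new

def accept(probability, rng=random):
    """Accept a proposed step with the given probability (assumed convention)."""
    return rng.random() < probability

print(acceptance_probability(5, 6, 10, 12))
```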

Simulation along the c axis.

The simulations that sampled the connectivity distribution were performed under a fixed number of occupied vertices, s. At each iteration, the allowed steps changed the position of one vertex, using the following procedure. At a given state, $G_{T = n}^{i} = (V_{i}, E_{i}, E_{T i})$ , a vertex, v, was randomly selected from the set, V_i, of s occupied vertices. If v was an articulation point the step was denied, otherwise the vertex was removed and an intermediate state of size s − 1, defined as $G_{T = n}^{j} = (V_{j}, E_{j}, E_{T j})$ , was obtained. At intermediate state $G_{T = n}^{j}$ , the list of all the possible sites where a new vertex can be added was created, based on the nearest neighbors of the occupied vertices. A random position was then selected from the list, and its occupancy was changed from an empty to an occupied vertex.

Reduction to unique structures

The simulations described so far overestimated the number of states because translated and rotated subgraphs were treated as distinct states, even though they were structurally identical. To include in our probability distributions only unique structures from each simulation, we introduced a three-step classification test to discard replicated states. The first classification was based on the graph representation of the state, namely the occupied vertices and the connectivity between them. The second classification was based on the pair distribution function of the centers of mass of the subunits. The third classification was based on complete alignment of the 3D structures.

Graph representation

Based on the graph representation of the structures, two structures could be considered identical only if they were isomorphic, namely they contained the same number and type of vertices and edges. Graph isomorphism was tested using the built-in function of the NetworkX package,⁶⁵ which uses the VF2 algorithm.⁶⁶

Pair distribution function.

The graph representation did not contain the entire structural information. In some cases, two different states could have isomorphic graphs (Figure S11). To determine whether two structures were identical or not, we compared the pair distribution functions of the geometrical centers of the subunits. As each vertex in the graph was linked to a specific coordinate in 3D space, the occupancy vector could be used to generate a histogram of the distances between all the subunit pairs in the complex. The pair distribution functions were binned into a vector of integers, $\vec{h}$, where the value at each index, h_i, represented the occurrence of distances within the limits of the i-th bin. Using bin sizes ranging between 0.1 Å and 4 nm, we could build a unique $\vec{h}$ signature for each state, namely, two structures with the same $\vec{h}$ signature had exactly the same pair distribution functions.
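The $\vec{h}$ signature can be sketched in a few lines. This is an illustrative version only: the function name `pair_signature`, the single fixed bin width, and the toy coordinates are assumptions, and distances that fall exactly on a bin edge would need more careful handling in practice.

```python
from itertools import combinations
from math import dist

def pair_signature(coords, bin_width):
    """Integer histogram of all pairwise centre-to-centre distances.

    coords is a list of (x, y, z) tuples; the returned tuple plays the
    role of the signature vector h and can be compared or hashed directly.
    """
    counts = {}
    for a, b in combinations(coords, 2):
        i = int(dist(a, b) / bin_width)
        counts[i] = counts.get(i, 0) + 1
    n_bins = max(counts) + 1 if counts else 0
    return tuple(counts.get(i, 0) for i in range(n_bins))

triangle = [(0, 0, 0), (3, 0, 0), (0, 4, 0)]
# The same triangle rotated by 90 degrees about z and shifted by (1, 1, 1).
moved_triangle = [(1, 1, 1), (1, 4, 1), (-3, 1, 1)]
line = [(0, 0, 0), (3, 0, 0), (6, 0, 0)]

print(pair_signature(triangle, 0.7) == pair_signature(moved_triangle, 0.7))
print(pair_signature(triangle, 0.7) == pair_signature(line, 0.7))
```

Because the signature depends only on inter-subunit distances, it is unchanged by rotation and translation, which is what makes it a cheap filter before the full alignment test.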

Complete alignment between two structures

In special cases (chiral structures, for example), the pair distribution function and the graph representation criteria could still underestimate the number of states (Figure S12 and Section 2c). We therefore introduced an alignment test, which exploited the entire 3D structural information. When two structures had isomorphic graph representations, G₁ and G₂, and identical $\vec{h}$ signatures, we checked whether the structures could be fully aligned.

Let M be the set of possible mapping options between G₁ and G₂. For each m ∈ M, we created two 3 × s matrices, A₁ and A₂, containing the x, y, z coordinates of the geometric centers of all the subunits in the complexes (where the origin was at the geometric center of the entire complex). We defined the covariance matrix, $C_{12} \equiv A_{2} A_{1}^{⊺}$, and found its singular value decomposition (SVD), C₁₂ = UΣV^⊺, where U and V were unitary matrices and Σ was a diagonal matrix with non-negative real values along its diagonal. The rotation matrix, R, which minimized the squared norm of the residual matrix, ∥D₁₂∥², was then given by⁶⁷

R = V \begin{pmatrix} 1 & & & \\ & \ddots & & \\ & & 1 & \\ & & & \det (V U^{⊺}) \end{pmatrix} U^{⊺},

where D₁₂ ≡ RA₂ − A₁. Two structures were considered identical if the minimum norm of their residual matrix, ∥D₁₂∥, was 1 Å or less. The loop over m ∈ M was terminated, when the algorithm identified a mapping, m, which satisfied the minimization criterion.
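The alignment step can be illustrated with NumPy. The sketch below assumes that the coordinates are stored as 3 × s arrays (one column per subunit, already centered and ordered by the mapping m), so that the covariance matrix is 3 × 3; the function name `kabsch_align` and the test coordinates are hypothetical.

```python
import numpy as np

def kabsch_align(A1, A2):
    """Best proper rotation R of A2 onto A1 and the residual norm ||R A2 - A1||.

    A1 and A2 are 3 x s arrays of centred coordinates with matched columns.
    """
    C12 = A2 @ A1.T                      # 3 x 3 covariance matrix
    U, _, Vt = np.linalg.svd(C12)        # C12 = U diag(S) Vt
    V = Vt.T
    d = np.sign(np.linalg.det(V @ U.T))  # +1 or -1; excludes reflections
    R = V @ np.diag([1.0, 1.0, d]) @ U.T
    return R, float(np.linalg.norm(R @ A2 - A1))

# Hypothetical irregular tetrahedron (all six edge lengths differ, so it is chiral).
pts = np.array([[1.0, 0.0, 0.0], [0.0, 2.0, 0.0],
                [0.0, 0.0, 3.0], [1.0, 1.0, 1.0]]).T
A1 = pts - pts.mean(axis=1, keepdims=True)

angle = 0.7
Rz = np.array([[np.cos(angle), -np.sin(angle), 0.0],
               [np.sin(angle), np.cos(angle), 0.0],
               [0.0, 0.0, 1.0]])
A_rotated = Rz @ A1                          # same structure, different orientation
A_mirror = np.diag([1.0, 1.0, -1.0]) @ A1    # mirror image (other enantiomer)

_, d_rotated = kabsch_align(A1, A_rotated)
_, d_mirror = kabsch_align(A1, A_mirror)
print(d_rotated, d_mirror)
```

The rotated copy aligns with a residual at the level of round-off, whereas the mirror image leaves a finite residual even though its pair distribution function is identical, which is the case the alignment test is designed to catch.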

Practically, the probability of sampling two different structures with identical pair distribution functions was extremely low for large s. We found that the number of states, based on the pair distribution signature alone, was similar to the number of states obtained by the three-step procedure, whereas the pair distributions were significantly simpler to compute. Hence, when the number of subunits in the complexes was between 30 and n · 60/2 − 5, the alignment test was not applied. For s > n · 60/2 − 5, we performed the three-step analysis on the "inverse" graphs, which we defined as the graph that contained the unoccupied vertices of each iteration, $V_{i}^{empty}$, and the edges connecting them, with $V_{i}^{empty}$ ∈ V − V_i.

Unifying the probability distribution

Once the distribution of states within each umbrella potential was obtained, the distributions were unified into a single s − c distribution. By selecting the overlap between two adjacent potentials to satisfy $s_{min}^{j + 1} = s_{\max}^{j} - 1$ or $c_{min}^{j + 1, i} = c_{\max}^{j, i} - 1$ , the relative scale between subsequent potentials, r_j,j+1, was given by

r_{j, j + 1} = 0.5 \cdot (\frac{P^{j} (s_{\max}^{j} - 1)}{P^{j + 1} (s_{min}^{j + 1})} + \frac{P^{j} (s_{\max}^{j})}{P^{j + 1} (s_{min}^{j + 1} + 1)})

for the s axis. A similar expression was applied for the c axis. The absolute scale A_k of umbrella potential k was therefore $A_{k} = \prod_{j = 1}^{k - 1} r_{j, j + 1}$ , where k ≥ 2. Once the scaling for all the potentials was set, the combined distribution was renormalized so that Σ_s Σ_c P(s, c) = 1.
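The stitching of adjacent windows along the s axis can be sketched as follows. The function name `unify_windows` and the toy distributions are hypothetical, and two choices are assumptions of this sketch rather than statements about our procedure: the first window is given unit scale, and in the overlap region the value from the earlier window is kept.

```python
def unify_windows(windows):
    """Combine per-window distributions {s: P^j(s)} into one normalised P(s).

    Adjacent windows must overlap at two sizes, with s_min^{j+1} = s_max^j - 1.
    """
    combined = {}
    scale = 1.0  # absolute scale of the first window (assumed to be 1)
    for k, window in enumerate(windows):
        if k > 0:
            previous = windows[k - 1]
            s_max = max(previous)
            s_min_next = min(window)
            if s_min_next != s_max - 1:
                raise ValueError("windows do not overlap as required")
            # Relative scale r_{j,j+1}: mean ratio over the two shared sizes.
            r = 0.5 * (previous[s_max - 1] / window[s_min_next]
                       + previous[s_max] / window[s_min_next + 1])
            scale *= r  # A_k is the running product of the r values
        for s, p in window.items():
            combined.setdefault(s, scale * p)
    total = sum(combined.values())
    return {s: p / total for s, p in combined.items()}

# Toy example: an underlying distribution proportional to 1, 2, 4, 8, 16, 32
# sampled in two windows, each normalised separately.
window_1 = {1: 1 / 15, 2: 2 / 15, 3: 4 / 15, 4: 8 / 15}
window_2 = {3: 4 / 60, 4: 8 / 60, 5: 16 / 60, 6: 32 / 60}
unified = unify_windows([window_1, window_2])
print(unified)
```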

SAXS curves calculations

To generate a pool of scattering models we randomly selected between 1 and 5 different structures for each $T_{n}^{s, c}$ intermediate (representing a point in the s − c plane). Each structure was defined by a list of s parameter vectors, each containing the position (x, y, z coordinates) of the geometric center of a dimer and the three Euler angles (α, β, γ) that defined its orientation. The dimer subunit was taken from the protein data bank (PDB ID 2G33)⁴⁸ and was docked into both T = 3 and T = 4 symmetries. The parameters for calculating the scattering intensity of the entire set were chosen by fitting the expected scattering intensity from the atomic structure of the dimer to the experimental scattering data from a pure solution of 25 μM cp149 in 50 mM HEPES buffer pH 7.5, at room temperature (Figure 2). The fitting procedure was done using our home-developed software, D+ (https://scholars.huji.ac.il/uriraviv/software/d-software).^30,31 The electron density of the solvent was set to the electron density of water (333 e/nm³), and the thickness and electron density of the solvation layer around the protein were our fitting parameters. Once the best fitted parameters were found, the scattering amplitude of the dimer was saved and used for calculating the scattering intensity, $I_{T_{n}^{s, c}}$ (q), of all the other intermediates in the pool, according to

I_{T_{n}^{s, c}} (q) = {〈 {∣ \sum_{j = 1}^{s} F_{dimer}^{sol} (A_{j}^{- 1} \vec{q}) \cdot exp [i \vec{q} \cdot {\vec{R}}_{j}] ∣}^{2} 〉}_{Ω_{q}}

where $F_{dimer}^{sol} (A_{j}^{- 1} \vec{q})$ is the scattering amplitude of the j-th dimer, whose orientation in the complex $T_{n}^{s, c}$ is given by the rotation matrix A_j. ${\vec{R}}_{j}$ is the geometric center position of the j-th dimer, and ⟨…⟩_{Ω_q} represents the orientation averaging of the scattering intensity. All models were then scaled to the absolute scale (see derivation in Section 1a in SI) according to

\frac{d Σ}{d Ω} (q) = C [\frac{m g}{c m^{3}}] (4.8 \times 10^{- 5} [\frac{c m^{2}}{m o l}] \frac{1}{M_{w, s}^{'}}) I_{D +} (q) .

The scaling corresponds to a protein concentration of 1 mg/mL (C = 1) with $M_{w, s}^{'} = s \cdot M_{w}^{'}$, where $M_{w}^{'}$ is the molecular weight of the dimer subunit including its solvation layer. I_D+ (q) is the scattering intensity computed by D+ software, in which the Thomson scattering length is set to unity.

Scattering amplitude of the atomic model of the dimer in solution

The scattering amplitude of the j-th atom in vacuo (using the five Gaussian approximation) is:

f_{j}^{0} (q) = \sum_{k = 1}^{4} a_{k} \cdot exp (- b_{k} {(\frac{∣ \vec{q} ∣}{4 π})}^{2}) + c .

(10)

The coefficients a_k, b_k, and c are the Cromer-Mann coefficients, given in units of the Thomson scattering length, r₀. Note that in our analysis software, D+, r₀ is set to 1.^30,31 To account for the solvent excluded volume of the atoms, we subtracted a Gaussian sphere from each atom, where the center of the sphere was located at the atom coordinate.^68-70 Our analysis software, D+,³¹ subtracts the Fourier transform of the following normalized Gaussian sphere:

F_{j}^{d} (q) = ρ_{0} \cdot V_{j} \cdot exp [- \frac{V_{m}^{2 ∕ 3} \cdot q^{2}}{4 π}],

where $V_{j} = \frac{4 π}{3} r_{j}^{3}$ is the volume of atom j, and V_m = Σ_j V_j is the mean atomic volume over the entire molecule. r_j is an effective atomic radius (dictated by the type of atom) and ρ₀ is the mean solvent electron density. The scattering amplitude from atom j after subtracting the contribution of the solvent is:

f_{j}^{0} (q) - F_{j}^{d} (q)

(11)

The solvent contribution can be modulated as

f_{j}^{s} (q) = f_{j}^{0} (q) - C_{1} (q) F_{j}^{d} (q)

where

C_{1} (q) = c_{1}^{3} \cdot exp [- \frac{V_{m}^{2 ∕ 3} \cdot q^{2} (c_{1}^{2} - 1)}{4 π}] .

is used to uniformly adjust V_m and c₁ is a scaling factor.⁷¹
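The per-atom expressions above can be evaluated as in the following sketch, which is an illustration and not the D+ implementation. The function names are hypothetical. The carbon coefficients are the commonly tabulated four-Gaussian Cromer-Mann values (with b_k in Å², so q must be in Å⁻¹), reproduced here as illustrative inputs that should be checked against the tables before reuse; the atomic radius, V_m, and ρ₀ used in the example call are assumed values.

```python
import math

# Illustrative Cromer-Mann coefficients for neutral carbon (b_k in Angstrom^2).
A_CARBON = (2.3100, 1.0200, 1.5886, 0.8650)
B_CARBON = (20.8439, 10.2075, 0.5687, 51.6512)
C_CARBON = 0.2156

def f_vacuum(q, a=A_CARBON, b=B_CARBON, c=C_CARBON):
    """Eq. 10: atomic scattering amplitude in vacuo, in units of r_0."""
    x = (q / (4.0 * math.pi)) ** 2
    return sum(ak * math.exp(-bk * x) for ak, bk in zip(a, b)) + c

def excluded_solvent(q, r_j, v_m, rho0):
    """Gaussian sphere F_j^d(q) for an atom of effective radius r_j."""
    v_j = 4.0 * math.pi / 3.0 * r_j ** 3
    return rho0 * v_j * math.exp(-v_m ** (2.0 / 3.0) * q ** 2 / (4.0 * math.pi))

def c1_factor(q, c1, v_m):
    """Modulation factor C_1(q); equals 1 for c1 = 1."""
    return c1 ** 3 * math.exp(
        -v_m ** (2.0 / 3.0) * q ** 2 * (c1 ** 2 - 1.0) / (4.0 * math.pi))

def f_solvent_corrected(q, r_j, v_m, rho0, c1=1.0):
    """f_j^s(q) = f_j^0(q) - C_1(q) F_j^d(q)."""
    return f_vacuum(q) - c1_factor(q, c1, v_m) * excluded_solvent(q, r_j, v_m, rho0)

# Assumed example inputs: radius 1.58 A, V_m = 16 A^3, rho0 = 0.333 e/A^3 (water).
print(f_vacuum(0.0), f_solvent_corrected(0.0, 1.58, 16.0, 0.333))
```

At q = 0 the in-vacuo amplitude approaches the number of electrons of the atom, and subtracting the displaced solvent lowers the contrast, as expected for a protein in water.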

The solution scattering amplitude from a dimer, given by a list of its n atoms, whose coordinates are ${\vec{r}}_{j}$ , was:

F_{dimer}^{sol} (\vec{q}) = \sum_{j = 1}^{n} f_{j}^{s} (q) \cdot exp (i \vec{q} {\vec{r}}_{j}) .

Solvation Layer

The solvent excluded volume was calculated by discretizing space into 0.125 Å³ voxels. Each voxel was marked as occupied if at least one atom’s coordinate was within that atom’s radius. A probe with a radius of 1.4 Å, which corresponds to the radius of a water molecule, determined the accessible surface of the protein. All voxels that were within the solvation thickness, Δ_{Solvent Layer}, of the protein surface and were not occupied by an atom were marked as occupied by a solvation layer, and assumed a mean electron density, ρ_{Solvent Layer}. The contribution of the solvation layer to the scattering amplitude was computed as the sum over the scattering amplitudes from the collection of voxels comprising the solvation layer. The scattering amplitude of the j-th voxel of dimensions ω_j, τ_j, and μ_j, whose center was at ${\vec{r}}_{j}^{Voxel}$, was:⁷²

f_{j}^{Voxel} (\vec{q}) = \frac{8}{q_{x} \cdot q_{y} \cdot q_{z}} \sin (\frac{q_{x} ω_{j}}{2}) \sin (\frac{q_{y} τ_{j}}{2}) \sin (\frac{q_{z} μ_{j}}{2}) \cdot exp (i \cdot \vec{q} \cdot {\vec{r}}_{j}^{Voxel})

(12)

The amplitude of the solvation layer was then:

F_{Solvation Layer} (\vec{q}) = \sum_{j \in {Solvation Layer Voxels}} f_{j}^{Voxel} (\vec{q}) .

(13)

The scattering amplitude of a solvated molecule was:

F_{Solvated Molecule} (\vec{q}) = F_{Molecule}^{sol} (\vec{q}) + (ρ_{Solvation Layer} - ρ_{0}) \cdot F_{Solvation Layer} (\vec{q}) .

(14)

where $F_{Molecule}^{sol}$ ( $\vec{q}$ ) is the scattering amplitude from the molecule in solution (without the solvation layer contribution).
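Eqs. 12-14 can be written compactly as in the sketch below, which is illustrative only. The function names and the example inputs are hypothetical, and the explicit q → 0 limit of each sine factor (the voxel edge length) is added here for numerical safety.

```python
import cmath
import math

def _edge_factor(q_component, width):
    """2 sin(q w / 2) / q, with the q -> 0 limit equal to the edge length w."""
    if abs(q_component) < 1e-12:
        return width
    return 2.0 * math.sin(q_component * width / 2.0) / q_component

def voxel_amplitude(q, center, dims):
    """Eq. 12: amplitude of a box-shaped voxel; q, center, dims are 3-tuples."""
    size_term = 1.0
    phase = 0.0
    for q_i, r_i, w_i in zip(q, center, dims):
        size_term *= _edge_factor(q_i, w_i)
        phase += q_i * r_i
    return size_term * cmath.exp(1j * phase)

def solvated_amplitude(q, f_molecule, layer_voxels, rho_layer, rho0):
    """Eqs. 13 and 14: add the solvation-layer contrast to the molecule amplitude.

    layer_voxels is an iterable of (center, dims) pairs.
    """
    f_layer = sum(voxel_amplitude(q, c, d) for c, d in layer_voxels)
    return f_molecule + (rho_layer - rho0) * f_layer

# Hypothetical layer made of two 0.5 A cubic voxels (volume 0.125 A^3 each).
layer = [((1.0, 2.0, 3.0), (0.5, 0.5, 0.5)), ((1.5, 2.0, 3.0), (0.5, 0.5, 0.5))]
print(solvated_amplitude((0.1, 0.0, 0.2), 10.0 + 0.0j, layer, 0.36, 0.333))
```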

Resolution correction of the models

To account for the finite resolution of the measurement, we included a resolution correction in our set of calculated models.⁷³ The corrected intensity, I_corr, as a function of q for each computed model intensity, I_model, was:

I_{corr} (q) = \int_{q_{min}}^{q_{\max}} \frac{1}{\sqrt{2 π σ_{q}^{2}}} exp (- \frac{(q^{'} - q)^{2}}{2 σ_{q}^{2}}) \cdot I_{model} (q^{'}) d q^{'}

where σ_q was calculated according to the geometry of our setup to be 0.01 nm⁻¹.⁷⁴
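A discretized version of this correction is sketched below. It assumes a uniform q grid and replaces the analytic prefactor by a numerical normalization of the Gaussian weights, which also compensates for the truncation of the kernel at q_min and q_max; both choices, as well as the function name `resolution_smear`, are assumptions of this sketch.

```python
import math

def resolution_smear(q, intensity, sigma_q):
    """Convolve a model curve with a Gaussian resolution function of width sigma_q.

    q and intensity are equal-length sequences on a uniform q grid.
    """
    smeared = []
    for q_i in q:
        weights = [math.exp(-(q_p - q_i) ** 2 / (2.0 * sigma_q ** 2)) for q_p in q]
        norm = sum(weights)
        smeared.append(sum(w * i for w, i in zip(weights, intensity)) / norm)
    return smeared

q_grid = [0.01 * k for k in range(5)]   # nm^-1
flat = [1.0] * 5
spike = [0.0, 0.0, 1.0, 0.0, 0.0]
print(resolution_smear(q_grid, spike, 0.01))
```

A flat curve is left unchanged, whereas sharp features such as form-factor minima are broadened and reduced in depth.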

Sensitivity analysis of SAXS models using a clustering algorithm

To estimate the sensitivity of our fitting results relative to the signal to noise ratio (SNR), we applied a clustering algorithm to the expected scattering intensities in the size and contact plane. Clustering of the SAXS models, as presented in Section 1d in the SI and in Figure S7, was done on the set of models that was used to analyze our SAXS data. The number of clusters rapidly increased as the size of the intermediates decreased (as the size of the particle decreases, the effect on the scattering curve of adding or moving subunits is larger). Hence, we applied the clustering procedure to models whose s values were within the top 33% of each capsid symmetry. The clustering was done on the weighted matrices, M^T=4 and M^T=3, given by

M^{T = 4} = (\begin{matrix} \frac{I_{4}^{80, 87} (q_{1})}{σ (q_{1})} & \cdots & \frac{I_{4}^{120, 240} (q_{1})}{σ (q_{1})} \\ ⋮ & ⋱ & ⋮ \\ \frac{I_{4}^{80, 87} (q_{N})}{σ (q_{N})} & \cdots & \frac{I_{4}^{120, 240} (q_{N})}{σ (q_{N})} \end{matrix})

and,

M^{T = 3} = (\begin{matrix} \frac{I_{3}^{60, 65} (q_{1})}{σ (q_{1})} & \cdots & \frac{I_{3}^{90, 180} (q_{1})}{σ (q_{1})} \\ ⋮ & ⋱ & ⋮ \\ \frac{I_{3}^{60, 65} (q_{N})}{σ (q_{N})} & \cdots & \frac{I_{3}^{90, 180} (q_{N})}{σ (q_{N})} \end{matrix})

where each column in the matrix corresponds to the computed scattering intensity, $I_{n}^{s, c}$ (q), of the atomic model of intermediate $T_{n}^{s, c}$ with s_min = 80 or 60 for T = 4 and T = 3, respectively. The models were computed between q₁ = 0.1 nm⁻¹ and q_N = 1.5 nm⁻¹, neglecting the high q range where the SNR was low. The intensities were weighted according to the measured noise level, σ (q_i). The dimensions of the matrices were N × m_T=4 and N × m_T=3 where the total number of q values that were considered was N = 280, and the number of intermediate structures were m_T=4 = 1361 and m_T=3 = 749, for the T = 4 and T = 3 symmetries, respectively.

To reduce the dimensions of the matrices prior to the clustering algorithm, SVD analysis was applied. The analysis revealed the ranks, r and r′, of the M^T=4 and M^T=3 matrices, respectively. Hence, the dimensions of the matrices decreased to r × m_T=4 and r′ × m_T=3, where r = 23 and r′ = 15. The ranks were determined by demanding that the maximum deviation between the SVD-reconstructed matrix and the original matrix be 0.01σ or less. We then applied the k-means algorithm⁷⁵ to the reduced space, in order to classify the m models into the minimal number of clusters, k. The minimal k was found by demanding that all the models that were classified into the same cluster fulfill the following χ² condition,

\frac{1}{N} \sum_{i = 1}^{N} \frac{(I_{n}^{s, c} (q_{i}) - I_{c}^{j} (q_{i}))^{2}}{σ (q_{i})^{2}} < 1

where $I_{n}^{s, c}$ (q_i) is a model that was classified into the jth cluster, and $I_{c}^{j}$ (q_i) is the modeled scattering intensity of the centroid of the jth cluster. Using this method, we classified the M^T=4 and M^T=3 matrices into 221 and 107 clusters, respectively.
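The clustering procedure described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the code used for the analysis: the function names, the deterministic farthest-point seeding of k-means, and the synthetic three-group test matrix are all hypothetical choices made for this sketch. The matrix M is assumed to hold noise-weighted intensities, I/σ, with one model per column.

```python
import numpy as np

def svd_rank(M, tol=0.01):
    """Smallest rank r (and the r leading left singular vectors) for which
    the truncated SVD reproduces M to within tol in every entry.
    M holds I/sigma, so tol is expressed in units of sigma."""
    U, S, Vt = np.linalg.svd(M, full_matrices=False)
    for r in range(1, len(S) + 1):
        approx = (U[:, :r] * S[:r]) @ Vt[:r]
        if np.max(np.abs(approx - M)) <= tol:
            return r, U[:, :r]
    return len(S), U

def kmeans(points, k, n_iter=100):
    """Plain Lloyd k-means. The farthest-point seeding is an assumption
    of this sketch, chosen to make the result deterministic."""
    centroids = [points[0]]
    while len(centroids) < k:
        d2 = np.min([((points - c) ** 2).sum(axis=1) for c in centroids], axis=0)
        centroids.append(points[np.argmax(d2)])
    centroids = np.array(centroids, dtype=float)
    for _ in range(n_iter):
        d2 = ((points[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
        labels = d2.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):
                centroids[j] = points[labels == j].mean(axis=0)
    return labels

def minimal_k(M, tol=0.01):
    """Smallest k for which every model satisfies the chi^2 < 1 condition
    with respect to the centroid of its own cluster."""
    n_q, n_models = M.shape
    r, U_r = svd_rank(M, tol)
    reduced = (U_r.T @ M).T  # n_models points in r dimensions
    for k in range(1, n_models + 1):
        labels = kmeans(reduced, k)
        chi2 = np.empty(n_models)
        for j in np.unique(labels):
            members = labels == j
            centroid = M[:, members].mean(axis=1, keepdims=True)
            chi2[members] = ((M[:, members] - centroid) ** 2).mean(axis=0)
        if np.all(chi2 < 1.0):
            return k, labels, r
    return n_models, np.arange(n_models), r

# Synthetic example: three groups of ten weighted curves with small scatter.
rng = np.random.default_rng(0)
q = np.linspace(0.1, 1.5, 40)
groups = np.repeat([0, 1, 2], 10)
M_demo = (10.0 * (groups + 1)) * np.exp(-q)[:, None]
M_demo = M_demo + 0.1 * rng.standard_normal((40, 30))
k, labels, r = minimal_k(M_demo)
print("clusters:", k, "rank:", r)
```

Because the columns are already divided by σ, the χ² condition reduces to a mean squared difference between each column and its cluster centroid, which is why no explicit σ appears in `minimal_k`.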

Thermodynamic model

Consider the following set of coupled assembly reactions, induced by increasing the salt concentration:

ν_{D} D_{(aq)} \overset{[Salt]}{⇌} \sum_{n} \sum_{s} \sum_{c > 0} ν_{n,s,c} {T_{n}^{s,c}}_{(aq)} .

s is the number of dimer molecules, D, which assemble into a $T_{n}^{s,c}$ icosahedral capsid intermediate, whose triangulation number, T, is n, and c is the number of its dimer-dimer contacts. n is either 3 or 4, and ν_n,s,c and ν_D are the stoichiometric coefficients of $T_{n}^{s,c}$ and D, respectively (note that ν_D = ν_3,1,0 = ν_4,1,0). From mass conservation we get:

ν_{D} = \sum_{n} \sum_{s} \sum_{c > 0} s \cdot ν_{n,s,c} .

Our aim is to determine the distribution of intermediates at equilibrium.

The degeneracy, Ω_n,s,c, of each intermediate state in the s − c phase space was obtained from our MC simulations (in the simulations, s was the number of occupied vertices and c was the number of edges). At temperature T, the equilibrium probability distribution of intermediate structures in the canonical ensemble was calculated as a function of the dimer-dimer association free energy per contact, ΔF_n, by applying a Boltzmann factor to the number of possible unique $T_{n}^{s,c}$ intermediate states (the degeneracy factor), Ω_n,s,c, obtained from our zero association free energy simulations:

P_{n,s,c} = \frac{Ω_{n,s,c} \exp (- c \cdot Δ F_{n} / k_{B} T)}{Z} .

Z is the partition function in the canonical ensemble:

Z = \sum_{n} \sum_{s} \sum_{c} Ω_{n,s,c} \exp (- c \cdot Δ F_{n} / k_{B} T),

where k_B is Boltzmann’s constant. To obtain the distribution in the grand canonical ensemble, the translational entropy of the intermediate structures was taken into account. Following earlier theoretical models,^76-79 we assumed an implicit solvent background (rather than explicitly considering the contribution of the water molecules or salt ions to the free energy).
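As an illustration of the canonical weighting, the sketch below evaluates P = Ω exp(−c ΔF / k_B T) / Z for a toy set of states within a single symmetry. The contact numbers and degeneracies are hypothetical placeholders, not values from the MC simulations. The weights are handled in log space, an implementation choice of this sketch, because degeneracies of the magnitude discussed in this work would overflow if multiplied directly.

```python
import numpy as np

def canonical_distribution(c, log_omega, dF_over_kT):
    """Canonical probabilities P = Omega * exp(-c * dF / kT) / Z.

    c          : number of contacts of each state
    log_omega  : natural log of the degeneracy of each state
    dF_over_kT : association free energy per contact, in units of k_B T
    """
    c = np.asarray(c, dtype=float)
    log_w = np.asarray(log_omega, dtype=float) - c * dF_over_kT
    log_w = log_w - log_w.max()  # shift for numerical stability
    w = np.exp(log_w)
    return w / w.sum()

# Hypothetical toy states (not taken from the simulations).
contacts = np.array([0, 1, 2, 3])
log_omega = np.log([1.0, 3.0, 3.0, 1.0])

p_zero = canonical_distribution(contacts, log_omega, 0.0)     # entropy only
p_strong = canonical_distribution(contacts, log_omega, -2.0)  # favorable contacts
print(p_zero, p_strong)
```

With zero association free energy the distribution simply follows the degeneracies, while a negative ΔF shifts the population toward states with more contacts.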

The dimer molecules could be in each of the $T_{n}^{s,c}$ intermediate states. The occupancy of each state was described on the mole fraction scale, where X_n,s,c is the molar fraction of dimer subunits incorporated into $T_{n}^{s,c}$ intermediates, given by $X_{n,s,c} = \frac{s \cdot n_{n,s,c}}{n_{total}}$. n_n,s,c is the number of intermediates of type $T_{n}^{s,c}$ and n_total is the total number of molecules. n_total = n_dimer + n_water, where n_dimer is the total number of dimer molecules in all the intermediates and n_water is the total number of water molecules. As the dimer is much larger than a water molecule, the total number of molecules is a function of the total dimer concentration. Therefore the general definition of X_n,s,c using intensive quantities (molar concentrations, c, rather than number of molecules) is $X_{n,s,c} = \frac{s \cdot c_{n,s,c}}{c_{dimer} + (c_{water} - N_{w} \cdot c_{dimer})}$, where N_w is the number of excluded water molecules per dimer. The dimer excluded volume was calculated in D+³¹ based on the dimer atomic model, using a voxel size of 0.05 nm and a probe radius of 0.14 nm. The calculated volume was 28.5 nm³, corresponding to ~ 10³ water molecules. The total dimer concentrations in our assembly reactions were between 20 and 40 μM, contributing a correction of ~ 10⁻² M to the total concentration of molecules. We therefore neglected this correction and set c_total ≃ c_water and $X_{n,s,c} = \frac{s \cdot c_{n,s,c}}{55.5 M}$, for the analysis at all our protein concentrations.^76,77,79
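The size of the neglected correction can be checked with a few lines of arithmetic. The sketch below assumes bulk water at 55.5 M and uses the excluded volume quoted above; the function and variable names are hypothetical.

```python
AVOGADRO = 6.02214076e23   # molecules per mole
WATER_MOLARITY = 55.5      # mol/L, as used in the text
NM3_PER_LITER = 1e24       # 1 L = 1e-3 m^3 = 1e24 nm^3
DIMER_VOLUME_NM3 = 28.5    # excluded volume of one dimer, from the text

# Number density of bulk water and water molecules displaced by one dimer.
waters_per_nm3 = WATER_MOLARITY * AVOGADRO / NM3_PER_LITER
n_w = DIMER_VOLUME_NM3 * waters_per_nm3

def mole_fraction(s, c_species, c_dimer, exact=True):
    """Mole fraction of dimers in a species of size s (concentrations in M).

    exact=True keeps the excluded-water correction in the denominator;
    exact=False uses the 55.5 M approximation adopted in the text.
    """
    if exact:
        denom = c_dimer + (WATER_MOLARITY - n_w * c_dimer)
    else:
        denom = WATER_MOLARITY
    return s * c_species / denom

for c_dimer in (20e-6, 40e-6):
    correction = n_w * c_dimer  # M of water displaced by the protein
    x_exact = mole_fraction(1, c_dimer, c_dimer, exact=True)
    x_approx = mole_fraction(1, c_dimer, c_dimer, exact=False)
    rel_err = abs(x_exact - x_approx) / x_exact
    print(f"{c_dimer:.0e} M: correction {correction:.3f} M, relative error {rel_err:.1e}")
```

Under these assumptions the displaced water amounts to a few hundredths of a mole per liter, a relative change of well under 0.1% in the denominator, which is consistent with neglecting it.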

From the conservation of mass we get,

\sum_{n} \sum_{s} \sum_{c} X_{n,s,c} = X_{Total}

(15)

where X_Total is the total molar fraction of dimer molecules. The range of s is a function of n, and the range of c is a function of both s and n.

The total Helmholtz free energy in the grand canonical ensemble is given by:

F = N_{tot} \sum_{n,s,c} \frac{X_{n,s,c}}{s} (Δ F_{n,s,c} + k_{B} T ln \frac{X_{n,s,c}}{s} - k_{B} T)

(16)

where N_tot is the total number of molecules (water and capsid protein dimer). ΔF_n,s,c is the Helmholtz free energy gain for the formation of $T_{n}^{s,c}$ intermediates, approximated by:

Δ F_{n,s,c} = c \cdot Δ F_{n} - k_{B} T ln Ω_{n,s,c},

(17)

The first term in Equation 17 is the formation free energy, in the molar fraction scale, of c dimer-dimer (independent) contacts and the second term in Equation 17 is the configurational entropy term, associated with the number of ways to form $T_{n}^{s,c}$ intermediates, given by Ω_n,s,c. The second and third terms in Equation 16 come from the mixing entropy of all the intermediates (second term) and the contribution of the solvent to the mixing entropy (third term). The equilibrium distribution of intermediates is obtained by minimizing the total Helmholtz free energy subject to the constraint of Equation 15. Using the Lagrange multipliers method, the solution to the minimization problem is obtained by solving the following equation

\frac{\partial}{\partial X_{n,s,c}} (\frac{F}{N_{tot}} - λ (\sum_{n} \sum_{s} \sum_{c} X_{n,s,c} - X_{Total})) = 0

where λ is the Lagrange multiplier. The result of the derivation is

Δ F_{n,s,c} + k_{B} T ln \frac{X_{n,s,c}}{s} - s λ = 0

(18)

hence, the distribution that minimizes the Helmholtz free energy is

X_{n,s,c} = s \cdot exp (- \frac{Δ F_{n,s,c} - s \cdot λ}{k_{B} T}) = s \cdot Ω_{n,s,c} \cdot exp (- \frac{c Δ F_{n} - s \cdot λ}{k_{B} T})

(19)
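For readers who wish to follow the algebra, the step from Equation 16 to Equations 18 and 19 can be written out explicitly. The sketch below assumes, as in Equation 17, that ΔF_n,s,c does not depend on X_n,s,c; it differentiates the single term of F/N_tot that contains X_n,s,c.

```latex
\frac{\partial}{\partial X_{n,s,c}}
\left[ \frac{X_{n,s,c}}{s}
\left( \Delta F_{n,s,c} + k_{B} T \ln \frac{X_{n,s,c}}{s} - k_{B} T \right) \right]
= \frac{1}{s} \left( \Delta F_{n,s,c} + k_{B} T \ln \frac{X_{n,s,c}}{s} - k_{B} T \right)
+ \frac{X_{n,s,c}}{s} \cdot \frac{k_{B} T}{X_{n,s,c}}
= \frac{1}{s} \left( \Delta F_{n,s,c} + k_{B} T \ln \frac{X_{n,s,c}}{s} \right)

% The constraint term contributes -\lambda, so the stationarity condition is
\frac{1}{s} \left( \Delta F_{n,s,c} + k_{B} T \ln \frac{X_{n,s,c}}{s} \right) - \lambda = 0
\quad \Longrightarrow \quad
\Delta F_{n,s,c} + k_{B} T \ln \frac{X_{n,s,c}}{s} - s \lambda = 0

% Solving for X_{n,s,c} and inserting Equation 17:
\frac{X_{n,s,c}}{s} = \exp \left( - \frac{\Delta F_{n,s,c} - s \lambda}{k_{B} T} \right)
= \Omega_{n,s,c} \exp \left( - \frac{c \, \Delta F_{n} - s \lambda}{k_{B} T} \right)
```

The −k_B T term of Equation 16 cancels against the derivative of the logarithm, which is why it does not appear in Equation 18.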

To find the physical meaning of λ, we proceed as follows. The chemical potential of s dimers in $T_{n}^{s,c}$ intermediates is,

μ_{n,s,c} = {(\frac{\partial F}{\partial N_{n,s,c}})}_{V, T, N_{(n,s,c) \neq (n', s', c')}} = Δ F_{n,s,c} + k_{B} T ln \frac{X_{n,s,c}}{s}

(20)

From Equations 18 and 20 we get that at equilibrium, the chemical potential per dimer at each of the intermediates is the same:

\frac{μ_{n,s,c}}{s} = \frac{μ_{n', s', c'}}{s'} .

(21)

In particular, the chemical potential of a dimer in each of the $T_{n}^{s,c}$ intermediates must be equal to the chemical potential of the free dimer

μ_{1} = \frac{μ_{n,s,c}}{s}

(22)

By combining Equations 20 and 22, we find that the molar fraction distribution takes the following form:

X_{n,s,c} = s \cdot Ω_{n,s,c} exp (- \frac{c Δ F_{n} - s \cdot μ_{1}}{k_{B} T})

(23)

By setting s = 1 in Equation 18 (or by comparing Equation 23 with Equation 19), we find that μ₁ = λ. In other words, the Lagrange multiplier, λ, is the chemical potential of the free dimer in the solution, given by μ₁ = k_BT ln X₁. By substituting μ₁ into Equation 23 we get:

X_{n,s,c} = s \cdot Ω_{n,s,c} X_{1}^{s} \cdot exp (- \frac{c Δ F_{n}}{k_{B} T})

(24)

Equation 24 satisfies the law of mass action in the s − c phase space.
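One way to see this is to rewrite Equation 24 as an equilibrium constant for the reaction in which s free dimers form one $T_{n}^{s,c}$ intermediate. The symbol K_n,s,c is introduced here only for illustration; the mole fraction of intermediate particles is X_n,s,c / s, because X_n,s,c counts dimers.

```latex
s \, D \rightleftharpoons T_{n}^{s,c}, \qquad
K_{n,s,c} \equiv \frac{X_{n,s,c} / s}{X_{1}^{s}}
= \Omega_{n,s,c} \exp \left( - \frac{c \, \Delta F_{n}}{k_{B} T} \right)
```

The right-hand side depends only on the degeneracy and the association free energy, not on the concentrations, as expected for an equilibrium constant.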

Fitting the law of mass action

The parameters of our model were the total molar fraction of the protein, X_Total, which we know from the experiments, and the dimer-dimer association free energies, ΔF_n, in the T = 3 and T = 4 symmetries. The total molar fraction was given by $X_{Total} = \frac{C_{protein}}{55.5 M}$, where C_protein is the molar protein concentration in mol/L. The fitting parameters were ΔF₄ and α, where ΔF₃ = α · ΔF₄. Since X₁ is a function of ΔF₄ and α, to compute Equation 24 we set the values of α and ΔF₄ and numerically solved Equation 15 by finding the value of X₁ that minimizes:

∣ X_{Total}^{- 1} \cdot \sum_{n} \sum_{s} \sum_{c} X_{n,s,c} (X_{1}, Δ F_{n}) - 1 ∣ .

(25)
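A minimal sketch of this numerical step is given below. It assumes a bisection search on ln X₁, which is one of several possible root-finding choices and is not stated in the text; it works because the sum in Equation 25 increases monotonically with X₁. The three-state table (free dimer, one intermediate, full T = 4 capsid), its degeneracies, and the value of ΔF₄ are hypothetical placeholders rather than fitted values.

```python
import numpy as np

def total_mole_fraction(log_x1, s, c, log_omega, dF_over_kT):
    """Sum of Equation 24 over all states, evaluated from log terms.
    The free dimer must be included as the state s=1, c=0, Omega=1."""
    log_terms = np.log(s) + log_omega + s * log_x1 - c * dF_over_kT
    return np.exp(log_terms).sum()

def solve_x1(x_total, s, c, log_omega, dF_over_kT, tol=1e-12):
    """Bisection on ln X1 so that the summed mole fractions equal x_total.
    The upper bracket is X1 = x_total, since the free dimer term alone
    then already accounts for all the protein."""
    lo, hi = np.log(x_total) - 50.0, np.log(x_total)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if total_mole_fraction(mid, s, c, log_omega, dF_over_kT) > x_total:
            hi = mid
        else:
            lo = mid
    return np.exp(0.5 * (lo + hi))

# Hypothetical three-state example.
s = np.array([1.0, 60.0, 120.0])        # dimers per state
c = np.array([0.0, 110.0, 240.0])       # contacts per state
log_omega = np.array([0.0, 30.0, 0.0])  # ln(degeneracy), placeholder values
dF_over_kT = -7.2                       # per-contact free energy in k_B T
x_total = 30e-6 / 55.5                  # 30 uM protein on the mole fraction scale

x1 = solve_x1(x_total, s, c, log_omega, dF_over_kT)
x_states = np.exp(np.log(s) + log_omega + s * np.log(x1) - c * dF_over_kT)
print("X1 =", x1, "fractions =", x_states / x_total)
```

In a full fit this solve would sit inside an outer loop over ΔF₄ and α, with a per-state array of ΔF_n / k_B T replacing the single scalar used here.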

The resulting X_n,s,c molar fractions were then used to compute the expected solution X-ray scattering curve:

I_{model} (q) = X_{Total}^{- 1} \cdot \sum_{n,s,c} X_{n,s,c} I_{n,s,c} (q) = \sum_{n,s,c} p_{n,s,c} I_{n,s,c} (q)

(26)

where I_n,s,c (q) was the scattering curve of the most probable $T_{n}^{s,c}$ intermediate and $p_{n,s,c} \equiv X_{Total}^{- 1} \cdot X_{n,s,c}$ was the probability of finding that intermediate. I_model (q) was then compared with the experimental scattering curve, I_exp (q). ΔF₄ and α were then modified and the process was repeated until the best fit to the experimental data was attained. The best-fit criterion was the minimal χ² value, where

χ^{2} = \frac{1}{N - 1} \sum_{i = 0}^{N} {(\frac{a \cdot I_{model} (q_{i}) + b - I_{exp} (q_{i})}{σ_{i}})}^{2} .

(27)

I_exp (q_i) is the measured scattering intensity at q_i, where q_i is the magnitude of the momentum transfer vector. σ_i is the standard deviation of I_exp (q_i), a is a total scaling factor and b is a constant value to correct the high q region for small inaccuracies (mainly owing to dark current contribution) when subtracting background scattering curves.
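The comparison between model and data can be sketched as follows. The text does not state how a and b were determined, so this example assumes they are obtained by weighted linear least squares for each trial model; the toy curves, probabilities, and noise level are hypothetical. The χ² normalization follows Equation 27.

```python
import numpy as np

def mixture_intensity(curves, p):
    """Equation 26: probability-weighted sum of state curves.
    curves has shape (n_q, n_states); p has shape (n_states,)."""
    return curves @ p

def fit_scale_and_offset(i_model, i_exp, sigma):
    """Weighted least squares for a and b in a * I_model + b ~ I_exp
    (an assumed procedure, not one described in the text)."""
    A = np.column_stack([i_model, np.ones_like(i_model)]) / sigma[:, None]
    y = i_exp / sigma
    (a, b), *_ = np.linalg.lstsq(A, y, rcond=None)
    return a, b

def chi2(i_model, i_exp, sigma, a, b):
    """Chi-squared of Equation 27, normalized by N - 1."""
    resid = (a * i_model + b - i_exp) / sigma
    return np.sum(resid ** 2) / (len(i_exp) - 1)

# Hypothetical toy data: two Guinier-like curves mixed 30:70.
q = np.linspace(0.1, 1.5, 50)
curves = np.column_stack([np.exp(-(3.0 * q) ** 2 / 3.0),
                          np.exp(-(5.0 * q) ** 2 / 3.0)])
p = np.array([0.3, 0.7])
i_model = mixture_intensity(curves, p)

sigma = np.full_like(q, 0.01)
i_exp = 2.5 * i_model + 0.1  # noiseless "measurement" with known a and b

a, b = fit_scale_and_offset(i_model, i_exp, sigma)
print(a, b, chi2(i_model, i_exp, sigma, a, b))
```

Because a and b enter linearly, solving for them in closed form at each step leaves only ΔF₄ and α for the outer search.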

Conclusions

Our high resolution solution X-ray scattering data, MC simulations, and thermodynamic analysis reveal the distribution of capsid assembly reaction products over a wide range of conditions. The results extend earlier studies showing that dimer-dimer association free energy is a function of temperature and ionic strength by rigorously fitting high resolution data to a grand canonical free energy landscape. To construct this model we used umbrella sampling MC simulations of assembly, leading us to estimate about 10³⁰ possible intermediates. By applying Boltzmann weighting to this immense library we observed (i) a narrow swath of stable species that demarcated a series of en route reactions from dimer to capsid and (ii) that subtle changes in association energy could favor reaction pathways that led to kinetic traps with high barriers to return to an on-path route. For example, at strong dimer-dimer association free energy, high temperature, and high dimer concentration, kinetically trapped T=3-like nanoparticles, with a diameter of 28 nm, formed. These structures did not form when the salt concentration was slowly increased by dialysis, suggesting that these nanoparticles may not be stable T = 3 capsids. Similar flaws could also be achieved at high concentrations of cp149 because the grand canonical free energy landscape of the assembly reaction is also determined by reactant concentration. Successful assembly reactions are best realized within a narrow range of protein concentration and dimer-dimer association free energy. Thus, molecules that strengthen subunit-subunit interactions are predicted to have antiviral effects.^6,48,49 The approach established in the present work can be applied to characterize other complex macromolecular assemblies. The mechanistic insight gained by our analysis provides a means to describe bottom-up complex assembly reactions.

Supplementary Material

NIHMS1581441-supplement-SI.pdf (13.1 MB, pdf)

Acknowledgement

We thank Daniel Harries and Michael Hagan for very helpful discussions. We acknowledge Desy synchrotron at Hamburg, beamline P12 (D. Svergun and his team), the European Synchrotron Radiation Facility (ESRF) beamline ID02 (T. Narayanan and his team), and Soleil synchrotron, Swing beamline (J. Perez and his team), for provision of synchrotron radiation facilities and for assistance in using the beamlines. This project was supported by the NIH (award number 1R01AI118933). R.A. acknowledges support from the Kaye-Einstein Fellowship Foundation. U.R. and R.A. acknowledge financial support from the Israel Science Foundation (grant 656/17).

References

(1) Katen S; Zlotnick A The thermodynamics of virus capsid assembly. Methods Enzymol. 2009, 455, 395–417.
(2) Zlotnick A Are weak protein–protein interactions the general rule in capsid assembly? Virology 2003, 315, 269–274.
(3) Zlotnick A; Johnson JM; Wingfield PW; Stahl SJ; Endres D A theoretical model successfully identifies features of hepatitis B virus capsid assembly. Biochemistry 1999, 38, 14644–14652.
(4) Moisant P; Neeman H; Zlotnick A Exploring the paths of (virus) assembly. Biophys. J. 2010, 99, 1350–1357.
(5) Asor R; Khaykelson D; Ben-nun Shaul O; Oppenheim A; Raviv U Effect of Calcium Ions and Disulfide Bonds on Swelling of Virus Particles. ACS Omega 2019, 4, 58–64.
(6) Zlotnick A; Mukhopadhyay S Virus assembly, allostery and antivirals. Trends Microbiol. 2011, 19, 14–23.
(7) Uetrecht C; Barbu IM; Shoemaker GK; Van Duijn E; Heck AJ Interrogating viral capsid assembly with ion mobility–mass spectrometry. Nat. Chem. 2011, 3, 126.
(8) Lutomski CA; Lyktey NA; Zhao Z; Pierson EE; Zlotnick A; Jarrold MF Hepatitis B Virus Capsid Completion Occurs through Error Correction. J. Am. Chem. Soc. 2017, 139, 16932–16938, PMID: 29125756.
(9) Hagan MF; Elrad OM Understanding the concentration dependence of viral capsid assembly kinetics - the origin of the lag time and identifying the critical nucleus size. Biophys. J. 2010, 98, 1065–1074.
(10) Endres D; Miyahara M; Moisant P; Zlotnick A A reaction landscape identifies the intermediates critical for self-assembly of virus capsids and other polyhedral structures. Protein Sci. 2005, 14, 1518–1525.
(11) Morozov AY; Bruinsma RF; Rudnick J Assembly of viruses and the pseudo-law of mass action. J. Chem. Phys. 2009, 131, 10B607.
(12) Garmann RF; Comas-Garcia M; Knobler CM; Gelbart WM Physical principles in the self-assembly of a simple spherical virus. Acc. Chem. Res. 2015, 49, 48–55.
(13) Tresset G; Chen J; Chevreuil M; Nhiri N; Jacquet E; Lansac Y Two-dimensional phase transition of viral capsid gives insights into subunit interactions. Phys. Rev. Appl. 2017, 7, 014005.
(14) Kler S; Asor R; Li C; Ginsburg A; Harries D; Oppenheim A; Zlotnick A; Raviv U RNA encapsidation by SV40-derived nanoparticles follows a rapid two-state mechanism. J. Am. Chem. Soc. 2012, 134, 8823–8830.
(15) Li C; Kneller AR; Jacobson SC; Zlotnick A Single particle observation of SV40 vp1 polyanion-induced assembly shows that substrate size and structure modulate capsid geometry. ACS Chem. Biol. 2017, 12, 1327–1334.
(16) Kler S; Wang JC-Y; Dhason M; Oppenheim A; Zlotnick A Scaffold properties are a key determinant of the size and shape of self-assembled virus-derived particles. ACS Chem. Biol. 2013, 8, 2753–2761.
(17) Hu Y; Zandi R; Anavitarte A; Knobler CM; Gelbart WM Packaging of a polymer by a viral capsid: the interplay between polymer length and capsid size. Biophys. J. 2008, 94, 1428–1436.
(18) Auyeung E; Li TI; Senesi AJ; Schmucker AL; Pals BC; de La Cruz MO; Mirkin CA DNA-mediated nanoparticle crystallization into Wulff polyhedra. Nature 2014, 505, 73–77.
(19) Schwarz B; Uchida M; Douglas T Chapter One-Biomedical and Catalytic Opportunities of Virus-Like Particles in Nanotechnology. Adv. Virus Res. 2017, 97, 1–60.
(20) Pokorski JK; Steinmetz NF The art of engineering viral nanoparticles. Mol. Pharm. 2011, 8, 29.
(21) Loo L; Guenther RH; Basnayake VR; Lommel SA; Franzen S Controlled encapsidation of gold nanoparticles by a viral protein shell. J. Am. Chem. Soc. 2006, 128, 4502–4503.
(22) Chen C; Daniel M-C; Quinkert ZT; De M; Stein B; Bowman VD; Chipman PR; Rotello VM; Kao CC; Dragnea B Nanoparticle-templated assembly of viral protein cages. Nano Lett. 2006, 6, 611–615.
(23) Douglas T; Young M Viruses: making friends with old foes. Science 2006, 312, 873–875.
(24) Uchida M; Klem MT; Allen M; Suci P; Flenniken M; Gillitzer E; Varpness Z; Liepold LO; Young M; Douglas T Biological containers: protein cages as multifunctional nanoplatforms. Adv. Mater. 2007, 19, 1025–1042.
(25) Jordan PC; Patterson DP; Saboda KN; Edwards EJ; Miettinen HM; Basu G; Thielges MC; Douglas T Self-assembling biomolecular catalysts for hydrogen production. Nat. Chem. 2015.
(26) Ceres P; Zlotnick A Weak protein-protein interactions are sufficient to drive assembly of hepatitis B virus capsids. Biochemistry 2002, 41, 11525–11531.
(27) Stannard LM; Hodgkiss M Morphological irregularities in Dane particle cores. J. Gen. Virol. 1979, 45, 509–514.
(28) Zhou J; Kondylis P; Haywood DG; Harms ZD; Lee LS; Zlotnick A; Jacobson SC Characterization of Virus Capsids and Their Assembly Intermediates by Multicycle Resistive-Pulse Sensing with Four Pores in Series. Anal. Chem. 2018, 90, 7267–7274.
(29) Lutomski CA; Lyktey NA; Pierson EE; Zhao Z; Zlotnick A; Jarrold MF Multiple Pathways in Capsid Assembly. J. Am. Chem. Soc. 2018, 140, 5784–5790.
(30) Ginsburg A; Ben-Nun T; Asor R; Shemesh A; Ringel I; Raviv U Reciprocal Grids: A Hierarchical Algorithm for Computing Solution X-ray Scattering Curves from Supramolecular Complexes at High Resolution. J. Chem. Inf. Model. 2016, 56, PMID: 27410762.
(31) Ginsburg A; Ben-Nun T; Asor R; Shemesh A; Fink L; Tekoah R; Levartovsky Y; Khaykelson D; Dharan R; Fellig A; Raviv U D+: Software for High-Resolution Hierarchical Modeling of Solution X-Ray Scattering from Complex Structures. J. Appl. Crystallogr. 2019, 52.
(32) Jaynes ET Information theory and statistical mechanics. Phys. Rev. 1957, 106, 620.
(33) Levine RD An information theoretical approach to inversion problems. J. Phys. A Math. Gen. 1980, 13, 91.
(34) Levinthal C How to fold graciously. Mossbauer spectroscopy in biological systems 1969, 67, 22–24.
(35) Packianathan C; Katen SP; Dann CE; Zlotnick A Conformational changes in the hepatitis B virus core protein are consistent with a role for allostery in virus assembly. J. Virol. 2010, 84, 1607–1615.
(36) Chen VB; Arendall WB III; Headd JJ; Keedy DA; Immormino RM; Kapral GJ; Murray LW; Richardson JS; Richardson DC MolProbity: all-atom structure validation for macromolecular crystallography. Acta Crystallogr. D 2010, 66, 12–21.
(37) Harms ZD; Selzer L; Zlotnick A; Jacobson SC Monitoring assembly of virus capsids with nanofluidic devices. ACS Nano 2015, 9, 9087–9096.
(38) Sun X; Li D; Wang Z; Liu Q; Wei Y; Liu T A dimorphism shift of hepatitis B virus capsids in response to ionic conditions. Nanoscale 2018, 10, 16984–16989.
(39) Kegel WK; van der Schoot P Competing hydrophobic and screened-Coulomb interactions in hepatitis B virus capsid assembly. Biophys. J. 2004, 86, 3905–3913.
(40) Pierson EE; Keifer DZ; Selzer L; Lee LS; Contino NC; Wang JC-Y; Zlotnick A; Jarrold MF Detection of late intermediates in virus capsid assembly by charge detection mass spectrometry. J. Am. Chem. Soc. 2014, 136, 3536–3541.
(41) Luque A; Reguera D; Morozov A; Rudnick J; Bruinsma R Physics of shell assembly: Line tension, hole implosion, and closure catastrophe. J. Chem. Phys. 2012, 136, 184507.
(42) Singh S; Zlotnick A Observed hysteresis of virus capsid disassembly is implicit in kinetic models of assembly. J. Biol. Chem. 2003, 278, 18249–18255.
(43) Hagan MF Modeling viral capsid assembly. Adv. Chem. Phys. 2014, 155, 1.
(44) Rapaport D Molecular dynamics study of T= 3 capsid assembly. J. Biol. Phys. 2018, 44, 147–162.
(45) Hagan MF; Elrad OM; Jack RL Mechanisms of kinetic trapping in self-assembly and phase transformation. J. Chem. Phys. 2011, 135, 104115.
(46) Lazaro GR; Hagan MF Allosteric control of icosahedral capsid assembly. J. Phys. Chem. B 2016, 120, 6306–6318.
(47) Michaels TC; Bellaiche MM; Hagan MF; Knowles TP Kinetic constraints on self-assembly into closed supramolecular structures. Sci. Rep. 2017, 7, 12295.
(48) Bourne CR; Finn M; Zlotnick A Global structural changes in hepatitis B virus capsids induced by the assembly effector HAP1. J. Virol. 2006, 80, 11055–11061.
(49) Venkatakrishnan B; Zlotnick A The structural biology of hepatitis B virus: form and function. Annu. Rev. Virol. 2016, 3, 429–451.
(50) Zlotnick A; Ceres P; Singh S; Johnson JM A small molecule inhibits and misdirects assembly of hepatitis B virus capsids. J. Virol. 2002, 76, 4848–4854.
(51) Blanchet CE; Spilotros A; Schwemmer F; Graewert MA; Kikhney A; Jeffries CM; Franke D; Mark D; Zengerle R; Cipriani F; Fiedler S; Roessle M; Svergun DI Versatile sample environments and automation for biological solution X-ray scattering experiments at the P12 beamline (PETRA III, DESY). J. Appl. Crystallogr. 2015, 48, 431–443.
(52) Round A; Felisaz F; Fodinger L; Gobbo A; Huet J; Villard C; Blanchet CE; Pernot P; McSweeney S; Roessle M; Svergun DI; Cipriani F BioSAXS Sample Changer: a robotic sample changer for rapid and reliable high-throughput X-ray solution scattering experiments. Acta Crystallogr. D 2015, 71, 67–75.
(53) Van Vaerenbergh P; Léonardon J; Sztucki M; Boesecke P; Gorini J; Claustre L; Sever F; Morse J; Narayanan T An upgrade beamline for combined wide, small and ultra small-angle x-ray scattering at the ESRF. Proceedings of the 12th International Conference on Synchrotron Radiation Instrumentation (SRI2015). 2016; p 030034.
(54) Hammersley A FIT2D: a multi-purpose data reduction, analysis and visualization program. J. Appl. Crystallogr. 2016, 49.
(55) Asor R; Ben-nun Shaul O; Oppenheim A; Raviv U Crystallization, Reentrant Melting, and Resolubilization of Virus Nanoparticles. ACS Nano 2017, 11, 9814–9824.
(56) Ginsburg A; Shemesh A; Millgram A; Dharan R; Levi-Kalisman Y; Ringel I; Raviv U Structure of Dynamic, Taxol-Stabilized, and GMPPCP-Stabilized Microtubule. J. Phys. Chem. B 2017, 121, 8427–8436.
(57) David G; Pérez J Combined sampler robot and high-performance liquid chromatography: a fully automated system for biological small-angle X-ray scattering experiments at the Synchrotron SOLEIL SWING beamline. J. Appl. Crystallogr. 2009, 42, 892–900.
(58) Nadler M; Steiner A; Dvir T; Szekely O; Szekely P; Ginsburg A; Asor R; Resh R; Tamburu C; Peres M; Raviv U Following the structural changes during zinc-induced crystallization of charged membranes using time-resolved solution X-ray scattering. Soft Matter 2011, 7, 1512–1523.
(59) Louzon D; Ginsburg A; Schwenger W; Dvir T; Dogic Z; Raviv U Structure and Intermolecular Interactions between L-Type Straight Flagellar Filaments. Biophys. J. 2017, 112, 2184–2195.
(60) Fink L; Feitelson J; Noff R; Dvir T; Tamburu C; Raviv U Osmotic stress induced desorption of calcium ions from dipolar lipid membranes. Langmuir 2017.
(61) Chandler D Introduction to modern statistical mechanics; Oxford University Press, 1987.
(62) Metropolis N; Rosenbluth AW; Rosenbluth MN; Teller AH; Teller E Equation of state calculations by fast computing machines. J. Chem. Phys. 1953, 21, 1087–1092.
(63) Tarjan R Depth-First Search and Linear Graph Algorithms. SIAM J. Comput. 1972, 1, 146–160.
(64) Newman M; Barkema G Monte Carlo Methods in Statistical Physics chapter 1–4; Oxford University Press: New York, USA, 1999.
(65) Hagberg AA; Schult DA; Swart PJ Exploring Network Structure, Dynamics, and Function using NetworkX. Proceedings of the 7th Python in Science Conference. Pasadena, CA USA, 2008; pp 11–15.
(66) Cordella LP; Foggia P; Sansone C; Vento M An improved algorithm for matching large graphs. 3rd IAPR-TC15 workshop on graph-based representations in pattern recognition. 2001; pp 149–159.
(67) Eggert DW; Lorusso A; Fisher RB Estimating 3-D rigid body transformations: a comparison of four major algorithms. Mach. Vision Appl. 1997, 9, 272–290.
(68) Fraser R; MacRae T; Suzuki E An improved method for calculating the contribution of solvent to the X-ray diffraction pattern of biological molecules. J. Appl. Crystallogr. 1978, 11, 693–694.
(69) Svergun D; Barberato C; Koch MH CRYSOL – a program to evaluate X-ray solution scattering of biological macromolecules from atomic coordinates. J. Appl. Crystallogr. 1995, 28, 768–773.
(70) Grudinin S; Garkavenko M; Kazennov A Pepsi-SAXS: an adaptive method for rapid and accurate computation of small-angle X-ray scattering profiles. Acta Crystallogr. D 2017, 73.
(71) Schneidman-Duhovny D; Hammel M; Tainer JA; Sali A Accurate SAXS profile computation and its assessment by contrast variation experiments. Biophys. J. 2013, 105, 962–974.
(72) Székely P; Ginsburg A; Ben-Nun T; Raviv U Solution x-ray scattering form factors of supramolecular self-assembled structures. Langmuir 2010, 26, 13110–13129.
(73) Ben-Nun T; Ginsburg A; Székely P; Raviv U X+: a comprehensive computationally accelerated structure analysis tool for solution x-ray scattering from supramolecular self-assemblies. J. Appl. Crystallogr. 2010, 43, 1522–1531.
(74) Pauw BR Everything SAXS: small-angle scattering pattern collection and correction. J. Phys. Condens. Matter 2013, 25, 383201.
(75) MacQueen J Some methods for classification and analysis of multivariate observations. Proc. Fifth Berkeley Symp. on Math. Statist. and Prob. 1967; pp 281–297.
(76) Gelbart WM; Ben-Shaul A; Roux D Micelles, membranes, microemulsions, and monolayers; Springer Science & Business Media, 2012.
(77) Zandi R; van der Schoot P; Reguera D; Kegel W; Reiss H Classical nucleation theory of virus capsids. Biophys. J. 2006, 90, 1939–1948.
(78) Perlmutter JD; Hagan MF Mechanisms of virus assembly. Annu. Rev. Phys. Chem. 2015, 66.
(79) Safran S Statistical thermodynamics of surfaces, interfaces, and membranes; CRC Press, 2018.

Associated Data

This section collects any data citations, data availability statements, or supplementary materials included in this article.

Supplementary Materials

NIHMS1581441-supplement-SI.pdf^{(13.1MB, pdf)}

[R1] (1).Katen S; Zlotnick A The thermodynamics of virus capsid assembly. Methods Enzymol. 2009, 455, 395–417. [DOI] [PMC free article] [PubMed] [Google Scholar]

[R2] (2).Zlotnick A Are weak protein–protein interactions the general rule in capsid assembly? Virology 2003, 315, 269–274. [DOI] [PubMed] [Google Scholar]

[R3] (3).Zlotnick A; Johnson JM; Wingfield PW; Stahl SJ; Endres D A theoretical model successfully identifies features of hepatitis B virus capsid assembly. Biochemistry 1999, 38, 14644–14652. [DOI] [PubMed] [Google Scholar]

[R4] (4).Moisant P; Neeman H; Zlotnick A Exploring the paths of (virus) assembly. Biophys. J 2010, 99, 1350–1357. [DOI] [PMC free article] [PubMed] [Google Scholar]

[R5] (5).Asor R; Khaykelson D; Ben-nun Shaul O; Oppenheim A; Raviv U Effect of Calcium Ions and Disulfide Bonds on Swelling of Virus Particles. ACS Omega 2019, 4, 58–64. [DOI] [PMC free article] [PubMed] [Google Scholar]

[R6] (6).Zlotnick A; Mukhopadhyay S Virus assembly, allostery and antivirals. Trends Microbiol. 2011, 19, 14–23. [DOI] [PMC free article] [PubMed] [Google Scholar]

[R7] (7).Uetrecht C; Barbu IM; Shoemaker GK; Van Duijn E; Heck AJ Interrogating viral capsid assembly with ion mobility–mass spectrometry. Nat. Chem 2011, 3, 126. [DOI] [PubMed] [Google Scholar]

[R8] (8).Lutomski CA; Lyktey NA; Zhao Z; Pierson EE; Zlotnick A; Jarrold MF Hepatitis B Virus Capsid Completion Occurs through Error Correction. J. Am. Chem. Soc 2017, 139, 16932–16938, PMID: 29125756. [DOI] [PMC free article] [PubMed] [Google Scholar]

[R9] (9).Hagan MF; Elrad OM Understanding the concentration dependence of viral capsid assembly kinetics - the origin of the lag time and identifying the critical nucleus size. Biophys. J 2010, 98, 1065–1074. [DOI] [PMC free article] [PubMed] [Google Scholar]

[R10] (10).Endres D; Miyahara M; Moisant P; Zlotnick A A reaction landscape identifies the intermediates critical for self-assembly of virus capsids and other polyhedral structures. Protein Sci. 2005, 14, 1518–1525. [DOI] [PMC free article] [PubMed] [Google Scholar]

[R11] (11). Morozov AY; Bruinsma RF; Rudnick J Assembly of viruses and the pseudo-law of mass action. J. Chem. Phys 2009, 131, 10B607.

[R12] (12). Garmann RF; Comas-Garcia M; Knobler CM; Gelbart WM Physical principles in the self-assembly of a simple spherical virus. Acc. Chem. Res 2015, 49, 48–55.

[R13] (13). Tresset G; Chen J; Chevreuil M; Nhiri N; Jacquet E; Lansac Y Two-dimensional phase transition of viral capsid gives insights into subunit interactions. Phys. Rev. Appl 2017, 7, 014005.

[R14] (14). Kler S; Asor R; Li C; Ginsburg A; Harries D; Oppenheim A; Zlotnick A; Raviv U RNA encapsidation by SV40-derived nanoparticles follows a rapid two-state mechanism. J. Am. Chem. Soc 2012, 134, 8823–8830.

[R15] (15). Li C; Kneller AR; Jacobson SC; Zlotnick A Single particle observation of SV40 vp1 polyanion-induced assembly shows that substrate size and structure modulate capsid geometry. ACS Chem. Biol 2017, 12, 1327–1334.

[R16] (16). Kler S; Wang JC-Y; Dhason M; Oppenheim A; Zlotnick A Scaffold properties are a key determinant of the size and shape of self-assembled virus-derived particles. ACS Chem. Biol 2013, 8, 2753–2761.

[R17] (17). Hu Y; Zandi R; Anavitarte A; Knobler CM; Gelbart WM Packaging of a polymer by a viral capsid: the interplay between polymer length and capsid size. Biophys. J 2008, 94, 1428–1436.

[R18] (18). Auyeung E; Li TI; Senesi AJ; Schmucker AL; Pals BC; de La Cruz MO; Mirkin CA DNA-mediated nanoparticle crystallization into Wulff polyhedra. Nature 2014, 505, 73–77.

[R19] (19). Schwarz B; Uchida M; Douglas T Chapter One-Biomedical and Catalytic Opportunities of Virus-Like Particles in Nanotechnology. Adv. Virus Res 2017, 97, 1–60.

[R20] (20). Pokorski JK; Steinmetz NF The art of engineering viral nanoparticles. Mol. Pharm 2011, 8, 29.

[R21] (21). Loo L; Guenther RH; Basnayake VR; Lommel SA; Franzen S Controlled encapsidation of gold nanoparticles by a viral protein shell. J. Am. Chem. Soc 2006, 128, 4502–4503.

[R22] (22). Chen C; Daniel M-C; Quinkert ZT; De M; Stein B; Bowman VD; Chipman PR; Rotello VM; Kao CC; Dragnea B Nanoparticle-templated assembly of viral protein cages. Nano Lett. 2006, 6, 611–615.

[R23] (23). Douglas T; Young M Viruses: making friends with old foes. Science 2006, 312, 873–875.

[R24] (24). Uchida M; Klem MT; Allen M; Suci P; Flenniken M; Gillitzer E; Varpness Z; Liepold LO; Young M; Douglas T Biological containers: protein cages as multifunctional nanoplatforms. Adv. Mater 2007, 19, 1025–1042.

[R25] (25). Jordan PC; Patterson DP; Saboda KN; Edwards EJ; Miettinen HM; Basu G; Thielges MC; Douglas T Self-assembling biomolecular catalysts for hydrogen production. Nat. Chem 2015.

[R26] (26). Ceres P; Zlotnick A Weak protein-protein interactions are sufficient to drive assembly of hepatitis B virus capsids. Biochemistry 2002, 41, 11525–11531.

[R27] (27). Stannard LM; Hodgkiss M Morphological irregularities in Dane particle cores. J. Gen. Virol 1979, 45, 509–514.

[R28] (28). Zhou J; Kondylis P; Haywood DG; Harms ZD; Lee LS; Zlotnick A; Jacobson SC Characterization of Virus Capsids and Their Assembly Intermediates by Multicycle Resistive-Pulse Sensing with Four Pores in Series. Anal. Chem 2018, 90, 7267–7274.

[R29] (29). Lutomski CA; Lyktey NA; Pierson EE; Zhao Z; Zlotnick A; Jarrold MF Multiple Pathways in Capsid Assembly. J. Am. Chem. Soc 2018, 140, 5784–5790.

[R30] (30). Ginsburg A; Ben-Nun T; Asor R; Shemesh A; Ringel I; Raviv U Reciprocal Grids: A Hierarchical Algorithm for Computing Solution X-ray Scattering Curves from Supramolecular Complexes at High Resolution. J. Chem. Inf. Model 2016, 56, PMID: 27410762.

[R31] (31). Ginsburg A; Ben-Nun T; Asor R; Shemesh A; Fink L; Tekoah R; Levartovsky Y; Khaykelson D; Dharan R; Fellig A; Raviv U D+: Software for High-Resolution Hierarchical Modeling of Solution X-Ray Scattering from Complex Structures. J. Appl. Crystallogr 2019, 52.

[R32] (32). Jaynes ET Information theory and statistical mechanics. Phys. Rev 1957, 106, 620.

[R33] (33). Levine RD An information theoretical approach to inversion problems. J. Phys. A Math. Gen 1980, 13, 91.

[R34] (34). Levinthal C How to fold graciously. Mossbauer spectroscopy in biological systems 1969, 67, 22–24.

[R35] (35). Packianathan C; Katen SP; Dann CE; Zlotnick A Conformational changes in the hepatitis B virus core protein are consistent with a role for allostery in virus assembly. J. Virol 2010, 84, 1607–1615.

[R36] (36). Chen VB; Arendall WB III; Headd JJ; Keedy DA; Immormino RM; Kapral GJ; Murray LW; Richardson JS; Richardson DC MolProbity: all-atom structure validation for macromolecular crystallography. Acta Crystallogr. D 2010, 66, 12–21.

[R37] (37). Harms ZD; Selzer L; Zlotnick A; Jacobson SC Monitoring assembly of virus capsids with nanofluidic devices. ACS Nano 2015, 9, 9087–9096.

[R38] (38). Sun X; Li D; Wang Z; Liu Q; Wei Y; Liu T A dimorphism shift of hepatitis B virus capsids in response to ionic conditions. Nanoscale 2018, 10, 16984–16989.

[R39] (39). Kegel WK; van der Schoot P Competing hydrophobic and screened-Coulomb interactions in hepatitis B virus capsid assembly. Biophys. J 2004, 86, 3905–3913.

[R40] (40). Pierson EE; Keifer DZ; Selzer L; Lee LS; Contino NC; Wang JC-Y; Zlotnick A; Jarrold MF Detection of late intermediates in virus capsid assembly by charge detection mass spectrometry. J. Am. Chem. Soc 2014, 136, 3536–3541.

[R41] (41). Luque A; Reguera D; Morozov A; Rudnick J; Bruinsma R Physics of shell assembly: Line tension, hole implosion, and closure catastrophe. J. Chem. Phys 2012, 136, 184507.

[R42] (42). Singh S; Zlotnick A Observed hysteresis of virus capsid disassembly is implicit in kinetic models of assembly. J. Biol. Chem 2003, 278, 18249–18255.

[R43] (43). Hagan MF Modeling viral capsid assembly. Adv. Chem. Phys 2014, 155, 1.

[R44] (44). Rapaport D Molecular dynamics study of T = 3 capsid assembly. J. Biol. Phys 2018, 44, 147–162.

[R45] (45). Hagan MF; Elrad OM; Jack RL Mechanisms of kinetic trapping in self-assembly and phase transformation. J. Chem. Phys 2011, 135, 104115.

[R46] (46). Lazaro GR; Hagan MF Allosteric control of icosahedral capsid assembly. J. Phys. Chem. B 2016, 120, 6306–6318.

[R47] (47). Michaels TC; Bellaiche MM; Hagan MF; Knowles TP Kinetic constraints on self-assembly into closed supramolecular structures. Sci. Rep 2017, 7, 12295.

[R48] (48). Bourne CR; Finn M; Zlotnick A Global structural changes in hepatitis B virus capsids induced by the assembly effector HAP1. J. Virol 2006, 80, 11055–11061.

[R49] (49). Venkatakrishnan B; Zlotnick A The structural biology of hepatitis B virus: form and function. Annu. Rev. Virol 2016, 3, 429–451.

[R50] (50). Zlotnick A; Ceres P; Singh S; Johnson JM A small molecule inhibits and misdirects assembly of hepatitis B virus capsids. J. Virol 2002, 76, 4848–4854.

[R51] (51). Blanchet CE; Spilotros A; Schwemmer F; Graewert MA; Kikhney A; Jeffries CM; Franke D; Mark D; Zengerle R; Cipriani F; Fiedler S; Roessle M; Svergun DI Versatile sample environments and automation for biological solution X-ray scattering experiments at the P12 beamline (PETRA III, DESY). J. Appl. Crystallogr 2015, 48, 431–443.

[R52] (52). Round A; Felisaz F; Fodinger L; Gobbo A; Huet J; Villard C; Blanchet CE; Pernot P; McSweeney S; Roessle M; Svergun DI; Cipriani F BioSAXS Sample Changer: a robotic sample changer for rapid and reliable high-throughput X-ray solution scattering experiments. Acta Crystallogr. D 2015, 71, 67–75.

[R53] (53). Van Vaerenbergh P; Léonardon J; Sztucki M; Boesecke P; Gorini J; Claustre L; Sever F; Morse J; Narayanan T An upgrade beamline for combined wide, small and ultra small-angle x-ray scattering at the ESRF. Proceedings of the 12th International Conference on Synchrotron Radiation Instrumentation (SRI2015). 2016; p 030034.

[R54] (54). Hammersley A FIT2D: a multi-purpose data reduction, analysis and visualization program. J. Appl. Crystallogr 2016, 49.

[R55] (55). Asor R; Ben-nun Shaul O; Oppenheim A; Raviv U Crystallization, Reentrant Melting, and Resolubilization of Virus Nanoparticles. ACS Nano 2017, 11, 9814–9824.

[R56] (56). Ginsburg A; Shemesh A; Millgram A; Dharan R; Levi-Kalisman Y; Ringel I; Raviv U Structure of Dynamic, Taxol-Stabilized, and GMPPCP-Stabilized Microtubule. J. Phys. Chem. B 2017, 121, 8427–8436.

[R57] (57). David G; Pérez J Combined sampler robot and high-performance liquid chromatography: a fully automated system for biological small-angle X-ray scattering experiments at the Synchrotron SOLEIL SWING beamline. J. Appl. Crystallogr 2009, 42, 892–900.

[R58] (58). Nadler M; Steiner A; Dvir T; Szekely O; Szekely P; Ginsburg A; Asor R; Resh R; Tamburu C; Peres M; Raviv U Following the structural changes during zinc-induced crystallization of charged membranes using time-resolved solution X-ray scattering. Soft Matter 2011, 7, 1512–1523.

[R59] (59). Louzon D; Ginsburg A; Schwenger W; Dvir T; Dogic Z; Raviv U Structure and Intermolecular Interactions between L-Type Straight Flagellar Filaments. Biophys. J 2017, 112, 2184–2195.

[R60] (60). Fink L; Feitelson J; Noff R; Dvir T; Tamburu C; Raviv U Osmotic stress induced desorption of calcium ions from dipolar lipid membranes. Langmuir 2017.

[R61] (61). Chandler D Introduction to modern statistical mechanics; Oxford University Press, 1987.

[R62] (62). Metropolis N; Rosenbluth AW; Rosenbluth MN; Teller AH; Teller E Equation of state calculations by fast computing machines. J. Chem. Phys 1953, 21, 1087–1092.

[R63] (63). Tarjan R Depth-First Search and Linear Graph Algorithms. SIAM J. Comput 1972, 1, 146–160.

[R64] (64). Newman M; Barkema G Monte Carlo Methods in Statistical Physics chapter 1–4; Oxford University Press: New York, USA, 1999.

[R65] (65). Hagberg AA; Schult DA; Swart PJ Exploring Network Structure, Dynamics, and Function using NetworkX. Proceedings of the 7th Python in Science Conference. Pasadena, CA USA, 2008; pp 11–15.

[R66] (66). Cordella LP; Foggia P; Sansone C; Vento M An improved algorithm for matching large graphs. 3rd IAPR-TC15 workshop on graph-based representations in pattern recognition. 2001; pp 149–159.

[R67] (67). Eggert DW; Lorusso A; Fisher RB Estimating 3-D rigid body transformations: a comparison of four major algorithms. Mach. Vision Appl 1997, 9, 272–290.

[R68] (68). Fraser R; MacRae T; Suzuki E An improved method for calculating the contribution of solvent to the X-ray diffraction pattern of biological molecules. J. Appl. Crystallogr 1978, 11, 693–694.

[R69] (69). Svergun D; Barberato C; Koch MH CRYSOL – a program to evaluate X-ray solution scattering of biological macromolecules from atomic coordinates. J. Appl. Crystallogr 1995, 28, 768–773.

[R70] (70). Grudinin S; Garkavenko M; Kazennov A Pepsi-SAXS: an adaptive method for rapid and accurate computation of small-angle X-ray scattering profiles. Acta Crystallogr. D 2017, 73.

[R71] (71). Schneidman-Duhovny D; Hammel M; Tainer JA; Sali A Accurate SAXS profile computation and its assessment by contrast variation experiments. Biophys. J 2013, 105, 962–974.

[R72] (72). Székely P; Ginsburg A; Ben-Nun T; Raviv U Solution x-ray scattering form factors of supramolecular self-assembled structures. Langmuir 2010, 26, 13110–13129.

[R73] (73). Ben-Nun T; Ginsburg A; Székely P; Raviv U X+: a comprehensive computationally accelerated structure analysis tool for solution x-ray scattering from supramolecular self-assemblies. J. Appl. Crystallogr 2010, 43, 1522–1531.

[R74] (74). Pauw BR Everything SAXS: small-angle scattering pattern collection and correction. J. Phys. Condens. Matter 2013, 25, 383201.

[R75] (75). MacQueen J Some methods for classification and analysis of multivariate observations. Proc. Fifth Berkeley Symp. on Math. Statist. and Prob 1967; pp 281–297.

[R76] (76). Gelbart WM; Ben-Shaul A; Roux D Micelles, membranes, microemulsions, and monolayers; Springer Science & Business Media, 2012.

[R77] (77). Zandi R; van der Schoot P; Reguera D; Kegel W; Reiss H Classical nucleation theory of virus capsids. Biophys. J 2006, 90, 1939–1948.

[R78] (78). Perlmutter JD; Hagan MF Mechanisms of virus assembly. Annu. Rev. Phys. Chem 2015, 66.

[R79] (79). Safran S Statistical thermodynamics of surfaces, interfaces, and membranes; CRC Press, 2018.
