Abstract
The rate of deposition of models determined by neutron diffraction, or a hybrid approach that combines X-ray and neutron diffraction, has increased in recent years. The benefit of neutron diffraction is that hydrogen atom (H) positions are detectable, allowing for the determination of protonation state and water molecule orientation. This study analyzes all neutron models deposited in the Protein Data Bank to date, focusing on protonation state and properties of H (or deuterium, D) atoms as well as the details of water molecules. In particular, clashes and hydrogen bonds involving H or D atoms are investigated. As water molecules are typically the least reproducible part of a structural model, their positions in neutron models were compared to those in homologous high-resolution X-ray structures. For models determined by joint refinement against X-ray and neutron data, the water structure comparison was also carried out for models re-refined against the X-ray data alone. The homologues generally have fewer conserved water molecules where X-ray only was used and the positions of equivalent waters vary more than in the case of the hybrid X-ray model. As neutron diffraction data are generally less complete than X-ray data, the influence of neutron data completeness on nuclear density maps was also analyzed. We observe and discuss systematic map quality deterioration as a result of data incompleteness.
1. Introduction
X-ray crystallography is the predominant method of determining the three-dimensional structure of macromolecules, accounting for 89% of all models deposited in the PDB (Berman, Henrick, & Nakamura, 2003; Bernstein et al., 1977; wwPDB Consortium, 2019). The technique is based on the interaction between X-rays and the electrons of the atoms in a crystalline sample. Major peaks in Fourier maps equate to atomic positions that can be used to determine the structure. As a hydrogen atom (H) possesses only one valence electron, it does not produce enough signal to be observed in electron density maps, unless the data resolution is very high.
By contrast, neutron diffraction relies on the interaction of neutrons with atomic nuclei. As hydrogen and deuterium atoms (D) have scattering cross sections of similar magnitude to those of heavier atoms (O, N, C), neutron diffraction can be used to locate H and D atoms even at medium data resolution. This opens up the possibility of determining the protonation states of amino acid side chains and the orientation of some water molecules, which is important for understanding the reaction pathways of proteins (Ankner, Heller, Herwig, Meilleur, & Myles, 2013). However, there are practical reasons that impede routine application of the technique; in particular, refinement is challenging due to low data completeness, low signal-to-noise ratio and the increased number of parameters (H atoms can introduce more variable parameters). The data completeness is typically low for several reasons, including the relatively low flux of neutron beams, reduced signal to noise due to incoherent scattering when hydrogen atoms are present in the sample, and the limited data-collection time available on neutron crystallography instruments (due to oversubscribed facilities). As a consequence, the completeness of neutron diffraction data reaches only about 80% on average across all neutron depositions. By contrast, an X-ray dataset is considered to be of good quality if the completeness is at least 95% (Dauter & Dauter, 2017). Neutron data are typically collected in Laue mode yielding harmonic and spatially overlapped reflections, particularly affecting the completeness of low resolution reflections (Cruickshank, Helliwell, & Moffat, 1987). Furthermore, a 95% complete dataset may lack 20% of low-resolution reflections (Dauter, 1999), so the completeness expressed as an overall value is not particularly informative (Rupp, 2018). 
Because all reflections contribute to each point of an electron density map, the map quality decreases if the data are not complete and if reflections are missing (Urzhumtsev, 1991; Urzhumtsev, Afonine, Lunin, Terwilliger, & Adams, 2014; Wlodawer, Minor, Dauter, & Jaskolski, 2008). The deterioration can be mild or severe: random or weak absent reflections have little effect on maps (Evans, 2011) while missing strong reflections and missing wedges (Wennmacher et al., 2019) can distort maps considerably. The completeness also affects coordinate errors as estimated by the Cruickshank DPI (diffraction-component precision index) (Cruickshank, 1999). In this work we quantify the influence of missing reflections on nuclear density maps by comparing model-calculated Fourier syntheses and histograms of density at atomic centers.
The method of joint X-ray/neutron refinement (joint XN refinement) addresses the low data-to-parameter ratio by refining a single model simultaneously against X-ray and neutron data (Adams, Mustyakimov, Afonine, & Langan, 2009; Afonine et al., 2010; Coppens, 1967; Orpen, Pippard, Sheldrick, & Rouse, 1978; Wlodawer, 1980; Wlodawer & Hendrickson, 1982). Ideally, the X-ray and neutron datasets should be collected at the same temperature and from the same or a highly isomorphous crystal, although this cannot always be accomplished.
The number of neutron models deposited in the PDB has nearly doubled in the past 5 years, reaching 161 as of October 2019. Therefore, an assessment of these models is timely as the number of structures derived from neutron diffraction is likely to increase further in the coming years. Special focus is put on protonation states and properties of H (D) atoms and water molecules, because this is the distinctive information that can be derived from neutron diffraction experiments. In particular, we investigate clashes and hydrogen bonds involving H (D) atoms, distinguishing in our analyses between hydrogen atoms that have a degree of freedom (“rotatable” H atoms) and those that are fully determined by their nonhydrogen parent atoms (“fixed” H atoms).
A prerequisite of applying the joint XN refinement method is that the models derived from the X-ray and neutron data should be isomorphous. It has been shown that water molecules are the least reproducible part of structural models (Blakeley, Kalb, Helliwell, & Myles, 2004; Fields et al., 1994; Fujinaga, Delbaere, Brayer, & James, 1985; Ohlendorf, 1994). One fundamental reason why water is difficult to model is that it has a very simple shape (just one peak) that often can be similar in appearance to noise (Weichenberger, Afonine, Kantardjieff, & Rupp, 2015). Ohlendorf (1994) reported that water is the least reliable species when re-refining four models of human interleukin 1β. Fujinaga et al. (1985) found that 15% of water molecules did not reappear if they were deleted and redetermined. Fields et al. (1994) showed that 85% of water molecule positions occurred within 1Å in two independently refined models of plastocyanin, while the root mean square deviation (rmsd) of protein Cα atoms was only 0.08Å. The comparison of a 1.65Å X-ray model with a 2.5Å neutron model of concanavalin A revealed that only 19% of water molecules had matching positions (Blakeley et al., 2004). While these analyses show a trend that about 80% of water positions are conserved, a systematic study has not yet been performed. To address this issue, we compared the water structures in neutron models to (a) the water structure in high-resolution X-ray homologue models and (b) to joint XN models re-refined against X-ray data alone.
2. Materials and methods
2.1. Retrieving models and data from the PDB
Computations were carried out with Phenix tools (Liebschner et al., 2019) and scripts based on the CCTBX library (Grosse-Kunstleve, Sauter, Moriarty, & Adams, 2002). Model and data files were retrieved from the Protein Data Bank (PDB). To identify models determined by neutron diffraction, the PDB file header (“EXPERIMENTAL METHOD” in PDB format) was parsed. Other information retrieved from the file header includes resolution limits, sigma cutoffs, the twin law (if present), crystallographic R-factors and deposition year. Some models had to be curated before they could be used in Phenix tools; for example, atoms were renamed to obey conventions in CCTBX tools. Details of model and data curation are described in Liebschner, Afonine, Moriarty, Langan, and Adams (2018). For some analyses, such as refinement, map calculation and computation of completeness, the high-resolution limit is required. If the resolution limit in the PDB file header was inconsistent with that deduced from the reflections in the data file, the maximum (numerically larger value) of the two resolution limits was used. This prevents underestimating data completeness if the deposited dataset contains more reflections than indicated by the user-specified limit. If the limit value was missing or ambiguous, the high-resolution limit was determined from the data file.
2.2. Elongation to neutron X-H distances
Density peaks for H atoms appear at different distances in electron and nuclear density maps. In an electron density map, the density of the H atom is shifted away from the nucleus along the X-H bond towards atom X. To account for this difference, it is standard practice to apply different ideal X-H bond lengths for X-ray and neutron data, with neutron models using distances that are around 0.1Å longer. However, the X-H distances in the deposited neutron models are not consistent: a nonnegligible number have H atoms at X-ray distances (Gruene, Hahn, Luebben, Meilleur, & Sheldrick, 2014; Liebschner et al., 2018). To compare models, it is important to compute clashes and hydrogen bonds from models with consistent X-H distances. Therefore, the X-H distances were elongated to neutron dictionary target values (Allen, 1986; Allen & Bruno, 2010) in all neutron models using CCTBX functions.
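The elongation step can be illustrated with a minimal sketch in plain Python. It is not the CCTBX routine used in this work: the function name `elongate_xh` is hypothetical, and the bond lengths in the example (0.86Å and 1.02Å for an N-H bond) are illustrative values rather than entries taken from the dictionary cited above. The H atom is moved along the existing X-H direction until the target distance is reached, so the bond direction is preserved.

```python
import math


def elongate_xh(x_xyz, h_xyz, target_length):
    """Return new H coordinates at `target_length` (in Å) from the parent atom X.

    The H atom is shifted along the existing X->H direction only.
    Hypothetical helper for illustration, not a CCTBX function.
    """
    vec = [h - x for x, h in zip(x_xyz, h_xyz)]
    dist = math.sqrt(sum(v * v for v in vec))
    if dist == 0.0:
        raise ValueError("X and H coincide; the bond direction is undefined")
    scale = target_length / dist
    return tuple(x + v * scale for x, v in zip(x_xyz, vec))


# Illustrative values (assumption): an N-H bond placed at an X-ray-like
# distance of 0.86 Å is stretched to a neutron-like distance of 1.02 Å.
n_atom = (0.0, 0.0, 0.0)
h_atom = (0.86, 0.0, 0.0)
new_h = elongate_xh(n_atom, h_atom, 1.02)
```

Because only the bond length changes, any clash or hydrogen bond computed afterwards refers to the same H atom orientation as in the deposited model.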
2.3. Fixed and rotatable H atoms
The position of fixed H atoms can be inferred from the coordinates of neighboring, non-H atoms (see also chapter “Implementation of the riding hydrogen model in CCTBX to support the next generation of X-ray and neutron joint refinement in Phenix” by Liebschner et al.). An example is the amide H atom in the peptide group; it is located in a plane formed by the atoms C, O and N. These fixed H atoms may deviate somewhat from their ideal position, but the geometrical configuration will be preserved: an H atom in a tetrahedral geometry will not switch into planar geometry. Rotatable H atoms have a degree of freedom to rotate, for example the Hγ atom in a serine residue, which can rotate around the axis formed by the Oγ and Cβ atoms (Fig. 1A), or “propeller” groups such as atoms Hγ21, Hγ22 and Hγ23 in an isoleucine residue. While the Hγ atom can rotate 360 degrees, the propeller group superposes after 120 degrees, so its effective degree of freedom is diminished. Water molecules are a special case because the molecule can orient to optimize local interactions, such as hydrogen bonds. Therefore, some analyses focus on the different H atom types: fixed, rotatable and water H atoms.
Fig. 1.

(A) The Hγ atom in Ser has a rotational degree of freedom while the other H atoms have fixed geometry. (B) Percentage of fixed and rotatable H atom labels in neutron models. About two-thirds are fixed and one-third is rotatable. Water molecules are excluded as it depends on individual choice how many H atoms were added.
2.4. Calculating clashscores for subsets of H atoms
The Molprobity clashscore is defined as the number of steric overlaps ≥0.4Å per 1000 atoms (Word et al., 1999). It is typically calculated for all atoms, but it can also be calculated for a subset of atoms. For this study, the clashscore was calculated for a subset of hydrogen atoms (rotatable, fixed and water H atoms):
$$\mathrm{clashscore}_{\mathrm{subset}} = 1000 \times \frac{N_{\mathrm{clashes\ in\ subset}}}{N_{\mathrm{atoms\ in\ subset}}}$$
We note that interactions that qualify as a hydrogen bond (see Section 2.5) are not counted as a clash.
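As a small worked example of the subset clashscore, the sketch below normalizes the number of clashes by the number of atoms in the subset rather than by all atoms in the model. The function name `subset_clashscore` and the counts are hypothetical and chosen only for illustration; detecting the clashes themselves (and excluding hydrogen bonds) is assumed to have been done beforehand.

```python
def subset_clashscore(n_clashes, n_atoms):
    """Number of clashes per 1000 atoms of the chosen subset.

    Hypothetical helper: `n_clashes` counts overlaps >= 0.4 Å involving
    subset atoms, `n_atoms` is the number of atoms in that subset.
    """
    if n_atoms <= 0:
        raise ValueError("the subset must contain at least one atom")
    return 1000.0 * n_clashes / n_atoms


# Made-up counts: 12 clashes involving 400 rotatable H atoms ...
score_rotatable = subset_clashscore(12, 400)
# ... versus the same 12 clashes normalized by all 3000 atoms of a model.
score_all_atoms = subset_clashscore(12, 3000)
```

With these made-up numbers the subset value (30.0) is much larger than the all-atom value (4.0), which is why subset clashscores should not be read on the scale of the usual clashscore.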
2.5. Hydrogen bond analysis
The general chemical definition describes a hydrogen bond X-H…A-Y as a local, noncovalent bond where the X-H group acts as proton donor to the acceptor atom A (Steiner, 2002). A specialized definition in the context of crystal structures involves interaction geometries (distances and angles). Different cut-offs for bond lengths and angles are used in the literature (Fabiola, Bertram, Korostelev, & Chapman, 2002; Koch, Bocola, & Klebe, 2005; Thomas, Benhabiles, Meurisse, Ngwabije, & Brasseur, 2001; Torshin, Weber, & Harrison, 2002); here we interpreted an X-H…A-Y interaction as an H-bond if (a) the H…A distance is between 1.2 and 2.2Å, (b) the X…A distance is between 2.2 and 3.2Å, (c) the X-H…A angle is larger than 110° and (d) the H…A-Y angle is larger than 90°, in line with recommendations given by Steiner (2002). The number of hydrogen bonds and their geometry were determined for subsets of hydrogen atoms (rotatable, fixed and water molecules).
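The four geometric criteria can be written as a short test on atomic coordinates. The sketch below uses only the Python standard library and is not the code used for this study; the function names are hypothetical, and treating the distance ranges as inclusive is an assumption, since the text does not say whether the limits themselves qualify.

```python
import math


def angle_deg(a, b, c):
    """Angle a-b-c in degrees, with atom b at the vertex."""
    v1 = [p - q for p, q in zip(a, b)]
    v2 = [p - q for p, q in zip(c, b)]
    dot = sum(p * q for p, q in zip(v1, v2))
    cos_angle = dot / (math.hypot(*v1) * math.hypot(*v2))
    # Clamp to [-1, 1] to guard against rounding errors before acos.
    return math.degrees(math.acos(max(-1.0, min(1.0, cos_angle))))


def is_hbond(x, h, a, y):
    """Apply criteria (a)-(d) to an X-H...A-Y arrangement (coordinates in Å)."""
    d_ha = math.dist(h, a)
    d_xa = math.dist(x, a)
    return (
        1.2 <= d_ha <= 2.2                # (a) H...A distance (inclusive: assumption)
        and 2.2 <= d_xa <= 3.2            # (b) X...A distance (inclusive: assumption)
        and angle_deg(x, h, a) > 110.0    # (c) X-H...A angle
        and angle_deg(h, a, y) > 90.0     # (d) H...A-Y angle
    )


# Made-up linear arrangement: X-H = 1.0 Å, H...A = 1.9 Å, X...A = 2.9 Å.
donor, hydrogen = (0.0, 0.0, 0.0), (1.0, 0.0, 0.0)
acceptor, acceptor_parent = (2.9, 0.0, 0.0), (3.5, 1.0, 0.0)
accepted = is_hbond(donor, hydrogen, acceptor, acceptor_parent)
# Same geometry with the acceptor moved away (H...A = 2.5 Å) fails (a).
rejected = is_hbond(donor, hydrogen, (3.5, 0.0, 0.0), (4.1, 1.0, 0.0))
```

Keeping the cut-offs in one function makes it straightforward to rerun the analysis with one of the alternative definitions from the literature.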
2.6. Obtaining homologous models
Similar protein sequences for each neutron model chain were found with the program BLAST (blastp, Altschul, Gish, Miller, Myers, & Lipman, 1990; Madden, 2003). The model with the highest sequence identity and highest resolution was assigned as homologous structure. The following cut-offs were applied: minimum sequence identity of 90% and minimum data resolution of 2Å.
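The selection rule can be sketched as a filter followed by a ranking. The snippet below is an illustration under stated assumptions, not the script used here: the hit records, their field names and the identifiers `HYP1` to `HYP4` are hypothetical, and ranking first by sequence identity and then by resolution is one possible reading of "highest sequence identity and highest resolution."

```python
def pick_homologue(hits, min_identity=90.0, max_resolution=2.0):
    """Select one homologue from a list of hit dictionaries.

    Each hit is assumed to carry 'identity' (percent) and 'resolution' (Å).
    Hits failing the cut-offs are discarded; the rest are ranked by
    identity (higher first), then resolution (numerically smaller first).
    """
    candidates = [
        hit for hit in hits
        if hit["identity"] >= min_identity and hit["resolution"] <= max_resolution
    ]
    if not candidates:
        return None
    return min(candidates, key=lambda hit: (-hit["identity"], hit["resolution"]))


# Hypothetical hits for one neutron model chain.
example_hits = [
    {"pdb_id": "HYP1", "identity": 100.0, "resolution": 2.4},  # resolution too low
    {"pdb_id": "HYP2", "identity": 99.0, "resolution": 1.1},
    {"pdb_id": "HYP3", "identity": 99.0, "resolution": 0.9},
    {"pdb_id": "HYP4", "identity": 85.0, "resolution": 0.8},   # identity too low
]
best = pick_homologue(example_hits)
```

In this toy list the two hits that pass both cut-offs have equal identity, so the one with the better resolution (`HYP3`) is selected.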
2.7. Refinement of joint XN models against X-ray data alone
To investigate if the positions of water molecules change, joint XN models were re-refined against the X-ray data alone with phenix.refine (Afonine et al., 2012). The following parameters are different from default options: number of macrocycles (set to 10), optimization of the stereochemistry/X-ray weight and the weight of atomic displacement parameters (ADP). Water molecules were automatically updated during refinement; the defaults were changed so that water molecules with ADPs >60Å² or with occupancy less than 0.5 were removed, ensuring that less reliable water molecules are excluded from the model.
2.8. Comparing water structures
The water clustering method as described in Moriarty et al. (2018) was used to determine if a water molecule in the neutron model matches a water in another structure (X-ray homology model or newly refined joint XN model against X-ray data). Prior to this analysis, the models were superposed using the CCTBX script phenix.superpose_pdbs. If no water molecule was found within 1Å, the water was considered as nonpreserved. The comparison was restricted to proteins with one chain, as the current version of the clustering algorithm only compares water molecules within a particular chain.
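The 1Å matching criterion can be illustrated with a simple nearest-neighbor search. This sketch deliberately swaps in a much simpler technique than the clustering method of Moriarty et al. (2018) used in this work; it assumes that the two models are already superposed, that water positions are given as oxygen coordinates, and that "within 1Å" includes the limit itself. The function name `match_waters` is hypothetical.

```python
import math


def match_waters(reference_waters, other_waters, cutoff=1.0):
    """Split reference waters into matched and unmatched ones.

    A reference water counts as matched if any water of the other model
    lies within `cutoff` Å (inclusive: assumption). Returns a list of
    (index, distance) pairs for matches and a list of unmatched indices.
    """
    matched, unmatched = [], []
    for index, water in enumerate(reference_waters):
        nearest = min((math.dist(water, other) for other in other_waters),
                      default=None)
        if nearest is not None and nearest <= cutoff:
            matched.append((index, nearest))
        else:
            unmatched.append(index)
    return matched, unmatched


# Made-up oxygen coordinates (Å) in two superposed models.
neutron_waters = [(0.0, 0.0, 0.0), (5.0, 5.0, 5.0)]
xray_waters = [(0.3, 0.0, 0.0), (9.0, 9.0, 9.0)]
matched, unmatched = match_waters(neutron_waters, xray_waters)
percent_conserved = 100.0 * len(matched) / len(neutron_waters)
```

Unlike a clustering approach, this simple search allows several reference waters to match the same water in the other model, so it should be taken only as an illustration of the distance criterion.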
2.9. Calculated maps for incomplete and complete neutron data
To investigate the influence of data completeness on nuclear density maps, we calculated Fourier syntheses that we scaled by the standard deviation of the map values (for details, see Section 3.3.1 in Urzhumtsev et al., 2014). For each neutron dataset, we computed Fourier syntheses from model-calculated data using a complete dataset up to highest resolution cutoff or only the reflections present in the experimental dataset. The peak correlation coefficient CC<peaks> at the 90 percentile rank between the two maps for each neutron model was calculated as described in Urzhumtsev et al. (2014). Furthermore, histograms of density values interpolated at atomic centers as well as data completeness profiles were calculated using CCTBX functions. Completeness was calculated for the entire resolution range (minimum to maximum resolution) and for resolution bins. Two different binning schemes were used: one containing an approximately equal number of reflections (cubic binning, number of reflections: 500), the other applied a logarithmic scale. The logarithmic scaling allows more detailed binning at low resolution without disproportionally increasing the number of bins at high resolution (Afonine, Grosse-Kunstleve, Adams, & Urzhumtsev, 2013).
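To show why completeness in resolution bins is more informative than a single overall value, the following sketch computes both from a list of reflections. It is a simplified stand-in for the CCTBX functions used here: the input format (one `(d_spacing, observed)` pair per reflection of the complete set) and the function names are assumptions, and only equal-count binning is shown, not the logarithmic scheme.

```python
def overall_completeness(reflections):
    """Fraction of reflections of the complete set that were observed."""
    reflections = list(reflections)
    if not reflections:
        raise ValueError("no reflections given")
    return sum(1 for _, observed in reflections if observed) / len(reflections)


def binned_completeness(reflections, n_per_bin=500):
    """Completeness in bins holding about `n_per_bin` reflections each.

    `reflections` holds (d_spacing_in_angstrom, observed_flag) pairs for
    every reflection of the complete set. Bins run from low to high
    resolution; the last bin may contain fewer reflections.
    Returns a list of (d_max, d_min, completeness) tuples.
    """
    ordered = sorted(reflections, key=lambda refl: refl[0], reverse=True)
    bins = []
    for start in range(0, len(ordered), n_per_bin):
        chunk = ordered[start:start + n_per_bin]
        n_observed = sum(1 for _, observed in chunk if observed)
        bins.append((chunk[0][0], chunk[-1][0], n_observed / len(chunk)))
    return bins


# Toy dataset: the lowest-resolution reflection is missing.
toy = [(10.0, False), (5.0, True), (2.5, True), (2.0, True)]
toy_overall = overall_completeness(toy)
toy_bins = binned_completeness(toy, n_per_bin=2)
```

In the toy dataset the overall completeness is 75%, while the low-resolution bin is only 50% complete; the binned profile exposes a deficiency that the overall value hides.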
2.10. Percentage of observed D atoms
To find out how many D atoms in deposited neutron models can be reliably located, we evaluated how many D atoms have significant nuclear density peaks. The script used for this task loops over all D atoms in a neutron model, omitting one D atom at a time and calculating an mFobs-DFmodel map. If a density peak above 3σ exists in the near vicinity of the D atom, the D atom is considered to be “observed.” This test was restricted to D atoms with occupancy >0.95, which excludes all D atoms forming multiple conformations or that are partly exchanged. If the D atom is in an alternative conformation, applying a criterion of 3σ for the observation of a peak is most likely too strict. If the D atom is only partly exchanged, it is superposed to an H atom, which has a negative nuclear density peak. The resulting superposed density of D and H is therefore diminished, so that the 3σ criterion is not appropriate. This analysis was performed using a Phenix tool that is currently under development (Xu et al., unpublished) and will be described elsewhere.
3. Results and discussion
3.1. Overview of deposited neutron models and data
As of October 2019, the PDB contains 161 models originating from neutron diffraction experiments. Fig. 2 shows the cumulative number of neutron models. The oldest model in the PDB is from 1984 (5PTi, Wlodawer, Walter, Huber, & Sjölin, 1984; for the PDB code naming convention used in this article, please see Moriarty, 2015), but some structural reports were performed before the PDB was established (Schoenborn, 1969) or the models were not deposited (Teeter & Kossiakoff, 1984). In the early 1990s, no new models were deposited because neutron facilities such as the Institut Laue-Langevin in Grenoble and the High Flux Beam Reactor at Brookhaven were unavailable (Chen & Unkefer, 2017). In the 2000s, the rate of model deposition increased as a result of new advanced neutron sources (SNS in USA, FRM-2 in Germany, J-PARC in Japan) and developments in methods and technologies, such as the neutron image-plate detector (Niimura et al., 1994) and time-of-flight data collection techniques (Langan, Greene, & Schoenborn, 2004).
Fig. 2.

Cumulative number of models determined by neutron diffraction in the PDB. Orange: models from refinement with neutron data alone. Blue: models from joint XN refinement. The number of cumulative models is written above the bar if it changed compared to the previous year.
While the total number of structures has grown, the number of depositions per year is very low compared to X-ray crystallography (around 9000 in 2018). Among the 161 deposited models, 97 were determined from joint XN refinement (blue in Fig. 2) and 64 were obtained from neutron data alone (orange in Fig. 2). Most of the recent models were refined using the joint XN refinement method.
Some properties of deposited models and data are displayed in Fig. 3. Neutron diffraction requires relatively large (typically >0.1mm³), well diffracting crystals. This is typically the case for relatively small proteins. As shown in Fig. 3A, most of the neutron models have fewer than 500 protein residues grouped in one or two chains (Fig. 3B). A large majority of structures (156) contains protein while only a few consist of nucleic acids (6). About half of neutron models (89) contain at least one ligand. Neutron diffraction is indeed attractive for the investigation of binding modes due to its ability to locate H atoms. The resolution of neutron diffraction data ranges from 1.05Å to 2.75Å, with an average of 1.97Å. This reflects the fact that nuclear density maps at resolutions lower than 2.5Å are often difficult to interpret (see for example water molecule density in Fig. 2B of chapter “Interactive model building in neutron macromolecular crystallography” by Logan). For joint XN experiments, the X-ray data resolution is generally better than that of the neutron data; the average is 1.65Å, extending from 0.93Å to 2.30Å. The data completeness differs strikingly between the experiments. The average completeness of neutron datasets is 82.3% compared to 94.1% for X-ray data. The crystallographic R-factor (Rwork) is systematically smaller when calculated against X-ray data than against neutron data; the difference occurs because the data sets contain different sets of reflections and therefore cannot be compared meaningfully.
Fig. 3.

Properties of neutron models and X-ray/neutron data.
3.2. Clashscore
The clashscore represents sterically unlikely atom-atom overlaps (distance cut-off for overlaps: 0.4Å) (Williams et al., 2018). It can reflect tight packing, but it is usually indicative of local fitting problems, such as flipped sidechains, misplaced residues, or misidentified groups. About two-thirds of H/D atoms are fixed and one-third are rotatable (Fig. 1B). We note that this number does not necessarily represent the number of sites, as there can be both H and D atoms at some sites when the crystal was subject to H/D exchange. We wanted to test if one H atom type (rotatable or fixed) has a higher propensity to form clashes than the other. As rotatable H atoms have a degree of freedom, it may be anticipated that they have fewer clashes.
Fig. 4A shows the clashscore of rotatable H atoms in each neutron model (including H from water) versus the clashscore of fixed H. The value of the clashscore is typically around four for mid-resolution structures (Richardson et al., 2018). The values for H atom subsets are higher, because the number of clashes in the subset is divided by the number of the atoms in the subset instead of the total number of atoms. The numerical value, therefore, should not be interpreted like the usual clashscore. The distribution of points around the bisector shows that the majority of models have a larger clashscore for rotatable H and water (117 points are above the bisector, 41 are below). This is unexpected as rotatable hydrogen atoms and water molecules can theoretically adjust their position to fit into local geometry. Four models have clashscores larger than 100: in 1iO5, all 251 water molecules have two deuterium atoms, which form many clashes. PDB entries 1LZN and 3iNS have some modeling errors, such as water oxygen atoms in close proximity to protein residues and erroneous alternative conformations. Structure 2iNQ has several clashing H atoms in propeller groups. An example of a clashing rotatable D atom in model 4C3Q is shown in Fig. 4B. The Dη atom of Tyr60 clashes with the Dα atom of Glu34. The distance between the two deuterium atoms is 1.51Å, which corresponds to an overlap of 0.66Å. There are two putative H-bond acceptors in the vicinity of the Dα atom: Oε1 of Glu37 (2.4Å distance) and the O atom of water DOD1125 (3.2Å distance). Also, the Dα atom of Glu34 is not covered by nuclear density. Altogether, this suggests that the orientation of the Dα atom could be improved to better fit the chemical environment. It is conceivable that some of the clashes involving rotatable H atoms in other neutron models can also be corrected by adjusting the H atom position.
Fig. 4.

(A) Comparing clashscores for fixed H and rotatable H (water included) (difference > zero: 117, difference < zero: 41, difference equal to zero: 3). (B) Example of a clash (dashed line) in model 4C3Q. The 2mFobs-DFmodel nuclear density is represented at 1σ in blue, mFobs-DFmodel nuclear density at ± 3σ is represented in green (positive) and red (negative).
3.3. Hydrogen bonds
The properties of H-bonds for different H atom types are summarized in Fig. 5. Less than 25% of fixed H atoms are involved in H-bonds in any neutron model. This means that at least 75% of fixed H atoms do not form H-bonds. Fixed H atoms include those attached to carbon (tetrahedral Hα, H atoms in aromatic rings, CH2 groups, etc.) that have weaker polarity and therefore lower propensity to form an H-bond. Similarly, less than 20% of rotatable atoms are involved in H-bonds in most neutron models (Fig. 5A); however, there are models with a higher percentage, reaching up to 65%. This may reflect the possibility of rotatable H atoms using their degrees of freedom to optimize local interactions. It is not clear why the majority of models have fewer rotatable H atoms forming an H-bond, but as there are no prior reports available, it is uncertain if this represents genuine differences between structures or modeling errors of rotatable H atoms.
Fig. 5.

Properties of hydrogen bonds for different H atom types. (A) Number of H-bonds. (B) Distance between H(D) and acceptor atom. (C) X-H…A angle.
Water molecules are a special case because they may act as both H-bond donor and acceptor, with up to four possible interactions per molecule. For water molecules modeled as oxygen or as oxygen with a single D atom, the number of possible interactions is two and three, respectively, as we did not take into account putative H-bonds. The shape of the distribution is very different compared to that of fixed or rotatable H atoms; it spans the entire range, from 0% to 5% of water molecules forming an H-bond up to 100–105% (as water can have several interactions, the percentage may be larger than 100%). There is no single peak, but in most models 25–80% of water molecules are involved in an H-bond. While the number of interactions is generally larger for water molecules than for fixed or rotatable H atoms, it still means that many models have water molecules without any H-bond interaction at all, as the percentage is mostly less than 100%, at which, on average, each water molecule forms one H-bond. This can be partly due to the fact that some water molecules are modeled as an oxygen atom without any D atoms, decreasing the number of possible interactions that can be determined with our algorithm. However, it also indicates that a nonnegligible number of water molecules are without any H-bond, making it questionable if these waters should be placed.
The average donor-acceptor distance for each model is shown in Fig. 5B. The shorter the H…A distance, the stronger is the H-bond. Rotatable H atoms tend to form shorter H…A bonds, with an average of 1.8(1)Å, while that for fixed H atoms and water molecules is about 0.1Å longer (2.01(6)Å and 1.95(5)Å for fixed H and water, respectively). However, at lower resolutions, this is within the experimental coordinate error and cannot be deemed statistically significant. The bond lengths are within typical values for H-bonds (Steiner, 2002). It can be noted that bond lengths <1.6Å fall into the regime of low barrier hydrogen bonds between two oxygen atoms, where the energy barrier drops so that the H atom can move freely between the two O atoms (Cleland, 2000; Cleland, Frey, & Gerlt, 1998). For rotatable H, only one model has an average H…A distance <1.5Å (6FJJ) because it has exactly one very short H-bond involving a rotatable H atom (Hγ of Ser50 with the O atom of DOD405). The distributions and the average values for X-H…A angles (Fig. 5C) are similar for all three H atom types (153(7)°, 153(10)° and 149(7)° for fixed, rotatable H and water, respectively). These values agree with the expectation that H-bonds approach linear angles.
3.4. Homologue models
Neutron diffraction is not typically used to determine a structure de novo; in most cases, the model has been previously determined in an X-ray diffraction experiment. We parsed the PDB to find homologue structures for all neutron models. The BLAST search yielded homologues for 156 of all chains in the 161 neutron models. A complete list of homologues for each neutron model chain is available for download at https://doi.org/10.7941/D18907. Only five neutron models lack a high sequence identity X-ray equivalent: 6D54, 3QBA, 1V9G, 1WQZ, 6D4L, all of which are RNA/DNA structures. Fig. 6A shows the data resolution at which the homologue structure was determined against the sequence identity with the neutron model. Most homologues have more than 98% sequence identity and have resolutions better than 1.2Å. We also tested how well the homologues superposed to the neutron models (Fig. 6B). The rmsd between the homologue and neutron models is less than 1Å in most cases; in particular, this appears to be independent of neutron data resolution, meaning that the protein structures are reproducible. However, the rmsd is calculated between protein residues only, ignoring water molecules. Section 3.6 focuses on the water structure comparison in neutron and homology models.
Fig. 6.

(A) For each neutron model, the homologue X-ray model has high sequence identity and the concomitant X-ray data have high resolution. (B) rmsd between corresponding chains in neutron and homologue models. The following models have a rmsd larger than 1Å (shown as dashed horizontal line): 5XPE, 5VNQ, 5PTi, 4QXK.
3.5. Refinement of joint XN models against X-ray data alone
Of the 97 joint XN models, 91 have X-ray data available and could be re-refined. The Rwork and Rfree factors are shown before and after re-refinement in Fig. 7. For the large majority of models, both R-factors decrease after refinement against X-ray data alone. With one exception, Rwork and Rfree are less than 25% for the re-refined models. This is not unexpected as the joint XN model is refined to simultaneously fit both the neutron and X-ray data, which can lead to a model that does not quite reach the optimal fit to either of the two datasets. However, the coordinates did not change significantly after re-refinement, as evidenced by an average rmsd of 0.07Å for main chain atoms between deposited and re-refined models. The largest rmsd is 0.16Å for model 3QBA. In particular, we were interested in how the water network changes after re-refinement, which is described in Section 3.6.
Fig. 7.

Rwork and Rfree before and after re-refinement of joint XN models against X-ray data alone. Most points are below the bisector (dashed line), meaning that the R-factors improved after re-refinement of the joint XN model against the X-ray data.
3.6. Water structures
3.6.1. Water structure in homologues
Fig. 8 shows the results of the water cluster analysis that was performed for 121 of the 161 neutron models. The 40 nonevaluated models either have more than one protein chain (our clustering analysis tool is limited to single-chain models), have no water molecules modeled, or have an rmsd to the homologue larger than 1Å. Of the 121 compared water structures, 58 have at least 80% conserved water molecule positions, i.e., located closer than 1Å to each other, in the superposed structures. However, more than half of the tested models have a lower percentage of matching waters; moreover, 20 models have more than 50% of waters without an equivalent in X-ray homologues. The model with the fewest matching water molecules is 1iO5 (Niimura et al., 1997). It contains 251 DOD molecules, while the 0.65Å resolution X-ray homologue (2VB1) has only 165. We note that 1iO5 has space group P 43 21 2 while 2VB1 has P1. The packing is likely to affect the arrangement of water molecules. As no diffraction data are available for 1iO5, it is not possible to calculate maps to verify water positions in the neutron model. For all models, the mean distance between equivalent water molecule positions is 0.3–0.7Å (Fig. 8B). Therefore, if the water molecules are conserved, they are located close to each other in the two structures.
Fig. 8.

(A) Comparing the water structure in neutron models and homologues. Water positions in neutron models are reproducible, but the spread is quite large. The water positions in the homologue are less conserved, most likely due to the fact that the high-resolution homologue models contain more water molecules than the neutron counterpart. (B) Histogram of the mean distance between matching waters. Most matching waters are 0.3–0.6Å apart. In both figures, the Y-axis represents counts.
There may be several possible reasons why some water molecules do not have equivalents: if sidechain rotamers are different, water molecules can occupy the space in one structure, while the same space in the other structure is occluded. Similarly, ligands that are only present in one of the structures can occupy available space for water molecules. Furthermore, even if the overall rmsd of the models is <1Å, it may be that the “core” of the protein superposes well, but the outer regions, in particular surface sidechains, may deviate, forcing the water molecules to shift accordingly. Then, even if the protein superposes well everywhere, water molecules in the core of the protein might be more conserved than those at the surface, as they have fewer degrees of freedom being surrounded by protein residues. Furthermore, modeling of water molecules is quite subjective. Although guidelines exist (Levitt & Park, 1993), there are no community-wide accepted standards to validate water molecules. It is customary to reject water molecules that have low occupancy, are located too far from the protein, have large ADPs or weak density peaks but the cut-off values for these criteria vary from person to person and are typically adapted to data resolution: for example, a water with 0.5 occupancy can be acceptable at 1Å while it may reflect over-optimistic modeling at 2.5Å. This is also obvious from the proportion of unmatched waters in the homologue models (Fig. 8A, red bars); because they were mostly determined at high X-ray resolution where typically more water density peaks are visible, they have more lone water molecules than their neutron counterpart. Finally, some water molecules might be without map peaks due to data quality or they could be wrongly placed in noise peaks. A combination of all above reasons may explain why the proportions of conserved waters in neutron models vary so much.
3.6.2. Water structure in re-refined joint XN models
Of the 91 re-refined joint XN models, the 73 that contain a single chain were used to compare water structures. The results of the comparison are shown in Fig. 9. While the majority of models (43) have more than 80% reproducible water molecules, a nonnegligible number of models (30) have a lower proportion. This means that some joint XN models contain water molecules for which there is no or not enough signal in the X-ray data. If there are equivalent water molecules in both models, the distance between them is between 0.1 and 0.3Å, which is smaller than for the homologues. The positions of matching waters are therefore well preserved. An example of matching and nonmatching water molecules in model 5EBJ (Langan et al., 2016) is shown in Fig. 10. DOD333 is present in the deposited and re-refined model and has clear density in both the electron and nuclear density maps. DOD360 has nuclear density but no electron density peak. The water molecule that was added by the Phenix water picking procedure during refinement against X-ray data has clear electron density but no nuclear density peak. DOD373 has no peak at the given contour level in either map. Therefore, when two datasets are used to refine a single model, water molecules must be modeled with particular care.
Fig. 9.

(A) Comparing the water structure in deposited joint XN models and models re-refined against X-ray data alone. The majority of water positions are reproducible, but a significant number of models have a large proportion of unmatched waters. (B) Histogram of the mean distance between matching waters. Most matching waters are 0.1–0.3Å apart. In both figures, the Y-axis represents counts.
Fig. 10.

Example of matching and nonmatching water molecules in 5EBJ. The deposited joint XN model (orange) and re-refined model (blue) are superposed. The 2mFobs-DFmodel X-ray map calculated with the re-refined model (blue, 1.6σ) and 2mFobs-DFmodel nuclear density map obtained from the deposited model (orange, 1.6σ) are displayed along with the mFobs-DFmodel nuclear density map represented in green (positive, +3σ) and red (negative, −3σ).
The analysis showed that the water structure in neutron models is more conserved for joint models re-refined against X-ray data alone than for homologue structures. The purpose of the joint refinement procedure is to produce a model that fits both datasets, so it is expected that the water positions should agree to a certain extent. Additionally, the crystals used for the neutron and X-ray experiments are usually produced with the same protocol, while the crystals yielding the homologue models were most likely prepared under slightly different conditions (different reagents, procedure and handling). It is therefore possible that the reproducibility of water positions also depends on the crystallization conditions, although it might also be due to the modeling or refinement protocol.
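The kind of water comparison discussed in this section can be illustrated with a minimal sketch. It assumes that the two models are already superposed and that the water oxygen coordinates are available as arrays; the function name `match_waters` and the 1.0Å matching cutoff are hypothetical choices for illustration and are not necessarily those used in this study. Each water in the first model is simply paired with its nearest neighbor in the second model, so the pairing is not guaranteed to be one-to-one.

```python
import numpy as np


def match_waters(ref_xyz, other_xyz, cutoff=1.0):
    """Pair each reference water with its nearest water in another model.

    Both coordinate arrays (N x 3 and M x 3, in Angstrom) are assumed to be
    in the same frame, i.e. the models have already been superposed.
    The cutoff is an illustrative value, not a community standard.
    Returns (fraction of reference waters matched, mean matched distance).
    """
    ref_xyz = np.asarray(ref_xyz, dtype=float)
    other_xyz = np.asarray(other_xyz, dtype=float)
    # All pairwise distances between the two sets of water oxygens.
    dist = np.linalg.norm(ref_xyz[:, None, :] - other_xyz[None, :, :], axis=2)
    nearest = dist.min(axis=1)
    matched = nearest <= cutoff
    fraction = float(matched.mean())
    mean_dist = float(nearest[matched].mean()) if matched.any() else float("nan")
    return fraction, mean_dist


# Made-up coordinates: three waters have a close partner, one does not.
deposited = [[0.0, 0.0, 0.0], [5.0, 0.0, 0.0], [0.0, 5.0, 0.0], [9.0, 9.0, 9.0]]
rerefined = [[0.2, 0.0, 0.0], [5.0, 0.1, 0.0], [0.0, 5.0, 0.3]]
fraction, mean_dist = match_waters(deposited, rerefined)
print(f"{100 * fraction:.0f}% matched, mean distance {mean_dist:.2f} A")
```

For these made-up coordinates, three of the four waters are matched and their mean separation is 0.2Å. A stricter comparison would enforce one-to-one pairing and account for symmetry-related copies, which this sketch ignores.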
3.7. Density histograms
Unless completely deuterated, neutron models contain both hydrogen and deuterium atoms. Hydrogen has a negative scattering length while that of other elements typically found in macromolecules (C, N, O, S) and deuterium is positive. As a consequence, nuclear density maps can exhibit negative peaks near atomic centers, or the density can cancel out if H atom peaks overlap with positive peaks, such as when H is partially replaced by D or in groups of atoms with both positive and negative scattering lengths (for example CH2 groups). We calculated histograms of model-calculated density map values interpolated at atomic centers for each neutron model that has deposited and usable diffraction data. We first gauged the impact of some model and data properties (H/D content, resolution, B-factors) for one particular example and then we investigated the impact of data completeness for all models.
Fig. 11 shows model-based Fourier map histograms at atomic centers for model 6BCC (Kovalevsky et al., 2018), which was chosen arbitrarily as an example. Fig. 11A shows the density histogram for different scenarios of H/D content; the histogram has a single peak for positive density points if the model contains only deuterium (blue). If the model contains only hydrogen atoms, there are two peaks (orange): one peak for the negative H atoms, and a broader peak in the positive region for all other protein atoms. The number of points with density equal to zero is about the same as the number of points at the positive peak. This means that the map has many regions of zero density, which can be challenging for model building and real-space refinement algorithms. If exchangeable sites are modeled as a superposition of H and D (with an occupancy ratio of 0.5/0.5, green), then there are two almost equally high peaks, located at similar positions to those observed when only D or only H is present. As the large majority of samples used for neutron diffraction experiments are deuterated (either completely or partially), the case of a sample with only hydrogen atoms is rare. Accordingly, there is currently only one model with only H atoms deposited in the PDB (Chatake & Fujiwara, 2016). For completely deuterated models, density values at all atom centers are positive. Therefore, it is worth considering the case of models with partially exchanged H atoms in more detail because the histogram has an additional peak for negative densities. Fig. 11B shows the histogram for a partially exchanged model with different occupancy ratios at the exchanged sites. The shape and position of the negative peak are not affected, while the location of the positive peak shifts closer to zero if the H atom occupancy increases.
We note that the number of points equal to zero does not increase if the H/D ratio is such that the sum of scattering lengths is zero (H/D occupancy ratio of 0.65/0.35), possibly because the density of the covalently bound heavy atom contributes at the H/D position. Fig. 11C shows the impact of resolution. The lower the resolution, the closer the peaks move together, and the narrower they become. Furthermore, the number of density points close to zero increases. The effect of isotropic B-factors is illustrated in Fig. 11D. With increasing B-factor, the histograms get narrower, but the peak position remains similar. These considerations illustrate that data resolution and H atom content are the main factors responsible for shifting the density histogram peaks towards zero.
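The H/D occupancy ratio at which the scattering of a mixed site cancels follows directly from the coherent neutron scattering lengths. The short calculation below is a sketch that uses the commonly tabulated values of about −3.739 fm for H and 6.671 fm for D; the function names are introduced here for illustration only.

```python
# Coherent neutron scattering lengths in femtometres (tabulated values).
B_H = -3.739  # 1H
B_D = 6.671   # 2H (D)


def mean_scattering_length(occ_h):
    """Occupancy-weighted scattering length of a site shared by H and D."""
    return occ_h * B_H + (1.0 - occ_h) * B_D


def null_h_occupancy():
    """H occupancy at which the H and D contributions cancel exactly.

    Solves occ_h * B_H + (1 - occ_h) * B_D = 0 for occ_h.
    """
    return B_D / (B_D - B_H)


occ_h = null_h_occupancy()
print(f"H/D ratio for zero net scattering: {occ_h:.2f}/{1.0 - occ_h:.2f}")
print(f"Net scattering length at 0.5/0.5: {mean_scattering_length(0.5):.3f} fm")
```

The result is an H occupancy of about 0.64, consistent with the 0.65/0.35 ratio quoted above, whereas a 0.5/0.5 mixture still scatters with a net positive length of about 1.5 fm.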
Fig. 11.

Histogram of density values at atomic centers (the y-axes represent the probability density). The histograms are represented as smooth curves using a kernel density estimation, as implemented in the plotting package Seaborn (www.seaborn.pydata.org). (A) Hydrogenation state. (B) H/D occupancy ratios if H is partially exchanged with D. (C) Data resolution. (D) B-factors.
3.8. Effect of incomplete neutron data on maps
Fig. 12 shows the peak correlation coefficient between complete and incomplete model-based maps as a function of data completeness. The data points are colored according to the hydrogenation state, which governs the shape of the density histogram (as discussed in Section 3.7). The distribution of the points suggests that the lower the completeness, the lower the correlation between the maps. Examples for different hydrogenation states are discussed below.
Fig. 12.

Correlation coefficient between complete and incomplete calculated Fourier syntheses for all neutron models.
One model has a CC value less than 0.5: 1WQZ (CCpeaks = 0.39) (Arai, 2005) contains both H and D atoms with a majority of H atoms. The calculated density for two thymine residues is shown in Fig. 13A and B. The density for the complete dataset covers most atoms, except CH2 and CH3 groups. For the incomplete dataset, the density covers fewer atoms; for example, the sugar group of DT4 has very weak density. In other places, the density is smeared, such as between the two thymine bases. The histogram of density at atomic centers (Fig. 13C) has a drastically different shape for the incomplete dataset. Instead of two peaks, there is one peak with considerably more points of zero density. The neutron data completeness is 61% overall and also across resolution ranges (Fig. 13D). As the data resolution is rather low (2.5Å) and the space group symmetry is relatively high (trigonal, P3₂21), there are only a few resolution bins. This example shows that if data completeness is low across the entire resolution range, nuclear density maps can be severely distorted.
Fig. 13.

Effect of data completeness, example of model 1WQZ. (A) Model-calculated map from complete data, contoured at 2σ. (B) Model-calculated map using the completeness of the deposited neutron data, contoured at 1.68σ. (C) Histogram of density at atomic centers. (D) Completeness vs. resolution for deposited neutron data. Contouring levels of maps were chosen with phenix.map_comparison.
The nuclear density map for model 6E21 (Gerlits et al., 2019), which contains both H and D atoms with a majority of D, is shown in Fig. 14. In the complete map, the nuclear density covers most atoms, with the exception of most fixed H (D) atoms. At a comparable σ level, the density in the incomplete map covers far fewer atoms. We note that in both maps, the Oη group of Thr183 does not have a density peak, which impedes locating both the O and the H (D) atoms, which are both rotatable. For the incomplete dataset, the density histogram is shifted to the left, as a result of fewer terms contributing to the Fourier transform. Consequently, more density points have a value of zero. The completeness is around 80% from the lowest limit to about 4Å, then decreases slowly down to about 40%, leading to an overall completeness of 61%. As in the previous example (1WQZ), the completeness is therefore low over a wide resolution range, leading to a distorted nuclear density map.
Fig. 14.

Effect of data completeness, example of model 6E21. (A) Model-calculated map from complete data, contoured at 2σ. (B) Model-calculated map using the completeness of the deposited neutron data, contoured at 1.8σ. (C) Histogram of density at atomic centers. (D) Completeness vs. resolution for deposited neutron data. Contouring levels of maps were chosen with phenix.map_comparison.
Fig. 15 shows the nuclear density map for model 2R24 (Blakeley et al., 2008), which contains a majority of D atoms. The nuclear density calculated from the complete data covers almost all atoms at the 2σ contour. The density in the incomplete map is less clear, as it has many gaps between atomic centers. In particular, the DH atom of Tyr107, which is rotatable, does not have a density peak. We note further that the atom does not form any interaction with neighboring atoms, although the oxygen atom of DOD376 could be an H-bond acceptor. The density histogram of the incomplete dataset is shifted to the left, as in the previous example. The overall completeness of the incomplete set is about 73%, dropping slowly from 90% at 4Å and lower to 50% at the high-resolution limit of 2.19Å.
Fig. 15.

Effect of data completeness, example of model 2R24. (A) Model-calculated map from complete data, contoured at 2σ. (B) Model-calculated map using the completeness of the deposited neutron data, contoured at 1.93σ. (C) Histogram of density at atomic centers. (D) Completeness vs. resolution for deposited neutron data. Contouring levels of maps were chosen with phenix.map_comparison.
The three examples show that nuclear density maps are typically distorted when the diffraction data are not complete. In particular, some H (D) atom density disappears or becomes weaker than that of neighboring atoms. This is the case even for model-calculated maps, for which the structure factors are not subject to experimental noise or incorrectly measured intensities. It is very likely that nuclear maps calculated from measured structure factor amplitudes are even more distorted. Weak density makes it difficult to locate the H atoms; zero density makes it impossible to determine their positions accurately from the nuclear density itself. This suggests that nuclear density maps from incomplete data cannot be used to reliably locate all H (D) atom positions based on the experimental data alone.
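The effect of missing reflections on a Fourier synthesis can be reproduced qualitatively with a one-dimensional toy model. The sketch below is an illustration only and not the procedure used in this study: the "atoms" are point scatterers at random fractional coordinates with H-like or D-like scattering lengths, a fixed fraction of the Fourier terms is removed at random, and the two syntheses are compared with a plain Pearson correlation coefficient rather than the peak correlation coefficient reported in Fig. 12. All names and parameter values are hypothetical.

```python
import numpy as np


def toy_map(x_frac, b, keep, n_grid=256):
    """1D Fourier synthesis from point scatterers, using only kept terms.

    x_frac: fractional coordinates; b: scattering lengths;
    keep: boolean mask over the indices h = 1..h_max (True = measured).
    The F(0) term is omitted, so the synthesis has zero mean.
    """
    h = np.arange(1, keep.size + 1)
    # Structure factors F(h) = sum_j b_j exp(2 pi i h x_j).
    f_calc = (b * np.exp(2j * np.pi * np.outer(h, x_frac))).sum(axis=1)
    f_calc = f_calc * keep  # missing reflections contribute nothing
    grid = np.arange(n_grid) / n_grid
    # Friedel mates are included via the factor 2 and the real part.
    return 2.0 * np.real(f_calc @ np.exp(-2j * np.pi * np.outer(h, grid)))


rng = np.random.default_rng(0)
n_atoms, h_max = 20, 40
x_frac = rng.random(n_atoms)
# Roughly one third H-like (negative) and two thirds D-like scatterers.
b = np.where(rng.random(n_atoms) < 0.3, -3.739, 6.671)

complete = np.ones(h_max, dtype=bool)
incomplete = complete.copy()
incomplete[rng.choice(h_max, size=16, replace=False)] = False  # 60% complete

rho_full = toy_map(x_frac, b, complete)
rho_part = toy_map(x_frac, b, incomplete)
cc = np.corrcoef(rho_full, rho_part)[0, 1]
print(f"completeness {incomplete.mean():.0%}, map CC {cc:.2f}")
```

In this toy setting the correlation can only decrease as terms are removed, because the incomplete synthesis contains a subset of the Fourier terms of the complete one. Real data differ in that completeness usually varies with resolution, as in the examples above.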
3.9. Percentage of observed D atoms
The percentage of D atoms with nuclear density peaks above the 3σ level is shown in Fig. 16. For nonwater D atoms, the percentage of reliably determined D atoms is typically less than 80%. Moreover, in the majority of models, less than 5% of D atoms have density peaks above 3σ. For water D atoms, the distribution is different: while the majority of models have less than 80% reliably located water D atoms, there are far fewer models with only 0–5% of D atoms with peaks above the 3σ level. This result may appear counterintuitive, as water D atoms are typically more mobile than those in the macromolecule. However, most neutron models contain all possible D (or H) atoms of the macromolecule, while only those water D atoms that have peaks in density maps are manually placed. It follows that most neutron models have a significant number of D atoms that do not have nuclear density peaks above the 3σ level. This means that these D atom positions cannot be accurately located on the basis of the experimental data.
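The statistic plotted in Fig. 16 amounts to counting atoms whose map value exceeds a threshold. A minimal sketch is given below, assuming that the nuclear density has already been interpolated at the D atom centers and expressed in σ units; the function name `percent_above` and the example values are hypothetical.

```python
import numpy as np


def percent_above(values_sigma, threshold=3.0):
    """Percentage of atoms whose map value (in sigma units) exceeds threshold."""
    values = np.asarray(values_sigma, dtype=float)
    if values.size == 0:
        return float("nan")  # no atoms of this kind in the model
    return 100.0 * np.count_nonzero(values > threshold) / values.size


# Made-up map values at five D atom centers, in sigma units.
protein_d = [3.5, 1.2, 4.0, 0.1, 2.9]
pct = percent_above(protein_d)
print(f"{pct:.0f}% of D atoms have density above 3 sigma")
```

For the made-up values, two of the five atoms exceed 3σ, giving 40%. Applying the same function separately to water and nonwater D atoms yields the two distributions compared above.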
Fig. 16.

Percentage of D atoms with nuclear density peaks > 3σ in neutron models.
4. Summary
This study evaluated currently available neutron models in the PDB. The clashscore analysis for subsets of H atoms showed that a majority of models have a larger clashscore for rotatable and water H atoms than for fixed H atoms. Inspection of deposited neutron structures suggests that some of the clashes involving rotatable H atoms may be corrected by adjusting the H atom position. This suggests that it may be possible to improve some of these neutron models, although whether this would lead to any change in the biological interpretation is unclear. As anticipated, we observe that in any neutron model less than 25% of fixed H atoms form an H-bond. More surprisingly, for rotatable H atoms, the proportion of H atoms forming an H-bond is typically less than 20% (but can occasionally reach up to 65%), and many models have water molecules without any H-bond interaction. It is reassuring to note that, when present, the H-bond lengths and angles are within typical values found in the literature. These H-bond results again suggest that many neutron models could be improved in terms of local stereochemistry to better reflect the expected chemical interactions. It is possible that the use of refinement procedures that incorporate more realistic potential functions might automatically generate improved models (Moriarty et al., 2020).
Analysis of X-ray homologue structures for the deposited neutron models showed that the rmsd between the two models is less than 1Å in most cases. We also saw that approximately 80% of water molecules have conserved positions, with mean distances between 0.3 and 0.7Å. These results demonstrate that neutron models are well determined. Re-refinement of joint XN models against X-ray data alone showed that, in general, both Rwork and Rfree decreased after refinement. We also observed that, while more than 80% of water molecules are conserved in most models, a nonnegligible number of models have fewer reproducible water molecules. These results highlight some limitations of current joint XN refinement procedures: (a) the final model is a compromise fit to the two datasets and (b) the model contains a single water structure that might not accurately reflect the water structures in the crystals used to collect the two datasets. These issues are particularly important when different data collection temperatures are used for the two datasets. Therefore, new joint refinement protocols are needed that combine the information available in neutron and X-ray datasets while simultaneously generating models that capture the details of each dataset.
The analysis of nuclear density histograms showed that data resolution and H atom content can shift the density peaks towards zero, complicating their interpretation. Furthermore, nuclear density maps can be distorted when the diffraction data are not complete, as is typically the case for neutron datasets as a result of various experimental constraints. Analysis of the D atom nuclear density peaks revealed that most neutron models have a significant number of D atoms with density peaks below the 3σ level. These results suggest that nuclear density maps from incomplete datasets cannot be used to accurately locate all D (H) atom positions on the basis of the experimental data. Experimental data collection protocols and data processing algorithms that can maximize data completeness would greatly improve this situation, as would the inclusion of ever more sophisticated descriptions of macromolecular stereochemistry in refinement algorithms.
Acknowledgments
This work was supported by the US National Institutes of Health (NIH) (grant P01GM063210), the Phenix Industrial Consortium and the NIH-funded (grant R01GM071939) Macromolecular Neutron Consortium between Oak Ridge National Laboratory and Lawrence Berkeley National Laboratory. This work was supported in part by the US Department of Energy under Contract No. DE-AC02-05CH1123.
References
- Adams PD, Mustyakimov M, Afonine PV, & Langan P (2009). Generalized X-ray and neutron crystallographic analysis: More accurate and complete structures for biological macromolecules. Acta Crystallographica Section D: Biological Crystallography, 65, 567–573.
- Afonine PV, Grosse-Kunstleve RW, Adams PD, & Urzhumtsev A (2013). Bulk-solvent and overall scaling revisited: Faster calculations, improved results. Acta Crystallographica Section D: Biological Crystallography, 69, 625–634.
- Afonine PV, Grosse-Kunstleve RW, Echols N, Headd JJ, Moriarty NW, Mustyakimov M, et al. (2012). Towards automated crystallographic structure refinement with phenix.refine. Acta Crystallographica Section D: Biological Crystallography, 68, 352–367.
- Afonine PV, Mustyakimov M, Grosse-Kunstleve RW, Moriarty NW, Langan P, & Adams PD (2010). Joint X-ray and neutron refinement with phenix.refine. Acta Crystallographica Section D: Biological Crystallography, 66, 1153–1163.
- Allen FH (1986). A systematic pairwise comparison of geometric parameters obtained by X-ray and neutron diffraction. Acta Crystallographica Section B: Structural Science, Crystal Engineering and Materials, 42, 515–522.
- Allen FH, & Bruno IJ (2010). Bond lengths in organic and metal-organic compounds revisited: X—H bond lengths from neutron diffraction data. Acta Crystallographica Section B: Structural Science, Crystal Engineering and Materials, 66, 380–386.
- Altschul SF, Gish W, Miller W, Myers EW, & Lipman DJ (1990). Basic local alignment search tool. Journal of Molecular Biology, 215, 403–410.
- Ankner JF, Heller WT, Herwig KW, Meilleur F, & Myles DAA (2013). Neutron scattering techniques and applications in structural biology. Current Protocols in Protein Science, 72, 17.16.1–17.16.34.
- Arai S (2005). Complicated water orientations in the minor groove of the B-DNA decamer d(CCATTAATGG)2 observed by neutron diffraction measurements. Nucleic Acids Research, 33, 3017–3024.
- Berman H, Henrick K, & Nakamura H (2003). Announcing the worldwide protein data bank. Nature Structural & Molecular Biology, 10, 980.
- Bernstein FC, Koetzle TF, Williams GJB, Meyer EF, Brice MD, Rodgers JR, et al. (1977). The protein data bank: A computer-based archival file for macromolecular structures. Journal of Molecular Biology, 112, 535–542.
- Blakeley MP, Kalb AJ, Helliwell JR, & Myles DA (2004). The 15-K neutron structure of saccharide-free concanavalin A. Proceedings of the National Academy of Sciences of the United States of America, 101, 16405–16410.
- Blakeley MP, Ruiz F, Cachau R, Hazemann I, Meilleur F, Mitschler A, et al. (2008). Quantum model of catalysis based on a mobile proton revealed by subatomic x-ray and neutron diffraction studies of h-aldose reductase. Proceedings of the National Academy of Sciences of the United States of America, 105, 1844–1848.
- Chatake T, & Fujiwara S (2016). A technique for determining the deuterium/hydrogen contrast map in neutron macromolecular crystallography. Acta Crystallographica Section D: Structural Biology, 72, 71–82.
- Chen JC-H, & Unkefer CJ (2017). Fifteen years of the protein crystallography station: The coming of age of macromolecular neutron crystallography. IUCrJ, 4, 72–86.
- Cleland WW (2000). Low-barrier hydrogen bonds and enzymatic catalysis. Archives of Biochemistry and Biophysics, 382, 1–5.
- Cleland WW, Frey PA, & Gerlt JA (1998). The low barrier hydrogen bond in enzymatic catalysis. Journal of Biological Chemistry, 273, 25529–25532.
- Coppens P (1967). Comparative x-ray and neutron diffraction study of bonding effects in s-triazine. Science, 158, 1577–1579.
- Cruickshank DWJ (1999). Remarks about protein structure precision. Acta Crystallographica Section D: Biological Crystallography, 55, 583–601.
- Cruickshank DWJ, Helliwell JR, & Moffat K (1987). Multiplicity distribution of reflections in Laue diffraction. Acta Crystallographica Section A: Foundations of Crystallography, 43, 656–674.
- Dauter Z (1999). Data-collection strategies. Acta Crystallographica Section D: Biological Crystallography, 55, 1703–1717.
- Dauter M, & Dauter Z (2017). Many ways to derivatize macromolecules and their crystals for phasing. Methods in Molecular Biology, 1607, 349–356.
- Evans PR (2011). An introduction to data reduction: Space-group determination, scaling and intensity statistics. Acta Crystallographica Section D: Biological Crystallography, 67, 282–292.
- Fabiola F, Bertram R, Korostelev A, & Chapman MS (2002). An improved hydrogen bond potential: Impact on medium resolution protein structures. Protein Science, 11, 1415–1423.
- Fields BA, Bartsch HH, Bartunik HD, Cordes F, Guss JM, & Freeman HC (1994). Accuracy and precision in protein crystal structure analysis: Two independent refinements of the structure of poplar plastocyanin at 173 K. Acta Crystallographica Section D: Structural Biology, 50, 709–730.
- Fujinaga M, Delbaere LT, Brayer GD, & James MN (1985). Refined structure of alpha-lytic protease at 1.7 A resolution. Analysis of hydrogen bonding and solvent structure. Journal of Molecular Biology, 184, 479–502.
- Gerlits O, Weiss KL, Blakeley MP, Veglia G, Taylor SS, & Kovalevsky A (2019). Zooming in on protons: Neutron structure of protein kinase A trapped in a product complex. Science Advances, 5, eaav0482.
- Grosse-Kunstleve RW, Sauter NK, Moriarty NW, & Adams PD (2002). The computational crystallography toolbox: Crystallographic algorithms in a reusable software framework. Journal of Applied Crystallography, 35, 126–136.
- Gruene T, Hahn HW, Luebben AV, Meilleur F, & Sheldrick GM (2014). Refinement of macromolecular structures against neutron data with SHELXL2013. Journal of Applied Crystallography, 47, 462–466.
- Koch O, Bocola M, & Klebe G (2005). Cooperative effects in hydrogen-bonding of protein secondary structure elements: A systematic analysis of crystal data using Secbase. Proteins: Structure, Function, and Bioinformatics, 61, 310–317.
- Kovalevsky A, Aggarwal M, Velazquez H, Cuneo MJ, Blakeley MP, Weiss KL, et al. (2018). “To be or not to be” protonated: Atomic details of human carbonic anhydrase-clinical drug complexes by neutron crystallography and simulation. Structure, 26, 383–390.e3.
- Langan PS, Close DW, Coates L, Rocha RC, Ghosh K, Kiss C, et al. (2016). Evolution and characterization of a new reversibly photoswitching chromogenic protein, Dathail. Journal of Molecular Biology, 428, 1776–1789.
- Langan P, Greene G, & Schoenborn BP (2004). Protein crystallography with spallation neutrons: The user facility at Los Alamos neutron science center. Journal of Applied Crystallography, 37, 24–31.
- Levitt M, & Park BH (1993). Water: Now you see it, now you don’t. Structure, 1, 223–226.
- Liebschner D, Afonine PV, Baker ML, Bunkóczi G, Chen VB, Croll TI, et al. (2019). Macromolecular structure determination using X-rays, neutrons and electrons: Recent developments in Phenix. Acta Crystallographica Section D: Structural Biology, 75, 861–877.
- Liebschner D, Afonine PV, Moriarty NW, Langan P, & Adams PD (2018). Evaluation of models determined by neutron diffraction and proposed improvements to their validation and deposition. Acta Crystallographica Section D: Structural Biology, 74, 800–813.
- Madden T (2003). The BLAST sequence analysis tool. USA: National Center for Biotechnology Information.
- Moriarty NW (2015). Human readable PDB codes: An editorial. Computational Crystallography Newsletter, 6, 26.
- Moriarty NW, Janowski PA, Swails JM, Nguyen H, Richardson JS, Case DA, et al. (2020). Improved chemistry restraints for crystallographic refinement by integrating the Amber force field into Phenix. Acta Crystallographica Section D: Structural Biology, 76, 51–62.
- Moriarty NW, Liebschner D, Klei HE, Echols N, Afonine PV, Headd JJ, et al. (2018). Interactive comparison and remediation of collections of macromolecular structures. Protein Science, 27, 182–194.
- Niimura N, Karasawa Y, Tanaka I, Miyahara J, Takahashi K, Saito H, et al. (1994). An imaging plate neutron detector. Nuclear Instruments and Methods in Physics Research A, 349, 521–525.
- Niimura N, Yoshiaki M, Nonaka T, Castagna J-C, Cipriani F, Hoghoj P, et al. (1997). Neutron Laue diffractometry with an imaging plate provides an effective data collection regime for neutron protein crystallography. Nature Structural Biology, 4, 909–914.
- Ohlendorf DH (1994). Accuracy of refined protein structures. II. Comparison of four independently refined models of human interleukin 1β. Acta Crystallographica Section D: Biological Crystallography, 50, 808–812.
- Orpen AG, Pippard D, Sheldrick GM, & Rouse KD (1978). Decacarbonyl-μ-hydrido-μ-vinyl-triangulo-triosmium: A combined X-ray and neutron diffraction study. Acta Crystallographica Section B: Structural Crystallography and Crystal Chemistry, 34, 2466–2472.
- Richardson JS, Williams CJ, Hintze BJ, Chen VB, Prisant MG, Videau LL, et al. (2018). Model validation: Local diagnosis, correction and when to quit. Acta Crystallographica Section D: Structural Biology, 74, 132–142.
- Rupp B (2018). Against method: Table 1—Cui bono? Structure, 26, 919–923.
- Schoenborn BP (1969). Neutron diffraction analysis of myoglobin. Nature, 224, 143–146.
- Steiner T (2002). The hydrogen bond in the solid state. Angewandte Chemie International Edition, 41, 48–76.
- Teeter MM, & Kossiakoff AA (1984). The neutron structure of the hydrophobic plant protein crambin. In Schoenborn BP (Ed.), Neutrons in biology (pp. 335–348). Boston, MA: Springer US.
- Thomas A, Benhabiles N, Meurisse R, Ngwabije R, & Brasseur R (2001). Pex, analytical tools for PDB files. II. H-Pex: Noncanonical H-bonds in α-helices. Proteins: Structure, Function, and Genetics, 43, 37–44.
- Torshin IY, Weber IT, & Harrison RW (2002). Geometric criteria of hydrogen bonds in proteins and identification of ‘bifurcated’ hydrogen bonds. Protein Engineering, Design and Selection, 15, 359–363.
- Urzhumtsev AG (1991). Low-resolution phases: Influence on SIR syntheses and retrieval with double-step filtration. Acta Crystallographica Section A: Foundations of Crystallography, 47, 794–801.
- Urzhumtsev A, Afonine PV, Lunin VY, Terwilliger TC, & Adams PD (2014). Metrics for comparison of crystallographic maps. Acta Crystallographica Section D: Biological Crystallography, 70, 2593–2606.
- Weichenberger CX, Afonine PV, Kantardjieff K, & Rupp B (2015). The solvent component of macromolecular crystals. Acta Crystallographica Section D: Biological Crystallography, 71, 1023–1038.
- Wennmacher JTC, Zaubitzer C, Li T, Bahk YK, Wang J, van Bokhoven JA, et al. (2019). 3D-structured supports create complete data sets for electron crystallography. Nature Communications, 10, 1–6.
- Williams CJ, Headd JJ, Moriarty NW, Prisant MG, Videau LL, Deis LN, et al. (2018). MolProbity: More and better reference data for improved all-atom structure validation. Protein Science, 27, 293–315.
- Wlodawer A (1980). Studies of ribonuclease-A by X-ray and neutron diffraction. Acta Crystallographica Section B: Structural Science, Crystal Engineering and Materials, 36, 1826–1831.
- Wlodawer A, & Hendrickson WA (1982). A procedure for joint refinement of macromolecular structures with X-ray and neutron diffraction data from single crystals. Acta Crystallographica Section A: Foundations and Advances, 38, 239–247.
- Wlodawer A, Minor W, Dauter Z, & Jaskolski M (2008). Protein crystallography for non-crystallographers, or how to get the best (but not more) from published macromolecular structures. The FEBS Journal, 275, 1–21.
- Wlodawer A, Walter J, Huber R, & Sjölin L (1984). Structure of bovine pancreatic trypsin inhibitor. Journal of Molecular Biology, 180, 301–329.
- Word JM, Lovell SC, LaBean TH, Taylor HC, Zalis ME, Presley BK, et al. (1999). Visualizing and quantifying molecular goodness-of-fit: Small-probe contact dots with explicit hydrogen atoms. Journal of Molecular Biology, 285, 1711–1733.
- wwPDB Consortium. (2019). Protein Data Bank: The single global archive for 3D macromolecular structure data. Nucleic Acids Research, 47, D520–D528.
