Abstract
For many macromolecular NMR ensembles from the Protein Data Bank (PDB) the experiment-based restraint lists are available, while other experimental data, mainly chemical shift values, are often available from the BioMagResBank. The accuracy and precision of the coordinates in these macromolecular NMR ensembles can be improved by recalculation using the available experimental data and present-day software. Such efforts, however, generally fail on half of all NMR ensembles due to the syntactic and semantic heterogeneity of the underlying data and the wide variety of formats used for their deposition. We have combined the remediated restraint information from our NMR Restraints Grid (NRG) database with available chemical shifts from the BioMagResBank and the Common Interface for NMR structure Generation (CING) structure validation reports into the weekly updated NRG-CING database (http://nmr.cmbi.ru.nl/NRG-CING). Eleven programs have been included in the NRG-CING production pipeline to arrive at validation reports that list for each entry the potential inconsistencies between the coordinates and the available experimental NMR data. The longitudinal validation of these data in a publicly available relational database yields a set of indicators that can be used to judge the quality of every macromolecular structure solved with NMR. The remediated NMR experimental data sets and validation reports are freely available online.
INTRODUCTION
Experimentally determined biomacromolecular three-dimensional (3D) structures typically are deposited in the Worldwide Protein Data Bank (wwPDB) (1–3) as a requirement by most journals including NAR. As of September 2011, there were over 76 000 entries in the PDB (cf. Table 1) of which ∼9000 entries had been solved by NMR. The BioMagResBank (BMRB) (4) serves as a global repository of experimental NMR data, such as restraints, assigned chemical shifts and dynamic order parameters. Together, these repositories present a valuable resource for numerous research areas in the life sciences.
Table 1.
PDB entries
| Set | Entries |
|---|---|
| PDB | 76 003 |
| Solution NMR | 9042 |
| NRG-CING | 8915 |
| Proteins | 7967 |
| Dimers | 413 |
| Complexes | 1235 |
| Ligands | 384 |
| Deposition | |
| Before 1990 | 9 |
| 1990-2000 | 1920 |
| After 2000 | 7113 |
Overview of subsets of PDB entries (23 September 2011).
A series of experiments have shown that many NMR structures can be improved if they are recalculated from the original experimental data using present-day software and refinement protocols (5–7) including the STAP database published in this ‘Database’ issue of Nucleic Acids Research. These efforts have revealed that the deposited experimental data were highly heterogeneous in format, completeness and quality. Recently, we performed a large-scale optimization of X-ray derived PDB entries (8), which showed that nearly three quarters of these could be improved in terms of fit with the experimental data and geometric quality (9). The massive scale of this effort also allowed the analysis of even the smallest improvements in a statistically meaningful way (10).
Recalculation and proper validation (i.e. validation including the experimental data) both require that the underlying experimental data are syntactically and semantically correct. We have therefore worked for several years on this topic (11,12). In collaboration with the BMRB, we have completed the remediation of the NMR restraint data entries, which resulted in the NMR Restraints Grid (NRG) databases. We recently added the BMRB chemical shift (CS) data, and these combined results have been subjected to our integrated NMR structure and experimental data validation analyses to yield the new database described in this contribution. We have named this database NRG-CING. The database is freely available at http://nmr.cmbi.ru.nl/NRG-CING and it will be updated on a weekly basis. For the NRG-CING pipeline, we have extended the Common Interface for NMR structure Generation (CING; pronounced ‘king’) software package (G. Vuister et al., CING; an integrated residue-based structure validation program suite, manuscript in preparation). The pipeline first assembles a set of experimental and structural data and then produces a report that includes the results of eleven computer programs that were written by us or by others. The quality of the structure coordinates is currently determined mainly by WHAT_CHECK (12) and PROCHECK-NMR (13). The experimental restraints are tested for consistency and agreement with the structure by CING, Wattos (14) and PROCHECK-NMR/Aqua (13). In addition, the systematic analysis of NMR restraints allowed us to extract new patterns of recurring problems (15). Validation of CS values based on structural and sequence information, by CING and the external programs VASCO (16), SHIFTX (17) and TALOS+ (18), is an integral part of the analyses.
The NRG-CING database is a coherent, annotated and verified collection of experimental input data, the resulting structures and the analyses of their quality. NRG-CING will be the basis for recalculation efforts such as the STAP (http://psb.kobic.re.kr/stap/refinement) and LOGRECOORD (7) databases that will lead to better quality NMR structure ensembles that in turn will allow researchers in the life sciences, in drug design and in bioinformatics to better perform their structure-based research.
DATA PREPARATION
Data conversion
The creation of a coherent and validated database of both structures and experimental data requires several steps. For the NRG-CING production pipeline we employed four stages, which we call C, R, S and F, denoting coordinate, restraint, chemical shift and filtering, respectively (Figure 1).
Figure 1.
Flow chart. Data flow chart showing the software tools involved in this project: CING, Wattos and FormatConverter (FC). The four stages denoted: C, R, S and F are described in the text. The dashed line indicates an alternative to the default route including all data types. The repositories, programs and data-formats are represented by cylinders, ‘closed rectangles’ and ‘open rectangles’, respectively.
Coordinate stage
The coordinate data flow in from the wwPDB using an mmCIF formatted file that adheres to the PDB eXchange dictionary (pdbx).
Restraints stage
When restraints are present, the coordinates and the restraints are imported directly from the NRG Database Of Converted Restraints [DOCR; (11)] at BMRB as a CCPN XML file.
Shift stage
We developed code in collaboration with BMRB to run through a wide variety of data sources in order to match older entries for which the match relation between BMRB and PDB entries had not yet been archived. The matching algorithms are documented for the NRG part at: http://tinyurl.com/68dd9l9 and the CING part at http://tinyurl.com/67vfuyl. The CS data from BMRB are then merged by the FormatConverter (FC) (19) in a procedure similar to the one used for the restraints (15).
Filter stage
The distance restraints (DRs) are stereospecifically checked and in some cases corrected by FC and CING using the same method as currently in use at the BMRB (11). Distance restraints with violations over 2 Å (up to a maximum of three per entry) were omitted from the NRG-CING database and are labelled as outliers. Although such DRs are sometimes correct, removing a correct DR is deemed less detrimental than retaining a potentially incorrect one. In particular, the latter could result in an entry being unjustly labelled as being in discord with its experimental data. From anecdotal interactions with depositors we know that these restraints are often errant violations that were not observed at the time of structure calculation, but arose later as a consequence of correcting other problems, for example, typographical errors that led to a restraint being accidentally uncommented or incorrect mapping of one or two atom names. The referencing of the CS is validated during this stage by VASCO, which compares the CS values for the atoms in a protein to their statistical distribution in relation to the coordinate-derived per-atom solvent exposure (16).
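The outlier rule described above can be illustrated with a minimal Python sketch. It is not the FC or CING implementation: the `DistanceRestraint` class and the `split_outliers` function are hypothetical names, and the text does not state which restraints are chosen when more than three exceed the cutoff, so the sketch assumes that the three largest violations are removed.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DistanceRestraint:
    """Hypothetical record: one restraint and its largest violation in angstrom."""
    restraint_id: int
    max_violation: float


def split_outliers(restraints, cutoff=2.0, max_removed=3):
    """Return (kept, outliers) for one entry.

    Restraints violated by more than `cutoff` are candidates for removal.
    At most `max_removed` are removed; ASSUMPTION: the largest violations
    are taken first (the selection order is not given in the text).
    """
    candidates = sorted(
        (r for r in restraints if r.max_violation > cutoff),
        key=lambda r: r.max_violation,
        reverse=True,
    )
    outliers = candidates[:max_removed]
    outlier_ids = {r.restraint_id for r in outliers}
    kept = [r for r in restraints if r.restraint_id not in outlier_ids]
    return kept, outliers


if __name__ == "__main__":
    # Made-up violations, for illustration only.
    violations = [0.1, 2.5, 3.0, 0.4, 2.1, 5.0, 2.0]
    entry = [DistanceRestraint(i, v) for i, v in enumerate(violations)]
    kept, outliers = split_outliers(entry)
    print("outliers:", [r.restraint_id for r in outliers])
    print("kept:", [r.restraint_id for r in kept])
```

Capping the number of removed restraints keeps the filter conservative: an entry with many large violations still shows up as being in discord with its data rather than being silently cleaned.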
Cloud computing
The CING calculations require on average 20 min per entry, for a total of 3000 core hours to process the current set of entries. Most of that time is used to run the many external programs and to prepare the large number of plots that report on the data. Because the complete database needs to be reassembled following each major overhaul of the analysis, this project continues to require substantial computing power. As CING has many external program dependencies, it cannot easily be installed on a traditional grid, but we have found it to be very suitable for a cloud computing setup. The eleven programs required for generating a CING report besides CING (G. Vuister et al., manuscript in preparation) are: CCPN (19), DSSP (20), MatPlotLib (http://matplotlib.sourceforge.net), MOLMOL (21), PROCHECK/Aqua (13), Povray (http://www.povray.org), ShiftX (22), TALOS+ (18), VASCO (16), Wattos (14) and WHAT_CHECK (12). We use the cloud facilities at SARA, our industrial partner Bitbrains (Amstelveen, NL) and WeNMR/INFN for each full iteration in the NRG-CING project.
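The quoted total follows from the per-entry average and the entry count in Table 1. The short sketch below reproduces that arithmetic, assuming one core per entry; the function name is hypothetical.

```python
def total_core_hours(n_entries, minutes_per_entry):
    """Core hours for one full pass, assuming each entry occupies one core."""
    return n_entries * minutes_per_entry / 60.0


if __name__ == "__main__":
    # 8915 NRG-CING entries (Table 1) at an average of 20 min each.
    hours = total_core_hours(8915, 20)
    print(f"{hours:.0f} core hours")  # about 3000 core hours
```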
Project management
A large international collaborative project like NRG-CING requires the identification and remediation of issues with the software developed and the procedures used. From the beginning of this project in 2008, the issues were maintained in a Google Code repository at http://code.google.com/p/cing and linked to the source code in the CING project. Together with the general CING issues, almost all of the 300+ issues currently listed have been addressed. The documentation is maintained in Wiki pages at the same site. An automatic build and test farm for several operating systems is managed by Jenkins Continuous Integration (CI, http://jenkins-ci.org) at http://nmr.cmbi.ru.nl/jenkins/job/CING.
RESULTS
NRG-CING database overall composition
Of the 8915 entries contained in the NRG-CING database (September 2011) 5423 contained experimental data including DRs (Tables 1 and 2). These entries span the full time frame during which NMR structures have been deposited (1988 to present). Analysis of the experimental data variation also showed that the set contains structures determined both from ‘sparse data’, where only a limited amount of structural information was extracted from NMR experiments, and from abundant experimental data.
Table 2.
Statistics of the NRG-CING database
| Set | Entries | Per entry count: Average (SD) | Per entry count: Min. | Per entry count: Max. |
|---|---|---|---|---|
| Experimental restraints | 5519 | 1392 (1158) | 9 | 11 044 |
| Distances (DRs) | 5423 | 1325 (1107) | 11 | 10 112 |
| AIR DRs only (a) | 97 | 27 (14) | 11 | 49 |
| Dihedral angles (b) | 3401 | 128 (106) | 9 | 1099 |
| RDCs | 426 | 139 (148) | 9 | 970 |
| Chemical shifts | 3626 | 780 (512) | 2 | 3959 |
| Number of residues | NA | 92 (71) | 2 | 1659 |
(a) The number of HADDOCK AIR entries was overestimated by including every NRG-CING entry with <50 DRs.
(b) The number of entries with dihedral angle restraints is overestimated by including CS-derived ones from TALOS+.
NA: Not Applicable.
Examples of longitudinal validation
The CS values of the β and γ carbons of proline have been shown to be sensitive to whether the peptide bond adopts the usual trans or the occasionally occurring cis configuration. A study based on 33 cis and 1000 trans Pro residues in non-paramagnetic proteins showed a clear clustering of the 13C β/γ CS difference (CSD) values (23). The regions of (0.0, 4.8) and (9.15, 14.4) ppm corresponded with near absolute certainty to the trans and cis conformations, respectively. In NRG-CING we observe 228 cis and 7949 trans Pro residues in 3435 entries with β/γ carbon CS values obtained from BMRB. We have identified the reversed correspondence for 8 (cis) and over 100 (trans) occurrences. For example, in the recent Structural Genomics PDB entry 2k8s (Cort J.R. et al., unpublished results), Pro57 in chain B has a CSD of 11.9 ppm, which contradicts the trans state modelled in all conformers of the ensemble. We also observed much more extreme CSD values that are likely caused by human error: e.g. the CSD of Pro71 in PDB entry 2i4k (24) has a very large value (37 ppm) that most likely resulted from uncorrected folding/aliasing of the NMR spectrum.
A second example of the combined analysis of chemical shifts in relation to structural quality concerns the sidechain conformation of the leucine delta carbons. Here too, chemical shifts have proven to be reliable indicators of conformation (25). For the NRG-CING database, 218 (trans) and 115 (gauche+) structured leucine residues in a total of 286 entries showed inconsistencies between observed chemical shifts and χ1/χ2 sidechain conformations that warrant further investigation (Berntsen, K.R.M., Doreleijers, J.F., Breukels, V., Stens, E., Vriend, G. and Vuister, G.W., manuscript in preparation).
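The proline check can be written down as a small classifier over the two CSD regions quoted above. This is a sketch, not the CING code: the function names are hypothetical, the regions are treated as open intervals as written, and the CSD is assumed to be the Cβ shift minus the Cγ shift.

```python
TRANS_REGION = (0.0, 4.8)    # ppm, from the regions quoted in the text
CIS_REGION = (9.15, 14.4)    # ppm


def proline_csd(cb_shift, cg_shift):
    """13C beta/gamma chemical shift difference (ASSUMED to be C-beta minus C-gamma)."""
    return cb_shift - cg_shift


def classify_csd(csd):
    """Return 'trans', 'cis' or 'undetermined' for a CSD value in ppm."""
    if TRANS_REGION[0] < csd < TRANS_REGION[1]:
        return "trans"
    if CIS_REGION[0] < csd < CIS_REGION[1]:
        return "cis"
    return "undetermined"


def conflicts_with_model(csd, modelled_state):
    """True only if the CSD indicates a definite state that differs from the model."""
    predicted = classify_csd(csd)
    return predicted != "undetermined" and predicted != modelled_state


if __name__ == "__main__":
    # CSD values taken from the two examples in the text.
    print(classify_csd(11.9), conflicts_with_model(11.9, "trans"))
    print(classify_csd(37.0), conflicts_with_model(37.0, "trans"))
```

A value such as 37 ppm falls outside both regions, so it is reported as undetermined rather than as a cis/trans conflict; such values are better treated as probable data errors.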
AVAILABILITY
Reports
Currently all wwPDB members (RCSB-PDB, PDBe, PDBj and BMRB) include links to the NRG-CING reports. These pointers drive the vast majority of traffic to the NRG-CING database. The complete NRG-CING database can be accessed by any user. In addition to straightforward selection of specific PDB entries, the front page of the NRG-CING website also allows interactive selection using different criteria, such as protein size, number of distance restraints or chemical shifts, or ROG score. According to Google Analytics, during the last year NRG-CING was visited on average ∼25 times per day by 9 ‘absolute unique visitors’.
Relational database
In addition to the web-based interactive HTML, CSV dumps from the relational database are available (http://nmr.cmbi.ru.nl/NRG-CING/pgsql). These files can be imported into a slave database using the SQL script at http://tinyurl.com/3rb24eq. The relational database (RDB) contains the validation data at the levels of entry, chain, residue and atom, with special tables recently added for DRs and CSs. Many of the validation criteria in CING are also in this relational database, and plots are available at http://nmr.cmbi.ru.nl/NRG-CING/HTML/plot.html, showing the distribution of values such as detailed in Figure 2 for the CING ROG scores. The NRG-CING RDB is set up in conjunction with the PDBj Mine RDB for full cross-correlated access to PDB metadata such as deposition dates (26).
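To give an impression of how such a CSV dump might be queried locally, the sketch below loads entry-level rows into an in-memory SQLite database and selects entries by number of distance restraints. Everything specific here is an assumption made for illustration: the table name `entry`, the column names, and the three sample rows are invented and do not reflect the actual NRG-CING schema or data, and SQLite merely stands in for the database server.

```python
import csv
import io
import sqlite3

# Invented stand-in for an entry-level CSV dump (NOT real NRG-CING data).
SAMPLE_CSV = """pdb_id,residue_count,dr_count,rog
demo1,76,1204,green
demo2,140,310,red
demo3,58,45,orange
"""


def load_entries(csv_text):
    """Load entry-level rows into an in-memory SQLite table (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE entry ("
        "pdb_id TEXT PRIMARY KEY, residue_count INTEGER, "
        "dr_count INTEGER, rog TEXT)"
    )
    rows = [
        (r["pdb_id"], int(r["residue_count"]), int(r["dr_count"]), r["rog"])
        for r in csv.DictReader(io.StringIO(csv_text))
    ]
    conn.executemany("INSERT INTO entry VALUES (?, ?, ?, ?)", rows)
    return conn


def select_entries(conn, min_restraints):
    """Return the ids of entries with at least `min_restraints` distance restraints."""
    cur = conn.execute(
        "SELECT pdb_id FROM entry WHERE dr_count >= ? ORDER BY pdb_id",
        (min_restraints,),
    )
    return [row[0] for row in cur]


if __name__ == "__main__":
    conn = load_entries(SAMPLE_CSV)
    print(select_entries(conn, 300))
```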
Figure 2.
ROG Results from NRG-CING. The percentage of residues with ROG score red (bad) versus green (good) is plotted with filled circles for 6265 NMR PDB entries from NRG. The red, orange, green (ROG) score is a composite assessment over individual program's validation criteria on the quality of entities such as restraint, coordinate, peak, chemical shift, atom, residue, molecule, etc. The ROG scores are propagated based upon defined relationships between such entities. The entries were selected to have at least: 3.5 kDa molecular mass, 10 models and one protein chain. On the bottom right of the banana-shaped distribution are a minority of entries that have a significant fraction of residues marked red. Note that the percentages green and red taken together with the omitted dimension for orange, add up to 100%.
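The quantities plotted in Figure 2 are per-entry percentages of residues with each ROG colour. A minimal sketch of that bookkeeping, assuming each residue carries exactly one of the three labels (the function name and the input representation are hypothetical, and the residue labels in the example are made up):

```python
from collections import Counter

ROG_COLOURS = ("red", "orange", "green")


def rog_percentages(residue_scores):
    """Percentage of residues per ROG colour for one entry.

    `residue_scores` is an iterable of 'red', 'orange' or 'green' labels,
    one per residue. The three percentages add up to 100.
    """
    counts = Counter(residue_scores)
    unknown = set(counts) - set(ROG_COLOURS)
    if unknown:
        raise ValueError(f"unexpected ROG labels: {sorted(unknown)}")
    total = sum(counts.values())
    if total == 0:
        raise ValueError("entry has no scored residues")
    return {colour: 100.0 * counts[colour] / total for colour in ROG_COLOURS}


if __name__ == "__main__":
    # Made-up entry with ten residues.
    scores = ["green"] * 6 + ["orange"] * 3 + ["red"]
    print(rog_percentages(scores))
```

Because the three percentages are constrained to sum to 100%, plotting red against green is sufficient; the orange fraction is implied.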
iCing Server and service
Our multilingual web server (https://nmr.cmbi.ru.nl/icing/) and a web service together are called iCing (see Figure 3). iCing allows a user to submit NMR-derived coordinates, restraints and CS values in three data formats. We preferentially employ CCPN project files (19), but also accommodate additional data formats, such as the outdated plain PDB format for structural data only. Although not preferred, this capability gives the casual user access without requiring more sophisticated tools. In collaboration with Dr Torsten Herrmann, we added the capability to upload CYANA formatted data, which will make it easier for more standalone programs to integrate with the iCing service.
Figure 3.
iCing Web Server and Service. The screenshot of iCing (Spanish translation selected) shows the customizable definitions for ‘poor’ (orange) and ‘bad’ (red) that CING will use for some WHAT_CHECK parameters. The Google Web Toolkit (GWT) allowed us to easily add German, Spanish, French, Italian, Japanese, Dutch, Portuguese, Russian and Chinese translations to the default English language with help from our colleagues who are native speakers of these languages.
The iCing server can be used prior to a submission or, even better, as part of the iterative process of NMR structure determination. Figure 3 shows that the user can customize the validation criteria, which can be useful to specifically focus attention on particular aspects. Generally speaking, however, this is not recommended because the standard criteria are used in deriving the NRG-CING database. Validation of the validation criteria themselves is a topic of ongoing research.
The server uses a simple three-tier setup with a Google Web Toolkit 2.0 front end, an Apache/Tomcat secured HTTP servlet, and a backend part including the CING installation. The iCing server has seen 1025 unique views during the first 10 months of 2011, according to Google Analytics. The standalone CCPN Analysis program (19) is using iCing as a service extensively. In total, the iCing service has been used for 1417 data sets in the same period.
FUTURE PERSPECTIVES
Improvements
Although already a valuable resource, as judged from its usage statistics, we continuously seek improvements to the database. We plan to address the following topics: (i) we aim to make the database 100% complete by solving a series of difficult data-related issues (such as Google Code NRG issue 272 and CING issues 266, 310–312) that currently limit us to include only 98.6% of the PDB entries. (ii) We plan on improving the NRG-CING setup with better matches between older BMRB and PDB entries, deposited before the relationship between these was maintained. (iii) Finally, although RDC data are contained within the database, these should be validated as well.
Usage
Finally, NRG-CING only contains the released PDB entries. This journal, Nucleic Acids Research, like many journals, encourages authors of new structure papers to provide referees with the output from PDB's validation report from http://deposit.pdb.org/validate. It would be of great value to authors and referees to have these CING reports available in addition to the currently used validation reports on the coordinates alone.
FUNDING
The Netherlands Organization for Scientific Research (NWO) (grant 700.55.443, to G.W.V.), Netherlands Bioinformatics Centre (NBIC); EU FP6 grants STREP Extend-NMR (LSHG-CT-2005-018988) and EMBRACE (LHSG-CT-2004-512092); FP7 WeNMR (grant 261572, to J.F.D., G.V. and G.W.V.); Brussels Institute for Research and Innovation (Innoviris) (grant BB2B 2010-1-12, to W.V.F.); US National Library of Medicine (grant LM05799, to C.S., E.L.U. and J.L.M.). Funding for open access charge: Radboud University Medical Centre Funds to the CMBI.
Conflict of interest statement. None declared.
ACKNOWLEDGEMENTS
We gratefully acknowledge the iCing translators: Jundong Lin (Madison), Alan W.S. da Silva (Cambridge), Thomas Lütteke (Gießen, Germany), Philip Kensche (Nijmegen), Olì Maria Victoria Grober (Naples), Naohiro Kobayashi (Osaka) and Nadia Kovalevskaya (Nijmegen). We enjoyed the advice and constructive criticism from all CCPN developers in Cambridge and from Tim te Beek, Karen Berntsen, Vincent Breukels, Maarten Hekkelman, Wilmar Teunissen and Wouter Touw in Nijmegen. We appreciated Akira Kinjo's advice on setting up a local slave of PDBj Mine. We are especially indebted to all the authors providing wwPDB with the fruits of their labor. We appreciated the support from Floris Sluiter and colleagues at SARA (Amsterdam), Gjalt van Rutten at Bitbrains and Marco Verlato (WeNMR) with using their HPC Cloud infrastructure.
REFERENCES
- 1. Rose PW, Beran B, Bi C, Bluhm WF, Dimitropoulos D, Goodsell DS, Prlic A, Quesada M, Quinn GB, Westbrook JD, et al. The RCSB Protein Data Bank: redesigned web site and web services. Nucleic Acids Res. 2011;39:D392–D401. doi: 10.1093/nar/gkq1021.
- 2. Velankar S, Alhroub Y, Alili A, Best C, Boutselakis HC, Caboche S, Conroy MJ, Dana JM, van Ginkel G, Golovin A, et al. PDBe: Protein Data Bank in Europe. Nucleic Acids Res. 2011;39:D402–D410. doi: 10.1093/nar/gkq985.
- 3. Berman H, Henrick K, Nakamura H. Announcing the worldwide Protein Data Bank. Nat. Struct. Biol. 2003;10:980. doi: 10.1038/nsb1203-980.
- 4. Ulrich EL, Akutsu H, Doreleijers JF, Harano Y, Ioannidis YE, Lin J, Livny M, Mading S, Maziuk D, Miller Z, et al. BioMagResBank. Nucleic Acids Res. 2008;36:D402–D408. doi: 10.1093/nar/gkm957.
- 5. Nabuurs SB, Nederveen AJ, Vranken W, Doreleijers JF, Bonvin AMJJ, Vuister GW, Vriend G, Spronk CAEM. DRESS: a database of REfined solution NMR structures. Proteins. 2004;55:483–486. doi: 10.1002/prot.20118.
- 6. Nederveen AJ, Doreleijers JF, Vranken W, Miller Z, Spronk CAEM, Nabuurs SB, Güntert P, Livny M, Markley JL, Nilges M, et al. RECOORD: a recalculated coordinate database of 500+ proteins from the PDB using restraints from the BioMagResBank. Proteins. 2005;59:662–672. doi: 10.1002/prot.20408.
- 7. Bernard A, Vranken WF, Bardiaux B, Nilges M, Malliavin TE. Bayesian estimation of NMR restraint potential and weight: a validation on a representative set of protein structures. Proteins. 2011;79:1525–1537. doi: 10.1002/prot.22980.
- 8. Joosten RP, Salzemann J, Bloch V, Stockinger H, Berglund A-C, Blanchet C, Bongcam-Rudloff E, Combet C, Costa ALD, Deleage G, et al. PDB_REDO: automated re-refinement of X-ray structure models in the PDB. J. Applied Crystallogr. 2009;42:376–384. doi: 10.1107/S0021889809008784.
- 9. Joosten R, Womack T, Vriend G, Bricogne G. Re-refinement from deposited X-ray data can deliver improved models for most PDB entries. Acta Cryst. D. 2009;65:176–185. doi: 10.1107/S0907444908037591.
- 10. Touw WG, Vriend G. On the complexity of Engh and Huber refinement restraints: the angle τ as example. Acta Crystallogr. D Biol. Crystallogr. 2010;66:1341–1350. doi: 10.1107/S0907444910040928.
- 11. Doreleijers JF, Vranken WF, Schulte C, Lin J, Wedell JR, Penkett CJ, Vuister GW, Vriend G, Markley JL, Ulrich EL. The NMR restraints grid at BMRB for 5,266 protein and nucleic acid PDB entries. J. Biomol. NMR. 2009;45:389–396. doi: 10.1007/s10858-009-9378-z.
- 12. Hooft R, Vriend G, Sander C, Abola E. Errors in protein structures. Nature. 1996;381:272. doi: 10.1038/381272a0.
- 13. Laskowski R, Rullmann J, MacArthur M. AQUA and PROCHECK-NMR: programs for checking the quality of protein structures solved by NMR. J. Biomol. NMR. 1996;8:477–486. doi: 10.1007/BF00228148.
- 14. Doreleijers JF, Nederveen AJ, Vranken W, Lin J. BioMagResBank databases DOCR and FRED containing converted and filtered sets of experimental NMR restraints and coordinates from over 500 protein PDB structures. J. Biomol. NMR. 2005;32:1–12. doi: 10.1007/s10858-005-2195-0.
- 15. Vranken W. A global analysis of NMR distance constraints from the PDB. J. Biomol. NMR. 2007;39:303–314. doi: 10.1007/s10858-007-9199-x.
- 16. Rieping W, Vranken WF. Validation of archived chemical shifts through atomic coordinates. Proteins. 2010;78:2482–2489. doi: 10.1002/prot.22756.
- 17. Zhang H, Neal S, Wishart D. RefDB: a database of uniformly referenced protein chemical shifts. J. Biomol. NMR. 2003;25:173–195. doi: 10.1023/a:1022836027055.
- 18. Shen Y, Delaglio F, Cornilescu G, Bax A. TALOS+: a hybrid method for predicting protein backbone torsion angles from NMR chemical shifts. J. Biomol. NMR. 2009;44:213–223. doi: 10.1007/s10858-009-9333-z.
- 19. Vranken W, Boucher W, Stevens T, Fogh R, Pajon A, Llinas M, Ulrich EL, Markley JL, Ionides J, Laue E. The CCPN data model for NMR spectroscopy: development of a software pipeline. Proteins. 2005;59:687–696. doi: 10.1002/prot.20449.
- 20. Kabsch W, Sander C. Dictionary of protein secondary structure: pattern recognition of hydrogen-bonded and geometrical features. Biopolymers. 1983;22:2577–2637. doi: 10.1002/bip.360221211.
- 21. Koradi R, Billeter M, Wüthrich K. MOLMOL: a program for display and analysis of macromolecular structures. J. Mol. Graph. 1996;14:51–55. doi: 10.1016/0263-7855(96)00009-4.
- 22. Neal S, Nip A, Zhang H, Wishart D. Rapid and accurate calculation of protein 1H, 13C and 15N chemical shifts. J. Biomol. NMR. 2003;26:215–240. doi: 10.1023/a:1023812930288.
- 23. Schubert M, Labudde D, Oschkinat H, Schmieder P. A software tool for the prediction of Xaa-Pro peptide bond conformations in proteins based on 13C chemical shift statistics. J. Biomol. NMR. 2002;24:149–154. doi: 10.1023/a:1020997118364.
- 24. Zhong Q, Watson MJ, Lazar CS, Hounslow AM, Waltho JP, Gill GN. Determinants of the endosomal localization of sorting nexin 1. Mol. Biol. Cell. 2005;16:2049–2057. doi: 10.1091/mbc.E04-06-0504.
- 25. Mulder F. Leucine side-chain conformation and dynamics in proteins from 13C NMR chemical shifts. Chem. Bio. Chem. 2009;10:1477–1479. doi: 10.1002/cbic.200900086.
- 26. Kinjo AR, Yamashita R, Nakamura H. PDBj Mine: design and implementation of relational database interface for Protein Data Bank Japan. Database. 2010;2010. doi: 10.1093/database/baq021.



