Table 1.
Detected and corrected problems in the BioMeta database
Type of Problem | # in KEGG | # in BioMeta | # Corrected |
Structure missing | 1239 | 1106 | 133 |
Valence violation(s) | 76 | 0 | 76 |
Incorrect constitution | unknown | unknown | 107 |
Total (constitution) | 1315 | 1106 | 316 |
Undefined stereo double bond(s) | 35 | 32 | 3 |
Invalid sp3 stereocenter(s) | 70 | 47 | 23 |
Ambiguous sp3 stereocenter(s) | 46 | 0 | 46 |
Undefined sp3 stereocenter(s) | 1398 | 865 | 533 |
Unspecified enantiomer | 2326 | 1840 | 486 |
Undefined sp3 stereochemistry | 554 | 366 | 188 |
Incorrect stereochemistry | unknown | unknown | 69 |
Total (stereochemistry) | 3990 | 2907 | 1152 |
Total corrected | | | 1468 |
The table shows the validation and correction results for the 12,815 entries present in both the KEGG Compound database (version of October 25, 2005) and the BioMeta database. Note that a missing structure is not necessarily an error: the entry may be a generic compound such as "acceptor" or "phosphorylated protein". Likewise, not every "unspecified enantiomer" case is an error, since a number of drugs may be racemic compounds. The row "Total (stereochemistry)" is not the sum of the preceding rows because a compound may have multiple problems. In the total rows, the KEGG count minus the BioMeta count does not equal the number corrected because of the "unknown" entries; if these numbers were known, the totals would add up.
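The arithmetic behind these notes can be checked with a minimal Python sketch. The counts are copied from Table 1; the variable and function names are illustrative assumptions and not part of the original work, and `None` stands for an "unknown" count.

```python
# Counts copied from Table 1 as (problem, in KEGG, in BioMeta, corrected).
# None stands for an "unknown" count. All names here are illustrative.
CONSTITUTION = [
    ("Structure missing", 1239, 1106, 133),
    ("Valence violation(s)", 76, 0, 76),
    ("Incorrect constitution", None, None, 107),
]
STEREO = [
    ("Undefined stereo double bond(s)", 35, 32, 3),
    ("Invalid sp3 stereocenter(s)", 70, 47, 23),
    ("Ambiguous sp3 stereocenter(s)", 46, 0, 46),
    ("Undefined sp3 stereocenter(s)", 1398, 865, 533),
    ("Unspecified enantiomer", 2326, 1840, 486),
    ("Undefined sp3 stereochemistry", 554, 366, 188),
    ("Incorrect stereochemistry", None, None, 69),
]


def known_rows(rows):
    """Keep only rows whose KEGG and BioMeta counts are both known."""
    return [r for r in rows if r[1] is not None and r[2] is not None]


def column_sums(rows):
    """Sum the three count columns, treating unknown (None) counts as zero."""
    return tuple(sum(r[i] or 0 for r in rows) for i in (1, 2, 3))


# Every fully known row satisfies KEGG - BioMeta == corrected.
all_consistent = all(k - b == c for _, k, b, c in known_rows(CONSTITUTION + STEREO))

# The constitution rows sum to the reported total row (1315, 1106, 316).
constitution_sums = column_sums(CONSTITUTION)

# The stereochemistry rows sum to more than the reported total (3990, 2907, 1152),
# because a compound with several problems is counted in several rows.
stereo_sums = column_sums(STEREO)

# In each total row, corrected exceeds KEGG - BioMeta by exactly the
# corrected count of the row whose other columns are "unknown".
constitution_gap = 316 - (1315 - 1106)
stereo_gap = 1152 - (3990 - 2907)

print(all_consistent, constitution_sums, stereo_sums, constitution_gap, stereo_gap)
```

The two gaps equal the corrected counts of the "Incorrect constitution" and "Incorrect stereochemistry" rows, which is why the total rows would add up if the "unknown" entries were known.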