Table 1.
Size of KaBOB
subset | imported OBOs: # triples | imported OBOs: size .owl (GB) | ICE records: # triples | ICE records: size .nt.gzip (GB) | generated (rules and ID sets): # triples | generated (rules and ID sets): size .nt.gzip (GB) | KaBOB total: # triples | KaBOB total: size (GB) |
---|---|---|---|---|---|---|---|---|
human only | 13,830,676 | 1.5 | 144,489,737 | 2.0 | 7,615,547 | 0.2 | 165,935,960 | 3.6 |
human +7 major model organisms | 13,830,676 | 1.5 | 369,027,022 | 4.9 | 34,968,305 | 0.7 | 417,826,003 | 7.1 |
all organisms | 13,830,676 | 1.5 | 9,584,033,541 | 126 | n/a | n/a | n/a | n/a |
Sizes of the collections of RDF generated in the KaBOB build process, given as number of triples and size on disk. The first three major columns cover the imported OBOs, the ICE records (output of the file parsers), and the generated triples (output of the rules and ID merging). The fourth major column is the sum of the first three. The rows represent subsets of the KaBOB data, defined by the organisms included: human only, human plus seven major model organisms (listed in the paper), and all organisms combined. Because of the scale of the all-organisms subset, the data in its row are currently incomplete (n/a).
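
As a quick consistency check on the statement that the total is the sum of the first three major columns, the triple counts from the two complete rows can be added up. This is only a minimal sketch: the numbers are copied from the table, and the names `ROWS` and `total_triples` are hypothetical helpers introduced here for illustration, not part of KaBOB.

```python
# Triple counts copied from Table 1, per subset:
# (imported OBOs, ICE records, generated, reported KaBOB total).
# ROWS and total_triples are illustrative names, not KaBOB code.
ROWS = {
    "human only": (13_830_676, 144_489_737, 7_615_547, 165_935_960),
    "human +7 major model organisms": (13_830_676, 369_027_022, 34_968_305, 417_826_003),
}


def total_triples(obo, ice, generated):
    """Sum the three component triple counts for one subset."""
    return obo + ice + generated


for subset, (obo, ice, generated, reported) in ROWS.items():
    computed = total_triples(obo, ice, generated)
    status = "matches" if computed == reported else "differs from"
    print(f"{subset}: computed {computed:,} {status} the reported total")
```

The all-organisms row is left out because its generated and total entries are n/a. The on-disk sizes are given to at most one decimal place, so summing them may differ slightly from the reported total (for the human-only row, 1.5 + 2.0 + 0.2 = 3.7 against a reported 3.6), presumably because of rounding.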