Author manuscript; available in PMC: 2010 Jun 30.
Published in final edited form as: Chem Biodivers. 2010 May;7(5):1124–1141. doi: 10.1002/cbdv.200900317

Table 1.

Data integration challenges and solutions encountered when integrating genomic and post-genomic experiment data related to infectious disease research.

| Challenges | Solutions |
| --- | --- |
| Tolerance for identification of genes or proteins | Maintaining multiple interpretations of data |
| Ensuring completeness, validity, accuracy and currency of data | Formalizing data staging and review process; implementing multiple data quality controls |
| Managing multiple identifier schemes | Short term, use mapping service, but long term use reduced number of identifier schemes and equivalence mappings in Semantic Web |
| Data synchronization | Use access-on-demand APIs with canonical data structures when possible and bulk load only when APIs cannot meet requirements |
| Literature integration | Use APIs to query literature repositories; provide end users with "smart query" capabilities |
| Maximizing the use of limited post-genomic experimental evidence for gene and protein predictions | Treat experiments using orthologs as indirect evidence for gene and protein predictions |
| Usability | Employ usability engineering to complement and inform software development; include support for integrated information, data and tools, progressive filtering, and context sensitivity |
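
Two of the solutions in the table, identifier mapping and treating ortholog experiments as indirect evidence, can be illustrated together. The following is a minimal sketch rather than the authors' implementation. The scheme names, identifiers, ortholog relation, and the names `EQUIVALENT_IDS`, `ORTHOLOGS`, `Experiment`, `canonical_id`, and `classify_evidence` are all hypothetical placeholders introduced for illustration.

```python
from dataclasses import dataclass

# Hypothetical equivalence mappings: (scheme, identifier) -> canonical gene key.
# Every identifier below is a made-up placeholder, not a real accession.
EQUIVALENT_IDS = {
    ("schemeA", "A0001"): "gene-1",
    ("schemeB", "B-17"): "gene-1",
    ("schemeA", "A0002"): "gene-2",
}

# Hypothetical ortholog relation between canonical gene keys.
ORTHOLOGS = {"gene-1": {"gene-2"}, "gene-2": {"gene-1"}}


@dataclass(frozen=True)
class Experiment:
    name: str
    scheme: str       # identifier scheme used by the data source
    identifier: str   # gene identifier as reported by that source


def canonical_id(scheme, identifier):
    """Resolve a source-specific identifier to a canonical key, or None if unmapped."""
    return EQUIVALENT_IDS.get((scheme, identifier))


def classify_evidence(target_gene, experiments):
    """Split experiments into direct evidence (same gene) and
    indirect evidence (an ortholog of the target gene)."""
    direct, indirect = [], []
    for exp in experiments:
        gene = canonical_id(exp.scheme, exp.identifier)
        if gene is None:
            continue  # unmapped identifiers are skipped in this sketch
        if gene == target_gene:
            direct.append(exp.name)
        elif gene in ORTHOLOGS.get(target_gene, set()):
            indirect.append(exp.name)
    return {"direct": direct, "indirect": indirect}


experiments = [
    Experiment("exp1", "schemeB", "B-17"),   # maps to gene-1
    Experiment("exp2", "schemeA", "A0002"),  # maps to gene-2, an ortholog of gene-1
    Experiment("exp3", "schemeC", "X"),      # no mapping available
]
result = classify_evidence("gene-1", experiments)
```

In this sketch, an experiment reported under a different identifier scheme still counts as direct evidence once its identifier resolves to the same canonical key, while an experiment on an ortholog is kept but labeled as indirect.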