Author manuscript; available in PMC: 2011 Aug 10.
Published in final edited form as: J Biomed Inform. 2009 Mar 24;42(5):824–830. doi: 10.1016/j.jbi.2009.03.009

Figure 2.


Using the PubMed abstract database and the hypergeometric test to evaluate terms for ontology development. A) X is the number of abstracts that contain both a term for ontology development (the putative ontology term to be evaluated) and the domain term; M and N are the numbers of abstracts with and without the domain term, respectively; K is the number of abstracts with the putative ontology term. M+N is the total number of PubMed abstracts examined. For example, to evaluate whether hexokinase activity is an appropriate term for developing an ontology for monosaccharide metabolism, X is the number of PubMed abstracts that contain both hexokinase activity and monosaccharide metabolism. B) The equation shows the probability of observing X under the hypergeometric formulation, where $C_a^b = \frac{a!}{b!\,(a-b)!}$ is the binomial coefficient.
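The equation in panel B is part of the figure image and is not reproduced in this text. As a minimal sketch, assuming the standard hypergeometric point probability built from the caption's symbols (choose X of the M domain abstracts and K-X of the N non-domain abstracts, out of all ways to choose K from M+N), the calculation could look like the following. The function names, the upper-tail p-value, and the toy counts are illustrative assumptions, not taken from the paper.

```python
from math import comb


def hypergeom_pmf(x, M, N, K):
    """Probability of exactly x co-occurring abstracts.

    M: abstracts containing the domain term
    N: abstracts not containing the domain term
    K: abstracts containing the putative ontology term
    """
    # x must lie in the feasible range of the hypergeometric distribution.
    if x < max(0, K - N) or x > min(M, K):
        return 0.0
    return comb(M, x) * comb(N, K - x) / comb(M + N, K)


def enrichment_p_value(x, M, N, K):
    """Assumed upper-tail test: probability of x or more co-occurrences."""
    upper = min(M, K)
    return sum(hypergeom_pmf(i, M, N, K) for i in range(x, upper + 1))


# Toy counts for illustration only (not real PubMed numbers).
M, N, K, X = 50, 950, 40, 10
print(hypergeom_pmf(X, M, N, K))
print(enrichment_p_value(X, M, N, K))
```

Exact integer binomials are fine for small counts like these; at the scale of the full PubMed database, a log-space computation or a statistics library would likely be more practical.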