Figure 3.
The hypergeometric test can be applied to identify the children of a particular Biological Process ontology term in the well-developed Gene Ontology. A) The occurrence of child terms of “Monosaccharide Metabolic Process” in PubMed abstracts. The X axis is the frequency of each term across all PubMed abstracts, on a logarithmic scale. The Y axis is the number of terms at a particular frequency. Terms are binned by frequency. B) The precision/recall curve for the overrepresentation of all children of the parent term “Monosaccharide Metabolic Process” in the abstracts that contain the parent term. This curve indicates that the hypergeometric test or Weirdness Metric can identify the children (ontology terms) of the parent term (domain), and that the result of this computational method is consistent with the Gene Ontology created by domain experts.
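The overrepresentation test named in the caption can be illustrated with a minimal sketch. It assumes one particular setup, which may differ from the authors' exact procedure: N is the total number of abstracts, K is the number of abstracts containing a candidate term, n is the number of abstracts containing the parent term, and k is the number containing both. The function name and the counts are hypothetical and chosen only for illustration.

```python
from math import comb


def hypergeom_upper_tail(k, N, K, n):
    """Return P(X >= k) for X ~ Hypergeometric(N, K, n).

    N: total abstracts (population size)
    K: abstracts containing the candidate term
    n: abstracts containing the parent term (the sample)
    k: abstracts containing both terms
    """
    upper = min(K, n)
    # Sum the probability mass from k up to the largest possible overlap.
    favourable = sum(
        comb(K, i) * comb(N - K, n - i) for i in range(max(k, 0), upper + 1)
    )
    return favourable / comb(N, n)


# Toy counts (hypothetical): 10 abstracts, 3 mention the candidate term,
# 4 mention the parent term, and 2 mention both.
p_value = hypergeom_upper_tail(2, 10, 3, 4)
print(p_value)
```

A small p-value means the candidate term co-occurs with the parent term more often than chance would predict. Ranking candidate terms by this p-value and comparing the ranking against the known children is one way a precision/recall curve such as the one in panel B could be produced.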