Fig. 2. CNLP can extract a more detailed phenome than manual EHR review or OMIM clinical synopsis.
(A) Example CNLP of a sentence from the EHR of an 8-day-old baby (patient 341) with maple syrup urine disease, showing four extracted HPO terms. ED, emergency department. (B) Hierarchical display of HPO phenotypic features extracted by manual review of the EHR of neonate 341 and by CNLP (red) and expected phenotypic features (from the OMIM Clinical Synopsis; blue). Yellow circles: Phenotypic features extracted by both CNLP and expert review. Purple circles: Phenotypic overlap between CNLP and OMIM. Gray circles: The location of parent terms of identified phenotypic features within the HPO hierarchy. The information content (IC) was defined by IC(phenotype) = −log(p_phenotype), where p_phenotype was the probability of observing the exact term or one of its subclasses across all diseases in OMIM. IC increases from top (general) to bottom (specific).
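The IC definition in the caption can be illustrated with a minimal sketch. Everything in it beyond the formula is an assumption made for illustration: the toy hierarchy and disease annotations are hypothetical stand-ins for HPO and OMIM, the probability is taken to be the fraction of diseases annotated with the term or one of its subclasses, and the natural logarithm is used because the caption does not state a base.

```python
import math

# Hypothetical toy hierarchy (child -> parents); a stand-in for HPO, not real HPO data.
PARENTS = {
    "Phenotypic abnormality": [],
    "Abnormality of metabolism": ["Phenotypic abnormality"],
    "Ketosis": ["Abnormality of metabolism"],
    "Abnormality of the nervous system": ["Phenotypic abnormality"],
    "Lethargy": ["Abnormality of the nervous system"],
}

# Hypothetical disease annotations; a stand-in for OMIM, not real OMIM data.
DISEASE_ANNOTATIONS = {
    "disease_A": {"Ketosis", "Lethargy"},
    "disease_B": {"Lethargy"},
    "disease_C": {"Abnormality of metabolism"},
    "disease_D": {"Ketosis"},
}


def ancestors_and_self(term):
    """Return the term together with all of its ancestors in the hierarchy."""
    seen = set()
    stack = [term]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(PARENTS[current])
    return seen


def information_content(term):
    """IC(term) = -log(p), with p assumed to be the fraction of diseases
    annotated with the term itself or with any of its subclasses."""
    total = len(DISEASE_ANNOTATIONS)
    matching = sum(
        1
        for annotated in DISEASE_ANNOTATIONS.values()
        if any(term in ancestors_and_self(a) for a in annotated)
    )
    if matching == 0:
        return math.inf  # term never observed, so p = 0
    # log(total / matching) equals -log(matching / total)
    return math.log(total / matching)


for name in PARENTS:
    print(f"{name}: IC = {information_content(name):.3f}")
```

In this toy data the root term is matched by every disease and so has an IC of 0, while more specific terms such as the hypothetical "Ketosis" entry are matched by fewer diseases and score higher, mirroring the top (general) to bottom (specific) ordering in panel B.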