Author manuscript; available in PMC: 2013 Jun 23.
Published in final edited form as: Sci Transl Med. 2011 Apr 20;3(79):79re1. doi: 10.1126/scitranslmed.3001807

Table 2.

Data categories for defining a clinical gold standard, an EMR-derived phenotype, and covariates and exclusion criteria.

| Primary Phenotype | Data Categories: Clinical gold standard | Data Categories: EMR-derived phenotype | Data Categories: Phenotype cohort (e.g., covariates, exclusion criteria) | Phenotyping Methods |
| --- | --- | --- | --- | --- |
| Dementia | Demographics, clinical notes (clinician documentation of mental status and histopathological examination data) | Diagnoses, medications | Demographics, laboratory tests, radiology reports | Structured data extraction, free-text searches, manual chart review |
| Cataracts | Clinical notes (ophthalmologic examination) | Diagnoses, procedure codes | Demographics, medications | Structured data extraction, NLP, Intelligent Character Recognition |
| Peripheral Arterial Disease | Radiology test results (ankle-brachial index or arteriography) | Diagnoses, procedure codes, medications, radiology test results | Demographics | Structured data extraction, NLP |
| Type 2 Diabetes | Laboratory tests | Diagnoses, laboratory tests, medications | Demographics, laboratory tests, height, weight, family history, smoking history | Structured data extraction, free-text searches |
| Cardiac Conduction | ECG measurements | ECG report results | Demographics, diagnoses, procedure codes, medications, laboratory tests | Structured data extraction, NLP |