2020 Mar 4;2019:755–764.

Table 1.

Phenotype algorithm workflow model

Domain	Step	Description	Potential Challenges to Portability
Data	Data Collection	The processes by which data is collected within the source EHR, and its intended purpose.	Only data that is collected can be used for electronic phenotyping. How data is collected at a local institution (vocabulary used, frequency of collection, etc.) determines how that institution authors a phenotype algorithm. Modality of data collection (e.g., structured, narrative text, images) can affect how and whether the data is used in executing a phenotype algorithm.
Data	Data Preparation	Extract-Transform-Load (ETL) processes through which data is consolidated into an integrated data repository (IDR).	The need to transform the shape of the data from an IDR into a common data model (CDM). Effort to convert data from one modality to another (e.g., natural language processing to obtain structured results). Mapping of local terms to standard vocabulary terms (national standard or prescribed by CDM), and potential lossy mappings or semantic drift.
Authoring	Define Value Sets	Identifying the medical terms that are used to represent data elements within the phenotype algorithm logic.	Not all terminologies/vocabularies are fully implemented at all institutions. Value sets may list all codes, or may list codes at the top level of a hierarchy that need to be expanded.
Authoring	Define Logic	Create a representation of the required data elements, and how the elements are related by different operators (e.g., Boolean, temporal) to create a phenotype algorithm.	The modality of the logic representation (narrative, intermediate representation, programming language), and what system(s) may understand it. Strictness of the logic, considering local instead of broader data availability.
Implementation	Distribution	The mechanism by which a phenotype algorithm is transmitted from the author to an implementing site.	Automated vs. manual approach. Policies that require human review and approval before execution.
Implementation	Translation	How the phenotype algorithm is converted into an executable representation that may be directly applied to the institutional data model.	Automated vs. manual approach. Technology-specific customizations (e.g., database schema names, table names). Information loss when elements of a data model do not have a direct translation or differ in granularity.
Implementation	Execution	The computation process by which the executable representation is applied to an institutional data warehouse, and results are retrieved.	Syntax errors that require human intervention and correction.
Implementation	Validation	A formal or informal comparison of the execution results against a reference standard.	Lack of detailed information concerning the inclusion and exclusion implications across multiple phenotype algorithm steps. Lack of access to source data to evaluate results.
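
Two of the challenges in Table 1 can be made concrete with a small sketch: expanding a value set that lists only top-level codes of a hierarchy (Define Value Sets), and flagging local terms that have no standard mapping (Data Preparation). The example below is illustrative only. The hierarchy, the codes, the local terms, and the function names `expand_value_set` and `map_local_terms` are hypothetical placeholders, not content from any real terminology or from the phenotype algorithms discussed here.

```python
from collections import deque

# Hypothetical parent -> children hierarchy. The codes are made-up
# placeholders and do not come from any real vocabulary.
HIERARCHY = {
    "DM": ["DM1", "DM2"],
    "DM2": ["DM2.0", "DM2.1"],
}

# Hypothetical local-term -> standard-code mapping produced during ETL.
LOCAL_TO_STANDARD = {
    "diabetes type 2": "DM2",
    "t2dm w/o complication": "DM2.0",
}


def expand_value_set(top_level_codes, hierarchy):
    """Return the given codes together with all of their descendants."""
    expanded = set()
    queue = deque(top_level_codes)
    while queue:
        code = queue.popleft()
        if code in expanded:
            continue  # already visited; also guards against cycles
        expanded.add(code)
        queue.extend(hierarchy.get(code, []))
    return expanded


def map_local_terms(local_terms, local_to_standard):
    """Split local terms into mapped ones and ones needing manual review."""
    mapped = {}
    unmapped = []
    for term in local_terms:
        if term in local_to_standard:
            mapped[term] = local_to_standard[term]
        else:
            unmapped.append(term)
    return mapped, unmapped


if __name__ == "__main__":
    value_set = expand_value_set(["DM"], HIERARCHY)
    print(sorted(value_set))

    mapped, unmapped = map_local_terms(
        ["diabetes type 2", "sugar disease"], LOCAL_TO_STANDARD
    )
    print(mapped)
    print(unmapped)  # terms with no standard mapping
```

A site whose data contains only leaf-level codes would match nothing against an unexpanded value set, so expansion has to happen before execution. The list of unmapped terms offers one simple way to surface potential lossy mappings for human review.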