Figure 2.
An example that illustrates the process of extracting interventions from ClinicalTrials.gov (through AACT) and creating a unique active pharmaceutical ingredient record in CDEK. Curation begins by extracting the intervention names from trials containing active pharmaceutical ingredients and cleaning the names to strip any superfluous text (e.g. dosing amount, dosing frequency). Once cleaning is complete, an automated program flags entities that should be merged into a single CDEK record using a set of 'merging' criteria. The curation software also flags entities that are made up of two or more active pharmaceutical ingredients using a set of 'splitting' criteria (e.g. the drug 'Mavyret' is a combination of two active pharmaceutical ingredients, glecaprevir and pibrentasvir, used to treat hepatitis C). A unique CDEK active pharmaceutical ingredient record is then created and assigned a unique ID, a type, and a preferred name. All names are stored as synonyms and all trials are linked to the unique active pharmaceutical ingredient ID. Finally, several external databases are cross-referenced to pull metadata and provide hyperlinks to more information about that active pharmaceutical ingredient. This metadata is also used to flag entries that should be merged into a single active pharmaceutical ingredient.
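The clean, split, and merge steps described in this caption can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the CDEK curation software: the regular expressions, the `KNOWN_COMBINATIONS` lookup, the function names, the record layout, and the trial identifiers are all hypothetical, and the actual merging and splitting criteria are not specified in the caption.

```python
import re

# Hypothetical patterns for dosing text; real curation rules are likely richer.
DOSE_AMOUNT = re.compile(r"\b\d+(?:\.\d+)?\s*(?:mg|mcg|g|ml|iu)\b", re.IGNORECASE)
DOSE_FREQUENCY = re.compile(
    r"\b(?:once|twice)\s+(?:daily|weekly)\b|\b(?:qd|bid|tid)\b", re.IGNORECASE
)

# Hypothetical lookup of brand names that stand for several ingredients.
KNOWN_COMBINATIONS = {"mavyret": ["glecaprevir", "pibrentasvir"]}


def clean_name(raw):
    """Strip dosing amount and frequency, collapse whitespace, lowercase."""
    name = DOSE_AMOUNT.sub(" ", raw)
    name = DOSE_FREQUENCY.sub(" ", name)
    return re.sub(r"\s+", " ", name).strip(" ,;").lower()


def split_ingredients(name):
    """Assumed 'splitting' criteria: known combinations, then '/', '+', 'and'."""
    if name in KNOWN_COMBINATIONS:
        return list(KNOWN_COMBINATIONS[name])
    parts = re.split(r"\s*(?:/|\+|\band\b)\s*", name)
    return [part for part in parts if part]


def build_records(interventions):
    """Assumed 'merging' criterion: identical cleaned ingredient names.

    interventions is an iterable of (trial_id, raw_intervention_name) pairs.
    """
    records = {}
    for trial_id, raw in interventions:
        ingredients = split_ingredients(clean_name(raw))
        for ingredient in ingredients:
            if ingredient not in records:
                records[ingredient] = {
                    "id": len(records) + 1,
                    "preferred_name": ingredient,
                    "synonyms": set(),
                    "trials": set(),
                }
            record = records[ingredient]
            record["trials"].add(trial_id)
            # Keep the raw name as a synonym only for single-ingredient entities.
            if len(ingredients) == 1:
                record["synonyms"].add(raw)
    return records


# Made-up trial identifiers and intervention names, for illustration only.
sample = [
    ("NCT00000001", "Glecaprevir 300 mg once daily"),
    ("NCT00000002", "Mavyret"),
    ("NCT00000003", "glecaprevir/pibrentasvir 100 mg"),
]
records = build_records(sample)
```

With this sample input, all three trials end up linked to a single glecaprevir record, and the two combination entries are also linked to a separate pibrentasvir record. Merging on metadata pulled from external databases, as mentioned at the end of the caption, is not covered by this sketch.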