AMIA Summits on Translational Science Proceedings. 2025 Jun 10;2025:527–536.

A Standardized Guideline for Assessing Extracted Electronic Health Records Cohorts: A Scoping Review

Nattanit Songthangtham 1, Ratchada Jantraporn 2, Elizabeth Weinfurter 3, Gyorgy Simon 1, Wei Pan 4, Sripriya Rajamani 1,2, Steven G Johnson 1
PMCID: PMC12150730  PMID: 40502227

Abstract

Assessing how accurately a cohort extracted from Electronic Health Records (EHR) represents the intended target population, or cohort fitness, is critical but often overlooked in secondary EHR data use. This scoping review aimed to (1) identify guidelines for assessing cohort fitness and (2) determine their thoroughness by examining whether they offer sufficient detail and computable methods for researchers. The review followed the JBI guidance for scoping reviews and was refined based on the Preferred Reporting Items for Systematic reviews and Meta-Analyses extension for scoping reviews (PRISMA-ScR) checklist. Searches were performed in Medline, Embase, and Scopus. From 1,904 results, 30 articles and 2 additional references were reviewed. Nine articles (28.13%) include a framework for evaluating cohort fitness, but only 5 (15.63%) contain sufficient detail and quantitative methodologies. Overall, a more comprehensive guideline that provides best practices for measuring cohort fitness is still needed.

Introduction

EHR-based cohorts for retrospective study designs are created by extracting the relevant patient data from a database. The term “Electronic Health Records (EHR) cohort” or “EHR-based cohort” typically refers to a group of individuals whose health information is collected and stored electronically within a healthcare system’s EHR system1–3. A cohort can be thought of as a group of individuals who share a common characteristic or experience and are studied over a period of time to observe outcomes or trends related to that characteristic or experience. The purpose of creating a cohort is to have a dataset that is representative of the target population of the study and to provide results that are generalizable4,5. Patient selection from Electronic Health Records should be data-driven and grounded in sound clinical and scientific reasoning. Although the process of building a cohort for an EHR-based retrospective study is intricate, it is essential to research and warrants careful attention to ensure accurate and meaningful results6.

Building a cohort from EHR data is complex, especially when researchers can only work with what is available in the system. Varying disease descriptions can lead to different patient groups, highlighting the importance of standardized phenotype definitions to ensure accuracy and reproducibility7,8. Inclusion and exclusion criteria are key to shaping the cohort, but the abundance of codes such as ICD, CPT, LOINC, and SNOMED9–14 for a single disease can complicate this process8,15. Researchers may combine different codes, but without expert knowledge or clear guidelines, they risk including false positives or negatives. Additionally, balancing cohort specificity and result generalizability is essential. Sample selection based on record completeness can introduce bias, excluding patients with less data and potentially over-generalizing findings16–18.
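To make the effect of code-set choices concrete, the hypothetical Python sketch below applies a narrow and a broad diagnosis-code list to the same toy encounter data; the codes, field names, and patients are illustrative assumptions rather than a validated phenotype definition, and the point is only that the two lists select different cohorts.

```python
# Hypothetical illustration: two code lists for the "same" condition select
# different cohorts from the same (toy) encounter data. All codes, patients,
# and field names are illustrative assumptions, not a validated phenotype.

encounters = [
    {"patient_id": 1, "dx_code": "E11.9"},   # type 2 diabetes, unspecified
    {"patient_id": 2, "dx_code": "E11.65"},  # type 2 diabetes with hyperglycemia
    {"patient_id": 3, "dx_code": "R73.03"},  # prediabetes
    {"patient_id": 4, "dx_code": "E10.9"},   # type 1 diabetes
]

# Narrow definition: only codes beginning with E11 (type 2 diabetes).
narrow_codes = ("E11",)
# Broad definition: also admits abnormal-glucose codes, risking false positives.
broad_codes = ("E11", "R73")

def build_cohort(rows, code_prefixes):
    """Return the set of patient_ids with at least one qualifying diagnosis code."""
    return {r["patient_id"] for r in rows if r["dx_code"].startswith(code_prefixes)}

narrow_cohort = build_cohort(encounters, narrow_codes)
broad_cohort = build_cohort(encounters, broad_codes)

print("Narrow cohort:", sorted(narrow_cohort))                     # [1, 2]
print("Broad cohort:", sorted(broad_cohort))                       # [1, 2, 3]
print("Added by broad definition:", sorted(broad_cohort - narrow_cohort))  # patient 3 may be a false positive
```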

Creating a cohort that represents the study target population is challenging, even for experienced researchers. Ostropolets et al.19 asked nine research teams to reproduce an observational study, aiming to recreate the target cohort. The teams were unable to create consistent cohorts, deviating from the master implementation on at least four criteria. The authors noted that eligibility criteria based on phenotypes often require further refinement, leading to investigator-induced errors. Despite using the same tools, individual logic affected cohort creation, suggesting that knowledge of data tables, columns, and elements is insufficient. Researchers need detailed guidelines that aid in creating consistent and valid cohorts.

According to Gatto et al.20, identifying study designs and data that are fit for the research purpose helps ensure validity, transparency, and reliability. The fitness of an extracted cohort can be defined as the extent to which the collected data represent the target population, or the alignment between cohort characteristics and the intended target population21. An unfit cohort has downstream effects on analyses as well as on results interpretation, generalizability, and implementation16. Therefore, ensuring cohort fitness by carefully selecting and validating study cohorts from EHR data is essential for generating reliable and meaningful insights from secondary analyses.
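One simple way to operationalize this alignment, shown in the Python sketch below, is to compare summary characteristics of the extracted cohort against external benchmarks for the target population; the benchmark values, traits, and the 0.1 flagging threshold are illustrative assumptions and are not drawn from any of the reviewed guidelines.

```python
import math

# Hypothetical sketch: quantify how far an extracted cohort drifts from the
# intended target population on a few summary characteristics. The benchmark
# proportions and the 0.1 threshold are illustrative assumptions only.

target = {"female": 0.52, "age_over_65": 0.30, "diabetes": 0.11}   # target-population proportions
cohort = {"female": 0.44, "age_over_65": 0.47, "diabetes": 0.13}   # proportions in the extracted cohort

def standardized_difference(p_cohort, p_target):
    """Standardized difference between two proportions (pooled-variance form)."""
    pooled_var = (p_cohort * (1 - p_cohort) + p_target * (1 - p_target)) / 2
    return (p_cohort - p_target) / math.sqrt(pooled_var)

for trait in target:
    d = standardized_difference(cohort[trait], target[trait])
    flag = "review" if abs(d) > 0.1 else "ok"
    print(f"{trait:12s} target={target[trait]:.2f} cohort={cohort[trait]:.2f} std diff={d:+.2f} ({flag})")
```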

To date, manual chart review is still considered the gold standard for validating a cohort; however, it may not be feasible given the large amount of information available in the EHR system. Most importantly, this method does not answer the question of how researchers can know that the cohort they extract from the EHR system reflects their intended target population, or whether the cohort “fits”. To our knowledge, no tool exists that researchers can use to test the fitness of a cohort. The detailed steps of cohort creation are generally not disclosed by researchers at publication. Therefore, there is no sure way for independent researchers to assess whether the cohorts used in a study are actually suitable for its research questions.

The purpose of this scoping review is to search and evaluate the current state of the literature in order to identify any existing guidelines or frameworks that can be used to verify the fitness of a cohort extracted from the EHR. Such guidelines, frameworks, or tools should provide quantitatively computable methods and disclose sufficient detail for a researcher to implement them. The scoping review also describes aspects of EHR data usage for which guidance is already available and identifies any gaps that may still exist. The information from this scoping review will provide a foundation for creating a new and improved guideline that will help researchers assess cohort fitness. For the rest of the paper, the terms guideline and framework are used interchangeably.

Methods

Database search

The methodological framework described by Arksey and O’Malley22 was used to guide our search and evaluation of the literature. The protocol was developed using the JBI guidance for scoping reviews23 and refined based on the Preferred Reporting Items for Systematic reviews and Meta-Analyses extension for scoping reviews checklist24. The completed protocol25 for the scoping review was registered with OSF on November 27, 2023. Final database searches to identify relevant literature were constructed by an experienced health sciences librarian (EW) and conducted in Medline via Ovid, Embase via Ovid, and Scopus on December 1, 2023. Records were imported into the Covidence systematic review software26 for deduplication and management of the screening and review process. Two references from the FDA27 and the European Medicines Agency28 were added independently.

Search terms

In October and November of 2023, preliminary searches were conducted to identify relevant terms and to pilot test approaches to the question. The basic search approach was to combine three concept groups: cohort selection AND electronic health records AND standards. The research team discussed different result sets and validated iterations of the search strategy against articles known to fit these criteria, refining terms to reflect the most relevant aspects of the question. A combination of subject headings and keywords was used to diversify the search methods and increase reliability of retrieval, and the search syntax was optimized for each individual database. Additionally, reference lists of included studies and related reviews were searched to identify additional potentially eligible studies. Search results were limited to the English language and publication years 2013-2023, because the FDA began supporting the use of real-world evidence in 2018 and because practices before 2013 were considered likely to be outdated. The full search terms and strategies are presented in Table 1.

Table 1.

Full Terms and Strategies for Database Search

Search terms | Number of Articles
Ovid MEDLINE(R) All
1) exp Research/ or exp Data Collection/ or exp Research Design/ or exp Epidemiologic Methods/ or ((patient* adj2 select*) or assess* or evaluat* or cohort* or (data adj3 (collect* or quality))).mp. | 12,704,103
2) limit 1 to (english language and yr="2013 -Current") | 6,053,814
3) ((routinely adj1 collected) or (real world or noninterventional or non-interventional or postintervention* or post-intervention* or observational or EHR or EHRs or EMR or EMRs or electronic health record* or electronic medical record*)).ti. | 92,101
4) (guideline* or standard* or reporting or best practi* or framework* or template* or solution* or propos* or reproducib* or replicat*).ti. | 562,516
5) 2 and 3 and 4 | 1,583
Scopus
( TITLE ( ( routinely W/1 collected ) OR ( "real world" OR noninterventional OR "non-interventional" OR postintervention* OR "post-intervention*" OR observational OR ehr OR ehrs OR emr OR emrs OR "electronic health record*" OR "electronic medical record*" ) ) AND TITLE ( ( guideline* OR standard* OR reporting OR "best practi*" OR framework* OR template* OR solution* OR propos* OR reproducib* OR replicat* ) ) AND TITLE-ABS-KEY ( ( patient* W/2 select* ) OR assess* OR evaluat* OR cohort* OR ( data W/3 ( collect* OR quality ) ) ) ) AND PUBYEAR > 2012 AND PUBYEAR < 2024 AND ( LIMIT-TO ( DOCTYPE , "ar" ) ) AND ( LIMIT-TO ( LANGUAGE , "English" ) ) | 1,190
Embase
1) exp Research/ or exp Methodology/ or exp Epidemiology/ or ((patient* adj2 select*) or assess* or evaluat* or cohort* or (data adj3 (collect* or quality))).mp. | 17,686,654
2) limit 1 to (english language and yr="2013 -Current") | 9,531,313
3) ((routinely adj1 collected) or (real world or noninterventional or non-interventional or postintervention* or post-intervention* or observational or EHR or EHRs or EMR or EMRs or electronic health record* or electronic medical record*)).ti. | 138,496

Study Selection

Papers selected for this scoping review were published between 2013 and 2023 and only by health-related organizations, which may include research institutions, academic institutions, or hospitals. Abstracts had to mention electronic medical records, electronic health records, or observational data. Papers were excluded if they were not written in English. Dissertations and conference proceedings were also excluded, as were clinical trials, randomized controlled trials, and grey literature. Screening of both titles/abstracts and full-text papers was conducted by two independent reviewers [NS, RJ]. Papers on which the reviewers' assessments diverged were discussed until consensus was reached to either exclude them or advance them to full-text screening.

Data Extraction

Data extracted from Covidence include the names of the authors, institutional affiliations, year of publication, where the study was conducted, objectives of the publication, journal of publication, how the guideline/framework was created, description of the guideline, the key goal of the guideline, and key findings related to the scoping review question. All extracted data were recorded in an Excel sheet.

Main Topics Covered by Guidelines

The topics identified in the included guidelines are compared to the best practices and steps for using healthcare data in studies recommended in Artificial Intelligence and Machine Learning in Health Care and Medical Sciences: Best Practices and Pitfalls29. Although these steps do not directly provide methods for researchers to evaluate the fitness of the cohort, they describe a series of actions necessary before the data should be analyzed. The steps in the life cycle show that poor handling of data can affect downstream modeling work as well as the results generated from these models. For this review, only the best practices related to data design are of interest. The topics include: Study Design, Data Source Selection, Data Collection, Data Extraction, Data Cleaning, Cohort Formation, Data Quality, Reporting and Documentation, Reproducibility, Comprehensive Regulatory Guideline, and Cohort Fitness.

Guideline Evaluation

Even when papers met the search criteria and passed full-text review, the crucial step in assessing the included frameworks is judging how closely each guideline aligns with the objective of this scoping review. Based on that objective, the desired guidelines, frameworks, or tools must contain enough detail to guide users through the recommended process and must allow for quantitative assessment of cohort fitness. Such guidelines are categorized as “Detailed Quantitative”. Detailed guidelines that instead offer qualitative assessment methods are categorized as “Detailed Qualitative”. Guidelines that include recommendations without detailed guidance are categorized as “Brief”. Any other guidelines are categorized as “Not Applicable”.

Results

The scoping review took place between October 2023 and January 2024. Overall, 4,232 papers and 2 references from other sources were imported for screening. After deduplication, 1,904 articles were screened by title and abstract, yielding 47 articles for full-text review. Articles were excluded for several reasons, including: methods based on machine learning or artificial intelligence (n=2), insufficient guideline description (n=12), and a guideline still in development (n=1). As a result, 30 articles and 2 additional references were used for data extraction.
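The screening counts reported above can be checked for internal consistency; the short sketch below simply reproduces the arithmetic from the text (category labels are taken from the paragraph, and nothing beyond the reported numbers is assumed).

```python
# Arithmetic check of the screening and inclusion counts reported above.
imported  = 4232 + 2        # database records plus two independently added references
screened  = 1904            # titles/abstracts screened after deduplication
full_text = 47              # advanced to full-text review
excluded  = 2 + 12 + 1      # ML/AI-based methods, insufficient description, still in development
included  = full_text - excluded

assert included == 30 + 2   # 30 articles plus the 2 added references
print(f"{included} included; cohort-fitness papers: 9/{included} = {100 * 9 / included:.3f}%")
```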

Article characteristics/general description of included studies

The included studies were published between 2015 and 2023, with most publications occurring in 2023 (n=7; 21.88%)20,30–35, followed by 2016 (n=6; 18.75%)36–41 and 2020 (n=4; 12.50%)30–33. Most sources (n=16; 50.00%)20,27,35,36,38,39,42–51 are identified as frameworks by their authors. The remaining sources were identified as checklists52,53 (n=2; 6.25%), reporting guidelines32,54 (n=2; 6.25%), and templates55,56 (n=2; 6.25%), while the rest are mixtures of different types of guideline documents and tools. The included references came from the United States20,27,30,33,35–43,45,47,48,51,55,56 (n=20; 62.50%), the United Kingdom34,46,54,57 (n=4; 12.50%), Canada52,53 (n=2; 6.25%), Switzerland32 (n=1; 3.13%), South Korea49 (n=1; 3.13%), and collaborations between multiple nations28,50,56 (n=3; 9.38%). Five articles20,41,44,47,48 were published in Clinical Pharmacology and Therapeutics (15.63%), and four papers36,38,39,45 came from eGEMs (12.50%). Three papers52,54,55 were published in BMJ and two papers30,34 in JAMIA. One paper was published in each of the following journals: BMC Pediatrics40, Pharmacoepidemiology and Drug Safety56, BioData Mining57, Psychiatric Services43, International Journal of Medical Informatics42, Annals of Internal Medicine58, Journal of Clinical and Translational Science46, Drug Safety33, Lancet Digital Health50, PLOS Medicine53, Annals of Oncology32, Therapeutic Innovation and Regulatory Science51, Health Services and Outcomes Research Methodology31, Journal of Biomedical Informatics37, and Information Systems35. Non-journal sources include the European Medicines Agency28 and the U.S. Food and Drug Administration27.

Topics Covered by Guidelines

While only 9 of 32 papers directly addressed cohort fitness, all included papers contribute to using EHR data effectively. Topics were extracted based on the objectives of the frameworks stated in each paper. The most common topics after cohort fitness were data quality, reporting/documentation, and cohort formation (n=8; 25.00% each), followed by data source selection (n=7; 21.88%). Other topics included data extraction (n=6; 18.75%), reproducibility (n=5; 15.63%), study design (n=3; 9.38%), data cleaning (n=3; 9.38%), comprehensive regulation (n=3; 9.38%), and data collection (n=1; 3.13%). These guidelines cover various aspects of EHR data use, from selection and extraction to reporting and reproducibility. Table 2 presents the main topic areas directly covered.

Table 2:

Guidelines and Topics Addressed.

Guideline SDa SSb Dcoc Dexd Dcle CFf DQg RDh RPi CGj Cfitk
ACE-IT33   X       X         X
ATRAcTR31   X         X        
Callahan 202058               X     X
Castillo 201543 X X   X              
CODE-EHR50                   X  
CONSORT-ROUTINE52               X      
DAQCORD 46   X         X X      
DataGauge45       X X X X       X
Denaxas 201757                 X    
Dziadkowiec 201638         X   X        
EnCePP (rev 11)28                   X  
ESMO-GROW32               X      
FDA’s RWE Framework27                   X  
Generalizability Score42           X         X
GIST 2.037       X   X         X
Graul 202334           X          
Han 202249     X     X         X
Haneuse 201639           X          
HARPER56               X X    
Kahn 201636             X        
KEEPER30       X             X
Knake 201640       X              
Miao 202335       X X X X       X
MVET criteria41                 X    
PICOTS51   X                  
RECORD53               X      
RECORD-PE54               X      
SPACE47 X               X    
SPIFD48   X         X        
SPIFD220 X X         X        
STaRT-RWE55               X X    
The Certainty Framework44                     X

aSD = Study Design, bSS = Data Source Selection, cDCo = Data Collection, dDEx = Data Extraction, eDCl = Data Cleaning, fCF = Cohort Formation, gDQ = Data Quality, hRD = Reporting and documentation, iRP = Reproducibility, jCG = Comprehensive regulatory guideline, kCFit = Cohort Fitness

Study design guidelines included in this review provide checklists emphasizing design structures, research objectives, and methodological rigor. Castillo et al.43, SPACE47, and SPIFD220 offer outlines and checklists that focus on design structures, research objectives, and assessment of methodological rigor. Three additional frameworks take a more comprehensive, regulatory-oriented approach to study design. CODE-EHR50 provides best practices for minimum information in research, focusing on documentation and reporting. The EnCePP guide28 and the FDA framework27, developed by government organizations, offer a broad perspective. While these comprehensive guidelines cover much of the life cycle, they lack detailed methodologies for each step of the cycle.

Frameworks for data source selection emphasize the relevance and quality of sources, with several studies discussing frameworks for data extraction either as standalone processes or within broader frameworks. For example, PICOTS51 aids researchers in evaluating the suitability of data sources for specific research aims, while DAQCORD46 provides guidance on identifying appropriate sources and addressing ethical concerns in data acquisition. Han et al.49 developed a systematic framework for real-world data (RWD) collection within the Korean health system, specifically targeting cancer drug safety and effectiveness. A framework by Knake et al.40 promotes consistency in data extraction, though it does not provide specific methodologies. The KEEPER profile30 assists researchers by offering a general workflow for data extraction, starting with the identification of phenotypic information, followed by data standardization and algorithmic extraction. Dziadkowiec et al.38 focus their entire framework on data cleaning, developing strategies to address identified quality issues, correct errors, review inconsistencies, and perform data imputations. Haneuse et al.39 systematically address selection bias by examining missing or available data and providing guidance on patient selection. Additionally, frameworks such as DataGauge45, ATRAcTR31, and Dziadkowiec et al.38 evaluate extracted data against specific quality aspects to assess both data source fitness and study design.

Reporting and documentation frameworks are designed to ensure consistent and detailed methodologies, thereby enhancing research reproducibility and generalizability. Reproducibility is a central concept driving the development of many of these frameworks. For instance, a framework by Callahan et al.58 includes action items for reporting, while others provide standards that researchers should follow when publishing their work32,46,52,54. The STaRT-RWE55 template is specifically designed to improve transparency and consistency, thereby enhancing the credibility and reproducibility of RWE. HARPER56 serves as a protocol template offering standardized operational details, context, and rationale. The MVET criteria41 guide researchers in using real-world data (RWD) for decision-making, emphasizing the importance of reproducibility for replication and robustness checks.

Cohort Fitness Guideline Evaluation

Table 3 summarizes all of the included papers. Of the 9 papers that address cohort fitness, 5 (15.63%) offer quantitative methodologies and contain enough detail to guide users through the recommendations and process. Three papers (9.38%) also offer sufficient detail for researchers to implement; however, their assessment methodologies are qualitative. One paper (3.13%) offers brief recommendations without clear steps on how users should approach cohort fitness assessment. The remaining 23 papers (71.88%) do not describe the kind of framework we had hoped to find: they either do not address cohort fitness at all or present the topic without offering any practices, recommendations, or methodologies.

Table 3.

Guideline Evaluation Categories.

Categories and Summary of Methods | N (%)

Detailed Quantitative | Quantitative methodologies for cohort fitness assessment are described in sufficient detail that researchers can implement them. | 5 (15.63%)

  • DataGauge45: An iterative process to systematically implement quality assessment of repurposed clinical data. The tool relies on concepts from Wang and Strong’s classification of data quality dimensions59 and Borek et al.’s classification of data quality assessment methods60. Because the data quality analysis depends on a predeveloped data needs model specific to the analysis questions, the tool allows users to evaluate the data quality requirements that align with their research questions. DataGauge also calculates the percentage of compliance for each data quality aspect and flags all data items that do not comply with the criteria. The total number of non-flagged patients is also calculated. (A sketch of this style of compliance calculation appears after the table.)

  • Generalizability Score42: The Generalizability Score represents how well the cohort reflects the “general population” in the context of a clinical trial. The tool uses computer algorithms to analyze the characteristics of trial participants and compare them with the potential target population. For each characteristic, the algorithm produces a ratio that represents likeness or generalizability. (See the second sketch after the table.)

  • Han 202249: A data collection framework that includes reliability analysis for collected data. The authors showed that consistency between different cohorts reveals how data are collected, including how investigators describe their data elements and data element definitions. Using concordance rates between previously collected data and data collected by independent investigators, the reliability and consistency of the cohorts can be assessed.

  • Miao 202335: A data preparation framework in which evaluation of extracted EHR data is done concurrently with data cleaning and quality assessment. The framework uses data quality aspects as a foundation and focuses on the quality of the data and the suitability of individual variables. During variable domain assessment, the framework allows users to calculate various statistical metrics such as the mean, median, and standard deviation in order to create visualizations for each variable that can be compared to external data or reviewed by experts.

  • GIST 2.037: GIST 2.0 quantifies population representativeness, focusing on generalizability. The metric uses a weighted scoring system to assign importance to traits based on their relevance to the research question. It also allows users to calculate the clinical significance of each eligibility criterion through a trial-specific significance scale where greater stringency correlates with higher significance. (See the second sketch after the table.)

Detailed Qualitative | Qualitative methodologies for cohort fitness assessment are described in sufficient detail that researchers can implement them. | 3 (9.38%)

  • ACE-IT33: The ACE-IT tool was designed to qualitatively assess electronic medical record or claims-based algorithms in relation to a specific decision context through a series of questions.

  • The Certainty Framework44: The framework provides a level-based assessment for determining the degree of certainty with which the data source and algorithms derived from real-world data are reliable. The framework presents that certainty of the data can be categorized as “optimal”, “sufficient”, or “probable” depending on the importance of study variables and intended use. Users can follow the framework’s recommendations for each category to determine the appropriate data processing techniques.

  • KEEPER30: The KEEPER profile was designed for phenotype evaluation as a replacement for manual chart review. The profile was developed to offer a standardized solution for extracting and evaluating data elements relevant to the steps clinicians follow when diagnosing a patient. The extracted data elements correspond to common disease concepts such as symptoms, family history, or vital signs. In the background, the profile translates the symptoms into condition codes. The output contains data elements defined by the user along with their corresponding phenotype definitions and specific codes.
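As a concrete, deliberately simplified illustration of the compliance-percentage and flagging step described for DataGauge above, the Python sketch below applies two assumed data-quality rules to toy records; the rules, thresholds, and field names are illustrative assumptions, not part of the published tool.

```python
# Minimal sketch of a DataGauge-style compliance check. The rules and toy data
# are assumptions; DataGauge itself derives rules from a study-specific
# data-needs model rather than from hard-coded examples like these.

records = [
    {"patient_id": 1, "age": 54,  "hba1c": 7.2},
    {"patient_id": 2, "age": 230, "hba1c": 6.8},    # implausible age
    {"patient_id": 3, "age": 61,  "hba1c": None},   # missing lab value
]

# Each rule maps a data-quality requirement to a pass/fail test on one record.
rules = {
    "age_plausible": lambda r: r["age"] is not None and 0 <= r["age"] <= 120,
    "hba1c_present": lambda r: r["hba1c"] is not None,
}

flagged = set()
for name, test in rules.items():
    failures = [r["patient_id"] for r in records if not test(r)]
    flagged.update(failures)
    compliance = 100 * (len(records) - len(failures)) / len(records)
    print(f"{name}: {compliance:.1f}% compliant, flagged {failures}")

print("Patients passing every rule:", len(records) - len(flagged))
```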
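The second sketch below illustrates, under assumed data, the general idea behind the Generalizability Score and GIST 2.0: a per-characteristic likeness ratio between cohort and target-population proportions, combined with user-assigned importance weights. It does not reproduce either paper's published formula; the traits, proportions, and weights are hypothetical.

```python
# Illustrative sketch (not the published Generalizability Score or GIST 2.0
# formulas): compare cohort and target-population proportions for each
# characteristic, then combine the ratios with user-assigned importance weights.

target = {"female": 0.52, "age_over_65": 0.30, "stage_iv": 0.20}
cohort = {"female": 0.40, "age_over_65": 0.45, "stage_iv": 0.18}
weights = {"female": 1.0, "age_over_65": 2.0, "stage_iv": 3.0}   # relevance to the research question

ratios = {}
for trait, p_target in target.items():
    p_cohort = cohort[trait]
    # Likeness ratio in [0, 1]: 1 means the cohort matches the target exactly.
    ratios[trait] = min(p_cohort, p_target) / max(p_cohort, p_target)
    print(f"{trait:12s} ratio = {ratios[trait]:.2f}")

score = sum(weights[t] * ratios[t] for t in ratios) / sum(weights.values())
print(f"Weighted representativeness score: {score:.2f}")
```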

Discussion

The goal of this scoping review was to identify current frameworks that offer quantitative assessment methodologies described in sufficient detail for users to implement. While some guidelines included in this scoping review offer quantitative methods that could be adapted to evaluate extracted EHR-based cohorts, they were originally designed for other purposes. At the time of writing, there are no guidelines designed specifically for obtaining a quantitative measure that tells researchers whether their cohort fits their research objectives. Therefore, a new guideline is needed to fill this gap. The design of a new guideline should also consider the following characteristics:

  • Flexibility and adaptability. Because health research and health data can vary greatly depending on factors such as the disease of interest, the setting of the study, or data collection habits, guidelines should be flexible enough that researchers can apply them with ease. Guidelines that are too restrictive may be difficult to use and may add burden for users.

  • Clarity. Many health data users may not be adequately trained to use complex programs or tools; therefore, guidelines should be straightforward and have clear instructions to avoid misuse. Clear instructions should specify when to use the guideline in the data life cycle. For instance, a data extraction tool should be used after the specification of cohort definition and before cohort evaluation, while reporting guidelines should be applied after analytics validation but before publication preparation.

  • Updatability. Because diseases and the structure of the EHR system change over time, additions such as new diseases, codes, procedures, or medications have to be considered during the development of a guideline.

To our knowledge, this is the first attempt to identify detailed guidelines for assessing the fitness of extracted EHR cohorts. This review provides a foundation for developing improved guidelines for evaluating cohort quality. We conducted a broad search across various health and non-health databases to provide diverse perspectives. Articles were restricted to those published in English from 2013 to 2023 to capture recent developments while avoiding outdated information. Key limitations include the exclusion of articles published in other languages, methodologies predating 2013, and new approaches from 2024. Guidelines for unstructured data and those based on machine learning or artificial intelligence were also excluded. While acknowledging the growing shift toward automation, this review prioritizes basic methodologies, recognizing that many EHR users lack extensive backgrounds in machine learning. These gaps highlight opportunities for future research to address unstructured data and automated processes.

Conclusion

This scoping review aimed to identify frameworks, guidelines, or best practices for assessing the fitness of extracted EHR cohorts and their alignment with the desired population. Of the 32 articles reviewed, only 9 provided sufficient detail for researchers to implement, and only 5 of those provided methodologies that allow for quantitative computation. Although these guidelines were not originally designed for cohort fitness evaluation, they can be adapted to indirectly assess the fitness of extracted EHR cohorts. Additionally, existing guidelines, while covering various aspects of EHR data use, still struggle with flexibility, adaptability, clarity, and updatability. To effectively assess cohort fitness, a more comprehensive framework that addresses these gaps and meets all necessary criteria is required.

Figures & Table

Figure 1. Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) Flowchart.

References

  • 1. Chandawarkar A, Chaparro JD. Burnout in clinicians. Curr Probl Pediatr Adolesc Health Care. 2021;51(11):101104. doi:10.1016/j.cppeds.2021.101104.
  • 2. Evans RS. Electronic Health Records: Then, Now, and in the Future. Yearb Med Inform. 2016;Suppl 1(Suppl 1):S48–S61. doi:10.15265/IYS-2016-s006.
  • 3. Kroth PJ, Morioka-Douglas N, Veres S, et al. Association of Electronic Health Record Design and Use Factors With Clinician Stress and Burnout. JAMA Netw Open. 2019;2(8):e199609. doi:10.1001/jamanetworkopen.2019.9609.
  • 4. Kukull WA, Ganguli M. Generalizability: the trees, the forest, and the low-hanging fruit. Neurology. 2012;78(23):1886–1891. doi:10.1212/WNL.0b013e318258f812.
  • 5. Rahm E, Do HH. Data cleaning: Problems and current approaches. IEEE Data Eng Bull. 2000;23(4):3–13.
  • 6. Moskowitz A, Chen K. Defining the Patient Cohort. In: Secondary Analysis of Electronic Health Records. 2016:93–100. doi:10.1007/978-3-319-43742-2_10.
  • 7. Ancker JS, Shih S, Singh MP, et al. Root causes underlying challenges to secondary use of data. AMIA Annual Symposium Proceedings. Vol 2011. American Medical Informatics Association; 2011. p. 57.
  • 8. Richesson RL, Rusincovitch SA, Wixted D, et al. A comparison of phenotype definitions for diabetes mellitus. J Am Med Inform Assoc. 2013;20(e2):e319–e326. doi:10.1136/amiajnl-2013-001952.
  • 9. Hirsch JA, Nicola G, McGinty G, et al. ICD-10: History and Context. AJNR Am J Neuroradiol. 2016;37(4):596–599. doi:10.3174/ajnr.A4696.
  • 10. American Medical Association. CPT Overview and Code Approval. Accessed May 15, 2024. https://www.ama-assn.org/practice-management/cpt/cpt-overview-and-code-approval#%3A~
  • 11. World Health Organization. International Classification of Diseases (ICD). Accessed May 15, 2024. https://www.who.int/standards/classifications/classification-of-diseases
  • 12. Centers for Medicare and Medicaid Services. ICD-10. Coding. Accessed May 15, 2024. https://www.cms.gov/medicare/coding-billing/icd-10-codes
  • 13. Regenstrief Institute, Inc. Knowledge Base. LOINC. Accessed May 15, 2024. https://loinc.org/
  • 14. SNOMED International. SNOMED CT Maps. Accessed May 15, 2024. https://www.snomed.org/maps
  • 15. Homco J, Carabin H, Nagykaldi Z, et al. Validity of Medical Record Abstraction and Electronic Health Record-Generated Reports to Assess Performance on Cardiovascular Quality Measures in Primary Care. JAMA Netw Open. 2020;3(7):e209411. doi:10.1001/jamanetworkopen.2020.9411.
  • 16. Ferguson L. External Validity, Generalizability, and Knowledge Utilization. J Nurs Scholarsh. 2004;36(1):16–22. doi:10.1111/j.1547-5069.2004.04006.x.
  • 17. Sullivan GM. Getting off the “gold standard”: randomized controlled trials and education research. J Grad Med Educ. 2011;3(3):285–289. doi:10.4300/JGME-D-11-00147.1.
  • 18. Rusanov A, Weiskopf NG, Wang S, Weng C. Hidden in plain sight: bias towards sick patients when sampling patients with sufficient electronic health record data for research. BMC Med Inform Decis Mak. 2014;14:51. doi:10.1186/1472-6947-14-51.
  • 19. Ostropolets A, Albogami Y, Conover M, et al. Reproducible variability: assessing investigator discordance across 9 research teams attempting to reproduce the same observational study. J Am Med Inform Assoc. 2023;30(5):859–868. doi:10.1093/jamia/ocad009.
  • 20. Gatto NM, Vititoe SE, Rubinstein E, Reynolds RF, Campbell UB. A Structured Process to Identify Fit-for-Purpose Study Design and Data to Generate Valid and Transparent Real-World Evidence for Regulatory Uses. Clin Pharmacol Ther. 2023;113(6):1235–1239. doi:10.1002/cpt.2883.
  • 21. Eysenbach G. Issues in evaluating health websites in an Internet-based randomized controlled trial. J Med Internet Res. 2002;4(3):E17. doi:10.2196/jmir.4.3.e17.
  • 22. Arksey H, O’Malley L. Scoping studies: towards a methodological framework. Int J Soc Res Methodol. 2005;8(1):19–32. doi:10.1080/1364557032000119616.
  • 23. Peters MDJ, Godfrey C, McInerney P, et al. Best practice guidance and reporting items for the development of scoping review protocols. JBI Evid Synth. 2022;20(4):953–968. doi:10.11124/JBIES-21-00242.
  • 24. Tricco AC, Lillie E, Zarin W, et al. PRISMA Extension for Scoping Reviews (PRISMA-ScR): Checklist and Explanation. Ann Intern Med. 2018;169(7):467–473. doi:10.7326/M18-0850.
  • 25. Songthangtham N, Weinfurter L, Jantraporn R, Johnson S, Rajamani S. A Scoping Review of Standardized Protocols for Assessing EHR-based Cohort Fitness. Published online April 1, 2024. osf.io/5b46.
  • 26. Veritas Health Innovation. Covidence systematic review software. www.covidence.org
  • 27. US Food and Drug Administration. Framework for FDA’s Real-World Evidence Program. Silver Spring, MD: US Food and Drug Administration; 2018.
  • 28. European Medicines Agency. ENCePP Methodological Guide. Accessed May 13, 2024. https://encepp.europa.eu/encepp-toolkit/methodological-guide_en
  • 29. Simon GJ, Aliferis CF, editors. Artificial Intelligence and Machine Learning in Health Care and Medical Sciences: Best Practices and Pitfalls. Springer; 2024.
  • 30. Ostropolets A, Hripcsak G, Husain SA, et al. Scalable and interpretable alternative to chart review for phenotype evaluation using standardized structured data from electronic health records. J Am Med Inform Assoc. 2023;31(1):119–129. doi:10.1093/jamia/ocad202.
  • 31. Berger ML, Crown WH, Li JZ, Zou KH. ATRAcTR (Authentic Transparent Relevant Accurate Track-Record): a screening tool to assess the potential for real-world data sources to support creation of credible real-world evidence for regulatory decision-making. Health Serv Outcomes Res Methodol. Published online November 29, 2023. doi:10.1007/s10742-023-00319-w.
  • 32. Castelo-Branco L, Pellat A, Martins-Branco D, et al. ESMO Guidance for Reporting Oncology real-World evidence (GROW). ESMO Real World Data Digit Oncol. 2023;1:100003. doi:10.1016/j.annonc.2023.10.001; doi:10.1016/j.esmorw.2023.10.001.
  • 33. Singh S, Beyrer J, Zhou X, et al. Development and Evaluation of the Algorithm CErtaInty Tool (ACE-IT) to Assess Electronic Medical Record and Claims-based Algorithms’ Fit for Purpose for Safety Outcomes. Drug Saf. 2023;46(1):87–97. doi:10.1007/s40264-022-01254-4.
  • 34. Graul EL, Stone PW, Massen GM, et al. Determining prescriptions in electronic healthcare record data: methods for development of standardized, reproducible drug codelists. JAMIA Open. 2023;6(3):ooad078. doi:10.1093/jamiaopen/ooad078.
  • 35. Miao Z, Sealey MD, Sathyanarayanan S, Delen D, Zhu L, Shepherd S. A data preparation framework for cleaning electronic health records and assessing cleaning outcomes for secondary analysis. Inf Syst. 2023;111:102130. doi:10.1016/j.is.2022.102130.
  • 36. Kahn MG, Callahan TJ, Barnard J, et al. A Harmonized Data Quality Assessment Terminology and Framework for the Secondary Use of Electronic Health Record Data. EGEMS (Wash DC). 2016;4(1):1244. doi:10.13063/2327-9214.1244.
  • 37. Sen A, Chakrabarti S, Goldstein A, Wang S, Ryan PB, Weng C. GIST 2.0: A scalable multi-trait metric for quantifying population representativeness of individual clinical studies. J Biomed Inform. 2016;63:325–336. doi:10.1016/j.jbi.2016.09.003.
  • 38. Dziadkowiec O, Callahan T, Ozkaynak M, Reeder B, Welton J. Using a Data Quality Framework to Clean Data Extracted from the Electronic Health Record: A Case Study. EGEMS (Wash DC). 2016;4(1):1201. doi:10.13063/2327-9214.1201.
  • 39. Haneuse S, Daniels M. A General Framework for Considering Selection Bias in EHR-Based Studies: What Data Are Observed and Why? EGEMS (Wash DC). 2016;4(1):1203. doi:10.13063/2327-9214.1203.
  • 40. Knake LA, Ahuja M, McDonald EL, et al. Quality of EHR data extractions for studies of preterm birth in a tertiary care center: guidelines for obtaining reliable data. BMC Pediatr. 2016;16:59. doi:10.1186/s12887-016-0592-z.
  • 41. Schneeweiss S, Eichler H, Garcia-Altes A, et al. Real World Data in Adaptive Biomedical Innovation: A Framework for Generating Evidence Fit for Decision-Making. Clin Pharmacol Ther. 2016;100(6):633–646. doi:10.1002/cpt.512.
  • 42. Cahan A, Cahan S, Cimino JJ. Computer-aided assessment of the generalizability of clinical trial results. Int J Med Inform. 2017;99:60–66. doi:10.1016/j.ijmedinf.2016.12.008.
  • 43. Castillo EG, Olfson M, Pincus HA, Vawdrey D, Stroup TS. Electronic Health Records in Mental Health Research: A Framework for Developing Valid Research Methods. Psychiatr Serv. 2015;66(2):193–196. doi:10.1176/appi.ps.201400200.
  • 44. Cocoros NM, Arlett P, Dreyer NA, et al. The Certainty Framework for Assessing Real-World Data in Studies of Medical Product Safety and Effectiveness. Clin Pharmacol Ther. 2020;109(5):1189–1196. doi:10.1002/cpt.2045.
  • 45. Diaz-Garelli JF, Bernstam EV, Lee M, Hwang KO, Rahbar MH, Johnson TR. DataGauge: A Practical Process for Systematically Designing and Implementing Quality Assessments of Repurposed Clinical Data. EGEMS (Wash DC). 2019;7(1):32. doi:10.5334/egems.286.
  • 46. Ercole A, Brinck V, George P, et al. Guidelines for Data Acquisition, Quality and Curation for Observational Research Designs (DAQCORD). J Clin Transl Sci. 2020;4(4):354–359. doi:10.1017/cts.2020.24.
  • 47. Gatto NM, Reynolds RF, Campbell UB. A Structured Preapproval and Postapproval Comparative Study Design Framework to Generate Valid and Transparent Real-World Evidence for Regulatory Decisions. Clin Pharmacol Ther. 2019;106(1):103–115. doi:10.1002/cpt.1480.
  • 48. Gatto NM, Campbell UB, Rubinstein E, et al. The Structured Process to Identify Fit-For-Purpose Data: A Data Feasibility Assessment Framework. Clin Pharmacol Ther. 2022;111(1):122–134. doi:10.1002/cpt.2466.
  • 49. Han HS, Lee KE, Suh YJ, et al. Data collection framework for electronic medical record-based real-world data to evaluate the effectiveness and safety of cancer drugs: a nationwide real-world study of the Korean Cancer Study Group. Ther Adv Med Oncol. 2022. doi:10.1177/17588359221132628.
  • 50. Kotecha D, Asselbergs FW, Achenbach S, et al. CODE-EHR best-practice framework for the use of structured electronic health-care records in clinical research. Lancet Digit Health. 2022;4(10):e757–e764. doi:10.1016/S2589-7500(22)00151-0.
  • 51. Ritchey ME, Girman CJ. Evaluating the Feasibility of Electronic Health Records and Claims Data Sources for Specific Research Purposes. Ther Innov Regul Sci. 2020;54(6):1296–1302. doi:10.1007/s43441-020-00139-x.
  • 52. Imran M, Kwakkenbos L, McCall SJ, et al. Methods and results used in the development of a consensus-driven extension to the Consolidated Standards of Reporting Trials (CONSORT) statement for trials conducted using cohorts and routinely collected data (CONSORT-ROUTINE). BMJ Open. 2021;11(4):e049093. doi:10.1136/bmjopen-2021-049093.
  • 53. Benchimol EI, Smeeth L, Guttmann A, et al. The REporting of studies Conducted using Observational Routinely-collected health Data (RECORD) statement. PLoS Med. 2015;12(10):e1001885. doi:10.1371/journal.pmed.1001885.
  • 54. Langan SM, Schmidt SA, Wing K, et al. The reporting of studies conducted using observational routinely collected health data statement for pharmacoepidemiology (RECORD-PE). BMJ. 2018. doi:10.1136/bmj.k3532.
  • 55. Wang SV, Pinheiro S, Hua W, et al. STaRT-RWE: structured template for planning and reporting on the implementation of real world evidence studies. BMJ. 2021;372:m4856. doi:10.1136/bmj.m4856.
  • 56. Wang SV, Pottegård A, Crown W, et al. HARmonized Protocol Template to Enhance Reproducibility of Hypothesis Evaluating Real-World Evidence Studies on Treatment Effects: A Good Practices Report of a Joint ISPE/ISPOR Task Force. Value Health. 2022;25(10):1663–1672. doi:10.1016/j.jval.2022.09.001.
  • 57. Denaxas S, Direk K, Gonzalez-Izquierdo A, et al. Methods for enhancing the reproducibility of biomedical research findings using electronic health records. BioData Min. 2017;10:31. doi:10.1186/s13040-017-0151-7.
  • 58. Callahan A, Shah NH, Chen JH. Research and Reporting Considerations for Observational Studies Using Electronic Health Record Data. Ann Intern Med. 2020;172(11 Suppl):S79–S84. doi:10.7326/M19-0873.
  • 59. Wang RY, Strong DM. Beyond Accuracy: What Data Quality Means to Data Consumers. J Manag Inf Syst. 1996;12(4):5–33. doi:10.1080/07421222.1996.11518099.
  • 60. Borek A, Woodall P, Oberhofer MA, Parlikad AK. A classification of data quality assessment methods. ICIQ. 2011.
