AMIA Summits on Translational Science Proceedings. 2020 May 30;2020:171–180.

Natural Language Processing for the Evaluation of Methodological Standards and Best Practices of EHR-based Clinical Research

Sunyang Fu 1,3, Luke A Carlson 1, Kevin J Peterson 2,3, Nan Wang 3, Xin Zhou 1, Suyuan Peng 1, Jun Jiang 1, Yanshan Wang 1, Jennifer St Sauver 1, Hongfang Liu 1
PMCID: PMC7233049  PMID: 32477636

Abstract

The effective use of EHR data for clinical research is challenged by the lack of methodologic standards, transparency, and reproducibility. For example, our empirical analysis of clinical research ontologies and reporting standards found little to no informatics-related standards. To address these issues, our study aims to leverage natural language processing (NLP) techniques to discover the reporting patterns and data abstraction methodologies of EHR-based clinical research. We conducted a case study using a collection of full-text articles of EHR-based population studies published using the Rochester Epidemiology Project infrastructure. Our investigation discovered an upward trend in the reporting of EHR-related research methodologies, good practices, and the use of informatics-related methods. For example, among 1,279 articles, 24.0% reported training for data abstraction, 6% reported that abstractors were blinded, 4.5% tested inter-observer agreement, 5% reported the use of a screening/data collection protocol, 1.5% reported that team meetings were organized for consensus building, and 0.8% mentioned supervision activities by senior researchers. Nevertheless, the overall rate of reporting and adoption of methodologic standards remained low, and there was high variation in clinical research reporting. Thus, continued development of process frameworks, ontologies, and reporting guidelines promoting good data practice in EHR-based clinical research is recommended.

Introduction

The rapid adoption of electronic health records (EHRs) has transformed the way in which medical records are routinely collected, integrated, and stored. This transformation holds great promise for driving clinical research by enriching patient information, integrating computable phenotype algorithms, and facilitating cohort exploration on an unprecedented scale. Studies have demonstrated the efficiency and effectiveness of using EHRs for large-cohort comparative effectiveness research1, case-control studies2, and retrospective chart review studies3. Despite this great potential, there are many challenges to the effective use of EHR data for clinical research, particularly due to the high heterogeneity and complexity of EHRs4, 5. Thus, understanding the challenges of using EHR data is critical to ensuring the quality of clinical research.

Chart review is a common method for facilitating EHR-based observational research. It is a process of extracting or reviewing information from EHRs and assembling a data set for various research needs such as case ascertainment, status validation, information collection, and case matching6. However, evidence suggests that chart review methodologies lack standardization, scientific rigor, and reporting guidelines. For example, a review conducted by Gilbert et al. of three emergency medicine journals found that among all studies related to retrospective chart review, only 11% reported the use of an abstraction form and 4% reported inter-rater agreement7. Meanwhile, informatics tools have been developed to enhance the use of EHR data for supporting clinical and translational research. For example, Informatics for Integrating Biology and the Bedside (i2b2) is a privacy-preserving patient query tool for facilitating research feasibility assessment8. Once study feasibility is determined, data capture tools such as REDCap, TELEforms® (Cardiff Software, Inc., Vista, CA), and Studytrax® (ScienceTRAX LLC, Macon, GA, USA) aim to facilitate patient information collection. Protocol management aids such as Protocol Builder® (Biomedical Research Alliance of New York) and questionnaire development tools like QDS™ (NOVA Research Company, Bethesda, USA) make the document development process more efficient. Additionally, many informatics methodologies, such as natural language processing (NLP), have been leveraged to perform chart review by automatically extracting clinical concepts from unstructured EHR data. Various types of EHR-based phenotype algorithms have been developed, ranging from drug-related adverse events9 to individualized risk prediction10. However, even with the great advancement of clinical research informatics, there is a lack of systematic understanding of how these tools and methods are utilized and reported, as well as their impact on overall research quality. Our empirical analysis of clinical research ontologies and reporting standards found little to no informatics-related standards.

An additional challenge for EHR-based clinical research is reproducibility. This often manifests as the withholding of key methodological details such as data abstraction methods, protocols, processes, and definitions11. Several pragmatic evaluations of high-profile clinical journals have shown that only ~11% to 25% of projects can be replicated12-14. The issues often appear as the inability to reproduce research data. Mobley et al. surveyed faculty and trainees at MD Anderson Cancer Center, and discovered that 50% of respondents had experienced issues with data reproducibility in cancer-related research15. The consequences of invalid methodologic processes and unreproducible results in biomedical research can be serious, such as preventing clinical knowledge translation, wasting scientific resources, and delaying treatment timing16.

To ensure valid, transparent, and reproducible clinical research, a growing number of informatics-related efforts have been reported. Several clinical research ontologies have been developed to support scientific standards, including the Ontology of Clinical Research (OCRe)17 and the Biomedical Resource Ontology (BRO)18. Leveraging existing ontologies, Kong et al. further expanded the schematic representation of clinical research using Conceptual Model Representation (CMP) to aid the development of clinical research databases19. Sahoo et al. created an informatics framework that allows detailed research data elements to be systematically mapped and represented20. By combining NLP techniques, Valdez et al. were able to build an ontology-based clinical research knowledgebase for evaluating research studies and enhancing study reproducibility21. Ross et al. systematically analyzed eligibility criteria in clinical trials using heuristic rules and logic22. Our project aims to enhance current informatics solutions by demonstrating a methodologic development process (corpus development, sub-language analysis, and modeling) that uses NLP to discover the reporting patterns of EHR-based observational studies. Our investigation focuses on the trends, variability, utilization, and adoption of EHR-based data abstraction methodologies. Existing clinical research ontologies and research reporting standards were leveraged to help define important data elements.

Methodology

Data Selection

The Rochester Epidemiology Project (REP) is a National Institutes of Health-funded research infrastructure that collates and indexes health care information from virtually all sources of medical care available to residents of Olmsted County, Minnesota23. It has maintained a comprehensive medical records linkage system for over half a century, which makes it an ideal resource for conducting population-based studies. These data have been utilized by investigators throughout the country, resulting in more than 2,000 publications on a wide range of health care topics in top clinical journals including JAMA, NEJM, and Lancet. Our study investigated all articles from the REP publication registry between 1995 and 2016. In total, 1,543 articles were retrieved; 321 were removed because their PDFs could not be converted or no full text was available. The final data set comprised 1,279 articles.

Guideline

We adopted existing guidelines from the EQUATOR Network (Reporting Guidelines for Health Research)24, including RECORD (REporting of studies Conducted using Observational Routinely-collected health Data) and STROBE (Strengthening the Reporting of Observational Studies in Epidemiology), as the baseline guideline25, 26. Additional methodologic strategies were borrowed from research by Boyd et al. and Horwitz et al., including training, abstraction forms, meetings, monitoring, and testing of inter-rater agreement27, 28. In consultation with the above standards and guidelines, we defined the use of EHRs to include the following processes: feasibility assessment, cohort identification (case selection), and data retrieval. Reporting categories were defined as follows:

Reporting Category Definition
Participants The methods of study population selection (such as codes or algorithms used to identify subjects), validation of the codes or algorithms used to select the population, data linkage process, participant follow-up, and matching.
Variables The methods of classification, assessment, and validation of variables (exposures, outcomes, confounders, and effect modifiers).
Data sources The methods of data assessment (reliability and validity) and data collection (development, training, validation, and administration, such as blinding).

Annotation

Based on the baseline guideline, we randomly sampled 200 of the 1,279 articles for manual review. Of these 200, 71 were randomly sampled and double-read to determine inter-rater reliability. We defined two objectives for this process. The first objective was to assess reporting adherence to the existing standards: each article was annotated for the presence or absence of the methodologic standards provided in the guideline. The second objective was to identify additional important activities that were not captured by the current standards, such as recently proposed best practices and methodologies for the use of EHRs.

The annotation process was conducted according to established corpus annotation schemes29, including organizing training sessions, developing annotation guidelines, multi-phase annotation, evaluation, and adjudication. Four annotators (N.W., J.J., X.Z., and S.P.) were given an initial one-hour training session. Questions raised during the training exercise were used to refine the baseline guideline. In the first week, each annotator annotated four to eight papers (two papers per batch). After each batch, the inter-annotator agreement (IAA) was calculated as the F-measure (2 × precision × recall / (precision + recall)). Matched cases were determined by computing a bipartite alignment between the two annotators' annotated sentences using the Kuhn-Munkres algorithm30. A consensus meeting was organized to resolve disagreements and annotation issues, and the process continued until high agreement was reached. Over the next three to four weeks, weekly batches totaling 200 papers were annotated. Each document was independently annotated by two annotators. After the weekly assignments were completed, we computed the IAA, resolved disagreements, and clarified the guidelines.
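As an illustration (not the authors' implementation), the alignment-based IAA computation described above can be sketched as follows, using scipy's implementation of the Kuhn-Munkres (Hungarian) algorithm; the span representation, the overlap score, and the 0.5 match threshold are assumptions for the example.

```python
# Sketch: inter-annotator agreement (F-measure) via Kuhn-Munkres span alignment.
# Assumptions: annotations are (start, end, label) tuples; an aligned pair counts
# as a match when the spans overlap sufficiently and share a label.
import numpy as np
from scipy.optimize import linear_sum_assignment

def overlap(a, b):
    """Character overlap ratio between two (start, end, label) spans."""
    inter = max(0, min(a[1], b[1]) - max(a[0], b[0]))
    union = max(a[1], b[1]) - min(a[0], b[0])
    return inter / union if union else 0.0

def iaa_f_measure(ann1, ann2, threshold=0.5):
    if not ann1 or not ann2:
        return 0.0
    # Similarity matrix between the two annotators' spans (0 if labels differ).
    sim = np.array([[overlap(a, b) if a[2] == b[2] else 0.0 for b in ann2] for a in ann1])
    rows, cols = linear_sum_assignment(-sim)      # maximize total overlap
    matches = sum(sim[r, c] >= threshold for r, c in zip(rows, cols))
    precision = matches / len(ann2)               # annotator 2 treated as "system"
    recall = matches / len(ann1)                  # annotator 1 treated as "gold"
    return 2 * precision * recall / (precision + recall) if (precision + recall) else 0.0

# Example: two annotators marking methodologic-event spans in one sentence.
a1 = [(0, 25, "Population selection"), (40, 62, "Data collection")]
a2 = [(2, 25, "Population selection"), (40, 60, "Data collection"), (70, 80, "Validation")]
print(round(iaa_f_measure(a1, a2), 3))            # 0.8
```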

The final gold standard annotations were created by combining the individual experts' annotations and adjudicating the mismatches. The jointly annotated training notes were added to the gold standard but excluded from the final IAA computation. The annotation tool for this project was the Multi-document Annotation Environment (MAE), a Java-based natural language annotation software package31.

Natural Language Processing

Based on the above methodologic standards and the gold standard corpus, we developed an NLP algorithm to automate the manual process. The infrastructure for the NLP system was adopted from MedTaggerIE32, an open-source, resource-driven information extraction (IE) framework built on the Unstructured Information Management Architecture (UIMA)33. The NLP algorithm was developed in three steps:

1) prototype system development based on existing knowledge and standards, information theory algorithms, and expert knowledge; 2) formative system development using a training dataset and manual case review for iterative refinement; and 3) final system evaluation using a test dataset. A total of 200 annotated articles were divided into a training set (n=100) and a test set (n=100).

Section Detection

The initial step was to exclude irrelevant information by segmenting each article into sections. Since MedTaggerIE contains a built-in section detector based on terminology lookup34, we only needed to modify the dictionary to match sections relevant to clinical research. Based on STROBE, we included sections related to study methods, study design, data (collection), case definition, and participants (cohort).
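As a hypothetical illustration of this dictionary-based section detection (the heading-to-section dictionary below is an assumption, not the authors' lookup table), a minimal sketch might assign each line of an article to the most recently seen research-relevant heading:

```python
# Sketch: dictionary-based section detection for research articles.
# The heading dictionary is illustrative only; irrelevant sections are dropped.
import re

SECTION_DICTIONARY = {
    "study method": ["methods", "methodology", "materials and methods"],
    "study design": ["study design", "design"],
    "data collection": ["data collection", "data sources", "data abstraction"],
    "case definition": ["case definition", "case ascertainment"],
    "participants": ["participants", "study population", "cohort", "subjects"],
}

def detect_sections(lines):
    """Assign each line to the most recently seen section heading, or None."""
    current, assignments = None, []
    for line in lines:
        heading = line.strip().lower().rstrip(":")
        if heading in {"results", "discussion", "conclusion", "references"}:
            current = None          # sections outside the dictionary are excluded downstream
        for section, synonyms in SECTION_DICTIONARY.items():
            if any(re.fullmatch(s, heading) for s in synonyms):
                current = section
                break
        assignments.append((current, line))
    return assignments

text = ["Methods", "Study population", "Residents of Olmsted County were identified...", "Results"]
for section, line in detect_sections(text):
    print(section, "|", line)
```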

Concept Extraction

Concept extraction is a knowledge-driven annotation and indexing process that identifies phrases referring to concepts of interest in unstructured text. We leveraged pointwise mutual information (PMI) to provide heuristic ranks of n-gram features (contiguous sequences of n items from a given sample of sentences) for keyword prototyping. Table 1 lists the top 15 n-gram features automatically generated by the algorithm.

Table 1.

Example of N-gram Features with High PMI Scores

Reporting Category Top Uni- and Bigram Features with the Highest PMI Score* Example Sentence
Participant records; medical; medical records; residents; identified; reviewed; records of; review; complete; case; county residents; were identified; subjects; were reviewed; nurse “We reviewed charts to identify cases of PJP, cross referenced with the REP database using diagnostic codes for PJP and the Mayo Clinic and Olmsted Medical Center databases.”
Data source medical records; data; records of; reviewed; information; was collected; records linkage; used; database; nurse; system; selected; from the; obtained; by trained “We used the REP database to retrieve all medical records for residents of Olmsted county who had an established diagnosis of any of the subtypes of CLE”

*Keywords in second column were only for system prototyping and had not been manually curated
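A minimal sketch of PMI-based keyword prototyping, where PMI(x, y) = log2(p(x, y) / (p(x) p(y))) ranks bigrams whose components co-occur more often than chance; the toy corpus below is illustrative and not the REP article text:

```python
# Sketch: ranking bigram features by pointwise mutual information (PMI).
import math
from collections import Counter

sentences = [
    "medical records were reviewed by trained nurse abstractors",
    "all county residents were identified using the records linkage system",
    "data were collected from the complete medical records of eligible subjects",
]

tokens = [s.split() for s in sentences]
unigrams = Counter(w for sent in tokens for w in sent)
bigrams = Counter((sent[i], sent[i + 1]) for sent in tokens for i in range(len(sent) - 1))
n_uni, n_bi = sum(unigrams.values()), sum(bigrams.values())

def pmi(pair):
    p_xy = bigrams[pair] / n_bi
    p_x, p_y = unigrams[pair[0]] / n_uni, unigrams[pair[1]] / n_uni
    return math.log2(p_xy / (p_x * p_y))

# Highest-PMI bigrams become candidate keywords for prototyping (cf. Table 1).
for pair in sorted(bigrams, key=pmi, reverse=True)[:5]:
    print(f"{' '.join(pair):22s} PMI = {pmi(pair):.2f}")
```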

Due to the high textual similarity between the participant and data source classes, additional patterns needed to be identified to accurately distinguish the two. We therefore analyzed syntactic patterns by parsing the dependencies of each sentence, using Stanford CoreNLP 3.9.235 integrated into the existing MedTagger UIMA framework. We found that the majority of methodologic events could be modeled as concepts joined by semantic connectors. For example, Figure 1 shows the syntactic structure of a sentence in which the objective (event confirmation) and the methodologic event (medical record review) are connected through a case-marking element and an adjectival modifier.

Figure 1. Parsing structure of case ascertainment.
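The paper integrates the Java CoreNLP parser into the UIMA pipeline; as a stand-in sketch only, the stanza package (Stanford NLP's Python library) exposes comparable Universal Dependencies output. Note that UD v2 labels passive auxiliaries "aux:pass" where the older Stanford typed dependencies used "auxpass"; the example sentence is invented.

```python
# Sketch: extracting case-marking ("case"/"mark") and adjectival-modifier ("amod")
# dependencies that connect a methodologic event to its study objective.
import stanza

stanza.download("en")                                  # one-time model download
nlp = stanza.Pipeline("en", processors="tokenize,pos,lemma,depparse")

sentence = "Cases were confirmed through manual review of the complete medical records."
doc = nlp(sentence)

for sent in doc.sentences:
    for word in sent.words:
        # UD v2 "aux:pass" corresponds to the paper's "auxpass" relation.
        if word.deprel in {"case", "mark", "amod", "aux:pass"}:
            head = sent.words[word.head - 1].text if word.head > 0 else "ROOT"
            print(f"{word.deprel:10s} {word.text:10s} -> {head}")
```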

Context Detector

Assertion and temporal expressions were handled by the MedTaggerIE context detector. The assertion of each concept includes certainty (i.e., positive, negative, or possible) and experiencer (i.e., the patient or someone else), while temporality indicates whether the event is historical or present. For example, from the sentence "Data were collected from a random sample using questionnaires," "Data" would be extracted as a data concept and "collected" as a methodology concept, with the corresponding assertion status "positive" and temporality "past".
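For illustration only (this is not MedTaggerIE's actual rule set), a trigger-term sketch of certainty and temporality assignment around an extracted concept might look like the following; the trigger lists are assumptions, and experiencer detection is omitted.

```python
# Sketch: rule-based assertion (certainty) and temporality detection around an
# extracted concept, using illustrative trigger terms in the preceding window.
import re

NEGATION = ["no ", "not ", "without ", "denies "]
POSSIBLE = ["possible", "probable", "suspected", "may have"]
HISTORICAL = ["history of", "previously", "prior ", "had been"]

def context(sentence, concept):
    window = sentence.lower().split(concept.lower())[0][-60:]   # text preceding the concept
    certainty = ("negative" if any(t in window for t in NEGATION)
                 else "possible" if any(t in window for t in POSSIBLE)
                 else "positive")
    temporality = "past" if (any(t in window for t in HISTORICAL)
                             or re.search(r"\b(was|were|had)\b", window)) else "present"
    return certainty, temporality

print(context("Data were collected from a random sample using questionnaires", "collected"))
# ('positive', 'past')  -- matches the worked example in the text
```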

Normalization and Summarization

After keywords or phrases were extracted from a sentence, they were normalized to specific concepts. For example, the phrases "patient record" and "medical record" were both normalized to the concept "data". The normalized categories were then processed by a rule engine consisting of conditional clauses combining "and", "or", and "not" operators. A summary of these concepts, keywords, and modifiers is listed in Table 2.

Table 2.

Keywords for Concept Extraction

NLP Concept Keywords*
Population selection general assemble(d); categorize(d); choose(chosen); classify(classified); construct(ed); contact(ed); draw(drawn); determine(d); establish(ed); screen(ed); select(ed); recruit(ed); invit(ed); sample(d); abstract(ed); cohort screen(ing); complete(d) abstraction
Population selection specific search(ed); review(ed); identify(ied); exclude(d); include(d); confirm(ed)/ascertain(ed); avoid(ed);
Measurement measurement(s)/ assessment(s) performed; assessed; measure(d); determine(d); evaluate(d); search(ed); review(ed); identify(ied);
Data collection retrieve(d); collect(ed); obtain(ed); contact(ed); interview(ed); complete abstraction; data collection/abstraction; questionnaire(s)/survey(ies) was/were designed/used/created/mailed;
Validation validate(d); validation; confirm(ed); verify(verified); verification; ensure(d); agreement(s); agreement measure(s); accuracy; inter/intra-rater/annotator/observer agreement(s); agreement(s) between; IAA; test retest; gold standard; kappa; reliability; validity; (doubly; double; triply; triple; quadruply; quadruple) + (review/read/exam/assess/measure) + (twice; multiple times); consensus; disagreement(s) resolved;
Data linkage linking; link(ed); data linkage; linkage system; indexing; cross referenc(ed/ing); cross match(ed);
Follow up follow(ed) up; follow(ed) up through; follow up period; follow(ed) for
Matching match(ed) with; matching; match(ed) (subject) to; matched pair with; matched(ing) on; matched in (characteristics); matched for;
Cohort-related cohort (of); sub(-)cohort; population; participant(s); patient(s); control(s); case(s); resident(s); child; children; man; men; woman; women; subject(s); adult(s); volunteer(s); person(s); survey respondent(s); comparison group(s)
Study team/abstractor abstractor(s); specialist(s); fellow(s); RTP(s); research temporary professional(s); intern(s); author(s); reviewer(s); operator(s)
Eligibility-related eligible; eligibility; ineligible; criteria; criterion; inclusion criteria; exclusion criteria; included; excluded; screening protocol; screening
EHR-related medical record(s); information; data; record(s); characteristic(s); chart(s); sample(s); questionnaire.?; database; computerized diagnostic index; EHR(s); EMR(s); electronic medical record(s); electronic health record(s); survey(s);
Terminology diagnostic; diagnostic code; ICD(s); international classification of diseases; CPT(s); current procedural terminology; Berkson code (s or ing) (REP cohort only); symptom(s); factor(s);
UMLS Dictionary 1) medical concepts, 2) procedures, 3) medical professional expressions: https://www.nlm.nih.gov/research/umls/

* Keywords should be connected through wild card regular expression (i.e. (\S?\s?){start, end}) for fuzzy matching; additional refinement is required when used for different data sources.
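A minimal sketch of the normalization step described above, mapping extracted phrases to the concept groups of Table 2 via keyword patterns; the patterns shown here are simplified illustrations rather than the full keyword lists.

```python
# Sketch: normalizing extracted keywords/phrases to the concept groups of Table 2.
import re

CONCEPT_PATTERNS = {
    "EHR-related": r"\b(medical record(s)?|chart(s)?|database|EHR(s)?|EMR(s)?)\b",
    "Population selection general": r"\b(select(ed)?|recruit(ed)?|screen(ed|ing)?|sample(d)?)\b",
    "Population selection specific": r"\b(review(ed)?|identif(y|ied)|include(d)?|exclude(d)?)\b",
    "Data collection": r"\b(collect(ed)?|retriev(e|ed)|abstract(ed|ion)?|obtain(ed)?)\b",
    "Validation": r"\b(validat(e|ed|ion)|confirm(ed)?|kappa|agreement(s)?)\b",
    "Cohort-related": r"\b(cohort|participant(s)?|patient(s)?|resident(s)?|subject(s)?)\b",
}

def normalize(sentence):
    """Return the set of normalized concept groups mentioned in a sentence."""
    return {concept for concept, pattern in CONCEPT_PATTERNS.items()
            if re.search(pattern, sentence, flags=re.IGNORECASE)}

print(normalize("The medical records of all eligible residents were reviewed and abstracted."))
# e.g. {'EHR-related', 'Cohort-related', 'Population selection specific', 'Data collection'}
```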

The following patterns were used to identify common expressions for each class. We used square brackets "[]" to represent each concept group, parentheses "()" for direct keywords, curly brackets "{}" for typed dependencies, "|" for the conjunction or, and "&" for the conjunction and. The dependency expressions follow the Stanford Typed Dependencies Manual35, where "auxpass" represents a passive auxiliary, "case" represents a case-marking element, "mark" represents a marker that introduces a clause subordinate to another clause, and "amod" represents an adjectival modifier. Textbox 1 provides the logic rules for nine different events related to the use of EHRs for clinical research.
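To illustrate how this bracketed notation could be evaluated (a sketch under the assumption that a sentence's concept groups and typed dependencies have already been extracted; the extraction results below are hand-specified, not system output), one rule from Textbox 1 is checked as a simple conjunction:

```python
# Sketch: evaluating one Textbox 1 rule as a conjunction over a sentence's
# extracted concept groups ([...]) and typed dependencies ({...}).

# Rule: [Cohort-related] & {auxpass} & [Population selection general]
#       & {preposition case-marking element} & [Population selection specific]
RULE = [
    ("concept", "Cohort-related"),
    ("dependency", "auxpass"),
    ("concept", "Population selection general"),
    ("dependency", "case"),
    ("concept", "Population selection specific"),
]

# Hand-specified extraction output for:
# "Residents were selected through review of the medical records."
sentence_concepts = {"Cohort-related", "Population selection general",
                     "Population selection specific", "EHR-related"}
sentence_dependencies = {"auxpass", "case", "amod"}

def rule_fires(rule, concepts, dependencies):
    """True when every conjunct of the rule is satisfied ('&' semantics)."""
    return all(
        (value in concepts) if kind == "concept" else (value in dependencies)
        for kind, value in rule
    )

print(rule_fires(RULE, sentence_concepts, sentence_dependencies))   # True
```

Disjunctions ("|") would be handled analogously by accepting any member of a listed alternative set.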

Evaluation

After section selection and sentence detection, the final test corpus consisted of 1220 sentences. Each sentence was pre-annotated with either one of the nine categories listed in Table 4 or “other”. The final system was evaluated on all tasks using F-measure.

Table 4.

Summary of Methodologic Events Among 1279 Articles from REP

Sections Methodologic Events* Number of Articles (n=1279)
Participants Study population selection 90.30% (1155)
Chart review 51.92% (664)
Database Query 30.6% (391)
Standard terminology 7.43% (95)
Cohort screen tools 1.49% (19)
Computer-based algorithms 1.41% (18)
Natural language processing 1.09% (14)
Validation 22.28% (285)
Chart review 7.97% (102)
Existing criteria 5.47% (70)
Questionnaires/survey/interview 0.70% (9)
Data linkage 26.11% (334)
Participant follow-up 49.57% (634)
Matching 21.11% (270)
Variables Measurement/Assessment of variables 35.26% (451)
Clinical intervention/criteria 29.48% (377)
Questionnaires/survey/interview 7.35% (94)
Chart review 1.41% (18)
Computer-based algorithms 0.47% (6)
Validation 9.85% (126)
Questionnaires/survey/interview 1.96% (25)
Existing criteria 0.16% (2)
Data sources Data collection 49.18% (629)
Electronic retrieval 6.18% (79)
Manual chart review 4.14% (53)
Survey/questionnaire/interview 1.40% (18)
Electronic data capture tools 0.70% (9)
Data quality assessment 0.63% (8)

*Categories are not mutually exclusive.

Analysis of Methodologic Reporting Patterns

We applied the algorithm to the entire REP corpus from 1995 to 2016. Each sentence was categorized into one of the above nine categories. Once the category was determined, we conducted a sublanguage analysis to identify how each activity was conducted. Briefly, we identified the five most commonly used case-marking elements and markers ("using", "through", "with use of", "with", and "via") to identify the key methodologic expressions; expressions that could not be identified automatically were assessed through manual review. Furthermore, we applied the research practice framework proposed by Boyd et al., Horwitz et al., and Gilbert et al. to evaluate the quality of the reported methods through the identification of the following six activities: screening/data collection protocol, training, blinding, inter-observer agreement, team meetings, and supervision. To understand the usage of informatics tools and methods, a trend analysis was conducted using least-squares regression to test the significance of the increase in methodology use over the years. Finally, we randomly sampled 40 articles to conduct an authorship and affiliation analysis through manual review. The goal of this analysis was to understand whether reporting patterns varied with the first author's training background. As the focus of our study is observational research, we classified each first author as having an epidemiology background or another background using the affiliation information from PubMed. We defined the positive outcome as satisfying at least three activities from the framework (Boyd et al., Horwitz et al., and Gilbert et al.) and the negative outcome as satisfying fewer than three.
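A minimal sketch of the least-squares trend test described above, using scipy.stats.linregress; the yearly counts below are made up for illustration and are not the actual counts from the REP corpus.

```python
# Sketch: least-squares trend test for the yearly number of articles reporting
# informatics-related methods. Counts are illustrative placeholders only.
from scipy.stats import linregress

years  = list(range(1998, 2017))
counts = [1, 1, 2, 2, 3, 3, 4, 5, 5, 6, 7, 8, 9, 10, 11, 13, 14, 16, 18]   # illustrative

result = linregress(years, counts)
print(f"slope = {result.slope:.2f} articles/year, p-value = {result.pvalue:.2e}")
# A positive slope with a small p-value indicates a significant upward trend.
```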

Results and Discussion

Performance of Annotation and NLP

Three articles were removed because no full text was found. The average inter-annotator agreement (IAA) for the manual review of the 71 double-annotated articles was 0.863 (F-measure). The evaluation of the NLP system on the test articles is provided in Table 3. We found that identification of variables was the most challenging task. Many expressions either lacked context or used very specific terminology, such as referring directly to an inventor's name (e.g., Morris, a clinical rating scale assessment for dementia). The MedTaggerIE dictionary lookup from the UMLS Metathesaurus was able to provide additional contextual information such as medical concept and assessment type; however, the performance of the system on this task was capped by the comprehensiveness of the dictionary. Despite this limitation, the system achieved moderate to high performance across the nine tasks.

Table 3.

Performance of IAA and NLP System of Nine Different Tasks

Reporting Category Methodologic Events NLP (F-measure)
Participants Study population selection 0.716
Screening validation 0.866
Data linkage process 0.900
Participant follow-up (cohort study only) 0.888
Matching (matched studies only) 0.955
Variables Measurement and classification of variables 0.780
Validation of variables 0.759
Data sources Data collection 0.850
Data quality assessment 0.967

Analysis of Methodologic Reporting Patterns

Our analysis showed that manual chart review was the most frequently reported method for study population selection (51.92%) and case validation (7.97%), and the second most frequently reported method for data collection (4.14%). Electronic retrieval (i.e., querying) was the most frequently reported method for data collection (6.18%). However, a large number of articles did not specify what methods they used for various tasks; for example, only 49% of articles mentioned activities related to data collection. We believe this was due to the lack of reporting standards. For example, an expression such as "all clinical variables were either obtained electronically or from patient records" describes a potential data collection activity, but there were no related expressions discussing how exactly the data were collected, when the activity happened, or who conducted the abstraction. Furthermore, even among the sentences that mentioned abstractors, 77% used the pronoun "we" as an unspecified actor throughout the methods section.

In assessing the use of methodologic standards, 5% (61) of articles reported the use of a screening/data collection protocol, 24.0% (146) reported training for data abstraction, 6% (74) reported that abstractors were blinded, 4.5% (57) tested inter-observer agreement, 1.5% (19) reported that team meetings were organized for consensus building, and 0.8% (10) mentioned supervision activities by senior researchers. In comparison with the study conducted by Gilbert et al. in 1996, we found an increasing number of studies reporting the use of good methodologic practices when dealing with EHRs. However, no single methodologic standard had an adoption rate of 25% or greater among the 1,279 articles. Our author and affiliation analysis showed that papers whose first author had an epidemiology background were more likely to report good practices (Figure 2); however, the difference was not statistically significant (p-value = 0.118).

Figure 2. Comparison of author training background and literature reporting.

The trend analysis showed that a significantly increasing number of articles reported using informatics-related methods (e.g., electronic data capture, phenotype algorithms). Figure 3 shows an upward trend in the use of informatics methods since 1998 (p-value < 0.0001).

Figure 3. Number of articles using informatics methods.

Among these articles, we were able to identify 11 different phenotype algorithms from 14 articles that used computer-based algorithms for case ascertainment (Table 5).

Table 5.

Computer Based Case Ascertainment Algorithms

Computer Based Case Ascertainment Algorithms Articles
Interstitial lung disease 36
Myocardial infarction 37-39
Osteoarthritis 40
Fracture risk assessment 41
Antineutrophil cytoplasmic autoantibody –associated vasculitis 42
Vertebral deformities 43
Nonalcoholic fatty liver disease 44
White matter hyperintensity volume 45
Herpes zoster 46, 47
Cause of death 48
Heart failure 49, 50

Conclusion

In summary, our study demonstrated a process of using informatics to discover research reporting patterns and methodologic events from a series of papers based on the REP cohort. Our investigation discovered an upward trend in the reporting of research methodologies, good practices, and the utilization of informatics-related tools and methods for EHR-based clinical research. Despite these findings, methodologic standards were still consistently under-reported, and we observed high variation in clinical research reporting. Developing process frameworks, ontology models, and reporting guidelines for this context is recommended for future work.

Textbox 1.

NLP Rules for Extracting Methodologic Events

Reporting Category Methodologic Events NLP Rules
Participants Study population selection (study event)
  • [Cohort-related] & {auxpass} & [Population selection general] & {preposition case-marking element} & [Population selection specific]
  • [Population selection general | specific] & {preposition marker} & [Population selection specific] & ([Cohort-related | Eligibility])
  • [Data] & {auxpass} & [Population selection specific] & {preposition case-marking element} & [Study team/abstractor] & {preposition marker} & [Population selection general | specific]
Screening validation (validation of screening protocol, procedure, inclusion and exclusion criteria)
  • [Population selection concepts] & {auxpass} & [Validation]
  • [Cohort-related] & [Study population selection] & [Validation]
Data linkage process (study event)
  • ([EHR-related] | [Cohort-related]) & [Data linkage]
Participant follow-up (cohort study only)
  • [Cohort-related] & {auxpass} & [Follow up]
  • [Follow up] & [Cohort-related]
Matching (matched studies only)
  • [Matching] & [Cohort-related]
Variables Measurement and classification of variables (study and clinical event)
  • [UMLS Dictionary (medical concept | procedure)] & [Measurement]
Validation of variables (confirmation that a subject has certain characteristics)
  • [UMLS Dictionary (medical concept | procedure)] & [Measurement] & [Validation]
Data sources Data collection (study event)
  • [EHR-related] & {auxpass} & [Data collection] & {preposition case-marking element}
  • [Study team/abstractor] & [Data collection] & [EHR-related]
  • [Population selection general] & {preposition marker} & [Data collection] & [EHR-related]
Data quality assessment (validation of data collection tools, frameworks, protocols and methods)
  • [Data collection] & [Validation]

Acknowledgments

We gratefully thank the Mayo Foundation for Medical Education and Research, the Rochester Epidemiology Project, and the Mayo Clinic Center for Clinical and Translational Science (#8882575) for supporting this project.

References

  • 1.Vestbo J, Leather D, Diar Bakerly N, New J, Gibson JM, McCorkindale S. Effectiveness of fluticasone furoate–vilanterol for COPD in clinical practice. New England Journal of Medicine. 2016;375(13):1253–60. doi: 10.1056/NEJMoa1608033. [DOI] [PubMed] [Google Scholar]
  • 2.Abul-Husn NS, Cheng X, Li AH, Xin Y, Schurmann C, Stevis P. A protein-truncating HSD17B13 variant and protection from chronic liver disease. New England Journal of Medicine. 2018;378(12):1096–106. doi: 10.1056/NEJMoa1712191. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 3.Dhossche DM, Ulusarac A, Syed W. A retrospective study of general hospital patients who commit suicide shortly after being discharged from the hospital. Archives of internal medicine. 2001;161(7):991–4. doi: 10.1001/archinte.161.7.991. [DOI] [PubMed] [Google Scholar]
  • 4.Kahn MG, Raebel MA, Glanz JM, Riedlinger K, Steiner JF. Medical care. 2012. A pragmatic framework for single-site and multisite data quality assessment in electronic health record-based clinical research; p. 50. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 5.Zozus MN, Richesson RL, Walden A, Tenenbaum JD, Hammond WE. AMIA Jt Summits Transl Sci Proc 2016. 2016. Research Reproducibility in Longitudinal Multi-Center Studies Using Data from Electronic Health Records; pp. 279–85. [PMC free article] [PubMed] [Google Scholar]
  • 6.Gearing RE, Mian IA, Barber J, Ickowicz A. A methodology for conducting retrospective chart review research in child and adolescent psychiatry. Journal of the Canadian Academy of Child and Adolescent Psychiatry. 2006;15(3):126. [PMC free article] [PubMed] [Google Scholar]
  • 7.Gilbert EH, Lowenstein SR, Koziol-McLain J, Barta DC, Steiner J. Chart reviews in emergency medicine research: where are the methods? Annals of emergency medicine. 1996;27(3):305–8. doi: 10.1016/s0196-0644(96)70264-0. [DOI] [PubMed] [Google Scholar]
  • 8.Murphy SN, Mendis ME, Berkowitz DA, Kohane I, Chueh HC, editors. AMIA Annual Symposium Proceedings. American Medical Informatics Association; 2006. Integration of clinical and genetic data in the i2b2 architecture. [PMC free article] [PubMed] [Google Scholar]
  • 9.Overby CL, Weng C, Haerian K, Perotte A, Friedman C, Hripcsak G. AMIA Summits on Translational Science Proceedings 2013. 2013. Evaluation considerations for EHR- based phenotyping algorithms: A case study for drug-induced liver injury; p. 130. [PMC free article] [PubMed] [Google Scholar]
  • 10.Arruda‐Olson AM, Afzal N, Priya Mallipeddi V, Said A, Moussa Pacha H, Moon S. Leveraging the electronic health record to create an automated real‐time prognostic tool for peripheral arterial disease. Journal of the American Heart Association. 2018;7(23):e009680. doi: 10.1161/JAHA.118.009680. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 11.Manrai AK, Patel CJ, Gehlenborg N, Tatonetti NP, Ioannidis JP, Kohane IS. Methods to Enhance the Reproducibility of Precision Medicine. Pacific Symposium on Biocomputing Pacific Symposium on Biocomputing. 2016;21:180–2. [PMC free article] [PubMed] [Google Scholar]
  • 12.Prinz F, Schlange T, Asadullah K. Believe it or not: how much can we rely on published data on potential drug targets? Nature reviews Drug discovery. 2011;10(9):712. doi: 10.1038/nrd3439-c1. [DOI] [PubMed] [Google Scholar]
  • 13.Ioannidis JP. Acknowledging and overcoming nonreproducibility in basic and preclinical research. Jama. 2017;317(10):1019–20. doi: 10.1001/jama.2017.0549. [DOI] [PubMed] [Google Scholar]
  • 14.Begley CG, Ellis LM. Drug development: Raise standards for preclinical cancer research. Nature. 2012;483(7391):531. doi: 10.1038/483531a. [DOI] [PubMed] [Google Scholar]
  • 15.Mobley A, Linder SK, Braeuer R, Ellis LM, Zwelling L. A survey on data reproducibility in cancer research provides insights into our limited ability to translate findings from the laboratory to the clinic. PloS one. 2013;8(5):e63221. doi: 10.1371/journal.pone.0063221. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 16.Begley CG, Buchan AM, Dirnagl U. Institutions must do their part for reproducibility: the funding to verified good institutional practice, and robust science will shoot up the agenda. Nature. 2015;525(7567):25–8. doi: 10.1038/525025a. [DOI] [PubMed] [Google Scholar]
  • 17.Sim I, Tu SW, Carini S, Lehmann HP, Pollock BH, Peleg M. The Ontology of Clinical Research (OCRe): an informatics foundation for the science of clinical research. Journal of biomedical informatics. 2014;52:78–91. doi: 10.1016/j.jbi.2013.11.002. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 18.Tenenbaum JD, Whetzel PL, Anderson K, Borromeo CD, Dinov ID, Gabriel D. The Biomedical Resource Ontology (BRO) to enable resource discovery in clinical and translational research. Journal of biomedical informatics. 2011;44(1):137–45. doi: 10.1016/j.jbi.2010.10.003. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 19.Kong YM, Dahlke C, Xiang Q, Qian Y, Karp D, Scheuermann RH. Toward an ontology-based framework for clinical research databases. Journal of biomedical informatics. 2011;44(1):48–58. doi: 10.1016/j.jbi.2010.05.001. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 20.Sahoo SS, Valdez J, Rueschman M, editors. AMIA Annual Symposium Proceedings. American Medical Informatics Association; 2016. Scientific reproducibility in biomedical research: provenance metadata ontology for semantic annotation of study description. [PMC free article] [PubMed] [Google Scholar]
  • 21.Valdez J, Kim M, Rueschman M, Socrates V, Redline S, Sahoo SS, editors. AMIA Annual Symposium Proceedings. American Medical Informatics Association; 2017. ProvCaRe semantic provenance knowledgebase: evaluating scientific reproducibility of research studies. [PMC free article] [PubMed] [Google Scholar]
  • 22.Ross J, Tu S, Carini S, Sim I. Analysis of eligibility criteria complexity in clinical trials. Summit on Translational Bioinformatics 2010. 2010;46 [PMC free article] [PubMed] [Google Scholar]
  • 23.St. Sauver JL, Grossardt BR, Yawn BP, Melton III LJ, Rocca WA. Use of a medical records linkage system to enumerate a dynamic population over time: the Rochester epidemiology project. American journal of epidemiology. 2011;173(9):1059–68. doi: 10.1093/aje/kwq482. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 24.Altman DG, Simera I, Hoey J, Moher D, Schulz K. EQUATOR: reporting guidelines for health research. The Lancet. 2008;371(9619):1149–50. doi: 10.1016/S0140-6736(08)60505-X. [DOI] [PubMed] [Google Scholar]
  • 25.Benchimol EI, Smeeth L, Guttmann A, Harron K, Moher D, Petersen I. The REporting of studies Conducted using Observational Routinely-collected health Data (RECORD) statement. PLoS medicine. 2015;12(10):e1001885. doi: 10.1371/journal.pmed.1001885. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 26.Von Elm E, Altman DG, Egger M, Pocock SJ, Gøtzsche PC, Vandenbroucke JP. The Strengthening the Reporting of Observational Studies in Epidemiology (STROBE) statement: guidelines for reporting observational studies. Annals of internal medicine. 2007;147(8):573–7. doi: 10.7326/0003-4819-147-8-200710160-00010. [DOI] [PubMed] [Google Scholar]
  • 27.Boyd N, Pater J, Ginsburg A, Myers R. Observer variation in the classification of information from medical records. Journal of Chronic Diseases. 1979;32(4):327–32. [Google Scholar]
  • 28.Horwitz RI, Eunice CY. Assessing the reliability of epidemiologic data obtained from medical records. Journal of Clinical Epidemiology. 1984;37(11):825–31. doi: 10.1016/0021-9681(84)90015-8. [DOI] [PubMed] [Google Scholar]
  • 29.Leech G. Corpus annotation schemes. Literary and linguistic computing. 1993;8(4):275–81. [Google Scholar]
  • 30.Kuhn HW. The Hungarian method for the assignment problem. Naval research logistics quarterly. 1955;2(1‐2):83–97. [Google Scholar]
  • 31.Stubbs A, editor. Proceedings of the 5th Linguistic Annotation Workshop. Association for Computational Linguistics; 2011. MAE and MAI: lightweight annotation and adjudication tools. [Google Scholar]
  • 32.Liu H, Bielinski SJ, Sohn S, Murphy S, Wagholikar KB, Jonnalagadda SR. AMIA Summits on Translational Science Proceedings 2013. 2013. An information extraction framework for cohort identification using electronic health records; p. 149. [PMC free article] [PubMed] [Google Scholar]
  • 33.Ferrucci D, Lally A. UIMA: an architectural approach to unstructured information processing in the corporate research environment. Nat Lang Eng. 2004;10(3-4):327–48. [Google Scholar]
  • 34.Denny JC, Spickard III A, Johnson KB, Peterson NB, Peterson JF, Miller RA. Evaluation of a method to identify and categorize section headers in clinical documents. Journal of the American Medical Informatics Association. 2009;16(6):806–15. doi: 10.1197/jamia.M3037. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 35.Manning C, Surdeanu M, Bauer J, Finkel J, Bethard S, McClosky D, editors. Proceedings of 52nd annual meeting of the association for computational linguistics: system demonstrations. 2014. The Stanford CoreNLP natural language processing toolkit. [Google Scholar]
  • 36.Bongartz T, Nannini C, Medina‐Velasquez YF, Achenbach SJ, Crowson CS, Ryu JH. Incidence and mortality of interstitial lung disease in rheumatoid arthritis: a population‐based study. Arthritis & Rheumatism. 2010;62(6):1583–91. doi: 10.1002/art.27405. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 37.Hellermann JP, Jacobsen SJ, Redfield MM, Reeder GS, Weston SA, Roger VL. Heart failure after myocardial infarction: clinical presentation and survival. European journal of heart failure. 2005;7(1):119–25. doi: 10.1016/j.ejheart.2004.04.011. [DOI] [PubMed] [Google Scholar]
  • 38.Gerber Y, Weston SA, Killian JM, Jacobsen SJ, Roger VL. Sex and classic risk factors after myocardial infarction: a community study. American heart journal. 2006;152(3):461–8. doi: 10.1016/j.ahj.2006.02.003. [DOI] [PubMed] [Google Scholar]
  • 39.Roger VL, Killian J, Henkel M, Weston SA, Goraya TY, Yawn BP. Coronary disease surveillance in Olmsted County objectives and methodology. Journal of clinical epidemiology. 2002;55(6):593–601. doi: 10.1016/s0895-4356(02)00390-6. [DOI] [PubMed] [Google Scholar]
  • 40.Gabriel SE, Crowson CS, O'Fallon WM. A mathematical model that improves the validity of osteoarthritis diagnoses obtained from a computerized diagnostic database. Journal of clinical epidemiology. 1996;49(9):1025–9. doi: 10.1016/0895-4356(96)00115-1. [DOI] [PubMed] [Google Scholar]
  • 41.Kanis J, Johnell O, Odén A, Johansson H, McCloskey E. FRAX™ and the assessment of fracture probability in men and women from the UK. Osteoporosis international. 2008;19(4):385–97. doi: 10.1007/s00198-007-0543-5. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 42.Watts R, Lane S, Hanslik T, Hauser T, Hellmich B, Koldingsnes W. Development and validation of a consensus methodology for the classification of the ANCA-associated vasculitides and polyarteritis nodosa for epidemiological studies. Annals of the rheumatic diseases. 2007;66(2):222–7. doi: 10.1136/ard.2006.054593. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 43.Berti A, Cornec D, Crowson CS, Specks U, Matteson EL. The Epidemiology of Antineutrophil Cytoplasmic Autoantibody–Associated Vasculitis in Olmsted County, Minnesota: A Twenty‐Year US Population– Based Study. Arthritis & Rheumatology. 2017;69(12):2338–50. doi: 10.1002/art.40313. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 44.Melton L, Wenger D, Atkinson E, Achenbach S, Berquist T, Riggs B. Influence of baseline deformity definition on subsequent vertebral fracture risk in postmenopausal women. Osteoporosis international. 2006;17(7):978–85. doi: 10.1007/s00198-006-0106-1. [DOI] [PubMed] [Google Scholar]
  • 45.Raz L, Jayachandran M, Tosakulwong N, Lesnick TG, Wille SM, Murphy MC. Thrombogenic microvesicles and white matter hyperintensities in postmenopausal women. Neurology. 2013;80(10):911–8. doi: 10.1212/WNL.0b013e3182840c9f. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 46.Yawn BP, Saddier P, Wollan PC, Sauver JLS, Kurland MJ, Sy LS, editors. Mayo Clinic Proceedings. 2007. A population-based study of the incidence and complication rates of herpes zoster before zoster vaccine introduction. Elsevier. [DOI] [PubMed] [Google Scholar]
  • 47.Kwon HJ, Bang DW, Kim EN, Wi C-I, Yawn BP, Wollan PC. Asthma as a risk factor for zoster in adults: A population-based case-control study. Journal of Allergy and Clinical Immunology. 2016;137(5):1406–12. doi: 10.1016/j.jaci.2015.10.032. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 48.Lopez-Jimenez F, Jacobsen SJ, Reeder GS, Weston SA, Meverden RA, Roger VL. Prevalence and secular trends of excess body weight and impact on outcomes after myocardial infarction in the community. Chest. 2004;125(4):1205–12. doi: 10.1378/chest.125.4.1205. [DOI] [PubMed] [Google Scholar]
  • 49.Chamberlain AM, Redfield MM, Alonso A, Weston SA, Roger VL. Atrial fibrillation and mortality in heart failure: a community study. Circulation: Heart Failure. 2011;4(6):740–6. doi: 10.1161/CIRCHEARTFAILURE.111.962688. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 50.Bielinski SJ, Pathak J, Carrell DS, Takahashi PY, Olson JE, Larson NB. A robust e-epidemiology tool in phenotyping heart failure with differentiation for preserved and reduced ejection fraction: the electronic medical records and genomics (eMERGE) network. Journal of cardiovascular translational research. 2015;8(8):475–83. doi: 10.1007/s12265-015-9644-2. [DOI] [PMC free article] [PubMed] [Google Scholar]
