2016 Apr 23;24(1):218–226. doi: 10.1093/jamia/ocw046

An appraisal of published usability evaluations of electronic health records via systematic review

Marc A Ellsworth 1,2, Mikhail Dziadzko 2,3, John C O'Horo 2,4, Ann M Farrell 5, Jiajie Zhang 6, Vitaly Herasevich 2,3
PMCID: PMC7654077  PMID: 27107451


Objective: In this systematic review, we aimed to evaluate methodological and reporting trends present in the current literature by investigating published usability studies of electronic health records (EHRs).

Methods: A literature search was conducted for articles published through January 2015 using MEDLINE (Ovid), EMBASE, Scopus, and Web of Science, supplemented by citation and reference list reviews. Studies were included if they tested the usability of hospital and clinic EHR systems in the inpatient, outpatient, emergency department, or operating room setting.

Results: A total of 4848 references were identified for title and abstract screening. Full text screening was performed for 197 articles, with 120 meeting the criteria for study inclusion.

Conclusion: A review of the literature demonstrates a paucity of quality published studies describing scientifically valid and reproducible usability evaluations at various stages of EHR system development. A lack of formal and standardized reporting of EHR usability evaluation results is a major contributor to this knowledge gap, and efforts to improve this deficiency will be one step toward moving the field of usability engineering forward.

Keywords: electronic health records, health information technology, human factors, usability


Using electronic health records (EHRs) is a key component of a comprehensive strategy to improve healthcare quality and patient safety.1 The incentives provided by the Meaningful Use program are intended to encourage increased adoption of EHRs as well as more interactions between EHRs and medical providers.2 As such, there has never been a greater need for effective usability evaluations of EHR systems, both to prevent the implementation of suboptimal EHR systems and to improve EHR interfaces for healthcare provider use. Compromised EHR system usability can have a number of significant negative implications in a clinical setting, such as use errors that can potentially cause patient harm and an attenuation of EHR adoption rates.1,3 As a result, the National Institute of Standards and Technology has provided practical guidance for the vendor community on how to perform user-centered design and diagnostic usability testing to improve the usability of EHR systems currently under development.1,2,4 The Office of the National Coordinator for Health Information Technology has even established user-centered design requirements that must be met before vendor EHRs receive certification.5

Despite the pressing need for usability evaluations, knowledge gaps remain regarding how to successfully execute such an assessment. There is currently little guidance on how to perform systematic evaluations of EHRs and report findings and insights that can guide future usability studies.6,7 Identifying these gaps and reporting on the current methodology and outcome reporting practices is the first step in moving the field toward adopting a more unified and generalizable process of studying EHR system usability. In the present systematic review, we aimed to evaluate methodological and reporting trends present in the current literature by assessing published usability studies of EHRs. We reviewed the different engineering methods employed for usability testing in these studies as well as the distribution of medical domains, end-user profiles, development phases, and objectives for each method used.


We followed the standards set forth by the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA-P) 2015 Initiative8 and the Institute of Medicine's Standards for Systematic Reviews9 to conduct this study.

Institutional Review and Human Subject Determination

The present study was exempted from approval by the Mayo Clinic institutional review board, because it did not involve active human subject research. No individual patients participated in this study.

Data Sources

Our medical reference librarian (A.M.F.) designed the search strategy and literature search with input from the investigators. We searched MEDLINE (Ovid), EMBASE, Scopus, and Web of Science for articles published from database inception through January 2015. Duplicates were removed automatically using EndNote, and the remaining articles were then compared manually using the author, year, title, journal, volume, and pages to identify any additional duplicates. We increased the comprehensiveness of the database searches by looking for additional potentially relevant studies in the cited references of the retrieved studies.
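The field-by-field duplicate comparison described above can be illustrated with a short sketch. This is not the procedure the authors ran (they used reference-manager software followed by manual comparison); it only shows one way that matching on author, year, title, journal, volume, and pages might be automated. The function names and the three sample records are hypothetical and exist purely for illustration.

```python
import re
from collections import defaultdict

# Fields named in the text as the basis for the manual duplicate check.
FIELDS = ("author", "year", "title", "journal", "volume", "pages")


def normalize(value):
    """Lowercase and collapse punctuation/whitespace so that trivial
    formatting differences do not hide an otherwise identical record."""
    return re.sub(r"[^a-z0-9]+", " ", str(value).lower()).strip()


def duplicate_groups(records, fields=FIELDS):
    """Return lists of record indices that share the same normalized key."""
    groups = defaultdict(list)
    for index, record in enumerate(records):
        key = tuple(normalize(record.get(field, "")) for field in fields)
        groups[key].append(index)
    return [indices for indices in groups.values() if len(indices) > 1]


# Hypothetical records, not taken from the review's search results.
records = [
    {"author": "Doe J", "year": 2010, "title": "Usability of an EHR.",
     "journal": "J Example", "volume": "1", "pages": "1-10"},
    {"author": "doe j", "year": "2010", "title": "Usability of an EHR",
     "journal": "J Example", "volume": "1", "pages": "1-10"},
    {"author": "Roe A", "year": 2012, "title": "A different study",
     "journal": "J Example", "volume": "3", "pages": "20-29"},
]

print(duplicate_groups(records))
```

Exact-key matching of this kind would miss duplicates whose titles or page ranges are recorded differently across databases, which is one reason a manual pass remains useful.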

Search Terms

We performed the search by identifying literature that contained keywords in two or more of the following categories (expanded with appropriate Medical Subject Headings and other related terms): (1) electronic health records, (2) usability, and (3) usability testing methods.10 The final search strategy we used is available online in Supplementary Appendix A.
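The "two or more of three categories" rule can be sketched as a simple filter. The term lists below are assumptions chosen only to make the example run; the actual search strategy, with its Medical Subject Headings and related terms, is the one given in Supplementary Appendix A and was executed in the bibliographic databases, not in code like this.

```python
# Hypothetical, heavily abbreviated term lists for the three categories.
CATEGORIES = {
    "electronic health records": ["electronic health record",
                                  "electronic medical record"],
    "usability": ["usability", "user-centered design"],
    "usability testing methods": ["think-aloud", "heuristic",
                                  "cognitive walkthrough"],
}


def matched_categories(text):
    """Return the names of the categories with at least one term in text."""
    lowered = text.lower()
    return [name for name, terms in CATEGORIES.items()
            if any(term in lowered for term in terms)]


def meets_search_rule(text, minimum=2):
    """Apply the rule that keywords must appear in `minimum` or more categories."""
    return len(matched_categories(text)) >= minimum


# Hypothetical titles used only to exercise the rule.
print(meets_search_rule("Heuristic evaluation of an electronic health record"))
print(meets_search_rule("Usability of a mobile fitness app"))
```

Plain substring matching is used here for brevity; a real database query would rely on controlled vocabulary and field-specific operators.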

Inclusion and Exclusion Criteria


We included studies that tested the usability of hospital and clinic EHR systems in the inpatient, outpatient, emergency department, or operating room settings. We excluded usability studies of medical devices, web-based applications, mobile devices, dental records, and personal health records.11


We included studies that performed usability testing on any of the following EHR components: electronic medical record (ie, document manager, lab viewer, etc.), computerized physician order entry, clinical decision support tools, and anesthesia information management system.


Only English-language studies were included in the review.

Article Selection

One of the authors (M.A.E.) independently screened the titles and abstracts of all the citations retrieved by the final search strategy (4848) to determine whether an abstract was "potentially relevant" or "not relevant" to the subject at hand. Abstracts were considered "not relevant" if they were not about a medical subject or were overall grossly unrelated to the study topic. Using this screening process, 543 studies were considered to be "potentially relevant." At least two of three reviewers (M.A.E., J.C.O., and M.D.) further screened each of the 543 studies' abstracts to determine candidates for full text review. Studies were excluded from full text review if they did not pertain to EHRs. Of these, 197 studies underwent a subsequent full text review, performed by the same three reviewers, with 120 articles ultimately selected for data extraction.12–131 A summary of this review process is presented in Figure 1. Any conflicts that arose in the screening and review processes were resolved by discussion between the conflicting reviewers. Article screening was coordinated and performed using the online systematic review tool Covidence (Alfred Health, Monash University, Melbourne, Australia).132,133
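The counts reported for each screening stage imply the number of records excluded at each step. The sketch below derives those figures from the four counts given in the text; the stage labels and the function name are illustrative choices, and no count beyond those four comes from the article.

```python
# Counts at each stage, as reported in the text.
stages = [
    ("citations screened by title and abstract", 4848),
    ("potentially relevant after first-pass screening", 543),
    ("selected for full text review", 197),
    ("included for data extraction", 120),
]


def flow_summary(stages):
    """For each transition, report how many records were kept and excluded,
    and what share of the initial set the kept records represent."""
    initial = stages[0][1]
    rows = []
    for (_, before), (label, after) in zip(stages, stages[1:]):
        rows.append({
            "stage": label,
            "kept": after,
            "excluded": before - after,
            "kept_pct_of_initial": round(100 * after / initial, 1),
        })
    return rows


for row in flow_summary(stages):
    print(row)
```

Run on the reported counts, this gives 4305, 346, and 77 exclusions at the three successive steps, with the 120 included studies amounting to roughly 2.5% of the citations originally retrieved.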

Figure 1.

Study selection flow chart.

Data Abstraction

Comprehensive data extraction was performed by one of the authors (M.A.E.) on the articles that passed the full screening review. We extracted data relating to country, objective, setting, component tested, type of evaluation, description of evaluators and subjects (prescriber: physician, advanced practice nurse, physician assistant; non-prescriber: registered nurse, pharmacist, support staff), and usability methods employed.


Characteristics of Included Studies

Table 1 summarizes the characteristics of the studies included in this systematic review. The majority of the studies were conducted in the United States (53%) or the Netherlands (8%), and no other country had more than four articles. Most studies had a summative (80%) outcome design and evaluated EHRs in either a mixed (42%) or outpatient (35%) clinical setting.

Table 1.

Characteristics of Included Studies

Characteristic n (%)
Total 120 (100)
Country
 United States 64 (53)
 Netherlands 10 (8)
 Canada 4 (3)
 Norway 4 (3)
 Other^a 38 (32)
Study Objective
 Summative 96 (80)
 Formative 20 (17)
 Both 4 (3)
Setting
 Mixed 50 (42)
 Outpatient 42 (35)
 Hospital wards 13 (11)
 Emergency department 6 (5)
 Intensive care unit 3 (2)
 Operating room 3 (2)
 Other 3 (2)

^a Countries with ≤3 studies, totaled.

Characteristics of Usability Evaluations

Table 2 summarizes the characteristics of the usability evaluations performed in the included studies. A slight majority (56%) of the studies were performed on an electronic medical record, and clinical decision support tools (21%) and computerized physician order entries (17%) were the next most common study subjects. A majority of the studies were performed as implementation/post-implementation evaluations (64%), and requirements/development evaluations (8%) were the least often performed in the studies reviewed. More than two-thirds of the evaluations involved clinical prescribers (45%, prescribers only; 23%, prescribers and non-prescribers), and 15 (13%) studies failed to give a description of the evaluation subjects. Only 29% of the studies assessed included a description of the study evaluators responsible for designing and carrying out the usability evaluation.

Table 2.

Characteristics of Usability Evaluations

Characteristic n (%)
Component Tested
 Electronic medical record 67 (56)
 Clinical decision support tool 25 (21)
 Computerized physician order entry 20 (17)
 Anesthesia information management system 2 (1)
 Mixed 6 (5)
Type of Evaluation
 Implementation/post-implementation 77 (64)
 Prototype 18 (15)
 Requirements/development 9 (8)
 Mixed 16 (13)
Evaluation Subjects
 Prescribers 54 (45)
 Non-prescribers 23 (19)
 Both 28 (23)
 Unknown 15 (13)
Description of Evaluators Included?
 Yes 35 (29)
 No 85 (71)

Usability Methods Utilized

Supplementary Appendix B provides the list of usability methods used for categorization.4,10 The 120 studies analyzed for this review were categorized based on these different analysis methods, with many studies utilizing more than one method. Table 3 and Figure 2 provide tabular and graphical breakdowns of the relative frequency of use of each usability analysis method, stratified by the four different EHR evaluation types (requirements/development, prototype, implementation/post-implementation, and mixed). The most frequent methods used were survey (37%) and think-aloud (19%), which, combined, accounted for more than half of all the usability evaluations we reviewed. These two methods were the first- and second-most commonly used techniques in each type of evaluation.

Table 3.

Usability Methods Utilized

Type of Evaluation: R/D, Prototype, I/PI, Mixed
Usability Method R/D Prototype I/PI Mixed Total, n (%)
Survey 2 7 50 10 69 (37)
Think-aloud 2 6 19 9 36 (19)
Interview 2 5 12 4 23 (12)
Heuristics 2 4 8 3 17 (9)
Cognitive walkthrough 2 2 8 1 13 (7)
Focus group 1 3 5 9 (5)
Task analysis 1 1 2 2 6 (3)
Clinical workflow analysis 1 1 3 5 (3)
Card sort 1 1 2 (1)
KLM 1 1 2 (1)
TURF/UFuRT 1 1 2 (1)
Brainstorm 1 1 (<1)
GOMS 1 1 (<1)
Total, n (%) 14 (8) 28 (15) 108 (58) 36 (19) 186

GOMS, goals, operators, methods, and selections; KLM, keystroke-level model; I/PI, implementation/post-implementation; R/D, requirements/development; TURF/UFuRT, tasks, users, representations, and functions.
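The percentages in Table 3 can be checked against the reported totals. The sketch below uses only the row totals and column totals printed in the table; the variable names are illustrative, and shares are rounded to one decimal place, so they may differ slightly from the whole-number percentages shown in the table.

```python
# Row totals (method applications) as printed in Table 3.
method_totals = {
    "Survey": 69,
    "Think-aloud": 36,
    "Interview": 23,
    "Heuristics": 17,
    "Cognitive walkthrough": 13,
    "Focus group": 9,
    "Task analysis": 6,
    "Clinical workflow analysis": 5,
    "Card sort": 2,
    "KLM": 2,
    "TURF/UFuRT": 2,
    "Brainstorm": 1,
    "GOMS": 1,
}

# Column totals (by type of evaluation) as printed in Table 3.
evaluation_totals = {"R/D": 14, "Prototype": 28, "I/PI": 108, "Mixed": 36}

grand_total = sum(method_totals.values())

# The two sets of totals should agree on the overall count.
assert grand_total == sum(evaluation_totals.values())

# Share of all method applications accounted for by each method.
shares = {method: round(100 * count / grand_total, 1)
          for method, count in method_totals.items()}

combined_top_two = round(
    100 * (method_totals["Survey"] + method_totals["Think-aloud"]) / grand_total, 1
)

print(grand_total)
print(shares)
print(combined_top_two)
```

Because many studies used more than one method, the denominator here is the 186 method applications, not the 120 included studies.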

Figure 2.

Usability method distribution by type of evaluation. R/D, requirements/development; I/PI, implementation/post-implementation.

Of the 69 studies in which surveys were performed, 33 (48%) used surveys that were developed specifically for the study or were not described in the article's text. The most common existing and validated survey used was the System Usability Scale134 (20%), followed closely by the Questionnaire for User Interaction Satisfaction135 (16%).

The heuristics method was used in 17 studies, with Nielsen's Usability Heuristics136 being cited as the basis of the evaluation in 10 (59%) of these studies. Study-specific heuristic methods or methods that were not described in the article's text accounted for 35% of the heuristic evaluations.

Objective Data

Twenty-eight (23%) of the included studies reported objective data that were obtained in addition to the formal usability evaluation. These data included time to task completion, task completion accuracy, usage rates, mouse clicks, and cognitive workload.

Publication Year

Included studies were published between the years 1991 and 2015, although only six studies were published before 2000 (Figure 3). When analyzed by 5-year time increments, there were no obvious differences with regard to study objective, components tested, stage of development, or usability method used in the studies reviewed.

Figure 3.

Distribution of included studies, stratified by year. The number of studies that were published within each 5-year increment is in parentheses.


The goal of this systematic review was to describe the range and characteristics of published EHR evaluation studies that included accepted usability analysis or engineering methods.10,137 Even though our initial search identified nearly 5000 potential studies, only a very small fraction of these truly applied usability evaluation standards and were therefore eligible for our review.

The majority of studies included in the systematic review had a summative study objective and were performed late in the EHR system design cycle, either during or after the system's implementation. This is consistent with previous findings7,138 and sheds light on the lack of EHR evaluations performed early on or throughout the design process, when usability issues can be more readily identified and rectified. Often, the responsibility of conducting evaluations early in the EHR design process is placed on commercial EHR vendors, and relying on their diligence is important for successful EHR implementation and adoption. However, a recent evaluation of the largest EHR vendors revealed that less than half are conducting industry-standard5 usability evaluations, and a significant number of vendors are not even employing usability staff to carry out these assessments.139,140

We found that the usability method most often employed in published evaluations of EHR systems is to survey or distribute questionnaires among end-users. Although surveys are useful for gathering self-reported data about the user's perception of how useful, usable, and satisfying141 an EHR system is, they do not allow evaluators to identify individual usability problems that can be targeted for improvement, a process that is at the core of usability evaluations.2,10,43,142 Furthermore, of all the survey evaluations identified, almost half did not use validated surveys or failed to describe the survey creation methods in a way that allows the reader to assess the reliability and generalizability of the studies' results.

As with the survey-based studies, we found that studies using other, sometimes more complicated, usability analysis methods also often neglected to provide a clear, detailed description of their study design.2,138 For example, a significant number of the heuristic evaluation reports we reviewed did not describe what heuristics were used or the methods behind developing the study-specific heuristics. In addition, interview, focus group, and think-aloud evaluations almost universally lacked specific details about the techniques used to moderate participant sessions143 and the expertise or qualifications of the moderator.4 These omissions prevent the reader from being able to assess what biases or unintended consequences, which could possibly affect the reported outcomes, may have been introduced into the study design.

Although the majority of studies we reviewed provided a description of the subjects who tested the EHR system's usability, these studies less consistently described the evaluators responsible for designing and carrying out the usability evaluation. Often the reader is not informed of what the expertise level and domain experience are of those performing the evaluations. Many usability evaluation methods are complex and multifaceted, and evaluators who have usability, domain, and/or local systems expertise are critical for an effective evaluation.1,43 This lack of a consistent background reporting framework limits the reader's ability to appraise the reliability of the evaluation or validate its outcomes.6

The need to improve usability standardization and transparency has never been greater, especially since this issue has been garnering attention from academic,11 industry,139,140 and media144,145 sources. The pressure from industry to improve the usability of EHRs should be systematically aligned with policy-level certification and requirement standards, because recent data show variability among EHR vendors regarding what actually constitutes "user-centered design" and how to carry out appropriate evaluations of EHR systems.140

This systematic review has some limitations, most of which stem from the heterogeneity of the studies reviewed. The lack of uniformity in study authors' descriptions of usability reporting and the methodology employed in the studies means it is possible that articles that met our inclusion criteria were overlooked and not included in the review. However, this emphasizes not only the need for formal usability evaluation performance standards1,2,5 but also for reporting and disseminating evaluation results.6 The exclusion of non-English-language studies may limit the generalizability of our findings to the worldwide effort to improve EHR usability evaluations. However, the general conclusions of this review are similar to those of previous international work done on this topic7,146 and help validate our search and study evaluation processes.

Additionally, it is important to remember that the purpose of a specific evaluation cannot be accurately determined by only assessing the types of methods used for the evaluation. This is especially relevant when multiple or mixed methods are employed to evaluate EHRs at various stages of their development. Thus, any conclusions about EHR evaluations are best made in the context of each unique evaluation, its primary purpose and goals, and the specific boundaries within which it was undertaken.


Usability evaluations of EHR systems are an important component of the recent push to put electronic tools in the hands of clinical providers. Both government and industry standards have been proposed to help guide these evaluations and improve the reporting of outcomes and relevant findings from them (Table 4). However, a review of the literature on EHR evaluations demonstrates a paucity of quality published studies describing scientifically valid and reproducible usability evaluations conducted at various stages of EHR system development and how findings from these evaluations can be used to inform others in similar efforts. The lack of formal and standardized reporting of usability evaluation results is a major contributor to this knowledge gap, and efforts to improve this deficiency will be one step toward moving the field of usability engineering forward.

Table 4.

Proposed Framework for Reporting Usability Evaluations

 Previous usability work
 Impetus for current study
Test Planning
 Test objectives (the questions the study is designed to answer)
 Test application
 Performance and satisfaction metrics
 Study evaluators
 Test environment/equipment
 Metrics (performance, issues-based, self-reported, behavioral)
 Audience insights
 Actionable improvements

Adapted from Schumacher and Lowry2 and

M.A.E.: study design, data collection, data analysis and interpretation, writing, manuscript editing; M.D.: study design, data collection, data analysis and interpretation, writing, manuscript editing; J.C.O.: study design, data collection, data analysis and interpretation, manuscript editing; A.M.F.: study design, data collection, writing, manuscript editing; J.Z.: study design, data analysis and interpretation, manuscript editing; V.H.: study design, data analysis and interpretation, manuscript editing.


J.C.O.'s time and participation in this study were funded by the Mayo Clinic's Robert D. and Patricia E. Kern Center for the Science of Health Care Delivery. No other funding was secured for this study.

Competing interests



