Abstract
Objective
Using heuristics to evaluate user experience is a common methodology for human-computer interaction studies. One challenge of this method is the inability to tailor results towards specific end-user needs. This manuscript reports on a method that uses validated scenarios and personas of older adults and care team members to enhance heuristic evaluations of the usability of commercially available personal health records for homebound older adults.
Materials and Methods
Our work extends the Chisnell and Redish heuristic evaluation methodology by using a protocol that relies on multiple expert reviews of each system. It further standardizes the heuristic evaluation process through the incorporation of task-based scenarios.
Results
We were able to use the modified version of the Chisnell and Redish heuristic evaluation methodology to identify potential usability challenges of two commercially available personal health record systems. This allowed us to: 1) identify potential usability challenges for specific types of users, 2) describe improvements that would be valuable to all end-users of the system, and 3) better understand how the interactions of different users may vary within a single personal health record.
Conclusions
The methodology described in this paper may help designers of consumer health information technology tools, such as personal health records, understand the needs of diverse end-user populations. Such methods may be particularly helpful when designing systems for populations that are difficult to recruit for end-user evaluations through traditional methods.
Keywords: aged, health records, personal, heuristics, home health nursing, patient portals, usability
1. Introduction
As patients are increasingly responsible for managing their own health and wellness, consumer health information technologies (CHIT) are becoming a critical component of health delivery systems. These tools are defined as “computer-based systems…designed to facilitate information access and exchange, enhance decision making, provide social and emotional support, and help behavior changes that promote health and well-being [1].” Older adults (aged 65 and older) who are homebound experience high levels of disease burden [2], which may increase the significance of CHIT as an important tool in supporting their health.
Although there is no universal definition, Medicare defines someone as “homebound” if they are unable to leave their place of residence without significant support from another person or from assistive devices [3]. Homebound older adults (HOAs) generally have more cognitive and functional impairments, and maintain more complex self-care routines than non-homebound individuals. One study found that 98% of HOAs had difficulty regularly performing at least one instrumental activity of daily living and, on average, required 9 to 12 medical provider visits per year [4].
Due to the complexity of their care, homebound individuals who remain in the community receive support from a variety of sources. Such services are often managed outside traditional office-based medical care, and are in most cases poorly coordinated with other medical providers [2, 4, 5]. The two most common home-based care programs designed to support HOAs are Medicare home health services and informal caregiving [5]. In the United States, over 3 million individuals receive home-based medical services through Medicare each year [5]. In addition, up to 15 million older adults in the United States receive informal home-based care services provided by family, friends, and volunteers [5]. These informal care services include homemaking, help with activities of daily living, and transportation to medical appointments. Due to the increasing number of older adults in the United States and the decreasing number of older adults who reside in facilities, experts anticipate that the need for Medicare home health and informal caregiver services will grow over the next several decades to support the growing number of HOAs [5].
Though CHIT may serve as an important tool to support health, recent literature unfortunately suggests that the adoption and acceptance of CHIT has been modest among many healthcare consumers, including older adults [1, 6, 7]. There are many potential reasons why adoption has been modest, and researchers have proposed investigating factors such as access to technology, patient care preferences, and disparities in care among racial and ethnic minority populations [8]. While these problems are important, one recent study suggests that even providing universal access to the Internet and related online technologies may not eliminate disparities in CHIT adoption among patient populations [8]. This study found that technology literacy plays an important role in CHIT adoption regardless of access to the technology and the Internet. The authors suggest that designing CHIT that is easy to use may help people with low technology and/or health literacy adopt these tools in situations where other barriers have been reduced [8]. Therefore, ‘usability’ should continue to be a focus for designers of CHIT to promote the use of their tools by the intended audience [1, 6, 9]. Usability describes “the extent to which a system is easy to use or ‘user friendly’” [10]. Common usability problems found in CHIT include a mismatch between functionality and end-user requirements, poorly designed user interfaces, and frequent functional errors [11]. Usability is important for all end-user groups; however, it is especially important for CHIT that may be used by older adults. Due to the natural aging process and an increased prevalence of chronic and acute diseases, older adults often have reduced cognitive or physical function that makes using technology more difficult than it is for other adult populations [12].
There are many methods that can be used to evaluate the usability of CHIT. Heuristic evaluations are a tool that helps researchers consider user perspectives through expert reviews, and they can be used to identify potentially problematic areas prior to an intervention study or end-user testing. For example, a previous PHR study performed a heuristic evaluation and recruited adults aged 18 to 55 for end-user testing and interviews. This study found that the heuristic evaluation was more effective at identifying technical usability issues than the end-user testing. Results from the heuristic evaluation were used to understand how technical issues affected end-user experience and overall perspectives of the PHR [13].
2. Objective
This paper describes a study that evaluates the usability of two commercially available CHITs, specifically Internet-based personal health record systems (PHRs). In this study, we use a methodology that combines a heuristic evaluation with personas and scenarios. We chose a heuristic methodology to gain an initial understanding of the perspectives of HOAs and their care team as the first step to understanding potential usability challenges with existing PHRs before conducting a larger PHR evaluation study with end-users. Our methodology extends the methods used by Chisnell and Redish in their usability review of 50 websites for older adults, commissioned by the AARP [14]. We further the work of Chisnell and Redish by using an evaluation protocol that relies on multiple expert reviews of each system, and standardizes the heuristic evaluation process using associated task-based scenarios. This manuscript describes our methodology, discusses the lessons learned from our evaluation, and suggests potential uses of this methodology in future work.
3. Methods
3.1. Identification of the Heuristic Evaluation Measures
We performed a literature review on heuristic evaluations for older adult users in PubMed and Embase, and identified five sets of heuristics that were developed to assess older adult usability of web-based technologies [14–18]. After evaluating these sets, we chose the Chisnell and Redish guidelines for older adult web users. This set of guidelines was the most appropriate for our use case because it allowed us to review each system from multiple user perspectives (homebound older adult, family caregiver, and home health nurse) and accounts for the wide variety of characteristics in these user populations. In addition, the designers of the original methodology are experts in both human-computer evaluation and the needs of older adult web users [14].
The Chisnell and Redish guidelines contain 20 heuristics that fit into four categories: interaction design, information architecture, visual design, and information design. In order to account for users with different levels of skill, motivation, and ability, personas are incorporated into the evaluation methodology. This heuristic methodology is unique in that it does not seek to produce an exhaustive list of heuristic violations, but focuses on identifying the most important problems that a persona may face when performing tasks within the system. In their report, Chisnell and Redish highlight that their methodology is intentionally different from a traditional expert-led heuristic evaluation, instead eliciting a list of the frequent problems that an average target user will face in the system. This methodology first asks the evaluator to record observations while using the website as the persona, and then to fit these observations into the heuristic framework [14].
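The structure described above can be pictured as a small data model: four categories, each holding heuristics that are scored per persona only after observations are recorded. A minimal sketch follows; the category assignments and most heuristic names here are illustrative placeholders, not the published Chisnell and Redish list.

```python
# Illustrative sketch of the evaluation structure: four categories,
# each with example heuristics (names/assignments are hypothetical,
# except a few drawn from the paper's own examples).
framework = {
    "interaction design": ["make links obvious", "provide feedback in other modes"],
    "information architecture": ["include a site map", "make it easy to find things"],
    "visual design": ["use adequate contrast", "make text readable"],
    "information design": ["make pages easy to skim", "visually group related topics"],
}

# One 1-4 score slot per (persona, heuristic) pair, filled in only
# after observations are matched to heuristics.
scores = {(persona, heuristic): None
          for persona in ("Alice", "Matthew", "Lisa")
          for heuristics in framework.values()
          for heuristic in heuristics}
```

This mirrors the two-pass process in the methodology: observations are gathered first, and the score slots are populated afterwards.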
3.2. Creation of Personas
Following the above methodology, our first step was to create personas for the end-users. A persona is a fictional characterization of a user that is meant to “capture the user’s mental model comprising of their expectations, prior experience, and anticipated behavior” [9]. To meet this goal, we conducted a literature review on HOAs, their caregivers, and nursing staff. We adapted our initial homebound older adult persona (‘Alice’) from a case study of a homebound older adult published by Leff et al. [2], our family caregiver persona (‘Matthew’) from a persona published by Chisnell and Redish [14], and our home health nurse persona (‘Lisa’) from home health nurses known to the authors (LK, HT, GD) from prior research activities.
The original Chisnell and Redish methodology advocates for a persona characteristic comparison to help the reviewers understand the main differences between the personas [14]. Our persona document includes a comparison between the personas on four dimensions: chronological age, degree of physical and cognitive limitations that affect using the Internet, expertise with computers and the Internet, and PHR aptitude (the degree to which the persona feels positive or negative towards incorporating the PHR into home health routines).
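A characteristic comparison like the one described above can be encoded directly, which makes it easy to confirm that every persona is described on the same four dimensions. In this sketch the dimension names follow the paper, but all of the values are invented for illustration and are not the validated study materials.

```python
# Hypothetical persona comparison on the four dimensions named in the
# paper; the specific values are illustrative, not the study's personas.
persona_comparison = {
    "Alice (homebound older adult)": {
        "chronological_age": 82,
        "limitations_affecting_internet_use": "high",
        "computer_internet_expertise": "low",
        "phr_aptitude": "hesitant",
    },
    "Matthew (family caregiver)": {
        "chronological_age": 54,
        "limitations_affecting_internet_use": "low",
        "computer_internet_expertise": "moderate",
        "phr_aptitude": "willing if it saves time",
    },
    "Lisa (home health nurse)": {
        "chronological_age": 45,
        "limitations_affecting_internet_use": "low",
        "computer_internet_expertise": "high",
        "phr_aptitude": "positive",
    },
}

# Collect the dimensions used across personas; a consistent comparison
# document uses the same dimensions for every persona.
dimensions = {d for traits in persona_comparison.values() for d in traits}
```

Keeping the comparison in a uniform structure is one way to surface gaps before reviewers rely on it during scoring.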
3.3. Search for Commercially Available Personal Health Records
We identified existing PHRs using MyPHR.com [19], a systematic literature review [20], and the webpages on Healthit.gov [21]. These sites were chosen to identify systems developed for both commercial and research purposes. Three authors (LK, SM, YC) assessed the PHRs identified in this search to ensure that the PHRs were active, met the basic functionality criteria, and were available to the reviewers at no cost. The functionality criteria were developed from a literature review on HOAs and a previously published study by the authors that examined commercially available personal health records using a published home-health case study [22]. The PHRs that performed in the top quartile of the functionality assessment were included in this heuristic evaluation: MyMedWall [23] and RememberItNow [24].
3.4. Initial Training of Reviewers
Three authors (LK, SM, YC) used Chisnell and Redish’s published report to train themselves on the heuristic evaluation methodology [14]. For training purposes, the reviewers assessed three PHRs that performed in the middle quartiles of the functionality assessment. The Microsoft Excel spreadsheet used for the heuristic evaluation can be found in Appendix A.
The original Chisnell and Redish methodology had one person review each system. This methodological choice allowed Chisnell and Redish to assess over 50 websites within the study period [14]. Previous heuristic evaluation methodologies have found that 40%–60% of known usability problems can be identified using three to five reviewers [25]. Although the goal of the Chisnell and Redish methodology is not to identify an exhaustive list, we wanted to ensure that we captured a comprehensive list of potential usability problems. Therefore, to improve the completeness and accuracy of our heuristic evaluation results, our protocol had all three reviewers evaluate each system. Because the original protocol did not prescribe a standard approach to what reviewers should prioritize when navigating the websites [14], using multiple reviewers for each system was at times challenging: each reviewer focused on different actions and functionalities when evaluating the system, and combining all of the reviewers’ results could be difficult. Therefore, informed by our literature review, we introduced scenarios as companions to the personas to guide reviewers’ assessment of the PHRs.
3.5. Creation of a PHR Scenario for Each Persona
Scenarios help designers understand the goals of end-user populations. Unlike personas, which promote understanding of end-user characteristics, scenarios highlight specific actions taken by individuals in the system. Due to the focus on actors and actions, scenarios promote work-oriented design and a focus on specific functionalities [26]. An initial scenario was created for the HOA persona based on a previous literature review of PHR use among older adults [20]. Our scenarios for the caregiver and home health nurse were created using previous PHR literature reviews that were focused on general adult population use [27–29] and descriptions of home health scenarios [2, 30]. The scenarios depicted a homebound older adult, family caregiver, and home health nurse performing a series of care management tasks relevant to HOAs after a hospitalization. These tasks included reviewing medication lists, entering patient-reported outcome data, and accessing documentation from previous medical visits.
3.6. Validation of Personas and Scenarios
After developing the initial materials, we recruited five practicing home health nurses in the United States using online home health nursing forums. The primary author (LK) conducted phone interviews with the home health nurses to validate the personas and scenarios. Participants were offered a $10 gift card for their time. All participants had been practicing as Medicare-certified home health nurses for longer than one year, and provided care in a variety of settings including rural communities and large urban centers. Participants were located in the Northeast, Midwest, and Southern regions of the United States.
The home health nurse participants were asked to review all three personas and scenarios prior to the phone call. During the interview, participants were asked to provide feedback on how realistic the personas and scenarios were based on their professional experience, and to offer suggestions on how to improve the accuracy of the scenarios and personas. Interviews were audiotaped and transcribed. Inductive coding was completed on the interview transcriptions to identify commonalities between the participant opinions. Participant feedback was incorporated into the final personas and scenarios. Feedback was overall positive, and many home health nurses expressed seeing potential value in a PHR for their work. The thematic concerns that arose from the initial materials were: the underestimation of the involvement of informal caregivers, the desire to have more information about pain between home care visits, and the perceived ability and/or willingness of HOAs to manage their own care. Based on the feedback from these interviews, a few changes were made: 1) the involvement of the family caregiver persona (‘Matthew’) in his father’s care was increased, 2) the ‘add a medication’ task was moved from the HOA scenario to the family caregiver scenario, and 3) documenting pain (an important patient-reported outcome for home health nurses) was added to the older adult scenario. The final versions of the persona and scenario materials are available in Appendix B.
3.7. Conducting the Heuristic Evaluation
Reviewers (LK, SM, YC) conducted the heuristic evaluation on the two PHRs described earlier. To reduce learning effects from performing multiple scenarios on the same system, the reviewers evaluated both systems using one persona and scenario at a time before moving on to the next persona, leaving at least a two-week period between personas. Instructions given to the evaluators can be seen in Appendix C.
Each system was scored on two different criteria: scenario task completion and adherence to the heuristics. Following the original methodology, the reviewers read the appropriate persona prior to starting each round of heuristic analysis to understand the user’s strengths and weaknesses. The reviewers then followed the scenarios, embodying the associated personas, and recorded detailed observations of their experiences performing the scenario tasks. In our study, two reviewers used bullet points to record their observations for each task and one reviewer chose to write narratives to reflect his experience as the persona. After the observations were complete, each reviewer scored each scenario task using the Chisnell and Redish scale of 1 to 4, with 1 representing “task failure: this prevents the user going further” and 4 representing “no problem: satisfied scenario”.
After the initial observations and the scoring of the scenario tasks, the reviewers met to reconcile differences between the task scores and observations. Differences were reconciled through discussion and demonstration of the workflows performed during the individual evaluations. In this meeting, a final list of all observations, both positive and negative, was created. Final scores for each task were agreed upon through consensus of the three reviewers. Consensus was used to ensure that all reviewers benefited from the experiences of the other evaluators. Even though there was a scenario to follow, the reviewers still found multiple paths to accomplish each scenario, providing different reviewers with different overall experiences. Reaching a consensus allowed us to examine all reviewer experiences in order to determine the most appropriate final rating.
The list of observations was used to populate the heuristic spreadsheet. The primary author (LK) took the complete list of observations and matched the observations to the appropriate heuristic categories. The other reviewers (SM, YC) reviewed the initial matching to ensure that the observations were accurately portrayed, and that the observations were put into the appropriate heuristic categories. After the heuristic tool was populated with observations, each reviewer individually ranked each category using the 1 to 4 rating.
After individual ratings, the evaluations were combined. Heuristic categories whose ratings differed between the reviewers were examined during an in-person meeting, and final scores were set by consensus between the three reviewers. Following the original methodology, heuristics that did not have any observations associated with them, either positive or negative, were not scored for that persona.
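The combining step described above amounts to flagging any heuristic category whose individual ratings diverge so it can be resolved in the consensus meeting. A minimal sketch follows; the reviewer initials match the authors, but the ratings (and one of the heuristic names) are hypothetical.

```python
# Hypothetical individual ratings (1-4) for two heuristic categories.
# Any category where the three reviewers do not all agree is flagged
# for discussion at the consensus meeting.
individual_ratings = {
    "make text readable": {"LK": 3, "SM": 3, "YC": 3},
    "visually group related topics": {"LK": 2, "SM": 3, "YC": 3},
}

needs_discussion = [heuristic
                    for heuristic, ratings in individual_ratings.items()
                    if len(set(ratings.values())) > 1]
# Only the second category is flagged here, since its ratings differ.
```

Categories with full agreement can be finalized immediately, which keeps the consensus meeting focused on genuine disagreements.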
4. Results
4.1. Summary of Results from Scenario Task Evaluation
Table 1 displays the results from our scenario task evaluation. As shown in this table, our older adult persona (‘Alice’) was unable to complete two tasks (view cancer diagnosis and enter pain score) in either system, but scored fairly high (3 out of 4) on both systems when viewing medications and logging out. ‘Matthew,’ our caregiver, had the most difficulty with laboratory values (accessing existing values and adding new ones), but was better able to access medication lists and print documents that were uploaded by the home health nurse. ‘Lisa,’ our home health nurse, had difficulty viewing pain scores and changing medication doses for her patients. Similar to ‘Matthew,’ she was better able to navigate the system (login, logout), add new documents into the system, and view other provider notes.
Table 1.
Summary of Results from Scenario Task Evaluation
| Persona | Task | RememberItNow | MyMedWall |
|---|---|---|---|
| Older Adult (‘Alice’) | Login | 2 | 2 |
| | View cancer diagnosis | 1 | 1 |
| | View medications | 3 | 3 |
| | Enter pain score | 1 | 1 |
| | Logout | 3 | 3 |
| Family Caregiver (‘Matthew’) | Login | 4 | 4 |
| | View medications | 4 | 3 |
| | Print medications | 3 | 1 |
| | View outpatient document | 2 | 3 |
| | Download outpatient document | 4 | 4 |
| | View laboratory value | 2 | 1 |
| | Enter 4 new laboratory values | 1 | 1 |
| | Logout | 4 | 4 |
| Home Health Nurse (‘Lisa’) | Login | 4 | 4 |
| | View pain scores | 1 | 2 |
| | View outpatient document | 3 | 2 |
| | Upload Lisa’s home health note | 4 | 3 |
| | Change medication dose | 2 | 1 |
| | Logout | 4 | 4 |
1 = Task failure; prevents this user from going further, 2 = Serious problem; may hinder this user, 3 = Minor hindrance; possible issue, but probably will not hinder this user, 4 = No problem; satisfies scenario task
4.2. Summary of Results from Heuristic Evaluation
We recorded observations for 50% to 80% of the heuristics for each combination of persona and system. Considering both systems together, we recorded observations for 80% (16/20) of the heuristics for each persona. There is significant overlap between the heuristics scored for each persona, but the lists of scored heuristics are not identical. Only two heuristics (10%) were not scored for any persona, in any system, during the study: 1) provide feedback in other modes in addition to visual; 2) include a site map and link to it from every page. In addition, our process allowed us to record both positive and negative observations on the systems. All personas had heuristics that ranged from 1 (task failure) to 4 (satisfies heuristic). The summary of results from the evaluation is displayed in the table below.
4.3. Agreement between Reviewers
All final ratings for both scenario tasks and heuristics were made by consensus between the three reviewers. Consensus was reached during in-person meetings after each evaluation step: one consensus meeting for task rating, and a separate meeting for heuristic results. Even though all reviewers followed the same tasks as defined in the scenarios, there were often multiple ways for a user to complete the task in the PHR. These multiple pathways often affected how difficult it was for the user to complete the scenario.
Task agreement between all three reviewers (across the three scenarios for both systems) was 39% (15/38). In 78% of the disagreements (18/23), two of the three reviewers had the same score and one reviewer was an outlier. Heuristic agreement between all three reviewers was 46% (37/81). Two of the three reviewers gave the heuristic the same score in 90% of disagreements (38/42). Task agreement was more difficult to achieve because the scores were influenced by the individual paths that each reviewer took in the PHR to accomplish the scenario tasks. Heuristic disagreement was caused by differences in opinion on how significant an identified usability challenge would be for the persona, and the disagreement between reviewers was limited to one point (e.g. between a ‘3’ and a ‘4’) in 81% of the disagreement cases (34/42).
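Agreement statistics of the kind reported above can be computed directly from the per-item score triples. The sketch below uses hypothetical ratings for illustration only, not the study data.

```python
from collections import Counter

def agreement_summary(score_triples):
    """score_triples: list of (r1, r2, r3) ratings, one per task or
    heuristic. Returns exact three-way agreement and, among the
    disagreements, the share where two of three reviewers matched."""
    exact = [s for s in score_triples if len(set(s)) == 1]
    disagreements = [s for s in score_triples if len(set(s)) > 1]
    two_match = [s for s in disagreements
                 if Counter(s).most_common(1)[0][1] == 2]
    return {
        "exact_agreement": len(exact) / len(score_triples),
        "two_of_three_in_disagreements":
            len(two_match) / len(disagreements) if disagreements else None,
    }

# Hypothetical ratings for five items (not the study data).
ratings = [(3, 3, 3), (2, 3, 3), (1, 2, 4), (4, 4, 4), (2, 2, 3)]
summary = agreement_summary(ratings)
```

For ordinal 1–4 scales, chance-corrected statistics (e.g. a weighted kappa) would be a natural extension, though the study reports raw percentage agreement.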
5. Discussion
Overall, this methodology allowed the authors to comprehensively evaluate the usability of two PHRs for a care team that consisted of users with different technological abilities. Using the modified Chisnell and Redish methodology, we were able to identify certain tasks in each system that would be difficult for a specific type of user to perform, as well as discover overall areas of improvement that could help improve the usability for all three types of users.
5.1. Evaluation Results
We were able to identify three different types of results by using this methodology to conduct a heuristic usability evaluation. First, this methodology allowed us to identify usability challenges that may affect any user performing any function in the PHR. For example, all three of our personas found it difficult to understand the PHR structure, and struggled to use the headings to navigate through their scenarios. These types of results prompted general design recommendations that are similar to traditional heuristic evaluation recommendations, such as “improve the headings of the system menus.”
In addition, this methodology allowed us to understand how performing different functions within the system created varied user experiences. For example, all three personas had to access the medication list within their scenario. ‘Alice’ viewed her medication list, ‘Matthew’ printed his father’s medication list, and ‘Lisa’ changed a medication dose. ‘Alice’ and ‘Matthew’ had little difficulty completing their tasks, each scoring a 3 or 4 in both systems. ‘Lisa,’ on the other hand, found this to be one of her most challenging work items (scoring a 1 and a 2). The difference between the user experiences was not because ‘Lisa’s’ characteristics made it more difficult for her to complete her task, but because the interfaces for updating the medication list were poorly designed in both systems. This type of finding allowed us to make specific recommendations on how the PHRs could improve specific functionality, such as “improve the workflow for updating a medication dose.”
Finally, this methodology also allowed us to better understand how user characteristics affected an individual’s experience with the PHRs. We used these types of results to provide design recommendations that were targeted at specific user groups. For example, ‘Alice’ had the greatest difficulty with the color contrast (e.g. white font on a light blue background) and text size. ‘Matthew’ had the greatest challenge with navigating complex workflows because he had other tools (e.g. calling the home health nurse) that he could use to complete his scenario tasks. Finally, ‘Lisa’ had the greatest advantage in navigating the systems because of her understanding of clinical abbreviations (e.g. “BP”) and the overlap in design between these PHRs and clinician-focused electronic health records. Therefore, this methodology could also be used to make specific recommendations to promote use by a target end-user population. It also may help CHIT designers understand the needs of different user populations. In situations where the CHIT will be used by a variety of patient populations, such as PHRs, these recommendations could help design systems that meet the needs of all potential users.
5.2. Methodology
5.2.1. Ease of Using the Methodology
Once the reviewers were fully trained on the methodology, evaluation materials were complete, and the evaluation protocol was finalized, this methodology was relatively straightforward to follow. During the training sessions, the reviewers had the opportunity to identify details of the persona and scenario tasks that were confusing, and reconcile these details to ensure consistency between reviewers. In addition, the training allowed us to come to consensus about the abilities and weaknesses for each persona.
Because some functionality in the PHRs was more difficult to use than others, a persona’s PHR aptitude became a key factor in reaching agreement on the final score. This was especially important when determining whether a task or heuristic was a “2” (serious problem) or a “3” (minor hindrance). The characteristic comparison, designed based on the original methodology, became a key resource for helping our team come to consensus on how each persona would react during a difficult-to-use task. The reviewers also differed in how they envisioned personas with potentially similar abilities, which was an issue for the caregiver and home health nurse personas, who were both middle-aged adults. Faced with personas with similar physical abilities, differences in scores arose from imagined personality characteristics such as a persona’s level of patience or comfort with certain types of websites. In contrast, scoring tasks for the older adult persona was easier since ‘Alice’ was more likely to fail a task due to factors that were easy to ascertain visually, such as font size and color contrast. Discussion was crucial to harmonizing the group’s conceptualization of a persona and reconciling scores.
The reviewers also differed in terms of their computer configurations. They used computers with differently sized screens and with different default browser settings. One important factor to consider was default browser font size and the effects of using zoom commands to change the sizes of images and text on the screen as systems differed in how they rearranged their webpages in response to these settings. Another potentially important factor was the reviewers’ browser plug-ins, such as the use of plug-ins to render PDFs, which could affect how systems function. It was useful to explore computer settings and how they would alter a system’s ability to meet the needs of the personas in our discussion.
Finally, our process and findings demonstrated that the Chisnell and Redish set of heuristics is extensive. It has twenty overall heuristics, and up to eight sub-categories under each heuristic. For our evaluation, we sometimes struggled to reach consensus on where an observation fit within a single heuristic, as there appears to be some overlap between definitions. For example, it was often difficult for the reviewers to place comments into the three following heuristics: “make pages easy to skim or scan (#13),” “visually group related topics (#15),” and “make it easy to find things on the page quickly (#18).” In order to put an observation into the correct heuristic, we had to continually review the original intention of these heuristics and reach a consensus on where the observation best fit. Other heuristic sets for older adult web users are smaller [15, 17, 18], which may decrease the time associated with scoring and reduce the amount of perceived overlap between categories.
Although a smaller list of heuristics may be helpful for reviewers, we also found that some of our observations did not fit into any of the existing 20 heuristics. The additional comments involved observations related to workflow (e.g. “too many clicks to perform a task, ‘Matthew’ will probably give up”), inconsistent functionality (e.g. “However, if she clicked on any of the links under Table of Contents, the logout link is not there anymore”), and unhelpful information (e.g. “She scrolls down and finds the FAQ, but it doesn’t answer her question”). These results suggest that additional work on the Chisnell and Redish heuristics may be needed to identify a comprehensive and concise set of heuristics for older adult web users.
5.2.2. Use of Multiple Reviewers
Using only one reviewer to evaluate each website helped Chisnell and Redish evaluate 50 different websites during their study period; however, this methodological choice may have introduced unintentional bias into the evaluation results [14]. Since only one researcher evaluated each system, there was the potential for an individual researcher’s biases and preconceived notions of what makes a website usable to skew the results. Our protocol, by having all three reviewers evaluate each PHR, reduced the number of systems evaluated, but it also increased the amount of time spent on each system and provided multiple perspectives on each system. Having multiple people review each system allowed us to improve the completeness of our observations, and reduce the likelihood that bias from an individual reviewer would be reflected in our final results. We found this to be especially useful because using these systems often proved difficult, and reviewer fatigue was present when a reviewer had a particularly hard time accomplishing a scenario task. We can see this fatigue in the agreement results, where one reviewer having a different experience than the other two caused most of the disagreement. This fatigue had the potential to bias the reviewer against the system and lower the score ratings for subsequent tasks. This bias was mitigated when the results from all three reviewers were compared, discrepancies between observations were discussed, and final ratings were scored through consensus.
Using multiple reviewers gave us a better understanding of potential end-user experiences by identifying multiple pathways to accomplish each scenario’s goals, and allowed us to adjust our original ratings when reviewer fatigue had pushed a score too high or too low. As our agreement statistics show, the reviewers still conceptualized some aspects of the personas differently despite the extensive training. This affected where they initially looked on the screen to complete a task, how they interpreted labels and instructions, and their expectations for how different controls in the systems should behave. The result was coverage of a wider range of probable end-user behaviors, which helped us develop a larger and more comprehensive list of positive and negative observations for each persona and for most of the heuristic categories.
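To illustrate the kind of agreement check described above, a simple pairwise percent-agreement calculation across three reviewers’ heuristic ratings could be sketched as follows. The ratings shown are hypothetical, not the study data, and this is only one of several possible agreement statistics:

```python
from itertools import combinations

def percent_agreement(ratings_by_reviewer):
    """Average pairwise percent agreement across reviewers.

    ratings_by_reviewer: equal-length lists, one per reviewer, each
    holding heuristic scores (1-4) for the same ordered tasks.
    """
    pairs = list(combinations(ratings_by_reviewer, 2))
    agreements = [
        sum(a == b for a, b in zip(r1, r2)) / len(r1)
        for r1, r2 in pairs
    ]
    return sum(agreements) / len(agreements)

# Hypothetical ratings for five tasks from three reviewers;
# reviewer C diverges on two tasks (e.g., due to fatigue).
reviewer_a = [2, 3, 1, 4, 2]
reviewer_b = [2, 3, 1, 4, 2]
reviewer_c = [2, 1, 1, 4, 1]
print(percent_agreement([reviewer_a, reviewer_b, reviewer_c]))
```

In this sketch, the single diverging reviewer lowers the average agreement across the three pairs, mirroring the pattern we observed where most disagreement came from one reviewer’s differing experience.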
One significant challenge emerged because all reviewers used the same persona logins in each system, and system settings and data were not reset between evaluations. If one reviewer erroneously entered data in the wrong place during a data-entry scenario task, subsequent reviewers might mistakenly conclude that this was the right place to access the data. One way to address this reviewer error would be to reset the PHR after each reviewer, or to create mirror PHR accounts for each reviewer. Resetting the PHR after each review would lengthen the overall evaluation by eliminating the possibility of reviewers conducting their evaluations concurrently. Creating separate reviewer PHR accounts would require more set-up time, including finding a distinct email address for each reviewer account.
5.2.3. Time to Complete Tasks
Some researchers may look to a heuristic evaluation to reduce the time and money spent on a usability study. Our methodology, however, still required a significant resource investment. Although we had no expenses related to participant recruitment and retention, we spent a significant amount of time preparing for and conducting the research.
The primary author (LK) conducted most of the preparation activities, including preparing the materials, validating the personas and scenarios, and setting up the PHR systems. All reviewers, however, participated in extensive training. This training helped ensure that our materials were robust and that all three reviewers had a similar understanding of the personas, tasks, and heuristic guidelines. Conducting full practice reviews on systems not included in the final analysis initially increased our workload, but it substantially expedited the evaluations of the PHR systems included in the final analysis.
Despite the investment in time, the strength of this methodology is the robust design recommendations generated from the results. From the results of this one study, we were able to identify design recommendations based on overall aesthetics and system design, classify problematic workflows, and illuminate the differences between user populations. For our study goal, the robust nature of these results was worth the time investment; however, we recognize that not all usability studies have the same goals.
5.2.4. Use of Personas and Scenarios
The Chisnell and Redish methodology is unique because it uses personas to capture the diversity of abilities within a larger end-user population to enhance the heuristic evaluation process. We found that the personas were most helpful combined with scenarios. The personas and scenarios allowed us to define tasks that were most important based on an individual’s role, and specifically evaluate these tasks in each system. Standardizing the tasks in each system allowed us to better compare our results, identify tasks that were consistently hard to accomplish, and understand how overall usability could be improved through general system enhancements such as increased font size.
5.2.5. Use of Observations before Heuristics
Another characteristic that makes the Chisnell and Redish method unique is the use of observations to drive the heuristic evaluation instead of evaluating all heuristics in a list. Our study found this to be an efficient way to capture information about the ease of use of a system without having to access each system page. This was especially helpful since we were evaluating PHR systems that were complex and that contained multiple pages, menus, and functionality.
Using data from three reviewers, we were able to develop a comprehensive list of observations that populated most heuristics for each persona. Only two heuristics were not scored for any user in any system: “provide feedback in other modes in addition to visual” (#9) and “include a site map and link to it from every page” (#12). The original methodology used a tactile mouse to provide haptic feedback; we did not use any assistive devices in our review. Future evaluations, however, may wish to consider haptic devices, as well as text-to-speech functionality, if these devices are regularly used within the end-user population. In addition, neither system we reviewed used site maps as part of its design.
Using the observations to drive the heuristic evaluation also prompted the reviewers to focus on the more negative aspects of the systems, because problems were easier to spot than positive features. Since the purpose of our study was to evaluate usability and to identify potential problems for home health care teams, having more negative than positive observations met our research needs. This methodology may not be appropriate for researchers seeking an accurate balance of the positive and negative aspects of a single system.
5.3. Limitations
None of the authors had used this methodology before this study. The Chisnell and Redish methodology is different from a traditional heuristic evaluation, and it is possible that we misinterpreted some of the guidelines proposed in the report. As described in section 3.5, we mitigated this limitation by having all reviewers complete extensive training on the methodology before finalizing our protocol. Reviewing three PHRs before starting our study helped us better understand the methodology, resolve misunderstandings between reviewers, and identify the areas that needed more clarification from the original methodology report.
In addition, it is possible that our personas and scenarios do not reflect HOA perspectives on PHR use for home health. We did perform a validation study with experienced home health nurses, and used their feedback to modify the original scenarios and personas. The opinions of the nurses may not have completely reflected HOA perspectives. Future work could further validate these scenarios and personas by gathering perspectives from HOAs.
5.4. Potential Uses for Future Work
As the number of HOAs increases, healthcare workers and informal caregivers will need to rely on new processes to support HOAs in their homes. CHIT, such as personal health records, may be one tool that can help connect HOAs to their care team. However, in order to be used, the CHIT must be designed for the needs of HOAs and their care team members. The Chisnell and Redish methodology described in this paper combines personas with a heuristic evaluation. This approach may be a good way for researchers to consider the basic needs of HOAs before an end-user evaluation, or in situations where it is difficult (or impractical) to recruit evaluation participants from this end-user population.
Our findings, however, imply that modifications may be needed to the original methodology. Specifically, we found that adding scenarios to the methodology can help standardize the evaluation procedures of the reviewers and strengthen the results by identifying problematic workflows in the CHIT system. Our scenarios were validated with home health nurses, but other user groups may have different opinions on the important workflows within PHR systems. Future work could focus on developing a consensus about important workflows within PHRs.
In addition, the original Chisnell and Redish methodology used only one reviewer to evaluate each website. We found that having three separate reviewers made our results more robust and helped us identify more usability challenges for each system and persona. We chose three reviewers based on the heuristic evaluation literature; however, the Chisnell and Redish methodology is unique. Additional future work could therefore study the relationship between the number of reviewers and the saturation of usability problems identified using this combined heuristic and persona method.
Finally, we performed our evaluation using the original set of heuristics advocated by Chisnell and Redish. This set of heuristics is large (twenty overall categories with up to eight sub-categories under each heuristic). We found it difficult to distinguish between some heuristic categories, and also found that some identified problems did not fit under any defined category. Although no list of heuristics is exhaustive, future work could refine the Chisnell and Redish list to reduce overlap between heuristic categories, and add categories to improve coverage of the issues identified during the observation phase.
6. Conclusion
Extending the Chisnell and Redish methodology to include multiple reviewers and scenarios allowed us to comprehensively evaluate the usability of two PHR systems from three perspectives: homebound older adult, family caregiver, and home health nurse. The materials included in this manuscript may help other CHIT designers and researchers better understand the needs and perspectives of HOAs and their care teams.
Supplementary Material
Table 2.
Summary of Results from Heuristic Evaluation
| System | Metric | Older Adult (‘Alice’) | Family Caregiver (‘Matthew’) | Home Health Nurse (‘Lisa’) |
|---|---|---|---|---|
| MyMedWall | # of Heuristics Scored (%) | 16 (80%) | 14 (70%) | 13 (65%) |
| | Average Heuristic Score | 2.2 | 2.4 | 2.4 |
| | # of Score 1: Task Failure (%) | 5 (31%) | 3 (21%) | 2 (15%) |
| | # of Score 2: Serious Problem (%) | 5 (31%) | 5 (36%) | 5 (38%) |
| | # of Score 3: Minor Hindrance (%) | 4 (25%) | 3 (21%) | 5 (38%) |
| | # of Score 4: Satisfies Heuristic (%) | 2 (13%) | 3 (21%) | 1 (8%) |
| RememberItNow | # of Heuristics Scored (%) | 15 (75%) | 10 (50%) | 13 (65%) |
| | Average Heuristic Score | 2.5 | 2.6 | 2.8 |
| | # of Score 1: Task Failure (%) | 2 (13%) | 2 (20%) | 1 (8%) |
| | # of Score 2: Serious Problem (%) | 5 (33%) | 2 (20%) | 5 (38%) |
| | # of Score 3: Minor Hindrance (%) | 6 (40%) | 4 (40%) | 3 (23%) |
| | # of Score 4: Satisfies Heuristic (%) | 2 (13%) | 2 (20%) | 4 (31%) |
1 = Task failure; prevents this user from going further, 2 = Serious problem; may hinder this user, 3 = Minor hindrance; possible issue, but probably will not hinder this user, 4 = No problem; satisfies heuristic
Acknowledgments
This study was supported, in part, by the NIH National Library of Medicine Biomedical and Health Informatics Training Grant at the University of Washington (grant nr. T15LM007442). We would also like to thank the home health nurses that participated in the creation and validation of the home health scenarios and personas.
References
- 1. Or CK, Karsh BT. A systematic review of patient acceptance of consumer health information technology. J Am Med Inform Assoc. 2009;16(4):550–60. doi: 10.1197/jamia.M2888.
- 2. Leff B, Carlson CM, Saliba D, Ritchie C. The invisible homebound: setting quality-of-care standards for home-based primary and palliative care. Health Aff (Millwood). 2015;34(1):21–9. doi: 10.1377/hlthaff.2014.1008.
- 3. Medicare and Home Health [Internet]. Washington: Department of Health and Human Services; 2010 [cited 17th February 2017]. Available from: https://www.medicare.gov/Pubs/pdf/10969.pdf.
- 4. Beck RA, Arizmendi A, Purnell C, Fultz BA, Callahan CM. House calls for seniors: building and sustaining a model of care for homebound seniors. J Am Geriatr Soc. 2009;57(6):1103–9. doi: 10.1111/j.1532-5415.2009.02278x.
- 5. Weisfeld V, Lustig TA, editors. The Future of Home Health Care: Workshop Summary. Washington: Institute of Medicine and National Research Council; 2015. ISBN 978-0-309-36753-0.
- 6. Baldwin JL, Singh H, Sittig DF, Giardina TD. Patient portals and health apps: Pitfalls, promises, and what one might learn from the other. Healthc (Amst). 2016. doi: 10.1016/j.hjdsi.2016.08.004.
- 7. Smith SG, O’Conor R, Aitken W, Curtis LM, Wolf MS, Goel MS. Disparities in registration and use of an online patient portal among older adults: findings from the LitCog cohort. J Am Med Inform Assoc. 2015;22(4):888–95. doi: 10.1093/jamia/ocv025.
- 8. Graetz I, Gordon N, Fung V, Hamity C, Reed ME. The Digital Divide and Patient Portals: Internet Access Explained Differences in Patient Portal Use for Secure Messaging by Age, Race, and Income. Med Care. 2016;54(8):772–9. doi: 10.1097/MLR.0000000000000560.
- 9. LeRouge C, Ma J, Sneha S, Tolle K. User profiles and personas in the design and development of consumer health technologies. Int J Med Inform. 2013;82(11):e251–68. doi: 10.1016/j.ijmedinf.2011.03.006.
- 10. Usability Evaluation [Internet]. Rockville, MD: Agency for Healthcare Research and Quality [cited 17th February 2017]. Available from: https://healthit.ahrq.gov/health-it-tools-and-resources/workflow-assessment-health-it-toolkit/all-workflow-tools/usability-evaluation.
- 11. Goldberg L, Lide B, Lowry S, Massett HA, O’Connell T, Preece J, et al. Usability and Accessibility in Consumer Health Informatics: Current Trends and Future Challenges. American Journal of Preventive Medicine. 2011;40(5):S187–S97. doi: 10.1016/j.amepre.2011.01.009.
- 12. Zajicek M. Successful and available: interface design exemplars for older users. Interacting with Computers. 2004;16(3):411–30. doi: 10.1016/j.intcom.2004.04.003.
- 13. Liu LS, Shih PC, Hayes GR. Barriers to the adoption and use of personal health record systems. In: Proceedings of the 2011 iConference; Seattle, Washington, USA. ACM; 2011. pp. 363–70.
- 14. Chisnell D, Redish J. Designing web sites for older adults: expert review of usability for older adults at 50 web sites. Bethesda, Maryland; 2005. Available from: http://assets.aarp.org/www.aarp.org/articles/research/oww/AARP-50Sites.pdf.
- 15. Lynch KR, Schwerha DJ, Johanson GA. Development of a Weighted Heuristic for Website Evaluation for Older Adults. Int J Hum-Comput Int. 2013;29(6):404–18. doi: 10.1080/10447318.2012.715277.
- 16. Hart T, Chaparro BS, Halcomb CG. Evaluating websites for older adults: adherence to ‘senior-friendly’ guidelines and end-user performance. Behaviour & Information Technology. 2008;27(3):191–9. doi: 10.1080/01449290600802031.
- 17. Kurniawan S, Zaphiris P. Research-derived web design guidelines for older people. In: Proceedings of the 7th international ACM SIGACCESS conference on Computers and accessibility; ACM; 2005.
- 18. Zaphiris P, Ghiawadwala M, Mughal S. Age-centered research-based web design guidelines. In: CHI ’05 Extended Abstracts on Human Factors in Computing Systems; Portland, OR, USA. ACM; 2005. pp. 1897–900.
- 19. myPHR [Internet]. Chicago, IL: The American Health Information Management Association; c2017 [cited 17th February 2017]. Available from: https://www.myphr.com/
- 20. Kneale L, Demiris G. Lack of Diversity in Personal Health Record Evaluations with Older Adult Participants: A Systematic Review of Literature. J Innov Health Inform. 2017;23(4):881. doi: 10.14236/jhi.v23i4.881.
- 21. Stay Well: Access Wellness Resources [Internet]. Washington, DC: Department of Health and Human Services; 2013 [cited 17th February 2017]. Available from: https://www.healthit.gov/patients-families/stay-well.
- 22. Kneale L, Choi Y, Demiris G. Assessing commercially available personal health records for home health: recommendations for design. Appl Clin Inform. 2016;7(2):355–67. doi: 10.4338/ACI-2015-11-RA-0156.
- 23. MyMedWall [Internet] [cited 17th February 2017]. Available from: https://secure.mymedwall.com/phr/
- 24. RememberItNow [Internet]. Orinda, CA: RememberItNow LLC [cited 17th February 2017]. Available from: http://rememberitnow.com/
- 25. Mankoff J, Dey AK, Hsieh G, Kientz J, Lederer S, Ames M. Heuristic evaluation of ambient displays. In: Proceedings of the 2003 SIGCHI conference on Human Factors in Computing Systems; Fort Lauderdale, Florida, USA. ACM; 2003. pp. 169–76.
- 26. Carroll JM. Five reasons for scenario-based design. Interacting with Computers. 2000;13(1):43–60.
- 27. Thompson MJ, Reilly JD, Valdez RS. Work system barriers to patient, provider, and caregiver use of personal health records: A systematic review. Appl Ergon. 2016;54:218–42. doi: 10.1016/j.apergo.2015.10.010.
- 28. Archer N, Fevrier-Thomas U, Lokker C, McKibbon KA, Straus SE. Personal health records: a scoping review. J Am Med Inform Assoc. 2011;18(4):515–22. doi: 10.1136/amiajnl-2011-000105.
- 29. Uslu AM, Stausberg J. Value of the electronic patient record: an analysis of the literature. J Biomed Inform. 2008;41(4):675–82. doi: 10.1016/j.jbi.2008.02.001.
- 30. Millerick Y. Case study 6: an account of a patient’s journey following a diagnosis of left ventricular systolic dysfunction of ischaemic aetiology. In: Stewart S, Blue L, editors. Improving Outcomes in Chronic Heart Failure: A practical guide to specialist. 2nd ed. London: BMJ Publishing Group; 2004. pp. 226–32.