Author manuscript; available in PMC: 2025 Apr 1.
Published in final edited form as: Int J Med Inform. 2024 Feb 12;184:105355. doi: 10.1016/j.ijmedinf.2024.105355

Comparison of Evaluation Methods for Improving the Usability of a Spanish mHealth Tool

Alexandria L Hahn 1, Claudia L Michaels 1, Gabriella Khawly 1,2, Tyler Nichols 1, Pamela Baez 3, Sergio Ozoria Ramirez 1, Janeth Juarez Padilla 4, Samantha Stonbraker 5, Susan Olender 6, Rebecca Schnall 1
PMCID: PMC10923187  NIHMSID: NIHMS1968891  PMID: 38368698

Abstract

Objective:

Mobile health (mHealth) technology is now widely used across health conditions and populations. The rigorous development of these tools has yielded improved health outcomes, yet the ideal approach for developing mHealth tools continues to evolve, indicating the need for rigorous usability evaluation methods. This study compares two usability evaluation methods – cognitive interviews and usability assessments employing a think-aloud approach – for adapting an evidence-based mHealth tool from English into Spanish.

Methods:

We conducted cognitive interviews and usability assessments using a think-aloud protocol to evaluate the usability of an HIV mHealth application among 40 Spanish-speaking adults with HIV in New York City, NY, and La Romana, Dominican Republic. The Health IT Usability Evaluation Model (Health-ITUEM) was used to guide the analysis of qualitative data collected from each method.

Results:

Participants (N=40) averaged 43 years old (SD=12.26; range 20–71), identified primarily as Hispanic/Latino (92.5%), and resided in La Romana (50%) or New York City (50%). Both usability evaluation methods yielded similar findings, highlighting learnability and information needs as crucial components of participant feedback on the mHealth application. Cognitive interviews captured participants’ perspectives on the app’s interface and design, whereas usability assessments offered insights into participants’ competency while interacting with the mHealth tool.

Conclusion:

Findings from this study highlight the contributions and limitations of including cognitive interviews and task-based usability assessments using a think-aloud approach in mHealth usability testing. Future research should employ a multi-method approach, incorporating complementary usability evaluation methods and engaging participants in multiple assessments. Using complementary usability evaluation methods may provide a more comprehensive understanding of the usability and participant experience aspects of a mHealth tool compared to using a single usability evaluation method.

Keywords: Usability, Usability evaluation methods, Cognitive Interviews, Think aloud, mHealth

1. Introduction

The use of mobile health technology (mHealth) is rapidly increasing worldwide. mHealth refers to using mobile devices such as smartphones for healthcare delivery, health information dissemination, and health monitoring [1]. mHealth technology can advance health equity in groups experiencing vulnerability because it is a widely available and inexpensive way to improve health behavior change [2]. A systematic review found that 31% of mobile phone owners use their devices to access health information, and 19% had installed an app relating to a current medical condition or to manage health and well-being [3]. When rigorously designed, mHealth technology is a powerful and relevant tool that can be adapted to meet the needs of its end-users [4–7].

Usability, defined as the extent to which a product can be effectively, efficiently, and satisfactorily used by end-users to achieve specific goals [8], is a critical factor in evaluating mHealth technology [9]. Without rigorous evaluation of usability factors, however, a mHealth tool may fail to achieve the usability goals of effectiveness, efficiency, and user satisfaction [10]. Researchers have employed several methodological approaches to evaluate the usability of mHealth tools, such as think-aloud protocols, focus groups, in-depth interviews, and structured questionnaires [11]. These methods provide insights into how easily users can navigate a mHealth tool, understand its features, and accomplish its tasks. For example, a review of usability evaluation methods employed in assessing electronic HIV interventions found that each usability evaluation method had advantages and disadvantages, and that using multiple usability methods in a single study produced varying results between methods [10]. To ensure successful mHealth tool development, it is critical to consider the evaluation techniques most appropriate for study goals.

Many mHealth usability studies have explored usability and design approaches [11–16], but there has been little comparison of usability evaluation approaches [14, 9]. The theoretical foundations of different methodological approaches yield different results; therefore, comparing the findings from discrete approaches may provide a greater understanding of the utility of each evaluation method. This paper analyzes the differences in findings from data captured through cognitive interviews and usability assessments employing a think-aloud approach across two geographic settings with Spanish-speaking participants.

Traditional usability testing uses a think-aloud protocol [12,14] to expose the cognitive processes involved in navigating technology and to understand how it facilitates problem resolution [9,12]. Think-aloud protocols are popular in usability testing because they provide comprehensive insights into end-users’ problem-solving approaches with mHealth technology [17]. Cognitive interviewing uses cognitive theory to gather useful information on users’ experience and perception of a mHealth tool, including problem-solving and decision-making [18]. Cognitive interviews can also assess the validity of translated or adapted materials, identifying language-related challenges and evaluating the cultural appropriateness of visual elements (e.g., icons, symbols, images) that may affect comprehension and usability [19, 20].

This study was part of the cultural adaptation and translation of the WiseApp from English into Spanish. The WiseApp was originally designed as part of a cooperative agreement with the Centers for Disease Control and Prevention [4, 21–23]. Project goals were to design a mHealth application for people with HIV (PWH) to self-manage their health [24]. The design process was guided by the Information Systems Research Framework, a design science framework that is now a fundamental guide to the design of mHealth applications [25]. Following this project, the study team was funded by the Agency for Healthcare Research and Quality to build the app and test it in an RCT with 200 PWH in New York City [26]. The study team contracted with Pattern Health technologies and CMT Cares to develop the mHealth application – the WiseApp, a native app that works on both the iOS and Android platforms. Results from the RCT, as well as other findings related to the usability and efficacy of the intervention, are described in detail elsewhere [27–29]. In short, the WiseApp seeks to improve medication adherence in PWH through push notification reminders and medication trackers [19]. In response to the need for evidence-based interventions for Spanish-speaking PWH in New York and the Dominican Republic, our study team culturally adapted and translated the WiseApp into Spanish through the work described here [20]. A multi-site trial to test the efficacy of the WiseApp is currently underway [30].

2. Materials and methods

This multi-site study was conducted in New York, NY, and La Romana, Dominican Republic (DR). Bilingual (English/Spanish) study staff conducted each visit. All study activities were approved by the Columbia University Irving Medical Center Institutional Review Board and CONABIOS, the ethical review committee in the Dominican Republic. Informed consent was obtained electronically from all study participants via the Research Electronic Data Capture (REDCap) web-based application. We recruited 40 PWH in La Romana and NYC to participate in either a cognitive interview (n=20) or a usability assessment (n=20) of the WiseApp. All interviews were conducted in person. Inclusion criteria were adults (≥18 years) who were diagnosed with HIV; Spanish-speaking; living in the US or DR; owned a smartphone; and were on active antiretroviral therapy (ART). Participants completed screening and baseline demographic surveys to confirm eligibility. In appreciation for their time, US-based participants were compensated $40, and Dominican participants received the equivalent in Dominican pesos.

2.1. Data collection

Participants completed a survey in REDCap to report demographic information, including age, sex, gender, race, ethnicity, education, and employment status. Data were generated through cognitive interviews and usability assessments using a think-aloud approach (Figure 1). Interviews lasted 46–60 minutes. We applied a think-aloud approach to usability testing to capture participants’ thoughts while viewing different screens or performing tasks on WiseApp [31]. Additionally, scripted probing questions were provided to study staff. Bilingual study staff audio-recorded the interviews and asked participants to state their understanding of the meaning of each screen on the app.

Figure 1.

Overview of usability testing

2.1.1. Procedures

Participants assigned to a cognitive interview were shown each screen of the WiseApp (see Figure 2 for screenshots). Study staff prompted them to concurrently share their thoughts on and interpretation of the content (such as translated terms, sentence structure, and the quality or lack of instructions) and the appearance of the user interface (such as the colors, font size, and images), and to comment on any features they liked, disliked, or would like to see on the App within the context of HIV care and management. A list of each screen presented to participants is shown in Table 1. At the end of the interview, participants were asked to assess the app’s relevance to specific HIV care considerations.

Figure 2.

Screenshots of WiseApp in English and Spanish

Table 1.

Screens included in cognitive interviews

Welcome Screen
Dashboard Screen
My Medication Screen
My Alerts Screen
My Statistics Screen
My Account Screen
Support Screen
Options Screen
Chat Screen
Video Screen

Participants assigned to a usability assessment were given a list of 11 tasks (Table 2) related to the functionality of the WiseApp, including reviewing medication adherence history, creating a medication reminder, and searching the app for specific information. Participants were asked to complete each task and evaluate the app using a think-aloud approach. Participants were allowed to ask questions before starting the app testing, but once testing began, they were encouraged to complete all tasks independently. Study staff allowed participants up to five minutes or three attempts per task before demonstrating the correct steps to accomplish it.

Table 2.

Tasks included in usability assessments

Task #1: Log in to the app.
Task #2: Edit a WiseApp pending dose to mark it as taken on the Dashboard screen.
Task #3: Enable one of the in-app reminders for the WiseApp medication and add a new dosing schedule. Select a time and number of pills. Then, delete the added schedule. Create a reminder for a dose of your medication at 8am.
Task #4: Set an alarm for a missed dose by text message/SMS and for a taken-off-schedule dose by email. Then, save.
Task #5: Add a desired alarm; then delete the added alarm.
Task #6: Change the settings of an existing text message alarm to an email alarm.
Task #7: Describe the patient’s adherence stats for the month of June 2022.
Task #8: Log out and log back into the app.
Task #9: Find information on “What kind of notification will I get from CleverCap?”.
Task #10: Send a text to ask the study team a question.
Task #11: Search for the video “How to use the CleverCap”.

2.1.2. Data Preparation

All interviews were audio recorded, transcribed, and translated into English for analysis by a professional transcription and translation company. The practice of verbatim transcription, involving a word-for-word reproduction of the interview, is a standard convention that is recognized for enhancing the rigor and accuracy of the data [32].

2.2. Data Analysis

Data were managed using Dedoose Qualitative Data Analysis software. This web-based qualitative data analysis program allows researchers to collaborate in real time and is recognized for its user-friendly interface and cost-effectiveness compared to alternative qualitative analysis software (e.g., NVivo) [33].

Content analysis, a systematic qualitative method for examining and interpreting textual data to identify patterns and insights [34], was used to analyze the transcripts from 20 cognitive interviews and 20 usability assessments (N=40). Full transcripts were first condensed into meaningful text segments (i.e., excerpts), then further condensed and labeled with codes. Codes are understood to be the “building blocks” of qualitative analysis [35]. Codes were then compared and grouped into categories that reflect the data’s content based on shared characteristics or meanings [35]. These categories served as the basis for reporting qualitative findings.

The coding process was guided by the Health IT Usability Evaluation Model (Health-ITUEM) [36], which has been validated for the assessment of mHealth technology [37], supporting the rigor of the use of this framework for guiding the qualitative analysis in this study. Only subjective concepts from the Health-ITUEM [36] were used to code datasets, including error prevention, memorability, information needs, flexibility/customizability, learnability, performance speed, competency, and other outcomes [37]. The concept of completeness was not included in the analysis because it represents an objective measure. Data that could not be coded based on the predefined codes were analyzed to create new codes and categories (indicated in the paper using italics). One subcategory of learnability, improve learnability, was identified during analysis to capture instances where participants provided specific recommendations to improve future user comprehension of the App. A new coding category, translation, was created to capture participants’ feedback about translated materials, language, and cultural comprehension and acceptance.

Four study team members (ALH, CLM, GMK, and TKN) with healthcare-related degrees and experience in qualitative data collection coded six transcripts with predefined codes using the Health-ITUEM [36] to confirm and modify components of an initial code set. This initial list of codes was used to develop the codebook (Table 4). Codebooks are instrumental during coding because they show how codes are operationalized and grouped together to form categories [35,38]. The codebook (Table 4) contains code mnemonics, code definitions, and exemplar quotes from the data collection [38]. The codebook was created through iterative independent and collaborative analysis (Figure 3).

Table 4.

Codebook

Error prevention
Definition: The App offers error management (e.g., error messages, undo function, or instructions) to assist participants in performing tasks and avoiding errors.
Cognitive interview exemplar: RID 62 (NY): You should include ‘I forgot my password’. Some people forget the password.
Usability assessment exemplar: RID 93 (NY): I realized now that I didn’t have the option, the little eye, to be able to visualize if the password was correctly set. [The login screen] does not [have it].

Memorability
Definition: Participants can easily remember how to perform tasks in the App.
Cognitive interview exemplar: Code did not appear in this dataset.
Usability assessment exemplar: RID 55 (DR): People don’t know how [to use the App], but the second time you try, you already know how to do it.

Information needs
Definition: The information content offered by the App.
Cognitive interview exemplar: RID 69 (NY): This part is important because there is a lot of information that is important for HIV patients.
Usability assessment exemplar: RID 79 (NY): Yeah well, it just says chat, it doesn’t say name, it doesn’t say anything. But it has to be an image, like the logo of the clinic here. To make you feel better because you don’t even know who you are talking to.

Flexibility and Customizability
Definition: The App provides more than one way to accomplish tasks, which allows participants to operate the App as preferred.
Cognitive interview exemplar: RID 58 (NY): You can change the schedule options, the directions that the doctor gives you, you can put them here, and not just a schedule. How many pills you can take, you can change, because you have the option up there to edit.
Usability assessment exemplar: RID 94 (DR): For example, when we add the doses here, you see that here we put the schedules, here there can be an option that gives you the option of where to receive the notification of that schedule that you are creating if by [text] message or [e]mail.

Learnability
Definition: Participants can easily learn how to operate the App.
Cognitive interview exemplar: RID 42 (DR): I don’t think it’s difficult because if people know how to use these devices, the applications… I don’t think it’d be difficult. If you don’t have any knowledge, yes, it’s difficult.
Usability assessment exemplar: RID 85 (NY): I felt comfortable. It’s easy to use when you have some experience. I wouldn’t change anything.

Improve Learnability (subcode of Learnability)
Definition: Participants discuss ways to modify the App to enhance its comprehension and design.
Cognitive interview exemplar: RID 73 (DR): I would use blue and white. The colors are nice, it just needs something more radiant to catch the eye. This one, on the other hand, is clear, but it looks kind of sad with the colors. They don’t look good; it doesn’t make you want to turn it on. It looks cool but I’m not curious about it.
Usability assessment exemplar: RID 92 (DR): The title, maybe bigger. The header, larger instructions. The color, I didn’t like the green. A more striking color maybe.

Translation
Definition: Participants discuss specific translation needs.
Cognitive interview exemplar: RID 47 (DR): Something very important [are the] accents. I understand that we do not use them in English but since they want to offer an application [in Spanish], accents are very important. And we try to use the question mark at the beginning and at the end [of a question].
Usability assessment exemplar: RID 103 (NY): And here the Spanish looks like an awkward translation from English because it says “explore our FAQ to have an answer” I think it’s redundant because [to] explore questions is to have an answer. It’s redundant. I’d say “here are some FAQ” something like that, less redundant. In English, it may have sense because the language structure. But in Spanish we use it that way because the direct object is determined, so anything else is redundant.

Performance speed
Definition: Participants are able to use the App efficiently.
Cognitive interview exemplar: RID 46 (NY): Well, [the App] makes you react if you’re not doing things right, it tells you what needs to improve in case you’re not taking your meds, you’re not taking them on time, so that you focus on an equal schedule, every day. And it keeps you informed.
Usability assessment exemplar: RID 94 (DR): It was a little confusing because usually we are used to using Android devices.

Competency
Definition: Participants express confidence in their ability to perform tasks using the App, based on Social Cognitive Theory.
Cognitive interview exemplar: Code did not appear in this dataset.
Usability assessment exemplar: RID 87 (DR): I didn’t have so much difficulty because I’ve been taking medications for years, so I already know a little bit about schedules and the reminders that existed before.

Other outcomes
Definition: Other system-specific expected outcomes representing a higher level of expectations, including personal networks (i.e., doctors, family members) and internet-based tools besides mHealth (i.e., phone, search engines).
Cognitive interview exemplar: RID 53 (DR): Well, like all things now are so digital, that I can see virtual, with device. I think eventually it’s going to get to the thing that you’re not going to have to go to the health center for a consultation and you’re just going to be seen on the phone with the app, a consultation. The only way you will have to go to the doctor is if you feel very sick. I think eventually it’s going to be like that, everything else is going to be virtual.
Usability assessment exemplar: RID 95 (NY): I thought the chat was to contact someone who has your information, when you need a refill or you’re running out of meds. For example, I talk to my doctor if I need a painkiller and he says that he’ll send the prescription to your pharmacy.

Figure 3.

Development of codebook

Note. This figure was adapted from “Codebook development for team-based qualitative analysis,” by K.M. MacQueen et al., 1998, Cultural Anthropology, 10(2), p.34.

The initial six transcripts were reviewed again by ALH, CLM, GMK, and TKN to confirm assigned codes and refine the codebook before coding the 34 remaining transcripts. A codebook is deemed stable after reviewing 10% of the total transcripts [39]. However, we opted for a more conservative approach to review 15% of the transcripts (i.e., 6 transcripts), ensuring our coding processes’ reliability and consistency. Each remaining transcript was coded in dyads (CLM, GMK, and TKN), reviewed by a third coder (ALH), and discussed weekly in team meetings.

To enhance the rigor of our qualitative analysis, we maintained an audit trail of codebook iterations, which was discussed with an additional research team member (RS) for peer debriefing [40]. The combination of a codebook, audit trail, and regular team discussions illustrates investigator triangulation, supporting the credibility and reliability of our qualitative analysis [40].

3. Results

3.1. Sample

The sample (N=40) averaged 43 years old (SD=12.26; range 20–71), was primarily Hispanic/Latino (92.5%), had a high school degree or higher (67.5%), and lived in an owned or rented house or apartment (62.5%) in La Romana, Dominican Republic (50%), or New York City (50%). Survey results indicated 55% (n=22) identified as male; 10% (n=4) identified as American Indian or Alaska Native, 20% (n=8) as Black or African American, 10% (n=4) as White, 5% (n=2) as multiracial, 35% (n=14) as Hispanic/Latino, 20% (n=8) as Dominicano, and 5% (n=2) as another race (Table 3). Regarding employment status, 32.5% of participants were unemployed and looking for work, 30% were working full-time, 20% were working part-time, and the remaining participants were either not actively seeking employment, enrolled as a student, retired, or receiving disability benefits.

Table 3.

Study Sample

Overall (N=40) Cognitive interviews (n=20) Usability assessments (n=20)
Age in years, average (SD), range 43.64 (12.26), 20–71 46.09 (12.03), 23–71 41.33 (12.33), 20–63
Geographic location
 New York 20 10 10
 La Romana 20 10 10
Gender
 Male 22 12 10
 Female 13 7 6
 Transgender Male / Transman / FTM 1 0 1
 Transgender Female / Transwoman / MTF 2 0 2
 Nonbinary 2 1 1
Race*
 American Indian or Alaska Native 4 3 1
 Asian 0 0 0
 Black or African American 8 2 6
 Native Hawaiian or Other Pacific Islander 0 0 0
 White 4 1 3
 Multiracial 2 1 1
 Other; Hispanic/Latino 14 7 7
 Other; Dominicano 8 5 3
 Other 2 2 0
Ethnicity
 Hispanic/Latino 37 19 18
Highest level of education
 Elementary school 7 3 4
 Some high school, no diploma 6 2 4
 High school diploma or equivalent (e.g., GED) 8 4 4
 Some college 8 6 2
 Associate degree or technical degree 3 2 1
 Bachelor/college degree 4 1 3
 Professional or graduate degree 4 2 2
Employment status*
 Working full-time 12 6 6
 Working part-time (including seasonal, work-study, etc.) 8 3 5
 Unemployed, looking for work 13 7 6
 Unemployed, not looking for work 2 1 1
 Retired 1 0 1
 Student 4 2 2
 Disability benefits 4 3 1
*

Select-all-that-apply questions; some participants selected more than one option, so totals do not sum to 100%
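The percentages reported in Section 3.1 are simple proportions of the full sample (N=40). As an illustrative check only (not part of the study’s analysis pipeline), they can be reproduced from the Table 3 counts:

```python
# Reproduce sample percentages from Table 3 counts; N = 40 participants.
N = 40

def pct(n: int, total: int = N) -> float:
    """Percentage of the sample, rounded to one decimal place."""
    return round(100 * n / total, 1)

print(pct(22))  # identified as male: 55.0
print(pct(37))  # Hispanic/Latino ethnicity: 92.5
print(pct(8))   # Black or African American: 20.0
print(pct(13))  # unemployed, looking for work: 32.5
```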

3.2. Participant (user) response

For each excerpt, a coder could select one or more of the 10 codes based on the concepts from the Health-ITUEM [36]. Excerpts that reflected more than one code were given multiple codes as appropriate. Each code, its definition, and its exemplars are included in Table 4. A total of 987 codes were applied across all transcripts. Of those, 590 codes were applied to 498 excerpts from cognitive interview transcripts, and 397 codes were applied to 324 excerpts from usability assessment transcripts. The frequency of use of each code appears in Table 5. To maintain participant confidentiality, names were removed and replaced with research identification numbers (RID).

Table 5.

Frequency of code use in each data set.

Cognitive Interviews Usability assessment Both data sets, Totals(s)
Competency 0 3 3
Error prevention 9 16 25
Flexibility/Customizability 14 14 28
Information needs 141 84 225
Learnability 244 156 400
Improve Learnability 108 83 191
Translation 18 7 25
Memorability 0 4 4
Other outcomes 16 6 22
Performance speed 40 24 64
Total(s) 590 397 987
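As a quick arithmetic check, summing each column of Table 5 reproduces the reported totals (590 codes applied in cognitive interviews, 397 in usability assessments, 987 overall). The sketch below is illustrative only and is not part of the study’s Dedoose workflow:

```python
# Code counts transcribed from Table 5: (cognitive interviews, usability assessments).
table5 = {
    "Competency": (0, 3),
    "Error prevention": (9, 16),
    "Flexibility/Customizability": (14, 14),
    "Information needs": (141, 84),
    "Learnability": (244, 156),
    "Improve Learnability": (108, 83),
    "Translation": (18, 7),
    "Memorability": (0, 4),
    "Other outcomes": (16, 6),
    "Performance speed": (40, 24),
}

# Column sums should match the reported per-method and overall totals.
cognitive_total = sum(ci for ci, _ in table5.values())
usability_total = sum(ua for _, ua in table5.values())
print(cognitive_total, usability_total, cognitive_total + usability_total)  # 590 397 987
```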

3.2.1. Cognitive interviews

This dataset centered on two major concepts from the Health-ITUEM [36] – learnability and information needs – and one subcode, Improve Learnability, which captures instances where participants provided specific recommendations to improve future participant comprehension of the App. The most commonly occurring code in this data set was Learnability, which was used to code 244 (48.9%) of the 498 excerpts. Learnability captures instances of participants discussing their ability to understand the App’s features and expressing their thoughts on the design aesthetics of the App, including layout, navigation, and visual appeal. Many participants also noted the effects of specific colors on their ability to interact with the App. Participants suggested simple colors made them comfortable navigating and learning about different App pages.

“It’s simple, and the colors are eye-catching, beautiful. Red, green, and yellow make it more likable. [The] colors create emotions in people.”

[RID 65]

“I like the colors because they are not too flashy, in a sense that they might cause you discomfort.”

[RID 55]

Information needs was the second most frequently occurring concept code, appearing in 141 (28.3%) excerpts. Information needs captures instances where participants discussed their thoughts and opinions regarding the informational content offered by the App. Participants generally agreed that the information was relevant for people receiving HIV treatment.

“It has a lot of information. It is for you to soak in, for you to visualize, and to fill your conscious and subconscious with information about the disease that you have. It tells you how the epidemic starts, not only in the Dominican Republic because it is all over the world.”

[RID 51]

“This part is important because there is a lot of information that is important for HIV patients. And so sometimes people or patients want to see and hear, and through video, then that can be achieved. Because I’m watching images and I’m listening. An audiovisual, this is important.”

[RID 69]

“Anything new that comes up should be put here because people want to know how it relates to HIV and all the risks. And all the information must be updated and constantly updated based on what arises. Also, Covid/HIV is a topic that does not appear, and it would be necessary to interact with the testimony of people who had Covid and HIV, as was the whole process. That information should be put and updated.”

[RID 69].

Improve learnability was the third most frequently occurring code, used to code 108 (21.7%) excerpts of this data set. As previously mentioned, this subcategory captures participants’ recommendations to improve the overall learnability of the App; these focused largely on changes to visual aspects of the app, including highlighting important information, increasing the font size, changing font colors, and including visual aids such as pictures and logos.

“You need to add larger letters for those who are short-sighted.”

[RID 52]

“[The information is] not difficult [but this screen] doesn’t have images. It’d be nice to include them because the application now looks like a newspaper. It doesn’t have any amusing parts.”

[RID 62]

“White doesn’t attract much attention, and it’s hard to see it from a tablet and computer.”

[RID 73]

“The background of this color should be darker so that the sign can be seen. […] It’s a lot more effort to read it.”

[RID 74]

Each of the following codes also appeared in this data set, but to a small degree: Performance speed was used to code 40 (8%) excerpts, Translation 18 (3.6%) excerpts, Other outcomes 16 (3.2%) excerpts, Flexibility/Customizability 14 (2.8%) excerpts, and Error prevention 9 (1.8%) excerpts. Competency and Memorability did not appear in this data set.

3.2.2. Usability assessment using a think-aloud approach

All subjective concepts from the Health-ITUEM [36] successfully identified mHealth usability issues in this dataset. Similar to the usability data obtained from cognitive interviews, this dataset’s codes also centered on two major concepts, Learnability and Information needs, and one subcategory, Improve learnability. These three codes accounted for 81.3% of the excerpts in this data set. One hundred fifty-six excerpts (39.3%) were coded as Learnability, 84 (21.2%) as Information needs, and 83 (20.9%) as Improve learnability. Performance speed was used to code 24 (6%) excerpts, and Error prevention was used to code 16 (4%) excerpts. Flexibility/Customizability was used to code 14 (3.5%) excerpts. Translation, Other outcomes, Memorability, and Competency were the least used codes, applied to only 7 (1.8%), 6 (1.5%), 4 (1%), and 3 (0.8%) excerpts, respectively.

The Learnability code was extended after reviewing transcripts from usability assessments to also capture instances of participants discussing their ability to understand and complete assigned tasks within the App. Using a think-aloud approach, participants verbalized their experience completing tasks.

“I felt comfortable. It’s easy to use [the App] when you have some experience.”

[RID 85]

“[The task] was confusing because when I tapped on [the panel screen], I thought there would be an option to mark to take the dose, but it took me to a window to edit the schedule or something and quickly seemingly skipped the dose of medication instead of taking it. That wasn’t very clear.”

[RID 94]

Additionally, a few participants referenced technology literacy when using digital platforms, further adding to the understanding of learnability and information needs.

“You have to keep in mind that the people who will use this are not all technological. Make this more basic. Some older people are going to use this, and some children are going to help install it.”

[RID 88]

“It’s a little bit more difficult there. Because as I said before, you will meet people with different levels of education. So, try to use graphic drawings as much as possible because many may not understand the written language. With infographics or something like that.”

[RID 93]

Participants also specifically discussed information needs, focusing on additional content and instruction to assist with task completion.

“In principle, I think that if I’m going to use that application for the first time, I’m not familiar with it, I’m not used to it, there should be an explanation that says [to] touch the timeline that your medication is scheduled and that gives you the explanation that if you’re not taking it at the scheduled time, you have to put the time that you’re taking it so that the person knows. More instructions for the person to know that you were supposed to take it at 3 o’clock, but you did not mark 3 o’clock just because that was not your time to take it, you took it at 7 o’clock. You have to change, but just the time you take it, not the time you programmed. At least you who are there, I can ask you.”

[RID 97]

“As it’s my first time, you don’t know how to proceed. If you have a guide, you will know how to proceed […] A person who can tell you go this way, do this, and shows you the proper steps to do this so that in the future you know how to do it.”

[RID 85]

Cognitive interviews and usability assessments provided similar and complementary results in the usability testing of an mHealth app. The majority of data captured by both methods focused on learnability, information needs, and participant recommendations for enhancing learnability. However, usability assessments captured all subjective concepts of the Health-ITUEM [36], whereas cognitive interviewing provided more information on participants’ perspectives on the appearance and design of the app’s interface but did not yield results related to competency or memorability. Although the two methods are similar in many ways, their differences are important and have implications for the usability testing of mHealth technologies: each usability evaluation method yielded distinct results. Researchers should therefore carefully consider the specific usability aspects they aim to address, encompassing a diverse set of attributes such as learnability, information needs, and competency.

4. Discussion

To ensure success in the development and usability evaluation of mHealth tools, it is critical to select the evaluation techniques best suited to the study’s needs. Our methodological approach involved evaluating and comparing usability evaluation methods to inform an approach for developing mHealth tools. This study contributes to research on the methods used to design and test mHealth tools by describing the contributions, advantages, and disadvantages of different usability evaluation methods, including cognitive interviews and task-based usability assessments using a think-aloud approach.

This study was a qualitative content analysis [34] of usability data aimed at culturally adapting and translating an mHealth app for Spanish-speaking PWH [20]. Cognitive interviews were included in the methodological approach because they are well suited for exploring participants’ understanding, interpretation, and experience interacting with mHealth technologies [19,41]. While cognitive interviews were a valuable methodology for the overall study’s goals of disseminating and implementing the WiseApp, this evaluation method has limitations when used broadly across the development and usability testing of mHealth technology. Cognitive interviews provide valuable insights into participants’ cognitive processes and experiences; however, when participants cannot articulate their thoughts effectively, researchers face challenges such as incomplete data and misinterpretations that may affect data quality. To mitigate these issues, our study staff used probing questions to help participants express their thoughts and identify specific usability issues while interacting with the app.

To leverage the strengths of multiple usability methods, the study conducted cognitive interviews followed by task-based usability assessments using a think-aloud approach to translate and adapt the WiseApp. Task-based usability assessment allows participants to interact with the mHealth tool by performing specific tasks representative of realistic usage of the WiseApp, capturing the user experience [42]. This evaluation method provided insights into how participants navigated the app and identified usability issues and obstacles encountered while completing assigned tasks. Additionally, the think-aloud approach yielded participant feedback and verbalizations that provided supplemental insight into participants’ thoughts and experiences of how they approached and completed different tasks in the app.

This multi-site, cross-national study, with one site in New York City and one in La Romana, Dominican Republic, contributes to the rigor of the findings. The inclusion of diverse sites resulted in a heterogeneous population of Latino participants, which could translate to higher generalizability of our findings. Insights gained from this study may therefore be applicable to a wider array of settings and populations, bolstering the generalizability of the results and the broader relevance of the mHealth tool.

Despite these strengths, there were limitations. First, audio recordings from the Spanish-language interviews were translated and transcribed into English by a professional company and were not verified through member checking [43]. Using a transcription company that also translates may affect the accuracy, quality, and overall interpretation of the content. Another potential limitation is that our participants averaged 43 years of age, which may not represent older adults’ technology use. Finally, we acknowledge that the absence of concurrent use of complementary usability methods is a limitation of this study. Despite these limitations, our findings contribute to the research on the methods used to evaluate mHealth by comparing two different usability evaluation methods.

5. Conclusion

Findings from this study fill a gap in the literature by providing evaluation results and comparing two usability evaluation methods, cognitive interviews and task-based usability assessments using a think-aloud protocol. Additionally, this study contributes to the literature on using the Health-ITUEM [36] for mHealth technologies and demonstrates that its concepts are helpful when evaluating mHealth technology. While the methods we selected represent only some of several possibilities, this study can serve as a methodological reference for the usability evaluation of consumer mHealth tools during development. Furthermore, we suggest that future studies strengthen their methodological approach by involving participants in multiple, complementary usability evaluation methods. This approach may provide a more comprehensive evaluation of the usability of mHealth tools, but future research is needed.

Supplementary Material


Highlights.

  • The study emphasizes the need to consider appropriate usability evaluation techniques for the development and usability evaluation of mHealth tools.

  • This study contributes to the research on testing mHealth tools by describing the contributions of different usability evaluation methods, including cognitive interviews and usability assessments using a think-aloud protocol.

  • Cognitive interviews provided valuable insights into end-users’ cognitive processes, while usability assessments offered insights into how users navigate the mHealth tool.

  • Future studies are encouraged to involve end-users in multiple usability evaluation methods to provide a more comprehensive evaluation of mHealth tools.

Summary Table.

What is already known:

  • The ideal methodology for evaluating mobile health (mHealth) technology is unknown.

  • While many studies have explored usability approaches in mHealth, limited research has focused on comparing different usability evaluation methods.

What this paper adds:

  • Our methodological approach involved the evaluation and comparison of usability evaluation methods to determine a methodological approach for developing mHealth tools.

  • This study contributes to the research on testing mHealth tools by describing the contributions of different usability evaluation methods.

Acknowledgment

We extend our appreciation to Maureen George, PhD, RN, AE-C, FAAN, an expert in qualitative research, for her valuable contributions to the methodology of this qualitative analysis.

Funding

This work was funded by the Agency for Healthcare Research and Quality through grant number R18HS028523 (PI: Schnall). ALH was funded by the Reducing Health Disparities through Informatics (RHeaDI) training grant funded by the National Institute of Nursing Research (T32 NR007969). SS was funded through a career development award (R00NR017829) funded by the National Institute of Nursing Research of the National Institutes of Health. The content is solely the responsibility of the authors and does not necessarily represent the official views of the Agency for Healthcare Research and Quality or the National Institutes of Health.

Footnotes

Publisher's Disclaimer: This is a PDF file of an unedited manuscript that has been accepted for publication. As a service to our customers we are providing this early version of the manuscript. The manuscript will undergo copyediting, typesetting, and review of the resulting proof before it is published in its final form. Please note that during the production process errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal pertain.

Declaration of interests

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

References

  • 1.Akter S, & Ray P (2010). mHealth - an Ultimate Platform to Serve the Unserved. Yearbook of Medical Informatics, 94–100. [PubMed] [Google Scholar]
  • 2.Cole-Lewis H, & Kershaw T (2010). Text messaging as a tool for behavior change in disease prevention and management. Epidemiologic Reviews, 32(1), 56–69. 10.1093/epirev/mxq004 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 3.Rathbone AL, & Prescott J (2017). The Use of Mobile Apps and SMS Messaging as Physical and Mental Health Interventions: Systematic Review. Journal of Medical Internet Research, 19(8), e295. 10.2196/jmir.7740 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 4.Schnall R, Rojas M, Travers J, Brown W 3rd, & Bakken S (2014). Use of Design Science for Informing the Development of a Mobile App for Persons Living with HIV. AMIA Annual Symposium Proceedings, 2014, 1037–1045. [PMC free article] [PubMed] [Google Scholar]
  • 5.Ostergren JE, Rosser BR, & Horvath KJ (2011). Reasons for non-use of condoms among men who have sex with men: a comparison of receptive and insertive role in sex and online and offline meeting venue. Culture, Health & Sexuality, 13(2), 123–140. 10.1080/13691058.2010.520168 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 6.Winetrobe H, Rice E, Bauermeister J, Petering R, & Holloway IW (2014). Associations of unprotected anal intercourse with Grindr-met partners among Grindr-using young men who have sex with men in Los Angeles. AIDS Care, 26(10), 1303–1308. 10.1080/09540121.2014.911811 [DOI] [PubMed] [Google Scholar]
  • 7.Hightow-Weidman LB, Muessig KE, Bauermeister J, Zhang C, & LeGrand S (2015). Youth, Technology, and HIV: Recent Advances and Future Directions. Current HIV/AIDS reports, 12(4), 500–515. 10.1007/s11904-015-0280-x [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 8.Blanton M, Zhou J, Moro MM, Zhang D, Tsotras VJ, Halkidi M, Hainaut JL, Tanin E, Korn FR, Pedersen TB, Bohm C, Plant C, Zhang Q, Strauss MJ, Jensen CS, Snodgrass R, Li N, Kantarcioglu M, Sebe N, Jaimes A, Dix A, & Tompa FW (2009). Human-computer interaction. In Liu L, & Tamer Ozsu M (Eds.), Encyclopedia of Database Systems (pp. 1327–1331). Springer. [Google Scholar]
  • 9.Maramba I, Chatterjee A, & Newman C (2019). Methods of usability testing in the development of eHealth applications: A scoping review. International Journal of Medical Informatics, 126, 95–104. 10.1016/j.ijmedinf.2019.03.018 [DOI] [PubMed] [Google Scholar]
  • 10.Davis R, Gardner J, & Schnall R (2020). A review of usability evaluation methods and their use for testing eHealth HIV interventions. Current HIV/AIDS Reports, 17(3), 203–218. 10.1007/s11904-020-00493-3 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 11.Cho H, Yen PY, Dowding D, Merrill JA, & Schnall R (2018). A multi-level usability evaluation of mobile health applications: A case study. Journal of Biomedical Informatics, 86, 79–89. 10.1016/j.jbi.2018.08.012 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 12.Stonbraker S, Cho H, Hermosi G, Pichon A, & Schnall R (2018). Usability testing of a mHealth App to support self-management of HIV-associated non-AIDS related symptoms. Studies in Health Technology and Informatics, 250, 106–110. [PMC free article] [PubMed] [Google Scholar]
  • 13.Kuhns LM, Hereth J, Garofalo R, Hidalgo M, Johnson AK, Schnall R, Reisner SL, Belzer M, & Mimiaga MJ (2021). A Uniquely targeted, mobile app-based HIV prevention intervention for young transgender women: Adaptation and usability study. Journal of Medical Internet Research, 23(3), e21839. 10.2196/21839 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 14.Wang Q, Liu J, Zhou L, Tian J, Chen X, Zhang W, Wang H, Zhou W, & Gao Y (2022). Usability evaluation of mHealth apps for elderly individuals: a scoping review. BMC Medical Informatics and Decision Making, 22(1), 317. 10.1186/s12911-022-02064-5 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 15.Hoque R, & Sorwar G (2017). Understanding factors influencing the adoption of mHealth by the elderly: An extension of the UTAUT model. International Journal of Medical Informatics, 101, 75–84. 10.1016/j.ijmedinf.2017.02.002 [DOI] [PubMed] [Google Scholar]
  • 16.Palas JU, Sorwar G, Hoque MR, & Sivabalan A (2022). Factors influencing the elderly’s adoption of mHealth: an empirical study using extended UTAUT2 model. BMC Medical Informatics and Decision Making, 22(1), 191. 10.1186/s12911-022-01917-3 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 17.Cho H, Powell D, Pichon A, Kuhns LM, Garofalo R, & Schnall R (2019). Eye-tracking retrospective think-aloud as a novel approach for a usability evaluation. International Journal of Medical Informatics, 129, 366–373. 10.1016/j.ijmedinf.2019.07.010 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 18.Drennan J (2003). Cognitive interviewing: verbal data in the design and pretesting of questionnaires. Journal of Advanced Nursing, 42(1), 57–63. 10.1046/j.1365-2648.2003.02579.x [DOI] [PubMed] [Google Scholar]
  • 19.Levin K, Willis GB, Forsyth BH, Norberg A, Kudela MS, Stark D, & Thompson FE (2009). Using cognitive interviews to evaluate the Spanish-language translation of dietary questionnaire. Survey Research Methods, 3(1), 13–25. 10.18148/srm/2009.v3i1.88 [DOI] [Google Scholar]
  • 20.Schnall R, Ramirez SO, Padilla JJ, Halpern M, Olender S, & Baez P (2023). Expert feedback on the adaptation and translation of Spanish version of wiseApp. Studies in Health Technology and Informatics, 302, 500–501. 10.3233/SHTI230190 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 21.Schnall R, Bakken S, Rojas M, Travers J, & Carballo-Dieguez A (2015). mHealth technology as a persuasive tool for treatment, care and management of persons living with HIV. AIDS and Behavior, 19, 81–89. 10.1007/s10461-014-0984-8 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 22.Schnall R, Bakken S, Brown Iii W, Carballo-Dieguez A, & Iribarren S (2016). Usability evaluation of a prototype mobile app for health management for persons living with HIV. Studies in Health Technology and Informatics, 225, 481–485. [PMC free article] [PubMed] [Google Scholar]
  • 23.Schnall R, Higgins T, Brown W, Carballo-Dieguez A, & Bakken S (2015). Trust, perceived risk, perceived ease of use and perceived usefulness as factors related to mHealth technology use. Studies in Health Technology and Informatics, 216, 467–471. [PMC free article] [PubMed] [Google Scholar]
  • 24.Schnall R, Mosley JP, Iribarren SJ, Bakken S, Carballo-Diéguez A, & Brown Iii W (2015). Comparison of a user-centered design, a self-management app to existing mHealth apps for persons living With HIV. JMIR mHealth and uHealth, 3(3). 10.2196/mhealth.4882 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 25.Schnall R, Rojas M, Bakken S, Brown W, Carballo-Dieguez A, Carry M, Gelaude D, Mosley JP, & Travers J (2016). A user-centered model for designing consumer mobile health (mHealth) applications (apps). Journal of Biomedical Informatics, 60, 243–251. 10.1016/j.jbi.2016.02.002 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 26.Flynn G, Jia H, Reynolds NR, Mohr DC, & Schnall R (2020). Protocol of the randomized control trial: the WiseApp trial for improving health outcomes in PLWH (WiseApp). BMC Public Health, 20(1), 1775. 10.1186/s12889-020-09688-0 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 27.Beauchemin M, Gradilla M, Baik D, Cho H, & Schnall R (2019). A multi-step usability evaluation of a self-management app to support medication adherence in persons living with HIV. International Journal of Medical Informatics, 122, 37–44. 10.1016/j.ijmedinf.2018.11.012 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 28.Schnall R, Sanabria G, Jia H, Cho H, Bushover B, Reynolds NR, Gradilla M, Mohr DC, Ganzhorn S, & Olender S (2023). Efficacy of an mHealth self-management intervention for persons living with HIV: the WiseApp randomized clinical trial. Journal of the American Medical Informatics Association, 30(3), 418–426. 10.1093/jamia/ocac233 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 29.Alvarez G, Sanabria G, Jia H, Cho H, Reynolds NR, Gradilla M, Olender S, Mohr DC, & Schnall R (2023). Do walk step reminders improve physical activity in persons living with HIV in New York City?-results from a randomized clinical trial. The Journal of the Association of Nurses in AIDS Care, 34(6), 527–537. 10.1097/JNC.0000000000000427 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 30.Olaya F, Brin M, Caraballo PB, Halpern M, Jia H, Ramírez SO, Padilla JJ, Stonbraker S, & Schnall R (2024). A randomized controlled trial of the dissemination of an mHealth intervention for improving health outcomes: the WiseApp for Spanish-speakers living with HIV study protocol. BMC Public Health, 24(1), 201. 10.1186/s12889-023-17538-y [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 31.Jaspers MW (2009). A comparison of usability methods for testing interactive health technologies: methodological aspects and empirical evidence. International Journal of Medical Informatics, 78(5), 340–353. 10.1016/j.ijmedinf.2008.10.002 [DOI] [PubMed] [Google Scholar]
  • 32.MacLean LM, Meyer M, & Estable A (2004). Improving accuracy of transcripts in qualitative research. Qualitative Health Research, 14(1), 113–123. 10.1177/1049732303259804 [DOI] [PubMed] [Google Scholar]
  • 33.Johns Hopkins Sheridan Libraries. (2023, October 4). Qualitative data analysis software. https://guides.library.jhu.edu/QDAS
  • 34.Hsieh HF, & Shannon SE (2005). Three approaches to qualitative content analysis. Qualitative Health Research, 15(9), 1277–1288. 10.1177/1049732305276687 [DOI] [PubMed] [Google Scholar]
  • 35.Graneheim UH, & Lundman B (2004). Qualitative content analysis in nursing research: concepts, procedures and measures to achieve trustworthiness. Nurse Education Today, 24(2), 105–112. 10.1016/j.nedt.2003.10.001 [DOI] [PubMed] [Google Scholar]
  • 36.Yen PY (2010). Health information technology usability evaluation: Methods, models, and measures (Publication No. 3420882) [Doctoral dissertation, Columbia University]. ProQuest. [Google Scholar]
  • 37.Brown W 3rd, Yen PY, Rojas M, & Schnall R (2013). Assessment of the health IT usability evaluation model (Health-ITUEM) for evaluating mobile health (mHealth) technology. Journal of Biomedical Informatics, 46(6), 1080–1087. 10.1016/j.jbi.2013.08.001 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 38.MacQueen KM, McLellan E, Kay K, & Milstein B (1998). Codebook development for team-based qualitative analysis. Cultural Anthropology, 10(2), 31–36. 10.1177/1525822X980100020301 [DOI] [Google Scholar]
  • 39.Campbell JL, Quincy C, Osserman J, & Pedersen OK (2013). Coding in-depth semistructured interviews: Problems of unitization and intercoder reliability and agreement. Sociological Methods & Research, 42(3), 294–320. 10.1177/0049124113500475 [DOI] [Google Scholar]
  • 40.Guba EG (1981). Criteria for assessing the trustworthiness of naturalistic inquiries. Educational Technology Research and Development, 29(1), 75–91. 10.1007/BF02766777 [DOI] [Google Scholar]
  • 41.Eremenco S, Pease S, Mann S, Berry P, & PRO Consortium’s Process Subcommittee (2017). Patient-reported outcome (PRO) consortium translation process: Consensus development of updated best practices. Journal of Patient-Reported Outcomes, 2(1), 12. 10.1186/s41687-018-0037-6 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 42.Kessler MM, Breuch LK, Stambler DM, Campeau KL, Riggins OJ, Feddema E, Doornink SI, & Misono S (2021). User experience in health & medicine: Building methods for patient experience design in multidisciplinary collaborations. Journal of Technical Writing and Communication, 51(4), 380–406. 10.1177/00472816211044498 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 43.Al-Amer R, Ramjan L, Glew P, Darwish M, & Salamonson Y (2016). Language translation challenges with Arabic speakers participating in qualitative research studies. International Journal of Nursing Studies, 54, 150–157. 10.1016/j.ijnurstu.2015.04.010 [DOI] [PubMed] [Google Scholar]
