Table 4.
Data collection method, n (%) | Description | Metrics/tools | Comments |
---|---|---|---|
Performance metrics, 25 (26.1) | Collecting quantifiable measurements of participants’ actions during the test to understand the impacts of usability issues, usually focusing on effectiveness and efficiency | Effectiveness: number of errors, number of tasks that can be completed successfully; efficiency: task duration, number of times asking for assistance or hints, time spent recovering from errors | These quantitative indicators can be compared in young adults and seniors to reflect differences in performance [76, 81] |
Behavior observation log, 14 (14.6) | Observing and recording the participant’s mood and body gestures during the test | Sometimes, the observation is structured and based on predefined classifications of user behavior, such as delay or pause of > 5 s in locating the answer button [42] | This method is often used in conjunction with thinking aloud and performance metrics to improve triangulation [33] |
Screen recording, 3 (3.1) | Capturing the touches and actions performed on the mobile device | Screen recording software and video coding software (Behavioral Observation Research Interactive Software) | – |
Eye tracking, 1 (1.1) | Monitoring and recording the visual activity of the participants by tracing pupil movement within the eye | Fixations: the number of views of the area of interest; Saccade: the number of repeated visits to the specific area [52] | Because of the drooping eyelids of elderly individuals, the eye tracker may not scan their pupil accurately |
Concurrent thinking aloud, 25 (26.1) | Encouraging the participants to continuously verbalize their ideas, beliefs, expectations, doubts, and discoveries while performing tasks in order to understand their thoughts as they interact with the app | – | This method relies heavily on the cognitive capacities of participants, whereas these capacities decline with age; thus, it may cause reporter bias [83] |
Retrospective thinking aloud, 1 (1.1) | Asking the participants to view the recording of their actions and verbalize their thoughts about the tasks and the difficulties they encountered in completing the tasks | – | 1. This method will increase the overall length of the evaluation and may cause the elderly to lose focus [85] 2. This method will not increase the cognitive load of the elderly compared with concurrent thinking aloud [83] |
Questionnaire, 68 (70.8) | Gathering the participants’ opinions about, preferences for and satisfaction with the user interface on a predefined scale after they completed the tasks | Validated questionnaires: SUS, USE, UEQ, ASQ, NASA-TLX, NPS, Health-ITUES, QUIS, PSSUQ, ICF-US, MARS, Ruland’s eight-item adaptation of Davis' ease-of-use survey, self-made questionnaires according to the unique features of a specific app | 1. A larger sample size can be investigated by this method [78] 2. Some items have to be answered by an expert rather than the elderly because they are either beyond the scope of the test or based on experiencing rare occurrences [85] 3. To prevent the response burden of the elderly and improve the understandability of the questionnaire, some items are removed or the language is modified [43, 54, 82] |
Interview, 36 (37.5) | Collecting the data in the form of face-to-face oral conversations with the participants, including individual interviews and focus group interviews | The interview outline: opinions on unique features, product satisfaction, and difficulties encountered during the test as well as suggestions for improvement | 1. This method can obtain more new insights from the participants 2. This method is often combined with a questionnaire to collect the explanations of answers to the questionnaire |
Feedback log, 1 (1.1) | Asking the participants to record their experiences on a provided form when using the app | – | This method is suitable for long-term usability testing, as it can record the participant’s experience dynamically [52] |
SUS, System Usability Scale [86]; USE, Usefulness Satisfaction and Ease of Use Questionnaire [87]; UEQ, User Experience Questionnaire [88]; ASQ, After Scenario Questionnaire [89]; NASA-TLX, National Aeronautics and Space Administration Task Load Index [90]; NPS, Net Promoter Score [78]; Health-ITUES, Health Information Technology Usability Evaluation Scale [91]; QUIS, Questionnaire for User Interaction Satisfaction [92]; PSSUQ, Post-Study System Usability Questionnaire [93]; ICF-US, International Classification of Functioning based Usability Scale [94]; MARS, Mobile Application Rating Scale [95]