AMIA Summits on Translational Science Proceedings. 2022 May 23;2022:496–503.

Comparison of Alexa Voice and Audio Video Interfaces for Home-Based Physical Telerehabilitation

Chenhao Wei 1, Joseph Finkelstein 1
PMCID: PMC9285164  PMID: 35854718

Abstract

The goal of this pilot study was to compare Alexa voice and video interfaces for home-based telerehabilitation dialog by conducting cognitive walkthrough testing. All task performance scores were higher for the video interface than for the audio interface. The overall task score was significantly higher for the video interface (42.4±4.6) than for the audio interface (41.3±5.9). A comparative usability survey demonstrated a stronger preference for the video interface than for the audio interface. Based on the comparative survey, 85.7% of participants stated they definitely prefer the video interface, 85.7% felt that the video introduction was simpler to understand, 71.4% felt that exercise instructions were simpler to understand with the video interface, and 78.6% felt that overall navigation was easier with the video interface. The overall time to accomplish all three tasks was significantly shorter (p<0.05) for the video interface (170.5±12.2 seconds) than for the audio interface (194.2±10.3 seconds). This is the first study systematically comparing two major Alexa interfaces in a telerehabilitation system. These results are instrumental for future development of Alexa-based telerehabilitation systems.

Introduction

Physical rehabilitation has been shown to slow down disease progression and improve quality of life in a wide range of chronic health conditions [1-3]. Access to life-long rehabilitation programs may be limited due to a spectrum of barriers including insurance coverage and mobility limitations [2]. Telemedicine approaches have been shown to be effective in addressing these barriers [3]. Making telerehabilitation services more acceptable for people with limited health and computer literacy may affect the ability of older adults with chronic health conditions to use home-based telemedicine systems effectively. Recent projects explored the use of Amazon Alexa as a means of delivering various skillsets to individuals with minimal computer literacy. The Alexa assistant can potentially help older adults who do not feel comfortable using computers or computer-like devices to overcome computer literacy barriers by allowing natural voice-driven interactions with the telerehabilitation system. However, the use of Amazon Alexa for delivery of rehabilitation services has not been systematically explored. With regard to interface, Alexa devices can be divided into two groups: voice-operated units without video capability and voice-operated units with integrated video capability [4]. It is not clear whether the addition of video affects the usability of these devices for telerehabilitation services. The goal of this pilot study was to compare Alexa voice and video interfaces for home-based telerehabilitation dialog by conducting cognitive walkthrough testing.

Methods

System Design

The system delivers online rehabilitation exercises with personalized schedules through voice interaction with Amazon’s voice assistant, Alexa, so that aging patients can exercise at home. Patients can complete the whole rehabilitation exercise session by voice via Amazon Echo devices. The design of the Alexa-based Remote Rehabilitation System is depicted in Figure 1. All voice data are collected by the Echo device and then sent to the Alexa voice-processing server. After the server translates the voice data into string commands, our program analyzes the command’s intent and fetches the exercise schedule or media resources to generate the response that is sent back to the Echo device. The rehabilitation system comprises four stages: checking the exercise schedule, watching the exercise introduction video, exercising with the video, and completing the post-exercise survey.
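The intent-handling step described above can be sketched as a small dispatcher. This is only an illustration under stated assumptions, not the authors' implementation: the intent names, the schedule contents, the media identifiers, and the `handle_intent` function are all hypothetical, and the sketch deliberately avoids any real Alexa SDK calls.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical schedule and media names; the paper does not publish them.
SCHEDULE = {
    "stand to sit": {
        "repetitions": 3,
        "intro": "stand_to_sit_intro",
        "exercise": "stand_to_sit_exercise",
    }
}


@dataclass
class Response:
    speech: str                  # text spoken back by the device
    media: Optional[str] = None  # video file (screen device) or audio file


def handle_intent(intent: str, exercise: Optional[str], has_screen: bool) -> Response:
    """Map a recognized intent to a response for one of the four stages."""
    # Stage 1: exercise schedule check.
    if intent == "CheckScheduleIntent":
        return Response("Today's exercises: " + ", ".join(SCHEDULE) + ".")
    # Stage 4: post-exercise survey.
    if intent == "SurveyIntent":
        return Response("How difficult was the exercise, from 1 to 5?")
    if exercise not in SCHEDULE:
        return Response("Sorry, that exercise is not in your schedule.")
    entry = SCHEDULE[exercise]
    # A screen device gets video; an audio-only device gets an audio clip.
    suffix = ".mp4" if has_screen else ".mp3"
    # Stage 2: introduction.
    if intent == "IntroductionIntent":
        return Response("Here is the introduction.", entry["intro"] + suffix)
    # Stage 3: the exercise itself.
    if intent == "StartExerciseIntent":
        speech = f"Repeat the exercise {entry['repetitions']} times."
        return Response(speech, entry["exercise"] + suffix)
    return Response("Sorry, I did not understand that.")
```

Keeping the screen/audio decision in one place (`has_screen`) means the same dialog logic can serve both device groups, which is the comparison this study is built around.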

Figure 1.

System Design

Study Design

All participants were given a list of instructions and questionnaires to carry out the system’s cognitive walkthrough. The participants underwent two tests: one test required interaction with an Alexa device that supported only voice communication, and the other required interaction with an Alexa device with a video screen providing visual cues and short videos in parallel with voice communication. The order of these two tests was randomly selected for each participant in order to prevent training bias. Before the test, participants filled out a pre-survey collecting socio-demographic information and were familiarized with the Alexa commands. All participants used the same Echo Show and Echo Dot devices to interact with the system. All participants were timed for each task. If a participant needed additional help to complete a task, the research assistant noted it in the test report. At the completion of each cognitive walkthrough task, each participant was asked to grade the task from 1 (very difficult) to 5 (very easy) using a survey that contained the following questions: (1) How difficult or easy was it to complete this task? (2) How satisfied are you with using this application/system to complete this task? (3) How would you rate the amount of time it took to complete this task? After participants finished all the tasks, they completed the System Usability Scale (SUS) and the System Usability Comparison survey.
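As a rough illustration of the protocol above, the following sketch randomizes the test order per participant and aggregates the three post-task ratings. It assumes that a task score is the sum of the three 1 to 5 ratings (range 3 to 15) and that the total is the sum over the three tasks (range 9 to 45), which is consistent with the magnitudes reported in Table 4 but is not stated explicitly in the paper. All function names are hypothetical.

```python
import random

INTERFACES = ("audio", "video")


def assign_order(participant_ids, seed=None):
    """Randomly pick which interface each participant tests first."""
    rng = random.Random(seed)
    orders = {}
    for pid in participant_ids:
        order = list(INTERFACES)
        rng.shuffle(order)
        orders[pid] = tuple(order)
    return orders


def task_score(ratings):
    """Sum the three 1-5 ratings given after one task (range 3-15)."""
    if len(ratings) != 3 or any(not 1 <= r <= 5 for r in ratings):
        raise ValueError("expected three ratings between 1 and 5")
    return sum(ratings)


def total_score(tasks):
    """Sum the task scores over all tasks of one test session."""
    return sum(task_score(ratings) for ratings in tasks)


# Hypothetical ratings for one participant on one interface.
session = [[5, 5, 4], [5, 4, 4], [5, 5, 3]]
print(assign_order(range(1, 15), seed=1))
print(total_score(session))
```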

Results

The user interface and conversation tree are depicted in Figure 2. Fourteen cognitive walkthrough experiments were completed. The profiles of the 14 study participants are presented in Table 2. The usability analysis is presented in Tables 2-6. As shown in Table 4, all task performance scores were higher for the video interface than for the audio interface. Specifically, the total Task 2 score for the video interface (14.3±1.5) was significantly higher than the total audio score (13.7±1.9), the total Task 3 video score (13.6±2.9) was significantly higher than the audio score (13.4±3.0), and the overall task score was significantly higher for the video interface (42.4±4.6) than for the audio interface (41.3±5.9). The comparative usability survey in Table 6 demonstrated a stronger preference for the video interface than for the audio interface. Based on the comparative survey (Table 5), 85.7% of participants stated they definitely prefer the video interface, 85.7% felt that the video introduction was simpler to understand, 71.4% felt that exercise instructions were simpler to understand with the video interface, and 78.6% felt that overall navigation was easier with the video interface. The overall time to accomplish all three tasks was significantly shorter (p<0.05) for the video interface (170.5±12.2 seconds) than for the audio interface (194.2±10.3 seconds).
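Because every participant used both interfaces, the comparisons above are paired. The paper reports p-values but does not name the statistical test, so the sketch below should be read as one possible paired analysis (an exact sign-flip permutation test), not as the authors' method. The function name and the sample times are hypothetical.

```python
from itertools import product


def sign_flip_p_value(video, audio, tol=1e-12):
    """Two-sided exact sign-flip permutation test on paired differences."""
    if len(video) != len(audio) or not video:
        raise ValueError("need two non-empty samples of equal length")
    diffs = [v - a for v, a in zip(video, audio)]
    observed = abs(sum(diffs))
    extreme = 0
    total = 0
    # Under the null hypothesis each paired difference is equally likely
    # to be positive or negative, so enumerate every sign assignment.
    for signs in product((1, -1), repeat=len(diffs)):
        total += 1
        if abs(sum(s * d for s, d in zip(signs, diffs))) >= observed - tol:
            extreme += 1
    return extreme / total


# Hypothetical completion times in seconds for four participants.
video_times = [168.0, 172.0, 165.0, 175.0]
audio_times = [190.0, 196.0, 188.0, 201.0]
print(sign_flip_p_value(video_times, audio_times))  # 0.125
```

With only four pairs the smallest attainable two-sided p-value is 2/16 = 0.125; with 14 participants the enumeration covers 2^14 = 16384 sign assignments, which is still fast to compute exactly.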

Figure 2.

User Interface and Conversation Tree

Table 2.

Participant profile (N=14).

| Characteristic | Mean (SD) |
|---|---|
| Age (years) | 37.2 (12.5) |
| How many days exercise/week | 1.9 |

| Characteristic | Category | % (N) |
|---|---|---|
| Gender | Female | 42.9 (6) |
| | Male | 57.1 (8) |
| Race | Asian | 57.1 (8) |
| | White | 35.7 (5) |
| | Other | 7.2 (1) |
| ATM use | Once a day | 0 |
| | Once a week | 21.4 (3) |
| | Once a month or less | 71.4 (10) |
| | Never | 7.2 (1) |
| Internet proficiency | Excellent | 78.6 (11) |
| | Good | 21.4 (3) |
| | Very limited | 0 |
| Native English speaker | Yes | 28.6 (4) |
| | No | 71.4 (10) |
| Born in USA | Yes | 28.6 (4) |
| | No | 71.4 (10) |
| Job | Permanent | 100 (14) |
| Internet use | Once a day | 100 (14) |
| Computer use work/school | Once a day | 100 (14) |
| English proficiency | Excellent | 64.2 (9) |
| | Good | 35.8 (5) |
| Experience with Alexa Echo | None | 64.2 (9) |
| | Once | 7.2 (1) |
| | Several times | 14.3 (2) |
| | Frequent | 14.3 (2) |

Table 4.

Results of patient testing of the Alexa-based home telerehabilitation system with video screen or with audio only

| Task | Question | Screen Mean | Screen SD | Audio Mean | Audio SD | Mean Δ | p-value |
|---|---|---|---|---|---|---|---|
| Task 1. Review Your Exercise Plan | 1. How difficult or easy was it to complete this task? | 4.786 | 0.43 | 4.714 | 0.61 | 0.071 | 0.583 |
| | 2. How satisfied are you with using this application/system to complete this task? | 4.714 | 0.73 | 4.500 | 0.65 | 0.214 | 0.189 |
| | 3. How would you rate the amount of time it took to complete this task? | 4.643 | 0.93 | 4.429 | 1.16 | 0.071 | 0.336 |
| | Total Task 1 Score | 14.571 | 0.85 | 14.214 | 1.67 | 0.357 | 0.208 |
| Task 2. Review Instruction Video for Stand-to-Sit Exercise | 1. How difficult or easy was it to complete this task? | 4.929 | 0.27 | 4.714 | 0.73 | 0.214 | 0.189 |
| | 2. How satisfied are you with using this application/system to complete this task? | 4.714 | 0.61 | 4.429 | 0.94 | 0.286 | 0.040 |
| | 3. How would you rate the amount of time it took to complete this task? | 4.571 | 0.94 | 4.286 | 1.14 | 0.071 | 0.671 |
| | Total Task 2 Score | 14.286 | 1.49 | 13.714 | 1.90 | 0.571 | 0.041 |
| Task 3. Perform Exercise Stand-to-Sit | 1. How difficult or easy was it to complete this task? | 4.857 | 0.36 | 4.786 | 0.43 | 0.214 | 0.082 |
| | 2. How satisfied are you with using this application/system to complete this task? | 4.714 | 0.61 | 4.643 | 0.63 | 0.286 | 0.040 |
| | 3. How would you rate the amount of time it took to complete this task? | 4.357 | 1.15 | 4.643 | 0.93 | -0.286 | 0.040 |
| | Total Task 3 Score | 13.571 | 2.85 | 13.357 | 3.03 | 0.214 | 0.189 |
| Total Task Score (Task 1-Task 3) | | 42.429 | 4.75 | 41.286 | 5.93 | 1.143 | 0.036 |
| Exit Survey | The Alexa exercise program is appealing | 4.643 | 0.84 | 4.439 | 0.94 | 0.214 | 0.082 |
| | The Alexa exercise program is easy to navigate | 4.571 | 0.94 | 4.286 | 1.07 | 0.286 | 0.302 |
| | Total Exit Survey Score | 9.214 | 1.72 | 8.714 | 1.94 | 0.500 | 0.151 |

Table 6.

System Usability Comparison and System Usability Scale for the system with screen†

Items Mean SD
Q1 4.7 0.7
Q2 4.3 1.0
Q3 4.1 1.0
Q4 4.4 1.0

† 1: strongly disagree – 5: strongly agree

Table 5.

System Usability Comparison Survey

| Questions asked after each task | Score Range |
|---|---|
| Q1. I prefer the Alexa with video screen more than the pure voice system. | 1, “Strongly Disagree,” to 5, “Strongly Agree” |
| Q2. The video introduction guides me more clearly than the audio introduction. | 1, “Strongly Disagree,” to 5, “Strongly Agree” |
| Q3. The exercise video instructions are better than the exercise audio explanation. | 1, “Strongly Disagree,” to 5, “Strongly Agree” |
| Q4. The video screen system is easier to follow than the pure voice system. | 1, “Strongly Disagree,” to 5, “Strongly Agree” |
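The percentages reported in the Results for this survey (85.7%, 71.4%, 78.6%) can be related back to respondent counts with simple arithmetic. The sketch below assumes that all 14 participants answered each question; the counts it prints are inferred from the percentages, not reported in the paper, and the helper names are hypothetical.

```python
N_PARTICIPANTS = 14  # from the participant profile table (N=14)


def implied_count(percent, n=N_PARTICIPANTS):
    """Nearest whole number of respondents matching a reported percentage."""
    return round(percent / 100 * n)


def as_percent(count, n=N_PARTICIPANTS):
    """Percentage of n respondents, rounded to one decimal place."""
    return round(100 * count / n, 1)


for pct in (85.7, 71.4, 78.6):
    k = implied_count(pct)
    print(f"{pct}% of {N_PARTICIPANTS} is about {k} participants ({as_percent(k)}%)")
```

Under that assumption the three figures correspond to 12, 10, and 11 of the 14 participants, respectively.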

Discussion

The usability inspection of the Alexa-based remote rehabilitation system with 14 participants demonstrated its high acceptance. Comparison of the voice and video interfaces demonstrated statistically significantly better performance of the video interface. Our results are congruent with previous reports demonstrating the significant potential of patient-centered digital health [5] tailored to patient preferences. Previous studies demonstrated the potential utility of Alexa-based voice assistants as voice interface technology in patients with heart failure [6] and in geriatric care [7]. Our findings establish an evidence-based usability framework for further expansion of Alexa-based devices into the telerehabilitation domain by demonstrating the utility of a combined voice and video interface. Home-based telerehabilitation systems with user-friendly interfaces implemented with direct user input have been shown to be well accepted in patients with COPD [8], post-acute hip fracture recovery [9], geriatric syndromes [10], multiple sclerosis [11] and cancer [12]. Future steps would include developing Alexa-based user-friendly interfaces with interactive video components for these patients.

Conclusion

This is the first study systematically comparing two major Alexa interfaces in a telerehabilitation system. These results are instrumental for future development of Alexa-based telerehabilitation systems.

Figures & Table

Table 1.

Tasks performed by study participants during cognitive walkthrough (steps marked with * apply only to the audio-only experiment).

Task 1 Review your exercise plan
1. Sit in front of the Alexa Echo Show / Alexa Echo Dot*
2. Speak out “Alexa, open my exercise.”
3. Wait for the Alexa response.
4. Speak out “Alexa, show me the exercise.”
Task 2 Review Instruction for “Stand-to-Sit” Exercise
1. Choose “Stand to sit” exercise by saying “Alexa, stand to sit.”
2. Wait for the Alexa to ask you whether you want the introduction or not.
3. Speak out “Yes” to start watching the introduction video / to start listening to the introduction audio.*
4. Finish the video/audio*.
Task 3 Perform the Exercise “Stand-to-Sit”
1. Speak out “Alexa, start exercise.”
2. Follow the video to do the “stand to sit” exercise 3 times.
3. Click the “Exercise Finished” button / Speak out “Finish.”*
4. Answer the post-exercise survey questions.

Table 3.

Post-task survey

| Questions asked after each task | Score Range | Sub Session |
|---|---|---|
| How difficult or easy was it to complete this task? | 1, “Very Difficult,” to 5, “Very Easy.” | Task X.1 |
| How satisfied are you with using this application/system to complete this task? | 1, “Very Unsatisfied,” to 5, “Very Satisfied.” | Task X.2 |
| How would you rate the amount of time it took to complete this task? | 1, “Too Much Time,” to 5, “Very Little Time.” | Task X.3 |

References

  • 1. Bedra M, McNabney M, Stiassny D, Nicholas J, Finkelstein J. Defining patient-centered characteristics of a telerehabilitation system for patients with COPD. Stud Health Technol Inform. 2013;190:24–26.
  • 2. Zeng X, Chen H, Ruan H, Ye X, Li J, Hong C. Effectiveness and safety of exercise training and rehabilitation in pulmonary hypertension: a systematic review and meta-analysis. J Thorac Dis. 2020;12(5):2691–705. doi: 10.21037/jtd.2020.03.69.
  • 3. Holland AE, Mahal A, Hill CJ, Lee AL, Burge AT, Cox NS, Moore R, Nicolson C, O'Halloran P, Lahham A, Gillies R. Home-based rehabilitation for COPD using minimal resources: a randomised, controlled equivalence trial. Thorax. 2017;72(1):57–65. doi: 10.1136/thoraxjnl-2016-208514.
  • 4. Shade MY, Rector K, Soumana R, Kupzyk K. Voice Assistant Reminders for Pain Self-Management Tasks in Aging Adults. J Gerontol Nurs. 2020 Oct 1;46(10):27–33. doi: 10.3928/00989134-20200820-03.
  • 5. Finkelstein J, Jeong IC. Feasibility of interactive biking exercise system for telemanagement in elderly. Stud Health Technol Inform. 2013;192:642–646.
  • 6. Apergi LA, Bjarnadottir MV, Baras JS, Golden BL, Anderson KM, Chou J, Shara N. Voice Interface Technology Adoption by Patients with Heart Failure: Pilot Comparison Study. JMIR Mhealth Uhealth. 2021 Apr 1;9(4):e24646. doi: 10.2196/24646.
  • 7. Balasubramanian GV, Beaney P, Chambers R. Digital personal assistants are smart ways for assistive technology to aid the health and wellbeing of patients and carers. BMC Geriatr. 2021 Nov 15;21(1):643. doi: 10.1186/s12877-021-02436-y.
  • 8. Finkelstein J, Jeong IC, Doerstling M, Shen Y, Wei C, Karpatkin H. Usability of Remote Assessment of Exercise Capacity for Pulmonary Telerehabilitation Program. Stud Health Technol Inform. 2020 Nov 23;275:72–76. doi: 10.3233/SHTI200697.
  • 9. Bedra M, Finkelstein J. Feasibility of post-acute hip fracture telerehabilitation in older adults. Stud Health Technol Inform. 2015;210:469–73.
  • 10. Finkelstein J, Cisse P, Jeong IC. Feasibility of Interactive Resistance Chair in Older Adults with Diabetes. Stud Health Technol Inform. 2015;213:61–4.
  • 11. Jeong IC, Karpatkin H, Stein J, et al. Relationship Between Exercise Duration in Multimodal Telerehabilitation and Quality of Sleep in Patients with Multiple Sclerosis. Stud Health Technol Inform. 2020 Jun 16;270:658–662. doi: 10.3233/SHTI200242.
  • 12. Finkelstein J, Huo X, Parvanova I, Galsky M. Usability Inspection of a Mobile Cancer Telerehabilitation System. Stud Health Technol Inform. 2022 Jan 14;289:405–409. doi: 10.3233/SHTI210944.

Articles from AMIA Summits on Translational Science Proceedings are provided here courtesy of American Medical Informatics Association
