Abstract
Multi-domain activities that incorporate physical, cognitive, and social stimuli can enhance older adults’ overall health and quality of life. Several robotic platforms have been developed to provide these therapies in a quantifiable manner to complement healthcare personnel in resource-strapped long-term care settings. However, these platforms are primarily limited to one-to-one human-robot interaction (HRI) and thus do not enhance social interaction. In this paper, we present a novel HRI framework and a realized platform called SAR-Connect to foster robot-mediated social interaction among older adults through carefully designed tasks that also incorporate physical and cognitive stimuli. SAR-Connect seamlessly integrates a humanoid robot with a virtual reality-based activity platform and a multimodal data acquisition module that captures participants’ game interaction, audio, visual, and electroencephalography responses. Results from a laboratory-based user study with older adults indicate the potential of SAR-Connect, showing that the system could 1) involve one or multiple older adults in multi-domain activities and provide dynamic guidance, 2) engage them in the robot-mediated task and foster human-human interaction, and 3) quantify their social and activity engagement from multiple sensory modalities.
Keywords: Socially Assistive Robotics, Human-Centered Robotics, Virtual Reality and Interfaces, Multi-user Human-Robot Interaction
I. Introduction
In 2014, people aged 65 and over accounted for 15 percent of the US population. As the baby boom generation ages, this share will increase dramatically: by 2030, the older population is projected to represent nearly 21 percent of the total population [1]. The majority of older adults have multiple chronic health conditions, which result in increased health care expenditures and limitations in activities of daily living [1, 2]. Among these chronic diseases, dementia is a prevalent syndrome characterized by difficulties with memory, language, problem-solving, and other cognitive skills related to everyday activities. Alzheimer’s disease, the most common cause of dementia, is the sixth leading cause of death among older adults. One in ten people aged 65 and over has Alzheimer’s disease. In addition, approximately 15 to 20 percent of older adults have mild cognitive impairment (MCI), a potential precursor to Alzheimer’s and other dementias [2].
A. The Importance of Social Activities and Engagement
There is no cure as yet for dementia. A major focus in geriatrics is prevention of cognitive decline, even among those with MCI or dementia. Social activity and engagement is perhaps one of the most effective non-pharmacologic strategies to maintain cognitive function and decrease the rate of cognitive decline among older adults with and without dementia [3, 4]. Moreover, social activity and engagement is often advocated for older adults with dementia who also experience neuropsychiatric symptoms, including depression and apathy – two conditions that have limited pharmacologic options [5, 6]. Older adults residing in long term care are particularly susceptible to depression and apathy, with up to 50% having depression [7] and 72% with apathy [8]. Activities in a group setting are particularly effective because older adults are engaged both in the activity stimuli and the simultaneous contact with others. Additionally, the combination of sensory and motor based activity appears most effective for enhancing engagement and reducing rate of cognitive decline and apathy [3, 9–11].
The importance of social activity and engagement in long term care settings (LTCs) is manifested by the U.S. Centers for Medicare and Medicaid Services (CMS) regulations that mandate LTCs provide residents with an individualized program of activities (§483.15(f)) [12]. However, activities are often suboptimal in degree of engagement, variety, stimulation, or content [13]. LTC caregivers frequently lack the time, skill, or resources to engage older adults in activities. This is problematic since multimodal social engagement strategies, which appear most successful in slowing cognitive decline, are resource intensive [14]. Most LTCs have inadequate staffing, either in labor quantity or skill mix, and provide only 0.1 to 0.6 activity staff hours/resident/day [15]. Hence, the National Institute on Aging and the National Science and Technology Council have recognized the critical need for smart technology as a viable option for resource-strapped long term care settings; virtual environments and robotics are two such technologies that show promise. Our goal in this initial work was to design, develop, and validate a robotic system that could deliver these multimodal interventions with a minimum of human involvement.
B. Literature Review
Socially assistive robotic (SAR) systems have been proposed to provide social companionship [16, 17], support independent living [17], facilitate healthy eating [18], and engage older adults in various forms of physically and cognitively stimulating activities [18–23]. While promising, many of these initial robotic systems are open-loop or remotely operated [24, 25]. In recent years, closed-loop SAR systems, which can monitor human interaction in real time and adapt system behaviors accordingly, have been proposed for the care of older adults. Tapus et al. [21] developed a SAR system to help stroke patients and people with cognitive impairment. Fasola and Mataric [19] designed a robotic exercise coach that monitored older adults’ performance of chair exercises and actively provided feedback and guidance to encourage task completion. Gorer et al. [20] developed a robotic fitness coach that learned physical exercises from a professional trainer and guided older adults through these exercises. McColl et al. [18] developed a robotic system to engage older adults in a meal eating activity and a cognitively stimulating activity.
Although promising, these systems were designed to work with a single older adult and did not address the social aspect of older adults’ health and wellbeing by involving multiple users simultaneously. Realizing the need for social interaction, Louie et al. [22] developed an autonomous assistive robot, Tangy, that played Bingo with a group of older adults. However, the goal of the system was to plan and facilitate group activity rather than to promote interpersonal social interaction; the robot’s behavior adaptation during game playing was applied only at the individual level. Similar to Tangy, the robot Matilda was designed to play Bingo and Hoy with groups of 8 to 30 older adults [26]. Back et al. [23] and Matsusaka et al. [24] developed SAR systems to lead physical activity with multiple older adults. Konah et al. [27] developed a series of robot-assisted activities for group interaction. These systems have been shown to be useful; however, they either operate in an open-loop fashion or require a human mediator, which creates resource constraints. Although not designed for older adults, one SAR system was developed to enhance human-human interaction (HHI): Matsuyama et al. [28] programmed a conversational robot, SCHEMA, to participate in a conversational game with the goal of promoting communication among human participants.
In addition to SAR systems, virtual reality (VR)/virtual environments to support the care of older adults have also been explored [29]. Redon et al. [30] studied how exercising with balance games on a Wii Balance Board could increase dynamic balance and static stability in older adults. Young et al. [31] also developed a platform consisting of a Wii Balance Board and a virtual environment to improve older adults’ balance through interaction with virtual tasks. Anderson-Hanley et al. [32] compared the effect of stationary cycling with and without VR tours on older adults’ cognition; they found that cycling with VR tours had greater potential for preventing cognitive decline.
C. Scope of this Work
In this work, we integrate the advantages of a physically embodied social robot with those of VR systems to create a unified system, SAR-Connect, that can provide a variety of engaging tasks, administer both individual and multi-person human-robot interaction (HRI), and measure task performance and interaction with relatively low complexity. In SAR-Connect, an embodied robot acts as both a practice partner and a coach. Several studies have shown that people trust a physical robot and pay more attention to it compared with virtual avatars [19, 25, 33]. However, current social robots, at least those that are relatively inexpensive, lack the payload and dexterity needed for meaningful engaging tasks that require handling objects. At the same time, one can create a large number of interesting and engaging virtual tasks that are useful for older adults.
We chose VR as the task implementation platform over a real-world task because 1) VR provides a safe interaction environment, 2) tasks can be designed to be interesting without exhausting older adults by adapting to each individual’s range of motion (ROM), endurance, and physical strength, and 3) VR allows software-based comprehensive objective measurement of older adults’ interaction without requiring an extensive set of sensors. The integration of VR and robotics provides a novel platform that leverages the advantages of both technologies for multimodal interaction strategies and for activity and social engagement. Note that as the functionality and perception capabilities of social robots become more advanced, the VR-based activities could be replaced by similar physical tasks without any change to the presented HRI framework and without sacrificing the advantages provided by VR mentioned above.
The goal of the SAR-Connect system is to engage older adults and foster interpersonal social interaction among them while helping them remain physically and mentally active. The robot is responsible for keeping older adults engaged with the task as well as with each other. As the older adults start interacting with each other in the activity-oriented therapies, the role of the robot gradually fades away. To the best of our knowledge, this is the first instance of a SAR system designed to foster interpersonal social interaction among older adults. The rest of the paper is structured as follows. Section II describes key challenges for our robot-mediated social interaction tasks for older adults. Section III presents a conceptual framework to engage older adults and the HRI framework. Section IV provides an overview of the SAR-Connect system, followed by a detailed discussion of system design and development in Section V. Section VI and Section VII present the user study and its results. Finally, further discussion of the system and its impact is provided in Section VIII and Section IX.
II. Challenges of Robot-mediated Social Interaction for Older Adults
The foremost challenge in HRI with older adults is to ensure technology acceptance [34]. Robot behaviors need to be friendly, easily understandable, and meaningful [35]. The interaction tasks should be meaningful and engaging, and feasible given any physical and cognitive limitations of older adults [14, 36]. With regard to the specific goals of the current work, the SAR must provide activity-oriented therapies based on multimodal strategies that are tailored to the individual and that highlight the importance of social engagement [14, 32, 37]. Thus, to be most effective, SAR-Connect needs to offer multimodal stimuli, including physical, cognitive, and social components.
The design of robot behaviors is a key issue for successful HRI with older adults. Since the purpose of the SAR is to administer activity-oriented therapies and foster social interaction, the robot must dynamically guide older adults to perform the activity and fulfill task requirements related to the physical, cognitive, and social stimuli. This requires the robot to understand and interpret multi-user HRI in terms of task engagement, performance, and HHI for task completion.
The HRI task design for older adults requires attention to their impairments. Many older adults experience impairments in vision, hearing, motor functioning, and cognitive functioning and memory. To address these impairments, several steps may be taken to create a fulfilling HRI experience: large object sizes; addition of text to objects; addition of sound within the task when necessary; use of vibrant colors and increased color contrast; slow, low-pitched robot speech; interfaces that do not demand high hand dexterity or cognitive load; and simple task rules with repeated instructions and reminders. We discuss specific design issues of the presented tasks in Section IV.
Finally, the SAR needs to achieve measurable progress on physical and cognitive function and on social interaction. Unlike in traditional robotic systems or personal service robots, older adults’ progress cannot simply be extracted from the task specification. Although robot behaviors are tailored to older adults’ task performance, performance itself is not an optimal indicator of progress for our work. While we want older adults to improve or maintain physical and cognitive function through HRI, the primary goal of the current work is to improve their activity and social engagement, which may ultimately lead to a reduction of apathy in future studies. Thus, there needs to be a mechanism to measure activity and social engagement without highly resource-intensive manual coding.
III. HRI Framework
A. Conceptual Framework to Engage Older Adults
The primary objective of our HRI framework was to administer engaging physical, cognitive, and social activities to older adults through multimodal interaction. We adapted Cohen-Mansfield’s Group Comprehensive Process Model of Engagement for multiple older adults, since social interactions enhance engagement [35, 38]. Engagement is defined as the act of being occupied or involved with an external stimulus. Following Cohen-Mansfield’s work, we examine engagement through the person’s attendance at the activity sessions, the degree of attentiveness to the activity, the degree to which the individual actively participates, the person’s attitude, and whether the person appears bored.
We incorporated environmental factors (e.g., noise) [39], individual characteristics (e.g., impairments) [40], and activity factors (e.g., domain) [14, 41] into our HRI framework because these can affect the degree to which individuals engage in an activity. We adapted the Unified Theory of Acceptance and Use of Technology framework [42] to determine individual characteristics likely to influence a person’s interaction with the robot: performance expectancy, effort expectancy, and attitude. We also incorporated robot factors that influence HRI, including autonomy, anthropomorphic aspects, and multimodal sensors [43, 44].
Individual Characteristics of older adults have been associated with engagement in activities in long term care settings. Female gender, higher cognitive function, and better physical function are positively associated with longer engagement in activities [35, 38–41]. Visual and hearing impairments as well as cognitive impairment are associated with shorter times of engagement. Pre-existing conditions such as depression, apathy, those associated with fatigue (e.g., congestive heart failure), limited movement (e.g., arthritis), or pain (e.g., cancer) can all negatively impact the person’s ability to engage in activities as can certain medications. With regard to Activity Factors, there is a growing body of evidence that multi-modal activities that engage more than one domain (cognitive, physical, social) are more successful in engaging older adults than simple one domain activities [35, 38–41].
For this study, we did not examine the environment since all experiments were conducted in a standardized manner within an engineering laboratory. Nor did we follow people over time to determine the effect on outcomes of apathy. Rather, our intent was to determine whether we could devise a robotic architecture that could successfully engage older adults with one another to achieve an activity or task.
Following this conceptual framework and our goals, we devised an HRI framework consisting of a general task structure for multimodal activities and a robotic system architecture for activity and social engagement among older adults, as described below. The concrete SAR system (Fig. 1) developed following the HRI framework is described in detail in the following two sections.
Fig. 1. The concrete SAR-Connect system overview.
B. Design of a General Task Structure
In our previous studies, we explored older adults’ perceptions and acceptance of different forms of physical and cognitive activities as well as simultaneous interaction with their peers [45]. For both one-to-one interaction with the robot and triadic HRI involving two older adults and the robot, older adults’ perceptions of the robot were more positive after the session. Social communication between the two older adults was observed during triadic HRI. These results indicated that robot-mediated physical and cognitive activities were well tolerated by the older adults and that SAR had the potential to involve more than one person and facilitate interpersonal social interaction. One weakness of our previous robot-mediated activities lay in their ad-hoc nature and the lack of a mechanism to encourage HHI. This motivated the design of task structures that involve physical and cognitive stimuli and encourage interpersonal social interaction.
We propose a general task structure (Fig. 2) using a hierarchical and modular design, which is flexible enough to accommodate combinations of physical, cognitive, and social stimuli and general enough to account for a large variety of tasks, including the ones we previously developed in [45]. The essential elements in a task are categorized into physical, cognitive, and social stimuli. Subtasks are formed by involving one or multiple essential elements. These subtasks are then combined to form various tasks. For example, to design a chair exercise task, we can take physical essential elements such as arm movement and head movement to form gross motor movements such as raising the arms up and looking around using head rotation. A chair exercise is then formed by combining multiple gross motor movements. To design a physical and cognitive task, we can take arm movement as the physical stimulus and create an ordered sequence of gross movements that requires memorization and ordered execution as the cognitive stimulus. To add social stimuli to this chair exercise sequence, we can add another person to the HRI and define collaborative rules that encourage older adults to communicate and collaborate. Simon Says is one such task: it incorporates social stimuli by having older adults mimic each other’s ordered physical movements.
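To make the modular composition concrete, the following minimal Python sketch shows one way the hierarchy of essential elements, subtasks, and tasks could be represented; the class and field names are illustrative assumptions, not the authors’ implementation.

```python
# A minimal sketch of the hierarchical task structure: essential elements
# (physical, cognitive, social) compose subtasks, which compose tasks.
# Names are illustrative, not from the authors' implementation.

from dataclasses import dataclass
from enum import Enum
from typing import List


class Stimulus(Enum):
    PHYSICAL = "physical"
    COGNITIVE = "cognitive"
    SOCIAL = "social"


@dataclass
class EssentialElement:
    """An atomic building block, e.g., 'arm movement' (physical)."""
    name: str
    stimulus: Stimulus


@dataclass
class Subtask:
    """A subtask combines one or more essential elements,
    e.g., a gross motor movement such as 'raise arms up'."""
    name: str
    elements: List[EssentialElement]


@dataclass
class Task:
    """A task is a composition of subtasks; an ordered sequence adds
    cognitive stimuli, mimicry of a partner adds social stimuli."""
    name: str
    subtasks: List[Subtask]

    def stimuli(self):
        return {e.stimulus for st in self.subtasks for e in st.elements}


# Example: a Simon-Says-style exercise combining all three domains
arm = EssentialElement("arm movement", Stimulus.PHYSICAL)
order = EssentialElement("ordered execution", Stimulus.COGNITIVE)
mimic = EssentialElement("mimic partner", Stimulus.SOCIAL)
simon_says = Task("Simon Says",
                  [Subtask("raise arms in sequence", [arm, order, mimic])])
assert simon_says.stimuli() == {Stimulus.PHYSICAL, Stimulus.COGNITIVE,
                                Stimulus.SOCIAL}
```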
Fig. 2. Task structure.
C. Robotic System Architecture
The robotic system architecture is illustrated in Fig. 3. Robot characteristics are handled by the Robot Behavior and Low-Level Robot Controller modules. Activity factors are taken into account by the Activity State module. HRI is realized through a user-friendly motion-based interaction interface, as well as various sensory modalities that provide measures of activity and social engagement such as visual attention, verbal communication, and physiological responses.
Fig. 3. System architecture.
With feedback from older adults, LTC activity directors, and the geriatric and psychiatric experts within our team (Mion, Beuscher, and Newhouse), we designed a virtual book sorting task to demonstrate the proposed HRI framework; it combines gross motor movement, sorting, and collaborative rules to provide physical, cognitive, and social stimuli. The details of the task are discussed in Section V. We chose the humanoid NAO robot as the robotic platform to administer the VR-based multimodal task since older adults were interested and engaged in activities led by this robot in our previous studies [45].
SAR-Connect was composed of a NAO, a Microsoft Kinect for Windows RGB-D sensor, two 14-channel Emotiv electroencephalogram (EEG) headsets, and a VR-based task displayed on a 32-inch computer monitor. One or more users sit facing the Kinect sensor and interact with the system through arm and hand movements. The Kinect sensor tracks the skeleton positions and hand states of the users and sends them to the Interaction Manager module. The Interaction Manager module maps the arm and hand movements of the users in the real world to hand cursors and grip/release cursor states in the virtual world, allowing users to manipulate virtual books. Users’ interaction is mediated by NAO via robot speech and gestures. The core element, the Supervisory Controller module, communicates with the Interaction Manager module, the VR-based multimodal task, and the robot for real-time closed-loop interaction. It gains knowledge of users’ interaction with the robot-mediated task by monitoring, updating, and analyzing users’ movements, task states, and robot behaviors. It then dynamically guides users to perform the activity and fulfill task requirements by generating events that trigger robot behaviors as well as audiovisual feedback in the VR-based task.
In terms of measurable progress, in addition to computing older adults’ task performance based on their scores, we added data acquisition modules to the system to continuously record multimodal sensory data capturing older adults’ behaviors during HRI. In addition to the comprehensive interaction data from the VR-based task, the system automatically logs the head pose angles of older adults, the sound source angles detected by the Kinect sensor, robot behaviors, and older adults’ EEG signals (Fig. 3). The EEG signals are used as an implicit measurement of user engagement during the interaction. Data from multiple sensory modalities are analyzed offline to develop algorithms for automatically evaluating social interaction and activity engagement that capture older adults’ responses to the physical, cognitive, and social stimuli generated by the robot-mediated task. For this initial validation study, we wished to use sensing modalities that followed our conceptual framework to engage older adults. We adapted Cohen-Mansfield’s Group Comprehensive Process Model of Engagement, which states that engagement in a group activity can be measured by a multitude of factors. We captured these factors using implicit (e.g., EEG and gaze estimation through head pose) and explicit sensing measures (e.g., gestures and audio).
IV. System Overview
Fig. 1 shows the concrete SAR-Connect system used by two older adults. Both older adults interact with the system through a motion-based user interface (UI) with NAO acting as a facilitator. The robot is responsible for 1) engaging an older adult with both physical and cognitive exercises; and 2) further helping foster social interaction between two older adults by guiding them to perform collaborative exercises. Specifically, the robot has several roles for the virtual book sorting task. First, it continuously monitors how an older adult is interacting with the system using gestures. Second, the robot observes the state changes in the VR-based task and provides appropriate feedback. Third, it guides older adults to achieve task requirements related to the physical, cognitive, and social stimuli and encourages HHI. Finally, it acts as a user and guidance provider when interacting with a single older adult and takes actions in the VR-based task to perform the activity.
The goal of the VR-based task is to sort virtual books into the collection bins based on their colors. The task was designed to offer physical, cognitive, and social stimuli. The physical stimuli are the arm and hand movements that result in manipulative actions in the virtual environment. Collaborative rules that encourage older adults to help each other to achieve a common goal were specifically designed as social stimuli. The cognitive stimuli are the book sorting task itself, the mapping rules for the physical stimuli, and the collaborative rules for the social stimuli.
The robot engages older adults in the multimodal task and encourages social communication through gestures and speech. It teaches the older adults how to interact with the virtual environment through incremental learning and demonstration. The complex task of collecting a virtual book is broken down into subtasks such as controlling the hand cursor, closing the hand, and moving a book in different directions. As the robot explains each subtask, it demonstrates the arm and hand movements needed to accomplish the subtask and evaluates older adults’ performance in real time to provide appropriate guidance. When interacting with a single older adult to sort books, robot behaviors range from reminding the older adult to move books to offering a book to help the older adult. Robot head movements simulate attention; the head is programmed to look at the screen, the sensor, or the older adults. Robot hand movements related to the VR-based task are in sync with audio and visual effects in the virtual environment.
To foster social interaction between two older adults, the robot intervenes immediately when older adults are not collaborating by reminding them of the collaborative rules and telling them how to collaborate. In addition, when one of the older adults struggles to interact with the task, the robot asks the other older adult to offer help before providing instructions directly. Another method used to induce social communication is referring to older adults by their names; we found this elicited gaze behavior towards one another in our previous study [45]. Details on robot behavior design and generation are described in the next section.
As reported by others [46], several challenges with the task design for older adults became evident during the participatory design phase. Older adults’ feedback based on common age-related vision changes resulted in modifications to the virtual environment: increased size of books and bins; increased size of the ‘hand’ cursor; more intense and vibrant colors displayed on a dark background; increased thickness and color contrast of the vertical line boundaries; addition of text to the bins; and addition of sound when a book was correctly placed in a bin. Older adults also evidenced some difficulty with depth perception in negotiating the movement of the books in the VR. To accommodate this, additional instruction and prompting from the robot was incorporated. To address hearing changes, we slowed the robot’s speech and lowered its pitch to compensate for loss of hearing at higher frequencies.
Motor changes with aging were evident in reduced manual dexterity, slower movement, and greater variance in movements. Thus, we changed the control mechanism from a hand-held device to a motion-based UI that allowed the person to grip, move, and release books using hand and arm movements. To address the variance in movements (e.g., speed of movement, reduced range of motion, tremors), greater latitude was programmed into the robot’s evaluation of the older adults’ movements.
Notation:
Subscripts s and w denote positions in screen space and world space, respectively. Functions S2W() and W2S() map positions from screen space to world space and vice versa. Position is represented by the Vector3 structure, which contains x, y, and z components.
ALGORITHM 1:
Collaborative Rule for Red-Green Book Task
Given: user i’s interaction area IAi and collaboration area CAi.
Input: current book position BPw; user index idx.
Requirements for IAi and CAi (for all i ≠ j, i, j ∈ {1, …, n}):
(1) U ← IA1 ∪ IA2 ∪ … ∪ IAn; IAi ∩ IAj ≠ ∅ {the virtual world is composed of the IAi; users share interaction areas}
(2) Ai ← U − (IA1 ∪ … ∪ IAi−1 ∪ IAi+1 ∪ … ∪ IAn) ≠ ∅; Ai ≠ Aj {Ai: area accessible only by user i}
(3) CAi ⊄ Ai; CAi ⊂ IAi
Update book score given user interactions:
1: do forever
2:   get start position sBPw, end position eBPw, idx
3:   if sBPw ≠ Vector3.zero and eBPw ≠ Vector3.zero
4:     sBPs, eBPs ← W2S(sBPw), W2S(eBPw)
5:     if book ∈ i and idx ≠ i and sBPs ∈ U − IAi and eBPs ∈ CAi
6:       collaborative move; book score increases; play correct sound
7:     else if book ∈ i and idx ≠ i and sBPs ∈ IAi and eBPs ∈ U − IAi
8:       competitive move; book score decreases; play wrong sound
9:     else if book ∈ i and idx ≠ i and sBPs ∈ U − IAi and eBPs ∈ IAi
10:      unsuccessful collaborative move
11:    else if book ∈ i and idx = i and sBPs ∈ U − IAi and eBPs ∈ IAi
12:      reduced chance of collaboration
13:    sBPw, eBPw ← Vector3.zero, Vector3.zero
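The following minimal Python sketch illustrates the move classification at the heart of Algorithm 1 in a simplified 2D screen space. The rectangular area geometry and the example coordinates are assumptions for illustration, not the authors’ implementation, and the world-to-screen conversion is assumed to have been applied already.

```python
# A minimal 2D sketch of Algorithm 1's move classification. Screen-space
# rectangles stand in for the interaction/collaboration areas; the area
# geometry and coordinates below are illustrative assumptions.

def in_rect(p, rect):
    """rect = (xmin, ymin, xmax, ymax); p = (x, y) in screen space."""
    x, y = p
    return rect[0] <= x <= rect[2] and rect[1] <= y <= rect[3]

def classify_move(book_owner, mover_idx, start_s, end_s, IA, CA):
    """Classify one completed book move per Algorithm 1's four cases.
    IA[i]: interaction area of user i; CA[i]: collaboration area of user i;
    start_s/end_s: the book's start and end positions in screen space."""
    i = book_owner
    if mover_idx != i and not in_rect(start_s, IA[i]) and in_rect(end_s, CA[i]):
        return "collaborative"            # score rises; correct sound plays
    if mover_idx != i and in_rect(start_s, IA[i]) and not in_rect(end_s, IA[i]):
        return "competitive"              # score resets; wrong sound plays
    if mover_idx != i and not in_rect(start_s, IA[i]) and in_rect(end_s, IA[i]):
        return "unsuccessful_collaborative"    # no feedback; informs robot
    if mover_idx == i and not in_rect(start_s, IA[i]) and in_rect(end_s, IA[i]):
        return "reduced_collaboration_chance"  # no feedback; informs robot
    return None

# Red user 0 reaches the left 2/3 of the screen, green user 1 the right 2/3,
# so the middle 1/3 is shared; collaboration squares overlap the shared area.
IA = {0: (0.0, 0.0, 2 / 3, 1.0), 1: (1 / 3, 0.0, 1.0, 1.0)}
CA = {0: (0.35, 0.0, 0.50, 0.2), 1: (0.50, 0.0, 0.65, 0.2)}
print(classify_move(book_owner=1, mover_idx=0, start_s=(0.1, 0.5),
                    end_s=(0.55, 0.1), IA=IA, CA=CA))  # -> collaborative
```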
To address slower cognitive processing with aging, instructions were kept simple, accompanied by corresponding movements from the robot, and repeated until the person was successful in manipulating the books in the environment. The cognitive portion of the task was designed at three levels: maximum robot interaction and input, moderate robot interaction, and minimal robot interaction. See Section V for further details on the book sorting games and robot behaviors.
ALGORITHM 2:
Collaborative Rule for Yellow Book Task
Input: users’ cursor positions CPs; current book position BPw.
Initialization:
1: prevCPs[i] ← Vector3.zero for i = 1 to n
2: collab ← [True, True, True] {for the x, y, z directions}
Update target position TPw:
3: do while users grab the same book
4:   get CPs, BPw
5:   BPs, direct, TPs ← W2S(BPw), Vector3.zero, Vector3.zero
6:   for i = 1 to n do {calculate TPs given each user’s CPs[i]}
7:     if prevCPs[i] = Vector3.zero
8:       handle edge case: TPs ← n × BPs; break
9:     currDirect ← CPs[i] − prevCPs[i]
10:    if direct.x = 0
11:      initialize common direction: direct.x ← currDirect.x
12:    else if collab[1] and (direct.x matches currDirect.x) {users 1 to i are moving in the same x direction}
13:      collab[1] ← True
14:    else collab[1] ← False
15:    TPs.x ← TPs.x + CPs[i].x if collab[1] else n × BPs.x
16:  repeat lines 10–15 for TPs in the y and z directions
17:  TPw ← S2W(TPs ÷ n); move the book to TPw
18:  prevCPs ← CPs
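The common-effort rule of Algorithm 2 can be illustrated with a short Python sketch: a jointly grabbed book moves along an axis only while all users’ cursor movements agree in direction along that axis, and its per-axis target is the mean of the cursor positions. The 2D simplification and function names are illustrative assumptions.

```python
# A minimal 2D sketch of Algorithm 2's common-effort rule: a jointly grabbed
# book moves along an axis only while every user's cursor moves in the same
# direction along that axis; the new per-axis position is the cursor mean.

def joint_target(prev_cursors, cursors, book_pos):
    """prev_cursors, cursors: list of (x, y) per user; book_pos: (x, y).
    Returns the book's new position."""
    n = len(cursors)
    target = list(book_pos)
    for axis in range(2):  # x and y
        deltas = [c[axis] - p[axis] for c, p in zip(cursors, prev_cursors)]
        moving = [d for d in deltas if d != 0]
        # collaborate on this axis only if all moving cursors agree in sign
        if moving and (all(d > 0 for d in moving) or all(d < 0 for d in moving)):
            target[axis] = sum(c[axis] for c in cursors) / n
    return tuple(target)

prev = [(0.2, 0.5), (0.6, 0.5)]
curr = [(0.3, 0.5), (0.7, 0.5)]  # both cursors move right along x
print(joint_target(prev, curr, (0.4, 0.5)))  # -> (0.5, 0.5): book follows
curr = [(0.3, 0.5), (0.5, 0.5)]  # cursors move in opposite x directions
print(joint_target(prev, curr, (0.4, 0.5)))  # -> (0.4, 0.5): book stays put
```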
V. System Design and Development
A. VR-based Multimodal Task
The book sorting task was designed based on our general task structure to ensure the task had physical, cognitive, and social stimuli. In addition, the task needed to be interesting and engaging for older adults. Several tasks were created in consultation with stakeholders and tested by a group of older adults who were not part of the study; we found that the book sorting task was engaging and that they enjoyed working together. The design details of the VR-based multimodal task were presented in [47]. Here, for the sake of continuity, we briefly describe the virtual book sorting task, its physical and cognitive stimuli, and the embedded collaborative rules for encouraging interpersonal social interaction.
The virtual book sorting task was developed using the Unity game engine and is shown in Fig. 4. The goal is to sort virtual books into the collection bins based on their colors. Each user is assigned to collect books of a particular color. Efficient collection of some of these books may require help from the other user. By sorting the books as a team, the users increase their game scores. The physical stimuli come from a motion-based UI in the Interaction Manager module that naturally maps users’ physical movements to manipulative actions in the virtual world. The motion-based UI is realized by means of a Kinect sensor using its skeleton tracking and hand state detection features. It supports grip, move, and release actions through physical movements. To control a hand cursor in the VR-based task through physical movements, we first defined a user’s left and right interaction boxes. Fig. 5 illustrates the front and side views of the interaction boxes in the Kinect coordinate space. The positions of the shoulder, hip, and spine joints were used to compute the vertices of the interaction boxes. Only one hand controls a user’s hand cursor at a time, and the current interaction box is the one corresponding to the current control hand. In the virtual world, a corresponding 3D interaction area is assigned to each user. A user’s cursor position is the projection of his/her hand position from the interaction box in the physical world to the interaction area in the virtual world. Next, users need to be able to manipulate books through simple hand gestures. Kinect’s hand state detection algorithm returns five possible hand states: closed, lasso, not tracked, open, and unknown. A finite state machine was designed to map these detections to close and open hand states, which were in turn mapped to grip and release cursor states, respectively.
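A minimal sketch of one plausible implementation of this mapping is shown below; the debounce count and the treatment of the not tracked and unknown states are assumptions, since the paper does not specify the FSM’s parameters.

```python
# A minimal sketch of a finite state machine that maps Kinect's five raw
# hand states to stable grip/release cursor states. The debounce count is
# an assumed parameter.

RAW_STATES = {"closed", "lasso", "not_tracked", "open", "unknown"}

class HandStateFSM:
    def __init__(self, debounce=3):
        self.state = "release"    # cursor state: "grip" or "release"
        self.debounce = debounce  # consecutive frames needed to switch
        self._count = 0

    def update(self, raw):
        """Feed one raw Kinect hand state per frame; returns cursor state.
        'lasso' is treated as closed; 'not_tracked'/'unknown' keep the
        current state so brief tracking dropouts do not drop the book."""
        assert raw in RAW_STATES
        if raw in ("not_tracked", "unknown"):
            self._count = 0
            return self.state
        desired = "grip" if raw in ("closed", "lasso") else "release"
        if desired == self.state:
            self._count = 0
        else:
            self._count += 1
            if self._count >= self.debounce:
                self.state, self._count = desired, 0
        return self.state

fsm = HandStateFSM()
frames = ["open", "closed", "unknown", "closed", "closed", "closed"]
print([fsm.update(f) for f in frames])
# -> ['release', 'release', 'release', 'release', 'release', 'grip']
```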
Fig. 4. The virtual book sorting task.
Fig. 5. Interaction boxes.
The cognitive stimuli are the book sorting task itself, the mapping rules for the physical stimuli, and the collaborative rules for the social stimuli. Two collaborative rules were designed to encourage social communication between older adults: the rule for the red-green book task is shown in Algorithm 1, and the rule for the yellow book task is shown in Algorithm 2. Note that these collaborative rules are not restricted to two users, allowing for future extension of SAR-Connect to work with more than two older adults.
For the red-green book task, based on Algorithm 1, the virtual world is divided into two interaction areas, marked by the red and green vertical lines (Fig. 4). The red interaction area excludes the space to the right of the green vertical line, and the green interaction area excludes the space to the left of the red vertical line. This results in one shared interaction area and two areas that are accessible only by each individual user. The constraints on the interaction areas create the need for collaboration between the users and therefore induce the potential for social communication. For example, the user controlling the red cursor cannot move the cursor past the green vertical line. Therefore, books that are not reachable by the red cursor need to be moved to the shared interaction area by the user controlling the green cursor.
To make the collaborative rule more specific, we defined two collaborative areas, the red and green squares on the virtual floor. Users’ collaboration is linked with the scoring scheme of the task. Each book has an initial score of 5. The books with numbers on them are called team bonus books and are positioned far away from the color-matched bins. If the user controlling the red cursor moves a green team bonus book closer to the other user by putting the book inside the green collaborative area, this is considered a collaborative move, and the score of the team bonus book increases to 10, accompanied by rewarding audio feedback. A user is able to prevent the other user from scoring by moving a book outside the other user’s reach; such competitive moves are discouraged by resetting the score of the team bonus book back to 5 and playing an error sound. In Algorithm 1, in addition to the collaborative move and competitive move conditions, two more conditions are encoded: the user may intend to collaborate without successfully reaching the collaborative area, and the user may reduce his/her peer’s opportunity to make collaborative moves. These two conditions are not associated with any immediate activity feedback but are used to determine robot behaviors.
For the yellow book task, based on Algorithm 2, the users collaborate by grabbing the same book and moving it in the same direction; otherwise, the book does not move. The moving direction of each user’s cursor is projected onto the X, Y, and Z directions in the virtual world. If the x components of both users’ movements are in the same direction, the target position of the yellow book along the X axis is the mean of the two hand cursors’ positions along the X axis. The target positions of the yellow book along the Y and Z axes are computed similarly.
The collaborative rule for the red-green book task (Algorithm 1) depends on interaction areas and collaboration areas, whereas the collaborative rule for the yellow book task (Algorithm 2) depends on common effort. In addition, older adults can take turns or simultaneously interact with the system. This permits different combinations of interaction strategies when more than two older adults are involved. Each older adult can have their own unique interaction area, and their own unique or shared collaboration areas. Older adults can be grouped into teams, in which case the members of the teams can take turns interacting with the system or can follow the collaborative rule in Algorithm 2 to interact with the system simultaneously. For the yellow book task, instead of requiring all older adults to move the same book at the same time, we can define the task to require at least n older adults to move the same book in the same direction. From an HRI perspective, the same measures can be used to trigger the various robot behaviors. Social engagement in a group setting can be defined as the amount of social interaction between any two older adults through collaboration in the task or through gaze and speech.
SAR-Connect was developed based on the red-green book task, which we refer to as the main task. The yellow book task was used as a post-test to explore older adults’ behaviors when they perform a similar task (book collection) with unknown information. Older adults see yellow books and yellow bins but are not aware of the collaborative rule requiring them to move the same book in the same direction together. We were interested in determining whether older adults would communicate with each other to figure out the unknown piece of the task. If they could not move any yellow book halfway through the interaction, the robot gave them a hint by asking them to try moving the book together.
B. Supervisory Controller
The key behaviors of our SAR system were implemented in the Supervisory Controller module. During HRI, the system continuously evaluates older adults’ activity compliance and collaboration status and generates robot behaviors to engage older adults in the robot-mediated task and in social interaction with their peers. There are three interaction modes: a single older adult interacting with the system (one-to-one interaction), and two older adults either taking turns or simultaneously interacting with the system (triadic interactions).
Our SAR system is a hybrid system involving both discrete events and continuous dynamics. A Low-level Robot Controller module and a Virtual Book Sorting Activity module are responsible for the continuous dynamics, including the physical behaviors of virtual objects and the robot’s physical movements. The Supervisory Controller module decides the mode transition structure of robot behaviors and activity states. It was modeled by timed automata and hierarchical state machines (HSM), as shown in Fig. 6. The top level of the hierarchy contains two concurrent states, Robot Behavior and Activity State, each with its own state refinement. These two super-states communicate with each other using shared variables through a network interface. Fig. 6 only illustrates the modes or strategies the system is following; these modes were modeled by HSM, Markov decision processes (MDP), or finite state machines (FSM). The notation for labeling state transitions is guard / action. The Activity State is updated as it receives inputs representing the physical movement and task performance of older adults from the Interaction Manager and Virtual Book Sorting Activity modules, and inputs representing the current robot status from the Low-level Robot Controller module. The Activity State is always in one of the three interaction modes. In the triadic interaction – take turns mode, the sub-state changes from one user to another when the current interacting user successfully collects a book or makes a collaborative move; the variables associated with the new user are reset with the transition.
Fig. 6. Supervisory controller module.
The initial state for Robot Behavior is the Robot Instruction State, where the robot explains the interaction logic to the user(s). During the interaction, the robot is in one of five states: Play State, MDP State, Correcting Feedback State, Immediate Collaboration Feedback State, or Score Feedback State. The robot finishes the interaction in the End State with a dance if the total score is high. When the robot is in the Play State, it is in a standing posture with its head rotated towards the virtual world as if it is monitoring the activity state. The MDP State and the Feedback States are the ones that generate robot behaviors automatically based on real-time human interactions. State transitions between the Play State and the four other states, i.e., the MDP State and the Feedback States, are triggered by time variables and discrete events. The robot goes back to the Play State after the Low-level Robot Controller completes the designated robot behavior assigned in one of the four states; if no robot behavior is assigned, the robot goes back to the Play State immediately. While the MDP State and Feedback States wait for the robot’s continuous dynamics to finish, time variables and discrete events keep updating. We designed the refinement of Robot Behavior in this way because it takes time for the robot to execute any behavior, and while the robot is handling one event, another event may occur. This design ensures that no events are neglected and that, if an event needs to be handled at a later time, the variables are always up to date.
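The following minimal Python sketch illustrates the guard/action pattern just described: events are queued while the robot is busy executing a behavior, so none are lost, and the super-state returns to Play once the low-level controller signals completion. The event names and the busy flag are illustrative assumptions.

```python
# A minimal sketch of the Robot Behavior super-state's guard/action
# transitions: idle in Play, enter a feedback state when a guard fires,
# and return to Play when the behavior finishes. Names are assumptions.

import queue

class RobotBehaviorHSM:
    def __init__(self):
        self.state = "Play"
        self.events = queue.Queue()  # events keep accumulating while busy
        self.robot_busy = False

    def dispatch(self, event):
        self.events.put(event)

    def step(self):
        """One supervisory tick with guard / action semantics."""
        if self.state != "Play":
            if not self.robot_busy:          # guard: behavior finished
                self.state = "Play"          # action: return to Play
            return
        if not self.events.empty():          # guard: pending event
            event = self.events.get()
            # action: enter the matching feedback state, start the behavior
            self.state = {"low_grip": "CorrectingFeedback",
                          "no_collab": "ImmediateCollabFeedback",
                          "milestone": "ScoreFeedback"}.get(event, "Play")
            self.robot_busy = self.state != "Play"

hsm = RobotBehaviorHSM()
hsm.dispatch("low_grip")
hsm.step(); print(hsm.state)   # -> CorrectingFeedback
hsm.dispatch("milestone")      # queued, not lost, while the robot is busy
hsm.robot_busy = False         # low-level controller signals completion
hsm.step(); print(hsm.state)   # -> Play (milestone handled on a later tick)
```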
To facilitate and engage older adults in interacting with the system and with each other, we developed several robot behaviors, ranging from playing as a user to prompting user(s) on how to improve their interaction. These robot behaviors are categorized into the MDP State or one of the three Feedback States. In the MDP State, the robot plays the role of a second user and assumes that it is playing with a perfect user, who interacts with the system correctly and knows how to obtain a maximal score. In the Feedback States, the goal of the robot is to make user(s) perform better by prompting them on how to control their hand cursors correctly and how to collaborate with their peers to improve their scores, and by celebrating achievements marked by high scores to keep them engaged and motivated. Table I summarizes the variables related to all Feedback States and the resulting robot behaviors. The variables related to the Correcting Feedback State are checked in order; for example, if the GripPerc variable triggered a robot behavior, the CursorHeight variable had already satisfied its requirement, and the robot would not evaluate the remaining two variables. The four states are completely decoupled.
TABLE I.
Variables Related to Feedback States
| Feedback State | Variable | Description / Robot Behavior |
|---|---|---|
| Correcting Feedback State | CursorHeight | Averaged cursor screen height over time interval tcorrecting. If this value is below threshold Thch, NAO reminds users to hold their hands up higher. |
| | GripPerc | The percentage of time the cursor is in the grip state over time interval tcorrecting. If this value is below threshold Thgp, NAO reminds users to close their hands and pick up books. |
| | BookDist | The longest book-moving distance during time interval tcorrecting. If this value is 0, NAO reminds users to move books. Otherwise, if this value is below threshold Thbd, the book usually dropped while the user was moving it, and NAO suggests that users move books slowly. |
| | HowToCollab | The number of times users initiate a collaborative move but fail during time interval tcorrecting. If this value is greater than 0, NAO encourages their attempt and reminds them how to do it correctly. |
| Immediate Collaboration Feedback State | ShouldCollab | True if users try to collect a team bonus book without collaboration. NAO reminds users to collaborate. |
| | NoCompete | True if users play competitively by moving a team bonus book outside the reach of the other user. NAO persuades them to stop competing. |
| Score Feedback State | Score | Cumulative book collection score. NAO celebrates once each time the score passes 30, 60, and 90. |
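The in-order check of the Correcting Feedback State variables can be sketched as a short priority cascade; the threshold values and prompt wording below are illustrative assumptions, not the system’s actual parameters.

```python
# A minimal sketch of the ordered check of Table I's Correcting Feedback
# variables: once one variable triggers a prompt, the rest are skipped.
# Thresholds and prompt texts are illustrative assumptions.

TH_CH, TH_GP, TH_BD = 0.3, 0.1, 0.05  # assumed thresholds

def correcting_feedback(cursor_height, grip_perc, book_dist, failed_collabs):
    """Returns the single robot prompt for this interval, or None."""
    if cursor_height < TH_CH:
        return "Hold your hands up higher."
    if grip_perc < TH_GP:
        return "Close your hand to pick up books."
    if book_dist == 0:
        return "Try to move the books."
    if book_dist < TH_BD:
        return "Move the books slowly so they don't drop."
    if failed_collabs > 0:
        return "Nice try! Move the book into the colored square to collaborate."
    return None

# Low grip percentage triggers before the book-distance checks are reached
print(correcting_feedback(cursor_height=0.5, grip_perc=0.02,
                          book_dist=0.0, failed_collabs=1))
```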
1). Examples of Robot Behaviors
Examples of robot gestures are 1) task-related moves, such as moving virtual books and demonstrating how to move books in different directions; 2) guidance moves, such as pointing and looking at older adults to provide feedback; 3) celebration moves, such as clapping and dancing; and 4) observation moves, such as looking toward the computer monitor to allow interactions of the older adults without interference from the robot.
During orientation, the robot teaches the older adult participants how to interact with the virtual environment through incremental learning and demonstration. The robot breaks down the task of grabbing and moving virtual books into subtasks such as controlling the hand cursor, closing the hand, and moving a book up and down. As the robot explains each subtask, it demonstrates the arm and hand movements needed to accomplish the subtask and ensures that the participant is able to follow before continuing to the next subtask. For example, when the robot teaches an older adult to grab a book, the instruction it provides is “Try to move your hand on one of the books. Then close your hand to grab the book. Open your hand with palm facing the sensor and then close your hand. Follow me.” If the older adult fails to successfully grab the book, depending on his/her performance, the robot might say “I see you closed your hand. Put your hand on the book to grab it.” or “Take your time. Once you put your hand on the book, close your hand to grab it.” Otherwise, the robot provides positive feedback such as “Great job!” and “Wonderful!” and continues demonstrating the next subtask.
During the red-green book task, the robot behaviors fall into the behavior states in Fig. 6. In the Robot Instruction State, NAO provides instructions on the collaborative game, emphasizes how the score will increase with collaboration, and encourages older adults to achieve a high score by saying it will dance to celebrate if they achieve more than 100 points. In the End State, the robot dances to music if a high score is achieved and thanks the older adults for their participation. The MDP State is specifically designed for one-to-one HRI; there are four robot behaviors: no action, collect book, offer book, and request book. The no action behavior continues whatever the robot was previously doing. When the robot collects or offers books, its arm and hand movements are in sync with the book movement in the virtual environment. The robot says “I will help you with this book. Here you go.” when offering a book and “Can you get a green book? I cannot reach that far.” when requesting a book, to induce collaborative interaction from older adults. The robot takes one of the four actions in the MDP State based on the MDP model.
The purpose of the Correcting Feedback State is to prompt the users on how to improve their interaction with the VR-based task. In this state, the robot behavior is determined by the older adult’s real-time interaction with the task. Several task-oriented interaction metrics, such as cursor height, percentage of hand grip, book moving distance, and the correctness of collaboration, are used to determine the robot behavior. Robot behaviors are generated following a least-to-most prompting hierarchy. For example, if an older adult is not actively interacting with the virtual environment and his/her hand cursor remains low on the screen, the robot reminds the older adult to hold their hands up higher and ignores the other interaction metrics. The definition of each task-oriented metric and the corresponding robot behaviors are listed in Table I. The Immediate Collaboration Feedback State lets the robot intervene immediately when older adults are not collaborating; the two collaboration-oriented interaction metrics are described in Table I. Finally, we added the Score Feedback State based on the principles of positive reinforcement to cheer older adults on and keep them engaged and motivated. In addition to the robot behaviors, various visual and audio feedback cues, either by themselves or in sync with a robot behavior, are embedded in the VR-based task to provide positive reinforcement.
In one-to-one HRI, the robot provides feedback directly to the older adult. For example, if GripPerc is low, the robot says “[Name], remember to close your hand to pick up books.” In simultaneous HRI, by contrast, the robot says “[Name 0], can you help [Name 1] with how to grab the book?” when only one of the older adults struggles to interact with the task, and “[Name 0] and [Name 1], remember to close your hands to pick up books” when both struggle. What the robot actually says in the same activity state depends on the interaction modality (one-to-one, triadic take-turns, or triadic simultaneous), the performance of each older adult, and their overall performance. The key design principle we follow is to induce as much social communication and interaction between older adults as possible, either by referring to their names, such as “[Name 0], remember that [Name 1] cannot move the hand cursor past the green vertical line. If you move the book further away, [Name 1] cannot grab the book”, or by directly asking one older adult to help another, or by prompting them to collaborate in the task. Another robot behavior is monitoring how older adults are interacting with the VR-based task by turning its head towards the computer monitor. The rationale for this behavior is to let the robot fade into the background and allow interactions of the older adult participants without interference from the robot.
2). MDP Model
The MDP State is used only for the one-to-one interaction task, in which the user controls the red hand cursor and the robot acts as a user with the green hand cursor. When a transition from the Play State to the MDP State occurs, an MDP model is used to determine the action the robot should take. The MDP model is a 5-tuple (S, A, P, R, γ), where:
S is the finite set of states. Each state is the current configuration of the task, represented by a 6-tuple (cr, cg, rbrm, gbgm, gbr, rbg). cr and cg are Boolean values that describe whether the number of collected red books and green books, respectively, increases from time step t to time step t+1. rbrm is the number of red books in the red and middle areas, i.e., the red books that can be moved by the red hand cursor. gbgm is the number of green books in the green and middle areas. gbr is the number of green books in the red area, and rbg is the number of red books in the green area. Given the total number of books n and assuming an equal number of red and green books, the values of rbrm, gbgm, gbr, and rbg satisfy rbrm + rbg ≤ n/2 and gbgm + gbr ≤ n/2.
A is the set of robot behaviors. Four behaviors are defined for the MDP State: no action, collect book, offer book, and request book.
P: Pr(st+1 | st, at) is the transition function defining the probability that state st at time step t leads to state st+1 at time step t+1, given robot behavior at. We assume that from time step t to time step t+1, the robot behavior together with the user behavior changes the state. Robot behaviors are deterministic; user behaviors are not. We assume the user may take three types of actions: no/failed action, collect book, or offer book. The corresponding probabilities model the stochastic user behaviors given the robot behavior and were set empirically.
R(st, at, st+1) is the immediate reward received after the state transition. Positive rewards are given to every book collection and collaborative move made by either the user or the robot. Negative rewards are associated with robot behaviors that are impossible under st; an example is collect book when no book is left to collect.
γ is the discount factor that favors immediate rewards over future ones and was set to 0.9.
A time step occurs at least every TimestepMDP seconds. The variables related to the MDP State are the components of the 6-tuple state S.
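To illustrate how such an MDP can be solved for a robot policy, the following Python sketch runs value iteration with γ = 0.9 on a deliberately tiny stand-in state space; the states, transition probabilities, and rewards are invented for illustration and are not the authors’ model.

```python
# A minimal value-iteration sketch for choosing among the four robot
# behaviors, gamma = 0.9 as in the paper. The tiny state space and the
# transition/reward numbers are illustrative assumptions.

ACTIONS = ["no_action", "collect_book", "offer_book", "request_book"]
STATES = ["books_far", "books_near", "done"]
GAMMA = 0.9

# P[s][a] = list of (next_state, probability); R[s][a] = immediate reward
P = {
    "books_far": {
        "no_action":    [("books_far", 1.0)],
        "collect_book": [("books_far", 1.0)],   # impossible: no reachable book
        "offer_book":   [("books_near", 0.8), ("books_far", 0.2)],
        "request_book": [("books_near", 0.5), ("books_far", 0.5)],
    },
    "books_near": {
        "no_action":    [("books_near", 1.0)],
        "collect_book": [("done", 0.9), ("books_near", 0.1)],
        "offer_book":   [("books_near", 1.0)],
        "request_book": [("books_near", 1.0)],
    },
    "done": {a: [("done", 1.0)] for a in ACTIONS},
}
R = {
    "books_far":  {"no_action": 0, "collect_book": -1,  # impossible penalized
                   "offer_book": 2, "request_book": 1},
    "books_near": {"no_action": 0, "collect_book": 5,
                   "offer_book": 0, "request_book": 0},
    "done":       {a: 0 for a in ACTIONS},
}

def q(s, a, V):
    """Expected discounted return of taking action a in state s."""
    return R[s][a] + GAMMA * sum(p * V[s2] for s2, p in P[s][a])

V = {s: 0.0 for s in STATES}
for _ in range(200):  # value iteration to (numerical) convergence
    V = {s: max(q(s, a, V) for a in ACTIONS) for s in STATES}

policy = {s: max(ACTIONS, key=lambda a: q(s, a, V)) for s in STATES}
print(policy)  # e.g., offer_book when books are far, collect_book when near
```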
C. Objective Measures
We recorded four types of data, namely game interaction, head pose, vocal sound, and EEG, in order to automatically generate objective measures capturing older adults’ social and activity engagement. In [45], we defined three engagement variables: engagement action, engagement intention, and engagement state. Engagement actions are older adults’ explicit actions related to the task. In SAR-Connect, this corresponds to task-related actions stored in the interaction data and to HHI actions in the form of talking to promote task performance, which can be evaluated using the vocal sound data. Engagement intention, on the other hand, captures the implicit states of older adults. In SAR-Connect, this corresponds to where older adults direct their attention and to their electrophysiological responses. We used older adults’ head movements to approximate their gaze and used EEG signals for electrophysiological responses. Together, engagement action and engagement intention determine the engagement state based on a timed automaton.
EEG signals from 0.2 to 45 Hz were collected at a sampling rate of 128 Hz. The 14 channels were placed at positions AF3, F7, F3, FC5, T7, P7, O1, O2, P8, T8, FC6, F4, F8, and AF4, defined by the 10–20 system of electrode placement. We recorded 2 minutes of baseline EEG, during which we asked the participants to sit quietly with eyes open, and then recorded their EEG signals continuously during the entire session. Before EEG analysis, the signals were preprocessed to increase the signal-to-noise ratio by slew rate limiting, band-pass filtering, and artifact rejection and correction to exclude poor signals and EOG and EMG artifacts. The processed signals were then analyzed to estimate participants’ engagement intention.
Interaction data indicated older adults’ real-time interaction with SAR-Connect, including the motion-based control data such as the hand cursors’ positions and the type and position of a grabbed book, the task states such as the number of different books in different virtual areas, the performance data such as the total score and the number of collaborative moves, and the robot actions. From the interaction data, we defined two metrics to represent activity and social engagement. Individual task completion was defined as the amount of effort exerted by older adults to move their own books; this included the effort needed to successfully collect a book and the effort needed to move books closer to one’s own bin. Collaborative task completion was defined as the amount of effort exerted by older adults to help move their peers’ books; this included the effort needed to successfully move team bonus books to the collaboration area and the effort needed to move peers’ books closer to their bins. In this context, effort was the change of book distance due to older adults’ hand and arm movements. These metrics were computed automatically from the motion-based hand control data, the task states, and part of the performance data.
The head pose yaw angles were detected by the Kinect sensor, and the data were used to estimate older adults’ engagement towards the task and each other in terms of their gaze direction during HRI. Before HRI, we recorded about 15 s of calibration data, during which we asked pairs of older adults to look at different locations: the robot, the computer screen, and their peers. The calibration data were used to define ranges of head pose yaw angles for head rotation towards the robot, towards the computer screen, and towards the other person. Fig. 7 shows the raw head pose yaw angles for two older adults while they were interacting with the system. The green bands represent the range of head pose yaw angles for looking at the computer screen, and the blue bands the range for looking at the robot. The red lines indicate the head pose yaw angles for looking at their peers, as calculated from the calibration data. To include subtle head movements towards the other person, instead of setting the red lines as thresholds for looking at the other person, we added a margin beyond the leftmost and rightmost edges of the screen and robot ranges and used these as thresholds for head rotation towards the other person; the resulting thresholds are marked in Fig. 7.
Fig. 7.

Raw head pose yaw angles and sound source angles recorded during HRI, with the calibrated ranges and thresholds used for detection.
From the head pose yaw angle plots of the two older adults, it can be seen that the majority of the yaw angles fall within the ranges for head rotation towards the screen and towards the robot, and that the yaw angles occasionally cross the thresholds for looking at the other person. This indicates that the generated ranges and parameters can interpret raw head pose yaw angles as a measure of engagement towards the task and towards each other. Because the computer screen and the robot were positioned in close proximity and their ranges of head pose yaw angles overlapped, we combined these two ranges into a single range of head pose yaw angles for engagement towards the task. These ranges were then used to automatically calculate the amount of time older adults paid attention to the task and to their peers, as well as the number of times they looked or turned their heads towards each other.
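The per-sample classification and the derived summary measures can be sketched as follows, assuming per-pair calibrated ranges; the margin value and the sign convention for the peer threshold (positive for a peer seated to one side, negative for the other) are assumptions for illustration.

```python
def classify_yaw(yaw, screen_range, robot_range, peer_threshold, margin=5.0):
    """Map one head-pose yaw sample (degrees) to an attention target.
    Screen and robot ranges are merged into a single 'task' target."""
    if screen_range[0] <= yaw <= screen_range[1] or \
       robot_range[0] <= yaw <= robot_range[1]:
        return "task"
    # Widen the calibrated peer threshold by a margin so subtle head
    # turns towards the other person are still counted.
    if (peer_threshold >= 0 and yaw >= peer_threshold - margin) or \
       (peer_threshold < 0 and yaw <= peer_threshold + margin):
        return "peer"
    return "other"

def attention_summary(yaws, dt, **calibration):
    """Time on task, time towards the peer, and number of peer looks,
    from a stream of yaw samples spaced dt seconds apart."""
    labels = [classify_yaw(y, **calibration) for y in yaws]
    peer_looks = sum(1 for a, b in zip(labels, labels[1:])
                     if a != "peer" and b == "peer")
    return {"task_s": labels.count("task") * dt,
            "peer_s": labels.count("peer") * dt,
            "peer_looks": peer_looks}
```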
In a similar manner, we were able to automatically detect the start and end of vocal sounds made by older adults. The Kinect sensor recorded the sound source angles during HRI, which identified the direction of each sound source. In order to isolate the range of sound source angles for one older adult from the other sound sources, such as the other person, the robot, and the sounds generated by the virtual task, each older adult was asked to read a sentence before HRI, during which we recorded the sound source angles and the corresponding detection confidence levels. By aggregating the sound source angle calibration data across all pairs, we computed the ranges of sound source angles that capture each older adult’s vocal sounds and the lower bounds for the confidence levels. An example of raw sound source angles recorded during one session of triadic HRI is shown in Fig. 7. The green band represents the range of sound source angles that detects vocal sounds from the older adult sitting on the right facing the robot, and the blue band represents the range for the older adult sitting on the left facing the robot. Sound source angle data that fall outside these two ranges correspond to vocal sounds generated by the robot, the virtual task, or noise in the environment. These ranges of sound source angles and the confidence level parameters allowed us to automatically compute the amount of time older adults were talking and the number of times they spoke.
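A sketch of this segment detection follows; the confidence lower bound and the gap tolerance used to bridge brief dropouts are assumed values, not the calibrated ones.

```python
def speaking_segments(times, angles, confidences, speaker_range,
                      min_conf=0.5, max_gap_s=0.5):
    """Return (start, end) times where the sound-source angle falls inside
    one participant's calibrated range with sufficient confidence."""
    lo, hi = speaker_range
    segments, start, last = [], None, None
    for t, ang, conf in zip(times, angles, confidences):
        if lo <= ang <= hi and conf >= min_conf:
            if start is None:
                start = t                   # open a new speaking segment
            last = t
        elif start is not None and t - last > max_gap_s:
            segments.append((start, last))  # close after a long gap
            start, last = None, None
    if start is not None:
        segments.append((start, last))
    return segments

# Talk time and talk count then follow directly:
# talk_s = sum(e - s for s, e in segments); talk_count = len(segments)
```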
With respect to the EEG signals, we used the EEG engagement index (EEI) [45] to estimate older adults’ overall engagement level during HRI. The EEI, calculated over a 40 s sliding window with 38 s overlap, gives an engagement trace. We found that by averaging the EEI over an activity and taking into account individual differences in baseline EEI, we obtain a summarized EEI that correlates strongly with older adults’ self-rated preference for an activity [45]. The summarized EEI is high when older adults enjoy the activity and relatively low when they show less interest in it. In this work, we use the same summarized EEI to indicate older adults’ engagement for the one-to-one and triadic HRI. Before HRI, we recorded two minutes of baseline EEG signals, during which we asked the participants to sit quietly with eyes open. The EEI was calculated as the ratio of beta band spectral power (13–22 Hz) to the sum of alpha band spectral power (8–13 Hz) and theta band spectral power (4–8 Hz). EEIs calculated from the baseline EEG signals were averaged to serve as the base engagement level for each older adult. We then computed the summarized EEI during HRI by averaging the change of EEI from baseline over every 40 s EEG epoch. Details of the algorithms to process EEG signals and to compute the EEI and summarized EEI are described in our previous papers [45, 48].
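A single-channel sketch of this computation is shown below; the actual pipeline in [45, 48] operates on the preprocessed multi-channel signals, so the Welch parameters and the channel handling here are illustrative.

```python
import numpy as np
from scipy.signal import welch

FS = 128.0  # Hz

def band_power(x, lo, hi):
    """Spectral power of x between lo and hi Hz (Welch estimate)."""
    f, pxx = welch(x, fs=FS, nperseg=int(4 * FS))
    mask = (f >= lo) & (f < hi)
    return np.trapz(pxx[mask], f[mask])

def eei(epoch):
    """EEG engagement index: beta / (alpha + theta), per the text."""
    return band_power(epoch, 13, 22) / (
        band_power(epoch, 8, 13) + band_power(epoch, 4, 8))

def eei_trace(signal, win_s=40, step_s=2):
    """40 s sliding window with 38 s overlap (i.e., a 2 s step)."""
    win, step = int(win_s * FS), int(step_s * FS)
    return np.array([eei(signal[i:i + win])
                     for i in range(0, len(signal) - win + 1, step)])

def summarized_eei(task_signal, baseline_signal):
    """Mean change of the EEI trace from the participant's baseline."""
    return float(np.mean(eei_trace(task_signal))
                 - np.mean(eei_trace(baseline_signal)))
```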
VI. User Study
A. Experimental Design
SAR-Connect, the experimental room setup, and the experimental procedure are shown in Fig. 8. Participants sat in two chairs in front of and facing the system. NAO was positioned beside the computer monitor, and the Kinect was placed on the edge of the table in front of the monitor. Participants sat approximately 2 meters away from the monitor and at a 30-degree angle towards each other. When a single participant interacted with the robot, one chair was positioned directly in front of the table. The experimental procedure consisted of five components: a practice session, followed by three main tasks (the red-green book task with one-to-one HRI, triadic HRI – take turns, and triadic HRI – simultaneous), and finally a post-test (the yellow book task). Each participant first interacted independently with the system, and then pairs of participants played with each other under the guidance of NAO. During practice, the robot taught participants how to interact with the system through arm movement and hand manipulation. Participants then performed the main task alone with the robot as the second player. After two older adults completed the one-to-one HRI, they were paired to perform the main task together: they first took turns interacting with the system and then played again simultaneously. Lastly, they completed the post-test to conclude the experiment.
Fig. 8.

SAR-Connect, the experimental room setup, and the experimental procedure.
For the main tasks, there were 8 red books and 8 green books, and 10 of the 16 books were team bonus books. If neither participant collaborated, the maximum score they could obtain was 80; if they collaborated on every team bonus book, the maximum score increased to 130. The robot encouraged them to achieve a high score in the Robot Instruction State by telling them it would dance to celebrate if they achieved more than 100 points. The duration of the interaction, excluding the Robot Instruction State and End State, was limited to 6 minutes. The thresholds for the Feedback States were set to Th_ch = 100, Th_gp = 0.1, and Th_bd = 2. Timestep_MDP was 6 s if participants had finished their part and were waiting for the robot to complete the task; otherwise, Timestep_MDP was 12 s. Timestep_correcting was 15 s for the triadic take turns interaction and 20 s for the other two main tasks. For the post-test, the interaction duration was set to 3 minutes. All these parameters were chosen based on a limitation of NAO (it runs for about 10–15 minutes before its motors become hot) and on pilot testing with older adults and volunteers.
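For concreteness, these session parameters can be gathered into a single configuration bundle, as in the sketch below. The dictionary name and structure are our own; the semantics of each threshold are those of the Supervisory Controller described earlier.

```python
# Illustrative parameter bundle for one main-task session; values mirror
# the text (the post-test differs: 3-minute duration, yellow books only).
SESSION_PARAMS = {
    "books": {"red": 8, "green": 8, "team_bonus": 10},
    "max_score_no_collab": 80,
    "max_score_full_collab": 130,
    "celebration_score": 100,            # robot dances above this score
    "duration_s": 6 * 60,                # excludes instruction/end states
    "feedback_thresholds": {"Th_ch": 100, "Th_gp": 0.1, "Th_bd": 2},
    "timestep_mdp_s": {"waiting_for_robot": 6, "default": 12},
    "timestep_correcting_s": {"triadic_take_turns": 15, "default": 20},
}
```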
B. Participants
The study was approved by the Vanderbilt University Institutional Review Board. For this initial study, experiments were geared towards older adults (ages 65 and older) who resided in the community. Exclusion criteria included: vision or hearing impairments precluding the ability to interact with SAR-Connect, lack of transportation, inability to walk, inability to speak and understand English, and inability to understand and consent to the study procedures.
We targeted three distinct groups of older adults: those with normal cognition, those with MCI, and those with mild dementia. Recruitment efforts for those with normal cognition took place at public meeting areas, such as area churches, older adult health fairs, and the YMCA Silver Sneakers activities. We recruited those with a diagnosis of MCI or mild dementia from the Vanderbilt University Center for Cognitive Medicine (VUCCM). Flyers and contact information were distributed at all sites; at the VUCCM, clinicians and nursing personnel approached potential participants. Interested participants contacted the investigators to discuss the study, review the informed consent elements, and arrange a private meeting with the research personnel to determine eligibility and obtain informed consent.
Research personnel met each participant privately to complete the informed consent process. After discussing the study, the participant reviewed the consent form and answered questions to assess their understanding using the University of California, San Diego Brief Assessment of Capacity to Consent (UBACC) [49]. The UBACC is a short 10-item scale that includes questions focused on the person’s understanding of the research protocol, such as its purpose, procedures, and potential risks or discomforts. All participants, including those with a diagnosis of MCI or mild dementia, had to understand and correctly complete the UBACC to be considered to have decisional capacity.
Participants completed the Montreal Cognitive Assessment (MoCA) screening tool for MCI under the supervision of a trained geriatric nursing faculty member, Dr. Beuscher, of our research team [50]. The MoCA is a brief screening tool that aids frontline clinicians in detecting MCI as well as dementia with excellent sensitivity and specificity. Scores range from 0 to 30, with 26–30 indicating no identified impairment, 22–25 suggesting MCI, and scores below 22 suggesting potential dementia. Note that the research team did not diagnose any of these individuals; their MoCA scores were used only to categorize the data.
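These cutoffs amount to a one-line mapping, sketched below; placing scores of 26 and above in the no-impairment category is implied by the text rather than stated explicitly.

```python
def moca_category(score):
    """Categorize a MoCA score (0-30) using the cutoffs in the text."""
    if score >= 26:
        return "normal cognition"        # no identified impairment
    if score >= 22:
        return "MCI"                     # 22-25 suggests MCI
    return "potential dementia"          # below 22
```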
We separated the five tasks into two sessions: a one-to-one session and a triadic session. Participants came to the laboratory for practice and one-to-one interaction first. If they completed the tasks and were willing to come back for another session, we paired them with another participant who had finished the one-to-one session. The pairs were assigned randomly to represent naturalistic pairing in LTC settings. The triadic session consisted of the two triadic interactions and the post-test. A total of 26 older adults took part in the study (17 females, 9 males). Their ages ranged from 70 to 90 years (Mean = 76.7, SD = 5.6), and their MoCA scores ranged from 19 to 27 (Mean = 24.2, SD = 2.2). Based on the MoCA scores, 9 participants were in the normal cognition range, 12 in the MCI range, and 5 in the range of Alzheimer’s dementia. Out of the 26 older adults, 18 were paired for triadic interaction (7 with normal cognition, 8 with MCI, and 3 with Alzheimer’s dementia per MoCA and/or self-report). The pairs consisted of 1 normal-normal, 4 normal-MCI, 1 normal-dementia, 1 MCI-MCI, and 2 MCI-dementia. In one of the pairs, both older adults had severe hearing issues and were not able to understand the robot; it was difficult for them to understand even the administrator. We therefore removed this pair’s data. The remaining older adults who did not continue to the triadic session dropped out due to scheduling issues.
VII. Results
A. System Performance Results
The system worked as designed. The VR-based tasks were displayed and updated correctly. The motion-based UI was stable, and older adults could easily move their hands to control their hand cursors in both the horizontal and vertical directions. At times participants struggled to move books in the third direction, which corresponds to depth in the virtual environment; however, once they learned this type of motion, they were able to perform it without help from the administrator. During one-to-one HRI, triadic HRI, and the post-test, the administrator did not intervene unless the older adults were unable to interact with the system and became very frustrated, which rarely happened during the experiment.
In one-to-one HRI, the robot was able to play and make progress towards task completion for all the participants. In both one-to-one and triadic HRI, the robot prompted older adults on their task performance and encouraged them to collaborate with each other, following the decisions of the Supervisory Controller module. All robot behaviors generated by the Supervisory Controller module were executed successfully. In addition to the activity instructions and celebration feedback generated by the robot, a total of 513 robot behaviors were generated. In one-to-one HRI, 204 robot behaviors were collecting books (M = 8.50, SD = 2.48), 109 were offering books (M = 4.54, SD = 0.78), 8 were requesting books (M = 0.33, SD = 0.70), 23 were increasing task performance (M = 0.96, SD = 1.23), and 50 were increasing collaboration (M = 2.08, SD = 1.47). In take turns HRI, 41 robot behaviors were increasing task performance (M = 5.13, SD = 3.48) and 28 were increasing collaboration (M = 3.50, SD = 2.83). In simultaneous HRI, 19 and 31 robot behaviors were observed to increase task performance (M = 2.38, SD = 1.92) and collaboration (M = 3.88, SD = 1.55), respectively. The system correctly logged all the generated robot behaviors, activity states, participants’ interaction data, head pose data, vocal sound data, and EEG signals.
B. Objective Measures of Interaction
From the interaction data, we computed the individual and collaborative task completion metrics. Collaborative task completion relates to the social engagement between the two older adults, whereas the combination of the two relates to their activity engagement. In one-to-one HRI, collaborative task completion is computed in the same way to measure the amount of collaboration between the robot and the human. Fig. 9 shows the participants’ task completion metrics during one-to-one and triadic HRI. We used the Wilcoxon signed-rank test to compare the task completion metrics. One-to-one HRI had the highest collaborative task completion and total task completion. For triadic HRI, participants’ collaborative task completion and total task completion increased from take turns HRI to simultaneous HRI. Compared to take turns HRI, both the collaborative task completion (Z = 2.07, r = 0.37) and the total task completion (Z = 2.28, r = 0.40) in simultaneous HRI were statistically significantly higher at the 0.05 level with medium effect sizes. The increase in total task completion was partially due to a statistically significant increase in individual task completion in simultaneous HRI compared to take turns HRI (Z = 2.28, r = 0.40). We also calculated the ratio of individual to collaborative task completion. The mean Indiv/Collab ratios were 1.3 (SD = 1.0) for one-to-one interaction, 1.7 (SD = 1.5) for triadic take turns interaction, and 2.3 (SD = 2.6) for triadic simultaneous interaction. The differences among the ratios for the three types of HRI were not statistically significant: the effect size for one-to-one vs. take turns was 0.21, for one-to-one vs. simultaneous 0.21, and for take turns vs. simultaneous 0. Therefore, when older adults’ collaborative task completion increased or decreased from one task to another, their individual task completion followed the same trend.
Fig. 9.

Task completion metrics analysis results. Self represents individual task completion and Collaboration represents collaborative task completion.
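As a sketch of the statistical comparison used above, the paired test and the effect size r = Z/√N can be computed as follows; recovering |Z| from the two-sided p-value is a common approximation, and we do not claim it is the exact procedure used in the analysis.

```python
import numpy as np
from scipy.stats import wilcoxon, norm

def signed_rank_with_effect_size(condition_a, condition_b):
    """Paired Wilcoxon signed-rank test with effect size r = Z / sqrt(N)."""
    a, b = np.asarray(condition_a), np.asarray(condition_b)
    stat, p = wilcoxon(a, b)
    z = norm.isf(p / 2.0)      # |Z| recovered from the two-sided p-value
    r = z / np.sqrt(len(a))
    return stat, p, z, r
```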
In terms of the yellow book task, 5 of the 8 pairs of older adults were able to figure out the unknown collaborative rule through social interaction without help from the robot. For successful collaboration, we computed the number of times older adults moved a yellow book together, the amount of task completion achieved in doing so, and the amount of time they spent moving yellow books together. For unsuccessful task effort, we computed the number of times older adults tried to move a yellow book and the amount of time they spent trying. The total duration accounts for both successful collaboration and unsuccessful task effort. The results are listed in Table II. On average, older adults successfully moved yellow books 6 times and spent 75.7 s actively engaged in the 3-minute task. Although the group that did not receive a hint from the robot had more successful collaborations and more unsuccessful attempts, the total durations for the two groups were comparable.
TABLE II.
Post-test (Yellow Book Task) Results
| Performance Metric | All: Mean (SD) | Without Hint: Mean (SD) | With Hint: Mean (SD) |
|---|---|---|---|
| Success: Count | 6.4 (3.9) | 7.8 (4.2) | 4.0 (1.5) |
| Success: Task completion | 6.8 (4.5) | 8.8 (4.5) | 3.6 (1.9) |
| Success: Duration (s) | 38.9 (28.9) | 44.0 (35.5) | 30.2 (9.9) |
| Fail: Count | 16.9 (7.6) | 18.7 (7.8) | 13.8 (6.8) |
| Fail: Duration (s) | 36.8 (24.0) | 30.7 (20.2) | 46.9 (28.1) |
| Total Duration (s) | 75.7 (28.9) | 74.8 (32.2) | 77.2 (25.1) |
From the head pose data, we computed the amount of time older adults paid attention to the computer screen or the robot as activity engagement. For social engagement, we computed the amount of time and the number of times older adults looked towards their peers. The results are listed in Table III. In general, older adults spent the majority of the time (80.7% in take turns HRI, 75.9% in simultaneous HRI, and 86.2% in the post-test, i.e., the yellow book task) focusing on the task and the system. These data indicate their overall engagement in triadic HRI. They also showed social engagement in terms of looking towards their peers. In take turns HRI, the number of times they looked towards their peers ranged from 0 to 25 (median: 2.5); in simultaneous HRI, from 0 to 19 (median: 2); and in the post-test, from 0 to 9 (median: 1). On average, older adults looked at their peers 0.7 times per minute in triadic HRI and 0.6 times per minute in the post-test. Compared to take turns HRI, older adults looked towards their peers more during simultaneous HRI: the percentage of looking time increased from 4.1% to 5.6%, and as a result their activity engagement decreased slightly. In the post-test, older adults spent less time on looking behaviors but more time focusing on the system.
TABLE III.
Head Pose Analysis Results
| Head Pose Metric | | Turn: Mean (SD) | Turn vs. Simul r | Simul: Mean (SD) | Simul vs. Yellow r | Yellow: Mean (SD) | Yellow vs. Turn r |
|---|---|---|---|---|---|---|---|
| Activity Engagement (n = 16) | Duration (s) | 309.3 (43.0) | 0.5** | 253.6 (70.5) | 0.62*** | 155.2 (26.5) | 0.62*** |
| | Percentage | 80.7% (11.3%) | 0.27 | 75.9% (14.4%) | 0.34* | 86.2% (14.7%) | 0.33 |
| Social Engagement (n = 16) | Duration (s) | 16.0 (26.1) | 0.02 | 19.3 (31.5) | 0.36* | 3.9 (5.8) | 0.35* |
| | Percentage | 4.1% (6.8%) | 0.02 | 5.6% (8.6%) | 0.26 | 2.2% (3.2%) | 0.23 |
| | Count | 4.7 (6.5) | 0.13 | 3.8 (4.7) | 0.30 | 1.9 (2.3) | 0.33 |
| | Count per minute | 0.7 (1.0) | 0.04 | 0.7 (0.8) | 0.12 | 0.6 (0.8) | 0.06 |

\* p<0.05, ** p<0.01, *** p<0.001. n = sample size; r = effect size (Wilcoxon signed-rank tests) for the comparison named in the column header.
From the vocal sound data, we computed the total amount of time and the number of times older adults were speaking. The results are listed in Table IV. Individually, older adults spent about 12% of the time talking to each other during triadic HRI. In the post-test, the amount of talking nearly doubled (to 21.5%) compared to triadic HRI. This increase is statistically significant (Wilcoxon signed-rank test) relative to both take turns HRI (Z = 2.90, r = 0.51) and simultaneous HRI (Z = 2.84, r = 0.50) at the 0.01 level, with medium to large effect sizes.
TABLE IV.
Vocal Sound Analysis Results
| Vocal Sound Metric | | Turn: Mean (SD) | Turn vs. Simul r | Simul: Mean (SD) | Simul vs. Yellow r | Yellow: Mean (SD) | Yellow vs. Turn r |
|---|---|---|---|---|---|---|---|
| Individual (n = 16) | Duration (s) | 45.9 (42.3) | 0.07 | 43.6 (44.1) | 0.05 | 38.6 (27.2) | 0.04 |
| | Percentage | 11.9% (10.9%) | 0.06 | 12.6% (11.8%) | 0.50* | 21.5% (15.1%) | 0.51* |
| | Count | 34.7 (28.5) | 0.13 | 31.9 (29.0) | 0.10 | 26.3 (14.5) | 0.19 |
| | Count per minute | 5.4 (4.4) | 0.01 | 5.6 (4.7) | 0.53* | 8.8 (4.8) | 0.52* |
| Pair (n = 9) | Duration (s) | 91.8 (51.1) | 0.18 | 87.3 (66.1) | 0.06 | 77.3 (28.1) | 0.36 |
| | Percentage | 23.7% (13.2%) | 0.04 | 25.3% (16.9%) | 0.62* | 42.9% (15.6%) | 0.62* |
| | Count | 69.4 (41.7) | 0.27 | 63.9 (43.2) | 0.13 | 52.5 (18.9) | 0.40 |
| | Count per minute | 10.8 (6.5) | 0.01 | 11.2 (6.7) | 0.62* | 17.5 (6.3) | 0.62* |

\* p<0.05, ** p<0.01, *** p<0.001. n = sample size; r = effect size (Wilcoxon signed-rank tests) for the comparison named in the column header.
The summarized EEI was used to estimate older adults’ overall engagement level during HRI; the results are listed in Table V. As can be seen from the table, the summarized EEIs were comparable across the different types of HRI. Out of 16 older adults, 9 showed a higher engagement level in triadic HRI than in one-to-one HRI. We also calculated the means and standard deviations of the summarized EEI separately for the increased group and the decreased group.
TABLE V.
EEG Analysis Results
| Summarized EEI | Single: Mean (SD) | Single vs. Turn / Simul r | Turn: Mean (SD) | Turn vs. Yellow r | Simul: Mean (SD) | Simul vs. Turn / Yellow r | Yellow: Mean (SD) | Yellow vs. Single r |
|---|---|---|---|---|---|---|---|---|
| Increased Group (n = 9) | −0.086 (0.106) | 0.60*/0.62** | 0.004 (0.100) | 0.46 | 0.005 (0.116) | 0.01/0.26 | 0.024 (0.114) | 0.62** |
| Decreased Group (n = 6) | 0.006 (0.151) | 0.63*/0.63* | −0.136 (0.144) | 0.21 | −0.134 (0.173) | 0.03/0.21 | −0.119 (0.167) | 0.63* |
| Whole Group (n = 15) | −0.049 (0.129) | 0/0.02 | −0.052 (0.135) | 0.35 | −0.050 (0.153) | 0.02/0.22 | −0.033 (0.1115) | 0.08 |

\* p<0.05, ** p<0.01, *** p<0.001. n = sample size; r = effect size (Wilcoxon signed-rank tests) for the comparison named in the column header.
C. Objective Measure Results According to Cognition
The previous section illustrates the results of objective measures aggregated among all the participants. By grouping results based on the three cognition categories, we were able to evaluate whether there were any differences in how different groups interacted with the system.
Participants with dementia directed their head pose towards the system the most. As a result, their social engagement (as measured by head pose towards the other person) was lower, in the range of 0.6–1%, compared to participants with MCI or normal cognition, whose social engagement values were in the range of 3–6%. The same trend was observed in the vocal sound analysis results: participants with normal cognition or MCI had higher speaking duration percentages (12–25%) than those with dementia (3–11%). However, the increase in speaking duration percentage in the yellow book task was more pronounced for participants with dementia, showing a roughly 3-fold increase from 3.5% to 11.6%.
A similar trend appeared in task performance. Participants with dementia demonstrated less collaborative interaction: their Indiv/Collab task completion ratio was nearly double that of participants with normal cognition or MCI during the one-to-one and post-test interactions. In terms of the EEI analysis, all participants with dementia enjoyed triadic HRI more than one-to-one HRI. All three groups showed an increased EEI from the take-turns session to the post-test, with participants with MCI or dementia demonstrating significant increases of 48.2% and 43.8%, respectively, whereas participants with normal cognition showed an increase of 16.9%.
D. Participants’ Perception of the System
From the participants’ comments, there were many instances in which the human likeness of the NAO robot was emphasized. Many participants nodded their heads during the orientation session, showing their intrinsic acceptance of NAO as a human-like facilitator in the interaction. Several participants engaged in human-like conversations with the robot. For example, when the robot (named Billy) told Betty (a participant), “Betty, you can collect the team bonus books to score more points,” Betty said, “I am trying Billy, but it does not seem to be working.” Similarly, when Ann (another participant) was reminded by the robot that it was her turn, she replied, “Oh, I didn’t know that Billy.”
The participants also helped each other while performing the tasks, including strategizing before the game and one participant guiding and helping the other during sessions. In the interval between the orientation and the game, a few participants also began talking about general topics; one participant told her peer how she had to reschedule her gym session to attend the intervention. Several participants liked the appreciative feedback given by the robot. When the robot expressed satisfaction by raising its hands, one participant copied the action, saying “hurray,” and another clapped her hands. One participant pair asked if they could have another session since they had missed a robot dance by only 5 points.
VIII. Discussion
A. Result Discussion
The system worked as designed, and participants had an overall positive perception of it. Since there were 8 books per individual in total and 5 of them were team bonus books, an average of 8.50 ‘collecting’ behaviors and 4.54 ‘offering’ behaviors by the robot indicates that the robot tried to collect all the books it could and to offer help as much as possible. Thus, the robot was able to play the game with older adults. In terms of task performance and collaboration feedback, the number of robot prompts was in line with older adults’ interaction data shown in Fig. 9: when an older adult’s performance worsened, the robot provided more prompts.
One-to-one HRI had the highest collaborative task completion and total task completion. This result is expected, since the robot was designed to perform as a collaborative player as well as to prompt older adults on their task performance and encourage them to collaborate. In addition, during the triadic HRI, participants interacted with both their peers and the system and helped each other rather than focusing only on their own tasks; divided attention is therefore likely to play a role in the decrease in effort. The increases in collaborative task completion and total task completion from take turns HRI to simultaneous HRI need to be interpreted cautiously. Because the two triadic HRI tasks were not identical, although these increased task completion metrics are positive and indicate the potential usefulness of the system, we could not conclude from these two metrics alone that older adults’ activity and social engagement increased as they interacted with the system more. The individual-to-collaborative task completion ratio shows that older adults exhibited more individual than collaborative task completion in both one-to-one and triadic HRI, and that they maintained their collaborative task completion throughout the entire HRI.
Participants’ performance in the yellow book task indicates their engagement in an unseen task and their willingness to interact with each other and explore the collaborative rule. Interestingly, failed trials without a hint were completed faster than failed trials with a hint. One explanation for this finding is the nature of the yellow book sorting task as a trial-and-error task: older adults do not know what to do at the beginning, and those who fail fast may figure out the collaborative rule quickly. There were also more failed trials for pairs without a hint. We believe that the failures in the first few attempts helped these participants identify the correct strategy faster.
The significant increase in conversation between the two older adults in the yellow book task is expected, since the only way for older adults to figure out how to move a yellow book is through trial and error and communication. For the older adult pairs, the standard deviation of the amount of talking decreased markedly compared to the individual results. This is because, in most cases, older adults’ talking was not balanced: one older adult would generally talk more than the other. Older adults talked slightly more in simultaneous HRI than in take turns HRI, and by the time they performed the post-test, their conversation had become more balanced. Older adults’ head pose results related to social interaction also increased slightly from take turns HRI to simultaneous HRI. Collectively, these results are positive and indicate that older adults were engaged in social interaction during the entire HRI session; the slight improvements suggest that the system may be useful.
In general, older adults’ engagement level as measured by the summarized EEI was maintained throughout the whole session. Some older adults preferred one-to-one HRI whereas others preferred triadic HRI. Regardless of which they preferred, their engagement level as estimated by the summarized EEI increased as they continued interacting with the system during triadic HRI.
We did not observe any noticeable differences in the metrics between participants with normal cognition and those with MCI. However, there were noticeable differences between participants with dementia and the other two groups. Participants with dementia had lower social interaction metrics, which is consistent with the general experience that people with dementia tend to interact less socially than their MCI and normal-cognition peers. Due to the small sample size, we cannot generalize these findings; they are presented as observations. Likewise, the influence of a partner’s cognitive ability on a participant’s results cannot be conclusively stated. Overall, those with dementia had the greatest increase in their metrics, which is to be expected since their baseline values were comparatively lower, leaving more room for improvement.
B. System Discussion
In this user study we designed and tested a unified system integrating robotics and virtual reality, SAR-Connect, to engage multiple older adults in social, cognitive, and physical activities. Our primary goal of enhancing HHI was realized through several mechanisms. The first was the purposeful design of fading the physical robot’s presence into the background of the interaction as the HRI successfully improved older adults’ activity and social engagement. As the older adults progressed through the book sorting activity, the robot became less intrusive and prompted the participants to help each other. Our work is similar to that of Lazar and colleagues [51] in focusing on the older adult as an active rather than passive participant. They drew on the concept of the ‘Third Hand’ used in art therapy to enhance creative processes without being intrusive: art therapists purposefully empower older adults in their process of making art without imposing the therapists’ own artistic preferences. Based on the Third Hand concept, they designed an interactive HCI framework using a Wizard of Oz architecture for older adults with cognitive impairments to participate with an art therapist in creating art. Our robot resembles the Third Hand in that it is less intrusive and does not impose its preferences and activities on the participants; however, the robot in our HRI framework is autonomous, which reduces the burden of operator control.
A second design factor to foster HHI was the carefully designed tasks embedded with interaction strategies and the delivery of multimodal activities to pairs of older adults. The combination of cognitive, physical, and social elements has been shown to engender a host of physical and mental health benefits for older adults [3, 14]. Prior work on social interaction can be found in rehabilitation robotics: for example, Gorsic et al. [52] designed multiplayer games for stroke patients using a commercially available arm rehabilitation system, and Mace et al. [53] developed a haptically controlled virtual task to allow real-time collaboration and performance enhancement across a wide range of inter-subject skill matches.
Because our long-term goal is to engage older adults with cognitive impairment and apathy in social interaction activities, we designed our HRI framework somewhat differently from rehabilitation-focused social interactions. Like this work on cooperative robotic gaming for physical rehabilitation, our tasks are designed to increase motivation by being interesting and challenging. Unlike these systems, older adults’ physical performance, such as their scores within our VR-based task, is not a key target for improvement: we do not seek to increase their exercise intensity or game scores. In our approach, it is more important for the robot to be a facilitator that engages older adults in social communication and in a multimodal task that provides useful stimuli.
Some have used SAR conversation as a way to enhance social engagement. Recently, Traeger and colleagues [54] reported on designing ‘vulnerable’ expressions within their SAR that resulted in increased social conversation among groups of three young adults playing a collaborative game on individual tablets. In addition to positive statements and encouragement, they also embedded disclosures of mistakes or frustration and the occasional joke. With these three types of statements, they found that their young adult participants were more talkative than when only neutral or no expressions were used. Future work can determine whether these types of communication are effective with older adults with cognitive impairment.
Task design is only part of our HRI framework; the social robot and the various sensing modalities are designed specifically for activity and social engagement. We designed our SAR to provide positive reinforcement and encouragement, such as ‘great job’ and ‘hooray’ as well as directions for one person to assist the other, “John, can you show Mary how to grab the book?” In this user study, we focused on a collaborative task design. Depending upon the personality of the older adults, we could also incorporate competitive elements in our task design as a way to provide social engagement and stimuli. Indeed, Miltiades and Thatcher [55] observed friendly competition developing over time among older adults with Alzheimer’s disease with a card and tile matching game, demonstrating that competition could be appropriate even in this population.
We designed the current and past tasks based on several of the investigators’ expertise in gerontology and geriatrics, including dementia (LM, LB, PN), the participation of several activity directors working at local long-term care and retirement settings, and the participation of older adults. Older adult involvement in the design of SAR tasks and activities is crucial [36, 46, 56–59]. Indeed, Lee and Riek [36] caution against negative attitudes, i.e., ageism, in designing for deficits rather than ‘successful’ aging, a concept held and promoted among gerontology specialists. The first design level pertains to the ability of the older adult and the robotic platform to ‘understand’ one another. Older adults’ physical aging changes, such as in vision and hearing, demonstrated the need for an iterative design process to ensure older adults could see the VR environment and hear and understand the robot’s speech. Similarly, the platform required successive design enhancements to properly capture the wide variation in physical movement, speech delay, and volume manifested among the older adults. The task completion metrics were based on the change in book distance due to older adults’ hand and arm movements. We did not examine whether this movement required greater effort for one older adult than another, nor did we capture arm movements that did not result in book movement. In this study, the Interaction Manager was not customized to each individual’s range of motion; rather, parameters were established to account for the variation in movement and motion. In future work, one could normalize older adults’ efforts through a calibration process at the beginning of the session to incorporate individual differences into the task completion metrics and further personalize HRI.
The second design level for older adults’ input and feedback concerns the task itself. Over the years, our laboratory has involved older adults in providing feedback and suggestions for a variety of tasks and activities within the physical capabilities of NAO. For this user study, older adults provided suggestions on how to make the book sorting task more interesting and engaging, such as sound effects and differing levels of difficulty. Given the varying preferences for types of activities among older adults residing in long-term care settings [60], older adults’ participation will be crucial to expanding the type and array of activities.
Although the NAO robot and the book sorting task were used to develop the SAR-Connect system in this work, our HRI framework permits more general application. Any robot capable of carrying out complex gestures, with an open architecture that can integrate with other interactive devices, can be used with the framework, and any task that complies with our general task structure can be used. The system is also applicable to older adults beyond those targeted in this study, including those residing in long-term care settings. Next steps will be to incorporate staff perceptions and use of these systems to enable widespread adoption and sustainability. Last, we envision that, as costs and complexity decrease, these robotic platforms will become applicable in home settings and help address loneliness and social isolation, major issues impacting the health of older adults residing in the community [61, 62].
IX. Conclusion
We present a novel SAR system that aims to augment the care of older adults with or without cognitive impairment. We designed an HRI framework specifically for robot-mediated social interaction among older adults through multimodal activities. The use of VR to supplement the SAR is an example of how we can design and implement multimodal activities that address the physical limitations of a robot while maintaining the advantages of a physical presence. The long-term goal of SAR-Connect is to engage older adults with apathy and cognitive impairment who reside in LTC settings in physical, social, and cognitive activities and to foster HHI via robot mediation by 1) delivering multimodal activities that combine cognitive, physical, and social stimuli to pairs of older adults to promote social interaction through carefully designed tasks embedded with interaction strategies; 2) using multimodal sensory modules to detect the activity and social engagement of older adults from various perspectives for personalized interaction and long-term engagement; and 3) extending the capability of the social robot with VR-based tasks to accommodate the individual differences and functional limitations of older adults.
System performance results indicate the usability of SAR-Connect and older adults’ acceptance of it. From the robot behavior data, it can be seen that the system was able to perform the task with older adults and to adapt to their interaction in order to improve their task performance and encourage collaboration. The activity engagement and social interaction results from the different sensory modalities are positive. Participants had high activity engagement, and their engagement level either was maintained or increased as they interacted more with the system; likewise, their social engagement was maintained or slightly increased. Participants’ post-test performance further shows the promising result that their collaboration behavior transferred to an unseen task with an unknown collaborative rule.
The aim of this study was to design and validate a robotic system that could provide multimodal activities while enhancing social interaction between older adults. Because older adults took part in only a single session, no conclusion can be drawn on whether the system can foster their social interaction over time, nor on whether it can reduce apathy, our long-term goal. The participant pairs were also assigned randomly rather than according to their cognitive abilities, to represent naturalistic pairing; hence the specific effect of the SAR system on each cognitive group needs further investigation. The presented SAR system can engage one or multiple older adults in a closed-loop fashion; in the case of one robot with multiple older adults, the system adapts to both individual and group interaction. More sophisticated system adaptation based on HHI can be embedded into the presented HRI framework. The current system is limited in that only interaction data and in-task collaboration were evaluated to adapt robot behaviors. In the future, we intend to include the activity engagement and social interaction measures to evaluate real-time interpersonal social interaction and task engagement, and to extend the adaptive behaviors of the robot to shape the social interaction among older adults. We will also design tasks with different difficulty levels to accommodate older adults with different cognition levels. Studies that evaluate system performance and efficacy over long-term interaction and that examine whether and how different levels of cognition affect a pair’s interactions are important future directions as well.
Acknowledgments
This work was supported by National Institutes of Health Grant 1R21AG050483-01A1 and by Vanderbilt University Grant UL1 TR000445 from the National Center for Advancing Translational Sciences/NIH.
Biography

Jing Fan received the B.S. degree in electrical engineering from Beijing Jiaotong University, China, in 2012, and the M.S. and Ph.D. degrees in electrical engineering from Vanderbilt University, Nashville, TN, USA, in 2014 and 2019, respectively. Her research interests include human-robot interaction, robotics, machine learning, artificial intelligence, affective computing, and brain-computer interfaces. She serves as an Associate Editor for the IEEE International Symposium on Robot and Human Interactive Communication.

Lorraine C. Mion received her B.S. degree in nursing from St. Johns University, Cleveland, OH, in 1976, and her M.S. and Ph.D. degrees from Case Western Reserve University, Cleveland, OH, USA, in 1981 and 1992, respectively. Her major field of study is gerontological nursing.
Since 1976, she has been employed as a registered nurse in several U.S.A. acute care medical centers and universities. Since 2016 she has been a Professor at The Ohio State University, Columbus, OH, U.S.A. She has 120 data-based publications, 31 clinical publications, and 24 book chapters. Her areas of research interest include quality of care improvements in geriatric care in acute and long term care settings, fall prevention, delirium and agitation, and use of technology in health care settings. She is on the Editorial Boards of The Joint Commission Journal on Quality and Patient Safety and Geriatric Nursing.

Linda Beuscher has been a certified gerontological nurse practitioner since 1996. She is a member of the Vanderbilt University Center for Geriatric Nursing Excellence. For the past 9 years, Dr. Beuscher has worked with Dr. John Schnelle and Dr. Sandra Simmons at the Center for Quality Aging, conducting research focused on improving the quality of life of LTC and assisted living residents, particularly those with depression and cognitive impairment, and on quality improvement processes in nursing homes.

Nilanjan Sarkar is the David K. Wilson Professor of Engineering at the School of Engineering, where he is a Professor and the Chair of Mechanical Engineering and a Professor of Electrical Engineering and Computer Science. He received his Ph.D. in Mechanical Engineering and Applied Mechanics from the University of Pennsylvania. His current research interests include human-robot interaction, affective computing, dynamics, and control. Dr. Sarkar is a Fellow of the American Society of Mechanical Engineers. He served as an Associate Editor for the IEEE Transactions on Robotics.

Akshith Ullal received the B.S. degree in electrical engineering from Visvesvaraya Technological University, Belgaum, India, in 2015 and the M.S. degree in electrical engineering from University of Missouri, Columbia, USA, in 2018. He is currently pursuing the Ph.D. degree in electrical engineering at Vanderbilt University, Nashville, TN, USA. His current research interests include human computer interaction, augmented and virtual reality, machine learning and artificial intelligence.

Paul Newhouse, M.D. holds the Jim Turner Chair of Cognitive Disorders at Vanderbilt University School of Medicine and is Professor of Psychiatry, Pharmacology, and Medicine and is Director of the Center for Cognitive Medicine in the Department of Psychiatry and Behavioral Sciences at Vanderbilt University Medical Center and Clinical Core Director of the Vanderbilt Alzheimer’s Disease Research Center. He is also a physician-scientist at the VA Tennessee Valley Health Systems Geriatric Research, Education, and Clinical Center (GRECC). Dr. Newhouse’s research has focused on central cholinergic mechanisms in brain aging and normal and disordered cognitive functioning in humans. He has funding from NIA, Alzheimer’s Association, and Alzheimer’s Drug Discovery Foundation.
Contributor Information
Jing Fan, Electrical Engineering and Computer Science Department, Vanderbilt University, Nashville, TN 37212 USA.
Lorraine C. Mion, Center of Excellence in Critical and Complex Care, College of Nursing, The Ohio State University, OH 43210 USA.
Linda Beuscher, Vanderbilt University School of Nursing, Nashville, TN 37204 USA.
Akshith Ullal, Electrical Engineering and Computer Science Department, Vanderbilt University, Nashville, TN 37212 USA.
Paul A. Newhouse, Center for Cognitive Medicine, Department of Psychiatry and Behavioral Sciences, Vanderbilt University, Geriatric Research Education and Clinical Center (GRECC), Tennessee Valley Veterans Affairs Medical Center, Nashville, TN 37212 USA.
Nilanjan Sarkar, Mechanical Engineering Department, Electrical Engineering and Computer Science Department, Vanderbilt University, Nashville, TN 37212 USA.
References
- [1].Federal Interagency Forum on Aging-Related Statistics, “Older Americans 2016: Key Indicators of Well-Being,” Federal Interagency Forum on Aging-Related Statistics, 2016. [Google Scholar]
- [2].Alzheimer’s Association, “2018 Alzheimer’s disease facts and figures,” Alzheimer’s & Dementia, vol. 14, pp. 367–429, 2018. [Google Scholar]
- [3].Krueger KR, Wilson RS, Kamenetsky JM, Barnes LL, Bienias JL, and Bennett DA, “Social engagement and cognitive function in old age,” Experimental aging research, vol. 35, pp. 45–60, 2009. [DOI] [PMC free article] [PubMed] [Google Scholar]
- [4].James BD, Wilson RS, Barnes LL, and Bennett DA, “Late-life social activity and cognitive decline in old age,” Journal of the International Neuropsychological Society: JINS, vol. 17, p. 998, 2011. [DOI] [PMC free article] [PubMed] [Google Scholar]
- [5].Agüera-Ortiz L, García-Ramos R, Grandas Pérez FJ, López-Álvarez J, Montes Rodríguez JM, Olazarán Rodríguez FJ, Olivera Pueyo J, Pelegrín Valero C, and Porta-Etessam J, “Depression in Alzheimer’s disease: a Delphi consensus on etiology, risk factors, and clinical management,” Frontiers in psychiatry, vol. 12, p. 141, 2021. [DOI] [PMC free article] [PubMed] [Google Scholar]
- [6].Lanctôt KL, Agüera-Ortiz L, Brodaty H, Francis PT, Geda YE, Ismail Z, Marshall GA, Mortby ME, Onyike CU, and Padala PR, “Apathy associated with neurocognitive disorders: recent progress and future directions,” Alzheimer’s & Dementia, vol. 13, pp. 84–100, 2017. [DOI] [PubMed] [Google Scholar]
- [7].C. f. D. Control and Prevention, “QuickState: percentage of users of longterm care services with a diagnosis of depression, by provider type—National study of long-term care providers, United States, 2011 and 2012,” Morbidity and mortality weekly report, vol. 63, p. 82, 2014. [Google Scholar]
- [8].Nobis L and Husain M, “Apathy in Alzheimer’s disease,” Current Opinion in Behavioral Sciences, vol. 22, pp. 7–13, 2018. [DOI] [PMC free article] [PubMed] [Google Scholar]
- [9].Brodaty H and Burns K, “Nonpharmacological management of apathy in dementia: a systematic review,” The American Journal of Geriatric Psychiatry, vol. 20, pp. 549–564, 2012. [DOI] [PubMed] [Google Scholar]
- [10].Ellis JM, Doyle CJ, and Selvarajah S, “The relationship between apathy and participation in therapeutic activities in nursing home residents with dementia: Evidence for an association and directions for further research,” Dementia, vol. 15, pp. 494–509, 2016. [DOI] [PubMed] [Google Scholar]
- [11].Buettner L and Kolanowski A, “Prescribing activities that engage passive residents: an innovative method,” Journal of gerontological nursing, vol. 34, pp. 13–18, 2008. [DOI] [PMC free article] [PubMed] [Google Scholar]
- [12].US Department of Health Human Services, “CMS Manual System: Pub. 100–07,” Washington, DC: Medicaid CfM. [Google Scholar]
- [13].Mansbach WE, Mace RA, Clark KM, and Firth IM, “Meaningful activity for long-term care residents with dementia: A comparison of activities and raters,” The Gerontologist, vol. 57, pp. 461–468, 2017. [DOI] [PubMed] [Google Scholar]
- [14].Cohen-Mansfield J, Marx MS, Dakheel-Ali M, and Thein K, “The use and utility of specific nonpharmacological interventions for behavioral symptoms in dementia: an exploratory study,” The American Journal of Geriatric Psychiatry, vol. 23, pp. 160–170, 2015. [DOI] [PMC free article] [PubMed] [Google Scholar]
- [15].Long Term Care Coalition. (15 May 2019). Nursing Home Staffing 2018 Q4. Available: https://nursinghome411.org/nursing-home-staffing-2018-q4/ [Google Scholar]
- [16].Yu R, Hui E, Lee J, Poon D, Ng A, Sit K, Ip K, Yeung F, Wong M, and Shibata T, “Use of a Therapeutic, Socially Assistive Pet Robot (PARO) in Improving Mood and Stimulating Social Interaction and Communication for People With Dementia: Study Protocol for a Randomized Controlled Trial,” JMIR research protocols, vol. 4, 2015. [DOI] [PMC free article] [PubMed] [Google Scholar]
- [17].Gross H, Schroeter C, Mueller S, Volkhardt M, Einhorn E, Bley A, Martin C, Langner T, and Merten M, “Progress in developing a socially assistive mobile home robot companion for the elderly with mild cognitive impairment,” in Intelligent Robots and Systems (IROS), 2011 IEEE/RSJ International Conference on, 2011, pp. 2430–2437. [Google Scholar]
- [18].McColl D, Louie WYG, and Nejat G, “Brian 2.1: A socially assistive robot for the elderly and cognitively impaired,” Robotics & Automation Magazine, IEEE, vol. 20, pp. 74–83, 2013. [Google Scholar]
- [19].Fasola J and Matarić MJ, “A Socially Assistive Robot Exercise Coach for the Elderly,” Journal of Human-Robot Interaction, vol. 2, pp. 3–32, 2013. [Google Scholar]
- [20].Görer B, Salah A, and Akin HL, “A Robotic Fitness Coach for the Elderly,” in Ambient Intelligence, ed: Springer International Publishing, 2013, pp. 124–139. [Google Scholar]
- [21].Tapus A, Tapus C, and Matarić M, “Long Term Learning and Online Robot Behavior Adaptation for Individuals with Physical and Cognitive Impairments,” in Field and Service Robotics, 2010, pp. 389–398. [Google Scholar]
- [22].Louie WYG, Vaquero T, Nejat G, and Beck JC, “An autonomous assistive robot for planning, scheduling and facilitating multi-user activities,” in Robotics and Automation (ICRA), 2014 IEEE International Conference on, 2014, pp. 5292–5298. [Google Scholar]
- [23].Bäck I, Makela K, and Kallio J, “Robot-Guided Exercise Program for the Rehabilitation of Older Nursing Home Residents,” Annals of Long-Term Care: Clinical Care and Aging, vol. 21, pp. 38–41, 2013. [Google Scholar]
- [24].Matsusaka Y, Fujii H, Okano T, and Hara I, “Health exercise demonstration robot T AIZO and effects of using voice command in robot-human collaborative demonstration,” in Robot and Human Interactive Communication, 2009. RO-MAN 2009. The 18th IEEE International Symposium on, 2009, pp. 472–477. [Google Scholar]
- [25].Looije R, Neerincx MA, and Cnossen F, “Persuasive robotic assistant for health self-management of older adults: Design and evaluation of social behaviors,” International Journal of Human-Computer Studies, vol. 68, pp. 386–397, 2010. [Google Scholar]
- [26].Khosla R and Chu M-T, “Embodying care in Matilda: an affective communication robot for emotional wellbeing of older people in Australian residential care facilities,” ACM Transactions on Management Information Systems (TMIS), vol. 4, p. 18, 2013. [Google Scholar]
- [27].Kanoh M, Oida Y, Nomura Y, Araki A, Konagaya Y, Ihara K, Shimizu T, and Kimura K, “Examination of practicability of communication robot-assisted activity program for elderly people,” Journal of Robotics and Mechatronics, vol. 23, p. 3, 2011. [Google Scholar]
- [28].Matsuyama Y, Taniyama H, Fujie S, and Kobayashi T, “Framework of Communication Activation Robot Participating in Multiparty Conversation,” in AAAI Fall Symposium: Dialog with Robots, 2010. [Google Scholar]
- [29].Ienca M, Fabrice J, Elger B, Caon M, Pappagallo AS, Kressig RW, and Wangmo T, “Intelligent assistive technology for Alzheimer’s disease and other dementias: a systematic review,” Journal of Alzheimer’s Disease, vol. 56, pp. 1301–1340, 2017. [DOI] [PubMed] [Google Scholar]
- [30].Rendon AA, Lohman EB, Thorpe D, Johnson EG, Medina E, and Bradley B, “The effect of virtual reality gaming on dynamic balance in older adults,” Age and ageing, vol. 41, pp. 549–552, 2012. [DOI] [PubMed] [Google Scholar]
- [31].Young W, Ferguson S, Brault S, and Craig C, “Assessing and training standing balance in older adults: a novel approach using the ‘Nintendo Wii’Balance Board,” Gait & posture, vol. 33, pp. 303–305, 2011. [DOI] [PubMed] [Google Scholar]
- [32].Anderson-Hanley C, Arciero PJ, Brickman AM, Nimon JP, Okuma N, Westen SC, Merz ME, Pence BD, Woods JA, and Kramer AF, “Exergaming and older adult cognition: a cluster randomized clinical trial,” American journal of preventive medicine, vol. 42, pp. 109–119, 2012. [DOI] [PubMed] [Google Scholar]
- [33].Shibano T, Ho Y, Kono Y, Fujimoto Y, and Yamaguchi T, “Daily support system for care prevention by using interaction monitoring robot,” in Intelligent Robots and Systems (IROS), 2010IEEE/RSJ International Conference on, 2010, pp. 3477–3482. [Google Scholar]
- [34].Holthe T, Halvorsrud L, Karterud D, Hoel K-A, and Lund A, “Usability and acceptability of technology for community-dwelling older adults with mild cognitive impairment and dementia: a systematic literature review,” Clinical Interventions in Aging, vol. 13, p. 863, 2018. [DOI] [PMC free article] [PubMed] [Google Scholar]
- [35].Cohen-Mansfield J, Hai T, and Comishen M, “Group engagement in persons with dementia: the concept and its measurement,” Psychiatry Research, vol. 251, pp. 237–243, 2017. [DOI] [PubMed] [Google Scholar]
- [36].Lee HR and Riek LD, “Reframing assistive robots to promote successful aging,” ACM Transactions on Human-Robot Interaction (THRI), vol. 7, pp. 1–23, 2018. [Google Scholar]
- [37].Livingston G, Sommerlad A, Orgeta V, Costafreda SG, Huntley J, Ames D, Ballard C, Banerjee S, Burns A, and Cohen-Mansfield J, “Dementia prevention, intervention, and care,” The Lancet, vol. 390, pp. 2673–2734, 2017. [DOI] [PubMed] [Google Scholar]
- [38].Cohen-Mansfield J, Dakheel-Ali M, and Marx MS, “Engagement in persons with dementia: the concept and its measurement,” The American Journal of Geriatric Psychiatry, vol. 17, pp. 299–307, 2009. [DOI] [PMC free article] [PubMed] [Google Scholar]
- [39].Cohen-Mansfield J, Thein K, Dakheel-Ali M, and Marx MS, “Engaging nursing home residents with dementia in activities: the effects of modeling, presentation order, time of day, and setting characteristics,” Aging & mental health, vol. 14, pp. 471–480, 2010. [DOI] [PMC free article] [PubMed] [Google Scholar]
- [40].Cohen-Mansfield J, “Activity groups for persons with dementia: personal predictors of participation, engagement and mood,” Psychiatry Research, vol. 257, pp. 375–380, 2017. [DOI] [PubMed] [Google Scholar]
- [41].Cohen-Mansfield J, Marx MS, Freedman LS, Murad H, Regier NG, Thein K, and Dakheel-Ali M, “The comprehensive process model of engagement,” The American Journal of Geriatric Psychiatry, vol. 19, pp. 859–870, 2011. [DOI] [PMC free article] [PubMed] [Google Scholar]
- [42] Venkatesh V, Morris MG, Davis GB, and Davis FD, “User acceptance of information technology: Toward a unified view,” MIS Quarterly, vol. 27, pp. 425–478, 2003.
- [43] Sim DYY and Loo CK, “Extensive assessment and evaluation methodologies on assistive social robots for modelling human–robot interaction – A review,” Information Sciences, vol. 301, pp. 305–344, 2015.
- [44] Sidner CL, Lee C, Kidd CD, Lesh N, and Rich C, “Explorations in engagement for humans and robots,” Artificial Intelligence, vol. 166, pp. 140–164, 2005.
- [45] Fan J, Bian D, Zheng Z, Beuscher L, Newhouse PA, Mion LC, and Sarkar N, “A Robotic Coach Architecture for Elder Care (ROCARE) based on multi-user engagement models,” IEEE Transactions on Neural Systems and Rehabilitation Engineering, vol. 25, pp. 1153–1163, 2017.
- [46] Johnson J and Finn K, Designing User Interfaces for an Aging Population: Towards Universal Design. Morgan Kaufmann, 2017.
- [47] Fan J, Beuscher L, Newhouse P, Mion LC, and Sarkar N, “A Collaborative Virtual Game to Support Activity and Social Engagement for Older Adults,” in International Conference on Universal Access in Human-Computer Interaction, 2018, pp. 192–204.
- [48] Fan J, Wade JW, Bian D, Key AP, Warren ZE, Mion LC, and Sarkar N, “A Step towards EEG-based brain computer interface for autism intervention,” in Proceedings of the Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC), 2015, p. 3767.
- [49] Jeste DV, Palmer BW, Appelbaum PS, Golshan S, Glorioso D, Dunn LB, Kim K, Meeks T, and Kraemer HC, “A new brief instrument for assessing decisional capacity for clinical research,” Archives of General Psychiatry, vol. 64, pp. 966–974, 2007.
- [50] Doerflinger DMC, “Mental status assessment in older adults: Montreal Cognitive Assessment: MoCA Version 7.1 (original version),” The Clinical Neuropsychologist, vol. 25, pp. 119–126, 2012.
- [51] Lazar A, Cornejo R, Edasis C, and Piper AM, “Designing for the third hand: Empowering older adults with cognitive impairment through creating and sharing,” in Proceedings of the 2016 ACM Conference on Designing Interactive Systems, 2016, pp. 1047–1058.
- [52] Goršič M, Cikajlo I, and Novak D, “Competitive and cooperative arm rehabilitation games played by a patient and unimpaired person: effects on motivation and exercise intensity,” Journal of NeuroEngineering and Rehabilitation, vol. 14, pp. 1–18, 2017.
- [53] Mace M, Kinany N, Rinne P, Rayner A, Bentley P, and Burdet E, “Balancing the playing field: collaborative gaming for physical training,” Journal of NeuroEngineering and Rehabilitation, vol. 14, pp. 1–18, 2017.
- [54] Traeger ML, Sebo SS, Jung M, Scassellati B, and Christakis NA, “Vulnerable robots positively shape human conversational dynamics in a human-robot team,” Proceedings of the National Academy of Sciences, vol. 117, pp. 6370–6375, 2020.
- [55] Miltiades HB and Thatcher W, “Social engagement during game play in persons with Alzheimer’s: Innovative practice,” Dementia, vol. 18, pp. 808–813, 2019.
- [56] Lazar A, Edasis C, and Piper AM, “Supporting people with dementia in digital social sharing,” in Proceedings of the 2017 CHI Conference on Human Factors in Computing Systems, 2017, pp. 2149–2162.
- [57] Lazar A, Thompson HJ, Piper AM, and Demiris G, “Rethinking the design of robotic pets for older adults,” in Proceedings of the 2016 ACM Conference on Designing Interactive Systems, 2016, pp. 1034–1046.
- [58] Nestorov N, Stone E, Lehane P, and Eibrand R, “Aspects of socially assistive robots design for dementia care,” in 2014 IEEE 27th International Symposium on Computer-Based Medical Systems, 2014, pp. 396–400.
- [59] Broadbent E, “Interactions with robots: The truths we reveal about ourselves,” Annual Review of Psychology, vol. 68, pp. 627–652, 2017.
- [60] Cohen-Mansfield J, Gavendo R, and Blackburn E, “Activity Preferences of persons with dementia: An examination of reports by formal and informal caregivers,” Dementia, vol. 18, pp. 2036–2048, 2019.
- [61] Gardiner C, Geldenhuys G, and Gott M, “Interventions to reduce social isolation and loneliness among older people: an integrative review,” Health & Social Care in the Community, vol. 26, pp. 147–157, 2018.
- [62] Gerst-Emerson K and Jayawardhana J, “Loneliness as a public health issue: the impact of loneliness on health care utilization among older adults,” American Journal of Public Health, vol. 105, pp. 1013–1019, 2015.
