Abstract
Background
We developed MARVIN, an artificial intelligence (AI)‐based chatbot that provides 24/7 expert‐validated information on self‐management‐related topics for people with HIV. This study assessed (1) the feasibility of using MARVIN, (2) its usability and acceptability, and (3) four usability subconstructs (perceived ease of use, perceived usefulness, attitude towards use, and behavioural intention to use).
Methods
In a mixed‐methods study conducted at the McGill University Health Centre, enrolled participants were asked to have 20 conversations within 3 weeks with MARVIN on predetermined topics and to complete a usability questionnaire. Feasibility, usability, acceptability, and usability subconstructs were examined against predetermined success thresholds. Qualitatively, randomly selected participants were invited to semi‐structured focus groups/interviews to discuss their experiences with MARVIN. Barriers and facilitators were identified according to the four usability subconstructs.
Results
From March 2021 to April 2022, 28 participants were surveyed after a 3‐week testing period, and nine were interviewed. Study retention was 70% (28/40). Mean usability exceeded the threshold (69.9/68), whereas mean acceptability was very close to target (23.8/24). Ratings of attitude towards MARVIN's use were positive (+14%), with the remaining subconstructs exceeding the target (5/7). Facilitators included MARVIN's reliable and useful real‐time information support, its easy accessibility, provision of convivial conversations, confidentiality, and perception as being emotionally safe. However, MARVIN's limited comprehension and the use of Facebook as an implementation platform were identified as barriers, along with the need for more conversation topics and new features (e.g., memorization).
Conclusions
The study demonstrated MARVIN's global usability. Our findings show its potential for HIV self‐management and provide direction for further development.
Keywords: antiretroviral, artificial intelligence, Canada, chatbot, conversational agent, digital health, feasibility, HIV, implementation science, mixed methods, mobile phone, patient and stakeholder engagement, self‐management, telehealth, usability
INTRODUCTION
Background
In 2022, around 39 million people were living with HIV globally [1]. In Canada, the estimated number of new HIV diagnoses in 2022 stood at 1833, a 25% increase from 2021 [2]. Four decades into the global HIV pandemic, effective antiretroviral therapy (ART) has significantly improved the life expectancy of people with HIV, closing the gap between them and those without HIV [3]. HIV is now a manageable chronic disease [4] that nevertheless requires lifelong self‐management, including engagement in beneficial behaviours such as attending regular healthcare appointments, acquiring self‐management‐related knowledge, and developing decision‐making skills [5]. Adherence to ART is especially important to maintain viral suppression and thereby avoid complications and onward transmission [6]. However, self‐management and ART adherence are challenged by a variety of factors, including medication beliefs and concerns (lack of understanding of treatment, side effects), lifestyle (disrupted routine, substance use), interpersonal relationships (fear of disclosure, stigma), and healthcare‐related factors (patient–provider communication) [7].
Digital health, the use of information technology (IT) to manage illnesses and promote wellness, can provide innovative solutions to help people with HIV address these self‐management barriers. A systematic review revealed that telephone‐ and website‐supported counselling and messaging facilitated remote access and timely information exchange with people with HIV, resulting in effective improvements in medication adherence, coping with HIV‐related conditions, and management of side effects [8]. A Canadian study also established that weekly follow‐up via text message improved medication adherence and viral suppression among people with HIV [9]. While minimizing travel costs, saving time, and protecting privacy [8], IT‐assisted interventions could enable swift access to reliable health information for people with HIV and optimize their self‐management practices.
Within health‐related IT, chatbots are among the most promising tools, with the potential to revolutionize patient self‐management and support [10]. Chatbots are acceptable to patients [11, 12] and can establish a good collaborative connection with them [13, 14, 15], promoting active engagement in care. Since the first appearance of the ELIZA chatbot as a psychotherapist in 1966 [16], chatbots have been explored in a variety of healthcare applications across mental health [17, 18, 19, 20, 21], oncology [22, 23, 24, 25, 26], and diabetes [27, 28]. Chatbots often harness the power of artificial intelligence (AI) to enable natural language interpretation and aid decision‐making. They have been found to be easily accessible and able to provide valid information quickly while ensuring anonymity [29, 30]. This is well suited for people with HIV who require lifelong self‐management and are still reluctant to disclose their condition for fear of stigmatization [31]. Yet, relevant chatbot work remains modest [32, 33]. Brixey et al. implemented SHIHbot on Facebook in 2017, the first chatbot to provide HIV‐related sexual health information [34]. Ardiana et al. also presented a mobile‐based chatbot for HIV/AIDS information and counselling [35]. User satisfaction was high (3.6/4) as was usability (3.3/4), indicating user endorsement of the overall system concept. Apart from these two forays into information provision, most other studies in this area have focused on HIV prevention [11, 36, 37, 38, 39, 40, 41]. The team of Van Heerden et al. tested the use of chatbots for rapid HIV self‐testing and counselling in South Africa in 2017 and 2022 [11, 36], with the majority of testers reporting their intention to use such technology due to the privacy and anonymity it afforded compared with human counsellors. Cheah et al. reached similar conclusions, where participants in Malaysia perceived chatbots as helpful to avoid stigma‐inducing interactions and found chatbots to be a useful tool for HIV self‐testing and pre‐exposure prophylaxis information [40]. Along with Yam et al. [37], both studies noted the need for their chatbot to have more HIV‐related information, especially on ART treatment and mental health support, to ensure the subsequent successful implementation and rollout [40].
Despite these sporadic yet encouraging efforts in HIV prevention, self‐management by people with HIV remains an essential aspect of care that has not seen similar advancements. Starting in 2020, our multidisciplinary team including patient partners, healthcare professionals, engineers, and researchers collaborated on the development of an AI‐based bilingual chatbot named MARVIN (Minimal AntiRetroViral INterference). Created using a co‐design approach with patient and stakeholder engagement [42], MARVIN can converse on issues of HIV self‐management, answering with expert‐validated information on ART administration, ART management while travelling, and general HIV‐related knowledge. It can also provide medication reminders.
Aim and objectives
To the best of our knowledge, there are no published studies on chatbot use to facilitate self‐management among people with HIV, including ART adherence [43]. Furthermore, despite the growing interest in healthcare chatbots, the extent to which people find them useful needs to be thoroughly evaluated [44]. To bridge this knowledge gap, the primary objectives of this study were to (1) assess the feasibility of using the new MARVIN chatbot by people with HIV and (2) gauge MARVIN's global usability. Its secondary objective was to (3) further determine its usability in terms of four subconstructs and their interrelationships: perceived ease of use, perceived usefulness, attitude towards use, and behavioural intention to use.
METHODS
The MARVIN Chatbot Intervention
Described in detail in a previous publication [42], MARVIN is an AI‐based bilingual chatbot that communicates with people with HIV in either English or French, offering them advice on issues of self‐management through brief text‐based conversations. During the study, MARVIN operated 24/7 on Facebook Messenger for free and was accessible exclusively to study participants. No updates were made to the chatbot, and no third‐party human was involved in the interactions between MARVIN and participants.
MARVIN begins its first conversation with the user by explicitly introducing itself as a bot and asking for the user's preferred language. It then describes how it handles data, how accounts can be deleted, and the intended use of the chatbot. A complete list of conversation topics can be found in Multimedia Appendix S1. The HIV‐related conversations mainly cover the following topics:
ART administration: MARVIN can address issues related to time management, dosing, common drug interactions, medication storage, and medication identification. Figure 1 shows an example conversation with MARVIN on forgetting to take a medication.
ART management while travelling: MARVIN can specify whether a particular country has immigration restrictions for people with HIV or vaccination requirements, how to prepare and carry ART medications for travel, and how to deal with time zone differences.
General HIV‐related knowledge: MARVIN can also converse on common HIV symptoms, modes of transmission and prevention, and routine vaccination recommendations for people with HIV.
Medication reminders: on the user's request, MARVIN can send daily reminders to take medication. For privacy reasons, reminders can be customized, changed, or deleted at any time. As an example, a user can ask MARVIN to send the message “Time for a walk!” as a reminder, thus avoiding disclosing their HIV status.
FIGURE 1.
A conversation with MARVIN on forgetting to take a medication.
In terms of AI algorithms, MARVIN was developed using the Rasa framework [45] and employs a variety of algorithms, mainly intent classification and entity extraction (e.g., identifying time, drug name, or quantity), as well as decision trees for dialogue management. The corresponding decision‐tree nodes lead to different messages pre‐defined by our multidisciplinary team. When unable to understand the user's intent or reach a specific intent related to diagnosis or treatment, MARVIN acknowledges its limits and encourages the participant to contact their healthcare provider.
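The fallback behaviour described above can be illustrated with a minimal sketch. It is not MARVIN's actual implementation and does not use the Rasa API: the confidence threshold, the intent names, and the message texts are all hypothetical and chosen only to show how a classified intent could be routed either to a pre‐defined message or to a referral to the healthcare provider.

```python
from dataclasses import dataclass

# Hypothetical values for illustration only; none of them come from the study.
CONFIDENCE_THRESHOLD = 0.6
RESTRICTED_INTENTS = {"ask_diagnosis", "ask_treatment_change"}
RESPONSES = {
    "missed_dose": "Here is general advice on what to do after a missed dose.",
    "travel_storage": "Here is how to carry your medication when travelling.",
}
FALLBACK = "I cannot help with that. Please contact your healthcare provider."


@dataclass
class Prediction:
    intent: str        # label produced by an intent classifier
    confidence: float  # classifier confidence between 0 and 1


def choose_reply(prediction: Prediction) -> str:
    """Return a pre-defined message, or the fallback when the bot should defer."""
    # Case 1: the classifier is not confident enough about the user's intent.
    if prediction.confidence < CONFIDENCE_THRESHOLD:
        return FALLBACK
    # Case 2: the intent concerns diagnosis or treatment decisions.
    if prediction.intent in RESTRICTED_INTENTS:
        return FALLBACK
    # Otherwise, look up the pre-defined message (fallback if none exists).
    return RESPONSES.get(prediction.intent, FALLBACK)
```

In this sketch, an unrecognized intent also falls through to the referral message, so the bot never improvises an answer.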
Study design
This 4‐week uncontrolled single‐group usability study employed a mixed‐methods convergent design [46]. It simultaneously collected qualitative and quantitative data and combined them to assess usability outcomes of the MARVIN chatbot. We reported our findings in accordance with the CONSORT‐AI (Consolidated Standards of Reporting Trials–Artificial Intelligence) guidelines [47] (Multimedia Appendix S2).
Ethical considerations
This study received approval from the McGill University Health Centre (MUHC) research ethics board on 9 April 2021 (approval 2021‐7191).
Settings and participants
This study was conducted at the Chronic Viral Illness Service (CVIS) of the MUHC – Glen site, located in Montreal, Quebec, Canada. Currently, over 2300 patients, about 40% of whom are women, are being followed up at the CVIS clinic.
Eligibility criteria
Participant inclusion criteria were as follows: (1) age ≥18 years, (2) fluency in English or French, (3) confirmed diagnosis of HIV infection, (4) on ART, (5) access to a smart device (e.g., a smartphone), (6) access to an internet connection, (7) willingness to use or create a personal Facebook account, and (8) acceptance of Facebook's privacy and data security policies. Exclusion criteria were not meeting inclusion criteria, being hospitalized, concurrent enrolment in another study involving chatbots, or having a cognitive impairment that prevented participation.
Sample recruitment
The target sample size for the quantitative component was 30 participants, as recommended for one‐group pilot studies [48, 49].
Using a convenience sampling strategy [50], healthcare providers asked their patients at an in‐clinic or remote follow‐up visit whether they were interested in participating in the study. The study coordinator then contacted interested individuals to determine eligibility and, if eligible, proceeded to informed consent. All participants were given detailed verbal and written information describing the study procedures, anticipated benefits, and potential risks. Patients who consented to participate were asked whether they agreed to also partake in the qualitative component of the study.
Study procedures
Table 1 presents the main study procedures for the patient participants and their schedule.
TABLE 1.
Study procedures – MARVIN usability study.
Study procedure | At entry | Week 1 | Week 2 | Week 3 | Week 4 |
---|---|---|---|---|---|
Preparation | | | | | |
Screening and consent process | ✓ | | | | |
Training session | ✓ | | | | |
Data collection | | | | | |
Sociodemographic questionnaire | ✓ | | | | |
Usability testing | | | | | |
At least 20 conversations with MARVIN | ✓ | ✓ | ✓ | ✓ | |
Quantitative component | | | | | |
Usability questionnaire | ✓ | | | | |
Qualitative component | | | | | |
2‐h focus groups or 1‐h interview | ✓ | | | | |
After consent, a training session with a dedicated digital coordinator provided the participant with access to MARVIN on Facebook and instructions for its use (syntax, question stems, etc.) and helped engage them in a conversation with MARVIN to familiarize them with the process.
Participants were required to complete a sociodemographic questionnaire following enrolment. They then began a 3‐week usability test. This involved initiating 5–10 conversations with MARVIN at any given time on each of the following topics, by asking questions of their own creation: (1) advice for taking ART, (2) travelling with ART, and (3) vaccination recommendations for people with HIV. If MARVIN did not receive input from participants within a week's time, a standardized reminder was sent asking if everything was okay. Upon completion, the coordinator would inform them via email to fill out the study questionnaire and plan for a focus group or interview.
At study completion, if participants did not wish to maintain their Facebook Messenger conversations with MARVIN, the team would help them delete all records in accordance with the relevant Facebook Messenger privacy policy [51, 52, 53]. We compensated participants with CAD $30 for completing the usability testing and the study questionnaire and an additional CAD $30 for participating in the focus group discussions/interviews.
Guiding framework and hypotheses
Usability refers to the extent to which a product can be used to achieve specific goals with effectiveness, efficiency, and satisfaction [54]. Poor usability of digital health technology can diminish work system performance, increase error rates, and cause harm to patients [55]. Acceptability represents how agreeable, palatable, or satisfactory an intervention is perceived to be, which collectively affects the final adoption rate and its eventual implementation and use [56, 57]. Given the overlap between these concepts, we decided to treat them both as indicators of ‘global usability’, our primary outcome.
Concurrently, we adopted the technology acceptance model (TAM) to deepen our assessment of usability, both quantitatively and qualitatively. The TAM is a frequently used and validated conceptual framework for explaining the actual use and acceptance of new IT interventions in healthcare [58, 59]. Figure 2 specifies the four main subconstructs of the TAM: perceived ease of use, perceived usefulness, attitude towards use, and behavioural intention to use [60]. Three of these subconstructs are analogous to the ISO‐defined aspects of usability (i.e. effectiveness = perceived usefulness, efficiency = perceived ease of use, satisfaction = attitude towards use) [61], whereas behavioural intention to use denotes actual system use [62, 63]. They are thus appropriate for assessing different components of usability. We tested the following five hypotheses that were consistent with the original TAM and extant research [64, 65, 66]:
H1: Perceived ease of use is positively associated with perceived usefulness.
H2: Perceived ease of use is positively associated with attitude towards use.
H3: Perceived usefulness is positively associated with attitude towards use.
H4: Perceived usefulness is positively associated with behavioural intention to use.
H5: Attitude towards use is positively associated with behavioural intention to use.
FIGURE 2.
Guiding framework: The technology acceptance model, adapted from [51]. Hyp, hypothesis.
Quantitative data
Data collection
A sociodemographic questionnaire was administered to describe the study sample's characteristics and diversity (e.g., age, preferred language, gender, ethnic group identity) and participants' use of mobile devices, health apps, and Facebook Messenger (Multimedia Appendix S3). All study questionnaires (Multimedia Appendix S3) were administered via Google Forms.
Primary outcome
Concerning feasibility, we documented reasons for refusal to participate or screening failures with a designated refusal form. We examined the recruitment rate (i.e., the proportion of eligible contacts enrolled in the study) and the retention rate (i.e., the proportion of participants who completed the usability test and study questionnaires).
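The two feasibility rates defined above are simple proportions. The short sketch below shows the arithmetic, using the counts reported in the Results (54 eligible contacts, 40 enrolled, 28 completers); the function name is an illustrative choice, not part of the study's code.

```python
def rate_percent(numerator: int, denominator: int) -> float:
    """Return numerator/denominator as a percentage."""
    if denominator <= 0:
        raise ValueError("denominator must be positive")
    return 100.0 * numerator / denominator


# Counts taken from the Results section of this paper.
eligible, enrolled, completed = 54, 40, 28

recruitment_rate = rate_percent(enrolled, eligible)   # enrolled / eligible contacts
retention_rate = rate_percent(completed, enrolled)    # completers / enrolled

print(f"Recruitment: {recruitment_rate:.0f}%")  # prints "Recruitment: 74%"
print(f"Retention: {retention_rate:.0f}%")      # prints "Retention: 70%"
```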
Data on global usability were collected with two validated scales.
The first scale is a slightly adapted version of the Usability Metric for User Experience‐lite (UMUX‐lite) [67], which is based on the well‐established System Usability Scale (SUS) [68]. A standardized SUS score of 68 is generally considered the baseline for a usable tool. UMUX‐lite yields scores that are 99% predictive of SUS scores while requiring only two 7‐point Likert scale items, and it is considered suitable for healthcare technology assessment [69]. A corrective regression formula is used to align its scores with those of the SUS [67].
The second scale, a revised version of the 6‐item Acceptability E‐Scale (AES), is designed to evaluate computer‐based interventions for healthcare populations [70]. Items are rated on a 5‐point Likert scale and summed to create a global score (range 6–30), with 24 recommended by the developers as the acceptability threshold.
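The scoring of the two scales described above can be sketched as follows. For the UMUX‐lite, the sketch assumes the regression commonly cited for the standard instrument (0.65 × raw score + 22.9, with two items coded 1–7); the slightly adapted version used in this study may have been coded differently, so this is an illustration rather than the study's exact computation. The AES part follows the text directly: six items rated 1–5 are summed and compared with the threshold of 24. Function names are illustrative.

```python
SUS_BASELINE = 68.0   # usability threshold cited in the text
AES_THRESHOLD = 24    # acceptability threshold cited in the text


def umux_lite_sus_equivalent(item1: int, item2: int) -> float:
    """SUS-aligned UMUX-lite score, assuming items coded 1-7 (standard version)."""
    raw = (item1 + item2 - 2) * (100 / 12)  # rescale the item sum to 0-100
    return 0.65 * raw + 22.9                # corrective regression (assumed form)


def aes_total(items: list[int]) -> int:
    """Sum six AES items, each rated on a 5-point scale (total range 6-30)."""
    if len(items) != 6 or any(not 1 <= i <= 5 for i in items):
        raise ValueError("expected six ratings between 1 and 5")
    return sum(items)


# Hypothetical responses from one participant.
usability = umux_lite_sus_equivalent(6, 5)
acceptability = aes_total([4, 4, 5, 4, 3, 4])
print(usability >= SUS_BASELINE, acceptability >= AES_THRESHOLD)
```

Under these assumptions, the maximum attainable UMUX‐lite score is 87.9 rather than 100, which is a property of the regression alignment.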
Secondary outcome
Four subconstructs of the TAM were assessed with validated instruments.
Perceived ease of use was assessed through an adapted 7‐point Likert scale single ease‐of‐use question (SEQ) [71].
Perceived usefulness was measured using four items adapted from the tools of Chau & Hu [72] and Davis [73]. Rated on a 7‐point Likert scale, item ratings were averaged to produce a perceived usefulness score.
To evaluate attitude towards the use of MARVIN, the net promoters score [74] was adopted with a single 11‐point Likert scale question (0–10). The final net promoters score is the percentage of promoters (scores 9–10) minus the percentage of detractors (scores 0–6). Positive scores, especially those over 50%, are judged favourably.
Behavioural intention to use was measured with two validated 7‐point Likert scale items [59] that were averaged to derive an intention score.
For perceived ease of use, perceived usefulness, and behavioural intention to use, a mean score target of 5 was considered a positive outcome [75].
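As an illustration of the net promoters score calculation described above, the sketch below computes the score from a list of 0–10 ratings. The ratings shown are hypothetical and are not study data.

```python
def net_promoter_score(ratings: list[int]) -> float:
    """Percentage of promoters (9-10) minus percentage of detractors (0-6)."""
    if not ratings:
        raise ValueError("at least one rating is required")
    if any(not 0 <= r <= 10 for r in ratings):
        raise ValueError("ratings must lie between 0 and 10")
    promoters = sum(1 for r in ratings if r >= 9)
    detractors = sum(1 for r in ratings if r <= 6)
    # Ratings of 7-8 (passives) count in the denominator only.
    return 100.0 * (promoters - detractors) / len(ratings)


# Hypothetical ratings: 5 promoters, 3 detractors, 2 passives.
example = [10, 9, 8, 7, 6, 10, 2, 9, 9, 5]
print(net_promoter_score(example))  # prints 20.0
```

The score therefore ranges from −100 (all detractors) to 100 (all promoters), and any value above zero means promoters outnumber detractors.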
Statistical analysis
All statistical analyses were conducted using the Python programming language [76].
Sample characteristics
The sociodemographic variables were summarized with descriptive statistics. For continuous variables, we report the mean and standard deviation (SD). In the case of ordinal and nominal qualitative variables, we present both counts and proportions.
Primary outcome
The global usability outcomes were also summarized using descriptive statistics (i.e., the means, SD, and the range). The sample means for both the UMUX‐lite and AES were compared with their recommended minimum thresholds of 68/100 and 24/30, respectively.
Next, global usability outcomes were stratified for subgroup comparisons by sociodemographic variables that emerged as important from the qualitative analysis [46], namely preferred language and years since HIV diagnosis. Normality of the distributions was assessed using skewness and kurtosis coefficients. Student's t‐tests were used to test the null hypothesis that subgroup means were equal, at a significance level of 5%.
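To make the subgroup comparison concrete, the sketch below computes a pooled‐variance (Student's) t statistic from summary statistics. It assumes equal variances, as the classical Student's test does; the paper does not specify the exact routine used. The inputs are the rounded AES means and SDs reported in Table 3 for participants diagnosed ≤5 years versus >5 years (n = 14 each), so the result is approximate.

```python
import math


def pooled_t_statistic(mean1: float, sd1: float, n1: int,
                       mean2: float, sd2: float, n2: int) -> tuple[float, int]:
    """Return (t, degrees of freedom) for a two-sample Student's t-test."""
    if n1 < 2 or n2 < 2:
        raise ValueError("each group needs at least two observations")
    df = n1 + n2 - 2
    # Pooled variance weights each group's variance by its degrees of freedom.
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df
    standard_error = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean1 - mean2) / standard_error, df


# AES by years since diagnosis, rounded values from Table 3.
t_stat, df = pooled_t_statistic(26.1, 3.6, 14, 21.5, 5.0, 14)
print(f"t = {t_stat:.2f}, df = {df}")  # prints "t = 2.79, df = 26"
```

Converting the statistic to a two‐sided p value requires the t distribution, which is available in statistical libraries such as SciPy (for example, `scipy.stats.ttest_ind_from_stats`).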
Secondary outcome
The TAM subconstructs (perceived usefulness, perceived ease of use, attitude towards use, behavioural intention to use) were summarized using descriptive statistics (i.e., the means, SD, and the range).
We tested the five hypotheses on inter‐subconstruct relationships within the TAM framework using simple linear regression models. The sign of each model's slope coefficient reflected the direction of the association between the two subconstructs. We performed residual diagnostics for each model and adopted appropriate strategies when the assumptions were not met. The significance of each slope coefficient was tested with a Student's t‐test at a 5% significance level, and the 95% confidence interval for each coefficient is presented. As a measure of each model's predictive accuracy, we report the coefficient of determination R².
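The regression quantities named above (slope, its t statistic, and R²) can be obtained with ordinary least squares. The sketch below is a minimal standard‐library implementation run on hypothetical scores; it is not the study's analysis code and omits the residual diagnostics and confidence intervals.

```python
import math


def simple_linear_regression(x: list[float], y: list[float]) -> dict[str, float]:
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(x)
    if n != len(y) or n < 3:
        raise ValueError("x and y need the same length, with at least 3 points")
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    if sxx == 0 or syy == 0:
        raise ValueError("x and y must each vary")
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    r_squared = 1 - ss_res / syy
    # Standard error of the slope uses the residual variance with n - 2 df.
    se_slope = math.sqrt(ss_res / (n - 2) / sxx)
    t_stat = slope / se_slope if se_slope > 0 else math.inf
    return {"slope": slope, "intercept": intercept,
            "r_squared": r_squared, "t_stat": t_stat}


# Hypothetical scores (not study data): predictor x and outcome y.
fit = simple_linear_regression([1, 2, 3, 4, 5], [2, 4, 5, 4, 5])
print(fit["slope"] > 0)  # a positive slope indicates a positive association
```

A positive slope whose t statistic exceeds the critical value for n − 2 degrees of freedom would support a hypothesis of positive association.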
Qualitative data
Data collection
In conjunction with the study questionnaire, three semi‐structured focus group interviews were planned to further explore participants' experiences with MARVIN. The choice of three focus groups follows recent work suggesting that this would capture at least 80% of the themes, which we considered sufficient saturation for our usability study [77]. The interviews were designed following the subconstructs of the TAM framework. Additionally, participants were asked about future improvements they would like to see with MARVIN. The focus group interview guide can be found in Multimedia Appendix S4.
Focus group interviews were held via Zoom videoconferencing. Limitations imposed by responses to the COVID‐19 pandemic meant that some interviews ended up being conducted individually. YM and GT conducted the interviews, which lasted 15–85 minutes each and were audio‐recorded.
Analysis
All interviews were transcribed verbatim and de‐identified. All transcripts were cross‐checked by YM and GT for accuracy and completeness while both attempted to understand the entire dataset through immersion in the data.
A comprehensive coding matrix based on the TAM and the Consolidated Framework for Implementation Research (CFIR) [78] was used for deductive thematic analysis. The CFIR is a commonly used conceptual framework for identifying factors that might influence intervention implementation and effectiveness. It consists of five broad domains (i.e., intervention characteristics, outer setting, inner setting, characteristics of individuals, and process) and 39 subdomains. Given that ‘characteristics of individuals’ and ‘process’ focused more on scaling‐up implementation, we focused our analysis on the first three domains to illustrate relevant barriers and facilitators. Using the NVivo R1 software, initial codes were generated deductively using the CFIR subdomains. We then matched identified facilitators and barriers to the subconstructs of the TAM to determine their impact on global usability.
To ensure credibility and reliability, results were debriefed and discussed repeatedly by DL, ML, and KE and triangulated with quantitative findings. Illustrative quotes in French presented in this manuscript were translated to English by co‐authors.
RESULTS
Overview of participation
The participant flowchart is shown in Figure 3. From 30 March to 2 December 2021, a total of 88 people with HIV were screened for participation in the study, including 34 ineligible and 54 eligible individuals. The most common reasons for exclusion were that they were unable to commit to the entire study period or did not have a Facebook account. In total, 40 people were enrolled in the study, of whom 12 withdrew. All 28 participants who completed at least 20 conversations within 3 weeks were included in the analysis.
FIGURE 3.
Participants’ flow diagram throughout the study.
Sample characteristics
Table 2 describes the baseline sociodemographic characteristics of the study sample. Notably, most participants (23/28; 82.1%) self‐identified as men. Preferred language was evenly distributed between English (15/28; 53.6%) and French (13/28; 46.4%). Nearly half of participants (13/28; 46.4%) reported having a university degree. More than half (17/28; 60.7%) earned less than $40 000 CAD per year.
TABLE 2.
Sociodemographic characteristics of the sample (N = 28).
Characteristic | N (%) or mean ± SD |
---|---|
Age (years) a | 40.2 ± 11.5 |
Years diagnosed with HIV a | 8.2 ± 8.2 |
Gender | |
Men | 23 (82.1) |
Women | 5 (17.9) |
Preferred language | |
English | 15 (53.6) |
French | 13 (46.4) |
Sexual orientation b | |
Heterosexual/straight | 6 (21.4) |
Lesbian | 1 (3.6) |
Gay | 19 (67.9) |
Queer | 2 (7.1) |
Bisexual | 4 (14.3) |
Questioning | 1 (3.6) |
Other | 1 (3.6) |
Ethnic groups b | |
English Canadian | 1 (3.6) |
French Canadian | 6 (21.4) |
British | 1 (3.6) |
Other Eastern/Western European | 3 (10.7) |
West Asian | 1 (3.6) |
Arab or North African | 3 (10.7) |
Latin American | 10 (35.7) |
African | 4 (14.3) |
Black | 5 (17.9) |
Caribbean | 1 (3.6) |
Mixed race/ethnicity | 1 (3.6) |
Highest education level | |
Secondary (high school) | 2 (7.1) |
Professional degree/college | 6 (21.4) |
CEGEP/technical degree | 4 (14.3) |
University | 13 (46.4) |
Other | 3 (10.7) |
Annual income ($CAD) | |
<10 000 | 2 (7.1) |
10 000–19 999 | 5 (17.9) |
20 000–39 999 | 10 (35.7) |
40 000–59 999 | 7 (25.0) |
60 000–79 999 | 2 (7.1) |
80 000–99 999 | 1 (3.6) |
≥100 000 | 1 (3.6) |
Most frequently used mobile device | |
Android | 11 (39.3) |
Apple iPhone | 15 (53.6) |
Tablet | 1 (3.6) |
Other c | 1 (3.6) |
Confident in the effective use of mHealth platforms/electronic surveys | |
Strongly disagree | 4 (14.3) |
Disagree | 0 (0) |
Neutral | 1 (3.6) |
Agree | 12 (42.9) |
Strongly agree | 11 (39.3) |
Extent of use of health‐related apps | |
Never | 4 (14.3) |
Very little | 6 (21.4) |
Sometimes | 11 (39.3) |
Frequently | 6 (21.4) |
Very frequently | 1 (3.6) |
Frequency of use of mobile devices | |
Several times a day | 26 (92.9) |
Once a day | 0 (0) |
Several times per week | 2 (7.1) |
Several times per month | 0 (0) |
Experience with Facebook Messenger | |
Less than 2 years | 3 (10.7) |
2–4 years | 4 (14.3) |
4–6 years | 2 (7.1) |
6–8 years | 3 (10.7) |
8–10 years | 2 (7.1) |
More than 10 years | 14 (50.0) |
a No missing data.
b Multiple‐choice question.
c One participant answered “Laptop.”
In terms of experience with technology, almost two‐thirds (18/28; 64.3%) had moderate experience with health‐related apps. Nearly all participants (26/28; 92.9%) used their mobile devices several times a day, and 89.3% (25/28) had used Facebook Messenger for more than 2 years.
Quantitative results
Primary endpoints: feasibility and global usability
Regarding feasibility, 40 participants were recruited among 54 eligible contacts, with a study recruitment rate of 74% (40/54). The retention rate was 70% (28/40), with 28 completing the 3‐week test.
Table 3 shows the descriptive statistics for the UMUX‐lite and AES. The mean value for the UMUX‐lite was 69.9 (SD 14.2), which surpassed the threshold of 68. The mean value of the AES was 23.8 (SD 4.9), which was close to the expected mean threshold of 24. Checking the distribution of questionnaire data revealed one outlier. Excluding the outlying participant made both measures exceed their respective threshold, with no effect on the remaining results.
TABLE 3.
Primary endpoints: Global usability and subgroup comparisons.
Measure | Range | Sample | Mean ± SD | Threshold | >Threshold, n (%) |
---|---|---|---|---|---|
Global usability | | | | | |
UMUX‐lite | 12.1–87.9 | N = 28 | 69.9 ± 14.2 | 68.0 | 15 (53.6) |
AES | 6–30 | N = 28 | 23.8 ± 4.9 | 24.0 | 16 (57.1) |
UMUX‐lite | 12.1–87.9 | N = 27 | 71.4 ± 11.9 | 68.0 | 15 (55.6) |
AES | 6–30 | N = 27 | 24.2 ± 4.5 | 24.0 | 16 (59.3) |

Measure | Subgroup | n | Mean ± SD | p value* | >Threshold, n (%) |
---|---|---|---|---|---|
Subgroup comparisons | | | | | |
UMUX‐lite | Preferred language: English | 15 | 69.8 ± 17.0 | 0.98 | 9 (60.0) |
UMUX‐lite | Preferred language: French | 13 | 70.0 ± 10.9 | | 6 (46.2) |
UMUX‐lite | Years diagnosed with HIV: ≤5 | 14 | 74.7 ± 11.4 | 0.07 | 8 (57.1) |
UMUX‐lite | Years diagnosed with HIV: >5 | 14 | 65.1 ± 15.5 | | 7 (50.0) |
AES | Preferred language: English | 15 | 24.4 ± 5.5 | 0.51 | 10 (66.7) |
AES | Preferred language: French | 13 | 23.2 ± 4.3 | | 6 (46.2) |
AES | Years diagnosed with HIV: ≤5 | 14 | 26.1 ± 3.6 | 0.009 | 11 (78.6) |
AES | Years diagnosed with HIV: >5 | 14 | 21.5 ± 5.0 | | 5 (35.7) |
Note: Italics indicate a significant difference.
Abbreviations: AES, Acceptability E‐Scale; UMUX‐lite, Usability Metric for User Experience‐lite.
*p values were calculated to test the null hypothesis that the observed means are equal between subgroups.
Subgroup comparisons on both global usability measures are presented in Table 3. Skewness and kurtosis coefficients indicated that the normality assumption held for all subgroups. Regarding preferred language, a greater proportion of participants using the English version of MARVIN gave above‐threshold ratings on the UMUX‐lite (60.0%; 9/15) and the AES (66.7%; 10/15) than those using the French version (46.2%; 6/13 for both). However, the mean scores of the English and French subgroups did not differ significantly on either measure (p = 0.98 and p = 0.51, respectively).
Regarding years since diagnosis with HIV infection, among participants diagnosed ≤5 years, 57.1% (8/14) had UMUX‐lite scores above the threshold, and 78.6% (11/14) had above‐threshold scores for the AES. In comparison, these values were 50.0% (7/14) and 35.7% (5/14), respectively, for participants who had been diagnosed for >5 years. There was no statistically significant difference (p = 0.07) between the two subgroups in terms of mean UMUX‐lite score. On the AES, the mean of participants diagnosed for ≤5 years was significantly higher than that of those diagnosed for >5 years (p = 0.009).
Secondary outcomes: usability subconstructs
Results for the usability subconstructs are shown in Table 4. Means for perceived ease of use, perceived usefulness, and behavioural intention to use all exceeded the target score of 5 out of 7. The net promoters score, which measured attitude towards MARVIN use, was equal to 14%, representing positive user ratings. Significant positive associations were found for all five hypotheses in the expected direction (p = 0.024 for H1, p = 0.002 for H2, and p < 0.001 for H3, H4, and H5). Corresponding scatterplots can be found in Multimedia Appendix S5. Perceived ease of use explained 18% and 31% of the variance in perceived usefulness and attitude towards use, respectively. Perceived usefulness explained 71% and 59% of the variance in attitude towards use and behavioural intention to use, respectively. Attitude towards use, in turn, explained 68% of the variance in behavioural intention to use.
TABLE 4.
Secondary endpoints: Subconstructs of the TAM and their associations (N = 28).
Secondary endpoint | Score range | Mean ± SD |
---|---|---|
PEU | 1–7 | 5.6 ± 1.3 |
PU | 1–7 | 5.1 ± 1.6 |
ATU a | 0–10 | 7.4 ± 2.9 |
NPS (%) a | −100–100 | 14% |
BIU | 1–7 | 5.2 ± 2.0 |
Association between TAM subconstructs
Hypothesis | Slope | p value* | 95% CI | R² |
---|---|---|---|---|
H1 – PU = f(PEU) | 0.53 | 0.024 | 0.08–0.98 | 0.18 |
H2 – ATU = f(PEU) | 1.22 | 0.002 | 0.49–1.95 | 0.31 |
H3 – ATU = f(PU) | 1.48 | <0.001 | 1.10–1.86 | 0.71 |
H4 – BIU = f(PU) | 0.94 | <0.001 | 0.61–1.28 | 0.59 |
H5 – BIU = f(ATU) | 0.57 | <0.001 | 0.42–0.73 | 0.68 |
Note: Italics indicate a significant difference.
a We present the numerical results for ATU and the final score calculated using the NPS method.
* p values were calculated to test the null hypothesis that there is no linear relationship between the two variables.
Abbreviations: ATU, attitude towards use; BIU, behavioural intention to use; CI, confidence interval; NPS, net promoters score; PEU, perceived ease of use; PU, perceived usefulness; SD, standard deviation; TAM, technology acceptance model.
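For readers unfamiliar with the NPS method mentioned in the table note, the following is a minimal sketch of how such a score is conventionally derived from 0–10 ratings. It assumes the standard cut-offs (promoters rate 9–10, detractors rate 0–6, passives rate 7–8); the function name and the example ratings are hypothetical and are not the study data.

```python
def net_promoter_score(ratings):
    """Return the NPS (in percentage points) for a list of 0-10 ratings.

    Assumes the conventional cut-offs: promoters rate 9-10,
    detractors rate 0-6, and passives (7-8) count only in the total.
    """
    if not ratings:
        raise ValueError("ratings must not be empty")
    promoters = sum(1 for r in ratings if r >= 9)
    detractors = sum(1 for r in ratings if r <= 6)
    return 100.0 * (promoters - detractors) / len(ratings)


# Hypothetical ratings for 28 respondents (NOT the study data):
# 12 promoters, 8 passives, 8 detractors.
example_ratings = [10] * 7 + [9] * 5 + [8] * 4 + [7] * 4 + [5] * 5 + [2] * 3
print(round(net_promoter_score(example_ratings)))  # 14
```

Because passives are ignored in the numerator, the score can be positive even when the mean rating is moderate, which is why both the mean and the NPS are reported for ATU.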
Qualitative results
Overview
Between August 2021 and April 2022, 11 participants were recruited to participate in two focus groups (one in English, one in French, with three participants per group) and three one‐on‐one interviews (one in English and two in French). Two participants did not attend the sessions, leaving nine. The thematic analysis identified 20 themes, representing 12 facilitators and 10 barriers to global usability, as presented in Table 5 with selected participant quotes. The remaining quotes contributing to the analysis can be found in Multimedia Appendix S6.
TABLE 5.
Themes identified, with corresponding quotations, facilitators, and barriers.
Domain | Subdomain | Theme | Quotations a | Facilitator | Barrier |
---|---|---|---|---|---|
Implementation characteristics | Intervention source | Reliable information | We have a very major point that Google might be wrong, where you might be referred to many different answers and you don't know what the right answer is. And here [MARVIN], we're more confident that the information that has been offered to us is more accurate because it was professionally done (P#1 focus group, English, 32 y, man) | ✓ | |
Implementation characteristics | Evidence strength and quality | Easy‐access, go‐to tool | It's a reference tool for me. Every time I have a question, I'd have the reflex to go and ask my question to this chatbot and see its answer (P#3 interview, French, 56 y, man) Because there's a lot of stuff I forgot about. So, I want a refresher. [MARVIN] is a good reference instead of seeing someone, waiting for a doctor's appointment (P#3 focus group, English, 40 y, man) | ✓ | |
Implementation characteristics | Evidence strength and quality | Useful real‐time support | I find it really useful. The speed of response, and it also answers the majority of questions relating to this treatment, vaccination, everything to do with HIV. I think it's great (P#1 focus group, French, 28 y, man) It was useful also for travelling, and it [MARVIN] gave advice about timing and time zones. It's helpful (P#3 focus group, English, 53 y, man) | ✓ | |
Implementation characteristics | Evidence strength and quality | Convivial conversation | But it's super easy to use, it really is a friend you can share a question with and get an answer (P#3 focus group, French, 40 y, man) I'm satisfied with it [MARVIN] because it's like you're chatting with a friend (P#1 interview, English, 37 y, woman) To me, I like the fun aspect of chatting with that [MARVIN], it makes it fun to make a question and get an answer (P#2 focus group, English, 32 y, man) | ✓ | |
Implementation characteristics | Evidence strength and quality | Emotionally safe | No‐one is, what can I say, judging you … This platform [MARVIN] is good because there's someone you can share with … I think MARVIN does not ask [me] to identify myself, so I think it's a safe space (P#1 interview, English, 37 y, woman) Let's say a category of questions that I would call ‘shame questions’ like: “What happens if I decide to not wear a condom anymore?” Talking to an AI that has this kind of answer would be very helpful so that you have the right answer, and you don't do something stupid, but you also don't feel guilty about asking this kind of question (P#1 focus group, English, 32 y, man) | ✓ | |
Implementation characteristics | Evidence strength and quality | Confidential | There is a value that you should present more that MARVIN is confidential, that it does not share information with other researchers, with other people (P#3 focus group, French, 40 y, man) | ✓ | |
Implementation characteristics | Evidence strength and quality | Lack of conversation topics | For a start, it's very good. On the other hand, I did ask some more detailed questions, but unfortunately, I didn't get the answers [I wanted]! (P#2 focus group, French, 34 y, man) Because there's so many things we would want to ask. So, if you can go deeper and make MARVIN be available to answer, ask everything, it would help (P#2 interview, English, 53 y, woman) | | ✓ |
Implementation characteristics | Evidence strength and quality | Limited understanding of user input | I asked: “What is an undetectable viral load?”… MARVIN didn't have a good answer, but I rephrased to “What does undetectable mean?” And that [answer] was exactly what I wanted … (P#1 focus group, English, 32 y, man) For me, the thing MARVIN can improve is his language. He doesn't understand everything other people say. All that is to make it easier for other people to use and maybe if you could do it in languages apart from English and French (P#1 focus group, French, 28 y, man) | | ✓ |
Implementation characteristics | Evidence strength and quality | Useful reminders but still room for improvement | So, it can remind us to take our medication … It's very useful for me (P#3 interview, French, 56 y, man) I think it could be very useful to use as a reminder for multiple things like … It's been 3 months or 6 months, shouldn't you do bloodwork again? … These kinds of reminders as well (P#1 focus group, English, 32 y, man) | ✓ | |
Implementation characteristics | Evidence strength and quality | Desired features beyond conversation content | To make MARVIN easier to use, is it possible for MARVIN to have memories? Because, for example, when I ask him about medication, he always asks me the same question: “What medication?” (P#3 focus group, French, 40 y, man) MARVIN should, in the future, if it starts to detect something odd in the person's questions, that it can also act as an alert for specialists. I think it would be useful for the specialists (P#3 focus group, French, 40 y, man) | | ✓ |
Implementation characteristics | Evidence strength and quality | Lack of proactivity | It's very passive right now: you need to ask, and it responds. But if it could ask you and sort of come up with an answer based on the questions (P#3 focus group, English, 53 y, man) | | ✓ |
Implementation characteristics | Evidence strength and quality | Allows for health‐related teaching moments among peers or friends | I once used MARVIN in front of a friend to explain what ‘undetectable’ meant. MARVIN responded quickly, explaining everything. I think this is another advantage of MARVIN (P#1 focus group, French, 28 y, man) | ✓ | |
Implementation characteristics | Relative advantage | More comprehensive information available elsewhere | But on CATIE you can find all the information you need about travel. CATIE covers an enormous amount of information (P#2 focus group, French, 34 y, man) | | ✓ |
Implementation characteristics | Relative advantage | Trust in MARVIN vs doctor's advice | And I learned a few things too because the doctor said different things from what MARVIN did. I really wanted to understand time zone changes. I noticed with MARVIN it was pretty good for that. My doctor said something else [compared to MARVIN] to answer the question [when to take your medication when travelling], but I don't know if I agree with that … (P#3 focus group, English, 53 y, man) The subject of medication, I leave only to the doctors (P#3 focus group, French, 40 y, man) | ✓ | ✓ |
Implementation characteristics | Adaptability | Accessible on different devices | I use it on my PC only. I don't use it on my phone because I decided to create a new account for that (P#3 focus group, English, 53 y, man) And it works both ways [mobile and PC] very well. As far as I'm concerned, I haven't had any problems with bugs, whether on a mobile interface or a fixed interface, laptop (P#2 focus group, French, 34 y, man) | ✓ | |
Implementation characteristics | Complexity | Preference for other platforms | Couldn't we simply have a connection with the MUHC website, so that we could connect directly there and not have to go through Facebook. For my part, I'm a bit worried about data confidentiality. I'm a bit worried about security (P#3 interview, French, 56 y, man) | | ✓ |
Implementation characteristics | Design quality and packaging | Need for instructional materials on MARVIN | It was a bit tricky at first, as you know, yeah (laughs). You (coordinator) did help me through it … I think it's because I'm not that good with social media and technology (P#1 interview, English, 37 y, woman) Yes, it's useful to put a guide for everyone that uses MARVIN. Even an explanatory video on the main page or you ask MARVIN for the guide, and it gives it to you … (P#1 focus group, French, 28 y, man) | | ✓ |
Outer setting | Patient needs and resources | More pertinent for people who recently initiated ART | I'm someone who's been undetectable for several years, and I've been HIV positive since 2006 … I know my medication very well, I know what I have to do, so I don't need to go looking for information as much as I did before. On the other hand, I can understand that someone recently diagnosed with HIV will find it useful (P#3 interview, French, 56 y, man) If it's someone that just found out about their status. For sure, it's an uncomfortable position. Maybe you don't want to talk to a doctor, or you can't see a doctor right away … It [MARVIN] is not really someone, but it's just something that can have a conversation with you and provide answers to the questions you have. So, for sure, it would be helpful for someone that is in the beginning of the treatment (P#1 focus group, English, 32 y, man) | ✓ | ✓ |
Outer setting | Patient needs and resources | Relevant for all sexually active people, regardless of HIV status | I think any knowledge that is basic about HIV is important to any person who engages in sexual activities, even if they have never tested positive (P#1 focus group, English, 32 y, man) | ✓ | |
Outer setting | Cosmopolitanism | Absence of referrals | If it [MARVIN] is not able to answer me, but at least able to direct me to a first line, a physical person would answer me there, or saying “In 24 h, there's someone who will answer you or call you.” (P#3 interview, French, 56 y, man) | | ✓ |
a Notations in parentheses indicate participant number, style of interaction, language, age, and sex.
Abbreviations: ATU, attitude towards use; BIU, behavioural intention to use; MUHC, McGill University Health Centre; P, participant; PEU, perceived ease of use; PU, perceived usefulness.
Implementation characteristics
Intervention source
Theme: Reliable information
Participants found MARVIN to be a reliable source of medical knowledge and information as it was validated by expert health professionals. Users felt that responses were accurate and more trustworthy than what they would normally find on common search engines (e.g., Google). This theme was identified as a facilitator to perceived usefulness and behavioural intention to use.
Evidence strength and quality
Theme: Easy‐access, go‐to tool
MARVIN's ease of accessibility made it a tool that participants would think of using first when they had questions that needed answering, eliminating the need to wait for their next clinical appointment. We identified this theme as a facilitator of perceived ease of use and behavioural intention to use.
Theme: Useful real‐time support
MARVIN's ability to respond instantly worked to its advantage when users needed a quick response. Users would get a timely answer to a wide range of questions, which participants considered very useful. The travel‐related content was especially appreciated by some participants, as they would not usually see their doctor before travelling. This theme contributed to MARVIN's perceived usefulness and users' attitudes towards its use.
Theme: Convivial conversation
The day‐to‐day use of MARVIN was described by participants as easy and straightforward since MARVIN functions like a normal conversation with a friend. The conversational nature of MARVIN was a further enjoyable factor for one participant. The question‐and‐answer interaction added an engaging and convivial atmosphere, making the chat an experience in itself. This theme was among the contributing factors to MARVIN's perceived ease of use.
Theme: Emotionally safe
The fact that MARVIN is an AI, i.e., non‐human, was said to put many participants at ease. It did not ask identifying questions and was thus considered a safe space for them to confide sensitive information. Indeed, MARVIN's non‐judgmental nature allowed them to ask, without fear of being judged, questions that would be difficult to ask a doctor. This promoted users' attitude towards MARVIN use.
Theme: Confidential
Confidentiality was stated by many participants as one of the most important characteristics of MARVIN. The users appreciated that the data were stored on the research institute's internal server and accessible only to the research team. This was also one of the facilitators of attitude towards MARVIN use.
Theme: Lack of conversation topics
Although participants expressed their satisfaction with MARVIN as a first release, they also pointed out that conversations with the chatbot were superficial and that some questions were not answered in a useful way. Participants wished for MARVIN to cover a wider range of topics and to delve deeper into its existing topics, which included (1) lifestyle and behavioural factors: diet, nutrition, exercise; (2) HIV treatment: updates on new treatments and diseases, pre‐exposure prophylaxis; (3) reproductive and sexual health: pregnancy, breastfeeding, sexual health behaviours, and other sexually transmitted infection‐related information; (4) healthcare service support: appointments/vaccination scheduling, symptom checkers; (5) mental health support: resources on local psychologists; and (6) socioeconomic issues: immigration process, financial and insurance support. This theme was identified as a barrier to perceived usefulness.
Theme: Limited understanding of user input
In some instances, MARVIN was unable to give answers that were already in its knowledge base because it did not understand the way the questions were posed by users. Some participants reported that they had to rephrase certain questions for MARVIN to provide a related answer. On this theme, one participant emphasized the utility of offering multilingual support, which could make MARVIN more user‐friendly by having it learn languages other than French and English. This point hindered both MARVIN's ease of use and users' attitude towards its use.
Theme: Useful reminders but still room for improvement
The current daily medication reminder function was used and appreciated by many participants. However, they would like this feature to be more adaptable, such as a one‐time reminder that is not repeated daily, or a reminder for a 6‐month hospital follow‐up visit. This was one of the facilitators of MARVIN's perceived usefulness.
Theme: Desired features beyond conversation topics
It was suggested that MARVIN could be enriched to deliver information in more compelling ways, for example through videos and pictures. A few participants also suggested improving MARVIN with a working memory, so it could remember some important previous questions and answers (e.g., the medication they are taking). This would spare them the trouble of asking or answering the same questions repeatedly. Detecting anomalies in conversations and sending alerts to healthcare professionals, with the user's consent, were also raised as potentially useful features. This theme was identified as a barrier to MARVIN's perceived ease of use and perceived usefulness.
Theme: Lack of proactivity
Interactions with MARVIN currently require the user to ask questions first. Participants expressed their desire for MARVIN to be more proactive. They envisioned MARVIN asking questions, initiating conversations, and providing information without waiting for a prompt. This lack of proactivity was also seen as a barrier to MARVIN's perceived ease of use and perceived usefulness.
Theme: Allows for health‐related teaching moments among peers or friends
Some participants mentioned that MARVIN was able to facilitate health discussions with their friends. Access to accurate information through MARVIN helped address some of their health concerns and played a role in peer health education. This contributed to MARVIN's perceived usefulness and users' behavioural intention to use it.
Relative advantage
Theme: More comprehensive information available elsewhere
Participants indicated that there are other, more comprehensive sources of information than MARVIN, which currently covers a limited number of topics. These included CATIE [79], a well‐known Canadian site for information on HIV treatment. This limited MARVIN's perceived usefulness and users' behavioural intention to use it.
Theme: Trust in MARVIN vs doctor's advice
We found that participants had different levels of trust in MARVIN versus the guidance provided by their physicians on certain topics. For example, one participant reported discrepancies between MARVIN's response and their doctor's recommendations on how to manage ART when time zones change, preferring MARVIN's advice. However, another participant indicated that for topics related to medication, he would defer to his doctor's advice. This theme was identified as both a facilitator and a barrier to the behavioural intention to use.
Adaptability
Theme: Accessible on different devices
MARVIN is accessible through Facebook Messenger, which allows users to interact with MARVIN on their phone or portable devices and computer. Some participants exclusively used one specific device to access MARVIN, and others used multiple devices. This facilitated MARVIN's perceived ease of use.
Complexity
Theme: Preference for other platforms
MARVIN is currently limited to Facebook Messenger, and participants were concerned about the privacy issues raised in the past with Facebook, the broader social media platform of which Facebook Messenger is part. Even though they were informed that conversations could be removed from Facebook, users expressed a lack of trust. Moreover, Facebook is simply not the preferred social media platform of many participants, and making MARVIN available on a variety of platforms would increase access. Participants suggested alternative popular social media platforms (e.g., WhatsApp), creating a MARVIN application to install on devices, or giving access directly through healthcare organizations’ (e.g., MUHC) websites. We identified this theme as a barrier to attitude towards use and behavioural intention to use.
Design quality and packaging
Theme: Need for instructional materials on MARVIN
Some participants indicated that they were not familiar with the platform (i.e., Facebook Messenger) itself or social media in general. The inherent technological aspect of a chatbot is also a barrier to some individuals. These contributed to a more difficult setup process and a steeper learning curve, whereas the training session prior to usability testing was perceived as beneficial by participants. Further, they felt that a future guide or a video tutorial explaining MARVIN and how to use it would make it more accessible and less intimidating for those unfamiliar with chatbots. This theme was considered a barrier to perceived ease of use.
Outer setting
Patient needs and resources
Theme: More pertinent for people who recently initiated ART
Several participants who had been living with HIV for several years mentioned that MARVIN's conversation topics were more appropriate for recently diagnosed treatment‐naïve patients. They stated that they already knew most of the information it provided. One interviewee also highlighted that newly diagnosed patients are more likely to be in a state of stress or unease; in this case, MARVIN could act as a conversation starter. This theme was identified as both a facilitator and a barrier to perceived usefulness and behavioural intention to use.
Theme: Relevant for all sexually active people, regardless of HIV status
For some participants, MARVIN's potential user base could extend beyond people with HIV. They reported that basic HIV‐related knowledge was essential for anyone who is sexually active. Participants found that having conversations with MARVIN was also valuable for people who are not living with HIV. This contributed to MARVIN's perceived usefulness and users' behavioural intention to use it.
Cosmopolitanism
Theme: Absence of referrals
In cases where MARVIN was unable to answer a question, some participants suggested that the chatbot provide reputable resources for advice or redirect users to specific professionals. Doing so would create a win–win situation, by both ensuring a timely response and providing access to further assistance. This theme was identified as a barrier to perceived usefulness.
DISCUSSION
Principal findings
Our team developed MARVIN, the first chatbot to promote self‐management among people with HIV with a focus on ART adherence. The present work sought to assess (1) the feasibility of using MARVIN by people with HIV, (2) its global usability, and (3) four usability subconstructs and their interrelationships following the TAM. Quantitatively, our findings, from 28 participants with HIV, support the feasibility and usability of MARVIN for the study sample. Our qualitative results showed that usability facilitators for MARVIN included the provision of reliable information and useful real‐time support, as well as its easy accessibility. Participants perceived a sense of conviviality, emotional safety, and confidentiality from talking to MARVIN. However, limited understanding of user input and lack of conversation topics were identified by participants as major usability barriers to the current chatbot. Participants also wanted MARVIN to offer more features, be implemented on more platforms, and provide instructional materials on its use.
Concerning feasibility, the recruitment rate was 74% (40/54), with 70% of participants completing at least 20 rounds of conversation with MARVIN in 3 weeks, rates comparable to other successful healthcare chatbot studies [80, 81]. This happened despite the COVID‐19 pandemic, which posed a degree of challenge to the recruitment and conduct of the study, as all processes had to be conducted remotely. Automatic weekly study reminders to participants likely facilitated the high retention rate, as found in other research [82]. Overall, we can confidently conclude that patients’ use of the MARVIN chatbot was feasible in this context of implementation.
Regarding the results of the usability questionnaire, the UMUX‐lite mean exceeded the preset threshold for success (69.9/68), whereas the AES measure was very close to target (23.8/24). Coupled with the fact that over half of participants scored above both thresholds (15/28 for UMUX‐lite, 16/28 for AES), we consider that these results provide evidence of MARVIN's global usability in the study sample. Mean scores of perceived ease of use, perceived usefulness, and behavioural intention to use also surpassed the cut‐off value of 5/7 and the net promoters score indicated a positive attitude towards MARVIN (+14%), further substantiating its usability.
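The comparisons above amount to simple arithmetic against preset targets, which can be sketched as follows. The counts, means, and thresholds are those quoted in the text; the helper name `pct` and the rule that a threshold is "met" when the mean is greater than or equal to the target are assumptions made for illustration only.

```python
def pct(numerator, denominator):
    """Return a percentage rounded to one decimal place."""
    return round(100.0 * numerator / denominator, 1)


# Feasibility figures as reported in the text.
recruitment_rate = pct(40, 54)  # reported as 74%
retention_rate = pct(28, 40)    # reported as 70%

# (observed mean, predetermined threshold) for each primary outcome.
outcomes = {"UMUX-lite": (69.9, 68.0), "AES": (23.8, 24.0)}

# Assumption: a threshold counts as met when mean >= target.
met = {name: mean >= target for name, (mean, target) in outcomes.items()}

print(recruitment_rate, retention_rate)  # 74.1 70.0
print(met)  # {'UMUX-lite': True, 'AES': False}
```

Under this strict rule the AES mean falls just short of its target, which is consistent with the text describing it as very close to, rather than above, the threshold.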
Regarding ease of use, the qualitative analyses revealed that participants found MARVIN easily accessible across various interfaces such as smartphones, tablets, and laptops. Furthermore, almost everyone in our sample (26/28) used mobile devices multiple times per day, and the vast majority (23/28) were confident at baseline in using mHealth platforms effectively and had experience with similar applications (18/28). These findings, as well as the growing popularity of mobile technology, may have contributed to the high ease of use observed in this study, which is consistent with previous studies [11, 41, 83, 84]. Participants also reported that, with the on‐call MARVIN chatbot, people with HIV can obtain needed information, resources, and support to self‐manage their health anytime, anywhere, without having to wait for the next meeting with their doctor. However, some participants struggled with chatbot technology and social media due to gaps in digital literacy. Although a training session on MARVIN was provided by the technical coordinator to participants before study entry, the qualitative results stressed participant preferences for further instructional materials on the chatbot. We thus need to develop and disseminate relevant user guides or video tutorials for all prospective MARVIN users to aid their use and ensure equitable delivery of the intervention.
Convivial conversation was another factor explaining MARVIN's ease of use, with several interviewees expressing how using MARVIN was like chatting with a friend. Consistent with our attempts to include empathic elements (e.g., smiley emoji, words of encouragement) when designing predefined messages [42], this contributes to the chatbot's usability and potential to comfort users. Nevertheless, the quality of interaction could be improved. Several participants indicated that MARVIN sometimes required multiple user attempts to answer questions correctly on certain topics. Such limited comprehension disrupted the fluidity of use, which may in turn reduce users' willingness to use it. Technically, the English language model is easier to train than the French one and therefore promises better results [85]. However, there was no significant difference (p = 0.07) in the primary outcomes between the two versions when comparing the language‐based groupings. Further development will focus on collecting more training data and improving MARVIN's comprehension in both languages. Considering the high proportion of immigrants among newly diagnosed people with HIV in Canada and that language barriers are a key challenge to linkage and retention in care [86], new language versions (e.g., Spanish), as proposed by participants, could also be developed. Large language models (LLMs) will also be integrated into MARVIN in the future. LLMs such as ChatGPT, launched in late 2022, offer impressive comprehension capabilities compared with traditional technologies, while also generating fluent responses [87]. Their multilingual capabilities could also help address potential language barriers [88]. However, a narrative review of HIV care‐related chatbots noted that current LLMs can provide biased and fabricated responses, challenging their practical application [33].
Emotional safety and confidentiality were highlighted by participants during the qualitative interviews as two other prominent factors contributing to satisfaction with MARVIN. Many people with HIV still face significant psychological challenges as they are often discriminated against, socially isolated, and stigmatized for their condition [31, 89]. A needs‐assessment study identified a concern that non‐human interactions with chatbots could exacerbate feelings of marginalization due to the technology's lack of empathy [90]. However, our findings, along with those of other studies [11, 38, 40, 41, 83, 84], suggest that a nonjudgmental chatbot such as MARVIN can provide a reassuring connection for people with HIV. Its objectivity and user anonymity allow for open discussion on sensitive or taboo topics [91] such as sexual behaviour and HIV transmission prevention without fear of being criticized. This encourages users to seek information freely and might enhance their willingness to use MARVIN. However, the choice of Facebook Messenger as the interface for deploying MARVIN was seen as a major hurdle by some participants. Some people with HIV were unable to participate because they did not have a Facebook account, and a further subset expressed concern about Facebook's poor privacy record. Consequently, we have begun work on a standalone web user interface. Additional options (e.g., third‐party mobile applications) are also being considered to improve user trust and adoption.
Interviewees also appreciated MARVIN's ability to provide reliable and useful real‐time information related to ART self‐management; the reminders and the advice on medication management while travelling were particularly appreciated by some. We attribute the perceived trustworthiness of MARVIN's content to the co‐design strategy implemented through patient and stakeholder engagement: ongoing communication with patient experts identified medication adherence as the primary goal of MARVIN, and healthcare professionals ensured information reliability, both of which contributed to usability. However, some participants perceived MARVIN as more suitable for newly diagnosed people with HIV. Subgroup comparisons in the statistical analyses showed significantly higher acceptability (p = 0.009) among patients diagnosed in the last 5 years than among those diagnosed earlier. A similar but non‐significant difference was also found for usability (p = 0.07). MARVIN's relevance for more recently diagnosed people with HIV suggests that it can play a role in models for rapid ART initiation, which is considered a key strategy for achieving rapid viral suppression [86]. Patients involved in this model of care are typically treatment naïve and require HIV‐related health information to answer their concerns [92]. For those who have a seasoned understanding of ART management, the breadth and depth of MARVIN's conversation topics must be further expanded. A previous finding underscored that treatment adherence is but one of the many facets of HIV self‐management [93]. In fact, MARVIN has grown since this study's completion to include more than 50 new topics (e.g., lifestyle, socioeconomic issues, mental health). It also addresses healthcare practices to promote self‐management of overall health, with more than 20 in preparation as of April 2024. Further, based on participant input, referrals to relevant information and other external sources will also be added to enhance MARVIN's usability.
Here again, LLMs have considerable potential to address content breadth. In a study testing ChatGPT on ART counselling and advice, it answered all questions accurately and comprehensively [94]. However, given the current limitations of LLMs in terms of interpretability, privacy protection, data transparency, and liability for use [95, 96], research is needed to investigate how to integrate them safely and responsibly into healthcare chatbot services.
In addition to the conversation capabilities, interview participants expressed their desire for more advanced chatbot features. For example, participants wanted MARVIN to have memory. Currently, MARVIN is repetitive and repeatedly collects the same information, which degrades the user experience. Long‐term memory of key information is important and could simplify chatbot use. But even today's most powerful LLMs still face the challenge of poor memory capacity [97]. The status quo of chatbots, not limited to those used in healthcare, is that the majority can only perform short‐term ad hoc interactions and have no memory [98, 99, 100]. There is a need to optimize access to the long‐term context of conversational threads [101]. Lack of proactivity was another issue mentioned by participants. Prolonged interactions with chatbots that lack proactivity create a sense of predictability for users, who come to anticipate each subsequent interaction, thus reducing motivation to use them after the novelty wears off [102, 103]. Therefore, MARVIN must acquire the ability to initiate new conversations to increase its usability and ensure the long‐term retention of its users. Participants suggested that question triaging, as seen with the Vik chatbot [12] and Woebot [15], may enhance chatbot proactivity. Yet, doing so could lead users to expect even more proactive bot interactions, and failure to meet this expectation can negatively impact perceived chatbot usability. In sum, future development needs to enable MARVIN to utilize memory data and engage more proactively in deeper conversations.
The potential target audience for MARVIN may be broader than anticipated, given our qualitative findings. They suggest the chatbot's accurate and useful information and ease of use motivate users to share it with their peers, thus promoting better dissemination of HIV‐related knowledge. Meanwhile, interviewees highlighted that HIV‐related knowledge is not only relevant to people with HIV but is essential for every sexually active person. Several chatbots have indeed been developed to focus on pre‐exposure prophylaxis information and help facilitate HIV self‐testing [11, 33, 36, 38, 39, 40, 41, 104]. Such efforts highlight the promising future of chatbots in different areas of HIV care. It is important to note, however, that the main role of MARVIN will continue to be assisting people with HIV rather than replacing healthcare professionals. Although many participants reported trusting MARVIN, some said they would only trust healthcare providers on certain topics. Especially when it comes to diagnosis and treatment options or topics where the use of chatbots is restricted, users need to be directed to professional human resources.
Lastly, regarding our secondary endpoints, all validated positive associations attested to the appropriateness of the TAM to elucidate usability. Attitude towards use explained a significant portion of the variance in behavioural intention to use (H5), as did perceived usefulness for attitude towards use (H3). This can be explained by our qualitative analyses: because emotional safety was identified as a facilitator of attitude towards use, participants would intend to converse with MARVIN. The results for H5 and H3 may also explain the only moderate and weak predictive power of perceived usefulness for behavioural intention to use (H4) and perceived ease of use for attitude towards use (H2). In the case of H1, perceived ease of use has weak explanatory power for perceived usefulness. This may be because other external variables that are not part of the chatbot itself (i.e., alternative information resources, external referrals) have a greater influence on perceived usefulness as antecedents of TAM (as shown in Figure 2). Future research could focus on investigating these parameters to better predict perceived usefulness and refine the application of the TAM to chatbot evaluation.
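The notion of explained variance used for H1 to H5 can be made concrete with a short sketch. Assuming each hypothesis is tested with a simple linear regression of one subconstruct score on another (an assumption based on the scatterplots listed in Appendix S5), the proportion of variance explained is the coefficient of determination, R². The function name `r_squared` and the scores below are hypothetical and are not study data.

```python
def r_squared(x, y):
    """Return R^2 for a simple linear regression of y on x.

    With a single predictor, R^2 equals the squared Pearson correlation.
    """
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    if sxx == 0 or syy == 0:
        raise ValueError("R^2 is undefined when a variable has zero variance")
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    return sxy ** 2 / (sxx * syy)


# Hypothetical 7-point Likert scores for five imaginary respondents (NOT study data).
attitude_towards_use = [3, 4, 5, 6, 7]
behavioural_intention = [2, 4, 5, 5, 7]

print(f"R^2 = {r_squared(attitude_towards_use, behavioural_intention):.2f}")
```

A value close to 1 corresponds to strong explanatory power, as described for H5 and H3, and a value close to 0 corresponds to weak explanatory power, as described for H1.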
LIMITATIONS
We acknowledge several limitations of this study. First, although the overall socio‐demographic characteristics of the usability study participants, including age, race, and education level, were relatively diverse, it is important to highlight the gender imbalance of the participants. Digital divides related to limited technology access may alienate certain groups, such as women [105, 106]. Although 40% of clinic patients were women, only 14.3% of the participants in this study were women. The outcomes for this user group need further study to determine whether there is a gender gap in the use of MARVIN. Moreover, given the limited number of conversation topics, new subjects regarding women's health and pregnancy will be added. Female users will also be invited to participate more in the implementation process in the future to improve MARVIN's adoption in this important population.
Second, partly due to convenience sampling and the small sample size, participants overall had relatively high levels of digital experience or interest in IT. This may have introduced a sampling bias to our findings. The single‐group, short‐term, single‐site study design also limits the generalizability of our findings. To gain a deeper understanding of chatbot implementation, we have developed a master protocol for future research based on the design and results of this study [42]. Clinical validation will be conducted with a larger sample of users to further investigate chatbot performance in large‐scale implementations. Randomized controlled trials with chatbot interventions need to evaluate other treatment modalities in parallel [107, 108].
CONCLUSIONS
Our MARVIN chatbot was validated for its usability in promoting self‐management for people living with HIV, and the mixed‐methods design allowed us to gain a detailed understanding of the facilitators and barriers to usability. Our findings further demonstrate the promise of chatbots for HIV care and provide direction for MARVIN's further development. Next steps will focus on integrating LLMs to improve MARVIN's comprehension and expand its content, and on enhancing the chatbot's functional intelligence, including memory and proactivity, to better respond to the needs of people with HIV. Given the limitations of current LLMs, their integration with MARVIN must advance prudently and responsibly. Ultimately, we hope MARVIN will become a personalized health companion for people with HIV.
AUTHOR CONTRIBUTIONS
In no order of contribution, YM, SV, KE, and BLeb helped conceptualize the study and data collection tools. YM and SA (software side) and BLem, ML, BLeb, and the MARVIN chatbots Patient Expert Committee (clinical side) collaborated to develop the MARVIN chatbot. ADP, JC, and BLeb referred patients. YM and SV completed the statistical analysis. YM, GT, DL, and KE completed the qualitative analysis. YM and GT drafted the original manuscript. All authors critically reviewed the manuscript and approved the final version.
CONFLICT OF INTEREST STATEMENT
BLeb has received research support, consulting fees, and speaker fees from ViiV Healthcare, Merck, and Gilead. The authors are identical to the developers of the intervention, and the ethical evaluators are identical to the sponsors of the study. The remaining authors have no conflicts of interest to declare.
Supporting information
Appendix S1. Complete list of conversation topics.
Appendix S2. CONSORT artificial intelligence (AI) guidelines.
Appendix S3. Sociodemographic and study questionnaires.
Appendix S4. Focus group interview guide.
Appendix S5. Scatterplots – linear regression models of inter‐subconstruct relationships.
Appendix S6. Complete quotes – qualitative analysis.
ACKNOWLEDGEMENTS
This study is supported by the Canadian Institutes of Health Research Strategy for Patient‐Oriented Research Québec Support Unit‐Methodological Developments (grant M006, PI: BLeb) and the PIHVOT program from ViiV Healthcare (grant 2020‐1976‐PIHVOT, PI: BLeb). YM is supported by the Postgraduate Scholarship – Doctoral program (PGS D), the OPSIDIAN program from the Natural Sciences and Engineering Research Council, and a doctoral research award from the Fonds de recherche Nature et Technologies (FRQNT) in partnership with the Unité de soutien au système de santé apprenant (SSA) Québec. BLeb is supported by two career awards: a Senior Salary Award from Fonds de recherche du Québec–Santé (FRQS) (#311200) and the Lettre d'Entente 250 from the Quebec Ministry of Health for researchers in Family Medicine, and holds a Canadian Institutes for Health Research, Strategy for Patient‐Oriented Research Mentorship Chair in Innovative Clinical Trials for HIV Care. ADP is also supported by a Senior Salary Award from FRQS and the Lettre d'Entente 250. The funding sources had no role in the design of this study, the interpretation of the results, or the decision to submit them for publication. The authors acknowledge the technical support provided by students and interns from Polytechnique Montreal for the development of the MARVIN chatbots, and Lévis Thériault for referring the students. The authors also acknowledge Imane Frih and Maria Nait El Haj for the creation of the clinical content. The authors thank the Chronic Viral Illness Service research team. The names of the members of the MARVIN chatbots Patient Expert Committee cannot be provided as they are patients, and their names must remain confidential. The authors attest that there was no use of generative AI technology in the generation of text, figures, or other informational content of this manuscript.
Ma Y, Achiche S, Tu G, et al. The first AI‐based Chatbot to promote HIV self‐management: A mixed methods usability study. HIV Med. 2025;26(2):184‐206. doi: 10.1111/hiv.13720
Trial Registration: ClinicalTrials.gov NCT05789901: https://classic.clinicaltrials.gov/ct2/show/NCT05789901.
DATA AVAILABILITY STATEMENT
The datasets generated and analysed for this study are available from the corresponding author (BLeb) on reasonable request.
REFERENCES
- 1. World Health Organization. The Global Health Observatory, HIV. https://www.who.int/data/gho/data/themes/hiv-aids#:~:text=Globally%2C%2039.0%20million%20%5B33.1%E2%80%93,at%20the%20end%20of%202022. (2023, accessed 13 March, 2024).
- 2. Public Health Agency of Canada. HIV in Canada: 2022 Surveillance Highlights. https://www.canada.ca/en/public-health/services/publications/diseases-conditions/hiv-2022-surveillance-highlights.html (2023).
- 3. Wandeler G, Johnson LF, Egger M. Trends in life expectancy of HIV‐positive adults on antiretroviral therapy across the globe: comparisons with general population. Curr Opin HIV AIDS. 2016;11:492‐500. doi: 10.1097/COH.0000000000000298
- 4. Colvin CJ. HIV/AIDS, chronic diseases and globalisation. Global Health. 2011;7:1‐6.
- 5. Van de Velde D, De Zutter F, Satink T, et al. Delineating the concept of self‐management in chronic conditions: a concept analysis. BMJ Open. 2019;9:e027775.
- 6. Saag MS. HIV infection—screening, diagnosis, and treatment. N Engl J Med. 2021;384:2131‐2143.
- 7. Engler K, Lènàrt A, Lessard D, Toupin I, Lebouché B. Barriers to antiretroviral therapy adherence in developed countries: a qualitative synthesis to develop a conceptual framework for a new patient‐reported outcome measure. AIDS Care. 2018;30:17‐28. doi: 10.1080/09540121.2018.1469725
- 8. Areri HA, Marshall A, Harvey G. Interventions to improve self‐management of adults living with HIV on antiretroviral therapy: a systematic review. PLoS One. 2020;15:e0232709. doi: 10.1371/journal.pone.0232709
- 9. King E, Kinvig K, Steif J, et al. Mobile text messaging to improve medication adherence and viral load in a vulnerable Canadian population living with human immunodeficiency virus: a repeated measures study. J Med Internet Res. 2017;19:e190. doi: 10.2196/jmir.6631
- 10. Xing Z, Yu F, Qanir YAM, et al. Intelligent conversational agents in patient self‐management: a systematic survey using multi data sources. Stud Health Technol Inform. 2019;264:1813‐1814. doi: 10.3233/SHTI190661
- 11. Ntinga X, Musiello F, Keter AK, Barnabas R, van Heerden A. The feasibility and acceptability of an mHealth conversational agent designed to support HIV self‐testing in South Africa: cross‐sectional study. J Med Internet Res. 2022;24:e39816. doi: 10.2196/39816
- 12. Chaix B, Bibault JE, Romain R, et al. Assessing the performances of a chatbot to collect real‐life data of patients suffering from primary headache disorders. Digit Health. 2022;8:20552076221097783. doi: 10.1177/20552076221097783
- 13. Darcy A, Daniels J, Salinger D, et al. Evidence of human‐level bonds established with a digital conversational agent: cross‐sectional, retrospective observational study. JMIR Form Res. 2021;5:e27868. doi: 10.2196/27868
- 14. Hauser‐Ulrich S, Kunzli H, Meier‐Peterhans D, et al. A smartphone‐based health care Chatbot to promote self‐Management of Chronic Pain (SELMA): pilot randomized controlled trial. JMIR Mhealth Uhealth. 2020;8:e15806. doi: 10.2196/15806
- 15. Prochaska JJ, Vogel EA, Chieng A, et al. A therapeutic relational agent for reducing problematic substance use (Woebot): development and usability study. J Med Internet Res. 2021;23:e24850. doi: 10.2196/24850
- 16. Weizenbaum J. ELIZA—a computer program for the study of natural language communication between man and machine. Commun ACM. 1966;9:36‐45. doi: 10.1145/365153.365168
- 17. Darcy A, Beaudette A, Chiauzzi E, et al. Anatomy of a Woebot(R) (WB001): agent guided CBT for women with postpartum depression. Expert Rev Med Devices. 2022;19:287‐301. doi: 10.1080/17434440.2022.2075726
- 18. Klos MC, Escoredo M, Joerin A, et al. Artificial intelligence‐based Chatbot for anxiety and depression in university students: pilot randomized controlled trial. JMIR Form Res. 2021;5:e20678. doi: 10.2196/20678
- 19. Jang S, Kim JJ, Kim SJ, et al. Mobile app‐based chatbot to deliver cognitive behavioral therapy and psychoeducation for adults with attention deficit: a development and feasibility/usability study. Int J Med Inform. 2021;150:104440. doi: 10.1016/j.ijmedinf.2021.104440
- 20. Zhou X, Edirippulige S, Bai X, Bambling M. Are online mental health interventions for youth effective? A systematic review. J Telemed Telecare. 2021;27:638‐666. doi: 10.1177/1357633x211047285
- 21. Li S, Wang Y, Chen L, et al. Virtual agents among participants with methamphetamine use disorders: acceptability and usability study. J Telemed Telecare. 2024:1357633X231219039. doi: 10.1177/1357633x231219039
- 22. Chen Y, Sinha B, Ye F, et al. Prostate cancer management with lifestyle intervention: from knowledge graph to Chatbot. Clinical and Translational Discovery. 2022;2:e29. doi: 10.1002/ctd2.29
- 23. Xu L, Sanders L, Li K, et al. Chatbot for health care and oncology applications using artificial intelligence and machine learning: systematic review. JMIR Cancer. 2021;7:e27850. doi: 10.2196/27850
- 24. Bibault J‐E, Chaix B, Guillemassé A, et al. A chatbot versus physicians to provide information for patients with breast cancer: blind, randomized controlled noninferiority trial. J Med Internet Res. 2019;21:e15787.
- 25. Chaix B, Bibault J‐E, Pienkowski A, et al. When chatbots meet patients: one‐year prospective study of conversations between patients with breast cancer and a chatbot. JMIR Cancer. 2019;5:e12856.
- 26. Nazareth S, Hayward L, Simmons E, et al. Hereditary cancer risk using a genetic Chatbot before routine care visits. Obstet Gynecol. 2021;138:860‐870. doi: 10.1097/AOG.0000000000004596
- 27. Rehman UU, Chang DJ, Jung Y, Akhtar U, Razzaq MA, Lee S. Medical instructed real‐time assistant for patient with glaucoma and diabetic conditions. Applied Sciences. 2020;10:2216. doi: 10.3390/app10072216
- 28. Mash R, Schouw D, Fischer AE. Evaluating the implementation of the GREAT4Diabetes WhatsApp Chatbot to educate people with type 2 diabetes during the COVID‐19 pandemic: convergent mixed methods study. JMIR Diabetes. 2022;7:e37882. doi: 10.2196/37882
- 29. Croes EA, Antheunis ML. 36 questions to loving a chatbot: are people willing to self‐disclose to a chatbot? Chatbot Research and Design: 4th International Workshop, CONVERSATIONS 2020, Virtual Event, November 23–24, 2020, Revised Selected Papers 4. Springer; 2021:81‐95.
- 30. Ischen C, Araujo T, Voorveld H, van Noort G, Smit E. Privacy Concerns in Chatbot Interactions. Chatbot Research and Design: Third International Workshop, CONVERSATIONS 2019, Amsterdam, The Netherlands, November 19–20, 2019, Revised Selected Papers 3. Springer; 2020:34‐48.
- 31. Arora AK, Ortiz‐Paredes D, Engler K, et al. Barriers and facilitators affecting the HIV care Cascade for migrant people living with HIV in Organization for Economic Co‐Operation and Development Countries: a systematic mixed studies review. AIDS Patient Care STDS. 2021;35:288‐307. doi: 10.1089/apc.2021.0079
- 32. Marcus JL, Sewell WC, Balzer LB, Krakower DS. Artificial intelligence and machine learning for HIV prevention: emerging approaches to ending the epidemic. Curr HIV/AIDS Rep. 2020;17:171‐179.
- 33. van Heerden A, Bosman S, Swendeman D, et al. Chatbots for HIV prevention and care: a narrative review. Curr HIV/AIDS Rep. 2023;20(6):481‐486. doi: 10.1007/s11904-023-00681-x
- 34. Brixey J, Hoegen R, Lan W, et al. Shihbot: a facebook chatbot for sexual health information on hiv/aids. Proceedings of the 18th Annual SIGdial Meeting on Discourse and Dialogue. Association for Computational Linguistics; 2017:370‐373.
- 35. Ardiana D, Joni I, Udayana I. Mobile based chatbot application for HIV/AIDS counseling using artificial intelligence markup language approach. Journal of Physics: Conference Series. IOP Publishing; 2020:012041.
- 36. van Heerden A, Ntinga X, Vilakazi K. The potential of conversational agents to provide a rapid HIV counseling and testing services. 2017 International Conference on the Frontiers and Advances in Data Science (FADS). IEEE; 2017:80‐85.
- 37. Yam EA, Namukonda E, Mcclair T, et al. Developing and testing a Chatbot to integrate HIV education into family planning clinic waiting areas in Lusaka, Zambia. Global Health: Science and Practice. 2022;10:e2100721. doi: 10.9745/ghsp-d-21-00721
- 38. Braddock WRT, Ocasio MA, Comulada WS, Mandani J, Fernandez MI. Increasing participation in a TelePrEP program for sexual and gender minority adolescents and Young adults in Louisiana: protocol for an SMS text messaging‐based Chatbot. JMIR Research Protocols. 2023;12:e42983. doi: 10.2196/42983
- 39. Chen S, Zhang Q, Chan CK, et al. Evaluating an innovative HIV self‐testing service with web‐based, real‐time counseling provided by an artificial intelligence Chatbot (HIVST‐Chatbot) in increasing HIV self‐testing use among Chinese men who have sex with men: protocol for a noninferiority randomized controlled trial. JMIR Res Protoc. 2023;12:e48447. doi: 10.2196/48447
- 40. Hui M. Testing the feasibility and acceptability of using an artificial intelligence Chatbot to promote HIV testing and pre‐exposure prophylaxis in Malaysia: mixed methods study. JMIR Hum Factors. 2024;11:e52055. https://humanfactors.jmir.org/2024/1/e52055. doi: 10.2196/52055
- 41. Massa P, De Souza Ferraz DA, Magno L, et al. A transgender Chatbot (Amanda selfie) to create pre‐exposure prophylaxis demand among adolescents in Brazil: assessment of acceptability, functionality, usability, and results. J Med Internet Res. 2023;25:e41881. doi: 10.2196/41881
- 42. Ma Y, Achiche S, Pomey M‐P, et al. Adapting and evaluating an AI‐based Chatbot through patient and stakeholder engagement to provide information for different health conditions: master protocol for an adaptive platform trial (the MARVIN Chatbots study). JMIR Res Protocols. 2024;13:13. doi: 10.2196/54668
- 43. Laymouna M, Ma Y, Lessard D, Schuster T, Engler K, Lebouché B. Roles, users, benefits and limitations of Chatbots in healthcare: a rapid review. J Med Internet Res. 2024;26:e56930. doi: 10.2196/56930
- 44. Garett R, Young SD. Potential application of conversational agents in HIV testing uptake among high‐risk populations. J Public Health. 2023;45:189‐192. doi: 10.1093/pubmed/fdac020
- 45. Bocklisch T, Faulkner J, Pawlowski N, et al. Rasa: open source language understanding and dialogue management. arXiv Preprint arXiv:1712.05181. 2017.
- 46. Creswell JW, Clark VLP. Designing and Conducting Mixed Methods Research. Sage Publications, Inc; 2007. xviii, 275‐xviii, 275.
- 47. Liu X, Cruz Rivera S, Moher D, et al. Reporting guidelines for clinical trial reports for interventions involving artificial intelligence: the CONSORT‐AI extension. Nat Med. 2020;26:1364‐1374. doi: 10.1038/s41591-020-1034-x
- 48. Browne RH. On the use of a pilot sample for sample size determination. Stat Med. 1995;14:1933‐1940.
- 49. Lancaster GA, Dodd S, Williamson PR. Design and analysis of pilot studies: recommendations for good practice. J Eval Clin Pract. 2004;10:307‐312. doi: 10.1111/j..2002.384.doc.x
- 50. Etikan I. Comparison of convenience sampling and purposive sampling. Am J Theor Appl Stat. 2016;5:5. doi: 10.11648/j.ajtas.20160501.11
- 51. Meta. Meta privacy policy. https://www.facebook.com/privacy/policy/ (accessed 6 October, 2024)
- 52. Meta. Meta data security terms. https://www.facebook.com/legal/terms/data_security_terms (accessed 6 October, 2024)
- 53. Meta. Privacy & safety on Messenger. https://www.facebook.com/help/messenger-app/1064701417063145/?helpref=hc_fnav (accessed 6 October, 2024)
- 54. International Organization for Standardization. Ergonomics of Human‐System Interaction — Part 11: Usability: Definitions and Concepts. 2018.
- 55. Watbled L, Marcilly R, Guerlinger S, et al. Combining usability evaluations to highlight the chain that leads from usability flaws to usage problems and then negative outcomes. J Biomed Inform. 2018;78:12‐23. doi: 10.1016/j.jbi.2017.12.014
- 56. Hagglund M, Scandurra I. Usability of the Swedish accessible electronic health record: qualitative survey study. JMIR Hum Factors. 2022;9:e37192. doi: 10.2196/37192
- 57. Patel B, Thind A. Usability of Mobile health apps for postoperative care: systematic review. JMIR Perioper Med. 2020;3:e19099. doi: 10.2196/19099
- 58. Holden RJ, Karsh B‐T. The technology acceptance model: its past and its future in health care. J Biomed Inform. 2010;43:159‐172. doi: 10.1016/j.jbi.2009.07.002
- 59. Venkatesh V, Davis FD. A theoretical extension of the technology acceptance model: four longitudinal field studies. Manag Sci. 2000;46:186‐204. doi: 10.1287/mnsc.46.2.186.11926
- 60. Revythi A, Tselios N. Extension of technology acceptance model by using system usability scale to assess behavioral intention to use e‐learning. Educ Inform Technol. 2019;24:2341‐2355. doi: 10.1007/s10639-019-09869-4
- 61. International Organization for Standardization. ISO/TS 20282–2:2013(en) Usability of consumer products and products for public use — Part 2: Summative test method.
- 62. Alharbi S, Drew S. Using the technology acceptance model in understanding academics’ Behavioural intention to use learning management systems. 2014.
- 63. Proctor E, Silmere H, Raghavan R, et al. Outcomes for implementation research: conceptual distinctions, measurement challenges, and research agenda. Adm Policy Ment Health. 2011;38:65‐76. doi: 10.1007/s10488-010-0319-7
- 64. Dhagarra D, Goswami M, Kumar G. Impact of trust and privacy concerns on technology acceptance in healthcare: an Indian perspective. Int J Med Inform. 2020;141:104164. doi: 10.1016/j.ijmedinf.2020.104164
- 65. Kalayou MH, Endehabtu BF, Tilahun B. The applicability of the modified technology acceptance model (TAM) on the sustainable adoption of eHealth Systems in Resource‐Limited Settings. J Multidiscip Healthc. 2020;13:1827‐1837. doi: 10.2147/jmdh.s284973
- 66. Kamal SA, Shafiq M, Kakria P. Investigating acceptance of telemedicine services through an extended technology acceptance model (TAM). Technol Soc. 2020;60:101212. doi: 10.1016/j.techsoc.2019.101212
- 67. Lewis JR, Utesch BS, Maher DE. UMUX‐LITE: when there's no Time for the SUS. Proceedings of the SIGCHI conference on human factors in computing systems. Association for Computing Machinery; 2013:2099‐2102.
- 68. Brooke J. SUS: a retrospective. J Usability Stud. 2013;8:29‐40.
- 69. Borsci S, Buckle P, Walne S. Is the LITE version of the usability metric for user experience (UMUX‐LITE) a reliable tool to support rapid assessment of new healthcare technology? Appl Ergon. 2020;84:103007. doi: 10.1016/j.apergo.2019.103007
- 70. Tariman JD, Berry DL, Halpenny B, Wolpin S, Schepp K. Validation and testing of the acceptability E‐scale for web‐based patient‐reported outcomes in cancer care. Appl Nurs Res. 2011;24:53‐58.
- 71. Sauro J, Dumas JS. Comparison of Three One‐Question, Post‐Task Usability Questionnaires. Proceedings of the SIGCHI conference on human factors in computing systems. Association for Computing Machinery; 2009:1599‐1608.
- 72. Chau PY, Hu PJ‐H. Investigating healthcare professionals' decisions to accept telemedicine technology: an empirical test of competing theories. Inf Manag. 2002;39:297‐311.
- 73. Davis FD. Perceived usefulness, perceived ease of use, and user acceptance of information technology. MIS Q. 1989;13:319‐340.
- 74. Adams C, Walpola R, Schembri AM, et al. The ultimate question? Evaluating the use of net promoter score in healthcare: a systematic review. Health Expect. 2022;25:2328‐2339. doi: 10.1111/hex.13577
- 75. Dawes J. Do data characteristics change according to the number of scale points used? An experiment using 5‐point, 7‐point and 10‐point scales. Int J Market Res. 2008;50:61‐104.
- 76. Python Software Foundation. Python (Programming Language). https://www.python.org/ (accessed 14 May, 2024)
- 77. Guest G, Namey E, McKenna K. How many focus groups are enough? Building an evidence base for nonprobability sample sizes. Field Methods. 2017;29:3‐22.
- 78. Kirk MA, Kelley C, Yankey N, et al. A systematic review of the use of the consolidated framework for implementation research. Implement Sci. 2016;11:1‐13. doi: 10.1186/s13012-016-0437-z
- 79. CATIE. CATIE, Canada's source for HIV and hepatitis C information. https://www.catie.ca/ (accessed 14 May, 2024)
- 80. Leo AJ, Schuelke MJ, Hunt DM, et al. A digital mental health intervention in an orthopedic setting for patients with symptoms of depression and/or anxiety: feasibility prospective cohort study. JMIR Formative Res. 2022;6:e34889. doi: 10.2196/34889
- 81. Ehrlich C, Hennelly SE, Wilde N, et al. Evaluation of an artificial intelligence enhanced application for student wellbeing: pilot randomised trial of the mind tutor. Int J Appl Positive Psychol. 2023;9:435‐454. doi: 10.1007/s41042-023-00133-2
- 82. Amagai S, Pila S, Kaat AJ, et al. Challenges in participant engagement and retention using Mobile health apps: literature review. J Med Internet Res. 2022;24:e35120. doi: 10.2196/35120
- 83. Bragazzi NL, Crapanzano A, Converti M, Zerbetto R, Khamisy‐Farah R. The impact of generative conversational artificial intelligence on the lesbian, gay, bisexual, transgender, and queer community: scoping review. J Med Internet Res. 2023;25:e52091. doi: 10.2196/52091
- 84. Sanabria G, Greene KY, Tran JT, et al. “A great way to start the conversation”: evidence for the use of an adolescent mental health Chatbot navigator for youth at risk of HIV and other STIs. J Technol Behav Sci. 2023;8(4):1‐10. doi: 10.1007/s41347-023-00315-4
- 85. Mielke SJ, Cotterell R, Gorman K, et al. What kind of language is hard to language‐model? arXiv Preprint arXiv:1906.04726. 2019.
- 86. Arora AK, Vicente S, Engler K, et al. Impact of social determinants of health on time to antiretroviral therapy initiation and HIV viral undetectability for migrants enrolled in a multidisciplinary HIV clinic with rapid, free, and onsite B/F/TAF: 'The ASAP study'. HIV Med. 2024;25:600‐607. doi: 10.1111/hiv.13608
- 87. Thirunavukarasu AJ, Ting DSJ, Elangovan K, Gutierrez L, Tan TF, Ting DSW. Large language models in medicine. Nat Med. 2023;29:1930‐1940. doi: 10.1038/s41591-023-02448-8
- 88. Yang R, Tan TF, Lu W, Thirunavukarasu AJ, Ting DSW, Liu N. Large language models in health care: development, applications, and challenges. Health Care Sci. 2023;2:255‐263. doi: 10.1002/hcs2.61
- 89. de Los RP, Okoli C, Castellanos E, et al. Physical, emotional, and psychosocial challenges associated with daily dosing of HIV medications and their impact on indicators of quality of life: findings from the positive perspectives study. AIDS Behav. 2021;25:961‐972. doi: 10.1007/s10461-020-03055-1
- 90. Comulada WS, Rezai R, Sumstine S, et al. A necessary conversation to develop chatbots for HIV studies: qualitative findings from research staff, community advisory board members, and study participants. AIDS Care. 2024;36:463‐471. doi: 10.1080/09540121.2023.2216926
- 91. Belen‐Saglam R, Nurse JRC, Hodges D. An investigation into the sensitivity of personal information and implications for disclosure: a UK perspective. Front Comput Sci. 2022;4:908245. doi: 10.3389/fcomp.2022.908245
- 92. Arora AK, Engler K, Lessard D, et al. Experiences of migrant people living with HIV in a multidisciplinary HIV care setting with rapid B/F/TAF initiation and cost‐covered treatment: the 'ASAP' study. J Pers Med. 2022;12:1497. doi: 10.3390/jpm12091497
- 93. Iribarren S, Siegel K, Hirshfield S, et al. Self‐management strategies for coping with adverse symptoms in persons living with HIV with HIV associated non‐AIDS conditions. AIDS Behav. 2018;22:297‐307. doi: 10.1007/s10461-017-1786-6
- 94. Koh MCY, Ngiam JN, Yong J, Tambyah PA, Archuleta S. The role of an artificial intelligence model in antiretroviral therapy counselling and advice for people living with HIV. HIV Med. 2024;25:504‐508. doi: 10.1111/hiv.13604
- 95. Wang C, Liu S, Yang H, et al. Ethical considerations of using ChatGPT in health care. J Med Internet Res. 2023;25:e48009. doi: 10.2196/48009
- 96. Lee P, Goldberg C, Kohane I. The AI Revolution in Medicine: GPT‐4 and beyond. Pearson; 2023.
- 97. Zhong W, Guo L, Gao Q, Ye H, Wang Y. Memorybank: Enhancing Large Language Models with Long‐Term Memory. Proceedings of the AAAI Conference on Artificial Intelligence. Vol 38. Association for the Advancement of Artificial Intelligence ‐ MIT Press; 2024:19724‐19731.
- 98. Nißen M, Selimi D, Janssen A, et al. See you soon again, chatbot? A design taxonomy to characterize user‐chatbot relationships with different time horizons. Comput Hum Behav. 2022;127:107043. doi: 10.1016/j.chb.2021.107043
- 99. Janssen A, Passlick J, Rodríguez Cardona D, Breitner MH. Virtual assistance in any context. Bus Inf Syst Eng. 2020;62:211‐225. doi: 10.1007/s12599-020-00644-1
- 100. Griffin AC, Xing Z, Khairat S, et al. Conversational agents for chronic disease self‐management: a systematic review. AMIA Annual Symposium Proceedings. American Medical Informatics Association; 2020:504.
- 101. Følstad A, Brandtzæg PB. Chatbots and the new world of HCI. Interactions. 2017;24:38‐42.
- 102. Baraka K, Alves‐Oliveira P, Ribeiro T. An extended framework for characterizing social robots. Human‐robot interaction. Springer; 2020:21‐64.
- 103. Leite I, Martinho C, Paiva A. Social robots for long‐term interaction: a survey. Int J Social Robot. 2013;5:291‐308. doi: 10.1007/s12369-013-0178-y
- 104. Peng ML, Wickersham JA, Altice FL, et al. Formative evaluation of the acceptance of HIV prevention artificial intelligence Chatbots by men who have sex with men in Malaysia: focus group study. JMIR Form Res. 2022;6:e42055. doi: 10.2196/42055
- 105. Senteio C, Murdock PJ. The efficacy of health information Technology in Supporting Health Equity for black and Hispanic patients with chronic diseases: systematic review. J Med Internet Res. 2022;24:e22124. doi: 10.2196/22124
- 106. Pérez‐Stable EJ, Jean‐Francois B, Aklin CF. Leveraging advances in technology to promote health equity. Med Care. 2019;57:S101‐S103.
- 107. Gaffney H, Mansell W, Tai S. Conversational agents in the treatment of mental health problems: mixed‐method systematic review. JMIR Mental Health. 2019;6:e14166. doi: 10.2196/14166
- 108. Laranjo L, Dunn AG, Tong HL, et al. Conversational agents in healthcare: a systematic review. J Am Med Inform Assoc. 2018;25:1248‐1258. doi: 10.1093/jamia/ocy072