Author manuscript; available in PMC: 2025 Nov 1.
Published in final edited form as: Int J Eat Disord. 2024 Jul 28;57(11):2204–2216. doi: 10.1002/eat.24260

Effects of Chatbot Components to Facilitate Mental Health Services Use in Individuals with Eating Disorders Following Online Screening: An Optimization Randomized Controlled Trial

Ellen E Fitzsimmons-Craft 1, Gavin N Rackoff 2, Jillian Shah 1, Jillian C Strayhorn 3, Laura D’Adamo 1,4, Bianca DePietro 1, Carli P Howe 1, Marie-Laure Firebaugh 1, Michelle G Newman 2, Linda M Collins 3, C Barr Taylor 5,6, Denise E Wilfley 1
PMCID: PMC11560741  NIHMSID: NIHMS2005642  PMID: 39072846

Abstract

Objective:

Few individuals with eating disorders (EDs) receive treatment. Innovations are needed to identify individuals with EDs and address care barriers. We developed a chatbot for promoting services uptake that could be paired with online screening. However, it is not yet known which components drive effects. This study estimated individual and combined contributions of four chatbot components on mental health services use (primary), chatbot helpfulness, and attitudes toward changing eating/shape/weight concerns (‘change attitudes,’ with higher scores indicating greater importance/readiness).

Methods:

205 individuals who screened positive for an ED but were not in treatment were randomized in an optimization randomized controlled trial to receive up to four chatbot components: psychoeducation; motivational interviewing; personalized service recommendations; and repeated administration (follow-up check-ins/reminders). Assessments occurred at baseline and at 2, 6, and 14 weeks.

Results:

Participants who received repeated administration were more likely to report mental health services use, with no significant effects of other components on services use. Repeated administration slowed the decline in change attitudes participants experienced over time. Participants who received motivational interviewing found the chatbot more helpful, but this component was also associated with larger declines in change attitudes. Participants who received personalized recommendations found the chatbot more helpful, and receiving this component on its own was associated with the most favorable change attitude time trend. Psychoeducation showed no effects.

Discussion:

Results indicated important effects of components on outcomes; findings will be used to finalize decision making about the optimized intervention package. The chatbot shows high potential for addressing the treatment gap for EDs.

Keywords: chatbot, conversational agent, digital intervention, eating disorder, screening, mental health treatment, mHealth, optimization


Eating disorders (EDs) are serious mental illnesses (Klump et al., 2009), but less than 20% of affected individuals ever receive treatment (Kazdin et al., 2017; Rackoff et al., 2023). As such, innovations are needed to not only identify individuals with EDs but also address barriers patients face in seeking and receiving care, including stereotypes about EDs, denial of illness severity, limited motivation, and lack of knowledge about options (Ali et al., 2017, 2020).

One such innovation that could address this problem is pairing readily available online ED screening with an automated chatbot for promoting services uptake. Chatbots have been shown to promote positive health-related behavior change (Pereira & Diaz, 2019; Tudor Car et al., 2020; Zhang et al., 2020). Using user-centered design (Graham et al., 2019b; Lyon & Koerner, 2016), we developed an initial version of a chatbot named Alex that included four theoretically informed components (i.e., psychoeducation, motivational interviewing, personalized recommendations for services, repeated administration) to increase services use following ED screening. Participants generally reflected positively on interactions with Alex. Refinements between usability testing cycles further improved user experiences, and overall, this initial work provided preliminary evidence of the feasibility and acceptability of a chatbot designed to promote EDs services use (Shah et al., 2022).

However, what was not yet known was which specific components of the chatbot were active in promoting service use, either individually or in combination with other components (e.g., because they boosted the effect of another component). At the same time, including inactive components in the chatbot comes with important risks, such as increased intervention length, which may burden participants to the extent that they drop out partway through and miss out on potentially effective components. It is widely acknowledged that engagement in digital mental health, including with chatbots, can be challenging and is often brief (Borghouts et al., 2021; Lipschitz et al., 2023). This makes optimization, or the empirical process of estimating the contributions of candidate components and identifying active ingredients, particularly important. In this article, we present the results of an optimization randomized controlled trial (ORCT), conducted using the multiphase optimization strategy (MOST; Collins, 2018; Collins et al., in press), of four candidate chatbot components that were included in the initial version of Alex for increasing services use in individuals with EDs following online screening.

Identifying or recognizing that symptoms warrant treatment is a crucial first step in accessing services (Kazdin et al., 2017). Thus, we partnered with the National EDs Association (NEDA), the largest non-profit organization dedicated to EDs in the U.S., to disseminate an evidence-based online EDs screen. The NEDA screen garners up to 200,000 respondents annually. Our work has shown that whereas most respondents (86%) screen positive for an ED, most of those (86%) have never been in treatment and only 3% are currently in treatment (Fitzsimmons-Craft et al., 2019). Additional work suggested that only 16% of those screening positive for an ED and not in treatment initiated care following NEDA screen completion and being offered a variety of referral options (Fitzsimmons-Craft et al., 2020). However, this figure is likely a gross overestimate given that it was based only on those who volunteered to provide follow-up data. This low treatment uptake reinforces the need for a novel solution to increase services use following screening.

The aim of the current study was to conduct an ORCT with adults not currently in treatment who screened positive for an ED via the NEDA screen to estimate the individual and combined contributions of Alex’s four components on mental health services use (primary outcome). Secondary outcomes included self-reported chatbot helpfulness and attitudes toward changing eating, shape, or weight concerns.

Method

Participants and Procedure

Participants were recruited in two waves from 3/9/21–3/16/21 and from 7/19/21–8/9/21 from NEDA’s online screen (Fitzsimmons-Craft et al., 2019), which uses the Stanford-Washington University EDs screen (Graham et al., 2019a). During recruitment windows, immediately following screening, eligible respondents were shown a web page with details about a research study evaluating a chatbot aimed at promoting mental health services use and were given options to enroll or see other resources. Eligible respondents were 18+ years, screened positive for an ED, and reported not currently being in treatment for an ED. Those who elected to participate were directed to a survey with additional eligibility questions (i.e., smartphone ownership, U.S. residency). Participants were then asked to provide informed consent, and consenting participants completed baseline questionnaires. Following completion, participants were randomized using a 2⁴ factorial design to receive up to four chatbot components. Participants were provided with the chatbot’s SMS number and prompted to initiate a conversation by texting a study-provided ID number to the chatbot.

At 2-, 6-, and 14-weeks post-baseline, participants were invited via email to complete an online follow-up survey. Participants were remunerated with a $5, $10, $10, and $20 e-gift card after completion of the baseline, 2-, 6-, and 14-week follow-up surveys, respectively. All procedures were approved by the institutional review board, and the trial was preregistered (NCT04806165). This report follows the Consolidated Standards of Reporting Trials (CONSORT) reporting guidelines. See Figure 1 for the CONSORT flow chart.

Figure 1. Participant CONSORT flow diagram

Candidate Chatbot Components

The chatbot was hosted by a mental health chatbot company. The four components under study were designed to target distinct mechanisms that could impact mental health services seeking, as detailed below. Further information about components can be found in Shah et al. (2022).

The psychoeducation component was based on the mental health literacy model, which has been shown to improve patients’ help-seeking attitudes (Xu et al., 2018). This component refuted common stereotypes/myths about EDs, informed of potential consequences of EDs, provided tailored information on specific aspects of ED psychopathology, and reinforced the seriousness of EDs.

The motivational interviewing component delivered core elements of motivational interviewing (Miller & Rollnick, 2013), which builds on cognitive dissonance (Festinger, 1962) and self-perception (Bem, 1967) theories. A review of meta-analytic findings demonstrated the effectiveness of motivational interviewing for increasing care engagement (Lundahl & Burke, 2009). This component highlighted discrepancies between individuals’ unhealthy behaviors and healthy goals (Lundahl & Burke, 2009).

The personalized recommendations component provided tailored recommendations for seeking care and was based on the elaboration likelihood model (Petty et al., 1981), whereby individuals are more likely to process information in an active manner when messages are perceived as personally relevant (Kreuter & Wray, 2003). Meta-analyses have demonstrated that tailored messages outperform generic ones in affecting health behavior change (Noar et al., 2007). In this component, users responded to questions regarding intervention preferences (e.g., in-person or online), and the chatbot followed up with tailored resources from the NEDA website, in contrast to the long list of options typically presented at the end of online mental health screens. This component also acknowledged common barriers to seeking help (e.g., feeling one’s problems are not severe enough, shame, logistics) and offered troubleshooting for these concerns.

The repeated administration component was based on literature demonstrating that reminders increase adherence to health-related interventions (Fenerty et al., 2012; Thakkar et al., 2016). In this component, the chatbot provided up to three interactive check-ins that asked if the user had sought treatment, and for those who had not, reminded them of available resources (with content tailored depending on whether they had been randomized to personalized recommendations or not) and promoted reflection on overcoming barriers. Users received their first check-in three days after completing the main conversation. Those who endorsed seeking help stopped receiving check-ins, whereas those users who did not continued to receive up to two more check-ins each spaced about three days apart. All check-ins were received prior to the 2-week follow-up.

The psychoeducation, motivational interviewing, and personalized recommendations components were each designed to be completed in about 5 minutes. Each repeated administration check-in took about 3 minutes to complete.

Study Design

This ORCT employed a full 2⁴ factorial design to test the four candidate components. Candidate components were operationalized as two-level factors, each with ‘on’ and ‘off’ levels indicating presence vs absence of a particular component. Participants were randomized to one of 16 experimental conditions representing every combination of candidate components (i.e., from all ‘off’ to all ‘on’), with components administered in a set order (see Table 1). Those assigned to ‘off’ for all four components received a minimal chatbot that provided only general recommendations for services, similar to information typically provided after the NEDA screen.
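For intuition, the full set of experimental conditions implied by this design can be enumerated from the four on/off factors. The following is an illustrative sketch only (not the study’s randomization software), with variable names chosen for readability:

```python
from itertools import product

COMPONENTS = [
    "psychoeducation",
    "motivational_interviewing",
    "personalized_recommendations",
    "repeated_administration",
]

def factorial_conditions(components):
    """Enumerate every on/off combination of the candidate components."""
    conditions = []
    for levels in product(["off", "on"], repeat=len(components)):
        conditions.append(dict(zip(components, levels)))
    return conditions

conditions = factorial_conditions(COMPONENTS)
assert len(conditions) == 16  # 2^4 combinations, from all 'off' to all 'on'
```

The first enumerated condition is the all-‘off’ minimal chatbot; the last is the full package with all four components ‘on’.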

Table 1.

Experimental Conditions

Condition  Participants randomized  Psychoeducation  Motivational Interviewing  Personalized Recommendations  Repeated Administration
1 n=13 N N N Y
2 n=12 N N Y N
3 n=11 N Y N N
4 n=14 N N N N
5 n=13 N Y Y Y
6 n=13 N Y N Y
7 n=14 N Y Y N
8 n=13 N N Y Y
9 n=13 Y N N N
10 n=12 Y N N Y
11 n=13 Y N Y N
12 n=13 Y Y N N
13 n=13 Y Y Y N
14 n=13 Y N Y Y
15 n=12 Y Y N Y
16 n=13 Y Y Y Y

Note. “N” indicates the chatbot component was not offered to participants in that condition. “Y” indicates the chatbot component was offered to participants in that condition. Components were administered in a set order, with psychoeducation always appearing first if offered, then motivational interviewing, then personalized recommendations, and then repeated administration.

Study Outcomes

Mental health services use was the primary outcome, assessed at all follow-ups (2-week, 6-week, 14-week). Mental health services use was measured by asking participants, “Have you tried or used any type of mental health help (e.g., self-help app, telehealth/in-person counseling, etc.) for your concerns related to your eating, shape, or weight in the past [2, 6, or 14] weeks?” Assessing mental health services use via self-report is standard in research using national samples (e.g., Wang et al., 2005). We defined a participant as having utilized mental health services at the first time they responded “yes” (vs “no”) to this question. For descriptive purposes, we also analyzed responses to a follow-up question asking about the type of help received (“From which of the following places did you receive help?”). Participants were shown a non-mutually-exclusive list of types of help and could also select “other” or “don’t know.”

As a secondary outcome, chatbot helpfulness was assessed immediately at the end of the initial conversation by asking, “How helpful was this conversation?,” rated on a 1 (not helpful) to 4 (very helpful) scale.

An additional secondary outcome was a measure of attitudes toward changing eating, shape, or weight concerns (hereafter referred to as “change attitudes”). Change attitudes were assessed by asking, “How important is it for you to change your eating, shape, or weight concerns?” and, “How ready are you to make changes in your eating, shape, or weight concerns?,” both rated on a 1 (not at all important/ready) to 7 (very important/ready) scale. The sum of the two questions was used as an index of change attitudes. Change attitudes were measured at baseline, as well as 2-week, 6-week, and 14-week follow-ups. Cronbach’s alpha ranged from .71 to .74 over time.
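Because this index is simply the sum of two 1–7 items, its internal consistency can be computed directly. A minimal sketch of Cronbach’s alpha using hypothetical ratings (not study data):

```python
def cronbach_alpha(items):
    """Cronbach's alpha for a list of item-score lists (one list per item)."""
    k = len(items)

    def var(xs):  # sample variance
        m = len(xs)
        mean = sum(xs) / m
        return sum((x - mean) ** 2 for x in xs) / (m - 1)

    # Total score per respondent (here: the two-item change attitudes sum)
    totals = [sum(scores) for scores in zip(*items)]
    return (k / (k - 1)) * (1 - sum(var(item) for item in items) / var(totals))

# Hypothetical importance/readiness ratings on the 1-7 scale (not study data)
importance = [7, 6, 5, 7, 4, 6]
readiness = [6, 6, 4, 7, 3, 5]
alpha = cronbach_alpha([importance, readiness])
```

With two items, alpha is driven entirely by the inter-item covariance relative to total-score variance, which is why values in the low .70s (as reported here) are common for brief two-item indices.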

Data Analysis

Data were analyzed using R version 4.1.1. Consistent with recommendations for factorial trials conducted using MOST, we effect-coded main effect and interaction terms, and we considered candidate components “active” on a particular outcome if they were associated with important main effects or involved with important interaction effects (Collins et al., 2014). Data on the primary outcome, mental health services use, were analyzed using discrete time survival analysis. This approach estimates the effect of independent variables (e.g., a given factor, representing a particular candidate component) on the occurrence of an event of interest; in this case, mental health services use. Following recommendations of Singer and Willett (2003), models were fit with mental health services use regressed on follow-up time (2-week, 6-week, or 14-week), each chatbot factor, and all interactions among chatbot factors. In a balanced factorial experiment that uses effect-coding, main effect terms and interaction terms are uncorrelated (Strayhorn et al., 2022). Models did not incorporate an intercept; this allowed estimating the hazard odds of mental health services use (i.e., odds of mental health services use among participants who had not already reported services use) at each time point. The model was estimated using a logit link. For all model terms, odds ratios and 95% confidence intervals were computed.
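Conceptually, discrete time survival analysis operates on a “person-period” dataset in which each participant contributes one row per follow-up at which they were still at risk (i.e., had not yet reported services use). The restructuring step can be sketched as follows; this illustrates the general approach, not the authors’ code:

```python
def person_period(records, periods=(2, 6, 14)):
    """Expand one record per participant into one row per at-risk period.

    records: dict mapping participant id -> follow-up week of first reported
             services use (None if use was never reported).
    Returns (id, week, event) rows; regressing `event` on period indicators
    with a logit link estimates the hazard odds at each period.
    """
    rows = []
    for pid, event_week in records.items():
        for week in periods:
            event = 1 if event_week == week else 0
            rows.append((pid, week, event))
            if event:
                break  # contributes no rows after the event period
    return rows

# Hypothetical participants: first use by week 2, by week 6, and never
rows = person_period({"a": 2, "b": 6, "c": None})
```

Fitting the logit model without an intercept on such data, as described above, yields one hazard-odds estimate per follow-up period.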

For the helpfulness secondary outcome, we used linear regression in which helpfulness was regressed on an intercept, each chatbot factor, and all possible interactions among chatbot factors. Helpfulness was measured once (i.e., immediately after the chatbot conversation), so time was not incorporated in these models. This measure was not expected to differ as a function of repeated administration, because the follow-up chatbot check-ins had not been delivered at the time of measurement; nonetheless, the factor for repeated administration was included in this model to be able to detect any unexpected differences as a function of this component.

For the change attitudes secondary outcome, we used linear mixed effects models in which change attitude score was regressed on an intercept, time (continuous; coded as 0 for baseline, 2 for 2-week, 6 for 6-week, 14 for 14-week), each chatbot factor, and all possible interactions among the chatbot factors and time. Key parameters of interest were the interactions between chatbot components and time, which indicated main effects of chatbot components on change over time in change attitudes. The linear mixed effects models incorporated a random intercept to accommodate the correlation of repeated measures within participants.

For secondary outcome models, we computed both unstandardized (B) and standardized (β) coefficients to indicate effect sizes. As suggested by Fey et al. (2023), we interpreted absolute β values <0.2 as small, 0.2–0.5 as medium, and >0.5 as large. The threshold for significance was set at p<.05.
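A standardized coefficient of this kind can be obtained by rescaling the unstandardized estimate by the predictor and outcome standard deviations. A generic sketch of this computation and the Fey et al. (2023) thresholds (the SD values below are hypothetical, for illustration only):

```python
def standardize(B, sd_x, sd_y):
    """Convert an unstandardized coefficient B to a standardized beta."""
    return B * sd_x / sd_y

def effect_size_label(beta):
    """Bucket |beta| using the Fey et al. (2023) thresholds cited above."""
    a = abs(beta)
    if a < 0.2:
        return "small"
    if a <= 0.5:
        return "medium"
    return "large"

# Hypothetical SDs for an effect-coded factor and a 1-4 helpfulness rating
beta = standardize(B=0.19, sd_x=0.5, sd_y=0.45)
```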

Missing data were handled using multiple imputation implemented in the R package mice (van Buuren & Groothuis-Oudshoorn, 2011). To improve quality of the imputations, gender, age, and race were included as auxiliary variables (Graham, 2009). We generated ten imputed datasets with the predictive mean matching method and pooled coefficients, degrees of freedom, and standard errors with Rubin’s (1987) rules. As a sensitivity analysis, we also ran the same models using full information maximum likelihood to handle missing data. These analyses resulted in identical conclusions and are reported in the Supplementary Materials.
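The pooling step under Rubin’s (1987) rules combines within- and between-imputation variance. The mice package performs this internally; the core arithmetic for a single coefficient can be sketched as (values below are made up):

```python
def rubins_pool(estimates, variances):
    """Pool point estimates and squared SEs across m imputed datasets."""
    m = len(estimates)
    q_bar = sum(estimates) / m  # pooled point estimate
    u_bar = sum(variances) / m  # average within-imputation variance
    # Between-imputation variance of the estimates
    b = sum((q - q_bar) ** 2 for q in estimates) / (m - 1)
    total_var = u_bar + (1 + 1 / m) * b
    return q_bar, total_var ** 0.5  # pooled estimate and pooled SE

# Hypothetical coefficient estimates and squared SEs from three imputations
est, se = rubins_pool([0.50, 0.55, 0.45], [0.04, 0.04, 0.04])
```

Note how the pooled SE exceeds the average within-imputation SE whenever the imputations disagree, which is what propagates imputation uncertainty into the final inference.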

Results

Descriptive Statistics

During the recruitment periods, 12,187 respondents completed the NEDA screen, with 6,747 subsequently shown the study opportunity. Of these, 595 (8.8%) opted to receive a link to complete additional eligibility questions, followed by consent and the baseline survey. Of those, 209 individuals completed baseline and were randomized. During data cleaning, it was discovered that three of these individuals were erroneously shown the study invitation (i.e., did not screen positive for an ED). In addition, one individual completed the baseline twice and was thus randomized to two different conditions. As such, data from four individuals were excluded, ultimately yielding a final sample for analysis of 205. See the participant flow in Figure 1, along with reasons for exclusion and follow-up rates. Overall completion of at least one follow-up was 83%.

Mean age of the sample was 33.15 (SD=12.63, range=18–73, median=29) years. In terms of gender, 184 (90%) identified as female, 12 (6%) as male, 6 (3%) as genderqueer/gender nonconforming, 2 (1%) as trans male/trans man, and 1 (<1%) self-identified as another identity. In terms of race, 164 (80%) identified as White, 3 (1%) as Black or African American, 15 (7%) as Asian, 14 (7%) as multiracial, and 9 (4%) declined or reported racial identity as unknown. In terms of ethnicity, 17 (8%) identified as Hispanic. Reported household income varied widely, with 15% reporting incomes <$20,000, 17% between $20,000 and $39,999, 19% between $40,000 and $59,999, 12% between $60,000 and $79,999, 12% between $80,000 and $99,999, 13% between $100,000 and $149,999, and 12% >$150,000. Probable ED diagnoses were as follows: 6% anorexia nervosa (AN), 17% bulimia nervosa (BN), 15% binge-eating disorder (BED), 25% subclinical BN, 6% subclinical BED, 0.5% purging disorder, and 30% unspecified feeding or eating disorder (UFED). The Eating Disorder Examination-Questionnaire (EDE-Q) (Fairburn & Beglin, 2008) was also administered at baseline, with the average score being 4.00 (SD=1.10), which is the well-established clinical cutoff (Fairburn & Beglin, 2008).

Analyses indicated 19% of participants had missing data on chatbot helpfulness, and 38% had missing data on at least one follow-up. At baseline, participants reported high levels of change attitudes (M=11.42 out of 14, SD=2.66). See Supplementary Materials for more information on rates of missing data and baseline characteristics by chatbot component. Overall, there were no systematic associations between any chatbot component or demographic characteristic and missing data, and baseline characteristics were generally balanced across components. The two exceptions were that recipients of motivational interviewing scored higher than non-recipients on baseline change attitudes (B=0.76, SE=0.37, t(200)=2.04, p=.043, β=0.14), and recipients of personalized recommendations scored lower on baseline EDE-Q scores (B=−0.34, SE=0.15, t(200)=−2.27, p=.024, β=−0.16). Adjusting for each of these variables by including them as covariates led to no change in results (see Supplementary Materials), and thus there was no indication that baseline differences substantially affected results or interpretations.

Notably, 189 (92%) participants started the main chatbot conversation, and 167 (81%) completed the main chatbot conversation, which lasted ~5–15 minutes depending on the number of components to which the individual was randomly assigned.

Primary Outcome

See Figure 2 for plots of cumulative mental health services use rates as a function of each chatbot component. Of the sample (N=205), 81 (40%) participants reported services use by 2 weeks, 101 (49%) by 6 weeks, and 118 (58%) by 14 weeks.

Figure 2. Cumulative mental health services use over time as a function of chatbot component. The dotted “Off” line denotes that a participant did not receive a chatbot component, whereas the solid “On” line denotes that a participant did receive a chatbot component. MI = motivational interviewing; PE = psychoeducation; PR = personalized recommendation; RA = repeated administration.

Descriptive analysis of the type of service used across all follow-ups indicated that, in the sample (N=205), online self-help (n=56, 27%) and telehealth therapy (n=55, 27%) were the most commonly used, followed closely by outpatient therapy (n=40, 20%) and online guided self-help (n=37, 18%). Partial hospitalization, inpatient psychiatric hospital, and psychiatric emergency services were each used by 1% of the sample (n=2 each), 14% used another service not listed (n=29), and 2% did not know which service they used (n=5).

See Table 2 for results from the discrete time survival analysis. In terms of time effects, the hazard odds of mental health services use were highest at 2 weeks and lower at 6 and 14 weeks. Thus, across the sample, participants were most likely to initiate mental health services use in the first two weeks. In terms of chatbot components, the model indicated a significant positive main effect for repeated administration. Thus, receiving this component was associated with higher odds of initiating mental health services than not receiving it, averaging across the levels of the factors for the other components. There were no other significant main effects for any chatbot components, nor were there significant interactions among components.

Table 2.

Discrete Time Survival Analysis Results

Term  OR  95% CI Lower  95% CI Upper  Z  p
2-week 1.02 0.73 1.41 0.09 .928
6-week 0.37 0.22 0.63 −3.77 .002
14-week 0.52 0.24 1.13 −1.73 .084
MI 0.89 0.55 1.44 −0.49 .624
PE 0.82 0.50 1.34 −0.80 .424
PR 1.32 0.83 2.10 1.20 .230
RA 1.72 1.06 2.79 2.21 .027
MI × PE 0.99 0.35 2.83 −0.02 .984
MI × PR 0.69 0.23 2.01 −0.70 .484
PE × PR 0.96 0.35 2.60 −0.09 .928
MI × RA 0.83 0.33 2.09 −0.40 .689
PE × RA 1.67 0.63 4.41 1.04 .298
PR × RA 0.61 0.22 1.65 −0.99 .322
MI × PE × PR 0.42 0.56 3.09 −0.86 .390
MI × PE × RA 1.70 0.21 14.07 0.50 .617
MI × PR × RA 0.48 0.06 3.67 −0.71 .478
PE × PR × RA 0.29 0.04 2.06 −1.25 .211
MI × PE × PR × RA 0.86 0.02 42.61 −0.07 .944

Note. N = 205. MI = motivational interviewing; PE = psychoeducation; PR = personalized recommendation; RA = repeated administration. Boldface indicates statistical significance with alpha of .05.

Secondary Outcomes

Table 3 contains descriptive statistics for each secondary outcome as a function of each chatbot component over time. Table 4 shows output for the model of chatbot helpfulness. There were significant and positive main effects for motivational interviewing and for personalized recommendations in the model of chatbot helpfulness, indicating that participants who received either of these components found the chatbot more helpful immediately after the conversation than participants who did not receive the component (again, averaging across the levels of the factors for the other components). There were no significant interactions between chatbot components.

Table 3.

Descriptive Statistics for Secondary Outcomes

Variable  MI Off (M, SD)  MI On (M, SD)  PE Off (M, SD)  PE On (M, SD)  PR Off (M, SD)  PR On (M, SD)  RA Off (M, SD)  RA On (M, SD)
Helpfulness 2.82 0.91 3.14 0.91 3.00 0.91 2.94 0.94 2.74 0.95 3.21 0.83 3.00 0.95 2.94 0.89
Change attitudes
 Baseline 11.05 2.89 11.80 2.35 11.45 2.63 11.40 2.70 11.45 2.68 11.40 2.66 11.50 2.57 11.34 2.76
 2-week 10.70 2.95 10.73 2.59 11.10 2.57 10.31 2.93 10.90 2.89 10.53 2.65 10.57 2.94 10.86 2.60
 6-week 10.97 2.99 10.27 2.99 10.90 2.81 10.37 3.09 10.64 2.93 10.62 3.01 10.36 3.19 10.89 2.72
 14-week 10.45 3.28 9.87 3.09 10.32 3.12 10.01 3.28 9.86 3.10 10.45 3.27 9.83 3.33 10.54 3.02

Note. N = 205. MI = motivational interviewing; PE = psychoeducation; PR = personalized recommendation; RA = repeated administration; M = mean, SD = standard deviation. Helpfulness was rated on a 1 (not helpful) to 4 (very helpful) scale. Change attitudes were rated on a scale from 2 (not at all important/ready) to 14 (very important/ready).

Table 4.

Chatbot Helpfulness Model Results

Term B SE t df p β
Intercept 2.99 0.07 44.73 95 < .001
MI 0.15 0.07 2.33 102 .022 0.17
PE −0.04 0.06 −0.70 129 .488 −0.05
PR 0.19 0.07 2.73 78 .008 0.21
RA −0.05 0.06 −0.82 122 .412 −0.06
MI × PE 0.04 0.07 0.59 102 .557 0.04
MI × PR −0.13 0.07 −1.85 83 .068 −0.14
MI × RA −0.03 0.07 −0.48 80 .630 −0.04
PE × PR −0.04 0.07 −0.64 111 .525 −0.05
PE × RA −0.09 0.07 −1.35 93 .181 −0.10
PR × RA −0.09 0.07 −1.35 69 .180 −0.10
MI × PE × PR −0.11 0.07 −1.70 92 .093 −0.13
MI × PE × RA 0.03 0.07 0.42 79 .677 0.03
MI × PR × RA −0.11 0.07 −1.73 100 .087 −0.13
PE × PR × RA −0.02 0.07 −0.29 92 .775 −0.02
MI × PE × PR × RA 0.02 0.07 0.35 100 .728 0.03

Note. N = 205. MI = motivational interviewing; PE = psychoeducation; PR = personalized recommendation; RA = repeated administration. Boldface indicates statistical significance with alpha of .05.

Table 5 shows output for the model of change attitudes. Of note, the sample average effect of time was negative and significant, indicating that participants on average demonstrated declining views of importance/readiness to change eating/weight/shape concerns from before to 14 weeks after the chatbot conversation. The model also indicated significant interactions between motivational interviewing and time, as well as between repeated administration and time. Simple slopes of the motivational interviewing interaction indicated that participants receiving this component experienced a small to medium and significant decline over time in change attitudes (B=−0.10, SE=0.02, t(28)=−4.24, p<.001, β=−0.19), whereas participants not receiving it experienced a smaller, but still significant, decline over time (B=−0.04, SE=0.02, t(176)=−2.10, p=.037, β=−0.07). Thus, presence (vs absence) of motivational interviewing was associated with a larger decline over time in change attitudes. Simple slopes of repeated administration indicated that participants receiving this component experienced a small but significant decline over time in change attitudes (B=−0.04, SE=0.02, t(76)=−2.15, p=.035, β=−0.08), whereas participants not receiving it experienced a small to medium decline over time (B=−0.10, SE=0.02, t(36)=−4.36, p<.001, β=−0.18). Thus, repeated administration was associated with a weaker decline over time in change attitudes.
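These simple slopes follow arithmetically from the model coefficients if the factors were effect-coded ±0.5 (an assumption on our part, though it is consistent with the reported values): each slope is the average time effect plus or minus half the component-by-time interaction. A sketch:

```python
def simple_slopes(b_time, b_interaction, code_on=0.5, code_off=-0.5):
    """Simple slopes of time at each level of an effect-coded factor.

    Assumes +/-0.5 effect codes for the factor; this is an assumption
    not stated in the text, chosen because it reproduces the reported slopes.
    """
    slope_on = b_time + code_on * b_interaction
    slope_off = b_time + code_off * b_interaction
    return slope_on, slope_off

# Average time effect -0.07 and Time x MI interaction -0.06
mi_on, mi_off = simple_slopes(-0.07, -0.06)  # recovers -0.10 (on), -0.04 (off)
# Time x RA interaction 0.06
ra_on, ra_off = simple_slopes(-0.07, 0.06)   # recovers -0.04 (on), -0.10 (off)
```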

Table 5.

Change Attitude Model Results and Simple Slopes

Model Results
Term B SE t df p β
Intercept 11.08 0.18 60.90 583 < .001
MI 0.55 0.36 1.53 688 .127 0.09
PE −0.38 0.36 −1.06 657 .289 −0.07
PR −0.31 0.36 −0.86 600 .388 −0.05
RA 0.05 0.36 0.13 612 .897 0.01
MI × PE −0.63 0.74 −0.85 445 .398 −0.05
MI × PR −0.66 0.72 −0.92 690 .357 −0.06
PE × PR 0.05 0.73 0.06 520 .949 0.00
MI × RA 1.09 0.72 1.51 679 .132 0.09
PE × RA 0.62 0.72 0.86 702 .391 0.05
PR × RA −0.33 0.73 −0.46 583 .649 −0.03
MI × PE × PR −1.23 1.44 −0.85 668 .393 −0.05
MI × PE × RA 0.62 1.43 0.44 737 .664 0.03
MI × PR × RA 1.36 1.48 0.92 473 .358 0.06
PE × PR × RA −2.67 1.45 −1.84 608 .066 −0.11
MI × PE × PR × RA 4.02 2.89 1.39 657 .164 0.09
Time −0.07 0.02 −4.41 36 < .001 −0.13
Time × MI −0.06 0.03 −2.28 65 .026 −0.08
Time × PE −0.02 0.03 −0.62 242 .536 −0.02
Time × PR 0.02 0.03 0.73 85 .465 0.03
Time × RA 0.06 0.03 2.00 76 .049 0.07
Time × MI × PE 0.04 0.06 0.58 33 .569 0.02
Time × MI × PR 0.05 0.05 1.01 95 .316 0.04
Time × PE × PR 0.00 0.05 −0.08 88 .936 0.00
Time × MI × RA 0.04 0.06 0.62 38 .541 0.03
Time × PE × RA −0.04 0.06 −0.72 53 .478 −0.03
Time × PR × RA −0.02 0.06 −0.36 38 .720 −0.02
Time × MI × PE × PR 0.26 0.10 2.59 203 .010 0.09
Time × MI × PE × RA −0.05 0.11 −0.42 89 .674 −0.02
Time × MI × PR × RA −0.07 0.11 −0.59 80 .556 −0.02
Time × PE × PR × RA 0.02 0.12 0.15 43 .884 0.01
Time × MI × PE × PR × RA −0.08 0.23 −0.33 57 .740 −0.01
Simple Slopes
Components B SE t df p β
PR 0.04 0.04 0.26 143 .798 0.02
PE −0.02 0.04 −0.46 126 .643 −0.03
PR + MI + PE −0.05 0.04 −1.04 39 .304 −0.08
None −0.05 0.03 −1.46 196 .144 −0.09
PR + PE −0.09 0.03 −2.19 46 .034 −0.17
MI −0.10 0.04 −2.32 67 .023 −0.17
PR + MI −0.11 0.03 −3.33 331 .001 −0.20
MI + PE −0.15 0.05 −3.35 32 .002 −0.28

Note. N = 205. MI = motivational interviewing; PE = psychoeducation; PR = personalized recommendation; RA = repeated administration. Boldface indicates statistical significance with alpha of .05.

As can be seen in Table 5, the model of change attitudes had one significant higher-order interaction involving multiple component factors, specifically between time, motivational interviewing, psychoeducation, and personalized recommendations. Simple slopes of this model are plotted in Figure 3 and detailed in Table 5. For participants who received personalized recommendations but did not receive motivational interviewing or psychoeducation, there was no significant decline over time in change attitudes; in fact, the effect of time was non-significantly and very weakly positive. For other possible combinations of these components, the effect of time was either weakly and non-significantly negative or significantly negative. Thus, the effects of motivational interviewing, psychoeducation, and personalized recommendations on change attitudes depended on each other, and delivering personalized recommendations without the other two components was associated with the most favorable time trend of change attitudes.

Figure 3. Simple slopes from model of change attitudes across levels of the three-way interaction between MI, PE, and PR. The gray tab at the top of each plot denotes which of the chatbot components was delivered. Gray area around plotted line denotes range from one standard error above to below the estimated value. MI = motivational interviewing. PE = psychoeducation. PR = personalized recommendation.

As a sensitivity analysis, we re-ran the change attitudes model adjusting for whether a person had, by a given timepoint, reported service utilization; this led to no change in inferences about chatbot components. Additionally, there was no association between service utilization and change attitudes at a given time in this model, suggesting that the decline in change attitudes likely could not be attributed to service utilization; see Supplementary Materials.

Discussion

This study evaluated the individual and combined effects of four candidate chatbot components (i.e., psychoeducation, motivational interviewing, personalized recommendations, and repeated administration), which had been previously developed using user-centered design (Shah et al., 2022), on increasing mental health services use among adults who screened positive for an ED on the NEDA screen but were not in treatment. Secondary outcomes of interest included helpfulness and change attitudes.

Results indicated that 58% of the sample reported mental health services use within 3 months of screening, a far higher rate than the 16% observed in our prior work assessing mental health services use following NEDA screening (Fitzsimmons-Craft et al., 2020). (This comparison should not be taken as definitive, however, as the base rate of services use in the current study sample absent intervention is unknown.) At the same time, change attitudes (i.e., views of the importance of/readiness to change eating/shape/weight concerns) declined over time on average. Importantly, change attitudes were very high to begin with in this sample (mean baseline score of 11.42 out of 14), suggesting that change attitudes peaked at initial ED screening and decreased thereafter. It is thus also possible that this decline reflected regression to the mean.

Important findings emerged regarding the four chatbot components. First, participants who received the repeated administration component were significantly more likely to report mental health services use than those who did not. This finding aligns with prior work on the importance of reminders for increasing adherence to health-related advice (Fenerty et al., 2012; Thakkar et al., 2016). No other component had a significant effect on services use, nor were there significant interactions among components; for the goal of increasing services use, repeated administration was therefore the key component. Repeated administration also slowed the decline in change attitudes that participants experienced over time. These results highlight that the simple act of checking in with individuals on services use following ED screening and reminding them of resources can have powerful effects on both services use and change attitudes. However, this finding must be interpreted in the context of a sample with very high baseline motivation; reminder effects may differ in samples with other baseline characteristics.

Second, although the motivational interviewing component did not have a main effect on services use, participants who received it found the chatbot more helpful. Beyond indicating acceptability of this component, this finding is notable because perceived usefulness has been shown to facilitate engagement with digital mental health interventions (Borghouts et al., 2021). At the same time, receiving motivational interviewing was associated with larger declines in change attitudes over time. Although contrary to expectations, this finding is consistent with other recent work suggesting that motivational interviewing may be counterproductive when baseline motivation is high, as it was here (Bur et al., 2022; Soucy et al., 2021).

Third, although there was no main effect of personalized recommendations on services use (though the effect was in the expected direction), participants who received this component found the chatbot more helpful. Furthermore, one significant higher-order interaction on change attitudes indicated that delivering personalized recommendations without motivational interviewing or psychoeducation was associated with the most favorable time trend of change attitudes (i.e., no significant decline).

Fourth, the current analyses did not demonstrate any significant effects of psychoeducation. This result contributes to the mixed literature on psychoeducation for EDs. Some prior work has demonstrated possible benefits (Hay et al., 2007), whereas other work has suggested mental health literacy is not associated with services use in EDs (Holtzhausen et al., 2020).

A strength of this study was the ORCT design, which enabled learning about the effects of the specific chatbot components on services use and secondary outcomes. This contrasts with the traditional approach of comparing a full intervention package with a control, which would not have allowed such a nuanced understanding of discrete intervention components (Collins et al., in press). We also observed very high engagement with the chatbot: 92% of participants started the main conversation and 81% completed it. This is notable given the engagement challenges often observed with digital mental health interventions (Baumel et al., 2019; Forbes et al., 2023). Follow-up completion was also high, with 83% of participants completing one or more follow-up assessments. However, this study also had limitations. Most notably, the sample represented a very small percentage of NEDA screen respondents (D’Adamo et al., in press) and, as suggested by the high baseline change attitudes, likely a subsample particularly motivated for care and for engaging with the chatbot. Therefore, our findings may not generalize to the overall population of individuals with EDs. In addition, the sample was primarily White, non-Hispanic women, so results may not generalize to more diverse ED samples.
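The efficiency argument for the factorial ORCT design can be made concrete with a small sketch. It assumes a full 2^4 factorial with effect coding, as is typical in MOST-style optimization trials; this is an illustrative assumption, not the study's analysis code. The point is that every cell, and hence every participant, contributes to the estimate of every component's main effect.

```python
from itertools import product

# All 16 cells of a hypothetical 2^4 factorial, effect-coded
# (-1 = component off, +1 = component on).
components = ("MI", "PE", "PR", "RA")
design = [dict(zip(components, cell)) for cell in product((-1, 1), repeat=4)]
assert len(design) == 16

# Each main-effect contrast splits the 16 cells into two halves of 8,
# so the whole sample informs each of the four component effects.
for comp in components:
    on = sum(1 for cell in design if cell[comp] == +1)
    assert on == len(design) - on == 8
```

By contrast, a five-arm comparative trial (control plus one arm per component) would use only two of its five arms for each comparison, requiring a much larger total sample for the same per-component power.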

A key next step will be to leverage these findings to select an optimized chatbot intervention package (Collins et al., in press). However, making decisions about the optimized package when there are multiple outcomes of interest, as in the current study, can be challenging. For example, are the effects of motivational interviewing and personalized recommendations on secondary outcomes sufficient to justify including either or both components in the finalized package? Such dilemmas are likely best approached using decision analysis for intervention value efficiency (DAIVE), which allows multiple outcomes to be weighed in selecting the optimized intervention package (Strayhorn et al., in press). The resultant optimized intervention can then be evaluated in an RCT against the standard approach taken at the end of mental health screens (e.g., a standard list of referral options). Other critical next steps include examining baseline variables (e.g., demographics, motivation, ED diagnosis) that may inform the ideal intervention package for particular individuals and expanding the intervention options to which the chatbot can refer, particularly as highly accessible ED interventions are shown to be effective and ultimately made publicly available.

To our knowledge, Alex is the first chatbot developed with the purpose of increasing services use in individuals with EDs, and this is the first trial examining effective components within such a digital intervention. As noted, the next step will be to finalize decision-making about the optimized intervention package and then test the resultant intervention in an RCT. Future work should also focus on recruiting a sample with greater range in baseline motivation. However, given the results here, which identified a number of promising active ingredients, this tool may hold potential to increase use of services following screening—and ultimately, to address the wide treatment gap in EDs.

Supplementary Material

SUP INFO

Public Significance Statement:

Low rates of mental health services use are observed among individuals following online eating disorder screening. Scalable digital tools that can be easily paired with screening are needed to promote services utilization. Distilling such interventions to only their effective components is critical, particularly when considering disseminating an intervention at scale.

Acknowledgements and Conflicts of Interest:

This research was supported by Grants K08 MH120341 and R01 MH115128-04S1 from the National Institute of Mental Health and Grant T32 HL130357 from the National Heart, Lung, and Blood Institute. Ellen Fitzsimmons-Craft receives royalties from UpToDate, is a consultant for Kooth, and is on the Clinical Advisory Board for Beanbag Health.

Data Availability Statement:

The data that support the findings of this study are available from the corresponding author upon reasonable request.

References

  1. Ali K, Farrer L, Fassnacht DB, Gulliver A, Bauer S, & Griffiths KM (2017). Perceived barriers and facilitators towards help‐seeking for eating disorders: A systematic review. International Journal of Eating Disorders, 50(1), 9–21. 10.1002/eat.22598
  2. Ali K, Fassnacht DB, Farrer L, Rieger E, Feldhege J, Moessner M, … Bauer S (2020). What prevents young adults from seeking help? Barriers toward help‐seeking for eating disorder symptomatology. International Journal of Eating Disorders, 53(6), 894–906. 10.1002/eat.23266
  3. Baumel A, Muench F, Edan S, & Kane JM (2019). Objective user engagement with mental health apps: Systematic search and panel-based usage analysis. Journal of Medical Internet Research, 21(9), e14567. 10.2196/14567
  4. Bem DJ (1967). Self-perception: An alternative interpretation of cognitive dissonance phenomena. Psychological Review, 74(3), 183–200. 10.1037/h0024835
  5. Borghouts J, Eikey E, Mark G, De Leon C, Schueller SM, Schneider M, … Sorkin DH (2021). Barriers to and facilitators of user engagement with digital mental health interventions: Systematic review. Journal of Medical Internet Research, 23(3), e24387. 10.2196/24387
  6. Bur OT, Krieger T, Moritz S, Klein JP, & Berger T (2022). Optimizing the context of support of web-based self-help in individuals with mild to moderate depressive symptoms: A randomized full factorial trial. Behaviour Research and Therapy, 152, 104070. 10.1016/j.brat.2022.104070
  7. Collins LM (2018). Optimization of behavioral, biobehavioral, and biomedical interventions: The multiphase optimization strategy (MOST). Springer.
  8. Collins LM, Dziak JJ, Kugler KC, & Trail JB (2014). Factorial experiments: Efficient tools for evaluation of intervention components. American Journal of Preventive Medicine, 47(4), 498–504. 10.1016/j.amepre.2014.06.021
  9. Collins LM, Nahum-Shani I, Guastaferro K, Strayhorn JC, Vanness DJ, & Murphy SA (in press). Intervention optimization: A paradigm shift and its potential implications for clinical psychology. Annual Review of Clinical Psychology. 10.1146/annurev-clinpsy-080822-051119
  10. D’Adamo L, Grammer AC, Rackoff GN, Shah J, Firebaugh ML, Taylor CB, … Fitzsimmons-Craft EE (in press). Rates and correlates of study enrolment and use of a chatbot aimed to promote mental health services use for eating disorders following online screening. European Eating Disorders Review. 10.1002/erv.3082
  11. Fairburn CG, & Beglin SJ (2008). Eating Disorder Examination Questionnaire (EDE-Q 6.0). In Fairburn CG (Ed.), Cognitive behavior therapy and eating disorders (pp. 309–313). Guilford Press.
  12. Fenerty SD, West C, Davis SA, Kaplan SG, & Feldman SR (2012). The effect of reminder systems on patients’ adherence to treatment. Patient Preference and Adherence, 6, 127–135. 10.2147/PPA.S26314
  13. Festinger L (1962). Cognitive dissonance. Scientific American, 207(4), 93–107. 10.1038/scientificamerican1062-93
  14. Fey CF, Hu T, & Delios A (2023). The measurement and communication of effect sizes in management research. Management and Organization Review, 19(1), 176–197. 10.1017/mor.2022.2
  15. Fitzsimmons-Craft EE, Balantekin KN, Graham AK, DePietro B, Laing O, Firebaugh ML, … Wilfley DE (2020). Preliminary data on help-seeking intentions and behaviors of individuals completing a widely available online screen for eating disorders in the United States. International Journal of Eating Disorders, 53(9), 1556–1562. 10.1002/eat.23327
  16. Fitzsimmons-Craft EE, Balantekin KN, Graham AK, Smolar L, Park D, Mysko C, … Wilfley DE (2019). Results of disseminating an online screen for eating disorders across the U.S.: Reach, respondent characteristics, and unmet treatment need. International Journal of Eating Disorders, 52(6), 721–729. 10.1002/eat.23043
  17. Forbes A, Keleher MR, Venditto M, & DiBiasi F (2023). Assessing patient adherence to and engagement with digital interventions for depression in clinical trials: Systematic literature review. Journal of Medical Internet Research, 25, e43727. 10.2196/43727
  18. Graham AK, Trockel M, Weisman H, Fitzsimmons-Craft EE, Balantekin KN, Wilfley DE, & Taylor CB (2019a). A screening tool for detecting eating disorder risk and diagnostic symptoms among college-age women. Journal of American College Health, 67(4), 357–366. 10.1080/07448481.2018.1483936
  19. Graham AK, Wildes JE, Reddy M, Munson SA, Barr Taylor C, & Mohr DC (2019b). User‐centered design for technology‐enabled services for eating disorders. International Journal of Eating Disorders, 52(10), 1095–1107. 10.1002/eat.23130
  20. Graham JW (2009). Missing data analysis: Making it work in the real world. Annual Review of Psychology, 60(1), 549–576. 10.1146/annurev.psych.58.110405.085530
  21. Hay P, Mond J, Paxton S, Rodgers B, Darby A, & Owen C (2007). What are the effects of providing evidence‐based information on eating disorders and their treatments? A randomized controlled trial in a symptomatic community sample. Early Intervention in Psychiatry, 1(4), 316–324. 10.1111/j.1751-7893.2007.00044.x
  22. Holtzhausen N, Mannan H, Foroughi N, & Hay P (2020). Effects associated with the use of healthcare for eating disorders by women in the community: A longitudinal cohort study. BMJ Open, 10(8), e033986. 10.1136/bmjopen-2019-033986
  23. Kazdin AE, Fitzsimmons-Craft EE, & Wilfley DE (2017). Addressing critical gaps in the treatment of eating disorders. International Journal of Eating Disorders, 50(3), 170–189. 10.1002/eat.22670
  24. Klump KL, Bulik CM, Kaye WH, Treasure J, & Tyson E (2009). Academy for eating disorders position paper: Eating disorders are serious mental illnesses. International Journal of Eating Disorders, 42(2), 97–103. 10.1002/eat.20589
  25. Kreuter MW, & Wray RJ (2003). Tailored and targeted health communication: Strategies for enhancing information relevance. American Journal of Health Behavior, 27(Suppl. 3), S227–S232. 10.5993/AJHB.27.1.s3.6
  26. Lipschitz JM, Pike CK, Hogan TP, Murphy SA, & Burdick KE (2023). The engagement problem: A review of engagement with digital mental health interventions and recommendations for a path forward. Current Treatment Options in Psychiatry, 10(3), 119–135. 10.1007/s40501-023-00297-3
  27. Lundahl B, & Burke BL (2009). The effectiveness and applicability of motivational interviewing: A practice‐friendly review of four meta‐analyses. Journal of Clinical Psychology, 65(11), 1232–1245. 10.1002/jclp.20638
  28. Lyon AR, & Koerner K (2016). User‐centered design for psychosocial intervention development and implementation. Clinical Psychology: Science and Practice, 23(2), 180–200. 10.1111/cpsp.12154
  29. Miller WR, & Rollnick S (2013). Motivational interviewing: Helping people change (3rd ed.). Guilford Press.
  30. Noar SM, Benac CN, & Harris MS (2007). Does tailoring matter? Meta-analytic review of tailored print health behavior change interventions. Psychological Bulletin, 133(4), 673–693. 10.1037/0033-2909.133.4.673
  31. Pereira J, & Diaz O (2019). Using health chatbots for behavior change: A mapping study. Journal of Medical Systems, 43(5), 135. 10.1007/s10916-019-1237-1
  32. Petty RE, Cacioppo JT, & Goldman R (1981). Personal involvement as a determinant of argument-based persuasion. Journal of Personality and Social Psychology, 41(5), 847–855. 10.1037/0022-3514.41.5.847
  33. Rackoff GN, Fitzsimmons-Craft EE, Taylor CB, Wilfley DE, & Newman MG (2023). Psychotherapy utilization by United States college students. Journal of American College Health, 1–8. 10.1080/07448481.2023.2225630
  34. Rubin DB (1987). Multiple imputation for nonresponse in surveys. Wiley.
  35. Shah J, DePietro B, D’Adamo L, Firebaugh ML, Laing O, Fowler LA, … Wilfley DE (2022). Development and usability testing of a chatbot to promote mental health services use among individuals with eating disorders following screening. International Journal of Eating Disorders, 55(9), 1229–1244. 10.1002/eat.23798
  36. Singer JD, & Willett JB (2003). Applied longitudinal data analysis: Modeling change and event occurrence. Oxford University Press. 10.1093/acprof:oso/9780195152968.001.0001
  37. Soucy JN, Hadjistavropoulos HD, Karin E, Dear BF, & Titov N (2021). Brief online motivational interviewing pre-treatment intervention for enhancing internet-delivered cognitive behaviour therapy: A randomized controlled trial. Internet Interventions, 25, 100394. 10.1016/j.invent.2021.100394
  38. Strayhorn JC, Collins LM, Brick TR, Marchese SH, Pfammatter AF, Pellegrini C, & Spring B (2022). Using factorial mediation analysis to better understand the effects of interventions. Translational Behavioral Medicine, 12(1), ibab137. 10.1093/tbm/ibab137
  39. Strayhorn JC, Collins LM, & Vanness DJ (in press). A posterior expected value approach to decision-making in the multiphase optimization strategy for intervention science. Psychological Methods. 10.1037/met0000569
  40. Thakkar J, Kurup R, Laba T-L, Santo K, Thiagalingam A, Rodgers A, … Chow CK (2016). Mobile telephone text messaging for medication adherence in chronic disease: A meta-analysis. JAMA Internal Medicine, 176(3), 340–349. 10.1001/jamainternmed.2015.7667
  41. Tudor Car L, Dhinagaran DA, Kyaw BM, Kowatsch T, Joty S, Theng Y-L, & Atun R (2020). Conversational agents in health care: Scoping review and conceptual analysis. Journal of Medical Internet Research, 22(8), e17158. 10.2196/17158
  42. van Buuren S, & Groothuis-Oudshoorn K (2011). mice: Multivariate imputation by chained equations in R. Journal of Statistical Software, 45(3), 1–67. 10.18637/jss.v045.i03
  43. Wang PS, Lane M, Olfson M, Pincus HA, Wells KB, & Kessler RC (2005). Twelve-month use of mental health services in the United States: Results from the National Comorbidity Survey Replication. Archives of General Psychiatry, 62(6), 629–640. 10.1001/archpsyc.62.6.629
  44. Xu Z, Huang F, Koesters M, Staiger T, Becker T, Thornicroft G, & Ruesch N (2018). Effectiveness of interventions to promote help-seeking for mental health problems: Systematic review and meta-analysis. Psychological Medicine, 48(16), 2658–2667. 10.1017/S0033291718001265
  45. Zhang J, Oh YJ, Lange P, Yu Z, & Fukuoka Y (2020). Artificial intelligence chatbot behavior change model for designing artificial intelligence chatbots to promote physical activity and a healthy diet. Journal of Medical Internet Research, 22(9), e22845. 10.2196/22845
