Author manuscript; available in PMC: 2025 Aug 1.
Published in final edited form as: Drug Alcohol Depend. 2024 Jun 15;261:111353. doi: 10.1016/j.drugalcdep.2024.111353

Evaluation of a Digital Tool for Detecting Stress and Craving in SUD Recovery: An Observational Trial of Accuracy and Engagement

Stephanie Carreiro a, Pravitha Ramanand b, Melissa Taylor a, Rebecca Leach a, Joshua Stapp b,c, Sloke Shrestha b, David Smelson d, Premananda Indic b
PMCID: PMC11260438  NIHMSID: NIHMS2007016  PMID: 38917718

Abstract

Background:

Digital health interventions offer opportunities to expand access to substance use disorder (SUD) treatment, collect objective real-time data, and deliver just-in-time interventions; however, implementation has been limited. RAE (Realize, Analyze, Engage) Health is a digital tool that uses continuous physiologic data to detect high risk behavioral states (stress and craving) during SUD recovery.

Methods:

This was an observational study to evaluate digital stress and craving detection during outpatient SUD treatment. Participants were asked to use the RAE Health app and wear a commercial-grade wrist sensor over a 30-day period. They were asked to self-report stress and craving, at which time they were offered brief in-app de-escalation tools. Supervised machine learning algorithms were applied retrospectively to the wearable sensor data to develop group-based digital biomarkers for stress and craving. Engagement was assessed by the number of days of utilization and the number of hours of connection in a given day.

Results:

Sixty percent of participants (N=30) completed the 30-day protocol. The model detected stress and craving correctly 76% and 69% of the time, respectively, but with false positive rates of 33% and 28% respectively. All models performed close to previously validated models from a research grade sensor. Participants used the app for a mean of 14.2 days (SD 10.1) and 11.7 hours per day (SD 8.2). Anxiety disorders were associated with higher mean hours per day connected, and return to drug use events were associated with lower mean hours per day connected.

Conclusions:

Future work should explore the effect of similar digital health systems on treatment outcomes and the optimal dose of digital interventions needed to make a clinically significant impact.

Keywords: substance use disorder, digital health, craving, stress, digital biomarker


1. Introduction

Substance use disorder (SUD) is an enormous public health problem, with over 20 million people newly diagnosed in 2021 in the United States (US) (NIDA (2021)). As SUD is a chronic disease, treatment needs to be longitudinal, but also needs to foster self-sufficiency to manage day-to-day triggers. Two related triggers associated with return to drug use in SUD are stress and craving. Stress can be defined as a subjective reaction to events that are perceived as harmful, threatening, or challenging (Sinha (2001)). Stress heralds a high-risk period in recovery, and has been shown to predict risk of return to use in susceptible individuals (Back et al. (2010); Sinha et al. (2006); Sinha (2001)). Craving, defined as the subjective experience of wanting a drug, is a conscious expression of intense desire and is specific to a drug of choice (Tiffany and Wray (2012)). Craving is also strongly associated with return to substance use during treatment (Preston et al. (2018a); Tapper (2018); Tiffany and Wray (2012)), and is commonly used as a diagnostic feature, an outcome for clinical trials, and a therapeutic target for SUD (Tiffany and Wray (2012); Sayette (2016)). Furthermore, stress and craving are closely intertwined in people with SUD: stress is positively correlated with craving for opioids, cocaine and tobacco, and with return to drug use in individuals in treatment for SUD (Preston and Epstein (2011); Sinha et al. (1999); Sinha (2008); Brady and Sonne (1999); Preston et al. (2018b)).

Digital health tools offer a unique opportunity to enhance services while consuming proportionally fewer human resources and also teaching self-sufficiency. Digital health tools, which include those that use electronic devices (e.g. mobile phones, computers and sensors) to enhance health, have seen rapid expansion over the last decade. They have demonstrated benefits in numerous areas including screening and assessment of SUDs, teaching scientifically proven skills to help conquer unwanted behaviors, and providing tools to help in the day-to-day process of recovery (Marsch et al. (2020)). Digital health tools allow for ecological momentary assessment (EMA, or assessments in a person's natural environment) and ecological momentary intervention (EMI, or interventions that can be tailored to a patient's individual needs and delivered in the natural environment). Wearable sensors also allow for the collection of digital biomarkers, or end user generated, objectively measured characteristics of physiology/pathophysiology (Coravos et al. (2019)). Taken together, these create opportunities for just-in-time and just-in-space treatment, when and where patients need them most. Specific to mental health, digital health interventions have demonstrated improved outcomes: one systematic review showed improved psychological well-being and a reduction in depression and anxiety among college students (Lattie et al. (2019)), and another systematic review showed that engagement with digital health interventions is associated with therapeutic gains among mental health patients (Gan et al. (2021)). Compared to other disease states, digital interventions for SUD are still in their infancy. There are promising data that mobile app based interventions can be used in alcohol use disorder (AUD) and opioid use disorder (OUD) (Gustafson et al. (2014); Maricich et al. (2021); Nandakumar et al. (2019)), but challenges remain, including access to devices, understanding ideal candidates, optimal dosing and types of interventions, and promoting sustained engagement.

RAE Health (Realize, Analyze, Engage)(Carreiro et al. (2021)) is a digital tool which aims to support recovery by detecting high risk behavioral states (stress and craving) through continuously measured physiologic data, which can be used to trigger just-in-time adaptive interventions. The RAE system consists of 1) a wearable device, 2) a mobile app and 3) a clinician portal. Preliminary studies have shown the ability of data from research-grade wearable sensors to detect self-reported craving and stress (Carreiro et al. (2020)), and have outlined the system design (Carreiro et al. (2021)). Building on this, the present study aimed to evaluate the RAE app in a population of individuals in treatment for substance use disorder, specifically:

  1. To compare digital biomarkers of stress and craving obtained on a commercial-grade wearable device to prior models from a research grade device; and

  2. To evaluate system usage, engagement and perceptions of real-time mobile interaction around stress and craving events.

2. Materials and Methods

2.1. The RAE System

The RAE System (Figure 1) has been described in detail elsewhere (Carreiro et al. (2021)). In brief, the system consists of a wrist-worn wearable device, a mobile application, a secure, cloud-based interface, and a clinician portal. The target end-user (an individual in recovery from SUD) wears the sensor, which collects continuous physiologic data including tri-axial accelerometry, heart rate, and heart rate variability. Data are collected in five-minute segments and streamed from the sensor to the mobile app via Bluetooth connection and from the app to the cloud-based server.

Figure 1:

Overview of the RAE system. Of note, for the present study, notifications were deactivated and participants were asked to self-report episodes of stress and craving. The remainder of the functions were available to participants.

Physiological data are evaluated using three unique classification algorithms running in the RAE app based on prior data (stress vs no stress, craving vs no craving, and stress vs craving), and the complete RAE algorithm combines the results of all three models. The algorithm accepts a 9600 × 3 array of raw accelerometer data (9600 data points per axis) and calculates fifteen features from the window of data. These features are then passed as inputs to each of the three machine learning models. The final classification output of the algorithm is generated using a majority-wins rule. For instance, if both the “Stress vs No-Stress” and “Stress vs Craving” models predict 3 (Stress), the algorithm will output the classification number 3 (Stress). If there is a tie in which each of the three outcomes (Stress, No-Stress, and Craving) is predicted by one of the three models, the algorithm will output a classification of 0 (undetermined). If a stress or craving event is detected, the user receives a push notification from the mobile app asking them to confirm or deny the event. They are then offered opportunities to provide context for the situation, complete brief journaling prompts, or do a mindful breathing exercise. Visualization tools are available in the RAE app for the user and/or their treatment provider to view historical data.
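The majority-wins fusion can be illustrated with a short sketch. This is not the production RAE code: the numeric label codes (3 = Stress, 2 = Craving, 1 = No-Stress/No-Craving, 0 = undetermined) and function name are assumptions chosen only to mirror the description above.

```python
from collections import Counter

STRESS, CRAVING, NEITHER, UNDETERMINED = 3, 2, 1, 0  # assumed label codes

def fuse_predictions(stress_vs_none, craving_vs_none, stress_vs_craving):
    """Combine the three pairwise model outputs with a majority-wins rule."""
    votes = Counter([stress_vs_none, craving_vs_none, stress_vs_craving])
    label, count = votes.most_common(1)[0]
    # A label must win at least two of the three votes; a three-way tie is undetermined.
    return label if count >= 2 else UNDETERMINED

# Both models that can vote "Stress" do so -> output 3 (Stress).
print(fuse_predictions(STRESS, NEITHER, STRESS))    # 3
# Each model votes for a different outcome -> output 0 (undetermined).
print(fuse_predictions(STRESS, NEITHER, CRAVING))   # 0
```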

RAE’s initial algorithms were designed using a research grade device (Empatica E4, Empatica, Milan, Italy (Empatica (2023))). However, preliminary usability studies indicated that to facilitate long term wear, end-users in our population require standard smart watch features (e.g. fitness tracking capabilities, touch screen, clock, streamlined aesthetic), and would ultimately prefer to use an off-the-shelf device with RAE. To move toward these goals, we studied RAE with the Garmin Vivosmart 4 (Garmin, Olathe, KS (Garmin (2023))), a commercially available fitness tracking device. The Vivosmart (Figure 1, upper right corner) is 15 × 10.5 × 197 mm in size and waterproof, with a touch screen interface and a 7-day battery life. Sensors on the Vivosmart 4 include a barometric altimeter, tri-axial accelerometer, photoplethysmography (PPG) sensor for heart rate/heart rate variability, and pulse oximeter.

For the present study, the goal was to develop a new model on the Garmin data. The craving and stress algorithms described above ran in the background; however, notifications were not pushed through the app. Participants were instead asked to self-report episodes of stress or craving. Once an event was self-reported, the app's normal functionalities were triggered (as they would be for an auto-detected event): these included an offer to contact a pre-designated support person or an offer to engage with de-escalation tools (a brief journaling prompt and/or breathing exercise).

2.2. Overall Study Design

All study related protocols were approved by the University of Massachusetts Chan Medical School Institutional Review Board. Study participants were recruited from 10 outpatient SUD treatment facilities in the United States. Staff at the treatment centers were briefed on the research study and were asked to provide study contact information to interested clients. Once eligibility was confirmed, informed consent was obtained. Several steps were taken to maintain participants’ confidentiality throughout the study. Participants were identified with a study ID and all data (including data from the RAE app) was labeled with the study ID only (no identifiable information).

Study visits were conducted in person (for local participants prior to March 2020) or via a secure video teleconference platform (Zoom). During the baseline visit, information regarding demographics, past medical and mental health history, substance use history, and current medications was collected. Participants also received a standardized training session on the technology (the RAE app and the Garmin sensor), which covered functionality and optimal use. Participants were asked to wear the sensor on their non-dominant arm and to have their smartphone within Bluetooth range with the RAE app open at all times for 30 days, except while charging the device. The algorithms described above ran in the background; however, because the goal was to develop a new model on the Garmin data, notifications were not pushed through the app. Participants were instead asked to self-report episodes of stress or craving.

Participants were asked to complete two check-in calls at day 10 and day 20 to ensure technology functionality and to troubleshoot any questions or issues they were experiencing. Day 30 marked the completion of the protocol, at which point the participant was asked to complete standardized questionnaires and a semi-structured exit interview to gain perspectives on the feasibility and usability of the RAE system. Of note, qualitative analysis of the semi-structured interview data is extensive and outside the scope of this manuscript, and will be reported separately. Participants were compensated with a $50 gift card to a retail store: $25 was provided upon enrollment and $25 upon completion of the study.

2.3. Setting & Recruitment

Initially, recruitment was limited to a single local outpatient SUD treatment center. The study enrollment commenced just prior to the beginning of the COVID-19 pandemic in the early months of 2020, which significantly impacted many aspects of the study including recruitment, retention rate, and protocol execution. The intake and exit visits of the protocol were initially intended to be administered face-to-face, with check-ins occurring via phone. However, due to pandemic related restrictions, the protocol was converted to a virtual format, occurring via telephone or secure videoconferencing platform. By converting our recruitment methods to a virtual protocol, we were able to expand recruitment to outpatient SUD treatment centers throughout the United States. Treatment center types included intensive outpatient (IOP) programs, programs for healthcare professionals in recovery, and a drug court program.

2.4. Inclusion and Exclusion Criteria

We recruited individuals who: 1) were 18 years of age or older at the time of consent, 2) were enrolled in an outpatient treatment program for an SUD, 3) were fluent in English, 4) had access to a smartphone with iOS or Android capabilities, and 5) were capable of providing written informed consent. We excluded individuals who: 1) were pregnant, or 2) had significant limitation of motion of the non-dominant arm (e.g. amputation or fracture). We excluded patients under the age of 18, as they may have age-related physiologic features that affect digital biomarkers. We excluded pregnant individuals because they represent a special population that is routinely excluded from similar research studies. Additionally, physiologic changes due to pregnancy necessitate separate, focused studies for digital biomarkers. Finally, we excluded individuals with a non-dominant arm amputation or fracture because locomotion activities vary with laterality, and prior studies have (by convention) been done on the non-dominant arm.

2.5. Data Collection

2.5.1. Physiologic Data Collection and Processing

Throughout the study period, the Garmin sensor continuously recorded physiologic parameters. Time series raw physiologic data from the Garmin Vivosmart 4 were downloaded from the RAE cloud-based server and labeled based on participant in-app annotation as stress, craving, or neither. Data were cleaned to remove nonnumeric values, empty values, values outside of physiologic range, and opcode values based on the software platform. Each segment corresponding to an annotated event was subjected to a sliding window operation. A five-minute window was slid through the segment one step at a time, with each step incrementing by one minute of data. Features were estimated at each window and averaged over all available values to provide the final estimate for each event. A more detailed description of this process can be found in our previously described protocol (Carreiro et al. (2020); Shrestha et al. (2023)).
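A minimal sketch of this sliding-window feature estimation is shown below. It is illustrative only: the sampling rate, column names, and the handful of example features are assumptions, not the study's full feature set.

```python
import numpy as np
import pandas as pd

FS_HZ = 32                   # assumed accelerometer sampling rate (Hz)
WINDOW_S, STEP_S = 300, 60   # five-minute window, one-minute step

def window_features(window: pd.DataFrame) -> dict:
    """A few example per-window features; the study used 15+ features (see text)."""
    acc_mag = np.sqrt(window[["ax", "ay", "az"]].pow(2).sum(axis=1))
    return {
        "acc_mag_mean": acc_mag.mean(),
        "acc_mag_upper_q": acc_mag.quantile(0.75),
        "acc_mag_lower_q": acc_mag.quantile(0.25),
        "hr_mean": window["hr"].mean(),
        "hr_std": window["hr"].std(),
    }

def event_features(segment: pd.DataFrame) -> pd.Series:
    """Slide a five-minute window through one annotated event in one-minute steps
    and average the per-window features to get the final estimate for the event."""
    win, step = WINDOW_S * FS_HZ, STEP_S * FS_HZ
    rows = [window_features(segment.iloc[start:start + win])
            for start in range(0, len(segment) - win + 1, step)]
    return pd.DataFrame(rows).mean()
```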

2.5.2. Non-Physiologic Data Collection from Participants

Demographic and historical data were collected via an intake interview. These data included basic demographics, phone and operating system information, current medications, and self-reported past medical, mental health, and substance use history. Age was operationalized as a continuous variable and as a categorical variable with three groups: <40 years, 40–60 years, and 60+ years old. The presence or absence of mental health diagnoses was based on self-report. Throughout the study period, data on system engagement (defined as the number of hours per day of active data collection and the number of days using the app) were collected. Participant-facing electronic surveys were administered at the day 10 and day 20 check-ins and at the exit interview to assess perceptions of interactions with the app, including barriers to use and frequency of stress and craving. Self-report of return to drug use was assessed with each survey, and responses were supplemented with toxicological testing (e.g. urine drug screen) performed as part of the treatment program, when available. Return to drug use was defined as self-reported use of the substance for which the participant was receiving treatment (excluding tobacco) OR a positive urine drug screen.

2.5.3. Digital Biomarker Data and Machine Learning

Machine learning models were developed to solve three classification problems: stress vs no-stress, craving vs no craving, and stress vs craving. Models were built retrospectively for the entire sample upon completion of data collection using data obtained from the Garmin sensor. Twenty-five classification models were tested from the following categories: decision trees, discriminant analysis, logistic regression, Naive Bayes classifiers, support vector machines, nearest neighbor classifiers, and ensemble classifiers, using the MATLAB Classification Learner app. All models were constructed to use the labeled Garmin sensor data to solve the three original classification problems: stress vs no stress, craving vs no craving, and stress vs craving. The models considered the 15 features from our initial algorithm plus additional new features from the sensor (i.e. upper and lower quartiles of accelerometer data, the magnitude of acceleration, the angle between maximum and minimum of the acceleration signal, the orientation of acceleration, and descriptive statistics from heart rate data), which improved our predictive capability. The data set was subjected to 10-fold cross validation (separated into 10 folds, trained on nine folds, and tested on the remaining fold until each fold had been used as a test set) to avoid overfitting the models. The 10-fold cross validation was based on events from all participants, with the goal of developing generalized algorithms across participants. Each classification algorithm was evaluated with the standard metric of area under the curve (AUC) of the receiver operating characteristic (ROC) curve. All machine learning analyses were conducted in MATLAB (Version 2020b, MathWorks, Natick, MA).
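As an illustration of the cross-validated evaluation (the actual analysis used MATLAB's Classification Learner app), the sketch below shows 10-fold cross-validation of a bagged-tree classifier scored by ROC AUC; the scikit-learn model choice and variable names are assumptions for illustration.

```python
from sklearn.ensemble import BaggingClassifier
from sklearn.tree import DecisionTreeClassifier
from sklearn.model_selection import StratifiedKFold, cross_val_score

def evaluate_stress_model(X, y):
    """X: event-level feature matrix; y: 1 = stress, 0 = no stress."""
    model = BaggingClassifier(DecisionTreeClassifier(), n_estimators=30)  # ensemble bagged trees
    cv = StratifiedKFold(n_splits=10, shuffle=True, random_state=0)       # 10-fold cross validation
    aucs = cross_val_score(model, X, y, cv=cv, scoring="roc_auc")
    return aucs.mean()  # mean ROC AUC across the 10 held-out folds
```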

For comparison, we also mapped the Garmin Vivosmart 4 datastream to the datastream from the research grade sensor which we used in our original study (Empatica E4, Empatica, Milan Italy), and ran the same classification problems using our original models. A detailed description of this experiment is outside the scope of this paper, and can be found elsewhere (Shrestha et al. (2023)).

2.6. Analysis of Non-physiologic Data

Descriptive statistics were calculated for demographic variables and usage metrics (e.g. hours and days of use). All available participant data up to the point of study attrition were included, regardless of protocol completion status. Outcomes of interest with regard to engagement included overall app use and daily app use while on study. Overall app use was defined as the number of days on which the RAE app was connected to the sensor, opened, and engaged with at least once. High utilizers were defined as participants with ≥ 15 days of app use (50% of the study period), and low utilizers were those with < 15 days of app use. Daily app use was defined as the number of hours in a given study day that the RAE app and sensor were on and connected. We also measured the percent of days where sensor/app utilization was high (≥ 11 hours) compared to low (< 11 hours), based on prior data on mean daily smartwatch use (Beukenhorst et al. (2020); Wasserlauf et al. (2019); Carreiro et al. (2020)). Overall and daily usage outcomes were compared based on subgroups of interest (including sex, age, phone operating system (OS) type, treatment facility type, psychiatric comorbidities, and day of the week). Hypothesis testing was performed to compare engagement metrics across subgroups: for normally distributed variables, Student's t-test (binary variables) and ANOVA (more than two groups) were used, and for non-normally distributed variables, Wilcoxon rank sum (binary variables) and Kruskal–Wallis H (more than two groups) were used. All statistical analyses were conducted using R Statistical Software (v4.1.2, R Core Team (2021)).
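The subgroup comparison logic can be sketched as below. The published analysis was performed in R; this Python version is illustrative only, and the use of a Shapiro-Wilk check as the normality test is an assumption not stated in the text.

```python
from scipy import stats

def compare_two_groups(hours_a, hours_b, alpha=0.05):
    """Compare daily hours of connection between two subgroups, choosing the test
    based on an (assumed) Shapiro-Wilk normality check of each group."""
    normal = all(stats.shapiro(g).pvalue > alpha for g in (hours_a, hours_b))
    if normal:
        return "Student's t-test", stats.ttest_ind(hours_a, hours_b).pvalue
    return "Wilcoxon rank-sum", stats.ranksums(hours_a, hours_b).pvalue
```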

3. Results

3.1. Sample Characteristics

Enrollment occurred from December 2019 to May 2021. Sixty-four individuals were screened and 60 were consented to participate. Of the 50 participants who started using the app, 30 (60%) completed the 30-day protocol. A detailed breakdown of enrollment is shown in Figure 2. A particularly high attrition rate was noted at sites that served a lower socioeconomic status (SES) population (43% protocol completion among participants from public/subsidized SUD treatment sites, compared to 88% at private SUD treatment sites) and among participants using phones with the Android operating system (75% protocol completion in iOS users versus 36% in Android users).

Figure 2:

Study participant enrollment

Demographic characteristics of the study sample (N = 60), broken down by protocol completion status, are presented in Table 1. The sample had a mean age of 41.5 years, was 40% female, and 85% identified as white. Forty-eight percent of participants reported less than 6 months of sobriety at the time of enrollment. Participants were enrolled from 10 treatment programs in seven US states. The most common SUDs that participants were in treatment for were alcohol use disorder (67%), opioid use disorder (33%), cocaine use disorder (11%) and methamphetamine use disorder (8.3%). Co-occurring psychiatric diagnoses were common (60% of participants), and 15% had a history of chronic pain. Ninety-four percent of participants who completed the study were still in treatment at the end of the study, and 20% had at least one return to drug use during the study period.

Table 1:

Participant Demographics by Protocol Status

Complete Incomplete Never Started App Overall
(N=30) (N=20) (N=10) (N=60)
Age
 Mean (SD) 41.7 (9.46) 39.5 (13.4) 45.4 (12.6) 41.5 (11.2)
Sex
 Male 22 (73.3%) 8 (40.0%) 6 (60.0%) 36 (60.0%)
 Female 8 (26.7%) 12 (60.0%) 4 (40.0%) 24 (40.0%)
Race
 Asian 1 (3.3%) 0 (0%) 0 (0%) 1 (1.7%)
 Black or African American 0 (0%) 0 (0%) 2 (20.0%) 2 (3.3%)
 White 28 (93.3%) 17 (85.0%) 6 (60.0%) 51 (85.0%)
 Other 0 (0%) 2 (10.0%) 0 (0%) 2 (3.3%)
Latino/a 2 (6.7%) 1 (5.0%) 0 (0%) 3 (5.0%)
Fitzpatrick Skin Tone
 Type 1 1 (3.3%) 1 (5.0%) 0 (0%) 2 (3.3%)
 Type 2 4 (13.3%) 3 (15.0%) 0 (0%) 7 (11.7%)
 Type 3 10 (33.3%) 5 (25.0%) 1 (10.0%) 16 (26.7%)
 Type 4 0 (0%) 3 (15.0%) 2 (20.0%) 5 (8.3%)
 Type 5 2 (6.7%) 0 (0%) 0 (0%) 2 (3.3%)
 Type 6 0 (0%) 0 (0%) 2 (20.0%) 2 (3.3%)
Time in Treatment
 0–30 days 8 (26.7%) 3 (15.0%) 0 (0%) 11 (18.3%)
 1–3 months 7 (23.3%) 5 (25.0%) 2 (20.0%) 14 (23.3%)
 3–6 months 1 (3.3%) 2 (10.0%) 1 (10.0%) 4 (6.7%)
 6 months-1 year 1 (3.3%) 6 (30.0%) 1 (10.0%) 8 (13.3%)
 1–5 years 4 (13.3%) 0 (0%) 1 (10.0%) 5 (8.3%)
 5+ years 3 (10.0%) 0 (0%) 0 (0%) 3 (5.0%)
Phone OS
 iOS 23 (76.7%) 10 (50.0%) 3 (30.0%) 36 (60.0%)
 Android 5 (16.7%) 9 (45.0%) 6 (60.0%) 20 (33.3%)
Treatment Facility Type
 Private 29 (96.7%) 14 (70.0%) 8 (80.0%) 51 (85.0%)
 Subsidized/Public 1 (3.3%) 6 (30.0%) 2 (20.0%) 9 (15.0%)
Treatment Substance
 Alcohol 24 (80.0%) 11 (55.0%) 5 (50.0%) 40 (66.7%)
 Opioids 7 (23.3%) 9 (45.0%) 4 (40.0%) 20 (33.3%)
 Cocaine 1 (3.3%) 3 (15.0%) 3 (30.0%) 7 (11.7%)
 Methamphetamine 1 (3.3%) 4 (20.0%) 0 (0%) 5 (8.3%)
Psychiatric Diagnosis
 Any 18 (60.0%) 13 (65.0%) 5 (50.0%) 36 (60.0%)
 Anxiety 16 (53.3%) 12 (60.0%) 4 (40.0%) 32 (53.3%)
 Depression 13 (43.3%) 12 (60.0%) 4 (40.0%) 29 (48.3%)
 PTSD 4 (13.3%) 4 (20.0%) 3 (30.0%) 11 (18.3%)
 ADHD 2 (6.7%) 1 (5.0%) 2 (20.0%) 5 (8.3%)
 Bipolar 0 (0%) 1 (5.0%) 2 (20.0%) 3 (5.0%)
 Schizophrenia 0 (0%) 0 (0%) 0 (0%) 0 (0%)

3.2. Digital Biomarker Performance

After cleaning as described above, 4359 hours of sensor data with 8873 annotated stress events and 186 annotated craving events were analyzed. The best performing model for stress detection was an ensemble bagged tree classifier (ROC AUC 0.78, accuracy 71.6%, Figure 3). The best performing model for craving detection was a Gaussian SVM model (ROC AUC 0.74, accuracy 71%, Figure 4). The best performing model for differentiating stress from craving was an ensemble bagged tree model (ROC AUC 0.75, accuracy 70%, Figure 5). For comparison, the performance of the previous algorithm using a research grade sensor (Carreiro et al. (2020)) is provided in Table 2. The accuracy of detection from the commercial sensor data was similar to that previously derived from the research grade device, with the most substantial difference in the stress vs craving model.

Figure 3:

Stress Detection: Confusion Matrix and ROC Curve

Figure 4:

Craving Detection: Confusion Matrix and ROC Curve

Figure 5:

Stress vs Craving Detection: Confusion Matrix and ROC Curve

Table 2:

Machine Learning Model Metrics

Classifier Sensor Best Performing Model Accuracy (%) ROC AUC
Stress Commercial Device Ensemble bagged tree classifier 71.6 0.78
Stress Research-grade device Fine Gaussian SVM model 74.5 0.81
Craving Commercial Device Gaussian SVM 71.0 0.74
Craving Research-grade device Fine Gaussian SVM model 75.7 0.82
Stress vs Craving Commercial Device Ensemble bagged tree 70.0 0.75
Stress vs Craving Research-grade device Fine Gaussian SVM model 76.8 0.82

3.3. System Usage and Engagement

Participants used the app for a mean of 14.2 days (range 1–36, SD 10.1) over the study period. Forty-four percent of participants (N=22) fell into the high use category (≥ 15 days of app use). When evaluated by subgroup, overall usage was not statistically different based on sex, time in treatment, age or facility type.

When evaluating daily usage, participants maintained the app and sensor connection for a mean of 11.7 hours per day (range 0–23.9, SD 8.2), with 54.0% of participant-days falling into the high use category (≥ 11 hours of connection). Participants with an anxiety diagnosis had significantly higher mean hours per day than those without (13.6 vs 9.2 hours, p < 0.001), and iOS users had significantly higher mean hours per day than Android users (13.6 vs 9.7, p < 0.001). Participants who had at least one return to use event (defined as use of the substance they were in treatment for) had significantly lower mean hours per day than those who did not (6.0 vs 12.2, p < 0.001). There were no significant differences in mean daily use based on sex or age group. Trajectories of use over the 30-day study period (overall and for subgroups with significant differences) are shown in Figure 6.

Figure 6:

Daily Use of RAE: Overall and by subgroups

Trend lines represent daily mean values, the gray shaded band represents the 95% confidence interval using locally-weighted scatterplot smoothing (LOESS), and black dashed lines denote 11 hours of use.

3.4. Perceptions on stress, craving and RAE use

Fifty-six percent of respondents reported wearing the sensor all or most of the day, and 75% agreed or strongly agreed that using RAE was helpful in their recovery. Reported barriers to sensor use were infrequent, and included the need for charging (2%) and difficulty pairing with the phone (8%). Reported barriers to app use (that occurred at least once during the study period) included impact on phone battery (8%), difficulty opening/logging into app (6%), and difficulty keeping the app open in the background (9%). No participants indicated discomfort from the sensor as a barrier.

Perceived frequency of stress varied: most participants reported experiencing stress daily (53%) or weekly (26%), while some participants reported experiencing stress only monthly (20%). Eighty-eight percent of participants reported logging all or most of their stress events during the study period. The most cited reasons for not logging included not having immediate access to their phone (33%), forgetting (30%), the logging process taking too long (6%), or other (30%, including stress occurring at times when logging was inconvenient, such as while driving or working).

Perceived frequency of craving was lower and more variable than that for stress. A subset of participants reported experiencing craving daily (6%) or weekly (13%); however, some participants reported experiencing craving only monthly (56%), and 25% reported never having cravings. Forty-two percent of participants reported having at least one craving event during the study period; 55% of them reported logging all or most of their craving events. Of those who reported having cravings, 65% said cravings decreased over time with treatment, while 29% reported no change in craving level and 6% reported an increase in craving over time in treatment. The most cited reasons for not logging craving included not having immediate access to their phone (25%), forgetting to log (25%), the logging process taking too long (6%), or other (44%, including craving occurring at times when logging was inconvenient, such as while driving or working).

4. Discussion

Thirty participants (60% of the sample) from US-based SUD treatment centers completed a 30-day protocol using the RAE Health app: mean use of the app was 14.2 out of 30 days, and 11.7 hours per day. Our overall attrition was higher than anticipated, but on par with prior longitudinal studies using mobile health apps in SUD populations (Johnston et al. (2019); Gordon et al. (2017)). Stress and craving detection using an off-the-shelf wearable demonstrated comparable model performance to prior versions using research grade devices.

In order to be successful, digital health tools need to leverage (and be integrated into) the target users' existing technology routines. The validation of our previously developed algorithms on a commercially available sensor is a substantial advantage. People want devices that look and feel like other commonly used technology and that provide functionality commonly found in smartwatches, such as clock/alarm function, step/activity tracking, etc (Carreiro et al. (2020); Chapman et al. (2022)). They also often have their own devices and would prefer digital health tools that integrate into them (Chapman et al. (2022); Onnela (2021)). Integrating an off-the-shelf fitness device is a first step toward making RAE Health (and other systems/algorithms) device agnostic.

The digital divide is defined as the discrepancy in access to information and technologies across lines of traditional social determinants (Lythreatis et al. (2022); Cheshmehzangi et al. (2022); Mathrani et al. (2022)). Although most often discussed alongside socioeconomic boundaries, attention has recently been paid to age, race and sex-related digital divides as well as the intersectionality between these variables (Charness and Boot (2022); Acilar and Sæbø (2023); Whiteside and Beavers (2022)). Arguments can be made for digital health interventions as a solution to healthcare inequalities, or a perpetrator of them (Piers et al. (2023); Yao et al. (2022)). In our study sample, we noted high attrition among participants from public/subsidized treatment centers, particularly those with Android phones. In most cases, these participants had phones with relatively simple and inexpensive operating systems (e.g. Android Go). Although affordable entry-level smartphones are highly advantageous for increasing access, they have limitations, including less memory and computing power. This created a dampened user experience, which we hypothesize contributed to higher attrition in this user group. Specifically, participants with the Android Go operating system experienced more frequent connectivity issues and increased lags in app loading. Importantly, these technological barriers overlap substantially with other social determinants of health (poverty, housing insecurity, food insecurity, digital literacy). These are key considerations for developers of apps and other mobile health tools targeted at this population. In order to avoid furthering the digital divide, the computational power of various levels of smartphones needs to be considered in design, products that are entry-level-phone friendly need to be available, and assumptions regarding availability of resources or knowledge need to be carefully examined.

Understanding how people use and interact with digital health applications is an important first step in defining an effective dose for the digital intervention (McVay et al. (2019)). Overall, participants stayed connected to RAE for a mean of 11.7 hours per day over the 30-day study period, which is in line with other work on smartwatch based digital health monitoring apps (Beukenhorst et al. (2020)). Higher utilization by participants with anxiety disorder was an unexpected finding. We also found lower utilization among participants who returned to drug use. As the present study was not designed to detect an effect on outcomes, we cannot definitively conclude whether RAE use is protective and/or if disengagement is predictive of return to use. However, this promising finding will be better elucidated in our ongoing clinical trial.

We also noticed a discrepancy between participants' reported level of engagement (e.g. reporting wearing the sensor all or most of the day) and measured hours of data capture. Inadvertent loss of connectivity between the app and device was identified as a potential contributing factor. In order for data to flow from the sensor to the app, the app has to be open in the background on the phone and the sensor has to be connected to the app via Bluetooth. This created long stretches of time where participants thought they were using the app only to find that they were not. This was resolved with several app design updates, including a connectivity indicator icon and push notifications when the app and sensor were disconnected for ≥ 15 minutes. A second identified cause of disengagement was prohibitively high battery utilization for a subset of app users. This was more common with older model and Android phones, and has been resolved in a more recent release of the app.
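A minimal sketch of the disconnect-notification logic described above (not the RAE implementation; the class, names, and polling approach are assumptions for illustration):

```python
import time

DISCONNECT_THRESHOLD_S = 15 * 60  # notify after >= 15 minutes without sensor data

class ConnectivityMonitor:
    """Tracks the time of the last received sensor packet and raises one
    notification per disconnection episode."""

    def __init__(self):
        self.last_sample_time = time.time()
        self.notified = False

    def on_sensor_data(self):
        # Called whenever a data packet arrives over Bluetooth.
        self.last_sample_time = time.time()
        self.notified = False  # a future disconnection should trigger a new alert

    def check(self, send_push_notification):
        # Called periodically (e.g. every minute) by a background task.
        if (time.time() - self.last_sample_time >= DISCONNECT_THRESHOLD_S
                and not self.notified):
            send_push_notification("RAE has lost connection to your sensor.")
            self.notified = True
```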

Our results should be considered in the context of several limitations. Virtual recruitment during the early months of the pandemic was both an opportunity and a challenge. It allowed us to expand our recruitment beyond our geographic area, which broadened our sample. But as the general public was still becoming familiar with the routine use of video conferencing platforms (e.g. Zoom), this posed a challenge for communication and for learning the technology. In other words, it was difficult to teach someone a new technology (RAE Health) using another technology that was new to them (Zoom), despite the fact that both are fairly simple. This created frustration for some users and led to increased attrition in the earlier phases of the study; N=10 users consented but never started the app, largely due to frustration with Zoom-based training. Another major limitation to note in the present study is the limited racial and ethnic diversity in our sample, which limits generalizability. Racial and ethnic factors may play a role in both the accuracy of physiologically-derived measurements (digital biomarkers) and engagement with the app, and will need to be explored in future larger studies.

There are also important limitations in our model development and validation strategy. The event detection models in this study were evaluated using cross validation by randomly splitting the available events into train and test data rather than by assigning participants to training and validation folds. The disadvantage of this approach is that the models may not be optimized for data from new participants entering the study; however, these models will provide accurate detection of events during long term monitoring of the same set of subjects.

Despite these limitations, this study provides several important insights into the use of the RAE system and digital health tools for SUD in general. Based on this experience, multiple new features have been incorporated into the system, including connectivity confirmations, disconnection notifications, engagement trackers, and visualizations for analysis of stress and craving trends. A randomized controlled trial is currently underway to test the impact of RAE on clinical and psychosocial outcomes for individuals in recovery from SUD (ClinicalTrials.gov: NCT05227339). A streamlined version of the RAE app that will allow a seamless experience on entry-level smartphones is in beta-testing. Other important future research directions include personalization of algorithms, risk stratification, and risk level prediction.

5. Conclusions

The accuracy of stress and craving detection (digital biomarker detection) on a commercially available wrist sensor was comparable to algorithms previously described using research-grade sensors. More than half of the participants used the RAE system for 30 days: individuals with anxiety, those who use iOS, and those who did not have a return to use maintained higher engagement over time. Future work in this area should explore the effect of similar digital health systems on treatment outcomes and the optimal dose of digital interventions needed to make a clinically significant impact.

Highlights.

  • Digital biomarkers of stress and craving are detectable using commercially available wearable sensors.

  • Individuals in recovery from SUD engaged with our digital tool for approximately 50% of study days, and for a mean of 11.7 hours per use day.

  • Participants with anxiety disorders and those that did not have any return to drug use events showed the highest levels of engagement.

Acknowledgements

This work was generously supported by National Institutes of Health/National Institute on Drug Abuse through a Small Business Innovation Research award (R44DA046151, MPI: Carreiro, Indic). We would like to acknowledge our collaborators at each participating treatment site, and our participants who generously provided their time and feedback to this study. We would also like to thank Nirzari Kapadia for her feedback on the manuscript draft.

Funding

Funding for this study was provided by the National Institutes of Health/National Institute on Drug Abuse through a Small Business Innovation Research award (R44DA046151, MPI: Carreiro, Indic).

Footnotes

Publisher's Disclaimer: This is a PDF file of an unedited manuscript that has been accepted for publication. As a service to our customers we are providing this early version of the manuscript. The manuscript will undergo copyediting, typesetting, and review of the resulting proof before it is published in its final form. Please note that during the production process errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal pertain.

Conflict of Interest

JS is employed by RAE Health. SC and PI are academic partners with RAE Health on two Small Business Innovation Research awards.

References

  1. Acilar A, Sæbø Ø, 2023. Towards understanding the gender digital divide: A systematic literature review. Global knowledge, memory and communication 72, 233–249. [Google Scholar]
  2. Back SE, Hartwell K, DeSantis SM, Saladin M, McRae-Clark AL, Price KL, Moran-Santa Maria MM, Baker NL, Spratt E, Kreek MJ, et al. , 2010. Reactivity to laboratory stress provocation predicts relapse to cocaine. Drug and alcohol dependence 106, 21–27. [DOI] [PMC free article] [PubMed] [Google Scholar]
  3. Beukenhorst AL, Howells K, Cook L, McBeth J, O’Neill TW, Parkes MJ, Sanders C, Sergeant JC, Weihrich KS, Dixon WG, 2020. Engagement and participant experiences with consumer smartwatches for health research: longitudinal, observational feasibility study. JMIR mHealth and uHealth 8, e14368. [DOI] [PMC free article] [PubMed] [Google Scholar]
  4. Brady KT, Sonne SC, 1999. The role of stress in alcohol use, alcoholism treatment, and relapse. Alcohol Research & Health 23, 263. [PMC free article] [PubMed] [Google Scholar]
  5. Carreiro S, Chintha KK, Shrestha S, Chapman B, Smelson D, Indic P, 2020. Wearable sensor-based detection of stress and craving in patients during treatment for substance use disorder: A mixed methods pilot study. Drug and alcohol dependence 209, 107929. [DOI] [PMC free article] [PubMed] [Google Scholar]
  6. Carreiro S, Taylor M, Shrestha S, Reinhardt M, Gilbertson N, Indic P, 2021. Realize, analyze, engage (rae): A digital tool to support recovery from substance use disorder. Journal of psychiatry and brain science 6. [DOI] [PMC free article] [PubMed] [Google Scholar]
  7. Chapman BP, Lucey E, Boyer EW, Babu KM, Smelson D, Carreiro S, 2022. Perceptions on wearable sensor-based interventions for monitoring of opioid therapy: A qualitative study. Frontiers in Digital Health, 212. [DOI] [PMC free article] [PubMed] [Google Scholar]
  8. Charness N, Boot WR, 2022. A grand challenge for psychology: reducing the age-related digital divide. Current directions in psychological science 31, 187–193. [DOI] [PMC free article] [PubMed] [Google Scholar]
  9. Cheshmehzangi A, Zou T, Su Z, 2022. The digital divide impacts on mental health during the covid-19 pandemic. Brain, Behavior, and Immunity 101, 211–213. [DOI] [PMC free article] [PubMed] [Google Scholar]
  10. Coravos A, Khozin S, Mandl KD, 2019. Developing and adopting safe and effective digital biomarkers to improve patient outcomes. NPJ digital medicine 2, 14. [DOI] [PMC free article] [PubMed] [Google Scholar]
  11. Empatica, 2023. E4 wristband. URL: https://www.empatica.com/research/e4/.
  12. Gan DZ, McGillivray L, Han J, Christensen H, Torok M, 2021. Effect of engagement with digital interventions on mental health outcomes: a systematic review and meta-analysis. Frontiers in Digital Health, 150. [DOI] [PMC free article] [PubMed] [Google Scholar]
  13. Garmin, 2023. Vívosmart® 4. URL: https://www.garmin.com/en-US/p/605739specs.
  14. Gordon JS, Armin J, Hingle MD, Giacobbi P Jr, Cunningham JK, Johnson T, Abbate K, Howe CL, Roe DJ, 2017. Development and evaluation of the see me smoke-free multi-behavioral mhealth app for women smokers. Translational behavioral medicine 7, 172–184. [DOI] [PMC free article] [PubMed] [Google Scholar]
  15. Gustafson DH, McTavish FM, Chih MY, Atwood AK, Johnson RA, Boyle MG, Levy MS, Driscoll H, Chisholm SM, Dillenburg L, et al. , 2014. A smartphone application to support recovery from alcoholism: a randomized clinical trial. JAMA psychiatry 71, 566–572. [DOI] [PMC free article] [PubMed] [Google Scholar]
  16. Johnston DC, Mathews WD, Maus A, Gustafson DH, 2019. Using smartphones to improve treatment retention among impoverished substance-using appalachian women: a naturalistic study. Substance Abuse: Research and Treatment 13, 1178221819861377. [DOI] [PMC free article] [PubMed] [Google Scholar]
  17. Lattie EG, Adkins EC, Winquist N, Stiles-Shields C, Wafford QE, Graham AK, 2019. Digital mental health interventions for depression, anxiety, and enhancement of psychological well-being among college students: systematic review. Journal of medical Internet research 21, e12869. [DOI] [PMC free article] [PubMed] [Google Scholar]
  18. Lythreatis S, Singh SK, El-Kassar AN, 2022. The digital divide: A review and future research agenda. Technological Forecasting and Social Change 175, 121359. [Google Scholar]
  19. Maricich YA, Xiong X, Gerwien R, Kuo A, Velez F, Imbert B, Boyer K, Luderer HF, Braun S, Williams K, 2021. Real-world evidence for a prescription digital therapeutic to treat opioid use disorder. Current Medical Research and Opinion 37, 175–183. [DOI] [PubMed] [Google Scholar]
  20. Marsch LA, Campbell A, Campbell C, Chen CH, Ertin E, Ghitza U, Lambert-Harris C, Hassanpour S, Holtyn AF, Hser YI, et al. , 2020. The application of digital health to the assessment and treatment of substance use disorders: The past, current, and future role of the national drug abuse treatment clinical trials network. Journal of substance abuse treatment 112, 4–11. [DOI] [PMC free article] [PubMed] [Google Scholar]
  21. Mathrani A, Sarvesh T, Umer R, 2022. Digital divide framework: online learning in developing countries during the covid-19 lockdown. Globalisation, Societies and Education 20, 625–640. [Google Scholar]
  22. McVay MA, Bennett GG, Steinberg D, Voils CI, 2019. Dose–response research in digital health interventions: Concepts, considerations, and challenges. Health Psychology 38, 1168. [DOI] [PMC free article] [PubMed] [Google Scholar]
  23. Nandakumar R, Gollakota S, Sunshine JE, 2019. Opioid overdose detection using smartphones. Science translational medicine 11, eaau8914. [DOI] [PubMed] [Google Scholar]
  24. NIDA, 2021. NIDA IC fact sheet 2022. URL: https://nida.nih.gov/about-nida/legislative-activities/budget-information/fiscal-year-2022-budgetinformation-congressional-justification-national-institute-drug-abuse/ic-factsheet-2022.
  25. Onnela JP, 2021. Opportunities and challenges in the collection and analysis of digital phenotyping data. Neuropsychopharmacology 46, 45–54. [DOI] [PMC free article] [PubMed] [Google Scholar]
  26. Piers R, Williams JM, Sharpe H, 2023. Can digital mental health interventions bridge the ‘digital divide’ for socioeconomically and digitally marginalised youth? A systematic review. Child and Adolescent Mental Health 28, 90–104. [DOI] [PubMed] [Google Scholar]
  27. Preston KL, Epstein DH, 2011. Stress in the daily lives of cocaine and heroin users: relationship to mood, craving, relapse triggers, and cocaine use. Psychopharmacology 218, 29–37. [DOI] [PMC free article] [PubMed] [Google Scholar]
  28. Preston KL, Kowalczyk WJ, Phillips KA, Jobes ML, Vahabzadeh M, Lin JL, Mezghanni M, Epstein DH, 2018a. Before and after: craving, mood, and background stress in the hours surrounding drug use and stressful events in patients with opioid-use disorder. Psychopharmacology 235, 2713–2723. [DOI] [PMC free article] [PubMed] [Google Scholar]
  29. Preston KL, Kowalczyk WJ, Phillips KA, Jobes ML, Vahabzadeh M, Lin JL, Mezghanni M, Epstein DH, 2018b. Exacerbated craving in the presence of stress and drug cues in drug-dependent patients. Neuropsychopharmacology 43, 859–867. [DOI] [PMC free article] [PubMed] [Google Scholar]
  30. R Core Team, 2021. R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing. URL: https://www.R-project.org/. [Google Scholar]
  31. Sayette MA, 2016. The role of craving in substance use disorders: theoretical and methodological issues. Annual review of clinical psychology 12, 407–433. [DOI] [PubMed] [Google Scholar]
  32. Shrestha S, Stapp J, Taylor M, Leach R, Carreiro S, Indic P, 2023. Towards device agnostic detection of stress and craving in patients with substance use disorder, in: Proceedings of the… Annual Hawaii International Conference on System Sciences. Annual Hawaii International Conference on System Sciences, NIH Public Access. p. 3156. [PMC free article] [PubMed] [Google Scholar]
  33. Sinha R, 2001. How does stress increase risk of drug abuse and relapse? Psychopharmacology 158, 343–359. [DOI] [PubMed] [Google Scholar]
  34. Sinha R, 2008. Chronic stress, drug use, and vulnerability to addiction. Annals of the New York Academy of Sciences 1141, 105–130. [DOI] [PMC free article] [PubMed] [Google Scholar]
  35. Sinha R, Catapano D, O’Malley S, 1999. Stress-induced craving and stress response in cocaine dependent individuals. Psychopharmacology 142, 343–351. [DOI] [PubMed] [Google Scholar]
  36. Sinha R, Garcia M, Paliwal P, Kreek MJ, Rounsaville BJ, 2006. Stress-induced cocaine craving and hypothalamic-pituitary-adrenal responses are predictive of cocaine relapse outcomes. Archives of General Psychiatry 63, 324–331. [DOI] [PubMed] [Google Scholar]
  37. Tapper K, 2018. Mindfulness and craving: Effects and mechanisms. Clinical psychology review 59, 101–117. [DOI] [PubMed] [Google Scholar]
  38. Tiffany ST, Wray JM, 2012. The clinical significance of drug craving. Annals of the New York Academy of Sciences 1248, 1–17. [DOI] [PMC free article] [PubMed] [Google Scholar]
  39. Wasserlauf J, You C, Patel R, Valys A, Albert D, Passman R, 2019. Smartwatch performance for the detection and quantification of atrial fibrillation. Circulation: Arrhythmia and Electrophysiology 12, e006834. [DOI] [PubMed] [Google Scholar]
  40. Whiteside H, Beavers B, 2022. Social justice and the digital divide: A narrative inquiry of how digital inequities contribute to the underrepresentation of low-income African Americans in STEM careers, in: Society for Information Technology & Teacher Education International Conference, Association for the Advancement of Computing in Education (AACE). pp. 294–298. [Google Scholar]
  41. Yao R, Zhang W, Evans R, Cao G, Rui T, Shen L, 2022. Inequities in health care services caused by the adoption of digital health technologies: scoping review. Journal of medical Internet research 24, e34144. [DOI] [PMC free article] [PubMed] [Google Scholar]
