PLOS One. 2026 Feb 5;21(2):e0342419. doi: 10.1371/journal.pone.0342419

Efficacy of a smartphone app to improve mental health among emergency service workers: A randomised controlled trial

Mark Deady 1,*, Mikayla Gregory 1, Quincy J J Wong 2, Denise Meuldijk 1, Daniel A J Collins 1, Lasse B Sander 3, Richard Bryant 4, Samuel B Harvey 1
Editor: Fatma Refaat Ahmed
PMCID: PMC12875461  PMID: 41642859

Abstract

Background

Emergency service workers (ESWs) are routinely exposed to highly stressful and potentially traumatic events, leading to high rates of psychological distress. Early intervention is vital to prevent chronic impairment and/or psychiatric disorders, with digital health innovations (e.g., smartphone apps) offering a potential means of scaling such interventions while overcoming barriers to help-seeking in this population. This study aimed to evaluate the efficacy of an app designed to reduce psychological distress and related outcomes in ESWs.

Methods

We conducted a randomised controlled trial with ESWs experiencing psychological distress (Kessler Psychological Distress Scale [K10] score > 15). Participants were assigned to the intervention group, which received a full version of the Build Back Better app (including mindfulness, behavioural activation, trauma-focused cognitive therapy skills, activity and mood monitoring, and healthy coping strategies), or to a control group, which received a mood/activity tracking-only version of the same app. Assessment occurred via online self-report questionnaires at baseline (T0) and at 1 and 3 months post-baseline (T1, T2). The primary outcome was the K10 score at T2. Linear mixed model analyses were conducted based on the intention-to-treat principle.

Results

N = 880 ESWs were randomised to the full (n = 440) or tracking-only (n = 440) condition. There was no Time x Condition effect for K10 scores, with both conditions showing similar improvements from baseline to 1-month (all ps < .001) and 3-month follow-up (all ps < .001). Exploratory analyses of moderators (engagement; baseline severity) found generally no significant differences in K10 score decreases across levels of engagement in the full-app condition; however, app use was markedly low overall. At higher baseline psychological distress levels, there was a significant decrease in posttraumatic stress disorder symptoms from baseline to 1-month for the full-app group but not the tracking-only group (p = .002, d = −1.43).

Conclusion

Despite consistent improvement across both app conditions, the minimal between-group differences found here highlight the difficulties of developing effective, scalable resources for ESWs and the limitations of unguided digital programs more broadly.

Trial registration

Australian New Zealand Clinical Trials Registry ACTRN12622001324707

Introduction

Emergency service workers (ESWs), including police, ambulance personnel, firefighters and state emergency workers, have increased rates of mental health concerns compared to the general population [1–3]. Prevalence rates of both depression and anxiety are as high as 15% among police, ambulance personnel and firefighters [4,5], while one in ten ESWs report symptoms consistent with posttraumatic stress disorder (PTSD) [6].

In addition to the impact of potentially traumatic events, ESWs can experience other known psychosocial risk factors as part of their roles, including high job demands, occupational violence, interpersonal issues, and shift work [3,7,8]. These hazards are compounded by poor help-seeking behaviours [9–11]. Stigma [9–11] and concerns that disclosure of mental health issues could influence career outcomes [12,13] present major reasons for this lack of help-seeking. Furthermore, there is an identified need for low-intensity interventions targeting ESWs due to the high-risk nature of their roles and the exposure to disaster and trauma, especially where these workers do not qualify for a formal mental health diagnosis [14].

Digital mental health interventions have been developed to help overcome typical barriers to accessing psychological treatment [15]. These interventions deliver evidence-based therapeutic information via activities and interactive, engaging stimuli to help users develop skills for managing their mental health [16,17]. Delivering such programs via digital platforms has demonstrated effectiveness for the prevention and management of mental ill-health symptoms [18–20], although less is known regarding app-based interventions specifically [21]. Furthermore, evidence suggests that ESWs express interest in and openness to digital health interventions [22] and apps specifically [23]. One survey found that 54% of ESWs would be interested in a mental health app for ESWs, rising to two thirds when the app also targeted physical health, with respondents favouring a mental health approach that leveraged the concept of “mental fitness” [23]. Despite this, there have been relatively few studies investigating the efficacy of digital interventions for ESWs, and little direction around what to offer these workers [24–26]. Moreover, there is a lack of guidance around how to address subclinical symptoms in this high-risk, trauma-exposed group in order to avoid the chronic functional and clinical impairment that comes with ingrained, untreated conditions [27].

The smartphone app Build Back Better was developed and tailored with and for ESWs to improve psychological health and reduce symptoms of psychological distress and related outcomes. The Build Back Better app was adapted from an existing app, HeadGear [28]. HeadGear previously underwent a pilot study, an RCT, and a naturalistic evaluation, which collectively indicated effectiveness in preventing and improving depressive symptoms in both the general population and in male-dominated industries [28,29]. To adapt the program to the needs of ESWs, an iterative codesign process was undertaken in which ESWs and mental health experts (including psychiatrists and clinical psychologists) were consulted alongside design, user experience, and IT teams. Workshops and multiple rounds of user testing and revision took place during this phase with 12 ESWs to tailor the product; this codesign was deemed essential to meet the unique needs of the population. To overcome known challenges of engagement [30], several techniques associated with increased engagement were employed, including personalised content where users selected areas of need, data visualisation (tracking), reminders/push notifications, educational information, self-monitoring, and goal-setting features [31].

A pilot study using an earlier version of the app was previously conducted to examine the acceptability, uptake and preferences of ESWs [32]. Feedback from the pilot study was used to inform program changes, primarily to elements of the user interface and the user flow through app screens, in order to reduce cognitive burden and simplify navigation. The home screen was enhanced and language within activities was simplified.

The current study aimed to evaluate the efficacy of the Build Back Better app in reducing symptoms of psychological distress and related outcomes in the ESW population by conducting a large scale randomised controlled trial (RCT) with ESWs across Australia.

Methods

Registration and approval

This study received ethical approval from The University of New South Wales Human Research Ethics Committee (HC220566). Consent was informed and obtained in writing (digitally) from all participants. The trial was prospectively registered via the Australian New Zealand Clinical Trials Registry: ACTRN12622001324707. Reporting was in line with CONSORT [33] and CONSORT e-Health [34] guidelines. Fig 1 presents the CONSORT flow diagram.

Fig 1. Flow of participants through the trial.


Study design

This study was a two-armed RCT in which participants received either the full version of the Build Back Better app (intervention) or a tracking-only version of the app (mood and activity monitoring only; active control), allocated 1:1 using permuted blocks of 4 (2:2 within each block). Assessments occurred at baseline (T0), 4 weeks post-baseline (T1) and 3-month follow-up (T2), and took place between 01/03/2023 and 30/11/2023.

Procedure

ESWs were recruited via both organisational and social media advertisements, which displayed QR codes and links to the study landing page, where participants were required to provide informed consent before proceeding further. Individuals who consented were screened for eligibility (see below) and either progressed to a web page informing them of their ineligibility (with a referral to Black Dog Institute’s National Emergency Worker Support Service website containing additional resources) or, if eligible, continued to the online baseline assessment (completed in the same session). The baseline assessment included the Patient Health Questionnaire (PHQ-9) [35]. The trial safety protocol required that the research team be notified if a participant scored 14 or higher on the PHQ-9 or indicated suicidal ideation on this measure. In such cases, researchers followed up with participants and provided further support, including the opportunity to speak with a psychologist or mental health nurse to confirm they were not at heightened risk (29 participants were subsequently called by the clinics and no participant was excluded following this call). Upon completion of all baseline measures (see Table 1), participants were randomised to either the full-intervention or tracking-only version of the app. Randomisation was performed using a computer-generated algorithm integrated into Black Dog Institute’s digital trial management platform. All participants and investigators were masked to condition assignment until completion of the final survey. Upon allocation, all study participants were given instructions on how to download the Build Back Better smartphone app, including a login code that directed them to the appropriate app version (depending on their group allocation). Participants were asked to use the app consistently for at least 30 days.
Participants received an email and text messages to complete follow-up surveys after 30 days and again at 3 months post-baseline, with up to two SMS and email reminders delivered at each timepoint if the survey was not completed.

Table 1. Assessment schedule.

Measure | T0* | T1 | T2 | In-app
K10 | x | x | x |
Demographics (including lifetime trauma exposure and prior help-seeking) | x | | |
PHQ-9 | x | x | x |
GAD-7 | x | x | x |
WHO-5 | x | x | x |
CSE-T | x | x | x |
PTSD-8 | x | x | x |
AQoL-4D | x | x | x |
CD-RISC-10 | x | x | x |
HPQ | x | x | x |
AUDIT-C | x | x | x |
User Engagement/Feedback | | x | |
Objective app engagement measures | | | | x

* Screening and baseline occurred in same session (T0).

Note. K10 = Kessler Psychological Distress Scale; PHQ-9 = Patient Health Questionnaire-9; GAD-7 = Generalized Anxiety Disorder-7; WHO-5 = World Health Organization Wellbeing Index; CSE-T = Trauma Coping Self-Efficacy Scale; PTSD-8 = Post-Traumatic Stress Disorder 8-item; AQoL-4D = Assessment of Quality of Life 4-dimension version; CD-RISC-10 = 10-item Connor-Davidson Resilience Scale; HPQ = Health and Work Performance Questionnaire; AUDIT-C = Alcohol Use Disorders Identification Test-Concise.

Participants

Eligibility criteria included being over the age of 18, working currently (or within the past 2 years) as an ESW, currently residing in Australia, owning a smartphone, and having good English comprehension. Participants were also required to have scored 16 or higher on the Kessler Psychological Distress Scale (K10), indicating that they were experiencing at least moderate levels of psychological distress.

Intervention arms

Full intervention app.

The Build Back Better app (Fig 2) was developed as outlined above. Clinical aspects of the original HeadGear [28] content, including mindfulness, behavioural activation, mood monitoring, and healthy coping, were maintained, while adaptations were made to include trauma-focused content, including cognitive therapeutic techniques. The rationale for the clinical components of HeadGear is outlined in prior publications [36]. The addition of trauma-focused cognitive therapeutic techniques was guided by best practice in the area [37], providing psychoeducation around the nature of trauma responses and activities around thought monitoring, labelling, and challenging. The format of the program was significantly altered to be less structured and allow for greater flexibility, based on ESW feedback. The user interface was also redesigned to meet the needs and preferences of ESWs. For a complete list of activities see S1 File.

Fig 2. Screenshots from Build Back Better app (intervention version).


The Build Back Better app includes four key modules: 1) mindfulness, 2) healthy coping, 3) managing thoughts and 4) valued actions. Each module contains a combination of skill-based activities designed to deliver evidence-based therapeutic techniques via psychoeducational videos, audio recordings, and interactive exercises. The Build Back Better app also has a mood and activity tracker (encompassing mood, sleep, alcohol intake, smoking, physical activity, work/life balance, healthy eating, anxiety and stress) where users can log relevant variables on a slider scale daily, and view progress and correlations over time. The app also provides links and phone numbers for a variety of mental health and workplace support services.

Tracking-only version.

The tracking-only (active control) version of the Build Back Better app had the same look and feel as the intervention version; however, the therapeutic skills-based activity content was not included. Tracking-only participants had access only to the mood/activity tracking component, where they were encouraged to log their mood and activities over the 30-day trial period. They also had access to the graphing functions that displayed their mood and activities over the period of use. This app version contained the same information page as the full intervention version, with links and phone numbers for a variety of mental health and workplace support services.

Measures

At baseline, participants were asked a series of questions pertaining to demographics, lifetime trauma exposure and their previous help-seeking behaviour. Several validated scales were administered at baseline and follow-up timepoints (see Table 1), and these are described in detail below.

The primary outcome, psychological distress, was measured using the Kessler Psychological Distress Scale (K10), a 10-item measure of psychological distress over the past four weeks [38]. The K10 has a high level of internal consistency (Cronbach’s α = 0.92) and has been validated in large community samples with psychometric properties remaining consistent across a range of sociodemographic groups [39]. The K10 includes depression and anxiety-based questions and answers are scored on a scale of 1–5 [38]. A total score is calculated by summing the scores across the 10 questions, with a minimum total score of 10 (no psychological distress) and a maximum of 50 (indicating severe distress) [38].
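As a worked illustration of the scoring rule described above, the K10 total can be computed as follows (our own sketch; the function name is hypothetical):

```python
def score_k10(responses):
    """Sum 10 items, each rated 1-5; totals range from 10 (no distress)
    to 50 (severe distress), per the scale description."""
    assert len(responses) == 10, "K10 has exactly 10 items"
    assert all(1 <= r <= 5 for r in responses), "items are rated 1-5"
    return sum(responses)

# A participant answering 2 on every item scores 20.
```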

The Patient Health Questionnaire (PHQ-9) was used to measure depressive symptom severity [40]. This 9-item self-report scale has excellent internal consistency (Cronbach’s α = 0.86–0.89) and good criterion and construct validity as a measure of depression severity within large primary care samples [40], along with strong convergent validity with other well-established measures of depression (r = 0.73) when used in the general population [41]. The PHQ-9 asks participants to rate how often over the past two weeks they have been bothered by each of the 9 items on a 4-point Likert scale from 0 (“not at all”) to 3 (“nearly every day”) [35]. Total scores range from 0–27 with higher scores indicating more severe presentations.

The Generalized Anxiety Disorder-7 inventory (GAD-7) was used to measure symptoms of anxiety [42]. The GAD-7 has excellent internal consistency (Cronbach’s α = 0.92), strong construct validity (with higher scores associated with increasing functional impairment), and good convergent validity with other established anxiety scales (r = 0.72; r = 0.74) [42]. The GAD-7 asks how often participants have been bothered by 7 items, each rated on a 4-point Likert scale from 0 (“not at all”) to 3 (“nearly every day”) [42]. Total scores range from 0–21. The cutoffs on this scale include 5 (mild), 10 (moderate) and 15 (severe), with scores over 10 suggestive of an anxiety disorder being present.

Wellbeing was measured using the World Health Organization Wellbeing Index (WHO-5) [43,44]. The WHO-5 has good construct validity and is recommended for use in controlled trials to measure subjective wellbeing over time [45,46]. This 5-item scale asks participants to indicate which is closest to how they have been feeling over the past two weeks on a 6-point Likert scale from 0 (“at no time”) to 5 (“all of the time”) [45,46]. Raw scores therefore range from 0–25 with total scores calculated by multiplying the raw score by 4 giving a total score out of 100. Scores of 50 or less indicate poor wellbeing and scores of 28 or less are indicative of potential depression [45,46].
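The WHO-5 transformation and cutoffs above can be sketched as follows (our own illustration; function names are hypothetical):

```python
def score_who5(responses):
    """Sum 5 items rated 0-5 (raw score 0-25), then multiply by 4
    to give a total out of 100, per the scale description."""
    assert len(responses) == 5 and all(0 <= r <= 5 for r in responses)
    return sum(responses) * 4

def interpret_who5(total):
    """Apply the cutoffs stated above: <= 28 indicates potential
    depression, <= 50 indicates poor wellbeing."""
    if total <= 28:
        return "possible depression"
    if total <= 50:
        return "poor wellbeing"
    return "above poor-wellbeing cutoff"
```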

The Post-Traumatic Stress Disorder-8 item (PTSD-8) [47] is a validated short self-report screening tool for measuring probable ICD-11 defined PTSD [48,49]. The PTSD-8 has been validated in a range of trauma-exposed samples and shown to have good internal consistency (Cronbach’s α = 0.83–0.85) and a high level of concurrent validity with an established measure of trauma symptoms (Trauma Symptom Checklist; r = 0.58–0.78) [47]. After establishing Criterion A, the following 8 items represent symptoms across three subscales: intrusions (4 items), avoidance (2 items) and hypervigilance (2 items) [47]. Items are rated on a 4-point scale from 1 (“not at all”) to 4 (“most of the time”). Probable PTSD diagnosis is met if at least one item in each of the three subscales has a score of 3 or 4 (i.e., the participant reports that the symptom is experienced “sometimes” or “most of the time”) [47,49].
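The probable-PTSD decision rule above can be expressed directly in code (our own sketch; the item ordering into subscales is assumed for illustration):

```python
# Assumed item order for illustration: items 1-4 = intrusions,
# items 5-6 = avoidance, items 7-8 = hypervigilance (0-based indices).
SUBSCALES = {
    "intrusions": [0, 1, 2, 3],
    "avoidance": [4, 5],
    "hypervigilance": [6, 7],
}

def probable_ptsd(items):
    """Probable PTSD is met if at least one item in EACH subscale
    scores 3 ("sometimes") or 4 ("most of the time")."""
    assert len(items) == 8 and all(1 <= i <= 4 for i in items)
    return all(any(items[i] >= 3 for i in idx)
               for idx in SUBSCALES.values())
```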

Participants were also asked to rate their capacity for trauma-related coping using the Trauma Coping Self-Efficacy Scale (CSE-T) [50]. The CSE-T has sound psychometric properties across different trauma-exposed groups, with good test-retest reliability, internal reliability, and convergent and discriminant validity [50]. This 9-item scale asks participants to rate their perceived capacity for managing possible posttraumatic experiences (i.e., ability to deal with emotions since the traumatic event or capacity for controlling thoughts about the traumatic experience), on a 7-point scale that ranges from 1 (“not capable at all”) to 7 (“totally capable”). Total scores are calculated by summing each of the items and range from 9 to 63 [50].

Work performance was measured using items from the Health and Work Performance Questionnaire (HPQ) [51]. Where participants reported being employed, they were asked to rate their perceived overall job performance over the previous four-week period on a scale of 1 (“worst performance”) to 10 (“top performance”). They were also asked about absenteeism over the last month, the length of any absences and whether any absences were due to their mental health [51]. A score for effective workdays was calculated by subtracting the number of sick days taken over the past 4 weeks from 28 and multiplying by self-reported work performance rating (converted to a score from 0.1 to 1.0). This is consistent with previous studies [29,52].
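The effective-workdays calculation described above reduces to a simple formula (our own sketch; the function name is hypothetical):

```python
def effective_workdays(sick_days, performance_rating):
    """Subtract sick days over the past 4 weeks from 28, then multiply
    by the self-rated performance (1-10) converted to 0.1-1.0."""
    assert 0 <= sick_days <= 28
    assert 1 <= performance_rating <= 10
    return (28 - sick_days) * (performance_rating / 10)

# e.g., 4 sick days and a performance rating of 5:
# (28 - 4) * 0.5 = 12.0 effective workdays
```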

Resilience was assessed using the 10-item Connor-Davidson Resilience Scale (CD-RISC-10), which has been shown to have excellent internal consistency (Cronbach’s α = 0.89) and test-retest reliability, as well as good convergent and discriminant validity [53,54]. The measure consists of 10 items asking participants to rate how they have felt over the past month (e.g., “under pressure, I stay focused and think clearly”, “able to adapt when changes occur”, or “I am able to handle unpleasant or painful feelings like sadness, fear and anger”) on a 5-point scale from 0 (“not true at all”) to 4 (“true nearly all of the time”). Total scores range from 0–40, with higher scores indicating greater resilience [53].

The Alcohol Use Disorders Identification Test-Concise (AUDIT-C) was used to screen participants for hazardous or problematic alcohol use [55]. This tool has excellent internal consistency (Cronbach’s α = 0.94), predictive validity, and concurrent validity with the lengthier 10-item measure (r = 0.97) [56]. The AUDIT-C is a 3-item scale asking participants how often they have a drink that contains alcohol, how many alcoholic drinks they consume on a typical day, and how often they consume 6 or more alcoholic drinks on a single occasion [55]. Each of the 3 questions is scored on a 5-point scale ranging from 0–4, with a highest possible score of 12. Females with a score of 3 or more and males with a score of 4 or more are considered positive screens for hazardous drinking behaviours [55,57].
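The sex-specific AUDIT-C screening cutoffs can be sketched as follows (our own illustration; the function name and sex labels are assumptions):

```python
def audit_c_positive(item_scores, sex):
    """Sum the three AUDIT-C items (each 0-4, maximum 12); the screen
    is positive at >= 3 for females and >= 4 for males, per the
    cutoffs stated above."""
    assert len(item_scores) == 3 and all(0 <= s <= 4 for s in item_scores)
    total = sum(item_scores)
    cutoff = 3 if sex == "female" else 4
    return total >= cutoff
```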

Participant quality of life was measured using the Assessment of Quality of Life (4 dimensions; AQoL-4D) [58]. This is a 12-item instrument with good internal consistency (Cronbach’s α = 0.94) and construct validity [58]. Questions are evenly grouped into 4 broad dimensions: independent living (e.g., “do you need help looking after yourself?”), mental health (i.e., questions regarding sleep, pain and anxiety/depression/worrying), relationships (with family, friends, etc.) and senses (i.e., vision and hearing related questions) [58,59]. Each item is rated on a 4-point scale ranging from 1 (lowest) to 4 (highest). Total unweighted scores are calculated for each of the 4 dimensions by subtracting 3 from the total score of that dimension and then dividing this number by four before multiplying it by 100 [58,59].

At the 1-month follow-up timepoint, participants were also asked to complete 12 user engagement and app-related feedback questions, which involved a combination of 5-point Likert scale, multiple-choice and free-text response questions. Items were adapted from the Mobile Application Rating Scale [60], covering ease of use, understanding of content, engagement and interest in the design and content, likelihood of recommending to others, and overall rating of the app. Further questions measured the subjective perception of improvement in mental fitness and reasons for stopping app use. Participants also provided general feedback and suggestions via open-response questions. These measures have been used previously and are provided in S2 [52].

App usage data were also collected throughout the trial period, pertaining to the number of times the app was opened, the length of time spent in the app, the number of activities both accessed and completed, and the number of mood and activity sessions logged by participants.

Sample size

As a population-based, unguided, indicated intervention, the effect size of the intervention was anticipated to be relatively small. Based on a review of the available literature on comparable studies [61–64], a small between-group effect size (Hedges g = 0.29) was expected. With alpha set at 0.05 and power (1-β) at 0.80, and assuming a large correlation between repeated measures (i.e., r > .99) so as to retain sufficient power in scenarios where the correlation would be smaller, a minimum of 394 participants was required per study arm. Allowing for a dropout rate of 30% of study participants from baseline, an estimated sample size of 1,126 was required for randomisation (i.e., 563 participants per arm). G*Power was used to calculate power.

Data analysis

Analyses were undertaken on an intention-to-treat basis. Linear mixed models for repeated measures with maximum likelihood estimation were used to examine primary and secondary outcomes across baseline, 1-month and 3-month follow-ups. Maximum likelihood estimation incorporates all available data, including participants with missing follow-up data, under the missing-at-random assumption. In the analytical models, two Time variables (reflecting the baseline to 1-month, and 1-month to 3-month periods), Condition (full-app vs tracking-only), and the corresponding Time x Condition interactions were specified as fixed effects, and a random intercept for participants was also included. Models did not adjust for baseline scores as randomisation in this trial resulted in comparable groups of participants in trial arms at baseline (see Table 2). An identity covariance matrix was employed, and degrees of freedom were estimated using Satterthwaite’s method. Effect sizes (Cohen’s d) were calculated based on the relevant modelled mean differences and SDs at the relevant timepoints. In addition to the primary and secondary analyses, post-hoc exploratory analyses examined the effect of medication and help-seeking on the primary outcome results. Analyses were also conducted to investigate the moderating effect of app engagement and baseline severity on outcomes. Engagement was categorised as non-engagers, minimal engagers, and engagers, defined by completion of 0, 1–10, and more than 10 activities, respectively.
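The engagement categorisation used in the moderator analyses is a simple threshold rule, sketched below (our own illustration; the function name is hypothetical):

```python
def engagement_category(activities_completed):
    """Categorisation described above: 0 activities = non-engager,
    1-10 = minimal engager, more than 10 = engager."""
    assert activities_completed >= 0
    if activities_completed == 0:
        return "non-engager"
    if activities_completed <= 10:
        return "minimal engager"
    return "engager"
```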

Table 2. Participant characteristics at baseline.

Full sample (N = 880); Tracking-only (n = 440); Full intervention (n = 440)
Age in years, M (SD, range) 44.98 (13.19, 19–81) 44.38 (13.23, 19–81) 45.58 (13.14, 20–77)
Gender identity, n (%)a
 Female 325 (36.9) 165 (37.5) 160 (36.4)
 Male 536 (60.9) 269 (61.1) 267 (60.7)
 Non-binary or other 16 (1.8) 6 (1.4) 10 (2.2)
Education, n (%)
 Year 12 (equivalent) or less 135 (15.3) 69 (15.7) 66 (15.0)
 Trade or other certificate or diploma 367 (41.7) 185 (42.0) 182 (41.4)
 University Degree 350 (39.8) 174 (39.5) 176 (40.0)
 Other 28 (3.2) 12 (2.7) 16 (3.6)
Live in metropolitan area, n (%) 285 (32.4) 153 (34.8) 132 (30.0)
Emergency service worker status, n (%)b
 Current paid 482 (54.8) 246 (55.9) 236 (53.6)
 Current volunteer 320 (36.4) 163 (37.0) 157 (35.7)
 Retired 74 (8.4) 29 (6.6) 45 (10.2)
Service length, n (%)c
  < 1 year 18 (2.0) 11 (2.5) 7 (1.6)
 1–5 years 223 (25.3) 105 (23.9) 118 (26.8)
 6–10 years 152 (17.3) 88 (20.0) 64 (14.5)
 10 + years 486 (55.2) 236 (53.6) 250 (56.8)
Sought professional/clinical help in lifetime (mental health professional, GP), n (%) 725 (82.4) 369 (83.9) 356 (80.9)
Engaged in help-seeking in last four weeks, n (%) 344 (39.1) 170 (38.6) 174 (39.5)
Currently taking medication for a mental health issue, n (%) 278 (31.6) 140 (31.8) 138 (31.4)
Experienced traumatic event, n (%) 779 (88.5) 388 (88.2) 391 (88.9)
Primary outcome
 K10, M (SD) 25.42 (6.25) 25.27 (6.26) 25.57 (6.25)
Secondary outcomes
 PHQ-9, M (SD) 9.79 (5.49) 9.79 (5.54) 9.79 (5.45)
 GAD-7, M (SD) 7.47 (4.57) 7.54 (4.61) 7.41 (4.53)
 PTSD-8, M (SD)d 16.72 (5.92) 17.12 (6.31) 16.40 (5.60)
 CD-RISC-10, M (SD) 26.67 (6.95) 26.41 (6.89) 26.92 (7.00)
 CSE-T, M (SD) 42.70 (11.10) 42.11 (11.05) 43.28 (11.13)
 WHO-5, M (SD) 38.53 (20.45) 38.93 (20.45) 38.13 (20.47)
 AUDIT-C, M (SD) 3.74 (2.94) 3.83 (3.02) 3.64 (2.85)
 AQoL-4D, M (SD) 78.35 (10.79) 78.35 (10.90) 78.35 (10.69)
  Independent living 95.42 (10.81) 94.90 (11.44) 95.93 (10.12)
  Relationships 69.90 (19.66) 70.08 (19.03) 69.72 (20.29)
  Physical senses 87.50 (12.22) 87.22 (12.90) 87.78 (11.49)
  Mental health 60.58 (17.62) 61.19 (17.12) 59.97 (18.10)
 HPQ, M (SD)e 18.15 (5.91) 18.18 (6.12) 18.12 (5.69)

aThree participants in the full intervention condition indicated ‘prefer not to say’.

bTwo participants in the tracking-only app condition and two participants in the full intervention condition indicated ‘prefer not to say’.

cOne participant in the full intervention condition indicated ‘prefer not to say’.

dAdministered only to participants who indicated that they experienced, witnessed or were confronted with a stressful experience or traumatic event in the last month (n = 299 in full sample: n = 133 in tracking-only condition; n = 166 in full intervention condition).

eComposite measure of effective workdays was calculated by multiplying work performance score for days worked during the previous 28 days by the number of days present at work over the same period.

Note. K10 = Kessler Psychological Distress Scale; PHQ-9 = Patient Health Questionnaire-9; GAD-7 = Generalized Anxiety Disorder-7; PTSD-8 = Post-Traumatic Stress Disorder 8-item; CD-RISC-10 = 10-item Connor-Davidson Resilience Scale; CSE-T = Trauma Coping Self-Efficacy Scale; WHO-5 = World Health Organization Wellbeing Index; AUDIT-C = Alcohol Use Disorders Identification Test-Concise; AQoL-4D = Assessment of Quality of Life 4-dimension version; HPQ = Health and Work Performance Questionnaire.

For statistical significance, alpha was set at 0.05. SPSS Statistics 27.0 was used to conduct the analyses.

Results

Sample characteristics

Eight hundred and eighty ESWs met eligibility criteria, completed their baseline surveys and were randomly allocated to the full-app intervention or tracking-only condition. Of the baseline sample, 303 (34.4%) completed all outcome measures at 1-month follow-up, and 171 (19.4%) at 3-month follow-up (Fig 1).

Table 2 presents the baseline characteristics of the full study sample. The participants (n = 880) included ESWs from Fire and Rescue (30%), Ambulance (and other emergency medicine workers) (28%), Police (17%), State Emergency Services (13%), Surf Life Saving (2%), and ‘Other’ (10%). The mean age was 44.98 years (SD = 13.19), and the majority identified as male (60.9%). Just over half had been in emergency service work for 10 + years (55.2%) and a majority of the sample had experienced a traumatic event (88.5%). The majority had sought help from a mental health professional or GP during their lifetime (82.4%), with 31.6% taking medication for a mental health issue at trial registration and 39.1% engaging in help-seeking in the four weeks before trial registration.

Across all outcome measures, Little’s Missing Completely at Random (MCAR) test was not significant, χ2(3564) = 2379.28, p = 1.00, indicating missing data were consistent with MCAR. However, attrition from assessment was greater in the full intervention condition than the tracking-only condition at 1-month follow-up (72.3% vs 58.9%; χ2(1) = 17.52, p < .001) and at 3-month follow-up (83.6% vs 77.5%; χ2(1) = 5.29, p = .027). As the full sample of 1,024 could not be met due to recruitment constraints, the adjusted power was 0.76. In addition, missingness was significantly related to variables in the dataset (e.g., GAD-7 item and AUDIT-C item scores; ps < .046). Taken together, these results suggest missing data were more plausibly missing at random (MAR), which aligns with the missing-at-random assumption of the linear mixed models used for analyses.

Scores on all outcome measures approximated normality (absolute skewness < 3 and absolute kurtosis < 10; Kline, 2016) with the exception of the AQoL-4D independent living variable at the baseline and 1-month follow-up timepoints. AQoL-4D independent living variable raw scores and transformed scores that approximated normality were both analysed. Results from both analyses had the same pattern of significance and led to the same conclusions. Hence, the results from the analysis with the untransformed variable are reported.

Primary outcome

Model-based estimates and Time x Condition interaction effects are presented in Table 3. There was no Time x Condition effect for scores on the K10, with both conditions showing similar improvements from baseline to 1-month (Bs ranged from −2.67 to −2.60, all |ts| > 6.10, all ps < .001, ds ranged from −0.43 to −0.42), and 3-month follow-ups (Bs ranged from −2.00 to −2.89, all |ts| > 4.35, all ps < .001, ds ranged from −0.46 to −0.32). For completeness, the difference in K10 scores between the two conditions was not significant at the 1-month and 3-month follow-ups (with respective Bs = 0.37 and −0.60, both |ts| < 0.79, both ps > .428, d = 0.06 favouring tracking-only condition at 1-month follow-up and d = −0.08 favouring full intervention condition at 3-months follow-up).

Table 3. Model-based estimates and standard error for primary and secondary outcomes at each timepoint by condition.

Columns (left to right): Full intervention T0 (n = 440), T1 (n = 122), T2 (n = 72); Tracking-only T0 (n = 440), T1 (n = 181), T2 (n = 99); Time x Condition from baseline to T1 (statistic, df, p); Time x Condition from baseline to T2 (statistic, df, p).
K10, M (SE) 25.57 (0.30) 22.97 (0.47) 22.68 (0.57) 25.28 (0.30) 22.60 (0.40) 23.28 (0.50) t = 0.13 603.82 .898 t = −1.26 599.27 .207
PHQ-9, M (SE) 9.79 (0.26) 8.69 (0.40) 8.40 (0.49) 9.79 (0.26) 8.91 (0.35) 8.70 (0.43) t = −0.46 586.92 .648 t = −0.49 582.29 .628
GAD-7, M (SE) 7.40 (0.22) 6.28 (0.35) 6.27 (0.43) 7.54 (0.22) 6.77 (0.30) 6.63 (0.37) t = −0.82 570.83 .414 t = −0.39 567.83 .695
PTSD-8, M (SD)a 16.39 (0.45) 13.59 (0.83) 13.66 (1.04) 16.98 (0.49) 14.61 (0.73) 15.26 (0.96) t = −0.40 130.80 .687 t = −0.72 158.55 .476
CD-RISC-10, M (SE) 26.92 (0.34) 26.89 (0.50) 26.92 (0.60) 26.41 (0.34) 26.88 (0.44) 27.19 (0.55) t = −0.88 497.13 .378 t = −0.78 490.95 .292
CSE-T, M (SE) 43.28 (0.54) 44.06 (0.85) 45.43 (1.03) 42.11 (0.54) 44.65 (0.73) 44.77 (0.91) t = −1.75 537.28 .081 t = −0.39 531.20 .694
WHO-5, M (SE) 38.13 (1.00) 40.72 (1.66) 37.43 (2.05) 38.93 (1.00) 41.41 (1.41) 45.01 (1.80) t = 0.05 587.97 .960 t = −2.57 584.11 .010
AUDIT-C, M (SE) 3.64 (0.14) 3.26 (0.17) 3.34 (0.18) 3.83 (0.14) 3.57 (0.15) 3.42 (0.17) t = −0.85 426.95 .397 t = 0.60 424.67 .550
AQoL-4D, M (SE) 78.35 (0.53) 78.02 (0.76) 78.92 (0.89) 78.35 (0.53) 78.58 (0.67) 78.01 (0.82) t = −0.69 463.22 .488 t = 0.86 457.74 .390
AQoL-4D IL, M (SE) 95.93 (0.53) 95.05 (0.80) 94.43 (0.96) 94.90 (0.53) 94.98 (0.70) 95.13 (0.87) t = −1.04 480.70 .297 t = −1.47 474.09 .143
AQoL-4D Rel, M (SE) 69.72 (0.95) 69.90 (1.46) 71.21 (1.74) 70.08 (0.95) 70.40 (1.27) 68.29 (1.59) t = −0.09 480.40 .929 t = 1.52 473.75 .129
AQoL-4D PS, M (SE) 87.78 (0.59) 86.35 (0.99) 89.22 (1.21) 87.22 (0.59) 86.94 (0.85) 85.43 (1.10) t = −0.93 532.69 .351 t = 2.06 525.26 .040
AQoL-4D MH, M (SE) 59.97 (0.86) 60.71 (1.36) 61.06 (1.65) 61.19 (0.86) 61.95 (1.18) 62.92 (1.50) t = −0.02 498.46 .986 t = −0.31 491.29 .755
HPQ, M (SE) 18.10 (0.33) 17.75 (0.62) 18.05 (0.79) 18.18 (0.33) 17.37 (0.50) 18.16 (0.71) t = 0.58 428.64 .563 t = −0.02 445.54 .983

aAdministered only to participants who indicated that they experienced, witnessed or were confronted with a stressful experience or traumatic event in the last month (tracking-only condition: n = 133 at baseline, n = 43 at 1-month follow-up, n = 22 at 3-month follow-up; full intervention condition: n = 166 at baseline, n = 31 at 1-month follow-up, n = 22 at 3-month follow-up).

Note. Time x Condition interactions reflect the period from baseline to the specified timepoint.

Note. K10 = Kessler Psychological Distress Scale; PHQ-9 = Patient Health Questionnaire-9; GAD-7 = Generalized Anxiety Disorder-7; CD-RISC-10 = 10-item Connor-Davidson Resilience Scale; CSE-T = Trauma Coping Self-Efficacy Scale; WHO-5 = World Health Organization Wellbeing Index; AUDIT-C = Alcohol Use Disorders Identification Test-Concise; AQoL-4D = Assessment of Quality of Life 4-dimension version; IL = Independent Living; Rel = Relationships; PS = Physical Senses; MH = Mental Health; HPQ = Health and Work Performance Questionnaire.

Secondary outcomes

Only a time effect was present for depression and anxiety symptoms (PHQ-9 and GAD-7), which significantly improved from baseline to the 1- and 3-month follow-ups in both conditions (Bs ranged from −1.39 to −0.77, all |ts| > 2.59, all ps < .010, ds ranged from −0.25 to −0.16). Alcohol consumption also showed a time effect, decreasing significantly from baseline to both follow-up timepoints in both conditions (Bs ranged from −0.41 to −0.26, all |ts| > 2.34, all ps < .020, ds ranged from −0.14 to −0.09).

PTSD symptoms showed no between-group effects, generally decreasing from baseline to the 1- and 3-month follow-ups in both conditions (Bs ranged from −2.81 to −2.37, all |ts| > 2.58, all ps < .011, ds ranged from −0.47 to −0.40), although the decrease from baseline to 3-month follow-up for the tracking-only group was not significant (B = −1.72, 95% CI [−3.59, 0.14], t(120.44) = −1.83, p = .070, d = −0.29). Trauma coping self-efficacy generally improved from baseline to the 1-month and 3-month follow-ups in both conditions (Bs ranged from 2.14 to 2.65, all |ts| > 2.23, all ps < .026, ds ranged from 0.19 to 0.24), although the increase from baseline to 1-month follow-up for the full intervention group was not significant (B = 0.78, 95% CI [−0.74, 2.30], t(545.23) = 1.01, p = .312, d = 0.07).

There was no differential Time x Condition effect evident in wellbeing from baseline to 1-month follow-up and no significant change in either condition (Bs ranged from 2.49 to 2.59, both |ts| < 1.88, all ps > .061, ds ranged from 0.12 to 0.13) for this period. However, from baseline to 3-month follow-up, wellbeing significantly increased in the tracking-only arm (B = 6.08, 95% CI [2.68, 9.48], t(571.84) = 3.51, p < .001, d = 0.30) but not the full intervention arm (B = −0.70, 95% CI [−4.61, 3.21], t(593.58) = −0.35, p = .726, d = −0.03), resulting in significantly higher wellbeing at 3-month follow-up favouring tracking-only (B = −7.58, 95% CI [−12.92, −2.23], t(936.67) = −2.78, p = .006, d = −0.32).

There were no time effects for work performance (Bs ranged from −0.80 to −0.02, all |ts| < 1.62, all ps > .106, ds ranged from −0.14 to 0.00), resilience (Bs ranged from −0.04 to 0.78, all |ts| < 1.59, all ps > .112, ds ranged from −0.01 to 0.11) or overall health-related quality of life (Bs ranged from −0.34 to 0.57, all |ts| < 0.73, all ps > .467, ds ranged from −0.03 to 0.05). There was also no change over time in the quality of life dimensions of independent living (Bs ranged from −1.50 to 0.23, all |ts| < 1.70, all ps > .089, ds ranged from −0.14 to 0.02), relationships (Bs ranged from −1.78 to 1.49, all |ts| < 1.24, all ps > .215, ds ranged from −0.09 to 0.08), and mental health (Bs ranged from 0.74 to 1.73, all |ts| < 1.25, all ps > .211, ds ranged from 0.04 to 0.10). There was no change in quality of life related to physical senses in either condition from baseline to 1-month follow-up (Bs ranged from −1.43 to −0.29, both |ts| < 1.52, all ps > .128, ds ranged from −0.12 to −0.02), and no differential Time x Condition effect was evident for this period. However, from baseline to 3-month follow-up there was a significant differential Time x Condition effect (see Table 3). Neither the decrease in quality of life related to physical senses for the tracking-only condition (B = −1.79, 95% CI [−3.84, 0.27], t(516.98) = −1.71, p = .088, d = −0.15) nor the increase for the full intervention condition (B = 1.44, 95% CI [−0.85, 3.73], t(532.03) = 1.24, p = .217, d = 0.12) reached statistical significance. The full intervention group did, however, have significantly better mean scores on this dimension relative to the tracking-only group at 3-month follow-up (B = 3.78, 95% CI [0.58, 6.99], t(849.01) = 2.32, p = .021, d = 0.29).

Post-hoc exploratory analyses

Effect of medication and help-seeking on primary outcome results.

The primary outcome analysis was repeated but with Time x Condition x Medication interactions (one for each time period) and relevant lower order two-way interactions added as fixed effects into the model. This model tested whether the Time x Condition interactions were moderated by medication use for mental health at the time of trial registration. Results indicated the Time x Condition interactions in the primary outcome analyses were not moderated by medication use (for both periods baseline to 1-month follow-up and baseline to 3-month follow-up: |ts| < 0.93, ps > .352).

A similar model tested whether Time x Condition interactions in the primary outcome analyses were moderated by help-seeking in the four weeks prior to trial registration. Help-seeking significantly moderated the Time x Condition interaction for baseline to 1-month follow-up (B = −3.30, 95% CI [−5.59, −1.02], t(577.82) = −2.84, p = .005), but not for baseline to 3-month follow-up (B = −2.83, 95% CI [−5.73, 0.07], t(574.89) = −1.92, p = .056). The significant moderation for baseline to 1-month follow-up reflected a non-significant Time x Condition interaction when there was help-seeking prior to the trial (B = −1.64, 95% CI [−3.35, 0.06], t(251.63) = −1.89, p = .060; both tracking-only and full intervention conditions showed a significant decrease in K10 scores, with respective Bs = −2.26 and −3.90, both |ts| > 3.92, both ps < .001, ds = −0.36 and −0.62), and a significant Time x Condition interaction when there was no help-seeking prior to the trial (B = 1.60, 95% CI [0.06, 3.14], t(328.16) = 2.05, p = .041). With no prior help-seeking, the full intervention condition exhibited a significant decrease in K10 scores from baseline to 1-month follow-up (B = −1.50, 95% CI [−2.71, −0.29], t(336.33) = −2.44, p = .015, d = −0.24), but the tracking-only condition exhibited an even larger significant decrease in K10 scores (B = −3.11, 95% CI [−4.06, −2.16], t(315.33) = −6.44, p < .001, d = −0.50). However, the small sample size and missing data mean these findings should be interpreted with caution.

App engagement analysis for the intervention condition – primary outcome.

Psychological distress significantly decreased from baseline to the 1-month follow-up similarly for non-engagers, minimal engagers, and engagers (Bs ranged from −3.01 to −1.87, all |ts| > 2.42, all ps < .016, ds ranged from −0.48 to −0.30), and no differential Time x App engagement effects were evident for this period (see Table 4). Psychological distress also significantly decreased from baseline to the 3-month follow-up for non-engagers, minimal engagers, and engagers (Bs ranged from −4.68 to −1.84, all |ts| > 2.19, all ps < .030, ds ranged from −0.75 to −0.29), although one Time x App engagement interaction was significant (see Table 4). Specifically, while the decrease in psychological distress for minimal engagers was significant (p = .030, d = −0.29), there was an even greater significant decrease for non-engagers (p < .001, d = −0.75). The resulting difference in psychological distress at 3-month follow-up between minimal engagers and non-engagers was not significant (B = 2.26, 95% CI [−0.44, 4.96], t(381.22) = 1.65, p = .100, d = 0.34). S6 File provides group-level characteristics at baseline for each group.

Table 4. Model-based estimates and standard error for primary outcomes at each timepoint in the intervention condition based on app engagement.
Comparison: Non-engager (n = 229; reference group) vs Minimal engager (n = 151). Columns (left to right): non-engager T0, T1, T2; minimal engager T0, T1, T2; Time x App engagement from baseline to T1 (statistic, df, p); Time x App engagement from baseline to T2 (statistic, df, p).
K10, M (SE) 25.79 (0.42) 22.78 (0.78) 21.10 (1.03) 25.20 (0.51) 23.33 (0.84) 23.36 (0.90) t = 1.08 248.34 .283 t = 2.18 245.90 .030
Comparison: Non-engager (n = 229; reference group) vs Engager (n = 60). Columns (left to right): non-engager T0, T1, T2; engager T0, T1, T2; Time x App engagement from baseline to T1 (statistic, df, p); Time x App engagement from baseline to T2 (statistic, df, p).
K10, M (SE) 25.79 (0.42) 22.78 (0.78) 21.10 (1.03) 25.68 (0.81) 22.91 (0.85) 23.07 (1.03) t = 0.24 230.61 .811 t = 1.55 232.67 .123
Comparison: Minimal engager (n = 151) vs Engager (n = 60). Columns (left to right): minimal engager T0, T1, T2; engager T0, T1, T2; Time x App engagement from baseline to T1 (statistic, df, p); Time x App engagement from baseline to T2 (statistic, df, p).
K10, M (SE) 25.20 (0.51) 23.33 (0.84) 23.36 (0.90) 25.68 (0.81) 22.91 (0.85) 23.07 (1.03) t = −0.88 226.86 .380 t = −0.63 227.87 .529

Note. K10 = Kessler Psychological Distress Scale.

Baseline psychological distress (K10) as moderator of Time x Condition effects for primary and secondary outcomes.

As shown in Table 5, baseline psychological distress was a significant moderator of the Time x Condition effect for PTSD-8 scores from baseline to 1-month follow-up only, and not for any other outcome at any other timepoint. To further examine this significant moderation effect, the Time x Condition effect for PTSD-8 scores from baseline to 1-month follow-up was evaluated at baseline K10 scores below and above the mean. The mean baseline K10 score was 25.42 (SD = 6.25), which fell in the moderate distress category. As such, the middle score of the mild distress category (baseline K10 score = 22) and the middle score of the severe distress category (baseline K10 score = 40) were used to further evaluate the Time x Condition effect for PTSD-8 scores.

Table 5. Statistics testing baseline psychological distress (K10) as moderator of Time x Condition effects for primary and secondary outcomes.
Columns: Time x Condition x Baseline psychological distress interaction from baseline to T1 (statistic, df, p) and from baseline to T2 (statistic, df, p).
K10, M (SE) t = −0.51 1151.34 .611 t = −0.82 1201.28 .415
PHQ-9, M (SE) t = −1.51 768.02 .131 t = −0.92 782.20 .359
GAD-7, M (SE) t = −1.74 700.30 .082 t = −0.53 714.65 .596
PTSD-8, M (SD)a t = −2.46 138.17 .015 t = 1.15 181.81 .253
CD-RISC-10, M (SE) t = 0.55 515.11 .580 t = −1.07 510.94 .284
CSE-T, M (SE) t = 1.63 587.17 .104 t = −0.60 586.07 .546
WHO-5, M (SE) t = −0.66 686.29 .509 t = −0.11 694.93 .913
AUDIT-C, M (SE) t = 0.50 426.72 .616 t = −0.79 425.06 .428
AQoL-4D, M (SE) t = −1.54 489.32 .124 t = −0.13 484.40 .898
AQoL-4D IL, M (SE) t = −0.82 485.77 .415 t = −1.21 480.87 .228
AQoL-4D Rel, M (SE) t = −1.91 496.08 .057 t = 0.63 491.11 .531
AQoL-4D PS, M (SE) t = −1.01 534.90 .311 t = −1.06 530.47 .289
AQoL-4D MH, M (SE) t = 0.03 544.97 .977 t = 0.71 540.93 .479
HPQ, M (SE) t = 0.64 456.43 .524 t = 1.01 479.50 .311

aAdministered only to participants who indicated that they experienced, witnessed or were confronted with a stressful experience or traumatic event in the last month (tracking-only condition: n = 133 at baseline, n = 43 at 1-month follow-up, n = 22 at 3-month follow-up; full intervention condition: n = 166 at baseline, n = 31 at 1-month follow-up, n = 22 at 3-month follow-up).

Note. Time x Condition interactions reflect the period from baseline to the specified timepoint. All tested models contained relevant main effects, two-way interactions, and three-way interaction, but only the three-way interaction results are presented in the table.

Note. K10 = Kessler Psychological Distress Scale; PHQ-9 = Patient Health Questionnaire-9; GAD-7 = Generalized Anxiety Disorder-7; PTSD-8 = Post-Traumatic Stress Disorder 8-item; CD-RISC-10 = 10-item Connor-Davidson Resilience Scale; CSE-T = Trauma Coping Self-Efficacy Scale; WHO-5 = World Health Organization Wellbeing Index; AUDIT-C = Alcohol Use Disorders Identification Test-Concise; AQoL-4D = Assessment of Quality of Life 4-dimension version; IL = Independent Living; Rel = Relationships; PS = Physical Senses; MH = Mental Health; HPQ = Health and Work Performance Questionnaire.

At a baseline K10 score of 22 (below the mean), the Time x Condition effect for PTSD-8 scores from baseline to 1-month follow-up was not significant (B = −1.64, 95% CI [−4.00, 0.73], t(146) = −1.37, p = .173). However, at a baseline K10 score of 40 (above the mean), the Time x Condition effect for PTSD-8 scores from baseline to 1-month follow-up was significant (B = 6.04, 95% CI [0.54, 11.50], t(129) = 2.17, p = .032). This significant Time x Condition effect reflected a significant decrease in PTSD-8 scores from baseline to 1-month follow-up for the full intervention arm (B = −5.15, 95% CI [−9.31, −0.98], t(141) = −2.44, p = .016, d = −0.86), but a non-significant change in PTSD-8 scores from baseline to 1-month follow-up for the tracking-only arm (B = 0.89, 95% CI [−2.70, 4.48], t(114) = 0.49, p = .624, d = 0.15). This resulted in significantly lower PTSD-8 scores at 1-month follow-up favouring the full intervention condition (B = −8.62, 95% CI [−14.00, −3.19], t(224) = −3.13, p = .002, d = −1.43). The small sample size and missing data mean these findings should be interpreted with caution.
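Because the three-way interaction is linear in the moderator, the conditional Time x Condition effect changes at a constant rate per K10 point, and the implied per-point slope can be recovered from the two probed values above. A sketch of that arithmetic; the resulting coefficient is inferred, not reported in the paper:

```python
# Conditional Time x Condition effects for PTSD-8 (baseline to 1-month),
# as reported at the two probed baseline K10 values.
b_at_22, b_at_40 = -1.64, 6.04

# Linear moderation: the conditional effect is a straight line in the
# moderator, so its slope approximates the three-way interaction
# coefficient (inferred here, not a quoted value).
slope = (b_at_40 - b_at_22) / (40 - 22)
print(round(slope, 3))  # → 0.427

# Sanity check: the line re-predicts the conditional effect at K10 = 22.
print(round(b_at_40 - slope * (40 - 22), 2))  # → -1.64
```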

App use and feedback

App engagement was low overall: the full intervention group logged into the app an average of 4.66 times (SD = 6.41) over the instructed 30-day period, significantly fewer than the tracking-only group (M = 9.53, SD = 10.00; p < .001). Engagement with the skills-based app content was also low, with the full intervention group accessing a mean of 4.78 activities (SD = 9.90). The number of activity and mood monitoring sessions was also tracked; tracking-only participants used the check-in function significantly more than full intervention participants (M = 19.93, SD = 4.64, vs M = 6.88, SD = 8.67; p < .001). Overall, those completing the follow-up survey tended to have higher rates of use (t(128.29) = −7.40, p < .001).
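The fractional degrees of freedom (128.29) indicate a Welch (unequal-variances) t-test for the use-by-completion comparison. A minimal sketch with simulated login counts; the group sizes and distributions are hypothetical, so the statistic will not match the reported value:

```python
import numpy as np
from scipy.stats import ttest_ind

rng = np.random.default_rng(1)

# Hypothetical login counts: survey completers assumed to log in more often.
# Group sizes here are illustrative, not the actual analysis ns.
logins_noncompleters = rng.poisson(lam=4, size=577)
logins_completers = rng.poisson(lam=10, size=303)

# equal_var=False runs Welch's t-test, consistent with the fractional df
# reported in the text (t(128.29) = -7.40).
res = ttest_ind(logins_noncompleters, logins_completers, equal_var=False)
print(res.statistic < 0 and res.pvalue < 0.001)  # → True
```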

All participants were asked at the 1-month follow-up to provide feedback on the app. In terms of the extent to which they felt the app had helped them to improve their mental fitness (from 1 “strongly disagree” to 5 “strongly agree”), full intervention participants rated the app as significantly more helpful on average (3.27/5; SD = 0.96) compared to tracking-only participants (2.56/5; SD = 0.87; p < .001). Almost all (91.35%) of the full intervention respondents indicated they would recommend the app to others, while only 66.89% of tracking-only respondents claimed they would do so (p < .001). Full intervention app users also rated the app significantly higher overall (4.07/5; SD = 0.93) compared to tracking-only app users (3.34/5; SD = 1.11; p < .001). Between-group comparisons for app user feedback items can be seen in S4 File.

Around half of respondents (129/254; 50.79%) reported stopping use of the app before completing the trial. A higher proportion of the full intervention group (61/103; 59.22%) ceased app use compared to the tracking-only group (68/151; 45.03%). The most common reason for stopping in the full intervention group was lack of time (23/61; 37.70%), whereas the most common reason in the tracking-only group was loss of interest (29/68; 42.65%).

When asked to nominate their favourite app feature, full intervention respondents most frequently noted ease of use/navigation, the wide variety of available content, and the daily tracker. Of the four key content areas, mindfulness was favoured the most.

In terms of least favourite app feature, full intervention respondents most frequently mentioned the lack of reminders to use the app (this function was available but required a settings change to turn on) and a lack of engaging or relevant content. A small minority reported technical issues (such as inability to access the app offline) and difficulty with navigation/ease of use.

Discussion

This study reports on the evaluation of Build Back Better, a smartphone app designed to reduce symptoms of psychological distress and related outcomes in ESWs. The RCT was conducted with ESWs across Australia, from a range of emergency service roles. Overall, we found the full app to perform no better than a tracking-only version (in which users monitored their mood and behaviour). Users of both versions of the app improved on most outcomes over time.

The null finding provides valuable insight into the utility of such programs within this workforce. We found engagement to be low, particularly in the full intervention group. Despite this, those who responded to the 1-month follow-up questionnaire provided generally positive user feedback on the app, with strong overall scores and high rates of recommendation, and the app was viewed significantly more favourably by the full intervention group. There is, of course, a degree of bias in this finding, as responders tended to be those who had engaged more with the app. It is possible that despite satisfaction with the app itself, engagement was limited by other barriers known to impact ESWs, such as time constraints and restrictions due to organisational culture and shift work [65]. Regardless, evidence suggests positive user experience is only weakly correlated with sustained app engagement [66], and the favourable user feedback is insufficient to recommend wider rollout of this app to ESWs in its current form. Importantly, greater engagement was not associated with between-group differences. This contrasts with consistent findings showing differential effects of similar content in more generalised workforces [29,52], especially at high levels of engagement. In those previous RCTs, rates of app use were more than double those seen here; however, those studies focused on participants with lower symptom levels. The present findings suggest that this form of brief, unguided digital intervention may not be sufficient to produce a marked difference in change scores compared to a month-long tracking-only condition in this vulnerable group. However, low overall engagement makes it difficult to determine whether users received an adequate ‘dosage’ within this study, and adequacy of dosage remains an ongoing issue, especially within self-guided programs.
Indeed, a recent meta-analysis [67] offers some support for this: at low symptom levels, guidance does not increase effect sizes, whereas more intensive, guided digital care appears more appropriate at higher symptom levels. Though the present sample was not severely unwell on the K10, there was considerable comorbid trauma symptomatology, which may have introduced significant complexity into the clinical profile of this sample, requiring more intensive content or delivery. Similarly, the high rates of medication use and help-seeking indicate more pronounced impairment. Interestingly, help-seeking appeared to be associated with a significant decrease in K10 scores for the full intervention arm over the course of follow-up (although among those not seeking help, the tracking-only arm had better short-term outcomes; see S3 File). Finally, in line with recommendations for codesign of ESW interventions [68], representatives from this population were involved in ongoing consultation throughout the app development process; on the basis of iterative codesign, Build Back Better moved from a sequential, once-daily 30-day format to an unstructured, self-directed format, which may have affected the app’s effectiveness [69]. It is unclear exactly what drove this null finding, especially given that the clinical basis for the app was previously associated with effects [29]. However, two major distinctions were the baseline clinical profile of users and the freeform app format. The format change may have led to poorer direction within the app: although users in all these studies were instructed to use the app consistently for 30 days, this instruction is likely to be considerably clearer when the app structure is aligned with it. Instead, participants in the present study (especially as they were symptomatic at baseline) may have used the app flexibly (e.g., as ‘needed’), which is unlikely to bring about sustained effect.
Equally, while the control condition was designed to mirror this flexibility, its ecological momentary assessment format may have promoted engagement.

Alternate explanations for these findings include the potential effect of the tracking-only condition which may have led to real, comparable change in this arm of the study. Indeed, there is evidence to suggest mood monitoring is associated with symptom improvement particularly in the case of depression [70,71], with active controls of this nature associated with significantly smaller effect sizes compared to other forms of control [18]. This tool may have been further enhanced by the inclusion of other forms of monitoring (e.g., activity, sleep) which may have allowed users to recognise patterns in these factors and improve self-regulation. This adds to the ongoing debate around accuracy of evidence for psychological intervention in the context of appropriate control conditions, and, in fact, what such conditions may be [72]. Further comparisons of monitoring and specific self-guided program elements are required to better understand the utility of this form of control condition and what therapeutic impacts it may have.

Interestingly, we found more sustained change in PTSD symptoms in the full app condition compared to the tracking-only arm. Furthermore, at higher baseline levels of psychological distress, the full app condition was associated with a significant short-term decrease in PTSD symptoms that was not present in the tracking-only version of the app. This suggests that although the distress symptoms measured by the K10 may not have been sensitive to change, the specific PTSD symptoms experienced may have shown some differential improvement when the full content was available. Given the exploratory nature of these analyses, further research is required to confirm these preliminary findings.

While there has been very limited research involving digital interventions with ESWs [68], and uptake and acceptability of interventions in this population are frequently reported as low [65], there is some evidence for Internet-based mindfulness improving adaptive resilience in firefighters [73]. More broadly, however, there is some evidence to suggest that relaxation (sometimes conflated with elements of mindfulness) might not be beneficial for depression outcomes when delivered in digital programs [74], while behavioural activation is among the most effective components [74,75]. Nevertheless, although both mindfulness and behavioural activation formed core therapeutic content of Build Back Better, engagement with these activities may not have been sufficient to elicit an effect. Relatedly, there is some emerging evidence for the benefits of interventions that involve body movement and group delivery within this high-risk workforce [76–78]. Indeed, firefighters have been found to prefer behavioural components of therapy [23], while the use of human support [79] may also be more important in this occupational group than in others, especially given the comparatively lower rates of technology uptake and engagement in these groups [80,81].

This trial was limited by constrained recruitment, with the randomised sample failing to meet the sample size estimate. It was further limited by high rates of attrition, which may have reduced power to detect an effect. As noted above, app engagement was low. Previous work in this area has highlighted a number of elements that might enhance engagement with workplace digital interventions [82], many of which were incorporated into the app build, such as brief activities [83] and persuasive technology (e.g., reminders and self-monitoring) [84]. Nevertheless, the unstructured program design may have limited engagement with the intervention. Recent meta-analyses have demonstrated that despite the potential for persuasive design systems to positively influence behaviour change [85], the association between these principles and efficacy or engagement in mental health apps is at best mixed [86,87]. Some untested elements may also have played into the findings, including expectancy effects, which are known to impact the effectiveness of digital interventions. Similarly, as with any trial of this kind, there is the potential for self-selection bias, and thus the findings may not be representative of all workers. Finally, PTSD symptom findings were affected by an error in the questionnaire skip procedure, meaning items were not asked of all users at follow-up (only those with past-month exposure).

This study found little support for a mindfulness and CBT digital intervention for ESWs experiencing psychological distress outperforming a tracking-only version of the app. These findings highlight the challenges of unguided interventions, especially in at-risk occupational groups. Despite previous findings and codesign work suggesting appeal within this group, engagement with the app was a major limitation. These low levels of use, combined with significant trial attrition, are likely to have impacted effectiveness. These challenges are not unique to ESWs but underscore the barriers to implementation of unguided interventions. Alternate avenues of intervention to address the mental health needs of these workers are required, particularly where traditional services are unavailable or where individuals do not meet clinical cutoff levels. These alternatives might include blended care or more guided modes of delivery (e.g., [88]). The use of adaptive and individually tailored approaches may also improve engagement and could be considered as a future direction of research.

Supporting information

S1 File. Intervention content.

(DOCX)

S2 File. Feedback items.

(DOCX)

S3 File. Moderation for medication use and help-seeking.

(DOCX)

S4 File. App user ratings.

(DOCX)

S5 File. CONSORT checklist.

(DOCX)

S6 File. Participant characteristics (Engager groups).

(DOCX)


Data Availability

A minimal, deidentified dataset is available via UNSWorks public repository (https://doi.org/10.26190/unsworks/31622).

Funding Statement

This research was funded by the Department of Health and Aged Care for the National Emergency Worker Support Service (Grant: 4-IHO1Z4B). SBH (grant number 1178666) and RB (grant no. 1173921) are supported by National Health and Medical Research Council (NHMRC) investigator grants. The funding sources had no role in the study design, data collection and analysis, decision to publish, or preparation of the manuscript.

References

1. McFarlane AC, Bryant RA. Post-traumatic stress disorder in occupational settings: anticipating and managing the risk. Occup Med (Lond). 2007;57(6):404–10. doi: 10.1093/occmed/kqm070
2. Beaton R, Murphy S, Johnson C, Pike K, Corneil W. Exposure to duty-related incident stressors in urban firefighters and paramedics. J Trauma Stress. 1998;11(4):821–8. doi: 10.1023/A:1024461920456
3. Skogstad M, Skorstad M, Lie A, Conradi HS, Heir T, Weisæth L. Work-related post-traumatic stress disorder. Occup Med (Lond). 2013;63(3):175–82. doi: 10.1093/occmed/kqt003
4. Petrie K, Milligan-Saville J, Gayed A, Deady M, Phelps A, Dell L, et al. Prevalence of PTSD and common mental disorders amongst ambulance personnel: a systematic review and meta-analysis. Soc Psychiatry Psychiatr Epidemiol. 2018;53(9):897–909. doi: 10.1007/s00127-018-1539-5
5. Wagner SL, White N, Buys N, Carey MG, Corneil W, Fyfe T, et al. Systematic review of mental health symptoms in firefighters exposed to routine duty-related critical incidents. Traumatology. 2021;27(3):285–302. doi: 10.1037/trm0000275
6. Martínez A, Blanch A. Are rescue workers still at risk? A meta-regression analysis of the worldwide prevalence of post-traumatic stress disorder and risk factors. Stress Health. 2024;40(4):e3372. doi: 10.1002/smi.3372
7. Sterud T, Hem E, Lau B, Ekeberg O. A comparison of general and ambulance specific stressors: predictors of job satisfaction and health problems in a nationwide one-year follow-up study of Norwegian ambulance personnel. J Occup Med Toxicol. 2011;6(1):10. doi: 10.1186/1745-6673-6-10
8. Hruska B, Barduhn MS. Dynamic psychosocial risk and protective factors associated with mental health in Emergency Medical Service (EMS) personnel. J Affect Disord. 2021;282:9–17. doi: 10.1016/j.jad.2020.12.130
9. Auth NM, Booker MJ, Wild J, Riley R. Mental health and help seeking among trauma-exposed emergency service staff: a qualitative evidence synthesis. BMJ Open. 2022;12(2):e047814. doi: 10.1136/bmjopen-2020-047814
10. Newell CJ, Ricciardelli R, Czarnuch SM, Martin K. Police staff and mental health: barriers and recommendations for improving help-seeking. Police Pract Res. 2021;23(1):111–24. doi: 10.1080/15614263.2021.1979398
11. Rikkers W, Lawrence D. Barriers to seeking help for an emotional or mental health condition among Australian emergency services workers. Australas J Disaster Trauma Stud. 2022;26(1):23–40.
12. Hazell CM, Koc Y, O’Brien S, Fielding-Smith S, Hayward M. Enhancing mental health awareness in emergency services (the ENHANcE I project): cross-sectional survey on mental health stigma among emergency services staff. BJPsych Open. 2021;7(3):e77. doi: 10.1192/bjo.2021.37
13. Carlan PE, Nored LS. An examination of officer stress: should police departments implement mandatory counseling? J Police Crim Psychol. 2008;23(1):8–15. doi: 10.1007/s11896-008-9015-x
14. Lanza A, Roysircar G, Rodgers S. First responder mental healthcare: evidence-based prevention, postvention, and treatment. Prof Psychol Res Pr. 2018;49(3):193–204. doi: 10.1037/pro0000192
15. Torous J, Bucci S, Bell IH, Kessing LV, Faurholt-Jepsen M, Whelan P, et al. The growing field of digital psychiatry: current evidence and the future of apps, social media, chatbots, and virtual reality. World Psychiatry. 2021;20(3):318–35. doi: 10.1002/wps.20883
16. Collins DAJ, Harvey SB, Lavender I, Glozier N, Christensen H, Deady M. A pilot evaluation of a smartphone application for workplace depression. Int J Environ Res Public Health. 2020;17(18):6753. doi: 10.3390/ijerph17186753
17. Firth J, Torous J, Nicholas J, Carney R, Pratap A, Rosenbaum S, et al. The efficacy of smartphone-based mental health interventions for depressive symptoms: a meta-analysis of randomized controlled trials. World Psychiatry. 2017;16(3):287–98. doi: 10.1002/wps.20472
  • 18.Linardon J, Cuijpers P, Carlbring P, Messer M, Fuller-Tyszkiewicz M. The efficacy of app-supported smartphone interventions for mental health problems: a meta-analysis of randomized controlled trials. World Psychiatry. 2019;18(3):325–36. doi: 10.1002/wps.20673 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 19.Deady M, Choi I, Calvo RA, Glozier N, Christensen H, Harvey SB. eHealth interventions for the prevention of depression and anxiety in the general population: a systematic review and meta-analysis. BMC Psychiatry. 2017;17(1):310. doi: 10.1186/s12888-017-1473-1 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 20.Firth J, Torous J, Nicholas J, Carney R, Rosenbaum S, Sarris J. Can smartphone mental health interventions reduce symptoms of anxiety? A meta-analysis of randomized controlled trials. J Affect Disord. 2017;218:15–22. doi: 10.1016/j.jad.2017.04.046 [DOI] [PubMed] [Google Scholar]
  • 21.Moshe I, Terhorst Y, Philippi P, Domhardt M, Cuijpers P, Cristea I, et al. Digital interventions for the treatment of depression: a meta-analytic review. Psychol Bull. 2021;147(8):749–86. doi: 10.1037/bul0000334 [DOI] [PubMed] [Google Scholar]
  • 22.Joyce S, Shand F, Lal TJ, Mott B, Bryant RA, Harvey SB. Resilience@Work mindfulness program: results from a cluster randomized controlled trial with first responders. J Med Internet Res. 2019;21(2):e12894. doi: 10.2196/12894 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 23.Deady M, Peters D, Lang H, Calvo R, Glozier N, Christensen H, et al. Designing smartphone mental health applications for emergency service workers. Occup Med (Lond). 2017;67(6):425–8. doi: 10.1093/occmed/kqx056 [DOI] [PubMed] [Google Scholar]
  • 24.O’Donnell ML, Lau W, Fredrickson J, Gibson K, Bryant RA, Bisson J, et al. An open label pilot study of a brief psychosocial intervention for disaster and trauma survivors. Front Psychiatry. 2020;11:483. doi: 10.3389/fpsyt.2020.00483 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 25.Wild J, Greenberg N, Moulds ML, Sharp M-L, Fear N, Harvey S, et al. Pre-incident training to build resilience in first responders: recommendations on what to and what not to do. Psychiatry. 2020;83(2):128–42. doi: 10.1080/00332747.2020.1750215 [DOI] [PubMed] [Google Scholar]
  • 26.Berkowitz S, Bryant R, Brymer M, Hamblen J, Jacobs A, Layne C, et al. Skills for Psychological Recovery: Field Operations Guide. National Center for PTSD and National Child Traumatic Stress Network; 2010.
  • 27.Rikkers W, Lawrence D. Mental health help-seeking experiences and service use among Australian first responders. Aust J Psychol. 2021;73(2):125–33. doi: 10.1080/00049530.2021.1882271 [DOI] [Google Scholar]
  • 28.Deady M, Collins DAJ, Glozier N, Gardiner E, Arena A, Gayed A, et al. Naturalistic evaluation of headgear: a smartphone app to reduce depressive symptoms in workers. Behav Ther. 2025;56(3):529–42. doi: 10.1016/j.beth.2024.01.001 [DOI] [PubMed] [Google Scholar]
  • 29.Deady M, Glozier N, Calvo R, Johnston D, Mackinnon A, Milne D, et al. Preventing depression using a smartphone app: a randomized controlled trial. Psychol Med. 2022;52(3):457–66. doi: 10.1017/S0033291720002081 [DOI] [PubMed] [Google Scholar]
  • 30.Karekla M, Kasinopoulos O, Neto DD, Ebert DD, Van Daele T, Nordgreen T, et al. Best practices and recommendations for digital interventions to improve engagement and adherence in chronic illness sufferers. Eur Psychol. 2019;24(1):49–67. doi: 10.1027/1016-9040/a000349 [DOI] [Google Scholar]
  • 31.Oakley-Girvan I, Yunis R, Longmire M, Ouillon JS. What works best to engage participants in mobile app interventions and e-health: a scoping review. Telemed J E Health. 2022;28(6):768–80. doi: 10.1089/tmj.2021.0176 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 32.Meuldijk D, Deady M, Collins DAJ, Williams DO, Bryant RA, Harvey SB. Feasibility of a mental health app intervention for emergency service workers and volunteers: single-arm pilot study. JMIR Form Res. 2025;9:e50995. doi: 10.2196/50995 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 33.Schulz KF, Altman DG, Moher D. CONSORT 2010 statement: updated guidelines for reporting parallel group randomised trials. J Pharmacol Pharmacother. 2010;1(2):100–7. doi: 10.4103/0976-500X.72352 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 34.Eysenbach G, CONSORT-EHEALTH Group. CONSORT-EHEALTH: improving and standardizing evaluation reports of Web-based and mobile health interventions. J Med Internet Res. 2011;13(4):e126. doi: 10.2196/jmir.1923 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 35.Kroenke K, Spitzer RL. The PHQ-9: a new depression diagnostic and severity measure. Psychiatr Ann. 2002;32(9):509–15. [Google Scholar]
  • 36.Deady M, Johnston DA, Glozier N, Milne D, Choi I, Mackinnon A, et al. Smartphone application for preventing depression: study protocol for a workplace randomised controlled trial. BMJ Open. 2018;8(7):e020510. doi: 10.1136/bmjopen-2017-020510 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 37.Bryant RA. Review of treatments for PTSD. In: Treating PTSD in first responders. A guide for serving those who serve. American Psychological Association; 2021. p. 81–90. [Google Scholar]
  • 38.Slade T, Grove R, Burgess P. Kessler Psychological Distress Scale: normative data from the 2007 Australian National Survey of Mental Health and Wellbeing. Aust N Z J Psychiatry. 2011;45(4):308–16. doi: 10.3109/00048674.2010.543653 [DOI] [PubMed] [Google Scholar]
  • 39.Kessler RC, Andrews G, Colpe LJ, Hiripi E, Mroczek DK, Normand SLT, et al. Short screening scales to monitor population prevalences and trends in non-specific psychological distress. Psychol Med. 2002;32(6):959–76. doi: 10.1017/s0033291702006074 [DOI] [PubMed] [Google Scholar]
  • 40.Kroenke K, Spitzer RL, Williams JB. The PHQ-9: validity of a brief depression severity measure. J Gen Intern Med. 2001;16(9):606–13. doi: 10.1046/j.1525-1497.2001.016009606.x [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 41.Martin A, Rief W, Klaiberg A, Braehler E. Validity of the Brief Patient Health Questionnaire Mood Scale (PHQ-9) in the general population. Gen Hosp Psychiatry. 2006;28(1):71–7. doi: 10.1016/j.genhosppsych.2005.07.003 [DOI] [PubMed] [Google Scholar]
  • 42.Spitzer RL, Kroenke K, Williams JBW, Löwe B. A brief measure for assessing generalized anxiety disorder: the GAD-7. Arch Intern Med. 2006;166(10):1092–7. doi: 10.1001/archinte.166.10.1092 [DOI] [PubMed] [Google Scholar]
  • 43.Bech P. Measuring the dimension of psychological general well-being by the WHO-5. Qual Life Newsl. 2004:15–6. [Google Scholar]
  • 44.World Health Organization. Wellbeing measures in primary health care/the DepCare Project: report on a WHO meeting, Stockholm, Sweden, 12–13 February 1998. Copenhagen: WHO Regional Office for Europe; 1998. [Google Scholar]
  • 45.Sischka PE, Costa AP, Steffgen G, Schmidt AF. The WHO-5 well-being index – validation based on item response theory and the analysis of measurement invariance across 35 countries. J Affect Disord Rep. 2020;1:100020. doi: 10.1016/j.jadr.2020.100020 [DOI] [Google Scholar]
  • 46.Topp CW, Østergaard SD, Søndergaard S, Bech P. The WHO-5 Well-Being Index: a systematic review of the literature. Psychother Psychosom. 2015;84(3):167–76. doi: 10.1159/000376585 [DOI] [PubMed] [Google Scholar]
  • 47.Hansen M, Andersen TE, Armour C, Elklit A, Palic S, Mackrill T. PTSD-8: a short PTSD inventory. Clin Pract Epidemiol Ment Health. 2010;6:101–8. doi: 10.2174/1745017901006010101 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 48.Andersen TE, Hansen M, Ravn SL, Seehuus R, Nielsen M, Vaegter HB. Validation of the PTSD-8 scale in chronic pain patients. Pain Med. 2018;19(7):1365–72. doi: 10.1093/pm/pnx166 [DOI] [PubMed] [Google Scholar]
  • 49.Bryant RA, Edwards B, Creamer M, O’Donnell M, Forbes D, Felmingham KL, et al. The effect of post-traumatic stress disorder on refugees’ parenting and their children’s mental health: a cohort study. Lancet Public Health. 2018;3(5):e249–58. doi: 10.1016/S2468-2667(18)30051-3 [DOI] [PubMed] [Google Scholar]
  • 50.Benight CC, Shoji K, James LE, Waldrep EE, Delahanty DL, Cieslak R. Trauma coping self-efficacy: a context-specific self-efficacy measure for traumatic stress. Psychol Trauma. 2015;7(6):591–9. doi: 10.1037/tra0000045 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 51.Kessler RC, Barber C, Beck A, Berglund P, Cleary PD, McKenas D, et al. The world health organization health and work performance questionnaire (HPQ). J Occup Environ Med. 2003;45(2):156–74. doi: 10.1097/01.jom.0000052967.43131.51 [DOI] [PubMed] [Google Scholar]
  • 52.Deady M, Collins DAJ, Lavender I, Mackinnon A, Glozier N, Bryant R, et al. Selective prevention of depression in workers using a smartphone app: randomized controlled trial. J Med Internet Res. 2023;25:e45963. doi: 10.2196/45963 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 53.Connor KM, Davidson JRT. Development of a new resilience scale: the Connor-Davidson Resilience Scale (CD-RISC). Depress Anxiety. 2003;18(2):76–82. doi: 10.1002/da.10113 [DOI] [PubMed] [Google Scholar]
  • 54.Gonzalez SP, Moore EWG, Newton M, Galli NA. Validity and reliability of the Connor-Davidson Resilience Scale (CD-RISC) in competitive sport. Psychol Sport Exerc. 2016;23:31–9. doi: 10.1016/j.psychsport.2015.10.005 [DOI] [Google Scholar]
  • 55.Bush K, Kivlahan DR, McDonell MB, Fihn SD, Bradley KA. The AUDIT alcohol consumption questions (AUDIT-C): an effective brief screening test for problem drinking. Ambulatory Care Quality Improvement Project (ACQUIP). Alcohol Use Disorders Identification Test. Arch Intern Med. 1998;158(16):1789–95. doi: 10.1001/archinte.158.16.1789 [DOI] [PubMed] [Google Scholar]
  • 56.Meneses-Gaya C, Zuardi AW, Loureiro SR, Hallak JEC, Trzesniak C, de Azevedo Marques JM, et al. Is the full version of the AUDIT really necessary? Study of the validity and internal construct of its abbreviated versions. Alcohol Clin Exp Res. 2010;34(8):1417–24. doi: 10.1111/j.1530-0277.2010.01225.x [DOI] [PubMed] [Google Scholar]
  • 57.Bradley KA, Rubinsky AD, Lapham GT, Berger D, Bryson C, Achtmeyer C, et al. Predictive validity of clinical AUDIT-C alcohol screening scores and changes in scores for three objective alcohol-related outcomes in a Veterans Affairs population. Addiction. 2016;111(11):1975–84. doi: 10.1111/add.13505 [DOI] [PubMed] [Google Scholar]
  • 58.Hawthorne G, Richardson J, Osborne R. The Assessment of Quality of Life (AQoL) instrument: a psychometric measure of health-related quality of life. Qual Life Res. 1999;8(3):209–24. doi: 10.1023/a:1008815005736 [DOI] [PubMed] [Google Scholar]
  • 59.Hawthorne G, Korn S, Richardson J. Population norms for the AQoL derived from the 2007 Australian National Survey of Mental Health and Wellbeing. Aust N Z J Public Health. 2013;37(1):7–16. doi: 10.1111/1753-6405.12004 [DOI] [PubMed] [Google Scholar]
  • 60.Stoyanov SR, Hides L, Kavanagh DJ, Zelenko O, Tjondronegoro D, Mani M. Mobile app rating scale: a new tool for assessing the quality of health mobile apps. JMIR Mhealth Uhealth. 2015;3(1):e27. doi: 10.2196/mhealth.3422 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 61.Linardon J, Anderson C, Messer M, Liu C, Torous J. Transdiagnostic-focused apps for depression and anxiety: a meta-analysis. NPJ Digit Med. 2025;8(1):443. doi: 10.1038/s41746-025-01860-3 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 62.Goreis A, Felnhofer A, Kafka JX, Probst T, Kothgassner OD. Efficacy of self-management smartphone-based apps for post-traumatic stress disorder symptoms: a systematic review and meta-analysis. Front Neurosci. 2020;14:3. doi: 10.3389/fnins.2020.00003 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 63.Han J, McGillivray L, Wong QJ, Werner-Seidler A, Wong I, Calear A, et al. A mobile health intervention (LifeBuoy App) to help young people manage suicidal thoughts: protocol for a mixed-methods randomized controlled trial. JMIR Res Protoc. 2020;9(10):e23655. doi: 10.2196/23655 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 64.Rodriguez-Paras C, Tippey K, Brown E, Sasangohar F, Creech S, Kum H-C, et al. Posttraumatic stress disorder and mobile health: app investigation and scoping literature review. JMIR Mhealth Uhealth. 2017;5(10):e156. doi: 10.2196/mhealth.7318 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 65.Alden LE, Matthews LR, Wagner S, Fyfe T, Randall C, Regehr C, et al. Systematic literature review of psychological interventions for first responders. Work Stress. 2020;35(2):193–215. doi: 10.1080/02678373.2020.1758833 [DOI] [Google Scholar]
  • 66.Kaveladze BT, Wasil AR, Bunyi JB, Ramirez V, Schueller SM. User experience, engagement, and popularity in mental health apps: secondary analysis of app analytics and expert app reviews. JMIR Hum Factors. 2022;9(1):e30766. doi: 10.2196/30766 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 67.Terhorst Y, Kaiser T, Brakemeier E-L, Moshe I, Philippi P, Cuijpers P, et al. Heterogeneity of treatment effects in internet- and mobile-based interventions for depression: a systematic review and meta-analysis. JAMA Netw Open. 2024;7(7):e2423241. doi: 10.1001/jamanetworkopen.2024.23241 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 68.Cogan N, Kirk A, Rodriguez C. The development and protocol for testing a co-created digital intervention (Sentinel) to improve mental wellbeing and help manage and prevent trauma in first responders. 2024.
  • 69.Josifovski N, Torok M, Batterham P, Wong Q, Beames JR, Theobald A, et al. Efficacy of BrighterSide, a self-guided app for suicidal ideation: randomized controlled trial. JMIR Ment Health. 2024;11:e55528. doi: 10.2196/55528 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 70.van der Watt ASJ, Odendaal W, Louw K, Seedat S. Distant mood monitoring for depressive and bipolar disorders: a systematic review. BMC Psychiatry. 2020;20(1):383. doi: 10.1186/s12888-020-02782-y [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 71.Dubad M, Winsper C, Meyer C, Livanou M, Marwaha S. A systematic review of the psychometric properties, usability and clinical impacts of mobile mood-monitoring applications in young people. Psychol Med. 2018;48(2):208–28. doi: 10.1017/S0033291717001659 [DOI] [PubMed] [Google Scholar]
  • 72.Cristea IA. The waiting list is an inadequate benchmark for estimating the effectiveness of psychotherapy for depression. Epidemiol Psychiatr Sci. 2019;28(3):278–9. doi: 10.1017/S2045796018000665 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 73.Joyce S, Shand F, Lal TJ, Mott B, Bryant RA, Harvey SB. Resilience@Work mindfulness program: results from a cluster randomized controlled trial with first responders. J Med Internet Res. 2019;21(2):e12894. doi: 10.2196/12894 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 74.Furukawa TA, Suganuma A, Ostinelli EG, Andersson G, Beevers CG, Shumake J, et al. Dismantling, optimising, and personalising internet cognitive behavioural therapy for depression: a systematic review and component network meta-analysis using individual participant data. Lancet Psychiatry. 2021;8(6):500–11. doi: 10.1016/S2215-0366(21)00077-8 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 75.Cuijpers P, Cristea IA, Karyotaki E, Reijnders M, Hollon SD. Component studies of psychological treatments of adult depression: a systematic review and meta-analysis. Psychother Res. 2019;29(1):15–29. doi: 10.1080/10503307.2017.1395922 [DOI] [PubMed] [Google Scholar]
  • 76.Tan L, Deady M, Mead O, Foright RM, Brenneman EM, Bryant RA, et al. Yoga resilience training to prevent the development of posttraumatic stress disorder in active-duty first responders: a cluster randomized controlled trial. Psychol Trauma. 2025;17(5):1108–16. doi: 10.1037/tra0001667 [DOI] [PubMed] [Google Scholar]
  • 77.Tan L, Strudwick J, Deady M, Bryant R, Harvey SB. Mind-body exercise interventions for prevention of post-traumatic stress disorder in trauma-exposed populations: a systematic review and meta-analysis. BMJ Open. 2023;13(7):e064758. doi: 10.1136/bmjopen-2022-064758 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 78.McKeon G, Wells R, Steel Z, Hadzi-Pavlovic D, Teasdale S, Vancampfort D, et al. An online mental health informed physical activity intervention for emergency service workers and their families: a stepped-wedge trial. Digit Health. 2023;9:20552076221149294. doi: 10.1177/20552076221149294 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 79.Musiat P, Johnson C, Atkinson M, Wilksch S, Wade T. Impact of guidance on intervention adherence in computerised interventions for mental health problems: a meta-analysis. Psychol Med. 2022;52(2):229–40. doi: 10.1017/S0033291721004621 [DOI] [PubMed] [Google Scholar]
  • 80.Grothe J, Tucker S, Blake A, Achutan C, Medcalf S, Suwondo T, et al. Exploring first responders’ use and perceptions on continuous health and environmental monitoring. Int J Environ Res Public Health. 2023;20(6):4787. doi: 10.3390/ijerph20064787 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 81.Deady M, Collins D, Gayed A, Harvey SB, Bryant R. The development of a smartphone app to enhance post-traumatic stress disorder treatment in high-risk workers. Digit Health. 2023;9:20552076231155680. doi: 10.1177/20552076231155680 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 82.Carolan S, Harris PR, Cavanagh K. Improving employee well-being and effectiveness: systematic review and meta-analysis of web-based psychological interventions delivered in the workplace. J Med Internet Res. 2017;19(7):e271. doi: 10.2196/jmir.7583 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 83.Cugelman B, Thelwall M, Dawes P. Online interventions for social marketing health behavior change campaigns: a meta-analysis of psychological architectures and adherence factors. J Med Internet Res. 2011;13(1):e17. doi: 10.2196/jmir.1367 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 84.Morrison LG, Yardley L, Powell J, Michie S. What design features are used in effective e-health interventions? A review using techniques from Critical Interpretive Synthesis. Telemed J E Health. 2012;18(2):137–44. doi: 10.1089/tmj.2011.0062 [DOI] [PubMed] [Google Scholar]
  • 85.Orji R, Moffatt K. Persuasive technology for health and wellness: state-of-the-art and emerging trends. Health Inform J. 2018;24(1):66–91. doi: 10.1177/1460458216650979 [DOI] [PubMed] [Google Scholar]
  • 86.Wu A, Scult MA, Barnes ED, Betancourt JA, Falk A, Gunning FM. Smartphone apps for depression and anxiety: a systematic review and meta-analysis of techniques to increase engagement. NPJ Digit Med. 2021;4(1):20. doi: 10.1038/s41746-021-00386-8 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 87.Valentine L, Hinton JDX, Bajaj K, Boyd L, O’Sullivan S, Sorenson RP, et al. A meta-analysis of persuasive design, engagement, and efficacy in 92 RCTs of mental health apps. NPJ Digit Med. 2025;8(1):229. doi: 10.1038/s41746-025-01567-5 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 88.Deady M, Collins DAJ, Azevedo S, Stech E, Harrison A, Broomfield C, et al. Integration of a smartphone app with posttraumatic stress disorder treatment for frontline workers: a pilot study. Aust J Psychol. 2024;76(1):2399112. doi: 10.1080/00049530.2024.2399112 [DOI] [PMC free article] [PubMed] [Google Scholar]

Decision Letter 0

Fatma Refaat Ahmed

18 Jun 2025

Dear Dr. Deady,

Thank you for submitting your manuscript to PLOS ONE. After careful consideration, we feel that it has merit but does not fully meet PLOS ONE’s publication criteria as it currently stands. Therefore, we invite you to submit a revised version of the manuscript that addresses the points raised during the review process.

Please submit your revised manuscript by Aug 02 2025 11:59PM. If you will need more time than this to complete your revisions, please reply to this message or contact the journal office at plosone@plos.org. When you're ready to submit your revision, log on to https://www.editorialmanager.com/pone/ and select the 'Submissions Needing Revision' folder to locate your manuscript file.

Please include the following items when submitting your revised manuscript:

  • A rebuttal letter that responds to each point raised by the academic editor and reviewer(s). You should upload this letter as a separate file labeled 'Response to Reviewers'.

  • A marked-up copy of your manuscript that highlights changes made to the original version. You should upload this as a separate file labeled 'Revised Manuscript with Track Changes'.

  • An unmarked version of your revised paper without tracked changes. You should upload this as a separate file labeled 'Manuscript'.

If you would like to make changes to your financial disclosure, please include your updated statement in your cover letter. Guidelines for resubmitting your figure files are available below the reviewer comments at the end of this letter.

If applicable, we recommend that you deposit your laboratory protocols in protocols.io to enhance the reproducibility of your results. Protocols.io assigns your protocol its own identifier (DOI) so that it can be cited independently in the future. For instructions see: https://journals.plos.org/plosone/s/submission-guidelines#loc-laboratory-protocols . Additionally, PLOS ONE offers an option for publishing peer-reviewed Lab Protocol articles, which describe protocols hosted on protocols.io. Read more information on sharing protocols at https://plos.org/protocols?utm_medium=editorial-email&utm_source=authorletters&utm_campaign=protocols .

We look forward to receiving your revised manuscript.

Kind regards,

Fatma Refaat Ahmed, Ph.D.

Academic Editor

PLOS ONE

Journal Requirements:

When submitting your revision, we need you to address these additional requirements.

1. Please ensure that your manuscript meets PLOS ONE's style requirements, including those for file naming. The PLOS ONE style templates can be found at

https://journals.plos.org/plosone/s/file?id=wjVg/PLOSOne_formatting_sample_main_body.pdf and

https://journals.plos.org/plosone/s/file?id=ba62/PLOSOne_formatting_sample_title_authors_affiliations.pdf

2.  We note that the original protocol that you have uploaded as a Supporting Information file contains an institutional logo. As this logo is likely copyrighted, we ask that you please remove it from this file and upload an updated version upon resubmission.

3. Thank you for stating the following in the Competing Interests section:

“MD, DM, DAJC, SBH developed the Build Back Better app, they recieve no benefit from this program.”

Please confirm that this does not alter your adherence to all PLOS ONE policies on sharing data and materials, by including the following statement: "This does not alter our adherence to PLOS ONE policies on sharing data and materials.” (as detailed online in our guide for authors http://journals.plos.org/plosone/s/competing-interests). If there are restrictions on sharing of data and/or materials, please state these. Please note that we cannot proceed with consideration of your article until this information has been declared.

Please include your updated Competing Interests statement in your cover letter; we will change the online submission form on your behalf.

4. In the online submission form, you indicated that “Data cannot be shared publicly as this was not clearly stated within the orginal ethics. However, deidentified data can be made available upon reasonable request to the authors.”

All PLOS journals now require all data underlying the findings described in their manuscript to be freely available to other researchers, either 1. In a public repository, 2. Within the manuscript itself, or 3. Uploaded as supplementary information.

This policy applies to all data except where public deposition would breach compliance with the protocol approved by your research ethics board. If your data cannot be made publicly available for ethical or legal reasons (e.g., public availability would compromise patient privacy), please explain your reasons on resubmission and your exemption request will be escalated for approval.

5. Please include captions for your Supporting Information files at the end of your manuscript, and update any in-text citations to match accordingly. Please see our Supporting Information guidelines for more information: http://journals.plos.org/plosone/s/supporting-information.

Reviewers' comments:

Reviewer's Responses to Questions

Comments to the Author

1. Is the manuscript technically sound, and do the data support the conclusions?

Reviewer #1: No

Reviewer #2: Yes

Reviewer #3: Partly

Reviewer #4: Yes

**********

2. Has the statistical analysis been performed appropriately and rigorously?

Reviewer #1: Yes

Reviewer #2: Yes

Reviewer #3: No

Reviewer #4: No

**********

3. Have the authors made all data underlying the findings in their manuscript fully available?

The PLOS Data policy requires authors to make all data underlying the findings described in their manuscript fully available.

Reviewer #1: No

Reviewer #2: Yes

Reviewer #3: No

Reviewer #4: No

**********

4. Is the manuscript presented in an intelligible fashion and written in standard English?

Reviewer #1: Yes

Reviewer #2: Yes

Reviewer #3: Yes

Reviewer #4: Yes

**********

Reviewer #1: Thanks for the opportunity to review this manuscript. Overall, the science is fine and it's well written.

I have significant concerns about the rationale for the study, and the interpretation of the findings, as follows:

The introduction, and therefore the discussion, fails to address the well-known engagement problem in digital mental health. There are hundreds of thousands of apps, fewer than 5% are still used at 30 days, and there are well-known severe issues with engagement in digital mental health apps (SEE - https://pubmed.ncbi.nlm.nih.gov/29871870/). The rationale for this study must be made in light of this problem. What did this study do to address the known challenges of engagement, and where it did not address these issues, why?

The manuscript conflates the method of past studies and this study. Was co-design done as part of this project? If so, please elaborate on how, when, what, etc in the methods and results. If not, please move the co-design findings out of the method and into the introduction, clearly expand on how the co-design was conducted, and critically, why? Headgear was an effective intervention, you co-designed a new intervention (why? how?) that was no longer effective. This must be explained in the introduction and revisited in the discussion. The main point of the discussion that is missing is how co-design improvements to Headgear resulted in a non-effective intervention. Please interrogate the co-design approach, the changes that were made, and what the broader digital mental health literature can tell us to explain this unexpected result.

The power calculation describes a prevention study; the title and method describe a treatment study. In addition, why wasn't the Headgear study used to power this study?

The study cycles between the terms usability, satisfaction, usage, and acceptability with inconsistency (SEE - https://pubmed.ncbi.nlm.nih.gov/30914003/). The aims in the introduction need to be clearer about how these constructs will be assessed and what thresholds for satisfaction and usability will be set. It is not sufficient to say satisfaction was higher; what was the pre-study threshold? The method describes qualitative data that is not presented, and the results of the 12 user engagement and app-related feedback questions are not presented. Table 6 presents means that are difficult to interpret: what does a few hundred opens actually mean?

Reviewer #2: Thank you for the opportunity to review the manuscript entitled "Efficacy of a smartphone app to improve mental health among emergency service workers: A randomised controlled trial". It is well-written and addresses an important topic. Below are some comments for consideration:

Lines 79-84: Few studies have been conducted on emergency service workers (ESWs) using digital interventions. What might be the reasons for this? Is it due to the demanding nature of their work, which may limit the use of electronic devices during work hours? This is a crucial point, as it could impact adherence to the intervention. More evidence is needed to explore the behavioral habits regarding technology use as a tool for addressing mental health issues in this group.

Line 88: Please provide a brief overview of the acceptability, acceptance, and preferences of ESW regarding the pilot study outcomes. What changes were implemented from the pilot results to inform the randomized controlled trial (RCT)?

Lines 130-132: All study participants were instructed to download the Build Back Better smartphone app, including a login code tailored to their group allocation. Can participants log in on multiple devices, as long as they enter the correct code? If so, this raises concerns about the potential for participants to access the app on multiple devices or on devices belonging to others, which could lead to contamination.

Line 159: Regarding the content of the intervention—1) mindfulness, 2) healthy coping, 3) managing thoughts, and 4) valued actions—why were these aspects selected? How was this content validated? Please describe the intervention in detail, including session structure, duration, and how participants can seek support or ask questions related to this content.

Lines 166-167: The app provides links and phone numbers to various mental health and workplace support services. What is the purpose of including this content? How can you determine whether the effectiveness observed in participants is attributable to the app intervention or if it results from direct contact with mental health services?

Line 177: What are the psychometric properties of the instruments used in the study?

Table 6: In the app activity usage patterns, the monthly open frequency (average number of times an activity is accessed) is reported. How much time do participants spend on the app? What measures are in place to ensure participants engage with all the content within the app?

Reviewer #3: The present paper reports on a two-arm repeated measures randomized trial aimed at improving mental health among emergency service workers. The intervention arm includes the full version of the Build Back Better app, while the control arm involves symptom tracking only. Each participant is assessed at three time points: baseline, 4 weeks post-baseline, and 3-month follow-up.

The manuscript has several methodological concerns that warrant attention:

Sample Size Calculation in Repeated Measures Design:

Sample size estimation for repeated measures trials is complex and typically requires specifying the intra-class correlation (ICC) or within-subject correlation. This critical parameter appears to be missing in the current manuscript. Without it, the sample size justification is incomplete.
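To make the reviewer's point concrete, the within-subject correlation materially changes the required sample size when the endpoint is analysed as a baseline-to-follow-up change score. The following is a minimal editorial sketch with hypothetical inputs (effect size, SD, and correlation values are illustrative, not taken from the trial); the function name is invented for this example.

```python
import math
from scipy.stats import norm

def n_per_group(delta, sigma, rho, alpha=0.05, power=0.80):
    """Approximate per-group n for a two-arm trial analysed as a
    baseline-to-follow-up change score. rho is the within-subject
    (test-retest) correlation; Var(change) = 2 * sigma^2 * (1 - rho)."""
    z_a = norm.ppf(1 - alpha / 2)  # two-sided significance threshold
    z_b = norm.ppf(power)          # power threshold
    var_change = 2 * sigma**2 * (1 - rho)
    # Standard two-sample formula applied to the change score
    return math.ceil(2 * var_change * (z_a + z_b) ** 2 / delta**2)
```

With an assumed between-group difference of 0.4 SD units, ignoring the correlation (rho = 0) roughly doubles the required n relative to rho = 0.5, which is why omitting this parameter leaves the sample size justification incomplete.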

Multiple Primary Outcomes:

Pages 8–10 describe six different outcomes, all of which seem to be treated as primary. This raises serious concerns about inflated Type I error rates due to multiple comparisons. Unless corrected (e.g., through statistical adjustment or designating a single primary outcome), the study is likely underpowered to detect significant effects across all six measures. The absence of a clearly defined primary outcome is a major design flaw.
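To illustrate the multiplicity concern, a step-down Holm correction across six outcomes can overturn naive per-test conclusions. The p-values below are purely hypothetical and are not results from this trial:

```python
from statsmodels.stats.multitest import multipletests

# Hypothetical unadjusted p-values for six outcome measures
pvals = [0.004, 0.012, 0.03, 0.04, 0.21, 0.60]

# Holm's procedure controls the family-wise error rate at alpha
reject, p_adj, _, _ = multipletests(pvals, alpha=0.05, method="holm")
```

At an uncorrected alpha of 0.05, four of these six would be declared significant; after Holm adjustment only the smallest survives, which is the inflation-of-Type-I-error point being raised.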

High Dropout Rate and Underpowering:

The reported dropout rate of 30% is substantial. While the authors have attempted to adjust the sample size accordingly, Figure 1 (CONSORT diagram) shows only a total of 171 participants completing the trial. If this is the number used for analysis at the primary endpoint, the study may be underpowered even for a single outcome. Additionally, Figure 1 is difficult to read and should be revised for clarity.

Handling of Dropout and Partial Data:

It is unclear how data from participants who dropped out but provided partial data were handled. Were these data included in any form of mixed-effects model or imputed? Clarification is needed on how missing data were addressed, particularly for a longitudinal design.
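For context on this point: a linear mixed model fitted by maximum likelihood uses all available observations, so participants who dropped out after contributing baseline or partial follow-up data still inform the estimates under a missing-at-random assumption. A minimal sketch with simulated data (all values and effect sizes are invented for illustration, not taken from the trial):

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)
n = 200

# Long-format toy data: 3 timepoints per participant
df = pd.DataFrame({
    "id": np.repeat(np.arange(n), 3),
    "time": np.tile([0, 1, 2], n),
    "group": np.repeat(rng.integers(0, 2, n), 3),
})
subject_effect = np.repeat(rng.normal(0, 2, n), 3)
df["k10"] = (25 - 2 * df["time"] - 0.5 * df["group"] * df["time"]
             + subject_effect + rng.normal(0, 3, len(df)))

# Drop ~30% of follow-up observations to mimic attrition; baseline
# rows are kept, so partial completers still contribute to the fit
drop = (df["time"] > 0) & (rng.random(len(df)) < 0.3)

model = smf.mixedlm("k10 ~ time * group", df[~drop], groups="id")
result = model.fit(reml=False)  # maximum likelihood, valid under MAR
```

No rows are imputed or discarded beyond the missing cells themselves; this is the clarification (mixed model vs. imputation vs. complete-case) the manuscript should state explicitly.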

Power Analysis Inconsistency:

On page 16, the authors report testing an interaction term for statistical significance, yet the power analysis on page 12 appears to be based on detecting a main effect only. A valid power analysis must align with the actual statistical model used in the analysis. If the interaction term is of interest, it should have been included in the original power estimation.

Conclusion:

I believe the paper has the potential to make a valuable contribution, but in its current form, it requires major revision. A well-defined primary outcome and a properly aligned power analysis would go a long way in strengthening the manuscript. Additionally, clarifying how missing data were handled and revising the presentation of the CONSORT diagram would enhance transparency and rigor.

Reviewer #4: This paper reports a primary analysis of an RCT comparing a digital intervention with a monitoring-only digital intervention for emergency service workers. There was limited evidence for differential efficacy between the two groups, with both showing comparable positive outcomes across a range of primary and secondary outcomes. The paper is well written; however, in my opinion, deeper reflection is needed on the dropout and engagement patterns, and what these mean for digital interventions in this space (e.g., early intervention/indicated prevention in ESWs).

Major Comments

- Method: Please explain the rationale for choosing a self-initiated mood/activity log rather than a protocol typically used in experience sampling studies (i.e., questionnaires sent at fixed/random/semi-random times each day). Whilst event-related approaches are common for behaviours like alcohol use, they are less common for internal experiences like mood, and studies usually have low compliance in these designs. This could also be a Discussion point for low engagement in both conditions.

- Data analysis page 13: The analysis plan includes multiple testing. What is the rationale for not correcting for multiple tests? Does controlling for multiple testing influence the pattern of results?

- Data analysis page 13 line 289-291. Explain the thresholds for no, minimal, and engagers. They seem quite arbitrary (e.g., why not use a continuous variable?), and it would be worth including this as a broader point in the discussion (including the lack of guidelines around what acceptable engagement actually is in digital interventions and monitoring; e.g., see the experience sampling literature).

- Discussion: The authors acknowledge in the Discussion the high amount of dropout from baseline to T1 and T2, and low engagement, in both conditions (but particularly in the full intervention condition). Related to the point above, perhaps the instruction given to participants influenced engagement (Page 6 line 132: “Pts were asked to use the app consistently for 30 days”). On the one hand, this instruction is vague. On the other hand, if pts were instructed to use the app/s flexibly, perhaps the low compliance reflects just that (e.g., they used it when they needed to). The Discussion can be enhanced by elaborating on these points in the context of the specific intervention/s but also more broadly in terms of what it means for “preventive” digital interventions (or interventions more generally) for this population.

- Discussion: The disconnect between very low engagement and relatively positive app-feedback is a surprising result which has not been adequately addressed. I recommend adding a discussion point about why this disconnect might be present in this group and whether positive feedback is enough to recommend broader use/scale-up (i.e., what can we really do with this information given that the full intervention was not really used and did not improve most outcomes compared to the control?). Relatedly, is this disconnect common in digital intervention studies, or within this sample?

Minor Comments

- The authors mention in the introduction and methods that the app was co-designed with ESWs. Please specify the level of co-design according to established models – e.g., did they actively collaborate with and make decisions together with the research team, or was it still a top-down approach to development (with researchers/clinicians making the final decisions)? This could add to the authors’ point in the Discussion about app development and dropout/engagement.

- Page 4 line 78-79: “However, less is known on the equivalence of app-based interventions compared to browser-based digital programs [21].” This makes it seem like the trial will be comparing these 2 variants or focusing on browser-based programs. I recommend removing or revising so that it aligns with the approach of the current study.

- Method: Please include examples of the mood and activity tracking questions (e.g., in supp materials)

- Methods: Usability and acceptability are broad terms that are defined differently across studies and within the implementation literature. Please define these concepts and link to previous literature. Relatedly, please describe how the engagement and app-related feedback questionnaire was developed (was it based on a previous study? Was it developed as a bespoke questionnaire?).

- Results: Given that the purpose of the RCT is to compare the effect of the two groups across time, rather than within-group changes, I suggest summarising these results clearly first.

- Page 16 line 341. In the primary outcomes section, it is written that the interaction “seldom” reached significance. No statistical tests for the interaction were significant, please reword to remove ambiguity.

- Discussion page 29. Please revise this sentence so that the meaning is clear: “Nevertheless, although both mindfulness and behavioural activation formed part of the Build Back Better content, as part of the series of skills presented and may not have been engaged with in sufficient depth to elicit effect.”

**********

Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy.

If you choose “no”, your identity will remain anonymous but your review may still be made public. If published, this will include your full peer review and any attached files.

Reviewer #1: No

Reviewer #2: Yes: Cho Lee Wong

Reviewer #3: No

Reviewer #4: No

**********

[NOTE: If reviewer comments were submitted as an attachment file, they will be attached to this email and accessible via the submission site. Please log into your account, locate the manuscript record, and check for the action link "View Attachments". If this link does not appear, there are no attachment files.]

While revising your submission, please upload your figure files to the Preflight Analysis and Conversion Engine (PACE) digital diagnostic tool, https://pacev2.apexcovantage.com/ . PACE helps ensure that figures meet PLOS requirements. To use PACE, you must first register as a user. Registration is free. Then, login and navigate to the UPLOAD tab, where you will find detailed instructions on how to use the tool. If you encounter any issues or have any questions when using PACE, please email PLOS at figures@plos.org . Please note that Supporting Information files do not need this step.

PLoS One. 2026 Feb 5;21(2):e0342419. doi: 10.1371/journal.pone.0342419.r002

Author response to Decision Letter 1


24 Sep 2025

Fatma Refaat Ahmed, Ph.D.

Academic Editor

PLOS ONE

Dear Assoc Professor Refaat Ahmed,

Thank you for inviting us to resubmit this revised manuscript to PLOS ONE. In the attached manuscript we have addressed the items raised by the editorial team and peer reviewers, and we have included our responses to the reviewer comments below.

Kind Regards,

Mark Deady, PhD

Editor comments:

Journal Requirements:

When submitting your revision, we need you to address these additional requirements.

1. Please ensure that your manuscript meets PLOS ONE's style requirements, including those for file naming. The PLOS ONE style templates can be found at

https://journals.plos.org/plosone/s/file?id=wjVg/PLOSOne_formatting_sample_main_body.pdf and

https://journals.plos.org/plosone/s/file?id=ba62/PLOSOne_formatting_sample_title_authors_affiliations.pdf

The manuscript has been updated to meet PLOS ONE's style requirements as follows:

• Changes to formatting of headings throughout.

• Removal of “Key words” and “Funding” on p. 3.

• Capitalisation changed to lower case where relevant.

• Continuation of line numbers throughout.

• Figures listed as “Fig 1”, “Fig 2”, etc.

• Tables included directly after the paragraph in which they are first cited.

File names have also been updated to meet the journal’s requirements.

2. We note that the original protocol that you have uploaded as a Supporting Information file contains an institutional logo. As this logo is likely copyrighted, we ask that you please remove it from this file and upload an updated version upon resubmission.

The institutional logo has been removed from the trial protocol and an updated version has been uploaded for inclusion as a Supporting Information file (“S3_File”).

3. Thank you for stating the following in the Competing Interests section:

“MD, DM, DAJC, SBH developed the Build Back Better app, they recieve no benefit from this program.”

Please confirm that this does not alter your adherence to all PLOS ONE policies on sharing data and materials, by including the following statement: "This does not alter our adherence to PLOS ONE policies on sharing data and materials.” (as detailed online in our guide for authors http://journals.plos.org/plosone/s/competing-interests). If there are restrictions on sharing of data and/or materials, please state these. Please note that we cannot proceed with consideration of your article until this information has been declared.

Please include your updated Competing Interests statement in your cover letter; we will change the online submission form on your behalf.

Thank you kindly for this information. Please update our Competing Interests statement as follows:

“MD, DM, DAJC, and SBH developed the Build Back Better app. They receive no financial benefit from this program. This does not alter our adherence to PLOS ONE policies on sharing data and materials.”

4. In the online submission form, you indicated that “Data cannot be shared publicly as this was not clearly stated within the orginal ethics. However, deidentified data can be made available upon reasonable request to the authors.”

All PLOS journals now require all data underlying the findings described in their manuscript to be freely available to other researchers, either 1. In a public repository, 2. Within the manuscript itself, or 3. Uploaded as supplementary information.

This policy applies to all data except where public deposition would breach compliance with the protocol approved by your research ethics board. If your data cannot be made publicly available for ethical or legal reasons (e.g., public availability would compromise patient privacy), please explain your reasons on resubmission and your exemption request will be escalated for approval.

Thank you for drawing our attention to this. We would like to request exemption from making the data publicly available for ethical reasons, as detailed in our updated Data Availability statement:

“The data underlying the findings described in this manuscript are not publicly available due to the human research ethics committee conditions of approval for the study. Other researchers who wish to access the data must provide written evidence of their own human research ethics approval before the data can be shared for secondary research purposes. Deidentified data can be made available upon reasonable request to the corresponding author (MD) along with evidence of human research ethics approval.”

5. Please include captions for your Supporting Information files at the end of your manuscript, and update any in-text citations to match accordingly. Please see our Supporting Information guidelines for more information: http://journals.plos.org/plosone/s/supporting-information.

Captions for Supporting Information files have now been included on the final page of the manuscript. The in-text citation referring to “supplementary materials” has also been updated to match the updated file name “S1_File”.

Reviewers' comments:

Reviewer #1:

Thanks for the opportunity to review this manuscript. Overall, the science is fine and it's well written.

I have significant concerns about the rationale for the study, and the interpretation of the findings, as follows:

1. The introduction, and therefore the discussion, fails to address the well-known engagement problem in digital mental health. There are hundreds of thousands of apps, fewer than 5% are still used at 30 days, and there are well-known severe issues with engagement with digital mental health apps (SEE- https://pubmed.ncbi.nlm.nih.gov/29871870/). The rationale for this study must be made in light of this problem. What did this study do to address the known challenges of engagement, and where it did not address these issues, why?

We agree with the issues raised by Reviewer 1. Indeed, this was the major rationale for the codesign process described below to tailor an existing evidence-based app to this new population. We have included greater detail on this process, the rationale for it, and the existing issues within the literature in this regard. We took a number of steps, informed by the literature, to enhance engagement, as outlined below. As the Reviewer highlights, Torous and colleagues note several strategies to improve engagement, including the involvement of end users in the conception, design and testing of apps (which we have provided further detail on within the manuscript). Furthermore, as this paper states, “apps are not the only sphere where engagement can be difficult. Adherence and engagement with information, monitoring, medications and psychotherapy are challenging problems encountered with both the face-to-face or via a digital interface.” Unguided apps/digital interventions do, of course, allow for both the freedom, and with it, the obstacle of ease of disengagement, which can occur without breaking any social contract of accountability. Ultimately, in our case engagement remained a significant problem despite considered steps to develop an engaging solution that met user needs. This finding has recently been demonstrated in a meta-analysis [1], highlighting the considerable challenges inherent in optimising engagement with digital interventions.

“In order to adapt the program to the needs of ESWs, an iterative codesign process was undertaken with ESWs and mental health experts, including psychiatrists and clinical psychologists, consulted alongside design, user experience, and IT teams. Workshops and multiple rounds of user testing and revision took place during this phase with 12 ESWs to tailor this product. This codesign was deemed essential to meet the unique needs of the population. In order to overcome known challenges of engagement [2], several specific techniques associated with increased engagement were employed. These included personalised content where users selected areas of need, data visualisation (tracking), reminders/push notifications, educational information, self-monitoring, and goal-setting features [3].”

2. The manuscript conflates the method of past studies and this study. Was co-design done as part of this project? If so, please elaborate on how, when, what, etc in the methods and results. If not, please move the co-design findings out of the method and into the introduction, clearly expand on how the co-design was conducted, and critically, why? Headgear was an effective intervention, you co-designed a new intervention (why? how?) that was no longer effective. This must be explained in the introduction and revisited in the discussion. The main point of the discussion that is missing is how co-design improvements to Headgear resulted in a non-effective intervention. Please interrogate the co-design approach, the changes that were made, and what the broader digital mental health literature can tell us to explain this unexpected result.

We have modified the methods and introduction to better reflect that the codesign occurred outside of this specific RCT study. Co-design was an iterative process working with ESWs, mental health professionals, and design, user experience, and IT teams.

Although the reviewer is correct that HeadGear's effectiveness was important in the genesis of the project, simply reusing this app in such a unique population was viewed as inadequate [4, 5]. It was believed that the codesign process would lead to enhanced effectiveness and improved outcomes by addressing individual needs and preferences, at least in part through better engagement, increased motivation, and greater salience of message. As per the reviewer’s first query, engagement was a major consideration and something we hoped to improve. As the reviewer points out, in our case this did not occur (although a direct comparison is missing). We have further elaborated on this in the discussion. However, there are other trial differences that may explain the different findings.

“…Finally, in line with recommendations for codesign of ESW interventions [6], representatives from this population were involved in ongoing consultation throughout the app development process, and on the basis of iterative codesign Build Back Better moved from a sequential 30-day once-daily format to an unstructured, self-directed format, which may have impacted the effectiveness of the app [7]. It is unclear exactly what drove this null finding, especially where the clinical basis for the app was previously associated with effect [8]. However, two major distinctions were the baseline clinical profile of users and the freeform app format. Specifically, this format change may have led to poorer direction within the app. Although users in all these studies were directed to “use the app consistently for 30 days”, this instruction is likely to be considerably clearer where the app structure follows it. Instead, it is more likely present study participants (especially as they were symptomatic at baseline) used the app flexibly (e.g., when they needed to), which is unlikely to bring about sustained effect.”

3. The power calculation describes a prevention study, the title and method describes a treatment study. In addition, why wasnt the Headgear study used to power this study?

The sample was different in terms of risk and outcome of interest. Where the HeadGear study was universal in nature, examining depressive symptom change with baseline restrictions around symptom severity, the current trial aimed to improve psychological distress in a group experiencing at least moderate levels of psychological distress. However, we identified an error in the section detailing this calculation, which we have since corrected. Nevertheless, the study was underpowered upon completion due to insufficient recruitment; we have raised this as a point of discussion.

4. The study cycles between the terms usability, satisfaction, usage, acceptability with inconsistency. (SEE - https://pubmed.ncbi.nlm.nih.gov/30914003/) The aims in the introduction need to be clearer about how these constructs will be assessed and what thresholds for satisfaction and usability will be set. It is not sufficient to say satisfaction was higher, what was the pre-study threshold? The method describes qualitative data that is not presented, and all of the 12 user engagement and app-related feedback questions results are not presented. Table 6 presents means that are difficult to interpret- what does a few hundred opens actually mean?

We have sought to clarify this terminology throughout for consistency. Our measure of usability was based on the Mobile Application Rating Scale. Although not exhaustive, these items from this validated app quality rating tool provide some insight into user experience of the app.

“Items were adapted from the Mobile Application Rating Scale [9], including ease of use, understanding of content, engagement and interest in the design and content, likelihood of recommending to others, and overall rating of the app and website. Further questions measured the subjective perception of improvement in mental fitness, and reasons for stopping app use. Participants also provided general feedback and suggestions via open-response questions. These measures have been used previously [10].”

We have further removed specific reference to this within the aims, as the exploration of use patterns and feedback merely supports the main analysis and the results do not sufficiently answer a unique question. We do not state that “satisfaction is higher”, but we do feel it is valuable to compare the groups on these feedback scores to determine differences in experience of the two versions of the app. However, we note that there are no established thresholds in this space to reliably determine whether the app was viewed as “adequately” helpful.

In light of Reviewer 1’s comments we have removed Table 6 from the manuscript and in place have provided supplemental material detailing full list of feedback items used. We have also modified the app use and feedback section to provide greater detail of the findings in regard to this feedback.

“Around half of respondents (129/254; 50.79%) reported stopping use of the app before completing the trial. A higher proportion of the full intervention group (61/103; 59.22%) ceased app use compared to the tracking-only group (68/151; 45.03%). The most common reason for stopping in the full intervention group was lack of time (23/61; 37.70%), whereas the most common reason in the tracking-only group was loss of interest (29/68; 42.65%).

When asked to nominate their favourite app feature, full intervention respondents most frequently noted ease of use/navigation, the wide variety of available content, and the daily tracker. Of the four key content areas, mindfulness was favoured the most.

In terms of least favourite app feature, full intervention respondents most frequently mentioned lack of reminders to use the app (this function was available but required setting changes to turn on), and lack of engaging or relevant content. A small minority reported technical issues (such as inability to access offline) and difficulty with navigation/ease of use.”

Reviewer #2:

Thank you for the opportunity to review the manuscript entitled "Efficacy of a smartphone app to improve mental health among emergency service workers: A randomised controlled trial". It is well-written and addresses an important topic. Below are some comments for consideration:

1. Lines 79-84: Few studies have been conducted on emergency service workers (ESWs) using digital interventions. What might be the reasons for this? Is it due to the demanding nature of their work, which may limit the use of electronic devices during work hours? This is a crucial point, as it could impact adherence to the intervention. More evidence is needed to explore the behavioral habits regarding technology use as a tool for addressing mental health issues in this group.

We have sought to further explore this in the revised manuscript. Primarily, this is the result of limited interventions specifically designed for this population more broadly. Certainly, there are limitations related to on-shift use but these also apply to most frontline workforces (e.g. health

Decision Letter 1

Fatma Refaat Ahmed

5 Nov 2025

Dear Dr. Deady,

Please submit your revised manuscript by Dec 20 2025 11:59PM. If you will need more time than this to complete your revisions, please reply to this message or contact the journal office at plosone@plos.org . When you're ready to submit your revision, log on to https://www.editorialmanager.com/pone/ and select the 'Submissions Needing Revision' folder to locate your manuscript file.

  • A rebuttal letter that responds to each point raised by the academic editor and reviewer(s). You should upload this letter as a separate file labeled 'Response to Reviewers'.

  • A marked-up copy of your manuscript that highlights changes made to the original version. You should upload this as a separate file labeled 'Revised Manuscript with Track Changes'.

  • An unmarked version of your revised paper without tracked changes. You should upload this as a separate file labeled 'Manuscript'.

If applicable, we recommend that you deposit your laboratory protocols in protocols.io to enhance the reproducibility of your results. Protocols.io assigns your protocol its own identifier (DOI) so that it can be cited independently in the future. For instructions see: https://journals.plos.org/plosone/s/submission-guidelines#loc-laboratory-protocols . Additionally, PLOS ONE offers an option for publishing peer-reviewed Lab Protocol articles, which describe protocols hosted on protocols.io. Read more information on sharing protocols at https://plos.org/protocols?utm_medium=editorial-email&utm_source=authorletters&utm_campaign=protocols .

We look forward to receiving your revised manuscript.

Kind regards,

Fatma Refaat Ahmed, Ph.D.

Academic Editor

PLOS ONE

Journal Requirements:

1. If the reviewer comments include a recommendation to cite specific previously published works, please review and evaluate these publications to determine whether they are relevant and should be cited. There is no requirement to cite these works unless the editor has indicated otherwise.

2. Please review your reference list to ensure that it is complete and correct. If you have cited papers that have been retracted, please include the rationale for doing so in the manuscript text, or remove these references and replace them with relevant current references. Any changes to the reference list should be mentioned in the rebuttal letter that accompanies your revised manuscript. If you need to cite a retracted article, indicate the article’s retracted status in the References list and also include a citation and full reference for the retraction notice.


Reviewers' comments:

Reviewer's Responses to Questions

Comments to the Author

Reviewer #5: All comments have been addressed

Reviewer #6: (No Response)

**********

2. Is the manuscript technically sound, and do the data support the conclusions?

Reviewer #5: Yes

Reviewer #6: Partly

**********

3. Has the statistical analysis been performed appropriately and rigorously?

Reviewer #5: Yes

Reviewer #6: N/A

**********

4. Have the authors made all data underlying the findings in their manuscript fully available?


Reviewer #5: No

Reviewer #6: Yes

**********

5. Is the manuscript presented in an intelligible fashion and written in standard English?

Reviewer #5: Yes

Reviewer #6: Yes

**********

Reviewer #5: Thank you for the opportunity to conduct a second-look review. I was asked to comment on a manuscript that has already undergone extensive peer review, and I have therefore focused my assessment on whether the authors have addressed the editor’s and reviewers’ first-round comments. It would be unfair of me to provide a fresh round of comments at this stage. The authors have, in my view, largely addressed the prior comments through concrete textual revisions, clearer reporting, and additional analyses. A small number of issues are acknowledged rather than fully resolved.

Specific points addressed

* formatting, file naming, and figure/table placement have been corrected.

* the data availability statement has been clarified and the authors request an ethics-based exemption; I note this requires an editorial decision, as it isn't clear why the committee is blocking public sharing.

* the description of the power analysis has been aligned with the tests actually conducted; explicit between-group results at follow-ups have been added.

* missing-data handling is now described (linear mixed models with maximum likelihood under a MAR assumption).

* the results and discussion have been adjusted to acknowledge low engagement and minimal between-group differences, with implications for unguided digital interventions made more explicit.

On balance, the first-round comments have been adequately addressed. I recommend acceptance subject to the editorial team being satisfied with the data availability wording and any minor textual edits they may request.

Reviewer #6: The authors have made a commendable effort to address the concerns raised in the previous review round and the manuscript has been improved. However, some questions remain.

- Please clarify which software or method was used for the power analysis (e.g., G*Power, R package, or other).

- The use of linear mixed models (LMMs) with maximum likelihood estimation is methodologically appropriate for repeated measures data; however, the authors indicate that models “did not adjust for baseline scores.” The rationale for this decision should be provided.

- Little’s MCAR test was non-significant (p = 1.00), suggesting that there was no systematic missingness. However, attrition rates differed significantly between conditions at both follow-ups. This discrepancy indicates that data were unlikely to be missing completely at random (MCAR).

- Including descriptive comparisons of baseline characteristics across engagement groups (non-engagers, minimal engagers, and engagers) would enhance the interpretability of the findings and provide important context regarding potential differences in initial symptom severity or demographics.

- The discussion section is well written, appropriately acknowledging the null results and relevant contextual factors affecting the target population. In fact, the authors noted that the active control condition (mood and behavior tracking) is an important feature of this trial and likely contributed to improvements in both arms. The implications of using an active comparator deserve greater discussion.

- Exploratory findings, particularly the benefits for PTSD symptoms and help-seeking participants, should be presented more cautiously, given the small subsample and missing data.

- Finally, the conclusion that unguided digital interventions may not suit this population is reasonable but could be complemented by more constructive recommendations for future development (hybrid or guided models? adaptive tailoring based on symptom severity?).

**********

PLOS authors have the option to publish the peer review history of their article (what does this mean?). If published, this will include your full peer review and any attached files.

If you choose “no”, your identity will remain anonymous but your review may still be made public.

Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy

Reviewer #5: Yes: Dr Daniel Leightley

Reviewer #6: No

**********

[NOTE: If reviewer comments were submitted as an attachment file, they will be attached to this email and accessible via the submission site. Please log into your account, locate the manuscript record, and check for the action link "View Attachments". If this link does not appear, there are no attachment files.]

To ensure your figures meet our technical requirements, please review our figure guidelines: https://journals.plos.org/plosone/s/figures

You may also use PLOS’s free figure tool, NAAS, to help you prepare publication quality figures: https://journals.plos.org/plosone/s/figures#loc-tools-for-figure-preparation.

NAAS will assess whether your figures meet our technical requirements by comparing each figure against our figure specifications.

PLoS One. 2026 Feb 5;21(2):e0342419. doi: 10.1371/journal.pone.0342419.r004

Author response to Decision Letter 2


10 Nov 2025

Fatma Refaat Ahmed, Ph.D.

Academic Editor

PLOS ONE

Dear Assoc Professor Refaat Ahmed,

Thank you for inviting us to resubmit this revised manuscript to PLOS ONE. In the attached manuscript we have addressed the items raised by the editorial team and peer reviewers, and we have included our responses to the reviewer comments below.

Kind Regards,

Mark Deady, PhD

Comments to the Author

1. If the authors have adequately addressed your comments raised in a previous round of review and you feel that this manuscript is now acceptable for publication, you may indicate that here to bypass the “Comments to the Author” section, enter your conflict of interest statement in the “Confidential to Editor” section, and submit your "Accept" recommendation.

Reviewer #5: All comments have been addressed

Reviewer #6: (No Response)

NA

2. Is the manuscript technically sound, and do the data support the conclusions?

The manuscript must describe a technically sound piece of scientific research with data that supports the conclusions. Experiments must have been conducted rigorously, with appropriate controls, replication, and sample sizes. The conclusions must be drawn appropriately based on the data presented.

Reviewer #5: Yes

Reviewer #6: Partly

NA

3. Has the statistical analysis been performed appropriately and rigorously?

Reviewer #5: Yes

Reviewer #6: N/A

NA

4. Have the authors made all data underlying the findings in their manuscript fully available?

The PLOS Data policy requires authors to make all data underlying the findings described in their manuscript fully available without restriction, with rare exception (please refer to the Data Availability Statement in the manuscript PDF file). The data should be provided as part of the manuscript or its supporting information, or deposited to a public repository. For example, in addition to summary statistics, the data points behind means, medians and variance measures should be available. If there are restrictions on publicly sharing data—e.g. participant privacy or use of data from a third party—those must be specified.

Reviewer #5: No

Reviewer #6: Yes

We have now provided this in a repository (https://doi.org/10.26190/unsworks/31622), which will be live once the manuscript is published.

5. Is the manuscript presented in an intelligible fashion and written in standard English?

PLOS ONE does not copyedit accepted manuscripts, so the language in submitted articles must be clear, correct, and unambiguous. Any typographical or grammatical errors should be corrected at revision, so please note any specific errors here.

Reviewer #5: Yes

Reviewer #6: Yes

NA

6. Review Comments to the Author

Please use the space provided to explain your answers to the questions above. You may also include additional comments for the author, including concerns about dual publication, research ethics, or publication ethics. (Please upload your review as an attachment if it exceeds 20,000 characters)

Reviewer #5: Thank you for the opportunity to conduct a second-look review. I was asked to comment on a manuscript that has already undergone extensive peer review, and I have therefore focused my assessment on whether the authors have addressed the editor’s and reviewers’ first-round comments. It would be unfair of me to provide a fresh round of comments at this stage. The authors have, in my view, largely addressed the prior comments through concrete textual revisions, clearer reporting, and additional analyses. A small number of issues are acknowledged rather than fully resolved.

Specific points addressed

* formatting, file naming, and figure/table placement have been corrected.

* the data availability statement has been clarified and the authors request an ethics-based exemption; I note this requires an editorial decision, as it isn't clear why the committee is blocking this.

* the description of the power analysis has been aligned with the tests actually conducted; explicit between-group results at follow-ups have been added.

* missing data handling is now described (linear mixed models with maximum likelihood under a MAR assumption).

* the results and discussion have been adjusted to acknowledge low engagement and minimal between-group differences, with implications for unguided digital interventions made more explicit.

On balance, the first-round comments have been adequately addressed. I recommend acceptance subject to the editorial team being satisfied with the data availability wording and any minor textual edits they may request.

NA

Reviewer #6: The authors have made a commendable effort to address the concerns raised in the previous review round and the manuscript has been improved. However, some questions remain.

- Please clarify which software or method was used for the power analysis (e.g., G*Power, R package, or other).

We have now included this.

G*Power was used to calculate power.

- The use of linear mixed models (LMMs) with maximum likelihood estimation is methodologically appropriate for repeated measures data; however, the authors indicate that models “did not adjust for baseline scores.” The rationale for this decision should be provided.

The reviewer raises a good point. Randomisation in this trial resulted in comparable groups of participants in trial arms at baseline (see Table 2). As such, our analytical models did not adjust for baseline scores. We have now revised the Data Analysis section to include this point:

Models did not adjust for baseline scores as randomisation in this trial resulted in comparable groups of participants in trial arms at baseline (see Table 2).

- Little’s MCAR test was non-significant (p = 1.00), suggesting that there was no systematic missingness. However, attrition rates differed significantly between conditions at both follow-ups. This discrepancy indicates that data were unlikely to be missing completely at random (MCAR).

This is a good point. We have now added further information in the revised paper to provide greater clarification in the Results section:

Across all outcome measures, Little’s Missing Completely at Random (MCAR) test was not significant, χ2(3564) = 2379.28, p = 1.00, indicating missing data were MCAR. However, attrition from assessment was greater in the full intervention condition than the tracking-only condition at 1-month follow-up (72.3% vs 58.9%; χ2(1) = 17.52, p < .001) and at 3-month follow-up (83.6% vs 77.5%; χ2(1) = 5.29, p = .027). In addition, missingness was significantly related to variables in the dataset (e.g., GAD-7 item and AUDIT-C item scores; ps < .046). Taken together, these results suggest missing data were more plausibly missing at random (MAR), and this aligns with the missing-at-random assumption of the linear mixed models approach for analyses.
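As an illustrative cross-check, the attrition χ2 values reported above can be reproduced from the arm sizes (n = 440 each) and the attrition percentages. The cell counts below are reconstructed from those reported proportions rather than taken from the raw trial data, so this is a sketch under that assumption:

```python
# Illustrative check of the reported attrition chi-square tests.
# Counts are reconstructed from n = 440 per arm and the reported
# attrition percentages (an assumption, not the raw trial data).

def pearson_chi2(table):
    """Pearson chi-square statistic (no continuity correction) for a 2x2 table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    return stat

# 1-month follow-up: 72.3% vs 58.9% attrition -> ~318 vs ~259 of 440 lost
one_month = [[318, 122], [259, 181]]
# 3-month follow-up: 83.6% vs 77.5% attrition -> ~368 vs ~341 of 440 lost
three_month = [[368, 72], [341, 99]]

print(round(pearson_chi2(one_month), 2))    # 17.52, matching the reported value
print(round(pearson_chi2(three_month), 2))  # 5.29, matching the reported value
```

Note the reported statistics are recovered only without Yates' continuity correction, which is consistent with standard uncorrected Pearson chi-square tests on 2x2 tables of this size.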

- Including descriptive comparisons of baseline characteristics across engagement groups (non-engagers, minimal engagers, and engagers) would enhance the interpretability of the findings and provide important context regarding potential differences in initial symptom severity or demographics.

We have now included this descriptive data as an appendix table (S7).

- The discussion section is well written, appropriately acknowledging the null results and relevant contextual factors affecting the target population. In fact, the authors noted that the active control condition (mood and behavior tracking) is an important feature of this trial and likely contributed to improvements in both arms. The implications of using an active comparator deserve greater discussion.

We have attempted to further discuss this within the manuscript, whilst being sensitive to word count restrictions. We also considered the overall effects across the two groups in determining the impact of the active control.

Indeed, there is evidence to suggest mood monitoring is associated with symptom improvement, particularly in the case of depression [1, 2], with active controls of this nature associated with significantly smaller effect sizes compared to other forms of control [3]. This tool may have been further enhanced by the inclusion of other forms of monitoring (e.g., activity, sleep), which may have allowed users to recognise patterns in these factors and improve self-regulation. This adds to the ongoing debate around the accuracy of evidence for psychological intervention in the context of appropriate control conditions, and in fact what such conditions are [4]. Further comparisons of monitoring and specific self-guided program elements are required to better understand the utility of this form of control condition and what therapeutic impacts it may have.

- Exploratory findings, particularly the benefits for PTSD symptoms and help-seeking participants, should be presented more cautiously, given the small subsample and missing data.

We have now specifically noted the methodological limitations of these findings.

However, the small sample size and missing data mean these findings should be interpreted with caution.

- Finally, the conclusion that unguided digital interventions may not suit this population is reasonable but could be complemented by more constructive recommendations for future development (hybrid or guided models? adaptive tailoring based on symptom severity?).

We have now added further context to this conclusion:

These alternatives might include blended care or more guided modes of delivery (e.g., [5]). The use of adaptive and individually tailored approaches may also improve engagement.

We thank the reviewers for their insightful and perspicacious review and suggestions.

Regards,

Mark Deady and authorship team

1. van der Watt ASJ, Odendaal W, Louw K, Seedat S. Distant mood monitoring for depressive and bipolar disorders: a systematic review. BMC Psychiatry. 2020;20(1):383. doi: 10.1186/s12888-020-02782-y.

2. Dubad M, Winsper C, Meyer C, Livanou M, Marwaha S. A systematic review of the psychometric properties, usability and clinical impacts of mobile mood-monitoring applications in young people. Psychological Medicine. 2018;48(2):208-28. Epub 2017/06/23. doi: 10.1017/S0033291717001659.

3. Linardon J, Cuijpers P, Carlbring P, Messer M, Fuller‐Tyszkiewicz M. The efficacy of app‐supported smartphone interventions for mental health problems: A meta‐analysis of randomized controlled trials. World Psychiatry. 2019;18(3):325-36.

4. Cristea IA. The waiting list is an inadequate benchmark for estimating the effectiveness of psychotherapy for depression. Epidemiol Psychiatr Sci. 2019;28(3):278-9. Epub 2018/11/28. doi: 10.1017/s2045796018000665. PubMed PMID: 30479243; PubMed Central PMCID: PMC6998910.

5. Deady M, Collins DAJ, Azevedo S, Stech E, Harrison A, Broomfield C, et al. Integration of a smartphone app with posttraumatic stress disorder treatment for frontline workers: a pilot study. Australian Journal of Psychology. 2024;76(1):2399112. doi: 10.1080/00049530.2024.2399112.

Decision Letter 2

Fatma Refaat Ahmed

22 Jan 2026

Efficacy of a smartphone app to improve mental health among emergency service workers: A randomised controlled trial

PONE-D-25-23028R2

Dear Dr. Deady,

We’re pleased to inform you that your manuscript has been judged scientifically suitable for publication and will be formally accepted for publication once it meets all outstanding technical requirements.

Within one week, you’ll receive an e-mail detailing the required amendments. When these have been addressed, you’ll receive a formal acceptance letter and your manuscript will be scheduled for publication.

An invoice will be generated when your article is formally accepted. Please note, if your institution has a publishing partnership with PLOS and your article meets the relevant criteria, all or part of your publication costs will be covered. Please make sure your user information is up-to-date by logging into Editorial Manager® and clicking the ‘Update My Information’ link at the top of the page. For questions related to billing, please contact billing support.

If your institution or institutions have a press office, please notify them about your upcoming paper to help maximize its impact. If they’ll be preparing press materials, please inform our press team as soon as possible -- no later than 48 hours after receiving the formal acceptance. Your manuscript will remain under strict press embargo until 2 pm Eastern Time on the date of publication. For more information, please contact onepress@plos.org.

Kind regards,

Fatma Refaat Ahmed, Ph.D.

Academic Editor

PLOS One

Additional Editor Comments (optional):

Reviewers' comments:

Reviewer's Responses to Questions

Comments to the Author

Reviewer #6: All comments have been addressed

Reviewer #7: All comments have been addressed

**********

2. Is the manuscript technically sound, and do the data support the conclusions?

Reviewer #6: Yes

Reviewer #7: Yes

**********

3. Has the statistical analysis been performed appropriately and rigorously?

Reviewer #6: Yes

Reviewer #7: Yes

**********

4. Have the authors made all data underlying the findings in their manuscript fully available?


Reviewer #6: Yes

Reviewer #7: Yes

**********

5. Is the manuscript presented in an intelligible fashion and written in standard English?

Reviewer #6: Yes

Reviewer #7: Yes

**********

Reviewer #6: The authors have addressed all the comments and concerns raised in the prior review round. Clarifications regarding the power analysis, the rationale for baseline adjustment, and the handling of missing data have been included. They have also added descriptive comparisons of engagement groups as requested, which improves the contextualization of the results. The manuscript is clearly written, technically sound, and the conclusions are appropriately matched to the data. Moreover, they have made the study dataset available.

I have no further concerns, and I consider the manuscript suitable for publication.

Reviewer #7: I have been invited to conduct a review on this paper, which has already undergone two previous rounds of review with other reviewers. Therefore, I am focusing my comments and ratings on whether the authors have addressed previous comments or concerns from the previous reviewers. In my opinion, the authors have adequately addressed the previous comments, with a minimal data set now being made publicly available. I recommend acceptance.

**********

PLOS authors have the option to publish the peer review history of their article (what does this mean?). If published, this will include your full peer review and any attached files.

If you choose “no”, your identity will remain anonymous but your review may still be made public.

Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy

Reviewer #6: No

Reviewer #7: No

**********

Acceptance letter

Fatma Refaat Ahmed

PONE-D-25-23028R2

PLOS One

Dear Dr. Deady,

I'm pleased to inform you that your manuscript has been deemed suitable for publication in PLOS One. Congratulations! Your manuscript is now being handed over to our production team.

At this stage, our production department will prepare your paper for publication. This includes ensuring the following:

* All references, tables, and figures are properly cited

* All relevant supporting information is included in the manuscript submission

* There are no issues that prevent the paper from being properly typeset

You will receive further instructions from the production team, including instructions on how to review your proof when it is ready. Please keep in mind that we are working through a large volume of accepted articles, so please give us a few days to review your paper and let you know the next and final steps.

Lastly, if your institution or institutions have a press office, please let them know about your upcoming paper now to help maximize its impact. If they'll be preparing press materials, please inform our press team within the next 48 hours. Your manuscript will remain under strict press embargo until 2 pm Eastern Time on the date of publication. For more information, please contact onepress@plos.org.

You will receive an invoice from PLOS for your publication fee after your manuscript has reached the completed accept phase. If you receive an email requesting payment before acceptance or for any other service, this may be a phishing scheme. Learn how to identify phishing emails and protect your accounts at https://explore.plos.org/phishing.

If we can help with anything else, please email us at customercare@plos.org.

Thank you for submitting your work to PLOS ONE and supporting open access.

Kind regards,

PLOS ONE Editorial Office Staff

on behalf of

Dr. Fatma Refaat Ahmed

Academic Editor

PLOS One

Associated Data

    This section collects any data citations, data availability statements, or supplementary materials included in this article.

    Supplementary Materials

    S1 File. Intervention content.

    (DOCX)

    pone.0342419.s001.docx (233KB, docx)
    S2 File. Feedback items.

    (DOCX)

    pone.0342419.s002.docx (27.6KB, docx)
    S3 File. Moderation for medication use and help-seeking.

    (DOCX)

    pone.0342419.s003.docx (24.4KB, docx)
    S4 File. App user ratings.

    (DOCX)

    pone.0342419.s004.docx (46.9KB, docx)
    S5 File. CONSORT checklist.

    (DOCX)

    pone.0342419.s005.docx (43.8KB, docx)
    S6 File. Participant characteristics (Engager groups).

    (DOCX)

    pone.0342419.s006.docx (29.2KB, docx)

    Data Availability Statement

    A minimal, deidentified dataset is available via UNSWorks public repository (https://doi.org/10.26190/unsworks/31622).

