PLOS One. 2020 Mar 9;15(3):e0230276. doi: 10.1371/journal.pone.0230276

suddengains: An R package to identify sudden gains in longitudinal data

Milan Wiedemann 1,2,*, Graham R Thew 1,2,3, Richard Stott 4, Anke Ehlers 1,2,4
Editor: Timo Gnambs 5
PMCID: PMC7062272  PMID: 32150589

Abstract

Sudden gains are large and stable improvements in an outcome variable between consecutive measurements, for example during a psychological intervention with multiple assessments. Researching these occurrences could help understand individual change processes in longitudinal data. Three criteria are generally used to identify sudden gains in psychological interventions. However, applying these criteria can be time consuming and prone to errors if not fully automated. Adaptations to these criteria and methodological decisions such as how multiple gains are handled vary across studies and are reported with different levels of detail. These problems limit the comparability of individual studies and make it hard to understand or replicate the exact methods used. The R package suddengains provides a set of tools to facilitate sudden gains research. This article illustrates how to use the package to identify sudden gains or sudden losses and how to extract descriptive statistics as well as exportable data files for further analysis. It also outlines how these analyses can be customised to apply adaptations of the standard criteria. The suddengains package therefore offers significant scope to improve the efficiency, reporting, and reproducibility of sudden gains research.

Introduction

A sudden gain is a large improvement in an outcome variable experienced by an individual participant between two consecutive measurement points that is stable within a longitudinal data series. Sudden gains were first defined and investigated by Tang and DeRubeis [1], who examined session to session changes in depression symptoms among participants undertaking cognitive behavioural therapy. The majority of sudden gains studies to date have been in relation to psychological therapies [2], but the analytic approach could also be considered when investigating within-participant changes in other fields. A meta-analysis of 16 studies of psychological therapies (total N = 1104) found that experiencing a sudden gain was associated with better overall clinical outcomes at the end of treatment and at follow-up compared to those who did not experience gains [2]. Given this potential significance of sudden gains, examining such events specifically may be informative in understanding when and why such large improvements occur, which could help to increase the efficacy and efficiency of the intervention.

Rates of sudden gains within published clinical studies vary considerably (e.g. 17.8% to 52.2% of participants [2]), which may partly be due to differences in the methods used to identify them. However, such differences are hard to examine given that sufficient methodological details to permit a comparison are not always reported. In addition, some studies have raised concerns about the validity of sudden gains identified through current methods, demonstrating that they can be found in placebo interventions and simulated datasets [3, 4]. This suggests that not all gains reflect meaningful change or show a causal association with the intervention being studied. It also highlights the need to examine the presence and strength of these associations and to consider whether the current methods of identification can be refined. The suddengains R package is the first software program to offer explicit and reproducible methods to automatically identify sudden gains, which may be valuable in improving methodological reporting and consistency across studies. It may also facilitate closer examination of the methods used to identify sudden gains, to help improve their validity and ensure that they more accurately reflect meaningful events. This article aims to provide an accessible overview of how sudden gains are calculated, describe the principal functions of the package, and give instructions on how to use these with longitudinal data. It is hoped that using this package will facilitate improvements in the efficiency, reporting, and reproducibility of sudden gains research.

Identification of sudden gains

Tang and DeRubeis [1, 5] suggested the following three criteria to identify sudden gains:

  1. The gain must be large in absolute terms. While this was originally operationalised as a decrease of at least 7 points on the Beck Depression Inventory (BDI [6]), subsequent studies have generally used the Reliable Change Index (RCI [7]) to define an appropriate cutoff for other scales [8]. Further details are discussed below.

  2. The gain must be large in relative terms. This is defined as a drop of at least 25% of the previous score.

  3. The gain must be large relative to symptom fluctuation. Originally an independent t test was proposed to compare the size of the sudden gain with symptom fluctuation before and after the gain. This method was controversial given the assumption of independence of the measurements before and after the gain is not met [3, 9]. Consequently the wording of this criterion was updated by Tang and colleagues [5, 9], though the calculations remained the same: The difference between the mean scores of the three measurements before the gain (Mpre), and the three measurements after the gain (Mpost), must be greater than the pooled standard deviation of these two groups multiplied by a critical value of 2.776 (i.e. the two-tailed t statistic for α = 0.05 and df = 4). The formula for criterion 3 is therefore:
    $$M_{pre} - M_{post} > \text{critical value} \times \sqrt{\frac{(n_{pre} - 1) \times SD_{pre}^{2} + (n_{post} - 1) \times SD_{post}^{2}}{n_{pre} + n_{post} - 2}} \tag{1}$$
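The three criteria can be sketched for a single interval in base R. This is an illustrative sketch only, not the code used by the suddengains package: the function name check_three_criteria() and the example scores are hypothetical, complete data for all six sessions are assumed, and the fixed critical value of 2.776 is used.

```r
# Sketch of the three criteria for one interval (base R only).
# `pre` holds the scores at N-2, N-1, N; `post` holds N+1, N+2, N+3.
# Assumes no missing values and lower scores meaning improvement.
check_three_criteria <- function(pre, post, cutoff = 7, pct = 0.25,
                                 critical_value = 2.776) {
  stopifnot(length(pre) == 3, length(post) == 3)
  pregain  <- pre[3]   # score at session N
  postgain <- post[1]  # score at session N+1
  gain <- pregain - postgain

  crit1 <- gain >= cutoff         # criterion 1: large in absolute terms
  crit2 <- gain >= pct * pregain  # criterion 2: at least 25% of pregain score

  # Criterion 3: mean difference compared with the pooled SD (Eq 1)
  n_pre <- length(pre)
  n_post <- length(post)
  sd_pooled <- sqrt(((n_pre - 1) * var(pre) + (n_post - 1) * var(post)) /
                      (n_pre + n_post - 2))
  crit3 <- (mean(pre) - mean(post)) > critical_value * sd_pooled

  c(crit1 = crit1, crit2 = crit2, crit3 = crit3,
    sudden_gain = crit1 && crit2 && crit3)
}

# Hypothetical scores, not taken from any dataset
result <- check_three_criteria(pre = c(32, 31, 33), post = c(20, 21, 19))
print(result)
```

Handling of missing values and adjusted critical values, both discussed below, are deliberately left out of this sketch.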

The criteria used to identify sudden gains vary between studies. For example, some studies have used different methods to define a cutoff value for criterion 1 [10, 11]. Criterion 2 was not included in some studies because of concerns about the impact of different response scales and data suggesting it has little effect on the number of gains found [12]. Studies have also used different methods to select a critical value for use in criterion 3 [11, 13] (see Eq 1).

Defining a cutoff for the first criterion

Tang and DeRubeis [1] originally defined a 7 point cutoff on the BDI for the first criterion based on frequency distribution plots of session to session change scores on the BDI in clinical trials. The authors reported that 7 BDI points approximately reflected one standard deviation in clinical samples [9]. Stiles et al. [8] noted that 7 BDI points was close to the reliable change value reported in Barkham et al. [14] and therefore used the RCI formula to define a cutoff for a new measure. Subsequent studies have generally adopted this approach. Jacobson and Truax [7] proposed the following formula to test whether the observed pre to post change on a measure reflects more than just fluctuation due to measurement error:

$$\frac{pre - post}{S_{diff}} = RCI \tag{2}$$

Following Jacobson and Truax [7], reliable change on a measure is present when:

$$\frac{pre - post}{S_{diff}} > 1.96; \text{ therefore} \tag{3}$$
$$\text{reliable change} > 1.96 \times S_{diff}; \tag{4}$$

where Sdiff is the standard error of the difference between pre and post scores. Using the standard error of measurement (SE), Sdiff can be expressed as:

$$S_{diff} = \sqrt{2 \times (SE)^{2}}; \tag{5}$$

where SE is calculated using the standard deviation of the control group or normal population s1 and the test-retest reliability of the measure (rxx):

$$SE = s_{1}\sqrt{1 - r_{xx}}; \tag{6}$$

Some studies have adapted this formula following suggestions from Martinovich, Saunders and Howard [15] by replacing the test-retest reliability with the internal consistency (α) and replacing the standard deviation of the normal population (s1) with the standard deviation of the clinical sample at baseline (SDpre) so that all statistics can be extracted from the sample data [16]. Note that the use of the test-retest reliability or internal consistency when calculating SE makes the assumption that the scale being examined is unidimensional, and that these reliability estimates remain constant over time and between individuals. Exploring the factor structure and measurement invariance of the scale may be appropriate to examine if these assumptions hold.

$$SE = SD_{pre}\sqrt{1 - \alpha} \tag{7}$$

In the sudden gains literature different approaches have been used to define a cutoff for the first criterion using the RCI formula. Some studies [10, 17] have used the standard error of the difference (Sdiff) while others [11, 13] have used the reliable change value (1.96 × Sdiff). When defining a cutoff it is important to consider the statistical assumptions involved, and to ensure that this value reflects a meaningful change (large in absolute terms) that is realistic in a session by session context for the intervention.
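The two candidate cutoffs can be computed directly from Eqs 4 to 7. The following base R sketch uses a hypothetical baseline standard deviation and reliability; both input values are assumptions chosen for illustration and do not come from any of the cited studies.

```r
# Hypothetical inputs (assumed for illustration only)
sd_pre      <- 7.5   # standard deviation of the sample at baseline
reliability <- 0.90  # internal consistency or test-retest reliability

se     <- sd_pre * sqrt(1 - reliability)  # standard error of measurement (Eq 6 or 7)
s_diff <- sqrt(2 * se^2)                  # standard error of the difference (Eq 5)
reliable_change <- 1.96 * s_diff          # reliable change value (Eq 4)

# The two cutoffs that have been used for criterion 1
cutoffs <- c(s_diff = s_diff, reliable_change = reliable_change)
print(round(cutoffs, 2))
```

With these inputs the two approaches give cutoffs that differ by a factor of 1.96, which shows how strongly the choice between them can affect the number of gains identified.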

Missing data

Missing data, for example where a participant does not provide data on one or more occasions, need to be considered carefully when identifying sudden gains for several reasons. Firstly, depending on the number and pattern of missing data points for an individual, it may not be possible to identify sudden gains, see Table 1. Specifically, in order to estimate the standard deviation values in criterion 3, at least two of the three measurements immediately prior to the gain must be present, as well as at least two of the three measurements immediately following the gain. Some researchers have suggested that methods used to replace missing values, such as last observation carried forward or multiple imputation, may not be appropriate when identifying sudden gains given the potential for additional gains to be detected based on data that were not provided by participants [18, 19].

Table 1. Data patterns required to identify sudden gains.

xn-2 xn-1 xn xn+1 xn+2 xn+3
Pattern 1
Pattern 2
Pattern 3
Pattern 4

Note. xn-2 to xn+3 represent any six consecutive measurement points within the data set. The minimum number of data points that must be present (•) in order to investigate the interval from xn to xn+1 as a potential sudden gain is four, arranged in one of the patterns shown. Note that the pregain (xn) and postgain (xn+1) data points must always be present. ∘ represents missing data.
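The data patterns that satisfy these requirements can be enumerated in base R. This sketch applies only the rules stated in the text (pregain and postgain points present, at least two of the three points before and at least two of the three points after the gain); the column names are hypothetical labels, and the order of the rows is not meant to match the pattern numbering in Table 1.

```r
# All present (TRUE) / missing (FALSE) combinations for six consecutive points
grid <- expand.grid(rep(list(c(FALSE, TRUE)), 6))
names(grid) <- c("n_minus2", "n_minus1", "n", "n_plus1", "n_plus2", "n_plus3")
present <- as.matrix(grid)

# An interval can be tested if x_n and x_n+1 are present and at least two
# of the three points on each side of the gain are available
testable <- present[, "n"] & present[, "n_plus1"] &
  (rowSums(present[, 1:3]) >= 2) & (rowSums(present[, 4:6]) >= 2)

# Patterns with the minimum of four available data points
minimal <- present[testable & rowSums(present) == 4, , drop = FALSE]
print(minimal)
```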

Secondly, where values are missing in the period around the potential sudden gain, two approaches have been described to evaluate the stability of the change. Following the updated version of the third criterion by Tang and colleagues [5, 9] some studies have used a critical value of 2.776 across all session to session intervals to check the stability [13]. An alternative approach adjusts the critical values used in criterion 3 (see Eq 1) based on the data that were available in the period around the potential sudden gain [11]: Where no data are missing t(4;97.5%) > 2.776; where one datapoint is missing either before or after the gain t(3;97.5%) > 3.182; and where one datapoint is missing both before and after the gain t(2;97.5%) > 4.303. This method has been adopted in some subsequent studies [20, 21].
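The adjusted critical values quoted above are the 97.5% quantiles of the t distribution and can be reproduced with qt() in base R. In this sketch the degrees of freedom are taken as the number of available scores before and after the gain minus two, which is an assumption consistent with the values in the text.

```r
# Number of available scores before and after the gain (2 or 3 each)
n_pre  <- c(3, 3, 2, 2)
n_post <- c(3, 2, 3, 2)

df <- n_pre + n_post - 2
critical_value <- qt(0.975, df = df)  # two-tailed, alpha = 0.05

print(data.frame(n_pre, n_post, df,
                 critical_value = round(critical_value, 3)))
```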

It is important to understand the reasons for missing data and consider whether methods to handle missing data need to be employed both at the identification stage and in subsequent analyses [22, 23]. Further research to examine the impact of missing data and different methods to handle missing data when identifying sudden gains would be beneficial.

Terminology

The naming of specific sessions (or measurement points) around the gain follows the convention that the session immediately prior to the gain is session N (also known as the pregain session), and the session immediately after is session N+1 (or postgain session). Other sessions are referred to in relation to session N (e.g. N-2, N+3).

Reversals

According to Tang and DeRubeis [1] a sudden gain is counted as reversed if 50% of the improvement made during the gain was lost at any subsequent point. For example, where the sudden gain represents a drop in score from 40 to 30 points, the gain is classed as having reversed if a score of 35 or more is observed at any later session. As discussed in Wucherpfennig et al. [20] a reversal might not necessarily be a stable phenomenon. These authors modified this criterion by suggesting that a stable reversal is present when a reversal is also classified as a sudden loss (see below).
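The original reversal rule can be sketched in a few lines of base R. The function name is_reversed() and the later scores are hypothetical; the pregain and postgain values follow the example in the text, where the reversal value is 35.

```r
# A gain counts as reversed if any later score is at or above the
# postgain score plus 50% of the gain
is_reversed <- function(pregain, postgain, later_scores) {
  reversal_value <- postgain + 0.5 * (pregain - postgain)
  any(later_scores >= reversal_value, na.rm = TRUE)
}

# Gain from 40 to 30, so the reversal value is 35
reversed_yes <- is_reversed(40, 30, later_scores = c(28, 31, 36))
reversed_no  <- is_reversed(40, 30, later_scores = c(28, 31, 34))
print(c(reversed_yes, reversed_no))
```

This sketch does not implement the stable reversal variant, which additionally requires the reversal to meet the sudden loss criteria.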

Sudden losses

Although less frequently studied than sudden gains, sudden losses represent the inverse phenomenon, where a participant shows a large and stable increase of scores on the outcome variable. While some authors invert the three sudden gains criteria [11, 24], others further adjust the percentage threshold of the second criterion, e.g. 33% [16].

Why is a package needed?

As indicated by the criteria above, identifying sudden gains requires the application of each of the three criteria to each session to session interval, and that this is performed for each individual in a given dataset. A large number of calculations and extensive manipulation of data is therefore involved, particularly in larger datasets. Doing these data manipulations manually (e.g. in spreadsheets) can be extremely time consuming and lead to errors. It also means that certain methodological decisions, such as determining the critical value for the third criterion, or handling of participants with multiple gains, may not be addressed sufficiently or in a consistent way across studies. It is hoped that the use of the suddengains package will provide faster and more accurate calculations, as well as offering a transparent and consistent method to address these methodological considerations.

Functions of the suddengains package

The suddengains package provides a set of functions to calculate the presence of sudden gains (and sudden losses) within a longitudinal dataset, and to provide basic plots and descriptive statistics of the gains. It can also extract scores on secondary outcome or process measures around the period of each gain. Output files (in SPSS, Excel, or CSV formats) arranged by individual gain, or by person can be generated for further analyses in other programs. This package is supplemented by an interactive web application, shinygains [25], which illustrates the main functions of this package at https://milanwiedemann.shinyapps.io/shinygains/. As it allows users to explore and understand the impact of different methodological choices, it may be useful in planning sudden gains studies. Table 2 lists and describes the main functions.

Table 2. Main functions of the suddengains R package.

Function                             Description

Identify sudden gains
define_crit1_cutoff()                Uses RCI formula to help determine a cutoff value for criterion 1
check_interval()                     Checks if a given interval is a sudden gain/loss
identify_sg(), identify_sl()         Identifies sudden gains/losses

Create datasets
create_bysg(), create_byperson()     Creates a dataset with one row for each sudden gain/loss, or one row for each person
extract_values()                     Extracts values on a secondary measure around the sudden gain/loss

Describe sudden gains
describe_sg()                        Generates summary descriptive statistics
plot_sg(), plot_sg_trajectories()    Creates plots of the average sudden gain, or individual case trajectories

Additional functions
select_cases()                       Selects cases to be included in the sudden gains analysis
write_bysg(), write_byperson()       Exports CSV, SPSS, Excel, or STATA files of the sudden gains datasets

Note. More details of each function can be found in the package documentation or using the help() function in R.

Worked example

This demonstration uses a dataset (sgdata) that was created to illustrate the functions of this package. The data show self-report weekly questionnaire scores for 43 participants who have received psychological therapy for depression. The intervention lasted for 12 sessions, and each participant completed a set of outcome measures at the beginning of each session, including the BDI and a fictional secondary measure assessing rumination (RQ).

Preparation of data

The data to be analysed for sudden gains are arranged in wide format, i.e. one row per participant, and one column for each questionnaire score at each measurement point. A unique identifier variable also needs to be included. Some researchers have specified a minimum number of measurement points that must be present for participants to be included, to ensure that they received a sufficient amount of the intervention being studied [1]. Alternatively it may be of interest to analyse all cases whose data are distributed such that at least one interval can be examined for a potential sudden gain [21]. For all three criteria to be applied there must be data present for at least two of the three data points prior to, and two of the three following, the interval to be examined (see Table 1). The optional select_cases() function can be used to identify samples of cases for analysis who fulfil such conditions, though researchers should consider whether these methods are appropriate for the aims of the study.
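To illustrate the wide format and the idea of a minimum number of measurement points, the following base R sketch builds a small hypothetical dataset and keeps participants with at least four available scores. It does not use select_cases(); the data frame toy, its values, and the threshold of four are assumptions made for illustration.

```r
# Hypothetical wide-format data: one row per participant,
# one column per session score, plus a unique identifier
toy <- data.frame(
  id     = 1:4,
  bdi_s1 = c(30, 28, NA, 35),
  bdi_s2 = c(29, NA, NA, 33),
  bdi_s3 = c(31, 27, 25, 20),
  bdi_s4 = c(20, NA, 24, 19),
  bdi_s5 = c(19, 26, NA, 21),
  bdi_s6 = c(21, NA, NA, 18)
)

session_vars <- paste0("bdi_s", 1:6)

# Count available (non-missing) scores per participant
n_available <- rowSums(!is.na(toy[session_vars]))

# Keep participants with at least `min_sessions` available scores
min_sessions <- 4  # hypothetical threshold
selected <- toy[n_available >= min_sessions, ]
print(selected$id)
```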

Identification of sudden gains

The identify_sg() function applies the sudden gains criteria as specified by the user to each session to session interval in the dataset. As shown below, the user specifies: data, the dataset to use in wide format; sg_crit1_cutoff, the cutoff value to use for criterion 1 (which can be entered manually or calculated using the define_crit1_cutoff() function); sg_crit2_pct, the percentage change value to use for criterion 2 (0.25 by default); sg_crit3, whether or not to apply the third criterion (TRUE by default); sg_crit3_alpha, the alpha value to use when calculating the criterion 3 critical value (0.05 by default); id_var_name, the name of the unique identifier variable within the dataset; and sg_var_list, a list of the variables representing the span of sessions to be analysed, which is sessions 1 to 12 in this example. By default all functions that identify sudden gains apply the adjustment of the critical value in Eq 1 as described by Lutz and colleagues [11]. To turn off this adjustment and instead apply a manually defined critical value across all session to session intervals, the argument sg_crit3_adjust = FALSE can be included and sg_crit3_critical_value specified. Additional options to customise this analysis are discussed in the package documentation. An alternative function, identify_sl(), is identical to identify_sg() but applies the criteria in the inverse direction to calculate sudden losses. The function check_interval() can be used to examine whether a specific session to session interval is a sudden gain/loss.

# First, install and load the suddengains R package
install.packages("suddengains")
library(suddengains)

# Identify sudden gains in the dataset "sgdata":
identify_sg(data = sgdata,
            sg_crit1_cutoff = 7,
            sg_crit2_pct = 0.25,
            sg_crit3 = TRUE,
            id_var_name = "id",
            sg_var_list = c("bdi_s1", "bdi_s2", "bdi_s3", "bdi_s4",
                            "bdi_s5", "bdi_s6", "bdi_s7", "bdi_s8",
                            "bdi_s9", "bdi_s10", "bdi_s11", "bdi_s12"),
            crit123_details = TRUE)

The output data frame shows each session to session interval, for example sg_2to3 representing the interval between sessions two and three. Variables indicate whether each of the three criteria was met and therefore whether a sudden gain was observed for each interval. Sudden gains are indicated by a value of 1 (see Table 3). Examining this interval in our example data, we see that only id = 10 meets all three criteria. For id = 2 none of the three criteria can be tested, for id = 18 only the third criterion cannot be tested, and for all other participants at least one criterion is not met.

Table 3. A sample of the output data frame created by the identify_sg() function.

id    sg_crit1_2to3    sg_crit2_2to3    sg_crit3_2to3    sg_2to3
1     FALSE            FALSE            FALSE            0
2     NA               NA               NA               NA
10    TRUE             TRUE             TRUE             1
12    TRUE             FALSE            FALSE            0
18    FALSE            FALSE            NA               NA
23    FALSE            FALSE            TRUE             0

Note. For the variables testing the three sudden gains criteria, referred to by ‘crit1’, ‘crit2’, and ‘crit3’ in the variable names, TRUE indicates that the criterion is met, while FALSE indicates that the criterion is not met. NA indicates that a particular criterion could not be tested for a sudden gain due to missing data.
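The way the three criteria combine into the final indicator in Table 3 can be reproduced with a short base R sketch. Multiplying the logical values is one way to obtain the pattern shown in the table, including the NA for id = 18; it is an assumption made for illustration and not necessarily how the package computes the indicator internally.

```r
# Criteria values for the interval 2 to 3, copied from Table 3
crit <- data.frame(
  id    = c(1, 2, 10, 12, 18, 23),
  crit1 = c(FALSE, NA, TRUE, TRUE, FALSE, FALSE),
  crit2 = c(FALSE, NA, TRUE, FALSE, FALSE, FALSE),
  crit3 = c(FALSE, NA, TRUE, FALSE, NA, TRUE)
)

# The product is 1 only if all three criteria are TRUE, and NA as soon
# as any criterion could not be tested
crit$sg <- crit$crit1 * crit$crit2 * crit$crit3
print(crit)
```

Note that a logical AND would behave differently here, because FALSE & NA evaluates to FALSE in R rather than NA.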

To permit further analysis of our data, we wish to obtain an output dataset containing both the original data and the newly identified sudden gains. As participants may experience more than one gain, as in the present example, and to allow for different subsequent analyses, the package provides two options for output datasets: The create_bysg() function creates a dataset structured with one row per sudden gain, and the create_byperson() function creates a dataset structured with one row per person, indicating whether or not they experienced a sudden gain. The tx_start_var_name and tx_end_var_name arguments are used to specify the start and end of treatment (tx) variables, and sg_measure_name specifies the name of the measure used to calculate sudden gains.

# Create output dataset with one row per sudden gain
# and save as an object called "bysg" to use later
bysg <- create_bysg(data = sgdata,
                    sg_crit1_cutoff = 7,
                    sg_crit2_pct = 0.25,
                    sg_crit3 = TRUE,
                    id_var_name = "id",
                    tx_start_var_name = "bdi_s1",
                    tx_end_var_name = "bdi_s12",
                    sg_var_list = c("bdi_s1", "bdi_s2", "bdi_s3",
                                    "bdi_s4", "bdi_s5", "bdi_s6",
                                    "bdi_s7", "bdi_s8", "bdi_s9",
                                    "bdi_s10", "bdi_s11", "bdi_s12"),
                    sg_measure_name = "bdi")

The new variables created by the create_bysg() and create_byperson() functions are described in Table 4. To continue working in another program (e.g. SPSS, STATA, Excel) the functions write_bysg() and write_byperson() can be used to export the datasets created in R [26] as .sav, .dta, .xlsx, or .csv files.

Table 4. Description of variables created by the create_bysg() and create_byperson() functions.

Variable Name           Variable Label
id_sg                   Unique ID variable for every identified sudden gain / loss
sg_crit123              Indicates whether all applied sudden gain criteria were met (No = 0; Yes = 1)
sg_session_n            Pregain session number
sg_freq_byperson        Frequency of sudden gains / losses per person
sg_bdi_2n               Pre-pre-pre gain session score (N-2)
sg_bdi_1n               Pre-pre gain session score (N-1)
sg_bdi_n                Pre-gain session score (N)
sg_bdi_n1               Post-gain session score (N+1)
sg_bdi_n2               Post-post gain session score (N+2)
sg_bdi_n3               Post-post-post gain session score (N+3)
sg_magnitude            Raw magnitude of sudden gain
sg_bdi_tx_change        Total change during treatment
sg_change_proportion    Proportion of total change represented by the sudden gain
sg_reversal_value       Reversal value
sg_reversal             Indicates whether the reversal value was met at any point in treatment following the sudden gain (No = 0; Yes = 1)

Note. The variable names listed including _bdi_ will reflect the name of the measure specified in the sg_measure_name argument.

Analysis of sudden gains

In this example, we have calculated sudden gains based on depression scores using the BDI. In analysing these gains, we are interested in how rumination scores on the fictional RQ measure change around the period of the sudden gains in depression. The extract_values() function extracts the RQ values from the three sessions before (N-2, N-1, N) and the three sessions after (N+1, N+2, N+3) each depression sudden gain. In the dataset that gets returned by this function we refer to these sessions as sg_bdi_2n, sg_bdi_1n, sg_bdi_n, sg_bdi_n1, sg_bdi_n2, and sg_bdi_n3, respectively. This function can be applied to either the bysg or byperson dataset. By default the extracted values will be added as new variables to the dataset used. Here we demonstrate applying this function to the bysg dataset, as shown in the code below. First, the RQ variables are added to the bysg dataset. Second, the extract_values() function is applied. Note that the list of RQ variables included in the extract_var_list argument must match those used for the sg_var_list argument used previously in the create_bysg() function. This means that the number of variables in these lists has to be identical and measured at the same timepoints. The output data frame can be saved as a new object, or the existing bysg object can be overwritten, as in this example. The RQ scores now in the bysg dataset can be examined, for example to look at the temporal relationship between changes in rumination and changes in depression symptoms.

# 1. Select the ID and variables from a second measure
sgdata_rq <- dplyr::select(sgdata,
                           "id",
                           "rq_s1", "rq_s2", "rq_s3",
                           "rq_s4", "rq_s5", "rq_s6",
                           "rq_s7", "rq_s8", "rq_s9",
                           "rq_s10", "rq_s11", "rq_s12")

# 2. Add variables in 'sgdata_rq' to the 'bysg' dataset created earlier
bysg <- dplyr::left_join(bysg, sgdata_rq, by = "id")

# 3. Extract values on the second measure around the sudden gain
bysg <- extract_values(data = bysg,
                       id_var_name = "id_sg",
                       extract_var_list = c("rq_s1", "rq_s2", "rq_s3",
                                            "rq_s4", "rq_s5", "rq_s6",
                                            "rq_s7", "rq_s8", "rq_s9",
                                            "rq_s10", "rq_s11", "rq_s12"),
                       extract_measure_name = "rq",
                       add_to_data = TRUE)
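The idea behind extracting values around a gain can be sketched in base R for a single participant. The helper extract_around() and the rumination scores are hypothetical and only illustrate the indexing from N-2 to N+3; this is not the implementation of extract_values().

```r
# Return the six scores from N-2 to N+3 around pregain session n,
# using NA where a session falls outside the available sessions
extract_around <- function(scores, n) {
  idx <- (n - 2):(n + 3)
  out <- rep(NA_real_, 6)
  valid <- idx >= 1 & idx <= length(scores)
  out[valid] <- scores[idx[valid]]
  names(out) <- c("2n", "1n", "n", "n1", "n2", "n3")
  out
}

# Hypothetical scores on a secondary measure for sessions 1 to 12
rq <- c(50, 48, 47, 45, 30, 29, 28, 27, 27, 26, 25, 25)

around_4  <- extract_around(rq, n = 4)   # gain between sessions 4 and 5
around_11 <- extract_around(rq, n = 11)  # gain between sessions 11 and 12
print(around_4)
print(around_11)
```

For a gain late in treatment, as in the second call, the sessions after the end of treatment do not exist and the corresponding values are missing.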

The describe_sg() function provides descriptive statistics about the sudden gains based on the variables from the bysg and byperson datasets. For the present example, this function indicates that 16 of the 43 participants experienced a sudden gain, and 9 experienced more than one gain, leading to a total of 26 sudden gains within the data. Information on the mean gain magnitude and reversals is also provided.
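The kind of summary reported by describe_sg() can be approximated from a by-gain dataset with base R. The following sketch uses a small hypothetical dataset, so its numbers are unrelated to the sgdata results reported above; the column names follow Table 4, and the list of statistics is an illustrative selection rather than the function's actual output.

```r
# Hypothetical by-gain dataset: one row per sudden gain
toy_bysg <- data.frame(
  id           = c(1, 1, 2, 3, 3, 3),
  sg_magnitude = c(10, 8, 12, 9, 11, 7),
  sg_reversal  = c(0, 1, 0, 0, 0, 1)
)

# Number of gains per participant
gains_per_person <- table(toy_bysg$id)

summary_stats <- list(
  total_gains       = nrow(toy_bysg),
  persons_with_gain = length(gains_per_person),
  persons_multiple  = sum(gains_per_person > 1),
  mean_magnitude    = mean(toy_bysg$sg_magnitude),
  reversal_pct      = 100 * mean(toy_bysg$sg_reversal)
)
str(summary_stats)
```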

The plot_sg() function plots the average sudden gain, and can be used to show the primary or secondary outcome measure data (Fig 1A and 1B). The sg_pre_post_var_list argument specifies the pregain and postgain variables to be plotted, namely sessions N-2 to N+3. This function is built using the R Package ggplot2 [27] and additional ggplot2 functions can be added to the plot. It is also possible to plot the average gain magnitude of different groups (e.g. two treatment arms in a trial) in one figure by using the optional group argument (see Fig 1C).

Fig 1. Plots of average changes around sudden gains.

(A) Average gain magnitude on the BDI across all sudden gains. (B) Average changes in rumination (RQ) around sudden gains on the BDI. (C) Average gain magnitude on the BDI for two different treatments.

# Create average sudden gain plot for BDI data (see Fig 1A):
plot_sg(data = bysg,
        id_var_name = "id",
        tx_start_var_name = "bdi_s1",
        tx_end_var_name = "bdi_s12",
        sg_pre_post_var_list = c("sg_bdi_2n", "sg_bdi_1n", "sg_bdi_n",
                                 "sg_bdi_n1", "sg_bdi_n2", "sg_bdi_n3"),
        ylab = "BDI")

An additional function, plot_sg_trajectories(), is available to plot the trajectories of a selection of individual cases within the dataset (see Fig 2A). This function can be paired with a filter command, for example filter() from the R Package dplyr [28], to visualise trajectories of specific groups of participants, such as all participants with more than one sudden gain, or all participants with a sudden gain between sessions 3 and 4 (see Fig 2B).

Fig 2. Plots of trajectories for selected cases.

(A) Trajectories for a selection of individual cases. (B) Trajectories of BDI scores for all participants with a sudden gain between sessions 3 and 4.

Discussion

The analysis of sudden gains and losses provides a detailed examination of within-participant changes during the course of an intervention, and may help to understand individual processes of change. The suddengains package aims to facilitate the computation of gains, which can be laborious and error-prone. It also aims to address common methodological issues, for example by allowing adjustments to the critical value for the third criterion in the presence of missing data, and by highlighting participants with multiple gains.

The package has several limitations. First, more substantial adaptations to the standard criteria cannot currently be implemented, though as the underlying code is publicly available, researchers may wish to use this in combination with other tools for further development work. Second, while the package may significantly increase the speed and accuracy of calculations, it cannot and should not substitute considered methodological thinking. In particular, users should consider carefully the appropriateness of the methods selected within each function, including related assumptions and limitations. Lastly, it should be emphasised that sudden gains and losses identified by applying a set of mathematical criteria are not necessarily related to the effects of the intervention being studied, and that further investigation would be required to establish the presence and strength of evidence for a causal relationship.

Overall, it is hoped that this package will permit faster and more transparent examination of sudden gains within a range of longitudinal datasets, and that it could provide a valuable tool to explore how the criteria might be refined or adapted to better identify gains that reflect meaningful change processes.

Acknowledgments

Earlier versions of this manuscript were written using the R package papaja [29].

Data Availability

The package can be downloaded from CRAN https://CRAN.R-project.org/package=suddengains. All code, materials, and data can be found at https://github.com/milanwiedemann/suddengains. Instructions for installing the package, further technical details, and examples can be found at https://milanwiedemann.github.io/suddengains. The R code examples in this paper refer to package version 0.4.0.

Funding Statement

This project was supported by a Mental Health Research UK studentship (MW), the Wellcome Trust [102176 (GRT); 069777 and 200976 (AE, RS)], the Oxford Health NIHR Biomedical Research Centre (MW, GRT, AE), and the NIHR Oxford Biomedical Research Centre (GRT). The views expressed are those of the authors and not necessarily those of the NHS, the NIHR or the Department of Health. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

References

  • 1. Tang TZ, DeRubeis RJ. Sudden gains and critical sessions in cognitive-behavioral therapy for depression. Journal of Consulting and Clinical Psychology. 1999;67: 894–904. doi: 10.1037//0022-006x.67.6.894
  • 2. Aderka IM, Nickerson A, Boe HJ, Hofmann SG. Sudden gains during psychological treatments of anxiety and depression: A meta-analysis. Journal of Consulting and Clinical Psychology. 2012;80: 93–101. doi: 10.1037/a0026455
  • 3. Vittengl JR, Clark LA, Jarrett RB. Validity of sudden gains in acute phase treatment of depression. Journal of Consulting and Clinical Psychology. 2005;73: 173–182. doi: 10.1037/0022-006X.73.1.173
  • 4. Vittengl JR, Clark LA, Thase ME, Jarrett RB. Detecting sudden gains during treatment of major depressive disorder: Cautions from a monte carlo analysis. Current Psychiatry Reviews. 2015;11: 19–31. doi: 10.2174/1573400510666140929195441
  • 5. Tang TZ, DeRubeis RJ, Beberman R, Pham T. Cognitive changes, critical sessions, and sudden gains in cognitive-behavioral therapy for depression. Journal of Consulting and Clinical Psychology. 2005;73: 168–172. doi: 10.1037/0022-006X.73.1.168
  • 6. Beck AT, Steer RA. Beck Depression Inventory Manual. San Antonio, TX: The Psychological Corporation; 1993.
  • 7. Jacobson NS, Truax PA. Clinical significance: A statistical approach to defining meaningful change in psychotherapy research. Journal of Consulting and Clinical Psychology. 1991;59: 12–19. doi: 10.1037//0022-006x.59.1.12
  • 8. Stiles WB, Leach C, Barkham M, Lucock M, Iveson S, Shapiro DA, et al. Early sudden gains in psychotherapy under routine clinic conditions: Practice-based evidence. Journal of Consulting and Clinical Psychology. 2003;71: 14–21. doi: 10.1037/0022-006X.71.1.14
  • 9. Tang TZ. Sudden gains. In: Cautin RL, Lilienfeld SO, editors. The Encyclopedia of Clinical Psychology. Hoboken, New Jersey: Wiley-Blackwell; 2015. pp. 2745–2751.
  • 10. Doane LS, Feeny NC, Zoellner LA. A preliminary investigation of sudden gains in exposure therapy for PTSD. Behaviour Research and Therapy. 2010;48: 555–560. doi: 10.1016/j.brat.2010.02.002
  • 11. Lutz W, Ehrlich T, Rubel J, Hallwachs N, Rottger M-A, Jorasz C, et al. The ups and downs of psychotherapy: Sudden gains and sudden losses identified with session reports. Psychotherapy Research. 2013;23: 14–24. doi: 10.1080/10503307.2012.693837
  • 12. Hardy GE, Cahill J, Stiles WB, Ispan C, Macaskill N, Barkham M. Sudden Gains in Cognitive Therapy for Depression: A Replication and Extension. Journal of Consulting and Clinical Psychology. 2005;73: 59–67. doi: 10.1037/0022-006X.73.1.59
  • 13. Zilcha-Mano S, Eubanks CF, Muran JC. Sudden gains in the alliance in cognitive behavioral therapy versus brief relational therapy. Journal of Consulting and Clinical Psychology. 2019;87: 501–509. doi: 10.1037/ccp0000397
  • 14. Barkham M, Rees A, Stiles WB, Shapiro DA, Hardy GE, Reynolds S. Dose-effect relations in time-limited psychotherapy for depression. Journal of Consulting and Clinical Psychology. 1996;64: 927–935. doi: 10.1037//0022-006x.64.5.927
  • 15. Martinovich Z, Saunders S, Howard K. Some Comments on “Assessing Clinical Significance”. Psychotherapy Research. 1996;6: 124–132. doi: 10.1080/10503309612331331648
  • 16. König J, Karl R, Rosner R, Butollo W. Sudden gains in two psychotherapies for posttraumatic stress disorder. Behaviour Research and Therapy. 2014;60: 15–22. doi: 10.1016/j.brat.2014.06.005
  • 17. Jun JJ, Zoellner LA, Feeny NC. Sudden gains in prolonged exposure and sertraline for chronic PTSD. Depression and Anxiety. 2013;30: 607–613. doi: 10.1002/da.22119
  • 18. Tang TZ, DeRubeis RJ, Hollon SD, Amsterdam J, Shelton R. Sudden gains in cognitive therapy of depression and depression relapse/recurrence. Journal of Consulting and Clinical Psychology. 2007;75: 404–408. doi: 10.1037/0022-006X.75.3.404
  • 19. Shalom JG, Gilboa-Schechtman E, Atzil-Slonim D, Bar-Kalifa E, Hasson-Ohayon I, van Oppen P, et al. Intraindividual variability in symptoms consistently predicts sudden gains: An examination of three independent datasets. Journal of Consulting and Clinical Psychology. 2018;86: 892–902. doi: 10.1037/ccp0000344
  • 20. Wucherpfennig F, Rubel JA, Hofmann SG, Lutz W. Processes of change after a sudden gain and relation to treatment outcome—Evidence for an upward spiral. Journal of Consulting and Clinical Psychology. 2017;85: 1199–1210. doi: 10.1037/ccp0000263
  • 21. Wiedemann M, Stott R, Nickless A, Beierl ET, Wild J, Warnock-Parkes E, et al. Cognitive processes associated with sudden gains in cognitive therapy for posttraumatic stress disorder in routine care. Journal of Consulting and Clinical Psychology. 2020.
  • 22. Rubin DB. Inference and Missing Data. Biometrika. 1976;63: 581–592. doi: 10.1093/biomet/63.3.581
  • 23. Schafer JL, Graham JW. Missing data: Our view of the state of the art. Psychological Methods. 2002;7: 147–177. doi: 10.1037/1082-989X.7.2.147
  • 24. Krüger A, Ehring T, Priebe K, Dyer AS, Steil R, Bohus M. Sudden losses and sudden gains during a DBT-PTSD treatment for posttraumatic stress disorder following childhood sexual abuse. European Journal of Psychotraumatology. 2014;5: 24470. doi: 10.3402/ejpt.v5.24470
  • 25. Chang W, Cheng J, Allaire JJ, Xie Y, McPherson J. shiny: Web Application Framework for R [Internet]. 2019. Available: https://CRAN.R-project.org/package=shiny
  • 26. R Core Team. R: A language and environment for statistical computing [Internet]. Vienna, Austria: R Foundation for Statistical Computing; 2018. Available: https://www.R-project.org/
  • 27. Wickham H. ggplot2: Elegant graphics for data analysis [Internet]. Springer-Verlag New York; 2016. Available: http://ggplot2.org
  • 28. Wickham H, François R, Henry L, Müller K. dplyr: A grammar of data manipulation [Internet]. 2018. Available: https://CRAN.R-project.org/package=dplyr
  • 29. Aust F, Barth M. papaja: Create APA manuscripts with R Markdown [Internet]. 2018. Available: https://github.com/crsh/papaja

Decision Letter 0

Timo Gnambs

19 Dec 2019

PONE-D-19-30704

suddengains: An R package to identify sudden gains in longitudinal data

PLOS ONE

Dear Mr Wiedemann,

Thank you for submitting your manuscript to PLOS ONE. After careful consideration, we feel that it has merit but does not fully meet PLOS ONE’s publication criteria as it currently stands. Therefore, we invite you to submit a revised version of the manuscript that addresses the points raised during the review process.

I have now received two reports by experts in the field and also read the manuscript myself. As you can see both reviewers evaluated your manuscript rather favorably and emphasized the important contribution of your work. At the same time, they also raised important shortcomings that need to be addressed before the manuscript can be accepted for publication. I will summarize the most important points and also include some points of my own:

  • You define sudden gains as “large and stable changes … between two consecutive measurement points” (page 2, line 2). From this introduction, it was unclear to me how you might infer conclusions about the stability of change if you have only two measurements. In my opinion, it would require at least three measurements to make conclusions about the extent the observed change is stable or not. On page 3, you modified this claim by evaluating six measurement points in criterion 3. I think the definition of sudden gains needs to be more precise and should be modified accordingly.

  • Equation (1) uses the standard deviation of the difference score. This represents the formula for two independent groups. Although it is also sometimes used for dependent samples, Cohen (1988) suggested including the correlation between the two measures. In situations, when the standard deviation is expected to change (such as in intervention studies) Glass and colleagues (1981) even recommended using the standard deviation of the pre-intervention measurement (you even refer to this approach on page 4, line 63). Thus, there is some discussion on the appropriate way of estimating the variance of difference scores. Please include a more thorough discussion on your choice of the standard deviation and in what way it seems more appropriate than the other suggestions.

Cohen, J. (1988). Statistical Power Analysis for the Behavioral Sciences. New York, NY: Routledge Academic.

Glass, G. V., McGaw, B., and Smith, M. L. (1981). Meta-Analysis in Social Research. Beverly Hills, CA: Sage.

  • Reviewer 1 raises an important point regarding the assumptions in your approach (e.g., unidimensionality, constant reliabilities). I would recommend being more explicit about the inherent assumptions (and limitations) of your statistical model.

  • Your way of handling missing data seems suspect to me (page 5). I doubt you can appropriately account for missingness simply by adapting your significance level (Reviewer 1 also raises this point). Please make sure to thoroughly describe the assumptions underlying this approach and also refer to Rubin’s typology of missing mechanisms (MCAR, MAR, MNAR).

  • What are the criteria for excluding respondents because “it would not be possible to identify any sudden gains” (page 8, line 138)? Looking at Table 2, it seems to me that you considered one measurement point before and at least two measurement points after the pregain session as a necessary requirement. Could you elaborate why? Moreover, criterion 3 on page 3 seems to indicate that at least three measurements before and after the pregain session are needed?

  • Reviewer 2 has two suggestions for improving the package and increasing its usefulness for applied researchers. You might want to consider these points. However, their implementation is not a requirement for the publication of your manuscript.

  • Your worked example is only partially presented. I would recommend also including the output generated by the functions and providing a step by step interpretation of the printout.

In addition to these points, both reviewers made further excellent suggestions that you should consider in your revision. I strongly encourage you to address these issues and submit a revised version of your manuscript.

We would appreciate receiving your revised manuscript by Feb 02 2020 11:59PM. When you are ready to submit your revision, log on to https://www.editorialmanager.com/pone/ and select the 'Submissions Needing Revision' folder to locate your manuscript file.

If you would like to make changes to your financial disclosure, please include your updated statement in your cover letter.

To enhance the reproducibility of your results, we recommend that if applicable you deposit your laboratory protocols in protocols.io, where a protocol can be assigned its own identifier (DOI) such that it can be cited independently in the future. For instructions see: http://journals.plos.org/plosone/s/submission-guidelines#loc-laboratory-protocols

Please include the following items when submitting your revised manuscript:

  • A rebuttal letter that responds to each point raised by the academic editor and reviewer(s). This letter should be uploaded as separate file and labeled 'Response to Reviewers'.

  • A marked-up copy of your manuscript that highlights changes made to the original version. This file should be uploaded as separate file and labeled 'Revised Manuscript with Track Changes'.

  • An unmarked version of your revised paper without tracked changes. This file should be uploaded as separate file and labeled 'Manuscript'.

Please note while forming your response, if your article is accepted, you may have the opportunity to make the peer review history publicly available. The record will include editor decision letters (with reviews) and your responses to reviewer comments. If eligible, we will contact you to opt in or out.

We look forward to receiving your revised manuscript.

Kind regards,

Timo Gnambs

Academic Editor

PLOS ONE

Journal Requirements:

When submitting your revision, we need you to address these additional requirements:

1. Please ensure that your manuscript meets PLOS ONE's style requirements, including those for file naming. The PLOS ONE style templates can be found at http://www.plosone.org/attachments/PLOSOne_formatting_sample_main_body.pdf and http://www.plosone.org/attachments/PLOSOne_formatting_sample_title_authors_affiliations.pdf

2. Please note that PLOS ONE has specific guidelines on software sharing (http://journals.plos.org/plosone/s/materials-and-software-sharing#loc-sharing-software) for manuscripts whose main purpose is the description of a new software or software package. In this case, new software must conform to the Open Source Definition (https://opensource.org/docs/osd) and be deposited in an open software archive.

Reviewers' comments:

Reviewer's Responses to Questions

Comments to the Author

1. Is the manuscript technically sound, and do the data support the conclusions?

The manuscript must describe a technically sound piece of scientific research with data that supports the conclusions. Experiments must have been conducted rigorously, with appropriate controls, replication, and sample sizes. The conclusions must be drawn appropriately based on the data presented.

Reviewer #1: Yes

Reviewer #2: Partly

**********

2. Has the statistical analysis been performed appropriately and rigorously?

Reviewer #1: Yes

Reviewer #2: N/A

**********

3. Have the authors made all data underlying the findings in their manuscript fully available?

The PLOS Data policy requires authors to make all data underlying the findings described in their manuscript fully available without restriction, with rare exception (please refer to the Data Availability Statement in the manuscript PDF file). The data should be provided as part of the manuscript or its supporting information, or deposited to a public repository. For example, in addition to summary statistics, the data points behind means, medians and variance measures should be available. If there are restrictions on publicly sharing data—e.g. participant privacy or use of data from a third party—those must be specified.

Reviewer #1: Yes

Reviewer #2: Yes

**********

4. Is the manuscript presented in an intelligible fashion and written in standard English?

PLOS ONE does not copyedit accepted manuscripts, so the language in submitted articles must be clear, correct, and unambiguous. Any typographical or grammatical errors should be corrected at revision, so please note any specific errors here.

Reviewer #1: Yes

Reviewer #2: Yes

**********

5. Review Comments to the Author

Please use the space provided to explain your answers to the questions above. You may also include additional comments for the author, including concerns about dual publication, research ethics, or publication ethics. (Please upload your review as an attachment if it exceeds 20,000 characters)

Reviewer #1: The manuscript “suddengains: An R package to identify sudden gains in longitudinal data” describes the development and use of a statistical package to identify sudden gains in psychotherapy. The rationale for the development of the package is provided and its features and use are discussed.

I believe that this manuscript is extremely important in the field of sudden gains research. As the authors noted, the application of sudden gains criteria has differed between studies, and the analyses and calculations are highly prone to errors. As a researcher in this field I think it is safe to assume that at least a moderate degree of variance and discrepancy between studies is due to applying criteria differently (e.g., changing vs. maintaining the critical value for the third criterion based on amount of missing data) and due to computational error. As a result, I found myself in complete agreement with the authors on the importance of having a standard package for computing sudden gains. I believe this can reduce some of the “error variance” that exists in the field.

As part of my review, I compared the suddengains package with a complex, multi-tab excel spreadsheet that I had been using for the past few years to identify sudden gains. The results were identical. While it is more likely that I validated my excel file with the suddengains package than validated the package with my excel file, I believe this suggests that the main functions work without error.

There are two potential features that I think could strengthen the package further. These are by no means necessary for publication as I believe the current features already represent a substantial contribution to the literature. However, if the authors have plans on updating the package in the future, I hope they consider these features. The first is the ability to choose the critical cutoff for the third criterion. As the authors noted, the third criterion is considered to be more a descriptive cutoff than a valid test of statistical significance. This has led some authors to change the critical value to 2.5, for instance (Hardy et al., 2005). This option could be helpful in future versions. The second feature would be to allow for an altered 3rd criterion by using 1.5 standard deviations of symptom scores as a critical value. This has been done in many previous studies (e.g., Kelly et al. 2005; 2007). Allowing users to choose such an option could facilitate within-study comparisons of sudden gains identified using different criteria. I believe such a comparison could help the field decide on standard criteria based on empirical data.

Overall, I think the manuscript and package represent an important contribution to the field of sudden gains and believe the package will be used by many researchers in the field.

Reviewer #2: Review of the manuscript “suddengains: An R package to identify sudden gains in longitudinal data”

The manuscript focuses on important information about individual change, which can be investigated in longitudinal data (i.e., sudden gains & losses). Central is the description and investigation of sudden gains. For the description of sudden gains, the authors consider specific identification criteria. Based on these criteria, they provide an R-package facilitating automated investigations of sudden gains. This should help to improve the efficiency, reporting, and reproducibility of sudden gains research.

I support the basic idea of the authors that the R-package “suddengains” is a useful toolbox for applied researchers. In addition to a function for obtaining sudden gains, the presented R package has very helpful add-ons, like an illustrative example, a function for creating new output datasets with sudden gains, and graphical functions for illustrating sudden gains and individual trajectories. The authors build on available state-of-the-art packages, like dplyr and ggplot2, and their functions are well documented.

Despite my respect for developing such a useful R-package, I cannot recommend acceptance of the manuscript in the present form. Here is why: Every automation poses the risk of an unconsidered application of specific methods. In my view, this risk is not appropriately addressed in the manuscript, so far. The authors describe a very broad research field for investigating sudden gains with the presented package (p.2. line 8 – any longitudinal dataset with regular repeated measurement). I would encourage a much more detailed view on the context for the application of the R package, where the major limitations are:

1) The authors specify specific criteria for the identification of sudden gains, but they do not review the incorporated assumptions. For instance, calculating the standard error of measurement (S_E) in Eq. 6 or 7 with a specific reliability estimate (retest reliability or Cronbach’s alpha) is only possible when investigating unidimensional scales, for which such reliability estimates hold. Furthermore, even when unidimensionality holds, the reliability of a measure is not necessarily constant over time or persons (see e.g., models of Latent-State-Trait Theory, where multiple time points are considered and Cronbach’s alpha can be calculated for unidimensional scales at each time point, or Item-Response-Theory models, which allow for heterogeneous measurement error variance depending on the ability of a person). Using one reliability measure for all time points and individuals is a rough approximation to control for measurement error and alternative methods are available. I think more details on the assumptions of the promoted methods would be beneficial, in order to describe the limitations of the related R-package.

2) In the preparation of the data, the authors facilitate the exclusion of specific cases, when not enough data points are available to identify sudden gains. They clearly describe which cases have to be excluded. However, the assumptions for and the consequences of excluding persons are not mentioned. This is especially important when conducting subsequent analyses, which are encouraged by the presented R-package. When excluding persons with missing data that is not completely at random, then subsequent results can be biased.

3) A further (minor) comment addresses the statements on the meaning of sudden gains. The authors use a descriptive, databased view on sudden gains. This is appropriate for obtaining such events in the data. However, at the same time they address the meaning of sudden gains, for instance, on page 2 line 20-22 they refer to the meaning of sudden gains in placebo interventions. A clear meaning of sudden gains as effects of an intervention would require a more formal specification of sudden gains as causal effects of the intervention. Accordingly, further (design or analysis) conditions need to hold, for ruling out alternative explanations. I would clearly distinguish between obtaining sudden gains and their meaning.

To sum up, I encourage a revision of the manuscript - especially for the description of the methods and the discussion. Possible limitations of the methods and the related R package should be included. As such, the circumstances under which the application of the R-package is appropriate would be clearer. I would also appreciate a more detailed view on the meaning of sudden gains. In my view, the suggested changes can support reproducibility of sudden gains research, which is one of the authors’ goals.

**********

6. PLOS authors have the option to publish the peer review history of their article. If published, this will include your full peer review and any attached files.

If you choose “no”, your identity will remain anonymous but your review may still be made public.

Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy.

Reviewer #1: Yes: Idan M Aderka

Reviewer #2: No

[NOTE: If reviewer comments were submitted as an attachment file, they will be attached to this email and accessible via the submission site. Please log into your account, locate the manuscript record, and check for the action link "View Attachments". If this link does not appear, there are no attachment files to be viewed.]

While revising your submission, please upload your figure files to the Preflight Analysis and Conversion Engine (PACE) digital diagnostic tool, https://pacev2.apexcovantage.com/. PACE helps ensure that figures meet PLOS requirements. To use PACE, you must first register as a user. Registration is free. Then, login and navigate to the UPLOAD tab, where you will find detailed instructions on how to use the tool. If you encounter any issues or have any questions when using PACE, please email us at figures@plos.org. Please note that Supporting Information files do not need this step.

PLoS One. 2020 Mar 9;15(3):e0230276. doi: 10.1371/journal.pone.0230276.r002

Author response to Decision Letter 0


4 Feb 2020

Editor’s comments:

E#1. You define sudden gains as “large and stable changes … between two consecutive measurement points” (page 2, line 2). From this introduction, it was unclear to me how you might infer conclusions about the stability of change if you have only two measurements. In my opinion, it would require at least three measurements to make conclusions about the extent the observed change is stable or not. On page 3, you modified this claim by evaluating six measurement points in criterion 3. I think the definition of sudden gains needs to be more precise and should be modified accordingly.

RESPONSE: Thank you for raising this. Having looked again, we appreciate that the wording was unclear: while ‘large and stable’ tends to be how gains are conceptualised in a broad sense, it is true that stability is evaluated not on 2 points, but on 6 as described later. We have amended the first sentence of the introduction to make this clearer (Page 2):

“A sudden gain is a large improvement in an outcome variable experienced by an individual participant between two consecutive measurement points that is stable within a longitudinal data series.”

E#2. Equation (1) uses the standard deviation of the difference score. This represents the formula for two independent groups. Although it is also sometimes used for dependent samples, Cohen (1988) suggested including the correlation between the two measures. In situations, when the standard deviation is expected to change (such as in intervention studies) Glass and colleagues (1981) even recommended using the standard deviation of the pre-intervention measurement (you even refer to this approach on page 4, line 63). Thus, there is some discussion on the appropriate way of estimating the variance of difference scores. Please include a more thorough discussion on your choice of the standard deviation and in what way it seems more appropriate than the other suggestions.

Cohen, J. (1988). Statistical Power Analysis for the Behavioral Sciences. New York, NY: Routledge Academic.

Glass, G. V., McGaw, B., and Smith, M. L. (1981). Meta-Analysis in Social Research. Beverly Hills, CA: Sage.

RESPONSE: You rightly raise this issue, one that has been discussed in various sudden gains papers. Our use of the standard deviation is primarily driven by how Criterion 3 was originally defined by Tang and colleagues (1999, 2005, 2015) and has been subsequently applied in the literature. Our view is that for the R package to be useful it needs to be able to apply the criteria using the methods that are ‘standard’ for the field in order to allow comparability between studies. However, we also agree with the point raised by Reviewer 2 that we want to avoid the ‘unconsidered application’ of the package, so have edited the manuscript to make clearer the situations where there are methodological alternatives, and to make clearer references to studies that explore limitations and criticisms of the ‘standard’ approach. An empirical investigation comparing the standard approach with the Cohen and Glass methods would indeed be valuable for the field, and it is hoped that the open code of this package could facilitate such a study (note that we have recorded this as a suggested ‘issue’ on the package GitHub page, see https://github.com/milanwiedemann/suddengains/issues/23).

E#3. Reviewer 1 raises an important point regarding the assumptions in your approach (e.g., unidimensionality, constant reliabilities). I would recommend being more explicit about the inherent assumptions (and limitations) of your statistical model.

RESPONSE: Thank you - as described above we have examined the full manuscript in order to be more explicit in describing the relevant assumptions and limitations. We have improved the referencing of papers that discuss common limitations and relevant methodological issues. The assumptions of unidimensionality and constant reliabilities are now discussed on Page 5:

“Note that the use of the test-retest reliability or Cronbach’s alpha when calculating SE makes the assumption that the scale being examined is unidimensional, and that these reliability estimates remain constant over time, and between individuals. Researchers should consider exploring the factor structure and measurement invariance of the scale to examine if these assumptions hold.”
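To illustrate where a single reliability estimate enters such a calculation, the following minimal Python sketch applies the standard error of measurement and reliable change formulas described by Jacobson and Truax [7]. It is not the package's R code: the function name, the example values, and the 1.96 multiplier are assumptions made for illustration, and whether these formulas match the manuscript's own equations term by term should be checked against the article.

```python
import math


def reliable_change_cutoff(sd: float, reliability: float, z: float = 1.96) -> float:
    """Smallest change treated as reliable under the Jacobson-Truax approach.

    sd: standard deviation of the scale in a reference sample.
    reliability: one estimate (e.g. test-retest or Cronbach's alpha),
        assumed constant over time and between individuals.
    z: multiplier for the chosen confidence level (assumed 1.96 here).
    """
    if not 0.0 <= reliability <= 1.0:
        raise ValueError("reliability must lie between 0 and 1")
    standard_error = sd * math.sqrt(1.0 - reliability)  # S_E
    s_diff = math.sqrt(2.0 * standard_error ** 2)       # SE of a difference score
    return z * s_diff


# Hypothetical values: SD = 10 and reliability = 0.75 give S_E = 5.
cutoff = reliable_change_cutoff(sd=10.0, reliability=0.75)
```

Because the same reliability value is applied to every time point and every participant, any violation of the constancy assumption carries straight through to the cutoff.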

E#4. Your way of handling missing data seems suspect to me (page 5). I doubt you can appropriately account for missingness simply by adapting your significance level (Reviewer 1 also raises this point). Please make sure to thoroughly describe the assumptions underlying this approach and also refer to Rubin’s typology of missing mechanisms (MCAR, MAR, MNAR).

RESPONSE: We have made a number of changes to the section on missing data including making reference to reservations in the literature about the suitability of common replacement approaches for sudden gains research. It was not our intention to suggest that the adjustment of the critical value in Criterion 3 is sufficient to fully account for missingness - we have reworded this to be clear that this is one approach that has been suggested and used in the literature, but that further work to examine missing data within sudden gains research is warranted.

As the classification of missing data types (e.g. MCAR, MAR, MNAR) will vary from study to study we have opted to make reference to Rubin’s work and encourage researchers to consider the reasons why data may be missing and what methods may need to be employed as a result. This applies principally to the identification of gains themselves, where replacing missing data might lead to identifying “false gains” and is not therefore appropriate. However, different methods for handling missing data may be appropriate for subsequent analyses, so we have now mentioned this at the end of the section.

“Missing data, for example where a participant does not provide data on one or more occasions, need to be considered carefully when identifying sudden gains for several reasons. Firstly, depending on the number and pattern of missing data points for an individual, it may not be possible to identify sudden gains, see Table 2. Specifically, in order to estimate the standard deviation values in criterion 3, at least two of the three measurements immediately prior to the gain must be present, as well as at least two of the three measurements immediately following the gain. Some researchers have suggested that methods used to replace missing values, such as last observation carried forward or multiple imputation, may not be appropriate when identifying sudden gains given the potential for additional gains to be detected based on data that were not provided by participants [20,21].

Secondly, where values are missing in the period around the potential sudden gain, two approaches have been described to evaluate the stability of the change. Following the updated version of the third criterion by Tang and colleagues [5,9] some studies have used a critical value of 2.776 across all session to session intervals to check the stability [17]. An alternative approach adjusts the critical values used in criterion 3 (see Eq 1) based on the data that were available in the period around the potential sudden gain [16]: Where no data are missing t(4;97.5%) > 2.776; where one datapoint is missing either before or after the gain t(3;97.5%) > 3.182; and where one datapoint is missing both before and after the gain t(2;97.5%) > 4.303. This method has been adopted in some subsequent studies [22,24].

It is important to understand the reasons for missing data and consider whether methods to handle missing data need to be employed both at the identification stage and in subsequent analyses [18,19]. Further research to examine the impact of missing data and different methods to handle missing data when identifying sudden gains would be beneficial.”
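The adjustment of the critical value described in the quoted passage can be sketched as a small lookup. This is a hedged Python illustration rather than the package's R implementation: the function name is hypothetical, the three critical values are copied from the text, and the rule that the degrees of freedom equal the number of observed pre-gain values plus the number of observed post-gain values minus two is an assumption inferred from those values.

```python
# Critical values quoted in the text for 4, 3 and 2 degrees of freedom.
CRITICAL_VALUES = {4: 2.776, 3: 3.182, 2: 4.303}


def adjusted_critical_value(n_pre: int, n_post: int) -> float:
    """Return the criterion 3 critical value for the available data.

    n_pre and n_post count the non-missing measurements among the three
    immediately before and the three immediately after the candidate gain.
    """
    for n in (n_pre, n_post):
        if n not in (2, 3):
            raise ValueError("need 2 or 3 observed values on each side")
    # Assumed rule: df = n_pre + n_post - 2, consistent with the quoted values.
    df = n_pre + n_post - 2
    return CRITICAL_VALUES[df]
```

For example, `adjusted_critical_value(3, 2)` covers the case where one datapoint is missing on one side of the gain. The fixed alternative mentioned in the passage corresponds to always using the value for four degrees of freedom.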

E#5. What are the criteria for excluding respondents because “it would not be possible to identify any sudden gains” (page 8, line 138)? Looking at Table 2, it seems to me that you considered one measurement point before and at least two measurement points after the pregain session as a necessary requirement. Could you elaborate why? Moreover, criterion 3 on page 3 seems to indicate that at least three measurements before and after the pregain session are needed?

RESPONSE: We have revised the explanation of the select_cases() function to better characterise its use. Some published studies have set a minimum number of datapoints that must be present in order to be included in the analysis, and others have analysed only cases with sufficient data for at least one session-to-session interval to be tested. The select_cases() function is optional and can be used to apply such conditions should researchers wish. The lowest number of datapoints for a single participant that could still show a sudden gain is four, provided they are arranged in one of the patterns shown in Table 2. We have revised Table 2 in order to improve clarity about these patterns, as well as making a more explicit statement about the minimum amount of data required (Page 9):

“For all three criteria to be applied there must be data present for at least two of the three data points prior to, and two of the three following, the interval to be examined.”
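One way to read this requirement is as a check on a single session-to-session interval. The base R sketch below is illustrative only: the function `enough_data_for_interval` is hypothetical (it is not the package's select_cases() implementation), and it assumes that the pregain and postgain scores themselves must be observed and that measurements falling outside the series count as missing.

```r
# Hypothetical helper, not part of the suddengains package.
# scores: numeric vector of repeated measurements for one participant (NA = missing)
# n: index of the pregain measurement; the interval examined is n -> n + 1
enough_data_for_interval <- function(scores, n) {
  stopifnot(n >= 1, n < length(scores))
  # Assumption: positions outside the observed series are treated as missing
  padded <- c(NA, NA, scores, NA, NA)
  before <- padded[n:(n + 2)]        # measurements n - 2, n - 1, n
  after  <- padded[(n + 3):(n + 5)]  # measurements n + 1, n + 2, n + 3
  # Assumption: the pregain and postgain scores must both be observed
  if (is.na(scores[n]) || is.na(scores[n + 1])) {
    return(FALSE)
  }
  sum(!is.na(before)) >= 2 && sum(!is.na(after)) >= 2
}

# Made-up scores with four observed datapoints
scores <- c(30, 29, 12, 11, NA, NA)
enough_data_for_interval(scores, 2)  # two observed before, two after
enough_data_for_interval(scores, 3)  # only one observed after the interval
```

With these made-up scores, the interval from measurement 2 to 3 has enough data for all three criteria, whereas the interval from 3 to 4 does not.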

E#6. Reviewer 2 has two suggestions for improving the package and increasing its usefulness for applied researchers. You might want to consider these points. However, their implementation is not a requirement for the publication of your manuscript.

RESPONSE: Thank you, we are grateful for these helpful suggestions and have responded to the specific ideas below.

E#7. Your worked example is only partially presented. I would recommend also including the output generated by the functions and providing a step by step interpretation of the printout.

RESPONSE: We have added additional text and tables to provide explanations of the output at each stage (pages 10-12) so that the worked example is clearer to follow.

In addition to these points, both reviewers made further excellent suggestions that you should consider in your revision. I strongly encourage you to address these issues and submit a revised version of your manuscript.

Reviewer 1:

The manuscript “suddengains: An R package to identify sudden gains in longitudinal data” describes the development and use of a statistical package to identify sudden gains in psychotherapy. The rationale for the development of the package is provided and its features and use are discussed.

I believe that this manuscript is extremely important in the field of sudden gains research. As the authors noted, the application of sudden gains criteria has differed between studies, and the analyses and calculations are highly prone to errors. As a researcher in this field I think it is safe to assume that at least a moderate degree of variance and discrepancy between studies is due to applying criteria differently (e.g., changing vs. maintaining the critical value for the third criterion based on amount of missing data) and due to computational error. As a result, I found myself in complete agreement with the authors on the importance of having a standard package for computing sudden gains. I believe this can reduce some of the “error variance” that exists in the field.

As part of my review, I compared the suddengains package with a complex, multi-tab Excel spreadsheet that I had been using for the past few years to identify sudden gains. The results were identical. While it is more likely that I validated my Excel file with the suddengains package than validated the package with my Excel file, I believe this suggests that the main functions work without error.

There are two potential features that I think could strengthen the package further. These are by no means necessary for publication as I believe the current features already represent a substantial contribution to the literature. However, if the authors have plans on updating the package in the future, I hope they consider these features.

R1#1. The first is the ability to choose the critical cutoff for the third criterion. As the authors noted, the third criterion is considered to be more a descriptive cutoff than a valid test of statistical significance. This has led some authors to change the critical value to 2.5, for instance (Hardy et al., 2005). This option could be helpful in future versions.

RESPONSE: Thank you, we have implemented this function in a development version of the package and will add it to the main package once it is fully tested. See commit ce26a73 (https://github.com/milanwiedemann/suddengains/commit/ce26a73ffb249c52c85e2c2d37546be3c59cd870) on the plos-one-revisions branch (https://github.com/milanwiedemann/suddengains/tree/plos-one-revisions). Guidance on installing this development version can be found in the Readme file (https://github.com/milanwiedemann/suddengains/tree/plos-one-revisions#installation). This function is also implemented in the interactive demonstration of the package, see https://milanwiedemann.shinyapps.io/shinygains/.

R1#2. The second feature would be to allow for an altered 3rd criterion by using 1.5 standard deviations of symptom scores as a critical value. This has been done in many previous studies (e.g., Kelly et al. 2005; 2007). Allowing users to choose such an option could facilitate within-study comparisons of sudden gains identified using different criteria. I believe such a comparison could help the field decide on standard criteria based on empirical data.

RESPONSE: Thank you for this suggestion. It raises the interesting topic of identifying sudden gains based on criteria that are different for each individual. We see this as a subsequent project that the package could facilitate - our view is that the development and implementation of these alternative methods would require some broader consideration and experimentation before they are ready to implement within the package, but that we would be keen to support this and to have such functions available in the package in future. We have added an issue describing this feature request on GitHub, see https://github.com/milanwiedemann/suddengains/issues/22.

R1#3. Overall, I think the manuscript and package represent an important contribution to the field of sudden gains and believe the package will be used by many researchers in the field.

RESPONSE: Thank you for your kind comments and helpful suggestions.

Reviewer 2:

The manuscript focuses on important information about individual change, which can be investigated in longitudinal data (i.e., sudden gains & losses). Central is the description and investigation of sudden gains. For the description of sudden gains, the authors consider specific identification criteria. Based on these criteria, they provide an R-package facilitating automated investigations of sudden gains. This should help to improve the efficiency, reporting, and reproducibility of sudden gains research.

I support the basic idea of the authors that the R-package “suddengains” is a useful toolbox for applied researchers. In addition to a function for obtaining sudden gains, the presented R package has very helpful add-ons, like an illustrative example, a function for creating new output datasets with sudden gains, and graphical functions for illustrating sudden gains and individual trajectories. The authors build on available state-of-the-art packages, like dplyr and ggplot2, and their functions are well documented.

R2#1. Despite my respect for developing such a useful R-package, I cannot recommend acceptance of the manuscript in the present form. Here is why: Every automation poses the risk of an unconsidered application of specific methods. In my view, this risk is not appropriately addressed in the manuscript, so far. The authors describe a very broad research field for investigating sudden gains with the presented package (p.2. line 8 – any longitudinal dataset with regular repeated measurement). I would encourage a much more detailed view on the context for the application of the R package, where the major limitations are:

RESPONSE: We share your concern regarding the ‘unconsidered application’ of the package and have therefore sought to address this risk within the manuscript. We have changed the mention of ‘any’ longitudinal dataset to suggest that other fields of research investigating within-participant changes may wish to consider this approach (Page 2, Line 7). We have included a number of additions throughout the paper to make readers more aware of the assumptions, limitations, and criticisms of the standard criteria and methods to identify sudden gains, and have improved the citation of relevant literature discussing these issues. These additions include specific statements encouraging the reader to consider their methodological choices carefully when using the package, e.g. Page 15:

“while the package may significantly increase the speed and accuracy of calculations, it cannot and should not substitute considered methodological thinking. In particular, users should consider carefully the appropriateness of the methods selected within each function, including related assumptions and limitations.”

On Page 8 we have also added a link to an interactive web-based ‘Shiny’ application we have been developing, that is designed to help readers of the paper and users of the package consider and understand the impact of different methodological choices (see https://milanwiedemann.shinyapps.io/shinygains/). We also note that the package itself provides a range of warning messages and help documentation to encourage careful and considered use of the functions. Responsibility does of course also lie with the end user, but we hope that the manuscript now better encourages considered use of the package functions.

R2#2. The authors specify specific criteria for the identification of sudden gains, but they do not review the incorporated assumptions. For instance, calculating the standard error of measurement (S_E) in Eq. 6 or 7 with a specific reliability estimate (retest reliability or Cronbach’s alpha) is only possible when investigating unidimensional scales, for which such reliability estimates hold. Furthermore, even when unidimensionality holds, the reliability of a measure is not necessarily constant over time or persons (see e.g., models of Latent-State-Trait Theory, where multiple time points are considered and Cronbach’s alpha can be calculated for unidimensional scales at each time point, or Item-Response-Theory models, which allow for heterogeneous measurement error variance depending on the ability of a person). Using one reliability measure for all time points and individuals is a rough approximation to control for measurement error and alternative methods are available. I think more details on the assumptions of the promoted methods would be beneficial, in order to describe the limitations of the related R-package.

RESPONSE: Thank you for highlighting this point. As noted above, we have now made a number of changes to the manuscript that aim to emphasise the assumptions and potential limitations of the methods that are used as the current ‘standard’ in the field. For the example you give above, we suspect it is true that most studies to date have used the ‘rough approximation’ approach to reliability that you describe, but of course agree it is important for researchers to understand and actively consider whether such assumptions are appropriate for their data. We have now included mention of this issue on Page 5:

“Note that the use of the test-retest reliability or Cronbach’s alpha when calculating SE makes the assumption that the scale being examined is unidimensional, and that these reliability estimates remain constant over time, and between individuals. Researchers should consider exploring the factor structure and measurement invariance of the scale to examine if these assumptions hold.”
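To make the role of the reliability estimate concrete, the sketch below computes a standard error of measurement and a reliable change value in base R. It uses the widely cited Jacobson and Truax formulation as an assumption, since Eq 6 and 7 of the manuscript are not reproduced in this letter. The function names and the input values (a standard deviation of 10 and a reliability of 0.91) are hypothetical, chosen only for illustration.

```r
# Hypothetical helpers, not part of the suddengains package.
# sd_baseline: standard deviation of the scale at baseline
# reliability: a single reliability estimate (e.g. test-retest or Cronbach's alpha),
#              assumed constant over time and between individuals
standard_error_measurement <- function(sd_baseline, reliability) {
  stopifnot(sd_baseline > 0, reliability >= 0, reliability <= 1)
  sd_baseline * sqrt(1 - reliability)
}

# Assumption: reliable change value following Jacobson and Truax,
# i.e. 1.96 times the standard error of the difference between two scores
reliable_change_value <- function(sd_baseline, reliability) {
  se <- standard_error_measurement(sd_baseline, reliability)
  s_diff <- sqrt(2 * se^2)
  1.96 * s_diff
}

# Illustrative values only, not taken from the manuscript
standard_error_measurement(sd_baseline = 10, reliability = 0.91)
reliable_change_value(sd_baseline = 10, reliability = 0.91)
```

Because a single reliability value enters both functions, any violation of the unidimensionality or invariance assumptions described above carries through directly to the resulting cutoff.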

In addition we have included clearer mention of one of the main limitations of the package in the discussion section (Page 15):

“Limitations of the package include the fact that more substantial adaptations to the standard criteria cannot currently be implemented, though as the underlying code is publicly available, researchers may wish to use this in combination with other tools for further development work.”

R2#3. In the preparation of the data, the authors facilitate the exclusion of specific cases, when not enough data points are available to identify sudden gains. They clearly describe which cases have to be excluded. However, the assumptions for and the consequences of excluding persons are not mentioned. This is especially important when conducting subsequent analyses, which are encouraged by the presented R-package. When excluding persons with missing data that is not completely at random, then subsequent results can be biased.

RESPONSE: We have rephrased this section to avoid suggesting that certain participants need to be excluded, and instead clarify that the select_cases() function allows researchers to stipulate a minimum amount of data for each participant should they wish. In many cases, researchers may want to generate a dataset that consists only of participants with sudden gains (for example, to analyse characteristics of the gains themselves, or the participants who experience them), or one that also includes participants with sufficient data to demonstrate a gain but where one was not observed. In both scenarios, participants who have insufficient data to demonstrate a gain may therefore not be of interest. However, this will be determined by the research question and aims of the study, so we have emphasised that researchers should consider if this optional function is appropriate (Page 9; Please see also our response to comment E5).

“The optional select_cases() function can be used to identify samples of cases for analysis who fulfil such conditions, though researchers should consider whether these methods are appropriate for the aims of the study.”

R2#4. A further (minor) comment addresses the statements on the meaning of sudden gains. The authors use a descriptive, data-based view on sudden gains. This is appropriate for obtaining such events in the data. However, at the same time they address the meaning of sudden gains, for instance, on page 2, lines 20-22, they refer to the meaning of sudden gains in placebo interventions. A clear meaning of sudden gains as effects of an intervention would require a more formal specification of sudden gains as causal effects of the intervention. Accordingly, further (design or analysis) conditions need to hold, for ruling out alternative explanations. I would clearly distinguish between obtaining sudden gains and their meaning.

RESPONSE: Thank you for raising this important point. You rightly highlight the distinction between the data-driven, mathematical definition of a sudden gain, and their more conceptual meaning in relation to the intervention being studied. We have taken this opportunity to clarify wordings within the manuscript in order to better emphasise this difference, for example on Page 2:

“In addition, some studies have raised concerns about the validity of sudden gains identified through current methods, demonstrating that they can be found in placebo interventions and simulated datasets [3,4]. This suggests that not all gains reflect meaningful change or show a causal association with the intervention being studied. This highlights the need to examine the presence and strength of these associations and to consider if the current methods of identification can be refined.”

And in the discussion on Page 15:

“Lastly it should be emphasised that sudden gains and losses identified by applying a set of mathematical criteria are not necessarily related to the effects of the intervention being studied, and that further investigation would be required to establish the presence and strength of the evidence for a causal relationship.”

R2#5. To sum up, I encourage a revision of the manuscript - especially for the description of the methods and the discussion. Possible limitations of the methods and the related R package should be included. As such, the circumstances under which the application of the R-package is appropriate would be clearer. I would also appreciate a more detailed view on the meaning of sudden gains. In my view, the suggested changes can support reproducibility of sudden gains research, which is one of the authors’ goals.

RESPONSE: Thank you for your helpful comments. We hope we have been able to address them appropriately, and believe the manuscript has been much improved as a result.

Attachment

Submitted filename: r-suddengains-plosone-response-r1-v2.docx

Decision Letter 1

Timo Gnambs

26 Feb 2020

suddengains: An R package to identify sudden gains in longitudinal data

PONE-D-19-30704R1

Dear Dr. Wiedemann,

We are pleased to inform you that your manuscript has been judged scientifically suitable for publication and will be formally accepted for publication once it complies with all outstanding technical requirements.

Within one week, you will receive an e-mail containing information on the amendments required prior to publication. When all required modifications have been addressed, you will receive a formal acceptance letter and your manuscript will proceed to our production department and be scheduled for publication.

Shortly after the formal acceptance letter is sent, an invoice for payment will follow. To ensure an efficient production and billing process, please log into Editorial Manager at https://www.editorialmanager.com/pone/, click the "Update My Information" link at the top of the page, and update your user information. If you have any billing related questions, please contact our Author Billing department directly at authorbilling@plos.org.

If your institution or institutions have a press office, please notify them about your upcoming paper to enable them to help maximize its impact. If they will be preparing press materials for this manuscript, you must inform our press team as soon as possible and no later than 48 hours after receiving the formal acceptance. Your manuscript will remain under strict press embargo until 2 pm Eastern Time on the date of publication. For more information, please contact onepress@plos.org.

With kind regards,

Timo Gnambs

Academic Editor

PLOS ONE

Additional Editor Comments (optional):

Reviewers' comments:

Reviewer's Responses to Questions

Comments to the Author

1. If the authors have adequately addressed your comments raised in a previous round of review and you feel that this manuscript is now acceptable for publication, you may indicate that here to bypass the “Comments to the Author” section, enter your conflict of interest statement in the “Confidential to Editor” section, and submit your "Accept" recommendation.

Reviewer #2: All comments have been addressed

**********

2. Is the manuscript technically sound, and do the data support the conclusions?

The manuscript must describe a technically sound piece of scientific research with data that supports the conclusions. Experiments must have been conducted rigorously, with appropriate controls, replication, and sample sizes. The conclusions must be drawn appropriately based on the data presented.

Reviewer #2: Yes

**********

3. Has the statistical analysis been performed appropriately and rigorously?

Reviewer #2: Yes

**********

4. Have the authors made all data underlying the findings in their manuscript fully available?

The PLOS Data policy requires authors to make all data underlying the findings described in their manuscript fully available without restriction, with rare exception (please refer to the Data Availability Statement in the manuscript PDF file). The data should be provided as part of the manuscript or its supporting information, or deposited to a public repository. For example, in addition to summary statistics, the data points behind means, medians and variance measures should be available. If there are restrictions on publicly sharing data—e.g. participant privacy or use of data from a third party—those must be specified.

Reviewer #2: Yes

**********

5. Is the manuscript presented in an intelligible fashion and written in standard English?

PLOS ONE does not copyedit accepted manuscripts, so the language in submitted articles must be clear, correct, and unambiguous. Any typographical or grammatical errors should be corrected at revision, so please note any specific errors here.

Reviewer #2: Yes

**********

6. Review Comments to the Author

Please use the space provided to explain your answers to the questions above. You may also include additional comments for the author, including concerns about dual publication, research ethics, or publication ethics. (Please upload your review as an attachment if it exceeds 20,000 characters)

Reviewer #2: The authors were very responsive to all suggestions and I found the revised paper very readable. In my view, the meaning of sudden gains and the limitations of standard research methods in this field are clearer. The R package “suddengains” will be very helpful for applied researchers. Furthermore, I support the authors’ statements that their open source code is beneficial for future methodological developments. Next to the good responses to my comments, I really liked the more detailed description of the example output in the manuscript as well as the add-on of a shiny app. Thus, I encourage acceptance of the manuscript.

**********

7. PLOS authors have the option to publish the peer review history of their article. If published, this will include your full peer review and any attached files.

If you choose “no”, your identity will remain anonymous but your review may still be made public.

Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy.

Reviewer #2: Yes: Marie-Ann Sengewald

Acceptance letter

Timo Gnambs

28 Feb 2020

PONE-D-19-30704R1

suddengains: An R package to identify sudden gains in longitudinal data

Dear Dr. Wiedemann:

I am pleased to inform you that your manuscript has been deemed suitable for publication in PLOS ONE. Congratulations! Your manuscript is now with our production department.

If your institution or institutions have a press office, please notify them about your upcoming paper at this point, to enable them to help maximize its impact. If they will be preparing press materials for this manuscript, please inform our press team within the next 48 hours. Your manuscript will remain under strict press embargo until 2 pm Eastern Time on the date of publication. For more information please contact onepress@plos.org.

For any other questions or concerns, please email plosone@plos.org.

Thank you for submitting your work to PLOS ONE.

With kind regards,

PLOS ONE Editorial Office Staff

on behalf of

Dr. Timo Gnambs

Academic Editor

PLOS ONE

Associated Data

    Data Availability Statement

    The package can be downloaded from CRAN https://CRAN.R-project.org/package=suddengains. All code, materials, and data can be found at https://github.com/milanwiedemann/suddengains. Instructions for installing the package, further technical details, and examples can be found at https://milanwiedemann.github.io/suddengains. The R code examples in this paper refer to package version 0.4.0.


    Articles from PLoS ONE are provided here courtesy of PLOS
