Abstract
Context
The benefits of a mindfulness meditation (MM) intervention are most often evidenced by improvements in self-rated stress and mental health. Given the physiological complexity of the psychological stress system, it is likely that some people benefit significantly, while others do not. Clinicians and researchers could benefit from further exploration to determine which baseline factors can predict clinically significant improvements from MM.
Objectives
The study intended to determine: (1) if the baseline measures for participants who significantly benefitted from MM training were different from the baseline measures of participants who did not and (2) whether a classification analysis using a decision-tree, machine-learning approach could be useful in predicting which individuals would be most likely to improve.
Design
The research team performed a secondary analysis of a previously completed randomized, controlled clinical trial.
Setting
Oregon Health & Science University and participants’ homes.
Participants
Participants were 134 stressed, generally healthy adults from the metropolitan area of Portland, Oregon, who were 50 to 85 years old.
Intervention
Participants were randomly assigned either to a six-week MM intervention group or to a waitlist control group, who received the same MM intervention after the waitlist period.
Outcome Measures
Outcome measures were assessed at baseline and at two-month follow-up intervals. A responder was defined as someone who demonstrated a moderate, clinically significant improvement on the Mental Health Component (MHC) of the Short Form Health-related Quality of Life (SF-36), i.e., a change ≥4. The MHC had demonstrated the greatest effect size in the primary analysis of the above-mentioned randomized, controlled clinical trial. Potential predictors were demographic information and baseline measures related to stress and affect. Univariate statistical analyses were performed to compare the values of predictors in the responder and nonresponder groups. In addition, predictors were chosen for a classification analysis using a decision tree approach.
Results
Of the 134 original participants, 121 completed the MM intervention. As defined above, 61 were responders and 60 were nonresponders. Analyses of the baseline measures demonstrated significant differences between the 2 groups in several measures: (1) the Positive and Negative Affect Schedule negative sub-scale (PANAS-neg), (2) the SF-36 MHC, and (3) the SF-36 Energy/Fatigue, with clinically worse scores being associated with greater likelihood of being a responder. Disappointingly, the decision-tree analyses were unable to achieve a classification rate of better than 65%.
Conclusions
The differences in predictor variables between responders and nonresponders to an MM intervention suggested that those with worse mental health at baseline were more likely to improve. Decision-tree analysis was unable to usefully predict who would respond to the intervention.
Keywords: Mental health, mindfulness meditation, stress, older adults, predictors, decision tree
Mindfulness meditation (MM) is a popular meditative approach that has been formally studied and successfully used in a variety of clinical conditions.1–4 The benefits of an MM intervention are most often evidenced by improvements in self-rated stress and mental health.6,7 Given the physiological complexity of the psychological stress system,8,9 it is likely that some people benefit significantly, while others do not. For example, in one study, participants who had a lower distress tolerance had greater reductions in perceived stress scores following Mindfulness-Based Stress Reduction (MBSR).10
Furthermore, people with higher scores for trait mindfulness have shown greater improvements following an MM intervention.11 One study suggested that participants struggling more with psychosocial adaptation benefitted more from MM training,10 while another suggested that those participants with higher mindful awareness improved more on psychosocial outcomes.11 These limited results suggest that clinicians and researchers could benefit from further exploration to determine which baseline factors can predict clinically significant improvements from MM.
Oken et al. previously published the primary analysis of a completed randomized, controlled clinical trial in older adults.12 The current secondary analysis was performed to determine: (1) if the baseline measures for participants who significantly benefitted from MM training, i.e., responders, were different from the baseline measures of participants who did not, i.e., nonresponders, and (2) whether a classification analysis using a decision-tree, machine-learning approach13 could be useful in predicting which individuals would be most likely to improve.
METHODS
Participants
Generally healthy adults were recruited from the metropolitan area of Portland, Oregon, between June 2011 and January 2015. The research team screened 219 potential participants by telephone. People were reached in the local community using advertisements and announcements through newsletters and newspapers, email list servers, and postings at Oregon Health & Science University (OHSU). All assessments and meditation trainings took place at OHSU. Daily practice other than the trainings generally occurred in participants’ homes. Potential participants were included if they: (1) were between 50 and 85 years old, (2) had a baseline score on the Perceived Stress Scale (PSS) greater than or equal to 9, and (3) agreed to follow the study’s protocol, including randomization.
Exclusion criteria were implemented to screen out underlying illnesses that might limit the benefit of the intervention, confound outcomes, or increase dropout likelihood. Potential participants were excluded if they: (1) had a cognitive impairment, identified through the participant’s significant complaints or a score lower than 30 on the Modified Telephone Interview for Cognitive Status; (2) had significant, patient-reported medical or neurological disease, e.g., major organ failure, insulin-dependent diabetes, active cancer within the 5 years prior to the study, or alcoholism; (3) had significant, untreated depression, as indicated by a score greater than 5 on the Geriatric Depression Scale (short form) and an interview; (4) used medications known to affect the central nervous system’s (CNS’) functioning or to affect physiological measures, e.g., steroids, neuroleptics, or regular narcotic analgesics; stable doses of CNS-active drugs with less impact, e.g., beta-blockers, SSRIs, and histamine blockers, were acceptable; (5) could not understand the instructions, e.g., could not hear or see the study’s materials or were not fluent in English; or (6) had prior experiences with meditation classes or other mind-body classes, e.g., yoga or tai chi, in the 24 months prior to the study or had performed more than 5 minutes of daily practice in the 30 days prior to the study.
This study was approved by Oregon Health & Science University’s Institutional Review Board and was registered with ClinicalTrials.gov (NCT01386060). Participants provided informed consent at a baseline visit.
Procedures
Three testing visits occurred, with the visits being 2 months apart. Following Visit 1, the baseline visit, the randomizations were performed by non-blinded research personnel using a computerized, covariate, adaptive randomization procedure,14 balancing groups on age, gender, and baseline scores on the PSS. Research assistants (RAs) who conducted outcome assessments remained blinded.
All participants received 6 weeks of MM training, either between Visits 1 and 2 for the intervention group or between Visits 2 and 3 for the waitlisted controls. The latter group served as controls for the primary outcome analyses at Visit 2.12 Responder classification for the current analysis was achieved by subtracting baseline data from corresponding data immediately post-intervention.
Enrolled participants were encouraged not to change their use of prescription drugs during the study and to inform the investigator if any changes were made.
For the current secondary analysis, a responder was defined as someone who demonstrated a moderate, clinically significant improvement on the Mental Health Component (MHC) of the Short Form Health-related Quality of Life (SF-36),15 i.e., a change ≥4. The MHC had demonstrated the greatest effect size in the primary analysis of Oken et al.’s randomized, controlled clinical trial.12 The 17 potential predictors were demographic information and baseline measures related to stress and affect, as follows:
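The responder rule above can be illustrated with a minimal sketch. It assumes only what the text states: the change is the post-intervention score minus the baseline score, and a change of at least 4 points counts as a response. The function name and the example scores are hypothetical and are not study data.

```python
# Threshold taken from the text: a change of at least 4 points on the SF-36 MHC.
RESPONDER_THRESHOLD = 4.0


def is_responder(baseline_mhc: float, post_mhc: float,
                 threshold: float = RESPONDER_THRESHOLD) -> bool:
    """Return True when the post-minus-baseline change meets the threshold."""
    return (post_mhc - baseline_mhc) >= threshold


# Hypothetical (baseline, post-intervention) score pairs, for illustration only.
examples = [(36.0, 41.5), (42.0, 44.0), (40.0, 44.0)]
labels = [is_responder(baseline, post) for baseline, post in examples]
print(labels)
```

A change of exactly 4 points is classified as a response because the criterion is written as "≥4".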
Demographic measures
The 4 demographic measures included: (1) gender, (2) age at entry to the study, (3) years of formal education, and (4) annual income assessed using a 7-step scale.
Self-rated measures
Participants completed the 13 self-rated measures at the baseline visit, Visit 1. They were entered into the predictor analysis. These measures might affect responsiveness to the MM intervention, and they were obtained for use as secondary outcome measures in Oken et al.’s randomized, controlled clinical trial.12
The forms took approximately 2 hours to complete. Participants were paid $15 per hour for this time, plus additional time and money for the in-lab assessments. The measures were: (1) life experience stressors,16 (2) the PSS,17 (3) the negative affect subscale (-neg) of the conventional Positive and Negative Affect Schedule (PANAS),18 (4) the positive affect subscale (-pos) of the conventional PANAS,18 (5) the maximum of the PANAS-neg using the state version of the PANAS,18–20 (6) Neuroticism from the current Neuroticism-Extraversion-Openness Personality Inventory Revised (NEO PI-R),21 (7) the Center for Epidemiologic Studies Depression Scale (CES-D),22 (8) the SF-36 MHC,23 (9) the SF-36 Energy and Fatigue subscale,23 (10) the Pittsburgh Sleep Quality Index (PSQI),24 (11) the General Perceived Self-Efficacy Scale (GPSE),25 (12) the fifth factor (Non-judgmental) from the Five Factor Mindfulness Questionnaire,26 and (13) the Expectancy questionnaire.27
Intervention
The MM intervention was administered over the course of 6 weekly one-on-one sessions with a trained RA. The intervention was a standardized and structured program based on Mindfulness Based Cognitive Therapy (MBCT) previously described by Oken et al. and Wahbeh et al.12,28 The one-on-one sessions allowed easy scheduling at each participant’s convenience. In addition to the six 60- to 90-minute training sessions, daily guided meditations for home practice were accessed with the study’s iPod application, which measured adherence using minutes listened.29 Participants were instructed to practice at home for 30 minutes per day as a goal, with at least some practice every day. To account for the many stressors that adults typically face, training sessions offered MM-based coping strategies to be used in the midst of daily living. Participants who were randomly assigned to the waitlisted arm received the same MM intervention between Visit 2 and Visit 3.
Outcome Measures
Seven-step income scale
As previously done to estimate potential financial stress on health, we asked participants to estimate their annual income on a 7-step scale: $0-14,999; $15,000-29,999; $30,000-44,999; $45,000-59,999; $60,000-74,999; $75,000-89,999; and $90,000+.
Life experience stressors.16
This test measures the severity and number of negative life-events in the year prior to the study. It is a 57-item, self-report measure on which the respondents were asked to report about life experiences during the previous year. Participants were asked to rate their experiences on a 7-point impact scale, ranging from “extremely negative” (−3) to “no impact” (0) to “extremely positive” (+3). A single summation was calculated. Test-retest correlations for the total change scores from two groups of undergraduate psychology students were .63 and .64 (p < .001).16
PSS.17
This scale measures stress. Self-report scores can vary from 0 to 40 based on 10 items, where higher scores indicate higher levels of stress. The total internal consistency is good, with a reported Cronbach’s alpha of .76.17,17a
PANAS-neg and -pos.18
The negative and positive affect subscales of the conventional PANAS measure participants’ negative and positive feelings, respectively, in the week prior to testing, using 10 questions each. Higher scores on the PANAS-neg indicate higher negative disposition or affect, and higher scores on the PANAS-pos indicate higher positive affect. This measure has been experimentally tested and found to be highly reliable.18,30
State PANAS.18,19,20
A 10-question state version of the PANAS was administered at four semi-random times during waking hours on the day following the in-lab baseline testing using a smartphone assessment tool. The routine administration of the state version of the PANAS has been validated,18,19,20 but this specific smartphone assessment has not, although it should be comparable. For prediction purposes, we chose the maximum negative state affect from the ecological momentary assessment (EMA).
NEO PI-R.21
The inventory measures neuroticism, assessed because changes are a possible outcome from meditation despite the fact that neuroticism is considered a stable trait. This widely used, self-administered inventory has 60 items rated on a 5-point Likert scale and yields 5 factors, but only the Neuroticism factor was used for these analyses because it was previously shown to be affected by the intervention.21 The Neuroticism factor has good internal consistency (alpha = 0.68).
CES-D.22
The scale assesses negative affect and depression using a 20-item, 4-point Likert scale asking about various feelings within the last week. The scale has strong internal consistency with Cronbach’s alpha ranging from 0.88 to 0.91 being reported.22
SF-36 MHC.23
The subscale measures overall mental health and was selected as a predictor. It showed the most significant outcome in the primary analysis for the study by Oken et al.12 and is a widely used and validated self-rated health related quality of life measure.
SF-36 Energy and Fatigue subscale.23
This subscale measures fatigue, which previously has been found to impact adherence in mind-body interventions32 and which may be a predictor of responsiveness.
PSQI.24
The PSQI consists of 19 self-rated questions that assess a variety of factors related to sleep quality. These items are grouped into seven component scores, each weighted equally on a 0–3 scale. The seven component scores are then summed to yield an overall PSQI score ranging from 0–21, with higher scores indicating poorer sleep quality. A recent meta-analysis of PSQI showed that in the majority of studies, Cronbach’s alphas ranged from 0.70 to 0.83, meeting the cut-point for a positive rating for within- and between-group comparisons (i.e., 0.70).31
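The final summation step described above can be sketched as follows. This is a minimal illustration that assumes the seven component scores have already been derived; the rules for deriving them from the 19 items are not given in the text and are not reproduced here. The function name and example values are hypothetical.

```python
def psqi_global_score(components):
    """Sum seven component scores (each 0-3) into a global score (0-21)."""
    if len(components) != 7:
        raise ValueError("the PSQI global score needs exactly seven component scores")
    if any(c not in (0, 1, 2, 3) for c in components):
        raise ValueError("each component score must be an integer from 0 to 3")
    return sum(components)


# Hypothetical component scores, for illustration only (not study data).
example_components = [1, 2, 1, 0, 2, 1, 1]
example_total = psqi_global_score(example_components)
print(example_total)
```

Because every component contributes equally, the range of 0 to 21 follows directly from seven components scored 0 to 3.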
General Perceived Self-Efficacy Scale (GPSE).25
This scale measures self-efficacy, i.e., an individual’s sense of control. The scale was included because participants with a better sense of control often have better outcomes. This self-report measure consists of 10 items such as “I can always manage to solve difficult problems if I try hard enough,” or “Thanks to my resourcefulness, I know how to handle unforeseen situations.” Items are scored from 1 (“not at all true”) to 4 (“Exactly true”). The total score is calculated as the sum of all items. The total score ranges between 10 and 40, with a higher score indicating more self-efficacy. Internal reliability (Cronbach’s alpha) ranges from .76 to .90.25
Five Factor Mindfulness Questionnaire.26
This questionnaire measures 5 factors of mindfulness, but we used only the 9-item Acceptance without Judgment (Mindful Non-judging) subscale from the Kentucky Inventory of Mindfulness Skills (KIMS) because it was found in prior studies to be significantly different in chronically stressed populations.33,34 Items reflect the act of making judgments or evaluations and common examples of self-judgment and/or self-criticism. Participants are asked to rate statements on a scale of 1 (“Never or very rarely true”) to 5 (“Very often or always true”). Items for the Acceptance without Judgment scale are reverse coded and include items such as “I criticize myself for having irrational or inappropriate emotions,” and “I tend to make judgments about how worthwhile or worthless my experiences are.” These items are summed after reverse coding (range 9–45), with good internal reliability (Cronbach’s alpha = .87).26
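The reverse coding and summation described above can be illustrated with a short sketch. It assumes the standard reverse-coding rule for a 1 to 5 rating scale (a rating r becomes 6 − r), which is consistent with the stated range of 9 to 45 but is not spelled out in the text. The function names and example ratings are hypothetical.

```python
def reverse_code(rating, low=1, high=5):
    """Flip a rating on a low..high scale, so that 1 becomes 5 and 5 becomes 1."""
    return low + high - rating


def nonjudging_score(ratings):
    """Reverse code nine 1-5 ratings and sum them (possible range 9-45)."""
    if len(ratings) != 9:
        raise ValueError("the subscale has exactly nine items")
    if any(r not in (1, 2, 3, 4, 5) for r in ratings):
        raise ValueError("each rating must be an integer from 1 to 5")
    return sum(reverse_code(r) for r in ratings)


# Hypothetical ratings, for illustration only (not study data).
example_ratings = [2, 1, 3, 2, 1, 2, 4, 1, 2]
example_score = nonjudging_score(example_ratings)
print(example_score)
```

After reverse coding, a higher total reflects less self-judgment, so someone who answers "Never or very rarely true" to every item receives the maximum score of 45.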
Expectancy questionnaire.27
This questionnaire assesses expectancy and credibility. This instrument is a 6-item, 9-point Likert scale that assesses participants’ expectancy and credibility toward treatments. Reports demonstrate good internal consistency for both subscales: expectancy, with a Cronbach’s alpha of .82, and credibility, with a Cronbach’s alpha of .75.27
Statistical Analysis
Conventional descriptive statistics
Data processing and analysis were first performed in Stata 14 (StataCorp, College Station, TX, USA). For descriptive purposes, the means of the 15 continuous predictors for the responders and nonresponders at baseline were compared using t-tests, with p values adjusted for false discovery rates.35
All predictors and the outcome to be predicted were checked for normality using Shapiro-Wilk and transformed if not normal. Income assessed using an ordinal scale was compared using the Wilcoxon rank order, and the gender distribution was assessed using chi-square. Regression to the mean may explain why participants with baseline values farther from the mean can improve more than those participants closer to the mean at baseline.36
To demonstrate that the regression to the mean was not the explanation for the current study’s findings, additional analyses were performed. Participants randomly assigned to the waitlisted group had 2 measurements, Visits 1 and 2, prior to the MM intervention. Simple parametric statistics—t-test and analysis of covariance (ANCOVA)—were performed on outcomes from Visits 1, 2, and 3 on data from the waitlisted group.
Decision tree approach
Decision tree analyses were performed using MATLAB (MathWorks, Natick, MA, USA) Statistics and Machine Learning Toolbox. The classification of responders and nonresponders used decision tree analysis for multiple reasons: (1) it works better with small sample sizes; (2) it works well with missing data; and (3) it is more interpretable than other approaches, such as a support vector machine. The MATLAB function fitctree uses a Classification and Regression Tree, greedy decision algorithm based on iterative dichotomization37 as well as other additions, such as how decisions regarding split nodes are made and pruning. The algorithm manages missing predictor values by using all available relevant data to evaluate a specific branch point.
Since only 121 participants were included in this analysis, a decision was made to limit the machine learning analyses to at most 9 predictors to have at least 10 observations per predictor. One set of analyses used 9 predictors chosen based on a priori hypotheses: (1) age, (2) years of education, (3) life experience stressors, (4) nonjudgmental mindfulness, (5) PSQI, (6) NEO PI-R neuroticism, (7) SF-36 Energy/Fatigue, (8) expectancy, and (9) maximum PANAS-negative state from EMA.
Another set of analyses was performed using 9 predictors chosen through principal-component analysis (PCA) of all 17 demographic and self-rated psychological measures, allowing up to 9 linear combinations. The full data set of 17 predictors with only 121 participants might not generate the best classifier because of overfitting and errors in decision tree learning related to multicollinearity. These issues may occur with only 9 predictors even though more than 10 observations per predictor variable were available, an acceptable recommended number. Thus, a similar analysis was performed with just the 4 components with the most variance explained. All participants who completed the MM intervention were included in these analyses.
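The rule of thumb of 10 observations per predictor can be checked with simple arithmetic. The sketch below uses the sample size, fold count, and predictor counts stated in the text. The training-fold size is an approximation that assumes ten folds of nearly equal size, and the function name is hypothetical.

```python
def observations_per_predictor(n_observations, n_predictors):
    """Ratio used for the '10 observations per predictor' rule of thumb."""
    return n_observations / n_predictors


N_COMPLETERS = 121   # participants included in the analysis (from the text)
K_FOLDS = 10         # ten-fold cross-validation (from the text)

# Approximate number of participants in each training split (about 108.9).
train_size = N_COMPLETERS * (K_FOLDS - 1) / K_FOLDS

for n_predictors in (17, 9, 4):
    full = observations_per_predictor(N_COMPLETERS, n_predictors)
    train = observations_per_predictor(train_size, n_predictors)
    print(f"{n_predictors:>2} predictors: {full:.1f} per predictor overall, "
          f"{train:.1f} within a training split")
```

With all 17 predictors the ratio falls below 10, whereas 9 predictors keep it above 10 even within a training split, which matches the reasoning given for the limit.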
Decision-tree hyperparameters were set to the MATLAB default tree’s definition parameters. After the hyperparameters were decided, a model was generated using the full data set (n=121). The reliability of the classifier was evaluated using a ten-fold cross-validation. The mean of the classification error together with the standard error were generated using the MATLAB function cvloss.
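The cross-validation step can be illustrated with a self-contained sketch. It is not the MATLAB workflow used in the study: a one-split tree (decision stump) on a single feature stands in for the fitctree classifier, the data are synthetic, and the standard error is computed across folds, which may differ from how cvloss computes it. All function names are hypothetical.

```python
import math
import random


def k_fold_indices(n, k, seed=0):
    """Shuffle 0..n-1 and deal the indices into k nearly equal folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def fit_stump(x, y):
    """Fit a one-split tree on one numeric feature; return a predict function."""
    best = None
    for t in sorted(set(x)):
        for below, above in ((0, 1), (1, 0)):
            errors = sum((below if xi <= t else above) != yi
                         for xi, yi in zip(x, y))
            if best is None or errors < best[0]:
                best = (errors, t, below, above)
    _, t, below, above = best
    return lambda xi: below if xi <= t else above


def cross_validated_error(x, y, k=10, seed=0):
    """Mean held-out misclassification rate and its standard error over folds."""
    fold_errors = []
    for test_idx in k_fold_indices(len(x), k, seed):
        held_out = set(test_idx)
        train = [i for i in range(len(x)) if i not in held_out]
        predict = fit_stump([x[i] for i in train], [y[i] for i in train])
        wrong = sum(predict(x[i]) != y[i] for i in test_idx)
        fold_errors.append(wrong / len(test_idx))
    mean = sum(fold_errors) / k
    variance = sum((e - mean) ** 2 for e in fold_errors) / (k - 1)
    return mean, math.sqrt(variance / k)


# Synthetic data, for illustration only: 121 cases with one noisy feature.
rng = random.Random(1)
y = [rng.randint(0, 1) for _ in range(121)]
x = [rng.gauss(float(label), 1.0) for label in y]

mean_error, std_error = cross_validated_error(x, y)
print(f"error = {mean_error:.3f} +/- {std_error:.3f}")
print(f"correct classification rate = {1 - mean_error:.3f}")
```

Each participant is held out exactly once, so the reported error reflects predictions on data the classifier did not see during fitting. The correct classification rate is one minus the cross-validated error.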
RESULTS
Of the 134 original participants, 121 completed the MM intervention. See Figure 1 for recruitment, randomization, and drop-out numbers. See Table 1 for demographics and selected baseline measures for all participants who completed their MM training and were included in the analyses. Of the 121 completers, the average age was 60 years, and 79% were women. Of the 121 participants completing the study, 61 were responders and 60 were nonresponders based on the SF-36 MHC criterion previously described. The completers practiced guided meditation for a mean of 27.0 ± 10.7 minutes per day during their intervention period.
Figure 1.
Table 1.
Seventeen Predictors Grouped by Responder Status. Statistics comparing the 2 groups at baseline are given by t test for all except chi-square for gender and Wilcoxon rank order for 7-step income, together with associated p values and p values adjusted for the false discovery rate.
Predictors | SF-36-MHC Responder (n=61) Mean ± SD | SF-36-MHC Nonresponder (n=60) Mean ± SD | Statistic | P Value | FDR-adjusted P Value |
---|---|---|---|---|---|
Female, n | 51 | 45 | 1.37 | 0.24 | 0.47 |
Age, y | 59.0 ± 6.7 | 60.8 ± 7.1 | −1.37 | 0.17 | 0.41 |
Education, y | 16.9 ± 2.9 | 16.6 ± 2.4 | 0.60 | 0.55 | 0.62 |
Income, 7 steps | 4.6 ± 2.1 | 4.3 ± 2.0 | 0.78 | 0.43 | 0.53 |
Life experience stressors | −10.4 ± 12.4 | −6.9 ± 12.3 | −1.58 | 0.12 | 0.34 |
PSS-baseline | 19.0 ± 6.1 | 17.8 ± 5.5 | 1.07 | 0.29 | 0.47 |
PANAS-neg | 23.2 ± 7.9 | 19.8 ± 5.4 | 2.77 | 0.0065* | 0.04* |
PANAS-pos | 32.4 ± 6.9 | 33.3 ± 6.0 | −0.77 | 0.44 | 0.53 |
PANAS-neg-max EMA | 8.6 ± 3.2 | 7.3 ± 2.3 | 2.40 | 0.018* | 0.08 |
NEO PI-R Neuroticism | 25.8 ± 8.2 | 22.8 ± 8.2 | 1.98 | 0.05* | 0.17 |
CES-D | 18.9 ± 10.5 | 17.0 ± 9.4 | 1.16 | 0.25 | 0.47 |
SF-36 MHC | 36.1 ± 9.5 | 42.7 ± 9.8 | −3.77 | 0.0003* | 0.005* |
SF-36 Energy/Fatigue | 38.9 ± 19.1 | 48.9 ± 19.7 | −3.1 | 0.005* | 0.03* |
GPSE | 29.7 ± 4.2 | 30.3 ± 3.5 | −0.97 | 0.33 | 0.47 |
PSQI | 8.3 ± 3.0 | 8.3 ± 3.5 | −0.01 | 0.99 | 0.99 |
Mindfulness nonjudgmental | 29.4 ± 6.4 | 30.5 ± 6.8 | −0.97 | 0.33 | 0.47 |
Expectancy | 28.6 ± 5.8 | 28.3 ± 6.4 | 0.31 | 0.75 | 0.80 |
Significant differences at baseline between the 2 groups
Conventional univariate biostatistical results are shown in Table 1. Several of the predictors were significantly different at baseline between the 2 groups, including the scores on the PANAS-neg, PANAS-neg-max EMA, NEO PI-R Neuroticism, SF-36 MHC, and SF-36 Energy/Fatigue. Three of those predictors remained significant after adjustment using the false discovery rate: SF-36 MHC, SF-36 Energy/Fatigue, and PANAS-neg.
To evaluate whether these differences represented a simple regression to the mean, data from the 55 completers in the waitlisted group were analyzed separately because those participants had 2 assessments prior to the MM intervention. The means and standard deviations (SD) on the SF-36 MHC for Visits 1, 2, and 3 for responders were 34.8 ± 9.4, 37.9 ± 10.3, and 48.2 ± 7.3, respectively, and for nonresponders were 45.9 ± 8.6, 43.8 ± 11.6, and 41.3 ± 12.8, respectively. The improvement in the scores on the SF-36 MHC was greater during the MM intervention than during the waitlisted period, even in the responders, who also happened to have lower SF-36 MHC scores at baseline, Visit 1.
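The comparison of the two periods can be made concrete by differencing the group means reported above. This is only a descriptive sketch of the reported means; it does not reproduce the t-tests or ANCOVA that the authors ran, and the variable names are hypothetical.

```python
# Mean SF-36 MHC scores at Visits 1, 2, and 3 for the waitlisted group,
# copied from the text (standard deviations omitted).
mhc_means = {
    "responders": (34.8, 37.9, 48.2),
    "nonresponders": (45.9, 43.8, 41.3),
}

changes = {}
for group, (visit1, visit2, visit3) in mhc_means.items():
    waitlist_change = visit2 - visit1       # before any MM training
    intervention_change = visit3 - visit2   # across the MM training period
    changes[group] = (waitlist_change, intervention_change)
    print(f"{group}: waitlist {waitlist_change:+.1f}, "
          f"intervention {intervention_change:+.1f}")
```

For responders, the mean rose by about 3 points during the waitlist period and by about 10 points during the intervention period, which is the contrast that the regression-to-the-mean argument relies on.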
Even though adherence was not a baseline measure, average minutes of guided meditation per day was not significantly different between the groups. The responders practiced a mean of 28.1 ± 11.6 minutes per day and the nonresponders 25.9 ± 9.7 minutes per day, with t=1.15, p=0.25. The scheduling ease of the one-on-one training sessions allowed all participants to attend all 6 training sessions.
The decision tree analysis used the 9 measures that were most statistically different in the responder and nonresponder groups: (1) life experiences, (2) age, (3) gender, (4) CES-D, (5) SF-36 MHC, (6) SF-36 Energy/Fatigue, (7) NEO PI-R Neuroticism, (8) PANAS-neg, and (9) the PANAS-neg-max EMA. The decision tree generated from the full data set produced a mean correct classification rate of 0.612 using ten-fold cross-validation, with a standard error of 0.04. By altering input variables, the decision tree was able to achieve up to a 0.653 correct classification rate with cross-validation, with a standard error of 0.04; however, some decision trees still ended up with correct classification rates as low as 0.556. Error rates were not improved using PCA to reduce the number of variables while capturing the most unexplained variance.
DISCUSSION
Analysis of responders and nonresponders to MM training revealed several significant differences in baseline measures. Specifically, those who improved from MM training had worse mental health at baseline than those who did not respond, with the mental-health measures that were different at baseline being the PANAS-neg, SF-36 Energy/Fatigue, and the SF-36 MHC.
Machine learning using decision tree analysis had limited ability to predict whether or not a participant’s mental health would significantly improve following MM training, with mean accuracy rates below 70%, only slightly above the expected random classification rate of 50%. None of the demographic measures were significantly different in the responder and nonresponder groups using conventional biostatistics; all p values were greater than 0.15, and none entered into any of the decision trees. Like the present study, other studies described below have also observed that those with worse mental health at baseline were more likely to demonstrate improvements from a meditation intervention.
One study with depression relapse as a primary outcome observed that MBCT was most effective in those who had experienced 3 or more episodes of depression and in those whose depression was not preceded by negative life events.38 Another study showed that participants with higher neuroticism at baseline experienced greater improvement from a meditation intervention in the domains of mental distress and well-being.39 In the current study, neuroticism had a similar statistically significant effect initially but not after adjustment for multiple comparisons.
One prior study found that those with greater mindfulness had greater declines in perceived stress at one year following an MM intervention.11 The current study built on those results by using a larger sample size, more points of measurement, and a much older population. The current study also built on research that showed that people with lower baseline distress tolerance had greater improvements in PSS from MBSR than those with better distress tolerance.10
In the current study, MM responders had greater fatigue at baseline. The SF-36 Energy/Fatigue measure has previously been shown to relate to adherence in mind-body interventions32 and also to be improved by mind-body interventions, e.g., yoga in a group of healthy older adults.40 At least one other study also found no relationship between demographic factors and improvement in depression following MBSR.41 Some researchers have simply examined changes in non-primary outcome measures, potentially at an earlier point in time than the final outcome measures, to see if those may predict changes in outcome measures,42,43 but that approach is inherently different from trying to determine predictors at baseline prior to the intervention.
Expectancy has an impact on many outcomes, self-rated outcomes in particular.44 With regard to meditation, one study showed that patients with cancer had greater improvement if they were randomly assigned to their preferred program regardless of whether it was Mindfulness-Based Cancer Recovery, Supportive Expressive Therapy, or a stress-management seminar.45 Of note, the mindfulness intervention was the most preferred program, but those randomly assigned to their preferred intervention improved more over time on quality of life regardless of actual intervention type. Women with greater psychological morbidity at baseline showed greater improvement in stress symptoms and quality of life if they received their preferred vs non-preferred program.45
In addition to studies discussing baseline factors that relate to improvements from meditation interventions, previous studies of MM have discussed factors associated with predicting adherence to MM interventions. Predicting adherence to meditation practice46 or to any intervention, including taking placebo pills, may be important since adherence correlates with better outcomes.44 Most studies related to predictors of adherence to meditation intervention have identified factors associated with completing the intervention, e.g., comorbid personality disorders.42 It is known that a number of other factors are associated with adherence to mind-body interventions that can impact outcomes.32 In the current study, no analyses were done to relate future adherence to baseline predictor measures, in part because no differences existed in the minutes per day of meditation practice between the 2 groups.
Some limitations existed for the current study. Having only a waitlisted control group, which was necessitated by a limited budget, implies that some benefits observed in the immediate MM intervention group compared to waitlisted controls may be related to nonspecific or placebo effects.12,44 If that result were the case, 2 types of responses might contribute to beneficial outcomes: (1) placebo responsiveness predicted by variables such as expectancy and (2) responsiveness to the MM intervention predicted by mental-health variables. No relationship existed between expectancy measures and mental-health improvement; thus, the placebo effect was a less likely contributor.
The current study’s number of observations (participants) was small for a machine learning approach, even though the analysis limited the number of predictors to result in greater than 10 observations per variable in the training set. Decision tree analysis might have benefited from methods such as ensemble learning or AdaBoost. Additionally, decision tree analysis may not be the best machine learning approach for the data set.44 Adding costs or penalties of allocation and misallocation to the decision analysis might make the analysis more interpretable, but at the current study’s low level of classification accuracy, it was not likely to be useful.
The lack of variability in the current study’s sample demographics, with participants being mostly highly educated women in a relatively narrow age range—50–85 years old—might have limited both the quantitative analysis and the study’s generalizability. Better recruitment strategies and less strict inclusion criteria for the next study exploring predictor variables would be helpful to obtain broader demographic samples.
A more general issue with predictor analysis in clinical trials is that responder status is subject to a “regression to the mean” effect,36 especially because many of the predictor variables are correlated to the SF-36 MHC score. Thus, those participants with lower scores on the SF-36 MHC are more likely to improve, and many of the self-rated predictors are correlated with the SF-36 MHC, e.g., SF-36 Energy/Fatigue and PANAS-neg.
One prior study suggested that regression to the mean is not the only explanation of beneficial MM effects. In one study, people with panic disorder receiving mindfulness-based cognitive therapy demonstrated greater improvements in their panic-disorder symptoms if they were less depressed at baseline as assessed by the Hamilton Depression Rating Scale,40 i.e., closer to the mean at the study’s start. Additionally, the participants in the waitlisted control arm of the current study had no significant changes in their measures of stress and negative affect during the waitlist period (Visit 1 to Visit 2) but did during the actual MM intervention period (Visit 2 to Visit 3). These observations strongly suggest that regression to the mean was not the main explanation.
Even though simple univariate analyses demonstrated that some baseline predictors in the current study (the PANAS-neg, SF-36 Energy/Fatigue, and SF-36 MHC) were significantly different in the responder and nonresponder groups, the decision tree analyses did not achieve sufficiently high accuracies to be useful for classification. Machine learning is a broad field, and other techniques may be better suited to the current study’s type of data.47
Also, it is likely that these particular predictor variables did not capture all salient properties of a person that predict responsiveness to an MM intervention. Additionally, these predictors and the MM intervention effects may actually change over the course of MM training. Interactions of the baseline variables may need to be better captured in future prediction studies. It may be better to use stress reactivity rather than improvement in mental health to define a responder, because that measure may be a better indicator of the coping that participants learn through the nonjudgmental aspect of MM training. Stress-reactivity characteristics may need to be captured by more prolonged assessments using ecological momentary assessment or by assessing stress reactivity using experimental stressors in a lab setting. Similarly, it is likely that responder status could be defined in other ways over terms longer than 2 months, which would require outcome measures assessed during a longer intervention study.
CONCLUSIONS
Even though simple univariate statistics in the current study demonstrated differences in baseline measures between responders and nonresponders, classification accuracy using decision tree analysis was lower than 70%. Because clinical decision-making would benefit from better prediction of who will improve with MM training, it would be useful for future studies to use additional predictor and outcome variables, such as stress reactivity, that better capture MM-induced improvements, as well as other statistical or machine learning approaches.
Acknowledgments
The research team acknowledges Roger Ellingson and Wyatt Webb for engineering support, Helané Wahbeh for her help with the MM intervention, and Meghan Miller for her help in obtaining data from participants for the analysis. Melanie Mitchell, Rochelle Fu, and Wayne Wakeland critiqued earlier versions of this article as part of their role on Barry Oken’s Systems Science PhD dissertation committee at Portland State University. The study was supported in part by OHSU and by grants from NIH (AT005121 and AT002688).
Abbreviations
- SF-36: Short Form Health-related Quality of Life
- MHC: Mental Health Component of the SF-36
- FDR: false discovery rate
- SD: standard deviation
- PSS: Perceived Stress Scale
- PANAS: Positive and Negative Affect Schedule
- -neg: negative affect
- -pos: positive affect
- NEO PI-R: Neuroticism-Extraversion-Openness Personality Inventory Revised
- CES-D: Center for Epidemiologic Studies Depression Scale
- GPSE: General Perceived Self-Efficacy
- PSQI: Pittsburgh Sleep Quality Index
Footnotes
Authors’ Disclosure Statement: The research team states that it has no conflicts of interest related to the study.
References
- 1. Chiesa A, Serretti A. A systematic review of neurobiological and clinical features of mindfulness meditations. Psychological Medicine. 2010;40:1239–1252. doi: 10.1017/S0033291709991747.
- 2. Kabat-Zinn J, Massion A, Kristeller J, et al. Effectiveness of a meditation-based stress-reduction program in the treatment of anxiety disorders. American Journal of Psychiatry. 1992;149:936–943. doi: 10.1176/ajp.149.7.936.
- 3. Chiesa A, Serretti A. Mindfulness based cognitive therapy for psychiatric disorders: a systematic review and meta-analysis. Psychiatry research. 2011 May 30;187(3):441–453. doi: 10.1016/j.psychres.2010.08.011.
- 4. Grossman P, Niemann L, Schmidt S, Walach H. Mindfulness-based stress reduction and health benefits: A meta-analysis. Journal of psychosomatic research. 2004 Jul;57(1):35–43. doi: 10.1016/S0022-3999(03)00573-7.
- 5. Ospina MB, Bond K, Karkhaneh M, et al. Meditation practices for health: state of the research (AHRQ) Evidence Report/Technology Assessment (Full Report) 2007;(155):1–263. Publication No. 07-E010.
- 6. Goyal M, Singh S, Sibinga EM, et al. Meditation programs for psychological stress and well-being: a systematic review and meta-analysis. JAMA internal medicine. 2014 Mar;174(3):357–368. doi: 10.1001/jamainternmed.2013.13018.
- 7. Abbott RA, Whear R, Rodgers LR, et al. Effectiveness of mindfulness-based stress reduction and mindfulness based cognitive therapy in vascular disease: A systematic review and meta-analysis of randomised controlled trials. Journal of psychosomatic research. 2014 May;76(5):341–351. doi: 10.1016/j.jpsychores.2014.02.012.
- 8. McEwen BS, Bowles NP, Gray JD, et al. Mechanisms of stress in the brain. Nat Neurosci. 2015 Oct;18(10):1353–1363. doi: 10.1038/nn.4086.
- 9. Oken BS, Chamine I, Wakeland W. A systems approach to stress, stressors, and resilience in humans. Behavioural Neuroscience. 2015;282:144–154. doi: 10.1016/j.bbr.2014.12.047.
- 10. Gawrysiak MJ, Leong SH, Grassetti SN, Wai M, Shorey RC, Baime MJ. Dimensions of distress tolerance and the moderating effects on mindfulness-based stress reduction. Anxiety, stress, and coping. 2015 Sep 15:1–9. doi: 10.1080/10615806.2015.1085513.
- 11. Shapiro SL, Brown KW, Thoresen C, Plante TG. The moderation of Mindfulness-based stress reduction effects by trait mindfulness: results from a randomized controlled trial. J Clin Psychol. 2011 Mar;67(3):267–277. doi: 10.1002/jclp.20761.
- 12. Oken BS, Wahbeh H, Goodrich E, et al. Meditation in Stressed Older Adults: Improvements in self-rated mental health not paralleled by improvements in cognitive function or physiological measures. Mindfulness. 2017;8(3):627–638. doi: 10.1007/s12671-016-0640-7.
- 13. Podgorelec V, Kokol P, Stiglic B, Rozman I. Decision trees: an overview and their use in medicine. Journal of medical systems. 2002 Oct;26(5):445–463. doi: 10.1023/a:1016409317640.
- 14. Pocock SJ, Simon R. Sequential treatment assignment with balancing for prognostic factors in the controlled clinical trial. Biometrics. 1975;31:103–115.
- 15. Coteur G, Feagan B, Keininger DL, Kosinski M. Evaluation of the meaningfulness of health-related quality of life improvements as assessed by the SF-36 and the EQ-5D VAS in patients with active Crohn’s disease. Alimentary Pharmacology and Therapeutics. 2009 May 1;29(9):1032–1041. doi: 10.1111/j.1365-2036.2009.03966.x.
- 16. Sarason IG, Johnson JH, Siegal JM. Assessing the impact of life changes: development of the Life Experiences Survey. J Consult Clin Psychol. 1978;46:932–946. doi: 10.1037//0022-006x.46.5.932.
- 17. Cohen S, Kamarck T, Mermelstein R. A Global Measure of Perceived Stress. Journal of health and social behavior. 1983;24:385–396.
- 17a. Lee EH. Review of the psychometric evidence of the perceived stress scale. Asian Nursing Research. Korean Soc Nurs Sci. 2012;6(4):121–127. doi: 10.1016/j.anr.2012.08.004.
- 18. Watson D, Clark LA, Tellegen A. Development and validation of brief measures of positive and negative affect: the PANAS scales. Journal of Personality and Social Psychology. 1988 Jun;54(6):1063–1070. doi: 10.1037//0022-3514.54.6.1063.
- 19. Leue A, Beauducel A. The PANAS structure revisited: on the validity of a bifactor model in community and forensic samples. Psychological Assessment. 2011 Mar;23(1):215–225. doi: 10.1037/a0021400.
- 20. Thompson ER. Development and validation of an internationally reliable short-form of the Positive and Negative Affect Schedule. Journal of Cross-Cultural Psychology. 2007;38:227–241.
- 21. Costa PT, McCrae RR. NEO Inventories: Professional manual. Lutz, FL: Psychological Assessment Resources, Inc; 2010.
- 22. Radloff L. The CES-D scale: a self-report depression scale for research in the general population. Applied Psychological Measurement. 1977;1:385–401.
- 23. Ware JE. SF-36 Health Survey: Manual interpretation Guide. Boston: The Health Institute; 1993.
- 24. Buysse DJ, Reynolds CF, Monk TH, Berman SR, Kupfer DJ. The Pittsburgh Sleep Quality Index: a new instrument for psychiatric practice and research. Psychiatric Research. 1989;28(2):192–213. doi: 10.1016/0165-1781(89)90047-4.
- 25. Schwarzer R, Jerusalem M. Generalized self-efficacy scale. In: Weinman J, Wright S, Johnston M, editors. Measures in health psychology: a user’s portfolio Causal and control beliefs. Windsor, UK: Nfer-Nelson; 1995. pp. 35–37.
- 26. Baer RA, Smith GT, Hopkins J, Krietmeyer J, Toney L. Using self-report assessment methods to explore facets of mindfulness. Assessment. 2006;13:27–45. doi: 10.1177/1073191105283504.
- 27. Devilly GJ, Borkovec TD. Psychometric properties of the credibility/expectancy questionnaire. Journal of Behavioral Therapy and Experimental Psychiatry. 2000;31:73–86. doi: 10.1016/s0005-7916(00)00012-4.
- 28. Wahbeh H, Lane JB, Goodrich E, Miller M, Oken BS. One-on-one Mindfulness Meditation Trainings in a Research Setting. Mindfulness (N Y) 2014 Feb 1;5(1):88–99. doi: 10.1007/s12671-012-0155-9.
- 29. Wahbeh H, Zwickey H, Oken B. One method for objective adherence measurement in mind-body medicine. The Journal of Alternative and Complementary Medicine. 2011;17:1–3. doi: 10.1089/acm.2010.0316.
- 30. Crawford JR, Henry JD. The positive and negative affect schedule (PANAS): construct validity, measurement properties and normative data in a large non-clinical sample. The British Journal of Clinical Psychology. 2004;43:245–265. doi: 10.1348/0144665031752934.
- 31. Mollayeva T, Thurairajah P, Burton K, Mollayeva S, Shapiro CM, Colantonio A. The Pittsburgh sleep quality index as a screening tool for sleep dysfunction in clinical and non-clinical samples: A systematic review and meta-analysis. Sleep Medicine Reviews. 2016;25:15–73. doi: 10.1016/j.smrv.2015.01.009.
- 32. Flegal KE, Kishiyama S, Zajdel D, Haas M, Oken BS. Adherence to yoga and exercise interventions in a 6-month clinical trial. BMC Complementary and Alternative Medicine. 2007;7:37. doi: 10.1186/1472-6882-7-37.
- 33. Wahbeh H, Oken B, Lu M. Differences in veterans with and without posttraumatic stress disorder during relaxing and stressful condition. Annual Meeting American Psychosomatic Society. 2010;A61:2010.
- 34. Oken BS, Fonareva I, Wahbeh H. Stress-related cognitive dysfunction in dementia caregivers. Journal of Geriatric Psychiatry and Neurology. 2011;24:192–199. doi: 10.1177/0891988711422524.
- 35. Benjamini Y, Hochberg Y. Controlling the false discovery rate: A practical and powerful approach to multiple testing. Journal of the Royal Statistical Society. Series B (Methodological) 1995;57:289–300.
- 36. Barnett AG, van der Pols JC, Dobson AJ. Regression to the mean: what it is and how to deal with it. International journal of epidemiology. 2005 Feb;34(1):215–220. doi: 10.1093/ije/dyh299.
- 37. Quinlan JR. Induction of decision trees. Machine Learning. 1986;1:81–106.
- 38. Ma SH, Teasdale JD. Mindfulness-based cognitive therapy for depression: replication and exploration of differential relapse prevention effects. J Consult Clin Psychol. 2004 Feb;72(1):31–40. doi: 10.1037/0022-006X.72.1.31.
- 39. de Vibe M, Solhaug I, Tyssen R, et al. Does Personality Moderate the Effects of Mindfulness Training for Medical and Psychology Students? Mindfulness (N Y) 2015;6(2):281–289. doi: 10.1007/s12671-013-0258-y.
- 40. Oken BS, Zajdel D, Kishiyama S, et al. Randomized controlled 6-month trial of yoga in healthy seniors. Alternative Therapies in Health and Medicine. 2006;12(1):40–47.
- 41. Greeson JM, Smoski MJ, Suarez EC, et al. Decreased symptoms of depression after mindfulness-based stress reduction: potential moderating effects of religiosity, spirituality, trait mindfulness, sex, and age. J Altern Complement Med. 2015 Mar;21(3):166–174. doi: 10.1089/acm.2014.0285.
- 42. Kim B, Cho SJ, Lee KS, et al. Factors associated with treatment outcomes in mindfulness-based cognitive therapy for panic disorder. Yonsei medical journal. 2013 Nov;54(6):1454–1462. doi: 10.3349/ymj.2013.54.6.1454.
- 43. Michalak J, Holz A, Teismann T. Rumination as a predictor of relapse in mindfulness-based cognitive therapy for depression. Psychology and psychotherapy. 2011 Jun;84(2):230–236. doi: 10.1348/147608310X520166.
- 44. Oken BS. Placebo effects: clinical aspects and neurobiology. Brain. 2008;131:2812–2823. doi: 10.1093/brain/awn116.
- 45. Carlson LE, Tamagawa R, Stephen J, et al. Tailoring mind-body therapies to individual needs: patients’ program preference and psychological traits as moderators of the effects of mindfulness-based cancer recovery and supportive-expressive therapy in distressed breast cancer survivors. Journal of the National Cancer Institute. Monographs. 2014 Nov;2014(50):308–314. doi: 10.1093/jncimonographs/lgu034.
- 46. Kabat-Zinn J, Chapman-Waldrop A. Compliance with an outpatient stress reduction program: rates and predictors of program completion. J Behav Med. 1988 Aug;11(4):333–352. doi: 10.1007/BF00844934.
- 47. Khondoker M, Dobson R, Skirrow C, Simmons A, Stahl D. A comparison of machine learning methods for classification using simulation with multiple real data examples from mental health studies. Statistical methods in medical research. 2013;25(4):1804–1823. doi: 10.1177/0962280213502437.