The Importance of Specifying the Time Period in Repeated Measures of Personality Assessments

Matthew W Southward; Madeline L Kushner; Nicole E Stumpp; Sarah E Cecil; Caden J Maynard; Alexandra K Barnhill; Vincent J Buchenberger; Shannon Sauer-Zavala

doi:10.1037/pas0001440

Author manuscript; available in PMC: 2026 Feb 7.

Published in final edited form as: Psychol Assess. 2025 Dec 4;38(4):267–282. doi: 10.1037/pas0001440

PMCID: PMC12880859 NIHMSID: NIHMS2121786 PMID: 41343401

Abstract

Because most validated personality measures were designed to capture relatively general and stable characteristics, they do not specify a particular timeframe for respondents to consider. It is thus unknown how these measures perform when administered repeatedly or how this performance compares to the same measures with instructions and items adapted to the repeated timeframe of interest. We randomly assigned undergraduate participants (N = 257; M_age = 20.4; 79% female; 77% white; 77% heterosexual) to complete measures of personality (NEO-FFI-3, LPFS-BF-2.0, PID-5-BF, FFBI-SF) with validated general instructions and items or measures with instructions and items pertaining to the previous week once per week for six weeks. Compared to measures with general instructions, measures with weekly instructions demonstrated greater within-person internal consistency (weekly ωs: .42-.83; general ωs: .44-.72), lower rank-order stability (weekly average one-week r: .72; general average one-week r: .86), greater variability (ds: .08-.94), lower average mean scores across time (ds: –.96 – .25), and stronger associations with measures of anxiety and depression, well-being, and functioning but similar between-person internal consistencies (weekly ωs: .79-.99; general ωs: .79-.99) and measurement invariance. Researchers assessing personality weekly may thus be able to capture more variability and stronger associations with relevant constructs while still maintaining reliable individual differences and construct validity using personality measures referencing participants’ past week. However, nuances such as lower average scores when referencing the past week should be kept in mind when comparing results between studies using different reference time frames.

Keywords: personality, self-report, repeated measures, reliability, validity

Personality, defined as a person’s characteristic way of thinking, feeling, and behaving, can be assessed in many ways, including self-report, observer ratings, and open-ended responses (Funder, 2010; Robins et al., 2007). Self-report is the most widely used way of measuring personality, as these measures are practical and readily interpreted (Paulhus & Vazire, 2007). However, the validity of self-reported personality measures can vary based on contextual factors (e.g., the time allotted to complete a measure), individual differences (e.g., motivation or self-awareness to complete a measure accurately), and aspects of the measures themselves.

Although some researchers have explored the effects of time frames on self-report broadly, and psychological symptoms specifically, few have examined how instructed time frames impact personality measures. Most well-validated measures of personality ask how people are generally (Goldberg et al., 2006) or do not specify a time frame (Ben-Porath & Tellegen, 2020; Gosling et al., 2003; Krueger et al., 2013; McCrae & Costa, 2010; Morey, 2007; Soto & John, 2017). Assessing personality this way is in line with the common understanding of personality as a fairly stable trait, especially over relatively longer time spans. At the same time, many researchers acknowledge that personality may also be more state-like and malleable in shorter time frames or specific contexts. Beck and Jackson (2020), for instance, described how personality can develop as a dynamic, within-person system in the context of environmental pressures that may more specifically point to how personality influences particular behaviors. To then link these dynamic patterns back to individual differences, Fleeson (2001) showed how aggregating across repeated responses on personality measures adapted to refer to relatively brief time frames can accurately represent individual differences in personality traits.

However, it may still be important to use trait measures when assessing personality over a relatively brief period. Psychometrically, researchers may wish to assess the test-retest reliability of a measure (e.g., Soto & John, 2017) or enhance the precision of their estimates of participants’ trait scores through repeated measures. These issues of measurement, specifically at a weekly level, are important to research on interventions for personality. In their systematic review of personality change during interventions using general, trait-based personality measures, Roberts et al. (2017) noted that personality changes followed an exponential trajectory, with nearly all small-to-medium sized change occurring in 4–8 weeks. Similarly, across studies of individual personality change interventions using general, trait-based measures administered weekly, participants reported significant improvements in neuroticism and extraversion, with more mixed results for conscientiousness, agreeableness, and openness among community participants, participants with substance use disorders, and participants with borderline personality disorder (Hudson & Fraley, 2015; Sauer-Zavala et al., 2023a; Stieger et al., 2021, 2022).

By contrast, when administering measures adapted to refer only to the prior week (or month) once per week (or month) in two- and four-week studies of digital interventions to improve conscientiousness and openness to experience and group-based interventions to improve flirting behaviors, community participants reported significant improvements in each personality dimension (Allemand et al., 2022; Stieger et al., 2020). Personality changes varied widely from small- to large-sized effects depending on how directly each dimension was targeted and how narrow the dimension was, with more specific dimensions showing larger changes.

Because some of these studies used personality measures written to capture general, trait-like experiences and others used measures written to capture more time-limited, state-like experiences, it is difficult to compare their results. For instance, if participants responding to trait-like measures are using peak-end heuristics to inform their reports, they may be less consistent from one timepoint to the next than those using more state-like measures, who may be more likely to rely on episodic memory. There may be further differences in mean scores and variability since trait-like measures encompass a larger set of experiences than state-like measures. Finally, although state-like measures can be aggregated into general trait scores, it is unclear if these measures are capturing the same latent constructs.

Understanding how the framing of measures influences self-reported personality traits over weekly time scales can improve our knowledge of personality assessment reliability and validity, since measures with high reliability ensure that observed changes in individual scores can be attributed to genuine shifts in personality, rather than fluctuating questionnaire responses. If people do respond differently based on the frame of reference given, it suggests that the item phrasing may play a vital role in how they interpret and report their personality characteristics. In essence, gaining insight into how time affects personality self-reporting ultimately allows researchers to refine existing assessment tools to most adequately capture changes in personality.

Current Study

In the current study, we compared the psychometric properties of four commonly used personality measures administered once per week for up to six weeks with instructions that either asked participants to complete the measure with reference to the prior week or did not ask participants to complete the measure with a specific time period in mind. These personality measures were selected to cover normal personality functioning, Criteria A and B of the DSM-5’s Alternative Model for Personality Disorders (American Psychiatric Association [APA], 2022), and borderline personality disorder, given its relevance to both personality and psychopathology more broadly (Ringwald et al., 2023; Southward et al., 2023). We randomly assigned participants to complete personality measures with instructions to consider the prior week or consider their personality in general. We compared these two instruction conditions on the between- and within-person internal consistencies of the items in each personality measure (Aim 1), the multilevel measurement invariance of each personality measure (Aim 2), the rank-order stability of each personality measure (Aim 3), the person-specific standard deviations of each personality measure (Aim 4), and the person-specific means of each personality measure (Aim 5). To better contextualize and benchmark these results, we compared the psychometric properties of these personality measures with a set of commonly used mood, well-being, and functioning measures that all asked participants to consider the prior week. Because these measures were delivered after the personality measures, we were able to explore any downstream impacts of instruction condition on these measures. Finally, we explored whether instruction condition moderated the strength of the within-person associations among personality measures and mood, well-being, and functioning measures (Aim 6). To our knowledge, this is the only study to date to directly compare the psychometric properties of repeatedly administered personality measures with the same content but instructions focusing on different timeframes.

Methods and Materials

Participants

Participants were 289 undergraduate students recruited from the University of Kentucky’s psychology department recruitment pool who participated for class credit. Of these 289 participants, we excluded 31 due to suspected careless responses defined by any combination of 0 variance on the 60-item NEO-Five Factor Inventory-3 (NEO-FFI-3; McCrae & Costa, 2010), completing the 25-item Personality Inventory for DSM-5-Brief Form (PID-5-BF; Krueger et al., 2013) in < 1 minute, completing the 48-item Five Factor Borderline Inventory-Short Form (FFBI-SF; DeShong et al., 2016) in 1 minute or less, or completing two or more weekly surveys out of order and in rapid succession before the end of semester deadline. When participants only completed one survey out of order (k = 9) or less than two days apart (k = 6), we removed those specific surveys from our analyses but retained all others. Finally, one participant was excluded from analyses because of a randomization failure (i.e., receiving established instructions surveys each week despite being randomized at baseline to the weekly instructions condition).
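The three response-level screening rules above can be illustrated with a short sketch. This is not the authors' code: the function name and its inputs (a list of NEO-FFI-3 item responses and two completion times in seconds) are hypothetical, and the fourth rule (surveys completed out of order and in rapid succession) is not modeled because it depends on survey timestamps.

```python
def flag_careless(neo_items, pid5_seconds, ffbi_seconds):
    """Return True if any of the three response-level screening rules is met.

    neo_items:     the 60 NEO-FFI-3 item responses from one survey (hypothetical input)
    pid5_seconds:  PID-5-BF completion time in seconds (hypothetical input)
    ffbi_seconds:  FFBI-SF completion time in seconds (hypothetical input)
    """
    zero_variance = len(set(neo_items)) == 1   # identical answer to every item
    pid5_too_fast = pid5_seconds < 60          # completed in < 1 minute
    ffbi_too_fast = ffbi_seconds <= 60         # completed in 1 minute or less
    return zero_variance or pid5_too_fast or ffbi_too_fast


# Made-up example: varied responses, but the PID-5-BF was finished in 45 seconds.
example_flag = flag_careless([1, 2, 3, 4, 5] * 12, pid5_seconds=45, ffbi_seconds=200)
```

Note the deliberate asymmetry in the two time thresholds (strictly less than one minute versus one minute or less), which follows the wording of the exclusion criteria.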

Of the 257 remaining participants, 227 (88.3%) enrolled in the study by selecting it from a list of study options. To oversample clinical presentations, the remaining 30 (11.7%) participants were invited to participate via email because they reported elevated neuroticism, antagonism (i.e., low agreeableness), or disinhibition (i.e., low conscientiousness) at the pre-screening stage. Elevations were defined as scores >1 SD above the normed means from the NEO-FFI-3 on one or more of the three personality dimensions above (McCrae & Costa, 2010). Of note, elevations in these areas were not required for participation.

Participants were 20.37 years old (SD = 4.98) on average, and a majority identified as women (n = 204; 79.4%), followed by men (n = 47; 18.3%) and non-binary (n = 5; 1.9%). Most participants identified as white (n = 198; 77.0%), followed by African-American (n = 29; 11.3%), Latinx (n = 11; 4.3%), East Asian (n = 9; 3.5%), South Asian (n = 5; 1.9%), Native American (n = 3; 1.2%), Arab/Middle Eastern American (n = 2; 0.8%), and Native Hawaiian/Pacific Islander (n = 1; 0.4%). Most participants identified as heterosexual (n = 199; 77.4%), followed by bisexual (n = 23; 9.0%), asexual (n = 11; 4.3%), homosexual (n = 8; 3.1%), queer (n = 8; 3.1%), and pansexual (n = 8; 3.1%). Smaller proportions identified as questioning (n = 6; 2.3%), demisexual (n = 2; 1.0%), fluid (n = 1; 0.4%) and autosexual (n = 1; 0.4%). A majority (n = 181; 70.7%) of participants reported being single. Participants reported a median family income of $100,000–124,999 and a median education level of some college.

Procedures

All study procedures were approved by the University of Kentucky Institutional Review Board. Participants first provided informed consent, and then completed a baseline survey, which contained a battery of self-report questionnaires, including several personality, mood, and well-being/functioning measures. Participants were randomized (1:1) to either the established instructions condition (n = 132; 51.4%), or the weekly instructions condition (n = 125; 48.6%). Participants in the established instructions condition completed each measure according to its established instructions. Those in the weekly instructions condition received the same measures, although the personality measures included a modified set of instructions and items written to pertain to the past week, detailed below. Each week for six more weeks, all participants received an email with a link to complete the same set of personality, mood, and well-being/functioning measures as they completed at baseline. These measures were delivered in the same order at each occasion. Participants received a maximum of five automated reminder emails spaced eight hours apart if they did not complete the weekly survey within the eight-hour window. Following the sixth week, all participants received a debriefing form describing the study and were compensated with course credit prorated based on the number of weekly surveys completed.

The weekly instructions and items of the personality measures were written to modify each measure as minimally as possible while also making it clear participants should respond considering the past week of their experiences. For instance, the last line of the established instructions for the PID-5-BF reads “We’d like you to take your time and read each statement carefully, selecting the response that best describes you” which we modified to “We’d like you to take your time and read each statement carefully, selecting the response that best describes you IN THE PAST WEEK” in the weekly instructions condition. Because we were concerned that participants would not closely attend to these instructions, we further modified each item to specify a past-week time period. For instance, the first PID-5-BF item is “People would describe me as reckless” which we modified to “In the past week, people would describe me as reckless.” We retained the response options for all measures.

We did not modify the instructions of the mood or well-being measures for any participants, because these measures were already written with instructions asking participants to focus on their experiences over the past week. To align our measure of functioning, the Work and Social Adjustment Scale (WSAS; Mundt et al., 2002), with these three measures, we did modify its instructions to include the phrase “IN THE PAST WEEK” as well. We included these measures after the personality measures for two reasons. First, we wanted to evaluate the degree to which participants responded differently to commonly administered measures of past-week symptoms, well-being, and functioning after completing measures of either general or past-week personality. Second, we wanted to confirm that participants’ responses to the personality measures in the weekly instructions condition were more similar to their responses on these mood, well-being, and functioning scales than participants’ responses in the established instructions condition were.

In the established instructions condition, 54 participants (40.9%) completed measures at all six weeks with an average completion of 4.77 weeks. In the weekly condition, 35 participants (28.0%) completed measures at all six weeks with an average completion of 4.38 weeks. The average number of weekly responses completed did not significantly differ between conditions, t(255) = 1.34, p = .18, 95% CI [–.18, .94].

Measures

Demographics

Participants first reported on their demographic characteristics. These characteristics included age, gender identity, racial/ethnic identity, estimated family income, education, marital status, and sexual identity.

Personality Measures

Five-Factor Borderline Inventory-Short Form.

The Five-Factor Borderline Inventory-Short Form (FFBI-SF; DeShong et al., 2016) is a 48-item self-report questionnaire designed to measure maladaptive personality traits related to borderline personality disorder (BPD) from the perspective of the Five Factor Model. The FFBI-SF consists of twelve 4-item subscales representing facets of neuroticism (i.e., anxious uncertainty, dysregulated anger, despondence, self-disturbance, behavioral dysregulation, affective dysregulation, and fragility), low agreeableness (i.e., manipulativeness, distrustfulness, and oppositionality), low conscientiousness (i.e., rashness), and openness (i.e., dissociative tendencies). Each item is rated on a 5-point Likert-type scale from 1 (disagree strongly) to 5 (agree strongly) and summed to create each subscale score. Means and internal consistencies for each subscale in each randomization condition are presented in the Results.

Level of Personality Functioning Scale-Brief Form-2.0.

The Level of Personality Functioning Scale-Brief Form-2.0 (LPFS-BF-2.0; Weekers et al., 2019) is a 12-item self-report questionnaire composed of two 6-item subscales designed to assess difficulties in self and interpersonal functioning as defined by Criterion A of the AMPD (APA, 2022). Each item is rated on a 4-point Likert-type scale from 1 (completely untrue) to 4 (completely true) and summed to create each subscale score. Means and internal consistencies for each subscale in each randomization condition are presented in the Results.

NEO-Five Factor Inventory-3.

The NEO-Five Factor Inventory-3 (NEO-FFI-3; McCrae & Costa, 2010) is a 60-item self-report measure with five 12-item subscales designed to assess each of the Big Five personality domains: neuroticism, extraversion, conscientiousness, agreeableness, and openness. Each item is rated on a 5-point Likert-type scale from 1 (strongly disagree) to 5 (strongly agree) and summed to create each subscale score. Means and internal consistencies for each subscale in each randomization condition are presented in the Results.

Personality Inventory for DSM-5-Brief Form.

The Personality Inventory for DSM-5-Brief Form (PID-5-BF; Krueger et al., 2013) is a 25-item self-report measure with five 5-item subscales intended to assess each of the maladaptive personality traits included in the Alternative Model of Personality Disorders (AMPD; APA, 2022): negative affectivity, detachment, antagonism, disinhibition, and psychoticism. Items are rated using a 4-point Likert-type scale from 0 (very false or often false) to 3 (very true or often true) and summed to create each subscale score. Means and internal consistencies for each subscale in each randomization condition are presented in the Results.

Mood, Well-Being, and Functioning Measures

Overall Anxiety Severity and Impairment Scale.

The Overall Anxiety Severity and Impairment Scale (OASIS; Norman et al., 2006) is a 5-item self-report questionnaire used to assess anxiety severity over the previous week. Each item is rated on a 5-point Likert-type scale using unique anchors for each item from 0 (none) to 4 (extreme) and summed to create a total score. Means and internal consistencies for the total score in each randomization condition are presented in the Results.

Overall Depression Severity and Impairment Scale.

The Overall Depression Severity and Impairment Scale (ODSIS; Bentley et al., 2014) is a 5-item self-report measure used to assess depression severity over the previous week. Each item is rated on a 5-point Likert-type scale using unique anchors for each item from 0 (none) to 4 (extreme) and summed to create a total score. Means and internal consistencies for the total score in each randomization condition are presented in the Results.

Quality of Life Enjoyment and Satisfaction Questionnaire – Short Form.

The Quality of Life Enjoyment and Satisfaction Questionnaire – Short Form (Q-LES-Q-SF; Endicott, 1993) is a 14-item self-report questionnaire intended to measure satisfaction with a range of physical, emotional, social, and functional life domains. Each item is rated from 1 (very poor) to 5 (very good) and summed to create a total score. Means and internal consistencies for the total score in each randomization condition are presented in the Results.

Work and Social Adjustment Scale.

The Work and Social Adjustment Scale (WSAS; Marks, 1986) is a 5-item self-report measure of impairment in functioning across five life domains due to mental health difficulties over the previous week. Each item is rated on a 9-point Likert-type scale from 0 (not at all) to 8 (very severely) and summed to create a total score. Means and internal consistencies for the total score in each randomization condition are presented in the Results.

Data Analytic Plan

To explore if differences in the timeframe specified in the instructions and items led to differences in between- or within-person internal consistencies in personality measures (Aim 1), we first calculated between- and within-person McDonald’s ωs for all measures using Geldhof et al.’s (2014) recommendations. Specifically, we fit a multilevel confirmatory factor analysis (CFA) for each measure separately in Mplus (Version 8.11; Muthén & Muthén, 2012–2022), applying full information maximum likelihood (FIML) estimation to model data missingness. We applied a correlated factors model for each measure, defining each factor as an established subscale from each measure and allowing each item to load on its established subscale. We retained the same correlated factors structure at the between- and within-person levels for each measure. We then applied standard equations to calculate ωs at the between-person ($b$) and within-person ($w$) levels, respectively: $ω_{b/w} = \frac{(Σ_j λ_{bj/wj})^{2}}{(Σ_j λ_{bj/wj})^{2} + Σ_j θ_{bj/wj}}$, where $λ_{j}$ and $θ_{j}$ are the factor loading and residual variance of item $j$ at the given level. Between-person ωs indicate the degree to which items represent a unitary construct to different people, whereas within-person ωs indicate the degree to which items represent a unitary construct to any given person over time. Finally, we used the cocron package (Version 1.0-1; Diedenhofen, 2016) in R (Version 4.3.1; R Core Team, 2023) to compare ωs at each level between conditions.¹ Because the test statistic to compare ωs is approximately distributed as χ² when the null hypothesis is true, with df = 1 and assuming α = .05, we had 80% power to detect χ² ≥ 3.84 with λ = 7.85 (Faul et al., 2009).
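The ω equation above can be made concrete with a few lines of code. The sketch below assumes the level-specific loadings and residual variances have already been estimated (e.g., taken from multilevel CFA output); the function name and the numeric values are illustrative and are not taken from the study.

```python
def mcdonald_omega(loadings, residual_variances):
    """McDonald's omega for one factor at one level (between or within).

    loadings:           factor loadings of the items on the subscale factor
    residual_variances: residual (unique) variances of the same items
    """
    if len(loadings) != len(residual_variances):
        raise ValueError("one residual variance is needed per loading")
    squared_sum = sum(loadings) ** 2            # (sum of loadings)^2
    return squared_sum / (squared_sum + sum(residual_variances))


# Made-up values for a 4-item subscale (not estimates from the study).
example_omega = mcdonald_omega([0.7, 0.7, 0.7, 0.7], [0.51, 0.51, 0.51, 0.51])
```

Applying the same function once to the between-person estimates and once to the within-person estimates yields the two ωs for a subscale.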

We then conducted preliminary analyses to test for any differences in demographic variables between randomization conditions, zero-order correlations among all measures of interest, and one-week rank-order stability of all measures of interest (Aim 2). We tested for differences in demographics between randomization conditions using independent samples t-tests for continuous variables (i.e., age, national subjective social status, community subjective social status), χ² tests of independence for categorical variables (i.e., gender, racial/ethnic identity, sexual identity, marital status), and Mann-Whitney U tests for ordinal variables (i.e., income, education) in SPSS (Version 29.0.2.0; IBM Corp., 2023). We then calculated Pearson correlations among continuous and dichotomous variables and Spearman’s correlations involving ordinal variables in SPSS. We included a dummy-coded randomization condition variable in these correlations as well as the person-specific personality, mood, and well-being/functioning variables described below. To test the one-week rank-order stability of each measure, we calculated Pearson correlations between all six pairs of successive timepoints for each measure within each randomization condition and calculated an average of those week-to-week stability statistics, applying pairwise deletion in the presence of missing data. We compared the average one-week rank-order stability using Preacher’s (2002) implementation of Cohen and Cohen’s (1983, p. 54) formula 2.8.5 to compare two independent correlations. Conducting two-tailed tests with n₁ = 132, n₂ = 125, and assuming α = .05, we had 80% power to detect between-group differences in continuous variables of d ≥ .35 (Faul et al., 2009). With df = 1 for tests of independence for categorical variables and assuming α = .05, we had 80% power to detect χ² ≥ 3.84 with λ = 7.85 (Faul et al., 2009). Finally, with 257 participants and assuming α = .05 with two-tailed tests, we had 80% power to detect rs and ρs ≥ .17 (Faul et al., 2009).
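A comparison of two independent correlations is commonly carried out with Fisher's r-to-z transformation, and the sketch below assumes that this is the form of the test referenced above. The function name is hypothetical. The example inputs reuse the average one-week stabilities reported in the abstract (.86 and .72) with the full condition sizes (132 and 125); because pairwise deletion was applied, the effective sample sizes in the actual analyses were likely smaller, so the example is illustrative only.

```python
import math


def compare_independent_correlations(r1, n1, r2, n2):
    """Two-tailed z-test for the difference between two independent correlations."""
    z1, z2 = math.atanh(r1), math.atanh(r2)        # Fisher r-to-z transformation
    se = math.sqrt(1 / (n1 - 3) + 1 / (n2 - 3))    # standard error of z1 - z2
    z = (z1 - z2) / se
    p = math.erfc(abs(z) / math.sqrt(2))           # two-tailed p under the standard normal
    return z, p


# Illustrative inputs only: full condition ns are an assumption (see lead-in).
z_stat, p_value = compare_independent_correlations(0.86, 132, 0.72, 125)
```

Working on the z scale matters here because the sampling distribution of r is skewed for correlations this large, whereas the transformed values are approximately normal with a variance that depends only on n.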

To test if the latent constructs producing observed scores on measures with established instructions differed from those producing observed scores on measures with weekly instructions (Aim 3), we tested the measurement invariance of each subscale using two-level multi-group CFAs in Mplus, applying full information maximum likelihood (FIML) estimation to model data missingness. We first evaluated scalar equivalence for each subscale by fixing the factor structure, loadings, and means to be the same between the established instructions group and the weekly instructions group at both between- and within-person levels (restricted model; Figure S1a-c). We then freed the within-person loadings and between-person means to evaluate metric equivalence (semi-restricted model) and, finally, freed both the within- and between-person factor loadings to evaluate configural equivalence (unrestricted model). We evaluated model fit using Hu and Bentler’s (1999) standards (i.e., good fit: RMSEA ≤ .06, CFI/TLI ≥ .95, SRMR ≤ .06; acceptable fit: RMSEA ≤ .08, CFI/TLI ≥ .90, SRMR ≤ .08). To compare each model, we conducted χ² difference tests using the Satorra-Bentler scaled χ² to adjust for the continuous non-normal distributions of these models (Satorra & Bentler, 2010). Given the number of tests conducted, we applied the Benjamini-Hochberg procedure to each set of comparisons separately (i.e., restricted to semi-restricted models; restricted to unrestricted models).
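The fit thresholds listed above can be expressed as a small decision rule. The sketch below makes two assumptions that the text does not state: that all four indices must meet a threshold for a model to earn that label, and that models meeting neither set of thresholds are labeled "poor" (a hypothetical label). The function name is also hypothetical.

```python
def classify_fit(rmsea, cfi, tli, srmr):
    """Label model fit using the thresholds quoted in the text.

    Assumption: every index must satisfy its cutoff for the label to apply.
    """
    if rmsea <= 0.06 and cfi >= 0.95 and tli >= 0.95 and srmr <= 0.06:
        return "good"
    if rmsea <= 0.08 and cfi >= 0.90 and tli >= 0.90 and srmr <= 0.08:
        return "acceptable"
    return "poor"  # hypothetical label for models meeting neither set of cutoffs


# Made-up fit indices for illustration.
example_label = classify_fit(rmsea=0.07, cfi=0.92, tli=0.91, srmr=0.05)
```

In practice, fit indices often disagree with one another, so a strict all-indices rule like this one is only one of several defensible ways to apply the cutoffs.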

To test if differences in the timeframe specified in the instructions and items led to differences in the standard deviations (Aim 4) and means (Aim 5) of personality measures, we conducted a series of multiple regressions. First, we calculated person-specific standard deviations and means for each personality subscale using all available data provided by each participant. To better isolate the impact of the timeframe specified in the instructions, for Aim 4 we regressed these person-specific standard deviations on a dummy-coded randomization condition variable and several covariates, applying listwise deletion in the presence of missing data. We included person-specific means of the respective personality measure as a covariate to account for individual differences in the personality dimension tested. We included the total number of weekly responses completed as a covariate to account for any artificially elevated standard deviations driven by participants who completed fewer but more extreme weekly responses. Finally, we included demographic characteristics (i.e., age, gender, racial/ethnic identity, sexual identity, income, education, national subjective social status, and local subjective social status) as covariates to account for any existing associations with personality dimensions.
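The person-specific summaries described above can be computed from long-format data as sketched below. The record layout (participant ID, subscale score) and the function name are hypothetical, and the use of the sample standard deviation (n − 1 denominator) is an assumption, since the text does not specify the estimator.

```python
from collections import defaultdict
from statistics import mean, stdev


def person_specific_stats(records):
    """Summarize each participant's repeated scores on one subscale.

    records: iterable of (participant_id, score) pairs, one per completed survey.
    Returns {participant_id: {"n": ..., "mean": ..., "sd": ...}}.
    """
    scores_by_person = defaultdict(list)
    for participant_id, score in records:
        scores_by_person[participant_id].append(score)

    stats = {}
    for participant_id, scores in scores_by_person.items():
        # A standard deviation is undefined with a single response.
        sd = stdev(scores) if len(scores) > 1 else None
        stats[participant_id] = {"n": len(scores), "mean": mean(scores), "sd": sd}
    return stats


# Made-up scores: participant "p1" completed three surveys, "p2" only one.
example_stats = person_specific_stats([("p1", 10), ("p1", 12), ("p1", 14), ("p2", 20)])
```

The `n` field corresponds to the number of weekly responses completed, which the regressions include as a covariate alongside the person-specific mean.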

We repeated these analyses to test Aim 5 but replaced person-specific standard deviations with person-specific means of each scale in each model. To compare if any between-condition differences in personality measures influenced subsequent mood, well-being, or functioning measures, we repeated the above multiple regression analyses with each of our mood, well-being, and functioning measures. Assuming α = .05, with 257 participants and 13 predictors in each multiple regression model, each evaluated as a two-tailed test, we had 80% power to detect individual predictors explaining R² ≥ .029 (Faul et al., 2009). Given the number of tests conducted in Aims 2 and 3, we applied the Benjamini-Hochberg procedure (Benjamini & Hochberg, 1995) to better control our Type 1 error rate using α = .05 for each aim.
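For readers unfamiliar with the Benjamini-Hochberg procedure, a minimal sketch follows. It implements the standard step-up rule (reject every hypothesis ranked at or below the largest rank k whose p-value satisfies p ≤ (k/m)α); the function name and the example p-values are illustrative and are not results from the study.

```python
def benjamini_hochberg(p_values, alpha=0.05):
    """Return a list of booleans marking which hypotheses are rejected.

    Controls the false discovery rate at `alpha` using the step-up rule.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])   # indices by ascending p

    # Find the largest rank whose p-value falls under its threshold.
    max_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * alpha:
            max_rank = rank

    # Reject every hypothesis ranked at or below that rank.
    rejected = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= max_rank:
            rejected[idx] = True
    return rejected


# Made-up p-values for a family of eight tests.
example_rejections = benjamini_hochberg([0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205])
```

Because the rule is step-up, a p-value that misses its own threshold can still be rejected when a larger p-value later in the ranking meets its threshold.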

Finally, to explore between-condition differences in within-person effects (Aim 6), we first disaggregated within- from between-person variability for all personality measures following Curran and Bauer’s (2011) recommendations. We calculated the person-mean for all personality measures and subtracted this person-mean from the raw score at each timepoint. We then ran a series of hierarchical linear models (HLMs) using PROC MIXED in SAS (Version 9.4) to regress each mood, well-being, and functioning measure, in separate models, on within-person personality measure scores, a dummy-coded instructions condition variable (0 = established instructions, 1 = weekly instructions), and the product of the within-person personality measure scores and the dummy-coded instructions condition variable. We applied restricted maximum likelihood estimation, modeled random intercepts and an autoregressive lag-1 residual covariance structure, and used the Kenward-Roger method to calculate degrees of freedom. With N = 257, we had 80% power to detect effects with ts ≥ 2.82 (Murayama et al., 2022). Given the number of tests, we again applied the Benjamini-Hochberg procedure to the family of models for each mood, well-being, and functioning measure, using α = .05 for each measure.
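The disaggregation step and the construction of the moderation term can be sketched as follows. This illustrates only the data preparation, not the mixed models themselves; the dictionary keys (`id`, `score`, `condition`) and the function name are hypothetical.

```python
from collections import defaultdict
from statistics import mean


def person_mean_center(rows):
    """Add person-mean, within-person deviation, and condition product to each row.

    rows: list of dicts with keys "id", "score" (personality subscale score at
          one timepoint), and "condition" (0 = established, 1 = weekly).
    """
    scores_by_person = defaultdict(list)
    for row in rows:
        scores_by_person[row["id"]].append(row["score"])
    person_means = {pid: mean(vals) for pid, vals in scores_by_person.items()}

    centered = []
    for row in rows:
        within = row["score"] - person_means[row["id"]]   # deviation from own mean
        centered.append({
            **row,
            "person_mean": person_means[row["id"]],
            "within": within,
            "within_x_condition": within * row["condition"],  # moderation term
        })
    return centered


# Made-up data: two participants, two timepoints each.
example_rows = person_mean_center([
    {"id": "a", "score": 10, "condition": 1},
    {"id": "a", "score": 14, "condition": 1},
    {"id": "b", "score": 5, "condition": 0},
    {"id": "b", "score": 7, "condition": 0},
])
```

By construction, the within-person deviations sum to zero for each participant, so they carry no between-person information and the product term isolates how the within-person slope differs by condition.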

Transparency and Openness

This study was not preregistered. However, we report how we determined our sample size, all data exclusions, all manipulations, and all measures in the study. Code for all analyses is available at https://doi.org/10.17605/osf.io/gznav. Data are available upon reasonable request from the first or last author.

Results

Aim 1 – Differences in Between- and Within-Person Internal Consistencies of Personality Measures by Instruction Condition

Between-persons, items from all measures in the established instructions condition but one (FFBI-SF-A4-Oppositional; ω = .79) exhibited good-to-excellent internal consistencies, ωs: .80-.99 (Table 1). Similarly, items from all measures in the weekly instructions condition but one (NEO-FFI-3-Openness; ω = .79) exhibited good-to-excellent internal consistencies, ωs: .80-.99. The reliabilities of personality measures were generally similar between conditions, with those in the weekly instructions condition exhibiting slightly greater ωs on average, $\bar{ω}$ = .93, than those in the established instructions condition, $\bar{ω}$ = .90. Only four scales exhibited significantly different ωs between conditions: NEO-FFI-3-Agreeableness, FFBI-SF-A2-Manipulativeness, FFBI-SF-A4-Oppositional, and FFBI-SF-N6a-Affective Dysregulation, ps < .01. In each case, those in the weekly instructions condition demonstrated significantly greater ωs than those in the established instructions condition. Of note, there were no significant between-condition differences or consistent between-condition patterns distinguishing the internal consistencies of items from the mood, well-being, and functioning measures, ps > .05.

Table 1.

Comparing Between- and Within-Person Internal Consistencies Between Established and Weekly Instructions Conditions

Between-Person Internal Consistencies

	Established Instructions	Weekly Instructions
Measure	ω	ω	χ²	df	p

NEO-FFI
Neuroticism	.94	.94	.05	1	.82
Extraversion	.89	.92	2.51	1	.11
Conscientiousness	.94	.94	.04	1	.83
Agreeableness	.82	.93	21.32	1	< .01
Openness	.86	.79	3.58	1	.06
PID-5-BF
Negative Affectivity	.87	.90	.96	1	.33
Detachment	.80	.89	6.13	1	.01
Disinhibition	.92	.95	3.53	1	.06
Antagonism	.81	.87	3.09	1	.08
Psychoticism	.88	.89	.22	1	.64
LPFS-BF-2.0
Self-Dysfunction	.92	.94	2.51	1	.11
Other-Dysfunction	.86	.92	6.26	1	.01
FFBI-SF
N1. Anxious Uncertainty	.93	.94	.50	1	.48
N2. Dysregulated Anger	.98	.97	1.07	1	.30
N3. Despondence	–	.93	–	–	–
N4. Self-Disturbance	.90	.91	.54	1	.46
N5. Behavioral Dysregulation	.95	.97	3.81	1	.05
N6a. Affective Dysregulation	.98	.99	7.35	1	< .01
N6b. Fragility	.84	.89	3.39	1	.07
C6. Rashness	.96	.96	.49	1	.48
A1. Distrustfulness	.90	.93	1.29	1	.26
A2. Manipulativeness	.95	.98	20.50	1	< .01
A4. Oppositional	.79	.91	12.40	1	< .01
O1. Dissociative Tendencies	.99	.99	1.81	1	.18
OASIS	.98	.97	3.37	1	.07
ODSIS	.98	.99	2.35	1	.13
Q-LES-Q	.96	.96	< .01	1	.98
WSAS	.98	.97	1.34	1	.25

Within-Person Internal Consistencies

	Established Instructions	Weekly Instructions
Measure	ω	ω	χ²	df	p

NEO-FFI
Neuroticism	.51	.76	13.24	1	< .01
Extraversion	.48	.76	16.29	1	< .01
Conscientiousness	.43	.80	27.46	1	< .01
Agreeableness	.48	.57	1.06	1	.30
Openness	.44	.42	.04	1	.85
PID-5-BF
Negative Affectivity	.53	.69	3.80	1	.05
Detachment	.35	.69	11.07	1	< .01
Disinhibition	.50	.69	4.87	1	.03
Antagonism	.47	.67	4.38	1	.04
Psychoticism	.44	.55	.97	1	.32
LPFS-BF-2.0
Self-Dysfunction	.56	.79	12.60	1	< .01
Other-Dysfunction	.46	.68	6.14	1	.01
FFBI-SF
N1. Anxious Uncertainty	.47	.73	8.37	1	< .01
N2. Dysregulated Anger	.60	.79	8.43	1	< .01
N3. Despondence	–	.65	–	–	–
N4. Self-Disturbance	.45	.62	2.37	1	.12
N5. Behavioral Dysregulation	.50	.75	8.50	1	< .01
N6a. Affective Dysregulation	.56	.83	17.61	1	< .01
N6b. Fragility	–	.55	–	–	–
C6. Rashness	.46	.73	8.61	1	< .01
A1. Distrustfulness	.55	.75	6.70	1	.01
A2. Manipulativeness	.55	.77	7.96	1	< .01
A4. Oppositional	.39	.59	3.14	1	.08
O1. Dissociative Tendencies	.72	.83	4.99	1	.03
OASIS	.82	.83	.08	1	.78
ODSIS	.89	.87	.53	1	.47
Q-LES-Q	.77	.81	1.10	1	.30
WSAS	.76	.76	< .01	1	.94


Note. Bold values significant after applying the Benjamini-Hochberg procedure. NEO-FFI-3 = NEO Five Factor Inventory-3. PID-5-BF = Personality Inventory for DSM-5-Brief Form. LPFS-BF-2.0 = Level of Personality Functioning-Brief Form-2.0. FFBI-SF = Five Factor Borderline Inventory-Short Form. N = Neuroticism. C = Conscientiousness. A = Agreeableness. O = Openness. OASIS = Overall Anxiety Severity & Impairment Scale. ODSIS = Overall Depression Severity & Impairment Scale. Q-LES-Q = Quality of Life Enjoyment and Satisfaction Questionnaire-Short Form. WSAS = Work and Social Adjustment Scale. Multilevel confirmatory factor analyses involving N3-Despondence (between- and within-person levels) and N6b-Fragility (within-person level) in the established instructions condition did not converge.

Within-persons, only items from one personality scale in the established instructions condition, FFBI-SF-O1-Dissociative Tendencies, exhibited acceptable internal consistency, ω = .72; items from all other scales were in the unacceptable-to-questionable range, ωs: .35-.60. By contrast, items from 12 personality scales in the weekly instructions condition exhibited acceptable-to-good internal consistency, ωs: .73-.83, with the remainder in the unacceptable-to-questionable range, ωs: .42-.69. Over half of the personality scales in the weekly instructions condition demonstrated significantly greater ωs than scales in the established instructions condition, ps < .03, reflected in the greater average ω in the weekly instructions condition, $\bar{ω}$ = .69, compared to the established instructions condition, $\bar{ω}$ = .50. As in the between-persons results, there were no significant differences or consistent patterns distinguishing the randomization conditions on any mood, well-being, or functioning measure, ps > .25.
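For readers less familiar with ω, the following Python sketch shows the standard composite reliability formula for a one-factor model, applied separately to between- and within-person parameters. The loadings and residual variances are hypothetical illustration values, not estimates from this study, and the sketch assumes the factor variance is fixed at 1; the multilevel confirmatory factor analyses that produced Table 1 are not reproduced here.

```python
def omega(loadings, residual_variances):
    """Composite reliability (omega) for a one-factor model whose factor
    variance is fixed at 1: common variance / (common + unique variance)."""
    common = sum(loadings) ** 2
    return common / (common + sum(residual_variances))


# Hypothetical level-specific parameters for a four-item scale.
between = omega([0.9, 0.8, 0.85, 0.9], [0.2, 0.3, 0.25, 0.2])
within = omega([0.5, 0.4, 0.45, 0.5], [0.6, 0.7, 0.65, 0.6])
```

With these made-up values the between-person ω is higher than the within-person ω, which mirrors the general pattern in Table 1 without reproducing any specific estimate.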

Aim 2 – Differences in Rank-Order Stability Between Instruction Conditions

All personality measures demonstrated relatively high average one-week rank-order stability, $\bar{r}$s: .59-.93 (Table 2). When comparing instruction conditions, 18 of the 24 personality measures (75%) demonstrated greater one-week rank-order stability when using established instructions compared to weekly instructions, zs ≥ 2.30, ps < .03 (Tables 2, S3). By contrast, of the mood, quality of life, and functioning measures, only the WSAS demonstrated a significant between-condition difference, z = 2.31, p = .02.

Table 2.

Test-Retest Reliability/Rank-Order Stability (Pearson’s r) of All Measures By Condition

Measure	Condition	r	z	p
NEO-FFI-3
Neuroticism	Established	.93	5.12	< .01
	Weekly	.71
Extraversion	Established	.91	4.61	< .01
	Weekly	.69
Conscientiousness	Established	.92	4.17	< .01
	Weekly	.75
Agreeableness	Established	.86	1.55	.12
	Weekly	.79
Openness	Established	.88	2.85	< .01
	Weekly	.75
PID-5-BF
Negative Affectivity	Established	.83	2.43	.01
	Weekly	.69
Detachment	Established	.81	1.61	.11
	Weekly	.71
Disinhibition	Established	.82	1.39	.16
	Weekly	.74
Antagonism	Established	.75	.21	.84
	Weekly	.74
Psychoticism	Established	.85	1.13	.26
	Weekly	.79
LPFS-BF-2.0
Self-Dysfunction	Established	.88	2.71	.01
	Weekly	.76
Other-Dysfunction	Established	.83	2.40	.02
	Weekly	.68
FFBI-SF
N1-Anxious Uncertainty	Established	.89	3.62	< .01
	Weekly	.72
N2-Dysregulated Anger	Established	.89	3.77	< .01
	Weekly	.70
N3-Despondence	Established	.93	5.09	< .01
	Weekly	.74
N4-Self-Disturbance	Established	.89	3.38	< .01
	Weekly	.74
N5-Behavioral Dysregulation	Established	.84	2.35	.02
	Weekly	.71
N6a-Affective Dysregulation	Established	.91	4.26	< .01
	Weekly	.73
N6b-Fragility	Established	.88	2.97	< .01
	Weekly	.74
C6-Rashness	Established	.88	4.21	< .01
	Weekly	.64
A1-Distrustfulness	Established	.84	1.96	.05
	Weekly	.74
A2-Manipulativeness	Established	.83	2.30	.02
	Weekly	.70
A4-Oppositional	Established	.79	2.69	.01
	Weekly	.59
O1-Dissociative Tendencies	Established	.89	2.70	.01
	Weekly	.78
OASIS	Established	.85	.53	.59
	Weekly	.82
ODSIS	Established	.84	.48	.63
	Weekly	.82
Q-LES-Q	Established	.86	.75	.46
	Weekly	.83
WSAS	Established	.86	2.31	.02
	Weekly	.74


Note. NEO-FFI-3 = NEO Five Factor Inventory-3. PID-5-BF = Personality Inventory for DSM-5 Brief Form. LPFS-BF-2.0 = Level of Personality Functioning - Brief Form 2.0. FFBI-SF = Five Factor Borderline Inventory-Short Form. OASIS = Overall Anxiety Severity & Impairment Scale. ODSIS = Overall Depression Severity & Impairment Scale. Q-LES-Q-SF = Quality of Life Enjoyment and Satisfaction Questionnaire-Short Form. WSAS = Work and Social Adjustment Scale. z-statistic represents the difference between conditions in average test-retest reliability/rank-order stability with N_Established = 100 and N_Weekly = 95. Bold values indicate statistically significant differences after applying the Benjamini-Hochberg procedure. See Table S3 for week-by-week estimates.
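The between-condition z statistics in Table 2 can be approximated with a Fisher r-to-z test for two independent correlations. The sketch below assumes that test; this section does not state the exact procedure, and because Table 2 reports rounded correlations averaged across weeks, the output will be close to, but not identical to, the tabled values. The function name `fisher_z_diff` is hypothetical.

```python
import math


def fisher_z_diff(r1, n1, r2, n2):
    """Compare two independent Pearson correlations via Fisher's r-to-z
    transformation; returns the z statistic and a two-tailed p value."""
    z1, z2 = math.atanh(r1), math.atanh(r2)
    se = math.sqrt(1.0 / (n1 - 3) + 1.0 / (n2 - 3))
    z = (z1 - z2) / se
    p = math.erfc(abs(z) / math.sqrt(2.0))  # two-tailed normal probability
    return z, p


# NEO-FFI-3 Agreeableness from Table 2: r = .86 (Established, n = 100)
# versus r = .79 (Weekly, n = 95).
z, p = fisher_z_diff(0.86, 100, 0.79, 95)
```

For this scale the sketch gives a z of roughly 1.5 with p above .05, in line with the non-significant difference reported in Table 2.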

Aim 3 – Measurement Invariance in Latent Constructs Between Instruction Conditions

Across indices, all PID-5-BF subscales except Psychoticism, both LPFS-BF-2.0 subscales, and the FFBI-SF subscales of N5-Behavioral Dysregulation, N6a-Affective Dysregulation, A4-Oppositional, and O1-Dissociative Tendencies demonstrated acceptable-to-good fit when all parameters were set equal across groups (Table 3). All measures except PID-5-BF-Psychoticism, FFBI-SF-N3-Despondence, and FFBI-SF-N4-Self-Disturbance demonstrated at least acceptable fit by RMSEA, although fewer measures fit acceptably using CFI, TLI, and SRMR indices.² However, the vast majority of these subscales (n = 15; 78.9%) demonstrated scalar equivalence compared to the semi-restricted models and a similar percentage (n = 15; 71.4%) demonstrated scalar equivalence compared to the unrestricted models.

Table 3.

Measurement Invariance Between Conditions

	Restricted Model							SRMR		Semi-Restricted Model							SRMR		Comparison to Restricted Model			Unrestricted Model							SRMR		Comparison to Restricted Model
Measure	χ²	df	p	SCF	RMSEA	CFI	TLI	Within	Between	χ²	df	p	SCF	RMSEA	CFI	TLI	Within	Between	χ²	df	p	χ²	df	p	SCF	RMSEA	CFI	TLI	Within	Between	χ²	df	p

NEO-FFI-3
Neuroticism	707.03	249	< .01	1.16	.056	.788	.775	.039	.076	683.81	237	< .01	1.15	.057	.793	.769	.030	.078	25.25	12	.01	637.82	225	< .01	1.15	.056	.809	.775	.030	.092	69.14	24	< .01
Extraversion	637.72	249	< .01	1.11	.052	.813	.802	.040	.083	626.27	237	< .01	1.07	.053	.813	.792	.033	.084	19.57	12	.08	612.86	225	< .01	1.06	.054	.814	.782	.033	.073	36.80	24	.046
Conscientiousness	719.05	249	< .01	1.20	.057	.798	.785	.043	.103	–	–	–	–	–	–	–	–	–	–	–	–	643.62	225	< .01	1.17	.056	.820	.788	.034	.082	73.96	24	< .01
Agreeableness	793.25	249	< .01	1.23	.061	.607	.583	.043	.133	796.33	237	< .01	1.20	.063	.596	.550	.041	.132	12.91	12	.38	780.17	225	< .01	1.17	.065	.599	.530	.040	.122	36.14	24	.05
Openness	593.85	249	< .01	1.17	.049	.686	.667	.042	.098	592.17	237	< .01	1.15	.050	.676	.640	.041	.097	8.17	12	.77	550.95	225	< .01	1.15	.050	.703	.652	.041	.060	45.19	24	.01
PID-5-BF
Negative Affectivity	53.67	32	.01	1.19	.034	.971	.963	.030	.059	52.80	27	< .01	1.16	.041	.965	.948	.027	.060	1.73	5	.89	43.45	22	< .01	1.24	.041	.971	.947	.027	.031	9.05	10	.53
Detachment	90.54	32	< .01	1.22	.056	.893	.867	.036	.058	75.57	27	< .01	1.27	.056	.912	.869	.020	.058	15.28	5	.01	71.20	22	< .01	1.27	.062	.910	.837	.020	.050	18.00	10	.05
Disinhibition	87.05	32	< .01	1.42	.055	.928	.911	.039	.042	–	–	–	–	–	–	–	–	–	–	–	–	–	–	–	–	–	–	–	–	–	–	–	–
Antagonism	73.64	32	< .01	1.52	.047	.914	.892	.030	.088	–	–	–	–	–	–	–	–	–	–	–	–	65.96	22	< .01	1.26	.059	.909	.835	.027	.066	13.74	10	.18
Psychoticism	167.75	32	< .01	1.14	.086	.825	.781	.031	.064	172.41	27	< .01	1.08	.097	.812	.722	.026	.064	3.27	5	.66	166.36	22	< .01	1.11	.107	.814	.662	.026	.061	5.01	10	.89
LPFS-BF-2.0
Self-Dysfunction	171.31	51	< .01	1.15	.064	.912	.896	.037	.060	160.07	45	< .01	1.56	.067	.916	.888	.028	.060	27.38	6	< .01	151.25	39	< .01	1.18	.071	.918	.874	.028	.049	17.36	12	.14
Other-Dysfunction	121.40	51	< .01	1.18	.049	.910	.894	.031	.084	121.22	45	< .01	1.14	.054	.902	.870	.026	.084	3.54	6	.74	111.79	39	< .01	1.16	.057	.907	.856	.025	.071	11.11	12	.52
FFBI-SF
N1. Anxious Uncertainty	57.93	17	< .01	1.16	.065	.945	.922	.014	.098	60.06	13	< .01	1.10	.079	.936	.883	.012	.097	.86	4	.93	46.12	9	< .01	1.16	.085	.950	.866	.012	.049	11.85	8	.16
N2. Dysregulated Anger	71.54	17	< .01	1.37	.075	.942	.918	.034	.033	64.45	13	< .01	1.43	.083	.945	.899	.023	.035	5.30	4	.26	51.78	9	< .01	1.42	.091	.955	.879	.023	.115	18.73	8	.02
N3. Despondence	283.75	17	< .01	.99	.165	.633	.481	.058	.064	–	–	–	–	–	–	–	–	–	–	–	–	–	–	–	–	–	–	–	–	–	–	–	–
N4. Self-Disturbance	257.74	17	< .01	1.07	.157	.718	.602	.048	.089	288.15	13	< .01	.89	.192	.678	.405	.042	.086	11.21	4	.02	214.02	9	< .01	1.17	.199	.760	.359	.042	.091	27.36	8	< .01
N5. Behavioral Dysregulation	32.56	17	.01	1.39	.040	.978	.969	.028	.040	26.65	13	.01	1.43	.043	.981	.965	.004	.041	5.69	4	.22	14.19	9	.12	1.62	.032	.993	.981	.004	.051	19.67	8	.01
N6a. Affective Dysregulation	29.90	17	.03	1.40	.036	.988	.983	.015	.027	28.68	13	< .01	1.40	.046	.986	.974	.008	.028	1.30	4	.86	21.53	9	.01	1.70	.049	.989	.970	.008	.034	4.96	8	.76
N6b. Fragility	77.89	17	< .01	1.45	.079	.807	.728	.039	.063	–	–	–	–	–	–	–	–	–	–	–	–	–	–	–	–	–	–	–	–	–	–	–	–
C6. Rashness	72.90	17	< .01	1.37	.076	.904	.865	.050	.025	51.37	13	< .01	1.39	.072	.934	.879	.013	.025	21.92	4	< .01	42.59	9	< .01	1.55	.081	.942	.847	.013	.049	28.98	8	< .01
A1. Distrustfulness	62.17	17	< .01	1.24	.068	.939	.913	.014	.078	59.47	13	< .01	1.26	.079	.937	.883	.004	.076	1.90	4	.75	48.42	9	< .01	1.42	.087	.946	.857	.004	.051	8.13	8	.42
A2. Manipulativeness	52.02	17	< .01	1.56	.060	.946	.924	.030	.050	52.04	13	< .01	1.52	.072	.940	.889	.027	.050	1.33	4	.86	41.39	9	< .01	1.74	.079	.950	.867	.027	.035	6.78	8	.56
A4. Oppositional	45.61	17	< .01	1.10	.054	.950	.930	.017	.049	41.49	13	< .01	1.12	.062	.950	.908	.009	.049	3.49	4	.48	70.15	9	< .01	.60	.109	.893	.716	.008	.037	4.86	8	.77
O1. Dissociative Tendencies	18.32	17	.37	1.79	.012	.999	.998	.030	.008	18.25	13	.15	1.73	.027	.995	.992	.028	.008	.63	4	.96	14.55	9	.10	2.15	.033	.995	.987	.028	.007	1.11	8	.99


Note. SCF = Scaling Correction Factor. RMSEA = Root Mean-Squared Error of Approximation. CFI = Comparative Fit Index. TLI = Tucker-Lewis Index. SRMR = Standardized Root Mean-Squared Residual. Comparison χ² = Satorra-Bentler scaled chi-square difference test statistic. Bold rows indicate significant model differences after applying the Benjamini-Hochberg procedure.
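The comparison χ² values in Table 3 can be approximated from the reported χ², df, and SCF columns using the Satorra-Bentler scaled difference formula. The sketch below assumes that the tabled χ² values are the scaled (robust) statistics and that the SCF is the scaling correction factor of each model; the function name is hypothetical, and small discrepancies from the table are expected because the inputs are rounded.

```python
def scaled_chi2_difference(t_nested, df_nested, c_nested, t_comp, df_comp, c_comp):
    """Satorra-Bentler scaled chi-square difference test statistic.

    t_*  : scaled chi-square of each model
    df_* : degrees of freedom of each model
    c_*  : scaling correction factor of each model
    The nested model is the more restricted one (larger df).
    """
    df_diff = df_nested - df_comp
    # Scaling correction for the difference test.
    cd = (df_nested * c_nested - df_comp * c_comp) / df_diff
    # Rescale each statistic to its unscaled value, difference, then correct.
    trd = (t_nested * c_nested - t_comp * c_comp) / cd
    return trd, df_diff


# NEO-FFI-3 Neuroticism: restricted versus semi-restricted model (Table 3).
trd, df_diff = scaled_chi2_difference(707.03, 249, 1.16, 683.81, 237, 1.15)
```

For this scale the sketch returns a statistic near 25 on 12 degrees of freedom, close to the tabled comparison value of 25.25.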

Preliminary Analyses for Aims 4–6

There were no significant differences in demographic characteristics between randomization conditions, ps > .15, except community subjective social status. At baseline, participants in the established instructions condition reported higher community subjective social status (M = 5.44, SD = 1.82) than those in the weekly instructions condition (M = 4.94, SD = 1.77), t(254) = 2.19, p = .03, 95% CI [.05, .93], d = .27. Although primarily small-to-medium sized, certain demographic characteristics (i.e., sexual minority identity, lower national and community subjective social status, and psychotropic medication use) were significantly associated with greater mean scores on neuroticism-related measures and lower mean scores on extraversion-, conscientiousness-, and agreeableness-related measures (Table S2). By contrast, there were no consistent associations between demographic characteristics and standard deviations of our measures of interest (Table S1). A similar proportion of participants reported clinically elevated past-week depression at baseline in the established instructions condition (n = 31; 23.8%) as in the weekly instructions condition (n = 36; 29.0%; χ²(1) = .88, p = .35). A similar proportion of participants also reported clinically elevated past-week anxiety in the established instructions condition (n = 60; 45.8%) as in the weekly instructions condition (n = 62; 49.6%; χ²(1) = .37, p = .54).

Aim 4 – Differences in Standard Deviations of Personality Measures by Instruction Condition

After applying the Benjamini-Hochberg procedure, we found significantly greater person-specific standard deviations in the weekly instructions condition than the established instructions condition on all personality scales except PID-5-BF-Antagonism, ds: .08-.94, ps < .04 (Tables 4, S4-S31).³ These differences tended to be most pronounced among neuroticism-focused scales (i.e., NEO-FFI-Neuroticism, PID-5-BF-Negative Affectivity, FFBI-SF-N1-Anxious Uncertainty, FFBI-SF-N3-Despondence, FFBI-SF-N5-Behavioral Dysregulation, FFBI-SF-N6a-Affective Dysregulation; ds: .43-.84), extraversion-focused scales (i.e., NEO-FFI-Extraversion, PID-5-BF-Detachment; ds: .83-.93), and conscientiousness-focused scales (i.e., NEO-FFI-Conscientiousness, PID-5-BF-Disinhibition, FFBI-SF-C6-Rashness; ds: .38-.94), as well as general personality functioning, ds: .53-.69. Among the mood, well-being, and functioning measures, differences were generally in the same direction, but only Q-LES-Q-SF scores exhibited significantly greater standard deviations in the weekly instructions condition than in the established instructions condition, d = .29, p = .02.

Table 4.

Between-Condition Differences in the Standard Deviations of Each Measure

	Established Instructions	Weekly Instructions
Measure	M (SD)	M (SD)	d	B	SE	p	95% CI
NEO-FFI-3
Neuroticism	2.68 (1.49)	4.39 (2.47)	.84	1.79	.28	< .01	[1.23, 2.34]
Extraversion	2.12 (1.18)	3.87 (2.41)	.93	1.93	.27	< .01	[1.40, 2.46]
Conscientiousness	2.05 (1.25)	4.03 (2.73)	.94	1.83	.29	< .01	[1.25, 2.40]
Agreeableness	2.06 (1.15)	2.65 (1.50)	.44	.71	.17	< .01	[.37, 1.06]
Openness	2.37 (1.23)	3.02 (1.74)	.44	.52	.24	.03	[.05, .98]
PID-5-BF
Negative Affectivity	.27 (.14)	.42 (.23)	.80	.16	.03	< .01	[.11, .22]
Detachment	.22 (.13)	.37 (.22)	.83	.14	.03	< .01	[.09, .19]
Disinhibition	.23 (.17)	.32 (.22)	.46	.10	.03	< .01	[.05, .15]
Antagonism	.18 (.14)	.22 (.18)	.22	.03	.02	.10	[−.01, .07]
Psychoticism	.26 (.14)	.31 (.19)	.32	.07	.02	< .01	[.02, .12]
LPFS-BF-2.0
Self-Dysfunction	1.45 (.87)	2.22 (1.33)	.69	.88	.16	< .01	[.57, 1.19]
Other-Dysfunction	1.39 (.75)	1.94 (1.27)	.53	.66	.14	< .01	[.38, .94]
FFBI-SF
N1. Anxious Uncertainty	1.34 (.88)	2.31 (1.48)	.80	.95	.17	< .01	[.61, 1.29]
N2. Dysregulated Anger	1.40 (1.09)	1.92 (1.78)	.36	.69	.18	< .01	[.33, 1.06]
N3. Despondence	1.16 (.74)	1.78 (1.08)	.67	.78	.12	< .01	[.53, 1.02]
N4. Self-Disturbance	1.46 (.96)	1.73 (1.35)	.23	.41	.16	.01	[.09, .73]
N5. Behavioral Dysregulation	1.38 (.96)	1.96 (1.65)	.43	.83	.16	< .01	[.50, 1.15]
N6a. Affective Dysregulation	1.37 (1.03)	2.50 (1.75)	.79	1.19	.20	< .01	[.80, 1.59]
N6b. Fragility	.98 (.81)	1.50 (1.20)	.51	.58	.12	< .01	[.34, .83]
C6. Rashness	1.20 (.95)	1.68 (1.55)	.38	.60	.17	< .01	[.26, .94]
A1. Distrustfulness	1.73 (1.10)	2.11 (1.54)	.28	.65	.19	< .01	[.27, 1.03]
A2. Manipulativeness	1.02 (1.11)	1.11 (1.26)	.08	.50	.14	< .01	[.22, .77]
A4. Oppositional	1.09 (.74)	1.65 (1.11)	.59	.68	.11	< .01	[.46, .90]
O1. Dissociative Tendencies	1.25 (1.41)	1.62 (1.57)	.25	.58	.18	< .01	[.22, .94]
OASIS	1.95 (1.29)	2.04 (1.31)	.07	.14	.17	.43	[−.20, .47]
ODSIS	1.74 (1.72)	2.10 (1.68)	.21	.35	.22	.11	[−.08, .77]
Q-LES-Q-SF	3.56 (2.12)	4.21 (2.35)	.29	.72	.32	.02	[.10, 1.35]
WSAS	3.83 (2.84)	3.79 (2.48)	−.01	−.16	.34	.65	[−.84, .52]


Note. Bold values indicate measures that differ significantly between conditions after applying the Benjamini-Hochberg procedure. NEO-FFI-3 = NEO-Five Factor Inventory-3. PID-5-BF = Personality Inventory for DSM-5-Brief Form. LPFS-BF-2.0 = Level of Personality Functioning Scale-Brief Form-2.0. FFBI-SF = Five Factor Borderline Inventory-Short Form. OASIS = Overall Anxiety Severity & Impairment Scale. ODSIS = Overall Depression Severity & Impairment Scale. Q-LES-Q-SF = Quality of Life Enjoyment and Satisfaction Questionnaire-Short Form. WSAS = Work & Social Adjustment Scale. Means, standard deviations, and Cohen’s d values refer to raw scores and their comparison. Bs, SEs, ps, and 95% CIs refer to parameter estimates from multiple regression models (Tables S4-S31).
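The raw-score d values in Table 4 can be approximated from the tabled means and standard deviations. The sketch below makes two assumptions that this section does not confirm: that person-specific variability is the sample standard deviation of each participant's weekly scores, and that the pooled standard deviation weights the two conditions equally (reasonable here because the groups are similar in size). Function names and the weekly scores are hypothetical.

```python
import math
from statistics import stdev


def person_sd(weekly_scores):
    """Sample standard deviation of one participant's repeated scores."""
    return stdev(weekly_scores)


def cohens_d(m_established, sd_established, m_weekly, sd_weekly):
    """Cohen's d (weekly minus established) with an equal-weight pooled SD."""
    pooled = math.sqrt((sd_established ** 2 + sd_weekly ** 2) / 2.0)
    return (m_weekly - m_established) / pooled


# One hypothetical participant's six weekly scores on a scale.
example_sd = person_sd([20, 24, 22, 26, 21, 23])

# NEO-FFI-3 Neuroticism row of Table 4: mean (SD) of person-specific SDs.
d = cohens_d(2.68, 1.49, 4.39, 2.47)
```

Under these assumptions the Neuroticism row yields d ≈ .84, matching the tabled value after rounding.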

Aim 5 – Differences in Means of Personality Measures by Instruction Condition

By contrast, 11 personality scales exhibited significant mean differences between randomization conditions after applying the Benjamini-Hochberg procedure, ds: –.96 to –.21, ps < .02 (Tables 5, S32-S59).⁴ In each of these cases, participants in the established instructions condition tended to exhibit higher mean scores than those in the weekly instructions condition. These between-condition differences followed fewer consistent patterns: significant differences emerged on some neuroticism-focused scales (i.e., PID-5-BF-Negative Affectivity, FFBI-SF-N2-Dysregulated Anger, FFBI-SF-N3-Despondence, FFBI-SF-N4-Self-Disturbance, FFBI-SF-N5-Behavioral Dysregulation; ds: –.55 to –.35), some agreeableness-focused scales (i.e., FFBI-SF-A1-Distrustfulness, FFBI-SF-A2-Manipulativeness, FFBI-SF-A4-Oppositional; ds: –.71 to –.21), and individual conscientiousness-focused (FFBI-SF-C6-Rashness; d = –.44), openness-focused (NEO-FFI-Openness; d = –.96), and personality functioning scales (LPFS-BF-2.0-Other-Dysfunction; d = –.31). There were no significant mean-level between-condition differences on any mood, well-being, or functioning scale, ds: <–.01-.10, ps > .20.

Table 5.

Between-Condition Differences in the Means of Each Measure

	Established Instructions	Weekly Instructions
Measure	M (SD)	M (SD)	d	B	SE	p	95% CI
NEO-FFI-3
Neuroticism	27.01 (8.93)	23.12 (7.26)	−.48	−2.69	1.20	.03	[−5.05, −.33]
Extraversion	28.59 (6.99)	26.99 (6.71)	−.23	−2.16	1.00	.03	[−4.13, −.18]
Conscientiousness	32.04 (6.88)	29.25 (6.52)	−.42	−2.09	.96	.03	[−3.98, −.19]
Agreeableness	33.87 (6.09)	34.44 (6.51)	.09	.56	.84	.51	[−1.10, 2.22]
Openness	31.20 (6.43)	25.64 (5.07)	−.96	−5.39	.77	< .01	[−6.91, −3.86]
PID-5-BF
Negative Affectivity	1.55 (.71)	1.21 (.68)	−.49	−.28	.10	< .01	[−.48, −.09]
Detachment	.74 (.52)	.88 (.59)	.25	.13	.08	.11	[−.03, .29]
Disinhibition	.70 (.58)	.68 (.59)	−.04	−.10	.07	.16	[−.24, .04]
Antagonism	.47 (.42)	.49 (.41)	.03	−.01	.05	.90	[−.10, .09]
Psychoticism	1.04 (.69)	.82 (.62)	−.33	−.18	.08	.04	[−.34, −.01]
LPFS-BF-2.0
Self-Dysfunction	13.43 (4.31)	12.53 (4.32)	−.21	−.69	.63	.27	[−1.93, .54]
Other-Dysfunction	11.10 (3.07)	10.16 (3.07)	−.31	−1.16	.42	.01	[−2.00, −.33]
FFBI-SF
N1. Anxious Uncertainty	14.23 (4.53)	12.06 (4.34)	−.49	−1.06	.63	.10	[−2.31, .19]
N2. Dysregulated Anger	8.88 (4.71)	7.03 (3.26)	−.45	−1.85	.51	< .01	[−2.86, −.83]
N3. Despondence	9.65 (4.69)	8.15 (3.36)	−.37	−1.74	.54	< .01	[−2.79, −.68]
N4. Self-Disturbance	10.45 (4.67)	8.11 (3.72)	−.55	−1.79	.54	< .01	[−2.87, −.72]
N5. Behavioral Dysregulation	8.34 (3.94)	7.09 (3.20)	−.35	−1.69	.44	< .01	[−2.55, −.82]
N6a. Affective Dysregulation	10.25 (4.90)	8.95 (4.35)	−.28	−1.50	.66	.02	[−2.80, −.20]
N6b. Fragility	7.35 (3.45)	6.56 (2.75)	−.25	−.80	.36	.03	[−1.51, −.09]
C6. Rashness	7.64 (3.84)	6.21 (2.56)	−.44	−1.69	.39	< .01	[−2.47, −.91]
A1. Distrustfulness	10.77 (4.17)	7.99 (3.65)	−.71	−2.74	.52	< .01	[−3.77, −1.71]
A2. Manipulativeness	6.77 (3.40)	5.30 (2.13)	−.52	−1.41	.33	< .01	[−2.06, −.77]
A4. Oppositional	7.35 (2.69)	6.85 (2.13)	−.21	−1.02	.30	< .01	[−1.60, −.43]
O1. Dissociative Tendencies	8.07 (4.86)	7.04 (3.92)	−.23	−1.02	.54	.06	[−2.08, .04]
OASIS	6.04 (4.40)	6.50 (4.42)	.09	.62	.57	.27	[−.49, 1.74]
ODSIS	4.11 (4.57)	4.51 (4.33)	.10	.53	.53	.32	[−.51, 1.58]
Q-LES-Q-SF	50.09 (9.51)	50.08 (9.30)	< .01	−1.41	1.11	.21	[−3.60, .78]
WSAS	10.96 (9.44)	10.98 (8.93)	< −.01	1.14	1.01	.26	[−.85, 3.13]


Aim 6 – Between-Condition Differences in Within-Person Associations With Mood, Well-Being, and Functioning Measures

Because we used a 0/1 dummy-coded instructions condition variable, the conditional main effect of each personality measure represents the strength of the average within-person association at any given timepoint between the personality measure and mood, well-being, or functioning outcome in the established instructions condition. When this term is significant, it thus indicates that reporting greater or lower personality scores than participants’ personal average using the established instructions was significantly associated with greater or lower mood, well-being, or functioning at the same timepoint. The product term then represents the difference in the strength of the average within-person association at any given timepoint between the established instructions condition and the weekly instructions condition. When this term is significant, it indicates that participants in the weekly instructions condition demonstrated a significantly more positive or more negative association between within-person changes in personality measures and mood, well-being, or functioning measures at a given timepoint.
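The interpretation above can be made concrete by writing out the model for a single outcome. The notation is illustrative only (the symbols γ, u, and e are assumptions, not taken from the analysis code), and the autoregressive lag-1 structure of the residuals is noted in words rather than in the equation:

$$
Y_{ti} = \gamma_{00} + \gamma_{10} P^{w}_{ti} + \gamma_{01} C_i + \gamma_{11} \left( P^{w}_{ti} \times C_i \right) + u_{0i} + e_{ti}
$$

Here $Y_{ti}$ is the mood, well-being, or functioning score for person $i$ at timepoint $t$, $P^{w}_{ti}$ is the person-mean-centered personality score, $C_i$ is the instructions condition (0 = established, 1 = weekly), $u_{0i}$ is the random intercept, and $e_{ti}$ is the residual. The within-person slope equals $\gamma_{10}$ when $C_i = 0$ and $\gamma_{10} + \gamma_{11}$ when $C_i = 1$, so $\gamma_{11}$ is the between-condition difference in within-person slopes.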

Of the 48 effects of interest (two effects of interest per personality measure × 24 personality measures) for the ODSIS, 27 remained significant after applying the Benjamini-Hochberg procedure (Tables S60-S83, S156). Two-thirds (n = 18; 66.7%) of these effects were product terms representing stronger relations with within-person changes in ODSIS scores in the weekly instructions condition. Across all models, personality measures with weekly instructions were 3.48 times more strongly associated with within-person changes in ODSIS scores than personality measures with established instructions.

Of the 48 effects of interest for the OASIS, only 11 effects remained significant after applying the Benjamini-Hochberg procedure (Tables S84-S107, S157). However, nine of these effects (81.8%) were product terms representing stronger relations in the expected directions with within-person changes in OASIS scores in the weekly instructions condition. Measures with weekly instructions were 2.96 times more strongly associated with within-person changes in OASIS scores than personality measures with established instructions across all models.

Of the 18 significant effects remaining after applying the Benjamini-Hochberg procedure to the WSAS models, 11 (61.1%) were product terms representing stronger relations in the expected directions with within-person changes in WSAS scores in the weekly instructions condition (Tables S108-S131, S158). Measures with weekly instructions were 1.95 times more strongly associated with within-person changes in WSAS scores than personality measures with established instructions across all models.

Finally, of the 23 significant effects remaining after applying the Benjamini-Hochberg procedure to the Q-LES-Q-SF models, 17 (73.9%) were product terms representing stronger within-person relations in the weekly instructions condition (Tables S132-S155, S159). Measures with weekly instructions were 2.25 times more strongly associated with within-person changes in Q-LES-Q scores than personality measures with established instructions across all models.
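The Benjamini-Hochberg step applied to each family of effects can be sketched in Python as follows. This is a generic illustration of the step-up procedure at α = .05, not the authors' code; the function name and the six p values are hypothetical.

```python
def benjamini_hochberg(p_values, alpha=0.05):
    """Return a list of booleans marking which tests remain significant
    under the Benjamini-Hochberg false discovery rate procedure."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest rank k whose p value satisfies p_(k) <= alpha * k / m.
    max_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= alpha * rank / m:
            max_rank = rank
    # Every test ranked at or below that k is declared significant.
    rejected = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= max_rank:
            rejected[idx] = True
    return rejected


# Hypothetical family of six p values.
p_values = [0.001, 0.008, 0.039, 0.041, 0.20, 0.60]
flags = benjamini_hochberg(p_values)
```

In this example only the two smallest p values survive, even though four fall below an uncorrected .05 threshold, which illustrates why fewer effects remain significant after the correction.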

Discussion

In this study, we evaluated the impact of instructing people to reflect on their personality over a specified time period, compared to no specific time period, on indices of internal consistency, rank-order stability, measurement invariance, variability, and means of several common measures of personality administered once per week for six weeks. We also evaluated its impact on associations between within-person changes in these measures and measures of symptoms, well-being, and functioning. Although the between-person internal consistencies of the personality measures were good-to-excellent and generally similar between conditions, the within-person internal consistencies were significantly higher in the weekly instructions condition than the established instructions condition, ranging from unacceptable to good, depending on the measure. In the weekly instructions condition, the personality measures tested also tended to exhibit lower rank-order stability and somewhat lower mean scores but significantly greater variability and stronger associations with measures of symptoms, well-being, and functioning compared to the established instructions condition. By contrast, we found almost no between-condition differences on indices of internal consistency, measurement invariance, variability, or means of measures of symptoms, well-being, and functioning with instructions to consider the past week. Together, these results can guide researchers in planning repeated measures studies of personality and comparing the results of such studies with cross-sectional findings.

Across conditions, items from all but two personality measures demonstrated good-to-excellent between-person internal consistencies, ωs ≥ .80. Although internal consistencies were slightly higher on average in the weekly instructions condition, there were only four significant between-condition differences in the internal consistencies of the personality measures. It should be noted that all the measures included in this study were initially validated as between-person measures. This psychometric property may thus translate well to repeated measures designs. Researchers designing such studies primarily with questions about individual differences may expect the internal consistencies reported in the initial validation studies of personality measures to generalize to repeated measures samples almost regardless of the instructions delivered.

By contrast, items from all measures demonstrated lower within- than between-person internal consistencies, which may be expected given the aggregation of responses that contribute to between-person internal consistencies (Epstein, 1980). However, items from personality measures in the weekly instructions condition tended to demonstrate greater within-person internal consistency than items in the established instructions condition. This discrepancy was especially pronounced for measures of or closely related to neuroticism, extraversion, conscientiousness, and general personality functioning, with somewhat less discrepancy for measures of or related to openness. Asking participants to explicitly focus on a clearly defined time period may allow them to approach items in a more similar way from one timepoint to the next compared to providing no guidance on a time period to consider. Without a clear time period on which to focus, participants in the established instructions condition may have considered their personality over different time periods at different timepoints in the study, leading to less internally consistent responses from one timepoint to the next. We encourage future researchers to test this hypothesis by collecting qualitative descriptions from participants about their thought process in responding to repeatedly assessed personality items. However, researchers designing repeated measures studies primarily with questions about within-person processes may be encouraged to adapt the instructions of their measures to focus participants on a specific time period of interest to facilitate more internally consistent results.

The fact that 70–80% of the subscales assessed demonstrated scalar equivalence further suggests that both versions of these personality measures reflect the same latent constructs, just at different timescales. Thus, researchers may be encouraged to use the version that best suits their study design and psychometric needs.

Despite the greater within-person internal consistency demonstrated in the weekly instructions condition, participants in this condition also demonstrated significantly more variable and less rank-order stable responses than those in the established instructions condition. By more consistently focusing on a relatively brief but recent time period, those in the weekly instructions condition likely responded based on episodic memories of what events occurred and how they responded over the prior week, whereas those in the established instructions condition may have responded more from semantic memory of how they saw themselves in general across different time periods (Robinson & Clore, 2002). Researchers interested in how personality changes over time may be encouraged to use measures with instructions tailored to the time period over which they are measuring personality so participants will be more likely to focus on how their personality was expressed during that specified time frame. Accumulating data suggest that personality traits change across the lifespan (Roberts et al., 2006; Roberts & Mroczek, 2008) and in response to treatment (Roberts et al., 2017). Thus, it is important to understand whether existing personality measures, developed with the assumption that traits are static, can adequately capture change across different timeframes of interest.

Regardless of these differences in variability, we found some mean-level differences between instruction conditions when averaged across the study period, with those in the weekly instructions condition reporting somewhat lower mean scores on most, but not all, personality measures. These findings further suggest that those in the weekly instructions condition were relying on episodic memory rather than semantic memory in responding, because briefer time periods tend to preclude the most extreme responses people can exhibit when considering longer time periods (Peasgood et al., 2023; Walentynowicz et al., 2018). Measures tapping into broader constructs or dimensions (e.g., NEO-FFI-3, PID-5-BF) tended to demonstrate more similar means between conditions than the FFBI-SF. Measures assessing broader constructs may be more likely to include a mix of items that are more relevant to lifetime experiences (e.g., “I use people to get what I want” [PID-5-BF-Antagonism item #22]) or weekly experiences (e.g., “I craved attention” [PID-5-BF-Antagonism item #19]) whereas measures assessing more narrow constructs may include items that are all relevant to either lifetime or weekly experiences. If the primary focus is to directly compare results with previous cross-sectional results obtained with established instructions, measures with established instructions could be used. The use of measures with weekly instructions may simply need the caveat that the resulting mean scores may differ slightly and somewhat unpredictably from measures with established instructions.

By contrast, we found no consistent differences between conditions among measures of symptoms, quality of life, or functioning when these included past-week instructions. As with the personality measures, items from these measures generally exhibited lower within- than between-person internal consistency. However, neither the between- nor within-person internal consistencies differed between groups. Only the Q-LES-Q demonstrated significantly greater variability in the weekly instructions condition, and there were no significant differences or consistent patterns in mean differences between conditions on any symptom, quality of life, or functioning measure. Thus, self-report measures designed to assess past-week experiences may perform similarly regardless of the instructions of the measures completed before them. Researchers conducting repeated measures studies that include measures with instructions focusing on different time periods should not expect measures with general instructions to bias responses on subsequent measures with instructions focusing on specific or more recent time periods. Of course, there are many permutations of these ordering effects that could be explored, which remain open questions for future researchers.

Finally, within-person changes in measures of symptoms, well-being, and functioning were roughly 2–3 times more strongly associated with within-person changes in personality measures with weekly instructions than in those with established instructions. These results suggest that asking participants to consider the same time period when responding to personality measures as well as measures of symptoms, well-being, and functioning enhances the strength of the association between these measures. Thus, researchers interested in how personality and associated constructs change together over time may be encouraged to specify the same time period for both sets of measures. However, researchers interested in the relative stability of general personality ratings compared to more rapid changes in associated constructs may expect to find fewer and typically smaller-sized associations. We encourage future researchers to use these data to plan their studies to align measurement and theory (Kaurin et al., 2023).
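As an illustration of what a within-person association isolates, the sketch below person-mean centers two repeated measures and correlates the pooled deviations, in the spirit of the disaggregation of within- and between-person effects described by Curran and Bauer (2011). It is a simplified stand-in, not the multilevel models with covariates reported in this article, and the function names and data are hypothetical.

```python
from math import sqrt
from statistics import mean


def person_center(series):
    """Subtract a person's own mean from each of their weekly scores."""
    m = mean(series)
    return [value - m for value in series]


def within_person_r(x_by_person, y_by_person):
    """Correlate pooled person-mean-centered deviations of x and y.

    Centering removes stable between-person differences, so the result
    reflects only how x and y move together within people over time.
    Assumes at least one person varies on each measure (otherwise the
    denominator is zero).
    """
    xs, ys = [], []
    for x, y in zip(x_by_person, y_by_person):
        xs.extend(person_center(x))
        ys.extend(person_center(y))
    # Pooled deviations already have mean zero, so no further centering.
    sxy = sum(a * b for a, b in zip(xs, ys))
    sxx = sum(a * a for a in xs)
    syy = sum(b * b for b in ys)
    return sxy / sqrt(sxx * syy)


# Hypothetical data: 2 people x 3 weeks (not from the study).
# Person 2 scores higher on both measures overall, but within each
# person the two measures move in opposite directions week to week.
trait = [[1.0, 2.0, 3.0], [11.0, 12.0, 13.0]]
symptom = [[3.0, 2.0, 1.0], [13.0, 12.0, 11.0]]

print(within_person_r(trait, symptom))
```

Here the raw scores would correlate positively across all observations because of the large between-person difference, whereas the within-person association is negative. This is why aligning the time frame of both sets of measures matters most for the within-person part of the relation.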

These findings should be interpreted in the context of our study’s limitations. We selected personality measures designed to capture normal variation in the Big Five, measures relevant to Criteria A and B of the AMPD, and borderline personality disorder. Although we selected these to cover a range of relevant personality research, it is possible that other measures of interest would perform differently in this design. Similarly, we selected comparison measures of mood, well-being, and functioning with past-week instructions, which may be expected to be more strongly associated with personality measures with past-week instructions than with general, trait-like instructions (Aim 6). We encourage future researchers to replicate our design using comparison measures of trait-like constructs (e.g., avoidance, hope, optimism) to evaluate the impact of instructions on these relations. We focused on one contrast: weekly instructions compared to general instructions. Researchers conducting intensive longitudinal designs may be interested in comparisons with more momentary measures, whereas researchers conducting yearly panel studies may be more interested in measures with past-year instructions. Although beyond the scope of this design, we encourage such tests to more fully characterize how the time period of instructions impacts performance of these measures.

Constraints on Generality

Because our sample was an unselected undergraduate sample and thus was primarily younger, white, and female, these results may not generalize to more diverse, non-college educated, or clinically severe samples. More diverse samples may provide greater variability in responses on these measures. We collected data across an unspecified six-week time period, so these results pertain most directly to a relatively brief span of time without any associated interventions or further experimental manipulations. We encourage intervention researchers to replicate these results in a treatment context to test their generalizability.

Given the enthusiasm around characterizing and intervening on personality change (Beck & Jackson, 2021; Roberts et al., 2017; Sauer-Zavala et al., 2023b), it is important to use measures that are most appropriate to capture this change. Together, our results emphasize the importance of aligning the time period specified in the instructions of personality measures with researchers’ time period of interest in repeated measures designs. Researchers interested in weekly changes in personality are encouraged to adapt the instructions and items of well-validated personality measures to focus participants on their personality over the prior week. Doing so is likely to lead to similar between-person internal consistency; greater within-person internal consistency; greater variability; slightly lower, though generally similar, mean scores; and larger within-person associations with related measures without substantially impacting other weekly measures of symptoms, well-being, or functioning compared to using personality measures with their established, general instructions.

Supplementary Material

NIHMS2121786-supplement-1.docx (2.8 MB, docx)

Public Significance Statement.

When administered weekly, personality measures that instruct respondents to consider their personality over the prior week were more reliable, variable, and strongly associated with other constructs than personality measures instructing respondents to consider their personality in general, despite tapping into the same general personality constructs.

Acknowledgments

M.W.S.’s efforts on this paper were partially supported by the National Institute of Mental Health (NIMH) under awards K23MH126211 and R61MH133665. S.S.Z.’s efforts on this paper were partially supported by the NIMH under award R61MH133665.

Footnotes

Conflicts of interest: All authors declare no conflicts of interest.

¹ Although we recognize the cocron package is designed to test differences in Cronbach’s α, we used it to compare McDonald’s ωs because ω is a more general version of α that is mathematically equivalent to α if tau equivalence is assumed (Geldolf et al., 2014) and because we are not aware of other methods to statistically compare ωs.

² The misfit of these measures was large enough that we re-examined their factor structure in our sample using multilevel exploratory factor analyses in Mplus (Tables S218-S224). We refit our measurement invariance models using these revised factor structures and/or recommendations from modification indices (Table S225), which led to acceptable RMSEA indices across all models and better fitting CFI, TLI, and SRMR statistics but almost identical substantive conclusions as with the original analyses: 14 models [77.8%] demonstrated scalar equivalence compared to the semi-restricted models and 15 [75%] demonstrated scalar equivalence compared to the unrestricted models.

³ In response to reviewer feedback, we also ran these models without covariates. The results were largely similar, with 21 significant differences between instruction conditions, only two of which resulted in Bs ≥50% different than those in Table 4 (i.e., FFBI-SF-A2-Manipulativeness and WSAS; Tables S160-S187, S216).

⁴ The results were also largely similar when running these models without covariates, with 15 significant between-conditions differences, five of which resulted in Bs ≥50% different than those in Table 5 (i.e., PID-5-BF-Disinhibition, FFBI-SF-A1-Anxious Uncertainty, FFBI-SF-A4-Oppositional, Q-LES-Q, and WSAS; Tables S188-S215, S217).

References

Allemand M, Gmür B, & Flückiger C. (2022). Does extraversion increase following a three-hour flirt training? Exploring two training routes. Scandinavian Journal of Psychology, 63(3), 265–274. 10.1111/sjop.12803
American Psychiatric Association. (2022). Alternative DSM-5 model for personality disorders. In Diagnostic and statistical manual of mental disorders (5th ed., text rev.). 10.1176/appi.books.9780890425787.Alternate_DSM5_Model_for_Personality_Disorders
Beck ED, & Jackson JJ (2020). Idiographic traits: A return to Allportian approaches to personality. Current Directions in Psychological Science, 29(3), 301–308. 10.1177/0963721420915860
Beck ED, & Jackson JJ (2021). Within-person variability. In Rauthmann JF (Ed.), The handbook of personality dynamics and processes (pp. 75–100). Elsevier Academic Press. 10.1016/B978-0-12-813995-0.00004-2
Ben-Porath YS, & Tellegen A. (2020). Minnesota Multiphasic Personality Inventory-3 (MMPI-3): Manual for administration, scoring, and interpretation. University of Minnesota Press.
Cohen J, & Cohen P. (1983). Applied multiple regression/correlation analysis for the behavioral sciences. Erlbaum.
Curran PJ, & Bauer DJ (2011). The disaggregation of within-person and between-person effects in longitudinal models of change. Annual Review of Psychology, 62, 583–619. 10.1146/annurev.psych.093008.100356
DeShong HL, Mullins-Sweatt SN, Miller JD, Widiger TA, & Lynam DR (2016). Development of a short form of the five-factor borderline inventory. Assessment, 23(3), 342–352. 10.1177/1073191115581475
Diedenhofen B. (2016). cocron: Statistical comparisons of two or more alpha coefficients (Version 1.0–1) [Computer software]. http://comparingcronbachalphas.org
Endicott J, Nee J, Harrison W, & Blumenthal R. (1993). Quality of Life Enjoyment and Satisfaction Questionnaire: A new measure. Psychopharmacology Bulletin, 29(2), 321–326.
Epstein S. (1980). The stability of behavior: II. Implications for psychological research. American Psychologist, 35(9), 790–806. 10.1037/0003-066X.35.9.790
Faul F, Erdfelder E, Buchner A, & Lang AG (2009). Statistical power analyses using G*Power 3.1: Tests for correlation and regression analyses. Behavior Research Methods, 41(4), 1149–1160. 10.3758/BRM.41.4.1149
Fleeson W. (2001). Toward a structure- and process-integrated view of personality: Traits as density distributions of states. Journal of Personality and Social Psychology, 80(6), 1011–1027. 10.1037/0022-3514.80.6.1011
Funder DC (2010). The personality puzzle (5th ed.) (p. 5). Norton.
Geldolf GJ, Preacher KJ, & Zypher MJ (2014). Reliability estimation in a multilevel confirmatory factor analysis framework. Psychological Methods, 19(1), 72–91. 10.1037/a0032138
Goldberg LR, Johnson JA, Eber HW, Hogan R, Ashton MC, Cloninger CR, & Gough HG (2006). The international personality item pool and the future of public-domain personality measures. Journal of Research in Personality, 40(1), 84–96. 10.1016/j.jrp.2005.08.007
Gosling SD, Rentfrow PJ, & Swann WB Jr. (2003). A very brief measure of the big five personality domains. Journal of Research in Personality, 37(6), 504–528. 10.1016/S0092-6566(03)00046-1
Hu L. t., & Bentler PM (1999). Cutoff criteria for fit indexes in covariance structure analysis: Conventional criteria versus new alternatives. Structural Equation Modeling, 6(1), 1–55. 10.1080/10705519909540118
Hudson NW, & Fraley RC (2015). Volitional personality trait change: Can people choose to change their personality traits? Journal of Personality and Social Psychology, 109(3), 490–507. 10.1037/pspp0000021
Kaurin A, King KM, & Wright AGC (2023). Studying personality pathology with ecological momentary assessment: Harmonizing theory and method. Personality Disorders: Theory, Research, and Treatment, 14(1), 62–72. 10.1037/per0000596
Krueger RF, Derringer J, Markon KE, Watson D, & Skodol AE (2013). The Personality Inventory for DSM-5 Brief Form (PID-5-BF) – Adult. American Psychiatric Association.
Marks I. (1986). Behavioural psychotherapy. Marks I.
McCrae RR, & Costa PT (2010). NEO inventories professional manual. Psychological Assessment Resources.
Morey LC (2007). Personality assessment inventory (PAI): Professional manual (2nd ed.). Psychological Assessment Resources.
Murayama K, Usami S, & Sakaki M. (2022). Summary-statistics-based power analysis: A new and practical method to determine sample size for mixed-effects modeling. Psychological Methods, 27(6), 1014–1038. 10.1037/met0000330
Muthén LK, & Muthén BO (2012–2022). Mplus user’s guide (8th ed.). Muthén & Muthén.
Paulhus DL, & Vazire S. (2007). The self-report method. Handbook of Research Methods in Personality Psychology, 224–239.
Peasgood T, Caruana JM, & Mukuria C. (2023). Systematic review of the effect of a one-day versus seven-day recall duration on patient reported outcome measures (PROMs). The Patient, 16(3), 201–221. 10.1007/s40271-022-00611-w
Preacher KJ (2002, May). Calculation for the test of the difference between two independent correlation coefficients [Computer software]. https://quantpsy.org
R Core Team. (2023). R: A language and environment for statistical computing (Version 4.3.1) [Computer software]. https://www.R-project.org/
Ringwald WR, Forbes MK, & Wright AGC (2023). Meta-analysis of structural evidence for the Hierarchical Taxonomy of Psychopathology (HiTOP) model. Psychological Medicine, 53(2), 533–546. 10.1017/S0033291721001902
Roberts BW, Luo J, Briley DA, Chow PI, Su R, & Hill PL (2017). A systematic review of personality trait change through intervention. Psychological Bulletin, 143(2), 117–141. 10.1037/bul0000088
Roberts BW, & Mroczek D. (2008). Personality trait change in adulthood. Current Directions in Psychological Science, 17(1), 31–35. 10.1111/j.1467-8721.2008.00543.x
Roberts BW, Walton KE, & Viechtbauer W. (2006). Patterns of mean-level change in personality traits across the life course: A meta-analysis of longitudinal studies. Psychological Bulletin, 132(1), 1–25. 10.1037/0033-2909.132.1.1
Robins RW, Fraley RC, & Krueger RF (Eds.). (2007). Handbook of research methods in personality psychology. Guilford.
Robinson MD, & Clore GL (2002). Belief and feeling: Evidence for an accessibility model of emotional self-report. Psychological Bulletin, 128(6), 934–960. 10.1037//0033-2909.128.6.934
Satorra A, & Bentler PM (2010). Ensuring positiveness of the scaled difference chi-square test statistic. Psychometrika, 75(2), 243–248. 10.1007/s11336-009-9135-y
Sauer-Zavala S, Southward MW, Fruhbauerova M, Semcho SA, Stumpp NE, Hood CO, Smith M, Elhusseini S, & Cravens L. (2023a). BPD Compass: A randomized controlled trial of a short-term, personality-based treatment for borderline personality disorder. Personality Disorders: Theory, Research, and Treatment, 14(5), 534–544. 10.1037/per0000612
Sauer-Zavala S, Southward MW, Hood CO, Elhusseini S, Fruhbauerova M, Stumpp NE, & Semcho SA (2023b). Conceptual development and case data for a modular, personality-based treatment for borderline personality disorder. Personality Disorders: Theory, Research, and Treatment, 14(4), 369–380. 10.1037/per0000520
Soto CJ, & John OP (2017). The next Big Five Inventory (BFI-2): Developing and assessing a hierarchical model with 15 facets to enhance bandwidth, fidelity, and predictive power. Journal of Personality and Social Psychology, 113(1), 117–143. 10.1037/pspp0000096
Southward MW, Cheavens JS, & Coccaro EF (2023). Defining the p-factor: An empirical test of five leading theories. Psychological Medicine, 53(7), 2732–2743. 10.1017/S0033291722001635
Stieger M, Allemand M, Roberts BW, & Davis JP (2022). Mindful of personality trait change: Are treatment effects on personality trait change ephemeral and attributable to changes in states? Journal of Personality, 90(3), 375–392. 10.1111/jopy.12672
Stieger M, Flückiger C, Rüegger D, Kowatsch T, Roberts BW, & Allemand M. (2021). Changing personality traits with the help of a digital personality change intervention. Proceedings of the National Academy of Sciences of the United States of America, 118(8), e2017548118. 10.1073/pnas.2017548118
Stieger M, Wepfer S, Rüegger D, Kowatsch T, Roberts BW, & Allemand M. (2020). Becoming more conscientious or more open to experience? Effects of a two‐week smartphone‐based intervention for personality change. European Journal of Personality, 34(3), 345–366. 10.1002/per.2267
Walentynowicz M, Schneider S, & Stone AA (2018). The effects of time frames on self-report. PLoS ONE, 13(8), e0201655. 10.1371/journal.pone.0201655
Weekers LC, Hutsebaut J, & Kamphuis JH (2019). The Level of Personality Functioning Scale-Brief Form 2.0: Update of a brief instrument for assessing level of personality functioning. Personality & Mental Health, 13(1), 3–14. 10.1002/pmh.1434

