Understanding the Within- and Between-Person Structure of Daily Psychopathology Among Adolescents and Young Adults

Hao Zheng; Yao Zheng

2024 Sep 30;32(6):899–920. doi: 10.1177/10731911241283908

PMCID: PMC12290237 PMID: 39344957

Abstract

Previous investigations of the underlying structure of psychopathology symptoms have focused primarily on the between-person level and on adult samples. This study used two independent Canadian samples with month-long daily diary designs to investigate daily psychopathology structure at both the within- and between-person levels among adolescents (n = 99, 2,132 daily reports) and young adults (n = 313, 6,431 and 4,018 daily reports at Waves 1 and 2, respectively). Four mainstream types of psychopathology structure were compared based on a comprehensive set of standards. The results suggest that the general factor of psychopathology (p factor) derived from the higher-order and bifactor models performed similarly well at both the within- and between-person levels, while the specific factors estimated in the bifactor models demonstrated low reliability and consistency over time. Psychopathology manifests as multidimensional at the within-person level but unidimensional at the between-person level. The current findings inform the development of future prevention and intervention programs by supporting the adoption of transdiagnostic treatment that addresses multiple psychopathology symptoms with a holistic approach.

Keywords: bifactor model, hierarchical model, daily diary, p factor, psychopathology

Given the significant issues of diagnostic comorbidity in mental health (Forbes et al., 2016), the widespread heterogeneity in disorder manifestation (Kotov et al., 2021), and the arbitrary cutoffs of mental disorder diagnosis, each of which contributes to relatively unreliable diagnoses (Forbes et al., 2021), a burgeoning body of research has started to understand psychopathology using a dimensional approach that reflects processes of psychological dysfunction better than traditional classification approaches do. Three mainstream models of the latent structure of psychopathology are the correlated factor model, the higher-order model, and the bifactor model (Carragher et al., 2016; Caspi et al., 2024; Markon, 2019). However, comparisons of these alternative models have primarily taken a between-person approach, comparing individuals who differ in the severity of psychopathology features based on retrospective reports of their symptoms over a long period of time. Only a small body of emerging research has examined these models at the within-person level to explore the short-term dynamics of psychopathology in daily life settings. The current study employs a daily diary design to examine these models and to enhance our understanding of how psychopathology manifests at the within- and between-person levels among adolescents and young adults.

The correlated factor model represents the original structural approach in the study of psychopathology. It typically includes two or more distinct but related latent variables identifying the shared variance of the symptoms/disorders within each spectrum of psychopathology (Carragher et al., 2016). Considering the substantial correlations between the spectra underlying mental health disorders (>0.45; Caspi et al., 2024; Laceulle et al., 2015; Smith et al., 2020), a general factor of psychopathology, also referred to as the p factor, has been put forward to reflect an underlying liability to experience all forms of psychopathology (Caspi et al., 2014; Watts et al., 2019). However, it is important to recognize that correlations can vary across different age groups and are influenced by the psychopathology symptoms included in the models. For example, studies involving adult samples that included thought disorders generally found lower correlations than those focusing only on internalizing and externalizing symptoms in adolescent samples (Forbes et al., 2021; Hamlat et al., 2019; Laceulle et al., 2015). The p factor has attracted considerable interest in recent psychopathology studies and can manifest differently across various psychopathology models. In the higher-order model, each symptom/disorder loads on one first-order factor (e.g., internalizing and externalizing), and these latent factors subsequently load on a higher-order p factor, which represents the covariation of the first-order factors (Laceulle et al., 2015). In contrast, in the bifactor model, each symptom/disorder loads on both the p factor and a specific factor (e.g., internalizing and externalizing; Caspi et al., 2014; Moore et al., 2020). Here, the p factor directly captures the common variance across all symptom/disorder variables and is set to be uncorrelated with the specific factors (Laceulle et al., 2015). The specific factors, therefore, delineate the unique aspects of disorders within each spectrum, distinguishing them from those in other spectra (Clark et al., 2021; Markon, 2019). All these approaches have some theoretical justification, with ongoing debate regarding which is preferable for applications (Clark et al., 2021). There is also a single-factor model, in which every symptom loads directly on a single latent p factor (Forbes et al., 2021; Moore et al., 2020).
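To make the difference between these structures concrete, the following minimal sketch computes the correlation that each model implies between two items, assuming a fully standardized solution with orthogonal p and specific factors. The function names and all loading values are hypothetical illustrations; they are not estimates from this study.

```python
def implied_corr_correlated(a_i, a_j, phi):
    """Correlated factor model: two items on different factors.

    a_i, a_j: standardized item loadings; phi: correlation between the factors.
    """
    return a_i * a_j * phi


def implied_corr_higher_order(a_i, a_j, gamma_k, gamma_l):
    """Higher-order model: the factor correlation is produced by p alone.

    gamma_k, gamma_l: standardized loadings of the two first-order factors on p.
    """
    return a_i * a_j * (gamma_k * gamma_l)


def implied_corr_bifactor(g_i, g_j, s_i=0.0, s_j=0.0, same_specific=False):
    """Bifactor model: p and the specific factors are uncorrelated.

    g_i, g_j: loadings on p; s_i, s_j: loadings on the specific factor.
    The specific factor adds covariance only when both items share it.
    """
    r = g_i * g_j
    if same_specific:
        r += s_i * s_j
    return r


# Hypothetical values: one internalizing item and one externalizing item.
r_cf = implied_corr_correlated(0.7, 0.6, phi=0.4)
r_ho = implied_corr_higher_order(0.7, 0.6, gamma_k=0.8, gamma_l=0.5)
r_bi_cross = implied_corr_bifactor(0.5, 0.4)
# Two items from the same spectrum under the bifactor model.
r_bi_same = implied_corr_bifactor(0.5, 0.4, 0.3, 0.3, same_specific=True)
```

In this illustration the higher-order model reproduces the correlated factor value only because the product of the two second-order loadings (0.8 × 0.5) equals the assumed factor correlation (0.4); in the bifactor model, cross-spectrum covariance runs entirely through the p factor.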

Prior studies have primarily examined these models at the between-person level (e.g., Carragher et al., 2016; Moore et al., 2020) and found that p factors extracted from various models (i.e., single-factor, higher-order factor, and bifactor models) exhibited high correlations and similar nomological networks (Caspi et al., 2024), suggesting the existence of a p factor regardless of how it is modeled. In such between-person-level studies, p factors reflect the frequent co-occurrence of psychopathology symptoms in individuals and inform us that people who have X symptom also show Y symptom more often than other people do (Wright et al., 2023). It is equally important to explore the model of psychopathology at the within-person level, where the interpretation of factor loadings shifts: when someone has X symptom more than their average, they are also likely to experience Y symptom more than their average at the same time (Wright et al., 2023). The between-person-level structures are useful for ranking or comparing the severity of psychopathology symptoms between individuals, whereas the within-person-level factor structures and corresponding scores are important for tracking and monitoring the fluctuations of psychopathology symptoms within persons over time.

It is often assumed that the within- and between-person factor structures are configurally and metrically invariant, with previous research directly applying the commonly used between-person structures to analyze within-person fluctuations (Grosz, 2024). Nonetheless, between-person structural models do not necessarily translate exactly to the within-person level (Cooke et al., 2022; Wright et al., 2015; Zheng et al., 2023). For instance, the Big Five personality model, a well-established psychological factor structure, has been identified only at the between-person level and is absent at the within-person level (Grosz, 2024). Consequently, it is crucial to examine the extent to which the structure of psychopathology at the within-person level mirrors the manifestation observed at the between-person level. For example, if the p factor can only be found at the between-person level but not at the within-person level, the implication would be that psychopathology symptoms tend to co-occur more among certain individuals relative to others (Carragher et al., 2016), yet they do not consistently fluctuate together within the same person over time. This would suggest that symptoms may vary independently or co-occur exclusively with others within the same spectrum (e.g., internalizing spectrum) within each individual over time. Only a limited number of studies have investigated whether certain symptoms co-occur as a representation of similar or distinct psychopathology spectra on the same occasion. In the study by Constantinou et al. (2019), a total of 684 U.K. adolescents with moderate-to-severe antisocial behaviors reported their psychopathology using the Strengths and Difficulties Questionnaire (SDQ) and the Mood and Feelings Questionnaire–Short Form (MFQ) in four waves over 18 months. The results revealed a multidimensional factor structure with a highly reliable p factor, while the specific factors only demonstrated modest reliability. Using data from 101 participants diagnosed with personality disorders, who provided over 30 daily reports on 16 psychopathology items, Wright et al. (2015) conducted multilevel structural equation modeling to compare a set of correlated factor models. They found that a two-factor model (internalizing vs. externalizing) best fit the data at the between-person level, while a four-factor model (negative affect, detachment, hostility, and disinhibition) demonstrated the best model fit at the within-person level. In another study, Wright et al. (2023) investigated the structure of 81 psychopathology indicators among a mixed community and clinical sample of 300 adults who completed 14 or 21 daily surveys and found that the same core factors of psychopathology emerged at both the within- and between-person levels. However, the latent factors at the within-person level were narrower and contained fewer items than those at the between-person level.

Psychopathology symptoms not only demonstrate temporal variability and fluctuations over long periods (e.g., months, years) but can also wax and wane rapidly within several days (Schreuder et al., 2023). Within individuals, these symptoms vary in both type and severity. For example, depressive symptoms may either substitute for or co-occur with anxiety symptoms, and the severity of these symptoms can fluctuate on a smaller scale, such as between “mild” and “moderate” (McGorry & Van Os, 2013; Schreuder et al., 2023). These within-person changes cannot be well captured through traditional longitudinal designs, which usually have a limited number of repeated measures and relatively large time intervals between assessments (Sliwinski, 2011). Therefore, it is essential to assess symptoms and compare different structures on a shorter time scale, such as at the daily level. Recent studies have started investigating psychopathology in daily life settings over short periods, such as moments and days (e.g., Schreuder et al., 2023; Wright et al., 2023; Zheng & Goulter, 2024). To the best of our knowledge, no study has yet compared the mainstream structures of psychopathology at the within-person level in daily contexts. Whether the constructs underlying psychopathology symptoms perform the same way at the within- and between-person levels in daily settings remains unknown.

The exploration of psychopathology structure in the current literature was primarily conducted among adult samples, and the generalizability of the findings to other age groups remains unknown (Conway et al., 2022; Kotov et al., 2017). In response to calls for a developmental perspective in a dimensional approach to psychopathology (e.g., Kotov et al., 2017, 2021), it becomes important to investigate the psychopathology models during the developmental transition periods into adolescence, and also from adolescence to young adulthood. These developmental stages may exhibit distinct psychopathology structures compared to other developmental periods, as the manifestation of psychopathology in adolescents and young adults is strongly influenced by their access to regulated items (e.g., cars, alcohol, firearms), peer interactions, and the timing of puberty (Hamlat et al., 2019; Kotov et al., 2021). For instance, within the Hierarchical Taxonomy of Psychopathology (HiTOP) framework, Forbes et al. (2024) identified a higher-order model that encompasses 15 narrower dimensions under four broad spectra: internalizing, externalizing, eating pathology, and uncontrollable worry, obsessions, and compulsions. This structure is largely consistent with adult models but incorporates modifications in the placement/loadings of some items and the conceptual scope of the internalizing spectrum (e.g., Kotov et al., 2017, 2021). In addition, these developmental transition periods are among the best time windows to capture the interactive and dynamic emergence of psychopathology symptoms, as it is during these stages that various mental disorders sequentially or concurrently develop and remit most noticeably (McGorry & Van Os, 2013).

Model comparison based on traditional model fit indices has some well-recognized limitations. Simulation studies have indicated that the more flexible bifactor models can better accommodate complexity (e.g., random noise or error; Reise et al., 2016), leading to a tendency to fit data relatively well, even when the “true” model is less complex, such as a correlated factor or higher-order factor structure (Greene et al., 2019). As suggested by Forbes et al. (2021), in addition to comparing the fit indices of competing models, a comprehensive model comparison should also include other criteria. First, factor reliability indicates whether the factor score is a good representation of the modeled factor from which it is calculated (Hoffmann et al., 2022). Relatively low reliability makes it difficult to draw inferences about individuals’ relative rankings or variability on that factor (Forbes et al., 2021). Second, longitudinal measurement invariance (MI) assesses the structural consistency of models over time. This evaluation can elucidate whether the changes in latent factors represent genuine change in these factors across time or merely reflect shifts in the measurement properties of the indicators (Conway et al., 2022; Gluschkoff et al., 2019). Third, the correlations between factor scores extracted from different models indicate the degree of similarity among these models. High correlations suggest that all these models consistently rank-order days or individuals at the within- or between-person level on a latent dimension of general psychopathology (Caspi et al., 2024). Fourth, criterion validity testing examines whether the factors derived from different models are associated with external variables as theoretically expected, and whether models constructed at different time points represent similar or distinct constructs (Hoffmann et al., 2022; Southward et al., 2022).

The Current Study

There is a major need to examine within- and between-person psychopathology structures in daily contexts among adolescents and young adults. This study used two independent samples, each with a month-long daily diary design, to evaluate four types of psychopathology structures previously identified and supported in the literature. A comprehensive set of standards, including model fit, factor properties and reliability, longitudinal MI, correlations between latent factors, and within- and between-person concurrent and prospective validity, was applied to compare these models. In addition, we tailored our analysis to each type of model by integrating both narrower factors (e.g., emotional problems, hyperactivity) and broader spectra (internalizing vs. externalizing; Goodman et al., 2010; Hoffmann et al., 2024; Wright et al., 2015) to best fit the data. Given the scarce literature on the psychopathology structure at the within-person level, especially using intensive longitudinal designs, an exploratory approach was taken, and no specific hypothesis was made.

Method

Participants and Procedures

We used data from two independent community-based samples of adolescents and young adults. The procedure and instruments for both samples were approved by the research ethics committee at the University of Alberta. Survey instruments were developed and administered using REDCap (Harris et al., 2019). We report all data exclusions, all manipulations, and all measures used in the study. Since this study involved analyses of existing datasets rather than collecting new data, determining sample size was not applicable.

University Student Sample

An initial sample of 313 Canadian university freshmen (M_age = 18.1 years, SD = 1.31, range 17–29, 72% female, 53% Asian, 30.3% White, 5% Black, 4.7% Multiracial, 0.7% Native, 1% Latino or Hispanic, 5.3% Others) completed a baseline survey and at least one day of a 30-day daily diary study (6,431 total observations, M = 21.43 days, SD = 9.65, 72.5% ≥ 20 days) between September and December 2019 in the first wave of the longitudinal study. Two and a half years later, 204 (64% retention rate) participants took part in the baseline survey of the second wave of the longitudinal study, of whom 194 provided at least one day of data in the subsequent 30-day daily diary study (4,018 total observations, M = 21.0 days, SD = 8.84, 68.0% ≥ 20 days) between February and May 2022 (Wave 2). Participants who were retained across waves were slightly younger than those who dropped out, 18.0 ± 0.7 vs. 18.5 ± 1.9, t(310) = 3.58, p < .001, but did not differ in sex, ethnicity, or parental education. Six months after the completion of the second wave assessment, a total of 158 (77.5% retention rate) participants completed a short follow-up survey (no daily surveys) between September and December 2022 (Wave 3).

Participants were recruited through online advertisements, on-campus posters, and short in-class presentations at a large Western Canadian university. For the first wave, all first-year undergraduate students were eligible for inclusion, while the second wave contacted only those participants who had completed at least the baseline survey in the first wave, without recruiting any new participants. In both waves, after providing informed consent online, participants first completed a baseline survey and then completed daily surveys for 30 consecutive days, beginning 3 days after completing the baseline survey. Daily surveys were sent by email at 7 pm each night, and participants were asked to complete the survey before going to sleep that night. Six months after the completion of the daily surveys in the second wave, participants completed a short follow-up survey by email. Participants received $60, $75, and $15 e-gift cards as compensation for their participation in Waves 1, 2, and 3, respectively (see Cooke et al., 2023 for more information on recruitment procedures).

Adolescent Sample

A total of 99 Canadian adolescents (M_age = 14.60 years, SD = 1.76, range 12–17, 55.8% female) participated in a 30-day daily diary study (2,132 total observations, M = 21.6 days, SD = 7.80, 77.8% ≥ 20 days) between April 2019 and September 2020. Participants self-identified as White (51.5%), Asian (23.2%), multiracial (8.1%), Latinx or Hispanic (4.0%), Black (2.0%), other (6.1%), or missing (5.1%). Among adolescents, 81.1% reported living with both biological parents, 17.8% reported living with one biological parent, and 1.1% reported living with someone other than a biological parent.

Participants were recruited through newsletters, social media, and flyers posted or circulated in a Western Canadian province. Interested participants contacted the research team and were provided with study information. Participants received an online baseline survey after providing consent/assent online and, beginning 5 days after completing the baseline survey, an online daily survey for 30 consecutive days. The daily survey was sent out at 5 pm, and adolescents were asked to fill out the survey before going to sleep that night. After the 30-day daily survey, adolescents received a $45 e-gift card as compensation for their participation. Parental consent and adolescent assent were obtained prior to commencing the study (see Xu & Zheng, 2023 for more information on recruitment procedures).

Measures

Daily Psychopathology

To measure university students’ daily psychopathology, we selected items from the SDQ (Goodman et al., 1998) that the researchers deemed to have high daily frequency and within-person fluctuation: three items from the conduct problems subscale (“I got very angry and lost my temper,” “I was generally willing to do what other people want,” and “I fight a lot.”), four items from the peer problems subscale (“I would rather be alone than with people of my age,” “Other people my age liked me,” “Other young people or fellow students picked on me or bullied me,” “I got along better with adults than with people my own age.”), and all five items of the emotional problems subscale. Three items were reverse-coded, as detailed in the Supplemental Materials (Table S1). Participants indicated how each item applied to them on that day on a 3-point scale (1 = not true, 2 = somewhat true, 3 = certainly true). In addition, five items on inattention/hyperactivity (“Restless, always up and on the go,” “Made inappropriate noises when shouldn’t,” “Distractible or attention span was a problem,” “Excitable, impulsive,” “Failed to finish things that started.”) with the same 3-point response options were also included. These items were modified from previous studies (Purpura & Lonigan, 2009) and were based on the symptom descriptions of inattention and hyperactivity in the Diagnostic and Statistical Manual of Mental Disorders (5th ed.; DSM-5; American Psychiatric Association, 2013). They were adjusted from their original trait-like language to reflect more immediate states to accommodate daily surveys. An additional item measuring panic disorder symptoms was included in the item pool of emotional problems. It was derived by averaging the scores of three items (e.g., “When I feel nervous, my heart beat fast.”) from the Screen for Adult Anxiety Related Disorders (SCAARED; Angulo et al., 2017).

Adolescents reported their daily psychopathology with all items from the emotional symptoms, conduct problems, and hyperactivity subscales, as well as three items from the peer problems subscale of the SDQ (Goodman et al., 1998). Three items were reverse-coded (Supplemental Table S8).

Criterion Validity Measures

Daily Hassles

Daily hassles were measured with an abbreviated version of the Hassles Scale (e.g., “Hurried to meet a deadline.”; Zheng et al., 2023) among university students and adolescents. Participants reported whether that event or experience happened on that day (0 = no, 1 = yes). Scores were averaged across 16 items, with higher scores indicating experiencing more daily hassles (university students Wave 1: ordinal ω_w = .82, ω_b = .97; Wave 2: ordinal ω_w = .95, ω_b = .99; adolescents: ordinal ω_w = .88, ω_b = .98).

Daily Neuroticism

University students reported their levels of neuroticism on that day using three selected items from the Mini-International Personality Item Pool (Donnellan et al., 2006) adapted to daily contexts (e.g., “Had frequent mood swings today”) on a 5-point scale (from 1 = not well at all to 5 = very well). An average score of neuroticism was created, with higher scores representing higher levels of neuroticism (Wave 1: ω_w = .59, ω_b = .79; Wave 2: ω_w = .53, ω_b = .75).

Daily Emotion-Regulation Difficulties

University students reported their emotion-regulation difficulties with eight items selected from the Difficulties in Emotion Regulation Scale–Short Form (e.g., “When I was upset today, I have difficulty controlling my behaviors”; Kaufman et al., 2016) on a 3-point scale (0 = never, 1 = 1–2 times, 2 = 3 times or more). Items were averaged, with higher scores indicating experiencing more difficulties in emotion regulation that day (Wave 1: ω_w = .79, ω_b = .90; Wave 2: ω_w = .77, ω_b = .91).

Emotion Regulation

Emotion regulation was measured with the Emotion Regulation Questionnaire (Gross & John, 2003), which includes 10 items assessing cognitive reappraisal (six items) and expressive suppression (four items) facets on a 5-point scale (from 1 = disagree to 5 = agree). Adolescents reported their daily use of each emotion-regulation strategy in daily surveys, while university students reported the frequency of use over the past year/6 months in the three baseline/follow-up surveys. Scores were averaged, with higher scores indicating more use of reappraisal (adolescents: ω_w = .88, ω_b = .98; university students ω: .85–.90) and suppression (adolescents: ω_w = .84, ω_b = .98; university students ω: .85–.89).

Daily Impulsivity

Adolescents reported their daily impulsivity on that day, with four items selected from the Momentary Impulsivity Scale (e.g., “I have felt impatient.”; Tomko et al., 2014) on a 5-point scale (1 = very slightly or not at all to 5 = extremely). Items were averaged with higher scores representing higher levels of impulsivity (ω_w = .58, ω_b = .84).

Impulsivity

The Short Urgency-Premeditation-Perseverance-Sensation Seeking-Positive Urgency (UPPS-P) Impulsive Behavior Scale (Cyders et al., 2014) was used to measure university students’ impulsivity in the Wave 2 baseline survey. The short UPPS-P comprises 20 items with four items each on five subscales: negative urgency, lack of perseverance, lack of premeditation, sensation seeking, and positive urgency. Participants reported on a 5-point Likert-type scale (1 = not at all to 5 = very much). Items were averaged, with higher scores representing higher levels of impulsivity (ω = .75).

Depressive Symptoms

Depressive symptoms were measured using 17 items adapted from the Center for Epidemiological Studies Depression scale (Radloff, 1977). University students indicated how often the statements described them in the past year on a 4-point Likert-type scale (0 = rarely or none, 1 = some or a little of the time, 2 = occasionally or a moderate amount of time, 3 = most or all of the time) in the three baseline/follow-up surveys. Items were averaged, with a higher score indicating higher levels of depressive symptoms (ordinal ωs = .90–.93).

Anxiety Symptoms

University students answered questions on their anxiety symptoms using 14 items modified from the SCAARED (Angulo et al., 2017) on a 3-point scale (0 = not true, 1 = somewhat true, 2 = very true) in the three baseline/follow-up surveys. Scores were averaged, with a higher score reflecting higher levels of anxiety symptoms (ordinal ωs = .90–.92).

Analytic Strategy

Model Estimation

All analyses were performed separately for each sample (university students and adolescents) in Mplus 8.3 (Muthén & Muthén, 1997–2019). Multilevel confirmatory factor analyses (MLCFAs) using the weighted least squares mean- and variance-adjusted (WLSMV) and Bayesian estimators were fit to estimate different daily psychopathology models at the within- and between-person levels. We opted not to use maximum likelihood estimation with robust standard errors (MLR), as the numerical integration over multiple dimensions required for categorical items with multiple thresholds was infeasible with the current data. All observed variables were treated as categorical, except for the indicator of panic disorder symptoms, which was treated as a continuous variable. To facilitate model convergence and enhance estimation accuracy (DiStefano et al., 2021), items with a response rate below 3% in any category were dichotomized into 0 (not true) and 1 (somewhat true or certainly true). First, models were estimated in the university student sample, with each model being estimated separately for each wave. Six different structures/models of psychopathology were examined: a correlated factor model with four factors (CF–4F; see Figure 1A), a correlated factor model with two factors (CF–2F), a higher-order model with four subfactors (HO–4F; see Figure 1B), a single-factor model, a bifactor model with four specific factors (Bi–4F), and a bifactor model with two specific factors (Bi–2F; see Figure 1C). In a higher-order model with two subfactors (HO–2F), the second-order latent factor is indicated by only two first-order latent factors (internalizing and externalizing), which leaves the model under-identified. Therefore, no HO–2F models were estimated.
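The dichotomization rule described above can be sketched as follows. This is a minimal illustration, not the authors' code: it assumes responses coded 1 to 3 as in the item description, that missing reports are stored as `None`, and that the 3% threshold is evaluated on all available reports for an item pooled across persons and days. The function name `recode_sparse_item` is hypothetical.

```python
from collections import Counter


def recode_sparse_item(responses, categories=(1, 2, 3), min_share=0.03):
    """Collapse a 3-point item to 0/1 when any category is rarely endorsed.

    Returns (recoded_responses, was_dichotomized). Missing values (None)
    are ignored when computing category shares and are kept as None.
    """
    observed = [r for r in responses if r is not None]
    if not observed:
        return list(responses), False
    counts = Counter(observed)
    n = len(observed)
    # Assumption: "response rate" means the share of reports in a category.
    sparse = any(counts.get(c, 0) / n < min_share for c in categories)
    if not sparse:
        return list(responses), False
    lowest = min(categories)  # "not true"
    recoded = [
        None if r is None else (0 if r == lowest else 1)
        for r in responses
    ]
    return recoded, True


# Hypothetical item: 98 "not true", 1 "somewhat true", 1 "certainly true".
recoded, changed = recode_sparse_item([1] * 98 + [2] + [3])

# Hypothetical item with well-populated categories: left as is.
kept, changed_kept = recode_sparse_item([1] * 50 + [2] * 30 + [3] * 20)
```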

Figure 1. Examples of Alternative Structures/Models of Daily Psychopathology Among University Students.

Models at both levels were estimated simultaneously. At the between-person level, the indicators are the within-person means of each psychopathology item over the 30-day daily surveys (random means/intercepts); as such, the between-person-level indicators were represented as latent means in Figure 1. The within-person structure/model reflects common covariances among the items each day over time, whereas the between-person structure/model represents common covariances in the person-average levels of indicators across individuals (Cooke et al., 2022; Kim et al., 2016). Following the suggestion of Kim et al. (2016), factor loading invariance across levels was tested subsequently. Based on the structural validity and interpretability of the models, certain subpar models were excluded, while the others were retained to explore different structures at the within- and between-person levels to investigate whether daily psychopathology exhibits distinct structures across levels. The retained models were subsequently estimated in the adolescent sample to examine their replication across independent samples. To promote transparency and openness, the codes behind all the analyses are publicly available (https://osf.io/qns65/).
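The two levels described above can be illustrated with an observed-score analogue. The sketch below splits daily scores on a single item into person means (between-person part) and daily deviations from those means (within-person part). It is only an approximation of the approach in this study, where the between-person indicators are latent means estimated within the multilevel model rather than observed averages. The function name and the data are hypothetical.

```python
from collections import defaultdict


def split_within_between(records):
    """Decompose daily scores into person means and within-person deviations.

    records: list of (person_id, score) tuples, one per daily report.
    Returns (means, deviations), where means maps person_id to the person's
    average score and deviations is a list of (person_id, score - mean)
    in the same order as the input.
    """
    by_person = defaultdict(list)
    for pid, score in records:
        by_person[pid].append(score)
    means = {pid: sum(v) / len(v) for pid, v in by_person.items()}
    deviations = [(pid, score - means[pid]) for pid, score in records]
    return means, deviations


# Hypothetical reports: person "a" fluctuates, person "b" does not.
daily = [("a", 1), ("a", 3), ("b", 2), ("b", 2), ("b", 2)]
means, deviations = split_within_between(daily)
```

Here both persons have the same mean, so they are indistinguishable at the between-person level, while only person "a" contributes variance at the within-person level.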

The sample size adequacy for CFA models has not reached any consensus (Wolf et al., 2013), especially for multilevel CFAs. Various rules of thumb have been proposed for determining adequate sample sizes for CFA, including a minimum sample size of 100 or 200 and 10 cases per variable (Jackson, 2003; Wolf et al., 2013). Previous simulation studies on MLCFAs typically selected within-person-level sample sizes ranging from 5 to 50 and between-person-level sample sizes from 30 to 200 (Lin & Hsu, 2023; Padgett & Morgan, 2021). These studies demonstrated that numerous fit indices effectively discriminate between correctly specified and misspecified models when the between-person-level sample size exceeds 100 (Padgett & Morgan, 2021). Based on these guidelines, the current study possesses sufficient statistical power for conducting CFAs at the within-person level. At the between-person level, the sample sizes from the first wave of the university student sample have adequate power, the second wave is marginally powered, whereas the adolescent sample is underpowered.

Structural Validity

Multiple criteria were employed to assess the structural validity. Traditional model fit indices (Hu & Bentler, 1999) include the root mean square error of approximation (RMSEA), standardized root mean square residual (SRMR), comparative fit index (CFI), and Tucker–Lewis Index (TLI). Notably, CFI, TLI, RMSEA, and SRMR at the within-person level (SRMR_w) are not sensitive to the between-person-level misspecification (Hsu et al., 2015). Therefore, the SRMR at the between-person level (SRMR_b) was specifically used to assess model fit at the between-person level (Hsu et al., 2015; Padgett & Morgan, 2021). In addition, the conventional model fit criteria (e.g., CFI > .90, TLI > .90, SRMR < .08, and RMSEA < .05) should be applied to multilevel models with caution, as these cutoff criteria were originally proposed for single-level CFAs (Kim et al., 2016). The standard errors (SEs) of the factor loadings across different models were evaluated to indicate the precision of the parameters (Forbes et al., 2021).

To evaluate factor reliability, we applied the following indices. Although there are no universal cutoff criteria, we provide commonly used thresholds for interpreting these indices: The index of construct replicability (H Index) captures the proportion of variance accounted for by a latent factor divided by the variance that is not explained by the same factor, which “reflects the extent to which a latent variable is represented by its indicators and thus how likely it is to be replicated across studies” (p. 1288, Watts et al., 2019). H values > .80 for general factors and .70 for specific factors suggest well-defined latent variables (Rodriguez et al., 2016; Watts et al., 2019). For the single-factor and correlated factor models, omega total (ωt) and omega subfactor (ωs) were estimated, respectively, and values >.75 are considered acceptable (Revelle & Condon, 2019). For each bifactor model, the omega coefficients (ω, ω_s; Revelle & Condon, 2019) were first calculated, which indicate the proportion of variance in the observed general (ω) and specific factors (ω_s) scores that are attributed to all modeled sources of common variance. Scores >.75 are deemed acceptable. Subsequently, the omega hierarchical (ω_h) was estimated to indicate the proportion of total score variance that could be attributed to the p factor after accounting for all specific factors, while the omega hierarchical specific (ω_hs) reflects the proportion of subscale score variances that could be attributed to the specific factors after accounting for the p factor. Generally, ω_h/ω_hs > .50 indicates acceptable reliability. When ω_h is large (>.80) and ω_hs values are relatively low (<.50), it suggests that the p factor is more reliable than the specific factors (Reise et al., 2013). 
Explained common variance (ECV; Reise et al., 2013) is an index of the relative strength of factors and the extent of unidimensionality; it shows the percentage of common variance across all the indicators that is accounted for by the p factor. An ECV of the p factor > .70 demonstrates the significance of the p factor relative to the specific factors, and the structure could be interpreted as unidimensional (Hoffmann et al., 2024).
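The indices described above can be illustrated with a minimal Python sketch. It assumes standardized loadings from an orthogonal bifactor model in which every item loads on the p factor and on exactly one specific factor, and it uses the standard loading-based formulas for omega, omega hierarchical, ECV, and the H index. The function names and the example loadings are hypothetical and are not taken from the study.

```python
from collections import defaultdict


def h_index(loadings):
    """Construct replicability (H) from standardized loadings on one factor."""
    ratio = sum(lam ** 2 / (1 - lam ** 2) for lam in loadings)
    return 1 / (1 + 1 / ratio)


def resid(lam_g, lam_s):
    """Residual variance of a standardized item with orthogonal factors."""
    return 1 - lam_g ** 2 - lam_s ** 2


def bifactor_indices(items):
    """items: list of (subscale, general_loading, specific_loading)."""
    groups = defaultdict(list)
    for subscale, lam_g, lam_s in items:
        groups[subscale].append((lam_g, lam_s))

    gen_part = sum(g for _, g, _ in items) ** 2
    spec_part = sum(sum(s for _, s in rows) ** 2 for rows in groups.values())
    err_part = sum(resid(g, s) for _, g, s in items)
    total_var = gen_part + spec_part + err_part          # total score variance
    common_sq = sum(g ** 2 + s ** 2 for _, g, s in items)  # common variance

    out = {
        "omega": (gen_part + spec_part) / total_var,
        "omega_h": gen_part / total_var,
        "ecv_p": sum(g ** 2 for _, g, _ in items) / common_sq,
        "h_p": h_index([g for _, g, _ in items]),
        "specific": {},
    }
    for subscale, rows in groups.items():
        g_sq = sum(g for g, _ in rows) ** 2
        s_sq = sum(s for _, s in rows) ** 2
        sub_var = g_sq + s_sq + sum(resid(g, s) for g, s in rows)
        out["specific"][subscale] = {
            "omega_s": (g_sq + s_sq) / sub_var,
            "omega_hs": s_sq / sub_var,
            "ecv": sum(s ** 2 for _, s in rows) / common_sq,
            "h": h_index([s for _, s in rows]),
        }
    return out


# Hypothetical loadings: six items, two subscales, p = .60, specific = .30.
example = [("A", 0.6, 0.3)] * 3 + [("B", 0.6, 0.3)] * 3
result = bifactor_indices(example)
print({k: round(v, 3) for k, v in result.items() if k != "specific"})
```

In this toy case the p factor carries 80% of the common variance, while each specific factor retains little reliable variance once the p factor is partialled out, which mirrors the kind of pattern the thresholds above are meant to flag.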

Longitudinal and Multigroup MI

Longitudinal MI was tested in the university student sample using the WLSMV estimator. Unconstrained models were compared with those having factor loadings (metric MI), thresholds/intercepts (scalar MI), and residual variances (residual MI) constrained to be equivalent across two waves (Widaman et al., 2010). A decrease in CFI (ΔCFI) ≤ .01 together with a ΔRMSEA ≤ .015 indicates MI at the within-person level (Cheung & Rensvold, 2002; Khojasteh & Lo, 2015). A ΔSRMR_b ≤ .030 suggests invariance at the between-person level (Khojasteh & Lo, 2015).
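As a rough illustration of how these criteria operate, the sketch below applies them to the rounded fit values reported for the bifactor (4F) model in Table 4. It assumes that each constrained model is compared with the immediately preceding, less constrained model and that a positive difference means worse fit; both are assumptions of this sketch, and the function name is hypothetical.

```python
def invariance_step(less, more, cfi_drop_max=0.010,
                    rmsea_rise_max=0.015, srmr_b_rise_max=0.030):
    """Compare a more constrained model with a less constrained one."""
    d_cfi = round(less["cfi"] - more["cfi"], 3)           # positive = CFI got worse
    d_rmsea = round(more["rmsea"] - less["rmsea"], 3)     # positive = RMSEA got worse
    d_srmr_b = round(more["srmr_b"] - less["srmr_b"], 3)  # positive = SRMR_b got worse
    return {
        "delta_cfi": d_cfi,
        "delta_rmsea": d_rmsea,
        "delta_srmr_b": d_srmr_b,
        "within_ok": d_cfi <= cfi_drop_max and d_rmsea <= rmsea_rise_max,
        "between_ok": d_srmr_b <= srmr_b_rise_max,
    }


# Rounded values for the bifactor (4F) model as printed in Table 4.
bi4f = [
    ("unconstrained", {"cfi": 0.896, "rmsea": 0.011, "srmr_b": 0.085}),
    ("metric", {"cfi": 0.896, "rmsea": 0.011, "srmr_b": 0.093}),
    ("scalar", {"cfi": 0.894, "rmsea": 0.011, "srmr_b": 0.093}),
    ("residual", {"cfi": 0.893, "rmsea": 0.011, "srmr_b": 0.090}),
]

steps = {
    name: invariance_step(bi4f[i - 1][1], fit)
    for i, (name, fit) in enumerate(bi4f)
    if i > 0
}
for name, res in steps.items():
    print(name, res)
```

Because the inputs are rounded to three decimals, the differences computed here can deviate slightly from values derived from unrounded model output.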

We also conducted multigroup MI tests with all 11 overlapping items between the adolescent sample and the first wave of the university student sample to explore any potential structural differences across groups. Since Mplus does not support multigroup analysis with multilevel models for categorical items, the psychopathology items were treated as continuous variables, and the MLR estimator was used in this analysis. These 11 items were loaded on three narrower factors (i.e., emotional problems, peer problems, and conduct problems), which were directly used as latent factors, specific factors, or first-order factors in the CF, Bi, and HO models, respectively, without separate analyses for broader factor models (i.e., two-factor models). The same criteria used for longitudinal MI were applied to indicate MI between two groups.

Correlations Between Latent Factor Scores

In both samples, latent factor scores estimated in each psychopathology model were saved at each level. To evaluate the consistency of these factor scores, Pearson correlation analyses were conducted on these scores across different models. In the university student sample, the correlation analyses on the latent factors at the within-person level were performed for each wave separately, while the correlation analyses at the between-person level were conducted across waves.

Within- and Between-Person Criterion Validity

Factor scores were used as observed variables to examine criterion validity. At the within-person level, concurrent predictive criterion validity was evaluated by correlating each latent factor with same-day criterion measures. The between-person concurrent criterion validity was examined by incorporating the correlations between latent factors at the between-person level and person-average levels of validity variables, reflecting the associations between the random intercepts of these components across individuals.
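The distinction between the two levels can be sketched with a simple person-mean decomposition. This is only a manual approximation for illustration: the study obtained level-specific factor scores from multilevel models, whereas the sketch below splits observed scores into person means (between-person level) and daily deviations from those means (within-person level). The data and function names are hypothetical.

```python
from math import sqrt


def pearson(xs, ys):
    """Plain Pearson correlation between two equally long sequences."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    sxy = sum((a - mx) * (b - my) for a, b in zip(xs, ys))
    sxx = sum((a - mx) ** 2 for a in xs)
    syy = sum((b - my) ** 2 for b in ys)
    return sxy / sqrt(sxx * syy)


def within_between_r(diary):
    """diary: dict person -> list of (score, criterion) pairs, one per day."""
    mean_x, mean_y, dev_x, dev_y = [], [], [], []
    for days in diary.values():
        px = sum(x for x, _ in days) / len(days)
        py = sum(y for _, y in days) / len(days)
        mean_x.append(px)
        mean_y.append(py)
        # Daily deviations from the person's own average.
        dev_x.extend(x - px for x, _ in days)
        dev_y.extend(y - py for _, y in days)
    return pearson(dev_x, dev_y), pearson(mean_x, mean_y)


# Hypothetical diary: three people, three days each.
diary = {
    "A": [(1, 3), (2, 2), (3, 1)],
    "B": [(4, 6), (5, 5), (6, 4)],
    "C": [(7, 9), (8, 8), (9, 7)],
}
r_within, r_between = within_between_r(diary)
print(round(r_within, 2), round(r_between, 2))
```

The toy data are deliberately extreme: the same two variables are perfectly negatively related within persons and perfectly positively related between persons, which shows why the two levels need to be examined separately.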

Between-person prospective validity was examined in the university student sample only. Cross-lagged models were built using the between-person factor scores estimated in the psychopathology models and criterion validity variables measured from three baseline/follow-up surveys. Due to the availability of only two waves of psychopathology scores, we employed the conventional cross-lagged panel model instead of the random-intercept cross-lagged model (Hamaker et al., 2015), which requires at least three waves of data. Accordingly, the autoregressive paths reflect the extent to which the psychopathology factor scores and validity variables can be explained by the same construct from the previous wave. The cross-lagged paths reflect the variance in each validity variable at Wave 2 explained by each factor at Wave 1, and vice versa, while controlling for the autoregressive effects. For depressive and anxiety symptoms, and emotion regulation, we also examined the extent to which each factor at Wave 2 could account for the variance of the criterion validity variables assessed in Wave 3.
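The logic of a two-wave cross-lagged panel model can be sketched as two regressions, each predicting a Wave 2 variable from both Wave 1 variables. The sketch below uses ordinary least squares on observed scores with unstandardized coefficients; it is an illustration only and does not reproduce the study's estimation. All names and data are hypothetical, and the Wave 2 values are generated without noise so that the known coefficients are recovered exactly.

```python
def ols_two_predictors(y, x1, x2):
    """Slopes from regressing y on x1 and x2 (with an intercept)."""
    n = len(y)

    def center(v):
        m = sum(v) / n
        return [a - m for a in v]

    yc, a, b = center(y), center(x1), center(x2)
    saa = sum(u * u for u in a)
    sbb = sum(u * u for u in b)
    sab = sum(u * v for u, v in zip(a, b))
    say = sum(u * v for u, v in zip(a, yc))
    sby = sum(u * v for u, v in zip(b, yc))
    det = saa * sbb - sab ** 2  # zero if the two predictors are collinear
    b1 = (say * sbb - sby * sab) / det
    b2 = (sby * saa - say * sab) / det
    return b1, b2


def cross_lagged(p1, c1, p2, c2):
    """p = psychopathology factor score, c = criterion; 1/2 = wave."""
    p_auto, c_to_p = ols_two_predictors(p2, p1, c1)
    p_to_c, c_auto = ols_two_predictors(c2, p1, c1)
    return {"p_auto": p_auto, "c_to_p": c_to_p,
            "c_auto": c_auto, "p_to_c": p_to_c}


# Hypothetical Wave 1 scores for six people.
p1 = [0.1, -0.4, 0.8, 1.2, -0.9, 0.3]
c1 = [0.5, -0.2, 0.1, 0.9, -1.1, 0.4]
# Noise-free Wave 2 scores built from known paths.
p2 = [0.5 * p + 0.2 * c for p, c in zip(p1, c1)]
c2 = [0.3 * p + 0.6 * c for p, c in zip(p1, c1)]

paths = cross_lagged(p1, c1, p2, c2)
print({k: round(v, 3) for k, v in paths.items()})
```

Here `p_auto` and `c_auto` play the role of the autoregressive paths, and `p_to_c` and `c_to_p` play the role of the cross-lagged paths that are interpreted while controlling for the autoregressive effects.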

Results

Structural Validity

University Student Sample

We first estimated all psychopathology models with the same structure across within- and between-person levels in the university student sample. All models converged with reasonable estimates. Table 1 shows the traditional model fit indices. The CFI, TLI, RMSEA, and SRMR_w suggested that for both waves, at the within-person level, the Bi–4F and Bi–2F models demonstrated the best model fit (CFI = .833–.889, TLI = .781–.856, RMSEA = .022–.030, SRMR_w = .066–.076). The other four models showed unacceptable fit, despite the CF–4F and HO–4F models showing relatively better fit than the CF–2F and single-factor models at the within-person level. The SRMR_b indicated a consistent pattern at the between-person level: The Bi–4F and Bi–2F models again demonstrated the best model fit (SRMR_b = .062–.071), followed by the CF–4F and HO–4F models, with the CF–2F and single-factor model showing inferior model fit.

Table 1.

Fit Indices for the Multilevel Confirmatory Factor Analyses for Daily Psychopathology

Model	k	χ²	df	RMSEA	SRMR within/between	CFI	TLI
University students: Wave 1
Correlated factor (4F)	99	1,908.91	258	.032	.074/.074	.791	.752
Correlated factor (2F)	89	2,674.79	268	.038	.094/.077	.695	.652
Higher-order (4F)	95	1,931.56	262	.032	.074/.074	.789	.753
Single factor	87	3,597.52	270	.044	.103/.082	.579	.523
Bifactor (4F)	123	1,554.32	234	.030	.069/.062	.833	.781
Bifactor (2F)	123	1,540.44	235	.030	.066/.067	.835	.785
University students: Wave 2
Correlated factor (4F)	98	999.88	258	.027	.085/.072	.814	.779
Correlated factor (2F)	88	1,191.11	268	.030	.094/.084	.768	.735
Higher-order (4F)	94	1,002.85	262	.027	.086/.075	.814	.783
Single factor	86	1,342.85	270	.032	.098/.087	.731	.695
Bifactor (4F)	122	689.56	234	.022	.076/.069	.886	.850
Bifactor (2F)	121	676.64	235	.022	.075/.071	.889	.856
Adolescents
Correlated factor (4F)	97	308.97	258	.010	.110/.070	.952	.942
Correlated factor (2F)	87	357.50	268	.013	.131/.078	.915	.903
Higher-order (4F)	93	321.04	262	.010	.111/.071	.944	.934
Bifactor (4F)	121	281.38	234	.010	.098/.067	.955	.941

Note. All models were estimated with the WLSMV estimator. RMSEA = root mean square error of approximation; SRMR = standardized root mean square residual; CFI = comparative fit index; TLI = Tucker–Lewis Index; 4F = four factors; 2F = two factors.

The patterns of results obtained from the WLSMV and Bayes estimators yielded high consistency. Given the Bayes estimator’s notable advantages in estimating models with small sample sizes and its ability to generate factor scores for subsequent validity analyses, we primarily present results obtained with Bayes in the main text (results with WLSMV estimation available in Supplemental Materials). The standardized factor loadings for different models using the Bayes estimator are presented in Table 2 (Wave 1) and Supplemental Table S2 (Wave 2; see Supplemental Tables S3 and S4 for WLSMV estimation). In the CF–4F, CF–2F, and HO–4F models, all factor loadings were positive and significant, with almost all indicators showing loadings ≥.35 at both levels. The single-factor model revealed positive and significant loadings on the single factor at both levels, despite one item showing non-significant loadings and five indicators demonstrating loadings <.35. For the bifactor models, the p factors were strongly indicated by the items at the between level (median λ_b = .64 [Bi–2F in Wave 2] to .70 [Bi–2F in Wave 1]), while the loadings on p factors were relatively lower at the within-person level (median λ_w = .36 [Bi–4F in Wave 2] to .43 [Bi–2F in Wave 2]). At the between-person level, the Bi–2F model unexpectedly showed two significant negative factor loadings on the specific factors in Wave 1 and six in Wave 2. Although previous studies have reported that the occurrence of negative factor loadings on specific factors after the p factor is accounted for is not uncommon in bifactor models (e.g., Southward et al., 2022), the excessive number of significantly negative loadings renders the interpretation of the substantive meanings of the specific factor challenging, particularly in Wave 2.

Table 2.

Standardized Factor Loadings in the Models Among University Students (Wave 1; Bayes Estimator)

Note. All loadings significant at p < .05 except those in italic. Significantly negative factor loadings are in a shaded background. In all four-factor models, Items 1–6 loaded onto the emotional problem (specific) factor, Items 7–10 on the peer problem (specific) factor, Items 11–13 on the conduct problem (specific) factor, Items 14–18 on the hyperactivity (specific) factor. In all two-factor models, Items 1–10 loaded onto the internalizing (specific) factor, Items 11–18 on the externalizing (specific) factor. CF = correlated factors; HO = higher-order; Bi = bifactor; 4F = four factors; 2F = two factors; w = within-person level; b = between-person level.

Figure 2 illustrates the SEs of the factor loadings across models with the Bayes estimator in Wave 1 (see Supplemental Figure S1 for Wave 2). The factor loadings of the Bi–4F had larger SEs at both within- and between-person levels than the other models, especially for the specific factors (SE = .024–.078 at the within-person level, and SE = .050–.143 at the between-person level). Factor loadings in the Bi–2F model showed larger SEs exclusively at the between-person level (SE = .049–.125) but not at the within-person level (SE = .018–.038). Model estimates of data from the second wave showed a consistent result pattern.

The reliability indices with the Bayes estimator are shown in Table 3. Generally, the reliability at the between-person level was higher than that at the within-person level. The single-factor models were reliable and well-defined at both levels across waves. Within correlated factor models at the within-person level, only the emotional problems factor in CF–4F and the internalizing factor in CF–2F showed acceptable reliability (i.e., ω_s > .75). Other factors in CF–4F, and the externalizing factors in CF–2F, were not reliable across waves. At the between-person level, all factors in CF–4F and CF–2F exhibited good reliability and replicability.

Table 3.

Reliability Indices for the Models Among University Students

	Within							Between
Indicator	p	Emo	Peer	Con	Hyp	Int	Ext	p	Emo	Peer	Con	Hyp	Int	Ext
University students: Wave 1
Correlated factor (4F)
ω_s		.76	.50	.72	.67				.92	.70	.90	.84
H index		.81	.55	.75	.67				.93	.76	.92	.85
Correlated factor (2F)
ω_s						.76	.73						.91	.90
H index						.83	.74						.94	.93
Single-factor: ω_t	.77							.93
Bifactor (4F)
ω/ω_s	.86	.78	.53	.73	.69			.96	.93	.78	.91	.85
ω_h/ω_hs	.70	.13	.29	.40	.56			.89	.20	.21	.12	.20
ECV	.50	.08	.09	.12	.21			.76	.08	.06	.03	.08
H index	.85	.39	.18	.53	.74			.95	.59	.52	.28	.62
Bifactor (2F)
ω/ω_s	.83					.90	.84	.95					.97	.95
ω_h/ω_hs	.60					.30	.49	.90					.01	.21
ECV	.54					.18	.28	.81					.06	.13
H index	.84					.58	.76	.96					.46	.69
University students: Wave 2
Correlated factor (4F)
ω_s		.78	.44	.70	.51				.90	.69	.87	.86
H index		.83	.50	.73	.65				.93	.85	.95	.86
Correlated factor (2F)
ω_s						.76	.70						.92	.89
H index						.84	.77						.95	.93
Single-factor: ω_t	.81							.94
Bifactor (4F)
ω/ω_s	.82	.78	.37	.72	.62			.95	.91	.75	.87	.87
ω_h/ω_hs	.61	.36	.14	.41	.45			.85	.28	.24	.05	.37
ECV	.47	.16	.06	.12	.19			.72	.10	.06	.01	.11
H index	.76	.58	.36	.53	.76			.95	.65	.47	.09	.76
Bifactor (2F)
ω/ω_s	.81					.89	.82	.95					.96	.94
ω_h/ω_hs	.53					.41	.41	.90					.10	.09
ECV	.50					.24	.26	.78					.10	.12
H index	.80					.64	.73	.95					.58	.68

Note. All models were estimated with the Bayes estimator. Emo = emotional problems; Con = conduct problems; Hyp = hyperactivity; Peer = peer problems; Int = internalizing; Ext = externalizing; 4F = four factors; 2F = two factors; H index = index of construct replicability; ω_t = omega total; ω_h = omega hierarchical; ω_hs = omega hierarchical specific; ω = omega; ω_s = omega specific; ECV = explained common variance.

At both within- and between-person levels, the p factors accounted for the most variance in Bi–4F (within-person level: ECV_w = .47–.50; between-person level: ECV_b = .72–.76) and Bi–2F (ECV_w = .50–.54; ECV_b =.78–.81) compared to the specific factors. At the within-person level, the two bifactor models were somewhat multidimensional (i.e., ECV_w of p factor < .70; ω_h < .80). Nonetheless, among the specific factors, only the ω_hs of the hyperactivity subscale in Bi–4F in Wave 1 had acceptable reliability (i.e., ω_hs > .50), whereas the other specific factors did not reliably capture variances (i.e., ω_hs < .50). At the between-person level, the models were unidimensional to some extent (i.e., ECV of p factor > .70; ω_h > .80) across waves, and the replicability and reliability of the specific factors were relatively low (H index < .70 and ω_hs < .50), indicating that they should not be used in practice. The structural validity and reliability results with the Bayes and WLSMV estimators were highly similar (see Supplemental Tables S5 and S6 for reliability indices with WLSMV).
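The decision rules used in this paragraph can be made explicit with a small sketch applied to the Wave 1 bifactor (4F) values from Table 3. The thresholds are those stated in the text (ECV of the p factor > .70, ω_h > .80, ω_hs > .50). The function names are hypothetical, and treating mixed cases (only one criterion met) as multidimensional is an assumption of this sketch.

```python
def classify_structure(ecv_p, omega_h, ecv_cut=0.70, omega_h_cut=0.80):
    """Label a level as unidimensional only if both criteria are met."""
    if ecv_p > ecv_cut and omega_h > omega_h_cut:
        return "unidimensional"
    return "multidimensional"  # assumption: mixed cases fall here


def reliable_specifics(omega_hs, cut=0.50):
    """Names of specific factors whose omega hierarchical specific exceeds the cut."""
    return [name for name, value in omega_hs.items() if value > cut]


# Bifactor (4F), university students, Wave 1 (values from Table 3).
bi4f_wave1 = {
    "within": {
        "ecv_p": 0.50, "omega_h": 0.70,
        "omega_hs": {"Emo": 0.13, "Peer": 0.29, "Con": 0.40, "Hyp": 0.56},
    },
    "between": {
        "ecv_p": 0.76, "omega_h": 0.89,
        "omega_hs": {"Emo": 0.20, "Peer": 0.21, "Con": 0.12, "Hyp": 0.20},
    },
}

summary = {
    level: (classify_structure(v["ecv_p"], v["omega_h"]),
            reliable_specifics(v["omega_hs"]))
    for level, v in bi4f_wave1.items()
}
print(summary)
```

Applied to these values, the rules reproduce the pattern described above: a multidimensional within-person structure in which only hyperactivity passes the ω_hs threshold, and a unidimensional between-person structure with no reliable specific factor.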

Considering jointly the model fit indices, interpretability, precision, reliability, and replicability of each model, we decided to proceed with only the CF–4F, CF–2F, HO–4F, and Bi–4F models to examine replication in the adolescent sample.

Cross-level MI was tested by constraining the factor loadings of the same item at the within- and between-person levels to be equivalent in the CF–4F (Wave 1: ΔCFI = .013, ΔSRMR_b = .055; Wave 2: ΔCFI = .001, ΔSRMR_b = .059), CF–2F (Wave 1: ΔCFI = −.026, ΔSRMR_b = .058; Wave 2: ΔCFI = .007, ΔSRMR_b = .062), HO–4F (Wave 1: ΔCFI = −.028, ΔSRMR_b = .055; Wave 2: ΔCFI = .001, ΔSRMR_b = .059), and Bi–4F (Wave 1: ΔCFI = .002, ΔSRMR_b = .068; Wave 2: ΔCFI = .020, ΔSRMR_b = .086) models. Although the ΔCFI indicated MI in some models (i.e., ΔCFI < .01), a significant change in SRMR_b (>.035) suggested substantial differences in the between-person-level fit when constraining factor loadings to be equal across levels. Consequently, the results did not support MI across levels.

Next, we estimated different models at the within- and between-person levels to explore if psychopathology potentially demonstrates distinct structures across levels. The CF–4F, CF–2F, HO–4F, and Bi–4F models were fully crossed to enumerate all potential combined structures across levels. The fit indices (Supplemental Table S7) indicated that models with Bi–4F at the within-person level yielded the best fit to the data (Wave 1: CFI = .827–.833, TLI = .781–.789, RMSEA = .029–.030; Wave 2: CFI = .879–.886, TLI = .850–.857, RMSEA = .022). Across models, Bi–4F at the between-person level consistently showed the best fit (Wave 1: SRMR_b = .062–.066; Wave 2: SRMR_b = .069). Overall, the model with Bi–4F at both within- and between-person levels showed the most favorable fit, although this model’s fit still did not meet the traditional standard for fit indices.

Adolescent Sample

Models with CF–4F, CF–2F, HO–4F, and Bi–4F structures across levels were first estimated among the adolescent sample. The traditional model fit indices (Table 1) indicated that the CF–4F, HO–4F, and Bi–4F models yielded acceptable-to-good fit, with the CF–2F exhibiting a less favorable yet still acceptable model fit. The Bi–4F and CF–4F yielded better fit than HO–4F and CF–2F at the within-person level (higher CFI, TLI, and similar RMSEA), while the Bi–4F showed a better fit than the other three models at the between-person level (lower SRMR_b). Supplemental Table S9 shows the fit indices of models with different structures across levels. Models with CF–4F and Bi–4F at the within-person level showed comparable and superior fit indices (CFI = .951–.955, TLI = .941–.943, RMSEA = .010). At the between-person level, the Bi–4F model (SRMR_b = .067) yielded better model fit than the others.

All items in the CF–4F, CF–2F, and HO–4F models had significant and positive factor loadings, with the exception of one item showing non-significant loading in the CF–2F model (see Supplemental Tables S10 and S11 for estimation with Bayes and WLSMV, respectively). The p factor in the Bi–4F model had stronger factor loadings at the between-person level (median λ_b = .82) than at the within-person level (median λ_w = .36). Numerous non-significant factor loadings emerged within the specific factors of the Bi–4F model. Consistent with the university student sample, the specific factors in the Bi–4F model showed larger SEs than p factors at both levels (Supplemental Figure S2).

The reliability indices for models estimated in the adolescent sample demonstrated generally consistent patterns with those in the university student sample (Supplemental Table S12). The CF–4F and CF–2F models exhibited relatively lower reliability at the within-person level (ω_s < .75) but acceptable replicability (H index > .70) except for the peer problem factor in the CF–4F. In contrast, all factors demonstrated superior reliability and replicability at the between-person level. The Bi–4F model showed multidimensionality at the within-person level (i.e., ECV of p factor < .70), but the specific factors were not reliable in accounting for the variances (ω_hs < .50). The cross-level MI test did not indicate MI across levels for any of the models: CF–4F (ΔCFI = .019, ΔSRMR_b = .042), CF–2F (ΔCFI = .012, ΔSRMR_b = .040), HO–4F (ΔCFI = .020, ΔSRMR_b = .045), or Bi–4F (ΔCFI = .009, ΔSRMR_b = .032).

Longitudinal and Multigroup MI

Longitudinal MI results are presented in Table 4. All four models reached metric invariance based on ΔCFI ≤ .01 and ΔRMSEA ≤ .015. At the between-person level, these models showed larger changes in the metric invariance tests (ΔSRMR_b = .003–.012) than in the scalar and strict tests (ΔSRMR_b = −.001 to .001). Nonetheless, all ΔSRMR_b were still below the criterion of a decrease ≤.030 and indicated invariance at the between-person level.

Table 4.

Longitudinal Measurement Invariance Tests Among University Students

Model	χ²	df	CFI	TLI	RMSEA	SRMR within/between
Correlated factor–4 factors
Unconstrained model	2,031.42	1,113	.864	.846	.013	.066/.077
+ Factor loadings constrained	2,054.60	1,141	.864	.850	.012	.067/.086
+ Thresholds constrained	2,094.15	1,167	.863	.852	.012	.067/.086
+ Residuals constrained	2,118.40	1,185	.862	.853	.012	.067/.086
Correlated factor–2 factors
Unconstrained model	2,412.78	1,159	.814	.798	.015	.076/.082
+ Factor loadings constrained	2,436.74	1,191	.815	.805	.014	.078/.093
+ Thresholds constrained	2,469.85	1,219	.814	.808	.014	.078/.093
+ Residuals constrained	2,495.17	1,237	.813	.810	.014	.078/.091
Higher-order–4 factors
Unconstrained model	2,183.52	1,160	.848	.835	.013	.071/.082
+ Factor loadings constrained	2,261.42	1,194	.842	.833	.013	.074/.094
+ Thresholds constrained	2,298.04	1,220	.840	.835	.013	.074/.094
+ Residuals constrained	2,322.96	1,237	.839	.836	.013	.074/.092
Bifactor–4 factors
Unconstrained model	1,838.68	1,117	.896	.879	.011	.064/.085
+ Factor loadings constrained	1,869.80	1,168	.896	.888	.011	.065/.093
+ Thresholds constrained	1,904.01	1,192	.894	.888	.011	.065/.093
+ Residuals constrained	1,928.12	1,210	.893	.889	.011	.065/.090

Note. All models were estimated with the WLSMV estimator. In the unconstrained models, the means of latent factor scores at both within- and between-person levels in Wave 1 were fixed to 0, and the variances at both levels in Wave 1 were fixed to 1. The first factor loading of each latent factor was free to estimate, while corresponding first loadings at the same level were constrained to be equal across Waves 1 and 2. The first threshold of each latent factor was constrained to be invariant across time at each level (Widaman et al., 2010). In the following constrained models, the factor loadings were constrained at both within- and between-person levels, while the thresholds and residuals were only constrained at the between-person level. RMSEA = root mean square error of approximation; SRMR = standardized root mean square residual; CFI = comparative fit index; TLI = Tucker–Lewis Index.

The results of the multigroup MI between the adolescent sample and the first wave of the university student sample are shown in Supplemental Table S13. At the within-person level, neither the correlated factor model nor the bifactor model exhibited metric (CF: ΔCFI = .019; Bi: ΔCFI = .082), scalar (CF: ΔCFI = .018; Bi: ΔCFI = .024), or strict MI (CF: ΔCFI = .099; Bi: ΔCFI = .063). At the between-person level, both models achieved metric (CF: ΔSRMR_b = .025; Bi: ΔSRMR_b = .022) and scalar (CF: ΔSRMR_b = −.001; Bi: ΔSRMR_b = .003) MI, but not strict MI (CF: ΔSRMR_b = .041; Bi: ΔSRMR_b = .038). The higher-order models yielded improper solutions with negative residual variances on the first-order latent factors and were not explored further.

Correlations Between Latent Factor Scores

In the university student sample, at the within-person level (Table 5), the p factors derived from the Bi–4F and HO–4F models showed strong correlations within waves (Wave 1: r = .97; Wave 2: r = .94). These two p factors also showed strong correlations with four factors in the CF–4F models (Wave 1: rs = .65–.98; Wave 2: rs = .72–.95) and two factors in the CF–2F models (Wave 1: rs = .79–.98; Wave 2: rs = .89–.95). The associations between these two p factors and the specific factors derived from the Bi–4F model were much weaker (Wave 1: rs = .10–.39; Wave 2: rs = .05–.60). At the between-person level (Table 5), all factors in the CF–4F (rs = .42–.59) and CF–2F (rs = .53–.58) models showed good consistency across waves. The p factors in the HO–4F (r = .50) and Bi–4F (r = .56) models were also consistent across waves. However, the specific factors were not as consistent across time. The hyperactivity factors estimated in the two waves were not correlated with each other (r = .04, ns), and the other three factors showed moderate associations (rs = .30–.45).

Table 5.

Correlations Between Latent Factor Scores Among University Students at the Within- and Between-Person Level

Factor	1	2	3	4	5	6	7	8	9	10	11	12	13	14	15	16	17	18	19	20	21	22	23	24
1 CF–4F Emo Wave 1	–	.84^***	.64^***	.90^***	–	–	–	–	.98^***	.78^***	–	–	.98^***	–	.44^***	.11^***	.13^***	.13^***	.97^***	–	–	–	–	–
2 CF–4F Con Wave 1	.88^***	–	.64^***	.81^***	–	–	–	–	.81^***	.86^***	–	–	.87^***	–	.21^***	.58^***	.19^***	.13^***	.84^***	–	–	–	–	–
3 CF–4F Hyp Wave 1	.83^***	.93^***	–	.54^***	–	–	–	–	.65^***	.91^***	–	–	.66^***	–	.18^***	.13^***	.72^***	.05^**	.65^***	–	–	–	–	–
4 CF–4F Pee Wave 1	.92^***	.95^***	.86^***	–	–	–	–	–	.89^***	.67^***	–	–	.91^***	–	.31^***	.16^***	.08^***	.40^***	.88^***	–	–	–	–	–
5 CF–4F Emo Wave 2	.57^***	.54^***	.54^***	.53^***	–	.82^***	.63^***	.76^***	–	–	.98^***	.81^***	–	.95^***	–	–	–	–	–	.75^***	.15^***	.08^**	.05	.88^***
6 CF–4F Con Wave 2	.49^***	.54^***	.49^***	.54^***	.85^***	–	.57^***	.78^***	–	–	.80^***	.85^***	–	.88^***	–	–	–	–	–	.43^***	.60^***	.11^***	.10^***	.83^***
7 CF–4F Hyp Wave 2	.50^***	.53^***	.59^***	.50^***	.82^***	.76^***	–	.57^***	–	–	.64^***	.86^***	–	.72^***	–	–	–	–	–	.20^***	.03	.36^***	.01	.79^***
8 CF–4F Pee Wave 2	.46^***	.49^***	.42^***	.51^***	.83^***	.96^***	.67^***	–	–	–	.78^***	.70^***	–	.84^***	–	–	–	–	–	.37^***	.20^***	.04	.34^***	.82^***
9 CF–2F Int Wave 1	.98^***	.90^***	.85^***	.94^***	.58^***	.50^***	.51^***	.47^***	–	.79^***	–	–	.98^***	–	.44^***	.08^***	.15^***	.15^***	.96^***	–	–	–	–	–
10 CF–2F Ext Wave 1	.88^***	.98^***	.96^***	.92^***	.55^***	.53^***	.56^***	.47^***	.91^***	–	–	–	.82^***	–	.20^***	.38^***	.55^***	.08^***	.79^***	–	–	–	–	–
11 CF–2F Int Wave 2	.56^***	.52^***	.52^***	.52^***	.98^***	.87^***	.82^***	.85^***	.56^***	.54^***	–	.83^***	–	.95^***	–	–	–	–	–	.74^***	.13^***	.11^***	.04	.89^***
12 CF–2F Ext Wave 2	.52^***	.56^***	.58^***	.54^***	.89^***	.90^***	.95^***	.82^***	.53^***	.58^***	.90^***	–	–	.89^***	–	–	–	–	–	.39^***	.35^***	.34^***	.01	.89^***
13 HO–4F p Wave 1	.93^***	.98^***	.93^***	.97^***	.56^***	.54^***	.54^***	.50^***	.94^***	.98^***	.55^***	.57^***	–	–	.39^***	.18^***	.17^***	.17^***	.97^***	–	–	–	–	–
14 HO–4F p Wave 2	.47^***	.48^***	.46^***	.50^***	.85^***	.93^***	.75^***	.94^***	.49^***	.48^***	.87^***	.86^***	.50^***	–	–	–	–	–	–	.60^***	.26^***	.13^***	.10^***	.94^***
15 Bi–4F Emo Wave 1	.47^***	.11	.13	.18^*	.25^***	.07	.12	.07	.42^***	.15^*	.22^**	.10	.18^**	.10	–	−.14^***	.02	.03	.28^***	–	–	–	–	–
16 Bi–4F Con Wave 1	.05	.42^***	.26^***	.21^**	.09	.20^**	.10	.13	.07	.37^***	.08	.16^*	.30^***	.08	−.30^***	–	.05^**	−.03	.15^***	–	–	–	–	–
17 Bi–4F Hyp Wave 1	−.05	.13	.38^***	.02	.06	.06	.19^*	−.01	−.04	.22^**	.04	.15^*	.11	.03	−.26^***	.12	–	.01	.10^***	–	–	–	–	–
18 Bi–4F Pee Wave 1	.12	.12	.06	.26^***	.07	.10	.04	.16^*	.15^*	.07	.09	.06	.15^*	.12	.03	−.11	−.10	–	.11^***	–	–	–	–	–
19 Bi–4F p Wave 1	.95^***	.97^***	.91^***	.97^***	.57^***	.54^***	.54^***	.51^***	.96^***	.96^***	.56^***	.57^***	.98^***	.52^***	.20^**	.20^**	.05	.11	–	–	–	–	–	–
20 Bi–4F Emo Wave 2	.31	.16^*	.20^**	.16^*	.55^***	.09	.29^***	.10	.29^***	.18^*	.49^***	.23^**	.19^**	.16^*	.45^***	−.08	−.09	.02	.20^***	–	−.06^**	.02	−.08^**	.42^***
21 Bi–4F Con Wave 2	.02	.11	.01	.07	.02	.38^***	−.01	.25^***	−.02	.07	.04	.20^**	.07	.18^*	−.15^*	.30^***	−.03	.01	.03	−.34^***	–	.06^**	−.01	.20^***
22 Bi–4F Hyp Wave 2	.24^**	.20^**	.32^***	.17^*	.28^***	.11	.62^***	.04	.23^**	.25^***	.28^***	.41^***	.21^***	.15^*	.17^*	−.07	.04	−.01	.23^***	.34^***	−.28^***	–	−.10^***	.05
23 Bi–4F Pee Wave 2	.04	.02	−.02	.08	.06	.15^*	−.05	.31^***	.04	−.01	.08	−.01	.03	.25^***	.01	−.10	−.13	.36^***	.04	−.07	−.07	−.01	–	.09^***
24 Bi–4F p Wave 2	.51^***	.55^***	.52^***	.55^***	.88^***	.97^***	.80^***	.94^***	.52^***	.55^***	.90^***	.93^***	.56^***	.94^***	.06	.16^*	.12	.09	.56^***	.11	.25^***	.12	.10	–

Note. Between-level correlations are presented under the diagonal, and within-level correlations are presented above the diagonal. CF = correlated factors; HO = higher-order; Bi = bifactor; 4F = four factors; 2F = two factors; Emo = emotional problems; Con = conduct problems; Hyp = hyperactivity; Pee = peer problems; Int = internalizing; Ext = externalizing.

*p < .05. **p < .01. ***p < .001.

In the adolescent sample (Supplemental Table S14), the p factor derived from the Bi–4F and HO–4F models showed strong correlations (r_w = .86; r_b = .96). These two factors showed strong associations with narrower factors in the CF–4F (r_ws = .72–.95; r_bs = .91–.96) and CF–2F (r_ws = .78–.95; r_bs = .92–.97) models, whereas their associations with specific factors in the Bi–4F model were notably weaker (r_ws = .19–.64; r_bs = .06–.37).

Within- and Between-Person Criterion Validity

University Student Sample

The concurrent criterion validity analyses (Supplemental Table S15) indicated that all factors in the CF–4F model were linked with three criterion validity variables at both within- (r_ws = .28–.50) and between-person levels (r_bs = .41–.86). The factors in the CF–2F model exhibited similar results (r_ws = .41–.51; r_bs = .48–.87). Generally, the associations were stronger at the between-person level than at the within-person level. In the HO–4F and Bi–4F models, the p factor demonstrated strong associations with daily hassles, emotion-regulation difficulties, and neuroticism at the within-person level, and even stronger associations at the between-person level across waves (r_ws = .38–.51, r_bs = .44–.87). The links between specific factors in the Bi–4F model and criterion validity variables at the within-person level were not robust across waves and were sometimes contrary to typical expectations. For instance, higher levels of emotional problems were associated with higher levels of same-day emotion-regulation difficulties (r = .13, p = .002) and neuroticism (r = .07, p = .005) in Wave 2, but with lower levels of same-day emotion-regulation difficulties (r = −.07, p = .005) and neuroticism (r = −.04, p = .008) in Wave 1. The associations between specific factors and criterion validity variables at the between-person level were generally consistent with the patterns one would expect based on the literature.

The concurrent and prospective criterion validity analyses with cross-lagged models (Supplemental Figures S3 and S4) showed that the p factor at the between level estimated in the Bi–4F model had only concurrent but not prospective associations with all criterion validity variables. The p factor estimated from the HO–4F models exhibited prospective associations with anxiety symptoms and neuroticism. Supplemental Table S16 presents the correlation analyses between the latent factors from each psychopathology model and impulsivity scores measured in Wave 2. The p factors estimated in both waves were correlated with the total impulsivity score (rs = .29–.34) and were also associated with scores of lack of premeditation (rs = .12–.19) and positive urgency (rs = .30–.38).

Adolescent Sample

All factors in the CF–4F and CF–2F models were positively correlated with impulsivity (r_w = .16–.20; r_b = .45–.51) and daily hassles (r_w = .28–.38; r_b = .70–.77) at both within- and between-person levels (Supplemental Table S17). The within-person level results indicated that higher-than-average impulsivity and hassles were associated with higher-than-average p factors on the same day in HO–4F (r_w = .18–.33) and Bi–4F (r_w = .24) models. At the between-person level, person-average levels of impulsivity and hassles were positively related to p factors in the HO–4F (r_b = .51–.64) and Bi–4F (r_b = .46–.71) models. Concerning emotion regulation, all factors in the CF–4F and CF–2F models had negative correlations with reappraisal (r_b = −.67 to −.62) and suppression (r_b = −.48 to −.44) at the between-person level, with only one factor negatively associated with reappraisal at the within-person level in each model. In the HO–4F and Bi–4F models, higher-than-average p factors were associated with lower-than-average reappraisal (r_ws = −.10) but not suppression (r_ws = −.04 to −.02) at the within-person level. At the between-person level, p factors were negatively associated with person-average levels of reappraisal (r_bs = −.68 to −.64) and suppression (r_bs = −.54 to −.47).

Discussion

Previous investigations on the underlying structure of psychopathology symptoms focused primarily on the between-person level and on adult samples. This study used two independent samples with month-long daily diary designs to investigate the daily psychopathology structure at both within- and between-person levels among adolescents and young adults on a more refined micro timescale (i.e., daily). We evaluated four different types of psychopathology structure commonly examined and supported in the literature (a correlated factor model, a higher-order model, a bifactor model, and a single-factor model) based on model fit, structural properties and reliability, longitudinal MI, correlations between latent factor scores, and concurrent and prospective criterion validity. We also assessed multiple combinations of models with varying structures at different levels to explore whether their performance varies across within- and between-person levels. These comparisons shed light on which specific models might be more suitable for addressing specific research questions in future research. To the best of our knowledge, this is the first study to comprehensively compare psychopathology models at different levels with daily diary designs.

Overall, the traditional fit indices exhibit a preference for the Bi–4F model at both within- and between-person levels, which is consistent with prior between-person-level studies (Clark et al., 2021; Laceulle et al., 2015). However, caution is warranted as traditional fit indices always tend to favor the more flexible models (Greene et al., 2019; Reise et al., 2016). Notably, the p factors derived from the Bi–4F and HO–4F models were similarly associated with the criterion variables. This observation is further corroborated by strong and consistent associations between these two p factors, suggesting a high degree of similarity or interchangeability between the p factors estimated in the higher-order and bifactor models (Clark et al., 2021). These two models together suggest the emergence of the general p factor at the within-person level, which is generally recognized as indicating a common general psychopathology trait in traditional between-person-level research. However, the interpretation of this trait may differ across levels. At the between-person level, this trait is indicative of an individual’s tendency to experience comorbid psychopathology symptoms relative to others (Carragher et al., 2016), which is crucial for understanding how the co-occurrence differs between individuals over a more extended period (e.g., months). In contrast, at the within-person level, this trait is more likely to reflect the shared variability or the fluctuation patterns that occur across all psychopathology symptoms. That is, when a particular symptom is higher than its average level in an adolescent or young adult on any given day, other symptoms are also more likely to be higher than their average level on the same day (Wright et al., 2023). 
This finding has significant implications for tracking and monitoring the fluctuations of daily psychopathology symptoms, as it underscores the interconnectedness and potential mutual influences of these symptoms in the daily lives of adolescents and young adults. The existence of p factors at both levels underscores the need for a holistic approach in developing prevention strategies. Instead of merely addressing specific symptoms, efforts should consider all symptoms as an integrated system. The p factor at the between-person level can help identify individual factors to answer the question of who is more vulnerable to experiencing a wide range of symptoms, while the p factor at the within-person level can pinpoint daily contextual factors that influence the fluctuations of multiple symptoms to answer the question of when individuals are most vulnerable to these symptoms (Zheng et al., 2023).
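The distinction between a person's typical level and that person's daily departures from it can be illustrated with observed scores. The following is a minimal sketch, assuming a hypothetical list of (person, day, score) records; the study itself estimated this split with latent multilevel models, so the manifest person-mean centering shown here is only a stand-in for illustration.

```python
from collections import defaultdict
from statistics import mean


def decompose(records):
    """Split daily scores into between- and within-person parts.

    records: list of (person_id, day, score) tuples (hypothetical layout).
    Returns (person_means, deviations), where deviations maps
    (person_id, day) to the score minus that person's own mean.
    """
    by_person = defaultdict(list)
    for person, _day, score in records:
        by_person[person].append(score)
    # Between-person part: each person's average across all reported days.
    person_means = {p: mean(scores) for p, scores in by_person.items()}
    # Within-person part: how far each day sits from the person's own average.
    deviations = {(p, d): s - person_means[p] for p, d, s in records}
    return person_means, deviations


# Made-up example data, not taken from the study.
diary = [("a", 1, 2.0), ("a", 2, 4.0), ("a", 3, 3.0), ("b", 1, 0.0), ("b", 2, 2.0)]
person_means, deviations = decompose(diary)
```

In this toy example, person "a" has a higher average than person "b" (a between-person difference), while the positive deviation for person "a" on day 2 marks a day on which that person scored above their own usual level (a within-person fluctuation).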

Models with the same within-person-level structures demonstrate similar CFI, TLI, and RMSEA regardless of their between-person-level structures. This observation supports the claim of Hsu et al. (2015) that these indices may not be sensitive to between-person-level misspecification. A simulation study also suggested that these indices are primarily affected by misspecification at the within-person level and are only partially affected by misspecification at the between-person level (Padgett & Morgan, 2021). However, none of the models in the university sample met the traditional fit criteria of CFI and TLI (Hsu et al., 2015; Hu & Bentler, 1999). The Bi–4F model in the adolescent sample met the traditional model fit criteria but did not meet the stricter criteria for multilevel models suggested by Padgett and Morgan (2021), who recommended CFI > .98, TLI > .98, SRMR_w < .02, and RMSEA < .02 for samples smaller than 100. One possible explanation is that existing psychopathology models may not adequately capture the complexities of symptom fluctuations. Another possibility is that these very high cutoff criteria, designed for general multilevel models (e.g., classrooms nested in schools; Padgett & Morgan, 2021), may not be entirely suitable for daily diary data. Such data exhibit their own systematic complexity (e.g., they typically show lower intraclass correlations than other multilevel or nested data) and may require more sensitive indices and criteria, which await future psychometric investigation.
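For readers less familiar with the intraclass correlation mentioned above, the sketch below computes ICC(1) from a one-way random-effects ANOVA. It assumes continuous scores and a balanced design (the same number of days for every person); the function name and data layout are hypothetical, and the study's categorical indicators and unbalanced diaries would call for model-based estimates instead.

```python
from statistics import mean


def icc1(groups):
    """ICC(1) from a one-way random-effects ANOVA, balanced data only.

    groups: list of equal-length lists, one list of daily scores per person.
    """
    k = len(groups[0])  # days per person
    if k < 2 or any(len(g) != k for g in groups):
        raise ValueError("this sketch assumes >= 2 days and equal days per person")
    n = len(groups)  # number of persons
    if n < 2:
        raise ValueError("need at least two persons")
    grand = mean(x for g in groups for x in g)
    # Mean square between persons and mean square within persons.
    ms_between = k * sum((mean(g) - grand) ** 2 for g in groups) / (n - 1)
    ms_within = sum((x - mean(g)) ** 2 for g in groups for x in g) / (n * (k - 1))
    return (ms_between - ms_within) / (ms_between + (k - 1) * ms_within)
```

A low value means that most of the variance lies in day-to-day fluctuation within persons rather than in stable differences between persons, which is the property of diary data that the paragraph above points to.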

None of the structures exhibited MI across levels. Consequently, it cannot be assumed that the same constructs are measured at the within- and between-person levels. This indicates that differences in latent factors between individuals are not merely aggregations of daily measures from each individual (Kim et al., 2016). Even if the structures appear similar in their configuration (e.g., both in the Bi–4F model), they may actually represent distinct factor structures with different factor loadings (Grosz, 2024). Therefore, the models should be interpreted separately at each level. In bifactor models, psychopathology in adolescents and young adults tends to be relatively unidimensional at the between-person level, as evidenced by the p factor explaining over 70% of the total variance across samples/waves (Forbes et al., 2021; Reise et al., 2013). Future studies employing intensive longitudinal designs might confidently use the p factor of psychopathology without necessarily focusing on the orthogonal specific factors at the between-person level. At the within-person level, nonetheless, the reliability indices of the p factors support the multidimensionality of psychopathology, highlighting the necessity of estimating specific factors to provide a more comprehensive understanding of the data (Reise et al., 2013, 2016). These specific factors are intended to identify residual clusters of risk for more narrowly defined psychopathology symptoms (e.g., emotional symptoms) that are not captured by the general p factor (Carragher et al., 2016; Clark et al., 2021). However, the large SEs and low reliability and replicability of the specific factors in the bifactor models, congruent with findings from previous studies at the between-person level (Clark et al., 2021; Forbes et al., 2021; Smith et al., 2020), cast doubt on their applicability in practical analyses.
While specific factors are likely present, it is plausible that they do not exist as orthogonal factors in addition to the general p factor among adolescents and young adults in their day-to-day lives. The orthogonal specific factors observed at the within-person level in this study may primarily reflect absorbed measurement errors rather than true dimensions. Therefore, the challenge for future researchers lies in addressing the multidimensional nature of the psychopathology structures at the within-person level. There is an urgent need for the development of new fine-grained models capable of deriving both p factors and stable and reliable specific factors when needed (Watts et al., 2024).
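Two indices commonly used to judge how unidimensional a bifactor solution is are the explained common variance (ECV) and omega hierarchical. The sketch below computes both from standardized loadings, assuming orthogonal factors and residual variances equal to one minus each item's squared loadings. The function name, data layout, and loading values are hypothetical, and whether the 70% figure reported above corresponds exactly to one of these indices is an assumption rather than something this sketch establishes.

```python
def bifactor_indices(general, specific):
    """Return (ECV, omega hierarchical) for a standardized bifactor solution.

    general: list of loadings of every item on the general factor.
    specific: dict mapping a specific-factor name to {item_index: loading}.
    Assumes orthogonal factors and standardized items.
    """
    gen_sq = sum(l ** 2 for l in general)
    spec_sq = sum(l ** 2 for loads in specific.values() for l in loads.values())
    # ECV: share of the common variance attributable to the general factor.
    ecv = gen_sq / (gen_sq + spec_sq)

    # Residual variance per item: 1 minus all squared loadings of that item.
    residuals = [1.0 - l ** 2 for l in general]
    for loads in specific.values():
        for i, l in loads.items():
            residuals[i] -= l ** 2

    gen_part = sum(general) ** 2
    spec_part = sum(sum(loads.values()) ** 2 for loads in specific.values())
    # Omega hierarchical: share of total-score variance due to the general factor.
    omega_h = gen_part / (gen_part + spec_part + sum(residuals))
    return ecv, omega_h


# Made-up loadings for four items and two specific factors.
ecv, omega_h = bifactor_indices(
    [0.7, 0.7, 0.7, 0.7],
    {"s1": {0: 0.3, 1: 0.3}, "s2": {2: 0.3, 3: 0.3}},
)
```

Higher values on both indices point toward a structure that is well summarized by the general factor alone, whereas lower values suggest that the specific factors carry non-negligible variance.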

Integrating intensive longitudinal data collected on micro timescales (day to day) with traditional longitudinal designs on macro timescales (years), this study conducted longitudinal measurement invariance tests to explore the structural stability of daily psychopathology at both within- and between-person levels. All three models met the strictest criteria for longitudinal MI, indicating that the structure of daily psychopathology symptoms remained stable over time. This structural stability at the within-person level suggests that the fluctuation patterns of symptoms also remained invariant throughout the study period, indicating a consistent pattern of symptom expression in daily life. Furthermore, the construct equivalence allows for the comparison of latent factor scores, ensuring that any observed differences reflect true changes in the constructs themselves rather than variations in measurement properties (Choate et al., 2023; Forbes et al., 2021). For instance, the p factor was consistently identified at both levels over the two and a half years, maintaining a comparable form with the same meaning (Conway et al., 2022; Gluschkoff et al., 2019). Future studies could not only compare the magnitude of p factor scores over time but also calculate the short-term variability of these factors within each wave (e.g., intraindividual variability, mean of the squared successive differences; Sliwinski, 2011; Zheng & Goulter, 2024) to examine how the aggregated variability scores change over longer time periods (e.g., decreased or increased daily p factor fluctuations over years). Such analyses will offer a novel understanding of the manifestation, persistence, and temporal changes of psychopathology factors across shorter and longer timescales.
It is important to note that, because the multilevel models in the current study primarily used categorical indicators, scalar and strict MI could be tested only at the between-person level; thresholds and residual variances are not available at the within-person level.
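The two variability indices named above can be computed per person and per wave from a series of daily scores. The following is a minimal sketch, assuming the scores are ordered by day with no missing days in between; the function names are hypothetical, and gaps in a real diary would need explicit handling before successive differences are taken.

```python
from statistics import stdev


def mssd(series):
    """Mean of the squared successive differences for one person's daily scores."""
    if len(series) < 2:
        raise ValueError("need at least two observations")
    diffs = [later - earlier for earlier, later in zip(series, series[1:])]
    return sum(d * d for d in diffs) / len(diffs)


def intraindividual_sd(series):
    """Sample standard deviation of one person's daily scores."""
    return stdev(series)


# Made-up daily factor scores for one person within one wave.
scores = [1.0, 3.0, 2.0, 2.0]
wave_mssd = mssd(scores)
wave_isd = intraindividual_sd(scores)
```

The intraindividual standard deviation ignores the order of days, whereas the mean of the squared successive differences is sensitive to how abruptly scores change from one day to the next, so the two can diverge for the same person.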

Considering the longitudinal MI jointly with the multigroup measurement non-invariance across samples—which emphasizes that the same models/structures applied to different age groups are not equivalent, especially at the within-person level—these findings collectively indicate that the factor structures may demonstrate substantial change during the transition from adolescence to young adulthood, while remaining relatively stable within young adulthood. This finding supports the notion that the manifestation of psychopathology may exhibit both developmental continuity and discontinuity across the life span (Forbes et al., 2024). It also implies that current factor structures are specific to the age groups examined in the current study. Moreover, during the transition to adulthood, it is necessary to expand the item pool to include a wider range of psychopathology symptoms, especially those related to thought disorders (Smith et al., 2020). These additional symptoms may impact the structure and the magnitude of correlations between latent factors (Forbes et al., 2021; Hamlat et al., 2019). In addition, the longitudinal MI tests were performed exclusively on a university student sample, representing only a segment of young adulthood. Future studies should cover a broader span of adolescence and adulthood for a full picture of the development of psychopathology structures.

Building upon and extending previous findings, this study found associations between the p factor and external validity variables including daily hassles, emotion-regulation difficulties, neuroticism, and impulsivity at both within- and between-person levels (Southward et al., 2022). These findings support the assumption that psychopathology symptoms share common antecedents, correlates, and consequences (Caspi et al., 2014; Hamlat et al., 2019; Hankin et al., 2017). Notably, the link between daily hassles and same-day p factor potentially indicates that adolescents and young adults are sensitive to daily mundane life experiences, highlighting a critical shared etiological factor of psychopathology symptoms. The extent to which this link spans various age ranges remains to be examined and warrants further investigation. In addition, at the within-person level, we focused on the relations between daily psychopathology factors and same-day external validity variables, which limits any causal conclusions regarding the directions of these associations. For instance, it is yet to be explored whether the p factor caused emotion dysregulation or was impacted by it. At the between-person level, the association between p factors and concurrent and subsequent impulsivity supports the notion that the p factor of psychopathology is a strong predictor of important outcomes in clinical research (Caspi et al., 2014; Forbes et al., 2019, 2021). This finding further emphasizes the significance of adopting transdiagnostic treatment and comprehensive prevention strategies that address multiple psychopathology symptoms with a unified approach (Carragher et al., 2016; Hamlat et al., 2019).

Based on all these results collectively, the CF–4F and HO–4F models demonstrate similar performance across multiple criteria at both within- and between-person levels. Future studies should select which model to use based on specific research objectives (Forbes et al., 2021). If the general factor of psychopathology is the main focus, then the higher-order model is a suitable option. When research aims to explore narrower factors and their unique associations, the CF–4F model is recommended. The CF–2F model exhibited a relatively poorer model fit but showed comparable results in factor reliability, longitudinal MI, factor consistency, and criterion validity tests with CF–4F and HO–4F. This model could be considered in practice when the focus is on the broad spectra of psychopathology. In addition, this model offers the advantage of facilitating comparisons with other research as its factor structure inherently corresponds with the core spectra of psychopathology in accordance with the HiTOP framework (Conway et al., 2022; Kotov et al., 2017, 2021). Despite offering the best traditional model fit, the Bi–4F has several drawbacks: The specific factors in the Bi–4F model exhibited large SEs and low reliability and consistency over time, and the prospective associations between p factors in the Bi–4F and criterion validity variables were not as robust as those observed in the HO–4F models. Given these issues and the challenging interpretability of the Bi–2F model, the practical utility and necessity of bifactor models warrant further investigation and discussion.

Strengths and Implications

This study has several notable strengths and novel contributions. Prior investigations of psychopathology structures have predominantly relied on cross-sectional approaches or traditional longitudinal designs spanning long time intervals and have focused on between-person analyses. This study addressed these limitations by employing month-long daily diary designs and exploring the psychopathology structure at both within- and between-person levels among adolescents and young adults. The discovery of the p factor at the within-person level reveals the tendency of various psychopathology symptoms to fluctuate together in daily life, advocating for a transdiagnostic treatment approach that takes into account the daily individual and contextual factors to address multiple psychopathology symptoms simultaneously. Moreover, this study replicated its findings across two independent samples of adolescents and young adults and confirmed the MI of psychopathology structure at both within- and between-person levels over two and a half years. This finding highlights the stability of the psychopathology structure at both levels, underscoring the importance of intensive longitudinal designs for gaining a deeper and more nuanced understanding of psychopathology in adolescents’ and young adults’ daily lives. Future work should continue to examine the psychopathology structure at different fine-grained timescales across development.

Limitations

Despite these strengths, several limitations of this study should be noted. First, the current daily assessment frequency may not be sufficient to detect whether symptoms emerge simultaneously within minutes or hours or trigger each other within days. Future research should employ within-day experience sampling (e.g., ecological momentary assessment; Sliwinski, 2011) to further explore the emergence of psychopathology symptoms on a more microscopic timescale, such as hour to hour (Zheng & Goulter, 2024). Second, the study mainly used the SDQ to assess psychopathology symptoms. Although the SDQ is a widely recognized tool for measuring psychopathology (e.g., Carragher et al., 2016; Constantinou et al., 2019; Hoffmann et al., 2024) and covers multiple spectra of symptoms, it is not exhaustive. Although previous research suggests that model fit and MI of psychopathology models are not sensitive to item selection (Hoffmann et al., 2022), the narrower factors investigated in this study were derived from the structure of the SDQ. Studies with other measures might yield different psychopathology structures. Future studies could enhance our understanding of psychopathology structures in daily life by incorporating more comprehensive psychopathology measures (e.g., Wright et al., 2023). In addition, some items from the criterion validity variables may overlap in content with psychopathology symptoms (Zimmermann et al., 2019). When possible, we excluded items with direct overlaps (e.g., the daily hassles scale includes no items about emotional worries). Future studies should address this issue and compare associations between criterion validity variables and psychopathology factors with and without content-overlapping items (Southward et al., 2022). Third, the factor structures observed in this study are based on data from two community samples recruited through convenience methods.
These samples exhibit a limited range of symptom severity, which may affect the generalizability of the findings. Therefore, the current study should be replicated with a representative clinical sample to validate and extend the results to populations with more severe symptoms. Fourth, although this study disentangled between-person differences from within-person fluctuations, it relied solely on self-reported data. As such, the p factor identified may be influenced by common method variance, which may affect its interpretation (Leising et al., 2024; Watts et al., 2024). Future studies should incorporate multi-informant assessment to control for rater-specific bias and investigate the robustness of the p factor (e.g., Hamlat et al., 2019; Oltmanns et al., 2018). Furthermore, the heterogeneity in the magnitudes of factor loadings on the p factor, particularly at the within-person level, also challenges its substantive interpretation, suggesting that the construct may not be as uniform as implied (Watts et al., 2024). Finally, although the sample size at the within-person level is considered sufficient based on existing criteria, power calculations and model-selection criteria for the within-person level of MLCFAs remain an area of active discussion and investigation, with no clear consensus reached. In addition, the adolescent sample is underpowered at the between-person level (Jackson, 2003; Wolf et al., 2013). Future studies with larger samples are needed to replicate and extend the current findings.

Conclusion

The current findings suggest that several models demonstrate similar performance across multiple criteria at both within- and between-person levels. Future studies should select which model to use at different levels based on the specific research objectives. The general p factor demonstrated superior reliability and stability at both levels across time. This indicates not only a tendency for individuals to experience comorbid psychopathology symptoms relative to others (between-person level) but also a co-occurrence pattern and shared variability of these symptoms from day to day (within-person level). The specific factors estimated in the bifactor models demonstrated low reliability and consistency over time, casting doubt on their practical utility. The current findings inform the development of future prevention and intervention programs by supporting the adoption of transdiagnostic treatments taking into account the daily individual and contextual factors affecting the fluctuations in multiple psychopathology symptoms from a holistic perspective. Moreover, future studies should continue to leverage diary designs or ecological momentary assessments to investigate how psychopathology manifests in adolescents’ and young adults’ daily lives on a micro timescale and to further elucidate the who and when of individual vulnerability to psychopathology symptoms.

Supplemental Material

sj-docx-1-asm-10.1177_10731911241283908 – Supplemental material for Understanding the Within- and Between-Person Structure of Daily Psychopathology Among Adolescents and Young Adults

Supplemental material, sj-docx-1-asm-10.1177_10731911241283908 for Understanding the Within- and Between-Person Structure of Daily Psychopathology Among Adolescents and Young Adults by Hao Zheng and Yao Zheng in Assessment

Acknowledgments

The authors gratefully acknowledge all the participants, research assistants, Elk Island and St. Albert public schools, and the following organizations at the University of Alberta for their support: International Student Services, English for Academic Purposes program, New Chinese Generation, Chinese Students and Scholars Association, iGeek, Undergraduate Research Initiative, China Institute, East Asian Studies Undergraduate Students Association, and Taiwanese Student Association. Study data were collected and managed using REDCap electronic data capture tools hosted and supported by the Women and Children’s Health Research Institute at the University of Alberta.

Footnotes

The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Funding: The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This research was supported partly with funding from the China Institute at the University of Alberta, the Social Sciences and Humanities Research Council (IDG 430-2018-00317 and 409-2020-00080) and Natural Sciences and Engineering Research Council (RGPIN-2020-04458 and DGECR-2020-00077) of Canada, and a Killam Research Fund Cornerstone Grant. Hao Zheng was supported by a Mitacs Accelerate Grant (IT 18227) and the Dr. Jane Silvius Graduate Scholarship.

ORCID iDs: Hao Zheng https://orcid.org/0000-0002-8430-3062

Yao Zheng https://orcid.org/0000-0002-0302-0598

Data Availability: Research data are not publicly available due to ethics agreements. However, the data required for the analyses performed in the study are available from the corresponding author upon reasonable request. This study was not preregistered. To promote transparency and openness, the code behind all the analyses is publicly available at https://osf.io/qns65/

Supplemental Material: Supplemental material for this article is available online.

References

American Psychiatric Association. (2013). Diagnostic and statistical manual of mental disorders (5th ed.). American Psychiatric Publishing. [Google Scholar]
Angulo M., Rooks B. T., Gill M., Goldstein T., Sakolsky D., Goldstein B., Monk K., Hickey M. B., Diler R. S., Hafeman D., Merranko J., Axelson D., Birmaher B. (2017). Psychometrics of the Screen for Adult Anxiety Related Disorders (SCAARED): A new scale for the assessment of DSM-5 anxiety disorders. Psychiatry Research, 253, 84–90. 10.1016/j.psychres.2017.02.034 [DOI] [PMC free article] [PubMed] [Google Scholar]
Carragher N., Teesson M., Sunderland M., Newton N. C., Krueger R. F., Conrod P. J., Barrett E. L., Champion K. E., Nair N. K., Slade T. (2016). The structure of adolescent psychopathology: A symptom-level analysis. Psychological Medicine, 46(5), 981–994. 10.1017/S0033291715002470 [DOI] [PubMed] [Google Scholar]
Caspi A., Houts R. M., Belsky D. W., Goldman-Mellor S. J., Harrington H., Israel S., Meier M. H., Ramrakha S., Shalev I., Poulton R., Moffitt T. E. (2014). The p factor: One general psychopathology factor in the structure of psychiatric disorders? Clinical Psychological Science, 2(2), 119–137. 10.1177/2167702613497473 [DOI] [PMC free article] [PubMed] [Google Scholar]
Caspi A., Houts R. M., Fisher H. L., Danese A., Moffitt T. E. (2024). The general factor of psychopathology (p): Choosing among competing models and interpreting p. Clinical Psychological Science, 12(1), 53–82. 10.1177/21677026221147872 [DOI] [PMC free article] [PubMed] [Google Scholar]
Cheung G. W., Rensvold R. B. (2002). Evaluating goodness-of-fit indexes for testing measurement invariance. Structural Equation Modeling, 9(2), 233–255. 10.1207/S15328007SEM0902_5 [DOI] [Google Scholar]
Choate A. M., Bornovalova M. A., Hipwell A. E., Chung T., Stepp S. D. (2023). The general psychopathology factor (p) from adolescence to adulthood: Exploring the developmental trajectories of p using a multi-method approach. Development and Psychopathology, 35(4), 1775–1793. 10.1017/S0954579422000463 [DOI] [PMC free article] [PubMed] [Google Scholar]
Clark D. A., Hicks B. M., Angstadt M., Rutherford S., Taxali A., Hyde L., Weigard A. S., Heitzeg M. M., Sripada C. (2021). The general factor of psychopathology in the Adolescent Brain Cognitive Development (ABCD) study: A comparison of alternative modeling approaches. Clinical Psychological Science, 9(2), 169–182. 10.1177/2167702620959317 [DOI] [PMC free article] [PubMed] [Google Scholar]
Constantinou M. P., Goodyer I. M., Eisler I., Butler S., Kraam A., Scott S., Pilling S., Simes E., Ellison R., Allison E., Fonagy P. (2019). Changes in general and specific psychopathology factors over a psychosocial intervention. Journal of the American Academy of Child & Adolescent Psychiatry, 58(8), 776–786. 10.1016/j.jaac.2018.11.011 [DOI] [PubMed] [Google Scholar]
Conway C. C., Forbes M. K., South S. C. (2022). A Hierarchical Taxonomy of Psychopathology (HiTOP) primer for mental health researchers. Clinical Psychological Science, 10(2), 236–258. 10.1177/2167702621107834 [DOI] [PMC free article] [PubMed] [Google Scholar]
Cooke E. M., Schuurman N. K., Zheng Y. (2022). Examining the within- and between-person structure of a short form of the Positive and Negative Affect Schedule: A multilevel and dynamic approach. Psychological Assessment, 34(12), 1126–1137. 10.1037/pas0001167 [DOI] [PubMed] [Google Scholar]
Cyders M. A., Littlefield A. K., Coffey S., Karyadi K. A. (2014). Examination of a short English version of the UPPS-P Impulsive Behavior Scale. Addictive Behaviors, 39(9), 1372–1376. 10.1016/j.addbeh.2014.02.013 [DOI] [PMC free article] [PubMed] [Google Scholar]
DiStefano C., Shi D., Morgan G. B. (2021). Collapsing categories is often more advantageous than modeling sparse data: Investigations in the CFA framework. Structural Equation Modeling: A Multidisciplinary Journal, 28(2), 237–249. 10.1080/10705511.2020.1803073 [DOI] [Google Scholar]
Donnellan M. B., Oswald F. L., Baird B. M., Lucas R. E. (2006). The mini-IPIP scales: Tiny-yet-effective measures of the Big Five factors of personality. Psychological Assessment, 18(2), 192–203. 10.1037/1040-3590.18.2.192 [DOI] [PubMed] [Google Scholar]
Forbes M. K., Greene A. L., Levin-Aspenson H. F., Watts A. L., Hallquist M., Lahey B. B., Markon K. E., Patrick C. J., Tackett J. L., Waldman I. D., Wright A. G. C., Caspi A., Ivanova M., Kotov R., Samuel D. B., Eaton N. R., Krueger R. F. (2021). Three recommendations based on a comparison of the reliability and validity of the predominant models used in research on the empirical structure of psychopathology. Journal of Abnormal Psychology, 130(3), 297–317. 10.1037/abn0000533 [DOI] [PubMed] [Google Scholar]
Forbes M. K., Tackett J. L., Markon K. E., Krueger R. F. (2016). Beyond comorbidity: Toward a dimensional and hierarchical approach to understanding psychopathology across the lifespan. Development and Psychopathology, 28(4), 971–986. 10.1017/S0954579416000651 [DOI] [PMC free article] [PubMed] [Google Scholar]
Forbes M. K., Watts A. L., Twose M., Barrett A., Hudson J. L., Lyneham H. J., McLellan L., Newton N. C., Sicouri G., Chapman C., McKinnon A., Rapee R. M., Slade T., Teesson M., Markon K., Sunderland M. (2024). A hierarchical model of the symptom-level structure of psychopathology in youth. Clinical Psychological Science. Advance online publication. 10.1177/21677026241257852 [DOI] [PMC free article] [PubMed]
Gluschkoff K., Jokela M., Rosenström T. (2019). The general psychopathology factor: Structural stability and generalizability to within-individual changes. Frontiers in Psychiatry, 10, Article 594. 10.3389/fpsyt.2019.00594 [DOI] [PMC free article] [PubMed] [Google Scholar]
Goodman A., Lamping D. L., Ploubidis G. B. (2010). When to use broader internalising and externalising subscales instead of the hypothesised five subscales on the Strengths and Difficulties Questionnaire (SDQ): Data from British parents, teachers and children. Journal of Abnormal Child Psychology, 38(8), 1179–1191. 10.1007/s10802-010-9434-x [DOI] [PubMed] [Google Scholar]
Goodman R., Meltzer H., Bailey V. (1998). The Strengths and Difficulties Questionnaire: A pilot study on the validity of the self-report version. European Child & Adolescent Psychiatry, 7(3), 125–130. 10.1007/s007870050057 [DOI] [PubMed] [Google Scholar]
Greene A. L., Eaton N. R., Li K., Forbes M. K., Krueger R. F., Markon K. E., Waldman I. D., Cicero D. C., Conway C. C., Docherty A. R., Fried E. I., Ivanova M. Y., Jonas K. G., Latzman R. D., Patrick C. J., Reininghaus U., Tackett J. L., Wright A. G. C., Kotov R. (2019). Are fit indices used to test psychopathology structure biased? A simulation study. Journal of Abnormal Psychology, 128(7), 740–764. 10.1037/abn0000434 [DOI] [PubMed] [Google Scholar]
Gross J. J., John O. P. (2003). Individual differences in two emotion regulation processes: Implications for affect, relationships, and well-being. Journal of Personality and Social Psychology, 85(2), 348–362. 10.1037/0022-3514.85.2.348 [DOI] [PubMed] [Google Scholar]
Grosz M. P. (2024). The factor structure of Big Five personality trait measures at the between-and within-person levels. European Journal of Personality, 38(4), 724–740. 10.1177/08902070231213805 [DOI] [Google Scholar]
Hamaker E. L., Kuiper R. M., Grasman R. P. P. P. (2015). A critique of the cross-lagged panel model. Psychological Methods, 20(1), 102–116. 10.1037/a0038889 [DOI] [PubMed] [Google Scholar]
Hamlat E. J., Snyder H. R., Young J. F., Hankin B. L. (2019). Pubertal timing as a transdiagnostic risk for psychopathology in youth. Clinical Psychological Science, 7(3), 411–429. 10.1177/2167702618810518 [DOI] [PMC free article] [PubMed] [Google Scholar]
Hankin B. L., Davis E. P., Snyder H., Young J. F., Glynn L. M., Sandman C. A. (2017). Temperament factors and dimensional, latent bifactor models of child psychopathology: Transdiagnostic and specific associations in two youth samples. Psychiatry Research, 252, 139–146. 10.1016/j.psychres.2017.02.061 [DOI] [PMC free article] [PubMed] [Google Scholar]
Harris P. A., Taylor R., Minor B. L., Elliott V., Fernandez M., O’Neal L., McLeod L., Delacqua G., Delacqua F., Kirby J., Duda S. N. (2019). The RedCap consortium: Building an international community of software platform partners. Journal of Biomedical Informatics, 95, Article 103208. 10.1016/j.jbi.2019.103208 [DOI] [PMC free article] [PubMed] [Google Scholar]
Hoffmann M. S., Moore T. M., Axelrud L. K., Tottenham N., Pan P. M., Miguel E. C., Rohde L. A., Milham M. P., Satterthwaite T. D., Salum G. A. (2024). An evaluation of item harmonization strategies between assessment tools of psychopathology in children and adolescents. Assessment, 31(2), 502–517. 10.1177/10731911231163136 [DOI] [PubMed] [Google Scholar]
Hoffmann M. S., Moore T. M., Kvitko Axelrud L., Tottenham N., Zuo X.-N., Rohde L. A., Milham M. P., Satterthwaite T. D., Salum G. A. (2022). Reliability and validity of bifactor models of dimensional psychopathology in youth. Journal of Psychopathology and Clinical Science, 131(4), 407–421. 10.1037/abn0000749 [DOI] [PMC free article] [PubMed] [Google Scholar]
Hsu H.-Y., Kwok O-m, Lin J. H., Acosta S. (2015). Detecting misspecified multilevel structural equation models with common fit indices: A Monte Carlo study. Multivariate Behavioral Research, 50(2), 197–215. 10.1080/00273171.2014.977429 [DOI] [PubMed] [Google Scholar]
Hu L., Bentler P. M. (1999). Cutoff criteria for fit indexes in covariance structure analysis: Conventional criteria versus new alternatives. Structural Equation Modeling: A Multidisciplinary Journal, 6(1), 1–55. 10.1080/10705519909540118
Jackson D. L. (2003). Revisiting sample size and number of parameter estimates: Some support for the N:q Hypothesis. Structural Equation Modeling, 10(1), 128–141. 10.1207/S15328007SEM1001_6
Kaufman E. A., Xia M., Fosco G., Yaptangco M., Skidmore C. R., Crowell S. E. (2016). The Difficulties in Emotion Regulation Scale Short Form (DERS-SF): Validation and replication in adolescent and adult samples. Journal of Psychopathology and Behavioral Assessment, 38(3), 443–455. 10.1007/s10862-015-9529
Khojasteh J., Lo W.-J. (2015). Investigating the sensitivity of goodness-of-fit indices to detect measurement invariance in a bifactor model. Structural Equation Modeling, 22(4), 531–541. 10.1080/10705511.2014.937791
Kim E. S., Dedrick R. F., Cao C., Ferron J. M. (2016). Multilevel factor analysis: Reporting guidelines and a review of reporting practices. Multivariate Behavioral Research, 51(6), 881–898. 10.1080/00273171.2016.1228042
Kotov R., Krueger R. F., Watson D., Achenbach T. M., Althoff R. R., Bagby R. M., Brown T. A., Carpenter W. T., Caspi A., Clark L. A., Eaton N. R., Forbes M. K., Forbush K. T., Goldberg D., Hasin D., Hyman S. E., Ivanova M. Y., Lynam D. R., Markon K., Zimmerman M. (2017). The Hierarchical Taxonomy of Psychopathology (HiTOP): A dimensional alternative to traditional nosologies. Journal of Abnormal Psychology, 126(4), 454–477. 10.1037/abn0000258
Kotov R., Krueger R. F., Watson D., Cicero D. C., Conway C. C., DeYoung C. G., Eaton N. R., Forbes M. K., Hallquist M. N., Latzman R. D., Mullins-Sweatt S. N., Ruggero C. J., Simms L. J., Waldman I. D., Waszczuk M. A., Wright A. G. C. (2021). The hierarchical taxonomy of psychopathology (HiTOP): A quantitative nosology based on consensus of evidence. Annual Review of Clinical Psychology, 17(1), 83–108. 10.1146/annurev-clinpsy-081219-093304
Laceulle O. M., Vollebergh W. A. M., Ormel J. (2015). The structure of psychopathology in adolescence: Replication of a general psychopathology factor in the TRAILS study. Clinical Psychological Science, 3(6), 850–860. 10.1177/2167702614560750
Leising D., Borgstede M., Burger J., Zimmermann J., Bäckström M., Oltmanns J. R., Connelly B. S. (2024). Why do judgments on different person-descriptive attributes correlate with one another? A conceptual analysis with relevance for most psychometric research [Preprint]. PsyArXiv. 10.31234/osf.io/7c895
Lin J. J., Hsu H. Y. (2023). Investigating the performance of level-specific fit indices in multilevel confirmatory factor analysis with dichotomous indicators: A Monte Carlo study. Behavior Research Methods, 55(8), 4222–4259. 10.3758/s13428-022-02014-z
Markon K. E. (2019). Bifactor and hierarchical models: Specification, inference, and interpretation. Annual Review of Clinical Psychology, 15(1), 51–69. 10.1146/annurev-clinpsy-050718-095522
McGorry P., Van Os J. (2013). Redeeming diagnosis in psychiatry: Timing versus specificity. The Lancet, 381(9863), 343–345. 10.1016/S0140-6736(12)61268-9
Moore T. M., Kaczkurkin A. N., Durham E. L., Jeong H. J., McDowell M. G., Dupont R. M., Applegate B., Tackett J. L., Cardenas-Iniguez C., Kardan O., Akcelik G. N., Stier A. J., Rosenberg M. D., Hedeker D., Berman M. G., Lahey B. B. (2020). Criterion validity and relationships between alternative hierarchical dimensional models of general and specific psychopathology. Journal of Abnormal Psychology, 129(7), 677–688. 10.1037/abn0000601
Muthén B. O., Muthén L. K. (1997–2019). Mplus version 8: User’s guide.
Oltmanns J. R., Smith G. T., Oltmanns T. F., Widiger T. A. (2018). General factors of psychopathology, personality, and personality disorder: Across domain comparisons. Clinical Psychological Science, 6(4), 581–589. 10.1177/2167702617750150
Padgett R. N., Morgan G. B. (2021). Multilevel CFA with ordered categorical data: A simulation study comparing fit indices across robust estimation methods. Structural Equation Modeling, 28(1), 51–68. 10.1080/10705511.2020.1759426
Purpura D. J., Lonigan C. J. (2009). Conners’ Teacher Rating Scale for preschool children: A revised, brief, age-specific measure. Journal of Clinical Child & Adolescent Psychology, 38(2), 263–272. 10.1080/15374410802698446
Radloff L. S. (1977). The CES-D scale: A self-report depression scale for research in the general population. Applied Psychological Measurement, 1(3), 385–401. 10.1177/014662167700100306
Reise S. P., Bonifay W. E., Haviland M. G. (2013). Scoring and modeling psychological measures in the presence of multidimensionality. Journal of Personality Assessment, 95(2), 129–140. 10.1080/00223891.2012.725437
Reise S. P., Kim D. S., Mansolf M., Widaman K. F. (2016). Is the bifactor model a better model or is it just better at modeling implausible responses? Application of iteratively reweighted least squares to the Rosenberg Self-Esteem Scale. Multivariate Behavioral Research, 51(6), 818–838. 10.1080/00273171.2016.1243461
Revelle W., Condon D. M. (2019). Reliability from α to ω: A tutorial. Psychological Assessment, 31(12), 1395–1411. 10.1037/pas0000754
Rodriguez A., Reise S. P., Haviland M. G. (2016). Applying bifactor statistical indices in the evaluation of psychological measures. Journal of Personality Assessment, 98(3), 223–237. 10.1080/00223891.2015.1089249
Schreuder M. J., Groen R. N., Wigman J. T. W., Wichers M., Hartman C. A. (2023). Participation and compliance in a 6-month daily diary study among individuals at risk for mental health problems. Psychological Assessment, 35(2), 115–126. 10.1037/pas0001197
Sliwinski M. J. (2011). Approaches to modeling intraindividual and interindividual facets of change for developmental research. In Fingerman K. L., Berg C. A., Smith J., Antonucci T. C. (Eds.), Handbook of life-span development (pp. 1–25). Springer.
Smith G. T., Atkinson E. A., Davis H. A., Riley E. N., Oltmanns J. R. (2020). The general factor of psychopathology. Annual Review of Clinical Psychology, 16, 75–98. 10.1146/annurev-clinpsy-071119-115848
Southward M. W., Cheavens J. S., Coccaro E. F. (2022). Defining the p-factor: An empirical test of five leading theories. Psychological Medicine, 53(7), 2732–2743. 10.1017/s0033291722001635
Tomko R. L., Solhan M. B., Carpenter R. W., Brown W. C., Jahng S., Wood P. K., Trull T. J. (2014). Measuring impulsivity in daily life: The Momentary Impulsivity Scale. Psychological Assessment, 26(2), 339–349. 10.1037/a0035083
Watts A. L., Greene A. L., Bonifay W., Fried E. I. (2024). A critical evaluation of the p-factor literature. Nature Reviews Psychology, 3(2), 108–122. 10.1038/s44159-023-00260-2
Watts A. L., Poore H. E., Waldman I. D. (2019). Riskier tests of the validity of the bifactor model of psychopathology. Clinical Psychological Science, 7(6), 1285–1303. 10.1177/2167702619855035
Widaman K. F., Ferrer E., Conger R. D. (2010). Factorial invariance within longitudinal structural equation models: Measuring the same construct across time. Child Development Perspectives, 4(1), 10–18. 10.1111/j.1750-8606.2009.00110.x
Wolf E. J., Harrington K. M., Clark S. L., Miller M. W. (2013). Sample size requirements for structural equation models: An evaluation of power, bias, and solution propriety. Educational and Psychological Measurement, 73(6), 913–934. 10.1177/0013164413495237
Wright A. G. C., Beltz A. M., Gates K. M., Molenaar P. C. M., Simms L. J. (2015). Examining the dynamic structure of daily internalizing and externalizing behavior at multiple levels of analysis. Frontiers in Psychology, 6, Article 1914. 10.3389/fpsyg.2015.01914
Wright A. G. C., Ringwald W. R., Zimmermann J. (2023). Measuring multidimensional hierarchies of psychopathology in daily life [Preprint]. PsyArXiv. 10.31234/osf.io/h8npk
Xu J., Zheng Y. (2023). Parent- and child-driven daily family stress processes between daily stress, parental warmth, and adolescent adjustment. Journal of Youth and Adolescence, 52(3), 490–505. 10.1007/s10964-022-01691-5
Zheng H., Cooke E. M., Li K., Zheng Y. (2023). Capturing uplifts and hassles in adolescents’ daily lives: Links of daily experiences with physical and mental well-being. Journal of Youth and Adolescence, 52(1), 177–194. 10.1007/s10964-022-01682-6
Zheng Y., Goulter N. (2024). Introduction to the special issue: Novel insights into the externalizing psychopathology spectrum in childhood and adolescence from intensive longitudinal data. Research on Child and Adolescent Psychopathology, 52(1), 1–6. 10.1007/s10802-023-01154-w
Zimmermann J., Woods W. C., Ritter S., Happel M., Masuhr O., Jaeger U., Wright A. G. (2019). Integrating structure and dynamics in personality assessment: First steps toward the development and validation of a personality dynamics diary. Psychological Assessment, 31(4), 516–531. 10.1037/pas0000625

Supplementary Materials

sj-docx-1-asm-10.1177_10731911241283908 – Supplemental material for Understanding the Within- and Between-Person Structure of Daily Psychopathology Among Adolescents and Young Adults

sj-docx-1-asm-10.1177_10731911241283908.docx (543 KB, docx)

