Abstract
Family functioning is a key construct in research and practice involving children and youth. Given that multi-informant assessment of this construct is considered a best practice in research and clinical settings, ensuring measurement invariance of family functioning instruments is an important consideration for family science scholars and practitioners who increasingly use multiple groups or longitudinal designs in investigating family dynamics. Yet, studies involving family functioning provide limited reports of psychometric properties of key or contextual measures. This study used multigroup confirmatory factor analyses to examine measurement invariance of a short version of the McMaster Family Assessment Device using data from caregivers (N = 479) and adolescents (N = 571) collected at two periods four years apart. Results revealed that configural and metric invariance of a short version of the family functioning measure hold both across groups (caregivers and adolescents) and across time, thus providing the foundation for using this instrument to assess family functioning with different populations and at different time periods. However, evidence of only partial scalar invariance indicated that group comparisons might be biased. The article concludes with implications for family science scholars and practitioners, including caution in using mean scores to compare perceptions of family functioning across different populations, such as caregivers and adolescents.
Keywords: measurement invariance, family functioning, brief measure
Family functioning is an important construct in social science research spanning multiple areas of inquiry and involving diverse participants (Boterhoven, Hafekost, Lawrence, Sawyer, & Zubrick, 2015). Some researchers have argued that family environment might best be understood through investigation of unique perspectives of family members (Georgiades, Boyle, Jenkins, Sanford, & Lipman, 2008). Research carried out with participants from the same household established that caregivers and children often provide discrepant ratings of their family environment (Bagley, Bertrand, Bolitho, & Mallick, 2001; Thompson, 2013; Yu et al., 2006). Different theoretical and empirical approaches have been developed to describe and explain discrepant reports, including a growing body of literature that uses divergent opinions to predict other family dynamics or individual adjustment (De Los Reyes, 2011; Georgiades et al., 2008; Guion, Mrug, & Windle, 2009; Stuart & Jose, 2012).
A general conclusion from these inquiries is that both convergence and discrepancy in opinions on family environment have important implications for the well-being of youth, caregivers, and families. However, a large body of this research is missing an important piece, which might limit the scope of these findings. When using reports from multiple informants, researchers must guard against measurement bias and ensure that all groups of informants evaluate the same construct (Putnick & Bornstein, 2016; Vandenberg & Lance, 2000). Similarly, when longitudinal designs are used, researchers must ensure that the construct is perceived equivalently across all survey periods. Only then are correlations and comparisons between groups or times meaningful and justified (Millsap, 2010; Nye & Drasgow, 2011). Unfortunately, this step is often missing, and evidence of measurement equivalence is rarely presented in studies investigating differences in ratings of family environment among different family members or differences in perceptions of family functioning over time (Gregorich, 2006). As Kern and colleagues (2016) summarized, “examining MI [measurement invariance] is still not the norm within the parenting and family psychology literature” (p. 365).
This study addressed this limitation by assessing measurement invariance of a short version of the General Functioning (GF) subscale of the McMaster Family Assessment Device (FAD) across adolescents and caregivers. A related goal was to establish measurement invariance of this scale across different time periods. Establishing measurement equivalence of FAD will provide researchers with greater flexibility to use this measure with different populations as well as across different time points. Furthermore, verifying that FAD is measured in the same way and means the same thing for children and parents will enable researchers to conduct follow-up analyses to compare and explain differences in opinions on family functioning between these populations.
The McMaster Family Assessment Device and its Adaptations
The McMaster Family Assessment Device is a 60-item self-report measure of perceived family environment, operationalized through six domains: problem solving, communication, roles, affective responsiveness, affective involvement, and behavior control (Epstein, Baldwin, & Bishop, 1983). A General Functioning subscale, consisting of 12 items, has been used as a short form of FAD to identify healthy and unhealthy areas of functioning within families (Byles, Byrne, Boyle, & Offord, 1988). Based on systems theory, the model assumes that all members of a family are interrelated; thus, one member cannot be understood in isolation from the others. In this way, the FAD measures whole family functioning, describing the structure and organization of the family unit, as well as relationship patterns among its members (Miller, Ryan, Keitner, Bishop, & Epstein, 2000). Since its creation almost four decades ago, the measure has been used extensively in studies investigating family dynamics with clinical and non-clinical samples as well as across diverse cultural contexts (Stevenson-Hinde & Akister, 1995). For example, the scale has been validated in a variety of places and among different ethnic groups in North America (Aarons, McDonald, Connelly, & Newton, 2007; Byles et al., 1988; Morris, 1990), Asia (China and Japan), Europe (England, Hungary, Italy, and the Netherlands), South America (Brazil), Oceania (Australia), as well as among Greeks and Japanese living in Australia (for an overview of uses of FAD, see Boterhoven et al., 2015, and Pires, Gonçalves, Avanci, & Pesce, 2016).
Although the original FAD remains one of the most common measures in research on family functioning, recent studies have used modified versions of the instrument. One such adaptation is a short version of the General Functioning subscale of FAD in which only positively phrased items are used. Boterhoven de Haan and colleagues (2015) explored the validity and reliability of a six-item measure and concluded that a modified, shorter version of FAD was a quick and effective tool for assessing the overall functioning of families. Importantly, by selecting out negatively-phrased items, the researchers addressed three important concerns that were among criticisms of the original measure: (1) acquiescence that might arise with negatively-worded items, (2) controversy over factorial validity of the full scale, including questions about its dimensionality (Tomas & Oliver, 1999), and (3) the length of the instrument, which is of particular concern in clinical settings (Hamilton & Carr, 2016). In sum, the authors concluded that a modified, shorter version might be preferred by researchers who are considering family dynamics as a contextual variable. Of note, since the initial testing and validation of the short version of the General Functioning subscale, several groups of researchers have used the adapted measure in studies of mental health and well-being in a national sample of children and adolescents in Australia (Hafekost et al., 2016; Johnson et al., 2016; Rikkers, Lawrence, Hafekost, & Zubrick, 2016). Additionally, the short version was part of a systematic review of self-report family assessment measures, which concluded that a modified version of FAD that consisted of only positively phrased items was an appropriate assessment tool for clinical use (Hamilton & Carr, 2016).
The most recent and arguably the most drastic adaptation of FAD also involved shortening the scale, which resulted in a three-item ultra-brief assessment measure (Mansfield, Keitner, & Sheeran, 2018). Upon examining psychometric properties of this measure, Mansfield and colleagues concluded that the modified FAD, the Brief Assessment of Family Functioning Scale (BAFFS), was a promising measure of satisfaction with family functioning. Importantly, the researchers highlighted that the primary function of the modified measure was to gauge an overall perception of family functioning as opposed to a more complete or detailed description of specific aspects of family environment.
Overall, psychometric properties of various versions of FAD have been studied extensively (Hamilton & Carr, 2016; Mansfield, Keitner, & Dealy, 2015; Mansfield, Keitner, & Sheeran, 2018). Despite the instrument's popularity, limited research has examined its measurement invariance across different populations and across different time periods. Given that reports from parents and youth are considered a best practice in measuring family dynamics, a lack of attention to issues of measurement invariance appears to be a gap in the literature. Several studies have warned against using measures that do not demonstrate compliance with current validity standards, a practice that can lead to serious errors, such as allowing measurement bias to impact respondents’ scores on a particular measure of interest (French & Finch, 2006). Consequently, interpretation of scale scores may be impaired. Therefore, the primary purpose of this study was to establish the invariance of a short version of FAD across groups and over time.
Method
Procedure and Participants
This study is part of a larger, multi-year research initiative examining the developmental trajectory of dating violence victimization and perpetration among adolescents in a rural southeastern community in the USA (McDonell & Wolfe, 2011; Sianko, Kunkel, Thompson, Small, & McDonell, 2019; Sianko, Mece, & Abazi-Morina, 2019). Upon obtaining Institutional Review Board approval, students from ten public schools in grades six through nine were invited to participate in the study. Parents of eligible students were also contacted. In total, 589 adolescents and 484 caregivers (the number of caregivers is smaller due to sibling pairs) returned signed consent and permission forms to participate in the study. Trained data collectors administered paper-and-pencil surveys to participants, most of which were completed in participants’ homes. Both adolescents and caregivers received gift cards as compensation for their participation at each time of data collection. Data at two time points, collected four years apart, were used in the present study.
The study sample in the first year consisted of 580 adolescents in grades six through nine and their caregivers (N = 484). On average, adolescents were 13 years old (SD = 1.50) and their caregivers were 55.77 years old (SD = 26.56). While only a little more than half of students were female (51.4%), the caregivers were predominantly female (92.4%). African-Americans were the largest racial category of respondents (48%), followed by Caucasians (39%). Hispanic or other minority respondents made up 14.2% of the adolescent sample but only 3.5% of the caregivers. In terms of the relationship to a student, the vast majority of caregivers were parents (95.1%), a small proportion identified themselves as grandparents (3.6%), and a small number were aunts, uncles, or other relatives (1.3%). More than 50% of caregivers were married, had shared the same household for an average of 14 years, and had a family income below the county median of $32,979. The sample in the last year consisted of 495 adolescents and 410 caregivers.
Measure
Selected items from the McMaster Family Assessment Device were used to measure family environment (Boterhoven de Haan et al., 2015; Epstein et al., 1983). Specifically, caregivers’ and adolescents’ responses to a series of positively phrased items were used to create a scaled measure. Sample items included statements such as “In times of crisis we can turn to each other for support,” “We feel accepted for what we are,” and “We are able to make decisions about how to solve problems.” Respondents marked their agreement with each item on a four-point Likert scale from strongly disagree to strongly agree, and all items were averaged to create a scaled measure where higher scores indicated a more positive view of family environment. See Table 1 for scale reliability and means for parents and adolescents at two survey periods.
Table 1.
Group | N | 1 | 2 | 3 | 4 | M (SD) | Cronbach’s α | χ² | df | p-value | CFI | SRMR | RMSEA [90% CI] |
---|---|---|---|---|---|---|---|---|---|---|---|---|---
1. Adolescents time 1 | 571 | - | - | - | - | 3.17 (0.45) | .82 | 9.424 | 5 | 0.093 | 0.995 | 0.016 | 0.039 [0.000, 0.078] |
2. Adolescents time 2 | 493 | .26** | - | - | - | 3.11 (0.64) | .86 | 12.388 | 5 | 0.030 | 0.993 | 0.017 | 0.055 [0.016, 0.094] |
3. Caregivers time 1 | 479 | .11* | .06 | - | - | 3.32 (0.47) | .78 | 11.644 | 5 | 0.040 | 0.989 | 0.021 | 0.053 [0.010, 0.093] |
4. Caregivers time 2 | 410 | .13** | .15** | .50** | - | 3.32 (0.53) | .85 | 26.988 | 5 | 0.001 | 0.966 | 0.030 | 0.104 [0.067, 0.143] |
Note: Columns 1 to 4 form the correlation matrix among the four scale scores. CFI = comparative fit index; SRMR = standardized root mean square residual; RMSEA = root mean square error of approximation.
* p < .05.
** p < .01.
Analysis Approach
Descriptive statistics were used to explore basic characteristics of the short version of the family functioning measure, separately for adolescents and caregivers as well as across two time periods. Specifically, means, standard deviations, skewness, and reliability statistics were calculated for the scaled variables. Frequency analyses were conducted on individual items to check for missing data and to screen for univariate outliers. Additional variables were examined to describe the socio-demographic background of participants. Missing responses were analyzed and 14 cases were removed from analyses due to missing scores on all items comprising the family functioning scale.
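As a minimal sketch of the reliability statistic mentioned above, the following Python function computes Cronbach's alpha from item-level responses using the standard formula α = k/(k − 1) × (1 − Σ item variances / variance of total scores). The function name `cronbach_alpha` and the response matrix are hypothetical and invented for illustration; they are not the study data, and the study itself does not report which software routine produced its reliability estimates.

```python
from statistics import variance


def cronbach_alpha(responses):
    """Cronbach's alpha for a list of respondents, each a list of item scores."""
    k = len(responses[0])                     # number of items
    items = list(zip(*responses))             # transpose: one tuple per item
    item_var_sum = sum(variance(col) for col in items)
    total_var = variance([sum(row) for row in responses])
    return k / (k - 1) * (1 - item_var_sum / total_var)


# Invented responses on a four-point scale (5 respondents x 5 items).
# These numbers are illustrative only and are NOT taken from the study.
demo = [
    [4, 4, 3, 4, 4],
    [3, 3, 3, 2, 3],
    [2, 3, 2, 2, 2],
    [4, 3, 4, 4, 3],
    [1, 2, 1, 2, 2],
]
alpha = cronbach_alpha(demo)
print(round(alpha, 3))
```

The sketch assumes complete responses; respondents with missing items would need to be handled before calling the function.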
Next, a series of factor analyses was conducted to assess measurement invariance of the family functioning measure. First, as a prerequisite for measurement invariance testing, four sets of confirmatory factor analyses were conducted separately for caregivers and adolescents across two time periods. This step is essential for providing initial evidence that family functioning was a valid measure for these two groups. Next, to assess measurement equivalence, multiple group confirmatory factor analysis, also known as multisample confirmatory factor analysis, was conducted. This is among the most widely used approaches for examining measurement invariance across groups or across time (French & Finch, 2009; Nye & Drasgow, 2011). This approach consists of several steps, each testing an increasingly restricted model by imposing additional constraints (Vandenberg & Lance, 2000). Fit indices of different models are then compared to detect measurement bias or confirm measurement equivalence (Cheung & Rensvold, 2002).
Invariance across caregivers and adolescents.
To investigate whether a short version of the family functioning measure is measurement-invariant across caregivers and adolescents, three sets of multiple group confirmatory factor analyses were conducted to assess: (a) configural invariance, (b) metric invariance, and (c) scalar invariance of the measure. The first model assessed whether the same pattern of factor loadings existed between the groups. The second model estimated whether the individual contribution of each item to the factor was statistically similar between caregivers and adolescents. Finally, the third model examined whether two groups had both statistically similar factor loadings and intercepts for corresponding items. Data collected at Time 1 were used for these tests.
Invariance over time.
To examine whether a short version of the family functioning measure is measurement-invariant over time, another set of multiple group confirmatory factor analyses was conducted. Similar to the tests above, the first model tested whether the structure of family environment was consistent between Time 1 and Time 2 of data collection. The second model estimated whether the strength of item loadings was statistically equivalent across two time periods. The final model tested whether the item intercepts were stable across time. Rather than running these analyses on a pooled sample, separate models were estimated for data collected from caregivers and from adolescents. This resulted in two sets of multiple group confirmatory factor analyses examining invariance of the measure over time.
Model specifications and fit indices.
As suggested by Muthén and Muthén (2006), maximum likelihood estimation with robust standard errors (MLR) was used to fit the models. Additionally, the error terms of parallel items were allowed to co-vary because of non-independent responses from caregivers and adolescents in the same household. Similarly, co-variances were added to error terms when longitudinal measurement invariance was tested. To identify models, two specifications were made, in accordance with previous research (Vandenberg & Lance, 2000). The first fixed the first loading of the factor to one, while the second set the intercept of the first item to zero. All tests were conducted using structural equation modeling in Mplus v. 7.4 (Muthén & Muthén, 2006).
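The nesting of the configural, metric, and scalar models can be illustrated by counting degrees of freedom. The sketch below assumes a deliberately simplified specification: one factor, five items, two independent groups, a mean structure, and the marker-variable identification described above (first loading fixed to one, first intercept fixed to zero), with no residual covariances. The function name `invariance_df` is hypothetical. Under these assumptions the counts agree with the degrees of freedom reported for the caregiver-adolescent comparison in Table 2, but the sketch is not claimed to reproduce the authors' exact specification, and it does not reproduce the scalar degrees of freedom reported for the across-time models.

```python
def invariance_df(n_items, n_groups, equal_loadings=False,
                  equal_intercepts=False, freed_intercepts=0):
    """Degrees of freedom for a one-factor multigroup CFA with mean structure.

    Assumes marker-variable identification in every group (first loading = 1,
    first intercept = 0) and no residual covariances.
    """
    # Observed moments per group: variances/covariances plus item means.
    moments = n_groups * (n_items * (n_items + 1) // 2 + n_items)
    # Free parameters per group: loadings, residual variances, factor variance,
    # intercepts, and factor mean.
    per_group = (n_items - 1) + n_items + 1 + (n_items - 1) + 1
    free = n_groups * per_group
    if equal_loadings:
        # Each non-marker loading is estimated once instead of once per group.
        free -= (n_groups - 1) * (n_items - 1)
    if equal_intercepts:
        # Same for intercepts, except those released for partial invariance.
        free -= (n_groups - 1) * (n_items - 1 - freed_intercepts)
    return moments - free


configural = invariance_df(5, 2)
metric = invariance_df(5, 2, equal_loadings=True)
scalar = invariance_df(5, 2, equal_loadings=True, equal_intercepts=True)
partial = invariance_df(5, 2, equal_loadings=True, equal_intercepts=True,
                        freed_intercepts=2)
print(configural, metric, scalar, partial)
```

Each added set of equality constraints removes free parameters and therefore raises the degrees of freedom, which is why the models become progressively more restrictive.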
Model fit was assessed using the Satorra-Bentler (SB) scaled chi-square statistic, comparative fit index (CFI), root mean square error of approximation (RMSEA) with 90% confidence interval, and the standardized root mean square residual (SRMR) (Cheung & Rensvold, 2002). Further, models with varying levels of measurement invariance were compared to one another by assessing changes in the following descriptive fit indices: CFI (Δ ≤ .01), RMSEA (Δ ≤ .05), and SRMR (Δ ≤ .01) (Hu & Bentler, 1999). In cases where models were rejected based on a poor fit, modification indices were examined to establish the source of the lack of fit, followed by additional analyses to establish partial measurement invariance. Of note, the chi-square difference test was not considered in model comparisons because previous research has shown that this test can be biased (in favor of or against invariance) depending on the sample size (French & Finch, 2006; Roesch, Norman, Merz, Sallis, & Patrick, 2013). Finally, observed means and variances of all items of the family functioning measure were examined between caregivers and adolescents to further explore comparability of scores between the two groups.
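The comparison rule described above can be sketched in a few lines of Python. The cutoffs are those stated in the text; the function name `compare_fit` is hypothetical, and the input values are the caregiver configural and metric fit indices reported in Table 4. Rounding to three decimals is a choice made here to match the precision of the reported tables and to avoid floating-point noise at the cutoff boundary. This is an illustration of the decision rule, not the authors' own procedure or code.

```python
# Cutoffs for change in fit between nested models, as stated in the text.
CUTOFFS = {"CFI": 0.01, "RMSEA": 0.05, "SRMR": 0.01}


def compare_fit(less_constrained, more_constrained, cutoffs=CUTOFFS):
    """Return {index: (absolute change, within cutoff?)} for two nested models."""
    result = {}
    for index, cutoff in cutoffs.items():
        delta = round(abs(more_constrained[index] - less_constrained[index]), 3)
        result[index] = (delta, delta <= cutoff)
    return result


# Caregiver models across time, fit values as reported in Table 4.
configural = {"CFI": 0.997, "SRMR": 0.019, "RMSEA": 0.031}
metric = {"CFI": 0.993, "SRMR": 0.029, "RMSEA": 0.019}

for index, (delta, ok) in compare_fit(configural, metric).items():
    print(f"{index}: change = {delta:.3f}, within cutoff = {ok}")
```

Reporting each index separately, rather than a single verdict, leaves room for judgment when the indices disagree.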
Results
Descriptive Results
Results indicated that the scales demonstrated acceptable internal consistency across groups and time, with Cronbach’s alpha coefficients ranging from .78 to .86. Descriptive analyses further revealed that, on average, caregivers scored higher on the measure of family functioning than did adolescents. This pattern was similar during both times of data collection. Correlation analyses further revealed that adolescents’ scores were positively associated with caregivers’ scores. However, the strength of these associations was weak, with Pearson’s r ranging from .11 at Time 1 (p < .05) to .15 at Time 2 (p < .01). Additionally, comparison of means between the two time periods established that adolescents scored slightly higher at Time 1 than at Time 2. Interestingly, the mean of caregivers’ scores at Time 1 matched the mean at Time 2. Table 1 summarizes these results.
Confirmatory Factor Analyses
Separate confirmatory factor analyses conducted for caregivers and adolescents revealed adequate model fit for both groups at two time periods. Table 1 presents fit indices of these analyses. Examination of fit statistics suggested the model of family environment fit the data well. Although the chi-square for caregivers was significant at both time periods, other indices indicated good fit. Following examination of fit indices, composite measures of family environment were created for adolescents and caregivers at both time periods.
Measurement Invariance
Invariance across caregivers and adolescents.
Table 2 presents a summary of fit statistics for all invariance models across caregivers and adolescents.
Table 2.
Models | S-B χ² | df | p-value | CFI | SRMR | RMSEA [90% CI] | ΔCFI | ΔSRMR | ΔRMSEA |
---|---|---|---|---|---|---|---|---|---
Configural | 16.882 | 10 | 0.077 | 0.989 | 0.020 | 0.038 [0.000, 0.059] | - | - | - |
Metric | 27.049 | 14 | 0.019 | 0.980 | 0.061 | 0.044 [0.028, 0.069] | 0.009 | 0.041 | 0.006 |
Scalar (all constraints) | 70.788 | 18 | 0.000 | 0.918 | 0.058 | 0.079 [0.060, 0.099] | 0.062 | 0.003 | 0.035 |
Partial scalar* | 30.978 | 16 | 0.014 | 0.977 | 0.064 | 0.045 [0.020, 0.068] | 0.059 | 0.006 | 0.034 |
Note: Columns S-B χ² through RMSEA report model fit; the Δ columns report model comparison. S-B χ² = Satorra-Bentler scaled chi-square statistic; CFI = comparative fit index; SRMR = standardized root mean square residual; RMSEA = root mean square error of approximation.
* Constraints from two items were released.
Examination of fit statistics for the first model revealed a good fit, indicating that the factor structure of the family environment construct is the same for caregivers and adolescents. Thus, configural invariance was supported and further testing for a stricter measurement invariance could be conducted. The next model, metric invariance, included constraints to examine the equality of factor loadings between caregivers and adolescents. The fit of the metric invariance model was compared with the previously fitted configural invariance model. Although the first model fit the data better, examination of differences in fit indices suggested that imposing constraints on factor loadings did not significantly worsen the fit, indicating that caregivers and adolescents interpreted scale items similarly. Thus, evidence of full metric invariance was established.
Finally, the most restrictive model was estimated. This model, known as scalar or strong invariance, tested the assumption that intercepts of individual items of the family functioning measure were equal between the two groups. Cross-group equality constraints were imposed on corresponding pairs of intercepts of each scale item. Initial results revealed a poor fit, indicating that the measure did not meet the criteria for full scalar invariance. Modification indices revealed that two items, “We feel accepted for what we are” (caregivers intercept = 3.53, adolescents intercept = 3.27) and “We are able to make decisions about how to solve problems” (caregivers intercept = 3.27, adolescents intercept = 3.12), contributed most strongly to the lack of fit. Follow-up analyses were conducted to explore partial scalar invariance of the measure by lifting the intercept equality constraint from these items. The model resulted in improved fit, providing evidence of partial scalar invariance.
Invariance across time.
Similar to previous analyses, a series of confirmatory analyses were conducted to evaluate the equivalence of a short measure of family functioning across time. One set of analyses tested the equivalence assumption with the sample of adolescents while the other tested the same assumption with the sample of caregivers. The results are presented in Tables 3 (for adolescents) and 4 (for caregivers).
Table 3.
Models | S-B χ² | df | p-value | CFI | SRMR | RMSEA [90% CI] | ΔCFI | ΔSRMR | ΔRMSEA |
---|---|---|---|---|---|---|---|---|---
Configural | 11.764 | 10 | 0.301 | 0.998 | 0.016 | 0.018 [0.000, 0.052] | - | - | - |
Metric | 16.143 | 14 | 0.304 | 0.997 | 0.033 | 0.017 [0.000, 0.047] | 0.001 | 0.017 | 0.001 |
Scalar | 43.164 | 24 | 0.009 | 0.977 | 0.064 | 0.040 [0.019, 0.057] | 0.020 | 0.031 | 0.022 |
Partial scalar* | 37.586 | 21 | 0.014 | 0.980 | 0.061 | 0.039 [0.017, 0.058] | 0.003 | 0.003 | 0.001 |
Note: Columns S-B χ² through RMSEA report model fit; the Δ columns report model comparison. S-B χ² = Satorra-Bentler scaled chi-square statistic; CFI = comparative fit index; SRMR = standardized root mean square residual; RMSEA = root mean square error of approximation.
* Intercepts of three items were allowed to vary.
Overall, the results of configural invariance tests revealed that factor structure was consistent across two time periods among both adolescents and caregivers. Next, metric invariance was tested using more constrained models examining the consistency of factor loadings of corresponding items between two time periods. The models demonstrated a slightly worse fit; however, the decrease in fit did not reach statistical significance, indicating that the metric invariance assumption was met for both samples across two time periods.
Finally, the most restrictive model was estimated to assess whether the intercepts of corresponding items were equal between two time periods. For each sample, fit of the most restrictive model was compared with the fit of the metric invariance model. As shown in Tables 3 and 4, the fit worsened in both samples, suggesting that item intercepts were not equal on one or more of the items comprising the measure. Examination of the output, including modification indices, suggested that releasing some intercept constraints might contribute to small improvements in model fit. Several options were tried to improve the fit by lifting the intercept constraints of various items separately for both samples. For adolescents, the best option was achieved when constraints on only two intercepts were left in place, “In times of crisis we can turn to each other for support” and “We can express feelings to each other.” Examination of intercept values for the items that were allowed to vary revealed that adolescents had higher values during Time 1 on all three items than at Time 2. However, these differences were small.
Table 4.
Models | S-B χ² | df | p-value | CFI | SRMR | RMSEA [90% CI] | ΔCFI | ΔSRMR | ΔRMSEA |
---|---|---|---|---|---|---|---|---|---
Configural | 14.236 | 10 | 0.162 | 0.997 | 0.019 | 0.031 [0.000, 0.064] | - | - | - |
Metric | 16.186 | 14 | 0.302 | 0.993 | 0.029 | 0.019 [0.000, 0.051] | 0.004 | 0.010 | 0.012 |
Scalar | 44.561 | 24 | 0.006 | 0.968 | 0.093 | 0.044 [0.023, 0.064] | 0.029 | 0.064 | 0.025 |
Partial scalar* | 40.392 | 22 | 0.010 | 0.971 | 0.086 | 0.043 [0.022, 0.066] | 0.003 | 0.007 | 0.001 |
Note: Columns S-B χ² through RMSEA report model fit; the Δ columns report model comparison. S-B χ² = Satorra-Bentler scaled chi-square statistic; CFI = comparative fit index; SRMR = standardized root mean square residual; RMSEA = root mean square error of approximation.
* Intercepts of two items were allowed to vary.
For caregivers, a somewhat improved fit was achieved by lifting the intercept constraint for two items, “We feel accepted for what we are” and “We confide in each other.” Results of these additional tests are also reported in Tables 3 and 4. Similar to the analyses with the adolescent subsample, examination of intercept values revealed that caregivers rated their family functioning slightly higher at Time 1 than at Time 2. In all cases, however, the differences were minor.
In sum, although metric invariance models demonstrated a better fit than scalar invariance models for adolescents and caregivers, comparison of descriptive indices revealed that the difference in fit was not significant. Therefore, the partial scalar invariance assumption was supported for adolescents and caregivers across two time periods. Finally, to evaluate the influence of the invariance testing on scales, scale scores were recalculated using only those items that were retained at each level of invariance testing. Table 5 presents these results and summarizes results of all measurement invariance tests.
Table 5.
Item No | Item Wording | Across groups: Configural | Across groups: Metric | Across groups: Scalar | Across time: Configural, adolescents (caregivers) | Across time: Metric, adolescents (caregivers) | Across time: Scalar, adolescents (caregivers) |
---|---|---|---|---|---|---|---
1 | In times of crisis we can turn to each other for support | x | x | x | x (x) | x (x) | x (x) |
2 | We can express feelings to each other | x | x | x | x (x) | x (x) | x (x) |
3 | We feel accepted for what we are | x | x | o | x (x) | x (x) | o (x) |
4 | We are able to make decisions about how to solve problems | x | x | x | x (x) | x (x) | o (o) |
5 | We confide in each other | x | x | o | x (x) | x (x) | o (o) |
 | Scale means for adolescents (caregivers) | 3.17 (3.32) | 3.17 (3.32) | 3.23 (3.37) | | | |
 | Scale means for adolescents Time 1 (Time 2) | | | | 3.17 (3.11) | 3.17 (3.11) | 3.26 (3.16) |
 | Scale means for caregivers Time 1 (Time 2) | | | | 3.32 (3.32) | 3.32 (3.32) | 3.37 (3.34) |
Note: Column groups indicate levels of invariance. x = item retained, o = item not retained.
Discussion
The goal of this study was to examine whether a short version of the General Functioning subscale of the McMaster Family Assessment Device is measurement-invariant across groups and time. Given that family environment is a construct of interest to researchers from multiple disciplines and that a short version of FAD has the potential to become a popular tool in a variety of disciplines and with diverse populations, it is important to ensure that statistical properties of this measure are in line with current validity standards (French & Finch, 2006). To our knowledge, this is the first study that examined measurement invariance of a short version of the General Functioning subscale of the McMaster Family Assessment Device across groups and time. Several findings stand out from the present study.
First, analyses of configural measurement invariance confirmed that a short version of the GF subscale of FAD measures family environment in the same way across caregivers and adolescents and that the structure of this construct is stable across time. Second, tests of metric invariance provided evidence that items comprising the scale were interpreted similarly across caregivers and adolescents. Additionally, the study established that a short version of the General Functioning subscale of FAD can be used across four years. For adolescents, this translates into a substantial developmental age range, from 13 to 17 years. The evidence of metric invariance is especially important as it deals with the most critical concern regarding construct validity (French & Finch, 2006). Together, the results of the initial measurement invariance tests established that adolescents and caregivers agreed on the nature and measurement of family environment, thus providing further assurance to researchers who might consider using a short version of the GF subscale of FAD with various populations and at multiple time periods.
Furthermore, evidence of partial scalar invariance of FAD, both across groups and time, suggested that certain items of the measure might have been perceived differently by caregivers and adolescents and by the same respondents surveyed across two time periods. In more detail, partial invariance of two items from the short subscale revealed that caregivers were more likely than adolescents to agree with the statements “We feel accepted for what we are” and “We are able to make decisions about how to solve problems.” These findings are consistent with previous research that established that parents have higher scores on positive aspects of family functioning (Thompson, 2013). This implies that, on average, adolescents might need a greater assurance that they “are accepted for what they are” and that they “are able to make decisions about how to solve problems” before agreeing with these items in comparison with caregivers. At the same time, these findings suggest the presence of unmeasured individual or family factors that could influence the perception of family functioning among different family members.
Next, partial scalar invariance of selected FAD items across time revealed that although adolescents agreed on the nature and measurement of the overall measure, they displayed some bias in how they perceived selected items at two different points in time. For example, adolescents’ mean responses on three items were higher at Time 1 than at Time 2, indicating that as adolescents get older their perceptions of family environment shift. Perhaps as they mature, adolescents need to feel a greater average assurance that they are “accepted for what they are” and that they are “able to make decisions about how to solve problems” before agreeing with those items. It is also likely that these differences are true, meaning there is a real difference between younger and older adolescents in how they perceive certain aspects of family environment. However, the lack of complete scalar invariance (unequal item intercepts) does not allow us to fully explain these shifts.
Notwithstanding a lack of clear guidelines for interpreting small deviations from measurement invariance, this study suggests the need for a more theoretically grounded perspective to account for these differences more fully. As a starting point, research would benefit from considering a developmental approach to interpreting the evidence of measurement variance in family functioning both across different stakeholders (e.g., adolescents and caregivers) and over time. As a unique developmental period, adolescence is often characterized as a dynamic set of processes that occur within the individual, which reinforce and are reinforced by outside factors (Jaworska & MacQueen, 2015). Thus, it is reasonable to suggest that these shifts would influence family dynamics and in turn would be influenced by factors within the family domain. Consequently, it is likely that deviations from invariance are simply indications of true qualitative shifts occurring during this distinct developmental period. In other words, some level of variance is expected and might appear normative, given the many transitions of adolescence.
Following the recommendations of Adolf and colleagues (2014), the current study seeks to add to a growing body of literature that explores stakeholder- and time-related variance as a meaningful phenomenon. For example, future studies could explore the meaning behind indicators that displayed partial invariance. Similarly, examination of the degree to which variance is present among study participants would be informative. In other words, is measurement variance observed among all study participants or does it manifest only among a subsample? If variance is expected, especially as children get older, how should one interpret the evidence of invariance? Consequently, exploring individual and contextual factors that might explain sources of these differences could help advance this evolving field. As Adolf et al. (2014) noted, the finding of measurement variance “is not necessarily the end of an investigation.” However, establishing sources of heterogeneity within (time) and between (adolescents vs. caregivers) stakeholders should be based on theoretical and practical considerations. At the same time, caution should be exercised to avoid assigning too much weight to minor deviations from ideal standards of measurement invariance (Putnick & Bornstein, 2016). As Putnick and Bornstein summarized, “The concern is that potentially important comparative research will never see the light of print if full invariance cannot be achieved” (p. 19).
To sum up, more research is needed to investigate developmental differences in perceived family environment among adolescents. Several pathways for exploring the meaning and consequences of the evidence of partial measurement variance are offered below.
Implications for Research and Practice
The findings from our study have several implications for the assessment of certain aspects of family environment. First, this study adds to a large pool of literature utilizing quantitative group assessments by demonstrating that a short version of the General Functioning subscale of FAD is a robust instrument and complies with current validity standards, including configural and metric invariance of the measure across groups and time. Thus researchers incorporating family environment measures in their studies will be encouraged to know that a short version of the GF subscale of FAD is well-suited to measure family dynamics among different populations and across different time periods.
Second, the items that showed partial scalar invariance suggest that caregivers’ perceptions of certain aspects of family functioning might not match adolescents’ perceptions. The results further suggest that adolescents might attach a different or more nuanced meaning to the statements “We feel accepted for what we are” and “We are able to make decisions about how to solve problems.” This finding calls for follow-up analyses that would investigate sources of these differences. Similarly, results of tests of longitudinal measurement invariance uncovered a set of items that might convey different meanings depending on the time of data collection. Four years is a long time for most adolescents, during which many developmental shifts take place. The fact that there were differences in mean scores across years calls for further investigation of the sources of these differences. Importantly, a closer investigation of the magnitude of these differences revealed that they were rather modest. At the same time, it should be noted that after an overall lack of invariance has been established, detecting differences among individual indicators can be especially challenging due to low power, even in studies with large samples (French & Finch, 2006). Consequently, the ability to explain or interpret differences for individual indicators is reduced as well.
Perhaps most importantly, these findings suggest that until the nature of the differential functioning of these items is understood more clearly, researchers should refrain from using full scale scores to compare caregivers’ and adolescents’ views on family environment. Similarly, these findings highlight the need for caution in using difference scores to compare perceptions of family environment across different time periods (Roesch et al., 2013). Thus, researchers interested in exploring discrepancies in opinions of family functioning between caregivers and children might need to investigate the nature of these opinions within each group more fully.
In terms of practical implications, the study findings might be of interest to practitioners who evaluate family environment using the perspectives of multiple family members. In general, adolescents perceive family functioning more negatively than do their caregivers and other family members (Feldman, Wentzel, & Gehring, 1989; Georgiades et al., 2008; Ohannessian & De Los Reyes, 2014). This has important implications since practitioners and researchers alike do not always have access to multiple family members or to the caregiver-adolescent dyad to gather information on family dynamics. Practitioners should be mindful of this trend in evaluating single responses of family functioning.
The findings from this study suggest several additional directions for future research. First, the focus of this research was limited to testing measurement invariance in two main categories: (a) across caregivers and adolescents and (b) across time. Future research might examine whether measurement equivalence of a short version of FAD holds across other domains. For example, some researchers might be interested in testing for measurement invariance across other subgroups defined by gender, socio-economic status, race, or other demographic variables, along with other variables that might divide a sample into subgroups. Furthermore, longitudinal measurement invariance was tested using separate subsamples of caregivers and adolescents. It would be interesting to see if the results of longitudinal invariance hold when these tests are repeated with the data from both groups aggregated.
Limitations
Detecting a lack of measurement invariance is a complex yet flexible procedure requiring a combination of direct comparison and follow-up analyses to test the structure of the construct and evaluate fit statistics. Many of the steps involved in the process of measurement invariance testing rely on multiple indices. Unfortunately, there is no agreement on what constitutes the best set of indicators of a measurement-invariant instrument, often forcing researchers to choose among competing explanations (Nye & Drasgow, 2011).
In light of these complexities, several limitations are worth noting. The use of the robust maximum likelihood estimation method might be considered a limitation, given the nature of the scale responses (French & Finch, 2006; Muthen & Asparouhov, 2002). The use of categorical responses in studies involving SEM is a controversial issue (Grace-Martin, 2008). On the one hand, some have advocated for treating Likert-type items as continuous data because they represent discrete manifestations of a latent continuous distribution (O’Connor, Leach, Mama, & Lee, 2015). On the other hand, other researchers have claimed that such variables should be treated as ordered categories because Likert-type scales require respondents to choose a discrete category, not a value from a continuum (Sullivan & Artino, 2013). As a consequence, the estimation methods used to fit models often depend on a researcher's preference in treating data from Likert-type scales.
It should be noted that one more layer of constraints is often used as a final step to assess measurement invariance: strict invariance, which constrains error variances to be equal between groups or time periods. Although common, the test of unique or error invariance is debatable (Roesch et al., 2013). Opponents of this test argue that invariance of these model parameters should not be tested because it is based on the assumption that random measurement error will behave in a systematic fashion, which is too restrictive and perhaps unrealistic (Little, 1997). For this reason, this study did not include this final step in the assessment of the properties of a short version of FAD.
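For readers less familiar with the hierarchy of constraints referred to above, the levels of invariance can be written compactly in conventional multigroup CFA notation. The symbols below are standard textbook notation and are an assumption of this sketch; they are not taken from the model specification used in this study. Here $x_{ig}$ is the response to item $i$ in group (or at occasion) $g$, $\tau$ is an item intercept, $\lambda$ a factor loading, $\xi$ the latent family functioning factor, and $\theta$ the error variance.

```latex
\begin{align}
x_{ig} &= \tau_{ig} + \lambda_{ig}\,\xi_{g} + \varepsilon_{ig},
  \qquad \operatorname{Var}(\varepsilon_{ig}) = \theta_{ig} \\
\text{Configural:} &\quad \text{same pattern of free and fixed loadings for every } g \\
\text{Metric:} &\quad \lambda_{ig} = \lambda_{i} \\
\text{Scalar:} &\quad \lambda_{ig} = \lambda_{i},\ \tau_{ig} = \tau_{i} \\
\text{Strict:} &\quad \lambda_{ig} = \lambda_{i},\ \tau_{ig} = \tau_{i},\ \theta_{ig} = \theta_{i}
\end{align}
```

In this notation, partial scalar invariance corresponds to the equality $\tau_{ig} = \tau_{i}$ holding for only a subset of items, and the strict step omitted here is the additional equality on $\theta$.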
Finally, the data for this study came from adolescents and caregivers in a rural county in the south-eastern USA. Because the evidence of measurement invariance is specific to the study sample, it complicates the generalizability of results to other contexts. For example, caution must be exercised when using the instrument with samples from urban and suburban areas. To some extent, this problematic aspect relates to a concern raised by researchers studying measurement properties of latent constructs: that a latent variable may manifest differently under different conditions (Byrne & Campbell, 1999). In other words, the assumption that a latent variable is a given, regardless of the setting in which it operates, might be questionable.
Unfortunately, a lack of clear guidelines, both for executing the steps required for testing and for interpreting and reporting results, makes generalizing problematic (Putnick & Bornstein, 2016). As future studies of measurement invariance become available, consensus should grow regarding how best to generalize results. For this study, a conservative approach was taken.
Summary
In sum, evidence of configural and metric invariance enables researchers to examine relationships between a brief measure of family functioning and other constructs across caregivers and adolescents as well as across time. However, the lack of evidence of complete scalar invariance might prevent researchers from attributing differences in mean scores on this measure solely to differences in underlying factors or true mean difference between groups or across time. Rather, these differences can be due to several reasons: (a) differential item functioning across groups, (b) differential item functioning across time, (c) unmeasured factors, (d) true mean differences in underlying factors, or (e) a combination of all of these options (Byrne & Stewart, 2006). Therefore, more research is encouraged to help clarify and explain the differential functioning of selected items from a short version of the General Functioning subscale of the McMaster Family Assessment Device.
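The point about mean comparisons can be illustrated with a small numerical sketch. It assumes a one-factor model with equal loadings across groups (metric invariance), in which the expected response to item i in group g is the item intercept plus the loading times the latent mean. All parameter values, class names, and function names below are hypothetical and chosen only for illustration; they are not estimates from this study.

```python
from dataclasses import dataclass


@dataclass
class GroupParams:
    """Hypothetical one-factor model parameters for one group."""
    intercepts: list[float]   # item intercepts (tau_i)
    loadings: list[float]     # factor loadings (lambda_i)
    latent_mean: float        # latent factor mean (kappa)


def expected_scale_mean(p: GroupParams) -> float:
    """Expected mean scale score: average of tau_i + lambda_i * kappa."""
    items = [t + l * p.latent_mean for t, l in zip(p.intercepts, p.loadings)]
    return sum(items) / len(items)


def decompose_mean_gap(a: GroupParams, b: GroupParams) -> tuple[float, float]:
    """Split the expected scale-mean gap (a minus b) into two parts.

    Returns (intercept_part, latent_part). Requires equal loadings,
    i.e. metric invariance, so that the split is unambiguous.
    """
    if a.loadings != b.loadings:
        raise ValueError("decomposition assumes equal loadings across groups")
    n = len(a.loadings)
    intercept_part = sum(ta - tb for ta, tb in zip(a.intercepts, b.intercepts)) / n
    latent_part = (sum(a.loadings) / n) * (a.latent_mean - b.latent_mean)
    return intercept_part, latent_part


# Illustrative values only: two of four items have higher intercepts for
# caregivers, while the latent means of the two groups are identical.
caregivers = GroupParams([3.0, 3.0, 3.4, 3.3], [0.8, 0.7, 0.9, 0.6], 0.0)
adolescents = GroupParams([3.0, 3.0, 3.0, 3.0], [0.8, 0.7, 0.9, 0.6], 0.0)

intercept_part, latent_part = decompose_mean_gap(caregivers, adolescents)
print(f"gap due to intercepts: {intercept_part:.3f}")
print(f"gap due to latent means: {latent_part:.3f}")
```

In this hypothetical case the observed scale means differ even though the latent means are equal, so the whole gap comes from the non-invariant intercepts. This is the situation in which comparing full scale scores across caregivers and adolescents would be misleading.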
Footnotes
In this study, the terms “a short version of FAD,” “a short version of the General Functioning subscale of the McMaster Family Assessment Device,” and “FAD” are used interchangeably throughout the text.
References
- Aarons GA, McDonald EJ, Connelly CD, & Newton RR (2007). Assessment of family functioning in Caucasian and Hispanic Americans: Reliability, validity, and factor structure of the Family Assessment Device. Family Process, 46, 557–569.
- Adolf J, Schuurman NK, Borkenau P, Borsboom D, & Dolan CV (2014). Measurement invariance within and between individuals: A distinct problem in testing the equivalence of intra- and inter-individual model structures. Frontiers in Psychology, 5. DOI: 10.3389/fpsyg.2014.00883
- Bagley C, Bertrand L, Bolitho F, & Mallick K. (2001). Discrepant parent-adolescent views on family functioning: Predictors of poorer self-esteem and problems of emotion and behaviour in British and Canadian adolescents. Journal of Comparative Family Studies, 32, 393–403.
- Boterhoven de Haan KL, Hafekost J, Lawrence D, Sawyer MG, & Zubrick SR (2015). Reliability and validity of a short version of the General Functioning subscale of the McMaster Family Assessment Device. Family Process, 54, 116–123. DOI: 10.1111/famp.12113
- Byles J, Byrne C, Boyle M, & Offord D. (1988). Ontario Child Health Study: Reliability and validity of the general functioning subscale of the McMaster Family Assessment Device. Family Process, 27, 97–104. DOI: 10.1111/j.1545-5300.1988.00097.x
- Byrne BM, & Campbell TL (1999). Cross-cultural comparisons and the presumption of equivalent measurement and theoretical structure: A look beneath the surface. Journal of Cross Cultural Psychology, 30, 555–574. DOI: 10.1177/0022022199030005001
- Cheung GW, & Rensvold RB (2002). Evaluating goodness-of-fit indexes for testing measurement invariance. Structural Equation Modeling, 9, 233–255.
- De Los Reyes A. (2011). Introduction to the special section: More than measurement error: Discovering meaning behind informant discrepancies in clinical assessments of children and adolescents. Journal of Clinical Child & Adolescent Psychology, 40, 1–9.
- Epstein NB, Baldwin LM, & Bishop DS (1983). The McMaster family assessment device. Journal of Marital and Family Therapy, 9, 171–180. DOI: 10.1111/j.1752-0606.1983.tb01497.x
- Feldman SS, Wentzel KR, & Gehring TM (1989). A comparison of the views of mothers, fathers, and pre-adolescents about family cohesion and power. Journal of Family Psychology, 3, 39.
- French BF, & Finch WH (2006). Confirmatory factor analytic procedures for the determination of measurement invariance. Structural Equation Modeling, 13, 378–402.
- Georgiades K, Boyle MH, Jenkins JM, Sanford M, & Lipman E. (2008). A multilevel analysis of whole family functioning using the McMaster Family Assessment Device. Journal of Family Psychology, 22, 344–354. DOI: 10.1037/0893-3200.22.3.344
- Grace-Martin K. (2008). Can Likert scale data ever be continuous? Article Alley. Available from: http://www.theanalysisfactor.com/can-likert-scale-data-ever-be-continuous/
- Gregorich SE (2006). Do self-report instruments allow meaningful comparisons across diverse population groups? Testing measurement invariance using the confirmatory factor analysis framework. Medical Care, 44, S78–S94.
- Guion K, Mrug S, & Windle M. (2009). Predictive value of informant discrepancies in reports of parenting: Relations to early adolescents’ adjustment. Journal of Abnormal Child Psychology, 37, 17–30.
- Hafekost J, Lawrence D, Boterhoven de Haan L, Johnson SE, Saw S, Buckingham WJ, … Zubrick SR (2016). Australian and New Zealand Journal of Psychiatry, 50, 866–875. DOI: 10.1177/0004867415622270
- Hamilton E, & Carr A. (2016). Systematic review of self-report family assessment measures. Family Process, 55, 16–30.
- Hu L, & Bentler PM (1998). Fit indices in covariance structure modeling: Sensitivity to underparameterized model misspecification. Psychological Methods, 1, 424–451.
- Jaworska N, & MacQueen G. (2015). Adolescence as a unique developmental period. Journal of Psychiatry & Neuroscience: JPN, 40(5), 291–293. DOI: 10.1503/jpn.150268
- Johnson SE, Lawrence D, Hafekost J, Saw S, Buckingham WJ, Sawyer M, … Zubrick SR (2016). Australian and New Zealand Journal of Psychiatry, 50, 887–898. DOI: 10.1177/0004867415622562
- Kern JL, McBride BA, Laxman DJ, Dyer WJ, Santos RM, & Jeans LM (2016). The role of multiple-group measurement invariance in family psychology research. Journal of Family Psychology, 30(3), 364–374.
- Mansfield AK, Keitner GI, & Dealy J. (2015). The family assessment device: An update. Family Process, 54, 82–93. DOI: 10.1111/famp.12080
- Mansfield AK, Keitner GI, & Sheeran T. (2018). The Brief Assessment of Family Functioning Scale (BAFFS): A three-item version of the General Functioning scale of the Family Assessment Device. Psychotherapy Research. DOI: 10.1080/10503307.2017.1422213 [Epub ahead of print]
- McDonell JR, & Wolfe DA (2011). [Cohort-sequential study: rural teen dating violence victimization, perpetration]. Unpublished raw data.
- Miller IW, Ryan CE, Keitner GI, Bishop DS, & Epstein NB (2000). The McMaster approach to families: Theory, assessment, treatment and research. Journal of Family Therapy, 22, 168–189. DOI: 10.1111/1467-6427.00145
- Millsap RE (2010). Testing measurement invariance using item response theory in longitudinal data: An introduction. Child Development Perspectives, 4, 5–9.
- Morris T. (1990). Culturally sensitive family assessment: An evaluation of the Family Assessment Device used with Hawaiian-American and Japanese-American families. Family Process, 29(1), 105–116. DOI: 10.1111/j.1545-5300.1990.00105.x
- Muthen BO, & Asparouhov T. (2002). Latent variable analysis with categorical outcomes: Multiple-group and growth modeling in Mplus. Mplus Web Note No. 4. Available from: http://www.statmodel.com/examples/webnote.shtml#web4
- Muthen LK, & Muthen BO (2010). MPlus Users’ Guide, 6th ed. Los Angeles: Muthen and Muthen.
- Nye CD, & Drasgow F. (2011). Effect size indices for analyses of measurement equivalence: Understanding the practical importance of differences between groups. Journal of Applied Psychology, 96, 966–980.
- O’Connor DP, Leach HJ, Mama SK, & Lee RE (2015). Factorial invariance of the physical activity neighborhood environment survey among single versus multi-family housing residents. Research Quarterly for Exercise and Sport, 86, 303–310.
- Ohannessian CM, & De Los Reyes A. (2014). Discrepancies in adolescents’ and their mothers’ perceptions of the family and adolescent anxiety symptomatology. Parenting, 14, 1–18.
- Pires T, Assis SGD, Avanci JQ, & Pesce RP (2016). Cross-cultural adaptation of the General Functioning Scale of the Family. Revista de Saúde Pública, 50.
- Putnick DL, & Bornstein MH (2016). Measurement invariance conventions and reporting: The state of the art and future directions for psychological research. Developmental Review: DR, 41, 71–90. DOI: 10.1016/j.dr.2016.06.004
- Rikkers W, Lawrence D, Hafekost J, & Zubrick SR (2016). Internet use and electronic gaming by children and adolescents with emotional and behavioural problems in Australia – results from the second Child and Adolescent Survey of Mental Health and Wellbeing. BMC Public Health, 16: 399. DOI: 10.1186/s12889-016-3058-1
- Roesch SC, Norman GJ, Merz EL, Sallis JF, & Patrick K. (2013). Longitudinal measurement invariance of psychosocial measures in physical activity research: An application to adolescent data. Journal of Applied Social Psychology, 43, 721–729.
- Sianko N, Kunkel D, Thompson M, Small M, & McDonell JR (2019). Trajectories of dating violence victimization and perpetration among rural adolescents. Journal of Youth and Adolescence, 48(12), 2360–2376.
- Sianko N, Mece MH, & Abazi-Morina L. (2019). Adolescent and caregiver views on family functioning: Interactive influence on teen dating violence. Family Process. DOI: 10.1111/famp.12489
- Stevenson-Hinde J, & Akister J. (1995). The McMaster Model of family functioning: Observer and parental ratings in a nonclinical sample. Family Process, 34, 337–347. DOI: 10.1111/j.1545-5300.1995.00337.x
- Stuart J, & Jose PE (2012). The influence of discrepancies between adolescent and parent ratings of family dynamics on the well-being of adolescents. Journal of Family Psychology, 26, 858–868.
- Sullivan GM, & Artino AR Jr (2013). Analyzing and interpreting data from Likert-type scales. Journal of Graduate Medical Education, 5, 541–542.
- Thompson SJ (2013). How do runaway adolescents and their parents perceive the family? Measurement invariance in the family functioning scale. Journal of Child and Adolescent Behavior, 1:117. DOI: 10.4172/jcalb.1000117
- Tomas JM, & Oliver A. (1999). Rosenberg’s self-esteem scale: Two factors or method effects. Structural Equation Modeling: A Multidisciplinary Journal, 6, 84–98.
- Vandenberg RJ, & Lance CE (2000). A review and synthesis of the measurement invariance literature: Suggestions, practices, and recommendations for organizational research. Organizational Research Methods, 3, 4–70.
- Yu S, Clemens R, Yang H, Li X, Stanton B, Deveaux L, Lunn S, Cottrell L, & Harris C. (2006). Youth and parental perceptions of parental monitoring and parent-adolescent communication, youth depression, and youth risk behaviors. Social Behavior and Personality, 34, 1297–1310.