Educational and Psychological Measurement. 2024 Jan 5;84(6):1232–1244. doi: 10.1177/00131644231222603

Two-Method Measurement Planned Missing Data With Purposefully Selected Samples

Menglin Xu, Jessica A. R. Logan
PMCID: PMC11529668  PMID: 39493801

Abstract

Research designs that include planned missing data are gaining popularity in applied education research. These methods have traditionally relied on introducing missingness into data collections using the missing completely at random (MCAR) mechanism. This study assesses whether planned missingness can also be implemented when data are instead designed to be purposefully missing based on student performance. A research design with purposefully selected missingness would allow researchers to focus all assessment efforts on a target sample, while still maintaining the statistical power of the full sample. This study introduces the method and demonstrates the performance of the purposeful missingness method within the two-method measurement planned missingness design using a Monte Carlo simulation study. Results demonstrate that the purposeful missingness method can recover parameter estimates in models with as much accuracy as the MCAR method, across multiple conditions.

Keywords: planned missing, missing at random, treatment effects


Research designs that include planned missing data are becoming more common in applied education research. Planned missing data designs allow researchers to deliberately incorporate missingness into their data collection plans by randomly assigning participants to receive or not receive specific measures or items (Graham et al., 2006). Depending on the type of design, the missingness can take the form of deliberately not administering a specific subset of items to participants, not testing some participants on a given measure or set of measures, or skipping an entire measurement occasion for a subset of participants (Rhemtulla & Hancock, 2016). Regardless of the specific method, one hallmark common to all planned missing data designs is their use of randomization to assign participants, measures, or measurement occasions to be missing. Using randomization to assign missingness conditions results in data that are missing completely at random (MCAR) by virtue of the experimental design itself. When data are MCAR, we can assume that the data that are missing are equivalent to the data that are present (Graham, 2012), which is ideal for the statistical methods used to handle incomplete data (Little & Rhemtulla, 2013).

What has yet to be studied is whether planned missingness can be incorporated into research designs when missingness is not selected through randomization. In the present study, we propose a new method of selecting participants for missingness conditions based on a background characteristic, a diagnosis, trait, skill, or performance. We call this “purposefully selecting,” to distinguish it from randomly selecting a subset of participants to receive or not receive key assessments. Purposefully selecting participants into missing data conditions could be advantageous for maximizing resources and allowing researchers to focus on a specific or target population. For example, in a study testing the efficacy of a new behavioral intervention, researchers might expect the program to be effective for children who exhibit problem behaviors, but it may also improve the behaviors of all students. A purposefully planned missingness design study could focus assessment efforts on children who fall below the 25th percentile on a behavioral assessment, and missingness would be strategically assigned to only those above this cut point. Such a strategy would allow a researcher to leverage the power of the full sample of students for the key tests of intervention efficacy, while allowing more detailed examination of the skills and behaviors of the target sample.

Data that are collected following purposeful missingness, assigned deliberately based on meeting a specific criterion, would not be MCAR. However, these data would meet the definition of another missingness mechanism: missing at random (MAR). Data are considered MAR when missingness is predictable based on variables collected as part of the study (Enders, 2010). When these variables are included in multiple imputation models or as predictors during full information maximum likelihood (FIML) estimation, MAR data can yield comparable results to MCAR data (Enders, 2010). The utilization of MAR-aware analytic techniques allows for minimal bias and accurate recovery of parameter estimates, even with a large percentage of missingness (Allison, 2001; Janssen et al., 2010; Rubright et al., 2014). This thereby highlights the potential for using missing data mechanisms beyond MCAR in contemporary planned missingness research.

Although purposeful missingness is a logical extension of the original planned missing data methods, whether it can be effectively leveraged in the design and execution of planned missingness studies has yet to be investigated. Therefore, in this study, we demonstrate how MAR data perform in the context of planned missingness designs. We do so by focusing on the two-method measurement (TMM) planned missingness design.

Two-Method Measurement

TMM designs rely on a latent variable measurement framework, specifically the Bias Correction Model (Graham et al., 2006, 2013). In the TMM design, two types of measures are collected that represent the same core construct. The first is the “gold standard” measure, which has excellent measurement properties but is costly to administer in terms of money, time, and resources. The second type of measure is the “efficient” measure. This type of measure captures the same core construct but does so in a less reliable and/or more biased way. For example, children’s language skills can be measured by the Clinical Evaluation of Language Fundamentals–Preschool (CELF: P-2; Wiig et al., 2004), which has excellent reliability and validity but requires testers to be highly trained to test each child individually. Language skills can also be measured by the CELF’s short teacher-reported rating scale, that is, the Descriptive Pragmatics Profile, which due to using teacher report is less costly to the project but is subject to response bias of the teacher. The TMM with Planned Missing Data (TMM-PMD) design uses both types of measures in a strategic way. All participants receive the efficient but biased measures of the core construct, while only a subset of participants receive the more costly “gold standard” measures.

Any TMM relies on structural equation modeling (SEM) techniques to combine these two types of measurements by fitting a bias correction latent variable model (Graham et al., 2013), and the TMM-PMD model is fit within this SEM framework. As shown in Figure 1, the first measurement occasion portion is located on the left-hand side; all the measures load on the latent factor “Trait_t1,” which represents the substantive construct of interest, while the efficient measures also load on the latent factor “Bias_t1,” which represents the systematic measurement bias. The efficient measures that were administered to all participants retain the large sample size, while the gold standard measures help calibrate the measurement model. The TMM-PMD allows researchers to administer gold standard measures, which are often expensive, to a smaller subset of the sample without compromising the statistical power and estimation accuracy. Using FIML to account for the planned missing data, this measurement strategy is able to obtain estimates of the core construct for each participant even though data on the gold standard measures is missing for a subset of the sample. For a practical step-by-step guide on how to conduct an analysis for the TMM-PMD, see Xu and Logan (2021).

Figure 1.

Two-Method Longitudinal Model for Data Generation.

Note. The suffix “_t1” indicates the first measurement occasion, while “_t2” indicates the second.

Longitudinal Developments

Since the TMM-PMD method was proposed in 2006, some additional advancements have been made, two of which are relevant to this study. First, Garnier-Villarreal et al. (2014) extended the TMM design to longitudinal measurement. Through fitting the same measurement model to two or more occasions, this research method can be used to examine change over time. Second, Xu and Logan (2021) extended the longitudinal design to the estimation of treatment effects, where an autoregressive model is used to compare a treatment and a control group on an outcome. The present study builds on these two approaches, fitting the same autoregressive longitudinal model in examining a treatment effect as was fitted in Xu and Logan (2021).

Prior Simulation Studies of the TMM-PMD

Several simulation studies have examined the performance of the TMM-PMD design, though all used the MCAR missingness mechanism. For example, Graham et al. (2006) conducted simulation studies to examine the parameter estimation performance of the TMM-PMD. They used a population model with four indicators, two expensive and two efficient measures, and varied the factor loading conditions of each measurement type. Their results showed that the most efficient design, that is, the one with the smallest standard error, was a planned missingness design rather than a complete case design, and the efficiency gain was more pronounced when the factor loadings of the efficient measures on the latent trait factor were larger. Garnier-Villarreal et al. (2014) further examined the performance of the TMM design in a longitudinal context. Based on a population model with four measurement occasions and six indicators (three expensive and three efficient indicators at each occasion), their results showed that when response bias was represented by distinct latent factors over time, gold standard measures should be included at each time point to achieve accurate parameter estimates for response bias loadings, focal construct loadings, and structural paths. Hence, we adopt the bias correction model with both a latent trait construct and a response bias factor specified at each measurement occasion in our simulation.

The Present Study

The goal of the present study is to demonstrate the potential effectiveness of using purposeful missingness (MAR) in a planned missing data design. We do so following the model of a longitudinal TMM-PMD for treatment effects presented in Xu and Logan (2021). This present study examines whether purposeful missingness yields similar results to the more traditional random missingness (MCAR) within the context of this model. More specifically, in the present study, we conduct a simulation to demonstrate: (a) Whether and to what extent bias is introduced in the estimation of the TMM-PMD model under MAR compared with MCAR, and to what extent bias varies by the size of the autoregressive path and (b) How statistical power of a treatment effect is affected by missingness mechanism and autoregressive path sizes within the context of the PMD.

Illustration

Data Generation

Data were generated using Mplus 8.0 (Muthén & Muthén, 1998–2017) and R 4.10 (R Core Team, 2021). All code used to generate and combine datasets is available through the Open Science Framework (OSF): https://osf.io/uqh6s/.

Population Model

The population model is shown in Figure 1. The primary construct of interest is the latent trait variable (Trait_t1 in Figure 1), which is composed of three efficient measures (“cheap” for shorthand: cheap1–cheap3) and two gold standard measures (gold1–gold2). The three efficient measures also load on the latent bias variable (Bias_t1). The TMM model at Time 2 was the same as that of Time 1. The autoregressive path is included between the latent trait factors from Time 1 to Time 2, and the group effect representing the difference between a treatment and control group is reflected by a group variable loading on Trait_t2.
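The structure described above can be written out as a set of equations. This is only a sketch of the verbal description: the symbols λ and β follow the notation used later in the article, while κ (the loading on the bias factor), ε (indicator residuals), and ζ (the residual of Trait_t2) are notation introduced here for illustration, and intercepts are omitted for brevity.

```latex
\begin{align*}
\text{cheap}_{j,t} &= \lambda_{j}\,\text{Trait}_{t} + \kappa_{j}\,\text{Bias}_{t} + \varepsilon_{j,t}, & j &= 1, 2, 3 \\
\text{gold}_{k,t}  &= \lambda_{k}\,\text{Trait}_{t} + \varepsilon_{k,t}, & k &= 4, 5 \\
\text{Trait}_{t2}  &= \beta\,\text{Trait}_{t1} + \beta_{g}\,\text{group} + \zeta
\end{align*}
```

Only the efficient measures carry the bias term, which is what lets the gold standard measures calibrate the trait factor.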

Parameters for data generation were set based on the empirical data in Xu and Logan (2021). The loadings of the efficient measures were set to 0.4 on the trait factor and 0.7 on the response bias factor capturing rater bias. The path from group to Trait_t2 was set at 0.3. The variances of all the latent factors were fixed to unity, and the residual variances were set to values such that all observed variables had unit variance. Because the autoregressive coefficient was part of our research questions, we varied it across small, medium, and large sizes, that is, 0.3, 0.5, or 0.7.
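As a worked illustration of how a residual variance follows from the unit-variance constraint, consider one efficient measure. The calculation assumes that the trait and bias factors are uncorrelated within an occasion, which the text does not state explicitly; κ denotes the bias loading and θ the residual variance, both notation introduced here.

```latex
\begin{align*}
\operatorname{Var}(\text{cheap}_j) &= \lambda_j^{2} + \kappa_j^{2} + \theta_j = 1 \\
\theta_j &= 1 - 0.4^{2} - 0.7^{2} = 1 - 0.16 - 0.49 = 0.35 \\
\operatorname{Corr}(\text{cheap}_j, \text{cheap}_{j'}) &= 0.4 \cdot 0.4 + 0.7 \cdot 0.7 = 0.65 \quad (j \neq j')
\end{align*}
```

Under the same assumption, most of the shared variance between two efficient measures (0.49 of 0.65) comes from the bias factor rather than the trait.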

Missingness Conditions

In all missingness conditions, we imposed missingness only on the gold standard variables, and in the same way at both time points. The percentage of missingness was held constant at 60% for all conditions. The sample size was also held constant at n = 500 for all conditions, with indicator variables following a normal distribution. The goal of this work was to contrast the performance of MAR data with that of MCAR data within the TMM-PMD model. To do so, we needed to generate data with both types of missingness. Generating MCAR data is well established: observations are randomly selected to be missing. To generate MAR data, we developed two methods, based on practical approaches that could be used for data collection in an applied study.
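The MCAR condition can be sketched in a few lines of Python. This is an illustrative sketch rather than the authors' Mplus/R code: the function name `impose_mcar`, the seeds, and the placeholder gold standard scores are all assumptions made for the example.

```python
import numpy as np

def impose_mcar(gold, missing_rate=0.60, seed=0):
    """Set a randomly chosen subset of rows to NaN on every gold standard column."""
    rng = np.random.default_rng(seed)
    n = gold.shape[0]
    n_missing = int(round(missing_rate * n))
    # Rows are drawn without regard to any observed score, hence MCAR.
    missing_rows = rng.choice(n, size=n_missing, replace=False)
    out = gold.astype(float).copy()
    out[missing_rows, :] = np.nan
    return out

# Placeholder scores for gold1 and gold2 (hypothetical, not the study's data).
rng = np.random.default_rng(1)
gold = rng.standard_normal((500, 2))
gold_mcar = impose_mcar(gold)
n_complete = int((~np.isnan(gold_mcar).any(axis=1)).sum())
```

With n = 500 and 60% missingness, 200 cases keep their gold standard scores. In a two-occasion design the same set of rows would be blanked at both time points.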

The first method is MAR_y1. For this condition, we selected one efficient variable, cheap1, and generated missingness on the gold standard assessments as a function of scores on this indicator. If a person scored at or above the 40th percentile on cheap1, they were assigned to have missing data on the gold standard assessments. Only those who scored below the 40th percentile had complete data on all assessments. In practice, this would be similar to administering one measure first, then using scores on that measure to determine who should receive the full battery of assessments and who should receive only the efficient assessments.
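The percentile rule just described can be sketched as follows. The helper name `impose_mar_by_cutoff` and the randomly generated placeholder scores are assumptions for illustration; the actual study generated data from the population model in Mplus.

```python
import numpy as np

def impose_mar_by_cutoff(selector, gold, percentile=40.0):
    """Blank the gold standard columns for cases at or above a percentile cutoff."""
    cutoff = np.percentile(selector, percentile)
    missing = selector >= cutoff          # missingness depends on an observed score
    out = gold.astype(float).copy()
    out[missing, :] = np.nan
    return out, missing

# Placeholder scores (hypothetical): cheap1 is the selection variable.
rng = np.random.default_rng(7)
cheap1 = rng.standard_normal(500)
gold = rng.standard_normal((500, 2))
gold_mar, missing = impose_mar_by_cutoff(cheap1, gold)
```

Because the selection variable is itself an indicator in the measurement model, the information that drives missingness is available to the estimator without any additional variable.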

The second MAR method is more complex and relies on an auxiliary variable (AV). An AV is a variable that is related to a focal variable’s missingness. Previous studies have found that including AVs during analysis can reduce the power loss and bias associated with MAR data (Collins et al., 2001; Enders, 2010). For these models, we generated an additional AV that was specified to correlate with Trait_t1 at different levels. Different magnitudes of ρ(Y, AV) were simulated because previous studies have found that the impact of an AV on parameter estimation performance depends on the correlation between the AV and the model variables (e.g., Collins et al., 2001; Enders, 2010). We included four conditions for the magnitude of the association between Trait_t1 and the AV: ρ(Y, AV) = 0.2, 0.4, 0.6, and 0.8. Within each condition, missingness was imposed based on a cutoff at the 40th percentile of the AV; cases that scored at or above the 40th percentile on the AV were set to have missing data on all gold standard assessments. In practice, this method would be akin to determining who in the sample should receive the gold standard assessments based on a background characteristic, a similar construct, or performance in a previous year. Some of these would be expected to be more strongly associated with the core trait (Trait_t1) than others. The MAR data are distinguishable from the MCAR data in that, in the MAR conditions, missingness was created as a function of an observed variable (either the first efficient item, cheap1, or a predefined AV), whereas in the MCAR condition, missingness did not depend on any observed variable.
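One common way to build a variable with a target correlation to a unit-variance trait is to mix the trait with independent noise. The sketch below uses that construction as an assumption; the text does not say how the AV was generated, and the name `make_auxiliary` is hypothetical.

```python
import numpy as np

def make_auxiliary(trait, rho, rng):
    """Return a unit-variance variable whose population correlation with trait is rho."""
    noise = rng.standard_normal(trait.shape[0])
    return rho * trait + np.sqrt(1.0 - rho**2) * noise

rng = np.random.default_rng(2024)
n = 500
trait_t1 = rng.standard_normal(n)         # stand-in for the latent Trait_t1

results = {}
for rho in (0.2, 0.4, 0.6, 0.8):
    av = make_auxiliary(trait_t1, rho, rng)
    # Cases at or above the 40th percentile of the AV lose their gold scores.
    missing = av >= np.percentile(av, 40)
    results[rho] = (np.corrcoef(trait_t1, av)[0, 1], missing.mean())
```

Each entry of `results` holds the sample correlation between the trait and the AV and the share of cases flagged as missing, which is 60% in every condition.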

Data Analysis and Outcome Measures

Five hundred replicates were generated for each condition using Mplus and R. For all analyses, FIML estimation in Mplus was used, and the R package MplusAutomation (Hallquist & Wiley, 2018) was used to summarize the simulation results. Parameter estimation was evaluated using relative bias (RB), 95% confidence interval (CI) coverage rate, and power. RB is calculated as $\mathrm{RB} = \frac{\hat{\theta} - \theta}{\theta} \times 100\%$, where $\hat{\theta}$ denotes the parameter estimate averaged across replicates and $\theta$ denotes the population value. An RB of less than 5% in absolute value is regarded as acceptable (Hoogland & Boomsma, 1998). The 95% CI coverage rate is the proportion of replicates in which the 95% CI contains the true value. It reflects both estimation bias and the variability of the estimates. Values closer to 95% are preferred, and values below 90% are regarded as poor coverage (Peugh & Enders, 2004).
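The two accuracy criteria can be computed from per-replicate estimates and standard errors as sketched below. The Wald-type interval (estimate ± 1.96 standard errors) is an assumption about how the intervals were formed, and the five toy replicates are invented numbers, not results from the study.

```python
import numpy as np

def relative_bias(estimates, true_value):
    """Percent relative bias of the mean estimate across replicates."""
    return (np.mean(estimates) - true_value) / true_value * 100.0

def ci_coverage(estimates, std_errors, true_value, z=1.959964):
    """Percent of replicates whose Wald-type 95% CI contains the true value."""
    lower = estimates - z * std_errors
    upper = estimates + z * std_errors
    return np.mean((lower <= true_value) & (true_value <= upper)) * 100.0

# Toy replicates for a loading with population value 0.4 (hypothetical).
est = np.array([0.39, 0.41, 0.40, 0.42, 0.39])
se = np.full(5, 0.01)
rb = relative_bias(est, 0.4)          # mean estimate 0.402, so RB is about 0.5%
cov = ci_coverage(est, se, 0.4)       # 4 of 5 intervals contain 0.4
```

In the toy data only the replicate with estimate 0.42 has an interval that excludes 0.4, so coverage is 80%.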

The estimates of interest were the factor loadings on Trait_t1 (λ11 for cheap1, and λ41 and λ51 for the gold standard measures), the autoregressive coefficient (β), and the group effect (βg). Finally, statistical power was quantified as the proportion of replicates in which the difference between groups (population value 0.30) was detected. The cutoff was set at 80%, which is typically considered the minimum requirement.
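Empirical power can be tallied from the same per-replicate output. The sketch below assumes a two-sided Wald z-test at α = .05, which the text does not specify, and uses invented toy values for the group effect estimates.

```python
import numpy as np
from math import erfc, sqrt

def empirical_power(estimates, std_errors, alpha=0.05):
    """Percent of replicates in which the estimate differs significantly from zero."""
    z = np.abs(estimates / std_errors)
    # Two-sided p-value under a standard normal reference distribution.
    p = np.array([erfc(v / sqrt(2.0)) for v in z])
    return np.mean(p < alpha) * 100.0

# Toy replicates of the group effect (hypothetical; population value 0.30).
est = np.array([0.30, 0.25, 0.10, 0.35, 0.28])
se = np.full(5, 0.10)
power = empirical_power(est, se)      # z = 3.0, 2.5, 1.0, 3.5, 2.8
```

Four of the five toy replicates exceed the critical value, giving 80% power, which sits exactly at the conventional cutoff.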

Results

Research Question 1 asked whether and to what extent RB was introduced into the coefficients of the TMM model under the various missingness conditions and autoregressive path sizes. Table 1 shows the results. The columns represent the different missingness conditions, beginning with MCAR. For the AV conditions, results are presented both when the AV is included in the analysis model and when it is not. The table spanners represent the different autoregressive path size conditions. Finally, each row within an autoregressive condition represents one of the five key outcomes: (a) λ11, the factor loading of the first efficient measure (the three loadings were identical across the cheap measures, so only one is shown); (b) λ41, the loading of the first gold standard measure on the latent construct; (c) λ51, the loading of the second gold standard measure; (d) β, the autoregressive coefficient; and (e) βg, the treatment effect, represented by the difference between the two treatment conditions. Entries in the table are percentages of RB, with values closer to zero being more desirable. To read the table, begin with the first result, −0.025. In the MCAR condition, when the autoregressive coefficient was 0.3, there is minimal bias (−0.025%) in the factor loading for the efficient measure.
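To make the scale of these entries concrete, an RB value can be converted back into the average estimate it implies. Using the population trait loading of 0.4 for the efficient measures and the first table entry, with $\bar{\theta}$ standing for the estimate averaged across replicates (notation introduced here):

```latex
\begin{align*}
\bar{\theta} &= \theta \left(1 + \frac{\mathrm{RB}}{100}\right) \\
             &= 0.4 \times \left(1 - \frac{0.025}{100}\right) = 0.4 \times 0.99975 = 0.3999
\end{align*}
```

An RB of −0.025% therefore corresponds to an average loading estimate that differs from the population value only in the fourth decimal place.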

Table 1.

Percent Relative Bias (%) by Parameter Across Conditions of Autoregressive Coefficient and Missing Data Mechanism.

| Parameter | MCAR | MAR_y1 | ρ(Y, AV) = 0.2, No AV | ρ(Y, AV) = 0.2, AV | ρ(Y, AV) = 0.4, No AV | ρ(Y, AV) = 0.4, AV | ρ(Y, AV) = 0.6, No AV | ρ(Y, AV) = 0.6, AV | ρ(Y, AV) = 0.8, No AV | ρ(Y, AV) = 0.8, AV |
|---|---|---|---|---|---|---|---|---|---|---|
| β = 0.3 | | | | | | | | | | |
| λ11 | −0.025 | −0.100 | −0.375 | 0.025 | −3.275 | 0.100 | −8.825 | 0.200 | −18.475 | 0.250 |
| λ41 | 0.478 | 0.644 | −1.467 | 0.133 | −5.567 | −0.044 | −12.933 | −0.178 | −24.644 | −0.244 |
| λ51 | −0.344 | 0.222 | −1.656 | 0.089 | −5.722 | 0.056 | −13.056 | 0.056 | −24.767 | 0.111 |
| β | 0.233 | 2.933 | −2.200 | −1.300 | −6.500 | −1.300 | −14.267 | −1.233 | −26.733 | −1.167 |
| βg | −0.900 | 0.133 | 1.600 | 0.967 | 1.667 | 0.967 | 1.767 | 1.067 | 1.933 | 1.233 |
| β = 0.5 | | | | | | | | | | |
| λ11 | 0.125 | 0.225 | −0.225 | 0.175 | −3.250 | 0.225 | −9.025 | 0.275 | −19.025 | 0.275 |
| λ41 | 0.367 | 0.656 | −1.467 | 0.167 | −5.600 | 0.000 | −13.000 | −0.144 | −24.778 | −0.222 |
| λ51 | −0.311 | 0.278 | −1.678 | 0.078 | −5.767 | 0.067 | −13.144 | 0.067 | −24.911 | 0.122 |
| β | 0.240 | 2.040 | −2.100 | −1.400 | −5.980 | −1.460 | −13.080 | −1.420 | −24.740 | −1.320 |
| βg | −0.900 | 0.333 | 1.333 | 0.700 | 1.500 | 0.733 | 1.700 | 0.833 | 1.900 | 1.033 |
| β = 0.7 | | | | | | | | | | |
| λ11 | 0.350 | 0.250 | 0.000 | 0.375 | −2.975 | 0.400 | −8.650 | 0.400 | −18.600 | 0.350 |
| λ41 | 0.200 | 0.678 | −1.500 | 0.144 | −5.633 | −0.011 | −13.056 | −0.167 | −24.878 | −0.256 |
| λ51 | −0.244 | 0.278 | −1.678 | 0.100 | −5.767 | 0.089 | −13.133 | 0.089 | −24.900 | 0.133 |
| β | 0.300 | 1.586 | −1.986 | −1.371 | −5.486 | −1.414 | −11.986 | −1.400 | −23.014 | −1.300 |
| βg | −0.700 | 0.333 | 1.100 | 0.433 | 1.467 | 0.533 | 2.067 | 0.700 | 2.700 | 0.967 |

Note. MCAR = missing completely at random condition; MAR_y1 = missing at random as a function of the first cheap measure; ρ(Y, AV) = correlation between auxiliary variable and the latent trait at time one; No AV = the auxiliary variable is not included in the analysis model; AV = the auxiliary variable is included in the analysis model; β = the autoregressive coefficient; βg = the treatment effect; λ11 = the factor loading of the first cheap measure on the latent trait; λ41 = the loading of the first gold standard measure on the latent trait; λ51 = the loading for the second gold standard measure.

As with previous studies, the RBs for all estimates were negligible for the MCAR condition (RB within 5%). Notably, the same pattern of minimal bias was found for all of the MAR_y1 conditions; the conditions where one of the efficient measures is used to determine whether participants would receive the gold standard assessments.

For the more complex conditions involving the AV, the results were still very positive. When ρ(Y, AV) was 0.2, the RBs for all the estimates were within 5%, suggesting negligible bias regardless of whether the AV was included. However, when the AV was not included (referred to as “No AV” in Table 1), the biases were generally larger as the correlation between AV and Y became stronger. Meanwhile, as can be seen in the table, each of these estimation biases was well recovered to an acceptable level when the AV was included in the analysis. The effect of the omission of AVs is also evident when examining the 95% CI coverage rates (Supplemental File 1).

For Research Question 2, we asked how statistical power was affected by missingness conditions and autoregressive path sizes. Results are shown in Supplemental File 2. The statistical power for the factor loadings and the autoregressive coefficient generally exceeded the 80% cutoff across conditions, indicating sufficient power. For the group effect, we observed statistical power below 80% under the smaller autoregressive coefficient conditions of β = 0.3 and β = 0.5. However, power exceeded 80% when β = 0.7. Meanwhile, statistical power was not remarkably affected by either the missing data mechanism or the inclusion of the AV. For example, the power for the group effect ranged from 67.2% to 75.8% when β = 0.3, and from 84.8% to 87.4% when β = 0.7, across the missing data mechanism conditions, and including or omitting the AV did not make a notable difference in the power values.

Discussion

In the current study, we sought to understand whether the TMM planned missing data methodology could be implemented when data were purposefully selected to be missing (i.e., MAR) rather than MCAR. In general, we found that parameter estimates were recovered very well across all conditions, as long as the indicator associated with the missingness was included in the model.

Of particular importance are our findings for the condition where missingness on the gold standard measures is determined based on performance on a measure that is included within the measurement model, that is, the MAR_y1 condition. These models recovered parameter estimates with minimal bias, without the need to include a separate AV. An example of such a research strategy might be as follows. First, a researcher would assess the entire sample on a set of efficient measures, such as short-form questionnaires, that measure the same basic construct. For example, these could be peer-report or self-report short-form versions of the three facets of extraversion (e.g., Soto & John, 2017). As another example, this could also be teacher report or parent report of children’s behavior (Allen et al., 2013; Putnam & Rothbart, 2006). In the next step, researchers would ask those participants who meet the predetermined cutoff point on one of these scales to also complete a long-form, or gold standard test measuring the same construct. Continuing our previous example, a researcher interested in studying extraversion could ask those people who were rated very highly by peers on the sociability facet to complete a more thorough extraversion scale. Or for those scientists studying children with behavioral difficulties, parents or teachers of a child who was rated very low on surgency or effortful control could be asked to complete the long-form questionnaire providing more details about their behavior. In this way, incorporating purposeful missingness into a data collection plan would save time, resources, and participant burden, while still allowing for a focus on a target population.

Regarding the results of Question 2, we found that the autoregressor was much more important to the statistical power of the treatment effect than any missing data condition we examined. The importance of the autoregressor is unsurprising, given previous work in this area (e.g., Petscher & Schatschneider, 2011). In summary, researchers who are designing studies using the purposefully selected method need to consider the strength of the autoregressor when determining sample size.

Limitations and Future Directions

While the TMM-PMD with purposeful missingness is a promising method, there are some disadvantages to fitting these models. The first class of disadvantages is related to general issues with the broader TMM-PMD model itself rather than the missingness mechanism, but these do warrant discussion. First, the model is complex. There are several tests and intermediate steps that must be followed to fit this model to the data, and the model requires so many parameters to be fit to the data that it may have difficulty converging. While we have demonstrated here several ways that such a model can potentially work, we would like to see future work approach this using simpler measurement models. A second general weakness of the TMM-PMD approach is that the sample size required to fit these models is similar to that of other large structural equation models. Therefore, we do not recommend this method for instance when the anticipated or achieved sample size is below established benchmarks for fitting such models (e.g., n of at least 200; Kline, 2015). However, the models are flexible in that the sample size that receives the gold standard assessments can be considerably smaller than what would be typically necessary for fitting a structural equation model. For example, with 60% missing, a total sample of 200 would include an n of only 80 with complete gold standard assessments. It would not be possible or advisable to fit a structural equation model to 80 participants, but through the use of the TMM-PMD such modeling is possible.

A second set of limitations for this study relates to the specificity of the model tested. This was a very small-scale study with only a limited set of parameters. This has several implications. First, the present study provides only a glimpse into statistical power. We caution that researchers should conduct a simulation study to determine whether statistical power is sufficient in their specific circumstances. Refer to the code in the OSF repository (https://osf.io/n6jfa/) to generate such a model. Second, the factor loadings chosen represent only one of many possibilities. Research has suggested that the strength of factor loadings is associated with model convergence and bias in structural equation models (e.g., Gu et al., 2017; Yang & Green, 2010). Third, the study included a bias factor with strong factor loadings, all of which were set to 0.70. This was due to the motivating example’s use of teacher-report measures as the efficient measures, with the bias factor capturing rater effects. Other types of efficient measures may show weaker relations, which would change the findings shown here. Future work could investigate this further.

Conclusion

Much of the work within the psychological and developmental sciences is focused on children with or at risk of a learning disability. While the planned missing data method proposed in the present study is ideal for this type of research, it is not the only target population for whom this method would be effective. Examples of other target populations may include participants with depression, children who are identified as gifted, or teachers who fail to meet a particular threshold for implementation fidelity. In each case, the target population is known and can be measured via some type of quantitative scale. In this study, we present evidence that researchers designing a study with TMM-PMD can choose to select participants for missingness conditions based on whether they belong to a specific target population. Using the TMM-PMD with the purposeful missingness method presented here, researchers would be able to focus their intense and high-quality data collection on those participants within the target population, enabling them to leverage the full sample size through widespread administration of the efficient measures, all while maintaining a high-quality representation of the core construct of interest through direct high-quality testing of the target sample. As long as the variable used to select the sample is included in the estimation model either as part of the factor model or as an AV, the estimates can be recovered with minimal bias and with minimal impact on statistical power of key conclusions.

Supplemental Material

Supplemental material, sj-docx-1-epm-10.1177_00131644231222603 for Two-Method Measurement Planned Missing Data With Purposefully Selected Samples by Menglin Xu and Jessica A. R. Logan in Educational and Psychological Measurement

Footnotes

The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Funding: The authors received no financial support for the research, authorship, and/or publication of this article.

Data, Materials, and Online Resources: The code used to conduct this study and analyze the data is available on the Open Science Framework: https://osf.io/uqh6s/.

Supplemental Material: Supplemental material for this article is available online.

References

  1. Allen K. D., Kuhn B. R., DeHaai K. A., Wallace D. P. (2013). Evaluation of a behavioral treatment package to reduce sleep problems in children with Angelman Syndrome. Research in Developmental Disabilities, 34(1), 676–686.
  2. Allison P. D. (2001). Missing data. Sage.
  3. Collins L. M., Schafer J. L., Kam C.-H. (2001). A comparison of inclusive and restrictive strategies in modern missing data procedures. Psychological Methods, 6, 330–351.
  4. Enders C. K. (2010). Applied missing data analysis. The Guilford Press.
  5. Garnier-Villarreal M., Rhemtulla M., Little T. D. (2014). Two-method planned missing designs for longitudinal research. International Journal of Behavioral Development, 38(5), 411–422.
  6. Graham J. W. (2012). Missing data: Analysis and design. Springer.
  7. Graham J. W., Cumsille P. E., Shevock A. E. (2013). Methods for handling missing data. In Schinka W. F., Velicer W. F., Weiner I. B. (Eds.), Handbook of psychology: Research methods in psychology (pp. 109–141). Wiley.
  8. Graham J. W., Taylor B. J., Olchowski A. E., Cumsille P. E. (2006). Planned missing data designs in psychological research. Psychological Methods, 11(4), 323–343.
  9. Gu H., Wen Z., Fan X. (2017). Examining and controlling for wording effect in a self-report measure: A Monte Carlo simulation study. Structural Equation Modeling: A Multidisciplinary Journal, 24(4), 545–555.
  10. Hallquist M. N., Wiley J. F. (2018). MplusAutomation: An R package for facilitating large-scale latent variable analyses in Mplus. Structural Equation Modeling, 25(4), 621–638.
  11. Hoogland J. J., Boomsma A. (1998). Robustness studies in covariance structure modeling. An overview and a meta-analysis. Sociological Methods & Research, 26, 329–367.
  12. Janssen K. J., Donders A. R. T., Harrell F. E., Jr., Vergouwe Y., Chen Q., Grobbee D. E., Moons K. G. (2010). Missing covariate data in medical research: To impute is better than to ignore. Journal of Clinical Epidemiology, 63(7), 721–727.
  13. Kline R. B. (2015). Principles and practice of structural equation modeling. The Guilford Press.
  14. Little T. D., Rhemtulla M. (2013). Planned missing data designs for developmental researchers. Child Development Perspectives, 7(4), 199–204. doi: 10.1111/cdep.12043
  15. Muthén L. K., Muthén B. O. (1998–2017). Mplus user’s guide (8th ed.).
  16. Petscher Y., Schatschneider C. (2011). A simulation study on the performance of the simple difference and covariance-adjusted scores in randomized experimental designs. Journal of Educational Measurement, 48(1), 31–43.
  17. Peugh J. L., Enders C. K. (2004). Missing data in educational research: A review of reporting practices and suggestions for improvement. Review of Educational Research, 74(4), 525–556.
  18. Putnam S. P., Rothbart M. K. (2006). Development of short and very short forms of the Children’s Behavior Questionnaire. Journal of Personality Assessment, 87(1), 102–112.
  19. R Core Team. (2021). R: A language and environment for statistical computing. R Foundation for Statistical Computing. https://www.R-project.org/
  20. Rhemtulla M., Hancock G. R. (2016). Planned missing data designs in educational psychology research. Educational Psychologist, 51(3–4), 305–316.
  21. Rubright J. D., Nandakumar R., Gluttin J. J. (2014). A simulation study of missing data with multiple missing x’s. Practical Assessment, Research, and Evaluation, 19(1), 10.
  22. Soto C. J., John O. P. (2017). The next Big Five Inventory (BFI-2): Developing and assessing a hierarchical model with 15 facets to enhance bandwidth, fidelity, and predictive power. Journal of Personality and Social Psychology, 113(1), 117–143.
  23. Wiig E. H., Secord W. A., Semel E. (2004). Clinical evaluation of language fundamentals–preschool (2nd ed.). Harcourt Assessment.
  24. Xu M., Logan J. A. (2021). Treatment effects in longitudinal two-method measurement planned missingness designs: An application and tutorial. Journal of Research on Educational Effectiveness, 14(2), 501–522.
  25. Yang Y., Green S. B. (2010). A note on structural equation modeling estimates of reliability. Structural Equation Modeling, 17(1), 66–81.

Articles from Educational and Psychological Measurement are provided here courtesy of SAGE Publications