Abstract
Despite the need for sensitivity analysis to nonignorable missingness in intensive longitudinal data (ILD), such analysis is greatly hindered by novel ILD features, such as large data volume and complex nonmonotonic missing-data patterns. Likelihood of alternative models permitting nonignorable missingness often involves very high-dimensional integrals, causing curse of dimensionality and rendering solutions computationally prohibitive to obtain. We aim to overcome this challenge by developing a computationally feasible method, nonlinear indexes of local sensitivity to nonignorability (NISNI). We use linear mixed effects models (LMMs) for the incomplete outcome and covariates. We use Markov multinomial models to describe complex missing-data patterns and mechanisms in ILD, thereby permitting missingness probabilities to depend directly on missing data. Using a second-order Taylor series to approximate likelihood under nonignorability, we develop formulas and closed-form expressions for NISNI. Our approach permits the outcome and covariates to be missing simultaneously, as is often the case in ILD, and can capture U-shaped impact of nonignorability in the neighborhood of the MAR model without fitting alternative models or evaluating integrals. We evaluate performance of this method using simulated data and real ILD collected by the Ecological Momentary Assessment method.
Keywords: Sensitivity Analysis, Linear Mixed Effects Model, Missing Data, Nonignorability, Nonlinear Sensitivity Index
1. Introduction
With development of technology, collecting real-time data using electronic devices is becoming increasingly common. Researchers now can conduct extensive longitudinal studies involving numerous participants and collect information about their daily experiences, behaviors, and environments.1 Data from such studies involve long, intensively collected measurements that can capture changes in multiple variables concurrently and over time and are usually referred to as intensive longitudinal data (ILD).2 One major method of obtaining ILD is Ecological Momentary Assessment (EMA), where prompts are sent to electronic devices, and participants are asked to report their current behaviors and experiences randomly or when events happen. EMA can minimize recall bias, achieve greater ecological validity, and allow researchers to capture relationships among different variables over time.3–5
Missingness is ubiquitous and often unavoidable in ILD. In EMA, given that participants are prompted frequently over time, it is expected that noncompliance or dropout will occur during data collection, leading to missing data. Currently, a majority of statistical analyses assume that missing data do not influence missing probability after conditioning on observed data, known as missing at random (MAR).6 Under MAR and the additional assumption that parameters governing the outcome model and the missing-data mechanism (MDM) are distinct, MDM becomes ignorable6,7 so that a valid likelihood-based/Bayesian inference does not need to model MDM. Thus, the assumption of MAR or (closely related) ignorability greatly simplifies analysis. However, this critical assumption cannot be verified robustly because observed data provide little information about MDM.8,9 In self-reported EMA data, it is possible that reasons for non-response relate to the variable of interest, after conditioning on the observed information. Under such nonignorable missingness, assuming MAR generally yields biased and invalid estimation results.10–13 It is crucial to conduct sensitivity analysis to assess validity and reliability of MAR analysis to alternative missing-data assumptions.8,9,14
Such sensitivity analysis should consider several novel features and notable challenges in ILD. Intensive data collection often produces many more missing values per subject than traditional short-panel data, even if the rate of non-responded prompts is modest per subject. Furthermore, unlike short-panel data, ILD typically contain numerous intermittent missing values and nonmonotonic missing-data patterns. In fact, in our EMA data, the number of unique missing-data patterns was equal to the number of subjects, leading to extremely sparse data for each missing-data pattern. Finally, there is often simultaneous missingness in regression outcome and covariates arising from non-responses. Because of such factors, sensitivity analysis encounters curse of dimensionality in such high-dimensional missing-data problems, which calls for general and flexible sensitivity analysis methods that are also computationally feasible to use with ILD.
Herein, we introduce a tractable sensitivity index method of evaluating sensitivity of a multilevel ILD analysis to alternative missing-data assumptions. We develop a joint selection model that augments (1) a multilevel model for notional complete outcome of interest with (2) a product conditional model for missing covariates and (3) a Markov Multinomial Missing-data Model (M2-MDM) that permits intermittent missingness and dropouts to depend directly on unobserved values in outcome and covariates by nonignorability parameters. Brute-force sensitivity analysis then varies the nonignorability parameters in a plausible range and estimates the resulting range of joint selection models. Curse of dimensionality in the ILD missing-data problem manifests here as high-dimensional integrals in the likelihood of the aforementioned joint selection models. The likelihood function, which is to be optimized, must be evaluated for combinations of potential unobserved data at non-responded occasions for each subject. Thus, computational workload required to evaluate such integrals increases exponentially with increasing number of intermittent missingness and missing covariates and can become computationally prohibitive with increasing data intensiveness. Such rapid increase in complexity with increasing number of missing values per subject is an example of curse of dimensionality. To solve such problems, we developed closed-form expressions for the sensitivity indexes, based on a second-order Taylor series approximation to the likelihood of the joint selection model. Computation of the sensitivity indexes avoids fitting complicated joint selection models or evaluating high-dimensional integrals. As a result, the method is tractable to use in ILD.
Our approach aims to answer the question: how are results from multilevel ILD modeling affected by possible violation of MAR assumption? Our approach has the advantage of directly addressing this question by providing a parsimonious sensitivity index for each parameter estimate in the multilevel model. One alternative approach to answering this question is to jointly estimate all the parameters in the aforementioned selection model. Although this is possible in principle, numerous studies have shown that the likelihood function for a nonignorable selection model is not well behaved. The likelihood function can be flat, non-convex, or multi-modal,8,15–17 even if the nonignorability parameter is weakly identified and conclusions sensitively depend on model assumptions untestable in the presence of missing data.18 Reliable estimation of the joint selection model may require additional data such as availability of instrumental variables or refreshment samples.19,20 In ILD, such estimation is further complicated by the aforementioned curse of dimensionality, which can render the joint estimation approach computationally unfeasible. The method developed herein can be viewed as a tool for quickly screening for sensitivity before investing a great deal of time and resources to overcome conceptual and computational challenges associated with such arduous joint estimation tasks and is consistent with the recommendation that the selection model is best used as a provisional device to perform sensitivity analysis.8,9,15,21
Another alternative is to consider sensitivity analysis based on a semiparametric selection model.22,23 With a sufficiently unstructured semiparametric or nonparametric marginal mean model for the response variable, the nonignorability parameter in this approach can be made unidentifiable. Another benefit of this approach is that the Generalized Estimating Equation (GEE) avoids numerical integrals encountered in likelihood-based sensitivity analysis by inverse weighting for missingness. There have been debates about marginal models versus conditional (multilevel) models and prior work demonstrating benefits of conditional models for longitudinal data.24,25 Our modeling approach is motivated by the fact that multilevel models are well suited for and have been a predominant method for ILD.2 Such multilevel models are preferred because they provide explicit modeling and estimation of heterogeneity among individuals in ILD analysis. Furthermore, multilevel models yield valid and efficient results under MAR without modeling why data are missing, whereas valid GEE analyses require modeling the MDM even under MAR, and efficient GEE MAR estimators can be difficult to find in ILD, which typically have irregular data collection occasions and complicated nonmonotonic missing-data patterns and reasons. It is fair to say that multilevel models remain a major method of analyzing longitudinal data and more so of ILD. It is therefore of great interest to directly gauge sensitivity of multilevel model parameter estimates to MAR assumption.
Our work builds on past work on local sensitivity analysis and extends it for use with ILD. We extend the index of local sensitivity to nonignorability (ISNI) method first proposed by Troxel et al.26 and extend the more-recent nonlinear ISNI (NISNI) method developed by Gao et al.12 to ILD. ISNI utilizes the idea of local sensitivity15,17,21 to assess change in inference in the neighborhood of MAR model and has been extended to a range of statistical models and data types.27–33 Although an R package isni is available for common distributions and models,34 such methods are restricted to settings of missing outcome and monotonic impacts of nonignorable missingness, which limit their use in ILD. Gao et al.12 relaxed these assumptions and developed the NISNI method for a cross-sectional subset (i.e., the first prompt of each subject) of EMA data. Herein, we extend this approach to full ILD.
In Section 2, we develop NISNI for ILD under the selection model framework. In Section 3, we discuss the method of calibrating NISNI. We apply the method to simulated data in Section 4 and to an EMA dataset in Section 5. We conclude with discussion in Section 6.
2. NISNI method for intensive longitudinal data
In EMA, data are collected by sending prompts to participants’ electronic devices, so failing to respond to any prompt will lead to missingness in data for all the questions in that particular diary entry. In such circumstances, outcome variable and covariates for that occasion will be missing simultaneously, which is common in ILD. In the following sections, we develop a joint model that permits a nonignorable missing-data process in EMA data or other types of ILD alike.
2.1. Selection model for incomplete outcome and covariates
2.1.1. Multilevel model for notional complete outcome
Denote the notional complete data vector for a continuous outcome for the ith subject as Yi, where i = 1, 2, …, N, and Yij is the jth element in vector Yi, j = 1, 2, …, ni. We consider a linear mixed effects model (LMM), an instance of a multilevel model:
$$Y_i = W_i\theta_w + X_i\theta_x + Z_{yi}B_i + e_{yi} \qquad (1)$$
where Wi is an ni × Pw design matrix for Pw covariates that are fully observed for all the subjects, and Pw × 1 vector θw contains all the fixed effect parameters for Wi; Xi is an ni × Px matrix for Px covariates that are not fully observed, and θx contains all the fixed effect parameters for Xi; and Zyi is a subset of (Wi, Xi) and is the design matrix for Qy × 1 random effects Bi, where $B_i \sim N(0, V_y)$, and Vy is the var-cov matrix for Bi. Assume the residual $e_{yi} \sim N(0, \sigma_{ey}^2 I_{n_i})$. The var-cov matrix for Yi after integration over the random effects is $\Sigma_{Y_i} = Z_{yi} V_y Z_{yi}^T + \sigma_{ey}^2 I_{n_i}$, where vector θ2 contains all the unique parameters in Vy and σey. Let θ1 = (θw, θx), and the entire design matrix for the mean of Yi is (Wi, Xi). Denote θ = (θ1, θ2). Then marginal distribution for outcome Yi is given by
$$Y_i \mid W_i, X_i \sim N\left(W_i\theta_w + X_i\theta_x,\ \Sigma_{Y_i}\right).$$
2.1.2. Product conditional multilevel model for notional complete covariates
Let denote Px time-varying covariates collected simultaneously with outcome Y at each occasion. We model joint conditional distribution fψ(Xi|Wi) using a product conditional model:
where is null when p = 1, and when p > 1. Unlike the product conditional model for cross-sectional data,12,35 the notional complete data for the pth covariate are , which is a vector of repeated measures from subject i. We thus model using a multilevel model as follows:
For a specific p, . Similar to the outcome model, Zpi is a subset of and is the design matrix for Qp × 1 random effects bpi, where , and Vp is the var-cov matrix for bpi. Assume the residual . Let , and ψp2 is a vector that contains all the unique parameters in Vp and σep. After integration over bpi,
2.1.3. Markov multinomial missing-data model (M2-MDM)
In EMA, participants may skip some assessments (causing intermittent missingness), or participants may quit during the study (causing dropouts). Furthermore, the prompt response behavior and outcome variable both could be affected by common idiosyncratic (i.e., time-varying) factors which, if unobserved or excluded from the analysis, could lead to dependence of missingness probability on unobserved outcome values. Owing to such considerations, we use a Markov multinomial model to capture arbitrary and nonmonotonic missing-data patterns and nonignorable missingness, which extends the simple binary missing-data model of Gao et al.12 to ILD. For each subject i, there is a corresponding ni × 1 missing status vector Gi with its jth component as:
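$$G_{ij} = \begin{cases} O, & \text{if data at occasion } j \text{ are observed,} \\ I, & \text{if data at occasion } j \text{ are intermittently missing,} \\ D, & \text{if subject } i \text{ has dropped out by occasion } j, \end{cases}$$
and we can write the joint distribution for $G_i = (G_{i1}, \ldots, G_{in_i})$ as a product of univariate transitional probabilities: $P(G_i) = \prod_{j=1}^{n_i} P(G_{ij} \mid G_{i1}, \ldots, G_{i,j-1})$, where conditioning on the outcome and covariates is suppressed in the notation. We then can model univariate conditional distributions separately. ILD typically contains numerous unique missingness patterns. Therefore, conditioning on the full missingness-status history (Gi1, ⋯ , Gi,j−1) can cause data sparsity within each missing-data pattern, which may lead to unstable or even non-converging estimation. One approach to solving this problem is to adopt a finite-order Markov model. Although our method can easily generalize to a higher-order Markov model, we consider the following first-order Markov model for expositional simplicity:
$$P(G_{ij} \mid G_{i1}, \ldots, G_{i,j-1}) = P(G_{ij} \mid G_{i,j-1}).$$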
Given missingness status at the previous occasion Gi,j−1, Gij becomes independent of all other past missing status variables. We model missingness status transition probability with the following multinomial logit model:
| (2) |
where u is missingness status at the current occasion with u = O as the reference level, and υ = gi,j−1 ∈ (O, I, D) is missingness status at the previous occasion. Sij contains all the observed missingness predictors, including variables in fully observed Wij as well as the last observed components in outcome Yi,j−1 and in covariates Xi,j−1. Under this first-order M2-MDM, missing probability is a constant for certain occasions; specifically, , (definition of dropout); (definition of intermittent missing).
Nonignorable parameter vector γ1 = (γ1y, γ1x) captures dependence of missingness probability on potentially unobserved values of contemporaneous variables at the current occasion; MDM reduces to MAR when γ1 = 0. The model is a parsimonious nonignorable selection model that approximates a more-complex model wherein, given the outcome and covariates at the current occasion, missingness depends additionally on past and future unobserved outcomes and time-varying covariates: we can integrate out these past and future unobserved outcomes and covariates so that the resulting selection model depends only on values at the current visit.36 Although our method is general and can be extended to the more-general M2-MDM, we prefer the more-parsimonious specification, which reduces the number of parameters for nonignorable non-responses and permits easily interpretable sensitivity analysis, which is desirable.37 In addition, we can permit γ1 to depend on u and υ, which increases the number of nonignorability parameters in sensitivity analysis. It seems reasonable in practice that the direction of nonignorability is roughly the same for different values of u and υ. Thus, we can fix these sensitivity parameters at the largest magnitude of γ1 across u and υ for a parsimonious and conservative sensitivity analysis.
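To make the transition structure concrete, the following Python sketch computes first-order transition probabilities under a multinomial logit with O as the reference level. It is an illustration under stated assumptions and not the exact form of Eq.(2): the linear predictor γ0·Sij + γ1y·Yij, the dictionary `gamma0` keyed by (u, υ), the function name `transition_probs`, and all coefficient values are hypothetical, and only Y-dependent nonignorability is shown.

```python
import numpy as np

def transition_probs(prev, s_ij, y_ij, gamma0, gamma1y=0.0):
    """Return P(G_ij = u | G_{i,j-1} = prev) for u in O, I, D.

    Assumed (hypothetical) linear predictor for u in {I, D}, O as reference:
        eta_u = gamma0[(u, prev)] . s_ij + gamma1y * y_ij
    """
    if prev == "D":
        # Dropout is absorbing: once dropped out, always dropped out.
        return {"O": 0.0, "I": 0.0, "D": 1.0}
    # An intermittently missing occasion is, by definition, followed by a
    # later observation, so a direct I -> D move is excluded in this sketch.
    allowed = ("I",) if prev == "I" else ("I", "D")
    eta = {u: float(np.dot(gamma0[(u, prev)], s_ij)) + gamma1y * y_ij
           for u in allowed}
    denom = 1.0 + sum(np.exp(v) for v in eta.values())
    probs = {"O": float(1.0 / denom), "I": 0.0, "D": 0.0}
    for u, v in eta.items():
        probs[u] = float(np.exp(v) / denom)
    return probs

# Illustrative coefficients only (not estimates from the paper).
gamma0 = {
    ("I", "O"): np.array([-1.0, 0.2]),
    ("D", "O"): np.array([-3.0, 0.1]),
    ("I", "I"): np.array([0.5, 0.0]),
}
s = np.array([1.0, 0.5])  # intercept plus one observed missingness predictor

mar = transition_probs("O", s, y_ij=2.0, gamma0=gamma0, gamma1y=0.0)
mnar = transition_probs("O", s, y_ij=2.0, gamma0=gamma0, gamma1y=-0.5)
```

With `gamma1y=0.0` the probabilities do not change with `y_ij`, which is the MAR special case; a nonzero `gamma1y` makes them depend on the possibly unobserved outcome.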
2.1.4. Selection model likelihood
We partition Yi and Xi into and for observed and missing components, respectively. Our method can be extended to partially observed Zyi and Zpi with slightly more complex closed-form expressions for certain terms in our sensitivity index formulas. For expositional simplicity, we focus on Zyi and Zpi containing only fully observed variables. The log-likelihood for model parameters based on observed data with N independent subjects is:
where ni is the number of planned data collection occasions for subject i, is the number of occasions that have data observed, and is the sample space for a random variable.
2.2. Review of ISNI method
As described in the Introduction, it is well recognized in the literature that non-identifiability arises when attempting to use the aforementioned nonignorable selection model likelihood to jointly estimate all the model parameters because data provide little information about the nonignorability parameter, γ1. A more judicious approach is to consider this nonignorable selection model as tentative and perform sensitivity analysis over a range of values for γ1. For a given γ1, other model parameters Θ = (θ, ψ, γ0) are estimable by maximizing likelihood over Θ. To investigate impacts of different assumptions on the degree of nonignorable missingness, as captured by γ1, on the other parameters Θ, we can obtain profile maximum likelihood estimates (MLEs) of the parameters, given γ1, denoted as $\hat{\Theta}(\gamma_1)$. If $\hat{\Theta}(\gamma_1)$ changes little as γ1 varies around 0, sensitivity analysis concludes that MAR inference is robust and trustworthy. Such profile MLEs can be obtained by maximizing likelihood for a set of fixed values of γ1 by iterative methods such as Newton-Raphson or expectation-maximization methods. However, as we shall show, this is computationally infeasible for ILD. The ISNI method overcomes this challenge by eliminating the need to perform such brute-force model estimation and requiring only estimates readily available under MAR.
The foregoing likelihood expression indicates that unless γ1 = 0, a multi-dimension integral is included for each subject, and its dimension increases with increasing number of non-responses and variables with missing data. Take the EMA baseline wave data as an example. Figure 1 shows the missing-data pattern for each subject. The maximum number of prompts is 128. Green indicates answered prompts, red indicates unanswered prompts, and white indicates there was no planned prompt. Each subject has a unique missing-data pattern, and only 1 out of 461 subjects has fully observed data. On average, each subject has 42 planned random prompts, and the percentage of non-responded prompts is 27% but is as high as 91%. To perform sensitivity analysis for a model including two covariates missing simultaneously with the outcome, the average number of dimensions required for integration per subject will be approximately 34 (42 × 0.27 × 3). Certain subjects have more than 100 prompts, among which 70% are missing. Integration dimension will be higher than 210 for each of these subjects. It will be computationally infeasible to perform sensitivity analysis by direct likelihood optimization. Using Taylor-series expansion, we propose an extended ISNI method that avoids evaluating integrals and fitting the nonignorable selection models, thereby providing a feasible method of examining the parameter sensitivity to nonignorability in ILD.
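The dimension arithmetic in this paragraph can be reproduced with a few lines of Python. The helper names are hypothetical, and the second function assumes a product-rule quadrature with five nodes per dimension purely to illustrate how quickly the cost grows; it is not the integration scheme of any particular software.

```python
def integration_dimension(n_prompts, missing_rate, n_vars_missing_together):
    """Expected number of unobserved values to integrate over for one subject."""
    return n_prompts * missing_rate * n_vars_missing_together

# Average subject: 42 prompts, 27% non-response, outcome plus two covariates.
avg_dim = integration_dimension(42, 0.27, 3)

# Heavily missing subject: 100 prompts, 70% non-response.
heavy_dim = integration_dimension(100, 0.70, 3)

def quadrature_evaluations(dim, nodes_per_dim=5):
    """Integrand evaluations needed by a product-rule grid (illustrative assumption)."""
    return nodes_per_dim ** round(dim)

avg_cost = quadrature_evaluations(avg_dim)      # 5 ** 34 evaluations per likelihood call
heavy_cost = quadrature_evaluations(heavy_dim)  # 5 ** 210 evaluations per likelihood call
```

Even the average subject would require more than 10^23 integrand evaluations for a single likelihood evaluation under this illustrative grid, which is why a method that avoids the integrals altogether is attractive.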
Figure 1:

Missing-data patterns.
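The ISNI method simplifies sensitivity analysis by approximating $\hat{\Theta}(\gamma_1)$ using a first-order Taylor series expansion around γ1 = 0. For the kth element of Θ:
$$\hat{\Theta}_k(\gamma_1) \approx \hat{\Theta}_k(0) + \left.\frac{\partial \hat{\Theta}_k(\gamma_1)}{\partial \gamma_1}\right|_{\gamma_1=0}\gamma_1.$$
Troxel et al.26 define $\left.\partial\hat{\Theta}(\gamma_1)/\partial\gamma_1\right|_{\gamma_1=0}$ as ISNI with the following expression:
$$\mathrm{ISNI} = -\left(\nabla^2 L_{\Theta\Theta}\right)^{-1}\nabla^2 L_{\Theta\gamma_1},$$
where $\nabla^2 L_{\Theta\Theta} = \partial^2 L/\partial\Theta\,\partial\Theta^T$ and $\nabla^2 L_{\Theta\gamma_1} = \partial^2 L/\partial\Theta\,\partial\gamma_1$, both evaluated at $(\hat{\Theta}(0), 0)$, and L is the log-likelihood of a joint selection model. First-order Taylor series expansion and linear ISNI have been shown to adequately capture local sensitivity when only outcome is subject to missingness.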
Gao et al.12 have shown that for cross-sectional data, linear ISNI (ISNIL) may not be sufficient for approximating $\hat{\Theta}(\gamma_1)$ accurately when outcome and covariates are missing concurrently. They extended the method by expanding the Taylor series to second order, thereby developing the quadratic index of sensitivity to nonignorability (ISNIQ). ISNIL and ISNIQ are together termed nonlinear indexes of sensitivity to nonignorability (NISNI). In the following section, we extend NISNI to overcome curse of dimensionality in selection model likelihood for ILD.
2.3. NISNI development for Y-dependent nonignorability
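Similar to cross-sectional data, for ILD the profile MLE $\hat{\Theta}(\gamma_1)$ can have a U-shape in the neighborhood of MAR when the outcome and covariates are missing simultaneously. To accurately capture the shape, we apply the following second-order Taylor-series expansion:
$$\hat{\Theta}_k(\gamma_1) \approx \hat{\Theta}_k(0) + \left.\frac{\partial \hat{\Theta}_k(\gamma_1)}{\partial \gamma_1}\right|_{\gamma_1=0}\gamma_1 + \frac{1}{2}\left.\frac{\partial^2 \hat{\Theta}_k(\gamma_1)}{\partial \gamma_1^2}\right|_{\gamma_1=0}\gamma_1^2.$$
Define $\mathrm{ISNIL} = \left.\partial\hat{\Theta}(\gamma_1)/\partial\gamma_1\right|_{\gamma_1=0}$ and $\mathrm{ISNIQ} = \left.\partial^2\hat{\Theta}(\gamma_1)/\partial\gamma_1^2\right|_{\gamma_1=0}$. As in Supplemental Materials Appendix A: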
| (3) |
| (4) |
where and . To help understand the formulas, consider Y-dependent nonignorability. For any subject i, let , and denote as , as , as Dpi,o, and as Dpi,m. Denote conditional expectations as E(Yi)m|o and as E(Xpi)m|o. To obtain NISNI, we must derive expressions for the second and third derivatives of the log likelihood w.r.t. Θ evaluated at γ1 = 0. From the models described in section 2.1, the main required conditional distributions are:
where and . Therefore, each derivative evaluated at γ1 = 0 has an explicit closed-form expression without evaluating high-dimensional integrals. Derivation result details are shown in Supplemental Materials Appendix B.
2.3.1. Simple example to illustrate NISNI
Consider a case where each subject i is prompted twice. The outcome model contains the intercept and one covariate X that is concurrently missing with Y, and the model for X only contains the intercept. Both models only include random intercepts:
Because there are only two occasions, only dropout can happen, and M2-MDM reduces to a binary logit model with . Let noo denote the number of subjects with both occasions observed, and nom denote the number of subjects who drop out at the second occasion. represents observed X in subject i, and pm denotes the missing proportion. , , and represent MAR estimators. When σvy = συx = 0, expressions for terms used in NISNI calculations are described in Supplemental Materials Appendix C. NISNI terms for are:
while is not zero, indicating the need to use NISNI to capture parameter sensitivity to nonignorability. Furthermore, increases with increasing pm, and the absolute values of both and will increase with increasing pm when pm < 0.5, holding all other quantities constant. The NISNI expression will be more complicated when considering correlations and when and ΣXi have more complex structures.
2.4. NISNI development for Y- and X-dependent nonignorability
When considering Y- and X- dependent nonignorability, the M2-MDM takes the general expression, as given by Eq.(2). and ISNIL now become:
becomes a matrix wherein each element represents a second derivative of w.r.t. γ1. Denote , and as , , and , respectively, where and are the pth and qth elements in vector . To compute ISNIQ, we must additionally derive closed-form expressions for terms related to γ1x. From derivation details shown in Supplemental Materials Appendix D:
has the same expression as Eq.(4)
3. NISNI calibration
3.1. Y-dependent nonignorability
To facilitate interpretation, a scale-free NISNI calibration is needed. Let $|\gamma_1|_{\min}$ be the minimum absolute value of γ1 such that $|\hat{\Theta}_k(\gamma_1) - \hat{\Theta}_k(0)| = \mathrm{SE}_k$, i.e., the minimum degree of nonignorability needed to change an MLE by one standard error. That is, $|\gamma_1|_{\min}$ is the smallest $|\gamma_1|$ satisfying $\left|\mathrm{ISNIL}\,\gamma_1 + \tfrac{1}{2}\,\mathrm{ISNIQ}\,\gamma_1^2\right| = \mathrm{SE}_k$. For a continuous outcome, the meaning of $|\gamma_1|_{\min}$ depends on the unit of Y. For scale-free interpretation, we define $\mathrm{MinNI} = \sigma_Y\,|\gamma_1|_{\min}$, where σY is the standard deviation (SD) of the outcome. With this definition, MinNI = 1 means that the minimum magnitude of nonignorability needed to change an estimate by one standard error is such that a one-SD change in the outcome is associated with an $e^1 \approx 2.7$-fold change in the odds of missingness. In the special case of using ISNIL alone to measure sensitivity, MinNI reduces to the simpler form $\mathrm{MinNIL} = \sigma_Y\,\mathrm{SE}_k/|\mathrm{ISNIL}|$. A small MinNI value suggests large sensitivity to nonignorability because modest nonignorability can change an estimate significantly, whereas a large MinNI value means that the MAR result remains robust to all but extreme nonignorability. We use 1 as a cutoff value for important sensitivity.26
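The calibration can be sketched numerically as follows. The closed forms used here are assumptions inferred from the verbal definition above (the smallest |γ1| that moves the estimate by one standard error, rescaled by the outcome SD), with the second-order term carrying a factor of 1/2; the function names are hypothetical. The linear version approximately reproduces the MinNIL reported for the Intercept row of Table 2.

```python
import math

def min_ni_linear(se, isnil, sd_y):
    """Scale-free MinNI using ISNIL only: sd_y * se / |ISNIL|."""
    return sd_y * se / abs(isnil)

def min_ni_quadratic(se, isnil, isniq, sd_y):
    """Scale-free MinNI from |ISNIL*g + 0.5*ISNIQ*g**2| = se (assumed form)."""
    if isniq == 0:
        return min_ni_linear(se, isnil, sd_y)
    a = 0.5 * isniq
    candidates = []
    # The estimate may move up (+se) or down (-se); solve both quadratics.
    for target in (se, -se):
        disc = isnil ** 2 + 4.0 * a * target
        if disc >= 0:
            root = math.sqrt(disc)
            candidates.append((-isnil + root) / (2.0 * a))
            candidates.append((-isnil - root) / (2.0 * a))
    return sd_y * min(abs(g) for g in candidates)

# Intercept row of Table 2 (SE = 0.130, ISNIL = 0.816) with outcome SD 1.94.
intercept_min_nil = min_ni_linear(se=0.130, isnil=0.816, sd_y=1.94)

# Quadratic version with illustrative inputs taken from the same row.
intercept_min_niq = min_ni_quadratic(se=0.130, isnil=0.816, isniq=0.466, sd_y=1.94)
```

Taking the minimum over both directions of change is a design choice of this sketch; it yields the most conservative (smallest) calibrated value.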
3.2. Y- and X-dependent nonignorability
Building upon the approach of calibrating the scalar γ1 for Y-dependent nonignorability, we now consider index calibration for a vector of nonignorability parameter γ1 = (γ1y, γ1x) with Y- and X-dependent nonignorability. Following Gao et al.,12 let represent a standardized cumulative magnitude of nonignorability over all the variables with missing values, and . Thus, to find the range of estimates for parameter θk given a magnitude of overall nonignorability, , we simply find the minimum and maximum values of subject to the constraint .
Now MinNIQ is defined as the minimum to have . For a given θk, this corresponds to finding the minimum subject to the constraint . We use the Lagrange multipliers method to perform the mathematical optimization.
4. Simulation study to illustrate NISNI
In the simulation study, we set ni = 3, i = 1, 2, …, 100, and for each subject:
W is the intercept and θw = 1, θx = 4, and ψw = 0. and are both compound symmetric with , and , . We only consider dropout in M2-MDM for computational feasibility with . Figure 2 shows the analysis performed using one simulated dataset with a missing proportion of 12.3%. The solid line is the exact sensitivity curve; that is, the profile MLE obtained by maximizing the log-likelihood of the joint selection model for a range of specified γ1 values around 0. The dashed line is the sensitivity curve approximated using the equation
$$\hat{\theta}_x(\gamma_1) \approx \hat{\theta}_x(0) + \mathrm{ISNIL}\,\gamma_1 + \tfrac{1}{2}\,\mathrm{ISNIQ}\,\gamma_1^2,$$
where $\hat{\theta}_x(0)$ is the MAR estimate 3.940, ISNIL = 0.0088, and ISNIQ = 1.8903, all of which were computed without estimating any nonignorable selection models. The dashed line is very close to the solid one, indicating that NISNI captures the exact U-shaped sensitivity curve well without requiring computationally intensive estimation of nonignorable models.
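As a small illustration, the approximate sensitivity curve can be evaluated directly from the three reported numbers. The sketch assumes the second-order Taylor form with a factor of 1/2 on the quadratic term; the function name `nisni_curve` is hypothetical.

```python
def nisni_curve(gamma1, theta_mar, isnil, isniq):
    """Second-order approximation to the profile MLE near the MAR model."""
    return theta_mar + isnil * gamma1 + 0.5 * isniq * gamma1 ** 2

theta_mar, isnil, isniq = 3.940, 0.0088, 1.8903  # values reported in the text

at_mar = nisni_curve(0.0, theta_mar, isnil, isniq)
at_plus_one = nisni_curve(1.0, theta_mar, isnil, isniq)
at_minus_one = nisni_curve(-1.0, theta_mar, isnil, isniq)

# Location of the turning point of the approximating parabola.
turning_point = -isnil / isniq
```

Because ISNIL is close to zero and ISNIQ is positive, the approximated estimate rises on both sides of γ1 = 0, which is the U-shape that a linear index alone would miss.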
Figure 2:

Application of NISNI to simulation data showing 12.33% missingness. Parameter setting in M2-MDM is γ00 = −5, γ01 = −0.1, and true γ1 = −1. Two red vertical dashed lines represent correspondingly.
We then varied the missing proportion by changing γ00 and γ1 while fixing γ01 = −0.5. Table 1 summarizes the results obtained from 12 datasets. $\partial\hat{\theta}_x/\partial\gamma_1$ and $\partial^2\hat{\theta}_x/\partial\gamma_1^2$ are the first and second derivatives, respectively, at the MAR model calculated directly from the exact sensitivity curve. ISNIL and ISNIQ calculated using Eq.(3) and (4), respectively, are close to the true values, showing that NISNI indeed accurately calculates first and second derivatives without fitting any nonignorable models. Computation time is reduced from hours to mere seconds. The increase in computational efficiency is higher for datasets with larger missing proportions. Over all the datasets, ISNIL values are all very small while MinNIL values are large, indicating negligible linear sensitivity of $\hat{\theta}_x$. In contrast, ISNIQ values are relatively large with MinNIQ approaching the cutoff value of 1. This is expected because when X and Y are missing simultaneously, $\hat{\theta}_x(\gamma_1)$ can be U-shaped. Hence, ISNIL may not adequately capture sensitivity around the MAR model and it is necessary to examine NISNI. Furthermore, ISNIL and ISNIQ values are not related to γ1 values. This is expected because sensitivity analysis should not inform nonignorability magnitude.
Table 1:
NISNI results for simulation data, γ01 = −0.5.
| γ00 | γ1 | Prop_M | MAR estimate of θx (S.E.) | ∂θ̂x/∂γ1 (exact) | ∂²θ̂x/∂γ1² (exact) | ISNIL | MinNIL | ISNIQ | MinNIQ |
|---|---|---|---|---|---|---|---|---|---|
| −1 | 1 | 42.0% | 3.96 (0.14) | −0.025 | 0.986 | −0.021 | 28.47 | 0.978 | 2.18 |
| −1 | 0.5 | 33.0% | 3.95 (0.13) | 0.005 | 1.153 | 0.006 | 105.55 | 1.168 | 2.10 |
| −1 | −1 | 23.7% | 3.74 (0.14) | −0.038 | 2.252 | −0.041 | 13.74 | 2.190 | 1.35 |
| −1 | −0.5 | 19.0% | 3.94 (0.14) | −0.042 | 1.635 | −0.045 | 12.95 | 1.637 | 1.64 |
| −2 | 1 | 39.0% | 3.96 (0.13) | 0.003 | 1.291 | −0.001 | 707.45 | 1.289 | 1.95 |
| −2 | 0.5 | 26.0% | 3.99 (0.13) | −0.009 | 1.424 | −0.006 | 105.69 | 1.436 | 1.89 |
| −2 | −1 | 10.7% | 3.80 (0.14) | −0.009 | 2.300 | −0.009 | 63.06 | 2.357 | 1.36 |
| −2 | −0.5 | 16.0% | 3.95 (0.13) | −0.021 | 1.245 | −0.022 | 25.20 | 1.255 | 1.89 |
| −5 | 1 | 20.0% | 3.89 (0.13) | −0.001 | 2.035 | −0.00004 | 13128.22 | 2.027 | 1.48 |
| −5 | 0.5 | 9.0% | 4.01 (0.12) | −0.026 | 0.767 | −0.026 | 20.37 | 0.806 | 2.30 |
| −5 | −1 | 7.0% | 3.87 (0.12) | −0.004 | 1.194 | −0.006 | 80.03 | 1.225 | 1.84 |
| −5 | −0.5 | 3.7% | 3.99 (0.11) | 0.006 | 0.633 | 0.006 | 87.42 | 0.648 | 2.65 |
5. Application to EMA data
5.1. Baseline wave analysis
We applied our method to an EMA dataset obtained from a longitudinal study on adolescent smoking history (PO1CA098262, PI R. Mermelstein).4,5,12 Data for 461 adolescents (55.1% female; 56.8% white; 52.7% grade 10; mean age 15.7 years) were included in baseline wave analysis. Subjects were all self-reported ever-smokers in grades 9 and 10 at the baseline. During a 7-day period, they provided responses to random prompts and three types of smoking-related event interviews (smoking; decided not to smoke; wanted to smoke but could not). The random prompts were sent to subjects’ hand-held computers several times every day, and each prompt recorded the corresponding date-time information. Random interviews asked questions about mood, activity, location, companionship, presence of other smokers, substance use, and other behaviors. Adolescents were trained to provide a smoking report when they smoked and to complete prompt questions just after smoking. Smoking reports included the same questions asked in the random prompts plus additional smoking-related items. Given the enormous volume of data collected, we are interested in exploring potential factors related to changes in adolescents’ positive moods.
It is reasonable to suspect that some non-responses are related to participant mood. For example, adolescents tend not to respond when their moods are low, so conventional analysis methods assuming MAR could yield biased results. Therefore, we applied NISNI to evaluate potential impact of such nonrandom missingness on MAR estimates.
5.1.1. Model
Outcome variable posaff (marginal mean 6.80 with SD 1.94) refers to positive affect calculated by taking the average of the following positive mood assessment items: Happy, Relaxed, Cheerful, Confident, and Accepted, each rated from 1 to 10. We consider the following multi-level ideal outcome model for posaff in the form of Eq.(1) with
where Hour is time in hours after midnight. Following Hedeker et al.,4 we included several potential predictors, random intercept and slope for Hour in the LMM. AloneBS indicates the proportion of random prompts wherein a subject was alone. An adolescent was defined as Smoker=1 if he or she had at least one smoke interview record. Male is a demographic binary variable where 1 indicates male, and Grade10 is a demographic binary variable where 1 indicates 10th grade. Weekday is a nominal variable indicating day of the week. Sociso measures social isolation and is missing simultaneously with the outcome variable. It is calculated by taking the average of social isolation assessment items: Lonely, Left out, and Ignored, each rated from 1 to 10. We may include sociso in the model for several reasons. It could be a confounding or mediating variable for the relationship between posaff and Smoker such that the analyst would like to include it in the outcome model. Sociso might also be related to missingness, and including it in the analysis can render MAR assumption more plausible. We posit the following ideal covariate model for sociso: . We assume the following two types of M2-MDMs for sensitivity analysis (u = I, D, υ = gi,j−1).
(1) M2-MDM for Y-dependent nonignorability:
or (2) M2-MDM for Y- and X-dependent nonignorability:
5.1.2. Results
We first conduct MAR analysis and then calculate NISNI for each parameter, using formulas derived in Section 2. A correction interval is calculated for each parameter, representing the range of the corresponding profile MLE when nonignorability is moderate (i.e. when for M2-MDM (1) or for M2-MDM (2)). Results are shown in Tables 2 and 3 for M2-MDM (1) and M2-MDM (2), respectively.
Table 2:
Y-dependent nonignorability NISNI results for EMA baseline data, N = 461, , .
| Parameter | MAR Est | SE | ISNIL | ISNIQ | Linear Correction Interval | MinNIL | Nonlinear Correction Interval | MinNIQ |
|---|---|---|---|---|---|---|---|---|
| Intercept | 7.712*** | 0.130 | 0.816 | 0.466 | [7.291, 8.134] | 0.31 | [7.353, 8.196] | 0.32 |
| Hour | 0.019*** | 0.003 | −0.012 | 0.004 | [0.013, 0.025] | 0.54 | [0.013, 0.026] | 0.57 |
| smoker | −0.032 | 0.095 | 0.099 | 0.043 | [−0.083, 0.019] | 1.86 | [−0.078, 0.025] | 2.64 |
| sociso | −0.418*** | 0.007 | −0.0004 | −0.189 | [−0.418, −0.418] | 38.54 | [−0.443, −0.418] | 0.54 |
| AloneBS | −0.900** | 0.322 | −1.284 | 0.252 | [−1.563,−0.236] | 0.49 | [−1.530, −0.202] | 0.50 |
| Male | 0.199* | 0.098 | 0.230 | −0.144 | [0.081, 0.318] | 0.82 | [0.061, 0.299] | 0.73 |
| Grade10 | 0.031 | 0.096 | 0.091 | −0.002 | [−0.016, 0.078] | 2.03 | [−0.017, 0.078] | 2.00 |
| Tuesday | −0.051 | 0.043 | 0.045 | 0.032 | [−0.075,−0.028] | 1.85 | [−0.070, −0.023] | 1.46 |
| Wednesday | −0.103* | 0.043 | 0.037 | 0.013 | [−0.122, −0.084] | 2.28 | [−0.121, −0.082] | 3.24 |
| Thursday | −0.060 | 0.044 | −0.006 | 0.009 | [−0.063,−0.057] | 14.99 | [−0.062, −0.056] | 4.85 |
| Friday | 0.053 | 0.043 | 0.084 | −0.041 | [0.010, 0.097] | 1.00 | [0.004, 0.091] | 0.90 |
| Saturday | 0.203*** | 0.044 | 0.193 | −0.036 | [0.104, 0.303] | 0.44 | [0.099, 0.298] | 0.43 |
| Sunday | 0.120** | 0.044 | 0.031 | −0.050 | [0.104, 0.136] | 2.74 | [0.097, 0.129] | 1.63 |
*** p < .001; ** p < .01; * p < .05.
Table 3:
Y- and X-dependent nonignorability NISNI results for EMA baseline data, N = 461, , .
| Parameter | MAR Est | SE | ISNIL | ISNIQ (ISNIQyy, ISNIQxy, ISNIQxx) | Linear Correction Interval | MinNIL | Nonlinear Correction Interval | MinNIQ |
|---|---|---|---|---|---|---|---|---|
| Intercept | 7.712*** | 0.130 | (0.816,0) | (0.466,−0.550,0) | [7.291, 8.134] | 0.31 | [7.345, 8.201] | 0.29 |
| Hour | 0.019*** | 0.003 | (−0.012,0) | (0.004,−0.005,0) | [0.013, 0.025] | 0.54 | [0.013, 0.026] | 0.52 |
| smoker | −0.032 | 0.095 | (0.099,0) | (0.043,−0.057,0) | [−0.083, 0.019] | 1.86 | [−0.078, 0.025] | 1.51 |
| sociso | −0.418*** | 0.007 | (−0.0004,0) | (−0.189,0.219,0) | [−0.418, −0.418] | 38.54 | [−0.449, −0.412] | 0.41 |
| AloneBS | −0.900** | 0.322 | (−1.284,0) | (0.252,−0.038,0) | [−1.563,−0.236] | 0.49 | [−1.530, −0.202] | 0.47 |
| Male | 0.199* | 0.098 | (0.230,0) | (−0.144,0.121,0) | [0.081, 0.318] | 0.82 | [0.061, 0.300] | 0.73 |
| Grade10 | 0.031 | 0.096 | (0.091,0) | (−0.002,0.037,0) | [−0.016, 0.078] | 2.03 | [−0.017, 0.078] | 1.88 |
| Tuesday | −0.051 | 0.043 | (0.045,0) | (0.032,0.002,0) | [−0.075,−0.028] | 1.85 | [−0.070, −0.023] | 1.46 |
| Wednesday | −0.103* | 0.043 | (0.037,0) | (0.013,0.006,0) | [−0.122, −0.084] | 2.28 | [−0.121, −0.082] | 1.92 |
| Thursday | −0.060 | 0.044 | (−0.006,0) | (0.009,−0.002,0) | [−0.063,−0.057] | 14.99 | [−0.062, −0.056] | 4.33 |
| Friday | 0.053 | 0.043 | (0.084,0) | (−0.041,−0.001,0) | [0.010, 0.097] | 1.00 | [0.004, 0.091] | 0.90 |
| Saturday | 0.203*** | 0.044 | (0.193,0) | (−0.036,−0.019,0) | [0.104, 0.303] | 0.44 | [0.099, 0.298] | 0.43 |
| Sunday | 0.120** | 0.044 | (0.031,0) | (−0.050,−0.032,0) | [0.104, 0.136] | 2.74 | [0.097, 0.130] | 1.55 |
*** p < .001; ** p < .01; * p < .05.
MAR estimates show that the effects of Intercept, Hour, sociso, AloneBS, Male, Wednesday, Saturday, and Sunday are significant, indicating that the adolescents feel more positive at night, on weekends, when they are less isolated, when they are alone less often, or when they are male. Intercept, Hour, AloneBS, Male, and Saturday have MinNI statistics less than 1 under both linear and nonlinear sensitivity analyses, suggesting that the potential impact of nonignorability is considerable and that these estimates can be sensitive to nonignorable missingness. In particular, for AloneBS and Male, the p-value may exceed 0.05 when nonignorability is moderate, thereby changing the significance level. MinNIL and MinNIQ values are similar and lead to the same qualitative conclusions based on a cut-off value of 1 for important sensitivity, indicating that for fully observed covariates, ISNIL is sufficient to measure sensitivity to nonignorability. Because Intercept measures the conditional mean of positive mood at random prompts, it is understandable that nonignorable missingness has a noticeable impact on its estimate. The sign of ISNIL for Intercept indicates the direction in which the MAR estimate should be adjusted. For example, if the true γ1 > 0, the missingness probability increases with increasing posaff, and smaller posaff values are more likely to be observed. Therefore, the MAR estimate is biased downward, and a positive ISNI means that the estimate should be adjusted upward. MinNI statistics for sociso are 38.54 and 0.54 for linear and nonlinear sensitivity analyses, respectively, as listed in Table 2, demonstrating the importance of using NISNI to capture the U-shaped impact of nonignorable missingness on regression coefficients for a covariate that is missing concurrently with the outcome. Table 3 shows the same finding.
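The contrast between the linear and nonlinear corrections for sociso can be illustrated with a short numerical sketch. It rests on assumptions that are not stated in this section and should be checked against the formulas in Section 2: that the change in an estimate is approximated by ISNIL·γ + ½·ISNIQ·γ², that "moderate" nonignorability means |γ| ≤ 1/SD of posaff, and that MinNIL equals SD·SE/|ISNIL|. The function names are hypothetical; the inputs are the rounded entries of Table 2.

```python
SD_POSAFF = 1.94  # marginal SD of posaff reported in Section 5.1.1

# (MAR estimate, SE, ISNIL, ISNIQ), copied from Table 2.
TABLE2 = {
    "Intercept": (7.712, 0.130, 0.816, 0.466),
    "sociso": (-0.418, 0.007, -0.0004, -0.189),
}


def shift(isnil, isniq, gamma):
    """ASSUMED second-order approximation of the change in the estimate."""
    return isnil * gamma + 0.5 * isniq * gamma ** 2


def correction_interval(est, isnil, isniq, c):
    """Range of est + shift(gamma) over -c <= gamma <= c."""
    candidates = [-c, c]
    if isniq != 0:
        vertex = -isnil / isniq  # stationary point of the quadratic
        if -c <= vertex <= c:
            candidates.append(vertex)
    values = [est + shift(isnil, isniq, g) for g in candidates]
    return min(values), max(values)


def min_ni_linear(se, isnil, sd):
    """ASSUMED MinNIL: gamma (in units of 1/sd) that moves the estimate by one SE."""
    return sd * se / abs(isnil)


c = 1.0 / SD_POSAFF  # ASSUMED bound for "moderate" nonignorability
for name, (est, se, isnil, isniq) in TABLE2.items():
    linear = correction_interval(est, isnil, 0.0, c)      # drop the quadratic term
    nonlinear = correction_interval(est, isnil, isniq, c)
    print(name,
          [round(v, 3) for v in linear],
          [round(v, 3) for v in nonlinear])
```

Under these assumptions and with rounded inputs, the sketch comes within about 0.002 of the tabulated intervals. For sociso, the linear interval is essentially a single point because ISNIL is near zero, whereas the quadratic term moves the lower bound to about −0.443 in either direction of γ, which is the U-shaped pattern described above.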
When considering Y- and X-dependent nonignorable missingness (i.e., M2-MDM (2)), MinNIQ decreases and some correction intervals widen (Table 3 vs. Table 2). This is expected because Y-dependent nonignorability is a special case of Y- and X-dependent nonignorability. Thus, for the same total size of nonignorability, the correction interval obtained when considering only γ1y should be nested within that obtained when considering both γ1y and γ1x. However, no significant difference is detected between the sensitivity measures of the two M2-MDMs, indicating that the origin of nonignorable missingness may not significantly affect the maximum sensitivity to nonignorability for parameters in the outcome model.
5.2. Multi-wave analysis
5.2.1. Model
We also applied our method to analyze multiple waves of EMA data. Besides the baseline wave, each subject was followed up at 6, 15, and 24 months and possibly at 5 and 6 years, thereby generating at most 5 more waves of data. For follow-up years 5 and 6, only the subset of participants who had provided smoking data on their EMA prompts was recruited again for the EMA portion of the study, given the interest at that time in smoking contexts. Among the 461 subjects who had baseline wave measurements, 123 had at least one observation in wave 5 or wave 6 and are included in the multi-wave analysis. Because sociso was not collected in waves 5 and 6, tirbor, the tired/bored scale, is included in the model instead. Similarly, tirbor is calculated by taking the average of the tired and bored assessment items: Tired, Bored, and Trouble Concentrating, each rated from 1 to 10. Since the 123 subjects all eventually became smokers, smoker indicates a smoker in the baseline wave. AloneBS is calculated separately for each wave. The ideal outcome model and ideal covariate model are:
5.2.2. Results
Results are summarized in Table 4. For each wave, tirbor has a highly significant negative effect on posaff, showing that the adolescents feel less positive when they are tired or bored or have trouble concentrating. Similar to the baseline wave analysis, ISNIL values for tirbor are very close to zero, while ISNIQ values are non-zero and MinNIQ values are smaller than 1, which again demonstrates the necessity of using NISNI to capture the U-shaped impact of nonignorable missingness for tirbor.
Table 4:
NISNI results for multi-wave EMA dataset, N = 123.
| Parameter | MAR Est | SE | ISNIL | ISNIQ | Linear Correction Interval | MinNIL | Nonlinear Correction Interval | MinNIQ |
|---|---|---|---|---|---|---|---|---|
| Wave 1 | 8.050*** | 0.220 | 0.723 | 0.765 | [7.666, 8.434] | 0.57 | [7.774, 8.542] | 0.72 |
| Wave 2 | 7.525*** | 0.220 | 0.753 | 0.613 | [7.125, 7.925] | 0.55 | [7.211, 8.011] | 0.64 |
| Wave 3 | 7.723*** | 0.221 | 0.898 | 0.677 | [7.246, 8.200] | 0.46 | [7.342, 8.295] | 0.52 |
| Wave 4 | 7.806*** | 0.220 | 0.774 | 0.561 | [7.395, 8.217] | 0.54 | [7.474, 8.296] | 0.61 |
| Wave 5 | 8.140*** | 0.219 | 0.455 | 0.365 | [7.899, 8.382] | 0.91 | [7.950, 8.433] | 1.23 |
| Wave 6 | 7.940*** | 0.220 | 0.504 | 0.400 | [7.673, 8.208] | 0.82 | [7.729, 8.265] | 1.06 |
| Hours | 0.008** | 0.002 | −0.005 | −0.002 | [0.005, 0.011] | 0.91 | [0.005, 0.010] | 0.83 |
| Tirbor 1 | −0.284*** | 0.011 | 0.007 | −0.134 | [−0.286, −0.281] | 2.94 | [−0.306, −0.284] | 0.65 |
| Tirbor 2 | −0.193*** | 0.011 | 0.007 | −0.105 | [−0.197, −0.190] | 3.05 | [−0.211, −0.193] | 0.75 |
| Tirbor 3 | −0.199*** | 0.011 | −0.005 | −0.126 | [−0.202, −0.196] | 3.92 | [−0.220, −0.199] | 0.73 |
| Tirbor 4 | −0.211*** | 0.011 | −0.010 | −0.116 | [−0.216, −0.206] | 2.18 | [−0.232, −0.211] | 0.68 |
| Tirbor 5 | −0.244*** | 0.010 | −0.003 | −0.071 | [−0.245, −0.242] | 6.85 | [−0.255, −0.244] | 0.92 |
| Tirbor 6 | −0.209*** | 0.011 | −0.001 | −0.078 | [−0.210, −0.209] | 16.99 | [−0.221, −0.209] | 0.96 |
| smoker | 0.064 | 0.204 | 0.005 | 0.050 | [0.061, 0.067] | 78.60 | [0.064, 0.074] | 5.19 |
| AloneBS | −0.492*** | 0.074 | −0.712 | 0.073 | [−0.870, −0.114] | 0.19 | [−0.860, −0.103] | 0.20 |
| Male | −0.026 | 0.177 | 0.144 | −0.162 | [−0.103, 0.051] | 2.31 | [−0.126, 0.028] | 1.57 |
| Grade10 | 0.060 | 0.173 | 0.007 | −0.066 | [0.056, 0.064] | 47.48 | [0.047, 0.060] | 4.12 |
| Tuesday | −0.103** | 0.034 | −0.015 | 0.024 | [−0.111, −0.096] | 4.28 | [−0.108, −0.092] | 2.20 |
| Wednesday | −0.082* | 0.034 | −0.004 | 0.021 | [−0.084, −0.080] | 15.26 | [−0.082, −0.077] | 3.03 |
| Thursday | −0.071* | 0.034 | −0.014 | 0.013 | [−0.079, −0.064] | 4.70 | [−0.077, −0.062] | 2.74 |
| Friday | −0.047 | 0.034 | 0.016 | −0.003 | [−0.055, −0.038] | 3.99 | [−0.056, −0.039] | 3.39 |
| Saturday | −0.057 | 0.034 | 0.117 | −0.007 | [−0.119, 0.005] | 0.56 | [−0.119, 0.004] | 0.55 |
| Sunday | −0.092** | 0.034 | 0.050 | 0.011 | [−0.119, −0.066] | 1.28 | [−0.118, −0.064] | 1.39 |
*** p < .001; ** p < .01; * p < .05.
To compare magnitudes of sensitivity across waves, we examine Figure 3, which shows trends of missing percentage, |ISNIQ|, and MinNIQ across all 6 waves. The patterns for missing percentage and |ISNIQ| are similar, although not parallel, which is consistent with the amount of missing data being an important determinant of sensitivity. To help understand why, consider the simple closed-form expressions of NISNI for θw and θx in Section 2.3.1. For the intercept, |ISNIL| is positively associated with the missing percentage, and |ISNIQ| for both θw and θx is related to pm(1 − pm). Therefore, when other quantities in the formula remain the same, |ISNIQ| will increase as pm increases from 0 to 0.5, reaching its maximum at pm = 0.5. However, as shown in the NISNI formula in Section 2.3.1, other quantities such as MAR estimates can affect both |NISNI| and MinNI. For example, based on pm alone, we would expect |ISNIQ| for wave 1 to be smaller than that for wave 2. However, the results are reversed, which may be because the MAR estimates for the two waves are quite different (−0.286 versus −0.197), whereas the missingness percentages differ only slightly (0.252 versus 0.269). Thus, NISNI values can be considered a generalization of missing percentages that accounts for complex features of models and data when measuring sensitivity.
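The role of pm(1 − pm) in this argument can be checked with a few lines of arithmetic. The sketch below is illustrative only: it evaluates the factor pm(1 − pm) in isolation, not the full NISNI expressions of Section 2.3.1, and the function name is hypothetical. The two proportions are the missingness percentages quoted in the text.

```python
def missingness_factor(pm):
    """Return pm * (1 - pm), the factor the text relates |ISNIQ| to."""
    return pm * (1.0 - pm)


# Locate the maximizer on a grid of proportions 0.00, 0.01, ..., 1.00.
grid = [k / 100 for k in range(101)]
peak = max(grid, key=missingness_factor)

# The two missingness proportions quoted in the text.
factor_a = missingness_factor(0.252)
factor_b = missingness_factor(0.269)
relative_gap = abs(factor_b - factor_a) / factor_a

print("maximizer:", peak)
print("factors:", round(factor_a, 4), round(factor_b, 4))
print("relative gap:", round(relative_gap, 3))
```

The factor peaks at pm = 0.5, and the two quoted proportions change it by only a few percent, so a difference in the other terms of the formula, such as the MAR estimates, can plausibly outweigh it.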
Figure 3:

Relationships among missing percentages, |ISNIQ|, and MinNIQ across all 6 waves.
Although the effect of Male is no longer significant, the finding that the adolescents feel more positive at night or when less alone remains valid. However, Monday now becomes the day that adolescents feel more positive on average, and their moods tend to be lower especially during the middle of the week or on Sunday. AloneBS and Saturday have MinNIL and MinNIQ values less than 1, suggesting that potential impact of nonignorability may be large, and these estimates can be sensitive to nonignorable missingness. Especially for AloneBS, the p-value may be greater than 0.05 when nonignorability is moderate, thereby changing the significance level. For Thursday and Sunday, although MinNI values are larger than 1, the upper bound of the sensitivity interval will have a p-value slightly greater than 0.05.
6. Discussion
Missing data are often unavoidable in ILD. Currently, regular ILD analysis often makes the strong MAR assumption about missing-data processes, which cannot be verified using only observed data. If the assumption is invalid, selection bias can occur, and the analysis results may be incorrect. Therefore, sensitivity analysis is necessary, and it helps researchers gauge credibility of standard analysis. The NISNI method developed herein relaxes the MAR assumption and provides a computationally feasible method of screening local sensitivity to nonignorability for parameters of interest in ILD analyses.
We have extended the NISNI method to ILD showing missingness in both outcome and covariates. Analyses of simulated data and real EMA data show that when outcome and covariates are missing simultaneously, impact of nonignorability on regression coefficients for missing covariates is not monotonic around the MAR model. Application of NISNI to EMA data in both baseline and multi-wave analyses of the adolescent smoker cohort reveals that in estimating the function of mood regulation, nonignorable missingness leads to attenuation bias (i.e., under-estimation of true effects) in the MAR effect estimates of the time-varying predictors missing simultaneously with mood outcome, regardless of whether better or worse mood is more likely to cause non-responses. This new finding of unidirectional impact of nonignorability is useful for EMA researchers to understand the nature of such nonmonotonic impacts on EMA data analysis and to measure/control for such impacts. In such cases, ISNIL is no longer sufficient to quantify parameter sensitivity to nonignorability. By developing ISNIQ, our NISNI method can readily capture U-shaped impact of nonignorability on parameters of interest.
This method is especially useful for addressing curse of dimensionality encountered in high-dimensional nonignorable missing data in ILD, which usually involves numerous measurement occasions for each subject. To perform sensitivity analysis on nonignorable missingness, the brute-force approach to fitting complicated nonignorable models becomes computationally prohibitive. The NISNI method avoids fitting nonignorable models, making such sensitivity analysis feasible to perform in modern data-rich environments.
Recently, shared-parameter mixed effects models have been developed to relax the MAR assumption in EMA data analysis.13,38 Such models assume conditional independence of missingness and unobserved data given random effects. This differs from our models, which permit the missingness probability to depend directly on time-varying unobserved outcome and covariates. The difference between the two modeling approaches is analogous to the difference between handling time-constant and time-varying confounders. As a result, shared-parameter models do not suffer from the curse of dimensionality encountered in our models, so long as the dimensionality of the random effects does not increase with the number of missing occasions. Even so, Cursio et al.13 documented slow convergence (i.e., hours of computation time) when fitting their model to EMA data. Thus, it is not surprising that a brute-force sensitivity analysis that directly estimates a range of nonignorable selection models, as in our modeling approach, is computationally prohibitive, and we must solve the curse of dimensionality problem for practical sensitivity analysis.
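The dimensionality contrast described above can be sketched in generic notation. The symbols below are assumptions for illustration and may not match the paper's own notation: $y_i^{\mathrm{obs}}$ and $y_i^{\mathrm{mis}}$ are the observed and missing outcomes for subject $i$, $g_i$ the missingness indicators, and $b_i$ the random effects; covariates are omitted for brevity, and when covariates are also missing the second integral would run over them as well.

```latex
\begin{aligned}
\text{shared-parameter:}\quad
  & f(y_i^{\mathrm{obs}}, g_i)
    = \int f(y_i^{\mathrm{obs}} \mid b_i)\, P(g_i \mid b_i)\, f(b_i)\, db_i, \\
\text{selection:}\quad
  & f(y_i^{\mathrm{obs}}, g_i)
    = \int f(y_i^{\mathrm{obs}}, y_i^{\mathrm{mis}})\,
      P(g_i \mid y_i^{\mathrm{obs}}, y_i^{\mathrm{mis}})\, dy_i^{\mathrm{mis}}.
\end{aligned}
```

In the first line the integral has the fixed dimension of $b_i$, whereas in the second its dimension equals the number of missed occasions, which can be large in ILD.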
Although we illustrate our approach using EMA data, the method also is applicable to other types of ILD. More studies on pain, diet, and physical activity involve data collection methods similar to those used to collect EMA data, and there are similar missing-data issues in such studies for which our NISNI method is applicable. Our method also can be applied to traditional short-panel clinical trials and observational data, where both outcome and time-varying covariates can be missing when a subject misses a visit.
Given the modeling flexibility and computational simplicity of ISNI analysis, researchers can specify different M2-MDMs and perform ISNI analysis to examine the robustness of MAR estimates under a given set of posited scenarios. Results to date have demonstrated that conclusions about local sensitivity to nonignorability are considerably less sensitive to MDM specification than conclusions drawn from joint estimation of all the parameters in nonignorable models.15,28,32 This is intuitive because ISNI analysis fixes the sensitivity parameter γ1, for which data provide little information and which is the major source of sensitivity to model assumptions.
When ISNI analysis flags MAR estimates as unreliable, researchers may choose to conduct a speculative nonignorable analysis. To increase confidence in such an analysis, it is helpful to garner additional information on the MDM for more robust modeling and estimation of nonignorable selection models, e.g., by collecting refreshment samples or ascertaining a sample of missing values. In such situations, ISNI analysis can inform researchers about the need for, and the allocation of resources to, additional data collection efforts. For instance, statisticians may consider performing ISNI analysis during data collection to monitor sensitivity to nonignorable missingness and assess the need for additional data collection.
One limitation of the current study is that we developed the NISNI method only under the condition that each of the variables subject to missingness follows a multivariate normal (MVN) distribution and is modeled using an LMM. By introducing random effects, an LMM can model complex variance-covariance matrices. Furthermore, the marginal distribution after integration over random effects is also MVN, allowing us to obtain closed-form expressions for all the NISNI terms. Therefore, the NISNI method avoids numerical evaluation of derivatives and expectations and further reduces computational workload. However, in real datasets, there are other data distributions that are more appropriately modeled using generalized linear mixed effects models. Therefore, it would be useful to extend the NISNI method to other data types. Although closed-form expressions for NISNI terms may be unavailable, computational workload can still be considerably less than that required for brute-force sensitivity analysis.
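The MVN property noted in this limitation is a standard result for LMMs. As a brief sketch in generic notation, which is assumed here and may differ from the notation of Eq. (1) ($X_i$ and $Z_i$ are fixed- and random-effects design matrices, $G$ the random-effects covariance, and $\sigma^2$ the residual variance):

```latex
\begin{aligned}
y_i &= X_i\beta + Z_i b_i + \varepsilon_i, \qquad
  b_i \sim N(0, G), \quad
  \varepsilon_i \sim N(0, \sigma^2 I_{n_i}), \quad
  b_i \perp \varepsilon_i, \\
y_i &\sim N\!\left(X_i\beta,\; Z_i G Z_i^{\top} + \sigma^2 I_{n_i}\right).
\end{aligned}
```

The second line is the marginal distribution after integrating out $b_i$. With a non-normal outcome and a nonlinear link, as in generalized linear mixed effects models, the corresponding marginal generally has no such closed form.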
Acknowledgements
This research was supported in part by grants R01CA178061 and 5P01CA098262 from the NCI/NIH.
References
- 1. Mehl MR, Conner TS, Csikszentmihalyi M. Handbook of Research Methods for Studying Daily Life. New York, NY: The Guilford Press; 2011.
- 2. Walls TA, Schafer JL. Models for Intensive Longitudinal Data. New York, NY: Oxford University Press; 2006.
- 3. Stone AA, Shiffman S, Atienza AA, Nebeling L. The Science of Real-Time Data Capture: Self-Reports in Health Research. New York, NY: Oxford University Press; 2007.
- 4. Hedeker D, Mermelstein RJ, Demirtas H. An Application of a Mixed-Effects Location Scale Model for Analysis of Ecological Momentary Assessment (EMA) Data. Biometrics. 2008;64:627–634.
- 5. Hedeker D, Mermelstein RJ, Demirtas H. Modeling Between- and Within-Subject Variance in Ecological Momentary Assessment (EMA) Data Using Mixed-Effects Location Scale Models. Stat Med. 2012;31:3328–3336.
- 6. Rubin DB. Inference and Missing Data (with Discussion). Biometrika. 1976;63:581–592.
- 7. Heitjan DF, Rubin DB. Ignorability and Coarse Data. Ann Stat. 1991;19:2244–2253.
- 8. Little RJA. Modeling the Drop-Out Mechanism in Repeated-Measures Studies. J Am Stat Assoc. 1995;90:1112–1121.
- 9. Little RJ, D’Agostino R, Cohen ML, et al. The Prevention and Treatment of Missing Data in Clinical Trials. N Engl J Med. 2012;367:1355–1360.
- 10. Courvoisier DS, Eid M, Lischetzke T. Compliance to A Cell Phone-Based Ecological Momentary Assessment Study: The Effect of Time and Personality Characteristics. Psychol Assess. 2012;24:713–720.
- 11. Bunouf P, Grouin JM, Molenberghs G. Analysis of an incomplete binary outcome derived from frequently recorded longitudinal continuous data: application to daily pain evaluation. Stat Med. 2012;31:1554–1571.
- 12. Gao W, Hedeker D, Mermelstein RJ, Xie H. A Scalable Approach to Measuring the Impact of Nonignorable Nonresponse with an EMA Application. Stat Med. 2016;35:5579–5602.
- 13. Cursio JF, Mermelstein RJ, Hedeker D. Latent Trait Shared-Parameter Mixed Models for Missing Ecological Momentary Assessment Data. Stat Med. 2019;38:660–673.
- 14. Daniels MJ, Hogan JW. Missing Data in Longitudinal Studies: Strategies for Bayesian Modeling and Sensitivity Analysis. New York, NY: Chapman and Hall/CRC; 2008.
- 15. Copas JB, Li HG. Inference for Non-random Samples (with Discussion). J R Stat Soc Series B Stat Methodol. 1997;59:55–95.
- 16. Troxel AB, Harrington DP, Lipsitz SR. Analysis of Longitudinal Data with Non-Ignorable Non-Monotone Missing Values. J R Stat Soc Ser C Appl Stat. 1998;47:425–438.
- 17. Copas JB, Eguchi S. Local Sensitivity Approximations for Selectivity Bias. J R Stat Soc Series B Stat Methodol. 2001;63:871–895.
- 18. Kenward MG. Selection Models for Repeated Measurements with Non-Random Dropout: An Illustration of Sensitivity. Stat Med. 1998;17:2723–2732.
- 19. Hirano K, Imbens GW, Ridder G, Rubin DB. Combining panels with attrition and refreshment samples. Econometrica. 2001;69:1645–1659.
- 20. Deng Y, Hillygus DS, Reiter JP, Si Y, Zheng S. Handling Attrition in Longitudinal Studies: The Case for Refreshment Samples. Stat Sci. 2013;28:238–256.
- 21. Verbeke G, Molenberghs G, Thijs H, Lesaffre E, Kenward MG. Sensitivity Analysis for Nonrandom Dropout: A Local Influence Approach. Biometrics. 2001;57:7–14.
- 22. Scharfstein DO, Rotnitzky A, Robins JM. Adjusting for Nonignorable Drop-Out Using Semiparametric Nonresponse Models (with Discussion). J Am Stat Assoc. 1999;94:1096–1146.
- 23. Wen L, Seaman S. Semi-parametric methods of handling missing data in mortal cohorts under non-ignorable missingness. Biometrics. 2018;74:1427–1437.
- 24. Lindsey JK, Lambert P. On the appropriateness of marginal models for repeated measurements in clinical trials. Stat Med. 1998;17:447–469.
- 25. Lee Y, Nelder JA. Conditional and Marginal Models: Another View. Stat Sci. 2004;19:219–238.
- 26. Troxel AB, Ma G, Heitjan DF. An Index of Local Sensitivity to Nonignorability. Stat Sin. 2004;14:1221–1237.
- 27. Xie H, Heitjan DF. Sensitivity Analysis of Causal Inference in a Clinical Trial Subject to Crossover. Clin Trials. 2004;1:21–30.
- 28. Ma G, Troxel AB, Heitjan DF. An Index of Local Sensitivity to Nonignorable Dropout in Longitudinal Modeling. Stat Med. 2005;24:2129–2150.
- 29. Xie H. A Local Sensitivity Analysis Approach to Longitudinal Non-Gaussian Data with Non-Ignorable Dropout. Stat Med. 2008;27:3155–3177.
- 30. Xie H, Heitjan DF. Local Sensitivity to Nonignorability: Dependence on the Assumed Dropout Mechanism. Stat Biopharm Res. 2009;1:243–257.
- 31. Xie H. Adjusting for Nonignorable Missingness When Estimating Generalized Additive Models. Biom J. 2010;52:186–200.
- 32. Xie H. Analyzing Longitudinal Clinical Trial Data with Nonignorable Missingness and Unknown Missingness Reasons. Comput Stat Data Anal. 2012;56:1287–1300.
- 33. Xie H, Qian Y. Measuring the Impact of Nonignorability in Panel Data with Non-Monotone Nonresponse. J Appl Econ. 2012;27:129–159.
- 34. Xie H, Gao W, Xing B, Heitjan DF, Hedeker D, Yuan C. Measuring the Impact of Nonignorable Missingness Using the R Package isni. Comput Methods Programs Biomed. 2018;164:207–220.
- 35. Chen HY, Xie H, Qian Y. Multiple Imputation for Missing Values through Conditional Semiparametric Odds Ratio Models. Biometrics. 2011;67:799–809.
- 36. Troxel AB, Lipsitz SR, Harrington DP. Marginal Models for the Analysis of Longitudinal Measurements with Nonignorable Non-Monotone Missing Data. Biometrika. 1998;85:661–672.
- 37. Vansteelandt S, Rotnitzky A, Robins JM. Estimation of regression models for the mean of repeated outcomes under non-ignorable non-monotone non-response. Biometrika. 2007;94:841–860.
- 38. Lin X, Mermelstein RJ, Hedeker D. A Shared Parameter Location Scale Mixed Effect Model for EMA Data Subject to Informative Missing. Health Serv Outcomes Res Methodol. 2018;18:227–243.