Semiparametric fractional imputation using empirical likelihood in survey sampling

Sixia Chen; Jae Kwang Kim

doi:10.1080/24754269.2017.1328244

Author manuscript; available in PMC: 2018 Jun 1.

Published in final edited form as: Stat Theory Relat Fields. 2017 Jun 1;1(1):69–81. doi: 10.1080/24754269.2017.1328244

PMCID: PMC5654594 NIHMSID: NIHMS908583 PMID: 29082363

Abstract

The empirical likelihood method is a powerful tool for incorporating moment conditions in statistical inference. We propose a novel application of the empirical likelihood for handling item nonresponse in survey sampling. The proposed method takes the form of fractional imputation (Kim, 2011) but it does not require parametric model assumptions. Instead, only the first moment condition based on a regression model is assumed and the empirical likelihood method is applied to the observed residuals to get the fractional weights. The resulting semiparametric fractional imputation provides $\sqrt{n}$ -consistent estimates for various parameters. Variance estimation is implemented using a jackknife method. Two limited simulation studies are presented to compare several imputation estimators.

Keywords: Item nonresponse, missing data, quantile estimation, robust estimation

1 Introduction

Missing data are frequently encountered in many areas, such as survey sampling, epidemiology and other fields. Simply ignoring missing values can potentially lead to biased estimation (Little and Rubin 2002, Kim and Shao 2013). Two statistical approaches for handling missing data have been used in practice: propensity score weighting and imputation. Propensity score weighting is used mainly to correct for unit non-response, while imputation is mainly used to handle item nonresponse. Haziza (2009) provides a comprehensive overview of the imputation methods in survey sampling.

Multiple imputation (MI), proposed by Rubin (1987), is a popular imputation approach for general-purpose estimation because of its practical simplicity. However, Rubin’s variance estimator may be biased in certain situations (Fay 1992; Wang and Robins 1998; Kim et al. 2006; Yang and Kim 2016), and its validity requires the congeniality condition of Meng (1994), which may not hold for general-purpose estimation.

Fractional imputation (FI), first proposed by Kalton and Kish (1984), provides an alternative method for handling item nonresponse. Fay (1996), Kim and Fuller (2004), Fuller and Kim (2005), Durrant (2005), and Durrant and Skinner (2006) discussed fractional hot deck imputation. Kim (2011) and Kim and Yang (2014) discussed a fully parametric approach to fractional imputation. Parametric fractional imputation provides a powerful tool for handling missing data in various situations. However, it relies on strong parametric model assumptions, and such assumptions are usually not preferred in survey sampling. The balanced random imputation of Chauvet et al. (2011) is also an attractive imputation technique, but it still requires parametric model assumptions for multipurpose estimation.

The empirical likelihood (EL) method, considered by Owen (2001) and Qin and Lawless (1994), is a useful tool for semiparametric inference in statistics. It involves a likelihood-based inference without making a parametric distributional assumption about the observed data. Qin (1993) addressed the missing survey data problem by using a biased sampling argument of Vardi (1985). Wang and Rao (2002) brought regression-type imputation approaches to empirical likelihood inference. Wang and Chen (2009) used a nonparametric regression imputation approach to handle missing data in the empirical likelihood inference. Müller (2009) considered a novel application of empirical likelihood method to handle missing data under a regression model assumption. In Müller (2009), the moment condition of the error term in the regression model is used to construct a fully imputed estimator.

In this paper, motivated by the fully imputed estimator of Müller (2009), we propose a semiparametric fractional imputation (SFI) method using empirical likelihood that can be used to handle item nonresponse in survey sampling. Because the proposed SFI uses only moment conditions in the semiparametric regression model, it is more robust than the parametric fractional imputation (PFI) method or the parametric MI method. By using a regression model assumption, the proposed SFI method is more efficient than the nonparametric regression imputation method of Wang and Chen (2009). Because the proposed method takes the form of fractional imputation, it is also straightforward to implement in practice. The proposed SFI method can be used to estimate various parameters, including nonsmooth parameters such as population quantiles.

The paper is organized as follows. The basic setup is introduced and the proposed method is presented in Section 2. The asymptotic properties of the SFI estimators are presented in Section 3. Extensions to non-smooth statistics as well as random imputations are covered in Section 4. In Section 5, variance estimation is discussed. Some numerical results are given in Section 6. Some concluding remarks are made in Section 7.

2 Basic Setup

Consider a finite population ℱ_N = {(x_i, y_i);i = 1, 2, …, N}, where x_i is the vector of auxiliary variables that are always observed and y_i is the study variable that is subject to missingness. We assume (x_i, y_i) are realizations from a regression model

Y = m (X; β_{0}) + ε,

(1)

where m(X; β₀) is assumed to be known with unknown parameter β₀ and ε satisfies E(ε|X) = 0. No parametric distributional assumption on X is made.

Let δ_i be the response indicator such that δ_i = 1 if y_i is observed and δ_i = 0 otherwise. We assume missing at random (MAR) in the sense that

\Pr (δ = 1 | x, y) = \Pr (δ = 1 | x) .

(2)

Even though we observe δ_i only in the sample, we can conceptually assume that δ_i’s are defined throughout the population. Such extended definition of δ_i has been adopted in Fay (1992), Shao and Steel (1999), Kim, Navarro, and Fuller (2006).

Given the finite population, suppose that sample A of size n is selected from the finite population by a probability sampling mechanism. Let π_i, i = 1, 2, …, N, be the first-order inclusion probability of unit i in the population. We are interested in estimating η₀, defined as a solution to the estimating equation E {U(η; x, y)} = 0, where U(η; x, y) is a known function with parameter η. To avoid unnecessary details, we assume that the solution to E {U(η; x, y)} = 0 is unique and that η and U(η; x, y) both have dimension r. Thus, the parameter η is just-identified. Under complete response, a consistent estimator of η₀ is obtained by solving

\sum_{i \in A} \frac{1}{π_{i}} U (η; x_{i}, y_{i}) = 0

for η. If some of y_i are missing, under the MAR assumption, a consistent estimator of η₀ can be obtained by solving the following expected estimating equation

\sum_{i \in A} \frac{1}{π_{i}} [δ_{i} U (η; x_{i}, y_{i}) + (1 - δ_{i}) E {U (η; x_{i}, Y) | x_{i}}] = 0

(3)

for η. The conditional expectation in (3) is with respect to f(y | x), which is unknown as we only assume (1).

In fractional imputation, our goal is to approximate the conditional expectation in (3) by the weighted mean of the fractionally imputed estimating functions. That is, we wish to achieve

E {U (η; x_{i}, Y) | x_{i}} ≅ \sum_{j = 1}^{m} w_{i j}^{*} U (η; x_{i}, y_{i}^{* (j)})

(4)

as closely as possible for some $(w_{i j}^{*}, y_{i}^{* (j)})$ satisfying $\sum_{j = 1}^{m} w_{i j}^{*} = 1$ , where $w_{i j}^{*}$ ’s are desired fractional weights and $y_{i}^{* (j)}$ ’s are m imputed values for subject i. Kim (2011) and Kim and Yang (2014) developed a fractional imputation satisfying (4) using a parametric model assumption on f(y | x).

In our proposed method, we use the empirical likelihood approach to achieve the approximation in (4). To explain the idea, assume for now that the true parameter β₀ in (1) is known. In this case, ε_i = y_i − m(x_i; β₀) are available among δ_i = 1. Because E(ε | x) = 0 holds, we can compute

E {U (η; x_{i}, Y) | x_{i}} = \int U (η; x_{i}, y) f (y | x_{i}) d y = \int U (η; x_{i}, m (x_{i}; β_{0}) + ε) f_{ε} (ε | x_{i}) d ε,

where f_ε(ε | x) is the (unknown) conditional density of ε given x. To apply the empirical likelihood method, we assume that the conditional distribution of ε given x can be approximated by

F_{ε} (ε | x) = \sum_{i \in A} δ_{i} w_{i} I (ε_{i} \leq ε)

(5)

where the w_i ≥ 0, with Σδ_iw_i = 1, are the point masses assigned to the observed ε_i, under the assumption that the support of ε is the set of observed ε_i. Using the approximation in (5), we can obtain

E {U (η; x_{i}, Y) | x_{i}} ≅ \sum_{j \in A} δ_{j} w_{j} U (η; x_{i}, m (x_{i}; β_{0}) + ε_{j}),

which can be written in the fractional imputation form in (4). To determine w_j uniquely, we can use the idea of pseudo empirical likelihood method of Wu and Rao (2006) to maximize

l (w) = \sum_{i \in A} δ_{i} π_{i}^{- 1} \log (w_{i})

(6)

subject to

\sum_{i \in A} δ_{i} w_{i} = 1 and \sum_{i \in A} δ_{i} w_{i} ε_{i} = 0.

(7)

In practice, we do not know β₀ and, hence, we do not observe ε_i = y_i − m(x_i; β₀). We can use a $\sqrt{n}$ -consistent estimator $\hat{β}$ of β₀ to obtain ${\hat{ε}}_{i} = y_{i} - m (x_{i}; \hat{β})$ and apply the above empirical likelihood method to the observed residuals. In general, one can use

{\hat{U}}_{β} (β) = \frac{1}{N} \sum_{i \in A} \frac{δ_{i}}{π_{i}} {y_{i} - m (x_{i}; β)} h (x_{i}; β) = 0

(8)

to obtain a $\sqrt{n}$ -consistent estimator of β₀, where h(x_i; β) is an arbitrary function that enables the above equation to have a solution. If the variance function is V(y|x_i) = σ²q(x_i; β₀) for a known function q, then one can choose h(x_i; β) = ṁ(x_i; β)/q(x_i; β), where ṁ(x_i; β) = ∂m(x_i; β)/∂β. This choice is motivated by the quasi-likelihood equations for generalized linear models (McCullagh and Nelder, 1989, Ch. 9). The solution to (8) can be called the complete-case (CC) estimator of β. The CC estimator is not efficient in general, but it is efficient for estimating β under MAR. Thus, the resulting SFI estimator can be constructed as follows:

[Step 1] Obtain a $\sqrt{n}$ -consistent estimator $\hat{β}$ of β₀ and compute ${\hat{ε}}_{i} = y_{i} - m (x_{i}; \hat{β})$ among the respondents.
[Step 2] Find ŵ_i that maximizes (6) subject to
$\sum_{i \in A} δ_{i} w_{i} = 1 \text{ and } \sum_{i \in A} δ_{i} w_{i} {\hat{ε}}_{i} = 0.$ (9)
The solution can be written as
${\hat{w}}_{i} = \frac{π_{i}^{- 1}}{\sum_{k \in A} δ_{k} π_{k}^{- 1}} \frac{1}{1 + \hat{λ} {\hat{ε}}_{i}},$ (10)
where $\hat{λ}$ is obtained by solving the second constraint of (9).
[Step 3] Use ŵ_j in Step 2 to approximate
$E {U (η; x_{i}, Y) | x_{i}} ≅ \sum_{j \in A} δ_{j} w_{i j}^{*} U (η; x_{i}, y_{i}^{* (j)}),$
where $y_{i}^{* (j)} = {\hat{y}}_{i} + {\hat{ε}}_{j}$ , ${\hat{y}}_{i} = m (x_{i}; \hat{β})$ , and $w_{i j}^{*} = {\hat{w}}_{j}$ .
[Step 4] The SFI estimator ${\hat{η}}_{SFI}$ of η is computed by solving
${\hat{U}}_{π} (η, \hat{β}, \hat{λ}) = \frac{1}{N} \sum_{i \in A} \frac{1}{π_{i}} {δ_{i} U (η; x_{i}, y_{i}) + (1 - δ_{i}) \sum_{j \in A} δ_{j} w_{i j}^{*} U (η; x_{i}, y_{i}^{* (j)})} = 0$ (11)
for η.
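
Steps 1 to 4 can be sketched in code for the special case U(η; x, y) = g(y) − η with a linear mean function m(x; β) = β₀ + β₁x. The sketch below is a minimal illustration and not the authors' implementation: the function names (`fit_linear`, `el_weights`, `sfi_estimate`), the choice h(x; β) = (1, x)ᵀ, and the bisection search for $\hat{λ}$ are assumptions made here for illustration.

```python
def fit_linear(x, y, d):
    """Step 1 (assumed h = (1, x)): weighted least squares for m(x; b) = b0 + b1*x."""
    sw = sum(d)
    xbar = sum(di * xi for di, xi in zip(d, x)) / sw
    ybar = sum(di * yi for di, yi in zip(d, y)) / sw
    sxx = sum(di * (xi - xbar) ** 2 for di, xi in zip(d, x))
    sxy = sum(di * (xi - xbar) * (yi - ybar) for di, xi, yi in zip(d, x, y))
    b1 = sxy / sxx
    return ybar - b1 * xbar, b1


def el_weights(eps, d, iters=200):
    """Step 2: weights of the form (10); lambda is found by bisection (an assumed solver)."""
    if min(eps) >= 0.0 or max(eps) <= 0.0:
        raise ValueError("residuals must take both signs")
    total = sum(d)
    a = [di / total for di in d]  # normalized design weights of the respondents

    def g(lam):  # second constraint of (9), a decreasing function of lambda
        return sum(ai * e / (1.0 + lam * e) for ai, e in zip(a, eps))

    # lambda must keep 1 + lambda * eps_i > 0 for every respondent
    lo, hi = -1.0 / max(eps), -1.0 / min(eps)
    span = hi - lo
    lo, hi = lo + 1e-10 * span, hi - 1e-10 * span
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if g(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    lam = 0.5 * (lo + hi)
    return [ai / (1.0 + lam * e) for ai, e in zip(a, eps)], lam


def sfi_estimate(x, y, delta, pi, g=lambda v: v):
    """Steps 3 and 4 for U(eta; x, y) = g(y) - eta; y[i] is ignored when delta[i] == 0."""
    d = [1.0 / p for p in pi]
    resp = [i for i, r in enumerate(delta) if r == 1]
    d_r = [d[i] for i in resp]
    b0, b1 = fit_linear([x[i] for i in resp], [y[i] for i in resp], d_r)
    eps = [y[i] - b0 - b1 * x[i] for i in resp]
    w, _ = el_weights(eps, d_r)
    num = 0.0
    for i in range(len(x)):
        if delta[i] == 1:
            num += d[i] * g(y[i])
        else:
            y_hat = b0 + b1 * x[i]
            # fractionally weighted average over imputed values y_hat + eps_j
            num += d[i] * sum(wj * g(y_hat + ej) for wj, ej in zip(w, eps))
    return num / sum(d)
```

For example, `sfi_estimate(x, y, delta, pi, g=lambda v: 1.0 if v < 1 else 0.0)` would estimate Pr(y < 1). With h(x; β) = (1, x)ᵀ the design-weighted residuals already sum to zero, so $\hat{λ}$ is numerically zero and the weights reduce to the normalized design weights; a nonzero $\hat{λ}$ arises when the weighted residuals are not centered.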

Instead of (11), one can also consider a fully imputed estimating equation based on

\sum_{i \in A} \frac{1}{π_{i}} E {U (η; x_{i}, Y_{i}) | x_{i}} = 0,

which was considered by Müller (2009) under the independently and identically distributed (I.I.D.) setup. The fully imputed estimating equation may lead to a more efficient estimator of η (Matloff, 1981) but such over-imputation does not appeal to survey practice since we usually do not want to replace the true values of respondents with some imputed values. In the following section, we present the asymptotic properties of ${\hat{η}}_{SFI}$ under complex survey designs.

3 Asymptotic Properties

To discuss the asymptotic properties of the proposed SFI estimator of η, we first assume a sequence of finite populations and samples with finite fourth moments as in Fuller (2009, Ch.1). The following theorem presents the asymptotic normality of the proposed SFI estimator. The sketched proof of Theorem 1 is provided in Appendix A.

Theorem 1

Under the regularity conditions (C1)–(C13) in Appendix A, the SFI estimator defined in (11) is a $\sqrt{n}$ -consistent estimator of η₀, that is

\sqrt{n} ({\hat{η}}_{SFI} - η_{0}) \overset{ℒ}{\to} N (0, B Σ_{u 2} B'),

where $B = {[E {\partial U (η; x, y) / \partial η}]}^{- 1}, Σ_{u 2} = V (N^{- 1} \sum_{i \in A} π_{i}^{- 1} ζ_{i})$ , and

ζ_{i} = δ_{i} U (η_{0}; x_{i}, y_{i}) + (1 - δ_{i}) E {U (η_{0}; x_{i}, Y) | x_{i}} + \frac{δ_{i}}{E (δ)} C_{i} + D_{W} {[E {δ h (x; β_{0}) \frac{\partial m (x; β_{0})}{\partial β}}]}^{- 1} δ_{i} ε_{i} h (x_{i}; β_{0}),

(12)

and

C_{i} = {\bar{U}}_{m} (ε_{i}) - E {{\bar{U}}_{m} (ε_{i})} - σ^{- 2} E {ε {\bar{U}}_{m} (ε)} ε_{i},

D_{W} = D + E {ε {\bar{U}}_{m} (ε)} σ^{- 2} E {\frac{\partial m (x; β_{0})}{\partial β} | δ = 1},

D = E ((1 - δ) U (η_{0}; x, y) [\frac{\partial m (x; β_{0})}{\partial β} - E {\frac{\partial m (x; β_{0})}{\partial β} | δ = 1}] l (ε)),

with $σ^{2} = E (ε^{2}), {\bar{U}}_{m} (ε) = E {(1 - δ) U (η_{0}; x, y) | ε}$ , and $l (ε) = - \partial \log f_{ε} (ε | x) / \partial ε$ .

Remark 1

In (12), ζ_i is written as the sum of four terms. The first two terms together are the conditional expectation of U(η; x, y) given the observed data, the third term is the additional term due to approximating f(y | x) by the empirical likelihood method, and the fourth term is the additional term due to estimating β.

According to Theorem 1, a consistent variance estimator of ${\hat{η}}_{SFI}$ can be written as

\hat{V} ({\hat{η}}_{SFI}) = {\hat{E} (\frac{\partial U (η_{0}; x, y)}{\partial η})}^{- 1} \hat{V} (\frac{1}{N} \sum_{i \in A} π_{i}^{- 1} ζ_{i}) {[{\hat{E} (\frac{\partial U (η_{0}; x, y)}{\partial η})}^{- 1}]}^{T},

(13)

where

\hat{E} (\frac{\partial U (η_{0}; x, y)}{\partial η}) = \frac{1}{\hat{N}} \sum_{i \in A} π_{i}^{- 1} {δ_{i} \frac{\partial U (\hat{η}; x_{i}, y_{i})}{\partial η} + (1 - δ_{i}) \sum_{j \in A} δ_{j} w_{i j}^{*} \frac{\partial U (\hat{η}; x_{i}, y_{i}^{* (j)})}{\partial η}},

with $\hat{N} = \sum_{i \in A} π_{i}^{- 1}, \hat{η} = {\hat{η}}_{SFI}$ and

\hat{V} (\frac{1}{N} \sum_{i \in A} π_{i}^{- 1} ζ_{i}) = \frac{1}{{\hat{N}}^{2}} \sum_{i \in A} \sum_{j \in A} \frac{π_{i j} - π_{i} π_{j}}{π_{i j}} \frac{{\hat{ζ}}_{i} {\hat{ζ}}_{j}}{π_{i} π_{j}} + \frac{1}{{\hat{N}}^{2}} \sum_{i \in A} \frac{{({\hat{ζ}}_{i} - {\hat{ζ}}_{N})}^{2}}{π_{i}},

(14)

where ${\hat{ζ}}_{N} = {\hat{N}}^{- 1} \sum_{i \in A} π_{i}^{- 1} {\hat{ζ}}_{i}$ and ${\hat{ζ}}_{i}$ is a plug-in estimator of ζ_i in (12). One can use

{\hat{ζ}}_{i} = δ_{i} U (\hat{η}; x_{i}, y_{i}) + (1 - δ_{i}) \hat{μ} (x_{i}; \hat{β}, \hat{η}) + δ_{i} {\hat{E} (δ)}^{- 1} [{\hat{\bar{U}}}_{m} ({\hat{ε}}_{i}) - \hat{E} {{\bar{U}}_{m} (ε_{i})} - {\hat{σ}}^{- 2} \hat{E} {ε {\bar{U}}_{m} (ε)} {\hat{ε}}_{i}] + {\hat{D}}_{W} {[\hat{E} {δ h (x; β_{0}) \frac{\partial m (x; β_{0})}{\partial β}}]}^{- 1} δ_{i} {\hat{ε}}_{i} h (x_{i}; \hat{β}),

with

\hat{μ} (x_{i}; \hat{β}, \hat{η}) = \frac{\sum_{j \in A} π_{j}^{- 1} δ_{j} U (\hat{η}; x_{i}, y_{i}^{* (j)})}{\sum_{j \in A} π_{j}^{- 1} δ_{j}}, \hat{E} (δ) = \frac{1}{\hat{N}} \sum_{j \in A} π_{j}^{- 1} δ_{j},

{\hat{σ}}^{2} = \frac{\sum_{j \in A} π_{j}^{- 1} δ_{j} {\hat{ε}}_{j}^{2}}{\sum_{j \in A} π_{j}^{- 1} δ_{j}}, \hat{E} {\frac{\partial m (x; β_{0})}{\partial β} | δ = 1} = \frac{\sum_{i \in A} π_{i}^{- 1} δ_{i} \partial m (x_{i}; \hat{β}) / \partial β}{\sum_{i \in A} π_{i}^{- 1} δ_{i}},

{\hat{D}}_{W} = \hat{D} + \hat{E} {ε {\bar{U}}_{m} (ε)} {\hat{σ}}^{- 2} \hat{E} {\frac{\partial m (x; β_{0})}{\partial β} | δ = 1},

{\hat{\bar{U}}}_{m} ({\hat{ε}}_{i}) = \frac{1}{\hat{N}} \sum_{j \in A} π_{j}^{- 1} (1 - δ_{j}) U (\hat{η}; x_{j}, y_{j}^{* (i)}),

\hat{E} {{\bar{U}}_{m} (ε_{i})} = \frac{1}{\hat{N}} \sum_{i \in A} π_{i}^{- 1} (1 - δ_{i}) \sum_{j \in A} δ_{j} {\hat{w}}_{j} U (\hat{η}; x_{i}, y_{i}^{* (j)}), \hat{E} {ε {\bar{U}}_{m} (ε)} = \frac{\sum_{j \in A} π_{j}^{- 1} δ_{j} {\hat{ε}}_{j} {\hat{\bar{U}}}_{m} ({\hat{ε}}_{j})}{\sum_{j \in A} δ_{j} π_{j}^{- 1}},

\hat{D} = \frac{1}{\hat{N}} \sum_{i \in A} π_{i}^{- 1} (1 - δ_{i}) \frac{\sum_{j \in A} δ_{j} π_{j}^{- 1} \partial U (\hat{η}; x_{i}, y_{i}^{* (j)}) / \partial y {\partial m (x_{i}; \hat{β}) / \partial β}}{\sum_{j \in A_{r}} π_{j}^{- 1}} .

When nN⁻¹ = o(1), the second term of (14) is of smaller order and can be safely ignored.
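
As an illustration of (14), consider Poisson sampling, where units are selected independently so that π_ij = π_i π_j for i ≠ j and π_ii = π_i. The double sum in the first term then collapses to a single sum with terms (1 − π_i) ζ̂_i² / π_i². The sketch below assumes a scalar ζ̂_i that has already been computed; the function name `linearization_variance_poisson` and the flag `include_second_term` are hypothetical.

```python
def linearization_variance_poisson(zeta_hat, pi, include_second_term=True):
    """Evaluate (14) for scalar zeta_hat under Poisson sampling (pi_ij = pi_i * pi_j)."""
    n_hat = sum(1.0 / p for p in pi)  # N-hat, the estimated population size
    # first term: only the diagonal (i == j) contributes under Poisson sampling
    first = sum((1.0 - p) * z * z / (p * p) for z, p in zip(zeta_hat, pi))
    second = 0.0
    if include_second_term:
        zeta_bar = sum(z / p for z, p in zip(zeta_hat, pi)) / n_hat
        second = sum((z - zeta_bar) ** 2 / p for z, p in zip(zeta_hat, pi))
    return (first + second) / n_hat ** 2
```

Setting `include_second_term=False` corresponds to dropping the second term of (14) when the sampling fraction is negligible.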

4 Extensions

In this section, we discuss two extensions of the proposed method. In Section 4.1, our proposed method is extended to handle non-smooth statistics including distribution functions and percentiles. In Section 4.2, an extension to stochastic imputation is discussed.

4.1 Inference for non-smooth statistics

Suppose that we are interested in estimating parameter η₀, the solution of E {U(η; x, y)} = 0 with non-smooth function U(η; x, y), where the non-smoothness can be with respect to either η or y. For generality, we assume the non-smoothness is with respect to both η and y. Wang and Opsomer (2011) discussed asymptotic results for nondifferentiable survey estimators. Define $θ = (η, β)$ and $θ_{0} = (η_{0}, β_{0})$ . Let ${\tilde{U}}_{n} (θ) = N^{- 1} \sum_{i \in A} π_{i}^{- 1} \tilde{U} (θ; δ_{i}, x_{i}, y_{i})$ and $\tilde{U} (θ) = E {\tilde{U} (θ; δ_{i}, x_{i}, y_{i})}$ , where

\tilde{U} (θ; δ_{i}, x_{i}, y_{i}) = δ_{i} U (η; x_{i}, y_{i}) + (1 - δ_{i}) \int U {η; x_{i}, m (x_{i}; β) + ε_{i}} f_{ε} (ε_{i} | x_{i}) d ε_{i} .

Denote $\hat{θ} = (\hat{η}, \hat{β})$ as the solution of estimating equation Ũ_n(θ) = 0. To discuss asymptotic properties, we replace regularity conditions (C7)–(C10) in Appendix A with the regularity conditions (C14)–(C17) in Appendix B. The following theorem presents the asymptotic expansion of ${\hat{η}}_{SFI}$ under this scenario and the sketched proof is presented in Appendix B.

Theorem 2

Under regularity conditions (C1)–(C3), and (C11)–(C17) in Appendix A and Appendix B, ${\hat{η}}_{SFI}$ has the following asymptotic expansion

{\hat{η}}_{SFI} - η_{0} = - {[\frac{\partial E {U (η_{0}; x, y)}}{\partial η}]}^{- 1} (\frac{1}{N} \sum_{i \in A} \frac{1}{π_{i}} ζ_{2 i}) + o_{p} (n^{- 1 / 2}),

where

ζ_{2 i} = δ_{i} U (η_{0}; x_{i}, y_{i}) + (1 - δ_{i}) μ (x_{i}; β_{0}, η_{0}) + \frac{δ_{i}}{E (δ)} [{\bar{U}}_{m} (ε_{i}) - E {{\bar{U}}_{m} (ε_{i})} - \frac{E {ε {\bar{U}}_{m} (ε)}}{σ^{2}} ε_{i}] + D_{W}^{*} {[E {δ h (x; β_{0}) \frac{\partial m (x; β_{0})}{\partial β}}]}^{- 1} δ_{i} ε_{i} h (x_{i}; β_{0}),

where

D_{W}^{*} = D^{*} + E {ε {\bar{U}}_{m} (ε)} σ^{- 2} E {\frac{\partial m (x; β_{0})}{\partial β} | δ = 1},

and

D^{*} = {E (δ)}^{- 1} \frac{\partial E [(1 - δ_{i}) δ_{j} U {η_{0}; x_{i}, y_{i}^{* (j)} (β)}]}{\partial β}

evaluated at β₀ and other terms are the same as those in Theorem 1.

By Theorem 2, we can obtain

\sqrt{n} ({\hat{η}}_{SFI} - η_{0}) \overset{ℒ}{\to} N (0, B Σ_{u 2} B'),

where B = [∂E {U(η₀; x, y)} /∂η]⁻¹ and $Σ_{u 2} = V {N^{- 1} \sum_{i \in A} π_{i}^{- 1} ζ_{2 i}}$ . If we are interested in estimating the cumulative distribution function of y at a point t, which is Pr(y < t), then we can choose U(η; x, y) = I(y < t) − η and

E [(1 - δ_{i}) δ_{j} U {η_{0}; x_{i}, y_{i}^{* (j)} (β)}] = \int_{- \infty}^{+ \infty} \int_{- \infty}^{+ \infty} {1 - p (x_{i})} p (x_{j}) \int_{- \infty}^{t + m (x_{j}; β) - m (x_{i}; β)} f_{y_{j} | x_{j}} (y_{j}) d y_{j} d x_{i} d x_{j},

where p(x) = Pr(δ = 1|x). Differentiating the upper limit of the inner integral with respect to β, we have $D^{*} = {E (δ)}^{- 1} E [(1 - δ_{i}) δ_{j} {\frac{\partial m (x_{j}; β_{0})}{\partial β} - \frac{\partial m (x_{i}; β_{0})}{\partial β}} f_{y_{j} | x_{j}} {t + m (x_{j}; β_{0}) - m (x_{i}; β_{0})}] .$

A consistent estimator of D* can be written as

{\hat{D}}^{*} = \sum_{i \in A} \frac{1 - δ_{i}}{π_{i}} \frac{\sum_{j \in A} δ_{j} π_{j}^{- 1} {\partial m (x_{j}; \hat{β}) / \partial β - \partial m (x_{i}; \hat{β}) / \partial β} {\hat{f}}_{y_{j} | x_{j}} {t + m (x_{j}; \hat{β}) - m (x_{i}; \hat{β})}}{\hat{N} \sum_{j \in A_{r}} π_{j}^{- 1}}

with

{\hat{f}}_{y | x} (y | x) = \frac{{(h_{x} h_{y})}^{- 1} \sum_{i \in A} π_{i}^{- 1} δ_{i} K_{x} {h_{x}^{- 1} (x - x_{i})} K_{y} {h_{y}^{- 1} (y - y_{i})}}{{(h_{x})}^{- 1} \sum_{i \in A} π_{i}^{- 1} δ_{i} K_{x} {h_{x}^{- 1} (x - x_{i})}},

where K_x and K_y are kernel functions for x and y with bandwidth h_x and h_y. Thus, a consistent variance estimator of ${\hat{η}}_{SFI}$ here can be obtained similarly to (13).
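
The kernel estimator of f(y | x) above reduces to a ratio of design-weighted kernel sums, since the factor 1/h_x cancels. A minimal sketch, assuming a scalar covariate and Gaussian kernels for both K_x and K_y (the paper does not fix these choices at this point); the names `gauss_kernel` and `cond_density` are hypothetical.

```python
import math


def gauss_kernel(t):
    """Standard normal density, used here for both K_x and K_y (an assumption)."""
    return math.exp(-0.5 * t * t) / math.sqrt(2.0 * math.pi)


def cond_density(y0, x0, x, y, delta, pi, hx, hy):
    """Design-weighted kernel estimate of f(y0 | x0) using respondents only."""
    num = 0.0
    den = 0.0
    for xi, yi, di, p in zip(x, y, delta, pi):
        if di != 1:
            continue  # y is not observed for nonrespondents
        kx = gauss_kernel((x0 - xi) / hx) / p
        num += kx * gauss_kernel((y0 - yi) / hy)
        den += kx
    # (hx*hy)^{-1} * num / (hx^{-1} * den) simplifies to num / (hy * den);
    # den can underflow to zero if x0 is far from every respondent
    return num / (hy * den)
```

For fixed x0 the estimate is a mixture of normal densities in y0 and therefore integrates to one over y0.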

If the parameter of interest is the τ-th percentile of Y, given by $η = F_{Y}^{- 1} (τ)$ , the SFI estimator ${\hat{η}}_{τ, SFI}$ of η can be obtained by solving the estimating equation (11) with U(η; x, y) = I(y < η) − τ. Since E {I(Y < η)} = F_Y(η), it can be shown that ${\hat{η}}_{τ, SFI}$ has the asymptotic expansion in Theorem 2 with

\frac{\partial E {U (η_{0}; x, y)}}{\partial η} = f_{y} (η_{0}) = E {f_{y | x} (η_{0} | x)},

where f_y is the density function for y. A consistent estimator of ∂E {U(η₀; x, y)} /∂η can be written as

\frac{\partial \hat{E} {U (η_{0}; x, y)}}{\partial η} = \frac{1}{\hat{N}} \sum_{i \in A} π_{i}^{- 1} {\hat{f}}_{y | x} (\hat{η} | x_{i}),

and a consistent estimator of D* can be written as

{\hat{D}}^{*} = \sum_{i \in A} \frac{1 - δ_{i}}{π_{i}} \frac{\sum_{j \in A} δ_{j} π_{j}^{- 1} {\partial m (x_{j}; \hat{β}) / \partial β - \partial m (x_{i}; \hat{β}) / \partial β} {\hat{f}}_{y_{j} | x_{j}} {\hat{η} + m (x_{j}; \hat{β}) - m (x_{i}; \hat{β})}}{\hat{N} \sum_{j \in A_{r}} π_{j}^{- 1}},

with $\hat{η} = {\hat{η}}_{τ, SFI}$ .
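
For the quantile, the left side of (11) with U(η; x, y) = I(y < η) − τ is a step function of η, so an exact root may not exist. One common convention, assumed in the sketch below, is to take the smallest value at which the fractionally weighted distribution function reaches τ. The function names and argument layout are hypothetical; `eps` and `w` denote the respondents' residuals and their fractional weights.

```python
def weighted_quantile(values, masses, tau):
    """Smallest value whose cumulative mass reaches tau times the total mass."""
    pairs = sorted(zip(values, masses))
    total = sum(masses)
    cum = 0.0
    for v, m in pairs:
        cum += m
        if cum >= tau * total:
            return v
    return pairs[-1][0]


def sfi_quantile(y_obs, d_obs, y_hat_mis, d_mis, eps, w, tau):
    """Quantile from observed values plus fractionally imputed values.

    y_obs, d_obs     : observed y and design weights of the respondents
    y_hat_mis, d_mis : fitted means m(x_i; beta-hat) and design weights of the nonrespondents
    eps, w           : respondents' residuals and their fractional weights (summing to 1)
    """
    values = list(y_obs)
    masses = list(d_obs)
    for y_hat, d_i in zip(y_hat_mis, d_mis):
        for e_j, w_j in zip(eps, w):
            values.append(y_hat + e_j)  # imputed value y_i^{*(j)}
            masses.append(d_i * w_j)    # design weight times fractional weight
    return weighted_quantile(values, masses, tau)
```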

4.2 Stochastic imputation

For a multi-purpose survey, stochastic imputation is often preferred to deterministic imputation since it can preserve distributional relationships better. In stochastic imputation, imputed values are generated from a stochastic imputation mechanism, which introduces additional variability due to the imputation. For simplicity, we only consider the case where $U (η; x, y_{i}^{* (j)})$ is a smooth function of η and β. The results can be naturally extended to non-smooth statistics. The stochastic imputation estimator ${\hat{η}}_{SFI 2}$ can be obtained by solving the following estimating equation

{\hat{U}}_{η}^{*} (η | \hat{β}, \hat{λ}) = \frac{1}{N} \sum_{i \in A} \frac{1}{π_{i}} {δ_{i} U (η; x_{i}, y_{i}) + (1 - δ_{i}) \frac{1}{M} \sum_{s = 1}^{M} U (η; x_{i}, y_{i}^{* (s)})} = 0,

where $y_{i}^{* (s)}$ are randomly selected from ${{\hat{y}}_{i j} = {\hat{y}}_{i} + {\hat{ε}}_{j}; j \in A_{r}}$ with the selection probability, $P (y_{i}^{* (s)} = {\hat{y}}_{i j}) = w_{i j}^{*}$ where $w_{i j}^{*}$ are the fractional weights in (11). Since

p \lim_{M \to \infty} \frac{1}{M} \sum_{s = 1}^{M} U (η; x_{i}, y_{i}^{* (s)}) = E ({\hat{U}}_{η}^{*} (η | \hat{β}, \hat{λ}) | I, x, y, δ) = \sum_{j \in A} δ_{j} w_{i j}^{*} U (η; x_{i}, y_{i}^{* (j)}),

where the conditional expectation is with respect to the stochastic imputation mechanism, we have

V {{\hat{U}}_{η}^{*} (η | \hat{β}, \hat{λ})} = V {{\hat{U}}_{η} (η_{0}, \hat{β}, \hat{λ})} + V {{\hat{U}}_{η}^{*} (η_{0} | \hat{β}, \hat{λ}) - {\hat{U}}_{η} (η_{0}, \hat{β}, \hat{λ})} = V {{\hat{U}}_{η} (η_{0}, \hat{β}, \hat{λ})} + E [V {{\hat{U}}_{η}^{*} (η_{0} | \hat{β}, \hat{λ}) | I, x, y, δ}] .

Thus, using an argument similar to Theorem 1, we can obtain

V ({\hat{η}}_{SFI 2}) \approx {E (\frac{\partial U (η_{0}; x, y)}{\partial η})}^{- 1} V_{M} {E (\frac{\partial U (η_{0}; x, y)}{\partial η'})}^{- 1},

(15)

where $V_{M} = V {{\hat{U}}_{η}^{*} (η | \hat{β}, \hat{λ})}$ . Therefore, a consistent variance estimator can be written as

\hat{V} ({\hat{η}}_{SFI 2}) = {\hat{E} (\frac{\partial U (η_{0}; x, y)}{\partial η})}^{- 1} {\hat{V}}_{M} {\hat{E} (\frac{\partial U (η_{0}; x, y)}{\partial η'})}^{- 1},

where

{\hat{V}}_{M} = \hat{V} {{\hat{U}}_{η} (η_{0}, \hat{β}, \hat{λ})} + \hat{V} {{\hat{U}}_{η}^{*} (η_{0} | \hat{β}, \hat{λ}) - {\hat{U}}_{η} (η_{0}, \hat{β}, \hat{λ}) | I, x, y, δ},

(16)

and $\hat{E} (\partial U (η_{0}; x, y) / \partial η), \hat{V} {{\hat{U}}_{η} (η_{0}, \hat{β}, \hat{λ})}$ can be obtained similarly to (13) and

\hat{V} {{\hat{U}}_{η}^{*} (η_{0} | \hat{β}, \hat{λ}) - {\hat{U}}_{η} (η_{0}, \hat{β}, \hat{λ}) | I, x, y, δ} = \frac{1}{M {\hat{N}}^{2}} \sum_{i \in A} π_{i}^{- 2} (1 - δ_{i}) \sum_{j \in A} δ_{j} w_{i j}^{*} {U (\hat{η}; x_{i}, y_{i}^{* (j)}) - \sum_{k \in A} δ_{k} w_{i k}^{*} U (\hat{η}; x_{i}, y_{i}^{* (k)})}^{2} .

The second term of (16) estimates the additional variance due to stochastic imputation. If M is large, the second term is negligible.
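
The stochastic imputation step can be sketched as follows: for a nonrespondent with fitted value ŷ_i, the M imputed values are drawn from the respondents' residuals with the fractional weights as selection probabilities. This is an illustrative sketch using Python's standard `random` module; the function names are hypothetical, and `within_unit_variance` computes the per-unit variance under a single draw, which is the quantity summed in the second term of (16).

```python
import random


def draw_imputations(y_hat_i, eps, w, M, rng):
    """Draw M imputed values y_hat_i + eps_j with selection probabilities w_j."""
    picks = rng.choices(range(len(eps)), weights=w, k=M)
    return [y_hat_i + eps[j] for j in picks]


def within_unit_variance(u_values, w):
    """Variance of U under one draw: sum_j w_j * (U_ij - sum_k w_k * U_ik)^2."""
    u_bar = sum(wj * u for wj, u in zip(w, u_values))
    return sum(wj * (u - u_bar) ** 2 for wj, u in zip(w, u_values))
```

As M grows, the average of the drawn values of U approaches the fractionally weighted mean, which is why the additional variance term vanishes for large M.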

5 Replication variance estimation

Estimating the variance of the estimator ${\hat{η}}_{SFI}$ can be done through the linearization formulas presented in Section 3 for smooth statistics and the formulas in Section 4 for non-smooth statistics, respectively. However, it requires tedious algebra to compute all the terms. In this section, we consider an alternative approach using replication methods. Shao and Tu (1995) considered the theoretical aspects of replication methods such as Jackknife and Bootstrap. Wolter (2007) gives a comprehensive overview of replication variance estimation methods in survey sampling.

Suppose we are interested in estimating $T = \sum_{i = 1}^{N} y_{i}$ . Define the design weight as $d_{i} = π_{i}^{- 1}$ . The design unbiased estimator of T is $\hat{T} = \sum_{i \in A} d_{i} y_{i}$ and the consistent replication variance estimator of $\hat{T}$ is given by

{\hat{V}}_{R} (\hat{T}) = \sum_{k = 1}^{L} c_{k} {({\hat{T}}^{(k)} - \hat{T})}^{2},

where there are L replication weights, c_k is the replication factor associated with the k-th replication and ${\hat{T}}^{(k)} = \sum_{i \in A} d_{i}^{(k)} y_{i}$ with $d_{i}^{(k)}$ being the k-th replicate of d_i. For example, c_k = (L − 1)/L for the delete-one-group jackknife method. For details on the choice of c_k under different variance estimation approaches, see Wolter (2007).
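
A minimal sketch of the delete-one-group jackknife for the total T is given below. It assumes the common convention in which the k-th replicate sets the weights of group k to zero and inflates the remaining weights by L/(L − 1); the paper does not spell out the replicate weights, so this convention and the function name are assumptions.

```python
def jackknife_total_variance(y, d, groups):
    """Delete-one-group jackknife variance for the weighted total sum_i d_i * y_i.

    groups[i] is the group label (0, ..., L-1) of unit i.
    """
    L = max(groups) + 1
    t_hat = sum(di * yi for di, yi in zip(d, y))
    c_k = (L - 1) / L
    var = 0.0
    for k in range(L):
        # replicate weight: 0 inside group k, d_i * L / (L - 1) elsewhere
        t_k = sum(di * L / (L - 1) * yi
                  for di, yi, g in zip(d, y, groups) if g != k)
        var += c_k * (t_k - t_hat) ** 2
    return t_hat, var
```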

To obtain a replication variance estimator of the proposed SFI estimator, we apply the same SFI method to each of the replicates. In the first step, we obtain the k-th replicate ${\hat{β}}^{(k)}$ of $\hat{β}$ by solving

\sum_{i \in A} d_{i}^{(k)} δ_{i} {y_{i} - m (x_{i}; β)} h (x_{i}; β) = 0.

In the second step, the replicated EL weights are computed by maximizing

l^{(k)} (w) = \sum_{j \in A} δ_{j} d_{j}^{(k)} \log (w_{j})

subject to constraints

\sum_{j \in A} δ_{j} (d_{j}^{(k)} / d_{j}) w_{j} = 1 and \sum_{j \in A} δ_{j} (d_{j}^{(k)} / d_{j}) w_{j} {\hat{ε}}_{j}^{(k)} = 0,

with ${\hat{ε}}_{j}^{(k)} = y_{j} - m (x_{j}; {\hat{β}}^{(k)})$ . In the final step, the replicated SFI estimator is computed using the replicated EL weights. For smooth statistics, the k-th replicate of ${\hat{η}}_{SFI}$ , denoted by ${\hat{η}}_{SFI}^{(k)}$ , is obtained by the solution to the following estimating equation

\sum_{i \in A} d_{i}^{(k)} {δ_{i} U (η; x_{i}, y_{i}) + (1 - δ_{i}) \sum_{j \in A} δ_{j} w_{i j}^{* (k)} U (η; x_{i}, y_{i}^{* (j)} ({\hat{β}}^{(k)}))} = 0,

where $w_{i j}^{* (k)} = {\hat{w}}_{j}^{(k)}$ and $y_{i}^{* (j)} ({\hat{β}}^{(k)}) = m (x_{i}; {\hat{β}}^{(k)}) + y_{j} - m (x_{j}; {\hat{β}}^{(k)})$ . The final replication variance estimator of ${\hat{η}}_{SFI}$ is given by

{\hat{V}}_{R} ({\hat{η}}_{SFI}) = \sum_{k = 1}^{L} c_{k} {({\hat{η}}_{SFI}^{(k)} - {\hat{η}}_{SFI})}^{2} .

For non-smooth statistics, our estimator is similar to that of Wang and Opsomer (2011). Define

{\hat{u}}^{(k)} = {\hat{U}}_{η}^{(k)} (\hat{η}, \hat{β}, \hat{λ}) + [- \hat{E} {ε {\bar{U}}_{m} (ε)}] ({\hat{λ}}^{(k)} - \hat{λ}) + {\hat{D}}^{*} ({\hat{β}}^{(k)} - \hat{β}),

where Ê{εŪ_m(ε)} and ${\hat{D}}^{*}$ are defined in Section 4.1, ${\hat{U}}_{η}^{(k)} (\hat{η}, \hat{β}, \hat{λ})$ is defined in (11) with design weight replaced by replication weight $d_{i}^{(k)}$ and fractional weights replaced by replication fractional weights $w_{i j}^{* (k)}$ . Then the replication variance estimator can be written as:

{\hat{V}}_{R} ({\hat{η}}_{SFI}) = {\frac{\partial \hat{E} {U (\hat{η}; x, y)}}{\partial η}}^{- 1} \sum_{k = 1}^{L} c_{k} {{\hat{u}}^{(k)} - {\hat{U}}_{η} (\hat{η}, \hat{β}, \hat{λ})}^{2} {[{\frac{\partial \hat{E} {U (\hat{η}; x, y)}}{\partial η}}^{- 1}]}^{T},

with ∂Ê{U(η; x, y)} /∂η defined in Section 4.1.

6 Simulation studies

In this section, we conduct two limited simulation studies. The first uses artificially generated data and the second is based on real data treated as a finite population.

6.1 Simulation One

We repeatedly generate B = 2,000 finite populations of (x_i, y_i, δ_i) of size N = 10,000 from a super-population model

y_{i} = 0.5 x_{i} + ε_{i},

with x_i ~ Exp(1) and E(ε_i | x_i) = 0. Two error distributions are considered: (E1) ε_i ~ N(0, 1) and (E2) ε_i ~ {χ²(2) − 2} /2. Given (x, y), the response indicator δ has a Bernoulli distribution with Pr(δ = 1|x) = {1 + exp(1 − x)}⁻¹. The overall response rate is about 50%. Given each finite population (x, y, δ), we draw a sample by using a Poisson sampling design with the first-order inclusion probability $π_{i} = n z_{i} / \sum_{i = 1}^{N} z_{i}$ , where n = 200 and z_i = max{0.5y_i + 2, 1} + u_i, with u_i ~ χ²(1), where χ²(1) denotes the chi-squared distribution with one degree of freedom. In this simulation, we are interested in estimating three parameters:

$θ_{1} = N^{- 1} \sum_{i = 1}^{N} y_{i}$ , the population mean of y.
$θ_{2} = N^{- 1} \sum_{i = 1}^{N} I (y_{i} < 1)$ , the proportion of y less than 1.
θ₃ = F⁻¹(0.5), the population median of y.
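
The data-generating mechanism described above can be sketched with Python's standard `random` module. The sketch assumes that exp(1) denotes the exponential distribution with rate one, generates χ²(2) as a Gamma variate with shape 1 and scale 2, and generates χ²(1) as a squared standard normal. The function names are hypothetical.

```python
import math
import random


def generate_population(N, error, rng):
    """Generate (x_i, y_i, delta_i), i = 1..N, under error model 'E1' or 'E2'."""
    pop = []
    for _ in range(N):
        x = rng.expovariate(1.0)
        if error == "E1":
            e = rng.gauss(0.0, 1.0)
        else:
            # chi-squared(2) is Gamma(shape=1, scale=2); centered and scaled as in (E2)
            e = (rng.gammavariate(1.0, 2.0) - 2.0) / 2.0
        y = 0.5 * x + e
        p = 1.0 / (1.0 + math.exp(1.0 - x))  # response probability given x
        delta = 1 if rng.random() < p else 0
        pop.append((x, y, delta))
    return pop


def poisson_sample(pop, n, rng):
    """Poisson sampling with pi_i proportional to z_i = max(0.5*y_i + 2, 1) + u_i."""
    z = [max(0.5 * y + 2.0, 1.0) + rng.gauss(0.0, 1.0) ** 2 for _, y, _ in pop]
    z_sum = sum(z)
    sample = []
    for (x, y, delta), zi in zip(pop, z):
        pi = min(1.0, n * zi / z_sum)
        if rng.random() < pi:  # independent selection of each unit
            sample.append((x, y, delta, pi))
    return sample
```

Under Poisson sampling the realized sample size is random with expectation n, so individual samples will not contain exactly 200 units.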

From each sample, we compute the following six estimators:

The complete-case (CC) estimator, based only on the complete cases. The CC estimator is the solution to $\sum_{i \in A} δ_{i} π_{i}^{- 1} U (η; x_{i}, y_{i}) = 0$ , where U(η; x, y) is the corresponding estimating function for each parameter.
The full sample estimator (FULL), based on the original sample without missing data and the pseudo empirical likelihood method. Specifically, we maximize $l = \sum_{i \in A} π_{i}^{- 1} \log (ω_{i})$ , subject to the following constraints
$\sum_{i \in A} ω_{i} = 1, \sum_{i \in A} ω_{i} {\hat{ε}}_{i} = 0,$
where ${\hat{ε}}_{i} = y_{i} - {\hat{β}}_{0} - {\hat{β}}_{1} x_{i}$ and $({\hat{β}}_{0}, {\hat{β}}_{1})$ is obtained by solving the following estimating equation:
$\sum_{i \in A} π_{i}^{- 1} (y_{i} - β_{0} - β_{1} x_{i}) {(1, x_{i})}^{T} = 0.$

The full sample estimator serves as a benchmark for comparison.
The parametric fractional imputation (PFI) estimator of Kim (2011) assuming y_i | x_i ~ N(β₀ + β₁x_i, σ²) with imputation size M = 100.
The nonparametric fractional imputation (NFI) estimator that uses the following nonparametric fractional weights:
$ω_{i j}^{*} = \frac{K_{x} {h_{x}^{- 1} (x_{i} - x_{j})}}{\sum_{k \in A} δ_{k} K_{x} {h_{x}^{- 1} (x_{i} - x_{k})}}$
for each unit i ∈ A with δ_i = 0 and j ∈ A with δ_j = 1. We use the reference bandwidth $h_{x} = 1.06 {\hat{N}}^{- 1 / 5} {\hat{σ}}_{x}$ with ${\hat{σ}}_{x} = {{(\hat{N} - 1)}^{- 1} \sum_{i \in A} π_{i}^{- 1} {(x_{i} - {\hat{μ}}_{x})}^{2}}^{1 / 2}, {\hat{μ}}_{x} = {\hat{N}}^{- 1} \sum_{i \in A} π_{i}^{- 1} x_{i}$ and $\hat{N} = \sum_{i \in A} π_{i}^{- 1}$ . The Gaussian kernel K_x(t) = (2π)^{−1/2} exp(−t²/2) is used.
The stochastic regression imputation (SRI) estimator assuming the following model: y_i = β₀ + β₁x_i + ε_i with E(ε_i) = 0 and V(ε_i) = σ².
The proposed semiparametric fractional imputation (SFI) estimator ${\hat{θ}}_{SFI}$ .

From the Monte Carlo sample of size B = 2,000, the Monte Carlo bias, standard error and root mean squared error are computed for each point estimator. The results are presented in Table 1. Under both (E1) and (E2), the CC estimators are markedly biased since the response mechanism is not missing completely at random (MCAR). Unless the response mechanism is MCAR, the CC estimator is biased. The FULL estimators always perform best since they have no missing values and use the moment condition in (1). Under distribution (E1), the SFI and PFI estimators have similar performances, and the NFI and SRI estimators have the largest RMSE among the four imputation estimators for all three parameters since they use less information.
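
The three Monte Carlo summaries can be computed as in the following sketch, which assumes a fixed target value `truth` across replicates and uses the B − 1 divisor for the standard error; the paper does not state these conventions, so both are assumptions, as is the function name `mc_summary`.

```python
import math


def mc_summary(estimates, truth):
    """Monte Carlo bias, standard error, and root mean squared error."""
    B = len(estimates)
    mean = sum(estimates) / B
    bias = mean - truth
    se = math.sqrt(sum((e - mean) ** 2 for e in estimates) / (B - 1))
    rmse = math.sqrt(sum((e - truth) ** 2 for e in estimates) / B)
    return bias, se, rmse
```

When the finite population is regenerated in every replicate, the target itself varies, and one would instead summarize the differences between each estimate and its own population value.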

Table 1.

The Monte Carlo Bias (×10⁻²), Standard Error (SE) (×10⁻²) and Root Mean Squared Error (RMSE) (×10⁻²) for six estimators with two error distributions in Simulation One.

Par	Method	(E1) Bias	(E1) SE	(E1) RMSE	(E2) Bias	(E2) SE	(E2) RMSE
E(y)	CC	18.6	13.0	22.7	20.0	19.8	28.1
	FULL	0.1	5.9	5.9	0.3	9.0	9.0
	PFI	−0.1	7.8	7.8	0.7	12.0	12.0
	NFI	0.7	12.9	12.9	2.9	21.3	21.5
	SRI	−0.2	13.9	13.9	1.7	23.3	23.3
	SFI	−0.2	6.9	6.9	0.3	12.4	12.4

Pr(y < 1)	CC	−6.3	5.3	8.2	−3.5	4.8	5.9
	FULL	0.0	3.0	3.0	0.0	2.6	2.6
	PFI	0.2	3.1	3.1	−5.4	3.1	6.2
	NFI	−0.1	5.1	5.1	−0.5	4.9	4.9
	SRI	0.2	5.0	5.0	−0.4	5.0	5.0
	SFI	0.1	3.2	3.2	0.1	3.3	3.3

Quantile	CC	15.0	13.9	20.4	21.9	22.8	31.6
	FULL	0.1	7.8	7.8	0.2	13.2	13.2
	PFI	−0.3	8.5	8.5	30.0	14.5	33.3
	NFI	−0.8	14.9	14.9	2.7	25.3	25.4
	SRI	−1.0	15.7	15.7	2.4	25.4	25.5
	SFI	−0.5	9.1	9.1	0.2	17.0	17.0


Under model (E2), the SFI estimator shows negligible bias for all parameters, whereas the PFI estimator has non-negligible bias for the proportion and the quantile, which is due to the misspecification of the error distribution. The NFI and SRI estimators are not as efficient as the SFI estimator in terms of bias and variance. The SFI estimator has smaller RMSE than the NFI and SRI estimators for all three parameters, and smaller RMSE than the PFI estimator for the proportion and the quantile. The overall results indicate the robustness of SFI. For variance estimation, we computed the relative bias of the Taylor linearization and replication variance estimators. All relative biases are below 7%. In addition, we calculated the Monte Carlo coverage rates of the 95% confidence intervals. Under model (E1), the coverage rates for the mean, proportion and quantile are 94.8%, 93.4% and 95.0% using the Taylor method and 94.9%, 93.6% and 95.1% using the replication method. The results under model (E2) are similar, and the coverage rates are close to the nominal rate.
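The relative bias of a variance estimator and the Monte Carlo coverage rate can be computed as sketched below. The Wald-type interval (point estimate ± 1.96 standard errors), the function names and the simulated data (sample means of n = 200 standard normal draws) are assumptions for illustration; they are not the estimators of this paper.

```python
import numpy as np

def relative_bias_of_variance(var_estimates, point_estimates):
    """(mean of variance estimates - Monte Carlo variance) / Monte Carlo variance."""
    mc_var = np.var(point_estimates, ddof=1)
    return (np.mean(var_estimates) - mc_var) / mc_var

def coverage_rate(point_estimates, var_estimates, truth, z=1.96):
    """Share of replicates whose Wald interval contains the true value."""
    half_width = z * np.sqrt(var_estimates)
    return np.mean(np.abs(point_estimates - truth) <= half_width)

# Hypothetical Monte Carlo study: B = 2000 samples of size n = 200.
rng = np.random.default_rng(2)
data = rng.normal(0.0, 1.0, size=(2000, 200))
point = data.mean(axis=1)                 # point estimates of the mean
var_hat = data.var(axis=1, ddof=1) / 200  # variance estimates of the mean
rel_bias = relative_bias_of_variance(var_hat, point)
cover = coverage_rate(point, var_hat, truth=0.0)
print(rel_bias, cover)
```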

6.2 Simulation Two

In the second simulation study, we use the 2013–2014 U.S. National Health and Nutrition Examination Survey (NHANES) data as a pseudo finite population. The study variable is systolic blood pressure (BPXSY1) and the covariate is body mass index (BMXBMI). Keeping only the cases where both BPXSY1 and BMXBMI are greater than zero, the pseudo finite population contains 7104 cases. The scatter plot of BPXSY1 versus BMXBMI is presented in Figure 1. We assume BPXSY1 is roughly linear in BMXBMI. After regressing BPXSY1 on BMXBMI, the QQ plot of the residuals and the plot of residuals versus fitted values are presented in Figure 2. The residual plots suggest deviation from normality. The p-value from the Anderson-Darling test for normality is less than 2.2 × 10⁻¹⁶. We first generate response indicators δ_i, i = 1, 2, …, 7104, from the following logistic regression model:

\Pr (δ_{i} = 1 | \mathrm{BMXBMI}_{i}) = \frac{\exp \{1 - 0.1 \log (\mathrm{BMXBMI}_{i})\}}{1 + \exp \{1 - 0.1 \log (\mathrm{BMXBMI}_{i})\}} .

Figure 2. QQ plot (left panel) and residuals versus fitted values plot (right panel).

The response rate is around 60%. Then, given (BPXSY1_i, BMXBMI_i, δ_i), B = 2000 Monte Carlo samples of size n = 200 are drawn by simple random sampling. The parameters of interest are:

(Mean). Finite population mean of BPXSY1, which is θ_m = 118.056.
(Prop1). Finite population proportion one of BPXSY1:
$θ_{p1} = \frac{1}{N} \sum_{i = 1}^{N} I (\mathrm{BPXSY1}_{i} < 80) = 0.0008.$
(Prop2). Finite population proportion two of BPXSY1:
$θ_{p2} = \frac{1}{N} \sum_{i = 1}^{N} I (\mathrm{BPXSY1}_{i} < 120) = 0.6017.$
(Prop3). Finite population proportion three of BPXSY1:
$θ_{p3} = \frac{1}{N} \sum_{i = 1}^{N} I (\mathrm{BPXSY1}_{i} < 160) = 0.9711.$
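The setup of Simulation Two (response indicators from the logistic model, a finite population proportion, and one simple random sample) can be sketched as follows. The NHANES file is not loaded here; the arrays `bmi` and `sbp` are synthetic stand-ins generated purely for illustration, so the printed values will not reproduce the figures reported in the paper.

```python
import numpy as np

rng = np.random.default_rng(3)
N = 7104                                   # pseudo finite population size

# Synthetic stand-ins for BMXBMI and BPXSY1 (assumptions, not NHANES data).
bmi = rng.uniform(15.0, 45.0, size=N)
sbp = 100.0 + 0.7 * bmi + rng.normal(0.0, 15.0, size=N)

# Response probability: exp{1 - 0.1 log(BMI)} / [1 + exp{1 - 0.1 log(BMI)}].
lin = 1.0 - 0.1 * np.log(bmi)
prob = 1.0 / (1.0 + np.exp(-lin))
delta = rng.binomial(1, prob)              # response indicators, fixed once

# Finite population parameters, e.g. the analogue of Prop2.
theta_m = sbp.mean()
theta_p2 = np.mean(sbp < 120.0)

# One Monte Carlo sample: simple random sampling without replacement, n = 200.
n = 200
sample_idx = rng.choice(N, size=n, replace=False)
y_s, x_s, d_s = sbp[sample_idx], bmi[sample_idx], delta[sample_idx]
print(delta.mean(), theta_m, theta_p2, d_s.sum())
```

In a full study the last step would be repeated B = 2000 times with the response indicators held fixed, applying each imputation estimator to `(y_s, x_s, d_s)`.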

We consider the same PFI, NFI, SRI and SFI estimators as discussed in Simulation One. The Monte Carlo bias, standard error and root mean squared error (RMSE) are presented in Table 2. For the population mean, PFI and SFI perform similarly, and the NFI estimator has slightly larger bias and standard error. SRI has bias comparable to that of PFI and SFI, but it has a larger SE, as expected. For the population proportions, the PFI estimator has substantially larger bias than NFI, SRI and SFI, which may be due to the misspecification of the error distribution. The NFI and SRI estimators have larger standard errors than the PFI and SFI estimators, since nonparametric methods are not as efficient as parametric or semiparametric methods and stochastic imputation produces larger variance. Overall, the SFI estimator performs the best in terms of both bias and variance.

Table 2.

The Monte Carlo Bias (×10⁻²), Standard Error (SE) (×10⁻²) and Root Mean Squared Error (RMSE) (×10⁻²) for four different methods and four parameters.

Par	Method	Bias	SE	RMSE
Mean	COM	−2.9	124.8	124.9
	PFI	−2.3	153.2	153.2
	NFI	−5.0	153.5	153.6
	SRI	1.4	169.7	169.7
	SFI	−2.2	153.3	153.3

Prop1	COM	0.0	0.2	0.2
	PFI	0.5	0.3	0.6
	NFI	0.0	0.3	0.3
	SRI	0.1	0.3	0.3
	SFI	0.0	0.2	0.2

Prop2	COM	0.0	3.4	3.4
	PFI	−2.2	3.8	4.4
	NFI	−0.5	4.2	4.3
	SRI	0.5	4.2	4.3
	SFI	0.2	3.9	3.9

Prop3	COM	0.0	1.2	1.2
	PFI	0.7	1.1	1.3
	NFI	0.2	1.4	1.4
	SRI	−0.3	1.6	1.6
	SFI	0.1	1.4	1.4


7 Conclusions

Regression imputation is often used to handle item nonresponse in survey sampling. Unlike the usual regression imputation, the proposed semiparametric fractional imputation offers valid inference for a wide set of parameters such as population proportions and quantiles. Moreover, only the first moment assumption is needed to obtain a consistent SFI estimator of the parameter, which leads to robust parameter estimation. The proposed SFI method performs well in the limited simulation studies.

The proposed method suggests several directions for future research. First, instead of assuming an ignorable response mechanism, we can consider an extension to nonignorable nonresponse (Kim and Yu, 2011) using an exponential tilting response model. Second, extending SFI to handle multivariate missing data will be an important future research topic.

Appendix

A: Proof of Theorem 1

We first assume the following regularity conditions:

(C1)
The finite population is a random sample from the semiparametric regression model in (1). The regression function m(x; β) in (1) has a continuous first derivative ∂m(x; β)/∂β in the neighborhood of the true value β₀ and E {m²(x; β)} and E {∂m(x; β)/∂β} are bounded in this neighborhood.
(C2)
Function h(x; β) in the estimating function Û_β(β) in (8) has continuous first derivative ∂h(x; β)/∂β in the neighborhood of the true value β₀ and ‖h(x; β)‖² and ‖∂h(x; β)/∂β‖ are bounded by some integrable function G₁(x) in the neighborhood.
(C3)
The model error term in (1) satisfies E(ε²) < ∞ and max{‖ε_i‖ : i ∈ A} = o_p(n^{1/2}).
(C4)
Let U_β(β) = E[δ{y − m(x; β)}h(x; β)]. Assume that Û_β(β) converges to U_β(β) in probability uniformly in the neighborhood of the true value β₀ and that, for every a > 0, $\inf_{β : ‖ β - β_{0} ‖ \geq a} ‖ U_{β} (β) ‖ > 0 = ‖ U_{β} (β_{0}) ‖$.
(C5)
∂Û_β(β)/∂β converges to a continuous nonsingular derivative ∂U_β(β)/∂β in probability uniformly in the neighborhood of the true value β₀.
(C6)
$\sqrt{n} {\hat{U}}_{β} (β_{0}) \overset{ℒ}{\to} N (0, Σ_{β})$, as n, N → ∞, where $Σ_{β} = V \{\sqrt{n} {\hat{U}}_{β} (β_{0})\}$ denotes the design-model variance, that is, the variance under the joint distribution of the superpopulation model and the sampling mechanism.
(C7)
Function U(η; x, y) has continuous partial derivatives ∂U(η; x, y)/∂η and ∂U(η; x, y)/∂y in the neighborhood of the true value η₀ and ‖U(η; x, y)‖², ‖∂U(η; x, y)/∂η‖ and ‖∂U(η; x, y)/∂y‖ are bounded by some integrable function G₂(x, y) in the neighborhood.
(C8)
Let ${\hat{U}}_{n} (η) = N^{- 1} \sum_{i \in A} π_{i}^{- 1} U (η; x_{i}, y_{i})$ and U(η) = E{U(η; x_i, y_i)}. Then Û_n(η) converges to U(η) in probability uniformly in the neighborhood of the true value η₀. For every a > 0, $\inf_{η : ‖ η - η_{0} ‖ \geq a} ‖ U (η) ‖ > 0 = ‖ U (η_{0}) ‖$.
(C9)
∂Û_n(η)/∂η converges to a continuous nonsingular derivative ∂U(η)/∂η in probability uniformly in the neighborhood of the true value η₀.
(C10)
$\sqrt{n} {\hat{U}}_{n} (η_{0}) \overset{ℒ}{\to} N (0, Σ_{η})$, as n, N → ∞, where $Σ_{η} = V \{\sqrt{n} {\hat{U}}_{n} (η_{0})\}$ denotes the design-model variance.
(C11)
The first order inclusion probabilities satisfy K_L ≤ Nn⁻¹π_i ≤ K_U for all i, where K_L and K_U are positive constants.
(C12)
$\max_{i, j} | π_{i j} π_{i}^{- 1} π_{j}^{- 1} - 1 | = o (1)$ for any i, j = 1, 2, …, N and i ≠ j, where π_ij is the second order inclusion probability of units i and j in the population.
(C13)
The response probability satisfies (2) and a < Pr(δ_i = 1|x_i) ≤ 1 for i = 1, 2, …, N for some fixed a > 0.

Conditions (C1)–(C2) are the model assumptions about the finite population. Condition (C3) is used to control the asymptotic order of $\hat{λ}$ in (10). Chen and Sitter (1999, Appendix 2) argued that (C3) holds for common unequal probability sampling designs. Conditions (C4) and (C8) ensure the consistency of $\hat{β}$ and $\hat{η}$, respectively. Conditions (C5), (C6), (C9) and (C10) are the regularity conditions that ensure asymptotic normality of $\hat{β}$ and $\hat{η}$; Van der Vaart (1998, Ch. 5) used similar regularity conditions. Specifically, conditions (C6) and (C10) have been widely used in the literature, for example by Wu and Rao (2006) and Wang and Opsomer (2011). Hájek (1960, 1964) established asymptotic normality under simple random sampling and rejective sampling with unequal selection probabilities. Víšek (1979) established asymptotic normality of the Horvitz-Thompson estimator under Rao-Sampford sampling designs. Condition (C7) controls the smoothness and asymptotic behavior of the estimating function U(η; x, y). Conditions (C11) and (C12) are the standard assumptions for the sampling designs. Similar conditions have been used in Isaki and Fuller (1982) and Wang and Opsomer (2011). Condition (C13) controls the behavior of the individual response probability.

By assumption (C3), and using techniques similar to those of Wu and Rao (2006), we can show that $\hat{λ} = O_{p} (n^{- 1 / 2})$. Assumption (C4) and Taylor linearization give

0 = {\hat{U}}_{β} (\hat{β}) = {\hat{U}}_{β} (β_{0}) + \frac{\partial {\hat{U}}_{β} (β_{0})}{\partial β} (\hat{β} - β_{0}) + o_{p} (n^{- 1 / 2}) .

Therefore,

\hat{β} - β_{0} = - {[E \{\frac{\partial {\hat{U}}_{β} (β_{0})}{\partial β}\}]}^{- 1} {\hat{U}}_{β} (β_{0}) + o_{p} (n^{- 1 / 2}) = {[E \{δ h (x; β_{0}) \frac{\partial m (x; β_{0})}{\partial β}\}]}^{- 1} \frac{1}{N} \sum_{i \in A} π_{i}^{- 1} δ_{i} ε_{i} h (x_{i}; β_{0}) + o_{p} (n^{- 1 / 2}) .

(A.1)

We know that $\hat{λ}$ is the solution of the following estimating equation

{\hat{U}}_{λ} (λ, \hat{β}) = \frac{1}{N} \sum_{i \in A} δ_{i} π_{i}^{- 1} \frac{{\hat{ε}}_{i}}{1 + λ {\hat{ε}}_{i}} = 0.
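Numerically, the estimating equation for $\hat{λ}$ above is a one-dimensional root-finding problem. The following sketch solves it by Newton's method with step halving and then forms normalized weights proportional to δ_j π_j⁻¹/(1 + λ̂ ε̂_j). The function names, the step-halving safeguard, the normalization and the simulated residuals are assumptions for illustration, not the authors' implementation.

```python
import numpy as np

def solve_lambda(eps, w, tol=1e-12, max_iter=100):
    """Solve sum_i w_i * eps_i / (1 + lam * eps_i) = 0 for lam (Newton's method)."""
    w = w / w.sum()                       # rescaling does not change the root
    lam = 0.0
    for _ in range(max_iter):
        denom = 1.0 + lam * eps
        g = np.sum(w * eps / denom)       # estimating function at lam
        if abs(g) < tol:
            break
        dg = -np.sum(w * eps ** 2 / denom ** 2)   # derivative, always negative
        step = -g / dg
        # Halve the step until every 1 + lam * eps_i stays positive.
        while np.any(1.0 + (lam + step) * eps <= 0.0):
            step *= 0.5
        lam += step
    return lam

def el_fractional_weights(eps, w):
    """Weights proportional to w_j / (1 + lam_hat * eps_j), summing to one."""
    lam = solve_lambda(eps, w)
    p = w / (1.0 + lam * eps)
    return lam, p / p.sum()

# Hypothetical respondent residuals with a nonzero mean, equal design weights.
rng = np.random.default_rng(4)
eps = rng.normal(0.2, 1.0, size=150)
w = np.full(150, 50.0)
lam, p = el_fractional_weights(eps, w)
print(lam, np.sum(p * eps))   # the weighted residual mean is numerically zero
```

The resulting weights are positive, sum to one, and satisfy the first moment condition on the residuals, which is the only model restriction the method imposes.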

In addition, we have

\frac{\partial {\hat{U}}_{λ} (0, β_{0})}{\partial λ} = - \frac{1}{N} \sum_{i \in A} δ_{i} π_{i}^{- 1} ε_{i}^{2} = - E (δ) σ^{2} + o_{p} (1),

(A.2)

and

\frac{\partial {\hat{U}}_{λ} (0, β_{0})}{\partial β} = - \frac{1}{N} \sum_{i \in A} \frac{δ_{i}}{π_{i}} \frac{\partial m (x_{i}; β_{0})}{\partial β} = - E \{\frac{\partial m (x; β_{0})}{\partial β} | δ = 1\} E (δ) + o_{p} (1) .

(A.3)

Based on (A.2), (A.3), by using Taylor linearization, we have

0 = {\hat{U}}_{λ} (\hat{λ}, \hat{β}) = {\hat{U}}_{λ} (0, β_{0}) + \frac{\partial {\hat{U}}_{λ} (0, β_{0})}{\partial λ} \hat{λ} + \frac{\partial {\hat{U}}_{λ} (0, β_{0})}{\partial β} (\hat{β} - β_{0}) + o_{p} (n^{- 1 / 2}) .

(A.4)

According to (A.1)–(A.4) and after some algebra, it can be shown that

\hat{λ} = \frac{1}{σ^{2}} \frac{1}{N} \sum_{i \in A} π_{i}^{- 1} \frac{δ_{i}}{E (δ)} ε_{i} - \frac{1}{σ^{2}} E \{\frac{\partial m (x; β_{0})}{\partial β} | δ = 1\} \times {[E \{h (x; β_{0}) \frac{\partial m (x; β_{0})}{\partial β} | δ = 1\}]}^{- 1} \frac{1}{N} \sum_{i \in A} π_{i}^{- 1} \frac{δ_{i}}{E (δ)} ε_{i} h (x_{i}; β_{0}) + o_{p} (n^{- 1 / 2}),

(A.5)

where σ² is the variance for the residuals. With condition (C6), it can be shown that $\hat{η} = η_{0} + o_{p} (1)$ . In addition, we have

\frac{\partial {\hat{U}}_{η} (η_{0}, β_{0}, 0)}{\partial λ} = - \frac{1}{N} \sum_{i \in A} π_{i}^{- 1} (1 - δ_{i}) \sum_{j \in A} \frac{δ_{j} π_{j}^{- 1}}{\sum_{k \in A} δ_{k} π_{k}^{- 1}} ε_{j} U (η_{0}; x_{i}, y_{i}^{* (j)} (β_{0})) + o_{p} (1) = - \{E (δ)\}^{- 1} E \{(1 - δ_{i}) δ_{j} ε_{j} U (η_{0}; x_{i}, y_{i}^{* (j)} (β_{0}))\} + o_{p} (1) = - E \{ε {\bar{U}}_{m} (ε)\} + o_{p} (1),

(A.6)

\frac{\partial {\hat{U}}_{η} (η_{0}, β_{0}, 0)}{\partial β} = \frac{1}{N} \sum_{i \in A} \frac{(1 - δ_{i})}{π_{i}} \sum_{j \in A} \frac{δ_{j} π_{j}^{- 1}}{\sum_{k \in A} δ_{k} π_{k}^{- 1}} \frac{\partial U (η_{0}; x_{i}, y_{i})}{\partial y} \{\frac{\partial m (x_{i}; β_{0})}{\partial β} - \frac{\partial m (x_{j}; β_{0})}{\partial β}\} = \{E (δ)\}^{- 1} E [(1 - δ_{i}) δ_{j} \frac{\partial U (η_{0}; x_{i}, y_{i})}{\partial y} \{\frac{\partial m (x_{i}; β_{0})}{\partial β} - \frac{\partial m (x_{j}; β_{0})}{\partial β}\}] + o_{p} (1) = D + o_{p} (1),

(A.7)

and

\frac{\partial {\hat{U}}_{η} (η_{0}, β_{0}, 0)}{\partial η} = \frac{1}{N} \sum_{i \in A} \frac{δ_{i}}{π_{i}} \frac{\partial U (η_{0}; x_{i}, y_{i})}{\partial η} + \frac{1}{N} \sum_{i \in A} \frac{(1 - δ_{i})}{π_{i}} \sum_{j \in A} \frac{δ_{j} π_{j}^{- 1}}{\sum_{k \in A} δ_{k} π_{k}^{- 1}} \frac{\partial U (η_{0}; x_{i}, y_{i}^{* (j)})}{\partial η} = E \{δ \frac{\partial U (η_{0}; x, y)}{\partial η}\} + E \{(1 - δ) \frac{\partial U (η_{0}; x, y)}{\partial η}\} + o_{p} (1) = E \{\frac{\partial U (η_{0}; x, y)}{\partial η}\} + o_{p} (1),

(A.8)

where Ū_m(ε) = E{(1 − δ) U(η₀; x, y)|ε} and

D = E ((1 - δ) U (η_{0}; x, y) [\frac{\partial m (x; β_{0})}{\partial β} - E \{\frac{\partial m (x; β_{0})}{\partial β} | δ = 1\}] l (ε)),

with l(ε) = −f′(ε)/f(ε). Define

S = \frac{1}{N (N - 1) E (δ)} \sum_{i \in A} ω_{i} (1 - δ_{i}) \sum_{j \in A, j \neq i} ω_{j} δ_{j} U (η_{0}; x_{i}, y_{i}^{* (j)} (β_{0})),

then by using Taylor linearization,

{\hat{U}}_{η} (η_{0}, β_{0}, 0) = \frac{1}{N} \sum_{i \in A} π_{i}^{- 1} δ_{i} U (η_{0}; x_{i}, y_{i}) + E (S) + S - E (S) - \frac{E (S)}{E (δ)} \{{\bar{δ}}_{N} - E (δ)\} + o_{p} (n^{- 1 / 2}) = \frac{1}{N} \sum_{i \in A} π_{i}^{- 1} δ_{i} U (η_{0}; x_{i}, y_{i}) + S - \frac{E (S)}{E (δ)} \{{\bar{δ}}_{N} - E (δ)\} + o_{p} (n^{- 1 / 2}),

with E(S) = E{(1 − δ)U(η₀; x, y)} and ${\bar{δ}}_{N} = N^{- 1} \sum_{i \in A} δ_{i} π_{i}^{- 1}$. According to the Hoeffding decomposition,

S = \frac{1}{N (N - 1) E (δ)} \sum_{i \in A} π_{i}^{- 1} (1 - δ_{i}) \sum_{j \in A, j \neq i} π_{j}^{- 1} δ_{j} U \{η_{0}; x_{i}, y_{i}^{* (j)} (β_{0})\} = \frac{1}{N} \sum_{i \in A} [π_{i}^{- 1} (1 - δ_{i}) E \{U (η_{0}; x_{i}, y_{i}) | x_{i}\} + π_{i}^{- 1} \frac{δ_{i}}{E (δ)} E \{(1 - δ_{i}) U (η_{0}; x_{i}, y_{i}) | ε_{i}\}] - E (S) + o_{p} (n^{- 1 / 2}) .

Therefore,

{\hat{U}}_{η} (η_{0}, β_{0}, 0) = \frac{1}{N} \sum_{i \in A} π_{i}^{- 1} δ_{i} U (η_{0}; x_{i}, y_{i}) + \frac{1}{N} \sum_{i \in A} [π_{i}^{- 1} (1 - δ_{i}) E \{U (η_{0}; x_{i}, y_{i}) | x_{i}\} + π_{i}^{- 1} \frac{δ_{i}}{E (δ)} E \{(1 - δ_{i}) U (η_{0}; x_{i}, y_{i}) | ε_{i}\}] - \frac{E (S)}{E (δ)} {\bar{δ}}_{N} + o_{p} (n^{- 1 / 2}) .

(A.9)

According to Taylor linearization, we have

0 = {\hat{U}}_{η} (\hat{η}, \hat{β}, \hat{λ}) = {\hat{U}}_{η} (η_{0}, β_{0}, 0) + \frac{\partial {\hat{U}}_{η} (η_{0}, β_{0}, 0)}{\partial η} (\hat{η} - η_{0}) + \frac{\partial {\hat{U}}_{η} (η_{0}, β_{0}, 0)}{\partial β} (\hat{β} - β_{0}) + \frac{\partial {\hat{U}}_{η} (η_{0}, β_{0}, 0)}{\partial λ} \hat{λ} + o_{p} (n^{- 1 / 2}) .

(A.10)

By (A.1),(A.5)–(A.10), after some algebra, we can show that

\hat{η} - η_{0} = - {E (\frac{\partial U (η_{0}; x, y)}{\partial η})}^{- 1} (\frac{1}{N} \sum_{i \in A} π_{i}^{- 1} ζ_{i}) + o_{p} (n^{- 1 / 2}),

where ζ_i is defined in (12) of Theorem 1.

B: Proof of Theorem 2

We replace regularity conditions (C7)–(C10) in Appendix A with the following regularity conditions (C14)–(C17):

(C14)
Ũ_n(θ) converges to Ũ(θ) in probability uniformly in the neighborhood of the true value θ₀. For every a > 0, $\inf_{θ : ‖ θ - θ_{0} ‖ \geq a} ‖ \tilde{U} (θ) ‖ > 0 = ‖ \tilde{U} (θ_{0}) ‖$.
(C15)
There exists a measurable function L(δ, x, y) with E{L²(δ, x, y)} < ∞ such that, for every θ₁ and θ₂ in the neighborhood of the true value θ₀, ‖Ũ(θ₁; δ, x, y) − Ũ(θ₂; δ, x, y)‖ ≤ L(δ, x, y)‖θ₁ − θ₂‖.
(C16)
Assume that $E \{{\tilde{U}}^{2} (θ_{0}; δ, x, y)\} < \infty$ and $E \{\tilde{U} (θ; δ, x, y)\}$ has continuous and invertible first derivatives with respect to θ and the corresponding first derivatives are bounded by some integrable function in the neighborhood of the true value θ₀.
(C17)
$\sqrt{n} {\tilde{U}}_{n} (θ_{0}) \overset{ℒ}{\to} N (0, Σ_{θ})$, as n, N → ∞, where $Σ_{θ} = V \{\sqrt{n} {\tilde{U}}_{n} (θ_{0})\}$ denotes the design-model variance.

Like conditions (C4) and (C8), condition (C14) ensures the consistency of the proposed estimator. Conditions (C15) and (C16) are required to derive the asymptotic expansion of the proposed estimator; see Van der Vaart (1998, Ch. 5) for more details on these conditions. Like conditions (C6) and (C10), condition (C17) is used to derive the central limit theorem.

The proof of the consistency of $\hat{β}$ and $\hat{η}$ is similar to the corresponding part of the proof of Theorem 1. According to the regularity conditions (C10), (C11), (C12), and using techniques similar to those in Theorem 19.26 of Van der Vaart (1998), we can show that

0 = {\hat{U}}_{η} (\hat{η}, \hat{β}, \hat{λ}) = {\hat{U}}_{η} (η_{0}, β_{0}, 0) + \frac{\partial {\hat{U}}_{η} (η_{0}, β_{0}, 0)}{\partial λ} \hat{λ} + \frac{\partial E \{{\hat{U}}_{η} (η_{0}, β_{0}, 0)\}}{\partial β} (\hat{β} - β_{0}) + \frac{\partial E \{{\hat{U}}_{η} (η_{0}, β_{0}, 0)\}}{\partial η} (\hat{η} - η_{0}) + o_{p} (n^{- 1 / 2}) .

(B.1)

In addition, we have

\frac{\partial E \{{\hat{U}}_{η} (η_{0}, β_{0}, 0)\}}{\partial η} = \frac{\partial E \{δ U (η_{0}; x, y)\}}{\partial η} + \frac{1}{E (δ)} \frac{\partial E \{(1 - δ_{i}) δ_{j} U_{η} (η_{0}; x_{i}, y_{i})\}}{\partial η} + o_{p} (1) = \frac{\partial E \{U (η_{0}; x, y)\}}{\partial η} + o_{p} (1),

(B.2)

and

\frac{\partial E \{{\hat{U}}_{η} (η_{0}, β_{0}, 0)\}}{\partial β} = D^{*} + o_{p} (1),

(B.3)

where D* is defined in Theorem 2. According to (A.1), (A.5), (A.6), (A.9), (B.1)–(B.3), we have

{\hat{η}}_{SFI} - η_{0} = - {[\frac{\partial E {U (η_{0}; x, y)}}{\partial η}]}^{- 1} (\frac{1}{N} \sum_{i \in A} π_{i}^{- 1} ζ_{i}) + o_{p} (n^{- 1 / 2}),

where ζ_i is defined in Theorem 2.

References

Chen J, Sitter R. A pseudo empirical likelihood approach to the effective use of auxiliary information in complex surveys. Statistica Sinica. 1999;9:385–406.
Chauvet G, Deville JC, Haziza D. On balanced random imputation in surveys. Biometrika. 2011;98:459–471.
Durrant GB. Imputation methods for handling item-nonresponse in the social sciences: a methodological review. ESRC National Center for Research Methods and Southampton Statistical Sciences Research Institute. NCRM Methods Review Papers NCRM/002; 2005.
Durrant GB, Skinner C. Using missing data methods to correct for measurement error in a distribution function. Survey Methodology. 2006;32(1):25–36.
Fay RE. When are inferences from multiple imputation valid? Proceedings of the Survey Research Methods Section of the American Statistical Association. 1992;81:227–32.
Fay RE. Alternative paradigms for the analysis of imputed survey data. Journal of the American Statistical Association. 1996;91(434):490–498.
Fuller WA, Kim JK. Hot deck imputation for the response model. Survey Methodology. 2005;31:139–149.
Fuller WA. Sampling Statistics. Wiley; Hoboken, NJ: 2009.
Haziza D. Imputation and inference in the presence of missing data. In: Pfeffermann D, Rao CR, editors. Handbook of Statistics. Vol. 29, Sample Surveys: Theory, Methods and Inference. Amsterdam: Elsevier BV; 2009. pp. 215–46.
Kalton G, Kish L. Some efficient random imputation methods. Communications in Statistics A. 1984;13:1919–1939.
Kim JK, Fuller WA. Fractional hot deck imputation. Biometrika. 2004;91(3):559–578.
Kim JK, Brick J, Fuller WA, Kalton G. On the bias of the multiple-imputation variance estimator in survey sampling. Journal of Royal Statistical Society: Series B. 2006;68(3):509–521.
Kim JK, Navarro A, Fuller WA. Replicate variance estimation after multi-phase stratified sampling. Journal of the American Statistical Association. 2006;101:312–320.
Kim JK. Parametric fractional imputation for missing data analysis. Biometrika. 2011;98:119–132.
Kim JK, Yu CL. A semi-parametric estimation of mean functionals with non-ignorable missing data. Journal of the American Statistical Association. 2011;106:157–165.
Kim JK, Shao J. Statistical methods for handling incomplete data. London: Chapman and Hall/CRC; 2013.
Kim JK, Yang S. Fractional hot deck imputation for robust inference under item nonresponse in survey sampling. Survey Methodology. 2014;40:211–230.
Little RJA, Rubin DB. Statistical Analysis With Missing Data. 2nd ed. Hoboken, NJ: Wiley; 2002.
Matloff NS. Use of regression functions for improved estimation of means. Biometrika. 1981;68:685–689.
McCullagh P, Nelder J. Generalized Linear Models. London: Chapman and Hall; 1989.
Meng XL. Multiple-imputation inferences with uncongenial sources of input. Statistical Science. 1994;9:538–558.
Müller UU. Estimating linear functionals in nonlinear regression with response missing at random. Annals of Statistics. 2009;37:2245–2277.
Owen AB. Empirical Likelihood. Chapman and Hall/CRC; New York: 2001.
Qin J. Empirical likelihood in biased sample problems. Annals of Statistics. 1993;21(3):1182–1196.
Qin J, Lawless J. Empirical likelihood and general estimating equations. The Annals of Statistics. 1994;22:300–325.
Rubin DB. Multiple Imputation for Nonresponse in Surveys. Wiley; 1987.
Shao J, Tu D. The Jackknife and Bootstrap. Springer; 1995.
Shao J, Steel P. Variance estimation for survey data with composite imputation and nonnegligible sampling fractions. Journal of the American Statistical Association. 1999;94:254–265.
Vardi Y. Empirical distributions in selection bias models. Annals of Statistics. 1985;13:178–203.
Van der Vaart AW. Asymptotic Statistics. New York: Cambridge University Press; 1998.
Víšek JA. Asymptotic distribution of simple estimate for rejective, Sampford and successive sampling. In: Jurečková J, editor. Contributions to Statistics: Jaroslav Hájek Memorial. Academia, Prague & D. Reidel; Dordrecht: 1979. pp. 263–275.
Wang N, Robins JM. Large-sample theory for parametric multiple imputation procedures. Biometrika. 1998;85(4):935–948.
Wang Q, Rao JNK. Empirical likelihood-based inference under imputation for missing response data. The Annals of Statistics. 2002;30:896–924.
Wang D, Chen SX. Empirical likelihood for estimating equations with missing values. The Annals of Statistics. 2009;37:490–517.
Wang JQ, Opsomer JD. On asymptotic normality and variance estimation for nondifferentiable survey estimators. Biometrika. 2011;98:91–106.
Wolter KM. Introduction to Variance Estimation. Wiley; New York: 2007.
Wu C, Rao JNK. Pseudo empirical likelihood ratio confidence intervals for complex surveys. The Canadian Journal of Statistics. 2006;34:359–375.
Yang S, Kim JK. A Note on Multiple Imputation for General-Purpose Estimation. Biometrika. 2016;103:244–251.
