Author manuscript; available in PMC: 2018 Jun 1.
Published in final edited form as: Stat Theory Relat Fields. 2017 Jun 1;1(1):69–81. doi: 10.1080/24754269.2017.1328244

Semiparametric fractional imputation using empirical likelihood in survey sampling

Sixia Chen, Jae Kwang Kim
PMCID: PMC5654594  NIHMSID: NIHMS908583  PMID: 29082363

Abstract

The empirical likelihood method is a powerful tool for incorporating moment conditions in statistical inference. We propose a novel application of the empirical likelihood for handling item nonresponse in survey sampling. The proposed method takes the form of fractional imputation (Kim, 2011) but it does not require parametric model assumptions. Instead, only the first moment condition based on a regression model is assumed and the empirical likelihood method is applied to the observed residuals to get the fractional weights. The resulting semiparametric fractional imputation provides $\sqrt{n}$-consistent estimates for various parameters. Variance estimation is implemented using a jackknife method. Two limited simulation studies are presented to compare several imputation estimators.

Keywords: Item nonresponse, missing data, quantile estimation, robust estimation

1 Introduction

Missing data are frequently encountered in many areas, such as survey sampling, epidemiology and other fields. Simply ignoring missing values can potentially lead to biased estimation (Little and Rubin 2002, Kim and Shao 2013). Two statistical approaches for handling missing data have been used in practice: propensity score weighting and imputation. Propensity score weighting is used mainly to correct for unit non-response, while imputation is mainly used to handle item nonresponse. Haziza (2009) provides a comprehensive overview of the imputation methods in survey sampling.

Multiple imputation (MI), proposed by Rubin (1987), is a popular imputation approach for general-purpose estimation due to its practical simplicity. However, Rubin's variance estimator may be biased in certain situations (Fay 1992; Wang and Robins 1998; Kim et al. 2006; Yang and Kim 2016), and its validity requires the congeniality condition of Meng (1994), which may not hold for general-purpose estimation.

Fractional imputation (FI), first proposed by Kalton and Kish (1984), provides an alternative method for handling item nonresponse. Fay (1996), Kim and Fuller (2004), Fuller and Kim (2005), Durrant (2005), and Durrant and Skinner (2006) discussed fractional hot deck imputation. Kim (2011) and Kim and Yang (2014) discussed a fully parametric approach to fractional imputation. The parametric fractional imputation provides a powerful tool for handling missing data for various situations. However, it relies on a strong parametric model assumption and making such an assumption is not usually preferred in survey sampling. Balanced random imputation of Chauvet et al (2011) is also an attractive imputation technique, but it still requires parametric model assumptions for multipurpose estimation.

The empirical likelihood (EL) method, considered by Owen (2001) and Qin and Lawless (1994), is a useful tool for semiparametric inference in statistics. It involves a likelihood-based inference without making a parametric distributional assumption about the observed data. Qin (1993) addressed the missing survey data problem by using a biased sampling argument of Vardi (1985). Wang and Rao (2002) brought regression-type imputation approaches to empirical likelihood inference. Wang and Chen (2009) used a nonparametric regression imputation approach to handle missing data in the empirical likelihood inference. Müller (2009) considered a novel application of empirical likelihood method to handle missing data under a regression model assumption. In Müller (2009), the moment condition of the error term in the regression model is used to construct a fully imputed estimator.

In this paper, motivated by the fully imputed estimator of Müller (2009), we propose a semiparametric fractional imputation (SFI) method using empirical likelihood that can be used to handle item nonresponse in survey sampling. Because the proposed SFI uses only moment conditions in the semiparametric regression model, it is more robust than the parametric fractional imputation (PFI) method or the parametric MI method. By using a regression model assumption, the proposed SFI method is more efficient than the nonparametric regression imputation method of Wang and Chen (2009). The proposed method takes the form of fractional imputation, so it is straightforward to implement in practice. The proposed SFI method can be used to estimate various parameters, including non-smooth parameters such as population quantiles.

The paper is organized as follows. The basic setup is introduced and the proposed method is presented in Section 2. The asymptotic properties of the SFI estimators are presented in Section 3. Extensions to non-smooth statistics as well as random imputations are covered in Section 4. In Section 5, variance estimation is discussed. Some numerical results are given in Section 6. Some concluding remarks are made in Section 7.

2 Basic Setup

Consider a finite population N = {(xi, yi);i = 1, 2, …, N}, where xi is the vector of auxiliary variables that are always observed and yi is the study variable that is subject to missingness. We assume (xi, yi) are realizations from a regression model

Y=m(X;β0)+ε, (1)

where m(X; β0) is assumed to be known with unknown parameter β0 and ε satisfies E(ε|X) = 0. No parametric distributional assumption on X is made.

Let δi be the response indicator such that δi = 1 if yi is observed and δi = 0 otherwise. We assume missing at random (MAR) in the sense that

Pr(δ=1|x,y)=Pr(δ=1|x). (2)

Even though we observe δi only in the sample, we can conceptually assume that δi’s are defined throughout the population. Such extended definition of δi has been adopted in Fay (1992), Shao and Steel (1999), Kim, Navarro, and Fuller (2006).

Given the finite population, suppose that a sample A of size n is selected from the finite population by a probability sampling mechanism. Let πi, i = 1, 2, …, N, be the first-order inclusion probability of unit i in the population. We are interested in estimating η0, defined as the solution to the estimating equation E{U(η; x, y)} = 0, where U(η; x, y) is a known function with parameter η. To avoid unnecessary details, we assume that the solution to E{U(η; x, y)} = 0 is unique and that η and U(η; x, y) both have dimension r. Thus, the parameter η is just-identified. Under complete response, a consistent estimator of η0 is obtained by solving

$$\sum_{i \in A} \frac{1}{\pi_i} U(\eta; x_i, y_i) = 0$$

for η. If some of yi are missing, under the MAR assumption, a consistent estimator of η0 can be obtained by solving the following expected estimating equation

$$\sum_{i \in A} \frac{1}{\pi_i} \left[ \delta_i U(\eta; x_i, y_i) + (1 - \delta_i) E\{U(\eta; x_i, Y) \mid x_i\} \right] = 0 \tag{3}$$

for η. The conditional expectation in (3) is with respect to f(y | x), which is unknown as we only assume (1).

In fractional imputation, our goal is to approximate the conditional expectation in (3) by the weighted mean of the fractionally imputed estimating functions. That is, we wish to achieve

$$E\{U(\eta; x_i, Y) \mid x_i\} \cong \sum_{j=1}^{m} w_{ij} U(\eta; x_i, y_i^{(j)}) \tag{4}$$

as closely as possible for some $(w_{ij}, y_i^{(j)})$ satisfying $\sum_{j=1}^{m} w_{ij} = 1$, where the $w_{ij}$ are the desired fractional weights and the $y_i^{(j)}$ are m imputed values for unit i. Kim (2011) and Kim and Yang (2014) developed a fractional imputation satisfying (4) using a parametric model assumption on f(y | x).

In our proposed method, we use the empirical likelihood approach to achieve the approximation in (4). To explain the idea, assume for now that the true parameter β0 in (1) is known. In this case, $\varepsilon_i = y_i - m(x_i; \beta_0)$ are available for the units with δi = 1. Because E(ε | x) = 0 holds, we can compute

$$E\{U(\eta; x_i, Y) \mid x_i\} = \int U(\eta; x_i, y) f(y \mid x_i)\, dy = \int U(\eta; x_i, m(x_i; \beta_0) + \varepsilon) f_\varepsilon(\varepsilon \mid x_i)\, d\varepsilon,$$

where fε(ε | x) is the (unknown) conditional density of ε given x. To apply the empirical likelihood method, we assume that the conditional distribution of ε given x can be approximated by

$$F_\varepsilon(\varepsilon \mid x) = \sum_{i \in A} \delta_i w_i I(\varepsilon_i \le \varepsilon) \tag{5}$$

such that wi ≥ 0 with Σδiwi = 1 are the point mass assigned to the observed εi by assuming that the support of εi is equal to the set of observed εi. Using the approximation in (5), we can obtain

$$E\{U(\eta; x_i, Y) \mid x_i\} \cong \sum_{j \in A} \delta_j w_j U(\eta; x_i, m(x_i; \beta_0) + \varepsilon_j),$$

which can be written in the fractional imputation form in (4). To determine wj uniquely, we can use the idea of pseudo empirical likelihood method of Wu and Rao (2006) to maximize

$$l(w) = \sum_{i \in A} \delta_i \pi_i^{-1} \log(w_i) \tag{6}$$

subject to

$$\sum_{i \in A} \delta_i w_i = 1 \quad \text{and} \quad \sum_{i \in A} \delta_i w_i \varepsilon_i = 0. \tag{7}$$
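For readers who want the intermediate step, the maximization of (6) under (7) can be worked out with the usual Lagrange-multiplier argument. The sketch below is a standard derivation, not text from the original; the multipliers $\mu$ and $\nu$ and the ratio $\lambda = \nu/\mu$ are notation introduced here for illustration.

```latex
% Lagrangian for maximizing (6) subject to (7), over respondents (\delta_i = 1)
\mathcal{L}(w,\mu,\nu) = \sum_{i \in A} \delta_i \pi_i^{-1} \log(w_i)
  - \mu \Big( \sum_{i \in A} \delta_i w_i - 1 \Big)
  - \nu \sum_{i \in A} \delta_i w_i \varepsilon_i .

% Stationarity in w_i for a respondent:
\frac{\pi_i^{-1}}{w_i} = \mu + \nu \varepsilon_i .

% Multiply by w_i, sum over respondents, and use both constraints in (7):
\sum_{k \in A} \delta_k \pi_k^{-1} = \mu .

% Hence, with \lambda = \nu / \mu,
w_i = \frac{\pi_i^{-1}}{\sum_{k \in A} \delta_k \pi_k^{-1}} \cdot \frac{1}{1 + \lambda \varepsilon_i},
\qquad \text{where } \lambda \text{ solves } \quad
\sum_{i \in A} \delta_i \pi_i^{-1} \frac{\varepsilon_i}{1 + \lambda \varepsilon_i} = 0 .
```

When the design-weighted mean of the observed errors is already zero, $\lambda = 0$ solves the last equation and the weights reduce to the normalized design weights.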

In practice, we do not know β0 and, hence, we do not observe $\varepsilon_i = y_i - m(x_i; \beta_0)$. We can use a $\sqrt{n}$-consistent estimator $\hat\beta$ of β0 to obtain $\hat\varepsilon_i = y_i - m(x_i; \hat\beta)$ and apply the above empirical likelihood method to the observed residuals. In general, one can use

$$\hat U_\beta(\beta) = \frac{1}{N} \sum_{i \in A} \frac{\delta_i}{\pi_i} \{y_i - m(x_i; \beta)\} h(x_i; \beta) = 0 \tag{8}$$

to obtain a $\sqrt{n}$-consistent estimator of β, where h(xi; β) is an arbitrary function that enables the above equation to have a solution. If the variance function is V(y | x) = σ²q(x; β0) for a known function q, then one can choose $h(x_i; \beta) = \dot m(x_i; \beta)/q(x_i; \beta)$, where $\dot m(x_i; \beta) = \partial m(x_i; \beta)/\partial\beta$. This choice is motivated by the quasi-likelihood equations for generalized linear models (McCullagh and Nelder, 1989, Ch. 9). The solution to (8) can be called the complete-case (CC) estimator of β. The CC estimator is not efficient in general, but it is efficient for estimating β under MAR. Thus, the resulting SFI estimator can be constructed as follows:

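  • [Step 1] Obtain a $\sqrt{n}$-consistent estimator $\hat\beta$ of β0 and compute $\hat\varepsilon_i = y_i - m(x_i; \hat\beta)$ among the respondents.

  • [Step 2] Find ŵi that maximizes (6) subject to
    $$\sum_{i \in A} \delta_i w_i = 1 \quad \text{and} \quad \sum_{i \in A} \delta_i w_i \hat\varepsilon_i = 0. \tag{9}$$
    The solution can be written as
    $$\hat w_i = \frac{\pi_i^{-1}}{\sum_{k \in A} \delta_k \pi_k^{-1}} \cdot \frac{1}{1 + \hat\lambda \hat\varepsilon_i}, \tag{10}$$
    where $\hat\lambda$ is obtained by solving the second constraint of (9).
  • [Step 3] Use ŵj in Step 2 to approximate
    $$E\{U(\eta; x_i, Y) \mid x_i\} \cong \sum_{j \in A} \delta_j w_{ij} U(\eta; x_i, y_i^{(j)}),$$
    where $y_i^{(j)} = \hat y_i + \hat\varepsilon_j$ with $\hat y_i = m(x_i; \hat\beta)$, and $w_{ij} = \hat w_j$.
  • [Step 4] The SFI estimator $\hat\eta_{SFI}$ of η is computed by solving
    $$\hat U_\eta(\eta, \hat\beta, \hat\lambda) = \frac{1}{N} \sum_{i \in A} \frac{1}{\pi_i} \Big\{ \delta_i U(\eta; x_i, y_i) + (1 - \delta_i) \sum_{j \in A} \delta_j w_{ij} U(\eta; x_i, y_i^{(j)}) \Big\} = 0 \tag{11}$$
    for η.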
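The four steps can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: it assumes a regression through the origin, m(x; β) = βx with h(x; β) = x, a target of the form E{g(y)} (so that U = g(y) − η), and bisection to solve for the multiplier. The function names `el_weights` and `sfi_expectation` and the toy data are hypothetical.

```python
import math
import random

def el_weights(resid, d):
    """Weights of the form (10) for respondent residuals (needs both signs present)."""
    d_sum = sum(d)

    def constraint(lam):
        # Second constraint of (9), up to the positive factor 1/d_sum.
        return sum(di * e / (1.0 + lam * e) for di, e in zip(d, resid))

    # Weights stay positive only if 1 + lam * e_i > 0 for every respondent.
    lo = -1.0 / max(resid) + 1e-10
    hi = -1.0 / min(resid) - 1e-10
    for _ in range(200):  # constraint() is decreasing in lam, so bisect
        mid = 0.5 * (lo + hi)
        if constraint(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    lam = 0.5 * (lo + hi)
    return [di / (d_sum * (1.0 + lam * e)) for di, e in zip(d, resid)], lam

def sfi_expectation(x, y, delta, pi, g):
    """SFI estimate of E{g(y)} under the assumed model m(x; beta) = beta * x."""
    d = [1.0 / p for p in pi]
    resp = [i for i in range(len(x)) if delta[i] == 1]
    # Step 1: solve (8) with h(x; beta) = x among respondents.
    beta = (sum(d[i] * x[i] * y[i] for i in resp)
            / sum(d[i] * x[i] ** 2 for i in resp))
    resid = [y[i] - beta * x[i] for i in resp]
    # Step 2: empirical likelihood weights on the observed residuals.
    w, lam = el_weights(resid, [d[i] for i in resp])
    # Steps 3-4: fractionally impute y_hat_i + resid_j and solve (11) for eta.
    total = 0.0
    for i in range(len(x)):
        if delta[i] == 1:
            total += d[i] * g(y[i])
        else:
            y_hat = beta * x[i]
            total += d[i] * sum(wj * g(y_hat + ej) for wj, ej in zip(w, resid))
    return total / sum(d), w, resid, lam

# Toy data (illustrative only): equal inclusion probabilities, MAR response.
rng = random.Random(1)
n = 200
x = [rng.expovariate(1.0) for _ in range(n)]
delta = [1 if rng.random() < 1.0 / (1.0 + math.exp(1.0 - xi)) else 0 for xi in x]
y = [0.5 * xi + rng.gauss(0.0, 1.0) if di == 1 else None for xi, di in zip(x, delta)]
pi = [0.02] * n

mean_hat, w, resid, lam = sfi_expectation(x, y, delta, pi, lambda v: v)
prop_hat, _, _, _ = sfi_expectation(x, y, delta, pi, lambda v: 1.0 if v < 1.0 else 0.0)
```

Note that if the working model contains an intercept and (8) is solved with h = (1, x), the weighted residuals already sum to zero, so the multiplier is zero and the weights reduce to normalized design weights; the no-intercept model is used here only so that the multiplier is visibly nonzero.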

Instead of (11), one can also consider a fully imputed estimating equation based on

$$\sum_{i \in A} \frac{1}{\pi_i} E\{U(\eta; x_i, Y_i) \mid x_i\} = 0,$$

which was considered by Müller (2009) under the independent and identically distributed (i.i.d.) setup. The fully imputed estimating equation may lead to a more efficient estimator of η (Matloff, 1981), but such over-imputation does not appeal to survey practice since we usually do not want to replace the true values of respondents with imputed values. In the following section, we present the asymptotic properties of $\hat\eta_{SFI}$ under complex survey designs.

3 Asymptotic Properties

To discuss the asymptotic properties of the proposed SFI estimator of η, we first assume a sequence of finite populations and samples with finite fourth moments as in Fuller (2009, Ch.1). The following theorem presents the asymptotic normality of the proposed SFI estimator. The sketched proof of Theorem 1 is provided in Appendix A.

Theorem 1

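Under the regularity conditions (C1)–(C13) in Appendix A, the SFI estimator defined in (11) is a $\sqrt{n}$-consistent estimator of η0, that is,

$$\sqrt{n}\,(\hat\eta_{SFI} - \eta_0) \to N(0, B\,\Sigma_{u2}\,B^{T}),$$

where $B = [E\{\partial U(\eta; x, y)/\partial\eta\}]^{-1}$, $\Sigma_{u2} = V(N^{-1} \sum_{i \in A} \pi_i^{-1} \zeta_i)$, and

$$\zeta_i = \delta_i U(\eta_0; x_i, y_i) + (1 - \delta_i) E\{U(\eta_0; x_i, Y) \mid x_i\} + \frac{\delta_i}{E(\delta)} C_i + D_W \Big[ E\Big\{ \delta\, h(x; \beta_0) \frac{\partial m(x; \beta_0)}{\partial\beta} \Big\} \Big]^{-1} \delta_i \varepsilon_i h(x_i; \beta_0), \tag{12}$$

and

$$C_i = \bar U_m(\varepsilon_i) - E\{\bar U_m(\varepsilon_i)\} - \sigma^{-2} E\{\varepsilon \bar U_m(\varepsilon)\}\, \varepsilon_i,$$
$$D_W = D + \frac{E\{\varepsilon \bar U_m(\varepsilon)\}}{\sigma^2} E\Big\{ \frac{\partial m(x; \beta_0)}{\partial\beta} \,\Big|\, \delta = 1 \Big\},$$
$$D = E\Big( (1 - \delta) U(\eta_0; x, y) \Big[ \frac{\partial m(x; \beta_0)}{\partial\beta} - E\Big\{ \frac{\partial m(x; \beta_0)}{\partial\beta} \,\Big|\, \delta = 1 \Big\} \Big] l(\varepsilon) \Big),$$

with $\sigma^2 = E(\varepsilon^2)$, $\bar U_m(\varepsilon) = E\{(1 - \delta) U(\eta_0; x, y) \mid \varepsilon\}$, and $l(\varepsilon) = -\partial \log f_\varepsilon(\varepsilon \mid x)/\partial\varepsilon$.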

Remark 1

In (12), ζi is written as the sum of four terms. The first two terms are the conditional expectation of U(η; x, y), the third term is the additional term due to approximating f(y | x) by the empirical likelihood method, and the fourth term is the additional term due to estimating β.

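According to Theorem 1, a consistent variance estimator of $\hat\eta_{SFI}$ can be written as

$$\hat V(\hat\eta_{SFI}) = \Big\{ \hat E\Big( \frac{\partial U(\eta_0; x, y)}{\partial\eta} \Big) \Big\}^{-1} \hat V\Big( \frac{1}{N} \sum_{i \in A} \pi_i^{-1} \zeta_i \Big) \Big[ \Big\{ \hat E\Big( \frac{\partial U(\eta_0; x, y)}{\partial\eta} \Big) \Big\}^{-1} \Big]^{T}, \tag{13}$$

where

$$\hat E\Big( \frac{\partial U(\eta_0; x, y)}{\partial\eta} \Big) = \frac{1}{\hat N} \sum_{i \in A} \pi_i^{-1} \Big\{ \delta_i \frac{\partial U(\hat\eta; x_i, y_i)}{\partial\eta} + (1 - \delta_i) \sum_{j \in A} \delta_j w_{ij} \frac{\partial U(\hat\eta; x_i, y_i^{(j)})}{\partial\eta} \Big\},$$

with $\hat N = \sum_{i \in A} \pi_i^{-1}$, $\hat\eta = \hat\eta_{SFI}$ and

$$\hat V\Big( \frac{1}{N} \sum_{i \in A} \pi_i^{-1} \zeta_i \Big) = \frac{1}{\hat N^2} \sum_{i \in A} \sum_{j \in A} \frac{\pi_{ij} - \pi_i \pi_j}{\pi_{ij}} \frac{\hat\zeta_i}{\pi_i} \frac{\hat\zeta_j}{\pi_j} + \frac{1}{\hat N^2} \sum_{i \in A} \frac{(\hat\zeta_i - \hat\zeta_N)^2}{\pi_i}, \tag{14}$$

where $\hat\zeta_N = \hat N^{-1} \sum_{i \in A} \pi_i^{-1} \hat\zeta_i$ and $\hat\zeta_i$ is a plug-in estimator of ζi in (12). One can use

$$\hat\zeta_i = \delta_i U(\hat\eta; x_i, y_i) + (1 - \delta_i) \hat\mu(x_i; \hat\beta, \hat\eta) + \delta_i \{\hat E(\delta)\}^{-1} \Big[ \hat{\bar U}_m(\hat\varepsilon_i) - \hat E\{\bar U_m(\varepsilon_i)\} - \hat\sigma^{-2} \hat E\{\varepsilon \bar U_m(\varepsilon)\}\, \hat\varepsilon_i \Big] + \hat D_W \Big[ \hat E\Big\{ \delta\, h(x; \beta_0) \frac{\partial m(x; \beta_0)}{\partial\beta} \Big\} \Big]^{-1} \delta_i \hat\varepsilon_i h(x_i; \hat\beta),$$

with

$$\hat\mu(x_i; \hat\beta, \hat\eta) = \frac{\sum_{j \in A} \pi_j^{-1} \delta_j U(\hat\eta; x_i, y_i^{(j)})}{\sum_{j \in A} \pi_j^{-1} \delta_j}, \qquad \hat E(\delta) = \frac{1}{\hat N} \sum_{j \in A} \pi_j^{-1} \delta_j,$$
$$\hat\sigma^2 = \frac{\sum_{j \in A} \pi_j^{-1} \delta_j \hat\varepsilon_j^2}{\sum_{j \in A} \pi_j^{-1} \delta_j}, \qquad \hat E\Big\{ \frac{\partial m(x; \beta_0)}{\partial\beta} \,\Big|\, \delta = 1 \Big\} = \frac{\sum_{i \in A} \pi_i^{-1} \delta_i\, \partial m(x_i; \hat\beta)/\partial\beta}{\sum_{i \in A} \pi_i^{-1} \delta_i},$$
$$\hat D_W = \hat D + \frac{\hat E\{\varepsilon \bar U_m(\varepsilon)\}}{\hat\sigma^2} \hat E\Big\{ \frac{\partial m(x; \beta_0)}{\partial\beta} \,\Big|\, \delta = 1 \Big\},$$
$$\hat{\bar U}_m(\hat\varepsilon_i) = \frac{1}{\hat N} \sum_{j \in A} \pi_j^{-1} (1 - \delta_j) U(\hat\eta; x_j, y_j^{(i)}),$$
$$\hat E\{\bar U_m(\varepsilon_i)\} = \frac{1}{\hat N} \sum_{i \in A} \pi_i^{-1} (1 - \delta_i) \sum_{j \in A} \delta_j \hat w_j U(\hat\eta; x_i, y_i^{(j)}), \qquad \hat E\{\varepsilon \bar U_m(\varepsilon)\} = \frac{\sum_{j \in A} \pi_j^{-1} \delta_j \hat\varepsilon_j \hat{\bar U}_m(\hat\varepsilon_j)}{\sum_{j \in A} \delta_j \pi_j^{-1}},$$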
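$$\hat D = \frac{1}{\hat N} \sum_{i \in A} \pi_i^{-1} (1 - \delta_i) \frac{\sum_{j \in A} \delta_j \pi_j^{-1}\, \{\partial U(\hat\eta; x_i, y_i^{(j)})/\partial y\}\, \{\partial m(x_i; \hat\beta)/\partial\beta - \partial m(x_j; \hat\beta)/\partial\beta\}}{\sum_{j \in A_r} \pi_j^{-1}},$$

where $A_r$ denotes the set of respondents in A, and the difference of derivatives follows the form of the limit D given in (A.7).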

When $n N^{-1} = o(1)$, the second term of (14) is of smaller order and can be safely ignored.

4 Extensions

In this section, we discuss two extensions of the proposed method. In Section 4.1, our proposed method is extended to handle non-smooth statistics including distribution functions and percentiles. In Section 4.2, an extension to stochastic imputation is discussed.

4.1 Inference for non-smooth statistics

Suppose that we are interested in estimating the parameter η0, the solution of E{U(η; x, y)} = 0 with a non-smooth function U(η; x, y), where the non-smoothness can be with respect to either η or y. For generality, we assume the non-smoothness is with respect to both η and y. Wang and Opsomer (2011) discussed asymptotic results for nondifferentiable survey estimators. Define θ = (η, β) and θ0 = (η0, β0). Let $\tilde U_n(\theta) = N^{-1} \sum_{i \in A} \pi_i^{-1} \tilde U(\theta; \delta_i, x_i, y_i)$ and $\tilde U(\theta) = E\{\tilde U(\theta; \delta_i, x_i, y_i)\}$, where

$$\tilde U(\theta; \delta_i, x_i, y_i) = \delta_i U(\eta; x_i, y_i) + (1 - \delta_i) \int U\{\eta; x_i, m(x_i; \beta) + \varepsilon_i\} f_\varepsilon(\varepsilon_i \mid x_i)\, d\varepsilon_i.$$

Denote by $\hat\theta = (\hat\eta, \hat\beta)$ the solution of the estimating equation Ũn(θ) = 0. To discuss asymptotic properties, we replace regularity conditions (C7)–(C10) in Appendix A with the regularity conditions (C14)–(C17) in Appendix B. The following theorem presents the asymptotic expansion of $\hat\eta_{SFI}$ under this scenario; a sketch of the proof is presented in Appendix B.

Theorem 2

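Under regularity conditions (C1)–(C3) and (C11)–(C17) in Appendix A and Appendix B, $\hat\eta_{SFI}$ has the following asymptotic expansion

$$\hat\eta_{SFI} - \eta_0 = -\Big[ \frac{\partial E\{U(\eta_0; x, y)\}}{\partial\eta} \Big]^{-1} \Big( \frac{1}{N} \sum_{i \in A} \frac{1}{\pi_i} \zeta_{2i} \Big) + o_p(n^{-1/2}),$$

where

$$\zeta_{2i} = \delta_i U(\eta_0; x_i, y_i) + (1 - \delta_i) \mu(x_i; \beta_0, \eta_0) + \frac{\delta_i}{E(\delta)} \Big[ \bar U_m(\varepsilon_i) - E\{\bar U_m(\varepsilon_i)\} - \frac{E\{\varepsilon \bar U_m(\varepsilon)\}}{\sigma^2} \varepsilon_i \Big] + D_W^{*} \Big[ E\Big\{ \delta\, h(x; \beta_0) \frac{\partial m(x; \beta_0)}{\partial\beta} \Big\} \Big]^{-1} \delta_i \varepsilon_i h(x_i; \beta_0),$$

where

$$D_W^{*} = D^{*} + \frac{E\{\varepsilon \bar U_m(\varepsilon)\}}{\sigma^2} E\Big\{ \frac{\partial m(x; \beta_0)}{\partial\beta} \,\Big|\, \delta = 1 \Big\},$$

and

$$D^{*} = \{E(\delta)\}^{-1} \frac{\partial E[(1 - \delta_i) \delta_j U\{\eta_0; x_i, y_i^{(j)}(\beta)\}]}{\partial\beta}$$

evaluated at β0, and the other terms are the same as those in Theorem 1.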

By Theorem 2, we can obtain

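$$\sqrt{n}\,(\hat\eta_{SFI} - \eta_0) \to N(0, B\,\Sigma_{u2}\,B^{T}),$$

where $B = [\partial E\{U(\eta; x, y)\}/\partial\eta]^{-1}$ and $\Sigma_{u2} = V\{N^{-1} \sum_{i \in A} \pi_i^{-1} \zeta_{2i}\}$. If we are interested in estimating the cumulative distribution function of y, which is Pr(y < t), then we can choose U(η; x, y) = I(y < t) − η and

$$E[(1 - \delta_i) \delta_j U\{\eta_0; x_i, y_i^{(j)}(\beta)\}] = \int_{-\infty}^{+\infty} \int_{-\infty}^{+\infty} \{1 - p(x_i)\}\, p(x_j) \int_{-\infty}^{t + m(x_j; \beta) - m(x_i; \beta)} f_{y_j \mid x_j}(y_j)\, dy_j\, dx_i\, dx_j,$$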

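where p(x) = Pr(δ = 1 | x). Therefore, differentiating the upper limit of the inner integral with respect to β, we have $D^{*} = \{E(\delta)\}^{-1} E[(1 - \delta_i) \delta_j \{\partial m(x_j; \beta_0)/\partial\beta - \partial m(x_i; \beta_0)/\partial\beta\}\, f_{y_j \mid x_j}\{t + m(x_j; \beta_0) - m(x_i; \beta_0)\}]$.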

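A consistent estimator of D* can be written as

$$\hat D^{*} = \frac{\sum_{i \in A} \frac{1 - \delta_i}{\pi_i} \sum_{j \in A} \delta_j \pi_j^{-1} \{\partial m(x_j; \hat\beta)/\partial\beta - \partial m(x_i; \hat\beta)/\partial\beta\}\, \hat f_{y_j \mid x_j}\{t + m(x_j; \hat\beta) - m(x_i; \hat\beta)\}}{\hat N \sum_{j \in A_r} \pi_j^{-1}}$$

with

$$\hat f_{y \mid x}(y \mid x) = \frac{(h_x h_y)^{-1} \sum_{i \in A} \pi_i^{-1} \delta_i K_x\{h_x^{-1}(x - x_i)\} K_y\{h_y^{-1}(y - y_i)\}}{(h_x)^{-1} \sum_{i \in A} \pi_i^{-1} \delta_i K_x\{h_x^{-1}(x - x_i)\}},$$

where Kx and Ky are kernel functions for x and y with bandwidths hx and hy. Thus, a consistent variance estimator of $\hat\eta_{SFI}$ here can be obtained similarly to (13).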

If the parameter of interest is the τ-th percentile of Y, given by $\eta = F_Y^{-1}(\tau)$, the SFI estimator $\hat\eta_{\tau,SFI}$ of η can be obtained by solving the estimating equation (11) with U(η; x, y) = I(y < η) − τ. Since E{I(Y < η)} = FY(η), it can be shown that $\hat\eta_{\tau,SFI}$ has the asymptotic expansion in Theorem 2 with

$$\frac{\partial E\{U(\eta_0; x, y)\}}{\partial\eta} = f_y(\eta_0) = E\{f_{y \mid x}(\eta_0 \mid x)\},$$

where fy is the density function for y. A consistent estimator of ∂E{U(η0; x, y)}/∂η can be written as

$$\hat E\Big\{ \frac{\partial U(\eta_0; x, y)}{\partial\eta} \Big\} = \frac{1}{\hat N} \sum_{i \in A} \pi_i^{-1} \hat f_{y \mid x}(\hat\eta \mid x_i),$$

and a consistent estimator of D* can be written as

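$$\hat D^{*} = \frac{\sum_{i \in A} \frac{1 - \delta_i}{\pi_i} \sum_{j \in A} \delta_j \pi_j^{-1} \{\partial m(x_j; \hat\beta)/\partial\beta - \partial m(x_i; \hat\beta)/\partial\beta\}\, \hat f_{y_j \mid x_j}\{\hat\eta + m(x_j; \hat\beta) - m(x_i; \hat\beta)\}}{\hat N \sum_{j \in A_r} \pi_j^{-1}},$$

with $\hat\eta = \hat\eta_{\tau,SFI}$.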
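As a rough illustration of the percentile case, solving (11) with U(η; x, y) = I(y < η) − τ amounts to taking a weighted quantile of the pooled observed and fractionally imputed values. The sketch below assumes the residuals and empirical likelihood weights have already been computed, and it uses the generalized-inverse convention (smallest value whose cumulative weight reaches τ), which is one common way to handle the step function and is an assumption here. The function names are hypothetical.

```python
def weighted_quantile(values, weights, tau):
    """Smallest value whose cumulative weight share reaches tau."""
    pairs = sorted(zip(values, weights))
    threshold = tau * sum(weights)
    cum = 0.0
    for value, weight in pairs:
        cum += weight
        if cum >= threshold:
            return value
    return pairs[-1][0]

def sfi_quantile(y_obs, d_obs, y_hat_mis, d_mis, resid, w, tau):
    """Quantile from observed values plus fractionally imputed values.

    y_obs, d_obs     : respondent values and design weights 1/pi_i
    y_hat_mis, d_mis : fitted means and design weights of nonrespondents
    resid, w         : respondent residuals and their EL weights (sum to 1)
    """
    values = list(y_obs)
    weights = list(d_obs)
    for y_hat, d_i in zip(y_hat_mis, d_mis):
        for e_j, w_j in zip(resid, w):
            values.append(y_hat + e_j)   # imputed value y_i^(j)
            weights.append(d_i * w_j)    # design weight times fractional weight
    return weighted_quantile(values, weights, tau)

# Tiny worked example: three respondents, one nonrespondent with fitted mean 2.
median_hat = sfi_quantile(
    y_obs=[1.0, 2.0, 3.0], d_obs=[1.0, 1.0, 1.0],
    y_hat_mis=[2.0], d_mis=[1.0],
    resid=[-1.0, 0.0, 1.0], w=[1.0 / 3, 1.0 / 3, 1.0 / 3],
    tau=0.5,
)
```

In the toy call, the nonrespondent contributes the values 1, 2 and 3 with weight 1/3 each, so the pooled weighted median is 2.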

4.2 Stochastic imputation

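For a multi-purpose survey, stochastic imputation is often preferred to deterministic imputation since it can better preserve distributional relationships. In stochastic imputation, imputed values are generated from a random mechanism, which introduces additional variability due to the imputation. For simplicity, we only consider the case where $U(\eta; x_i, y_i^{(j)})$ is a smooth function of η and β. The results can be naturally extended to non-smooth statistics. The stochastic imputation estimator $\hat\eta_{SFI2}$ can be obtained by solving the following estimating equation

$$\hat U_\eta(\eta \mid \hat\beta, \hat\lambda) = \frac{1}{N} \sum_{i \in A} \frac{1}{\pi_i} \Big\{ \delta_i U(\eta; x_i, y_i) + (1 - \delta_i) \frac{1}{M} \sum_{s=1}^{M} U(\eta; x_i, y_i^{(s)}) \Big\} = 0,$$

where the $y_i^{(s)}$ are randomly selected from $\{\hat y_{ij} = \hat y_i + \hat\varepsilon_j;\ j \in A_r\}$ with selection probability $P(y_i^{(s)} = \hat y_{ij}) = w_{ij}$, where the $w_{ij}$ are the fractional weights in (11). Since

$$\operatorname*{plim}_{M \to \infty} \frac{1}{M} \sum_{s=1}^{M} U(\eta; x_i, y_i^{(s)}) = E\{\hat U_\eta(\eta \mid \hat\beta, \hat\lambda) \mid I, x, y, \delta\} = \sum_{j \in A} \delta_j w_{ij} U(\eta; x_i, y_i^{(j)}),$$

where the conditional expectation is with respect to the stochastic imputation mechanism, we have

$$V\{\hat U_\eta(\eta \mid \hat\beta, \hat\lambda)\} = V\{\hat U_\eta(\eta_0, \hat\beta, \hat\lambda)\} + V\{\hat U_\eta(\eta_0 \mid \hat\beta, \hat\lambda) - \hat U_\eta(\eta_0, \hat\beta, \hat\lambda)\} = V\{\hat U_\eta(\eta_0, \hat\beta, \hat\lambda)\} + E[V\{\hat U_\eta(\eta_0 \mid \hat\beta, \hat\lambda) \mid I, x, y, \delta\}].$$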

Thus, using an argument similar to Theorem 1, we can obtain

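$$V(\hat\eta_{SFI2}) \cong \Big\{ E\Big( \frac{\partial U(\eta_0; x, y)}{\partial\eta} \Big) \Big\}^{-1} V_M \Big[ \Big\{ E\Big( \frac{\partial U(\eta_0; x, y)}{\partial\eta} \Big) \Big\}^{-1} \Big]^{T}, \tag{15}$$

where $V_M = V\{\hat U_\eta(\eta \mid \hat\beta, \hat\lambda)\}$. Therefore, a consistent variance estimator can be written as

$$\hat V(\hat\eta_{SFI2}) = \Big\{ \hat E\Big( \frac{\partial U(\eta_0; x, y)}{\partial\eta} \Big) \Big\}^{-1} \hat V_M \Big[ \Big\{ \hat E\Big( \frac{\partial U(\eta_0; x, y)}{\partial\eta} \Big) \Big\}^{-1} \Big]^{T},$$

where

$$\hat V_M = \hat V\{\hat U_\eta(\eta_0, \hat\beta, \hat\lambda)\} + \hat V\{\hat U_\eta(\eta_0 \mid \hat\beta, \hat\lambda) - \hat U_\eta(\eta_0, \hat\beta, \hat\lambda) \mid I, x, y, \delta\}, \tag{16}$$

and $\hat E(\partial U(\eta_0; x, y)/\partial\eta)$ and $\hat V\{\hat U_\eta(\eta_0, \hat\beta, \hat\lambda)\}$ can be obtained similarly to (13) and

$$\hat V\{\hat U_\eta(\eta_0 \mid \hat\beta, \hat\lambda) - \hat U_\eta(\eta_0, \hat\beta, \hat\lambda) \mid I, x, y, \delta\} = \frac{1}{M \hat N^2} \sum_{i \in A} \pi_i^{-2} (1 - \delta_i) \sum_{j \in A} \delta_j w_{ij} \Big\{ U(\hat\eta; x_i, y_i^{(j)}) - \sum_{k \in A} \delta_k w_{ik} U(\hat\eta; x_i, y_i^{(k)}) \Big\}^2.$$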

The second term of (16) estimates the additional variance due to stochastic imputation. If M is large, the second term is negligible.
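The random draw and the per-unit imputation variance term can be sketched as follows. This is a minimal sketch under the assumption that the residuals and fractional weights are already available; it uses the standard library's `random.choices` for weighted sampling with replacement, and the function names are hypothetical.

```python
import random

def draw_imputations(y_hat_i, resid, w, M, rng):
    """Draw M imputed values y_hat_i + resid_j with probabilities w_j."""
    draws = rng.choices(resid, weights=w, k=M)  # sampling with replacement
    return [y_hat_i + e for e in draws]

def unit_imputation_variance(y_hat_i, resid, w, M, g):
    """(1/M) * sum_j w_j * {g(y_i^(j)) - sum_k w_k g(y_i^(k))}^2 for one unit.

    This is the inner sum of the second term of (16) for U = g(y) - eta,
    before multiplying by pi_i^{-2} / N_hat^2 and summing over nonrespondents.
    """
    vals = [g(y_hat_i + e) for e in resid]
    mean = sum(wj * v for wj, v in zip(w, vals))
    return sum(wj * (v - mean) ** 2 for wj, v in zip(w, vals)) / M

rng = random.Random(0)
resid = [-1.0, 1.0]
w = [0.5, 0.5]
imputed = draw_imputations(0.0, resid, w, 10, rng)
var_term = unit_imputation_variance(0.0, resid, w, 2, lambda v: v)
```

With two equally weighted residuals of −1 and 1, the conditional variance of one draw is 1, so M = 2 draws give a term of 0.5; the term shrinks at rate 1/M, which is why it is negligible for large M.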

5 Replication variance estimation

Estimating the variance of the estimator $\hat\eta_{SFI}$ can be done through the linearization formulas presented in Section 3 for smooth statistics and the formulas in Section 4 for non-smooth statistics. However, computing all the terms requires tedious algebra. In this section, we consider an alternative approach using replication methods. Shao and Tu (1995) considered the theoretical aspects of replication methods such as the jackknife and the bootstrap. Wolter (2007) gives a comprehensive overview of replication variance estimation methods in survey sampling.

Suppose we are interested in estimating $T = \sum_{i=1}^{N} y_i$. Define the design weight as $d_i = \pi_i^{-1}$. The design-unbiased estimator of T is $\hat T = \sum_{i \in A} d_i y_i$ and the consistent replication variance estimator of $\hat T$ is given by

$$\hat V_R(\hat T) = \sum_{k=1}^{L} c_k (\hat T^{(k)} - \hat T)^2,$$

where there are L replication weights, ck is the replication factor associated with the k-th replication and $\hat T^{(k)} = \sum_{i \in A} d_i^{(k)} y_i$, with $d_i^{(k)}$ being the k-th replicate of di. For example, ck = (L − 1)/L for the delete-one-group jackknife method. For details of the ck corresponding to different variance estimation approaches, see Wolter (2007).
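The replication formula can be illustrated with a delete-one jackknife for the estimated total. The replicate weights used below (zero for the deleted unit, d_i · L/(L − 1) for the others, with L equal to the sample size) are a common textbook choice and are an assumption here, not a detail given in the text; the function name is hypothetical.

```python
def jackknife_variance_total(y, d):
    """Delete-one jackknife variance of T_hat = sum_i d_i * y_i.

    Assumes L = n replicates with c_k = (L - 1) / L and replicate weights
    d_i^(k) = 0 if i == k, else d_i * L / (L - 1).
    """
    L = len(y)
    t_hat = sum(di * yi for di, yi in zip(d, y))
    c_k = (L - 1) / L
    scale = L / (L - 1)
    variance = 0.0
    for k in range(L):
        # k-th replicate of the total: drop unit k and rescale the rest.
        t_k = sum(d[i] * scale * y[i] for i in range(L) if i != k)
        variance += c_k * (t_k - t_hat) ** 2
    return t_hat, variance

t_hat, v_hat = jackknife_variance_total([1.0, 2.0, 3.0], [2.0, 2.0, 2.0])
```

For the toy input, the weighted values are 2, 4 and 6, the replicates are 15, 12 and 9, and the variance estimate is (2/3)(9 + 0 + 9) = 12. For the SFI estimator, the same loop would recompute the regression fit, the empirical likelihood weights and the estimate for every replicate.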

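To obtain the replication variance estimator of our proposed SFI estimator, we apply the same SFI method to each of the replicates. In the first step, we obtain the k-th replicate $\hat\beta^{(k)}$ of $\hat\beta$ by solving

$$\sum_{i \in A} d_i^{(k)} \delta_i \{y_i - m(x_i; \beta)\} h(x_i; \beta) = 0.$$

In the second step, the replicated EL weights are computed by maximizing

$$l^{(k)}(w) = \sum_{j \in A} \delta_j d_j^{(k)} \log(w_j)$$

subject to the constraints

$$\sum_{j \in A} \delta_j (d_j^{(k)}/d_j)\, w_j = 1 \quad \text{and} \quad \sum_{j \in A} \delta_j (d_j^{(k)}/d_j)\, w_j \hat\varepsilon_j^{(k)} = 0,$$

with $\hat\varepsilon_j^{(k)} = y_j - m(x_j; \hat\beta^{(k)})$. In the final step, the replicated SFI estimator is computed using the replicated EL weights. For smooth statistics, the k-th replicate of $\hat\eta_{SFI}$, denoted by $\hat\eta_{SFI}^{(k)}$, is obtained as the solution to the following estimating equation

$$\sum_{i \in A} d_i^{(k)} \Big\{ \delta_i U(\eta; x_i, y_i) + (1 - \delta_i) \sum_{j \in A} \delta_j w_{ij}^{(k)} U(\eta; x_i, y_i^{(j)}(\hat\beta^{(k)})) \Big\} = 0,$$

where $w_{ij}^{(k)} = \hat w_j^{(k)}$ and $y_i^{(j)}(\hat\beta^{(k)}) = m(x_i; \hat\beta^{(k)}) + y_j - m(x_j; \hat\beta^{(k)})$. The final replication variance estimator of $\hat\eta_{SFI}$ is given by

$$\hat V_R(\hat\eta_{SFI}) = \sum_{k=1}^{L} c_k (\hat\eta_{SFI}^{(k)} - \hat\eta_{SFI})^2.$$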

For non-smooth statistics, our estimator is similar to that of Wang and Opsomer (2011). Define

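$$\hat u^{(k)} = \hat U_\eta^{(k)}(\hat\eta, \hat\beta, \hat\lambda) + [-\hat E\{\varepsilon \bar U_m(\varepsilon)\}] (\hat\lambda^{(k)} - \hat\lambda) + \hat D^{*} (\hat\beta^{(k)} - \hat\beta),$$

where $\hat E\{\varepsilon \bar U_m(\varepsilon)\}$ is defined in Section 3, $\hat D^{*}$ is defined in Section 4.1, and $\hat U_\eta^{(k)}(\hat\eta, \hat\beta, \hat\lambda)$ is defined as in (11) with the design weights replaced by the replication weights $d_i^{(k)}$ and the fractional weights replaced by the replication fractional weights $w_{ij}^{(k)}$. The coefficient of $\hat\lambda^{(k)} - \hat\lambda$ is the estimated derivative of the estimating function with respect to λ, which is negative by (A.6). Then the replication variance estimator can be written as:

$$\hat V_R(\hat\eta_{SFI}) = \Big\{ \frac{\partial \hat E\{U(\hat\eta; x, y)\}}{\partial\eta} \Big\}^{-1} \sum_{k=1}^{L} c_k \{\hat u^{(k)} - \hat U_\eta(\hat\eta, \hat\beta, \hat\lambda)\}^2 \Big[ \Big\{ \frac{\partial \hat E\{U(\hat\eta; x, y)\}}{\partial\eta} \Big\}^{-1} \Big]^{T},$$

with ∂Ê{U(η; x, y)}/∂η defined in Section 4.1.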

6 Simulation studies

In this section, we conduct two limited simulation studies. The first uses artificially generated data and the second uses real data treated as a finite population.

6.1 Simulation One

We repeatedly generate B = 2,000 finite populations of (xi, yi, δi) of size N = 10,000 from a super-population model

$$y_i = 0.5 x_i + \varepsilon_i,$$

with xi ~ exp(1) and E(εi | xi) = 0. Two error distributions are considered: (E1) εi ~ N(0, 1) and (E2) εi ~ {χ²(2) − 2}/2. Given (x, y), the response indicator δ has a Bernoulli distribution with $\Pr(\delta = 1 \mid x) = \{1 + \exp(1 - x)\}^{-1}$. The overall response rate is about 50%. Given each finite population (x, y, δ), we draw a sample by using a Poisson sampling design with the first-order inclusion probability $\pi_i = n z_i / \sum_{i=1}^{N} z_i$, where n = 200 and zi = max{0.5yi + 2, 1} + ui, with ui ~ χ²(1), the chi-squared distribution with one degree of freedom. In this simulation, we are interested in estimating three parameters:

  1. $\theta_1 = N^{-1} \sum_{i=1}^{N} y_i$, the population mean of y.

  2. $\theta_2 = N^{-1} \sum_{i=1}^{N} I(y_i < 1)$, the proportion of y less than 1.

  3. $\theta_3 = F^{-1}(0.5)$, the population median of y.
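The data-generating steps of Simulation One can be sketched as below. This is a minimal sketch of one Monte Carlo replicate, not the authors' code: the function names are hypothetical, the seed is arbitrary, χ²(2) is generated as a gamma variate with shape 1 and scale 2, χ²(1) as the square of a standard normal, and inclusion probabilities are capped at 1 as a safeguard that is an assumption here.

```python
import math
import random

def generate_population(N, rng, error="E1"):
    """Finite population of (x, y, delta) from y = 0.5 x + eps with MAR response."""
    population = []
    for _ in range(N):
        x = rng.expovariate(1.0)
        if error == "E1":
            eps = rng.gauss(0.0, 1.0)
        else:  # (E2): centered and scaled chi-squared(2), mean 0 and variance 1
            eps = (rng.gammavariate(1.0, 2.0) - 2.0) / 2.0
        y = 0.5 * x + eps
        p_resp = 1.0 / (1.0 + math.exp(1.0 - x))
        delta = 1 if rng.random() < p_resp else 0
        population.append((x, y, delta))
    return population

def poisson_sample(population, n, rng):
    """Poisson sampling with pi_i proportional to z_i = max(0.5 y_i + 2, 1) + u_i."""
    z = [max(0.5 * y + 2.0, 1.0) + rng.gauss(0.0, 1.0) ** 2
         for (_, y, _) in population]
    z_sum = sum(z)
    sample = []
    for (x, y, delta), z_i in zip(population, z):
        pi_i = min(1.0, n * z_i / z_sum)
        if rng.random() < pi_i:  # independent Bernoulli(pi_i) selection
            sample.append((x, y, delta, pi_i))
    return sample

rng = random.Random(2017)
population = generate_population(10000, rng, error="E1")
sample = poisson_sample(population, 200, rng)
response_rate = sum(delta for (_, _, delta) in population) / len(population)
```

Under Poisson sampling the realized sample size is random with expectation n, so it will usually be near but not equal to 200.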

From each sample, we compute the following six estimators:

  1. The complete-case (CC) estimator, based on the complete cases only. The CC estimator is the solution to $\sum_{i \in A} \delta_i \pi_i^{-1} U(\eta; x_i, y_i) = 0$, where U(η; x, y) is the corresponding estimating function for each parameter.

  2. The full sample estimator (Full), based on the original sample without missing data and the pseudo empirical likelihood method. Specifically, we maximize $l = \sum_{i \in A} \pi_i^{-1} \log(\omega_i)$, subject to the following constraints
    $$\sum_{i \in A} \omega_i = 1, \qquad \sum_{i \in A} \omega_i \hat\varepsilon_i = 0,$$
    where $\hat\varepsilon_i = y_i - \hat\beta_0 - \hat\beta_1 x_i$ and $(\hat\beta_0, \hat\beta_1)$ is obtained by solving the following estimating equation:
    $$\sum_{i \in A} \pi_i^{-1} (y_i - \beta_0 - \beta_1 x_i)(1, x_i)^{T} = 0.$$

    The full sample estimator serves as a benchmark for comparison.

  3. The parametric fractional imputation (PFI) estimator of Kim (2011) assuming yi | xi ~ N(β0 + β1xi, σ²) with imputation size M = 100.

  4. The nonparametric fractional imputation (NFI) estimator that uses the following nonparametric fractional weights:
    $$\omega_{ij} = \frac{K_x\{h_x^{-1}(x_i - x_j)\}}{\sum_{k \in A} \delta_k K_x\{h_x^{-1}(x_i - x_k)\}}$$
    for each unit i ∈ A with δi = 0 and j ∈ A with δj = 1. We use the reference bandwidth $h_x = 1.06\, \hat N^{-1/5} \hat\sigma_x$ with $\hat\sigma_x = \{(\hat N - 1)^{-1} \sum_{i \in A} \pi_i^{-1} (x_i - \hat\mu_x)^2\}^{1/2}$, $\hat\mu_x = \hat N^{-1} \sum_{i \in A} \pi_i^{-1} x_i$ and $\hat N = \sum_{i \in A} \pi_i^{-1}$. The Gaussian kernel $K_x(t) = (2\pi)^{-1/2} \exp(-t^2/2)$ is used.
  5. The stochastic regression imputation (SRI) estimator assuming the following model: yi = β0 + β1xi + εi with E(εi) = 0 and V(εi) = σ².

  6. The proposed semiparametric fractional imputation (SFI) estimator $\hat\theta_{SFI}$.
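The nonparametric fractional weights and the reference bandwidth used by the NFI estimator in item 4 can be sketched as follows. The function names are hypothetical and the code is only a direct transcription of the two formulas with a Gaussian kernel; it assumes at least one donor has a non-negligible kernel value, otherwise the normalizing sum underflows to zero.

```python
import math

def gaussian_kernel(t):
    """Standard normal density K(t) = (2 pi)^(-1/2) exp(-t^2 / 2)."""
    return math.exp(-0.5 * t * t) / math.sqrt(2.0 * math.pi)

def reference_bandwidth(x, pi):
    """h_x = 1.06 * N_hat^(-1/5) * sigma_hat_x with design-weighted moments."""
    d = [1.0 / p for p in pi]
    n_hat = sum(d)
    mu = sum(di * xi for di, xi in zip(d, x)) / n_hat
    var = sum(di * (xi - mu) ** 2 for di, xi in zip(d, x)) / (n_hat - 1.0)
    return 1.06 * n_hat ** (-0.2) * math.sqrt(var)

def nfi_weights(x_i, x_resp, h):
    """Fractional weights of respondent donors for a nonrespondent at x_i."""
    k = [gaussian_kernel((x_i - x_j) / h) for x_j in x_resp]
    k_sum = sum(k)
    return [k_j / k_sum for k_j in k]

h = reference_bandwidth([0.0, 1.0, 2.0], [0.5, 0.5, 0.5])
weights_sym = nfi_weights(0.0, [-1.0, 1.0], h)
weights_near = nfi_weights(0.0, [0.1, 2.0], h)
```

Donors equidistant from the nonrespondent receive equal weights, and closer donors receive larger weights, in contrast to the SFI weights, which do not depend on the recipient's covariate.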

From the Monte Carlo sample of size B = 2,000, the Monte Carlo bias, standard error and root mean squared error are computed for each point estimator. The results are presented in Table 1. Under both (E1) and (E2), the CC estimators perform worst since the response mechanism is not missing completely at random (MCAR). Unless the response mechanism is MCAR, the CC estimator is biased. The FULL estimators always perform best since they involve no missing values and use moment condition (1). Under distribution (E1), the SFI and PFI estimators have similar performances, and among the four imputation estimators, the NFI and SRI estimators perform worst in terms of RMSE since they use less information.
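The three Monte Carlo summaries can be computed as in the short sketch below. The divisor B − 1 for the standard error is an assumption, since the text does not say which divisor was used, and the function name is hypothetical.

```python
import math

def monte_carlo_summary(estimates, truth):
    """Monte Carlo bias, standard error and RMSE of a list of estimates."""
    B = len(estimates)
    mean = sum(estimates) / B
    bias = mean - truth
    se = math.sqrt(sum((e - mean) ** 2 for e in estimates) / (B - 1))
    rmse = math.sqrt(sum((e - truth) ** 2 for e in estimates) / B)
    return bias, se, rmse

bias, se, rmse = monte_carlo_summary([1.0, 2.0, 3.0], truth=2.0)

# RMSE^2 is approximately bias^2 + SE^2; for example, the CC row for E(y)
# under (E1) in Table 1 reports bias 18.6 and SE 13.0.
cc_rmse_check = round(math.hypot(18.6, 13.0), 1)
```

The last line reproduces the reported RMSE of 22.7 for that row from its bias and standard error, which is a quick consistency check on the table entries.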

Table 1.

The Monte Carlo bias ($\times 10^{-2}$), standard error (SE) ($\times 10^{-2}$) and root mean squared error (RMSE) ($\times 10^{-2}$) for six different methods with two error distributions in Simulation One.

                        (E1)                     (E2)
Par        Method   Bias    SE    RMSE      Bias    SE    RMSE
E(y)       CC       18.6   13.0   22.7      20.0   19.8   28.1
           FULL      0.1    5.9    5.9       0.3    9.0    9.0
           PFI      −0.1    7.8    7.8       0.7   12.0   12.0
           NFI       0.7   12.9   12.9       2.9   21.3   21.5
           SRI      −0.2   13.9   13.9       1.7   23.3   23.3
           SFI      −0.2    6.9    6.9       0.3   12.4   12.4

Pr(y < 1)  CC       −6.3    5.3    8.2      −3.5    4.8    5.9
           FULL      0.0    3.0    3.0       0.0    2.6    2.6
           PFI       0.2    3.1    3.1      −5.4    3.1    6.2
           NFI      −0.1    5.1    5.1      −0.5    4.9    4.9
           SRI       0.2    5.0    5.0      −0.4    5.0    5.0
           SFI       0.1    3.2    3.2       0.1    3.3    3.3

Quantile   CC       15.0   13.9   20.4      21.9   22.8   31.6
           FULL      0.1    7.8    7.8       0.2   13.2   13.2
           PFI      −0.3    8.5    8.5      30.0   14.5   33.3
           NFI      −0.8   14.9   14.9       2.7   25.3   25.4
           SRI      −1.0   15.7   15.7       2.4   25.4   25.5
           SFI      −0.5    9.1    9.1       0.2   17.0   17.0

Under model (E2), the SFI estimator shows negligible bias for all parameters, but the PFI estimator has non-negligible bias for estimating the proportion and the quantile, which is due to the misspecification of the error distribution. The NFI and SRI estimators are not as efficient as the SFI estimator in terms of bias and variance. The SFI estimator has smaller RMSE than the NFI and SRI estimators for all three parameters and than the PFI estimator for the proportion and the quantile; for the mean, the PFI and SFI estimators have comparable RMSE (12.0 versus 12.4). The overall results indicate the robustness of SFI. For variance estimation, we computed the relative bias of the variance estimators based on the Taylor linearization and replication methods, respectively. All the relative biases are below 7%. In addition, we calculated the Monte Carlo coverage rates of the 95% confidence intervals. Under model (E1), the coverage rates are 94.8%, 93.4% and 95.0% for estimating the mean, proportion and quantile using the Taylor method and 94.9%, 93.6% and 95.1% using the replication method. The results under model (E2) are similar and the coverage rates are close to the nominal rate.

6.2 Simulation Two

In the second simulation study, we use the 2013–2014 U.S. National Health and Nutrition Examination Survey (NHANES) data as a pseudo finite population. The study variable is systolic blood pressure (BPXSY1) and the covariate is body mass index (BMXBMI). Keeping only the cases where both BPXSY1 and BMXBMI are greater than zero, the pseudo finite population contains 7104 cases. The scatter plot of BPXSY1 versus BMXBMI is presented in Figure 1. We assume BPXSY1 is roughly linear in BMXBMI. After fitting a linear regression of BPXSY1 on BMXBMI, the QQ plot of the residuals and the plot of residuals versus fitted values are presented in Figure 2. The residual plots suggest deviation from normality. The p-value from the Anderson-Darling test for normality is less than $2.2 \times 10^{-16}$. We first generate response indicators δi, i = 1, 2, …, 7104, from the following logistic regression model:

$$\Pr(\delta_i = 1 \mid \mathrm{BMXBMI}_i) = \frac{\exp\{1 - 0.1 \log(\mathrm{BMXBMI}_i)\}}{1 + \exp\{1 - 0.1 \log(\mathrm{BMXBMI}_i)\}}.$$

Figure 1. Scatter plot of BPXSY1 vs BMXBMI

Figure 2. QQ plot (left panel) and residual vs fitted value plot (right panel)

The response rate is around 60%. Then, given (BPXSY1i, BMXBMIi, δi), B = 2,000 Monte Carlo samples are generated by simple random sampling with sample size n = 200. The parameters of interest are:

  • (Mean). Finite population mean of BPXSY1, which is θm = 118.056.

  • (Prop1). Finite population proportion one of BPXSY1:
    $$\theta_{p1} = \frac{1}{N} \sum_{i=1}^{N} I(\mathrm{BPXSY1}_i < 80) = 0.0008.$$
  • (Prop2). Finite population proportion two of BPXSY1:
    $$\theta_{p2} = \frac{1}{N} \sum_{i=1}^{N} I(\mathrm{BPXSY1}_i < 120) = 0.6017.$$
  • (Prop3). Finite population proportion three of BPXSY1:
    $$\theta_{p3} = \frac{1}{N} \sum_{i=1}^{N} I(\mathrm{BPXSY1}_i < 160) = 0.9711.$$

We consider the same PFI, NFI, SRI and SFI estimators as discussed in Simulation One. The Monte Carlo bias, standard error and root mean squared error (RMSE) are presented in Table 2. For the population mean, PFI and SFI perform similarly and the NFI estimator has slightly larger bias and standard error. SRI has bias comparable to PFI and SFI, but it has larger SE, as expected. For the population proportions, the PFI estimator has substantially larger bias than NFI, SRI and SFI, which may be due to the misspecification of the error distribution. The NFI and SRI estimators tend to have larger standard errors than the PFI and SFI estimators, since nonparametric methods are not as efficient as parametric or semiparametric methods and stochastic imputation produces larger variance. Overall, the SFI estimator has the smallest or close to the smallest RMSE among the imputation estimators for every parameter.

Table 2.

The Monte Carlo bias ($\times 10^{-2}$), standard error (SE) ($\times 10^{-2}$) and root mean squared error (RMSE) ($\times 10^{-2}$) for five different methods and four parameters.

Par      Method   Bias     SE     RMSE
Mean     COM      −2.9   124.8   124.9
         PFI      −2.3   153.2   153.2
         NFI      −5.0   153.5   153.6
         SRI       1.4   169.7   169.7
         SFI      −2.2   153.3   153.3

Prop1    COM       0.0     0.2     0.2
         PFI       0.5     0.3     0.6
         NFI       0.0     0.3     0.3
         SRI       0.1     0.3     0.3
         SFI       0.0     0.2     0.2

Prop2    COM       0.0     3.4     3.4
         PFI      −2.2     3.8     4.4
         NFI      −0.5     4.2     4.3
         SRI       0.5     4.2     4.3
         SFI       0.2     3.9     3.9

Prop3    COM       0.0     1.2     1.2
         PFI       0.7     1.1     1.3
         NFI       0.2     1.4     1.4
         SRI      −0.3     1.6     1.6
         SFI       0.1     1.4     1.4

7 Conclusions

Regression imputation is often used to handle item nonresponse in survey sampling. Unlike the usual regression imputation, the proposed semiparametric fractional imputation offers valid inference for a wide set of parameters such as population proportions and quantiles. In addition, only the first moment assumption is needed to obtain a consistent SFI estimator of the parameter, which leads to robust parameter estimation. The proposed SFI method performs well in the limited simulation studies.

The proposed method suggests several directions for future research. First, instead of assuming an ignorable response mechanism, we can consider an extension to nonignorable nonresponse (Kim and Yu, 2011) using an exponential tilting response model. Second, extending the SFI to handle multivariate missing data will be an important future research topic.

Appendix

A: Proof of Theorem 1

We first assume the following regularity conditions:

  • (C1)

    The finite population is a random sample from the semiparametric regression model in (1). The regression function m(x; β) in (1) has a continuous first derivative ∂m(x; β)/∂β in the neighborhood of the true value β0, and E{m²(x; β)} and E{∂m(x; β)/∂β} are bounded in this neighborhood.

  • (C2)

    Function h(x; β) in the estimating function Ûβ(β) in (8) has a continuous first derivative ∂h(x; β)/∂β in the neighborhood of the true value β0, and ‖h(x; β)‖² and ‖∂h(x; β)/∂β‖ are bounded by some integrable function G1(x) in the neighborhood.

  • (C3)

    The model error term in (1) satisfies E(ε²) < ∞ and $\max\{\|\varepsilon_i\| : i \in A\} = o_p(n^{1/2})$.

  • (C4)

    Let Uβ(β) = E[δ{y − m(x; β)}h(x; β)] and assume that Ûβ(β) converges to Uβ(β) in probability uniformly in the neighborhood of the true value β0. For every a > 0, $\inf_{\beta : \|\beta - \beta_0\| \ge a} \|U_\beta(\beta)\| > 0 = \|U_\beta(\beta_0)\|$.

  • (C5)

    ∂Ûβ(β)/∂β converges to a continuous nonsingular derivative ∂Uβ(β)/∂β in probability uniformly in the neighborhood of the true value β0.

  • (C6)

    $\sqrt{n}\, \hat U_\beta(\beta_0) \to N(0, \Sigma_\beta)$ as n, N → ∞, where $\Sigma_\beta = V\{\sqrt{n}\, \hat U_\beta(\beta_0)\}$ denotes the design-model variance, that is, the variance under the joint distribution of the superpopulation model and the sampling mechanism.

  • (C7)

    Function U(η; x, y) has continuous partial derivatives ∂U(η; x, y)/∂η and ∂U(η; x, y)/∂y in the neighborhood of the true value η0, and ‖U(η; x, y)‖², ‖∂U(η; x, y)/∂η‖ and ‖∂U(η; x, y)/∂y‖ are bounded by some integrable function G2(x, y) in the neighborhood.

  • (C8)

    Let $\hat U_n(\eta) = N^{-1} \sum_{i \in A} \pi_i^{-1} U(\eta; x_i, y_i)$ and U(η) = E{U(η; xi, yi)}. Then Ûn(η) converges to U(η) in probability uniformly in the neighborhood of the true value η0. For every a > 0, $\inf_{\eta : \|\eta - \eta_0\| \ge a} \|U(\eta)\| > 0 = \|U(\eta_0)\|$.

  • (C9)

    ∂Ûn(η)/∂η converges to a continuous nonsingular derivative ∂U(η)/∂η in probability uniformly in the neighborhood of the true value η0.

  • (C10)

    $\sqrt{n}\, \hat U_n(\eta_0) \to N(0, \Sigma_\eta)$ as n, N → ∞, where $\Sigma_\eta = V\{\sqrt{n}\, \hat U_n(\eta_0)\}$ denotes the design-model variance.

  • (C11)

    The first-order inclusion probabilities satisfy $K_L \le N n^{-1} \pi_i \le K_U$ for all i, where KL and KU are positive constants.

  • (C12)

    $\max_{i,j} |\pi_{ij} \pi_i^{-1} \pi_j^{-1} - 1| = o(1)$ for any i, j = 1, 2, …, N with i ≠ j, where πij is the second-order inclusion probability of units i and j in the population.

  • (C13)

    The response probability satisfies (2) and a < Pr(δi = 1 | xi) ≤ 1 for i = 1, 2, …, N for some fixed a > 0.

Conditions (C1)–(C2) are the model assumptions about the finite population. Condition (C3) is used to control the asymptotic order of λ^ in (10). Chen and Sitter (1999, Appendix 2) argued that (C3) holds for common unequal probability sampling designs. Conditions (C4) and (C8) ensure the consistency of β^ and η^, respectively. Conditions (C5), (C6), (C9) and (C10) are the regularity conditions that ensure asymptotic normality of β^ and η^. Van der Vaart (1998, Ch. 5) used similar regularity conditions. Specifically, Conditions (C6) and (C10) have been used in many existing literature such as Wu and Rao (2006), Wang and Opsomer (2011), among others. Hajek (1960, 1964) established the asymptotic normality condition under simple random sampling and rejective sampling with unequal selection probabilities. Visek (1979) established the asymptotic normality for the Horvitz-Thompson estimator under Rao-Sampford sampling designs. Condition (C7) controls the smoothness and asymptotic behavior of estimating function U(η; x, y). Conditions (C11) and (C12) are the standard assumptions for the sampling designs. Similar conditions have been used in Isaki and Fuller (1982) and Wang and Opsomer (2011). Condition (C13) controls the behavior of the individual response probability. According to assumption (C3) and by using similar techniques as Wu and Rao (2006), we can show that λ^=Op(n1/2). Assumption (C4) and Taylor linearization can establish

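$$0 = \hat{U}_\beta(\hat\beta) = \hat{U}_\beta(\beta_0) + \frac{\partial \hat{U}_\beta(\beta_0)}{\partial\beta}(\hat\beta - \beta_0) + o_p(n^{-1/2}).$$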

Therefore,

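$$\hat\beta - \beta_0 = -\left[E\left\{\frac{\partial \hat{U}_\beta(\beta_0)}{\partial\beta}\right\}\right]^{-1}\hat{U}_\beta(\beta_0) + o_p(n^{-1/2}) = \left[E\left\{\delta\, h(x;\beta_0)\frac{\partial m(x;\beta_0)}{\partial\beta}\right\}\right]^{-1}\frac{1}{N}\sum_{i\in A}\pi_i^{-1}\delta_i\varepsilon_i h(x_i;\beta_0) + o_p(n^{-1/2}). \tag{A.1}$$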

We know that λ^ is the solution of the following estimating equation

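$$\hat{U}_\lambda(\lambda,\hat\beta) = \frac{1}{N}\sum_{i\in A}\delta_i\pi_i^{-1}\frac{\hat\varepsilon_i}{1+\lambda\hat\varepsilon_i} = 0.$$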
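The estimating equation for λ has a single root on the interval where every 1 + λε̂i is positive, provided the residuals take both signs, because its left side is strictly decreasing in λ there. A minimal sketch of solving it by bisection follows. The residuals and design weights are simulated placeholders, `solve_lambda` is a hypothetical helper name, and the weight formula in the code is the standard pseudo empirical likelihood form, which is assumed here rather than quoted from (10).

```python
import numpy as np

def solve_lambda(eps, d, tol=1e-12, max_iter=200):
    """Bisection for the root of g(lam) = sum_i d_i * eps_i / (1 + lam * eps_i)."""
    if not (eps.min() < 0 < eps.max()):
        raise ValueError("residuals must take both signs for a root to exist")
    # g is strictly decreasing on the interval where every 1 + lam * eps_i > 0.
    lo = -1.0 / eps.max() * (1.0 - 1e-8)
    hi = -1.0 / eps.min() * (1.0 - 1e-8)
    for _ in range(max_iter):
        mid = 0.5 * (lo + hi)
        if np.sum(d * eps / (1.0 + mid * eps)) > 0:
            lo = mid      # g(mid) > 0, so the root lies to the right
        else:
            hi = mid
        if hi - lo < tol:
            break
    return 0.5 * (lo + hi)

rng = np.random.default_rng(2017)
eps_hat = rng.normal(loc=0.1, scale=1.0, size=300)  # hypothetical respondent residuals
d = np.full(eps_hat.size, 5.0)                      # hypothetical design weights 1/pi_i

lam_hat = solve_lambda(eps_hat, d)
w = d / d.sum() / (1.0 + lam_hat * eps_hat)         # assumed empirical likelihood weights

print("lambda_hat:", lam_hat)
print("sum of weights:", w.sum())
print("weighted residual mean:", np.sum(w * eps_hat))
```

At the root the weights are positive, sum to one, and give the residuals a weighted mean of zero, which is the moment constraint the equation encodes.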

In addition, we have

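$$\frac{\partial \hat{U}_\lambda(0,\beta_0)}{\partial\lambda} = -\frac{1}{N}\sum_{i\in A}\delta_i\pi_i^{-1}\varepsilon_i^2 = -E(\delta)\sigma^2 + o_p(1), \tag{A.2}$$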

and

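$$\frac{\partial \hat{U}_\lambda(0,\beta_0)}{\partial\beta} = -\frac{1}{N}\sum_{i\in A}\delta_i\pi_i^{-1}\frac{\partial m(x_i;\beta_0)}{\partial\beta} = -E\left\{\frac{\partial m(x;\beta_0)}{\partial\beta}\,\Big|\,\delta=1\right\}E(\delta) + o_p(1). \tag{A.3}$$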

Based on (A.2), (A.3), by using Taylor linearization, we have

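$$0 = \hat{U}_\lambda(\hat\lambda,\hat\beta) = \hat{U}_\lambda(0,\beta_0) + \frac{\partial \hat{U}_\lambda(0,\beta_0)}{\partial\lambda}\hat\lambda + \frac{\partial \hat{U}_\lambda(0,\beta_0)}{\partial\beta}(\hat\beta-\beta_0) + o_p(n^{-1/2}). \tag{A.4}$$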

According to (A.1)–(A.4) and after some algebra, it can be shown that

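$$\hat\lambda = \frac{1}{\sigma^2}\frac{1}{N}\sum_{i\in A}\pi_i^{-1}\frac{\delta_i}{E(\delta)}\varepsilon_i - \frac{1}{\sigma^2}E\left\{\frac{\partial m(x;\beta_0)}{\partial\beta}\,\Big|\,\delta=1\right\}\times\left[E\left\{h(x;\beta_0)\frac{\partial m(x;\beta_0)}{\partial\beta}\,\Big|\,\delta=1\right\}\right]^{-1}\frac{1}{N}\sum_{i\in A}\pi_i^{-1}\frac{\delta_i}{E(\delta)}\varepsilon_i h(x_i;\beta_0) + o_p(n^{-1/2}), \tag{A.5}$$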

where $\sigma^2$ is the variance of the residuals. With condition (C6), it can be shown that $\hat\eta = \eta_0 + o_p(1)$. In addition, we have

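$$\frac{\partial \hat{U}_\eta(\eta_0,\beta_0,0)}{\partial\lambda} = -\frac{1}{N}\sum_{i\in A}\pi_i^{-1}(1-\delta_i)\sum_{j\in A}\frac{\delta_j\pi_j^{-1}}{\sum_{k\in A}\delta_k\pi_k^{-1}}\,\varepsilon_j\,U(\eta_0;x_i,y_i^{*(j)}(\beta_0)) = -\{E(\delta)\}^{-1}E\{(1-\delta_i)\delta_j\varepsilon_j U(\eta_0;x_i,y_i^{*(j)}(\beta_0))\} + o_p(1) = -E\{\varepsilon\bar{U}_m(\varepsilon)\} + o_p(1), \tag{A.6}$$

$$\frac{\partial \hat{U}_\eta(\eta_0,\beta_0,0)}{\partial\beta} = \frac{1}{N}\sum_{i\in A}\frac{1-\delta_i}{\pi_i}\sum_{j\in A}\frac{\delta_j\pi_j^{-1}}{\sum_{k\in A}\delta_k\pi_k^{-1}}\,\frac{\partial U(\eta_0;x_i,y_i^{*(j)})}{\partial y}\left\{\frac{\partial m(x_i;\beta_0)}{\partial\beta}-\frac{\partial m(x_j;\beta_0)}{\partial\beta}\right\} = \{E(\delta)\}^{-1}E\left[(1-\delta_i)\delta_j\frac{\partial U(\eta_0;x_i,y_i^{*(j)})}{\partial y}\left\{\frac{\partial m(x_i;\beta_0)}{\partial\beta}-\frac{\partial m(x_j;\beta_0)}{\partial\beta}\right\}\right] + o_p(1) = D + o_p(1), \tag{A.7}$$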

and

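$$\frac{\partial \hat{U}_\eta(\eta_0,\beta_0,0)}{\partial\eta} = \frac{1}{N}\sum_{i\in A}\frac{\delta_i}{\pi_i}\frac{\partial U(\eta_0;x_i,y_i)}{\partial\eta} + \frac{1}{N}\sum_{i\in A}\frac{1-\delta_i}{\pi_i}\sum_{j\in A}\frac{\delta_j\pi_j^{-1}}{\sum_{k\in A}\delta_k\pi_k^{-1}}\,\frac{\partial U(\eta_0;x_i,y_i^{*(j)})}{\partial\eta} = E\left\{\delta\frac{\partial U(\eta_0;x,y)}{\partial\eta}\right\} + E\left\{(1-\delta)\frac{\partial U(\eta_0;x,y)}{\partial\eta}\right\} + o_p(1) = E\left\{\frac{\partial U(\eta_0;x,y)}{\partial\eta}\right\} + o_p(1), \tag{A.8}$$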

where Ūm(ε) = E{(1 − δ) U(η0; x, y)|ε} and

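$$D = E\left((1-\delta)\,U(\eta_0;x,y)\left[\frac{\partial m(x;\beta_0)}{\partial\beta} - E\left\{\frac{\partial m(x;\beta_0)}{\partial\beta}\,\Big|\,\delta=1\right\}\right]l(\varepsilon)\right),$$

with l(ε) = −f′(ε)/f(ε). Define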

$$S = \frac{1}{N(N-1)E(\delta)}\sum_{i\in A}\omega_i(1-\delta_i)\sum_{j\in A,\,j\ne i}\omega_j\delta_j\,U(\eta_0;x_i,y_i^{*(j)}(\beta_0)),$$

then by using Taylor linearization,

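$$\hat{U}_\eta(\eta_0,\beta_0,0) = \frac{1}{N}\sum_{i\in A}\pi_i^{-1}\delta_i U(\eta_0;x_i,y_i) + E(S) + S - E(S) - \frac{E(S)}{E(\delta)}\{\bar\delta_N - E(\delta)\} + o_p(n^{-1/2}) = \frac{1}{N}\sum_{i\in A}\pi_i^{-1}\delta_i U(\eta_0;x_i,y_i) + S - \frac{E(S)}{E(\delta)}\{\bar\delta_N - E(\delta)\} + o_p(n^{-1/2}),$$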

with E(S) = E{(1 − δ)U(η0; x, y)} and $\bar\delta_N = N^{-1}\sum_{i\in A}\delta_i\pi_i^{-1}$. According to the Hoeffding decomposition,

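$$S = \frac{1}{N(N-1)E(\delta)}\sum_{i\in A}\pi_i^{-1}(1-\delta_i)\sum_{j\in A,\,j\ne i}\pi_j^{-1}\delta_j\,U\{\eta_0;x_i,y_i^{*(j)}(\beta_0)\} = \frac{1}{N}\sum_{i\in A}\left[\pi_i^{-1}(1-\delta_i)E\{U(\eta_0;x_i,y_i)\mid x_i\} + \pi_i^{-1}\frac{\delta_i}{E(\delta)}E\{(1-\delta_i)U(\eta_0;x_i,y_i)\mid\varepsilon_i\}\right] - E(S) + o_p(n^{-1/2}).$$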

Therefore,

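$$\hat{U}_\eta(\eta_0,\beta_0,0) = \frac{1}{N}\sum_{i\in A}\pi_i^{-1}\delta_i U(\eta_0;x_i,y_i) + \frac{1}{N}\sum_{i\in A}\left[\pi_i^{-1}(1-\delta_i)E\{U(\eta_0;x_i,y_i)\mid x_i\} + \pi_i^{-1}\frac{\delta_i}{E(\delta)}E\{(1-\delta_i)U(\eta_0;x_i,y_i)\mid\varepsilon_i\}\right] - \frac{E(S)}{E(\delta)}\bar\delta_N + o_p(n^{-1/2}). \tag{A.9}$$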
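The projection step behind (A.9) can be illustrated numerically in a toy setting. The sketch below is not the paper's simulation. It assumes equal design weights (so the πi cancel), completely random response with E(δ) = 0.7, m(x) = x with x ~ N(1, 1), standard normal errors, and U(x, y) = y², chosen only because the conditional expectations have closed forms. It compares the double sum S with its Hoeffding projection.

```python
import numpy as np

rng = np.random.default_rng(1)
n, p, sigma = 1500, 0.7, 1.0

x = rng.normal(1.0, 1.0, n)        # covariate with E(x) = 1 and E(x^2) = 2
eps = rng.normal(0.0, sigma, n)    # regression errors, independent of x
delta = rng.random(n) < p          # response indicators with E(delta) = p

# Kernel U(x_i, m(x_i) + eps_j) for the toy choices m(x) = x and U(x, y) = y^2.
kernel = (x[:, None] + eps[None, :]) ** 2
pair_w = (~delta)[:, None] * delta[None, :]   # (1 - delta_i) * delta_j, zero when i == j
S = np.sum(pair_w * kernel) / (n * (n - 1) * p)

# Closed-form pieces of the projection in this toy model.
E_S = (1 - p) * (2.0 + sigma**2)                               # E{(1 - delta) U}
term_nonresp = (~delta) * (x**2 + sigma**2)                    # (1 - delta_i) E{U | x_i}
term_resp = delta / p * (1 - p) * (2.0 + 2.0 * eps + eps**2)   # delta_i / E(delta) * E{(1 - delta) U | eps_i}
S_proj = np.mean(term_nonresp + term_resp) - E_S

print("S - E(S):         ", S - E_S)
print("projection - E(S):", S_proj - E_S)
print("remainder:        ", S - S_proj)
```

In this setting the remainder S minus its projection is of smaller order than n^{-1/2}, which is what allows the double sum to be replaced by a single sum in (A.9).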

According to Taylor linearization, we have

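$$0 = \hat{U}_\eta(\hat\eta,\hat\beta,\hat\lambda) = \hat{U}_\eta(\eta_0,\beta_0,0) + \frac{\partial \hat{U}_\eta(\eta_0,\beta_0,0)}{\partial\eta}(\hat\eta-\eta_0) + \frac{\partial \hat{U}_\eta(\eta_0,\beta_0,0)}{\partial\beta}(\hat\beta-\beta_0) + \frac{\partial \hat{U}_\eta(\eta_0,\beta_0,0)}{\partial\lambda}\hat\lambda + o_p(n^{-1/2}). \tag{A.10}$$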
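Before the pieces are combined, the leading term of (A.5) can be compared with the exact root of the λ equation in a simple simulated example. The setup is an assumption made for illustration only: equal design weights, completely random response with E(δ) = 0.7, m(x; β) = βx, h(x; β) = x, x ~ N(1, 1), and standard normal errors, so that E{∂m/∂β | δ = 1} = 1 and E{h ∂m/∂β | δ = 1} = 2. The helper `solve_lambda` is hypothetical.

```python
import numpy as np

def solve_lambda(eps, tol=1e-13, max_iter=200):
    """Bisection root of sum_i eps_i / (1 + lam * eps_i) = 0 (equal design weights)."""
    lo = -1.0 / eps.max() * (1.0 - 1e-8)
    hi = -1.0 / eps.min() * (1.0 - 1e-8)
    for _ in range(max_iter):
        mid = 0.5 * (lo + hi)
        if np.sum(eps / (1.0 + mid * eps)) > 0:
            lo = mid      # the sum is decreasing in lam, so the root lies to the right
        else:
            hi = mid
        if hi - lo < tol:
            break
    return 0.5 * (lo + hi)

rng = np.random.default_rng(7)
n, p, beta0, sigma2 = 20000, 0.7, 2.0, 1.0

x = rng.normal(1.0, 1.0, n)                  # E(x) = 1, E(x^2) = 2
eps = rng.normal(0.0, np.sqrt(sigma2), n)
y = beta0 * x + eps                          # m(x; beta) = beta * x, so dm/dbeta = x
delta = rng.random(n) < p                    # response indicators with E(delta) = p

# beta_hat solves sum_i delta_i (y_i - beta x_i) h(x_i) = 0 with h(x) = x.
xr, yr = x[delta], y[delta]
beta_hat = np.sum(xr * yr) / np.sum(xr**2)
lam_hat = solve_lambda(yr - beta_hat * xr)   # exact root using the fitted residuals

# Leading term of (A.5) in this toy model: the factor 0.5 is E(x) / E(x^2).
lam_lin = np.mean(delta / p * eps * (1.0 - 0.5 * x)) / sigma2

print("lambda_hat:   ", lam_hat)
print("linearization:", lam_lin)
```

The two values should differ by a term that is small relative to n^{-1/2}, consistent with the remainder in (A.5).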

By (A.1), (A.5)–(A.10), after some algebra, we can show that

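$$\hat\eta - \eta_0 = -\left\{E\left(\frac{\partial U(\eta_0;x,y)}{\partial\eta}\right)\right\}^{-1}\left(\frac{1}{N}\sum_{i\in A}\pi_i^{-1}\zeta_i\right) + o_p(n^{-1/2}),$$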

where ζi is defined in (12) of Theorem 1.
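The term D in the proof replaces a derivative with respect to y by the function l(ε). Reading l(ε) as the score −f′(ε)/f(ε) of the error density f (an assumption about the intended notation), this is the integration by parts identity E{g′(ε)} = E{g(ε)l(ε)}. The sketch below checks the identity by Gauss-Hermite quadrature for normal errors, where l(ε) = ε/σ²; the function g and the value of σ are hypothetical choices.

```python
import numpy as np
from numpy.polynomial.hermite_e import hermegauss

sigma = 1.5
nodes, weights = hermegauss(60)            # quadrature rule for the weight exp(-z^2 / 2)
weights = weights / np.sqrt(2.0 * np.pi)   # rescale so the rule integrates against N(0, 1)
eps = sigma * nodes                        # evaluation points for eps ~ N(0, sigma^2)

def expect(values):
    """Approximate E{values(eps)} for eps ~ N(0, sigma^2) by Gauss-Hermite quadrature."""
    return float(np.sum(weights * values))

score = eps / sigma**2                     # l(eps) = -f'(eps) / f(eps) for the normal density

g = np.sin(eps) + eps**3                   # hypothetical smooth function of the error
dg = np.cos(eps) + 3.0 * eps**2            # its derivative with respect to eps

lhs = expect(dg)                           # E{g'(eps)}
rhs = expect(g * score)                    # E{g(eps) l(eps)}
print("E{g'(eps)}      =", lhs)
print("E{g(eps) l(eps)} =", rhs)
```

For this choice both sides equal exp(−σ²/2) + 3σ², which can be verified directly from the normal moments.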

B: Proof of Theorem 2

We replace regularity conditions (C7)–(C10) in Appendix A with the following regularity conditions (C14)–(C17):

  • (C14)

    Ũn(θ) converges to Ũ(θ) in probability uniformly in the neighborhood of the true value θ0. For every a > 0, $\inf_{\theta:\|\theta-\theta_0\|\ge a}\|\tilde{U}(\theta)\| > 0 = \|\tilde{U}(\theta_0)\|$.

  • (C15)

    There exists a measurable function L(δ, x, y) with E{L²(δ, x, y)} < ∞ such that, for every θ1 and θ2 in the neighborhood of the true value θ0, ‖Ũ(θ1; δ, x, y) − Ũ(θ2; δ, x, y)‖ ≤ L(δ, x, y)‖θ1 − θ2‖.

  • (C16)

    Assume that $E\{\|\tilde{U}(\theta_0;\delta,x,y)\|^2\} < \infty$, that $E\{\tilde{U}(\theta;\delta,x,y)\}$ has continuous and invertible first derivatives with respect to θ, and that these first derivatives are bounded by some integrable function in the neighborhood of the true value θ0.

  • (C17)

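    $\sqrt{n}\,\tilde{U}_n(\theta_0) \to N(0, \Sigma_\theta)$ in distribution as n, N → ∞, where $\Sigma_\theta = V\{\sqrt{n}\,\tilde{U}_n(\theta_0)\}$ denotes the design model variance.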

Like conditions (C4) and (C8), condition (C14) ensures the consistency of the proposed estimator. Conditions (C15) and (C16) are required to derive the asymptotic expansion of the proposed estimator; see Van der Vaart (1998, Ch. 5) for more details on these conditions. Like conditions (C6) and (C10), condition (C17) is used to establish the central limit theorem.

The proof of the consistency of $\hat\beta$ and $\hat\eta$ is similar to the corresponding proof in Theorem 1. By the regularity conditions (C10), (C11), (C12), and using techniques similar to those of Theorem 19.26 of Van der Vaart (1998), we can show that

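$$0 = \hat{U}_\eta(\hat\eta,\hat\beta,\hat\lambda) = \hat{U}_\eta(\eta_0,\beta_0,0) + \frac{\partial \hat{U}_\eta(\eta_0,\beta_0,0)}{\partial\lambda}\hat\lambda + \frac{\partial E\{\hat{U}_\eta(\eta_0,\beta_0,0)\}}{\partial\beta}(\hat\beta-\beta_0) + \frac{\partial E\{\hat{U}_\eta(\eta_0,\beta_0,0)\}}{\partial\eta}(\hat\eta-\eta_0) + o_p(n^{-1/2}). \tag{B.1}$$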

In addition, we have

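$$\frac{\partial E\{\hat{U}_\eta(\eta_0,\beta_0,0)\}}{\partial\eta} = \frac{\partial E\{\delta U(\eta_0;x,y)\}}{\partial\eta} + \frac{1}{E(\delta)}\frac{\partial E\{(1-\delta_i)\delta_j U(\eta_0;x_i,y_i^{*(j)})\}}{\partial\eta} + o_p(1) = \frac{\partial E\{U(\eta_0;x,y)\}}{\partial\eta} + o_p(1), \tag{B.2}$$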

and

$$\frac{\partial E\{\hat{U}_\eta(\eta_0,\beta_0,0)\}}{\partial\beta} = D^* + o_p(1), \tag{B.3}$$

where D* is defined in Theorem 2. According to (A.1), (A.5), (A.6), (A.9), (B.1)–(B.3), we have

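$$\hat\eta_{SFI} - \eta_0 = -\left[\frac{\partial E\{U(\eta_0;x,y)\}}{\partial\eta}\right]^{-1}\left(\frac{1}{N}\sum_{i\in A}\pi_i^{-1}\zeta_i\right) + o_p(n^{-1/2}),$$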

where ζi is defined in Theorem 2.

References

  1. Chen J, Sitter R. A pseudo empirical likelihood approach to the effective use of auxiliary information in complex surveys. Statistica Sinica. 1999;9:385–406.
  2. Chauvet G, Deville JC, Haziza D. On balanced random imputation in surveys. Biometrika. 2011;98:459–471.
  3. Durrant GB. Imputation methods for handling item-nonresponse in the social sciences: a methodological review. ESRC National Centre for Research Methods and Southampton Statistical Sciences Research Institute, NCRM Methods Review Papers NCRM/002; 2005.
  4. Durrant GB, Skinner C. Using missing data methods to correct for measurement error in a distribution function. Survey Methodology. 2006;32(1):25–36.
  5. Fay RE. When are inferences from multiple imputation valid? Proceedings of the Survey Research Methods Section of the American Statistical Association. 1992;81:227–32.
  6. Fay RE. Alternative paradigms for the analysis of imputed survey data. Journal of the American Statistical Association. 1996;91(434):490–498.
  7. Fuller WA, Kim JK. Hot deck imputation for the response model. Survey Methodology. 2005;31:139–149.
  8. Fuller WA. Sampling Statistics. Wiley; Hoboken, NJ: 2009.
  9. Haziza D. Imputation and inference in the presence of missing data. In: Pfeffermann D, Rao CR, editors. Handbook of Statistics. Vol. 29, Sample Surveys: Theory, Methods and Inference. Amsterdam: Elsevier BV; 2009. pp. 215–46.
  10. Kalton G, Kish L. Some efficient random imputation methods. Communications in Statistics A. 1984;13:1919–1939.
  11. Kim JK, Fuller WA. Fractional hot deck imputation. Biometrika. 2004;91(3):559–578.
  12. Kim JK, Brick J, Fuller WA, Kalton G. On the bias of the multiple-imputation variance estimator in survey sampling. Journal of the Royal Statistical Society: Series B. 2006;68(3):509–521.
  13. Kim JK, Navarro A, Fuller WA. Replicate variance estimation after multi-phase stratified sampling. Journal of the American Statistical Association. 2006;101:312–320.
  14. Kim JK. Parametric fractional imputation for missing data analysis. Biometrika. 2011;98:119–132.
  15. Kim JK, Yu CL. A semi-parametric estimation of mean functionals with non-ignorable missing data. Journal of the American Statistical Association. 2011;106:157–165.
  16. Kim JK, Shao J. Statistical methods for handling incomplete data. London: Chapman and Hall/CRC; 2013.
  17. Kim JK, Yang S. Fractional hot deck imputation for robust inference under item nonresponse in survey sampling. Survey Methodology. 2014;40:211–230.
  18. Little RJA, Rubin DB. Statistical Analysis With Missing Data. 2nd ed. Hoboken, NJ: Wiley; 2002.
  19. Matloff NS. Use of regression functions for improved estimation of means. Biometrika. 1981;68:685–689.
  20. McCullagh P, Nelder J. Generalized Linear Models. London: Chapman and Hall; 1989.
  21. Meng XL. Multiple-imputation inferences with uncongenial sources of input. Statistical Science. 1994;9:538–558.
  22. Müller UU. Estimating linear functionals in nonlinear regression with response missing at random. Annals of Statistics. 2009;37:2245–2277.
  23. Owen AB. Empirical Likelihood. Chapman and Hall/CRC; New York: 2001.
  24. Qin J. Empirical likelihood in biased sample problems. Annals of Statistics. 1993;21(3):1182–1196.
  25. Qin J, Lawless J. Empirical likelihood and general estimating equations. The Annals of Statistics. 1994;22:300–325.
  26. Rubin DB. Multiple Imputation for Nonresponse in Surveys. Wiley; 1987.
  27. Shao J, Tu D. The Jackknife and Bootstrap. Springer; 1995.
  28. Shao J, Steel P. Variance estimation for survey data with composite imputation and nonnegligible sampling fractions. Journal of the American Statistical Association. 1999;94:254–265.
  29. Vardi Y. Empirical distributions in selection bias models. Annals of Statistics. 1985;13:178–203.
  30. Van der Vaart AW. Asymptotic Statistics. New York: Cambridge University Press; 1998.
  31. Víšek JA. Asymptotic distribution of simple estimate for rejective, Sampford and successive sampling. In: Jurečková J, editor. Contributions to Statistics: Jaroslav Hájek Memorial Volume. Academia, Prague & D. Reidel; Dordrecht: 1979. pp. 263–275.
  32. Wang N, Robins JM. Large-sample theory for parametric multiple imputation procedures. Biometrika. 1998;85(4):935–948.
  33. Wang Q, Rao JNK. Empirical likelihood-based inference under imputation for missing response data. The Annals of Statistics. 2002;30:896–924.
  34. Wang D, Chen SX. Empirical likelihood for estimating equations with missing values. The Annals of Statistics. 2009;37:490–517.
  35. Wang JQ, Opsomer JD. On asymptotic normality and variance estimation for nondifferentiable survey estimators. Biometrika. 2011;98:91–106.
  36. Wolter KM. Introduction to Variance Estimation. Wiley; New York: 2007.
  37. Wu C, Rao JNK. Pseudo empirical likelihood ratio confidence intervals for complex surveys. The Canadian Journal of Statistics. 2006;34:359–375.
  38. Yang S, Kim JK. A Note on Multiple Imputation for General-Purpose Estimation. Biometrika. 2016;103:244–251.
