Author manuscript; available in PMC: 2013 Jul 30.
Published in final edited form as: Stat Med. 2012 Jun 20;31(17):1821–1837. doi: 10.1002/sim.4467

Two-sample density-based empirical likelihood ratio tests based on paired data, with application to a treatment study of attention-deficit/hyperactivity disorder and severe mood dysregulation

Albert Vexler a,*, Wan-Min Tsai a, Gregory Gurevich b, Jihnhee Yu a
PMCID: PMC3445288  NIHMSID: NIHMS402040  PMID: 22714114

Abstract

It is a common practice to conduct medical trials to compare a new therapy with a standard-of-care based on paired data consisting of pre- and post-treatment measurements. In such cases, a great interest often lies in identifying treatment effects within each therapy group and detecting a between-group difference. In this article, we propose exact nonparametric tests for composite hypotheses related to treatment effects to provide efficient tools that compare study groups utilizing paired data. When correctly specified, parametric likelihood ratios can be applied, in an optimal manner, to detect a difference in distributions of two samples based on paired data. The recent statistical literature introduces density-based empirical likelihood methods to derive efficient nonparametric tests that approximate most powerful Neyman–Pearson decision rules. We adapt and extend these methods to deal with various testing scenarios involved in the two-sample comparisons based on paired data. We show that the proposed procedures outperform classical approaches. An extensive Monte Carlo study confirms that the proposed approach is powerful and can be easily applied to a variety of testing problems in practice. The proposed technique is applied to compare two therapy strategies to treat children’s attention-deficit/hyperactivity disorder and severe mood dysregulation.

Keywords: empirical likelihood, exact tests, likelihood ratio, nonparametric test, paired data, paired t-test, two-sample problem, Wilcoxon signed rank test

1. Introduction and Technical Preliminaries

Investigators in various fields of medical studies often deal with paired data to compare different population groups. In this article, we propose a paired data-based methodology motivated by the following comparative study of attention-deficit/hyperactivity disorder (ADHD) and severe mood dysregulation (SMD). ADHD is a commonly diagnosed psychiatric disorder in children (e.g., [1, 2]). SMD is a diagnostic label recently created by Leibenluft’s laboratory in the National Institute of Mental Health’s intramural program to refer to children with an abnormal baseline mood, hyperarousal, and increased reactivity to negative emotional stimuli (e.g., [3–6]). A novel group therapy study at the University at Buffalo enrolled 32 children aged 7–12 with ADHD and SMD. These children were treated for 11 weeks. The study participants were randomized between two therapy groups: an experimental group therapy program (case; new therapy group) and community psychosocial treatment (control; old therapy group). An objective of the study was to compare the feasibility and efficacy of these two treatments using the Children’s Depression Rating Scale — Revised total score (CDRS-Rts). The Children’s Depression Rating Scale, revised version (CDRS-R), is a clinician-rated instrument for the diagnosis of childhood depression and the assessment of the severity of depression in children 6–12 years of age [7, 8]. The CDRS-R consists of 17 clinician-rated items, with 14 items based on the child’s self-report or reports from the parents or teachers and three items based on the child’s nonverbal behavior during the interviews. The CDRS-R provides more reliable depression ratings than other children’s depression rating scales because it collects information from more sources, interviewing the child, parents, or school teachers independently; it considers the child’s behavior during the interview; and it uses lengthened scales to capture slight differences in symptomatology. On the basis of clinical experience, a CDRS-Rts of below 40, 40–60, and above 60 corresponds to none to mild, moderate, and severe depression, respectively [7–9]. Thus, a significant drop in the CDRS-Rts over the course of the study indicates the effectiveness of a treatment. To record the paired data of this study, two measurements were taken from the same subjects. The paired data consist of the observed values of the CDRS-Rts at week 0 (baseline) and week 11 (endpoint).

In this medical study, the main research problems are to test for differences between the distributions of the two therapy groups and to detect treatment effects within each group. Testing the hypothesis of no difference between the distributions of the two therapy groups using the paired data is only one aspect of the comparison between treatments. For example, in the context of the treatment study of ADHD and SMD, Waxmonsky et al. [6] previously carried out a study to examine the tolerability and efficacy of methylphenidate (MPH) and behavior modification therapy, where multiple comparisons with the Bonferroni technique using independent sample t-tests and pairwise t-tests were conducted. The former tests were implemented to evaluate between-group differences in baseline characteristics and measures of tolerability (the Pittsburgh Side Effect Rating Scale). The latter were undertaken in the SMD group to compare pre- and post-differences in the CDRS-Rts and the Pittsburgh Side Effect Rating Scale from low-dose MPH to high-dose MPH. In this article, we avoid considerations of combined p-values, proposing a simple and efficient way to create nonparametric tests that attend to specific alternative hypotheses directly, in analogy with the development of parametric likelihood ratio tests. Note that the t-tests used are known to be inefficient in controlling the type I error when the data are skewed, and the applied Bonferroni method tends to be conservative. The nonparametric statistical analyses of two populations described above require more versatile testing methods than those well addressed in the classic literature (e.g., [10, 11]). In this article, we propose and examine distribution-free tests for multiple hypotheses to detect various differences related to treatment effects in study groups based on paired data.

To formalize the testing problems, let (Xij, Yij) be independent identically distributed (i.i.d.) pairs of observations within a subject j from sample i, where i = 1, 2 are referred to as treatments and j = 1, …, ni are referred to as subjects. In the nonparametric setting, the classic one-sample tests for paired data, for example, the paired t-test and the Wilcoxon signed rank test, are based on the differences Zij = Yij − Xij, where Zij denotes the within-pair difference of subject j from sample i, i = 1, 2; j = 1, …, ni. Note that {Z11, …, Z1n1} and {Z21, …, Z2n2} consist of i.i.d. observations from populations Z1 and Z2 with distribution functions, say, FZ1(.) and FZ2(.), respectively. In the context of treatment evaluations, Zij can be defined to be the difference of measurements between pre- and post-treatment. In this article, we consider different hypotheses simultaneously for the symmetry of FZ1 and/or FZ2 (detecting a treatment effect within groups) and for the equivalence FZ1 = FZ2. Here, we refer to the nonparametric literature to connect the term ‘treatment effect’ with tests for symmetry (e.g., [10]). Note that the Kolmogorov–Smirnov test is a known procedure to compare distributions of populations, whereas standard testing procedures such as the paired t-test, the sign test, and the Wilcoxon signed rank test can be applied to the symmetry problem, that is, to test for H0 : FZ(Z) = 1 − FZ(−Z). Comparisons between the distributions of the new therapy and control groups and detection of treatment effects may be based on multiple hypothesis tests. To this end, one can create relevant tests combining, for example, the Kolmogorov–Smirnov test and the Wilcoxon signed rank test. The use of the classical procedures commonly requires complex considerations to combine the known nonparametric tests. Alternatively, we develop a direct distribution-free method for analyzing the two-sample problems. The proposed method can be easily applied to test nonparametrically for different composite hypotheses. The proposed approach nonparametrically approximates most powerful Neyman–Pearson test rules, providing efficiency of the proposed procedures.
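As a simple illustration of this setting, the following R sketch forms within-pair differences from simulated, hypothetical pre- and post-treatment scores (not the study data) and applies the classic paired procedures mentioned above.

## A minimal sketch with simulated data standing in for the CDRS-Rts measurements
set.seed(1)
x1 <- rnorm(17, mean = 55, sd = 10)        # hypothetical baseline scores, group 1
y1 <- x1 - rnorm(17, mean = 8, sd = 6)     # hypothetical endpoint scores, group 1
z1 <- y1 - x1                              # within-pair differences Z_{1j} = Y_{1j} - X_{1j}

t.test(z1)                                 # paired t-test, applied to the differences
wilcox.test(z1)                            # Wilcoxon signed rank test for symmetry about 0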

When parametric forms of the relevant distributions are known, corresponding parametric likelihood ratios can be easily applied to test for the problems mentioned above. According to the Neyman–Pearson lemma, the parametric likelihood ratio tests are optimal decision rules (e.g., [11–13]). We propose to approximate the corresponding likelihood ratios using an empirical likelihood (EL) concept. The EL methodology has been addressed in the statistical literature as one of the powerful nonparametric techniques (e.g., [14–23]). The EL methodology allows researchers to use distribution-free procedures with efficient characteristics that are asymptotically close to those of related parametric likelihood approaches (e.g., [14]). The EL approach is developed via terms of cumulative distribution functions (e.g., [17, 24, 25]). Vexler and Yu [25] demonstrated that the classical EL method based on distribution functions is very suitable for testing parameters; however, the EL technique based on density functions performs more efficiently in tests for distributions. To approximate Neyman–Pearson test statistics, Vexler and Gurevich [26, 27] proposed to focus on the density-based EL, $L_f = \prod_{i=1}^{n} f_i$, where fi = f(T(i)), f(·) is an unknown density function of the observations {T1, …, Tn}, and T(i) denotes the ith order statistic based on {T1, …, Tn}. In this case, approximate values of fi are obtained by maximizing Lf subject to an empirical version of the constraint ∫ f(u)du = 1.

We extend and adapt the density-based EL approach to the two-sample testing issues, carrying out the multiple testing problems that arise in paired data settings. Despite the fact that many statistical inference procedures have been developed for two-sample problems, to our knowledge, relevant nonparametric likelihood techniques to deal with the presented two-sample issues based on paired data have not been well addressed in the literature. The proposed density-based EL tests are exact, which ensures accurate computation of relevant p-values based on data with small sample sizes.

The paper is organized as follows. In Section 2, we address the purpose of each testing hypothesis considered in this article, and then we develop corresponding density-based EL ratio test statistics. The theoretical results will be presented to show the asymptotic consistency of the proposed tests. To evaluate the proposed approaches, extensive Monte Carlo studies are carried out in Section 3. An application to analyze the CDRS-Rts data is presented in Section 4. In Section 5, we provide some concluding remarks.

2. Statement of Problems and Methods

2.1. Hypotheses setting

To test for equality of the distributions of the new therapy group and the control therapy group based on paired observations {Z11, …, Z1n1} and {Z21, …, Z2n2}, one may consider the hypotheses

$H_0^N : F_{Z_1} = F_{Z_2} = F_Z \quad \text{versus} \quad H_A : F_{Z_1} \neq F_{Z_2}.$

To incorporate evaluation of the treatment effect within each therapy group, we consider three tests related to the null hypothesis that combines (1) the equality of the distributions of the two therapy groups and (2) no treatment effect in either group. This null hypothesis can be presented as H0 : 1) FZ1 = FZ2 = FZ, and 2) FZ(Z) = 1 − FZ(−Z), for all Z ∈ (−∞, ∞). Against H0, we can set up three different alternative hypotheses, namely HA1, HA2, and HA3, where

  1. HA1: not H0, that is, FZ1 ≠ FZ2 or FZ1(Z1) ≠ 1 − FZ1(−Z1) or FZ2(Z2) ≠ 1 − FZ2(−Z2);

  2. HA2: There is a treatment effect in one therapy group while there is no treatment effect in the other;

  3. HA3: One asserts that both therapy groups have the same treatment effect. In this case, because the distributions of two groups are assumed to be identical under H0 and HA3, a one-sample test for symmetry can be applied.

The cases (1)–(3) are formally stated in Table I.

Table I.

Hypotheses of interest to be tested based on paired data.

Null hypothesis versus Alternative hypothesis
HN0 : FZ1 = FZ2 = FZ	HA : FZ1 ≠ FZ2
H0 : FZ1 = FZ2 = FZ; FZ(Z) = 1 − FZ(−Z), for all Z ∈ (−∞, ∞)	HA1 : FZ1 ≠ FZ2 or FZi(Zi) ≠ 1 − FZi(−Zi), for i = 1 or 2 (i.e., not H0)
	HA2 : FZ1 ≠ FZ2; FZ1(Z1) ≠ 1 − FZ1(−Z1); FZ2(Z2) = 1 − FZ2(−Z2)
HA3 : FZ1 = FZ2 = FHA3, Z; FHA3,Z(Z) ≠ 1 − FHA3,Z(−Z)

Let Test 1, Test 2, and Test 3, refer to the hypothesis tests for the composite hypotheses H0 versus HA1, H0 versus HA2, and H0 versus HA3, respectively.

2.2. Test statistics

In this section, we develop test statistics for Tests 1–3. The proposed three tests will be shown to be exact.

2.2.1. Test 1: H0 versus HA1

Consider the scenario where one is interested in testing

$H_0 : F_{Z_1} = F_{Z_2} = F_Z;\ F_Z(Z) = 1 - F_Z(-Z),\ \text{for all } Z \in (-\infty, \infty), \quad \text{versus} \quad H_{A1}.$

The likelihood ratio test statistic based on observations, Zij, i = 1, 2; j = 1, …, ni, is given by

$LR_{H_{A1}} = \frac{\prod_{j=1}^{n_1} f_{Z_1}(Z_{1j}) \prod_{j=1}^{n_2} f_{Z_2}(Z_{2j})}{\prod_{j=1}^{n_1} f_{Z}(Z_{1j}) \prod_{j=1}^{n_2} f_{Z}(Z_{2j})} = \prod_{j=1}^{n_1} \frac{f_{Z_1,j}}{f_{ZZ_1,j}} \prod_{j=1}^{n_2} \frac{f_{Z_2,j}}{f_{ZZ_2,j}},$

where fZi, i = 1, 2, are the density functions related to FZi, i = 1, 2; fZ is a density function related to a symmetric distribution FZ; fZ1,j = fZ1(Z1(j)), fZ2,j = fZ2(Z2(j)), fZZ1,j = fZ(Z1(j)), and fZZ2,j = fZ(Z2(j)); and Z1(1) ≤ Z1(2) ≤ ··· ≤ Z1(n1), Z2(1) ≤ Z2(2) ≤ ··· ≤ Z2(n2) are the order statistics based on {Z11, …, Z1n1} and {Z21, …, Z2n2}, respectively. The main novelty of the proposed method for developing the nonparametric test statistic is that we modify the maximum EL concept to directly obtain estimated values of fZ1,j, j = 1, …, n1, maximizing $\prod_{j=1}^{n_1} f_{Z_1,j}$ subject to an empirical constraint. This constraint controls the estimated values of fZ1,j, j = 1, …, n1, preserving the main property of the density function fZ1 under the complex structure of the tested hypothesis. To obtain the associated empirical constraint, we utilize the fact that the values of fZ1,j should be restricted by the equation ∫ fZ1(u)du = 1. By applying the Mean Value Theorem to approximate the constraint ∫ fZ1(u)du = 1 (for details, see [24–27]), for every positive integer m ≤ n1/2, we have

$(2m)^{-1}\sum_{j=1}^{n_1}\int_{Z_{1(j-m)}}^{Z_{1(j+m)}} f_{Z_1}(u)\,du = (2m)^{-1}\sum_{j=1}^{n_1}\int_{Z_{1(j-m)}}^{Z_{1(j+m)}} \frac{f_{Z_1}(u)}{f_{Z}(u)}\, f_{Z}(u)\,du \approx (2m)^{-1}\sum_{j=1}^{n_1}\frac{f_{Z_1,j}}{f_{ZZ_1,j}}\int_{Z_{1(j-m)}}^{Z_{1(j+m)}} f_{Z}(u)\,du = (2m)^{-1}\sum_{j=1}^{n_1}\frac{f_{Z_1,j}}{f_{ZZ_1,j}}\big(F_{Z}(Z_{1(j+m)}) - F_{Z}(Z_{1(j-m)})\big).$  (1)

Because, under the null hypothesis H0, the distribution function FZ = FZ1 = FZ2 is assumed to be symmetric, the idea presented by Schuster [28] can be adapted to estimate (FZ(Z1(j + m)) − FZ(Z1(j − m))) in (1) by using the following estimator, denoted as ηm,j,

$\eta_{m,j} = F_{n_1+n_2}(Z_{1(j+m)}) - F_{n_1+n_2}(Z_{1(j-m)}),$  (2)

where $F_{n_1+n_2}(u) = \frac{1}{2(n_1+n_2)}\sum_{i=1}^{2}\sum_{j=1}^{n_i}\big[I(Z_{ij}\le u) + I(-Z_{ij}\le u)\big]$ and I(.) is the indicator function.
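For illustration, a minimal R sketch of the symmetrized pooled empirical distribution function F_{n1+n2} and of the quantities η_{m,j} in (2) is given below; the function and argument names are ours and do not correspond to the authors' supplementary code.

## Symmetrized pooled empirical CDF F_{n1+n2}(u) and eta_{m,j} in (2); a sketch
F.pool <- function(u, z.pool) {
  # F_{n1+n2}(u) = (2(n1+n2))^{-1} * sum_j [ I(Z_j <= u) + I(-Z_j <= u) ]
  (sum(z.pool <= u) + sum(-z.pool <= u)) / (2 * length(z.pool))
}

eta.mj <- function(z1, z.pool, m) {
  n1 <- length(z1)
  zs <- sort(z1)                              # order statistics Z_{1(1)} <= ... <= Z_{1(n1)}
  upper <- zs[pmin(1:n1 + m, n1)]             # Z_{1(j+m)}, truncated at Z_{1(n1)}
  lower <- zs[pmax(1:n1 - m, 1)]              # Z_{1(j-m)}, truncated at Z_{1(1)}
  sapply(upper, F.pool, z.pool = z.pool) - sapply(lower, F.pool, z.pool = z.pool)
}

Here z.pool would be the combined sample c(z1, z2); the same construction with the roles of the samples exchanged gives the quantities φ_{k,j} used below.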

By virtue of Lemma 2.1 in [24] and Proposition 2.1 in [26], we have that, for every integer m ≤ 0.5n1,

$(2m)^{-1}\sum_{j=1}^{n_1}\int_{Z_{1(j-m)}}^{Z_{1(j+m)}} f_{Z_1}(u)\,du = \int_{Z_{1(1)}}^{Z_{1(n_1)}} f_{Z_1}(u)\,du - \sum_{r=1}^{m-1}\frac{(m-r)}{2m}\Big[\int_{Z_{1(n_1-r)}}^{Z_{1(n_1-r+1)}} f_{Z_1}(u)\,du + \int_{Z_{1(r)}}^{Z_{1(r+1)}} f_{Z_1}(u)\,du\Big],$  (3)

where Z1(j − m) = Z1(1), if j − m ≤ 1, and Z1(j + m) = Z1(n1), if j + m ≥ n1.

Because

$\int_{Z_{1(1)}}^{Z_{1(n_1)}} f_{Z_1}(u)\,du \le \int_{-\infty}^{\infty} f_{Z_1}(u)\,du = 1,$

Equation (3) demonstrates that $(2m)^{-1}\sum_{j=1}^{n_1}\int_{Z_{1(j-m)}}^{Z_{1(j+m)}} f_{Z_1}(u)\,du \le 1$, and $(2m)^{-1}\sum_{j=1}^{n_1}\int_{Z_{1(j-m)}}^{Z_{1(j+m)}} f_{Z_1}(u)\,du \to 1$ when m/n1 → 0 as m, n1 → ∞.

By replacing the distribution functions in (3) by their empirical counterparts, $F_{n_1}(u) = n_1^{-1}\sum_{j=1}^{n_1} I(Z_{1j}\le u)$, the empirical version of Equation (3) then has the form

$(2m)^{-1}\sum_{j=1}^{n_1}\int_{Z_{1(j-m)}}^{Z_{1(j+m)}} f_{Z_1}(u)\,du \approx F_{n_1}(Z_{1(n_1)}) - F_{n_1}(Z_{1(1)}) - \sum_{r=1}^{m-1}\frac{(m-r)}{2m}\big[F_{n_1}(Z_{1(n_1-r+1)}) - F_{n_1}(Z_{1(n_1-r)}) + F_{n_1}(Z_{1(r+1)}) - F_{n_1}(Z_{1(r)})\big].$  (4)

This leads to

$(2m)^{-1}\sum_{j=1}^{n_1}\int_{Z_{1(j-m)}}^{Z_{1(j+m)}} f_{Z_1}(u)\,du \approx 1 - (m+1)(2n_1)^{-1}.$  (5)

Now, by the Equations (1), (2), and (5), the resulting empirical constraint for values of fZ1,j is

$(2m)^{-1}\sum_{j=1}^{n_1}\frac{f_{Z_1,j}}{f_{ZZ_1,j}}\,\eta_{m,j} = 1 - (m+1)(2n_1)^{-1}.$  (6)

To find the values of fZ1,j that maximize the likelihood $\prod_{j=1}^{n_1} f_{Z_1,j}$ subject to the constraint (6), we formalize the Lagrange function as

$\sum_{j=1}^{n_1}\log f_{Z_1,j} + \lambda_1\Big[1 - (m+1)(2n_1)^{-1} - (2m)^{-1}\sum_{j=1}^{n_1}\frac{f_{Z_1,j}}{f_{ZZ_1,j}}\,\eta_{m,j}\Big],$

where λ1 is a Lagrange multiplier. Maximizing the expression above, the values of fZ1,1, …, fZ1,n1 have the form

$f_{Z_1,j} = \frac{m(2n_1 - m - 1)}{n_1^2\,\eta_{m,j}}\,f_{ZZ_1,j}, \quad j = 1, \ldots, n_1,$

where Z1(j − m) = Z1(1), if j − m ≤ 1, and Z1(j + m) = Z1(n1), if j + m ≥ n1.

As a consequence, the density-based EL estimator of the ratio $\prod_{j=1}^{n_1} f_{Z_1,j}/f_{ZZ_1,j}$ can be formulated as

$V_{1,m,1} = \prod_{j=1}^{n_1}\frac{m(2n_1 - m - 1)}{n_1^2\,\eta_{m,j}}.$

One can show that the properties of the statistic V1,m,1 strongly depend on the selected value of the integer parameter m. A similar problem also arises in the well-known goodness-of-fit tests based on sample entropy (e.g., [24, 26, 29]). Attending to this issue, we eliminate the dependence on the integer parameter m. Toward this end, we utilize the maximum EL concept in a manner similar to the arguments proposed in [24, 26] and [27, Appendix A]. Thus, the modified test statistic can be written as

$V_{1,1} = \min_{n_1^{0.5+\delta}\,\le\, m\,\le\, n_1^{1-\delta}} \prod_{j=1}^{n_1}\frac{m(2n_1 - m - 1)}{n_1^2\,\eta_{m,j}}, \quad \delta \in (0, 1/4).$  (7)

Likewise, the approximation to the likelihood ratio $\prod_{j=1}^{n_2} f_{Z_2,j}/f_{ZZ_2,j}$ is

$V_{2,2} = \min_{n_2^{0.5+\delta}\,\le\, k\,\le\, n_2^{1-\delta}} \prod_{j=1}^{n_2}\frac{k(2n_2 - k - 1)}{n_2^2\,\phi_{k,j}}, \quad \delta \in (0, 1/4),$  (8)

where

$\phi_{k,j} = F_{n_1+n_2}(Z_{2(j+k)}) - F_{n_1+n_2}(Z_{2(j-k)}),$ and $F_{n_1+n_2}$ is defined in (2).

Finally, the proposed test statistic for Test 1 has the form of

$V_{n_1 n_2}^{H_{A1}} = \prod_{i=1}^{2} V_{i,i} = \min_{n_1^{0.5+\delta}\,\le\, m\,\le\, n_1^{1-\delta}} \prod_{j=1}^{n_1}\frac{m(2n_1 - m - 1)}{n_1^2\,\eta_{m,j}}\ \min_{n_2^{0.5+\delta}\,\le\, k\,\le\, n_2^{1-\delta}} \prod_{j=1}^{n_2}\frac{k(2n_2 - k - 1)}{n_2^2\,\phi_{k,j}},$

that approximates the likelihood ratio LRHA1. Consequently, the decision rule is to reject H0 if

$\log\big(V_{n_1 n_2}^{H_{A1}}\big) > C_{H_{A1}},$  (9)

where CHA1 is a test threshold. (Similarly to [30], we arbitrarily define ηm,j = 1/(n1 + n2) or ϕk,j = 1/(n1 + n2) if ηm,j = 0 or ϕk,j = 0, respectively.) Proposition 1 in Section 2.3 will demonstrate that the proposed test statistic $\log(V_{n_1 n_2}^{H_{A1}})$ in (9) is asymptotically consistent. The upper and lower bounds for the integer parameters m and k in the definitions (7) and (8) were selected to provide the asymptotic consistency. Note that, to test the composite hypotheses H0 versus HA1, a complex consideration regarding a reasonable combination of the Kolmogorov–Smirnov test and the Wilcoxon signed rank test can be applied (see, for example, Section 3.2). Alternatively, the test (9) uses measurements from the therapy groups in an approximate Neyman–Pearson manner, providing a simple procedure to evaluate the treatment effect on each therapy group. Section 3 shows that, in various situations, the test (9) is superior to combinations of the classic Kolmogorov–Smirnov and Wilcoxon procedures. It is also shown that the proposed nonparametric test has power comparable with that of correctly specified parametric likelihood ratio tests. Thus, in the context of the study described in Section 1, the direct application of the density-based EL ratio test (9) provides an efficient evaluation of treatment effects in children with ADHD and SMD.
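For concreteness, the following R sketch assembles the Test 1 statistic log(V^{HA1}) in (9) from the helper functions F.pool and eta.mj given below (2); δ = 0.1 as in Section 2.4, and all function names are ours rather than part of the authors' supplementary code.

## log of the symmetry-based EL ratio in (7)/(8) for one sample against the pooled sample
log.V.sym <- function(z, z.pool, delta = 0.1) {
  n <- length(z)
  m.range <- max(1, ceiling(n^(0.5 + delta))):floor(n^(1 - delta))
  log.stat <- sapply(m.range, function(m) {
    eta <- eta.mj(z, z.pool, m)
    eta[eta == 0] <- 1 / length(z.pool)       # convention for zero values (cf. [30])
    sum(log(m * (2 * n - m - 1) / (n^2 * eta)))
  })
  min(log.stat)                               # minimization over the integer parameter
}

## Test 1 statistic: log(V^{HA1}) = log(V_{1,1}) + log(V_{2,2})
test1.stat <- function(z1, z2, delta = 0.1) {
  z.pool <- c(z1, z2)
  log.V.sym(z1, z.pool, delta) + log.V.sym(z2, z.pool, delta)
}

The decision rule (9) then rejects H0 when test1.stat(z1, z2) exceeds the corresponding critical value C_{HA1} tabulated in Section 2.4.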

2.2.2. Test 2: H0 versus HA2

Our goal is to test for

$H_0 : F_{Z_1} = F_{Z_2} = F_Z,\ F_Z(Z) = 1 - F_Z(-Z),\ \text{for all } Z \in (-\infty, \infty), \quad \text{versus} \quad H_{A2} : F_{Z_1} \neq F_{Z_2},\ F_{Z_1}(Z_1) \neq 1 - F_{Z_1}(-Z_1),\ F_{Z_2}(Z_2) = 1 - F_{Z_2}(-Z_2).$

In a similar manner to the development of the density-based EL approximation to the ratio $\prod_{j=1}^{n_1} f_{Z_1,j}/f_{ZZ_1,j}$ in Section 2.2.1, the EL ratio related to the test of H0 versus HA2 can be defined as

$V_{1,1} = \min_{n_1^{0.5+\delta}\,\le\, m\,\le\, n_1^{1-\delta}} \prod_{j=1}^{n_1}\frac{m(2n_1 - m - 1)}{n_1^2\,\eta_{m,j}}, \quad \delta \in (0, 1/4),$  (10)

where ηm,j are defined in (2). Consider the density-based EL approximation to the corresponding ratio $\prod_{j=1}^{n_2} f_{Z_2,j}/f_{ZZ_2,j}$. The empirical constraint for the values of fZ2,j can be constructed based on the symmetry of FZ2. By analogy with Equations (1)–(6), one can show that the resulting empirical constraint on the values of fZ2,j in Test 2 has the form

$(2k)^{-1}\sum_{j=1}^{n_2}\frac{f_{Z_2,j}}{f_{ZZ_2,j}}\,\phi_{k,j} = \Lambda_{n_2 k},$  (11)

where ϕk,j are defined in (8) and

$\Lambda_{n_2 k} = (2n_2)^{-1}\Big\{\sum_{j=1}^{n_2}\big[I(-Z_{2j}\le Z_{2(n_2)}) - I(-Z_{2j}\le Z_{2(1)})\big] + n_2 - 1 - \sum_{r=1}^{k-1}\frac{(k-r)}{2k}\sum_{j=1}^{n_2}\big[I(-Z_{2j}\le Z_{2(n_2-r+1)}) - I(-Z_{2j}\le Z_{2(n_2-r)}) + I(-Z_{2j}\le Z_{2(r+1)}) - I(-Z_{2j}\le Z_{2(r)})\big] - \frac{k-1}{2}\Big\}.$  (12)

The formal derivation of this constraint is given in Appendix A of this article. Then the corresponding Lagrange function can be formulated by

$\sum_{j=1}^{n_2}\log f_{Z_2,j} + \lambda_2\Big[\Lambda_{n_2 k} - (2k)^{-1}\sum_{j=1}^{n_2}\frac{f_{Z_2,j}}{f_{ZZ_2,j}}\,\phi_{k,j}\Big],$  (13)

where λ2 is a Lagrange multiplier. Thus, approximate values of fZ2, 1, …, fZ2, n2 are

$f_{Z_2,j} = \frac{2k\,\Lambda_{n_2 k}}{n_2\,\phi_{k,j}}\,f_{ZZ_2,j}, \quad j = 1, \ldots, n_2,$

where Z2(j − k) = Z2(1), if j − k ≤ 1, and Z2(j + k) = Z2(n2), if j + k ≥ n2.

Similarly to (7) and (8), the density-based EL estimator of the ratio $\prod_{j=1}^{n_2} f_{Z_2,j}/f_{ZZ_2,j}$ can be presented as

$V_{2,2} = \min_{n_2^{0.5+\delta}\,\le\, k\,\le\, n_2^{1-\delta}} \prod_{j=1}^{n_2}\frac{2k\,\Lambda_{n_2 k}}{n_2\,\phi_{k,j}}, \quad \delta \in (0, 1/4).$  (14)

Finally, taking into account (10) and (14), the proposed test statistic for Test 2 can be constructed as

$V_{n_1 n_2}^{H_{A2}} = \min_{n_1^{0.5+\delta}\,\le\, m\,\le\, n_1^{1-\delta}} \prod_{j=1}^{n_1}\frac{m(2n_1 - m - 1)}{n_1^2\,\eta_{m,j}}\ \min_{n_2^{0.5+\delta}\,\le\, k\,\le\, n_2^{1-\delta}} \prod_{j=1}^{n_2}\frac{2k\,\Lambda_{n_2 k}}{n_2\,\phi_{k,j}}.$

In this case, the decision rule developed for Test 2 is to reject the null hypothesis if

$\log\big(V_{n_1 n_2}^{H_{A2}}\big) > C_{H_{A2}},$  (15)

where CHA2 is a test threshold.
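Continuing the sketch started in Section 2.2.1, a possible R implementation of Λ_{n2k} in (12) and of the Test 2 statistic log(V^{HA2}) in (15) is outlined below; it reuses F.pool, eta.mj, and log.V.sym, and the remaining names are ours.

## Lambda_{n2,k} in (12); the phi_{k,j} have the same form as eta_{m,j} in (2)
Lambda.n2k <- function(z2, k) {
  n2 <- length(z2)
  zs <- sort(z2)
  S <- function(u) sum(-z2 <= u)              # counts of reflected observations below u
  corr <- 0
  if (k >= 2) {
    for (r in 1:(k - 1)) {
      corr <- corr + (k - r) / (2 * k) *
        (S(zs[n2 - r + 1]) - S(zs[n2 - r]) + S(zs[r + 1]) - S(zs[r]))
    }
  }
  (S(zs[n2]) - S(zs[1]) + n2 - 1 - corr - (k - 1) / 2) / (2 * n2)
}

## Test 2 statistic: log(V_{1,1}) from (10) plus log(V_{2,2}) from (14)
test2.stat <- function(z1, z2, delta = 0.1) {
  n2 <- length(z2)
  z.pool <- c(z1, z2)
  k.range <- max(1, ceiling(n2^(0.5 + delta))):floor(n2^(1 - delta))
  log.V22 <- min(sapply(k.range, function(k) {
    phi <- eta.mj(z2, z.pool, k)              # phi_{k,j}, same construction as eta_{m,j}
    phi[phi == 0] <- 1 / length(z.pool)       # zero-value convention, as for eta
    sum(log(2 * k * Lambda.n2k(z2, k) / (n2 * phi)))
  }))
  log.V.sym(z1, z.pool, delta) + log.V22
}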

2.2.3. Test 3: H0 versus HA3

Consider the following hypotheses of interest

$H_0 : F_{Z_1} = F_{Z_2} = F_Z,\ F_Z(Z) = 1 - F_Z(-Z),\ \text{for all } Z \in (-\infty, \infty), \quad \text{versus} \quad H_{A3} : F_{Z_1} = F_{Z_2} = F_{H_{A3},Z},\ F_{H_{A3},Z}(Z) \neq 1 - F_{H_{A3},Z}(-Z).$

The corresponding likelihood ratio test statistic based on the observations Zij, i = 1, 2; j = 1, …, ni, can be written as

$LR_{H_{A3}} = \frac{\prod_{i=1}^{2}\prod_{j=1}^{n_i} f_{H_{A3},Z}(Z_{ij})}{\prod_{i=1}^{2}\prod_{j=1}^{n_i} f_{Z}(Z_{ij})} = \frac{\prod_{s=1}^{N} f_{H_{A3},Z}(Z_{(s)})}{\prod_{s=1}^{N} f_{Z}(Z_{(s)})} = \prod_{s=1}^{N}\frac{f_{H_{A3},s}}{f_{H_0,s}},$

where N = n1 + n2; fH0,s = fZ(Z(s)) and fHA3,s = fHA3,Z(Z(s)) denote the density functions of the observations Z under H0 and HA3, respectively; and Z(s), s = 1, …, N, are the order statistics based on the pooled sample of {Z11, …, Z1n1} and {Z21, …, Z2n2}, denoted by Zs, s = 1, …, N. Using the same technique as in Section 2.2.1, we derive values of fHA3,s, s = 1, …, N, that maximize the log-likelihood $\sum_{s=1}^{N}\log(f_{H_{A3},s})$ subject to an empirical version of the constraint ∫ fHA3,Z(u)du = 1. The proposed test statistic for Test 3 is

$V_{N}^{H_{A3}} = \min_{N^{0.5+\delta}\,\le\, m\,\le\, N^{1-\delta}} \prod_{s=1}^{N}\frac{m(2N - m - 1)}{N^2\,\omega_{m,s}},$  (16)

where $\omega_{m,s} = (2N)^{-1}\sum_{j=1}^{N}\big[I(Z_j \le Z_{(s+m)}) + I(-Z_j \le Z_{(s+m)}) - I(Z_j \le Z_{(s-m)}) - I(-Z_j \le Z_{(s-m)})\big]$ and δ ∈ (0, 1/4).

Thus, the decision rule for Test 3 is to reject the null hypothesis if

$\log\big(V_{N}^{H_{A3}}\big) > C_{H_{A3}},$  (17)

where CHA3 is a test threshold.
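Because ω_{m,s} is simply the quantity η computed on the pooled sample, the Test 3 statistic in (16)–(17) can be sketched in R by reusing log.V.sym from Section 2.2.1 (names are ours; the boundary convention for the pooled order statistics is assumed by analogy with Tests 1 and 2).

## Test 3 statistic: log(V^{HA3}) based on the pooled sample of size N = n1 + n2
test3.stat <- function(z1, z2, delta = 0.1) {
  z.pool <- c(z1, z2)
  log.V.sym(z.pool, z.pool, delta)            # omega_{m,s} = eta_{m,s} on the pooled sample
}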

2.3. Asymptotic consistency of the tests

In this section, we present the following propositions to demonstrate the asymptotic consistency of the proposed tests:

Proposition 1

Let fZi(Z), i = 1, 2, be density functions with the expectations E(log fZi(Zi1)) < ∞ and E(log fZi(−Zi1)) < ∞, i = 1, 2. Let fi(u) = (fZi(u) + fZi(−u))/2. Then, under H0, $(n_1+n_2)^{-1}\log\big(V_{n_1 n_2}^{H_{At}}\big) \xrightarrow{p} 0$, t = 1, 2, and under HA1 and HA2,

$(n_1+n_2)^{-1}\log\big(V_{n_1 n_2}^{H_{At}}\big) \xrightarrow{p} -\frac{\gamma}{1+\gamma}E_{H_{At}}\log\Big\{\frac{\gamma}{1+\gamma} + \frac{1}{1+\gamma}\Big(\frac{f_2(Z_{11})}{f_1(Z_{11})}\Big)\Big\} - \frac{1}{1+\gamma}E_{H_{At}}\log\Big\{\frac{1}{1+\gamma} + \frac{\gamma}{1+\gamma}\Big(\frac{f_1(Z_{21})}{f_2(Z_{21})}\Big)\Big\} \ge 0, \quad t = 1, 2,$

as n1 → ∞, n2 → ∞, n1/n2 → γ > 0, where γ is a constant.

Proof

We outline the proof in Appendix A1 provided in Supporting material (S1) (ref. [31] is cited in the proof).

Consider the testing problem H0 versus HA3.

Proposition 2

Let the pooled sample Zs, s = 1, …, N, have the density function f(Z) with the expectations E(log f(Z1)) < ∞ and E(log f(−Z1)) < ∞. Then, under H0, $N^{-1}\log\big(V_{N}^{H_{A3}}\big) \xrightarrow{p} 0$, and under HA3, $N^{-1}\log\big(V_{N}^{H_{A3}}\big) \xrightarrow{p} -E_{H_{A3}}\log\big\{0.5 + 0.5\big(f(-Z_1)/f(Z_1)\big)\big\} > 0$, as N → ∞.

Proof

We omit the proof, since it is similar to the proof of Proposition 1.

2.4. Null distributions of the proposed test statistics

To obtain the critical values of the proposed tests, we utilize the fact that the proposed test statistics are based on the indicator functions I(·), with I(Z1 < Z2) = I(FZ(Z1) < FZ(Z2)) and I(Z1 < −Z2) = I(FZ(Z1) < FZ(−Z2)) = I(FZ(Z1) < 1 − FZ(Z2)), where the random variables FZ(Z1) and FZ(Z2) have the uniform distribution Unif[0, 1] under H0. Thus, the distributions of the proposed test statistics are independent of the distributions of the observations, and hence the critical values of the proposed tests can be computed exactly. For each proposed test, we conducted the following procedure to determine the critical values, Cα, of the null distributions. We first generated data Z1 and Z2 from the standard normal distribution N(0, 1) and then calculated the test statistics corresponding to each proposed test. For each pair of sample sizes (n1, n2), we obtained 50,000 generated values of the test statistics (9), (15), and (17), with δ = 0.1, tabulating the critical values of the null distributions of the test statistics at the significance levels α = 0.01, 0.05, 0.1 (see Table II).
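A minimal R sketch of this Monte Carlo tabulation is given below; it assumes the statistic functions sketched in Section 2.2 (e.g., test1.stat) and uses the distribution-free property noted above, so N(0, 1) data can be generated under H0.

## Monte Carlo critical value C_alpha for a given test statistic and sample sizes
crit.value <- function(stat.fun, n1, n2, alpha = 0.05, B = 50000) {
  null.stats <- replicate(B, stat.fun(rnorm(n1), rnorm(n2)))
  quantile(null.stats, probs = 1 - alpha)     # upper alpha-quantile of the null distribution
}

## e.g., the alpha = 0.05 critical value of Test 1 for (n1, n2) = (10, 10):
## C.05 <- crit.value(test1.stat, n1 = 10, n2 = 10)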

Table II.

The critical values for Test 1 by (9) (Test 2 by (15)) [Test 3 by (17)] with δ = 0.1 for different sample sizes (n1, n2) and significance levels α.

n1 α n2
10 15 20 25 30 35
10 0.01 7.44 (5.75) [4.17] 7.18 (5.85) [4.18] 7.39 (6.02) [4.18] 7.33 (6.17) [4.16] 7.45 (6.33) [4.25] 7.47 (6.20) [4.42]
0.05 5.22 (3.86) [2.68] 5.06 (3.93) [2.82] 5.25 (4.09) [2.84] 5.35 (4.28) [2.85] 5.38 (4.37) [2.97] 5.42 (4.43) [3.11]
0.1 4.27 (3.11) [2.14] 4.19 (3.17) [2.27] 4.39 (3.35) [2.30] 4.48 (3.52) [2.36] 4.56 (3.60) [2.46] 4.59 (3.68) [2.60]
15 0.01 6.83 (5.39) [4.15] 6.96 (5.57) [4.16] 6.98 (5.68) [4.24] 6.88 (5.80) [4.33] 6.96 (5.81) [4.39]
0.05 4.90 (3.73) [2.83] 5.02 (3.89) [2.85] 5.15 (4.04) [2.97] 5.14 (4.11) [3.12] 5.20 (4.18) [3.13]
0.1 4.09 (3.07) [2.30] 4.25 (3.24) [2.34] 4.38 (3.38) [2.45] 4.36 (3.45) [2.59] 4.44 (3.52) [2.63]
20 0.01 6.86 (5.44) [4.24] 6.96 (5.55) [4.38] 6.99 (5.69) [4.41] 7.01 (5.73) [4.42]
0.05 5.11 (3.94) [2.96] 5.24 (4.11) [3.11] 5.25 (4.19) [3.16] 5.27 (4.26) [3.19]
0.1 4.35 (3.33) [2.45] 4.48 (3.47) [2.58] 4.51 (3.55) [2.64] 4.52 (3.62) [2.69]
25 0.01 7.08 (5.62) [4.40] 7.01 (5.76) [4.44] 7.10 (5.85) [4.56]
0.05 5.33 (4.18) [3.15] 5.36 (4.25) [3.22] 5.37 (4.32) [3.34]
0.1 4.56 (3.55) [2.64] 4.59 (3.63) [2.70] 4.62 (3.69) [2.82]
30 0.01 6.89 (5.63) [4.55] 6.99 (5.75) [4.68]
0.05 5.31 (4.23) [3.32] 5.33 (4.27) [3.44]
0.1 4.59 (3.65) [2.80] 4.61 (3.68) [2.93]
35 0.01 6.91 (5.68) [4.64]
0.05 5.35 (4.28) [3.47]
0.1 4.64 (3.69) [2.98]

Remark 1

The definitions (9), (15), and (17) of the proposed test statistics include δ ∈ (0, 1/4). We set δ = 0.1. To investigate the test statistics with different values of δ, we conducted an extensive Monte Carlo study. The Monte Carlo powers of the proposed tests were not found to depend significantly on the value of δ ∈ (0, 1/4). These experimental results are similar to those shown in [24, 25] and [27, Appendix A].

Remark 2

The computer codes related to the outputs of this section are provided in the Supplemental Section S2. These codes can be easily modified to obtain results of the next section or to perform the proposed tests based on real data.

3. Simulation study

In this section, we examine the power properties of the proposed tests in various cases using Monte Carlo simulations. The proposed tests based on (9), (15), and (17), with δ = 0.1, are compared with the following common test procedures: the maximum likelihood ratio (MLR) tests, assuming parametric conditions on the distributions of the observations (for details of the constructions and definitions of the MLR tests, see Appendix A2 of the supporting information), and combined classic nonparametric tests with a structure based on the Wilcoxon signed rank test and/or the Kolmogorov–Smirnov test. We fixed the significance level of the tests at 0.05 in all considered cases.

3.1. Power comparison with the parametric method (the MLR tests)

To present the comparative power of the proposed tests versus the corresponding MLR tests, we performed the following Monte Carlo study. Critical values of the MLR test statistics were obtained from 50,000 simulations under H0 with N(0, 1)-distributed observations Z. To study the powers of the tests, 10,000 samples of each size (n1, n2) were generated from a variety of distributions. Tables III–V depict the Monte Carlo powers of the proposed tests and those of the corresponding MLR tests.

Table III.

The Monte Carlo powers of Test 1 by (9) versus the MLR test for H0 versus HA1 with different sample sizes (n1, n2) at the significance level α = 0.05.

FX1 FY1 FX2 FY2 n1 n2 Proposed test at (9) MLR
N(0, 1) N(0.2, 0.252) N(0.1, 0.52) N(0.5, 1)
10 10 0.1541 0.1579
50 50 0.6232 0.6921
N(2.5, 0.82) N(1.5, 0.52) N(1, 1.52) N(1.5, 0.62)
10 10 0.7267 0.7351
25 25 0.9956 0.9978
N(0.3, 0.52) N(0.5, 1) N(0.25, 0.252) N(0.5, 0.52)
10 10 0.2818 0.4497
50 50 0.9764 0.9993
N(0.5, 0.52) N(1, 1) N(0, 1) N(0, 1)
10 10 0.1723 0.1764
50 50 0.7171 0.7942
N(0, 1) N(0, 1) N(1.5, 1.12) N(1, 1.32)
10 10 0.1125 0.1300
50 50 0.4321 0.5531
N(0, 1) N(0.5, 1) N(0.5, 1.22) N(1, 0.52)
10 10 0.2208 0.2230
50 50 0.8348 0.8785

Table V.

The Monte Carlo powers of Test 3 by (17) versus the MLR test for H0 versus HA3 with different sample sizes (n1, n2) at the significance level α = 0.05.

FX1 FY1 FX2 FY2 n1 n2 Proposed test at (17) MLR
N(1, 1) N(1.5, 1.52) N(1, 1) N(1.5, 1.52)
10 10 0.2058 0.2191
50 50 0.6709 0.7919
N(0, 0.52) N(0.6, 1) N(0, 0.52) N(0.6, 1)
10 10 0.5922 0.6377
50 50 0.9974 0.9996
N(0.5, 0.252) N(1, 1) N(0.5, 0.252) N(1, 1)
10 10 0.5072 0.5444
50 50 0.9869 0.9977
N(2.5, 1.252) N(2, 0.52) N(2.5, 1.252) N(2, 0.52)
10 10 0.3383 0.3509
50 50 0.9004 0.9582

When observations are normally distributed, as anticipated, the MLR tests are more powerful than the proposed nonparametric tests. The tables show that the powers of the proposed tests are very close to those of the MLR tests, demonstrating that the density-based EL tests are comparable to the parametric method that utilizes the correct information regarding the distributions of the observations. Table VI displays the actual type I errors of the MLR tests under misspecification of the underlying distributions, that is, when observations were simulated under H0 from t distributions with different degrees of freedom (DOFs), a logistic distribution with parameters (0, 1), a Laplace distribution with parameters (0, 1), and the Unif[0, 1] distribution. As can be seen from Table VI, the type I errors of the MLR tests for H0 versus HA1 and H0 versus HA2 are not under control when the DOF of the t distribution is below 200. For the cases of the logistic and the Laplace distributions, the type I errors of the MLR tests are not well controlled. When the observations are from Unif[0, 1], the impact of the misspecification of the model on the type I errors of the MLR tests is even more pronounced. This illustrates that the considered MLR tests depend strongly on assumptions regarding the distributions of the observations.

Table VI.

The Monte Carlo type I errors of the MLR tests.

FZ1 FZ2 n1 n2 MLR test for H0 versus HA1 MLR test for H0 versus HA2 MLR test for H0 versus HA3
t3 t3
10 10 0.1599 0.1838 0.0428
50 50 0.2934 0.3260 0.0465
t5 t5
10 10 0.0955 0.1066 0.0458
50 50 0.1447 0.1687 0.0488
t200 t200
10 10 0.0507 0.0493 0.0499
50 50 0.0503 0.0506 0.0502
Logistic(0,1) Logistic(0,1)
10 10 0.0717 0.0759 0.0463
50 50 0.0855 0.0968 0.0508
Laplace(0,1) Laplace(0,1)
10 10 0.1108 0.1293 0.0438
50 50 0.1446 0.1645 0.0488
Unif[0, 1] Unif[0, 1]
10 10 1 0.9993 1

3.2. Power comparison with classic nonparametric methods

In this section, we compare the power of the proposed tests with the power of procedures based on the classic nonparametric tests. Because Tests 1 and 2 address composite hypotheses regarding between-group differences and treatment effects, the Kolmogorov–Smirnov test and the Wilcoxon signed rank test cannot be directly applied to these hypotheses. In this case, one can perform combined tests based on the Kolmogorov–Smirnov test and the Wilcoxon signed rank test for H0 versus HA1 and H0 versus HA2. For the comparison, we used combined nonparametric tests with the Bonferroni method. To this end, the R procedure ‘p.adjust’ with the method ‘bonferroni’ was utilized. Let ‘W-test’ denote the Wilcoxon signed rank test and ‘K–S test’ denote the Kolmogorov–Smirnov test. The combined nonparametric test for H0 versus HA1 consists of two W-tests for symmetry and one K–S test based on Z1, Z2 for FZ1 = FZ2. The former tests are employed to assess a treatment effect within each therapy group, whereas the latter test is conducted to detect the group difference. Similarly, we performed the combined nonparametric test for H0 versus HA2 that includes one W-test and one K–S test. The classical procedure for H0 versus HA3 is the W-test for symmetry.
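For reference, a possible R implementation of the combined classic procedure for H0 versus HA1 (two W-tests and one K–S test with the Bonferroni adjustment) is sketched below; the wrapper name is ours.

## Combined classic test for H0 versus HA1: two W-tests and one K-S test, Bonferroni-adjusted
combined.test.HA1 <- function(z1, z2, alpha = 0.05) {
  p.raw <- c(wilcox.test(z1)$p.value,         # W-test: symmetry of Z1 about 0
             wilcox.test(z2)$p.value,         # W-test: symmetry of Z2 about 0
             ks.test(z1, z2)$p.value)         # K-S test: F_Z1 = F_Z2
  p.adj <- p.adjust(p.raw, method = "bonferroni")
  list(p.adjusted = p.adj, reject = any(p.adj < alpha))
}

The combined test for H0 versus HA2 follows the same pattern with one W-test (on Z1) and one K–S test.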

To test H0 versus HA1 and H0 versus HA3, we assigned different distributions to the baseline measurements X and the endpoint observations Y in each group under the alternative hypothesis (i.e., (FX1, FY1) versus (FX2, FY2)), whereas when testing H0 versus HA2, we directly generated observations Z2 from three cases of symmetric distributions under HA2: N(0, 1); Unif[−1, 1]; the t2 distribution. Tables VII–IX contain the results of the power comparisons of the two different testing procedures: the proposed procedures and the nonparametric testing procedures based on the W-test and/or K–S test using the Bonferroni approach.

Table VII.

The Monte Carlo powers of the proposed test (9) versus the combined nonparametric test (the two Wilcoxon signed rank tests and one Kolmogorov–Smirnov test) at the significance level α = 0.05.

FX1 FY1 FX2 FY2 n1 n2 Proposed test at (9) W and K–S test
Exp(1) Lognorm(0, 22) N(0, 1) N(0.5, 1.52)
10 10 0.2238 0.0946
50 50 0.9616 0.6273
Lognorm(1, 1) Lognorm(1, 0.52) N(0,1) N(1.5, 22)
10 10 0.3646 0.2754
50 50 0.9953 0.9849
Exp(3) Lognorm(0, 22) Gamma(5,1) Gamma(1, 5)
10 10 0.6321 0.5016
50 50 1 1
χ2(6) Gamma(1,10) N(0,1) N(0.5, 22)
10 10 0.3815 0.1179
50 50 1 0.8576
Exp(1) Cauchy(1,1) N(0.5, 1) N(1.5, 22)
10 10 0.1819 0.1255
50 50 0.7928 0.7426
Exp(1) Lognorm(0, 22) Unif [−1, 1] Unif [−1, 1]
10 10 0.2325 0.0836
50 50 0.9981 0.6939

Table IX.

The Monte Carlo powers of the proposed test (17) versus the Wilcoxon signed rank test at the significance level α = 0.05.

FX1 FY1 FX2 FY2 n1 n2 Proposed test at (17) W test
Exp(1) Lognorm(0, 22) Exp(1) Lognorm(0, 22)
10 10 0.4136 0.2933
50 50 0.9992 0.9223
Lognorm(1, 1) Lognorm(1, 0.52) Lognorm(1, 1) Lognorm(1, 0.52)
10 10 0.1218 0.0731
50 50 0.7906 0.2208
Gamma(5,1) Gamma(1, 5) Gamma(5,1) Gamma(1, 5)
10 10 0.1218 0.0886
50 50 0.8074 0.2294
χ2(6) Gamma(1,10) χ2(6) Gamma(1,10)
10 10 0.3125 0.2344
50 50 0.9629 0.8517
Beta(0, 0.8) Exp(1.5) Beta(0, 0.8) Exp(1.5)
10 10 0.2094 0.1518
50 50 0.9928 0.6003

The Monte Carlo outputs shown in Tables VII–IX indicate that the new tests have higher powers than the combined nonparametric tests. In particular, for the cases of small sample sizes (e.g., (n1, n2) = (10, 10), (25, 25)), the proposed tests are significantly superior to the classic tests. In several cases, the powers of the proposed tests are 3–4 times larger than those of the combined nonparametric tests.

4. Data analysis

In this section, we apply the proposed method to the study described in Section 1, which evaluates treatment effects of ADHD and SMD in children. Study subjects were randomized to receive either the experimental 11-week group therapy program (n1 = 17) or community psychosocial treatment (n2 = 15). We defined the former as group 1 and the latter as group 2. For each child enrolled in the study, the CDRS-Rts was taken at the baseline (week 0) and the endpoint (week 11). Specifically, we computed the differences of the CDRS-Rts, Zij = Yij − Xij, for i = 1, 2; j = 1, …, ni, where Xij stands for the CDRS-Rts assessed at baseline before subject j receives treatment i and Yij represents the CDRS-Rts at the endpoint after subject j receives treatment i.

The empirical histograms of the CDRS-Rts at baseline and endpoint for each group are shown in Figure 1. As can be seen from Figure 1, both therapy groups show a decline in the CDRS-Rts after the baseline, but the decrease appears to be more pronounced in group 1.

Figure 1. Histograms of CDRS-Rts related to the baseline and endpoint in group 1: X1, Y1 and in group 2: X2, Y2.

In the context of the study’s interest in testing the claim that the distributions of the changes in CDRS-Rts are not equivalent between the therapy groups or that at least one therapy group has a treatment effect, we performed the proposed Test 1. In this case, the observed value of the test statistic (9), with δ = 0.1, is 22.8217 and the corresponding p-value is 0.00002, indicating that the null hypothesis of ‘no group differences and the lack of treatment effects in both groups’ is rejected. The combined nonparametric test (the two ‘W-tests’ and one ‘K–S test’) also rejects the null hypothesis with the p-value 0.000005. On the basis of these results, there is strong evidence to reject the null hypothesis.

In addition, to demonstrate the applicability of the proposed tests, we carried out Test 2, which is appropriate for testing the assertion that, besides a group difference, there is a treatment effect in one group and no such effect in the other. The observed value of the test statistic (15), with δ = 0.1, is 11.9370 and the corresponding p-value is 4 × 10−5. The combined nonparametric test (one ‘W-test’ and one ‘K–S test’) with the Bonferroni method also supports the decision to reject the null hypothesis, with a p-value of 5 × 10−7.

These results show that the conclusions of the proposed procedures agree with those of the classic tests, demonstrating that our proposed tests can be utilized in the ADHD and SMD study.

In addition to the analysis above, we also conducted a bootstrap-type study to evaluate the efficiency (power) of the considered tests based on small datasets randomly selected from the original data. To perform this study for Test 1, we executed the following procedure. We randomly selected samples with sample sizes (n1, n2) = (9, 6), (9, 7), (11, 9), (13, 10), (13, 11), (15, 13) from the original dataset. Then we calculated the corresponding test statistic $\log(V_{n_1 n_2}^{H_{A1}})$ by (9), where δ = 0.1. We repeated this strategy 10,000 times, calculating the proportion of rejections of the null hypothesis at α = 0.05; that is, we computed the percentage of times when $\log(V_{n_1 n_2}^{H_{A1}}) > C_{\alpha=0.05}$. The bootstrap-type study for Test 2 was carried out following the same procedure. The results regarding the proportion of rejections of the null hypothesis for each considered test are provided in Table X.
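A minimal R sketch of this resampling procedure for Test 1 is given below; z1.obs and z2.obs stand for the observed CDRS-Rts differences, and test1.stat and crit.value refer to the sketches in Sections 2.2 and 2.4.

## Proportion of rejections of Test 1 over repeated subsamples of sizes (m1, m2)
power.boot <- function(z1.obs, z2.obs, m1, m2, C.alpha, B = 10000) {
  rejections <- replicate(B, {
    s1 <- sample(z1.obs, m1)                  # subsample without replacement from group 1
    s2 <- sample(z2.obs, m2)                  # subsample without replacement from group 2
    test1.stat(s1, s2) > C.alpha
  })
  mean(rejections)                            # proportion of rejections at the chosen level
}
## e.g., power.boot(z1.obs, z2.obs, m1 = 9, m2 = 6, C.alpha = crit.value(test1.stat, 9, 6))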

Table X.

The proportions of rejectionsa based on the bootstrap method for each considered test.

Bootstrapped sample sizes Test 1 Test 2

(n1, n2) Proposed test (9) Classic testb Proposed test (15) Classic testc
(9, 6) 0.9858 0.7135 0.9755 0.9134
(9, 7) 0.9870 0.7172 0.9800 0.9165
(11, 9) 0.9989 0.9795 0.9955 0.9844
(13, 10) 0.9998 0.9962 0.9993 0.9972
(13, 11) 0.9999 0.9965 0.9997 0.9975
(15, 13) 1 0.9997 0.9999 0.9994
a

The proportion of rejections of each test from the bootstrap method was computed based on sample sizes (n1, n2) and 10,000 replications.

b

The combined classic test for H0 versus HA1 is based on two W-tests and one K–S test.

c

The combined classic test for H0 versus HA2 is based on one W-test and one K–S test.

Table X demonstrates that the proposed procedures have larger proportions of rejections than the combined nonparametric tests. In particular, when the sample sizes are relatively small (e.g., (n1, n2) = (9, 6), (9, 7)), the differences in the proportions of rejections between the two approaches are clearly recognizable. For example, we selected a sample of size 9 from group 1 and a sample of size 6 from group 2. This subdataset was tested for the hypotheses H0 versus HA1 (Test 1). In contrast with the result that the nonparametric test based on the two ‘W-tests’ and one ‘K–S test’ for H0 versus HA1 is not statistically significant (the Bonferroni-adjusted p-values of these classic tests are 0.0617, 0.1050, and 0.9873, respectively), the proposed Test 1, with δ = 0.1, is statistically significant (p-value = 0.0005). Figure 2 shows the empirical histograms of Z1 and Z2 from this subdataset. All these results indicate that the proposed methods for Tests 1 and 2 are more sensitive in detecting the difference between the null hypothesis and the alternative hypotheses involved in Tests 1 and 2 than the corresponding combined nonparametric tests.

Figure 2. Histograms of the paired observations Z1 and Z2 based on the CDRS-Rts data, with sample sizes (n1, n2) = (9, 6), that were sampled from the original dataset.

5. Concluding remarks

In this article, we proposed and examined two-sample density-based EL ratio tests based on paired observations. In constructing the tests, we used approximations to the most powerful test statistics with respect to the stated problems, providing efficient nonparametric procedures. The proposed tests are shown to be exact and simple to perform. The extensive Monte Carlo studies confirmed the powerful properties of the proposed tests. We showed that our tests outperform tests with a structure based on the Wilcoxon signed rank test and/or the Kolmogorov–Smirnov test, and outperform the parametric likelihood ratio tests when the underlying distributions are misspecified. The data example illustrated that the proposed tests can be easily and efficiently used in practice.

Table IV.

The Monte Carlo powers of Test 2 by (15) versus the MLR test for H0 versus HA2 with different sample sizes (n1, n2) at the significance level α = 0.05.

FX1 FY1 FZ2 n1 n2 Proposed test at (15) MLR
N(0, 0.52) N(0.5, 1) N(0,1)
10 10 0.2325 0.2351
50 50 0.7989 0.8560
N(0, 0.52) N(1.5, 1) N(0,1)
10 10 0.9564 0.9593
15 15 0.9957 0.9967
N(0, 1) N(1.5, 1) N(0,1)
10 10 0.877 0.8991
25 25 0.9995 0.9997
N(1, 0.52) N(2, 1.52) N(0,1)
10 10 0.5219 0.6194
50 50 0.9975 0.9997

Table VIII.

The Monte Carlo powers of the proposed test (15) versus the combined nonparametric test (the one Wilcoxon signed rank test and one Kolmogorov–Smirnov test) at the significance level α = 0.05.

FX1 FY1 FZ2 n1 n2 Proposed test at (15) W and K–S test
Exp(3) N(1.5, 22) N(0, 1)
10 10 0.5638 0.2876
50 50 0.9995 0.9816
Exp(1) Beta(1,1) N(0, 1)
10 10 0.2256 0.1193
50 50 0.9983 0.8046
Exp(1) Cauchy(1,1) N(0, 1)
10 10 0.1613 0.0401
50 50 0.9736 0.2323
Exp(1.5) N(0.5,1) Unif [−1, 1]
10 10 0.1677 0.0384
50 50 0.9984 0.3042
Exp(1.5) Beta(3,1) Unif [−1, 1]
10 10 0.1328 0.0934
50 50 0.8774 0.4128
Exp(1) Cauchy(1,1) Unif [−1, 1]
10 10 0.4052 0.0608
25 25 0.9952 0.9042
Lognorm(1, 1) Lognorm(1.2, 1) t2
10 10 0.2892 0.0657
50 50 0.8807 0.6635
χ2(6) Gamma(10,1) t2
10 10 0.9044 0.6699
25 25 0.9999 0.9914
Exp(1) Cauchy(1,1) t2
10 10 0.0792 0.0316
50 50 0.2891 0.0877

Acknowledgments

This research was supported by the NIH grant 1R03DE020851-01A1 (the National Institute of Dental and Craniofacial Research). The authors are grateful to the Editor, the Associate Editor, and the referees for suggestions that led to a substantial improvement in this paper.

Appendix A. Computing the empirical constraint (11) in the development of Test 2

Similarly to Equations (1)–(4), by virtue of the Mean Value Theorem and Lemma 2.1 in [24], we have

$(2k)^{-1}\sum_{j=1}^{n_2}\int_{Z_{2(j-k)}}^{Z_{2(j+k)}} f_{Z_2}(u)\,du \approx (2k)^{-1}\sum_{j=1}^{n_2}\frac{f_{Z_2,j}}{f_{ZZ_2,j}}\,\phi_{k,j}$

and

$(2k)^{-1}\sum_{j=1}^{n_2}\int_{Z_{2(j-k)}}^{Z_{2(j+k)}} f_{Z_2}(u)\,du \approx F_{Z_2}(Z_{2(n_2)}) - F_{Z_2}(Z_{2(1)}) - \sum_{r=1}^{k-1}\frac{(k-r)}{2k}\big[F_{Z_2}(Z_{2(n_2-r+1)}) - F_{Z_2}(Z_{2(n_2-r)}) + F_{Z_2}(Z_{2(r+1)}) - F_{Z_2}(Z_{2(r)})\big],$  (A1)

where ϕk,j are defined in (8). Here, because of the symmetric property of FZ2, under the alternative hypothesis HA2, applying the estimation proposed by Schuster [28], FZ2(r) − FZ2(s) can be estimated by

$F_{Z_2}(r) - F_{Z_2}(s) = (2n_2)^{-1}\sum_{j=1}^{n_2}\big[I(Z_{2j}\le r) + I(-Z_{2j}\le r) - I(Z_{2j}\le s) - I(-Z_{2j}\le s)\big].$

Now, the right-hand side of Equation (A1) can be estimated by Λn2k defined in Equation (12). It follows that the resulting empirical constraint on the values of fZ2,j in Test 2 has the form

$(2k)^{-1}\sum_{j=1}^{n_2}\frac{f_{Z_2,j}}{f_{ZZ_2,j}}\,\phi_{k,j} = \Lambda_{n_2 k}.$

Footnotes

Supporting information may be found in the online version of this article.

References

  1. Biederman J. Attention-deficit/hyperactivity disorder: a life-span perspective. The Journal of Clinical Psychiatry. 1998;59:4–16.
  2. Nair J, Ehimare U, Beitman BD, Nair SS, Lavin A. Clinical review: evidence-based diagnosis and treatment of ADHD in children. Missouri Medicine. 2006;103:617–621.
  3. Brotman MA, Schmajuk M, Rich B, Dickstein DP, Guyer AE, Costello EJ, Egger HL, Angold A, Leibenluft E. Prevalence, clinical correlates and longitudinal course of severe mood dysregulation in children. Biological Psychiatry. 2006;60:991–997. doi: 10.1016/j.biopsych.2006.08.042.
  4. Carlson GA. Who are the children with severe mood dysregulation, a.k.a. “rages”? American Journal of Psychiatry. 2007;164:1140–1142. doi: 10.1176/appi.ajp.2007.07050830.
  5. Leibenluft E, Charney DS, Towbin KE, Bhangoo RK, Pine DS. Defining clinical phenotypes of juvenile mania. American Journal of Psychiatry. 2003;160:430–437. doi: 10.1176/appi.ajp.160.3.430.
  6. Waxmonsky J, Pelham WE, Gnagy E, Cummings MR, O’Connor B, Majumdar A, Verley J, Hoffman MT, Massetti GA, Burrows-MacLean L, Fabiano GA, Waschbusch DA, Chacko A, Arnold FW, Walker KS, Garefino AC, Robb JA. The efficacy and tolerability of methylphenidate and behavior modification in children with attention-deficit/hyperactivity disorder and severe mood dysregulation. Journal of Child and Adolescent Psychopharmacology. 2008;18:573–588. doi: 10.1089/cap.2008.065.
  7. Poznanski EO, Cook SC, Carroll BJ. A depression rating scale for children. Pediatrics. 1979;64:442–450.
  8. Poznanski EO, Grossman JA, Buchsbaum Y, Banegas M, Freeman L, Gibbons R. Preliminary studies of the reliability and validity of the children’s depression rating scale. Journal of the American Academy of Child Psychiatry. 1984;23:191–197. doi: 10.1097/00004583-198403000-00011.
  9. Ying G, Mary EN, John H, Michael GW, Graham E. An exploratory factor analysis of the Children’s Depression Rating Scale-Revised. Journal of Child and Adolescent Psychopharmacology. 2006;16:482–491. doi: 10.1089/cap.2006.16.482.
  10. Wilcoxon F. Individual comparisons by ranking methods. Biometrics. 1945;1:80–83.
  11. Lehmann EL, Romano JP. Testing Statistical Hypotheses. Springer; New York: 2005.
  12. Vexler A, Wu C. An optimal retrospective change point detection policy. Scandinavian Journal of Statistics. 2009;36:542–558.
  13. Vexler A, Wu C, Yu KF. Optimal hypothesis testing: from semi to fully Bayes factors. Metrika. 2010;71:125–138. doi: 10.1007/s00184-008-0205-4.
  14. Lazar N, Mykland PA. An evaluation of the power and conditionality properties of empirical likelihood. Biometrika. 1998;85:523–534.
  15. Owen AB. Empirical likelihood ratio confidence intervals for a single functional. Biometrika. 1988;75:237–249.
  16. Owen AB. Empirical likelihood for linear models. The Annals of Statistics. 1991;19:1725–1747.
  17. Owen AB. Empirical Likelihood. Chapman and Hall/CRC; New York: 2001.
  18. Qin J, Lawless J. Empirical likelihood and general estimating equations. The Annals of Statistics. 1994;22:300–325.
  19. Vexler A, Liu S, Kang L, Hutson AD. Modifications of the empirical likelihood interval estimation with improved coverage probabilities. Communications in Statistics (Simulation and Computation). 2009;38:2171–2183.
  20. Vexler A, Yu J, Tian L, Liu S. Two-sample nonparametric likelihood inference based on incomplete data with an application to a pneumonia study. Biometrical Journal. 2010;52:348–361. doi: 10.1002/bimj.200900131.
  21. Vexler A, Tsai W-M, Malinovsky Y. Estimation and testing based on data subject to measurement errors: from parametric to non-parametric likelihood methods. Statistics in Medicine. 2011. doi: 10.1002/sim.4304.
  22. Yu J, Vexler A, Tian L. Analyzing incomplete data subject to a threshold using empirical likelihood methods: an application to a pneumonia risk study in an ICU setting. Biometrics. 2010;66:123–130. doi: 10.1111/j.1541-0420.2009.01228.x.
  23. Yu J, Vexler A, Kim S, Hutson AD. Two-sample empirical likelihood ratio tests for medians in application to biomarker evaluations. The Canadian Journal of Statistics. 2011. In press. doi: 10.1002/cjs.10108.
  24. Vexler A, Shan G, Kim S, Tsai W-M, Tian L, Hutson AD. An empirical likelihood ratio based goodness-of-fit test for Inverse Gaussian distributions. Journal of Statistical Planning and Inference. 2011;141:2128–2140.
  25. Vexler A, Yu J. Two-sample density-based empirical likelihood tests for incomplete data in application to a pneumonia study. Biometrical Journal. 2011;53:628–651. doi: 10.1002/bimj.201000235.
  26. Vexler A, Gurevich G. Empirical likelihood ratios applied to goodness-of-fit tests based on sample entropy. Computational Statistics & Data Analysis. 2010;54:531–545.
  27. Gurevich G, Vexler A. A two-sample empirical likelihood ratio test based on samples entropy. Statistics and Computing. 2011;21:657–670.
  28. Schuster EF. Estimating the distribution function of a symmetric distribution. Biometrika. 1975;62:631–635.
  29. Vasicek O. A test for normality based on sample entropy. Journal of the Royal Statistical Society, Series B (Methodological). 1976;38:54–59.
  30. Canner PL. A simulation study of one- and two-sample Kolmogorov–Smirnov statistics with a particular weight function. Journal of the American Statistical Association. 1975;70:209–211.
  31. Serfling RJ. Approximation Theorems of Mathematical Statistics. Wiley; New York: 1980.
