Matching a discrete distribution by Poisson matching quantiles estimation

Hyungjun Lim; Arlene K H Kim

doi:10.1080/02664763.2024.2337082

2024 Apr 4;51(15):3102–3124. doi: 10.1080/02664763.2024.2337082

PMCID: PMC11536626 PMID: 39507212

Abstract

Analyzing the data collected from different sources requires unpaired data analysis to account for the absence of correspondence between the random variable Y and the covariates $X$ . Several attempts have been made to analyze continuous Y, but it may follow a discrete distribution, which previous methodologies have overlooked. To address these limitations, we propose Poisson matching quantiles estimation (PMQE), the first unpaired data analysis method designed to examine the discrete Y and the unpaired continuous covariates $X$ . Using their order statistics, the PMQE method matches the linear combination of random variables $β^{T} X$ to $\log (Y)$ . We further improve the performance of the proposed method by $ℓ_{1}$ penalizing $β$ , leading to the PMQE LASSO. An effective algorithm and simulation results are presented, along with the convergence results. We illustrate the practical application of PMQE using real data.

Keywords: Matching distributions, PMQE, discrete variable, unpaired data analysis, deviance

1. Introduction

The ability to obtain data from various sources has heightened the scholarly interest in data integration and matching. Conventional regression analysis may not be applicable to independently collected data due to the lack of correspondence between the variable of interest and the covariates. Instead, the analysis of such unpaired data necessitates an approach that accounts for the absence of correspondence between variables.

Various studies have been conducted when complete pairing information is not available. For example, [2,9,27,29] restored the missing pair between variables, whereas [1,15,24] estimated regression parameters in the absence of pairing information. Scholars have also applied the deconvolution technique to recover monotone relations using unpaired data [3,5,22,25].

In the recent corpus of literature, we find an increased application of matching quantiles estimation (MQE) [20,23,30]. MQE allows us to find the coefficient vector $β \in R^{p}$ such that the distribution of a linear combination of random variables $β^{T} X$ best matches the distribution of Y, the variable of interest (see [23] for more details). Qin and Wu [20] extended the MQE by introducing a robust version of the MQE to outliers, and [12,30] presented a modified MQE algorithm for censored survival data analysis.

Despite the increasing popularity of unpaired data analysis, to the best of our knowledge, no applicable method exists for analyzing data wherein Y follows a discrete distribution. This lack of methodology is undesirable, because it is common to encounter discrete data in real-world data analysis. Examples include visitor counts at a store, numbers of hospital patients, and accidents per month. Naive implementation of methods designed for continuous variables without considering the nature of such variables can lead to inaccurate estimations and flawed inferences, such as decimal estimates for categorical labels or negative estimates for non-negative data.

We aim to develop a methodology that adequately accounts for the discrete nonnegative count Y using one or more covariates without pair information. Specifically, we propose an extended distribution-matching algorithm called Poisson matching quantiles estimation (PMQE), which applies generalized linear model (GLM) estimation repeatedly to the recursively sorted data. In PMQE, we consider an empirical integrated deviance function (2) in order to measure the discrepancy between discrete and continuous distributions. This contrasts with the squared error loss function for matching two continuous distributions considered by Sgouropoulos et al. [23]. We prove that the algorithm converges as the empirical integrated deviance difference decreases monotonically. Some asymptotic properties of PMQE are established as well. We also consider PMQE LASSO, which $ℓ_{1}$ penalizes the coefficients to obtain a sparse representation. We further verify in the simulation study that PMQE is capable of matching Y and $X$ regardless of their actual distributional association. (For more details, see Sections 2 and 3.) To our knowledge, this is the first attempt to utilize the MQE framework to obtain a method for analyzing unpaired data with a discrete count variable.

Our contributions mainly lie in addressing the challenges arising from accommodating the discrete data into the MQE framework. We primarily resolve the conceptual hurdle of discrete-to-continuous distribution matching by smoothing the discrete quantile using the sample quantile based on the mid-distribution function from [17]. Furthermore, we exploit the asymptotic convergence result of the smoothed quantile (see e.g. Theorem 2 in [17]) to substitute the Bahadur-Kiefer representation (see e.g. [13])–adopted in [20,23]–which is only applicable to continuous variables. In terms of theoretical results, we establish conditions that allow the discrete data to have unbounded support. Requirements listed in condition (ii) in Theorem 2.2 provide moment and tail probability conditions that control the asymptotic property of the discrete variable. They are general and non-restrictive as they encompass the most widely considered count data, the Poisson random variable. Regarding the proof of the distribution matching, we resolve all the technical challenges stemming from the matching deviance $D (β)$ defined in (1). It requires alternative approaches to those in related studies as it deviates from the class of matching criteria previously examined. Such issues are demonstrated in the proof of Lemma 6.2 in the Appendix.

We ascertain that PMQE is applicable across diverse contexts. It is particularly effective for estimating the distribution of inaccessible data in unpaired circumstances. For example, suppose we have hospital data where some sensitive records, such as seizure counts, are partially deleted for some group of the patients. Even worse, available data is anonymized among the non-deleted group. We may first apply PMQE on the seizure count and the unpaired covariates (e.g. age, gender, and other accessible data) of non-deleted group data, then estimate the inaccessible seizure count distribution using the pre-learned linear combination of covariates in the deleted group. With the resulting distribution estimate, we can now make various inferences (e.g. moments, quantiles, etc.) about the seizure count on the deleted group. Notice that the whole process is conducted without any additional pair information or de-anonymization step. Another interesting example is illustrated in Section 4, where we apply PMQE to a location-searching problem for a new facility. In addition to these examples, PMQE can be used for analyzing asynchronous measurements such as atmospheric sciences [11] and planetary physics data [18] as [23] pointed out. Moreover, following the suggestion by Qin and Wu [20], studies on climate events such as [16,26] can also benefit from PMQE when analyzing the discrete data.

The remainder of the paper is organized as follows. In Section 2, we introduce the PMQE method and establish the convergence of the algorithm and some asymptotic properties of the method. In Section 3, we present simulation studies that compare the performance of PMQE, PMQE LASSO and Poisson GLM in various settings. In Section 4, we explore the real data application of PMQE, and we conclude the paper with further considerations in Section 5. The Appendix includes all the proofs deferred from Section 2.

2. Matching method and convergence of the algorithms

Let Y be a discrete random variable taking nonnegative values and $X = (X_{1}, \dots, X_{p})$ be a collection of p continuous random variables. The goal is to determine a linear combination,

β^{T} X = β_{1} X_{1} + \dots + β_{p} X_{p},

such that the distribution of $\exp (β^{T} X)$ matches the approximate distribution of Y. For convenience of notation, we use $W := \exp (β^{T} X)$ .

Following the ideas in the generalized linear model, we propose to search for $β$ such that the following integrated deviance of two quantile functions is minimized:

D (β) := \int_{0}^{1} \left\{ {\tilde{Q}}_{Y} (α) \log \frac{{\tilde{Q}}_{Y} (α)}{Q_{W} (α)} - ({\tilde{Q}}_{Y} (α) - Q_{W} (α)) \right\} d α,

(1)

where ${\tilde{Q}}_{ξ} (α)$ denotes the $α$th quantile of the random variable ξ based on the mid-distribution function considered by Ma et al. [17], and $Q_{ξ} (α)$ denotes the $α$th quantile of ξ based on the usual distribution function, for $α \in (0, 1)$ . By definition, $D (β) \geq 0$ . Note that we use ${\tilde{Q}}_{Y}$ instead of $Q_{Y}$ to smooth out the discreteness of the distribution of Y, which is needed for technical reasons (see Proof of Lemma A.2). Then we define $β_{0} = \arg min_{β} D (β),$ which may not be unique. We also make the following assumption.

Assumption 2.1

$β_{0} = \arg min_{β} D (β)$ exists.

Suppose we have a random sample $\{Y_{1}, \dots, Y_{n}\}$ and $\{X_{1}, \dots, X_{n}\}$ , where $Y_{i} \in R$ and $X_{i} \in R^{p}$ for $i = 1, \dots, n$ . Let $Y_{(1)} \leq \dots \leq Y_{(n)}$ be the order statistics of $Y_{1}, \dots, Y_{n}$ . We define the estimator of $β_{0}$ by

{\hat{β}}^{PMQ} = \hat{β} = \arg min_{β} D_{n} (β),

where $D_{n} (β)$ is the empirical version of $D (β)$ . Specifically, we let

D_{n} (β) = \frac{1}{n} \sum_{j = 1}^{n} ({\tilde{Q}}_{n, Y} (j / n) \log \frac{{\tilde{Q}}_{n, Y} (j / n)}{Q_{n, W} (j / n)} - ({\tilde{Q}}_{n, Y} (j / n) - Q_{n, W} (j / n))),

(2)

where ${\tilde{Q}}_{n, Y} (j / n)$ is the empirical $((j - 0.5) / n)^{th}$ quantile of $Y_{1}, \dots, Y_{n}$ based on the empirical mid-distribution function. We take ${\tilde{Q}}_{n, Y} (j / n)$ as $((j - 0.5) / n)^{th}$ quantile instead of $(j / n)^{th}$ quantile in order to equate ${\tilde{Q}}_{n, Y} (j / n) = Q_{n, Y} (j / n)$ when there exist no ties in the sample. Note that $Q_{n, W} (j / n) = \exp (β^{T} X)_{(j)} = W_{(j)}$ by definition of the quantile function.
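As a minimal sketch of how the targets ${\tilde{Q}}_{n, Y} (j / n)$ might be computed, the code below assumes that the sample mid-quantile of [17] is the piecewise-linear interpolation through the points $({\hat{F}}_{mid} (y), y)$ over the distinct observed values y, with ${\hat{F}}_{mid} (y) = \hat{F} (y) - \frac{1}{2} \hat{p} (y)$ , and that it is held constant outside that range. The function name `mid_quantiles` is hypothetical and is not taken from the authors' implementation.

```python
import numpy as np

def mid_quantiles(y):
    """Evaluate the assumed sample mid-quantile at the levels (j - 0.5)/n.

    Assumption: the mid-quantile is the linear interpolation of the points
    (F_mid(v), v) over the distinct sample values v, where
    F_mid(v) = F(v) - 0.5 * p(v); np.interp clamps at both ends.
    """
    y = np.asarray(y, dtype=float)
    n = y.size
    values, counts = np.unique(y, return_counts=True)  # sorted distinct values
    p = counts / n                                      # empirical point masses
    f_mid = np.cumsum(p) - 0.5 * p                      # empirical mid-distribution
    levels = (np.arange(1, n + 1) - 0.5) / n            # (j - 0.5)/n, j = 1..n
    return np.interp(levels, f_mid, values)

# Ties are smoothed into intermediate values.
print(mid_quantiles([0, 0, 1, 1]))
```

When the sample has no ties, ${\hat{F}}_{mid} (Y_{(j)}) = (j - 0.5) / n$ under this assumption, so the function returns the order statistics themselves, which is the equality stated above.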

Let ${\tilde{Y}}_{(j)}$ denote ${\tilde{Q}}_{n, Y} (j / n)$ . We determine $\hat{β}$ using the following iterative algorithm, where we further determine the minimizer of

D_{n}^{(k)} (β) = \frac{1}{n} \sum_{j = 1}^{n} [{\tilde{Y}}_{(j)} \cdot \log (\frac{{\tilde{Y}}_{(j)}}{\exp (β^{T} X_{(j)}^{(k - 1)})}) - ({\tilde{Y}}_{(j)} - \exp (β^{T} X_{(j)}^{(k - 1)}))]

(3)

for the $k^{th}$ iteration, where $\{X_{(j)}^{(k)}\}$ is defined such that

(β^{(k)})^{T} X_{(1)}^{(k)} \leq \dots \leq (β^{(k)})^{T} X_{(n)}^{(k)} .

On the $k^{th}$ iteration, the data are re-paired according to $Y_{(1)}, \dots, Y_{(n)}$ and $X_{(1)}^{(k - 1)}, \dots, X_{(n)}^{(k - 1)}$ ordered by the previous coefficient estimate $β^{(k - 1)}$ obtained from the $(k - 1)^{th}$ iteration. Then, we update $β^{(k - 1)}$ to the Poisson GLM estimate $β^{(k)}$ that minimizes $D_{n}^{(k)} (β)$ , the deviance (3) of the re-paired data. We repeat the process until the change in $D_{n} (β^{(k^{*})})$ is lower than the predetermined tolerance level $ϵ > 0$ for some $k^{*} > 0$ . Note that we use the Poisson GLM estimate as an initial $β^{(0)}$ . The algorithm is summarized as follows:
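The iteration described above can be sketched as follows. This is an illustration under stated assumptions, not the authors' code: the plain order statistics $Y_{(j)}$ stand in for ${\tilde{Y}}_{(j)}$ for brevity, the Poisson fit is a damped Newton routine written here (any Poisson GLM solver could be substituted), the design matrix is used as given (a caller wanting an intercept would append a column of ones), and the names `mean_deviance`, `fit_poisson` and `pmqe` are hypothetical.

```python
import numpy as np

def mean_deviance(y, eta):
    """Mean Poisson deviance as in (3); y*log(y/w) is read as 0 when y == 0."""
    safe_y = np.where(y > 0, y, 1.0)
    ylog = np.where(y > 0, y * (np.log(safe_y) - eta), 0.0)
    with np.errstate(over="ignore"):
        w = np.exp(eta)
    return float(np.mean(ylog - (y - w)))

def fit_poisson(X, y, beta, n_steps=50):
    """Damped Newton steps for the log-link Poisson fit, warm-started at beta.
    A step is accepted only if it lowers the deviance."""
    dev = mean_deviance(y, X @ beta)
    for _ in range(n_steps):
        mu = np.exp(X @ beta)
        grad = X.T @ (mu - y)
        hess = X.T @ (X * mu[:, None]) + 1e-10 * np.eye(X.shape[1])
        step = np.linalg.solve(hess, grad)
        t, improved, gain = 1.0, False, 0.0
        while t > 1e-8:
            cand = beta - t * step
            cand_dev = mean_deviance(y, X @ cand)
            if cand_dev < dev:
                beta, gain, dev, improved = cand, dev - cand_dev, cand_dev, True
                break
            t *= 0.5
        if not improved or gain < 1e-12:
            break
    return beta

def pmqe(X, y, tol=1e-8, max_iter=200):
    """Alternate between re-sorting the rows of X and refitting the Poisson model."""
    y_sorted = np.sort(y)
    beta = fit_poisson(X, y, np.zeros(X.shape[1]))   # beta^(0): GLM on data as given
    history = []
    for _ in range(max_iter):
        order = np.argsort(X @ beta, kind="stable")  # order rows by previous beta
        X_k = X[order]
        beta = fit_poisson(X_k, y_sorted, beta)      # minimize D_n^(k)
        history.append(mean_deviance(y_sorted, X_k @ beta))
        if len(history) > 1 and abs(history[-2] - history[-1]) < tol:
            break
    return beta, history

rng = np.random.default_rng(0)
n = 300
X = np.column_stack([np.ones(n), rng.normal(size=(n, 2))])
y = rng.poisson(np.exp(1.0 + X[:, 1:] @ np.array([0.3, 0.2]))).astype(float)
rng.shuffle(y)                                       # destroy the pairing
beta_hat, history = pmqe(X, y)
```

Warm-starting each fit at the previous coefficient and accepting only deviance-reducing steps is a design choice made here so that the recorded deviances cannot increase from one iteration to the next.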

Remark 2.1

We label this algorithm as ‘PMQE Q’ to indicate that the sample quantile based on the mid-distribution function from [17] is used. If the conventional sample quantile is used, we denote the method as ‘PMQE’. That is, PMQE finds a minimizer of the modified $D_{n}^{(k)} (β)$ of Equation (3), where the ${\tilde{Y}}_{(j)}$ are replaced with the $Y_{(j)}$ . If there exist no ties in $\{Y_{1}, \dots, Y_{n}\}$ , PMQE Q is equivalent to PMQE.

Remark 2.2

An additional LASSO regularization term on the matching criterion $D_{n}^{(k)} (β)$ provides a sparse PMQE. ‘PMQE Q LASSO’ seeks $β$ minimizing $D_{n}^{(k)} (β)$ subject to $\sum_{j = 1}^{p} | β_{j} | \leq C_{0}$ , for some $C_{0} > 0$ . In Algorithm 1, $D_{n}^{(k)} (β)$ of (3) is replaced with

$\begin{aligned} D_{n}^{(k)} (β) = {} & \frac{1}{n} \sum_{j = 1}^{n} [{\tilde{Y}}_{(j)} \cdot \log (\frac{{\tilde{Y}}_{(j)}}{\exp (β^{T} X_{(j)}^{(k - 1)})}) - ({\tilde{Y}}_{(j)} - \exp (β^{T} X_{(j)}^{(k - 1)}))] \\ & + λ \sum_{i = 1}^{p} | β_{i} |, \end{aligned}$ (4)

where $λ \geq 0$ is a user-specified penalty parameter.
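One way to minimize a criterion of the form (4) for a fixed pairing is proximal gradient descent with soft-thresholding (ISTA). This solver is swapped in here purely for illustration, since the text does not say which optimizer is used; the function names and the simple halving rule for the step size are assumptions.

```python
import numpy as np

def soft_threshold(z, thr):
    """Proximal operator of thr * ||.||_1."""
    return np.sign(z) * np.maximum(np.abs(z) - thr, 0.0)

def penalized_objective(beta, X, y, lam):
    """Mean Poisson deviance plus the L1 penalty, as in (4)."""
    eta = X @ beta
    safe_y = np.where(y > 0, y, 1.0)
    ylog = np.where(y > 0, y * (np.log(safe_y) - eta), 0.0)
    with np.errstate(over="ignore"):
        w = np.exp(eta)
    return float(np.mean(ylog - (y - w)) + lam * np.abs(beta).sum())

def lasso_poisson(X, y, lam, n_iter=500):
    """ISTA sketch: gradient step on the deviance, then soft-thresholding."""
    beta = np.zeros(X.shape[1])
    obj = penalized_objective(beta, X, y, lam)
    for _ in range(n_iter):
        grad = X.T @ (np.exp(X @ beta) - y) / len(y)
        t = 1.0
        while t > 1e-12:
            cand = soft_threshold(beta - t * grad, t * lam)
            cand_obj = penalized_objective(cand, X, y, lam)
            if cand_obj <= obj:          # accept the first non-increasing step
                break
            t *= 0.5
        else:
            return beta                  # no step size gave a decrease
        gain = obj - cand_obj
        beta, obj = cand, cand_obj
        if gain < 1e-12:
            break
    return beta
```

With a sufficiently large λ every coefficient is thresholded to zero, while λ = 0 reduces the update to plain gradient descent on the unpenalized deviance.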

Remark 2.3

We suggest that researchers adopt PMQE Q LASSO or PMQE LASSO as the basic methodology. This is because the LASSO penalty term with a well-adjusted $λ > 0$ enhances the matching performance of the method, as illustrated in Section 3. Note that PMQE Q and PMQE can simply be considered as the special cases with $λ = 0$ .

Theorem 2.1 considers the convergence of Algorithm 1, the proof ideas of which are in line with those found in the work of [23].

Theorem 2.1

Consider $D_{n}^{(k)} (β)$ defined in (4), and $β^{(k)} = \arg min_{β} D_{n}^{(k)} (β)$ . Then, we obtain $D_{n}^{(k)} (β^{(k)}) \to c$ as the iteration number $k \to \infty$ , where $c \geq 0$ is a constant.

Proof.

We need to show $D_{n}^{(k + 1)} (β^{(k + 1)}) \leq D_{n}^{(k)} (β^{(k)})$ for $k = 1, 2, \dots$ . Note that

$\begin{aligned} D_{n}^{(k + 1)} (β^{(k + 1)}) \\ = \frac{1}{n} \sum_{j = 1}^{n} ({\tilde{Q}}_{n, Y} (j / n) \log \frac{{\tilde{Q}}_{n, Y} (j / n)}{\exp (β^{(k + 1)^{'}} X_{(j)}^{(k)})} - ({\tilde{Q}}_{n, Y} (j / n) - \exp (β^{(k + 1)^{'}} X_{(j)}^{(k)}))) \\ + λ \sum_{i = 1}^{p} | β_{i}^{(k + 1)} | \\ \leq \frac{1}{n} \sum_{j = 1}^{n} ({\tilde{Q}}_{n, Y} (j / n) \log \frac{{\tilde{Q}}_{n, Y} (j / n)}{\exp (β^{(k)^{'}} X_{(j)}^{(k)})} - ({\tilde{Q}}_{n, Y} (j / n) - \exp (β^{(k)^{'}} X_{(j)}^{(k)}))) \\ + λ \sum_{i = 1}^{p} | β_{i}^{(k)} | \\ \leq \frac{1}{n} \sum_{j = 1}^{n} ({\tilde{Q}}_{n, Y} (j / n) \log \frac{{\tilde{Q}}_{n, Y} (j / n)}{\exp (β^{(k)^{'}} X_{(j)}^{(k - 1)})} - ({\tilde{Q}}_{n, Y} (j / n) - \exp (β^{(k)^{'}} X_{(j)}^{(k - 1)}))) \\ + λ \sum_{i = 1}^{p} | β_{i}^{(k)} | \\ = D_{n}^{(k)} (β^{(k)}), \end{aligned}$

where the first inequality follows from the definition of $β^{(k + 1)}$ and the second inequality follows from Lemma A.1. Thus, $D_{n}^{(k)} (β^{(k)})$ is monotonically non-increasing in k and bounded below by zero, which proves the claim.
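The second inequality says that, for a fixed coefficient, pairing the sorted targets with the sorted fitted values cannot do worse than any other pairing. Lemma A.1 itself is given in the Appendix; the brute-force check below only illustrates that kind of statement on a small made-up example, and the numbers and function names are hypothetical.

```python
import itertools
import math

def deviance(y, w):
    """Mean Poisson deviance between paired values; y*log(y/w) is 0 when y == 0."""
    total = 0.0
    for yj, wj in zip(y, w):
        ylog = yj * math.log(yj / wj) if yj > 0 else 0.0
        total += ylog - (yj - wj)
    return total / len(y)

def best_pairing(y_sorted, w):
    """Search all orderings of w for the one with the smallest deviance."""
    return min(itertools.permutations(w), key=lambda perm: deviance(y_sorted, perm))

y_sorted = [0.0, 1.0, 1.5, 4.0, 7.0]   # targets in ascending order
w = [2.3, 0.4, 5.1, 1.2, 3.3]          # fitted values in arbitrary order
best = best_pairing(y_sorted, w)
print(list(best) == sorted(w))
```

Since the sum of the fitted values does not depend on the pairing, only the term $\sum_{j} y_{j} \log w_{σ (j)}$ changes across orderings, and by the rearrangement inequality it is largest when both sequences are sorted in the same direction.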

Next, we consider the asymptotic properties of the estimation. Note that by defining

\tilde{D} (β) := \int_{0}^{1} \left\{ Q_{W} (α) - {\tilde{Q}}_{Y} (α) \log Q_{W} (α) \right\} d α,

we have

β_{0} = \arg min_{β} \tilde{D} (β) .

(5)
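To see why (5) holds, one can expand the logarithm in (1). The constant $C_{Y}$ below is notation introduced here for illustration, and the step assumes that the integral defining it is finite, with the convention $0 \log 0 = 0$ :

$\begin{aligned} D (β) & = \int_{0}^{1} \left\{ {\tilde{Q}}_{Y} (α) \log {\tilde{Q}}_{Y} (α) - {\tilde{Q}}_{Y} (α) \right\} d α + \int_{0}^{1} \left\{ Q_{W} (α) - {\tilde{Q}}_{Y} (α) \log Q_{W} (α) \right\} d α \\ & = C_{Y} + \tilde{D} (β) . \end{aligned}$

Since $C_{Y}$ does not depend on $β$ , the two criteria $D (β)$ and $\tilde{D} (β)$ share the same set of minimizers.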

We treat $β_{0}$ as the true value to be estimated. The empirical counterpart of $\tilde{D} (β)$ is defined as follows:

{\tilde{D}}_{n} (β) = \frac{1}{n} \sum_{j = 1}^{n} (Q_{n, W} (j / n) - {\tilde{Q}}_{n, Y} (j / n) \log Q_{n, W} (j / n)) .

This gives us

{\hat{β}}^{PMQ} = \arg min_{β} {\tilde{D}}_{n} (β) .

(6)

We also let $B_{0} := \{β : β \in \arg min_{β} D (β)\}$ and $d (\hat{β}, B_{0}) = \inf_{β \in B_{0}} d (\hat{β}, β)$ where $d (β_{0}, β_{1}) = ‖ β_{0} - β_{1} ‖_{2}$ .

Theorem 2.2

Under Assumption 2.1, let the following conditions hold:

(i)
The density $f_{X} (\cdot)$ exists, and for any fixed $β$ and by defining $W := \exp (β^{T} X)$ , there exists $Ω_{n} \subseteq (c_{1} n^{- τ_{0}}, c_{2} n^{τ_{0}})$ for positive constants $c_{1}$ and $c_{2}$ such that for any $0 < τ_{0} < 1 / 4$ ,
$\sup_{Q_{W} (α) \in Ω_{n}} | f_{W}^{'} (Q_{W} (α)) | < \infty, \quad \inf_{Q_{W} (α) \in Ω_{n}} f_{W} (Q_{W} (α)) \geq n^{- τ_{0}},$ (7)
where $P (W \in Ω_{n}^{c}) = o (n^{- τ_{0}})$ and $f_{W}^{'}$ is the first derivative of density $f_{W}$ .

(ii)
Y satisfies $E (Y) < \infty$ with the tail probability $P (Y > y) = O (\exp (- y)) .$

Then, as $n \to \infty$ ,

(1)
${\tilde{D}}_{n} ({\hat{β}}^{PMQ}) \to \tilde{D} (β_{0})$ in probability.

(2)
$d ({\hat{β}}^{PMQ}, B_{0}) \to 0$ in probability.

Remark 2.4

We consider more relaxed conditions in Theorem 2.2 than those in [23] so that we allow unbounded supports for discrete Y and continuous W. In detail, the tail probability in condition (ii) naturally induces the upper bound for the maximum statistics $P (Y_{(n)} > c \log n) = o (n^{- 1})$ . Combined with $Ω_{n}$ , these bounds provide probabilistically negligible sets for very large Y and W. Note that condition (ii) is a general assumption in the sense that it encompasses the widely considered Poisson random variable. Meanwhile, condition (i) is a direct extension of the work of [20]. It allows W to have infinite support while ensuring the Bahadur-Kiefer bound (see [13]). For example, when $\log W \sim N (0, 1)$ , we can take $Ω_{n} = [\exp (- \sqrt{2 τ_{0} \log n - \log 2 π}), \exp (\sqrt{2 τ_{0} \log n - \log 2 π})]$ which satisfies the condition (i).
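The bound on the maximum can be seen from a union bound over the sample. The requirement $c > 2$ below is an assumption made here so that the stated rate follows; the text does not specify the constant:

$\begin{aligned} P (Y_{(n)} > c \log n) & \leq \sum_{i = 1}^{n} P (Y_{i} > c \log n) = n P (Y > c \log n) \\ & = n \cdot O (\exp (- c \log n)) = O (n^{1 - c}) = o (n^{- 1}) \quad \text{for } c > 2 . \end{aligned}$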

Remark 2.5

In the proof of Theorem 2.2, we extensively exploit the asymptotic property of the smoothed discrete Y quantile. We address one of the major theoretical challenges: regulating the empirical quantile convergence of discrete Y (i.e. ${\tilde{Q}}_{n, Y} (α) \overset{p}{\to} {\tilde{Q}}_{Y} (α)$ ). We specifically adopt the convergence result established in Theorem 2 of [17]. Detailed strategies are mainly contained in the proof of Lemma 6.2 in the Appendix.

The proofs of Theorem 2.2 are given in the Appendix Section.

3. Simulation study

3.1. Matching measures

To demonstrate the performance of PMQE, we consider the $L_{1}$ -Wasserstein distance and mean matching deviance (MMD). First, Wasserstein distance, also known as Mallows' distance or earth mover's distance, measures the distance between probability distributions (see [19]). As Wasserstein distance can be computed for both continuous and discrete distributions [14], we utilize this metric for comparison.

Following the formula presented in [28], we consider the $L_{1}$ -Wasserstein distance

W_{1} (X, Y; \tilde{β}) = \int_{- \infty}^{\infty} | {\hat{F}}_{Y} (s) - {\hat{F}}_{\exp ({\tilde{β}}^{'} X)} (s) | d s,

where ${\hat{F}}_{δ}$ denotes the empirical distribution function of a random variable δ.
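For two samples of equal size n, the integral above reduces to the mean absolute difference between the two sets of order statistics, which gives a short way to evaluate it. The sketch below relies on that identity and assumes equal sample sizes; the function name `w1_distance` is hypothetical.

```python
import numpy as np

def w1_distance(X, y, beta):
    """L1-Wasserstein distance between the empirical distributions of y and
    exp(X @ beta), assuming both samples have the same size n.

    For equal sizes, the integral of |F_Y - F_W| equals
    (1/n) * sum_j |y_(j) - w_(j)|.
    """
    w = np.sort(np.exp(np.asarray(X, dtype=float) @ np.asarray(beta, dtype=float)))
    y = np.sort(np.asarray(y, dtype=float))
    if w.size != y.size:
        raise ValueError("this sketch assumes equal sample sizes")
    return float(np.mean(np.abs(y - w)))

# Fitted values exp(0) = 1 and exp(log 3) = 3 against observed counts 3 and 5.
X = np.array([[0.0], [np.log(3.0)]])
print(w1_distance(X, [5, 3], [1.0]))
```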

The second measure is the MMD, which is another empirical version of (1) and defined as

MMD (X, Y; \tilde{β}) = \frac{1}{n} \sum_{j = 1}^{n} [Y_{(j)} \cdot \log (\frac{Y_{(j)}}{\exp ({\tilde{β}}^{T} X)_{(j)}}) - (Y_{(j)} - \exp ({\tilde{β}}^{T} X)_{(j)})] .

MMD is very similar to (3), except that we simply use the order statistics for Y instead of empirical quantiles based on the mid-distribution function. It is an alternative measure to the root mean matching error used by Sgouropoulos et al. [23] that considers the nature of the generalized linear model.
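A direct transcription of the MMD formula might look as follows. It is a sketch only: the function name `mmd` is hypothetical, and the term $Y_{(j)} \log (\cdot)$ is taken to be zero when $Y_{(j)} = 0$ , which is an assumption about how zero counts are handled.

```python
import numpy as np

def mmd(X, y, beta):
    """Mean matching deviance between sorted y and sorted exp(X @ beta)."""
    w = np.sort(np.exp(np.asarray(X, dtype=float) @ np.asarray(beta, dtype=float)))
    y = np.sort(np.asarray(y, dtype=float))
    safe_y = np.where(y > 0, y, 1.0)                 # avoid log(0)
    ylog = np.where(y > 0, y * np.log(safe_y / w), 0.0)
    return float(np.mean(ylog - (y - w)))

# A perfect match of the two sets of order statistics gives a deviance of zero.
X = np.array([[0.0], [np.log(2.0)]])
print(mmd(X, [2, 1], [1.0]))
```

Each summand is nonnegative, so MMD is zero only when the two sets of order statistics coincide.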

3.2. Simulation for unpaired data

In this simulation study, after generating the paired data, we intentionally shuffle the pairs of Y and the covariates in the sample to obtain the unpaired data.

For the sake of variety, we let $n \in \{200, 500, 1000\}$ , $p \in \{2, 6, 10\}$ and $r \in \{1, 5\}$ . Detailed simulation steps from the data generation to the performance evaluation for each n, p and r scheme are as follows:

Step 1
Randomly sample $X_{i} = (X_{i 1}, \dots, X_{ip})$ , where $X_{i 1}, \dots, X_{i \frac{p}{2}} \overset{iid}{\sim} N (0, 1)$ and $X_{i (\frac{p}{2} + 1)}, \dots, X_{ip} \overset{iid}{\sim} U (0, 1)$ for $i = 1, \dots, n$ .
Step 2
Generate $β = (β_{1}, \dots, β_{p})$ where $β_{1}, \dots, β_{p} \overset{iid}{\sim} U (0, \frac{r}{10})$ .
Step 3
Calculate $μ_{i} = \exp (β_{0} + β^{T} X_{i} + ϵ_{i})$ for $i = 1, \dots, n$ , with an intercept $β_{0} = 1$ and additional noises $ϵ_{1}, \dots, ϵ_{n} \overset{iid}{\sim} N (0, 1)$ .
Step 4
Generate $Y_{i} \sim Poisson (μ_{i})$ for $i = 1, \dots, n$ .
Step 5
Denote $X^{unpaired} := (X_{1}, \dots, X_{\frac{n}{2}})$ , $Y^{unpaired} := (Y_{\frac{n}{2} + 1}, \dots, Y_{n})$ and $Y^{true} := (Y_{1}, \dots, Y_{\frac{n}{2}})$ . Combine $X^{unpaired}$ and $Y^{unpaired}$ and designate them as the unpaired training data set.
Step 6
Fit nine methods (namely, PMQE, PMQE Q, the corresponding LASSO versions with $λ \in \{1, 0.1, 0.01\}$ and Poisson GLM) using the unpaired training data set. Consequently, the coefficient estimate $\tilde{β}$ is obtained for each method.
Step 7
Repeat Steps 1 to 6 100 times and record the means and standard errors of $W_{1} (X^{unpaired}, Y^{true}; \tilde{β})$ and MMD $(X^{unpaired}, Y^{true}; \tilde{β})$ for each method.
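Steps 1 to 5 can be sketched in NumPy as follows. The function name `simulate_unpaired`, the use of `numpy.random.Generator` and the seed are assumptions made for illustration; the fitting and evaluation of Steps 6 and 7 are not included.

```python
import numpy as np

def simulate_unpaired(n, p, r, rng):
    """Generate one replicate of the unpaired design (Steps 1 to 5)."""
    half = p // 2
    # Step 1: first p/2 covariates are N(0, 1), the rest are U(0, 1).
    X = np.column_stack([rng.normal(size=(n, half)),
                         rng.uniform(size=(n, p - half))])
    # Step 2: coefficients drawn from U(0, r/10).
    beta = rng.uniform(0.0, r / 10.0, size=p)
    # Step 3: log-linear mean with intercept 1 and N(0, 1) noise.
    mu = np.exp(1.0 + X @ beta + rng.normal(size=n))
    # Step 4: Poisson counts.
    y = rng.poisson(mu)
    # Step 5: covariates from the first half, responses from the second half.
    m = n // 2
    X_unpaired, y_unpaired, y_true = X[:m], y[m:], y[:m]
    return X_unpaired, y_unpaired, y_true, beta

rng = np.random.default_rng(0)
X_unpaired, y_unpaired, y_true, beta = simulate_unpaired(200, 6, 5, rng)
```

Because the training responses come from the second half of the sample and the covariates from the first half, no row of the training covariates is paired with its own response.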

As PMQE is the first methodology to consider the discrete Y in unpaired data, there exists no applicable benchmark model for this scenario. To address this issue, we include Poisson GLM in the simulation to examine whether appropriately accounting for the unpaired circumstances with PMQE indeed results in improved distribution matching.

The results in Table 1 point to two important properties of PMQE: first, PMQE yields distribution matching superior to that of the conventional Poisson GLM in various circumstances, and second, a properly tuned penalty parameter enhances the matching quality of PMQE LASSO. More specifically, Table 1 shows that, in every n, p and r scheme, the closest distribution match in terms of $L_{1}$ -Wasserstein distance is always produced by one of the PMQE models. Note that we highlight the lowest Wasserstein mean value in bold for each n, p and r setting. Calculating the $95\%$ confidence intervals (CIs), we notice that the CIs of Poisson GLM tend to lie apart from the CIs of PMQE. The most comparable result between GLM and PMQE arises for n = 200, p = 10 and r = 5, where the CIs of PMQE Q, PMQE Q LASSO 0.01 and GLM overlap. Nevertheless, the CI of PMQE LASSO 1 is set apart from that of GLM. No CI overlap between GLM and PMQE methods is observed in any other n, p and r scenario. From these results, we conclude that PMQE consistently generates a more precise distribution approximation than GLM.
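The overlap comparison for the n = 200, p = 10, r = 5 case can be reproduced from the reported means and standard errors. The sketch assumes normal-approximation intervals of the form mean ± 1.96 se, which the text does not state explicitly; the helper names are hypothetical, and the numbers are taken from Table 1.

```python
def ci95(mean, se):
    """Normal-approximation 95% interval (an assumption about how CIs are formed)."""
    half = 1.96 * se
    return (mean - half, mean + half)

def overlaps(a, b):
    """True if the closed intervals a and b share at least one point."""
    return a[0] <= b[1] and b[0] <= a[1]

# Table 1, n = 200, p = 10, r = 5: mean (se)
glm = ci95(5.91, 0.18)
pmqe_q = ci95(4.86, 0.51)
pmqe_q_lasso_001 = ci95(4.73, 0.50)
pmqe_lasso_1 = ci95(3.15, 0.18)

print(overlaps(pmqe_q, glm), overlaps(pmqe_q_lasso_001, glm), overlaps(pmqe_lasso_1, glm))
```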

Table 1.

Means (standard errors) of L1-Wasserstein distance for the unpaired data.

	Variables	p = 2				p = 6				p = 10
	$U (0, \frac{r}{10})$	r = 1		r = 5		r = 1		r = 5		r = 1		r = 5
sample	method	mean	se	mean	se	mean	se	mean	se	mean	se	mean	se
n = 200	PMQE	1.31	(0.08)	1.69	(0.12)	1.51	(0.14)	2.46	(0.22)	1.96	(0.14)	4.19	(0.39)
	PMQE Q	1.2	(0.06)	1.75	(0.21)	1.47	(0.11)	2.43	(0.16)	2.1	(0.19)	4.86	(0.51)
	PMQE LASSO 1	1.41	(0.05)	1.56	(0.07)	1.37	(0.04)	2.03	(0.09)	1.49	(0.06)	3.15	(0.18)
	PMQE LASSO 0.1	1.27	(0.07)	1.6	(0.12)	1.26	(0.1)	2.24	(0.2)	1.41	(0.08)	3.68	(0.37)
	PMQE LASSO 0.01	1.3	(0.08)	1.68	(0.12)	1.49	(0.14)	2.45	(0.22)	1.9	(0.13)	4.18	(0.4)
	PMQE Q LASSO 1	1.36	(0.04)	1.71	(0.14)	1.41	(0.04)	2.1	(0.1)	1.38	(0.04)	3.35	(0.24)
	PMQE Q LASSO 0.1	1.18	(0.06)	1.69	(0.21)	1.27	(0.1)	2.14	(0.15)	1.36	(0.1)	3.86	(0.4)
	PMQE Q LASSO 0.01	1.21	(0.06)	1.75	(0.21)	1.44	(0.1)	2.41	(0.16)	1.95	(0.17)	4.73	(0.5)
	GLM	3.36	(0.07)	4	(0.1)	2.9	(0.06)	4.82	(0.13)	2.67	(0.06)	5.91	(0.18)
n = 500	PMQE	0.97	(0.04)	1.1	(0.05)	0.93	(0.04)	1.73	(0.11)	1.14	(0.05)	2.6	(0.17)
	PMQE Q	0.91	(0.03)	1.08	(0.04)	0.92	(0.03)	1.6	(0.09)	1.13	(0.04)	2.33	(0.15)
	PMQE LASSO 1	1.14	(0.04)	1.22	(0.03)	1.11	(0.03)	1.72	(0.07)	1.16	(0.03)	2.12	(0.09)
	PMQE LASSO 0.1	0.93	(0.04)	1.06	(0.05)	0.82	(0.03)	1.58	(0.11)	0.97	(0.04)	2.23	(0.16)
	PMQE LASSO 0.01	0.96	(0.04)	1.09	(0.05)	0.91	(0.04)	1.71	(0.11)	1.09	(0.04)	2.55	(0.17)
	PMQE Q LASSO 1	1.21	(0.03)	1.25	(0.03)	1.2	(0.03)	1.62	(0.07)	1.17	(0.02)	2.29	(0.11)
	PMQE Q LASSO 0.1	0.85	(0.03)	1.06	(0.04)	0.86	(0.03)	1.46	(0.07)	0.9	(0.03)	2.13	(0.12)
	PMQE Q LASSO 0.01	0.9	(0.03)	1.09	(0.04)	0.9	(0.03)	1.57	(0.09)	1.08	(0.04)	2.3	(0.14)
	GLM	3.58	(0.05)	4.13	(0.07)	3.25	(0.04)	5.36	(0.11)	3.25	(0.04)	7.19	(0.19)
n = 1000	PMQE	0.76	(0.02)	0.79	(0.03)	0.79	(0.02)	1.16	(0.06)	0.93	(0.03)	1.83	(0.1)
	PMQE Q	0.72	(0.02)	0.79	(0.03)	0.8	(0.03)	1.23	(0.08)	0.85	(0.03)	1.76	(0.08)
	PMQE LASSO 1	1.03	(0.02)	1.09	(0.03)	1.08	(0.03)	1.34	(0.04)	1.09	(0.03)	1.88	(0.06)
	PMQE LASSO 0.1	0.73	(0.02)	0.76	(0.02)	0.72	(0.02)	1.07	(0.04)	0.74	(0.02)	1.71	(0.09)
	PMQE LASSO 0.01	0.76	(0.02)	0.79	(0.03)	0.77	(0.02)	1.14	(0.05)	0.89	(0.03)	1.8	(0.09)
	PMQE Q LASSO 1	1.05	(0.02)	1.13	(0.03)	1.04	(0.02)	1.4	(0.04)	1.11	(0.03)	1.86	(0.07)
	PMQE Q LASSO 0.1	0.69	(0.02)	0.78	(0.03)	0.71	(0.02)	1.14	(0.07)	0.71	(0.02)	1.6	(0.06)
	PMQE Q LASSO 0.01	0.71	(0.02)	0.79	(0.03)	0.78	(0.03)	1.22	(0.08)	0.81	(0.03)	1.71	(0.07)
	GLM	3.63	(0.03)	4.23	(0.06)	3.54	(0.03)	5.91	(0.12)	3.59	(0.03)	8.08	(0.19)

It is also noticeable that the matching performance of PMQE is further enhanced by the LASSO regularization: the lowest mean values are attained by one of the PMQE LASSO models. Revisiting the n = 200, p = 10 and r = 5 case, we also find that some PMQE LASSO models yield more consistent performance. While PMQE Q and PMQE Q LASSO 0.01 exhibit performance similar to GLM, the more strictly penalized PMQE LASSO models (e.g. PMQE Q LASSO 0.1 and PMQE Q LASSO 1) display improved performance: their CIs fall outside the range of the Poisson GLM CI. The result suggests that LASSO regularization with a sensible adjustment of λ helps PMQE to produce consistent performance in various circumstances, at least in the scenarios considered here. Note that in this study, we only considered a restricted number of choices for the penalty parameter λ. In practice, λ can be chosen with a proper parameter tuning process (e.g. K-fold cross validation).

Table 2 shows the simulation results in terms of MMD. The outcome closely resembles Table 1, so we omit the details.

Table 2.

Means (standard errors) of MMD for the unpaired data.

	Variables	p = 2				p = 6				p = 10
	$U (0, \frac{r}{10})$	r = 1		r = 5		r = 1		r = 5		r = 1		r = 5
sample	method	mean	se	mean	se	mean	se	mean	se	mean	se	mean	se
n = 200	PMQE	0.28	(0.03)	0.44	(0.07)	0.38	(0.08)	0.63	(0.11)	0.58	(0.08)	1.37	(0.22)
	PMQE Q	0.24	(0.02)	0.43	(0.11)	0.33	(0.05)	0.61	(0.06)	0.66	(0.12)	1.72	(0.38)
	PMQE LASSO 1	0.3	(0.02)	0.35	(0.04)	0.32	(0.02)	0.47	(0.06)	0.34	(0.02)	0.76	(0.07)
	PMQE LASSO 0.1	0.26	(0.03)	0.39	(0.06)	0.27	(0.05)	0.54	(0.1)	0.31	(0.04)	1.15	(0.22)
	PMQE LASSO 0.01	0.28	(0.03)	0.43	(0.07)	0.37	(0.08)	0.62	(0.11)	0.55	(0.08)	1.38	(0.24)
	PMQE Q LASSO 1	0.31	(0.02)	0.41	(0.06)	0.32	(0.02)	0.48	(0.05)	0.28	(0.02)	0.87	(0.1)
	PMQE Q LASSO 0.1	0.23	(0.02)	0.4	(0.1)	0.26	(0.04)	0.49	(0.06)	0.28	(0.05)	1.18	(0.27)
	PMQE Q LASSO 0.01	0.24	(0.02)	0.42	(0.11)	0.32	(0.04)	0.6	(0.06)	0.58	(0.11)	1.65	(0.37)
	GLM	1.56	(0.06)	1.99	(0.09)	1.18	(0.04)	2.2	(0.1)	0.91	(0.04)	2.45	(0.11)
n = 500	PMQE	0.15	(0.01)	0.18	(0.02)	0.13	(0.01)	0.35	(0.05)	0.19	(0.02)	0.54	(0.07)
	PMQE Q	0.13	(0.01)	0.17	(0.01)	0.13	(0.01)	0.31	(0.04)	0.19	(0.01)	0.46	(0.06)
	PMQE LASSO 1	0.23	(0.01)	0.24	(0.01)	0.2	(0.01)	0.33	(0.03)	0.22	(0.01)	0.38	(0.03)
	PMQE LASSO 0.1	0.14	(0.01)	0.17	(0.02)	0.11	(0.01)	0.3	(0.05)	0.14	(0.01)	0.41	(0.06)
	PMQE LASSO 0.01	0.15	(0.01)	0.18	(0.02)	0.13	(0.01)	0.34	(0.05)	0.18	(0.01)	0.52	(0.07)
	PMQE Q LASSO 1	0.26	(0.01)	0.24	(0.01)	0.23	(0.01)	0.33	(0.03)	0.23	(0.01)	0.44	(0.04)
	PMQE Q LASSO 0.1	0.12	(0.01)	0.16	(0.01)	0.12	(0.01)	0.27	(0.03)	0.12	(0.01)	0.4	(0.05)
	PMQE Q LASSO 0.01	0.13	(0.01)	0.17	(0.01)	0.12	(0.01)	0.3	(0.04)	0.18	(0.01)	0.45	(0.05)
	GLM	1.81	(0.05)	2.19	(0.06)	1.44	(0.03)	2.85	(0.09)	1.38	(0.03)	3.67	(0.14)
n = 1000	PMQE	0.09	(0.01)	0.09	(0.01)	0.1	(0.01)	0.18	(0.02)	0.13	(0.01)	0.3	(0.03)
	PMQE Q	0.08	(0.01)	0.09	(0.01)	0.11	(0.01)	0.21	(0.04)	0.11	(0.01)	0.26	(0.02)
	PMQE LASSO 1	0.2	(0.01)	0.2	(0.01)	0.21	(0.01)	0.23	(0.01)	0.2	(0.01)	0.33	(0.02)
	PMQE LASSO 0.1	0.08	(0.01)	0.09	(0.01)	0.08	(0)	0.15	(0.01)	0.08	(0.01)	0.27	(0.02)
	PMQE LASSO 0.01	0.09	(0.01)	0.09	(0.01)	0.09	(0.01)	0.17	(0.02)	0.12	(0.01)	0.29	(0.03)
	PMQE Q LASSO 1	0.2	(0.01)	0.21	(0.01)	0.19	(0.01)	0.24	(0.02)	0.21	(0.01)	0.35	(0.03)
	PMQE Q LASSO 0.1	0.07	(0)	0.09	(0.01)	0.08	(0.01)	0.18	(0.03)	0.08	(0.01)	0.23	(0.02)
	PMQE Q LASSO 0.01	0.08	(0.01)	0.09	(0.01)	0.1	(0.01)	0.21	(0.03)	0.1	(0.01)	0.25	(0.02)
	GLM	1.86	(0.03)	2.3	(0.05)	1.72	(0.03)	3.39	(0.1)	1.66	(0.03)	4.77	(0.15)

3.3. Simulation for unpaired data with overdispersed Y

In this simulation study, we compare and evaluate the performance of the PMQE when data are over-dispersed. Count data following a Poisson distribution must satisfy the assumption of equidispersion. Nonetheless, in real-life cases, we observe over-dispersed count data where the variance exceeds the mean. This phenomenon has been extensively considered in the literature, particularly in the studies of [4,7]. We aim to examine the performance of PMQE in the presence of such overdispersion.

To generate over-dispersed count data, we utilize the generalized Poisson distribution (GP distribution) proposed by Consul and Jain [6] (see also [10]). Unlike a typical Poisson distribution, the first two cumulants of a GP distribution depend on two parameters, θ and δ, where the second parameter δ controls the level of dispersion. For example, if a random variable Y follows a GP distribution such that $Y \sim GP (θ, δ)$ , Y has a probability mass function

f_{Y} (y; θ, δ) = \frac{θ (θ + δy)^{y - 1} e^{- θ - δy}}{y!}, y = 0, 1, 2, \dots,

where $θ > 0$ and $max (- 1, - θ / 4) < δ < 1$ . The mean and variance of Y are $E (Y) = \frac{θ}{1 - δ}$ and $Var (Y) = \frac{θ}{(1 - δ)^{3}}$ . Clearly, the variance of Y exceeds the mean when $δ > 0$ , since $Var (Y) = \frac{1}{(1 - δ)^{2}} E (Y)$ . Hence, data generated from the GP distribution are over-dispersed whenever $δ > 0$ . When $δ = 0$ , the GP distribution corresponds to the typical Poisson distribution. The simulation process is almost identical to Steps 1 to 7 in Subsection 3.2, except that we utilize the GP distribution as a data generator in Step 4, where $Y_{i}$ is generated as follows:

Y_{i} \sim GP (μ_{i} (1 - δ), δ), i = 1, \dots, n,

where the $μ_{i}$ are from Step 3, so that $E (Y_{i}) = μ_{i}$ and $Var (Y_{i}) = E (Y_{i}) / (1 - δ)^{2}$ . In addition, we compare only seven methods (PMQE, PMQE Q, the associated LASSO versions with $λ \in \{1, 0.1\}$ and Poisson GLM) in Step 6. We exclude the $λ = 0.01$ cases because they perform similarly to the unpenalized methods. Under the same n, p and r settings, we further vary the dispersion parameter of the GP distribution over $δ \in \{0.1, 0.3, 0.5\}$ . These δ values yield count data whose variance is approximately 1.23, 2.04 and 4 times its mean, respectively.
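The moments implied by the parameterization $Y_{i} \sim GP (μ_{i} (1 - δ), δ)$ can be checked numerically by summing the probability mass function over a long truncated range. The sketch below is restricted to $0 \leq δ < 1$ , and the function names and the truncation point `y_max` are assumptions made for illustration.

```python
import math

def gp_log_pmf(y, theta, delta):
    """Log of the generalized Poisson pmf, for theta > 0 and 0 <= delta < 1."""
    return (math.log(theta) + (y - 1) * math.log(theta + delta * y)
            - theta - delta * y - math.lgamma(y + 1))

def gp_moments(theta, delta, y_max=2000):
    """Total mass, mean and variance from the pmf truncated at y_max."""
    probs = [math.exp(gp_log_pmf(y, theta, delta)) for y in range(y_max + 1)]
    total = sum(probs)
    mean = sum(y * q for y, q in enumerate(probs))
    var = sum((y - mean) ** 2 * q for y, q in enumerate(probs))
    return total, mean, var

mu = 3.0
for delta in (0.1, 0.3, 0.5):
    total, mean, var = gp_moments(mu * (1.0 - delta), delta)
    # The ratio var / mean should be close to 1 / (1 - delta)^2.
    print(delta, round(mean, 4), round(var / mean, 4))
```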

Table 3 shows that PMQE methods produce stable matching performance under overdispersion. We notice that as the dispersion parameter δ increases, the overall matching performance of PMQE methods degrades. For example, when n = 1000, p = 10, and r = 5, the lowest Wasserstein mean value increases as δ increases (1.63, 1.82, and 1.86 for $δ = 0.1, 0.3$ and 0.5, respectively). This phenomenon tends to occur for every method in every setting. Nevertheless, similar to the previous simulation, PMQE always outperforms GLM, and PMQE LASSO with well-adjusted λ produces the closest distribution approximations. That is, in every n, p, r, and δ scheme, no overlapping CIs between PMQE methods and GLM are observed. Also, PMQE LASSO produces even closer distribution matching than the non-penalized PMQE. For instance, in the n = 1000, p = 10, r = 1 and $δ = 0.5$ case, the CIs of PMQE LASSO 0.1 and PMQE Q LASSO 0.1 lie apart from and below the CIs of every other competitor. In short, the simulation result shows that PMQE is applicable to overdispersed data (see also Table 4).

Table 3.

Means (standard errors) of the Wasserstein distance for the unpaired data with overdispersed Y.

		Variables	p = 2				p = 6				p = 10
		$U (0, \frac{r}{10})$	r = 1		r = 5		r = 1		r = 5		r = 1		r = 5
dispersion	sample	method	mean	se	mean	se	mean	se	mean	se	mean	se	mean	se
$δ = 0.1$	n = 500	PMQE	0.89	(0.03)	1.06	(0.04)	1.13	(0.05)	1.44	(0.06)	1.33	(0.06)	2.65	(0.16)
		PMQE Q	0.89	(0.04)	1.07	(0.04)	1.01	(0.05)	1.56	(0.08)	1.17	(0.05)	2.69	(0.18)
		PMQE LASSO 1	1.13	(0.03)	1.25	(0.03)	1.21	(0.03)	1.55	(0.04)	1.23	(0.03)	2.34	(0.13)
		PMQE LASSO 0.1	0.86	(0.03)	1.02	(0.04)	1.02	(0.05)	1.33	(0.05)	1.04	(0.04)	2.33	(0.14)
		PMQE Q LASSO 1	1.13	(0.03)	1.27	(0.04)	1.2	(0.03)	1.65	(0.06)	1.24	(0.04)	2.32	(0.13)
		PMQE Q LASSO 0.1	0.84	(0.03)	1.04	(0.04)	0.87	(0.03)	1.44	(0.08)	0.94	(0.03)	2.32	(0.15)
		GLM	3.57	(0.04)	4.21	(0.06)	3.42	(0.04)	5.32	(0.12)	3.36	(0.05)	7.33	(0.22)
	n = 1000	PMQE	0.73	(0.02)	0.89	(0.03)	0.83	(0.03)	1.23	(0.06)	0.94	(0.04)	1.87	(0.09)
		PMQE Q	0.75	(0.03)	0.85	(0.03)	0.82	(0.03)	1.31	(0.06)	0.96	(0.04)	1.9	(0.1)
		PMQE LASSO 1	1.05	(0.02)	1.11	(0.03)	1.08	(0.02)	1.39	(0.05)	1.1	(0.03)	1.87	(0.08)
		PMQE LASSO 0.1	0.7	(0.02)	0.84	(0.03)	0.75	(0.03)	1.16	(0.05)	0.76	(0.03)	1.72	(0.07)
		PMQE Q LASSO 1	1.06	(0.02)	1.15	(0.03)	1.08	(0.02)	1.4	(0.04)	1.09	(0.03)	1.94	(0.07)
		PMQE Q LASSO 0.1	0.72	(0.02)	0.83	(0.03)	0.74	(0.03)	1.2	(0.05)	0.76	(0.02)	1.63	(0.08)
		GLM	3.7	(0.03)	4.35	(0.06)	3.6	(0.04)	5.95	(0.12)	3.59	(0.03)	8.19	(0.18)
$δ = 0.3$	n = 500	PMQE	0.99	(0.04)	1.08	(0.03)	1.21	(0.05)	1.7	(0.09)	1.38	(0.06)	2.8	(0.19)
		PMQE Q	1.01	(0.05)	1.14	(0.06)	1.17	(0.05)	1.8	(0.09)	1.37	(0.07)	2.82	(0.18)
		PMQE LASSO 1	1.19	(0.03)	1.26	(0.03)	1.24	(0.03)	1.73	(0.05)	1.24	(0.03)	2.48	(0.11)
		PMQE LASSO 0.1	0.96	(0.04)	1.04	(0.03)	1.04	(0.04)	1.53	(0.07)	1.02	(0.04)	2.39	(0.15)
		PMQE Q LASSO 1	1.26	(0.03)	1.33	(0.04)	1.27	(0.02)	1.78	(0.06)	1.28	(0.03)	2.48	(0.13)
		PMQE Q LASSO 0.1	1	(0.05)	1.11	(0.05)	0.99	(0.04)	1.65	(0.07)	0.99	(0.03)	2.37	(0.14)
		GLM	3.84	(0.05)	4.37	(0.07)	3.64	(0.04)	5.71	(0.13)	3.56	(0.04)	7.45	(0.18)
	n = 1000	PMQE	0.79	(0.03)	0.94	(0.04)	0.94	(0.03)	1.38	(0.06)	1.06	(0.04)	2.19	(0.11)
		PMQE Q	0.74	(0.02)	0.94	(0.05)	0.89	(0.03)	1.27	(0.06)	1.12	(0.05)	2.06	(0.09)
		PMQE LASSO 1	1.09	(0.02)	1.18	(0.03)	1.08	(0.02)	1.43	(0.05)	1.18	(0.03)	1.92	(0.06)
		PMQE LASSO 0.1	0.75	(0.02)	0.93	(0.04)	0.82	(0.02)	1.3	(0.05)	0.83	(0.03)	1.89	(0.09)
		PMQE Q LASSO 1	1.1	(0.02)	1.2	(0.03)	1.13	(0.02)	1.45	(0.04)	1.21	(0.03)	1.97	(0.07)
		PMQE Q LASSO 0.1	0.72	(0.02)	0.89	(0.04)	0.79	(0.02)	1.18	(0.05)	0.85	(0.03)	1.82	(0.08)
		GLM	3.84	(0.03)	4.54	(0.06)	3.86	(0.04)	6.11	(0.11)	3.89	(0.04)	8.68	(0.2)
$δ = 0.5$	n = 500	PMQE	1.13	(0.06)	1.33	(0.06)	1.38	(0.06)	2.18	(0.13)	1.95	(0.12)	3.66	(0.27)
		PMQE Q	1.05	(0.04)	1.26	(0.09)	1.4	(0.08)	2.04	(0.1)	1.89	(0.11)	3.73	(0.3)
		PMQE LASSO 1	1.31	(0.03)	1.45	(0.04)	1.37	(0.03)	1.75	(0.07)	1.47	(0.05)	2.79	(0.16)
		PMQE LASSO 0.1	1.1	(0.05)	1.26	(0.06)	1.15	(0.06)	1.8	(0.1)	1.24	(0.07)	2.9	(0.2)
		PMQE Q LASSO 1	1.36	(0.03)	1.48	(0.06)	1.42	(0.03)	1.93	(0.07)	1.44	(0.05)	2.79	(0.2)
		PMQE Q LASSO 0.1	1.03	(0.04)	1.25	(0.09)	1.12	(0.06)	1.8	(0.09)	1.19	(0.06)	2.83	(0.25)
		GLM	4.22	(0.05)	4.79	(0.07)	4.1	(0.05)	6.12	(0.12)	3.9	(0.04)	7.9	(0.19)
	n = 1000	PMQE	0.83	(0.03)	1.05	(0.04)	0.96	(0.05)	1.48	(0.07)	1.29	(0.07)	2.53	(0.14)
		PMQE Q	0.77	(0.03)	1.04	(0.04)	0.96	(0.04)	1.5	(0.06)	1.28	(0.06)	2.23	(0.1)
		PMQE LASSO 1	1.22	(0.02)	1.31	(0.03)	1.27	(0.03)	1.59	(0.04)	1.3	(0.03)	2.04	(0.07)
		PMQE LASSO 0.1	0.83	(0.03)	1.07	(0.04)	0.81	(0.02)	1.36	(0.06)	0.91	(0.04)	1.93	(0.1)
		PMQE Q LASSO 1	1.26	(0.02)	1.36	(0.03)	1.29	(0.02)	1.57	(0.04)	1.3	(0.02)	2.01	(0.07)
		PMQE Q LASSO 0.1	0.76	(0.03)	1.02	(0.04)	0.85	(0.03)	1.38	(0.05)	0.95	(0.03)	1.86	(0.08)
		GLM	4.32	(0.03)	5.04	(0.06)	4.26	(0.04)	6.54	(0.12)	4.25	(0.04)	8.72	(0.21)


Table 4.

Means (standard errors) of the MMD for the unpaired data with overdispersed Y.

		Variables	p = 2				p = 6				p = 10
		$U (0, \frac{r}{10})$	r = 1		r = 5		r = 1		r = 5		r = 1		r = 5
dispersion	sample	method	mean	se	mean	se	mean	se	mean	se	mean	se	mean	se
$δ = 0.1$	n = 500	PMQE	0.12	(0.01)	0.18	(0.01)	0.19	(0.02)	0.23	(0.02)	0.27	(0.02)	0.58	(0.08)
		PMQE Q	0.13	(0.01)	0.16	(0.01)	0.16	(0.01)	0.27	(0.02)	0.2	(0.02)	0.62	(0.08)
		PMQE LASSO 1	0.23	(0.01)	0.25	(0.01)	0.23	(0.01)	0.28	(0.01)	0.23	(0.01)	0.58	(0.18)
		PMQE LASSO 0.1	0.11	(0.01)	0.16	(0.01)	0.15	(0.02)	0.2	(0.01)	0.17	(0.01)	0.47	(0.08)
		PMQE Q LASSO 1	0.21	(0.01)	0.25	(0.02)	0.23	(0.01)	0.31	(0.02)	0.23	(0.01)	0.5	(0.08)
		PMQE Q LASSO 0.1	0.11	(0.01)	0.15	(0.01)	0.12	(0.01)	0.23	(0.02)	0.13	(0.01)	0.46	(0.07)
		GLM	1.76	(0.04)	2.24	(0.05)	1.53	(0.04)	2.69	(0.09)	1.41	(0.04)	3.87	(0.18)
	n = 1000	PMQE	0.08	(0.01)	0.12	(0.01)	0.11	(0.01)	0.17	(0.01)	0.14	(0.01)	0.32	(0.03)
		PMQE Q	0.09	(0.01)	0.11	(0.01)	0.11	(0.01)	0.19	(0.02)	0.14	(0.01)	0.31	(0.03)
		PMQE LASSO 1	0.2	(0.01)	0.2	(0.01)	0.2	(0.01)	0.24	(0.02)	0.2	(0.01)	0.32	(0.03)
		PMQE LASSO 0.1	0.08	(0.01)	0.1	(0.01)	0.09	(0.01)	0.15	(0.01)	0.09	(0.01)	0.27	(0.03)
		PMQE Q LASSO 1	0.19	(0.01)	0.21	(0.01)	0.2	(0.01)	0.23	(0.01)	0.2	(0.01)	0.33	(0.02)
		PMQE Q LASSO 0.1	0.08	(0.01)	0.11	(0.01)	0.09	(0.01)	0.17	(0.02)	0.09	(0.01)	0.23	(0.02)
		GLM	1.88	(0.03)	2.34	(0.05)	1.7	(0.03)	3.34	(0.1)	1.64	(0.03)	4.74	(0.14)
$δ = 0.3$	n = 500	PMQE	0.16	(0.02)	0.16	(0.01)	0.23	(0.03)	0.31	(0.03)	0.29	(0.03)	0.62	(0.07)
		PMQE Q	0.18	(0.02)	0.18	(0.02)	0.21	(0.02)	0.35	(0.03)	0.29	(0.03)	0.62	(0.09)
		PMQE LASSO 1	0.22	(0.01)	0.24	(0.02)	0.23	(0.01)	0.32	(0.02)	0.23	(0.01)	0.51	(0.04)
		PMQE LASSO 0.1	0.15	(0.01)	0.15	(0.01)	0.17	(0.02)	0.25	(0.02)	0.15	(0.01)	0.47	(0.05)
		PMQE Q LASSO 1	0.25	(0.02)	0.26	(0.01)	0.24	(0.01)	0.34	(0.02)	0.23	(0.01)	0.52	(0.07)
		PMQE Q LASSO 0.1	0.17	(0.02)	0.17	(0.02)	0.15	(0.01)	0.29	(0.02)	0.14	(0.01)	0.47	(0.06)
		GLM	1.89	(0.05)	2.23	(0.06)	1.59	(0.04)	2.94	(0.1)	1.46	(0.03)	3.81	(0.13)
	n = 1000	PMQE	0.1	(0.01)	0.13	(0.01)	0.13	(0.01)	0.2	(0.02)	0.16	(0.01)	0.41	(0.05)
		PMQE Q	0.08	(0.01)	0.14	(0.02)	0.12	(0.01)	0.18	(0.03)	0.19	(0.02)	0.36	(0.03)
		PMQE LASSO 1	0.2	(0.01)	0.2	(0.01)	0.19	(0.01)	0.24	(0.01)	0.22	(0.01)	0.34	(0.02)
		PMQE LASSO 0.1	0.09	(0.01)	0.12	(0.01)	0.1	(0.01)	0.18	(0.01)	0.1	(0.01)	0.31	(0.03)
		PMQE Q LASSO 1	0.19	(0.01)	0.21	(0.01)	0.2	(0.01)	0.24	(0.01)	0.21	(0.01)	0.34	(0.02)
		PMQE Q LASSO 0.1	0.08	(0.01)	0.11	(0.01)	0.09	(0.01)	0.16	(0.02)	0.1	(0.01)	0.29	(0.03)
		GLM	1.87	(0.03)	2.39	(0.05)	1.81	(0.03)	3.34	(0.09)	1.77	(0.04)	5	(0.16)
$δ = 0.5$	n = 500	PMQE	0.2	(0.02)	0.26	(0.03)	0.29	(0.03)	0.53	(0.08)	0.55	(0.06)	1.04	(0.15)
		PMQE Q	0.16	(0.01)	0.22	(0.04)	0.3	(0.03)	0.39	(0.04)	0.52	(0.07)	1.02	(0.15)
		PMQE LASSO 1	0.22	(0.01)	0.29	(0.03)	0.25	(0.01)	0.32	(0.03)	0.29	(0.02)	0.61	(0.07)
		PMQE LASSO 0.1	0.18	(0.02)	0.22	(0.03)	0.21	(0.02)	0.37	(0.05)	0.22	(0.03)	0.67	(0.08)
		PMQE Q LASSO 1	0.26	(0.01)	0.27	(0.03)	0.26	(0.01)	0.35	(0.02)	0.28	(0.02)	0.61	(0.09)
		PMQE Q LASSO 0.1	0.15	(0.01)	0.21	(0.04)	0.19	(0.02)	0.31	(0.03)	0.21	(0.03)	0.63	(0.11)
		GLM	2.02	(0.04)	2.41	(0.08)	1.81	(0.05)	3	(0.1)	1.58	(0.04)	3.83	(0.13)
	n = 1000	PMQE	0.1	(0.01)	0.15	(0.01)	0.14	(0.01)	0.23	(0.02)	0.26	(0.04)	0.52	(0.06)
		PMQE Q	0.09	(0.01)	0.14	(0.01)	0.14	(0.01)	0.25	(0.02)	0.26	(0.03)	0.41	(0.04)
		PMQE LASSO 1	0.2	(0.01)	0.23	(0.01)	0.22	(0.01)	0.26	(0.01)	0.23	(0.01)	0.34	(0.02)
		PMQE LASSO 0.1	0.1	(0.01)	0.15	(0.01)	0.09	(0.01)	0.19	(0.02)	0.12	(0.01)	0.3	(0.03)
		PMQE Q LASSO 1	0.23	(0.01)	0.23	(0.01)	0.22	(0.01)	0.26	(0.02)	0.22	(0.01)	0.31	(0.02)
		PMQE Q LASSO 0.1	0.09	(0.01)	0.14	(0.01)	0.11	(0.01)	0.21	(0.02)	0.12	(0.01)	0.29	(0.03)
		GLM	2.16	(0.03)	2.62	(0.05)	1.99	(0.04)	3.49	(0.09)	1.87	(0.03)	4.79	(0.16)


3.4. Simulation for matching distributionally unassociated Y and $X$

In this additional study, we substantiate that PMQE does not require any distributional association between $X$ and Y. In the previous simulations of Subsections 3.2 and 3.3, we sampled Y from the distribution whose parameters depend on $X$ . However, PMQE is designed to find $β$ such that $β^{'} X$ best approximates the distribution of $\ln Y$ even when they are completely unrelated. In this simulation study, we test the performance of PMQE in such circumstances by deliberately eliminating the association between $X$ and Y.

To examine the performance of PMQE on distributionally unassociated $X$ and Y, we make the following adjustments to the simulation process. We consider selected variations of $p \in \{2, 6, 10\}$, $r \in \{1, 5\}$ and $δ \in \{0, 0.5\}$ with a fixed sample size of n = 1000, so that both the equidispersion case ($δ = 0$) and the overdispersion case ($δ = 0.5$) are covered. In Step 1, instead of sampling half of $X$ from $N (0, 1)$ and the other half from $U (0, 1)$, we sample all covariates from $N (0, 1)$. Then, after generating Y via Steps 2 to 4, $X$ is replaced by $U (0, 1)$ random samples before proceeding to Step 5. The resulting $X$ and Y are distributionally unrelated, which is the desired simulation scheme. For Step 6, following the simulation of Subsection 3.3, seven methods (PMQE, PMQE Q, the associated LASSO versions with $λ \in \{1, 0.1\}$, and Poisson GLM) are compared. The rest of the procedure is conducted without further changes, and the results are presented in Table A1 (see the Appendix).

Table A1 confirms that PMQE exhibits a consistent estimation performance in distributionally unassociated $X$ and Y matching. Specific details are excluded due to their similarity with the outcomes observed in Subsections 3.2 and 3.3. In summary, the result verifies that PMQE is capable of analyzing Y and $X$ lacking the distributional connection.

4. Real data application

To demonstrate the practicality of the PMQE, we consider the problem of determining the optimal location for a new facility, such as a shopping mall, as considered by Drezner et al. [8]. The optimal location search is a common topic in economic gravity models (see [21]), wherein the potential market share captured by a new facility in its neighborhoods is modeled as an increasing function of the facility's attractiveness and a decreasing function of its distance. We take the number of visitors to the facility as the discrete variable of interest, and the size of the neighborhood and its distance to the facility as the covariates. We assume the data are unpaired for reasons of privacy or anonymization. That is, we have the vector of visitor counts from each neighborhood to one facility and, separately, the population size of each neighborhood and its distance to the facility. We allow the group of neighborhoods providing the visitor counts Y to be disjoint from the group providing the covariates $X$.

In particular, we assume that ‘South Coast Plaza,’ a shopping mall in the U.S., wishes to find a location for a new branch in Orange County, California (see [8]). Let $Y = (Y_{1}, \dots, Y_{80})$ be the number of visitors to South Coast Plaza from each neighborhood $N_{1}, \dots, N_{80}$ in Orange County, let $X_{1} = \{X_{1 j}\}_{j = 1}^{80}$ be the population of each neighborhood $N_{j}$, and let $X_{2} = \{X_{2 j}\}_{j = 1}^{80}$ be the distance to South Coast Plaza from each neighborhood $N_{j}$. Although the original data are paired [8], we consider a situation wherein the link between $X$ and Y is missing. We thus intentionally disconnect the pairs by randomly selecting 40 of the 80 neighborhoods and denote the set of selected indices by $I_{S}$. We record the population counts and distances to South Coast Plaza of the selected neighborhoods to construct a $40 \times 2$ data matrix $X^{unpaired} = \{X_{1 j}, X_{2 j}\}_{j \in I_{S}}$. Then, using the 40 neighborhoods not selected in the first step, we form the vector $Y^{unpaired} = \{Y_{j}\}_{j \notin I_{S}}$. We thereafter combine $X^{unpaired}$ and $Y^{unpaired}$ into an unpaired training data set. Note that, for convenience of analysis, we take the logarithm of both columns of $X$ and standardize them to have zero mean and unit variance.
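The construction of the unpaired training set can be sketched as follows. This is only an illustration under stated assumptions: the data are synthetic placeholders (not the data of [8]), the helper names `standardize_log` and `make_unpaired` are hypothetical, and we assume that the log transform and standardization are applied to the 40 selected rows using the population standard deviation.

```python
import math
import random
import statistics


def standardize_log(values):
    """Log-transform, then center and scale to zero mean and unit variance."""
    logs = [math.log(v) for v in values]
    mean = statistics.fmean(logs)
    sd = statistics.pstdev(logs)  # assumption: population standard deviation
    return [(x - mean) / sd for x in logs]


def make_unpaired(population, distance, visitors, n_select=40, seed=0):
    """Split neighborhoods into disjoint index sets for X and for Y."""
    rng = random.Random(seed)
    n = len(visitors)
    selected = sorted(rng.sample(range(n), n_select))  # the index set I_S
    chosen = set(selected)
    rest = [j for j in range(n) if j not in chosen]     # indices not in I_S
    x1 = standardize_log([population[j] for j in selected])
    x2 = standardize_log([distance[j] for j in selected])
    x_unpaired = list(zip(x1, x2))                      # 40 x 2 covariates
    y_unpaired = [visitors[j] for j in rest]            # 40 visitor counts
    return selected, x_unpaired, y_unpaired


# Synthetic stand-ins for the 80 neighborhoods (illustrative values only).
demo_rng = random.Random(1)
population = [demo_rng.randint(1_000, 50_000) for _ in range(80)]
distance = [demo_rng.uniform(0.5, 30.0) for _ in range(80)]
visitors = [demo_rng.randint(0, 200) for _ in range(80)]
selected, x_unpaired, y_unpaired = make_unpaired(population, distance, visitors)
```

By construction, no neighborhood contributes to both $X^{unpaired}$ and $Y^{unpaired}$, so no pairing information is available to the estimator.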

We then estimate the distributions of visitor counts at competing candidate locations. The candidate locations for the new South Coast Plaza branch are illustrated in Figure 1 as 85 orange circles. Note that 15 grid points in the left corner are excluded because they lie in the Pacific Ocean. The locations of the neighborhoods $\{N_{j}\}_{j \notin I_{S}}$, not selected for $X^{unpaired}$, are visualized as grey triangles. We construct 85 matrices $X_{1}^{new}, X_{2}^{new}, \dots, X_{85}^{new}$, wherein $X_{i}^{new}$ is a $40 \times 2$ matrix that contains the population counts of the 40 neighborhoods $\{N_{j}\}_{j \notin I_{S}}$ in Figure 1 and the 40 traveling distances from the $i^{th}$ of the 85 candidate locations to each of these 40 neighborhoods.

Figure 1. Residence of Orange County and 85 candidate locations for the new South Coast Plaza (triangles: neighborhood locations, with size indicating the population; circles: candidate locations).

We first fit the PMQE using the unpaired data $(X^{unpaired}, Y^{unpaired})$ and obtain the estimated coefficient $\hat{β}$ . Then, the distribution of $\exp ({\hat{β}}^{T} X_{i}^{new})$ is predicted so that we also have the estimated visitor distribution of the $i^{th}$ candidate location. By obtaining all 85 $\exp ({\hat{β}}^{T} X_{i}^{new})$ , we obtain the estimated distributions of visitor counts for the 85 candidate locations.

Figure 2 illustrates various aspects of the estimated distributions for each candidate location. The six plots visualize, respectively, the number of neighborhoods with zero visitors, the variance of the visitor counts, the mean of the bottom $25\%$ of visitor counts, the 95th percentile of the estimated distribution, the median, and the maximum.
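The six summaries can be computed from the fitted values $\exp ({\hat{β}}^{T} x)$ of one candidate location along the following lines. This is a hedged sketch rather than the authors' code: the function names are hypothetical, counting a neighborhood as ‘zero-visitor’ when its fitted value rounds to zero is our assumption, and the percentile interpolation rule is whatever Python's `statistics.quantiles` uses with `method="inclusive"`.

```python
import math
import statistics


def predicted_means(beta_hat, x_new):
    """Fitted values exp(beta_hat' x) for each row x of a candidate matrix."""
    return [math.exp(sum(b * x for b, x in zip(beta_hat, row))) for row in x_new]


def summarize(mu):
    """Six summaries of an estimated visitor distribution (needs >= 2 values)."""
    s = sorted(mu)
    k = max(1, len(s) // 4)  # size of the bottom 25% group
    return {
        # assumption: a fitted value that rounds to 0 counts as zero visitors
        "zero_visitor": sum(1 for m in s if round(m) == 0),
        "variance": statistics.pvariance(s),
        "mean_bottom_25": statistics.fmean(s[:k]),
        "p95": statistics.quantiles(s, n=100, method="inclusive")[94],
        "median": statistics.median(s),
        "maximum": s[-1],
    }


# Toy usage with made-up fitted values.
summary = summarize([1.0, 2.0, 3.0, 4.0])
```

Repeating `summarize` over all 85 candidate matrices would give one value per location for each of the six maps.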

The optimal location of the new South Coast Plaza may vary with the company's priorities. For example, if the company wants to avoid locations where it expects to have many zero-visitor neighborhoods, it may refer to the ‘number of zero-visitor’ map. The area marked with the largest red circle is the location expected to have the most zero-visitor neighborhoods. The ‘Variance’ map helps decision makers identify locations where visitor numbers vary too much from region to region. The ‘Mean of bottom $25\%$’ map helps them avoid locations where the expected number of visitors from the least-visiting neighborhoods is too low. Areas with high ‘95 percentile,’ ‘Median,’ and ‘Maximum’ values can help a new South Coast Plaza maximize its visitors, which can lead to increased sales. It is also interesting to note that the three locations at which ‘95 percentile,’ ‘Median,’ and ‘Maximum’ are each maximized do not perfectly coincide, even though they are close to one another. Again, the optimal location may differ according to the priorities of South Coast Plaza.

By applying our methodology to real data, we demonstrate the versatility and practicality of the proposed method for finding the best location for a new facility. As mentioned above, the PMQE can be useful in various circumstances depending on the purpose of the analysis.

5. Further consideration

In this article, we propose the PMQE method, the first unpaired data analysis method to consider the discrete-count data Y. This method utilizes the order statistics of discrete count Y and the exponentiated linear combination of the covariates $\exp (β^{T} X)$ . The estimator ${\hat{β}}^{PMQ}$ is obtained by minimizing the discrepancy between the distributions of $\ln (Y)$ and $β^{T} X$ . The validity of a readily applicable algorithm of the PMQE is supported by its proofs of convergence. Simulation studies in various settings suggest that we can expect the most accurate distribution matching using the PMQE (PMQE LASSO) compared with conventional methods, such as the Poisson GLM. With all variations in the sample size, the number of independent variables, and the average β magnitude, our test results indicate that the performance of the PMQE, measured by the $L_{1}$ -Wasserstein distance and MMD, is more accurate on average. We further show that its performance is quite stable in the presence of overdispersion or distributionally unassociated Y and $X$ . Finally, real data application provides an idea of how our proposed method can be applied in practice. We confirm that finding the best location for a new shopping mall can be treated by estimating the visitor count distributions in candidate locations using the proposed PMQE.
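To make the matching criterion concrete, the sketch below evaluates a penalized Poisson-deviance-type objective on order statistics: the sorted counts are paired with the sorted linear predictors, and an $ℓ_{1}$ penalty is added. The form used here keeps only the part of the Poisson deviance that depends on $β$; the exact scaling, the PMQE Q variant and the optimization algorithm are not reproduced, and the function name `pmqe_objective` is hypothetical.

```python
import math


def pmqe_objective(beta, X, y, lam=0.0):
    """(1/n) * sum_j [exp(w_(j)) - y_(j) * w_(j)] + lam * |beta|_1,
    where w = X beta and (j) denotes the j-th order statistic."""
    w_sorted = sorted(sum(b * x for b, x in zip(beta, row)) for row in X)
    y_sorted = sorted(y)
    if len(w_sorted) != len(y_sorted):
        raise ValueError("X and y must have the same number of observations")
    n = len(y_sorted)
    fit = sum(math.exp(w) - yj * w for w, yj in zip(w_sorted, y_sorted)) / n
    return fit + lam * sum(abs(b) for b in beta)


# Toy check: with beta = 0 every linear predictor is 0, so the value is 1.
print(pmqe_objective([0.0], [[1.0], [2.0]], [1, 2]))
```

Because both vectors are sorted before they are combined, the value is invariant to any reordering of the rows of `X` relative to `y`, which is the property that lets the criterion be used without pairing information.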

We hope that this work is extended further in future studies. Depending on the type of the variable of interest, we might consider a different link function and deviance, such as the Bernoulli or Binomial link functions. It may also be interesting to extend our method by treating any continuous or discrete variable as in the GLM.

While we only consider the case in which $X$ and Y have the same number of observations, we can extend our analysis to the case in which we obtain n observations for $X$ and $m (\neq n)$ observations for Y. For simplicity, if m > n, then we might just use $\min (n, m) = n$ observations by randomly selecting n observations from Y. We expect the algorithm to be unaffected as long as the distributions of the full Y and this subset of Y are similar.
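The subsampling idea for unequal sample sizes can be sketched in a few lines. The helper name `subsample_to_match` and the use of a fixed seed are illustrative assumptions; any sampling without replacement would serve.

```python
import random


def subsample_to_match(y, n, seed=0):
    """Randomly keep n of the m >= n observations of y, without replacement."""
    y = list(y)
    if n > len(y):
        raise ValueError("n must not exceed the number of observations in y")
    return random.Random(seed).sample(y, n)


# Toy usage: m = 10 counts reduced to n = 4.
y_full = [0, 3, 1, 7, 2, 2, 5, 0, 4, 1]
y_sub = subsample_to_match(y_full, 4)
```

A practical check before fitting is to compare simple summaries (for example, the quantiles) of `y_sub` and `y_full`, since the approach presumes the two distributions are similar.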

Lastly, real data analysis can be improved by considering the interaction effect on market share for the current facility by adding a new facility. Indeed, adding one more facility in Orange County will affect the market share for the existing facility, which the current analysis does not account for. Incorporating the effect of a new facility needs more careful and sophisticated treatment, which is beyond the scope of the paper. We leave this enquiry to the future.

Appendix.

In this Appendix, we provide lemmas and their proofs needed to show the convergence of the algorithm.

Lemma A.1

Let $a_{1}, \dots, a_{n}$ and $b_{1}, \dots, b_{n}$ be any two sequences of positive real numbers. Then,

$\sum_{i = 1}^{n} \{ a_{(i)} \log \frac{a_{(i)}}{b_{(i)}} - (a_{(i)} - b_{(i)}) \} \leq \sum_{i = 1}^{n} \{ a_{i} \log \frac{a_{i}}{b_{i}} - (a_{i} - b_{i}) \} .$ (A1)

Proof of Lemma A.1.

We prove the claim by mathematical induction, following ideas in [23]. Note that $\sum_{i = 1}^{n} (a_{(i)} - b_{(i)}) = \sum_{i = 1}^{n} (a_{i} - b_{i})$ . Thus, it suffices to show

$\sum_{i = 1}^{n} a_{(i)} \log \frac{a_{(i)}}{b_{(i)}} \leq \sum_{i = 1}^{n} a_{i} \log \frac{a_{i}}{b_{i}} .$ (A2)

When n = 2, we need to show

$a_{(1)} \log \frac{a_{(1)}}{b_{(1)}} + a_{(2)} \log \frac{a_{(2)}}{b_{(2)}} \leq a_{(1)} \log \frac{a_{(1)}}{b_{(2)}} + a_{(2)} \log \frac{a_{(2)}}{b_{(1)}} .$ (A3)

Rearranging terms, this is equivalent to showing that $a_{(1)} \log \frac{b_{(2)}}{b_{(1)}} + a_{(2)} \log \frac{b_{(1)}}{b_{(2)}}$ is nonpositive. As

$(a_{(2)} - a_{(1)}) \log \frac{b_{(2)}}{b_{(1)}} \geq 0,$

the claim is true for n = 2.

Let us assume (A2) when n = m. Consider that we have $a_{1}, \dots, a_{m + 1}$ and $b_{1}, \dots, b_{m + 1}$ . Without loss of generality, we let $a_{m + 1} = a_{(1)}$ and $b_{ℓ} = b_{(1)}$ . Consider the case wherein $ℓ = m + 1$ :

$\begin{aligned} \sum_{i = 1}^{m + 1} a_{(i)} \log \frac{a_{(i)}}{b_{(i)}} & = \sum_{i = 2}^{m + 1} a_{(i)} \log \frac{a_{(i)}}{b_{(i)}} + a_{(1)} \log \frac{a_{(1)}}{b_{(1)}} \\ & \leq \sum_{i = 1}^{m} a_{i} \log \frac{a_{i}}{b_{i}} + a_{m + 1} \log \frac{a_{m + 1}}{b_{m + 1}}, \end{aligned}$

where the last inequality follows from the induction hypothesis (A2) with n = m. Then, we consider the case wherein $ℓ \neq m + 1$ .

Using the same idea proving (A3), we obtain

$a_{(1)} \log \frac{a_{(1)}}{b_{(1)}} + a_{ℓ} \log \frac{a_{ℓ}}{b_{m + 1}} \leq a_{ℓ} \log \frac{a_{ℓ}}{b_{ℓ}} + a_{m + 1} \log \frac{a_{m + 1}}{b_{m + 1}} .$

Then,

$\begin{aligned} \sum_{i = 1}^{m + 1} a_{i} \log \frac{a_{i}}{b_{i}} & = \sum_{i \neq ℓ}^{m} a_{i} \log \frac{a_{i}}{b_{i}} + a_{ℓ} \log \frac{a_{ℓ}}{b_{ℓ}} + a_{m + 1} \log \frac{a_{m + 1}}{b_{m + 1}} \\ & \geq a_{(1)} \log \frac{a_{(1)}}{b_{(1)}} + a_{ℓ} \log \frac{a_{ℓ}}{b_{m + 1}} + \sum_{i \neq ℓ}^{m} a_{i} \log \frac{a_{i}}{b_{i}} \\ & \geq a_{(1)} \log \frac{a_{(1)}}{b_{(1)}} + \sum_{i = 2}^{m + 1} a_{(i)} \log \frac{a_{(i)}}{b_{(i)}} = \sum_{i = 1}^{m + 1} a_{(i)} \log \frac{a_{(i)}}{b_{(i)}} . \end{aligned}$

Thus, the claim is true for n = m + 1, which yields the desired result.
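Lemma A.1 can also be checked numerically. The sketch below draws random positive sequences and verifies that pairing the sorted sequences never yields a larger Poisson-deviance-type sum than the original pairing. It is an empirical sanity check, not a substitute for the proof; the function names, the sampling range and the tolerance are our own choices.

```python
import math
import random


def deviance_sum(a, b):
    """sum_i [a_i * log(a_i / b_i) - (a_i - b_i)] for positive a_i, b_i."""
    return sum(ai * math.log(ai / bi) - (ai - bi) for ai, bi in zip(a, b))


def check_lemma_a1(n=8, trials=1000, seed=0):
    """Return True if the sorted pairing is never worse in any random trial."""
    rng = random.Random(seed)
    for _ in range(trials):
        a = [rng.uniform(0.1, 10.0) for _ in range(n)]
        b = [rng.uniform(0.1, 10.0) for _ in range(n)]
        # Small tolerance guards against floating-point rounding.
        if deviance_sum(sorted(a), sorted(b)) > deviance_sum(a, b) + 1e-9:
            return False
    return True


print(check_lemma_a1())
```

The n = 2 case of the proof corresponds to `deviance_sum([1.0, 2.0], [1.0, 3.0]) <= deviance_sum([1.0, 2.0], [3.0, 1.0])`.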

Lemma A.2

Assume conditions in Theorem 2.2. Then, for any fixed $β$ and $τ < 1 / 2$ ,

$n^{τ} ({\tilde{D}}_{n} (β) - \tilde{D} (β)) \to 0$

in probability.

Proof of Lemma A.2.

It suffices to consider a high probability set $Ω_{n}$ such that $Q_{W} (α) \in Ω_{n}$ . Consider $B_{j} = \{(j - 1) / n \leq \cdot \leq j / n\}$ for $j = 1, \dots, n$ . By arranging the terms, we obtain

$\begin{aligned} {\tilde{D}}_{n} (β) - \tilde{D} (β) & = [\int {\tilde{Q}}_{Y} (α) \log Q_{W} (α) - \frac{1}{n} \sum_{j = 1}^{n} {\tilde{Q}}_{n, Y} (j / n) \log Q_{n, W} (j / n)] \\ & \quad - [\int Q_{W} (α) - \frac{1}{n} \sum_{j = 1}^{n} Q_{n, W} (j / n)] \\ & = \sum_{j} (\int_{B_{j}} {\tilde{Q}}_{Y} (α) (\log Q_{W} (α) - \log Q_{n, W} (j / n))) \\ & \quad + \sum_{j} \log Q_{n, W} (j / n) \int_{B_{j}} ({\tilde{Q}}_{Y} (α) - {\tilde{Q}}_{n, Y} (j / n)) - \sum_{j} \int_{B_{j}} (Q_{W} (α) - Q_{n, W} (j / n)) \\ & = (*) + (* *) + (* * *), \end{aligned}$ (A4)

where the first term $(*)$ is bounded as follows,

$\begin{aligned} (*) & \leq \sum_{j} \int_{B_{j}} {\tilde{Q}}_{Y} (α) | \log Q_{W} (α) - \log Q_{n, W} (j / n) | \\ & \leq \sum_{j} \int_{B_{j}} {\tilde{Q}}_{Y} (α) \sup_{β \in B_{j}} | \log Q_{W} (β) - \log Q_{n, W} (j / n) | \\ & = \sum_{j} \int_{B_{j}} {\tilde{Q}}_{Y} (α) [\sup_{β \in B_{j}} | \log Q_{W} (β) - \log Q_{W} (j / n) | + | \log Q_{W} (j / n) - \log Q_{n, W} (j / n) |] . \end{aligned}$ (A5)

The first term in the bracket in (A5) is further bounded as

$\begin{aligned} \sup_{β \in B_{j}} | \log Q_{W} (β) - \log Q_{W} (j / n) | & \leq \sup_{β : | β - α | \leq 1 / n} | \log (1 + \frac{Q_{W} (α) - Q_{W} (β)}{Q_{W} (β)}) | \\ & \leq \sup_{β : | β - α | \leq 1 / n} | \frac{Q_{W} (α) - Q_{W} (β)}{Q_{W} (β)} | \\ & \leq \sup_{β : | β - α | \leq 1 / n} | {Q_{W} (β)}^{- 1} {f_{W} (Q_{W} (β))}^{- 1} n^{- 1} (1 + O (1)) | \\ & \leq O (n^{- 1 + 2 τ_{0}}) = o (n^{- τ}) . \end{aligned}$

The last equality follows from $0 < τ_{0} < 1 / 4$ . The second term in (A5) in the bracket is bounded using the condition (7). By the result presented by Kulik [13], we have

$f_{W} (Q_{W} (α)) n^{1 / 2} (Q_{W} (α) - Q_{n, W} (α)) - n^{1 / 2} (F_{n, W} (Q_{W} (α)) - α) = R_{n},$ (A6)

for $α \in (0, 1)$ , where $R_{n} = O_{p} (n^{- 1 / 4} (\log n)^{1 / 2} (\log \log n)^{1 / 4}) = o_{p} (1)$ . Also note that $P (\sup_{α} | F_{n, W} (Q_{W} (α)) - α | > ϵ) \leq 2 e^{- 2 n ϵ^{2}}$ for any $ϵ > 0$ by the Dvoretzky–Kiefer–Wolfowitz inequality. Thus, we have

$\begin{aligned} | \log Q_{W} (j / n) - \log Q_{n, W} (j / n) | & \leq \sup_{α} | \frac{Q_{n, W} (α) - Q_{W} (α)}{Q_{W} (α)} | \\ & \leq \sup_{α} | Q_{W} (α)^{- 1} | \sup_{α} (| \frac{F_{n, W} (Q_{W} (α)) - α}{f_{W} (Q_{W} (α))} | + | \frac{n^{- 1 / 2} R_{n}}{f_{W} (Q_{W} (α))} |) \\ & \leq n^{τ_{0}} \sup_{α} (| \frac{F_{n, W} (Q_{W} (α)) - α}{f_{W} (Q_{W} (α))} | + | \frac{n^{- 1 / 2} R_{n}}{f_{W} (Q_{W} (α))} |) \\ & = o_{p} (n^{- τ}), \end{aligned}$

for $τ \in (2 τ_{0}, 1 / 2)$ . Finally we use $\sum_{j} \int_{B_{j}} {\tilde{Q}}_{Y} (α) d α = \int {\tilde{Q}}_{Y} (α) d α = E (\tilde{Y}) < \infty$ where $\tilde{Y}$ is the random variable whose quantile function is $\tilde{Q}$ . This completes the proof of $(*) = o_{p} (n^{- τ})$ . Using similar ideas,

$(* * *) = o_{p} (n^{- τ}) .$

We now consider the second term $(* *)$ in (A4). Noting $Q_{n, W} (j / n) = Q_{W} (j / n) + o_{p} (1)$ by (A6), with high probability, we have

$\begin{aligned} (* *) & \leq \sum_{j} | \log Q_{n, W} (j / n) | | \int_{B_{j}} ({\tilde{Q}}_{Y} (α) - {\tilde{Q}}_{n, Y} (j / n)) | \\ & \leq \log n^{τ_{0}} \{ (\int_{1 - 1 / n}^{1} {\tilde{Q}}_{Y} (α) d α + \frac{C}{n} {\tilde{Q}}_{n, Y} (1)) + \sum_{j = 1}^{n - 1} | \int_{B_{j}} {\tilde{Q}}_{Y} (α) - {\tilde{Q}}_{n, Y} (j / n) | \} . \end{aligned}$ (A7)

The first term in the braces of (A7) is controlled since $\int_{1 - 1 / n}^{1} {\tilde{Q}}_{Y} (α) d α = O_{p} (\log n / n) = o_{p} (n^{- τ})$ by direct calculation with ${\tilde{Q}}_{Y} (1 - 1 / n) = O (\log n)$ , and since ${\tilde{Q}}_{n, Y} (1) = Y_{(n)}$ and $P (Y_{(n)} > c \log n) = o (n^{- 1})$ .

The second term in (A7) can be upper bounded by

$\begin{aligned} \sum_{j = 1}^{n - 1} | \int_{B_{j}} {\tilde{Q}}_{Y} (α) - {\tilde{Q}}_{n, Y} (j / n) | & \leq \sum_{j = 1}^{n - 1} | \int_{B_{j}} ({\tilde{Q}}_{Y} (α) - {\tilde{Q}}_{Y} (j / n)) | + \frac{1}{n} \sum_{j = 1}^{n - 1} | {\tilde{Q}}_{Y} (j / n) - {\tilde{Q}}_{n, Y} (j / n) | \\ & \leq \frac{1}{n} ({\tilde{Q}}_{Y} (1 - 1 / n) - {\tilde{Q}}_{Y} (0)) + \frac{1}{n} \sum_{j = 1}^{n - 1} o_{p} (n^{- τ} / p_{j^{*} (j)}) \\ & = o_{p} (n^{- τ}), \end{aligned}$

for $τ < 1 / 2$ , where the penultimate inequality follows by letting $j^{*} (j)$ be such that $\sum_{i = 1}^{j^{*}} p_{i} + \frac{p_{j^{*} + 1}}{2} \leq j / n \leq \sum_{i = 1}^{j^{*} + 1} p_{i} + \frac{p_{j^{*} + 2}}{2}$ (where $p_{j} = P (Y = j)$ ) and by Theorem 2 in [17], and the final equality follows since $1 / p_{j^{*} (n - 1)} = O (n)$ dominates all the other probabilities.

Combining the above calculations, we have shown that for $τ < 1 / 2$ ,

$n^{τ} ({\tilde{D}}_{n} (β) - \tilde{D} (β)) = o_{p} (1) .$

This completes the proof.
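The Dvoretzky–Kiefer–Wolfowitz inequality used in the proof above gives an explicit, distribution-free tail bound, $2 e^{- 2 n ϵ^{2}}$ , for the uniform deviation of an empirical distribution function. The following small sketch evaluates that bound and inverts it for a target probability; the function names are ours and the numbers are purely illustrative.

```python
import math


def dkw_bound(n, eps):
    """Upper bound 2 * exp(-2 * n * eps^2) on P(sup |F_n - F| > eps), capped at 1."""
    return min(1.0, 2.0 * math.exp(-2.0 * n * eps ** 2))


def dkw_epsilon(n, alpha):
    """Smallest eps for which the DKW bound equals alpha (0 < alpha < 2)."""
    return math.sqrt(math.log(2.0 / alpha) / (2.0 * n))


# With n = 1000 observations, a deviation above 0.05 has probability
# at most about 0.0135 according to the bound.
print(dkw_bound(1000, 0.05))
print(dkw_epsilon(1000, 0.05))
```

The inverse shows that the uniform deviation is of order $n^{- 1 / 2}$ up to a logarithmic factor in the confidence level, which is the rate the proof combines with the Bahadur–Kiefer remainder in (A6).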

Lemma A.3

Let the conditions in Theorem 2.2 hold. Let $B$ be any compact subset of $\mathbb{R}^{p}$ . It holds that $\sup_{β \in B} | {\tilde{D}}_{n} (β) - \tilde{D} (β) |$ converges to 0 in probability.

Proof of Lemma A.3.

Let C denote constants that may differ from line to line. Also, let $‖ β ‖$ and $| β |$ denote the $L_{2}$ and $L_{1}$ norms of a vector $β \in \mathbb{R}^{p}$ , respectively. Since $\tilde{D} (β)$ is a continuous function of $β$ and $B$ is compact, for any $ϵ > 0$ there exist $β_{1}, \dots, β_{m} \in B$ such that for any $β \in B$ , there exists $i \in \{1, \dots, m\}$ for which

$‖ β - β_{i} ‖ < \frac{ϵ}{\max (\sqrt{p}, M_{n})} \text{ and } | \tilde{D} (β) - \tilde{D} (β_{i}) | < ϵ,$ (A8)

where $M_{n}$ is a bound such that $‖ x ‖ < M_{n}$ on $Ω_{n} \subseteq (c_{1} n^{- τ_{0}}, c_{2} n^{τ_{0}})$ . Since $β^{'} x \leq \log n^{τ_{0}}$ for any fixed $β$ , we have $‖ x ‖ \leq | x | \leq p \log n^{τ_{0}} =: M_{n}$ . Thus we may set $m = (C \log n / ϵ)^{p}$ and

$| β^{'} x - β_{i}^{'} x | \leq ‖ x ‖ ‖ β - β_{i} ‖ \leq ϵ,$ (A9)

and

$| | β | - | β_{i} | | \leq | β - β_{i} | \leq \sqrt{p} ‖ β - β_{i} ‖ < ϵ .$ (A10)

Note that

$\begin{aligned} & | {\tilde{D}}_{n} (β) - {\tilde{D}}_{n} (β_{i}) | \\ & = | \frac{1}{n} \sum_{j = 1}^{n} \{ {\tilde{Q}}_{n, Y} (j / n) ((β_{i}^{'} X)_{(j)} - (β^{'} X)_{(j)}) - \exp (β^{'} X)_{(j)} + \exp (β_{i}^{'} X)_{(j)} \} + λ (| β | - | β_{i} |) | \\ & \leq \frac{1}{n} \sum_{j = 1}^{n} | {\tilde{Q}}_{n, Y} (j / n) | | (β_{i}^{'} X)_{(j)} - (β^{'} X)_{(j)} | + \frac{1}{n} \sum_{j = 1}^{n} | \exp (β^{'} X)_{(j)} - \exp (β_{i}^{'} X)_{(j)} | + λϵ \\ & \leq Cϵ (\frac{1}{n} \sum_{j = 1}^{n} {\tilde{Q}}_{n, Y} (j / n)) \to Cϵ E (\tilde{Y}), \end{aligned}$ (A11)

where the penultimate inequality follows by (A10) and the last inequality follows by ideas of [23] (see Lemma A.2 of [23]) and by (A9). Consequently there exists a set A with $P (A) \geq 1 - ϵ$ such that on the set A it holds that $| {\tilde{D}}_{n} (β) - {\tilde{D}}_{n} (β_{i}) | \leq Cϵ$ .

Then

$\begin{aligned} | {\tilde{D}}_{n} (β) - \tilde{D} (β) | & \leq | {\tilde{D}}_{n} (β) - {\tilde{D}}_{n} (β_{i}) | + | {\tilde{D}}_{n} (β_{i}) - \tilde{D} (β_{i}) | + | \tilde{D} (β_{i}) - \tilde{D} (β) | \\ & \leq Cϵ + | {\tilde{D}}_{n} (β_{i}) - \tilde{D} (β_{i}) | + ϵ . \end{aligned}$

Thus on the set A, we have

$\sup_{β \in B} | {\tilde{D}}_{n} (β) - \tilde{D} (β) | \leq (C + 1) ϵ + \sum_{i = 1}^{m} | {\tilde{D}}_{n} (β_{i}) - \tilde{D} (β_{i}) | .$

Since $(\log n)^{p} \cdot n^{- τ} \to 0$ , the claim follows using Lemma A.2.

Proof of Theorem 2.2.

Let $\hat{β} = {\hat{β}}^{PMQ}$ be bounded. Let $B_{0}$ be a compact set that contains $\hat{β}$ with probability 1.

By (5) and (6), we obtain

${\tilde{D}}_{n} (β_{0}) - \tilde{D} (β_{0}) \geq {\tilde{D}}_{n} (\hat{β}) - \tilde{D} (β_{0}) \geq {\tilde{D}}_{n} (\hat{β}) - \tilde{D} (\hat{β}) .$

By Lemma A.3, ${\tilde{D}}_{n} (β_{0}) - \tilde{D} (β_{0})$ and ${\tilde{D}}_{n} (\hat{β}) - \tilde{D} (\hat{β})$ converge to 0 in probability. Hence, ${\tilde{D}}_{n} (\hat{β}) - \tilde{D} (β_{0})$ converges to 0 in probability as well. The second assertion, $d ({\hat{β}}^{PMQ}, B_{0}) \to 0$ in probability, can be shown as in [23]; hence, we omit the details.

Table A1.

Means (standard errors) of the Wasserstein distance and the MMD for the distributionally unassociated Y and $X$ case (sample size n = 1000).

		Variables	p = 2				p = 6				p = 10
		$U (0, \frac{r}{10})$	r = 1		r = 5		r = 1		r = 5		r = 1		r = 5
measure	dispersion	method	mean	se	mean	se	mean	se	mean	se	mean	se	mean	se
Wasserstein	$δ = 0$	PMQE	0.86	(0.03)	1.06	(0.04)	0.91	(0.04)	1.26	(0.06)	0.83	(0.03)	1.6	(0.07)
		PMQE Q	0.89	(0.02)	1.04	(0.04)	0.83	(0.03)	1.37	(0.06)	0.83	(0.03)	1.73	(0.11)
		PMQE LASSO 1	1.25	(0.02)	1.44	(0.04)	1.27	(0.03)	1.79	(0.05)	1.31	(0.03)	2.13	(0.07)
		PMQE LASSO 0.1	0.82	(0.02)	1.02	(0.03)	0.86	(0.03)	1.23	(0.05)	0.77	(0.02)	1.48	(0.06)
		PMQE Q LASSO 1	1.28	(0.02)	1.45	(0.04)	1.28	(0.03)	1.84	(0.05)	1.33	(0.03)	2.3	(0.09)
		PMQE Q LASSO 0.1	0.87	(0.02)	1.01	(0.04)	0.79	(0.03)	1.34	(0.06)	0.76	(0.02)	1.71	(0.1)
		GLM	3.5	(0.03)	4.01	(0.05)	3.33	(0.03)	4.79	(0.08)	3.22	(0.03)	5.83	(0.1)
	$δ = 0.5$	PMQE	0.9	(0.03)	1.1	(0.04)	0.98	(0.03)	1.41	(0.06)	1.15	(0.04)	2.01	(0.1)
		PMQE Q	0.93	(0.04)	0.97	(0.04)	0.96	(0.04)	1.4	(0.08)	1.12	(0.05)	1.99	(0.1)
		PMQE LASSO 1	1.28	(0.03)	1.47	(0.04)	1.35	(0.03)	1.76	(0.06)	1.42	(0.03)	2.02	(0.09)
		PMQE LASSO 0.1	0.86	(0.03)	1.07	(0.04)	0.86	(0.02)	1.32	(0.06)	0.92	(0.03)	1.66	(0.09)
		PMQE Q LASSO 1	1.38	(0.03)	1.52	(0.04)	1.41	(0.03)	1.8	(0.06)	1.5	(0.03)	2.16	(0.08)
		PMQE Q LASSO 0.1	0.9	(0.03)	0.96	(0.04)	0.88	(0.03)	1.31	(0.07)	0.96	(0.03)	1.76	(0.08)
		GLM	4.27	(0.03)	4.75	(0.05)	4.02	(0.03)	5.33	(0.08)	3.89	(0.04)	6.15	(0.1)
MMD	$δ = 0$	PMQE	0.14	(0.01)	0.18	(0.01)	0.13	(0.01)	0.24	(0.03)	0.12	(0.01)	0.34	(0.03)
		PMQE Q	0.14	(0.01)	0.18	(0.01)	0.11	(0.01)	0.26	(0.02)	0.11	(0.01)	0.37	(0.04)
		PMQE LASSO 1	0.35	(0.02)	0.41	(0.02)	0.35	(0.02)	0.57	(0.04)	0.36	(0.02)	0.74	(0.06)
		PMQE LASSO 0.1	0.13	(0.01)	0.18	(0.01)	0.12	(0.01)	0.25	(0.03)	0.1	(0.01)	0.34	(0.03)
		PMQE Q LASSO 1	0.36	(0.02)	0.42	(0.02)	0.35	(0.02)	0.58	(0.04)	0.37	(0.02)	0.8	(0.06)
		PMQE Q LASSO 0.1	0.13	(0.01)	0.17	(0.01)	0.11	(0.01)	0.27	(0.03)	0.1	(0.01)	0.41	(0.04)
		GLM	1.81	(0.03)	2.21	(0.06)	1.57	(0.03)	2.83	(0.11)	1.49	(0.03)	3.74	(0.14)
	$δ = 0.5$	PMQE	0.14	(0.01)	0.19	(0.02)	0.15	(0.01)	0.27	(0.03)	0.21	(0.02)	0.47	(0.05)
		PMQE Q	0.14	(0.01)	0.15	(0.01)	0.14	(0.01)	0.27	(0.03)	0.2	(0.02)	0.46	(0.05)
		PMQE LASSO 1	0.33	(0.02)	0.4	(0.02)	0.34	(0.02)	0.52	(0.04)	0.4	(0.02)	0.6	(0.06)
		PMQE LASSO 0.1	0.13	(0.01)	0.18	(0.02)	0.12	(0.01)	0.26	(0.02)	0.15	(0.01)	0.35	(0.04)
		PMQE Q LASSO 1	0.36	(0.02)	0.41	(0.02)	0.35	(0.02)	0.54	(0.04)	0.42	(0.02)	0.64	(0.07)
		PMQE Q LASSO 0.1	0.14	(0.01)	0.15	(0.01)	0.12	(0.01)	0.25	(0.03)	0.15	(0.01)	0.38	(0.04)
		GLM	2.2	(0.05)	2.58	(0.06)	1.85	(0.04)	2.9	(0.1)	1.79	(0.05)	3.41	(0.15)


Funding Statement

This work was supported by the National Research Foundation of Korea (NRF) grant funded by the Korea government (MSIT) (No. 2023R1A2C1003730 and No. RS-2023-00219212); and Korea University Grant (K2206361).

Disclosure statement

No potential conflict of interest was reported by the author(s).

Data availability statement

We used the shopping mall data in this study. This dataset is publicly available in [8].

References

1. Abid A., Poon A., and Zou J., Linear regression with shuffled labels, preprint (2017), arXiv:1705.01342.
2. Abid A. and Zou J., A stochastic expectation-maximization approach to shuffled linear regression, in 2018 56th Annual Allerton Conference on Communication, Control, and Computing (Allerton), IEEE, 2018, pp. 470–477.
3. Balabdaoui F., Doss C.R., and Durot C., Unlinked monotone regression, J. Mach. Learn. Res. 22 (2021), pp. 172.
4. Breslow N.E., Extra-Poisson variation in log-linear models, J. R. Stat. Soc.: C Appl. Stat. 33 (1984), pp. 38–44.
5. Carpentier A. and Schlüter T., Learning relationships between data obtained independently, in Artificial Intelligence and Statistics, PMLR, 2016, pp. 658–666.
6. Consul P.C. and Jain G.C., A generalization of the Poisson distribution, Technometrics 15 (1973), pp. 791–799.
7. Dean C. and Lawless J.F., Tests for detecting overdispersion in Poisson regression models, J. Am. Stat. Assoc. 84 (1989), pp. 467–472.
8. Drezner T., Drezner Z., and Zerom D., Facility dependent distance decay in competitive location, Netw. Spat. Econ. 20 (2020), pp. 915–934.
9. Fang G. and Li P., Regression with label permutation in generalized linear model, in International Conference on Machine Learning, PMLR, 2023, pp. 9716–9760.
10. Harris T., Yang Z., and Hardin J.W., Modeling underdispersed count data with generalized Poisson regression, Stata J. 12 (2012), pp. 736–747.
11. He X., Yang Y., and Zhang J., Bivariate downscaling with asynchronous measurements, J. Agric. Biol. Environ. Stat. 17 (2012), pp. 476–489.
12. Jiang Q., Xia Y., and Liang B., Matching distributions for survival data, Can. J. Stat. 50 (2021), pp. 751–775.
13. Kulik R., Bahadur–Kiefer theory for sample quantiles of weakly dependent linear processes, Bernoulli 13 (2007), pp. 1071–1090.
14. Levina E. and Bickel P., The earth mover's distance is the Mallows distance: Some insights from statistics, in Proceedings Eighth IEEE International Conference on Computer Vision. ICCV 2001, Vol. 2. IEEE, 2001, pp. 251–256.
15. Li F., Fujiwara K., Okura F., and Matsushita Y., Generalized shuffled linear regression, in Proceedings of the IEEE/CVF International Conference on Computer Vision, Montreal, QC, Canada, 2021, pp. 6474–6483.
16. Li H., Sheffield J., and Wood E.F., Bias correction of monthly precipitation and temperature fields from Intergovernmental Panel on Climate Change AR4 models using equidistant quantile matching, J. Geophys. Res.: Atmos. 115 (2010), pp. 1–20.
17. Ma Y., Genton M., and Parzen E., Asymptotic properties of sample quantiles of discrete distributions, Ann. Inst. Stat. Math. 63 (2011), pp. 227–243.
18. O'Brien T., Sornette D., and McPherron R., Statistical asynchronous regression: Determining the relationship between two quantities that are not measured simultaneously, J. Geophys. Res.: Space Phys. 106 (2001), pp. 13247–13259.
19. Panaretos V.M. and Zemel Y., Statistical aspects of Wasserstein distances, Annu. Rev. Stat. Appl. 6 (2019), pp. 405–431.
20. Qin S. and Wu Y., General matching quantiles M-estimation, Comput. Stat. Data Anal. 147 (2020), pp. 106941.
21. Reilly W.J., The Law of Retail Gravitation, WJ Reilly, New York, 1931.
22. Rigollet P. and Weed J., Uncoupled isotonic regression via minimum Wasserstein deconvolution, Inf. Inference: A J. IMA 8 (2019), pp. 691–717.
23. Sgouropoulos N., Yao Q., and Yastremiz C., Matching a distribution by matching quantiles estimation, J. Am. Stat. Assoc. 110 (2015), pp. 742–759.
24. Slawski M. and Ben-David E., Linear regression with sparsely permuted data, Electron. J. Stat. 13 (2019), pp. 1–36.
25. Slawski M. and Sen B., Permuted and unlinked monotone regression in $R^{d}$: an approach based on mixture modeling and optimal transport, preprint (2022), arXiv:2201.03528.
26. Srivastav R.K., Schardong A., and Simonovic S.P., Equidistance quantile matching method for updating IDF curves under climate change, Water Resour. Manag. 28 (2014), pp. 2539–2562.
27. Unnikrishnan J., Haghighatshoar S., and Vetterli M., Unlabeled sensing with random linear measurements, IEEE Trans. Inf. Theory 64 (2018), pp. 3237–3253.
28. Vallender S., Calculation of the Wasserstein distance between probability distributions on the line, Theory Probab. Appl. 18 (1974), pp. 784–786.
29. Wang Z., Ben-David E., and Slawski M., Estimation in exponential family regression based on linked data contaminated by mismatch error, preprint (2020), arXiv:2010.00181.
30. Wu P., Liang B., Xia Y., and Tong X., Predicting disease risks by matching quantiles estimation for censored data, Math. Biosci. Eng. 17 (2020), pp. 4544–4562.



3.4. Simulation for matching distributionally unassociated Y and $X$