Optimal designs for copula models

E Perrone; WG Müller

doi:10.1080/02331888.2015.1111892

2016 Jan 8;50(4):917–929. doi: 10.1080/02331888.2015.1111892


PMCID: PMC4936440 PMID: 27453616

Abstract

Copula modelling has in the past decade become a standard tool in many areas of applied statistics. However, a largely neglected aspect concerns the design of related experiments, particularly the issue of whether the estimation of copula parameters can be enhanced by optimizing experimental conditions and how robust the parameter estimates for the model are with respect to the type of copula employed. In this paper an equivalence theorem for (bivariate) copula models is provided that allows formulation of efficient design algorithms and quick checks of whether designs are optimal or at least efficient. Some examples illustrate that in practical situations considerable gains in design efficiency can be achieved. A natural comparison between different copula models with respect to design efficiency is provided as well.

Keywords: copulas, design measure, Fisher information, stochastic dependence, clinical trials

AMS Subject Classification: 62K05

1. Introduction

Due to their flexibility in describing dependencies and the possibility of separating marginal and joint effects, copula models have become a popular device for coping with multivariate data in many areas of applied statistics, e.g. insurance,[1] econometrics,[2] medicine,[3] marketing,[4] spatial extreme events,[5] time series analysis,[6] even sports,[7] and particularly finance.[8]

The concept of copulas, however, has only rarely been employed in experimental design, with the notable exceptions of spatial design in [9,10] and sequential trials in [11]. The design question for copula parameter estimation has, to our knowledge, been raised only in [12], where a brute-force simulated annealing optimization was employed for the solution of a specific problem. In this paper we provide the necessary theory for fully embedding the situation into optimal design theory. In particular, we provide a Kiefer–Wolfowitz type equivalence theorem [13] in Section 3 as a basis for a substantial analysis of the arising issues in the example sections.

To be more concrete, let us consider a vector $x^{T} = (x_{1}, \dots, x_{r}) \in X$ of control variables, where $X \subset R^{r}$ is a compact set. The results of the observations and of the expectations in a regression experiment are the vectors

\begin{aligned} y (x) & = (y_{1} (x), \dots, y_{m} (x)), \\ E [Y (x)] & = E [(Y_{1}, \dots, Y_{m})] = η (x, β) = (η_{1} (x, β), \dots, η_{m} (x, β)), \end{aligned}

where $β = (β_{1}, \dots, β_{k})$ is a certain unknown vector of marginal parameters to be estimated and $η_{i} (i = 1, \dots, m)$ are known functions. Let us call $F_{Y_{i}} (y_{i} (x, β))$ the marginal cumulative distributions of each $Y_{i}$ for all $i = 1, \dots, m$ and $f_{Y} (y (x, β), α)$ the joint probability density function of the random vector $Y$ , where $α = (α_{1}, \dots, α_{l})$ are unknown (copula) parameters. In the remainder of the paper we will focus on the case $m = 2$ , but generalizations of our results are possible.

Definition 2.

Let $I = [0, 1]$ . A two-dimensional copula (or 2-copula) is a bivariate function $C : I \times I ⟶ I$ with the following properties:

for every $u_{1},$ $u_{2} \in I$
$C (u_{1}, 0) = 0, C (u_{1}, 1) = u_{1}, C (0, u_{2}) = 0, C (1, u_{2}) = u_{2};$ (1)

for every $u_{1},$ $u_{2},$ $u_{3},$ $u_{4} \in I$ such that $u_{1} \leq u_{3}$ and $u_{2} \leq u_{4},$
$C (u_{3}, u_{4}) - C (u_{3}, u_{2}) - C (u_{1}, u_{4}) + C (u_{1}, u_{2}) \geq 0.$

Now let $F_{Y}$ be a joint cumulative distribution function (cdf) with marginal cdfs $F_{Y_{1}}$ and $F_{Y_{2}}$ . According to Sklar's theorem [14] there exists then a 2-copula C such that

F_{Y} (y_{1}, y_{2}) = C (F_{Y_{1}} (y_{1}), F_{Y_{2}} (y_{2}))

(2)

for all reals $y_{1}$ , $y_{2}$ . If $F_{Y_{1}}$ and $F_{Y_{2}}$ are continuous, then C is unique; otherwise, C is uniquely defined on $Ran (F_{Y_{1}}) \times Ran (F_{Y_{2}})$ . Conversely, if C is a 2-copula and $F_{Y_{1}}$ and $F_{Y_{2}}$ are distribution functions, then the function $F_{Y}$ given by Equation (2) is a joint distribution with marginals $F_{Y_{1}}$ and $F_{Y_{2}}$ .
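The two defining properties of a 2-copula and the construction in Equation (2) can be checked numerically. The following Python sketch is only an illustration under stated assumptions: it uses the FGM copula from Section 4.1 with an arbitrarily chosen parameter, standard normal margins, and a coarse grid; all function names are hypothetical and not part of the original computations.

```python
import itertools
import math


def fgm_copula(u1, u2, alpha=0.5):
    """FGM copula u1*u2*(1 + alpha*(1-u1)*(1-u2)); alpha=0.5 is an arbitrary choice."""
    return u1 * u2 * (1.0 + alpha * (1.0 - u1) * (1.0 - u2))


def satisfies_boundary_conditions(copula, grid, tol=1e-12):
    """Property (1): C(u,0) = C(0,u) = 0, C(u,1) = u and C(1,u) = u on the grid."""
    return all(
        abs(copula(u, 0.0)) <= tol
        and abs(copula(0.0, u)) <= tol
        and abs(copula(u, 1.0) - u) <= tol
        and abs(copula(1.0, u) - u) <= tol
        for u in grid
    )


def is_two_increasing(copula, grid, tol=1e-12):
    """Rectangle inequality of the definition, checked on all grid rectangles."""
    for u1, u3 in itertools.combinations(grid, 2):      # u1 < u3
        for u2, u4 in itertools.combinations(grid, 2):  # u2 < u4
            volume = (copula(u3, u4) - copula(u3, u2)
                      - copula(u1, u4) + copula(u1, u2))
            if volume < -tol:
                return False
    return True


def standard_normal_cdf(y):
    return 0.5 * (1.0 + math.erf(y / math.sqrt(2.0)))


def joint_cdf(y1, y2, copula=fgm_copula):
    """Equation (2) with standard normal margins (an assumption of this sketch)."""
    return copula(standard_normal_cdf(y1), standard_normal_cdf(y2))


grid = [i / 20 for i in range(21)]  # sorted grid on [0, 1]
print(satisfies_boundary_conditions(fgm_copula, grid))
print(is_two_increasing(fgm_copula, grid))
```

A grid check of this kind can only falsify the copula properties, not prove them; it is meant as a quick sanity test when a new parametric family is coded.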

2. Design issues

We need to quantify the amount of information on both sets of parameters (trend β and copula α) from the regression experiment. This information is embodied in the Fisher information matrix, which for an elemental information at a particular control x in the sense of Atkinson et al. [15] is a $(k + l) \times (k + l)$ matrix defined as

m (x, β, α) = (\begin{matrix} m_{β β} (x) & m_{β α} (x) \\ m_{β α}^{T} (x) & m_{α α} (x) \end{matrix}),

(3)

where the submatrix $m_{β β} (x)$ is the $(k \times k)$ matrix with the $(i, j)$ th element defined as

E (- \frac{\partial^{2}}{\partial β_{i} \partial β_{j}} \log [f_{Y} (y (x, β), α)]),

(4)

and the submatrices $m_{β α} (x)$ $(k \times l)$ and $m_{α α} (x)$ $(l \times l)$ are defined accordingly. Here we model the dependence between $Y_{1}$ and $Y_{2}$ with a copula function $C_{α} (F_{Y_{1}} (y_{1} (x, β)), F_{Y_{2}} (y_{2} (x, β)))$ and find the joint density of the random variables from

f_{Y} (y (x, β), α) = \frac{\partial^{2}}{\partial y_{1} \partial y_{2}} C_{α} (F_{Y_{1}} (y_{1} (x, β)), F_{Y_{2}} (y_{2} (x, β))) .
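For orientation, the mixed derivative above can be written out by the chain rule. The symbols $c_{α}$ (copula density) and $f_{Y_{i}}$ (marginal densities) are introduced here only for illustration and assume that these densities exist:

```latex
\begin{aligned}
f_{Y} (y (x, β), α) & = c_{α} \bigl(F_{Y_{1}} (y_{1} (x, β)), F_{Y_{2}} (y_{2} (x, β))\bigr) \, f_{Y_{1}} (y_{1} (x, β)) \, f_{Y_{2}} (y_{2} (x, β)), \\
c_{α} (u_{1}, u_{2}) & = \frac{\partial^{2}}{\partial u_{1} \partial u_{2}} C_{α} (u_{1}, u_{2}), \\
\log f_{Y} & = \log c_{α} + \log f_{Y_{1}} + \log f_{Y_{2}} .
\end{aligned}
```

The last line shows why the copula parameters α enter the entries of the information matrix (3) only through $\log c_{α}$, whereas the marginal parameters β enter through all three terms.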

Definition 4.

For a concrete (discrete) experiment with N independent observations at $n \leq N$ support points $x_{1}, \dots, x_{n},$ the corresponding total information matrix is

$M (ξ, β, α) = N^{- 1} \sum_{i = 1}^{n} w_{i} m (x_{i}, β, α), \sum_{i = 1}^{n} w_{i} = 1, ξ = \{\begin{matrix} x_{1} & \dots & x_{n} \\ w_{1} & \dots & w_{n} \end{matrix}\},$

with so-called design weights $w_{i}$ .

Approximate optimal design theory is concerned with finding an optimal design measure $ξ^{*} (β, α)$ , that is, one that maximizes some scalar function $φ (M (ξ, β, α))$ , the so-called design criterion. In the following we will consider only D-optimality, that is, the criterion $φ (M) = \log det M$ , if M is non-singular. There exist several well-written monographs on optimal design theory and its application, but in this paper we mainly follow the style and notation of Silvey.[16]

3. Equivalence theory

The cornerstone of a theoretical investigation into optimal design is usually the formulation of a Kiefer–Wolfowitz type equivalence relation, which is given in the following theorem. It is a generalized version of a theorem given without proof in [17] and follows from a multivariate version of the basic theorem given in [16]; its full proof can be found in the Appendix.

Theorem 3.1.

Denote by $(\bar{β}, \bar{α})$ fixed values (local guesses) for the parameter vector. Then, the following properties are equivalent:

$ξ^{*}$ is D-optimal;

$tr [M (ξ^{*}, \bar{β}, \bar{α})^{- 1} m (x, \bar{β}, \bar{α})] \leq (k + l),$ $\forall x \in X;$

$ξ^{*}$ minimizes $max_{x \in X} tr [M (ξ, \bar{β}, \bar{α})^{- 1} m (x, \bar{β}, \bar{α})]$ over all $ξ \in Ξ$ .

This theorem provides simple checks for D-optimality through the maxima of

d (x, ξ^{*}) = tr [M (ξ^{*}, \bar{β}, \bar{α})^{- 1} m (x, \bar{β}, \bar{α})],

which is usually called the sensitivity function. It also allows us to use standard design algorithms such as those of the Fedorov–Wynn type,[18,19] which will yield an optimal approximate design $ξ^{*} = \{\begin{matrix} {x_{1}}^{*} & \dots & {x_{n}}^{*} \\ w_{1}^{*} & \dots & w_{n}^{*} \end{matrix}\}$ .

Note that these resulting optimal designs will now depend not only upon the marginal model structure, but also upon the chosen copula and, through the induced nonlinearities, potentially also on the unknown parameter values for α and β, which is why we are resorting to localized designs around the values $(\bar{β}, \bar{α})$ . A sensitivity analysis with respect to the effect of these choices on a particular example can be found in [20].

Definition 6.

For the comparison of designs define D-Efficiency of the design ξ with respect to the design $ξ^{*}$ as the ratio

${(\frac{| M (ξ, \bar{β}, \bar{α}) |}{| M (ξ^{*}, \bar{β}, \bar{α}) |})}^{1 / (k + l)},$ (5)

where $(k + l)$ is the number of the model parameters. We will report all our findings in percentage losses of these D-efficiencies.
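As a small illustration of Equation (5), the following Python sketch computes the D-efficiency from the log-determinants of two information matrices and converts it into a percentage loss. Reading the reported loss as $100 (1 - \text{efficiency})$ is our assumption about the tables below, and the function names are hypothetical.

```python
import math


def d_efficiency(logdet_design, logdet_optimal, n_params):
    """Equation (5): (|M(xi)| / |M(xi*)|)^(1/(k+l)), computed on the log scale."""
    return math.exp((logdet_design - logdet_optimal) / n_params)


def efficiency_loss_percent(logdet_design, logdet_optimal, n_params):
    """Loss in D-efficiency in per cent, assumed here to be 100 * (1 - efficiency)."""
    return 100.0 * (1.0 - d_efficiency(logdet_design, logdet_optimal, n_params))


# Example: a design whose determinant is half the optimal one, with k + l = 6.
print(efficiency_loss_percent(math.log(0.5), 0.0, 6))
```

Working with log-determinants avoids overflow or underflow when the determinants themselves are very large or very small.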

4. Examples

A main question now of course concerns whether ignorance or wrong guesses of copula function and/or parameters may lead to inefficiencies of the designs.

4.1. Tools

For that purpose let us here give the list of copulas used in our examples (for more details see, e.g. [21] or [22]). We provide the copula function along with the so-called Kendall's τ, which is a dependence measure that allows us to conveniently relate different copulas (for a definition and a more exhaustive comparison see [23]).

Definition 8.

Product Copula, which represents the independence case:
$C (u_{1}, u_{2}) = u_{1} u_{2},$
with $τ = 0$ .

Gaussian Copula:
$C_{α} (u_{1}, u_{2}) = \frac{1}{2 π \sqrt{1 - α^{2}}} \int_{- \infty}^{Φ^{- 1} (u_{1})} \int_{- \infty}^{Φ^{- 1} (u_{2})} \exp (- \frac{{z_{1}}^{2} - 2 α z_{1} z_{2} + {z_{2}}^{2}}{2 (1 - α^{2})}) d z_{1} d z_{2},$
with $α \in [- 1, 1]$ and $τ = (2 / π) \arcsin (α)$ .

Farlie–Gumbel–Morgenstern (FGM):
$C_{α} (u_{1}, u_{2}) = u_{1} u_{2} [1 + α (1 - u_{1}) (1 - u_{2})],$
with $α \in [- 1, 1]$ and $τ = \frac{2}{9} α$ .

Clayton:
$C_{α} (u_{1}, u_{2}) = [max (u_{1}^{- α} + u_{2}^{- α} - 1, 0)]^{- (1 / α)},$
with $α \in (0, + \infty)$ and $τ = α / (α + 2)$ .

Frank:
$C_{α} (u_{1}, u_{2}) = - \frac{1}{α} \ln (1 + \frac{(e^{- α u_{1}} - 1) (e^{- α u_{2}} - 1)}{e^{- α} - 1}),$
with $α \in (- \infty, + \infty),$ and $τ = 1 - (4 / α) (1 - (1 / α) \int_{0}^{α} (t / (e^{t} - 1)) d t)$ .

Gumbel:
$C_{α} (u_{1}, u_{2}) = \exp (- [(- \ln u_{1})^{α} + (- \ln u_{2})^{α}]^{1 / α}),$
with $α \in [1, + \infty)$ and $τ = (α - 1) / α$ .
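The relationships between τ and α listed above can be inverted to obtain the copula parameter that corresponds to a given level of association. The Python sketch below does this in closed form where possible and numerically for the Frank copula; the Simpson rule, the bisection bracket $[-100, 100]$ and all function names are assumptions of this sketch rather than the procedure used for the paper.

```python
import math


def fgm_alpha(tau):
    """Invert tau = (2/9) * alpha; the FGM family only reaches |tau| <= 2/9."""
    alpha = 4.5 * tau
    if abs(alpha) > 1.0:
        raise ValueError("tau outside the range attainable by the FGM copula")
    return alpha


def clayton_alpha(tau):
    """Invert tau = alpha / (alpha + 2), valid for 0 < tau < 1."""
    return 2.0 * tau / (1.0 - tau)


def gumbel_alpha(tau):
    """Invert tau = (alpha - 1) / alpha, valid for 0 <= tau < 1."""
    return 1.0 / (1.0 - tau)


def gaussian_alpha(tau):
    """Invert tau = (2 / pi) * arcsin(alpha)."""
    return math.sin(math.pi * tau / 2.0)


def _simpson(f, a, b, n=2000):
    """Composite Simpson rule with n (even) subintervals; works for b < a as well."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3.0


def _integrand(t):
    # t / (e^t - 1), with its limit 1 at t = 0
    return 1.0 if t == 0.0 else t / math.expm1(t)


def frank_tau(alpha):
    """Kendall's tau of the Frank copula, using the integral formula given above."""
    if abs(alpha) < 1e-8:
        return 0.0  # independence limit
    inner = _simpson(_integrand, 0.0, alpha) / alpha
    return 1.0 - (4.0 / alpha) * (1.0 - inner)


def frank_alpha(tau, lo=-100.0, hi=100.0, tol=1e-10):
    """Bisection on alpha; relies on tau being increasing in alpha."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if frank_tau(mid) < tau:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


for tau in (0.05, 0.10, 0.15):
    print(tau, fgm_alpha(tau), clayton_alpha(tau), frank_alpha(tau))
```

Matching copulas on τ rather than on α is what makes the comparisons across families in the tables below meaningful, since the same numerical value of α implies very different dependence in different families.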

4.2. The linear case

Let us first consider a simple example reported in [18]. For each design point $x \in [0, 1]$ , we may observe an independent pair of random variables $Y_{1}$ and $Y_{2}$ , such that

\begin{aligned} E [Y_{1} (x)] & = β_{1} + β_{2} x + β_{3} x^{2}, \\ E [Y_{2} (x)] & = β_{4} x + β_{5} x^{3} + β_{6} x^{4}, \end{aligned}

which is linear in β and has dependence described by the product copula with Gaussian margins. Since this case is covered by Theorem 3.1, we were able to compute the optimal design $ξ^{*}$ by a standard algorithm and we display it in Figure 1 along with its sensitivity function. As is rather typical, $ξ^{*}$ is supported on only a small number (here four) of design points, smaller than the number of parameters. From the sensitivity function we can see that it is indeed optimal, as it reaches (and does not exceed) the number of parameters at all design points. Furthermore our optimal design coincides with the one reported in [18], namely

ξ^{*} = (\begin{matrix} x_{i}^{*} \\ w_{i}^{*} \end{matrix}) = (\begin{matrix} 0 & 0.38 & 0.76 & 1.0 \\ 0.16 & 0.28 & 0.23 & 0.33 \end{matrix}) .

(6)

Figure 1. — Sensitivity function (left axis) and optimal design (right axis) for the Fedorov example.
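The computation of such a design and the check via the sensitivity function can be sketched in a few lines of Python. Two assumptions should be stressed. First, under the product copula with unit-variance Gaussian margins the elemental information is taken to be $m (x) = f_{1} (x) f_{1} (x)^{T} + f_{2} (x) f_{2} (x)^{T}$ with the regressors of the two responses padded to length six. Second, instead of a Fedorov–Wynn exchange step the sketch uses the simpler multiplicative weight update $w_{i} \leftarrow w_{i} d (x_{i}, ξ) / 6$ on a fixed grid, which is a swapped-in technique and not the algorithm used for the paper. All names are hypothetical.

```python
import math

P = 6  # number of marginal parameters beta_1, ..., beta_6


def regressors(x):
    """Padded regressor vectors of the two responses at design point x."""
    f1 = [1.0, x, x * x, 0.0, 0.0, 0.0]
    f2 = [0.0, 0.0, 0.0, x, x ** 3, x ** 4]
    return f1, f2


def information_matrix(points, weights):
    """M(xi) = sum_i w_i m(x_i) with m(x) = f1 f1^T + f2 f2^T."""
    M = [[0.0] * P for _ in range(P)]
    for x, w in zip(points, weights):
        for f in regressors(x):
            for i in range(P):
                for j in range(P):
                    M[i][j] += w * f[i] * f[j]
    return M


def inverse_and_logdet(M):
    """Gauss-Jordan inversion with partial pivoting; also returns log|det M|."""
    n = len(M)
    A = [row[:] + [float(i == j) for j in range(n)] for i, row in enumerate(M)]
    logdet = 0.0
    for c in range(n):
        pivot_row = max(range(c, n), key=lambda r: abs(A[r][c]))
        A[c], A[pivot_row] = A[pivot_row], A[c]
        pivot = A[c][c]
        logdet += math.log(abs(pivot))
        A[c] = [v / pivot for v in A[c]]
        for r in range(n):
            if r != c:
                factor = A[r][c]
                A[r] = [vr - factor * vc for vr, vc in zip(A[r], A[c])]
    return [row[n:] for row in A], logdet


def sensitivity(x, Minv):
    """d(x, xi) = tr[M(xi)^{-1} m(x)]."""
    return sum(f[i] * Minv[i][j] * f[j]
               for f in regressors(x) for i in range(P) for j in range(P))


def multiplicative_algorithm(points, n_iter=300):
    """Start from uniform weights and rescale each weight by d(x_i, xi) / P."""
    weights = [1.0 / len(points)] * len(points)
    for _ in range(n_iter):
        Minv, _ = inverse_and_logdet(information_matrix(points, weights))
        weights = [w * sensitivity(x, Minv) / P for x, w in zip(points, weights)]
    return weights


grid = [i / 50 for i in range(51)]  # candidate design points on [0, 1]
weights = multiplicative_algorithm(grid)
Minv, logdet = inverse_and_logdet(information_matrix(grid, weights))
max_sensitivity = max(sensitivity(x, Minv) for x in grid)
print("maximum of the sensitivity function:", max_sensitivity)
print("points with weight > 0.01:",
      [(x, round(w, 2)) for x, w in zip(grid, weights) if w > 0.01])
```

By Theorem 3.1 the maximum of the sensitivity function cannot fall below the number of parameters, so its distance from 6 indicates how far the current weights are from D-optimality on the grid. One would expect the weight to concentrate near the support points in Equation (6) as the number of iterations grows.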

Let us consider a more general case, for which the joint distribution is described by a Gaussian copula and we thus allow the random variables $Y_{1}$ and $Y_{2}$ to be dependent. In this case the joint distribution function of the random vector $Y = (Y_{1}, Y_{2})$ is simply

\begin{aligned} F_{Y} (y_{1}, y_{2}) & = C_{α} (Φ (y_{1} - η_{1} (x, β)), Φ (y_{2} - η_{2} (x, β))) \\ & = Φ_{2} (y_{1} - η_{1} (x, β), y_{2} - η_{2} (x, β); α), \end{aligned}

(7)

where $Φ_{2} (\cdot, \cdot; α)$ denotes the bivariate normal cdf with correlation $α \in (- 1, 1)$ and Φ denotes the cdf of the standard normal distribution $N (0, 1)$ (see [24]).

Our computations gave rise to the following

Corollary 4.2

For different values of α the optimal design is the same as for the independence case, which is the Gaussian case with $α = 0$ .

Note that the sensitivity function now has a different scaling (with a maximum at 7), as we have an additional copula parameter. This corollary, however, is hardly surprising, as it coincides with the classic findings for the multivariate Gaussian distribution by Krafft and Schaefer.[25]

But now for a contrast consider the FGM copula. Following our approach, we must calculate the density corresponding to the function:

\begin{aligned} C_{α} (Φ (Y_{1} (x; β)), Φ (Y_{2} (x; β))) & = Φ (Y_{1} (x; β)) Φ (Y_{2} (x; β)) \\ & \quad \times [1 + α (1 - Φ (Y_{1} (x; β))) (1 - Φ (Y_{2} (x; β)))], \end{aligned}

which eventually leads to expressions like

E (- \frac{\partial^{2}}{\partial β_{i} \partial β_{j}} \log [\frac{\partial^{2}}{\partial y_{1} \partial y_{2}} C_{α} (Φ (Y_{1} (x; β)), Φ (Y_{2} (x; β)))])

for the information matrix. These integrals are not analytically solvable, but we can evaluate them numerically and we can use the algorithm in order to find the optimum designs.

Not surprisingly those optimal designs do depend upon the choice of $\bar{α}$ (i.e. the assumed dependence) – and similar calculations can be performed for other copula functions as well. Some results are subsumed in Table 1, which displays the loss in D-efficiency that occurs by using the optimal design $ξ^{*}$ from Equation (6) compared to the respective optimal designs for various copula models and Kendall's τ. It can be seen that these losses are generally quite small for all considered copulas.

Table 1. Losses in D-efficiency (in bold) by ignoring the dependence in per cent.

	FGM		Clayton		Frank
τ	$\bar{α}$	D-eff	$\bar{α}$	D-eff	$\bar{α}$	D-eff
$- 0.15$	$- 0.67$	0.29	n.d.	–	−1.37	0.10
$- 0.10$	$- 0.45$	0.23	n.d.	–	−0.90	0.10
$- 0.05$	$- 0.22$	0.59	n.d.	–	−0.45	0.10
$0.05$	$0.22$	0.68	0.10	0.16	0.45	0.10
$0.10$	$0.45$	0.39	0.22	0.13	0.90	0.10
$0.15$	$0.67$	0.28	0.35	0.34	1.37	0.10
$0.35$	n.d.	–	1.08	0.11	3.51	0.11
$0.75$	n.d.	–	6.00	0.27	14.13	0.16


4.3. A binary bivariate model

In order to better investigate the role of the copula parameter, we analyse a more elaborate example with potential applications in clinical trials. Let us formally introduce the model. We consider a bivariate binary response $(Y_{i 1}, Y_{i 2})$ , $i = 1, \dots, n$ with four possible outcomes $\{(0, 0), (0, 1), (1, 0), (1, 1)\}$ where 1 usually represents a success and 0 a failure (of e.g. a drug treatment). For a single observation denote the joint probabilities of $Y_{1}$ and $Y_{2}$ by $p_{y_{1}, y_{2}} = pr (Y_{1} = y_{1}, Y_{2} = y_{2})$ for $(y_{1}, y_{2} = 0, 1)$ . In a clinical trial context $Y_{1}$ and $Y_{2}$ could represent efficacy and toxicity of a tested drug.

Now, with $π_{1}$ and $π_{2}$ denoting the marginal probabilities of success, define

\begin{aligned} p_{11} & = C_{α} (π_{1}, π_{2}), p_{10} = π_{1} - p_{11}, \\ p_{01} & = π_{2} - p_{11}, p_{00} = 1 - π_{1} - π_{2} + p_{11} . \end{aligned}

(8)

The complete log-likelihood for the bivariate binary model is then given by

l (θ; y) = \sum_{i = 1}^{n} w_{i} l_{i} (θ; y), θ = (β_{1}, β_{2}, α),

(9)

where $β_{1}$ and $β_{2}$ are the parameters associated with the respective margins and the log-likelihood for a single observation is given by

l_{i} (θ; y) = y_{1} y_{2} \log p_{11} + y_{1} (1 - y_{2}) \log p_{10} + (1 - y_{1}) y_{2} \log p_{01} + (1 - y_{1}) (1 - y_{2}) \log p_{00} .

(10)

As shown in [26] the Fisher information matrix for a single observation can then be written as

M (θ, ξ_{i}) = {\frac{\partial p}{\partial θ}}^{T} (P^{- 1} + \frac{1}{1 - p_{11} - p_{10} - p_{01}} e e^{T}) \frac{\partial p}{\partial θ},

(11)

where $p = (p_{11}, p_{10}, p_{01})$ , $P = diag (p)$ and $e = (1, 1, 1)^{T}$ . Some useful formulae for calculating information matrices in copula models can also be found in [27].

A particular case of the introduced model has already been analysed in [17]. In that work, the authors assume the marginal probabilities of success given by the models

\log (\frac{π_{i}}{1 - π_{i}}) = β_{i 1} + β_{i 2} x, i = 1, 2

(12)

with $x \in [0, 10]$ and ‘localized’ parameters ${\bar{β}}_{1} = [- 1, 1]$ and ${\bar{β}}_{2} = [- 2, 0.5]$ . The considered joint cdf is the Gumbel cdf, which corresponds to the following choice for the probability of success $p_{11}$ :

p_{11} = F_{Y_{1}, Y_{2}} (π_{1}, π_{2}) = π_{1} π_{2} (1 + α (1 - π_{1}) (1 - π_{2})),

that is, in terms of copulas, the FGM copula.
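To make Equations (8), (11) and (12) concrete, the following Python sketch evaluates the cell probabilities and the elemental Fisher information for the FGM case at a single design point. The derivatives $\partial p / \partial θ$ are approximated by central finite differences, which is a convenience of this sketch and not the analytical route of [26]; the localized values for β are those quoted above, while $\bar{α} = 0.5$ and $x = 2$ are arbitrary illustrative choices and all names are hypothetical.

```python
import math


def logistic(z):
    return 1.0 / (1.0 + math.exp(-z))


def cell_probabilities(theta, x):
    """(p11, p10, p01, p00) from Equations (8) and (12) with the FGM copula."""
    b11, b12, b21, b22, alpha = theta
    pi1 = logistic(b11 + b12 * x)
    pi2 = logistic(b21 + b22 * x)
    p11 = pi1 * pi2 * (1.0 + alpha * (1.0 - pi1) * (1.0 - pi2))
    return p11, pi1 - p11, pi2 - p11, 1.0 - pi1 - pi2 + p11


def jacobian(theta, x, h=1e-6):
    """Central differences; entry [k][c] approximates d p_c / d theta_k."""
    rows = []
    for k in range(len(theta)):
        up, down = list(theta), list(theta)
        up[k] += h
        down[k] -= h
        p_up, p_down = cell_probabilities(up, x), cell_probabilities(down, x)
        rows.append([(a - b) / (2.0 * h) for a, b in zip(p_up, p_down)])
    return rows


def fisher_information(theta, x):
    """Equation (11) with p = (p11, p10, p01) and 1 - p11 - p10 - p01 = p00."""
    p = cell_probabilities(theta, x)
    J = jacobian(theta, x)
    n = len(theta)
    M = [[0.0] * n for _ in range(n)]
    for a in range(n):
        for b in range(n):
            diagonal_part = sum(J[a][c] * J[b][c] / p[c] for c in range(3))
            rank_one_part = sum(J[a][:3]) * sum(J[b][:3]) / p[3]
            M[a][b] = diagonal_part + rank_one_part
    return M


theta_bar = (-1.0, 1.0, -2.0, 0.5, 0.5)  # (beta_11, beta_12, beta_21, beta_22, alpha)
M = fisher_information(theta_bar, 2.0)
for row in M:
    print([round(v, 4) for v in row])
```

Since $p_{00} = 1 - p_{11} - p_{10} - p_{01}$, the matrix in Equation (11) agrees with the usual multinomial information $\sum_{c} p_{c}^{- 1} (\partial p_{c} / \partial θ) (\partial p_{c} / \partial θ)^{T}$ summed over all four cells, which provides a simple numerical cross-check of an implementation.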

In [17], the choice of this model is motivated by arguing that it allows for dependence between efficacy and toxicity, and it is claimed that including estimation of α rather than independently analysing efficacy and toxicity is preferable. We reanalyse this example, studying both the role of the copula parameter and the impact of a particular type of dependence structure on the D-optimal designs obtained.

To get a clearer idea of the role played by the copula parameter, let us first focus on the benchmark case of independence, described both by the Product copula and the FGM copula with $α = 0$ . However, using these two copulas has a substantially different interpretation and effect. In the one case (Product copula) we completely ignore potential dependence, whereas in the other case (FGM copula) we allow for its estimation, but assume it to be absent.

In a second step, to examine the impact of the dependence structure, we compare the model analysed in [17] with the more general ones proposed in [12]. Those have the same assumptions on the marginal probabilities as in [17], as well as the same design space and initial parameters for the betas, but the dependencies are instead represented by using the Frank, Gumbel, and Clayton copulas. Note that in [12] the authors employed a brute-force simulated annealing algorithm for their calculations and had no means for checking definitive optimality, which is now possible through the equivalence theorem (Theorem 3.1) provided.

Now, using the D-optimal designs for the FGM copula and for the Product copula as benchmarks, we note the losses in D-efficiency in per cent as reported respectively in Table 2 (Product copula) and in Table 3 (FGM). In both cases, the losses are much stronger than in the previous example.

Table 2. Losses in D-efficiency (in bold) by ignoring the dependence in per cent (product copula).

	Frank		Clayton		Gumbel
τ	$\bar{α}$	D-eff	$\bar{α}$	D-eff	$\bar{α}$	D-eff
$0.11$	1.00	1.72	0.24	1.75	1.12	0.95
$0.45$	5.00	1.31	1.68	1.49	1.84	1.29
$0.66$	10.00	1.87	3.98	0.71	3.00	2.31
$0.76$	15.00	2.89	6.42	2.84	4.21	2.99
$0.82$	20.00	3.10	8.89	9.48	5.45	3.25


Table 3. Losses in D-efficiency (in bold) in per cent with respect to the FGM copula ( $α = 0$ ).

	Frank		Clayton		Gumbel
τ	$\bar{α}$	D-eff	$\bar{α}$	D-eff	$\bar{α}$	D-eff
$0.11$	1.00	0.01	0.24	2.42	1.12	0.87
$0.45$	5.00	0.36	1.68	1.13	1.84	0.5
$0.66$	10.00	3.18	3.98	1.34	3.00	2.84
$0.76$	15.00	5.63	6.42	5.54	4.21	5.13
$0.82$	20.00	6.24	8.89	13.94	5.45	6.12


Analysing these results by focusing on the Frank and the Gumbel copulas, one can notice that lower losses in Table 3 correspond to the lower values of τ. Conversely, the losses already become much higher in Table 3 for a moderate level of the association τ. Moreover, looking at the results for the Clayton copula, we even have lower losses by ignoring the dependence for almost all the levels of the association τ. This suggests, at first glance, that it is not generally preferable to insert a dependence parameter to be estimated, since this might increase the losses when the model is chosen badly.

Interestingly, if the copula parameter is not estimated, that is, if the model is just a four parameter model, the optimal designs found are almost the same for all the investigated copulas, and can be represented by

ξ^{*} = (\begin{matrix} x_{i}^{*} \\ w_{i}^{*} \end{matrix}) = (\begin{matrix} \sim 0 & 2.80 & 6.79 \\ 0.42 & 0.36 & 0.22 \end{matrix}) .

Evidently, the structure of the dependence has an impact as soon as its parameter requires estimation. In Figure 2 we display the designs and sensitivity functions for a representative case contrasting the very different optimal designs for a Clayton and a Gumbel copula with identical Kendall's τ. It will therefore be of great practical value to compare different copulas with respect to their optimal design properties (as well as eventually to be able to efficiently discriminate between them).

A first such step of comparing different copula models was taken in [12], where the authors evaluate designs for various copula choices against each other (in their Table 8). However, they used the same parameter values for all the copulas without considering the different meaning of the copula parameter for various copula families. Therefore, we instead provide in Table 4 an improved comparison between different dependence structures along the same Kendall's τ values by exploiting the relationship between the copula parameter and the measure of concordance τ. Thanks to this comparison, the pure impact of the choice of the copula is now highlighted. It turns out that even in extreme cases the efficiency losses are only small to moderate. They are greatest if Frank or Gumbel are used instead of Clayton, which may be explained by their opposing representations of tail dependencies.

Table 4. Losses in D-efficiency (in bold) by comparing the true copula model with the assumed one for a fixed Kendall's τ.

True Copula	Frank		Clayton		Gumbel
Assumed Copula	Clayton	Gumbel	Frank	Gumbel	Frank	Clayton
$τ = 0.11$	2.24	0.67	1.99	2.70	0.82	2.75
$τ = 0.45$	0.26	0.03	0.26	0.11	0.03	0.15
$τ = 0.66$	1.09	0.11	1.04	1.28	0.14	1.57
$τ = 0.76$	4.27	0.02	3.87	4.08	0.01	4.73
$τ = 0.82$	8.24	0.01	10.91	10.96	0.01	8.43


4.4. A more flexible model

Let us now allow the strength of the dependence itself to depend upon the regressors x, a situation completely covered by our equivalence theorem. Thus here the copula parameters themselves become model-dependent as, for example, in [28]. As in our context only positive associations (between efficacy and toxicity) make sense, we consider in the following Kendall's τ modelled by a logistic function:

τ (x, α_{1}) = \frac{e^{α_{1} x}}{1 + e^{α_{1} x}},

(13)

which takes values in $[0, 1]$ for $α_{1} \in [0, 1]$ .

Considering the Archimedean copulas Clayton $C_{1}$ and Gumbel $C_{2}$ , the following relationships between Kendall's τ and the copula parameters hold:

\begin{aligned} τ_{C_{1}} (x, α_{1}) & = \frac{γ_{1} (x, α_{1})}{γ_{1} (x, α_{1}) + 2} \quad \text{and so} \quad γ_{1} (x, α_{1}) = \frac{2 τ_{C_{1}} (x, α_{1})}{1 - τ_{C_{1}} (x, α_{1})} \quad \text{for the Clayton family;} \\ τ_{C_{2}} (x, α_{1}) & = \frac{γ_{2} (x, α_{1}) - 1}{γ_{2} (x, α_{1})} \quad \text{and so} \quad γ_{2} (x, α_{1}) = \frac{1}{1 - τ_{C_{2}} (x, α_{1})} \quad \text{for the Gumbel family.} \end{aligned}

Then, we model the current probability of success by a convex combination of the Clayton and the Gumbel copulas

C (π_{1}, π_{2}; α_{1}, α_{2}) = α_{2} C_{1} (π_{1}, π_{2}; γ_{1} (x, α_{1})) + (1 - α_{2}) C_{2} (π_{1}, π_{2}; γ_{2} (x, α_{1})),

and when we link them at the same τ values we end up with

C (π_{1}, π_{2}; α_{1}, α_{2}) = α_{2} C_{1} (π_{1}, π_{2}; 2 e^{α_{1} x}) + (1 - α_{2}) C_{2} (π_{1}, π_{2}; 1 + e^{α_{1} x}) .
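The linked convex combination can be evaluated directly. The Python sketch below implements the last display for $0 < π_{1}, π_{2} < 1$; evaluating both copulas on the log scale is a choice of this sketch (it avoids overflow when $e^{α_{1} x}$ is large), and the function names and the numerical values in the example are hypothetical.

```python
import math


def clayton(u1, u2, gamma):
    """Clayton copula for gamma > 0 and 0 < u1, u2 < 1, evaluated in log space."""
    a, b = -gamma * math.log(u1), -gamma * math.log(u2)
    m = max(a, b)
    # log(u1^-gamma + u2^-gamma - 1), factoring out exp(m) to avoid overflow
    log_sum = m + math.log(math.exp(a - m) + math.exp(b - m) - math.exp(-m))
    return math.exp(-log_sum / gamma)


def gumbel(u1, u2, gamma):
    """Gumbel copula for gamma >= 1 and 0 < u1, u2 < 1, evaluated in log space."""
    a = gamma * math.log(-math.log(u1))
    b = gamma * math.log(-math.log(u2))
    m = max(a, b)
    log_sum = m + math.log(math.exp(a - m) + math.exp(b - m))
    return math.exp(-math.exp(log_sum / gamma))


def kendall_tau(x, alpha1):
    """Logistic model (13) for Kendall's tau."""
    return 1.0 / (1.0 + math.exp(-alpha1 * x))


def mixture_copula(pi1, pi2, x, alpha1, alpha2):
    """alpha2 * Clayton + (1 - alpha2) * Gumbel, both linked to the same tau(x)."""
    e = math.exp(alpha1 * x)
    gamma1 = 2.0 * e   # equals 2 * tau / (1 - tau)
    gamma2 = 1.0 + e   # equals 1 / (1 - tau)
    return (alpha2 * clayton(pi1, pi2, gamma1)
            + (1.0 - alpha2) * gumbel(pi1, pi2, gamma2))


# Illustrative values only: x = 3, alpha_1 = 0.4, alpha_2 = 0.3.
print(kendall_tau(3.0, 0.4), mixture_copula(0.7, 0.4, 3.0, 0.4, 0.3))
```

Setting $α_{2} = 1$ or $α_{2} = 0$ recovers the pure Clayton or pure Gumbel model at the same level of association, which is what separates the roles of the two parameters.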

The added flexibility of such a model is that the impact of the dependence structure and the association level is reflected by two different parameters. While the $α_{2}$ parameter is strictly related to the structure of the dependence, the $α_{1}$ parameter is only related to the measure of association Kendall's τ.

In Table 5 we report the efficiency losses with respect to the independence case (Product Copula). By fixing three localized values $\bar{α_{1}}$ , we assume three different intervals for τ. Corresponding to these intervals we fixed four localized values $\bar{α_{2}}$ . From the table it is clear that when the range of the τ increases, the losses in terms of D-efficiency can become quite substantial. By focusing on the various localized values for $α_{2}$ , it is evident that also the structure of the dependence plays a big role in the design obtained. Here, we can see that when the highest weight in the convex combination is on the Clayton copula, the efficiency losses are lowest.

Table 5. Losses in D-efficiency for the convex combination model in per cent (in bold).

	$τ \in [0.5, 0.95]$	$τ \in [0.5, 0.99]$	$τ \in [0.5, 0.995]$
$\bar{α_{2}}$	Loss in D-eff.	Loss in D-eff.	Loss in D-eff.
$0.1$	11.15	31.97	47.22
$0.3$	6.51	25.11	40.33
$0.6$	2.50	17.20	32.27
$0.9$	1.11	10.90	24.96


5. Discussion

In general, our theory forms the basis to investigate further showcase examples from the literature, like, for example, in [29] or eventually treat mixed discrete/continuous type models like in [30]. Particularly for the latter, but also quite generally the methods provided in this paper can thus be expected to be valuable for real applications from clinical trials, environmental sampling, industrial experiments, etc.

Although here we provide only examples on limited types of copulas, we might expect similar or greater effects for some more non-symmetric copulas (see e.g. [31]), which are subject to our current investigations.

Note that in the convex combination example, by focusing on $α_{2}$ we could find designs with the sole purpose of efficiently discriminating between different copula models, which we plan to address in future research.

Acknowledgements

We thank F. Durante, M. Stehlík, L. Pronzato, J. Rendas and E.P. Klement for fruitful discussions and a referee for constructive remarks.

Appendix

Equivalence theorem

For all the basics in what follows cf. [16]. For a given vector of parameters $(β, α)$ , let $M_{(β, α)}$ be the set of information matrices generated as ξ ranges over the class of all probability distributions on $X$ . Then $M_{(β, α)}$ is the convex hull of $\{m (x, β, α) : x \in X\}$ .

Let us now recall the definition of two derivatives that will play an important role in our theory.

Definition A (Gâteaux and Fréchet derivatives).

Considering two elements $M_{1}$ and $M_{2}$ in $M,$ the Gâteaux derivative of φ at $M_{1}$ in the direction of $M_{2}$ is

$G_{φ} (M_{1}, M_{2}) = lim_{ε \to 0^{+}} \frac{1}{ε} \{φ (M_{1} + ε M_{2}) - φ (M_{1})\},$

and the Fréchet derivative of φ at $M_{1}$ in the direction of $M_{2}$ is

$F_{φ} (M_{1}, M_{2}) = lim_{ε \to 0^{+}} \frac{1}{ε} \{φ \{(1 - ε) M_{1} + ε M_{2}\} - φ (M_{1})\} .$

These derivatives have the following properties. The concavity of φ implies that

\frac{1}{ε} [φ \{(1 - ε) M_{1} + ε M_{2}\} - φ (M_{1})]

is a non-increasing function of ε in $0 < ε \leq 1$ . Hence when φ is concave, $F_{φ} (M_{1}, M_{2})$ exists if we allow the value $+ \infty$ .

Putting $ε = 1$ in the previous expression and using this monotonicity, we obtain $F_{φ} (M_{1}, M_{2}) \geq φ (M_{2}) - φ (M_{1})$ .

According to the definitions of Fréchet and Gâteaux derivatives, we can stress the following relationship between them: $F_{φ} (M_{1}, M_{2}) = G_{φ} (M_{1}, M_{2} - M_{1})$ . Then, if we assume the differentiability of φ at $M_{1}$ , it is clear that for scalars $a_{i}$ with $\sum a_{i} = 1$

F_{φ} (M_{1}, \sum a_{i} M_{i}) = \sum a_{i} F_{φ} (M_{1}, M_{i}) .

Theorem A.2.

Suppose we have a fixed parameter vector $(\bar{β}, \bar{α})$ and a concave function φ on $M_{(\bar{β}, \bar{α})}$ which is differentiable at all points of $M_{(\bar{β}, \bar{α})}$ where $φ (M) > - \infty,$ and suppose that a φ-optimal measure exists.

Then the following are equivalent:

$ξ^{*}$ is φ-optimal;

$F_{φ} (M (ξ^{*}, \bar{β}, \bar{α}), M (ξ, \bar{β}, \bar{α})) \leq 0,$ $\forall ξ \in Ξ;$

$F_{φ} (M (ξ^{*}, \bar{β}, \bar{α}), m (x, \bar{β}, \bar{α})) \leq 0,$ $\forall x \in X;$

$max_{x \in X} F_{φ} (M (ξ^{*}, \bar{β}, \bar{α}), m (x, \bar{β}, \bar{α})) = min_{ξ \in Ξ} max_{x \in X} F_{φ} (M (ξ, \bar{β}, \bar{α}), m (x, \bar{β}, \bar{α}))$ .

Proof.

Let us prove the theorem by double implications.

$(i) \Rightarrow (ii)$ $ξ^{*}$ is φ-optimal. This means that $φ (M (ξ^{*}, \bar{β}, \bar{α}))$ is maximal.

By the properties of the function φ, the following relation holds:
$φ \{(1 - ε) M (ξ^{*}, \bar{β}, \bar{α}) + ε M (ξ, \bar{β}, \bar{α})\} - φ \{M (ξ^{*}, \bar{β}, \bar{α})\} \leq 0$
for $ε \in [0, 1]$ and all $ξ \in Ξ$ .

For all elements of $M_{(\bar{β}, \bar{α})}$ it holds that
$(1 - ε) M (ξ^{*}, \bar{β}, \bar{α}) + ε M (ξ, \bar{β}, \bar{α}) = M \{(1 - ε) ξ^{*} + ε ξ\}$
and this means, from the definition of the Fréchet derivative, that
$F_{φ} \{M (ξ^{*}, \bar{β}, \bar{α}), M (ξ, \bar{β}, \bar{α})\} \leq 0$
for all $ξ \in Ξ$ .

$(ii) \Rightarrow (iii)$ Since $m (x, \bar{β}, \bar{α})$ are elements of the convex hull $M_{(\bar{β}, \bar{α})}$ , the condition (iii) follows directly from the hypothesis.

$(iii) \Rightarrow (iv)$ For any design ξ with elemental informations $m (x, \bar{β}, \bar{α})$ , since $F_{φ} \{M (ξ, \bar{β}, \bar{α}), M (ξ, \bar{β}, \bar{α})\} = 0$ , it must be that
$max_{x \in X} F_{φ} \{M (ξ, \bar{β}, \bar{α}), m (x, \bar{β}, \bar{α})\} \geq 0.$
But, according to the hypothesis, we have that for the design $ξ^{*}$
$max_{x \in X} F_{φ} \{M (ξ^{*}, \bar{β}, \bar{α}), m (x, \bar{β}, \bar{α})\} \leq 0.$
Hence
$\begin{aligned} max_{x \in X} F_{φ} \{M (ξ^{*}, \bar{β}, \bar{α}), m (x, \bar{β}, \bar{α})\} & = 0 \\ & = min_{ξ} max_{x \in X} F_{φ} \{M (ξ, \bar{β}, \bar{α}), m (x, \bar{β}, \bar{α})\} . \end{aligned}$

$(iv) \Rightarrow (i)$ Suppose now that $ξ^{*}$ satisfies the hypothesis; then
$max_{x \in X} F_{φ} \{M (ξ^{*}, \bar{β}, \bar{α}), m (x, \bar{β}, \bar{α})\} = 0,$
which means that $F_{φ} \{M (ξ^{*}, \bar{β}, \bar{α}), m (x, \bar{β}, \bar{α})\} \leq 0$ , $\forall x \in X$ . According to the definition of the matrices $M \in M$ , any M can be written as $M (ξ, \bar{β}, \bar{α}) = \sum_{i = 1}^{n} w_{i} m (x_{i}, \bar{β}, \bar{α})$ , where $\sum_{i = 1}^{n} w_{i} = 1$ and $w_{i} > 0$ for every $i = 1, \dots, n$ .

Then, since φ is differentiable at $M (ξ^{*}, \bar{β}, \bar{α})$ , it holds that:
$F_{φ} \{M (ξ^{*}, \bar{β}, \bar{α}), M (ξ, \bar{β}, \bar{α})\} = \sum_{i = 1}^{n} w_{i} F_{φ} \{M (ξ^{*}, \bar{β}, \bar{α}), m (x_{i}, \bar{β}, \bar{α})\} \leq 0$
for every $ξ \in Ξ$ .

Since $φ (M (ξ, \bar{β}, \bar{α})) - φ (M (ξ^{*}, \bar{β}, \bar{α})) \leq F_{φ} \{M (ξ^{*}, \bar{β}, \bar{α}), M (ξ, \bar{β}, \bar{α})\}$ by concavity, this means that
$φ (M (ξ, \bar{β}, \bar{α})) - φ (M (ξ^{*}, \bar{β}, \bar{α})) \leq 0$
for every $ξ \in Ξ$ ; hence $ξ^{*}$ is φ-optimal.

D-optimality

Let us now consider as design criterion the following function:

φ (M) = \begin{cases} \log det M & \text{if } M \text{ is non-singular} \\ - \infty & \text{otherwise} \end{cases}

A design that maximizes such a function φ is called a D-optimal design.

In the case of D-optimality the Fréchet and the Gâteaux derivatives have the following expressions:

Gâteaux derivative

\begin{aligned} \log det (M_{1} + ε M_{2}) - \log det M_{1} & = \log det (I + ε M_{2} M_{1}^{- 1}) \\ & = \log \{1 + ε tr (M_{2} M_{1}^{- 1})\} + O (ε^{2}) \\ & = ε tr (M_{2} M_{1}^{- 1}) + O (ε^{2}) \end{aligned}

Hence, $G_{φ} (M_{1}, M_{2}) = tr (M_{2} M_{1}^{- 1})$ .

Fréchet derivative

\begin{aligned} F_{φ} (M_{1}, M_{2}) & = G_{φ} (M_{1}, M_{2} - M_{1}) = tr ((M_{2} - M_{1}) M_{1}^{- 1}) \\ & = tr (M_{2} M_{1}^{- 1} - I_{(k + l)}) = tr (M_{2} M_{1}^{- 1}) - (k + l) \end{aligned}

where $(k + l)$ is the number of the model parameters.
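Both expressions are easy to verify numerically. The following Python sketch compares the trace formulas with difference quotients for a pair of $2 \times 2$ positive definite matrices; the matrices, the step size and the function names are arbitrary choices made for this illustration.

```python
import math


def det2(M):
    return M[0][0] * M[1][1] - M[0][1] * M[1][0]


def phi(M):
    """D-criterion log det M for a non-singular 2 x 2 matrix."""
    return math.log(det2(M))


def trace_m2_m1_inverse(M1, M2):
    """tr(M2 M1^{-1}) using the explicit inverse of a 2 x 2 matrix."""
    d = det2(M1)
    inv = [[M1[1][1] / d, -M1[0][1] / d], [-M1[1][0] / d, M1[0][0] / d]]
    return sum(M2[i][j] * inv[j][i] for i in range(2) for j in range(2))


def gateaux_quotient(M1, M2, eps=1e-6):
    """Difference quotient (phi(M1 + eps*M2) - phi(M1)) / eps."""
    shifted = [[M1[i][j] + eps * M2[i][j] for j in range(2)] for i in range(2)]
    return (phi(shifted) - phi(M1)) / eps


def frechet_quotient(M1, M2, eps=1e-6):
    """Difference quotient (phi((1-eps)*M1 + eps*M2) - phi(M1)) / eps."""
    mixed = [[(1.0 - eps) * M1[i][j] + eps * M2[i][j] for j in range(2)]
             for i in range(2)]
    return (phi(mixed) - phi(M1)) / eps


M1 = [[2.0, 0.5], [0.5, 1.0]]
M2 = [[1.0, 0.2], [0.2, 3.0]]
print(gateaux_quotient(M1, M2), trace_m2_m1_inverse(M1, M2))
print(frechet_quotient(M1, M2), trace_m2_m1_inverse(M1, M2) - 2)
```

Here the number of parameters is 2, so the Fréchet derivative should be close to the trace minus 2, up to terms of order ε.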

We are ready now to give an equivalence theorem which holds in the particular case of the D-criterion.

Theorem A.3.

For a fixed parameter vector $(\bar{β}, \bar{α}),$ the following properties are equivalent:

$ξ^{*}$ is D-optimal;

$tr (M (ξ^{*}, \bar{β}, \bar{α})^{- 1} m (x, \bar{β}, \bar{α})) \leq (k + l),$ $\forall x \in X;$

$ξ^{*}$ minimizes $max_{x \in X} tr (M (ξ, \bar{β}, \bar{α})^{- 1} m (x, \bar{β}, \bar{α}))$ over all $ξ \in Ξ$ .

Proof.

The proof follows directly from Theorem A.2 by inserting the Fréchet derivative for the D-criterion.
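Spelled out, the substitution reads as follows (a short worked step, using only the expression for the Fréchet derivative of the D-criterion and the invariance of the trace under cyclic permutation):

```latex
\begin{aligned}
F_{φ} (M (ξ^{*}, \bar{β}, \bar{α}), m (x, \bar{β}, \bar{α}))
& = tr (m (x, \bar{β}, \bar{α}) M (ξ^{*}, \bar{β}, \bar{α})^{- 1}) - (k + l) \\
& = tr (M (ξ^{*}, \bar{β}, \bar{α})^{- 1} m (x, \bar{β}, \bar{α})) - (k + l),
\end{aligned}
```

so that the condition $F_{φ} \leq 0$ for all $x \in X$ is the same as the trace being bounded by $(k + l)$, and minimizing the maximum of $F_{φ}$ over ξ is the same as minimizing the maximum of the trace, because the two differ only by the constant $(k + l)$.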

Funding Statement

This work has been supported by the project ANR-2011-IS01-001-01 ‘DESIRE’ and Austrian Science Fund (FWF) I 833-N18.

Disclosure statement

No potential conflict of interest was reported by the authors.

ORCID

W.G. Müller http://orcid.org/0000-0002-3564-766X

References

[1] Valdez EA. Understanding relationships using copulas. N Am Actuar J. 1998;2(1):1–25.
[2] Trivedi PK, Zimmer DM. Copula modeling: an introduction for practitioners. Found Trends Econ. 2006;1(1):1–111.
[3] Nikoloulopoulos AK, Karlis D. Multivariate logit copula model with an application to dental data. Stat Med. 2008;27(30):6393–6406. doi: 10.1002/sim.3449.
[4] Danaher PJ, Smith MS. Modeling multivariate distributions using copulas: applications in marketing. Mark Sci. 2011;30(1):4–21.
[5] Wadsworth JL, Tawn JA. Dependence modelling for spatial extremes. Biometrika. 2012;99(2):253–272.
[6] Patton AJ. A review of copula models for economic time series. J Multivariate Anal. 2012;110:4–18.
[7] McHale I, Scarf P. Modelling the dependence of goals scored by opposing teams in international soccer matches. Stat Model. 2011;11(3):219–236.
[8] Cherubini U, Luciano E, Vecchiato W. Copula methods in finance. Chichester: Wiley; 2004.
[9] Li J, Bárdossy A, Guenni L, Liu M. A copula based observation network design approach. Environ Model Softw. 2011;26(11):1349–1357.
[10] Pilz J, Kazianka H, Spöck G. Some advances in Bayesian spatial prediction and sampling design. Spat Statist. 2012;1:65–81.
[11] Schmidt R, Faldum A, Witt O, Gerß J. Adaptive designs with arbitrary dependence structure. Biom J. 2014;56(1):86–106. doi: 10.1002/bimj.201200234.
[12] Denman NG, McGree JM, Eccleston JA, Duffull SB. Design of experiments for bivariate binary responses modelled by Copula functions. Comput Statist Data Anal. 2011;55(4):1509–1520.
[13] Kiefer J, Wolfowitz J. The equivalence of two extremum problems. Canad J Math. 1960;12:363–366.
[14] Sklar A. Fonctions de répartition à n dimensions et leurs marges. Publications de l'Institut de Statistique de Paris. 1959;8:229–231.
[15] Atkinson AC, Fedorov VV, Herzberg AM, Zhang R. Elemental information matrices and optimal experimental design for generalized regression models. J Statist Plann Inference. 2014;144:81–91.
[16] Silvey SD. Optimal design (science paperbacks). London: Chapman & Hall; 1980.
[17] Heise MA, Myers RH. Optimal designs for bivariate logistic regression. Biometrics. 1996;52(2):613–624.
[18] Fedorov VV. The design of experiments in the multiresponse case. Theory Probab Appl. 1971;16(2):323–332.
[19] Wynn HP. The sequential generation of D-optimum experimental designs. Ann Math Statist. 1970;41(5):1655–1664.
[20] Perrone E. A study on robustness in the optimal design of experiments for copula models. In: Steland A, Rafajłowicz E, Szajowski K, editors. Stochastic models, statistics and their applications. Springer Proceedings in Mathematics & Statistics, Vol. 122. Wrocław, Poland: Springer International Publishing; 2015. p. 335–342.
[21] Nelsen RB. An introduction to copulas (Springer series in statistics). 2nd ed. New York: Springer; 2007.
[22] Durante F, Sempi C. Copula theory: an introduction. In: Bickel P, Diggle P, Fienberg S, Gather U, Olkin I, Zeger S, Jaworski P, Durante F, Härdle WK, Rychlik T, editors. Copula theory and its applications. Lecture Notes in Statistics, chapter 1, Vol. 198. Berlin, Heidelberg: Springer; 2010. p. 3–31.
[23] Michiels F, De Schepper A. A copula test space model: how to avoid the wrong copula choice. Kybernetika. 2008;44(6):864–878.
[24] Meyer C. The bivariate normal copula. Comm Statist Theory Methods. 2013;42(13):2402–2422.
[25] Krafft O, Schaefer M. D-optimal designs for a multivariate regression model. J Multivariate Anal. 1992;42(1):130–140.
[26] Dragalin V, Fedorov V. Adaptive designs for dose-finding based on efficacy–toxicity response. J Statist Plann Inference. 2006;136(6):1800–1823.
[27] Schepsmeier U, Stöber J. Derivatives and Fisher information of bivariate copulas. Stat Papers. 2014;55(2):525–542.
[28] Noh H, Ghouch AE, Bouezmarni T. Copula-based regression estimation and inference. J Amer Statist Assoc. 2013;108(502):676–688.
[29] Oakes D, Ritz J. Regression in a bivariate copula model. Biometrika. 2000;87(2):345–352.
[30] de Leon AR, Wu B. Copula-based regression models for a bivariate mixed discrete and continuous outcome. Stat Med. 2011;30(2):175–185. doi: 10.1002/sim.4087.
[31] Klement EP, Mesiar R. How non-symmetric can a copula be? Comment Math Univ Carolin. 2006;47(1):141–148.

