A Sequential Design for Extreme Quantile Estimation Under Binary Sampling

Michel Broniatowski; Emilie Miranda

doi:10.3390/e28040479

. 2026 Apr 21;28(4):479. doi: 10.3390/e28040479

A Sequential Design for Extreme Quantile Estimation Under Binary Sampling

Michel Broniatowski ^1,^*, Emilie Miranda ¹

Editor: Alessandro Pluchino¹

PMCID: PMC13114713 PMID: 42072604

Abstract

We propose a sequential design method aiming at the estimation of an extreme quantile based on a sample of binary data corresponding to peaks over a given threshold. This study is motivated by an industrial challenge in material reliability and consists of estimating a failure quantile from trials whose outcomes are reduced to indicators of whether the specimen has failed at the tested stress levels. The proposed approach relies on a splitting strategy that decomposes the target extreme probability into a product of higher-order conditional probabilities, enabling a progressive exploration of the tail of the distribution through sampling under truncated laws. We consider GEV and Weibull models for the underlying distribution, and the sequential estimation of their parameters is carried out using an enhanced maximum likelihood procedure specifically adapted to binary data, addressing the substantial uncertainty inherent to such limited information.

Keywords: extreme quantile estimation, sequential design, binary information, splitting, extreme value theory

1. Introduction

Consider a non-negative random variable X with distribution function G. Let $X_{1}, \dots, X_{n}$ be n independent copies of $X .$ The aim of this paper is to estimate $q_{1 - α}$ , the $(1 - α)$ -quantile of G when $α$ is much smaller than $1 / n .$ We therefore aim at the estimation of so-called extreme quantiles. This question has been handled by various authors, and these results are reviewed in Section 3. The approach that we develop is quite different since we do not assume that the $X_{i}$ ’s can be observed. For any threshold x, we define the following random variable:

Y = \{\begin{matrix} 1 if X \leq x \\ 0 if X > x \end{matrix}

which therefore has a Bernoulli distribution with parameter $G (x) .$ The threshold x is chosen by the experimenter, but the underlying value of X remains unobserved, leading to a substantial loss of information. Such settings are common in industrial statistics. For instance, when assessing the strength of a material or a component, one applies a load x and records only whether failure occurs.

In the following, we denote by R the resistance of this material and observe the corresponding indicator Y. Inference on G can be performed for large n, making use of many thresholds $x .$ Unfortunately, such a procedure will not be of any help for extreme quantiles.

Indeed, this setting raises significant methodological challenges. First, the observation of binary responses only leads to a substantial loss of information compared to standard settings where the variable of interest is fully observed. Second, the estimation of extreme quantiles, associated with very small probability levels, further exacerbates this difficulty.

Existing approaches for quantile estimation from binary data, in particular those based on adaptations of the Robbins–Monro procedure, typically rely on a precise prior specification of the distribution of the latent variable, including in the vicinity of the target quantile. Such assumptions may be difficult to justify in practice, especially in industrial contexts where prior information is limited. Moreover, these methods are generally designed for moderate probability levels and do not naturally extend to the estimation of extreme quantiles under severe information constraints.

This motivates the development of alternative approaches that are able to explore the tail of the distribution in a progressive manner while reducing the reliance on strong prior assumptions.

To address these limitations, we will consider a design of experiments that progressively explores increasingly extreme regions of the distribution. More precisely, we assume that observations may be collected not only when R follows G, but also when R follows the conditional law of R given ${R > x}$ . Under such an assumption, it becomes possible to estimate $q_{1 - α}$ even when $α < 1 / n$ , where n denotes the total number of trials.

In materials science, this amounts to considering trials based on artificially modified materials; when estimating extreme upper quantiles, this corresponds to strengthening the material. We would consider a family of increasing thresholds $x_{1}, \dots, x_{m}$ and for each of them realize $K_{1}, \dots, K_{m}$ trials, each block of i.i.d. realizations Y’s, therefore, corresponds to functions of unobserved R’s with distribution G conditioned upon ${R > x_{l}}$ , $1 \leq l \leq m .$

This setting departs significantly from classical approaches based on full data, and is particularly suited for industrial statistics and reliability studies in materials science. From a statistical viewpoint, the situation is tractable when G belongs to a parametric family for which the conditional law of R given ${R > x}$ preserves the functional form of G, up to changes in the parameters. In this case, sampling under conditional laws may be carried out adaptively through a sequential choice of the thresholds $x_{l}$ ’s. This leads to a recursive estimation procedure in which the parameters of G are updated iteratively, and the target quantile $q_{1 - α}$ is obtained by combining quantiles of conditional distributions. This splitting-type approach guides the selection of the $x_{l}$ ’s so that $q_{1 - α}$ can ultimately be determined from the final conditional distribution associated with threshold $x_{m}$ .

Such techniques are closely connected to safety analysis and pharmaceutical dose-finding, where interest naturally focuses on the behavior of the system under very small probability levels. Through a simple change of variable, these situations can be reframed in terms of upper-tail events. In particular, if $\tilde{R} = 1 / R$ , then for $x > 0$ , the event ${R < x}$ is equivalent to ${\tilde{R} > 1 / x}$ . Accordingly, we make use of this duality throughout the paper, exploiting standard results for $\tilde{R}$ when necessary. In particular, if $q_{α}$ denotes the $α$ -quantile of R and ${\tilde{q}}_{1 - α}$ the $(1 - α)$ -quantile of $\tilde{R}$ , then $q_{α} = 1 / {\tilde{q}}_{1 - α}$ . While this notation may appear slightly cumbersome, it is natural in industrial statistics.

This article is organized as follows, and its main contributions are introduced along the way. Section 2 formalizes the problem in the framework of an industrial application and highlights the specific challenges arising from binary observations and extreme quantile estimation. Section 3 reviews the existing literature on extreme quantile estimation and experimental design under binary data. This discussion emphasizes the limitations of current approaches in the regime of very small probabilities and motivates the need for alternative strategies. In Section 4, we introduce a novel splitting-based framework for extreme quantile estimation under partial observability. This constitutes a central contribution of the paper, as it enables the progressive exploration of the tail of the distribution through a sequential experimental design. The proposed procedure is elaborated for a Generalized Pareto model. Section 5 develops the associated estimation procedure. In particular, we propose a dual-criterion approach combining a likelihood-based component with a stability condition on sequentially estimated quantiles. Section 6 extends the methodology to an alternative parametric model, illustrating the flexibility of the proposed framework beyond the Generalized Pareto setting. Finally, Section 7 and Section 8 provide a brief discussion of model selection and behavior under misspecification, as well as hints about extensions of the models studied beforehand.

2. Problem Formulation in an Industrial Reliability Framework

This study focuses on the estimation of extreme failure quantiles, a critical issue in industrial risk assessment. Such quantiles play a major role in engineering applications, particularly in the aeronautics industry, where they intervene directly in decisions pertaining to engine component dimensioning and the management of fatigue-related risks. For a detailed presentation of the industrial context related to material fatigue that motivated this study, see Broniatowski and Miranda (2019) [1].

In this context, the usual estimation procedures rely on data obtained from experimental trial campaigns, in which specimens are subjected to various stress levels and tested until failure or until the end of the trial. Such experimental campaigns can be extremely costly in certain industrial contexts, such as aeronautics, which severely limit both the total number of trials and the diversity of experimental conditions that can be explored.

The aim of this study is to introduce a new experimental design methodology for the estimation of extreme failure quantiles at a fixed target lifetime, under a very low risk level $α$ of the order of $10^{- 3}$ , while relying on a minimal number of trials.

Let S denote the stress level applied (in MPa).

Throughout the paper, we will denote R, a positive r.v. modeling the resistance of the material at the target lifetime and homogeneous to the stress.

Define $s_{α}$ as the failure quantile of probability level $α$ $= 10^{- 3}$ , the level of stress that guarantees that the risk of failure before the fixed lifetime does not exceed $α$ . Thus, $s_{α}$ is the $α$ -quantile of the distribution of R:

s_{α} = q_{α} = sup \{s : P (R \leq s) \leq α\} .

(1)

However, R is not directly observed during the experiments. The relevant information available to characterize R is limited to indicators of whether or not the specimen tested has failed at s before the end of the trial. Therefore, the relevant observations corresponding to a campaign of n trials are formed by a sample of variables $Y_{1}, \dots, Y_{n}$ with for $1 \leq i \leq n,$

Y_{i} = \{\begin{matrix} 1 if R_{i} \leq s_{i} \\ 0 if R_{i} > s_{i} \end{matrix}

where $s_{i}$ is the stress applied on specimen $i .$

Note that the number of observations is constrained by industrial and financial considerations. Thus, $α$ is way lower than $1 / n$ , and we are considering a quantile lying outside the sample range.

Although this work is motivated by an industrial application in materials science, similar statistical settings arise in other fields, such as broader reliability analysis or dose-finding studies in medical trials, where the goal is to estimate a maximum tolerated dose corresponding to a very small failure probability.

3. Extreme Quantile Estimation—A Short Survey

As seen above, estimating extreme failure quantiles raises two main issues: on the one hand, the estimation of an extreme quantile, and on the other hand, the need to conduct inference based on exceedances under thresholds. We provide here a brief overview of these two areas, bearing in mind that the literature on extreme quantile estimation typically assumes complete data or, at best, right-censored observations.

3.1. Extreme Quantiles Estimation Methods

Extreme quantile estimation in the univariate setting is widely covered in the literature when the variable of interest X is either completely or partially observed. The usual framework is the study of the $(1 - α)$ -quantile of a r.v X, with very small $a l p h a$ .

The most classical case corresponds to the setting where $x_{1 - α}$ is drawn from an n sample of observations $X_{1}, \dots, X_{n}$ . A distinction is usually made between high quantiles that lie within the sample range (see Weissman 1978 [2] and Dekkers et al., 1989 [3]) and extreme quantiles that fall outside the range of the observations (e.g., De Haan and Rootzén 1993 [4]). It is assumed that X belongs to the domain of attraction of an extreme value distribution. The tail index of the latter is then estimated through maximum likelihood (Weissman 1978 [2]) or through an extension of Hill’s estimator, such as the moment estimator by Dekkers et al. (1989) [3]. The estimator of the quantile is then deduced from the inverse function of the distribution of the k largest observations. Note that all the above references assume that the distribution has a Pareto tail.

Alternative modeling strategies have been proposed by De Valk (2016) [5] and De Valk and Cai (2018) [6], who assume a Weibull-type tail. This relaxes certain second-order hypotheses on the tail and allows one to target quantiles far outside the sample range. We will use these methods as benchmarks in our empirical comparisons.

Recent studies have also tackled the issue of censoring. For instance, Beirlant et al. (2007) [7] and Einmahl et al. (2008) [8] proposed a generalization of the peak-over-threshold method when the data are subjected to random right censoring and an estimator for extreme quantiles. The idea is to consider a consistent estimator of the tail index on the censored data and divide it by the proportion of censored observations in the tail. Worms and Worms (2014) [9] studied estimators of the extremal index based on Kaplan–Meier integration and censored regression.

By contrast, the literature is much sparser in the case of complete truncation, i.e., when only exceedances over predetermined thresholds are observed. All of the methods mentioned above rely on the higher-order statistics of the original sample, which are not available in the present setting. Consequently, classical extreme quantile estimators are not directly applicable to our problem.

In the next subsection, we therefore turn to sequential experimental designs used in industrial and biostatistical contexts, and assess their relevance for estimating extreme quantiles from dichotomous data.

3.2. Sequential Design Based on Dichotomous Data

A few studies have addressed the estimation of quantiles from binary data. Wu (1985) [10] and Joseph (2004) [11] propose adaptations of the Robbins and Monro procedure for binary data. Yet both focus on quantiles of moderate order (typically between 0.1 and 0.9) and require a fairly accurate prior knowledge of the latent distribution of the variable of interest around the targeted quantile. Such an assumption may be unrealistic in many applications, especially when targeting the tail of the distribution.

Wu and Tian (2014) [12] introduced what appears to be the most advanced procedure for quantile estimation based on binary data. Their three-step sequential approach (search, estimate, approximate) shows promising performance even for extreme quantiles, but it relies heavily on the specifics of the application context and must be tailored on a case-by-case basis.

In contrast, the approach we propose in this study aims to provide a generic framework specifically designed for the estimation of extreme failure quantiles. Our methodology is grounded in parametric assumptions justified by extreme value theory results, thus reducing the need for strong prior knowledge on the latent distribution and reducing sensitivity to model specification.

The remainder of this section reviews two classical designs frequently used in industry and biostatistics and conceptually closest to our objective: the Staircase and the Continual Reassessment Method (CRM). In what follows, we consider the estimation of a small quantile $q_{α}$ for events of the form ${R < s}$ with $α$ small. Both procedures rely on a parametric model for the strength variable R. We retain two specifications to facilitate performance comparisons, rather than to provide fully realistic safety assessments.

3.2.1. The Staircase Method

Denote $P (R \leq s) = ϕ (s, θ_{0})$ . Invented by Dixon and Mood (1948) [13] and refined in Dixon (1965) [14], this technique aims at the estimation of the parameter $θ_{0}$ through sequential search based on data of exceedances under thresholds. Starting from an initial stress level $S_{i n i}$ , each item is tested and the next test level is adjusted by a fixed increment $δ > 0$ : it is increased after a survival and decreased after a failure. This process is repeated for the K specimens. After the K trials, the parameter $θ_{0}$ is estimated through maximization of the likelihood.

The proper conduct of the Staircase method relies on strong assumptions about the choice of the design parameters. Firstly, $S_{i n i}$ has to be sufficiently close to the expectation of R and secondly, $δ$ has to lie between $0.5 σ$ and $2 σ$ , where $σ$ designates the standard deviation of the distribution of R.

Numerical Results

The accuracy of the procedure has been evaluated on the two models presented below on a batch of 1000 replications, each with $K = 100 .$

Exponential case

Let $R \sim E (λ)$ with $λ = 0.2$ . The input parameters are $S_{ini} = 5$ and $δ = 15 \in [0.5 \times \frac{1}{λ^{2}}, 2 \times \frac{1}{λ^{2}}]$ .

As shown in Table 1, the relative error pertaining to the parameter $λ$ is roughly $25 %$ , although the input parameters are reasonably well chosen for the method. The resulting relative error on the $10^{- 3}$ quantile is $40 % .$ Indeed, the parameter $λ$ is underestimated, which results in an overestimation of the variance $1 / λ^{2}$ , which induces an overestimation of the $10^{- 3}$ quantile.

Table 1.

Results obtained using the Staircase method through simulations under the exponential model.

Relative Error
On the Parameter		On $s_{α}$
Mean	Std	Mean	Std
−0.252	0.178	0.4064874	0.304

Open in a new tab

Gaussian case

We now choose $R \sim N (μ, σ)$ with $μ = 60$ and $σ = 10$ . The value of $S_{ini}$ is set to the expectation and $δ = 7$ belongs to the interval $[\frac{σ}{2}, 2 σ] .$ The same procedure as above is performed and yields the results in Table 2.

Table 2.

Results obtained using the Staircase method through simulations under the Gaussian model.

Relative Error
On $μ$		On $σ$		On $s_{α}$
Mean	Std	Mean	Std	Mean	Std
−0.059	0.034	1.544	0.903	−1.753	0.983

Open in a new tab

The expectation of R is recovered rather accurately, whereas the estimation of the standard deviation suffers a loss in accuracy, which, in turn, yields a relative error of 180 % on the $10^{- 3}$ quantile.

Limitations of the Staircase Method

While the Staircase method can recover the central tendency with a limited number of trials, it is not suitable for extreme quantile estimation. The latter relies on extrapolation from potentially biased parameter estimates, and simple reparametrizations (e.g., in terms of the extreme quantile) do not remedy the inherent loss of accuracy.

3.2.2. The Continuous Reassessment Method (CRM)

General Principle

The CRM (O’Quigley, Pepe and Fisher, 1990 [15]) has been designed for clinical trials to estimate $q_{α}$ among J stress levels $s_{1}, \dots, s_{J}$ , when $α$ is of order $20 %$ .

Denote $P (R \leq s) = ψ (s, β_{0})$ . The estimator of $q_{α}$ is

s^{*} : = \underset{s_{j} \in {s_{1}, \dots, s_{J}}}{arginf} | ψ (s_{j}, β_{0}) - α | .

This optimization is performed iteratively with K trials per iteration. Starting with an initial estimator $\hat{β_{1}}$ of $β_{0}$ , for example, through a Bayesian choice as proposed in [15], define

s_{1}^{*} : = \underset{s_{j} \in {s_{1}, \dots, s_{J}}}{arginf} | ψ (s_{j}, \hat{β_{1}}) - α | .

Each iteration $i \geq 1$ follows a two-step procedure:

Update ${\hat{β}}_{i}$ using all past data (maximum likelihood or Bayesian posterior under $ψ (\cdot, β_{0})$ ).
Set
$s_{i}^{*} : = \underset{s_{j} \in {s_{1}, \dots, s_{J}}}{arginf} | ψ (s_{j}, {\hat{β}}_{i}) - α |,$
and perform the next K Bernoulli trials at level $s_{i}^{*}$ .

The stopping rule depends on the context (maximum number of trials or stabilization of the results).

Note that the Bayesian inference is useful in cases where there is no diversity in the observations at some iterations of the procedure, i.e., when, at a given level of test $s_{i}^{*}$ , only failures or survivals are observed.

Application to Failure Quantiles

Denote by $π_{s}$ the prior indexed by the stress level s. $π_{s}$ models the failure probability at level s and has a Beta distribution given by

π_{s} \sim β (k, n - k + 1) .

(2)

It amounts to stating that at a given stress level s, we expect k failures out of n trials.

Let R follow an exponential distribution: $\forall s \geq 0, ψ (s, β_{0}) = p_{s} = 1 - exp (- β_{0} s)$ .

It follows $\forall s, β_{0} = - \frac{1}{s} log (1 - p_{s})$ .

Define the random variable $Λ_{s} = - \frac{1}{s} log (1 - π_{s})$ which, by definition of $π_{s}$ , is distributed as a k-order statistic of a uniform distribution $U_{k, n}$ .

The estimation procedure of the CRM is obtained as follows:

Step 1. Compute an initial estimator of the parameter

Λ_{s} = \frac{1}{L} \sum_{l = 1}^{L} - \frac{1}{s} log (1 - π_{s}^{l})

with $π_{s}^{l} \sim β (k, n - k + 1), 1 \leq l \leq L$ . Define

s_{1}^{*} : = \underset{s_{j} \in {s_{1}, \dots, s_{J}}}{arginf} | (1 - exp (- Λ_{s} s_{j})) - α | .

and perform J trials at level $s_{1}^{*}$ . Denote the observations $Y_{1, j} : = 1_{R_{1, j} < s_{1}^{*}}, 1 \leq j \leq J .$

Step i. At iteration i, compute the posterior distribution of the parameter:

π_{s_{i}}^{*} \sim β (k + \sum_{l = 1}^{i} \sum_{j = 1}^{J} Y_{l, j}, n + (J \times i) - (k + \sum_{l = 1}^{i} \sum_{j = 1}^{J} Y_{l, j}) + 1)

(3)

The above distribution also corresponds to an order statistic of the uniform distribution $U_{k + \sum_{l = 1}^{i} \sum_{j = 1}^{J} Y_{l, j}, n + (J \times i)}$ . We then obtain an estimate $Λ_{s_{1}^{*}}$ .

The next stress level $s_{i + 1}^{*}$ to be tested in the procedure is then given by

s_{i + 1}^{*} : = \underset{s_{j} \in {s_{1}, \dots, s_{J}}}{arginf} | (1 - exp (- Λ_{s_{1}^{*}} s_{j})) - α | .

Numerical Simulation for the CRM

Under the exponential model with parameter $λ = 0.2$ , using $N = 10$ iterations, $J = 10$ equally spaced thresholds $s_{1}, \dots, s_{J}$ , and $K = 50$ trials per iteration, we obtain the results reported in Table 3.

Table 3.

Results obtained through CRM on simulations for the exponential model.

Relative Error
On the $0.1$ -Quantile		On the $10^{- 3}$ -Quantile
Mean	Std	Mean	Std
0.129	0.48	−0.799	0.606

Open in a new tab

The $10^{- 3}$ -quantile is poorly estimated even in this simple setting. Near the target threshold, almost no failures are observed; for acceptable values of K, the method becomes ineffective. Figure 1 illustrates the improvement in accuracy as K increases.

Relative error on the $10^{- 3}$ -quantile with respect to the number of trials for each stress level.

In summary, both the Staircase and the CRM face the same limitation for extreme quantile estimation. The staircase targets the central tendency, whereas the CRM is calibrated for moderate quantile levels (around $0.2$ ), far from the target $α = 10^{- 3}$ . This motivates the original procedure proposed in the next sections, specifically designed for extreme quantiles under binary information.

4. A New Design for the Estimation of Extreme Quantiles

4.1. Splitting

The design we propose is directly inspired by the general principle of splitting methods used in the domain of rare events simulation and introduced by Kahn and Harris (1951) [16].

The central idea is to overcome the difficulty of estimating an extremely small probability by decomposing the target event into a sequence of events of higher probability. Splitting achieves this by expressing a rare-event probability as a product of conditional probabilities that are easier to estimate.

Let $P$ denote the distribution of the random variable R. The event ${R \leq s_{α}}$ can be expressed as the intersection of inclusive events for $s_{α} = s_{m} < s_{m - 1} < \dots < s_{1}$ it holds:

{R \leq s_{α}} = {R \leq s_{m}} \subset \dots \subset {R \leq s_{1}} .

It follows that

α = P (R \leq s_{α}) = P (R \leq s_{1}) \prod_{j = 1}^{m - 1} P (R \leq s_{j + 1} ∣ R \leq s_{j})

(4)

The thresholds ${(s_{j})}_{j = 1, \dots, m}$ should be chosen so that each conditional probability $P (R \leq s_{j + 1} ∣ R \leq s_{j})$ is of moderate level (typically $p = 0.2$ or $0.3$ ). This ensures that the event ${R \leq s_{j + 1}}$ is observable under the conditional distribution of $R ∣ R \leq s_{j}$ , while the product of these probabilities still reconstructs the target rare-event probability $α$ via (4), with a small number of stages m.

From the formal decomposition in (4), a practical experimental scheme can be deduced. Its structure is given in Procedure 1.

Remark 1.

Our approach can be seen as an alternative to stochastic approximation methods for quantile estimation under binary responses. While these methods rely on recursive updates driven by observed responses, they typically require either moderate quantile levels or strong prior assumptions on the underlying distribution.

In contrast, the procedure proposed here combines a sequential design with structural modeling inspired by extreme value theory, allowing the exploration of much more extreme regions of the distribution under weaker assumptions.

Remark 2.

Practical feasibility: The above procedure relies on the assumption that sampling can be performed from the conditional distribution of the resistance at each step. In practice, this amounts to conducting experiments on specimens with progressively reduced resistance levels. Although this assumption may appear strong, it is, in fact, realistic in some experimental settings, as specimens with controlled weakened resistance can be manufactured through appropriate machining processes. An illustrated example of such an approach in the context of material fatigue is provided in Broniatowski and Miranda (2019) [1].

Procedure 1 Splitting procedure

graphic file with name entropy-28-00479-i001.jpg

Open in a new tab

4.2. Choice of Sequential Design Parameters

The performance of the proposed sequential procedure depends on several parameters, namely the conditional probability level p, the number of stages m, and the number of trials per stage K. Their selection involves a trade-off between statistical accuracy and experimental cost.

A standard choice for the conditional probability level p is to fix it around $20 %$ . This value ensures that the conditional failure probability remains sufficiently large to be reliably estimated from a limited number of binary observations, while still enabling a progressive exploration of the tail of the distribution through the splitting mechanism.

The first threshold $s_{1}$ is selected so as to approximate the target conditional probability level p. In practice, however, the resulting probability $p_{1}$ may differ from p. In such cases, the model parameters are updated to ensure consistency with the target rare event probability $α$ , typically through the relation

p_{1} \cdot p^{m - 1} \approx α .

The selection of $s_{1}$ may also rely on domain-specific expertise, depending on the application context. In principle, the first iteration could be refined by incorporating additional prior information, for instance, regarding the central tendency of the distribution. However, such extensions are not considered in the present work in order to preserve the generality of the proposed methodology.

More generally, the parameters p, m, and K must be chosen jointly. Increasing the number of stages m allows for a more gradual progression toward extreme regions, which may improve robustness and stability, but also increases the overall complexity of the procedure. Increasing the number of trials K at each stage improves the accuracy of the parameter estimates and of the conditional quantiles, at the cost of a higher experimental burden. Finally, the choice of p controls the balance between these two effects: smaller values of p lead to a faster progression toward the tail but reduce the amount of information available at each stage, whereas larger values improve estimation reliability but require more stages to reach the target rare event probability.

In practice, this trade-off is largely driven by application-specific constraints. In industrial settings, for instance, the cost of performing experiments may increase significantly as testing conditions become more severe (corresponding to deeper levels in the sequential procedure), while the availability of resources (time, equipment, personnel) may limit the number of feasible trials. As a result, the parameters p, m, and K are often not freely chosen by the statistician, but rather dictated by these operational constraints. The guidelines provided here should therefore be understood as indicative rather than prescriptive. A detailed illustration of these trade-offs in an industrial context is provided in Broniatowski and Miranda (2019) [1].

4.3. Modeling the Distribution of the Strength, Pareto Model

The events under consideration have a small probability under $P .$ By (4), we are led to consider the limit behavior of conditional distributions under smaller and smaller thresholds. To this end, we rely on the classical approximations due to Balkema and de Haan (1974) [17] and Pickands (1975) [18] stated below in the familiar setting of exceedances above high thresholds. Let $\tilde{R} : = 1 / R$ .

Theorem 1.

For $\tilde{R}$ of distribution F belonging to the maximum domain of attraction of an extreme value distribution with tail index c, i.e., $F \in M D A (c)$ , it holds that: there exists $a = a (s) > 0$ , such that:

$lim_{s \to \infty} sup_{0 \leq x < \infty} |\frac{1 - F (x + s)}{1 - F (s)} - (1 - G_{(c, a)} (x))| = 0$

where $G_{(c, a)}$ is defined through

$G_{(c, a)} (x) = 1 - exp \{- \int_{0}^{\frac{x}{a}} {[{(1 + c t)}_{+}]}^{- 1} d t\}$

where $a > 0$ and $c \in R$ .

The distribution G is the Generalized Pareto distribution $G P D (c, a)$ given explicitly by

1 - G (x) = \{\begin{matrix} {(1 + \frac{c}{a} x)}^{- 1 / c} when c \neq 0 \\ exp (- \frac{x}{a}) when c = 0 \end{matrix}

where $x \geq 0$ for $c \geq 0$ and $0 \leq x \leq - \frac{a}{c}$ if $c < 0 .$

A key feature of GPDs is their invariance under threshold conditioning. Indeed it holds, for $\tilde{R} \sim G D P (c, a)$ and $x > s$ ,

P (\tilde{R} > x ∣ \tilde{R} > s) = {(1 + \frac{c (x - s)}{a + c s})}^{- 1 / c}

(5)

We therefore state:

Proposition 1.

When $\tilde{R} \sim G P D (c, a)$ then, given $(\tilde{R} > s)$ , the random variable $\tilde{R} - s$ follows a $G P D (c, a + c s)$ .

Thus, GPDs are both stable under thresholding and arise as limiting models for threshold exceedances. This mirrors the classical rationale that motivates the use of normal or stable laws in additive models. These properties make GPDs particularly suitable to model $\tilde{R}$ for excess-probability inference. Due to the lack of memory property, the exponential distribution, which appears as a possible limit distribution for excess probabilities when $c = 0$ in Theorem 1, does not qualify for modeling. Moreover, since R may take arbitrarily small values (i.e., $\tilde{R}$ is unbounded), we restrict attention to $c > 0$ .

Turning to the context of extreme failure quantiles, we make use of the random variable $R = 1 / \tilde{R}$ and proceed to the corresponding change of variable.

When $c > 0$ , the distribution function of the r.v. R writes for nonnegative x:

F_{c, a} (x) = {(1 + \frac{c}{a x})}^{- 1 / c} .

(6)

For $0 < x < u$ , the conditional distribution of R given $\{R < u\}$ is

P (R < x ∣ R < u) = {(1 - \frac{c (\frac{1}{x} - \frac{1}{u})}{a + \frac{c}{u}})}^{- 1 / c}

showing that the distribution of R is stable under threshold conditioning with parameter $(a_{u}, c)$ , where

a_{u} = a + \frac{c}{u} .

(7)

In practice, at step j of the procedure, the stress level $s_{j}$ is the corresponding threshold $1 / {\tilde{s}}_{j}$ , where ${\tilde{s}}_{j}$ is a right-quantile of the conditional law of $\tilde{R}$ given ${\tilde{R} > {\tilde{s}}_{j - 1}}$ . Therefore, the observations take the form $Y_{i} = 1_{R < s_{j} | R < s_{j - 1}} = 1_{{\tilde{R}}_{i} > {\tilde{s}}_{j} | {\tilde{R}}_{i} > {\tilde{s}}_{j - 1}}, i = 1, \dots, K_{j}$ .

A convenient feature of model (6) lies in the fact that the conditional distributions are completely determined by the initial distribution of R, therefore by $a$ and $c .$ The parameters $a_{j}$ of the conditional distributions are determined from these initial parameters and by the corresponding stress level $s_{j};$ see (7).

4.4. Notation and Framework for Sequential GPD Modeling and Parameterization

We assume that the distribution of the random variable $\tilde{R}$ follows a Generalized Pareto Distribution $G P D (c_{T}, a_{T})$ with distribution function $G_{(c_{T}, a_{T})}$ , and denote ${\bar{G}}_{(c_{T}, a_{T})} = 1 - G_{(c_{T}, a_{T})}$ .

The proposed procedure relies on a sequence of increasing thresholds $({\tilde{s}}_{1}, \dots, {\tilde{s}}_{m})$ and exploits the stability property of the GPD under threshold exceedances. As stated in (5), for each $j \in {1, \dots, m}$ , the conditional distribution of $\tilde{R}$ given ${\tilde{R} > {\tilde{s}}_{j}}$ remains a GPD and can be written as

{\bar{G}}_{(c_{j}, a_{j})} (x - {\tilde{s}}_{j}) = P (\tilde{R} > x ∣ \tilde{R} > {\tilde{s}}_{j}),

with $c_{j} = c_{T}$ and $a_{j} = a_{T} + c_{T} {\tilde{s}}_{j}$ .

At iteration j, we denote by ${(\hat{c}, \hat{a})}_{j}$ the estimators of $(c_{j}, a_{j})$ , so that $1 - G_{{(\hat{c}, \hat{a})}_{j}} (x - {\tilde{s}}_{j})$ provides an estimate of $P (\tilde{R} > x ∣ \tilde{R} > {\tilde{s}}_{j})$ . The parameters of the initial distribution $(c_{T}, a_{T})$ can then be recovered from ${(\hat{c}, \hat{a})}_{j}$ through

{\hat{c}}_{T} = \hat{c}, {\hat{a}}_{T} = \hat{a} - \hat{c} {\tilde{s}}_{j} .

4.5. Sequential Design for the Extreme Quantile Estimation

Fix m and p, where m denotes the number of stress levels under which the trials will be performed, and p is such that $p^{m} = α .$

Select an initial stress level $s_{1}$ that is sufficiently large (i.e., ${\tilde{s}}_{1} = 1 / s_{1}$ sufficiently small) so that $p_{1} = P (R < s_{1})$ is large enough and perform trials at this level. Ideally, one would choose $s_{1}$ such that $p_{1} = p$ , which cannot be secured; in practice, this choice relies on expert judgment.

Switch to the transformed variable $\tilde{R} : = 1 / R$ . Estimate the GPD parameters $(c_{T}, a_{T})$ of $\tilde{R}$ based on the observations above ${\tilde{s}}_{1}$ , producing the initial estimates ${(\hat{c}, \hat{a})}_{1}$ . As stated in Section 4.2, $s_{1}$ is chosen so that it corresponds to medium stress conditions, and failures are therefore easy to observe experimentally.

Define

{\tilde{s}}_{2} : = sup \{s : {\bar{G}}_{{(\hat{c}, \hat{a})}_{1}} (s - {\tilde{s}}_{1}) < p\}

the $(1 - p)$ -quantile of $G_{{(\hat{c}, \hat{a})}_{1}} .$ ${\tilde{s}}_{2}$ is the level of stress to be tested at the following iteration.

Iterating from step $j = 2$ to $m - 1$ , perform K trials under $G_{(c_{j}, a_{j})}$ say ${\tilde{R}}_{j, 1}, \dots, {\tilde{R}}_{j, K}$ and consider the observable variables $Y_{j, i} : = 1_{{\tilde{R}}_{j, i} > {\tilde{s}}_{j}}$ . Therefore, the K iid replications $Y_{j, 1}, \dots, Y_{j, K}$ follow a Bernoulli $B ({\bar{G}}_{(c_{j - 1}, a_{j - 1})} ({\tilde{s}}_{j} - {\tilde{s}}_{j - 1}))$ , where ${\tilde{s}}_{j}$ has been determined at the previous step of the procedure. Estimate $(c_{j}, a_{j})$ in the resulting Bernoulli scheme, say ${(\hat{c}, \hat{a})}_{j}$ . Then define

\begin{matrix} {\tilde{s}}_{j + 1} & : = sup \{s : {\bar{G}}_{{(\hat{c}, \hat{a})}_{j}} (s - {\tilde{s}}_{j}) < p\} \\ = G_{{(\hat{c}, \hat{a})}_{j}}^{- 1} (1 - p) + {\tilde{s}}_{j}, \end{matrix}

which is the $(1 - p)$ -quantile of the estimated conditional distribution of $\tilde{R}$ given ${\tilde{R} > \tilde{s_{j}}}$ , i.e., $G_{{(\hat{c}, \hat{a})}_{j}}$ , and the next level to be tested.

In practice a conservative choice for m is $m = ⌈\frac{l o g α}{l o g p}⌉$ , where $⌈ . ⌉$ denotes the ceiling function. The attained probability $\tilde{α}$ is a proxy of $α$ (see Section 5).

The m stress levels ${\tilde{s}}_{1} < {\tilde{s}}_{2} < \dots < {\tilde{s}}_{m} = {\tilde{q}}_{1 - α}$ satisfy

\begin{matrix} \tilde{α} & = \bar{G} ({\tilde{s}}_{1}) \prod_{j = 1}^{m - 1} {\bar{G}}_{{(\hat{c}, \hat{a})}_{j}} ({\tilde{s}}_{j + 1} - {\tilde{s}}_{j}) \\ = p_{1} p^{m - 1} \end{matrix}

Finally, by construction, ${\tilde{s}}_{m}$ is a proxy of ${\tilde{q}}_{1 - α}$ .

Although conceptually simple, the method raises several challenges, chiefly regarding the estimation of ${(\hat{c}, \hat{a})}_{j}$ at each stage. The next section addresses this issue.

5. Enhanced Sequential Design in the Pareto Model

In this section, we focus on the estimation of the parameters $(c_{T}, a_{T})$ in the $G P D (c_{T}, a_{T})$ distribution of $\tilde{R} .$ One of the main difficulties lies in the fact that the available information does not consist of replications of the random variable $\tilde{R}$ under the current conditional distribution $G_{(c_{j - 1}, a_{j - 1})}$ of $\tilde{R}$ given $(\tilde{R} > \tilde{s_{j - 1}})$ but merely on the very downgraded functions of those.

At step j, we are given $G_{{(\hat{c}, \hat{a})}_{j - 1}}$ and derive ${\tilde{s}}_{j}$ as its $(1 - p)$ -quantile. Simulating K random variable ${\tilde{R}}_{j, i}$ with distribution $G_{(c_{j - 1}, a_{j - 1})}$ , the only observable quantities are the Bernoulli(p) variables

Y_{j, i} : = 1_{{\tilde{R}}_{j, i} > {\tilde{s}}_{j}},

which represent a substantial loss of information compared with the underlying ${\tilde{R}}_{j, i}$ . Estimating ${(\hat{c}, \hat{a})}_{j}$ from these $Y_{j, i}$ ’s is therefore intrinsically challenging.

5.1. Limitations of Likelihood-Based Estimation

A natural approach is to estimate the parameters ${(c, a)}_{j}$ at each iteration using maximum likelihood. However, this leads to a rapid deterioration of the estimation of the extreme quantile ${\tilde{q}}_{1 - α}$ for small $α$ . In simulation experiments, the large standard deviation of ${\hat{\tilde{q}}}_{1 - α}$ is directly linked to the poor precision of the iterative estimators ${(\hat{c}, \hat{a})}_{j}$ . To illustrate this, we generated $n = 200$ independent Bernoulli variables $Y_{i}$ with parameter ${\bar{G}}_{(c_{T}, a_{T})} ({\tilde{s}}_{1})$ . Figure 2 displays the corresponding log-likelihood profile as the Bernoulli parameter ${\bar{G}}_{(c^{'}, a^{'})} ({\tilde{s}}_{0})$ varies with $(c^{'}, a^{'})$ . As expected, the log-likelihood surface is extremely flat across a wide region of the parameter space.

Log-likelihood of the Pareto model with binary data.

This explains the poor results in Table 4 obtained through the Splitting procedure when the parameters at each step are estimated by maximum likelihood, particularly the high variability in the estimated quantiles. Moreover, the accuracy of the estimator of ${\tilde{q}}_{1 - α}$ quickly decreases with the number K of replications $Y_{j, i}$ , $1 \leq i \leq K$ , as illustrated by results in Table 5.

Table 4.

Estimation of the $(1 - α)$ -quantile, ${\tilde{s}}_{α} = 469.103$ , through procedure Section 4.5 with $K = 50$ .

Minimum	Q25	Q50	Mean	Q75	Maximum
67.07	226.50	327.40	441.60	498.90	10,320.00

Open in a new tab

Table 5.

Estimation of the $(1 - α)$ -quantile, ${\tilde{s}}_{α} = 469.103$ , through procedure Section 4.5 for different values of K.

	${\tilde{s}}_{m}$ for $K = 30$		${\tilde{s}}_{m}$ for $K = 50$
${\tilde{s}}_{α}$	Mean	Std	Mean	Std
469.103	1276.00	12,576.98	441.643	562.757

Open in a new tab

Changing the estimation criterion by some alternative method does not improve significantly; Figure 3 shows the distribution of the resulting estimators of ${\tilde{q}}_{1 - α}$ for various estimation methods (minimum Kullback–Leibler, minimum Hellinger, and minimum L1 distances) of $(c_{T}, a_{T}) .$

Estimations of the $α$ -quantile based on the Kullback–Leibler, L1 distance, and Hellinger distance criterion.

These observations motivate the development of an enhanced estimation strategy.

5.2. An Enhanced Sequential Criterion for Estimation

To overcome the limitations of standard estimation methods, we introduce an additional estimation criterion that exploits the iterative structure of the procedure. The main idea is to enforce stability by ensuring coherence between successive conditional quantile estimates through a backward validation mechanism.

Due to its backwards nature, this estimation procedure stands for iterations $j = 2, \dots, m$ .

At iteration $j - 1$ , the sample $Y_{j - 1, i}$ , $1 \leq i \leq K$ is collected in the form of binary responses

Y_{j - 1, i} = 1_{{\tilde{R}}_{j - 1, i} \geq {\tilde{s}}_{j - 1}} .

where ${\tilde{R}}_{j - 1, i}$ are drawn from $G_{{(c, a)}_{j - 2}}$ . The empirical failure probability given by

{\hat{p}}_{j - 1} : = \frac{1}{K} \sum_{i = 1}^{n} Y_{j - 1, i} .

(8)

provides an estimate of $P (\tilde{R} > {\tilde{s}}_{j - 1} ∣ \tilde{R} > {\tilde{s}}_{j - 2})$ and a proxy of p. The conditional probability $P (\tilde{R} > {\tilde{s}}_{j - 1} ∣ \tilde{R} > {\tilde{s}}_{j - 2})$ can be written as a function of the estimated parameters obtained at iteration j, namely ${(\hat{c}, \hat{a})}_{j}$ using (7). This relies on the stability of the GPD under threshold exceedances, which ensures that parameters estimated at level j remain consistent with the distribution at previous levels.

At step $j,$ estimate $P (\tilde{R} > {\tilde{s}}_{j - 1} ∣ \tilde{R} > {\tilde{s}}_{j - 2})$ making use of $G_{{(\hat{c}, \hat{a})}_{j}} .$ This backward estimator writes

\frac{{\bar{G}}_{{(\hat{c}, \hat{a})}_{j}} ({\tilde{s}}_{j - 1})}{{\bar{G}}_{{(\hat{c}, \hat{a})}_{j}} ({\tilde{s}}_{j - 2})} = 1 - G_{{(\hat{c}, \hat{a})}_{j}} ({\tilde{s}}_{j - 1} - {\tilde{s}}_{j - 2}) .

The distance

|({\bar{G}}_{{(\hat{c}, \hat{a})}_{j}} ({\tilde{s}}_{j - 1} - {\tilde{s}}_{j - 2})) - \hat{p_{j - 1}}|

(9)

should be small, since both ${\bar{G}}_{{(\hat{c}, \hat{a})}_{j}} ({\tilde{s}}_{j - 1} - {\tilde{s}}_{j - 2})$ and ${\hat{p}}_{j - 1}$ should approximate $p .$

Consider the distance between quantiles

|({\tilde{s}}_{j - 1} - {\tilde{s}}_{j - 2}) - G_{{(\hat{c}, \hat{a})}_{j}}^{- 1} (1 - {\hat{p}}_{j - 1})| .

(10)

An estimate ${(\hat{c}, \hat{a})}_{j}$ can be proposed as the minimizer of the above expression for $({\tilde{s}}_{j - 1} - {\tilde{s}}_{j - 2})$ for all j. This backward estimation provides coherence with respect to the unknown initial distribution $G_{(c_{T}, a_{T})}$ through a retroactive validation of the parameters. If we had started with a good guess $(\hat{c}, \hat{a}) = (c_{T}, a_{T})$ , then the successive ${(\hat{c}, \hat{a})}_{j}, {\tilde{s}}_{j - 1}$ , etc., would make (10) small, since ${\tilde{s}}_{j - 1}$ (resp. ${\tilde{s}}_{j - 2}$ ) would estimate the $p$ -conditional quantile of $P (.| \tilde{R} > {\tilde{s}}_{j - 2})$ (resp. $P (.| \tilde{R} > {\tilde{s}}_{j - 3})$ ).

It remains to restrict the minimization of (10) to a set of plausible parameter values. We suggest constructing a confidence region for the parameter $(c_{T}, a_{T}) .$ With ${\hat{p}}_{j}$ defined in (8) and $γ \in (0, 1)$ , define the $(1 - γ)$ -confidence region for p by

I_{1 - γ} = [{\hat{p}}_{j} - z_{1 - γ / 2} \sqrt{\frac{{\hat{p}}_{j} (1 - {\hat{p}}_{j})}{K - 1}}; {\hat{p}}_{j} + z_{1 - γ / 2} \sqrt{\frac{{\hat{p}}_{j} (1 - {\hat{p}}_{j})}{K - 1}}]

(11)

where $z_{τ}$ is the $τ$ -quantile of the standard normal distribution. Define

S_{j} = \{(c, a) : (1 - G_{(c, a)} ({\tilde{s}}_{j} - {\tilde{s}}_{j - 1})) \in I_{1 - γ}\} .

(12)

Therefore, $S_{j}$ is a plausible set for $({\hat{c}}_{T}, {\hat{a}}_{T}) .$

We summarize this discussion:

At iteration $j,$ the estimator of $(c_{T}, a_{T})$ is a solution of the minimization problem

min_{(c, a) \in S_{j}} |({\tilde{s}}_{j - 1} - {\tilde{s}}_{j - 2}) - G_{(c, a + c {\tilde{s}}_{j - 2})}^{- 1} (1 - {\hat{p}}_{j - 1})| .

(13)

where optimization is performed using standard numerical methods.

This backward estimation procedure enforces coherence with respect to the unknown initial distribution $G_{(c_{T}, a_{T})}$ by ensuring consistency between successive conditional quantile estimates. If the initial parameters were equal to the true values $(c_{T}, a_{T})$ , then the successive thresholds ${\tilde{s}}_{j}$ would accurately track the corresponding conditional quantiles, making the discrepancy in (10) negligible.

This criterion can be interpreted as a backward validation mechanism, in the spirit of cross-validation or jackknife-type approaches, where estimates at a given stage are assessed through their ability to reproduce quantities observed at previous stages. From a statistical perspective, it reflects the sequential nature of the data acquisition process by incorporating past information into the current estimation step, thereby ensuring coherence between successive conditional quantile estimates. It should therefore not be viewed as a likelihood-based or regularization approach, but rather as a consistency-based validation criterion embedded within the sequential design.

5.3. Theoretical Insight into the Estimator

Although a full theoretical analysis of the proposed estimator is beyond the scope of this paper, some insight into its behavior can be provided. The method relies on a splitting strategy that decomposes a rare event into a sequence of conditional events, thereby progressively exploring the tail of the distribution. Each step involves the estimation of an intermediate conditional quantile, and the final extreme quantile is obtained by the composition of these levels.

From an extreme value theory perspective, this construction can be interpreted as iteratively approximating increasingly extreme quantiles of the original distribution. As the number of iterations increases, the procedure effectively focuses on the tail region, where classical asymptotic results for conditional excess distributions apply.

Moreover, the recursive structure of the algorithm contributes to its stability: provided that the parameter estimates at a given step are sufficiently accurate, the subsequent sampling is concentrated in a relevant region of the distribution. The enhanced estimation criterion introduced in Section 5 further reinforces this stability by promoting consistency between successive conditional quantile estimates. From a heuristic perspective, the sequential updating mechanism may be interpreted as a fixed-point-type iteration, where parameter estimates are adjusted to ensure consistency with previously observed conditional quantiles. This suggests that convergence may be expected under suitable conditions, although no formal result is established in the present work. This reasoning relies on asymptotic arguments from extreme value theory, in particular on the applicability of the De Haan theorem, which ensures that the tail behavior can be approximated by a generalized Pareto distribution. Overall, these elements provide intuition on the expected convergence behavior of the method, even though a formal proof of consistency remains an open question.

5.4. Simulation-Based Numerical Results

This procedure has been applied in three cases. The reference case is $(c_{T}, a_{T}) = (1.5, 1.5)$ . The second case, $(c_{T}, a_{T}) = (0.8, 1.5)$ , corresponds to a lighter tail than the reference. The third case, $(c_{T}, a_{T}) = (1.5, 3)$ , corresponds to a distribution with the same tail index as the reference but a larger dispersion index.

The overall procedure can be summarized as follows in Procedure 2:

Performance is primarily assessed using relative error, which provides a scale-invariant measure particularly suited to extreme quantile estimation, where the quantities of interest may span several orders of magnitude. From a practical standpoint, this metric also reflects the ability of the method to recover the correct order of magnitude, which is often the main objective in industrial applications.

Table 6 highlights important features of the proposed estimator. First, the estimation of ${\tilde{q}}_{1 - α}$ deteriorates as the tail of the distribution becomes heavier, reflecting the intrinsic difficulty of extrapolating further into the tail. In addition, the estimator tends to underestimate ${\tilde{q}}_{1 - α}$ , implying a conservative nature of the sequential exploration of extreme regions.

Table 6.

Mean and std of relative errors on the $(1 - α)$ -quantile of GPD calculated through 400 replicas of Procedure 2.

Parameters	Relative Error on ${\tilde{s}}_{α}$
Parameters	Mean	Std
$c = 0.8$ , $a_{0} = 1.5$ and ${\tilde{s}}_{α} = 469.103$	−0.222	0.554
$c = 1.5$ , $a_{0} = 1.5$ and ${\tilde{s}}_{α} = 31,621.777$	−0.504	0.720
$c = 1.5$ , $a_{0} = 3$ and ${\tilde{s}}_{α} = 63,243.550$	0.310	0.590

Open in a new tab

Despite these limitations, the proposed method shows a clear improvement over the simple Maximum Likelihood estimation. This gain is particularly pronounced in the case of heavy-tailed distributions, where classical approaches struggle to provide reliable estimates Figure 4.

Procedure 2 Simulation and dual-criterion estimation procedure

Initialization:
- Fix the design parameters p, m, and K. In the following examples, we set $m = 5$ , the target level $p = 0.25$ , and $K \in {15, 30, 50}$ .
- Select an initial threshold ${\tilde{s}}_{1}$ based on expert knowledge and perform K trials at this level to obtain a first estimate of $(c_{T}, a_{T})$ . Since ${\tilde{s}}_{1}$ is not calibrated to exactly match the target level p, the resulting probability $p_{1}$ may differ from p. Therefore, p is adjusted using ${\hat{p}}_{1}$ so as to satisfy ${\hat{p}}_{1} p^{m - 1} \approx α$ .
For each stage $j = 2, \dots, m$ :
1. Generation of K trials:
- Perform K trials at the current threshold ${\tilde{s}}_{j}$ , leading to binary observations
  $Y_{j, i} = 1_{{\tilde{R}}_{j, i} \geq {\tilde{s}}_{j}}, i = 1, \dots, K,$
  where ${\tilde{R}}_{j, i}$ are simulated under $G_{{(c, a)}_{j - 1}}$ .
- Compute the empirical conditional probability
  ${\hat{p}}_{j} = \frac{1}{K} \sum_{i = 1}^{K} Y_{j, i},$
  and construct a confidence interval $I_{1 - γ} ({\hat{p}}_{j})$ with confidence level $1 - γ = 0.8$ (see Section 5.2).
2. Parameter space restriction:
- Define the set of admissible parameters $S_{j}$ as those for which the theoretical conditional probability under the Generalized Pareto model lies within the confidence interval $I_{1 - γ}$ (see (11) and (12)).
3. Stability criterion:
- Among the candidates in $S_{j}$ , select the parameter pair ${(\hat{c}, \hat{a})}_{j}$ by minimizing the discrepancy between successive conditional quantile estimates:
  ${(\hat{c}, \hat{a})}_{j} = arg min_{(c, a) \in S_{j}} |({\tilde{s}}_{j - 1} - {\tilde{s}}_{j - 2}) - G_{(c, a + c {\tilde{s}}_{j - 2})}^{- 1} (1 - {\hat{p}}_{j - 1})| .$
- From the estimated parameters, retrieve the $(1 - p)$ -quantile of the estimated conditional distribution $G_{{(\hat{c}, \hat{a})}_{j}}$ :
  ${\tilde{s}}_{j + 1} = G_{{(\hat{c}, \hat{a})}_{j}}^{- 1} (1 - p) + {\tilde{s}}_{j},$
4. Final estimator:
- The estimator of the target extreme quantile ${\tilde{q}}_{1 - α}$ is given by the final threshold ${\tilde{s}}_{m}$ .

Open in a new tab

Estimations of the $(1 - α)$ -quantile of two GPD obtained by Maximum Likelihood and by the improved Maximum Likelihood method. The red line stands for the real value of $q_{α}$ .

Moreover, the influence of the number K of replications at each step reveals an important advantage of the method. While reducing K naturally increases variability, it also amplifies the relative gain over Maximum Likelihood estimation. This behavior, illustrated in Figure 5, suggests a certain robustness of the proposed approach in low-information settings, where only a limited number of trials can be performed at each stage.

Estimations of the $(1 - α)$ -quantile of a $G P D (0.8, 1.5)$ obtained by Maximum Likelihood and by the improved Maximum Likelihood method for different values of K. The red line stands for the real value of $q_{α}$ .

5.5. Performance of the Sequential Estimation

As stated in Section 5, a substantial amount of information is lost under complete truncation and binary sampling.

As stated in Section 3, most available approaches either assume fully observed data or rely on strong prior knowledge of the underlying distribution, particularly in the tail region. These assumptions are not compatible with the binary observation framework considered here. Thus, direct comparisons with existing methods for extreme quantile estimation are not straightforward in our setting.

For this reason, we instead compare our procedure to a benchmark based on full data observations, which provides an upper bound on the achievable performance under ideal information conditions.

5.5.1. Estimation of an Extreme Quantile Based on Complete Data, De Valk’s Estimator

In order to provide an upper bound for performance, we use the estimator proposed by De Valk and Cai (2018) [6]. Their framework targets quantiles of order $p_{n} \in [n^{- τ_{1}}, n^{- τ_{2}}]$ with $τ_{2} > τ_{1} > 1$ , where n is the sample size, which aligns with the industrial context that motivated this work. De Valk’s proposal is a modified Hill estimator adapted to log-Weibull-tailed models. De Valk’s estimator is consistent, asymptotically normally distributed, though it may exhibit finite-sample bias.

We briefly recall the assumptions underpinning De Valk’s approach. Let $X_{1}, \dots, X_{n}$ be n iid r.v’s with distribution F, and denote $X_{k : n}$ the $k -$ order statistics. A tail regularity assumption is needed in order to estimate a quantile with order greater than $1 -$ $1 / n .$

Denote $U (t) = F^{- 1} (1 - 1 / t)$ , and let the function q be defined by

q (y) = U (e^{y}) = F^{- 1} (1 - e^{- y})

for $y > 0$ .

Assume that

lim_{y \to \infty} \frac{log q (y λ) - log q (y)}{g (y)} = h_{θ} (λ) λ > 0

(14)

where g is a regularly varying function and

h_{θ} (λ) = \{\begin{matrix} \frac{λ^{θ} - 1}{θ} if θ \neq 0 \\ log λ if θ = 0 \end{matrix}

de Valk writes condition (14) as $log q \in E R V_{θ} (g)$ .

Despite its naming of log-Generalized tails, this condition also holds for Pareto-tailed distributions, as can be checked, providing $θ = 1 .$

We now introduce de Valk’s extreme quantile estimator.

Let

ϑ_{k, n} : = \sum_{j = k}^{n} \frac{1}{j} .

Let $q (z)$ be the quantile of order $e^{- z} = p_{n}$ of the distribution F. The estimator makes use of $X_{n - l_{n} : n}$ , an intermediate order statistic of $X_{1}, \dots, X_{n}$ , where $l_{n}$ tends to infinity as $n \to \infty$ and $l_{n}$ $/ n \to 0 .$

de Valk’s estimator writes

\hat{q} (z) = X_{n - l_{n} : n} exp \{g (ϑ_{l_{n}, n}) h_{θ} (\frac{z}{ϑ_{l_{n + 1}, n}})\} .

(15)

When the support of F overlaps $R^{-}$ , then the sample size n should be large; see de Valk [6] for details.

Note that, in the case of a $G P D (c, a)$ , parameter $θ$ is known and equal to 1, and the normalizing function g is defined by $g (x) = c x$ for $x > 0$ .

5.5.2. Loss in Accuracy Due to Binary Sampling

Table 7 compares the performance of De Valk’s method (using complete data) with that of our sequential procedure (using only dichotomous outcomes). Unsurprisingly, de Valk’s estimator outperforms ours. This gap reflects the loss of information induced by thresholding and dichotomization. Nevertheless, the comparison remains informative: even though our estimator typically exhibits a larger bias, its dispersion is of the same order of magnitude when handling heavy-tailed GPD models. Given the binary nature of the data, the average relative error is quite honorable. Overall, we can assess that a large part of the volatility of the estimator produced by our sequential methodology is due to the nature of the GPD model, as well as to the sample size.

Table 7.

Mean and std of the relative errors on the $1 - α$ -quantile of GPD on complete and binary data for samples of size $n = 250$ , computed through 400 replicas of both estimation procedures. Estimates on complete data are obtained with de Valk’s method; estimates on binary data are provided by the sequential design.

	Relative Error on the $(1 - α)$ -Quantile
Parameters	on Complete Data		on Binary Data
	Mean	Std	Mean	Std
$c = 0.8$ , $a_{0} = 1.5$ and $s_{α} = 469.103$	0.052	0.257	−0.222	0.554
$c = 1.5$ , $a_{0} = 1.5$ and $s_{α} = 31,621.777$	0.086	0.530	−0.504	0.720
$c = 1.5$ , $a_{0} = 3$ and $s_{α} = 63,243.550$	0.116	0.625	0.310	0.590

Open in a new tab

6. Sequential Design for the Weibull Model

The main property that led to the GPD model is its relevance for tail modeling and its stability through threshold conditioning. In this section, we also investigate a parameterization based on the Weibull distribution as one of the most classical and widely used models in reliability engineering. Under this assumption, the conditional distribution of $\tilde{R}$ given $\{\tilde{R} > s\}$ takes a rather simple form, which allows for some variation of the sequential design method.

6.1. The Weibull Model

Let $\tilde{R} \sim Weibull (α, β)$ with scale $α > 0$ and shape $β > 0$ , and denote by G its c.d.f., g its density, and $G^{- 1}$ the quantile function. For $x \geq 0$ ,

\begin{matrix} G (x) & = 1 - exp (- {(\frac{x}{α})}^{β}) \\ for 0 < u < 1, G^{- 1} (u) & = α {(- log (1 - u))}^{1 / β} \end{matrix}

The conditional distribution of $\tilde{R}$ given truncation above a threshold is a truncated Weibull.

\begin{matrix} for {\tilde{s}}_{2} > {\tilde{s}}_{1}, P (\tilde{R} > {\tilde{s}}_{2} ∣ \tilde{R} > {\tilde{s}}_{1}) & = \frac{P (\tilde{R} > {\tilde{s}}_{2})}{P (\tilde{R} > {\tilde{s}}_{1})} \\ = exp \{(- {(\frac{s_{2}}{α})}^{β} + {(\frac{s_{1}}{α})}^{β})\} \end{matrix}

Let $G_{s_{2}}$ denote the distribution function of $\tilde{R}$ given $(\tilde{R} > {\tilde{s}}_{2})$ .

A useful identity follows. For ${\tilde{s}}_{2} > {\tilde{s}}_{1}$ ,

log P (\tilde{R} > {\tilde{s}}_{2} ∣ \tilde{R} > {\tilde{s}}_{1}) = [{(\frac{{\tilde{s}}_{2}}{{\tilde{s}}_{1}})}^{β} - 1] log P (\tilde{R} > {\tilde{s}}_{1})

(16)

Assuming $P (\tilde{R} > {\tilde{s}}_{1}) = p$ and given ${\tilde{s}}_{1}$ , we may find ${\tilde{s}}_{2}$ the conditional quantile of order $1 - p$ of the distribution of $\tilde{R}$ given $\{\tilde{R} > {\tilde{s}}_{1}\}$ . This solves the first step of the sequential estimation procedure through

log p = [{(\frac{{\tilde{s}}_{2}}{{\tilde{s}}_{1}})}^{β} - 1] log p

where the parameter $β$ has to be estimated on the first run of trials.

The same logic extends iteratively. For ${\tilde{s}}_{j + 1} > {\tilde{s}}_{j} > {\tilde{s}}_{j - 1}$

\begin{matrix} log P (\tilde{R} > {\tilde{s}}_{j + 1} ∣ \tilde{R} > {\tilde{s}}_{j}) & = [\frac{log P (\tilde{R} > {\tilde{s}}_{j + 1} ∣ \tilde{R} > {\tilde{s}}_{j - 1})}{log P (\tilde{R} > {\tilde{s}}_{j} ∣ \tilde{R} > {\tilde{s}}_{j - 1})} - 1] log P (\tilde{R} > {\tilde{s}}_{j} ∣ \tilde{R} > {\tilde{s}}_{j - 1}) \\ = [\frac{{\tilde{s}}_{j - 1}^{β} - {\tilde{s}}_{j + 1}^{β}}{{\tilde{s}}_{j - 1}^{β} - {\tilde{s}}_{j}^{β}} - 1] log P (\tilde{R} > {\tilde{s}}_{j} ∣ \tilde{R} > {\tilde{s}}_{j - 1}) \end{matrix}

(17)

At iteration j, the thresholds ${\tilde{s}}_{j}$ and ${\tilde{s}}_{j - 1}$ are known; the threshold ${\tilde{s}}_{j + 1}$ is the $(1 - p)$ -quantile of the conditional distribution, $P (\tilde{R} > {\tilde{s}}_{j + 1} ∣ \tilde{R} > {\tilde{s}}_{j}) = p$ , hence solving

log p = [\frac{{\tilde{s}}_{j - 1}^{β} - {\tilde{s}}_{j + 1}^{β}}{{\tilde{s}}_{j - 1}^{β} - {\tilde{s}}_{j}^{β}} - 1] log p

where the estimate of $β$ is updated from the data collected at iteration $j .$

6.2. Numerical Results

In line with Section 5.4 and Section 5.5, we assess the performance of the sequential design under a Weibull model. We estimate the $(1 - α)$ -quantile in three scenarios. In the first case, the scale parameter a and the shape parameter b satisfy $(a, b) = (3, 0.9)$ . This corresponds to a strictly decreasing density function with a heavy tail. In the second case, $(a, b) = (3, 1.5)$ , the distribution is more skewed; in the third, $(a, b) = (2, 1.5)$ , the distribution is less dispersed with a lighter tail.

Table 8 shows that the performance of our procedure is, as expected, sensitive to the shape of the distribution. The estimators are less accurate in case 1, corresponding to a heavier tail. We compare these errors to those obtained with complete data using de Valk’s methodology. The loss of accuracy due to data deterioration (binary sampling and truncation) is similar to that observed in the Pareto case, though slightly more pronounced. This is consistent with the fact that the Weibull family is less naturally aligned with the splitting structure than the GPD, which enjoys exact stability under threshold conditioning.

Table 8.

Mean and std of relative errors on the $(1 - α)$ -quantile of Weibull distributions on complete and binary data for samples of size $n = 250$ computed through 400 replicas. Estimates on complete data are obtained with de Valk’s method; estimates on binary data are provided by the sequential design.

	Relative Error on the $(1 - α)$ -Quantile
Parameters	on Binary Data		on Complete Data
	Mean	Std	Mean	Std
$a_{0} = 3$ , $b_{0} = 0.9$ et $s_{α} = 25.69$	0.282	0.520	0.127	0.197
$a_{0} = 3$ , $b_{0} = 1.5$ et $s_{α} = 10.88$	−0.260	0.490	0.084	0.122
$a_{0} = 2$ , $b_{0} = 1.5$ et $s_{α} = 7.25$	−0.241	0.450	0.088	0.140

Open in a new tab

7. Model Selection and Misspecification

In the above sections, we considered two models primarily motivated by their theoretical properties. As stated in Section 4.3, modeling $\tilde{R}$ by a $G P D (c, a)$ with $c > 0$ is justified when the support of the original variable R may be bounded by 0. Note, however, that the GPD model naturally extends to $c = 0$ , which corresponds to the exponential distribution and represents a trivial limiting case for excess probabilities.

Although we excluded the exponential case when modeling the excess probabilities of $\tilde{R}$ with a GPD, we nevertheless considered the Weibull model in Section 6, which lies in the maximum domain of attraction associated with $c = 0$ . Beyond its compatibility with the splitting structure, the Weibull law is a classical and widely used model in reliability, which makes it a natural candidate for an adapted version of our sequential method.

In this section, we discuss the modeling decisions and give some hints on how to deal with misspecification.

7.1. Model Selection

The decision between a Pareto-type model with a strictly positive tail index and a Weibull-type model has been addressed in the literature; a variety of tests exist for assessing membership to a given maximum domain of attraction.

Dietrich et al. (2002) [19] Drees et al. (2006) [20] both propose a test for extreme value conditions related to Cramer-von Mises tests. Let X of distribution function G. The null hypothesis is

H_{0} : G \in M D A (c_{0}) .

In our case, the theoretical value for the tail index is $c_{0} = 0$ . The former test builds on the tail empirical quantile function, while the latter relies on a weighted approximation of the tail empirical distribution.

Choulakian and Stephens (2001) [21] propose a goodness-of-fit test in the spirit of Cramér–von Mises statistics, where unknown parameters are replaced by maximum likelihood estimates. Given a sample $X_{1}, \dots, X_{n}$ from G, the hypothesis to be tested is $H_{0}$ : the sample stems from a $G P D (c_{0}, \hat{a})$ . The associated test statistics are given by the following equation:

\begin{matrix} W^{2} = \sum_{i = 1}^{n} {(\hat{G} (x_{(i)}) - \frac{2 i - 1}{2 n})}^{2} + \frac{1}{12 n}; \\ A^{2} = - n - \frac{1}{n} \sum_{i = 1}^{n} (2 i - 1) \{log (\hat{G} (x_{(i)})) + log (1 - \hat{G} (x_{(} n + 1 - i))\}, \end{matrix}

where $x_{(i)}$ denotes the $i$ -th order statistic of the sample. The authors provide the corresponding tables of critical points.

Jurečková and Picek (2001) [22] designed a non-parametric test for determining whether a distribution G is light or heavy-tailed. The null hypothesis is defined by the following:

H_{c_{O}} : x^{1 / c_{0}} (1 - G (x)) \leq 1 \forall x > x_{0} for some x_{0} > 0

with fixed hypothetical $c_{0}$ . The test procedure consists of splitting the data set into N samples and computing the empirical distribution of the extrema of each sample.

The practical assessment of model suitability in our industrial context is delicate. Two main obstacles arise: (i) data are collected sequentially during the procedure, with no upfront sample of available observations; (ii) the variable of interest R is not observed directly and only dichotomized exceedances over chosen thresholds are available. Most existing procedures assume fully observed data and are semi- or nonparametric tests built from order statistics; their performance also relies on large samples, which are unrealistic in our setting due to the cost and duration of experimental trials.

As an alternative, one may resort to a posteriori validation, once the procedure is completed, combining the statistical output with expert judgment to confirm (or reject) the chosen model.

7.2. Misspecification

In Section 4.3, we assumed that $\tilde{R}$ follows a GPD at the outset. In practice, the tail of the true distribution may converge toward a GPD as thresholds increase, yet differ from it at finite levels.

7.2.1. Robustness to Model Misspecification

The proposed approach relies on parametric assumptions, in particular the generalized Pareto distribution, which is motivated by its stability under threshold exceedances in extreme value theory. This choice is especially relevant in the later stages of the splitting procedure, where the algorithm focuses on increasingly extreme regions of the distribution.

It should be noted, however, that the parametric assumption may be less accurate during the early iterations, when the data are not yet in the tail regime. Nevertheless, the sequential nature of the method progressively reduces the influence of these initial stages, as the final estimate is primarily driven by the deepest levels of the splitting procedure.

This behavior is consistent with standard extreme value estimation techniques, which rely primarily on the largest observations and are therefore relatively insensitive to the distribution in central regions. As a result, the method is expected to exhibit some degree of robustness to model misspecification, provided that the tail behavior is adequately captured.

A more systematic assessment of this robustness, for instance, through simulation studies under alternative distributions, would be a valuable direction for future work. However, such an analysis is not straightforward in the present sequential setting, where data are generated adaptively under conditional distributions. In particular, standard tools such as influence functions are difficult to implement, and a comprehensive empirical validation would go beyond the scope of the present work.

7.2.2. Insights on Control of Model Deviation

While the previous subsection discusses the robustness of the proposed approach from an asymptotic perspective, it is also important to quantify the effect of model misspecification in finite-sample settings, especially in the early iterations of the procedure, where the extreme regime has not yet been reached.

In this subsection, we address this issue by introducing a neighborhood-based framework that allows us to control deviations from the generalized Pareto model and to guide the selection of appropriate threshold levels within the sequential design.

Let us assume that $\tilde{R}$ does not follow a GPD with distribution function F, but rather a distribution G whose tail becomes increasingly close to that of a GPD.

In this case, the issue is to control the distance between G and the theoretical GPD and to determine a threshold level beyond which this discrepancy becomes negligible. One way to proceed is to restrict attention to a neighborhood of F:

V_{ϵ} (F) = \{G : sup_{x} | \bar{F} (x) - \bar{G} (x) | w (x) \leq ϵ\},

(18)

where $ϵ \geq 0$ and w is an increasing weight function such that ${lim}_{x \to \infty} w (x) = \infty$ . $V_{ϵ} (F)$ defines a neighborhood, which does not tolerate large departures from F in the right tail of the distribution.

Let $x \geq s$ . It follows from (18) that a bound for the conditional distribution of x given $R > s$ is as follows:

\frac{\bar{F} (x) - ϵ / w (x)}{\bar{F} (s) + ϵ / w (s)} \leq \frac{\bar{G} (x)}{\bar{G} (s)} \leq \frac{\bar{F} (x +) + ϵ / w (x)}{\bar{F} (s) - ϵ / w (s)} .

(19)

When $ϵ = 0$ , the bounds of (19) match the conditional probabilities of the theoretical Pareto distribution.

In order to control the distance between F and G, this bound may be rewritten in terms of relative error with respect to the Pareto distribution. Using a first-order Taylor expansion for small $ϵ$ , we obtain

1 - u (s, x) . ϵ \leq \frac{\frac{\bar{G} (x)}{\bar{G} (s)}}{\frac{\bar{F} (x)}{\bar{F} (s)}} \leq 1 + u (x, s) . ϵ,

(20)

where

u (s, x) = \frac{{(1 + \frac{c s}{a})}^{1 / c}}{w (s)} + \frac{{(1 + \frac{c x}{a})}^{1 / c}}{w (x)} .

For a given $ϵ$ close to 0, the relative error on the conditional probabilities can be controlled through the choice of s. In particular, the relative error is bounded by a prescribed level $δ > 0$ whenever

\frac{{(1 + \frac{c s}{a})}^{1 / c}}{w (s)} \leq \frac{δ}{ϵ} \frac{{(1 + \frac{c x}{a})}^{1 / c}}{w (x)} .

This provides a practical guideline for selecting thresholds that mitigate the impact of misspecification within the sequential design.

8. Perspectives, Generalization of the Two Models

In this work, we have considered two models for $\tilde{R}$ that exploit the thresholding operations used in the splitting method. This choice is, however, restrictive: the limited information obtained from the trials prevents the use of highly flexible models for the distribution of the resistance. In the following, we outline several possible extensions and generalizations of those models based on common properties of the GPD and Weibull models.

8.1. Variations Around Mixture Forms

When the tail index is positive, the GPD is completely monotone, and thus can be written as the Laplace transform of a probability distribution. Thyrion (1964) [23] and Thorin (1977) [24] established that a $G P D (a_{T}, c_{T})$ , with $c_{T} > 0$ , can be written as the Laplace transform of a Gamma random variable V whose parameters are functions of $a_{T}$ and $c_{T}$ : $V \sim Γ (\frac{1}{c_{T}}, \frac{a_{T}}{c_{T}})$ . Let v denote the density of V,

\begin{matrix} \forall x \geq 0, \bar{G} (x) = & \int_{0}^{\infty} exp (- x y) v (y) d y \\ where v (y) = \frac{{(a_{T} / c_{T})}^{1 / c}}{Γ (1 / c_{T})} y^{1 / c_{T} - 1} exp (- \frac{a_{T} y}{c_{T}}) . \end{matrix}

(21)

It follows that the conditional survival function of $\tilde{R}$ , ${\bar{G}}_{s_{j}}$ , is given by the following:

\begin{matrix} P (\tilde{R} > {\tilde{s}}_{j + 1} ∣ {\tilde{R}}_{j} > {\tilde{s}}_{j}) & = {\bar{G}}_{{\tilde{s}}_{j}} ({\tilde{s}}_{j + 1} - {\tilde{s}}_{j}) \\ = \int_{0}^{\infty} exp \{- ({\tilde{s}}_{j + 1} - {\tilde{s}}_{j}) y\} v_{j} (y) d y, \\ where V_{j} is a random variable of distribution Γ (\frac{1}{c_{j}}, \frac{a_{j}}{c_{j}}) . \end{matrix}

with $c_{j} = c_{T}$ and $a_{j} = a_{j - 1} + c_{T} ({\tilde{s}}_{j} - {\tilde{s}}_{j - 1})$ .

Expression (21) gives room to an extension of the Pareto model. Indeed, we could consider distributions of $\tilde{R}$ that share the same mixture form with a mixing variable W that possesses some common characteristics with the Gamma-distributed random variable $V .$

Similarly, the Weibull distribution $W (α, β)$ can also be written as the Laplace transform of a stable law of density g whenever $β \leq 1$ . Indeed, it holds from Feller 1971 [25] (p. 450, Theorem 1) that:

\forall x \geq 0, exp \{- x^{β}\} = \int_{0}^{\infty} exp (- x y) g (y) d y

(22)

where g is the density of an infinitely divisible probability distribution.

It follows that, for $s_{j} < s_{j + 1}$

\begin{matrix} P (\tilde{R} > {\tilde{s}}_{j + 1} ∣ {\tilde{R}}_{j} > {\tilde{s}}_{j}) & = \frac{exp \{- {({\tilde{s}}_{j + 1} / α)}^{β}\}}{exp \{- {({\tilde{s}}_{j} / α)}^{β}\}} \\ = \frac{\int_{0}^{\infty} exp \{(- ({\tilde{s}}_{j + 1} / α) y\} g (y) d y}{\int_{0}^{\infty} exp \{- ({\tilde{s}}_{j} / α) y\} g (y) d y} = \frac{\int_{0}^{\infty} exp \{- ({\tilde{s}}_{j + 1} / α) y\} g (y) d y}{K (s_{j})} \\ = \frac{1}{K (s_{j})} \int_{0}^{\infty} exp \{- {\tilde{s}}_{j + 1} u\} g_{α} (u)) d u \\ with u = y / α and g_{α} (u) = α g (α u) \end{matrix}

(23)

Thus, an alternative modeling of $\tilde{R}$ could consist of any distribution that can be written as a Laplace transform of a stable law of density $w_{α, β}$ defined on $R_{+}$ and parametrized by $(α, β)$ , that satisfies the following property. For any $s > 0$ , the distribution function of the conditional distribution of $\tilde{R}$ given $\tilde{R} > s$ can be expressed as the Laplace transform of $w_{α, β}^{(α, s)} (.)$ where

x > s, w_{α, β}^{(α, s)} (x) = \frac{α w_{α, β} (α x)}{K (s)},

where $K (.)$ is defined in (23).

8.2. Variation Around the GPD

Another approach, inspired by Naveau et al. (2016) [26], is to modify the model so that the distribution of $\tilde{R}$ converges to a GPD in the upper tail while taking a more flexible form near 0.

$\tilde{R}$ is generated through $G_{(c_{T}, a_{T})}^{- 1} (U)$ with $U \sim U [0, 1]$ . Let us consider now a deformation of the uniform variable $V = L^{- 1} (U)$ defined on $[0, 1]$ , and the transform W of the GPD: $W^{- 1} (U) = G_{(c_{T}, a_{T})}^{- 1} (L^{- 1} (U))$ .

Since the survival function of the GPD is completely monotone, we can choose W so that the distribution of $\tilde{R}$ retains this property.

Proposition 2.

If $ϕ : [0, \infty [\to R$ is completely monotone and ψ is a positive function whose derivative is completely monotone, then $ϕ (ψ)$ is completely monotone.

The transformation of the GPD has a cumulative distribution function $W = L (G (c_{T}, a_{T}))$ and survival function $\bar{W} = \bar{L} (G (c_{T}, a_{T}))$ . $G (c_{T}, a_{T})$ is a Bernstein function; thus, $\bar{W}$ is completely monotone if $\bar{L}$ is as well.

Examples of Admissible Functions

(1) Exponential form:

\begin{matrix} L (0) = 0 \\ L (x) = \frac{1 - exp (- λ x^{α})}{1 - exp (- λ)} avec 0 \leq α \leq 1 et λ > 0 \\ L (1) = 1 \end{matrix}

The obtained transformation is: $\forall x > 0$ ,

{\bar{W}}_{(λ, c_{T}, a_{T})} (x) = \bar{L} (G (x)) = \frac{exp (- λ {[1 - {(1 + \frac{c_{T}}{a_{T}})}^{- 1 / c_{T}}]}^{α}) - exp (- λ)}{1 - exp (- λ)}

with ${\bar{W}}_{(λ, c_{T}, a_{T})} (x)$ completely monotone.

(2) Logarithmic form:

\begin{matrix} L (0) = 0 \\ L (x) = \frac{log (x + 1)}{log 2} (o r m o r e g e n e r a l l y \frac{log (α x + 1)}{log 2}, α > 0) \\ L (1) = 1 \end{matrix}

and $\forall x > 0$ ,

{\bar{W}}_{(c_{T}, a_{T})} (x) = 1 - \frac{log (2 - {(1 + \frac{c_{T}}{a_{T}})}^{- 1 / c_{T}})}{log 2}

(3) Root form:

\begin{matrix} L (0) = 0 \\ L (x) = \frac{\sqrt{x + 1} - 1}{\sqrt{2} - 1} \\ L (1) = 1 \end{matrix}

and

{\bar{W}}_{(c_{T}, a_{T})} x) = 1 - \frac{\sqrt{2 - {(1 + \frac{c_{T} x}{a_{T}})}^{- 1 / c_{T}}} - 1}{\sqrt{2}}

(4) Fraction form:

\begin{matrix} L (0) = 0 \\ L (x) = \frac{(α + 1) x}{x + α}, α > 0 \\ L (1) = 1 \end{matrix}

and

{\bar{W}}_{(α, c_{T}, a_{T})} (x) = 1 - \frac{(α + 1) (1 - {(1 + \frac{c_{T} x}{a_{T}})}^{- 1 / c})}{1 - {(1 + \frac{c_{T} x}{a_{T}})}^{- 1 / c_{T}} + α}

The shapes of the above transformations of the GPD are shown in Figure 6.

Survival functions associated with transformations of the GPD $(0.8, 1.5)$ .

These transformations, however, do not preserve the stability under thresholding specific to the Pareto family; as a consequence, their implementation does not lead to a stable sequential procedure. Nonetheless, they illustrate how the proposed models can be generalized in contexts where additional information on the underlying variable is available.

9. Conclusions

The splitting-based procedure presented in this article proposes an innovative experimental design for estimating an extreme failure quantile. Its development is motivated, on the one hand, by major industrial stakes, and on the other, by the limited relevance of existing methodologies in this context. The core difficulty lies in the nature of the available information: the variable of interest is latent and only exceedance indicators over chosen thresholds can be observed.

We proposed a strategy based on splitting methods to decompose the target rare event into a sequence of more tractable conditional events. The splitting formula introduces a formal decomposition that we translated into a practical sampling strategy targeting the tail of the distribution of interest progressively.

The algebraic structure of the splitting equation motivated specific parametric assumptions on the underlying distribution. We considered two models that benefit from stability properties under thresholding: a Generalized Pareto Distribution (GPD) and a Weibull distribution. Building on this structure, we designed an estimation procedure that leverages the iterative nature of the design by combining a classical maximum likelihood criterion with a backward-consistency criterion on sequentially estimated conditional quantiles. The performance of the resulting estimator was assessed numerically. Although accuracy is necessarily constrained by the quantity and quality of information, the results can still be compared—at least in terms of order of magnitude—to what could be obtained under the idealized setting of fully observed data.

From a practical standpoint, while the GPD aligns most naturally with the splitting structure (due to its exact stability under threshold conditioning), the Weibull distribution remains highly relevant for reliability modeling and engineering practice.

The applicability of the proposed framework, however, depends on the ability to implement conditional sampling schemes, which may require a preliminary analysis of the system under study. While this may limit direct applicability in some contexts, it is worth noting that methods addressing similar problems typically rely on even stronger prior knowledge of the underlying distribution. In this respect, the present approach provides a relatively generic framework, requiring more limited prior information. A concrete illustration of such an implementation is provided in [1].

Another important aspect concerns the choice of the parametric model. The assumptions introduced to ensure tractability of the sequential design are motivated by extreme value theory and are therefore primarily justified in the tail of the distribution. As a result, their adequacy may be less accurate in the earlier stages of the procedure, where the observations do not yet correspond to the extreme regime. This mismatch may induce estimation errors in the first iterations, which can propagate through the sequential updates and affect the final estimate. Although the paper outlines possible extensions toward more flexible modeling strategies, a deeper investigation of model misspecification—especially outside the extreme region—and its impact on the procedure would be a valuable direction for future work.

Beyond the specific results obtained in this work, the proposed methodology also highlights an interesting statistical perspective. The sequential nature of the procedure, together with the use of a primary estimation criterion, naturally suggests the introduction of a second statistical criterion aimed at validating the estimates at each stage.

In particular, incorporating a cross-validation-type phase within the sequential design provides an additional layer of assessment for the stability and reliability of the estimated parameters. This idea, which emerges from the structure of the method itself, opens a promising direction for future research and would require further theoretical investigation.

Author Contributions

Conceptualization, M.B. and E.M.; methodology, M.B. and E.M.; software, M.B. and E.M.; writing, M.B. and E.M. All authors have read and agreed to the published version of the manuscript.

Data Availability Statement

The original contributions presented in this study are included in the article. Further inquiries can be directed to the corresponding author.

Conflicts of Interest

The authors declare no conflicts of interest.

Funding Statement

This research received no external funding.

Footnotes

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

References

1.Broniatowski M., Miranda E. A sequential design for extreme quantiles estimation under binary sampling. arXiv. 20192004.01563 [Google Scholar]
2.Weissman I. Estimation of parameters and large quantiles based on the k largest observations. J. Am. Stat. Assoc. 1978;73:812–815. [Google Scholar]
3.Dekkers A.L.M., Einmahl J.H.J., De Haan L. A moment estimator for the index of an extreme-value distribution. Ann. Stat. 1989;17:1833–1855. doi: 10.1214/aos/1176347397. [DOI] [Google Scholar]
4.De Haan L., Rootzén H. On the estimation of high quantiles. J. Stat. Plan. Inference. 1993;35:1–13. doi: 10.1016/0378-3758(93)90063-C. [DOI] [Google Scholar]
5.De Valk C. Approximation of high quantiles from intermediate quantiles. Extremes. 2016;19:661–684. doi: 10.1007/s10687-016-0255-3. [DOI] [Google Scholar]
6.De Valk C., Cai J.J. A high quantile estimator based on the log-Generalized Weibull tail limit. Econom. Stat. 2018;6:107–128. doi: 10.1016/j.ecosta.2017.03.001. [DOI] [Google Scholar]
7.Beirlant J., Guillou A., Dierckx G., Fils-Villetard A. Estimation of the extreme value index and extreme quantiles under random censoring. Extremes. 2007;10:151–174. doi: 10.1007/s10687-007-0039-x. [DOI] [Google Scholar]
8.Einmahl J.H.J., Fils-Villetard A., Guillou A. Statistics of extremes under random censoring. Bernoulli. 2008;14:207–227. doi: 10.3150/07-BEJ104. [DOI] [Google Scholar]
9.Worms J., Worms R. New estimators of the extreme value index under random right censoring, for heavy-tailed distributions. Extremes. 2014;17:337–358. doi: 10.1007/s10687-014-0189-6. [DOI] [Google Scholar]
10.Wu C.F.J. Efficient Sequential Designs with Binary Data. J. Am. Stat. Assoc. 1985;80:974–984. doi: 10.1080/01621459.1985.10478213. [DOI] [Google Scholar]
11.Joseph V.R. Efficient Robbins–Monro procedure for binary data. Biometrika. 2004;91:461–470. doi: 10.1093/biomet/91.2.461. [DOI] [Google Scholar]
12.Wu C.F.J., Tian Y. Three-phase optimal design of sensitivity experiments. J. Stat. Plan. Inference. 2014;149:1–15. doi: 10.1016/j.jspi.2013.10.007. [DOI] [Google Scholar]
13.Dixon W.J., Mood A.M. A Method for Obtaining and Analyzing Sensitivity Data. J. Am. Stat. Assoc. 1948;43:109–126. doi: 10.1080/01621459.1948.10483254. [DOI] [Google Scholar]
14.Dixon W.J. The Up-and-Down Method for Small Samples. J. Am. Stat. Assoc. 1965;60:967–978. doi: 10.1080/01621459.1965.10480843. [DOI] [Google Scholar]
15.O’Quigley J., Pepe M., Fisher L. Continual Reassessment Method: A Practical Design for Phase 1 Clinical Trials in Cancer. Biometrics. 1990;46:33–48. doi: 10.2307/2531628. [DOI] [PubMed] [Google Scholar]
16.Kahn H., Harris T.E. Estimation of Particle Transmission by Random Sampling. Natl. Bur. Stand. Appl. Math. Ser. 1951;12:27–30. [Google Scholar]
17.Balkema A.A., De Haan L. Residual Life Time at Great Age. Ann. Probab. 1974;2:762–804. doi: 10.1214/aop/1176996548. [DOI] [Google Scholar]
18.Pickands J. Statistical Inference using Extreme Order Statistics. Ann. Probab. 1975;3:119–131. [Google Scholar]
19.Dietrich D., De Haan L., Hüsler J. Testing extreme value conditions. Extremes. 2002;5:71–85. doi: 10.1023/A:1020934126695. [DOI] [Google Scholar]
20.Drees H., De Haan L., Li D. Approximations to the tail empirical distribution function with application to testing extreme value conditions. J. Stat. Plan. Inference. 2006;136:3498–3538. doi: 10.1016/j.jspi.2005.02.017. [DOI] [Google Scholar]
21.Choulakian V., Stephens M.A. Goodness-of-fit tests for the generalized Pareto distribution. Technometrics. 2001;43:478–484. doi: 10.1198/00401700152672573. [DOI] [Google Scholar]
22.Jurečková J., Picek J. A Class of Tests on the Tail Index. Extremes. 2001;4:165–183. doi: 10.1023/A:1013925226836. [DOI] [Google Scholar]
23.Thyrion P. Les lois exponentielles composées. Bull. Assoc. R. Actuaires Belg. 1964;62:35–44. [Google Scholar]
24.Thorin O. On the infinite divisibility of the Pareto distribution. Scand. Actuar. J. 1977;1:31–40. doi: 10.1080/03461238.1977.10405623. [DOI] [Google Scholar]
25.Feller W. An Introduction to Probability Theory and Its Applications. Volume 2 Wiley; Hoboken, NJ, USA: 1971. [Google Scholar]
26.Naveau P., Huser R., Ribereau P., Hannart A. modelling jointly low, moderate, and heavy rainfall intensities without a threshold selection. Water Resour. Res. 2016;52:2753–2769. doi: 10.1002/2015WR018552. [DOI] [Google Scholar]

Associated Data

This section collects any data citations, data availability statements, or supplementary materials included in this article.

Data Availability Statement

The original contributions presented in this study are included in the article. Further inquiries can be directed to the corresponding author.

[B1-entropy-28-00479] 1.Broniatowski M., Miranda E. A sequential design for extreme quantiles estimation under binary sampling. arXiv. 20192004.01563 [Google Scholar]

[B2-entropy-28-00479] 2.Weissman I. Estimation of parameters and large quantiles based on the k largest observations. J. Am. Stat. Assoc. 1978;73:812–815. [Google Scholar]

[B3-entropy-28-00479] 3.Dekkers A.L.M., Einmahl J.H.J., De Haan L. A moment estimator for the index of an extreme-value distribution. Ann. Stat. 1989;17:1833–1855. doi: 10.1214/aos/1176347397. [DOI] [Google Scholar]

[B4-entropy-28-00479] 4.De Haan L., Rootzén H. On the estimation of high quantiles. J. Stat. Plan. Inference. 1993;35:1–13. doi: 10.1016/0378-3758(93)90063-C. [DOI] [Google Scholar]

[B5-entropy-28-00479] 5.De Valk C. Approximation of high quantiles from intermediate quantiles. Extremes. 2016;19:661–684. doi: 10.1007/s10687-016-0255-3. [DOI] [Google Scholar]

[B6-entropy-28-00479] 6.De Valk C., Cai J.J. A high quantile estimator based on the log-Generalized Weibull tail limit. Econom. Stat. 2018;6:107–128. doi: 10.1016/j.ecosta.2017.03.001. [DOI] [Google Scholar]

[B7-entropy-28-00479] 7.Beirlant J., Guillou A., Dierckx G., Fils-Villetard A. Estimation of the extreme value index and extreme quantiles under random censoring. Extremes. 2007;10:151–174. doi: 10.1007/s10687-007-0039-x. [DOI] [Google Scholar]

[B8-entropy-28-00479] 8.Einmahl J.H.J., Fils-Villetard A., Guillou A. Statistics of extremes under random censoring. Bernoulli. 2008;14:207–227. doi: 10.3150/07-BEJ104. [DOI] [Google Scholar]

[B9-entropy-28-00479] 9.Worms J., Worms R. New estimators of the extreme value index under random right censoring, for heavy-tailed distributions. Extremes. 2014;17:337–358. doi: 10.1007/s10687-014-0189-6. [DOI] [Google Scholar]

[B10-entropy-28-00479] 10.Wu C.F.J. Efficient Sequential Designs with Binary Data. J. Am. Stat. Assoc. 1985;80:974–984. doi: 10.1080/01621459.1985.10478213. [DOI] [Google Scholar]

[B11-entropy-28-00479] 11.Joseph V.R. Efficient Robbins–Monro procedure for binary data. Biometrika. 2004;91:461–470. doi: 10.1093/biomet/91.2.461. [DOI] [Google Scholar]

[B12-entropy-28-00479] 12.Wu C.F.J., Tian Y. Three-phase optimal design of sensitivity experiments. J. Stat. Plan. Inference. 2014;149:1–15. doi: 10.1016/j.jspi.2013.10.007. [DOI] [Google Scholar]

[B13-entropy-28-00479] 13.Dixon W.J., Mood A.M. A Method for Obtaining and Analyzing Sensitivity Data. J. Am. Stat. Assoc. 1948;43:109–126. doi: 10.1080/01621459.1948.10483254. [DOI] [Google Scholar]

[B14-entropy-28-00479] 14.Dixon W.J. The Up-and-Down Method for Small Samples. J. Am. Stat. Assoc. 1965;60:967–978. doi: 10.1080/01621459.1965.10480843. [DOI] [Google Scholar]

[B15-entropy-28-00479] 15.O’Quigley J., Pepe M., Fisher L. Continual Reassessment Method: A Practical Design for Phase 1 Clinical Trials in Cancer. Biometrics. 1990;46:33–48. doi: 10.2307/2531628. [DOI] [PubMed] [Google Scholar]

[B16-entropy-28-00479] 16.Kahn H., Harris T.E. Estimation of Particle Transmission by Random Sampling. Natl. Bur. Stand. Appl. Math. Ser. 1951;12:27–30. [Google Scholar]

[B17-entropy-28-00479] 17.Balkema A.A., De Haan L. Residual Life Time at Great Age. Ann. Probab. 1974;2:762–804. doi: 10.1214/aop/1176996548. [DOI] [Google Scholar]

[B18-entropy-28-00479] 18.Pickands J. Statistical Inference using Extreme Order Statistics. Ann. Probab. 1975;3:119–131. [Google Scholar]

[B19-entropy-28-00479] 19.Dietrich D., De Haan L., Hüsler J. Testing extreme value conditions. Extremes. 2002;5:71–85. doi: 10.1023/A:1020934126695. [DOI] [Google Scholar]

[B20-entropy-28-00479] 20.Drees H., De Haan L., Li D. Approximations to the tail empirical distribution function with application to testing extreme value conditions. J. Stat. Plan. Inference. 2006;136:3498–3538. doi: 10.1016/j.jspi.2005.02.017. [DOI] [Google Scholar]

[B21-entropy-28-00479] 21.Choulakian V., Stephens M.A. Goodness-of-fit tests for the generalized Pareto distribution. Technometrics. 2001;43:478–484. doi: 10.1198/00401700152672573. [DOI] [Google Scholar]

[B22-entropy-28-00479] 22.Jurečková J., Picek J. A Class of Tests on the Tail Index. Extremes. 2001;4:165–183. doi: 10.1023/A:1013925226836. [DOI] [Google Scholar]

[B23-entropy-28-00479] 23.Thyrion P. Les lois exponentielles composées. Bull. Assoc. R. Actuaires Belg. 1964;62:35–44. [Google Scholar]

[B24-entropy-28-00479] 24.Thorin O. On the infinite divisibility of the Pareto distribution. Scand. Actuar. J. 1977;1:31–40. doi: 10.1080/03461238.1977.10405623. [DOI] [Google Scholar]

[B25-entropy-28-00479] 25.Feller W. An Introduction to Probability Theory and Its Applications. Volume 2 Wiley; Hoboken, NJ, USA: 1971. [Google Scholar]

[B26-entropy-28-00479] 26.Naveau P., Huser R., Ribereau P., Hannart A. modelling jointly low, moderate, and heavy rainfall intensities without a threshold selection. Water Resour. Res. 2016;52:2753–2769. doi: 10.1002/2015WR018552. [DOI] [Google Scholar]

PERMALINK

A Sequential Design for Extreme Quantile Estimation Under Binary Sampling

Michel Broniatowski

Emilie Miranda

Roles

Abstract

1. Introduction

2. Problem Formulation in an Industrial Reliability Framework

3. Extreme Quantile Estimation—A Short Survey

3.1. Extreme Quantiles Estimation Methods

3.2. Sequential Design Based on Dichotomous Data

3.2.1. The Staircase Method

Numerical Results

Table 1.

Table 2.

Limitations of the Staircase Method

3.2.2. The Continuous Reassessment Method (CRM)

General Principle

Application to Failure Quantiles

Numerical Simulation for the CRM

Table 3.

Figure 1.

4. A New Design for the Estimation of Extreme Quantiles

4.1. Splitting

Remark 1.

Remark 2.

4.2. Choice of Sequential Design Parameters

4.3. Modeling the Distribution of the Strength, Pareto Model

Theorem 1.

Proposition 1.

4.4. Notation and Framework for Sequential GPD Modeling and Parameterization

4.5. Sequential Design for the Extreme Quantile Estimation

5. Enhanced Sequential Design in the Pareto Model

5.1. Limitations of Likelihood-Based Estimation

Figure 2.

Table 4.

Table 5.

Figure 3.

5.2. An Enhanced Sequential Criterion for Estimation

5.3. Theoretical Insight into the Estimator

5.4. Simulation-Based Numerical Results

Table 6.

Figure 4.

Figure 5.

5.5. Performance of the Sequential Estimation

5.5.1. Estimation of an Extreme Quantile Based on Complete Data, De Valk’s Estimator

5.5.2. Loss in Accuracy Due to Binary Sampling

Table 7.

6. Sequential Design for the Weibull Model

6.1. The Weibull Model

6.2. Numerical Results

Table 8.

7. Model Selection and Misspecification

7.1. Model Selection

7.2. Misspecification

7.2.1. Robustness to Model Misspecification

7.2.2. Insights on Control of Model Deviation

8. Perspectives, Generalization of the Two Models

8.1. Variations Around Mixture Forms

8.2. Variation Around the GPD

Proposition 2.

Examples of Admissible Functions

Figure 6.

9. Conclusions

Author Contributions

Data Availability Statement

Conflicts of Interest

Funding Statement

Footnotes

References

Associated Data

Data Availability Statement

ACTIONS

PERMALINK

RESOURCES

Similar articles

Cited by other articles

Links to NCBI Databases