Youden index estimation based on group-tested data

Jin Yang; Aiyi Liu; Neil Perkins; Zhen Chen

doi:10.1177/09622802241295319

Author manuscript; available in PMC: 2025 Sep 16.

Published in final edited form as: Stat Methods Med Res. 2024 Dec 10;34(1):45–54. doi: 10.1177/09622802241295319

PMCID: PMC12434710 NIHMSID: NIHMS2109989 PMID: 39659139

Abstract

The Youden index, a linear function of sensitivity and specificity, provides a direct measurement of the highest diagnostic accuracy achievable by a biomarker. It is maximized at the cut-off point that optimizes the biomarker's overall classification rate while assigning equal weight to sensitivity and specificity. In this paper, we consider the problem of estimating the Youden index when only group-tested data are available. The unavailability of individual disease statuses poses a challenge, especially when there are differential false positives and negatives in disease screening. We propose both parametric and nonparametric procedures for estimation of the Youden index, and exemplify our methods by utilizing data from the National Health and Nutrition Examination Survey (NHANES) to evaluate the diagnostic ability of monocyte for predicting chlamydia.

Keywords: Diagnostic accuracy, differential misclassification, group testing, joint model, sensitivity, specificity

1. Introduction

In diagnostic medicine, biomarkers have been widely used to detect the presence of a disease or condition of interest. For example, CD4 and CD8 are commonly used as biomarkers in HIV/AIDS detection,¹ and hemoglobin A1C (HbA1C) is considered a biomarker for the presence and severity of hyperglycemia.² The Receiver Operating Characteristic (ROC) curve is a widely used graphical tool that illustrates the discriminatory accuracy of continuous diagnostic biomarkers in distinguishing between diseased and healthy individuals. It operates on the principle that an individual is classified as diseased or healthy based on whether the corresponding biomarker value exceeds or falls below a specified threshold value. The effectiveness of any specific threshold value can be assessed through the probability of a true positive (sensitivity) and the probability of a true negative (specificity). The ROC curve is a plot of the sensitivity versus 1-specificity over all possible threshold values. Both parametric and nonparametric methods have been used for estimating ROC curves; see Pepe³ and Nakas et al.⁴

In addition to the area or partial area under the ROC curve (AUC or pAUC), which are commonly used global indices of diagnostic accuracy, the Youden index provides another useful measure for evaluating a biomarker's predictive capacity. This index is a function of sensitivity and specificity and is defined as $YI = \max_{c} \{Sensitivity (c) + Specificity (c) - 1\}$, where YI is the Youden index, $c$ is the cut-off point, and the maximum is taken over all possible values of $c$. YI ranges between 0 and 1, and a value close to 1 (0) is indicative of a relatively large (small) diagnostic capacity. Several studies have addressed the problem of estimating the Youden index. Fluss et al.⁵ compared several estimation procedures (parametric and non-parametric); Molanes-López and Letón⁶ applied the modified delta method and adjusted empirical likelihood to estimate the Youden index and its associated threshold; Yin and Tian⁷,⁸ presented parametric and non-parametric approaches for joint confidence region estimation of sensitivity and specificity at the cut-off point as well as of AUC and Youden index; Yin et al.⁹ proposed a nonparametric method based on a kernel-smoothed estimate of the cumulative distribution functions. However, these studies focused on the case where individual disease statuses are all available.

Due to resource constraints and/or privacy considerations, individual-level disease data may not be accessible in many instances. In such scenarios, group testing has been recommended as a practical alternative. The approach, initially introduced by Dorfman¹⁰ for screening syphilis antigen in U.S. army recruits, not only safeguards patient confidentiality when individual results are not imperative but also enhances statistical efficiency. As a result, group testing has found widespread application in various fields, see Hepworth,¹¹ Hughes-Oliver and Rosenberger,¹² McCann and Tebbs,¹³ Turner et al.,¹⁴ Warasi et al.,¹⁵ Malinovsky and Albert,¹⁶ Malinovsky and Albert,¹⁷ Malinovsky et al.,¹⁸ Mokalled et al.,¹⁹ Bilder et al.²⁰ Recently, Zhang et al.²¹ considered AUC estimation when only group-based test results on the disease status are available.

In the present paper, we are concerned with estimating the Youden index in the setting of group-tested data, where disease statuses exist at the group level while biomarkers are available for each individual subject. As in the case where the AUC is of interest, we face challenges arising from the unavailability of individual disease statuses and from differential misclassification that may depend on the group size and the number of diseased individuals within each group.

The paper is organized as follows. In Section 2, we establish the procedures to obtain the Youden index via Normal, Gamma, Log-normal, and nonparametric estimation with group-tested data, respectively. Simulation studies will be presented in Section 3. In Section 4, we illustrate our methods with data on chlamydia detection from NHANES. Conclusions and further research discussions will be presented in Section 5.

2. Methodology

2.1. Youden index

Suppose $X$ is the concentration level of a continuous biomarker with distribution function $H$ and probability density function (pdf) $h$ . Meanwhile, let $X$ have conditional distributions $F$ and $G$ in non-diseased and diseased populations, respectively, with $f$ and $g$ the corresponding density functions. Let $D$ be the true binary disease status so that prevalence $Pr (D = 1) = p$ . Further, let $K$ be the observed disease status so that the specificity and sensitivity of the lab test are $δ_{0} = Pr (K = 0 ∣ D = 0)$ and $δ_{1} = Pr (K = 1 ∣ D = 1)$ , respectively. It follows $H = (1 - p) F + pG$ . For any given cut-off point $c$ , the specificity and sensitivity of a biomarker can be written as

$$
\begin{aligned}
Specificity (c) &= Pr (X \leq c \mid D = 0) \triangleq F (c), \\
Sensitivity (c) &= Pr (X \geq c \mid D = 1) \triangleq 1 - G (c)
\end{aligned}
\tag{1}
$$

and the Youden index as

$$
YI = \max_{c} \{Sensitivity (c) + Specificity (c) - 1\} = \max_{c} \{F (c) - G (c)\}
\tag{2}
$$

The value of $c$ that achieves this maximum is taken as the optimal threshold $c^{*}$, and YI is estimated by estimating $F$ and $G$ and substituting them into (2):

$$
\hat{YI} = \hat{F} (c^{*}) - \hat{G} (c^{*})
\tag{3}
$$

where $\hat{F}$ and $\hat{G}$ are the estimators of $F$ and $G$ , respectively.

In this paper, our main interest is to estimate the Youden index in the setting of group-tested data. According to (3), this requires us to estimate $F$ and $G$ first. To this end, we introduce four estimation procedures (Normal, Gamma, Log-normal, and nonparametric), which include both parametric and nonparametric methods.

2.2. Group-tested data

Consider a total of $N$ subjects randomly allocated into $n$ groups, each with a size denoted as $J_{i}$ for $i = 1, \dots, n$, where $J_{1} + \dots + J_{n} = N$. The continuous variable $X$ is observed for each subject, resulting in observations $X_{ij}$, $j = 1, \dots, J_{i}$, $i = 1, \dots, n$. Let $X_{i} = (X_{i1}, \dots, X_{iJ_{i}})^{T}$ and denote the group-tested disease result by $\tilde{K}_{i}$. Define $D_{ij}$ as the true disease status of the $j$th subject in the $i$th group, and $\tilde{D}_{i} = \max \{D_{i1}, \dots, D_{iJ_{i}}\}$ as the true disease status of group $i$. For each group $i$, we assume that the specificity of the lab test remains constant, i.e. $Pr (\tilde{K}_{i} = 0 \mid \tilde{D}_{i} = 0) = δ_{0}$. The sensitivity is differential, depending on the group size $J_{i}$ and the number of diseased subjects $d_{i}$ in the group, i.e. $Pr (\tilde{K}_{i} = 1 \mid \tilde{D}_{i} = 1) = δ_{1}^{*} (J_{i}, d_{i})$, as detailed in Hwang,²² Hung and Swallow,²³ and Haber et al.²⁴

Given the true group disease status ${\tilde{D}}_{i}$ , we assume that biomarker $X_{i}$ of subjects in a group and the group-tested results ${\tilde{K}}_{i}$ are independent so that the conditional probability density of $X_{i}$ can be simplified as

$$
h \{X_{i} \mid \tilde{K}_{i} = k, \tilde{D}_{i} = d\} = h \{X_{i} \mid \tilde{D}_{i} = d\}
$$

where $k, d \in \{0, 1\}$. Denoting $h (X_{i}, \tilde{K}_{i})$ as the joint density function of $X_{i}$ and $\tilde{K}_{i}$ and letting $I (\cdot)$ be an indicator function, the corresponding likelihood function for the observed data can be written as

$$
L = \prod_{i = 1}^{n} h (X_{i}, \tilde{K}_{i} = 0)^{I (\tilde{K}_{i} = 0)} \prod_{i = 1}^{n} h (X_{i}, \tilde{K}_{i} = 1)^{I (\tilde{K}_{i} = 1)}
\tag{4}
$$

where

$$
\begin{aligned}
h (X_{i}, \tilde{K}_{i} = 0) &= δ_{0} (1 - p)^{J_{i}} \prod_{j = 1}^{J_{i}} f (X_{ij}) + \sum_{\substack{v_{1}, \dots, v_{J_{i}} \in \{0, 1\} \\ v_{1} + \dots + v_{J_{i}} = d_{i} > 0}} \Big\{ \prod_{j = 1}^{J_{i}} f (X_{ij})^{1 - v_{j}} g (X_{ij})^{v_{j}} \Big\} \{1 - δ_{1}^{*} (J_{i}, d_{i})\} p^{d_{i}} (1 - p)^{J_{i} - d_{i}}, \\
h (X_{i}, \tilde{K}_{i} = 1) &= (1 - δ_{0}) (1 - p)^{J_{i}} \prod_{j = 1}^{J_{i}} f (X_{ij}) + \sum_{\substack{v_{1}, \dots, v_{J_{i}} \in \{0, 1\} \\ v_{1} + \dots + v_{J_{i}} = d_{i} > 0}} \Big\{ \prod_{j = 1}^{J_{i}} f (X_{ij})^{1 - v_{j}} g (X_{ij})^{v_{j}} \Big\} δ_{1}^{*} (J_{i}, d_{i}) p^{d_{i}} (1 - p)^{J_{i} - d_{i}}
\end{aligned}
\tag{5}
$$

For notational brevity, we assume equal group sizes, $J_{1} = \dots = J_{n} = N / n \triangleq J$; all subsequent results can be extended to the case where group sizes vary. The log-likelihood function can be written as follows:

$$
\begin{aligned}
l_{N} \{f (X_{ij}), g (X_{ij}) \mid p; Ω\} &= \sum_{i = 1}^{n} I (\tilde{K}_{i} = 1) \log h (X_{i}, \tilde{K}_{i} = 1) + \sum_{i = 1}^{n} I (\tilde{K}_{i} = 0) \log h (X_{i}, \tilde{K}_{i} = 0) \\
&= \sum_{i = 1}^{n} I (\tilde{K}_{i} = 1) \log \Big[ (1 - δ_{0}) (1 - p)^{J} \prod_{j = 1}^{J} f (X_{ij}) + \sum_{\substack{v_{1}, \dots, v_{J} \in \{0, 1\} \\ v_{1} + \dots + v_{J} = d > 0}} \Big\{ \prod_{j = 1}^{J} f (X_{ij})^{1 - v_{j}} g (X_{ij})^{v_{j}} \Big\} δ_{1}^{*} (J, d) p^{d} (1 - p)^{J - d} \Big] \\
&\quad + \sum_{i = 1}^{n} I (\tilde{K}_{i} = 0) \log \Big[ δ_{0} (1 - p)^{J} \prod_{j = 1}^{J} f (X_{ij}) + \sum_{\substack{v_{1}, \dots, v_{J} \in \{0, 1\} \\ v_{1} + \dots + v_{J} = d > 0}} \Big\{ \prod_{j = 1}^{J} f (X_{ij})^{1 - v_{j}} g (X_{ij})^{v_{j}} \Big\} \{1 - δ_{1}^{*} (J, d)\} p^{d} (1 - p)^{J - d} \Big]
\end{aligned}
\tag{6}
$$

where $Ω$ is the vector containing all parameters related to $f$ and $g$. We use the ‘optim’ function in R to estimate $Ω$ and hence obtain $\hat{f} (X_{ij})$ and $\hat{g} (X_{ij})$. Consequently, an estimate of the Youden index can be obtained according to (2) and (3).
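To make the structure of (5) and (6) concrete, the following is a minimal Python sketch of one group's joint density and the resulting log-likelihood. It is not the authors' R implementation: the function names, the brute-force enumeration over $v \in \{0, 1\}^{J}$ (practical only for small $J$), and the illustrative parameter values are all assumptions. The dilution form used for $δ_{1}^{*}$ is the Hung and Swallow model that the simulations in Section 3 adopt.

```python
import itertools
import math


def normal_pdf(x, mu, sigma):
    """Density of a Normal(mu, sigma) variable at x."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def group_joint_density(x_group, k, p, delta0, delta1_star, f, g):
    """h(X_i, K_i = k) as in (5), enumerating every v in {0, 1}^J."""
    J = len(x_group)
    # Term for a truly disease-free group (all v_j = 0).
    all_healthy = (1.0 - p) ** J * math.prod(f(x) for x in x_group)
    total = (delta0 if k == 0 else 1.0 - delta0) * all_healthy
    # Terms for every configuration with d >= 1 diseased members.
    for v in itertools.product((0, 1), repeat=J):
        d = sum(v)
        if d == 0:
            continue
        density = math.prod(g(x) if vj else f(x) for x, vj in zip(x_group, v))
        sens = delta1_star(J, d)
        test_prob = sens if k == 1 else 1.0 - sens
        total += density * test_prob * p ** d * (1.0 - p) ** (J - d)
    return total


def log_likelihood(groups, results, p, delta0, delta1_star, f, g):
    """Log-likelihood (6): sum of log h(X_i, K_i) over all groups."""
    return sum(
        math.log(group_joint_density(x, k, p, delta0, delta1_star, f, g))
        for x, k in zip(groups, results)
    )


# Illustrative values only (hypothetical, not estimates from the paper).
f = lambda x: normal_pdf(x, 0.0, 1.0)
g = lambda x: normal_pdf(x, 3.0, 2.5)
dilution = lambda J, d: 0.95 * d / (d + 0.02 * (J - d))
x_demo = [0.3, 2.1, -0.4]
h0 = group_joint_density(x_demo, 0, 0.02, 0.95, dilution, f, g)
h1 = group_joint_density(x_demo, 1, 0.02, 0.95, dilution, f, g)
```

A useful sanity check on any implementation is that `h0 + h1` equals the marginal density $\prod_{j} \{(1 - p) f (X_{ij}) + p g (X_{ij})\}$, since summing over the test result removes the misclassification terms. In practice the negative of `log_likelihood` would be passed to a numerical optimizer over the parameters of $f$ and $g$.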

2.3. Estimation

2.3.1. Normal estimation

Assume $X$ follows Normal distributions $F$ and $G$ with means $μ_{0}$ and $μ_{1}$ and standard deviations $σ_{0}$ and $σ_{1}$ for non-diseased and diseased populations, respectively. Here we suppose $μ_{1} > μ_{0}$ . For $μ_{1} < μ_{0}$ , one may simply switch diseased with non-diseased. $Sensitivity (c)$ and $Specificity (c)$ in (1) can then be written as

$$
\begin{aligned}
Sensitivity (c) &= Pr (X \geq c \mid D = 1) = Φ \left(\frac{μ_{1} - c}{σ_{1}}\right), \\
Specificity (c) &= Pr (X \leq c \mid D = 0) = Φ \left(\frac{c - μ_{0}}{σ_{0}}\right)
\end{aligned}
\tag{7}
$$

for a given cut-point $c$ , where $Φ$ denotes the standard Normal distribution function. When $σ_{0} = σ_{1}$ , the optimal cut-off point is the midpoint between diseased and non-diseased means, i.e. $c^{*} = (μ_{0} + μ_{1}) / 2$ . Otherwise, it is given as

$$
c_{1, 2}^{*} = \frac{μ_{0} (b^{2} - 1) - a \pm b \sqrt{a^{2} + (b^{2} - 1) σ_{0}^{2} \ln (b^{2})}}{b^{2} - 1}
\tag{8}
$$

where $\ln$ is the natural logarithm function, $a = μ_{1} - μ_{0}$ and $b = σ_{1} / σ_{0}$. Let $c_{1}^{*} < c_{2}^{*}$; then the Youden index is attained at $c_{2}^{*}$ if $b > 1$ and at $c_{1}^{*}$ otherwise; see Fluss et al.⁵ Given ${\hat{μ}}_{0}$, ${\hat{μ}}_{1}$, ${\hat{σ}}_{0}$, ${\hat{σ}}_{1}$ and consequently $\hat{F}$ and $\hat{G}$, an estimate of the optimal cut-off point $c^{*}$ can be obtained through (8) and that of the Youden index through (3).
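The closed form in (8) can be checked numerically. The sketch below, with hypothetical function names and using only the Python standard library, evaluates both roots, picks $c_{2}^{*}$ when $b > 1$ and $c_{1}^{*}$ otherwise, and plugs the result into $F (c) - G (c)$. It assumes $μ_{1} > μ_{0}$, as in the text.

```python
import math


def norm_cdf(z):
    """Standard Normal distribution function Phi(z)."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def normal_youden(mu0, sigma0, mu1, sigma1):
    """Return (c_star, YI) under Normal F and G, assuming mu1 > mu0."""
    a = mu1 - mu0
    b = sigma1 / sigma0
    if math.isclose(b, 1.0):
        # Equal standard deviations: midpoint of the two means.
        c_star = 0.5 * (mu0 + mu1)
    else:
        b2m1 = b * b - 1.0
        root = b * math.sqrt(a * a + b2m1 * sigma0 ** 2 * math.log(b * b))
        c1, c2 = sorted((mu0 * b2m1 - a + s * root) / b2m1 for s in (-1.0, 1.0))
        c_star = c2 if b > 1.0 else c1
    yi = norm_cdf((c_star - mu0) / sigma0) - norm_cdf((c_star - mu1) / sigma1)
    return c_star, yi


# Parameters of the Normal simulation scenario in Section 3.
c_star, yi = normal_youden(0.0, 1.0, 3.0, 2.5)
```

With the Normal simulation parameters from Section 3 ($μ_{0} = 0$, $σ_{0} = 1$, $μ_{1} = 3$, $σ_{1} = 2.5$), this gives a cut-off near 1.48 and a Youden index that agrees with the reported true value of 0.6590 to about three decimal places.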

2.3.2. Gamma estimation

Assume $X$ follows Gamma distributions $F$ and $G$ with shapes $α_{0}$ and $α_{1}$ and rates $β_{0}$ and $β_{1}$ for non-diseased and diseased populations, respectively. The optimal cut-off point $c^{*}$ can be explicitly expressed as follows.

When $α_{0} = α_{1} = α$ ,
$c^{*} = α \times (\frac{1}{β_{0} - β_{1}}) ln (\frac{β_{0}}{β_{1}})$
where $β_{0} < β_{1}$ . When $β_{0} > β_{1}$ , one may simply switch diseased with non-diseased.
When $β_{0} = β_{1} = β$ ,
$c^{*} = \frac{1}{β} \times {(\frac{Γ (α_{1})}{Γ (α_{0})})}^{\frac{1}{α_{1} - α_{0}}}$
Otherwise, $c^{*}$ can be obtained from (2) directly.

The estimation procedure of shapes and rates in $F$ and $G$ is similar to the Normal estimation case above, and the optimal cut-off point $c^{*}$ and corresponding estimator of Youden index $\hat{YI}$ can be obtained similarly.
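Both closed-form Gamma cut-offs are points where the two densities cross, $f (c^{*}) = g (c^{*})$, which is the first-order condition for maximizing $F (c) - G (c)$. The following sketch, with hypothetical function names and arbitrary illustrative parameter values, computes the two special cases and makes that crossing easy to verify.

```python
import math


def gamma_pdf(x, shape, rate):
    """Gamma density with the shape/rate parameterization used in the text."""
    return rate ** shape * x ** (shape - 1.0) * math.exp(-rate * x) / math.gamma(shape)


def gamma_cutoff_equal_shape(alpha, beta0, beta1):
    """c* when alpha0 = alpha1 = alpha and the rates differ."""
    return alpha * math.log(beta0 / beta1) / (beta0 - beta1)


def gamma_cutoff_equal_rate(alpha0, alpha1, beta):
    """c* when beta0 = beta1 = beta and the shapes differ."""
    ratio = math.gamma(alpha1) / math.gamma(alpha0)
    return ratio ** (1.0 / (alpha1 - alpha0)) / beta


# Illustrative values only (hypothetical).
c_shape = gamma_cutoff_equal_shape(2.0, 2.2, 0.7)
c_rate = gamma_cutoff_equal_rate(1.5, 2.0, 1.0)
```

When neither the shapes nor the rates coincide, no such expression is available and $F (c) - G (c)$ has to be maximized numerically, as the text notes.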

2.3.3. Log-normal estimation

Assume $X$ follows Log-normal distributions $F$ and $G$ with log-means $\log μ_{0}$ and $\log μ_{1}$, and log-standard deviations $\log σ_{0}$ and $\log σ_{1}$ for non-diseased and diseased populations, respectively. The optimal cut-off point $c^{*}$ and the corresponding Youden index can be obtained similarly to the Normal estimation.
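Because the Youden index is unchanged by monotone transformations, the Log-normal case can be handled on the log scale and the cut-off mapped back by exponentiation. The sketch below illustrates this with a plain grid search rather than the closed form; the function name, grid range and step size are assumptions made for illustration.

```python
import math
from statistics import NormalDist


def lognormal_youden(logmean0, logsd0, logmean1, logsd1, step=1e-3):
    """Return (c_star, YI) by maximizing F - G for log(X) on a grid."""
    F = NormalDist(logmean0, logsd0)  # log(X) in the non-diseased population
    G = NormalDist(logmean1, logsd1)  # log(X) in the diseased population
    lo = min(logmean0 - 5.0 * logsd0, logmean1 - 5.0 * logsd1)
    hi = max(logmean0 + 5.0 * logsd0, logmean1 + 5.0 * logsd1)
    n_steps = int(round((hi - lo) / step))
    best_t, best_yi = lo, float("-inf")
    for i in range(n_steps + 1):
        t = lo + i * step
        diff = F.cdf(t) - G.cdf(t)
        if diff > best_yi:
            best_t, best_yi = t, diff
    # Map the log-scale cut-off back to the original biomarker scale.
    return math.exp(best_t), best_yi


# Parameters of the Log-normal simulation scenario in Section 3.
c_star, yi = lognormal_youden(1.0, 0.3, 1.4, 1.0)
```

With the Log-normal simulation parameters from Section 3 (log-mean 1 and log-standard deviation 0.3 versus log-mean 1.4 and log-standard deviation 1), the grid search returns a Youden index that matches the reported true value of 0.4135 to about three decimal places.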

2.3.4. Nonparametric estimation

To estimate $F$ and $G$ nonparametrically, we follow the estimation procedure proposed by Zhang et al.²¹ This involves two steps: 1) Estimate the distribution functions $H (x) = Pr (X \leq x)$ and $H (x, \tilde{K} = 0) = Pr (X \leq x, \tilde{K} = 0)$ as well as the prevalence $p$ , and 2) Estimate $F$ and $G$ nonparametrically.

Based on Zhang et al.,²¹ the number of groups that are tested positive, $n_{1} = \sum_{i = 1}^{n} {\tilde{K}}_{i}$ , follows a binomial distribution with size $n$ and probability

$$
p_{1} = (1 - δ_{0}) (1 - p)^{J} + \sum_{d = 1}^{J} δ_{1}^{*} (J, d) \binom{J}{d} p^{d} (1 - p)^{J - d}
$$

Let $p_{0} = 1 - p_{1}$ and $n_{0} = n - n_{1}$. It follows that the log-likelihood function based on the group testing results is $l = n_{0} \log p_{0} + n_{1} \log p_{1}$. The MLE $\hat{p}$ of $p$ is then obtained by maximizing $l$. By the asymptotic normality of the MLE (see van der Vaart²⁵),

$$
\hat{p} \overset{approx}{\sim} N \{p, var (\hat{p})\}
$$

where

$$
var (\hat{p}) = \frac{p_{1} (1 - p_{1})}{n} \Big\{ - J (1 - δ_{0}) (1 - p)^{J - 1} + \sum_{d = 1}^{J} δ_{1}^{*} (J, d) \binom{J}{d} p^{d - 1} (1 - p)^{J - d - 1} (d - J p) \Big\}^{- 2}
$$
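The prevalence step can be sketched in a few lines. The code below is an illustration under stated assumptions, not the authors' implementation: it takes $δ_{1}^{*}$ to follow the Hung and Swallow dilution form used in Section 3, assumes $p_{1}$ is increasing in $p$ (so that maximizing $l$ reduces to solving $p_{1} (p) = n_{1} / n$ by bisection), and approximates the derivative of $p_{1}$ in the delta-method variance by a central difference. All function names are hypothetical.

```python
import math


def hung_swallow(J, d, delta1, lam=0.02):
    """Differential sensitivity delta1 * d / {d + lam * (J - d)}."""
    return delta1 * d / (d + lam * (J - d))


def positive_group_prob(p, J, delta0, delta1, lam=0.02):
    """p_1: probability that a group of size J tests positive."""
    p1 = (1.0 - delta0) * (1.0 - p) ** J
    for d in range(1, J + 1):
        p1 += (hung_swallow(J, d, delta1, lam) * math.comb(J, d)
               * p ** d * (1.0 - p) ** (J - d))
    return p1


def prevalence_mle(n1, n, J, delta0, delta1, lam=0.02, tol=1e-12):
    """Solve p_1(p) = n1 / n by bisection (assumes p_1 increasing in p)."""
    target = n1 / n
    lo, hi = 0.0, 1.0
    if target <= positive_group_prob(lo, J, delta0, delta1, lam):
        return lo
    if target >= positive_group_prob(hi, J, delta0, delta1, lam):
        return hi
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if positive_group_prob(mid, J, delta0, delta1, lam) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def prevalence_variance(p, n, J, delta0, delta1, lam=0.02, step=1e-6):
    """Delta-method variance p_1 (1 - p_1) / n divided by (d p_1 / d p)^2."""
    p1 = positive_group_prob(p, J, delta0, delta1, lam)
    slope = (positive_group_prob(p + step, J, delta0, delta1, lam)
             - positive_group_prob(p - step, J, delta0, delta1, lam)) / (2.0 * step)
    return p1 * (1.0 - p1) / n / slope ** 2
```

For individual testing with a perfect assay ($J = 1$, $δ_{0} = δ_{1} = 1$), `positive_group_prob` reduces to $p$ and the variance to the familiar binomial form $p (1 - p) / n$.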

Consequently, we have

$$
\tilde{H} (x) = \frac{1}{nJ} \sum_{i = 1}^{n} \sum_{j = 1}^{J} I (X_{ij} \leq x)
$$

$$
\tilde{H} (x, \tilde{K} = 0) = \frac{1}{nJ} \sum_{i = 1}^{n} I (\tilde{K}_{i} = 0) \Big\{ \sum_{j = 1}^{J} I (X_{ij} \leq x) \Big\}
$$

The final nonparametric estimators of $F (x)$ and $G (x)$ are

$$
\tilde{F} (x) = \frac{\hat{p} \tilde{H} (x, \tilde{K} = 0) - \hat{γ}_{1} \tilde{H} (x)}{\hat{γ}_{2} \hat{p} - \hat{γ}_{1} (1 - \hat{p})}
$$

$$
\tilde{G} (x) = \frac{\hat{γ}_{2} \tilde{H} (x) - (1 - \hat{p}) \tilde{H} (x, \tilde{K} = 0)}{\hat{γ}_{2} \hat{p} - \hat{γ}_{1} (1 - \hat{p})}
$$

where

$$
\hat{γ}_{1} = \sum_{d = 0}^{J - 1} \{1 - δ_{1}^{*} (J, d + 1)\} \binom{J - 1}{d} \hat{p}^{d + 1} (1 - \hat{p})^{J - 1 - d}
$$

$$
\hat{γ}_{2} = δ_{0} (1 - \hat{p})^{J} + \sum_{d = 1}^{J - 1} \{1 - δ_{1}^{*} (J, d)\} \binom{J - 1}{d} \hat{p}^{d} (1 - \hat{p})^{J - d}
$$

These estimators of $F$ and $G$ are step functions. We estimate the optimal cut-off point $c^{*}$ by locating the observed value of $X$ that maximizes $\tilde{F} (x) - \tilde{G} (x)$ and then obtain the corresponding Youden index estimate.
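The second step can be sketched as follows, taking $\hat{p}$ as given. This is an illustrative Python outline rather than the authors' code: the function names are hypothetical, equal group sizes are assumed, and $δ_{1}^{*}$ is passed in as a function of $(J, d)$. The search evaluates $\tilde{F} - \tilde{G}$ only at the observed biomarker values, since both estimators are step functions that jump there.

```python
import math


def misclassification_weights(p, J, delta0, delta1_star):
    """Return (gamma_1, gamma_2) evaluated at prevalence p."""
    gamma1 = sum(
        (1.0 - delta1_star(J, d + 1)) * math.comb(J - 1, d)
        * p ** (d + 1) * (1.0 - p) ** (J - 1 - d)
        for d in range(J)
    )
    gamma2 = delta0 * (1.0 - p) ** J + sum(
        (1.0 - delta1_star(J, d)) * math.comb(J - 1, d)
        * p ** d * (1.0 - p) ** (J - d)
        for d in range(1, J)
    )
    return gamma1, gamma2


def nonparametric_youden(groups, results, p_hat, delta0, delta1_star):
    """Return (c_star, YI) maximizing F_tilde - G_tilde over observed X."""
    n, J = len(groups), len(groups[0])
    gamma1, gamma2 = misclassification_weights(p_hat, J, delta0, delta1_star)
    denom = gamma2 * p_hat - gamma1 * (1.0 - p_hat)
    # Pool all subjects, carrying along their group's test result.
    pooled = sorted((x, k) for xs, k in zip(groups, results) for x in xs)
    total = n * J
    count_all = count_negative = 0
    best_c, best_yi = None, float("-inf")
    for idx, (x, k) in enumerate(pooled):
        count_all += 1
        count_negative += (k == 0)
        # For tied values, evaluate only after the last tied observation.
        if idx + 1 < total and pooled[idx + 1][0] == x:
            continue
        H = count_all / total            # H_tilde(x)
        H0 = count_negative / total      # H_tilde(x, K = 0)
        F = (p_hat * H0 - gamma1 * H) / denom
        G = (gamma2 * H - (1.0 - p_hat) * H0) / denom
        if F - G > best_yi:
            best_c, best_yi = x, F - G
    return best_c, best_yi
```

For $J = 1$ with a perfect test, $\hat{γ}_{1} = 0$ and $\hat{γ}_{2} = 1 - \hat{p}$, so $\tilde{F}$ and $\tilde{G}$ reduce to the empirical distribution functions of the test-negative and test-positive subjects. In finite samples $\tilde{F}$ and $\tilde{G}$ are linear combinations of empirical quantities and are not guaranteed to be monotone or to stay within $[0, 1]$.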

3. Simulations

We conducted extensive simulation studies to evaluate the performance of our proposed approach. The total number of subjects is $N = 12000$ and we generated the true disease status $\{D_{ij} : 1 \leq i \leq n; 1 \leq j \leq J\}$ for all subjects from a Bernoulli distribution with probability $p$, where the prevalence $p$ was set to 0.02 or 0.03. The group size $J$ was chosen from {1, 2, 5}, with $J = 1$ corresponding to individual testing. The sensitivity $δ_{1}$ and specificity $δ_{0}$ were selected from {0.90, 0.95, 1.00}. We specified $δ_{1}^{*}$ using the model of Hung and Swallow²³ as $δ_{1}^{*} = δ_{1} d / \{d + λ (J - d)\}$, where $d$ represents the number of diseased individuals in a group and $λ = 0.02$. For the observed disease status $\tilde{K}$, we randomly divided the $N$ subjects into $n$ groups of size $J$ and generated the group-tested result $\tilde{K}$ from a Bernoulli distribution with probability $1 - δ_{0}$ for groups with all $D = 0$ and $δ_{1}^{*}$ for groups with at least one $D = 1$.

We considered three data generating scenarios for biomarker $X$ : (1) Normal data, where $X$ has a Normal distribution $N (μ, σ)$ , with mean $μ = 0$ and standard deviation $σ = 1$ in non-diseased population and $μ = 3$ and $σ = 2.5$ in diseased population; (2) Gamma data, where $X$ has a Gamma distribution $G (α, β)$ , with shape $α_{0} = 1.5$ and rate $β_{0} = 2.2$ in non-diseased population, and $α_{1} = 2$ and $β_{1} = 0.7$ in diseased population; (3) Log-normal data, where $X$ follows a Log-normal distribution, with log-mean 1 and log-standard deviation 0.3 in non-diseased population, and log-mean 1.4 and log-standard deviation 1 in diseased population. The true values of Youden index in these three scenarios are 0.6590, 0.6430 and 0.4135, respectively.
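The data-generating mechanism for the Normal scenario can be sketched as below. This is a minimal illustration, not the simulation code used for the tables: the function name and the seeding are assumptions, and $N$ is assumed to be divisible by $J$.

```python
import random


def simulate_group_data(N, J, p, delta0, delta1, lam=0.02, seed=None,
                        mu0=0.0, sd0=1.0, mu1=3.0, sd1=2.5):
    """Simulate biomarker values by group and one test result per group."""
    rng = random.Random(seed)
    # True individual disease statuses and Normal biomarker values.
    D = [1 if rng.random() < p else 0 for _ in range(N)]
    X = [rng.gauss(mu1, sd1) if d else rng.gauss(mu0, sd0) for d in D]
    # Random allocation of subjects to groups of size J.
    order = list(range(N))
    rng.shuffle(order)
    groups, results = [], []
    for start in range(0, N, J):
        members = order[start:start + J]
        d = sum(D[i] for i in members)
        if d == 0:
            prob_positive = 1.0 - delta0                         # false positive
        else:
            prob_positive = delta1 * d / (d + lam * (J - d))     # delta_1^*(J, d)
        results.append(1 if rng.random() < prob_positive else 0)
        groups.append([X[i] for i in members])
    return groups, results


groups, results = simulate_group_data(1200, 5, 0.02, 0.95, 0.95, seed=1)
```

Only `groups` and `results` would be passed on to the estimation procedures; the individual statuses `D` are used solely to generate the group-level outcomes, mirroring the setting in which individual disease statuses are unobserved.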

We examined the proposed four estimation procedures (Normal estimation, Gamma estimation, Log-normal estimation, and nonparametric estimation) across varying group sizes $(J = 1, 2, 5)$ , different prevalence rates $(p = 0.02, 0.03)$ , and misclassification rates $(1 - δ_{0} = 1 - δ_{1} = 0, 0.05, 0.1)$ . Our assessments were conducted using widely used criteria, including bias, standard error (SE), root mean square error (RMSE), 95% coverage probability (CP), and the average length of confidence intervals (ACIL). The bootstrap method was employed to compute CP and ACIL. In total, 300 simulated data sets were generated, and within each simulation, 500 bootstrap replicates were created.

We first report simulation results under the Normal data scenario. In this scenario, Gamma and Log-normal estimation were not considered, as $X$ can take both positive and negative values. Table 1 shows the performance of the prevalence estimator. The estimates under the Normal estimation method are all close to the true values, with coverage probabilities close to the nominal level. Across the board, the estimator exhibits increased statistical efficiency (indicated by smaller SE) as the misclassification error ($1 - δ_{0}$ and $1 - δ_{1}$) decreases. For example, under Normal estimation, when prevalence is $p = 0.02$, misclassification rates are both 0.1 $(δ_{1} = δ_{0} = 0.9)$, and the group size is $J = 5$, the relative efficiency of the estimator is about 0.85 (0.1983/0.2336) in comparison to $J = 1$. We found similar performance under nonparametric estimation. For example, when $p = 0.03$, $δ_{1} = δ_{0} = 0.9$, and $J = 5$, the relative efficiency of the estimator is 0.66 (0.2480/0.3781). In addition, under the same setting, the RMSE under Normal estimation is smaller than that under nonparametric estimation. For example, when $p = 0.02$, $δ_{1} = δ_{0} = 0.95$ and $J = 2$, the RMSE is 0.1863 under Normal estimation, which is less than the 0.2083 under nonparametric estimation.

Table 1.

Simulation results with Normal data for the prevalence estimator based on the group and individual testing approaches: estimate (Est), bias (Bias), standard error (SE), root mean square error (RMSE), coverage probability (CP) and average confidence interval length (ACIL) of the estimator for the biomarker. Entries of Est, Bias, SE are multiplied by 100 for better presentation. $p$ is the prevalence, $δ_{0}$ and $δ_{1}$ are specificity and sensitivity, and $J$ is the size of each group.

		Normal					Nonparametric
$δ_{0} = δ_{1}$	$J$	Est	Bias	SE	RMSE	CP(ACIL)	Est	Bias	SE	RMSE	CP(ACIL)
		$p = 0.02$
0.90	1	1.9645	−0.0355	0.2336	0.2363	93.00%(0.0091)	1.9997	−0.0003	0.3620	0.3620	95.67%(0.0143)
	2	1.9841	−0.0159	0.2214	0.2219	94.33%(0.0086)	2.0087	0.0087	0.2899	0.2901	93.33%(0.0111)
	5	1.9886	−0.0114	0.1983	0.1986	95.00%(0.0078)	1.9951	−0.0049	0.2227	0.2227	96.33%(0.0089)
0.95	1	1.9717	−0.0283	0.1910	0.1931	95.00%(0.0076)	1.9908	−0.0092	0.2434	0.2435	97.33%(0.0100)
	2	1.9881	−0.0119	0.1859	0.1863	94.33%(0.0071)	2.0005	0.0005	0.2083	0.2083	94.33%(0.0082)
	5	1.9859	−0.0141	0.1689	0.1695	96.00%(0.0066)	1.9883	−0.0117	0.1734	0.1738	96.33%(0.0070)
1.00	1	2.0045	0.0045	0.1257	0.1258	96.33%(0.0050)	2.0043	0.0043	0.1263	0.1263	96.33%(0.0050)
	2	2.0058	0.0058	0.1281	0.1283	96.67%(0.0051)	2.0057	0.0057	0.1278	0.1279	96.33%(0.0051)
	5	2.0056	0.0056	0.1308	0.1309	96.00%(0.0052)	2.0065	0.0065	0.1305	0.1307	96.33%(0.0053)
		$p = 0.03$
0.90	1	2.9775	−0.0225	0.2636	0.2646	93.67%(0.0102)	3.0117	0.0117	0.3781	0.3783	94.67%(0.0147)
	2	2.9921	−0.0079	0.2460	0.2461	95.00%(0.0095)	3.0154	0.0154	0.3018	0.3022	94.67%(0.0118)
	5	2.9881	−0.0119	0.2252	0.2255	95.33%(0.0089)	3.0038	0.0038	0.2480	0.2481	97.00%(0.0098)
0.95	1	2.9832	−0.0168	0.2115	0.2122	96.67%(0.0086)	2.9964	−0.0036	0.2581	0.2581	96.33%(0.0106)
	2	2.9987	−0.0013	0.2041	0.2041	95.00%(0.0081)	3.0087	0.0087	0.2215	0.2217	95.00%(0.0089)
	5	2.9870	−0.0130	0.1909	0.1914	95.33%(0.0076)	2.9952	−0.0048	0.2001	0.2002	96.00%(0.0081)
1.00	1	3.0107	0.0107	0.1556	0.1559	94.67%(0.0061)	3.0108	0.0108	0.1555	0.1558	94.67%(0.0061)
	2	3.0130	0.0130	0.1585	0.1591	94.33%(0.0062)	3.0134	0.0134	0.1595	0.1601	95.33%(0.0062)
	5	3.0154	0.0154	0.1613	0.1620	94.33%(0.0064)	3.0157	0.0157	0.1649	0.1656	96.00%(0.0065)

Table 2 summarizes the performance of the Youden index estimator under the Normal data scenario. As expected, the Youden index estimator under group testing has better statistical efficiency than that under individual testing. For example, under Normal estimation, when prevalence is $p = 0.02$, misclassification rates are both equal to 0.05 $(δ_{1} = δ_{0} = 0.95)$, and the group size is $J = 5$, the relative efficiency of the estimator is about 0.86 (0.0387/0.0448) in comparison to $J = 1$. Results are similar using the nonparametric estimation method. For example, when prevalence is $p = 0.03$, $δ_{1} = δ_{0} = 0.9$, and $J = 2$, the relative efficiency of the estimator is 0.76 (0.0633/0.0831). However, when there is no misclassification $(δ_{0} = δ_{1} = 1.00)$, the superiority of group testing disappears, with larger SE and RMSE compared to individual testing. In addition, under the same setting, the RMSE under Normal estimation is smaller than that under nonparametric estimation. For example, when $p = 0.03$, $δ_{1} = δ_{0} = 0.95$ and $J = 5$, the RMSE is 0.0290 under Normal estimation, which is less than the 0.0556 under nonparametric estimation.

Table 2.

Simulation results with Normal data for the Youden index estimator based on the group and individual testing approaches: estimate (Est), bias (Bias), standard error (SE), root mean square error (RMSE), coverage probability (CP) and average confidence interval length (ACIL) of the estimator for the biomarker. $p$ is the prevalence, $δ_{0}$ and $δ_{1}$ are specificity and sensitivity, and $J$ is the size of each group.

		Normal					Nonparametric
$δ_{0} = δ_{1}$	$J$	Est	Bias	SE	RMSE	CP(ACIL)	Est	Bias	SE	RMSE	CP(ACIL)
		$p = 0.02$
0.90	1	0.6733	0.0143	0.0565	0.0583	91.00%(0.2125)	0.6995	0.0405	0.1203	0.1270	87.33%(0.4282)
	2	0.6681	0.0091	0.0487	0.0495	94.00%(0.1984)	0.6911	0.0321	0.0967	0.1019	92.33%(0.3599)
	5	0.6658	0.0068	0.0459	0.0465	93.33%(0.1770)	0.6967	0.0377	0.0916	0.0991	89.33%(0.3277)
0.95	1	0.6701	0.0111	0.0448	0.0462	94.67%(0.1779)	0.6876	0.0287	0.0843	0.0891	93.00%(0.3230)
	2	0.6660	0.0070	0.0421	0.0427	95.33%(0.1629)	0.6850	0.0260	0.0683	0.0730	94.67%(0.2674)
	5	0.6642	0.0052	0.0387	0.0391	94.67%(0.1465)	0.6887	0.0297	0.0671	0.0734	93.00%(0.2567)
1.00	1	0.6590	0.0000	0.0234	0.0234	95.00%(0.0914)	0.6666	0.0076	0.0289	0.0299	94.00%(0.1091)
	2	0.6590	0.0000	0.0250	0.0250	95.67%(0.0964)	0.6708	0.0118	0.0328	0.0348	92.33%(0.1289)
	5	0.6595	0.0005	0.0281	0.0281	93.33%(0.1061)	0.6801	0.0211	0.0440	0.0488	93.00%(0.1780)
		$p = 0.03$
0.90	1	0.6680	0.0090	0.0382	0.0392	94.67%(0.1576)	0.6848	0.0258	0.0831	0.0870	95.33%(0.3251)
	2	0.6653	0.0063	0.0365	0.0370	93.67%(0.1426)	0.6826	0.0236	0.0633	0.0676	96.33%(0.2682)
	5	0.6653	0.0063	0.0335	0.0341	93.67%(0.1306)	0.6876	0.0286	0.0616	0.0679	91.00%(0.2501)
0.95	1	0.6656	0.0066	0.0317	0.0324	95.67%(0.1291)	0.6773	0.0183	0.0556	0.0585	94.33%(0.2273)
	2	0.6637	0.0047	0.0298	0.0302	95.67%(0.1174)	0.6785	0.0195	0.0472	0.0511	94.67%(0.1926)
	5	0.6649	0.0059	0.0284	0.0290	93.67%(0.1109)	0.6828	0.0238	0.0503	0.0556	93.67%(0.1981)
1.00	1	0.6600	0.0010	0.0193	0.0193	94.00%(0.0745)	0.6661	0.0071	0.0238	0.0248	92.67%(0.0892)
	2	0.6599	0.0009	0.0203	0.0203	94.67%(0.0789)	0.6690	0.0100	0.0280	0.0297	92.33%(0.1065)
	5	0.6601	0.0011	0.0232	0.0232	93.67%(0.0867)	0.6775	0.0185	0.0395	0.0437	92.33%(0.1492)

In addition, results for the prevalence and Youden index estimators with Gamma and Log-normal data are presented in Appendix A of the supplemental material. We also conducted additional simulations with a smaller sample size $(N = 8000)$, higher prevalences $(p = 0.05, 0.1)$, and an expanded set of equal $(δ_{1} = δ_{0} = 0.7, 0.8, 0.9, 0.95, 1)$ and unequal $(δ_{1} = 0.8, δ_{0} = 0.7)$ sensitivity and specificity values; these are reported in Appendix B of the supplemental material. Finally, a simulation in which the sensitivity and specificity are mis-specified is presented in Appendix C of the supplemental material.

In summary, the simulations demonstrate the following findings: (1) all proposed estimation methods based on group testing can be superior (in both accuracy and efficiency) to those based on individual testing; (2) as misclassification rates decrease, the proposed estimator becomes more efficient; (3) as expected, using the parametric distribution that matches the generating distribution (e.g. Normal estimation for Normal data) results in the best estimators, and otherwise nonparametric estimation performs best; and (4) when the sample size decreases and the prevalence increases, group testing has lower precision than individual testing. This suggests that, if the sample size is small, group testing will perform well only when the prevalence is relatively low; see Liu et al.²⁶

4. Application

We applied our proposed method to genital chlamydia infections and utilized data from the National Health and Nutrition Examination Survey (NHANES), a comprehensive population study designed to assess the health and nutritional status of individuals throughout the United States. Further details can be found at https://www.cdc.gov/nchs/nhanes/index.htm.

In the NHANES study, urine samples were collected from individuals between the ages of 18 and 39, and tests for genital chlamydia infections were conducted using the DNA strand displacement amplification method. The publicly available dataset includes the assay results of eligible participants. Chlamydia, which is caused by Chlamydia trachomatis, is a common sexually transmitted disease that has the potential to influence the levels of monocyte and erythrocyte sedimentation rate, see Łój et al.,²⁷ Park et al.²⁸ In our analysis, we considered using monocyte as a biomarker for detecting chlamydia infections.

We gathered data on chlamydia and monocyte from six consecutive and independent surveys conducted as part of NHANES, spanning the years 1999–2000, 2001–2002, 2003–2004, 2005–2006, 2007–2008, and 2009–2010. To account for the potential impact of oversampling and the intricate survey design, we implemented a resampling technique on the data from each two-year survey dataset. This resampling process involved replacement and utilized sampling weights proportional to the probabilities, while maintaining the original dataset's sample size. Following this, we merged these resampled datasets to construct a comprehensive sample.

After removing those with missing values of chlamydia and monocyte, $N = 12426$ independent observations of $(X, K)^{T}$ were included in our final analysis. Among these observations, 220 subjects tested positive for chlamydia. It is worth noting that the NHANES study did not employ group testing to detect chlamydia, so we present a hypothetical scenario using group-tested data. This approach is justifiable because the chlamydia test and the biomarker relied on different specimens. We independently generated group-tested outcomes for disease presence. To achieve this, we treated the testing results in the dataset as the true disease statuses of the subjects and randomly assigned the chlamydia test specimens to groups of size $J$. The values of the biomarker for each subject remained unchanged.

Since we do not have a reasonable distributional assumption for the monocyte data, we chose to use the nonparametric estimation method. For $J = 1, 2, 5$, we estimated the prevalence and Youden index as well as their standard errors and 95% confidence intervals (95% CI) based on individual and group-tested results. We set a specificity of $δ_{0} = 0.99$ and a sensitivity of $δ_{1} = 0.9$ (see Haugland et al.²⁹) and assumed that $δ_{1}^{*} (J, d) = δ_{1} d / \{d + λ (J - d)\}$ with $λ = 0.02$.

The prevalence and Youden index estimators for monocyte, derived from both individual and group-tested results, are presented in Table 3. The standard errors of estimators are computed using 1000 bootstrap replicates. The table reveals that the prevalence and Youden index estimates achieve better efficiency when $J = 5$ , with a relative efficiency of 0.88 (0.1331/0.1506) in prevalence and 0.94 (3.8543/4.1126) in Youden index, compared to the individual-tested results $(J = 1)$ .

Table 3.

Nonparametric estimators for the chlamydia data: estimates (Est), standard errors (SE), and 95% confidence intervals (95% CI) of the prevalence $p$ and the Youden index for the biomarker monocyte based on individual $(J = 1)$ and group testing $(J = 2, 5)$ approaches. Entries of SE are multiplied by 100 for better presentation. $J$ is the size of each group.

	$p$			Youden index
$J$	Est	SE	95% CI	Est	SE	95% CI
1	0.0155	0.1506	(0.0125, 0.0184)	0.0871	4.1126	(0.0065, 0.1677)
2	0.0166	0.1479	(0.0137, 0.0195)	0.1142	4.9292	(0.0176, 0.2109)
5	0.0169	0.1331	(0.0143, 0.0195)	0.0462	3.8543	(0, 0.1217)

5. Summary and discussion

The Youden index is an important measure of the accuracy of a diagnostic biomarker. In the present paper, we considered the problem of estimating the Youden index of a continuous biomarker when only group-based test results on the disease status are available, a design adopted to save cost and/or protect patients' confidentiality. The biomarker values are observed at the individual level, which is feasible particularly when the cost of measuring the biomarker is relatively low.

We proposed both parametric and nonparametric estimation procedures and observed promising performance through simulations. As expected, when the distributional assumptions of the biomarker are reasonable, estimators under these distributions perform well. The Youden index remains unchanged under monotonic transformation of the data. If the data approximately follow a Normal or Gamma distribution after some monotonic transformation (e.g. Box-Cox transformation), then our methods can be applied. In situations where we lack reasonable distributional insight or when we desire flexible assumptions, the nonparametric approach provides an attractive estimation procedure. Compared with individual testing, a group testing strategy in general not only reduces the cost of disease screening but also provides more efficient estimation of the Youden index when the disease's prevalence is low and the screening test is imperfect. Thus, group testing is an appealing alternative to conventional individual testing.

The conclusion that group-tested data can lead to more precise estimation of the Youden index is somewhat counter-intuitive, but it can be explained. If there is no misclassification error, individual testing provides the most information on the disease and is therefore always more precise than group testing. However, when there is misclassification error, individual testing requires more tests to be performed and is therefore likely to generate more testing errors (i.e. false positives and false negatives) than group testing, which requires fewer tests. This can reduce the precision of estimation under individual testing. When comparing two different group sizes, the larger group size requires an even smaller prevalence before group testing outperforms individual testing.
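
The mechanism can be illustrated with the simpler problem of estimating the prevalence $p$ from one-stage group tests. The Python sketch below makes strong simplifying assumptions that are not taken from this paper: the test has a fixed sensitivity `se` and specificity `sp` that do not depend on group size (no dilution, non-differential error), and the variance is the standard delta-method approximation for the prevalence estimator, not for the Youden index. All names and numerical values are illustrative.

```python
def prevalence_variance(p, group_size, n_individuals, se, sp):
    """Approximate variance of the prevalence estimator from one-stage group testing.

    A group tests positive with probability
        pi = se * (1 - (1 - p)**J) + (1 - sp) * (1 - p)**J,
    and the delta method gives Var(p_hat) ~ pi * (1 - pi) / (m * (d pi / d p)**2),
    where m = n / J is the number of groups tested.
    """
    n_groups = n_individuals / group_size
    prob_no_case = (1.0 - p) ** group_size
    prob_pos = se * (1.0 - prob_no_case) + (1.0 - sp) * prob_no_case
    slope = (se + sp - 1.0) * group_size * (1.0 - p) ** (group_size - 1)
    return prob_pos * (1.0 - prob_pos) / (n_groups * slope ** 2)

n = 5000   # number of individuals (illustrative)
p = 0.01   # low prevalence (illustrative)

for se, sp in [(1.0, 1.0), (0.95, 0.95)]:
    for group_size in (1, 2, 5):
        var = prevalence_variance(p, group_size, n, se, sp)
        print(f"se={se}, sp={sp}, J={group_size}: variance ~ {var:.3e}")
```

Under these assumptions, the perfect test favours individual testing for a fixed number of individuals, whereas with an imperfect test at low prevalence the pooled designs yield a smaller approximate variance, in line with the explanation above.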

This paper considered only three pre-determined group sizes, namely $J = 1, 2, 5$. Exploring the optimal group size could be an interesting avenue for future research. Moreover, the current paper focused on a single biomarker. Incorporating multiple biomarkers in the estimation of the Youden index could be beneficial and is another promising direction for future work. Finally, unequal (random) group sizes would be another interesting extension.

Supplementary Material

Supplemental materials for this article are available online.

Funding

The author(s) received no financial support for the research, authorship, and/or publication of this article.

Footnotes

Declaration of conflicting interests

The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

References

1. Pastor L, Urrea V, Carrillo J, et al. Dynamics of CD4 and CD8 T-cell subsets and inflammatory biomarkers during early and chronic HIV infection in Mozambican adults. Front Immunol 2018; 8. DOI: 10.3389/fimmu.2017.01925.
2. Lyons TJ and Basu A. Biomarkers in diabetes: hemoglobin A1c, vascular and tissue markers. Transl Res 2012; 159: 303–312.
3. Pepe MS. Statistical evaluation of medical tests for classification and prediction. Oxford: Oxford University Press, 2003.
4. Nakas C, Bantis L and Gatsonis C. ROC analysis for classification and prediction in practice. Chicago: Chapman & Hall, 2023.
5. Fluss R, Faraggi D and Reiser B. Estimation of the Youden index and its associated cutoff point. Biom J 2005; 4: 458–472.
6. Molanes-López EM and Letón E. Inference of the Youden index and associated threshold using empirical likelihood for quantiles. Stat Med 2011; 30: 2467–2480.
7. Yin J and Tian L. Joint confidence region estimation for area under ROC curve and Youden index. Stat Med 2013; 33: 985–1000.
8. Yin J and Tian L. Joint inference about sensitivity and specificity at the optimal cut-off point associated with Youden index. Comput Stat Data Anal 2014; 77: 1–13.
9. Yin J, Samawi H and Linder D. Improved nonparametric estimation of the optimal diagnostic cut-off point associated with the Youden index under different sampling schemes. Biom J 2016; 58: 915–934.
10. Dorfman R. The detection of defective members of large populations. Ann Math Stat 1943; 14: 436–440.
11. Hepworth G. Exact confidence intervals for proportions estimated by group testing. Biometrics 1996; 52: 1134–1146.
12. Hughes-Oliver JM and Rosenberger WF. Efficient estimation of the prevalence of multiple rare traits. Biometrika 2000; 87: 315–327.
13. McCann MH and Tebbs JM. Pairwise comparisons for proportions estimated by pooled testing. J Stat Plan Inference 2007; 137: 1278–1290.
14. Turner DW, Stamey JD and Young DM. Classic group testing with cost for grouping and testing. Comput Math Appl 2009; 58: 1930–1935.
15. Warasi M, Tebbs J, McMahan C, et al. Estimating the prevalence of multiple diseases from two-stage hierarchical pooling. Stat Med 2016; 35: 3851–3864.
16. Malinovsky Y and Albert PS. A note on the minimax solution for the two-stage group testing problem. Am Stat 2015; 69: 45–52.
17. Malinovsky Y and Albert PS. Revisiting nested group testing procedures: new results, comparisons, and robustness. Am Stat 2019; 73: 117–125.
18. Malinovsky Y, Haber G and Albert PS. An optimal design for hierarchical generalized group testing. J R Stat Soc Ser C 2020; 69: 607–621.
19. Mokalled S, McMahan C, Tebbs J, et al. Incorporating the dilution effect in group testing regression. Stat Med 2021; 40: 2540–2555.
20. Bilder C, Tebbs J and McMahan C. Discussion on “is group testing ready for prime-time in disease identification?”. Stat Med 2021; 40: 3881–3886.
21. Zhang W, Liu A, Li Q, et al. Nonparametric estimation of distributions and diagnostic accuracy based on group-tested results with differential misclassification. Biometrics 2020; 76: 1147–1156.
22. Hwang FK. Group testing with a dilution effect. Biometrika 1976; 63: 671–680.
23. Hung M and Swallow W. Robustness of group testing in the estimation of proportions. Biometrics 1999; 55: 231–237.
24. Haber G, Malinovsky Y and Albert PS. Is group testing ready for prime-time in disease identification? Stat Med 2021; 40: 3865–3880.
25. van der Vaart AW. Asymptotic statistics. Cambridge: Cambridge University Press, 1998.
26. Liu C, Liu A, Zhang Z, et al. Optimality of group testing in the presence of misclassification. Biometrika 2012; 99: 245–251.
27. Łój B, Brodowska A, Ciećwież S, et al. The role of serological testing for Chlamydia trachomatis in differential diagnosis of pelvic pain. Ann Agric Environ Med 2016; 23: 506–510.
28. Park ST, Lee SW, Kim MJ, et al. Clinical characteristics of genital Chlamydia infection in pelvic inflammatory disease. BMC Womens Health 2017; 17: 5.
29. Haugland S, Thune T, Fosse B, et al. Comparing urine samples and cervical swabs for Chlamydia testing in a female population by means of Strand Displacement Assay (SDA). BMC Womens Health 2010; 10: 9.

