Estimation in regret-regression using quadratic inference functions with ridge estimator

Nur Raihan Abdul Jalil; Nur Anisah Mohamed; Rossita Mohamad Yunus

doi:10.1371/journal.pone.0271542

PLoS One. 2022 Jul 21;17(7):e0271542.

Editor: Muhammad Amin

PMCID: PMC9302796 PMID: 35862316

Abstract

In this paper, we propose a new method for estimating optimal dynamic treatment regimes (ODTR). The quadratic inference functions in myopic regret-regression (QIF-MRr) can be used to estimate the parameters of the mean response at each visit, conditional on previous states and actions. Singularity issues may arise during computation when estimating the ODTR parameters using QIF-MRr because of multicollinearity. Hence, a ridge penalty is introduced, giving the rQIF-MRr, to tackle these issues. A simulation study and an application to an anticoagulation dataset were conducted to investigate the model’s performance in parameter estimation. The results show that estimates obtained using rQIF-MRr are more efficient than those from QIF-MRr.

Introduction

A dynamic treatment regime (DTR) is a branch of personalized medicine that uses information from the patient to minimize health problems. In reality, the treatment response differs from patient to patient, which motivated the development of the DTR and personalized medicine. The advantages of personalized medicine include the following: a reduction in the total cost of health care; an alternative to intensive health care for the patient through an optimal treatment decision; and increased compliance with and adherence to treatment [1].

Throughout the years, researchers have shown an interest in DTR. For example, [2] described anticoagulant data to obtain an optimal strategy for warfarin-treated patients who are at risk of thrombosis (i.e., abnormal blood clotting). Another example of DTR is the estimation by [3] of the optimal time for an asymptomatic HIV-infected subject to start highly active antiretroviral therapy (HAART). Others include [4–7].

A DTR, also known as an adaptive strategy, adaptive intervention, or treatment policy, is a multi-stage decision rule of personalized medicine. It defines a set of decision rules that determine the treatment of an individual based on their health condition and treatment history. The term “dynamic” indicates that the treatment varies with the patient’s current state and previous treatment decisions. The purpose of DTR is to optimize the mean outcome for each patient; the regime that achieves this is called the optimal dynamic treatment regime (ODTR).

[8] proposed a regret function to estimate ODTR semi-parametrically and parameterized the optimal rules using the iterative minimization of regrets (IMOR) method. Following the development of ODTR, [9] proposed a regret-regression method where the ODTR was estimated by incorporating the regret function into regression modeling. A doubly robust version of the regret-regression by [10] was claimed to be equivalent to a reduced form of the efficient g-estimation method [11, 12].

[13] introduced the myopic regret-regression (MRr), which is a short-term strategy of the regret-regression. In a short-term strategy, the ODTR is estimated at each time point. The MRr provides good estimates when the outcome is measured through time. However, in the previous works mentioned above, no attention has been given to the within-subject correlation. The data used to estimate ODTR are longitudinal, so the observations of one subject are dependent on each other over time. Many studies have made use of longitudinal data, for example [14–16].

To estimate ODTR for correlated data, [13] proposed a method called QIF-MRr, which integrates the MRr into the quadratic inference functions (QIF). The QIF method was proposed by [17] as an extension of the generalized estimating equations (GEE) [18], which have been widely used in the analysis of longitudinal and correlated data [19–22]. Hence, the QIF-MRr is able to provide unbiased and efficient estimates even with a misspecified working correlation structure.

Earlier, [23] proposed penalized estimating equations, which considered a bridge penalty, and applied them to the GEE method. Meanwhile, [24] proposed the penalized QIF by combining the penalized spline and the QIF method. In this paper, we propose an estimation strategy called the ridge quadratic inference function for myopic regret-regression (rQIF-MRr) to estimate ODTR for correlated data. The ridge penalty is also known as the L₂ penalty [25]. We choose the ridge penalty over other penalties because it can control inflation and general instability related to the least squares estimates [26] and can also solve singularity issues [27].

The paper is organized as follows: We first define the notation and assumptions needed throughout the paper. Then, we propose the rQIF-MRr for estimating ODTR. The proposed method is illustrated using a simulation and an application to an anticoagulant dataset.

Methods

In this section, we propose a ridge quadratic inference function for myopic regret-regression (rQIF-MRr) to estimate the parameters of the ODTR. We follow the notation of [9]. Suppose,

j = 1, 2, …, K indexes the visits, where K is the final visit for subject i.
n is the sample size, with subjects i = 1, 2, …, n.
S_j represents the current state of subject i at visit j.
A_j represents the action or treatment decision made for subject i at visit j.
${\bar{S}}_{j} = (S_{1}, S_{2}, \dots, S_{j})$ is the cumulative information of the states for subject i measured from the first to the jth visit.
${\bar{A}}_{j} = (A_{1}, A_{2}, \dots, A_{j})$ is the cumulative actions given to subject i from the first to the jth visit based on the previous history $({\bar{S}}_{j - 1}, {\bar{A}}_{j - 1})$ .
The action given at visit j is determined by the state of the subject at visit j together with the actions given at the previous visits, ${\bar{A}}_{j - 1}$ .
The states and actions $({\bar{S}}_{j}, {\bar{A}}_{j})$ are assumed to be independent between subjects and dependent within a subject.
The response Y_j is measured at each visit j.
From the notion of the potential outcome or counterfactual [28, 29], a_j is a set of all possible actions that can be given at visit j.
${\bar{S}}_{j} ({\bar{a}}_{j - 1}) = (S_{1}, S_{2} (a_{1}), \dots, S_{j} ({\bar{a}}_{j - 1}))$ is the potential state history under the possible action ${\bar{a}}_{j - 1}$ .
$Y ({\bar{a}}_{K})$ is the potential outcome under the possible action ${\bar{a}}_{K}$ .
$d_{j}^{opt}$ is the optimal dynamic treatment regime, which optimizes the expected value of the outcome Y_j.
$E (Y ({\underline{d}}_{j}^{opt}) | {\bar{S}}_{j}, {\bar{A}}_{j - 1})$ is the expected value of the potential outcome, or counterfactual final response, when the optimal decision rules are followed from visit j onwards.
$E (Y (a_{j}, {\underline{d}}_{j + 1}^{opt}) | {\bar{S}}_{j}, {\bar{A}}_{j - 1})$ is the expected value of the potential outcome if action a_j is chosen at visit j and the optimal decision rules are followed subsequently.

Three assumptions are made in this study: consistency, no unmeasured confounders, and positivity. The consistency assumption states that the observed outcome Y is equal to the potential outcome $Y ({\bar{a}}_{K})$ and the observed state history ${\bar{S}}_{K}$ is equal to the potential state history ${\bar{S}}_{K} ({\bar{a}}_{K - 1})$ under the observed treatment ${\bar{a}}_{K} = {\bar{A}}_{K}$ . The treatment is given in such a way that it is possible for all the treatment options to be assigned to all patients in the population under consideration.

The assumption of no unmeasured confounders is that the decision for each treatment depends only on the observed states and treatment history. For any regime ${\bar{a}}_{K}$ , the action given at visit j, A_j, is independent of any future or potential states or outcome given the previous history, for j = 1, 2, …, K. If there is no drop-out, the assumption is equivalent to exchangeability. If the subjects are censored, a further assumption is needed: the censoring is non-informative conditional on history. That is, the potential outcome of a censored patient follows the same distribution as that of the uncensored patients.

The third assumption, positivity, is that the optimal treatment regime has a nonzero (positive) probability of being observed in the data. In continuous treatment, the optimal treatment regime is identifiable from the observed data. The assumption may be violated theoretically or practically. A theoretical violation occurs when the study’s design prevents a patient from receiving a specific therapy. A practical violation, on the other hand, occurs when a subset of the patients has a very low probability of receiving a therapy.

Ridge quadratic inference function for myopic regret-regression (rQIF-MRr)

Suppose the mean response of the MRr from [13] is

\begin{matrix} h_{j} & = E (Y_{j} | {\bar{S}}_{j}, {\bar{A}}_{j}) \\ & = β_{0} + ϕ_{j} ({\bar{S}}_{j} | {\bar{S}}_{j - 1}, {\bar{A}}_{j - 1}; β) - μ_{j} (A_{j} | {\bar{S}}_{j}, {\bar{A}}_{j - 1}, ψ), \end{matrix}

(1)

where $ϕ_{j} ({\bar{S}}_{j} | {\bar{S}}_{j - 1}, {\bar{A}}_{j - 1}; β) = β_{j}^{T} ({\bar{S}}_{j - 1}, {\bar{A}}_{j - 1}) Z_{j}$ . The coefficients $β_{j}^{T} ({\bar{S}}_{j - 1}, {\bar{A}}_{j - 1})$ depend on the history of the states and actions before visit j [9, 10]. Meanwhile, the residual $Z_{j} = S_{j} - E (S_{j} | {\bar{S}}_{j - 1}, {\bar{A}}_{j - 1})$ is the difference between S_j and the expected value of S_j given the history $({\bar{S}}_{j - 1}, {\bar{A}}_{j - 1})$ . The regret function $μ_{j} (A_{j} | {\bar{S}}_{j}, {\bar{A}}_{j - 1}, ψ)$ is zero if the optimal action is selected at visit j. Otherwise, the regret function is positive since the aim is to maximize the response Y_j [9].

[17] defines R(ρ) to be the working correlation matrix with parameter ρ, and its inverse R⁻¹(ρ) can be approximated by a linear combination of several basis matrices defined as

\begin{matrix} R^{- 1} (ρ) = \sum_{l = 1}^{m} τ_{l} M_{l}, \end{matrix}

(2)

where τ_l are unknown coefficients and M_l are known basis matrices.

There are several types of working correlation structures [17], but in this paper, we only focus on three common types of working correlation structures, which are the first order autoregressive, AR(1), exchangeable and unspecified working correlation structures. The estimating equation of the QIF-MRr is

\begin{matrix} \sum_{i = 1}^{n} {(\frac{\partial h_{i}}{\partial (β, ψ)})}^{T} D_{i}^{- \frac{1}{2}} (τ_{1} M_{1} + \dots + τ_{m} M_{m}) D_{i}^{- \frac{1}{2}} (Y_{i} - h_{i}), \end{matrix}

(3)

and can be written in the form of its extended score as

\begin{matrix} g_{n} (β, ψ) = \frac{1}{n} \sum_{i = 1}^{n} g_{i} (β, ψ) = \frac{1}{n} (\begin{matrix} \sum_{i = 1}^{n} {(\frac{\partial h_{i}}{\partial (β, ψ)})}^{T} D_{i}^{- \frac{1}{2}} M_{1} D_{i}^{- \frac{1}{2}} (Y_{i} - h_{i}) \\ \sum_{i = 1}^{n} {(\frac{\partial h_{i}}{\partial (β, ψ)})}^{T} D_{i}^{- \frac{1}{2}} M_{2} D_{i}^{- \frac{1}{2}} (Y_{i} - h_{i}) \\ ⋮ \\ \sum_{i = 1}^{n} {(\frac{\partial h_{i}}{\partial (β, ψ)})}^{T} D_{i}^{- \frac{1}{2}} M_{m} D_{i}^{- \frac{1}{2}} (Y_{i} - h_{i}) \end{matrix}) \end{matrix}

(4)

where g_i(β, ψ) collects the estimating functions of Eq (3) for subject i, and the linear coefficients τ_l can be viewed as nuisance parameters [24]. D_i is a diagonal matrix of the marginal variances for subject i, and M₁, …, M_m are the basis matrices for the working correlation matrix, each with dimension (K × K). The derivatives of the h_j(β, ψ) term, ∂h^T/∂(β, ψ), and the vector g_i(β, ψ) have (p × K) dimension, where p is the number of parameters (β, ψ).

As there are more equations than parameters, the generalized method of moments (GMM) from [30] can be applied to create the QIF-MRr as

\begin{matrix} Q_{n} (β, ψ) = g_{n}^{T} C_{n}^{- 1} g_{n}, \end{matrix}

(5)

where,

\begin{matrix} C_{n} = \frac{1}{n^{2}} \sum_{i = 1}^{n} g_{i} (β, ψ) g_{i}^{T} (β, ψ) . \end{matrix}

The parameters β and ψ are estimated by making the extended score g_n(β, ψ) in Eq (4) as close to zero as possible, that is, by minimizing the Q_n(β, ψ) function:

\begin{matrix} (\hat{β}, \hat{ψ}) = \arg \min_{(β, ψ)} g_{n}^{T} C_{n}^{- 1} g_{n} . \end{matrix}

(6)

Singularity problems often occur during estimation using QIF-MRr in ODTR. [27] used ridge regression to solve the singularity problem. Thus, by applying the penalized QIF from [31], we introduce the ridge penalty in QIF-MRr and define a new Q_n function based on Eq (5) as

\begin{matrix} r Q_{n} (β, ψ) = g_{n}^{T} C_{n}^{- 1} g_{n} + λ \sum_{v = 1}^{p} {| {(β, ψ)}_{v} |}^{2}; λ \geq 0 \end{matrix}

(7)

where λ is the tuning parameter, and $\sum_{v = 1}^{p} {| {(β, ψ)}_{v} |}^{2}$ is the ridge penalty function, in which v = 1, …, p indexes the components of (β, ψ) and p is the total number of parameters. The penalty function acts as a weight during estimation and stabilizes the computation of the parameter estimates. The rQIF-MRr estimates the parameters β and ψ by minimizing the penalized function:

\begin{matrix} (\hat{β}, \hat{ψ}) = \arg \min_{(β, ψ)} \{ g_{n}^{T} C_{n}^{- 1} g_{n} + λ \sum_{v = 1}^{p} {| {(β, ψ)}_{v} |}^{2} \} \end{matrix}

(8)
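The construction in Eqs (4) to (8) can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions that are not part of the paper: the mean is linear, h_i = X_i θ, instead of the regret-regression mean of Eq (1); D_i is the identity matrix; C_n is formed from the outer products of the stacked subject scores; and the toy data and the names `solve`, `subject_score` and `rqif` are hypothetical.

```python
import random

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [row[:] + [bi] for row, bi in zip(A, b)]
    for c in range(n):
        piv = max(range(c, n), key=lambda r: abs(aug[r][c]))
        aug[c], aug[piv] = aug[piv], aug[c]
        for r in range(c + 1, n):
            f = aug[r][c] / aug[c][c]
            for k in range(c, n + 1):
                aug[r][k] -= f * aug[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(aug[r][k] * x[k] for k in range(r + 1, n))
        x[r] = (aug[r][n] - tail) / aug[r][r]
    return x

def subject_score(theta, X, y, bases):
    """g_i: stack X^T M_l (y - X theta) over the basis matrices (D_i = I assumed)."""
    K, p = len(y), len(theta)
    resid = [y[k] - sum(X[k][v] * theta[v] for v in range(p)) for k in range(K)]
    g = []
    for M in bases:
        w = [sum(M[k][l] * resid[l] for l in range(K)) for k in range(K)]
        g.extend(sum(X[k][v] * w[k] for k in range(K)) for v in range(p))
    return g

def rqif(theta, data, bases, lam):
    """Ridge-penalized QIF, Eq (7), for the assumed linear mean h = X theta."""
    n = len(data)
    gs = [subject_score(theta, X, y, bases) for X, y in data]
    q = len(gs[0])
    g_n = [sum(g[a] for g in gs) / n for a in range(q)]
    C_n = [[sum(g[a] * g[b] for g in gs) / n ** 2 for b in range(q)]
           for a in range(q)]
    Q_n = sum(ga * xa for ga, xa in zip(g_n, solve(C_n, g_n)))
    return Q_n + lam * sum(t * t for t in theta)

# Hypothetical toy data: 30 subjects, K = 3 visits, intercept plus one covariate.
rng = random.Random(0)
K = 3
M0 = [[1.0 if r == c else 0.0 for c in range(K)] for r in range(K)]
M1 = [[1.0 if abs(r - c) == 1 else 0.0 for c in range(K)] for r in range(K)]
data = []
for _ in range(30):
    x = [rng.gauss(0, 1) for _ in range(K)]
    X = [[1.0, xk] for xk in x]
    y = [1.0 + 2.0 * xk + rng.gauss(0, 0.5) for xk in x]
    data.append((X, y))

theta = [1.0, 2.0]
q_plain = rqif(theta, data, [M0, M1], lam=0.0)   # Eq (5)
q_ridge = rqif(theta, data, [M0, M1], lam=0.01)  # Eq (7)
```

With λ = 0 the function reduces to Q_n in Eq (5). In practice `rqif` would be passed to a numerical optimizer, as the paper does with `optim` in R.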

Results and discussions

The parameter estimates of the proposed method were investigated using simulations. The simulation datasets were generated using the scenario from [8] in order to estimate the mean response Y_j at each visit. Let i = 1, 2, …, n index the observed subjects, where n is the sample size. Then, j = 1, 2, …, K indexes the visits, where K = 10 is the final visit.

We generate the first state S₁ for each i from a normal distribution with mean 0.5 and variance 0.01. For the second state onwards, the states S_j ∼ N(m_j, 0.01), where

\begin{matrix} m_{j} = 0.5 + 0.2 S_{j - 1} - 0.07 A_{j - 1}, \end{matrix}

for j = 2, 3, …, K. In this simulation, we only consider one action per visit. The actions A_j were generated from a discrete uniform distribution, A_j ∼ U{0, 1, 2, 3}.
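The state and action generation described above can be sketched as follows for a single subject. The function name `simulate_states_actions` is hypothetical, and the sketch assumes that 0.01 in N(m_j, 0.01) denotes the variance, so the standard deviation passed to the generator is 0.1.

```python
import random

def simulate_states_actions(K=10, rng=None):
    """Generate one subject's states S_1..S_K, actions A_1..A_K and state means m_j."""
    rng = rng or random.Random()
    sd = 0.01 ** 0.5  # variance 0.01 -> standard deviation 0.1 (assumed reading)
    S, A, m = [], [], []
    for j in range(K):
        if j == 0:
            mean_j = 0.5  # first state: N(0.5, 0.01)
        else:
            mean_j = 0.5 + 0.2 * S[j - 1] - 0.07 * A[j - 1]
        m.append(mean_j)
        S.append(rng.gauss(mean_j, sd))
        A.append(rng.randrange(4))  # discrete uniform on {0, 1, 2, 3}
    return S, A, m

S, A, m = simulate_states_actions(K=10, rng=random.Random(1))
Z = [s - mj for s, mj in zip(S, m)]  # state residuals Z_j = S_j - m_j
```

Here the residuals Z_j use the known means m_j. In the estimation step of the paper they are obtained by regressing each S_j on the history.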

The regret function is defined as

\begin{matrix} μ_{j} (a_{j} | {\bar{S}}_{j}, {\bar{A}}_{j - 1}; ψ) = ψ_{1} | a_{j} - ψ_{2} - ψ_{3} S_{j} | \end{matrix}

(9)

where $\min_{a_{j}} μ_{j} (a_{j} | {\bar{S}}_{j}, {\bar{A}}_{j - 1}; ψ) = 0$ . We define

\begin{matrix} I_{j} = \begin{cases} 1 & \text{if } A_{j} - ψ_{2} - ψ_{3} S_{j} \geq 0 \\ - 1 & \text{if } A_{j} - ψ_{2} - ψ_{3} S_{j} < 0 \end{cases} \end{matrix}

(10)

and collect the indicators I_j, j = 1, …, K, into a vector I of length K for each subject.
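As a small worked example of Eqs (9) and (10), the sketch below evaluates the regret and the sign indicator for each of the four simulated actions at an illustrative state S_j = 0.5, using the ψ values quoted later in this section. The function names are hypothetical. The regret is exactly zero at the unrestricted action a_j = ψ₂ + ψ₃S_j; with actions restricted to {0, 1, 2, 3}, the best available action is the one closest to that value.

```python
def regret(a, s, psi):
    """Regret of Eq (9): psi1 * |a - psi2 - psi3 * s|."""
    psi1, psi2, psi3 = psi
    return psi1 * abs(a - psi2 - psi3 * s)

def sign_indicator(a, s, psi):
    """Indicator of Eq (10): +1 if a - psi2 - psi3 * s >= 0, otherwise -1."""
    _, psi2, psi3 = psi
    return 1 if a - psi2 - psi3 * s >= 0 else -1

psi = (1.5, 0.1, 5.5)
s = 0.5  # illustrative state value
regrets = {a: regret(a, s, psi) for a in (0, 1, 2, 3)}
best_action = min(regrets, key=regrets.get)
indicators = {a: sign_indicator(a, s, psi) for a in (0, 1, 2, 3)}
```

For this state the zero-regret action is 0.1 + 5.5 × 0.5 = 2.85, so among the available actions a_j = 3 has the smallest regret.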

The response, Y_j is generated from a normal distribution with mean,

\begin{matrix} h_{j} (β, ψ) & = E (Y_{j} | {\bar{S}}_{j}, {\bar{A}}_{j}) \\ & = β_{0} + β_{1} Z_{j} - ψ_{1} | a_{j} - ψ_{2} - ψ_{3} S_{j} | \end{matrix}

(11)

and variance $σ_{Y}^{2} = 0.64$ , where

\begin{matrix} Z_{j} & = S_{j} - E (S_{j}) \\ & = S_{j} - m_{j} . \end{matrix}

The residuals are $ϵ \sim N (0, σ_{Y}^{2} Σ)$ , where Σ is the true autoregressive correlation matrix. The Cholesky decomposition, Σ = CC^T, decomposes the positive-definite matrix Σ into the product of a lower triangular matrix C and its transpose. We take ϵ = σ_Y C W where W ∼ N(0, I_R). Thus,

\begin{matrix} \mathrm{var} (ϵ) & = σ_{Y}^{2} C \, \mathrm{var} (W) C^{T} \\ & = σ_{Y}^{2} C C^{T} \\ & = σ_{Y}^{2} Σ . \end{matrix}

Then, the response variable for each subject i is

\begin{matrix} Y_{j} = β_{0} + β_{1} Z_{j} - μ_{j} (A_{j} | {\bar{S}}_{j}, {\bar{A}}_{j - 1}; ψ) + ϵ . \end{matrix}
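The error generation can be sketched as below. The sketch assumes an AR(1) correlation matrix with entries ρ^|r−c| and scales C W by σ_Y = 0.8 so that the errors have variance σ_Y²Σ, as in the variance derivation above. The helper names are hypothetical, and the Cholesky factorization is written out by hand to keep the example self-contained.

```python
import random

def ar1_corr(K, rho):
    """AR(1) correlation matrix with entries rho^|r - c|."""
    return [[rho ** abs(r - c) for c in range(K)] for r in range(K)]

def cholesky(S):
    """Lower triangular C with C C^T = S, for a symmetric positive-definite S."""
    n = len(S)
    C = [[0.0] * n for _ in range(n)]
    for r in range(n):
        for c in range(r + 1):
            acc = sum(C[r][k] * C[c][k] for k in range(c))
            if r == c:
                C[r][c] = (S[r][r] - acc) ** 0.5
            else:
                C[r][c] = (S[r][c] - acc) / C[c][c]
    return C

def correlated_errors(K, rho, sigma_y, rng):
    """Draw eps = sigma_y * C W with W ~ N(0, I), so var(eps) = sigma_y^2 * Sigma."""
    C = cholesky(ar1_corr(K, rho))
    W = [rng.gauss(0.0, 1.0) for _ in range(K)]
    return [sigma_y * sum(C[r][k] * W[k] for k in range(r + 1)) for r in range(K)]

K, rho = 10, 0.5
Sigma = ar1_corr(K, rho)
C = cholesky(Sigma)
# Rebuild C C^T to confirm that it reproduces Sigma.
CCt = [[sum(C[r][k] * C[c][k] for k in range(K)) for c in range(K)] for r in range(K)]
eps = correlated_errors(K, rho, sigma_y=0.8, rng=random.Random(2))
```

The response for one subject is then obtained by adding `eps[j]` to β₀ + β₁Z_j − μ_j at each visit.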

Three levels of correlation in the response variable were considered: ρ = 0.1 (low), ρ = 0.5 (medium), and ρ = 0.95 (high). The initial parameter values for the coefficients are β = {3, −5} and ψ = {1.5, 0.1, 5.5}.

To estimate the parameters, we first regress each S_j on history $({\bar{S}}_{j - 1}, {\bar{A}}_{j - 1})$ and define Z_j [9]. Then, we differentiate h_j(β, ψ) from Eq (11) with respect to β₀, β₁, ψ₁, ψ₂ and ψ₃ to obtain ∂h/∂(β, ψ). For each subject i, the partial derivative with respect to β₀ is

\begin{matrix} \frac{\partial h}{\partial β_{0}} = (\begin{matrix} 1 \\ ⋮ \\ 1 \end{matrix}) \end{matrix}

and the partial derivative with respect to β₁ is

\begin{matrix} \frac{\partial h}{\partial β_{1}} = Z \end{matrix}

where Z is the residual vector for subject i. The partial derivatives with respect to ψ₁, ψ₂ and ψ₃ are then

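\begin{matrix} \frac{\partial h}{\partial ψ_{1}} = - | A - ψ_{2} - ψ_{3} S | = - I \circ (A - ψ_{2} - ψ_{3} S), \end{matrix}

\begin{matrix} \frac{\partial h}{\partial ψ_{2}} = ψ_{1} I, \end{matrix}

\begin{matrix} \frac{\partial h}{\partial ψ_{3}} = ψ_{1} S \circ I, \end{matrix}

where A and S are the action and state vectors for subject i, I is the vector of sign indicators of the regrets for subject i from Eq (10), the absolute value is taken elementwise, and ∘ denotes the elementwise product.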
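The derivatives of Eq (11) can be checked numerically at a single visit. The sketch below differentiates h_j = β₀ + β₁Z_j − ψ₁|A_j − ψ₂ − ψ₃S_j| directly, which gives −|A_j − ψ₂ − ψ₃S_j| for ψ₁, ψ₁I_j for ψ₂ and ψ₁S_jI_j for ψ₃, and compares the result with central finite differences. The evaluation point (Z_j = 0.05, A_j = 2, S_j = 0.5) is an arbitrary illustration and the function names are hypothetical.

```python
def h(theta, z, a, s):
    """Mean response of Eq (11) at one visit; theta = (b0, b1, psi1, psi2, psi3)."""
    b0, b1, p1, p2, p3 = theta
    return b0 + b1 * z - p1 * abs(a - p2 - p3 * s)

def grad_h(theta, z, a, s):
    """Analytic partial derivatives of h, valid away from the kink a = psi2 + psi3*s."""
    _, _, p1, p2, p3 = theta
    u = a - p2 - p3 * s
    sign = 1 if u >= 0 else -1  # indicator I_j of Eq (10)
    return [1.0, z, -abs(u), p1 * sign, p1 * s * sign]

def numeric_grad(theta, z, a, s, eps=1e-6):
    """Central finite-difference approximation of the same derivatives."""
    out = []
    for v in range(len(theta)):
        up, dn = list(theta), list(theta)
        up[v] += eps
        dn[v] -= eps
        out.append((h(up, z, a, s) - h(dn, z, a, s)) / (2 * eps))
    return out

theta = (3.0, -5.0, 1.5, 0.1, 5.5)
analytic = grad_h(theta, z=0.05, a=2, s=0.5)
numeric = numeric_grad(theta, z=0.05, a=2, s=0.5)
```

The two gradients agree to within floating-point error whenever A_j − ψ₂ − ψ₃S_j is not zero; at zero the absolute value is not differentiable.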

The parameter estimates $(\hat{β}, \hat{ψ})$ can be obtained by minimizing Eq (8) with the built-in optim function in R. The tuning parameter λ was obtained using a cross-validation technique [32]. The optimal tuning parameter for this simulation is λ = 0.01. For the simulation, we used bootstrap resampling 1000 times with two sample sizes (n = 25 and n = 500). [33] considered n = 25 a small sample size. Hence, we used the sample size n = 25 to compare the performance of the estimates between the rQIF-MRr and QIF-MRr methods. We fit the models (i.e. QIF-MRr and rQIF-MRr) using AR(1), exchangeable, and unspecified working correlation structures.

For the AR(1) working correlation structure, the inverse working correlation structure can be written as

\begin{matrix} R^{- 1} (ρ) = τ_{0} M_{0} + τ_{1} M_{1} + τ_{2} M_{2} \end{matrix}

where M₀ is an identity matrix

\begin{matrix} M_{0} = (\begin{matrix} 1 & 0 & \dots & 0 \\ 0 & 1 & \dots & 0 \\ ⋮ & ⋮ & ⋱ & ⋮ \\ 0 & 0 & \dots & 1 \end{matrix}), \end{matrix}

M₁ is a matrix with 1 on the two main off-diagonals and 0 elsewhere,

\begin{matrix} M_{1} = (\begin{matrix} 0 & 1 & 0 & \dots & 0 \\ 1 & 0 & 1 & \dots & 0 \\ ⋮ & ⋱ & ⋱ & ⋱ & ⋮ \\ 0 & \dots & 1 & 0 & 1 \\ 0 & \dots & 0 & 1 & 0 \end{matrix}) \end{matrix}

and M₂ is a matrix with 1 on the corners (1, 1) and (K, K) and 0 elsewhere

\begin{matrix} M_{2} = (\begin{matrix} 1 & 0 & \dots & 0 \\ 0 & 0 & \dots & 0 \\ ⋮ & ⋮ & ⋱ & ⋮ \\ 0 & 0 & \dots & 1 \end{matrix}) . \end{matrix}

Here τ₀ = (1 + ρ²)/(1 − ρ²), τ₁ = (−ρ)/(1 − ρ²) and τ₂ = (−ρ²)/(1 − ρ²).

For an exchangeable working correlation structure, R(ρ) consists of 1’s on the diagonal and ρ’s everywhere off-diagonal. Then, R⁻¹ is given as

\begin{matrix} R^{- 1} (ρ) = τ_{0} M_{0} + τ_{1} M_{1} \end{matrix}

where M₀ is an identity matrix

\begin{matrix} M_{0} = (\begin{matrix} 1 & 0 & \dots & 0 \\ 0 & 1 & \dots & 0 \\ ⋮ & ⋮ & ⋱ & ⋮ \\ 0 & 0 & \dots & 1 \end{matrix}) \end{matrix}

and M₁ is a matrix with diagonal elements 0 and off-diagonal elements 1

\begin{matrix} M_{1} = (\begin{matrix} 0 & 1 & \dots & 1 \\ 1 & 0 & \dots & 1 \\ ⋮ & ⋮ & ⋱ & ⋮ \\ 1 & 1 & \dots & 0 \end{matrix}) . \end{matrix}

Note that τ₀ = −{(K − 2)ρ + 1}/{(K − 1)ρ² − (K − 2)ρ − 1} and τ₁ = ρ/{(K − 1)ρ² − (K − 2)ρ − 1}, where K is the dimension of R.
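Both closed forms can be verified numerically. The sketch below builds R⁻¹(ρ) from the basis matrices and the τ coefficients given above for the AR(1) and exchangeable structures, multiplies it by R(ρ), and measures how far the product is from the identity matrix. The function names are hypothetical, and the AR(1) matrix is assumed to have entries ρ^|r−c|.

```python
def matmul(A, B):
    """Product of two square matrices stored as lists of rows."""
    n = len(A)
    return [[sum(A[r][k] * B[k][c] for k in range(n)) for c in range(n)]
            for r in range(n)]

def ar1_corr(K, rho):
    return [[rho ** abs(r - c) for c in range(K)] for r in range(K)]

def exch_corr(K, rho):
    return [[1.0 if r == c else rho for c in range(K)] for r in range(K)]

def ar1_inverse(K, rho):
    """tau0*M0 + tau1*M1 + tau2*M2 with the AR(1) coefficients from the text."""
    d = 1 - rho ** 2
    t0, t1, t2 = (1 + rho ** 2) / d, -rho / d, -rho ** 2 / d

    def entry(r, c):
        m0 = 1 if r == c else 0                        # identity
        m1 = 1 if abs(r - c) == 1 else 0               # first off-diagonals
        m2 = 1 if r == c and r in (0, K - 1) else 0    # corners (1,1) and (K,K)
        return t0 * m0 + t1 * m1 + t2 * m2

    return [[entry(r, c) for c in range(K)] for r in range(K)]

def exch_inverse(K, rho):
    """tau0*M0 + tau1*M1 with the exchangeable coefficients from the text."""
    d = (K - 1) * rho ** 2 - (K - 2) * rho - 1
    t0, t1 = -((K - 2) * rho + 1) / d, rho / d
    return [[t0 if r == c else t1 for c in range(K)] for r in range(K)]

def max_dev_from_identity(P):
    n = len(P)
    return max(abs(P[r][c] - (1.0 if r == c else 0.0))
               for r in range(n) for c in range(n))

K = 10
deviations = {
    rho: (max_dev_from_identity(matmul(ar1_corr(K, rho), ar1_inverse(K, rho))),
          max_dev_from_identity(matmul(exch_corr(K, rho), exch_inverse(K, rho))))
    for rho in (0.1, 0.5, 0.95)
}
```

For these two structures the basis expansion is exact, so the deviations are at the level of rounding error. In the QIF itself the τ coefficients are never computed, since they are treated as nuisance parameters.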

The unspecified working correlation structure can be used when the working correlation structure is difficult to determine. For the unspecified working correlation structure, the basis matrices are M₀ = I_n and $M_{1} = \hat{U}$ , where

\begin{matrix} \hat{U} = \frac{1}{N} \sum_{i} (Y_{i} - h_{i}) {(Y_{i} - h_{i})}^{T}, \end{matrix}

(12)

and the matrix $\hat{U}$ is a consistent estimator of the variance matrix of Y [19].

The working correlation structure is correctly specified when the AR(1) structure is used both to generate the data and to fit the models. Otherwise, the working correlation structure is considered misspecified. The results below give the mean value of the parameter estimates over 1000 bootstrap resamples (Mean), the standard error (SE), and the root mean square error (RMSE).
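The three summaries can be computed from the replicate estimates as sketched below. The sketch assumes that the SE is the standard deviation of the replicate estimates and that the RMSE is measured against the parameter value used to generate the data; under that reading RMSE² is approximately SE² plus the squared bias, with a small discrepancy from the B − 1 divisor. The function names are hypothetical, and the check uses the first row of Table 1 with β₀ = 3 taken as the target value.

```python
from math import sqrt

def summarize(estimates, truth):
    """Return (mean, SE, RMSE) of replicate estimates of one parameter."""
    B = len(estimates)
    mean = sum(estimates) / B
    se = sqrt(sum((e - mean) ** 2 for e in estimates) / (B - 1))
    rmse = sqrt(sum((e - truth) ** 2 for e in estimates) / B)
    return mean, se, rmse

def rmse_from_bias(mean, se, truth):
    """RMSE implied by the SE and the bias: sqrt(SE^2 + (mean - truth)^2)."""
    return sqrt(se ** 2 + (mean - truth) ** 2)

# Small hypothetical set of replicate estimates of a parameter whose target is 3.
mean, se, rmse = summarize([2.9, 3.0, 3.1, 3.2], truth=3.0)

# First row of Table 1 (QIF-MRr, rho = 0.1, n = 25), assuming beta_0 = 3 is the target.
implied = rmse_from_bias(3.0442, 0.3685, 3.0)
```

The implied value agrees with the tabulated RMSE of 0.3712 to about three decimal places, which is consistent with this reading of the three columns.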

Table 1 compares the parameter estimates obtained using QIF-MRr and rQIF-MRr for sample sizes n = 25 and n = 500 at different correlation values, ρ. The rQIF-MRr is more efficient than the QIF-MRr for both small and large sample sizes, with smaller SE and RMSE for most coefficients.

Table 1. Parameter estimation for correctly specified working correlation structure of the QIF-MRr and rQIF-MRr with different correlation values, ρ.

ρ	Methods	Coefficients	Mean (n = 25)	SE (n = 25)	RMSE (n = 25)	Mean (n = 500)	SE (n = 500)	RMSE (n = 500)
ρ = 0.1	QIF-MRr	β ₀	3.0442	0.3685	0.3712	3.0596	0.3803	0.3850
		β ₁	-4.9017	0.1962	0.2195	-4.8988	0.2135	0.2362
		ψ ₁	1.5558	0.3601	0.3644	1.5561	0.3894	0.3934
		ψ ₂	0.1956	0.2691	0.2856	0.2037	0.2665	0.2859
		ψ ₃	5.6546	0.2147	0.2646	5.6269	0.2171	0.2515
	rQIF-MRr	β ₀	3.0528	0.2974	0.3020	3.0324	0.3120	0.3137
		β ₁	-4.9154	0.1731	0.1926	-4.9166	0.1656	0.1854
		ψ ₁	1.5316	0.3404	0.3419	1.5333	0.3342	0.3358
		ψ ₂	0.2168	0.2530	0.2786	0.2400	0.2625	0.2975
		ψ ₃	5.6035	0.1928	0.2188	5.6036	0.1999	0.2252
ρ = 0.5	QIF-MRr	β ₀	3.0311	0.3521	0.3535	3.0292	0.3428	0.3441
		β ₁	-4.9223	0.1916	0.2067	-4.9212	0.1961	0.2113
		ψ ₁	1.6086	0.3687	0.3844	1.5549	0.3340	0.3384
		ψ ₂	0.2245	0.3203	0.3436	0.2323	0.2759	0.3060
		ψ ₃	5.6301	0.2163	0.2524	5.6416	0.2295	0.2697
	rQIF-MRr	β ₀	3.0352	0.3101	0.3121	3.0448	0.2970	0.3003
		β ₁	-4.9165	0.1789	0.1974	-4.9266	0.1720	0.1870
		ψ ₁	1.5513	0.3242	0.3282	1.5363	0.3446	0.3465
		ψ ₂	0.2155	0.2656	0.2896	0.2428	0.2522	0.2899
		ψ ₃	5.6063	0.2031	0.2292	5.6026	0.1939	0.2194
ρ = 0.95	QIF-MRr	β ₀	3.0824	0.3606	0.3699	3.0514	0.3118	0.3160
		β ₁	-4.8987	0.2140	0.2367	-4.9067	0.2045	0.2248
		ψ ₁	1.5349	0.3998	0.4013	1.5637	0.3731	0.3785
		ψ ₂	0.2027	0.2925	0.3100	0.2107	0.2674	0.2894
		ψ ₃	5.6263	0.2216	0.2551	5.6302	0.2201	0.2557
	rQIF-MRr	β ₀	3.0503	0.2901	0.2944	3.0339	0.3014	0.3033
		β ₁	-4.9251	0.1698	0.1856	-4.9098	0.1679	0.1907
		ψ ₁	1.5355	0.3352	0.3371	1.5332	0.3613	0.3628
		ψ ₂	0.2267	0.2618	0.2908	0.2281	0.2720	0.3007
		ψ ₃	5.6058	0.1846	0.2127	5.6060	0.1979	0.2245

The results for the correctly specified working correlation structure are shown in Table 1, where the data are generated and fitted using the AR(1) working correlation structure. An advantage of using the QIF-MRr and rQIF-MRr in estimation is that the estimates remain efficient even with a misspecified working correlation structure.

The estimates under misspecified working correlation structures for both QIF-MRr and rQIF-MRr are given in Tables 2–4. For low correlation, ρ = 0.1 in Table 2, the parameter estimates for rQIF-MRr are unbiased and efficient, with small SE and RMSE, even with misspecified working correlation structures. The correctly specified AR(1) structure for rQIF-MRr gives slightly smaller SE compared to the exchangeable and unspecified working correlation structures.

Table 2. Parameter estimation under misspecified working correlation structures for the QIF-MRr and rQIF-MRr at low correlation value, ρ = 0.1.

ρ	Methods	Correlation Structure	Coefficients	Mean (n = 25)	SE (n = 25)	RMSE (n = 25)	Mean (n = 500)	SE (n = 500)	RMSE (n = 500)
ρ = 0.1	QIF-MRr	AR(1)	β ₀	3.0442	0.3685	0.3712	3.0596	0.3803	0.3850
			β ₁	-4.9017	0.1962	0.2195	-4.8988	0.2135	0.2362
			ψ ₁	1.5558	0.3601	0.3644	1.5561	0.3894	0.3934
			ψ ₂	0.1956	0.2691	0.2856	0.2037	0.2665	0.2859
			ψ ₃	5.6546	0.2147	0.2646	5.6269	0.2171	0.2515
		Exchangeable	β ₀	3.0645	0.3672	0.3728	3.0790	0.3613	0.3699
			β ₁	-4.8821	0.2071	0.2383	-4.8961	0.2096	0.2340
			ψ ₁	1.5157	0.4074	0.4077	1.5514	0.3795	0.3830
			ψ ₂	0.1980	0.2941	0.3100	0.1909	0.2613	0.2767
			ψ ₃	5.6276	0.1952	0.2332	5.6270	0.2277	0.2607
		Unspecified	β ₀	2.9806	0.4761	0.4765	2.9593	0.4881	0.4898
			β ₁	-4.9094	0.2694	0.2843	-4.9037	0.2389	0.2576
			ψ ₁	1.5900	0.5022	0.5103	1.5900	0.5650	0.5721
			ψ ₂	0.2376	0.3304	0.3579	0.2361	0.2575	0.2913
			ψ ₃	5.6526	0.2461	0.2896	5.6581	0.2357	0.2838
	rQIF-MRr	AR(1)	β ₀	3.0528	0.2974	0.3020	3.0324	0.3120	0.3137
			β ₁	-4.9154	0.1731	0.1926	-4.9166	0.1656	0.1854
			ψ ₁	1.5316	0.3404	0.3419	1.5333	0.3342	0.3358
			ψ ₂	0.2168	0.2530	0.2786	0.2400	0.2625	0.2975
			ψ ₃	5.6035	0.1928	0.2188	5.6036	0.1999	0.2252
		Exchangeable	β ₀	3.0552	0.3411	0.3455	3.0373	0.3554	0.3573
			β ₁	-4.9108	0.1659	0.1883	-4.9051	0.1643	0.1897
			ψ ₁	1.5284	0.3452	0.3464	1.5199	0.3650	0.3656
			ψ ₂	0.2122	0.2545	0.2781	0.1989	0.2618	0.2799
			ψ ₃	5.5959	0.2063	0.2275	5.6178	0.1911	0.2245
		Unspecified	β ₀	3.0111	0.4069	0.4071	2.9980	0.3949	0.3949
			β ₁	-4.9116	0.1951	0.2142	-4.9123	0.1928	0.2119
			ψ ₁	1.5272	0.4894	0.4902	1.5415	0.5086	0.5103
			ψ ₂	0.2339	0.2960	0.3249	0.2326	0.2512	0.2840
			ψ ₃	5.6149	0.2122	0.2413	5.6203	0.1876	0.2228

Table 4. Parameter estimation under misspecified working correlation structures for the QIF-MRr and rQIF-MRr at high correlation value, ρ = 0.95.

ρ	Methods	Correlation Structure	Coefficients	Mean (n = 25)	SE (n = 25)	RMSE (n = 25)	Mean (n = 500)	SE (n = 500)	RMSE (n = 500)
ρ = 0.95	QIF-MRr	AR(1)	β ₀	3.0824	0.3606	0.3699	3.0514	0.3118	0.3160
			β ₁	-4.8987	0.2140	0.2367	-4.9067	0.2045	0.2248
			ψ ₁	1.5349	0.3998	0.4013	1.5637	0.3731	0.3785
			ψ ₂	0.2027	0.2925	0.3100	0.2107	0.2674	0.2894
			ψ ₃	5.6263	0.2216	0.2551	5.6302	0.2201	0.2557
		Exchangeable	β ₀	3.0917	0.3967	0.4071	3.0433	0.3566	0.3593
			β ₁	-4.9110	0.2219	0.2391	-4.8956	0.1836	0.2112
			ψ ₁	1.5109	0.3888	0.3890	1.5925	0.3413	0.3536
			ψ ₂	0.2113	0.2789	0.3003	0.2072	0.2833	0.3030
			ψ ₃	5.6337	0.2271	0.2635	5.6022	0.1977	0.2226
		Unspecified	β ₀	3.0447	0.4625	0.4646	3.0129	0.4968	0.4970
			β ₁	-4.8935	0.1912	0.2189	-4.9421	0.2372	0.2442
			ψ ₁	1.5760	0.4772	0.4832	1.5813	0.4911	0.4978
			ψ ₂	0.2113	0.2811	0.3023	0.2701	0.3556	0.3941
			ψ ₃	5.6389	0.2353	0.2732	5.6423	0.2583	0.2948
	rQIF-MRr	AR(1)	β ₀	3.0503	0.2901	0.2944	3.0339	0.3014	0.3033
			β ₁	-4.9251	0.1698	0.1856	-4.9098	0.1679	0.1907
			ψ ₁	1.5355	0.3352	0.3371	1.5332	0.3613	0.3628
			ψ ₂	0.2267	0.2618	0.2908	0.2281	0.2720	0.3007
			ψ ₃	5.6058	0.1846	0.2127	5.6060	0.1979	0.2245
		Exchangeable	β ₀	3.0282	0.3621	0.3632	3.0463	0.3371	0.3402
			β ₁	-4.8960	0.1728	0.2016	-4.9026	0.1741	0.1995
			ψ ₁	1.5156	0.3482	0.3485	1.5032	0.3687	0.3687
			ψ ₂	0.2070	0.2559	0.2774	0.2198	0.2571	0.2836
			ψ ₃	5.6102	0.2042	0.2320	5.5970	0.1978	0.2203
		Unspecified	β ₀	3.0027	0.4507	0.4507	2.9941	0.3957	0.3957
			β ₁	-4.9160	0.2012	0.2180	-4.9011	0.1739	0.2000
			ψ ₁	1.5389	0.4213	0.4231	1.5199	0.4691	0.4696
			ψ ₂	0.2321	0.2768	0.3067	0.2415	0.2448	0.2827
			ψ ₃	5.6229	0.2122	0.2452	5.6142	0.1927	0.2240

When the correlation is medium (ρ = 0.5) and high (ρ = 0.95), in Tables 3 and 4 respectively, estimation using misspecified working correlation structures still gives unbiased and efficient estimates with small SE and RMSE. The true model (i.e. AR(1)) gives slightly better estimates, but the difference is small. This is one of the advantages of using the QIF in estimation: the parameter estimates are efficient even with a misspecified working correlation structure [24, 34].

Table 3. Parameter estimation under misspecified working correlation structures for the QIF-MRr and rQIF-MRr at medium correlation value, ρ = 0.5.

ρ	Methods	Correlation Structure	Coefficients	Mean (n = 25)	SE (n = 25)	RMSE (n = 25)	Mean (n = 500)	SE (n = 500)	RMSE (n = 500)
ρ = 0.5	QIF-MRr	AR(1)	β ₀	3.0311	0.3521	0.3535	3.0292	0.3428	0.3441
			β ₁	-4.9223	0.1916	0.2067	-4.9212	0.1961	0.2113
			ψ ₁	1.6086	0.3687	0.3844	1.5549	0.3340	0.3384
			ψ ₂	0.2245	0.3203	0.3436	0.2323	0.2759	0.3060
			ψ ₃	5.6301	0.2163	0.2524	5.6416	0.2295	0.2697
		Exchangeable	β ₀	3.0382	0.4092	0.4110	3.0596	0.4076	0.4119
			β ₁	-4.8863	0.1930	0.2240	-4.8863	0.1998	0.2298
			ψ ₁	1.5769	0.3343	0.3430	1.5301	0.3806	0.3818
			ψ ₂	0.1984	0.2691	0.2865	0.2049	0.2762	0.2954
			ψ ₃	5.6305	0.2191	0.2550	5.6194	0.1988	0.2319
		Unspecified	β ₀	3.0109	0.4180	0.4181	3.0342	0.4024	0.4038
			β ₁	-4.9027	0.2338	0.2533	-4.9305	0.2197	0.2304
			ψ ₁	1.5958	0.4666	0.4764	1.6050	0.5100	0.5207
			ψ ₂	0.2211	0.3208	0.3429	0.2631	0.2868	0.3300
			ψ ₃	5.6165	0.2055	0.2363	5.5904	0.2032	0.2224
	rQIF-MRr	AR(1)	β ₀	3.0352	0.3101	0.3121	3.0448	0.2970	0.3003
			β ₁	-4.9165	0.1789	0.1974	-4.9266	0.1720	0.1870
			ψ ₁	1.5513	0.3242	0.3282	1.5363	0.3446	0.3465
			ψ ₂	0.2155	0.2656	0.2896	0.2428	0.2522	0.2899
			ψ ₃	5.6063	0.2031	0.2292	5.6026	0.1939	0.2194
		Exchangeable	β ₀	3.0594	0.3355	0.3407	3.0422	0.3355	0.3382
			β ₁	-4.9105	0.1606	0.1838	-4.9146	0.1771	0.1966
			ψ ₁	1.5171	0.3443	0.3447	1.5107	0.3758	0.3759
			ψ ₂	0.2007	0.2496	0.2691	0.2083	0.2561	0.2781
			ψ ₃	5.6083	0.1878	0.2168	5.6162	0.1998	0.2311
		Unspecified	β ₀	3.0182	0.4152	0.4156	2.9904	0.4173	0.4174
			β ₁	-4.9158	0.1944	0.2118	-4.9178	0.1868	0.2041
			ψ ₁	1.5059	0.4972	0.4972	1.5257	0.5301	0.5307
			ψ ₂	0.2308	0.2858	0.3143	0.2471	0.2620	0.3005
			ψ ₃	5.6149	0.2098	0.2392	5.6232	0.1913	0.2275

Application to Warfarin data

For the application, we use the data from Warfarin-treated patients who are at risk of thrombosis [2]. The data consist of 303 patients with 14 clinic visits. The state S_j is defined as the difference between the International Normalized Ratio (INR) at visit j and the INR within the target range. Meanwhile, A_j is defined as the dose given to the patient at visit j. The INR is used to measure blood clotting speed: a positive S_j indicates that the clotting time was too long and that the dose A_j should be reduced, and vice versa. The goal of the treatment is to make sure the INR is within the target range.

From the 14 clinic visits, only 9 visits were considered, where the first 4 visits were treated as a stabilization period, and the last visits had no contribution to the outcome. At each visit j, the response Y_j is measured for j = 1, 2, …, K with K = 9. Hence, the mean response for Y_j conditional on $({\bar{S}}_{j}, {\bar{A}}_{j})$ is as in Eq (1). A mixture model for S_j is used to obtain the state residuals Z_j. The model consists of a logistic component for P(S_j = 0) and a linear component for |S_j| given S_j ≠ 0.

The regret function is modeled as in Eq (9). The first step is to estimate the state residuals Z_j. Then, for each i, we compute the g_i(β, ψ) functions, the extended score g_n(β, ψ), and the partial derivatives $\partial h_{i}^{T} / \partial (β, ψ)$ . Because (β, ψ) is unknown for this application, we use the initial values (β, ψ) = {8.00, 0.00, 2.00, 0.25, −5.00} [13]. The parameter estimates $(\hat{β}, \hat{ψ})$ are obtained as in Eqs (6) and (8) for QIF-MRr and rQIF-MRr, respectively. Using a cross-validation technique, the optimal tuning parameter is λ = 0.1375436. We performed 100 bootstrap resamples to test the consistency and efficiency of the estimation using AR(1), exchangeable, and unspecified working correlation structures.

Table 5 shows the parameter estimates for the Warfarin data using the QIF-MRr and rQIF-MRr with three different types of working correlation structures. Compared with the AR(1) and unspecified working correlation structures, estimation using rQIF-MRr with an exchangeable working correlation structure is generally more efficient, with a smaller SE. Estimation using the rQIF-MRr with the AR(1) working correlation structure produces better results than the QIF-MRr. Meanwhile, the findings for both methods are similar when we estimate the parameters using the exchangeable and unspecified working correlation structures.

Table 5. Parameter estimation of Warfarin data with AR(1), exchangeable, and unspecified working correlation structures for QIF-MRr and rQIF-MRr.

Method	Statistic	$\hat{β}_{1}$	$\hat{β}_{2}$	$\hat{ψ}_{1}$	$\hat{ψ}_{2}$	$\hat{ψ}_{3}$
QIF-MRr_AR(1)	Mean	10.5870	1.6317	2.5877	0.0720	-8.2011
QIF-MRr_AR(1)	SE	2.9063	2.6250	1.8113	1.3275	4.9005
QIF-MRr_Exchangeable	Mean	6.5295	-2.2246	0.6014	-0.0254	-2.5608
QIF-MRr_Exchangeable	SE	0.3169	0.1561	0.1246	0.0418	0.4755
QIF-MRr_Unspecified	Mean	6.9895	-2.0406	1.0450	-0.0452	-3.4159
QIF-MRr_Unspecified	SE	0.1733	0.2822	0.1606	0.0810	0.3609
rQIF-MRr_AR(1)	Mean	6.7960	-2.3492	1.1033	-0.1296	-3.2082
rQIF-MRr_AR(1)	SE	0.4594	0.4512	0.5022	0.1987	0.4037
rQIF-MRr_Exchangeable	Mean	6.5844	-2.0634	0.6057	-0.0566	-2.8881
rQIF-MRr_Exchangeable	SE	0.2497	0.1261	0.1742	0.1711	0.2019
rQIF-MRr_Unspecified	Mean	6.9572	-2.0550	1.0379	-0.0383	-3.4174
rQIF-MRr_Unspecified	SE	0.2037	0.2756	0.1662	0.0585	0.3791
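
To make the comparison in Table 5 easier to read, the SE rows can be turned into per-parameter ratios SE(QIF-MRr) / SE(rQIF-MRr). The sketch below is only a descriptive summary of the tabulated values, not a measure used by the authors; the function name `se_ratios` and the parameter labels are illustrative.

```python
# SE rows copied from Table 5, ordered as beta1, beta2, psi1, psi2, psi3.
params = ["beta1", "beta2", "psi1", "psi2", "psi3"]
se = {
    ("QIF-MRr", "AR(1)"): [2.9063, 2.6250, 1.8113, 1.3275, 4.9005],
    ("QIF-MRr", "Exchangeable"): [0.3169, 0.1561, 0.1246, 0.0418, 0.4755],
    ("QIF-MRr", "Unspecified"): [0.1733, 0.2822, 0.1606, 0.0810, 0.3609],
    ("rQIF-MRr", "AR(1)"): [0.4594, 0.4512, 0.5022, 0.1987, 0.4037],
    ("rQIF-MRr", "Exchangeable"): [0.2497, 0.1261, 0.1742, 0.1711, 0.2019],
    ("rQIF-MRr", "Unspecified"): [0.2037, 0.2756, 0.1662, 0.0585, 0.3791],
}


def se_ratios(structure):
    """SE(QIF-MRr) / SE(rQIF-MRr); a value above 1 means the ridge fit has the smaller SE."""
    plain = se[("QIF-MRr", structure)]
    ridge = se[("rQIF-MRr", structure)]
    return [q / r for q, r in zip(plain, ridge)]


for structure in ("AR(1)", "Exchangeable", "Unspecified"):
    ratios = se_ratios(structure)
    print(structure, {p: round(v, 2) for p, v in zip(params, ratios)})
```

Under AR(1) every ratio is above 1, whereas under the exchangeable and unspecified structures the ratios fall on both sides of 1.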


Conclusions

The rQIF-MRr method was proposed in this paper with the goal of improving estimation in ODTR. There is considerable potential to explore rQIF-MRr in personalized medicine, particularly in estimating ODTR, which is a promising and developing branch of the field.

The simulation studies show that parameter estimation using rQIF-MRr is unbiased and efficient even with a misspecified working correlation structure. In the simulation analysis, the proposed rQIF-MRr method performed well for small and large sample sizes at any correlation level. In comparison to the QIF-MRr, parameter estimation using the rQIF-MRr gives an efficient estimate with a small standard error when applied to the Warfarin data.

Comparing different methods for determining the optimal tuning parameter λ, such as the generalized cross-validation technique [35, 36], is of interest for future work. The genetic algorithm [37] and particle swarm optimization [38] are two other candidate methods.
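
For orientation, the generalized cross-validation criterion of [35] can be written as follows for ordinary ridge regression. This is the standard linear-model form only, with y the n-vector of responses and X the design matrix as assumed notation; how it would carry over to the rQIF-MRr objective is not addressed here.

```latex
\hat{\beta}(\lambda) = (X^{T}X + n\lambda I)^{-1} X^{T} y, \qquad
A(\lambda) = X (X^{T}X + n\lambda I)^{-1} X^{T},

V(\lambda) = \frac{\tfrac{1}{n}\,\lVert (I - A(\lambda))\, y \rVert^{2}}
                  {\left[ \tfrac{1}{n}\,\operatorname{tr}\!\left( I - A(\lambda) \right) \right]^{2}} .
```

The tuning parameter is then taken as the value of λ that minimizes V(λ).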

In ODTR studies that follow participants or patients over a defined period, there may be dropouts during data collection, which can result in missing or survival data. The proposed method cannot handle this type of data in its current form and would need to be extended to deal with missing and survival data.

Supporting information

S1 Dataset. (DAT, 381.1 KB)

Data Availability

We illustrate the proposed method with a simulation study and a real-data application. For the simulation study, the data are generated according to the scenario given in the Results and discussions section. For the application, we used the anticoagulation dataset from Rosthøj et al. 2006 [2] (DOI: 10.1002/sim.2694), which is also listed in the References section.

Funding Statement

This research is funded by Universiti Malaya (www.um.edu.my) through Research Grants GPF083B-2020 and BKS073-2017, awarded to NAM. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

References

1. Chakraborty B, Moodie EE. Statistical Methods for Dynamic Treatment Regimes. Springer; 2013.
2. Rosthøj S, Fullwood C, Henderson R, Steward S. Estimation of Optimal Dynamic Anticoagulation Regimes from Observational Data: A Regret Based Approach. Statistics in Medicine. 2006;25(24):4197–4215. doi: 10.1002/sim.2694
3. Robins JM, Orellana L, Rotnitzky A. Estimation and Extrapolation of Optimal Treatment and Testing Strategies. Statistics in Medicine. 2008;27(23):4678–4721. doi: 10.1002/sim.3301
4. Qiu H, Carone M, Sadikova E, Petukhova M, Kessler RC, Luedtke A. Optimal individualized decision rules using instrumental variable methods. Journal of the American Statistical Association. 2021;116(533):174–191. doi: 10.1080/01621459.2020.1745814
5. Sun Y, Wang L. Stochastic Tree Search for Estimating Optimal Dynamic Treatment Regimes. Journal of the American Statistical Association. 2021;116(533):421–432. doi: 10.1080/01621459.2020.1819294
6. Moodie EE, Richardson TS, Stephens DA. Demystifying Optimal Dynamic Treatment Regimes. Biometrics. 2007;63(2):447–455. doi: 10.1111/j.1541-0420.2006.00686.x
7. Moodie EE, Platt RW, Kramer MS. Estimating response-maximized decision rules with applications to breastfeeding. Journal of the American Statistical Association. 2009;104(485):155–165. doi: 10.1198/jasa.2009.0011
8. Murphy SA. Optimal Dynamic Treatment Regimes. Journal of the Royal Statistical Society, Series B (with discussion). 2003;65(2):331–366. doi: 10.1111/1467-9868.00389
9. Henderson R, Ansell P, Alshibani D. Regret-Regression for Optimal Dynamic Treatment Regimes. Biometrics. 2010;66(4):1192–1201. doi: 10.1111/j.1541-0420.2009.01368.x
10. Barrett JK, Henderson R, Rosthøj S. Doubly robust estimation of optimal dynamic treatment regimes. Statistics in Biosciences. 2014;6(2):244–260. doi: 10.1007/s12561-013-9097-6
11. Robins JM. Causal Inference from Complex Longitudinal Data. Latent Variable Modelling and Application to Causality. 1997; Berkane M., Editor. NY: Springer Verlag:69–117.
12. Robins JM. Optimal structural nested models for optimal sequential decisions. In: Proceedings of the Second Seattle Symposium in Biostatistics. Springer; 2004. p. 189–326.
13. Mohamed NA. Optimal Dynamic Treatment Regimes: Regret-Regression Method with Myopic Strategies [PhD Thesis]. Newcastle University; 2013.
14. Bringmann LF, Vissers N, Wichers M, Geschwind N, Kuppens P, Peeters F, et al. A network approach to psychopathology: new insights into clinical longitudinal data. PLoS ONE. 2013;8(4):e60188. doi: 10.1371/journal.pone.0060188
15. Verkuil B, Atasayi S, Molendijk ML. Workplace bullying and mental health: a meta-analysis on cross-sectional and longitudinal data. PLoS ONE. 2015;10(8):e0135225. doi: 10.1371/journal.pone.0135225
16. Koudelová J, Hoffmannová E, Dupej J, Velemínská J. Simulation of facial growth based on longitudinal data: age progression and age regression between 7 and 17 years of age using 3D surface data. PLoS ONE. 2019;14(2):e0212618. doi: 10.1371/journal.pone.0212618
17. Qu A, Lindsay B, Li B. Improving generalized estimating equations using quadratic inference function. Biometrika. 2000;87(4):823–836. doi: 10.1093/biomet/87.4.823
18. Liang KY, Zeger SL. Longitudinal data analysis using generalized linear models. Biometrika. 1986;73(1):13–22. doi: 10.1093/biomet/73.1.13
19. Qu A, Lindsay B. Building adaptive estimating equations when inverse of covariance estimation is difficult. Journal of the Royal Statistical Society Series B. 2003;65(1):127–142. doi: 10.1111/1467-9868.00376
20. Tsai GF, Qu A. Testing the significance of cell-cycle patterns in time-course microarray data using nonparametric quadratic inference functions. Computational Statistics & Data Analysis. 2008;52(3):1387–1398. doi: 10.1016/j.csda.2007.03.018
21. Westgate PM. A comparison of utilized and theoretical covariance weighting matrices on the estimation performance of quadratic inference functions. Communications in Statistics-Simulation and Computation. 2014;43(10):2432–2443. doi: 10.1080/03610918.2012.752839
22. Yang W, Liao S. A study of quadratic inference functions with alternative weighting matrices. Communications in Statistics-Simulation and Computation. 2017;46(2):994–1007. doi: 10.1080/03610918.2014.988255
23. Fu WJ. Penalized estimating equations. Biometrics. 2003;59(1):126–132. doi: 10.1111/1541-0420.00015
24. Qu A, Li R. Quadratic inference functions for varying-coefficient models with longitudinal data. Biometrics. 2006;62(2):379–391. doi: 10.1111/j.1541-0420.2005.00490.x
25. Hoerl AE, Kennard RW. Ridge Regression: Biased Estimation to Nonorthogonal Problems. Technometrics. 1970;12(1):55–67. doi: 10.1080/00401706.1970.10488634
26. Horel A. Applications of ridge analysis to regression problems. Chem Eng Progress. 1962;58:54–59.
27. Imani M, Ghassemian H. Ridge regression-based feature extraction for hyperspectral data. International Journal of Remote Sensing. 2015;36(6):1728–1742. doi: 10.1080/01431161.2015.1024894
28. Rubin DB. Bayesian Inference for Causal Effects: The Role of Randomization. The Annals of Statistics. 1978;6:34–58. doi: 10.1214/aos/1176344064
29. Hernán MA. A definition of causal effect for epidemiological research. Journal of Epidemiology & Community Health. 2004;58(4):265–271. doi: 10.1136/jech.2002.006361
30. Hansen LP. Large Sample Properties of Generalized Method of Moments Estimators. Econometrica. 1982;50(4):1029–1054. doi: 10.2307/1912775
31. Dziak JJ. Penalized quadratic inference functions for variable selection in longitudinal research [PhD Thesis]. The Pennsylvania State University; 2006.
32. Friedman J, Hastie T, Tibshirani R. Regularization paths for generalized linear models via coordinate descent. Journal of Statistical Software. 2010;33(1):1. doi: 10.18637/jss.v033.i01
33. Westgate PM, Braun TM. An improved quadratic inference function for parameter estimation in the analysis of correlated data. Statistics in Medicine. 2013;32(11):3260–3273. doi: 10.1002/sim.5715
34. Westgate PM. A bias-corrected covariance estimate for improved inference with quadratic inference function. Statistics in Medicine. 2012;31(29):4003–4022. doi: 10.1002/sim.5479
35. Golub GH, Heath M, Wahba G. Generalized cross-validation as a method for choosing a good ridge parameter. Technometrics. 1979;21(2):215–223. doi: 10.1080/00401706.1979.10489751
36. Roozbeh M, Arashi M, Hamzah NA. Generalized cross-validation for simultaneous optimization of tuning parameters in ridge regression. Iranian Journal of Science and Technology, Transactions A: Science. 2020;44(2):473–485. doi: 10.1007/s40995-020-00851-1
37. Praga-Alejo R, Torres-Treviño L, Piña-Monarrez M. Optimal determination of k constant of ridge regression using a simple genetic algorithm. In: Electronics, Robotics and Automotive Mechanics Conference, 2008. CERMA’08. IEEE; 2008. p. 39–44.
38. Uslu VR, Egrioglu E, Bas E. Finding optimal value for the shrinkage parameter in ridge regression via particle swarm optimization. American Journal of Intelligent Systems. 2014;4(4):142–147.

