SMARTp: A SMART design for nonsurgical treatments of chronic periodontitis with spatially referenced and nonrandomly missing skewed outcomes

Jing Xu; Dipankar Bandyopadhyay; Sedigheh Mirzaei Salehabadi; Bryan Michalowicz; Bibhas Chakraborty

doi:10.1002/bimj.201900027

Author manuscript; available in PMC: 2020 Mar 4.

Published in final edited form as: Biom J. 2019 Sep 17;62(2):282–310. doi: 10.1002/bimj.201900027

PMCID: PMC7054179 NIHMSID: NIHMS1053056 PMID: 31531896

Abstract

This paper proposes dynamic treatment regimes (DTRs) as effective individualized treatment strategies for managing chronic periodontitis. The proposed DTRs are studied via SMARTp—a two-stage sequential multiple assignment randomized trial (SMART) design. For this design, we propose a statistical analysis plan and a novel cluster-level sample size calculation method that factors in typical features of periodontal responses such as non-Gaussianity, spatial clustering, and nonrandom missingness. Here, each patient is viewed as a cluster, and a tooth within a patient’s mouth is viewed as an individual unit inside the cluster, with the tooth-level covariance structure described by a conditionally autoregressive structure. To accommodate possible skewness and tail behavior, the tooth-level clinical attachment level (CAL) response is assumed to be skew-t, with the nonrandomly missing structure captured via a shared parameter model corresponding to the missingness indicator. The proposed method considers mean comparison for the regimes with or without sharing an initial treatment, where the expected values and corresponding variances or covariance for the sample means of a pair of DTRs are derived by the inverse probability weighting and method of moments. Simulation studies are conducted to investigate the finite-sample performance of the proposed sample size formulas under a variety of outcome-generating scenarios. An R package SMARTp implementing our sample size formula is available at the Comprehensive R Archive Network for free download.

Keywords: dynamic treatment regimes, inverse probability weighting, method of moments, periodontitis, skew-normal, skew-t, SMART

1 | INTRODUCTION

Chronic periodontitis (CP) is a serious form of periodontal disease (PD) and, if left untreated, remains a major cause of adult tooth loss (Eke, Page, Wei, Thornton-Evans, & Genco, 2012). It is a highly prevalent condition, affecting almost half of U.S. adults above the age of 30 (Thornton-Evans et al., 2013). PD may exhibit significant comorbidity, including diabetes, cardiovascular complications, and respiratory illnesses (Beck et al., 2001; Grossi et al., 1997; Wang et al., 2009). Dental hygienists consider the clinical attachment level (CAL), rounded to the nearest millimeter, as the most important biomarker to measure the severity of PD (Nicholls, 2003). The CAL refers to the amount of lost periodontal ligament fibers, with the severity categorized as slight/mild: 1–2 mm, moderate: 3–4 mm, and severe: ≥5 mm, according to the American Academy of Periodontology (AAP) 1999 guidelines (Wiebe & Putnins, 2000).

The treatment options for periodontitis range from oral hygiene, to scaling and root planing (SRP), to SRP with adjunctive treatments, and eventually to surgery as the severity increases. The recommended biannual (basic) dental cleaning, polishing, and professional flossing performed in a dental clinic only remove plaque (bacterial colonies) and tartar above the gumline, and are not considered effective procedures for treating gum disease. Often, early stages of CP are effectively treated via nonsurgical means. Benefits of nonsurgical treatment of CP include shorter recovery time due to less-invasive techniques, less discomfort than surgery, and fewer dietary restrictions, leading to improved postprocedure quality of life. When disease has progressed significantly, deep cleaning, a two-step process consisting of SRP, is often recommended as the gold standard (Herrera, 2016) for thorough plaque removal at or below the gumline. Yet, bacteria may still exist under the gumline following an SRP. Hence, the dentist or oral hygienist may also recommend various supplementary procedures, or adjuncts (such as locally delivered antimicrobials, systemic antimicrobials, nonsurgical lasers, etc.), following SRP to treat chronically deep gum pockets. Also, patients with periodontal comorbidity who respond suboptimally to SRP may benefit from adjunctive therapy (Porteous & Rowe, 2014). However, current evidence suggests that the use of these adjuncts as stand-alone treatments does not lead to any clinical benefit in treating CP compared to SRP alone (Azarpazhooh, Shah, Tenenbaum, & Goldberg, 2010; Sgolastra, Petrucci, Gatto, & Monaco, 2012). Hence, they are usually recommended in conjunction with SRP.

In 2011, the Council on Scientific Affairs of the American Dental Association (ADA) resolved to develop a Clinical Practice Guideline (CPG) for the nonsurgical treatment (Smiley et al., 2015) of CP with SRP, with or without adjuncts, based on a systematic literature review. The panel found a 0.5 mm average improvement in CAL with SRP, while the combination of SRP with assorted adjuncts resulted in an (average) CAL improvement of 0.2–0.6 mm over SRP alone. However, comparison among the adjuncts was not conducted. Recently, a systematic network meta-analysis (NMA) of the adjuncts in 74 studies from the CPG revealed none of them to be statistically significantly superior to the others (John, Michalowicz, Kotsakis, & Chu, 2017). However, the NMA ranked SRP + doxycycline hyclate gel (a local antimicrobial) as the best nonsurgical treatment of CP, compared to SRP alone. Over the years, lasers have revolutionized oral care, with reported advantages in minimizing tissue damage, swelling, and bleeding, leading to high patient acceptance. However, their clinical efficacy, both as an alternative and as an adjuvant to SRP (Liu, Hou, Wong, & Lan, 1999; Zhao et al., 2014), remains inconclusive, with inconsistent evidence derived from underpowered clinical studies (Porteous & Rowe, 2014); see also the April 2011 statement (Workgroup, 2011) by the AAP.

Although randomized clinical trials (RCTs) and subsequent meta-analyses remain the de facto standard for understanding the effectiveness of a treatment (or intervention) over others, conducting a successful RCT comes with its own limitations, which include issues with patient recruitment and retention, escalating costs, educating the wider public, etc. This has led to the development of more patient-centric approaches, primarily through adaptive interventions or dynamic treatment regimes (DTRs; Murphy, 2003; Murphy, van der Laan, Robins, & CPPRG, 2001; Robins, 2004) under the umbrella of precision medicine (Garcia, Kuska, & Somerman, 2013), where the focus has changed to decision-making for the individual, or subgroups, rather than the average-response-based treatment comparisons of traditional RCTs. DTRs involve multistage sequential interventions, tailored to the patients’ evolving characteristics (such as an individual patient’s response or adherence) at each subsequent treatment stage. The treatment types are repeatedly adjusted over time to match an individual’s need in order to achieve the optimal treatment effect. They are very appealing in managing chronic diseases that require long-term care (Lavori & Dawson, 2004; Murphy & McKay, 2004). Examples of DTRs from various clinical areas include alcoholism (Breslin et al., 1998), smoking (Chakraborty, Murphy, & Strecher, 2010), drug abuse (Brooner & Kidorf, 2002), depression (Untzer et al., 2001), hypertension (Glasgow, Engel, & D’Lugoff, 1989), etc. However, in the field of oral health and CP, a DTR proposal to mitigate the aforementioned issues related to RCTs seems to be nonexistent. For example, in the treatment of CP, one may develop a simple adaptive strategy of continuing SRP (in the later stages) among the first-stage SRP responders, while subjecting the nonresponders to “SRP + adjuncts” at later stages.

A special class of designs, called sequential multiple assignment randomized trial, or SMART designs (Lei, Nahum-Shani, Lynch, Oslin, & Murphy, 2012; Murphy, 2005), are popularly used to study DTRs (Chakraborty & Moodie, 2013; Ertefaie, Wu, Lynch, & Nahum-Shani, 2015). SMART designs involve randomizing patients to available treatment options at the initial stage, followed by rerandomizing some or all of the patients, at each subsequent stage, to the treatments available at that stage. These rerandomizations and the set of treatment options depend on how well the patient responded to the previous treatments (Ghosh, Cheung, & Chakraborty, 2016). Contrary to the standard RCT that takes a “one size fits all” single intervention approach, the SMART advances the RCT by following the same individual over sequential randomizations (i.e., ordered interventions), with the underlying series and order of interventions depicting a real-life setting, which can drastically affect eventual outcomes. Although Murphy (2005) proposed the general SMART design framework, the treatment was restricted to individual-level outcomes. However, in our motivating clinical discipline of treating CP (and also in some behavioral intervention research), interventions are delivered at the group or cluster (individual) level, while the CAL responses are available at the cluster subunit (tooth) level. Although sample size formulas and SMART design implementations under a clustered outcomes setting are available (Ghosh et al., 2016; NeCamp, Kilbourne, & Almirall, 2017), they only focus on regimes that do not share an initial treatment. Furthermore, they do not account for other data complications typical of PD studies, such as the presence of (i) non-Gaussian (skewed and thick-tailed), (ii) nonrandomly missing, and (iii) spatially referenced CAL responses (Reich et al., 2013).
For example, consider the motivating Gullah African American Diabetic (GAAD) study, which recorded the extent of PD in a Type-2 diabetic Gullah-speaking African American population from the coastal South Carolina sea islands (Fernandes et al., 2009). For illustration, panel (a) in Figure 1 describes the measurement locations and sample data for a random subject, while panel (b) plots the density histogram of the CAL for the four tooth types from the GAAD dataset, revealing considerable right-skewness. Furthermore, because PD is the major cause of adult tooth loss, it is likely that patients with higher levels of CAL (and CP) exhibit a higher proportion of missing teeth, and hence this missingness mechanism is nonignorable (Reich et al., 2013). Also, CP and PD are hypothesized to be spatially referenced, that is, proximally located teeth usually have more similar disease status than distally located ones. Ignoring features (i)–(iii) in constructing any SMART design for CP may lead to imprecise estimates of the desired parameters. It is important to note here that the Ghosh et al. (2016) approach to a clustered SMART design considers traditional clustering (subunits within a cluster) of Gaussian continuous responses, and excludes spatial clustering and other features.

FIGURE 1. CAL data. *Note*. Panel (A) shows the observed CAL for a patient with a missing incisor, where the shaded boxes represent teeth, the circles represent sites, and gray lines represent neighbor pairs that connect adjacent sites on the same tooth and sites that share a gap between teeth. “Gap” in the figure indicates, for example, the four sites in the gap between teeth #4 and 5. The tooth numbers are indicated, and exclude the four third molars: 1, 16, 17, 32. The vertical and horizontal lines separate the mouth into four quadrants, with the molars (#2 and 3, #14 and 15, #18 and 19, #30 and 31), premolars (#4 and 5, #12 and 13, #20 and 21, #28 and 29), canines (#6, 11, 22, 27), and incisors (#7–10, #23–26). Panel (B) presents the frequency density plot of the CAL (rounded to the nearest millimeter) for each tooth type from the GAAD dataset.

In this paper, we set out to address the aforementioned limitations in developing a list of plausible DTRs for treating CP. We cast this into a two-stage SMART design framework for CP outcomes that exhibit (i)–(iii), and present an analysis plan and sample size calculations for (a) detecting a postulated effect size of a single treatment regime, and (b) detecting a postulated difference between two treatment regimes with or without a shared initial treatment. The tooth-level covariance structure describing spatial association is modeled by a conditionally autoregressive process (Reich & Bandyopadhyay, 2010). To accommodate possible skewness and tail behavior, the tooth-level CAL responses are assumed to have skew-t (ST; Azzalini & Capitanio, 2003a) errors, with the nonrandomly missing CAL values imputed via a shared parameter model corresponding to the missingness indicator. The proposed method considers mean comparison for the regimes with or without sharing an initial treatment, where the expected values and corresponding variances or covariance of the effect size of the treatment regimes are derived by inverse probability weighting (IPW) techniques (Robins, Rotnitzky, & Zhao, 1994b) and the method of moments. Note that the proposed methodology is not restricted to CP trials. It can also be applied (or extended) to other SMARTs involving DTRs for treating other health conditions (e.g., infectious diseases) for which outcomes may be skewed and spatially correlated, with some outcome data possibly being nonrandomly missing.

The rest of the paper is organized as follows. Section 2 introduces eight potential treatments and the corresponding DTRs that constitute the two-stage SMART design for CP. Section 3 presents the theoretical framework and a sample size calculation method under this SMART design, incorporating the aforementioned features typical to PD data. Section 4 investigates the finite-sample performance of the proposed sample size calculation method using synthetic data generated under various settings. Section 5 demonstrates the implementation of the R function SampleSize.SMARTp for calculating sample sizes, also available at the GitHub link https://github.com/bandyopd/SMARTp. Finally, the paper ends with a discussion in Section 6. Additional material consisting of detailed derivations is relegated to the Appendix.

2 | A PROPOSED SMART DESIGN INVOLVING DTRs FOR TREATING CP

In this section, we propose DTRs for treating CP, which are studied via a SMART design. The list of possible treatments consists of the treatment initiation steps: (1) oral hygiene instruction and (2) education on risk reduction. This is followed by (3) SRP or more advanced nonsurgical treatments that combine SRP with adjunctive therapy, as summarized in the systematic review of Smiley et al. (2015), such as (4) SRP with local antimicrobial therapy, (5) SRP with systemic antimicrobial therapy, (6) SRP with photodynamic therapy, which uses lasers, but only to activate an antimicrobial agent, (7) SRP with systemic subantimicrobial-dose doxycycline (SDD), and finally, (8) laser. The corresponding SMART design for developing DTRs is presented in Figure 2. Note that the number of potential DTRs is not limited to those shown in Figure 2. More details are presented in Section 6.

FIGURE 2. A SMART design schematic diagram for developing DTRs for treating chronic periodontitis. *Note*. R, randomization; 1, oral hygiene instruction; 2, education on risk reduction; 3, scaling and root planing (SRP); 4, SRP with local antimicrobial therapy; 5, SRP with systemic antimicrobial therapy; 6, SRP with photodynamic therapy; 7, SRP with systemic subantimicrobial-dose doxycycline (SDD); 8, laser therapy.

Oral hygiene is primarily used for prevention and initial therapy, especially during the early stage of periodontitis. At the beginning of the proposed trial, each participant has to attend the treatment initiation steps (1) and (2) before any randomization. Note that although SRP is the accepted gold standard, the role of laser therapy, though advantageous in targeting the diseased area precisely and accurately, still remains controversial as a standard of care. In this paper, we develop our SMART design with a primary focus on comparing the DTRs starting with either SRP (#3) or laser therapy (#8). At the initial stage, each participant is randomly allocated to either treatment 3 or 8. We propose a DTR that matches a patient’s need in achieving an outcome similar to that of SRP with adjuncts, though at a lower cost. Each possible treatment regime can have more than one path of treatment according to each patient’s evolving response. The patients who respond to the initial treatment continue the same treatment at the second stage of the trial. The patients who do not respond to treatment 3 are randomly allocated to one of the treatments 4–7 in the second stage. Similarly, for patients allocated to the laser arm (treatment 8), the nonresponders will also have the provision of being randomly allocated to one of treatments 4–7 in the second stage. The randomization probabilities at both the initial and the final stages of our SMART design are presented in Section 3.3. The primary final outcome measure is the recorded and rounded tooth-level CAL. The possible paths are listed below.

Path 1: “1,” “2,” “3,” “3”;
Path 2: “1,” “2,” “3,” “4”;
Path 3: “1,” “2,” “3,” “5”;
Path 4: “1,” “2,” “3,” “6”;
Path 5: “1,” “2,” “3,” “7”;
Path 6: “1,” “2,” “8,” “8”;
Path 7: “1,” “2,” “8,” “4”;
Path 8: “1,” “2,” “8,” “5”;
Path 9: “1,” “2,” “8,” “6”;
Path 10: “1,” “2,” “8,” “7.”

This leads to eight different DTRs (d₁−d₈) that are embedded within the two-stage SMART design, that is,

Regime 1 (d₁): (“3,” “3,”^R “4”^{N R});

Regime 2 (d₂): (“3,” “3,”^R “5”^{N R});

Regime 3 (d₃): (“3,” “3,”^R “6”^{N R});

Regime 4 (d₄): (“3,” “3,”^R “7”^{N R});

Regime 5 (d₅): (“8,” “8,”^R “4”^{N R});

Regime 6 (d₆): (“8,” “8,”^R “5”^{N R});

Regime 7 (d₇): (“8,” “8,”^R “6”^{N R});

Regime 8 (d₈): (“8,” “8,”^R “7”^{N R}).

Here, Regime 1 consists of treating a patient initially (after the basic treatment steps 1 and 2) with treatment 3, continuing with treatment 3 at the second stage if the patient is adjudged a responder (R) at the end of stage 1, and switching to treatment 4 at the second stage if s/he is a nonresponder (NR). All other regimes can be explained similarly.
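The correspondence between observed treatment paths and embedded regimes can be sketched in a few lines of Python. This is only an illustration: the tuple encoding `(initial, second stage if responder, second stage if nonresponder)` and the names `REGIMES` and `consistent_regimes` are assumptions introduced here, not part of the paper or of the SMARTp package.

```python
# Hypothetical encoding of the eight embedded regimes as
# (initial treatment, second-stage treatment if R, second-stage treatment if NR).
REGIMES = {
    f"d{k}": (first, first, second)
    for k, (first, second) in enumerate(
        [(3, 4), (3, 5), (3, 6), (3, 7), (8, 4), (8, 5), (8, 6), (8, 7)], start=1
    )
}


def consistent_regimes(initial, responder, second):
    """Return the names of all regimes that an observed path is consistent with."""
    matches = []
    for name, (a1, a2_r, a2_nr) in REGIMES.items():
        expected_second = a2_r if responder else a2_nr
        if initial == a1 and second == expected_second:
            matches.append(name)
    return matches


# Path 1: an SRP responder who continues SRP is consistent with regimes d1-d4.
print(consistent_regimes(3, True, 3))
# Path 9: a laser nonresponder switched to treatment 6 is consistent with d7 only.
print(consistent_regimes(8, False, 6))
```

A responder's path matches four regimes at once, which is why regime means are later estimated by weighting rather than by splitting the sample into disjoint groups.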

There are a number of advantages of SMART designs over a series of single-stage trials to develop an optimal DTR (Chakraborty & Moodie, 2013). First, the single-stage trials may fail to detect delayed therapeutic effects. For example, the patients who receive laser therapy may achieve better short-term outcomes than those who receive SRP initially; however, the relative merit of SRP may be realized in the second stage when possible adverse events occur due to laser therapy. Second, SMARTs offer the option to reveal useful diagnostic information in that even though an initial treatment (e.g., SRP) may not be particularly effective, it can guide a better treatment choice (e.g., systemic antimicrobial therapy as an adjunct) at the second stage; a series of single-stage trials fails to offer this diagnostic effect. Third, patients who do not respond to an initial treatment (e.g., SRP) may drop out of single-stage trials, but they are less likely to drop out of a SMART because they expect to receive possibly better treatments (e.g., SRP with adjuncts) at the next stage, even if SRP is not effective for them initially. Thus, the cohort of participants who are recruited and retained in a SMART may be quite different from those who are recruited and retained in a single-stage trial. This cohort effect may lead to biased estimation of effects of treatment sequences (DTRs) in single-stage trials, but should be well taken care of in a SMART.

3 | SMART DESIGN: MODEL, HYPOTHESIS TESTING, AND SAMPLE SIZE CALCULATIONS

In this section, we propose the theoretical framework and an associated novel sample size formula for our SMART design.

3.1 | Statistical model

We start by introducing some notation. Let A_i1 denote the treatment for patient i at the initial stage (i.e., “3” or “8”), R_i(A_i1) denote the proximal response after initial treatment A_i1, that is, R_i(·) = 1 if the ith patient is a responder and R_i(·) = 0 otherwise, A_i2(A_i1, R_i(A_i1)) denote the treatment at the final stage based on the initial treatment and proximal response, Y_it denote the final outcome measure, that is, change in mean CAL for the tth tooth of patient i, and M_it denote the missingness indicator of the tth tooth of patient i, that is, M_it = 1 if missing, or 0 otherwise. Our proposed model allows missing teeth to occur either before, or during the study, and the recorded Y_it at the end of the two-stage study is either a number (in millimeters), or missing (resulting from tooth t being missing before, or during the study). The expectation of the primary outcome Y_it considers the joint distribution of the mean CAL change and the missingness indicator; see Appendix C.

Thus, the observed data trajectory for patient i can be described as O_i = (A_i1, R_i(A_i1), A_i2(A_i1, R_i(A_i1)), Y_i,1, …, Y_i,28, M_i,1, …, M_i,28). Note that there are N patients in the sample, and each patient has a maximum of 28 teeth (if no tooth is missing). Thus, the (overall) outcome measure for patient i is ${\bar{Y}}_{i} = \sum_{t = 1}^{28} Y_{i t} (1 - M_{i t}) / \sum_{t = 1}^{28} (1 - M_{i t})$ , which is the mean of CAL of the available teeth for patient i. Hence, the proportion of available teeth for patient i is ${\hat{p}}_{i} = \sum_{t = 1}^{28} (1 - M_{i t}) / 28$ . The generative model for Y is specified by a regression model of the form

Y_{i t} = μ_{i} + Q_{i t} + ϵ_{i t 1},

(1)

for i = 1, …, N and t = 1, …, 28, where μ_i = β₀ + β₁A_i13 + β₂A_i13R_i + β₃R_i + β₄A_i13A_i24(1 − R_i) + β₅A_i13A_i25(1 − R_i) + β₆A_i13A_i26(1 − R_i). Here, A_i13 is an indicator of treatment “3” at the initial stage for patient i, A_i24 is an indicator of treatment “4” at the final stage for patient i (A_i25 and A_i26 are defined analogously), and ϵ_it1 is the (random) error term distributed as a skew-normal (SN), or ST density (Azzalini & Capitanio, 2003b), that is, $ϵ_{i t 1} \sim \mathrm{ST} (0, σ_{1}^{2}, λ, v)$, with location parameter 0, scale parameter σ₁, skewness parameter λ, and degrees of freedom v that measures the kurtosis. Note that the distribution of ϵ_it1 is normal if λ = 0 and v = ∞, SN if λ ≠ 0 and v = ∞, t if λ = 0 and v < ∞, and ST if λ ≠ 0 and v < ∞. Expressions of the mean, variance, skewness γ₁, and kurtosis γ₂ for both SN and ST distributions are presented in Appendices A.1 and B.1, respectively. Following Reich and Bandyopadhyay (2010), we assume the latent vector Q_i = (Q_i1, …, Q_i28)^⊤ follows a multivariate normal distribution, with mean vector 0_28×1 and covariance matrix Σ_28×28 with a conditional autoregressive (CAR) structure, that is, Σ_28×28 = τ²(C_28×28 − ρD_28×28)⁻¹. Here, τ² > 0 and ρ ∈ [0, 1] are the parameters controlling the magnitude of variation and the degree of spatial association, respectively. For the matrix D, the elements D_tt′ are ones if locations t and t′ are adjacent, and zeroes otherwise. The matrix C is diagonal with diagonal elements C_tt = Σ_t′ D_tt′.
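As a rough illustration of model (1), the following Python sketch builds a CAR covariance matrix and simulates the tooth-level outcomes of one patient. Several ingredients are assumptions made only for this sketch: the adjacency matrix links consecutive teeth within a jaw (the paper's actual neighbor structure follows Figure 1), the skew-t draws use the standard stochastic representation (a skew-normal variate divided by the square root of a scaled chi-square variate), the parameter values are arbitrary, and ρ is taken strictly below 1 so that C − ρD is invertible. All function names are hypothetical.

```python
import numpy as np


def chain_adjacency(n_teeth=28, per_jaw=14):
    """Hypothetical adjacency: consecutive teeth within the same jaw are neighbors."""
    D = np.zeros((n_teeth, n_teeth))
    for t in range(n_teeth - 1):
        if (t + 1) % per_jaw != 0:  # do not link the last upper tooth to the first lower tooth
            D[t, t + 1] = D[t + 1, t] = 1.0
    return D


def car_covariance(D, tau2, rho):
    """Sigma = tau^2 (C - rho D)^{-1}, with C = diag(row sums of D)."""
    C = np.diag(D.sum(axis=1))
    return tau2 * np.linalg.inv(C - rho * D)


def skew_t_errors(rng, size, sigma, lam, nu):
    """Draw ST(0, sigma^2, lam, nu) errors; nu = inf gives the skew-normal case."""
    delta = lam / np.sqrt(1.0 + lam ** 2)
    u0 = np.abs(rng.standard_normal(size))
    u1 = rng.standard_normal(size)
    z = delta * u0 + np.sqrt(1.0 - delta ** 2) * u1  # standard skew-normal draw
    if np.isinf(nu):
        return sigma * z
    w = rng.chisquare(nu, size) / nu
    return sigma * z / np.sqrt(w)


def simulate_patient(rng, mu_i, Sigma, sigma1, lam, nu):
    """Y_it = mu_i + Q_it + eps_it1 for the teeth of one patient."""
    n_teeth = Sigma.shape[0]
    q = rng.multivariate_normal(np.zeros(n_teeth), Sigma)
    eps = skew_t_errors(rng, n_teeth, sigma1, lam, nu)
    return mu_i + q + eps


# Illustrative (not clinically motivated) parameter values.
rng = np.random.default_rng(1)
D = chain_adjacency()
Sigma = car_covariance(D, tau2=0.85, rho=0.975)
y = simulate_patient(rng, mu_i=0.5, Sigma=Sigma, sigma1=0.95, lam=10.0, nu=8.0)
```

Setting `lam=0.0` and `nu=np.inf` recovers Gaussian errors, mirroring the special cases listed in the text.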

Next, under the assumption of nonrandomly missing teeth (locations of missing teeth are not random, but rather related to the CP health in that region of the mouth), we propose a probit regression model for the missing teeth indicator as a function of the underlying (spatial) latent term Q_i. Define M_it = I(M_it0 > 0), where M_it0 is a (latent) continuous variable, modeled as

M_{i t 0} = a_{0} + b_{0} Q_{i t} + ϵ_{i t 0},

(2)

where $ϵ_{i t 0} \overset{\text{i.i.d.}}{\sim} N (0, σ_{0}^{2})$. Note that the above probit trick allows us to connect the continuous (spatial) latent Q_i to the binary missingness indicator M_it. Thus, under a shared random parameter formulation (Albert, 2019) popularly used in longitudinal studies with informative (or missing not at random) dropout, and also in oral health studies (Reich & Bandyopadhyay, 2010), the latent variable Q_i not only models the association between teeth in a mouth, but also acts as the dependence term connecting the observed response and the missing data process. For the sake of identifiability, we choose $σ_{0}^{2} = 1$. Here, under the popular shared parameter framework (Vonesh, Greene, & Schluchter, 2006), Q facilitates sharing of information between Y and M for modeling nonrandomly missing data. The parameters a₀, b₀, and the quantities Q_i and ϵ_it0 determine the proportion of available teeth $p_{i} = E ({\hat{p}}_{i})$, which can be estimated using either a stochastic or a deterministic method (see Appendix C). The parameter b₀ controls the association between Y and M; for example, b₀ = 0 indicates no association. Also, the Pearson correlation coefficient between Y_it and M_it0 (from (2)) is $c_{i t} = b_{0} \mathrm{var} (Q_{i t}) / \sqrt{(\mathrm{var} (Q_{i t}) + \mathrm{var} (ϵ_{i t 1})) (b_{0}^{2} \mathrm{var} (Q_{i t}) + \mathrm{var} (ϵ_{i t 0}))}$; see Appendix C for the derivation. For power analysis, one may choose the distributions of Q_i, ϵ_it1, and ϵ_it0 from the literature, for example, Reich and Bandyopadhyay (2010). Define c_i = Σ_t c_it/28. The clinician may suggest values for μ_i, p_i, and c_i based on experience or a literature review, and the corresponding estimates of a₀ and b₀ can be obtained by solving a set of simultaneous equations involving p_i and c_i. However, when clinical advice is not available, sample size calculations should undergo a sensitivity analysis.
In other words, estimates of N can be computed under various choices of the parameters, for example, p_i = 0.2, 0.3, 0.5, or 0.8, and a conservative estimate of N may be selected.
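The quantities in this subsection can be illustrated with a short Python sketch. It assumes that, marginally, M_it0 in (2) is normal with mean a₀ and variance b₀² var(Q_it) + σ₀² (a consequence of Q_it being mean-zero normal and independent of ϵ_it0), so that P(M_it = 1) = Φ(a₀ / sqrt(b₀² var(Q_it) + σ₀²)). This is one possible deterministic calculation and is not claimed to be the Appendix C method; the function names and the inputs used below are hypothetical.

```python
import numpy as np
from math import erf, sqrt


def normal_cdf(x):
    """Standard normal CDF."""
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))


def expected_available_proportion(a0, b0, var_q, sigma0_sq=1.0):
    """p_i = 1 - average over teeth of P(M_it0 > 0) under the marginal normal assumption."""
    var_q = np.atleast_1d(np.asarray(var_q, dtype=float))
    sd_m0 = np.sqrt(b0 ** 2 * var_q + sigma0_sq)
    p_missing = np.array([normal_cdf(a0 / s) for s in sd_m0])
    return float(1.0 - p_missing.mean())


def correlation_y_m0(b0, var_q, var_eps1, sigma0_sq=1.0):
    """Tooth-level Pearson correlation c_it between Y_it and M_it0."""
    var_q = np.atleast_1d(np.asarray(var_q, dtype=float))
    return b0 * var_q / np.sqrt((var_q + var_eps1) * (b0 ** 2 * var_q + sigma0_sq))


def patient_summary(y, m):
    """Mean outcome over available teeth (Y-bar_i) and proportion available (p-hat_i)."""
    y = np.asarray(y, dtype=float)
    m = np.asarray(m, dtype=int)
    available = 1 - m
    n_available = int(available.sum())
    if n_available == 0:
        return float("nan"), 0.0  # the mean is undefined when every tooth is missing
    y_bar = float(np.where(available == 1, y, 0.0).sum() / n_available)
    return y_bar, n_available / y.size


# Hypothetical inputs: var(Q_it) would come from the diagonal of the CAR covariance.
var_q = np.full(28, 1.0)
p_i = expected_available_proportion(a0=-1.0, b0=0.5, var_q=var_q)
c_i = float(correlation_y_m0(b0=0.5, var_q=var_q, var_eps1=1.0).mean())
```

In a sensitivity analysis, one could evaluate these functions over a grid of (a₀, b₀) values to find pairs that reproduce clinician-suggested values of p_i and c_i.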

Next, we derive the expected value and variance for the sample mean of a DTR, using d₁ as an example, based on the IPW principle. IPW techniques have been successfully applied for estimating regression coefficients (Robins, Rotnitzky, & Zhao, 1994a), and population mean (Cao, Tsiatis, & Davidian, 2009), in the context of incomplete data. For DTRs under SMART designs, most likely, we are unable to sample data directly from a particular regime. For example, responders of SRP can be viewed as coming from any of regimes 1–4. Hence, a method of moments estimate of the sample mean for regime 1 is given by

{\bar{Y}}^{d_{1}} = E_{d_{1}} ({\bar{Y}}_{i}) = E (W_{i}^{d_{1}} {\bar{Y}}_{i}),

(3)

W_{i}^{d_{1}} = \frac{I (A_{i 1} = a_{1 i}^{d_{1}}, A_{i 2} = {[a_{2 i}^{d_{1} R}]}^{R_{i}} {[a_{2 i}^{d_{1} N R}]}^{1 - R_{i}})}{π_{1 i}^{d_{1}} {[π_{2 i}^{d_{1} R}]}^{R_{i}} {[π_{2 i}^{d_{1} N R}]}^{1 - R_{i}}},

(4)

where ${\bar{Y}}_{i}$ is the final outcome measure of CAL change for patient i; I(·) the indicator function; R_i the binary response indicator for treatment “3” at the initial stage; $a_{1 i}^{d_{1}}$ the regime 1 (d₁) treatment at the initial stage for patient i, for example, “3”; $a_{2 i}^{d_{1} R}$ the regime 1 treatment at the final stage if patient i is a responder (i.e., R_i = 1), for example, “3”; $a_{2 i}^{d_{1} N R}$ the regime 1 treatment at the final stage if patient i is a nonresponder (i.e., R_i = 0), for example, “4”; $π_{1 i}^{d_{1}}$ the probability of treatment allocation of regime 1 at the initial stage for patient i; $π_{2 i}^{d_{1} R}$ the probability of treatment allocation of regime 1 at the final (second) stage if patient i is a responder, that is, 1; and $π_{2 i}^{d_{1} N R}$ the probability of treatment allocation of regime 1 at the final stage if patient i is a nonresponder, that is, 1/4.
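The weight in (4) can be sketched in Python as follows. The tuple encoding of a regime, the function names, and the toy records are assumptions made for illustration; the regime mean is computed as the sample average of W_i·Ȳ_i, which is the sample analogue of E(W_i Ȳ_i) in (3).

```python
def ipw_weight(a1, r, a2, regime, pi1, pi2_r=1.0, pi2_nr=0.25):
    """Inverse probability weight W_i for one patient under a given regime.

    regime is a hypothetical tuple (initial, second stage if R, second stage if NR);
    r is 1 for responders and 0 for nonresponders.
    """
    a1_d, a2_r_d, a2_nr_d = regime
    expected_a2 = a2_r_d if r == 1 else a2_nr_d
    if a1 != a1_d or a2 != expected_a2:
        return 0.0  # observed path is not consistent with the regime
    pi2 = pi2_r if r == 1 else pi2_nr
    return 1.0 / (pi1 * pi2)


def ipw_regime_mean(records, regime, pi1):
    """Average of W_i * Ybar_i over all patients (sample analogue of E(W_i Ybar_i))."""
    total = 0.0
    for a1, r, a2, y_bar in records:
        total += ipw_weight(a1, r, a2, regime, pi1) * y_bar
    return total / len(records)


# Toy records: (initial treatment, responder indicator, second-stage treatment, Ybar_i).
d1 = (3, 3, 4)
records = [(3, 1, 3, 1.0), (3, 0, 4, 0.4), (3, 0, 5, 0.2), (8, 1, 8, 0.9)]
weights = [ipw_weight(a1, r, a2, d1, pi1=0.5) for a1, r, a2, _ in records]
estimate = ipw_regime_mean(records, d1, pi1=0.5)
```

With π₁ = 1/2, a responder to treatment 3 receives weight 2 and a nonresponder re-randomized to treatment 4 receives weight 8, reflecting that nonresponders following d₁ are observed four times less often.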

To maximize power, we estimate $π_{1 i}^{d_{1}}$ as in Murphy (2005) to have equal sample sizes across all possible regimes. Based on Figure 2, we set

π_{1 i}^{d_{1}} = \frac{{(1 \cdot γ^{d_{1}} + \frac{1}{4} \cdot (1 - γ^{d_{1}}))}^{- 1}}{{(1 \cdot γ^{d_{1}} + \frac{1}{4} \cdot (1 - γ^{d_{1}}))}^{- 1} + {(1 \cdot γ^{d_{5}} + \frac{1}{4} \cdot (1 - γ^{d_{5}}))}^{- 1}},

(5)

where $γ^{d_{1}}$ denotes the response rate for regime 1 at initial stage. When both $γ^{d_{1}}$ and $γ^{d_{5}}$ are unknown, Murphy (2005) suggested setting $π_{1 i}^{d_{1}}$ to

π_{1 i}^{d_{1}} = \frac{\max (1^{- 1}, (\frac{1}{4})^{- 1})}{\max (1^{- 1}, (\frac{1}{4})^{- 1}) + \max (1^{- 1}, (\frac{1}{4})^{- 1})} .

(6)

Alternatively, we can also set $π_{1 i}^{d_{1}} = 1 / 2$ , ensuring equal probability of treatment allocation at the initial stage.
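The allocation probabilities in (5) and (6) are simple to evaluate numerically. The Python sketch below is an illustration only: the function names are hypothetical, and the second-stage probabilities are fixed at 1 for responders and 1/4 for nonresponders, as in Figure 2.

```python
def initial_allocation_probability(gamma_a, gamma_b, pi2_r=1.0, pi2_nr=0.25):
    """Equation (5): probability of the first initial treatment, given the response
    rates gamma_a (first initial treatment) and gamma_b (second initial treatment)."""
    inv_a = 1.0 / (pi2_r * gamma_a + pi2_nr * (1.0 - gamma_a))
    inv_b = 1.0 / (pi2_r * gamma_b + pi2_nr * (1.0 - gamma_b))
    return inv_a / (inv_a + inv_b)


def initial_allocation_unknown_rates(pi2_r=1.0, pi2_nr=0.25):
    """Equation (6): version used when both response rates are unknown, assuming the
    same second-stage probabilities in both arms."""
    worst = max(1.0 / pi2_r, 1.0 / pi2_nr)
    return worst / (worst + worst)


# Hypothetical response rates of 25% for SRP and 50% for laser.
pi1_d1 = initial_allocation_probability(gamma_a=0.25, gamma_b=0.5)
pi1_unknown = initial_allocation_unknown_rates()
```

When the two arms share the same second-stage probabilities, as here, the expression in (6) evaluates to 1/2, and (5) also gives 1/2 whenever the two response rates are equal; a lower response rate in one arm pushes more patients toward that arm.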

The mean and variance of ${\bar{Y}}^{d_{1}}$ are given by

E ({\bar{Y}}^{d_{1}}) = E (\frac{I (A_{i 1} = a_{1 i}^{d_{1}}, A_{i 2} = {[a_{2 i}^{d_{1} R}]}^{R_{i}} {[a_{2 i}^{d_{1} N R}]}^{1 - R_{i}})}{π_{1 i}^{d_{1}} {[π_{2 i}^{d_{1} R}]}^{R_{i}} {[π_{2 i}^{d_{1} N R}]}^{1 - R_{i}}} {\bar{Y}}_{i})

(7)

and

var ({\bar{Y}}^{d_{1}}) = \frac{1}{N} var (\frac{I (A_{i 1} = a_{1 i}^{d_{1}}, A_{i 2} = {[a_{2 i}^{d_{1} R}]}^{R_{i}} {[a_{2 i}^{d_{1} N R}]}^{1 - R_{i}})}{π_{1 i}^{d_{1}} {[π_{2 i}^{d_{1} R}]}^{R_{i}} {[π_{2 i}^{d_{1} N R}]}^{1 - R_{i}}} {\bar{Y}}_{i}) .

(8)

In terms of π, γ, μ and σ, (7) and (8) can be expressed alternatively as

E ({\bar{Y}}^{d_{1}}) = γ^{d_{1}} μ_{d_{1} R} + (1 - γ^{d_{1}}) μ_{d_{1} N R} = μ_{d_{1}}

and

V ({\bar{Y}}^{d_{1}}) = \frac{1}{N} \left\{ \frac{γ^{d_{1}}}{π_{1 i}^{d_{1}} [π_{2 i}^{d_{1} R}]} \left( σ_{d_{1} R}^{2} + (1 - π_{1 i}^{d_{1}} [π_{2 i}^{d_{1} R}]) μ_{d_{1} R}^{2} \right) + \frac{1 - γ^{d_{1}}}{π_{1 i}^{d_{1}} [π_{2 i}^{d_{1} N R}]} \left( σ_{d_{1} N R}^{2} + (1 - π_{1 i}^{d_{1}} [π_{2 i}^{d_{1} N R}]) μ_{d_{1} N R}^{2} \right) + γ^{d_{1}} (1 - γ^{d_{1}}) {(μ_{d_{1} R} - μ_{d_{1} N R})}^{2} \right\},

where $σ_{d_{1} R}^{2}$ is the variance of ${\bar{Y}}_{i}$ under d₁ among responders (R_i = 1), and $μ_{d_{1} R}$ is the corresponding mean; $σ_{d_{1} N R}^{2}$ and $μ_{d_{1} N R}$ are defined analogously for nonresponders (R_i = 0). Detailed expressions of both $σ_{d_{1} R}^{2}$ and $μ_{d_{1} R}$ appear in Appendix C; in practice, however, we compute them using the Monte Carlo method. Likewise, both $E ({\bar{Y}}^{d_{3}})$ and $V ({\bar{Y}}^{d_{3}})$ can be derived; see Appendix C.
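The closed-form mean and variance can be checked against a direct simulation of the weighted outcome W_i·Ȳ_i. In the Python sketch below, the patient-level means Ȳ_i are drawn from normal distributions purely for convenience (the formulas only involve first and second moments), the parameter values are arbitrary, and all names are hypothetical.

```python
import numpy as np


def closed_form(gamma, pi1, pi2_r, pi2_nr, mu_r, mu_nr, s2_r, s2_nr):
    """Regime mean and N times the variance of the IPW estimator."""
    mean = gamma * mu_r + (1 - gamma) * mu_nr
    n_var = (
        gamma / (pi1 * pi2_r) * (s2_r + (1 - pi1 * pi2_r) * mu_r ** 2)
        + (1 - gamma) / (pi1 * pi2_nr) * (s2_nr + (1 - pi1 * pi2_nr) * mu_nr ** 2)
        + gamma * (1 - gamma) * (mu_r - mu_nr) ** 2
    )
    return mean, n_var


def monte_carlo(rng, n, gamma, pi1, pi2_nr, mu_r, mu_nr, s2_r, s2_nr):
    """Simulate W_i * Ybar_i for n patients (responders keep their treatment, so pi2_r = 1)."""
    in_arm = rng.random(n) < pi1        # randomized to the regime's initial treatment
    responder = rng.random(n) < gamma   # only matters for patients in that arm
    on_option = rng.random(n) < pi2_nr  # nonresponder re-randomized to the regime's adjunct
    y = np.where(
        responder,
        rng.normal(mu_r, np.sqrt(s2_r), n),
        rng.normal(mu_nr, np.sqrt(s2_nr), n),
    )
    w = np.zeros(n)
    w[in_arm & responder] = 1.0 / pi1
    w[in_arm & ~responder & on_option] = 1.0 / (pi1 * pi2_nr)
    x = w * y
    return x.mean(), x.var(ddof=1)


# Arbitrary illustrative values.
params = dict(gamma=0.4, pi1=0.5, pi2_nr=0.25, mu_r=1.0, mu_nr=0.3, s2_r=0.25, s2_nr=0.36)
mean_cf, nvar_cf = closed_form(pi2_r=1.0, **params)
mean_mc, nvar_mc = monte_carlo(np.random.default_rng(2024), 400_000, **params)
print(mean_cf, mean_mc)
print(nvar_cf, nvar_mc)
```

The two printed pairs should agree up to Monte Carlo error; dividing the second quantity by N gives the variance of the regime mean estimator.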

3.2 | Hypothesis testing

Our SMART design allows the following important hypothesis tests:

Hypothesis 1: detecting a single DTR effect on CAL, using $H_{0} : μ_{d_{1}} = 0$ versus $H_{1} : μ_{d_{1}} = δ_{d_{1}} \neq 0$;
Hypothesis 2: comparing two DTRs that share an initial treatment (e.g., DTRs starting with SRP as initial treatment and then followed by various adjuncts), using $H_{0} : μ_{d_{1}} - μ_{d_{3}} = 0$ versus $H_{1} : μ_{d_{1}} - μ_{d_{3}} = δ_{d_{1} - d_{3}} \neq 0$;
Hypothesis 3: comparing two DTRs that do not share an initial treatment (e.g., DTRs initiated by SRP and laser, respectively), using $H_{0} : μ_{d_{1}} - μ_{d_{5}} = 0$ versus $H_{1} : μ_{d_{1}} - μ_{d_{5}} = δ_{d_{1} - d_{5}} \neq 0$;
Hypothesis 4: detecting whether one regime (e.g., regime 1) is the best among all eight embedded regimes in Figure 2, that is,

H_{0} : μ_{d_{1}} - μ_{d_{2}} \leq 0 \text{ or } μ_{d_{1}} - μ_{d_{3}} \leq 0 \text{ or } \dots \text{ or } μ_{d_{1}} - μ_{d_{8}} \leq 0

versus

H_{1} : μ_{d_{1}} - μ_{d_{2}} > 0 \text{ and } μ_{d_{1}} - μ_{d_{3}} > 0 \text{ and } \dots \text{ and } μ_{d_{1}} - μ_{d_{8}} > 0.

Hypothesis 1 can be used to test whether the improvement in CAL in the proposed DTR is better than SRP (e.g., ≥ 0.5 mm), or not worse than SRP with adjuncts (e.g., 0.7–1.1 mm), based on the systematic review results of Smiley et al. (2015). As the NMA by John et al. (2017) found no significant evidence of CAL improvement among adjuncts, Hypothesis 2 can be used to test if indeed there are statistically significant differences between the DTRs of SRP and “SRP + adjuncts.” The treatment effect of laser therapy is still under investigation; we can use Hypothesis 3 to test if there is a statistically significant difference between the DTRs initiated by SRP and laser. Finally, Hypothesis 4 can assist us in testing the finding by John et al. (2017) that SRP with local antimicrobial therapy is possibly the best when compared to SRP with other adjuncts.

Consider Hypothesis 2. The expectation and variance of the difference in regime means can be expressed, respectively, as

E ({\bar{Y}}^{d_{1}} - {\bar{Y}}^{d_{3}}) = μ_{d_{1}} - μ_{d_{3}} = δ_{d_{1} - d_{3}}

(9)

and

V ({\bar{Y}}^{d_{1}} - {\bar{Y}}^{d_{3}}) = V ({\bar{Y}}^{d_{1}}) + V ({\bar{Y}}^{d_{3}}) - 2 cov ({\bar{Y}}^{d_{1}}, {\bar{Y}}^{d_{3}}) = \frac{1}{N} 2 σ_{d_{1} - d_{3}}^{2} .

(10)

Note that both δ and σ² in Equations (9) and (10), respectively, are functions of the parameter vector $Ω_{d_{1} - d_{3}} = (μ, τ, ρ, λ, ν, σ_{1}^{2}, σ_{0}^{2}, a_{0}, b_{0}, γ^{d_{1}}, π_{1}^{d_{1}}, π_{2}^{d_{1} R}, π_{2}^{d_{1} N R}, γ^{d_{3}}, π_{1}^{d_{3}}, π_{2}^{d_{3} R}, π_{2}^{d_{3} N R})$ . Here, $μ, τ, ρ, λ, ν, σ_{1}^{2}, σ_{0}^{2}, a_{0}$ and b₀ are defined in Equations (1) and (2), while the parameters γ and π are defined in Equations (3)–(5). The covariance between ${\bar{Y}}^{d_{1}}$ and ${\bar{Y}}^{d_{3}}$ is

cov ({\bar{Y}}^{d_{1}}, {\bar{Y}}^{d_{3}}) = \frac{1}{N} \left\{ \frac{γ^{d_{1}}}{π_{1 i}^{d_{1}} π_{2 i}^{d_{1} R}} (σ_{d_{1} R}^{2} + μ_{d_{1} R}^{2}) - γ^{d_{1}} γ^{d_{3}} μ_{d_{1} R} μ_{d_{3} R} - γ^{d_{1}} (1 - γ^{d_{3}}) μ_{d_{1} R} μ_{d_{3} N R} - γ^{d_{3}} (1 - γ^{d_{1}}) μ_{d_{1} N R} μ_{d_{3} R} - (1 - γ^{d_{1}}) (1 - γ^{d_{3}}) μ_{d_{1} N R} μ_{d_{3} N R} \right\} .

Note that $γ^{d_{1}} = γ^{d_{3}}$ , $μ_{d_{1} R} = μ_{d_{3} R}$ , and $σ_{d_{1} R}^{2} = σ_{d_{3} R}^{2}$ because the responders from treatment “3” are consistent with both regimes 1 and 3. The derivations of both $E ({\bar{Y}}^{d_{1}} - {\bar{Y}}^{d_{3}})$ and $V ({\bar{Y}}^{d_{1}} - {\bar{Y}}^{d_{3}})$ can be found in Appendix C. Next, we present a theoretical result following a set of assumptions; see Appendix D for the proof.
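Before turning to that result, the covariance expression above can be sketched in R. This is only an illustration under the stated identities ($γ^{d_{1}} = γ^{d_{3}}$, $μ_{d_{1} R} = μ_{d_{3} R}$), in which case the four subtracted terms collapse to the product of the two regime means. The function name, argument names, and numerical inputs are hypothetical and are not taken from the SMARTp package.

```r
# Sketch of cov(Ybar^{d1}, Ybar^{d3}) for two regimes sharing an initial treatment.
# All names are hypothetical; they mirror the symbols used in the text.
cov_shared_start <- function(N, gamma, pi1, pi2R,
                             mu_R, sigma2_R, mu1_NR, mu3_NR) {
  # Responders are common to both regimes, so gamma, mu_R and sigma2_R are shared.
  second_moment <- gamma / (pi1 * pi2R) * (sigma2_R + mu_R^2)
  # Each regime mean is a responder/nonresponder mixture.
  mean_d1 <- gamma * mu_R + (1 - gamma) * mu1_NR
  mean_d3 <- gamma * mu_R + (1 - gamma) * mu3_NR
  # The four subtracted product terms in the formula equal mean_d1 * mean_d3.
  (second_moment - mean_d1 * mean_d3) / N
}

# Hypothetical inputs, chosen only to exercise the function.
cov_13 <- cov_shared_start(N = 100, gamma = 0.25, pi1 = 0.5, pi2R = 1,
                           mu_R = 0, sigma2_R = 1, mu1_NR = 0.5, mu3_NR = 2)
print(cov_13)
```

In the paper itself the responder mean and variance come from Monte Carlo evaluation rather than from user-supplied constants as here.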

Assumptions.

Random vectors $({\bar{Y}}_{i}, W_{i}^{d_{1}}, W_{i}^{d_{3}})$ , 1 ≤ i ≤ N, are independent and identically distributed, and the distribution of ${\bar{Y}}_{i}$ is independent of $W_{i}^{d_{1}}$ and $W_{i}^{d_{3}}$ , where $W_{i}^{d_{1}}$ is defined by Equation (4).
Let Θ, the parameter space for the regime means under consideration (e.g., $μ_{d_{1}}$ and $μ_{d_{3}}$ ), be a compact subset of real numbers. Also let $μ_{d_{1} 0}$ , $μ_{d_{3} 0}$ , and $δ_{(d_{1} - d_{3}) 0}$ denote the true values of $μ_{d_{1}}$ , $μ_{d_{3}}$ , and $δ_{d_{1} - d_{3}}$ , respectively. Assume unbiased estimators $\frac{1}{N_{2}} \sum_{i = 1}^{N_{2}} W_{i}^{d_{1}} {\bar{Y}}_{i}$ and $\frac{1}{N_{2}} \sum_{i = 1}^{N_{2}} W_{i}^{d_{3}} {\bar{Y}}_{i}$ for $μ_{d_{1}}$ and $μ_{d_{3}}$ , respectively, that is, $E (W_{i}^{d_{1}} {\bar{Y}}_{i} - μ_{d_{1}}) = 0$ only when $μ_{d_{1}} = μ_{d_{1} 0}$ and $E (W_{i}^{d_{3}} {\bar{Y}}_{i} - μ_{d_{3}}) = 0$ only when $μ_{d_{3}} = μ_{d_{3} 0}$ . Hence, $E ({\hat{δ}}_{d_{1} - d_{3}} - δ_{d_{1} - d_{3}}) = 0$ only for $δ_{d_{1} - d_{3}} = δ_{(d_{1} - d_{3}) 0} = μ_{d_{1} 0} - μ_{d_{3} 0}$ . Note that N₂ represents sample size N of Hypothesis 2.

Theorem 3.1. The IPW and MOM estimator ${\hat{δ}}_{d_{1} - d_{3}}$ is a consistent estimator of $δ_{d_{1} - d_{3}}$ . Under moment conditions and the above assumptions, we have $\sqrt{N_{2}} ({\hat{δ}}_{d_{1} - d_{3}} - δ_{(d_{1} - d_{3}) 0}) \to \text{Normal} (0, 2 σ_{d_{1} - d_{3}}^{2})$ in distribution.

Theorem 3.1 is stated for Hypothesis 2; similar theorems can be derived for the other hypotheses. Although regimes 1 and 5 do not share an initial treatment, the covariance between the sample means of these two regimes can be derived in a similar way as $cov ({\bar{Y}}^{d_{1}}, {\bar{Y}}^{d_{3}})$ . The mathematical formula for $cov ({\bar{Y}}^{d_{1}}, {\bar{Y}}^{d_{5}})$ is given in Appendix C.

Before deriving the sample size formulas, we present the test statistics for the corresponding hypotheses. For $H_{0} : μ_{d_{1}} - μ_{d_{3}} = 0$ versus $H_{1} : μ_{d_{1}} - μ_{d_{3}} = δ_{d_{1} - d_{3}} \neq 0$ (Hypothesis 2), we use the univariate Wald statistic $Z_{2} = {\hat{δ}}_{d_{1} - d_{3}} / \sqrt{2 σ_{d_{1} - d_{3}}^{2} / N_{2}}$ , where ${\hat{δ}}_{d_{1} - d_{3}} = {\bar{Y}}^{d_{1}} - {\bar{Y}}^{d_{3}}$ is the estimated effect size and $σ_{d_{1} - d_{3}}^{2}$ is given in (10). In large samples, Z₂ follows a standard normal distribution if H₀ is true. Hence, at significance level α, we reject H₀ if |Z₂| > z_α/2, where z_α/2 is the upper α/2 quantile of the standard normal distribution. Similarly, the test statistics for Hypothesis 1 ( $H_{0} : μ_{d_{1}} = 0$ versus $H_{1} : μ_{d_{1}} = δ_{d_{1}} \neq 0$ ) and Hypothesis 3 ( $H_{0} : μ_{d_{1}} - μ_{d_{5}} = 0$ versus $H_{1} : μ_{d_{1}} - μ_{d_{5}} = δ_{d_{1} - d_{5}} \neq 0$ ) are $Z_{1} = {\hat{δ}}_{d_{1}} / \sqrt{2 σ_{d_{1}}^{2} / N_{1}}$ and $Z_{3} = {\hat{δ}}_{d_{1} - d_{5}} / \sqrt{2 σ_{d_{1} - d_{5}}^{2} / N_{3}}$ , respectively; both follow a standard normal distribution under H₀.
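The two-sided Wald test described above can be sketched in a few lines of R. This is a minimal illustration, assuming the effect estimate and the variance term $σ^{2}$ from (10) are already available; the function name and the numerical inputs are hypothetical.

```r
# Two-sided Wald test for a regime contrast: Z = delta_hat / sqrt(2 * sigma2 / N).
# delta_hat, sigma2 and N below are made-up inputs, not values from the paper.
wald_two_sided <- function(delta_hat, sigma2, N, alpha = 0.05) {
  z <- delta_hat / sqrt(2 * sigma2 / N)
  critical <- qnorm(1 - alpha / 2)   # upper alpha/2 standard normal quantile
  list(z = z, critical = critical, reject = abs(z) > critical)
}

res <- wald_two_sided(delta_hat = 1.1, sigma2 = 10, N = 127)
print(res)
```

The same function applies to Hypotheses 1 and 3 by passing the corresponding effect estimate, variance term, and sample size.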

Note that Hypothesis 4 uses a multivariate test statistic $Z = {(Z_{d_{1} - d_{2}}, Z_{d_{1} - d_{3}}, \dots, Z_{d_{1} - d_{8}})}^{⊤}$ , where each component of the vector, for example, $Z_{d_{1} - d_{2}} = {\hat{δ}}_{d_{1} - d_{2}} / \sqrt{2 σ_{d_{1} - d_{2}}^{2} / N_{4}}$ , follows a standard normal distribution, under H₀. Hence, at α level of significance, we reject H₀ if all of $Z_{d_{1} - d_{2}}, \dots, Z_{d_{1} - d_{8}} > z_{α}$ . The first three hypothesis tests are two-sided tests, while the fourth hypothesis is one-sided. Thus, in order to compute the joint probability $Pr (Z_{d_{1} - d_{2}} > z_{α}, \dots, Z_{d_{1} - d_{8}} > z_{α})$ , one requires the covariance matrix of ${({\hat{δ}}_{d_{1} - d_{2}}, \dots, {\hat{δ}}_{d_{1} - d_{8}})}^{⊤}$ ; the derivation of $cov ({\hat{δ}}_{d_{1} - d_{2}}, {\hat{δ}}_{d_{1} - d_{3}})$ is presented in Appendix C.
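The joint rejection probability for Hypothesis 4 can be approximated by simulation once a correlation matrix for the contrasts is available. The sketch below is a base-R Monte Carlo stand-in, not the authors' implementation: it writes each statistic as a standardized shift $δ^{*} \sqrt{N / 2}$ plus a correlated standard normal error, estimates the probability that all statistics exceed $z_{α}$, and then searches for the N that attains a target power (the idea behind Equation (14) in Section 3.3). The correlation matrix, effect sizes, and function names are hypothetical.

```r
set.seed(1)

# Draw correlated standard normal errors with correlation matrix `corr`.
draw_errors <- function(n_draws, corr) {
  k <- ncol(corr)
  matrix(rnorm(n_draws * k), n_draws, k) %*% chol(corr)
}

# Estimated Pr(all Z_j > z_alpha) when Z_j = delta_std_j * sqrt(N / 2) + E_j.
joint_power <- function(N, delta_std, E, alpha = 0.025) {
  threshold <- qnorm(1 - alpha) - delta_std * sqrt(N / 2)
  mean(rowSums(sweep(E, 2, threshold, ">")) == ncol(E))
}

# Hypothetical setting: three contrasts with common correlation 0.5.
corr <- matrix(0.5, 3, 3)
diag(corr) <- 1
E <- draw_errors(2e5, corr)   # fixed draws keep the power curve monotone in N
delta_std <- c(0.5, 0.5, 0.5)

N4 <- uniroot(function(N) joint_power(N, delta_std, E) - 0.8,
              interval = c(2, 5000))$root
print(ceiling(N4))
```

Reusing one fixed set of draws for every N keeps the estimated power nondecreasing in N, which makes the root search stable. A numerical multivariate normal integral (for instance from the mvtnorm package loaded in Section 5) could replace the simulation.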

3.3 |. Sample size calculation

The calculated sample size suffices to detect the postulated effect size corresponding to a single regime, to the difference between two regimes, or to the superiority of one regime over the others. The proposed sample size formulas for Hypotheses 1–4 under our SMART design are given by

N_{1} = 2 {(z_{α / 2} - z_{1 - β})}^{2} \frac{σ_{d_{1}}^{2}}{δ_{d_{1}}^{2}},

(11)

N_{2} = 2 {(z_{α / 2} - z_{1 - β})}^{2} \frac{σ_{d_{1} - d_{3}}^{2}}{δ_{d_{1} - d_{3}}^{2}},

(12)

N_{3} = 2 {(z_{α / 2} - z_{1 - β})}^{2} \frac{σ_{d_{1} - d_{5}}^{2}}{δ_{d_{1} - d_{5}}^{2}},

(13)

while N₄ is the root of the equation

\Pr \left\{ Z_{d_{1} - d_{2}} \leq - (z_{α} - δ_{d_{1} - d_{2}}^{*} \sqrt{N_{4} / 2}), \dots, Z_{d_{1} - d_{8}} \leq - (z_{α} - δ_{d_{1} - d_{8}}^{*} \sqrt{N_{4} / 2}) \right\} = \text{power},

(14)

respectively, where $σ_{d_{1} - d_{3}}^{2}$ is defined by (10), and $σ_{d_{1}}^{2}$ and $σ_{d_{1} - d_{5}}^{2}$ are defined analogously; α = Pr(Type I error), β = Pr(Type II error) = 1 – Power, and, for a standard normal variable z, Pr(z > z_α/2) = α/2 and Pr(z > z_1−β) = 1 – β. The effect size is $δ_{d_{1} - d_{3}} = μ_{d_{1}} - μ_{d_{3}}$ , and $Z_{d_{1} - d_{2}}, \dots, Z_{d_{1} - d_{8}}$ have standard normal distributions with the same correlation matrix as ${\hat{δ}}_{d_{1} - d_{2}}, \dots, {\hat{δ}}_{d_{1} - d_{8}}$ ; see the derivation in Appendix C. The standardized effect size is defined as $δ_{d_{1} - d_{3}}^{*} = δ_{d_{1} - d_{3}} / σ_{d_{1} - d_{3}}$ . Note that our calculations advance the previous ones for SMART designs in clustered data (Ghosh et al., 2016; NeCamp et al., 2017) by including non-Gaussianity, spatial association, and nonrandom missingness, all typical of periodontal responses, in addition to considering comparisons between regimes that share the same initial treatment, and even testing whether one particular regime is the best among all embedded regimes. Also, the subjects (or clusters) in our setting are randomly allocated with equal probability to each embedded regime, as suggested by Murphy (2005) to maximize the power for comparing regime means.
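Formulas (11)–(13) share one closed form in the standardized effect size, which the following R sketch evaluates. The function name is hypothetical; the quantile convention follows the definitions above, so that z_1−β is negative whenever the power exceeds 0.5.

```r
# Closed-form sample size N = 2 * (z_{alpha/2} - z_{1-beta})^2 / delta_std^2,
# where delta_std = delta / sigma is the standardized effect size.
smart_n <- function(delta_std, alpha = 0.05, power = 0.80) {
  z_alpha2 <- qnorm(1 - alpha / 2)  # Pr(z > z_alpha2) = alpha / 2
  z_1mbeta <- qnorm(1 - power)      # Pr(z > z_1mbeta) = power = 1 - beta
  2 * (z_alpha2 - z_1mbeta)^2 / delta_std^2
}

n_example <- smart_n(0.43)
print(n_example)   # about 84.9
```

With the rounded value 0.43 shown in the first row of Table 1, the formula returns roughly 84.9, whereas the table reports 84; the gap presumably arises because the tabulated sample size was computed from the unrounded standardized effect size.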

4 |. SIMULATION STUDIES

We now present simulation studies to investigate the finite-sample performance of the proposed sample size formulas (11)–(14) by computing Monte Carlo power estimates based on 5,000 simulated datasets, given the Type-II error rate β = 0.2 (i.e., nominal power of 80%) and Type-I error rate α = 0.05 (Hypotheses 1–3) or 0.025 (Hypothesis 4). The Monte Carlo data generation steps are given below. These include generating the random variables A_i1, R_i(A_i1), A_i2(A_i1, R_i(A_i1)), M_it and Y_it for each patient i, i = 1,…,N.

Step 1. The initial treatment A_i1 is assigned randomly to either “3” or “8,” with probability $π_{1 i}^{d_{1}}$ and $1 - π_{1 i}^{d_{1}}$ , respectively.

Step 2. The response variable R_i(A_i1) is generated from Bernoulli $(γ^{d_{1}})$ if A_i1=“3,” or Bernoulli $(γ^{d_{5}})$ if A_i1=“8,” where $γ^{d_{1}} = 0.25$ or 0.5 and $γ^{d_{5}} = 0.5$ .

Step 3. The final treatment A_i2(A_i1 = “3,” R_i(A_i1 = “3”) = 1) is assigned to “3” with probability 1, and A_i2(A_i1 = “3,” R_i(A_i1 = “3”) = 0) is randomly assigned to “4,” “5,” “6,” or “7,” each with probability ¼, while A_i2(A_i1 = “8,” R_i(A_i1 = “8”) = 1) is assigned to “8” with probability 1 and A_i2(A_i1 = “8,” R_i(A_i1 = “8”) = 0) is randomly assigned to “4,” “5,” “6,” or “7,” each with probability ¼.

Step 4. The change in mean CAL Y_it and missingness indicator M_it of each tooth are generated by regression models (1) and (2), respectively. For the model parameters, we assume τ = 0.85, ρ = 0.975, a₀ = −1, b₀ = 0.5, σ₁ = 0.95 and σ₀ = 1, based on estimates from Reich and Bandyopadhyay (2010). For model (1), we select μ_i = 0 if R_i = 1 and μ_i = 0.5, 2 or 5 if R_i = 0 to test the proposed method. The skewness and kurtosis parameters for the error term ϵ_it1 of model (1) are chosen as λ = 0, 2 and 10, and ν = ∞, 8, 5 and 3. This choice of parameters for model (2) gives an expected proportion of available teeth of around 80% (i.e., p_i ≈ 0.8) for each patient. Given the parameters of models (1) and (2), at λ = 10 and ν = 3, the association between CAL change and missingness is around 0.42 (i.e., c_i ≈ 0.42).

Step 5. The mean CAL change for patient i is computed as ${\bar{Y}}_{i} = \frac{\sum_{t = 1}^{28} Y_{i t} (1 - M_{i t})}{\sum_{t = 1}^{28} (1 - M_{i t})}$ .
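Steps 1–3 and 5 above can be sketched in R as follows. This is a partial illustration only: Step 4 depends on models (1) and (2) and is deliberately left out, and the equal stage-1 randomization probability `pi1 = 0.5`, the function names, and the toy inputs to `patient_mean` are assumptions made for the example.

```r
set.seed(2019)

# Steps 1-3: initial treatment, response status, and final treatment.
simulate_paths <- function(N, pi1 = 0.5, gamma_3 = 0.25, gamma_8 = 0.5) {
  # Step 1: initial treatment "3" with probability pi1, otherwise "8".
  A1 <- ifelse(runif(N) < pi1, "3", "8")
  # Step 2: response indicator, with a response rate that depends on A1.
  R <- rbinom(N, 1, ifelse(A1 == "3", gamma_3, gamma_8))
  # Step 3: responders keep their treatment; nonresponders are
  # re-randomized to "4", "5", "6" or "7" with probability 1/4 each.
  A2 <- ifelse(R == 1, A1, sample(c("4", "5", "6", "7"), N, replace = TRUE))
  data.frame(A1 = A1, R = R, A2 = A2, stringsAsFactors = FALSE)
}

# Step 5: patient-level mean over available teeth (m = 1 marks a missing tooth).
patient_mean <- function(y, m) sum(y * (1 - m)) / sum(1 - m)

paths <- simulate_paths(5000)
print(table(paths$A1, paths$A2))
print(patient_mean(c(1, 2, 3), c(0, 1, 0)))  # averages the two observed teeth: 2
```

A full replication would generate the tooth-level Y_it and M_it from models (1) and (2) before applying `patient_mean` to each patient.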

Tables 1–4 present a list of sample sizes calculated by the proposed method, with the corresponding estimated Monte Carlo powers based on Hypotheses 1–4, respectively. Various pairs of the skewness and kurtosis parameters (λ, ν) corresponding to the error ϵ_it1 are considered. Recall that λ = 0 and ν = ∞ indicates the normal distribution; λ ≠ 0 and ν = ∞ indicates the SN distribution; λ = 0 and ν < ∞ indicates the t distribution; and λ ≠ 0 and ν < ∞ indicates the ST distribution. We consider a list of effect sizes (e.g., $δ_{d_{1}} = μ_{d_{1}}$ ), and present their corresponding absolute values (e.g., $| δ_{d_{1}} |$ ) and the absolute values of the standardized effect sizes, that is, $| δ_{d_{1}}^{*} | = | δ_{d_{1}} / σ_{d_{1}} |$ , except for Table 4, where we present the average of the absolute effect sizes (i.e., $\bar{| δ |} = \sum_{j = 2}^{8} | δ_{d_{1} - d_{j}} | / 7$ ) and the corresponding average standardized effect size (i.e., $\bar{| δ^{*} |} = \sum_{j = 2}^{8} | δ_{d_{1} - d_{j}} / σ_{d_{1} - d_{j}} | / 7$ ).

TABLE 1.

Formula-based sample size (N₁) and estimated Monte Carlo power $(\hat{P})$ at β = 0.2 and α = 0.05, based on hypothesis test of $H_{0} : μ_{d_{1}} = 0$ versus $H_{1} : μ_{d_{1}} \neq 0$ , under varying absolute effect size $(| δ_{d_{1}} |)$ or standardized effect size $(| δ_{d_{1}}^{*} |)$ , treatment “3” response rate $(γ^{d_{1}})$ , skewness (λ), and degrees of freedom (ν), given treatment “8” response rate $γ^{d_{5}} = 0.5$ , σ₁ = 0.95, σ₀ = 1, ρ = 0.975, τ = 0.85, expected percentage of available teeth per patient p_i = 80%

Absolute effect size	Skewness	Degrees of freedom	Response rate	Standardized effect size	Formula-based sample size	Monte Carlo power
$| δ_{d_{1}} |$	λ	ν	$γ^{d_{1}}$	$| δ_{d_{1}}^{*} |$	N₁	$\hat{P}$
1.28	0	Inf	0.25	0.43	84.00	0.80
3.53				0.48	68.00	0.79
0.78			0.5	0.29	189.00	0.79
2.28				0.34	135.00	0.80
1.28		8	0.25	0.43	84.00	0.80
3.53				0.48	68.00	0.79
0.78			0.5	0.29	190.00	0.81
2.28				0.34	135.00	0.80
1.27		5	0.25	0.43	85.00	0.80
3.53				0.48	68.00	0.80
0.78			0.5	0.29	192.00	0.79
2.28				0.34	135.00	0.80
1.28		3	0.25	0.43	86.00	0.81
3.52				0.48	68.00	0.80
0.78			0.5	0.28	196.00	0.79
2.28				0.34	135.00	0.79
1.96	2	Inf	0.25	0.51	61.00	0.79
4.20				0.51	61.00	0.81
1.45			0.5	0.41	92.00	0.80
2.95				0.39	102.00	0.78
2.02		8	0.25	0.51	60.00	0.79
4.27				0.51	61.00	0.81
1.53			0.5	0.42	88.00	0.80
3.03				0.40	100.00	0.79
2.09		5	0.25	0.52	59.00	0.80
4.33				0.51	60.00	0.80
1.58			0.5	0.43	85.00	0.79
3.08				0.40	98.00	0.79
2.22		3	0.25	0.52	58.00	0.80
4.46				0.52	60.00	0.79
1.71			0.5	0.44	80.00	0.79
3.21				0.41	94.00	0.79
2.03	10	Inf	0.25	0.51	60.00	0.79
4.28				0.51	61.00	0.81
1.53			0.5	0.43	87.00	0.80
3.03				0.40	99.00	0.78
2.11		8	0.25	0.52	59.00	0.79
4.36				0.51	60.00	0.80
1.61			0.5	0.44	83.00	0.79
3.11				0.40	97.00	0.79
2.17		5	0.25	0.52	58.00	0.79
4.42				0.52	60.00	0.80
1.67			0.5	0.44	81.00	0.80
3.18				0.41	95.00	0.79
2.31		3	0.25	0.53	57.00	0.80
4.57				0.52	59.00	0.79
1.82			0.5	0.46	76.00	0.80
3.32				0.42	92.00	0.80


TABLE 4.

Formula-based sample size (N₄) and estimated Monte Carlo power $(\hat{P})$ at β = 0.2 and α = 0.025, based on hypothesis test of $H_{0} : μ_{d_{1}} \leq μ_{d_{2}} or \dots or μ_{d_{1}} \leq μ_{d_{8}}$ versus $H_{1} : μ_{d_{1}} > μ_{d_{2}} and \dots and μ_{d_{1}} > μ_{d_{8}}$ , under varying average of absolute effect sizes $(\bar{| δ |} = \sum_{j = 2}^{8} | δ_{d_{1} - d_{j}} | / 7)$ or standardized effect size $(\bar{| δ^{*} |} = \sum_{j = 2}^{8} | δ_{d_{1} - d_{j}} / σ_{d_{1} - d_{j}} | / 7)$ and treatment “3” response rate $(γ^{d_{1}})$ , skewness (λ), and degrees of freedom (ν), given treatment “8” response rate $γ^{d_{5}} = 0.5$ , σ₁ = 0.95, σ₀ = 1, ρ = 0.975, τ = 0.85, expected percentage of available teeth per patient p_i = 80%

Absolute effect size	Skewness	Degrees of freedom	Response rate	Standardized effect size	Formula-based sample size	Monte Carlo power
$\bar{| δ |}$	λ	ν	$γ^{d_{1}}$	$\bar{| δ^{*} |}$	N₄	$\hat{P}$
1.50	0	Inf	0.25	0.48	95.00	0.79
3.75				0.51	71.00	0.80
1.00			0.5	0.35	179.00	0.80
2.50				0.37	132.00	0.81
1.50		8	0.25	0.48	96.00	0.80
3.75				0.51	71.00	0.81
1.00			0.5	0.35	180.00	0.80
2.50				0.37	132.00	0.80
1.50		5	0.25	0.47	97.00	0.80
3.75				0.51	71.00	0.80
1.00			0.5	0.35	182.00	0.81
2.50				0.37	132.00	0.79
1.50		3	0.25	0.47	101.00	0.80
3.75				0.51	72.00	0.81
1.00			0.5	0.34	189.00	0.80
2.50				0.37	134.00	0.79
1.50	2	Inf	0.25	0.36	161.00	0.79
3.75				0.44	93.00	0.79
1.00			0.5	0.26	297.00	0.79
2.50				0.33	172.00	0.79
1.50		8	0.25	0.35	173.00	0.78
3.75				0.43	97.00	0.79
1.00			0.5	0.26	318.00	0.78
2.50				0.32	179.00	0.79
1.50		5	0.25	0.34	183.00	0.78
3.75				0.43	100.00	0.79
1.00			0.5	0.25	336.00	0.78
2.50				0.32	184.00	0.79
1.50		3	0.25	0.32	209.00	0.80
3.75				0.42	107.00	0.79
1.00			0.5	0.24	383.00	0.79
2.50				0.31	197.00	0.80
1.50	10	Inf	0.25	0.35	172.00	0.78
3.75				0.43	97.00	0.81
1.00			0.5	0.26	316.00	0.79
2.50				0.32	179.00	0.79
1.50		8	0.25	0.34	186.00	0.79
3.75				0.43	101.00	0.79
1.00			0.5	0.25	344.00	0.80
2.50				0.32	185.00	0.80
1.50		5	0.25	0.33	198.00	0.78
3.75				0.42	104.00	0.79
1.00			0.5	0.24	362.00	0.79
2.50				0.31	192.00	0.79
1.50		3	0.25	0.31	229.00	0.79
3.75				0.41	112.00	0.78
1.00			0.5	0.23	418.00	0.79
2.50				0.30	207.00	0.79


Table 1 concerns detecting the effect of a single regime (i.e., regime 1). We define μ_i = 0 if R_i = 1 and μ_i = 2 (e.g., the first row) or 5 (e.g., the second row) if R_i = 0. Table 2 compares two regimes that share an initial treatment (i.e., regimes 1 vs. 3). We define μ_i = 0 if R_i = 1 and μ_i = 0.5 if R_i = 0 for regime 1, while we define μ_i = 0 if R_i = 1 and μ_i = 2 (e.g., the first row) or 5 (e.g., the second row) if R_i = 0 for regime 3. Table 3 compares two regimes that do not share an initial treatment (i.e., regimes 1 vs. 5), where we define μ_i = 0 if R_i = 1 and μ_i = 0.5 if R_i = 0 for regime 1, while we define μ_i = 0 if R_i = 1 and μ_i = 2 (e.g., the first row) or 5 (e.g., the second row) if R_i = 0 for regime 5. Finally, Table 4 concerns detecting whether regime 1 is better than the rest of the embedded regimes (i.e., regimes 1 vs. 2–8), where we define μ_i = 0 if R_i = 1 and μ_i = 2 (e.g., the first row) or 5 (e.g., the second row) if R_i = 0 for regime 1, while we define μ_i = 0 if R_i = 1 and μ_i = 0 if R_i = 0 for regimes 2–8.
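As a rough arithmetic check, the absolute effect sizes tabulated for the regime contrasts can be recovered by treating each regime mean as a mixture of the responder and nonresponder values of μ_i weighted by the response rate. The R sketch below does only this; it is an assumption-laden simplification that ignores the error mean and the missingness mechanism, and the function name is hypothetical.

```r
# Regime mean as a responder/nonresponder mixture of the location parameters.
regime_mean <- function(gamma, mu_R, mu_NR) gamma * mu_R + (1 - gamma) * mu_NR

# Table 2 (regimes 1 vs. 3, shared response rate 0.25).
t2_row1 <- abs(regime_mean(0.25, 0, 0.5) - regime_mean(0.25, 0, 2))  # 1.125
t2_row2 <- abs(regime_mean(0.25, 0, 0.5) - regime_mean(0.25, 0, 5))  # 3.375

# Table 3 (regime 1 with rate 0.25 vs. regime 5 with rate 0.5).
t3_row1 <- abs(regime_mean(0.25, 0, 0.5) - regime_mean(0.5, 0, 2))   # 0.625
t3_row2 <- abs(regime_mean(0.25, 0, 0.5) - regime_mean(0.5, 0, 5))   # 2.125

# Table 4 (regime 1 vs. regimes with all locations equal to 0).
t4_row1 <- abs(regime_mean(0.25, 0, 2) - regime_mean(0.25, 0, 0))    # 1.5

print(c(t2_row1, t2_row2, t3_row1, t3_row2, t4_row1))
```

These values agree with the first rows of Tables 2–4 up to rounding. The single-regime means in Table 1 are not reproduced this way (for example, 1.28 rather than 1.5 in the first row), so the simplification seems adequate only for differences between regimes.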

TABLE 2.

Formula-based sample size (N₂) and estimated Monte Carlo power $(\hat{P})$ at β = 0.2 and α = 0.05, based on hypothesis test of $H_{0} : μ_{d_{1}} = μ_{d_{3}}$ versus $H_{1} : μ_{d_{1}} \neq μ_{d_{3}}$ , under varying absolute effect size $(| δ_{d_{1} - d_{3}} |)$ or standardized effect size $(| δ_{d_{1} - d_{3}}^{*} |)$ and treatment “3” response rate $(γ^{d_{1}})$ , skewness (λ), and degrees of freedom (ν), given treatment “8” response rate $γ^{d_{5}} = 0.5$ , σ₁ = 0.95, σ₀ = 1, ρ = 0.975, τ = 0.85, expected percentage of available teeth per patient p_i = 80%

Absolute effect size	Skewness	Degrees of freedom	Response rate	Standardized effect size	Formula-based sample size	Monte Carlo power
$| δ_{d_{1} - d_{3}} |$	λ	ν	$γ^{d_{1}}$	$| δ_{d_{1} - d_{3}}^{*} |$	N₂	$\hat{P}$
1.12	0	Inf	0.25	0.35	127.00	0.80
3.37				0.45	77.00	0.81
0.75			0.5	0.26	229.00	0.79
2.25				0.33	141.00	0.80
1.12		8	0.25	0.35	128.00	0.79
3.38				0.45	77.00	0.79
0.75			0.5	0.26	233.00	0.80
2.25				0.33	141.00	0.80
1.13		5	0.25	0.35	128.00	0.79
3.37				0.45	77.00	0.79
0.75			0.5	0.26	234.00	0.80
2.25				0.33	141.00	0.79
1.12		3	0.25	0.35	132.00	0.80
3.37				0.45	77.00	0.79
0.75			0.5	0.26	238.00	0.80
2.25				0.33	142.00	0.80
1.13	2	Inf	0.25	0.25	242.00	0.80
3.38				0.39	104.00	0.79
0.75			0.5	0.19	435.00	0.80
2.25				0.29	188.00	0.81
1.13		8	0.25	0.25	257.00	0.79
3.37				0.38	107.00	0.79
0.75			0.5	0.18	461.00	0.80
2.25				0.28	194.00	0.80
1.12		5	0.25	0.24	275.00	0.80
3.37				0.38	110.00	0.79
0.75			0.5	0.18	486.00	0.80
2.25				0.28	199.00	0.80
1.13		3	0.25	0.23	305.00	0.79
3.38				0.37	116.00	0.79
0.75			0.5	0.17	551.00	0.80
2.25				0.27	211.00	0.80
1.13	10	Inf	0.25	0.25	257.00	0.80
3.38				0.38	107.00	0.80
0.75			0.5	0.18	465.00	0.81
2.25				0.28	195.00	0.80
1.13		8	0.25	0.24	277.00	0.80
3.37				0.38	111.00	0.80
0.75			0.5	0.18	498.00	0.80
2.25				0.28	201.00	0.79
1.12		5	0.25	0.23	293.00	0.80
3.37				0.37	114.00	0.79
0.75			0.5	0.17	527.00	0.80
2.25				0.28	206.00	0.79
1.12		3	0.25	0.22	333.00	0.79
3.38				0.36	121.00	0.79
0.75			0.5	0.16	593.00	0.80
2.25				0.27	220.00	0.80


TABLE 3.

Formula-based sample size (N₃) and estimated Monte Carlo power $(\hat{P})$ at β = 0.2 and α = 0.05, based on hypothesis test of $H_{0} : μ_{d_{1}} = μ_{d_{5}}$ versus $H_{1} : μ_{d_{1}} \neq μ_{d_{5}}$ , under varying absolute effect size $(| δ_{d_{1} - d_{5}} |)$ or standardized effect size $(| δ_{d_{1} - d_{5}}^{*} |)$ and treatment “3” response rate $(γ^{d_{1}})$ , skewness (λ), and degrees of freedom (ν), given treatment “8” response rate $γ^{d_{5}} = 0.5$ , σ₁ = 0.95, σ₀ = 1, ρ = 0.975, τ = 0.85, expected percentage of available teeth per patient p_i = 80%

Absolute effect size	Skewness	Degrees of freedom	Response rate	Standardized effect size	Formula-based sample size	Monte Carlo power
$| δ_{d_{1} - d_{5}} |$	λ	ν	$γ^{d_{1}}$	$| δ_{d_{1} - d_{5}}^{*} |$	N₃	$\hat{P}$
0.63	0	Inf	0.25	0.19	420.00	0.78
2.12				0.28	196.00	0.79
0.75			0.5	0.25	244.00	0.80
2.25				0.33	142.00	0.79
0.62		8	0.25	0.19	432.00	0.81
2.13				0.28	196.00	0.79
0.74			0.5	0.25	249.00	0.80
2.25				0.33	143.00	0.80
0.62		5	0.25	0.19	435.00	0.79
2.12				0.28	197.00	0.81
0.75			0.5	0.25	248.00	0.80
2.25				0.33	143.00	0.80
0.63		3	0.25	0.19	444.00	0.80
2.12				0.28	198.00	0.80
0.75			0.5	0.25	257.00	0.80
2.25				0.33	143.00	0.79
0.62	2	Inf	0.25	0.14	792.00	0.79
2.12				0.24	263.00	0.79
0.75			0.5	0.19	452.00	0.80
2.25				0.29	190.00	0.80
0.62		8	0.25	0.14	847.00	0.79
2.13				0.24	270.00	0.80
0.75			0.5	0.18	486.00	0.80
2.25				0.28	198.00	0.80
0.63		5	0.25	0.13	886.00	0.79
2.13				0.24	277.00	0.81
0.75			0.5	0.18	508.00	0.80
2.25				0.28	202.00	0.80
0.63		3	0.25	0.13	1000.00	0.80
2.13				0.23	293.00	0.79
0.75			0.5	0.16	580.00	0.80
2.25				0.27	215.00	0.80
0.63	10	Inf	0.25	0.14	842.00	0.79
2.12				0.24	271.00	0.80
0.75			0.5	0.18	484.00	0.81
2.25				0.28	196.00	0.80
0.63		8	0.25	0.13	909.00	0.80
2.12				0.24	281.00	0.80
0.75			0.5	0.17	524.00	0.81
2.25				0.28	204.00	0.80
0.63		5	0.25	0.13	960.00	0.79
2.13				0.23	286.00	0.81
0.75			0.5	0.17	550.00	0.80
2.25				0.27	210.00	0.81
0.62		3	0.25	0.12	1097.00	0.80
2.12				0.23	307.00	0.80
0.75			0.5	0.16	623.00	0.80
2.25				0.27	224.00	0.80


The results are summarized below. We obtain mostly small to medium (0.2–0.5) absolute standardized effect sizes, except for some smaller effects (<0.2) in Table 3. The tables show that the Monte Carlo estimated powers (78–82%) are close to the nominal power based on the sample size formulas (11)–(14). Note that the sample sizes are rounded to the next integer. Throughout our simulation studies, we assume τ = 0.85 and ρ = 0.975, based on the literature (Reich & Bandyopadhyay, 2010). Sensitivity checks (omitted here for brevity) revealed that the estimated N increases as τ, ρ, or both increase. With respect to calibration of ρ as a (spatial) correlation measure, even a very high value (say, ρ = 0.99) only translates to Moran’s I ≤ 0.5, indicative of moderate correlation (Banerjee, Carlin, & Gelfand, 2014). Hence, when meaningful spatial association is assumed to be present, we suggest using a very high value of ρ in sample size calculations.

5 |. IMPLEMENTATION IN R

In this section, we demonstrate the implementation of the R function SampleSize_SMARTp for sample size calculations via a simulation study. This function is available in the R package SMARTp. The current version of this function only considers a two-stage SMART design. Figure 2 defines the SMART design.

The first three inputs of the function

SampleSize_SMARTp(mu, st1, dtr, regime, pow, a, rho, tau, sigma1, lambda, nu, sigma0, Num, p_i, c_i, a0, b0, cutoff)

are matrices, defined as follows:

mu: Mean matrix, where rows represent treatment paths and columns represent cluster subunits (i.e., teeth) within a cluster (i.e., mouth).
st1: Stage-1 treatment matrix, where rows represent the corresponding stage-1 treatments, the first column contains the number of treatment options for responders, the second column contains the number of treatment options for nonresponders, the third column contains the response rates, and the fourth column contains the row numbers of matrix “st1.”
dtr: Matrix of dimension (no. of DTRs × 4), the first column represents the DTR numbers, the second column represents the corresponding treatment path numbers of responders for the corresponding DTRs in the first column, the third column represents the corresponding treatment path numbers of the nonresponders for the corresponding DTRs in the first column, while the fourth column represents the corresponding initial treatment.

The regime can be a single regime number if the hypothesis test is to detect the effect of that regime (e.g., Hypothesis 1), a vector of two regime numbers if the hypothesis test is to compare regimes (e.g., Hypothesis 2 or 3), or a vector of three or more regime numbers if the hypothesis test is to detect if the first regime is better than other regimes in the vector (e.g., Hypothesis 4). The power and Type-I error rates are given by pow and a, respectively, with the corresponding defaults set at 0.8 and 0.05. The parameters τ and ρ, which quantify the variation and association in the CAR specification of the random effect Q_it are given by tau and rho, respectively, with defaults set at tau = 0.85 and rho = 0.975. The inputs sigma1, lambda and nu define the scale (σ₁), skewness (λ) and degrees of freedom (ν) parameters of the residual ϵ_it1, which default to sigma1 = 0.95, lambda = 0 and nu = Inf. The standard deviation σ₀ for the residual ϵ_it0 is given by sigma0, whose default is sigma0 = 1. The rest of the parameters a₀, b₀, and c₀ from (2) are specified by a0, b0, and cutoff, respectively, and their defaults are a0 = −1, b0 = 0.5, and cutoff = 0. The user can either provide the choice of a0 and b0, or the choice p_i and c_i, which are the expected proportion p_i of available teeth for patient i, and the average Pearson’s correlation coefficient c_i between Y_it and M_it0, averaged over the 28 teeth for patient i, respectively. Monte Carlo estimates of the mean and variance of $\bar{Y}$ for each treatment path were obtained using Num random samples.

The possible outputs are summarized as follows:

N, the estimated sample size;
Del, the effect size;
Del_std, the standardized effect size;
ybar, the estimated regime means corresponding to regime;
Sigma, the CAR covariance matrix of Q_it;
sig.dd, N × the variance or covariance matrix of the estimated regime means corresponding to regime;
sig.e.sq, N × the variance or covariance matrix of the differences between the first and each of the remaining estimated regime means corresponding to regime; sig.e.sq = sig.dd if regime has a single element;
p_st1, the randomization probability at stage 1 for each treatment path;
p_st2, the randomization probability at stage 2 for each treatment path;
res, a vector of binary indicators of response or nonresponse, one for each treatment path;
ga, the response rates of the initial treatments corresponding to each treatment path;
initr, a one-column matrix with one row per treatment path, whose elements are the corresponding row numbers of st1.

In the following, we present the R code for the sample size calculation corresponding to the second row of Table 3 in Section 4.

# Required packages
library("mvtnorm")
library("sn")
library("SMARTp")  # provides SampleSize_SMARTp
# The SMART design
mu = matrix(0, 10, 28)
mu[2, ] = rep(0.5, 28)
mu[4, ] = rep(2, 28)
mu[7, ] = rep(5, 28)
st1 = cbind(c(1, 1), c(4, 4), c(0.25, 0.5), 1:2)
dtr = cbind(1:8, c(rep(1, 4), rep(6, 4)), c(2, 3, 4, 5, 7, 8, 9, 10), c(rep(1, 4), rep(2, 4)))
## Hypothesis test 3, with power and Type-I error rate set to 80%
## and 5%, respectively
regime = c(1, 5)
pow = 0.8
a = 0.05
## Parameter values
cutoff = 0; sigma1 = 0.95; sigma0 = 1; lambda = 0; nu = Inf; b0 = 0.5; a0 = -1.0
rho = 0.975; tau = 0.85
## Number of Monte Carlo samples
Num = 1000000

Then, the R code to compute $N_{3}, δ_{d_{1} - d_{5}}$ , $var ({\bar{Y}}^{d_{1}} - {\bar{Y}}^{d_{5}}), δ_{d_{1} - d_{5}}^{⋆}, var ({\bar{Y}}^{d_{1}}), var ({\bar{Y}}^{d_{5}})$ and $cov ({\bar{Y}}^{d_{1}}, {\bar{Y}}^{d_{5}})$ are, respectively,

SampleSize = SampleSize_SMARTp(mu=mu, st1=st1, dtr=dtr, regime=regime,
  pow=pow, a=a, rho=rho, tau=tau, sigma1=sigma1, lambda=lambda, nu=nu,
  sigma0=sigma0, Num=Num, a0=a0, b0=b0, cutoff=cutoff)
# Sample size, rounded up to the next integer
N = ceiling(SampleSize$N); print(N)
# Effect size
Del = SampleSize$Del; print(Del)
# Variance of the difference between the two regime means
v_d1_d5 = SampleSize$sig.e.sq[1,1]/SampleSize$N; print(v_d1_d5)
# Standardized effect size
Del_std = SampleSize$Del_std; print(Del_std)
# Variances and covariance of the two regime means
v_d1 = SampleSize$sig.dd[1,1]/SampleSize$N; print(v_d1)
v_d5 = SampleSize$sig.dd[2,2]/SampleSize$N; print(v_d5)
cov_d1d5 = SampleSize$sig.dd[1,2]/SampleSize$N; print(cov_d1d5)
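The output components described above should be tied together by Equations (10) and (13). The sketch below checks those relations on a mock list with made-up numbers; it does not call the package, and whether the actual output satisfies them exactly (for example, up to Monte Carlo error) is an assumption.

```r
# Mock object mimicking the documented component names; all numbers are made up.
mock <- list(
  Del    = 2.125,                              # effect size
  sig.dd = matrix(c(30, 5, 5, 40), 2, 2)       # N * var-cov of two regime means
)
# N * Var(difference) = N * {V(d1) + V(d5) - 2 cov(d1, d5)}, as in Equation (10).
mock$sig.e.sq <- matrix(mock$sig.dd[1, 1] + mock$sig.dd[2, 2] - 2 * mock$sig.dd[1, 2])

# Since Var(difference) = 2 * sigma^2 / N, sig.e.sq plays the role of 2 * sigma^2.
z <- qnorm(1 - 0.05 / 2) - qnorm(1 - 0.8)
N_check       <- z^2 * mock$sig.e.sq[1, 1] / mock$Del^2
Del_std_check <- mock$Del / sqrt(mock$sig.e.sq[1, 1] / 2)

print(c(N = N_check, Del_std = Del_std_check))
```

Applying the same two lines to a real `SampleSize` object would be one way to confirm that N, Del, Del_std and sig.e.sq are mutually consistent.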

6 |. DISCUSSION

This paper proposes a two-stage SMART design and associated analysis plan to study a number of DTRs for managing CP. A statistical analysis plan under this design includes hypothesis tests for detecting an effect size for a single regime, for the difference between two regimes with or without a shared initial treatment, and for whether one regime is best among the embedded regimes. This paper also develops a novel sample size calculation method, accommodating typical statistical challenges observed in CP data, such as non-Gaussianity, spatial association, and nonrandom missingness. To the best of our knowledge, this is the first SMART proposal in CP research within the umbrella of precision oral health, a major goal in the NIH/NIDCR’s Strategic Plan 2014–2019, and it advances previous SMART proposals (Ghosh et al., 2016; NeCamp et al., 2017) considered for clustered data. Very recently, Artman, Nahum-Shani, Wu, Mckay, and Ertefaie (2018) proposed rigorous sample size and power calculations factoring in the complex correlation structure between the DTRs that remain embedded within the SMART by design. However, their setup differs from the one considered here. Our methods are fully generalizable to SMART design scenarios in other biomedical domains, where the recorded clinical endpoints exhibit similar statistical challenges as in PD studies. Furthermore, the availability of open-source R code from the SMARTp package will facilitate adaptation and modification to those scenarios.

Note that there are no real data to support the input information in the proposed sample size formulas, which some readers may view as a limitation of our methods. With precision oral health being a recently emerging field (Garaicoa-Pazmino, Decker, & Polverini, 2015), this is not really a limitation of our method per se; it simply reflects the nascent state of the literature in this specific field. As such, we recommend considering reasonable assumptions, such as a medium effect size, conservative sample sizes, etc., to elicit the input values for implementing our proposed SMART design. Additionally, these input parameters can be informed by estimates from existing single-stage clinical trials. At any rate, our experimental design, statistical analysis plan, or sample size calculation can be updated or improved through data collection.

The proposed methodology is mainly limited to SMARTs with only two stages, and without any consideration of baseline or other covariates. In principle, our methodology can be extended to include more therapies (such as various kinds of laser treatments), and the number of treatment stages (which leads to close monitoring of CAL changes), with each stage having more treatment types. However, such extensions may be operationally messy and/or computationally burdensome. Also, using the Q-function approach that minimizes squared error (NeCamp et al., 2017), or maximizes likelihood (Van Der Laan & Rubin, 2006), the effect size of DTRs can be adjusted by adding baseline characteristics such as age, gender, education, oral hygiene, etc., into the regression models in (1) and/or (2). Furthermore, the efficiency of the proposed estimator can be improved using the corresponding doubly robust estimators, for example, those considered by Ertefaie et al. (2015). Following Ertefaie et al. (2015) and Artman et al. (2018), the proposed sample size method can also be extended to detect the optimal treatment regime. These are important avenues for future research, and will be considered elsewhere.

Supplementary Material

R codes

NIHMS1053056-supplement-R_codes.zip (2.8 MB, zip)

ACKNOWLEDGMENTS

This work was supported by the grant MOE 2015-T2-2-056 from the Singapore Ministry of Education, Dr. Chakraborty’s startup grant from the Duke-NUS Medical School, and the grant R01-DE024984 from the United States National Institutes of Health. The authors thank the editor, the associate editor, and two reviewers, whose constructive comments led to an improved version of the manuscript. They are also thankful to researchers at the HealthPartners Institute located at Minneapolis, Minnesota for providing the motivation and the context behind this work, and also to Brian Reich from the North Carolina State University for interesting discussions.

Funding information

United States National Institutes of Health, Grant/Award Number: R01-DE024984–01A1; Singapore Ministry of Education, Grant/Award Number: MOE 2015-T2-2-056; Duke-NUS Medical School

APPENDIX A: SKEW-NORMAL DISTRIBUTION

The statistical properties and the application of the skew-normal (SN) distribution are described in Azzalini and Dalla Valle (1996) and Azzalini and Capitanio (1999), respectively. Aparecida Guedes et al. (2014) present an example of applying a regression model with SN errors. Here, we present a brief introduction.

Define Z₀ ~ N(0,1), independent of an m-dimensional random variable Z = (Z₁,…,Z_m)^⊤ with standardized normal marginals and correlation matrix Ψ. For κ_1,…, κ_m ∈ (−1, 1), define

X_{j} = κ_{j} | Z_{0} | + {(1 - κ_{j}^{2})}^{1 / 2} Z_{j},

(A1)

for j = 1,…, m, where $κ_{j} = λ_{j} / {(1 + λ_{j}^{2})}^{1 / 2}$ , such that X_j ~ SN(λ_j), where “SN” stands for skew-normal and λ_j ∈ (−∞,∞) controls skewness. The probability density function of X_j is f(x_j;λ_j) = 2ϕ(x_j)Φ(λ_j x_j), for −∞ < x_j < ∞, where ϕ(·) and Φ(·) denote the density and cumulative distribution function (cdf) of N(0, 1), respectively. The joint density function of X₁,…,X_m is given by

f (x; θ_{x}, Ω_{x}) = 2 ϕ_{m} (x; Ω_{x}) Φ (θ_{x}^{⊤} x),

(A2)

where x = (x₁,…,x_m)^⊤, and ϕ_m(x; Ω_x) denotes the density function of the m-dimensional multivariate normal distribution with standardized marginals and correlation matrix Ω_x. We have $θ_{x}^{⊤} = \frac{λ^{⊤} Ψ^{- 1} K^{- 1}}{{(1 + λ^{⊤} Ψ^{- 1} λ)}^{1 / 2}}$ , $K = diag ({(1 - κ_{1}^{2})}^{1 / 2}, \dots, {(1 - κ_{m}^{2})}^{1 / 2})$ , and Ω_x = K(Ψ + λλ^⊤)K.

Define Y = ξ + ωX, with Y = (Y₁,…Y_m)^⊤, ξ = (ξ₁,…ξ_m)^⊤ and ω = diag(ω_1,…ω_m)^⊤, where the components of ω are assumed to be positive. The density function of Y is

f (y; ξ, Ω, θ) = 2 ϕ_{m} (y - ξ; Ω) Φ (θ^{⊤} (y - ξ)),

(A3)

where Ω = ωΩ_xω and $θ^{⊤} = θ_{x}^{⊤} ω^{- 1}$ . Thus, Y is the m-dimensional random variable from the SN distribution, with location ξ, scale ω and skewness θ, that is, Y ~ SN_m(ξ, Ω, θ). From (A3), we have $E (Y) = ξ + ω {(\frac{2}{π})}^{1 / 2} κ$ , $var (Y) = Ω - ω^{2} \frac{2}{π} κ κ^{⊤}$ , and the skewness vector $S K E W (Y) = \frac{4 - π}{2} \frac{{(κ \sqrt{2 / π})}^{3}}{{(1 - 2 κ^{2} / π)}^{3 / 2}} = γ_{1}$ .
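The construction in (A1) and the moment expressions above can be checked numerically in the univariate case (m = 1). The following Python sketch is illustrative only and is not taken from the SMARTp package; the function names `kappa`, `sn_moments`, and `draw_sn` are hypothetical, and only the standard library is assumed.

```python
import math
import random


def kappa(lam):
    """kappa = lambda / sqrt(1 + lambda^2), as defined after (A1)."""
    return lam / math.sqrt(1.0 + lam * lam)


def sn_moments(xi, omega, lam):
    """Mean, variance and skewness of Y = xi + omega * X with X ~ SN(lam)."""
    b = kappa(lam) * math.sqrt(2.0 / math.pi)   # E(X) = kappa * sqrt(2 / pi)
    mean = xi + omega * b
    var = omega ** 2 * (1.0 - b * b)            # omega^2 * (1 - 2 kappa^2 / pi)
    skew = 0.5 * (4.0 - math.pi) * b ** 3 / (1.0 - b * b) ** 1.5
    return mean, var, skew


def draw_sn(lam, rng):
    """One draw of X ~ SN(lam) via (A1): kappa |Z0| + sqrt(1 - kappa^2) Z1."""
    k = kappa(lam)
    z0 = rng.gauss(0.0, 1.0)
    z1 = rng.gauss(0.0, 1.0)
    return k * abs(z0) + math.sqrt(1.0 - k * k) * z1


def empirical_mean(lam, n_draws, seed):
    """Monte Carlo estimate of E(X) for X ~ SN(lam)."""
    rng = random.Random(seed)
    return sum(draw_sn(lam, rng) for _ in range(n_draws)) / n_draws
```

With λ = 0 the three moments reduce to those of N(ξ, ω²), and for other values of λ the output of `empirical_mean` should approach the first element returned by `sn_moments(0.0, 1.0, lam)` as the number of draws grows.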

APPENDIX B: SKEW-t DISTRIBUTION

Skew-t (ST) random variables generated from both SN and chi-squared variables are described in Azzalini and Capitanio (2003b). Here, W ~ ST_m(ξ, Ω, θ, v), such that $W = ξ + ω X / \sqrt{V}$ and ωX ~ SN_m(0, Ω, θ), where $V \sim χ_{v}^{2} / v$ is independent of X.

The density function of W is

f_{W} (w; ξ, Ω, θ, v) = 2 t_{m} (w; ξ, Ω, v) T_{1} (θ^{⊤} (w - ξ) \sqrt{\frac{v + m}{Q_{w} + v}}; v + m),

(B1)

where Q_w = (w – ξ)^⊤Ω⁻¹(w – ξ), t_m(·; ξ, Ω, v) denotes the density function of an m-dimensional t variate with location ξ, shape matrix Ω and degrees of freedom v, while T₁(·; v + m) denotes the cdf of a univariate Student’s t with v + m degrees of freedom. Writing $U = ω X / \sqrt{V}$ , so that U equals W when ξ = 0, the nth moment of U is

E (U^{n}) = ω^{n} E (X^{n}) E (V^{- n / 2}),

(B2)

where

E (V^{- n / 2}) = \frac{{(v / 2)}^{n / 2} Γ (\frac{1}{2} (v - n))}{Γ (\frac{1}{2} v)}

with E(Xⁿ) given in Azzalini and Capitanio (1999). Thus, the mean and variance of W are, respectively,

E (W) = ξ + ω κ {(v / π)}^{1 / 2} \frac{Γ (\frac{1}{2} (v - 1))}{Γ (\frac{1}{2} v)}, v > 1,

var (W) = \frac{v}{v - 2} Ω - \frac{v}{π} {(\frac{Γ (\frac{1}{2} (v - 1))}{Γ (\frac{1}{2} v)})}^{2} ω^{2} κ κ^{⊤}, v > 2.

Similarly, the skewness (γ₁) and kurtosis (γ₂) for the univariate cases are

γ_{1} = μ [\frac{v (3 - κ^{2})}{v - 3} - \frac{3 v}{v - 2} + 2 μ^{2}] {[\frac{v}{v - 2} - μ^{2}]}^{- 3 / 2}, v > 3,

γ_{2} = [\frac{3 v^{2}}{(v - 2) (v - 4)} - \frac{4 μ^{2} v (3 - κ^{2})}{v - 3} + \frac{6 μ^{2} v}{v - 2} - 3 μ^{4}] {[\frac{v}{v - 2} - μ^{2}]}^{- 2} - 3, v > 4,

where $μ = κ \sqrt{\frac{v}{π}} \frac{Γ (\frac{1}{2} (v - 1))}{Γ (\frac{1}{2} v)} .$
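As a numerical companion to the expressions in this appendix, the sketch below evaluates the univariate mean, variance, skewness γ₁ and kurtosis γ₂, and draws skew-t variates through the representation $W = ξ + ω X / \sqrt{V}$ . It is a minimal illustration under the assumption m = 1; the names `gamma_ratio`, `st_moments`, and `draw_st` are hypothetical and do not come from the SMARTp package.

```python
import math
import random


def gamma_ratio(nu):
    """Gamma((nu - 1) / 2) / Gamma(nu / 2), computed on the log scale."""
    return math.exp(math.lgamma(0.5 * (nu - 1.0)) - math.lgamma(0.5 * nu))


def st_moments(xi, omega, lam, nu):
    """Mean, variance, gamma_1 and gamma_2 of a univariate skew-t variate."""
    if nu <= 4:
        raise ValueError("all four moments require nu > 4")
    k = lam / math.sqrt(1.0 + lam * lam)
    mu = k * math.sqrt(nu / math.pi) * gamma_ratio(nu)
    s2 = nu / (nu - 2.0) - mu * mu              # variance when omega = 1
    mean = xi + omega * mu
    var = omega ** 2 * s2
    g1 = mu * (nu * (3.0 - k * k) / (nu - 3.0)
               - 3.0 * nu / (nu - 2.0)
               + 2.0 * mu * mu) * s2 ** -1.5
    g2 = (3.0 * nu ** 2 / ((nu - 2.0) * (nu - 4.0))
          - 4.0 * mu * mu * nu * (3.0 - k * k) / (nu - 3.0)
          + 6.0 * mu * mu * nu / (nu - 2.0)
          - 3.0 * mu ** 4) / s2 ** 2 - 3.0
    return mean, var, g1, g2


def draw_st(xi, omega, lam, nu, rng):
    """One draw of W = xi + omega * X / sqrt(V), X ~ SN(lam), V ~ chi2_nu / nu."""
    k = lam / math.sqrt(1.0 + lam * lam)
    x = k * abs(rng.gauss(0.0, 1.0)) + math.sqrt(1.0 - k * k) * rng.gauss(0.0, 1.0)
    v = rng.gammavariate(0.5 * nu, 2.0) / nu    # chi-square(nu) = Gamma(nu/2, scale 2)
    return xi + omega * x / math.sqrt(v)
```

Setting λ = 0 recovers the symmetric Student's t case, for which the variance is ω²v/(v − 2) and γ₂ equals 6/(v − 4), which gives a quick sanity check on the formulas.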

APPENDIX C: SAMPLE SIZE FORMULA DERIVATION

The covariance between Y_it and M_it0 is

cov (μ_{i} + Q_{i t} + ϵ_{i t 1}, a_{0} + b_{0} Q_{i t} + ϵ_{i t 0})

= cov (Q_{i t}, b_{0} Q_{i t})

= b_{0} var (Q_{i t})

= b_{0} Σ_{t t},

where Σ_tt is the tth diagonal element of the covariance matrix Σ_28×28. The Pearson correlation coefficient is

c_{i t} = \frac{b_{0} var (Q_{i t})}{\sqrt{var (μ_{i} + Q_{i t} + ϵ_{i t 1}) var (a_{0} + b_{0} Q_{i t} + ϵ_{i t 0})}}

= \frac{b_{0} var (Q_{i t})}{\sqrt{(var (Q_{i t}) + var (ϵ_{i t 1})) (b_{0}^{2} var (Q_{i t}) + var (ϵ_{i t 0}))}},

where var(Q_it) = Σ_tt, $var (ϵ_{i t 0}) = σ_{0}^{2}$ and $var (ϵ_{i t 1}) = \frac{σ_{1}^{2} v}{v - 2} - \frac{v}{π} {[\frac{Γ (0.5 (v - 1))}{Γ (0.5 v)}]}^{2} \frac{σ_{1}^{2} λ^{2}}{1 + λ^{2}}$ if v < ∞; otherwise, $var (ϵ_{i t 1}) = σ_{1}^{2} - \frac{2}{π} \frac{σ_{1}^{2} λ^{2}}{1 + λ^{2}}$ . Now, we derive an expression for p_i, that is,

p_{i} = 1 - E (\frac{\sum_{t = 1}^{28} M_{i t}}{28})

= 1 - \frac{1}{28} \sum_{t = 1}^{28} E [E (M_{i t} | Q_{i t}, ϵ_{i t 0})]

= 1 - \frac{1}{28} \sum_{t = 1}^{28} E [E (I (a_{0} + b_{0} Q_{i t} + ϵ_{i t 0} > c_{0}) | Q_{i t}, ϵ_{i t 0})]

= 1 - \frac{1}{28} \sum_{t = 1}^{28} E [I (a_{0} + b_{0} Q_{i t} + ϵ_{i t 0} > c_{0})]

= 1 - \frac{1}{28} \sum_{t = 1}^{28} Pr (a_{0} + b_{0} Q_{i t} + ϵ_{i t 0} > c_{0})

= 1 - \frac{1}{28} \sum_{t = 1}^{28} Pr (z > \frac{c_{0} - E (a_{0} + b_{0} Q_{i t} + ϵ_{i t 0})}{\sqrt{var (a_{0} + b_{0} Q_{i t} + ϵ_{i t 0})}})

= 1 - \frac{1}{28} \sum_{t = 1}^{28} [1 - Φ (\frac{c_{0} - a_{0}}{\sqrt{b_{0}^{2} Σ_{t t} + σ_{0}^{2}}})],

where Φ(·) is the cdf of z ~ N(0,1). Now,

E ({\bar{Y}}^{d_{1}}) = E (\frac{I (A_{i 1} = a_{1 i}^{d_{1}}, A_{i 2} = {[a_{2 i}^{d_{1} R}]}^{R_{i}} {[a_{2 i}^{d_{1} N R}]}^{1 - R_{i}})}{π_{1 i}^{d_{1}} {[π_{2 i}^{d_{1} R}]}^{R_{i}} {[π_{2 i}^{d_{1} N R}]}^{1 - R_{i}}} {\bar{Y}}_{i})

= E [E (\frac{I (A_{i 1} = a_{1 i}^{d_{1}}, A_{i 2} = {[a_{2 i}^{d_{1} R}]}^{R_{i}} {[a_{2 i}^{d_{1} N R}]}^{1 - R_{i}})}{π_{1 i}^{d_{1}} {[π_{2 i}^{d_{1} R}]}^{R_{i}} {[π_{2 i}^{d_{1} N R}]}^{1 - R_{i}}} {\bar{Y}}_{i} | R_{i})]

= γ^{d_{1}} μ_{d_{1} R} + (1 - γ^{d_{1}}) μ_{d_{1} N R} .

In a similar way, we have $E ({\bar{Y}}^{d_{3}}) = γ^{d_{3}} μ_{d_{3} R} + (1 - γ^{d_{3}}) μ_{d_{3} N R}$ .
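Returning to the two closed-form quantities derived at the start of this appendix, the correlation c_it and the proportion p_i, the following Python sketch shows how they might be evaluated for given design parameters. It is illustrative only: the function names are hypothetical, `sigma_diag` stands for the diagonal entries Σ_tt supplied by the user, and Q_it is taken to be normal with mean zero, as in the derivation of p_i.

```python
import math


def var_eps1(sigma1, lam, nu=math.inf):
    """Variance of eps_it1: skew-t for finite nu, skew-normal for nu = inf."""
    k2 = lam * lam / (1.0 + lam * lam)
    if math.isinf(nu):
        return sigma1 ** 2 - (2.0 / math.pi) * sigma1 ** 2 * k2
    if nu <= 2:
        raise ValueError("the variance requires nu > 2")
    ratio = math.exp(math.lgamma(0.5 * (nu - 1.0)) - math.lgamma(0.5 * nu))
    return (sigma1 ** 2 * nu / (nu - 2.0)
            - (nu / math.pi) * ratio ** 2 * sigma1 ** 2 * k2)


def corr_outcome_missing(b0, sigma_tt, sigma0, sigma1, lam, nu=math.inf):
    """Pearson correlation c_it between Y_it and the latent missingness variable."""
    num = b0 * sigma_tt
    den = math.sqrt((sigma_tt + var_eps1(sigma1, lam, nu))
                    * (b0 ** 2 * sigma_tt + sigma0 ** 2))
    return num / den


def std_normal_cdf(x):
    """Standard normal cdf via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def expected_available_proportion(a0, b0, c0, sigma0, sigma_diag):
    """p_i = 1 - average over teeth of Pr(a0 + b0 Q_it + eps_it0 > c0)."""
    miss = [1.0 - std_normal_cdf((c0 - a0) / math.sqrt(b0 ** 2 * s + sigma0 ** 2))
            for s in sigma_diag]
    return 1.0 - sum(miss) / len(miss)
```

For example, with b₀ = 1 and all variances equal to one in the symmetric normal case (λ = 0, v = ∞), the correlation is 0.5, and setting a₀ = c₀ gives p_i = 0.5 regardless of the variances.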

According to variance decomposition, the right side of (8) is the sum of two components, which are

E [V (\frac{I (A_{i 1} = a_{1 i}^{d_{1}}, A_{i 2} = {[a_{2 i}^{d_{1} R}]}^{R_{i}} {[a_{2 i}^{d_{1} N R}]}^{1 - R_{i}})}{π_{1 i}^{d_{1}} {[π_{2 i}^{d_{1} R}]}^{R_{i}} {[π_{2 i}^{d_{1} N R}]}^{1 - R_{i}}} {\bar{Y}}_{i} | R_{i})]

and

V [E (\frac{I (A_{i 1} = a_{1 i}^{d_{1}}, A_{i 2} = {[a_{2 i}^{d_{1} R}]}^{R_{i}} {[a_{2 i}^{d_{1} N R}]}^{1 - R_{i}})}{π_{1 i}^{d_{1}} {[π_{2 i}^{d_{1} R}]}^{R_{i}} {[π_{2 i}^{d_{1} N R}]}^{1 - R_{i}}} {\bar{Y}}_{i} | R_{i})] .

The first component is

E [V (\frac{I (A_{i 1} = a_{1 i}^{d_{1}}, A_{i 2} = {[a_{2 i}^{d_{1} R}]}^{R_{i}} {[a_{2 i}^{d_{1} N R}]}^{1 - R_{i}})}{π_{1 i}^{d_{1}} {[π_{2 i}^{d_{1} R}]}^{R_{i}} {[π_{2 i}^{d_{1} N R}]}^{1 - R_{i}}} {\bar{Y}}_{i} | R_{i})]

= \sum_{R_{i} = 0}^{1} V (\frac{I (A_{i 1} = a_{1 i}^{d_{1}}, A_{i 2} = {[a_{2 i}^{d_{1} R}]}^{R_{i}} {[a_{2 i}^{d_{1} N R}]}^{1 - R_{i}})}{π_{1 i}^{d_{1}} {[π_{2 i}^{d_{1} R}]}^{R_{i}} {[π_{2 i}^{d_{1} N R}]}^{1 - R_{i}}} {\bar{Y}}_{i} | R_{i}) Pr (R_{i}),

while the second component is

V [E (\frac{I (A_{i 1} = a_{1 i}^{d_{1}}, A_{i 2} = {[a_{2 i}^{d_{1} R}]}^{R_{i}} {[a_{2 i}^{d_{1} N R}]}^{1 - R_{i}})}{π_{1 i}^{d_{1}} {[π_{2 i}^{d_{1} R}]}^{R_{i}} {[π_{2 i}^{d_{1} N R}]}^{1 - R_{i}}} {\bar{Y}}_{i} | R_{i})]

= \sum_{R_{i} = 0}^{1} E^{2} (\frac{I (A_{i 1} = a_{1 i}^{d_{1}}, A_{i 2} = {[a_{2 i}^{d_{1} R}]}^{R_{i}} {[a_{2 i}^{d_{1} N R}]}^{1 - R_{i}})}{π_{1 i}^{d_{1}} {[π_{2 i}^{d_{1} R}]}^{R_{i}} {[π_{2 i}^{d_{1} N R}]}^{1 - R_{i}}} {\bar{Y}}_{i} | R_{i}) Pr (R_{i})

- {(\sum_{R_{i} = 0}^{1} E (\frac{I (A_{i 1} = a_{1 i}^{d_{1}}, A_{i 2} = {[a_{2 i}^{d_{1} R}]}^{R_{i}} {[a_{2 i}^{d_{1} N R}]}^{1 - R_{i}})}{π_{1 i}^{d_{1}} {[π_{2 i}^{d_{1} R}]}^{R_{i}} {[π_{2 i}^{d_{1} N R}]}^{1 - R_{i}}} {\bar{Y}}_{i} | R_{i}) Pr (R_{i}))}^{2},

based on the formulae E(X) = E[E(X | Y)] and V(X) = E(X²) – E²(X). We have $Pr (R_{i} = 1) = γ^{d_{1}}$ and $Pr (R_{i} = 0) = 1 - γ^{d_{1}}$ , and

V (\frac{I (A_{i 1} = a_{1 i}^{d_{1}}, A_{i 2} = [a_{2 i}^{d_{1} R}])}{π_{1 i}^{d_{1}} [π_{2 i}^{d_{1} R}]} {\bar{Y}}_{i} | R_{i} = 1)

= E (\frac{I (A_{i 1} = a_{1 i}^{d_{1}}, A_{i 2} = [a_{2 i}^{d_{1} R}])}{{(π_{1 i}^{d_{1}} [π_{2 i}^{d_{1} R}])}^{2}} {\bar{Y}}_{i}^{2}) - E^{2} (\frac{I (A_{i 1} = a_{1 i}^{d_{1}}, A_{i 2} = [a_{2 i}^{d_{1} R}])}{π_{1 i}^{d_{1}} [π_{2 i}^{d_{1} R}]} {\bar{Y}}_{i})

= \frac{1}{π_{1 i}^{d_{1}} [π_{2 i}^{d_{1} R}]} E_{d_{1} R} ({\bar{Y}}_{i}^{2}) - E_{d_{1} R}^{2} ({\bar{Y}}_{i})

= \frac{1}{π_{1 i}^{d_{1}} [π_{2 i}^{d_{1} R}]} (σ_{d_{1} R}^{2} + (1 - π_{1 i}^{d_{1}} [π_{2 i}^{d_{1} R}]) μ_{d_{1} R}^{2}),

with $μ_{d_{1} R}$ , the expectation of ${\bar{Y}}_{i}$ under d₁ with R_i = 1, given by

E_{d_{1} R} ({\bar{Y}}_{i}) = E_{d_{1} R} (E_{d_{1} R} (\frac{\sum_{t = 1}^{28} Y_{i t} (1 - M_{i t})}{\sum_{t = 1}^{28} (1 - M_{i t})} | Q_{i}, ϵ_{i 0}))

= E_{d_{1} R} (E_{d_{1} R} (\frac{\sum_{t = 1}^{28} (μ_{i} + Q_{i t} + ϵ_{i t 1}) (1 - I (a_{0} + b_{0} Q_{i t} + ϵ_{i t 0} > 0))}{\sum_{t = 1}^{28} (1 - I (a_{0} + b_{0} Q_{i t} + ϵ_{i t 0} > 0))} | Q_{i}, ϵ_{i 0}))

= E (\frac{\sum_{t = 1}^{28} ({μ_{i} |}_{A_{i 3} = 1, R_{i} = 1} + Q_{i t} + E (ϵ_{i t 1})) (1 - I (a_{0} + b_{0} Q_{i t} + ϵ_{i t 0} > 0))}{\sum_{t = 1}^{28} (1 - I (a_{0} + b_{0} Q_{i t} + ϵ_{i t 0} > 0))})

= \int_{Q_{i}} \int_{ϵ_{i}} \frac{\sum_{t = 1}^{28} ({μ_{i} |}_{A_{i 3} = 1, R_{i} = 1} + Q_{i t} + E (ϵ_{i t 1})) (1 - I (a_{0} + b_{0} Q_{i t} + ϵ_{i t 0} > 0))}{\sum_{t = 1}^{28} (1 - I (a_{0} + b_{0} Q_{i t} + ϵ_{i t 0} > 0))} f (Q_{i}) f (ϵ_{i}) d ϵ_{i} d Q_{i},

where f(Q_i) and f(ϵ_i) are the density functions for Q_i and ϵ_i, respectively. Also, $σ_{d_{1} R}^{2}$ , the variance of ${\bar{Y}}_{i}$ under d₁ with R_i = 1, can be written as

V_{d_{1} R} ({\bar{Y}}_{i}) = E_{d_{1} R} (V_{d_{1} R} ({\bar{Y}}_{i} | Q_{i}, ϵ_{i 0})) + V_{d_{1} R} (E_{d_{1} R} ({\bar{Y}}_{i} | Q_{i}, ϵ_{i 0})),

where

E_{d_{1} R} (V_{d_{1} R} ({\bar{Y}}_{i} | Q_{i}, ϵ_{i 0})) = E_{d_{1} R} (V_{d_{1} R} (\frac{\sum_{t = 1}^{28} Y_{i t} (1 - M_{i t})}{\sum_{t = 1}^{28} (1 - M_{i t})} | Q_{i}, ϵ_{i 0}))

= E (V (\frac{\sum_{t = 1}^{28} ({μ_{i} |}_{A_{i 3} = 1, R_{i} = 1} + Q_{i t} + ϵ_{i t 1}) (1 - I (a_{0} + b_{0} Q_{i t} + ϵ_{i t 0} > 0))}{\sum_{t = 1}^{28} (1 - I (a_{0} + b_{0} Q_{i t} + ϵ_{i t 0} > 0))}))

= E (\frac{\sum_{t = 1}^{28} [1 - I (a_{0} + b_{0} Q_{i t} + ϵ_{i t 0} > 0)] σ_{1}^{2}}{{(\sum_{t = 1}^{28} [1 - I (a_{0} + b_{0} Q_{i t} + ϵ_{i t 0} > 0)])}^{2}})

= \int_{Q_{i}} \int_{ϵ_{i}} \frac{\sum_{t = 1}^{28} [1 - I (a_{0} + b_{0} Q_{i t} + ϵ_{i t 0} > 0)] σ_{1}^{2}}{{(\sum_{t = 1}^{28} [1 - I (a_{0} + b_{0} Q_{i t} + ϵ_{i t 0} > 0)])}^{2}} f (Q_{i}) f (ϵ_{i}) d ϵ_{i} d Q_{i}

and

V_{d_{1} R} (E_{d_{1} R} ({\bar{Y}}_{i} | Q_{i}, ϵ_{i 0})) = V_{d_{1} R} (E_{d_{1} R} (\frac{\sum_{t = 1}^{28} Y_{i t} (1 - M_{i t})}{\sum_{t = 1}^{28} (1 - M_{i t})} | Q_{i}, ϵ_{i 0}))

= V (E (\frac{\sum_{t = 1}^{28} ({μ_{i} |}_{A_{i 3} = 1, R_{i} = 1} + Q_{i t} + ϵ_{i t 1}) (1 - I (a_{0} + b_{0} Q_{i t} + ϵ_{i t 0} > 0))}{\sum_{t = 1}^{28} (1 - I (a_{0} + b_{0} Q_{i t} + ϵ_{i t 0} > 0))}))

= V (\frac{\sum_{t = 1}^{28} ({μ_{i} |}_{A_{i 3} = 1, R_{i} = 1} + Q_{i t} + E (ϵ_{i t 1})) (1 - I (a_{0} + b_{0} Q_{i t} + ϵ_{i t 0} > 0))}{\sum_{t = 1}^{28} (1 - I (a_{0} + b_{0} Q_{i t} + ϵ_{i t 0} > 0))})

= E [{(\frac{\sum_{t = 1}^{28} ({μ_{i} |}_{A_{i 3} = 1, R_{i} = 1} + Q_{i t} + E (ϵ_{i t 1})) (1 - I (a_{0} + b_{0} Q_{i t} + ϵ_{i t 0} > 0))}{\sum_{t = 1}^{28} (1 - I (a_{0} + b_{0} Q_{i t} + ϵ_{i t 0} > 0))})}^{2}]

- {[E (\frac{\sum_{t = 1}^{28} ({μ_{i} |}_{A_{i 3} = 1, R_{i} = 1} + Q_{i t} + E (ϵ_{i t 1})) (1 - I (a_{0} + b_{0} Q_{i t} + ϵ_{i t 0} > 0))}{\sum_{t = 1}^{28} (1 - I (a_{0} + b_{0} Q_{i t} + ϵ_{i t 0} > 0))})]}^{2}

= \int_{Q_{i}} \int_{ϵ_{i}} {(\frac{\sum_{t = 1}^{28} ({μ_{i} |}_{A_{i 3} = 1, R_{i} = 1} + Q_{i t} + E (ϵ_{i t 1})) (1 - I (a_{0} + b_{0} Q_{i t} + ϵ_{i t 0} > 0))}{\sum_{t = 1}^{28} (1 - I (a_{0} + b_{0} Q_{i t} + ϵ_{i t 0} > 0))})}^{2} f (Q_{i}) f (ϵ_{i}) d ϵ_{i} d Q_{i}

- {[\int_{Q_{i}} \int_{ϵ_{i}} \frac{\sum_{t = 1}^{28} ({μ_{i} |}_{A_{i 3} = 1, R_{i} = 1} + Q_{i t} + E (ϵ_{i t 1})) (1 - I (a_{0} + b_{0} Q_{i t} + ϵ_{i t 0} > 0))}{\sum_{t = 1}^{28} (1 - I (a_{0} + b_{0} Q_{i t} + ϵ_{i t 0} > 0))} f (Q_{i}) f (ϵ_{i}) d ϵ_{i} d Q_{i}]}^{2} .

Similarly, we have

V (\frac{I (A_{i 1} = a_{1 i}^{d_{1}}, A_{i 2} = [a_{2 i}^{d_{1} N R}])}{π_{1 i}^{d_{1}} [π_{2 i}^{d_{1} N R}]} {\bar{Y}}_{i} | R_{i} = 0) = \frac{1}{π_{1 i}^{d_{1}} [π_{2 i}^{d_{1} N R}]} (σ_{d_{1} N R}^{2} + (1 - π_{1 i}^{d_{1}} [π_{2 i}^{d_{1} N R}]) μ_{d_{1} N R}^{2}),

where $μ_{d_{1} N R}$ and $σ_{d_{1} N R}^{2}$ are the expectation and variance of ${\bar{Y}}_{i}$ under d₁ with R_i = 0.

Therefore, the second component is

V [E (\frac{I (A_{i 1} = a_{1 i}^{d_{1}}, A_{i 2} = {[a_{2 i}^{d_{1} R}]}^{R_{i}} {[a_{2 i}^{d_{1} N R}]}^{1 - R_{i}})}{π_{1 i}^{d_{1}} {[π_{2 i}^{d_{1} R}]}^{R_{i}} {[π_{2 i}^{d_{1} N R}]}^{1 - R_{i}}} {\bar{Y}}_{i} | R_{i})]

= γ^{d_{1}} μ_{d_{1} R}^{2} + (1 - γ^{d_{1}}) μ_{d_{1} N R}^{2} - {(γ^{d_{1}} μ_{d_{1} R} + (1 - γ^{d_{1}}) μ_{d_{1} N R})}^{2}

= γ^{d_{1}} (1 - γ^{d_{1}}) {(μ_{d_{1} R} - μ_{d_{1} N R})}^{2} .

Thus, the variance formula (8) is

V ({\bar{Y}}^{d_{1}}) = \frac{1}{N} {\frac{γ^{d_{1}}}{π_{1 i}^{d_{1}} [π_{2 i}^{d_{1} R}]} (σ_{d_{1} R}^{2} + (1 - π_{1 i}^{d_{1}} [π_{2 i}^{d_{1} R}]) μ_{d_{1} R}^{2}) + \frac{1 - γ^{d_{1}}}{π_{1 i}^{d_{1}} [π_{2 i}^{d_{1} N R}]} (σ_{d_{1} N R}^{2} + (1 - π_{1 i}^{d_{1}} [π_{2 i}^{d_{1} N R}]) μ_{d_{1} N R}^{2}) + γ^{d_{1}} (1 - γ^{d_{1}}) {(μ_{d_{1} R} - μ_{d_{1} N R})}^{2}},

while $V ({\bar{Y}}^{d_{3}})$ is

\frac{1}{N} {\frac{γ^{d_{3}}}{π_{1 i}^{d_{3}} [π_{2 i}^{d_{3} R}]} (σ_{d_{3} R}^{2} + (1 - π_{1 i}^{d_{3}} [π_{2 i}^{d_{3} R}]) μ_{d_{3} R}^{2}) + \frac{1 - γ^{d_{3}}}{π_{1 i}^{d_{3}} [π_{2 i}^{d_{3} N R}]} (σ_{d_{3} N R}^{2} + (1 - π_{1 i}^{d_{3}} [π_{2 i}^{d_{3} N R}]) μ_{d_{3} N R}^{2}) + γ^{d_{3}} (1 - γ^{d_{3}}) {(μ_{d_{3} R} - μ_{d_{3} N R})}^{2}} .
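The regime mean and the variance expression above translate directly into code. The sketch below is a minimal illustration, not the SMARTp implementation: the function names are hypothetical, and `simulate_ipw_mean` uses a toy SMART in which every randomization probability is 0.5, the regime of interest assigns the first option at both stages, and patient-level means Ȳ_i are drawn as normal with unit variance. These simulation settings are assumptions made only to check the inverse probability weighted mean numerically.

```python
import random


def regime_mean(gamma, mu_r, mu_nr):
    """E(Ybar^d) = gamma * mu_R + (1 - gamma) * mu_NR."""
    return gamma * mu_r + (1.0 - gamma) * mu_nr


def regime_variance(n, gamma, pi1, pi2_r, pi2_nr, mu_r, mu_nr, sig2_r, sig2_nr):
    """V(Ybar^d): within-response-group terms plus the between-group term."""
    p_r = pi1 * pi2_r        # probability of following the regime, responders
    p_nr = pi1 * pi2_nr      # probability of following the regime, non-responders
    total = (gamma / p_r * (sig2_r + (1.0 - p_r) * mu_r ** 2)
             + (1.0 - gamma) / p_nr * (sig2_nr + (1.0 - p_nr) * mu_nr ** 2)
             + gamma * (1.0 - gamma) * (mu_r - mu_nr) ** 2)
    return total / n


def simulate_ipw_mean(n, gamma, mu_r, mu_nr, rng):
    """IPW estimate of the regime mean in a toy SMART with fair randomization."""
    total = 0.0
    for _ in range(n):
        a1 = rng.random() < 0.5          # stage-1 assignment matches the regime
        r = rng.random() < gamma         # response indicator R_i
        a2 = rng.random() < 0.5          # stage-2 assignment matches the regime
        if a1 and a2:                    # patient is consistent with the regime
            y = rng.gauss(mu_r if r else mu_nr, 1.0)
            total += y / (0.5 * 0.5)     # weight 1 / (pi_1 * pi_2)
    return total / n
```

In this toy setting the simulated estimate should fluctuate around `regime_mean(gamma, mu_r, mu_nr)`, with spread governed by `regime_variance` evaluated at π₁ = π₂ = 0.5 and unit within-group variances.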

The variance of the difference between d₁ and d₃ is

V ({\bar{Y}}^{d_{1}} - {\bar{Y}}^{d_{3}}) = V ({\bar{Y}}^{d_{1}}) + V ({\bar{Y}}^{d_{3}}) - 2 cov ({\bar{Y}}^{d_{1}}, {\bar{Y}}^{d_{3}}) .

(C1)

The covariance between ${\bar{Y}}^{d_{1}}$ and ${\bar{Y}}^{d_{3}}$ is

cov ({\bar{Y}}^{d_{1}}, {\bar{Y}}^{d_{3}}) = \frac{1}{N^{2}} cov (\sum_{i = 1}^{N} W_{i}^{d_{1}} {\bar{Y}}_{i}, \sum_{i = 1}^{N} W_{i}^{d_{3}} {\bar{Y}}_{i})

= \frac{1}{N^{2}} cov (\sum_{i = 1}^{N} W_{i}^{d_{1}} {\bar{Y}}_{i} (R_{i} + (1 - R_{i})), \sum_{i = 1}^{N} W_{i}^{d_{3}} {\bar{Y}}_{i} (R_{i} + (1 - R_{i})))

= \frac{1}{N^{2}} cov (\sum_{i = 1}^{N} W_{i}^{d_{1}} {\bar{Y}}_{i} R_{i} + W_{i}^{d_{1}} {\bar{Y}}_{i} (1 - R_{i}), \sum_{i = 1}^{N} W_{i}^{d_{3}} {\bar{Y}}_{i} R_{i} + W_{i}^{d_{3}} {\bar{Y}}_{i} (1 - R_{i}))

= \frac{1}{N^{2}} cov (\sum_{i = 1}^{N} W_{i}^{d_{1}} {\bar{Y}}_{i} R_{i} + \sum_{i = 1}^{N} W_{i}^{d_{1}} {\bar{Y}}_{i} (1 - R_{i}), \sum_{i = 1}^{N} W_{i}^{d_{3}} {\bar{Y}}_{i} R_{i} + \sum_{i = 1}^{N} W_{i}^{d_{3}} {\bar{Y}}_{i} (1 - R_{i}))

= \frac{1}{N^{2}} cov (\sum_{i = 1}^{N} W_{i}^{d_{1}} {\bar{Y}}_{i} R_{i}, \sum_{i = 1}^{N} W_{i}^{d_{3}} {\bar{Y}}_{i} R_{i}) + \frac{1}{N^{2}} cov (\sum_{i = 1}^{N} W_{i}^{d_{1}} {\bar{Y}}_{i} R_{i}, \sum_{i = 1}^{N} W_{i}^{d_{3}} {\bar{Y}}_{i} (1 - R_{i}))

+ \frac{1}{N^{2}} cov (\sum_{i = 1}^{N} W_{i}^{d_{1}} {\bar{Y}}_{i} (1 - R_{i}), \sum_{i = 1}^{N} W_{i}^{d_{3}} {\bar{Y}}_{i} R_{i}) + \frac{1}{N^{2}} cov (\sum_{i = 1}^{N} W_{i}^{d_{1}} {\bar{Y}}_{i} (1 - R_{i}), \sum_{i = 1}^{N} W_{i}^{d_{3}} {\bar{Y}}_{i} (1 - R_{i}))

= \frac{1}{N^{2}} [\sum_{i = 1}^{N} cov (W_{i}^{d_{1}} {\bar{Y}}_{i} R_{i}, W_{i}^{d_{3}} {\bar{Y}}_{i} R_{i}) + \sum_{i \neq j} cov (W_{i}^{d_{1}} {\bar{Y}}_{i} R_{i}, W_{j}^{d_{3}} {\bar{Y}}_{j} R_{j})]

+ \frac{1}{N^{2}} [\sum_{i = 1}^{N} cov (W_{i}^{d_{1}} {\bar{Y}}_{i} R_{i}, W_{i}^{d_{3}} {\bar{Y}}_{i} (1 - R_{i})) + \sum_{i \neq j} cov (W_{i}^{d_{1}} {\bar{Y}}_{i} R_{i}, W_{j}^{d_{3}} {\bar{Y}}_{j} (1 - R_{j}))]

+ \frac{1}{N^{2}} [\sum_{i = 1}^{N} cov (W_{i}^{d_{1}} {\bar{Y}}_{i} (1 - R_{i}), W_{i}^{d_{3}} {\bar{Y}}_{i} R_{i}) + \sum_{i \neq j} cov (W_{i}^{d_{1}} {\bar{Y}}_{i} (1 - R_{i}), W_{j}^{d_{3}} {\bar{Y}}_{j} R_{j})]

+ \frac{1}{N^{2}} [\sum_{i = 1}^{N} cov (W_{i}^{d_{1}} {\bar{Y}}_{i} (1 - R_{i}), W_{i}^{d_{3}} {\bar{Y}}_{i} (1 - R_{i})) + \sum_{i \neq j} cov (W_{i}^{d_{1}} {\bar{Y}}_{i} (1 - R_{i}), W_{j}^{d_{3}} {\bar{Y}}_{j} (1 - R_{j}))]

= \frac{1}{N} cov (W_{i}^{d_{1}} {\bar{Y}}_{i} R_{i}, W_{i}^{d_{3}} {\bar{Y}}_{i} R_{i}) + \frac{1}{N} cov (W_{i}^{d_{1}} {\bar{Y}}_{i} R_{i}, W_{i}^{d_{3}} {\bar{Y}}_{i} (1 - R_{i}))

+ \frac{1}{N} cov (W_{i}^{d_{1}} {\bar{Y}}_{i} (1 - R_{i}), W_{i}^{d_{3}} {\bar{Y}}_{i} R_{i}) + \frac{1}{N} cov (W_{i}^{d_{1}} {\bar{Y}}_{i} (1 - R_{i}), W_{i}^{d_{3}} {\bar{Y}}_{i} (1 - R_{i})),

where

cov (W_{i}^{d_{1}} {\bar{Y}}_{i} R_{i}, W_{i}^{d_{3}} {\bar{Y}}_{i} R_{i}) = E (W_{i}^{d_{1}} {\bar{Y}}_{i} R_{i} W_{i}^{d_{3}} {\bar{Y}}_{i} R_{i}) - E (W_{i}^{d_{1}} {\bar{Y}}_{i} R_{i}) E (W_{i}^{d_{3}} {\bar{Y}}_{i} R_{i})

= E [E (W_{i}^{d_{1}} {\bar{Y}}_{i} R_{i} W_{i}^{d_{3}} {\bar{Y}}_{i} R_{i} | R_{i})] - E [E (W_{i}^{d_{1}} {\bar{Y}}_{i} R_{i} | R_{i})] E [E (W_{i}^{d_{3}} {\bar{Y}}_{i} R_{i} | R_{i})]

= \frac{γ^{d_{1}}}{π_{1 i}^{d_{1}} π_{2 i}^{d_{1} R}} E_{d_{1} R} ({\bar{Y}}_{i}^{2}) - γ^{d_{1}} γ^{d_{3}} E_{d_{1} R} ({\bar{Y}}_{i}) E_{d_{3} R} ({\bar{Y}}_{i})

= \frac{γ^{d_{1}}}{π_{1 i}^{d_{1}} π_{2 i}^{d_{1} R}} [σ_{d_{1} R}^{2} + μ_{d_{1} R}^{2}] - γ^{d_{1}} γ^{d_{3}} μ_{d_{1} R} μ_{d_{3} R} .

Similarly,

cov (W_{i}^{d_{1}} {\bar{Y}}_{i} R_{i}, W_{i}^{d_{3}} {\bar{Y}}_{i} (1 - R_{i}))

= E [E (W_{i}^{d_{1}} {\bar{Y}}_{i} R_{i} W_{i}^{d_{3}} {\bar{Y}}_{i} (1 - R_{i}) | R_{i})] - E [E (W_{i}^{d_{1}} {\bar{Y}}_{i} R_{i} | R_{i})] E [E (W_{i}^{d_{3}} {\bar{Y}}_{i} (1 - R_{i}) | R_{i})]

= - γ^{d_{1}} (1 - γ^{d_{3}}) μ_{d_{1} R} μ_{d_{3} N R},

cov (W_{i}^{d_{1}} {\bar{Y}}_{i} (1 - R_{i}), W_{i}^{d_{3}} {\bar{Y}}_{i} R_{i})

= E [E (W_{i}^{d_{1}} {\bar{Y}}_{i} (1 - R_{i}) W_{i}^{d_{3}} {\bar{Y}}_{i} R_{i} | R_{i})] - E [E (W_{i}^{d_{1}} {\bar{Y}}_{i} (1 - R_{i}) | R_{i})] E [E (W_{i}^{d_{3}} {\bar{Y}}_{i} R_{i} | R_{i})]

= - γ^{d_{3}} (1 - γ^{d_{1}}) μ_{d_{1} N R} μ_{d_{3} R},

cov (W_{i}^{d_{1}} {\bar{Y}}_{i} (1 - R_{i}), W_{i}^{d_{3}} {\bar{Y}}_{i} (1 - R_{i}))

= E [E (W_{i}^{d_{1}} {\bar{Y}}_{i} (1 - R_{i}) W_{i}^{d_{3}} {\bar{Y}}_{i} (1 - R_{i}) | R_{i})] - E [E (W_{i}^{d_{1}} {\bar{Y}}_{i} (1 - R_{i}) | R_{i})] E [E (W_{i}^{d_{3}} {\bar{Y}}_{i} (1 - R_{i}) | R_{i})]

= - (1 - γ^{d_{1}}) (1 - γ^{d_{3}}) μ_{d_{1} N R} μ_{d_{3} N R} .

Thus, $cov ({\bar{Y}}^{d_{1}}, {\bar{Y}}^{d_{3}})$ is

cov ({\bar{Y}}^{d_{1}}, {\bar{Y}}^{d_{3}}) = \frac{1}{N} {\frac{γ^{d_{1}}}{π_{1 i}^{d_{1}} π_{2 i}^{d_{1} R}} (σ_{d_{1} R}^{2} + μ_{d_{1} R}^{2}) - γ^{d_{1}} γ^{d_{3}} μ_{d_{1} R} μ_{d_{3} R} - γ^{d_{1}} (1 - γ^{d_{3}}) μ_{d_{1} R} μ_{d_{3} N R} - γ^{d_{3}} (1 - γ^{d_{1}}) μ_{d_{1} N R} μ_{d_{3} R} - (1 - γ^{d_{1}}) (1 - γ^{d_{3}}) μ_{d_{1} N R} μ_{d_{3} N R}} .

Therefore, the variance of the difference in regime means between d₁ and d₃ is

V ({\bar{Y}}^{d_{1}} - {\bar{Y}}^{d_{3}}) = \frac{1}{N} {\frac{γ^{d_{1}}}{π_{1 i}^{d_{1}} [π_{2 i}^{d_{1} R}]} (σ_{d_{1} R}^{2} + (1 - π_{1 i}^{d_{1}} [π_{2 i}^{d_{1} R}]) μ_{d_{1} R}^{2}) + \frac{1 - γ^{d_{1}}}{π_{1 i}^{d_{1}} [π_{2 i}^{d_{1} N R}]} (σ_{d_{1} N R}^{2} + (1 - π_{1 i}^{d_{1}} [π_{2 i}^{d_{1} N R}]) μ_{d_{1} N R}^{2}) + γ^{d_{1}} (1 - γ^{d_{1}}) {(μ_{d_{1} R} - μ_{d_{1} N R})}^{2} + \frac{γ^{d_{3}}}{π_{1 i}^{d_{3}} [π_{2 i}^{d_{3} R}]} (σ_{d_{3} R}^{2} + (1 - π_{1 i}^{d_{3}} [π_{2 i}^{d_{3} R}]) μ_{d_{3} R}^{2}) + \frac{1 - γ^{d_{3}}}{π_{1 i}^{d_{3}} [π_{2 i}^{d_{3} N R}]} (σ_{d_{3} N R}^{2} + (1 - π_{1 i}^{d_{3}} [π_{2 i}^{d_{3} N R}]) μ_{d_{3} N R}^{2}) + γ^{d_{3}} (1 - γ^{d_{3}}) {(μ_{d_{3} R} - μ_{d_{3} N R})}^{2} - 2 [\frac{γ^{d_{1}}}{π_{1 i}^{d_{1}} π_{2 i}^{d_{1} R}} (σ_{d_{1} R}^{2} + μ_{d_{1} R}^{2}) - γ^{d_{1}} γ^{d_{3}} μ_{d_{1} R} μ_{d_{3} R} - γ^{d_{1}} (1 - γ^{d_{3}}) μ_{d_{1} R} μ_{d_{3} N R} - γ^{d_{3}} (1 - γ^{d_{1}}) μ_{d_{1} N R} μ_{d_{3} R} - (1 - γ^{d_{1}}) (1 - γ^{d_{3}}) μ_{d_{1} N R} μ_{d_{3} N R}]} .

Similarly, we can also derive the expression for $cov ({\bar{Y}}^{d_{1}}, {\bar{Y}}^{d_{5}})$ , where d₁ and d₅ do not share an initial treatment, that is,

cov ({\bar{Y}}^{d_{1}}, {\bar{Y}}^{d_{5}}) = \frac{1}{N} {- γ^{d_{1}} γ^{d_{5}} μ_{d_{1} R} μ_{d_{5} R} - γ^{d_{1}} (1 - γ^{d_{5}}) μ_{d_{1} R} μ_{d_{5} N R} - γ^{d_{5}} (1 - γ^{d_{1}}) μ_{d_{1} N R} μ_{d_{5} R} - (1 - γ^{d_{1}}) (1 - γ^{d_{5}}) μ_{d_{1} N R} μ_{d_{5} N R}} .
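The two covariance expressions differ only in the leading term, which is present when the regimes share an initial treatment and absent otherwise. A minimal Python sketch of both cases, together with the variance of a difference from (C1), is given below; the function names and the `shared` flag are hypothetical conveniences and are not taken from the SMARTp package.

```python
def cov_regime_means(n, g_a, g_b, mu_a_r, mu_a_nr, mu_b_r, mu_b_nr,
                     shared=False, pi1=None, pi2_r=None, sig2_a_r=None):
    """Covariance between the IPW means of two regimes a and b.

    With shared=True the regimes share an initial treatment, and the leading
    term gamma_a / (pi1 * pi2_r) * (sigma_aR^2 + mu_aR^2) is included.
    """
    cross = (g_a * g_b * mu_a_r * mu_b_r
             + g_a * (1.0 - g_b) * mu_a_r * mu_b_nr
             + g_b * (1.0 - g_a) * mu_a_nr * mu_b_r
             + (1.0 - g_a) * (1.0 - g_b) * mu_a_nr * mu_b_nr)
    lead = 0.0
    if shared:
        if pi1 is None or pi2_r is None or sig2_a_r is None:
            raise ValueError("pi1, pi2_r and sig2_a_r are required when shared=True")
        lead = g_a / (pi1 * pi2_r) * (sig2_a_r + mu_a_r ** 2)
    return (lead - cross) / n


def var_difference(var_a, var_b, cov_ab):
    """Equation (C1): V(A - B) = V(A) + V(B) - 2 cov(A, B)."""
    return var_a + var_b - 2.0 * cov_ab
```

The four cross terms factor as the product of the two regime means, so for regimes without a shared initial treatment the covariance is simply the negative of that product divided by N.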

Thus, the variances of ${\bar{Y}}^{d_{1}} - {\bar{Y}}^{d_{2}}$ ,… and ${\bar{Y}}^{d_{1}} - {\bar{Y}}^{d_{8}}$ can all be derived in the same way as $V ({\bar{Y}}^{d_{1}} - {\bar{Y}}^{d_{3}})$ or $V ({\bar{Y}}^{d_{1}} - {\bar{Y}}^{d_{5}})$ . The same holds for the covariances among ${\bar{Y}}^{d_{1}} - {\bar{Y}}^{d_{2}}$ , … and ${\bar{Y}}^{d_{1}} - {\bar{Y}}^{d_{8}}$ , for example,

cov ({\bar{Y}}^{d_{1}} - {\bar{Y}}^{d_{2}}, {\bar{Y}}^{d_{1}} - {\bar{Y}}^{d_{3}}) = cov ({\bar{Y}}^{d_{1}}, {\bar{Y}}^{d_{1}}) - cov ({\bar{Y}}^{d_{1}}, {\bar{Y}}^{d_{3}}) - cov ({\bar{Y}}^{d_{1}}, {\bar{Y}}^{d_{2}}) + cov ({\bar{Y}}^{d_{2}}, {\bar{Y}}^{d_{3}}) .

Now, Equation (14) can be derived as follows:

Pr (\frac{{\hat{δ}}_{d_{1} - d_{2}}}{\sqrt{2 σ_{d_{1} - d_{2}}^{2} / N_{4}}} > z_{α}, \dots, \frac{{\hat{δ}}_{d_{1} - d_{8}}}{\sqrt{2 σ_{d_{1} - d_{8}}^{2} / N_{4}}} > z_{α}) = 1 - β = power

\Rightarrow Pr (\frac{{\hat{δ}}_{d_{1} - d_{2}} - δ_{d_{1} - d_{2}}}{\sqrt{2 σ_{d_{1} - d_{2}}^{2} / N_{4}}} > z_{α} - \frac{δ_{d_{1} - d_{2}}}{\sqrt{2 σ_{d_{1} - d_{2}}^{2} / N_{4}}}, \dots, \frac{{\hat{δ}}_{d_{1} - d_{8}} - δ_{d_{1} - d_{8}}}{\sqrt{2 σ_{d_{1} - d_{8}}^{2} / N_{4}}} > z_{α} - \frac{δ_{d_{1} - d_{8}}}{\sqrt{2 σ_{d_{1} - d_{8}}^{2} / N_{4}}}) = power

\Rightarrow Pr (\frac{{\hat{δ}}_{d_{1} - d_{2}} - δ_{d_{1} - d_{2}}}{\sqrt{2 σ_{d_{1} - d_{2}}^{2} / N_{4}}} > z_{α} - \sqrt{N_{4} / 2} δ_{d_{1} - d_{2}}^{*}, \dots, \frac{{\hat{δ}}_{d_{1} - d_{8}} - δ_{d_{1} - d_{8}}}{\sqrt{2 σ_{d_{1} - d_{8}}^{2} / N_{4}}} > z_{α} - \sqrt{N_{4} / 2} δ_{d_{1} - d_{8}}^{*}) = power

\Rightarrow Pr (Z_{d_{1} - d_{2}} \leq - (z_{α} - δ_{d_{1} - d_{2}}^{*} \sqrt{N_{4} / 2}), \dots, Z_{d_{1} - d_{8}} \leq - (z_{α} - δ_{d_{1} - d_{8}}^{*} \sqrt{N_{4} / 2})) = power .
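When only a single pairwise comparison is involved, the final probability statement reduces to a univariate normal probability and can be solved for the sample size in closed form. The Python sketch below covers that special case only; it is a simplification introduced here for illustration, since the joint probability over all seven comparisons additionally requires a multivariate normal cdf with the correlation matrix of the Z statistics. The function names are hypothetical, and `delta_star` denotes the standardized effect size δ*.

```python
import math
from statistics import NormalDist

_STD_NORMAL = NormalDist()   # standard normal N(0, 1)


def power_single(n, delta_star, alpha=0.05):
    """Pr(Z <= delta* sqrt(n / 2) - z_alpha) for one one-sided comparison."""
    z_alpha = _STD_NORMAL.inv_cdf(1.0 - alpha)
    return _STD_NORMAL.cdf(delta_star * math.sqrt(n / 2.0) - z_alpha)


def sample_size_single(delta_star, alpha=0.05, power=0.80):
    """Smallest n reaching the target power: n = 2 (z_alpha + z_beta)^2 / delta*^2."""
    if delta_star <= 0:
        raise ValueError("delta_star must be positive")
    z_alpha = _STD_NORMAL.inv_cdf(1.0 - alpha)
    z_beta = _STD_NORMAL.inv_cdf(power)
    return math.ceil(2.0 * (z_alpha + z_beta) ** 2 / delta_star ** 2)
```

For instance, `sample_size_single(0.5)` returns 50 for a one-sided 5% test with 80% power, and `power_single(50, 0.5)` is then just above 0.80.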

APPENDIX D: PROOF OF THEOREM 3.1

Proof. The proof of consistency relies on the strong law of large numbers, such that ${\hat{δ}}_{d_{1} - d_{3}} = {\bar{Y}}_{d_{1}} - {\bar{Y}}_{d_{3}} = \frac{1}{N_{2}} \sum_{i = 1}^{N_{2}} W_{i}^{d_{1}} {\bar{Y}}_{i} - \frac{1}{N_{2}} \sum_{i = 1}^{N_{2}} W_{i}^{d_{3}} {\bar{Y}}_{i} \to μ_{d_{1} 0} - μ_{d_{3} 0}$ , almost surely, and uniformly for $μ_{d_{1}}$ and $μ_{d_{3}} \in Θ$ as N₂ → ∞, with $δ_{(d_{1} - d_{3}) 0}$ being the unique expected value of ${\hat{δ}}_{d_{1} - d_{3}}$ due to Assumption 2. To prove the asymptotic normality result, we have

\sqrt{N_{2}} ({\hat{δ}}_{d_{1} - d_{3}} - δ_{(d_{1} - d_{3}) 0}) = \sqrt{N_{2}} [\frac{1}{N_{2}} \sum_{i = 1}^{N_{2}} W_{i}^{d_{1}} {\bar{Y}}_{i} - \frac{1}{N_{2}} \sum_{i = 1}^{N_{2}} W_{i}^{d_{3}} {\bar{Y}}_{i} - (μ_{d_{1} 0} - μ_{d_{3} 0})]

with $E (\frac{1}{N_{2}} \sum_{i = 1}^{N_{2}} W_{i}^{d_{1}} {\bar{Y}}_{i}) = μ_{d_{1} 0}$ , $E (\frac{1}{N_{2}} \sum_{i = 1}^{N_{2}} W_{i}^{d_{3}} {\bar{Y}}_{i}) = μ_{d_{3} 0}$ , $E ({\hat{δ}}_{d_{1} - d_{3}}) = δ_{(d_{1} - d_{3}) 0}$ and $var ({\hat{δ}}_{d_{1} - d_{3}}) = 2 σ_{d_{1} - d_{3}}^{2} / N_{2}$ due to Assumptions 1 and 2. Note that $σ_{d_{1} - d_{3}}^{2}$ can be computed by Equation (10). Thus, by the central limit theorem, $\sqrt{N_{2}} ({\hat{δ}}_{d_{1} - d_{3}} - δ_{(d_{1} - d_{3}) 0})$ converges in distribution to Normal $(0, 2 σ_{d_{1} - d_{3}}^{2})$ . □

Footnotes

CONFLICT OF INTEREST

The authors have declared that there is no conflict of interest.

SUPPORTING INFORMATION

Additional supporting information including source code to reproduce the results may be found online in the Supporting Information section at the end of the article.

REFERENCES

Albert PS (2019). Shared random parameter models: A legacy of the biostatistics program at the National Heart, Lung, and Blood Institute. Statistics in Medicine, 38(4), 501–511.
Aparecida Guedes T, Rossi RM, Tozzo Martins AB, Janeiro V, & Pedroza Carneiro JW (2014). Applying regression models with skew-normal errors to the height of bedding plants of Stevia rebaudiana (Bert) Bertoni. Acta Scientiarum. Technology, 36(3), 463–468.
Artman W, Nahum-Shani I, Wu T, Mckay J, & Ertefaie A (2018). Power analysis in a SMART design: Sample size estimation for determining the best embedded dynamic treatment regime. Biostatistics, 1–17. https://doi-org.eres.qnl.qa/10.1093/biostatistics/kxy064
Azarpazhooh A, Shah PS, Tenenbaum HC, & Goldberg MB (2010). The effect of photodynamic therapy for periodontitis: A systematic review and meta-analysis. Journal of Periodontology, 81(1), 4–14.
Azzalini A, & Capitanio A (1999). Statistical applications of the multivariate skew normal distribution. Journal of the Royal Statistical Society: Series B (Statistical Methodology), 61(3), 579–602.
Azzalini A, & Capitanio A (2003a). Distributions generated by perturbation of symmetry with emphasis on a multivariate skew-t distribution. Journal of the Royal Statistical Society: Series B (Statistical Methodology), 65(2), 367–389.
Azzalini A, & Capitanio A (2003b). Distributions generated by perturbation of symmetry with emphasis on a multivariate skew-t distribution. Journal of the Royal Statistical Society: Series B (Statistical Methodology), 65(2), 367–389.
Azzalini A, & Dalla Valle A (1996). The multivariate skew-normal distribution. Biometrika, 83(4), 715–726.
Banerjee S, Carlin BP, & Gelfand AE (2014). Hierarchical modeling and analysis for spatial data (2nd ed.). Boca Raton, FL: Chapman and Hall/CRC.
Beck JD, Elter JR, Heiss G, Couper D, Mauriello SM, & Offenbacher S (2001). Relationship of periodontal disease to carotid artery intima-media wall thickness: The atherosclerosis risk in communities (ARIC) study. Arteriosclerosis, Thrombosis, and Vascular Biology, 21(11), 1816–1822.
Breslin FC, Sobell MB, Sobell LC, Cunningham JA, Sdao-Jarvie K, & Borsoi D (1998). Problem drinkers: Evaluation of a stepped-care approach. Journal of Substance Abuse, 10(3), 217–232.
Brooner R, & Kidorf M (2002). Using behavioral reinforcement to improve methadone treatment participation. Science and Practice Perspectives, 1(1), 38–47.
Cao W, Tsiatis A, & Davidian M (2009). Improving efficiency and robustness of the doubly robust estimator for a population mean within complete data. Biometrika, 96(3), 723–734.
Chakraborty B, & Moodie E (2013). Statistical methods for dynamic treatment regimes. New York: Springer.
Chakraborty B, Murphy S, & Strecher V (2010). Inference for non-regular parameters in optimal dynamic treatment regimes. Statistical Methods in Medical Research, 19(3), 317–343.
Eke P, Page R, Wei L, Thornton-Evans G, & Genco R (2012). Update of the case definitions for population-based surveillance of periodontitis. Journal of Periodontology, 55(12), 1449–1454.
Ertefaie A, Wu T, Lynch K, & Nahum-Shani I (2015). Identifying a set that contains the best dynamic treatment regimes. Biostatistics, 17, 135–148.
Fernandes JK, Wiegand RE, Salinas CF, Grossi SG, Sanders JJ, Lopes-Virella MF, & Slate EH (2009). Periodontal disease status in Gullah African Americans with Type-2 diabetes living in South Carolina. Journal of Periodontology, 80(7), 1062–1068.
Garaicoa-Pazmino C, Decker AM, & Polverini PJ (2015). Personalized Medicine approaches to the prevention, diagnosis, and treatment of chronic periodontitis. In Polverini PJ (Ed.), Personalized Oral Health Care (pp. 99–112). Berlin: Springer.
Garcia I, Kuska R, & Somerman M (2013). Expanding the foundation for personalized medicine: Implications and challenges for dentistry. Journal of Dental Research Clinical Research Supplement, 92(Suppl. 7), 3S–10S.
Ghosh P, Cheung Y, & Chakraborty B (2016). Sample size calculations for clustered SMART designs. In Kosorok M & Moodie E (Eds.), ASA-SIAM Statistics and Applied Probability Series. Adaptive treatment strategies in practice: Planning trials and analyzing data for personalized medicine (pp. 55–68). Philadelphia, PA: SIAM.
Glasgow M, Engel B, & D’Lugoff B (1989). A controlled study of a standardized behavioural stepped treatment for hypertension. Psychosomatic Medicine, 51(1), 10–26.
Grossi SG, Skrepcinski FB, DeCaro T, Robertson DC, Ho AW, Dunford RG, & Genco RJ (1997). Treatment of periodontal disease in diabetics reduces glycated hemoglobin. Journal of Periodontology, 68(8), 713–719.
Herrera D (2016). Scaling and root planning is recommended in the nonsurgical treatment of chronic periodontitis. Journal of Evidence-Based Dental Practice, 16(1), 56–58.
John M, Michalowicz B, Kotsakis G, & Chu H (2017). Network meta-analysis of studies included in the Clinical Practice Guideline on the nonsurgical treatment of chronic periodontitis. Journal of Clinical Periodontology, 44(6), 603–611.
Lavori PW, & Dawson R (2004). Dynamic treatment regimes: Practical design considerations. Clinical Trials, 1(1), 9–20.
Lei H, Nahum-Shani I, Lynch K, Oslin D, & Murphy SA (2012). A “SMART” design for building individualized treatment sequences. Annual Review of Clinical Psychology, 8, 21–48.
Liu C-M, Hou L-T, Wong M-Y, & Lan W-H (1999). Comparison of Nd: YAG laser versus scaling and root planing in periodontal therapy. Journal of Periodontology, 70(11), 1276–1282.
Murphy SA (2003). Optimal dynamic treatment regimes. Journal of the Royal Statistical Society: Series B (Statistical Methodology), 65(2), 331–366.
Murphy SA (2005). An experimental design for the development of adaptive treatment strategies. Statistics in Medicine, 24(10), 1455–1481.
Murphy SA, & McKay JR (2004). Adaptive treatment strategies: An emerging approach for improving treatment effectiveness. Clinical Science, 12, 7–13.
Murphy SA, Van Der Laan MJ, Robins JM, & Conduct Problems Prevention Research Group. (2001). Marginal mean models for dynamic regimes. Journal of the American Statistical Association, 96(456), 1410–1423.
NeCamp T, Kilbourne A, & Almirall D (2017). Comparing cluster-level dynamic treatment regimens using sequential, multiple assignment, randomized trials: Regression estimation and sample size considerations. Statistical Methods in Medical Research, 26(4), 1572–1589.
Nicholls C (2003). Periodontal disease incidence, progression and rate of tooth loss in a general dental practice: The results of a 12-year retrospective analysis of patient’s clinical records. British Dental Journal, 194(9), 485–488.
Porteous MS, & Rowe DJ (2014). Adjunctive use of the diode laser in non-surgical periodontal therapy: Exploring the controversy. Journal of Dental Hygiene, 88(2), 78–86.
Reich BJ, & Bandyopadhyay D (2010). A latent factor model for spatial data with informative missingness. Annals of Applied Statistics, 4(1), 439–459. [DOI] [PMC free article] [PubMed] [Google Scholar]
Reich B, Bandyopadhyay D, & Bondell H (2013). A nonparametric spatial model for periodontal data with nonrandom missingness. Journal of the American Statistical Association, 108(503), 820–831. [DOI] [PMC free article] [PubMed] [Google Scholar]
Robins J, Rotnitzky A, & Zhao L (1994a). Estimation of regression coefficients when some regressors are not always observed. Journal of the American Statistical Association, 89(427), 846–866. [Google Scholar]
Robins JM (2004). Optimal structural nested models for optimal sequential decisions In Lin D & Heagerty P (Eds.), Proceedings of the Second Seattle Symposium in Biostatistics: Analysis of Correlated Data, Lecture Notes in Statistics (Vol. 179, pp. 189–326). New York: Springer. [Google Scholar]
Robins JM, Rotnitzky A, & Zhao LP (1994b). Estimation of regression coefficients when some regressors are not always observed. Journal of the American Statistical Association, 89(427), 846–866. [Google Scholar]
Sgolastra F, Petrucci A, Gatto R, & Monaco A (2012). Efficacy of Er:YAG laser in the treatment of chronic periodontitis: Systematic review and meta-analysis. Lasers in Medical Science, 27(3), 661–673. [DOI] [PubMed] [Google Scholar]
Smiley C, Tracy S, Abt E, Michalowicz B, John M, Gunsolley J, & Hanson N (2015). Systematic review and meta-analysis on the nonsurgical treatment of chronic periodontitis by means of scaling and root planing with or without adjuncts. Journal of the American Dental Association, 146(7), 508–524.
Thornton-Evans G, Eke P, Wei L, Palmer A, Moeti R, Hutchins S, & Borrell L (2013). Periodontitis among adults aged ≥30 years—United States, 2009–2010. Centers for Disease Control and Prevention. Morbidity and Mortality Weekly Report, 62(Suppl. 3), 129–135.
Unützer J, Katon W, Williams J, Callahan C, Harpole L, Hunkeler E, & Langston C (2001). Improving primary care for depression in late life: The design of a multicenter randomized trial. Medical Care, 39(8), 785–799.
Van Der Laan M, & Rubin D (2006). Targeted maximum likelihood learning (Working Paper Series, Working Paper 213). Berkeley, CA: U.C. Berkeley Division of Biostatistics.
Vonesh EF, Greene T, & Schluchter MD (2006). Shared parameter models for the joint analysis of longitudinal data and event times. Statistics in Medicine, 25(1), 143–163.
Wang Z, Zhou X, Zhang J, Zhang L, Song Y, Hu FB, & Wang C (2009). Periodontal health, oral health behaviours, and chronic obstructive pulmonary disease. Journal of Clinical Periodontology, 36(9), 750–755.
Wiebe CB, & Putnins EE (2000). The periodontal disease classification system of the American Academy of Periodontology—An update. Journal of the Canadian Dental Association, 66(11), 594–597.
Workgroup A (2011). American Academy of Periodontology Statement on the efficacy of lasers in the non-surgical treatment of inflammatory periodontal disease. Journal of Periodontology, 82, 513–514.
Zhao Y, Yin Y, Tao L, Nie P, Tang Y, & Zhu M (2014). Er:YAG laser versus scaling and root planing as alternative or adjuvant for chronic periodontitis treatment: A systematic review. Journal of Clinical Periodontology, 41(11), 1069–1079.

Associated Data

Supplementary Materials

R codes

NIHMS1053056-supplement-R_codes.zip (2.8 MB, zip)
