2017 Mar 10;15(3):e2001307. doi: 10.1371/journal.pbio.2001307

Table 1. Early stopping for significance or futility using nonsequential and group sequential designs (examples with n = 36 or n = 72).

	Small study (n = 36), stop for significance, three stages					Larger study (n = 72), stop for significance or futility, two stages
	Sample size (per group)	Freq. nonseq.	Freq. seq.	Bayes Factor	Bayes (CRI) with noninf. prior	Sample size (per group)	Freq. nonseq.	Freq. seq.	Bayes Factor	Bayes (CRI) with noninf. prior
d = 0
Stage 1 [%] sign./futility	12 (6 versus 6)	-	0.1	2.3	0.4	36 (18 versus 18)	-	0.8/50.7	3.5/70.7	1.1/50.1
Stage 1 and 2 [%] sign.	24 (12 versus 12)	-	1.4	4.0	3.2	-	-	-	-	-
Stage 1 and 2 (and 3) = type I error [%] sign.	36 (18 versus 18)	5.0	4.9	5.0	5.0	72 (36 versus 36)	5.0	5.3	5.0	5.1
Cost [mean number of animals]		36	36	36	36		72	53	45	54
d_est		0.78	0.84	1.38	1.00		0.54	0.55	0.77	0.57
d = 0.5
Stage 1 [%] sign./futility	12 (6 versus 6)	-	0.4	6.4	1.0	36 (18 versus 18)	-	9.7/18.8	25.6/32.5	11.2/18.5
Stage 1 and 2 [%] sign.	24 (12 versus 12)	-	10.3	15.3	18.4	-	-	-	-	-
Stage 1 and 2 (and 3) = Power [%] sign.	36 (18 versus 18)	30.8	31.2	24.3	30.3	72 (36 versus 36)	55.3	53.8	46.0	54.3
Cost [mean number of animals]		36	35	34	34		72	62	51	61
d_est		0.86	0.93	1.13	1.00		0.65	0.67	0.78	0.68
d = 1.0
Stage 1 [%] sign./futility	12 (6 versus 6)	-	1.6	22.8	4.5	36 (18 versus 18)	-	54.4/0.9	78.3/2.8	57.7/0.8
Stage 1 and 2 [%] sign.	24 (12 versus 12)	-	43.6	53.3	58.4	-	-	-	-	-
Stage 1 and 2 (and 3) = Power [%] sign.	36 (18 versus 18)	83.0	82.2	74.7	80.5	72 (36 versus 36)	98.7	98.1	96.1	98.2
Cost [mean number of animals]		36	31	27	28		72	52	43	51
d_est		1.09	1.16	1.27	1.14		1.01	1.07	1.05	1.06

Simulations are based on a total of 18 or 36 samples per group. Power or type I error is shown for three standardized effect sizes (Cohen’s d = 0, 0.5, or 1.0). Numbers give cumulative percentages [%] of statistically significant trials out of 10,000 simulation runs, together with “Costs,” defined as the long-term mean number of experimental units, and the median estimated effect size in significant trials (d_est). Small study with n = 18 per group: stage 1: n = 12 (6 versus 6); stages 1 and 2: n = 24 (12 versus 12); stages 1, 2, and 3: n = 36 (18 versus 18) experimental units. Stopping rules that allowed early stopping: Freq. nonseq.: α = 0.05; Freq. seq.: significance levels for interim analyses: α₁ = 0.0006, α₂ = 0.0151, α₃ = 0.0471 according to [11]; Bayes Factor: 3 for each stage; Bayes noninf. prior: CRI for effect size: stage 1: 99.8% CRI, stages 2 and 3: 96.8% CRI.

Larger study with n = 36 per group: stage 1: n = 36 (18 versus 18); stages 1 and 2: n = 72 (36 versus 36) experimental units. Stopping rules that allowed early stopping for futility or significance: Freq. nonseq.: α = 0.05; Freq. seq. [11]: α_futility = 0.5, α₁ = 0.0065, α₂ = 0.0525; Bayes Factor: 2, and for futility: 0.5; CRI for effect size d using a Bayesian approach with noninf. prior: stage 1: 99% CRI; for futility: zero is included in the 50% CRI for effect size d; stage 2: 95% CRI.
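The “Cost” rows can be cross-checked against the cumulative stopping percentages in the table. The following is a minimal sketch, assuming that every trial meeting a stopping criterion at an interim analysis ends at that stage and that all remaining trials run to the final stage. The function name `expected_cost` and its arguments are hypothetical and are not taken from the original analysis.

```python
def expected_cost(stage_totals, cumulative_stop_pct):
    """Mean number of animals used, given cumulative stopping percentages.

    stage_totals: total animals used if the trial ends at each stage.
    cumulative_stop_pct: cumulative % of trials stopped at or before each
        interim analysis (one entry per interim stage, final stage excluded).
    """
    cost = 0.0
    previous = 0.0
    for total, cum_pct in zip(stage_totals, cumulative_stop_pct):
        # Share of trials that stop exactly at this interim stage.
        cost += total * (cum_pct - previous) / 100.0
        previous = cum_pct
    # Assumption: every trial not stopped early runs to the final stage.
    cost += stage_totals[-1] * (100.0 - previous) / 100.0
    return cost


# Small study, Freq. seq., d = 1.0: 1.6% stopped at stage 1, 43.6% by stage 2.
small = expected_cost([12, 24, 36], [1.6, 43.6])
# Larger study, Freq. seq., d = 0: 0.8% significant plus 50.7% futile at stage 1.
large = expected_cost([36, 72], [0.8 + 50.7])
print(round(small), round(large))  # prints: 31 53
```

Both rounded values agree with the corresponding “Cost” entries in the table (31 and 53 animals).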

All sequential approaches used were calibrated to get a type I error of about 5%.
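A calibration of this kind can be checked by simulating trials under d = 0 and counting how often any stage reaches significance. The sketch below is illustrative only: it swaps in a two-sided z-test with known unit variance, which is an assumption (the table notes do not state the test statistic), so its output will not reproduce the table exactly. The function name `simulate` and its parameters are hypothetical.

```python
import random
from statistics import NormalDist


def simulate(d, per_group_n, alphas, alpha_futility=None, runs=10_000, seed=1):
    """Simulate a two-group sequential design with a two-sided z-test.

    d: true standardized effect size (group means differ by d, SD = 1).
    per_group_n: cumulative per-group sample size at each stage.
    alphas: nominal significance level at each stage.
    alpha_futility: if set, stop for futility at an interim stage when
        the p-value exceeds this threshold.
    Returns (cumulative % significant per stage, mean total animals used).
    """
    rng = random.Random(seed)
    norm = NormalDist()
    last_stage = len(per_group_n) - 1
    significant_at = [0] * len(per_group_n)
    animals_used = 0
    for _ in range(runs):
        a, b = [], []
        for stage, n in enumerate(per_group_n):
            # Add only the animals that are new at this stage.
            a.extend(rng.gauss(0.0, 1.0) for _ in range(n - len(a)))
            b.extend(rng.gauss(d, 1.0) for _ in range(n - len(b)))
            # z statistic with known SD = 1 (assumption, not from the source).
            z = (sum(b) / n - sum(a) / n) / (2.0 / n) ** 0.5
            p = 2.0 * (1.0 - norm.cdf(abs(z)))
            if p < alphas[stage]:
                significant_at[stage] += 1
                break
            if stage == last_stage:
                break
            if alpha_futility is not None and p > alpha_futility:
                break
        animals_used += 2 * n  # both groups, up to the stage where we stopped
    cumulative, total = [], 0
    for count in significant_at:
        total += count
        cumulative.append(100.0 * total / runs)
    return cumulative, animals_used / runs


# Small study: three stages, stop for significance only, d = 0.
print(simulate(0.0, [6, 12, 18], [0.0006, 0.0151, 0.0471]))
# Larger study: two stages, stop for significance or futility, d = 0.
print(simulate(0.0, [18, 36], [0.0065, 0.0525], alpha_futility=0.5))
```

Under d = 0 the last cumulative percentage estimates the overall type I error, and the second return value corresponds to the “Cost” row. Setting `d` to 0.5 or 1.0 gives power estimates under the same assumptions.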

Abbreviations: CRI, credible interval; Freq. nonseq., Frequentist nonsequential; Freq. seq., Frequentist sequential; Noninf., Noninformative.