Table 3.
Variation in experimental effect | Statistical test | Variation in intercept: absent | Variation in intercept: present |
---|---|---|---|
Absent (statistical power ᵃ) | | Study 1a | Study 1b |
Absent | T test ind. obs. | Correct | Decreased power |
Absent | T test summary st. | Decreased power | Decreased power |
Absent | Multilevel analysis I | Correct | Correct |
Absent | Multilevel analysis II | Correct | Correct |
Present (false positive rate) | | Study 2a | Study 2b |
Present | T test ind. obs. | Increased false positive rate | Increased false positive rate |
Present | T test summary st. | Correct | Correct |
Present | Multilevel analysis I | Increased false positive rate | Increased false positive rate |
Present | Multilevel analysis II | Correct | Correct |
The results of four statistical tests to detect the experimental effect are compared with respect to (1) statistical power to detect the (overall) experimental effect (when variation in the experimental effect is absent) and (2) false positive rate (when variation in the experimental effect is present). The fitted statistical models are a t test on individual observations (T test ind. obs.), a paired t test on the experimental-condition-specific cluster means (T test summary st.), a multilevel analysis that accommodates variation in the intercept but not variation in the experimental effect (Multilevel analysis I), and a multilevel analysis that accommodates variation in both the intercept and the experimental effect (Multilevel analysis II).
ᵃ When variation in the experimental effect is absent, all fitted statistical models result in a false positive rate that does not exceed the nominal α specified by the user (i.e., correct or slightly conservative; see, e.g., [4, 8]).
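
To make the difference between the two t tests named in the legend concrete, the following is a minimal Python sketch using only the standard library. It is an illustration, not the authors' analysis code: the function names, the simulation design, and all parameter values are hypothetical. It computes the t statistic and degrees of freedom only (no p values), and the two multilevel analyses are not shown because they require a mixed-model library.

```python
import math
import random
import statistics


def simulate_clusters(n_clusters, n_per_condition, effect, sd_intercept,
                      sd_effect, sd_residual, seed=0):
    """Simulate nested data (hypothetical design): every cluster contributes
    observations to both a control and an experimental condition."""
    rng = random.Random(seed)
    data = []
    for _ in range(n_clusters):
        intercept = rng.gauss(0.0, sd_intercept)    # cluster-specific intercept
        slope = effect + rng.gauss(0.0, sd_effect)  # cluster-specific effect
        control = [intercept + rng.gauss(0.0, sd_residual)
                   for _ in range(n_per_condition)]
        treated = [intercept + slope + rng.gauss(0.0, sd_residual)
                   for _ in range(n_per_condition)]
        data.append((control, treated))
    return data


def t_test_individual_obs(data):
    """Pooled-variance two-sample t statistic on all individual
    observations, ignoring the clustering. Returns (t, df)."""
    control = [y for c, _ in data for y in c]
    treated = [y for _, t in data for y in t]
    n0, n1 = len(control), len(treated)
    df = n0 + n1 - 2
    pooled_var = ((n0 - 1) * statistics.variance(control)
                  + (n1 - 1) * statistics.variance(treated)) / df
    se = math.sqrt(pooled_var * (1.0 / n0 + 1.0 / n1))
    t = (statistics.fmean(treated) - statistics.fmean(control)) / se
    return t, df


def t_test_summary_stats(data):
    """Paired t statistic on the condition-specific cluster means.
    Returns (t, df), with df = number of clusters - 1."""
    diffs = [statistics.fmean(t) - statistics.fmean(c) for c, t in data]
    df = len(diffs) - 1
    se = statistics.stdev(diffs) / math.sqrt(len(diffs))
    return statistics.fmean(diffs) / se, df


if __name__ == "__main__":
    # Hypothetical setting: variation in both intercept and effect is present.
    data = simulate_clusters(n_clusters=10, n_per_condition=5, effect=0.0,
                             sd_intercept=1.0, sd_effect=1.0, sd_residual=1.0)
    print("individual observations (t, df):", t_test_individual_obs(data))
    print("cluster means (t, df):", t_test_summary_stats(data))
```

Note how the degrees of freedom differ: the test on individual observations treats every observation as independent (df equals the total number of observations minus 2), whereas the test on summary statistics uses one difference score per cluster (df equals the number of clusters minus 1).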