2015 Dec 19;16:94. doi: 10.1186/s12868-015-0228-5

Table 3.

Consequences of not accommodating cluster-related variation in research design B

Variation in experimental effect	Statistical test	Variation in intercept: absent	Variation in intercept: present
		Statistical power ^a
		Study 1a	Study 1b
Absent	T test ind. obs.	Correct	Decreased power
Absent	T test summary st.	Decreased power	Decreased power
Absent	Multilevel analysis I	Correct	Correct
Absent	Multilevel analysis II	Correct	Correct
		False positive rate
		Study 2a	Study 2b
Present	T test ind. obs.	Increased false positive rate	Increased false positive rate
Present	T test summary st.	Correct	Correct
Present	Multilevel analysis I	Increased false positive rate	Increased false positive rate
Present	Multilevel analysis II	Correct	Correct

The four statistical tests used to detect the experimental effect are compared with respect to (1) statistical power to detect the (overall) experimental effect (when variation in the experimental effect is absent) and (2) false positive rate (when variation in the experimental effect is present). The fitted statistical models are a t test on individual observations (T test ind. obs.), a paired t test on the experimental-condition-specific cluster means (T test summary st.), a multilevel analysis that accommodates variation in the intercept but not in the experimental effect (Multilevel analysis I), and a multilevel analysis that accommodates variation in both the intercept and the experimental effect (Multilevel analysis II).
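To make the first two tests in the legend concrete, the following is a minimal sketch in plain Python, not the authors' code. It assumes a balanced design in which every cluster contributes observations under both experimental conditions, and it simulates data with cluster-specific intercepts and cluster-specific experimental effects. The function names, the data layout, and all parameter values (number of clusters, standard deviations, seed) are hypothetical choices made for illustration. The sketch computes only the t statistics and their degrees of freedom; the two multilevel analyses would need a mixed-model library and are not shown.

```python
import math
import random
import statistics


def simulate_design_b(n_clusters=10, n_per_cond=5, effect=0.0,
                      sd_intercept=1.0, sd_effect=1.0, sd_resid=1.0, seed=1):
    """Simulate clustered data; all parameter values are illustrative assumptions."""
    rng = random.Random(seed)
    data = {}
    for cluster in range(n_clusters):
        intercept = rng.gauss(0.0, sd_intercept)  # cluster-specific baseline
        slope = rng.gauss(effect, sd_effect)      # cluster-specific experimental effect
        data[cluster] = {
            "control": [intercept + rng.gauss(0.0, sd_resid)
                        for _ in range(n_per_cond)],
            "treatment": [intercept + slope + rng.gauss(0.0, sd_resid)
                          for _ in range(n_per_cond)],
        }
    return data


def t_individual(data):
    """Pooled-variance two-sample t on all observations, ignoring clusters."""
    x = [v for d in data.values() for v in d["control"]]
    y = [v for d in data.values() for v in d["treatment"]]
    nx, ny = len(x), len(y)
    pooled = ((nx - 1) * statistics.variance(x)
              + (ny - 1) * statistics.variance(y)) / (nx + ny - 2)
    t = (statistics.mean(y) - statistics.mean(x)) / math.sqrt(pooled * (1 / nx + 1 / ny))
    return t, nx + ny - 2


def t_summary(data):
    """Paired t on the condition-specific cluster means (one difference per cluster)."""
    diffs = [statistics.mean(d["treatment"]) - statistics.mean(d["control"])
             for d in data.values()]
    k = len(diffs)
    t = statistics.mean(diffs) / (statistics.stdev(diffs) / math.sqrt(k))
    return t, k - 1


data = simulate_design_b()
t_ind, df_ind = t_individual(data)
t_sum, df_sum = t_summary(data)
print(f"individual observations: t = {t_ind:.2f}, df = {df_ind}")
print(f"cluster means (paired):  t = {t_sum:.2f}, df = {df_sum}")
```

The test on individual observations treats all 100 simulated values as independent (98 degrees of freedom), whereas the paired test on cluster means uses one difference per cluster (9 degrees of freedom). This difference in the effective unit of analysis is what underlies the contrasting entries for the two t tests in the table.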

^aWhen variation in the experimental effect is absent, all fitted statistical models yield a false positive rate that does not exceed the nominal α specified by the user (i.e., correct or slightly conservative; see e.g. [4, 8]).