Among published studies of neonatal imitation in humans, across a variety of facial and other actions (shown here: tongue protrusion, TP; mouth opening, MO; other facial gestures or other actions), sample size is a good predictor of whether a study found positive results (i.e. evidence of imitation) or negative/null results. We carried out an a priori power analysis to determine the sample size necessary for power = 0.80 (f = 0.40; α = 0.05) to detect this effect and determined that a sample size of 26 is needed. The ‘frequencies of actions’ axis label refers to the number of modelled actions that were tested, both within and between studies. For example, nine studies with sample sizes >26 tested TP and found positive results, whereas six studies tested MO and, of these, five found positive results. (Online version in colour.)