Author manuscript; available in PMC: 2017 Jan 14.

Published in final edited form as: J Res Educ Eff. 2016 Jan 14;9(1):103–127. doi: 10.1080/19345747.2015.1060282

Table 4.

Calibration Simulations: RSMSE

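Method	Scenario 1	Scenario 2	Scenario 3	Scenario 4	Scenario 5	Scenario 6	Scenario 7	Scenario 8	Avg. RSMSE (1–4)	Avg. RSMSE (5–8)
Scenario type	Ignorable	Ignorable	Ignorable	Ignorable	Non-ignorable	Non-ignorable	Non-ignorable	Non-ignorable	Ignorable	Non-ignorable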
OLS (T)	0.11	0.15	0.13	0.14	0.17	0.22	0.22	0.60	0.13	0.30
BART (T)	0.06	0.08	0.07	0.06	0.11	0.48	0.19	0.85	0.07	0.41
OLS	0.20	0.24	0.22	0.21	0.31	0.28	0.30	0.57	0.22	0.37
BART	0.08	0.10	0.08	0.08	0.14	0.33	0.11	0.66	0.09	0.31
IPSW-LR	0.21	0.22	0.20	0.19	0.28	0.54	0.24	0.80	0.20	0.47
IPSW-RF	0.13	0.10	0.11	0.10	0.16	0.45	0.14	0.70	0.11	0.36
IPSW-GBM	0.17	0.18	0.16	0.16	0.23	0.64	0.20	0.81	0.17	0.47
DR-LR	0.33	0.35	0.36	0.34	0.44	0.41	0.42	0.68	0.34	0.48
DR-RF	0.24	0.29	0.26	0.24	0.36	0.32	0.34	0.60	0.26	0.40
DR-GBM	0.34	0.37	0.37	0.35	0.42	0.44	0.43	0.69	0.36	0.50

Note: All results reported here average over 10,000 simulated datasets. See Figures 1 and 2 for a description of the scenarios. RSMSE refers to Root Standardized Mean Square Error. (T) refers to simulations in which control outcomes are available in the target dataset. OLS denotes linear regression; BART denotes Bayesian Additive Regression Trees; IPSW-LR (IPSW-RF/IPSW-GBM) denotes inverse propensity score weighting with propensity scores estimated using logistic regression (random forests/boosting); DR-LR (DR-RF/DR-GBM) refers to double robust weighted linear regression models with propensity scores estimated using logistic regression (random forests/boosting). The last two columns show the average RSMSE for the ignorable (1–4) and non-ignorable (5–8) scenarios.
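The averaging described in the note can be sketched in a few lines of Python. This is only an illustrative check, not the authors' code: the scenario values are copied from the table as printed (rounded to two decimals), and the names `rsmse`, `scenario_averages`, and `averages` are hypothetical. Because the inputs are already rounded, a recomputed mean can differ from the published average in the second decimal place.

```python
# Scenario-level RSMSE values copied from Table 4 (rounded to two decimals).
# Positions 0-3 are the ignorable scenarios (1-4); positions 4-7 are the
# non-ignorable scenarios (5-8).
rsmse = {
    "OLS (T)":  [0.11, 0.15, 0.13, 0.14, 0.17, 0.22, 0.22, 0.60],
    "BART (T)": [0.06, 0.08, 0.07, 0.06, 0.11, 0.48, 0.19, 0.85],
    "OLS":      [0.20, 0.24, 0.22, 0.21, 0.31, 0.28, 0.30, 0.57],
    "BART":     [0.08, 0.10, 0.08, 0.08, 0.14, 0.33, 0.11, 0.66],
    "IPSW-LR":  [0.21, 0.22, 0.20, 0.19, 0.28, 0.54, 0.24, 0.80],
    "IPSW-RF":  [0.13, 0.10, 0.11, 0.10, 0.16, 0.45, 0.14, 0.70],
    "IPSW-GBM": [0.17, 0.18, 0.16, 0.16, 0.23, 0.64, 0.20, 0.81],
    "DR-LR":    [0.33, 0.35, 0.36, 0.34, 0.44, 0.41, 0.42, 0.68],
    "DR-RF":    [0.24, 0.29, 0.26, 0.24, 0.36, 0.32, 0.34, 0.60],
    "DR-GBM":   [0.34, 0.37, 0.37, 0.35, 0.42, 0.44, 0.43, 0.69],
}


def scenario_averages(values):
    """Return (ignorable mean, non-ignorable mean) for eight scenario values."""
    ignorable = values[:4]       # scenarios 1-4
    non_ignorable = values[4:]   # scenarios 5-8
    return (sum(ignorable) / len(ignorable),
            sum(non_ignorable) / len(non_ignorable))


# Map each method to its pair of recomputed averages.
averages = {method: scenario_averages(vals) for method, vals in rsmse.items()}

for method, (ign, non_ign) in averages.items():
    print(f"{method:10s} ignorable={ign:.3f}  non-ignorable={non_ign:.3f}")
```

Sorting `averages` by either element of the pair gives a quick ranking of the methods within the ignorable or non-ignorable scenarios, subject to the same rounding caveat.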