Author manuscript; available in PMC: 2018 Nov 16.

Published in final edited form as: Stat Comput. 2017 Jul 27;28(4):869–890. doi: 10.1007/s11222-017-9767-1

Table 9.

Coverage of out-of-sample 75% prediction intervals and average interval width for BART-BMA, RF using conformal prediction (RF CP), bartMachine, and dbarts for the Friedman example. Perfect calibration corresponds to 75% coverage; hence the most desirable model is the one whose coverage is as close to 75% as possible with the lowest average interval width. Items in bold refer to the best calibrated model with respect to interval coverage and interval width for each simulated dataset.

	Coverage					Average Interval Width

p	BART-BMA	RF CP intervals	bartMachine	dbarts default	dbarts best	BART-BMA	RF CP intervals	bartMachine	dbarts default	dbarts best
100	75.8%	75%	77.2%	70.4%	61.2%	6.86	6.84	4.45	4.08	2.69
1000	74.0%	79%	79.0%	59.4%	62.6%	6.89	7.89	6.16	4.44	3.63
5000	72.6%	79%	87.0%	61.0%	64.8%	6.84	8.48	9.74	5.97	4.34
10000	73.4%	78%	76.8%	68.6%	64.2%	6.84	8.62	9.89	7.10	5.19
15000	73.4%	78%	75.2%	69.0%	67.0%	6.84	8.47	10.73	7.91	5.73
100000	71.8%	79%	-	59.0%	73.4%	6.91	9.30	-	9.21	8.88
500000	70.2%	-	-	56.6%	-	6.88	-	-	9.14	-
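
For readers who want to reproduce the two metrics reported in Table 9, the following is a minimal sketch in Python of how empirical coverage and average interval width can be computed from a set of prediction intervals. It is not the authors' code: the function name `coverage_and_width` and the toy data are hypothetical, and the intervals are assumed to be given as lower and upper bounds for each held-out observation.

```python
from typing import Sequence, Tuple


def coverage_and_width(
    y_true: Sequence[float],
    lower: Sequence[float],
    upper: Sequence[float],
) -> Tuple[float, float]:
    """Return (empirical coverage, average interval width).

    Coverage is the fraction of held-out responses that fall inside
    their prediction interval (bounds inclusive); width is the mean
    of upper - lower. Hypothetical helper, for illustration only.
    """
    n = len(y_true)
    if n == 0 or len(lower) != n or len(upper) != n:
        raise ValueError("inputs must be non-empty and of equal length")

    # Count the observations covered by their interval.
    covered = sum(lo <= y <= hi for y, lo, hi in zip(y_true, lower, upper))

    # Average the interval widths.
    mean_width = sum(hi - lo for lo, hi in zip(lower, upper)) / n

    return covered / n, mean_width


# Toy data (made up): four held-out responses with their intervals.
y = [1.0, 2.0, 3.0, 4.0]
lo = [0.5, 2.5, 2.0, 3.0]
hi = [1.5, 3.5, 4.0, 5.0]

cov, width = coverage_and_width(y, lo, hi)
print(f"coverage = {cov:.1%}, average width = {width:.2f}")
```

In this toy case three of the four responses fall inside their intervals, so the empirical coverage equals the nominal 75% level. Of two methods with comparable coverage, the one with the smaller average width is preferred under the criterion stated in the caption.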