2020 Nov 18;87(4):1717–1729. doi: 10.1111/bcp.14608

TABLE 2. Performance measures

Measures		Algorithm development (n = 433)		External validations (n = 481)
Measures		N ^a	Median (range)	N ^a	Median (range)
Fit accuracy	R² ^b (%)
	All	323	43 (2–96 ^c)	261	39 (<1–86)
	Pharmacogenetic	273	45 (8–96)	232	41 (<1–86)
	Clinical ^d	178	20 (2–83)	29	24 (<1–69)
	CYP2C9	98	7 (<1–50)	‐	‐
	VKORC1	114	25 (1–59)	‐	‐
	Correlation coefficient
	All	19	0.65 (0.31–0.82)	101	0.60 (0.03–0.86)
	Pharmacogenetic	15	0.65 (0.52–0.79)	97	0.60 (0.03–0.86)
	Clinical	4	0.56 (0.31–0.82)	4	0.32 (0.07–0.54)
Precision/predictive accuracy	Mean absolute error (mg/d) ^e,f
	All	137	1.23 (0.11–2.89)	222	1.20 (0.37–3.70)
	Pharmacogenetic	105	1.26 (0.11–1.96)	185	1.18 (0.57–3.30)
	Clinical	32	1.10 (0.21–2.89)	37	1.34 (0.37–3.70)
	Mean square error (mg²/d²)
	All	54	0.02 (0.01–0.74)	4	0.67 (0.60–0.74)
	Pharmacogenetic	30	0.02 (0.01–0.10)	‐	‐
	Clinical	24	0.02 (0.01–0.74)	4	0.67 (0.60–0.74)
	Root mean square error (mg/d)
	All	14	0.80 (0.10–3.09)	68	1.44 (0.19–4.29)
	Pharmacogenetic	6	0.34 (0.10–1.44)	58	1.37 (0.19–4.29)
	Clinical	8	1.87 (0.66–3.09)	10	1.77 (0.66–2.33)
	Mean absolute percentage error (%) ^f
	All	7	21 (13–54)	37	32 (20–53)
	Pharmacogenetic	6	25 (18–54)	34	32 (21–53)
	Clinical	1	19 (13–21)	3	34 (20–36)
	Unbiased mean absolute percentage error (%)
	All (clinical)	1	34	3	37 (36–38)
	Root mean square percentage error (%)
	All (pharmacogenetic)	1	42	5	53 (37–99)
Bias	Mean prediction error (mg/d) ^f
	All	17	0.01 (−0.28–0.60)	144	−0.20 (−3.94–1.80)
	Pharmacogenetic	9	−0.10 (−0.28–0.48)	140	−0.20 (−3.94–1.80)
	Clinical	8	0.04 (0.01–0.60)	4	−0.59 (−1.01–0.27)
	Mean percentage prediction error (%) ^f
	All (pharmacogenetic)	3	4 (3–6)	26	22 (2–76)
	Logarithm of the accuracy ratio‐derived (%)
	All (clinical)	1	<1	3	8 (4–13)
Clinical relevance	Patients with predicted dose within 20% of actual (%)
	All	132	48 (10–98)	245	43 (0–80)
	Pharmacogenetic	95	50 (30–98)	231	42 (0–80)
	Clinical	37	47 (10–87)	14	48 (26–63)
	Patients with predicted dose within 1 mg/d of actual (%)
	All	14	63 (34–92)	47	42 (17–83)
	Pharmacogenetic	12	63 (34–92)	34	42 (17–83)
	Clinical	2	62 (36–87)	13	42 (22–70)

^a N represents the number of algorithms for which the respective measures were explored and reported. For algorithm development, both the development and internal validation cohorts were included when both were reported, although the algorithm was still counted once. Results presented in figures were included if a numerical value was extractable.

^b Also called the coefficient of determination. For the development cohort, adjusted values were used when reported.

^c The highest R² was reported in Pavani²⁹ as 94%/96%.

^d From clinical algorithms. For algorithm development, this also includes pharmacogenetic algorithms that reported R² contributions of clinical factors only.

^e Includes 9 studies reporting median absolute error.

^f In some studies (e.g. Botton,³⁰ You,³¹ Tan,³² Biss,³³ Zhou,³⁴ Lin,³⁵ Xie³⁶) these performance measures were unclear or inconsistent with their definitions (if available) and/or reported values, in which case a best guess was made. For example, a negative mean absolute error was likely to be a mean prediction error.