Table 4. Inter-rater agreement of common scenarios
Scenario^a | S | M-DA | S-DA | Pr |
---|---|---|---|---|
Scenario 1 [11111] | 21 | 0 | 0 | 1.000 |
Scenario 5 [11122] | 17 | 4 | 0 | 0.676 |
Scenario 11 [11212] | 14 | 7 | 0 | 0.533 |
Scenario 59 [13122] | 1 | 9 | 11 | 0.433 |
Scenario 92 [21212] | 4 | 16 | 1 | 0.600 |
Scenario 122 [22222] | 1 | 17 | 3 | 0.662 |
Scenario 166 [31121] | 2 | 8 | 11 | 0.400 |
Scenario 203 [32222] | 1 | 3 | 17 | 0.662 |
Scenario 230 [33222] | 1 | 0 | 20 | 0.905 |
Scenario 243 [33333] | 0 | 0 | 21 | 1.000 |
Pc | 0.295 | 0.305 | 0.400 | κ = 0.526 |
Pr denotes the extent to which physicians agree on each scenario (the number of physician pairs in agreement relative to the number of all possible pairs), ranging from 0 to 1, where 1 represents complete agreement
Pc denotes the proportion of all physician assessments that were assigned to each category. For instance, for the outcome “stable,” it equals the total number of physician assessments rated as stable (n = 62) divided by the total number of possible physician assessments (10 scenarios × 21 physicians = 210)
Fleiss’ kappa statistic (κ) provides a summary measure of the reliability of agreement among physicians in rating the common scenarios; a computational sketch follows these notes
S stable, M-DA mild disease activity, S-DA significant disease activity
^a Bracketed numbers refer to the level of severity for each of the health status parameters. As an example, scenario 166 [31121] describes a hypothetical patient case with IGF-I at level 3, Tumor status at level 1, Comorbidities at level 1, Symptoms at level 2, and QoL at level 1. For a description of the levels, see Table 2