Star wars, NHS style

Richard M Barker; Mark S Pearce; Miles Irving

BMJ. 2004 Jul 10;329(7457):107–109. doi: 10.1136/bmj.329.7457.107

PMCID: PMC449826 PMID: 15242919

Short abstract

NHS trusts awarded three stars are supposed to be the best performing in the country. However, problems with the 2002-3 assessment mean that this is not necessarily true.

Monitoring of performance is generally agreed to be essential for the efficient running of large publicly funded organisations such as the NHS. However, to be effective it must not only be statistically sound but also have the confidence of the organisations to which it applies. A report of the Royal Statistical Society in 2003 noted that “Done badly it [performance monitoring] can be very costly and not merely ineffective but harmful and indeed destructive.”¹

The performance of NHS trusts is monitored through the star ratings system, which the Department of Health introduced in September 2001. The Commission for Health Improvement was given responsibility for assessment in 2002-3. We investigated the rating system after our trust was downgraded from three stars (the highest rating) in 2001-2 to two stars in 2002-3. Our analysis shows important shortcomings with the method of assessment.

2002-3 star ratings

In 2002-3 the performance of NHS trusts was assessed for nine key targets, together with a balanced scorecard of clinical focus (10 indicators), patient focus (19 indicators), and capacity and capability (seven indicators). Each trust's performance on the key targets was assessed on the basis that they had been “achieved, underachieved, or significantly underachieved.” Trusts were given two penalty points for underachievement of a target and six points for significant underachievement. An overall score was derived by adding scores in key targets and the balanced scorecard. The highest scoring trusts were given three stars. However, trusts that had significantly underachieved in one of the key targets could not be awarded three stars, whatever the overall score.
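The penalty rule for key targets can be sketched in a few lines of Python. This is only an illustration of the rule as described above, not the commission's implementation: the function names are hypothetical, and the assumption that an achieved target carries zero penalty points is ours, since the text states only the two and six point penalties.

```python
# Penalty points per key target outcome. The 2 and 6 point values come from
# the text; 0 points for "achieved" is an assumption made for this sketch.
PENALTY_POINTS = {
    "achieved": 0,
    "underachieved": 2,
    "significantly underachieved": 6,
}


def key_target_penalty(outcomes):
    """Sum the penalty points over a trust's key target outcomes."""
    return sum(PENALTY_POINTS[outcome] for outcome in outcomes)


def eligible_for_three_stars(outcomes):
    """Any significant underachievement rules out three stars,
    whatever the overall score."""
    return "significantly underachieved" not in outcomes


# A trust that achieved eight of nine key targets and significantly
# underachieved on the ninth.
trust = ["achieved"] * 8 + ["significantly underachieved"]
print(key_target_penalty(trust))        # 6
print(eligible_for_three_stars(trust))  # False
```

The sketch covers only the key targets; the balanced scorecard contribution to the overall score is not modelled because its scoring is not described here.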

The assessment showed that our trust had significantly underachieved on one key target, the outpatient indicator (by 0.1%), and the trust was thus awarded two stars. This was despite the trust achieving the top band of performance for all three of the clinical focus, patient focus, and capacity and capability sets of indicators, and eight out of nine of the key target indicators.

Statistical considerations

Our review of the statistical methods adopted for the star ratings shows that inappropriate criteria were used to arrive at the judgments. We obtained data on acute trusts and indicator targets from the Commission for Health Improvement and Department of Health websites (www.chi.nhs.uk/ratings/ and www.doh.gov.uk). The outpatient indicator was a composite score calculated from the number of patients waiting longer than 26 weeks at the end of each of the first three quarters of 2002-3 plus the number of patients who were still waiting longer than 21 weeks at the end of the fourth quarter. To achieve the target, trusts had to have had no more than five breaches; more than 50 breaches constituted significant underachievement.
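As a minimal sketch of the composite score and thresholds just described, the following Python code adds the four quarter end counts and classifies the result. The function and parameter names are hypothetical, and the example counts are invented for illustration; only the thresholds (no more than five, more than 50) are taken from the text.

```python
def outpatient_breaches(q1_over_26, q2_over_26, q3_over_26, q4_over_21):
    """Composite score: patients waiting longer than 26 weeks at the end of
    quarters 1 to 3, plus patients waiting longer than 21 weeks at the end
    of quarter 4."""
    return q1_over_26 + q2_over_26 + q3_over_26 + q4_over_21


def classify_outpatient(breaches):
    """Apply the thresholds: up to 5 achieved, 6 to 50 underachieved,
    more than 50 significantly underachieved."""
    if breaches <= 5:
        return "achieved"
    if breaches <= 50:
        return "underachieved"
    return "significantly underachieved"


# Hypothetical quarter end counts, chosen only to show the cut-off at 50.
total = outpatient_breaches(10, 12, 9, 20)
print(total, classify_outpatient(total))  # 51 significantly underachieved
```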

Use of quarter end figures

The size of a sample must be large enough to be representative of a population. The outpatient indicator stated that its purpose was to measure a trust's performance throughout the year. However, the commission sampled only four days, even though it had all the figures necessary to calculate performance over the whole year. With this method, a trust could have a considerable number of breaches during a quarter yet show none on the final day. The total figure for Newcastle showed that there were 1502 breaches during the year. Other trusts had more total breaches during the year yet were judged to have met the target because their figures on the four quarter end days fell within requirements. Table 1 shows the trusts that had over 1000 breaches of the target throughout the year. Of these 19 trusts, 11 achieved the target, six underachieved, and two (including Newcastle) significantly underachieved.
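The effect of sampling only quarter end days can be illustrated with a short Python sketch. All figures below are hypothetical and are not data from any trust; for simplicity the sketch uses monthly counts and ignores the change of threshold from 26 to 21 weeks in the fourth quarter.

```python
# Zero-based positions of the last month in each quarter of the year.
QUARTER_END_MONTHS = (2, 5, 8, 11)


def quarter_end_total(monthly_breaches):
    """Breaches counted only at the four quarter ends (the sampled figure)."""
    return sum(monthly_breaches[i] for i in QUARTER_END_MONTHS)


def full_year_total(monthly_breaches):
    """Breaches counted across all twelve months."""
    return sum(monthly_breaches)


# Hypothetical trust A: many breaches, all cleared before each quarter end.
trust_a = [120, 150, 0, 200, 180, 0, 160, 140, 0, 90, 60, 0]
# Hypothetical trust B: few breaches, but some present at each quarter end.
trust_b = [5, 5, 15, 5, 5, 15, 5, 5, 15, 5, 5, 15]

print(full_year_total(trust_a), quarter_end_total(trust_a))  # 1100 0
print(full_year_total(trust_b), quarter_end_total(trust_b))  # 100 60
```

Under the thresholds described earlier, hypothetical trust A would meet the target with 1100 breaches in the year, whereas trust B would significantly underachieve with 100.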

Table 1.

Total breaches of outpatient waiting time within 2002-3 among trusts with over 1000 breaches

Trust	No of patients waiting >26 weeks	Achieved standard	Star rating
Leeds Teaching Hospitals NHS Trust	3139	No	2
West Hertfordshire Hospital NHS Trust	2803	Yes	1
Royal Free Hampstead NHS Trust	2369	Yes	2
St George's Healthcare NHS Trust	2172	No	2
Wirral Hospital	1692	Yes	3
University Hospital of Coventry and Warwickshire NHS Trust	1671	Yes	2
Barnet and Chase Farm Hospitals NHS Trust	1660	Yes	1
Mid Yorkshire Hospitals NHS Trust	1577	No	1
Barking, Havering, and Redbridge Hospitals NHS Trust	1565	Yes	2
Hull and East Yorkshire Hospitals NHS Trust	1544	No	1
Newcastle upon Tyne Hospitals NHS Trust	1502	No	2
Princess Royal Hospitals NHS Trust	1438	Yes	1
Plymouth Hospitals NHS Trust	1344	Yes	2
North Staffordshire Hospital NHS Trust	1312	Yes	2
South Tees Hospitals NHS Trust	1288	No	2
East and North Hertfordshire NHS Trust	1225	No	1
Royal United Hospital Bath NHS Trust	1114	No	1
Peterborough Hospitals	1108	Yes	3
Norfolk and Norwich Healthcare NHS Trust	1041	Yes	2

Figure 1. Freeman Hospital, Newcastle upon Tyne. Credit: NEWCASTLE EVENING HERALD

Absolute or relative test

In contrast to other key targets based on percentages, the outpatient key target was an absolute test—that is, it measured the number of breaches of the target irrespective of the size and activity of the trust. The number of referrals differs greatly across trusts, which range from small specialist hospitals that receive just 64 general practitioner referrals a year to large university hospitals such as Newcastle, which receives 118 000 referrals a year. Table 2 shows that a breach of the target waiting time in up to 6% of patients could lead a trust to achieve, underachieve, or significantly underachieve this target, depending on patient workload.

Table 2.

Analysis of breaches of outpatient waiting time for all NHS trusts

	No of outpatients waiting longer than the standard
	0-5 (achieved)	6-50 (underachieved)	>50 (significantly underachieved)
No of trusts	156	14	6
Range of No of breaches	0-5	8-49	57-288
Range of No of new written referrals	64-139 011	764-152 778	27 788-188 102
Possible range of % outpatient breaches	0-7.8	0.004-6.5	0.03-100

Clearly, it is easier for a small trust to meet the threshold of five or fewer breaches a year than it is for trusts with a large referral base. The Commission for Health Improvement used a relative test for other indicators. The rationale for choosing relative tests for some key targets and one patient focus indicator and numerical targets for other similar indicators is difficult to understand.
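The possible percentage ranges reported in table 2 follow from dividing the breach thresholds by the smallest and largest referral counts in each band. The Python sketch below reproduces that arithmetic; the function name is hypothetical, and treating "more than 50" as a minimum of 51 breaches with the percentage capped at 100 is our reading of the table.

```python
def possible_pct_range(min_breaches, max_breaches, min_referrals, max_referrals):
    """Lowest and highest possible percentage of breaches in a band.

    The lowest percentage pairs the fewest breaches with the most referrals;
    the highest pairs the most breaches with the fewest referrals.
    max_breaches=None means the band has no upper limit, so the cap is 100%.
    """
    low = 100 * min_breaches / max_referrals
    if max_breaches is None:
        high = 100.0
    else:
        high = min(100.0, 100 * max_breaches / min_referrals)
    return low, high


# Referral ranges as reported in table 2. The upper bound for the achieved
# band is read as 139 011; it does not affect the result because the lower
# breach bound in that band is 0.
achieved = possible_pct_range(0, 5, 64, 139_011)
underachieved = possible_pct_range(6, 50, 764, 152_778)
significant = possible_pct_range(51, None, 27_788, 188_102)

print(round(achieved[0], 1), round(achieved[1], 1))            # 0.0 7.8
print(round(underachieved[0], 3), round(underachieved[1], 1))  # 0.004 6.5
print(round(significant[0], 2), round(significant[1], 1))      # 0.03 100.0
```

A relative test would instead compare `100 * breaches / referrals` with a single percentage threshold, so that the same proportion of breaches leads to the same judgment whatever the size of the trust.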

Six of the nine key target indicators specify absolute rather than percentage targets. Absolute targets may demand comparatively higher levels of service from larger trusts, although even those key targets measured in percentage terms may be easier to attain in small trusts. To investigate this, we considered data from the 150 trusts that had returned information for all six key target indicators concerning patient numbers.

We compared the patient population in the highest ranking trusts with that in the remaining trusts for the six key targets. For all indicators, the highest ranked trusts had smaller average patient populations (table 3).

Table 3.

Relation between patient population and performance ranking. Data were analysed with the Mann-Whitney test

	Trusts ranked joint first		Trusts ranked lower than first
Key target	No of trusts	Average patient population	No of trusts	Average patient population	P value
Waiting time for admission from accident and emergency	91	65 497^*	59	79 548^*	0.001
Cancelled operations not readmitted within 28 days	26	22 169^†	124	29 193^†	0.01
Inpatient waiting time	130	27 853^†	20	31 920^†	0.39
Outpatient waiting time	26	9 887^‡	124	12 686^‡	0.005
Total time in accident and emergency departments^§	50	1 370^¶	100	1 497^¶	0.12
Time between GP referral and first cancer appointment	18	1 575^**	132	2 023^**	0.03

^* Median annual total number of first attendances.

^† Median annual total number of elective admissions.

^‡ Median annual total number of outpatients seen after written referral from general practitioner.

^§ Top 50 ranked trusts, as no trusts were ranked joint first. Based on weekly data.

^¶ Median total number of attendances for the week ending 30 March 2003.

^** Median annual total number of patients seen for first outpatient appointment when urgently referred by their general practitioner with suspected cancer and whose referral was received by the NHS trust within 24 hours.
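The comparison in table 3 used the Mann-Whitney test, which ranks the patient populations of the two groups of trusts rather than comparing their means. The Python sketch below computes only the U statistic, by direct pairwise comparison, on invented figures; it is not the authors' analysis, the function name is hypothetical, and a P value would in practice come from a statistics package (for example `scipy.stats.mannwhitneyu`).

```python
def mann_whitney_u(group_a, group_b):
    """U statistic for group_a: the number of (a, b) pairs in which the
    value from group_a exceeds the value from group_b, with ties
    counted as one half."""
    u = 0.0
    for a in group_a:
        for b in group_b:
            if a > b:
                u += 1.0
            elif a == b:
                u += 0.5
    return u


# Hypothetical patient populations (thousands), for illustration only.
top_ranked = [10, 12, 15]
lower_ranked = [11, 20, 25, 30]

u_top = mann_whitney_u(top_ranked, lower_ranked)
u_lower = mann_whitney_u(lower_ranked, top_ranked)
print(u_top, u_lower)  # 2.0 10.0
```

The two U values always sum to the number of pairs (here 3 × 4 = 12). A U value far below half the number of pairs for the top ranked group indicates that its populations tend to be smaller, which is the pattern reported in table 3.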

Discussion

A three star rating brings appreciable benefits to a trust. Trusts with three stars can expect to be given greater autonomy in line with the NHS commitment to devolve responsibility and decision making away from central government.

With such high stakes, the performance assessments should be beyond reproach. We have shown that this was not the case in 2002-3. Calculation of annual breaches of targets by using quarter end figures was flawed. Some trusts with many breaches of the target were judged to have achieved the indicator, whereas others with fewer breaches were judged to have underachieved. The method might be acceptable if only end of quarter figures were available. However, the relevant data were available throughout the year.

Our analysis also shows that the methods used for the 2002-3 star ratings seem to discriminate against larger trusts. This is because absolute rather than relative measures were used. The performance of larger trusts will also be affected by factors such as the complexity of cases referred, especially when regional or national specialisation exists. In addition, capacity for specialist services is limited, and expansion is a challenge because growing requirements are difficult to predict.

Techniques used to judge the delivery of a service must have the confidence of those being judged as well as that of the government and the community. The concerns over the methods of assessment that we raise above have been reinforced by the Royal Statistical Society working party report on performance monitoring in the public sector.¹ This report identifies inadequacies in current performance management techniques, including some of those used by the Commission for Health Improvement. In particular, the society criticises targets that are the same in absolute terms for small groups as for large groups when the precision of estimation differs greatly.

The Healthcare Commission has inherited the ratings task for 2003-4 and has recently advised that the number of breaches will now be measured as a percentage of the total number of patients a trust treats.² Our analysis shows that trusts need to monitor constantly the statistical rigour of the performance indicators used to assess them.

Summary points

Performance of NHS trusts is reported in terms of a star rating

The methods used to assess performance in 2002-3 were flawed

Use of absolute rather than relative figures for some measures created bias against larger trusts

Use of end of quarter figures for outpatient waiting times masked large breaches in some hospitals

More statistical rigour is needed to ensure trusts' confidence in the ratings system

We thank L Parker for help with the statistical analyses and M F Laker for advice.

Contributors and sources: RMB is an executive director with extensive knowledge and experience of planning and performance management in healthcare. MSP is a Royal Statistical Society recognised chartered statistician and an experienced non-clinical epidemiologist. MI is a retired clinical academic with a background in health technology assessment and the transfer of evidence based practice into acute trust policies. All the above contributed equally to the paper. The article resulted from a detailed analysis of the reasons why a high performing trust breached a key indicator in the 2002-3 star ratings exercise.

Competing interests: RMB and MI are members of the Newcastle upon Tyne Hospitals NHS Trust Board of Directors. MSP was paid by the trust for a statistical report on the trust's performance in the star ratings exercise.

References

1. Bird SM, Cox D, Farewell VT, Goldstein H, Holt T, Smith PC. Performance indicators: good, bad, and ugly. London: Royal Statistical Society, 2003.
2. Healthcare Commission. Indicator listings for acute trusts. http://ratings.healthcarecommission.org.uk/Indicators_2004/ (accessed 11 July 2004).
