Detection of changes in mortality after heart surgery: Control limits failed to account for case mix

Steve Gallivan; Jocelyn Lovegrove; Christopher Sherlaw-Johnson

BMJ. 1998 Nov 21;317(7170):1453. doi: 10.1136/bmj.317.7170.1453

PMCID: PMC1114307 PMID: 9822413

Editor—We are concerned about the graphical technique described by Poloniecki et al in their analysis of perioperative mortality rates associated with cardiac surgery.¹ Figure 2 shows three traces: observed mortality performance bracketed by control limits and plotted against the number of successive cases performed. The interpretation of the middle trace is straightforward since it is simply a variable life adjusted display, which has been described previously and will be familiar to many cardiac surgeons in the United Kingdom.² The use of control limits, on the other hand, is new, and their usefulness and indeed their validity are not clear. As the authors themselves note, their analysis does not amount to a formal test of significance since the control limits have not been corrected for multiple testing; this is a major deficiency. The use of 99% control limits rather than 95% control limits presumably increases their separation and makes them more forgiving. It is not clear which level of significance should be used, a difficulty compounded by the fact that the limits are not based on formal significance testing.
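
The effect of uncorrected multiple testing can be illustrated by simulation. The sketch below is not the method used by Poloniecki et al. It assumes a hypothetical series of 1000 cases with a constant 3% risk, uses a normal approximation for pointwise limits (z = 2.576 for a nominal 99%), and starts checking at case 100; all of these choices, and the function names, are illustrative assumptions. It estimates how often a series whose true risk matches the prediction crosses the limits at least once.

```python
import math
import random


def ever_breached(risks, z, rng, first_look=100):
    """Return True if observed minus expected deaths ever leaves +/- z SD.

    The limits are pointwise normal-approximation limits (an assumption
    made for illustration), checked after every case from first_look on.
    """
    observed = expected = variance = 0.0
    for i, p in enumerate(risks, start=1):
        observed += 1.0 if rng.random() < p else 0.0  # simulate one outcome
        expected += p
        variance += p * (1.0 - p)
        if i >= first_look and abs(observed - expected) > z * math.sqrt(variance):
            return True
    return False


def breach_rate(risks, z=2.576, n_series=2000, seed=1):
    """Fraction of simulated in-control series that breach the limits."""
    rng = random.Random(seed)
    hits = sum(ever_breached(risks, z, rng) for _ in range(n_series))
    return hits / n_series


# Hypothetical series: 1000 cases, each with a true and predicted risk of 3%
rate = breach_rate([0.03] * 1000)
print(f"Series breaching nominal 99% limits at least once: {rate:.1%}")
```

Because the limits are checked after every case, the share of in-control series that breach them at some point would be expected to exceed the nominal 1% that applies to any single look.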

If we understand correctly, these control limits have been calculated using a χ² distribution. However, this fails to take into account case mix and heterogeneity of risk, the very things for which variable life adjusted display plots are used. The following example illustrates the danger in ignoring case mix when estimating ranges of variability. Consider operations on two sequences of 1000 patients with different underlying mortality risks that have been assessed preoperatively (table). Based on the given mortality risks, there is a 99% probability that the number of deaths that actually occur would fall in the range shown in the last column. These ranges are derived from exact calculations based on the binomial expansion. Using a χ² distribution would give a range (16 to 44) close to the exact values obtained for the patients in sequence 2, for whom no heterogeneity of risk is present, but would substantially overestimate the range for the patients in sequence 1, for whom risks are heterogeneous.

When examining surgical mortality, it is important to take case mix into account. However, this should be done not only when estimating the expected mortality but also when estimating the likely variability. Any overestimation of likely ranges of variability might well lead to undue complacency.

Table. Hypothetical operations on two groups of patients with different mortality risks

Case load	Preoperative estimate of mortality risk	Predicted No of deaths	Exact 99% limits on No of deaths
Sequence 1 (n=1000)	40 patients with 72%, 960 patients with 0.125%	30	22 to 38
Sequence 2 (n=1000)	1000 patients with 3%	30	17 to 45
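
The ranges in the table can be checked with a short calculation. The sketch below builds the exact distribution of the number of deaths by convolving the individual risks, treating cases as independent, and then trims 0.5% from each tail. The equal-tail convention and the function names are assumptions made for illustration; the letter does not state which convention was used, so the end points may differ by one from those in the table.

```python
def death_count_pmf(risks):
    """Exact distribution of total deaths for independent cases.

    Each case contributes a Bernoulli outcome with its own risk, so the
    total follows a Poisson binomial distribution, built by convolution.
    """
    pmf = [1.0]  # before any cases: zero deaths with probability 1
    for p in risks:
        nxt = [0.0] * (len(pmf) + 1)
        for k, mass in enumerate(pmf):
            nxt[k] += mass * (1.0 - p)  # this patient survives
            nxt[k + 1] += mass * p      # this patient dies
        pmf = nxt
    return pmf


def central_range(pmf, coverage=0.99):
    """Equal-tail range: trim (1 - coverage) / 2 of probability per tail."""
    tail = (1.0 - coverage) / 2.0
    cum, lower = 0.0, 0
    for k, mass in enumerate(pmf):
        cum += mass
        if cum > tail:  # smallest k with P(X <= k) above the tail mass
            lower = k
            break
    cum, upper = 0.0, len(pmf) - 1
    for k in range(len(pmf) - 1, -1, -1):
        cum += pmf[k]
        if cum > tail:  # largest k with P(X >= k) above the tail mass
            upper = k
            break
    return lower, upper


# Risks taken from the table
sequence_1 = [0.72] * 40 + [0.00125] * 960
sequence_2 = [0.03] * 1000

for name, risks in (("Sequence 1", sequence_1), ("Sequence 2", sequence_2)):
    expected = sum(risks)
    sd = sum(p * (1.0 - p) for p in risks) ** 0.5
    low, high = central_range(death_count_pmf(risks))
    print(f"{name}: expected {expected:.1f}, SD {sd:.2f}, 99% range {low} to {high}")
```

Both sequences have 30 expected deaths, but the sum of p(1 - p) is about 9.3 for sequence 1 and 29.1 for sequence 2, giving standard deviations of roughly 3.0 and 5.4. This is why limits that ignore heterogeneity of risk are too wide for sequence 1.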


References

1. Poloniecki J, Valencia O, Littlejohns P. Cumulative risk adjusted mortality chart for detecting changes in death rate: observational study of heart surgery. BMJ. 1998;316:1697–1700. doi: 10.1136/bmj.316.7146.1697. (6 June.)
2. Lovegrove J, Valencia O, Treasure T, Sherlaw-Johnson C, Gallivan S. Monitoring the results of cardiac surgery by variable life-adjusted display. Lancet. 1997;350:1128–1130. doi: 10.1016/S0140-6736(97)06507-0.

BMJ. 1998 Nov 21;317(7170):1453.

Author’s reply

J D Poloniecki

Editor—In order not to generate confusion when referring to the cumulative risk adjusted mortality chart, I suggest that Gallivan et al stick to the original name for this plotting technique, which I gave it in 1995. Further details of the precedent are set out at the end of our paper.

Unfortunately Gallivan et al are not alone in the practice of claiming that a surgeon is worse than his or her colleagues, or that a colleague’s performance has deteriorated (and then improved), without any statistical basis for the assertion, that is, without consideration of the rate of false positives.¹⁻¹

The potential usefulness of control limits is no doubt clear to Gallivan et al. In their paper they state: “Some work on the statistical approach to this question has been done (Jan Poloniecki, unpublished observations).¹³” Their haste to submit the same data with the same plotting technique to the same journal at the same time may be responsible for the fact that reference 13 has been omitted from the list of references published in the Lancet.¹⁻¹

Gallivan and his colleagues at University College London are right in thinking that nominal 99% control limits will give wider confidence intervals, and therefore fewer false positive results, than 95% limits based on the same test. If, as we have suggested should happen, a formal internal inquiry is launched whenever the statistical control limits are breached, then the confidence limits must be wide enough to ensure that this does not occur so often as to be unmanageable. For our series, we found that the control limits for the cumulative risk adjusted mortality were breached at most twice in nearly four years. The second occasion was particularly transient—that is, self-correcting—and might not have occurred at all if any of the parameters had been reset after the first occasion.

Gallivan et al suggest that the test could be based on a multinomial distribution. Both 0 and 100% are valid Parsonnet scores, and with these risk estimates the multinomial confidence limits have a width of 0. None the less, they could try their suggestion on, for example, the St George’s data, which they have, to find out how often the control limits for the cumulative risk adjusted mortality chart are breached.

References

1-1. Lovegrove J, Valencia O, Treasure T, Sherlaw-Johnson C, Gallivan S. Monitoring the results of cardiac surgery by variable life-adjusted display. Lancet. 1997;350:1128–1130. doi: 10.1016/S0140-6736(97)06507-0.