BMJ. 1999 Jan 16;318(7177):193. doi: 10.1136/bmj.318.7177.193b

Sensitivity and specificity and their confidence intervals cannot exceed 100%

Jonathan J Deeks, Douglas G Altman
PMCID: PMC1114676  PMID: 9888930

Editor—Stell and Gransden investigated the diagnostic accuracy of liquid media and direct culture of aspirated fluid as tests of septic bursitis.1 They reported that culture in liquid media had a sensitivity of 100% (95% confidence interval 92% to 108%) and a specificity of 89% (74% to 104%).

As sensitivity and specificity cannot exceed 100%, neither should their confidence intervals. Such impossible results arise when the standard large sample method for calculating confidence intervals for proportions is used although the proportion is close to zero or one, the sample is small, or both. Even a continuity correction, such as subtracting 0.5 from the numerator (as seems to have been used in calculating the above confidence interval for sensitivity), does not get around this problem. For the large sample method to be valid, both the number of tests giving a negative result and the number giving a positive result should exceed five.2 The values here are 17 and 0 for sensitivity, and 17 and 2 for specificity. Because the numbers are small, exact confidence intervals based on binomial probabilities should be calculated. These require computer programs or tables and give 95% confidence intervals of 80% to 100% for sensitivity and 67% to 99% for specificity.3
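The contrast between the two approaches can be sketched in a few lines of Python. This is an illustration only, under stated assumptions: the function names are our own, the large sample formula is the plain one without any continuity correction (so it is not meant to reproduce the limits published by Stell and Gransden), and the "exact" interval is taken to be the Clopper-Pearson interval, which we assume is what the cited tables provide. The counts 17/17 and 17/19 are read from the values quoted above.

```python
import math


def binom_cdf(k, n, p):
    """Return P(X <= k) for X ~ Binomial(n, p)."""
    return sum(math.comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def wald_interval(x, n, z=1.96):
    """Large sample interval p +/- z*sqrt(p(1-p)/n); deliberately not truncated to [0, 1]."""
    p = x / n
    half_width = z * math.sqrt(p * (1 - p) / n)
    return p - half_width, p + half_width


def solve_decreasing(f, lo=0.0, hi=1.0, steps=60):
    """Bisection for the root of a function that falls from positive to negative on [lo, hi]."""
    for _ in range(steps):
        mid = (lo + hi) / 2
        if f(mid) > 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def exact_interval(x, n, alpha=0.05):
    """Clopper-Pearson interval from binomial tail probabilities (assumed 'exact' method)."""
    # Lower limit: the p at which P(X >= x) equals alpha/2 (0 when x == 0).
    lower = 0.0 if x == 0 else solve_decreasing(
        lambda p: alpha / 2 - (1 - binom_cdf(x - 1, n, p)))
    # Upper limit: the p at which P(X <= x) equals alpha/2 (1 when x == n).
    upper = 1.0 if x == n else solve_decreasing(
        lambda p: binom_cdf(x, n, p) - alpha / 2)
    return lower, upper


# Counts as read from the letter: 17 of 17 for sensitivity, 17 of 19 for specificity.
for label, x, n in [("sensitivity", 17, 17), ("specificity", 17, 19)]:
    w_lo, w_hi = wald_interval(x, n)
    e_lo, e_hi = exact_interval(x, n)
    print(f"{label}: {x}/{n}  large sample {w_lo:.0%} to {w_hi:.0%}  "
          f"exact {e_lo:.0%} to {e_hi:.0%}")
```

For 17/19 the large sample upper limit exceeds 100%, whereas the exact limits stay inside the possible range and agree with the 67% to 99% quoted above. For 17/17 the uncorrected large sample formula collapses to a zero-width interval at 100%, which is equally uninformative; the exact lower limit is about 80%.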

We are concerned that even these values are too high. Unbiased evaluation of the accuracy of a diagnostic test requires that the test results are compared with those of a good independent reference standard.4 Where such a standard does not exist, studies of diagnostic accuracy are problematic and can be misleading. Stell and Gransden used the definitive diagnoses made by an independent panel as their reference standard. The panel based its diagnoses on “all clinical, laboratory, treatment, and follow up data to the point of final discharge”; although ambiguous, this suggests that the results of the culture methods were in fact part of the data used to make the reference diagnosis. In this situation it is not surprising that high agreement is observed between the test and the reference standard. This bias is known as “incorporation bias”5 and nearly always leads to overestimation of the predictive abilities of the test. If the culture results were available to the independent panel, the results of this study are likely to be overoptimistic.

References

  1. Stell IM, Gransden WR. Simple tests for septic bursitis: comparative study. BMJ. 1998;316:1877–1878. (20 June.) doi: 10.1136/bmj.316.7148.1877.
  2. Bland M. An introduction to medical statistics. 2nd ed. Oxford: Oxford University Press; 1995. pp. 125–126.
  3. Lentner C, editor. Geigy scientific tables. 8th ed. Basle: Geigy; 1982. pp. 89–102.
  4. Jaeschke R, Guyatt GH, Sackett DL, for the Evidence-Based Medicine Working Group. Users’ guides to the medical literature. III. How to use an article about a diagnostic test. A. Are the results of the study valid? JAMA. 1994;271:703–707.
  5. Ransohoff DF, Feinstein AR. Problems of spectrum and bias in evaluating the efficacy of diagnostic tests. N Engl J Med. 1978;299:926–930. doi: 10.1056/NEJM197810262991705.

