Eurosurveillance. 2024 Feb 15;29(7):2300259. doi: 10.2807/1560-7917.ES.2024.29.7.2300259

Bias in vaccine effectiveness studies of clinically severe outcomes that are measured with low specificity: the example of COVID-19-related hospitalisation

Christian Holm Hansen 1,2
PMCID: PMC10986656  PMID: 38362627

Abstract

Many vaccine effectiveness (VE) analyses of severe disease outcomes such as hospitalisation and death include ‘false’ cases that are not actually caused by the infection or disease under study. While the inclusion of such false cases inflates outcome rates in both vaccinated and unvaccinated populations, it is less obvious how they affect estimates of VE. Illustrating the main points through simple examples, this article shows how VE is underestimated when false cases are included as outcomes. Depending on how the outcome indicator is defined, estimates of VE against severe disease outcomes, whose definition allows for the inclusion of false cases, will be biased downwards and may in certain circumstances approximate the same level as the VE against infection. The bias is particularly pronounced for vaccines that offer high levels of protection against severe disease outcomes but poor protection against infection. Analysing outcomes that are measured with low sensitivity generally does not cause bias in VE studies; defining outcome indicators that minimise the number of false cases rather than the number of missed cases is preferable in VE studies.

Keywords: COVID-19, vaccine effectiveness, sensitivity and specificity, bias, methods

Introduction

The severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) Omicron variant (Phylogenetic Assignment of Named Global Outbreak Lineages (Pangolin) designation B.1.1.529), which appeared towards the end of 2021, was generally associated with milder disease than previous SARS-CoV-2 variants but also with higher transmission rates [1-4]. At the height of the Omicron wave, a large number of hospital patients therefore happened to be SARS-CoV-2-positive despite being hospitalised for reasons unrelated to COVID-19 [5]. Analyses that attributed such hospitalisations to COVID-19, simply because of the co-occurrence in time with a SARS-CoV-2 infection, exaggerated the number of hospitalisations supposedly due to COVID-19 [6]. But how does such outcome misclassification affect estimates of vaccine effectiveness (VE)?

Vaccine effectiveness is essentially a measure of the protective immunological effect induced by a vaccine against a certain outcome of interest such as infection, disease, hospitalisation or death. It is defined as one minus the ratio of the case rate in vaccinated people to that in unvaccinated people. Valid estimates in VE studies are important as such studies help inform public health decisions. Observational VE studies are routinely used to monitor the protective effect in a population, or segments of a population, of new and established vaccines, for example those against seasonal diseases such as influenza and COVID-19, but they are also used in more unusual outbreaks as seen recently with studies of vaccine protection against mpox [7-9].

In this article, we investigate how outcome misclassification affects estimates in VE studies of severe outcomes. The first section presents the theoretical background. It is followed by a section illustrating the main points through six simple scenarios, using the example of VE against COVID-19-related hospitalisation, and finally by some suggestions for assessing the magnitude of the bias based on a few assumptions.

Theoretical background

Define VE against infection with a particular pathogen as VEinf = 1 − (πi1/πi0), where πi1 and πi0 are the infection rates in the vaccinated and unvaccinated population, respectively.

If the outcome of interest is a disease outcome caused by the infection, e.g. hospitalisation, rather than just infection itself, the quantity we seek to estimate is

$$VE_{hos} = 1 - \frac{\pi_{i1} \times \pi_{h1}}{\pi_{i0} \times \pi_{h0}} \qquad \text{(Formula 1)}$$

where πh1 and πh0 denote the probability that an infection in a vaccinated and unvaccinated person, respectively, will lead to the outcome. For argument’s sake, but without loss of generality, this outcome is taken to be hospitalisation from this point onwards.
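As a minimal sketch of the two definitions above, the following Python functions evaluate VEinf and Formula 1. The function and argument names are illustrative choices, not part of the original article; the input values are those of Scenario 1 in the Table.

```python
def ve_infection(pi_i1: float, pi_i0: float) -> float:
    """VE against infection: one minus the ratio of infection rates."""
    return 1.0 - pi_i1 / pi_i0


def ve_hospitalisation(pi_i1: float, pi_i0: float, pi_h1: float, pi_h0: float) -> float:
    """True VE against hospitalisation (Formula 1)."""
    return 1.0 - (pi_i1 * pi_h1) / (pi_i0 * pi_h0)


# Parameter values of Scenario 1 in the Table.
ve_inf = ve_infection(pi_i1=0.02, pi_i0=0.10)
ve_hos = ve_hospitalisation(pi_i1=0.02, pi_i0=0.10, pi_h1=0.005, pi_h0=0.02)
print(f"VE against infection: {ve_inf:.1%}")        # 80.0%
print(f"VE against hospitalisation: {ve_hos:.1%}")  # 95.0%
```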

In studies using case definitions with low specificity, however, the measured outcome rates will be inflated by the inclusion of misclassified cases. For example, if individuals hospitalised for other reasons also have an ‘incidental’ infection with the pathogen studied and are erroneously included as cases in the analysis, the quantity actually being estimated is

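$$VE'_{hos} = 1 - \frac{\pi_{i1} \times \pi_{h1} + \pi_{i1}(1 - \pi_{h1}) \times \pi_f}{\pi_{i0} \times \pi_{h0} + \pi_{i0}(1 - \pi_{h0}) \times \pi_f} \qquad \text{(Formula 2)}$$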

Here πf is the probability that a randomly selected person from the population is hospitalised for reasons unrelated to the studied infection and is assumed independent of vaccination status and infection.

In the numerator of the ratio measure in Formula 2, note that the term πi1 × πh1 is retained from Formula 1 and represents the proportion of the vaccinated population that is hospitalised due to the infection. The additional term, πi1(1 − πh1) × πf, is the proportion among the vaccinated population that is infected and hospitalised due to other causes. The denominator is of the same form and relates to the unvaccinated population.
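The quantity in Formula 2 can be sketched in a few lines of Python. The function name is a hypothetical label chosen for illustration; the inputs are the Scenario 1 values from the Table, and setting πf to zero recovers Formula 1.

```python
def observed_ve_hospitalisation(pi_i1, pi_i0, pi_h1, pi_h0, pi_f):
    """VE actually estimated when incidental infections count as cases (Formula 2)."""
    # Hospitalised because of the infection, plus infected but hospitalised for other reasons.
    numerator = pi_i1 * pi_h1 + pi_i1 * (1.0 - pi_h1) * pi_f
    denominator = pi_i0 * pi_h0 + pi_i0 * (1.0 - pi_h0) * pi_f
    return 1.0 - numerator / denominator


# Scenario 1 in the Table: 2% of the population is in hospital for other reasons.
ve_obs = observed_ve_hospitalisation(0.02, 0.10, 0.005, 0.02, pi_f=0.02)
print(f"{ve_obs:.1%}")  # 87.4%

# With no incidental hospitalisations (pi_f = 0) the expression reduces to Formula 1.
ve_true = observed_ve_hospitalisation(0.02, 0.10, 0.005, 0.02, pi_f=0.0)
print(f"{ve_true:.1%}")  # 95.0%
```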

We can rewrite Formulae 1 and 2 in terms of relative rates (RR),

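$$RR_{hos} = 1 - VE_{hos} = (1 - VE_{inf}) \, \frac{\pi_{h1}}{\pi_{h0}} \qquad \text{(Formula 3)}$$

$$RR'_{hos} = 1 - VE'_{hos} = (1 - VE_{inf}) \, \frac{\pi_{h1} + (1 - \pi_{h1}) \times \pi_f}{\pi_{h0} + (1 - \pi_{h0}) \times \pi_f} \qquad \text{(Formula 4)}$$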

Comparing the expressions in Formulae 3 and 4, and assuming that πh1 ≤ πh0, it can be seen that RRhos ≤ RR’hos, or equivalently that VE’hos ≤ VEhos, since

$$\frac{\pi_{h1}}{\pi_{h0}} \le \frac{\pi_{h1} + (1 - \pi_{h1}) \times \pi_f}{\pi_{h0} + (1 - \pi_{h0}) \times \pi_f}$$

It can also be seen from Formula 4 that VE’hos tends towards VEinf as πf approaches 1. We therefore have VEhos ≥ VE’hos ≥ VEinf.
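The inequality between the two hospitalisation ratios is stated without its intermediate steps. One way to verify it, sketched here under the assumption that πf and πh0 are strictly positive so that both denominators are positive, is to cross-multiply and cancel the common term:

```latex
\begin{align*}
\frac{\pi_{h1}}{\pi_{h0}} \le \frac{\pi_{h1} + (1-\pi_{h1})\pi_f}{\pi_{h0} + (1-\pi_{h0})\pi_f}
&\iff \pi_{h1}\bigl[\pi_{h0} + (1-\pi_{h0})\pi_f\bigr] \le \pi_{h0}\bigl[\pi_{h1} + (1-\pi_{h1})\pi_f\bigr] \\
&\iff \pi_{h1}(1-\pi_{h0})\,\pi_f \le \pi_{h0}(1-\pi_{h1})\,\pi_f \\
&\iff \pi_{h1} \le \pi_{h0}.
\end{align*}
```

For the limit, setting πf = 1 makes both the numerator and the denominator of the ratio in Formula 4 equal to 1, so that RR’hos = 1 − VEinf.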

By introducing a bias correction factor c we can express RRhos as a function of RR’hos

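$$RR_{hos} = c \times RR'_{hos} \qquad \text{(Formula 5)}$$

where

$$c = \frac{(1 - VE_{inf}) \, \dfrac{\pi_{h1}}{\pi_{h0}}}{(1 - VE_{inf}) \, \dfrac{\pi_{h1} + (1 - \pi_{h1}) \times \pi_f}{\pi_{h0} + (1 - \pi_{h0}) \times \pi_f}} = \frac{\pi_{h1}}{\pi_{h0}} \times \frac{\pi_{h0} + (1 - \pi_{h0}) \times \pi_f}{\pi_{h1} + (1 - \pi_{h1}) \times \pi_f}$$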

is a value ranging from 1 to πh1/πh0 as πf ranges from 0 to 1. Note that the bias is greatest when πh1 is a lot smaller than πh0 and when πf is large relative to πh0 and πh1. Meanwhile there is no bias (c = 1) when πh1 = πh0, since VEhos = VEinf at that point. This would be the case for a vaccine which may protect against infection but does not provide any further protection against hospitalisation once infection has occurred. Note also that there is no relationship between the size of the bias, c, and the infection rates as expressed through πi0 and πi1.

The relationship in Formula 5 can be rewritten as VEhos = 1 − c(1 − VE’hos).
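The correction factor and the rewritten relationship can be checked numerically. The sketch below uses illustrative function names that are not from the original article, and takes the Scenario 1 values from the Table as inputs.

```python
def bias_correction_factor(pi_h1: float, pi_h0: float, pi_f: float) -> float:
    """Bias correction factor c from Formula 5."""
    return (pi_h1 / pi_h0) * (pi_h0 + (1.0 - pi_h0) * pi_f) / (pi_h1 + (1.0 - pi_h1) * pi_f)


def corrected_ve(ve_observed: float, c: float) -> float:
    """True VE recovered from the observed VE: 1 - c * (1 - VE'hos)."""
    return 1.0 - c * (1.0 - ve_observed)


pi_h1, pi_h0 = 0.005, 0.02  # Scenario 1 values

# End points of the range: c = 1 when pi_f = 0, and c = pi_h1 / pi_h0 when pi_f = 1.
c_low = bias_correction_factor(pi_h1, pi_h0, pi_f=0.0)
c_high = bias_correction_factor(pi_h1, pi_h0, pi_f=1.0)
print(round(c_low, 6), round(c_high, 6))

# Scenario 1: the observed VE is 1 - 498/3,960 with pi_f = 0.02.
c = bias_correction_factor(pi_h1, pi_h0, pi_f=0.02)
ve_true = corrected_ve(1 - 498 / 3960, c)
print(f"{ve_true:.1%}")  # 95.0%
```

Note that the infection rates do not appear as arguments of `bias_correction_factor`, in line with the observation that c does not depend on πi0 or πi1.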

Illustrating examples

In the following, the results derived above are demonstrated through a number of scenarios using the example of VE against COVID-19-related hospitalisation. The first three scenarios present examples of a vaccine with a relatively high level of effectiveness against infection, 80%, while the VE against infection in Scenarios 4–6 is only 20%. The scenarios are hypothetical and have been selected to illustrate the relationship between the VE as estimated in a study and the parameters introduced above.

Scenario 1: Base case scenario

In Scenario 1 (see Table), the proportion of the population who are positive for SARS-CoV-2 is 10% among unvaccinated (πi0 = 0.10) and 2% among vaccinated people (πi1 = 0.02), meaning that VE against infection is 80%. Once infected, vaccinated individuals are less likely than unvaccinated individuals to require hospitalisation as only 0.5% of vaccinated cases are hospitalised because of COVID-19 (πh1 = 0.005) compared with 2% of unvaccinated cases (πh0 = 0.02). The true RR of hospitalisation for COVID-19 in the population is therefore (0.02 × 0.005) / (0.10 × 0.02) = 0.05, resulting in a VE against COVID-19 hospitalisation of 95%. The estimated VE of 87.4% is lower, however, due to contamination from individuals with an ‘incidental’ SARS-CoV-2 diagnosis, i.e. individuals who happen to be infected but who are actually hospitalised for other reasons. Two per cent of the population under study, whether vaccinated or unvaccinated, is in hospital for other reasons (πf = 0.02). The observed COVID-19 hospitalisation rates are consequently inflated in both the vaccinated and unvaccinated group by an amount equal to 2% of the infections in each group that did not lead to hospitalisation for COVID-19. This forces the numerator and denominator of the RR closer together and pushes down the VE estimate.

Table. Six scenarios illustrating bias in estimated vaccine effectiveness against hospitalisation for COVID-19 in studies with low specificity of outcome.

| Scenario and group | Total population | Infection rate, πi | Number of infected individuals | VE against infection, (1 − πi1/πi0) × 100% | Rate of hospitalisation for COVID-19 among infected, πh | Rate of hospitalisation for other reasons, πf | Hospitalised with a SARS-CoV-2 infection because of COVID-19 (n) | Hospitalised with a SARS-CoV-2 infection for reasons other than COVID-19 (n) | True VE against hospitalisation for COVID-19 | Observed VE against hospitalisation for COVID-19 |
|---|---|---|---|---|---|---|---|---|---|---|
| Scenario 1: Base case scenario | | | | 80% | | 0.02 | | | 1 − (100/2,000): 95% | 1 − (498/3,960): 87.4% |
| Vaccinated | 1,000,000 | πi1 = 0.02 | 20,000 | | πh1 = 0.005 | | 100 | 398 | | |
| Unvaccinated | 1,000,000 | πi0 = 0.10 | 100,000 | | πh0 = 0.020 | | 2,000 | 1,960 | | |
| Scenario 2: 10 times lower infection rates | | | | 80% | | 0.02 | | | 1 − (10/200): 95% | 1 − (50/396): 87.4% |
| Vaccinated | 1,000,000 | πi1 = 0.002 | 2,000 | | πh1 = 0.005 | | 10 | 40 | | |
| Unvaccinated | 1,000,000 | πi0 = 0.010 | 10,000 | | πh0 = 0.020 | | 200 | 196 | | |
| Scenario 3: 10 times less severe disease | | | | 80% | | 0.02 | | | 1 − (10/200): 95% | 1 − (410/2,196): 81.3% |
| Vaccinated | 1,000,000 | πi1 = 0.02 | 20,000 | | πh1 = 0.0005 | | 10 | 400 | | |
| Unvaccinated | 1,000,000 | πi0 = 0.10 | 100,000 | | πh0 = 0.0020 | | 200 | 1,996 | | |
| Scenario 4: Poor vaccine effectiveness against infection | | | | 20% | | 0.02 | | | 1 − (400/2,000): 80% | 1 − (1,992/3,960): 49.7% |
| Vaccinated | 1,000,000 | πi1 = 0.08 | 80,000 | | πh1 = 0.005 | | 400 | 1,592 | | |
| Unvaccinated | 1,000,000 | πi0 = 0.10 | 100,000 | | πh0 = 0.020 | | 2,000 | 1,960 | | |
| Scenario 5: Poor vaccine effectiveness against infection; good vaccine protection once infected | | | | 20% | | 0.02 | | | 1 − (160/2,000): 92% | 1 − (1,757/3,960): 55.6% |
| Vaccinated | 1,000,000 | πi1 = 0.08 | 80,000 | | πh1 = 0.002 | | 160 | 1,597 | | |
| Unvaccinated | 1,000,000 | πi0 = 0.10 | 100,000 | | πh0 = 0.020 | | 2,000 | 1,960 | | |
| Scenario 6: Poor vaccine effectiveness against infection; good vaccine protection once infected; low rates of hospitalisation for other reasons | | | | 20% | | 0.001 | | | 1 − (160/2,000): 92% | 1 − (240/2,098): 88.6% |
| Vaccinated | 1,000,000 | πi1 = 0.08 | 80,000 | | πh1 = 0.002 | | 160 | 80 | | |
| Unvaccinated | 1,000,000 | πi0 = 0.10 | 100,000 | | πh0 = 0.020 | | 2,000 | 98 | | |

SARS-CoV-2: severe acute respiratory syndrome coronavirus 2; VE: vaccine effectiveness.

The infection rate is 10% among the unvaccinated population except in Scenario 2 where it is 1%. The risk of hospitalisation due to COVID-19 following an infection is 2% among the unvaccinated population except in Scenario 3 where it is 0.2%. Except in Scenario 6 where it is 0.1%, the risk of hospitalisation in the population among those not hospitalised for reasons related to COVID-19 is 2%, e.g. (20,000–100) × 0.02 = 398 in Scenario 1.
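The counts and VE values in the Table can be re-derived from the scenario parameters. The following is a sketch with hypothetical variable and function names; it keeps fractional counts rather than rounding to whole individuals as the Table does, which does not change the VE values at one decimal place.

```python
POPULATION = 1_000_000  # per group, as in the Table

# (pi_i1, pi_i0, pi_h1, pi_h0, pi_f) for Scenarios 1 to 6, copied from the Table.
SCENARIOS = {
    1: (0.02, 0.10, 0.005, 0.02, 0.02),
    2: (0.002, 0.010, 0.005, 0.02, 0.02),
    3: (0.02, 0.10, 0.0005, 0.002, 0.02),
    4: (0.08, 0.10, 0.005, 0.02, 0.02),
    5: (0.08, 0.10, 0.002, 0.02, 0.02),
    6: (0.08, 0.10, 0.002, 0.02, 0.001),
}


def hospitalised(pi_i, pi_h, pi_f, n=POPULATION):
    """Return (because of COVID-19, for other reasons) among infected people."""
    infected = n * pi_i
    covid = infected * pi_h
    # Only infected people not hospitalised for COVID-19 can be incidental cases.
    other = (infected - covid) * pi_f
    return covid, other


results = {}
for number, (pi_i1, pi_i0, pi_h1, pi_h0, pi_f) in SCENARIOS.items():
    covid_v, other_v = hospitalised(pi_i1, pi_h1, pi_f)
    covid_u, other_u = hospitalised(pi_i0, pi_h0, pi_f)
    true_ve = 1 - covid_v / covid_u
    observed_ve = 1 - (covid_v + other_v) / (covid_u + other_u)
    results[number] = (true_ve, observed_ve)
    print(f"Scenario {number}: true VE {true_ve:.1%}, observed VE {observed_ve:.1%}")
```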

Scenario 2: 10 times lower infection rates

As was established above, the incidence rate of infection in the population, reflecting the contagiousness of the pathogen, does not impact the magnitude of this bias (the bias correction factor c is independent of πi0 and πi1). This is illustrated in Scenario 2 where the infection rate is 10 times smaller in both the vaccinated and unvaccinated population (πi0 = 0.01, πi1 = 0.002) compared with Scenario 1 but the estimated VE remains unchanged.

Scenario 3: 10 times less severe disease

In Scenario 3, the disease is milder as the risk that an infection will lead to hospitalisation is 10 times lower than in Scenario 1 (πh0 = 0.002, πh1 = 0.0005). Relative to the many misclassified cases with incidental infection, the relevant contrast between vaccinated and unvaccinated cases that are actually hospitalised for COVID-19 is diluted considerably. If the number of hospitalisations with incidental infections is large relative to those actually hospitalised due to the infection under study, the observed hospitalisation rate ratio, which in Scenario 3 is (400 + 10) / (1,996 + 200) = 410/2,196, will approach the rate ratio for infections among vaccinated vs unvaccinated (20,000/100,000). Consequently, as was established above, we see in Scenario 3 that the estimated VE against hospitalisation approaches the level of VE against infection.
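To see the dilution at work, the sketch below evaluates Formula 4 while scaling both hospitalisation risks down by a common factor, so that the true VE stays at 95%. The scale factors 0.01 and 0.001 are hypothetical extensions beyond the scenarios in the Table, and the function name is illustrative.

```python
def observed_ve(ve_inf, pi_h1, pi_h0, pi_f):
    """Observed VE against hospitalisation via Formula 4."""
    rr = (1 - ve_inf) * (pi_h1 + (1 - pi_h1) * pi_f) / (pi_h0 + (1 - pi_h0) * pi_f)
    return 1 - rr


VE_INF, PI_F = 0.80, 0.02

# Scale 1 is Scenario 1 and scale 0.1 is Scenario 3; smaller scales are hypothetical.
severity_scales = [1, 0.1, 0.01, 0.001]
observed = [observed_ve(VE_INF, 0.005 * s, 0.02 * s, PI_F) for s in severity_scales]

for s, ve in zip(severity_scales, observed):
    # The true VE is 95% throughout because pi_h1 / pi_h0 is unchanged.
    print(f"severity x{s}: observed VE {ve:.1%}")
```

The observed VE falls towards, but never below, the 80% VE against infection as the disease becomes milder.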

Scenario 4: Poor vaccine effectiveness against infection

When VE for infection is high, as in Scenarios 1–3, the absolute scale of the bias may be limited. On the other hand, when VE for infection is low (e.g. 20% as in Scenario 4), estimates of VE against hospitalisation may tend towards similarly low levels despite good vaccine protection against hospitalisation once infected.

Scenario 5: Poor vaccine effectiveness against infection; good vaccine protection once infected

The bias is particularly pronounced for vaccines that offer high levels of protection against hospitalisation despite poor protection against infection as was the case for many of the original (monovalent) COVID-19 vaccines during the Omicron era [10,11]. This is illustrated further in Scenario 5 where the vaccine still only protects 20% against infection but vaccinated SARS-CoV-2 cases are 10 times less likely to require hospitalisation than unvaccinated cases (πh1/πh0 = 0.002/0.02 = 0.1). Here, VE against hospitalisation is estimated at just 55.6% when in fact it is 92%.

Scenario 6: Poor vaccine effectiveness against infection; good vaccine protection once infected; low rates of hospitalisation for other reasons

Arguably most important for the magnitude of the bias is the rate of misclassified cases, πf, i.e. in this example, the proportion of people in hospital for reasons other than COVID-19. This is illustrated in Scenario 6, which is a repeat of Scenario 5 except that only 0.1% of the total population under study (not hospitalised for COVID-19) is in hospital for other reasons (πf = 0.001). In this scenario, the estimated VE of 88.6% is wrong by only a few percentage points. In many real-life scenarios, a lower πf such as in Scenario 6 may be more realistic, especially if the study is conducted in a general population of relatively good health.
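The dependence on πf can be traced by sweeping it across its range while holding the other parameters at their Scenario 5 and 6 values. This is an illustrative sketch: the function name and the intermediate πf values (0.005, 0.1) are assumptions added for the example.

```python
def observed_ve(ve_inf, pi_h1, pi_h0, pi_f):
    """Observed VE against hospitalisation via Formula 4."""
    rr = (1 - ve_inf) * (pi_h1 + (1 - pi_h1) * pi_f) / (pi_h0 + (1 - pi_h0) * pi_f)
    return 1 - rr


VE_INF, PI_H1, PI_H0 = 0.20, 0.002, 0.02  # shared by Scenarios 5 and 6

# 0.001 is Scenario 6 and 0.02 is Scenario 5; the other values are hypothetical.
pi_f_values = [0.0, 0.001, 0.005, 0.02, 0.1, 1.0]
sweep = {pi_f: observed_ve(VE_INF, PI_H1, PI_H0, pi_f) for pi_f in pi_f_values}

for pi_f, ve in sweep.items():
    print(f"pi_f = {pi_f}: observed VE {ve:.1%}")
```

At πf = 0 the observed VE equals the true VE of 92%, and at πf = 1 it equals the VE against infection of 20%.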

Bias correction

As explained above, the true (unbiased) VE against a severe disease outcome, such as hospitalisation, can be expressed as VEhos = 1 – c × (1 –VE’hos), where VE’hos is the estimated VE and c is the bias correction factor

$$c = \frac{\pi_{h1}}{\pi_{h0}} \times \frac{\pi_{h0} + (1 - \pi_{h0}) \, \pi_f}{\pi_{h1} + (1 - \pi_{h1}) \, \pi_f}$$

which is a function of the three parameters, πf, πh1 and πh0. To illustrate, having observed VE’hos = 81.3% as in Scenario 3, and assuming πf = 0.02, πh1 = 0.0005 and πh0 = 0.002, we can evaluate c as 0.2679 and the true VE as 1 – 0.2679 × (1 – 0.813) = 95%.

In practice πf, πh1 and πh0 will generally not be known but might be gauged from electronic health records or external studies so that it may still be possible to gain a sense of the level of underestimation that can be expected in particular scenarios and, by varying the parameters, suggest a range of plausible values within which the true (unbiased) VE is likely to be.
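One possible way to carry out such a sensitivity analysis is sketched below. The observed VE of 81.3% is taken from Scenario 3, but the parameter grids are purely hypothetical placeholders for values a study team might consider plausible; they are not recommendations from the article.

```python
from itertools import product


def bias_correction_factor(pi_h1, pi_h0, pi_f):
    """Bias correction factor c as a function of pi_f, pi_h1 and pi_h0."""
    return (pi_h1 / pi_h0) * (pi_h0 + (1 - pi_h0) * pi_f) / (pi_h1 + (1 - pi_h1) * pi_f)


ve_observed = 0.813  # observed VE, as in Scenario 3

# Hypothetical plausible ranges; the middle values match Scenario 3.
pi_f_values = (0.01, 0.02, 0.03)
pi_h0_values = (0.001, 0.002, 0.004)
severity_ratios = (0.2, 0.25, 0.5)  # assumed values of pi_h1 / pi_h0

corrected = {}
for pi_f, pi_h0, ratio in product(pi_f_values, pi_h0_values, severity_ratios):
    c = bias_correction_factor(ratio * pi_h0, pi_h0, pi_f)
    corrected[(pi_f, pi_h0, ratio)] = 1 - c * (1 - ve_observed)

low, high = min(corrected.values()), max(corrected.values())
print(f"Corrected VE ranges from {low:.1%} to {high:.1%}")
print(f"Scenario 3 parameters: {corrected[(0.02, 0.002, 0.25)]:.1%}")
```

Because c is at most 1 whenever πh1 ≤ πh0, every corrected value in the grid is at least as large as the observed VE.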

Discussion

Generally, a low specificity of the outcome measure in clinical studies results in rates being overestimated in both treatment and control groups. Consequently, it is well established that low specificity attenuates the ratio of the measured rates causing the ratio to be closer to 1 than it truly is [12,13]. However, in studies of vaccine protection against severe outcomes, VE is typically derived from the product of two ratios, namely the ratio of infection rates and secondly, among those infected, the ratio of the rates of severe outcome. In many studies, such as those using PCR methods to detect infection, the problem of low specificity affects only the second ratio. Consequently, as illustrated in this article, the biased estimate is bounded by the level of VE against infection.

We have seen that the proportion of individuals admitted to hospital for unrelated reasons, πf, is an important factor for the size of the bias. In all the scenarios shown, and in the expression for c, it is assumed that πf is independent of vaccination status. A fundamental principle of VE studies is that the vaccinated and unvaccinated groups being compared do not differ systematically with respect to other risk factors (at least not after adjustment). This is necessary to ensure that the VE measure captures only the immunological effect of vaccination and not the effects of other exposure and disease predictors. It would probably be a sign that the health profiles are not comparable if πf differed between the two groups, and the resultant VE estimate would therefore also be biased due to confounding. Supplementary analyses with negative control exposures or outcomes are recommended in observational VE studies as a way to assess such bias [14,15]. Nonetheless, it is possible to adapt the expression for c to accommodate different πf rates, say πf1 and πf0 respectively, in the vaccinated and unvaccinated group; it can then be shown that the underestimation is even more pronounced if πf1 > πf0, which would be the situation if, for example, people of poorer health were more likely to be vaccinated.
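The article does not spell out the adapted expression. One natural reading, which is an assumption of this sketch rather than a formula from the source, is to replace πf by πf1 in the numerator and by πf0 in the denominator of Formula 4. The value πf1 = 0.03 below is hypothetical; the remaining values are those of Scenario 1.

```python
def observed_ve_differential(ve_inf, pi_h1, pi_h0, pi_f1, pi_f0):
    """Observed VE when the rate of unrelated hospitalisation differs by group.

    Assumed adaptation of Formula 4: pi_f1 applies to the vaccinated group
    and pi_f0 to the unvaccinated group.
    """
    rr = (1 - ve_inf) * (pi_h1 + (1 - pi_h1) * pi_f1) / (pi_h0 + (1 - pi_h0) * pi_f0)
    return 1 - rr


# Scenario 1 values, with equal and with unequal rates of unrelated hospitalisation.
ve_equal = observed_ve_differential(0.80, 0.005, 0.02, pi_f1=0.02, pi_f0=0.02)
ve_unequal = observed_ve_differential(0.80, 0.005, 0.02, pi_f1=0.03, pi_f0=0.02)

print(f"pi_f1 = pi_f0: {ve_equal:.1%}")   # 87.4%
print(f"pi_f1 > pi_f0: {ve_unequal:.1%}")  # 82.4%
```

Under this assumed form, a higher rate of unrelated hospitalisation among vaccinated people pushes the observed VE further below the true value of 95%.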

Observational studies to estimate VE may be designed in various ways. Study designs include retrospective cohort studies through analysis of routinely collected electronic health records, cross-sectional designs, prospective cohort studies, test-negative designs and other case–control studies. The biasing effects presented in this article, caused by outcome misclassification, apply equally across all these designs. Even randomised controlled trials (RCTs) are vulnerable to this type of outcome misclassification bias, although due to their budgets and rigour, RCTs often include more accurate outcome assessment methods than observational studies.

Unlike the problems caused by low specificity, low sensitivity of an outcome measure generally does not bias the ratio of the measured rates in the two groups [13]. It is therefore preferable for studies of VE to use outcomes that minimise numbers of misclassified cases (false cases) rather than numbers of missed cases.
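A short numerical check illustrates why missed cases leave the ratio untouched. The sketch assumes perfect specificity and a sensitivity that is the same in vaccinated and unvaccinated people; the function name and the sensitivity values are hypothetical, and the rates are those of Scenario 1.

```python
def measured_ve(pi_i1, pi_i0, pi_h1, pi_h0, sensitivity):
    """VE when only a fraction `sensitivity` of true cases is detected in both groups."""
    rate_vaccinated = sensitivity * pi_i1 * pi_h1
    rate_unvaccinated = sensitivity * pi_i0 * pi_h0
    # The common sensitivity factor cancels in the ratio.
    return 1 - rate_vaccinated / rate_unvaccinated


estimates = [measured_ve(0.02, 0.10, 0.005, 0.02, s) for s in (1.0, 0.7, 0.4)]
for s, ve in zip((1.0, 0.7, 0.4), estimates):
    print(f"sensitivity {s}: VE {ve:.1%}")  # 95.0% in each case
```

The cancellation would no longer hold if sensitivity differed between the two groups, which this sketch does not model.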

Depending on data availability, minimising the number of misclassified cases may be achieved by defining more specific outcome measures based for example on primary diagnosis codes upon hospital admission, hospital procedures or death certificate information. Death as a specific disease outcome may also be defined with higher specificity, but possibly at the cost of lower sensitivity, by requiring certain accompanying hospital diagnoses or procedures. A number of studies have explored alternative severe outcome definitions in the context of VE [6,16,17].

Whether in the context of COVID-19, influenza or some other type of infection, studies of VE against severe disease outcomes such as hospitalisation and death should aim to use outcome measures that minimise inclusion of false cases to avoid underestimation of the effects of interest. Where this is not possible, consideration should be given to the magnitude of the resulting bias, for example by investigating likely scenarios for the three parameters that enter the expression for c above.

Conclusions

Vaccine effectiveness against severe disease outcomes will generally be underestimated when incidental cases are included in the analysis. The potential error is greatest for vaccines that offer high levels of protection against severe disease but poor protection against infection. Outcomes with high specificity rather than high sensitivity are preferable in VE studies.

Ethical statement

Ethical approval was not needed for this work because no participant data were involved.

Funding statement

CHH is partly funded through a UK Medical Research Council (MRC) grant (reference: MR/R010161/1). The funder had no role in the study, decision to publish, or preparation of the manuscript.

Conflict of interest: None declared.

Authors’ contributions: CHH alone conceived the idea and is responsible for the manuscript and decision to publish.

References

1. World Health Organization (WHO). Classification of omicron (B.1.1.529): SARS-CoV-2 variant of concern. Geneva: WHO; 2021. Available from: https://www.who.int/news/item/26-11-2021-classification-of-omicron-(b.1.1.529)-sars-cov-2-variant-of-concern
2. Bálint G, Vörös-Horváth B, Széchenyi A. Omicron: increased transmissibility and decreased pathogenicity. Signal Transduct Target Ther. 2022;7(1):151. doi: 10.1038/s41392-022-01009-8
3. Bager P, Wohlfahrt J, Bhatt S, Stegger M, Legarth R, Møller CH, et al.; Omicron-Delta study group. Risk of hospitalisation associated with infection with SARS-CoV-2 omicron variant versus delta variant in Denmark: an observational cohort study. Lancet Infect Dis. 2022;22(7):967-76. doi: 10.1016/S1473-3099(22)00154-2
4. Nyberg T, Ferguson NM, Nash SG, Webster HH, Flaxman S, Andrews N, et al.; COVID-19 Genomics UK (COG-UK) consortium. Comparative analysis of the risks of hospitalisation and death associated with SARS-CoV-2 omicron (B.1.1.529) and delta (B.1.617.2) variants in England: a cohort study. Lancet. 2022;399(10332):1303-12. doi: 10.1016/S0140-6736(22)00462-7
5. Voor In ’t Holt AF, Haanappel CP, Rahamat-Langendoen J, Molenkamp R, van Nood E, van den Toorn LM, et al. Admissions to a large tertiary care hospital and Omicron BA.1 and BA.2 SARS-CoV-2 polymerase chain reaction positivity: primary, contributing, or incidental COVID-19. Int J Infect Dis. 2022;122(2):665-8. doi: 10.1016/j.ijid.2022.07.030
6. Stowe J, Andrews N, Kirsebom F, Ramsay M, Bernal JL. Effectiveness of COVID-19 vaccines against Omicron and Delta hospitalisation, a test negative case-control study. Nat Commun. 2022;13(1):5736. doi: 10.1038/s41467-022-33378-7
7. Price AM, Flannery B, Talbot HK, Grijalva CG, Wernli KJ, Phillips CH, et al. Influenza vaccine effectiveness against influenza A(H3N2)-related illness in the United States during the 2021-2022 influenza season. Clin Infect Dis. 2023;76(8):1358-63. doi: 10.1093/cid/ciac941
8. Poland GA, Kennedy RB, Tosh PK. Prevention of monkeypox with vaccines: a rapid review. Lancet Infect Dis. 2022;22(12):e349-58. doi: 10.1016/S1473-3099(22)00574-6
9. Deputy NP, Deckert J, Chard AN, Sandberg N, Moulia DL, Barkley E, et al. Vaccine effectiveness of JYNNEOS against mpox disease in the United States. N Engl J Med. 2023;388(26):2434-43. doi: 10.1056/NEJMoa2215201
10. Gram MA, Emborg H-D, Schelde AB, Friis NU, Nielsen KF, Moustsen-Helms IR, et al. Vaccine effectiveness against SARS-CoV-2 infection or COVID-19 hospitalization with the Alpha, Delta, or Omicron SARS-CoV-2 variant: A nationwide Danish cohort study. PLoS Med. 2022;19(9):e1003992. doi: 10.1371/journal.pmed.1003992
11. Andrews N, Stowe J, Kirsebom F, Toffa S, Rickeard T, Gallagher E, et al. Covid-19 vaccine effectiveness against the Omicron (B.1.1.529) variant. N Engl J Med. 2022;386(16):1532-46. doi: 10.1056/NEJMoa2119451
12. Jackson ML, Rothman KJ. Effects of imperfect test sensitivity and specificity on observational studies of influenza vaccine effectiveness. Vaccine. 2015;33(11):1313-6. doi: 10.1016/j.vaccine.2015.01.069
13. Smith PG, Morrow RH, Ross DA. Field trials of health interventions: a toolbox. 3rd edition. Oxford: Oxford University Press; 2015. pp. 211-212.
14. Lipsitch M, Tchetgen Tchetgen E, Cohen T. Negative controls: a tool for detecting confounding and bias in observational studies. Epidemiology. 2010;21(3):383-8. doi: 10.1097/EDE.0b013e3181d61eeb
15. Hansen CH, Moustsen-Helms IR, Rasmussen M, Søborg B, Ullum H, Valentiner-Branth P. Short-term effectiveness of the XBB.1.5 updated COVID-19 vaccine against hospitalisation in Denmark: a national cohort study. Lancet Infect Dis. 2024;24(2):e73-4. doi: 10.1016/S1473-3099(23)00746-6
16. Chung H, Austin PC, Brown KA, Buchan SA, Fell DB, Fong C, et al. Effectiveness of COVID-19 vaccines over time prior to omicron emergence in Ontario, Canada: test-negative design study. Open Forum Infect Dis. 2022;9(9):ofac449. doi: 10.1093/ofid/ofac449
17. World Health Organization (WHO). Evaluation of COVID-19 vaccine effectiveness: Interim guidance. Geneva: WHO; 2021. Available from: https://www.who.int/publications/i/item/WHO-2019-nCoV-vaccine_effectiveness-measurement-2021.1

Articles from Eurosurveillance are provided here courtesy of European Centre for Disease Prevention and Control
