Box 3.
Surveillance bias
Diagnosis data from healthcare providers are increasingly used for surveillance [27]. Trends in diagnoses can differ from trends in disease, and ignoring this difference can lead to surveillance bias [1]. For example, during the pandemic, the number of reported COVID-19 cases was routinely used as the main indicator of the spread and severity of infection, because these data were available daily and, at least apparently, relatively simple to communicate. However, the difference between the diagnosis (based on a positive test) and the disease we want to prevent (severe symptomatic infection) was often overlooked. Focusing on the number of positive tests to gauge the severity of the pandemic could be misleading, since this number was influenced over time by changes in test availability and testing intensity, as well as by changes in reporting rates. As a result, changes in the number of diagnosed cases did not track the number of cases of clinical disease in a predictable way. Another example is the use of hospitalizations as a marker of epidemic severity, which is hampered by the fact that the threshold for hospitalization changed with changing perceptions of risk, attitudes about who should be hospitalized, beliefs about the efficacy of in-hospital treatments, and differing incentives to hospitalize patients with SARS-CoV-2 infection. Naively comparing hospitalization rates over time or across settings can therefore give a biased picture of the severity of the pandemic. Preventing surveillance bias requires stronger standardization in the definition of health events, as well as information systems that capture high-quality population-based data and not only diagnosis-based data. When multiple sources of data are available, each with its own bias, triangulation can also help [28]. Triangulation can, however, further compound bias, especially when there are dominant (but false) narratives: new biases are then added to the argument to keep supporting false premises.
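
The dependence of case counts on testing intensity can be illustrated with a minimal numerical sketch. It rests on strong simplifying assumptions that are ours, not the source's: tests are given to a random sample of the population, the test is perfectly accurate, and the true prevalence of infection is constant. All figures (population size, prevalence, weekly test volumes) and names are hypothetical and chosen only for illustration.

```python
# Illustrative only: every number below is hypothetical, not taken from any dataset.
POPULATION = 1_000_000
TRUE_PREVALENCE = 0.02  # assumed constant share of the population infected each week


def expected_positive_tests(tests_performed: int, prevalence: float) -> float:
    """Expected positives, assuming random testing and a perfectly accurate test."""
    return tests_performed * prevalence


# Testing intensity doubles each week while the true burden stays the same.
weekly_tests = [5_000, 10_000, 20_000]
true_infections = [POPULATION * TRUE_PREVALENCE for _ in weekly_tests]
reported_cases = [expected_positive_tests(t, TRUE_PREVALENCE) for t in weekly_tests]

for week, (tests, cases, infections) in enumerate(
    zip(weekly_tests, reported_cases, true_infections), start=1
):
    print(
        f"week {week}: tests={tests}, "
        f"reported cases={cases:.0f}, true infections={infections:.0f}"
    )
```

Under these assumptions, reported cases scale with the number of tests performed while the number of true infections does not change, so a rising case curve would reflect testing alone. In practice testing is rarely random and tests are imperfect, so the relationship between diagnoses and disease is less predictable than in this sketch.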