Table 3.
Certainty assessment | | | | | | | Impact | Certainty | Importance |
---|---|---|---|---|---|---|---|---|---|
№ of studies | Study design | Risk of bias | Inconsistency | Indirectness | Imprecision | Other considerations | | | |
Accuracy—sensitivity (Ag-RDT self-testing vs. rRT-PCR) | |||||||||
23 [11,24–34] | Observational studies | Not serious (a) | Not serious (b) | Not serious (c) | Not serious (d) | None | Normalized to a study population of 1000 participants with 10% prevalence, self-testing yielded 66 true-positive and 34 false-negative results. Pooled sensitivity was 66.1% (95% CI 53.5 to 76.7) | ⨁⨁⨁⨁ High | CRITICAL |
Accuracy—specificity (Ag-RDT self-testing vs. rRT-PCR) | |||||||||
23 [11,24–34] | Observational studies | Not serious (a) | Not serious (b) | Not serious (c) | Not serious (d) | None | Normalized to a study population of 1000 participants with 10% prevalence, self-testing yielded 874 true-negative and 2 false-positive results. Pooled specificity was high at 99.5% (95% CI 99.1 to 99.7) | ⨁⨁⨁⨁ High | CRITICAL |
Accuracy—concordance (Ag-RDT self-testing vs. Ag-RDT performed by professionals) | |||||||||
1 [11] | Observational studies | Not serious (a) | Serious (b) | Not serious (c) | Serious (d) | None | Kappa: 0.92 out of 1.00 (95% CI 0.89 to 0.95) | ⨁⨁◯◯ Low | CRITICAL |
Accuracy—Proportion of user errors | |||||||||
1 [11] | Observational studies | Not serious (a) | Serious (b) | Not serious (c) | Not serious (e) | None | Deviations by study participants were found in 15.5% of sampling steps and 15.0% of testing steps. However, these deviations did not impede the self-test's performance | ⨁⨁◯◯ Low | IMPORTANT |
Explanation: (a) We used QUADAS-2 to assess risk of bias. The studies enrolled patients consecutively and assessed the results of self-testing (defined as self-sampling and self-performing the Ag-RDT) blinded to the reference standard result (rRT-PCR or professional Ag-RDT testing). For one study it was not clear whether all self-tests were performed as per the manufacturer’s instructions, whereas this was ensured in the other. Furthermore, we could not detect any potential bias resulting from the study flow and timing. Therefore, we did not downgrade the quality of evidence for this criterion.
(b) The heterogeneity/inconsistency in findings, as shown by the wide-ranging point estimates with only marginally overlapping confidence intervals, is likely to originate from differences in the study populations. This interpretation is strengthened by the fact that the head-to-head comparison between self-testing and professional testing on the same study population shows similar Ag-RDT performance. However, as only a few studies are available for concordance and one study for user errors, we downgrade by one point for these two outcomes.
(c) Following current GRADE guidance, we do not downgrade by one point for all studies but acknowledge that the study populations are not fully representative of the populations of interest. Furthermore, the intervention did not differ from the one of interest and outcomes were reported directly; therefore, indirectness was judged 'not serious'.
(d) The number of studies and the sample size were small, and only one study reported on concordance between self-testing and professional testing using Ag-RDTs.
(e) For this outcome, only qualitative data, or quantitative data from isolated studies in well-described but not comparable settings, were available; therefore, the criterion 'imprecision' is negligible and rated as 'not serious'.
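
The normalization used in the Impact column for sensitivity can be illustrated with a short calculation. The following is a minimal sketch, assuming the counts are obtained by applying the pooled sensitivity to the infected share of a notional cohort of 1000 participants at 10% prevalence; the function name `normalize_sensitivity` and its parameters are hypothetical and do not come from the review.

```python
def normalize_sensitivity(sensitivity, population=1000, prevalence=0.10):
    """Split the infected share of a notional cohort into
    true positives and false negatives.

    Assumption: counts are rounded to whole participants.
    """
    # Number of infected participants in the notional cohort
    infected = population * prevalence
    # Infected participants correctly identified by the self-test
    true_positives = round(infected * sensitivity)
    # Infected participants missed by the self-test
    false_negatives = round(infected) - true_positives
    return true_positives, false_negatives


# Pooled sensitivity of 66.1% reported in Table 3
print(normalize_sensitivity(0.661))  # (66, 34)
```

Under these assumptions, a pooled sensitivity of 66.1% corresponds to 66 true-positive and 34 false-negative results among the 100 infected participants.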