Microbiology Spectrum. 2022 Nov 1;10(6):e03418-22. doi: 10.1128/spectrum.03418-22

Reply to Li et al., “Kappa Values in Testing the Concordance: Comments on a Recent Article about Nasopharyngeal Swabs for SARS-CoV-2 Detection”

Cody Callahan a,b, Sarah Ditelberg b, Sanjucta Dutta b, Nancy Littlehale c, Annie Cheng b, Kristin Kupczewski b, Danielle McVay b, Stefan Riedel b,c,e, James E Kirby b,c,e, Ramy Arnaout b,c,d,e
Editor: Eleanor A Powell f
PMCID: PMC9769801  PMID: 36318026

REPLY

In a letter regarding our paper (1), Li et al. recalculated Cohen’s κ values from the 2 × 2 tables from our Fig. 4 and obtained slightly different values (2). Li et al. pointed out that none of these differences affect the conclusions of our paper but asked the reason for the differences.

The differences arose because, in our study, we used a different cutoff for positivity when creating the 2 × 2 tables than when calculating the κ values. There is debate in the field as to whether cycle threshold (CT) values significantly above the limit of detection (LOD) are potential false positives. Therefore, to be conservative and not overestimate sensitivity, we used the LOD as a cutoff for the 2 × 2 tables (to reduce the likelihood of false positives). For the κ calculations, however, we used CT values without cutoffs, so that the calculated agreement would reflect the qualitative results as reported by our testing platform in accordance with its Emergency Use Authorization (EUA). It would have been clearer to use the same cutoff for both calculations (although, as Li et al. pointed out, the difference does not affect our paper’s conclusions).
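The effect of the cutoff choice on κ can be illustrated with a minimal Python sketch. It is not the authors' notebook: the paired viral loads below are invented for illustration, the helper names `cohen_kappa` and `call` are hypothetical, and treating "at or above the LOD" as positive is an assumption. Only the LOD of 100 copies/mL comes from the figure caption. κ is computed by hand here as (observed agreement − chance agreement) / (1 − chance agreement); scikit-learn's `cohen_kappa_score` is the library routine for the same statistic.

```python
from collections import Counter

LOD = 100.0  # copies/mL, as stated in the figure caption


def cohen_kappa(a, b):
    """Cohen's kappa for two equal-length sequences of labels."""
    n = len(a)
    observed = sum(x == y for x, y in zip(a, b)) / n
    count_a, count_b = Counter(a), Counter(b)
    # Chance agreement from the marginal label frequencies of each rater.
    expected = sum(count_a[k] * count_b[k] for k in set(a) | set(b)) / n**2
    return (observed - expected) / (1 - expected)


def call(load, cutoff):
    """Positive (1) if the viral load reaches the cutoff, else negative (0)."""
    return 1 if load >= cutoff else 0


# HYPOTHETICAL paired results (saliva, NP swab) in copies/mL; 0 = not detected.
pairs = [(5000, 8000), (300, 50), (40, 0), (0, 0), (0, 0),
         (1200, 900), (60, 70), (0, 0), (20000, 15000), (0, 30)]

# "No cutoff": any detected signal counts as positive (smallest positive cutoff).
saliva_any = [call(s, 1e-9) for s, _ in pairs]
np_any = [call(p, 1e-9) for _, p in pairs]

# "LOD cutoff": only loads at or above the LOD count as positive.
saliva_lod = [call(s, LOD) for s, _ in pairs]
np_lod = [call(p, LOD) for _, p in pairs]

kappa_any = cohen_kappa(saliva_any, np_any)
kappa_lod = cohen_kappa(saliva_lod, np_lod)
print(f"kappa without cutoff: {kappa_any:.2f}")
print(f"kappa with LOD cutoff: {kappa_lod:.2f}")
```

With these invented pairs the two definitions of "positive" classify the low-level results differently, so the same samples yield two different κ values. A 2 × 2 table built with one definition and a κ computed with the other will therefore not match on recalculation.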

Figure 1 is an update of Fig. 4 from our paper (1), making all values consistent by using LOD cutoffs throughout. Our κ values (obtained using Python’s scikit-learn package) are now identical to those of Li et al. (obtained using the commercial software SPSS) except for panel g, where we obtain 0.90 compared to Li et al.’s 0.91. We suspect this discrepancy is due to rounding: the value to multiple decimal places is κ = 0.9049…, which rounds down to 0.90. Possibly SPSS returned only three decimal places—0.905—which would round up to 0.91. This difference is unimportant.
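The suspected double-rounding effect can be reproduced with Python's standard `decimal` module. This is a sketch under stated assumptions: it uses only the leading digits quoted in the text (0.9049) and assumes round-half-up at each step, which is not confirmed to be how SPSS or Li et al. handled the value.

```python
from decimal import Decimal, ROUND_HALF_UP

# Leading digits of kappa as quoted in the text; any further digits are unknown.
kappa = Decimal("0.9049")

two_dp = Decimal("0.01")
three_dp = Decimal("0.001")

# Rounding once, straight to two decimal places.
direct = kappa.quantize(two_dp, rounding=ROUND_HALF_UP)

# Rounding to three decimal places first (ASSUMED intermediate output),
# then rounding that intermediate value to two decimal places.
intermediate = kappa.quantize(three_dp, rounding=ROUND_HALF_UP)
double_rounded = intermediate.quantize(two_dp, rounding=ROUND_HALF_UP)

print(direct, intermediate, double_rounded)
```

Rounding once gives 0.90, whereas rounding to 0.905 first and then to two places gives 0.91, which matches the size and direction of the discrepancy in panel g.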

FIG 1

Viral load in saliva versus NP swab samples, Cohen’s kappa (κ) concordance values, and contingency tables for the overall study (a), subjects at initial presentation (within 5 days of first COVID-19 RT-PCR test) (b), subjects presenting for follow-up testing (c), samples treated with GITC (guanidinium isothiocyanate) transport buffer as a preservative after receipt at the central laboratory (d), untreated samples (e), samples run on the Alinity m platform (f), and samples run on the m2000 (g). Diagonal lines in scatterplots, 1:1. Gray-shaded areas in the scatterplots are below the LOD (100 copies/mL). Gray-shaded cells in the contingency tables highlight discordant results. In all cases, the LOD was used as a cutoff for positive versus negative.

We believe the letter by Li et al. and our response highlight the value of both critical feedback and transparency (3, 4), especially the advantage of our having provided the entirety of our data set to facilitate reanalysis. We made all data and code available on GitHub when the original paper was published and have revised the Python notebook there that produces our results in light of Li et al.’s letter. We favor this practice, as well as the use of open-source software—here, Python instead of proprietary software like SPSS, which is less available and therefore harder to vet—in the interest of open science.

We thank Li et al. again for their careful reanalysis of our data.

Ramy Arnaout, on behalf of the authors.

Footnotes

This is a response to a letter by Li et al. (https://doi.org/10.1128/Spectrum.01582-22).

Contributor Information

Ramy Arnaout, Email: rarnaout@bidmc.harvard.edu.

Eleanor A. Powell, University of Cincinnati

REFERENCES

  • 1. Callahan C, Ditelberg S, Dutta S, Littlehale N, Cheng A, Kupczewski K, McVay D, Riedel S, Kirby JE, Arnaout R. 2021. Saliva is comparable to nasopharyngeal swabs for molecular detection of SARS-CoV-2. Microbiol Spectr 9:e00162-21. doi: 10.1128/Spectrum.00162-21.
  • 2. Li M, Zhang C, Yu T. 2022. Kappa values in testing the concordance: comments on a recent article about nasopharyngeal swabs for SARS-CoV-2 detection. Microbiol Spectr 9:e01582-22. doi: 10.1128/Spectrum.01582-22.
  • 3. Lakhani KR, Boudreau KJ, Loh P-R, Backstrom L, Baldwin C, Lonstein E, Lydon M, MacCormack A, Arnaout RA, Guinan EC. 2013. Prize-based contests can provide solutions to computational biology problems. Nat Biotechnol 31:108–111. doi: 10.1038/nbt.2495.
  • 4. Arnaout RA. 2021. Cooperation under pressure: lessons from the COVID-19 swab crisis. J Clin Microbiol 59:e01239-21. doi: 10.1128/JCM.01239-21.

Articles from Microbiology Spectrum are provided here courtesy of American Society for Microbiology (ASM)
