PLOS One. 2021 Dec 20;16(12):e0260884. doi: 10.1371/journal.pone.0260884

Modelling RT-qPCR cycle-threshold using digital PCR data for implementing SARS-CoV-2 viral load studies

Fabio Gentilini 1,*,#, Maria Elena Turba 2,#, Francesca Taddei 3, Tommaso Gritti 3, Michela Fantini 3, Giorgio Dirani 3, Vittorio Sambri 3,4
Editor: Jean-Luc EPH Darlix5
PMCID: PMC8687578  PMID: 34928966

Abstract

Objectives

To exploit the features of digital PCR for implementing SARS-CoV-2 observational studies by reliably including the viral load factor expressed as copies/μL.

Methods

A small cohort of 51 Covid-19 positive samples was assessed by both RT-qPCR and digital PCR assays. A linear regression model was built using a training subset, and its accuracy was assessed in the remaining evaluation subset. The model was then used to convert the stored cycle threshold values of a large dataset of 6208 diagnostic samples into copies/μL of SARS-CoV-2. The calculated viral load was used for a single cohort retrospective study. Finally, the cohort was randomly divided into a training set (n = 3095) and an evaluation set (n = 3113) to establish a logistic regression model for predicting case-fatality and to assess its accuracy.

Results

The model for converting the Ct values into copies/μL was suitably accurate. The calculated viral load over time in the cohort of Covid-19 positive samples showed very low viral loads during the summer period between the epidemic waves in Italy. The calculated viral load (cVL), along with gender and age, allowed building a predictive model of case-fatality probability which showed high specificity (99.0%) and low sensitivity (21.7%) at the optimal threshold; performance changed when the threshold was modified (e.g. 75% sensitivity and 83.7% specificity). Alternative models including categorised cVL or raw cycle thresholds obtained by the same diagnostic method gave the same performance.

Conclusion

The modelling of the cycle threshold values using digital PCR has the potential to foster studies addressing issues regarding SARS-CoV-2; furthermore, it may allow setting up predictive tools capable of identifying, already at diagnosis, those patients at high risk of case-fatality, irrespective of the diagnostic RT-qPCR platform in use. Depending upon the epidemiological situation, public health authority policies/aims, the resources available and the thresholds used, adequate sensitivity could be achieved with acceptably low specificity.

Introduction

A year after severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) was declared a pandemic [1], many aspects of the infection remain undefined. In particular, the role of viral loads (VLs) in infectivity and case-fatality rates is still poorly clarified and scarcely used to implement public health measures [2–9].

Since the beginning, it has been clear that VLs vary greatly among patients over the course of disease, and that infectivity is associated with higher VLs [5, 8]. Compared with SARS, however, high VLs may also be evident in the pre-symptomatic phase, and the peak of viral shedding is observed early in the course of the disease [2, 7, 9]. Furthermore, the role of the VL in the respiratory tract in predicting mortality is not well known, although it was evident that higher VLs were associated with higher case-fatality ratios. One of the main hindrances to assessing VLs lies in the inherent difficulty of absolutely quantifying SARS-CoV-2. In fact, reverse transcription quantitative polymerase chain reaction (RT-qPCR) can provide absolute quantification only by using labour-demanding daily calibration procedures which, in turn, require reference materials that are not readily available [10]. Diagnostic laboratories worldwide have been overwhelmed by the demand for diagnostics and have hardly been able to take on any additional investigational activity. As a result, the majority of the studies regarding VLs have evaluated the cycle threshold (Ct), automatically calculated by thermal cyclers, as a rough quantitative estimate of VL [4–7, 9, 11–16].

Digital PCR (dPCR) is a straightforward evolution of PCR with some obvious advantages over standard qPCR assays. Specifically, dPCR allows for the absolute quantitation of nucleic acid samples without the need for a calibration curve, thanks to the partitioning of the target nucleic acid into thousands of small-volume compartments [17]. Owing to these features, dPCR is inherently more sensitive, specific and precise than standard qPCR, and is particularly reliable for VL absolute quantification [18, 19]. Despite its many advantages over RT-qPCR, dPCR is still limited by much higher analysis costs and a longer turnaround time (TAT), which restrict its application to an ancillary or complementary role to RT-qPCR. In fact, to date, many studies have demonstrated the superiority of dPCR over RT-qPCR in terms of diagnostic performance [20–31]. However, studies relying on dPCR have not been based on large case numbers, and the VLs were quantified in only relatively small cohorts. To date, dPCR has been utilised in SARS-CoV-2 research for VL quantification in regard to infectivity [2] and disease course monitoring [32, 33]; as a tool for assessing circulating RNAaemia as an outcome predictor [34–38]; as a diagnostic tool for reducing false negative results when discharging convalescent patients [27, 30, 39]; when inhibition was likely, as in examining crude sample lysates or samples without RNA purification [21, 35, 40] or wastewater [41]; for analysing contaminated surfaces [39]; and for preparing standard material for RT-qPCR or cell cultures [42–45]. Moreover, the overwhelming demand for diagnostic testing and the very short TAT required during the epidemic “waves” were scarcely suited to the majority of dPCR platforms. As a result, the vast majority of Covid-19 cases were evaluated using RT-qPCR only, while studies using dPCR were generally characterised by relatively small cohorts of cases and had a limited impact on SARS-CoV-2 knowledge.

To overcome this drawback, this study attempted to model the relationship between the Ct and genome absolute quantification for calculating the VL expressed as copies/μL; this was carried out using the stored Ct values retrieved from the medical records of a public centralised diagnostic laboratory intensely involved in diagnosing SARS-CoV-2. The aim was to test the hypothesis that a model built on a small subset of data could be advantageously harnessed to infer the VL in a large cohort. To that aim, the calculated VLs (cVLs) were then investigated in relation to chronological fluctuations and differences between age groups, and their predictive power for the outcome was assessed.

Materials & methods

Ethics statement

The study was conducted according to the guidelines of the Declaration of Helsinki, and approved by the Institutional Review Board of AUSL Romagna under the protocol code “COVdPCR” of 07/02/2020. The study was performed using exclusively anonymised, leftover samples deriving from routine diagnostic procedures; therefore, ethical approval or informed consent was not required. The anonymisation was achieved by using the current procedure (AVR-PPC P09, rev.2) checked by the local Ethical Board.

Experimental layout

The present study was composed of three steps:

  1. The first step was aimed at defining a function to convert the Ct values obtained using diagnostic RT-qPCR to absolute quantification as genome copies/μL carried out using dPCR and to assess the respective error. This task was achieved using a linear regression model built using a small cohort of 51 samples.

  2. After defining the regression function and its accuracy, the equation was used to calculate the VL in a very large cohort of 6208 Covid-19 cases. The cVLs were investigated in an observational study with a cross-sectional retrospective design.

  3. Finally, the medical data, including cVL, was used to build a straightforward predictive model, and its accuracy was calculated in a single cohort retrospective study and compared with a model including the raw Ct value.

Samples

Digital PCR has been demonstrated to be suitable for the retrospective evaluation of universal transport medium (UTM)-stored SARS-CoV-2 positive samples [31]. On this basis, 51 RNA samples conserved at -80°C in UTM (Copan, Copan Italia SpA) were selected from all the diagnostic samples examined at the “Great Romagna Hub Laboratory Pievesestina” (AVR Centro Servizi Laboratorio Unico Pievesestina, Cesena) during the Covid-19 pandemic. All the samples had been collected using nasopharyngeal or oropharyngeal swabs (Copan), immediately transferred into tubes containing 3 mL of UTM and transported to the diagnostic laboratory for SARS-CoV-2 testing using one of many different RNA purification platforms and RT-qPCR assays (S1 File). The results were expressed as positive or negative together with the Ct values of the respective targets; some anamnestic, epidemiological and clinical data were retrieved from the Laboratory database. The case-fatality information was recovered from the Death Registries of the Public Health Departments of the local medical services of Romagna. The samples were retrieved from the repository, the RNA was purified and re-assessed using both RT-qPCR and dPCR. In addition, the samples were divided into two sets; the first set of 13 samples (training set) was used to create the regression model while the second set of 38 samples (evaluation set) was used to validate the model. Regardless of the method originally used, the cohort of 51 samples was composed by stratifying the samples into high (≤ 20 Ct), medium (> 20 ≤ 25 Ct) and low (> 25 Ct) VL categories using the recorded Ct.
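The stratification rule used to compose the 51-sample cohort can be written as a small function. This is only an illustrative sketch: the function name `vl_category` is hypothetical, and the cut-offs are the Ct limits quoted in the paragraph above.

```python
def vl_category(ct: float) -> str:
    """Map a recorded Ct value to the viral-load stratum described in the text.

    A lower Ct means more template, hence a higher viral load.
    """
    if ct <= 20:
        return "high"      # Ct <= 20
    if ct <= 25:
        return "medium"    # 20 < Ct <= 25
    return "low"           # Ct > 25


if __name__ == "__main__":
    for ct in (18.4, 22.0, 31.7):
        print(ct, vl_category(ct))
```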

RNA purification, RT-qPCR and dPCR

The samples were collected using oro- and nasopharyngeal swabs immediately placed in UTM (Copan, Copan Italia SpA). The RNA was purified from UTM and used for the RT-qPCR and dPCR assays. The detailed protocols are reported in S1 File.

Statistical analysis and modelling

The analytical performances of the dPCR assay were established in terms of analytical sensitivity, precision and linearity, and were expressed as Limit of Detection (LOD), as Coefficient of Variation % (CV%) across technical replicates carried out over different days, and as a linear coefficient of correlation R², respectively. The analytical performances were evaluated using Analyse-it software (Analyse-it Software, UK) (S1 File).

Fifty-one samples positive for SARS-CoV-2 at RT-qPCR were retrieved from the repository and divided into two sets: a training set composed of 13 samples and an evaluation set composed of 38 samples. The training set samples were analysed in triplicate with dPCR, and the findings were used to build the model. After that, the 38 samples of the evaluation set were each assessed once using dPCR, and the results were used to validate the linear regression model. The dPCR results, expressed as log10 copies/μL of cDNA, were entered as the dependent variable and the Ct values as the predictor using STATA v12 software. The software allowed calculating both the fit of the model, as a Pseudo R-squared value, and its significance, in addition to the terms of the linear regression function Y = aX + b, where Y is the log10 copies/μL, X is the Ct measured in the RT-qPCR, a is the coefficient of X as defined by the model and b is the constant (Table 1).

Table 1. Linear regression model including log10 copies/μL as a dependent variable and cycle threshold (Ct) as a predictor factor.

Model: y = ax + b, where y = log10 (copies/μL), x = cycle threshold, a = cycle threshold coefficient and b = constant.

              Coefficient   Robust SE   n     R-squared   Root MSE
Log copies                              38    .900        .454
Ct            -.307         .018
Constant      10.55         .431

Fitted equation: log10AbsQuant = −.307[Ct] + 10.55

SE: Standard Error; MSE: mean squared error. R-squared is an indicator of reliability of the model. Root MSE is an indicator of accuracy of the model.
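The fitted equation in Table 1 can be applied directly to a stored Ct value. The following is a minimal sketch using the coefficient and constant reported in the table; the function and constant names are illustrative and are not taken from the study.

```python
# Coefficients as reported in Table 1 (log10 copies/uL regressed on Ct).
SLOPE = -0.307      # a: cycle threshold coefficient
INTERCEPT = 10.55   # b: constant


def ct_to_log10_copies(ct: float) -> float:
    """Return the calculated viral load as log10 copies/uL."""
    return SLOPE * ct + INTERCEPT


def ct_to_copies_per_ul(ct: float) -> float:
    """Return the calculated viral load (cVL) in copies/uL."""
    return 10 ** ct_to_log10_copies(ct)


if __name__ == "__main__":
    for ct in (20, 24, 30):
        print(f"Ct {ct}: {ct_to_copies_per_ul(ct):.0f} copies/uL")
```

With these coefficients, a Ct of 24 maps to roughly 1.5 × 10³ copies/μL and a Ct of 30 to roughly 20 copies/μL.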

The 38 samples of the evaluation set were used to test the model. To that end, the predictor formula was used to calculate the absolute counts from the Ct of each sample of the evaluation group. All the samples in the evaluation group were then assayed once using dPCR, and the results were compared with those obtained using the predictor formula. The accuracy of the prediction was calculated as the Median Absolute Deviation (MAD) of the percentage error (PE), calculated using the formula PE = |measured − calculated| / measured × 100. The regression formula was then used to calculate the VL in a cohort of 6208 cases. The cVL as well as the raw Ct and other medical data, including gender, age, presence of signs and symptoms, ward/unit of origin, administrative origin, turnaround time (urgency), date of sampling and type of sampling (oro- or nasopharyngeal swabs), were also entered into a logistic regression model to investigate their role as predictors of case-fatality (outcome alive or dead). In particular, age and cVL were evaluated either as continuous variables or as factors after categorisation according to the following: age (< 6 years old; ≥ 6 and < 18; ≥ 18 and < 30; ≥ 30 and < 50; ≥ 50 and < 70; ≥ 70); cVL (< 1 copies/μL; ≥ 1 and < 10¹; ≥ 10¹ and < 10²; ≥ 10² and < 10³; ≥ 10³ and < 10⁴; ≥ 10⁴ and < 10⁵; ≥ 10⁵ and < 10⁶; ≥ 10⁶). Covid-19 case fatality was retrieved from the death registries of the local Public Health Departments. Collinear variables were excluded. The best model was built by entering the predictors in a stepwise approach, retaining those which contributed significantly to the fit of the model in terms of Pseudo R-squared. Alternative models including either categorised cVL, continuous cVL or raw Ct values were also built for comparison purposes.
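As an illustration of the accuracy metric, the sketch below computes the percentage error for each measured/calculated pair and summarises it by its median. It assumes that the "MAD of the PE" reported in the text means the median of the absolute percentage errors; the function names and the example values are hypothetical and are not study data.

```python
from statistics import median


def percentage_error(measured: float, calculated: float) -> float:
    """PE = |measured - calculated| / measured * 100."""
    return abs(measured - calculated) / measured * 100


def median_percentage_error(pairs) -> float:
    """Median of the absolute percentage errors over (measured, calculated) pairs.

    Assumption: this is how the summary statistic in the text is obtained.
    """
    return median(percentage_error(m, c) for m, c in pairs)


if __name__ == "__main__":
    # Invented (dPCR-measured, model-calculated) copies/uL pairs.
    example = [(100.0, 50.0), (200.0, 210.0), (10.0, 13.0)]
    print(median_percentage_error(example))
```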

To that end, the entire cohort was randomly divided into two sets (each approximately 50% of the samples): a model set (n = 3095), used to build predictive logistic regression models with the fewest predictors achieving the best Pseudo R-squared, and an evaluation set (n = 3113), used to evaluate the models’ accuracy. The coefficients and constants of the models were included in predicting equations which calculated the probability of death (S1 File). The diagnostic performances of the models selected were evaluated using receiver operating characteristic (ROC) curve analysis, and optimal thresholds were obtained using the Youden J parameter. The predicted outcomes of the different models were utilised to calculate the respective sensitivity, specificity, positive and negative likelihood ratios, positive and negative predictive values, and overall accuracy. The latter statistical analyses were carried out using Analyse-it software (Analyse-it Software, Ltd, UK).
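The Youden J criterion mentioned above selects the probability cut-off that maximises sensitivity + specificity − 1. The sketch below shows one simple way to scan candidate cut-offs. It is not the Analyse-it procedure: the function name, the rule "predicted death if probability ≥ threshold" and the toy data are assumptions made for illustration.

```python
def youden_optimal_threshold(probs, died):
    """Return (threshold, J, sensitivity, specificity) maximising Youden's J.

    probs: predicted probabilities of death; died: 1 for deceased, 0 for alive.
    Assumes at least one deceased and one surviving subject.
    """
    positives = sum(died)
    negatives = len(died) - positives
    best = None
    for t in sorted(set(probs)):
        # A subject is classified as a predicted death when p >= t (assumption).
        tp = sum(1 for p, d in zip(probs, died) if d and p >= t)
        tn = sum(1 for p, d in zip(probs, died) if not d and p < t)
        sens = tp / positives
        spec = tn / negatives
        j = sens + spec - 1
        if best is None or j > best[1]:
            best = (t, j, sens, spec)
    return best


if __name__ == "__main__":
    # Toy data, not taken from the study.
    probs = [0.05, 0.10, 0.20, 0.40, 0.60, 0.80]
    died = [0, 0, 0, 1, 0, 1]
    print(youden_optimal_threshold(probs, died))
```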

The cVLs from the end of February 2020 until October 2020 were reported using descriptive statistics.

For statistical purposes, the samples positive only at a target different from the N gene were considered positive with 0 copies/μL.

Results

The original Ct values were compared with the Cts of the retested values to exclude the possibility of a degradation of the samples. No evidence of degradation was observed since the Cts were not statistically different between the retested values and the original test values (p = 0.74) (Fig 1). However, in terms of absolute value, a mean difference in the Ct of 1.6 and 1.9 was observed between all the samples (regardless of the primary assay) and only the Seegene samples (comparison restricted to samples initially assayed with the Seegene assay), respectively. The difference was not statistically significant (p = 0.45). This finding was not dependent on the original test used (Fig 1).

Fig 1. Scatter plot A) and Residual plot B) of the original and repeated Ct values.

Colour codes indicate the original testing method.

The dPCR assay performed adequately under the conditions described herein, achieving an LOD of 1.19 copies/μL (Fig 2).

Fig 2. Precision profile analysis of variance for assessing the analytical sensitivity of a digital PCR assay using α and β values of 5% for Limit of Blank (LoB) and Limit of Detection (LOD), respectively.

Using a serial dilution experiment, the dPCR linearity was found to be restricted to samples below 2.3 × 10⁴ copies/μL. Hence, in building the linear regression model, an adequate dynamic range was obtained by diluting those samples with a Ct below 22 at 1:10 (Fig 3).

Fig 3. Linearity fitting of the linear regression model establishing the relationship between the cycle threshold values and the actual viral loads.

A) Linearity fitting plot including the individual plot of replicates (of the same colour) and the fitting line of linearity with the respective confidence bands and individual bands at the 99% level. B) Standardised residual plot: there are only two replicates of two different samples outside the 2 standard deviations.

Finally, precision as a measure of inter-assay repeatability over the entire dynamic range achieved an average CV of 15.3% and a median CV of 4.3% (S1 File).

The absolute quantification obtained by dPCR was regressed on the Ct values, and a linear regression function was derived. The model achieved good reliability and accuracy (Table 1).

The regression equation allowed calculating the absolute quantity of viral genome, expressed as copies/μL, in the evaluation set. Notably, in the evaluation set, the R² value was 0.918, indicating a good linear correlation between the predicted and measured copies/μL values (Fig 4).

Fig 4. A) Scatter plot and B) Residual plot of the findings achieved by the linear regression model in the evaluation set.

VL log: Log of the Viral Load value expressed as copies/μL; cVL log: Log of the calculated Viral Load.

In absolute terms, the error in predicting the cVL, expressed as MAD of the PE, was 53.0%. The error was uniformly distributed across the high, medium and low VL categories although, to some extent, the latter showed higher errors. The complete comparison of measured versus calculated absolute copies/μL in the evaluation set is reported in detail in S1 File.

The linear regression equation was used to calculate the cVL in a cohort of 6208 Covid-19 positive cases diagnosed in the period from 24 February to 30 September 2020 using the Allplex Seegene assay, and the cVL was recorded in the database. There is no unanimous consensus on how to interpret very low VLs. In the present study, those cases positive only for a target gene other than the N gene, which was the gene targeted by dPCR, were considered positive with 0 copies/μL [46]. The characteristics of the cohort are reported in Table 2.

Table 2. Characteristics of the cohort of SARS-CoV-2 positive cases (n = 6208).

Age, median (IQR) 55 (38–74)
Age categories (years) n (%)
 < 6 55 (0.9)
 ≥ 6 < 18 298 (4.8)
 ≥ 18 < 30 728 (11.7)
 ≥ 30 < 50 1478 (23.8)
 ≥ 50 < 70 1755 (28.3)
 ≥ 70 1894 (30.5)
Gender N (%)
 female 3155 (50.8)
 male 3053 (49.2)
Viral load (copies/μL) n (%)
 < 1 2722 (43.8)
 ≥ 1 < 10¹ 957 (15.4)
 ≥ 10¹ < 10² 719 (11.6)
 ≥ 10² < 10³ 619 (10.0)
 ≥ 10³ < 10⁴ 522 (8.4)
 ≥ 10⁴ < 10⁵ 430 (6.9)
 ≥ 10⁵ < 10⁶ 197 (3.2)
 ≥ 10⁶ 42 (0.7)
Turnaround time (days) N (%)
 0 922 (14.1)
 1 4535 (73.9)
 2 708 (11.4)
 ≥ 3 43 (0.7)
Swabs n (%)
 nasopharyngeal 5982 (96.4)
 oropharyngeal 226 (3.6)
Ward/unit n (%)
 Hospital ward 1289 (20.8)
 Emergency ward 755 (12.2)
 Covid Drive-through 890 (14.3)
 Preventive medicine unit 3101 (50.0)
 Intensive care unit 161 (2.6)
 Others 12 (0.2)
Presence of signs/symptoms n (%)
 no 2315 (37.3)
 yes 2210 (35.6)
 Not known 1670 (26.9)
Outcome n (%)
 Deceased 583 (9.4)

IQR: Interquartile range.

The cVL ranged widely, from 0 to more than 5 × 10⁶ copies/μL. The most frequent category, comprising 43.8% of the cases, was that with less than 1 copy/μL (Fig 5).

Fig 5. Pie chart reporting the percentage of each viral load category in the cohort of 6208 Covid-19 cases.

Calculated viral loads are expressed as log10 copies/μL.

By plotting the positive results over time, it could be observed that, in addition to the absolute number of positives, the cVL differed over time, being markedly lower during the summertime. The 90th and 95th percentiles of VL tended to be very low during this period (Fig 6A). Furthermore, a rapid increase in VL, in terms of the higher percentiles, could be observed beginning in mid-August and peaking in September. This anticipated the rapid exponential increase in incidence of the epidemic curve observed in the same geographical area one month later (Fig 6B).

Fig 6. Graph reporting the viral loads (VLs) over time (March 2020 to October 2020) in Italy.

A) histograms of the 90th and 95th percentiles of calculated VL on a monthly basis. B) Scatter plots of calculated VL (pale grey dots and lines) over the same timescale of fluctuations of active cases (black solid line) expressed as the last daily change (difference with respect to the day before) in active cases (Italian National Ministry of Health).
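The monthly percentile summary described for panel A can be sketched as follows. The source does not state which percentile definition was used, so the nearest-rank method below is an assumption, as are the function names and the example records.

```python
from collections import defaultdict


def nearest_rank_percentile(values, pct: int):
    """Nearest-rank percentile: smallest value with at least pct% of data at or below it."""
    ordered = sorted(values)
    # Integer ceiling of pct * n / 100, kept exact by avoiding floating point.
    rank = max(1, -(-pct * len(ordered) // 100))
    return ordered[rank - 1]


def monthly_percentiles(records, pcts=(90, 95)):
    """records: iterable of (month_label, cvl) pairs -> {month: {pct: value}}."""
    by_month = defaultdict(list)
    for month, cvl in records:
        by_month[month].append(cvl)
    return {
        month: {p: nearest_rank_percentile(vals, p) for p in pcts}
        for month, vals in by_month.items()
    }


if __name__ == "__main__":
    # Invented cVL values (copies/uL), for illustration only.
    records = [("2020-07", float(v)) for v in range(1, 21)]
    records += [("2020-09", 10.0), ("2020-09", 1000.0)]
    print(monthly_percentiles(records))
```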

The cVLs were also examined after stratifying the cohort according to age group. Higher VLs, i.e. those above approximately 1500 copies/μL, which account for the majority of the transmission risk, were observed primarily in the elderly, followed by the youngest age group (Fig 7). This threshold was roughly estimated by converting the reported Ct beyond which it is not possible to infect cell cultures using diagnostic samples [26].
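The age-stratified comparison can be sketched as the percentage of cases in each age category whose cVL exceeds the likely infectivity threshold. The age boundaries follow the categories defined in the Methods; the shorthand labels, function names and example cases are hypothetical.

```python
# Upper (exclusive) age bounds and shorthand labels for the age categories.
AGE_BINS = [(6, "<6"), (18, "6-17"), (30, "18-29"), (50, "30-49"), (70, "50-69")]
INFECTIVITY_THRESHOLD = 1500.0  # copies/uL, the rough estimate quoted in the text


def age_category(age: float) -> str:
    for upper, label in AGE_BINS:
        if age < upper:
            return label
    return ">=70"


def percent_above_threshold(cases, threshold: float = INFECTIVITY_THRESHOLD):
    """cases: iterable of (age, cvl) -> {age category: % of cases with cvl > threshold}."""
    totals, above = {}, {}
    for age, cvl in cases:
        cat = age_category(age)
        totals[cat] = totals.get(cat, 0) + 1
        above[cat] = above.get(cat, 0) + (1 if cvl > threshold else 0)
    return {cat: 100 * above[cat] / totals[cat] for cat in totals}


if __name__ == "__main__":
    # Invented cases (age in years, cVL in copies/uL).
    cases = [(3, 2000.0), (4, 10.0), (75, 5000.0), (80, 1600.0), (40, 1.0)]
    print(percent_above_threshold(cases))
```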

Fig 7. Histograms characterising infectivity A) percentage of cases above the likely infectivity threshold (calculated viral load of 1500 copies/μL) divided by age category and B) 75th, 90th and 95th percentile values of viral load divided by age category.

Finally, logistic regression models were used to investigate the effect of the sets of predictors considered in this study, including cVL, categorised cVL and raw Ct to predict the case-fatality outcome and to evaluate their accuracy.

Of all the possible models considered, the best one reached an adjusted R-square of 0.34 (p<0.01) and included categorised cVL, age and gender. A high cVL was associated with increased case-fatality odds. In particular, a cVL > 10³ copies/μL was significantly associated with increased mortality rates and a cVL > 1 × 10⁶ copies/μL was associated with an odds ratio of 9.24 (CI 2.36–36.26; p<0.001) (Table 3). Age (odds ratio 1.11; CI 1.10–1.12; p<0.001) and male gender (odds ratio 1.51; CI 1.14–2.00; p<0.001) were also significantly associated with increased case-fatality odds.

Table 3. A three parameter (age, gender and cVL) logistic regression model to predict case-fatality.

The statistically significant (P<0.01) parameters are marked with an asterisk (*).

Number of observations = 3095
LR chi2 = 698.84
Probability > chi2 = 0.0000
Pseudo R2 = 0.346

Predictor                  Odds Ratio   SE      p        95% Confidence Interval
Viral load (copies/μL)
  < 1                      (reference)
  ≥ 1 < 10¹                1.10         .23     .962     .65–1.58
  ≥ 10¹ < 10²              1.30         .30     .261     .82–2.05
  ≥ 10² < 10³              2.30         .52     .000*    1.47–3.58
  ≥ 10³ < 10⁴              2.63         .65     .000*    1.62–4.27
  ≥ 10⁴ < 10⁵              2.23         .57     .002*    1.35–3.69
  ≥ 10⁵ < 10⁶              4.70         1.53    .000*    2.49–8.89
  ≥ 10⁶                    9.24         6.45    .001*    2.36–36.26
Age                        1.11         .007    .000*    1.10–1.12
Male gender                1.51         .219    .005*    1.14–2.00

SE = Standard Error; LR = likelihood ratio.
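Because odds ratios from a logistic model without interaction terms combine multiplicatively, the values in Table 3 can be used to compare the odds of two patient profiles. The sketch below assumes the three-parameter model has no interactions; the model constant is not given in this section (it is reported in S1 File), so only relative odds, not absolute probabilities, are computed. Names and category labels are illustrative.

```python
# Odds ratios as reported in Table 3; "<1" copies/uL is the reference category.
OR_AGE_PER_YEAR = 1.11
OR_MALE = 1.51
OR_CVL = {
    "<1": 1.0,
    "1 to <1e1": 1.10,
    "1e1 to <1e2": 1.30,
    "1e2 to <1e3": 2.30,
    "1e3 to <1e4": 2.63,
    "1e4 to <1e5": 2.23,
    "1e5 to <1e6": 4.70,
    ">=1e6": 9.24,
}


def relative_odds(age_difference_years: float, male: bool, cvl_category: str) -> float:
    """Odds of death relative to a female of the comparison age with cVL < 1 copy/uL."""
    odds = OR_AGE_PER_YEAR ** age_difference_years
    if male:
        odds *= OR_MALE
    return odds * OR_CVL[cvl_category]


if __name__ == "__main__":
    # A man ten years older, with cVL >= 1e6, versus the reference profile.
    print(round(relative_odds(10, True, ">=1e6"), 1))
```

Under these assumptions, a ten-year age difference alone multiplies the odds by about 2.8 (1.11 raised to the tenth power).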

The diagnostic performances of the models including cVL, categorised cVL and raw Ct were substantially equal, having areas under the curve (AUCs) of 0.889, 0.888 and 0.889, respectively; no statistically significant differences were found at pairwise comparisons. At the optimal threshold, all models achieved very high specificity and low sensitivity. As the models were substantially equivalent, additional analyses were carried out using the model including the cVL parameter. The optimal threshold was found to be 57.1% of the probability of death. Using this setting, the sensitivity, specificity, positive likelihood ratio, negative likelihood ratio, positive predictive value and negative predictive value were 21.7%, 99.0%, 5.70, 0.43, 67.0% and 93.0%, respectively (Fig 8).

Fig 8. Diagnostic performance of the model including the predictors age, gender and calculated viral load.

A) Receiver operating characteristic curve of the model showing an area under the curve of 0.889. B) Scatter plot showing the survivors (blue) and the deceased (red) along with the predicted probability of death calculated by the model. C) the Youden plot highlighting the optimal threshold and the threshold at a fixed sensitivity value of 75%. TPF: True positive fractions (sensitivity). FPF: False positive fractions.

In the evaluation set with a prior probability of case-fatality of 8.74%, the model identified 59 deaths out of 272. The false positive death predictions were 29 with a positive predictive probability of 67%. By fixing the sensitivity threshold at 75.0%, the predictive threshold was found at 2.66%. Using this threshold, a specificity of 83.7% was achieved and the model identified 204/272 case-fatalities; however, 464 false positive predictions occurred with a positive predictive power of 31% (Fig 9). Complete findings of the logistic predictive models are reported in S1 File. Both models performed almost identically regarding predictions (Fig 9).
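The performance figures in this paragraph follow from the four cells of a confusion matrix. The sketch below recomputes them from the counts quoted above (3113 evaluation subjects, 272 deaths, true and false positive predictions at each threshold); the false negative and true negative counts are derived here by subtraction and are not stated in the text. The function name is illustrative.

```python
def diagnostic_metrics(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Standard diagnostic performance measures from confusion-matrix counts."""
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
        "accuracy": (tp + tn) / (tp + fp + fn + tn),
    }


N_EVAL = 3113                   # evaluation set size
DEATHS = 272                    # observed case-fatalities in the evaluation set
SURVIVORS = N_EVAL - DEATHS     # 2841, derived by subtraction

# Optimal (Youden) threshold: 59 deaths identified, 29 false positive predictions.
optimal = diagnostic_metrics(tp=59, fp=29, fn=DEATHS - 59, tn=SURVIVORS - 29)

# Threshold fixed at 75% sensitivity: 204 deaths identified, 464 false positives.
fixed = diagnostic_metrics(tp=204, fp=464, fn=DEATHS - 204, tn=SURVIVORS - 464)

if __name__ == "__main__":
    for name, m in (("optimal", optimal), ("fixed 75% sensitivity", fixed)):
        print(name, {k: round(100 * v, 1) for k, v in m.items()})
```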

Fig 9. Mosaic plot of models including either calculated viral load or raw cycle threshold to classify patients at risk of death.

Both models performed almost identically. A) Using optimal thresholds, both models showed very high specificity but low sensitivity. B) Conversely, at the fixed sensitivity threshold of 75%, the specificity dropped to approximately 84%. Within the mosaic boxes, the numbers of subjects out of the 3113 of the evaluation set are indicated.

Discussion

In this study, a method of quantifying the VL by modelling the relationship between the dPCR and RT-qPCR results was established. To the best of the Authors’ knowledge, this is the first study of this kind, although, while this paper was under review, another study was published suggesting the same approach for underpinning investigations regarding SARS-CoV-2 biology [19]. The Authors further extended the approach by challenging the regression model in an evaluation set of data. The median difference between the calculated and the measured VL was 53.0%. Since a 100% efficient PCR doubles the target every cycle, the 53.0% median error corresponds to a Ct value of less than 0.5. As the cVL spanned over 6 orders of magnitude, from 0 to more than 5 million genomes/μL, the reported error could be considered almost negligible. Therefore, the model, having a known specified error, was used to calculate the VL in a cohort of 6208 cases diagnosed with Covid-19.

Many studies have investigated VL in Covid-19 patients. Regrettably, the majority of them used an RT-qPCR assay originally intended as a qualitative, not a quantitative, assay; as a matter of fact, the Ct values of the diagnostic RT-qPCR in large cohorts of cases were used either directly as a rough estimate of the amount of virus or, in a minority of cases, were converted into log10 copies/mL using an RT-qPCR calibration curve carried out once [47, 48]. However, in these studies, no detailed methods of Ct conversion were reported, nor were the measured errors provided. A plethora of factors can affect the accuracy of absolute quantification by RT-qPCR using calibration curves. These concerns have recently been addressed together with strategies for improving absolute quantification [49, 50]. One of the greatest concerns is that the efficiency of RT-qPCR may vary, biasing the accuracy and, specifically, the reproducibility of the calibration curves which, in turn, propagates the error. For the above-mentioned reasons, a direct comparison with the study reported herein could not be made, and the findings of these studies should be regarded as an error-prone approximate estimation of the VL. Although hampered by the above-mentioned drawbacks, some of these studies established an association between VL and case-fatality while others did not; however, even in the former case, they did not provide a model for precisely quantifying the risk [4, 48, 51–54]. On the other hand, far fewer studies have quantified VL as copies/volume (of reaction or of sample) using dPCR. Remarkably, as indirect evidence of the robustness and reliability of the calculation approach, the copies/reaction measured in convalescent patients would be very similar if calculated using the model described here from the high Ct values obtained in RT-qPCR [23]. Overall, the cVL range inferred with the regression model matches that directly measured from nasopharyngeal swabs in other studies [32]. However, to the best of the Authors’ knowledge, no studies have quantified VL using diagnostic swabs and correlated it to outcome. The mathematical approach described herein has several advantages: 1) findings can be compared between studies; 2) findings can be included in a meta-analysis; and 3) different diagnostic RT-qPCR results expressed in terms of Ct within the same laboratory can be included in the same dataset, which increases the possibility of addressing questions of SARS-CoV-2 biology [19]. This latter advantage is noteworthy since it has the potential to allow comparing the VL obtained using different RT-qPCR platforms in the same laboratory, or even in different laboratories worldwide, provided that the linear regression equation is defined using the respective Ct data.

Overall, nearly half of the cases had less than 1 copy/μL. This quantity is very close to the detection limit of RT-qPCR. It is beyond the Authors’ aims to investigate whether these were false positives in technical terms or true positives carrying only free nucleic acid and no viral particles. Moreover, the majority of studies which used viral isolation in cell cultures to estimate infectivity identified 24 Ct as the threshold beyond which the likelihood of isolating SARS-CoV-2 from nasopharyngeal swabs drops abruptly, and 30–33 Ct as the threshold beyond which it is not possible to isolate the virus [26, 55, 56]. In terms of cVL according to our regression model, it could be roughly estimated that approximately 1500 copies/μL represents the limit below which infectivity drops, and 20 copies/μL the limit below which it is almost impossible to isolate the virus. These data are in agreement with those reported in a small case series including the absolute quantification of VL using RT-qPCR [57]. That study established a limit of 10⁶ copies/mL (corresponding to 10³ copies/μL in the present study) for successfully achieving virus isolation. In the present cohort, fewer than 18% (17.4%) of the samples had a cVL > 1500 copies/μL and only 37.2% had a cVL > 20 copies/μL (Fig 5). The VL may also depend on the different times of diagnosis. Unfortunately, this represents an inherent limitation of the present study due to its retrospective nature.

With regard to the fluctuation of the VL, Clementi et al. (2020) reported low VLs during the summer period in Italy after the first epidemic wave had hit the country in the previous spring. Lower VLs were associated with fewer Covid-19 cases. The present study confirmed and extended these observations. After the first public health measures were eased in mid-May, the incidence continued to decline, reaching its lowest rate at the end of July 2020; similarly, the active cases remained at very low levels until the end of September 2020, when an exponential rise in active cases was observed. Interestingly, this study confirmed that almost all the cases diagnosed during the summertime had very low VLs. This evidence was corroborated by the 95th and 99th percentiles of VL data, which were below the likely threshold of infectivity in May and June, started to rise moderately in July and peaked in August. The peak of cases with high VLs was followed one month later by the exponential rise in incidence and active cases. This finding suggests that an increase in VL should be considered as an early predictor of worsening epidemiologic parameters, useful for tightening public health measures while minimising the economic impact of the restrictions [58–62].

The VL data were also applied to age groups to more specifically investigate the role of childhood in SARS-CoV-2 transmission. Although children are relatively spared from the severe forms of Covid-19, their possible role in transmission should be considered when implementing radical measures, such as school closure [3]. The Authors found that, in addition to the elderly, preschool children (0–6 years old) had the highest VL in both the higher percentiles and in the percentage of cases above the aforementioned threshold of infectivity (Fig 7). Conversely, school children (>6 and <18 years old) were those with the lowest VLs. To the best of the Authors’ knowledge, the largest cohort study to date has shown a moderate trend to higher VLs with increasing age categories [63, 64]. However, it should be emphasised that, in this study, the percentiles above the threshold considered as the limit of infectivity were similar among all age categories. This latter parameter is likely more representative of the weight of each age category as a transmitter.

Strong evidence exists that the absolute quantification of circulating VL is an independent and strong predictor of fatality [34–38]. However, this approach requires invasive blood sampling, which is carried out solely in hospital settings with prognostic aims, while its diagnostic value is limited. Herein, the predictive power of cVL and of other simple, readily available signalment data at the moment of Covid-19 testing was evaluated. The regression model found cVL to be an independent predictor of case-fatality after correcting for gender and age. In particular, a VL above the threshold of 106 copies/μL was strongly associated with negative outcomes. Similar findings have also been reported by others [48]; however, in the present study, the odds ratios were additionally refined using different levels of cVL, making the VL readily interpretable. The other independent predictors of a negative outcome were male gender and older age. These predictors were almost invariably found in all the studies and meta-analyses carried out in hospital settings [48, 65–67], with or without considering VLs. When the model including the three predictors was used to predict the outcome in the evaluation set of cases, it was notably able to specifically detect those cases having a high probability of survival. For instance, using the optimal threshold, the model identified 3024 out of 3113 subjects who were predicted to survive the SARS-CoV-2 infection with a probability of 93% (negative predictive value). Of the 89 subjects predicted to die, 59 eventually died; hence, the model showed a 66.3% positive predictive value. Conversely, if higher sensitivity was required of the model, for instance 75%, the model identified 204 out of 272 case-fatalities, although 465 false positive cases were also found.
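The performance figures quoted above can be reproduced from the reported counts alone. The sketch below is illustrative and not the Authors' code: the class and function names are hypothetical, and the full confusion matrices are derived by arithmetic under the assumption that the 272 case-fatalities are all the deaths in the evaluation set of 3113 subjects.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    tp: int  # predicted to die, died
    fp: int  # predicted to die, survived
    fn: int  # predicted to survive, died
    tn: int  # predicted to survive, survived

    @property
    def sensitivity(self) -> float:
        return self.tp / (self.tp + self.fn)

    @property
    def specificity(self) -> float:
        return self.tn / (self.tn + self.fp)

    @property
    def ppv(self) -> float:
        return self.tp / (self.tp + self.fp)

    @property
    def npv(self) -> float:
        return self.tn / (self.tn + self.fn)


def from_reported(n_total: int, n_deaths: int,
                  n_predicted_deaths: int, n_true_positive: int) -> ConfusionCounts:
    """Derive the full 2x2 table from the four counts given in the text."""
    fp = n_predicted_deaths - n_true_positive
    fn = n_deaths - n_true_positive
    tn = n_total - n_deaths - fp
    return ConfusionCounts(tp=n_true_positive, fp=fp, fn=fn, tn=tn)


# Optimal threshold: 89 predicted deaths, of whom 59 died
optimal = from_reported(n_total=3113, n_deaths=272,
                        n_predicted_deaths=89, n_true_positive=59)

# Threshold tuned for 75% sensitivity: 204 of 272 deaths found, 465 false positives
high_sensitivity = from_reported(n_total=3113, n_deaths=272,
                                 n_predicted_deaths=204 + 465, n_true_positive=204)

if __name__ == "__main__":
    for name, c in (("optimal", optimal), ("75% sensitivity", high_sensitivity)):
        print(f"{name}: sens={c.sensitivity:.1%} spec={c.specificity:.1%} "
              f"PPV={c.ppv:.1%} NPV={c.npv:.1%}")
```

Under these assumptions, the optimal threshold leaves 213 deaths among the 3024 subjects predicted to survive, which is where the 93% negative predictive value comes from, whereas the second threshold trades 465 false positives for the detection of three quarters of the case-fatalities.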

This evidence may have relevant implications for public health: the tool could allow public health institutions to identify patients at risk of death at the moment of diagnosis, and thus to allocate finite health resources efficiently by focusing medical monitoring on high-risk patients at an early stage and at a very low additional cost. Furthermore, the threshold may be set at different levels based upon specific aims and guidelines in order to focus public health resources on those Covid-19 cases at risk of developing severe disease.

Many reviews and meta-analyses have investigated the risk factors associated with death. However, the majority of them examined differently selected cohorts of patients, i.e., those hospitalised, those with specific comorbidities, those coming from specific wards and those presenting specific markers. All these studies were suitable for stratifying patients within specific settings and for allocating resources in an evidence-based manner. It should be noted that this was the first study aimed at addressing the role of VL as an independent variable in a cohort of diagnosed cases from the Diagnostic Laboratory. This evidence could be important for the early stratification of cases, thus focusing medical surveillance on patients at a high risk of developing severe forms of Covid-19 and allocating resources efficiently. Interestingly, this result could be attained computationally by simply including the signalment and anamnestic data already available at the moment of diagnosis along with the cVL data. Unfortunately, data regarding the presence or absence of signs and symptoms were largely incomplete in the database; therefore, they could not be included in the model. It is very likely that this information would have allowed better stratification of the sample. Public health laboratories should be aware of this and improve the exchange of information between all the players engaged in the network of diagnosing and treating Covid-19.

Supporting information

S1 File. Supplementary materials & methods.

(DOCX)

S2 File. Complete dataset.

The Excel file includes two spreadsheets providing both model group and test group. In the latter, the predictive formulas are embedded in the spreadsheet.

(XLSX)

Data Availability

All relevant data are within the manuscript and its S1 and S2 Files.

Funding Statement

This work was supported by the University of Bologna [grant Sambri287] and AUSL Romagna [grant COVdPCR].

References

  • 1. World Health Organisation. Coronavirus Disease 2019 (COVID-19): Situation Report-51 [Internet]. [cited 2020 Mar 18]. https://www.who.int/docs/default-source/coronaviruse/situation-reports/20200311-sitrep-51-covid-19.pdf
  • 2. He X, Lau EHY, Wu P, Deng X, Wang J, Hao X, et al. Temporal dynamics in viral shedding and transmissibility of COVID-19. Nat Med. 2020 May;26(5):672–675. doi: 10.1038/s41591-020-0869-5
  • 3. Merckx J, Labrecque JA, Kaufman JS. Transmission of SARS-CoV-2 by Children. Dtsch Arztebl Int. 2020;117: 553–560. doi: 10.3238/arztebl.2020.0553
  • 4. Shlomai A, Ben-Zvi H, Glusman Bendersky A, Shafran N, Goldberg E, Sklan EH. Nasopharyngeal viral load predicts hypoxemia and disease outcome in admitted COVID-19 patients. Crit Care. 2020;24: 539. doi: 10.1186/s13054-020-03244-3
  • 5. To KK, Tsang OT, Leung WS, Tam AR, Wu TC, Lung DC, et al. Temporal profiles of viral load in posterior oropharyngeal saliva samples and serum antibody responses during infection by SARS-CoV-2: an observational cohort study. Lancet Infect Dis. 2020;20: 565–574. doi: 10.1016/S1473-3099(20)30196-1
  • 6. Walsh KA, Jordan K, Clyne B, Rohde D, Drummond L, Byrne P, et al. SARS-CoV-2 detection, viral load and infectivity over the course of an infection. J Infect. 2020;81: 357–371. doi: 10.1016/j.jinf.2020.06.067
  • 7. Widders A, Broom A, Broom J. SARS-CoV-2: The viral shedding vs infectivity dilemma. J Infect Dis Health. 2020;25: 210–215. doi: 10.1016/j.idh.2020.05.002
  • 8. Zheng S, Fan J, Yu F, Feng B, Lou B, Zou Q, et al. Viral load dynamics and disease severity in patients infected with SARS-CoV-2 in Zhejiang province, China, January-March 2020: retrospective cohort study. BMJ. 2020;369: m1443. doi: 10.1136/bmj.m1443
  • 9. Zou L, Ruan F, Huang M, Liang L, Huang H, Hong Z, et al. SARS-CoV-2 Viral Load in Upper Respiratory Specimens of Infected Patients. N Engl J Med. 2020;382: 1177–1179. doi: 10.1056/NEJMc2001737
  • 10. Vogels CBF, Brito AF, Wyllie AL, Fauver JR, Ott IM, Kalinich CC, et al. Analytical sensitivity and efficiency comparisons of SARS-CoV-2 RT-qPCR primer-probe sets. Nat Microbiol. 2020;5: 1299–1305. doi: 10.1038/s41564-020-0761-6
  • 11. Azzi L, Carcano G, Gianfagna F, Grossi P, Gasperina DD, Genoni A, et al. Saliva is a reliable tool to detect SARS-CoV-2. J Infect. 2020;81: e45–e50. doi: 10.1016/j.jinf.2020.04.005
  • 12. Clementi N, Ferrarese R, Tonelli M, Amato V, Racca S, Locatelli M, et al. Lower nasopharyngeal viral load during the latest phase of COVID-19 pandemic in a Northern Italy University Hospital. Clin Chem Lab Med. 2020;58: 1573–1577. doi: 10.1515/cclm-2020-0815
  • 13. Hu X, Zhu L, Luo Y, Zhao Q, Tan C, Chen X, et al. Evaluation of the clinical performance of single-, dual-, and triple-target SARS-CoV-2 RT-qPCR methods. Clin Chim Acta. 2020;511: 143–148. doi: 10.1016/j.cca.2020.10.008
  • 14. Lim J, Jeon S, Shin HY, Kim MJ, Seong YM, Lee WJ, et al. Case of the Index Patient Who Caused Tertiary Transmission of COVID-19 Infection in Korea: the Application of Lopinavir/Ritonavir for the Treatment of COVID-19 Infected Pneumonia Monitored by Quantitative RT-PCR. J Korean Med Sci. 2020;35: e79.
  • 15. Wang W, Xu Y, Gao R, Lu R, Han K, Wu G, et al. Detection of SARS-CoV-2 in Different Types of Clinical Specimens. JAMA. 2020;323: 1843–1844. doi: 10.1001/jama.2020.3786
  • 16. Zhu Z, Lu Z, Xu T, Chen C, Yang G, Zha T, et al. Arbidol monotherapy is superior to lopinavir/ritonavir in treating COVID-19. J Infect. 2020;81: e21–e23. doi: 10.1016/j.jinf.2020.03.060
  • 17. Morley AA. Digital PCR: A brief history. Biomol Detect Quantif. 2014;1: 1–2. doi: 10.1016/j.bdq.2014.06.001
  • 18. Salipante SJ, Jerome KR. Digital PCR-An Emerging Technology with Broad Applications in Microbiology. Clin Chem. 2020;66: 117–123. doi: 10.1373/clinchem.2019.304048
  • 19. Kinloch NN, Ritchie G, Dong W, Cobarrubias KD, Sudderuddin H, et al. SARS-CoV-2 RNA Quantification Using Droplet Digital RT-PCR. J Mol Diagn. 2021;23: 907–919. doi: 10.1016/j.jmoldx.2021.04.014
  • 20. Cassinari K, Alessandri-Gradt E, Chambon P, Charbonnier F, Gracias S, Beaussire L, et al. Assessment of multiplex digital droplet RT-PCR as a diagnostic tool for SARS-CoV-2 detection in nasopharyngeal swabs and saliva samples. Clin Chem. 2020 Dec 17: hvaa323. doi: 10.1093/clinchem/hvaa323
  • 21. Deiana M, Mori A, Piubelli C, Scarso S, Favarato M, Pomari E. Assessment of the direct quantitation of SARS-CoV-2 by droplet digital PCR. Sci Rep. 2020;10: 18764. doi: 10.1038/s41598-020-75958-x
  • 22. de Kock R, Baselmans M, Scharnhorst V, Deiman B. Sensitive detection and quantification of SARS-CoV-2 by multiplex droplet digital RT-PCR. Eur J Clin Microbiol Infect Dis. 2020 Oct 26: 1–7. doi: 10.1007/s10096-020-04076-3
  • 23. Dong L, Zhou J, Niu C, Wang Q, Pan Y, Sheng S, et al. Highly accurate and sensitive diagnostic detection of SARS-CoV-2 by digital PCR. Talanta. 2021 Mar 1;224: 121726. doi: 10.1016/j.talanta.2020.121726
  • 24. Duong K, Ou J, Li Z, Lv Z, Dong H, Hu T, et al. Increased sensitivity using real-time dPCR for detection of SARS-CoV-2. Biotechniques. 2020 Nov 23. doi: 10.2144/btn-2020-0133
  • 25. Falzone L, Musso N, Gattuso G, Bongiorno D, Palermo CI, Scalia G, et al. Sensitivity assessment of droplet digital PCR for SARS-CoV-2 detection. Int J Mol Med. 2020;46: 957–964. doi: 10.3892/ijmm.2020.4673
  • 26. Gniazdowski V, Morris CP, Wohl S, Mehoke T, Ramakrishnan S, Thielen P, et al. Repeat COVID-19 Molecular Testing: Correlation of SARS-CoV-2 Culture with Molecular Assays and Cycle Thresholds. Clin Infect Dis. 2020 Oct 27: ciaa1616. doi: 10.1093/cid/ciaa1616
  • 27. Liu C, Shi Q, Peng M, Lu R, Li H, Cai Y, et al. Evaluation of droplet digital PCR for quantification of SARS-CoV-2 Virus in discharged COVID-19 patients. Aging (Albany NY). 2020;12: 20997–21003. doi: 10.18632/aging.104020
  • 28. Lv J, Yang J, Xue J, Zhu P, Liu L, Li S. Detection of SARS-CoV-2 RNA residue on object surfaces in nucleic acid testing laboratory using droplet digital PCR. Sci Total Environ. 2020 Nov 10;742: 140370. doi: 10.1016/j.scitotenv.2020.140370
  • 29. Suo T, Liu X, Feng J, Guo M, Hu W, Guo D, et al. ddPCR: a more accurate tool for SARS-CoV-2 detection in low viral load specimens. Emerg Microbes Infect. 2020;9: 1259–1268. doi: 10.1080/22221751.2020.1772678
  • 30. Ternovoi VA, Lutkovsky RY, Ponomareva EP, Gladysheva AV, Chub EV, Tupota NL, et al. Detection of SARS-CoV-2 RNA in nasopharyngeal swabs from COVID-19 patients and asymptomatic cases of infection by real-time and digital PCR. Klin Lab Diagn. 2020;65: 785–792. doi: 10.18821/0869-2084-2020-65-12-785-792
  • 31. Vasudevan H, Xu P, Servellita V, Miller S, Liu L, Gopez A, et al. Digital droplet PCR accurately quantifies SARS-CoV-2 viral load from crude lysate without nucleic acid purification. medRxiv. 2020.09.02.20186023. [posted 2020 Sep 3]. Available from: doi: 10.1101/2020.09.02.20186023
  • 32. Li L, Tan C, Zeng J, Luo C, Hu S, Peng Y, et al. Analysis of viral load in different specimen types and serum antibody levels of COVID-19 patients. J Transl Med. 2021;19: 30. doi: 10.1186/s12967-020-02693-2
  • 33. Yu F, Yan L, Wang N, Yang S, Wang L, Tang Y, et al. Quantitative Detection and Viral Load Analysis of SARS-CoV-2 in Infected Patients. Clin Infect Dis. 2020;71: 793–798. doi: 10.1093/cid/ciaa345
  • 34. Bermejo-Martin JF, González-Rivera M, Almansa R, Micheloud D, Tedim AP, Domínguez-Gil M, et al. Viral RNA load in plasma is associated with critical illness and a dysregulated host response in COVID-19. Crit Care. 2020;24: 691. doi: 10.1186/s13054-020-03398-0
  • 35. Chen L, Wang G, Long X, Hou H, Wei J, Cao Y, et al. Dynamics of Blood Viral Load Is Strongly Associated with Clinical Outcomes in Coronavirus Disease 2019 (COVID-19) Patients: A Prospective Cohort Study. J Mol Diagn. 2021;23: 10–18. doi: 10.1016/j.jmoldx.2020.10.007
  • 36. Ram-Mohan N, Kim D, Zudock EJ, Hashemi MM, Tjandra KC, Rogers AJ, et al. SARS-CoV-2 RNAaemia predicts clinical deterioration and extrapulmonary complications from COVID-19. medRxiv. 2020.12.19.20248561. [posted 2020 Dec 22]. Available from: doi: 10.1101/2020.12.19.20248561
  • 37. Szwebel TA, Veyer D, Robillard N, Eshagh D, Canoui E, Bruneau T, et al. Usefulness of Plasma SARS-CoV-2 RNA Quantification by Droplet-based Digital PCR to Monitor Treatment Against COVID-19 in a B-cell Lymphoma Patient. Stem Cell Rev Rep. 2021 Jan 5: 1–4. doi: 10.1007/s12015-020-10107-5
  • 38. Veyer D, Kernéis S, Poulet G, Wack M, Robillard N, Taly V, et al. Highly sensitive quantification of plasma SARS-CoV-2 RNA sheds light on its potential clinical value. Clin Infect Dis. 2020 Aug 17: ciaa1196. doi: 10.1093/cid/ciaa1196
  • 39. Jiang Y, Wang H, Hao S, Chen Y, He J, Liu Y, et al. Digital PCR is a sensitive new technique for SARS-CoV-2 detection in clinical applications. Clin Chim Acta. 2020;511: 346–351. doi: 10.1016/j.cca.2020.10.032
  • 40. Park C, Lee J, Hassan ZU, Ku KB, Kim SJ, Kim HG, et al. Comparison of digital PCR and quantitative PCR with various SARS-CoV-2 primer-probe sets. J Microbiol Biotechnol. 2020 Dec 25. doi: 10.4014/jmb.2009.09006
  • 41. Hamouda M, Mustafa F, Maraqa M, Rizvi T, Aly Hassan A. Wastewater surveillance for SARS-CoV-2: Lessons learnt from recent studies to define future applications. Sci Total Environ. 2020;759: 143493. doi: 10.1016/j.scitotenv.2020.143493
  • 42. Ahmed W, Bertsch PM, Angel N, Bibby K, Bivins A, Dierens L, et al. Detection of SARS-CoV-2 RNA in commercial passenger aircraft and cruise ship wastewater: a surveillance tool for assessing the presence of COVID-19 infected travellers. J Travel Med. 2020;27: taaa116. doi: 10.1093/jtm/taaa116
  • 43. Chang L, Yan Y, Zhao L, Hu G, Deng L, Su D, et al. No evidence of SARS-CoV-2 RNA among blood donors: A multicenter study in Hubei, China. Transfusion. 2020;60: 2038–2046. doi: 10.1111/trf.15943
  • 44. Fung B, Gopez A, Servellita V, Arevalo S, Ho C, Deucher A, et al. Direct Comparison of SARS-CoV-2 Analytical Limits of Detection across Seven Molecular Assays. J Clin Microbiol. 2020;58: e01535–20. doi: 10.1128/JCM.01535-20
  • 45. Zhou H, Liu D, Ma L, Ma T, Xu T, Ren L, et al. A SARS-CoV-2 Reference Standard Quantified by Multiple Digital PCR Platforms for Quality Assessment of Molecular Tests. Anal Chem. 2020 Dec 8: acs.analchem.0c03996. doi: 10.1021/acs.analchem.0c03996
  • 46. Sung H, Roh KH, Hong KH, Seong MW, Ryoo N, Kim HS, et al. COVID-19 Molecular Testing in Korea: Practical Essentials and Answers From Experts Based on Experiences of Emergency Use Authorization Assays. Ann Lab Med. 2020;40: 439–447. doi: 10.3343/alm.2020.40.6.439. Erratum in: Ann Lab Med. 2021;41: 126–127.
  • 47. Kleiboeker S, Cowden S, Grantham J, Nutt J, Tyler A, Berg A, et al. SARS-CoV-2 viral load assessment in respiratory samples. J Clin Virol. 2020;129: 104439. doi: 10.1016/j.jcv.2020.104439
  • 48. Yazdanpanah Y; French COVID cohort investigators and study group. Impact on disease mortality of clinical, biological, and virological characteristics at hospital admission and overtime in COVID-19 patients. J Med Virol. 2020 Oct 15: 10.1002/jmv.26601. doi: 10.1002/jmv.26601
  • 49. Tellinghuisen J, Spiess AN. Comparing real-time quantitative polymerase chain reaction analysis methods for precision, linearity, and accuracy of estimating amplification efficiency. Anal Biochem. 2014;449: 76–82. doi: 10.1016/j.ab.2013.12.020
  • 50. Tellinghuisen J, Spiess AN. Bias and imprecision in analysis of real-time quantitative polymerase chain reaction data. Anal Chem. 2015;87: 8925–8931. doi: 10.1021/acs.analchem.5b02057
  • 51. Bryan A, Fink SL, Gattuso MA, Pepper G, Chaudhary A, Wener MH, et al. SARS-CoV-2 Viral Load on Admission Is Associated With 30-Day Mortality. Open Forum Infect Dis. 2020;7: ofaa535. doi: 10.1093/ofid/ofaa535
  • 52. Carrasquer A, Peiró ÓM, Sanchez-Gimenez R, Lal-Trehan N, Del-Moral-Ronda V, Bonet G, et al. Lack of Association of Initial Viral Load in SARS-CoV-2 Patients with In-Hospital Mortality. Am J Trop Med Hyg. 2020 Dec 23. doi: 10.4269/ajtmh.20-1427
  • 53. Faíco-Filho KS, Passarelli VC, Bellei N. Is Higher Viral Load in SARS-CoV-2 Associated with Death? Am J Trop Med Hyg. 2020;103: 2019–2021. doi: 10.4269/ajtmh.20-0954
  • 54. Magleby R, Westblade LF, Trzebucki A, Simon MS, Rajan M, Park J, et al. Impact of SARS-CoV-2 Viral Load on Risk of Intubation and Mortality Among Hospitalized Patients with Coronavirus Disease 2019. Clin Infect Dis. 2020 Jun 30: ciaa851. doi: 10.1093/cid/ciaa851
  • 55. La Scola B, Le Bideau M, Andreani J, Hoang VT, Grimaldier C, Colson P, et al. Viral RNA load as determined by cell culture as a management tool for discharge of SARS-CoV-2 patients from infectious disease wards. Eur J Clin Microbiol Infect Dis. 2020;39: 1059–1061. doi: 10.1007/s10096-020-03913-9
  • 56. Sohn Y, Jeong SJ, Chung WS, Hyun JH, Baek YJ, Cho Y, et al. Assessing Viral Shedding and Infectivity of Asymptomatic or Mildly Symptomatic Patients with COVID-19 in a Later Phase. J Clin Med. 2020;9: 2924. doi: 10.3390/jcm9092924
  • 57. Wölfel R, Corman VM, Guggemos W, Seilmaier M, Zange S, Müller MA, et al. Virological assessment of hospitalized patients with COVID-2019. Nature. 2020;581: 465–469. doi: 10.1038/s41586-020-2196-x
  • 58. Davies NG, Barnard RC, Jarvis CI, Russell TW, Semple MG, Jit M, et al.; Centre for Mathematical Modelling of Infectious Diseases COVID-19 Working Group; ISARIC4C investigators. Association of tiered restrictions and a second lockdown with COVID-19 deaths and hospital admissions in England: a modelling study. Lancet Infect Dis. 2020 Dec 23: S1473-3099(20)30984-1. doi: 10.1016/S1473-3099(20)30984-1
  • 59. Gatto M, Bertuzzo E, Mari L, Miccoli S, Carraro L, Casagrandi R, et al. Spread and dynamics of the COVID-19 epidemic in Italy: Effects of emergency containment measures. Proc Natl Acad Sci U S A. 2020;117: 10484–10491. doi: 10.1073/pnas.2004978117
  • 60. Guzzetta G, Riccardo F, Marziano V, Poletti P, Trentini F, Bella A, et al.; COVID-19 Working Group, Brusaferro S, Rezza G, Pezzotti P, Ajelli M, Merler S. Impact of a Nationwide Lockdown on SARS-CoV-2 Transmissibility, Italy. Emerg Infect Dis. 2021;27: 267–270. doi: 10.3201/eid2701.202114
  • 61. Inglesby TV. Public Health Measures and the Reproduction Number of SARS-CoV-2. JAMA. 2020;323: 2186–2187. doi: 10.1001/jama.2020.7878
  • 62. Nussbaumer-Streit B, Mayr V, Dobrescu AI, Chapman A, Persad E, Klerings I, et al. Quarantine alone or in combination with other public health measures to control COVID-19: a rapid review. Cochrane Database Syst Rev. 2020;4: CD013574. doi: 10.1002/14651858.CD013574
  • 63. Jones TC, Mühlemann B, Veith T, Biele G, Zuchowski M, Hofmann J, et al. An analysis of SARS-CoV-2 viral load by patient age. medRxiv. [posted 2020 June 9]. Available from: 10.1101/2020.06.08.2012548
  • 64. Held L. A discussion and reanalysis of the results reported in Jones et al. https://osf.io/bkuar/2020.
  • 65. Chidambaram V, Tun NL, Haque WZ, Majella MG, Sivakumar RK, Kumar A, et al. Factors associated with disease severity and mortality among patients with COVID-19: A systematic review and meta-analysis. PLoS One. 2020;15: e0241541. doi: 10.1371/journal.pone.0241541
  • 66. Palaiodimos L, Kokkinidis DG, Li W, Karamanis D, Ognibene J, Arora S, et al. Severe obesity, increasing age and male sex are independently associated with worse in-hospital outcomes, and higher in-hospital mortality, in a cohort of patients with COVID-19 in the Bronx, New York. Metabolism. 2020;108: 154262. doi: 10.1016/j.metabol.2020.154262
  • 67. Tian W, Jiang W, Yao J, Nicholson CJ, Li RH, Sigurslid HH, et al. Predictors of mortality in hospitalized COVID-19 patients: A systematic review and meta-analysis. J Med Virol. 2020;92: 1875–1883. doi: 10.1002/jmv.26050

Decision Letter 0

Jean-Luc EPH Darlix

14 Jul 2021

PONE-D-21-12462

Modelling RT-qPCR cycle-threshold using digital PCR data for implementing SARS-CoV-2 viral load studies

PLOS ONE

Dear Dr. Gentilini,

Thank you for submitting your manuscript to PLOS ONE. After careful consideration, we feel that it has merit but does not fully meet PLOS ONE’s publication criteria as it currently stands. Therefore, we invite you to submit a properly revised version of the manuscript that addresses the points raised during the review process.

Notably, and in line with the comments of the expert reviewer, the following points need to be addressed:

#1 Most importantly, a careful and appropriate statistical analysis.

#2 Appropriate quantitative RT-PCR analyses serving as a standard curve based on effective copy numbers of the viral RNA. Based on such a standard, real VL (viral load) numbers can be provided.

#3 There are some doubts about the consistency and reproducibility of the data; this is of the utmost importance (see #5 of the reviewer). This section is in need of clarification.

Please submit your revised manuscript by Aug 28 2021 11:59PM. If you will need more time than this to complete your revisions, please reply to this message or contact the journal office at plosone@plos.org. When you're ready to submit your revision, log on to https://www.editorialmanager.com/pone/ and select the 'Submissions Needing Revision' folder to locate your manuscript file.

Please include the following items when submitting your revised manuscript:

  • A rebuttal letter that responds to each point raised by the academic editor and reviewer(s). You should upload this letter as a separate file labeled 'Response to Reviewers'.

  • A marked-up copy of your manuscript that highlights changes made to the original version. You should upload this as a separate file labeled 'Revised Manuscript with Track Changes'.

  • An unmarked version of your revised paper without tracked changes. You should upload this as a separate file labeled 'Manuscript'.

If you would like to make changes to your financial disclosure, please include your updated statement in your cover letter. Guidelines for resubmitting your figure files are available below the reviewer comments at the end of this letter.

If applicable, we recommend that you deposit your laboratory protocols in protocols.io to enhance the reproducibility of your results. Protocols.io assigns your protocol its own identifier (DOI) so that it can be cited independently in the future. For instructions see: http://journals.plos.org/plosone/s/submission-guidelines#loc-laboratory-protocols. Additionally, PLOS ONE offers an option for publishing peer-reviewed Lab Protocol articles, which describe protocols hosted on protocols.io. Read more information on sharing protocols at https://plos.org/protocols?utm_medium=editorial-email&utm_source=authorletters&utm_campaign=protocols.

We look forward to receiving your revised manuscript.

Kind regards,

Jean-Luc EPH Darlix, MG, Ph.D.

Academic Editor

PLOS ONE

Journal Requirements:

When submitting your revision, we need you to address these additional requirements.

1. Please ensure that your manuscript meets PLOS ONE's style requirements, including those for file naming. The PLOS ONE style templates can be found at https://journals.plos.org/plosone/s/file?id=wjVg/PLOSOne_formatting_sample_main_body.pdf and https://journals.plos.org/plosone/s/file?id=ba62/PLOSOne_formatting_sample_title_authors_affiliations.pdf

2. Please list all of the different RT-qPCR assays used in the experiment.

3. Please clarify if the biological samples used in your study were:

(1) from an established biobank (if so please provide the name and a link)

(2) specifically collected for this study or not

(3) collected through a medically prescribed test

(4) completely de-identified before researchers accessed the samples

4. In your ethics statement in the Methods section and in the online submission form, please provide additional information about the cohort used in your study. Specifically, please ensure that you have discussed whether all data were fully anonymized before you accessed them and/or whether the IRB or ethics committee waived the requirement for informed consent. If patients provided informed written consent to have data from their medical records used in research, please include this information.

5. We note that the grant information you provided in the ‘Funding Information’ and ‘Financial Disclosure’ sections do not match. 

When you resubmit, please ensure that you provide the correct grant numbers for the awards you received for your study in the ‘Funding Information’ section.

6. We note that you have included the phrase “data not shown” in your manuscript. Unfortunately, this does not meet our data sharing requirements. PLOS does not permit references to inaccessible data. We require that authors provide all relevant data within the paper, Supporting Information files, or in an acceptable, public repository. Please add a citation to support this phrase or upload the data that corresponds with these findings to a stable repository (such as Figshare or Dryad) and provide and URLs, DOIs, or accession numbers that may be used to access these data. Or, if the data are not a core part of the research being presented in your study, we ask that you remove the phrase that refers to these data.

7. Please include your full ethics statement in the ‘Methods’ section of your manuscript file. In your statement, please include the full name of the IRB or ethics committee who approved or waived your study, as well as whether or not you obtained informed written or verbal consent. If consent was waived for your study, please include this information in your statement as well. 

8. Please include captions for your Supporting Information files at the end of your manuscript, and update any in-text citations to match accordingly. Please see our Supporting Information guidelines for more information: http://journals.plos.org/plosone/s/supporting-information. 

Reviewers' comments:

Reviewer's Responses to Questions

Comments to the Author

1. Is the manuscript technically sound, and do the data support the conclusions?

The manuscript must describe a technically sound piece of scientific research with data that supports the conclusions. Experiments must have been conducted rigorously, with appropriate controls, replication, and sample sizes. The conclusions must be drawn appropriately based on the data presented.

Reviewer #1: Partly

**********

2. Has the statistical analysis been performed appropriately and rigorously?

Reviewer #1: No

**********

3. Have the authors made all data underlying the findings in their manuscript fully available?

The PLOS Data policy requires authors to make all data underlying the findings described in their manuscript fully available without restriction, with rare exception (please refer to the Data Availability Statement in the manuscript PDF file). The data should be provided as part of the manuscript or its supporting information, or deposited to a public repository. For example, in addition to summary statistics, the data points behind means, medians and variance measures should be available. If there are restrictions on publicly sharing data—e.g. participant privacy or use of data from a third party—those must be specified.

Reviewer #1: No

**********

4. Is the manuscript presented in an intelligible fashion and written in standard English?

PLOS ONE does not copyedit accepted manuscripts, so the language in submitted articles must be clear, correct, and unambiguous. Any typographical or grammatical errors should be corrected at revision, so please note any specific errors here.

Reviewer #1: Yes

**********

5. Review Comments to the Author

Please use the space provided to explain your answers to the questions above. You may also include additional comments for the author, including concerns about dual publication, research ethics, or publication ethics. (Please upload your review as an attachment if it exceeds 20,000 characters)

Reviewer #1: Gentilini et al present a manuscript describing a method to estimate the viral load of SARS-CoV-2 based on the Ct values of RT-qPCR. They performed digital PCR for a small number of samples to build a linear regression model to infer the viral load (copies per µl), then applied it to a much larger data set. Inferred viral load over time revealed that it preceded the peak of positive cases. Furthermore, logistic regression using inferred viral load and sex and age was able to predict mortality at an above chance level. Although this manuscript is potentially useful, the authors need to provide some more validation and head-to-head comparisons between Ct and calculated viral load (cVL) to clarify the advantage of their method.

Major comments:

1. Instead of using COVID-19 patient samples, the authors first need to show the results of both RT-qPCR and dPCR by using a concentration series of SARS-CoV-2 RNA standards that have known copy number per µl. It is important to show the sensitivity and dynamic range of these methods. This experiment also helps to properly infer the actual copy number based on dPCR results, especially since the Ct values varied across retesting. Currently, it does not appear that this crucial control experiment was performed.

2. To clearly show that this calculated VL value is more useful than the Ct value, the authors need to run the same analyses with both and show whether their method indeed provides more insight into the relationship between test results and COVID-19 mortality or active case dynamics. For example, they need to show how much better the calculated-VL-based logistic regression performs compared with a Ct-based model. If, as the authors claim, the Ct values and VL are linearly related, then one may not expect any improvement from the new metric.

3. The performance of the logistic regression is not convincing enough as a useful tool. Given the low rate of mortality (~9%), ~90% accuracy of their logistic regression is not particularly high (you can achieve >90% accuracy even if you called everything negative). In a model with 50% threshold, sensitivity is ~25%, which means only a quarter of the true positives were correctly called as positive. In addition, positive predictive value is 63%, which means only 2/3 of the positives this model called are true positive. Even though the authors discussed the potential use of this prediction as a tool to screen more susceptible patients, it is hard to imagine this model can actually be used for that purpose.

4. The authors purified RNA from UTM-stored samples and compared the results of RT-qPCR and dPCR. It has been shown that RT-qPCR results differ between purified RNA and UTM samples (see Figure 3 of reference 31, https://www.nature.com/articles/s41598-020-80715-1). The authors need to describe in more detail how the original tests of these 51 samples and of the other ~6000 samples were performed (for example, whether purified RNA was used for RT-qPCR). Although it seems that some of the 51 samples and all of the ~6000 samples were tested by the “Seegene assay”, the authors did not describe this method in detail. If the original tests were performed directly on UTM samples, using purified RNA to build a regression model does not sound reasonable.

5. I am not sure what the rationale is for performing dPCR only once, as described in line 161. The authors should perform it multiple times to check consistency. Related to this, it appears that the authors performed the regression with a single training and test split, but it would be better to perform k-fold cross validation with multiple restarts (meaning randomly separating all samples into training and test sets many times). This would help to understand whether their model is generalizable or specific to the particular training/test (evaluation) sets. Additionally, the percentage of data in the training and evaluation sets differs between the linear and logistic regression models.

6. The authors need to show more data instead of just reporting the results of statistical tests or analyses by numbers or table format. This manuscript is not reader-friendly partly due to the lack of plots for most of their key analyses. Individual points will be pointed out below. The better visualization is also important to evaluate their key results such as the performance of linear and logistic regression.

7. The authors claimed there was no significant difference between retested and original Ct values for 51 samples they analyzed. They need to show a plot showing the relationship between retested and original values for both SARS-CoV-2 genes and control genes. It is important to show how consistent these values are to evaluate their results.

8. The authors need to show plots of Ct values against log10(measured VLs), with a regression line, for the 51 samples they analyzed, in addition to Table 2. This will help readers to evaluate their model. Based on the table, the difference between measured and calculated VLs is not small even though their regression was significant, and it is hard to understand the pattern of distribution without a plot. Additionally, the R-squared value (0.900) on the training set is quite low, especially for data that is distributed across multiple log-orders. Likewise, the MAD on the evaluation data is >50%, with some samples being off by up to 10-fold, suggesting that the model does not generalize well.

9. The results of logistic regression should be also presented by some plots instead of just showing numbers as a table. The methods also describe using the VL as both a continuous and categorical variable, but only the results from the categorized version are shown in table 4. The authors need to show both results. It would also be helpful to show key values such as sensitivity as a function of probability threshold. Even though the authors used three thresholds, using more thresholds and showing them as plots would be more helpful.

Minor comments:

1. It would be more helpful for readers to generate a figure explaining the experimental layout, which is nicely described by text in the Materials and methods section.

2. In page 10 (line 181), the authors cited “Supp. Mat” though I was not able to find corresponding description or data in the attached supplementary materials.

3. Figure 1 needs a label saying “calculated viral load (copies/µl)”

4. Figure 2 is not properly labeled. The authors need to show the axis labels and explain what the black curve and the grey histogram indicate. Although the text mentions the 90th and 95th percentiles when citing this figure, those values do not appear to be presented in it. I think two stacked plots would look a lot better than the current partially overlaid one.

5. Infectivity threshold at 1500 copies per µl sounds somewhat arbitrary. The authors need to cite references or explain more.

6. Although the authors cite Vasudevan et al. 2021, they should also include more discussion about how their results compare to previous efforts to use digital PCR to quantify VL.

7. The authors should explain and justify their sample sizes for their linear and logistic regression models. The original linear regression model is fit on only 13 samples, but then applied to over 3000 samples in the logistic regression model.

8. The authors should add original and retested Ct values in table 2.

9. The authors should provide a supplementary table that includes all the information about patients (age, sex, etc.) and viral loads (Ct, inferred VL) for the ~6000 samples they used.

**********

6. PLOS authors have the option to publish the peer review history of their article (what does this mean?). If published, this will include your full peer review and any attached files.

If you choose “no”, your identity will remain anonymous but your review may still be made public.

Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy.

Reviewer #1: No

[NOTE: If reviewer comments were submitted as an attachment file, they will be attached to this email and accessible via the submission site. Please log into your account, locate the manuscript record, and check for the action link "View Attachments". If this link does not appear, there are no attachment files.]

While revising your submission, please upload your figure files to the Preflight Analysis and Conversion Engine (PACE) digital diagnostic tool, https://pacev2.apexcovantage.com/. PACE helps ensure that figures meet PLOS requirements. To use PACE, you must first register as a user. Registration is free. Then, login and navigate to the UPLOAD tab, where you will find detailed instructions on how to use the tool. If you encounter any issues or have any questions when using PACE, please email PLOS at figures@plos.org. Please note that Supporting Information files do not need this step.

PLoS One. 2021 Dec 20;16(12):e0260884. doi: 10.1371/journal.pone.0260884.r002

Author response to Decision Letter 0


2 Oct 2021

Editorial Office of

Plos One

Ozzano dell’Emilia, September 24, 2021

Modelling RT-qPCR cycle-threshold using digital PCR data for implementing SARS-CoV-2 viral load studies

Dear Editor,

We would like to resubmit the revised manuscript (PONE-D-21-12462) for publication in your journal. The manuscript has been completely revised as suggested by the reviewer. To address his/her concerns, great effort has been made to re-analyse the raw dataset by carrying out new analyses and generating new plots. We believe that the concerns and criticisms have been wholly addressed, although the reviewer showed overall skepticism about the authors’ approach while acknowledging some merits of the paper. We would respectfully draw the attention of the Editorial Office, and of the reviewer as well, to a paper published very recently, after the submission of the present paper, entitled “SARS-CoV-2 RNA Quantification Using Droplet Digital RT-PCR”, The Journal of Molecular Diagnostics 2021;23(8):907-919, published on May 29, 2021. That paper describes the use of mathematical modelling (linear regression) to model the relationship between the absolute count obtained using ddPCR and the cycle threshold. We quote: “... RT-ddPCR derived SARS-CoV-2 E gene copy numbers were further calibrated against cycle threshold values from a commercial real-time RT-PCR diagnostic platform. This log-linear relationship can be used to mathematically derive SARS-CoV-2 RNA copy numbers from cycle threshold values, allowing the wealth of available diagnostic test data to be harnessed to address foundational questions in SARS-CoV-2 biology. Furthermore, no studies to our knowledge have calibrated SARS-CoV-2 viral loads to diagnostic test Ct values.” Our manuscript was aimed at addressing some questions in SARS-CoV-2 biology using precisely that approach.

We would like to thank the reviewer for giving us the possibility to provide in detail the diagnostic method. We think that we have fully addressed his/her concerns and have hopefully improved the quality of the manuscript.

The revisions suggested are indicated in red in the revised manuscript.

Since the rebuttal is very long, we decided not to submit it for review by a native English speaker. We hope that the answers reported here are clear enough for the reviewer, and we apologise for any errors they may contain.

Below detailed answers to the concerns raised:

Reviewer #1: Gentilini et al present a manuscript describing a method to estimate the viral load of SARS-CoV-2 based on the Ct values of RT-qPCR. They performed digital PCR for a small number of samples to build a linear regression model to infer the viral load (copies per µl), then applied it to a much larger data set. Inferred viral load over time revealed that it preceded the peak of positive cases. Furthermore, logistic regression using inferred viral load and sex and age was able to predict mortality at an above chance level. Although this manuscript is potentially useful, the authors need to provide some more validation and head-to-head comparisons between Ct and calculated viral load (cVL) to clarify the advantage of their method.

As suggested, we have carried out some more head-to-head comparisons and further statistical analysis and reported in more detail the validation process of dPCR. However, we would refer the reviewer to the above-mentioned paper by Kinloch et al., 2021 (just published) who also used the same mathematical modelling approach.

Major comments:

1. Instead of using COVID-19 patient samples, the authors first need to show the results of both RT-qPCR and dPCR by using a concentration series of SARS-CoV-2 RNA standards that have known copy number per µl. It is important to show the sensitivity and dynamic range of these methods. This experiment also helps to properly infer the actual copy number based on dPCR results, especially since the Ct values varied across retesting. Currently, it does not appear that this crucial control experiment was performed.

This is a key point. Unfortunately, we only partially agree with the reviewer on it. Before answering, we should state our working assumption: digital PCR has been widely demonstrated to be the most accurate way to assess the true value of a standard. Measuring the copies/µL in a clinical sample is, in our view, preferable to relying on synthetic nucleic acids or titrated viruses. Hence, the use of clinical samples quantified with dPCR is at least equivalent to (if not better than) other standards (i.e. synthetic nucleic acids quantified by fluorimetric methods, or titrated viruses spiked into negative samples). That said, we did carry out a validation study of the dPCR assay but, since it was not within the main aim of the study, the findings were reported only briefly in the original manuscript, and the reviewer may have missed them. Quoting: “Since the linearity of dPCR is limited by the fixed capacity of the vessels on the chip, it is extremely important to define the dynamic range of linearity. To that end, a dilution experiment was carried out; a cDNA sample which had been highly positive at qPCR with a Ct of approximately 16 was serially diluted 1:10 in molecular biology grade water and assayed in duplicate with dPCR. Since the first 2 dilutions were beyond the chip saturation, only the last five dilutions were linear, achieving a good r coefficient of 0.994. Based on these findings, all the samples >22 Ct were tested in dPCR as such while all the cDNA samples < 22 Ct were pre-diluted 1:10 in 5mM Tris-HCl.” It should also be taken into account that the same primers (CDC N gene) and method had already been validated in many studies (see the references in the manuscript) with very consistent results. We did carry out linearity and analytical sensitivity experiments for the dPCR assay; the linearity experiment is relevant inasmuch as the dynamic range could affect the measured copies/µL of the standards (clinical samples).
This issue is well known; hence we ascertained that a dilution was necessary for samples with Ct < 22. In the revised manuscript many more details are reported and graphed, either in the main text or in the Suppl. Materials. In detail, the analytical sensitivity has been assessed as the LOD and reported in the revised manuscript by adding some sentences to the Methods and Results and a new Figure. We do not think that the LOD is as critical as the dynamic range, since dPCR has been used here to build a linear regression model and not to diagnose COVID-19. The importance of including this information is therefore not evident to us, but we have included it. Having asked us to demonstrate that the clinical samples used to regress Ct on copies/µL are valid and that the copies/µL are reliable, the reviewer also suggested establishing the same features for the RT-qPCR assay. Again, we do not agree that this issue is relevant to the aim of the paper, since our regression model includes and corrects the inherent bias of the Ct (referenced in the manuscript). However, such information can be obtained from another paper from our group, so we decided to cite that manuscript, where readers can find data and graphics on this issue (Brandolini M, Taddei F, Marino et al. (2021) Correlating qRT-PCR, dPCR and Viral Titration for the Identification and Quantification of SARS-CoV-2: A New Approach for Infection Management. Viruses. 28;13(6):1022). It can be seen there that for values > 27 Ct the repeatability of the qPCR worsens greatly. Overall, an R2 = 0.9128 was found.
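The dilution check described above can be illustrated with a minimal sketch. The dPCR readings below are hypothetical values chosen to mimic chip saturation in the first two dilutions; they are not the data behind the r = 0.994 reported in the manuscript, and the helper name `log_linearity` is introduced only for illustration.

```python
import numpy as np

# Hypothetical dPCR readings (copies/uL) for a 1:10 serial dilution, most
# concentrated first. The first two values are deliberately flattened to
# mimic chip saturation; none of these numbers come from the manuscript.
dilution_step = np.arange(7)            # 0 = undiluted, 1 = 1:10, 2 = 1:100, ...
log_expected = -dilution_step.astype(float)   # log10 of relative input
measured = np.array([21000.0, 19000.0, 5200.0, 490.0, 55.0, 4.8, 0.6])


def log_linearity(log_input, copies):
    """Return (slope, Pearson r) of log10(copies) against log10(relative input)."""
    log_copies = np.log10(copies)
    slope, _intercept = np.polyfit(log_input, log_copies, 1)
    r = np.corrcoef(log_input, log_copies)[0, 1]
    return slope, r


# Fit once on every dilution and once on the last five only.
slope_all, r_all = log_linearity(log_expected, measured)
slope_lin, r_lin = log_linearity(log_expected[2:], measured[2:])

print(f"all dilutions:  slope={slope_all:.2f}, r={r_all:.3f}")
print(f"last five only: slope={slope_lin:.2f}, r={r_lin:.3f}")
```

In this toy series the slope moves towards 1 and r rises once the saturated points are dropped, which is the kind of behaviour that would motivate pre-diluting concentrated samples.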

All these further findings have been cited in the revised manuscript to hopefully improve the quality of the manuscript.

2. To clearly show that this calculated VL value is more useful than the Ct value, the authors need to run the same analyses with both and show whether their method indeed provides more insight into the relationship between test results and COVID-19 mortality or active case dynamics. For example, they need to show how much better the calculated-VL-based logistic regression performs compared with a Ct-based model. If, as the authors claim, the Ct values and VL are linearly related, then one may not expect any improvement from the new metric.

In this regard, we must reaffirm that the original manuscript did not claim that cVL would perform better than the corresponding Ct value: in this setting Cts are likely very homogeneous, and our aim was not to compare raw Ct against cVL but to demonstrate the reliability of using cVL. Using cVL has many advantages, which are pointed out in the revised manuscript. Nevertheless, the comparison between Ct and cVL (both continuous and categorised) is reported and graphed in the revised manuscript, as requested.

That said, we also have to stress the following. 1) Clearly, using dPCR as the diagnostic method would have achieved more accurate viral load quantification, owing to its superior accuracy compared with qPCR. Unfortunately, this is not feasible because of cost and turn-around time. Whether viral loads are prognostic is a matter for investigation, as in our paper. We have demonstrated that adding cVL (or raw Ct) to age and gender significantly increased the predictive power. Conversely, as suggested by the reviewer, the model with cVL is almost equivalent to the model with Ct (cVL is slightly but not significantly better). This was expected, for the reasons that the reviewer pointed out, and it is precisely why we have not claimed in the manuscript that the model with cVL would be better than the model with Ct. However, when settings change (labs, methods, instruments, kits, etc.), Ct may vary greatly and the model would likely lose accuracy. So the improvement is not between Ct and cVL in the experimental setting but in the real world. If we talk about viral load, why would it be better to use a crude semiquantitative estimate instead of a more standardised and comparable parameter? We would also stress that we have measured the error rate of our method. It makes no sense to measure the error rate of other methods that are inherently more prone to errors (please see the references). 2) Ct values obtained by qPCR are not a quantitative measurement but a rough estimate of viral load. We cannot compare Ct against copies/µL. Basically, Cts may be compared within the same environment, personnel, qPCR method and so on, but they are not comparable outside these settings, so any prognostic power found in one setting cannot be generalised. 3) Viral load measured using Ct obtained with qPCR and interpolated using a calibration curve is also more prone to errors than dPCR, for a multitude of reasons addressed in detail elsewhere (references in the text).
Even worse, many studies derived viral load from a calibration curve obtained only once, which is worse than using the Ct. 4) We measured the error rate. It is not sound to argue that, since others did not, we now have to measure the errors of others. Our approach can be read as follows: we do not know whether this quantitative method, which employs the most accurate, reproducible, precise and sensitive technology available, is better, but at least we have measured the error of measurement. This would be enough to prefer this approach over a Ct approach whose error has not been measured.

Also, the assertion that, because two sets of data are linearly correlated, one can stand in for the other is conceptually wrong from a mathematical standpoint. It is possible but not necessarily true: it depends on how the Ct values are dispersed around the regression line. The best way to explain this is by looking at the residuals. A data point may lie on the regression line (in which case the regressed value and the original data point are the same) or far from it, in which case the regressed value would be better than the original data point. In other words, errors are dispersed around the regression function. Only samples close to or on the regression line give the same prediction as the regressed values; for samples lying away from the line (this is more evident for Ct values > 27; see the graph in Brandolini et al. 2021), the prediction based on the measured value is likely farther from the correct value than the one based on the regressed value. Unfortunately, low viral loads, even if incorrect, are likely not predictive of case-fatality, and hence the models using raw Ct or cVL gave similar performances even though raw Cts are unreliable when > 27. So it is not obvious that Ct and calculated viral loads would have the same predictive power. In our case, the Ct values are more prone to error at low viral loads (Ct > 27), which are in turn much less prognostic than higher viral loads.

So, the same regression line may be obtained from different sets of data lying closer to or farther from the line. Clearly, each set of data would have a different impact in predicting a variable even if they had the same regression line.
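The point about identical regression lines with different scatter can be shown with a small numerical sketch. Everything here is synthetic: the Ct grid, the line log10(copies/µL) = 11.0 - 0.30 × Ct and the residual pattern are assumptions chosen for illustration, not values from the study.

```python
import numpy as np

rng = np.random.default_rng(0)
ct = np.arange(16.0, 36.0, 2.0)                    # hypothetical Ct values
design = np.column_stack([np.ones_like(ct), ct])   # intercept and Ct columns

# Hypothetical line: log10(copies/uL) = 11.0 - 0.30 * Ct
line = 11.0 - 0.30 * ct

# Build a residual pattern orthogonal to both the intercept and Ct, so that
# adding any multiple of it leaves the least-squares line unchanged.
raw = rng.normal(size=ct.size)
coef, *_ = np.linalg.lstsq(design, raw, rcond=None)
pattern = raw - design @ coef

tight = line + 0.05 * pattern   # points hugging the line
loose = line + 0.50 * pattern   # same line, ten times the scatter

fit_tight = np.polyfit(ct, tight, 1)   # returns [slope, intercept]
fit_loose = np.polyfit(ct, loose, 1)
sd_tight = np.std(tight - np.polyval(fit_tight, ct))
sd_loose = np.std(loose - np.polyval(fit_loose, ct))

print("slope, intercept (tight):", fit_tight)
print("slope, intercept (loose):", fit_loose)
print("residual SD tight vs loose:", sd_tight, sd_loose)
```

Both data sets recover the same slope and intercept, yet individual points in the second set sit much farther from the line, so a raw value and its regressed counterpart can differ substantially there.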

However, in the revised manuscript the methods, results and discussion sections have been revised and two novel Figures have been added. We hope they will be appreciated by the reviewer, especially the Figures.

3. The performance of the logistic regression is not convincing enough as a useful tool. Given the low rate of mortality (~9%), ~90% accuracy of their logistic regression is not particularly high (you can achieve >90% accuracy even if you called everything negative). In a model with 50% threshold, sensitivity is ~25%, which means only a quarter of the true positives were correctly called as positive. In addition, positive predictive value is 63%, which means only 2/3 of the positives this model called are true positive. Even though the authors discussed the potential use of this prediction as a tool to screen more susceptible patients, it is hard to imagine this model can actually be used for that purpose.

This is the reviewer’s opinion. Unfortunately, we disagree with it, as it represents a very original approach to statistical modelling. How much is “enough” is a matter of opinion. We have reported the data: 90% accuracy with 9% mortality is indeed 90% accuracy. We would like to stress that, during the pandemic, IVD assays (e.g. SARS-CoV-2 antigenic tests) were approved that achieved no more than 70% accuracy. Likewise, with a disease prevalence of 1-5%, even a molecular assay with 98% accuracy makes errors, both false positives and false negatives. Predictive models may make errors too. Our model has great benefits. First, it has no cost: it relies on already available data to obtain an objective prediction. Second, it is straightforward. Third, it allows public health services to concentrate their efforts on a smaller subset of subjects at higher risk. If this is worthless for the reviewer, that is just an opinion, and likely the opinion of a person not directly involved in the rational use of public resources in handling the pandemic. Moreover, the example does not fit at all: the reviewer cannot take extreme cases and make a rule of them. In the real world of Public Health Services (PHS), among more than 3000 Covid-19 cases, the model predicted 111 deaths at no cost. Thus, the PHS may prioritise assistance to 111 predicted deaths among >3000 cases. Of these predicted deaths, 70 were correctly identified (those patients actually died). This means that the PHS could prioritise patients who had a 63% probability of death. Evidently, this critical impact of the study was not highlighted enough in the first version of the manuscript. As a consequence, a paragraph has been added to the revised manuscript on lines 399-404.
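For readers who want to follow the arithmetic in this exchange, the sketch below rebuilds a confusion matrix from the figures quoted above: 111 predicted deaths, of which 70 were correct, in an evaluation set of 3113 cases (the size given in the abstract). The number of actual deaths is not stated in this letter; 280 is an assumption derived from the roughly 9% mortality and roughly 25% sensitivity mentioned by the reviewer, so every value that depends on it is illustrative only.

```python
def classification_metrics(tp, fp, fn, tn):
    """Standard metrics from the four cells of a 2x2 confusion matrix."""
    total = tp + fp + fn + tn
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "accuracy": (tp + tn) / total,
        # Accuracy of a trivial rule that calls every case a survivor.
        "all_negative_accuracy": (tn + fp) / total,
    }


n_eval = 3113            # evaluation set size, from the abstract
predicted_deaths = 111   # from the authors' response
true_positives = 70      # from the authors' response
actual_deaths = 280      # ASSUMPTION: ~9% of 3113, not stated in the letter

false_positives = predicted_deaths - true_positives
false_negatives = actual_deaths - true_positives
true_negatives = n_eval - actual_deaths - false_positives

metrics = classification_metrics(
    true_positives, false_positives, false_negatives, true_negatives
)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Under these assumptions the positive predictive value is 70/111, about 63%, matching the figure both parties cite; the remaining metrics shift if the assumed number of deaths changes.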

4. The authors purified RNA from UTM-stored samples and compared the results of RT-qPCR and dPCR. It has been shown that RT-qPCR results differ between purified RNA and UTM samples (see Figure 3 of reference 31, https://www.nature.com/articles/s41598-020-80715-1). The authors need to describe in more detail how the original tests of these 51 samples and of the other ~6000 samples were performed (for example, whether purified RNA was used for RT-qPCR). Although it seems that some of the 51 samples and all of the ~6000 samples were tested by the “Seegene assay”, the authors did not describe this method in detail. If the original tests were performed directly on UTM samples, using purified RNA to build a regression model does not sound reasonable.

The reviewer is totally right. We agree that this is relevant information to be added. It is well known that at certain timepoints during the pandemic, owing to a shortage of reagents, many labs around the world carried out the diagnostic RT-qPCR assay starting from crude UTM lysate instead of purified RNA. There is a linear correlation between the Ct obtained from purified RNA and that obtained from UTM lysate; besides the literature cited by the reviewer, our group has also addressed this issue (Brandolini et al., 2021, doi: 10.3390/v13061022). Indeed, the lab has verified the use of crude lysate, and a mean difference of 5 Cts was observed. However, in 2020 all UTM samples for Covid-19 diagnostic procedures underwent an RNA purification step before the diagnostic PCR (only purified RNA was used as a template for RT-qPCR). This is now stated more clearly on line 129 of the revised manuscript. Also, most of the 51 samples used for establishing and evaluating the linear regression model had originally been assayed with methods other than Seegene, but all of them were re-analysed using Seegene. Hence, since the linear regression function is valid only for the Seegene method, the cohort of ~6000 Covid-19 cases was selected from those diagnosed with Seegene. In this regard, a new Figure has been added to the revised manuscript (Figure 1new). Finally, the details of the Seegene method are reported in the supplementary materials, since they were considered less relevant and since it is one of the most widely used SARS-CoV-2 diagnostic assays in the world. Indeed, the cohort of ~6000 samples comprises Cts obtained with the same qPCR and purification method (an obvious inclusion criterion), selected from among the hundreds of thousands of samples analysed by The Great Romagna Hub Laboratory of Pievesestina, Italy, on a multitude of platforms.

5. I am not sure what the rationale is for performing dPCR only once, as described in line 161. The authors should perform it multiple times to check consistency. Related to this, it appears that the authors performed the regression with a single training and test split, but it would be better to perform k-fold cross validation with multiple restarts (meaning randomly separating all samples into training and test sets many times). This would help to understand whether their model is generalizable or specific to the particular training/test (evaluation) sets. Additionally, the percentage of data in the training and evaluation sets differs between the linear and logistic regression models.

Concerning the use of a single dPCR run in the evaluation set, compared with the triplicate dPCR reactions used in the training set, the rationale is that the model should be as accurate as possible, hence dPCR was carried out in replicate for training, while the evaluation set served to estimate the error. We have therefore evaluated the error when a sample is assessed once by dPCR; the error would likely be lower had the actual VL also been assessed in triplicate in the evaluation set. Also consider that the precision of the dPCR data is much better than that of the Ct values. In this regard, the CV results have been reported in the Suppl. Material (table reporting the CV% across the technical replicates). Hence, using dPCR replicates for the evaluation set as well would have improved the error estimate only marginally. Since, in our opinion, the measured errors of the model were good enough, we decided to save resources.

Concerning the need for iterative modelling, this would be a meaningful suggestion for regression models other than linear ones, built on small cohorts. For an equation describing such a clear linear relationship between two closely linked factors, it would be of almost no benefit. Indeed, the reliability of the linear model can be fully evaluated from the findings in the evaluation set and, most importantly, from the findings of the predictive (logistic) models. The suggestion would be warranted when a small cohort is examined. With a cohort of more than 6000 samples, and having looked at the results, it is very evident that the findings are solid and reproducible. We would ask the reviewer to cite some manuscripts which used such a design with cohorts of this size.
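For reference, the repeated k-fold procedure under discussion can be sketched as follows. This is not an analysis performed in the study: the 51 Ct and log10 viral load pairs are simulated from an assumed line with assumed noise, and `repeated_kfold_rmse` is a hypothetical helper written for this illustration.

```python
import numpy as np


def repeated_kfold_rmse(ct, log_vl, k=5, repeats=20, seed=0):
    """RMSE (log10 units) of a Ct -> log10(VL) line over repeated k-fold splits."""
    rng = np.random.default_rng(seed)
    n = len(ct)
    scores = []
    for _ in range(repeats):
        order = rng.permutation(n)              # fresh random split each repeat
        for fold in np.array_split(order, k):   # held-out indices
            train = np.setdiff1d(order, fold)   # everything not held out
            slope, intercept = np.polyfit(ct[train], log_vl[train], 1)
            pred = intercept + slope * ct[fold]
            scores.append(np.sqrt(np.mean((log_vl[fold] - pred) ** 2)))
    return np.array(scores)


# Simulated stand-in for 51 paired measurements (all values are assumptions).
rng = np.random.default_rng(42)
ct = rng.uniform(15, 35, size=51)
log_vl = 11.0 - 0.30 * ct + rng.normal(0, 0.3, size=51)

scores = repeated_kfold_rmse(ct, log_vl)
print(f"held-out RMSE: mean={scores.mean():.2f}, sd={scores.std():.2f} log10 units")
```

The spread of the held-out error across splits is what indicates whether a single training/evaluation split happened to be favourable or unfavourable.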

However, in the revised manuscript we have added as Suppl. Material the entire anonymised dataset, including the findings of the predictive models. The latter show the consistency of the prediction across different models and probability thresholds.
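Checking consistency across probability thresholds amounts to recomputing sensitivity and specificity at each cut-off. A minimal sketch follows; the eight predicted probabilities and outcomes are invented toy values, not rows from the supplementary dataset, and `sweep_thresholds` is a hypothetical helper.

```python
import numpy as np


def sweep_thresholds(prob, died, thresholds):
    """Return (threshold, sensitivity, specificity) for each cut-off."""
    prob = np.asarray(prob, dtype=float)
    died = np.asarray(died, dtype=bool)
    rows = []
    for t in thresholds:
        called = prob >= t                      # predicted death at this cut-off
        tp = np.sum(called & died)
        tn = np.sum(~called & ~died)
        rows.append((float(t), float(tp / died.sum()), float(tn / (~died).sum())))
    return rows


# Toy predicted probabilities and outcomes (1 = died); purely illustrative.
prob = [0.02, 0.05, 0.08, 0.12, 0.20, 0.35, 0.55, 0.70]
died = [0, 0, 0, 1, 0, 1, 0, 1]

rows = sweep_thresholds(prob, died, [0.1, 0.3, 0.5])
for t, sens, spec in rows:
    print(f"threshold={t:.1f}  sensitivity={sens:.2f}  specificity={spec:.2f}")
```

Lowering the threshold trades specificity for sensitivity, which is the trade-off the abstract describes.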

6. The authors need to show more data instead of just reporting the results of statistical tests or analyses by numbers or table format. This manuscript is not reader-friendly partly due to the lack of plots for most of their key analyses. Individual points will be pointed out below. The better visualization is also important to evaluate their key results such as the performance of linear and logistic regression.

This is quite an original concern. What does “just reporting the results of statistical tests or analyses by numbers or table format” mean? We examined 6208 samples by means of logistic regression. We reported the table summarising the dataset (cohorts) and the equations of the logistic regression in such a way that anyone can apply the equations to their own dataset and verify the effectiveness and the results of the analysis. These results need to be discussed. Plots, too, are tools to summarise and report data. Nevertheless, to meet the reviewer’s suggestion we have added 6 new Figures reporting plots and graphs (with more than one graph per figure in some cases). Since the plots produced by the software used to analyse the data (STATA) are not reader-friendly, we used another software package instead (Analyse-it, Analyse-it Software, Ltd., UK). This information has been added to the M&M section and elsewhere where appropriate.

We would also stress that the most useful results of this study are the functions themselves, embedded in a spreadsheet. The Excel file embedding the formulas has therefore also been added as Suppl. Material in the revised manuscript. We are confident that the reviewer will find the revised manuscript more reader-friendly.

7. The authors claimed there was no significant difference between retested and original Ct values for 51 samples they analyzed. They need to show a plot showing the relationship between retested and original values for both SARS-CoV-2 genes and control genes. It is important to show how consistent these values are to evaluate their results.

Done accordingly

8. The authors need to show plots showing Ct values and log10(measured VLs) with a regression line for 51 samples they analyzed in addition to table 2. This will help readers to evaluate their model. Based on the table, the difference between measured and calculated VLs is not small even though their regression was significant, and it is hard to understand the pattern of distribution without a plot. Additionally, the R-squared value (0.900) on the training set is quite low, especially for data that is distributed across multiple log-orders. Likewise, the MAD on the evaluation data is >50%, with some samples being off by up to 10-fold, suggesting that the model does not generalize well.

We have added the Figures reporting the plots accordingly. We respectfully do not agree with the reviewer’s claim in the last part of point 8. Evidently, the reviewer is not used to Ct data. Please note that a difference of 3.3 Ct corresponds to a 10-fold difference. We included in the model many samples with low Cts and a huge difference. Also, the claim that an R-squared value of 0.900 is quite low is a bit surprising. In our opinion, 0.900 is remarkable since it is not the result of a serial dilution series but of a correlation between two different measurements. Additionally, as requested by the reviewer, in the revised manuscript we have added a further Figure which should ease the interpretation of the linear regression model, showing that the linear model obtained an outstanding R-squared value of 0.918 in the evaluation set. Hopefully, this further evidence will address the reviewer’s concerns.
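The 3.3-cycle rule of thumb mentioned in this response can be illustrated with a minimal sketch. It assumes idealised amplification, in which the amount of template doubles at every cycle; the function names and the `efficiency` parameter are hypothetical and are not taken from the manuscript.

```python
import math


def fold_change(delta_ct, efficiency=1.0):
    """Fold difference in template implied by a Ct difference.

    Assumption: each cycle multiplies the template by (1 + efficiency),
    so efficiency=1.0 means perfect doubling.
    """
    return (1.0 + efficiency) ** delta_ct


def delta_ct_for_fold(fold, efficiency=1.0):
    """Ct difference expected for a given fold difference in template."""
    return math.log(fold) / math.log(1.0 + efficiency)


# Ct difference that corresponds to a 10-fold difference (about 3.32 cycles)
print(round(delta_ct_for_fold(10.0), 2))
# Fold difference implied by 3.3 cycles (slightly below 10)
print(round(fold_change(3.3), 2))
```

With lower amplification efficiency the same 10-fold difference would span more cycles, which is one reason why a conversion fitted on one platform may not carry over unchanged to another.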

9. The results of logistic regression should be also presented by some plots instead of just showing numbers as a table. The methods also describe using the VL as both a continuous and categorical variable, but only the results from the categorized version are shown in table 4. The authors need to show both results. It would also be helpful to show key values such as sensitivity as a function of probability threshold. Even though the authors used three thresholds, using more thresholds and showing them as plots would be more helpful.

We do understand that the reviewer would prefer plots to numbers. We have therefore added many plots to the revised manuscript. Hopefully, this will increase the readability of the manuscript.

Concerning the need to show both continuous and categorical VL variables, the reviewer does not explain the reasons. Certainly, when evaluating a logistic model, one tries a multitude of models using a forward or backward stepwise method, also including redundant variables such as the same predictor entered as either continuous or categorical. Clearly, a categorical variable has the advantage of making the risk (odds ratio) simpler and more intuitive to infer, while continuous variables are easier to transfer to the predictor equation. As a matter of fact, the categorization of cVL made it easy to interpret the odds ratio of each category. The criterion used to drive the choice of the model was the best R-squared of the overall model.

Finally, we also added a plot of the ROC curve analysis showing the optimal threshold, and in the revised manuscript the different models have been compared using the area under the ROC curve.
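The trade-off between sensitivity and specificity as the probability threshold changes, which underlies a ROC curve, can be sketched as follows. This is only an illustration: the predicted probabilities and outcomes are invented toy values, not data from the study, and the function name is hypothetical.

```python
def sensitivity_specificity(probs, outcomes, threshold):
    """Sensitivity and specificity when predicting 'positive' for p >= threshold.

    probs    -- predicted probabilities (e.g. from a logistic model)
    outcomes -- observed outcomes, 1 = event, 0 = no event
    """
    tp = fn = tn = fp = 0
    for p, y in zip(probs, outcomes):
        predicted_positive = p >= threshold
        if y == 1:
            if predicted_positive:
                tp += 1
            else:
                fn += 1
        else:
            if predicted_positive:
                fp += 1
            else:
                tn += 1
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return sensitivity, specificity


# Toy values for illustration only (NOT from the manuscript)
toy_probs = [0.05, 0.10, 0.20, 0.40, 0.60, 0.90]
toy_outcomes = [0, 0, 1, 0, 1, 1]

for threshold in (0.1, 0.3, 0.5):
    sens, spec = sensitivity_specificity(toy_probs, toy_outcomes, threshold)
    print(threshold, round(sens, 3), round(spec, 3))
```

Lowering the threshold raises sensitivity at the cost of specificity, which is the behaviour described in the abstract for the case-fatality model.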

Minor comments

1. It would be more helpful for readers to generate a figure explaining the experimental layout, which is nicely described by text in the Materials and methods section.

We thank the reviewer for this suggestion but eventually decided against it, since the revised manuscript already has 9 Figures.

2. In page 10 (line 181), the authors cited “Supp. Mat” though I was not able to find corresponding description or data in the attached supplementary materials.

We regret this missing information. In the revised manuscript, an Excel file including the equations to calculate the cVL and the probability of death has been added.

3. Figure 1 needs a label saying “calculated viral load (copies/µl)”.

Done as suggested including this information in the legend (new Figure 5).

4. Figure 2 was not properly labeled. The authors need to show the axis labels and what the black curve and grey histogram indicate. Although the text mentions the 90th and 95th percentiles by citing this figure, it doesn’t look like those values are properly presented in this figure. I think it would look a lot better to have 2 stacked plots instead of the current partially overlaid one.

Figure 2 (renamed Figure 6 in the revised manuscript) has been re-labelled. Furthermore, the caption has been re-phrased to describe in more detail what is reported. Also, a graph representing the 90th and 95th percentiles has been added to the same Figure to accurately indicate the results. Thanks for the suggestion. The format has been left overlaid instead of stacked as a matter of personal preference. (Please consider that the straight line has been reported as such by the National Health authorities, as cited, and we do not have the raw data that generated that graph, so only minimal adjustments could be made.)

5. Infectivity threshold at 1500 copies per µl sounds somewhat arbitrary. The authors need to cite references or explain more.

The appropriate references have been added and the limitations of defining this threshold have been acknowledged in the revised manuscript by appropriate re-phrasing on lines 368-371.

6. Although the authors cite Vasudevan et al. 2021, they should also include more discussion about how their results compare to previous efforts to use digital PCR to quantify VL.

Some interesting comparisons are mentioned on lines 355-359 of the revised manuscript.

7. The authors should explain and justify their sample sizes for their linear and logistic regression models. The original linear regression model is fit on only 13 samples, but then applied to over 3000 samples in the logistic regression model.

dPCR is much more expensive and time-consuming than RT-qPCR. The advantage we wanted to highlight was that models can be built by correlating dPCR and RT-qPCR in small sample sets and then applied to large ones. All the findings highlighted in the revised manuscript corroborate this approach and also explain its limits. We have stressed this approach by re-phrasing the aim at the end of the introduction section on lines 108-110.
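The general idea of fitting a conversion on a small paired set and applying it to a large set of stored Ct values can be sketched as below. This is a minimal illustration using ordinary least squares of log10(copies/μL) on Ct; all numbers are invented placeholders, and the resulting slope and intercept are not the equation reported in the manuscript.

```python
def fit_line(x, y):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def ct_to_copies(ct, slope, intercept):
    """Convert a Ct value into calculated copies/uL using the fitted line."""
    return 10 ** (intercept + slope * ct)


# Hypothetical paired measurements: RT-qPCR Ct and dPCR log10(copies/uL).
# Placeholder values for illustration only (NOT the study data).
train_ct = [20.0, 25.0, 30.0]
train_log10_copies = [5.0, 3.5, 2.0]

slope, intercept = fit_line(train_ct, train_log10_copies)

# Apply the fitted conversion to a (hypothetical) list of stored Ct values
stored_ct = [18.0, 27.5, 33.0]
calculated_vl = [ct_to_copies(ct, slope, intercept) for ct in stored_ct]
print(slope, intercept)
print(calculated_vl)
```

In practice, part of the paired samples would be held out as an evaluation set, as the authors describe, to check how well the fitted line predicts dPCR measurements it has not seen.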

8. The authors should add original and retested Ct values in table 2.

Done accordingly

9. The authors should provide a supplementary table that include all the information about patients (age, sex, etc) and viral loads (Ct, inferred VL) for ~6000 samples they used.

Done accordingly

6. PLOS authors have the option to publish the peer review history of their article (what does this mean?). If published, this will include your full peer review and any attached files.

If you choose “no”, your identity will remain anonymous but your review may still be made public.

Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy.

Reviewer #1: No

[NOTE: If reviewer comments were submitted as an attachment file, they will be attached to this email and accessible via the submission site. Please log into your account, locate the manuscript record, and check for the action link "View Attachments". If this link does not appear, there are no attachment files.]

While revising your submission, please upload your figure files to the Preflight Analysis and Conversion Engine (PACE) digital diagnostic tool, https://pacev2.apexcovantage.com/. PACE helps ensure that figures meet PLOS requirements. To use PACE, you must first register as a user. Registration is free. Then, login and navigate to the UPLOAD tab, where you will find detailed instructions on how to use the tool. If you encounter any issues or have any questions when using PACE, please email PLOS at figures@plos.org. Please note that Supporting Information files do not need this step.

On behalf of all the authors

Yours sincerely,

Fabio Gentilini

Attachment

Submitted filename: Rebuttal Comments to the Author.docx

Decision Letter 1

Jean-Luc EPH Darlix

19 Nov 2021

Modelling RT-qPCR cycle-threshold using digital PCR data for implementing SARS-CoV-2 viral load studies

PONE-D-21-12462R1

Dear Dr. Gentilini

We’re pleased to inform you that your manuscript has been judged scientifically suitable for publication and will be formally accepted for publication once it meets all outstanding technical requirements.

Within one week, you’ll receive an e-mail detailing the required amendments. When these have been addressed, you’ll receive a formal acceptance letter and your manuscript will be scheduled for publication.

An invoice for payment will follow shortly after the formal acceptance. To ensure an efficient process, please log into Editorial Manager at http://www.editorialmanager.com/pone/, click the 'Update My Information' link at the top of the page, and double check that your user information is up-to-date. If you have any billing related questions, please contact our Author Billing department directly at authorbilling@plos.org.

If your institution or institutions have a press office, please notify them about your upcoming paper to help maximize its impact. If they’ll be preparing press materials, please inform our press team as soon as possible -- no later than 48 hours after receiving the formal acceptance. Your manuscript will remain under strict press embargo until 2 pm Eastern Time on the date of publication. For more information, please contact onepress@plos.org.

Kind regards,

Jean-Luc EPH Darlix, MG, Ph.D.

Academic Editor

PLOS ONE

Additional Editor Comments (optional): 

Reviewers' comments:

Acceptance letter

Jean-Luc EPH Darlix

1 Dec 2021

PONE-D-21-12462R1

Modelling RT-qPCR cycle-threshold using digital PCR data for implementing SARS-CoV-2 viral load studies

Dear Dr. Gentilini:

I'm pleased to inform you that your manuscript has been deemed suitable for publication in PLOS ONE. Congratulations! Your manuscript is now with our production department.

If your institution or institutions have a press office, please let them know about your upcoming paper now to help maximize its impact. If they'll be preparing press materials, please inform our press team within the next 48 hours. Your manuscript will remain under strict press embargo until 2 pm Eastern Time on the date of publication. For more information please contact onepress@plos.org.

If we can help with anything else, please email us at plosone@plos.org.

Thank you for submitting your work to PLOS ONE and supporting open access.

Kind regards,

PLOS ONE Editorial Office Staff

on behalf of

Professor Jean-Luc EPH Darlix

Academic Editor

PLOS ONE

Associated Data

    This section collects any data citations, data availability statements, or supplementary materials included in this article.

    Supplementary Materials

    S1 File. Supplementary materials & methods.

    (DOCX)

    S2 File. Complete dataset.

    The Excel file includes two spreadsheets providing both model group and test group. In the latter, the predictive formulas are embedded in the spreadsheet.

    (XLSX)

    Attachment

    Submitted filename: Rebuttal Comments to the Author.docx

    Data Availability Statement

    All relevant data are within the manuscript and its S1 and S2 Files.


    Articles from PLoS ONE are provided here courtesy of PLOS
