Abstract
The endpoint dilution assay’s output, the 50% infectious dose (ID50), is calculated using the Reed-Muench or Spearman-Kärber mathematical approximations, which are biased and often miscalculated. We introduce a replacement for the ID50 that we call Specific INfection (SIN) along with a free and open-source web-application, midSIN (https://midsin.physics.ryerson.ca) to calculate it. midSIN computes a virus sample’s SIN concentration using Bayesian inference based on the results of a standard endpoint dilution assay, and requires no changes to current experimental protocols. We analyzed influenza and respiratory syncytial virus samples using midSIN and demonstrated that the SIN/mL reliably corresponds to the number of infections a sample will cause per mL. It can therefore be used directly to achieve a desired multiplicity of infection, similarly to how plaque or focus forming units (PFU, FFU) are used. midSIN’s estimates are shown to be more accurate and robust than the Reed-Muench and Spearman-Kärber approximations. The impact of endpoint dilution plate design choices (dilution factor, replicates per dilution) on measurement accuracy is also explored. The simplicity of SIN as a measure and the greater accuracy provided by midSIN make them an easy and superior replacement for the TCID50 and other in vitro culture ID50 measures. We hope to see their universal adoption to measure the infectivity of virus samples.
Author summary
The infectivity of a virus sample is measured by the infections it causes. One approach, the endpoint dilution assay, aims to estimate the number of TCID50 contained in a sample, where one TCID50 is the dose at which a virus sample is expected to infect a tissue or cell culture 50% of the time, on average. Unfortunately, the commonly used methods to estimate the TCID50 from the assay’s outcome yield biased approximations that relate poorly to the number of infections the sample will cause. We propose replacing the TCID50 with a more accurate, robust, and biologically meaningful measurement unit we call Specific INfection (SIN). It corresponds to the number of infections the virus sample will cause, which can be used directly to achieve the desired multiplicity of infection. Computing the SIN from one’s endpoint dilution assay outcome requires no change in experimental procedure, and can be done conveniently via a web-application we developed, called midSIN. midSIN can be accessed for free on any device (laptop, cellular phone, tablet) from any web browser, without the need to download and install software.
Introduction
The progression of a virus infection in vivo or in vitro, and the effectiveness of therapeutic interventions in reducing viral loads, are monitored over time through sample collections to measure changes (increases or decreases) in virus concentrations. As such, accurate measurement of the virus concentration in a sample is critical to studying and managing virus infections.
Methods to count infectious virus are based on counting the infections they cause, rather than the particles themselves. In practice, however, not all infection-competent virions contained in a sample will go on to successfully cause infection. Experimental conditions, such as the cell type used or the temperature or acidity of the medium, can alter the rate at which virions that were infection-competent in the sample lose infectivity before they can cause infection and thus be counted. This is why, hereafter, we will refer to the quantity measured by infectivity assays as the infection concentration or the number of infections the sample will cause per unit volume, rather than its concentration of infectious virions, which is not a measurable quantity. Two main types of assays are used to quantify the infection concentration within a virus sample: (1) the plaque forming (PFU) or focus forming (FFU) assays; and (2) assays we will collectively refer to as endpoint dilution (ED) assays, which include the 50% tissue culture infectious dose (TCID50), cell culture infectious dose (CCID50), and egg infectious dose (EID50) assays, among others. Herein, we focus on ED assays. Technically, the plaque and focus forming assays are also endpoint dilution assays because they rely on the counting of plaques or foci (the endpoint) as a function of dilutions. However, herein, we will refer to them as plaque or focus forming assays rather than endpoint dilution assays.
The ED assay has one major, remediable weakness: its output quantity, the TCID50 (or CCID50 or EID50), does not directly correspond, or trivially relate, to causing one infected cell. The simplistic calculations, introduced by Spearman-Kärber (SK) [1, 2] and Reed-Muench (RM) [3] nearly a century ago, remain the most commonly used methods to quantify a virus sample’s infectivity in units of TCID50 (or CCID50 or EID50) using the ED assay. Many research groups rely on spreadsheet calculators that are passed down through generations of trainees or found on the internet, and can contain errors (e.g., versions 2 and 3 of the spreadsheet calculator provided by the Lindenbach Lab at Yale University (http://lindenbachlab.org/resources.html), which have since been removed). Theoretically, a dose of 1 TCID50 is expected to cause −1/ln(50%) = 1.44 infections [4]. However, the approximation used by the RM and SK methods introduces an often overlooked bias where 1 TCID50 ≈ 1.781 infections, where 1.781 = $e^{\gamma}$ and γ ≈ 0.5772 is the Euler-Mascheroni constant [5, 6]. This makes it problematic to experimentally achieve the desired multiplicity of infection when inoculating from a sample quantified via the RM or SK methods. Many have proposed replacements for the RM and SK calculations based on logit or probit transforms of the data [4, 6, 7] or on statistical analysis of the ED assay output [7, 8], with some implemented as website applications [9, 10]. Sadly, none of these improvements were widely adopted to improve estimates of the TCID50, possibly due to a lack of visibility of these publications, or the lack of widespread awareness of the limitations of the RM and SK methods. None proposed replacing the TCID50 measurement unit with a more meaningful measure.
The work herein proposes to:
Encourage the use of the ED assay (e.g., TCID50 assay), but replace its output, the TCID50/mL (or CCID50/mL, EID50/mL, etc.), with a new quantity in units of Specific INfections or SIN/mL which corresponds to the number of infections the sample will cause per mL. The word specific highlights the fact that the infectivity of a sample is specific to the particulars of the experimental conditions (temperature, medium, cell type, incubation time, etc.).
Replace the Reed-Muench and Spearman-Kärber approximations with a software tool, midSIN (measure of infectious dose in SIN), that relies on Bayesian inference to measure the SIN/mL of a virus sample. To avoid calculation errors and make the new method widely accessible, midSIN is maintained and distributed as free, open-source software on GitHub (https://github.com/cbeauc/midSIN) for user installation, and is also available via a free-to-use website application (https://midsin.physics.ryerson.ca) with an intuitive user interface.
Here, we present examples of midSIN being used to analyze influenza and respiratory syncytial virus samples. We demonstrate that midSIN’s output, SIN/mL, is an accurate estimate of the number of infections the sample will cause per unit volume. We show how the accuracy of the SIN concentration estimate can be controlled by experimental choice of plate layout, including the dilution factor, and the number of replicates per dilution. We compare midSIN’s performance to that of the RM and SK methods, and demonstrate how the latter estimators are inaccurate under various circumstances, underlining the need to adopt midSIN to quantify virus samples via the ED assay.
Results
Key features of midSIN’s output
Let us consider a fictitious ED experiment, with 11 dilutions and 8 replicate wells per dilution, in which the minimum sample dilution, $D_1$, is serially diluted by a factor of $10^{-0.5} \approx 0.32$ ($D_2$, $D_3$, …, $D_{11}$), and the total volume of inoculum (diluted virus sample + diluent) placed in each well is $V_\text{inoc}$ = 0.1 mL. Now, consider that a virus sample is measured using this ED experiment and one observes (8,8,8,8,8,7,7,5,2,0,0) infected wells out of 8 replicates at each of the 11 dilutions, as illustrated in Fig 1A.
midSIN provides a graphical output of its results, shown in Fig 1B and 1C for this example. Note how the posterior distribution for log10(SIN/mL) (Fig 1B) is approximately a normal distribution. This is why log10 of the infection concentration should be used and reported, rather than the concentration itself. midSIN also graphically compares the number of infected wells observed experimentally (Fig 1C, black dots) against the theoretically expected values (blue curve and grey CI bands). This graphical representation makes it easy to identify issues with the data entered or with the experiment itself.
Importantly, midSIN provides a more useful quantity to the user than the TCID50: an estimate of the concentration of infections the sample will cause, SIN/mL. For this example, the concentration is $10^{6.2 \pm 0.1}$ SIN/mL, where 6.2 is the mode (most likely value) of log10(SIN/mL), and ±0.1 is its 68% credible interval (CI). The SIN/mL corresponds to the number of infections that will be caused per mL of the sample, which can be directly used to determine the sample dilution required to obtain a desired multiplicity of infection (MOI).
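As a rough illustration of how a SIN concentration might be turned into a working dilution, the sketch below computes the fraction of stock needed to reach a target MOI. The function name, the cell count, the target MOI, and the inoculum volume are hypothetical values chosen for illustration; only the $10^{6.2}$ SIN/mL concentration comes from the example above.

```python
def dilution_for_moi(sin_per_ml, n_cells, target_moi, inoculum_ml):
    """Fraction of undiluted stock needed in the inoculum (1.0 = undiluted)."""
    infections_needed = target_moi * n_cells          # MOI = infections / cells
    infections_undiluted = sin_per_ml * inoculum_ml   # infections if stock is used neat
    dilution = infections_needed / infections_undiluted
    if dilution > 1.0:
        raise ValueError("stock is too dilute to reach the target MOI")
    return dilution


# Hypothetical scenario: 1e5 cells, MOI of 0.01, 0.5 mL of inoculum.
stock_sin_per_ml = 10 ** 6.2   # mode reported for the example plate
dilution = dilution_for_moi(stock_sin_per_ml, n_cells=1e5,
                            target_moi=0.01, inoculum_ml=0.5)
print(f"dilute the stock to a fraction {dilution:.2e} of its concentration")
```

The calculation uses only the mode of the estimate; propagating the credible interval would give a range of MOIs rather than a single value.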
In a laboratory setting, ED experiments can be performed in batches, such as to quantify the infectious concentration in samples collected at several time points over the course of a cell culture infection. For such applications, midSIN provides a comma separated value (csv) template file readily editable in a spreadsheet program, to collect and submit the results for batch processing. Details on the format of the template file are available on midSIN’s website (https://midsin.physics.ryerson.ca). Fig 2 illustrates the output for a subset of measurements for in vitro infection with the respiratory syncytial virus (RSV). Each sample was measured twice, and midSIN’s estimates are in good agreement with one another (within 95% CI).
The y-axis in the left graph panels of midSIN’s graphical output is the non-normalized scale of the posterior distribution for log10(SIN/mL), which ranges between $10^{-7}$ and $10^{-2}$. The scale loosely relates to the likelihood of observing a particular ED experimental outcome (see Methods). Unlikely ED outcomes appear as large departures of the observed number of infected wells (right panels, black dots) from what is theoretically expected (right panels, curve). It is interesting that the uncertainty (CI) of midSIN’s estimated log10(SIN/mL) appears to be independent of how much the ED outcome deviates from theoretical expectations. That is, the accuracy of midSIN is not strongly affected even when it is provided more unlikely, noisy experimental data. This robustness is explored further below.
Comparing SIN to TCID50 and PFU virus sample concentrations
The midSIN calculator provides an estimate of the number of infections that will be caused per mL of a virus sample (SIN/mL). In principle, a plaque assay also measures the number of infections a sample will cause, with each infection expected to develop into a plaque. If a plaque assay is performed under experimental conditions and protocols as similar as possible to those of the ED assay (i.e., using the same cells, medium, period of incubation, rinsing method, etc.), midSIN’s SIN/mL estimate is expected to be comparable, in theory, to the number of PFU/mL observed in the plaque assay. In practice, however, the plaque assay likely provides a biased estimate of the true concentration of infections in a sample due to various experimental limitations (e.g., distinguishing between two merged plaques and a larger one, or between small plaques and staining artifacts). To evaluate midSIN’s performance compared to existing methods, the infection concentrations in samples of two influenza A (H1N1) virus strains were measured via both plaque and ED assays, and their concentrations in units of PFU, TCID50, and SIN were compared (Fig 3). Details regarding the samples, and how the plaque and ED assays were performed, are provided in Methods.
The TCID50 concentrations estimated via the RM and SK methods are ∼1.5–1.7 times larger (Fig 3C and 3D) than the SIN concentration, and the set of ratios is statistically inconsistent with the assumption of equality (p-value: 0.01–0.03). Theoretically, 1 TCID50 is expected to cause 1.44 infections (= 1/ln(2)) [4]. However, the RM or SK approximations are known to introduce a bias such that 1 TCID50 estimated by these methods is expected to cause 1.781 infections (= $e^{\gamma}$, where γ ≈ 0.5772 is the Euler-Mascheroni constant) [5, 6]. Using the RM, SK, and SIN measurements presented in Fig 3A and 3B, we re-computed the mean log10(ratio) for ratio = (RM/1.781)/SIN and (SK/1.781)/SIN. The corresponding ratios, 0.85–0.93, are statistically consistent (p-value: 0.1–0.3) with the assumption of equality, i.e., ratio = $10^0$ = 1. This confirms that 1.781 SIN ≈ 1 TCID50 when the latter is estimated via the RM or SK approximations, as expected theoretically if SIN is indeed measuring the infection concentration in a sample.
Similarly, the ratio of the PFU concentration determined via the plaque assay and the SIN concentrations estimated by midSIN is ∼0.89–0.93, which is statistically consistent with the assumption of equality (p-value: 0.2–0.5). These results confirm the theoretical expectation that 1 PFU ≈ 1 SIN when the plaque and ED assays are performed in the same manner, as was the case here. This provides further support, via two independent assays, that the SIN concentration estimated by midSIN from the ED assay is a robust measure of the infection concentration of a virus sample.
Comparing midSIN’s performance to that of the RM and SK methods
The RM and SK methods rely on the number of infected wells decreasing as dilution increases. Their estimates are affected when the number of infected wells remains unchanged or even increases as dilution increases, which statistics and experimental data herein (Fig 2) tell us can reasonably occur experimentally. The RM and SK methods also mostly require that at the lowest and highest sample dilutions, all wells be infected and uninfected, respectively. Fig 4 provides a graphical representation of how the RM and SK methods estimate the TCID50 concentration from an ED assay. Simply stated, the RM and SK methods use geometric arguments to estimate the sample dilution at which 50% of wells would be infected. While they are sometimes accurate (Fig 4A and 4B), their simplicity often leads to biased estimates (Fig 4C and 4D).
In contrast, midSIN is robust to these issues. Fig 5 demonstrates how midSIN can provide an estimate for the log10(SIN/mL) in a sample using the number of infected wells at a single dilution, as long as at least one well is uninfected if all others are infected or vice-versa. This is because midSIN relies on Bayesian inference, i.e., when more than one column is available, it uses information from each column successively to revise and improve its estimate. This allows midSIN to correct for even large deviations from theoretical expectations, and thus improves its accuracy.
Fig 6 illustrates how well the midSIN, RM, and SK methods recover a known input sample concentration in simulated ED experiments, based on a plate layout consisting of 11 dilutions, a dilution factor of 1/4, and 8 replicates per dilution. The infection concentration estimated by midSIN is in excellent agreement with the input concentration. For the RM and SK methods, which estimate the log10(TCID50/mL) rather than the log10(SIN/mL), the agreement is generally poor due to the bias they introduce. Furthermore, the RM and SK predictions are more variable (wavy pattern), and lose accuracy dramatically as the sample concentration approaches the limits of detection (the 2 ends) which, for the example plate layout simulated here, are around $10^3$ SIN/mL and $10^9$ SIN/mL. Interestingly, the basic calculations behind the RM and SK methods constrain the set of values they can return (sparsely populated grey histograms), compared to the more continuous range returned by midSIN, which contributes to its increased accuracy.
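One plausible way to generate a simulated ED plate of this kind is sketched below. It is not taken from midSIN's source: it assumes each well is infected independently with probability 1 − exp(−C·V·D), and the minimum dilution of 10^-2, the 0.1 mL inoculum volume, and the function name are assumptions made for illustration.

```python
import math
import random


def simulate_plate(c_inf, v_inoc, dilutions, reps, rng):
    """Return the number of infected wells at each dilution for one plate."""
    outcome = []
    for d in dilutions:
        # Probability that a well receives at least one infection.
        p_inf = 1.0 - math.exp(-c_inf * v_inoc * d)
        infected = sum(rng.random() < p_inf for _ in range(reps))
        outcome.append(infected)
    return outcome


# Layout from the text: 11 dilutions, dilution factor 1/4, 8 replicates.
# ASSUMED: first dilution of 1e-2 and 0.1 mL of inoculum per well.
dilutions = [1e-2 * 0.25 ** i for i in range(11)]
rng = random.Random(1)   # seeded so the sketch is reproducible
plate = simulate_plate(c_inf=1e6, v_inoc=0.1, dilutions=dilutions,
                       reps=8, rng=rng)
print(plate)
```

Repeating the simulation over many input concentrations, and feeding each plate to an estimator, would give a comparison of recovered versus input concentration.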
Estimate accuracy as a function of plate layout
In Fig 2, we observed that even for large discrepancies between the expected (right panels, blue curve) and observed (right panels, black dots) ED assay outcome, the uncertainty (CI) of midSIN’s estimate remains relatively unchanged. This apparent robustness is because the uncertainty is primarily determined by the experimental design, namely the change in dilution between columns (dilution factor) and the number of replicate wells per dilution. Fig 7 explores the impact of varying either only the dilution factor, or only the number of replicates at each dilution, or varying one at the expense of the other by using a fixed number of wells (96 wells). When using midSIN, smaller changes in dilution (e.g., going from a dilution factor of 2.2/100 to 61/100) or more replicates per dilution (4 to 24) each improves the measure’s accuracy (narrower CIs) by comparable amounts, but only when the total number of wells is allowed to increase to accommodate the change. When the total number of wells used is fixed, changing one at the expense of the other leaves the accuracy (CI) unchanged. This is somewhat also true for the log10(TCID50) output concentration estimated by the RM and SK methods. However, at the smallest dilution factors (10/100 and 2.2/100), the bias introduced by the RM and SK methods becomes even larger and more unpredictable. For the input concentration considered in Fig 7 ($10^5$ SIN/mL), the dilution at which 50% of wells are infected is near the middle dilution. For sample concentrations such that 50% infected wells occur near or at the lowest or highest dilution chosen, the effect is even more significant.
Fig 7 also demonstrates that varying the dilution by smaller increments (e.g., a dilution factor of 61/100 rather than 10/100) provides greater granularity (uniqueness) of ED plate outcomes, and thus, greater accuracy of the log10 infection concentration estimates. Here, a distinct plate outcome means a distinct number of infected wells at each dilution, with no distinction as to exactly which of the replicate wells (e.g., the second versus the fourth) is infected at each dilution. An ED plate with serial dilutions ranging over 6 orders of magnitude (e.g., $10^{-2}$ to $10^{-7}$), with 4 different dilutions and 24 replicates/dilution (i.e., dilution factor of 2.2/100) provides ∼$10^{6}$ ($[24 + 1]^{4}$) possible, distinct ED plate outcomes (Fig 7C, 7F and 7I, leftmost histogram). In contrast, a plate with the same serial dilution range, but with 24 different dilutions and 4 replicates/dilution (i.e., dilution factor of 61/100) yields ∼$10^{17}$ ($[4 + 1]^{24}$) distinct outcomes (Fig 7C, 7F and 7I, rightmost histogram). More generally, $[\text{reps} + 1]^{\text{dils}}$ is the number of distinct plate outcomes for a chosen number of dilutions (dils) and replicates (reps). Having fewer possible plate outcomes means that a larger range of concentrations would share the same most-likely ED plate outcome, yet each plate outcome only maps to one (the most likely) concentration estimate. This means that with fewer dilutions, the concentration estimate is forced to take on the nearest possible value it can take (Fig 7, the next closest grey band in the stack), and the accuracy of the concentration estimate is therefore reduced. So although having a greater number of dilutions is more labour intensive, it should be preferred over having a greater number of replicates per dilution.
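The counting argument above can be checked with a few lines of Python. The function name is hypothetical; the formula and the two layouts are those quoted in the text.

```python
def distinct_plate_outcomes(dils, reps):
    """Number of distinct plate outcomes: each dilution can show 0..reps infected wells."""
    return (reps + 1) ** dils


few_dilutions = distinct_plate_outcomes(dils=4, reps=24)    # 25**4
many_dilutions = distinct_plate_outcomes(dils=24, reps=4)   # 5**24

print(f"{few_dilutions:.1e} vs {many_dilutions:.1e} distinct outcomes")
```

Both layouts use 96 wells, yet the second offers roughly eleven orders of magnitude more distinct outcomes.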
Discussion
We have introduced a new calculator tool called midSIN to replace the Reed-Muench (RM) and Spearman-Kärber (SK) calculations to quantify the infectivity of a virus sample based on an endpoint dilution (ED) assay. Rather than estimating the TCID50 of a virus sample, midSIN calculates the number of infections the sample will cause, reported in units of specific infections (SIN). It does so without requiring any changes to current ED assay protocols, and can be accessed for free via an open-source web-application (https://midsin.physics.ryerson.ca). Importantly, because the SIN of a virus sample corresponds to the number of infections it will cause, it can be used directly to determine what dilution of the sample will achieve the desired multiplicity of infection (MOI).
Using a combination of in vitro and simulated experimental data, we demonstrated that midSIN provides more accurate and robust estimates than the biased RM and SK approximations. We confirmed that the RM and SK approximations overestimate the TCID50 by 23.5%, such that 1 TCID50 estimated by these methods will cause 1.781 rather than 1.44 infections [5, 6]. While, in theory, the intended MOI can be obtained by multiplying the TCID50 by 0.7 (or rather ln(2) = 0.693), one should instead multiply by 0.561 to account for the overestimation by RM and SK. Even when accounting for the overestimation, we showed that these methods perform particularly poorly when too few replicate wells per dilution are used or when the change in dilution is large between successive serial dilutions. The two methods perform especially poorly when quantifying samples whose infection concentration approaches, but is still well within, the detection limit of the ED assay. In such cases, the bias introduced by these methods becomes even larger and more significant. For example, if the minimum and maximum dilutions of an ED plate are $10^{-2}$ and $10^{-8}$, virus samples with a concentration less than $10^{2.2}$ SIN or greater than $10^{7.6}$ SIN per inoculated well volume (typically 0.1 mL) will see their concentration estimated with an even larger bias by the RM and SK methods.
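The constants quoted in this paragraph can be reproduced directly; the short check below is only an arithmetic illustration, and the variable names are ours.

```python
import math

EULER_GAMMA = 0.5772156649015329   # Euler-Mascheroni constant

theory = 1 / math.log(2)           # ~1.44, theoretical value quoted in the text
rm_sk = math.exp(EULER_GAMMA)      # ~1.781, value for RM/SK estimates

overestimate = rm_sk / theory - 1  # relative overestimation by RM and SK
corrected_factor = 1 / rm_sk       # multiplier that accounts for the RM/SK bias

print(f"theory: {theory:.3f}, RM/SK: {rm_sk:.3f}")
print(f"overestimate: {overestimate:.1%}, corrected factor: {corrected_factor:.3f}")
```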
Using midSIN to measure the infectivity of a virus sample based on an ED assay does not require any change to ED experimental protocols and methods currently in use in one’s laboratory (e.g., dilution factor, replicates per dilution, minimum dilution). Indeed, we demonstrated that midSIN can estimate a virus sample’s SIN concentration based on even just a single dilution, as long as replicate wells at that dilution are not all infected or all uninfected. For a given number of ED wells used to titrate the sample and fixed minimum and maximum dilutions (ED detection range), we showed that having smaller changes between dilutions should be favoured over more replicates at each dilution. For example, using 11 dilutions, with a 4-fold dilution factor between dilutions and 8 replicate wells per dilution uses up 88 wells, leaving 8 wells of a 96-well plate for controls. This ED plate design, analyzed using midSIN, accurately measures virus sample concentrations ranging over ∼6 orders of magnitude (e.g., [$10^{1}$–$10^{7}$] SIN/mL, or [$10^{6}$–$10^{12}$] SIN/mL, etc.) with an accuracy of ∼1.6-fold ($\times 10^{\pm 0.2}$, 95% CI). In comparison, using 7 dilutions, with a 10-fold dilution factor, and 4 replicates (which uses 28 rather than 88 wells) would also span 6 orders of magnitude, but with an accuracy of ∼3.2-fold ($\times 10^{\pm 0.5}$, 95% CI). To put these 2 accuracies in perspective: 1 mL of a sample measured to contain 10 SIN/mL is expected to yield either 6–16 or 3–31 infections 95% of the time, given an accuracy of either $\times 10^{\pm 0.2}$ or $\times 10^{\pm 0.5}$ SIN/mL, respectively. Such an important decrease in accuracy means a reduced ability to detect experimental changes as statistically significant, with the $\times 10^{\pm 0.5}$ accuracy requiring a >10-fold change for statistical significance. Failing to identify a change as statistically significant as part of a study is far more costly than using more wells for each sample to increase measurement accuracy, and thus the statistical power of the study.
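The well counts, dilution spans, and fold ranges in this comparison follow from simple arithmetic, sketched below. The helper names are hypothetical, and the ±0.2 and ±0.5 half-widths are taken from the text rather than computed by the sketch.

```python
import math


def plate_design(n_dilutions, fold_per_step, replicates):
    """Return (wells used, log10 span between first and last dilution)."""
    wells = n_dilutions * replicates
    span_log10 = (n_dilutions - 1) * math.log10(fold_per_step)
    return wells, span_log10


def interval(estimate, log10_halfwidth):
    """Range implied by an accuracy of x10^(+/- log10_halfwidth)."""
    return (estimate * 10 ** -log10_halfwidth,
            estimate * 10 ** log10_halfwidth)


dense = plate_design(n_dilutions=11, fold_per_step=4, replicates=8)
sparse = plate_design(n_dilutions=7, fold_per_step=10, replicates=4)
print(dense, interval(10, 0.2))    # 88 wells, span of about 6 logs
print(sparse, interval(10, 0.5))   # 28 wells, span of 6 logs
```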
The midSIN-estimated SIN obtained from an ED assay was also compared to the PFU from a plaque assay for a set of influenza A virus samples. When the plaque and ED assays are performed as identically as possible (cell type, incubation time, etc.), as was the case here, 1 SIN ≈ 1 PFU. This demonstrates that midSIN’s SIN is indeed a measure of the number of infections a virus sample will cause. However, the plaque and focus forming assays have experimental limitations (time required, sensitivity of target cells to overlay, limited to viruses that cause CPE, subjectivity in counting plaques/foci, etc.) that cause many researchers to titrate virus using ED assays. For these researchers, estimating the SIN concentration of a virus sample using data from ED assays is accessible, accurate, and predictive.
The work herein focused on the virus sample infectivity estimated from an unmodified ED assay. In principle, further improvements in accuracy could be achieved through the use of machine-automated scoring of infected wells using fluorescence intensity or colorimetry. Plate readers can be quite expensive, as are the consumable compounds they require, such as fluorescent antibodies, or antibodies loaded with compounds that can precipitate in the presence of another (colorimeter). In contrast, staining with crystal violet, trypan blue, etc. is an inexpensive and efficient way to identify the widespread cellular pathogenic effect of infection by a lytic virus, as are red blood cells to identify the presence of notable virus concentration in the supernatant of a well infected with a hemagglutination-capable virus. Since the aim of the ED assay is merely to establish whether or not infection occurred, the scoring of a well as having been infected or not, even when done visually, is likely less ambiguous. Therefore, in future work, it would be interesting to compare human- vs machine-scoring of wells to evaluate this step’s contribution to the accuracy of the measure obtained.
Beyond the work presented herein, the development of midSIN will continue online as we implement new features and inputs for integration with various colorimetric and fluorescence instruments. The ease of use of midSIN and the greater usefulness and relevance of SIN as a measure of a virus sample’s infectivity make them far superior to the TCID50, and other ID50 measures. We hope to see them adopted widely.
Methods
The mathematics of the dose-response assay
Considering a single well
Consider a virus sample of volume Vsample which contains an unknown concentration of infectious virions, Cinf, which we aim to determine. Drawing a small volume, Vinoc < Vsample, from the sample of volume Vsample, is analogous to drawing balls out of a bag containing green and yellow balls, and considering green balls a success, and yellow ones a failure. It is a series of Bernoulli trials where
n = Vinoc/Vvir is the number of draws, i.e., the number of virion-size volumes (Vvir) drawn from the sample to form the inoculum volume (Vinoc), analogous to the number of balls drawn.
k is the number of successes, i.e., the number of infectious virions drawn from the sample to form the inoculum, analogous to the number of green balls drawn.
p is the probability of success, i.e., the fraction of virion-size volumes in the sample that are occupied by infectious virions, analogous to the probability of drawing a green ball.
The probability of success, p, is related to the concentration of infectious virus in the sample, Cinf, as
$$p = C_\text{inf}\, V_\text{vir},$$
where Cinf is the quantity we aim to estimate. Unlike the ball analogy where it is easy to count how many green balls k were drawn, after having drawn n virion-size volumes from the sample into our inoculum, we cannot count how many infectious virions were drawn into the inoculum. However, if this inoculum is deposited onto a susceptible cell culture, we can observe whether or not infection occurs, and this would indicate that the inoculum contained at least one or more infectious virions. Note that, as explained in the Introduction, even a productively infectious virion, i.e., one capable of completing the full virus replication from attachment to progeny release, might not result in a productive infection. As such, from hereon, Cinf is used to designate the concentration of specific infections in the sample, which is smaller or equal to the concentration of infectious virions, i.e., measures the subset of the infectious virions.
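Having deposited the inoculum into one well of the 96-well plate of our ED experiment, the likelihood that the well will not become infected, qnoinf, corresponds to the likelihood of having drawn k = 0 infectious virions (or rather, specific infections) out of the n virion volumes that make up our inoculum, namely
$$q_\text{noinf} = \text{Binomial}(k = 0 \mid n, p) = \binom{n}{0}\, p^{0}\, (1 - p)^{n} = \left(1 - C_\text{inf} V_\text{vir}\right)^{V_\text{inoc}/V_\text{vir}}, \tag{1}$$
where qnoinf can be simplified by realizing that
$$\lim_{n \to \infty} \left(1 - \frac{x}{n}\right)^{n} = e^{-x},$$
with x = Cinf Vinoc and n = Vinoc/Vvir, which is extremely large because a virion-size volume is minuscule compared to the inoculum volume. As such,
$$q_\text{noinf} = e^{-C_\text{inf} V_\text{inoc}}, \tag{2}$$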
where qnoinf and (Cinf Vvir) ∈ [0, 1] because Cinf = Nvir/Vsample and the number of specific infections in the sample, Nvir, is at a minimum zero, and at most the maximum number of virion-size volumes that can physically fit in the sample volume, namely Vsample/Vvir. As such, the maximum possible infection concentration, given a sample of volume Vsample, is Cinf = (Vsample/Vvir)/Vsample = 1/Vvir, and Cinf ∈ [0,1/Vvir].
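The quality of the exponential approximation to the binomial expression can be checked numerically. The sketch below assumes the simplified form is exp(−Cinf Vinoc), and uses a hypothetical virion-size volume of 5 × 10^-16 mL (roughly a 100 nm sphere); neither the function names nor that volume come from the text.

```python
import math


def q_noinf_exact(c_inf, v_inoc, v_vir):
    """(1 - C_inf * V_vir) ** (V_inoc / V_vir), evaluated stably via log1p."""
    n = v_inoc / v_vir          # number of virion-size volumes in the inoculum
    p = c_inf * v_vir           # probability a given volume holds an infection
    return math.exp(n * math.log1p(-p))


def q_noinf_approx(c_inf, v_inoc):
    """Exponential (large-n) limit of the expression above."""
    return math.exp(-c_inf * v_inoc)


V_VIR = 5e-16    # mL, HYPOTHETICAL virion-size volume
V_INOC = 0.1     # mL, inoculum volume used in the example plate
for c_inf in (1.0, 10.0, 100.0):   # infections per mL
    exact = q_noinf_exact(c_inf, V_INOC, V_VIR)
    approx = q_noinf_approx(c_inf, V_INOC)
    print(f"C_inf={c_inf:6.1f}  exact={exact:.6f}  approx={approx:.6f}")
```

Using `log1p` avoids the loss of precision that a direct `(1 - p) ** n` would suffer when p is many orders of magnitude smaller than 1.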
Considering replicate wells at a given dilution
The ED assay is based on serial dilutions of the sample, with each dilution separated by a fixed dilution factor. We define the dilution factor ∈ (0,1) as the fraction of the inoculum volume drawn from the previous dilution. For example, if the inoculum for a well, Vinoc = 100 μL, comprises 10 μL drawn from the previous dilution and 90 μL of dilution media, the dilution factor is 10/100 = 0.1. If the serial dilution begins with a dilution of $D_1$, then the following dilution will be $D_2 = D_1 \times$ (dilution factor). In Eq (1), the dilution under consideration, $D_i$, will affect n, the number of virion-sized volumes drawn from the sample and deposited into the wells of the ith dilution, such that now $n = D_i V_\text{inoc}/V_\text{vir}$. Therefore, the probability that a well at the ith dilution will not become infected is given by
$$q_i = \left(1 - C_\text{inf} V_\text{vir}\right)^{D_i V_\text{inoc}/V_\text{vir}} = e^{-C_\text{inf} V_\text{inoc} D_i}, \tag{3}$$
where 1 − qi is the probability of infection for a well at the ith dilution, and where $q_i = q_\text{noinf}^{D_i}$.
When conducting an ED assay, each dilution in the assay contains a number of independent infection wells (replicates), all inoculated with the same dilution, $D_i$. This is analogous again to drawing balls out of a bag, but this time there are ni draws (replicate wells), and the probability of success (i.e., that a well becomes infected) is simply one minus the probability of failure (i.e., that a well does not become infected, qi). The probability that ki out of the ni wells become infected at dilution $D_i$ is described by the Binomial distribution
$$\text{Binomial}(k_i \mid n_i, 1 - q_i) = \binom{n_i}{k_i}\, (1 - q_i)^{k_i}\, q_i^{\,n_i - k_i},$$
where ni is the number of replicate wells at dilution $D_i$, which can be smaller than planned if any wells at that dilution are spoiled or contaminated.
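A minimal sketch of this Binomial probability follows, assuming that the probability a well stays uninfected at dilution D is exp(−Cinf Vinoc D). The function name and the example dilution of 10^-5.5 are illustrative choices, not values from the text.

```python
import math


def prob_infected_wells(k, n, c_inf, v_inoc, dilution):
    """Probability that k of n replicate wells are infected at a given dilution."""
    q = math.exp(-c_inf * v_inoc * dilution)   # ASSUMED form: well stays uninfected
    return math.comb(n, k) * (1.0 - q) ** k * q ** (n - k)


# Illustrative values: 10^6.2 SIN/mL, 0.1 mL per well, dilution of 10^-5.5.
probs = [prob_infected_wells(k, 8, 10 ** 6.2, 0.1, 10 ** -5.5)
         for k in range(9)]
for k, p in enumerate(probs):
    print(f"{k} of 8 wells infected: {p:.3f}")
```

The probabilities over k = 0, …, n sum to one, which is a quick sanity check on any implementation.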
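However, our interest is not in determining k1 given qnoinf, but rather in determining qnoinf given that we observed k1 infected wells out of n1 wells in the first column. To this aim, we can make use of Bayes’ theorem which, in our context, can be expressed as
$$P(q_\text{noinf} \mid k_1, n_1) = \frac{P(k_1 \mid q_\text{noinf}, n_1)\, P(q_\text{noinf})}{P(k_1 \mid n_1)}$$
or rather
$$P_\text{post}(q_\text{noinf} \mid k_1, n_1) \propto \text{Binomial}(k_1 \mid n_1, 1 - q_1)\, P_\text{prior}(q_\text{noinf}),$$
where $q_1 = q_\text{noinf}^{D_1}$ is the probability that a well in the first column remains uninfected, $P_\text{post}(q_\text{noinf} \mid k_1, n_1)$ is our updated, posterior belief about qnoinf after having observed k1 successes out of n1 trials in the first column (i = 1), and given our prior belief, $P_\text{prior}(q_\text{noinf})$, about qnoinf before making this observation.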
Considering all dilutions of the ED assay
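As mentioned above, in the 96-well ED assay, each dilution contains a number of independent infection wells (replicates) inoculated with the same sample concentration. This process is then repeated over a series of dilutions, each separated from the previous by a fixed dilution factor. Having observed the fraction of wells infected at the first dilution considered, $D_1$, we have updated our posterior belief about qnoinf. We will now use this updated belief as our new prior as we observe our second dilution ($D_2$), such that
$$P_\text{post}(q_\text{noinf} \mid \vec{k}, \vec{n}) \propto \text{Binomial}(k_2 \mid n_2, 1 - q_2)\; \text{Binomial}(k_1 \mid n_1, 1 - q_1)\; P_\text{prior}(q_\text{noinf}),$$
where we introduce $\vec{k} = (k_1, k_2, \ldots)$ and $\vec{n} = (n_1, n_2, \ldots)$ as short-hands for convenience. From this, it is easy to extrapolate the posterior distribution after having observed all J dilutions ($D_1, D_2, \ldots, D_J$) of the ED assay, namely
$$P_\text{post}(q_\text{noinf} \mid \vec{k}, \vec{n}) \propto \mathcal{L}(\vec{k} \mid q_\text{noinf}, \vec{n})\; P_\text{prior}(q_\text{noinf}), \tag{4}$$
where
$$\mathcal{L}(\vec{k} \mid q_\text{noinf}, \vec{n}) = \prod_{i=1}^{J} \binom{n_i}{k_i} \left(1 - q_\text{noinf}^{D_i}\right)^{k_i} \left(q_\text{noinf}^{D_i}\right)^{n_i - k_i}. \tag{5}$$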
Note that this expression is largely equivalent to that obtained by Mistry et al. [8] in the context of estimating the TCID50 of a virus sample, and by many others in the broader context of infection dose quantification [12, 13].
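The sequential updating described above, in which the posterior from one dilution serves as the prior for the next, can be checked numerically against a single update over all dilutions at once. The sketch below is illustrative only: the plate outcome is hypothetical, and the flat starting prior in qnoinf is a placeholder chosen for simplicity rather than the prior adopted later in the text.

```python
import math

def column_likelihood(q_noinf, dilution, n_wells, k_infected):
    """Likelihood of one column, up to the constant binomial coefficient."""
    q_i = q_noinf ** dilution  # Eq (3)
    return (1.0 - q_i) ** k_infected * q_i ** (n_wells - k_infected)

def normalize(values):
    total = sum(values)
    return [v / total for v in values]

# Grid of q_noinf values strictly inside (0, 1).
grid = [(j + 0.5) / 200 for j in range(200)]
# Hypothetical plate: (dilution, replicate wells, infected wells) per column.
plate = [(1.0, 8, 8), (0.1, 8, 5), (0.01, 8, 1)]

# Sequential route: each column's posterior becomes the next column's prior.
sequential = [1.0] * len(grid)  # placeholder flat prior in q_noinf
for dilution, n_wells, k_infected in plate:
    sequential = normalize([
        belief * column_likelihood(q, dilution, n_wells, k_infected)
        for belief, q in zip(sequential, grid)
    ])

# Batch route: multiply all column likelihoods at once, as in Eq (4).
batch = normalize([
    math.prod(column_likelihood(q, d, n, k) for d, n, k in plate)
    for q in grid
])

largest_gap = max(abs(a - b) for a, b in zip(sequential, batch))
print(f"largest difference between the two routes: {largest_gap:.2e}")
```

Both routes give the same normalized posterior, which is why the order in which the columns are considered does not matter.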
Considering the choice of prior
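In Eq (4), we obtained a posterior for $q_\text{noinf}$. Our objective, however, is to estimate the posterior distribution for $C_\text{inf}$, the specific infection concentration in our sample, rather than $q_\text{noinf}$. In fact, because both the plaque and ED assays provide estimates whose error is normally distributed in $\log_{10}(C_\text{inf})$ rather than $C_\text{inf}$, it follows that $\log_{10}(C_\text{inf})$ (hereafter ℓCinf) rather than $C_\text{inf}$ is the quantity of interest. We note that the likelihood, $\mathcal{L}(\vec{k}\,|\,q_\text{noinf})$, in Eq (4) is a probability density function in $\vec{k}$, the observed numbers of infected wells, rather than in $q_\text{noinf}$. As such, a change of variables from $q_\text{noinf}$ to ℓCinf would affect only the prior, because $P_\text{prior}(q_\text{noinf})\,\mathrm{d}q_\text{noinf} = P_\text{prior}(\ell C_\text{inf})\,\mathrm{d}\ell C_\text{inf}$. Thus, the posterior distribution for ℓCinf is given by
$$P_\text{post}(\ell C_\text{inf}\,|\,\vec{k}) \propto \mathcal{L}\left(\vec{k}\,|\,q_\text{noinf}(\ell C_\text{inf})\right)\,P_\text{prior}(\ell C_\text{inf}) \tag{6}$$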
To complete this expression, we need to choose a physically and biologically appropriate prior belief regarding ℓCinf. Prior to conducting the ED assay, we know at least that Cinf ∈ [1/VEarth, 1/Vvir], where 1/Vvir is the maximum possible concentration, namely that reached if the entire volume of the sample consisted solely of infectious virions, and 1/VEarth is the minimum possible concentration, namely that reached if there were only one infectious virion left on Earth. As we explain below, the exact values of these limits are not important; only the fact that the concentration is convincingly physically bounded both from above and below, i.e., Cinf ∈ (0, ∞), is relevant.
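If we choose our prior to be uniform in Cinf ∈ [1/VEarth, 1/Vvir], namely $P_\text{prior}(C_\text{inf}) = \text{constant}$, and using the fact that $P_\text{prior}(\ell C_\text{inf})\,\mathrm{d}\ell C_\text{inf} = P_\text{prior}(C_\text{inf})\,\mathrm{d}C_\text{inf}$, we can write
$$P_\text{prior}(\ell C_\text{inf}) = P_\text{prior}(C_\text{inf})\,\frac{\mathrm{d}C_\text{inf}}{\mathrm{d}\ell C_\text{inf}} = \text{constant}\cdot\ln(10)\,10^{\ell C_\text{inf}} \propto 10^{\ell C_\text{inf}}$$
which yields, with $\mathcal{L}$ denoting the likelihood of Eq (5),
$$P_\text{post}(\ell C_\text{inf}\,|\,\vec{k}) \propto \mathcal{L}\left(\vec{k}\,|\,q_\text{noinf}(\ell C_\text{inf})\right)\,10^{\ell C_\text{inf}} \tag{7}$$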
We see here that the range chosen for the uniform prior in Cinf is not important because it only contributes a constant to our proportionality Eq (6).
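Alternatively, because the ED assay estimates ℓCinf rather than Cinf, our prior belief about the virus concentration is more appropriately expressed in ℓCinf rather than Cinf. Again, the bounds of the uniform distribution in ℓCinf are unimportant, provided that the distribution is finite in extent such that $P_\text{prior}(\ell C_\text{inf}) = \text{constant}$ where ℓCinf lies within a finite range, such that the posterior is simply proportional to the likelihood and we can write
$$P_\text{post}(\ell C_\text{inf}\,|\,\vec{k}) \propto \mathcal{L}\left(\vec{k}\,|\,q_\text{noinf}(\ell C_\text{inf})\right) \tag{8}$$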
Fig 8 illustrates the two distinct priors assumed to arrive at Eqs (7) and (8) and their impact on the posterior for the example ED experiment described in Fig 1. Fig 8A illustrates the consequence of choosing a prior uniform in Cinf, i.e., a bias towards higher virus concentrations. This is because a uniform prior in Cinf corresponds to a belief that one is as likely to measure a set of virus concentrations in the range [0.001, 0.002] as in the range [1,000,000.001, 1,000,000.002]. When plotted on a log-scale, there are 100× more intervals of width 0.001 in [$10^{4}$, $10^{5}$] than in [$10^{2}$, $10^{3}$]. Thus, this prior corresponds to a belief that the likelihood of measuring a certain virus concentration increases exponentially as ℓCinf increases linearly. In contrast, a prior uniform in ℓCinf corresponds to a belief that one is as likely to measure a set of virus concentrations in the range [0.001, 0.002] as in the range [1,000,000, 2,000,000], or rather in the range [1, 2] × $10^{-3}$ as in the range [1, 2] × $10^{6}$. As such, a uniform distribution in ℓCinf is more physically and biologically sensible and therefore was chosen for our estimation method.
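The effect of the two priors on the most likely concentration can be explored with a short grid calculation. The sketch below is an illustration under stated assumptions rather than midSIN's implementation: the plate outcome, the inoculum volume, and the grid bounds are hypothetical, and the likelihood assumes that a well inoculated at dilution D escapes infection with probability exp(−Cinf·Vinoc·D).

```python
import math

def log_likelihood(log10_c_inf, plate, v_inoc):
    """Natural log of the plate likelihood, without binomial coefficients."""
    c_inf = 10.0 ** log10_c_inf
    total = 0.0
    for dilution, n_wells, k_infected in plate:
        log_q = -c_inf * v_inoc * dilution       # log of q_i (assumed Poisson form)
        log_p = math.log(-math.expm1(log_q))     # log of (1 - q_i)
        total += k_infected * log_p + (n_wells - k_infected) * log_q
    return total

# Hypothetical plate: (dilution, replicate wells, infected wells); v_inoc in mL.
plate = [(1e-3, 8, 8), (1e-4, 8, 7), (1e-5, 8, 3), (1e-6, 8, 0)]
v_inoc = 0.1
grid = [2.0 + 0.01 * j for j in range(601)]      # log10(C_inf) from 2 to 8

# Prior uniform in log10(C_inf): posterior proportional to the likelihood, Eq (8).
log_post_uniform_log = [log_likelihood(x, plate, v_inoc) for x in grid]
# Prior uniform in C_inf: extra factor 10**log10(C_inf), Eq (7).
log_post_uniform_c = [lp + x * math.log(10.0)
                      for lp, x in zip(log_post_uniform_log, grid)]

mode_uniform_log = grid[log_post_uniform_log.index(max(log_post_uniform_log))]
mode_uniform_c = grid[log_post_uniform_c.index(max(log_post_uniform_c))]
print(f"mode with prior uniform in log10(C_inf): {mode_uniform_log:.2f}")
print(f"mode with prior uniform in C_inf:        {mode_uniform_c:.2f}")
```

Because the factor contributed by the prior uniform in Cinf grows with ℓCinf, the corresponding mode can never fall below the mode obtained with the prior uniform in ℓCinf, which is the bias towards higher concentrations discussed above.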
Calculation of midSIN’s outputs
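One of the graphical outputs of midSIN is the non-normalized posterior distribution of ℓCinf given the number of wells that were infected at each dilution, $\vec{k} = (k_1, \ldots, k_J)$, like that shown in Fig 1 (left panel), computed as
$$\tilde{P}_\text{post}(\ell C_\text{inf}\,|\,\vec{k}) = \prod_{i=1}^{J}\binom{n_i}{k_i}\,(1-q_i)^{k_i}\,q_i^{\,n_i-k_i} \tag{9}$$
where
$$q_i = \exp\left(-10^{\ell C_\text{inf}}\,V_\text{inoc}\,D_i\right) \tag{10}$$
While $\tilde{P}_\text{post}(\ell C_\text{inf}\,|\,\vec{k})$ is not the normalized posterior distribution for ℓCinf, its maximum value at its mode ($\ell C_\text{inf}^\text{mode}$) is the normalized probability of observing this particular ED plate outcome ($\vec{k}$) out of all other possible plate outcomes, assuming the true, specific infection concentration in the sample is $10^{\ell C_\text{inf}^\text{mode}}$.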
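The statement that the peak of this curve is a properly normalized probability over plate outcomes can be verified by brute force on a small plate. The sketch below is hypothetical in its plate layout, inoculum volume, and function names, and it assumes that a well at dilution D escapes infection with probability exp(−Cinf·Vinoc·D); it is not midSIN's own code.

```python
import math
from itertools import product

def outcome_probability(log10_c_inf, dilutions, wells, infected, v_inoc):
    """Product over columns of the binomial probability of the observed count."""
    c_inf = 10.0 ** log10_c_inf
    prob = 1.0
    for d, n, k in zip(dilutions, wells, infected):
        q = math.exp(-c_inf * v_inoc * d)  # assumed Poisson form for q_i
        prob *= math.comb(n, k) * (1.0 - q) ** k * q ** (n - k)
    return prob

# Hypothetical small plate: 3 dilutions, 4 replicate wells each, v_inoc in mL.
dilutions = [1e-4, 1e-5, 1e-6]
wells = [4, 4, 4]
observed = [4, 2, 0]
v_inoc = 0.1

grid = [4.0 + 0.01 * j for j in range(301)]  # log10(C_inf) from 4 to 7
curve = [outcome_probability(x, dilutions, wells, observed, v_inoc) for x in grid]
peak = max(curve)
mode = grid[curve.index(peak)]

# At a fixed concentration, the probabilities of all 5**3 outcomes sum to one.
total = sum(outcome_probability(mode, dilutions, wells, ks, v_inoc)
            for ks in product(range(5), repeat=3))

print(f"mode of log10(C_inf): {mode:.2f}")
print(f"probability of the observed outcome at the mode: {peak:.3f}")
print(f"sum over all possible outcomes at the mode: {total:.6f}")
```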
Another visual output of midSIN is a graphical representation of the theoretical number of wells that would be infected given the most likely ℓCinf, like that shown in Fig 1 (right panel). It is computed following
$$N_\text{inf}(x) = n\left[1 - \exp\left(-10^{\ell C_\text{inf}^\text{mode}}\,V_\text{inoc}\,10^{-x}\right)\right] \tag{11}$$
where $n$ is the number of replicate wells per dilution, $\ell C_\text{inf}^\text{mode}$ is the most likely ℓCinf, and $x$ is the negative log10 of the dilution, such that $10^{-x}$ is the dilution. It corresponds to the continuous equivalent of this quantity which is discrete in the ED assay, namely $D_i = D_1 D_\text{fac}^{\,i-1}$, which is the ith dilution of the sample. As such, $x_i = -\log_{10}(D_i)$ where i ∈ [1, J]. For example, if the dilution of the least diluted column is $0.1 = 10^{-1}$ and the dilution factor between dilutions in the ED assay is such that it halves the concentration between each dilution, i.e., $D_\text{fac} = 0.5$, then $D_i = 10^{-1} \times 0.5^{\,i-1}$ such that $D_1 = 10^{-1}$, $D_2 = 10^{-1.301}$, $D_3 = 10^{-1.602}$, and so on, such that $x_1 = 1$, $x_2 = 1.301$, $x_3 = 1.602$, and so on.
In the graphical representation of the ED assay, the edges of the grey bands flanking the theoretical blue curve correspond to Eq (11) wherein $\ell C_\text{inf}^\text{mode}$ has been replaced by the 68% and 95% CI values for ℓCinf. These CI bands do not correspond to the 68% and 95% CI of the expected number of infected wells at each dilution given $\ell C_\text{inf}^\text{mode}$.
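The worked example for the dilution axis, and the shape of the theoretical curve, can be reproduced with a few lines. This is a sketch under the assumption that the expected number of infected wells is the number of replicates times one minus exp(−Cinf·Vinoc·dilution); the function names, the concentration, and the inoculum volume are hypothetical.

```python
import math

def log10_dilution_axis(first_dilution, dilution_factor, num_dilutions):
    """x_i = -log10(D_i) for a series starting at first_dilution."""
    return [-math.log10(first_dilution * dilution_factor ** i)
            for i in range(num_dilutions)]

def expected_infected_wells(x, log10_c_inf, v_inoc, n_wells):
    """Expected number of infected wells at a dilution of 10**(-x)."""
    dose = 10.0 ** log10_c_inf * v_inoc * 10.0 ** (-x)
    return n_wells * -math.expm1(-dose)  # n * (1 - exp(-dose))

# Least diluted column at 0.1, concentration halved between columns.
x_values = log10_dilution_axis(first_dilution=0.1, dilution_factor=0.5, num_dilutions=3)
rounded = [round(x, 3) for x in x_values]
print(rounded)

# Hypothetical sample: log10(C_inf) = 3 per mL, 0.1 mL inoculum, 8 replicates.
curve = [expected_infected_wells(x, log10_c_inf=3.0, v_inoc=0.1, n_wells=8)
         for x in x_values]
print([round(c, 2) for c in curve])
```

Replacing the most likely ℓCinf by the bounds of a credible interval in the same function would trace the edges of the corresponding band.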
The sample dilution corresponding to 1 TCID50 estimated based on the biased RM and SK approximations (right panels) is converted to SIN (left panels) based on 1 TCID50 = $e^{\gamma}$ SIN = 1.781 SIN, where γ = 0.5772 [5, 6]. In contrast, the log10(SIN/mL) computed by midSIN can be converted to a true (unbiased) estimate of log10(TCID50) using 1 TCID50 = 1/ln(2) SIN = 1.44 SIN [4].
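The two numerical factors quoted above, and the size of the corresponding shift on a log10 scale, can be checked directly. The snippet only evaluates the constants and does not encode a direction of conversion; the variable names are illustrative.

```python
import math

EULER_GAMMA = 0.5772156649015329  # value of gamma, rounded to 0.5772 in the text

factor_rm_sk = math.exp(EULER_GAMMA)   # factor quoted for the RM and SK estimates
factor_unbiased = 1.0 / math.log(2.0)  # factor quoted for the unbiased estimate

print(f"exp(gamma) = {factor_rm_sk:.3f}, log10 shift = {math.log10(factor_rm_sk):.3f}")
print(f"1/ln(2)    = {factor_unbiased:.3f}, log10 shift = {math.log10(factor_unbiased):.3f}")
```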
Infection concentration measures of influenza A virus samples
Cell culture
Madin-Darby canine kidney cells (MDCKs) were cultured in growth media (complete MEM media with 5% heat-inactivated FBS), in tissue culture treated T75 flasks, at 37°C with 5% CO2 and 95% relative humidity. Cells were split 1/10 every 3–4 days or upon reaching approximately 95% confluency. One passage of cells was expanded for use by both researchers in one experiment to quantify the 50% tissue culture infectious dose (TCID50) and plaque forming units (PFU) of one viral strain.
Viral stocks
Stocks of influenza A/Puerto Rico/8/34 (H1N1) (PR8) and influenza A/California/4/09 (Cali/09) were stored at -80°C and thawed on ice immediately before use. The TCID50 and PFU of the stock viruses were known to both researchers prior to this study. Serial dilutions were made in MDCK infection media (complete MEM media with 4.25% BSA), and dilutions were made by each researcher independently for titering. ‘Researcher A’ and ‘Researcher B’ independently performed the TCID50 and PFU assays of one viral strain for one experiment on the same day using the same viral stock, reagents, and passage of cells. Each experiment was performed on a separate day (Fig 3).
Plaque assay
MDCKs were seeded in six-well plates ($5.5 \times 10^{5}$ cells/mL, 2 mL/well) and grown to 90% confluency overnight (37°C, 5% CO2, 95% relative humidity). Each six-well plate contained 10-fold serial dilutions plated in singlet as well as a negative control, and five six-well plates were used per experiment. Cells were washed twice with PBS containing Ca²⁺Mg²⁺ (PBS w/ Ca²⁺Mg²⁺) (Gibco), before the addition of 500 μL of viral dilutions per well. After 1 h at room temperature on a rocker, the inoculum was aspirated, cells were washed with PBS w/ Ca²⁺Mg²⁺, and gently covered with 2 mL of agarose overlay (complete media, 4.25% BSA, 0.9% agarose, 1 μg/mL TPCK-Trypsin). After drying the overlay at room temperature, plates were inverted and incubated (37°C, 5% CO2, 95% relative humidity) for 3 d (PR8) or 4 d (Cali/09). Plaques were visualized by staining cells with 0.1% crystal violet solution in 37% formaldehyde for 30 min and counted by ‘Researcher A’ or ‘Researcher B’ on their respective experiments (Fig 3).
TCID50 assay
MDCKs were seeded in 96-well flat bottom plates ($5 \times 10^{4}$ cells/100 μL, 100 μL/well) and grown to 80% confluency overnight (37°C, 5% CO2, 95% relative humidity). For each experiment, 4 replicate wells, at each of 7 different dilutions separated by a 10-fold dilution, were infected, and the dilution series was performed 5 times. Cells were washed with PBS w/ Ca²⁺Mg²⁺ before the addition of 100 μL of viral dilutions per well. After 1 h at room temperature on a rocker, the inoculum was aspirated and replaced with 100 μL of infection media containing 1 μg/mL TPCK-Trypsin. Cells were incubated (37°C, 5% CO2, 95% relative humidity) for 3 d (PR8) or 4 d (Cali/09). Supernatants from each of the MDCK-containing wells were transferred to a matching well in a 96-well U-bottom plate in the same configuration, and mixed with chicken red blood cells (30 min, room temperature). This enabled us to score each of the original MDCK-containing wells as either positive or negative for infection, based on whether their supernatant caused hemagglutination. This was performed and read by ‘Researcher A’ or ‘Researcher B’ on their respective experiments.
Statistical analysis
The data points reported in Fig 3C and 3D were computed by pairing each of the 5 replicates measured with either the PFU, RM, or SK with each of the 5 replicates measured via SIN (5 replicates × 5 replicates = 25 pairs) for each of the 2 experiments by each of the 2 researchers, yielding 100 pairs. For each pair, the log10 of the ratio of either PFU, RM or SK over SIN was computed. The mean and standard deviation of the resulting 100 log10(ratio) values were computed and are reported in Fig 3C and 3D. The statistical significance (p-value) of the differences between (PFU, RM, SK) and (SIN) was computed using the Mann-Whitney U test (scipy.stats.mannwhitneyu).
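The pairing procedure for a single experiment by a single researcher can be sketched as follows. The titers are made-up placeholders rather than data from this study, the variable names are illustrative, and the use of the sample standard deviation is an assumption. The significance test itself (scipy.stats.mannwhitneyu) is not reproduced here.

```python
import math
import statistics
from itertools import product

# Hypothetical replicate titers for one experiment (placeholder values only).
pfu_per_ml = [7.9e7, 1.1e8, 9.5e7, 8.8e7, 1.0e8]
sin_per_ml = [9.0e7, 1.2e8, 8.5e7, 1.05e8, 9.8e7]

# Every PFU replicate is paired with every SIN replicate: 5 x 5 = 25 pairs.
log_ratios = [math.log10(pfu / sin) for pfu, sin in product(pfu_per_ml, sin_per_ml)]

mean_log_ratio = statistics.mean(log_ratios)
sd_log_ratio = statistics.stdev(log_ratios)  # sample standard deviation (assumed)

print(f"pairs: {len(log_ratios)}")
print(f"mean log10(PFU/SIN) = {mean_log_ratio:.3f} +/- {sd_log_ratio:.3f}")
```

Pooling the 25 pairs from each of the 2 experiments by each of the 2 researchers would give the 100 log10(ratio) values described above.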
Data Availability
The authors confirm that all data underlying the findings are fully available without restriction. The code is freely available on GitHub (https://github.com/cbeauc/midSIN) and the midSIN tool is available as a web application (https://midsin.physics.ryerson.ca).
Funding Statement
This work was supported in part by Discovery Grant 355837-2013 (CAAB) from the Natural Sciences and Engineering Research Council of Canada (www.nserc-crsng.gc.ca), Early Researcher Award ER13-09-040 (CAAB) from the Ministry of Research and Innovation of the Government of Ontario (www.ontario.ca/page/early-researcher-awards), by the Interdisciplinary Theoretical and Mathematical Sciences programme (iTHEMS, ithems.riken.jp) at RIKEN (CAAB), and by R01 AI139088 (AMS, APS, LCL) from the NIH NIAID (www.niaid.nih.gov). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.
References
- 1. Spearman C. The method of “right and wrong cases” (constant stimuli) without Gauss’s formula. Br J Psychol. 1908;II(Part 3):227–242. doi: 10.1111/j.2044-8295.1908.tb00176.x
- 2. Kärber G. Beitrag zur kollektiven Behandlung pharmakologischer Reihenversuche. Archiv f Experiment Pathol u Pharmakol. 1931;162(4):480–483. doi: 10.1007/BF01863914
- 3. Reed LJ, Muench H. A simple method of estimating fifty per cent endpoints. Am J Hygiene. 1938;27(3):493–497. doi: 10.1093/oxfordjournals.aje.a118408
- 4. Bryan WR. Interpretation of host response in quantitative studies on animal viruses. Ann N Y Acad Sci. 1957;69(4):698–728. doi: 10.1111/j.1749-6632.1957.tb49710.x
- 5. Wulff NH, Tzatzaris M, Young PJ. Monte Carlo simulation of the Spearman-Kaerber TCID50. J Clin Bioinforma. 2012;2(1):5. doi: 10.1186/2043-9113-2-5
- 6. Govindarajulu Z. 4. The Logit Approach. In: Statistical techniques in bioassay. 2nd ed. Basel; New York: Karger; 2001. p. 35–90.
- 7. LaBarre DD, Lowy RJ. Improvements in methods for calculating virus titer estimates from TCID50 and plaque assays. J Virol Methods. 2001;96(2):107–126. doi: 10.1016/S0166-0934(01)00316-0
- 8. Mistry BA, D’Orsogna MR, Chou T. The effects of statistical multiplicity of infection on virus quantification and infectivity assays. Biophys J. 2018;114(12):2974–2985. doi: 10.1016/j.bpj.2018.05.005
- 9. Mistry BA. Website application associated with [8]. Available from: http://www.bhavenmistry.com/SMOI/.
- 10. Spouge JL. Website application associated with [12]. Available from: https://www.ncbi.nlm.nih.gov/CBBresearch/Spouge/html_ncbi/html/id50/id50.cgi.
- 11. Beauchemin CAA, Kim YI, Yu Q, Ciaramella G, DeVincenzo JP. Uncovering critical properties of the human respiratory syncytial virus by combining in vitro assays and in silico analyses. PLOS ONE. 2019;14(4):e0214708. doi: 10.1371/journal.pone.0214708
- 12. Spouge JL. Statistical analysis of sparse infection data and its implications for retroviral treatment trials in primates. Proc Natl Acad Sci USA. 1992;89(16):7581–7585. doi: 10.1073/pnas.89.16.7581
- 13. Weir MH, Mitchell J, Flynn W, Pope JM. Development of a microbial dose response visualization and modelling application for QMRA modelers and educators. Environ Model Softw. 2017;88:74–83. doi: 10.1016/j.envsoft.2016.11.011