Abstract
The standard approach for quantitative estimation of genetic materials with qPCR is calibration with known concentrations for the target substance, in which estimates of the quantification cycle (Cq) are fitted to a straight-line function of log(N0), where N0 is the initial number of target molecules. The location of Cq for the unknown on this line then yields its N0. The most widely used definition for Cq is an absolute threshold that falls in the early growth cycles. This usage is flawed as commonly implemented, with the threshold set very close to the baseline level, which is itself estimated separately from designated "baseline cycles." The absolute threshold is especially poor for dealing with the scale variability often observed for growth profiles. Scale-independent markers, like the first derivative maximum (FDM) and a relative threshold (Cr), avoid this problem. We describe improved methods for estimating these and other Cq markers and their standard errors, from a nonlinear algorithm that fits growth profiles to a 4-parameter log-logistic function plus a baseline function. By examining six multidilution, multireplicate qPCR data sets, we find that nonlinear expressions are often preferred statistically for the dependence of Cq on log(N0). This means that the amplification efficiency E depends on N0, in violation of another tenet of qPCR analysis. Neglect of calibration nonlinearity leads to biased estimates of the unknown. By logic, E estimates from calibration fitting pertain to the earliest baseline cycles, not the early growth cycles used to estimate E from growth profiles for single reactions. This raises concern about the use of the latter in lengthy extrapolations to estimate N0. Finally, we observe that replicate ensemble standard deviations greatly exceed predictions, implying that much better results can be achieved from qPCR through better experimental procedures, which likely include reducing pipette volume uncertainty.
Abbreviations: qPCR, quantitative polymerase chain reaction; y and y0, fluorescence signal above baseline at cycle x and at cycle 0; E, amplification efficiency; Cq, quantification cycle; yq, signal at x = Cq; N0, initial number of target molecules in sample; Ct, threshold cycle, where y = yq; FDM and SDM, cycles where y reaches its maximal first and second derivatives, respectively; Cy0, intersection of a straight line tangent to the curve at the FDM with the baseline-corrected x-axis; LS, least squares; $\chi^2$, chi-square; $w_i$, statistical weight for ith data point; $\sigma^2$ and σ, variance and standard deviation; S, sum of weighted, squared residuals (= "Chisq" in KaleidaGraph fit results, = $\chi^2$ when $w_i = 1/\sigma_i^2$); ν, statistical degrees of freedom, = # of data points − # of adjustable parameters; SD, standard deviation; SE, parameter standard error
Keywords: qPCR, Data analysis, Weighted least squares, Statistical errors, Chi-square, Calibration
1. Introduction
The goal of quantitative polymerase chain reaction (qPCR) is, as advertised, the quantification of small amounts of targeted genetic material through amplification to easily detected quantities [1]. Quantification may be relative to a chosen reference substance [2,3] or absolute. The standard approach for the latter is through calibration procedures that compare the unknown with results for the same substance measured at a range of known concentrations appropriate for the unknown [4,5,6]. Fig. 1 illustrates typical qPCR growth profiles obtained at 5 concentrations spanning 4 orders of magnitude in template copy number. Calibration with such data, like all classical calibration procedures, involves identifying some property that depends monotonically on the concentration, fitting measurements of that property to a suitable response function, z = f(x), and then solving the equation,
| z0 = f(x0), | (1) |
for the unknown x0, where z0 is its measured response. Although analysts prefer linear response functions, there is no requirement for such, and calculation of x0 and its uncertainty is a simple computational task for any f(x) [8,9]. In qPCR the chosen calibration relationship is Cq vs. log(N0), as illustrated in Fig. 2 for the data in Fig. 1. This choice is based on the exponential growth equation that is assumed to hold throughout the baseline region and into the early growth phase,
| $y = y_0 E^x$, | (2) |
where E is the amplification efficiency (AE), ranging from E = 1 (no amplification) to E = 2 (perfect doubling), x is the cycle number, and y represents the fluorescence signal, or under the assumption that target fluorescence is proportional to its amount, the number N of amplicons. Accordingly, y0 represents this quantity in cycle 0, before amplification. Cq is defined as the cycle, x = Cq, where growth reaches a defined threshold intensity, yq. Thus a plot of Cq vs log(N0) is expected to be linear with slope –1/log(E). Locating the unknown on this curve by its Cq determines its N0.
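The linear relation between Cq and log(N0) follows directly from solving Eq. (2) for the cycle at which the signal reaches the threshold. The short sketch below illustrates this under stated assumptions: the efficiency value, the threshold (expressed here as a copy number), and the function name `cq_threshold` are all hypothetical choices made for illustration, and E is taken as constant over all cycles.

```python
import math

def cq_threshold(n0, efficiency, n_threshold):
    """Cycle x at which n0 * efficiency**x reaches n_threshold (Eq. 2 solved for x)."""
    return math.log10(n_threshold / n0) / math.log10(efficiency)

E = 1.9        # assumed constant amplification efficiency (illustrative)
N_Q = 1.0e10   # assumed threshold, in amplicon copies (illustrative)
copies = [19.0, 188.0, 1880.0, 18800.0, 188000.0]
cqs = [cq_threshold(n0, E, N_Q) for n0 in copies]

# Slope of Cq vs log10(N0) between successive dilutions
slopes = [
    (cqs[i + 1] - cqs[i]) / (math.log10(copies[i + 1]) - math.log10(copies[i]))
    for i in range(len(copies) - 1)
]
expected_slope = -1.0 / math.log10(E)

for n0, cq in zip(copies, cqs):
    print(f"N0 = {n0:8.0f}   Cq = {cq:6.2f}")
print(f"slope = {slopes[0]:.4f}, -1/log10(E) = {expected_slope:.4f}")
```

Every pairwise slope equals -1/log10(E) only because E was held fixed; any dependence of E on cycle number or on N0 would show up as curvature in this plot.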
Fig. 1.
qPCR fluorescence curves for lambda gDNA for 10-fold dilution from 188,000 copy numbers to 19, as recorded in triplicate by Rutledge and Stewart [7]. Inset shows positions of Cq markers for one reaction at highest concentration. With the threshold set at 12% of the (plateau – baseline) difference, the relative threshold Cr coincides with Cy0 within 0.1 cycle.
Fig. 2.
Results of LS fits of 4 Cq markers from growth profiles in Fig. 1 to linear relation (right) and of FDM to quadratic centered at log(N0) = 3 (top). The quadratic coefficient in the latter is statistically significant in ad hoc fitting, having magnitude larger than its SE. Note close agreement in slopes (giving E) and in "Chisq" values (sums of squared residuals) for linear fits. Cq values were obtained from log-logistic fits of 24-point regions of profiles centered near the half-intensity points.
Although this threshold definition of Cq (often designated Ct) has been most widely used [1,10], it is clear from the plateauing behavior in the growth profiles that Eq. (2) cannot hold throughout the growth region. Recognizing this, most workers have taken yq as a level near baseline, where it is hoped that Eq. (2) remains valid. However, when the profiles have constant shape, as appears to hold in Fig. 1, the value of yq is irrelevant, as different choices simply displace Ct by constant amounts [7]. The same holds for other markers, like the first- and second-derivative maxima (FDM and SDM), and Cy0 (the intercept with the cycle axis of a line tangent to the growth curve at the FDM) [10]. The issue then becomes, which of these can be estimated most precisely from the data and also yield a smooth dependence of Cq on log(N0). It has been noted that Ct, with yq set near the baseline, actually gives the poorest estimation precision, exacerbated by the practice of separately estimating the baseline level [11,12]. Further, Ct is subject to additional precision loss when the data vary in scale [13,14], as is evident from the varying plateau levels usually observed for the profiles, including those in Fig. 1. The other common markers – FDM, SDM, and Cy0 – are scale-independent, thus free from this problem, as is also the relative threshold Cr, which is obtained by setting yq to a designated fraction of the plateau level [12].
The least-squares (LS) fit results in Fig. 2 show that indeed the different markers are statistically comparable for the data in Fig. 1. However, the linear response appears to be statistically inferior to quadratic analysis, from which we would conclude that E is concentration dependent. In fact this conclusion is an incorrect consequence of our use of unweighted LS to fit these data, which have excess imprecision at high dilution (low N0) from unavoidable limitations of Poisson statistics [11]. Such effects can be strong enough to permit quantitative estimation of N0 directly from the statistical variance in Cq [12,15]. When the data are refitted with appropriate downweighting for the Poisson contributions to the Cq variance (details below), the quadratic contribution is statistically undefined, and E = 1.916(6).
The realization that the slope in Fig. 2 is determined by E has led to efforts to estimate E from analysis of single-reaction (SR) data, in combination with which a single calibration constant relating y0 to N0 could permit absolute determination of the unknown N0. Most SR methods employ Eq. (2) on data limited to the early growth region [10], while at least two focus directly on y0 [16,17] but tacitly assume that E = 2 in the baseline region [11]. For commonly encountered growth profiles like those in Fig. 1, E must decline from E0 (∼2) in the baseline to 1 in the plateau, but there is little direct evidence on just how this decline occurs in the late baseline region.1 The "mechanistic" models of [16] and [17] predict reasonable decline of E in this region, as does the model [18,19],
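| $y = \dfrac{y_{\max}\, y_0}{y_0 + (y_{\max} - y_0)\,E_0^{-x}} \approx \dfrac{y_{\max}}{1 + \exp[-b(x - x_{1/2})]}$, | (3) |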
where ymax is the limiting growth and E0 the initial AE. The second expression is the logistic model and is obtained by neglecting y0 in the denominator of the first; x1/2 is the half-intensity point and the FDM, with b = ln(E0) and ymax/y0 = exp(b x1/2).
When data are generated with the model of Eq. (3) (designated LRE in [19]) and then analyzed using Eq. (2), there is an unavoidable trade-off between imprecision and bias: When enough cycles are included to permit adequate precision, E has already declined enough to give significantly low-biased estimates [12]. If there is real variation of E with cycle number, there is another problem with SR methods of estimating it. By simple logic [13], the calibration-based E applies to the earliest cycles: Comparing two starting concentrations, ΔCq is the number of cycles for N in the more dilute sample to equal N0 for the more concentrated, after which they grow together. From this, log(E) = log(R)/ΔCq, where R is the dilution ratio. By extension a dilution series estimates E for just the earliest cycles in the most dilute samples to cycle 0 in the most concentrated. Thus, if E varies with cycle number, even a correct estimate of it from early growth cycles could vary substantially from its value in the early baseline cycles.
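As a worked illustration of the relation log(E) = log(R)/ΔCq, the sketch below converts a Cq shift between two dilutions into an efficiency. The function name and the numerical ΔCq are hypothetical; they are not fit results from this work.

```python
import math

def efficiency_from_delta_cq(dilution_ratio, delta_cq):
    """Amplification efficiency from log(E) = log(R) / delta_Cq."""
    return dilution_ratio ** (1.0 / delta_cq)

# Illustrative only: a 10-fold dilution that delays Cq by 3.55 cycles
e_est = efficiency_from_delta_cq(10.0, 3.55)

# Perfect doubling: a 10-fold dilution delays Cq by log2(10) cycles
e_perfect = efficiency_from_delta_cq(10.0, math.log2(10.0))

print(f"E (delta_Cq = 3.55)      = {e_est:.3f}")
print(f"E (delta_Cq = log2(10))  = {e_perfect:.3f}")
```

The ΔCq cycles counted here are, by construction, the first cycles of the more dilute reaction, which is why an E obtained this way describes the early baseline region rather than the observable growth region.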
In the present work we describe an improved algorithm for fitting growth profile data to a log-logistic function with asymmetry parameter [20] plus a baseline function containing 1–4 parameters; and we use it to estimate the 4 common scale-independent Cq markers and their standard errors. We also use its estimates of the SDM and the plateau amplitude to facilitate a second estimation of Cr (which we label Cr,x), by fitting just cycles up to the SDM to Eq. (2) plus a baseline function [15]. We use these codes to compare the performance of the several markers on six qPCR datasets having replicates at multiple dilutions suitable for calibration fitting, and we ask whether the commonly adopted linear relation is statistically justified. In most cases we find that it is not, implying that E does depend on concentration, and in turn that estimates of N0 based on linear calibration are biased. The magnitude of such bias may or may not be of practical concern, depending on circumstances. In [1] the authors found that estimates of E can vary with instrument and reaction parameters; here we find that even for a given instrument and fixed reaction conditions, E can vary with N0 as well as with the choice of Cq marker. These results bear out earlier indications [13] for the large 94 × 4 Reps technical data set from [10]. Those results were obtained using the Cq estimates provided by the authors of the methods compared in [10]; our present algorithm yields Cq estimates that match or exceed the best of those in their statistical performance.
One observation from our comparisons has important practical significance: Ensemble standard deviations (SDs) for Cq typically exceed the LS parametric standard errors (SEs) by factors of 2-5. The SEs are based on the fit model and the random error in the profile data, and they should correctly predict the ensemble SDs if such data error is the only source of dispersion in the estimates [12,21]. The excess dispersion in the ensemble estimates thus means there are other sources of experimental error, which likely include pipette volume uncertainty. It follows that qPCR is capable of much higher precision through better experimental procedures.
2. Mathematical background and methods
2.1. Amplification efficiency from calibration fitting
Fitting data to a straight-line response function is perhaps the most common exercise in data analysis and needs little review. Here the fit relation is
| Cq = a + bx = a – x/log(E), | (4) |
where x = log(N0). The second expression in Eq. (4) represents a nonlinear LS (NLS) fit, with the advantage of yielding the parametric standard error (SE) of E directly from NLS programs that provide SEs. Alternatively, the SE in E can be obtained from that in b using error propagation [11,22],
| $\sigma_E = E \ln(10)\, \sigma_b / b^2$. | (5) |
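Eq. (5) follows from differentiating E = 10^(−1/b) with respect to b. A minimal sketch of the propagation is given below; the slope and its SE are hypothetical inputs chosen for illustration, and `efficiency_from_slope` is not a name from the codes described in this work.

```python
import math

def efficiency_from_slope(b, sigma_b):
    """E and its SE from the calibration slope b of Cq vs log10(N0) (Eqs. 4 and 5)."""
    e = 10.0 ** (-1.0 / b)
    sigma_e = e * math.log(10.0) * sigma_b / b**2
    return e, sigma_e

# Hypothetical slope and standard error
slope, slope_se = -3.54, 0.02
e, se = efficiency_from_slope(slope, slope_se)
print(f"E = {e:.4f} +/- {se:.4f}")
```

Because this is first-order propagation, it is adequate when the relative error in b is small, which is the usual situation for multi-dilution calibration data.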
The extension of Eq. (4) to polynomials of order 2 and higher is straightforward and remains a linear LS fit,
| $C_q = a + bx + cx^2 + \ldots$. | (6) |
However, the slope is now a function of x, involving b and all higher-order fit parameters. On the other hand, by recentering the fit about x0 (= selected log(N0)), we obtain a statistically equivalent fit where b is the slope at x0 [22], so E and its SE can be obtained as a function of x0 by just changing x0 in the expression,
| $C_q = a + b(x - x_0) + c(x - x_0)^2 + \ldots = a - (x - x_0)/\log(E) + c(x - x_0)^2 + \ldots$. | (7) |
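The recentering in Eq. (7) can be illustrated with a small weighted polynomial fit. The sketch below uses only the standard library and noise-free synthetic data with made-up coefficients, so it demonstrates the bookkeeping (slope at x0, hence E at x0) rather than any result from the datasets analyzed here; `polyfit_centered` and `efficiency_at` are hypothetical helper names.

```python
def polyfit_centered(xs, ys, x0, order=2, weights=None):
    """Weighted LS fit of y to a polynomial in (x - x0); returns [a, b, c, ...]."""
    n = order + 1
    w = weights if weights is not None else [1.0] * len(xs)
    # Build the normal equations (X^T W X) beta = X^T W y
    ata = [[0.0] * n for _ in range(n)]
    aty = [0.0] * n
    for x, y, wi in zip(xs, ys, w):
        powers = [(x - x0) ** k for k in range(n)]
        for i in range(n):
            aty[i] += wi * powers[i] * y
            for j in range(n):
                ata[i][j] += wi * powers[i] * powers[j]
    # Gaussian elimination with partial pivoting
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(ata[r][col]))
        ata[col], ata[piv] = ata[piv], ata[col]
        aty[col], aty[piv] = aty[piv], aty[col]
        for r in range(col + 1, n):
            f = ata[r][col] / ata[col][col]
            for c in range(col, n):
                ata[r][c] -= f * ata[col][c]
            aty[r] -= f * aty[col]
    # Back substitution
    beta = [0.0] * n
    for i in reversed(range(n)):
        s = aty[i] - sum(ata[i][j] * beta[j] for j in range(i + 1, n))
        beta[i] = s / ata[i][i]
    return beta

def efficiency_at(xs, ys, x0):
    """E at x0 from the linear coefficient of the recentered quadratic fit."""
    _, b, _ = polyfit_centered(xs, ys, x0, order=2)
    return 10.0 ** (-1.0 / b)

# Synthetic, noise-free calibration data (coefficients are illustrative only)
xs = [1.27, 2.27, 3.27, 4.27, 5.27]
ys = [40.0 - 3.7 * x + 0.03 * x * x for x in xs]

fit_at_3 = polyfit_centered(xs, ys, 3.0)
fit_at_5 = polyfit_centered(xs, ys, 5.0)
print("slope at x0 = 3:", round(fit_at_3[1], 4), " E =", round(efficiency_at(xs, ys, 3.0), 4))
print("slope at x0 = 5:", round(fit_at_5[1], 4), " E =", round(efficiency_at(xs, ys, 5.0), 4))
```

Changing x0 leaves the fitted curve and its sum of squares unchanged; only the meaning of a and b shifts, which is what makes the SE of b (and hence of E) directly available at any chosen concentration.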
Calibration fitting in qPCR is almost universally done with neglect of weighting, which tacitly assumes the data have constant uncertainty. This assumption is incorrect when the calibration data extend to N0 small enough to make Poisson uncertainty in N0 significant – typically N0 ≈ 100 or smaller [13,15] – in which case weighted fitting is called for. As is described below, one can often get reliable estimates of the Cq variance from all other effects by analyzing the replicates at the higher concentrations. The Poisson contribution for small N0 is approximately
| $\sigma_{C_q,\mathrm{Poisson}}^2 \approx 1/[N_0 (\ln E)^2]$, | (8) |
and can be added to the estimated variance from other effects to give the total variance. The weights wi are then taken as the reciprocals of the total variances.
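The weighting scheme just described can be sketched as follows. The Poisson term is written here as 1/[N0 (ln E)^2], which is what first-order propagation of the Poisson variance in N0 (equal to N0) through Cq = a − log(N0)/log(E) gives; that form, the function names, and the numerical inputs are assumptions made for this illustration.

```python
import math

def poisson_cq_variance(n0, efficiency):
    """Approximate Poisson contribution to the Cq variance for mean copy number n0."""
    return 1.0 / (n0 * math.log(efficiency) ** 2)

def calibration_weight(n0, efficiency, var_other):
    """Inverse of the total Cq variance (Poisson term plus all other effects)."""
    return 1.0 / (var_other + poisson_cq_variance(n0, efficiency))

# Hypothetical inputs: E = 1.92 and a non-Poisson Cq variance of 0.05**2
E = 1.92
VAR_OTHER = 0.05 ** 2
for n0 in (18.8, 188.0, 1880.0, 18800.0, 188000.0):
    var_p = poisson_cq_variance(n0, E)
    w = calibration_weight(n0, E, VAR_OTHER)
    print(f"N0 = {n0:9.1f}   Poisson SD = {math.sqrt(var_p):.4f}   weight = {w:8.1f}")
```

With these inputs the Poisson term dominates below roughly 100 copies and becomes negligible at the higher concentrations, so the weights flatten out toward 1/VAR_OTHER.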
2.2. The log-logistic model with asymmetry
The log-logistic model is a sigmoidal alternative to Eq. (3), and in the 4-parameter form [20],
| $\mathrm{LL4}(x) = y_{\max}\,[1 + (g/x)^h]^{-p}$, | (9) |
it can accommodate some asymmetry. With p = 1, it is nearly symmetrical, and g ≈ x1/2 (≈ xFDM). We have found that this form, in combination with suitable baseline functions, can yield Cq values that are statistically at least as precise as those values obtained by any of the other methods reviewed in [10]. We achieve this performance by fitting typically 20–30 cycles in the transition region; and we evaluate all 4 common Cq markers at a time: FDM, SDM, Cy0, and relative threshold Cr, all of which are scale-independent.
The FDM is obtained by setting the second derivative of LL4(x) = 0 and solving for x = xFDM. Similarly, the SDM is obtained by setting the third derivative = 0. Cy0 is defined [23] as the intercept with the cycle axis of a straight line tangent to the growth curve at the FDM. Cr is obtained by first estimating the plateau level ymax and then solving LL4(Cr) = r ymax, with r a specified fraction, typically about 0.15. In fitting data to Eq. (9), there is always at least one additional parameter for the baseline, and we have used as many as four, giving from 5 to 8 adjustable parameters (see below).
The expression for the FDM is
| $x_{\mathrm{FDM}} = g/Z^{1/h}$, | (10) |
where Z = (h + 1)/(hp – 1). We can obtain the FDM and its SE directly from the fit by replacing g with xFDM, whereupon the denominator in Eq. (9) becomes $D^p$ with
| $D = 1 + Z\,(x_{\mathrm{FDM}}/x)^h$. | (11) |
This substitution can be useful, because xFDM is both more precise and easier to estimate visually than g for curves with significant asymmetry. Cy0 is given by
| $C_{y0} = x_{\mathrm{FDM}}\,[1 - (1 + Z)/(hpZ)]$, | (12) |
and the SDM is obtained by solving the quadratic equation in u ($= (g/x_{\mathrm{SDM}})^h$),
| $Au^2 + Bu + C = 0$, | (13) |
with A = (hp–1)(hp–2), B = (h+1)(4–h–3hp), and C = (h+1)(h+2).
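The marker expressions above can be collected into a short routine. The sketch below assumes normal mode (h > 0, with hp > 2 so that A > 0) and uses hypothetical parameter values. The Cy0 line is written from its definition as the x-intercept of the tangent at the FDM, and the Cr line from solving LL4(Cr) = r·ymax; both are derived here for the illustration and are not copied from the authors' codes. Of the two roots of the quadratic, the larger u corresponds to the smaller x, which is the second-derivative maximum on the rising side.

```python
import math

def ll4(x, ymax, g, h, p):
    """Four-parameter log-logistic growth function of Eq. (9)."""
    return ymax * (1.0 + (g / x) ** h) ** (-p)

def ll4_markers(g, h, p, r=0.15):
    """FDM, SDM, Cy0 and Cr for the LL4 model (normal mode, h > 0 and h*p > 2 assumed)."""
    z = (h + 1.0) / (h * p - 1.0)
    x_fdm = g / z ** (1.0 / h)                       # Eq. (10)

    # Cy0: intercept of the tangent at the FDM with the baseline-corrected axis
    cy0 = x_fdm * (1.0 - (1.0 + z) / (h * p * z))

    # SDM: larger root u of A u^2 + B u + C = 0, with u = (g / x_SDM)^h
    a = (h * p - 1.0) * (h * p - 2.0)
    b = (h + 1.0) * (4.0 - h - 3.0 * h * p)
    c = (h + 1.0) * (h + 2.0)
    u = (-b + math.sqrt(b * b - 4.0 * a * c)) / (2.0 * a)
    x_sdm = g / u ** (1.0 / h)

    # Cr: solve LL4(Cr) = r * ymax for the cycle
    cr = g / (r ** (-1.0 / p) - 1.0) ** (1.0 / h)

    return {"FDM": x_fdm, "SDM": x_sdm, "Cy0": cy0, "Cr": cr}

# Hypothetical fit parameters, for illustration only
markers = ll4_markers(g=20.0, h=12.0, p=1.2, r=0.15)
for name, value in markers.items():
    print(f"{name:>3s} = {value:.3f}")
```

None of the four markers depends on ymax, which is the sense in which they are scale-independent.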
For asymmetrical growth profiles, there are actually two different modes on which the fits can converge: with h > 0 and h < 0. Correspondingly, the roles of the parameters for baseline and plateau are reversed, with ymax becoming negative; and p changes from >1 to 0 < p < 1. The two modes converge for p = 1 but are not equivalent otherwise, typically showing a factor of ∼2 difference in fit variance. We examine both modes below. Note that there are no parameters in this model for predicting E0 in the baseline region, and in fact the calculated values of E(x), from LL4(x+1)/LL4(x), are not physically reasonable outside of the growth region. Thus, estimating E0 with the LL4 model in SR analysis requires some prescription for choosing the appropriate x at which to assess this ratio. We do not pursue this matter further in this work.
As mentioned earlier, we have found that the Cq markers are obtained with optimal precision by limiting the fit to just 20–30 cycles centered on the growth region. We designate the fit region by first locating the approximate SDM (SDMprx) from second differences, as done by Boggy and Woolf [16], and then specifying a start cycle x1 = SDMprx – Δ1 and end cycle x2 = SDMprx + Δ2. By trial and error, we find best results with Δ1 = 8–12 and Δ2 = 12–16, depending on dataset.2 The Cq estimates do depend somewhat on x1 and x2, and in cases where the true SDM falls near a half-cycle, SDMprx can vary with reaction in a set of replicates, giving excess Cq variance from the variation in cycle range. To reduce such effects, we can use code versions that permit specifying absolute x1 and x2, which would normally be employed after discovering results that show varying ranges among replicates. With all replicates analyzed with the same x1 and x2, the precision is not strongly sensitive to these values, with variance typically changing by only a few percent when x1 and x2 are changed by ±1.
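The window selection described above can be sketched in a few lines. The example below locates the approximate SDM as the cycle of the largest second difference and then applies the offsets Δ1 and Δ2; the function names, the default offsets (taken from within the stated ranges), and the noise-free logistic test profile are all assumptions for illustration. Real profiles are noisy, so in practice the second differences may need to be restricted to the growth region.

```python
import math

def approximate_sdm_index(y):
    """Index into y of the largest second difference y[i+1] - 2*y[i] + y[i-1]."""
    second = [y[i + 1] - 2.0 * y[i] + y[i - 1] for i in range(1, len(y) - 1)]
    j = max(range(len(second)), key=second.__getitem__)
    return j + 1  # second[j] is centered on y[j + 1]

def fit_window(cycles, y, delta1=10, delta2=14):
    """Return (SDMprx, x1, x2), clipped to the available cycle range."""
    sdm_prx = cycles[approximate_sdm_index(y)]
    x1 = max(cycles[0], sdm_prx - delta1)
    x2 = min(cycles[-1], sdm_prx + delta2)
    return sdm_prx, x1, x2

# Noise-free logistic profile with half-intensity point at cycle 20 (illustrative)
cycles = list(range(1, 46))
profile = [1.0 / (1.0 + math.exp(-0.65 * (x - 20.0))) for x in cycles]

sdm_prx, x1, x2 = fit_window(cycles, profile)
print(f"SDMprx = {sdm_prx}, fit cycles {x1}-{x2} ({x2 - x1 + 1} points)")
```

Passing fixed x1 and x2 to all replicates, instead of calling `fit_window` per reaction, corresponds to the absolute-range option mentioned in the text.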
By using the LL4 estimates of ymax and SDM, we are able to estimate Cr a second way: fit just cycles up to the (rounded) SDM to the exponential growth law of Eq. (2) plus a suitable baseline function (discussed below). This threshold, which we label Cr,x, is a 1-step relative version of the FPLM method [10,24] and was used previously to estimate absolute copy number from the Cq variance [15].
2.3. Baseline functions
It is common practice to estimate the baseline from a selected range of early cycles and then subtract it from the profile to yield the growth curve. As compared with simultaneous estimation of baseline and growth parameters from a single nonlinear LS fit, this 2-step procedure is statistically inferior, yielding biased Cq estimates that on average amount to a 3-fold increase in Cq variance [12]. The 1-step NLS fit is easy to implement, with
| y(x) = bas(x) + LL4(x), | (14) |
being the fit model for the LL4 growth curve. The choice of function for bas(x) can depend on the range of early cycles included in the fit. We have commonly used linear and quadratic functions of x, in which we judge the need for parameters beyond the minimal single constant by their statistical significance in the fit: Any parameter having SE greater than its magnitude is statistically undefined in ad hoc fitting [25]. For data like those in the 94 × 4 Reps dataset from [10] that exhibit "saturation" baseline behavior, we have used the exponential plateau function [10,15],
| bas(x) = a – q exp(–ρx), | (15) |
sometimes with an added linear term. Saturation behavior manifests as a rise in the first few cycles, so if these are omitted from the fit range, a linear or quadratic bas(x) may perform as well.
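For concreteness, the one-step fit model of Eq. (14) with the two kinds of baseline function can be written as below. This is only a sketch of the model evaluation, with hypothetical parameter values and helper names; the nonlinear LS minimization itself is not shown.

```python
import math

def ll4(x, ymax, g, h, p):
    """Log-logistic growth function of Eq. (9)."""
    return ymax * (1.0 + (g / x) ** h) ** (-p)

def bas_poly(x, coeffs):
    """Polynomial baseline: constant, linear, or quadratic depending on len(coeffs)."""
    return sum(c * x**k for k, c in enumerate(coeffs))

def bas_saturation(x, a, q, rho):
    """Exponential-plateau ("saturation") baseline of Eq. (15)."""
    return a - q * math.exp(-rho * x)

def profile(x, baseline, ymax, g, h, p):
    """Fit model of Eq. (14): baseline plus LL4 growth, evaluated at cycle x."""
    return baseline(x) + ll4(x, ymax, g, h, p)

# Hypothetical parameters for illustration
linear_bas = lambda x: bas_poly(x, [100.0, 0.5])
sat_bas = lambda x: bas_saturation(x, 100.0, 20.0, 0.4)

for x in (1, 5, 20, 40):
    y_lin = profile(x, linear_bas, 1000.0, 20.0, 12.0, 1.0)
    y_sat = profile(x, sat_bas, 1000.0, 20.0, 12.0, 1.0)
    print(f"cycle {x:2d}: linear baseline {y_lin:8.2f}   saturation baseline {y_sat:8.2f}")
```

Because the baseline parameters are adjusted together with the growth parameters, their uncertainties propagate into the Cq standard errors, which a separate baseline subtraction would ignore.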
2.4. Weighted least squares
In linear LS it is rigorously true (but not widely appreciated) that minimum-variance parameter estimates are obtained if and only if the data are weighted inversely as their variances. The same is normally assumed to hold in nonlinear LS but cannot be proved, in part because many NLS estimators do not even have finite variance. Still, Monte Carlo simulations for common nonlinear models have supported inverse-variance weighting [21,26]. This issue arises in the LS fitting of qPCR data in two situations already mentioned: the analysis of reaction profile data by whole-curve fitting, and the fitting of Cq vs log(N0) for calibration [13].
Considering first the fitting of growth profile data to sigmoidal models, most such data are collected on instruments that monitor fluorescence. A major source of random error or noise in optical data has long been referred to as "flicker" [27], describing the fluctuations in the light source. The result is noise proportional to signal; for example, 1% fluctuation in the intensity of the light source produces 1% fluctuation in the detected signal ($\sigma_{y,p} = \sigma_p y$). However, as the signal decreases, the limiting noise becomes a constant related to the properties of the detection instrumentation. The sources are considered independent, so their variances add, giving a simple relation that holds well for a number of different experimental techniques [28,29]:
| $\sigma_y^2 = \sigma_0^2 + (\sigma_p y)^2$. | (16) |
The main effects of neglecting weights are reduced precision of estimation and incorrect parametric SEs; however, the magnitude of such losses may be tolerable when the weights vary by less than a factor of 10 over the data set [8,30]. This, for example, appears to be the case for the 94×4 Reps data from [10], where the baseline intensity level is about half that in the plateau region. However, it does not hold for most qPCR data we have analyzed, and is borderline for those in Fig. 1, where plateau levels are about 3 times baseline levels.
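The factor-of-10 guideline can be checked quickly from the variance function of Eq. (16). In the sketch below the noise parameters and signal levels are hypothetical, and the helper names are introduced only for this illustration; the weight range is simply the ratio of the largest to the smallest variance over the fitted signal range.

```python
def signal_variance(y, sigma0, sigma_p):
    """Variance function of Eq. (16): constant plus proportional noise."""
    return sigma0**2 + (sigma_p * y) ** 2

def weight_range(y_low, y_high, sigma0, sigma_p):
    """Ratio of the largest to the smallest inverse-variance weight over [y_low, y_high]."""
    return signal_variance(y_high, sigma0, sigma_p) / signal_variance(y_low, sigma0, sigma_p)

# Purely proportional noise (sigma0 = 0), 1% of signal
print("plateau = 2 x baseline:", round(weight_range(1.0, 2.0, 0.0, 0.01), 2))
print("plateau = 3 x baseline:", round(weight_range(1.0, 3.0, 0.0, 0.01), 2))
# Adding a constant noise floor reduces the range of the weights
print("plateau = 10 x baseline, with floor:", round(weight_range(1.0, 10.0, 0.02, 0.01), 2))
```

For purely proportional noise the weight range is the square of the plateau-to-baseline ratio, so a ratio of 2 gives a range of 4 and a ratio of 3 gives a range of 9, which is close to the factor-of-10 guideline.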
The conditions requiring weighting in calibration fitting of Cq vs log(N0) were discussed in Section 2.1. When the range of N0 includes values small enough to add Poisson scatter to the results, the increased variance dictates a reduced weight for the relevant N0 values. The Poisson variance contribution is predictable and was given in Eq. (8).
In assessing the quality of LS fits, an important metric is $\chi^2$ [13], defined as the sum of weighted squared residuals, $\sum w_i \delta_i^2$, where $\delta_i$ is the difference between observed and calculated $y_i$. If the data variances $\sigma_{y_i}^2$ are known and the $w_i$ are taken as their inverses, the expected value of $\chi^2$ is the number of statistical degrees of freedom ν = n–p, where n is the number of fitted values and p the number of adjustable parameters. Accordingly the expected value for the reduced $\chi^2$ (RCS, $\chi^2/\nu$) is 1. The standard deviation (SD) of the RCS is $(2/\nu)^{1/2}$, so that, e.g., the 90% range is 0.39–1.83 for ν = 10, narrowing to 0.66–1.39 for ν = 40 [25]. Too-high values of RCS result from some combination of optimistic assessment of the data error and a poor fit model, while too-low values indicate pessimistic data errors. In unweighted fitting the data are tacitly assumed to have constant variance and the $w_i$ are taken as 1; RCS then becomes an estimate of the data variance $\sigma_y^2$. When comparing fit models applied to the same data with the same weights, a smaller $\chi^2$ indicates a better fit, provided that any added parameters are statistically significant.
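The quantities in this paragraph are simple to compute once residuals and weights are in hand. The sketch below uses made-up observed and calculated values with a common assumed data SD; the function names are hypothetical.

```python
import math

def reduced_chi_square(observed, calculated, weights, n_params):
    """Return (chi-square, degrees of freedom, reduced chi-square) for a weighted fit."""
    chi2 = sum(w * (o - c) ** 2 for o, c, w in zip(observed, calculated, weights))
    nu = len(observed) - n_params
    return chi2, nu, chi2 / nu

def rcs_sd(nu):
    """Standard deviation of the reduced chi-square for nu degrees of freedom."""
    return math.sqrt(2.0 / nu)

# Made-up example: 6 points, 2 fitted parameters, data SD assumed to be 0.1
obs = [1.02, 2.11, 2.95, 4.08, 4.90, 6.05]
calc = [1.00, 2.00, 3.00, 4.00, 5.00, 6.00]
wts = [1.0 / 0.1**2] * len(obs)

chi2, nu, rcs = reduced_chi_square(obs, calc, wts, n_params=2)
print(f"chi2 = {chi2:.3f}, nu = {nu}, RCS = {rcs:.3f} (expected 1 +/- {rcs_sd(nu):.2f})")
```

With so few degrees of freedom the RCS is only a rough check; the expected SD of 0.71 for ν = 4 shows why large replicate sets are needed before the RCS can discriminate between fit models.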
2.5. Computations
We have used the KaleidaGraph program (Synergy Software) for examining data and running preliminary analyses, as well as for preparing the illustrations in this work. For processing the multireplicate data sets efficiently, we have devised FORTRAN (Microsoft) codes that are similar in structure to the NLS routine provided long ago by Bevington (program 11-5, CURFIT) [25]. The different Cq markers are evaluated as detailed above, and their SEs are obtained by error propagation [22]. The NLS fits require initial values for the parameters but are not very sensitive to these, making it possible to achieve successful convergence for all but at most a few experiments in a multireplicate data set in a first attempt, with success on problem experiments in a second pass.
User-friendly versions of these programs are planned for later distribution; similar codes in R are already available [20], and modifications that permit most of the computations described here can be obtained at https://github.com/anspiess/qPCR-algorithms.
3. Results and discussion
3.1. Performance tests
Fig. 3 illustrates the two convergence modes for the LL4 model on one of the reactions in the 94 × 4 Reps dataset from [10]. Note the negative values for ymax and h in alt mode and the reversed role for a from baseline to plateau level. For this example, the slope b is statistically insignificant in normal mode and the quadratic coefficient is insignificant in alt mode. The significantly smaller Chisq ($\chi^2$) value in alt mode is characteristic of all reactions in this dataset.3 As has been noted above, unweighted NLS suffices for these fits, because the data noise varies by only a factor of ∼2 from baseline to plateau. (Other reasons are discussed below.) The close agreement in the two estimates of the FDM holds in general for all Cq markers except the SDM, where there is systematic difference of about 0.2 (alt mode higher). This effect and the different $\chi^2$ values presumably relate to differences in how the asymmetry is handled in the early growth region vs the approach to plateau in the two modes.
Fig. 3.
NLS fits of first reaction at highest concentration in 94 × 4 Reps dataset [10] to LL4 + bas(x) = $a + bx + cx^2$, in normal (upper) and alternate (alt) modes. The quantity D in the denominator of LL4 is as defined in Eq. (11). Chisq is the sum of squared residuals for these unweighted fits.
Table 1 and Fig. 3 of [13] compared the precision of Cq estimation at each of the 4 concentrations in this dataset for the 7 methods reviewed in [10]. The minimum-variance results came from the Miner method [31] for 3 of the 4 concentrations, with Cy0 slightly bettering it for the most dilute samples. The closest to Miner was 5PSM [20], in which the LL4 model was employed but with a single baseline parameter and fitting all cycles. As we have noted, the present version of this model gives ensemble precisions equaling or bettering the best in [10]. Cy0 and Cr gave generally smaller ensemble SDs across the concentration range than FDM and SDM, with the two modes being comparable in all cases. Fig. 4 compares the Cy0 ensemble SDs with the smallest from [10] and with Cr,x estimates obtained in [15] using Eq. (15) for the saturation baseline. Also shown in Fig. 4 are the rms averages of the SEs predicted by the individual fits for Cy0 in alt mode and for the Cr,x model. The point of this comparison is to emphasize that the ensemble SDs display excess dispersion from effects other than data noise. If the latter were the only source of variability, we would expect the ensemble SDs to agree with the parametric SEs [21]. As already noted, the excess dispersion at small N0 is from Poisson scatter. At large N0 Poisson scatter is negligible, and we have suggested that pipette delivery volume error is the main source of excess dispersion [12,15]. Reasons for the much smaller predicted SE for the Cr,x model are discussed just below. The better fit quality for LL4 in alt mode is responsible for its predicted SEs being ∼25% smaller than those for normal mode.
Table 1.
Cq values obtained for 3 × 5 data from Rutledge and Stewart [7] by fitting to the LL4 model with a linear baseline function (Eq. (14)). Cycles in the indicated range (c1–c2) were fitted, with values inverse-variance weighted using the variance function from Fig. 9.^a
| N0 | c1–c2 | RCS^b | FDM | SDM | Cr^c | Cy0 | Cr,x^c,d |
|---|---|---|---|---|---|---|---|
| 188,000 | 7–30 | 0.826 | 18.408 (22) | 16.420 (28) | 15.549 (23) | 15.457 (20) | 15.442 (30) |
| | 7–30 | 1.325 | 18.447 (28) | 16.493 (37) | 15.525 (30) | 15.482 (25) | 15.373 (38) |
| | 7–30 | 0.96 | 18.444 (24) | 16.497 (31) | 15.518 (25) | 15.481 (21) | 15.402 (38) |
| 18,800 | 10–33 | 1.359 | 22.014 (28) | 20.052 (37) | 19.066 (31) | 19.030 (25) | 18.972 (41) |
| | 11–34 | 1.013 | 21.983 (24) | 20.025 (31) | 19.056 (27) | 19.014 (23) | 18.984 (27) |
| | 11–34 | 1.16 | 21.972 (26) | 20.011 (34) | 19.112 (29) | 19.039 (25) | 18.934 (27) |
| 1880 | 15–38 | 0.845 | 25.408 (22) | 23.441 (29) | 22.515 (25) | 22.454 (22) | 22.437 (29) |
| | 14–37 | 1.028 | 25.553 (24) | 23.587 (31) | 22.609 (27) | 22.570 (22) | 22.544 (27) |
| | 15–38 | 0.752 | 25.470 (20) | 23.528 (27) | 22.562 (23) | 22.523 (20) | 22.422 (27) |
| 188 | 18–41 | 1.741 | 29.196 (31) | 27.232 (40) | 26.306 (35) | 26.247 (28) | 26.211 (21) |
| | 18–41 | 1.501 | 29.073 (29) | 27.109 (38) | 26.160 (34) | 26.111 (28) | 26.097 (23) |
| | 19–42 | 1.41 | 29.376 (28) | 27.376 (37) | 26.506 (32) | 26.415 (28) | 26.471 (37) |
| 18.8 | 21–44 | 1.252 | 32.407 (27) | 30.455 (35) | 29.558 (31) | 29.491 (25) | 29.514 (28) |
| | 21–44 | 1.558 | 31.987 (31) | 30.040 (40) | 29.165 (36) | 29.090 (29) | 29.028 (26) |
| | 22–45 | 1.14 | 32.530 (25) | 30.585 (34) | 29.687 (30) | 29.622 (25) | 29.558 (19) |
a Figures in parentheses are parametric standard errors, in terms of final displayed digits; e.g., 18.408 (22) means SE = 0.022.
b Reduced chi-square.
c Obtained for yq = 0.12 ymax.
d Obtained fitting cycles 5–c2 to Eq. (2) plus a linear baseline, with c2 = 15, 19, 23, 26, 29 for the 5 concentrations, respectively.
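The comparison of ensemble SDs with parametric SEs can be tried directly on the tabulated entries. The sketch below parses the "value (SE)" shorthand of Table 1 and compares the ensemble SD of the three Cy0 replicates at N0 = 1880 with the rms of their parametric SEs. The helper names are hypothetical, and with only three replicates the ensemble SD is itself very uncertain, so the ratio is indicative at best.

```python
import math
import statistics

def parse_value_se(text):
    """Split '18.408 (22)' into (18.408, 0.022); the SE is in units of the last digit."""
    value_str, se_str = text.replace(")", "").split("(")
    value_str = value_str.strip()
    decimals = len(value_str.split(".")[1]) if "." in value_str else 0
    return float(value_str), int(se_str) * 10.0 ** (-decimals)

def rms(values):
    """Root-mean-square average."""
    return math.sqrt(sum(v * v for v in values) / len(values))

# Cy0 entries for the three replicates at N0 = 1880 (from Table 1)
entries = ["22.454 (22)", "22.570 (22)", "22.523 (20)"]
values, ses = zip(*(parse_value_se(e) for e in entries))

ensemble_sd = statistics.stdev(values)   # sample SD (n - 1 in the denominator)
rms_se = rms(ses)
print(f"ensemble SD = {ensemble_sd:.4f}, rms SE = {rms_se:.4f}, ratio = {ensemble_sd / rms_se:.1f}")
```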
Fig. 4.
Standard deviations/errors for each of 4 concentrations in the 94 × 4 Reps data [10]. Ensemble SDs at top from present estimates of Cy0, compared with best from [10] and Cr,x estimates from [15]. At bottom are the rms (root-mean-square) averages of the parametric SEs from the individual fits, for Cy0 using LL4 model in alt mode, and for Cr,x. Connecting lines are just for display purposes.
The observation that the ensemble SDs significantly exceed the parametric SEs means that the LL4 model used to estimate the Cq markers is likely adequate for this task, thanks to the excess dispersion from sources other than data noise. However, neither the LL4 model nor a similar 4-parameter version of Eq. (3) accurately represents the fitted data, as is clear from the pronounced systematic component in the fit residuals (see graphic and Supplemental Fig. S-5 in [12]). The improved fit quality in alt mode reduces but does not eliminate these effects. On the other hand, systematic residual trends practically vanish for the Cr,x model, because only a few cycles in the growth region are included in the fits. This leads to an estimated σy a factor of 3 smaller than that from the LL4 fits, and this is a major contributor to the smaller parametric SE for this model in Fig. 4. Poisson dispersion for small N0 is an unavoidable mathematical reality; however, the excess dispersion at large N0 in these data is presumably reducible through better experimental techniques, in which case better fit models will be needed to take advantage of the improved precision in whole-curve fitting. The low rms SE for Cr,x represents the best that can be hoped for optimal experimental data; its factor-of-4 improvement over the ensemble SDs at large N0 represents a $4^2$ (16-fold) efficiency improvement, meaning one reaction would be the statistical equivalent of 16 replicates.
As we have noted, one source of excess dispersion is correctable through data analysis alone, namely the effect of scale variability on the absolute threshold Ct. Fig. 5 compares the ensemble SDs for Ct and Cr,x, with yq and r chosen to make the average Cqs about the same. For the largest two N0s, the variance ratios are 2 and 5; these would have been even larger had we used the standard practice of separately estimating the baseline and subtracting before assessing Ct [12]. The large differences stem directly from the pronounced scale variability for these data [13,14].
Fig. 5.

Ensemble variances for absolute and relative threshold in the 94 × 4 Reps data. For Ct, yq = 700; for Cr,x, r = 0.18. Estimates for both were obtained by fitting to Eq. (2) plus the bas(x) function of Eq. (15). Error bars represent one SD. The average Ct values slightly exceed those for Cr,x, by from 0.07 to 0.30.
Fig. 6 shows calibration fit results for this dataset. The weights for all markers at each concentration were the same as used to produce Fig. 5 in [13], so the low Chisq values (cf. ν ≈ 370) are another indicator of the higher precision of the present Cq estimates. The results resemble those in Fig. 4 from [13] in showing a statistically insignificant cubic coefficient for Cr. This parameter was also insignificant for FDM, was barely so for Cy0, and was significant by about 2σ for SDM. Results for Cq from alt mode were very similar, with Chisq smaller by as much as ∼25 for SDM and higher by ∼5 for FDM. Fig. 7 shows that the E estimates from the quadratic fits are statistically consistent for all markers. The increase in E with concentration is solid, but at the highest concentration, all Es exceed the physical limit of 2.0 by more than 1σ.
Fig. 6.
Calibration fits of Cq estimates for 94 × 4 Reps data, weighted using a common set of inverse ensemble variances. At top are linear, quadratic, and cubic fits of the Cr estimates to polynomials in (x − 1.5), showing that the cubic coefficient (d) is not statistically defined but that the quadratic one (c) is. For comparison, the quadratic fits of Cy0, FDM, and SDM are included, confirming that c is statistically significant in every case and showing that all E estimates are statistically consistent at x = 1.5.
Fig. 7.
Amplification efficiency as a function of concentration, from quadratic fit results in Fig. 6. Error bars (1-σ) are shown for just Cy0 but are nearly identical for all 4 markers.
3.2. Weighted fitting
The need for weighted LS arises in two situations, already noted: in the estimation of Cq by fitting whole-curve data, when the scatter in the plateau region greatly exceeds that in the baseline region; and in calibration fitting, when the Cq range includes N0 values small enough (<100) that Poisson scatter contributes significantly to the Cq variance. Using the data behind Figs. 1 and 2, we show how to derive the needed weights.
Fig. 8 illustrates the estimation of the variance in the baseline and plateau regions for the curves shown in Fig. 1. The approach is to use the scatter about a smooth approximation of the data in regions where they vary little in magnitude. For that approximation, here we use polynomials of 4th order fitted by unweighted LS. From the discussion near the end of Section 2.4, the estimated variance is Chisq/ν, with ν = n − 5, giving estimated σy² = 173 and 1845, respectively, for the baseline and plateau regions from the included fit results. Note that all polynomial parameters are statistically significant here for the baseline fit (B), since all SEs are less than the parameter magnitudes; however, this is not true for the plateau data, and the indicated reduction of the fit order decreases that variance estimate to 1554.
Fig. 8.
Estimating data variance from polynomial fitting, for 4th dilution in 3 × 5 data from [7], in plateau (A) and baseline (B) regions. The estimated variances are Chisq/(n–5), with n = 14 in the plateau region and 22 in the baseline region. Fit results are shown for only the lowest curve in each panel; Chisq values for the other curves (open and solid points, respectively) are 27,000 and 14,700 (A) and 1056 and 4080 (B). Note that none of the parameters in A is statistically significant; in fact these data are well represented by a quadratic function, with little increase in Chisq but an increase of 2 in ν, giving ∼20% smaller estimated variances.
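The variance estimate described for Fig. 8 can be sketched in a few lines. This is a minimal illustration using NumPy and synthetic data (the cycle range, signal level, and noise level are hypothetical), not the KaleidaGraph procedure used for the figure.

```python
import numpy as np

def variance_from_polyfit(x, y, order=4):
    """Estimate data variance as S/nu from an unweighted polynomial LS fit."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    # Centre x to keep the polynomial fit well conditioned.
    xc = x - x.mean()
    coeffs = np.polyfit(xc, y, order)
    resid = y - np.polyval(coeffs, xc)
    s = float(np.sum(resid**2))      # sum of squared residuals ("Chisq")
    nu = x.size - (order + 1)        # degrees of freedom: n - 5 for order 4
    return s / nu, nu

# Synthetic plateau-like data (hypothetical numbers, for illustration only).
rng = np.random.default_rng(0)
cycles = np.arange(32, 46)           # n = 14 cycles
signal = 9000.0 + 15.0 * (cycles - 32) + rng.normal(0.0, 40.0, cycles.size)

var4, nu4 = variance_from_polyfit(cycles, signal, order=4)
var2, nu2 = variance_from_polyfit(cycles, signal, order=2)
print(nu4, nu2)   # 9 11
```

Dropping from 4th to 2nd order raises ν by 2, as noted in the caption; whether the variance estimate falls depends on how little the sum of squared residuals grows.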
Fig. 9 shows the fit of the 15 baseline and 15 plateau variance estimates from these data to Eq. (16). This is a weighted fit, since variance estimates have error proportional to (2/ν)^(1/2), i.e., σ(σy²) = (2/ν)^(1/2) σy². The resulting χ2 is statistically reasonable: 38, as compared with ν = 28. The data here are not sufficient to confirm that Eq. (16) is better than other variance functions (VFs); however, this function works well in many situations, and in any case VFs are not needed with great precision to yield near-optimal fit results when they are subsequently used in weighting. An alternative to the fit shown in Fig. 9 is the fit of ln(σy²) to ln(a² + (by)²), in which the weighting is simpler [28]: σ(ln(σy²)) = (2/ν)^(1/2). Since ν varies from estimate to estimate here, weighted fitting is still required.
Fig. 9.
Fit of estimated variances for 3 × 5 data from [7] to Eq. (16). From these results, the second term dominates the variance even in the baseline region.
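A weighted fit of this kind can be sketched as follows, assuming the variance function has the form σy² = a² + (by)² that appears in the logarithmic variant above. Written in terms of A = a² and B = b², the model is a straight line in y², so closed-form weighted LS suffices. The function name and the test values are hypothetical, and the weights are taken from the variance estimates themselves, which is a simplification.

```python
def fit_variance_function(y, var_est, nu):
    """Weighted LS fit of var = A + B*y**2, with A = a**2 and B = b**2."""
    u = [yi**2 for yi in y]
    # sigma(var) = sqrt(2/nu) * var, so each weight is 1 / sigma(var)**2.
    w = [1.0 / ((2.0 / n) * v**2) for v, n in zip(var_est, nu)]
    sw = sum(w)
    su = sum(wi * ui for wi, ui in zip(w, u))
    sv = sum(wi * vi for wi, vi in zip(w, var_est))
    suu = sum(wi * ui**2 for wi, ui in zip(w, u))
    suv = sum(wi * ui * vi for wi, ui, vi in zip(w, u, var_est))
    det = sw * suu - su**2
    A = (suu * sv - su * suv) / det
    B = (sw * suv - su * sv) / det
    # Weighted sum of squared residuals, to be compared with nu_fit = len(y) - 2.
    chi2 = sum(wi * (vi - A - B * ui)**2 for wi, ui, vi in zip(w, u, var_est))
    return A, B, chi2

# Hypothetical signal levels and exact variances generated from A = 100, B = 1e-4.
signals = [100.0, 200.0, 1000.0, 5000.0]
variances = [100.0 + 1e-4 * s**2 for s in signals]
dof = [17, 17, 9, 9]   # degrees of freedom behind each variance estimate

A_fit, B_fit, chi2_fit = fit_variance_function(signals, variances, dof)
print(A_fit, B_fit)
```

With real variance estimates, the resulting chi2 would be compared with the number of estimates minus 2, in the same way that 38 is compared with ν = 28 above.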
Using this VF to compute weights (inverse variances), we fitted the 15 profiles to Eq. (14), obtaining results for the 5 Cq markers summarized in Table 1. To do the calibration analysis of Fig. 2 properly, we need to include weights for Cq to compensate for the decreasing precision of these with decreasing N0 from Poisson scatter. Variances estimated from just 3 values are inherently very uncertain: (2/ν)^(1/2) = 1, meaning 100% relative SD. However, using the estimates collectively, we obtain adequate results by assuming the Cq variance is the sum of the predictable Poisson contribution and a constant, as shown in Fig. 10. When weights computed from these variance functions are used in the calibration fits of Fig. 2, there is no statistically significant nonlinearity for any of these Cq markers; and they yield virtually identical E: 1.918 for Cr, 1.916 for the others.
Fig. 10.
Cq variance estimates from replicate values in Table 1, displayed in logarithmic form, and results from fitting values for each marker to ln(σ²Poisson + A), where the Poisson variance σ²Poisson is given by Eq. (8), with E taken as 1.915, and A is a constant. Error bars are shown for Cr only but are the same for all, σ = 1. Values of A range from 0.00015 for Cy0 to 0.0011 for SDM.
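The "Poisson plus constant" variance model can be turned into calibration weights as sketched below. Eq. (8) is not reproduced in this section, so the Poisson term here is an assumption: it is the first-order error-propagation result for Cq = const − ln(N0)/ln(E) with var(N0) = N0. The value of A is a hypothetical choice within the range quoted in the caption.

```python
import math

def poisson_cq_variance(n0, efficiency):
    """Cq variance from Poisson scatter in N0, by first-order error propagation.

    Assumes Cq = const - ln(N0)/ln(E) and var(N0) = N0 (an assumed form
    standing in for the paper's Eq. (8)).
    """
    return 1.0 / (n0 * math.log(efficiency) ** 2)

def cq_weight(n0, efficiency, a_const):
    """Inverse of the total Cq variance: Poisson term plus a constant A."""
    return 1.0 / (poisson_cq_variance(n0, efficiency) + a_const)

E = 1.915        # value quoted in the Fig. 10 caption
A = 0.0005       # hypothetical constant, within the range quoted for A
for n0 in (10, 100, 1_000, 100_000):
    var = poisson_cq_variance(n0, E) + A
    print(f"N0={n0:>6}  var(Cq)={var:.5f}  SD={math.sqrt(var):.4f}")
```

Under these assumptions the Poisson term dominates below roughly N0 = 100 and the constant dominates at high concentration, which is why weighting matters most when the calibration range extends to dilute samples.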
3.3. Other multireplicate datasets
In addition to the two datasets already discussed, from [7] and [10], we have analyzed data from Rutledge and Cote [5] (20 replicates at each of 6 concentrations), Guescini et al. [23] (12 reps at 7 concentrations), Lievens et al. [32] (18 × 5), and Karlen et al. [33] (FN, 15 × 4 + 12). Of these, the Rutledge and Cote data showed only marginally significant quadratic coefficients for just two of the 5 Cq markers, leading to an increase in E from 1.90(1) to 1.93(1) from the most to least concentrated samples. Linear fits gave Es spanning the narrow range 1.902(3) to 1.916(3). As discussed below (and see online Supplement for more detail), the other three datasets showed statistically significant nonlinearity in calibration fits. The data themselves, some of which are no longer downloadable from the journal web sites, can be obtained at www.dr-spiess.de.
For the other three datasets, at least some Cq markers gave statistical significance for all 5 parameters in a fit to a quartic polynomial, and the Lievens data did so for all 5 markers. However, drawing reliable conclusions about the derivatives (hence E) from such high-order fits is risky, as the fit functions typically diverge from the fitted points rapidly outside their range. Also, the polynomial representation is just a means to an end, and typically other 4- and 5-parameter functions can give comparably small χ2 and yet yield substantially different AEs. Quantifying such "model error" requires examining other functions in trial-and-error fashion, a possibly demanding task. Here our purpose is less the precise determination of E than seeing whether it varies with N0 and how that might affect calibration results. Restricting our consideration to just cubic (4-parameter) and lower-order polynomials, we find that the Guescini and Karlen data require quadratic calibration, while the Lievens data justify cubic representation.
Baseline and plateau variance estimates for the Lievens data [32] indicated a significant weighting range (∼500), but in fact subsequent calibration fitting of the Cqs showed little difference for the weighted and unweighted estimates obtained by fitting with Eq. (14). This result is another manifestation of the effects discussed in connection with Fig. 4, namely that proper weighting doesn't help much when the ensemble statistics are dominated by factors other than the random noise in the profile data. A survey of the Cq estimates led to the exclusion of 8 reactions from further computations (see Supplement). The Cq variance analysis analogous to that in Fig. 10 is shown in Fig. 11. Here the samples decrease in concentration by factors of 5 from 10⁵ templates in the most concentrated to 160 in the least. However, at low N0 the Cq variances for all markers were higher than predicted, so we added a correction factor for N0 in the fit model, yielding values ranging from 0.37 for SDM to 0.61 for Cr,x. These values imply that N0 in the most dilute sample is 59–98, in rough agreement with the results from a similar analysis in [15]. To obtain a common variance function for all Cq markers, we set the correction factor at its average (0.42) and refitted, obtaining an average for the constant A = 0.0005. This VF was then used to weight the Cqs in the calibration fitting, yielding the cubic-based E results illustrated in Fig. 12. Although the case for N0-dependence in E is solid, the several Cq markers give AEs that often differ by more than their combined SEs. This statistical inconsistency can be taken as a rough indicator of model uncertainty, since the fit quality (χ2) does not vary strongly.
Fig. 11.
Cq variance estimates from Lievens data [32], displayed and fitted as in Fig. 10, with E taken as 1.86 and N0 = 160 for the lowest concentration. Error bars shown for Cr,x are representative of the others.
Fig. 12.
Dependence of AE on N0 from cubic calibration fits of Cq estimates for data from [32]. Error bars shown for FDM are comparable for all. χ2 in the calibration fits ranged from 87 for Cy0 to 112 for SDM (90 Cq values).
The ranges of data error in the growth curves for the Guescini [23] and Karlen FN data [33] were about 30 and 100, respectively, or enough to warrant weighting in the fits to extract Cq. However, here again visual surveys of the Cq estimates showed patterned variation that greatly exceeded the differences between weighted and unweighted estimates, indicating that the ensemble Cq variances are dominated by effects other than data noise (see Supplement). Indeed, for the Karlen data, the highest Cq variances occurred for the most concentrated samples, where Poisson scatter is negligible. Thus the Karlen Cqs were weighted in the calibration fits using just ensemble variances, in place of the approach of Figs. 10 and 11 (which was used for the Guescini Cqs). In both cases a single average Cq variance function was used for all markers, to permit comparison of their fit qualities through their χ2 values.
As already indicated, both datasets support quadratic calibration fits, giving slopes linear in log(N0) (but AEs that are nonlinear in log(N0), because E depends exponentially on the inverse slope). The extreme E estimates are illustrated in the form of error bands in Fig. 13. In both cases the SDM estimates (and only these) actually support constant E, though for the Guescini data, this is because of the anomalously poor quality of the calibration fit (χ2 = 500). In contrast, the Cr fits give lowest or near-lowest χ2 values and show clear dependence of E on N0.
Fig. 13.
1-σ confidence bands for extreme estimates of E as functions of log(N0) for data from [23] (lower) and [33] (upper). For the former (84 reactions), χ2 values for FDM, SDM, Cr, Cy0, and Cr,x were, respectively, 100, 500, 109, 183, and 109; in the same order the χ2 values for the latter (72 reactions) were 80, 79, 62, 66, and 71.
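The way a quadratic calibration translates into an N0-dependent E can be sketched as follows, assuming the standard relation E = 10^(−1/slope) with the slope taken as the local derivative of Cq with respect to x = log10(N0). The coefficient values and the function name are hypothetical and are not fit results from any of the datasets.

```python
import math

def efficiency_from_quadratic(b, c, x, x0=0.0):
    """E at x = log10(N0) for Cq = a + b*(x - x0) + c*(x - x0)**2."""
    slope = b + 2.0 * c * (x - x0)      # local calibration slope dCq/dx
    return 10.0 ** (-1.0 / slope)

# Sanity check: a straight line with the ideal slope gives E = 2 everywhere.
ideal_slope = -1.0 / math.log10(2.0)    # about -3.32 cycles per decade
print(efficiency_from_quadratic(ideal_slope, 0.0, 3.0))

# Hypothetical quadratic calibration, centred at x0 = 1.5 as in Fig. 6.
b, c = -3.50, 0.03
for x in (0.0, 1.5, 3.0):
    print(x, round(efficiency_from_quadratic(b, c, x, x0=1.5), 3))
```

The slope changes linearly with x, but E does not, because it depends exponentially on the reciprocal of the slope. With a positive quadratic coefficient, E rises with concentration.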
For these 4 datasets, the precision comparisons resemble those in Fig. 4, with the greatest disparity between rms parametric SEs and ensemble SDs for the Karlen data (ratio ∼10) and the least for Rutledge and Cote (∼2). The SEs are compared in Fig. 14, which shows that Cr,x is lowest except for the Lievens data, where Cy0 is slightly lower. We are reluctant to interpret these data any further, as some have clearly been baselined and may also have been smoothed. For statistical comparisons like those we are attempting here to be valid, fully "raw" data are needed [12,34].
Fig. 14.
rms parametric SE values for the different Cq markers, averaged over all concentrations in each dataset. These SEs generally vary little with concentration.
3.4. E is not constant; so what?
The consequences of nonconstant E depend on whether and how this E is used. Thus, if calibration data are fitted to a linear response (assuming constant E) instead of a quadratic, E is not needed explicitly and the resulting bias is likely to be minor as long as the unknown is in the range of the calibration data. For example, with Cr calibration of the Guescini data (Fig. 13), the greatest in-range bias from linear calibration (which gives E = 1.913) is only −0.04 in log(N0) for the unknown, which is about half the magnitude of the SE for a single measurement, and which translates into a 9% undershoot in the estimate of N0. Since replicates reduce the SE roughly as n^(−1/2), this bias does become marginally significant for 4 replicates (see footnote 4). Alternatively, if E is used to estimate N0 for an unknown relative to a single known (as is the hope behind SR methods), the error increases with the difference ΔCq for the reference and unknown. For example, a 5% error in E gives a factor 1.05^ΔCq (or its reciprocal), which is about 2 (or 1/2) for |ΔCq| = 14. (The error is smaller if 5% is the maximum error in variable E, as for the Guescini data in Fig. 13; see Supplement for a fuller discussion of this matter.)
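The two numerical examples in this paragraph can be checked with a short calculation. The function names are illustrative only; the inputs are the values quoted in the text.

```python
def n0_ratio_error(rel_error_in_e, delta_cq):
    """Multiplicative error in N0 from a relative error in E over delta_cq cycles."""
    return (1.0 + rel_error_in_e) ** delta_cq

def n0_bias_percent(bias_in_log10_n0):
    """Percent bias in N0 implied by a bias in log10(N0)."""
    return 100.0 * (10.0 ** bias_in_log10_n0 - 1.0)

print(round(n0_ratio_error(0.05, 14), 2))   # 1.98, i.e. about a factor of 2
print(round(n0_bias_percent(-0.04), 1))     # -8.8, i.e. roughly a 9% undershoot
```

The first function makes the point of the extrapolation argument explicit: the error grows geometrically with |ΔCq|, so a modest error in E that is harmless over a few cycles becomes large over a long extrapolation.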
The significant dependence of estimated E on choice of Cq marker for the last three datasets considered here means that their growth curves are changing shape systematically with N0. When such changes occur smoothly, there is little effect on calibration results, an exception being the SDM for the Guescini data, which gave anomalously large χ2. For best estimates of the "true" E, the threshold Cr,x might be least sensitive to such N0-dependent shape changes. Note that even though this method yields low-biased SR estimates of E [12], its Cq estimates remain valid for calibration fitting. Note further that the whole-curve-based threshold marker Cr gave comparable χ2 in calibration fitting and Es that were statistically consistent with those from Cr,x.
4. Conclusion
A nonlinear LS algorithm for fitting qPCR growth profiles to a 4-parameter log-logistic function [20] has been modified to incorporate multiparameter baseline functions and permit fitting selected cycle ranges, with output of 5 common scale-independent Cq markers and their SEs. Statistical comparisons for the very large 94 × 4 Reps dataset show that the Cq estimates from this routine are comparable to the best obtainable by other methods. The provision for limiting the fitting to selected cycle ranges means that whole-curve models can be used even on data with anomalous profiles, like the “hook” behavior sometimes seen in the plateau region [35].
The dependence of Cq on log(N0) shows statistically significant nonlinear behavior for 4 of 6 multireplicate datasets, which means that the AE is not constant for these. When such data are used in calibration, neglect of this nonlinearity leads to bias in the estimates of N0; however, this bias is likely to be tolerably small in most applications. On the other hand when linear-based E estimates are used in lengthy extrapolations from a reference sample, the bias can become significant. This could be important when E is estimated from single-reaction data, because such estimates (1) focus on the early growth cycles, and (2) typically are low-biased. Even without the bias, such estimates are perforce constant and may not well approximate E in the earliest baseline cycles, for which Cq calibration is relevant.
In Cq calibration fitting, weighting is important anytime the data include dilutions where Poisson scatter is significant, and may be needed when other sources of anomalous scatter can be identified in the Cq data. Individual reaction growth curves generally show much more scatter in the plateau region than in the baseline cycles, again indicating the need for weighting when fitting to whole-curve models. However, we find that such weighting makes little difference in the quality of the Cq estimates, as evidenced from their subsequent calibration fitting. This means that the dispersion in Cq estimates is dominated by factors other than data noise. We have suggested that for concentrations where Poisson scatter is negligible, pipette volume uncertainty may be the main such factor [12,13,15] – a notion that has also received attention in more recent work by de Ronde et al. [36]. The good news: qPCR is capable of much better precision with experimental procedures that reduce such operational uncertainties. A cautionary note is in order here. We have attributed the variance discrepancy to excess ensemble variance, which assumes that the fitted data are truly raw. There are indications that some qPCR data have been smoothed before being reported. Such smoothing can lead to falsely precise parametric SEs, making the latter the source of the variance discrepancy [12,34]. There is no easy way to get reliable statistics by fitting smoothed data, so qPCR instrument makers should ensure that the raw data are accessible, and users should use these data in any fitting.
Which Cq marker is "best"? Using χ2 from the calibration fitting as the goodness metric, the best performer varies with dataset, as shown in Fig. 15, where the values for each dataset have been normalized to the average for that dataset. SDM is somewhat poorer than the others, even when the anomalously high χ2 for the Guescini data is excluded. Cy0 is best, with mean = 0.87 and low dispersion, σ = 0.05. (However, the low σ is also a statistical quirk, as without the high SDM value, Cy0 would be high for the Guescini data.) Cr,x is close behind Cy0, at 0.90(18); and Cr and FDM are in statistical agreement – 0.94(27) and 0.95(23). In short, with quality routines for estimating them, there is little difference in performance among the compared Cq markers, with slightly poorer results for SDM.
Fig. 15.
χ2 values from the weighted calibration fits for the different Cq markers, normalized to unity for each dataset. In each case, common weights were used for the 5 Cqs.
The results shown in Fig. 15 are based on actual performance on the replicate datasets. A possibly more telling answer to the "best Cq" question is given by the SE comparisons in Fig. 14, because these represent what could be achieved without extraneous experimental sources of variability. As we have noted earlier, all but Cr,x are affected by limitations in the whole-curve model, as manifested in systematic trends in the LS fit residuals. Cr,x is much less sensitive to such effects, because only the first few growth cycles are included in the fitting. This restricted cycle range also means that weights will probably never be needed in these fits, while they are often warranted in whole-curve fitting and will make a difference there when data noise is the only source of variance. It is for these reasons that we chose Cr,x in our method of estimating N0 from Cq variance [15].
Competing interests
The authors declare that there are no conflicts of interest.
Acknowledgments
We thank Bob Rutledge for providing the raw data for the 15 reactions from [7] – the 3 × 5 data used here. Funding has been provided to ANS by grant Sp721/4–2 of the Deutsche Forschungsgemeinschaft (DFG).
Handled by Jim Huggett
Footnotes
1. The E (E0) that appear here are apparent amplification factors, relevant for data analysis. The true AE for the target amplicon may not equal E, especially in the plateau region, where complications from effects like limited dye and amplicon reannealing can lead to a decrease in the fluorescence signal.
2. This specification can as well be tied to the approximate FDM from first differences, in which case Δ1 and Δ2 increase and decrease by ∼2 cycles, respectively, giving a range that is approximately symmetric about the FDM.
3. This behavior is not general. An examination of representative data from the profiles shown in Fig. 1 and from two additional datasets discussed further below showed χ2 for alt mode higher in all 16 cases, by factors from 1.2 to 4.4.
4. The SE for the unknown includes contributions from the calibration curve, but here that is small compared with a single measurement of the unknown, thanks to the large number of replicates. In general, optimal calibration is achieved with about equal numbers of knowns and unknowns, requiring both to be increased together to best increase precision.
Supplementary data associated with this article can be found, in the online version, at https://doi.org/10.1016/j.bdq.2019.100084.
References
1. Svec D., Tichopad A., Novosadova V., Pfaffl M.W., Kubista M. How good is a PCR efficiency estimate: recommendations for precise and robust qPCR efficiency estimates. Biomol. Detect. Quantif. 2015;3:9–16. doi: 10.1016/j.bdq.2015.01.005.
2. Pfaffl M.W. A new mathematical model for relative quantification in real-time RT-PCR. Nucleic Acids Res. 2001;29:e45. doi: 10.1093/nar/29.9.e45.
3. Tellinghuisen J. Using nonlinear least squares to assess relative expression and its uncertainty in real-time qPCR studies. Anal. Biochem. 2016;496:1–3. doi: 10.1016/j.ab.2015.10.016.
4. Bustin S.A. Absolute quantification of mRNA using real-time reverse transcription polymerase chain reaction assays. J. Mol. Endocrinol. 2000;25:169–193. doi: 10.1677/jme.0.0250169.
5. Rutledge R.G., Cote C. Mathematics of quantitative kinetic PCR and the application of standard curves. Nucleic Acids Res. 2003;31:e93. doi: 10.1093/nar/gng093.
6. Larionov A., Krause A., Miller W. A standard curve based method for relative real time PCR data processing. BMC Bioinformatics. 2005;6:62. doi: 10.1186/1471-2105-6-62.
7. Rutledge R., Stewart D. Critical evaluation of methods used to determine amplification efficiency refutes the exponential character of real-time PCR. BMC Mol. Biol. 2008;9:96. doi: 10.1186/1471-2199-9-96.
8. Tellinghuisen J. Simple algorithms for nonlinear calibration by the classical and standard additions method. Analyst. 2005;130:370–378. doi: 10.1039/b411054d.
9. Tellinghuisen J. Least squares in calibration: weights, nonlinearity, and other nuisances. Methods Enzymol. 2009;454:259–285. doi: 10.1016/S0076-6879(08)03810-X.
10. Ruijter J.M., Pfaffl M.W., Zhao S., Spiess A.N., Boggy G., Blom J., Rutledge R.G., Sisti D., Lievens A., De Preter K., Derveaux S., Hellemans J., Vandesompele J. Evaluation of qPCR curve analysis methods for reliable biomarker discovery: Bias, resolution, precision, and implications. Methods. 2013;59:32–46. doi: 10.1016/j.ymeth.2012.08.011.
11. Tellinghuisen J., Spiess A.-N. Statistical uncertainty and its propagation in the analysis of quantitative polymerase chain reaction data: comparison of methods. Anal. Biochem. 2014;464:94–102. doi: 10.1016/j.ab.2014.06.015.
12. Tellinghuisen J., Spiess A.-N. Bias and imprecision in analysis of real-time quantitative polymerase chain reaction data. Anal. Chem. 2015;87:8925–8931. doi: 10.1021/acs.analchem.5b02057.
13. Tellinghuisen J., Spiess A.-N. Comparing real-time quantitative polymerase chain reaction analysis methods for precision, linearity, and accuracy of estimating amplification efficiency. Anal. Biochem. 2014;449:76–82. doi: 10.1016/j.ab.2013.12.020.
14. Spiess A.-N., Rödiger S., Burdukiewicz M., Volksdorf T., Tellinghuisen J. System-specific periodicity in quantitative real-time polymerase chain reaction data questions threshold-based quantitation. Sci. Rep. 2016;6:38951. doi: 10.1038/srep38951.
15. Tellinghuisen J., Spiess A.-N. Absolute copy number from the statistics of the quantification cycle in replicate quantitative polymerase chain reaction experiments. Anal. Chem. 2015;87:1889–1895. doi: 10.1021/acs.analchem.5b00077.
16. Boggy G.J., Woolf P.J. A mechanistic model of PCR for accurate quantification of quantitative PCR data. PLoS One. 2010;5:e12355. doi: 10.1371/journal.pone.0012355.
17. Carr A.C., Moore S.D. Robust quantification of polymerase chain reactions using global fitting. PLoS One. 2012;7:e37640. doi: 10.1371/journal.pone.0037640.
18. Chervoneva I., Li Y.Y., Iglewicz B., Waldman S., Hyslop T. Relative quantification based on logistic models for individual polymerase chain reactions. Stat. Med. 2007;26:5596–5611. doi: 10.1002/sim.3127.
19. Rutledge R., Stewart D. A kinetic-based sigmoidal model for the polymerase chain reaction and its application to high-capacity quantitative real-time PCR. BMC Biotechnol. 2008;8:47. doi: 10.1186/1472-6750-8-47.
20. Spiess A.N., Feig C., Ritz C. Highly accurate sigmoidal fitting of real-time PCR data by introducing a parameter for asymmetry. BMC Bioinformatics. 2008;9:221. doi: 10.1186/1471-2105-9-221.
21. Tellinghuisen J. Can you trust the parametric standard errors in nonlinear least squares? Yes, with provisos. Biochim. Biophys. Acta. 2018;1862:886–894. doi: 10.1016/j.bbagen.2017.12.016.
22. Tellinghuisen J. Statistical error propagation. J. Phys. Chem. A. 2001;105:3917–3921.
23. Guescini M., Sisti D., Rocchi M.B., Stocchi L., Stocchi V. A new real-time PCR method to overcome significant quantitative inaccuracy due to slight amplification inhibition. BMC Bioinformatics. 2008;9:326. doi: 10.1186/1471-2105-9-326.
24. Tichopad A., Dilger M., Schwarz G., Pfaffl M.W. Standardized determination of real-time PCR efficiency from a single reaction set-up. Nucleic Acids Res. 2003;31:e122. doi: 10.1093/nar/gng122.
25. Bevington P.R. Data Reduction and Error Analysis for the Physical Sciences. McGraw-Hill; New York: 1969.
26. Tellinghuisen J. A Monte Carlo study of precision, bias, and non-Gaussian distributions in nonlinear least squares. J. Phys. Chem. A. 2000;104:2834–2844.
27. Ingle J.D., Jr. Exchange of comments: signal flicker noise and noise power spectra. Anal. Chem. 1977;49:339–340. (and references therein)
28. Zeng Q.C., Zhang E., Dong H., Tellinghuisen J. Weighted least squares in calibration: estimating data variance functions in high-performance liquid chromatography. J. Chromatogr. A. 2008;1206:147–152. doi: 10.1016/j.chroma.2008.08.036.
29. Tellinghuisen J., Bolster C.H. Least-squares analysis of phosphorus soil sorption data with weighting from variance function estimation: a statistical case for the Freundlich isotherm. Environ. Sci. Technol. 2010;44:5029–5034. doi: 10.1021/es100535b.
30. Tellinghuisen J. Weighted least-squares in calibration: what difference does it make? Analyst. 2007;132:536–543. doi: 10.1039/b701696d.
31. Zhao S., Fernald R.D. Comprehensive algorithm for quantitative real-time polymerase chain reaction. J. Comput. Biol. 2005;12:1047–1064. doi: 10.1089/cmb.2005.12.1047.
32. Lievens A., Van Aelst S., Van den Bulcke M., Goetghebeur E. Enhanced analysis of real-time PCR data by using a variable efficiency model: FPK-PCR. Nucleic Acids Res. 2012;40:e10. doi: 10.1093/nar/gkr775.
33. Karlen Y., McNair A., Perseguers S., Mazza C., Mermod N. Statistical significance of quantitative PCR. BMC Bioinformatics. 2007;8:131. doi: 10.1186/1471-2105-8-131.
34. Spiess A.-N., Deutschmann C., Burdukiewicz M., Himmelreich R., Klat K., Schierack P., Rödiger S. Impact of smoothing on parameter estimation in quantitative DNA amplification experiments. Clin. Chem. 2015;61:379–388. doi: 10.1373/clinchem.2014.230656.
35. Burdukiewicz M., Spiess A.-N., Blagodatskikh K.A., Lehmann W., Schierack P., Rödiger S. Algorithms for automated detection of hook effect-bearing amplification curves. Biomol. Detect. Quantif. 2018;16:1–4. doi: 10.1016/j.bdq.2018.08.001.
36. de Ronde M.W.J., Ruijter J.M., Lanfear D., Bayes-Genes A., Kok M.G.M., Creemers E.E., Pinto Y.M., Pinto-Sietsma S.-J. Practical data handling pipeline improves performance of qPCR-based circulating miRNA measurements. RNA. 2017;23:811–821. doi: 10.1261/rna.059063.116.