Royal Society Open Science. 2022 Feb 2; 9(2): 211103. doi: 10.1098/rsos.211103

Misuse of Beer–Lambert Law and other calibration curves

Rosario Delgado
PMCID: PMC8808104  PMID: 35127113

Abstract

Calibration curves allow instrument calibration by predicting the concentration of an analyte in a sample from the reading of the instrument. This curve is constructed as the regression straight line that best fits the relationship between some known concentration standards and their respective instrument readings. An example is the Beer–Lambert Law, used to predict the concentration of a new sample from its absorbance obtained by spectrometry. The issue is that this methodology is usually misapplied. In this paper, we want to clarify this point, explaining what the error consists of and how (easily) to fix it, with the intention of ensuring that it does not continue to be reproduced in experimental scientific work.

Keywords: calibration curve, Beer–Lambert Law, spectrometry, linear regression, concentration, absorbance

1. Introduction

Instrument calibration involves the construction of a calibration curve that allows one to predict the concentration of an analyte in a sample from the reading of an instrument. This curve is the linear regression model that 'best fits' the relationship between some known concentration standards and the respective instrument responses. Of course, the effectiveness of the calibration procedure will depend on whether the relationship between the concentration and the instrument reading is indeed (approximately) linear. If it is, bivariate regression may be used to address the issue of predicting the output or dependent variable, say Y, from the input, regressor or independent variable X, by fitting a straight line to a scatterplot of observations on both variables, with the values of the variable X on the x-axis (abscissa) and those of the variable Y on the y-axis (ordinate). The best straight line, in the sense of minimizing the sum of the squared errors of prediction, has the expression

$$y = b_0 + b_1 x,$$

with $b_0$ the intercept (where the straight line intersects the y-axis) and $b_1$ the slope, both computed from the observations (see formula (A 1) in appendix A). The prediction for variable Y when X = $x_0$ is that given by the straight line, that is, $b_0 + b_1 x_0$.
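As a minimal illustration of this fitting-and-predicting scheme (with made-up numbers, not data from any of the studies cited below), in R one can write:

```r
# Fit the least-squares straight line y = b0 + b1*x to made-up (x, y) pairs
x <- c(1, 2, 3, 4, 5)
y <- c(2.1, 3.9, 6.2, 7.8, 10.1)
fit <- lm(y ~ x)          # lm() implements ordinary least squares
coef(fit)                 # b0 (intercept) and b1 (slope)
predict(fit, data.frame(x = 3.5))   # prediction b0 + b1*3.5 for x0 = 3.5
```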

A paradigmatic example is the very popular Beer–Lambert Law (also known as Beer's Law), which establishes that, under ideal conditions, the absorbance of a solution of an absorbing substance obtained by spectrometry techniques is directly proportional to the substance's concentration. This implies that increasing the concentration increases the absorbance, because a highly concentrated solution absorbs more light than a dilute one, and that this happens in a linear way. This relationship between absorbance and concentration is used not only by chemists, but by experimental scientists of many other disciplines. Details of what this law says are given in §2.

There are innumerable works that collect research in which Beer's Law has been applied in very diverse fields that use the technique of spectrometry. Just to mention a few of them: in [1] the authors obtain the absorbance of some samples of glucose extracted from three different types of fruit peel wastes using UV–Vis spectroscopy, and from it, by means of Beer's Law, they obtain the corresponding concentrations and compare them. In [2] the authors say verbatim that 'The significance of Beer–Lambert Law is to measure the absorbance of a particular sample and to infer the concentration of the solution'. They use a spectrometer to measure the absorbance of three macronutrients that are essential for plant growth (nitrogen, phosphorus and potassium) and are commonly used in fertilizers, in non-agricultural soil. As the quantity of fertilizer has to be estimated based on the requirements for optimum production, they apply the Beer–Lambert Law to determine the nutrient concentrations. Paper [3] explains a study for the determination of the amount of manganese metal present in tricalcium phosphate, using a flame atomic absorption spectrophotometer to observe the corresponding absorbance, by means of the calibration curve. The authors of [4] carry out an experiment to introduce a method to estimate amlodipine in pure drug and marketed tablet formulation, consisting in the use of a calibration curve derived from Beer's Law to obtain the concentration from the absorbance. Andriamahenina et al. [5] investigate the effect of the presence of outliers in the calibration of lead by graphite furnace atomic absorption spectrometry, concluding that the presence of outliers worsens the quality of the measurement of the concentration of lead obtained, via the calibration curve, from the absorbance given by the instrument reading. A non-invasive alternative for blood glucose monitoring is introduced in [6], based on the detection of the optical density of the solution samples by means of a spectrophotometer, and then converting it into the corresponding glucose concentration by using the Beer–Lambert Law, with the help of a concentration curve. In [7] the Ocean Optics Ocean View spectrometer operating software is used to obtain and process data from the spectrometer and get the transmittance (then, the absorbance) of a uric acid solution, from which to calculate the uric acid concentration by using a concentration curve. The authors of [8] present and validate a quick and sensitive spectrophotometric method for the quantitative determination of gliquidone in bulk drug, pharmaceutical formulations and human serum, based on the absorbance readings and their transformation into concentration through a calibration curve of the absorbance over the concentration. Restrepo et al. [9] report an easy methodology to construct handmade solar cells to produce clean energy from chlorophyll-a (chl-a) extracted from the leaves of the Diacol Capiro potato. A spectroscopic calibration curve was constructed using different chl-a standard solutions and their absorbances. In [10] a quality-by-design (QbD) approach was implemented for the routine quality control analysis of serotonin in pharmaceutical dosage form through a spectroscopic method, by using a calibration curve of the absorbance over the concentration.

Although very common, Beer's Law is not the only source of application of calibration curves in different fields. For example, in the very recent paper [11] the authors construct calibration curves for the total protein eluted from membranes with respect to the concentrations of Bevacizumab or Trastuzumab added to the serum employed to load the membranes. The total protein eluted from the membranes is determined by measuring native fluorescence, and then the concentration of Bevacizumab or Trastuzumab is determined using the calibration curve.

The problem of the proper use of calibration curves is common to many engineering and science applications, but not much attention has been paid to it from Statistics, with some exceptions (see ch. 15 in [12], for example, and references therein). The objective of this work is to show simply, without too many technicalities and in a way accessible to engineers and experimental scientists, the misuse of calibration curves, explaining how to (easily) correct this pitfall, which could result in undesirable consequences. This issue has been treated before, although not always with the same success (see details in §4), but it is still worth reporting and publicizing to prevent it from spreading further among experimental scientists. Probably, in most cases this error has no practical importance and does not invalidate the published studies, since there will be little difference between the results obtained using the wrong calibration curve (classical calibration) and those obtained using the proper one (inverse regression). Nevertheless, this does not prevent the error from being worth noting, for three main reasons:

  • (a)

    because regardless of the practical implications, from a conceptual point of view, the statistical methodology must be used in the appropriate way;

  • (b)

    because a priori it is not possible to know the extent of the repercussions of the misuse of the calibration curve on the results of an experiment;

  • (c)

    because an error does not cease to be so even though it is very generalized and commonly accepted.

The organization of the rest of the paper is as follows: §2 recalls what the Beer–Lambert Law states, and in §3 we explain the misuse of the Beer–Lambert Law and other calibration curves. Section 4 details how to fix this problem, and a toy example of calibration is developed in §5 to show how the two calibration curves are applied and to compare them. Section 6 includes a few words in conclusion and an outline, in figure 6, of which calibration curve is appropriate in each situation. Finally, in appendix A we recall the main formulae of the linear regression model, and in appendix B we show two more examples of calibration, one with real experimental data and the other using simulation.

Figure 6. Outline on how to choose the most suitable calibration curve in each situation to get the proper prediction.

2. The Beer–Lambert Law

A spectrophotometer is an instrument that measures the number of photons delivered by a solution of a chemical species that absorbs light of a particular wavelength in a given unit of time, which is called the intensity, allowing one to compare the intensity of the beam of light entering the solution (I0) with the intensity of the beam of light exiting it (I). The ratio of these intensities is called the transmittance, and is denoted by the letter T; that is, T = I/I0. If the transmittance is a measure of the quantity of photons passing through a solution (the proportion of the intensity of the light entering the solution that exits), the absorbance A is a measure of how much light is absorbed by the solution, and is defined as a function of the transmittance in this way:

$$A = -\log_{10}(T), \qquad (2.1)$$

(large values of absorbance are associated with very little light passing through the solution; conversely, small values of absorbance are associated with most of the light passing entirely through it).

When passing a beam of light of the appropriate wavelength through the solution, if it is fairly dilute, the photons will encounter a small number of the absorbing chemical species and then we can expect a high transmittance and low absorbance. On the contrary, if the solution is highly concentrated we will expect a higher number of the absorbing chemical species and a low transmittance and high absorbance. This leads us to think that the absorbance could be a monotonic increasing function of the concentration of the solution, and even that it could be (directly) proportional to it. As well, it seems that the absorbance would increase if the beam of light goes through the solution for a longer period of time, and since the speed of light is constant, we could think that the absorbance is also directly proportional to the path length of the beam through the solution. In this way we come to the (deterministic) Beer–Lambert Law, which states the following:

$$\text{The Beer–Lambert Law:}\qquad A = \varepsilon L c, \qquad (2.2)$$

where c is the concentration of the absorbing species in the solution, L is the path length of the beam through the sample compartment where the solution is, and ε is the proportionality constant. If the path length L is reported in centimetres (cm) and the concentration c in molarity (moles per litre, mol l−1), the proportionality constant ε is called the molar absorptivity or molar extinction coefficient, and has units of litres per mole-centimetre (l mol−1 cm−1). In this way, when multiplying ε, L and c, all the units cancel, and it follows that the absorbance A is unitless. Note that ε is intrinsic to the absorption of the solution of the chemical species at a particular wavelength of light.

If, in a given context, we know three of the four quantities that appear in equation (2.2), we can solve for the value of the fourth. We could obtain the absorbance A of a solution from its concentration c, knowing the other two quantities L and ε, simply by substituting into (2.2). Or vice versa, knowing the absorbance of the solution at a given wavelength, usually from the transmittance by using equation (2.1), we could obtain the concentration by solving for c in equation (2.2),

$$c = \frac{A}{\varepsilon L}, \qquad (2.3)$$

(note that equations (2.2) and (2.3) are completely equivalent, since ɛ L > 0).
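When ε and L are both known, this is pure arithmetic; the following R snippet (with illustrative values that are not taken from the paper) does nothing more than apply equation (2.3):

```r
# Direct use of equation (2.3) when eps*L is known (illustrative values only)
eps <- 5000    # molar absorptivity, l mol^-1 cm^-1 (assumed)
L   <- 1       # path length, cm (assumed)
A   <- 0.35    # measured absorbance (assumed)
A / (eps * L)  # concentration in mol l^-1: 7e-05
```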

The crux of the issue appears when the product of the molar absorptivity and the path length, ɛ L, which is constant for a given solution (ɛ) and as long as the same sample compartment is used to make measurements (L), is not known. Then, in order to determine the concentration c of the solution given its absorbance value A, a calibration curve needs to be constructed. And it is at this point that the source of the error appears, as will be described in the next section.

3. Misuse of the calibration curves

What is this widespread error? In the context of lack of knowledge of the (constant) value of εL, the following misuse of the Beer–Lambert Law is usually committed: in order to construct the calibration curve to predict the concentration of an unknown solution from its known absorbance, a set of standard concentrations within the range of the measuring instrument is prepared, and the corresponding absorbances are determined by spectrometry, say $(c_1, A_1), (c_2, A_2), \ldots, (c_n, A_n)$. Then the equation of the regression straight line for the response variable absorbance and prediction variable (or regressor) concentration that best fits these n points is

$$\text{Calibration curve of } A \text{ over } c:\qquad A = \beta_0 + \beta_1 c, \qquad (3.1)$$

where $\beta_1$ is the slope of the line and $\beta_0$ the y-intercept, both obtained from the n points by means of the linear least-squares method and given by the formulae

$$\beta_1 = \frac{\sum_{i=1}^{n} c_i A_i - n\,\bar{c}\,\bar{A}}{\sum_{i=1}^{n} c_i^2 - n(\bar{c})^2}, \qquad \beta_0 = \bar{A} - \beta_1 \bar{c}, \qquad \text{with } \bar{c} = \frac{1}{n}\sum_{i=1}^{n} c_i, \quad \bar{A} = \frac{1}{n}\sum_{i=1}^{n} A_i. \qquad (3.2)$$

Now, denote by $\hat{A}_i$ the prediction of the absorbance given by the straight line for a solution whose concentration is that of the ith point, $c_i$; it is obtained by substituting $c_i$ into equation (3.1),

$$\hat{A}_i = \beta_0 + \beta_1 c_i,$$

then the difference (error) between the observed and the predicted absorbance for the solution with concentration $c_i$ is $e_i = A_i - \hat{A}_i$, and the formulae in (3.2) are obtained by imposing that the sum of the squared errors be minimum:

$$\mathrm{SSE} = \sum_{i=1}^{n} e_i^2 = \sum_{i=1}^{n} (A_i - \hat{A}_i)^2 = \sum_{i=1}^{n} \bigl(A_i - (\beta_0 + \beta_1 c_i)\bigr)^2. \qquad (3.3)$$

That is, if absorbance A is plotted versus concentration c for the series of n known solutions with the dependent variable A placed on the y-axis, and the independent variable c graphed on the x-axis, the calibration curve (3.1) is the straight line that best fits the n points in the plane in the sense of minimizing the sum of the squares of the distances from each point to its prediction vertically (figure 1).

Figure 1. Calibration curve of A over c properly used to predict absorbance A from concentration c. The error of prediction is $e_i = A_i - \hat{A}_i$ (b).

Calibration curve (3.1) is therefore intended for predicting the absorbance of new solutions whose concentrations are known, since with the parameters $\beta_0$ and $\beta_1$ given by (3.2), it ensures that the sum of the squared errors committed in prediction for the n initial solutions is as low as possible. Then, given the concentration of a new solution, say $c_0$, we can obtain the predicted absorbance value for it, $\hat{A}_0$, from equation (3.1) by substituting the concentration $c_0$, that is, $\hat{A}_0 = \beta_0 + \beta_1 c_0$ (figure 2a). However, in what is known as classical calibration, (3.1) is usually used inappropriately to predict the concentration of new solutions whose absorbances are known, in the following way: first finding the y-value on the regression straight line corresponding to the measured absorbance, and then tracing downward to see which concentration matches up with it; this value will be the predicted concentration of the solution with that absorbance (figure 2b).

Figure 2. (a) Calibration curve of A over c properly used to predict absorbance from concentration, and prediction of the absorbance $\hat{A}_0$ of a new solution from its concentration $c_0$. (b) Calibration curve of A over c misused to predict concentration from absorbance (classical calibration), and prediction of the concentration $\hat{c}_0$ of a new solution from its absorbance $A_0$.

That is, given the absorbance value of a new unknown solution, say $A_0$, the usual (wrong) practice is to obtain the predicted concentration value for it, $\hat{c}_0$, from equation (3.1) by substituting the absorbance $A_0$, that is,

$$\hat{c}_0 = \frac{A_0 - \beta_0}{\beta_1} = -\frac{\beta_0}{\beta_1} + \frac{A_0}{\beta_1} = b + m A_0, \qquad (3.4)$$

where $b = -\beta_0/\beta_1$ and $m = 1/\beta_1$, with $\beta_0$ close to zero and $\beta_1$ an estimate of the unknown product εL, both computed using the formulae in (3.2). If we predict the concentration for the ith point given its absorbance in this (wrong) way, we obtain

$$\hat{c}_i = \frac{A_i - \beta_0}{\beta_1}. \qquad (3.5)$$

But then, the sum of squared errors (differences between observed and predicted concentrations) is

$$\sum_{i=1}^{n} (c_i - \hat{c}_i)^2 = \sum_{i=1}^{n} \left(c_i - \frac{A_i - \beta_0}{\beta_1}\right)^2,$$

and we have no optimality result for it: with $\beta_0$ and $\beta_1$ given by (3.2), we cannot ensure that this sum is as small as possible, unlike what happens with (3.3).

In summary: it is algebraically possible to predict the concentration from the absorbance by using the calibration curve of the absorbance A over the concentration c given by (3.1), following the expression (3.4) with $\beta_0$ and $\beta_1$ given by (3.2), as in figure 2b. This is the classical calibration approach. But it is not the optimal way, since we have no control over the prediction errors that are committed. Therefore, this procedure should be avoided. Instead, it is advisable to reserve (3.1) exclusively for predicting the absorbance from the concentration, because for that task it is optimal in achieving the minimum sum of squared prediction errors (figure 2a).

4. Easily fixing it

The problem is easily solvable: since the aim is to construct a calibration curve to predict the concentration of a new solution whose absorbance is known, from the concentrations and absorbances of the initial known solutions, the regression straight line of the concentration c over the absorbance A is the proper one to use, since it is the one that minimizes the sum of the squared errors of prediction (ordinary least squares, OLS). From the known concentrations and absorbances of the set of n solutions, we obtain the equation of the regression straight line for the response variable concentration and prediction variable absorbance:

$$\text{Calibration curve of } c \text{ over } A:\qquad c = \alpha_0 + \alpha_1 A, \qquad (4.1)$$

with slope $\alpha_1$, which is an estimate of $(\varepsilon L)^{-1}$, and intercept $\alpha_0$ (close to zero), obtained from the formulae

$$\alpha_1 = \frac{\sum_{i=1}^{n} c_i A_i - n\,\bar{c}\,\bar{A}}{\sum_{i=1}^{n} A_i^2 - n(\bar{A})^2}, \qquad \alpha_0 = \bar{c} - \alpha_1 \bar{A}. \qquad (4.2)$$

Given the absorbance corresponding to the ith point, $A_i$, the prediction of its concentration, $\hat{c}_i$, is obtained by substituting $A_i$ into equation (4.1), that is,

$$\hat{c}_i = \alpha_0 + \alpha_1 A_i, \qquad (4.3)$$

and then the difference (error) between the observed and the predicted concentration for the solution with absorbance $A_i$ is $\varepsilon_i = c_i - \hat{c}_i$; in this case, the formulae in (4.2) are obtained by imposing that the following sum of squared errors be minimum:

$$\sum_{i=1}^{n} \varepsilon_i^2 = \sum_{i=1}^{n} (c_i - \hat{c}_i)^2 = \sum_{i=1}^{n} \bigl(c_i - (\alpha_0 + \alpha_1 A_i)\bigr)^2,$$

(see figure 3). Note that the two straight lines (4.1) and (3.1) intersect at the point $(\bar{c}, \bar{A})$.

Figure 3. (a) Calibration curve of A over c (blue) and calibration curve of c over A (magenta), on the same coordinate axes. The intersection point of the two lines is $(\bar{c}, \bar{A})$. (b) Calibration curve of c over A interchanging the axes, with the prediction error $\varepsilon_i = c_i - \hat{c}_i$.

Given the absorbance of a new solution, say $A_0$, we can obtain the predicted concentration value for it, $\hat{c}_0$, from equation (4.1) by substituting the absorbance $A_0$ in this direct way:

$$\hat{c}_0 = \alpha_0 + \alpha_1 A_0, \qquad (4.4)$$

and if we compare (4.4) with (3.4), we realize that in general $\alpha_0 \neq b$ and $\alpha_1 \neq m$; that is, the two approaches are not equivalent, as can be seen graphically in figure 4.

Figure 4. Predicting the concentration $\hat{c}_0$ of a new solution from its absorbance $A_0$. (a) In blue, with the calibration curve of c over A (inverse regression). (b) Comparison with the prediction using the calibration curve of A over c (classical calibration) in red, on the same coordinate axes.
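A minimal sketch in R of this non-equivalence, with made-up standards (any numbers not lying exactly on a line will do): the coefficients $b$ and $m$ obtained by inverting the A-over-c line, as in (3.4), differ from the OLS coefficients $\alpha_0$ and $\alpha_1$ of the c-over-A line.

```r
# Made-up standards: inverting the A-over-c fit (classical calibration)
# does not give the same line as fitting c over A (inverse regression)
conc <- c(20, 40, 60, 80, 100)
absb <- c(0.011, 0.019, 0.032, 0.038, 0.051)
beta  <- coef(lm(absb ~ conc))   # beta0, beta1 of (3.1)
alpha <- coef(lm(conc ~ absb))   # alpha0, alpha1 of (4.1)
c(b = unname(-beta[1] / beta[2]), m = unname(1 / beta[2]))  # from (3.4)
unname(alpha)                    # generally different values
```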

Since we are interested in minimizing the sum of the squared errors of prediction, it is evident that the proper calibration curve is (4.1) and not (3.1). This approach has been known as inverse regression since [13]. It is perfectly adequate in terms of prediction errors, since the OLS method does not depend on any additional hypotheses about the regression model, being the optimal approach in the sense of minimizing the sum of the squared errors.

However, it is true that to make statistical inferences about the linear regression model (confidence intervals and tests of hypotheses on the coefficients of the regression straight line), some hypotheses are assumed (see appendix A for details), the most basic being that the regressor is measured without error, and that the response variable is randomly distributed following a normal distribution with mean a linear function of the regressor and constant variance. We will call them the LR hypotheses (for linear regression). If we are interested in making statistical inferences about the regression model, we have to design the experiment to collect data in such a way that these assumptions are reasonably fulfilled. In our case, this means that absorbances have to be measured with precision while concentrations are measured with non-negligible error, which in practice may not be possible; this is considered in the literature the weak point of the inverse regression approach. Indeed, in the opinion of Parker et al. [14], for example, the observed measurements (absorbances) in practical calibrations are subject to measurement error, violating the LR hypotheses.

What if the LR hypotheses with the absorbance as regressor and the concentration as response, corresponding to the inverse regression approach, are not fulfilled, not even roughly? Nothing invalidates this approach, in our opinion, for the following reasons:

  • (1)

The hypotheses are needed if we want to make statistical inference about the model, not to make predictions, which can be carried out regardless.

  • (2)

The convenience of using the inverse regression approach relies on OLS, which does not depend on any hypothesis, but only on the errors of prediction, which allow the predictive capacity of any model to be evaluated.

  • (3)

    The greater predictive power of the inverse regression, compared with that of classical calibration, gives support to its use and has been shown empirically in this work by a toy example in §5 and two more examples in appendix B, one with real experimental data, and the other built by simulation.

Likewise, it has also been described in some works. In this regard, [13] compared classical calibration (named there Method A) and inverse regression (Method B) using simulations, and recommended the latter based on the mean squared error. The authors of [14] also arrived at the same conclusions through some simulation studies (see also references therein in the same vein), although other authors criticize that recommendation. For example, in the recent paper [15], the authors introduced a new methodology, 'reverse inverse regression', to address the same problem, assuming that the inputs (concentration values) vary according to Gaussian distributions, which allows them to derive some statistical properties; they criticize the inverse regression approach for treating the inputs (absorbance values) as determined with small error. But they compare their method against classical calibration and inverse regression using a simulation study, and have to acknowledge the better behaviour of the latter in the sense of minimizing the variance of the prediction interval.

In brief, leaving aside assumptions that may or may not hold (and which, when fulfilled, allow some statistical properties of the linear regression model to be deduced), if we are interested in prediction, the best approach nonetheless seems to be inverse regression.

5. A toy example

We prepare a set of n ( = 10) standards within the range of the measuring instrument, with the following made-up values of concentration (in mg l−1) and absorbance, recorded in table 1.

Table 1.

Toy example: concentration and absorbance of 10 solutions, and their averages.

concentration (mg l−1) absorbance
20 0.0060
40 0.0111
60 0.0233
80 0.0547
100 0.0489
120 0.0675
140 0.0654
160 0.0625
180 0.0785
200 0.0705
$\bar{c} = 110$   $\bar{A} = 0.04884$

The two calibration curves given by (3.2) and (4.2) are:

Classical calibration (curve of A over c): $A = \beta_0 + \beta_1 c = 0.00554 + 0.0003936364\,c$
Inverse regression (curve of c over A): $c = \alpha_0 + \alpha_1 A = 6.06475 + 2128.07645\,A$
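As a sketch, both curves can be reproduced in R from the data of table 1 (lm implements the OLS formulae (3.2) and (4.2), so the coefficients below should match those reported above up to rounding):

```r
# Toy example of table 1: fit the two calibration curves by OLS
conc <- seq(20, 200, by = 20)
absb <- c(0.0060, 0.0111, 0.0233, 0.0547, 0.0489,
          0.0675, 0.0654, 0.0625, 0.0785, 0.0705)
fit_cc  <- lm(absb ~ conc)   # classical calibration: A over c, eq. (3.1)
fit_inv <- lm(conc ~ absb)   # inverse regression:    c over A, eq. (4.1)
coef(fit_cc)    # beta0  = 0.00554, beta1  = 0.0003936364
coef(fit_inv)   # alpha0 = 6.06475, alpha1 = 2128.07645
```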

We can observe in figure 5 that indeed, as explained above, the two curves are not the same, and they cut at the point $(\bar{c} = 110, \bar{A} = 0.04884)$. Moreover, the values of the R-squared (R²) have also been reported for the two calibration curves, that of the inverse regression approach being the higher one for predicting concentration from absorbance. R² represents the proportion of variation in the response variable that is explained by the calibration curve (the higher the better).

Figure 5. Calibration curves for the toy example to predict the concentration from the absorbance. In blue, the calibration curve of A over c (classical calibration). In magenta, the (proper) calibration curve of c over A (inverse regression).

Note that $R^2 = 1 - (\mathrm{SSE}/\mathrm{SST})$, where SSE and SST denote the sum of squared errors and the total sum of squares, respectively; that is, $\mathrm{SSE} = \sum_{i=1}^{n}(c_i - \hat{c}_i)^2$ and $\mathrm{SST} = \sum_{i=1}^{n} c_i^2 - n(\bar{c})^2$, with $\hat{c}_i$ the prediction for the concentration of the ith solution (with absorbance $A_i$), given by (3.5) for the classical calibration approach but by (4.3) for the inverse regression. In table 2, we report the predictions $\hat{c}_i$ with the two approaches.

Table 2.

Toy example: predictions with the two methods: classical calibration and inverse regression, and corresponding prediction errors with the difference of the absolute value of the errors. In italics the maximum R2 and the minimum standard error (s.e.), as well as the p-value for the one-sided t-test in favour of the hypothesis that the mean of the differences is greater than 0.

Ai        ci    classical ĉi = (Ai − β0)/β1   inverse ĉi = α0 + α1 Ai   classical ei   inverse εi    difference |ei| − |εi|
0.0060    20      1.168591                      18.83320                  18.83141       1.166795     17.66461345
0.0111    40     14.124711                      29.68639                  25.87529      10.313605     15.56168328
0.0233    60     45.117783                      55.64893                  14.88222       4.351073     10.53114443
0.0547    80    124.886836                     122.47053                 −44.88684     −42.470528      2.41630800
0.0489   100    110.152425                     110.12768                 −10.15242     −10.127685      0.02474035
0.0675   120    157.404157                     149.70991                 −37.40416     −29.709907      7.69425040
0.0654   140    152.069284                     145.24095                 −12.06928      −5.240946      6.82833797
0.0625   160    144.702079                     139.06952                  15.29792      20.930476     −5.63255415
0.0785   180    185.348730                     173.11875                  −5.34873       6.881252     −1.53252256
0.0705   200    165.025404                     156.09414                  34.97460      43.905864     −8.93126815

                       classical   inverse
SSE =                  6394.129    5356.287     Shapiro–Wilk p-value = 0.915
MSE = SSE/(n − 2) =     799.266     669.536     one-sided t-test for mean > 0:
s.e. = √MSE =            28.271      25.875     p-value = 0.07094*
R² = 1 − SSE/SST =      0.80624     0.83769

*Significance at 10% level.

As expected, the proper calibration curve (that of c over A) has a lower standard error (s.e.) and a higher R² than the usual one (the calibration curve of A over c) for predicting concentration from absorbance, which confirms the theoretical result stating that it is better. In other words, inverse regression is better than classical calibration in the sense of minimizing the sum of squared errors in prediction, and this conclusion is independent of the hypotheses of the linear regression model.

One way to see whether the differences in prediction errors are statistically significant is as follows: consider the differences of the absolute values of the prediction errors with the two approaches (last column in table 2). For this sample of size 10, we can perform a goodness-of-fit test for normality (Shapiro–Wilk test), obtaining a p-value of 0.915, which does not allow us to reject the hypothesis of normality, so we apply the one-sided t-test to compare the mean against 0, giving a p-value of 0.07094*. This p-value is not less than 0.05 but it is not very far off (it is less than 0.10), so we can say that there is a slight statistical significance in favour of the difference of the absolute values of the predictive errors being positive or, what is the same, that on average the errors with the classical calibration approach are greater in absolute value than with the inverse regression. Since in practical calibrations the errors made in prediction are among the most important measures of the goodness of the calibration method, in table 3 we also record the values of the radii of the prediction intervals.
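A sketch in R of this pair of tests on the table 1 data (it should reproduce the p-values just quoted):

```r
# Compare absolute prediction errors of the two approaches (table 2)
conc <- seq(20, 200, by = 20)
absb <- c(0.0060, 0.0111, 0.0233, 0.0547, 0.0489,
          0.0675, 0.0654, 0.0625, 0.0785, 0.0705)
fit_cc  <- lm(absb ~ conc)
fit_inv <- lm(conc ~ absb)
chat_cc  <- (absb - coef(fit_cc)[1]) / coef(fit_cc)[2]  # predictions (3.5)
chat_inv <- fitted(fit_inv)                             # predictions (4.3)
d <- abs(conc - chat_cc) - abs(conc - chat_inv)         # |e_i| - |eps_i|
shapiro.test(d)                     # normality check: p-value = 0.915
t.test(d, alternative = "greater")  # one-sided t-test: p-value = 0.07094
```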

Table 3.

Radii of the (approximate) prediction intervals, and p-value of the one-sided t-test in favour of the hypothesis that the mean of the differences of the radii is greater than 0.

Ai        ci    classical (a)   inverse (b)   difference (a) − (b)
0.0060    20    13.47537        12.76106      0.6607362
0.0111    40    13.28582        12.60487      0.6366435
0.0233    60    12.90608        12.29470      0.5882947
0.0547    80    12.57598        12.02834      0.5462853
0.0489   100    12.55686        12.01301      0.5438552
0.0675   120    12.74684        12.16581      0.5680196
0.0654   140    12.70719        12.13383      0.5629739
0.0625   160    12.65973        12.09561      0.5569352
0.0785   180    13.02142        12.38851      0.6029853
0.0705   200    12.81089        12.21756      0.5761737

Shapiro–Wilk p-value = 0.1859
one-sided t-test for mean > 0: p-value = 1.998 × 10−12***

***Significance at 0.1% level.

For any absorbance $A_i$, the corresponding prediction intervals are of the form $\hat{c}_i \pm (a)$ using the classical calibration (the expression for (a), derived with the approximate Delta method, can be found in (A 6), appendix A), and $\hat{c}_i \pm (b)$ with the inverse regression, where by (A 5) in appendix A,

$$(b) = t^{\,n-2}_{1-(\alpha/2)} \sqrt{\frac{\sum_{i=1}^{n}\varepsilon_i^2}{n-2}\left(1 + \frac{1}{n} + \frac{(A_i - \bar{A})^2}{\sum_{i=1}^{n} A_i^2 - n\bar{A}^2}\right)},$$

with $\varepsilon_i = c_i - (\alpha_0 + \alpha_1 A_i)$.

Note that both (a) and (b) in table 3 are deduced from the assumptions of the linear regression model; therefore, they will be more or less accurate depending on the degree of compliance with the LR hypotheses. In any case, for all absorbance values, the estimated radius of the prediction interval is greater with the classical calibration than with the inverse regression. This fact is statistically significant: if the two methods were equivalent from the perspective of the prediction interval error, or if the classical calibration were better, the probability that for the 10 absorbance values the prediction interval radii with the inverse regression are all less than the corresponding ones with the classical calibration is upper bounded by

$$P\bigl(B(10, p = 0.5) = 10\bigr) = 0.5^{10} = 0.0009765625,$$

which is a very low p-value (corresponding to the exact binomial test). This means that the probability that all 10 prediction interval radii with the inverse regression are less than the corresponding ones with the classical calibration, if the first method is not better than the second in the sense of having smaller prediction error, is very low; this reveals that the assumption must be rejected, and we accept that inverse regression is statistically significantly better than classical calibration. The same conclusion is reached by performing a one-sided t-test to compare the mean of the differences (a) − (b) with 0, with a p-value of 1.998 × 10−12*** in favour of the mean being greater than 0 or, equivalently, of the radii of the prediction intervals for the classical calibration being on average greater than those for the inverse regression. The t-test is performed after a Shapiro–Wilk test of normality, whose p-value is 0.1859.

As a final comment in this toy example, note that the analysis of variance (ANOVA) methodology for regression (see appendix A) can only be applied to the inverse regression approach, and that in this case, the ANOVA table is:

source of variation (response c)   d.f.   sum Sq                           mean Sq                    F-value
regressor A                        1      α1² S_AA = 27643.713             α1² S_AA = 27643.713       f = 41.28787
residuals (error)                  8      SSE = Σ εi² = 5356.287           MSE = 669.536
total                              9      SST = Σ (ci − c̄)² = 33000

where $S_{AA} = \sum_{i=1}^{n}(A_i - \bar{A})^2 = 0.006104104$. Then, if the LR hypotheses hold, the null hypothesis H0: 'no linear relationship between A and c' is rejected, since the corresponding p-value is P(F1,8 > 41.28787) = 0.0002035***. That is, we accept with very strong statistical significance that A and c are linearly related. We observe the concordance between the values in this ANOVA table and those of table 2. However, this is not true for classical calibration, the other approach. The reason is clear: the values recorded in its ANOVA table (which we have not reproduced here) are those of the regression curve of A over c, A = β0 + β1 c, used to predict the absorbance from the concentrations, and not vice versa. For this reason, the ANOVA methodology does not turn out to be useful for comparing both approaches.
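In R, this table is produced directly by anova() applied to the inverse-regression fit (a sketch with the table 1 data):

```r
# ANOVA table for the inverse regression (response c, regressor A)
conc <- seq(20, 200, by = 20)
absb <- c(0.0060, 0.0111, 0.0233, 0.0547, 0.0489,
          0.0675, 0.0654, 0.0625, 0.0785, 0.0705)
anova(lm(conc ~ absb))
# absb:      Df 1, Sum Sq 27643.7, F value 41.288, Pr(>F) 0.0002035
# Residuals: Df 8, Sum Sq  5356.3
```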

6. Conclusion

There are many very painstaking experimental works in which an analytical methodology to determine the concentration of a given substance by using spectrometry is described. Without trying to undermine the interest of these studies, it is necessary to mention that in them, in a systematic way, a gross error is made in the application of the Beer–Lambert Law used to determine the concentration c from the absorbance A. The pitfall consists in using the calibration curve of A over c (classical calibration), which is clearly not an optimal approach (see [13], for example), instead of using the calibration curve of c over A, which is the appropriate one (inverse regression), in the sense of minimizing the sum of the squared errors of prediction.

But this does not only happen in the application of Beer's Law: it is also a common practice in other contexts where instrument calibration is used, when inexpensive and quick measurements (Y) are related to expensive and time-consuming measurements (X) based on a set of observations, and we are interested in estimating the expensive measurement of X given a new measurement of Y. Instead of using the classical calibration approach, it is advisable, from the point of view of minimizing the sum of squared errors of prediction, to use inverse regression. A guide on how to get it right is in figure 6.

Even if the LR hypotheses with the absorbance as regressor and the concentration as response are not fulfilled, the inverse regression approach remains valid: to carry out predictions it is not necessary for the hypotheses to be fulfilled, since the inverse regression approach relies on OLS, which does not depend on any hypothesis. Moreover, the greater predictive power of the inverse regression, compared with that of classical calibration, gives support to its use. This is founded on the fact that inverse regression minimizes the sum of the squared errors of the predictions for the concentration given the absorbance, but it is also shown empirically in this work by a toy example in §5 and two more examples, one with real data and the other built by simulation, in appendix B.

The assumption that in the classical calibration approach the LR hypotheses are fulfilled is nothing more than an entelechy: how can one be sure of the normality of the absorbance distribution given the concentration value, which is assumed to be fixed (and determined without error, despite the fact that measurement errors are unavoidable), and of the rest of the hypotheses? Despite the (possible but not usual) use of methods for studying the goodness of fit of the observations to them, assuming the hypotheses of a model is always a delicate subject that could be considered, in a sense, a matter of faith. Evaluating the predictive capacity of a model by means of the sum of the squared errors of prediction is not.

Even in the simulation example presented in appendix B, in which the absorbance values have been simulated from those of the concentration, which are fixed, according to the equation of a straight line with additive Gaussian noise, that is, in such a way that it can be assumed that the LR hypotheses are fulfilled with the concentration as regressor and the absorbance as output variable (classical calibration), from a predictive point of view it turns out that the inverse regression approach surpasses the classical calibration. In other words: leaving aside assumptions that may or may not hold (and which, when fulfilled, allow some statistical properties of the linear regression model to be deduced), if we are interested in prediction, the most appropriate course is to use the inverse regression approach.

It is true that in many applications the difference between the predicted concentrations obtained with both calibration curves is small, and therefore, for practical purposes, this error does not usually have great consequences. However, this does not justify overlooking the entanglement, which is important from a conceptual point of view. What is more, it could potentially have practical consequences, so it should be avoided. This paper aims to draw the attention of experimental scientists to this important issue and contribute to the eradication of this pitfall.


Acknowledgements

The author wishes to thank the anonymous referees and Associate Editor for careful reading and helpful comments that resulted in an overall improvement of the paper.

Appendix A. The linear regression model

In this section, we give the formulae relative to the linear regression model, which is a model to describe the linear relationship between two quantitative variables, namely X, the input or regressor, and Y, the output or predicted variable. In each scenario, which of the two variables should play the role of X and which of Y depends on the objective: the variable that has to play the role of Y is the one for which we want to obtain a prediction given a known value of the other variable (which, then, will play the role of X). This asymmetry between the variables is a factor to take into account, since it can be a source of confusion. Indeed, it is very important to resolve this issue at the beginning, before building the model, since making the wrong decision will lead, as explained above is common in instrument calibration by spectrometry, to predictions subject to greater error; highlighting and clarifying this matter is precisely the motivation of this paper.

The linear regression model of Y over X is a straight line whose equation is the one that better fits the data, which is a set of n > 2 pairs of values of the variables X and Y, say (x1, y1), (x2, y2), …, (xn, yn), and is given by

$$y = b_0 + b_1 x,$$

where b0 and b1 are obtained from the data in this way

$$b_1 = \frac{S_{xy}}{S_{xx}}, \qquad b_0 = \bar{y} - b_1\bar{x}, \qquad \text{with } \bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i, \;\; \bar{y} = \frac{1}{n}\sum_{i=1}^{n} y_i, \;\; S_{xy} = \sum_{i=1}^{n} x_i y_i - n\bar{x}\bar{y}, \;\; S_{xx} = \sum_{i=1}^{n} x_i^2 - n(\bar{x})^2. \qquad (\mathrm{A\,1})$$

(Note that the asymmetry between X and Y is reflected in the expressions to obtain the coefficients of the straight line b0 and b1.)

In what sense is the regression line the one that best approximates the data? In the sense that it minimizes the sum of the squared errors $e_i$, which are the differences between the observed value of the variable Y when the variable X takes the value $x_i$, namely $y_i$, and the prediction given by the regression straight line, $\hat{y}_i = b_0 + b_1 x_i$; that is, $e_i = y_i - \hat{y}_i$. If the relationship between X and Y were perfectly explained by the straight line (a hypothetical, deterministic situation), then $e_i = 0$ for i = 1, …, n.

By imposing this criterion we can easily find (A 1). This is the well-known ordinary least squares (OLS) method, due to Carl F. Gauss. To apply this method, we must differentiate

$$\mathrm{SSE} = \sum_{i=1}^{n} e_i^2 = \sum_{i=1}^{n}(y_i - \hat{y}_i)^2 = \sum_{i=1}^{n}\bigl(y_i - (b_0 + b_1 x_i)\bigr)^2,$$

with respect to b0 and b1, and set these two derivatives to zero. Indeed, we obtain

$$-2\sum_{i=1}^{n}\bigl(y_i - (b_0 + b_1 x_i)\bigr) = 0 \qquad \text{and} \qquad -2\sum_{i=1}^{n}\bigl(y_i - (b_0 + b_1 x_i)\bigr)x_i = 0.$$

From the first we get

$$\sum_{i=1}^{n}\bigl(y_i - (b_0 + b_1 x_i)\bigr) = 0 \;\Longrightarrow\; \sum_{i=1}^{n} y_i - n b_0 - b_1\sum_{i=1}^{n} x_i = 0 \;\Longrightarrow\; \bar{y} - b_0 - b_1\bar{x} = 0 \;\Longrightarrow\; b_0 = \bar{y} - b_1\bar{x},$$

and from the second, by substituting the expression obtained for b0, we finally have that

$$\sum_{i=1}^{n}\bigl(y_i - (b_0 + b_1 x_i)\bigr)x_i = 0 \;\Longrightarrow\; \sum_{i=1}^{n} x_i y_i - b_0\sum_{i=1}^{n} x_i - b_1\sum_{i=1}^{n} x_i^2 = 0 \;\Longrightarrow\; \sum_{i=1}^{n} x_i y_i - (\bar{y} - b_1\bar{x})\,n\bar{x} - b_1\sum_{i=1}^{n} x_i^2 = 0$$
$$\Longrightarrow\; \sum_{i=1}^{n} x_i y_i - n\bar{x}\bar{y} + b_1 n(\bar{x})^2 - b_1\sum_{i=1}^{n} x_i^2 = 0 \;\Longrightarrow\; b_1 = \frac{\sum_{i=1}^{n} x_i y_i - n\bar{x}\bar{y}}{\sum_{i=1}^{n} x_i^2 - n(\bar{x})^2} = \frac{S_{xy}}{S_{xx}}$$

(further verification that it is indeed a minimum is necessary, although we will not go into details). A value that is used as a measure of how well the regression straight line approximates the n points is the determination coefficient or R-squared, defined by

$$R^2 = \frac{\left(\sum_{i=1}^{n} x_i y_i - n\bar{x}\bar{y}\right)^2}{\left(\sum_{i=1}^{n} x_i^2 - n(\bar{x})^2\right)\left(\sum_{i=1}^{n} y_i^2 - n(\bar{y})^2\right)} = \frac{S_{xy}^2}{S_{xx} S_{yy}},$$

with $S_{yy} = \sum_{i=1}^{n} y_i^2 - n(\bar{y})^2$, which is between 0 and 1 and is interpreted as the proportion of the total variability of the data that is explained by the regression straight line. The closer $R^2$ is to 1, the better the linear approximation of the relationship between the variables X and Y. Its square root, with the sign of the slope $b_1$, is the well-known Pearson's correlation coefficient r ∈ [−1, 1].
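A quick numerical check in R (with made-up data; any numbers will do) that the closed-form expressions (A 1) and the R² formula above coincide with what lm() returns:

```r
# Verify the closed-form OLS solution (A 1) against lm(), made-up data
x <- c(1, 2, 3, 4, 5)
y <- c(1.8, 4.1, 5.9, 8.2, 9.9)
n   <- length(x)
Sxy <- sum(x * y) - n * mean(x) * mean(y)
Sxx <- sum(x^2)  - n * mean(x)^2
Syy <- sum(y^2)  - n * mean(y)^2
b1  <- Sxy / Sxx
b0  <- mean(y) - b1 * mean(x)
c(b0, b1); coef(lm(y ~ x))                          # same two numbers
Sxy^2 / (Sxx * Syy); summary(lm(y ~ x))$r.squared   # same R-squared
```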

A.1. The hypotheses of the regression model (LR hypotheses)

The regression model assumes that for each fixed value of the variable X, xi (i = 1, …, n), the random variable Y, which is denoted in this case by Yi, has Gaussian distribution with a mean which is a linear function of xi, say γ0 + γ1 xi, where γ0 and γ1 are parameters independent of i, and with variance σ2 > 0, which is also a parameter independent of i, that is, we assume that

$$Y_i \sim N(\gamma_0 + \gamma_1 x_i,\; \sigma^2), \qquad i = 1, \ldots, n.$$

Moreover, we assume that the random variables Y1, …, Yn are independent. In other words,

$$Y_i = \gamma_0 + \gamma_1 x_i + \delta_i, \qquad i = 1, \ldots, n, \qquad (\mathrm{A\,2})$$

where $\delta_1, \ldots, \delta_n$ are independent and identically distributed random variables, N(0, σ²). These are the LR hypotheses that are needed in order to perform statistical inferences. We assume them in the remainder of appendix A. In this context, $b_0$ and $b_1$, the coefficients of the regression straight line, are the estimations of the parameters of the model $\gamma_0$ and $\gamma_1$, respectively, obtained from data; that is, $\hat{\gamma}_0 = b_0$ and $\hat{\gamma}_1 = b_1$. The estimation of the parameter σ² is

$$\frac{\sum_{i=1}^{n} e_i^2}{n-2} = \frac{\mathrm{SSE}}{n-2} = \mathrm{MSE}. \qquad (\mathrm{A\,3})$$

A.2. The coefficient estimates

Consider the estimations $b_0$ and $b_1$ of the coefficients of the linear regression model (respectively, $\gamma_0$ and $\gamma_1$ in equation (A 2)) given by (A 1). If in (A 1) we replace the observations $y_i$ with the random variables of which they are assumed to be realizations, $Y_i$, we obtain the expressions in (A 4) for the estimators of the coefficients, say $B_0$ and $B_1$, which are random variables of which the estimations $b_0$ and $b_1$, respectively, are realizations.

$$B_1 = \frac{S_{xY}}{S_{xx}}, \qquad B_0 = \bar{Y} - B_1\bar{x}, \qquad \text{with } \bar{Y} = \frac{1}{n}\sum_{i=1}^{n} Y_i, \quad S_{xY} = \sum_{i=1}^{n} x_i Y_i - n\bar{x}\bar{Y}. \qquad (\mathrm{A\,4})$$

The Gauss–Markov theorem says that if the hypotheses of the linear regression model (LR hypotheses) are satisfied, the estimators $B_0$ and $B_1$ are unbiased, that is, their distributions are centred at the corresponding coefficients:

$$E(B_1) = \gamma_1, \qquad E(B_0) = \gamma_0,$$

(E denotes expectation of a random variable, that is, its mean value), and they are the tightest possible in the sense that they have the smallest variance among all possible estimators of the coefficients that are linear functions of the variables Y1, …, Yn. Then, they are the best linear unbiased estimators (BLUE) of the coefficients of the linear regression model.

With regard to the other parameter of the model, σ², its estimation is given by (A 3), which is the realization of the estimator $\hat{\sigma}^2$, a random variable independent of $B_0$ and $B_1$ defined by

$$\hat{\sigma}^2 = \frac{\sum_{i=1}^{n} E_i^2}{n-2}, \qquad \text{with } E_i = Y_i - (B_0 + B_1 x_i), \qquad \text{which verifies } \hat{\sigma}^2 \sim \frac{\sigma^2}{n-2}\,\chi^2_{n-2}.$$

A.3. The analysis of the variance (ANOVA) for the linear regression model

The principles and methodology of ANOVA (ANalysis Of VAriance) can be applied to study whether there is a linear relationship between two variables X and Y. Specifically, we will carry out a statistical test for the hypotheses

$$H_0: \gamma_1 = 0 \qquad \text{versus} \qquad H_1: \gamma_1 \neq 0$$

(H0 is the null statistical hypothesis that corresponds to 'no linear relationship between the variables', while the alternative H1 is the opposite). Considering that the quantities $x_1, \ldots, x_n$ are fixed, the total variability of the observations is measured by the 'total sum of squares' $\mathrm{SST} = \sum_{i=1}^{n}(y_i - \bar{y})^2$, which can be decomposed as

$$\mathrm{SST} = \sum_{i=1}^{n} e_i^2 + b_1^2\sum_{i=1}^{n}(x_i - \bar{x})^2 = \mathrm{SSE} + b_1^2 S_{xx},$$

where SST has n − 1 associated degrees of freedom (over the n quantities $y_i - \bar{y}$ there is only one linear restriction: $\sum_{i=1}^{n}(y_i - \bar{y}) = 0$), SSE has n − 2 degrees of freedom, since we sum the squares of n terms with two independent linear restrictions, $\sum_{i=1}^{n} e_i = 0$ and $\sum_{i=1}^{n} e_i(x_i - \bar{x}) = 0$, and finally $b_1^2 S_{xx}$ has 1 degree of freedom since it is fixed.

The statistical test of hypotheses consists in rejecting H0 if $f = b_1^2 S_{xx}/\mathrm{MSE}$, with MSE = SSE/(n − 2), is 'big enough', that is, greater than a tabulated value. It can be seen (we do not give the details here) that f is the realization of a random variable F with Fisher's F distribution with 1 and n − 2 degrees of freedom if the null hypothesis H0 is true, that is,

$$F = \frac{B_1^2 S_{xx}}{\sum_{i=1}^{n} E_i^2/(n-2)} \sim F_{1,\,n-2} \quad \text{if } \gamma_1 = 0;$$

hence the null hypothesis H0 is rejected with a significance level α (and then a linear relationship between the variables is accepted) if

$$p\text{-value} = P(F_{1,\,n-2} > f) < \alpha.$$

Calculations necessary to obtain f are usually carried out with the help of the ANOVA table:

Analysis of variance (ANOVA) table

source of variation (response Y)   degrees of freedom (d.f.)   sum of squares (sum Sq)   mean square (mean Sq)   F-value
regressor X                        1                           b1² Sxx                   b1² Sxx                 f = b1² Sxx / MSE
residuals (error)                  n − 2                       SSE = Σ ei²               MSE = SSE/(n − 2)
total                              n − 1                       SST = Σ (yi − ȳ)²

A.4. Predicting with the linear regression model

Given a value for the variable X, say $x_0$, the straight-line equation is used to predict the corresponding value of the variable Y, denoted by $\hat{y}|_{x_0}$, in the following way:

$$\hat{y}|_{x_0} = b_0 + b_1 x_0,$$

and this can be done as long as the value $x_0$ lies within the range of values given by $x_1, \ldots, x_n$, and the linear approximation is good (R² big enough).

A.5. Confidence intervals for the coefficients

For a fixed confidence level γ ∈ (0, 1), the confidence intervals for the coefficients of the regression straight line are

$$\gamma_1:\;\; b_1 \pm t^{\,n-2}_{1-(\alpha/2)}\sqrt{\frac{\sum_{i=1}^{n} e_i^2/(n-2)}{S_{xx}}} \qquad \text{and} \qquad \gamma_0:\;\; b_0 \pm t^{\,n-2}_{1-(\alpha/2)}\sqrt{\frac{\bigl(\sum_{i=1}^{n} e_i^2/(n-2)\bigr)\bigl(\sum_{i=1}^{n} x_i^2/n\bigr)}{S_{xx}}},$$

where α = 1 − γ and $t^{\,n-2}_{1-(\alpha/2)}$ is the critical value of the Student's t distribution with n − 2 degrees of freedom, $t_{n-2}$, such that the probability that this distribution gives a value greater than the critical value is α/2 (that is, given ω ∈ (0, 1), $t^{\,n-2}_{\omega}$ denotes the real number such that $P(t_{n-2} < t^{\,n-2}_{\omega}) = \omega$).

A.6. Confidence interval for the prediction

For a fixed value of the variable X, say $x_0$, and a confidence level γ ∈ (0, 1), the confidence interval for the prediction for the variable Y, $\hat{Y}|_{x_0} = \gamma_0 + \gamma_1 x_0$ (which can be thought of as a new parameter, a function of $\gamma_0$ and $\gamma_1$, whose estimation is $\hat{y}|_{x_0} = b_0 + b_1 x_0$), is

$$\hat{y}|_{x_0} \pm t^{\,n-2}_{1-(\alpha/2)}\sqrt{\left(\frac{\sum_{i=1}^{n} e_i^2}{n-2}\right)\left(\frac{1}{n} + \frac{(x_0 - \bar{x})^2}{S_{xx}}\right)}.$$

The value of $x_0$ that minimizes the length of the confidence interval for the prediction is $x_0 = \bar{x}$. As $x_0$ moves away from $\bar{x}$ in either direction, the length increases symmetrically.

A.7. Statistical hypotheses testing

For a fixed significance level α ∈ (0, 1), the statistical tests of hypotheses for the parameters of the regression model are based on the statistics

$$\gamma_1:\;\; t = \frac{b_1 - \gamma_1^0}{\sqrt{\bigl(\sum_{i=1}^{n} e_i^2/(n-2)\bigr)/S_{xx}}}, \qquad \gamma_0:\;\; t = \frac{b_0 - \gamma_0^0}{\sqrt{\bigl(\sum_{i=1}^{n} e_i^2/(n-2)\bigr)\bigl(\sum_{i=1}^{n} x_i^2/n\bigr)/S_{xx}}},$$

and, for fixed null values $\gamma_1^0$ and $\gamma_0^0$:

alternative hypothesis               accepted if                        p-value
H1: γ1 > γ1⁰ / γ0 > γ0⁰              t > t^{n−2}_{1−α}                  P(t_{n−2} > t)
H1: γ1 < γ1⁰ / γ0 < γ0⁰              t < t^{n−2}_{α}                    P(t_{n−2} < t)
H1: γ1 ≠ γ1⁰ / γ0 ≠ γ0⁰              |t| > t^{n−2}_{1−α/2}              2 P(t_{n−2} > |t|)
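For the particular null values $\gamma_1^0 = 0$ and $\gamma_0^0 = 0$, these t statistics and their two-sided p-values are exactly what R reports in the coefficient table of a fitted model (sketch with made-up data):

```r
# The t-tests above with gamma1^0 = 0 and gamma0^0 = 0, as reported by R
x <- c(1, 2, 3, 4, 5)
y <- c(1.8, 4.1, 5.9, 8.2, 9.9)
summary(lm(y ~ x))$coefficients
# columns: Estimate, Std. Error, t value, Pr(>|t|)  (two-sided p-values)
```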

A.8. Prediction interval

For a fixed value of the variable X, say $x_0$, and a confidence level γ ∈ (0, 1), the prediction interval is an interval 'of the most probable values' for the variable Y when X = $x_0$, which we denote by $Y_0$; that is, $Y_0 = \gamma_0 + \gamma_1 x_0 + \delta_0$ with $\delta_0 \sim N(0, \sigma^2)$ independent of $\delta_1, \ldots, \delta_n$. Informally speaking, the prediction interval is an interval where the variable $Y_0$ takes values with probability γ, and has the expression

$$\hat{y}|_{x_0} \pm t^{\,n-2}_{1-(\alpha/2)}\sqrt{\left(\frac{\sum_{i=1}^{n} e_i^2}{n-2}\right)\left(1 + \frac{1}{n} + \frac{(x_0 - \bar{x})^2}{S_{xx}}\right)}. \qquad (\mathrm{A\,5})$$
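A sketch in R of (A 5), computed by hand and checked against the built-in predict() (made-up data; predict() with interval = "prediction" implements exactly this formula):

```r
# Prediction-interval radius (A 5) by hand versus predict(), made-up data
x <- c(1, 2, 3, 4, 5)
y <- c(1.8, 4.1, 5.9, 8.2, 9.9)
fit <- lm(y ~ x); n <- length(x); x0 <- 2.5
mse <- sum(residuals(fit)^2) / (n - 2)
Sxx <- sum((x - mean(x))^2)
rad <- qt(0.975, n - 2) * sqrt(mse * (1 + 1/n + (x0 - mean(x))^2 / Sxx))
p   <- predict(fit, data.frame(x = x0), interval = "prediction")  # 95%
c(manual = rad, builtin = unname((p[, "upr"] - p[, "lwr"]) / 2))  # equal
```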

A.9. Prediction interval for classical calibration

The problem with classical calibration is that, to make predictions, we have to deal with the reciprocal of the slope, and the slope follows a Gaussian distribution under the hypotheses of the linear regression model. The reciprocal of a Gaussian random variable has infinite variance (hence the mean squared error is infinite); although an asymptotic approximation can be derived using the Delta method (see [14]), it has limitations. By formulae (4.32) and (4.32a) in [18, p. 169], for any absorbance $A_i$, the corresponding prediction interval using the classical calibration, under the hypotheses of the linear regression model, is of the form $\hat{c}_i \pm (a)$, where

$$(a) = t^{\,n-2}_{1-(\alpha/2)}\,\frac{1}{\beta_1}\sqrt{\left(\frac{\sum_{i=1}^{n} \tilde{e}_i^2}{n-2}\right)\left(1 + \frac{1}{n} + \frac{(\hat{c}_i - \bar{c})^2}{\sum_{i=1}^{n} c_i^2 - n\bar{c}^2}\right)}, \qquad (\mathrm{A\,6})$$

with $\tilde{e}_i$ the errors committed with the classical calibration, not in predicting concentration from absorbance but in predicting absorbance from concentration, that is, $\tilde{e}_i = A_i - (\beta_0 + \beta_1 c_i)$. See also formula (5) in [14].

Appendix B. Two more examples

B.1. An example of practical calibration with real experimental data

The following example of practical calibration is borrowed from [19] and can be used to compare the approaches of classical calibration and inverse regression. The data (table 3 in [19]) are absorbance readings for potassium permanganate at 525 nm given by the scanning of the spectrophotometer for different concentrations. Specifically, a stock solution for standards was made with 0.072 g of potassium permanganate in a 250.0 cm3 standard flask, and the standard working solutions are five replicates of each of 14 levels, containing, respectively, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 20, 40, 60 mg dm−3 of potassium permanganate, made by diluting appropriate aliquots of the stock solution to 100.0 cm3 with deionized water. Concentrations and measured absorbances are recorded in table 4.

Table 4.

Example of practical calibration (table 3 in [19]): five replications of the absorbance reading for any of the 14 fixed concentrations.

concentration (mg dm−3)   absorbance (five replicates)
0 0 0 0 0 0
1 0.053 0.053 0.054 0.054 0.055
2 0.092 0.092 0.092 0.092 0.092
3 0.130 0.134 0.129 0.129 0.128
4 0.181 0.181 0.181 0.179 0.180
5 0.209 0.208 0.208 0.207 0.207
6 0.265 0.265 0.264 0.262 0.264
7 0.324 0.324 0.324 0.324 0.324
8 0.354 0.352 0.352 0.352 0.354
9 0.381 0.379 0.381 0.379 0.381
10 0.430 0.430 0.430 0.430 0.430
20 0.881 0.880 0.880 0.880 0.882
40 1.576 1.575 1.576 1.576 1.576
60 2.062 2.062 2.062 2.060 2.062

The two calibration curves given by (3.2) and (4.2) are

Classical calibration (curve of A over c): $A = \beta_0 + \beta_1 c = 0.05261356 + 0.03540806\,c$
Inverse regression (curve of c over A): $c = \alpha_0 + \alpha_1 A = -1.349146 + 27.96597\,A$
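A sketch reproducing both fits in R from the data of table 4 (the coefficients should match those reported above up to rounding):

```r
# Permanganate data of table 4: fit the two calibration curves
conc <- rep(c(0:10, 20, 40, 60), each = 5)
absb <- c(0,     0,     0,     0,     0,
          0.053, 0.053, 0.054, 0.054, 0.055,
          0.092, 0.092, 0.092, 0.092, 0.092,
          0.130, 0.134, 0.129, 0.129, 0.128,
          0.181, 0.181, 0.181, 0.179, 0.180,
          0.209, 0.208, 0.208, 0.207, 0.207,
          0.265, 0.265, 0.264, 0.262, 0.264,
          0.324, 0.324, 0.324, 0.324, 0.324,
          0.354, 0.352, 0.352, 0.352, 0.354,
          0.381, 0.379, 0.381, 0.379, 0.381,
          0.430, 0.430, 0.430, 0.430, 0.430,
          0.881, 0.880, 0.880, 0.880, 0.882,
          1.576, 1.575, 1.576, 1.576, 1.576,
          2.062, 2.062, 2.062, 2.060, 2.062)
coef(lm(absb ~ conc))   # classical:  0.05261356, 0.03540806
coef(lm(conc ~ absb))   # inverse:   -1.349146,  27.96597
```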

For the difference between the absolute value of the errors in prediction with the classical calibration and the inverse regression (the former minus the latter) (table 5), we perform a one-sided Wilcoxon signed-rank test, which is the non-parametric counterpart of the t-test, to compare its median with 0 (the p-value of the Shapiro–Wilk test for normality is 1.976 × 10−8***, meaning that we have enough evidence to reject the normality of the sample). The p-value of the one-sided Wilcoxon test with the alternative hypothesis: ‘the median of the difference is greater than 0’ is 0.0002126***; that shows a clear statistical significance in favour of inverse regression.

Table 5.

Predictions with the two methods: classical calibration and inverse regression, and corresponding radius of the prediction intervals and errors, for data in table 4. In italics the maximum R2 and the minimum standard error s.e.

Ai      ci   classical ĉi = (Ai − β0)/β1   inverse ĉi = α0 + α1 Ai   classical (a)   inverse (b)   classical ei   inverse εi
0.181   4    3.62591020                    3.7126938                 2.582782        2.576339      0.374089799    0.28730621
0.179   4    3.56942588                    3.6567619                 2.582810        2.576365      0.430574117    0.34323815
0.180   4    3.59766804                    3.6847278                 2.582796        2.576352      0.402331958    0.31527218
0.209   5    4.41669065                    4.4957409                 2.582411        2.575986      0.583309349    0.50425915
0.208   5    4.38844849                    4.4677749                 2.582423        2.575998      0.611551508    0.53222512

                       classical   inverse
SSE =                  187.5209    185.687
MSE = SSE/(n − 2) =    2.75766     2.730692
s.e. = √MSE =          1.66062     1.65248
R² = 1 − SSE/SST =     0.990124    0.9902206

With respect to the prediction interval radii, for all of the (n = 70) observations, the radius for the inverse regression is less than that of the classical calibration approach. We can perform a statistical test to check whether the median of the difference of the prediction interval radii (classical calibration minus inverse regression) is significantly greater than 0. As the p-value of the Shapiro–Wilk test of normality is 8.925 × 10−14***, we reject normality and perform the one-sided Wilcoxon signed-rank test, obtaining a p-value of 1.793 × 10−13***, which expresses a very high statistical significance in favour of the inverse regression approach.

The analysis of variance (ANOVA) table for regression (see appendix A) applied to the inverse regression approach is:

source of variation (response c)   d.f.   sum Sq                      mean Sq                  F-value
regressor A                        1      α1² S_AA = 18801.81         α1² S_AA = 18801.81      f = 6885.344
residuals (error)                  68     SSE = Σ εi² = 185.69        MSE = 2.7307
total                              69     SST = Σ (ci − c̄)² = 18987.5

with $S_{AA} = \sum_{i=1}^{n}(A_i - \bar{A})^2 = 24.04031$. The p-value for the statistical test with H0: 'no linear relationship between A and c', if the LR hypotheses can be reasonably assumed, is P(F1,68 > 6885.344) < 2.2 × 10−16***, and therefore we accept the linear relationship between concentration and absorbance.

B.2. An example by simulation

Apart from the toy example in §5 and the practical calibration example with real experimental data in the first subsection of this appendix, we now perform a simulation experiment consisting of the following. First, a dataset with some values of concentration and the corresponding absorbances has been created by simulation, in this way:

  • (i)

Fix values for the concentration, $c_i$, from 50 to 500, in steps of one: 50, 51, 52, …, 499, 500 (a total of 451 values).

  • (ii)
    Compute the corresponding values of the absorbance $A_i$ by means of the linear expression with Gaussian additive noise (error),
    $$A_i = 0.01 + 0.05\,c_i + \varepsilon_i, \qquad i = 1, \ldots, 451,$$
    with $\varepsilon_i \sim N(\mu = 0, \sigma^2 = 10)$, all generated independently. We use the function rnorm of R, and fix a random seed for reproducibility purposes with set.seed(123) (see the R sketch after this list).
  • (iii)

    As it is possible that some values of the absorbance are negative, delete such observations. This will depend on the Gaussian values that have been randomly generated. In our case, we are left with a final number of n = 447.
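A sketch of steps (i)–(iii) in R; since the seed is fixed, it should reproduce the dataset exactly (note that rnorm takes the standard deviation, so σ² = 10 corresponds to sd = sqrt(10)):

```r
# Simulate the dataset of appendix B.2 (steps (i)-(iii))
set.seed(123)                      # fixed random seed, as in the paper
ci <- 50:500                       # 451 concentration values
Ai <- 0.01 + 0.05 * ci + rnorm(length(ci), mean = 0, sd = sqrt(10))
keep <- Ai >= 0                    # delete negative simulated absorbances
ci <- ci[keep]; Ai <- Ai[keep]
length(ci)                         # 447
head(Ai, 3)                        # 0.7376204 1.8321149 7.5390685 (table 6)
cor(ci, Ai)                        # Pearson's correlation, about 0.902
```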

The first 10 observations of the 447 have been recorded in table 6. For the dataset with n = 447 observations, we obtain 0.9020323 as Pearson’s correlation coefficient, and 0.9060091 if instead we compute Spearman’s correlation coefficient, both reflecting a good linear relationship between concentration values and the corresponding simulated absorbances.

Table 6.

First 10 observations of the simulated dataset. Note that original observation 8 has been deleted since the simulated absorbance for a concentration of 57 was the negative number −1.1404749.

observation (original order)   ci   Ai
1 50 0.7376204
2 51 1.8321149
3 52 7.5390685
4 53 2.8829671
5 54 3.1188437
6 55 8.1835117
7 56 4.2675450
9 58 0.7379806
10 59 1.5506931
11 60 6.8808865

Second, we use k-fold cross-validation with k = 10 to evaluate the prediction error with the two approaches, classical calibration and inverse regression. Indeed, we randomly order the n instances (using the sample function of R), and then divide the observations into 10 folds, the first 9 composed of n/10 observations (in this case, 44), and the last with the rest (51 observations). Then, for each fold (a minimal R sketch of the whole procedure is given after the list):

  • (a)

    We reserve the fold for validation and learn the linear regression models (to follow the two approaches) with the rest of the folds as a learning (training) set.

  • (b)

Once the two linear regression models are learned, we follow the two approaches to predict, for each of the instances in the validation set, the concentration value from the known absorbance.

  • (c)

    As we know the observed concentration value corresponding to the absorbance of any observation in the validation set, we can compare the observed and the predicted values obtained with the two approaches.

  • (d)

For each fold and approach, we compute the sum of the squared errors made in prediction and also divide by the number of instances minus 2, to compensate for the fact that one of the folds has more observations than the others, obtaining in this way the mean sum of squared errors.

    Note that we are making predictions of the concentrations given the absorbances of new observations not seen by the regression models, namely the observations of the validation set. This differs from the usual situation in which the predictive capacity of a model is evaluated on the same observations that were used to construct it.

  • (e)

    Finally, we obtain two paired samples of size k = 10 of values of the mean sum of squared errors, which can be used to perform a statistical test comparing the predictive power of the two approaches.
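
A minimal sketch of the whole loop in R, reusing the conc and absorbance vectors from the previous sketch (the exact fold assignment of the original script may differ):

  n <- length(conc)                               # n = 447
  ord <- sample(n)                                # random ordering of the instances
  size <- n %/% 10                                # 44 observations in each of the first 9 folds
  fold_id <- c(rep(1:9, each = size), rep(10, n - 9 * size))  # last fold gets 51
  mse_classical <- mse_inverse <- numeric(10)
  for (f in 1:10) {
    val   <- ord[fold_id == f]                    # validation fold
    train <- ord[fold_id != f]                    # training set: the other 9 folds
    # classical calibration: fit A = b0 + b1*c on the training set,
    # then invert the fitted line to predict c from A
    b <- coef(lm(absorbance[train] ~ conc[train]))
    pred_cl <- (absorbance[val] - b[1]) / b[2]
    # inverse regression: fit c = a0 + a1*A and predict c directly
    a <- coef(lm(conc[train] ~ absorbance[train]))
    pred_inv <- a[1] + a[2] * absorbance[val]
    # mean sum of squared errors: divide by the fold size minus 2
    mse_classical[f] <- sum((conc[val] - pred_cl)^2) / (length(val) - 2)
    mse_inverse[f]   <- sum((conc[val] - pred_inv)^2) / (length(val) - 2)
  }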

In table 7, we have recorded the sum of squared errors and the mean sum of squared errors for each fold.

Table 7.

Sum of squared errors and mean sum of squared errors in the validation procedure for both approaches, classical calibration and inverse regression, with k-fold cross-validation, k = 10, and the difference in the mean sum of squared errors between the approaches (classical calibration minus inverse regression).

fold sum of squared errors (classical) sum of squared errors (inverse) mean SSE (classical) mean SSE (inverse) difference
1 268747.2 221536.15 6398.742 5274.670 1124.07218
2 207565.9 195265.34 4942.046 4649.175 292.87117
3 162817.2 137581.22 3876.600 3275.743 600.85696
4 148482.3 92484.90 3535.294 2202.022 1333.27201
5 198475.3 180907.94 4725.602 4307.332 418.26981
6 158211.0 105958.98 3766.930 2522.833 1244.09681
7 128881.1 112011.31 3068.597 2666.936 401.66138
8 128828.6 130050.20 3067.347 3096.433 −29.08651
9 157612.6 84083.97 3752.680 2001.999 1750.68097
10 173684.0 147785.03 3544.572 3016.021 528.55118
average: 4067.841 3301.316 766.5246

For the difference between the mean sum of squared errors under classical calibration and under inverse regression (the former minus the latter), we can perform a one-sided t-test to compare its mean with 0, since the Shapiro–Wilk test for normality gives a p-value of 0.5422, so we do not have enough evidence to reject the normality of the sample. The p-value of the one-sided t-test with the alternative hypothesis 'the mean of the difference is greater than 0' is 0.0009752***, giving very high statistical significance in favour of inverse regression being better than classical calibration (a smaller mean sum of squared errors when predicting new cases). If, instead, we had used the non-parametric Wilcoxon signed-rank test, which does not assume normality of the sample of differences, the one-sided p-value would remain very small, 0.001953**, showing statistical significance in the same sense.
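
The three tests can be reproduced in R from the paired samples of the previous sketch (or directly from the columns of table 7):

  d <- mse_classical - mse_inverse                # paired differences, as in table 7
  shapiro.test(d)                                 # normality of the differences
  t.test(d, alternative = "greater")              # one-sided t-test, H1: mean(d) > 0
  wilcox.test(d, alternative = "greater")         # Wilcoxon signed-rank test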

Finally, it is also possible to compute the p-value of the exact binomial test in favour of inverse regression, taking into account that, out of the 10 folds, there are nine in which the mean sum of squared errors is greater for classical calibration and only one in which it is smaller,

p-value = P(B(10, p = 0.5) = 1) = C(10, 1) × 0.5¹⁰ = 10 × 0.5¹⁰ = 0.009765625

(showing significance at the 1% level).
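
For reference, this point probability can be computed directly in R:

  dbinom(1, size = 10, prob = 0.5)                # 0.009765625
  # note: the standard exact test, binom.test(1, 10, alternative = "less"),
  # would report the cumulative tail P(B <= 1) = 0.01074219 instead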

In conclusion, even in this example, in which the absorbance values were simulated from the concentrations precisely so that the LR hypotheses can reasonably be assumed under the classical calibration approach (thus favouring it), inverse regression is preferable from the perspective of predictive power, in agreement with the conclusions in [13].

To evaluate the possible effect of the variance σ², which we have so far fixed at 10 to simulate the absorbance values, we repeat the procedure with other values ranging from 0.01 to 30 (a sketch of this loop is given after table 8). In table 8, we record, for each σ², the quantities computed above for the case σ² = 10: the average of the mean sum of squared errors (for both classical calibration and inverse regression), the p-value of the one-sided t-test (or Wilcoxon signed-rank test, as appropriate) comparing the differences (mean/median for classical calibration greater than for inverse regression), and the p-value of the exact binomial test in favour of inverse regression. We observe clear evidence in favour of the inverse regression approach when σ² is large (σ² ≥ 2), and no difference when σ² is small, which agrees with intuition.

Table 8.

Average mean sum of squared errors for both approaches, classical calibration and inverse regression, with k-fold cross-validation, k = 10, the p-value of the one-sided t-test (or Wilcoxon signed-rank test) comparing the differences in the mean/median, and the p-value of the exact binomial test in favour of inverse regression (except those marked, which are in favour of classical calibration); the number of folds, out of the 10, for which the mean sum of squared errors is greater for inverse regression than for classical calibration is given in brackets. All the p-values are significant except those for σ² < 2.

σ² average mean SSE (classical) average mean SSE (inverse) p-value (t-test/Wilcoxon test) p-value (exact binomial test)
0.01 4.014717 4.01434 0.4754 (6) 0.2050781
0.1 40.17608 40.10235 0.3534 (6) 0.2050781
0.5 201.1428 199.0806 0.1815 (5) 0.2460938
1 402.6806 394.2868 0.1016 (3) 0.117187500
2 806.484 772.9201 0.04209* (2) 0.043945310*
3 1211.606 1133.598 0.01526* (1) 0.009765625**
4 1617.850 1481.911 0.007936** (1) 0.009765625**
5 2009.188 1801.027 0.004883** (2) 0.043945310*
6 2439.265 2135.996 0.003818** (1) 0.009765625**
7 2832.297 2437.114 0.003476** (1) 0.009765625**
8 3243.029 2735.674 0.002206** (1) 0.009765625**
9 3654.897 3023.564 0.001446** (1) 0.009765625**
10 4067.841 3301.316 0.0009752*** (1) 0.009765625**
20 8539.498 5685.383 0.0003228*** (1) 0.009765625**
30 13335.26 7548.766 0.0009766*** (0) 0.0009765625***

As usual, *, ** and *** denote significance at 5%, 1% and 0.1% levels, respectively.
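
The repetition over noise variances can be organized as a simple loop; run_cv is a hypothetical wrapper around the simulation and cross-validation steps sketched above.

  # 'run_cv' (hypothetical) simulates the absorbances with variance sigma2,
  # runs the 10-fold comparison and returns the average mean sums of squared
  # errors together with the p-values of the tests
  sigma2_grid <- c(0.01, 0.1, 0.5, 1:10, 20, 30)
  results <- lapply(sigma2_grid, function(sigma2) run_cv(sigma2))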

Footnotes

1

As explained in [16], the method of OLS was developed by Gauss in Theoria combinationis observationum erroribus minimis obnoxiae (1823), where a first proof of an early version of the theorem is given. Markov rediscovered the same result and included it in his book Wahrscheinlichkeitsrechnung (1912), the year in which Fisher converted least squares into a general estimation method in statistics. The terminology 'Gauss–Markov theorem' comes from Neyman. For historical details, see [17].

Data accessibility

All scripts used in this study are openly accessible through https://github.com/StochasticBiology/boolean-efflux.git. The data are provided in the electronic supplementary material [20]. I have used simulated data, which I have uploaded as a CSV file.

Competing interests

I declare I have no competing interests.

Funding

The author is supported by Ministerio de Ciencia, Innovación y Universidades, Gobierno de España, project ref. PGC2018-097848-B-I0.

References

  • 1. Alikasturi AS, Shaharuddin Sh, Anuar MR, Radzi ARM, Asnawi ASFM, Husin AN, Aswandi NA, Mustapha AI. 2018. Extraction of glucose by using alkaline hydrolysis from Musa Sapientum peels, Ananas Comosus and Mangifera Indica Linn. Mater. Today: Proc. 5, 22148-22153. (doi:10.1016/j.matpr.2018.07.083)
  • 2. Yusof KM, Isaak S, Ch.A. Rashid N, Ngajikin NH. 2016. NPK detection spectroscopy on non-agriculture soil. J. Teknologi 78, 227-231. (doi:10.11113/jt.v78.8382)
  • 3. Yildiz Y, Karadag R, Cordera M, Gensinger B. 2020. Determination of manganese in tricalcium phosphate (TCP) by atomic absorption spectrometry. Am. J. Anal. Chem. 11, 301-308. (doi:10.4236/ajac.2020.118024)
  • 4. Gidwani B, Patel L, Gupta A, Kaur CD. 2017. Ultra-violet spectrophotometric method for estimation and validation of amlodipine in bulk and tablet formulation. J. Anal. Pharm. Res. 4, 00125. (doi:10.15406/japlr.2017.04.00125)
  • 5. Andriamahenina NN, Rasoazanany EO, Ravoson HN, Rakotozafy LV, Harinoely M, Andraimbololona R, Edmond R. 2018. Dealing with outliers in linear calibration curves: a case study of graphite furnace atomic absorption spectrometry. World J. Appl. Chem. 3, 10-16. (doi:10.11648/j.wjac.20180301.12)
  • 6. Majumdar A, Bhattacharya A, Sengupta A, Ghosh D. 2017. Measurement of blood glucose concentration from tear glucose concentration of a type-2 diabetic patient using LDR based spectrometer. In 6th Int. Conf. on 'Computing, Communication and Sensor Networks', CCSN2017, Kolkata, India, 30–31 December. See https://www.researchgate.net/project/Detection-of-blood-Glucose-through-tears-in-humans/update/5a07e40c4cde262689144b05.
  • 7. Yaacob A, Ngajikin NH, Rashid NChA, Ali SHA, Yaacob M, Isaak S, Cholan NA. 2020. Uric acid detection in visible spectrum. TELKOMNIKA Telecommun. Comput. Electron. Control 18, 2035-2041. (doi:10.12928/telkomnika.v18i4.14993)
  • 8. Arayne MS, Sultana N, Mirza A. 2006. Spectrophotometric method for quantitative determination of gliquidone in bulk drug, pharmaceutical formulations and human serum. Pak. J. Pharm. Sci. 19, 182-185.
  • 9. Restrepo CV, Benavides E, Zambrano JC, Moncayo V, Castro E. 2020. Handmade solar cells from chlorophyll for teaching in high school energy education. Int. J. Ambient Energy. (doi:10.1080/01430750.2020.1712243)
  • 10. Bhusnure OG, Fasmale RN, Gandge NV, Gholve SB, Giram PS. 2017. QbD approach for analytical method development and validation of serotonin by spectroscopic method. Int. J. Pharm. Pharm. Res., Hum. J. 10, 98-117.
  • 11. Berwanger JD, Tan HY, Jokhadze G, Bruening ML. 2021. Determination of the serum concentrations of the monoclonal antibodies bevacizumab, rituximab, and panitumumab using porous membranes containing immobilized peptide mimotopes. Anal. Chem. 93, 7562-7570. (doi:10.1021/acs.analchem.0c04903)
  • 12. Montgomery DC, Peck EA, Vining GG. 2012. Introduction to linear regression analysis, 5th edn. Hoboken, NJ: John Wiley & Sons Ltd.
  • 13. Krutchkoff RG. 1967. Classical and inverse regression methods. Technometrics 9, 425-439. (doi:10.1080/00401706.1967.10490486)
  • 14. Parker PA, Vining GG, Wilson SR, Szarka JL III, Johnson NG. 2010. The prediction properties of inverse and reverse regression for the simple linear calibration problem. J. Qual. Technol. 42, 332-347. (doi:10.1080/00224065.2010.11917831)
  • 15. Kang P, Koo C, Roh H. 2017. Reversed inverse regression for the univariate linear calibration and its statistical properties derived using a new methodology. Int. J. Metrol. Qual. Eng. 8, 28. (doi:10.1051/ijmqe/2017021)
  • 16. Hallin M. 2012. Gauss–Markov theorem. In Encyclopedia of environmetrics, 2nd edn. John Wiley & Sons, Ltd. (doi:10.1002/9780470057339.vnn102)
  • 17. Stigler SM. 1981. Gauss and the invention of least squares. Ann. Stat. 9, 465-474. (doi:10.1214/aos/1176345451)
  • 18. Kutner MH, Nachtsheim CJ, Neter J, Li W. 2005. Applied linear statistical models, 5th edn. New York, NY: McGraw-Hill Irwin.
  • 19. Adeeyinwo CE, Okorie NN, Idowu GO. 2013. Basic calibration of UV/visible spectrophotometer. Int. J. Sci. Technol. 2, 247-251.
  • 20. Delgado R. 2022. Misuse of Beer–Lambert Law and other calibration curves. Figshare.
