Statistical Methods in Medical Research. 2020 Oct 14;30(2):523–534. doi: 10.1177/0962280220961793

Reference range: Which statistical intervals to use?

Wei Liu, Frank Bretz, Mario Cortina-Borja
PMCID: PMC8008401  PMID: 33054684

Abstract

Reference ranges, which are data-based intervals aiming to contain a pre-specified large proportion of the population values, are powerful tools to analyse observations in clinical laboratories. Their main point is to classify any future observations from the population which fall outside them as atypical and thus may warrant further investigation. As a reference range is constructed from a random sample from the population, the event ‘a reference range contains (100P)% of the population’ is also random. Hence, all we can hope for is that such event has a large occurrence probability. In this paper we argue that some intervals, including the P prediction interval, are not suitable as reference ranges since there is a substantial probability that these intervals contain less than (100P)% of the population, especially when the sample size is large. In contrast, a (P,γ) tolerance interval is designed to contain (100P)% of the population with a pre-specified large confidence γ so it is eminently adequate as a reference range. An example based on real data illustrates the paper’s key points.

Keywords: Nonparametric prediction interval, nonparametric tolerance interval, prediction interval, reference range, tolerance interval

1 Introduction

The ‘Choosing Wisely’ campaign was developed in the United States in 2012 by the American Board of Internal Medicine Foundation and was launched in the United Kingdom in 2016 by the Academy of Medical Royal Colleges. It aims to encourage a dialogue between clinicians and patients regarding the risks and benefits of interventions, and the practice of evidence-based treatment regimens.1 As described recently,2 this conversation often refers to the patient’s observed values of relevant clinical markers. Since the clinical laboratory provides comparator intervals to help the clinician place an individual value in context, a natural question from the patient is ‘Are my test results typical with respect to a healthy population?’. Although such comparator intervals are often referred to as the test’s normal range, this terminology should be discouraged as it implies that a result has a binary ‘normal or abnormal’ quality, which may lead to an arbitrary dichotomous interpretation of the patient’s health status.2 Instead, the terms ‘reference limits’ or ‘reference range’ should be used in this context.

Reference ranges are powerful tools in laboratory medicine to aid decision making3 and their use has become increasingly prevalent in clinical practice. Searching in the Web of Science engine at the time of writing for articles published between 1999 and 2019 with ‘reference range’ as a topic, we found 5431 articles of which 469 appeared in 2019, in contrast to 268 articles that appeared in 2009. These articles have collectively been cited by 91,034 publications of which 11,270 appeared in 2019, 2.4 times more than the number of citing publications 10 years earlier.

Apart from the important individual overtones for patients, incorrectly estimating the reference range of a sensitive clinical marker of physiological function has enormous public health implications. For example, underestimating the upper limit of a reference range would mean classifying a large number of people as diseased, thus affecting the doses of medication prescribed.4 Construction of appropriate reference ranges is therefore crucial in laboratory medicine practice. Well-known general references3,5–9 and a case for teaching tolerance intervals in introductory statistics courses10 are available.

It is common practice to assume that clinical markers related to a disease follow a normal distribution among healthy subjects. If there is evidence against this assumption, we could fit models to specify optimal transformations to normality, e.g. logarithmic or square root, though this might still result in biased estimates of the upper or lower limits of the reference range depending on whether the distribution is right or left skewed.9 Alternatively, we could construct reference ranges under specific parametric assumptions other than normality, or follow a nonparametric procedure. The focus of this paper is on the construction of parametric and nonparametric reference ranges for a selected reference population based on a random sample from the population. The problems related to selecting a reference population have been discussed elsewhere.6

A P (commonly set to 95%) reference range is a data-based interval that purports to include (100P)% of the values in the population of interest. Its main purpose is to classify any future observations from the population which fall outside the interval as atypical, and which may thus warrant further investigation.

Let F(·) denote the continuous cumulative distribution function (cdf) of the population, and $F^{-1}(\gamma)$ denote the $(100\gamma)$-th percentile of the population for a given $\gamma \in (0,1)$. The interval $[F^{-1}((1-P)/2),\, F^{-1}((1+P)/2)]$ contains exactly (100P)% of the population and would be used as the P reference range had F been known. Since F(·) is usually not known completely in real problems, the reference range has to be estimated from a random sample $X_1,\ldots,X_n$ from the population, i.e. $X_1,\ldots,X_n$ are independent random variables identically distributed as F(·). Note that we follow the notation in Krishnamoorthy and Mathew11 thus denoting the interval’s content level by P instead of the commonly used $1-\alpha$, and its confidence level by γ.

When F(·) is assumed to have a normal distribution $N(\mu,\sigma^2)$ with unknown mean μ and unknown variance $\sigma^2$, we have $F^{-1}(\gamma)=\mu+z_\gamma\sigma$ where $z_\gamma$ denotes the $(100\gamma)$-th percentile of the standard normal distribution N(0, 1). When F(·) is not assumed to have a parametric form, nonparametric (or distribution free) methods can be used. In this paper, both normal-based and nonparametric methods are considered.
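As a small illustration of the idealised case in which F is known, the following Python sketch computes the exact central interval of a normal population and confirms that its content is P. The parameter values are hypothetical and chosen only for illustration; they are not estimates from the paper's data.

```python
from statistics import NormalDist


def exact_reference_range(mu, sigma, P=0.95):
    """Central interval [F^{-1}((1-P)/2), F^{-1}((1+P)/2)] of a known N(mu, sigma^2)."""
    z = NormalDist().inv_cdf((1 + P) / 2)  # z_{(1+P)/2}
    return mu - z * sigma, mu + z * sigma


# Hypothetical population parameters (assumed, for illustration only).
mu, sigma = 5.0, 0.4
lo, hi = exact_reference_range(mu, sigma)

# Content of the interval under the known population.
population = NormalDist(mu, sigma)
content = population.cdf(hi) - population.cdf(lo)
```

In practice μ and σ are unknown, which is why the remainder of the paper is concerned with intervals estimated from a sample.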

As a reference range depends on the random sample, the proportion of the population contained in it is also random. Thus the question is ‘which statistical intervals should be used as reference ranges?’

In this article we argue that a P prediction interval, which continues to be used as a reference range in the literature,6,12,13 is not fit for the purpose of interest since there is a substantial probability (due to the randomness in the sample) that the prediction interval contains less than (100P)% of the population.

We then argue that a (P,γ) tolerance interval, with confidence $\gamma \in (0,1)$ set at a pre-specified large value, γ=0.95 say, is valid as a reference range since it guarantees, with large confidence γ about the randomness in the sample, to contain (100P)% of the population values. Several authors have proposed to use tolerance intervals as reference ranges.5,14,15 With almost 80 years of research on tolerance intervals or regions, various parametric and nonparametric procedures are readily available for use as reference ranges.

The next two sections discuss reference ranges based on the normal distribution, and nonparametric reference ranges. They are followed by a section considering a numerical example, and a final one with concluding remarks.

2 Reference ranges based on the normal distribution

2.1 Reference ranges currently in use

Based on the sample, one reference range that has been widely used is the P prediction interval for a future observation Y from a population with $N(\mu,\sigma^2)$ distribution6,12,13

$$RR_1 = \bar{X} \pm t_{(1+P)/2,\nu}\, S\sqrt{1+1/n} = \bar{X} \pm c_1 S$$

where $\bar{X}=\frac{1}{n}\sum_{i=1}^{n}X_i$ is the sample mean, $S^2=\frac{1}{n-1}\sum_{i=1}^{n}(X_i-\bar{X})^2$ is the sample variance, $t_{\delta,\nu}$ is the $(100\delta)$-th percentile of the t distribution with ν degrees of freedom (df), $\nu=n-1$, and $c_1=t_{(1+P)/2,\nu}\sqrt{1+1/n}$.

A relevant guide on prediction intervals for reference regions is available,7 and we note that the prediction interval $RR_1$ has also been called the P expectation tolerance interval.16,17

Other reference ranges are based on estimators of the percentiles $\mu \pm z_{(1+P)/2}\sigma$ and include

$$RR_2 = \bar{X} \pm z_{(1+P)/2}\, S = \bar{X} \pm c_2 S$$
$$RR_3 = \bar{X} \pm z_{(1+P)/2}\, S/\lambda_\nu = \bar{X} \pm c_3 S$$
$$RR_4 = \bar{X} \pm z_{(1+P)/2}\, S\lambda_\nu = \bar{X} \pm c_4 S$$

where $\lambda_\nu=\sqrt{2/\nu}\,\Gamma((\nu+1)/2)/\Gamma(\nu/2)$, $c_2=z_{(1+P)/2}$, $c_3=z_{(1+P)/2}/\lambda_\nu$ and $c_4=z_{(1+P)/2}\lambda_\nu$.9,12 Now $\bar{X}+c_2S$ is a naïve estimator of $\mu+z_\gamma\sigma$ (here with $\gamma=(1+P)/2$), $\bar{X}+c_3S$ has the minimum variance among unbiased estimators of $\mu+z_\gamma\sigma$, and $\bar{X}+c_4S$ has minimum mean squared error among estimators of the form $\bar{X}+cS$ where c is a constant.12
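The constants for the three percentile-based ranges can be evaluated directly from these definitions. The sketch below is a minimal Python illustration using only the standard library; the function names are ours and the sample size n = 20 is an arbitrary choice.

```python
import math
from statistics import NormalDist


def lambda_nu(nu):
    """lambda_nu = sqrt(2/nu) * Gamma((nu+1)/2) / Gamma(nu/2), computed on the log scale."""
    log_ratio = math.lgamma((nu + 1) / 2) - math.lgamma(nu / 2)
    return math.sqrt(2.0 / nu) * math.exp(log_ratio)


def percentile_constants(n, P=0.95):
    """Return c2, c3 and c4 for sample size n and content level P."""
    nu = n - 1
    z = NormalDist().inv_cdf((1 + P) / 2)  # z_{(1+P)/2}
    lam = lambda_nu(nu)
    return {"c2": z, "c3": z / lam, "c4": z * lam}


consts = percentile_constants(20)  # arbitrary illustrative sample size
```

Because $\lambda_\nu<1$ for every finite ν, the constants are always ordered as $c_4<c_2<c_3$, and all three converge to $z_{(1+P)/2}$ as n grows.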

One immediate question is whether these reference ranges $RR_i$ contain (100P)% of the values in the population, which is the objective of a reference range. Note that the proportion of the population within the reference range $RR_i=\bar{X}\pm c_iS$ is given by

$$K_i = \Pr_{Y|X_1,\ldots,X_n}\{Y\in\bar{X}\pm c_iS\} = \Phi\left(\frac{\bar{X}-\mu}{\sigma}+c_i\frac{S}{\sigma}\right)-\Phi\left(\frac{\bar{X}-\mu}{\sigma}-c_i\frac{S}{\sigma}\right) \qquad (1)$$

where $Y\sim N(\mu,\sigma^2)$ is independent of the sample $X_1,\ldots,X_n$, $\Pr_{Y|X_1,\ldots,X_n}\{\cdot\}$ is the conditional probability of Y given the sample $X_1,\ldots,X_n$, and Φ(·) is the cdf of a N(0, 1) random variable. Hence the objective of a reference range is to have $K_i\ge P$. It is clear from equation (1) that $K_i$ is a random variable depending on the random sample via $\bar{X}$ and S, so whether $K_i\ge P$ holds is also random. As a result, all we can hope is that the event $\{K_i\ge P\}$ has a large probability of occurrence.

We note from equation (1) that $K_i$ increases as $c_i$ increases. Hence, among the $RR_i$ ($1\le i\le 4$) given above, the one that has the largest $c_i$ contains the largest proportion of the population. Figure 1 compares the $c_i$ for sample sizes n = 2, …, 150 and P = 0.95. Clearly, $c_1$ is the largest among the $c_i$ ($1\le i\le 4$), and so $RR_1$ contains the largest proportion of the population among the four reference ranges. We therefore investigate whether or not the event $K_1\ge P$ has a large probability of occurring, as it must if $RR_1$ is to be used as a reference range.

Figure 1. The value of $c_i$ as a function of the sample size n.

First, note that

$$E(K_1) = E\left\{\Pr_{Y|X_1,\ldots,X_n}\{Y\in\bar{X}\pm c_1S\}\right\} = \Pr\{Y\in\bar{X}\pm c_1S\} \qquad (2)$$

$$\phantom{E(K_1)} = \Pr\left\{\frac{|Y-\bar{X}|/(\sigma\sqrt{1+1/n})}{S/\sigma} < t_{(1+P)/2,\nu}\right\} = P \qquad (3)$$

where the equality in equation (2) results directly from the well-known conditional expectation formula,18 and the equality in equation (3) follows from the fact that $(Y-\bar{X})/(\sigma\sqrt{1+1/n})$ is distributed N(0, 1) and is independent of $S/\sigma$, which has the distribution of $\sqrt{\chi^2_\nu/\nu}$, with $\chi^2_\nu$ denoting a chi-squared random variable with $\nu=n-1$ df. That the probability in equation (3) is equal to P qualifies $RR_1$ as a P prediction interval for a future observation Y from the same population from which the sample $X_1,\ldots,X_n$ is drawn.

Second, the distribution of $K_1$ can be studied by simulating a large number, $R_{sim}=1{,}000{,}000$ say, of independent realisations of $K_1$. Note from equation (1) that

$$K_1 = \Phi\left(\frac{Z}{\sqrt{n}}+c_1\sqrt{\frac{\chi^2_\nu}{\nu}}\right)-\Phi\left(\frac{Z}{\sqrt{n}}-c_1\sqrt{\frac{\chi^2_\nu}{\nu}}\right) \qquad (4)$$

where $Z=\sqrt{n}(\bar{X}-\mu)/\sigma$ is a standard N(0, 1) random variable, $\chi^2_\nu=\nu S^2/\sigma^2$ is a chi-squared random variable with $\nu=n-1$ df, and Z and $\chi^2_\nu$ are statistically independent. From equation (4), $K_1$ can easily be simulated. For given P and n, $R_{sim}=1{,}000{,}000$ replicas of $K_1$ are simulated, based on which the probability density function (pdf) of $K_1$ can be accurately approximated. In Figure 2, the kernel density estimate19 of the pdf of $K_1$ based on the simulated $K_1$ values is plotted (by using the R package KernSmooth)20 for n=20, 50, 100 and 150. Based on the simulated $K_1$ values, we approximated $\Pr\{K_1<P\}$ by the proportion of the $K_1$ values that are less than P=0.95, which are given by 0.385, 0.429, 0.450 and 0.459 for n=20, 50, 100 and 150, respectively. Note that $\Pr\{K_1<P\}$ is given in Figure 2 by the area under the pdf to the left of the vertical line at P=0.95.
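The simulation based on equation (4) can be sketched in a few lines of Python. This is an illustrative sketch, not the authors' R code: it uses fewer replicates than the paper, takes n = 210 to match the later example, and plugs in 1.971 as an assumed approximate value of $t_{0.975,209}$ because the Python standard library has no t quantile function.

```python
import math
import random
from statistics import NormalDist


def simulate_K(n, c, reps, seed=0):
    """Draw realisations of K = Phi(Z/sqrt(n) + c*s) - Phi(Z/sqrt(n) - c*s),
    where s = sqrt(chi2_nu / nu), following equation (4)."""
    rng = random.Random(seed)
    Phi = NormalDist().cdf
    nu = n - 1
    values = []
    for _ in range(reps):
        a = rng.gauss(0.0, 1.0) / math.sqrt(n)
        # A chi-squared variable with nu df is Gamma(shape=nu/2, scale=2).
        s = math.sqrt(rng.gammavariate(nu / 2.0, 2.0) / nu)
        values.append(Phi(a + c * s) - Phi(a - c * s))
    return values


n, P = 210, 0.95
t_quantile = 1.971  # assumed approximate value of t_{0.975, 209}
c1 = t_quantile * math.sqrt(1 + 1 / n)

K = simulate_K(n, c1, reps=100_000)
mean_K = sum(K) / len(K)                       # should be close to P by equation (3)
prob_below = sum(k < P for k in K) / len(K)    # Monte Carlo estimate of Pr{K1 < P}
```

The sample mean of the simulated values should sit close to P, in line with equation (3), while a large share of the individual values fall below P.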

Figure 2. The pdfs of $K_1$ for various sample sizes n.

Given equation (3), it can be shown by the delta method that $\sqrt{n}(K_1-P)$ tends, as $n\to\infty$, to a normal distribution with zero mean and finite variance. This is supported by Figure 2, which shows that, as n increases, the pdf of $K_1$ becomes more symmetric and centered at P, with decreasing variance. Note that n = 150 is not yet large enough for the pdf of $K_1$ to be close to a normal pdf. From a brief simulation study we found that the sample size must be very large indeed for this to be achieved satisfactorily. Even for n = 10,000 the skewness and kurtosis values suggest a significant lack of normality. The coefficient of variation of $K_1$ for n = 150 is 0.014, becomes smaller than 0.01 for $n\ge 300$, and is around 0.002 for n = 10,000. This asymptotic normal distribution implies that $\Pr\{K_1<P\}\to 0.5$ as $n\to\infty$, that is, the probability of the reference range $RR_1$ containing less than (100P)% of the population is about 1/2 when the sample size is large.

The argument above means that, due to the sample’s randomness, using $RR_1$ as the reference range implies that there is a substantial probability, close to 50% when n is sufficiently large, that the reference range does not fulfill its objective of containing (100P)% of the population. Its property $E(K_1)=P$ in equation (3) has the following interpretation. A large number of individuals, I say, collect independent samples, one each, and compute the corresponding reference ranges $RR_1$ based on their own samples. Then the proportions of the population contained in these I reference ranges, $K_{1,1},\ldots,K_{1,I}$, are random values from the interval (0, 1) and form a random sample from the distribution of $K_1$, although some values could be very close to 0 and some values could be very close to 1. The property $E(K_1)=P$ merely says that $(K_{1,1}+\cdots+K_{1,I})/I$ is close to P when I is large. Hence, the proportion of the population that one particular reference range contains could be very small, but this is compensated by some very large proportions of the population that some other individuals’ reference ranges might contain, in the sense that $(K_{1,1}+\cdots+K_{1,I})/I$ is close to P. This potential for compensation from other reference ranges is unlikely to offer any comfort to someone who knows that their own reference range has a substantial probability of containing less than (100P)% of the population. It is clearly desirable to have a high confidence that our own reference range contains (100P)% of the population. Hence $RR_1$ falls short on this ground and should not be used as a reference range.

The justification for using prediction intervals as reference ranges5,13 is that exactly (100P)% of the future observations from the population should fall within the prediction intervals. It is clear from the line of reasoning stated in the previous paragraphs that this argument is not valid. The inappropriateness of prediction regions when used as reference regions has also been noted in Sections 2.2 and 3.3 of Dong and Mathew.15

In the next section we discuss tolerance intervals since several authors5,14,15 have proposed to use them as reference ranges. For example it has been stated that ‘it would seem that the statistical tolerance interval is what clinical chemists have in mind when they speak of a reference range derived from a sample of individuals representing some defined population’5 (p. 55).

2.2 Tolerance intervals

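A tolerance interval with content level P is a data-based random interval constructed to contain (100P)% of the population with a pre-specified (large) confidence level γ about the randomness in the sample.11,16,17,21–23 Specifically, a (P,γ) tolerance interval is given by11

$$RR_5 = \bar{X} \pm c_5 S$$

where the critical constant $c_5=c_5(P,\gamma,n)$ is chosen such that

$$\begin{aligned}
\Pr\left\{\Pr_{Y|X_1,\ldots,X_n}\{Y\in\bar{X}\pm c_5S\}\ge P\right\}
&= \Pr\left\{\Phi\left(\frac{\bar{X}-\mu}{\sigma}+c_5\frac{S}{\sigma}\right)-\Phi\left(\frac{\bar{X}-\mu}{\sigma}-c_5\frac{S}{\sigma}\right)\ge P\right\} \\
&= \Pr\left\{\Phi\left(\frac{Z}{\sqrt{n}}+c_5\sqrt{\frac{\chi^2_\nu}{\nu}}\right)-\Phi\left(\frac{Z}{\sqrt{n}}-c_5\sqrt{\frac{\chi^2_\nu}{\nu}}\right)\ge P\right\} = \gamma \qquad (5)
\end{aligned}$$

where the random variables Z and $\chi^2_\nu$ in equation (5) are the same as those in equation (4). The R package tolerance24,25 can be used to compute $c_5$.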
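To make the defining condition in equation (5) concrete, the following Python sketch approximates the critical constant by Monte Carlo: it fixes one set of simulated (Z, χ²) pairs and bisects on c until the estimated probability reaches γ. This is only an illustrative approximation under our own choices of replicate count, seed and search bracket; it is not the algorithm used by the tolerance package, which should be preferred in practice.

```python
import math
import random
from statistics import NormalDist


def tolerance_factor(n, P=0.95, gamma=0.95, reps=20_000, seed=1):
    """Monte Carlo approximation of c such that Pr{K >= P} = gamma, per equation (5)."""
    rng = random.Random(seed)
    Phi = NormalDist().cdf
    nu = n - 1
    # Common random numbers: the same draws are reused for every candidate c,
    # so the estimated confidence is a non-decreasing function of c.
    draws = [(rng.gauss(0.0, 1.0) / math.sqrt(n),
              math.sqrt(rng.gammavariate(nu / 2.0, 2.0) / nu))
             for _ in range(reps)]

    def confidence(c):
        hits = sum(Phi(a + c * s) - Phi(a - c * s) >= P for a, s in draws)
        return hits / reps

    lo, hi = 0.0, 50.0  # assumed bracket; widen for very small n or extreme P, gamma
    for _ in range(25):
        mid = (lo + hi) / 2
        if confidence(mid) >= gamma:
            hi = mid
        else:
            lo = mid
    return hi


c5 = tolerance_factor(210)  # sample size of the example discussed later in the paper
```

For n = 210, P = 0.95 and γ = 0.95 the result should land near the value 2.14 quoted in the example section, up to Monte Carlo error.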

Figure 3 compares $c_1$ and $c_5$ for sample sizes n = 2, …, 150 with P = 0.95 and γ ∈ {0.90, 0.95}. It is clear from Figure 3 that $c_5$ is considerably larger than $c_1$ in order that $RR_5$ contains (100P)% of the population with a pre-specified large confidence γ about the randomness in the sample. Also, as expected, $c_5$ increases with γ as seen in Figure 3.

Figure 3. The values of $c_1$ and $c_5$ for various sample sizes n.

2.3 Equal-tailed tolerance intervals

The tolerance interval $RR_5$ contains (100P)% of the population with a pre-specified (large) confidence γ about the randomness in the sample. But the proportion P of the population contained in $RR_5$ may not be the central (100P)% interval of the population. If we insist that a reference range should contain that central proportion of the population, i.e. $[\mu-z_{(1+P)/2}\sigma,\ \mu+z_{(1+P)/2}\sigma]$, with pre-specified confidence γ about the randomness in the sample, then we should use the following interval as the reference range

$$RR_6 = \bar{X} \pm c_6 S$$

where the critical constant $c_6=c_6(P,\gamma,n)$ is chosen such that

$$\Pr\left\{\bar{X}-c_6S<\mu-z_{(1+P)/2}\sigma \ \text{ and } \ \mu+z_{(1+P)/2}\sigma<\bar{X}+c_6S\right\}=\gamma$$

This interval is called the equal-tailed or central (P,γ) tolerance interval.15 A formula for $c_6$ is available,11 and its values can be computed using the function K.factor of the R package tolerance.24,25 This interval can be viewed as a pair of simultaneous confidence bounds with confidence γ: a lower confidence bound on the quantile $\mu-z_{(1+P)/2}\sigma$ and an upper confidence bound on the quantile $\mu+z_{(1+P)/2}\sigma$.26
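The defining condition for the equal-tailed constant lends itself to a direct simulation. Dividing both inequalities by σ and writing $a=(\bar{X}-\mu)/\sigma$ and $s=S/\sigma$, they hold together exactly when $c\,s>z_{(1+P)/2}+|a|$, so the required constant is the γ-quantile of $(z_{(1+P)/2}+|a|)/s$. This rearrangement is our own reading of the condition rather than a formula taken from the cited references, and the Python sketch below is a Monte Carlo approximation, not the K.factor algorithm.

```python
import math
import random
from statistics import NormalDist


def equal_tailed_factor(n, P=0.95, gamma=0.95, reps=200_000, seed=2):
    """Monte Carlo approximation of the equal-tailed (central) tolerance factor."""
    rng = random.Random(seed)
    z = NormalDist().inv_cdf((1 + P) / 2)  # z_{(1+P)/2}
    nu = n - 1
    needed = []
    for _ in range(reps):
        a = abs(rng.gauss(0.0, 1.0)) / math.sqrt(n)           # |X-bar - mu| / sigma
        s = math.sqrt(rng.gammavariate(nu / 2.0, 2.0) / nu)   # S / sigma
        needed.append((z + a) / s)  # smallest c for which both bounds hold
    needed.sort()
    # Empirical gamma-quantile of the smallest admissible constants.
    return needed[math.ceil(gamma * reps) - 1]


c6 = equal_tailed_factor(210)  # sample size of the example discussed later in the paper
```

For n = 210 the approximation should be close to the value 2.21 quoted in the example section.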

It is clear that comprising the central (100P)% of the population $[\mu-z_{(1+P)/2}\sigma,\ \mu+z_{(1+P)/2}\sigma]$ implies containing (100P)% of the population. Hence the equal-tailed $RR_6$ satisfies a more stringent requirement than $RR_5$ and, as a result, $c_6$ is larger than $c_5$.

Figure 4 compares $c_5$ and $c_6$ for sample sizes n = 2, …, 150 with P = 0.95 and confidence γ ∈ {0.90, 0.95, 0.99}. It is clear from Figure 4 that $c_6>c_5$, as expected.

Figure 4. The values of $c_5$ and $c_6$ for various sample sizes n.

Our view is that the (P,γ) tolerance interval should be used as the reference range since its form $\bar{X}\pm c_5S$ is centered at $\bar{X}$, mimicking the form of the equal-tailed tolerance interval $\mu\pm c_6\sigma$, and with a large confidence γ it does contain (100P)% of the population. Only if we specifically require the reference range to contain the central (100P)% of the population, $\mu\pm z_{(1+P)/2}\sigma$, should the equal-tailed (P,γ) tolerance interval be used; otherwise it is unnecessarily wide and flags fewer individuals as atypical than the (P,γ) tolerance interval.

3 Nonparametric reference ranges

3.1 Nonparametric prediction intervals

When F(·) is not assumed to have a specific form, nonparametric reference ranges can be considered. These are based on the order statistics $X_{[1]}<\cdots<X_{[n]}$ of the sample $X_1,\ldots,X_n$, and the sample quantiles have been used to estimate the population quantiles $F^{-1}((1-P)/2)$ and $F^{-1}((1+P)/2)$.6

In what follows, j(p) and j(t) are indices used for prediction and tolerance intervals, respectively.

Let j(p), with $1\le j(p)\le n/2$, be the largest natural number such that

$$\Pr\left\{Y\in\left(X_{[j(p)]},\,X_{[n-j(p)+1]}\right)\right\}\ge P \qquad (6)$$

where Y is a future observation from the population F(·) independent of the random sample $X_1,\ldots,X_n$ as before. Using the well-known facts that $U_1=F(X_1),\ldots,U_n=F(X_n)$ are independent, each having a uniform distribution on the interval (0, 1), and that $U_{[k]}=F(X_{[k]})$ is the k-th order statistic of $U_1,\ldots,U_n$ and has a beta distribution with parameters k and $n-k+1$, the probability in (6) is equal16 to $(n+1-2j(p))/(n+1)$. Hence the constraint on j(p) required in equation (6) gives

$$j(p)=\lfloor (n+1)(1-P)/2 \rfloor \qquad (7)$$

where $\lfloor a\rfloor$ denotes the integer part of a. This leads to using the nonparametric prediction interval

$$RR_7=\left(X_{[j(p)]},\,X_{[n-j(p)+1]}\right)$$

as a reference range. An interesting remark is that $X_{[j(p)]}$ and $X_{[n-j(p)+1]}$ are consistent point estimators of the population quantiles $F^{-1}((1-P)/2)$ and $F^{-1}((1+P)/2)$, respectively.

The proportion of the population contained in $RR_7$ is given by

$$K_7=\Pr_{Y|X_1,\ldots,X_n}\left\{Y\in\left(X_{[j(p)]},X_{[n-j(p)+1]}\right)\right\}=\Pr_{Y|X_1,\ldots,X_n}\left\{F(Y)\in\left(F(X_{[j(p)]}),F(X_{[n-j(p)+1]})\right)\right\}=U_{[n-j(p)+1]}-U_{[j(p)]} \qquad (8)$$

which is a random variable. The important question is whether the probability that this proportion is at least P, given by

$$\Pr\left\{U_{[n-j(p)+1]}-U_{[j(p)]}\ge P\right\} \qquad (9)$$

is sufficiently large to qualify the P prediction interval $RR_7=(X_{[j(p)]},X_{[n-j(p)+1]})$ as a reference range.

By noting that $U_{[n-j(p)+1]}-U_{[j(p)]}$ and $U_{[n-2j(p)+1]}$ follow the same beta distribution $B_{n-2j(p)+1,\,2j(p)}$, Tukey’s equivalence blocks result27 directly implies that

$$\Pr\left\{U_{[n-j(p)+1]}-U_{[j(p)]}\ge P\right\}=1-B_{n-2j(p)+1,\,2j(p)}(P) \qquad (10)$$

where $B_{n-2j(p)+1,\,2j(p)}(\cdot)$ denotes the cdf of the beta distribution with parameters $n-2j(p)+1$ and $2j(p)$. This probability can be easily calculated using the function pbeta in R.
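Equations (7) and (10) can also be evaluated without R. The Python sketch below is a standard-library illustration with function names of our own choosing; instead of a general beta cdf routine such as pbeta, it relies on the identity that, for positive integers a and b, the Beta(a, b) cdf at x equals the probability that a Binomial(a+b−1, x) variable is at least a.

```python
import math


def j_prediction(n, P=0.95):
    """Index j(p) from equation (7). Floating-point rounding could matter when
    (n+1)(1-P)/2 is an exact integer; a result of 0 means no valid index exists."""
    return math.floor((n + 1) * (1 - P) / 2)


def beta_cdf_int(x, a, b):
    """cdf of Beta(a, b) at x for positive integers a and b, using
    B_{a,b}(x) = Pr{Binomial(a+b-1, x) >= a}."""
    m = a + b - 1
    return sum(math.comb(m, k) * x**k * (1 - x)**(m - k) for k in range(a, m + 1))


def prob_content_at_least_P(n, j, P=0.95):
    """Equation (10): 1 - B_{n-2j+1, 2j}(P)."""
    return 1.0 - beta_cdf_int(P, n - 2 * j + 1, 2 * j)


n = 210  # sample size of the example discussed later in the paper
j = j_prediction(n)
p_ok = prob_content_at_least_P(n, j)
```

For n = 210 and P = 0.95 this gives j(p) = 5, and the complement 1 − p_ok is the probability that the nonparametric prediction interval contains less than 95% of the population.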

Note that, as $n\to\infty$, the beta distribution $B_{n-2j(p)+1,\,2j(p)}$ converges to a normal distribution with mean P, and thus the probability in equation (10) approaches 0.5 as $n\to\infty$.

Figure 5 plots this probability against n for P ∈ {0.90, 0.95, 0.99}. The plots are saw-tooth shaped due to the discreteness of n and j(p). It is clear from the figure that this probability can be substantially smaller than P, and approaches 0.5 as n becomes large, as expected from the asymptotic normal distribution pointed out above. This shows that the nonparametric prediction interval has a substantial probability, close to 0.5 when n is large, of containing less than (100P)% of the population values. Hence, this nonparametric prediction interval should not be used as a reference range for the same reason as the prediction interval based on the normal distribution.

Figure 5. The probability in equation (10) for various sample sizes n.

3.2 Nonparametric tolerance intervals

A nonparametric tolerance interval is constructed to contain (100P)% of the population with a pre-specified (large) confidence γ about the randomness in the sample. Consider the following nonparametric tolerance interval21

$$RR_8=\left(X_{[j(t)]},\,X_{[n-j(t)+1]}\right)$$

where j(t), with $1\le j(t)\le n/2$, is the largest natural number such that the proportion of the population contained in $RR_8$, given by

$$K_8=\Pr_{Y|X_1,\ldots,X_n}\left\{Y\in\left(X_{[j(t)]},X_{[n-j(t)+1]}\right)\right\}=U_{[n-j(t)+1]}-U_{[j(t)]}$$

following similar lines as $K_7$ in equation (8), is at least P with probability γ about the randomness in the sample $X_1,\ldots,X_n$. It follows therefore from equation (10) that j(t) should be the largest natural number with $1\le j(t)\le n/2$ that satisfies

$$\Pr\left\{U_{[n-j(t)+1]}-U_{[j(t)]}\ge P\right\}=1-B_{n-2j(t)+1,\,2j(t)}(P)\ge\gamma \qquad (11)$$

For given n, P and γ, j(t) can be easily computed by a direct search over the natural numbers in the range from 1 to n/2. Note that such a j(t) exists only if the sample size n is large enough to satisfy11

$$1-\left(nP^{n-1}-(n-1)P^{n}\right)\ge\gamma \qquad (12)$$
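The direct search for j(t) and the sample-size condition (12) can be sketched as follows in Python. The function names are ours, and the beta cdf is again obtained from the binomial identity for integer parameters rather than from a library routine.

```python
import math


def beta_cdf_int(x, a, b):
    """cdf of Beta(a, b) at x for positive integers a and b, using
    B_{a,b}(x) = Pr{Binomial(a+b-1, x) >= a}."""
    m = a + b - 1
    return sum(math.comb(m, k) * x**k * (1 - x)**(m - k) for k in range(a, m + 1))


def j_tolerance(n, P=0.95, gamma=0.95):
    """Largest j in 1..floor(n/2) satisfying inequality (11), or None if no j does."""
    best = None
    for j in range(1, n // 2 + 1):
        if 1.0 - beta_cdf_int(P, n - 2 * j + 1, 2 * j) >= gamma:
            best = j
        else:
            break  # the interval narrows as j grows, so the confidence only decreases
    return best


def min_sample_size(P=0.95, gamma=0.95):
    """Smallest n satisfying inequality (12), i.e. for which j(t) = 1 is admissible."""
    n = 2
    while 1 - (n * P ** (n - 1) - (n - 1) * P ** n) < gamma:
        n += 1
    return n


j_t = j_tolerance(210)   # sample size of the example discussed later in the paper
n_min = min_sample_size()
```

With P = 0.95 and γ = 0.95 the search gives j(t) = 3 for n = 210, matching the example section, and no admissible index exists until the sample reaches the size returned by min_sample_size.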

The equal-tailed or central nonparametric tolerance intervals can be constructed in a similar way. Our view is that a (P,γ) nonparametric tolerance interval is pertinent as a reference range similar to the normal distribution case. Hence we do not go into the details about the equal-tailed nonparametric tolerance intervals to save space.

Figure 6 compares j(t) and j(p) for given sample sizes n with P = 0.95 and γ ∈ {0.90, 0.95, 0.99}. It is clear from Figure 6 that j(t) is considerably smaller than j(p), and so $RR_8$ is wider than $RR_7$, in order that $RR_8$ contains (100P)% of the population with a pre-specified large confidence γ about the randomness in the sample. Also, as expected, j(t) decreases as γ increases.

Figure 6. The values of j(p) and j(t) for various sample sizes n given P and γ.

4 Example

A random sample of n =210 observations on fasting plasma glucose is taken from the population of interest. The data and the R code for all the computations in this paper are available at http://www.personal.soton.ac.uk/wl/RefRange/.

Suppose that the usual normality tests28 show that it is reasonable to assume the population has a normal distribution. The sample mean and standard deviation are computed to be $\bar{X}=5.31$ and S = 0.41 (in mmol/L). If we use the prediction interval as the reference range, then it is given by

$$RR_1=\bar{X}\pm t_{(1+P)/2,\nu}\,S\sqrt{1+1/n}=5.31\pm 1.97\times 0.41\times\sqrt{1+1/210}=[4.49,\,6.12]$$

Note, however, as pointed out above, that the probability of the prediction interval containing less than (100P)% of the population can be substantial and is computed to be 47%. So there is a 47% probability that the interval does not do what it purports to do: containing (100P)% of the population.

If we use the (P,γ) tolerance interval as the reference range, with γ=0.95, then it is given by

$$RR_5=\bar{X}\pm c_5S=5.31\pm 2.14\times 0.41=[4.43,\,6.19]$$

This interval is wider than the prediction interval. But, as we pointed out, the tolerance interval does contain (100P)% of the population with probability γ=0.95. Therefore, any future observations falling outside this interval can be regarded as atypical and should be considered for further investigation.

While the tolerance interval above has a confidence γ=95% of containing (100P)% of the population, it has a less than γ=95% probability of containing the central (100P)% of the population, $\mu\pm z_{(1+P)/2}\sigma$. This probability is computed to be 86%.

In order to have a γ=95% probability of containing the central (100P)% of the population, $\mu\pm z_{(1+P)/2}\sigma$, we can use the equal-tailed (P,γ) tolerance interval, which is given by

$$RR_6=\bar{X}\pm c_6S=5.31\pm 2.21\times 0.41=[4.40,\,6.22]$$

The confidence that this equal-tailed tolerance interval contains (100P)% of the population is computed to be 99%, which is much larger than γ=95%. Hence, with a 99% probability, the equal-tailed tolerance interval contains (100P)% of the population. Furthermore, we estimated that the equal-tailed tolerance interval $\bar{X}\pm 2.21S$ is the (0.957,γ) tolerance interval, that is, the interval contains 95.7% of the population with confidence γ=95%.
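The two probabilities quoted for the normal-based tolerance intervals (86% and 99%) can be checked approximately by simulation, using the representation in terms of Z and $\chi^2_\nu$ from equation (4). The Python sketch below is our own rough check with the rounded constants 2.14 and 2.21; it is not the authors' computation, and its output is subject to Monte Carlo and rounding error.

```python
import math
import random
from statistics import NormalDist

n, P = 210, 0.95
nu = n - 1
rng = random.Random(3)
nd = NormalDist()
z = nd.inv_cdf((1 + P) / 2)  # z_{(1+P)/2}

reps = 100_000
central_hits = 0  # X-bar +/- 2.14 S covers the central interval mu +/- z sigma
content_hits = 0  # X-bar +/- 2.21 S has population content of at least P
for _ in range(reps):
    a = rng.gauss(0.0, 1.0) / math.sqrt(n)                 # (X-bar - mu) / sigma
    s = math.sqrt(rng.gammavariate(nu / 2.0, 2.0) / nu)    # S / sigma
    if 2.14 * s > z + abs(a):
        central_hits += 1
    if nd.cdf(a + 2.21 * s) - nd.cdf(a - 2.21 * s) >= P:
        content_hits += 1

p_central = central_hits / reps  # compare with the 86% quoted in the text
p_content = content_hits / reps  # compare with the 99% quoted in the text
```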

Now suppose that the distribution of the population cannot be assumed to be normal. Then nonparametric reference ranges should be used. If we use the prediction interval as the reference range, then it is given by

$$RR_7=[X_{[5]},\,X_{[n-5+1]}]=[X_{[5]},\,X_{[206]}]=[4.62,\,6.09]$$

with j(p)=5. Note, however, as we have pointed out, that the probability of the prediction interval containing less than (100P)% of the population can be substantial and is computed to be 39%. So there is a 39% probability that the interval does not do what it purports to do: containing (100P)% of the population.

If we use the (P,γ) nonparametric tolerance interval as the reference range, with γ=0.95, then it is given by

$$RR_8=[X_{[3]},\,X_{[n-3+1]}]=[X_{[3]},\,X_{[208]}]=[4.38,\,6.27]$$

with j(t)=3. This tolerance interval is wider than the nonparametric prediction interval but, as we pointed out, it does contain (100P)% of the population with 95% confidence. Therefore, any future observations falling outside this interval can be regarded as atypical and should be considered for further investigation.

Finally, we note that nonparametric intervals are usually wider than the corresponding parametric ones since they require fewer assumptions than the parametric model.

5 Conclusions

The objective of a reference range is to contain a pre-specified large content level (100P)% of the population with γ confidence level, so that a future observation falling outside the reference range is regarded as atypical and considered for further investigation. This procedure should be useful as part of screening programmes, whose aim is to identify subjects at sufficient risk of a specific disorder who may benefit from further investigation or direct preventive action to avoid death or disability and to improve their quality of life.29

Since a reference range depends on the random sample, the event ‘a reference range contains (100P)% of the population’ is also random and so we can never be certain that a reference range contains (100P)% of the population. All we can hope for is that the event ‘a reference range contains (100P)% of the population’ occurs with a large probability, γ.

Based on this premise, we have argued that the prediction interval is not suitable as a reference range since there is a substantial probability, close to 50% when n is large, that the prediction interval contains less than (100P)% of the population. In contrast, a (P,γ) tolerance interval is designed to contain (100P)% of the population with a pre-specified large confidence γ so it is eminently adequate as a reference range.

Tolerance intervals or regions have been studied by many statisticians since the 1940s. Various parametric and nonparametric procedures are readily available for use as reference ranges or reference regions.11,16,17,24 Finally, we note that there is some work on constructing reference ranges specifically assuming that the clinical marker follows a log-normal distribution,30 and on sample size calculation for reference ranges31,32 and tolerance intervals.33–35 These aspects, however interesting, fall beyond the scope of our paper.

Acknowledgements

We are grateful to the editor, Professor Andrew Forbes, and two reviewers for their insightful comments which led to considerable improvements in this paper.

Footnotes

Declaration of conflicting interests: The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Funding: The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This research was partly supported by the National Institute for Health Research (NIHR) Great Ormond Street Hospital Biomedical Research Centre. The views expressed are those of the authors and not necessarily those of the National Health Service (NHS), the NIHR or the UK Department of Health.

Supplemental material: Supplemental material for this article is available online.

ORCID iD: Mario Cortina-Borja https://orcid.org/0000-0003-0627-2624

References

1. Wise J. Choosing Wisely: how the UK intends to reduce harmful medical overuse. BMJ 2017; 356: j370.
2. Whyte MB, Kelly P. The normal range: it is not normal and it is not a range. Postgrad Med J 2018; 94: 613–616.
3. Henny J, Hyltoft Petersen P. Reference values: from philosophy to a tool for laboratory medicine. Clin Chem Lab Med 2004; 42: 686–691.
4. Samuels MH, Kolobova I, Smeraglio A, et al. Effect of thyroid function variations within the laboratory reference range on health status, mood, and cognition in Levothyroxine-treated subjects. Thyroid 2016; 26: 1173–1184.
5. Albert A, Harris EK. Multivariate interpretation of clinical laboratory data. New York, NY: Marcel Dekker, 1987.
6. Harris EK, Boyd JC. Statistical basis of reference values in laboratory medicine. New York, NY: Marcel Dekker, 1995.
7. Horn PS, Pesce AJ. Reference intervals: a user’s guide. Washington, DC: AACC Press, 2005.
8. Clinical and Laboratory Standards Institute. Defining, establishing, and verifying reference intervals in clinical laboratory: approved guideline. 3rd ed. Wayne, PA: CLSI, 2008.
9. Geffré A, Friedrichs K, Harr K, et al. Reference values: a review. Vet Clin Pathol 2009; 38: 288–298.
10. Gitlow H, Awad H. Intro stats students need both confidence and tolerance (intervals). Am Stat 2013; 67: 229–234.
11. Krishnamoorthy K, Mathew T. Statistical tolerance regions: theory, applications, and computation. New York, NY: Wiley, 2009.
12. Royston P, Matthews JNS. Estimation of reference ranges from normal samples. Stat Med 1991; 10: 691–695.
13. Trost DC. Multivariate probability-based detection of drug-induced hepatic signals. Toxicol Rev 2006; 25: 37–45.
14. Katki HA, Engels EA, Rosenberg PS. Assessing uncertainty in reference intervals via tolerance intervals: application to a mixed model describing HIV infection. Stat Med 2005; 24: 3185–3198.
15. Dong X, Mathew T. Central tolerance regions and reference regions for multivariate normal population. J Multivar Anal 2015; 134: 50–60.
16. Guttman I. Statistical tolerance regions: classical and Bayesian. London: Griffin, 1970.
17. Hahn G, Meeker WQ. Statistical intervals: a guide to practitioners. 2nd ed. New York, NY: Wiley, 1991.
18. DeGroot MH. Probability and statistics. 2nd ed. Reading, MA: Addison-Wesley, 1986.
19. Wand M, Jones MC. Kernel smoothing. New York: Springer, 1985.
20. Wand M. KernSmooth: Functions for kernel smoothing supporting Wand & Jones (1995). R package version 2.23-16, https://CRAN.R-project.org/package=KernSmooth.
21. Wilks SS. Determination of sample sizes for setting tolerance limits. Ann Math Stat 1941; 12: 91–96.
22. Guttman I. Tolerance regions. In: S Kotz, et al. (eds) Encyclopaedia of statistical sciences, 2nd ed. New York, NY: Wiley, 2006, pp.8644–8659.
23. Meeker WQ, Hahn GJ, Escobar LA. Statistical Intervals: a guide for practitioners and researchers. 2nd ed. New York: Wiley, 2017.
24. Young DS. tolerance: An R Package for Estimating Tolerance Intervals. J Stat Softw 2010; 36: 1–39.
25. Young DS. Normal tolerance interval procedures in the tolerance package. R J 2016; 8: 200–212.
26. Liu W, Bretz F, Hayter AJ, et al. Simultaneous inference for several quantiles of a normal population with applications. Biometrical J 2013; 55: 360–369.
27. Tukey JW. Nonparametric estimation II: statistical equivalence blocks and tolerance regions – the continuous case. Ann Math Stat 1947; 18: 529–539.
28. Chantarangsi W, Liu W, Bretz F, et al. Normal probability plots with confidence. Biometrical J 2015; 57: 52–63.
29. Peckham CS, Dezateux C. Issues underlying the evaluation of screening programmes. Br Med Bull 1998; 54: 767–778.
30. Häggström M. Establishment and clinical use of reference ranges. Wiki J Med 2014; 1: 1.
31. Jennen-Steinmetz C, Wellek S. A new approach to sample size calculation for reference interval studies. Stat Med 2005; 24: 3199–3212.
32. Wellek S, Lackner KJ, Jennen-Steinmetz C, et al. Determination of reference limits: statistical concepts and tools for sample size calculation. Clin Chem Lab Med 2014; 52: 1685–1694.
33. Scheffé H, Tukey JW. A formula for sample sizes for population tolerance limits. Ann Math Stat 1944; 15: 217–217.
34. Faulkenberry GD, Weeks D. Sample size determination for tolerance limits. Technometrics 1968; 10: 343–348.
35. Young DS, Gordon CM, Zhu S, et al. Sample size determination strategies for Normal tolerance intervals using historical data. Qual Eng 2016; 28: 337–351.

Articles from Statistical Methods in Medical Research are provided here courtesy of SAGE Publications
