Author manuscript; available in PMC: 2021 Mar 17.
Published in final edited form as: Epidemiology. 2019 Nov;30(Suppl 2):S3–S9. doi: 10.1097/EDE.0000000000001094

Combining Biomarker Calibration Data to Reduce Measurement Error

Neil J Perkins a, Jennifer Weck a, Sunni L Mumford a, Lindsey A Sjaarda a, Emily M Mitchell b, Anna Z Pollack c, Enrique F Schisterman a
PMCID: PMC7968112  NIHMSID: NIHMS1679853  PMID: 31569147

Abstract

Biomarker assay measurement often consists of a two-stage process in which laboratory equipment yields a relative measure that is subsequently transformed to the unit of interest using a calibration curve. The calibration curve establishes the relation between the measured relative units and sample biomarker concentrations using stepped samples of known biomarker concentrations. Samples from epidemiologic studies are often measured in multiple batches or plates, each with independent calibration experiments. Collapsing calibration information across batches before statistical analysis has been shown to reduce measurement error and improve estimation. Additionally, collapsing in practice can create an additional layer of quality control (QC) and optimization in a part of the laboratory measurement process that is often highly automated. Principled recalibration is demonstrated via a three-step process of identifying batches where recalibration might be beneficial, forming a collapsed calibration curve and recalibrating identified batches, and using QC data to assess the appropriateness of recalibration. Here, we use inhibin B measured in biospecimens from the BioCycle study using 50 enzyme-linked immunosorbent assay (ELISA) batches (3875 samples) to motivate and display the benefits of collapsing calibration experiments, such as detecting and overcoming faulty calibration experiments and thus improving assay coefficients of variation by reducing unwanted measurement error variability. Differences in the analysis of inhibin B by testosterone quartile are also demonstrated before and after recalibration. These simple and practical procedures are minor adjustments that study personnel can implement without altering laboratory protocols, with potentially positive estimation and cost-saving implications, especially for population-based studies.

INTRODUCTION

Epidemiologic studies increasingly rely on biomarker measurement.1 While biomarkers typically reflect an improvement over other methods of ascertaining information from study participants, they remain susceptible to variability and error. In measuring biomarkers, there are three primary areas of variability: biological variability, which represents the intraindividual, within-subject variation; preanalytical variability, which is variability introduced between sample collection and measurement; and analytical variation, which is introduced upon measurement.2,3 Because only the first type of variability is of interest to researchers for understanding biological processes, minimization of preanalytic and analytic nuisance variability is essential for the accurate assessment and investigation of epidemiologic hypotheses.

To evaluate the problem further, it is essential to first understand the typical laboratory measurement process of biomarkers used in epidemiologic studies. The “standard curve” is the backbone of contemporary quantitative analytic chemistry, by which most modern biomarkers are assessed. Explicitly, an assay is performed on specimens of known standard concentrations, “standards,” along with biological specimens of unknown concentrations to produce a machine measured value (e.g., optical density [OD]) for each known and unknown sample. The standard curve is a mapping of stepped known concentrations to the machine measurable quantity (e.g., ODs) (see Figure 1 for example). Standard curves are often estimated via cubic splines or simple regressions of 5–10 pairs of log base 10 transformed OD (or other machine measured quantity) and serial dilutions of known standards which cover a desired range for a particular biomarker established by the assay manufacturer. The standard curve for that assay run, or batch, is then used to interpolate the unknown specimen concentrations from their machine measured ODs.
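As a minimal sketch of this two-stage process, the example below fits a log–log calibration line (the “simple regression” variant mentioned above; kit software more often uses cubic splines or four-parameter logistic fits) and inverts it to interpolate an unknown sample’s concentration from its OD. The standard concentrations and ODs here are hypothetical values chosen only for illustration.

```python
import math

def fit_loglog_line(concs, ods):
    """Least-squares fit of log10(OD) = a + b * log10(conc) to the
    nonzero known standards (the simple-regression variant of the
    standard curve; spline fits are handled analogously)."""
    xs = [math.log10(c) for c in concs]
    ys = [math.log10(o) for o in ods]
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    b = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
         / sum((x - mx) ** 2 for x in xs))
    a = my - b * mx
    return a, b

def interpolate_conc(od, a, b):
    """Invert the fitted curve to map a measured OD to a concentration."""
    return 10 ** ((math.log10(od) - a) / b)

# Hypothetical standards spanning an inhibin B-like range (pg/ml),
# with made-up ODs for the example.
standards = [10, 29, 97, 242, 486, 944]
ods = [0.05, 0.13, 0.41, 0.95, 1.80, 3.30]
a, b = fit_loglog_line(standards, ods)
unknown_conc = interpolate_conc(0.60, a, b)  # an unknown sample's OD
```

Because OD 0.60 falls between the ODs of the 97 and 242 pg/ml standards, the interpolated concentration lands between those two values.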

FIGURE 1.

FIGURE 1.

Plots of calibration curves mapping log10 optical density to log10 Inhibin-B concentration in independent batches/runs (grey) and a collapsed calibration curve (black). Left and right panels correspond to “good” and “bad” calibrated batches categorized by visual inspection (41 and 9 batches). The 9 “bad” batches are identified as candidates for recalibration via the collapsed curve. Each batch calibration curve was estimated from cubic splines based on six known standards and one blank (imputed at −1 here).

Quality control (QC) samples are specimens of known or consistent concentration that are also commonly included in assay batches but are separate from the standard curve and used as validation measures within and across multiple assay runs or batches. Westgard et al4 suggest accepting as valid all QC samples within up to 3 SDs of the mean concentration computed from serial QC measures. For a batch whose QC sample meets this standard, the concentrations of all specimens in that batch are considered valid after mapping their respective ODs using a common standard curve.4,5 If the OD of a QC sample is mapped to a concentration beyond a desired threshold and thus out of control, then the batch is rejected and the entire batch must be assayed again, requiring additional test specimens and assay materials, including the associated labor and material costs.
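A simplified, single-rule version of this acceptance check can be sketched as follows; the serial QC history values are hypothetical:

```python
import statistics

def qc_in_control(serial_qc, new_value, k=3.0):
    """Accept a batch's QC measurement if it lies within k SDs of the
    mean of prior serial QC measurements (a simplified, single-rule
    stand-in for the Westgard multirule approach cited in the text)."""
    mean = statistics.fmean(serial_qc)
    sd = statistics.stdev(serial_qc)
    return abs(new_value - mean) <= k * sd

# Hypothetical serial QC history for a control near 100 pg/ml
history = [100.0, 103.0, 98.0, 101.0, 104.0, 99.0]
```

Under this rule, a new QC reading of 102 pg/ml would be accepted, while a reading of 130 pg/ml would fall out of control and trigger rejection of the batch.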

The process described above, including batch-specific standard curves to interpolate unknown biomarker concentrations, along with the application of QC samples, was designed to account for analytical variation from multiple laboratory conditions (e.g., temperature, humidity), instrument variation, sample preparation techniques, and reagent characteristics and is routinely employed. Of note, particularly for large-scale epidemiologic studies, dozens of batches of biospecimens may need to be run to complete measurements for a given study question. This process of separate and independent standard curves, and one-by-one evaluation of QC samples and standards, must be conducted dozens of times as well. Variability in the standard curve is tolerated to a certain degree, and such assay variation is commonly estimated across batches and reported with study findings (e.g., inter- and intra-assay coefficients of variation [CVs]). Currently, there is no established method for systematically managing error in the standard curve without rerunning the entire batch, including both the curve-generating known standard samples and the unknown biological samples of interest. Instead, current practice is to either wholly accept or reject the standard curve and the derived concentrations for that batch. In large-scale studies, this can lead to substantial cost and material waste. This practice can also overlook not only substantial measurement error but also bias, which can occur in a simple mean if, for example, a few batches all have low calibrators.

Combining measurements of standards across batches and estimating a single, “collapsed” standard curve have been proposed previously as a method to mitigate the effects of analytic variation by exploiting the correlation across batches.6–9 Whitcomb et al6 used existing OD measurements of biological samples from multiple batches and applied this collapsed standard curve to recalculate concentrations. By recalibrating all batches within a particular study, they reduced variability in subsequently estimated odds ratios. However, this approach did not incorporate batch-specific QC data. Other methods have been proposed to improve performance when measurement error and batch-specific errors are present.10 When calibration and QC data are available on multiple batches, a principled batch-specific approach seems appropriate: leaving unchanged the batches that pass QC rules, and recalibrating with a collapsed curve those batches where QC rules are not initially satisfied. Such an approach can account for laboratory variation with minimal adjustment to standard laboratory practice and without incurring additional assay or specimen costs.

The aim of this investigation is to establish a data-driven method that reduces analytic variability through the selective use of a collapsed standard curve in lieu of individual standard curves or blanket recalibration in population-based studies with multiple batches. Specifically, we describe a novel method of principled recalibration to (1) identify candidate batches for recalibration, (2) recalibrate those candidate batches by applying a collapsed standard curve formed from combining all calibration data, and (3) assess the appropriateness of recalibration using QC metrics in the selected batches. For purposes of demonstration, we apply these steps of principled recalibration to a hormone assay measured via enzyme-linked immunosorbent assay (ELISA) in the BioCycle Study.

METHODS

The goal of principled recalibration is to reduce unnecessary measurement variation and potentially increase efficiency in the analysis of interest. Specifically, assay batches which demonstrate suboptimal or poor QC performance based on a particular criterion would benefit from recalibration with a collapsed calibration curve. The collapsed curve is estimated from many more points than the suboptimal batch-specific curve, thus achieving a more accurate mapping of OD to biomarker concentration and potentially reducing bias in measured concentrations, as well as increasing efficiency in downstream analysis. Principled recalibration is applied in three steps: (1) identify, (2) apply, and (3) assess.

Step 1: Identify Candidate Batches for Recalibration

Identifying potential batches for recalibration starts with visual inspection of the standard curves used for a set of completed assay batches for a given study or study question. Plotting the batch-specific curves together provides a useful tool in identifying high analytic variability due to one or more “bad” calibration samples that fall distant from samples with the same known concentrations. Curves should have a smooth monotone structure as concentrations increase and have a similar shape across batches (Figure 1, left panel). Failing this visual test by deviating in shape or slope identifies a batch as a candidate for recalibration (Figure 1, right panel). These batches are labeled candidates at this stage because the appropriateness of recalibration will be assessed in the last stage.

Candidate batches for recalibration can also be identified by QC samples of known concentrations that fall outside of control limits set by the manufacturer or laboratory protocol. This criterion is commonly used by laboratories to identify batches needing to be reassayed. Researchers could modify this criterion by setting their own desired limits of allowable deviation from the QC known concentration to identify candidate batches for recalibration in lieu of reassaying. Batches with unacceptable QC measures may or may not have been identified as a recalibration candidate by visual inspection.
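Flagging candidates by QC limits can be sketched as below; the batch readings and the 102 ± 20 pg/ml limits are placeholders in the spirit of the low-control example later in the paper:

```python
def flag_candidates(qc_by_batch, lower, upper):
    """Return batches whose QC measurement falls outside [lower, upper],
    marking them as candidates for recalibration rather than reassay."""
    return [batch for batch, value in qc_by_batch.items()
            if not (lower <= value <= upper)]

# Hypothetical low-control QC readings (pg/ml) for four batches,
# checked against manufacturer-style limits of 102 +/- 20 pg/ml.
qc_low = {1: 101.0, 2: 78.0, 3: 127.5, 4: 104.2}
candidates = flag_candidates(qc_low, 82.0, 122.0)  # -> batches 2 and 3
```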

Step 2: Apply Recalibration Using Collapsed Standard Curve

To recalibrate candidate batches, an alternative calibration curve must be estimated and applied. First, the calibration data from all batches are combined. As the points in Figure 1 show, this will result in multiple OD measurements for each known calibration concentration. Before estimating the collapsed curve, researchers may choose to remove points which might be outliers (e.g., visually disjointed from other points or outside of 2 SDs for a given concentration) or may negatively influence the collapsed curve. Next, the collapsed standard curve is estimated. Researchers can apply the same modeling technique performed by batch to estimate the collapsed standard curve from the calibration data collapsed across batches. Using this collapsed standard curve and already measured OD, the concentrations of the QC samples of candidate batches are recalibrated.6
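The pooling and outlier-screening step might be sketched as follows; the 2 SD cutoff matches the example rule mentioned above, and the batch data are invented:

```python
import statistics

def collapse_calibration(batches, sd_cut=2.0):
    """Pool calibration points across batches, dropping any OD more than
    sd_cut SDs from the mean OD at its known concentration; the kept
    points are then fit with the same model used per batch."""
    by_conc = {}
    for batch in batches:               # each batch: [(conc, od), ...]
        for conc, od in batch:
            by_conc.setdefault(conc, []).append(od)
    kept = []
    for conc, od_list in by_conc.items():
        mean = statistics.fmean(od_list)
        sd = statistics.stdev(od_list) if len(od_list) > 1 else 0.0
        kept.extend((conc, od) for od in od_list
                    if sd == 0.0 or abs(od - mean) <= sd_cut * sd)
    return kept

# Ten hypothetical batches, each contributing one point at the
# 97 pg/ml standard; one batch has a wildly high OD.
batches = [[(97, od)] for od in
           [0.40, 0.41, 0.42, 0.40, 0.41, 0.42, 0.40, 0.41, 0.42, 1.50]]
pooled = collapse_calibration(batches)
```

Here the aberrant point (97, 1.50) is excluded, leaving nine pooled points for estimating the collapsed curve.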

Step 3: Assess the Appropriateness of Recalibration

Now that the QC samples of candidate batches have been recalibrated, the appropriateness of recalibration can be assessed. Generally, recalibration is appropriate for a candidate batch if QC measurements improve, that is, if the recalibrated QC concentrations in that batch are closer to the known concentration than the original batch-specific calibrated QC measurements. For batches identified as candidates by visual inspection, appropriateness is assessed by applying this improvement-in-QC criterion: if the QC measurements improve, recalibration is appropriate; otherwise, it is not. For candidate batches identified by QC concentrations outside of control limits, appropriateness is assessed by whether recalibration brings the QC measures within limits. Researchers should be conservative, removing from the set of candidate batches any batch where recalibration is not clearly indicated or where the decision could reasonably be questioned.
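One way to encode these two assessment rules, with all QC values in the usage example being hypothetical:

```python
def recalibration_appropriate(known, original, recalibrated, limits=None):
    """Step 3 decision sketch: for visually identified candidates,
    keep recalibration if it moves the QC value closer to the known
    concentration; for QC-rule candidates, keep it only if the
    recalibrated value falls back inside the control limits."""
    if limits is not None:
        lower, upper = limits
        return lower <= recalibrated <= upper
    return abs(recalibrated - known) < abs(original - known)
```

For example, a visually flagged batch whose low-control QC moves from 130 to 105 pg/ml against a known 102 pg/ml would keep its recalibration, whereas a QC-rule candidate whose recalibrated value still sits outside 82–122 pg/ml would be removed from the candidate set.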

After conducting the three steps of identifying candidate batches, recalibrating those batches, and assessing the appropriateness of recalibration, recalibration should be applied to the biological samples from the batches where it is appropriate. The recalibrated biological concentrations should then be combined with the concentrations from unchanged batches into a single dataset for the analysis of interest. Next, we present a motivating example using inhibin B calibration where the variability of inhibin B across the menstrual cycle, as well as by quartile of testosterone, is of interest.

ILLUSTRATIVE EXAMPLE

The BioCycle Study was a prospective study of endogenous hormones and oxidative stress across the menstrual cycle and enrolled 259 healthy, premenopausal women ages 18–44 years with regular menstrual cycles.11 The University at Buffalo Health Sciences Institutional Review Board (IRB) approved the study and served as the IRB designated by the National Institutes of Health under a reliance agreement.

Details of participant characteristics and study design for the BioCycle Study have been described in detail elsewhere.11,12 Women provided blood samples timed to menstrual cycle phases up to 8 times per menstrual cycle.12 Inhibin B was measured in blood using a GEN II ELISA (Beckman Coulter, Brea, CA) in 3875 samples across 50 assay batches. The standard concentrations supplied were 0, 10, 29, 97, 242, 486, and 944 pg/ml. ODs at wavelengths 450 and 630 nm were measured. OD630 represents nonspecific background and was subtracted from the OD450 reading for every measurement. Batch-specific calibration curves were generated from cubic splines of OD450–630 from the standard concentrations after log10 transformation was applied to both the OD450–630 and corresponding concentrations, as specified by the assay kit manufacturer. Additionally, QC samples of known concentrations of 102 and 398 pg/ml, along with the laboratory-generated (laboratory) QC sample with unknown but consistent concentration, were included and measured in each batch. Table shows assay performance through mean, SD, and interassay %CV for QC samples.

TABLE.

Summary of Quality Control Sample Measurements in 50 Batches of Inhibin-B Assays Based on Original Calibration (i.e., Original Concentrations), Total (Whitcomb et al6) Recalibration, and the Approach Described Here, Principled Recalibration Using a Multi-Batch Collapsed Calibration Curve

          Manufacturer Low Control          Manufacturer High Control         Laboratory Control
          Original   Recalibration          Original   Recalibration          Original   Recalibration
                     Total    Principled               Total    Principled               Total    Principled
Mean       97.13     99.42     96.50       409.19     398.08    392.88         96.87     95.52     95.88
SD          9.01     14.21      7.94        63.69      59.99     35.74         10.36      9.29     10.16
CV%         9.28     14.29      8.23        15.57      15.07      9.09         10.70      9.72     10.59

Values are given in pg/ml. Manufacturer acceptable control limits: low, 102 ± 20 pg/ml and high, 398 ± 80 pg/ml.

Step 1: Identify Candidate Batches for Recalibration

While the order of identification methods generally does not matter, here we first applied visual inspection followed by QC rules. To investigate the calibration curves for each batch via visual inspection, we plotted each batch for comparison against a background of all batches, as well as the collapsed calibration curve. Figure 1 displays the results of the visual inspection in two panels corresponding to a “good” group that seems visually consistent in shape and level and a “bad” group that visually deviates from the rest of the batches in shape or level. The variation at each standard calibration point and the variability in the shape of “good” and “bad” batch calibration curves can be seen. The 41 batch calibration curves in the left panel shared a consistent, almost linear, pattern and are consistent with the collapsed curve. Some lines exhibited an acceptable vertical shift, possibly associated with variation in laboratory conditions (e.g., temperature, humidity). Visual inspection revealed nine “bad” batch curves, displayed in the right panel of Figure 1, that were exceptionally and irregularly nonlinear, or “wavy,” and thus are candidates for recalibration. We could be more or less conservative here in identifying candidates based on these subjective criteria, keeping in mind that the appropriateness of recalibrating all candidates will be assessed later.

Next, we expand our set of candidate batches by using the manufacturer provided “low” and “high” QC samples at 102 and 398 pg/ml, with suggested upper and lower control ranges of 102 ± 20 and 398 ± 80 pg/ml, respectively. Based on these QC data, 8/50 batches were candidates for recalibration because they were outside of the manufacturer recommended QC range. The manufacturer QC measures in the remaining 42 batches fell within the recommended range.

In addition to the manufacturer recommended QC measures, laboratory QC samples were measured in each batch. This laboratory QC is a single biological sample generated and maintained by the laboratory which is run in each batch. Limits of 85 to 105 pg/ml were applied to identify candidates for recalibration, of which six batches were identified.

Considering the three criteria, (1) visual inspection, (2) manufacturer QC samples, and (3) laboratory QC samples, 17 unique batches were identified as candidates for recalibration, with three batches identified under exactly two criteria and one batch (batch 4) identified by all three criteria.

Step 2: Apply Recalibration Using Collapsed Standard Curve

A collapsed calibration experiment of 350 measurements was created by combining across the 50 batches. By visual inspection, no single calibration point was an obvious outlier whose OD deviated greatly from the rest of the points at a particular concentration (points in Figure 1). However, 15 of the 350 measured calibration points fell outside of 2 SDs and thus were excluded from our estimation of the collapsed curve used for recalibration. The collapsed curve (shown in black in Figure 1) was generated from a cubic spline following the manufacturer’s modeling recommendations for batch-specific curves. The collapsed curve was then applied to the 17 batches identified as candidates for recalibration, calculating new manufacturer and laboratory QC concentrations from the existing batch ODs, as well as new concentrations for the biological samples.

Step 3: Assess the Appropriateness of Recalibration

Lastly, the appropriateness of recalibration in the candidate batches must be assessed. Batches identified visually or by laboratory QC were assessed with recalibrated manufacturer QC concentrations. Four candidate batches were removed because recalibration caused the updated manufacturer QC measurement to be outside of the manufacturer prespecified limits, implying that recalibration would negatively affect the batch’s concentration measurements. Another batch identified by laboratory QC samples as being just outside the range improved slightly when recalibrated but displayed no other indication that recalibration would be appropriate (e.g., already smooth curve and manufacturer QC concentrations within control limits) and was thus removed as a candidate for recalibration. Our assessment was not a strict two out of three criteria, but was conservative with regard to limiting unnecessary recalibration.

RESULTS

The final principled recalibration set consisted of 12 batches, while the other 38 batches retained their original standard curve calibration and concentration measures. In the set of 12 batches, the mean of the low and high manufacturer QC samples changed from 102.85 to 100.24 pg/ml and from 476.70 to 408.73 pg/ml before and after recalibration, respectively, with the CV reduced from 11.2% to 8.9% and from 18.4% to 9.5%, respectively. The characteristics of the manufacturer and laboratory QC samples before and after applying total and principled recalibration for all 50 batches are displayed in Table. Explicitly, all columns are based on 50 batches, with the “Original” columns using the laboratory’s standard batch-specific methodology, the “Total” columns reflecting recalibration of all batches (without selection), and the “Principled” columns based on the recalibration of 12 selected batches and 38 unchanged original batches. Table shows that QC concentration means remained similar before and after both recalibration methods, but the SDs were markedly reduced after the principled recalibration described in Steps 1–3 above. The CV for the low and high manufacturer QC samples was reduced from 9.3% to 8.2% and from 15.6% to 9.1%, respectively, by the proposed principled recalibration in lieu of the original calibration. Variance and CV in the laboratory QC sample were also reduced from the original. Table also shows that principled recalibration of the subset of batches identified via our Steps 1–3 results in improved CVs for both the low and high manufacturer QC samples compared with recalibrating all batches (principled versus total columns).

Impact: Inhibin B Across the Menstrual Cycle

The targeted study question we use here for illustration is “Do inhibin B concentrations differ across the menstrual cycle in women with lower versus higher testosterone?” Figure 2 displays the inhibin B variability patterns across the menstrual cycle, where cycle length was standardized to 28 days, comparing women in the highest and lowest quartiles of testosterone. The left panel of Figure 2 displays original concentrations, while the right panel displays inhibin B concentrations after the principled recalibration described in the previous section. Both panels show inhibin B rising and falling during the follicular phase (days 2 and 7), then a sharp increase at ovulation (day 13), before declining substantially during the luteal phase (days 18–27), with women in the lowest quartile having lower inhibin B at each time point. This general pattern is consistent based on the original concentrations (Figure 2, left panel) as well as after applying the principled recalibration (Figure 2, right panel). However, after principled recalibration, the differences in inhibin B levels between testosterone groups (lowest versus highest quartile women) increased at each time point, demonstrated by the slightly widening gap between the lines of the right panel of Figure 2 versus the original data in the left panel. Changes in the accompanying day-specific 95% confidence intervals remained mostly similar, narrowing or widening slightly across time points and quartiles after recalibration. Overall, a 24% mean increase in differences of log-transformed inhibin B across all time points was observed, with the quartile difference on day 14 increasing 40%, and differences in the luteal phase increasing 30%, 28%, and 29% (days 18, 22, and 28, respectively). Estimated 95% confidence intervals for the differences in log-transformed inhibin B levels between the first and fourth testosterone quartiles similarly narrowed or widened slightly after recalibration.
While improved estimation is of primary interest, for thoroughness Figure 3 plots the P values <0.20 from tests of day-specific comparisons of the first quartile of testosterone with quartiles 2, 3, and 4. The x axis corresponds to results based on the original data and the y axis corresponds to those after principled recalibration. Figure 3 shows a general reduction in P values, displaying a potential effect on inference from estimates improved by recalibration.

FIGURE 2.

FIGURE 2.

Inhibin B across the menstrual cycle in women of the first (1) and fourth (4) quartiles of testosterone. Left panel shows original concentrations, and the right panel shows concentrations after principled recalibration using a collapsed calibration curve.

FIGURE 3.

FIGURE 3.

P values from hypothesis tests for group differences in Inhibin B concentrations between the first quartile of testosterone and quartiles 2, 3, and 4 at 8 time points across the menstrual cycle (24 total comparisons). The horizontal axis reflects results from the original data, and the vertical axis reflects principled recalibration. Both axes were truncated at 0.2.

DISCUSSION

As we have shown here, principled recalibration of large biomarker studies can reduce measurement error and improve inference without additional costs and with only minimal adjustments to the laboratory calibration process. The laboratory methodology that measures concentrations of biomarkers is easily overlooked in epidemiologic research. Consequently, the accuracy of biomarker values received from a laboratory assay is typically taken at face value by epidemiologists. Batch-specific calibration experiments are generally conducted for protocol convenience without regard to the overall study size that often must incorporate multiple batches (as many as 50 in the present example). While batch variability will depend on the medium, laboratory device, and personnel, as well as the biomarker being measured, consideration of a thoughtful calibration design may lead to less biased and more efficient etiological research.

The analytical steps (identify, recalibrate, and assess) for principled recalibration described here are easy to implement postassay measurement. These techniques use existing data to reduce variability and potentially reduce bias introduced by the measurement process in epidemiologic studies with multiple assay runs. In combining calibration data across batches, a variety of exclusion criteria could be applied to specific calibration samples (e.g., an OD falling outside 2 SDs of ODs for that known concentration).

Principled recalibration can help detect and overcome “bad” calibrators that can unnecessarily increase batch variability. We demonstrated examples of “wavy” or highly nonlinear curves, which may not be detected by traditional QC samples. For example, the QC samples provided by the manufacturer in this example were at 102 and 398 pg/ml concentrations, which would not be as informative for detecting a problem at the low end of the curve (i.e., with the 10 or 29 pg/ml known standards). A curve could also be smooth but shifted, resulting in QC samples outside of control limits and indicating a need for better calibration. Both are real-world scenarios where an assay might have to be rerun, imposing additional cost and using valuable sample volume stored from study participants. Recalibration is a potential solution in these instances without requiring additional resources. Another benefit of this approach is that it provides a more stable imputation for values below the limit of detection, a common practice in epidemiologic studies.

Improvement in the substantive data analysis will depend on the amount and type of measurement error, the number of batches available to form the collapsed curve, and whether the error is in measurements of the exposure or outcome.13 Another limitation is that recalibration is a final technique in the measurement process, applied after following good sample collection and laboratory practices (e.g., randomizing case/control specimens and batch assignment well, or using a single laboratory or assay manufacturer). It is unlikely that recalibration can correct for loss of information or bias that may be introduced by fundamentally poor sample collection and laboratory practices. A final important caution is that calibration and recalibration should always be assessed before data analysis. While in the example provided herein, the observed effect sizes became larger and P values smaller, it is possible by chance or by differential error that recalibration could lead to the opposite impact on study findings (move findings toward the null). Importantly, recalibration should be performed based only on laboratory data and quality measures, not on the etiologic analysis. Doing the latter may be viewed as altering the data to coincide with a particular hypothesis, thus compromising the integrity of the study findings.

As population-based studies often have scarce sample volume and funding available for assay, the option of rerunning an assay is commonly costly and occasionally not feasible. Moreover, these studies are often searching for a relatively small effect size, the detection of which is greatly hindered by unnecessary variation in the measurement process. The general steps developed here for principled recalibration give scientists an effective, user-friendly guide to reduce analytic error and variability in a variety of biomarkers and, while not guaranteed, potentially improve inference through increased efficiency or reduced bias in etiologic investigation.

Acknowledgments

This research was supported by the Intramural Research Program of the Eunice Kennedy Shriver National Institute of Child Health and Human Development, National Institutes of Health.

Footnotes

The authors report no conflicts of interest.

The code used for processing and analyzing the data is available from the first author upon request. The data are available upon request, through a collaboration agreement.

REFERENCES

1. Angerer J, Ewers U, Wilhelm M. Human biomonitoring: state of the art. Int J Hyg Environ Health. 2007;210:201–228.
2. Fankhauser-Noti A, Grob K. Blank problems in trace analysis of diethylhexyl and dibutyl phthalate: investigation of the sources, tips and tricks. Anal Chim Acta. 2007;582:353–360.
3. Alcock RE, Halsall CJ, Harris CA, et al. Contamination of environmental samples prepared for PCB analysis. Environ Sci Technol. 1994;28:1838–1842.
4. Westgard JO, Barry PL, Hunt MR, et al. A multi-rule Shewhart chart for quality control in clinical chemistry. Clin Chem. 1981;27:493–501.
5. Westgard JO, Burnett RW, Bowers GN. Quality management science in clinical chemistry: a dynamic framework for continuous improvement of quality. Clin Chem. 1990;36:1712–1716.
6. Whitcomb BW, Perkins NJ, Albert PS, et al. Treatment of batch in the detection, calibration, and quantification of immunoassays in large-scale epidemiologic studies. Epidemiology. 2010;21(suppl 4):S44–S50.
7. Andrews SS, Rutherford S. A method and on-line tool for maximum likelihood calibration of immunoblots and other measurements that are quantified in batches. PLoS One. 2016;11:e0149575.
8. Liao JJZ. A linear mixed-effects calibration in qualifying experiments. J Biopharm Stat. 2005;15:3–15.
9. Ideker T, Thorsson V, Siegel AF, et al. Testing for differentially-expressed genes by maximum-likelihood analysis of microarray data. J Comput Biol. 2000;7:805–817.
10. Wang M, Flanders WD, Bostick RM, et al. A conditional likelihood approach for regression analysis using biomarkers measured with batch-specific error. Stat Med. 2012;31:3896–3906.
11. Wactawski-Wende J, Schisterman EF, Hovey KM, et al.; BioCycle Study Group. BioCycle study: design of the longitudinal study of the oxidative stress and hormone variation during the menstrual cycle. Paediatr Perinat Epidemiol. 2009;23:171–184.
12. Howards PP, Schisterman EF, Wactawski-Wende J, et al. Timing clinic visits to phases of the menstrual cycle by using a fertility monitor: the BioCycle Study. Am J Epidemiol. 2009;169:105–112.
13. Hutcheon JA, Chiolero A, Hanley JA. Random measurement error and regression dilution bias. BMJ. 2010;340:c2289.
