2016 Feb 17;27(2):141–151. doi: 10.1038/jes.2016.3

Table 2. Options for interpreting non-detect data [a].

| Technique | Percent of observations below detection limits | General description | Advantages | Disadvantages | Source |
|---|---|---|---|---|---|
| Discard non-detect entries | NA | Entries with a non-detect value are eliminated. | This approach is simple. | Analysis of results that have been reported as not detected is not possible. The data set may be distorted. | Levine [113] |
| Substitution of a value in place of the non-detect value | <15% | Substitute non-detects with zero; half the LOQ or LOD; the LOQ or LOD; or the LOQ/√2. Substitution with ½ the LOD has been used frequently in the past for chemical assessments. | Substitution is simple. Treating non-detects as zero reduces overestimation, while treating non-detects as the LOD avoids underestimation. | Use of this method could cause the data set to become skewed. Underestimation (when treating non-detects as zero) and overestimation (when treating non-detects as ½ the DL or as the DL) are possible. | EPA [17]; EPA [115]; Levine [113] |
| Aitchison's method | <15% | The mean and variance are adjusted to assume non-detects are zero. Assumes that microbial data are lognormally distributed. | Assumes data below the LOD were actually present, but could not be recorded. | May result in overestimation. | EPA [115]; Levine [113] |
| Cohen's method | <20% | Uses a maximum likelihood estimation approach to fit a lognormal distribution to the data. Assumes the data follow a normal distribution. | Accounts for data below the LOD. | As the number of observations falling below the LOD increases, the statistical power decreases and the true significance level increases. More than 20 observations are required for consistent results. Do not use if >50% of observations are non-detect. The LOD must be the same for all entries. | EPA [115]; Levine [113]; Helsel [116] |
| Kaplan–Meier | <50% | Non-parametric method. Estimates a cumulative distribution function for data that have multiple LODs to compute descriptive statistics. | Does not require a distribution to be specified. Can account for multiple censoring limits. | Used primarily for data with “greater thans”. | Helsel [116]; Helsel [120] |
| ROS | 50–80% | Imputation method (censored or missing observations are given a value, but not all non-detects are given the same value) that uses a probability plot of the detects to fill in the non-detect values. | Can be used for data with multiple LODs. Performs better than MLE and substitution methods on small sample sizes or for data that do not fit a distribution. | None given in the cited sources. | Helsel [116]; Helsel [120]; Wong [108] |
| Modern MLE | 50–80% | Uses less-than values (censored values) and detected observations to provide adjusted estimates of the mean and SD that were likely to have produced both detected and non-detected data. Assumes data follow a normal or lognormal distribution. | Accounts for data below the detection level. | Requires n >50. | EPA [115]; Helsel [116]; Helsel [120] |
| Test of proportions | >50% | Non-parametric method. Requires that at least 10% of the data be quantified. | Can be used for categorical data (presence/absence). | May not be applicable for composite samples. | EPA [115]; Levine [113] |
| Log-probit analysis | NA | Distributional method. Assumes the data have a lognormal probability distribution. Detected values are plotted and percentages of non-detects are accounted for. | More accurate and less biased than substitution. | Requires enough detected observations to define the distribution function with confidence. | EPA [17] |

Abbreviations: EPA, U.S. Environmental Protection Agency; LOD, limit of detection; LOQ, limit of quantitation; NA, not applicable; ROS, regression on order statistics.

[a] Adapted from Levine [113] and EPA [115].
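
To illustrate the substitution row of Table 2, the following minimal Python sketch compares the sample mean obtained under each of the listed substitution choices (zero, half the LOD, the LOD/√2, and the LOD). The measurements, the single LOD of 1.0, and the helper names `percent_nondetect` and `substituted_mean` are hypothetical and are not taken from the cited sources; the example only shows how the choice of substitute shifts the estimate, with zero giving the lowest mean and the LOD the highest.

```python
import math
from statistics import mean

# Hypothetical measurements; None marks a non-detect (a value below the LOD).
LOD = 1.0  # assumed single detection limit, arbitrary units
samples = [None, 2.5, 4.0, None, 3.0, 8.0, 1.5, 6.0, 2.0, 5.0,
           3.5, 7.0, 2.2, 4.4, 9.0, 1.8, 3.3, 5.5, 6.6, 2.8]

# Substitution options listed in Table 2.
SUBSTITUTES = {
    "zero": 0.0,
    "LOD/2": LOD / 2,
    "LOD/sqrt(2)": LOD / math.sqrt(2),
    "LOD": LOD,
}


def percent_nondetect(values):
    """Percentage of entries reported as non-detect."""
    return 100.0 * sum(v is None for v in values) / len(values)


def substituted_mean(values, fill):
    """Mean after replacing every non-detect with one fixed value."""
    return mean(fill if v is None else v for v in values)


pct = percent_nondetect(samples)
means = {name: substituted_mean(samples, fill)
         for name, fill in SUBSTITUTES.items()}

print(f"non-detects: {pct:.0f}%")
for name, m in means.items():
    print(f"{name:>12}: mean = {m:.3f}")
```

With 2 of 20 values censored, the hypothetical data set falls under the <15% threshold that the table gives for substitution. The spread between the "zero" and "LOD" means widens as the share of non-detects grows, which is one way to see why the table reserves substitution for lightly censored data.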