Dose-Response. 2006 May 22;3(4):453–455. doi: 10.2203/dose-response.003.04.001

Statistical Challenges in Evaluating Dose-Response using Epidemiological Data

Kenneth A Mundt
PMCID: PMC2477196  PMID: 18648626

One of the most frequently cited and relied-upon criteria for deciding whether some substance to which humans may be exposed (such as an environmental contaminant) is causally related to increased disease occurrence, or risk, is the presence of a dose-response relationship. In his now-famous President's Address in 1965 to the newly formed Section of Occupational Medicine of the Royal Society of Medicine, Sir Austin Bradford Hill outlined nine guidelines for evaluating observed associations and drawing informed judgments regarding causality. Fifth among these points to be considered – after “Strength,” “Consistency,” “Specificity” and “Temporality” – is “Biological gradient.” Hill wrote, “For instance, the fact that the death rate from cancer of the lung rises linearly with the number of cigarettes smoked daily, adds a very great deal to the simpler evidence that cigarette smokers have a higher death rate than non-smokers.” He noted that when a dose-response curve might be observed, “then we should look most carefully for such evidence,” but added, “[o]ften the difficulty is to secure some satisfactory quantitative measure of the environment which will permit us to explore this dose-response. But we should invariably seek it” (Hill 1965).

More than 40 years later, efforts to uncover and understand dose-response relationships underlying epidemiological data continue. Furthermore, elucidating the shape of the actual dose-response relationship is increasingly useful for reasons beyond evaluating causality, including setting regulatory standards for exposure limits, defining targets for contaminated site cleanup, and determining proper dosing of pharmaceutical preparations.

However, deriving a valid dose-response curve is not simple, and the following are a few of the many reasons why. First, as Hill pointed out, specific data are required. Valid estimates of exposure, at least semi-quantitative but preferably quantitative, are required for each individual studied; exposure serves as a surrogate for dose, because dose is rarely available in observational studies. Because environmental and occupational exposures are sustained over many years, the characterization of exposure is necessarily time-dependent, and the relevant time frame may span several decades, over which adequate historical exposure records are unlikely to exist.

Second, associations between an estimated exposure and the occurrence of disease can result not only from an underlying causal relationship, but also from systematic error or study bias. For example, in an occupational study disease risk might correlate with cumulative exposure, which in turn might simply identify longer-term workers, who by definition would be older and in whom cancer risk could be considerably higher. Cumulative exposure might also be a surrogate for other substances especially likely to be present in the earlier years of a production facility, or reflect the ability to follow long-term employees longer (versus those lost to follow-up), thereby providing a greater opportunity to detect the occurrence of the disease of interest. More subtly, because exposure information is incomplete, historical estimates may be incorrectly extrapolated back in time. If exposures are over-estimated for early years of employment, owing to the general impression that early production years conferred much higher exposure, a spurious inverse dose-response may be observed. Depending on the direction and degree of all study biases, an observed dose-response may not at all reflect the true underlying dose-response relationship.

Even in the absence of systematic error, epidemiological studies often suffer from low statistical power due to limited numbers of observed events (usually disease occurrences), especially for rare diseases such as specific cancers, even if the study population is large. At the lowest estimated doses, where risks are anticipated to be low as well, the small number of observed cases may even make it impossible to distinguish no increase in risk from a linear dose-response or a threshold (or other more complicated) dose-response function.
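The power problem described above can be illustrated with a minimal sketch. It assumes (these are illustrative assumptions, not taken from the papers discussed here) that the number of observed cases follows a Poisson distribution, that the test is a one-sided exact test at the 5% level, and that the function names `poisson_sf` and `power_one_sided` are hypothetical helpers introduced only for this example.

```python
import math


def poisson_sf(k, mean):
    """Return P(X >= k) for X ~ Poisson(mean)."""
    cdf_below_k = sum(
        math.exp(-mean) * mean**i / math.factorial(i) for i in range(k)
    )
    return 1.0 - cdf_below_k


def power_one_sided(expected_null, rate_ratio, alpha=0.05):
    """Power of a one-sided exact Poisson test for an elevated rate.

    expected_null: expected number of cases if there is no excess risk
    rate_ratio:    assumed true rate ratio under the alternative
    """
    # Smallest case count whose upper-tail probability under the null
    # does not exceed alpha.
    critical = 0
    while poisson_sf(critical, expected_null) > alpha:
        critical += 1
    # Probability of reaching that count when the true mean is elevated.
    return poisson_sf(critical, expected_null * rate_ratio)


if __name__ == "__main__":
    # Hypothetical scenarios: a true 50% increase in risk, with few
    # versus many expected cases.
    for expected in (2, 10, 50):
        print(expected, round(power_one_sided(expected, 1.5), 3))
```

Under these assumptions, a true 50% increase in risk is detected less than 10% of the time when only 2 cases are expected, whereas power is high when 50 cases are expected. The same logic applies to each dose category separately, which is why the lowest-dose categories are typically the least informative.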

Assuming that sample sizes are large enough to assure reasonable statistical power, that exposure can be estimated with reasonable validity, and that a true dose-response relationship is present, several additional statistical and analytical challenges remain. The three papers that follow address different (but related) statistical or analytical aspects of characterizing dose-response relationships in observational studies.

In the first paper of this section, Dr. Crump discusses the impact that random error in the exposure measurement can have on the assessment of the shape of the dose-response (Crump 2005). In many epidemiological studies, exposure estimates are based on incomplete data yet are generally assumed to have no random error, raising the possibility that the derived risk function is erroneous. As the paper discusses, this can also have serious implications for risk extrapolation.
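As a simple illustration of why unacknowledged random error matters, the sketch below simulates one textbook case: classical additive measurement error in a linear exposure-response model, which biases the fitted slope toward zero. This is not the analysis in Crump (2005); the linear model, the error structure, the parameter values, and the function names `ols_slope` and `simulate_attenuation` are all assumptions made for illustration.

```python
import random


def ols_slope(xs, ys):
    """Ordinary least-squares slope of ys regressed on xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx


def simulate_attenuation(n=20000, true_slope=2.0, error_sd=3.0, seed=0):
    """Compare slopes fitted to true versus error-prone exposure.

    Returns (slope using true exposure, slope using measured exposure).
    """
    rng = random.Random(seed)
    # Hypothetical true exposures, uniform on [0, 10].
    true_x = [rng.uniform(0.0, 10.0) for _ in range(n)]
    # Outcome depends linearly on the true exposure plus noise.
    outcome = [true_slope * x + rng.gauss(0.0, 1.0) for x in true_x]
    # Measured exposure = true exposure + independent random error.
    measured_x = [x + rng.gauss(0.0, error_sd) for x in true_x]
    return ols_slope(true_x, outcome), ols_slope(measured_x, outcome)


if __name__ == "__main__":
    slope_true, slope_measured = simulate_attenuation()
    print("slope with true exposure:    ", round(slope_true, 3))
    print("slope with measured exposure:", round(slope_measured, 3))
```

In this setup the slope estimated from the measured exposure is expected to shrink by roughly the ratio var(true exposure) / (var(true exposure) + var(error)), here about 8.3 / (8.3 + 9), or a little under half. Other error structures can distort the shape of the curve in different ways, so the sketch should be read as one example rather than a general rule.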

In the second paper, Dr. van Wijngaarden proposes a novel graphical approach for uncovering dose-responses and elucidating possible non-linearity of exposure-response in epidemiological studies using standardized morbidity or mortality ratio (SMR) analyses (van Wijngaarden 2005). SMR analyses remain popular in study settings where the available “exposed” population (often defined as an occupational or community cohort) is finite and is used only to determine the actual number of cases of the outcome of interest (“observed”) and to enumerate the person-time at risk of the cohort, so that external, statistically stable reference rates may be used to derive the “expected” number of outcomes. The proposed method is especially attractive in that it allows for moving (time-dependent) windows of exposure.
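The observed-versus-expected logic of an SMR can be sketched as follows. This shows only the standard SMR calculation, not the graphical method proposed by van Wijngaarden; the stratum data, the reference rates, and the function names `expected_cases` and `smr` are hypothetical and chosen purely for illustration.

```python
def expected_cases(person_years, reference_rates):
    """Expected number of cases: sum over strata of
    cohort person-years times the external reference rate."""
    if len(person_years) != len(reference_rates):
        raise ValueError("one reference rate is needed per stratum")
    return sum(py * rate for py, rate in zip(person_years, reference_rates))


def smr(observed, person_years, reference_rates):
    """Standardized mortality (or morbidity) ratio: observed / expected."""
    expected = expected_cases(person_years, reference_rates)
    if expected <= 0:
        raise ValueError("expected number of cases must be positive")
    return observed / expected


if __name__ == "__main__":
    # Hypothetical cohort with three age strata.
    person_years = [1000.0, 2000.0, 1500.0]   # person-time at risk
    reference_rates = [0.001, 0.002, 0.004]   # cases per person-year
    observed = 15                             # cases seen in the cohort

    print("expected:", expected_cases(person_years, reference_rates))
    print("SMR:", round(smr(observed, person_years, reference_rates), 2))
```

With these made-up inputs the cohort contributes 11 expected cases, so 15 observed cases give an SMR of about 1.36. Computing the SMR separately within categories of estimated exposure is the usual starting point for examining an exposure-response pattern in this kind of study.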

In the third paper, Drs. May and Bigelow discuss practical statistical modeling approaches for identifying and characterizing non-linear dose-response relationships using epidemiological data (May and Bigelow 2005). They review various factors that may influence or even interfere with the analyst's ability to properly discover and characterize the underlying dose-response function. With a focus on threshold and J-shaped relationships, they deftly illustrate the actual impact these factors have on several statistical methods commonly used to characterize dose-response relationships.

Combined, these papers underscore the challenges that epidemiologists and statisticians analyzing observational epidemiological data face in deriving valid and defensible conclusions regarding not only the presence, but also the shape, of the underlying dose-response relationship. Better quantitative characterization of these relationships will be useful beyond causal inference, by informing practical decision-making and supporting toxicological and biological discovery. Further refinement of available statistical approaches and development of new strategies are clearly needed.

REFERENCES

  1. Crump KS. The effect of random error in exposure measurement upon the shape of the exposure response. Dose-Response. 2005;3:456–464. doi: 10.2203/dose-response.003.04.002.
  2. Hill AB. The Environment and Disease: Association or Causation? Proc R Soc Med. 1965;58:295–300.
  3. May S, Bigelow C. Modeling nonlinear dose-response relationships in epidemiologic studies: Statistical approaches and practical challenges. Dose-Response. 2005;3:474–490. doi: 10.2203/dose-response.003.04.004.
  4. van Wijngaarden E. A graphical method to evaluate exposure-response relationships in epidemiologic studies using standardized mortality or morbidity ratios. Dose-Response. 2005;3:465–473. doi: 10.2203/dose-response.003.04.003.

Articles from Dose-Response are provided here courtesy of SAGE Publications
