Abstract
The previous article in this series explained basic concepts related to statistical signals and statistical noise in research. This article explains statistical noise in the context of randomized controlled trials (RCTs) and observational studies and offers suggestions on how noise in such studies may be reduced so as to better detect and understand the signal. Postrandomization bias related to RCTs and confounding in observational studies are discussed. Examples are provided to facilitate understanding.
Keywords: Signal, noise, randomized controlled trials, postrandomization bias, observational studies, confounding, regression
The previous article in this series introduced and explained basic concepts related to statistical noise in research.1 In summary, in research, we look for signals in our data. The signal may be a mean or a proportion, that is, a descriptive statistic. Or, it may be the identification of a relationship between variables, that is, the outcome of an inferential statistical procedure. Statistical signals may be distorted by noise from extraneous variables. Such noise may be random or nonrandom, and may be adequately measured, inadequately measured, unmeasured, or unknown (Appendix). The present article examines statistical noise in the context of clinical trials and observational studies and offers suggestions on how noise may be reduced.
Clinical Trials and Noise
In randomized controlled trials (RCTs), a property of randomization is that noise from inadequately measured, unmeasured, and unknown biases tends to be equally distributed between groups at baseline. So, when groups are compared, because the noise is similar between groups, the noise cancels out, making the signal easier to detect. Unfortunately, although the RCT is considered a gold standard for research, RCTs are not 24-carat gold.
Noise in RCTs can arise in three ways. First, randomization is never perfect, especially in smaller samples, and so the noise is never perfectly balanced between groups. Second, especially in multicenter RCTs, methodological variations because of inadequately standardized operating procedures, interrater variations, and between-site variations can all generate noise; this was explained in detail in an earlier article.2 Third, postrandomization bias 3 can create noise that compromises the internal validity of RCTs.
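The first of these points can be illustrated with a small simulation. The sketch below is an illustration of ours and makes several assumptions that are not in the text: a single unmeasured covariate drawn from a standard normal distribution, 1:1 allocation by simple shuffling, and a hypothetical helper named mean_abs_imbalance. It estimates how far apart the two group means of the covariate tend to lie, purely by chance, in trials of different sizes.

```python
import random
import statistics


def mean_abs_imbalance(n_subjects, n_trials=500, seed=0):
    """Average absolute between-group difference in an unmeasured covariate.

    Assumptions (illustrative only): the covariate is standard normal and
    subjects are allocated 1:1 by simple random shuffling.
    """
    rng = random.Random(seed)
    half = n_subjects // 2
    gaps = []
    for _ in range(n_trials):
        covariate = [rng.gauss(0, 1) for _ in range(n_subjects)]
        rng.shuffle(covariate)  # random allocation: first half vs second half
        gap = (statistics.fmean(covariate[:half])
               - statistics.fmean(covariate[half:]))
        gaps.append(abs(gap))
    return statistics.fmean(gaps)


if __name__ == "__main__":
    for n in (20, 200, 2000):
        print(n, round(mean_abs_imbalance(n), 3))
```

Under these assumptions, the typical chance imbalance shrinks roughly in proportion to the inverse square root of the sample size, which is why imperfect balance matters most in small RCTs.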
Postrandomization bias, also known as postrandomization confounding, can arise from imbalance in noise (between groups) created by events that happen between study baseline and study endpoint. The longer the duration of the RCT, the greater the risk of postrandomization bias. Examples of causes of postrandomization bias are differences between groups in rescue medication use, non-study treatment use, substance use, psychosocial stress, psychosocial support, medical health, and other psychosocial and biological patient and environmental variables. So, noise that was balanced between groups at baseline is no longer balanced between groups at endpoint because of changes in existing sources of noise or the addition of new sources of noise. Postrandomization bias can make conventional methods of RCT analysis unsuitable, in which case other methods become necessary.3,4
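A minimal sketch of how such an imbalance distorts the signal is given below. All numbers are hypothetical and chosen only for illustration: a true drug effect of 5 rating-scale points, a rescue medication benefit of 4 points, and rescue medication use by 20% of drug-treated and 60% of placebo-treated subjects after randomization. The function names are our own. The sketch shows the distortion only; it does not attempt to implement the specialized analyses that the cited references describe.

```python
import random
import statistics

TRUE_DRUG_EFFECT = 5.0  # hypothetical improvement attributable to the drug
RESCUE_EFFECT = 4.0     # hypothetical improvement from rescue medication


def simulate_arm(n, on_drug, rescue_probability, rng):
    """Return endpoint improvement scores for one trial arm."""
    scores = []
    for _ in range(n):
        improvement = rng.gauss(10, 8)  # placebo response plus random noise
        if on_drug:
            improvement += TRUE_DRUG_EFFECT
        if rng.random() < rescue_probability:  # event after randomization
            improvement += RESCUE_EFFECT
        scores.append(improvement)
    return scores


def naive_drug_effect(n_per_arm=5000, seed=2):
    """Drug minus placebo difference in means, ignoring rescue medication."""
    rng = random.Random(seed)
    drug = simulate_arm(n_per_arm, True, 0.20, rng)
    placebo = simulate_arm(n_per_arm, False, 0.60, rng)
    return statistics.fmean(drug) - statistics.fmean(placebo)


if __name__ == "__main__":
    print("true effect:", TRUE_DRUG_EFFECT)
    print("naive estimate:", round(naive_drug_effect(), 2))
```

With these assumed values, the naive comparison is expected to recover about 5 + 4 × (0.20 − 0.60) = 3.4 points rather than 5, because the placebo group received more rescue medication. Randomization balanced the groups at baseline but could not protect the comparison at endpoint.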
Observational Studies and Noise
It should now be clear that, when groups are compared in observational studies such as cohort and case-control studies, the absence of randomization to the groups of interest means that noise is never balanced between groups, even at baseline. Confounding by adequately measured, inadequately measured, unmeasured, and unknown variables can therefore be substantial in such studies.
As with RCTs, the longer the duration of follow-up in observational studies, the greater the accumulation of additional noise due to changes from baseline in patient and environmental variables. Here is an example. We study a cohort of healthy elderly individuals to determine the effects of diet, physical exercise, and blood pressure on the risk of mild cognitive decline and dementia across a 10-year follow-up. We use appropriate instruments to measure diet, physical exercise, blood pressure, and a wide range of relevant variables that might be a source of noise when testing relationships between the risk factors of interest and the outcomes of interest. We perform all these measurements at the time of recruitment into the cohort. We can be certain that most of these measurements will change, perhaps substantially, across the 10-year follow-up. For example, subjects may change their diet and level of exercise, some for the better and some for the worse. Many subjects who were normotensive at baseline may become hypertensive during follow-up; not all will be identified and receive adequate antihypertensive treatment, let alone early antihypertensive treatment. Some subjects will develop dyslipidemia or diabetes that may or may not be detected and adequately treated. Some subjects may experience stroke or head injury. It would be very difficult to capture all these changes and to adjust for all of them in statistical analyses.
In the situation described above, we look for a signal through noise generated by a large number of adequately measured, inadequately measured, unmeasured, and unknown variables. How well we succeed in detecting the signal, should one exist, depends on how strong the signal is and how well we can detect and statistically adjust for the noise. As a parallel, if a man is standing in a pool of water, we can spot his head above the surface only if he is very tall (the signal is strong) or if we are able to drain some of the water (reduce noise by adjusting for measured sources of noise).
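The pool analogy can be sketched numerically. The simulation below is ours and rests on stated assumptions: a hypothetical signal of 1 unit between two randomly formed groups, one measured covariate that contributes a large amount of noise to the outcome, and a simple residualization step standing in for formal covariate adjustment. It compares the standardized signal (the difference in means divided by its standard error) before and after the measured noise is removed.

```python
import random
import statistics


def standardized_signal(group_a, group_b):
    """Difference in means (b minus a) divided by its standard error."""
    diff = statistics.fmean(group_b) - statistics.fmean(group_a)
    se = (statistics.variance(group_a) / len(group_a)
          + statistics.variance(group_b) / len(group_b)) ** 0.5
    return diff / se


def simulate(n=2000, signal=1.0, seed=1):
    """Return (raw, adjusted) standardized signals for one simulated study."""
    rng = random.Random(seed)
    group = [rng.randrange(2) for _ in range(n)]      # random allocation
    covariate = [rng.gauss(0, 1) for _ in range(n)]   # measured source of noise
    outcome = [signal * g + 5.0 * c + rng.gauss(0, 1)
               for g, c in zip(group, covariate)]

    # "Drain the water": remove the part of the outcome explained by the
    # measured covariate, using the simple regression slope of outcome on it.
    mc, my = statistics.fmean(covariate), statistics.fmean(outcome)
    slope = (sum((c - mc) * (y - my) for c, y in zip(covariate, outcome))
             / sum((c - mc) ** 2 for c in covariate))
    adjusted = [y - slope * (c - mc) for c, y in zip(covariate, outcome)]

    def split(values):
        a = [v for v, g in zip(values, group) if g == 0]
        b = [v for v, g in zip(values, group) if g == 1]
        return a, b

    return standardized_signal(*split(outcome)), standardized_signal(*split(adjusted))


if __name__ == "__main__":
    raw, adj = simulate()
    print("standardized signal, unadjusted:", round(raw, 1))
    print("standardized signal, adjusted:  ", round(adj, 1))
```

The signal itself is the same size in both analyses; what changes is the depth of the water around it. Noise from unmeasured variables, which is absent from this sketch, would remain in the pool whatever adjustment is applied.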
Addressing Noise in Research
No population is homogeneous; for example, for any given continuous variable, there is always a standard deviation. For every relationship between variables in a study, which is what hypotheses examine, there is always subject-to-subject variation even within groups. So, it is impossible to eliminate noise in research. Noise is part of the real world.
It is possible to preemptively reduce noise in RCTs through various methods such as selection of homogeneous samples and standardization of procedures.1 Noise can be statistically reduced in observational studies by propensity score matching.5 Finally, noise can be reduced in RCTs and in observational studies through the use of linear, logistic, proportional hazards, and other forms of regression in which measured confounding variables and biases can be adjusted for.6-9
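As an illustration of regression adjustment, the sketch below simulates a hypothetical observational cohort in which a measured confounder influences both exposure and outcome. The numbers (a true exposure effect of 0.5 and a confounder effect of 2.0), the function names, and the hand-written two-predictor least squares formula are all assumptions made for this example; a real analysis would use a statistical package and would usually adjust for many variables.

```python
import random
import statistics


def centered_cross(u, v):
    """Sum of cross-products of u and v about their means."""
    mu, mv = statistics.fmean(u), statistics.fmean(v)
    return sum((a - mu) * (b - mv) for a, b in zip(u, v))


def crude_effect(exposure, outcome):
    """Exposed minus unexposed difference in mean outcome (no adjustment)."""
    exposed = [y for x, y in zip(exposure, outcome) if x == 1.0]
    unexposed = [y for x, y in zip(exposure, outcome) if x == 0.0]
    return statistics.fmean(exposed) - statistics.fmean(unexposed)


def adjusted_effect(exposure, confounder, outcome):
    """Exposure coefficient from least squares linear regression of the
    outcome on exposure and confounder (with an intercept)."""
    s11 = centered_cross(exposure, exposure)
    s22 = centered_cross(confounder, confounder)
    s12 = centered_cross(exposure, confounder)
    s1y = centered_cross(exposure, outcome)
    s2y = centered_cross(confounder, outcome)
    return (s22 * s1y - s12 * s2y) / (s11 * s22 - s12 ** 2)


def simulate_cohort(n=4000, true_effect=0.5, seed=3):
    """Hypothetical cohort: higher confounder values make exposure more likely."""
    rng = random.Random(seed)
    exposure, confounder, outcome = [], [], []
    for _ in range(n):
        c = rng.gauss(0, 1)
        x = 1.0 if c + rng.gauss(0, 1) > 0 else 0.0
        y = true_effect * x + 2.0 * c + rng.gauss(0, 1)
        exposure.append(x)
        confounder.append(c)
        outcome.append(y)
    return exposure, confounder, outcome


if __name__ == "__main__":
    x, c, y = simulate_cohort()
    print("crude estimate:   ", round(crude_effect(x, y), 2))
    print("adjusted estimate:", round(adjusted_effect(x, c, y), 2))
```

In this simulated cohort the crude comparison greatly overstates the exposure effect, whereas the adjusted coefficient lies close to the value used to generate the data. The adjustment works here only because the confounder was measured, and measured well.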
Parting Notes
Unmeasured and unknown confounds and biases create noise that can never be adjusted for. As a result, when researchers fail to detect a signal, they need to consider whether failure to control for noise was responsible. And, when they do detect a signal, they must consider whether the signal may be false because its value was distorted by noise (Appendix).
Finally, if a signal cannot be reliably detected through noise, one must ask whether the signal, if any, was worth detecting. Weak signals and real-world noise may explain why a treatment that worked well in valid animal models does not work in humans, why statistically significant findings may not be clinically significant, and why a drug that had clinically significant effects in homogeneous samples in RCTs does not work well in real-world settings.
Footnotes
The author declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding: The author received no financial support for the research, authorship, and/or publication of this article.
Appendix
Statistical noise is broadly defined in this article to include signal-distorting variance that is random as well as signal-distorting variance that arises from systematic biases. A random source of noise affects everybody—some more, some less. For example, environmental stress generates random noise that distorts the value of depression ratings in an antidepressant versus placebo RCT. In contrast, a systematic bias as a source of noise affects one group of interest more than the other. For example, depression-related impaired sleep and appetite that can distort pregnancy outcomes may be more common in antidepressant-treated women than in women not treated with antidepressants during pregnancy. This is explained further in the next point.
When noise produces systematic rather than random distortion of the signal, the inability to adjust for noise can result in erroneous conclusions. For example, genetic variables, biological characteristics of moderate to severe depression, and symptoms and behaviors related to moderate to severe depression may predispose to adverse neonatal outcomes. That is, they are a source of noise that can systematically distort the signal that describes the relationship between antidepressant exposure during pregnancy and neonatal outcomes. If these biological characteristics, symptoms, and behaviors are inadequately measured, unmeasured, or unknown, they cannot be adjusted for in statistical analyses. In such an event, antidepressant drugs, used to treat moderate to severe depression, will erroneously be blamed for adverse neonatal outcomes when the comparison group comprises non-depressed or mildly depressed women who do not receive antidepressants during pregnancy.
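The pregnancy example can be sketched as a simulation. Every number below is hypothetical and is not drawn from any study: 30% of women have moderate to severe depression, such women are far more likely to receive antidepressants, and the risk of an adverse neonatal outcome depends on depression severity alone. By construction, the drug has no effect. The function names are our own.

```python
import random


def simulate_pregnancies(n=60000, seed=4):
    """Return (severe, treated, adverse) records for a hypothetical cohort."""
    rng = random.Random(seed)
    records = []
    for _ in range(n):
        severe = rng.random() < 0.30
        treated = rng.random() < (0.80 if severe else 0.10)
        # Outcome risk depends on severity only; the drug has no effect.
        adverse = rng.random() < (0.20 if severe else 0.08)
        records.append((severe, treated, adverse))
    return records


def risk(records, treated, severe=None):
    """Proportion with an adverse outcome, optionally within one stratum."""
    selected = [adverse for sev, trt, adverse in records
                if trt == treated and (severe is None or sev == severe)]
    return sum(selected) / len(selected)


def risk_difference(records, severe=None):
    """Treated minus untreated risk, crude or within a severity stratum."""
    return risk(records, True, severe) - risk(records, False, severe)


if __name__ == "__main__":
    cohort = simulate_pregnancies()
    print("crude:", round(risk_difference(cohort), 3))
    print("severe stratum:", round(risk_difference(cohort, severe=True), 3))
    print("non-severe stratum:", round(risk_difference(cohort, severe=False), 3))
```

The crude comparison shows a clearly higher risk in treated women, whereas the risk difference within each severity stratum is close to zero. If severity were unmeasured, only the crude comparison would be available, and the drug would be blamed for an effect of the illness.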
References
1. Andrade C. Understanding statistical noise in research: 1. Basic concepts. Indian J Psychol Med, 2023; 45(1): 89–90.
2. Andrade C. Signal to noise ratio, variability, and their relevance in clinical trials. J Clin Psychiatry, 2013; 74: 479–481.
3. Manson JE, Shufelt CL, Robins JM. The potential for postrandomization confounding in randomized clinical trials. JAMA, 2016; 315(21): 2273–2274.
4. Rochon J. Accounting for covariates observed post randomization for discrete and continuous repeated measures data. J R Statist Soc B, 1996; 58: 205–219.
5. Andrade C. Propensity score matching in nonrandomized studies: a concept simply explained using antidepressant treatment during pregnancy as an example. J Clin Psychiatry, 2017; 78: e162–e165.
6. Vickers AJ. Analysis of variance is easily misapplied in the analysis of randomized trials: a critique and discussion of alternative statistical approaches. Psychosom Med, 2005; 67(4): 652–655.
7. Rosenblum M and van der Laan MJ. Using regression models to analyze randomized trials: Asymptotically valid hypothesis tests despite incorrectly specified models. Biometrics, 2009; 65(3): 937–945.
8. Singh R and Mukhopadhyay K. Survival analysis in clinical trials: Basics and must know areas. Perspect Clin Res, 2011; 2(4): 145–148.
9. van Leeuwen N, Walgaard C, van Doorn PA, et al. Efficient design and analysis of randomized controlled trials in rare neurological diseases: An example in Guillain-Barré syndrome. PLoS One, 2019; 14(2): e0211404.