Abstract
In the first of a series of four articles, the authors explain the statistical concepts of hypothesis testing and p values. In many clinical trials investigators test a null hypothesis that there is no difference between a new treatment and a placebo or between two treatments. The result of a single experiment will almost always show some difference between the experimental and the control groups. Is the difference due to chance, or is it large enough to reject the null hypothesis and conclude that there is a true difference in treatment effects? Statistical tests yield a p value: the probability that the experiment would show a difference as great as or greater than that observed if the null hypothesis were true. By convention, p values of less than 0.05 are considered statistically significant, and investigators conclude that there is a real difference. However, the smaller the sample size, the greater the chance of erroneously concluding that the experimental treatment does not differ from the control; in statistical terms, the power of the test may be inadequate. Testing several outcomes from one set of data may lead to the erroneous conclusion that an outcome is significant if the joint probability of the outcomes is not taken into account. Hypothesis testing has limitations, which will be discussed in the next article in the series.
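The three ideas in this abstract (the p value, power, and the effect of testing several outcomes) can be illustrated with a short Python sketch. It is not taken from the article. It assumes a trial comparing event rates in two groups, uses a pooled normal-approximation z test, and treats multiple outcomes as independent. All function names and trial numbers are hypothetical.

```python
import math


def normal_cdf(z):
    """Cumulative distribution function of the standard normal."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def two_proportion_p_value(events_a, n_a, events_b, n_b):
    """Two-sided p value for the null hypothesis of equal event rates.

    Uses the pooled z test (normal approximation), so it is only a rough
    guide for small groups. Assumes the pooled rate is not exactly 0 or 1.
    """
    rate_a = events_a / n_a
    rate_b = events_b / n_b
    pooled = (events_a + events_b) / (n_a + n_b)
    std_err = math.sqrt(pooled * (1.0 - pooled) * (1.0 / n_a + 1.0 / n_b))
    z = (rate_a - rate_b) / std_err
    # Probability of a difference at least this large, in either direction,
    # if the null hypothesis were true.
    return 2.0 * (1.0 - normal_cdf(abs(z)))


def approximate_power(rate_a, rate_b, n_per_group, z_crit=1.959964):
    """Approximate power of a two-sided test at the 0.05 level.

    z_crit is the two-sided 5% critical value of the standard normal.
    The true event rates rate_a and rate_b are assumptions of the planner.
    """
    std_err = math.sqrt(
        rate_a * (1.0 - rate_a) / n_per_group
        + rate_b * (1.0 - rate_b) / n_per_group
    )
    shift = abs(rate_a - rate_b) / std_err
    return normal_cdf(shift - z_crit) + normal_cdf(-shift - z_crit)


def familywise_error(alpha, n_tests):
    """Chance of at least one false positive among independent tests."""
    return 1.0 - (1.0 - alpha) ** n_tests


if __name__ == "__main__":
    # Hypothetical trial: the same observed rates (30% v. 45%) at two sizes.
    print("p value, 100 per group:", two_proportion_p_value(30, 100, 45, 100))
    print("p value, 20 per group: ", two_proportion_p_value(6, 20, 9, 20))
    # Power to detect a true 30% v. 45% difference at each size.
    print("power, 100 per group:", approximate_power(0.30, 0.45, 100))
    print("power, 20 per group: ", approximate_power(0.30, 0.45, 20))
    # Five independent outcomes, each tested at the 0.05 level.
    print("family-wise error, 5 tests:", familywise_error(0.05, 5))
```

With these made-up numbers, the same observed difference falls below the 0.05 threshold with 100 patients per group but not with 20, and the approximate power is much lower in the smaller trial. Testing five independent outcomes at the 0.05 level raises the chance of at least one spurious "significant" result to roughly 0.23.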