MULTIPLE OUTCOMES AND MULTIPLE SOURCES OF EVIDENCE – BEST STATISTICAL PRACTICES

Sharon-Lise T Normand

doi:10.1161/CIRCOUTCOMES.111.963751

Author manuscript; available in PMC: 2013 Aug 28.

Published in final edited form as: Circ Cardiovasc Qual Outcomes. 2011 Nov 1;4(6):579–580. doi: 10.1161/CIRCOUTCOMES.111.963751

PMCID: PMC3755586 NIHMSID: NIHMS329640 PMID: 22085981

This issue contains two articles in our planned Statistical Primer on Methods or Interpretation Series. The goals of our series are to (1) familiarize cardiovascular outcomes researchers with design and analytical problems encountered in outcomes research, (2) point to potential solutions, and (3) introduce modern analytical approaches. The series’ inaugural article¹ discussed approaches for handling missing data – approaches that have existed for several decades but have not been fully embraced by outcomes researchers. The second article² focused on the “landmark analysis”, an analytical approach in which patients who have treatment-censoring events before a “landmark” time are excluded from analysis. In this issue, Teixeira-Pinto and Mauri³ address the problem of multiple outcomes while Kwok and Lewis⁴ discuss the use of Bayesian hierarchical models.

What is “the problem” of multiple outcomes? Many readers of Circulation: Cardiovascular Quality and Outcomes will be familiar with multiplicity adjustments used in a hypothesis testing setting. Here, researchers control the overall risk of making a mistake, such as rejecting the null hypothesis when it is true, through the use of an “adjustment”. If the effects of an intervention on four different outcomes need to be tested and the investigator desires to assume a 5% overall risk of error, then each outcome would be tested using a significance level of 1.25% (5% divided by four). While less conservative adjustments exist, the general adjustment strategy can lead to incorrect conclusions in many common settings, and the effectiveness of the adjustment strategy depends upon the correlation among the multiple outcomes.

Teixeira-Pinto and Mauri describe several common strategies used in the analysis of studies with multiple outcomes, and illustrate advantages and disadvantages using data from the SIRIUS trial that compared bare-metal and sirolimus-eluting stents.⁵ The techniques the authors propose stem from methodological work designed to properly model data from “multiple informants” in psychiatric research.⁶ For example, in studying risk factors for childhood psychopathology, statistically efficient methods to utilize the information obtained about the child’s mental health status from multiple informants (i.e., teachers, parents, and the child) were developed to replace arbitrary pooling rules. Under one such rule, to determine whether a symptom is present, psychiatric researchers would assume it was present when all informants responded yes; if one informant responded no, then the symptom was assumed absent. This particular pooling rule is similar to the “all-or-none” strategy proposed to assess process of care performance measures.⁷ Other arbitrary pooling rules have been (and continue to be) used.

Teixeira-Pinto and Mauri show how methods developed in psychiatric services research to handle multiple outcomes can benefit cardiovascular outcomes researchers. They consider the case when there are two “informants” or data sources: a continuous measure of restenosis measured on all subjects and a binary measure of restenosis measured on a pre-planned subset of subjects. These new approaches are very attractive as they provide a comprehensive approach for cardiovascular outcomes researchers to make use of all data simultaneously, avoiding the need for pooling rules, and increasing statistical power to detect effects.
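
To make the multiplicity adjustment discussed earlier concrete, the following is a minimal sketch and not a method taken from either Primer article. It computes the per-outcome significance level for four outcomes at a 5% overall risk, then uses a small simulation to estimate the chance of at least one false rejection when every null hypothesis is true. The function names, the equal pairwise correlation `rho` among test statistics, the two-sided normal tests, and the simulation size are all assumptions chosen for illustration.

```python
import random
from statistics import NormalDist


def bonferroni_alpha(family_alpha: float, n_outcomes: int) -> float:
    """Per-outcome significance level under a simple equal-split adjustment."""
    return family_alpha / n_outcomes


def familywise_error(n_outcomes: int, rho: float, per_test_alpha: float,
                     n_sim: int = 20000, seed: int = 0) -> float:
    """Monte Carlo estimate of P(at least one false rejection) when all nulls hold.

    Assumption for illustration: the test statistics are standard normal with a
    common pairwise correlation rho (0 <= rho <= 1), built from one shared factor.
    """
    rng = random.Random(seed)
    critical = NormalDist().inv_cdf(1 - per_test_alpha / 2)  # two-sided cutoff
    shared_weight = rho ** 0.5
    own_weight = (1 - rho) ** 0.5
    any_rejection = 0
    for _ in range(n_sim):
        shared = rng.gauss(0, 1)
        stats = (shared_weight * shared + own_weight * rng.gauss(0, 1)
                 for _ in range(n_outcomes))
        if any(abs(z) > critical for z in stats):
            any_rejection += 1
    return any_rejection / n_sim


if __name__ == "__main__":
    alpha = bonferroni_alpha(0.05, 4)  # 5% split over four outcomes
    print("per-outcome level:", alpha)
    for rho in (0.0, 0.5, 0.9):
        print("rho =", rho, "estimated overall error:",
              familywise_error(4, rho, alpha))
```

Under these assumptions the estimated overall error sits near the nominal 5% when the outcomes are independent and falls below it as the correlation grows, which is one way to see why the adjustment becomes more conservative, and loses power, for strongly correlated outcomes.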

Kwok and Lewis describe Bayesian hierarchical models as a strategy to use the “ensemble” of information to learn about the effectiveness of cardiovascular therapies. They illustrate fundamental concepts through a meta-analysis of immunosuppressive therapy in idiopathic dilated cardiomyopathy and through a subgroup analysis of the National Institute of Neurological Disorders and Stroke intravenous tissue plasminogen activator stroke trial. Both examples are worthy of a careful read, but the use of Bayesian methods for bolstering the evidence base in subgroup analyses is particularly thoughtful. The assumption underpinning their examples is that the underlying mechanisms that generate data from the different sources, whether the sources are distinct trials, subgroups within trials, or even cohorts of patients from different countries treated with the same therapy, are related. The strength of the relationships among the data sources is unknown, but it can be estimated through the use of a (hierarchical) Bayesian probability model.

The idea of formally combining information is not new – a baseball question popularized the technique in a 1977 Scientific American article. In that article, two statisticians demonstrated the practical use of hierarchical models to determine “whether Ty Cobb was really a .400 hitter”.⁸ So what do baseball and cardiovascular research have in common? Simply the structure of the data. In the setting considered by Kwok and Lewis, there are several experiments (baseball players) of different sizes (number of at-bats) and, within each, events (hits) among subjects are tallied. Moreover, much has changed since 1977 – computers are better and smart mathematical algorithms now exist. We can now combine data using models commensurate with our decision-making process in order to have quantitative measures of evidence.⁹ The methods Kwok and Lewis describe, however, go even further in that the Bayesian hierarchical model reflects the uncertainty in the decisions made.
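
The data structure just described (several units of different sizes, each with a tally of events) can be illustrated with a deliberately simplified sketch of partial pooling. It is not the model used by Kwok and Lewis. Each unit's event rate is pulled toward the overall rate using a Beta prior centred on the pooled rate. The `Unit` class, the `partial_pool` function, the fixed `prior_strength` value, and the counts are all hypothetical. In a full Bayesian hierarchical model the strength of the pooling would be estimated from the data, and its uncertainty carried through, instead of being fixed in advance.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Unit:
    """One data source: a trial, a subgroup, or a baseball player."""
    name: str
    events: int   # e.g. hits, or patients with the outcome
    trials: int   # e.g. at-bats, or patients enrolled


def partial_pool(units: List[Unit], prior_strength: float) -> Dict[str, float]:
    """Posterior mean event rate per unit under a Beta prior centred on the pooled rate.

    prior_strength acts like a prior sample size (an assumption fixed here for
    illustration): 0 returns the raw rates, large values approach the pooled rate.
    """
    pooled = sum(u.events for u in units) / sum(u.trials for u in units)
    prior_events = prior_strength * pooled
    return {
        u.name: (u.events + prior_events) / (u.trials + prior_strength)
        for u in units
    }


if __name__ == "__main__":
    # Made-up counts, for illustration only.
    units = [Unit("small", 4, 10), Unit("medium", 30, 100), Unit("large", 100, 400)]
    estimates = partial_pool(units, prior_strength=50.0)
    for u in units:
        print(u.name, "raw:", u.events / u.trials, "pooled estimate:", estimates[u.name])
```

In this sketch the smallest unit moves furthest from its raw rate because it carries the least information, while the largest unit barely moves. This is the behaviour that makes borrowing strength across related sources attractive for subgroup analyses.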

Cardiovascular outcomes researchers should read the two articles in the Statistical Primer series carefully – with an increasing focus on assessing quality and comparing the effectiveness of treatment strategies, more data sources, more informants, and more comparisons will present themselves. A solid approach to carefully combining the information is required.

Acknowledgments

Source of Funding: Dr. Normand is supported by Grant MH54693 (Analysis of Multiple Informant Data in Psychiatry) from the National Institute of Mental Health and by Contract DHHS/FDA-1074629 (Innovative Methods for Evidence Synthesis for Medical Devices) from the Center for Devices and Radiological Health.

Footnotes

Conflict of Interest Disclosures: In addition to the funding information acknowledged in this article, Dr. Normand also receives funding from the Massachusetts Department of Public Health to develop and apply Bayesian hierarchical models for profiling hospitals in Massachusetts and funding from Yale/Yale-New Haven Hospital Center for Outcomes Research and Evaluation to develop and apply Bayesian hierarchical models to assess quality of care delivered at all US hospitals.

References

1. He Y. Missing data analysis using multiple imputation: Getting to the heart of the matter. Circulation: Cardiovascular Quality and Outcomes. 2010;3:98–105. doi: 10.1161/CIRCOUTCOMES.109.875658.
2. Dafni U. Landmark analysis at the 25-year landmark point. Circulation: Cardiovascular Quality and Outcomes. 2011;4:363–371. doi: 10.1161/CIRCOUTCOMES.110.957951.
3. Teixeira-Pinto A, Mauri L. This issue.
4. Kwok H, Lewis RJ. This issue.
5. Holmes DR Jr, Leon MB, Moses JW, Popma JJ, Cutlip D, Fitzgerald PJ, Brown C, Fischell T, Wong SC, Midei M. Analysis of 1-year clinical outcomes in the SIRIUS trial: A randomized trial of a sirolimus-eluting stent versus a standard stent in patients at a high risk for coronary artery restenosis. Circulation. 2004;109:634. doi: 10.1161/01.CIR.0000112572.57794.22.
6. Fitzmaurice GM, Laird NM, Zahner GEP, Daskalakis C. Bivariate logistic regression analysis of childhood psychopathology ratings using multiple informants. American Journal of Epidemiology. 1995;142:1194–1203. doi: 10.1093/oxfordjournals.aje.a117578.
7. Nolan T, Berwick DM. All-or-none measurement raises the bar on performance. JAMA. 2006;295:1168–1170. doi: 10.1001/jama.295.10.1168.
8. Efron B, Morris C. Stein’s paradox in statistics. Scientific American. 1977;236:119–127.
9. Normand S-L, McNeil BJ. What is evidence? Statistics in Medicine. 2010;29:1985–1988. doi: 10.1002/sim.3933.
