Abstract
OBJECTIVES: To examine the influences of situational and model factors on the accuracy of Bayesian learning systems.
DESIGN: This study examines the impacts of variations in two situational factors, training sample size and number of attributes, and in two model factors, choice of Bayesian model and criteria for excluding model attributes, on the overall accuracy of Bayesian learning systems.
MEASUREMENTS: The test data were derived from myocardial infarction patients who were admitted to eight hospitals in New Orleans during 1985. The test sample consisted of 339 cases; the training samples included 100, 400, and 800 cases. APACHE II variables were used for the model attributes and patient discharge status as the outcome predicted. Attribute sets were selected in sizes of 4, 8, and 12. The authors varied the Bayesian models (proper and simple) and the attribute exclusion criteria (optimism and pessimism).
RESULTS: The simple Bayes model, which assumes conditional independence, consistently equalled or outperformed the proper (maximally dependent) Bayes model, which assumes conditional dependence, across all training sample and attribute set sizes. Not excluding model attributes was found to be preferable to using sample theory as an attribute exclusion criterion in both the simple and the proper models.
CONCLUSION: In the domain tested, the simple Bayes model with optimistic exclusion is more robust than previously assumed, and increasing the number of attributes in a model had a greater relative impact on model accuracy than did increasing the number of training sample cases. Assessment of applicability of these findings to other domains will require further study. In addition, other models that are between these two extremes must be investigated. These include models that approximate proper Bayes' conditional dependence computations while requiring fewer training sample cases, attribute exclusion criteria between optimism and pessimism that improve accuracy, and ordering techniques for introducing attributes into Bayes models that optimize the information value associated with the attributes in test-sample cases.
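The distinction between the two models can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes binary (0/1) attributes, a hypothetical add-one smoothing parameter `alpha`, and toy data, none of which come from the study. The simple model multiplies per-attribute conditional probabilities, while the proper model estimates the probability of the whole attribute combination from joint counts, which is why it needs many more training cases as attributes are added.

```python
from collections import Counter


def train_simple(rows, labels, alpha=1.0):
    """Simple Bayes: assumes attributes are conditionally independent given the class."""
    classes = sorted(set(labels))
    n_attrs = len(rows[0])
    class_counts = Counter(labels)
    # counts[c][j][v] = number of class-c cases whose attribute j equals v
    counts = {c: [Counter() for _ in range(n_attrs)] for c in classes}
    for row, c in zip(rows, labels):
        for j, v in enumerate(row):
            counts[c][j][v] += 1

    def posterior(row):
        scores = {}
        for c in classes:
            p = class_counts[c] / len(labels)  # prior P(class)
            for j, v in enumerate(row):
                # Smoothed P(attribute_j = v | class); 2 possible values per attribute
                p *= (counts[c][j][v] + alpha) / (class_counts[c] + 2 * alpha)
            scores[c] = p
        total = sum(scores.values())
        return {c: s / total for c, s in scores.items()}

    return posterior


def train_proper(rows, labels, alpha=1.0):
    """Proper Bayes: estimates P(all attributes jointly | class) from joint counts."""
    classes = sorted(set(labels))
    n_cells = 2 ** len(rows[0])  # number of distinct binary attribute combinations
    class_counts = Counter(labels)
    joint = {c: Counter() for c in classes}
    for row, c in zip(rows, labels):
        joint[c][tuple(row)] += 1

    def posterior(row):
        scores = {}
        for c in classes:
            prior = class_counts[c] / len(labels)
            # Smoothed P(row | class) over all 2**n attribute combinations
            like = (joint[c][tuple(row)] + alpha) / (class_counts[c] + alpha * n_cells)
            scores[c] = prior * like
        total = sum(scores.values())
        return {c: s / total for c, s in scores.items()}

    return posterior


# Toy data (illustrative only): two binary attributes, outcome 1 or 0
toy_rows = [(1, 1), (1, 0), (0, 0), (0, 1)]
toy_labels = [1, 1, 0, 0]
print(train_simple(toy_rows, toy_labels)((1, 1))[1])            # 0.75
print(round(train_proper(toy_rows, toy_labels)((1, 1))[1], 3))  # 0.667
```

With 12 binary attributes the proper model in this sketch would have 4,096 joint cells per class to estimate, against 12 per-attribute probabilities for the simple model, which gives some intuition for why the simple model can hold up well at training sample sizes of 100 to 800 cases.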
