Abstract
ICD-9-coded emergency department (ED) diagnoses and free-text triage diagnoses are routinely collected data elements that have potential value for public health surveillance and early detection of epidemics. We constructed three classifiers for the detection of cases of acute gastrointestinal syndrome of public health significance and measured their performance: one used ICD-9-coded ED diagnosis as input data; the other two used free-text triage diagnosis. We measured the performance of these classifiers against the expert classification of cases based on review of ED reports. The sensitivity of the ICD-9 code classifier was 0.32, and the specificity was 0.99. The sensitivity of a naive Bayes classifier using triage diagnoses was 0.63, the specificity was 0.94, and the area under the ROC curve was 0.82. A bigram Bayes classifier had a sensitivity of 0.38, a specificity of 0.94, and an area under the ROC curve of 0.69. We conclude that a naive Bayes classifier of free-text triage diagnosis data provides more sensitive and earlier detection of cases of acute gastrointestinal syndrome than either a bigram Bayes classifier or an ICD-9 code classifier. The sensitivity achieved should be sufficient for a syndromic surveillance system designed to detect moderate to large epidemics.
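To make the evaluation concrete, the following is a minimal sketch of a word-level naive Bayes classifier over free-text triage strings, together with the sensitivity and specificity calculation against a reference classification. It is an illustration only: the abstract does not describe the study's features, smoothing, training data, or decision threshold, so the add-one smoothing, the 0.5 threshold, the function names, and the toy triage strings below are all assumptions.

```python
import math
from collections import Counter

def train_naive_bayes(examples):
    """Count words per class. `examples` is a list of (text, label) pairs,
    where label 1 = syndrome case and 0 = non-case (hypothetical encoding)."""
    word_counts = {0: Counter(), 1: Counter()}
    class_counts = Counter()
    for text, label in examples:
        class_counts[label] += 1
        word_counts[label].update(text.lower().split())
    vocab = set(word_counts[0]) | set(word_counts[1])
    return word_counts, class_counts, vocab

def posterior_case(model, text):
    """Posterior probability that `text` is a case, using add-one smoothing.
    Words never seen in training are ignored."""
    word_counts, class_counts, vocab = model
    total = sum(class_counts.values())
    log_score = {}
    for label in (0, 1):
        n_words = sum(word_counts[label].values())
        score = math.log(class_counts[label] / total)
        for word in text.lower().split():
            if word in vocab:
                score += math.log(
                    (word_counts[label][word] + 1) / (n_words + len(vocab))
                )
        log_score[label] = score
    # Normalize in a numerically stable way.
    top = max(log_score.values())
    weights = {k: math.exp(v - top) for k, v in log_score.items()}
    return weights[1] / (weights[0] + weights[1])

def sensitivity_specificity(predicted, actual):
    """Sensitivity = TP / (TP + FN); specificity = TN / (TN + FP).
    Assumes `actual` contains at least one case and one non-case."""
    pairs = list(zip(predicted, actual))
    tp = sum(p == 1 and a == 1 for p, a in pairs)
    fn = sum(p == 0 and a == 1 for p, a in pairs)
    tn = sum(p == 0 and a == 0 for p, a in pairs)
    fp = sum(p == 1 and a == 0 for p, a in pairs)
    return tp / (tp + fn), tn / (tn + fp)

# Toy, made-up triage strings (not data from the study).
TRAIN = [
    ("vomiting diarrhea", 1),
    ("abd pain vomiting", 1),
    ("chest pain", 0),
    ("ankle injury", 0),
    ("cough fever", 0),
]
model = train_naive_bayes(TRAIN)

# Classify new strings at an assumed 0.5 threshold.
predictions = [int(posterior_case(model, t) >= 0.5) for t in ("vomiting", "chest pain")]
```

Sweeping the threshold instead of fixing it at 0.5 yields the sensitivity and specificity pairs that trace out an ROC curve, which is how a single area-under-the-curve figure can summarize a probabilistic classifier.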