Abstract
Whether high-accuracy classification methods can be scaled to large applications is crucial to their ultimate usefulness in text categorization. This paper applies two statistical learning algorithms, the Linear Least Squares Fit (LLSF) mapping and a nearest-neighbor classifier named ExpNet, to a large collection of MEDLINE documents. With suitable dimensionality reduction techniques and efficient algorithms, both LLSF and ExpNet scaled successfully to this very large problem, with results that significantly outperformed word matching and other automatic learning methods applied to the same corpus.
