2009;5446:659–664. doi: 10.1007/978-3-642-00672-2_67

Directly Identify Unexpected Instances in the Test Set by Entropy Maximization

Chaofeng Sha, Zhen Xu, Xiaoling Wang, Aoying Zhou
Editors: Qing Li, Ling Feng, Jian Pei, Sean X. Wang, Xiaofang Zhou, Qiao-Ming Zhu
PMCID: PMC7122406

Abstract

In real applications, a few unexpected instances that do not belong to any known class unavoidably arise during classification. How to classify these unexpected instances is attracting increasing attention. However, traditional classification techniques cannot classify them correctly, because the trained classifier has no knowledge of them. In this paper, we propose a novel entropy-based method for this problem. Experiments show that the proposed method outperforms previous work in the literature.
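The abstract does not spell out how entropy is used, so the following is only a rough illustration of the general idea, not the authors' algorithm. It assumes a classifier that outputs a posterior distribution over the known classes and flags test instances whose posterior has high Shannon entropy, meaning they fit no known class well. The function names, the example posteriors, and the threshold are all hypothetical, and this simple thresholding does not reproduce the entropy maximization named in the title.

```python
import math


def shannon_entropy(probs):
    """Shannon entropy (in nats) of a discrete probability distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def flag_unexpected(posteriors, threshold):
    """Return indices of instances whose posterior entropy is >= threshold.

    Assumption for illustration: high entropy over the known classes is
    treated as a sign that the instance belongs to none of them.
    """
    return [
        i for i, probs in enumerate(posteriors)
        if shannon_entropy(probs) >= threshold
    ]


# Hypothetical posteriors over two known classes for three test instances.
posteriors = [
    [0.95, 0.05],  # confidently assigned to the first class
    [0.50, 0.50],  # maximally uncertain between the known classes
    [0.10, 0.90],  # confidently assigned to the second class
]

# Hypothetical threshold: 90% of the maximum entropy for two classes, log(2).
threshold = 0.9 * math.log(2)

unexpected = flag_unexpected(posteriors, threshold)
print(unexpected)
```

In this toy setting only the second instance is flagged, since its posterior is uniform over the known classes. A real system would need a principled way to choose the threshold, which this sketch does not address.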

Keywords: Text Data, Severe Acute Respiratory Syndrome, Nominal Data, Positive Class, Negative Instance

Contributor Information

Qing Li, Email: itqli@cityu.edu.hk.

Ling Feng, Email: fengling@tsinghua.edu.cn.

Jian Pei, Email: jpei@cs.sfu.ca.

Sean X. Wang, Email: sean.wang@uvm.edu

Xiaofang Zhou, Email: zxf@itee.uq.edu.au.

Qiao-Ming Zhu, Email: qmzhu@suda.edu.cn.

Chaofeng Sha, Email: cfsha@fudan.edu.cn.

Zhen Xu, Email: xzhen@fudan.edu.cn.

Xiaoling Wang, Email: xlwang@sei.ecnu.edu.cn.

Aoying Zhou, Email: ayzhou@sei.ecnu.edu.cn.


Articles from Advances in Data and Web Management are provided here courtesy of Nature Publishing Group
