Three-Way Decisions Solution to Filter Spam Email: An Empirical Study

  • Conference paper
Rough Sets and Current Trends in Computing (RSCTC 2012)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 7413)

Abstract

This paper examines a three-way decisions solution, based on Bayesian decision theory, for filtering spam email. Unlike existing filtering systems, it no longer treats spam filtering as a binary classification problem. Taking misclassification costs into account, each incoming email is accepted as legitimate, rejected as spam, or left undecided for further examination. The three-way decisions solution can reduce the rate at which legitimate emails are misclassified as spam, and it gives users a more meaningful decision procedure. The solution is not restricted to a specific classifier. Experimental results on several corpora show that the three-way decisions solution achieves a better total cost ratio and a lower weighted error.
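
The cost-based accept, reject, or further-exam choice described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the `Costs` class, the function `three_way_decide`, and all cost values are hypothetical, chosen only so that blocking a legitimate email is the most expensive error. The sketch assumes some classifier (the abstract notes that any may be used) has already produced an estimate `p_spam` of the probability that an email is spam, and it picks the action with the lowest expected cost, which is the standard Bayesian decision rule.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Costs:
    """Hypothetical misclassification costs (illustrative values only)."""
    accept_legit: float = 0.0  # deliver a legitimate email: correct
    accept_spam: float = 4.0   # deliver a spam email
    reject_legit: float = 9.0  # block a legitimate email: costliest error
    reject_spam: float = 0.0   # block a spam email: correct
    defer_legit: float = 1.0   # ask the user to examine a legitimate email
    defer_spam: float = 1.0    # ask the user to examine a spam email


def three_way_decide(p_spam: float, costs: Costs = Costs()) -> str:
    """Return the action with the minimum expected cost.

    p_spam is an assumed classifier output: the estimated probability
    that the email is spam, in [0, 1].
    """
    if not 0.0 <= p_spam <= 1.0:
        raise ValueError("p_spam must lie in [0, 1]")
    p_legit = 1.0 - p_spam
    expected_cost = {
        "accept": costs.accept_legit * p_legit + costs.accept_spam * p_spam,
        "reject": costs.reject_legit * p_legit + costs.reject_spam * p_spam,
        "further-exam": costs.defer_legit * p_legit + costs.defer_spam * p_spam,
    }
    return min(expected_cost, key=expected_cost.get)


if __name__ == "__main__":
    for p in (0.1, 0.5, 0.95):
        print(p, three_way_decide(p))
```

With these illustrative costs, an email is accepted when `p_spam` is below 0.25, rejected when it is above 8/9, and left for further examination in between. Raising `reject_legit` widens the undecided band, which is one way a cost setting could trade user effort against the risk of losing legitimate mail.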

Copyright information

© 2012 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Jia, X., Zheng, K., Li, W., Liu, T., Shang, L. (2012). Three-Way Decisions Solution to Filter Spam Email: An Empirical Study. In: Yao, J., et al. Rough Sets and Current Trends in Computing. RSCTC 2012. Lecture Notes in Computer Science, vol 7413. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-32115-3_34

  • DOI: https://doi.org/10.1007/978-3-642-32115-3_34

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-32114-6

  • Online ISBN: 978-3-642-32115-3

  • eBook Packages: Computer Science (R0)
