Title: Learning to Filter Spam E-Mail: A Comparison of a Naive Bayesian and a Memory-Based Approach

Sep 18, 2000

Ion Androutsopoulos, Georgios Paliouras, Vangelis Karkaletsis, Georgios Sakkis, Constantine D. Spyropoulos, Panagiotis Stamatopoulos

Abstract: We investigate the performance of two machine learning algorithms in the context of anti-spam filtering. The increasing volume of unsolicited bulk e-mail (spam) has generated a need for reliable anti-spam filters. Filters of this type have so far been based mostly on hand-constructed keyword patterns, which perform poorly. The Naive Bayesian classifier has recently been suggested as an effective method for automatically constructing anti-spam filters with superior performance. We thoroughly investigate the performance of the Naive Bayesian filter on a publicly available corpus, contributing towards standard benchmarks. At the same time, we compare the performance of the Naive Bayesian filter to an alternative memory-based learning approach, after introducing suitable cost-sensitive evaluation measures. Both methods achieve very accurate spam filtering, clearly outperforming the keyword-based filter of a widely used e-mail reader.

* Proceedings of the workshop "Machine Learning and Textual Information Access", 4th European Conference on Principles and Practice of Knowledge Discovery in Databases (PKDD-2000), H. Zaragoza, P. Gallinari and M. Rajman (Eds.), Lyon, France, September 2000, pp. 1-13
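
The abstract does not spell out how a Naive Bayesian filter or its cost-sensitive treatment works, so the following is only a minimal sketch under stated assumptions, not the authors' implementation. It assumes a multinomial Naive Bayes model with add-one smoothing over word tokens, and it assumes one common way of making the decision cost-sensitive: a message is labelled spam only if its estimated spam probability exceeds `cost_ratio / (1 + cost_ratio)`. The names `train`, `spam_probability`, `classify`, and `cost_ratio`, and the toy messages, are all hypothetical.

```python
import math
from collections import Counter

LABELS = ("spam", "legit")


def train(messages):
    """Count documents and word occurrences per class.

    messages: list of (tokens, label) pairs, label in LABELS.
    """
    doc_counts = Counter()
    word_counts = {label: Counter() for label in LABELS}
    vocab = set()
    for tokens, label in messages:
        doc_counts[label] += 1
        word_counts[label].update(tokens)
        vocab.update(tokens)
    return doc_counts, word_counts, vocab


def spam_probability(model, tokens):
    """Estimate P(spam | tokens) with add-one (Laplace) smoothing."""
    doc_counts, word_counts, vocab = model
    total_docs = sum(doc_counts.values())
    log_scores = {}
    for label in LABELS:
        total_words = sum(word_counts[label].values())
        score = math.log(doc_counts[label] / total_docs)  # class prior
        for word in tokens:
            if word in vocab:  # ignore words never seen in training
                score += math.log(
                    (word_counts[label][word] + 1) / (total_words + len(vocab))
                )
        log_scores[label] = score
    # Normalise in log space to avoid underflow on long messages.
    top = max(log_scores.values())
    weights = {label: math.exp(s - top) for label, s in log_scores.items()}
    return weights["spam"] / (weights["spam"] + weights["legit"])


def classify(model, tokens, cost_ratio=1.0):
    """Label a message, assuming that blocking a legitimate message is
    cost_ratio times as costly as letting a spam message through."""
    threshold = cost_ratio / (1.0 + cost_ratio)
    return "spam" if spam_probability(model, tokens) > threshold else "legit"


# Toy training data, invented purely for illustration.
training = [
    ("win free money now".split(), "spam"),
    ("free prize claim now".split(), "spam"),
    ("meeting agenda for monday".split(), "legit"),
    ("project report for review".split(), "legit"),
]
model = train(training)

message = "free money now".split()
print(spam_probability(model, message))
print(classify(model, message, cost_ratio=1.0))
print(classify(model, message, cost_ratio=999.0))
```

Raising `cost_ratio` makes the filter more reluctant to block a message: with the toy data above, the same message is labelled spam at `cost_ratio=1.0` but passes as legitimate at `cost_ratio=999.0`, because its estimated spam probability does not clear the stricter threshold.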
