bayesian filter for spam
Posted by dav at 2002 August 18 10:52 AM
File under: Geek
Here are a couple of papers on using Bayesian probabilities to filter spam from your inbox:
A layman's article: A Plan for Spam
A scientific paper: Naïve-Bayes vs. Rule-Learning in Classification of Email.
If I get time (ha) I'll try to implement the algorithm in Perl.
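For a rough idea of what such an implementation involves, here is a minimal sketch in Python rather than Perl. It loosely follows the kind of scheme described in A Plan for Spam: estimate a spam probability for each token from counts in a spam corpus and a non-spam corpus, then combine the most extreme probabilities into one score. The function names, the tokenizer, the toy messages, and the constants (0.4 for unseen tokens, clamping to 0.01 and 0.99, the top 15 tokens) are assumptions made for illustration, not anyone's actual code.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into word-like tokens."""
    return re.findall(r"[a-z0-9$'-]+", text.lower())


def token_counts(messages):
    """Count how often each token occurs across a list of messages."""
    counts = Counter()
    for message in messages:
        counts.update(tokenize(message))
    return counts


def token_spam_probability(token, spam_counts, ham_counts, n_spam, n_ham):
    """Estimate how strongly a single token suggests spam."""
    bad = spam_counts[token]
    good = 2 * ham_counts[token]  # doubled to bias against false positives
    if bad + good == 0:
        return 0.4  # assumed default for tokens never seen in training
    spam_freq = min(1.0, bad / n_spam)
    ham_freq = min(1.0, good / n_ham)
    p = spam_freq / (spam_freq + ham_freq)
    return max(0.01, min(0.99, p))  # clamp so no token is ever decisive alone


def spam_score(text, spam_counts, ham_counts, n_spam, n_ham, top_n=15):
    """Combine the most 'interesting' token probabilities into one score."""
    probs = [
        token_spam_probability(t, spam_counts, ham_counts, n_spam, n_ham)
        for t in set(tokenize(text))
    ]
    # Keep the tokens whose probability is furthest from the neutral 0.5.
    probs.sort(key=lambda p: abs(p - 0.5), reverse=True)
    prod_spam = 1.0
    prod_ham = 1.0
    for p in probs[:top_n]:
        prod_spam *= p
        prod_ham *= 1.0 - p
    return prod_spam / (prod_spam + prod_ham)


# Toy training data, purely hypothetical.
spam_messages = ["buy cheap pills now", "cheap pills buy now"]
ham_messages = ["meeting at noon tomorrow", "lunch tomorrow with the team"]
spam_counts = token_counts(spam_messages)
ham_counts = token_counts(ham_messages)
score = spam_score("buy cheap pills", spam_counts, ham_counts,
                   len(spam_messages), len(ham_messages))
```

A score near 1 means the message looks like spam and a score near 0 means it looks legitimate. A real filter would train on thousands of messages and pick a cutoff (something like 0.9) by testing on its own mail.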
Comments:
Yes, please do so!
Do you know of ANY implementations that already work so far? I know what Paul Graham is dealing with but he is still starting up...
Cheers,
Moritz
Posted by: Moritz (info AT moritz HYPHON naumann DOT de) on September 4, 2002 08:22 AM
If you check the website of Paul Graham every now and then you'll find that he's updating links to "products" that already implement some of the ideas discussed in various Bayesian filtering papers (including his). Go here - http://www.paulgraham.com/filters.html
Bayespam (easily found on freshmeat) is one such implementation in Perl. There are supposedly speed gains to be made from compiled implementations, so check out SpamProbe (C++) and SpamOracle (OCaml) as well - these don't call third-party routines to do their thing.
Also - you might be interested in Memory-Based Learning. Although it's beyond me, it's another type of machine learning algorithm, and you can read a paper about it here - http://arxiv.org/abs/cs.CL/0009009
And lastly - Microsoft have funded some research into Bayesian mail filtering (and I'm no M$ basher, so don't look for laughter here), and you can read some of it here - http://research.microsoft.com/~horvitz/junkfilter.htm
Cheers
Watcher
Posted by: on September 11, 2002 07:43 AM