SpamBayes
Posted on July 8, 2005
Just installed SpamBayes on my machine that runs Outlook 2003. It can run as a plugin for Outlook or as a proxy for most any other email client you might use.
My spam is out of control and Outlook’s built-in filtering, to be gentle, sucks monkey parts. SpamBayes uses Bayesian statistical analysis to root out the spam. I used a product with similar claims in the past and it worked GREAT, but it was Linux-only and I have since switched back to Windows on a lot of my machines. I read about this one and figured I’d give it a go. Basically, you train it to detect spam. As mail comes in, it classifies each message and files it in a Spam, Maybe Spam or Inbox folder. For the first week or so I’ll keep an eye on it and use the little buttons it adds to the toolbar to reclassify anything it gets wrong. Over time, it learns. And if past experience holds, it will learn well. I figure I’ll do this for a week or two, and if all goes well I’ll add it to Outlook on my work machine as well.
From the website, a bit more detail:
That’s great, but what’s SpamBayes?
(the non-technical hand-waving answer)
SpamBayes will attempt to classify incoming email messages as ‘spam’, ‘ham’ (good, non-spam email) or ‘unsure’. This means you can have spam or unsure messages automatically filed away in a different mail folder, where they won’t interrupt your email reading. First SpamBayes must be trained by each user to identify spam and ham. Essentially, you show SpamBayes a pile of email that you like (ham) and a pile you don’t like (spam). SpamBayes will then analyze the piles for clues as to what makes the spam and ham different: for example, different words, differences in the mailer headers and differences in content style. The system then uses these clues to examine new messages.
For instance, the word “Nigeria” appears often in spam, so you could use a spam filter which identifies anything with that word in it as spam. But what if your business involves writing a guidebook on Nigerian Wildlife Conservation? Clearly a more flexible approach is necessary. Additionally, spammers will adapt their content over time and will no longer use the word “Nigeria” (or the words “Lose Weight Fast”, or any number of other common lines). Ideally the software will be able to adapt as the spam changes.
So, that is what SpamBayes does. It compares the spam and the ham and calculates probabilities. For instance, for me, the word “weight” almost never occurs in legitimate email, but it occurs all the time in ‘lose weight fast’ spam. SpamBayes can then look at incoming email, extract the most significant clues and combine the probabilities to produce an overall rating of “spamminess”. It flags the messages so that your mailer can handle the different message types. You might set it up so that ham goes straight through untouched, spam goes to a folder that you ignore (or delete without checking) and the unsure messages go to another folder which you can review for errors.
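To make the "calculate probabilities, then combine them" idea concrete, here is a toy sketch in Python. It is not SpamBayes's actual code: the class name `ToyFilter`, the add-one smoothing, the 15-clue limit and the 0.2/0.9 cutoffs are all my own assumptions for illustration, and as I understand it the real project combines clues with a more refined statistical scheme than the simple naive-Bayes-style product used here.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase a message and return its set of distinct words."""
    return set(re.findall(r"[a-z']+", text.lower()))


class ToyFilter:
    """Illustrative word-probability filter (hypothetical, not SpamBayes)."""

    def __init__(self):
        self.spam_counts = Counter()  # word -> number of spam messages containing it
        self.ham_counts = Counter()   # word -> number of ham messages containing it
        self.n_spam = 0
        self.n_ham = 0

    def train(self, text, is_spam):
        """Add one message to the spam pile or the ham pile."""
        tokens = tokenize(text)
        if is_spam:
            self.spam_counts.update(tokens)
            self.n_spam += 1
        else:
            self.ham_counts.update(tokens)
            self.n_ham += 1

    def word_prob(self, word):
        """Smoothed estimate of P(spam | word), assuming equal priors."""
        spam_rate = (self.spam_counts[word] + 1) / (self.n_spam + 2)
        ham_rate = (self.ham_counts[word] + 1) / (self.n_ham + 2)
        return spam_rate / (spam_rate + ham_rate)

    def score(self, text, max_clues=15):
        """Combine the strongest word clues into one spamminess score in (0, 1)."""
        probs = [self.word_prob(w) for w in tokenize(text)]
        # The most significant clues are the ones farthest from a neutral 0.5.
        probs.sort(key=lambda p: abs(p - 0.5), reverse=True)
        log_odds = sum(math.log(p) - math.log(1 - p) for p in probs[:max_clues])
        return 1 / (1 + math.exp(-log_odds))

    def classify(self, text, ham_cutoff=0.2, spam_cutoff=0.9):
        """Map the score onto the three buckets: spam, ham or unsure."""
        s = self.score(text)
        if s >= spam_cutoff:
            return "spam"
        if s <= ham_cutoff:
            return "ham"
        return "unsure"


if __name__ == "__main__":
    f = ToyFilter()
    f.train("lose weight fast", is_spam=True)
    f.train("cheap pills lose weight fast", is_spam=True)
    f.train("meeting notes for monday", is_spam=False)
    f.train("lunch on monday?", is_spam=False)
    for msg in ("lose weight fast today", "monday meeting notes"):
        print(msg, "->", f.classify(msg), round(f.score(msg), 3))
```

With only a handful of training messages the scores are crude, which is exactly why the real thing needs a week or two of reclassifying before it settles down. The "unsure" bucket is just the band between the two cutoffs.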
Now if I could just figure out how to keep spam out of the comments on my blog. Every now and then I get a comment with a spam ad added to an old blog entry. Annoying. I’m pondering either hacking the code on this blog (conveniently in classic ASP) or switching to other blog software that lets me review and approve comments before they are displayed.
Tags: outlook, spam, spamBayes