Anti Spam Thoughts

There are many ways to fight spam; here's my little list, with my thoughts on each.

Security by obscurity

Essentially, pick an oddball email address and give it out to very few people.

If you need a "public" email address that you advertise on your site or similar, use a public email service like Hotmail or Gmail, which have good antispam built in.

TMDA

There are a number of challenge-response technologies out there like TMDA (Tagged Message Delivery Agent), but I'll lump them all together and speak to TMDA specifically.

Basically, TMDA-type technologies require a new sender to confirm that they are a legitimate person, typically by responding to an automated challenge, before their email gets passed through the filter to the receiver. Once a sender has confirmed, any subsequent messages from them go straight through to the receiver without issue.
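
To make the flow concrete, here's a minimal sketch of the challenge-response idea. It is not TMDA's actual code or API; the class and method names are made up for illustration, and a real system would also have to email the challenge to the sender and persist its state.

```python
import secrets


class ChallengeResponseFilter:
    """Hypothetical sketch: hold mail from unknown senders until they confirm."""

    def __init__(self):
        self.confirmed = set()  # senders who have already confirmed
        self.tokens = {}        # confirmation token -> sender
        self.held = {}          # sender -> messages waiting for confirmation

    def receive(self, sender, message):
        """Return ("deliver", None), ("held", None) or ("challenge", token)."""
        sender = sender.lower()
        if sender in self.confirmed:
            return ("deliver", None)
        first_contact = sender not in self.held
        self.held.setdefault(sender, []).append(message)
        if not first_contact:
            # A challenge is already outstanding; just queue the message.
            return ("held", None)
        token = secrets.token_urlsafe(16)  # unguessable one-time token
        self.tokens[token] = sender
        return ("challenge", token)

    def confirm(self, token):
        """Whitelist the sender behind `token` and release their held mail."""
        sender = self.tokens.pop(token, None)
        if sender is None:
            return []  # unknown or already used token
        self.confirmed.add(sender)
        return self.held.pop(sender, [])
```

The first message from a new sender triggers a challenge, later messages are queued, and a valid confirmation releases everything and whitelists the sender from then on.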

Per-company aliases

I love this concept and use it with all my email addresses. Essentially, when you're dealing with sites or companies that want your email address, give them a specific alias that is unique to that company. For example, for Famous Players you could use user-at-famousplayers@yourdomain.com. This lets you see where your spam is coming from. If you receive spam on this address six months later, you know that the address leaked from Famous Players, as the odds of an automated spammer guessing it are slim to none.

A proper extension of this would be to replace the company name with a hash of it (e.g. user-fd355fdd2c4aae50c23ead93ec1fefd4@yourdomain.com). Note that a plain SHA-1 or MD5 hash of the company name can still be guessed by anyone who works out the scheme, so mix in a secret that only you know (a keyed hash such as HMAC). That makes the possibility of an attacker guessing the address negligible.

The downside is that you need to create an email alias on the fly, which is awkward when you're not near a computer. It's also difficult to read a long email address out to people.
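
Here's a minimal sketch of generating such an alias. I've swapped the plain SHA-1/MD5 hash for HMAC-SHA-256 with a secret key, since an unkeyed hash of a company name can be recomputed by anyone who guesses the scheme. The key value, the function name and the 16-character truncation are all my own assumptions for illustration.

```python
import hashlib
import hmac

# Assumption: a private key known only to you. Replace with your own value.
SECRET_KEY = b"replace-with-your-own-secret"


def company_alias(company, user="user", domain="yourdomain.com", length=16):
    """Build a per-company alias from a keyed hash of the company name."""
    # Normalise so "Famous Players" and " famous players " give the same alias.
    normalized = company.strip().lower().encode("utf-8")
    digest = hmac.new(SECRET_KEY, normalized, hashlib.sha256).hexdigest()
    # Truncate the hex digest to keep the address a manageable length.
    return f"{user}-{digest[:length]}@{domain}"


if __name__ == "__main__":
    print(company_alias("Famous Players"))
```

Since the hash can't be reversed, you'd also want to keep a record of which alias was given to which company so you can look it up when spam arrives.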

Spam Detection

There are two main families of spam detection that I've seen: rule-based detection and blacklisting. Rule-based detection, e.g. SpamAssassin, runs a variety of tests that each add to a message's spam score; for example, a message containing the word Viagra scores higher, and a message over a threshold is treated as spam. Blacklisting acts either on the IPs/domain names found in the email headers or on the URLs used within the message body. URL blacklisting works like a hot damn and I haven't seen an issue with it yet. Rule-based detection is a pain unless you're using pre-tuned, statistically analysed rulesets (such as the ones SpamAssassin ships with).
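
As a rough illustration of the two families, here's a toy sketch that combines a handful of scored rules with a URL blacklist. The rules, scores, threshold and blacklisted domain are all invented for the example; this is not how SpamAssassin is implemented, and real rulesets are far larger and properly tuned.

```python
import re
from urllib.parse import urlparse

# Hypothetical rules: (pattern, points added to the spam score).
RULES = [
    (re.compile(r"\bviagra\b", re.I), 3.0),
    (re.compile(r"\bact now\b", re.I), 1.5),
    (re.compile(r"!{3,}"), 1.0),
]
URL_BLACKLIST = {"spam.example"}  # hypothetical blacklisted domain
THRESHOLD = 5.0                   # assumed cut-off for calling a message spam
URL_RE = re.compile(r"https?://[^\s<>\"']+", re.I)


def rule_score(body):
    """Sum the points of every rule whose pattern appears in the body."""
    return sum(points for pattern, points in RULES if pattern.search(body))


def blacklisted_hosts(body):
    """Return hosts in the body that are on, or under, a blacklisted domain."""
    hits = set()
    for url in URL_RE.findall(body):
        host = (urlparse(url).hostname or "").lower().rstrip(".")
        if any(host == d or host.endswith("." + d) for d in URL_BLACKLIST):
            hits.add(host)
    return hits


def is_spam(body):
    """Spam if any URL is blacklisted or the rule score reaches the threshold."""
    return bool(blacklisted_hosts(body)) or rule_score(body) >= THRESHOLD
```

A single blacklisted URL is enough to flag a message here, while the word rules only add up to a verdict in combination.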

Rejection at the MTA layer

Rejection of high-scoring emails at the MTA layer is nice as it keeps the spam out of your mailbox entirely. It's a pain because your mail server needs serious horsepower to score each message in real time while the SMTP transaction is still open. So far, this appears to cause issues at anything but the lowest email volumes.
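
The decision itself is simple once you have a score; the expensive part is producing the score before the sender's connection times out. Here's a sketch of just the decision step, assuming the message has already been scored by the time the sender finishes transmitting it. The thresholds, the function name and the reply texts are my own placeholders.

```python
REJECT_AT = 10.0  # assumed score at which mail is refused outright
TAG_AT = 5.0      # assumed score at which mail is accepted but flagged


def end_of_data_reply(score):
    """Return (smtp_reply, extra_headers) for a message with the given score."""
    if score >= REJECT_AT:
        # 5xx is a permanent failure: the sending server bounces the message.
        return "550 5.7.1 Message rejected as spam", {}
    if score >= TAG_AT:
        # Accept, but mark it so a later filter can file it as junk.
        return "250 2.0.0 OK", {"X-Spam-Flag": "YES"}
    return "250 2.0.0 OK", {}
```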

Bayesian filtering

Bayesian filtering works by having the user mark messages as spam or ham. From that training data the filter works out how strongly each word (or group of words) is associated with spam, and combines those per-word probabilities to score new messages. The math itself isn't user-controllable. After a bit of training, Bayesian filtering gets pretty good. The downside is that lots of spammers stuff their messages with legitimate-looking words to drag the score down. Bayesian filtering generally happens at the mailbox level, i.e. within client software like Thunderbird.
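
For the curious, here's a minimal naive Bayes sketch of the math involved. It is not what Thunderbird or any particular client actually runs; the class name, the tokenizer and the add-one smoothing are my own choices for illustration.

```python
import math
import re
from collections import Counter


class BayesFilter:
    """Toy naive Bayes spam filter trained on user-labelled messages."""

    def __init__(self):
        self.words = {"spam": Counter(), "ham": Counter()}
        self.messages = {"spam": 0, "ham": 0}

    @staticmethod
    def tokenize(text):
        return re.findall(r"[a-z0-9']+", text.lower())

    def train(self, text, label):
        """Record a message the user has marked as "spam" or "ham"."""
        self.words[label].update(self.tokenize(text))
        self.messages[label] += 1

    def spam_probability(self, text):
        """Estimate P(spam | text) from the word counts seen in training."""
        vocab = len(set(self.words["spam"]) | set(self.words["ham"]))
        total_messages = sum(self.messages.values())
        log_score = {}
        for label in ("spam", "ham"):
            # Class prior, smoothed so an untrained filter returns 0.5.
            score = math.log((self.messages[label] + 1) / (total_messages + 2))
            total_words = sum(self.words[label].values())
            for word in self.tokenize(text):
                # Add-one smoothing keeps unseen words from zeroing the result.
                count = self.words[label][word]
                score += math.log((count + 1) / (total_words + vocab + 1))
            log_score[label] = score
        # Turn the two log scores into a probability, clamped to avoid overflow.
        diff = max(min(log_score["ham"] - log_score["spam"], 700.0), -700.0)
        return 1.0 / (1.0 + math.exp(diff))
```

Training on a few spam and ham messages and then scoring "cheap pills" with and without some hammy words tacked on shows the word-stuffing problem directly: the padded version scores lower.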

Graylisting

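Graylisting basically means that the mail server issues a temporary rejection (an SMTP 4xx reply) the first time it sees an incoming message from a new IP. A legitimate mail server will retry after a delay, and the retry is accepted; a lot of spam software never bothers to retry. The delay also allows time for antispam rulesets to be updated so that they can reject the message.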
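
Here's a minimal sketch of the bookkeeping, keyed on the sending IP as described above. Real implementations often key on the combination of IP, sender and recipient and expire old entries; the class name, the five-minute delay and the reply texts are assumptions of mine.

```python
import time


class Greylist:
    """Temporarily reject mail from an IP until it retries after a delay."""

    def __init__(self, min_delay=300, clock=time.time):
        self.min_delay = min_delay  # seconds a new IP must wait before retrying
        self.clock = clock          # injectable so the logic can be tested
        self.first_seen = {}        # ip -> time of its first delivery attempt

    def check(self, ip):
        """Return the SMTP reply to give a connection from `ip`."""
        now = self.clock()
        first = self.first_seen.setdefault(ip, now)
        if now - first >= self.min_delay:
            return "250 2.0.0 OK"
        # 4xx is a temporary failure: a well-behaved sender queues and retries.
        return "451 4.7.1 Greylisted, please try again later"
```

A new IP gets the 451 reply on its first attempt and on any retry inside the delay window, and is let through once it retries after the delay has passed.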