Blatherings

So you want me to read that? Part 1
by rich 22 May, 2006 - 4:25 PM

I have had to do a bit of research lately on the topic of assuring delivery of e-mail. E-mail as we know it is based on the Simple Mail Transfer Protocol (SMTP).

SMTP was released in August of 1982 to a much younger, more naive internet, still called ARPANET. The protocol was created to link e-mail across various disparate systems. The goal was that if you knew an e-mail address, you could just send mail without thinking about how it would get there. Messages could (and for the most part still can!) be sent using something as simple as telnet, without even having an e-mail account.

Here is a link on SMTP commands: http://www.yuki-onna.co.uk/email/smtp.html
If you attempt some of these commands against an e-mail server you will see just how easy it is to send out e-mail as if you were someone else. Some of the guys at the ISP I interned at in 1997 showed me this. I thought it was nothing more than a cool novelty at the time, when I received an e-mail from god@god.com telling me to go on a divine quest to buy Steve and Patrick tacos or face damnation.
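To make the point concrete, here is a minimal sketch of the client side of such a telnet session. It only builds the lines you would type and does not connect to anything. The function name, the HELO name, and the recipient address are made up for illustration. Notice that the sender is just a string the client supplies; nothing in the basic protocol checks it.

```python
def smtp_dialogue(helo_name, mail_from, rcpt_to, body):
    """Return the lines a client sends in a bare-bones SMTP session.

    The sender address is whatever the client claims it is; the basic
    protocol has no step that verifies it.
    """
    return [
        f"HELO {helo_name}",
        f"MAIL FROM:<{mail_from}>",   # envelope sender, unverified
        f"RCPT TO:<{rcpt_to}>",
        "DATA",
        f"From: {mail_from}",         # header sender, also unverified
        f"To: {rcpt_to}",
        "",                           # blank line ends the headers
        body,
        ".",                          # a lone dot ends the message
        "QUIT",
    ]


# Hypothetical values; the recipient and HELO name are placeholders.
for line in smtp_dialogue("example.org", "god@god.com",
                          "you@example.com", "Buy Steve and Patrick tacos."):
    print(line)
```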

As you will find out, this address forgery is really the crux of the whole problem. We will return to the sending address forgery problem in part two. For part one, let me detail the various spam-fighting solutions currently in use.

- Blacklists:
The first spam-fighting technique out of the gate was (and is) the blacklist. Blacklists are rather simple in premise. If we get an e-mail from someone and it is spam, we put him on a list of people we no longer want e-mail from. This list can then remove his mail from our e-mail box or, going a step further, we can have our e-mail server outright reject mail from that address.
Server-based blacklists are more prevalent. This is where the IP address of an offending e-mail server that is known to be spewing spam is identified and put on a list, so that our server will no longer accept mail from it.
The problem here is that the convention on the internet several years ago was that if a server received e-mail it would send that mail along its way, whether the recipient of the e-mail lived on that server or on another host. Most SMTP servers shipped in an “open relay” configuration by default until around the year 2000. This was considered good karma and a helpful thing to do for your internet brethren. This good nature was quickly exploited by spammers as a way to leech bandwidth and the “good name” of a non-blocked IP address to send loads of spam. Many a small organization light on technical staff had its legitimate mail blocked because spammers used its “open relay” and its little e-mail server subsequently landed on various block lists. Among half-competent e-mail admins this problem is largely taken care of nowadays, and the conventional default configuration is to no longer deliver mail to non-local accounts unless the user is authenticated. More on this topic here: http://www.ordb.org/faq/
Another problem with blacklists, of course, is how to maintain them. How does an IP address get on a list, and how does one then get off that same list? Every list has its own procedures and some list maintainers are rather extreme. DNS Stuff has a rather handy dandy tool to check some 266 well-known lists. Here is a link for whitehouse.gov, for example: http://www.dnsstuff.com/tools/ip4r.ch?ip=63.161.169.140. As you can see, it looks as if your hard-earned tax money buys the White House some competent mail administrators who have not let their servers be abused. However, woe unto them if they run afoul of a blacklist admin such as spambag.org. This is an example of a rather extreme list maintainer. He will ban whole IP blocks because of a single offending e-mail server. Overall this is extreme, but once you look into the delegated responsibility of the internet it is really the only way it can be done. All of this leaves the whole system of blacklists unworkable, because there are too many false positives. More on this in part two.
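For the curious, here is a rough sketch of how a server-side blacklist check is commonly done over DNS: the octets of the connecting IP are reversed, the list's zone is appended, and the resulting name is looked up. If the name resolves, the IP is listed. The zone `dnsbl.example.org` is a placeholder and not a real list, and the function names are my own.

```python
import ipaddress
import socket


def dnsbl_query_name(ip, zone):
    """Build the DNS name to look up for an IPv4 address on a DNS blacklist."""
    octets = str(ipaddress.IPv4Address(ip)).split(".")
    return ".".join(reversed(octets)) + "." + zone


def is_listed(ip, zone, resolve=socket.gethostbyname):
    """Return True if the blacklist zone has an entry for this IP.

    `resolve` is injectable so the logic can be tried without network access.
    """
    try:
        resolve(dnsbl_query_name(ip, zone))
    except socket.gaierror:
        return False  # name does not resolve: not on this list
    return True


# Hypothetical zone; substitute the list you actually want to consult.
print(dnsbl_query_name("63.161.169.140", "dnsbl.example.org"))
```

A mail server would typically run a check like this against several lists at connection time and then reject the connection or pass the result on to a scoring filter.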

- Heuristics
I will not go into this as deeply as blacklists. Put most simply, heuristics is when a program “reads your mail” and makes a decision about it. It can be as simple as searching for a string of characters such as “Viagra” or as complicated as looking at the sentence structure, dates and even the images contained in e-mails. These filters often assign “weights” to the different elements they find, and if those weights add up to a predetermined amount then the message is deemed spam. These filters can even “learn” over time how heavily to weight things if given user feedback on what is “actual spam” and what have been “false positives.” As anyone knows who has ever wanted to e-mail about a topic that is also often spammed about, heuristics can produce some very annoying false positives.
Filters can also blend approaches, assigning weights to different heuristic algorithms and to appearances on the blacklists described above.
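A toy version of such a weighted filter might look like the sketch below. The rules, weights, and threshold are invented for illustration and are not taken from any real filter; a real one would have hundreds of rules and tuned or learned weights.

```python
import re

# (pattern, weight) pairs. All values here are made up for the example.
RULES = [
    (re.compile(r"viagra", re.IGNORECASE), 3.0),
    (re.compile(r"click here", re.IGNORECASE), 1.5),
    (re.compile(r"[A-Z]{10,}"), 1.0),  # long runs of shouting capitals
]
SPAM_THRESHOLD = 4.0


def spam_score(text, on_blacklist=False, blacklist_weight=2.5):
    """Add up the weights of every rule that matches the message."""
    score = sum(weight for pattern, weight in RULES if pattern.search(text))
    if on_blacklist:
        # Blend in a blacklist hit as one more weighted signal.
        score += blacklist_weight
    return score


def is_spam(text, on_blacklist=False):
    """Deem the message spam once the weights reach the threshold."""
    return spam_score(text, on_blacklist) >= SPAM_THRESHOLD


print(is_spam("Cheap Viagra, click here"))
print(is_spam("A Viagra question for my doctor"))
```

The second message shows the trade-off: on its own it stays under the threshold, but the same message from a blacklisted server would be pushed over it.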

- White lists
One can get so fed up with it all that he decides to only receive mail from known entities (or at least those claiming to be). To build the list, some white list systems will respond to an unknown first-time e-mailer with a response e-mail asking the sender to click on a link and fill out a form. This will not work, for several reasons:
A: users have been told never to click on links in e-mail, so that a spammer cannot verify their address (some listen too intently at times)
B: people resent going through the additional steps to e-mail you
C: automated lists that you do want are always blocked

The three methods above are useful and will remain so for a good while to come. Current spam fighting has reached a new level, however.

- Various technically RFC kosher SMTP trickery
One major source of spam nowadays, now that blacklists are somewhat effective and open relays are often shut down, is compromised hosts. Most of these are unprotected Windows boxes loaded with spyware obtained via all the usual methods. One of the many things these “zombie” machines are infected with is an SMTP engine! So now we have a whole bunch of SMTP servers built purely for the purpose of sending spam. These servers take shortcuts that legitimate servers do not. We can target these shortcuts and reduce our spam.
- Receiving servers should insist on reverse DNS records being in place for all SMTP hosts connecting to them.
Reverse DNS is what it sounds like: you do a lookup against an IP address and it returns a Fully Qualified Domain Name (FQDN). The thought on this one is that if you have a legitimate SMTP server sending mail for a legitimate domain, its owner will have bothered to register a reverse DNS record. Registering a reverse DNS entry is harder than setting up traditional DNS. The provider of your IP address (your internet provider) is the authority for this.
From the DNS Stuff FAQ:
[1] you can have the Internet provider set up the reverse DNS entries for you (the easy option), or [2] you can have the Internet provider delegate the reverse DNS entries to your DNS servers

AOL is known to require correct reverse DNS before it will accept mail for its users. Correct reverse DNS entries are technically called for by the DNS RFCs for any internet host. In practice this is often not implemented, and is in place only for routers and, now, mail servers.
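As a minimal sketch of the receiving side, the check below asks for the reverse DNS name of a connecting IP and refuses hosts that have none. It uses Python's standard `socket.gethostbyaddr`; the function names and the accept-or-refuse policy are my own simplification, and a real server may also confirm that the returned name resolves back to the same IP.

```python
import ipaddress
import socket


def ptr_query_name(ip):
    """Return the name that a reverse lookup for this IP actually queries."""
    return ipaddress.ip_address(ip).reverse_pointer


def reverse_dns_name(ip, lookup=socket.gethostbyaddr):
    """Return the hostname registered for an IP, or None if there is none.

    `lookup` is injectable so the logic can be tried without network access.
    """
    try:
        hostname, _aliases, _addresses = lookup(ip)
    except (socket.herror, socket.gaierror):
        return None
    return hostname


def should_accept(ip, lookup=socket.gethostbyaddr):
    """Policy from the text: insist that a reverse DNS record exists."""
    return reverse_dns_name(ip, lookup) is not None


print(ptr_query_name("63.161.169.140"))
```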
- Greylisting
Greylisting is a technique where we maintain a list of hosts that we often receive mail from. When a host that is not on our “greylist” contacts us for the first time, we turn the message away with a temporary failure instead of accepting it. When a proper SMTP mail spooler “does the right thing” and attempts to resend the message a few minutes later, we let the mail through and add the server to our list. Subsequently we allow this server to send us mail without delay. The thought here is that the “zombie” SMTP engines, for reasons of speed and efficiency, will not queue a refused message and send it again.
There are many glowing reports of this technique at the moment. However, it is only a matter of time until spammers have their SMTP engines queue their mail properly. http://www.greylisting.org/
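The bookkeeping behind greylisting can be sketched in a few lines. This version keys on the sending host only, as described above; real implementations often key on more than that (for example the sender and recipient addresses too). The class name, the five-minute delay, and the return values are assumptions made for the example.

```python
import time


class Greylist:
    """Temporarily refuse unknown hosts; accept them once they retry later."""

    def __init__(self, min_delay=300, clock=time.time):
        self.min_delay = min_delay  # seconds a new host must wait before retrying
        self.clock = clock          # injectable so the logic can be tested
        self.first_seen = {}        # host -> time of its first attempt
        self.known = set()          # hosts that have retried properly

    def check(self, host):
        """Return 'accept' or 'tempfail' for a connection from `host`."""
        if host in self.known:
            return "accept"
        now = self.clock()
        first = self.first_seen.setdefault(host, now)
        if now - first >= self.min_delay:
            # The host queued the message and came back: remember it.
            self.known.add(host)
            del self.first_seen[host]
            return "accept"
        return "tempfail"


greylist = Greylist()
print(greylist.check("192.0.2.1"))  # first contact from a hypothetical host
```

A zombie that never retries simply never gets out of the `first_seen` table, so a real server would also expire old entries.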
- Greet-pause-wait listing
It seems many spam-optimized SMTP engines rapidly “slam” a server with the SMTP commands needed to transmit a message without waiting for the receiving server to issue the usual SMTP responses (if you followed my telnet link at the top of this article you will have a better understanding of this). This violates the SMTP RFC. What we can do is instruct our SMTP server to wait a few seconds before sending the typical SMTP greeting. Any machine that does not wait for our greeting can have its message dropped or heavily weighted against. Machines that consistently respect our delay can be put on a whitelist and will not get a delay in the future. Again, it is only a matter of time until spammers have their SMTP engines respect the RFC.
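Here is a small sketch of the pause itself, assuming a plain TCP socket for the freshly accepted connection. The server waits before greeting, and if anything arrives from the client during the wait, the client has talked out of turn. The function names, the five-second default, and the banner text are placeholders; this version simply drops the connection, where a real server might score the message instead.

```python
import select
import socket


def talked_before_greeting(conn, pause_seconds=5.0):
    """Wait up to `pause_seconds`; True if the client sent data (or hung up)."""
    readable, _, _ = select.select([conn], [], [], pause_seconds)
    return bool(readable)


def greet(conn, pause_seconds=5.0, banner=b"220 mail.example.com ESMTP\r\n"):
    """Send the SMTP greeting only to clients that waited for it.

    Returns True if the greeting was sent, False if the client was dropped.
    """
    if talked_before_greeting(conn, pause_seconds):
        conn.close()
        return False
    conn.sendall(banner)
    return True


# Try it locally with a connected socket pair standing in for a real client.
client, server_side = socket.socketpair()
client.sendall(b"HELO impatient.example\r\n")  # speaks before the greeting
print(greet(server_side, pause_seconds=0.2))
```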

This is a fairly full discussion of how spam filtering works currently, as I know it. In part two I am going to discuss what is coming next in spam filtering and what you can do (as a server admin) to make sure your message gets to the intended recipient.







6/5/2006 >> muhgcee

Where's part deux?



