Managing Spam

What is the largest headache caused by spam? Many sites find that once decent filtering is in place and spam is being identified, a new problem crops up that is just as disconcerting: deciding what to do with it. Deciding what to do with your spam is one of the most difficult aspects of dealing with spam effectively.

Software such as amavisd-new, a front-end for spamassassin and virus filters, leaves the ultimate decision up to the administrator. That is, what do we do with email that has been identified as spam? The options are: use before-queue filtering to not accept it in the first place, send a delivery status notification (DSN) notifying the sender that their email was not delivered, or just silently discard the email. All of these options have consequences, and some are more hair-raising than others.

Option 1: DSNs, aka Bouncing

In this first scenario, a mail server accepts most email and then subjects it to spam and virus filtering before delivering it to a user’s mailbox. If the email is determined to be spam, it isn’t delivered to the user, and a DSN is sent to the address in the From: header, notifying the sender that delivery was not successful.

This is problematic for many reasons. Most critically, the From: header in spam is rarely correct. In fact, it may well claim to be from someone you know, since spammers have been known to harvest email addresses from people’s address books. Sending a DSN to someone who didn’t send the email in the first place causes confusion, and results in support calls from confused users who think their email account has been compromised.

Even more detrimental to productivity, DSNs sent to addresses or domains that don’t exist pile up on the mail server, since they can’t be handed off to another server. Thousands of email messages sitting in the mail queue will strain system resources and can effectively clog mail services for legitimate mail. Most organizations find this to be the most difficult aspect of dealing with spam.

Option 2: Silently Discarding

Once email has been accepted and eventually identified as spam, another option is to simply discard the message.
This completely solves the problem of a mail server crumbling under too much mail in the queue, but is perhaps just as problematic. If email is falsely identified as spam and the sender isn’t notified that delivery failed, the sender will simply assume everything was delivered as usual. Clearly this is less than optimal, but when servers start falling over because of the resources being consumed, many people turn to silently dropping spam.

Oftentimes, silently discarding email is an intermediate step between DSNs and before-queue filtering. David Ernst of HoosierNet stated the following when we asked about his need to transition from DSNs to before-queue filtering: "Well, something had to be done. We can grind the service to a halt if we try to process all of those return-to-senders. So, it made the difference between working and not working." Many people in this position opt for a hybrid system that still sends DSNs, but periodically cleans the queue to discard the ones that cannot be delivered.

Option 3: Don’t Accept It at All

Ideally we want to identify spam while the sending server is still connected, and tell it that delivery isn’t going to happen. This means that the sending server has to deal with the failure, and in the case of a spammer, it simply means that sending failed.

“Just don’t accept it” is quite easy to say, but sometimes tricky to implement. Mail servers such as postfix and sendmail, for you open source folk, both have the ability to hand messages to another program before sending them to the queue for final delivery. This allows the second program to scan the email for viruses and spam, and report the status to the mail server. If the message is identified as spam, the server, which has not yet reported “delivery accepted” to the sending server, now has the option of reporting an error. There is no need to send a DSN, since we never accepted the suspicious email in the first place.

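
The before-queue decision described above can be sketched in a few lines of Python. This is only an illustration of the idea, not how postfix, sendmail, or amavisd-new implement it: the `ScanResult` type, the `smtp_reply` function, and the threshold value are all hypothetical names chosen for this example. The point it shows is that the verdict is turned into an SMTP reply while the sending server is still connected, so no DSN is ever generated.

```python
from dataclasses import dataclass

# Hypothetical cut-off for this sketch; a real filter lets the
# administrator choose the score at which mail counts as spam.
REJECT_THRESHOLD = 5.0


@dataclass
class ScanResult:
    """Verdict handed back by the scanning program (illustrative only)."""
    spam_score: float
    has_virus: bool = False


def smtp_reply(result: ScanResult, threshold: float = REJECT_THRESHOLD) -> str:
    """Choose the reply given to the sending server at the end of the
    message data, before the mail has been accepted into the queue."""
    if result.has_virus:
        # Permanent failure: the sending server must deal with it.
        return "554 5.7.1 Message rejected: virus detected"
    if result.spam_score >= threshold:
        return "554 5.7.1 Message rejected: identified as spam"
    # Only now do we take responsibility for delivering the message.
    return "250 2.0.0 Ok: queued"
```

A legitimate sender whose message is wrongly flagged sees the 554 reply from their own mail server, so a false positive is not silently lost, and a forged From: address never receives anything.
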
Best Practices

Implementing spam and virus checking, in general, isn’t very difficult. Depending on the mail server, however, implementing spam filtering so that it can reject spam before the SMTP session is over can be difficult. Two widely used mail servers, postfix and sendmail, both have the ability to use amavisd-new. Amavisd-new is a favorite because it provides a simple way to implement spam and virus checking, so we feel it deserves special mention.

Sendmail has the milter interface, which allows anyone to program add-ons to sendmail. The amavis-milter hands email off to amavisd-new, which in turn runs spamassassin and virus checking on the email. Amavisd-new also checks attachments, and can extract their contents even if they are zip files (or many other types of archives) to check for viruses and spam. Configuring this in postfix is even simpler, since it only requires one change in the configuration file, plus the addition of another smtpd process.

Email is increasingly frustrating to manage, due in large part to the model of email itself. We sometimes want to receive email from people we don’t know, so email is designed around that fact. People have implemented systems where senders have to verify themselves the first time they send email, but that type of system doesn’t work. We always want to receive automated messages when we purchase things online, and those messages are normally sent from an address that nobody monitors, making “sender verification” impossible.

For the sanity of everyone who runs email servers, please don’t send DSN messages for spam. This is rapidly becoming a widely accepted standard practice. Receiving a DSN for mail you didn’t send is confusing, and dropping spam silently will lead to lost email. The only sane option is to complete all of your virus and spam checking before accepting the mail for delivery and reporting “success” to the sending server.
Aside from making the most sense, this option also conserves system resources in most cases.
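
As a recap, the difference between the three options comes down to who finds out that delivery failed. The following Python sketch models only that one question; the `Disposition` enum and the `notified_party` function are hypothetical names made up for this illustration and do not correspond to any mail server’s API.

```python
from enum import Enum
from typing import Optional


class Disposition(Enum):
    """The three ways of handling a message identified as spam."""
    BOUNCE = "accept, then send a DSN"
    DISCARD = "accept, then silently drop"
    REJECT = "refuse before accepting"


def notified_party(disposition: Disposition,
                   claimed_sender: str,
                   sending_server: str) -> Optional[str]:
    """Return who is told that the message was not delivered,
    or None if nobody is told."""
    if disposition is Disposition.BOUNCE:
        # The DSN goes to the address the message claims to be from,
        # which in spam is rarely the real sender.
        return claimed_sender
    if disposition is Disposition.REJECT:
        # The error is reported to the server that is still connected.
        return sending_server
    # DISCARD: the message disappears and no one is informed.
    return None
```

With a forged sender, bouncing notifies an innocent third party and discarding notifies no one; only rejecting during the SMTP session puts the failure back on the server that actually tried to send the message.
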