Added on Oct 22nd, 2015 and marked as spam spamassassin

In a previous post I explained how to set up Amavis, ClamAV and SpamAssassin. In this post I will describe how to customize the SpamAssassin thresholds in order to block spam messages more effectively.

If you followed my SpamAssassin setup, the SpamAssassin thresholds are defined in the file /etc/amavis/conf.d/20-debian_defaults. Below are the relevant settings to control the thresholds:

Flag | Description
$sa_tag_level_deflt | The threshold at which amavisd will add a header to the mail to show the spam score.
$sa_tag2_level_deflt | The threshold at which amavisd will add the value of $sa_spam_subject_tag to the mail’s subject (by default ***SPAM***).
$sa_kill_level_deflt | The threshold at which amavisd will execute the action described by $final_spam_destiny (possible values are D_REJECT, D_BOUNCE, D_DISCARD and D_PASS).
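
To make the interplay of the three settings concrete, here is a minimal Python sketch that maps a spam score to the actions described above. The function name and the action labels are made up for illustration, and treating each threshold as inclusive (score at or above the level) is an assumption rather than something taken from the amavisd source.

```python
def actions_for_score(score, tag_level, tag2_level, kill_level):
    """Return the actions the three thresholds imply for one spam score.

    Hypothetical helper: comparisons are assumed to be inclusive (>=).
    """
    actions = []
    if score >= tag_level:
        actions.append("add spam score header")
    if score >= tag2_level:
        actions.append("tag subject")
    if score >= kill_level:
        actions.append("apply final_spam_destiny")
    return actions


# Example thresholds chosen for illustration only.
for score in (-1.0, 1.0, 2.5):
    print(score, actions_for_score(score, 0.0, 0.5, 2.0))
```

A score can trigger several actions at once, because each threshold is checked independently.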

In the end, the only threshold I’m really interested in is $sa_kill_level_deflt. But how do you find the right value for this level?

By default it is set at 6.31 (at least in my case), but is this really the value that blocks the most spam while still passing legit (i.e. non-spam) messages? It’s time to find out, and to do so we’re going to need $sa_tag2_level_deflt.

Any message with a score above this threshold is supposed to get a mail header with some information to use for analysis. During my experiments I was not able to get this to work (I probably had some other setting interfering). However, the relevant information is also written to the log file (/var/log/mail.log). For every message (spam and legit) the log shows not only the spam score, but also what amavis did with the message: Passed CLEAN, Passed SPAMMY or Blocked SPAM.
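
Pulling the verdict and the score out of the log can be scripted. The Python sketch below assumes that each amavis line contains one of the three verdicts followed somewhere later by a field of the form "Hits: <score>". That field name and the sample lines are assumptions, so check them against your own /var/log/mail.log and adjust the pattern if needed.

```python
import re

# Assumed line shape: "... Passed SPAMMY ... Hits: 1.7, ..."
# Verify this against your own log before relying on it.
LINE_RE = re.compile(
    r"\b(Passed CLEAN|Passed SPAMMY|Blocked SPAM)\b.*?\bHits: (-?\d+(?:\.\d+)?)"
)


def parse_amavis_line(line):
    """Return (verdict, score) for a matching line, otherwise None."""
    match = LINE_RE.search(line)
    if match is None:
        return None
    return match.group(1), float(match.group(2))


def parse_log(lines):
    """Collect (verdict, score) pairs, skipping lines that do not match."""
    parsed = (parse_amavis_line(line) for line in lines)
    return [item for item in parsed if item is not None]


# Hypothetical sample lines, heavily shortened; real log lines carry more fields.
sample = [
    "Oct 22 10:00:01 mx amavis[123]: (123-01) Passed SPAMMY, <a@example.com> -> <b@example.org>, Hits: 1.7, size: 2048",
    "Oct 22 10:00:05 mx amavis[123]: (123-02) Passed CLEAN, <c@example.com> -> <b@example.org>, Hits: -2.3, size: 900",
    "Oct 22 10:00:09 mx postfix/smtpd[456]: connect from unknown[192.0.2.1]",
]
for verdict, score in parse_log(sample):
    print(verdict, score)
```

To run it on the real log, pass an open file object instead of the sample list, for example `parse_log(open("/var/log/mail.log"))`.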

By choosing a relatively large value for $sa_kill_level_deflt (let’s say 10) and a low value for $sa_tag2_level_deflt (let’s pick 0), a large group of messages will be marked as Passed SPAMMY. Many of these will clearly be spam. Now you can use the information from the log file to figure out a new upper threshold: you will notice that no legit message exceeds a certain spam score. You have found your new value for $sa_kill_level_deflt!

The same can be done for the lower threshold. There is a certain minimum score for obvious spam messages. This value can be used for $sa_tag2_level_deflt. (Note that this has no effect on the number of messages being blocked. It just indicates that anything below this limit is considered non-spam.)
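
Once you have gone through the logged messages and judged each one by hand, both thresholds fall out of a simple maximum and minimum. The Python sketch below assumes you have collected (score, label) pairs yourself; the "legit" and "spam" labels and the sample numbers are made up for illustration.

```python
def threshold_candidates(labelled_scores):
    """Return (highest legit score, lowest obvious-spam score).

    labelled_scores: iterable of (score, label) pairs, where label is
    "legit" or "spam" as judged by hand (hypothetical labels).
    Either element is None when that label does not occur.
    """
    pairs = list(labelled_scores)
    legit = [score for score, label in pairs if label == "legit"]
    spam = [score for score, label in pairs if label == "spam"]
    highest_legit = max(legit) if legit else None
    lowest_spam = min(spam) if spam else None
    return highest_legit, lowest_spam


# Made-up review results, not real measurements.
reviewed = [(-2.1, "legit"), (0.3, "legit"), (1.8, "legit"),
            (0.6, "spam"), (3.4, "spam"), (7.9, "spam")]
print(threshold_candidates(reviewed))
```

The first number suggests a value just above it for $sa_kill_level_deflt, the second a value for $sa_tag2_level_deflt. As in this sample, the two groups can overlap, so some judgment remains.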

Currently I’m using these values:

$sa_tag_level_deflt  = 0.0;
$sa_tag2_level_deflt = 0.5;
$sa_kill_level_deflt = 2.0;

This will block most of the obvious spam without rejecting (too many) legit messages. It can still happen that a normal message exceeds my limit of 2.0 and is therefore blocked. During my experiments this did not happen often, and it is probably a sign of some misconfiguration on the sender’s side. My guess is that such senders often have problems getting their mail delivered to other recipients as well.

Anything below 0.5 is considered clean. Most major players (Gmail, Hotmail, Mailchimp etc.) have excellent scores (far below zero) and therefore their messages will always be delivered. Some spam messages have scores just above or even below zero and will also pass the filter. This is something I have accepted.

The problem area is the twilight zone between 0.5 and 2.0. For some reason too many legit messages look kind of spammy (no subject line, only images or videos, all caps, misconfigured server, …) and there is no foolproof way for a non-human to distinguish them from the spam messages that succeed in hiding their spammy intentions. By narrowing this range you will receive less spam, but you may miss some legit messages. Trial and error will result in values that are acceptable for you.
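
To support the trial and error, it helps to see how many messages land in each zone for a given pair of thresholds. The Python sketch below counts scores per zone. The zone names, the inclusive comparisons and the sample scores are assumptions for illustration; the default limits are the 0.5 and 2.0 used in this post.

```python
def zone_counts(scores, tag2_level=0.5, kill_level=2.0):
    """Count how many scores fall below, between, and at or above the limits."""
    counts = {"clean": 0, "twilight": 0, "blocked": 0}
    for score in scores:
        if score >= kill_level:
            counts["blocked"] += 1
        elif score >= tag2_level:
            counts["twilight"] += 1
        else:
            counts["clean"] += 1
    return counts


# Made-up scores, not real measurements.
scores = [-3.2, 0.1, 0.5, 1.9, 2.0, 7.4]
print(zone_counts(scores))
print(zone_counts(scores, tag2_level=1.0, kill_level=1.5))
```

Running it with different limits on the same set of scores shows how much the twilight zone shrinks, and which messages move into the clean or blocked zones instead.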