Piratefish 2.3 Tuning Tips
Please note: Changes listed in this document are for Piratefish 2.3 installations only, and are intended to add features that are not documented in the Piratefish 2.3 Guide, or to correct problems in the documentation.
The software used in the Piratefish system is well documented on the Internet. If you have questions regarding these software packages, please go to their respective home pages and read the online documentation for more detailed help.
Many parts of the Piratefish are well documented within the webmin interface as well. In the MailScanner plug-in for instance, many of the settings are documented in pop-ups accessible just by clicking on the items you're curious about. Don't be afraid to explore!
- Common Build Problems (Updated 9/6/2006)
- Drag and Drop Bayesian Filter Programming
- Fuzzy OCR Filtering of Spam (Added 12/6/2006, Updated 1/10/2007)
- Updating SpamAssassin after Fuzzy OCR is installed (Added 12/8/2006)
- Using Postgrey to add automatic greylisting (Added 3/15/2007)
As with anything that relies on the outside world, problems can creep in. The Piratefish isn't immune to these problems, and it's now time to provide some assistance with building your Piratefish while the documentation is being updated formally into the 2.4 version.
Currently, as most of you have found, there are problems in building a Piratefish. As the Internet and Linux distributions change, so does the process of building the Piratefish. This becomes more difficult as new patches and changes appear almost all the time. In addition, some core changes have happened recently in development, requiring rewrites to some chapters of the eBook and changes to some of the URLs listed in it.
Correction in Chapter 3: Getting Webmin Going
In this chapter, the process of installing and upgrading Webmin hasn't changed much. However, the URL provided in step #8 for loading the Webmin MailScanner plugin now results in a file error, which prevents the plugin from loading.
To fix this, use this URL instead: http://www.piratefish.org/docs/webmin-module-1.1-4.wbm
Correction in Chapter 11: Configuring SpamAssassin White lists
Technically speaking, this chapter is 100% accurate. However, whitelists in SpamAssassin only prevent SpamAssassin from tagging a message as spam. The address is still not whitelisted in MailScanner, and since MailScanner calls SpamAssassin only after a message passes the black list checks, it's possible for addresses whitelisted in the Piratefish to not get their mail delivered if they're on a black list.
To solve this problem, whitelisting of email should be performed in the MailScanner gui, and not in the SpamAssassin section at all. This is a major change, and I apologize for not catching this problem sooner.
Whitelists made for MailScanner are formatted slightly differently than they are for SpamAssassin, so it's not a simple act of cut and paste to convert from one to the other.
To access the MailScanner whitelisting area, go into the MailScanner webmin gui, click on the Spam detection and spam lists (DNS blocklists) section, and find the item labeled Is Definitely Not Spam. This should be pointing to a ruleset file that was defined by the initial setup - don't make any changes here. Click on the edit button to the far right of this line.
This file contains the line: "FromOrTo: Default no" - DO NOT MODIFY OR ERASE THIS!!!
At the top of this file, add in your whitelist entries formatted like this:
From: johnny@piratefish.org yes
From: ibm.com yes
For more rule examples, the MailScanner website has some great examples available here.
Be sure that the last line in the file contains: FromOrTo: Default no
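Since a missing default rule is an easy mistake to make, a quick check can help before applying changes. The following is a minimal sketch, not part of the Piratefish scripts: the function name check_default_rule is made up for this example, and it assumes the default rule is spelled exactly as shown above.

```shell
#!/bin/sh
# Hypothetical helper: succeed only if the last non-blank line of a
# MailScanner ruleset file is the catch-all default rule.
check_default_rule() {
    rules_file=$1
    # Take the last line that is not empty or whitespace-only, turn tabs
    # into spaces, squeeze repeated spaces, and trim both ends so that
    # spacing differences do not matter.
    last=$(grep -v '^[[:space:]]*$' "$rules_file" \
        | tail -n 1 \
        | tr '\t' ' ' \
        | tr -s ' ' \
        | sed 's/^ //; s/ $//')
    [ "$last" = "FromOrTo: Default no" ]
}
```

For example, running check_default_rule /etc/MailScanner/rules/spam.whitelist.rules && echo OK prints OK only when the file ends with the default rule.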
If you have a large whitelist in your SpamAssassin setup now, the following script can convert your SpamAssassin whitelist into a whitelist usable by MailScanner.
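This script requires you to log in as root and create a file called wlconv.sh containing the following lines:
#!/bin/sh
# Read one address per line from standard input and emit a MailScanner rule for each.
while read -r line
do
echo "From: $line yes"
done
# The catch-all default rule must be the last line of the ruleset.
echo "FromOrTo: Default no"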
Once this text has been entered, save the file.
Then enter this command to make the script executable:
chmod 755 wlconv.sh
Now that this is done, you can run this on your existing whitelist, and output it into the MailScanner whitelist using this command:
grep "whitelist_from" /etc/spamassassin/local.cf | sed "s/whitelist_from //" | ./wlconv.sh > /etc/MailScanner/rules/spam.whitelist.rules
This single line command (yes, it's long) does the following:
- grep "whitelist_from" /etc/spamassassin/local.cf | reads all the existing whitelist lines from the SpamAssassin local.cf configuration file - and outputs them into the "pipe"
- sed "s/whitelist_from //" | invokes a command called the stream editor - this is an editor which can run on every line of text run through it. In this case, sed reads each line in the pipe and substitutes "whitelist_from " with nothing - erasing that text from each line - so now just the email address is being pushed into the "pipe"
- ./wlconv.sh > /etc/MailScanner/rules/spam.whitelist.rules invokes the wlconv.sh script you created so that it reads input from the pipe, and tells it to send its output to overwrite the existing MailScanner whitelist rules.
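Because the command above overwrites the live whitelist, it can be worth previewing the result first. The sketch below is one possible way to do that and is not the script from the eBook: the function name sa_to_mailscanner is made up for this example. It differs from the grep/sed pipeline in three ways: it skips commented-out lines, it ignores other directives such as whitelist_from_rcvd, and it writes one rule per address when a whitelist_from line lists several addresses.

```shell
#!/bin/sh
# Sketch: print MailScanner rules for every whitelist_from entry in a
# SpamAssassin config file. Output goes to stdout; nothing is overwritten.
sa_to_mailscanner() {
    awk '
        # Only lines whose first word is exactly "whitelist_from".
        # This skips "#whitelist_from ..." and "whitelist_from_rcvd ...".
        $1 == "whitelist_from" {
            for (i = 2; i <= NF; i++) {
                if ($i ~ /^#/) break        # stop at a trailing comment
                print "From: " $i " yes"
            }
        }
        # The catch-all default rule always goes last.
        END { print "FromOrTo: Default no" }
    ' "$1"
}
```

A possible use is sa_to_mailscanner /etc/spamassassin/local.cf > /tmp/spam.whitelist.preview, followed by a read through the preview file before copying it into place. Wildcard entries are passed through unchanged, so they still need a manual check against the MailScanner rule format.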
Once you've run this command, it's important to double-check the resulting whitelist rules file for errors, then apply the changes to MailScanner, then restart it a moment later.
Drag and Drop Bayesian Filter Programming
With the help and initial testing of Steven Siegel, the wonderful SpamAssassin Wiki, and some creative hashing, at last an easy solution to Bayesian filter training is at hand!
This solution uses an IMAP client on your Piratefish server to copy emails from a folder on your mail server. Steven has tested this successfully in a Microsoft Exchange environment, and I've had a good run with it in my Linux server environment here at Piratefish Central.
Also, for another change of pace, I'm providing ready-to-use scripts as well, so there will be less editing for those of you wanting to do this!
Setup:
- Create a new user on your mail server - call it "spaminator" or something else - be creative. Assign it a password as well.
For example, the user spaminator was created with the password spaminator5
- Add this account to your email client, but be sure to treat it as IMAP - not as POP3/SMTP. This is important so as to keep the messages on the mail server. Everyone who uses this account will need to use IMAP too.
- Once you can reach the account from your mail client, go ahead and create two folders in the mailbox - one called Spam and one called Ham - or call them anything you like. Case matters here too, and each folder must be uniquely named.
- Now move to the Piratefish server, log in as root to the console (or SSH in) and execute the following command:
wget http://www.piratefish.org/registered/ddbayes-setup.sh
- Now run the script by typing the following commands:
chmod 755 ddbayes-setup.sh
./ddbayes-setup.sh
When this setup script is run, it will create the files needed to run fetchmail and install fetchmail if it's not already installed on your Piratefish server.
When running this script, be ready to answer the following questions:
The name (or IP address) of your IMAP mail server
The account name used for Bayesian Learning
The account password
The Spam folder name
The Ham folder name
- To test, copy some ham into the ham folder for the spaminator user, then move some spam into the Spaminator Spam folder as well.
Remember that this system removes the email from these mailboxes, so it's critical that no important messages are moved into the Ham box - copy them in - don't move them or they will be deleted when it learns!
- Once you've loaded the directories using your mail client, type the following at the Piratefish command prompt:
./autolearn.sh
- The screen should get really busy for a few minutes as the Spams and Hams are loaded, one at a time, from each of the accounts and learned.
Once installed, this script will run automatically every hour.
If you run into problems and want to stop running this script every hour, just remove the file /etc/cron.hourly/autolearn.sh - this will remove the link that makes the script run every hour.
Multiple users can use the same IMAP account easily, with each person adding spams and hams as needed. Dragging and dropping the messages ensures that the message headers aren't appended or modified, ensuring good programming of the SpamAssassin Bayesian filter.
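To see whether the learning runs are actually adding to the Bayesian database, the counters reported by sa-learn --dump magic can be read. The sketch below only parses that output, and the function name bayes_counts is made up for this example. It assumes the usual layout, where the third column holds the value and the line ends in nspam or nham. Which database gets dumped depends on the user running the command and on your SpamAssassin configuration.

```shell
#!/bin/sh
# Sketch: read "sa-learn --dump magic" output on stdin and print the
# number of spam and ham messages learned so far.
bayes_counts() {
    awk '
        $NF == "nspam" { spam = $3 }   # messages learned as spam
        $NF == "nham"  { ham  = $3 }   # messages learned as ham
        END { printf "spam=%d ham=%d\n", spam, ham }
    '
}
```

Typical use is sa-learn --dump magic | bayes_counts. By default SpamAssassin does not apply Bayesian scoring until at least 200 spams and 200 hams have been learned, so low numbers here mean more dragging and dropping is needed.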
Fuzzy OCR Filtering of Spam (Updated 1/10/2007)
As many folks have seen, the Piratefish isn't infallible. Some spams still get by.
One type of these emails that seems to be prevalent lately is the "image spam" - they contain a single image attachment that contains the spam message itself. This method gets through the Piratefish because it can't look at the image to see if the image contains spam.
There is a way to defend against these, and it seems to be working in my system, and I have documented the process to add this capability to the Piratefish below.
WARNING: This change is risky. It requires the SpamAssassin program to be upgraded to an "unstable" version and will also upgrade a few other elements of the system in the process.
Any mistakes made in this process, or in my documentation, could kill a working fish - and I take no responsibility for any damages coming from this solution.
The changes to your Piratefish on this page are not reversible. If for any reason these changes make your fish unstable or unusable, I recommend rebuilding from scratch.
The reason for this is that updating the SpamAssassin program from version 3.0.3 to the Debian "unstable" version can introduce new problems.
You are warned. Proceed at your own risk.
Adding FuzzyOcr image spam detection to the Piratefish
- Log into your Piratefish server as root
- Type cd (press enter)
- Upgrade the system by typing these commands:
apt-get update
apt-get upgrade
If there are no updates, then proceed. If there are, install all of them, then reboot the system and continue from this point.
- Type "apt-get install netpbm imagemagick gocr libungif-bin libstring-approx-perl libmail-audit-perl"
- Type "wget http://users.own-hero.net/~decoder/fuzzyocr/fuzzyocr-2.3b.tar.gz"
- Type "tar -zxvf fuzzyocr-2.3b.tar.gz"
- Type "cd FuzzyOcr2.3b"
- Type "cp FuzzyOcr.cf /etc/mail/spamassassin"
- Type "cp FuzzyOcr.pm /etc/mail/spamassassin"
- Type "cp FuzzyOcr.words.sample /etc/mail/spamassassin/FuzzyOcr.words"
- Edit the "/etc/mail/spamassassin/FuzzyOcr.words" file to add in any words you see appearing in image spams.
- Type "cd /etc/mail/spamassassin"
- Type "pico FuzzyOcr.cf"
- Find the "focr_logfile" line and change it to /var/log/focr.log
- Save the file.
- Type "chmod 755 Fuzzy*"
- Type "pico /etc/apt/sources.list"
- Insert some blank lines into the file, then add in a line that reads:
deb http://mirrors.kernel.org/debian/ unstable main
and save the file
- Now type the following commands:
apt-get update
apt-get install spamassassin
The system will ask questions regarding downloading - answer yes to the questions about getting the updates.
Be very careful during the process - you will be asked if you wish to use the package maintainer's configuration file, or if you wish to keep your configuration file.
Be sure to keep your configuration file by answering N or No or "O" for original.
- Once the update is complete, type in this command:
spamassassin -V
SpamAssassin should report that it's now version 3.1.4
- Remove the Unstable Sources - IMPORTANT!!! If you don't do this, the next upgrade will change all the software in the system and possibly kill your fish!
Type "pico /etc/apt/sources.list"
Find the line you added earlier with the "unstable" in it.
Add a # sign to the front of the line to comment it out.
Save the file.
- Now type the following:
apt-get update
apt-get upgrade
There should be no updates pending.
- Reboot the Piratefish server.
This should be all that's required. Each time SpamAssassin runs, it scans the /etc/mail/spamassassin directory and reads any file ending in .cf, which is how the FuzzyOcr plugin gets picked up.
To test this change, cd into the FuzzyOcr directory, then go into the samples directory.
The README file within the samples directory shows how to test and what output each test file should generate.
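Since leaving the unstable source active is the most dangerous mistake in the steps above, it may be worth checking sources.list before any later upgrade. This is a minimal sketch, and the function name has_active_unstable is made up for this example. It assumes the plain "deb URL suite components" line format used above, and it also treats "sid", the Debian codename for unstable, as a match.

```shell
#!/bin/sh
# Sketch: succeed (exit status 0) if the given sources.list still has an
# uncommented line that pulls packages from Debian unstable.
has_active_unstable() {
    awk '
        /^[ \t]*#/ { next }                      # ignore commented-out lines
        ($1 == "deb" || $1 == "deb-src") &&
        ($3 == "unstable" || $3 == "sid") { found = 1 }
        END { exit !found }                      # 0 if found, 1 otherwise
    ' "$1"
}
```

For example, has_active_unstable /etc/apt/sources.list && echo "WARNING: unstable is still enabled" stays silent once the line has been commented out. Running spamassassin --lint afterwards is another useful check, since it reports configuration files that fail to parse.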
Updating SpamAssassin after Fuzzy OCR is installed
SpamAssassin has some features that haven't been exploited yet in the Piratefish - one of these features is the ability to update itself with new rulesets. Rulesets in SpamAssassin are maintained by the folks who maintain the package, and updating these rulesets can help ensure your Piratefish remains strong and blocks more of the bad stuff than ever.
SpamAssassin rules are used to tag words, phrases and text sequences that appear commonly in spam messages. These rules can be customized by anyone easily by editing the rule files, but the syntax is somewhat complex. Additionally, a mistake here could result in legitimate messages being blocked, so editing of these files is most definitely an "at your own risk" activity.
The Apache Project maintains an excellent FAQ about all of this here.
This update is for users who have installed the updates for Fuzzy OCR. I don't know if this update will work for non-Fuzzy OCR Piratefish. If you try this and it works on your non-Fuzzy OCR Piratefish, please let me know so I can update this page!
All we need to do is this:
- Log into the Piratefish server as root
- Type the command "sa-update --nogpg"
This will download the latest SpamAssassin rules and place them into the folder /var/lib/spamassassin/xxxxxxxx, where xxxxxxxx indicates the ruleset number. As more updates are released, more directories will appear in this folder.
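If you want to script the update rather than type it by hand, the exit status of sa-update tells you what happened: 0 means new rules were downloaded and installed, 1 means there was nothing new, and higher values indicate a problem. The sketch below is one way to wrap that. The function names and the SA_UPDATE_CMD variable are made up for this example, and the variable exists only so the wrapper can be tried with a stand-in command.

```shell
#!/bin/sh
# Sketch: translate an sa-update exit status into a short word.
describe_sa_update_status() {
    case $1 in
        0) echo "updated" ;;     # new rules were installed
        1) echo "no-update" ;;   # rules were already current
        *) echo "problem" ;;     # download, extraction or lint trouble
    esac
}

# Run the update and report the outcome. SA_UPDATE_CMD is a hypothetical
# override; when it is unset, the real sa-update command is used.
run_rule_update() {
    status=0
    ${SA_UPDATE_CMD:-sa-update} --nogpg || status=$?
    describe_sa_update_status "$status"
}
```

When run_rule_update prints "updated", MailScanner needs to be restarted before the new rules take effect; the other two results need no restart.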
Using Postgrey to add automatic greylisting
Greylisting is a relatively new feature that I discovered while researching a problem a Piratefish user was having with high volumes of spam. This became a concern to me recently since this user's server was getting more than 100 emails per minute, and the Piratefish just wasn't able to keep up with that kind of volume. To its credit however, it was able to receive that kind of volume - since each message needs 2-10 seconds to be processed fully, the failure was in the ability of MailScanner to keep up.
This led me to thinking about how incoming mail is being processed - Postfix is a very greedy mail server, and in this case I think its eagerness to accept mail was working against it. I then asked around a bit, and I was pointed to an item called Postgrey.
Postgrey is a greylisting policy server for Postfix, and when emails are checked with it, it records the sender's IP address, who's sending the message, and the recipient. If these three items have never been seen before, or only seen within the last 5 minutes, then the message is rejected with a temporary error. If the same combination is seen after 5 minutes and before 35 days, the message is allowed through.
This is clever since it requires that email senders are RFC compliant - and the RFCs dictate that if an email gets a temporary error, the server should hang onto the message and try delivering it a little later. Since spamming is a volume business, and this puts a 5 minute delay on their volume, this slows them down. To make things more interesting, by the time a high-volume spammer actually tries to deliver this message again, with luck, their server may already be blacklisted.
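The timing rule described above can be sketched in a few lines of shell. This is a simplified illustration of the idea and not Postgrey's actual code: the function name greylist_decision is made up, and the real daemon tracks more state than a single first-seen timestamp.

```shell
#!/bin/sh
# Simplified sketch of the greylisting rule for one triplet
# (client IP address, sender, recipient).
GREYLIST_DELAY=300                      # 5 minutes, in seconds
GREYLIST_MAX_AGE=$((35 * 24 * 3600))    # 35 days, in seconds

# $1: epoch seconds when the triplet was first seen ("" if never seen)
# $2: current time in epoch seconds
# Prints "defer" (temporary error) or "accept".
greylist_decision() {
    first_seen=$1
    now=$2
    if [ -z "$first_seen" ]; then
        echo "defer"                    # never seen before
        return
    fi
    age=$((now - first_seen))
    if [ "$age" -lt "$GREYLIST_DELAY" ]; then
        echo "defer"                    # retried too quickly
    elif [ "$age" -le "$GREYLIST_MAX_AGE" ]; then
        echo "accept"                   # waited long enough
    else
        echo "defer"                    # record expired, treated as new
    fi
}
```

For example, greylist_decision 1000 1100 prints defer because only 100 seconds have passed, while greylist_decision 1000 1300 prints accept.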
WARNING: This can interrupt delivery of messages and slow down delivery times beyond what is reasonable for many businesses. For this reason, it is recommended that this feature only be used under desperate high-volume situations. If you receive fewer than 20k messages per day, this feature may be overkill for your situation.
To add this feature into your existing Piratefish is easy:
- Log into the console of your Piratefish as root
- Type "apt-get install postgrey" and press enter.
- When asked to approve the installation, press Y and Enter. The download and installation will complete momentarily.
At this point you've now added the Postgrey daemon into your Piratefish, but until you modify the Postfix configuration, Postgrey will not be used.
To add Postgrey processing into Postfix is easy:
- Type "cd /etc/postfix" and press enter.
- Type "pico main.cf" and press enter.
This will load the pico editor and will drop you into the main configuration file for Postfix.
Scroll down through the file until you see a section that looks like this:
smtpd_recipient_restrictions =
permit_mynetworks
reject_unauth_destination
reject_unknown_recipient_domain
reject_unverified_recipient
check_policy_service unix:private/spfpolicy
- Add the line
check_policy_service inet:127.0.0.1:60000
into the list so it looks like this:
smtpd_recipient_restrictions =
permit_mynetworks
reject_unauth_destination
reject_unknown_recipient_domain
reject_unverified_recipient
check_policy_service inet:127.0.0.1:60000
check_policy_service unix:private/spfpolicy
- Once you have completed the edit, press CTRL-X and save the file.
- Reload Postfix so that it reads the configuration change:
/etc/init.d/postfix reload
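The position of the new line matters: Postfix's own policy documentation says to place check_policy_service after reject_unauth_destination, otherwise the server could end up relaying mail it shouldn't. The sketch below checks a main.cf for that ordering. The function name check_postgrey_order is made up for this example, and it assumes Postgrey is listening on 127.0.0.1:60000 as configured above.

```shell
#!/bin/sh
# Sketch: succeed only if the Postgrey policy line exists in the given
# main.cf and appears after reject_unauth_destination.
check_postgrey_order() {
    awk '
        /^[ \t]*#/ { next }                                   # skip comments
        /reject_unauth_destination/ { rud = NR }              # relay protection
        /check_policy_service[ \t]+inet:127\.0\.0\.1:60000/ { pg = NR }
        END { exit !((rud > 0) && (pg > rud)) }               # 0 = ordering OK
    ' "$1"
}
```

A possible use is check_postgrey_order /etc/postfix/main.cf || echo "Postgrey line missing or misplaced". The postconf smtpd_recipient_restrictions command shows the value Postfix is actually using, which is a handy cross-check after the reload.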


