How To Automate Spamcop Submissions

Author: Stephan Jau

Introduction

Spamcop is a service that provides RBLs (real-time blackhole lists), which mail servers use to reject incoming mail from spammers.

Their philosophy is to process spam complaints from users. When they receive a certain number of complaints within a given time period, they blacklist the offender. This system depends on users reporting spam. However, their submission process is not very user-friendly:

1.) You either forward the spam to a Spamcop email address given to you at sign-up (an address at spam.spamcop.net), or you manually copy and paste the headers and the body into a form on their server.

2.) You then receive an email at the address you supplied when you signed up.

3.) This email contains a link to a web form that asks you to verify the submitted data. Clicking the link is not enough; you also have to submit the form manually.

Problem

As I said above, Spamcop depends almost entirely on user input. If no one submits and verifies spam, they will have no blacklist. However, the whole submission and verification process is a bit annoying. Why should I bother to submit spam to Spamcop and then verify it? If I just delete it, that takes less time...

Solution

Human beings aren't really made for repetitive tasks. They quickly become boring, hence my idea to automate the submission and verification process.
In this howto I will show you how I achieved that. All I do is put the spam into certain folders, and our good old friend cron does the rest.

Prerequisites

I'm not yet an advanced Linux user, coder or sysadmin; I'm simply sharing what I know. To follow this tutorial you will need several things:

1.) A Maildir structure for your email (the scripts simply loop through all the email files in a directory, so a maildb won't work)

2.) The ability to set up cron jobs (otherwise you can't automate a thing)

3.) The ability to run shell scripts (in Bash)

4.) The ability to call a PHP script from the shell (I do that with Lynx)

5.) A small program called mime-construct. You can install it on Debian like this:

apt-get install mime-construct

6.) The Snoopy Class (a PHP class used for submitting webforms and other things)

7.) Possibly, you will also need mail filtering capabilities (e.g. procmail), especially if you use catch-all email addresses.
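
The command-line tools from the list above can be checked in one go. The following is only a minimal sketch, assuming the tools are expected on the PATH; the function name missing_tools is made up for this example.

```shell
#!/bin/bash
# Report which of the given commands are not found on the PATH.
# Prints one missing command name per line; prints nothing if all are present.
missing_tools() {
    local tool
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || echo "$tool"
    done
}

# Tools called by the scripts in this howto (sa-learn is only needed if you
# want to train SpamAssassin's Bayes filter).
MISSING=$(missing_tools mime-construct lynx sa-learn)
if [ -n "$MISSING" ]; then
    echo "Not installed or not on the PATH:"
    echo "$MISSING"
else
    echo "All tools found."
fi
```

The Snoopy class is a PHP file rather than a command, so it is not covered by this check.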

Let's start

Spamcop Account

First of all, you will need to create an account at Spamcop. This is, of course, free of charge.

"Spamfolder"

Next, create a folder to put all your spam into. On my system I simply call it "Spam". Since Maildir is a prerequisite, the important folder is the cur folder under the spam folder, in my case /home/mail/web4p1/Maildir/.Spam/cur/.
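
If you prefer to create the folder from the shell instead of from your mail client, something like the following could be used. It is a sketch under the assumption that your server uses the same layout as mine (a dot-prefixed folder inside the Maildir, with cur, new and tmp below it); the function name make_maildir_folder is made up for this example.

```shell
#!/bin/bash
# Create a Maildir subfolder together with its cur, new and tmp directories.
# $1 = path to the Maildir, $2 = folder name without the leading dot
make_maildir_folder() {
    local maildir="$1" name="$2"
    mkdir -p "$maildir/.$name/cur" "$maildir/.$name/new" "$maildir/.$name/tmp"
}

# Example (path taken from this howto; adjust it to your account):
# make_maildir_folder /home/mail/web4p1/Maildir Spam
```

Depending on your mail server, the folder may also need to be subscribed to before it shows up in your mail client, and it should be owned by the user the mail is stored under.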

Spam forward script

Now that we have more or less everything together, it's time for the first script. We now put all the spam that gets through our RBLs and SpamAssassin into our "Spam" folder, where the messages end up in the "cur" subfolder.
What we need now is a script that loops through the folder and forwards the emails to Spamcop.
Here is what I have:
fe.sh (forward email script)

#!/bin/bash

# ENTER PATH OF THE EMAILS THAT ARE TO BE SUBMITTED TO SPAMCOP
FPATH="/home/mail/web4p1/Maildir/.Spam/cur"

# ENTER YOUR SPAMCOP EMAIL ADDRESS
EMAIL="........ a.t. spam.spamcop.net"

#################################################################
#################################################################

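# Abort if the folder cannot be entered, so that nothing is deleted elsewhere
cd "$FPATH" || exit 1

for FILENAME in *
do

# Skip anything that is not a regular file (e.g. the literal * of an empty folder)
[ -f "$FILENAME" ] || continue

# Create email and submit it to the supplied spamcop address
/usr/bin/mime-construct \
--subject "Forwarded spam (MIME encoded)" \
--attachment "Original message" \
--type message/rfc822 \
--encoding base64 \
--file "$FILENAME" \
--to "$EMAIL"

# Train this email to be spam to the bayesian SA filters
/usr/bin/sa-learn --spam "$FILENAME"

# Delete email
/bin/rm -- "$FILENAME"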

done

All in all, this is a very simple script, and just two things need to be adjusted:

1.) The FPATH variable needs to point to your spam/cur folder

2.) The EMAIL variable needs to be set to the address you received upon signing up at Spamcop

In my script I also train SpamAssassin on those spams and then delete them. You may want to handle them differently. The important thing is that once you have submitted the emails to Spamcop, you don't submit them again: either delete them or move them to some other place. If SpamAssassin is not set up to use Bayes, remove the sa-learn line.
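
If you would rather keep the submitted spam than delete it, the messages can be moved out of the way instead. This is only a sketch of that alternative; the function name archive_messages and the idea of a separate archive folder are my assumptions, not part of the script above.

```shell
#!/bin/bash
# Move every regular file from one directory into another instead of deleting it.
# $1 = source directory (e.g. the cur folder of "Spam")
# $2 = archive directory (must already exist)
archive_messages() {
    local src="$1" dest="$2" file
    for file in "$src"/*; do
        # Skip the unexpanded pattern when the source directory is empty
        [ -f "$file" ] || continue
        mv -- "$file" "$dest"/
    done
}
```

In fe.sh this would take the place of the rm line: once a message has left the spam folder, the next cron run can no longer submit it a second time.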

Spam verification script

As mentioned in the prerequisites, we also need a folder that receives the verification emails from Spamcop. This can be either a completely new email account or a folder combined with some email filtering. I chose the second option and use procmail to filter my incoming email:

:0:
* ^To: spamcop a.t. roleplayer.org
Maildir/.Spamcop-Reply/

Now that we have a separate folder for the verification emails, we need to extract the unique ID contained in them. I created this little script to get the whole URL:
vs.sh (verify spam script)

#!/bin/bash

# ENTER PATH OF THE VERIFICATION EMAILS FROM SPAMCOP
FPATH="/home/mail/web4p1/Maildir/.Spamcop-Reply/cur"

# ENTER WEBPATH TO PHP SCRIPT
URL="http://www.domain.com/spamcop/index.php"

#################################################################
#################################################################

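# Abort if the folder cannot be entered, so that nothing is deleted elsewhere
cd "$FPATH" || exit 1

for FILENAME in *
do

# Skip anything that is not a regular file (e.g. the literal * of an empty folder)
[ -f "$FILENAME" ] || continue

# Get the supplied URL from the spamcop email (first match only)
DATA=$(/bin/grep -o 'http://www\.spamcop\.net/sc?id=[^[:space:]]*' "$FILENAME" | head -n 1)
echo "$DATA"

# Submit the URL to the PHP script
/usr/bin/lynx -dump "$URL?data=$DATA"

# Remove that file
/bin/rm -- "$FILENAME"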

done

Again, quite a simple script. It goes to the given path, loops through all the emails there, picks out the link containing the ID and passes it to a PHP script (which then does the actual form submission). Again, two things need to be adjusted:

1.) The FPATH variable needs to point to the cur folder of your verification email folder (Spamcop-Reply in my case).

2.) The URL variable needs to be set to your weblocation of the script.
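
Because vs.sh deletes every email it has processed, it can be useful to do a dry run first and see which links would be found. The following sketch only prints the links; it contacts no server and deletes nothing. The function name list_verification_urls is made up, and the sample ID in the usage note is invented as well.

```shell
#!/bin/bash
# Dry run: print the Spamcop link found in each file of a directory,
# without submitting anything and without deleting anything.
# $1 = directory holding the verification emails (e.g. .Spamcop-Reply/cur)
list_verification_urls() {
    local dir="$1" file url
    for file in "$dir"/*; do
        [ -f "$file" ] || continue
        # -o prints only the matching part of the line; in a basic regular
        # expression the "?" is a literal question mark
        url=$(grep -o 'http://www\.spamcop\.net/sc?id=[^[:space:]]*' "$file" | head -n 1) || true
        if [ -n "$url" ]; then
            echo "$url"
        else
            echo "no link found in: $file" >&2
        fi
    done
}

# Example (path taken from this howto):
# list_verification_urls /home/mail/web4p1/Maildir/.Spamcop-Reply/cur
```

For an email containing a line such as http://www.spamcop.net/sc?id=z123abc, the function prints exactly that link; files without a link are reported on standard error.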

Spamcop form submission script

So far we have forwarded all spam emails to Spamcop, received their verification emails containing the ID for the form submission, and sent that data to a PHP script.
Now create a PHP script with the following content. Make sure it is located at the path given in the vs.sh script, and put the Snoopy.class.php file into the same folder as the PHP script.
index.php (form submission script)

<?php

// Function for displaying an array in a table (also works on multidimensional arrays)
function displayArray($aArray) {
    if (is_array($aArray) && (count($aArray) > 0)) {
        print("<table border=1>");
        print("<tr><th>Key</th><th>Value</th></tr>");
        foreach ($aArray as $aKey => $aValue) {
            print("<tr>");
            if (!is_array($aValue)) {
                if (empty($aValue)) {
                    print("<td>$aKey</td><td><i>$aValue</i></td>");
                } else {
                    print("<td>$aKey</td><td>$aValue</td>");
                }
            } else {
                print("<td>$aKey(array)</td><td>");
                displayArray($aValue);
                print("</td>");
            }
            print("</tr>");
        }
        print("</table>");
    } else {
        print("<i>empty or invalid</i>");
    }
}

// The default form fields (those are being repeated to everyone the mail is sent to)
$offender = array("type", "master", "info", "sc_comment", "comment");

// The default form fields (these are the unique fields)
$form_vars = array("action", "spamid", "crc", "date", "source", "reports", "goodrelay", "max", "notes");

// Get the URL from the attached parameters
$data_org = isset($_GET["data"]) ? $_GET["data"] : "";

// Split it at sc?id= so that you have the "id code" only
$data = explode("sc?id=", $data_org);
$data = isset($data[1]) ? trim($data[1]) : "";

// Just some verification
echo "SC-ID: " . htmlspecialchars($data);

if ($data == "") {
    echo "done";
    exit;
}

echo "<hr>";

// Require the snoopy class for retrieving the form
require_once("Snoopy.class.php");

$snoopy = new Snoopy;

$snoopy->fetch("http://www.spamcop.net/sc?id=" . $data);

$results = $snoopy->results;

// Another verification that it is actually a spam email that can be submitted....
$results = explode('<form action="/sc"', $results);
$results = isset($results[1]) ? $results[1] : "";

if ($results == "") {
    echo "done";
    exit;
}

// Collect the form values to be submitted
$submit_vars = array();

// Count the number of recipients
$i = substr_count($results, 'textarea name="comment');

while ($i > 0) {

    foreach ($offender as $val) {

        // Get Field Value
        $findme = 'name="' . $val . $i . '"';
        $pos = strpos($results, $findme);
        // Take the first quoted string after the field name (empty if the field is missing)
        $res = ($pos === false) ? array() : explode('"', substr($results, $pos + strlen($findme)), 3);
        $res = isset($res[1]) ? $res[1] : "";
        if ($val == "comment") { $res = ""; }

        $submit_vars["send" . $i] = "on";
        $submit_vars[$val . $i] = $res;

    }

    $i--;

}

$submit_vars["submit"] = "Send Spam Report(s) Now";

foreach ($form_vars as $val) {

    // Get Field Value
    $findme = 'name="' . $val . '"';
    $pos = strpos($results, $findme);
    // Take the first quoted string after the field name (empty if the field is missing)
    $res = ($pos === false) ? array() : explode('"', substr($results, $pos + strlen($findme)), 3);
    $res = isset($res[1]) ? $res[1] : "";
    if ($val == "notes") { $res = ""; }

    $submit_vars[$val] = $res;

}

// Display the data to be sent --> can be deactivated
displayArray($submit_vars);

// Create a new instance to submit the form data
$snoopy = new Snoopy;

$submit_url = "http://www.spamcop.net/sc";

if ($snoopy->submit($submit_url, $submit_vars)) {
    // foreach replaces the each() construct, which no longer exists in PHP 8
    foreach ($snoopy->headers as $key => $val) {
        echo $key . ": " . $val . "<br>\n";
    }
    echo "<p>\n";
    echo "<PRE>" . htmlspecialchars($snoopy->results) . "</PRE>\n";
} else {
    echo "error fetching document: " . $snoopy->error . "\n";
}

?>

I have added quite a few comments to help you understand the logic.
One thing you might want to change is the line

                $submit_vars["send".$i] = "on";

This line may be removed, but then this line:

$offender = array("type", "master", "info", "sc_comment", "comment");

needs to be changed to:

$offender = array("type", "master", "info", "sc_comment", "comment", "on");

In the unaltered version you tell Spamcop to send a mail to every entry found in the headers, while the altered version follows Spamcop's recommendations (this is probably the safer method).

Cron Setup

We now have everything together, but we would still have to run the two Bash scripts (fe.sh and vs.sh) manually. Since this tutorial is about automating the process, the last step is to create cron jobs that do the work for us.
I personally prefer to create a text file and then load it into crontab. To do so, simply create a file containing this:
cron.txt (cron command file)

0,20,40 * * * * /bin/sh /PATH/TO/fe.sh
10,30,50 * * * * /bin/sh /PATH/TO/vs.sh

Of course you have to replace /PATH/TO with your own path to where the files are. In my case it's this:

0,20,40 * * * * /bin/sh /home/mail/web4p1/fe.sh > /home/mail/web4p1/output1.txt
10,30,50 * * * * /bin/sh /home/mail/web4p1/vs.sh > /home/mail/web4p1/output2.txt

Note: I have also redirected the output to files in order to see whether the cron jobs and scripts run fine. Once you are satisfied, just delete the > ... part from the cron text file.

So, now we have created the cron text file, but how do we add it as a cron job? The answer is straightforward:

crontab -uUSER cron.txt

Just replace USER with the user the cron jobs should run under. You can leave out -uUSER if you are logged in as that user, or if you are logged in as root and want the jobs to run as root (not recommended!!!). Note that loading a file this way replaces that user's existing crontab.
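
If the user already has cron jobs, the new lines can be appended to the existing table instead. The following is a sketch of one way to do that; the function name merge_cron_lines is made up for this example. It works on plain text, so it can be tried out without touching the real crontab.

```shell
#!/bin/bash
# Merge new cron lines into an existing crontab text, skipping lines that
# are already present.
# $1 = current crontab text, $2 = lines to add
merge_cron_lines() {
    local current="$1" new="$2" line
    if [ -n "$current" ]; then
        printf '%s\n' "$current"
    fi
    while IFS= read -r line; do
        if [ -z "$line" ]; then
            continue
        fi
        # -x: match whole lines only, -F: fixed string, -q: no output
        if ! printf '%s\n' "$current" | grep -qxF -- "$line"; then
            printf '%s\n' "$line"
        fi
    done <<EOF
$new
EOF
}

# Possible use ("crontab -l" prints the current table, "crontab -" reads a
# new table from standard input):
# merge_cron_lines "$(crontab -l 2>/dev/null)" "$(cat cron.txt)" | crontab -
```

Running the commented command twice should leave the table unchanged the second time, because lines that are already present are not added again.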

Final words

Well, that's it.

1.) You can download a copy of the scripts from the forum.

2.) Don't forget to chown and chmod the files correctly (I made the shell scripts executable for the user, but I'm not sure whether that is required).

3.) You only need one vs.sh script as long as you keep using the same Spamcop submission address. To use the auto-submission for another email account, all that is required is a "Spam" folder in that account and an fe.sh script running on it.

4.) I set cron to run every 20 minutes. You may well want to change that to once an hour, but speed of submission is crucial: the faster you submit and verify spam, the sooner it appears in the Spamcop RBL. Because of that, and because I'm online almost non-stop during the day (it is my job), I chose 20 minutes.
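
Regarding point 2, the permissions could be set roughly like this. It is a sketch only; the function name restrict_to_owner is made up, and mode 700 (full access for the owner, none for anyone else) is my assumption of a sensible choice for scripts that contain your Spamcop address.

```shell
#!/bin/bash
# Give the owner full access to the given files and remove all access for
# group and others (mode 700).
restrict_to_owner() {
    chmod 700 "$@"
}

# Example (paths taken from this howto; adjust them to your setup):
# restrict_to_owner /home/mail/web4p1/fe.sh /home/mail/web4p1/vs.sh
```

Since the cron lines above start the scripts through /bin/sh, the execute bit is not strictly needed; read access for the user running the cron job is enough.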

If you have improvements and suggestions, let me know :)
