[xmail] antispam - feedback needed!

Catalin Caranfil Wed, 05 Feb 2003 03:06:02 -0800

Alternative ways to fight spam

Spam is currently the biggest nuisance on the Internet, and as far as I can see 
the problem gets worse every day!


The single largest problem with spam is that the "spam kings" can make a 
substantial amount of money at very, very small cost to themselves (but at 
large cost to the recipients - mostly in terms of lost time and frustration) - 
if somehow we could level the cost-to-benefit balance then things would be 99% 
fixed! (I think that many people would be very happy to be spammed for a fee of 
1 US$/spam - or at least if the spammer had to spend at least 5 minutes of his 
own time on each email sent to each recipient :)

Certain such fixes have been proposed - from taxing each email (a very funny 
thing which only shows how little our politicians know about email, and how 
everything for them is about taxes :) to approaches involving traditional 
micropayments or the smart nonfungible micropayments and "client puzzles" - 
however those ideas require a new protocol and as a result might be very 
difficult to implement :(

I will try to present here a new antispam idea - and use the same opportunity 
to make that idea "public domain" in order to avoid situations where the 
Amazons or AOLs of the world try to patent it in the future :(

The idea is obviously to add a certain "cost" for spammers without changing the 
existing SMTP protocol in any way and without making that cost a major burden 
for legitimate email - and since money was out of the question (with the 
existing protocol) the main idea left was about time! To be more precise - my 
impression is that while the average email server will only send or receive 
well under 100,000 emails each day, a spammer (even a beginner) will need to 
send 10 times more email each day (and probably the "spam kings" need at least 
100 times more).

I think now is a good moment to mention that the idea is NOT about 
"mail-throttling", and that implementing and testing it requires a well-written 
open-source multi-threaded SMTP server (which explains why I am first posting 
it on the xmail list) - and to be correctly evaluated it requires people with 
some knowledge of programming SMTP servers and ideally also some experience 
administering SMTP servers!

Enough preparation - the idea is to add an ADAPTATIVE DELAY at certain points 
of the receiving SMTP thread - that delay should be implemented in such a way 
that it will be a major burden for known spam-sources or high-volume (unknown) 
spammers but only a very small overhead for legitimate (unknown) email - and 
please keep in mind that another thing that differentiates spam is the much 
lower interest in delivering each individual email (when you have 10 million 
spam messages to send each day any email can be dropped at any time - while 
obviously a normal email server will be willing to spend a lot more effort in 
delivering each and every one of the emails in its queue).

Please note that the most important aspect is the non-exclusive result - 
traditionally both "whitelisting" and "blacklisting" can create lots of 
problems since there is no "middle route" - the email is either 100% good or 
100% bad - while in this new approach we can have "good", "unknown" and 
"probably bad" emails - and it will be possible to still accept emails from a 
server that was incorrectly labeled as a spam source - but only as long as the 
server is not sending a large number of emails and is willing to spend some 
time to deliver each one (which is the pattern that can differentiate good 
email from spam).

I would not really expect this method alone to eliminate spam (great results 
will only be achieved by combining it with other methods - especially with 
public "blacklists") - but I hope that certain spam servers will "time-out" 
much sooner than legitimate servers (and as a result many email addresses 
protected with this method might even be dropped from the really large spam 
lists). I also expect that being a large spammer will become more difficult and 
more expensive - and if this is extended in the right direction the amount of 
spam received by servers implementing this method will be much lower! I also 
hope that this method might seriously decrease the use of "spam relays" 
(especially those on rather slow connections and modest computers).

Obvious problems and limitations:

1) ideally for best results only one email at a time should be received from 
any other (unknown or bad) email server;

2) the number of SMTP threads should be increased (even more if most of the 
emails are coming from unknown or bad servers);

3) even from the start certain servers/domains should probably be 
"whitelisted";

4) if the method becomes successful it is to be expected that spammers will 
"fight back" - they might try to increase the number of threads/processes that 
are sending email at the same time (but that might not be as easy as it 
looks - those values are probably already very large and increasing them even 
more will require more expensive computers and faster and more expensive 
Internet links) and to wait longer before a timeout (which is again not easy 
when you need to send a very large number of messages each day).

5) one thing that might limit the effectiveness of the method for small servers 
(which only receive a small number of emails) is that the amount of information 
is simply too low - a large server will probably get 10-1000 emails/day from 
the same spammer but a small server might only get 1-2 messages from the same 
spammer - this can probably be solved in two ways - either using information 
from anti-spam lists or by implementing an extension of this method in which 
small servers will benefit from the "statistics" of the larger servers!
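
Several of the points above depend on the server remembering how often it has 
recently heard from the same sender. A minimal sketch of such bookkeeping is 
given below; it is not part of xmail. IpClassKey, RecentSenderLog and the use 
of a /24 prefix as the "IP class" are hypothetical choices made only for 
illustration, and timestamps are passed in as plain seconds to keep the example 
self-contained.

```cpp
#include <cstddef>
#include <deque>
#include <map>
#include <string>

// Hypothetical helper: reduce a dotted IPv4 address to its first three
// octets, used here as the "IP class" key (an assumption, not xmail code).
inline std::string IpClassKey(const std::string& ip)
{
    const std::size_t lastDot = ip.rfind('.');
    return lastDot == std::string::npos ? ip : ip.substr(0, lastDot);
}

// Hypothetical log of recent message arrival times, grouped by sender key.
class RecentSenderLog {
public:
    explicit RecentSenderLog(long windowSeconds) : window_(windowSeconds) {}

    // Records a message arriving at time `now` (in seconds) and returns how
    // many earlier messages from the same key are still inside the window.
    int RecordAndCount(const std::string& key, long now)
    {
        std::deque<long>& times = seen_[key];
        // Forget arrivals that are older than the window.
        while (!times.empty() && now - times.front() > window_)
            times.pop_front();
        const int earlier = static_cast<int>(times.size());
        times.push_back(now);
        return earlier;
    }

private:
    long window_;
    std::map<std::string, std::deque<long> > seen_;
};
```

A small server would see mostly zero counts from such a log, which is exactly 
the limitation described in point 5; counts imported from a larger server or 
from an anti-spam list could be merged under the same key.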


Preliminary tests with xmail

My first test was to change a few lines in SMTPSvr.cpp in a function called 
SMTPHandleCmd_DATA - the main loop for that function (which actually gets 
99.99% of the email) is something like:

    for (;;)
    {
        if (BSckGetString(hBSock, szBuffer, sizeof(szBuffer) - 3,
                          SMTPS.pSMTPCfg->iTimeout, &iLineLength, &iGotNL) == NULL)
        ...

Simply adding a call to a function (let's call it AdaptativeDelay () ) to this 
loop should "slow down" certain senders - ideally we would like to add NO 
(zero) real delay for "known good" high-volume servers (probably aol, yahoo, 
hotmail and similar) and only a medium delay for the first email from an 
unknown server (for instance 10-50 milliseconds for each line), but a bigger 
and bigger delay if another email from the same server / domain / IP class is 
attempted within a certain time limit (probably something around 5-30 
minutes?) and eventually a very large delay for certain "probably bad" domains 
or IP classes.
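
The delay policy described here can be sketched as a small pure function. This 
is only an illustration under stated assumptions, not the code used in the 
tests: SenderClass, SenderStats and AdaptativeDelayMs are hypothetical names, 
and the constants (20 ms and 500 ms base delays, doubling per repeat message, a 
2000 ms per-line ceiling) are made-up values chosen to show the shape of the 
curve.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical reputation classes, following the "good" / "unknown" /
// "probably bad" split described in the post.
enum class SenderClass { KnownGood, Unknown, ProbablyBad };

// Hypothetical per-sender record (per server, domain or IP class).
struct SenderStats {
    SenderClass cls;
    int recentMessages; // earlier messages seen inside the time limit
};

// Returns the delay to add per received line, in milliseconds.
// All constants are illustrative assumptions, not measured values.
inline int AdaptativeDelayMs(const SenderStats& s)
{
    assert(s.recentMessages >= 0);
    if (s.cls == SenderClass::KnownGood)
        return 0; // no delay at all for whitelisted high-volume servers

    const int baseMs = (s.cls == SenderClass::ProbablyBad) ? 500 : 20;
    const int maxMs = 2000;

    // Double the delay for every message already seen inside the time
    // limit; the shift is capped so the multiplication cannot overflow.
    const int shift = std::min(s.recentMessages, 10);
    return std::min(baseMs << shift, maxMs);
}
```

With these assumed numbers the first email from an unknown server pays 20 ms 
per line, the fourth one within the time limit pays 160 ms per line, and a 
"probably bad" sender reaches the ceiling after two repeats.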

The AdaptativeDelay () function will not be as simple as it might look (and 
that is even before implementing a look-up on blacklists and whitelists :) - 
since TCP will do a certain amount of invisible buffering, great care must be 
taken with long emails and large delays in order to avoid a total delay of more 
than (about) 60 seconds without any new TCP transmission!!! (One quick fix in 
my tests was to eliminate the delays after a certain number of lines - in my 
experience very long spam (like over 100 kbytes) is very unusual.)
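
One way to express that quick fix, together with an overall cap on the added 
delay, is sketched below. DelayBudget and ClampLineDelayMs are hypothetical 
names, and the budget values a caller would pass in are assumptions; only the 
(about) 60 second figure comes from the observation above.

```cpp
#include <algorithm>

// Hypothetical limits on how much delay one message may accumulate.
struct DelayBudget {
    int maxDelayedLines;  // stop delaying after this many lines
    int maxSinglePauseMs; // upper bound for any one pause
    int maxTotalMs;       // upper bound on the delay added to one message
};

// Returns the delay to apply before reading line `lineIndex` (0-based),
// given the delay the policy asked for and the delay already spent on
// this message. The result never exceeds what is left of the budget.
inline int ClampLineDelayMs(int wantedMs, int lineIndex, int spentMs,
                            const DelayBudget& b)
{
    if (lineIndex >= b.maxDelayedLines)
        return 0; // long message: no further delays
    const int remainingMs = std::max(b.maxTotalMs - spentMs, 0);
    return std::max(0, std::min({wantedMs, b.maxSinglePauseMs, remainingMs}));
}
```

Keeping maxTotalMs well below 60000 means that even if the whole message is 
already sitting in the TCP buffers, the added delays alone cannot keep the 
sender waiting beyond that limit.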

For the moment I am waiting for more feedback from other people before actually 
doing more work on this - what other problems and limitations can you see with 
this method?

Best regards,
Catalin Caranfil
-
To unsubscribe from this list: send the line "unsubscribe xmail" in
the body of a message to [EMAIL PROTECTED]
For general help: send the line "help" in the body of a message to
[EMAIL PROTECTED]
