Re: A different approach to scoring spamassassin hits, Re: A different approach to scoring spamassassin hits

From: Nix <nix(at)esperi.org.uk>
Date: Thu Jul 05 2007 - 16:45:48 EDT


On 5 Jul 2007, tom@tacocat.net stated:

> On 7/2/2007, "Nix" <nix@esperi.org.uk> wrote:
>
>>If you wanted to replace all other scoring mechanisms with the Bayes DB,
>>you'd need a second Bayes DB for this, anyway, or you'd need the tokens
>>corresponding to typically negative-scoring rules to have values which
>>cannot appear in the body of an email. Anything else would enable spammers
>>to force both FPs and FNs by customizing spam appropriately to include
>>suitable NO_FOO/YES_FOO values.
>
> That's why the data is being passed in as a second reference, nothing to
> do with the message. Seems to be working well, but there's some
> optimization to include.

A second reference isn't enough on its own. The tokens for the rule hits also need to be independent of the message-derived tokens inside the Bayes database itself: i.e., it must be impossible for spammers to put anything in the message body that generates tokens able to influence the scores of the Bayes DB tokens corresponding to the Bayes-scored rule hits.
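
To make the independence requirement concrete, here is a minimal sketch in Python. It is not how SpamAssassin's Bayes code or Tom's patch works; the names (BODY_NS, RULE_NS, tokenize) and the whitespace tokenizer are assumptions made up for illustration. The idea is that every token is a (namespace, value) pair, and the body tokenizer can only ever emit pairs in the body namespace, so no text in a message can collide with a rule-hit token.

```python
# Hypothetical sketch: keep rule-hit tokens in a namespace that
# message-derived tokens can never occupy. None of these names come
# from SpamAssassin itself.

BODY_NS = "body"   # tokens derived from message text
RULE_NS = "rule"   # tokens derived from rule hits, passed in separately


def body_tokens(text):
    """Tokenize message text; every token is tagged with BODY_NS."""
    return [(BODY_NS, word) for word in text.split()]


def rule_tokens(rule_hits):
    """Turn a set of rule names into tokens tagged with RULE_NS."""
    return [(RULE_NS, name) for name in sorted(rule_hits)]


def tokenize(text, rule_hits):
    """Combine both token sources; the namespaces keep them disjoint."""
    return body_tokens(text) + rule_tokens(rule_hits)


if __name__ == "__main__":
    # A spammer stuffs rule-looking strings into the body...
    spoof = tokenize("buy now YES_FOO rule:NO_BAR", set())
    # ...but they all land in the body namespace.
    print(all(ns == BODY_NS for ns, _ in spoof))
```

Because the namespace is part of the token's key rather than a prefix inside a string, even a body word like "rule:NO_BAR" stays a body token. A scheme that merely prepends "rule:" to a string key would not have this property.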

(btw, Tom, what's wrong with your mailer? ^M characters --- CRCRLF line terminators on the wire, perhaps? --- a doubled-up Subject line, and two To: lines, one with fullnames, one without... I cleaned up the ^Ms in this response.)

-- 
`... in the sense that dragons logically follow evolution so they would
 be able to wield metal.' --- Kenneth Eng's colourless green ideas sleep
 furiously
Received on Thu Jul 5 16:46:34 2007

This archive was generated by hypermail 2.1.8 : Fri Jul 06 2007 - 15:57:44 EDT

