Re: Purpose for SpamAssassin using MySQL

From: Michal Jeczalik <michal(at)jeczalik.com>
Date: Wed Oct 10 2007 - 15:11:22 EDT


On Wed, 3 Oct 2007, Rob Mangiafico wrote:

> On Tue, 2 Oct 2007, Michał Jęczalik wrote:
>> There are many. It allows you to share data between user accounts (IMHO it
>> doesn't make much sense to have separate bayes databases for each account,
>> at least if they are of a 'massive' sort and users are not allowed to feed
>> their own spam/ham etc. - because they share mostly the same data and the
>> bayes is more up-to-date if one single database autolearns from many
>> mailboxes). It allows you to share data among several hosts. It allows
>> you to keep data on a remote host if you don't have enough space. Etc.
>
> Picking up on the point of one Bayes DB in MySQL vs. individual ones for
> each user, is it more effective in an ISP/host environment where you have
> diverse users to have them all share one Bayes DB with autolearn, or is it
> better if they each have their own Bayes data in MySQL (per user)?
>
> We're slowly converting to mysql for bayes, and have not decided yet which
> method would be best for our users and for the servers in general. Thanks.

Sorry for the late answer. Yes, a single shared database is more effective, and that was my main reason for setting it up this way. You have one Bayes DB and one auto-expire run, and you need disk space for only one database. If anything goes wrong (a disk failure or a database malfunction), you only have to recreate one DB.

Unless you have a significant reason to keep per-user Bayes databases, you should probably use the one-for-all method.
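
For reference, a one-for-all setup might look roughly like the local.cf fragment below. This is only a sketch: the database name, host, user name, password and the shared owner name "shared" are placeholders I made up, not values from any real installation. The point is bayes_sql_override_username, which makes every account read and write Bayes data under the same owner.

```
# local.cf (sketch; all values below are placeholders)
bayes_store_module           Mail::SpamAssassin::BayesStore::MySQL
bayes_sql_dsn                DBI:mysql:sa_bayes:db.example.com
bayes_sql_username           sa_user
bayes_sql_password           changeme

# Store everyone's tokens under one owner instead of one set per account
bayes_sql_override_username  shared

# One database that autolearns from all mailboxes and expires itself
bayes_auto_learn             1
bayes_auto_expire            1
```

Leaving out bayes_sql_override_username keeps the data in MySQL but separated per user, which is the other option discussed in this thread.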

One more advantage: I'm not deep into SQL performance tuning, but one-for-all is probably faster, because the SQL engine doesn't have to look up tokens across many (possibly thousands of) separate Bayes databases, and it can probably cache at least some of the shared tokens. Remember that on a large system it's common for the same spam message to arrive in multiple mailboxes at the same time.
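
To put rough numbers on that last point, here is a toy Python model (not SpamAssassin code; the function name, tokens and mailbox count are all made up for illustration). It only counts how many distinct token rows are needed to score one spam message that lands in many mailboxes. It says nothing about how MySQL actually caches, but fewer distinct rows is what gives a cache a chance.

```python
# Toy model, hypothetical names throughout: count the distinct
# (owner, token) rows needed when one message hits many mailboxes.

def rows_touched(message_tokens, mailboxes, shared):
    """Return the set of (owner, token) rows needed to score one
    message once for every mailbox."""
    rows = set()
    for user in mailboxes:
        # With a shared database every lookup uses the same owner.
        owner = "shared" if shared else user
        for token in message_tokens:
            rows.add((owner, token))
    return rows

tokens = {"viagra", "cheap", "click", "here"}
mailboxes = [f"user{i}" for i in range(1000)]

shared_rows = rows_touched(tokens, mailboxes, shared=True)
per_user_rows = rows_touched(tokens, mailboxes, shared=False)

print(len(shared_rows))    # 4
print(len(per_user_rows))  # 4000
```

With the shared owner, the same 4 rows are read 1000 times; with per-user data, 4000 different rows are each read once.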

-- 
Michał Jęczalik, +48.603.64.62.97
INFONAUTIC, +48.33.487.69.04
Received on Wed Oct 10 15:11:25 2007

This archive was generated by hypermail 2.1.8 : Fri Jul 04 2008 - 12:18:54 EDT

