Bayes innodb problems

From: Micah Anderson <micah(at)riseup.net>
Date: Wed Sep 26 2007 - 20:20:00 EDT

I was having scalability problems with my Bayes DB, so I read up on the mailing list and found that switching to the InnoDB storage engine was recommended because of its row-level locking (versus the table-level locking that comes with MyISAM). Sounds great. So I switched, and everything was fine for several days.

Then today the load on the DB server shot up to 11-13 and spam processing slowed to a crawl. I'm now seeing some incredibly long queries in my slow-query log, such as:

# Time: 070926 17:10:53
# User@Host: spamass[spamass] @ [10.0.2.4]
# Query_time: 758 Lock_time: 0 Rows_sent: 1 Rows_examined: 2205327
SELECT count(*)

               FROM bayes_token
              WHERE id = '4'
                AND ('1190846660' - atime) > '345600';

This seems really wrong....
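
One possible factor, offered as an assumption rather than a confirmed diagnosis: the predicate ('1190846660' - atime) > '345600' wraps the atime column in arithmetic, so MySQL cannot use an index range on atime for it and has to examine every row matching id = '4'. The equivalent comparison atime < 1190846660 - 345600 (tokens older than 4 days) leaves the column bare. The sketch below uses Python's built-in sqlite3 module as a stand-in for MySQL, only to check that the two forms select the same rows. The table is a simplified version of bayes_token, and the index name idx_id_atime and the sample rows are hypothetical. Whether the real schema has a usable index on (id, atime) needs to be checked separately.

```python
import sqlite3

# In-memory SQLite database standing in for the MySQL bayes DB (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE bayes_token ("
    " id INTEGER, token BLOB, spam_count INTEGER, ham_count INTEGER,"
    " atime INTEGER, PRIMARY KEY (id, token))"
)
# Hypothetical index; the real schema may or may not have an equivalent.
conn.execute("CREATE INDEX idx_id_atime ON bayes_token (id, atime)")

now = 1190846660      # timestamp taken from the slow-query log entry
max_age = 345600      # 4 days in seconds, also from the log entry

# Ten sample tokens for user id 4, aged 0 to 9 days.
rows = [(4, bytes([i]), 0, 0, now - i * 86400) for i in range(10)]
conn.executemany("INSERT INTO bayes_token VALUES (?, ?, ?, ?, ?)", rows)

# Form seen in the slow-query log: arithmetic applied to the column.
original = "SELECT count(*) FROM bayes_token WHERE id = ? AND (? - atime) > ?"
# Equivalent form with the bare column compared against a constant.
rewritten = "SELECT count(*) FROM bayes_token WHERE id = ? AND atime < ?"

count_original = conn.execute(original, (4, now, max_age)).fetchone()[0]
count_rewritten = conn.execute(rewritten, (4, now - max_age)).fetchone()[0]


def plan(sql, params):
    """Return the human-readable query plan lines SQLite reports."""
    cursor = conn.execute("EXPLAIN QUERY PLAN " + sql, params)
    return [row[-1] for row in cursor]


print(count_original, count_rewritten)
print(plan(original, (4, now, max_age)))
print(plan(rewritten, (4, now - max_age)))
```

On the real server, running EXPLAIN on both forms of the query would show whether MySQL picks a different access path for the rewritten one. Note that this query is generated by SpamAssassin itself, so this only illustrates the behavior and is not a drop-in fix.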

Then there are queries such as the following, which take at least 30 seconds:

# Time: 070926 17:13:24
# User@Host: spamass[spamass] @ [10.0.2.4]
# Query_time: 30 Lock_time: 0 Rows_sent: 88 Rows_examined: 88
SELECT RPAD(token, 5, ' '), spam_count, ham_count, atime

                     FROM bayes_token
                    WHERE id = '4'
                      AND token IN

(' <binary data removed here> ')

I'm seeing in my spamd logs the following:

Sep 26 17:17:52 spamd2 spamd[5479]: bayes: expire_old_tokens: child processing timeout at /usr/sbin/spamd line 1246. 
Sep 26 17:17:52 spamd2 spamd[1160]: prefork: child states: BBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBB 
Sep 26 17:17:52 spamd2 spamd[1160]: prefork: server reached --max-children setting, consider raising it 

I've got my --max-children set to 50, and I'm hitting this because the DB is not responding fast enough.

Did I hit some sort of tipping point with the number of tokens in my database? Do I have too many, or is something else going on? I have had to turn off Bayes because it's too slow, which is sad because it adds a lot to the results. This is what I have configured:

bayes_store_module                 Mail::SpamAssassin::BayesStore::MySQL
bayes_sql_dsn                      DBI:mysql:bayes:dbw-pn
bayes_sql_username                 spamassassin
bayes_sql_password                 notthepasswd
bayes_sql_override_username        @GLOBAL

# keep the database from getting too big:
bayes_expiry_max_db_size 1000000
# no affect
bayes_learn_to_journal 0

mysql settings related to innodb:

# * InnoDB

innodb_data_file_path = ibdata1:10M:autoextend
#
# Set buffer pool size to 50-80% of your computer's memory
set-variable = innodb_buffer_pool_size=1250M
set-variable = innodb_additional_mem_pool_size=20M
#
# Set the log file size to about 25% of the buffer pool size
set-variable = innodb_log_file_size=313M
set-variable = innodb_log_buffer_size=8M
#

innodb_flush_log_at_trx_commit=1

I'm using spamassassin 3.2.3 and mysql 5.0.45.

Thanks,
Micah

Received on Wed Sep 26 20:20:49 2007

This archive was generated by hypermail 2.1.8 : Sat Oct 27 2007 - 17:53:16 EDT

