Pantek Library
Hosting Provided By
CybrHost
High Speed Hosting

RE: Replication issues, slaves falling "behind"

From: Matt Juszczak <matt(at)atopia.net>
Date: Fri Aug 10 2007 - 14:02:54 EDT


Rick:

There are many different queries causing the problems. Again, I believe I've tuned as best as I possibly can, and I'm not sure why some of these queries are just taking forever at random times.

Here is one that just decided to hang on insert for 10 seconds, without anything actually locking it. It was a replicated query. I've subsituted fields with field[1-9] and value[1-9], etc. It brought our seconds behind master behind to 10 seconds while inserting on the slave from the master.

Show processlist (chopped off prior to query): 2 system user databasename Connect 6 update

Query:
INSERT INTO tablename
(field2, field3, field4, field5, field6, field7, field8, field9 ) VALUES (
value1, value2, value3, value4, value5, value6, value7, value8);

Show create table:
CREATE TABLE `tablename` (

   `field1` int(11) NOT NULL auto_increment,
   `field2` datetime NOT NULL default '0000-00-00 00:00:00',
   `field3` varchar(128) default NULL,
   `field4` varchar(16) NOT NULL default '',
   `field5` varchar(32) default NULL,
   `field6` varchar(64) NOT NULL default '',
   `field7` varchar(100) NOT NULL default '',
   `field8` varchar(128) default NULL,
   `field9` int(1) default NULL,

   PRIMARY KEY (`field1`),
   KEY `name1` (`field2`),
   KEY `name2` (`field6`),
   KEY `name3` (`field4`),
   KEY `name4` (`field5`)

) ENGINE=MyISAM AUTO_INCREMENT=29484916 DEFAULT CHARSET=latin1

Show table status:

           Name: tablename
          Engine: MyISAM
         Version: 10
      Row_format: Dynamic
            Rows: 29486074
  Avg_row_length: 85
     Data_length: 2506923996

Max_data_length: 281474976710655
    Index_length: 1111112704
       Data_free: 0
  Auto_increment: 29486096
     Create_time: 2007-07-19 07:55:33
     Update_time: 2007-08-10 13:55:41
      Check_time: 2007-08-06 16:07:58
       Collation: latin1_swedish_ci
        Checksum: NULL
  Create_options:
         Comment: *****

Pretty busy right now on the slaves, but here's the I/O Stat and vmstat:

Do you need help?X

avg-cpu: %user %nice %sys %iowait %idle

            2.24 0.00 0.90 1.13 95.73

Device:            tps   Blk_read/s   Blk_wrtn/s   Blk_read   Blk_wrtn
sda              26.56       105.67       375.72  273221908  971416258
sda1              0.00         0.04         0.00      90774       1314
sda2              2.01         1.56        15.91    4032902   41141216
sda3              0.00         0.00         0.00       2820       3000
sda4              0.00         0.00         0.00          2          0
sda5             49.35       104.08       359.80  269095194  930270704
sdb               2.77       145.37       100.34  375861300  259432720
sdb1             13.73       145.37       100.34  375861164  259432592

avg-cpu:  %user   %nice    %sys %iowait   %idle
            3.50    0.00    0.50    0.00   96.00

Device:            tps   Blk_read/s   Blk_wrtn/s   Blk_read   Blk_wrtn
sda               1.00         8.00         0.00          8          0
Do you need more help?X
sda1 0.00 0.00 0.00 0 0 sda2 0.00 0.00 0.00 0 0 sda3 0.00 0.00 0.00 0 0 sda4 0.00 0.00 0.00 0 0 sda5 1.00 8.00 0.00 8 0 sdb 0.00 0.00 0.00 0 0 sdb1 0.00 0.00 0.00 0 0 avg-cpu: %user %nice %sys %iowait %idle 16.21 0.00 0.75 0.25 82.79 Device: tps Blk_read/s Blk_wrtn/s Blk_read Blk_wrtn sda 0.99 7.92 0.00 8 0 sda1 0.00 0.00 0.00 0 0 sda2 0.00 0.00 0.00 0 0 sda3 0.00 0.00 0.00 0 0 sda4 0.00 0.00 0.00 0 0 sda5 0.99 7.92 0.00 8 0 sdb 0.00 0.00 0.00 0 0 sdb1 0.00 0.00 0.00 0 0

procs -----------memory---------- ---swap-- -----io---- --system-- ----cpu----   r b swpd free buff cache si so bi bo in cs us sy id wa

  0  0    252  25844  24724 2496240    0    0    31    60    3     2  2  1 96  1
  0  0    252  25844  24724 2496240    0    0    24     0 1059   123  0  0 99  1
  0  0    252  25852  24724 2496308    0    0    24     0 1187   323  0  0 99  1
  0  1    252  24188  24724 2497940    0    0  1636     0 1329   605  0  0 83 16
  0  1    252  20924  24732 2500992    0    0  3108   724 1562   921  1  1 74 24
  0  1    252  19772  24736 2502280    0    0  1228     0 1472   773  1  0 76 23
Can we help you?X
0 0 252 19660 24740 2502412 0 0 40 0 1758 913 1 1 96 1 0 0 252 18892 24740 2503160 0 0 776 0 2268 1506 4 0 81 15 0 1 252 18444 24740 2503568 0 0 460 0 1650 883 1 0 80 19
  0 0 252 18444 24764 2503544 0 0 12 1404 1309 340 0 0 84 16

On Fri, 10 Aug 2007, Rick James wrote:

> You are hitting the other caching.
>
> Caching in Mysql:
> * "Query cache" -- exactly the same SELECT rerun
> * key_buffer -- MyISAM indexes
> * innodb_buffer_pool -- InnoDB data and index
> * Unix caching -- MyISAM data.
>
> "runs instantly" is because the index and data are in the non-query_cache
> cache(s).
>
> Handler counts won't recognize this -- they mostly count the reaching for
> blocks, not the actual disk reads needed.
>
> Can you run iostat (etc) to confirm that you are using lots of disk I/O when
> it is slow and little disk I/O when it is fast.
>
> What is the query?
> What is the EXPLAIN?
> SHOW TABLE STATUS LIKE 'tbl'\G
> SHOW CREATE TABLE tbl;
>
>
>> -----Original Message-----
>> From: Matt Juszczak [mailto:matt@atopia.net]
>> Sent: Friday, August 10, 2007 7:19 AM
>> To: Rick James
>> Cc: replication@lists.mysql.com
>> Subject: RE: Replication issues, slaves falling "behind"
>>
>> We've got the thread_cache_size at a decent size. Any tuning
>> that can be
>> done in Oreilly's book(s) or from documentation has been
>> done, and looked
>> over twice.
>>
>> The problem is that many of these queries locking replication
>> take 5-10
>> seconds to run, but if I run them again, this time altering
>> the query by
>> adding a space so that it won't go to the query cache, most
>> of the time it
>> runs instantly.
>>
>> I'm trying to figure out what's causing the queries to take so long
>> randomly, and run instantly other times.
>>
>> The only thing I'm wondering is if we're creating too many
>> tmp tables to
>> disk.
>>
>> These polls, during off peak hours, are 30 seconds apart:
>>
>> mysql> show global status like '%tmp%';
>> +-------------------------+---------+
>> | Variable_name | Value |
>> +-------------------------+---------+
>> | Created_tmp_disk_tables | 1018168 |
>> | Created_tmp_files | 1392 |
>> | Created_tmp_tables | 8938265 |
>> +-------------------------+---------+
>> 3 rows in set (0.00 sec)
>>
>> mysql> show global status like '%tmp%';
>> +-------------------------+---------+
>> | Variable_name | Value |
>> +-------------------------+---------+
>> | Created_tmp_disk_tables | 1018318 |
>> | Created_tmp_files | 1392 |
>> | Created_tmp_tables | 8938809 |
>> +-------------------------+---------+
>> 3 rows in set (0.00 sec)
>>
>>
>> So in a period of 30 seconds, About 200 tmp_disk_tables were
>> created. And
>> our tmp_table_size is set to 256MB....
>>
>> -Matt
>>
>> On Thu, 9 Aug 2007, Rick James wrote:
>>
>>> You checked the query before the delayed one?
>>>
>>> Is the hardware on the slaves at least as good as on the master?
>>>
>>> Are you willing to share these:
>>> EXPLAIN ... \G (on the SELECT equivalent of the UPDATE/INSERT/...)
>>> SHOW TABLE STATUS LIKE 'tbl';
>>> SHOW CREATE TABLE tbl;
>>>
>>> Many of the numbers below need to be divided by Uptime to
>> make any sense.
>>> Also deltas would be more useful.
>>>
>>> Threads_created can be improved by increasing thread_cache_size.
>>>
>>>> -----Original Message-----
>>>> From: Matt Juszczak [mailto:matt@atopia.net]
>>>> Sent: Thursday, August 09, 2007 8:43 AM
>>>> To: replication@lists.mysql.com
>>>> Subject: Replication issues, slaves falling "behind"
>>>>
>>>> Hi all,
>>>>
>>>> I've got a three server setup. Master, and two slaves.
>> Both slaves
>>>> replicate from the master.
>>>>
>>>> I've tweaked the two slaves as best as I possibly can, and
>>>> all performance
>>>> indicators look good. However, a script running keeps seeing
>>>> the seconds
>>>> behind master falling behind 3-4, sometimes 10, and even
>> 20 seconds.
>>>>
>>>> I've trapped these, and most of the time, the reason is
>> either that a
>>>> replicated query is taking too long to run, which shouldn't happen,
>>>> because many times they are just simple updates on myisam
>>>> tables using
>>>> indexes (an explain confirms this). Sometimes, the table is
>>>> InnoDB. It
>>>> really is random on what causes it, but a "show processlist"
>>>> on the slaves
>>>> show replication lagged because of either replicated and/or
>>>> direct queries
>>>> taking forever, but again, that specific query only "takes forever"
>>>> sometimes.
>>>>
>>>> I believe I've tuned these two slaves the best I can. Here's
>>>> what I've
>>>> got..
>>>>
>>>> | Key_read_requests | 1672843810 |
>>>> | Key_reads | 803414 |
>>>>
>>>> This looks good, the ratio is great...
>>>>
>>>> | Key_write_requests | 12696602 |
>>>> | Key_writes | 5313303 |
>>>>
>>>> Less than 1, which is good.
>>>>
>>>> | Threads_cached | 15 |
>>>> | Threads_connected | 2 |
>>>> | Threads_created | 350013 |
>>>> | Threads_running | 1 |
>>>>
>>>> Thread caching turned on, threads created not increasing
>>>> substantially...
>>>> a few per minute, if that.
>>>>
>>>> | Innodb_buffer_pool_pages_data | 462 |
>>>> | Innodb_buffer_pool_pages_dirty | 0 |
>>>> | Innodb_buffer_pool_pages_flushed | 468790 |
>>>> | Innodb_buffer_pool_pages_free | 0 |
>>>> | Innodb_buffer_pool_pages_latched | 0 |
>>>> | Innodb_buffer_pool_pages_misc | 50 |
>>>> | Innodb_buffer_pool_pages_total | 512 |
>>>> | Innodb_buffer_pool_read_ahead_rnd | 272 |
>>>> | Innodb_buffer_pool_read_ahead_seq | 2044 |
>>>> | Innodb_buffer_pool_read_requests | 108601590 |
>>>> | Innodb_buffer_pool_reads | 36763 |
>>>> | Innodb_buffer_pool_wait_free | 0 |
>>>> | Innodb_buffer_pool_write_requests | 46188825 |
>>>>
>>>> InnoDB buffers looking good.
>>>>
>>>> | Created_tmp_disk_tables | 821424 |
>>>> | Created_tmp_files | 960 |
>>>> | Created_tmp_tables | 6538711 |
>>>>
>>>> I just noticed this actually. Is that too high of a ratio?
>>>> About 13% of
>>>> our temporary tables are going to disk.
>>>>
>>>> Query caching is turned on...
>>>>
>>>> | Table_locks_immediate | 11399631 |
>>>> | Table_locks_waited | 17457 |
>>>>
>>>> As far as locking is concerned, the locks_waited value is
>>>> increasing, but
>>>> its no where near the value of table_locks_immediate, which
>>>> provides for a
>>>> great ratio.
>>>>
>>>> Sure, maybe the table_locks_waited is a bit high anyway, and
>>>> that could be
>>>> causing a bunch of problems.
>>>>
>>>>
>>>> Anyway, was wondering if anyone had any ideas.
>>>>
>>>> Thanks,
>>>>
>>>> Matt
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>> --
>>>> MySQL Replication Mailing List
>>>> For list archives: http://lists.mysql.com/replication
>>>> To unsubscribe:
>>>> http://lists.mysql.com/replication?unsub=rjames@yahoo-inc.com
>>>>
>>>>
>>>
>>>
>>>
>>>
>>
>
>
> !DSPAM:46bc9213614685209328925!
>

-- 
MySQL Replication Mailing List
For list archives: 
http://lists.mysql.com/replication
To unsubscribe:    
http://lists.mysql.com/replication?unsub=lists@pantek.com
Received on Fri Aug 10 14:03:03 2007

This archive was generated by hypermail 2.1.8 : Sun Oct 07 2007 - 10:14:21 EDT


Contact Us  Legal Notices  Order Services Online 
Pantek Home  Privacy Policy  IT news  Site Map  Pantek Library