Re: Replication stopping with exactly the same characteristics as [Jesse]
Memory allocation on the master to copy binlog events to a slave is done one
event at a time. Whether a slave is always or intermittently connected
should not change the size of the allocations.
If this error is possible in your setup, there are steps you can take on the
master to avoid it without code changes. It is difficult to determine whether
out-of-memory errors have occurred while allocating memory for replication
events, so prevention is the practical approach. Preventive measures include:
reducing the number of slaves connected at the same time, reducing the
maximum binlog event size (lower the maximum network packet size allowed
between a client and server, max_allowed_packet, to control this), and
reducing memory used for other things on the server: the InnoDB buffer pool,
sort buffers, etc.
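To see how these three factors trade off, here is a rough back-of-the-envelope
sketch in Python. It assumes one binlog dump thread per connected slave and
that, in the worst case, every dump thread holds one maximum-size event at the
same moment. The function names and the numbers are made up for illustration,
and the model ignores allocator overhead and fragmentation.

```python
GIB = 1024 ** 3
MIB = 1024 ** 2


def worst_case_dump_bytes(num_slaves: int, max_event_bytes: int) -> int:
    """Upper bound on memory held by binlog dump threads.

    Assumes each connected slave has one dump thread and that every
    thread has allocated a buffer for a maximum-size event at once.
    """
    return num_slaves * max_event_bytes


def fits_in_memory(num_slaves: int, max_event_bytes: int,
                   other_bytes: int, available_bytes: int) -> bool:
    """True if other server memory plus the worst-case dump buffers fit."""
    total = other_bytes + worst_case_dump_bytes(num_slaves, max_event_bytes)
    return total <= available_bytes


# Hypothetical server: 16 GiB usable, 14 GiB already used by other buffers.
before = fits_in_memory(3, 1 * GIB, 14 * GIB, 16 * GIB)    # 1 GiB events
after = fits_in_memory(3, 256 * MIB, 14 * GIB, 16 * GIB)   # 256 MiB events
print(before, after)
```

With these made-up numbers, three slaves and 1 GiB events do not fit, while
capping events at 256 MiB does. Lowering the slave count or the memory used
elsewhere moves the same inequality.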
We never used --slave-skip-errors and I don't know if it helps. In my case,
I didn't want the slave to skip a replication event when an error occurred.
I wanted it to restart or retry when it received an error while trying to
copy a replication event from the master, because the error didn't occur in
the slave SQL thread while processing the event. It occurred in the IO thread
while copying an event from the master, and no events in the relay log had
the 'error on master' flag set.
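The "restart or retry" behavior can be approximated outside the server with a
small watchdog. The Python sketch below is not what we run; it only shows the
decision step. It assumes you have already fetched one row of SHOW SLAVE
STATUS as a dict (the Slave_IO_Running and Slave_SQL_Running columns are
real, the function name and the retry limit are my own), and it returns the
statement to issue rather than talking to a server.

```python
from typing import Mapping, Optional


def next_statement(status: Mapping[str, str], attempts: int,
                   max_attempts: int = 3) -> Optional[str]:
    """Choose a restart statement from one SHOW SLAVE STATUS row.

    Returns None when both threads are running, or when the retry
    budget is used up and a person should look at the slave.
    """
    io_running = status.get("Slave_IO_Running") == "Yes"
    sql_running = status.get("Slave_SQL_Running") == "Yes"

    if io_running and sql_running:
        return None  # replication is healthy, nothing to do
    if attempts >= max_attempts:
        return None  # stop retrying; the failure is probably not transient
    if not io_running and not sql_running:
        return "START SLAVE"
    if not io_running:
        return "START SLAVE IO_THREAD"
    return "START SLAVE SQL_THREAD"


# Example: the IO thread died while copying an event from the master.
row = {"Slave_IO_Running": "No", "Slave_SQL_Running": "Yes"}
print(next_statement(row, attempts=0))
```

Restarting resumes from the stored position, so no event is skipped. The
retry limit is there so that a real, repeatable error still stops replication
instead of looping forever.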
On 10/31/07, Kieffer, Tom <Tom.Kieffer@ds-plan.com> wrote:
>
> I have very large replication events, because I use replication
> discontinuously. The slave generally disconnects at least once a day and
> on weekends.
>
> The query that causes the truncation error has no errors for sure, because
> it has executed in the row just before the one that causes the error.
>
> I use version 5.0.29. I found your article on code.google.com but if I
> understand well, the patch is for versions 4 only.
>
> Is it possible that the SQL thread on the slave stops even if I put
> slave-skip-errors in the config file? I planned to simply ignore the
> errors, but the slave still stops as mentioned. Is there really anything
> I could do?
>
> Kind regards
>
> Tom Kieffer
> ________________________________________
>
> DS-Plan
> Obere Waldplätze 11
> 70569 Stuttgart
> Germany
>
> Tel: +49 711 687070-350
> Fax: +49 711 687070-368
> tom.kieffer@ds-plan.com
> www.ds-plan.com
> ________________________________________
>
> DS-Plan Aktiengesellschaft, registered office in Stuttgart
> Executive Board: Peter Tzeschlock (Chair), Martin Lutz,
> Dr.-Ing. Michael Bauer
> Chairman of the Supervisory Board: Andreas Blaschkowski
> Commercial register entry: Amtsgericht Stuttgart, HRB 720034
>
> -----Original Message-----
> From: Mark Callaghan [mailto:mcallaghan@google.com]
> Sent: Monday, 29 October 2007 16:17
> To: Kieffer, Tom
> Cc: replication@lists.mysql.com
> Subject: Re: Replication stopping with exactly the same characteristics
> as [Jesse]
>
> We had errors similar to this. They occurred because the master got memory
> allocation errors while trying to copy a large replication event to the
> slave. The master did not log an error for this, nor did it notice the
> memory allocation error. The slave got the truncation error and replication
> stopped. The problems on the master have been fixed in recent 5.0 branches.
>
> Large replication events are not streamed to slaves. They require
> contiguous memory allocation. Binlog dump threads on the master call
> malloc/free for every replication event copied to a slave.
>
> On 10/29/07, Kieffer, Tom <Tom.Kieffer@ds-plan.com> wrote:
> >
> > Hello,
> >
> > I just ran into the same problem as Jesse. Replication is stopping
> > without any visible reason. Slave stays at "Waiting for master to send
> > event". MySQL writes to the error log:
> > Slave SQL thread exiting, replication stopped in log 'xxx' at position
> > yyy.
> > I have the "Data truncated" error ignored. I never understood why an
> > entry is accepted on the master but throws a "Data truncated" error on
> > the slave. But if replication stops even with ignore_error, this
> > parameter doesn't really make sense.
> > Until now I couldn't figure out any solution whatsoever. I just issue a
> > "Start Slave" and replication continues where it stopped.
> > The data is not yet critical but may become so in the near future, so I'm
> > worrying somewhat about the replication behaviour.
> >
> > Tom Kieffer
>
> --
> Mark Callaghan
> mcallaghan@google.com
>
--
Mark Callaghan
mcallaghan@google.com
Received on Wed Oct 31 08:46:30 2007