Crash cluster after crash server.
From: Fabien FAYE <ffaye(at)dclux.com>
Date: Tue Aug 28 2007 - 07:28:07 EDT
We have a cluster of two servers running MySQL 5.0.27. Each server runs a management node, a mysqld and an ndbd. I know MySQL does not recommend running the manager, mysqld and ndbd on the same server, but in this case it is required by our architecture.
Node 1 : Manager 1
Node 2 : Manager 2
Node 10 : NDBD1
Node 11 : NDBD2
Node 20 : Mysql Api 1
Node 21 : Mysql Api 2

On server 1 we have: Node 1, Node 10, Node 20
On server 2 we have: Node 2, Node 11, Node 21

We have run some crash tests on Node 10 and Node 11. During the crash test of Node 10, Node 11 shuts down a few minutes later, and this failure is reproducible. I checked the other log files for clues and found an error during arbitration.

In the configuration file we have defined the following for each manager:
ArbitrationRank=1 on manager 1
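
To make the role of the arbitrator in a layout like this one more concrete, here is a minimal sketch of the decision a surviving set of data nodes has to make. It assumes that Node 10 and Node 11 form a single node group (that is, NoOfReplicas=2), which this message does not state, and it follows the general rule described in the MySQL Cluster documentation: survivors continue on their own only if some node group is completely alive, they must stop if a node group is completely lost, and otherwise they need the arbitrator. The names `Outcome`, `partition_outcome` and `survivor_decision` are hypothetical and are not part of any MySQL tool.

```python
from enum import Enum


class Outcome(Enum):
    CONTINUE = "continue"
    ASK_ARBITRATOR = "ask arbitrator"
    SHUTDOWN = "shutdown"


def partition_outcome(node_groups, alive):
    """Classify what the surviving data nodes may do, before arbitration."""
    alive = set(alive)
    # A node group with no live member means part of the data is gone.
    if any(not (set(group) & alive) for group in node_groups):
        return Outcome.SHUTDOWN
    # A fully alive node group rules out a second viable partition.
    if any(set(group) <= alive for group in node_groups):
        return Outcome.CONTINUE
    # Otherwise a split-brain is possible and the arbitrator must decide.
    return Outcome.ASK_ARBITRATOR


def survivor_decision(node_groups, alive, arbitrator_reachable):
    """Fold the arbitrator's availability into the final outcome."""
    outcome = partition_outcome(node_groups, alive)
    if outcome is Outcome.ASK_ARBITRATOR:
        # Assumption: a reachable arbitrator grants the request.
        return Outcome.CONTINUE if arbitrator_reachable else Outcome.SHUTDOWN
    return outcome


if __name__ == "__main__":
    groups = [[10, 11]]  # assumed: one node group holding both data nodes
    print(survivor_decision(groups, {11}, arbitrator_reachable=True))
    print(survivor_decision(groups, {11}, arbitrator_reachable=False))
```

Under these assumptions, Node 11 surviving alone can only continue while it can still reach an arbitrator, which would be consistent with a forced shutdown reported as an arbitration error once the arbitrator is lost.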
My questions:
Have you already seen this problem? (I have found some similar MySQL bugs, but not for the same case.)
Could this problem be caused by the ArbitrationRank setting?

Thanks for your help!!

Manager 1 log:
2007-08-23 15:40:39 [MgmSrvr] INFO -- Node 10: Local checkpoint 186 started. Keep GCI = 69786 oldest restorable GCI = 26677
2007-08-23 15:55:04 [MgmSrvr] INFO -- Node 10: Local checkpoint 187 started. Keep GCI = 70173 oldest restorable GCI = 26677
2007-08-23 16:03:39 [MgmSrvr] INFO -- Node 10: Local checkpoint 188 started. Keep GCI = 70555 oldest restorable GCI = 70244
2007-08-23 16:17:25 [MgmSrvr] ALERT -- Node 10: Node 20 Disconnected
2007-08-23 16:17:25 [MgmSrvr] INFO -- Node 10: Communication to Node 20 closed
2007-08-23 16:17:25 [MgmSrvr] ALERT -- Node 11: Node 20 Disconnected
2007-08-23 16:17:25 [MgmSrvr] INFO -- Node 11: Communication to Node 20 closed
2007-08-23 16:17:26 [MgmSrvr] INFO -- Mgmt server state: nodeid 20 freed, m_reserved_nodes 0000000000200002.
2007-08-23 16:17:26 [MgmSrvr] INFO -- Node 10: Node shutdown initiated
2007-08-23 16:17:29 [MgmSrvr] INFO -- Node 11: Communication to Node 20 opened
2007-08-23 16:17:29 [MgmSrvr] INFO -- Node 10: Communication to Node 20 opened
2007-08-23 16:17:35 [MgmSrvr] INFO -- Node 1: Node 10 Connected
2007-08-23 16:17:35 [MgmSrvr] INFO -- Node 1: Node 11 Connected

Manager 2 log:
2007-08-23 16:17:25 [MgmSrvr] ALERT -- Node 11: Node 20 Disconnected
2007-08-23 16:17:25 [MgmSrvr] INFO -- Node 11: Communication to Node 20 closed
2007-08-23 16:17:25 [MgmSrvr] ALERT -- Node 10: Node 20 Disconnected
2007-08-23 16:17:25 [MgmSrvr] INFO -- Node 10: Communication to Node 20 closed
2007-08-23 16:17:26 [MgmSrvr] INFO -- Node 10: Node shutdown initiated
2007-08-23 16:17:29 [MgmSrvr] INFO -- Node 11: Communication to Node 20 opened
2007-08-23 16:17:29 [MgmSrvr] INFO -- Node 10: Communication to Node 20 opened
2007-08-23 16:17:32 [MgmSrvr] WARNING -- Node 11: Node 1 missed heartbeat 2
2007-08-23 16:17:33 [MgmSrvr] WARNING -- Node 11: Node 10 missed heartbeat 2
2007-08-23 16:17:33 [MgmSrvr] WARNING -- Node 11: Node 1 missed heartbeat 3
2007-08-23 16:17:35 [MgmSrvr] WARNING -- Node 11: Node 10 missed heartbeat 3
2007-08-23 16:17:35 [MgmSrvr] WARNING -- Node 11: Node 1 missed heartbeat 4
2007-08-23 16:17:35 [MgmSrvr] ALERT -- Node 11: Node 1 declared dead due to missed heartbeat
2007-08-23 16:17:35 [MgmSrvr] INFO -- Node 11: Lost arbitrator node 1 - process failure [state=6]
2007-08-23 16:17:35 [MgmSrvr] INFO -- Node 11: Communication to Node 1 closed
2007-08-23 16:17:35 [MgmSrvr] ALERT -- Node 11: Node 1 Disconnected
2007-08-23 16:17:35 [MgmSrvr] INFO -- Node 2: Node 10 Connected
2007-08-23 16:17:38 [MgmSrvr] INFO -- Node 2: Node 11 Connected
2007-08-23 16:23:57 [MgmSrvr] ALERT -- Node 11: Forced node shutdown completed. Initiated by signal 0. Caused by error 2305: 'Node lost connection to other nodes and can not form a unpartitioned cluster, please investigate if there are error(s) on other node(s)(Arbitration error). Temporary er

Received on Tue Aug 28 07:32:19 2007
This archive was generated by hypermail 2.1.8 : Sun Oct 07 2007 - 10:15:04 EDT
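
When reading management-node logs like the two quoted in this message, it can help to filter out everything except the heartbeat and arbitration events. The following is a minimal sketch, assuming the log has one entry per line in the `timestamp [MgmSrvr] LEVEL -- message` layout seen in those entries. The names `parse_line` and `arbitration_events` and the keyword list are hypothetical choices for illustration, not part of any MySQL tool.

```python
import re

# Layout assumed from the quoted entries:
# "2007-08-23 16:17:35 [MgmSrvr] ALERT -- Node 11: Node 1 Disconnected"
LOG_LINE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) "
    r"\[MgmSrvr\] (?P<level>[A-Z]+) -- (?P<msg>.*)$"
)

# Lowercase substrings that mark heartbeat or arbitration trouble
# (an illustrative list, not an exhaustive one).
KEYWORDS = ("missed heartbeat", "declared dead", "arbitrat", "forced node shutdown")


def parse_line(line):
    """Return a dict with ts, level and msg, or None if the line does not match."""
    match = LOG_LINE.match(line.strip())
    return match.groupdict() if match else None


def arbitration_events(lines):
    """Keep only the entries whose message mentions one of KEYWORDS."""
    events = []
    for line in lines:
        entry = parse_line(line)
        if entry and any(key in entry["msg"].lower() for key in KEYWORDS):
            events.append(entry)
    return events


if __name__ == "__main__":
    sample = [
        "2007-08-23 16:17:32 [MgmSrvr] WARNING -- Node 11: Node 1 missed heartbeat 2",
        "2007-08-23 16:17:35 [MgmSrvr] INFO -- Node 11: Lost arbitrator node 1 - process failure [state=6]",
        "2007-08-23 16:17:35 [MgmSrvr] INFO -- Node 2: Node 10 Connected",
    ]
    for event in arbitration_events(sample):
        print(event["ts"], event["level"], event["msg"])
```

Sorting the filtered entries from both managers by timestamp gives a single timeline of when heartbeats were missed and when the arbitrator was lost.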