RE: Crash cluster after crash server.

From: Jonathan Miller <jmiller(at)mysql.com>
Date: Tue Aug 28 2007 - 13:14:41 EDT


Hi,

When the arbitrator goes down, the cluster has to hold an election. The problem here is that you are not just losing an arbitrator but half of the cluster as well, so there is no time to hold an election.

The cluster sees this as a potential split brain and acts properly by shutting itself down.

This is why it is recommended that arbitrators are not on the same hosts as your data nodes.
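
To illustrate the reasoning above, here is a minimal Python sketch of the decision a surviving set of data nodes faces. It is a simplified model, not the NDB implementation: the function name, its parameters, and the returned strings are all hypothetical, and the node IDs (10 and 11 in a single node group) are taken from the report quoted below.

```python
def surviving_partition_outcome(node_groups, alive, arbitrator_reachable):
    """Simplified model (an assumption, not NDB source code) of what the
    surviving data nodes decide after losing contact with other nodes."""
    alive = set(alive)

    # A node group with no live member means part of the data is gone.
    if any(not (set(group) & alive) for group in node_groups):
        return "shutdown: missing node group"

    # If every member of some node group is alive, no rival partition
    # can hold a complete copy of the data, so no split brain is possible.
    if any(set(group) <= alive for group in node_groups):
        return "continue: no split brain possible"

    # Otherwise a rival partition might also be viable, and only the
    # arbitrator can break the tie. Real arbitration grants the win to
    # one partition; here we only model whether it can be reached.
    if arbitrator_reachable:
        return "continue: won arbitration"
    return "shutdown: arbitration error"


# Server 1 dies, taking data node 10 and the arbitrator with it.
print(surviving_partition_outcome([[10, 11]], {11}, arbitrator_reachable=False))

# Same failure, but with the arbitrator on a separate host.
print(surviving_partition_outcome([[10, 11]], {11}, arbitrator_reachable=True))
```

In this model the first call ends in a shutdown and the second does not, which is the difference that moving the arbitrator to its own host is meant to make.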

Best wishes,
/Jeb

Jonathan Miller
Austin, Texas USA
Senior Lead Quality Assurance Developer
MySQL AB www.mysql.com

> -----Original Message-----
> From: Fabien FAYE [mailto:ffaye@dclux.com]
> Sent: Tuesday, August 28, 2007 6:28 AM
> To: 'cluster@lists.mysql.com'
> Subject: Crash cluster after crash server.
>
> Hi,
>
> We have a cluster of 2 servers running MySQL 5.0.27.
> Each server runs a management server, a mysqld, and an ndbd.
>
> I know MySQL does not recommend running the management server, mysqld and
> ndbd on the same server, but in this case it is for architectural reasons.
>
> Node 1 : Manager 1
> Node 2 : Manager 2
> Node 10 : NDBD1
> Node 11 : NDBD2
> Node 20 : Mysql Api 1
> Node 21 : Mysql Api 2
>
> On server 1 we have: Node 1,Node 10,Node 20
> On server 2 we have: Node 2,Node 11,Node 21
>
> We ran crash tests on Node 10 and Node 11.
>
> During the crash test of Node 10, Node 11 shut down a few minutes
> afterwards, and the problem is reproducible.
> I checked the other log files for clues and found an error during
> arbitration.
>
> In the configuration file we have defined the following for each manager:
>
> ArbitrationRank=1 on manager 1
> ArbitrationRank=2 on manager 2
>
> My questions :
>
> Have you seen this problem before? (I have found some similar bugs
> reported against MySQL, but not for the same case.)
> Could this problem be caused by the ArbitrationRank setting?
>
> Thanks for your help!!
>
> Manager 1 log :
>
> 2007-08-23 15:40:39 [MgmSrvr] INFO -- Node 10: Local checkpoint 186 started. Keep GCI = 69786 oldest restorable GCI = 26677
> 2007-08-23 15:55:04 [MgmSrvr] INFO -- Node 10: Local checkpoint 187 started. Keep GCI = 70173 oldest restorable GCI = 26677
> 2007-08-23 16:03:39 [MgmSrvr] INFO -- Node 10: Local checkpoint 188 started. Keep GCI = 70555 oldest restorable GCI = 70244
> 2007-08-23 16:17:25 [MgmSrvr] ALERT -- Node 10: Node 20 Disconnected
> 2007-08-23 16:17:25 [MgmSrvr] INFO -- Node 10: Communication to Node 20 closed
> 2007-08-23 16:17:25 [MgmSrvr] ALERT -- Node 11: Node 20 Disconnected
> 2007-08-23 16:17:25 [MgmSrvr] INFO -- Node 11: Communication to Node 20 closed
> 2007-08-23 16:17:26 [MgmSrvr] INFO -- Mgmt server state: nodeid 20 freed, m_reserved_nodes 0000000000200002.
> 2007-08-23 16:17:26 [MgmSrvr] INFO -- Node 10: Node shutdown initiated
> 2007-08-23 16:17:29 [MgmSrvr] INFO -- Node 11: Communication to Node 20 opened
> 2007-08-23 16:17:29 [MgmSrvr] INFO -- Node 10: Communication to Node 20 opened
> 2007-08-23 16:17:35 [MgmSrvr] INFO -- Node 1: Node 10 Connected
> 2007-08-23 16:17:35 [MgmSrvr] INFO -- Node 1: Node 11 Connected
>
> Manager 2 Log :
>
> 2007-08-23 16:17:25 [MgmSrvr] ALERT -- Node 11: Node 20 Disconnected
> 2007-08-23 16:17:25 [MgmSrvr] INFO -- Node 11: Communication to Node 20 closed
> 2007-08-23 16:17:25 [MgmSrvr] ALERT -- Node 10: Node 20 Disconnected
> 2007-08-23 16:17:25 [MgmSrvr] INFO -- Node 10: Communication to Node 20 closed
> 2007-08-23 16:17:26 [MgmSrvr] INFO -- Node 10: Node shutdown initiated
> 2007-08-23 16:17:29 [MgmSrvr] INFO -- Node 11: Communication to Node 20 opened
> 2007-08-23 16:17:29 [MgmSrvr] INFO -- Node 10: Communication to Node 20 opened
> 2007-08-23 16:17:32 [MgmSrvr] WARNING -- Node 11: Node 1 missed heartbeat 2
> 2007-08-23 16:17:33 [MgmSrvr] WARNING -- Node 11: Node 10 missed heartbeat 2
> 2007-08-23 16:17:33 [MgmSrvr] WARNING -- Node 11: Node 1 missed heartbeat 3
> 2007-08-23 16:17:35 [MgmSrvr] WARNING -- Node 11: Node 10 missed heartbeat 3
> 2007-08-23 16:17:35 [MgmSrvr] WARNING -- Node 11: Node 1 missed heartbeat 4
> 2007-08-23 16:17:35 [MgmSrvr] ALERT -- Node 11: Node 1 declared dead due to missed heartbeat
> 2007-08-23 16:17:35 [MgmSrvr] INFO -- Node 11: Lost arbitrator node 1 - process failure [state=6]
> 2007-08-23 16:17:35 [MgmSrvr] INFO -- Node 11: Communication to Node 1 closed
> 2007-08-23 16:17:35 [MgmSrvr] ALERT -- Node 11: Node 1 Disconnected
> 2007-08-23 16:17:35 [MgmSrvr] INFO -- Node 2: Node 10 Connected
> 2007-08-23 16:17:38 [MgmSrvr] INFO -- Node 2: Node 11 Connected
> 2007-08-23 16:23:57 [MgmSrvr] ALERT -- Node 11: Forced node shutdown completed. Initiated by signal 0. Caused by error 2305: 'Node lost connection to other nodes and can not form a unpartitioned cluster, please investigate if there are error(s) on other node(s)(Arbitration error). Temporary er

Received on Tue Aug 28 13:16:52 2007