1 node in cluster fails hourly: Ndb kernel is stuck in: Job Handling

From: James Graham <james(at)asperity.co.uk>
Date: Mon Oct 01 2007 - 04:51:48 EDT


Hi,

Since yesterday evening, one of our data nodes has been crashing every hour or so.
Our setup is a load balancer running ndb_mgm and two machines running ndbd/mysqld.

The error is below:
Time: Monday 1 October 2007 - 06:54:03
Status: Temporary error, restart node
Message: WatchDog terminate, internal error or massive overload on the machine running this node (Internal error, programming error or missing error message, please report a bug)
Error: 6050
Error data: Job Handling
Error object: WatchDog.cpp
Program: ndbd
Pid: 4218
Trace: /var/lib/mysql-cluster/ndb_3_trace.log.21
Version: Version 5.0.32

Trace (relevant part)

NR: setLcpActiveStatusEnd - !m_participatingLQH
NR: setLcpActiveStatusEnd - m_participatingLQH

Ndb kernel is stuck in: Job Handling
Ndb kernel is stuck in: Job Handling
Ndb kernel is stuck in: Job Handling
2007-10-01 06:54:03 [ndbd] INFO     -- Watchdog restarting system
2007-10-01 06:54:03 [ndbd] INFO     -- Watchdog shutdown completed - exiting
2007-10-01 06:54:03 [ndbd] ALERT -- Node 3: Forced node shutdown completed, restarting. Initiated by signal 0. Caused by error 6050: 'WatchDog terminate, internal error or massive overload on the machine running this node(Internal error, programming error or missing error message, please re
2007-10-01 06:54:03 [ndbd] INFO -- Ndb has terminated (pid 4218) restarting
2007-10-01 06:54:03 [ndbd] INFO     -- Angel pid: 2868 ndb pid: 5076
2007-10-01 06:54:03 [ndbd] INFO     -- NDB Cluster -- DB node 3
2007-10-01 06:54:03 [ndbd] INFO     -- Version 5.0.32 --
2007-10-01 06:54:03 [ndbd] INFO     -- Configuration fetched at 89.200.138.148 port 1186
2007-10-01 06:54:03 [ndbd] INFO     -- Start initiated (version 5.0.32)

Only one of the two data nodes does this; luckily, the other is fine.
Any ideas?
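
One way to check the "every hour or so" pattern is to pull the forced-shutdown lines out of the node's output log and measure the gaps between them. The following is a minimal sketch, assuming the log lines have the same shape as the ALERT line quoted above; the function names are illustrative and are not part of MySQL Cluster.

```python
import re
from datetime import datetime

# Assumed line shape, taken from the ALERT line quoted in this post:
# "<date> <time> [ndbd] ALERT -- Node <id>: Forced node shutdown completed ... Caused by error <code>"
SHUTDOWN_RE = re.compile(
    r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) \[ndbd\] ALERT\s+-- Node (\d+): "
    r"Forced node shutdown completed.*?Caused by error (\d+)"
)


def forced_shutdowns(lines):
    """Return (timestamp, node_id, error_code) for each forced-shutdown line."""
    events = []
    for line in lines:
        match = SHUTDOWN_RE.match(line)
        if match:
            timestamp = datetime.strptime(match.group(1), "%Y-%m-%d %H:%M:%S")
            events.append((timestamp, int(match.group(2)), int(match.group(3))))
    return events


def gaps_in_minutes(events):
    """Minutes between consecutive forced shutdowns, in log order."""
    times = [timestamp for timestamp, _, _ in events]
    return [(later - earlier).total_seconds() / 60
            for earlier, later in zip(times, times[1:])]
```

Passing an open log file to `forced_shutdowns` and the result to `gaps_in_minutes` gives a list of intervals; values clustered near 60 would support a periodic trigger rather than random overload.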

Received on Mon Oct 1 05:07:19 2007
