
Configuration Management and Monitoring of a Debian Etch Beowulf Cluster

From: Farid Behnia <behnia(at)gmail.com>
Date: Thu Aug 30 2007 - 10:34:19 EDT


Hi,

I've managed to put together a simple 2-node cluster using Debian etch, OpenMPI, FAI and Cfengine.

I'm looking for ideas that can help me build a better self-healing cluster. Right now I'm writing rule files for cfengine and would appreciate any input: sample files, and the important configurations needed to keep the cluster healthy. (I know these are site-specific, but I'm sure I can get good hints out of them.)

I'd also be glad to hear of any monitoring system that can work alongside cfengine on the maintenance job. I've looked briefly into Ganglia and Nagios so far. Ganglia seems to be meant mostly for large clusters (or groups of clusters) and focuses on hardware resources. Nagios seems better suited to my job, but the gurus on the cfengine mailing list believe that cfenvd and cfexecd can provide equivalent monitoring and recovery capability (in terms of response time).
What's your take on either approach?

Thanks in advance for any input.

Received on Thu Aug 30 11:04:17 2007


