I need to monitor a compute farm system. The system is made up of 18 hosts. I have inherited a functioning command (example below) that gives me what I want from the command line: an overview of the health of 18 devices in the service.
I would like to know the recommended way (best practices) to get two things:

1) an overview like this of all 18 devices in the Zenoss GUI, and
2) an Error event raised when the status of one or more devices goes from "ok" to "unavail".

I have reviewed zencommand and ssh modelling in both the Admin guide and the forum. I can model individual devices via SNMP just fine. The best hint I have seen so far is from this post: http://forums.zenoss.com/viewtopic.php?t=7470&highlight=ssh+command

I thought about doing this by creating a /device/lsf class for this compute farm, wrapping the command below in a shell script, check_lsf, dropping that script in /opt/zenoss/libexec/, running it through zencommand, then parsing the output in the shell script and using the result in a new Data Source. That would raise alerts on "ok" or "unavail". However, I also need to display the rows and columns of the command output, either in this form direct from the shell script or in some other manner more native to Zenoss. I don't want to call the command twice.

I run the command and I get an overview of the 18 devices.
[EMAIL PROTECTED] ~$ ssh 134.87.177.20 /tools/bin/lsload
HOST_NAME   status r15s   r1m  r15m   ut    pg  ls    it   tmp   swp   mem
chew        ok      0.0   0.0   0.0   0%   0.0   0  4728 9248M   32G 7736M
gaff        ok      0.0   0.0   0.0   0%   0.0   0  6024 9288M   32G   15G
kaiser      ok      0.0   0.0   0.0   0%   0.0   1  4604 9296M   32G   15G
sebastian   ok      0.0   0.0   0.0   0%   0.0   0 17760 9296M   32G   15G
tyrell      ok      0.0   0.0   0.0   0%   0.0   0   301 9296M   32G   15G
zhora       ok      0.0   0.0   0.0   0%   0.0   0  5968 9296M   32G   15G
cannonball  ok      0.0   0.0   0.0   0%   0.0   0 15816 9296M   16G 5260M
skyride     ok      0.0   0.0   0.0   0%   0.0   0 79808 9296M   16G 5300M
swanboats   ok      0.0   0.0   0.0   0%   0.0   0 79808 9296M   16G 5292M
roy         ok      0.0   0.0   0.0   0%   0.0   0 36160 9296M   32G   11G
rachel      ok      0.0   0.0   0.1   1%  20.4   3   237 3634M   31G   10G
batty       ok      0.1   0.0   0.0   0%   0.0   1 54944 9296M   32G   15G
eurobungy   ok      0.2   0.0   0.0   0%   0.0   0 79808 9296M   16G 5516M
holden      ok      0.3   0.1   0.0   0%   0.0   0 45056 9296M   32G   15G
bryant      ok      0.6   1.4   0.8  33%   0.0   1   425 9304M   32G 4582M
pris        ok      0.9   0.0   0.0   0%   0.0   0   126 9296M   31G   14G
taffey      ok      1.0   1.0   1.0   6%   0.0   1    65 9296M   32G   59G
deckard     ok      1.1   1.0   1.0  50%   0.0   1   403 9296M   32G   15G
[EMAIL PROTECTED] ~$

The compute farm system is referred to as Load Sharing Facility, or LSF, from www.platform.com.

I expect there is a quick ugly way to do this, and an elegant way. Given a choice, I am today looking for the quick ugly way :-) and will bring in the elegant way later.

This example is from one site. There are more sites after this one. I run Zenoss 2.2.4 on RHEL4.

Any help would be appreciated. So would a detailed example of how someone else is successfully using zencommand.

Thank you in advance,

David Sloboda
PMC-Sierra, Inc.
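P.S. To make the check_lsf wrapper idea above more concrete, here is a minimal sketch in Python rather than shell. It rests on several assumptions: that zencommand treats the script like a Nagios-style plugin (one line of "message | name=value ..." on stdout, exit code 0 for OK and 2 for critical), that only the "unavail" status should raise an event, and that the host and path are the ones from the example above. The function names (run_lsload, parse_lsload, summarize) are hypothetical, and the code is written for a current Python, so it may need adjusting for the older Python bundled with Zenoss 2.2.4.

```python
"""check_lsf: hypothetical zencommand wrapper around lsload (a sketch only)."""
import subprocess

# Assumption: host and path are copied from the example in this post.
LSLOAD_CMD = ["ssh", "134.87.177.20", "/tools/bin/lsload"]

# Assumption: only "unavail" should alert; add other statuses here if needed.
ALERT_STATUSES = ("unavail",)


def run_lsload():
    """Run lsload once over ssh and return its raw text output."""
    proc = subprocess.Popen(LSLOAD_CMD, stdout=subprocess.PIPE,
                            universal_newlines=True)
    return proc.communicate()[0]


def parse_lsload(text):
    """Return a list of (host, status) pairs from lsload output."""
    rows = []
    for line in text.splitlines():
        fields = line.split()
        # Skip blank lines, the header row, and lines too short to be a host row.
        if len(fields) < 2 or fields[0] == "HOST_NAME":
            continue
        rows.append((fields[0], fields[1]))
    return rows


def summarize(rows):
    """Build a Nagios-style (exit_code, message) pair from parsed rows."""
    bad = [host for host, status in rows if status in ALERT_STATUSES]
    perf = "hosts_total=%d hosts_unavail=%d" % (len(rows), len(bad))
    if not rows:
        return 2, "LSF CRITICAL: no hosts in lsload output | " + perf
    if bad:
        return 2, "LSF CRITICAL: unavail: %s | %s" % (", ".join(bad), perf)
    return 0, "LSF OK: %d hosts, none unavail | %s" % (len(rows), perf)


if __name__ == "__main__":
    # Demo on two rows taken from the output above; no ssh call is made here.
    sample = (
        "HOST_NAME status r15s r1m r15m ut pg ls it tmp swp mem\n"
        "chew ok 0.0 0.0 0.0 0% 0.0 0 4728 9248M 32G 7736M\n"
        "gaff ok 0.0 0.0 0.0 0% 0.0 0 6024 9288M 32G 15G\n"
    )
    code, message = summarize(parse_lsload(sample))
    print(message)
```

For real use, the last block would call summarize(parse_lsload(run_lsload())), print the message, and pass the code to sys.exit, so lsload is invoked only once per cycle. This covers only the event side of the question; it does not by itself put the full table into the Zenoss GUI.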
