Hi,

I'm trying to implement some kind of basic liveness test for nodes in
the cluster based on results from nodestat -p:
1. noping(off) - powered off node
2. noping(on) - zombie node
3. noping - dead node (both os and ipmi non-responsive)
4. sshd - live node
5. anything else (installing, prep, etc.)
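
To make the mapping above concrete, here is a minimal sketch of the classification in Python. It assumes nodestat prints one "node: status" pair per line and that several services may be reported as a comma-separated list; the function names and category labels are made up for illustration and are not part of xCAT.

```python
def classify(status):
    """Map one nodestat -p status string to a liveness category (labels are hypothetical)."""
    status = status.strip()
    if status == "noping(off)":
        return "powered_off"   # 1. powered off node
    if status == "noping(on)":
        return "zombie"        # 2. power is on but the OS does not answer
    if status == "noping":
        return "dead"          # 3. neither OS nor IPMI responds
    # Assumption: multiple services are reported comma-separated
    if "sshd" in status.split(","):
        return "live"          # 4. live node
    return "other"             # 5. installing, prep, etc.


def parse_nodestat(text):
    """Parse assumed 'node: status' lines into a {node: category} dict."""
    result = {}
    for line in text.splitlines():
        node, sep, status = line.partition(":")
        if not sep:
            continue  # skip lines that are not 'node: status'
        result[node.strip()] = classify(status)
    return result
```

The output of "nodestat -p compute" could be captured (for example with subprocess) and passed to parse_nodestat() as a string.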

I have put "nodestat -up" into cron, and the nodelist table gets updated
as expected. However, when I query the table for the same information,
the results differ substantially from the live nodestat output.
For example:
# nodestat -up compute | grep -v sshd | wc -l
48
# nodels compute nodelist.appstatus=~"sshd=down" | wc -l
4

I have confirmed that all 4 nodes are a subset of the 48 nodes from nodestat.
The status column does provide the ping/noping(on|off) info, but for
most nodes with status noping the appstatus field still shows "sshd=up".
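
For reference, the comparison I am doing by hand could be sketched as below. This assumes the nodestat output is "node: status" per line and that nodels with a selection criterion prints bare node names, one per line; the function names are hypothetical.

```python
def not_sshd_from_nodestat(text):
    """Nodes whose nodestat status does not mention sshd (same idea as grep -v sshd)."""
    nodes = set()
    for line in text.splitlines():
        node, sep, status = line.partition(":")
        if sep and "sshd" not in status:
            nodes.add(node.strip())
    return nodes


def table_disagreements(nodestat_text, nodels_text):
    """Nodes that nodestat reports as not-sshd but the table query does not return."""
    live_view = not_sshd_from_nodestat(nodestat_text)
    table_view = {ln.strip() for ln in nodels_text.splitlines() if ln.strip()}
    return sorted(live_view - table_view)
```

Feeding it the captured output of the two commands above would list the nodes where appstatus looks stale.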

Am I missing something?

The xCAT version is 2.8.3.
An upgrade to 2.9.x is not practical at this time (it is scheduled),
but in any event I haven't seen anything in the release notes that
might be relevant.
