We use Icinga, but it simply makes sure the service is alive.
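
A liveness-only check would not distinguish the state described in this thread, where systemd reports the unit as failed while the xcatd processes keep running. The following is a minimal sketch of a check that compares both views, using the conventional Nagios/Icinga plugin exit codes (0 OK, 1 WARNING, 2 CRITICAL). The function names, the "xcatd:" process-title pattern, and the choice to report a mismatch as WARNING are assumptions for illustration, not the check actually deployed here.

```python
import subprocess
from typing import Tuple

OK, WARNING, CRITICAL = 0, 1, 2  # conventional Nagios/Icinga plugin exit codes


def classify(active_state: str, xcatd_procs: int) -> Tuple[int, str]:
    """Combine systemd's view of the unit with the processes actually seen."""
    if active_state == "active" and xcatd_procs > 0:
        return OK, "xcatd active and running"
    if xcatd_procs > 0:
        # The case from this thread: unit marked failed, daemon still up.
        return WARNING, (f"systemd reports '{active_state}' but "
                         f"{xcatd_procs} xcatd processes are running")
    return CRITICAL, f"no xcatd processes found (systemd reports '{active_state}')"


def probe(unit: str = "xcatd") -> Tuple[str, int]:
    """Query systemd and the process table; only meaningful on a systemd host."""
    state = subprocess.run(["systemctl", "is-active", unit],
                           capture_output=True, text=True).stdout.strip()
    # pgrep -f matches the full command line; -c prints the number of matches.
    # Assumption: the worker titles ("xcatd: SSL listener", ...) are visible there.
    count = subprocess.run(["pgrep", "-c", "-f", "xcatd:"],
                           capture_output=True, text=True).stdout.strip()
    return state, int(count or 0)
```

A plugin wrapper would call classify(*probe()), print the message, and exit with the returned code.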
Gilad Berman
HPC Architect
Lenovo EMEA
[Phone]+972-52-2554262
[Email]gber...@lenovo.com<mailto:gber...@lenovo.com>
Lenovo.com <http://www.lenovo.com/>
Twitter<http://twitter.com/lenovo> | Facebook<http://www.facebook.com/lenovo> |
Instagram<https://instagram.com/lenovo> | Blogs<http://blog.lenovo.com/> |
Forums<http://forums.lenovo.com/>
From: Xiao Peng Wang [mailto:w...@cn.ibm.com]
Sent: Wednesday, June 7, 2017 5:45 PM
To: xCAT Users Mailing list <xcat-user@lists.sourceforge.net>
Cc: xCAT Users Mailing list <xcat-user@lists.sourceforge.net>
Subject: [xcat-user] Re: xcatd service failed (while actually still up)
Someone from the xCAT team will try to reproduce this. I am curious: how do you
monitor the xCAT service?
________________________________
On June 7, 2017, at 10:29:20 PM, gber...@lenovo.com<mailto:gber...@lenovo.com> wrote:
From: gber...@lenovo.com<mailto:gber...@lenovo.com>
To: xcat-user@lists.sourceforge.net<mailto:xcat-user@lists.sourceforge.net>
Cc:
Date: June 7, 2017, 10:29:20 PM
Subject: Re: [xcat-user] xcatd service failed (while actually still up)
Probably yes. But the service has been OK since then (we are monitoring the xCAT
service status).
From: Xiao Peng Wang [mailto:w...@cn.ibm.com]
Sent: Wednesday, June 7, 2017 3:53 PM
To: xcat-user@lists.sourceforge.net<mailto:xcat-user@lists.sourceforge.net>
Cc: xcat-user@lists.sourceforge.net<mailto:xcat-user@lists.sourceforge.net>
Subject: Re: [xcat-user] xcatd service failed (while actually still up)
It looks like the start of xcatd failed to finish:
Process: 7630 ExecStart=/etc/init.d/xcatd start (code=killed, signal=TERM)
Did you restart xcatd 16h ago?
Active: failed (Result: timeout) since Tue 2017-06-06 19:12:36 CEST; 16h ago
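
The two lines quoted above carry the whole diagnosis: the unit is in state "failed" with Result "timeout", and the ExecStart process was killed by SIGTERM, which is consistent with the start command not returning in time. A minimal sketch of pulling those fields out of saved systemctl status text follows. The function name and the dictionary keys are hypothetical, and the regular expressions only cover the line shapes seen in this thread.

```python
import re


def summarize_status(text: str) -> dict:
    """Extract the unit state, result, and ExecStart outcome from status text."""
    active = re.search(r"Active:\s+(\S+)(?:\s+\(Result:\s+([^)]+)\))?", text)
    start = re.search(
        r"Process:\s+(\d+)\s+ExecStart=.*\(code=(\w+),\s+(?:signal|status)=([^)]+)\)",
        text,
    )
    return {
        "state": active.group(1) if active else None,
        "result": active.group(2) if active else None,
        "start_pid": int(start.group(1)) if start else None,
        "start_code": start.group(2) if start else None,
        "start_detail": start.group(3) if start else None,
    }


# Lines copied from the status output in this thread.
sample = """\
Active: failed (Result: timeout) since Tue 2017-06-06 19:12:36 CEST; 16h ago
Process: 30751 ExecStop=/etc/init.d/xcatd stop (code=exited, status=0/SUCCESS)
Process: 7630 ExecStart=/etc/init.d/xcatd start (code=killed, signal=TERM)
"""
info = summarize_status(sample)
print(info)
```

Run periodically, a summary like this could flag a "failed" unit whose start was killed even when the daemon processes are still present.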
Best Regards
----------------------------------------------------------------------
Wang Xiaopeng (王晓朋)
Manager for HPC SW Dev: xCAT, ESSL, SMI, Test
IBM China Systems Laboratory (CSL)
Tel: 86-10-82453455<tel:86-10-82453455>
Email: w...@cn.ibm.com<mailto:w...@cn.ibm.com>
----- Original message -----
From: Gilad Berman <gber...@lenovo.com<mailto:gber...@lenovo.com>>
To: xCAT Users Mailing list
<xcat-user@lists.sourceforge.net<mailto:xcat-user@lists.sourceforge.net>>
Cc:
Subject: [xcat-user] xcatd service failed (while actually still up)
Date: Wed, Jun 7, 2017 5:35 PM
Hello,
We are seeing some strange behavior with the xCAT service.
systemctl status xcatd returns the following output:
s08:~ # systemctl status xcatd
● xcatd.service - LSB: xcatd
Loaded: loaded (/etc/init.d/xcatd; bad; vendor preset: disabled)
Active: failed (Result: timeout) since Tue 2017-06-06 19:12:36 CEST; 16h ago
Docs: man:systemd-sysv-generator(8)
Process: 30751 ExecStop=/etc/init.d/xcatd stop (code=exited, status=0/SUCCESS)
Process: 7630 ExecStart=/etc/init.d/xcatd start (code=killed, signal=TERM)
Tasks: 8 (limit: 512)
CGroup: /system.slice/xcatd.service
├─23978 /usr/sbin/in.tftpd -v -l -s /tftpboot -m /etc/tftpmapfile4xcat.conf
├─28343 xcatd: SSL listener
├─28344 xcatd: DB Access
├─28345 xcatd: UDP listener
├─28346 xcatd: install monitor
├─28347 xcatd: Discovery worker
├─28348 xcatd: Command log writer
└─28727 xcatd: DB Access
However, xcatd is still running and operational (tabdump and every other
function seem to work).
Running systemctl restart xcatd works immediately and seems to fix the issue.
Any idea what could make the service think it failed? Anything we should look
at?
THX!!
------------------------------------------------------------------------------
Check out the vibrant tech community on one of the world's most
engaging tech sites, Slashdot.org! http://sdm.link/slashdot
_______________________________________________
xCAT-user mailing list
xCAT-user@lists.sourceforge.net<mailto:xCAT-user@lists.sourceforge.net>
https://lists.sourceforge.net/lists/listinfo/xcat-user