I have seen similar problems on physical hosts running a diskful install. If I 
put a "systemctl start $servicename" in any of my postscripts, it hangs during 
the postinstall process. I have taken to removing the start step and just 
enabling the service; once I've determined that the postinstall is complete, I 
reboot the node and all services start as expected. I have seen it with gmond 
and slurmd, so I know that it isn't specific to Ganglia.
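
A minimal sketch of that enable-only approach as a postscript fragment. The 
function names and the SYSTEMCTL override are hypothetical and exist only so 
the logic can be dry-run. The second function uses systemctl's --no-block 
option, which queues the start job and returns without waiting for it; whether 
that avoids the hang in this setup is an assumption, not something confirmed 
in this thread.

```shell
#!/bin/sh
# Hypothetical postscript fragment. SYSTEMCTL can be overridden
# (e.g. SYSTEMCTL=echo) to dry-run the logic without touching a real system.
SYSTEMCTL="${SYSTEMCTL:-systemctl}"

# Enable the unit so it starts on the next boot; do not start it now.
enable_for_next_boot() {
    $SYSTEMCTL enable "$1"
}

# Assumption: queue the start job without waiting for it to complete,
# so the postscript itself is not blocked on the service coming up.
queue_start() {
    $SYSTEMCTL --no-block start "$1"
}
```

In a postscript this would be called as "enable_for_next_boot gmond", followed 
either by a reboot once postinstall has finished or, if it turns out to work, 
by "queue_start gmond".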

The other workaround I'm working on implementing is to move everything the 
postscripts do into Ansible.

Mike

On 3/4/19 4:42 PM, Brian Joiner wrote:
We're deploying diskless nodes in vSphere and installing the Ganglia 
monitoring tools.

The Ganglia packages get installed via otherpkgs, and the Ganglia postscript 
edits the /etc/ganglia/gmond.conf file with our custom cluster info and 
attempts to enable and start the service.

"systemctl enable gmond" works.
"systemctl start gmond" causes the script to hang indefinitely, until I log 
into the node and kill it.  Then the script completes and allows the other 
postbootscripts to run.

Why is systemctl hanging on service start?  If we remove that command from the 
script, it completes, but the service doesn't start automatically, so manual 
intervention is required.  Is this unique to a diskless install?  We got around 
it by creating the gmond.service file and symlink in the rootimg dir of the 
diskless image, but were wondering if there's a way to get a service to start 
the normal way.

HOST:  vSphere, diskless
CentOS 7.5
Ganglia 3.7
xCAT 2.14
--
Thanks,
Brian Joiner




_______________________________________________
xCAT-user mailing list
xCAT-user@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/xcat-user
