I'll note that I have seen similar problems on physical hosts running a diskful install. If I put a "systemctl start $servicename" in any of my postscripts, the script hangs during the postinstall process. I have taken to removing the start step and just enabling the service; once I've determined that the postinstall is complete, I reboot the node and all services start as expected. I have seen it with gmond and slurmd, so I know it isn't specific to Ganglia.
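For illustration, the enable-only approach described above might look roughly like this in a postscript. This is a minimal sketch, not a tested xCAT postscript: the function name enable_only and the SYSTEMCTL override variable are hypothetical names introduced here (the override only exists so the script can be dry-run), and the service names are the two mentioned above.

```shell
#!/bin/sh
# Sketch of a postscript fragment that enables services without starting them.
# Assumption: SYSTEMCTL is an illustrative override for dry runs, not an xCAT convention.
SYSTEMCTL="${SYSTEMCTL:-systemctl}"

# enable_only (hypothetical helper): enable each named service so it comes up
# on the next boot. There is deliberately no "start" call here, since that is
# the step reported to hang during postinstall.
enable_only() {
    for svc in "$@"; do
        "$SYSTEMCTL" enable "$svc" || return 1
    done
}

# Example call from the postscript body:
#   enable_only gmond slurmd
```

After the postinstall finishes, rebooting the node brings the enabled services up, as described above.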
The other workaround that I'm working on implementing is to move everything the postscripts do into Ansible.

Mike

On 3/4/19 4:42 PM, Brian Joiner wrote:

We're deploying diskless nodes in vSphere and installing Ganglia monitoring tools. The ganglia packages get installed in otherpkgs, and the ganglia postscript edits the /etc/ganglia/gmond.conf file with our custom cluster info and attempts to enable and start the service.

systemctl enable gmond works
systemctl start gmond causes the script to hang, indefinitely, until I log into the node and kill it. Then the script completes and allows other postbootscripts to run.

Why is systemctl hanging on service start? If we remove that command from the script, it completes but the service doesn't auto start, so manual intervention is required. Is this unique to a diskless install?

We got around it by creating the gmond.service file and symlink in the rootimg dir of the diskless image, but were wondering if there's a way to get a service to start the normal way.

HOST: vSphere, diskless, CentOS 7.5
Ganglia 3.7
xCAT 2.14

--
Thanks,
Brian Joiner

_______________________________________________
xCAT-user mailing list
xCAT-user@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/xcat-user