Solved ...

In /var/log/messages I see this:

Oct 21 09:37:53 node1008 systemd: Starting xcat service on compute node, the framework to run postbootscript and update node status...
Oct 21 09:37:53 node1008 xcat.deployment: /opt/xcat/xcatpostinit1: action is start

Then later:

Oct 21 09:38:20 node1008 xcat.deployment: failed to download the postscripts from the xCAT server for node node1008.oscar.ccv.brown.edu
The same message was in /var/log/xcat/, which is where I saw it first.
As I said, the postscripts seemed to be there for the most part.

[root@node1008 log]# du -sh /xcatpost/
1.5M    /xcatpost/

Solution: I copied the /xcatpost directory back to /tmp on the manager node
and did a diff. There was an extra new file, /etc/ssh/sshd_config.new, which
was mode 600. That one file apparently scrubbed the entire download. I had put
it there in anticipation of rolling out a change. Something to watch out for.

Thanks for the pointers,

> On Oct 22, 2020, at 9:25 AM, david_john...@brown.edu wrote:
>
> The node is diskless so the postscripts must run on every boot. It looks like
> it got a lot of the files but the last directory it was working on was
> “repos”. Maybe there is a permission problem with a file that got dropped in
> that area in the last week or so.
>
> -- ddj
> Dave Johnson
>
>> On Oct 22, 2020, at 8:10 AM, Casandra H Qiu <cxh...@us.ibm.com> wrote:
>>
>> All the postscripts will be downloaded from the MN to the CN during
>> provisioning or when the "updatenode" command is run; rebooting the node
>> will not trigger the postscripts to run.
>>
>> The default postscripts (and any other user postscripts) defined in the
>> node definition will be run during provisioning:
>>
>> # lsdef cn01 -i postscripts
>> Object name: cn01
>>     postscripts=syslog,remoteshell,syncfiles
>>
>> The xCAT log on the compute node will be in /var/log/xcat/xcat.log
>>
>> Thanks,
>> Casandra Qiu
>> ...................................................................
>> Casandra Hong Qiu
>> Phone: (845) 433-9291, t/l 293-9291
>> Office: Building 8, 3-B-04
>> cxh...@us.ibm.com
>>
>> From: David Johnson <david_john...@brown.edu>
>> To: xCAT Users Mailing list <xcat-user@lists.sourceforge.net>
>> Date: 10/21/2020 03:39 PM
>> Subject: [EXTERNAL] [xcat-user] Diskless node downloads all postscripts but
>> none of them are executed
>>
>> Wondering where to start looking for a problem that just started recently.
>> Node had been up for 100+ days,
>> other nodes have been successfully rebooted as recently as ten days ago,
>> this one reset a couple days ago,
>> but the ssh keys, hardeths, and our site specific postscripts were not
>> executed at all. No log file in ~root/.
>>
>> /xcatpost seems to be fully populated.
>>
>> Thanks,
>> -- ddj
>> Dave Johnson
>>
>> _______________________________________________
>> xCAT-user mailing list
>> xCAT-user@lists.sourceforge.net
>> https://lists.sourceforge.net/lists/listinfo/xcat-user
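
For anyone who hits the same symptom, here is a minimal sketch of a check that might catch a stray mode-600 file like the sshd_config.new mentioned above before a node reboots. It rests on assumptions not stated in the thread: that the postscripts tree on the management node is /install/postscripts (adjust for your site), and that the node fetches it as an unprivileged user, so any entry missing the world-read bit is suspect. The function name find_unreadable is hypothetical, just for illustration.

```shell
#!/bin/sh
# List entries under a directory tree that an unprivileged reader could not
# fetch: anything missing the "other" read bit, plus directories missing the
# "other" execute (traverse) bit.
# Usage: find_unreadable DIR
find_unreadable() {
    dir=${1:?usage: find_unreadable DIR}
    find "$dir" \( ! -perm -004 -o \( -type d ! -perm -001 \) \) -print
}

# Example call (the path is an assumption, not taken from the thread):
# find_unreadable /install/postscripts
```

An empty result means every entry is at least world-readable; anything printed is worth a look before the next reboot of a diskless node.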
_______________________________________________
xCAT-user mailing list
xCAT-user@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/xcat-user