Solved ... 
In /var/log/messages I see this:
Oct 21 09:37:53 node1008 systemd: Starting xcat service on compute node, the framework to run postbootscript and update node status...
Oct 21 09:37:53 node1008 xcat.deployment: /opt/xcat/xcatpostinit1: action is start
Then later:
Oct 21 09:38:20 node1008 xcat.deployment: failed to download the postscripts from the xCAT server for node node1008.oscar.ccv.brown.edu

The same message was in /var/log/xcat/, which is where I saw it first.

As I said, the postscripts seemed to be there for the most part.
[root@node1008 log]# du -sh /xcatpost/
1.5M    /xcatpost/


Solution:
I copied the /xcatpost directory from the node back to /tmp on the management 
node and diffed the two copies.
The diff turned up one extra new file, /etc/ssh/sshd_config.new, which was mode 600.
That one file apparently scrubbed the entire download.  I had put it there in 
anticipation of rolling out a change.
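
One way to catch a file like this ahead of time is to look for anything in the postscripts tree that is not world-readable. This is only a sketch: find_unreadable is a made-up helper name, and /install/postscripts is an assumption about where the postscripts live on the management node, so adjust the path for your site.

```shell
# Hypothetical helper: list regular files under a directory that lack the
# "other" read bit (for example mode 600 or 640).
find_unreadable() {
    dir="$1"
    # -perm -o=r matches files that have other-read set; "!" negates the
    # match, so only files without that bit are printed.
    find "$dir" -type f ! -perm -o=r -print
}

# Example call (the path is an assumption, not taken from this thread):
#   find_unreadable /install/postscripts
```

Anything it prints is worth a second look before the next reboot or updatenode run.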

Something to watch out for.

Thanks for the pointers.

> On Oct 22, 2020, at 9:25 AM, david_john...@brown.edu wrote:
> 
> The node is diskless so the postscripts must run on every boot. It looks like 
> it got a lot of the files but the last directory it was working on was 
> “repos”. Maybe there is a permission problem with a file that got dropped in 
> that area in the last week or so. 
> 
>   -- ddj
> Dave Johnson
> 
>> On Oct 22, 2020, at 8:10 AM, Casandra H Qiu <cxh...@us.ibm.com> wrote:
>> 
>> 
>> All the postscripts are downloaded from the MN to the CN during provisioning 
>> or when the "updatenode" command is run; rebooting a node does not trigger 
>> the postscripts to run.
>> 
>> The default postscripts (and any other user postscripts) defined in the node 
>> definition are run during provisioning:
>> 
>> # lsdef cn01 -i postscripts
>> Object name: cn01
>> postscripts=syslog,remoteshell,syncfiles
>> 
>> The xCAT log on the compute node is in /var/log/xcat/xcat.log
>> 
>> 
>> Thanks,
>> Casandra Qiu
>> ...................................................................
>> Casandra Hong Qiu
>> Phone: (845) 433-9291, t/l 293-9291
>> Office: Building 8, 3-B-04
>> cxh...@us.ibm.com
>> 
>> 
>> 
>> From: David Johnson <david_john...@brown.edu>
>> To: xCAT Users Mailing list <xcat-user@lists.sourceforge.net>
>> Date: 10/21/2020 03:39 PM
>> Subject: [EXTERNAL] [xcat-user] Diskless node downloads all postscripts but 
>> none of them are executed
>> 
>> 
>> 
>> 
>> Wondering where to start looking for a problem that just started recently.  
>> Node had been up for 100+ days,
>> other nodes have been successfully rebooted as recently as ten days ago, 
>> this one reset a couple days ago,
>> but the ssh keys, hardeths, and our site specific postscripts were not 
>> executed at all. No log file in ~root/.
>> 
>> /xcatpost seems to be fully populated. 
>> 
>> Thanks, 
>> -- ddj
>> Dave Johnson
>> 
>> _______________________________________________
>> xCAT-user mailing list
>> xCAT-user@lists.sourceforge.net
>> https://lists.sourceforge.net/lists/listinfo/xcat-user

