Hi devs/users, I am having a very weird issue where the "remoteshell" postbootscript hangs on multiple images/clusters after we performed datacenter maintenance. On the compute node side I am seeing:

Mon Nov 27 15:13:08 CST 2023 [info]: xcat.deployment: trying to download postscripts...
Mon Nov 27 15:13:08 CST 2023 [info]: xcat.deployment: postscripts downloaded successfully
Mon Nov 27 15:13:08 CST 2023 [info]: xcat.deployment: trying to get mypostscript from <removed>...
Mon Nov 27 15:13:08 CST 2023 [info]: xcat.deployment.postbootscript: postbootscript start..: syslog
Mon Nov 27 15:13:09 CST 2023 [info]: xcat.deployment.postbootscript: postbootscript end...:syslog return with 0
Mon Nov 27 15:13:09 CST 2023 [info]: xcat.deployment.postbootscript: postbootscript start..: remoteshell

...and it just hangs here. On the cluster manager side, I see:

Nov 27 15:23:15 xcat8 xcat[2124]: ERR The node (compute-n2) is not ready, ignore it.

It reports this same error for every node I have booted, across multiple different osimages.

I don't understand what it is looking for. How can I correct this hang? I have tried restarting xcatd, as well as rebooting the xCAT VM, with no change so far.