One possibility is that the xcatsslversion and xcatsslciphers settings in the site table on the management node are not allowed by the version of OpenSSL running on the RHEL8 compute nodes.
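If it helps to test that hypothesis directly, here is a rough sketch (not part of xCAT) that can be run on a compute node to see which TLS protocol versions its OpenSSL can negotiate with xcatd. It assumes that `openssl s_client` exits non-zero when the handshake fails and that the local build accepts the `-tls1` through `-tls1_3` options; an option the local build does not recognize simply shows up as "failed". The function name probe_tls_versions and the OPENSSL override variable are made up for this example, and a failed handshake is only a hint, not proof of the root cause.

```shell
# Sketch only: probe which TLS protocol versions this node's OpenSSL can
# negotiate with a server. probe_tls_versions and the OPENSSL variable are
# hypothetical names for this example, not part of xCAT.
probe_tls_versions() {
    host_port=$1
    for flag in -tls1 -tls1_1 -tls1_2 -tls1_3; do
        # s_client reads stdin once connected; feeding it /dev/null makes it
        # exit right after the handshake attempt.
        if "${OPENSSL:-openssl}" s_client -connect "$host_port" "$flag" \
                </dev/null >/dev/null 2>&1; then
            printf '%s ok\n' "$flag"
        else
            printf '%s failed\n' "$flag"
        fi
    done
}
```

For example, running `probe_tls_versions 172.18.10.201:3001` (the management node address and port from the original post) on both a CentOS 7.9 node and the RHEL 8.2 node would show whether a protocol version that works on one fails on the other. Running `openssl version` on each node shows which OpenSSL build is in play.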
Check to see if you have anything set for xcatsslversion and xcatsslciphers in the site table using:

    tabdump site | egrep "xcatsslversion|xcatsslciphers"

If you have those attributes set, as an experiment, remove the xcatsslversion and xcatsslciphers from the site table using tabedit and repeat the test to see if getcredentials.awk succeeds.

If things are now working, you can leave xcatsslversion and xcatsslciphers out of the site table if you do not require specific SSL versions or ciphers; xCAT will choose default values if those attributes are not set. If you require specific xcatsslversion or xcatsslciphers settings, you will need to adjust the settings to be compatible with the versions of OpenSSL installed on the management node and compute node to produce a working combination. I can provide more information on how to check these settings, but let's confirm whether this is the problem first.

Depending on the version of xCAT installed, you may also check whether the xCAT default values for SSL_version here are compatible with the version of OpenSSL running on the RHEL8 compute nodes.

https://github.com/xcat2/xcat-core/blob/1dea6334b3ba00337fee66a2bab37a6a1b09dbd5/xCAT-server/sbin/xcatd#L1553

From: Michael Robbert <mrobb...@mines.edu>
To: xCAT Users Mailing list <xcat-user@lists.sourceforge.net>
Date: 03/16/2021 05:11 PM
Subject: [EXTERNAL] [xcat-user] Postscript remoteshell not working on RHEL8 compute node

I have an x86_64 management node running CentOS 7.9 and most of my cluster is x86_64 nodes running the same OS. I’m trying to test RHEL8 by installing it on one of our ppc64le nodes, but I’ve found that the remoteshell postscript is failing to install the correct SSH hostkeys on the node during installation and when run manually after the node comes up after the install completes.
I’ve enabled xcatdebugmode and this is what I see in the logs from the install when that postscript runs:

Mar 16 14:09:40 m002 xcat.deployment.postscript INFO Running postscript: remoteshell
Mar 16 14:09:40 m002 xcat[36244]: INFO Install: rsyslog version 8 setup
Mar 16 14:09:40 m002 xcat[36268]: INFO remoteshell: setup /etc/ssh/sshd_config and ssh_config
Mar 16 14:09:40 m002 xcat[36273]: INFO Install: setup root .ssh
Mar 16 14:09:41 m002 xcat[36280]: INFO remoteshell: sending xcatflowrequest 172.18.10.201 3001
Mar 16 14:09:41 m002 xcat[36310]: INFO remoteshell:xcatflowrequest received response return=0
Mar 16 14:09:41 m002 xcat[36324]: INFO remoteshell: getting ssh_host_dsa_key
Mar 16 14:09:41 m002 xcat[36326]: INFO remoteshell: sending xcatflowrequest 172.18.10.201 3001
Mar 16 14:09:41 m002 xcat[36356]: INFO remoteshell:xcatflowrequest return=0
Mar 16 14:09:41 m002 xcat[36368]: INFO remoteshell: sending xcatflowrequest 172.18.10.201 3001
Mar 16 14:09:42 m002 xcat[36398]: INFO remoteshell:xcatflowrequest return=0
Mar 16 14:09:42 m002 xcat[36410]: INFO remoteshell: sending xcatflowrequest 172.18.10.201 3001
Mar 16 14:09:42 m002 xcat[36440]: INFO remoteshell:xcatflowrequest return=0
Mar 16 14:09:42 m002 xcat[36452]: INFO remoteshell: sending xcatflowrequest 172.18.10.201 3001
Mar 16 14:09:42 m002 xcat[36482]: INFO remoteshell:xcatflowrequest return=0
Mar 16 14:09:42 m002 xcat[36494]: INFO remoteshell: sending xcatflowrequest 172.18.10.201 3001
Mar 16 14:09:42 m002 xcat[36524]: INFO remoteshell:xcatflowrequest return=0
Mar 16 14:09:42 m002 xcat[36536]: INFO remoteshell: sending xcatflowrequest 172.18.10.201 3001
Mar 16 14:09:42 m002 xcat[36566]: INFO remoteshell:xcatflowrequest return=0
Mar 16 14:09:42 m002 xcat[36578]: INFO remoteshell: sending xcatflowrequest 172.18.10.201 3001
Mar 16 14:09:42 m002 xcat[36608]: INFO remoteshell:xcatflowrequest return=0
Mar 16 14:09:42 m002 xcat[36620]: INFO remoteshell: sending xcatflowrequest 172.18.10.201 3001
Mar 16 14:09:42 m002 xcat[36650]: INFO remoteshell:xcatflowrequest return=0
Mar 16 14:09:42 m002 xcat[36662]: INFO remoteshell: sending xcatflowrequest 172.18.10.201 3001
Mar 16 14:09:43 m002 xcat[36692]: INFO remoteshell:xcatflowrequest return=0
Mar 16 14:09:43 m002 xcat[36710]: INFO remoteshell: sending xcatflowrequest 172.18.10.201 3001
Mar 16 14:09:43 m002 xcat[36740]: INFO remoteshell:xcatflowrequest return=0
Mar 16 14:09:43 m002 xcat[36754]: INFO ssh_rsa_hostkey
Mar 16 14:09:43 m002 xcat[36756]: INFO remoteshell: sending xcatflowrequest 172.18.10.201 3001
Mar 16 14:09:43 m002 xcat[36786]: INFO remoteshell:xcatflowrequest return=0
Mar 16 14:09:43 m002 xcat[36798]: INFO remoteshell: sending xcatflowrequest 172.18.10.201 3001
Mar 16 14:09:43 m002 xcat[36828]: INFO remoteshell:xcatflowrequest return=0
Mar 16 14:09:43 m002 xcat[36840]: INFO remoteshell: sending xcatflowrequest 172.18.10.201 3001
Mar 16 14:09:43 m002 xcat[36870]: INFO remoteshell:xcatflowrequest return=0
Mar 16 14:09:43 m002 xcat[36882]: INFO remoteshell: sending xcatflowrequest 172.18.10.201 3001
Mar 16 14:09:43 m002 xcat[37017]: INFO remoteshell:xcatflowrequest return=0
Mar 16 14:09:43 m002 xcat[37029]: INFO remoteshell: sending xcatflowrequest 172.18.10.201 3001
Mar 16 14:09:43 m002 xcat[37143]: INFO remoteshell:xcatflowrequest return=0
Mar 16 14:09:44 m002 xcat[37155]: INFO remoteshell: sending xcatflowrequest 172.18.10.201 3001
Mar 16 14:09:44 m002 xcat[37520]: INFO remoteshell:xcatflowrequest return=2
Mar 16 14:09:44 m002 xcat[37521]: INFO remoteshell: error from xcatflowrequest, will not use flow control
Mar 16 14:10:39 m002 xcat[37616]: INFO ssh_ecdsa_hostkey
Mar 16 14:12:56 m002 xcat[37796]: INFO remoteshell: gathering ssh_root_pub_key
Mar 16 14:12:56 m002 xcat[37802]: INFO ssh_root_pub_key
Mar 16 14:15:28 m002 xcat[38065]: INFO remoteshell:sshbetweennodes is yes
Mar 16 14:15:28 m002 xcat[38076]: INFO remoteshell: gathering ssh_root_key
Mar 16 14:15:28 m002 xcat[38080]: INFO ssh_root_key
Mar 16 14:18:02 m002 xcat.deployment.postscript INFO postscript remoteshell return with 0

It looks to me like it has the correct return code, but the hostkey files are not correct after reboot. I found an old post that suggested running the getcredentials.awk script manually after starting the miniserver on the compute node:

    /xcatpost/allowcred.awk &
    USEOPENSSLFORXCAT=yes XCATSERVER=172.18.10.201:3001 /xcatpost/getcredentials.awk ssh_rsa_hostkey

If I do that from an x86_64 CentOS 7.9 node it returns output that includes a hostkey, but if I run the same thing from my ppc64le RHEL 8.2 node it returns no data, even though the exit code is 0. Any thoughts on what might be wrong or what else I can check in order to fix this?

Mike Robbert
Cyberinfrastructure Specialist, Cyberinfrastructure and Advanced Research Computing
Information and Technology Solutions (ITS)
303-273-3786 | mrobb...@mines.edu
Our values: Trust | Integrity | Respect | Responsibility

[attachment "smime.p7s" deleted by Nathan A Besaw/Poughkeepsie/IBM]
_______________________________________________
xCAT-user mailing list
xCAT-user@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/xcat-user