As to why root: ultimately 'nodeapply' has to ssh in to the nodes in a way that currently allows arbitrary commands as root, so it does not go through the confluent API. That means you have to be 'really root' to run nodeshell and nodeapply, since root is the only user allowed, by default, to ssh into nodes to do such a thing.
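A quick way to check that part in isolation is to see whether root can reach the node over ssh without being prompted. The following is only a rough sketch: check_root_ssh is a hypothetical helper name, not something confluent ships, and the SSH variable is an assumption added so the command can be swapped out.

```shell
#!/bin/sh
# Hypothetical helper (not part of confluent): reports whether root can run
# a command on a node over ssh without any interactive prompt.
check_root_ssh() {
    node="$1"
    # BatchMode=yes makes ssh fail instead of asking for a password or passphrase.
    # ConnectTimeout=5 keeps an unreachable node from hanging the check.
    # SSH is an assumed override for the ssh binary; it defaults to plain ssh.
    if ${SSH:-ssh} -o BatchMode=yes -o ConnectTimeout=5 "root@${node}" true; then
        echo "${node}: non-interactive root ssh works"
    else
        rc=$?
        echo "${node}: non-interactive root ssh failed (exit ${rc})"
        return 1
    fi
}
```

Run it from a root shell, for example 'check_root_ssh node01'. An exit status of 255 from ssh means the connection or authentication itself failed, rather than the remote command.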
If you want a common user to be able to do that, you can create files matching:

/var/lib/confluent/public/site/ssh/*.rootpubkey

each including a public key that your user has access to. You can also use ssh-agent and ssh-add if you want to add /root/.ssh/id_ed25519 for non-interactive use in a session.

In terms of why that fails, there are some things I would check:

/var/log/confluent/stdout
/var/log/confluent/stderr
/var/log/confluent/trace

And the output of:

confluent_selfcheck -an node01

________________________________
From: Brian Joiner <martinitime1...@gmail.com>
Sent: Wednesday, October 2, 2024 10:00 PM
To: xCAT Users Mailing list <xcat-user@lists.sourceforge.net>
Subject: [External] [xcat-user] Confluent nodeapply -F fail

I'm trying to set up syncfiles to transfer some configs for munge and slurm, and no matter what I do, 'nodeapply -F <nodename>' fails if run after the node has booted:

[brian@confluent01 confluent]$ sudo -i nodeapply -F node01
Enter passphrase for key '/root/.ssh/id_ed25519':
node01:
node01: ---------------------------------------------------------------------------
node01: Running python script 'syncfileclient' from https://10.13.13.5/confluent-public/os/rocky-9.2-x86_64-default/scripts/
node01: Executing in /tmp/confluentscripts.Nm4k5fCWl
node01: Traceback (most recent call last):
node01:   File "/tmp/confluentscripts.Nm4k5fCWl/syncfileclient", line 286, in <module>
node01:     synchronize()
node01:   File "/tmp/confluentscripts.Nm4k5fCWl/syncfileclient", line 233, in synchronize
node01:     status, rsp = ac.grab_url_with_status('/confluent-api/self/remotesyncfiles')
node01:   File "/opt/confluent/bin/apiclient", line 413, in grab_url_with_status
node01:     raise Exception(rsp.read())
node01: 'syncfileclient' exited with code 1
node01: Exception: b"500 - Command '['rsync', '-rvLD', '/tmp/tmp9wxzgajv.synctonode01/', 'root@[10.13.13.11]:/']' returned non-zero exit status 255."

I have tried using various section headers like APPENDONCE and REPLACE in /var/lib/confluent/public/os/rocky-9.2-x86_64-default/syncfiles, or just no header, and it fails every time. I have no issue running other post scripts from the various scripts subdirs. The files I want to transfer are in /var/lib/confluent/syncfiles, but I even tried them in /var/lib/confluent/public/syncfiles and it made no difference.

Also, my user 'brian' is supposed to be an admin, but I have to use 'sudo -i' to run anything.

Syncfiles on xCAT never gave me any issues, and I'm using the same syntax in this syncfiles file for Confluent:

/source/file -> /destination/file

Brian Joiner