FYI
Something I am noticing, and it seems to be consistent, is that the
installation still fails if the VM has more than one NIC.
Even though it PXE boots from the deployment interface, once the install
starts it fails to configure the network if a second NIC is enabled (in
this case the enterprise network). The network does not come up, DHCP
does not get an address, and the interfaces have to be configured by
hand. I also (sometimes) have issues where nodediscover adds the right
MAC but the node still tries to PXE boot off the wrong interface because
the same UUID is associated with both interfaces.
xCAT would deploy the node, but often the network was still not correct
even if the routing table was set up right. Both NICs would set up their
own default gateway instead of using one (the enterprise one). The
'setroutes' postscript would fix the issue, but the fix didn't persist
after a reboot. Easy enough to fix with a postscript full of nmcli
statements.
However, if the node won't deploy with two NICs present and the second
one has to be added manually as a device in vSphere, that defeats a
hands-off deploy.
So far I only see this with the Ubuntu installs; Rocky seems to deploy
(then I fix the default route and custom DNS with a bash postscript).
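For anyone wanting a starting point, a postscript of that kind could look roughly like the sketch below. It is not the actual script from my cluster: the connection names, DNS server, and search domain are placeholders, and it assumes the connection names contain no spaces. The function only prints the nmcli commands so they can be reviewed first.

```shell
#!/bin/bash
# Sketch of a postscript that leaves only one default route on a two-NIC node.
# ASSUMPTIONS: the connection names, DNS server, and search domain passed in
# are placeholders chosen for illustration, not values from this thread.

# Print the nmcli commands; pipe the output to "bash" to actually apply them.
fix_routes_cmds() {
    local deploy_con="$1" ent_con="$2" dns="$3" search="$4"
    # Deployment NIC: never install a default route from this profile.
    echo "nmcli connection modify $deploy_con ipv4.never-default yes"
    # Enterprise NIC: use custom DNS and ignore the DHCP-supplied resolvers.
    echo "nmcli connection modify $ent_con ipv4.dns $dns ipv4.dns-search $search ipv4.ignore-auto-dns yes"
    # Re-activate both profiles so the saved settings take effect immediately.
    echo "nmcli connection up $deploy_con"
    echo "nmcli connection up $ent_con"
}

# Example: fix_routes_cmds deploy enterprise 192.0.2.53 example.com | bash
```

Because "nmcli connection modify" writes to the saved connection profile, the settings should survive a reboot, which is the part 'setroutes' did not cover.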
Since the bulk of a cluster is made up of compute nodes this may not be
a huge deal, just for login/remote visualization/etc nodes that need
connectivity to the corporate network, but I thought I would document it
here.
So for now, I add only the one NIC, deploy the node, then add the second
NIC and update the netplan file with the new NIC, default gateway, and
proper DNS address/search order.
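As a rough illustration of that last step, the netplan side could be generated along these lines. This is a sketch under assumptions: the interface names, addresses, and output file name are placeholders, the enterprise NIC is given a static address, and the deployment NIC stays on DHCP with its routes and DNS ignored.

```shell
#!/bin/bash
# Sketch: write a netplan file for a node that gained a second NIC after deploy.
# ASSUMPTIONS: interface names, addresses, gateway, DNS server, and search
# domain are placeholders; none of these values come from this thread.

write_netplan() {
    local out="$1" deploy_if="$2" ent_if="$3" ent_addr="$4" gw="$5" dns="$6" search="$7"
    cat > "$out" <<EOF
network:
  version: 2
  ethernets:
    $deploy_if:
      dhcp4: true
      dhcp4-overrides:
        use-routes: false
        use-dns: false
    $ent_if:
      addresses: [$ent_addr]
      routes:
        - to: default
          via: $gw
      nameservers:
        addresses: [$dns]
        search: [$search]
EOF
    # netplan complains about configuration files readable by other users.
    chmod 600 "$out"
}

# Example (hypothetical file name and values), followed by "netplan apply":
#   write_netplan /etc/netplan/60-enterprise.yaml ens192 ens224 \
#       198.51.100.10/24 198.51.100.1 192.0.2.53 example.com
```

Writing a separate file rather than editing the installer-generated one is just one way to do it; editing the existing file in place works as well.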
Hope it helps,
Brian J
On 4/25/25 05:48, Jarrod Johnson wrote:
Ok, changes were made to the Ubuntu deployment bootstrap to be a bit
more tenacious. There was previously a chance to fall through the
automation setup into manual setup. Now it should never do that;
it will either hang or error out. I'll look at a hanging scenario I saw
yesterday and make it clearer what is missing, but that shouldn't
apply to a normal PXE install.
Also as an aside I've been doing "hardware" control to target VCSA. I
can't seem to get "setboot" to work yet, but could provide power,
inventory and text console. The API for boot device doesn't seem to
work as I would expect. I hope to push libvirt, VCSA, and proxmox in
the near future.
------------------------------------------------------------------------
*From:* Brian Joiner <martinitime1...@gmail.com>
*Sent:* Thursday, April 24, 2025 8:24 PM
*To:* xCAT Users Mailing list <xcat-user@lists.sourceforge.net>
*Cc:* Jarrod Johnson <jjohns...@lenovo.com>
*Subject:* Re: [xcat-user] [External] Confluent: Anyone get Ubuntu to
deploy?
Ok so today I had some time to try again:
Ubuntu Server 22.04 PXE boots and starts the installer, but becomes
interactive and will not detect the network.
Added Ubuntu Server 24.04, same result.
Then I saw the email about the new release. Installed updates
normally via yum and retried 24.04. No other changes.
It worked! Fully hands-off, hostname correct.
I may not have mentioned that my environment is ESXi 7 VMs all the
way, so I don't know if there was a problem with the NIC firmware or
what, but the Confluent update fixed it.
Thanks for all you devs do!
Brian Joiner
On Wed, Apr 9, 2025 at 11:25 AM Brian Joiner
<martinitime1...@gmail.com> wrote:
Awesome, thanks. Just knowing that a > 18.04 deployment should
work as expected is a good start. I'll double check paths,
permissions, logs and the curl test in the other reply and report
back.
On 4/1/25 12:57, Jarrod Johnson via xCAT-user wrote:
The 'password not accepted' can happen if you reboot a deployment
and retry without doing a 'nodedeploy' again. In confluent you
have to explicitly say you want to deploy, as there's a security
mechanism that locks down after a node API token is claimed.
[root@r3u20 ~]# nodedeploy r3u24
r3u24: pending: ubuntu-22.04.5-x86_64-default (node authentication armed)
[root@r3u20 ~]#
Note 'node authentication armed'. This means it is configured to
allow a single more weakly authenticated request for a node token.
At some point during an install attempt, you get to:
[root@r3u20 ~]# nodedeploy r3u24
r3u24: pending: ubuntu-22.04.5-x86_64-default
[root@r3u20 ~]#
In this case, a new attempt must be accompanied by another
nodedeploy; a reboot without completing deployment will discard
the node token that was granted.
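A small check along these lines can help avoid rebooting a node that is no longer armed. It is only a sketch that matches the status text quoted above; the "-n" (network install) invocation in the example comment is the usual way to request a deployment, but confirm it against "nodedeploy -h" on your version.

```shell
#!/bin/bash
# Sketch: decide whether a node still needs a fresh nodedeploy before a retry.
# ASSUMPTION: the status line has the same wording as the nodedeploy output
# quoted in this message.

# Succeeds if the status text on stdin contains "node authentication armed".
is_armed() {
    grep -q "node authentication armed"
}

# Example, run on the Confluent server (node and profile names from this thread):
#   nodedeploy r3u24 | is_armed || nodedeploy r3u24 -n ubuntu-22.04.5-x86_64-default
```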
However, I assume at least the first attempt failed for other
reasons, and I may need to know more about that. I just did a
deployment of 22.04.5 myself without issue, so at least in theory
it should be workable, but I will probably need to see some files
when you get to a manual interaction point:
files like /conf/param.conf and files in
/custom-installation/confluent/..
------------------------------------------------------------------------
*From:* Brian Joiner <martinitime1...@gmail.com>
*Sent:* Tuesday, April 1, 2025 12:00 PM
*To:* xcat-user@lists.sourceforge.net
*Subject:* [External] [xcat-user] Confluent: Anyone get Ubuntu to
deploy?
I have attempted to deploy an Ubuntu 22.04 server via Confluent, and
I've run into all kinds of issues. Either the process dies after trying
to mount the CD, I get the option to enter an emergency command prompt,
or I get sent into an interactive setup screen where I'm prompted to
enter account/network/disk info by hand. I've had the same experience
in my home lab and my office Confluent instance.
However, if I deploy Rocky 9.x to the same node (with no attribute
changes) it deploys as expected. Has anyone gotten Ubuntu server to
deploy without intervention? Have I missed some kind of setup step for
Ubuntu based installs?
My confluent server has all updated packages, and I didn't see anything
of interest in /var/log/confluent
Attached is one of the fail screen shots if that helps.
Thanks,
Brian Joiner
_______________________________________________
xCAT-user mailing list
xCAT-user@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/xcat-user