    $ nodedeploy MYHOST
    MYHOST: pending: ubuntu-20.04.6-x86_64-default

I have U22.04 available already as well if testing with that is useful.
The server in question isn’t used for anything special currently. My hope is
that once I get some basic stuff going with the SuperMicro hardware we can
start upgrading our Lenovo systems.

> On Nov 10, 2023, at 14:25, Jarrod Johnson <jjohns...@lenovo.com> wrote:
>
> It should cloud-init as a matter of course, just like for the kickstart
> installs...
>
> What does nodedeploy <node> look like when you hit interactive? May need to
> look into this more directly next week...
>
>> From: David Magda <dmagda+x...@ee.torontomu.ca>
>> Sent: Friday, November 10, 2023 2:16 PM
>> To: xCAT Users Mailing list <xcat-user@lists.sourceforge.net>
>> Subject: Re: [xcat-user] [External] Re: xCAT-Confluent
>>
>> Ah, silly me: bad copy-paste.
>>
>> That command gives:
>>
>>     File "/opt/confluent/bin/confluent_selfcheck", line 241
>>       for rsp in sess.read(f'/nodes/{args.node}/attributes/all'):
>>       ^
>>     SyntaxError: invalid syntax
>>
>> Regardless, I (re-)ran the `nodeattrib` correctly, but that did not help. I
>> then removed all the “filename=…” stanzas in dhcpd.conf, did a restart, and
>> the system got (AFAICT) an IP from DHCPd, but Confluent gave it the PXE boot
>> parameters and the system launched into the Ubuntu 20.04 installer. The
>> console is prompting me a bunch of questions.
>>
>> So I think I’ve finally managed to muddle through this part of the
>> documentation:
>>
>> https://hpc.lenovo.com/users/documentation/confluentosdeploy.html
>>
>> Is there any documentation about automating Ubuntu installs with Confluent?
>> Does Confluent handle any cloud-init stuff (which was run during the boot
>> process), or is there some other method to send things like partitioning and
>> package information to Ubuntu?
>>
>> > On Nov 10, 2023, at 11:01, Jarrod Johnson <jjohns...@lenovo.com> wrote:
>> >
>> > The attribute name is plural, with s at the end.
>> > deployment.useinsecureprotocols rather than deployment.useinsecureprotocol.
>> >
>> > confluent_selfcheck -n MYHOST
>> >
>> > Say anything interesting?
>> >
>> >> From: David Magda <dmagda+x...@ee.torontomu.ca>
>> >> Sent: Friday, November 10, 2023 10:50 AM
>> >> To: xCAT Users Mailing list <xcat-user@lists.sourceforge.net>
>> >> Subject: Re: [xcat-user] [External] Re: xCAT-Confluent
>> >>
>> >> Looking in that file there was:
>> >>
>> >>     Nov 09 09:02:06 {"info": "Boot attempt by MYHOST detected in insecure
>> >>     mode, but insecure mode is disabled. Set the attribute
>> >>     `deployment.useinsecureprotocols` to `firmware` or `always` to enable
>> >>     support, or use UEFI HTTP boot with HTTPS." }
>> >>
>> >> Trying to tweak that attribute, I got:
>> >>
>> >>     $ nodeattrib MYHOST deployment.useinsecureprotocol=firmware
>> >>     Error: Bad Request - deployment.useinsecureprotocol attribute on node MYHOST is invalid
>> >>
>> >> I tried using nodegroupattrib as well on a group that the host was in,
>> >> and got:
>> >>
>> >>     Error: Bad Request - deployment.useinsecureprotocol attribute is invalid
>> >>
>> >> I then edited the reply_dhcp4() function in
>> >> /opt/confluent/lib/python/confluent/discovery/protocols/pxe.py to change
>> >> the default check to remove the "return;" in the "if insecuremode ==
>> >> 'never' and not httpboot:" stanza so that it would continue going. The
>> >> log message still appears (so I know the code is getting there), but the
>> >> events file now has:
>> >>
>> >>     Nov 09 09:18:34 {"info": "Offering PXE boot without address,
>> >>     served from 172.17.15.254 to MYHOST"}
>> >>
>> >> And the system is still booting xCAT (I have commented out
>> >> "gpxe.no-pxedhcp 1" in dhcpd.conf and restarted).
>> >>
>> >> Not running the dhcpd at all simply has the system time out on its PXE
>> >> attempt. I told Confluent about the particular IP address the system
>> >> should have:
>> >>
>> >>     $ nodeattrib MYHOST net.ipv4_address=172.17.15.223/21
>> >>
>> >> And that did not help.
>> >>
>> >> Per "lsof -i udp", Confluent is listening on (amongst many other ports)
>> >> *:bootps, *:dhcpv6-server, *:pxe (etc).
>> >>
>> >> Should I edit my dhcpd.conf and rip out things like:
>> >>
>> >>     […]
>> >>     if option user-class-identifier = "xNBA" and option client-architecture = 00:00 { #x86, xCAT Network Boot Agent
>> >>         always-broadcast on;
>> >>         filename = "…"
>> >>     […]
>> >>
>> >> to try to see if that will get things going with Confluent? Or are things
>> >> expected to work with all of that?
>> >>
>> >> > On Nov 8, 2023, at 16:19, Jarrod Johnson <jjohns...@lenovo.com> wrote:
>> >> >
>> >> > tail /var/log/confluent/events for a hint on why it might be ignoring
>> >> > the request.
>> >> >
>> >> >> From: David Magda <dma...@ee.torontomu.ca>
>> >> >> Sent: Wednesday, November 8, 2023 2:46 PM
>> >> >> To: xCAT Users Mailing list <xcat-user@lists.sourceforge.net>
>> >> >> Subject: Re: [xcat-user] [External] Re: xCAT-Confluent
>> >> >>
>> >> >> I did a “service dhcpd stop” and a “service confluent restart”, and
>> >> >> the SuperMicro did not receive any reply to the DHCP/PXE packets it
>> >> >> was sending out. I then did a “service dhcpd start” and the
>> >> >> “xcat/genesis” file was loaded.
>> >> >>
>> >> >> The dhcpd.conf did have “gpxe.no-pxedhcp”, but removing it and
>> >> >> restarting did not change any behaviour. I noticed that
>> >> >> “http://IP:80/tftpboot/xcat/xnba/nets/172.17.8.0_21” is being
>> >> >> referenced.
>> >> >>
>> >> >> Per “lsof -i udp”, the Confluent is listening on *:bootps, so I’m not
>> >> >> sure why it is not answering. I had run a “nodedeploy MYHOST -n
>> >> >> ubuntu-20.04.6-x86_64-default” earlier.
>> >> >>
>> >> >>     $ nodeattrib MYHOST
>> >> >>     MYHOST: console.method: ipmi
>> >> >>     MYHOST: deployment.apiarmed: once
>> >> >>     MYHOST: deployment.pendingprofile: ubuntu-20.04.6-x86_64-default
>> >> >>     MYHOST: deployment.profile:
>> >> >>     MYHOST: deployment.stagedprofile:
>> >> >>     MYHOST: deployment.state:
>> >> >>     MYHOST: deployment.state_detail:
>> >> >>     MYHOST: groups: prox,ipmi,all,everything
>> >> >>     MYHOST: hardwaremanagement.manager: MYHOST-ipmi
>> >> >>     MYHOST: net.hwaddr: ac:1f:AA:BB:CC:DD
>> >> >>     MYHOST: net.ipv4_method: dhcp
>> >> >>     MYHOST: secret.hardwaremanagementpassword: ********
>> >> >>     MYHOST: secret.hardwaremanagementuser: ********
>> >> >>
>> >> >> > On Nov 7, 2023, at 13:40, Jarrod Johnson wrote:
>> >> >> >
>> >> >> > If dhcpd.conf is set to not send any 'filename', it's best. If you
>> >> >> > don't need a dhcp server, then you can turn it off. There's also
>> >> >> >
>> >> >> > If you have a dhcp server with a dynamic range on it, then:
>> >> >> > nodeattrib net.ipv4_method=firmwaredhcp
>> >> >> >
>> >> >> > If you have a dhcp server with static reservations, you could either
>> >> >> > have dhcp continue, or disallow dhcp for the confluent node.
>> >> >> >
>> >> >> > If you have no dhcp server, then it should just do the right thing
>> >> >> > directly.
>> >> >> >
>> >> >> > If you want to use dhcp ongoing, then 'net.ipv4_method=dhcp',
>> >> >> > however you own the IPAM sort of responsibility totally.
>> >> >> >
>> >> >> > If your dhcp has:
>> >> >> > option gpxe.no-pxedhcp 1;
>> >> >> > Please remove that to let confluent merge an offer with an
>> >> >> > uncoordinated dhcp server.
>> >> >> >
>> >> >> > I need to do a deeper write-up on the detail about dhcp interaction,
>> >> >> > how it is now optional, and how it can coexist with an unmanaged
>> >> >> > dhcp server and free the dhcp server from 'filename'
>> >> >> >
>> >> >> >> From: David Magda
>> >> >> >> Sent: Tuesday, November 7, 2023 9:27 AM
>> >> >> >> To: xCAT Users Mailing list
>> >> >> >> Subject: Re: [xcat-user] [External] Re: xCAT-Confluent
>> >> >> >>
>> >> >> >> After running the first few commands, I have
>> >> >> >> /tftpboot/confluent/x86_64/ipxe* and
>> >> >> >> /var/lib/confluent/public/{os, distribution}/ubuntu* present,
>> >> >> >> along with genesis-x86_64/.
>> >> >> >>
>> >> >> >> However the contents of the RHEL/CentOS /etc/dhcp/dhcpd.conf are
>> >> >> >> such that “filename” is “xcat/xnba.*”, so that’s what gets loaded.
>> >> >> >>
>> >> >> >> Do I need to tweak the dhcpd.conf just for the test system I’m
>> >> >> >> playing with, or should a completely new dhcpd.conf file be put in
>> >> >> >> place for using Confluent? (Moving the current one out of the way,
>> >> >> >> perhaps temporarily until I get an understanding of Confluent so I
>> >> >> >> can revert to xCAT if need be.)
>> >> >> >>
>> >> >> >>> On Oct 26, 2023, at 11:33, Jarrod Johnson wrote:
>> >> >> >>>
>> >> >> >>> I will say that EL7 hasn't been tested and thus we haven't pushed
>> >> >> >>> updates since 3.8.0, but 3.8.0 should be plenty.
>> >> >> >>>
>> >> >> >>> The confluent you have going is already enough to start examining
>> >> >> >>> OS deployment profiles. If you would like to, you can use
>> >> >> >>> commands like osdeploy initialize and osdeploy import and even
>> >> >> >>> imgutil build, and it won't mess with xCAT.
>> >> >> >>>
>> >> >> >>> When you get to nodedeploy, that is the time when you have to
>> >> >> >>> start planning around potential disruption as xCAT and confluent
>> >> >> >>> might fight over who gets to deploy a system, and that can be
>> >> >> >>> confusing.
>> >> >> >>> We should document formally how to mask a node from
>> >> >> >>> xCAT ('!*NOIP*' in mac table) to let one kick the tires with a
>> >> >> >>> node...
>> >> >> >>>
>> >> >> >>> I can help look at a few people kicking tires, certainly seems
>> >> >> >>> worthy of documentation or video example...
>> >> >> >>>
>> >> >> >>>> From: David Magda
>> >> >> >>>> Sent: Thursday, October 26, 2023 11:22 AM
>> >> >> >>>> To: xCAT Users Mailing list
>> >> >> >>>> Subject: [External] Re: [xcat-user] xCAT-Confluent
>> >> >> >>>>
>> >> >> >>>> Yes, there was perhaps auto-completion with regard to
>> >> >> >>>> Confluent/Confluence.
>> >> >> >>>>
>> >> >> >>>> I currently have a (legacy?) ‘joint’ xCAT-Confluent (3.6)
>> >> >> >>>> installation on RHEL 7 that I inherited; if one wants to fully
>> >> >> >>>> move from xCAT to Confluent, is there a document on how to
>> >> >> >>>> ‘extract’ oneself from xCAT? I don’t see anything that jumps out
>> >> >> >>>> at:
>> >> >> >>>>
>> >> >> >>>> https://hpc.lenovo.com/users/
>> >> >> >>>> https://hpc.lenovo.com/users/documentation/
>> >> >> >>>>
>> >> >> >>>> Should I simply abandon the previous installation and do a fresh
>> >> >> >>>> install? While there is some documentation, the system leans
>> >> >> >>>> towards being heavily vendor-used so people completely new to it
>> >> >> >>>> have a steep learning curve (xCAT is/was also challenging to get
>> >> >> >>>> into since it was fairly vendor-focused).
>> >> >> >> […]

_______________________________________________
xCAT-user mailing list
xCAT-user@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/xcat-user