$ nodedeploy MYHOST
MYHOST: pending: ubuntu-20.04.6-x86_64-default

I have U22.04 available already as well if testing with that is useful.

The server in question isn’t used for anything special currently. My hope is that once I get some basic stuff going with the SuperMicro hardware we can start upgrading our Lenovo systems.

> On Nov 10, 2023, at 14:25, Jarrod Johnson <jjohns...@lenovo.com> wrote:
>
> It should cloud-init as a matter of course, just like for the kickstart
> installs...
>
> What does nodedeploy <node> look like when you hit interactive? May need to
> look into this more directly next week...
>
>> From: David Magda <dmagda+x...@ee.torontomu.ca>
>> Sent: Friday, November 10, 2023 2:16 PM
>> To: xCAT Users Mailing list <xcat-user@lists.sourceforge.net>
>> Subject: Re: [xcat-user] [External] Re: xCAT-Confluent
>>
>> Ah, silly me: bad copy-paste.
>>
>> That command gives:
>>
>>   File "/opt/confluent/bin/confluent_selfcheck", line 241
>>     for rsp in sess.read(f'/nodes/{args.node}/attributes/all'):
>>                                                             ^
>> SyntaxError: invalid syntax
>>
>> Regardless, I (re-)ran the `nodeattrib` correctly, but that did not help. I
>> then removed all the “filename=…” stanzas in dhcpd.conf, did a restart, and
>> the system got (AFAICT) an IP from DHCPd, but Confluent gave it the PXE boot
>> parameters and the system launched into the Ubuntu 20.04 installer. The
>> console is prompting me with a bunch of questions.
>>
>> So I think I’ve finally managed to muddle through this part of the
>> documentation:
>>
>> https://hpc.lenovo.com/users/documentation/confluentosdeploy.html
>>
>> Is there any documentation about automating Ubuntu installs with Confluent?
>> Does Confluent handle any cloud-init stuff (which was run during the boot
>> process), or is there some other method to send things like partitioning and
>> package information to Ubuntu?
>>
>>
>>> On Nov 10, 2023, at 11:01, Jarrod Johnson <jjohns...@lenovo.com> wrote:
>>>
>>> The attribute name is plural, with s at the end.
>>> deployment.useinsecureprotocols rather than deployment.useinsecureprotocol.
>>>
>>> confluent_selfcheck -n MYHOST
>>>
>>> Say anything interesting?
>>>
>>>> From: David Magda <dmagda+x...@ee.torontomu.ca>
>>>> Sent: Friday, November 10, 2023 10:50 AM
>>>> To: xCAT Users Mailing list <xcat-user@lists.sourceforge.net>
>>>> Subject: Re: [xcat-user] [External] Re: xCAT-Confluent
>>>>
>>>> Looking in that file there was:
>>>>
>>>>   Nov 09 09:02:06 {"info": "Boot attempt by MYHOST detected in insecure
>>>>   mode, but insecure mode is disabled. Set the attribute
>>>>   `deployment.useinsecureprotocols` to `firmware` or `always` to enable
>>>>   support, or use UEFI HTTP boot with HTTPS." }
>>>>
>>>> Trying to tweak that attribute, I got:
>>>>
>>>>   $ nodeattrib MYHOST deployment.useinsecureprotocol=firmware
>>>>   Error: Bad Request - deployment.useinsecureprotocol attribute on
>>>>   node MYHOST is invalid
>>>>
>>>> I tried using nodegroupattrib as well on a group that the host was in, and
>>>> got:
>>>>
>>>>   Error: Bad Request - deployment.useinsecureprotocol attribute is
>>>>   invalid
>>>>
>>>> I then edited the reply_dhcp4() function in
>>>> /opt/confluent/lib/python/confluent/discovery/protocols/pxe.py to change
>>>> the default check to remove the "return;" in the "if insecuremode ==
>>>> 'never' and not httpboot:" stanza so that it would continue going. The log
>>>> message still appears (so I know the code is getting there), but the
>>>> events file now has:
>>>>
>>>>   Nov 09 09:18:34 {"info": "Offering PXE boot without address, served
>>>>   from 172.17.15.254 to MYHOST"}
>>>>
>>>> And the system is still booting xCAT (I have commented out
>>>> "gpxe.no-pxedhcp 1" in dhcpd.conf and restarted).
>>>>
>>>> Not running the dhcpd at all simply has the system time out on its PXE
>>>> attempt. I told Confluent about the particular IP address the system
>>>> should have:
>>>>
>>>>   $ nodeattrib MYHOST net.ipv4_address=172.17.15.223/21
>>>>
>>>> And that did not help.
>>>>
>>>> Per "lsof -i udp", Confluent is listening on (amongst many other ports)
>>>> *:bootps, *:dhcpv6-server, *:pxe (etc).
>>>>
>>>> Should I edit my dhcpd.conf and rip out things like:
>>>>
>>>>   […]
>>>>   if option user-class-identifier = "xNBA" and option client-architecture = 00:00 { #x86, xCAT Network Boot Agent
>>>>     always-broadcast on;
>>>>     filename = "…"
>>>>   […]
>>>>
>>>> to try to see if that will get things going with Confluent? Or are things
>>>> expected to work with all of that?
>>>>
>>>>
>>>>
>>>>> On Nov 8, 2023, at 16:19, Jarrod Johnson <jjohns...@lenovo.com> wrote:
>>>>>
>>>>> tail /var/log/confluent/events for a hint on why it might be ignoring the
>>>>> request.
>>>>>
>>>>>> From: David Magda <dma...@ee.torontomu.ca>
>>>>>> Sent: Wednesday, November 8, 2023 2:46 PM
>>>>>> To: xCAT Users Mailing list <xcat-user@lists.sourceforge.net>
>>>>>> Subject: Re: [xcat-user] [External] Re: xCAT-Confluent
>>>>>>
>>>>>>
>>>>>> I did a “service dhcpd stop” and a “service confluent restart”, and the
>>>>>> SuperMicro did not receive any reply to the DHCP/PXE packets it was
>>>>>> sending out. I then did a “service dhcpd start” and the “xcat/genesis”
>>>>>> file was loaded.
>>>>>>
>>>>>> The dhcpd.conf did have “gpxe.no-pxedhcp”, but removing it and
>>>>>> restarting did not change any behaviour. I noticed that
>>>>>> “http://IP:80/tftpboot/xcat/xnba/nets/172.17.8.0_21” is being referenced.
>>>>>>
>>>>>> Per “lsof -i udp”, Confluent is listening on *:bootps, so I’m not
>>>>>> sure why it is not answering. I had run a “nodedeploy MYHOST -n
>>>>>> ubuntu-20.04.6-x86_64-default” earlier.
>>>>>>
>>>>>> $ nodeattrib MYHOST
>>>>>> MYHOST: console.method: ipmi
>>>>>> MYHOST: deployment.apiarmed: once
>>>>>> MYHOST: deployment.pendingprofile: ubuntu-20.04.6-x86_64-default
>>>>>> MYHOST: deployment.profile:
>>>>>> MYHOST: deployment.stagedprofile:
>>>>>> MYHOST: deployment.state:
>>>>>> MYHOST: deployment.state_detail:
>>>>>> MYHOST: groups: prox,ipmi,all,everything
>>>>>> MYHOST: hardwaremanagement.manager: MYHOST-ipmi
>>>>>> MYHOST: net.hwaddr: ac:1f:AA:BB:CC:DD
>>>>>> MYHOST: net.ipv4_method: dhcp
>>>>>> MYHOST: secret.hardwaremanagementpassword: ********
>>>>>> MYHOST: secret.hardwaremanagementuser: ********
>>>>>>
>>>>>>
>>>>>>> On Nov 7, 2023, at 13:40, Jarrod Johnson wrote:
>>>>>>>
>>>>>>> If dhcpd.conf is set to not send any 'filename', it's best. If you
>>>>>>> don't need a dhcp server, then you can turn it off. There's also
>>>>>>>
>>>>>>> If you have a dhcp server with a dynamic range on it, then:
>>>>>>> nodeattrib net.ipv4_method=firmwaredhcp
>>>>>>>
>>>>>>> If you have a dhcp server with static reservations, you could either
>>>>>>> have dhcp continue, or disallow dhcp for the confluent node.
>>>>>>>
>>>>>>> If you have no dhcp server, then it should just do the right thing
>>>>>>> directly.
>>>>>>>
>>>>>>> If you want to use dhcp ongoing, then 'net.ipv4_method=dhcp', however
>>>>>>> you own the IPAM sort of responsibility totally.
>>>>>>>
>>>>>>> If your dhcp has:
>>>>>>> option gpxe.no-pxedhcp 1;
>>>>>>> Please remove that to let confluent merge an offer with an
>>>>>>> uncoordinated dhcp server.
>>>>>>>
>>>>>>> I need to do a deeper write-up on the detail about dhcp interaction,
>>>>>>> how it is now optional, and how it can coexist with an unmanaged dhcp
>>>>>>> server and free the dhcp server from 'filename'
>>>>>>>
>>>>>>>> From: David Magda
>>>>>>>> Sent: Tuesday, November 7, 2023 9:27 AM
>>>>>>>> To: xCAT Users Mailing list
>>>>>>>> Subject: Re: [xcat-user] [External] Re: xCAT-Confluent
>>>>>>>>
>>>>>>>> After running the first few commands, I have
>>>>>>>> /tftpboot/confluent/x86_64/ipxe* and
>>>>>>>> /var/lib/confluent/public/{os, distribution}/ubuntu* present, along
>>>>>>>> with genesis-x86_64/.
>>>>>>>>
>>>>>>>> However the contents of the RHEL/CentOS /etc/dhcp/dhcpd.conf are such
>>>>>>>> that “filename” is “xcat/xnba.*”, so that’s what gets loaded.
>>>>>>>>
>>>>>>>> Do I need to tweak the dhcpd.conf just for the test system I’m playing
>>>>>>>> with, or should a completely new dhcpd.conf file be put in place for
>>>>>>>> using Confluent? (Moving the current one out of the way, perhaps
>>>>>>>> temporarily until I get an understanding of Confluent so I can revert
>>>>>>>> to xCAT if need-be.)
>>>>>>>>
>>>>>>>>> On Oct 26, 2023, at 11:33, Jarrod Johnson wrote:
>>>>>>>>>
>>>>>>>>> I will say that EL7 hasn't been tested and thus we haven't pushed
>>>>>>>>> updates since 3.8.0, but 3.8.0 should be plenty.
>>>>>>>>>
>>>>>>>>> The confluent you have going is already enough to start examining OS
>>>>>>>>> deployment profiles. If you would like to, you can use commands like
>>>>>>>>> osdeploy initialize and osdeploy import and even imgutil build, and
>>>>>>>>> it won't mess with xCAT.
>>>>>>>>>
>>>>>>>>> When you get to nodedeploy, that is the time when you have to start
>>>>>>>>> planning around potential disruption as xCAT and confluent might
>>>>>>>>> fight over who gets to deploy a system, and that can be confusing.
>>>>>>>>> We should document formally how to mask a node from xCAT ('!*NOIP*'
>>>>>>>>> in mac table) to let one kick the tires with a node...
>>>>>>>>>
>>>>>>>>> I can help look at a few people kicking tires, certainly seems worthy
>>>>>>>>> of documentation or video example...
>>>>>>>>>> From: David Magda
>>>>>>>>>> Sent: Thursday, October 26, 2023 11:22 AM
>>>>>>>>>> To: xCAT Users Mailing list
>>>>>>>>>> Subject: [External] Re: [xcat-user] xCAT-Confluent
>>>>>>>>>>
>>>>>>>>>> Yes, there was perhaps auto-completion with regards to
>>>>>>>>>> Confluent/Confluence.
>>>>>>>>>> I currently have a (legacy?) ‘joint’ xCAT-Confluent (3.6)
>>>>>>>>>> installation on RHEL 7 that I inherited; if one wants to fully move
>>>>>>>>>> from xCAT to Confluent, is there a document on how to ‘extract’
>>>>>>>>>> oneself from xCAT? I don’t see anything that jumps out at:
>>>>>>>>>> https://hpc.lenovo.com/users/
>>>>>>>>>> https://hpc.lenovo.com/users/documentation/
>>>>>>>>>> Should I simply abandon the previous installation and do a fresh
>>>>>>>>>> install? While there is some documentation, the system leans towards
>>>>>>>>>> being heavily vendor-used so people completely new to it have a
>>>>>>>>>> steep learning curve (xCAT is/was also challenging to get into since
>>>>>>>>>> it was fairly vendor-focused).
>>>>>>>> […]

_______________________________________________
xCAT-user mailing list
xCAT-user@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/xcat-user
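
The `nodeattrib MYHOST` output quoted in the thread follows a regular `node: attribute: value` shape, and the "attribute is invalid" error turned out to be a missing trailing "s" in the attribute name. The following is only a minimal sketch, assuming that output shape holds in general; `parse_nodeattrib` and `suggest_attribute` are hypothetical helper names, not part of Confluent. It collects the output into a dictionary and uses the standard-library `difflib` to point at the closest known attribute name.

```python
import difflib

# Sample text copied from the `nodeattrib MYHOST` output quoted in the thread.
SAMPLE = """\
MYHOST: console.method: ipmi
MYHOST: deployment.pendingprofile: ubuntu-20.04.6-x86_64-default
MYHOST: deployment.profile:
MYHOST: net.hwaddr: ac:1f:AA:BB:CC:DD
MYHOST: net.ipv4_method: dhcp
"""


def parse_nodeattrib(output):
    """Collect `node: attribute: value` lines into {node: {attribute: value}}.

    Assumption: every line has this shape, as in the quoted output.
    """
    nodes = {}
    for raw in output.splitlines():
        line = raw.strip()
        if not line:
            continue
        node, sep, rest = line.partition(": ")
        if not sep:
            continue  # not a "node: ..." line, skip it
        # Split only on the first ": " so values containing colons
        # (such as MAC addresses) stay intact.
        attribute, sep, value = rest.partition(": ")
        if not sep:
            # Empty value: the line ends right after "attribute:".
            attribute, value = rest.rstrip(":"), ""
        nodes.setdefault(node, {})[attribute] = value.strip()
    return nodes


def suggest_attribute(name, known):
    """Return the closest known attribute name, or None (hypothetical helper)."""
    matches = difflib.get_close_matches(name, known, n=1)
    return matches[0] if matches else None


if __name__ == "__main__":
    attrs = parse_nodeattrib(SAMPLE)["MYHOST"]
    print(attrs["deployment.pendingprofile"])
    # The plural attribute name is the one given in the thread.
    known = list(attrs) + ["deployment.useinsecureprotocols"]
    print(suggest_attribute("deployment.useinsecureprotocol", known))
```

The list of known names here is just the attributes visible in the thread; a real check would need the full attribute list from the Confluent installation in question.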