Frank, I don't think the xNBA changes are related to the problem you are seeing with sles12 on these servers. Replacing the two files described here is sufficient for testing: https://github.com/xcat2/xNBA/issues/2#issuecomment-720693322
This message from the console output shows that the patch is being used. If you roll back to the prior versions of those two files, you should see a different xNBA version here:

xNBA 1.20.1+ (0) -- Open Source Network Boot Firmware -- http://ipxe.org

There is a pre-built package available here, but, again, I think this is irrelevant to your problem:
https://xcat.org/files/xcat/repos/yum/devel/xcat-dep/xnba-undi-1.20.1-0.noarch.rpm

You can stick with the manual patch or roll back to the original version; either way should be fine. If xNBA were the issue, you would not be able to successfully boot RHEL either.

I think the next step is trying to figure out what is causing this:

[2021-02-08T13:32:42+01:00] Sending DHCP request to eth0...
[2021-02-08T13:33:13+01:00] no/incomplete answer.

You can get some additional debug information about the provisioning process by monitoring the install process with:

xcatprobe osdeploy -n <NODE_NAME>

This should help you determine whether the xCAT MN is receiving the DHCP request and responding correctly or not. I don't have first-hand experience with this combination of server and OS; perhaps other members of the mailing list have some experience they can share.

Nate

From: "Heckes, Frank" <hec...@mps.mpg.de>
To: xCAT Users Mailing list <xcat-user@lists.sourceforge.net>
Date: 02/08/2021 10:16 AM
Subject: [EXTERNAL] Re: [xcat-user] sles12.5 net-installation fails

Hello Nate,

Yes, the strange thing is that the Red Hat installation works out of the box. To verify your suspicion I checked a sles12.3 network (standard) installation and ran into the same problem as for sles12.{4,5}. The xNBA is loaded and the network gets configured, but the additional DHCP request fails and therefore so does the installation. (I attached the console output below. I also set debuglevel=2, but I couldn't find anything useful in computes.log.)
Since it is the SLES installer that runs into this, is it a problem with the installer program, or did xNBA not prepare the install environment correctly? The NIC is a newer card (QLogic 2x1GE+2x10GE QL41264HMCU CNA). Is there something that has to be configured for it to work with xnba? I can install sles12.{3,4,5} on older HW and VMs without problems.

For the patching I was actually a bit lazy and tried the files in chonx's comment:
https://github.com/xcat2/xNBA/issues/2#issuecomment-720693322
These binaries didn't help. Is the procedure on this page the one to use to create the patched xnba?
https://github.com/xcat2/xcat-core/issues/6567

Cheers,
-Frank

From: Nathan A Besaw <bes...@us.ibm.com>
Sent: Thursday, 4 February 2021 14:25
To: xCAT Users Mailing list <xcat-user@lists.sourceforge.net>
Subject: Re: [xcat-user] sles12.5 net-installation fails

Hi Frank,

If you are able to provision the nodes with both diskless and diskful images of RHEL 7.{7,8}, you probably do not need the xNBA patches. The newest version of SLES 12 that is officially supported by xCAT is SLES 12.3, so it is possible that changes between SLES 12.3 and 12.{4,5} are involved. Are you able to successfully install the nodes with SLES 12.3?

You might also be able to get some more detailed debug information by enabling xcatdebugmode in the site table and then watching the node boot process with xcatprobe osdeploy -n NODENAME.

# Set xcatdebugmode to 1 using tabedit
tabedit site
tabdump site | grep xcatdebugmode
"xcatdebugmode","1",,

# Start the node install using rinstall or rpower, then watch the install process using:
xcatprobe osdeploy -n NODENAME

Nate

From: "Heckes, Frank" <hec...@mps.mpg.de>
To: xCAT Users Mailing list <xcat-user@lists.sourceforge.net>
Date: 02/03/2021 05:20 PM
Subject: [EXTERNAL] Re: [xcat-user] sles12.5 net-installation fails

Hello Nate,

I replaced the xnba files with the ones provided here:
https://www.xcat.org/files/xcat/xcat-dep/2.x_Linux/beta/xNBA/
Both BIOS and UEFI boot failed again in the same scenario. I also recreated an ipxe.efi binary as described here (https://github.com/Hoeze/ipxe), but I guess it's the same binary as the one from the link above. Here the EFI boot failed, too.

I'm currently using xCAT version 2.15. Does the latest version contain the patches, or even newer ones that support the latest generation of Dell R640?

Many thanks in advance.

Cheers,
-Frank

From: Heckes, Frank <hec...@mps.mpg.de>
Sent: Wednesday, 3 February 2021 21:59
To: xCAT Users Mailing list <xcat-user@lists.sourceforge.net>
Subject: Re: [xcat-user] sles12.5 net-installation fails

Hello Nate,

Many thanks for your quick reply. No, the nodes affected are all new HW. Interestingly, it works with a RHEL7.{7,8} install and netboot (ramfs). An older bare-metal node and VMs are still working with sles12.{4,5}.

Many thanks for the pointer, I forgot to search GitHub. I'll check the patches.

Cheers,
-Frank

From: Nathan A Besaw <bes...@us.ibm.com>
Sent: Wednesday, 3 February 2021 21:48
To: xCAT Users Mailing list <xcat-user@lists.sourceforge.net>
Subject: Re: [xcat-user] sles12.5 net-installation fails

Hi Frank,

Have you successfully installed these nodes using xCAT previously, or is this a new cluster? If these are newer x86 servers, you may need the patches discussed here:
https://github.com/xcat2/xNBA/issues/2#issuecomment-720693322

Nate

From: "Heckes, Frank" <hec...@mps.mpg.de>
To: "xcat-user@lists.sourceforge.net" <xcat-user@lists.sourceforge.net>
Date: 02/03/2021 02:39 PM
Subject: [EXTERNAL] [xcat-user] sles12.5 net-installation fails

Hi all,

I've a problem installing sles12.4 or 12.5 on our compute nodes (Dell PowerEdge R640). The installation breaks after successfully downloading the initrd and kernel (via http) and the xnba.kpxe file (via tftp). The installation stops in the linuxrc TUI, immediately after querying for an IP address via DHCP. It is possible to start a shell and verify that the network is up and running (see below). I don't see any logfile in the mounted initrd file, and raising the xCAT debug level to 2 gives no hint either.

Has anyone run into this problem before and has a workaround, or at least an idea how to enable logging to find the root cause?

Many thanks in advance.

Cheers,
-Frank Heckes

Expert Mode -> show config
. . .
network interface states:
  lo: up
  eth0: setup-in-progress
  eth1: device-unconfigured
  eth2: device-unconfigured
  eth3: device-unconfigured

Expert Mode -> start shell

Tail of var/log/boot.msg
<6>[ 94.467797] 8021q: adding VLAN 0 to HW filter on device eth0
<5>[ 94.508878] [qede_link_update:2409(eth0)]Link is up
<6>[ 94.508923] IPv6: ADDRCONF(NETDEV_CHANGE): eth0: link becomes ready

Interface is up and can ping xCAT mgmt. node:

2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000
    link/ether 34:80:0d:c1:93:00 brd ff:ff:ff:ff:ff:ff
    inet 134.XX.XXX.XXX/21 brd 134.XX.XXX.XXX scope global eth0
       valid_lft forever preferred_lft forever
    inet6 fe80::3680:dff:fec1:9300/64 scope link
       valid_lft forever preferred_lft forever

ping 134.XX.XXX.244
PING 134.XXX.XXX.XXX (134.XX.XXX.244) 56(84) bytes of data.
64 bytes from 134.XX.XXX.244: icmp_seq=1 ttl=64 time=0.147 ms
64 bytes from 134.xx.XXX.244: icmp_seq=2 ttl=64 time=0.246 ms
64 bytes from 134.XX.XXX.244: icmp_seq=3 ttl=64 time=0.200 ms

[attachment "sles12.3-console.log" deleted by Nathan A Besaw/Poughkeepsie/IBM]
[attachment "sles12.5-console.log" deleted by Nathan A Besaw/Poughkeepsie/IBM]
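
A side note on the DHCP failure quoted in the most recent reply at the top of this thread: the two timestamped linuxrc lines there are 31 seconds apart. If a full console log is at hand, a small helper along the following lines could report that wait for every failed attempt. This is only a sketch under stated assumptions: dhcp_wait_seconds is a hypothetical name (it is not part of xCAT or linuxrc), each line is assumed to start with the [YYYY-MM-DDTHH:MM:SS+ZZ:ZZ] prefix seen in that excerpt, and both lines of a pair are assumed to fall on the same day.

```shell
#!/bin/sh
# Hypothetical helper, not an xCAT or linuxrc tool.
# Reads timestamped console lines on stdin and prints, for each
# "Sending DHCP request" that is followed by "no/incomplete answer",
# the number of seconds between the two lines.
# Assumption: both lines carry a leading [YYYY-MM-DDTHH:MM:SS+ZZ:ZZ]
# stamp and fall on the same calendar day.
dhcp_wait_seconds() {
    awk '
        # Convert the HH:MM:SS part of the leading stamp to seconds.
        function secs(stamp,    t) {
            split(substr(stamp, 13, 8), t, ":")
            return t[1] * 3600 + t[2] * 60 + t[3]
        }
        /Sending DHCP request/          { start = secs($1); seen = 1 }
        /no\/incomplete answer/ && seen { print secs($1) - start; seen = 0 }
    '
}
```

It could be used as dhcp_wait_seconds < console.log, where console.log stands for whatever file holds the captured console output. A consistently identical wait across attempts would point toward a client-side timeout, which is worth comparing against what xcatprobe osdeploy shows on the management node.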
_______________________________________________ xCAT-user mailing list xCAT-user@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/xcat-user