Look at the man page for "makedhcp". In my experience, when there is no fixed-address entry for a host, that means DNS is wrong. Make sure the DNS entries for that host are correct, and that the management node can resolve both the hostname and the reverse DNS entry for its IP address.
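A quick way to check both directions from the management node is something like the sketch below. It is only a rough helper, not anything that ships with xCAT: the function names and the LOOKUP variable are made up for illustration, and it assumes "getent hosts" is available. Note that getent follows nsswitch.conf, so it sees /etc/hosts as well as DNS.

```shell
#!/bin/sh
# Sketch: check that forward and reverse lookups for a node agree.
# LOOKUP is an assumption of this sketch: any command that, given a name or
# an address, prints "ADDRESS NAME [ALIASES...]" lines (getent hosts does).
LOOKUP=${LOOKUP:-getent hosts}

# forward_ip NAME: print the first address NAME resolves to (empty if none).
forward_ip() {
    $LOOKUP "$1" | awk 'NR == 1 { print $1 }'
}

# reverse_name ADDRESS: print the first name ADDRESS maps back to (empty if none).
reverse_name() {
    $LOOKUP "$1" | awk 'NR == 1 { print $2 }'
}

# check_dns NAME EXPECTED_IP
# Returns 0 when NAME resolves to EXPECTED_IP and EXPECTED_IP maps back to
# NAME (either the short name or an FQDN beginning with "NAME.").
# Otherwise prints which direction is wrong and returns 1.
check_dns() {
    name=$1
    want_ip=$2

    got_ip=$(forward_ip "$name")
    if [ "$got_ip" != "$want_ip" ]; then
        echo "FORWARD MISMATCH: $name -> ${got_ip:-<nothing>} (expected $want_ip)"
        return 1
    fi

    got_name=$(reverse_name "$want_ip")
    case $got_name in
        "$name" | "$name".*) ;;
        *)
            echo "REVERSE MISMATCH: $want_ip -> ${got_name:-<nothing>} (expected $name)"
            return 1
            ;;
    esac

    echo "OK: $name <-> $want_ip"
}
```

For your node that would be "check_dns hpc3-14-03 10.240.58.16", run on the management node after sourcing the file. If either direction is off, fix the hosts/DNS data first and then re-run makedhcp for the node. I believe "makedhcp -q <node>" will show what the DHCP server currently holds for it, but check the man page to be sure.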
Also, I noticed that your node definition doesn't have a nicips.eth0/ensXXX entry. Check that.

On Wed, Sep 1, 2021 at 12:09 PM Imam Toufique <techie...@gmail.com> wrote:

> Yes, that is correct. Below is the output:
>
> [root@hpc3-14-03 ~]# ip a
> 1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
>     link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
>     inet 127.0.0.1/8 scope host lo
>        valid_lft forever preferred_lft forever
>     inet6 ::1/128 scope host
>        valid_lft forever preferred_lft forever
> 2: eno1: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN group default qlen 1000
>     link/ether 08:f1:ea:9e:c7:60 brd ff:ff:ff:ff:ff:ff
> 3: eno2: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN group default qlen 1000
>     link/ether 08:f1:ea:9e:c7:61 brd ff:ff:ff:ff:ff:ff
> 4: eno3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000
>     link/ether 08:f1:ea:e4:35:52 brd ff:ff:ff:ff:ff:ff
>     inet 10.240.58.16/23 brd 10.240.59.255 scope global eno3
>        valid_lft forever preferred_lft forever
>     inet6 fe80::af1:eaff:fee4:3552/64 scope link
>        valid_lft forever preferred_lft forever
>
> ....
>
> What I have noticed is that the dhcpd.leases file does not have an entry
> for this node with a "fixed-address" line, like the following:
>
> host hpc3-gpu-16-03 {
>   dynamic;
>   hardware ethernet 20:67:7c:10:ba:86;
>   uid 20:67:7c:10:ba:86;
>   fixed-address 10.240.58.61;
>   supersede server.ddns-hostname = "hpc3-gpu-16-03";
>   supersede host-name = "hpc3-gpu-16-03";
>   if option user-class-identifier = "xNBA" and option client-architecture = 00:00 {
>     supersede server.filename = "http://${next-server}:80/tftpboot/xcat/xnba/nodes/hpc3-gpu-16-03";
>   } elsif option client-architecture = 00:00 {
>     supersede server.filename = "xcat/xnba.kpxe";
>   } else {
>     supersede server.filename = "";
>   }
> }
>
> I am assuming the lease file got messed up somehow.
> What are your thoughts on reconstructing the file (programmatically) and
> using a modified file? Or is there another way from within xcat to add
> entries in the dhcpd leases file?
>
> thanks.
>
> On Wed, Sep 1, 2021 at 9:20 AM Russell Jones <arjone...@gmail.com> wrote:
>
>> Is the mac correct for the node?
>>
>> On Wed, Sep 1, 2021 at 11:02 AM Imam Toufique <techie...@gmail.com> wrote:
>>
>>> Hi,
>>>
>>> Need your helpful thoughts here with a problem we have, please.
>>>
>>> We have nodes that were provisioned with xcat, they are running, OS is
>>> working and installed. The boot order is set to PXE first, SSD 2nd.
>>>
>>> Several days ago, when I rebooted one of the nodes, it went straight to
>>> PXE discovery mode - attempting for an install. This is a node that is
>>> built, it should have exited the PXE boot mode and boot off the disk, but
>>> it never did.
>>>
>>> I am not sure what's going on, it looks like xcat has lost the status of
>>> the node, whether it is installed or not (need provisioning?)
>>>
>>> Here is the 'lsdef' output of the node:
>>>
>>> ```
>>> [root@mn] lsdef -t node hpc3-14-03
>>> Object name: hpc3-14-03
>>>     arch=x86_64
>>>     cpucount=40
>>>     cputype=Intel(R) Xeon(R) Gold 6148 CPU @ 2.40GHz
>>>     currchain=boot
>>>     currstate=boot
>>>     disksize=sda:224GB,sdb:224GB
>>>     groups=centos78
>>>     ip=10.240.58.16
>>>     mac=08:f1:ea:e4:35:52
>>>     memory=193122MB
>>>     mtm=HPE:ProLiant XL170r Gen10
>>>     netboot=xnba
>>>     nichostnamesuffixes.ib0=-ib0
>>>     nichostnamesuffixes.ipmi=-ipmi
>>>     nicips.ib0=10.240.60.16
>>>     nicips.ipmi=10.240.62.16
>>>     os=centos7.7
>>>     postbootscripts=otherpkgs,hpc3-postscripts/hpc3postbootscript
>>>     postscripts=syslog,remoteshell,syncfiles,setupntp,hpc3-postscripts/hpc3postscript.1,confignetwork -s
>>>     profile=compute
>>>     provmethod=centos7.8-x86_64-install-compute
>>>     serial=2M294204L9
>>>     status=booted
>>>     statustime=06-21-2021 16:41:45
>>>     supportedarchs=x86,x86_64
>>> ```
>>>
>>> ```
>>> [root@mn]# nodediscoverls |grep 14-03
>>> 38363730-3535-324D-3239-343230344C39  hpc3-14-03  manual  HPE:ProLiant XL170r Gen10  2M294204L9
>>> ```
>>>
>>> ```
>>> [root@mn]# lsdef -t network compute_net_1
>>> Object name: compute_net_1
>>>     domain=local
>>>     dynamicrange=10.240.58.221-10.240.58.240
>>>     gateway=10.240.58.1
>>>     mask=255.255.254.0
>>>     mgtifname=eno1
>>>     mtu=1500
>>>     nameservers=10.240.58.4,8.8.8.8,128.200.192.202
>>>     net=10.240.58.0
>>>     staticrange=10.240.58.4-10.240.59.220
>>>     tftpserver=<xcatmaster>
>>> ```
>>>
>>> Any idea what might be going on here? Why is an already setup/installed
>>> node going back to discovery (and wanting to be installed) mode?
>>>
>>> Can someone please shed some light?
>>>
>>> thanks a lot!
>>>
>>> _______________________________________________
>>> xCAT-user mailing list
>>> xCAT-user@lists.sourceforge.net
>>> https://lists.sourceforge.net/lists/listinfo/xcat-user
>
> --
> Regards,
> *Imam Toufique*
> *213-700-5485*