OK, great, thanks!  I wrote a quick script that cross-checks DNS for each host,
matches the resolved IP against the host's IP, and then runs makedhcp to add
the fixed-address entry to the lease file.
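
For illustration, a check along those lines could be sketched in Python as
follows. This is not the original script: the function names dns_matches and
makedhcp_commands are hypothetical, the resolver functions are injectable so
the logic can be tried without live DNS, and the sketch assumes that running
"makedhcp <node>" on the management node is what adds the host entry.

```python
import socket


def dns_matches(hostname, expected_ip,
                forward=socket.gethostbyname,
                reverse=socket.gethostbyaddr):
    """Return True if forward and reverse DNS both agree with expected_ip."""
    try:
        forward_ip = forward(hostname)          # hostname -> IP
        reverse_name = reverse(expected_ip)[0]  # IP -> primary hostname
    except OSError:
        # socket.gaierror and socket.herror are both subclasses of OSError.
        return False
    # Compare short names so "node" and "node.domain" count as the same host.
    same_name = (reverse_name.split(".")[0].lower()
                 == hostname.split(".")[0].lower())
    return forward_ip == expected_ip and same_name


def makedhcp_commands(nodes,
                      forward=socket.gethostbyname,
                      reverse=socket.gethostbyaddr):
    """Build one ["makedhcp", node] command per node whose DNS checks out.

    `nodes` is a hypothetical {hostname: expected_ip} mapping; nodes that
    fail the DNS check are skipped so they can be fixed first.
    """
    return [["makedhcp", name]
            for name, ip in nodes.items()
            if dns_matches(name, ip, forward, reverse)]
```

Calling makedhcp_commands({"hpc3-14-03": "10.240.58.16"}) returns the command
lists for the nodes that pass; each could then be handed to subprocess.run on
the management node, or simply printed and run by hand.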

I will check the node definition(s) for the eth0/enxxxx address shortly.
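
As a rough aid for that check, the nicips.* attributes can be pulled out of
saved "lsdef" output with a few lines of Python. This is only a sketch: the
helper name nicips_interfaces is hypothetical, and it assumes the
"key=value" layout that lsdef prints for node objects.

```python
def nicips_interfaces(lsdef_text):
    """Return {interface: ip} for every nicips.<interface>=<ip> line."""
    found = {}
    for line in lsdef_text.splitlines():
        key, sep, value = line.strip().partition("=")
        # Only lines of the form nicips.<interface>=<ip> are of interest.
        if sep and key.startswith("nicips."):
            found[key[len("nicips."):]] = value
    return found


# Excerpt of the node definition posted earlier in this thread.
sample = """Object name: hpc3-14-03
    ip=10.240.58.16
    mac=08:f1:ea:e4:35:52
    nicips.ib0=10.240.60.16
    nicips.ipmi=10.240.62.16
"""

interfaces = nicips_interfaces(sample)
# The provisioning interface (eno3 on this node) has no nicips entry.
print("eno3" in interfaces)
```

On the excerpt above, only ib0 and ipmi are found, which matches the
observation that the Ethernet interface has no nicips entry.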

This was a great help, thank you all so much!

--imam

On Thu, Sep 2, 2021 at 1:57 PM Russell Jones <arjone...@gmail.com> wrote:

> Look at the man pages for "makedhcp"
>
> In my experience when there is no fixed-address for a host, that means DNS
> is wrong. Make sure your DNS entries are correct for that host, and that
> the management node can resolve both the hostname and reverse DNS entries
> for the IP address.
>
> Also I noticed in your node definition you don't have a nicips.eth0/ensXXX
> - Check that.
>
> On Wed, Sep 1, 2021 at 12:09 PM Imam Toufique <techie...@gmail.com> wrote:
>
>> Yes, that is correct.  Below is the output :
>>
>> [root@hpc3-14-03 ~]# ip a
>> 1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
>>     link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
>>     inet 127.0.0.1/8 scope host lo
>>        valid_lft forever preferred_lft forever
>>     inet6 ::1/128 scope host
>>        valid_lft forever preferred_lft forever
>> 2: eno1: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN group default qlen 1000
>>     link/ether 08:f1:ea:9e:c7:60 brd ff:ff:ff:ff:ff:ff
>> 3: eno2: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN group default qlen 1000
>>     link/ether 08:f1:ea:9e:c7:61 brd ff:ff:ff:ff:ff:ff
>> 4: eno3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000
>>     link/ether 08:f1:ea:e4:35:52 brd ff:ff:ff:ff:ff:ff
>>     inet 10.240.58.16/23 brd 10.240.59.255 scope global eno3
>>        valid_lft forever preferred_lft forever
>>     inet6 fe80::af1:eaff:fee4:3552/64 scope link
>>        valid_lft forever preferred_lft forever
>>
>> ....
>>
>> What I have noticed is that the dhcpd.leases file does not have an entry with
>> a "fixed-address" line like the following:
>>
>> host hpc3-gpu-16-03 {
>>   dynamic;
>>   hardware ethernet 20:67:7c:10:ba:86;
>>   uid 20:67:7c:10:ba:86;
>>   fixed-address 10.240.58.61;
>>         supersede server.ddns-hostname = "hpc3-gpu-16-03";
>>         supersede host-name = "hpc3-gpu-16-03";
>>         if option user-class-identifier = "xNBA" and option client-architecture
>>              = 00:00 {
>>           supersede server.filename =
>>                                       "http://${next-server}:80/tftpboot/xcat/xnba/nodes/hpc3-gpu-16-03";
>>         } elsif option client-architecture = 00:00 {
>>           supersede server.filename = "xcat/xnba.kpxe";
>>         } else {
>>           supersede server.filename = "";
>>         }
>> }
>>
>> I am assuming the lease file got messed up somehow.  What are your
>> thoughts on reconstructing the file (programmatically) and using a modified
>> file?  Or is there another way from within xcat to add entries in the dhcpd
>> leases file?
>>
>> thanks.
>>
>>
>>
>>
>> On Wed, Sep 1, 2021 at 9:20 AM Russell Jones <arjone...@gmail.com> wrote:
>>
>>> Is the mac correct for the node?
>>>
>>> On Wed, Sep 1, 2021 at 11:02 AM Imam Toufique <techie...@gmail.com>
>>> wrote:
>>>
>>>> Hi,
>>>>
>>>> Need your helpful thoughts here with a problem we have, please.
>>>>
>>>> We have nodes that were provisioned with xcat; they are running, and the OS
>>>> is installed and working.  The boot order is set to PXE first, SSD second.
>>>>
>>>> Several days ago, when I rebooted one of the nodes, it went straight to
>>>> PXE discovery mode and attempted an install.  This node is already
>>>> built; it should have exited PXE boot mode and booted off the disk, but
>>>> it never did.
>>>>
>>>> I am not sure what's going on; it looks like xcat has lost track of the
>>>> node's status, i.e. whether it is installed or still needs provisioning.
>>>>
>>>> Here is the 'lsdef' output of the node:
>>>>
>>>> ```
>>>> [root@mn] lsdef -t node hpc3-14-03
>>>> Object name: hpc3-14-03
>>>>     arch=x86_64
>>>>     cpucount=40
>>>>     cputype=Intel(R) Xeon(R) Gold 6148 CPU @ 2.40GHz
>>>>     currchain=boot
>>>>     currstate=boot
>>>>     disksize=sda:224GB,sdb:224GB
>>>>     groups=centos78
>>>>     ip=10.240.58.16
>>>>     mac=08:f1:ea:e4:35:52
>>>>     memory=193122MB
>>>>     mtm=HPE:ProLiant XL170r Gen10
>>>>     netboot=xnba
>>>>     nichostnamesuffixes.ib0=-ib0
>>>>     nichostnamesuffixes.ipmi=-ipmi
>>>>     nicips.ib0=10.240.60.16
>>>>     nicips.ipmi=10.240.62.16
>>>>     os=centos7.7
>>>>     postbootscripts=otherpkgs,hpc3-postscripts/hpc3postbootscript
>>>>     postscripts=syslog,remoteshell,syncfiles,setupntp,hpc3-postscripts/hpc3postscript.1,confignetwork -s
>>>>     profile=compute
>>>>     provmethod=centos7.8-x86_64-install-compute
>>>>     serial=2M294204L9
>>>>     status=booted
>>>>     statustime=06-21-2021 16:41:45
>>>>     supportedarchs=x86,x86_64
>>>>
>>>> ```
>>>>
>>>> ```
>>>> [root@mn]# nodediscoverls |grep 14-03
>>>>   38363730-3535-324D-3239-343230344C39    hpc3-14-03          manual          HPE:ProLiant XL170r Gen10 2M294204L9
>>>> ```
>>>>
>>>> ```
>>>> [root@mn]# lsdef -t network compute_net_1
>>>> Object name: compute_net_1
>>>>     domain=local
>>>>     dynamicrange=10.240.58.221-10.240.58.240
>>>>     gateway=10.240.58.1
>>>>     mask=255.255.254.0
>>>>     mgtifname=eno1
>>>>     mtu=1500
>>>>     nameservers=10.240.58.4,8.8.8.8,128.200.192.202
>>>>     net=10.240.58.0
>>>>     staticrange=10.240.58.4-10.240.59.220
>>>>     tftpserver=<xcatmaster>
>>>> ```
>>>>
>>>> Any idea what might be going on here?  Why is an already set up and
>>>> installed node going back to discovery mode (and wanting to be installed)?
>>>>
>>>> Can someone please shed some light?
>>>>
>>>> thanks a lot!
>>>>
>>>>
>>>> _______________________________________________
>>>> xCAT-user mailing list
>>>> xCAT-user@lists.sourceforge.net
>>>> https://lists.sourceforge.net/lists/listinfo/xcat-user
>>>>
>>>
>>
>>
>> --
>> Regards,
>> *Imam Toufique*
>> *213-700-5485*
>>
>


-- 
Regards,
*Imam Toufique*
*213-700-5485*
