On 2/19/19 2:34 AM, Bin XA Xu wrote:

But it should be handled when discovering, xCAT will assign the same IP to eth0 and eth2 during the auto-discovery.
Ertao, could you help to give more information about that?
And Thomas, could you give a `lsdef` output on your node, before discovering and after discovering?

Thanks for your answer. Sorry for the long post that follows, but it should give you all the details about my setup:

- eth0 is connected to 1Gb/s switchA/portA, which allows untagged incoming packets and tags them in the vlan matching the cluster private subnet
- eth2 is connected to 10Gb/s switchB/portB, which allows untagged incoming packets and tags them in the vlan matching the cluster private subnet (same vlan as above)

That's what I meant when I said "are on the same subnet", but I expect only one of those 2 nics to get the node's desired ip address (as defined with a regexp in the hosts table).

[In addition, bmc is configured as a chain task and uses the same physical port as eth0 but a different vlan - the bmc card is configured to tag packets]

Here is the information you asked for. It corresponds to a scenario where I start from scratch (the node doesn't exist) and the bios on the node PXE boots in this order:

1. eth0
2. eth1 [not connected]
3. eth2

and the node ends up being correctly provisioned (and with ONLY one ip), but through eth0 and with eth0 carrying the final desired ip. That is what I'd like to avoid (I want to guard against such a bios misconfiguration, as eth2 should be first).

1) my subnets (note the dynamic address range)

"tars-ipmi","10.6.96.0","255.255.252.0",,"10.6.96.1",,,,,,,,,,,,,,
"tars","192.168.128.0","255.255.248.0","eth1",,"192.168.132.2","192.168.132.2",,,,"192.168.134.2-192.168.135.254",,,,,,"tars.cluster.pasteur.fr",,

2) I rmdef'ed the node and did some cleaning to emulate a first time creation

# ls -l /tftpboot/xcat/xnba/nodes/tars-113*
ls: cannot access /tftpboot/xcat/xnba/nodes/tars-113*: No such file or directory

# grep -E '(0c:c4:7a:4d:85:a8|0c:c4:7a:4d:85:a9|0c:c4:7a:58:c7:6a)' /var/lib/dhcpd/dhcpd.leases
#

3) the node before genesis :

# lsdef tars-113
Object name: tars-113
    addkcmdline=ipv6.disable=1 biosdevname=0 net.ifnames=0 rd.driver.blacklist=nouveau nouveau.modeset=0
    arch=x86_64
    bmc=10.6.96.115
    bmcpassword=XXXX
    bmcport=0
    bmcusername=XXXX
    chain=runcmd=bmcsetup,runimage=http://xcat-tars/install/sum_activate/sum_activate.tgz,osimage=centos6.10-x86_64-netboot-compute-prod
    groups=tars-compute,tars-ipmi,tars,standard,b10
    ip=192.168.128.115
    mgt=ipmi
    os=centos6.10
    postbootscripts=otherpkgs
    profile=compute
    provmethod=centos6.10-x86_64-netboot-compute-prod
    supportedarchs=x86,x86_64
    switch=b10b4.dc1.pasteur.fr
    switchport=8

4) at the console I saw the following happen

eth0 : 192.168.134.252
no dhcp answer for eth2

then :

eth0 gets  192.168.128.115 which is the correct node regexp assigned ip
eth2 gets 192.168.134.250 which is from the dynamic range

-> I'm not sure what happened here and who did what
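To make the addresses in step 4 easier to read, here is a small sketch in Python (not an xCAT tool) that checks each observed address against the "tars" entry of the networks table in step 1. The function name `classify` and its labels are made up for illustration; "static" here only means "inside the subnet but outside the dynamic range".

```python
import ipaddress

# Values copied from the "tars" line of the networks table (step 1).
SUBNET = ipaddress.ip_network("192.168.128.0/255.255.248.0")
DYN_START = ipaddress.ip_address("192.168.134.2")
DYN_END = ipaddress.ip_address("192.168.135.254")


def classify(addr):
    """Return 'dynamic', 'static' or 'outside subnet' for an IPv4 address."""
    ip = ipaddress.ip_address(addr)
    if ip not in SUBNET:
        return "outside subnet"
    if DYN_START <= ip <= DYN_END:
        return "dynamic"
    # Inside the subnet but not in the dynamic range.
    return "static"


if __name__ == "__main__":
    # Addresses seen at the console in step 4.
    for addr in ("192.168.134.252", "192.168.128.115", "192.168.134.250"):
        print(addr, classify(addr))
```

Under these assumptions, both 192.168.134.x addresses fall in the dynamic range, and only 192.168.128.115 is the address defined for the node.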

5) the node once netbooted (after genesis)

$ ip addr
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP qlen 1000
    link/ether 0c:c4:7a:4d:85:a8 brd ff:ff:ff:ff:ff:ff
    inet 192.168.128.115/21 brd 192.168.135.255 scope global eth0
3: eth1: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN qlen 1000
    link/ether 0c:c4:7a:4d:85:a9 brd ff:ff:ff:ff:ff:ff
4: eth2: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN qlen 1000
    link/ether 0c:c4:7a:58:c7:6a brd ff:ff:ff:ff:ff:ff

-> it has been installed via and on eth0. I would have liked to be able to force eth2 configuration with this ip even in the case where PXE was initially done through eth0

6) the node definition once discovered :

# lsdef -t node tars-113
Object name: tars-113
    addkcmdline=ipv6.disable=1 biosdevname=0 net.ifnames=0 rd.driver.blacklist=nouveau nouveau.modeset=0
    arch=x86_64
    bmc=10.6.96.115
    bmcpassword=XXXX
    bmcport=0
    bmcusername=XXXX
    chain=runcmd=bmcsetup,runimage=http://xcat-tars/install/sum_activate/sum_activate.tgz,osimage=centos6.10-x86_64-netboot-compute-prod
    cpucount=12
    cputype=Intel(R) Xeon(R) CPU E5-2620 v3 @ 2.40GHz
    currchain=osimage=centos6.10-x86_64-netboot-compute-prod
    currstate=netboot centos6.10-x86_64-compute
    disksize=sda:256GB
    groups=tars-compute,tars-ipmi,tars,standard,b10
    initrd=xcat/osimage/centos6.10-x86_64-netboot-compute-prod/initrd-stateless.gz
    ip=192.168.128.115
    kcmdline=imgurl=http://!myipfn!:80//install/netboot/centos6.10/x86_64/compute/prod/rootimg.gz XCAT=!myipfn!:3001 NODE=tars-113 FC=0
    kernel=xcat/osimage/centos6.10-x86_64-netboot-compute-prod/kernel
    mac=0c:c4:7a:4d:85:a8|0c:c4:7a:58:c7:6a!tars-113-eth2
    memory=258373MB
    mgt=ipmi
    netboot=xnba
    os=centos6.10
    postbootscripts=otherpkgs
    profile=compute
    provmethod=centos6.10-x86_64-netboot-compute-prod
    serial=E162178X5A02118
    status=booted
    statustime=02-19-2019 09:43:07
    supportedarchs=x86,x86_64
    switch=b10b4.dc1.pasteur.fr
    switchport=8

7) what ends up in the mac table
# tabdump mac | grep -i 113
"tars-113",,"0c:c4:7a:4d:85:a8|0c:c4:7a:58:c7:6a!tars-113-eth2",,
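The mac value above packs two bindings into one string. As far as the xCAT description of the mac attribute goes, entries are separated by "|", and an entry may carry a "!hostname" suffix; an entry without a suffix is bound to the node name itself. The sketch below (plain Python, with a hypothetical helper name `parse_mac_attr`) splits the value on that assumption and is not how xCAT itself parses it.

```python
def parse_mac_attr(value, node):
    """Map each MAC in an xCAT-style mac attribute to the hostname it is bound to.

    Assumption: entries are '|'-separated, and an optional '!hostname'
    suffix overrides the default binding to the node name.
    """
    bindings = {}
    for entry in value.split("|"):
        mac, sep, host = entry.partition("!")
        bindings[mac.lower()] = host if sep else node
    return bindings


if __name__ == "__main__":
    value = "0c:c4:7a:4d:85:a8|0c:c4:7a:58:c7:6a!tars-113-eth2"
    for mac, host in parse_mac_attr(value, "tars-113").items():
        print(mac, "->", host)
```

Read this way, the eth0 MAC (0c:c4:7a:4d:85:a8) is bound to tars-113 itself, while the eth2 MAC (0c:c4:7a:58:c7:6a) is bound to the separate name tars-113-eth2, which is consistent with eth0 ending up with 192.168.128.115.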


[I don't use /etc/hosts but an external DNS]

So to sum up, my issue is this: if for some reason the node gets deleted and is then re-discovered through eth0, it will be installed on eth0 instead of eth2. Normally eth2 is listed first in the bios, and the switch-based mechanism then works as expected.

I cannot understand why you said the nics should get the same ip.

Thanks for your help

--
Thomas H.

