On 2/19/19 2:34 AM, Bin XA Xu wrote:
> But it should be handled when discovering, xCAT will assign the same
> IP to eth0 and eth2 during the auto-discovery.
> Ertao, could you help to give more information about that?
> And Thomas, could you give a `lsdef` output on your node, before
> discovering and after discovering?
Thanks for your answer. Sorry for the long post that follows, but it
should give you all the details needed about my setup:
- eth0 is connected to 1Gb/s switchA/portA which allows untagged
incoming packets and tags them in the vlan matching the cluster private
subnet
- eth2 is connected to 10Gb/s switchB/portB which allows untagged
incoming packets and tags them in the vlan matching the cluster private
subnet (same vlan as above)
That's what I meant when I said "are on the same subnet", but I expect
only one of those 2 NICs to get the node's desired IP address (as defined
by a regexp in the hosts table).
[In addition, the BMC is configured as a chain task and uses the same
physical port as eth0 but a different vlan - the BMC card is configured
to tag packets]
Here is the information you asked for, corresponding to a scenario where
I start from scratch (the node doesn't exist) and the BIOS on the node
PXE boots in this order:
1. eth0
2. eth1 [not connected]
3. eth2
and the node ends up being correctly provisioned (and with ONLY one
IP), but through eth0 and with eth0 carrying the final desired IP. This
is what I'd like to avoid (i.e. guard against such a BIOS
misconfiguration, since eth2 should be first).
1) my subnets (note the dynamic address range)
"tars-ipmi","10.6.96.0","255.255.252.0",,"10.6.96.1",,,,,,,,,,,,,,
"tars","192.168.128.0","255.255.248.0","eth1",,"192.168.132.2","192.168.132.2",,,,"192.168.134.2-192.168.135.254",,,,,,"tars.cluster.pasteur.fr",,
2) I rmdef'ed the node and did some cleaning to emulate a first time
creation
# ls -l /tftpboot/xcat/xnba/nodes/tars-113*
ls: cannot access /tftpboot/xcat/xnba/nodes/tars-113*: No such file or directory
# grep -E '(0c:c4:7a:4d:85:a8|0c:c4:7a:4d:85:a9|0c:c4:7a:58:c7:6a)' /var/lib/dhcpd/dhcpd.leases
#
3) the node before genesis :
# lsdef tars-113
Object name: tars-113
addkcmdline=ipv6.disable=1 biosdevname=0 net.ifnames=0
rd.driver.blacklist=nouveau nouveau.modeset=0
arch=x86_64
bmc=10.6.96.115
bmcpassword=XXXX
bmcport=0
bmcusername=XXXX
chain=runcmd=bmcsetup,runimage=http://xcat-tars/install/sum_activate/sum_activate.tgz,osimage=centos6.10-x86_64-netboot-compute-prod
groups=tars-compute,tars-ipmi,tars,standard,b10
ip=192.168.128.115
mgt=ipmi
os=centos6.10
postbootscripts=otherpkgs
profile=compute
provmethod=centos6.10-x86_64-netboot-compute-prod
supportedarchs=x86,x86_64
switch=b10b4.dc1.pasteur.fr
switchport=8
4) at the console I saw the following happen
eth0 : 192.168.134.252
no dhcp answer for eth2
then :
eth0 gets 192.168.128.115 which is the correct node regexp assigned ip
eth2 gets 192.168.134.250 which is from the dynamic range
-> I'm not sure what happened here and who did what
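To cross-check which of the addresses seen in step 4 come from the
dynamic range and which match the statically assigned one, here is a
minimal Python sketch. The network, netmask and dynamic range are copied
from the "tars" row in step 1; the function name `classify` and the
labels it returns are purely illustrative and not part of xCAT.

```python
import ipaddress

# Values copied from the "tars" row of the networks table in step 1.
NETWORK = ipaddress.ip_network("192.168.128.0/255.255.248.0")
DYN_START = ipaddress.ip_address("192.168.134.2")
DYN_END = ipaddress.ip_address("192.168.135.254")


def classify(addr: str) -> str:
    """Return 'dynamic', 'static' or 'outside' for an IPv4 address.

    'dynamic' means inside the dynamic range, 'static' means inside the
    cluster subnet but outside the dynamic range, 'outside' means not in
    the cluster subnet at all. (Hypothetical helper for illustration.)
    """
    ip = ipaddress.ip_address(addr)
    if ip not in NETWORK:
        return "outside"
    if DYN_START <= ip <= DYN_END:
        return "dynamic"
    return "static"


if __name__ == "__main__":
    # Addresses observed at the console in step 4.
    for nic, addr in [
        ("eth0 (first lease)", "192.168.134.252"),
        ("eth0 (second lease)", "192.168.128.115"),
        ("eth2", "192.168.134.250"),
    ]:
        print(f"{nic}: {addr} -> {classify(addr)}")
```

This only classifies the addresses; it says nothing about which
component (PXE firmware, genesis, or the discovery process) requested
each lease.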
5) the node once netbooted (after genesis)
$ ip addr
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP
qlen 1000
link/ether 0c:c4:7a:4d:85:a8 brd ff:ff:ff:ff:ff:ff
inet 192.168.128.115/21 brd 192.168.135.255 scope global eth0
3: eth1: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN qlen 1000
link/ether 0c:c4:7a:4d:85:a9 brd ff:ff:ff:ff:ff:ff
4: eth2: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN qlen 1000
link/ether 0c:c4:7a:58:c7:6a brd ff:ff:ff:ff:ff:ff
-> it has been installed via, and onto, eth0. I would like to be able
to force eth2 to be configured with this IP even when PXE was
initially done through eth0
6) the node definition once discovered :
# lsdef -t node tars-113
Object name: tars-113
addkcmdline=ipv6.disable=1 biosdevname=0 net.ifnames=0
rd.driver.blacklist=nouveau nouveau.modeset=0
arch=x86_64
bmc=10.6.96.115
bmcpassword=XXXX
bmcport=0
bmcusername=XXXX
chain=runcmd=bmcsetup,runimage=http://xcat-tars/install/sum_activate/sum_activate.tgz,osimage=centos6.10-x86_64-netboot-compute-prod
cpucount=12
cputype=Intel(R) Xeon(R) CPU E5-2620 v3 @ 2.40GHz
currchain=osimage=centos6.10-x86_64-netboot-compute-prod
currstate=netboot centos6.10-x86_64-compute
disksize=sda:256GB
groups=tars-compute,tars-ipmi,tars,standard,b10
initrd=xcat/osimage/centos6.10-x86_64-netboot-compute-prod/initrd-stateless.gz
ip=192.168.128.115
kcmdline=imgurl=http://!myipfn!:80//install/netboot/centos6.10/x86_64/compute/prod/rootimg.gz
XCAT=!myipfn!:3001 NODE=tars-113 FC=0
kernel=xcat/osimage/centos6.10-x86_64-netboot-compute-prod/kernel
mac=0c:c4:7a:4d:85:a8|0c:c4:7a:58:c7:6a!tars-113-eth2
memory=258373MB
mgt=ipmi
netboot=xnba
os=centos6.10
postbootscripts=otherpkgs
profile=compute
provmethod=centos6.10-x86_64-netboot-compute-prod
serial=E162178X5A02118
status=booted
statustime=02-19-2019 09:43:07
supportedarchs=x86,x86_64
switch=b10b4.dc1.pasteur.fr
switchport=8
7) what ends up in the mac table
# tabdump mac | grep -i 113
"tars-113",,"0c:c4:7a:4d:85:a8|0c:c4:7a:58:c7:6a!tars-113-eth2",,
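To make the mac value above easier to read, here is a small Python
sketch that splits it into (MAC, hostname) pairs. It assumes the
documented xCAT convention of a "|"-delimited list in which an entry may
carry a "!hostname" suffix, and an entry without a suffix is bound to
the node name itself. The function name `parse_mac_field` is
hypothetical and not an xCAT API.

```python
from typing import List, Optional, Tuple


def parse_mac_field(value: str) -> List[Tuple[str, Optional[str]]]:
    """Split an xCAT-style mac attribute into (mac, hostname) pairs.

    The hostname is None when the entry has no '!hostname' suffix.
    (Hypothetical helper for illustration.)
    """
    entries = []
    for item in value.split("|"):
        # partition() returns an empty separator when '!' is absent.
        mac, sep, host = item.partition("!")
        entries.append((mac.lower(), host if sep else None))
    return entries


if __name__ == "__main__":
    # Value copied from the tabdump output above.
    field = "0c:c4:7a:4d:85:a8|0c:c4:7a:58:c7:6a!tars-113-eth2"
    for mac, host in parse_mac_field(field):
        print(mac, "->", host or "(node name)")
```

Read this way, the eth0 MAC (0c:c4:7a:4d:85:a8) has no suffix, while the
eth2 MAC is recorded under the separate name tars-113-eth2.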
[I don't use /etc/hosts but an external DNS]
So, to sum up, my issue is this: if for some reason the node gets deleted
and then re-discovered through eth0, it will be installed on eth0 instead
of eth2. Normally eth2 is listed first in the BIOS, the switch-based
mechanism works, and the node is installed on eth2.
I cannot understand why you said the nics should get the same ip.
Thanks for your help
--
Thomas H.
_______________________________________________
xCAT-user mailing list
xCAT-user@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/xcat-user