Hello,
I'm using xCAT 2.11 in production (I know it's an old release) to
provision HPC multi-interface compute nodes with stateless images.
Basically, the nodes have one 1Gb/s nic and one 10Gb/s nic, each one
connected to a different switch. Obviously, I want them to be
provisioned through the 10Gb/s link.
I'm using switch-based discovery and, provided that the BIOS PXE priorities
are correctly configured, everything works as expected.
But I was wondering what would happen if the BIOS was wrongly configured to
PXE on the 1Gb/s nic first: would discovery happen? If yes, what would it
result in?
I first thought it would be similar to a serial discovery situation but
it doesn't seem that simple, here's why :
The compute nodes are physical nodes equipped with
- 2 onboard 1Gb/s nic (eth0, eth1)
- 1 additional PCI-E 10Gb/s nic (eth2)
and set up like this :
- eth2 is connected to a 10Gb/s ports switch which is used by
switch-based discovery ('switch' and 'switchport' attributes point to
this switch and switch ports)
- eth0 is connected to a 1Gb/s ports switch and is used for ipmi traffic
from/to the BMC and could also carry the data traffic (legacy from the
time we didn't have a 10Gb/s switch nor nic)
- BIOS is set up to PXE boot first and PXE is set up to boot through
eth2 first.
Here's what I tested :
Starting conditions :
- Let's say nodeA was discovered and netbooted through the method above
(thus its mac in the mac table is eth2 (10Gb/s) address).
- let's say I haven't any undiscovered nor unprovisioned node
To see what would happen through eth0, I changed the eth2 facing switch
configuration to forbid traffic coming from this interface just to be
sure eth2 PXE could not work no matter what. So now the node does PXE on
eth0 (1Gb/s)
I know about the 3 types of automatic discovery as described in
documentation (MTMS, switch-based, serial) and here's what I tested :
1) nodeA :
- remove its mac address in the mac table
- remove all its leases in /var/lib/dhcpd/dhcpd.leases
- makedhcp -n
reboot nodeA
-> it is discovered and provisioned with its osimage but through the
eth0 interface. hostname is nodeA as expected
in the mac table we can see
"nodeA",,"<eth0 mac-address>|<eth2 mac-address>!*NOIP*",,
-> seems strange to me (see at the end of my tests description)
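To make the entry above easier to read, here is a minimal Python sketch that splits such a mac value into its parts. It assumes, from my reading of the mac table documentation, that entries are separated by "|" and that an optional "!suffix" (a hostname, or the *NOIP* marker) follows each address. The function names and the two placeholder addresses are made up for illustration; they are not part of xCAT.

```python
# Sketch: split an xCAT mac-table value into its per-interface entries.
# Assumption: entries are "|"-separated, each optionally followed by
# "!suffix" where the suffix is a hostname or the *NOIP* marker.

def parse_mac_field(value):
    """Return a list of (mac, suffix) tuples; suffix is None when absent."""
    entries = []
    for item in value.split("|"):
        item = item.strip()
        if not item:
            continue
        mac, sep, suffix = item.partition("!")
        entries.append((mac.lower(), suffix if sep else None))
    return entries


def primary_mac(value):
    """Return the first MAC that is not tagged *NOIP*, or None."""
    for mac, suffix in parse_mac_field(value):
        if suffix != "*NOIP*":
            return mac
    return None


# Hypothetical addresses standing in for <eth0 mac-address> and <eth2 mac-address>.
ETH0 = "aa:bb:cc:00:00:01"
ETH2 = "aa:bb:cc:00:00:02"
field = f"{ETH0}|{ETH2}!*NOIP*"

print(parse_mac_field(field))
print(primary_mac(field))
```

If that reading of the format is right, the entry in test 1 would mean that eth0 is now the interface bound to the node name, while eth2 is recorded but not given an IP.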
2) nodeA
- rmdef nodeA
- remove all its leases in /var/lib/dhcpd/dhcpd.leases
- makedhcp -n
- add a new node with similar properties (same chain, ...) :
nodeadd foobar groups=<same as nodeA>
tabedit hosts : insert a specific entry with an ip for foobar
tabedit ipmi : insert a specific entry with a bmc ip for foobar
makedns foobar
reboot nodeA
-> it gets stalled ("Network configuration complete, commencing transmit
of discovery packets") and PXE loops after getting an IP from the dhcpd
dynamic range
-> seems pretty normal to me since I consider myself to be in a serial
discovery like scenario then
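For reference, the "remove all its leases" step used in tests 1) and 2) could be sketched in Python roughly as follows. This is only an illustration: it assumes the usual ISC dhcpd leases layout ("lease <ip> { ... }" blocks with a "hardware ethernet" line and a closing brace on its own line), the function name drop_leases and the sample addresses are hypothetical, and dhcpd should presumably be stopped while the real file is edited.

```python
import re

# One "lease <ip> { ... }" block; the closing brace sits at the start of a line.
LEASE_BLOCK = re.compile(r"^lease\s+\S+\s*\{.*?^\}\n?", re.DOTALL | re.MULTILINE)
HW_ADDR = re.compile(r"hardware\s+ethernet\s+([0-9a-fA-F:]+)\s*;")


def drop_leases(text, macs):
    """Return the leases text without the blocks owned by any MAC in macs."""
    wanted = {m.lower() for m in macs}

    def keep_or_drop(match):
        block = match.group(0)
        hw = HW_ADDR.search(block)
        if hw and hw.group(1).lower() in wanted:
            return ""  # drop this lease block
        return block   # keep everything else untouched

    return LEASE_BLOCK.sub(keep_or_drop, text)


# Hypothetical sample data, not taken from a real leases file.
sample = """\
lease 10.0.0.5 {
  binding state active;
  hardware ethernet aa:bb:cc:00:00:01;
}
lease 10.0.0.6 {
  binding state active;
  hardware ethernet aa:bb:cc:00:00:09;
}
"""

cleaned = drop_leases(sample, ["AA:BB:CC:00:00:01"])
print(cleaned)
```

Passing both the eth0 and eth2 addresses of the node would clear the leases of either interface in one go.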
3) same as 2) but with a nodediscoverstart noderange=foobar AFTER node
reboot : seems to loop in PXE/discovery (even if nodediscoverstop is run)
-> seems to me it might be normal as I maybe was supposed to run
nodediscoverstart before node boot.
4) same as 2) but with a nodediscoverstart noderange=foobar BEFORE node
reboot
-> foobar (same physical node as nodeA) gets discovered and
provisioned. hostname is foobar as expected
in the mac table we can see
"foobar",,"0c:c4:7a:4d:85:a8",,
-> seems pretty normal to me
[I don't quite understand what is happening in 3) but it's a bit off topic]
So, basically, I don't understand the conceptual difference between 1)
and 4) and thus why in 1) nodeA gets discovered without the need to run
nodediscoverstart (granted it has switch and switchport attributes but
not referring to the 1Gb/s switch)...
Again, I'd like to understand the above in order to be sure that the
node won't get provisioned through the eth0/1Gb/s link if PXE order
gets mixed up... Or even to make sure (worst case scenario if many nodes
would have PXE order mixed up) that node hostnames won't be mixed up
(in a serial discovery like scenario).
Thanks
--
Thomas HUMMEL