To close the loop on this, I submitted a pull request to GitHub with some code changes that solved my problem. After some discussion with the code maintainers, we decided that the correct approach is to simply ensure that the MASTER node is pingable rather than checking for LINKUP on a list of interfaces. If the MASTER is pingable, then there must exist a route to the node and the postbootscripts can continue.
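For anyone reading this later, here is a minimal sketch of the idea. It is not the code that was merged. The function name wait_for_master and the PING_CMD override are hypothetical names I made up for illustration; the 90 retries and 2 second sleep are simply carried over from the existing loop quoted below. The -W option is assumed to be the Linux iputils ping timeout in seconds.

```shell
# Sketch only: wait until the xCAT master answers a ping instead of
# checking link state on the install NIC. All names here are hypothetical
# and are not taken from xcatinstallpost.

# wait_for_master HOST [MAX_RETRIES] [SLEEP_SECONDS]
# Returns 0 as soon as HOST answers one ping, 1 after MAX_RETRIES failures.
wait_for_master() {
    host=$1
    max_retries=${2:-90}   # same retry count as the existing loop
    delay=${3:-2}          # same sleep interval as the existing loop
    retry=0

    # PING_CMD can be overridden so the loop can be exercised without a network.
    # -c 1: send one probe; -W 2: wait up to 2 seconds for a reply (iputils ping).
    while ! ${PING_CMD:-ping} -c 1 -W 2 "$host" >/dev/null 2>&1; do
        retry=$((retry + 1))
        if [ "$retry" -ge "$max_retries" ]; then
            # timeout: the master never became reachable
            return 1
        fi
        # sleep before the next attempt
        sleep "$delay"
    done
    return 0
}
```

A caller would run something like wait_for_master "$MASTER_IP" || exit 1 before starting the postbootscripts. Because the check only asks whether the master is reachable, it does not matter whether the route goes over the PXE NIC or over the bond.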
I believe the final code change will appear in version 2.13.0.

Thanks Yang Song and Yuan Bai.

-Russ

From: Russell Auld [mailto:[email protected]]
Sent: Wednesday, November 2, 2016 10:32 PM
To: 'xCAT Users Mailing list' <[email protected]>
Subject: [xcat-user] INSTALLNIC

I have a situation where postbootscripts do not run. I traced the problem back to "/opt/xcat/xcatinstallpost" script which is called by the "/etc/init.d/xcatpostinit1" service after the first boot. The trouble is with this section of code:

    MACADDR=`grep MACADDRESS= /xcatpost/mypostscript.post |awk -F = '{print $2}'|sed s/\'//g`
    INSTALLNIC=`ip -o link|grep -i $MACADDR|awk '{print $2}'|sed s/://`

    while true; do
        #scan the nics with the specified mac address
        #there will be multiple nic names for a mac address when the network bridge exists
        for nic in $INSTALLNIC ;do
            #check whether the nic is configured and linkup
            ip -4 --oneline addr show dev $nic |grep inet >/dev/null && NETUP=1 && break
        done

        #nic is configured,terminate scan...
        [ $NETUP -ne 0 ] && break;

        RETRY=$[ $RETRY + 1 ]

        if [ $RETRY -eq 90 ];then
            #timeout, complain and exit
            msgutil_r "$MASTER_IP" "err" `date`" xcatinstallpost: Network not configured, please check..." "/var/log/xcat/xcat.log"
            exit 1
        fi

        #sleep sometime before the next scan
        sleep 2
    done

The issue that I have is that the NIC that I use to install the machines is not the primary NIC that is used when the node reboots. That's because the NIC used during the PXE/Anaconda stage cannot be part of a bonded connection, so we have to run a separate single link for imaging. When the node runs normally, we want to use the bonded connections. In other words, the node would have three Ethernet connections - two 10Gb connections (which would be bonded) and one 1Gb standard Ethernet connection.

Normally I set the "installnic" to "mac" in the noderes table and I set "mac" to the mac of the nic that is used to PXE the device.

The code above insists that the link associated with the MAC used to PXE the node be up, and will fail if not. That's not reasonable after the node reboots. This method worked well with xCAT v2.8.5. Now I'm using v2.12.1 and this method doesn't work.

Is there a better way to handle this issue with bonded Ethernet and PXE booting?

This is what the man page for 'noderes' says about 'installnic':

    The network adapter on the node that will be used for OS deployment, the installnic can be set to the network adapter name or the mac address or the keyword "mac" which means that the network interface specified by the mac address in the mac table will be used. If not set, primarynic will be used. If primarynic is not set too, the keyword "mac" will be used as default.
