To close the loop on this, I submitted a pull request to GitHub with some code 
changes that solved my problem. After some discussion
with the code maintainers, we decided that the correct approach is to simply 
ensure that the MASTER node is pingable rather than
checking for LINKUP on a list of interfaces. If the MASTER is pingable, then 
there must exist a route to the node and the
postbootscripts can continue.
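
As a rough illustration of that idea (not the actual change merged into xCAT), a reachability check could look like the sketch below. The function names probe_master and wait_for_master are hypothetical, the retry count and delay simply mirror the 90 x 2 seconds used by the original loop, and the ping options assume the Linux iputils version of ping.

```shell
# Hypothetical sketch: wait until the xCAT MASTER answers a ping.
# probe_master and wait_for_master are illustrative names, not xCAT functions.

# Send one ICMP echo request and wait at most 2 seconds for a reply
# (-c and -W as understood by Linux iputils ping).
probe_master() {
    ping -c 1 -W 2 "$1" >/dev/null 2>&1
}

# Usage: wait_for_master <master-ip> [max-retries] [delay-seconds]
# Returns 0 as soon as the master replies, 1 if every attempt fails.
wait_for_master() {
    local master="$1"
    local max_retries="${2:-90}"
    local delay="${3:-2}"
    local attempt=0

    while [ "$attempt" -lt "$max_retries" ]; do
        if probe_master "$master"; then
            return 0
        fi
        attempt=$((attempt + 1))
        sleep "$delay"
    done
    return 1
}

# Example call, assuming MASTER_IP is already set by the caller:
# wait_for_master "$MASTER_IP" || { echo "master unreachable" >&2; exit 1; }
```

Because the check only asks whether some route to the master exists, it does not care which interface (the 1Gb install link or the bond) carries the traffic.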

 

I believe the final code change will appear in version 2.13.0.

 

Thanks Yang Song and Yuan Bai.

 

-Russ

 

 

From: Russell Auld [mailto:[email protected]] 
Sent: Wednesday, November 2, 2016 10:32 PM
To: 'xCAT Users Mailing list' <[email protected]>
Subject: [xcat-user] INSTALLNIC

 

I have a situation where postbootscripts do not run. I traced the problem back 
to "/opt/xcat/xcatinstallpost" script which is called
by the "/etc/init.d/xcatpostinit1" service after the first boot.

The trouble is with this section of code:

MACADDR=`grep MACADDRESS= /xcatpost/mypostscript.post |awk -F = '{print $2}'|sed s/\'//g`
INSTALLNIC=`ip -o link|grep -i $MACADDR|awk  '{print $2}'|sed s/://`

while true; do
    #scan the nics with the specified mac address
    #there will be multiple nic names for a mac address when the network bridge exists
    for nic in $INSTALLNIC ;do
        #check whether the nic is configured and linkup
        ip -4 --oneline addr show dev $nic |grep inet >/dev/null && NETUP=1 && break
    done

    #nic is configured,terminate scan...
    [ $NETUP -ne 0 ] && break;
    RETRY=$[ $RETRY + 1 ]
    if [ $RETRY -eq 90 ];then
       #timeout, complain and exit
       msgutil_r "$MASTER_IP" "err" `date`" xcatinstallpost: Network not configured, please check..." "/var/log/xcat/xcat.log"
       exit 1
    fi

    #sleep sometime before the next scan
    sleep 2
done

 

The issue that I have is that the NIC that I use to install the machines is not 
the primary NIC that is used when the node reboots.

That's because the NIC used during the PXE/Anaconda stage cannot be part of a 
bonded connection, so we have to run a separate single
link for imaging. When the node runs normally, we want to use the bonded 
connections. In other words, the node would have three
Ethernet connections - two 10Gb connections (which would be bonded) and one 1Gb 
standard Ethernet connection.

Normally I set the "installnic" to "mac" in the noderes table and I set "mac" 
to the mac of the nic that is used to PXE the device.
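
For reference, that setting can be applied with the standard xCAT chdef command. The wrapper below is only a sketch: set_installnic_mac is a hypothetical helper name, and the node name and MAC address in the example are made up.

```shell
# Hypothetical helper: make a node install over the NIC whose MAC is
# recorded in the mac table (installnic=mac in the noderes table).
# Usage: set_installnic_mac <node> <pxe-mac>
set_installnic_mac() {
    chdef -t node -o "$1" installnic=mac mac="$2"
}

# Example, with a made-up node name and MAC address:
# set_installnic_mac node01 00:11:22:33:44:55
# lsdef node01 -i installnic,primarynic,mac
```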

The code above insists that the link associated with the MAC used to PXE the 
node be up, and will fail if not. That's not reasonable
after the node reboots. 

This method worked well with xCAT v2.8.5. Now I'm using v2.12.1 and this method 
doesn't work.

Is there a better way to handle this issue with bonded Ethernet and PXE booting?

This is what the man page for 'noderes' says about 'installnic':

The network adapter on the node that will be used for OS deployment, the 
installnic can be set to the network adapter name or the
mac address or the keyword "mac" which means that the network interface 
specified by the mac address in the mac table will be used.
If not set, primarynic will be used. If primarynic is not set too, the keyword 
"mac" will be used as default.
