Hi Christopher,
Could you try running "confignics -s -r" in postbootscripts instead?
In the postscripts stage, "-r" shuts down any NIC that is up and removes its interface configuration at the same time. When that takes down (ifdown) the install NIC, the result can be unreliable.
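As a rough sketch of that change (node1 is a placeholder node name, and this assumes "confignics -s -r" is currently listed in the node's own postscripts attribute rather than inherited from a group or from xcatdefaults), the script below only prints the chdef and lsdef commands so that they can be reviewed before running them on the management node:

```shell
#!/bin/sh
# NODE is a placeholder (hypothetical) node name; override it in the environment.
NODE="${NODE:-node1}"
SCRIPT='confignics -s -r'

# Print the xCAT commands instead of running them, so they can be checked first.
move_to_postbootscripts() {
    # chdef -m removes a value from a list attribute, chdef -p appends one.
    printf 'chdef -t node -o %s -m postscripts="%s"\n' "$1" "$SCRIPT"
    printf 'chdef -t node -o %s -p postbootscripts="%s"\n' "$1" "$SCRIPT"
    # Show both attributes afterwards to confirm the result.
    printf 'lsdef -t node -o %s -i postscripts,postbootscripts\n' "$1"
}

move_to_postbootscripts "$NODE"
```

If the entry is inherited from a group, the same commands would need to be pointed at the group definition instead.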
To help us understand what happened on the failed nodes, could you share the following information?
10 nodes succeeded and 17 failed. Are all of these nodes installing the same OS? Which OS do you use? We have different code logic for different OSes.
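One possible way to gather this, as a sketch only: the script below prints an lsdef command for one node that succeeded and one that failed, so the outputs can be compared. node01 and node02 are placeholder names, and the attribute list is just a suggestion.

```shell
#!/bin/sh
# GOOD_NODE / FAILED_NODE are placeholder (hypothetical) node names.
GOOD_NODE="${GOOD_NODE:-node01}"
FAILED_NODE="${FAILED_NODE:-node02}"

# Print the lsdef command that shows the OS-related and postscript attributes
# of a node; run the printed command on the management node.
os_query() {
    printf 'lsdef -t node -o %s -i os,arch,profile,provmethod,postscripts,postbootscripts\n' "$1"
}

os_query "$GOOD_NODE"
os_query "$FAILED_NODE"
```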
I think you want "-r" in order to "deconfigure other network cards". You mentioned there is only one network, so I think the other network cards were not configured in the postscripts stage. Is "confignics -s" enough here? If you see it differently, please feel free to contact us. Thanks.
Best Regards
--------------------------------------------------
Yuan Bai (白媛)
CSTL HPC System Management Development
Tel:86-10-82451401
E-mail: by...@cn.ibm.com
Address: IBM ZGC Campus. Ring Building 28,
ZhongGuanCun Software Park,No.8 Dong Bei Wang West Road, Haidian District,
Beijing P.R.China 100193
IBM Ring Building
Building 28, ZhongGuanCun Software Park, No.8 Dong Bei Wang West Road, Haidian District, Beijing
Postal code: 100193
--------------------------------------------------
----- Original message -----
From: Christopher Walker <c.j.wal...@qmul.ac.uk>
To: "xcat-user@lists.sourceforge.net" <xcat-user@lists.sourceforge.net>
Cc:
Subject: [xcat-user] confignics -s -r
Date: Tue, Mar 19, 2019 7:25 PM
We have a problem with "confignics -s -r" not running reliably in a
postscript.
While we have some infiniband nodes, the majority use only one network
for install and as the single network for the nodes.
On node install, we wish to assign a static IP address on the install
nic, and deconfigure other network cards.
updatenode <nodename> confignics -s -r
Does this just fine.
However, it seems unreliable when run as a postscript. On a recent
reinstall of 30 nodes:
10 ran it successfully
17 failed, so nodes still had a dhcp address
3 failed for other reasons (telling the BIOS which image to boot).
I've no idea what causes this - could it be a race condition somewhere?
If so, is there a timer I could increase to make it less likely to happen?
The workaround is to run
updatenode <nodename> confignics -s -r
by hand afterwards.
We are running a relatively old version of xCAT - 2.12.4 - and do plan
to upgrade soon.
Chris
--
Dr Christopher J. Walker
ITS Research
Queen Mary University of London, E1 4NS
+44 20 7882 5969
_______________________________________________
xCAT-user mailing list
xCAT-user@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/xcat-user