On 20/03/2019 02:52, Yuan Y Bai wrote:
> Hi Christopher,
> Could you try to use "confignics -s -r" in postbootscripts?

We could, yes.

> In the postscripts stage, "-r" shuts down the NIC if it is up and
> removes the interface configuration at the same time. When it runs
> ifdown on the install NIC, the result may be unreliable.

It sounds like this may well be the issue. Are you saying there's a
potential race between the "-s" and the "-r" options?

> In order to help us know what happened on your failed nodes, could you
> share the following information?
> You had 10 nodes succeed and 17 fail; are all these nodes
> installing the same OS?

Yes. Furthermore, they were all of the same hardware type, plugged into
the same switches.

> Which OS do you use?

CentOS 7.4

> We have different code logic for different OSes.
> I think you want to use "-r" to "deconfigure other network cards". You
> mentioned there was only one network, so I think the other network cards
> were not configured in the postscripts stage,

Correct, though they get the default config from CentOS, which is to
DHCP. We'd prefer that the config were removed; otherwise we potentially
end up with two IPs on the same network (though it's probably sensible
to disable the network ports too).

> is "confignics -s" enough here?

No, we wish to remove the config for the other NICs. We could, I guess,
put confignics -s in postscripts and confignics -r in postbootscripts
(or vice versa). Is that what you'd suggest?

> Do you have different comments here? Please feel free to
> contact us, thanks.
> 10 ran it successfully
> 17 failed, so nodes still had a dhcp address

Yes indeed.

Thanks,

Chris

> Best Regards
> --------------------------------------------------
> Yuan Bai (白媛)
>
> CSTL HPC System Management Development
> Tel:86-10-82451401
> E-mail: by...@cn.ibm.com
> Address: IBM ZGC Campus. Ring Building 28,
> ZhongGuanCun Software Park, No.8 Dong Bei Wang West Road, Haidian District,
> Beijing P.R.China 100193
>
> ----- Original message -----
> From: Christopher Walker <c.j.wal...@qmul.ac.uk>
> To: "xcat-user@lists.sourceforge.net" <xcat-user@lists.sourceforge.net>
> Cc:
> Subject: [xcat-user] confignics -s -r
> Date: Tue, Mar 19, 2019 7:25 PM
>
> We have a problem with "confignics -s -r" not running reliably in a
> postscript.
>
> While we have some infiniband nodes, the majority use only one network
> for install and as the single network for the nodes.
>
> On node install, we wish to assign a static IP address on the install
> nic, and deconfigure other network cards.
>
> updatenode <nodename> confignics -s -r
>
> Does this just fine.
>
> However, it seems unreliable when run as a postscript. On a recent
> reinstall of 30 nodes:
>
> 10 ran it successfully
> 17 failed, so nodes still had a dhcp address
> 3 failed for other reasons (telling the bios which image to boot).
>
> I've no idea what causes this - could it be a race condition somewhere?
> If so, is there a timer I could increase to make it less likely to
> happen?
>
> The workaround is to run
> updatenode <nodename> confignics -s -r
>
> by hand afterwards.
>
> We are running a relatively old version of xCAT - 2.12.4 - and do plan
> to upgrade soon.
>
> Chris
>
> --
> Dr Christopher J. Walker
> ITS Research
> Queen Mary University of London, E1 4NS
> +44 20 7882 5969
>
> _______________________________________________
> xCAT-user mailing list
> xCAT-user@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/xcat-user

--
Dr Christopher J. Walker
ITS Research
Queen Mary University of London, E1 4NS
+44 20 7882 5969