Dear all,
We have an IBM cluster with 44 compute nodes.
I had to reinstall them for various reasons. This went fine for all but one
node (node11).
After I submitted the commands to prepare the node for reinstallation, it
boots correctly and, judging by the console messages, appears to get its IP
address, but it does not respond to ping. After that nothing happens; it is
stuck. I cannot find anything in the logs (/var/log/messages on the management
node). tcpdump does seem to show traffic with this node:
# tcpdump -i eth2
14:27:18.536865 IP aries.53291 > node11-bmc.7578: . ack 1494861 win 126
14:27:18.610937 IP node11-bmc.7578 > aries.53291: P 1494861:1494870(9) ack 1122 win 2920
14:27:18.610992 IP node11-bmc.7578 > aries.53291: P 1494870:1494889(19) ack 1122 win 2920
14:27:18.611053 IP aries.53291 > node11-bmc.7578: . ack 1494889 win 126
14:27:18.611083 IP node11-bmc.7578 > aries.53291: P 1494889:1495011(122) ack 1122 win 2920
14:27:18.611095 IP aries.53291 > node11-bmc.7578: . ack 1495011 win 126
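Note that the lines shown in this capture involve only aries and node11-bmc (port 7578), not node11's own address. As a rough sketch, saved tcpdump text output could be tallied by host pair to see who is actually talking. The function name summarize_peers is hypothetical, and the sample input is copied from the capture; this is only an illustration, not an xCAT tool.

```shell
# Tally the host pairs that appear in tcpdump text output read from stdin.
# summarize_peers is a hypothetical helper name used only for this sketch.
summarize_peers() {
  # On a packet line, field 2 is "IP", field 3 is source host.port and
  # field 5 is destination host.port followed by a colon.
  # Wrapped continuation lines do not have "IP" in field 2 and are skipped.
  awk '$2 == "IP" {
         src = $3; dst = $5
         sub(/\.[0-9]+$/, "", src)    # drop the source port
         sub(/\.[0-9]+:$/, "", dst)   # drop the destination port and colon
         print src " > " dst
       }' | sort | uniq -c
}

# Sample input copied from the capture.
summarize_peers <<'EOF'
14:27:18.536865 IP aries.53291 > node11-bmc.7578: . ack 1494861 win 126
14:27:18.610937 IP node11-bmc.7578 > aries.53291: P 1494861:1494870(9) ack 1122 win 2920
14:27:18.611053 IP aries.53291 > node11-bmc.7578: . ack 1494889 win 126
EOF
```

If no pair containing node11 itself (172.16.0.11) shows up, the capture says nothing yet about the node's install traffic.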
Access.log:
172.16.0.11 - - [01/Jul/2014:13:45:10 +0200] "GET /tftpboot/xcat/xnba/nodes/node11 HTTP/1.0" 200 418 "-" "gPXE/1.0.1"
error_log:
Nothing
What I did:
# rsetboot node11 net
# rinstall node11
Details on node11 (these are exactly the same as other nodes):
# lsdef node11
Object name: node11
arch=x86_64
bmc=node11-bmc
bmcpassword=…
bmcport=8
bmcusername=USERID
chain=runcmd=bmcsetup,install=sles11sp1-x86_64
currchain=boot
currstate=install sles11sp1-x86_64-compute
groups=compute,all,ipmi
initrd=xcat/sles11sp1/x86_64/initrd
installnic=eth0
ip=172.16.0.11
kcmdline=autoyast=http://172.16.0.254/install/autoinst/node11 install=http://172.16.0.254/install/sles11sp1/x86_64/1 netdevice=eth0 console=tty0 console=ttyS0,115200n8r
kernel=xcat/sles11sp1/x86_64/linux
mac=e4:1f:13:81:71:f4
mgt=ipmi
mtm=7164FT1
netboot=xnba
nfsserver=172.16.0.254
os=sles11sp1
otherinterfaces=node11-bmc:172.16.1.11,node11-ib0:172.16.2.11
postbootscripts=otherpkgs,pbs_node_execution
postscripts=syslog,remoteshell,syncfiles,setupntp,configiba,gpfs,suseUpdates,mount_opt_software
power=ipmi
prescripts-begin=install:clearSMTReg
primarynic=eth0
profile=compute
provmethod=install
serial=0686204
serialflow=hard
serialport=0
serialspeed=115200
status=installing
statustime=07-01-2014 13:41:53
supportedarchs=x86,x86_64
switch=switch-gbe1
switchport=StackNode1:Ethernet11
tftpserver=172.16.0.254
Where can I find more details about what is going on?
Thanks for your help.
Jean-Claude
_______________________________________________
xCAT-user mailing list
xCAT-user@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/xcat-user