Dear all,

We have an IBM cluster with 44 compute nodes.
I had to reinstall them for various reasons. This went fine for all but one 
node (node11).
After I issue the commands to prepare the node for reinstallation, it boots 
correctly and, judging by the console messages, gets its IP address, but it 
does not respond to ping. After that nothing happens; it is stuck. I cannot 
find anything in the logs (/var/log/messages on the management node). tcpdump 
does show traffic with this node:
# tcpdump -i eth2
14:27:18.536865 IP aries.53291 > node11-bmc.7578: . ack 1494861 win 126
14:27:18.610937 IP node11-bmc.7578 > aries.53291: P 1494861:1494870(9) ack 1122 win 2920
14:27:18.610992 IP node11-bmc.7578 > aries.53291: P 1494870:1494889(19) ack 1122 win 2920
14:27:18.611053 IP aries.53291 > node11-bmc.7578: . ack 1494889 win 126
14:27:18.611083 IP node11-bmc.7578 > aries.53291: P 1494889:1495011(122) ack 1122 win 2920
14:27:18.611095 IP aries.53291 > node11-bmc.7578: . ack 1495011 win 126
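
The packets captured above are all exchanged with node11-bmc, not with node11 
itself. A minimal sketch of a capture restricted to the node's own address 
follows. The IP and MAC are taken from the lsdef output further down; that 
eth2 is also the interface facing the install network is an assumption, so 
the interface name may need changing.

```shell
# Values copied from the lsdef output for node11.
NODE_IP="172.16.0.11"
NODE_MAC="e4:1f:13:81:71:f4"
# Assumption: eth2 is the management node's interface on the install network.
IFACE="eth2"

# Match packets to or from the node's own NIC, by IP or by MAC, so that
# DHCP/TFTP/HTTP traffic from the installer shows up without the BMC session.
FILTER="host ${NODE_IP} or ether host ${NODE_MAC}"

# Print the command rather than running it; tcpdump itself needs root.
echo "tcpdump -n -i ${IFACE} '${FILTER}'"
```

Running the printed command as root while the node boots should show whether 
node11 sends anything after fetching its xnba file.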

Access.log:
172.16.0.11 - - [01/Jul/2014:13:45:10 +0200] "GET /tftpboot/xcat/xnba/nodes/node11 HTTP/1.0" 200 418 "-" "gPXE/1.0.1"

error_log:
Nothing



What I did:
# rsetboot node11 net
# rinstall node11

Details on node11 (these are exactly the same as for the other nodes):
# lsdef node11
Object name: node11
    arch=x86_64
    bmc=node11-bmc
    bmcpassword=…
    bmcport=8
    bmcusername=USERID
    chain=runcmd=bmcsetup,install=sles11sp1-x86_64
    currchain=boot
    currstate=install sles11sp1-x86_64-compute
    groups=compute,all,ipmi
    initrd=xcat/sles11sp1/x86_64/initrd
    installnic=eth0
    ip=172.16.0.11
    kcmdline=autoyast=http://172.16.0.254/install/autoinst/node11 install=http://172.16.0.254/install/sles11sp1/x86_64/1 netdevice=eth0 console=tty0 console=ttyS0,115200n8r
    kernel=xcat/sles11sp1/x86_64/linux
    mac=e4:1f:13:81:71:f4
    mgt=ipmi
    mtm=7164FT1
    netboot=xnba
    nfsserver=172.16.0.254
    os=sles11sp1
    otherinterfaces=node11-bmc:172.16.1.11,node11-ib0:172.16.2.11
    postbootscripts=otherpkgs,pbs_node_execution
    postscripts=syslog,remoteshell,syncfiles,setupntp,configiba,gpfs,suseUpdates,mount_opt_software
    power=ipmi
    prescripts-begin=install:clearSMTReg
    primarynic=eth0
    profile=compute
    provmethod=install
    serial=0686204
    serialflow=hard
    serialport=0
    serialspeed=115200
    status=installing
    statustime=07-01-2014 13:41:53
    supportedarchs=x86,x86_64
    switch=switch-gbe1
    switchport=StackNode1:Ethernet11
    tftpserver=172.16.0.254

Where can I find more details on what is going on?

Thanks for your help.

Jean-Claude
_______________________________________________
xCAT-user mailing list
xCAT-user@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/xcat-user
