Dear Linda,
I am now sure I have a DHCP problem. This is what I get from
tcpdump -v -i eth2 | grep node11:
16:48:30.064797 arp who-has node11 tell aries
16:48:31.068790 arp who-has node11 tell aries
16:48:32.064778 arp who-has node11 tell aries
16:49:04.188927 arp who-has node11 tell aries
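One way to check whether any of these requests was ever answered is to count requests and replies in a saved capture. The sketch below is only an illustration: the helper name arp_summary and the capture file are hypothetical, and it assumes the older tcpdump text format shown above, where a reply would be printed as "arp reply <host> is-at <mac>".

```shell
#!/bin/sh
# Hypothetical helper: count ARP requests and replies for one host
# in tcpdump text output that was saved to a file.
# Usage: arp_summary HOST CAPTURE_FILE
arp_summary() {
    host=$1
    capture=$2
    # The trailing space keeps "node11" from also matching "node110".
    # grep -c prints 0 but exits non-zero when nothing matches, hence "|| true".
    requests=$(grep -c -F -- "arp who-has $host " "$capture" || true)
    replies=$(grep -c -F -- "arp reply $host " "$capture" || true)
    echo "requests=$requests replies=$replies"
}

# Example, assuming the capture was saved first:
#   tcpdump -v -i eth2 > /tmp/eth2.txt
#   arp_summary node11 /tmp/eth2.txt
```

If the replies count stays at zero, nothing on the wire is answering for node11's address.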
Jean-Claude
From: Linda Mellor <mel...@us.ibm.com>
Reply-To: xCAT Users Mailing list <xcat-user@lists.sourceforge.net>
Date: Tuesday, 1 July 2014 15:38
To: xCAT Users Mailing list <xcat-user@lists.sourceforge.net>
Subject: Re: [xcat-user] xCAT - Node reinstallation not starting
When you say "I think it gets its IP address according to the messages in the
console", I assume you are watching the node console through rcons or some
other means. When you say "I cannot find anything in the logs", does that mean
you do NOT see the corresponding DHCP messages in /var/log/messages on the MN?
If you do not see them, the first things to check are that you have the
correct MAC address for your node's install NIC, that DHCP is configured
correctly on the MN, and that no other DHCP server on your network is
responding to broadcast requests. If DHCP is behaving well and you see all the
correct communications in /var/log/messages on the MN, then what is the last
thing you see in the node's console output, and what is the last thing you see
in the MN's /var/log/messages for this node?
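As a rough sketch of the log check described above, the following pulls out the DHCP-related lines that mention one MAC address. The function name dhcp_lines_for_mac is hypothetical, and the sketch assumes the DHCP server logs to /var/log/messages and that its lines contain the string "dhcp" (as ISC dhcpd messages such as DHCPDISCOVER and DHCPOFFER do); adjust the path if your syslog is configured differently.

```shell
#!/bin/sh
# Hypothetical helper: print DHCP-related log lines that mention a given MAC.
# Usage: dhcp_lines_for_mac MAC [LOGFILE]
dhcp_lines_for_mac() {
    mac=$1
    logfile=${2:-/var/log/messages}
    # -i: the MAC may be logged in upper or lower case.
    # -F: treat the MAC as a literal string, not a regular expression.
    grep -i 'dhcp' "$logfile" | grep -i -F -- "$mac"
}

# Example, using the mac= value from the node definition quoted below:
#   dhcp_lines_for_mac e4:1f:13:81:71:f4
```

No output for the node's MAC while the node is trying to boot would suggest that its requests are not reaching the DHCP server on the MN, or that the MAC recorded for the node is not the one the node is booting from.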
Linda
From: De Giorgi Jean-Claude <jean-claude.degio...@epfl.ch>
To: xCAT Users Mailing list <xcat-user@lists.sourceforge.net>
Date: 07/01/2014 08:53 AM
Subject: [xcat-user] xCAT - Node reinstallation not starting
________________________________
Dear all,
We have an IBM cluster with 44 compute nodes.
I had to reinstall them for various reasons. This went fine for every node
except one (node11).
After I submit the commands to prepare the node for reinstallation, it boots
correctly and, judging by the messages in the console, I think it gets its IP
address, but it does not respond to ping. Then nothing more happens; it gets
stuck. I cannot find anything in the logs (/var/log/messages on the MN).
tcpdump seems to show traffic with this node:
# tcpdump -i eth2
14:27:18.536865 IP aries.53291 > node11-bmc.7578: . ack 1494861 win 126
14:27:18.610937 IP node11-bmc.7578 > aries.53291: P 1494861:1494870(9) ack 1122 win 2920
14:27:18.610992 IP node11-bmc.7578 > aries.53291: P 1494870:1494889(19) ack 1122 win 2920
14:27:18.611053 IP aries.53291 > node11-bmc.7578: . ack 1494889 win 126
14:27:18.611083 IP node11-bmc.7578 > aries.53291: P 1494889:1495011(122) ack 1122 win 2920
14:27:18.611095 IP aries.53291 > node11-bmc.7578: . ack 1495011 win 126
Access.log:
172.16.0.11 - - [01/Jul/2014:13:45:10 +0200] "GET /tftpboot/xcat/xnba/nodes/node11 HTTP/1.0" 200 418 "-" "gPXE/1.0.1"
error_log:
Nothing
What I did:
# rsetboot node11 net
# rinstall node11
Details on node11 (these are exactly the same as other nodes):
# lsdef node11
Object name: node11
arch=x86_64
bmc=node11-bmc
bmcpassword=…
bmcport=8
bmcusername=USERID
chain=runcmd=bmcsetup,install=sles11sp1-x86_64
currchain=boot
currstate=install sles11sp1-x86_64-compute
groups=compute,all,ipmi
initrd=xcat/sles11sp1/x86_64/initrd
installnic=eth0
ip=172.16.0.11
kcmdline=autoyast=http://172.16.0.254/install/autoinst/node11 install=http://172.16.0.254/install/sles11sp1/x86_64/1 netdevice=eth0 console=tty0 console=ttyS0,115200n8r
kernel=xcat/sles11sp1/x86_64/linux
mac=e4:1f:13:81:71:f4
mgt=ipmi
mtm=7164FT1
netboot=xnba
nfsserver=172.16.0.254
os=sles11sp1
otherinterfaces=node11-bmc:172.16.1.11,node11-ib0:172.16.2.11
postbootscripts=otherpkgs,pbs_node_execution
postscripts=syslog,remoteshell,syncfiles,setupntp,configiba,gpfs,suseUpdates,mount_opt_software
power=ipmi
prescripts-begin=install:clearSMTReg
primarynic=eth0
profile=compute
provmethod=install
serial=0686204
serialflow=hard
serialport=0
serialspeed=115200
status=installing
statustime=07-01-2014 13:41:53
supportedarchs=x86,x86_64
switch=switch-gbe1
switchport=StackNode1:Ethernet11
tftpserver=172.16.0.254
Where can I find more details on what is going on?
Thanks for your help.
Jean-Claude
------------------------------------------------------------------------------
Open source business process management suite built on Java and Eclipse
Turn processes into business applications with Bonita BPM Community Edition
Quickly connect people, data, and systems into organized workflows
Winner of BOSSIE, CODIE, OW2 and Gartner awards
http://p.sf.net/sfu/Bonitasoft
_______________________________________________
xCAT-user mailing list
xCAT-user@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/xcat-user