Dear Linda,

I am now sure I have a DHCP problem. Here is the output of
tcpdump -v -i eth2 | grep node11:
16:48:30.064797 arp who-has node11 tell aries
16:48:31.068790 arp who-has node11 tell aries
16:48:32.064778 arp who-has node11 tell aries
16:49:04.188927 arp who-has node11 tell aries
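
To summarise a capture like this one, a minimal Python sketch can count ARP requests that never get a reply. It assumes tcpdump's textual ARP output (the "who-has ... tell ..." and "reply ... is-at ..." forms); the function name unanswered_arp is made up for illustration and is not part of xCAT or tcpdump.

```python
import re
from collections import Counter

# Patterns for tcpdump's textual ARP lines (assumed format, see lead-in).
WHO_HAS = re.compile(r"who-has\s+(\S+)\s+tell\s+([^\s,]+)", re.IGNORECASE)
IS_AT = re.compile(r"reply\s+(\S+)\s+is-at", re.IGNORECASE)


def unanswered_arp(lines):
    """Return {target: request_count} for targets that were asked for
    with "who-has" but never appear in an ARP reply in the same capture."""
    requests = Counter()
    answered = set()
    for line in lines:
        match = WHO_HAS.search(line)
        if match:
            requests[match.group(1)] += 1
            continue
        match = IS_AT.search(line)
        if match:
            answered.add(match.group(1))
    return {target: n for target, n in requests.items() if target not in answered}
```

Fed the four lines above, it returns {'node11': 4}: aries keeps asking for node11 and nothing answers. That is consistent with the node not having its address configured, but it does not by itself prove a DHCP fault.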

Jean-Claude

From: Linda Mellor <mel...@us.ibm.com>
Reply-To: xCAT Users Mailing list <xcat-user@lists.sourceforge.net>
Date: Tuesday, 1 July 2014 15:38
To: xCAT Users Mailing list <xcat-user@lists.sourceforge.net>
Subject: Re: [xcat-user] xCAT - Node reinstallation not starting


When you say "I think it gets its IP address according to the messages in the 
console", I am assuming you are watching the node console through rcons or 
some other means. When you say "I cannot find anything in the logs", does 
that mean you do NOT see the corresponding DHCP messages in /var/log/messages 
on the MN? If you do not, then the first things to check are that you have 
the correct MAC address for your node's install NIC, that DHCP is configured 
correctly on the MN, and that you do not have another DHCP server on your 
network responding to broadcast requests. If DHCP is behaving well and 
you are seeing all the correct communications in your MN /var/log/messages, 
then what is the last thing you see in the node's console output, and what is 
the last thing you see in your MN /var/log/messages for this node?
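
A minimal sketch of that log check, in Python. It assumes the DHCP server on the MN is ISC dhcpd logging its usual DHCPDISCOVER / DHCPOFFER / DHCPREQUEST / DHCPACK / DHCPNAK messages to syslog; the function name dhcp_events_for_mac is hypothetical and not part of xCAT.

```python
import re

# Message types ISC dhcpd writes to syslog (assumed DHCP server, see lead-in).
DHCP_EVENT = re.compile(r"\b(DHCPDISCOVER|DHCPOFFER|DHCPREQUEST|DHCPACK|DHCPNAK)\b")


def dhcp_events_for_mac(lines, mac):
    """Return, in log order, the DHCP message types on lines mentioning `mac`.

    `lines` is any iterable of log lines, for example an open file object.
    The MAC comparison is case-insensitive.
    """
    wanted = mac.lower()
    events = []
    for line in lines:
        if wanted not in line.lower():
            continue  # line is about some other client (or not DHCP at all)
        match = DHCP_EVENT.search(line)
        if match:
            events.append(match.group(1))
    return events
```

Passing it an open /var/log/messages and the MAC from the node definition (e4:1f:13:81:71:f4 for node11) gives the sequence of DHCP exchanges the MN logged for that node. An empty list would suggest the requests never reach dhcpd or arrive from a different MAC; a sequence with no DHCPACK would point at the server side.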

Linda


From: De Giorgi Jean-Claude <jean-claude.degio...@epfl.ch>
To: xCAT Users Mailing list <xcat-user@lists.sourceforge.net>
Date: 07/01/2014 08:53 AM
Subject: [xcat-user] xCAT - Node reinstallation not starting

________________________________



Dear all,

We have an IBM cluster with 44 compute nodes.
I had to reinstall them for various reasons. This went fine for every node 
except one (node11).
After I ran the commands to prepare the node for reinstallation, it boots 
correctly and, judging by the messages in the console, I think it gets its IP 
address, but it does not respond to ping. Then nothing happens; it gets stuck. 
I cannot find anything in the logs (/var/log/messages on the MN). tcpdump 
seems to show traffic with this node:
# tcpdump -i eth2
14:27:18.536865 IP aries.53291 > node11-bmc.7578: . ack 1494861 win 126
14:27:18.610937 IP node11-bmc.7578 > aries.53291: P 1494861:1494870(9) ack 1122 win 2920
14:27:18.610992 IP node11-bmc.7578 > aries.53291: P 1494870:1494889(19) ack 1122 win 2920
14:27:18.611053 IP aries.53291 > node11-bmc.7578: . ack 1494889 win 126
14:27:18.611083 IP node11-bmc.7578 > aries.53291: P 1494889:1495011(122) ack 1122 win 2920
14:27:18.611095 IP aries.53291 > node11-bmc.7578: . ack 1495011 win 126

Access.log:
172.16.0.11 - - [01/Jul/2014:13:45:10 +0200] "GET /tftpboot/xcat/xnba/nodes/node11 HTTP/1.0" 200 418 "-" "gPXE/1.0.1"

error_log:
Nothing



What I did:
# rsetboot node11 net
# rinstall node11

Details on node11 (these are exactly the same as other nodes):
# lsdef node11
Object name: node11
    arch=x86_64
    bmc=node11-bmc
    bmcpassword=…
    bmcport=8
    bmcusername=USERID
    chain=runcmd=bmcsetup,install=sles11sp1-x86_64
    currchain=boot
    currstate=install sles11sp1-x86_64-compute
    groups=compute,all,ipmi
    initrd=xcat/sles11sp1/x86_64/initrd
    installnic=eth0
    ip=172.16.0.11
    kcmdline=autoyast=http://172.16.0.254/install/autoinst/node11 install=http://172.16.0.254/install/sles11sp1/x86_64/1 netdevice=eth0 console=tty0 console=ttyS0,115200n8r
    kernel=xcat/sles11sp1/x86_64/linux
    mac=e4:1f:13:81:71:f4
    mgt=ipmi
    mtm=7164FT1
    netboot=xnba
    nfsserver=172.16.0.254
    os=sles11sp1
    otherinterfaces=node11-bmc:172.16.1.11,node11-ib0:172.16.2.11
    postbootscripts=otherpkgs,pbs_node_execution
    postscripts=syslog,remoteshell,syncfiles,setupntp,configiba,gpfs,suseUpdates,mount_opt_software
    power=ipmi
    prescripts-begin=install:clearSMTReg
    primarynic=eth0
    profile=compute
    provmethod=install
    serial=0686204
    serialflow=hard
    serialport=0
    serialspeed=115200
    status=installing
    statustime=07-01-2014 13:41:53
    supportedarchs=x86,x86_64
    switch=switch-gbe1
    switchport=StackNode1:Ethernet11
    tftpserver=172.16.0.254
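
Since the definition is meant to match the other nodes, one way to verify that is to diff the lsdef output of node11 against a node that installs correctly. The following is a minimal Python sketch under the assumption that each attribute is printed as one key=value line, as above; parse_lsdef and diff_attrs are illustrative names, not xCAT tools.

```python
def parse_lsdef(text):
    """Parse the text printed by `lsdef <node>` into a dict of attributes.

    Only the first "=" splits key from value, so values that themselves
    contain "=" (such as kcmdline) are kept whole.
    """
    attrs = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("Object name:"):
            continue
        key, sep, value = line.partition("=")
        if sep:
            attrs[key] = value
    return attrs


def diff_attrs(a, b):
    """Return {key: (value_in_a, value_in_b)} for every attribute that
    differs; a key missing on one side shows up as None on that side."""
    return {
        key: (a.get(key), b.get(key))
        for key in sorted(set(a) | set(b))
        if a.get(key) != b.get(key)
    }
```

Per-node attributes such as ip, mac, bmc, serial, switchport, status and statustime are expected to differ. Anything else that shows up in the diff (netboot, installnic, chain, kcmdline beyond the node name) would be worth a closer look.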

Where can I have more details on what is going on?

Thanks for your help.

Jean-Claude

------------------------------------------------------------------------------
Open source business process management suite built on Java and Eclipse
Turn processes into business applications with Bonita BPM Community Edition
Quickly connect people, data, and systems into organized workflows
Winner of BOSSIE, CODIE, OW2 and Gartner awards
http://p.sf.net/sfu/Bonitasoft
_______________________________________________
xCAT-user mailing list
xCAT-user@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/xcat-user
