Hi List,
I couldn't find the solution to this issue.
I resolved it with the following steps:
- Killed the 'xcatd: install monitor' process manually.
- Restarted xcatd on the master node.
Before that, when I ran the command
'/usr/bin/awk -f /xcatpost/updateflag.awk <master node IP> 3002'
manually, it never returned to the prompt. After the steps above, it
returned to the prompt within a second.
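For reference, the two steps above could be scripted roughly as follows. This is only a sketch: using `pkill -f` for the manual kill and `service xcatd restart` for the restart are assumptions, and the `DRY_RUN` variable and `run` helper are hypothetical additions so that nothing is executed until you opt in.

```shell
#!/bin/sh
# Sketch of the recovery steps; meant to be run as root on the master node.
# DRY_RUN defaults to 1, so the commands are only printed.
# Set DRY_RUN=0 to actually execute them.
DRY_RUN="${DRY_RUN:-1}"

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

# Stop the stuck install monitor. -f matches against the full process
# title, which is where the install monitor name appears in ps output.
run pkill -f 'xcatd: install monitor'

# Restart the daemon so that it can start a fresh install monitor.
run service xcatd restart
```

Running it unchanged prints the two commands; `DRY_RUN=0 sh ./script.sh` (the file name is arbitrary) would execute them.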
Best Regards,
Qamar Nazir
HPC Software Engineer
On 12/09/2011 03:27 AM, Jing CDL Sun wrote:
OK, so it seems this is not a name resolution issue. Then maybe you need
to follow Xiaopeng's suggestion for more debugging.
Or, another straightforward debugging method is to start xcatd in the
foreground; it will show messages about the communication between the
mn and cn. I used to debug with it, for example:
service xcatd stop
/opt/xcat/sbin/xcatd -f
Best Regards,
-----------------------------
Sun Jing (孙靖)
IBM China Software Development Laboratory
Tel: (86-10) 82453625 E-mail: [email protected]
Address: Building 28, ZhongGuanCun Software Park,
No.8, Dong Bei Wang West Road, Haidian
District Beijing 100193, PRC
Hi Jing,
Yes, the node can resolve cmgmt1, and cmgmt1 can resolve the node. I
checked resolv.conf from the node's console while it was hung, and I was
also able to ping cmgmt1.
That's why this is so strange (and has me pulling my hair out!). If I set
the node to install a different profile, it installs and runs the
postscripts fine, with no hang, and reboots without an issue. It's just
this specific workstation profile that it has trouble with, but I don't
see anything in the postscripts of this profile that should cause this
behavior.
2011/12/8 Jing CDL Sun <[email protected]>
Hi Dave,
Another thing I can think of: have you checked whether cmgmt1 can be
resolved on your compute node? Basically you need to set
site.nameservers=<mn's ip> and site.domain=<your domain name>. Then,
after makedhcp, the nameserver/domain values will be set in your DHCP
server configuration, so once the compute node is installed, the DHCP
server will supply the values for /etc/resolv.conf on your compute node.
The compute node will then know that the mn is its name server and that
the search path is your domain.
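A small sketch of how the result could be verified on the compute node. The `check_resolv` function is a hypothetical helper, not part of xCAT, and the `chtab` and `makedhcp -n` lines in the comments are only one assumed way of applying the settings described above.

```shell
#!/bin/sh
# On the management node, the settings above could be applied like this
# (left as comments because they change the cluster configuration):
#   chtab key=nameservers site.value=<mn's ip>
#   chtab key=domain site.value=<your domain name>
#   makedhcp -n
#
# On the compute node: check that a resolv.conf file lists the expected
# name server and domain.
# Usage: check_resolv <file> <nameserver-ip> <domain>
check_resolv() {
    awk -v ns="$2" -v dom="$3" '
        # An exact match on the nameserver address.
        $1 == "nameserver" && $2 == ns { have_ns = 1 }
        # The domain may appear on a "search" or a "domain" line.
        $1 == "search" || $1 == "domain" {
            for (i = 2; i <= NF; i++) if ($i == dom) have_dom = 1
        }
        END {
            if (!have_ns)  print "nameserver " ns " not found"
            if (!have_dom) print "domain " dom " not found"
            exit !(have_ns && have_dom)
        }
    ' "$1"
}
```

For example, `check_resolv /etc/resolv.conf 10.0.0.1 cluster.example && echo "resolv.conf looks right"`, where the address and domain are placeholders for your own values.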
Best Regards,
-----------------------------
Sun Jing (孙靖)
Hi Xiao,
Yes, this is a diskful installation. I will take a look at the syslog
tomorrow when I am back in the office to see if there is anything helpful
that I may have missed.
Given that the node installs fine when set to one profile but not
another, what other things besides DNS and iptables could cause the
management node to not receive (or reply to) the flag my node is
apparently failing to send?
2011/12/8 Xiao Peng Wang <[email protected]>
updateflag.awk is used to send a request to xcatd to indicate that the
installation/netboot has finished.
'updateflag.awk MN 3002' is called for a diskful installation, and
'updateflag.awk $MASTER 3002 "installstatus booted"' should be for the
diskless boot. So your case was a diskful installation, right?
For debugging, you need to check whether the process 'xcatd: install
monitor' has been started on the MN; it handles the request from
updateflag.awk.
You can also try to get some hints from syslog:
1. Was the 'nodeset next' command called?
2. Search for messages from the node with the tag 'xcat'.
You could also try to debug into do_installm_service in xcatd. See the
code that handles 'ready', 'next' ...
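A rough sketch of the process and syslog checks, to be run on the MN. The `has_install_monitor` helper and the `LOG` variable are made up for this example, and /var/log/messages is only an assumption about where syslog ends up on your system.

```shell
#!/bin/sh
# Return 0 if the process list on stdin contains the install monitor.
# The [x] bracket keeps grep from matching its own entry in the ps output.
has_install_monitor() {
    grep -q '[x]catd: install monitor'
}

if ps -eo pid,args | has_install_monitor; then
    echo "install monitor is running"
else
    echo "install monitor is NOT running"
fi

# Show the most recent syslog lines that mention xcat, if the log is readable.
LOG="${LOG:-/var/log/messages}"
if [ -r "$LOG" ]; then
    grep -i 'xcat' "$LOG" | tail -n 20
fi
```

If the monitor is reported as not running, or the log shows no 'nodeset next' request from the node, that narrows down which side the request is getting lost on.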
Thanks
Best Regards
----------------------------------------------------------------------
Wang Xiaopeng (王晓朋)
IBM China System Technology Laboratory
Tel: 86-10-82453455
Email: [email protected]
Address: 28,ZhongGuanCun Software Park,No.8 Dong Bei Wang West
Road, Haidian
District Beijing P.R.China 100193
From: Dave Barry <[email protected]>
To: xCAT Users Mailing list <[email protected]>
Date: 2011-12-09 07:35
Subject: [xcat-user] Installing node hanging at updateflag.awk
Hi *,
Can't seem to figure this issue out. I have a node that is running its
postscripts properly (as far as I can tell), but then hangs at
updateflag.awk. The specific line that it seems to hang at, as shown in
ps xf, is:
/bin/awk -f updateflag.awk cmgmt1 3002
That's all that is in the process line; there is no actual command after
the 3002. Even more puzzling, in /tmp/mypostscript.post the following
line does not exist at the very end, while it does on other nodes that
installed properly:
updateflag.awk $MASTER 3002 "installstatus booted"
I can resolve both the node and its master forwards and backwards. This
node also installs just fine when I give it a different profile, so
either something in the OS it is installing (CentOS 5.4) or one of my
postscripts in this profile is causing the issue. I don't know how to
continue troubleshooting when the issue does not appear to be DNS
related; usually problems like this are caused by DNS.
What would cause mypostscript.post to not have the installstatus line at
the bottom? Does this line get written to that file after a certain
"something" happens? Any thoughts on logs or anything else I can look at
to track down this behavior?
Thanks!
------------------------------------------------------------------------------
Cloud Services Checklist: Pricing and Packaging Optimization
This white paper is intended to serve as a reference, checklist and point of
discussion for anyone considering optimizing the pricing and packaging model
of a cloud services business. Read Now!
http://www.accelacomm.com/jaw/sfnl/114/51491232/
_______________________________________________
xCAT-user mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/xcat-user