Hi all,

This is kind of a shot in the dark, but I thought I would send a message
and see if anyone has ideas about the weird behavior we are
experiencing.

We have a large cluster that consists of a single master and 3 service
nodes. Each cluster is configured to utilize 1 service node (no service
node pools due to the networks.tftpserver bug I reported earlier). The
service nodes are using a shared /tftpboot, so site.sharedtftp=1.

We are trying to use the new site.precreatemypostscripts feature to
lower the load on the service nodes when a node boots. However, we are
seeing a strange phenomenon: if we chain a node's actions, such as
chain=osimage, currchain=osimage,
currstate=runimage=http://<MASTER>/script.tgz, sometimes the
NODESETSTATE line inside the mypostscripts.<NODE> file is not kept up to
date when the node boots into its osimage. For example, a node will be
booted into its osimage, but its /xcatpost/mypostscript file will still
have NODESETSTATE=runimage=http://<MASTER>/script.tgz. Other nodes in
the same cluster, booted at the same time, will have the correct line,
NODESETSTATE=netboot.

This happens randomly, and we are unable to reproduce the error
reliably. This is bad for us, as we use some postscripts that rely on
the NODESETSTATE variable to determine how they run.
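
In case it helps anyone check their own nodes, here is a minimal sketch
of how the stale files could be spotted. It is only an illustration, not
part of xCAT: the function names are made up, the directory passed in is
whatever holds your precreated mypostscript files (I am assuming one
file per node), and it assumes the state appears on a line of the form
NODESETSTATE=<value>, optionally quoted.

```python
import re
from pathlib import Path

# Matches lines such as NODESETSTATE=netboot or NODESETSTATE='netboot'.
# A bare "export NODESETSTATE" line (no "=") is deliberately not matched.
_STATE_RE = re.compile(r"^\s*(?:export\s+)?NODESETSTATE=(.*)$")


def read_nodesetstate(path):
    """Return the NODESETSTATE value from one file, or None if absent."""
    for line in Path(path).read_text(errors="replace").splitlines():
        match = _STATE_RE.match(line)
        if match:
            # Drop surrounding whitespace and optional quotes.
            return match.group(1).strip().strip("'\"")
    return None


def find_stale(directory, expected="netboot"):
    """Map file name -> NODESETSTATE for files whose value is not `expected`.

    Files with no NODESETSTATE line at all are reported with a value of None.
    """
    stale = {}
    for path in sorted(Path(directory).iterdir()):
        if path.is_file():
            state = read_nodesetstate(path)
            if state != expected:
                stale[path.name] = state
    return stale
```

Calling find_stale() on the directory of precreated files right after a
batch of nodes has booted into its osimage should return only the nodes
whose file still carries the old runimage state.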

As a workaround, we have disabled that feature, and after many tests we
are no longer able to replicate the problem. Dynamically generated
mypostscript files seem to be kept up to date properly.

Thoughts on what could cause this?
_______________________________________________
xCAT-user mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/xcat-user