On 8/29/08, Greg Kurtzer <[EMAIL PROTECTED]> wrote:
>
>  Perceus only runs the provisioning part (xget) on a non standard port
>  so it doesn't interfere. The xcpu stuff is all standard.

Strange...

>
>  As Abhishek mentioned, check out the ipaddr Perceus module if you're
>  provisioning; otherwise just add entries in the hostfile and do a
>  /etc/init.d/perceus reload. (Caos NSA should already set this up for
>  the user automatically, let me know if it didn't work).

Is there a better description of the modules and their options and
configuration than what is written in the user guide?

I would much rather get static addresses, as it makes debugging and
problem solving a lot easier (as well as node identification).

>
>  If you want to do the host resolution on the dynamic addresses (not
>  something I recommend...) add "nameserver 127.0.0.1" to the top of the
>  master's /etc/resolv.conf. The better fix is to add the entries in the
>  /etc/hosts and the DHCP server itself will manage the static IP
>  addresses.

The nameserver 127.0.0.1 entry did nothing.  In fact, these are the
messages that I got in the system log:
Aug 29 13:59:53 dgk3 perceus-dnsmasq[3100]: reading /etc/resolv.conf
Aug 29 13:59:53 dgk3 perceus-dnsmasq[3100]: using nameserver 128.100.102.202#53
Aug 29 13:59:53 dgk3 perceus-dnsmasq[3100]: using nameserver 142.150.224.6#53
Aug 29 13:59:53 dgk3 perceus-dnsmasq[3100]: using nameserver 142.150.224.224#53
Aug 29 13:59:53 dgk3 perceus-dnsmasq[3100]: ignoring nameserver 127.0.0.1 - local interface
Aug 29 13:59:53 dgk3 perceus-dnsmasq[3100]: using local addresses only for unqualified domains

So, just to clarify, if I want static addresses for the nodes I need
to activate the ipaddr module AND add the addresses to /etc/hosts?
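
For illustration, here is a rough sketch of the kind of sequential /etc/hosts entries meant here. The address prefix, the starting host number and the node count are made-up values, and gen_hosts is a hypothetical helper, not a Perceus command. Whether Perceus expects anything beyond the plain "address name" format is not confirmed in this thread, so the sketch only prints to stdout and does not touch /etc/hosts.

```shell
# Print /etc/hosts-style lines for sequentially numbered nodes.
# gen_hosts is a hypothetical helper; the prefix, starting host
# number and count passed to it are illustrative assumptions.
gen_hosts() {
  prefix=$1
  start=$2
  count=$3
  i=0
  while [ "$i" -lt "$count" ]; do
    # Address first, then the nXXXX name in the style Perceus assigns.
    printf '%s.%d\tn%04d\n' "$prefix" "$((start + i))" "$i"
    i=$((i + 1))
  done
}

# Two nodes starting at 10.10.0.100 (assumed range).
gen_hosts 10.10.0 100 2
```

The output could be reviewed by hand before being appended to /etc/hosts and followed by the /etc/init.d/perceus reload mentioned above.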


I have another question, related to both perceus and xcpu:  How does
one reboot a node?  Is there a perceus command or an xcpu command to
do it?

Also, what does one do about time keeping in a setup like this?  Is
ntp a possibility?

First I'd like to get this going as far as being able to run xrx -a
/bin/date.... :-)
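
Since xrx -a takes its node list from statfs, the statfs.conf one-liner quoted further down matters here. Below is a small sketch of what that awk filter does and why NR > 2 versus NR > 0 changes the result. The sample rows are invented stand-ins for the output of perceus node status (only the node name in column 1 and the address in column 3 are taken from the quoted command; the middle column is a placeholder), and both function names are hypothetical.

```shell
# Hypothetical stand-in for `perceus node status` output with no
# header lines; the real column layout may differ between versions.
sample_status() {
  printf '%s\n' \
    'n0000 ready 10.10.0.170' \
    'n0001 ready 10.10.0.185'
}

# Convert "name <placeholder> ip" rows into statfs.conf lines.
# $1 = number of leading header lines to skip (0 if there are none).
to_statfs_conf() {
  awk -v skip="$1" 'NR > skip {print $1 "=tcp!" $3 "!6667"}'
}

# With no header lines, skip must be 0; a skip of 2 would drop both nodes.
sample_status | to_statfs_conf 0
```

If a given Perceus version does print two header lines, passing 2 instead of 0 reproduces the original NR > 2 behaviour.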

Thanks,
Daniel


>
>  Thanks,
>
> Greg
>
>
>  On Fri, Aug 29, 2008 at 9:38 AM, Abhishek Kulkarni <[EMAIL PROTECTED]> wrote:
>  >
>  >
>  > On Fri, Aug 29, 2008 at 10:13 AM, Daniel Gruner <[EMAIL PROTECTED]> wrote:
>  >>
>  >> Hi Ab
>  >>
>  >> On 8/29/08, Abhishek Kulkarni <[EMAIL PROTECTED]> wrote:
>  >> > Hi Daniel,
>  >> >
>  >> > Understand the way in which XCPU is supposed to integrate with oneSIS
>  >> > and/or Perceus. It uses these as a "launch vehicle" to build minimal
>  >> > images with xcpufs running on them, and provision the nodes with these
>  >> > images. In the best case, that's all that you need to be running on the
>  >> > compute nodes.
>  >>
>  >> I understand.
>  >>
>  >> >
>  >> > On Fri, Aug 29, 2008 at 8:46 AM, Daniel Gruner <[EMAIL PROTECTED]> wrote:
>  >> > >
>  >> > > Hi Greg,
>  >> > >
>  >> > > I definitely have additional questions! :-)
>  >> > >
>  >> > > Ok, here we go:
>  >> > >
>  >> > > - assume I am totally new to this - what would one do in order to set
>  >> > > up a perceus/xcpu cluster?
>  >> >
>  >> > As Greg said, you have two ways to go about it. You could choose either
>  >> > of them or try both to see what works for ya. It's just a matter of
>  >> > playing with different configurations and rebooting your nodes to try
>  >> > them.
>  >> >
>  >> > >
>  >> > >
>  >> > > - now, I am not totally new to this game, and my background is with
>  >> > > bproc clusters, so I would like to have a replacement for these, but
>  >> > > with the same basic principle of having a minimal node installation,
>  >> > > and basically no management of nodes needed.  I definitely do not want
>  >> > > to go to a model where the nodes have password files, and you ssh into
>  >> > > them in order to run your codes.
>  >> > >
>  >> > > - in the caos-NSA installation, the warewulfd is started by default.
>  >> > > I assume it needs to be stopped and perceus started, correct?
>  >> >
>  >> > You can enable Perceus from "sidekick" in NSA. Warewulf focuses on
>  >> > cluster monitoring starting with 3.0.
>  >>
>  >> Ok, I am concentrating on my RHEL5 machine for now.  It seems to be
>  >> working, at least insofar as the nodes boot.  I haven't been able to
>  >> contact them to try to do anything, other than running xstat with a
>  >> positive response:
>  >>
>  >> n0000   tcp!10.10.0.170!6667    /Linux/x86_64   up      0
>  >> n0001   tcp!10.10.0.185!6667    /Linux/x86_64   up      0
>  >>
>  >> I'd like the nodes to get sequential IP addresses, for ease of
>  >> identification and management, and I have yet to find out how you do
>  >> that in perceus.
>  >
>  > Take a look at the ipaddr module in Perceus.
>  >
>  >>
>  >> Now, when I try to do anything on the nodes I get, for example:
>  >>
>  >> xgroupset 10.10.0.170 root 0
>  >> xgroupset: Error: Connection refused:10.10.0.170
>  >
>  > Whoops! What about telnet 10.10.0.170 6667?
>  > Perceus might possibly be running xcpufs on some non-standard port. I'm not
>  > sure about that but I remember seeing something like that a while back.
>  >
>  >>
>  >> similarly with xrx.
>  >>
>  >> xrx 10.10.0.170 /bin/date
>  >> Error: Connection refused:10.10.0.170
>  >
>  > Ditto with this, if it's running on a different port you would want to do
>  > xrx 10.10.0.170!port /bin/date
>  >
>  > Alternatively you could specify the "-a" flag to retrieve the nodes from
>  > the statfs.
>  >
>  >>
>  >> I also don't get name resolution for the nXXXX names assigned to the
>  >> nodes by perceus.
>  >
>  > Check your /etc/resolv.conf.
>  > Probably try adding the following to it.
>  > nameserver 127.0.0.1
>  >
>  > If that doesn't work, the right place to ask this would be the Perceus ML.
>  >
>  >>
>  >>
>  >> >
>  >> > >
>  >> > >
>  >> > > - what initialization of perceus needs to be done (the first time it
>  >> > > runs)?  I know about the network interface specification, and that I
>  >> > > want it to use xget (the default), but is running the "perceus module
>  >> > > activate xcpu" enough to get the nodes booting into xcpu?
>  >> >
>  >> > Yes, it is enough to get xcpufs running on the compute nodes.
>  >> >
>  >> > >
>  >> > >
>  >> > > - what about configuring the resource manager (e.g. slurm) for use in
>  >> > > the perceus/xcpu environment?
>  >> >
>  >> > XCPU only supports Moab Torque for now.
>  >>
>  >> Is this the open source torque, or just the commercial product?
>  >>
>  >>
>  >> >
>  >> > >
>  >> > >
>  >> > > - I don't see the xcpufs and statfs daemons running on the master
>  >> > > after starting perceus even though I told it to activate xcpu.  I
>  >> > > haven't tried to boot nodes yet, but I'd like to understand what I am
>  >> > > doing first (I hate black boxes...).
>  >> > >
>  >> >
>  >> > You shouldn't need to run xcpufs on the master. As for statfs, you can
>  >> > start it manually if it is not running already.
>  >> >
>  >> > Again, considering that you have fully configured the master and have
>  >> > the nodes provisioned to the init state, this is what I would do to
>  >> > generate my statfs.conf --
>  >> >
>  >> > perceus node status | awk 'NR > 2 {print $1 "=tcp!" $3 "!6667"}' > /etc/xcpu/statfs.conf
>  >>
>  >> I had to replace the part "NR>2" with "NR>0" for the above incantation
>  >> to work (??).
>  >
>  > Strange, I might be running a different version of Perceus.
>  >
>  >>
>  >> >
>  >> > And then,
>  >> >
>  >> > statfs -c /etc/xcpu/statfs
>  >>
>  >> statfs seems to work.  Here is the output from xstat:
>  >>
>  >> n0000   tcp!10.10.0.170!6667    /Linux/x86_64   up      0
>  >> n0001   tcp!10.10.0.185!6667    /Linux/x86_64   up      0
>  >>
>  >> In any case, there is some progress, but it is not quite there yet...
>  >
>  > I'm glad you are almost there.
>  >
>  > Thanks,
>  >   -- Abhishek
>  >
>  >>
>  >>
>  >> Thanks,
>  >> Daniel
>  >>
>  >>
>  >>
>  >>
>  >>
>  >>
>  >> >
>  >> >
>  >> > >
>  >> > > etc.
>  >> > >
>  >> > > I guess the main problem I have is not with perceus itself (I have
>  >> > > read the manual), but rather with its integration and provisioning for
>  >> > > xcpu, and for the subsequent configuration of those pieces that make
>  >> > > the cluster useable in a production environment.
>  >> > >
>  >> > >
>  >> > > Thanks for your help,
>  >> > > Daniel
>  >> >
>  >> > Thanks
>  >> >  -- Abhishek
>  >> >
>  >> > >
>  >> > >
>  >> > >
>  >> > >
>  >> > >
>  >> > >
>  >> > > On 8/29/08, Greg Kurtzer <[EMAIL PROTECTED]> wrote:
>  >> > > >
>  >> > > >  You have multiple choices on how to move forward.
>  >> > > >
>  >> > > >  First you can run the xcpu Perceus module like:
>  >> > > >
>  >> > > >  # perceus module activate xcpu
>  >> > > >
>  >> > > >  That will interrupt the node provisioning process and instead of
>  >> > > >  copying the VNFS to the node it will just start up xcpu and start
>  >> > > >  accepting connections.
>  >> > > >
>  >> > > >  The second option would be to run xcpu from within the VNFS of your
>  >> > > >  choice. That mechanism basically involves installing xcpu into the
>  >> > > >  mounted VNFS image and then provision your nodes with that.
>  >> > > >
>  >> > > >  Let me know if that helps or if you have additional questions. :)
>  >> > > >
>  >> > > >
>  >> > > >  Greg
>  >> > > >
>  >> > > >
>  >> > > >
>  >> > > >
>  >> > > >  On Fri, Aug 29, 2008 at 6:45 AM, Daniel Gruner <[EMAIL PROTECTED]> wrote:
>  >> > > >  >
>  >> > > >  > Hi Kevin,
>  >> > > >  >
>  >> > > >  > Well, I've just completed installing xcpu2 and perceus into my RHEL5
>  >> > > >  > machine, but now I am stumped with the configuration.  How do you tell
>  >> > > >  > perceus that you want your cluster to run xcpu?  I sure don't
>  >> > > >  > understand where this is configured (I assume somewhere in the
>  >> > > >  > /etc/perceus .conf files), and there is no mention of that in the
>  >> > > >  > manual other than saying that xcpu works.
>  >> > > >  >
>  >> > > >  > If you install xcpu2 you surely would need 9p, right?
>  >> > > >  >
>  >> > > >  > Also, how does slurm integrate into the perceus/xcpu world?
>  >> > > >  >
>  >> > > >  > I have also installed this on a caos-NSA test machine, but again I
>  >> > > >  > don't know how to configure the provisioning.
>  >> > > >  >
>  >> > > >  > Any help with this would be much appreciated...
>  >> > > >  >
>  >> > > >  > Daniel
>  >> > > >  >
>  >> > > >  >
>  >> > > >  > On 8/28/08, Kevin Tegtmeier <[EMAIL PROTECTED]> wrote:
>  >> > > >  >> We used RHEL5 + perceus successfully.  I had to modify the perceus boot
>  >> > > >  >> image for x86_64, but it may have been a kexec/hardware specific issue I ran
>  >> > > >  >> into.  If you run into an issue with it I can help you along.
>  >> > > >  >>
>  >> > > >  >> I don't think the 9P module was built in, but I don't think you would use
>  >> > > >  >> it.
>  >> > > >  >>
>  >> > > >  >>
>  >> > > >  >> On Thu, Aug 28, 2008 at 11:31 AM, Daniel Gruner <[EMAIL PROTECTED]> wrote:
>  >> > > >  >>
>  >> > > >  >> >
>  >> > > >  >> > Thanks, Abhishek.
>  >> > > >  >> >
>  >> > > >  >> > I will try it and report on my success/lack thereof.
>  >> > > >  >> >
>  >> > > >  >> > Just for info, I am using a RHEL5 distribution, but with the 2.6.26
>  >> > > >  >> > kernel so that it supports 9p.  Has anybody been successful with this
>  >> > > >  >> > distribution?  Otherwise, is there a preferred one?
>  >> > > >  >> >
>  >> > > >  >> > Daniel
>  >> > > >  >> >
>  >> > > >  >> >
>  >> > > >  >> >
>  >> > > >  >> >
>  >> > > >  >> > On 8/28/08, Abhishek Kulkarni <[EMAIL PROTECTED]> wrote:
>  >> > > >  >> > >
>  >> > > >  >> > >  Daniel,
>  >> > > >  >> > >
>  >> > > >  >> > >  It is _not_ necessary to install cAos Linux to use Perceus. Perceus
>  >> > > >  >> > >  supports most, if not all, distributions.
>  >> > > >  >> > >
>  >> > > >  >> > >  XCPU is bundled up as a module within Perceus. The documentation at
>  >> > > >  >> > >  http://www.perceus.org/docs/perceus-userguide-1.4.0.pdf is quite
>  >> > > >  >> > >  extensive at that and has details on importing and activating modules.
>  >> > > >  >> > >  It's quite simple even if you find yourself wanting to tinker with the
>  >> > > >  >> > >  XCPU Perceus module (it's just a shell script that runs at a specified
>  >> > > >  >> > >  provisioning state/level)
>  >> > > >  >> > >
>  >> > > >  >> > >
>  >> > > >  >> > >   -- Abhishek
>  >> > > >  >> > >
>  >> > > >  >> > >
>  >> > > >  >> > >  On Thu, 2008-08-28 at 14:17 -0400, Daniel Gruner wrote:
>  >> > > >  >> > >  > Yes, that is a possibility.  Instructions on that, please?
>  >> > > >  >> > >  > I tried installing caos linux, but it doesn't quite finish doing the
>  >> > > >  >> > >  > install.
>  >> > > >  >> > >  >
>  >> > > >  >> > >  > Daniel
>  >> > > >  >> > >  >
>  >> > > >  >> > >  > On 8/28/08, ron minnich <[EMAIL PROTECTED]> wrote:
>  >> > > >  >> > >  > >
>  >> > > >  >> > >  > >  Use perceus.
>  >> > > >  >> > >  > >
>  >> > > >  >> > >  > >  Ron
>  >> > > >  >> > >  > >
>  >> > > >  >> > >  > >
>  >> > > >  >> > >  > >  On 8/28/08, Daniel Gruner <[EMAIL PROTECTED]> wrote:
>  >> > > >  >> > >  > >  >
>  >> > > >  >> > >  > >  > Hi All,
>  >> > > >  >> > >  > >  >
>  >> > > >  >> > >  > >  > The list has been very quiet lately... :-)
>  >> > > >  >> > >  > >  >
>  >> > > >  >> > >  > >  > I've been trying, yet again, to install the latest xcpu2 in a test
>  >> > > >  >> > >  > >  > cluster.  Ron's instructions on the xcpu.org site seem to be outdated,
>  >> > > >  >> > >  > >  > and partly buggy too.  For instance, here are a couple of points:
>  >> > > >  >> > >  > >  >
>  >> > > >  >> > >  > >  > - After doing:
>  >> > > >  >> > >  > >  >
>  >> > > >  >> > >  > >  > make xcpu-tarball
>  >> > > >  >> > >  > >  >
>  >> > > >  >> > >  > >  > make ramfs-tarball
>  >> > > >  >> > >  > >  >
>  >> > > >  >> > >  > >  > make install
>  >> > > >  >> > >  > >  >
>  >> > > >  >> > >  > >  > I don't know whether xcpu2 has actually been built (I suspect not),
>  >> > > >  >> > >  > >  > and it certainly has not been installed (e.g. no xrx, or xcpufs, or
>  >> > > >  >> > >  > >  > any of that stuff has been installed).
>  >> > > >  >> > >  > >  >
>  >> > > >  >> > >  > >  > - The command
>  >> > > >  >> > >  > >  >
>  >> > > >  >> > >  > >  > export u=`uname -r`
>  >> > > >  >> > >  > >  > ./mk-initramfs-oneSIS -f initrd-$u.img $u -nn -rr \
>  >> > > >  >> > >  > >  > -o ../overlays/xcpu-64 \
>  >> > > >  >> > >  > >  > -w e1000 \
>  >> > > >  >> > >  > >  > -w forcedeth \
>  >> > > >  >> > >  > >  > -w ext3
>  >> > > >  >> > >  > >  >
>  >> > > >  >> > >  > >  > should really be
>  >> > > >  >> > >  > >  >
>  >> > > >  >> > >  > >  > ./mk-xcpu-oneSIS ....
>  >> > > >  >> > >  > >  >
>  >> > > >  >> > >  > >  > in order that the 9p and 9pnet modules get loaded into the initrd.
>  >> > > >  >> > >  > >  >
>  >> > > >  >> > >  > >  > Can someone please take a look and revise the instructions (and let us
>  >> > > >  >> > >  > >  > mere mortals know what to do)?
>  >> > > >  >> > >  > >  >
>  >> > > >  >> > >  > >  >
>  >> > > >  >> > >  > >  > Furthermore, is xcpu2 actually usable for production work?  What about
>  >> > > >  >> > >  > >  > its integration with a scheduler/resource manager?  What about MPI?
>  >> > > >  >> > >  > >  >
>  >> > > >  >> > >  > >  > Regards,
>  >> > > >  >> > >  > >  > Daniel
>  >> > > >  >> > >  > >  >
>  >> > > >  >> > >  > >
>  >> > > >  >> > >  > >
>  >> > > >  >> > >  > > --
>  >> > > >  >> > >  > >  Sent from Gmail for mobile | mobile.google.com
>  >> > > >  >> > >  > >
>  >> > > >  >> > >
>  >> > > >  >> > >
>  >> > > >  >> >
>  >> > > >  >>
>  >> > > >  >>
>  >> > > >  >
>  >> > > >
>  >> > > >
>  >> > > >
>  >> > > >
>  >> > > > --
>  >> > > >  Greg Kurtzer
>  >> > > >  http://www.infiscale.com/
>  >> > > >  http://www.runlevelzero.net/
>  >> > > >  http://www.perceus.org/
>  >> > > >  http://www.caoslinux.org/
>  >> > > >
>  >> > >
>  >> >
>  >> >
>  >
>  >
>
>
>
>
> --
>
> Greg Kurtzer
>  http://www.infiscale.com/
>  http://www.runlevelzero.net/
>  http://www.perceus.org/
>  http://www.caoslinux.org/
>
