Thanks for the corrections!  I too have some comments inline...

On Sat, Dec 13, 2008 at 1:10 PM, Abhishek Kulkarni <[email protected]> wrote:
> Excellent write-up, Daniel. I am adding some of my comments and/or
> suggestions inline.
> I am trying to detail most of these steps in the wiki guide for xcpu. I will
> take note of some of the points that you have made.
> Thanks.
>
> On Sat, Dec 13, 2008 at 10:10 AM, Daniel Gruner <[email protected]> wrote:
>>
>> Ok, here we go...
>>
>> I start from an almost vanilla RHEL5.2 machine, except for the kernel.
>>  RHEL does not provide the 9p modules, so rather than trying to
>> recompile their kernel I just got the 2.6.26 kernel from kernel.org.
>> This allows me to build sxcpu right out of the box.
>
> sxcpu does not need any kernel modules at all. xcpu2 uses the 9p and 9pnet
> modules to mount the head node file system. You can also build the 9p
> modules for a RHEL kernel.
>

I guess way back, when I started with perceus, I was still trying to
use xcpu2, hence the need for a different kernel with 9p support.
Then we realized that perceus includes sxcpu out of the box, so I went
back to that.  xcpu2 is still enticing, but I am not very comfortable
with it - yet.  Perhaps when the writeup is done and it can be
explained in more detail, including its benefits and pitfalls, I'll go
to it.

>>
>> I obtained it
>> from the sourceforge svn repository:
>>
>> svn co https://xcpu.svn.sourceforge.net/svnroot/xcpu/sxcpu/trunk sxcpu
>>
>> Here you simply do "make; make install" and it should all be
>> available.
>
> Another thing to note: there are a few prerequisites (libelf, openssl
> headers) for sxcpu and you would have to install them for it to build
> successfully if you are on a vanilla debian/ubuntu system.
>
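
A quick way to check for those prerequisites before running make is
sketched below.  The header names (libelf.h, openssl/ssl.h) and the
default /usr/include location are assumptions about what the
development packages install, so adjust them for your distribution.

```shell
#!/bin/sh
# Sketch: report whether the headers the sxcpu build is said to need exist.
# Header names and the default include directory are assumptions.
check_headers() {
    incdir="${1:-/usr/include}"
    missing=0
    for h in libelf.h openssl/ssl.h; do
        if [ -f "$incdir/$h" ]; then
            echo "found:   $h"
        else
            echo "missing: $h"
            missing=1
        fi
    done
    # Non-zero exit status if any header was not found.
    return "$missing"
}
```

Calling "check_headers" with no argument looks in /usr/include; pass a
different directory if your headers live elsewhere.
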
>>
>>  It will be important to run the "statfs" daemon on the
>> master, so that the commands you use later are aware of the status of
>> the compute nodes in the cluster.  These will need to be configured in
>> the /etc/xcpu/statfs.conf file:
>>
>> [r...@dgk3 xcpu]# cat /etc/xcpu/statfs.conf
>> n0000=tcp!10.10.0.10!6667
>> n0001=tcp!10.10.0.11!6667
>>
>> See below for more details on the assignment of IP addresses to the
>> nodes by perceus.
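
Since the node names and addresses also end up in /etc/hosts, the
statfs.conf entries can be derived from hosts-style lines instead of
typed by hand.  This is only a sketch: the function name is made up,
and it assumes compute nodes are the names of the form n<digits> and
that xcpufs listens on port 6667, as in the listing above.

```shell
#!/bin/sh
# Sketch: turn hosts-style lines ("IP name") on stdin into statfs.conf
# entries of the form name=tcp!IP!port.  Only names matching n<digits>
# are treated as compute nodes; everything else is skipped.
hosts_to_statfs() {
    port="${1:-6667}"
    awk -v port="$port" \
        '$2 ~ /^n[0-9]+$/ { printf "%s=tcp!%s!%s\n", $2, $1, port }'
}
```

For example, "hosts_to_statfs < /etc/hosts" prints the entries on
stdout so they can be reviewed before being written to
/etc/xcpu/statfs.conf.
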
>>
>>
>> Then the perceus side of things:  I have perceus 1.4.4, downloaded
>> directly from their site.  To build it I just did the usual
>> ./configure; make; make install with no special options.  Now to the
>> perceus configuration...
>>
>> My internal network to the compute nodes is eth0.  Here is the
>> ifconfig for my master node:
>>
>> [r...@dgk3 all]# ifconfig
>> eth0      Link encap:Ethernet  HWaddr 00:E0:81:2C:81:D0
>>          inet addr:10.10.0.1  Bcast:10.10.0.255  Mask:255.255.255.0
>>          inet6 addr: fe80::2e0:81ff:fe2c:81d0/64 Scope:Link
>>          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
>>          RX packets:8044504 errors:0 dropped:0 overruns:0 frame:0
>>          TX packets:10719515 errors:0 dropped:0 overruns:0 carrier:0
>>          collisions:0 txqueuelen:1000
>>          RX bytes:1770711038 (1.6 GiB)  TX bytes:1542820770 (1.4 GiB)
>>          Interrupt:24
>>
>> eth1      Link encap:Ethernet  HWaddr 00:E0:81:2C:81:D1
>>          inet addr:142.150.227.13  Bcast:142.150.227.255  Mask:255.255.252.0
>>          inet6 addr: fec0::9:2e0:81ff:fe2c:81d1/64 Scope:Site
>>          inet6 addr: 2002:8e96:e1cc:9:2e0:81ff:fe2c:81d1/64 Scope:Global
>>          inet6 addr: fe80::2e0:81ff:fe2c:81d1/64 Scope:Link
>>          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
>>          RX packets:44696410 errors:0 dropped:0 overruns:0 frame:0
>>          TX packets:1158903 errors:0 dropped:0 overruns:0 carrier:0
>>          collisions:0 txqueuelen:1000
>>          RX bytes:4665982604 (4.3 GiB)  TX bytes:1197487595 (1.1 GiB)
>>          Interrupt:25
>>
>> lo        Link encap:Local Loopback
>>          inet addr:127.0.0.1  Mask:255.0.0.0
>>          inet6 addr: ::1/128 Scope:Host
>>          UP LOOPBACK RUNNING  MTU:16436  Metric:1
>>          RX packets:21339 errors:0 dropped:0 overruns:0 frame:0
>>          TX packets:21339 errors:0 dropped:0 overruns:0 carrier:0
>>          collisions:0 txqueuelen:0
>>          RX bytes:66081636 (63.0 MiB)  TX bytes:66081636 (63.0 MiB)
>>
>> In /etc/perceus there are several configuration files:
>>
>> ---defaults.conf---
>> [r...@dgk3 perceus]# cat defaults.conf
>> #
>> # Copyright (c) 2006-2008, Greg M. Kurtzer, Arthur A. Stevens and
>> # Infiscale, Inc. All rights reserved
>> #
>>
>> # This is the template name for all new nodes as they are configured.
>>
>> # Define the node name range. The '#' characters symbolize the node number
>> # in the order of initalized. If you don't allocate enough number spaces
>> # here for what you defined in 'Total Nodes' then it will be automatically
>> # padded.
>> Node Name = n####
>>
>> # What is the default group for new nodes (this doesn't have to exist
>> # anywhere before hand)
>> Group Name = cluster
>>
>> # Define the default VNFS image that should be assigned to new nodes
>> Vnfs Name =
>>
>> # Are new nodes automatically enabled and provisionined?
>> Enabled = 1
>>
>> # What is the first node number that we should count at?
>> First Node = 0
>>
>> # This is the total node count that Perceus would ever try and allocate a
>> # node to. It is safe to make this big, so you should leave it big.
>> Total Nodes = 10000
>>
>> (I did not modify the defaults.conf file).
>>
>>
>> ---dnsmasq.conf---
>> [r...@dgk3 perceus]# cat dnsmasq.conf
>> interface=eth0
>> enable-tftp
>> tftp-root=/usr/local/var/lib/perceus//tftp
>> dhcp-option=vendor:Etherboot,60,"Etherboot"
>> dhcp-boot=pxelinux.0
>> local=//
>> domain=internal
>> expand-hosts
>> dhcp-range=10.10.0.128,10.10.0.254
>> dhcp-lease-max=21600
>> read-ethers
>>
>>
>> ---perceus.conf---
>> [r...@dgk3 perceus]# cat perceus.conf
>> #
>> # Copyright (c) 2006-2008, Greg M. Kurtzer, Arthur A. Stevens and
>> # Infiscale, Inc. All rights reserved
>> #
>>
>> # This is the primary configuration file for Perceus
>>
>> # Define the network device on this system that is connected directly
>> # and privately to the nodes. This device will be responding to DHCP
>> # requests thus make sure you specify the proper device name!
>> # note: This device must be configured for IP based communication.
>> master network device = eth0
>>
>> # What protocol should be used to retireve the VNFS information. Generally
>> # Supported options in this version of Perceus are: 'xget', 'nfs', and
>> # 'http'
>> # but others may also be available via specialized VNFS capsules or
>> # feature enhancing Perceus Modules.
>> vnfs transfer method = xget
>>
>> # Define the IP Address of the network file server. This address must be
>> # set before Perceus can operate. If this option is left blank, the IP
>> # address of the "master network device" defined above will be used.
>> vnfs transfer master =
>>
>> # Define the VNFS transfer location if it is different from the default
>> # ('statedir'). This gets used differently for different transfer methods
>> # (e.g. NFS this replaces the path to statedir, while with http it is gets
>> # prepended to the "/perceus" path).
>> vnfs transfer prefix =
>>
>> # What is the default database that should be used. If this option is not
>> # specified, then the default is "hash" to remain compatible with
>> # previous versions of Perceus. Other options are 'btree' and 'mysql'.
>> # note: btree is default as of version 1.4.
>> database type = btree
>>
>> # If you selected an SQL database solution as your database type above,
>> # then you will need to specify the SQL user login information here.
>> # note: this will be ignored for non-SQL database types.
>> database server = localhost
>> database name = perceus
>> database user = db user
>> database pass = db pass
>>
>> # To allow for better scaling the Perceus daemon 'preforks' which creates
>> # multiple subprocesses to better handle large number of simultaneous
>> # connections. The default is 4 which on most systems can support
>> # thousands of nodes per minute but for best tuning this number is highly
>> # dependant on system configuration (both hardware and software).
>> prefork = 4
>>
>> # How long (in seconds) should we wait before considering a node as dead.
>> # Note, that if you are not running node client daemons, then after
>> # provisioning the node will never check in, and will no doubt expire.
>> # Considering that the default node check in is 5 minutes, setting this
>> # to double that should ensure that any living node would have checked in
>> # by then (600).
>> node timeout = 600
>>
>>
>> I only modified the master network device to point to eth0.  Note that
>> there are no VNFS images defined, as booting xcpu does not require
>> them.
>>
>> Install the perceus startup script in /etc/rc.d/init.d, so that it
>> will start on boot.  I believe it gets installed by default (in the
>> "make install" step), but it still needs to be configured with
>> "chkconfig --add perceus".
>
> I don't think this is necessary. Perceus manages the init scripts for most
> distributions properly.
>

I just thought it would be good to mention it.  I do not remember if
perceus did this automatically or not...

>>
>> This will start the perceus daemons,
>> including the dnsmasq which provides dhcp for the slave nodes.  No
>> other dhcp server can run, but this one can be configured to provide
>> other network configuration for additional NICs.
>
> Yes, this works fine in most cases. And dnsmasq is pretty customizable at
> that.
> http://www.thekelleys.org.uk/dnsmasq/docs/dnsmasq-man.html
>
> Unless you have special needs like having all the compute nodes accessible
> directly from a public network (yes i have heard that before!), it should
> work for you.

Ugh!  I believe in compute nodes needing an external fileserver, but
not in direct access to the nodes from the outside.

>
>>
>> After rebooting, make sure perceus is running.  Then run the command:
>>
>> perceus module activate xcpu
>> perceus module activate ipaddr
>>
>> In order to get static addresses assigned to the compute nodes
>> (desirable), their addresses must be added to the /etc/hosts file,
>> e.g.:
>>
>> 10.10.0.1       master
>> 10.10.0.10      n0000
>> 10.10.0.11      n0001
>>
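
For more than a handful of nodes, the /etc/hosts lines can be generated
rather than typed.  A minimal sketch, assuming the nodes are numbered
consecutively and get consecutive addresses in one /24 (the function
name and its arguments are made up for illustration):

```shell
#!/bin/sh
# Sketch: print hosts entries for consecutively numbered nodes.
#   $1 = first three octets of the network, e.g. 10.10.0
#   $2 = last octet of the first node's address, e.g. 10
#   $3 = number of nodes
# Names follow the perceus default pattern n####, starting at n0000.
gen_node_hosts() {
    prefix="$1"; first_octet="$2"; count="$3"
    i=0
    while [ "$i" -lt "$count" ]; do
        printf '%s.%d\tn%04d\n' "$prefix" "$((first_octet + i))" "$i"
        i=$((i + 1))
    done
}
```

"gen_node_hosts 10.10.0 10 2" reproduces the two node lines shown
above; review the output before appending it to /etc/hosts.
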
>> You should be ready to start configuring the nodes at this stage.
>> They must be set for pxe boot.  All the necessary stuff for this is
>> installed by perceus in /usr/local/var/lib/perceus.  You boot your
>> compute nodes in the order in which you want them named, starting, by
>> default, as n0000.  The first time they will be assigned an IP address
>> from the dynamic range defined in the /etc/perceus/dnsmasq.conf file,
>> but on reboot they will get the statically assigned address from the
>> /etc/hosts file.
>>
>> By this stage you should have a useable xcpu cluster.  You need to set
>> up the groups and users using the xgroupset and xuserset commands.  In
>> order to get "proper" behaviour, in accordance with the version of
>> sxcpu that you downloaded and built, you may need to update the xcpufs
>> provided by perceus.  This is done  by statically linking the xcpufs
>> daemon:
>>
>> In /usr/local/src/sxcpu/xcpufs (or wherever you installed the sxcpu
>> sources) there is a script called LINKSTATIC.  I am running on x86_64,
>> so I modified it to read:
>>
>> [r...@dgk3 xcpufs]# cat LINKSTATIC
>> #!/bin/sh
>> echo This script is for linking statically on Linux.
>> cc -static -o xcpufs.static -Wall -g -I ../include -DSYSNAME=Linux \
>>   file.o pipe.o proc-Linux.o tspawn.o ufs.o xauth.o xcpufs.o  -g \
>>   -L../libstrutil -lstrutil -L../libspclient -lspclient -L../libspfs \
>>   -lspfs -L../libxauth -lxauth -lcrypto /usr/lib64/libdl.a
>>
>> Note that it produces the "xcpufs.static" executable, and it looks for
>> its libdl.a library in the /usr/lib64 directory.  Then I copy the
>> xcpufs.static executable to the location where perceus needs it:
>
> Rather than doing this manually, it's recommended to put the tarball in the
> right place within Perceus and then make -C 3rd_party/ xcpu to generate a
> new xcpufs. Perceus applies its own static libs patch to sxcpu and you don't
> have to worry about the multilib path.
>

Ok this is a good idea.  I didn't know where perceus expected this.
Looking at the makefile in the 3rd_party directory of perceus it seems
that one needs to possibly change it to correspond to whatever version
of sxcpu one provides.  Should not be a big problem.

>>
>> cp /usr/local/src/sxcpu/xcpufs/xcpufs.static \
>>    /usr/local/var/lib/perceus/modules/xcpu/xcpufs
>>
>> and on reboot the nodes will pick up the latest and greatest xcpufs.
>>
>> After this you can do, for example:
>>
>> xgroupset add -a -u
>> xuserset add -a -u
>>
>> in order to add all the groups and all the users to the permitted user
>> list on the nodes.  You can then run anything on the nodes, e.g. "xrx
>> -a date".
>>
>> Needless to say, this requires that the "statfs" daemon be running.
>> You can verify this with the "xstat" command (see the configuration
>> instructions for this above).
>>
>> Now, I don't know anything about IB, mainly because I have never had
>> access to an IB-connected cluster.  I have no idea if perceus can
>> manage pxe booting over IB, but I suspect that if you have IP over IB
>> then it should, for all intents and purposes, look like just another
>> network interface to it (I could be utterly wrong on this, of
>> course...).
>
> gPXE does have a working IB subsystem, but I am not sure which network
> cards it supports.
>
>>
>> However, if you need to configure a second interface, say for access
>> to a fileserver on a separate network, then all you need to do is
>> change the /etc/perceus/dnsmasq.conf and define the machines in there.
>>  Again, for static IP addresses on the second interface they need to
>> be defined in /etc/hosts.  Let me know if you would like details on
>> how I did this.  I then mounted my fileserver on the compute nodes by
>> modifying the perceus xcpu startup script in
>> /etc/perceus/nodescripts/init/all/05-xcpu.sh, so that the node gets a
>> mount point and executes the nfs mount.
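
The kind of addition I mean for the node script looks roughly like the
sketch below.  It is not the actual contents of 05-xcpu.sh: the
function name, the MOUNT override (handy for a dry run) and the
"nolock" option are assumptions, and the server, export and mount
point in the usage line are placeholders.

```shell
#!/bin/sh
# Sketch: create a mount point and NFS-mount a fileserver export on it.
#   $1 = fileserver address, $2 = exported path, $3 = local mount point
# Set MOUNT=echo to print the mount arguments instead of mounting.
mount_fileserver() {
    server="$1"; export_path="$2"; mountpoint="$3"
    mkdir -p "$mountpoint" || return 1
    # "nolock" is an assumption for a minimal node without a lock daemon.
    ${MOUNT:-mount} -t nfs -o nolock "$server:$export_path" "$mountpoint"
}
```

On a node this would be called as, say,
"mount_fileserver 10.20.0.5 /export/home /home", with your own
fileserver address and paths.
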
>>
>> Please let me know if/how this works for you.  I hope it is complete...
>> Best regards,
>> Daniel
>>
>> p.s. Please feel free to modify this blurb and add it to the xcpu
>> installation instructions.  Greg from the perceus group is extremely
>> helpful with any perceus issues.
>
> Yes, I will use this for the instructions on the wiki. Thanks.
>
>  -- Abhishek

Great!  Is the wiki available for perusing yet?

Daniel



>
>
>>
>> On Fri, Dec 12, 2008 at 3:41 PM, Chris Kinney <[email protected]> wrote:
>> > Hey Daniel,
>> >
>> >   My name is Chris Kinney, I'm Ron's intern. I was wondering if you could
>> > show me how you're booting your perceus setup. We're in need of perceus
>> > being able to work with IB and from what we've seen, the capsules just
>> > don't work with it. Whatever you've got that can help would be great!
>> > Thanks again!
>> >
>> > -Chris
>> >
>> > ron minnich wrote:
>> >>
>> >> On Thu, Dec 11, 2008 at 5:48 PM, Daniel Gruner <[email protected]> wrote:
>> >>
>> >>>
>> >>> Yeah, you don't even need a VNFS image in order to boot into xcpu!
>> >>> All you need is the initial busybox provided by perceus and activating
>> >>> the perceus xcpu module "perceus module activate xcpu".  This will
>> >>> give you a minimal xcpu node, and you can then add remote filesystems
>> >>> for mounting if necessary.  You only really need user files, if at
>> >>> all, since the executables that you run with xrx take the necessary
>> >>> libraries along automagically.
>> >>>
>> >>>
>> >>
>> >> daniel, this is an excellent point, and since we are having a terrible
>> >> time getting our vnfs capsules to work with ib ...
>> >>
>> >> can you give us a quick writeup for how you set this up so we can use
>> >> it too.
>> >>
>> >> thanks
>> >>
>> >> ron
>> >>
>> >>
>> >
>> >
>
>
