On Sat, Dec 13, 2008 at 9:36 PM, Daniel Gruner <[email protected]> wrote:
>
> Thanks for the corrections!  I too have some comments inline...
>
> On Sat, Dec 13, 2008 at 1:10 PM, Abhishek Kulkarni <[email protected]> wrote:
> > Excellent write-up, Daniel. I am adding some of my comments and/or
> > suggestions inline.
> > I am trying to detail most of these steps in the wiki guide for xcpu.
> > I will take note of some of the points that you have made.
> > Thanks.
> >
> > On Sat, Dec 13, 2008 at 10:10 AM, Daniel Gruner <[email protected]> wrote:
> >>
> >> Ok, here we go...
> >>
> >> I start from an almost vanilla RHEL5.2 machine, except for the kernel.
> >> RHEL does not provide the 9p modules, so rather than trying to
> >> recompile their kernel I just got the 2.6.26 kernel from kernel.org.
> >> This allows me to build sxcpu right out of the box.
> >
> > sxcpu does not need any kernel modules at all. xcpu2 uses the 9p and
> > 9pnet modules to mount the head node file system. You can also build
> > the 9p modules for a RHEL kernel.
>
> I guess way back, when I started with perceus, I was still trying to
> use xcpu2, hence the need for a different kernel with 9p support.
> Then we realized that perceus includes sxcpu out of the box, so I went
> back to that. xcpu2 is still enticing, but I am not very comfortable
> with it - yet. Perhaps when the writeup is done and it can be
> explained in more detail, including its benefits and pitfalls, I'll go
> to it.
>
> >> I obtained it from the sourceforge svn repository:
> >>
> >> svn co https://xcpu.svn.sourceforge.net/svnroot/xcpu/sxcpu/trunk sxcpu
> >>
> >> Here you simply do "make; make install" and it should all be
> >> available.
> >
> > Another thing to note: there are a few prerequisites (libelf, openssl
> > headers) for sxcpu and you would have to install them for it to build
> > successfully if you are on a vanilla debian/ubuntu system.
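To flesh that out for the wiki: on a bare Debian/Ubuntu box the whole
sxcpu build should boil down to roughly the following (the -dev package
names are from memory, so double-check them before pasting):

  # build tools plus the libelf and openssl headers that sxcpu needs
  apt-get install build-essential subversion libelf-dev libssl-dev
  svn co https://xcpu.svn.sourceforge.net/svnroot/xcpu/sxcpu/trunk sxcpu
  make -C sxcpu
  make -C sxcpu install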
> >> It will be important to run the "statfs" daemon on the master, so
> >> that the commands you use later are aware of the status of the
> >> compute nodes in the cluster. The nodes need to be listed in the
> >> /etc/xcpu/statfs.conf file:
> >>
> >> [r...@dgk3 xcpu]# cat /etc/xcpu/statfs.conf
> >> n0000=tcp!10.10.0.10!6667
> >> n0001=tcp!10.10.0.11!6667
> >>
> >> See below for more details on the assignment of IP addresses to the
> >> nodes by perceus.
> >>
> >> Then the perceus side of things: I have perceus 1.4.4, downloaded
> >> directly from their site. To build it I just did the usual
> >> ./configure; make; make install with no special options. Now to the
> >> perceus configuration...
> >>
> >> My internal network to the compute nodes is eth0. Here is the
> >> ifconfig for my master node:
> >>
> >> [r...@dgk3 all]# ifconfig
> >> eth0      Link encap:Ethernet  HWaddr 00:E0:81:2C:81:D0
> >>           inet addr:10.10.0.1  Bcast:10.10.0.255  Mask:255.255.255.0
> >>           inet6 addr: fe80::2e0:81ff:fe2c:81d0/64 Scope:Link
> >>           UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
> >>           RX packets:8044504 errors:0 dropped:0 overruns:0 frame:0
> >>           TX packets:10719515 errors:0 dropped:0 overruns:0 carrier:0
> >>           collisions:0 txqueuelen:1000
> >>           RX bytes:1770711038 (1.6 GiB)  TX bytes:1542820770 (1.4 GiB)
> >>           Interrupt:24
> >>
> >> eth1      Link encap:Ethernet  HWaddr 00:E0:81:2C:81:D1
> >>           inet addr:142.150.227.13  Bcast:142.150.227.255  Mask:255.255.252.0
> >>           inet6 addr: fec0::9:2e0:81ff:fe2c:81d1/64 Scope:Site
> >>           inet6 addr: 2002:8e96:e1cc:9:2e0:81ff:fe2c:81d1/64 Scope:Global
> >>           inet6 addr: fe80::2e0:81ff:fe2c:81d1/64 Scope:Link
> >>           UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
> >>           RX packets:44696410 errors:0 dropped:0 overruns:0 frame:0
> >>           TX packets:1158903 errors:0 dropped:0 overruns:0 carrier:0
> >>           collisions:0 txqueuelen:1000
> >>           RX bytes:4665982604 (4.3 GiB)  TX bytes:1197487595 (1.1 GiB)
> >>           Interrupt:25
> >>
> >> lo        Link encap:Local Loopback
> >>           inet addr:127.0.0.1  Mask:255.0.0.0
> >>           inet6 addr: ::1/128 Scope:Host
> >>           UP LOOPBACK RUNNING  MTU:16436  Metric:1
> >>           RX packets:21339 errors:0 dropped:0 overruns:0 frame:0
> >>           TX packets:21339 errors:0 dropped:0 overruns:0 carrier:0
> >>           collisions:0 txqueuelen:0
> >>           RX bytes:66081636 (63.0 MiB)  TX bytes:66081636 (63.0 MiB)
> >>
> >> In /etc/perceus there are several configuration files:
> >>
> >> ---defaults.conf---
> >> [r...@dgk3 perceus]# cat defaults.conf
> >> #
> >> # Copyright (c) 2006-2008, Greg M. Kurtzer, Arthur A. Stevens and
> >> # Infiscale, Inc. All rights reserved
> >> #
> >>
> >> # This is the template name for all new nodes as they are configured.
> >>
> >> # Define the node name range. The '#' characters symbolize the node
> >> # number in the order initialized. If you don't allocate enough number
> >> # spaces here for what you defined in 'Total Nodes' then it will be
> >> # automatically padded.
> >> Node Name = n####
> >>
> >> # What is the default group for new nodes (this doesn't have to exist
> >> # anywhere beforehand)
> >> Group Name = cluster
> >>
> >> # Define the default VNFS image that should be assigned to new nodes
> >> Vnfs Name =
> >>
> >> # Are new nodes automatically enabled and provisioned?
> >> Enabled = 1
> >>
> >> # What is the first node number that we should count at?
> >> First Node = 0
> >>
> >> # This is the total node count that Perceus would ever try to
> >> # allocate a node to. It is safe to make this big, so you should
> >> # leave it big.
> >> Total Nodes = 10000
> >>
> >> (I did not modify the defaults.conf file).
> >>
> >> ---dnsmasq.conf---
> >> [r...@dgk3 perceus]# cat dnsmasq.conf
> >> interface=eth0
> >> enable-tftp
> >> tftp-root=/usr/local/var/lib/perceus//tftp
> >> dhcp-option=vendor:Etherboot,60,"Etherboot"
> >> dhcp-boot=pxelinux.0
> >> local=//
> >> domain=internal
> >> expand-hosts
> >> dhcp-range=10.10.0.128,10.10.0.254
> >> dhcp-lease-max=21600
> >> read-ethers
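A side note for the wiki: because of the read-ethers line, dnsmasq will
also honour static MAC-to-hostname mappings from /etc/ethers, which is
handy if you ever need to pin a node's lease by hand. An entry looks
like this (the MAC addresses here are made up):

  00:e0:81:aa:bb:01  n0000
  00:e0:81:aa:bb:02  n0001

The names then resolve through /etc/hosts, as Daniel describes below.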
> >> ---perceus.conf---
> >> [r...@dgk3 perceus]# cat perceus.conf
> >> #
> >> # Copyright (c) 2006-2008, Greg M. Kurtzer, Arthur A. Stevens and
> >> # Infiscale, Inc. All rights reserved
> >> #
> >>
> >> # This is the primary configuration file for Perceus
> >>
> >> # Define the network device on this system that is connected directly
> >> # and privately to the nodes. This device will be responding to DHCP
> >> # requests thus make sure you specify the proper device name!
> >> # note: This device must be configured for IP based communication.
> >> master network device = eth0
> >>
> >> # What protocol should be used to retrieve the VNFS information.
> >> # Generally supported options in this version of Perceus are: 'xget',
> >> # 'nfs', and 'http', but others may also be available via specialized
> >> # VNFS capsules or feature enhancing Perceus Modules.
> >> vnfs transfer method = xget
> >>
> >> # Define the IP Address of the network file server. This address must
> >> # be set before Perceus can operate. If this option is left blank, the
> >> # IP address of the "master network device" defined above will be used.
> >> vnfs transfer master =
> >>
> >> # Define the VNFS transfer location if it is different from the default
> >> # ('statedir'). This gets used differently for different transfer
> >> # methods (e.g. with NFS this replaces the path to statedir, while with
> >> # http it gets prepended to the "/perceus" path).
> >> vnfs transfer prefix =
> >>
> >> # What is the default database that should be used. If this option is
> >> # not specified, then the default is "hash" to remain compatible with
> >> # previous versions of Perceus. Other options are 'btree' and 'mysql'.
> >> # note: btree is default as of version 1.4.
> >> database type = btree
> >>
> >> # If you selected an SQL database solution as your database type above,
> >> # then you will need to specify the SQL user login information here.
> >> # note: this will be ignored for non-SQL database types.
> >> database server = localhost
> >> database name = perceus
> >> database user = db user
> >> database pass = db pass
> >>
> >> # To allow for better scaling the Perceus daemon 'preforks', which
> >> # creates multiple subprocesses to better handle large numbers of
> >> # simultaneous connections. The default is 4, which on most systems can
> >> # support thousands of nodes per minute, but for best tuning this
> >> # number is highly dependent on system configuration (both hardware
> >> # and software).
> >> prefork = 4
> >>
> >> # How long (in seconds) should we wait before considering a node as
> >> # dead. Note that if you are not running node client daemons, then
> >> # after provisioning the node will never check in, and will no doubt
> >> # expire. Considering that the default node check-in is 5 minutes,
> >> # setting this to double that should ensure that any living node
> >> # would have checked in by then (600).
> >> node timeout = 600
> >>
> >> I only modified the master network device to point to eth0. Note that
> >> there are no VNFS images defined, as booting xcpu does not require
> >> them.
> >>
> >> Install the perceus startup script in /etc/rc.d/init.d, so that it
> >> will start on boot. I believe it gets installed by default (in the
> >> "make install" step), but it still needs to be configured with
> >> "chkconfig --add perceus".
> >
> > I don't think this is necessary. Perceus manages the init scripts for
> > most distributions properly.
>
> I just thought it would be good to mention it. I do not remember if
> perceus did this automatically or not...
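For anyone who does want to register the script by hand on a RHEL-style
system, it is the usual service dance, and it is harmless to repeat if
Perceus already took care of it:

  chkconfig --add perceus
  chkconfig perceus on
  service perceus status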
> >> This will start the perceus daemons, including the dnsmasq which
> >> provides dhcp for the slave nodes. No other dhcp server can run, but
> >> this one can be configured to provide other network configuration for
> >> additional NICs.
> >
> > Yes, this works fine in most cases. And dnsmasq is pretty customizable
> > at that.
> > http://www.thekelleys.org.uk/dnsmasq/docs/dnsmasq-man.html
> >
> > Unless you have special needs like having all the compute nodes
> > accessible directly from a public network (yes, I have heard that
> > before!), it should work for you.
>
> Ugh! I believe in compute nodes needing an external fileserver, but
> not in direct access to the nodes from the outside.
>
> >> After rebooting, make sure perceus is running. Then run the commands:
> >>
> >> perceus module activate xcpu
> >> perceus module activate ipaddr
> >>
> >> In order to get static addresses assigned to the compute nodes
> >> (desirable), their addresses must be added to the /etc/hosts file,
> >> e.g.:
> >>
> >> 10.10.0.1 master
> >> 10.10.0.10 n0000
> >> 10.10.0.11 n0001
> >>
> >> You should be ready to start configuring the nodes at this stage.
> >> They must be set for pxe boot. All the necessary stuff for this is
> >> installed by perceus in /usr/local/var/lib/perceus. You boot your
> >> compute nodes in the order in which you want them named, starting, by
> >> default, as n0000. The first time they will be assigned an IP address
> >> from the dynamic range defined in the /etc/perceus/dnsmasq.conf file,
> >> but on reboot they will get the statically assigned address from the
> >> /etc/hosts file.
> >>
> >> By this stage you should have a usable xcpu cluster. You need to set
> >> up the groups and users using the xgroupset and xuserset commands. In
> >> order to get "proper" behaviour, in accordance with the version of
> >> sxcpu that you downloaded and built, you may need to update the xcpufs
> >> provided by perceus. This is done by statically linking the xcpufs
> >> daemon:
> >>
> >> In /usr/local/src/sxcpu/xcpufs (or wherever you installed the sxcpu
> >> sources) there is a script called LINKSTATIC. I am running on x86_64,
> >> so I modified it to read:
> >>
> >> [r...@dgk3 xcpufs]# cat LINKSTATIC
> >> #!/bin/sh
> >> echo This script is for linking statically on Linux.
> >> cc -static -o xcpufs.static -Wall -g -I ../include -DSYSNAME=Linux \
> >>    file.o pipe.o proc-Linux.o tspawn.o ufs.o xauth.o xcpufs.o -g \
> >>    -L../libstrutil -lstrutil -L../libspclient -lspclient \
> >>    -L../libspfs -lspfs -L../libxauth -lxauth -lcrypto /usr/lib64/libdl.a
> >>
> >> Note that it produces the "xcpufs.static" executable, and it looks for
> >> its libdl.a library in the /usr/lib64 directory. Then I copy the
> >> xcpufs.static executable to the location where perceus needs it:
> >
> > Rather than doing this manually, it's recommended to put the tarball
> > in the right place within Perceus and then "make -C 3rd_party/ xcpu"
> > to generate a new xcpufs. Perceus applies its own static-libs patch to
> > sxcpu and you don't have to worry about the multilib path.
>
> Ok, this is a good idea. I didn't know where perceus expected this.
> Looking at the makefile in the 3rd_party directory of perceus, it seems
> that one may need to change it to correspond to whatever version of
> sxcpu one provides. Should not be a big problem.
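Either way, before shipping the binary out to the nodes it is worth a
quick sanity check that it really is fully static; the node environment
is a bare busybox, so a dynamically linked xcpufs would be left without
its shared libraries. Something like this (output abbreviated and only
illustrative):

  $ file xcpufs.static
  xcpufs.static: ELF 64-bit LSB executable, x86-64, ... statically linked ...
  $ ldd xcpufs.static
          not a dynamic executable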
> >> cp /usr/local/src/sxcpu/xcpufs/xcpufs.static \
> >>    /usr/local/var/lib/perceus/modules/xcpu/xcpufs
> >>
> >> and on reboot the nodes will pick up the latest and greatest xcpufs.
> >>
> >> After this you can do, for example:
> >>
> >> xgroupset add -a -u
> >> xuserset add -a -u
> >>
> >> in order to add all the groups and all the users to the permitted user
> >> list on the nodes. You can then run anything on the nodes, e.g.
> >> "xrx -a date".
> >>
> >> Needless to say, this requires that the "statfs" daemon be running.
> >> You can verify this with the "xstat" command (see the configuration
> >> instructions for this above).
> >>
> >> Now, I don't know anything about IB, mainly because I have never had
> >> access to an IB-connected cluster. I have no idea if perceus can
> >> manage pxe booting over IB, but I suspect that if you have IP over IB
> >> then it should, for all intents and purposes, look like just another
> >> network interface to it (I could be utterly wrong on this, of
> >> course...).
> >
> > gPXE does have a working IB subsystem, but I am not sure which network
> > cards it supports.
> >
> >> However, if you need to configure a second interface, say for access
> >> to a fileserver on a separate network, then all you need to do is
> >> change the /etc/perceus/dnsmasq.conf and define the machines in there.
> >> Again, for static IP addresses on the second interface they need to
> >> be defined in /etc/hosts. Let me know if you would like details on
> >> how I did this. I then mounted my fileserver on the compute nodes by
> >> modifying the perceus xcpu startup script in
> >> /etc/perceus/nodescripts/init/all/05-xcpu.sh, so that the node gets a
> >> mount point and executes the nfs mount.
> >>
> >> Please let me know if/how this works for you. I hope it is complete...
> >> Best regards,
> >> Daniel
> >>
> >> p.s. Please feel free to modify this blurb and add it to the xcpu
> >> installation instructions. Greg from the perceus group is extremely
> >> helpful with any perceus issues.
> >
> > Yes, I will use this for the instructions on the wiki. Thanks.
> >
> > -- Abhishek
>
> Great! Is the wiki available for perusing yet?

Not yet. But let me know if you want to help in expanding some sections,
and I'll set up wiki access for you. Thanks.

> Daniel
>
> >> On Fri, Dec 12, 2008 at 3:41 PM, Chris Kinney <[email protected]> wrote:
> >> > Hey Daniel,
> >> >
> >> > My name is Chris Kinney, I'm Ron's intern. I was wondering if you
> >> > could show me how you're booting your perceus setup. We're in need
> >> > of perceus being able to work with IB, and from what we've seen,
> >> > the capsules just don't work with it. Whatever you've got that can
> >> > help would be great! Thanks again!
> >> >
> >> > -Chris
> >> >
> >> > ron minnich wrote:
> >> >>
> >> >> On Thu, Dec 11, 2008 at 5:48 PM, Daniel Gruner <[email protected]> wrote:
> >> >>
> >> >>> Yeah, you don't even need a VNFS image in order to boot into xcpu!
> >> >>> All you need is the initial busybox provided by perceus, plus
> >> >>> activating the perceus xcpu module with "perceus module activate
> >> >>> xcpu". This will give you a minimal xcpu node, and you can then
> >> >>> add remote filesystems for mounting if necessary. You only really
> >> >>> need user files, if at all, since the executables that you run
> >> >>> with xrx take the necessary libraries along automagically.
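On the "add remote filesystems" point: the change Daniel mentions
further up is presumably just a couple of lines appended to
/etc/perceus/nodescripts/init/all/05-xcpu.sh, along these lines (the
server address and export path are placeholders; adjust them for your
own fileserver):

  # create a mount point and NFS-mount the shared area from the fileserver
  mkdir -p /home
  mount -t nfs -o nolock 10.10.1.1:/export/home /home

The nolock option avoids needing rpc.statd, which the minimal busybox
environment on the nodes will not be running.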
> >> >> daniel, this is an excellent point, and since we are having a
> >> >> terrible time getting our vnfs capsules to work with ib ...
> >> >>
> >> >> can you give us a quick writeup for how you set this up so we can
> >> >> use it too.
> >> >>
> >> >> thanks
> >> >>
> >> >> ron
