Ok, some progress.  I am now able to run things like:

[EMAIL PROTECTED] examples]# xmvapich n0000,n0001 ./cpi
Process 0 of 2 is on n0000
pi is approximately 3.1415926544231318, Error is 0.0000000008333387
wall clock time = 0.000757
Process 1 of 2 is on n0001

as long as the two nodes specified are different.  If, however, I want
to run two processes on the same node, e.g.:

[EMAIL PROTECTED] examples]# xmvapich n0000,n0000 ./cpi
Process 1 of 2 is on n0000
Process 0 of 2 is on n0000

it hangs as before.  Here is the debugging trace:

[EMAIL PROTECTED] examples]# xmvapich -D n0000,n0000 ./cpi
-pmi-> 0: cmd=initack pmiid=1
<-pmi- 0: cmd=initack rc=0
<-pmi- 0: cmd=set rc=0 size=2
<-pmi- 0: cmd=set rc=0 rank=0
<-pmi- 0: cmd=set rc=0 debug=0
-pmi-> 1: cmd=initack pmiid=1
<-pmi- 1: cmd=initack rc=0
<-pmi- 1: cmd=set rc=0 size=2
<-pmi- 1: cmd=set rc=0 rank=1
<-pmi- 1: cmd=set rc=0 debug=0
-pmi-> 0: cmd=init pmi_version=1 pmi_subversion=1
<-pmi- 0: cmd=response_to_init rc=0
-pmi-> 0: cmd=get_maxes
<-pmi- 0: cmd=maxes rc=0 kvsname_max=64 keylen_max=64 vallen_max=64
-pmi-> 1: cmd=init pmi_version=1 pmi_subversion=1
<-pmi- 1: cmd=response_to_init rc=0
-pmi-> 1: cmd=get_maxes
<-pmi- 1: cmd=maxes rc=0 kvsname_max=64 keylen_max=64 vallen_max=64
-pmi-> 0: cmd=get_appnum
<-pmi- 0: cmd=appnum rc=0 appnum=0
-pmi-> 0: cmd=get_my_kvsname
<-pmi- 0: cmd=my_kvsname rc=0 kvsname=kvs_0
-pmi-> 1: cmd=get_appnum
<-pmi- 1: cmd=appnum rc=0 appnum=0
-pmi-> 1: cmd=get_my_kvsname
<-pmi- 1: cmd=my_kvsname rc=0 kvsname=kvs_0
-pmi-> 0: cmd=get_my_kvsname
<-pmi- 0: cmd=my_kvsname rc=0 kvsname=kvs_0
-pmi-> 1: cmd=get_my_kvsname
<-pmi- 1: cmd=my_kvsname rc=0 kvsname=kvs_0
-pmi-> 0: cmd=put kvsname=kvs_0 key=P0-businesscard
value=port#45956$description#n0000$ifname#10.10.0.10$
<-pmi- 0: cmd=put_result rc=0
-pmi-> 1: cmd=put kvsname=kvs_0 key=P1-businesscard
value=port#38363$description#n0000$ifname#10.10.0.10$
<-pmi- 1: cmd=put_result rc=0
-pmi-> 0: cmd=barrier_in
-pmi-> 1: cmd=barrier_in
<-pmi- 0: cmd=barrier_out rc=0
<-pmi- 1: cmd=barrier_out rc=0
-pmi-> 0: cmd=get kvsname=kvs_0 key=P1-businesscard
<-pmi- 0: cmd=get_result rc=0
value=port#38363$description#n0000$ifname#10.10.0.10$
Process 1 of 2 is on n0000
Process 0 of 2 is on n0000
[EMAIL PROTECTED] examples]#
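
For my own notes, here is a rough sketch (plain C, not the actual MPICH/MVAPICH
source) of what the trace suggests the library does when it builds the business
card: call gethostname(), resolve the result, and publish port/description/ifname.
With the hostname unset, that is exactly how the earlier runs ended up advertising
description#(none)$ and no ifname at all.  The port number below is just a
placeholder for whatever the process actually listens on.

/* Illustrative only -- mimics the business-card construction implied by the
 * PMI trace; this is not code taken from MPICH or MVAPICH. */
#include <stdio.h>
#include <string.h>
#include <unistd.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <netdb.h>

int main(void)
{
    char host[256], card[512], ip[INET_ADDRSTRLEN] = "unknown";
    struct addrinfo hints, *res = NULL;

    /* "description" appears to come straight from the node's hostname */
    if (gethostname(host, sizeof(host)) != 0)
        strcpy(host, "(none)");

    /* "ifname" appears to be that name resolved to an address (files/dns) */
    memset(&hints, 0, sizeof(hints));
    hints.ai_family = AF_INET;
    if (getaddrinfo(host, NULL, &hints, &res) == 0 && res != NULL) {
        struct sockaddr_in *sin = (struct sockaddr_in *)res->ai_addr;
        inet_ntop(AF_INET, &sin->sin_addr, ip, sizeof(ip));
        freeaddrinfo(res);
    }

    snprintf(card, sizeof(card), "port#%d$description#%s$ifname#%s$",
             45956 /* placeholder port */, host, ip);
    printf("%s\n", card);
    return 0;
}

Running something like that on a node via xrx should print the same kind of
string that shows up in the put lines above, which is why the unset hostnames
broke the connection setup.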



On 11/5/08, Daniel Gruner <[EMAIL PROTECTED]> wrote:
> That is what I was going for...  It returns (none).
>
>  I am about to run the test after explicitly setting up the hostnames
>  of the nodes.
>  Does xmvapich probe the nodes for their names?  How does it resolve
>  their addresses?
>
>
>  Daniel
>
>  On 11/5/08, Latchesar Ionkov <[EMAIL PROTECTED]> wrote:
>  >
>  >  I guess that is the problem. What do you see if you do:
>  >
>  >         xrx n0000 hostname
>  >
>  >  Thanks,
>  >         Lucho
>  >
>  >
>  >  On Nov 5, 2008, at 12:02 PM, Daniel Gruner wrote:
>  >
>  >
>  > >
>  > > Hi Lucho,
>  > >
>  > > I am provisioning with perceus, and in order to get static node
>  > > addresses I have entries in /etc/hosts that define them, e.g.:
>  > >
>  > > 10.10.0.10      n0000
>  > > 10.10.0.11      n0001
>  > > 10.10.0.12      n0002
>  > >
>  > > My /etc/nsswitch.conf is set to resolve hosts like:
>  > >
>  > > hosts:      files dns
>  > >
>  > > One thing I have noticed is that the nodes do not have their own
>  > > hostname defined after provisioning.  Could this be the problem?
>  > >
>  > > Thanks,
>  > > Daniel
>  > > On 11/5/08, Latchesar Ionkov <[EMAIL PROTECTED]> wrote:
>  > >
>  > > >
>  > > > Hi,
>  > > >
>  > > > It looks like the MPI processes on the nodes don't send a correct IP
>  > > > address to connect to. In your case, they send:
>  > > >
>  > > > >      -pmi-> 0: cmd=put kvsname=kvs_0 key=P0-businesscard
>  > > > > value=port#38675$description#(none)$
>  > > >
>  > > > And when I run it, I see:
>  > > >
>  > > >       -pmi-> 0: cmd=put kvsname=kvs_0 key=P0-businesscard
>  > > > value=port#34283$description#m10$ifname#192.168.1.110$
>  > > >
>  > > > I tried to figure out how mpich picks the IP address, and it looks like
>  > > > it uses the hostname on the node for that. Do you have the node names
>  > > > set up correctly?
>  > > >
>  > > > Thanks,
>  > > >       Lucho
>  > > >
>  > > > On Nov 4, 2008, at 1:31 PM, Daniel Gruner wrote:
>  > > >
>  > > >
>  > > >
>  > > > >
>  > > > > Hi Lucho,
>  > > > >
>  > > > > Did you have a chance to look at this?  Needless to say it has been
>  > > > > quite frustrating, and perhaps it has to do with the particular Linux
>  > > > > distribution you run.  I am running on a RHEL5.2 system with kernel
>  > > > > 2.6.26, and the compilation of mpich2 or mvapich2 is totally vanilla.
>  > > > > My network is just GigE.  xmvapich works for a single process, but it
>  > > > > always hangs for more than one, regardless of whether they are on the
>  > > > > same node or separate nodes, and independently of the example program
>  > > > > (hellow, cpi, etc).  Other than some administration issues (like the
>  > > > > authentication stuff I have been exchanging with Abhishek about), this
>  > > > > is the only real obstacle to making my clusters suitable for
>  > > > > production...
>  > > > >
>  > > > > Thanks,
>  > > > > Daniel
>  > > > >
>  > > > > ---------- Forwarded message ----------
>  > > > > From: Daniel Gruner <[EMAIL PROTECTED]>
>  > > > > Date: Oct 8, 2008 2:49 PM
>  > > > > Subject: Re: [xcpu] Re: (s)xcpu and MPI
>  > > > > To: [email protected]
>  > > > >
>  > > > >
>  > > > > Hi Lucho,
>  > > > >
>  > > > > Here is the output (two nodes in the cluster):
>  > > > >
>  > > > > [EMAIL PROTECTED] examples]# xmvapich -D -a ./hellow
>  > > > > -pmi-> 0: cmd=initack pmiid=1
>  > > > > <-pmi- 0: cmd=initack rc=0
>  > > > > <-pmi- 0: cmd=set rc=0 size=2
>  > > > > <-pmi- 0: cmd=set rc=0 rank=0
>  > > > > <-pmi- 0: cmd=set rc=0 debug=0
>  > > > > -pmi-> 0: cmd=init pmi_version=1 pmi_subversion=1
>  > > > > <-pmi- 0: cmd=response_to_init rc=0
>  > > > > -pmi-> 0: cmd=get_maxes
>  > > > > <-pmi- 0: cmd=maxes rc=0 kvsname_max=64 keylen_max=64 vallen_max=64
>  > > > > -pmi-> 0: cmd=get_appnum
>  > > > > <-pmi- 0: cmd=appnum rc=0 appnum=0
>  > > > > -pmi-> 1: cmd=initack pmiid=1
>  > > > > <-pmi- 1: cmd=initack rc=0
>  > > > > <-pmi- 1: cmd=set rc=0 size=2
>  > > > > <-pmi- 1: cmd=set rc=0 rank=1
>  > > > > <-pmi- 1: cmd=set rc=0 debug=0
>  > > > > -pmi-> 0: cmd=get_my_kvsname
>  > > > > <-pmi- 0: cmd=my_kvsname rc=0 kvsname=kvs_0
>  > > > > -pmi-> 0: cmd=get_my_kvsname
>  > > > > <-pmi- 0: cmd=my_kvsname rc=0 kvsname=kvs_0
>  > > > > -pmi-> 1: cmd=init pmi_version=1 pmi_subversion=1
>  > > > > <-pmi- 1: cmd=response_to_init rc=0
>  > > > > -pmi-> 1: cmd=get_maxes
>  > > > > <-pmi- 1: cmd=maxes rc=0 kvsname_max=64 keylen_max=64 vallen_max=64
>  > > > > -pmi-> 1: cmd=get_appnum
>  > > > > <-pmi- 1: cmd=appnum rc=0 appnum=0
>  > > > > -pmi-> 0: cmd=put kvsname=kvs_0 key=P0-businesscard
>  > > > > value=port#38675$description#(none)$
>  > > > > <-pmi- 0: cmd=put_result rc=0
>  > > > > -pmi-> 1: cmd=get_my_kvsname
>  > > > > <-pmi- 1: cmd=my_kvsname rc=0 kvsname=kvs_0
>  > > > > -pmi-> 0: cmd=barrier_in
>  > > > > -pmi-> 1: cmd=get_my_kvsname
>  > > > > <-pmi- 1: cmd=my_kvsname rc=0 kvsname=kvs_0
>  > > > > -pmi-> 1: cmd=put kvsname=kvs_0 key=P1-businesscard
>  > > > > value=port#38697$description#(none)$
>  > > > > <-pmi- 1: cmd=put_result rc=0
>  > > > > -pmi-> 1: cmd=barrier_in
>  > > > > <-pmi- 0: cmd=barrier_out rc=0
>  > > > > <-pmi- 1: cmd=barrier_out rc=0
>  > > > > -pmi-> 0: cmd=get kvsname=kvs_0 key=P1-businesscard
>  > > > > <-pmi- 0: cmd=get_result rc=0
>  > > > > value=port#38697$description#(none)$
>  > > > > -pmi-> 1: cmd=get kvsname=kvs_0 key=P0-businesscard
>  > > > > <-pmi- 1: cmd=get_result rc=0
>  > > > > value=port#38675$description#(none)$
>  > > > >
>  > > > > Hello world from process 1 of 2
>  > > > > Hello world from process 0 of 2
>  > > > >
>  > > > >
>  > > > > It looks like it ran, but then it hangs and never returns.
>  > > > >
>  > > > > If I try to run another example (cpi), here is the output from the run
>  > > > > with a single process, and then with two:
>  > > > >
>  > > > > [EMAIL PROTECTED] examples]# xmvapich n0001 ./cpi
>  > > > > Process 0 of 1 is on (none)
>  > > > > pi is approximately 3.1415926544231341, Error is 0.0000000008333410
>  > > > > wall clock time = 0.000313
>  > > > > [EMAIL PROTECTED] examples]# xmvapich -D n0001 ./cpi
>  > > > > -pmi-> 0: cmd=initack pmiid=1
>  > > > > <-pmi- 0: cmd=initack rc=0
>  > > > > <-pmi- 0: cmd=set rc=0 size=1
>  > > > > <-pmi- 0: cmd=set rc=0 rank=0
>  > > > > <-pmi- 0: cmd=set rc=0 debug=0
>  > > > > -pmi-> 0: cmd=init pmi_version=1 pmi_subversion=1
>  > > > > <-pmi- 0: cmd=response_to_init rc=0
>  > > > > -pmi-> 0: cmd=get_maxes
>  > > > > <-pmi- 0: cmd=maxes rc=0 kvsname_max=64 keylen_max=64 vallen_max=64
>  > > > > -pmi-> 0: cmd=get_appnum
>  > > > > <-pmi- 0: cmd=appnum rc=0 appnum=0
>  > > > > -pmi-> 0: cmd=get_my_kvsname
>  > > > > <-pmi- 0: cmd=my_kvsname rc=0 kvsname=kvs_0
>  > > > > -pmi-> 0: cmd=get_my_kvsname
>  > > > > <-pmi- 0: cmd=my_kvsname rc=0 kvsname=kvs_0
>  > > > > -pmi-> 0: cmd=put kvsname=kvs_0 key=P0-businesscard
>  > > > > value=port#48513$description#(none)$
>  > > > > <-pmi- 0: cmd=put_result rc=0
>  > > > > -pmi-> 0: cmd=barrier_in
>  > > > > <-pmi- 0: cmd=barrier_out rc=0
>  > > > > -pmi-> 0: cmd=finalize
>  > > > > <-pmi- 0: cmd=finalize_ack rc=0
>  > > > > Process 0 of 1 is on (none)
>  > > > > pi is approximately 3.1415926544231341, Error is 0.0000000008333410
>  > > > > wall clock time = 0.000332
>  > > > > [EMAIL PROTECTED] examples]
>  > > > >
>  > > > > normal termination.
>  > > > >
>  > > > > [EMAIL PROTECTED] examples]# xmvapich -D n0000,n0001 ./cpi
>  > > > > -pmi-> 0: cmd=initack pmiid=1
>  > > > > <-pmi- 0: cmd=initack rc=0
>  > > > > <-pmi- 0: cmd=set rc=0 size=2
>  > > > > <-pmi- 0: cmd=set rc=0 rank=0
>  > > > > <-pmi- 0: cmd=set rc=0 debug=0
>  > > > > -pmi-> 0: cmd=init pmi_version=1 pmi_subversion=1
>  > > > > <-pmi- 0: cmd=response_to_init rc=0
>  > > > > -pmi-> 0: cmd=get_maxes
>  > > > > <-pmi- 0: cmd=maxes rc=0 kvsname_max=64 keylen_max=64 vallen_max=64
>  > > > > -pmi-> 0: cmd=get_appnum
>  > > > > <-pmi- 0: cmd=appnum rc=0 appnum=0
>  > > > > -pmi-> 0: cmd=get_my_kvsname
>  > > > > <-pmi- 0: cmd=my_kvsname rc=0 kvsname=kvs_0
>  > > > > -pmi-> 1: cmd=initack pmiid=1
>  > > > > <-pmi- 1: cmd=initack rc=0
>  > > > > <-pmi- 1: cmd=set rc=0 size=2
>  > > > > <-pmi- 1: cmd=set rc=0 rank=1
>  > > > > <-pmi- 1: cmd=set rc=0 debug=0
>  > > > > -pmi-> 0: cmd=get_my_kvsname
>  > > > > <-pmi- 0: cmd=my_kvsname rc=0 kvsname=kvs_0
>  > > > > -pmi-> 1: cmd=init pmi_version=1 pmi_subversion=1
>  > > > > <-pmi- 1: cmd=response_to_init rc=0
>  > > > > -pmi-> 1: cmd=get_maxes
>  > > > > <-pmi- 1: cmd=maxes rc=0 kvsname_max=64 keylen_max=64 vallen_max=64
>  > > > > -pmi-> 0: cmd=put kvsname=kvs_0 key=P0-businesscard
>  > > > > value=port#45645$description#(none)$
>  > > > > <-pmi- 0: cmd=put_result rc=0
>  > > > > -pmi-> 1: cmd=get_appnum
>  > > > > <-pmi- 1: cmd=appnum rc=0 appnum=0
>  > > > > -pmi-> 0: cmd=barrier_in
>  > > > > -pmi-> 1: cmd=get_my_kvsname
>  > > > > <-pmi- 1: cmd=my_kvsname rc=0 kvsname=kvs_0
>  > > > > -pmi-> 1: cmd=get_my_kvsname
>  > > > > <-pmi- 1: cmd=my_kvsname rc=0 kvsname=kvs_0
>  > > > > -pmi-> 1: cmd=put kvsname=kvs_0 key=P1-businesscard
>  > > > > value=port#53467$description#(none)$
>  > > > > <-pmi- 1: cmd=put_result rc=0
>  > > > > -pmi-> 1: cmd=barrier_in
>  > > > > <-pmi- 0: cmd=barrier_out rc=0
>  > > > > <-pmi- 1: cmd=barrier_out rc=0
>  > > > > -pmi-> 0: cmd=get kvsname=kvs_0 key=P1-businesscard
>  > > > > <-pmi- 0: cmd=get_result rc=0
>  > > > > value=port#53467$description#(none)$
>  > > > >
>  > > > > Process 0 of 2 is on (none)
>  > > > > Process 1 of 2 is on (none)
>  > > > >
>  > > > > hung processes....
>  > > > >
>  > > > >
>  > > > > Daniel
>  > > > >
>  > > > >
>  > > > > On Wed, Oct 8, 2008 at 3:23 PM, Latchesar Ionkov <[EMAIL PROTECTED]> wrote:
>  > > > >
>  > > > > > I can't replicate it, it is working fine here :(
>  > > > > > Can you please try xmvapich again with -D option and cut&paste the output?
>  > > > > >
>  > > > > > Thanks,
>  > > > > >    Lucho
>  > > > > >
>  > > > > > On Oct 6, 2008, at 2:51 PM, Daniel Gruner wrote:
>  > > > > >
>  > > > > >
>  > > > > >
>  > > > > > >
>  > > > > > > I just compiled mpich2-1.1.0a1, and tested it, with the same result as
>  > > > > > > with mvapich.  Again I had to do the configure with
>  > > > > > > --with-device=ch3:sock, since otherwise the runtime complains that it
>  > > > > > > can't allocate shared memory or some such thing.  When I run a single
>  > > > > > > process using xmvapich it completes fine.  However when running two or
>  > > > > > > more it hangs.  This is not surprising as it should be the same as
>  > > > > > > mvapich when running over regular TCP/IP on GigE rather than a special
>  > > > > > > interconnect.
>  > > > > > >
>  > > > > > > [EMAIL PROTECTED] examples]# ./hellow
>  > > > > > > Hello world from process 0 of 1
>  > > > > > > [EMAIL PROTECTED] examples]# xmvapich -a ./hellow
>  > > > > > > Hello world from process 1 of 2
>  > > > > > > Hello world from process 0 of 2
>  > > > > > > ^C
>  > > > > > > [EMAIL PROTECTED] examples]# xmvapich n0000 ./hellow
>  > > > > > > Hello world from process 0 of 1
>  > > > > > > [EMAIL PROTECTED] examples]# xmvapich n0001 ./hellow
>  > > > > > > Hello world from process 0 of 1
>  > > > > > > [EMAIL PROTECTED] examples]# xmvapich n0000,n0001 ./hellow
>  > > > > > > Hello world from process 1 of 2
>  > > > > > > Hello world from process 0 of 2
>  > > > > > > ^C
>  > > > > > >
>  > > > > > > Daniel
>  > > > > > >
>  > > > > > >
>  > > > > > >
>  > > > > > > On 10/6/08, Latchesar Ionkov <[EMAIL PROTECTED]> wrote:
>  > > > > > >
>  > > > > > > > I just compiled mpich2-1.1.0a1 and tried running hellow, everything
>  > > > > > > > looks fine:
>  > > > > > > >
>  > > > > > > > $ xmvapich m1,m2 ~/work/mpich2-1.1.0a1/build/examples/hellow
>  > > > > > > > Hello world from process 0 of 2
>  > > > > > > > Hello world from process 1 of 2
>  > > > > > > > $
>  > > > > > > >
>  > > > > > > > I didn't set any special parameters when compiling, just ./configure.
>  > > > > > > >
>  > > > > > > > Thanks,
>  > > > > > > >   Lucho
>  > > > > > > >
>  > > > > > > >
>  > > > > > > > On Oct 3, 2008, at 9:05 AM, Daniel Gruner wrote:
>  > > > > > > >
>  > > > > > > >
>  > > > > > > >
>  > > > > > > >
>  > > > > > > > >
>  > > > > > > > > Well, I just did the same, but with NO success...  The processes
>  > > > > > > > > are apparently started, run at the beginning, but then they hang
>  > > > > > > > > and do not finalize.  For example, running the "hellow" example
>  > > > > > > > > from the mvapich2 distribution:
>  > > > > > > > >
>  > > > > > > > > [EMAIL PROTECTED] examples]# cat hellow.c
>  > > > > > > > > /* -*- Mode: C; c-basic-offset:4 ; -*- */
>  > > > > > > > > /*
>  > > > > > > > >  *  (C) 2001 by Argonne National Laboratory.
>  > > > > > > > >  *      See COPYRIGHT in top-level directory.
>  > > > > > > > >  */
>  > > > > > > > >
>  > > > > > > > > #include <stdio.h>
>  > > > > > > > > #include "mpi.h"
>  > > > > > > > >
>  > > > > > > > > int main( int argc, char *argv[] )
>  > > > > > > > > {
>  > > > > > > > >     int rank;
>  > > > > > > > >     int size;
>  > > > > > > > >
>  > > > > > > > >     MPI_Init( 0, 0 );
>  > > > > > > > >     MPI_Comm_rank(MPI_COMM_WORLD, &rank);
>  > > > > > > > >     MPI_Comm_size(MPI_COMM_WORLD, &size);
>  > > > > > > > >     printf( "Hello world from process %d of %d\n", rank, size );
>  > > > > > > > >     MPI_Finalize();
>  > > > > > > > >     return 0;
>  > > > > > > > > }
>  > > > > > > > >
>  > > > > > > > > [EMAIL PROTECTED] examples]# make hellow
>  > > > > > > > > ../bin/mpicc  -I../src/include -I../src/include   -c hellow.c
>  > > > > > > > > ../bin/mpicc   -o hellow hellow.o
>  > > > > > > > > [EMAIL PROTECTED] examples]# ./hellow
>  > > > > > > > > Hello world from process 0 of 1
>  > > > > > > > >
>  > > > > > > > > (this was fine, just running on the master).  Running on the two
>  > > > > > > > > nodes requires that the xmvapich process be killed (ctrl-C):
>  > > > > > > > >
>  > > > > > > > > [EMAIL PROTECTED] examples]# xmvapich -ap ./hellow
>  > > > > > > > > n0000: Hello world from process 0 of 2
>  > > > > > > > > n0001: Hello world from process 1 of 2
>  > > > > > > > > [EMAIL PROTECTED] examples]#
>  > > > > > > > >
>  > > > > > > > > I have tried other codes, both in C and Fortran, with the same
>  > > > > > > > > behaviour.  I don't know if the issue is with xmvapich or with
>  > > > > > > > > mvapich2.  Communication is just GigE.
>  > > > > > > > >
>  > > > > > > > > Daniel
>  > > > > > > > >
>  > > > > > > > >
>  > > > > > > > > On 9/30/08, Abhishek Kulkarni <[EMAIL PROTECTED]> wrote:
>  > > > > > > > >
>  > > > > > > > >
>  > > > > > > > >
>  > > > > > > > > >
>  > > > > > > > > > Just gave this a quick try, and xmvapich seems to run MPI apps
>  > > > > > > > > > compiled with mpich2 without any issues.
>  > > > > > > > > >
>  > > > > > > > > > $ xmvapich -a ./mpihello
>  > > > > > > > > > blender: Hello World from process 0 of 1
>  > > > > > > > > > eregion: Hello World from process 0 of 1
>  > > > > > > > > >
>  > > > > > > > > > Hope that helps,
>  > > > > > > > > >
>  > > > > > > > > >
>  > > > > > > > > > -- Abhishek
>  > > > > > > > > >
>  > > > > > > > > >
>  > > > > > > > > > On Tue, 2008-09-30 at 17:02 +0200, Stefan Boresch wrote:
>  > > > > > > > > >
>  > > > > > > > > >
>  > > > > > > > > >
>  > > > > > > > > > > Thanks for the quick reply!
>  > > > > > > > > > >
>  > > > > > > > > > > On Tue, Sep 30, 2008 at 07:34:37AM -0700, ron minnich wrote:
>  > > > > > > > > > >
>  > > > > > > > > > > > On Tue, Sep 30, 2008 at 1:57 AM, stefan <[EMAIL PROTECTED]> wrote:
>  > > > > > > > > > > >
>  > > > > > > > > > > > > the state of xcpu support with MPI libraries -- either of the common
>  > > > > > > > > > > > > free ones is fine (e.g., openmpi, mpich2)
>  > > > > > > > > > > >
>  > > > > > > > > > > > there is now support for mpich2. openmpi is not supported as
>  > > > > > > > > > > > openmpi is (once again) in flux. it has been supported numerous
>  > > > > > > > > > > > times and has changed out from under us numerous times. I no
>  > > > > > > > > > > > longer use openmpi if I have a working mvapich or mpich available.
>  > > > > > > > > > >
>  > > > > > > > > > > I am slightly confused. I guess I had inferred the openmpi issues from
>  > > > > > > > > > > the various mailing lists. But I just looked at the latest mpich2
>  > > > > > > > > > > prerelease and found no mentioning of (s)xcpu(2). I thought that some
>  > > > > > > > > > > patches/support on the side of the mpi library are necessary (as, e.g.,
>  > > > > > > > > > > openmpi provides for bproc ...)  Or am I completely misunderstanding
>  > > > > > > > > > > something here, and this is somehow handled by xcpu itself ...
>  > > > > > > > > > > I guess there is some difference between
>  > > > > > > > > > >
>  > > > > > > > > > > xrx 192.168.19.2 /bin/date
>  > > > > > > > > > >
>  > > > > > > > > > > and
>  > > > > > > > > > >
>  > > > > > > > > > > xrx 192.168.19.2 <pathto>/mpiexec ...
>  > > > > > > > > > >
>  > > > > > > > > > > and the latter seems too magic to me to run out of the box (it sure
>  > > > > > > > > > > would be nice though ...)
>  > > > > > > > > > >
>  > > > > > > > > > > Sorry for making myself a nuisance -- thanks,
>  > > > > > > > > > >
>  > > > > > > > > > > Stefan Boresch
