I would guess you are right; they have probably broken backwards
compatibility in that version. I'll need to get the latest CUDA installed
and try it. In the meantime, can you paste the output of nvcc --help on
your system, around this point:
--gpu-architecture <gpu architecture name> (-arch)
[...]
Allowed values for this option:
'compute_10','compute_11','compute_12',
'compute_13','compute_20','compute_30','sm_10','sm_11','sm_12','sm_13',
'sm_20','sm_21','sm_22','sm_23','sm_30'.
Probably I should be using --gpu-code instead.
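
For reference, the kind of invocation I have in mind (a guess on my part,
not yet tested against 4.1, and using only values from the listing above;
for your M2070s the Fermi targets are the relevant ones) would be:

    nvcc -arch=compute_20 -code=sm_20,sm_21 -c CUDAKernelTest.cu

i.e. name a virtual architecture with -arch (--gpu-architecture) and the
real targets with -code (--gpu-code), rather than hard-coding sm_30.
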
Thanks for the report.
On Mon, Feb 13, 2012 at 1:05 PM, David E Hudak <dhu...@osc.edu> wrote:
> OK, I followed these instructions:
> http://x10-lang.org/documentation/practical-x10-programming/x10-on-gpus
>
> …and got CUDATopology to work:
> 1004 x10c++ -O -NO_CHECKS -x10rt mpi CUDATopology.x10 -o CUDATopology
> 1005 X10RT_ACCELS=ALL mpiexec -pernode ./CUDATopology
> dhudak@n0282 1012%> !1005
> X10RT_ACCELS=ALL mpiexec -pernode ./CUDATopology
> Dumping places at place: Place(0)
> Place: Place(0)
> Parent: Place(0)
> NumChildren: 2
> Is a Host place
> Child 0: Place(2)
> Parent: Place(0)
> NumChildren: 0
> Is a CUDA place
> Child 1: Place(3)
> Parent: Place(0)
> NumChildren: 0
> Is a CUDA place
> Place: Place(1)
> Parent: Place(1)
> NumChildren: 2
> Is a Host place
> Child 0: Place(4)
> Parent: Place(1)
> NumChildren: 0
> Is a CUDA place
> Child 1: Place(5)
> Parent: Place(1)
> NumChildren: 0
> Is a CUDA place
>
> Dumping places at place: Place(1)
> Place: Place(0)
> Parent: Place(0)
> NumChildren: 2
> Is a Host place
> Child 0: Place(2)
> Parent: Place(0)
> NumChildren: 0
> Is a CUDA place
> Child 1: Place(3)
> Parent: Place(0)
> NumChildren: 0
> Is a CUDA place
> Place: Place(1)
> Parent: Place(1)
> NumChildren: 2
> Is a Host place
> Child 0: Place(4)
> Parent: Place(1)
> NumChildren: 0
> Is a CUDA place
> Child 1: Place(5)
> Parent: Place(1)
> NumChildren: 0
> Is a CUDA place
>
> …but other examples are not building. I am assuming it's the new version
> of X10 along with the new version of CUDA, but I figured I would pass it
> along to the mailing list.
>
> dhudak@oak-rw 999%> module list
> Currently Loaded Modules:
> 1) torque/2.5.10   2) moab/6.1.4    3) modules/1.0   4) gnu/4.4.5
> 5) mvapich2/1.7    6) mkl/10.3.0    7) cuda/4.1.28   8) java/1.7.0_02
> 9) x10/2.2.2-cuda
> dhudak@oak-rw 1000%> which nvcc
> /usr/local/cuda/4.1.28/bin/nvcc
> dhudak@oak-rw 1001%> x10c++ -O -NO_CHECKS -x10rt mpi CUDA3DFD.x10 -o
> CUDA3DFD
>
> x10c++: nvcc fatal : Value 'sm_30' is not defined for option
> 'gpu-architecture'
> x10c++: Non-zero return code: 255
> x10c++: Found @CUDA annotation, but not compiling for GPU because nvcc
> could not be run (check your $PATH).
> dhudak@oak-rw 1002%>
> dhudak@oak-rw 1002%> x10c++ -O -NO_CHECKS -x10rt mpi CUDAKernelTest.x10
> -o CUDAKernelTest
> x10c++: ./CUDAKernelTest.cu(56): Warning: Cannot tell what pointer points
> to, assuming global memory space
> ./CUDAKernelTest.cu(56): Warning: Cannot tell what pointer points to,
> assuming global memory space
> x10c++: ./CUDAKernelTest.cu(56): Warning: Cannot tell what pointer points
> to, assuming global memory space
> ./CUDAKernelTest.cu(56): Warning: Cannot tell what pointer points to,
> assuming global memory space
> x10c++: ./CUDAKernelTest.cu(56): Warning: Cannot tell what pointer points
> to, assuming global memory space
> ./CUDAKernelTest.cu(56): Warning: Cannot tell what pointer points to,
> assuming global memory space
> x10c++: ./CUDAKernelTest.cu(56): Warning: Cannot tell what pointer points
> to, assuming global memory space
> ./CUDAKernelTest.cu(56): Warning: Cannot tell what pointer points to,
> assuming global memory space
> x10c++: nvcc fatal : Value 'sm_30' is not defined for option
> 'gpu-architecture'
> x10c++: Non-zero return code: 255
> x10c++: Found @CUDA annotation, but not compiling for GPU because nvcc
> could not be run (check your $PATH).
> dhudak@oak-rw 1003%> which nvcc
> /usr/local/cuda/4.1.28/bin/nvcc
>
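A quick way to double-check what that particular nvcc accepts, while I
sort out the x10c++ side, is to grep its help text or try a throwaway
compile (t.cu below is just a scratch file):

    nvcc --help | grep -A8 'gpu-architecture'
    echo '__global__ void k(){}' > t.cu && nvcc -arch=sm_20 -c t.cu

If sm_20 goes through and sm_30 is rejected, that confirms the 4.1 toolkit
simply predates sm_30, and the "check your $PATH" part of the message is a
red herring.
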
> Regards,
> Dave
>
> On Feb 11, 2012, at 5:09 PM, David E Hudak wrote:
>
> > Hi All,
> >
> > I have a code sample that I want to try on our new cluster. These are
> dual-socket nodes with dual-M2070 cards connected by QDR IB.
> >
> > I configured my local environment and built the code as follows:
> > svn co https://x10.svn.sourceforge.net/svnroot/x10/tags/SF_RELEASE_2_2_2 x10-2.2.2
> > cd x10-2.2.2/x10.dist
> > ant -DNO_CHECKS=true -Doptimize=true -DX10RT_MPI=true -DX10RT_CUDA=true dist
> >
> > Things build.
> >
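Side note on the build itself: after configuring with -DX10RT_CUDA=true,
x10.dist/lib should contain a CUDA flavour of x10rt (something like
libx10rt_cuda); a quick `ls x10.dist/lib | grep -i cuda` confirms the
backend actually got built. The CUDATopology output further down shows it
did here, so this is just for completeness.
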
> > And then I get an interactive PBS job on 2 nodes. I would like to launch
> > the program with 2 X10 places per node, with each X10 place having one
> > child place for a GPU. Does anyone have the incantation that would launch
> > this configuration?
> >
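On the launch question, a sketch of what I would try, with the caveat that
I have not verified it on your system: ask mpiexec for two ranks per node
instead of one (common spellings are -npernode 2 or -ppn 2, depending on
which mpiexec this is):

    X10RT_ACCELS=ALL mpiexec -npernode 2 ./CUDATopology

That should give two host places per node. With ALL, each of those places
will still see both M2070s as children; to get exactly one GPU child per
place you would need a per-rank X10RT_ACCELS value, e.g. via a small
wrapper script keyed off whatever local-rank variable your launcher
exports (MV2_COMM_WORLD_LOCAL_RANK below is an assumption about MVAPICH2):

    #!/bin/sh
    # pick one GPU per local MPI rank; adjust the variable name to your MPI
    export X10RT_ACCELS=CUDA${MV2_COMM_WORLD_LOCAL_RANK:-0}
    exec ./CUDATopology "$@"

The per-device syntax for X10RT_ACCELS (CUDA0, CUDA1, ...) is from memory
of the x10-on-gpus page, so please double-check it there.
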
> > By the way, is there a hostname function in X10 I can call to verify
> which node I am running on?
> >
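On the hostname question: I am not aware of a direct hostname call in the
2.2.2 standard library off the top of my head. As a stopgap you can get
the same information from the launcher side, e.g.

    mpiexec -pernode sh -c 'hostname; exec ./CUDATopology'

which prints each node's name before handing off to the program, with no
X10 changes needed.
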
> > So, first I tried...
> >
> > dhudak@n0282 1021%> mpiexec -pernode ./CUDATopology
> > Dumping places at place: Place(0)
> > Place: Place(0)
> > Parent: Place(0)
> > NumChildren: 0
> > Is a Host place
> >
> > Dumping places at place: Place(0)
> > Place: Place(0)
> > Parent: Place(0)
> > NumChildren: 0
> > Is a Host place
> >
> > …and it ran two copies of the program, one on each of the two nodes. (I
> > verified this by running top on the other node and seeing a CUDATopology
> > process running.)
> >
> > If I add the X10RT_ACCELS variable, each copy finds the two cards:
> >
> > dhudak@n0282 1012%> X10RT_ACCELS=ALL mpiexec -pernode ./CUDATopology
> > Dumping places at place: Place(0)
> > Place: Place(0)
> > Parent: Place(0)
> > NumChildren: 2
> > Is a Host place
> > Child 0: Place(1)
> > Parent: Place(0)
> > NumChildren: 0
> > Is a CUDA place
> > Child 1: Place(2)
> > Parent: Place(0)
> > NumChildren: 0
> > Is a CUDA place
> >
> > Dumping places at place: Place(0)
> > Place: Place(0)
> > Parent: Place(0)
> > NumChildren: 2
> > Is a Host place
> > Child 0: Place(1)
> > Parent: Place(0)
> > NumChildren: 0
> > Is a CUDA place
> > Child 1: Place(2)
> > Parent: Place(0)
> > NumChildren: 0
> > Is a CUDA place
> >
> > OK, so I wanted place 1 on one node and place 2 on another node:
> >
> > dhudak@n0282 1029%> X10RT_ACCELS=ALL X10_NPLACES=2 mpiexec -pernode
> ./CUDATopology
> > Dumping places at place: Place(0)
> > Place: Place(0)
> > Parent: Place(0)
> > NumChildren: 2
> > Is a Host place
> > Child 0: Place(2)
> > Parent: Place(0)
> > NumChildren: 0
> > Is a CUDA place
> > Child 1: Place(3)
> > Parent: Place(0)
> > NumChildren: 0
> > Is a CUDA place
> > Place: Place(1)
> > Parent: Place(1)
> > NumChildren: 2
> > Is a Host place
> > Child 0: Place(4)
> > Parent: Place(1)
> > NumChildren: 0
> > Is a CUDA place
> > Child 1: Place(5)
> > Parent: Place(1)
> > NumChildren: 0
> > Is a CUDA place
> >
> > Dumping places at place: Place(1)
> > Place: Place(0)
> > Parent: Place(0)
> > NumChildren: 2
> > Is a Host place
> > Child 0: Place(2)
> > Parent: Place(0)
> > NumChildren: 0
> > Is a CUDA place
> > Child 1: Place(3)
> > Parent: Place(0)
> > NumChildren: 0
> > Is a CUDA place
> > Place: Place(1)
> > Parent: Place(1)
> > NumChildren: 2
> > Is a Host place
> > Child 0: Place(4)
> > Parent: Place(1)
> > NumChildren: 0
> > Is a CUDA place
> > Child 1: Place(5)
> > Parent: Place(1)
> > NumChildren: 0
> > Is a CUDA place
> >
> > Dumping places at place: Place(0)
> > Place: Place(0)
> > Parent: Place(0)
> > NumChildren: 2
> > Is a Host place
> > Child 0: Place(2)
> > Parent: Place(0)
> > NumChildren: 0
> > Is a CUDA place
> > Child 1: Place(3)
> > Parent: Place(0)
> > NumChildren: 0
> > Is a CUDA place
> > Place: Place(1)
> > Parent: Place(1)
> > NumChildren: 2
> > Is a Host place
> > Child 0: Place(4)
> > Parent: Place(1)
> > NumChildren: 0
> > Is a CUDA place
> > Child 1: Place(5)
> > Parent: Place(1)
> > NumChildren: 0
> > Is a CUDA place
> >
> > Dumping places at place: Place(1)
> > Place: Place(0)
> > Parent: Place(0)
> > NumChildren: 2
> > Is a Host place
> > Child 0: Place(2)
> > Parent: Place(0)
> > NumChildren: 0
> > Is a CUDA place
> > Child 1: Place(3)
> > Parent: Place(0)
> > NumChildren: 0
> > Is a CUDA place
> > Place: Place(1)
> > Parent: Place(1)
> > NumChildren: 2
> > Is a Host place
> > Child 0: Place(4)
> > Parent: Place(1)
> > NumChildren: 0
> > Is a CUDA place
> > Child 1: Place(5)
> > Parent: Place(1)
> > NumChildren: 0
> > Is a CUDA place
> >
> > Does anyone have any advice?
> >
> > Thanks,
> > Dave
> > ---
> > David E. Hudak, Ph.D. dhu...@osc.edu
> > Program Director, HPC Engineering
> > Ohio Supercomputer Center
> > http://www.osc.edu
>
> ---
> David E. Hudak, Ph.D. dhu...@osc.edu
> Program Director, HPC Engineering
> Ohio Supercomputer Center
> http://www.osc.edu
_______________________________________________
X10-users mailing list
X10-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/x10-users