D'oh!  Yep, I copied points.dat from the parent directory and it works:

dhu...@opt2393 915%> X10RT_ACCELS=ALL runx10 KMeansCUDA -i 50
X10_NPLACES not set.  Assuming 1 place, running locally
points: 100000 clusters: 8 dim: 4
Running using 2 GPUs.
GPU known as (Place 1) gets role 0 offset 0 len 50000
GPU known as (Place 2) gets role 1 offset 50000 len 50000
100000 8 4 0.073
kernel: 0.032
dma: 0.018
cpu: 0.021
reduce: 0.002

Thanks,
Dave

On Nov 23, 2010, at 5:05 PM, Igor Peshansky wrote:

> Dave,
> 
> Doesn't KMeansCUDA require a points.dat file?  If you are running in
> x10.dist/samples/CUDA, you can try
> 
> $ runx10 KMeansCUDA -i 50 -p ../points.dat
> 
> HTH,
>        Igor
> 
> David E Hudak <dhu...@osc.edu> wrote on 11/23/2010 04:36:54 PM:
> 
>> Thanks, Dave.
>> 
>> OK, so I did an svn up and retested.  It all worked except KMeansCUDA.
>> 
>> Here’s what worked:
>> 
>> dhu...@opt2648 535%> x10c++ -O -NO_CHECKS -STATIC_CALLS CUDATopology.x10 -o CUDATopology
>> dhu...@opt2648 524%> runx10 CUDATopology
>> X10_NPLACES not set.  Assuming 1 place, running locally
>> Dumping places at place: (Place 0)
>> Place: (Place 0)
>>  Parent: (Place 0)
>>  NumChildren: 2
>>  Is a Host place
>>  Child 0: (Place 1)
>>    Parent: (Place 0)
>>    NumChildren: 0
>>    Is a CUDA place
>>  Child 1: (Place 2)
>>    Parent: (Place 0)
>>    NumChildren: 0
>>    Is a CUDA place
>> 
>> 
>> dhu...@opt2648 536%> x10c++ -O -NO_CHECKS -STATIC_CALLS CUDABlackScholes.x10 -o CUDABlackScholes
>> x10c++: /nfs/07/dhudak/devel/x10/20101122r2/x10-trunk/x10.dist/include/x10aux/debug.h(124): warning: class "_X10ClosureMap" defines no constructor to initialize the following:
>>     const member "_X10ClosureMap::_x10members"
>> 
>> 
>>     ptxas info : Compiling entry function 'CUDABlackScholes__closure__0' for 'sm_10'
>>     ptxas info : Used 17 registers, 96+16 bytes smem, 65536 bytes cmem[0], 40 bytes cmem[1]
>> 
>> dhu...@opt2648 525%> runx10 CUDABlackScholes
>> X10_NPLACES not set.  Assuming 1 place, running locally
>> Using the GPU at place (Place 1)
>> This program only supports a single GPU.
>> Running 512 times on place (Place 1)
>> Options count             : 8000000
>> BlackScholesGPU() time    : 1.058917974016E12 msec
>> Effective memory bandwidth: 75.548812866210938 GB/s
>> Gigaoptions per second    : 7.55488109588623
>> Generating a second set of results at place (Place 0)
>> Verifying the reuslts match...
>> L1 norm: 1.0E-7
>> Max absolute error: 6.0E-7
>> 
>> TEST PASSED
>> 
>> 
>> dhu...@opt2648 528%> x10c++ -O -NO_CHECKS -STATIC_CALLS CUDA3DFD.x10 -o CUDA3DFD
>> x10c++: /nfs/07/dhudak/devel/x10/20101122r2/x10-trunk/x10.dist/include/x10aux/debug.h(124): warning: class "_X10ClosureMap" defines no constructor to initialize the following:
>>     const member "_X10ClosureMap::_x10members"
>> 
>>     ptxas info : Compiling entry function 'CUDA3DFD__closure__1' for 'sm_10'
>>     ptxas info : Used 18 registers, 80+16 bytes smem, 65536 bytes cmem[0], 24 bytes cmem[1]
>> 
>> dhu...@opt2648 529%> runx10 ./CUDA3DFD
>> X10_NPLACES not set.  Assuming 1 place, running locally
>> 480x480x400
>> allocated 703.125000 MB on device
>> -------------------------------
>> time:       22 ms
>> throughput: 4105.30908203125 MPoints/s
>> -------------------------------
>> 
>> comparing to CPU result...
>> 
>>  Result within epsilon
>> 
>> 
>>  TEST PASSED!
>> 
>> 
>> dhu...@opt2648 532%> x10c++ -O -NO_CHECKS -STATIC_CALLS CUDAMatMul.x10 -o CUDAMatMul
>> x10c++: /nfs/07/dhudak/devel/x10/20101122r2/x10-trunk/x10.dist/include/x10aux/debug.h(124): warning: class "_X10ClosureMap" defines no constructor to initialize the following:
>>     const member "_X10ClosureMap::_x10members"
>> 
>>     ptxas info : Compiling entry function 'CUDAMatMul__closure__0' for 'sm_10'
>>     ptxas info : Used 56 registers, 64+0 bytes lmem, 96+16 bytes smem, 65536 bytes cmem[0], 24 bytes cmem[1]
>> 
>> dhu...@opt2648 533%> runx10 CUDAMatMul
>> X10_NPLACES not set.  Assuming 1 place, running locally
>> 
>> 
>> testing sgemm( 'N', 'N', n, n, n, ... )
>> 
>> 
>> 4096 31.258859505094609 GF/s in 4.396800000000001 seconds
>> 
>> 
>> 
>> Here is what did not:
>> 
>> dhu...@opt2648 537%> x10c++ -O -NO_CHECKS -STATIC_CALLS KMeansCUDA.x10 -o KMeansCUDA
>> x10c++: /nfs/07/dhudak/devel/x10/20101122r2/x10-trunk/x10.dist/include/x10aux/debug.h(124): warning: class "_X10ClosureMap" defines no constructor to initialize the following:
>>     const member "_X10ClosureMap::_x10members"
>>     ptxas info : Compiling entry function 'KMeansCUDA__closure__5' for 'sm_10'
>>     ptxas info : Used 13 registers, 80+16 bytes smem, 65536 bytes cmem[0], 80 bytes cmem[1]
>> 
>> dhu...@opt2648 538%> runx10 KMeansCUDA -i 50
>> X10_NPLACES not set.  Assuming 1 place, running locally
>> points: 100000 clusters: 8 dim: 4
>> x10.io.FileNotFoundException: points.dat
>>        at x10::lang::Throwable::fillInStackTrace()
>>        at x10aux::io::FILEPtrStream::open_file(x10aux::ref<x10::lang::String> const&, char const*)
>>        at x10::io::FileReader__FileInputStream::_make(x10aux::ref<x10::lang::String>)
>>        at x10::io::FileReader::_constructor(x10aux::ref<x10::io::File>)
>>        at x10::io::FileReader::_make(x10aux::ref<x10::io::File>)
>>        at x10::io::File::openRead()
>>        at KMeansCUDA::main(x10aux::ref<x10::array::Array<x10aux::ref<x10::lang::String> > >)
>>        at x10aux::BootStrapClosure::apply()
>>        at x10_lang_Runtime__closure__2::apply()
>>        at x10::lang::Activity::run()
>>        at x10::lang::Runtime__Worker::loop(x10aux::ref<x10::lang::SimpleLatch>, bool)
>>        at x10::lang::Runtime__Worker::apply()
>>        at x10::lang::Runtime__Pool::apply()
>>        at x10::lang::Runtime::start(x10aux::ref<x10::lang::VoidFun_0_0>, x10aux::ref<x10::lang::VoidFun_0_0>)
>>        at int x10aux::template_main<x10::lang::Runtime, KMeansCUDA>(int, char**)
>>        at __libc_start_main
>>        at __gxx_personality_v0
>> 
>> Dave
>> On Nov 23, 2010, at 3:00 PM, Dave Cunningham wrote:
>> 
>>> Hi
>>> 
>>> The error message doesn't necessarily imply that nvcc couldn't be found. In
>>> fact, the errors you got were from nvcc, and we print the same message no
>>> matter how the invocation of nvcc fails.
>>> 
>>> The problem was a regression caused by a recent change in SVN; it's now
>>> fixed in r18467.
>>> 
>>> thanks for your interest in CUDA
>>> 
>>> 
>>> 
>>> On Tue, Nov 23, 2010 at 1:51 PM, David E Hudak <dhu...@osc.edu> wrote:
>>> 
>>>> Hi All,
>>>> 
>>>> I am building X10 from the trunk (checked out last night) and am running
>>>> into a problem with executing nvcc:
>>>> ----------------------------------------
>>>> dhu...@opt2648 509%> x10c++ -O -NO_CHECKS -STATIC_CALLS CUDAMatMul.x10 -o CUDAMatMul
>>>> x10c++: /nfs/07/dhudak/devel/x10/20101122r2/x10-trunk/x10.dist/include/x10aux/debug.h(124): warning: class "_X10ClosureMap" defines no constructor to initialize the following:
>>>>   const member "_X10ClosureMap::_x10members"
>>>> 
>>>>   CUDAMatMul.cu(41): error: namespace "x10aux" has no member "zeroCheck"
>>>> 
>>>>   CUDAMatMul.cu(44): error: namespace "x10aux" has no member "zeroCheck"
>>>> 
>>>>   CUDAMatMul.cu(47): error: namespace "x10aux" has no member "zeroCheck"
>>>> 
>>>>   CUDAMatMul.cu(50): error: namespace "x10aux" has no member "zeroCheck"
>>>> 
>>>>   4 errors detected in the compilation of "/tmp/tmpxft_0000626f_00000000-4_CUDAMatMul.cpp1.ii".
>>>> x10c++: Non-zero return code: 2
>>>> x10c++: Found @CUDA annotation, but not compiling for GPU because nvcc could not be run (check your $PATH).
>>>> ----------------------------------------
>>>> 
>>>> Unfortunately, nvcc is in my path:
>>>> 
>>>> ----------------------------------------
>>>> dhu...@opt2648 510%> which nvcc
>>>> /usr/local/cuda-3.1/cuda/bin/nvcc
>>>> ----------------------------------------
>>>> 
>>>> Any suggestions?
>>>> 
>>>> Thanks,
>>>> Dave
> -- 
> Igor Peshansky  (note the spelling change!)
> IBM T.J. Watson Research Center
> X10: Parallel Productivity and Performance (http://x10-lang.org/)
> XJ: No More Pain for XML's Gain (http://www.research.ibm.com/xj/)
> "I hear and I forget.  I see and I remember.  I do and I understand" -- 
> Xun Zi
> 
> _______________________________________________
> X10-users mailing list
> X10-users@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/x10-users

---
David E. Hudak, Ph.D.          dhu...@osc.edu
Program Director, HPC Engineering
Ohio Supercomputer Center
http://www.osc.edu