D'oh! Yep, I copied points.dat from the parent directory and it works:

dhu...@opt2393 915%> X10RT_ACCELS=ALL runx10 KMeansCUDA -i 50
X10_NPLACES not set. Assuming 1 place, running locally
points: 100000 clusters: 8 dim: 4
Running using 2 GPUs.
GPU known as (Place 1) gets role 0 offset 0 len 50000
GPU known as (Place 2) gets role 1 offset 50000 len 50000
100000 8 4 0.073 kernel: 0.032 dma: 0.018 cpu: 0.021 reduce: 0.002
Thanks,
Dave

On Nov 23, 2010, at 5:05 PM, Igor Peshansky wrote:

> Dave,
>
> Doesn't KMeansCUDA require a points.dat file? If you are running in
> x10.dist/samples/CUDA, you can try
>
> $ runx10 KMeansCUDA -i 50 -p ../points.dat
>
> HTH,
> Igor
>
> David E Hudak <dhu...@osc.edu> wrote on 11/23/2010 04:36:54 PM:
>
>> Thanks, Dave.
>>
>> OK, so I did an svn up and retest. It all worked except KMeansCUDA.
>>
>> Here’s what worked:
>>
>> dhu...@opt2648 535%> x10c++ -O -NO_CHECKS -STATIC_CALLS CUDATopology.
>> x10 -o CUDATopology
>> dhu...@opt2648 524%> runx10 CUDATopology
>> X10_NPLACES not set. Assuming 1 place, running locally
>> Dumping places at place: (Place 0)
>> Place: (Place 0)
>> Parent: (Place 0)
>> NumChildren: 2
>> Is a Host place
>> Child 0: (Place 1)
>> Parent: (Place 0)
>> NumChildren: 0
>> Is a CUDA place
>> Child 1: (Place 2)
>> Parent: (Place 0)
>> NumChildren: 0
>> Is a CUDA place
>>
>>
>> dhu...@opt2648 536%> x10c++ -O -NO_CHECKS -STATIC_CALLS
>> CUDABlackScholes.x10 -o CUDABlackScholes
>> x10c++: /nfs/07/dhudak/devel/x10/20101122r2/x10-trunk/x10.
>> dist/include/x10aux/debug.h(124): warning: class "_X10ClosureMap"
>> defines no constructor to initialize the following:
>> const member "_X10ClosureMap::_x10members"
>>
>>
>> ptxas info : Compiling entry function
>> 'CUDABlackScholes__closure__0' for 'sm_10'
>> ptxas info : Used 17 registers, 96+16 bytes smem, 65536 bytes
>> cmem[0], 40 bytes cmem[1]
>>
>> dhu...@opt2648 525%> runx10 CUDABlackScholes
>> X10_NPLACES not set. Assuming 1 place, running locally
>> Using the GPU at place (Place 1)
>> This program only supports a single GPU.
>> Running 512 times on place (Place 1)
>> Options count : 8000000
>> BlackScholesGPU() time : 1.058917974016E12 msec
>> Effective memory bandwidth: 75.548812866210938 GB/s
>> Gigaoptions per second : 7.55488109588623
>> Generating a second set of results at place (Place 0)
>> Verifying the reuslts match...
>> L1 norm: 1.0E-7
>> Max absolute error: 6.0E-7
>>
>> TEST PASSED
>>
>>
>> dhu...@opt2648 528%> x10c++ -O -NO_CHECKS -STATIC_CALLS CUDA3DFD.x10 -o CUDA3DFD
>> x10c++: /nfs/07/dhudak/devel/x10/20101122r2/x10-trunk/x10.
>> dist/include/x10aux/debug.h(124): warning: class "_X10ClosureMap"
>> defines no constructor to initialize the following:
>> const member "_X10ClosureMap::_x10members"
>>
>> ptxas info : Compiling entry function 'CUDA3DFD__closure__1' for 'sm_10'
>> ptxas info : Used 18 registers, 80+16 bytes smem, 65536 bytes
>> cmem[0], 24 bytes cmem[1]
>>
>> dhu...@opt2648 529%> runx10 ./CUDA3DFD
>> X10_NPLACES not set. Assuming 1 place, running locally
>> 480x480x400
>> allocated 703.125000 MB on device
>> -------------------------------
>> time: 22 ms
>> throughput: 4105.30908203125 MPoints/s
>> -------------------------------
>>
>> comparing to CPU result...
>>
>> Result within epsilon
>>
>>
>> TEST PASSED!
>>
>>
>> dhu...@opt2648 532%> x10c++ -O -NO_CHECKS -STATIC_CALLS CUDAMatMul.x10
>> -o CUDAMatMul
>> x10c++: /nfs/07/dhudak/devel/x10/20101122r2/x10-trunk/x10.
>> dist/include/x10aux/debug.h(124): warning: class "_X10ClosureMap"
>> defines no constructor to initialize the following:
>> const member "_X10ClosureMap::_x10members"
>>
>> ptxas info : Compiling entry function 'CUDAMatMul__closure__0'
>> for 'sm_10'
>> ptxas info : Used 56 registers, 64+0 bytes lmem, 96+16 bytes
>> smem, 65536 bytes cmem[0], 24 bytes cmem[1]
>>
>> dhu...@opt2648 533%> runx10 CUDAMatMul
>> X10_NPLACES not set. Assuming 1 place, running locally
>>
>>
>> testing sgemm( 'N', 'N', n, n, n, ... )
>>
>>
>> 4096 31.258859505094609 GF/s in 4.396800000000001 seconds
>>
>>
>>
>> Here is what did not:
>>
>> dhu...@opt2648 537%> x10c++ -O -NO_CHECKS -STATIC_CALLS KMeansCUDA.x10
>> -o KMeansCUDA
>> x10c++: /nfs/07/dhudak/devel/x10/20101122r2/x10-trunk/x10.
>> dist/include/x10aux/debug.h(124): warning: class "_X10ClosureMap"
>> defines no constructor to initialize the following:
>> const member "_X10ClosureMap::_x10members"
>> ptxas info : Compiling entry function 'KMeansCUDA__closure__5'
>> for 'sm_10'
>> ptxas info : Used 13 registers, 80+16 bytes smem, 65536 bytes
>> cmem[0], 80 bytes cmem[1]
>>
>> dhu...@opt2648 538%> runx10 KMeansCUDA -i 50
>> X10_NPLACES not set. Assuming 1 place, running locally
>> points: 100000 clusters: 8 dim: 4
>> x10.io.FileNotFoundException: points.dat
>> at x10::lang::Throwable::fillInStackTrace()
>> at x10aux::io::FILEPtrStream::open_file(x10aux::ref<x10::
>> lang::String> const&, char const*)
>> at x10::io::FileReader__FileInputStream::_make(x10aux::
>> ref<x10::lang::String>)
>> at x10::io::FileReader::_constructor(x10aux::ref<x10::io::File>)
>> at x10::io::FileReader::_make(x10aux::ref<x10::io::File>)
>> at x10::io::File::openRead()
>> at KMeansCUDA::main(x10aux::ref<x10::array::Array<x10aux::
>> ref<x10::lang::String> > >)
>> at x10aux::BootStrapClosure::apply()
>> at x10_lang_Runtime__closure__2::apply()
>> at x10::lang::Activity::run()
>> at x10::lang::Runtime__Worker::loop(x10aux::ref<x10::lang::
>> SimpleLatch>, bool)
>> at x10::lang::Runtime__Worker::apply()
>> at x10::lang::Runtime__Pool::apply()
>> at x10::lang::Runtime::start(x10aux::ref<x10::lang::
>> VoidFun_0_0>, x10aux::ref<x10::lang::VoidFun_0_0>)
>> at int x10aux::template_main<x10::lang::Runtime,
>> KMeansCUDA>(int, char**)
>> at __libc_start_main
>> at __gxx_personality_v0
>>
>> Dave
>> On Nov 23, 2010, at 3:00 PM, Dave Cunningham wrote:
>>
>>> Hi
>>>
>>> The error message doesn't necessarily imply that nvcc couldn't be found, in
>>> fact the errors you got were from nvcc, and we print the same message no
>>> matter how the invocation of nvcc fails.
>>>
>>> The problem was a regression caused by a recent change in SVN, it's now
>>> fixed in r18467
>>>
>>> thanks for your interest in CUDA
>>>
>>>
>>>
>>> On Tue, Nov 23, 2010 at 1:51 PM, David E Hudak <dhu...@osc.edu> wrote:
>>>
>>>> Hi All,
>>>>
>>>> I am building X10 from the trunk (checked out last night) and am running
>>>> into a problem with executing nvcc:
>>>> ----------------------------------------
>>>> dhu...@opt2648 509%> x10c++ -O -NO_CHECKS -STATIC_CALLS CUDAMatMul.x10 -o
>>>> CUDAMatMul
>>>> x10c++:
>>>> /nfs/07/dhudak/devel/x10/20101122r2/x10-trunk/x10.dist/include/x10aux/debug.h(124):
>>>> warning: class "_X10ClosureMap" defines no constructor to initialize the
>>>> following:
>>>> const member "_X10ClosureMap::_x10members"
>>>>
>>>> CUDAMatMul.cu(41): error: namespace "x10aux" has no member "zeroCheck"
>>>>
>>>> CUDAMatMul.cu(44): error: namespace "x10aux" has no member "zeroCheck"
>>>>
>>>> CUDAMatMul.cu(47): error: namespace "x10aux" has no member "zeroCheck"
>>>>
>>>> CUDAMatMul.cu(50): error: namespace "x10aux" has no member "zeroCheck"
>>>>
>>>> 4 errors detected in the compilation of
>>>> "/tmp/tmpxft_0000626f_00000000-4_CUDAMatMul.cpp1.ii".
>>>> x10c++: Non-zero return code: 2
>>>> x10c++: Found @CUDA annotation, but not compiling for GPU because nvcc
>>>> could not be run (check your $PATH).
>>>> ----------------------------------------
>>>>
>>>> Unfortunately, nvcc is in my path:
>>>>
>>>> ----------------------------------------
>>>> dhu...@opt2648 510%> which nvcc
>>>> /usr/local/cuda-3.1/cuda/bin/nvcc
>>>> ----------------------------------------
>>>>
>>>> Any suggestions?
>>>>
>>>> Thanks,
>>>> Dave
> --
> Igor Peshansky (note the spelling change!)
> IBM T.J. Watson Research Center
> X10: Parallel Productivity and Performance (http://x10-lang.org/)
> XJ: No More Pain for XML's Gain (http://www.research.ibm.com/xj/)
> "I hear and I forget. I see and I remember.
I do and I understand" --
> Xun Zi
>
> _______________________________________________
> X10-users mailing list
> X10-users@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/x10-users

---
David E. Hudak, Ph.D. dhu...@osc.edu
Program Director, HPC Engineering
Ohio Supercomputer Center
http://www.osc.edu
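
For anyone who hits the same FileNotFoundException: the two fixes mentioned in this thread (copying points.dat into the working directory, or passing -p ../points.dat) can be folded into a small pre-flight check. The sketch below is only an illustration, not part of the X10 distribution. The function name find_points is hypothetical, and it assumes points.dat sits either in the current directory or in its parent, as it did for the x10.dist/samples/CUDA layout described above.

```shell
# Hypothetical helper (not part of X10): locate points.dat before launching
# KMeansCUDA. Checks the current directory first, then the parent directory,
# which is where the file lived relative to x10.dist/samples/CUDA in this thread.
find_points() {
    for candidate in ./points.dat ../points.dat; do
        if [ -f "$candidate" ]; then
            # Print the first match so the caller can pass it to -p.
            printf '%s\n' "$candidate"
            return 0
        fi
    done
    echo "points.dat not found in . or .." >&2
    return 1
}

# Example use, left commented out so sourcing this file runs nothing:
# points=$(find_points) && X10RT_ACCELS=ALL runx10 KMeansCUDA -i 50 -p "$points"
```

The -p flag and the X10RT_ACCELS=ALL setting are taken from the messages above; everything else here is an assumption about where the data file might be.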