Hi Igor,

I applied the patch onto SF_RELEASE_2_1_0 and it compiles fine.
When I run the sample programs, CUDATopology and CUDAKernelTest work 
fine whilst others fail like this:


$ runx10 KMeansCUDA
points: 100000 clusters: 8 dim: 4
Running using 1 GPUs.
GPU known as (Place 1) gets role 0 offset 0 len 100000
100000 8 4 2.703
kernel: 1.689
dma: 0.447
cpu: 0.463
reduce: 0.101
Segmentation fault




$ runx10 CUDABlackScholes
Using the GPU at place (Place 1)
This program only supports a single GPU.
CUDA_ERROR_OUT_OF_MEMORY (At common/x10rt_cuda.cc:452)
Aborted



Are these failures expected, or is my graphics card simply not capable 
enough for such tasks? I have a GeForce 8300 GS:

Device 0: "GeForce 8300 GS"
  CUDA Driver Version:                           3.20
  CUDA Runtime Version:                          3.20
  CUDA Capability Major/Minor version number:    1.1
  Total amount of global memory:                 133496832 bytes
  Multiprocessors x Cores/MP = Cores:            1 (MP) x 8 (Cores/MP) = 8 (Cores)
  Total amount of constant memory:               65536 bytes
  Total amount of shared memory per block:       16384 bytes
  Total number of registers available per block: 8192
  Warp size:                                     32
  Maximum number of threads per block:           512
  Maximum sizes of each dimension of a block:    512 x 512 x 64
  Maximum sizes of each dimension of a grid:     65535 x 65535 x 1
  Maximum memory pitch:                          2147483647 bytes
  Texture alignment:                             256 bytes
  Clock rate:                                    0.92 GHz
  Concurrent copy and execution:                 No
  Run time limit on kernels:                     Yes
  Integrated:                                    No
  Support host page-locked memory mapping:       Yes
  Compute mode:                                  Default (multiple host
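
As a rough sanity check on the numbers above, the sketch below compares the
reported global memory with the size of the KMeansCUDA input. It assumes the
coordinates are stored as 4-byte floats, which is a guess and not something
confirmed by the sample source. It says nothing about CUDABlackScholes, whose
problem size is not shown in the output.

```python
# Back-of-the-envelope memory estimate for the KMeansCUDA run shown above.
# ASSUMPTION: each coordinate is a 4-byte float; the real sample may differ.

GLOBAL_MEM_BYTES = 133496832            # "Total amount of global memory" above
POINTS, CLUSTERS, DIM = 100000, 8, 4    # from "points: 100000 clusters: 8 dim: 4"
BYTES_PER_FLOAT = 4                     # assumed element size


def mib(n_bytes):
    """Convert a byte count to mebibytes (2**20 bytes)."""
    return n_bytes / (1024 * 1024)


points_bytes = POINTS * DIM * BYTES_PER_FLOAT      # input points
clusters_bytes = CLUSTERS * DIM * BYTES_PER_FLOAT  # cluster centres

print(f"device memory : {mib(GLOBAL_MEM_BYTES):.1f} MiB")
print(f"points array  : {mib(points_bytes):.2f} MiB")
print(f"cluster array : {clusters_bytes} bytes")
```

Under that assumption the KMeans input is a small fraction of the card's
memory, so capacity alone would not obviously explain the segmentation fault.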


Thanks


Richard Gomes
M: +44(77)9955-6813
http://tinyurl.com/frgomes
twitter: frgomes

JQuantLib is a library for Quantitative Finance written in Java.
http://www.jquantlib.org/
twitter: jquantlib

On 09/11/10 22:22, Igor Peshansky wrote:
> Richard,
>
> "svn diff -r18088:18092 x10.runtime/x10rt/common/x10rt_cuda.cc" in our
> repo should generate that patch.
>          Igor
>
> Richard Gomes<rgomes1...@yahoo.co.uk>  wrote on 11/09/2010 05:11:43 PM:
>
>> Hi Dave,
>>
>> Could you please send the patch again?
>>
>> Thanks
>>
>> Richard Gomes
>>
>> On 09/11/10 03:45, Dave Cunningham wrote:
>>> Thanks for trying out X10/CUDA.
>>>
>>> Your initial problem with CUDATopology is due to the fact that
>>> X10RT_ACCELS is ineffective if X10 was built without -DX10RT_CUDA=true.
>>> This means the X10 application was unable to 'see' the accelerators.
>>> You correctly surmised that building X10 from the source release was
>>> necessary.
>>>
>>> The build errors are due to NVIDIA adding more error codes and making
>>> backwards-incompatible changes in the CUDA API.  This is now fixed in
>>> SVN.  I checked the build with the following CUDA versions:
>>>
>>> cuda-2.2  cuda-2.3  cuda-3.0  cuda-3.1  cuda-3.2.12
>>>
>>> If you decide to use SVN, there are some changes to the way kernels
>>> should be written in X10 that are currently undocumented (except via
>>> the code in the samples dir).  Also, we can't guarantee there won't be
>>> more changes (and breakages) before the next release.  However, you
>>> will be able to try new features like clocks and constant memory on
>>> the GPU.
>>>
>>> If using SVN does not appeal, you can also patch the source release to
>>> fix this problem.  Apply the attached patch from the root of the source
>>> release as follows, and rebuild:
>>>
>>> patch -p0 < cuda_3.2.patch
>>>
>>> Hope this helps.

_______________________________________________
X10-users mailing list
X10-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/x10-users