Re: [X10-users] Confused with CUDATopology

Richard Gomes Fri, 19 Nov 2010 00:40:20 -0800

Hi Dave,

Thanks a lot.
I will try x10-trunk again.


Yeah... I've seen that switching to the text console lets me go ahead :)
Another good reason for Emacs in text mode :D

Thanks

Richard Gomes
M: +44(77)9955-6813
http://tinyurl.com/frgomes
twitter: frgomes

JQuantLib is a library for Quantitative Finance written in Java.
http://www.jquantlib.org/
twitter: jquantlib

On 19/11/10 02:34, Dave Cunningham wrote:
> I fixed the segfault in trunk recently.  This was a null dereference in the
> memory freeing code.
>
> You shouldn't have run out of memory there; it only needs about 80MB iirc
> and you have 128.  Sometimes all the memory gets tied up in the windowing
> system.  If you're using Linux, switching to the text console and back often
> flushes it back out and lets you run CUDA programs again.  I'm not sure if
> there is an equivalent in Windows.
>
>
> On Thu, Nov 11, 2010 at 10:36 PM, Richard Gomes <rgomes1...@yahoo.co.uk> wrote:
>
>> Hi Igor,
>>
>> I applied the patch onto SF_RELEASE_2_1_0 and it compiles fine.
>> When I run the sample programs, CUDATopology and CUDAKernelTest work
>> fine whilst others fail like this:
>>
>>
>> $ runx10 KMeansCUDA
>> points: 100000 clusters: 8 dim: 4
>> Running using 1 GPUs.
>> GPU known as (Place 1) gets role 0 offset 0 len 100000
>> 100000 8 4 2.703
>> kernel: 1.689
>> dma: 0.447
>> cpu: 0.463
>> reduce: 0.101
>> Segmentation fault
>>
>>
>>
>>
>> $ runx10 CUDABlackScholes
>> Using the GPU at place (Place 1)
>> This program only supports a single GPU.
>> CUDA_ERROR_OUT_OF_MEMORY (At common/x10rt_cuda.cc:452)
>> Aborted
>>
>>
>>
>> Are these failures expected, or is my graphics card simply not capable
>> enough for such tasks? I have a GeForce 8300 GS:
>>
>> Device 0: "GeForce 8300 GS"
>>     CUDA Driver Version:                           3.20
>>     CUDA Runtime Version:                          3.20
>>     CUDA Capability Major/Minor version number:    1.1
>>     Total amount of global memory:                 133496832 bytes
>>     Multiprocessors x Cores/MP = Cores:            1 (MP) x 8 (Cores/MP) = 8 (Cores)
>>     Total amount of constant memory:               65536 bytes
>>     Total amount of shared memory per block:       16384 bytes
>>     Total number of registers available per block: 8192
>>     Warp size:                                     32
>>     Maximum number of threads per block:           512
>>     Maximum sizes of each dimension of a block:    512 x 512 x 64
>>     Maximum sizes of each dimension of a grid:     65535 x 65535 x 1
>>     Maximum memory pitch:                          2147483647 bytes
>>     Texture alignment:                             256 bytes
>>     Clock rate:                                    0.92 GHz
>>     Concurrent copy and execution:                 No
>>     Run time limit on kernels:                     Yes
>>     Integrated:                                    No
>>     Support host page-locked memory mapping:       Yes
>>     Compute mode:                                  Default (multiple host
>>
>>
>> Thanks
>>
>>
>> Richard Gomes
>>
>> On 09/11/10 22:22, Igor Peshansky wrote:
>>> Richard,
>>>
>>> "svn diff -r18088:18092 x10.runtime/x10rt/common/x10rt_cuda.cc" in our
>>> repo should generate that patch.
>>>           Igor
>>>
>>> Richard Gomes <rgomes1...@yahoo.co.uk> wrote on 11/09/2010 05:11:43 PM:
>>>
>>>> Hi Dave,
>>>>
>>>> Could you please send the patch again?
>>>>
>>>> Thanks
>>>>
>>>> Richard Gomes
>>>>
>>>> On 09/11/10 03:45, Dave Cunningham wrote:
>>>>> Thanks for trying out X10/CUDA
>>>>>
>>>>> Your initial problem with CUDATopology is due to the fact that
>>>>> X10RT_ACCELS is ineffective if X10 was built without -DX10RT_CUDA=true,
>>>>> which means the X10 application was unable to 'see' the accelerators.
>>>>> You correctly surmised that building X10 from the source release was
>>>>> necessary.
>>>>>
>>>>> The build errors are due to NVIDIA adding more error codes and making
>>>>> backwards-incompatible changes in the CUDA API.  This is now fixed in SVN.
>>>>> I checked the build with the following CUDA versions:
>>>>>
>>>>> cuda-2.2  cuda-2.3  cuda-3.0  cuda-3.1  cuda-3.2.12
>>>>>
>>>>> If you decide to use SVN, there are some changes to the way kernels should
>>>>> be written in X10 that are currently undocumented (except via the code in
>>>>> the samples dir).  Also, we can't guarantee there won't be more changes (and
>>>>> breakages) before the next release.  However, you will be able to try new
>>>>> features like clocks and constant memory on the GPU.
>>>>>
>>>>> If using SVN does not appeal, you can also patch the source release to fix
>>>>> this problem.  Apply the attached patch from the root of the source release
>>>>> as follows, and rebuild:
>>>>>
>>>>> patch -p0 < cuda_3.2.patch
>>>>>
>>>>> Hope this helps.
>>
>>

_______________________________________________
X10-users mailing list
X10-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/x10-users
