Hi Dave,

I added "-arch=sm_21" to the nvcc command line built at CUDACodeGenerator.java:1067.
Note the small change where "f" is written into the last slot of the String[].

This is certainly not the correct fix.
It looks like the ideal approach would be to generate PTX files instead.

Anyway, I'm able to continue my tests on my new GTX 460. :)

More details at
http://developer.download.nvidia.com/compute/cuda/3_2_prod/toolkit/docs/Fermi_Compatibility_Guide.pdf



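// Compile each .cu file to a cubin; the last slot is filled in per file.
String[] nvccCmd = { "nvcc", "--cubin",
                     "-arch=sm_21",   // added: target compute capability 2.1 (GTX 460)
                     "-Xptxas", "-v",
                     "-I" + CXXCommandBuilder.X10_DIST + "/include",
                     null };          // placeholder for the .cu file name

for (String f : compilationUnits) {
    if (f.endsWith(".cu")) {
        nvccCmd[nvccCmd.length-1] = f;  // overwrite the placeholder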
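
As a rough illustration of a less hardcoded variant, below is a minimal sketch that takes the architecture as a parameter and can request PTX instead of a cubin. The class NvccCommand, its build method, and the includeDir parameter are hypothetical names made up for this example and are not part of the X10 code base. Only the nvcc options already used above, plus --ptx, are assumed. Whether x10rt can load PTX in place of a cubin is not something this sketch addresses.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper for illustration only; not part of the X10 code base. */
final class NvccCommand {
    private NvccCommand() {}

    /**
     * Builds an nvcc command line for a single .cu file.
     *
     * @param cuFile     path of the CUDA source file
     * @param arch       value for -arch, e.g. "sm_21" for a cubin or a
     *                   virtual architecture such as "compute_20" for PTX
     * @param emitPtx    true to pass --ptx, false to pass --cubin
     * @param includeDir directory passed via -I (assumed parameter; the
     *                   snippet above uses CXXCommandBuilder.X10_DIST + "/include")
     */
    static List<String> build(String cuFile, String arch,
                              boolean emitPtx, String includeDir) {
        if (!cuFile.endsWith(".cu")) {
            throw new IllegalArgumentException("not a CUDA source file: " + cuFile);
        }
        List<String> cmd = new ArrayList<>();
        cmd.add("nvcc");
        cmd.add(emitPtx ? "--ptx" : "--cubin");
        cmd.add("-arch=" + arch);
        if (!emitPtx) {
            // ptxas only runs when machine code is generated,
            // so its verbose flag is passed for cubin output only.
            cmd.add("-Xptxas");
            cmd.add("-v");
        }
        cmd.add("-I" + includeDir);
        cmd.add(cuFile);
        return cmd;
    }

    public static void main(String[] args) {
        // Print the command that would be run for one file.
        System.out.println(String.join(" ",
                build("Kernel.cu", "sm_21", false, "include")));
    }
}
```

Building a fresh list per file avoids the shared array with a null placeholder, and the resulting list could be handed to a ProcessBuilder as is.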


Cheers

Richard Gomes
M: +44(77)9955-6813
http://tinyurl.com/frgomes
twitter: frgomes

JQuantLib is a library for Quantitative Finance written in Java.
http://www.jquantlib.org/
twitter: jquantlib

On 12/12/10 20:36, Richard Gomes wrote:
> Hi Dave,
>
> From the programming guide:
>
>
>
> 3.1.2 Binary Compatibility
>
> Binary code is architecture-specific. A cubin object is generated using
> the compiler option -code that specifies the targeted architecture: For
> example, compiling with -code=sm_13 produces binary code for devices of
> compute capability 1.3. Binary compatibility is guaranteed from one
> minor revision to the next one, but not from one minor revision to the
> previous one or across major revisions. In other words, a cubin object
> generated for compute capability X.y is only guaranteed to execute on
> devices of compute capability X.z where z≥y.
>
>
>
> This makes me believe that code intended for the GTX 460 should be compiled
> for compute capability 2.1, i.e. with -code=sm_21.
>
> Does that make sense?
> How could I force this option when I call x10c++?
>
> Thanks
>
> Richard Gomes
> M: +44(77)9955-6813
> http://tinyurl.com/frgomes
> twitter: frgomes
>
> JQuantLib is a library for Quantitative Finance written in Java.
> http://www.jquantlib.org/
> twitter: jquantlib
>
> On 11/12/10 00:33, Dave Cunningham wrote:
>> This is what I got with our tesla 2070 too.  I assumed it was something to
>> do with the system but if you get it too then it's more likely to be
>> something we're doing wrong.  Maybe we're invoking nvcc improperly.
>>
>> On Fri, Dec 10, 2010 at 5:36 PM, Richard Gomes <rgomes1...@yahoo.co.uk> wrote:
>>
>>> Hi guys,
>>>
>>> Today I started tests with a new toy: GeForce GTX 460 :)
>>>
>>> In a nutshell, I removed the old GeForce 8300 and installed the new one.
>>>
>>> Now all X10 programs are failing, except CUDATopology which works fine.
>>> All programs are behaving like this:
>>>
>>> $ runx10 CUDAKernelTest
>>> X10_NPLACES not set.  Assuming 1 place, running locally
>>> CUDA_ERROR_INVALID_SOURCE (At common/x10rt_cuda.cc:383)
>>> Aborted
>>>
>>> I've recompiled from trunk, as usual (and should always do before
>>> reporting issues!).
>>>
>>> All the Nvidia apps that are part of the dev kit work as expected.
>>>
>>>
>>> This is what deviceQuery says:
>>>
>>>
>>> $ ./deviceQuery
>>> ./deviceQuery Starting...
>>>
>>>    CUDA Device Query (Runtime API) version (CUDART static linking)
>>>
>>> There is 1 device supporting CUDA
>>>
>>> Device 0: "GeForce GTX 460"
>>>     CUDA Driver Version:                           3.20
>>>     CUDA Runtime Version:                          3.20
>>>     CUDA Capability Major/Minor version number:    2.1
>>>     Total amount of global memory:                 2146631680 bytes
>>>     Multiprocessors x Cores/MP = Cores:            7 (MP) x 48 (Cores/MP) = 336 (Cores)
>>>     Total amount of constant memory:               65536 bytes
>>>     Total amount of shared memory per block:       49152 bytes
>>>     Total number of registers available per block: 32768
>>>     Warp size:                                     32
>>>     Maximum number of threads per block:           1024
>>>     Maximum sizes of each dimension of a block:    1024 x 1024 x 64
>>>     Maximum sizes of each dimension of a grid:     65535 x 65535 x 1
>>>     Maximum memory pitch:                          2147483647 bytes
>>>     Texture alignment:                             512 bytes
>>>     Clock rate:                                    1.40 GHz
>>>     Concurrent copy and execution:                 Yes
>>>     Run time limit on kernels:                     Yes
>>>     Integrated:                                    No
>>>     Support host page-locked memory mapping:       Yes
>>>     Compute mode:                                  Default (multiple host threads can use this device simultaneously)
>>>     Concurrent kernel execution:                   Yes
>>>     Device has ECC support enabled:                No
>>>     Device is using TCC driver mode:               No
>>>
>>> deviceQuery, CUDA Driver = CUDART, CUDA Driver Version = 3.20, CUDA Runtime Version = 3.20, NumDevs = 1, Device = GeForce GTX 460
>>>
>>>
>>> PASSED
>>>
>>> Press <Enter> to Quit...
>>> -----------------------------------------------------------
>>>
>>>
>>> Any idea?
>>>
>>> Thanks a lot :)
>>>
>>> --
>>> Richard Gomes
>>> M: +44(77)9955-6813
>>> http://tinyurl.com/frgomes
>>> twitter: frgomes
>>>
>>> JQuantLib is a library for Quantitative Finance written in Java.
>>> http://www.jquantlib.org/
>>> twitter: jquantlib
>>>
>>>
>>> ------------------------------------------------------------------------------
>>> Oracle to DB2 Conversion Guide: Learn learn about native support for
>>> PL/SQL,
>>> new data types, scalar functions, improved concurrency, built-in packages,
>>> OCI, SQL*Plus, data movement tools, best practices and more.
>>> http://p.sf.net/sfu/oracle-sfdev2dev
>>> _______________________________________________
>>> X10-users mailing list
>>> X10-users@lists.sourceforge.net
>>> https://lists.sourceforge.net/lists/listinfo/x10-users
>>>
>
