Hi Dave,

Thanks a lot for your help. As I'm only doing some very simple tests, I will try an older revision for the time being, whilst you enjoy some days off :)

Cheers :)

Richard Gomes
http://www.jquantlib.org/index.php/User:RichardGomes
twitter: frgomes

JQuantLib is a library for Quantitative Finance written in Java.
http://www.jquantlib.com/
twitter: jquantlib

On 28/02/11 07:52, Dave Cunningham wrote:
> I was able to reproduce this and got this trace from valgrind but I'm on
> vacation for a week so cannot debug further
>
> ==23823== Invalid read of size 1
> ==23823==    at 0x4027411: memcpy (mc_replace_strmem.c:497)
> ==23823==    by 0x80CD8F3: x10aux::deserialization_buffer::Read<unsigned int>::_(x10aux::deserialization_buffer&) (in /home/spark/work/17/KMeansCUDA.sockets.dbg)
> ==23823==    by 0x80CDE74: unsigned int x10aux::deserialization_buffer::read<unsigned int>() (in /home/spark/work/17/KMeansCUDA.sockets.dbg)
> ==23823==    by 0x80CDE91: unsigned int x10aux::deserialization_buffer::peek<unsigned int>() (in /home/spark/work/17/KMeansCUDA.sockets.dbg)
> ==23823==    by 0x8155E16: x10aux::deserialization_buffer::Read<x10aux::ref<x10::io::SerialData> >::_(x10aux::deserialization_buffer&) (in /home/spark/work/17/KMeansCUDA.sockets.dbg)
> ==23823==    by 0x815636A: x10aux::ref<x10::io::SerialData> x10aux::deserialization_buffer::read<x10aux::ref<x10::io::SerialData> >() (in /home/spark/work/17/KMeansCUDA.sockets.dbg)
> ==23823==    by 0x440981F: x10::io::SerialData::_deserialize_body(x10aux::deserialization_buffer&) (in /home/spark/x10-dbg/x10.dist/stdlib/lib/libx10.so)
> ==23823==    by 0x440B031: x10aux::ref<x10::lang::Reference> x10::io::SerialData::_deserializer<x10::lang::Reference>(x10aux::deserialization_buffer&) (in /home/spark/x10-dbg/x10.dist/stdlib/lib/libx10.so)
> ==23823==    by 0x4A81A27: x10aux::DeserializationDispatcher::create_(x10aux::deserialization_buffer&, short) (in /home/spark/x10-dbg/x10.dist/stdlib/lib/libx10.so)
> ==23823==    by 0x4A7EDAF: x10aux::DeserializationDispatcher::create(x10aux::deserialization_buffer&, short) (in /home/spark/x10-dbg/x10.dist/stdlib/lib/libx10.so)
> ==23823==    by 0x4A850B6: x10aux::deserialization_buffer::deserialize_reference(x10aux::deserialization_buffer&) (in /home/spark/x10-dbg/x10.dist/stdlib/lib/libx10.so)
> ==23823==    by 0x815632E: x10aux::deserialization_buffer::Read<x10aux::ref<x10::io::SerialData> >::_(x10aux::deserialization_buffer&) (in /home/spark/work/17/KMeansCUDA.sockets.dbg)
> ==23823==  Address 0x615777d is 1 bytes after a block of size 28 alloc'd
> ==23823==    at 0x4025BD3: malloc (vg_replace_malloc.c:236)
> ==23823==    by 0x4C8FFAC: unsigned char* safe_malloc<unsigned char>(unsigned int, unsigned int) (in /home/spark/x10-dbg/x10.dist/lib/libx10rt_sockets.so)
> ==23823==    by 0x4C92134: x10rt_cuda_send_put (in /home/spark/x10-dbg/x10.dist/lib/libx10rt_sockets.so)
> ==23823==    by 0x4C8D0B2: x10rt_lgl_send_put (in /home/spark/x10-dbg/x10.dist/lib/libx10rt_sockets.so)
> ==23823==    by 0x4C8BA15: x10rt_send_put (in /home/spark/x10-dbg/x10.dist/lib/libx10rt_sockets.so)
> ==23823==    by 0x4A7B65E: x10aux::send_put(int, short, x10aux::serialization_buffer&, void*, unsigned int) (in /home/spark/x10-dbg/x10.dist/stdlib/lib/libx10.so)
> ==23823==    by 0x4AB7212: x10::util::IMC_copyToBody(void*, void*, int, x10::lang::Place, bool, x10aux::ref<x10::lang::Reference>) (in /home/spark/x10-dbg/x10.dist/stdlib/lib/libx10.so)
> ==23823==    by 0x815D8E3: void x10::util::IndexedMemoryChunk<void>::asyncCopy<float>(x10::util::IndexedMemoryChunk<float>, int, x10::util::RemoteIndexedMemoryChunk<float>, int, int) (in /home/spark/work/17/KMeansCUDA.sockets.dbg)
> ==23823==    by 0x815D97A: void x10::util::CUDAUtilities::initCUDAArray<float>(x10::util::IndexedMemoryChunk<float>, x10::util::RemoteIndexedMemoryChunk<float>, int) (in /home/spark/work/17/KMeansCUDA.sockets.dbg)
> ==23823==    by 0x815EEB0: x10aux::ref<x10::array::RemoteArray<float> > x10::util::CUDAUtilities::makeCUDAArray<float>(x10::lang::Place, int, x10::util::IndexedMemoryChunk<float>) (in
> /home/spark/work/17/KMeansCUDA.sockets.dbg)
> ==23823==    by 0x815F442: x10aux::ref<x10::array::RemoteArray<float> > x10::util::CUDAUtilities::makeRemoteArray<float>(x10::lang::Place, int, x10aux::ref<x10::array::Array<float> >) (in /home/spark/work/17/KMeansCUDA.sockets.dbg)
> ==23823==    by 0x8161877: KMeansCUDA__closure__2::__apply() (in /home/spark/work/17/KMeansCUDA.sockets.dbg)
>
> On Sun, Feb 27, 2011 at 2:45 AM, Richard Gomes <rgomes1...@yahoo.co.uk> wrote:
>
>> Hi guys,
>>
>> I'm getting segmentation fault on all CUDA samples, except CUDATopology.
>> Are you observing the same kind of problem?
>> If not, which compilation options are you using? Any help is much
>> appreciated.
>>
>> I added this method to CUDATopology:
>>
>> public static def cells(p:Place) : Int = {
>>     if (p.isCUDA()) {
>>         val remote = CUDAUtilities.makeRemoteArray[Int](p, 1, 0);
>>         finish async at (p) @CUDA @CUDADirectParams {
>>             val blocks = CUDAUtilities.autoBlocks();
>>             val threads = CUDAUtilities.autoThreads();
>>             finish for (block in 0..0) async {
>>                 clocked finish for (thread in 0..0) clocked async {
>>                     remote(0) = blocks * threads;
>>                 }
>>             }
>>         }
>>         val local = new Array[Int](1);
>>         finish Array.asyncCopy(remote, 0, local, 0, 1);
>>         return local(0);
>>     } else if (p.isSPE()) {
>>         return 1; // TODO: should return something else?
>>     } else {
>>         return 1; // TODO: should return the number of cores?
>>     }
>> }
>>
>> This is the compilation, using v2.1.2:
>>
>> $ echo x10c++ ${X10C_OPTS} -report postcompile=5 CUDATopology.x10 -o CUDATopology
>> x10c++ -NO_CHECKS -STATIC_CALLS -report postcompile=5 CUDATopology.x10 -o CUDATopology
>> $
>> $
>> $ x10c++ ${X10C_OPTS} -report postcompile=5 CUDATopology.x10 -o CUDATopology
>> Output files: [CUDATopology.h, CUDATopology.cu, CUDATopology.cc]
>> Executing post-compiler nvcc --cubin -Xptxas -v -arch=sm_10 -Inull/include -I/opt/JavaIDE/x10-2.1.2-linux_x86/stdlib/include -o CUDATopology_sm_10.cubin CUDATopology.cu
>> Executing post-compiler nvcc --cubin -Xptxas -v -arch=sm_11 -Inull/include -I/opt/JavaIDE/x10-2.1.2-linux_x86/stdlib/include -o CUDATopology_sm_11.cubin CUDATopology.cu
>> Executing post-compiler nvcc --cubin -Xptxas -v -arch=sm_12 -Inull/include -I/opt/JavaIDE/x10-2.1.2-linux_x86/stdlib/include -o CUDATopology_sm_12.cubin CUDATopology.cu
>> Executing post-compiler nvcc --cubin -Xptxas -v -arch=sm_13 -Inull/include -I/opt/JavaIDE/x10-2.1.2-linux_x86/stdlib/include -o CUDATopology_sm_13.cubin CUDATopology.cu
>> Executing post-compiler nvcc --cubin -Xptxas -v -arch=sm_20 -Inull/include -I/opt/JavaIDE/x10-2.1.2-linux_x86/stdlib/include -o CUDATopology_sm_20.cubin CUDATopology.cu
>> Executing post-compiler nvcc --cubin -Xptxas -v -arch=sm_21 -Inull/include -I/opt/JavaIDE/x10-2.1.2-linux_x86/stdlib/include -o CUDATopology_sm_21.cubin CUDATopology.cu
>> Executing post-compiler nvcc --cubin -Xptxas -v -arch=sm_30 -Inull/include -I/opt/JavaIDE/x10-2.1.2-linux_x86/stdlib/include -o CUDATopology_sm_30.cubin CUDATopology.cu
>> Executing post-compiler g++ -I/opt/JavaIDE/x10-2.1.2-linux_x86/include -I/opt/JavaIDE/x10-2.1.2-linux_x86/stdlib/include -I/home/rgomes/tmp -I.
>> -Wno-long-long -Wno-unused-parameter -DNO_CHECKS -DX10_USE_BDWGC -pthread -o /home/rgomes/tmp/CUDATopology CUDATopology.cc xxx_main_xxx.cc -L/opt/JavaIDE/x10-2.1.2-linux_x86/stdlib/lib -lx10 -lgc -lm -lpthread -lrt -ldl -L/opt/JavaIDE/x10-2.1.2-linux_x86/lib -lx10rt_sockets -Wl,--rpath -Wl,/opt/JavaIDE/x10-2.1.2-linux_x86/stdlib/lib -Wl,--rpath -Wl,/opt/JavaIDE/x10-2.1.2-linux_x86/lib -Wl,-export-dynamic
>> x10c++: ptxas info : Compiling entry function 'CUDATopology__closure__1' for 'sm_10'
>> ptxas info : Used 3 registers, 24+16 bytes smem, 65536 bytes cmem[0]
>> x10c++: ptxas info : Compiling entry function 'CUDATopology__closure__1' for 'sm_11'
>> ptxas info : Used 3 registers, 24+16 bytes smem, 65536 bytes cmem[0]
>> x10c++: ptxas info : Compiling entry function 'CUDATopology__closure__1' for 'sm_12'
>> ptxas info : Used 3 registers, 24+16 bytes smem, 65536 bytes cmem[0]
>> x10c++: ptxas info : Compiling entry function 'CUDATopology__closure__1' for 'sm_13'
>> ptxas info : Used 3 registers, 24+16 bytes smem, 65536 bytes cmem[0]
>> x10c++: ptxas info : Compiling entry function 'CUDATopology__closure__1' for 'sm_20'
>> ptxas info : Used 4 registers, 56 bytes cmem[0], 65536 bytes cmem[2]
>> x10c++: ptxas info : Compiling entry function 'CUDATopology__closure__1' for 'sm_21'
>> ptxas info : Used 4 registers, 56 bytes cmem[0], 65536 bytes cmem[2]
>> x10c++: ptxas info : Compiling entry function 'CUDATopology__closure__1' for 'sm_30'
>> ptxas info : Used 4 registers, 56 bytes cmem[0], 65536 bytes cmem[2]
>>
>> Thanks a lot :)
>>
>> --
>> Richard Gomes
>> http://www.jquantlib.org/index.php/User:RichardGomes
>> twitter: frgomes
>>
>> JQuantLib is a library for Quantitative Finance written in Java.
>> http://www.jquantlib.com/
>> twitter: jquantlib
>>
>> ------------------------------------------------------------------------------
>> Free Software Download: Index, Search & Analyze Logs and other IT data in
>> Real-Time with Splunk. Collect, index and harness all the fast moving IT data
>> generated by your applications, servers and devices whether physical, virtual
>> or in the cloud. Deliver compliance at lower cost and gain new business
>> insights. http://p.sf.net/sfu/splunk-dev2dev
>> _______________________________________________
>> X10-users mailing list
>> X10-users@lists.sourceforge.net
>> https://lists.sourceforge.net/lists/listinfo/x10-users