That looks like a bug I fixed in the last few months, but I can't remember whether I fixed it before or after the last release. You might try doing an SVN build of X10, or using the PGAS binaries for the svn release with your version of X10.

The error comes from our transport layer, which was originally used just for UPC. The error messages have not been changed to reflect its more general use these days.

If you are using the MPI backend then this should never have occurred, but I expect you are using mpirun as a launcher to launch the PGAS backend, which is not what we mean by the MPI backend. The MPI backend is a completely different codebase and therefore has a different set of bugs :)

________________________________________
From: Christoph Pospiech [christoph.pospi...@de.ibm.com]
Sent: 07 February 2010 06:25
To: Cunningham, David
Cc: x10-users@lists.sourceforge.net
Subject: Re: [X10-users] DistributedRail

On Wednesday 03 February 2010 10:24:59 pm Cunningham, David wrote:
> Hi, I had a quick look at the differences between KMeansCUDA's own
> DistributedRail and the system one, they are very similar except for some
> type system stuff which may or may not still be important.
>
> However the DistributedRail internally synchronises using clocks so you
> must be using clocks in your program. You should treat the collective
> operation like a 'next' statement, it actually uses several next
> statements internally.
>
> Thanks
>

David,

thanks for your hint to use clocks.

Apparently, I am still using the DistributedRail class the wrong way. Following your hint to use the collective operation like a "next" statement, I was calling it "collectively" from every place. But this gives me an error message as printed below. Could you please have a look?

I actually stripped down the program to a small test example that I attached to this mail. Running this program with the C++ back end on two MPI tasks, I get the following error message.

<map>
  <host name="sirius" slots="1" max_slots="0">
    <process rank="0"/>
    <process rank="1"/>
  </host>
</map>
<stdout rank="0">main: before ClockTest</stdout>
<stdout rank="0">Before next</stdout>
<stdout rank="0">After next</stdout>
<stdout rank="0">v_tmp(0)=0</stdout>
<stdout rank="1">Before next</stdout>
<stdout rank="1">After next</stdout>
<stdout rank="1">v_tmp(0)=1</stdout>
<stderr rank="0">0: xlupc transport: AMSend: handler returned NULL</stderr>
--------------------------------------------------------------------------
mpirun noticed that process rank 0 with PID 3572 on node sirius exited on signal 6 (Aborted).
--------------------------------------------------------------------------

Does this ring a bell with you? This error message disappears when the collective operation is commented out.

Also, I don't understand the error message. I am running on x86 (32 bit) with g++ and OpenMPI.
How come I get a message from "xlupc", which looks like an xl-compiler (for UPC?)? I thought that xl-compilers only exist for POWER CPUs.

--
Kind regards

Dr. Christoph Pospiech
High Performance & Parallel Computing
Phone: +49-351 86269826
Mobile: +49-171-765 5871
E-Mail: christoph.pospi...@de.ibm.com
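
The advice quoted in this thread, to treat the collective operation like a 'next' statement that every participant must reach, can be pictured with a small analogy. The sketch below is Python, not X10, and it does not use DistributedRail or reproduce the reported error. It assumes a `threading.Barrier` as a stand-in for a clock, and the names `run_collective` and `place` are hypothetical, chosen only for this illustration.

```python
import threading

def run_collective(num_places=2):
    """Analogy only: each thread plays the role of one place."""
    barrier = threading.Barrier(num_places)  # stand-in for a clock
    local = [None] * num_places              # one slot per participant
    results = [None] * num_places

    def place(rank):
        local[rank] = rank       # each participant writes its own slot
        barrier.wait()           # like "next": wait until everyone has written
        total = sum(local)       # now it is safe to read all slots
        barrier.wait()           # second "next": nobody moves on while others read
        results[rank] = total

    threads = [threading.Thread(target=place, args=(r,)) for r in range(num_places)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results

if __name__ == "__main__":
    print(run_collective(2))
```

If one participant skipped either `barrier.wait()` call, the others would block, which is why a barrier-style collective has to be called from every participant rather than from just one.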