Hi there,
I am playing around with the current svn-version (R28225). In
particular, I tried using allreduce in combination with Place.places.
Example code is appended. I was able to make the following observations:
- The behaviour of the code changes with X10_NPLACES.
- The behaviour of the code changes depending on which place died.
- The behaviour is (partially) non-deterministic.
I tested the code under Ubuntu Linux 14.04 with gcc 4.8.3 and OpenJDK
1.7.0_65. My results are summarized in the table below.
I compiled the program with both the C++ and the Java backend; both
show the same behaviour.
        | ID of killed place (args(0)):
NPLACES |  1  |  2  |  3  |  4  |  5  |  6  |  7  |
   2    |  P  |     |     |     |     |     |     |
   4    | ok  |  P  |  P  |     |     |     |     |
   8    | ok  | S/2 | S/2 |  P  |  P  |  P  |  P  |
"P" means the program aborted prematurely.
The output can take one of two forms: a short one and a long one.
short:
$ ./Hello 2
x10.lang.DeadPlaceException: DeadPlaceException at Place(2)
Place 2 exited unexpectedly with exit code: 1
Launcher 2: cleanup complete, exit code=1. Goodbye!
Launcher 0: cleanup complete, exit code=1. Goodbye!
long:
$ ./Hello 3
x10.lang.DeadPlaceException: DeadPlaceException at Place(3)
Place 3 exited unexpectedly with exit code: 1
Launcher 3: cleanup complete, exit code=1. Goodbye!
Launcher 1: cleanup complete, exit code=1. Goodbye!
at x10::lang::FinishResilientPlace0::addDeadPlaceException(x10::lang::FinishResilientPlace0__State*, long long)
at x10::lang::FinishResilientPlace0::quiescent(long long)
at x10::lang::FinishResilientPlace0::notifyPlaceDeath()
at x10::lang::FinishResilient::notifyPlaceDeath()
at x10::lang::Runtime::notifyPlaceDeath()
at x10::lang::Runtime__Pool::scan(x10::util::Random*, x10::lang::Runtime__Worker*)
at x10::lang::Runtime__Worker::loop()
at x10::lang::Runtime__Worker::__apply()
at x10::lang::Thread::thread_start_routine(void*)
at GC_inner_start_routine
at GC_call_with_stack_base
at GC_start_routine
at
at clone
Place(0): Place.places():
Launcher 0: cleanup complete, exit code=1. Goodbye!
Launcher -1: cleanup complete, exit code=1. Goodbye!
"S" means the program gets stuck; the output looks like this:
$ ./Hello 3
x10.lang.DeadPlaceException: DeadPlaceException at Place(3)
at x10::lang::FinishResilientPlace0::addDeadPlaceException(x10::lang::FinishResilientPlace0__State*, long long)
at x10::lang::FinishResilientPlace0::quiescent(long long)
at x10::lang::FinishResilientPlace0::notifyPlaceDeath()
at x10::lang::FinishResilient::notifyPlaceDeath()
at x10::lang::Runtime::notifyPlaceDeath()
at x10::lang::Runtime__Pool::scan(x10::util::Random*, x10::lang::Runtime__Worker*)
at x10::lang::Runtime__Worker::loop()
at x10::lang::Runtime__Worker::__apply()
at x10::lang::Thread::thread_start_routine(void*)
at GC_inner_start_routine
at GC_call_with_stack_base
at GC_start_routine
at
at clone
Place(0): Place.places():
Place(0)
Place(1)
Place(2)
Place(4)
Place(5)
Place(6)
Place(7)
Place(0): stage 1.
Place 3 exited unexpectedly with exit code: 1
(still running)
"S/2" means the program gets stuck about half of the time and runs as
expected the other half, i.e. the behaviour is non-deterministic.
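For anyone who wants to repeat the sweep behind the table, it could be scripted roughly as below. This is only a sketch under several assumptions: the helper names classify and run_cell are made up for illustration, the 30-second limit used to detect a hang is arbitrary, a run is counted as "ok" only if the launcher exits with status 0 (which may not hold once a place has been killed), and GNU timeout is assumed to be available. Whatever resilient-mode settings were used for the runs above also have to be set in the environment.

```shell
#!/bin/sh
# Hypothetical helper: map an exit status to the legend used in the table.
classify() {
    case "$1" in
        0)   echo "ok" ;;  # assumption: a run that behaves as expected exits with 0
        124) echo "S"  ;;  # GNU timeout exits with 124 when the time limit is hit
        *)   echo "P"  ;;  # every other status is treated as a premature abort
    esac
}

# Run one cell of the table.
#   $1 = number of places (X10_NPLACES), $2 = ID of the place to kill (args(0))
run_cell() {
    X10_NPLACES="$1" timeout 30 ./Hello "$2" >/dev/null 2>&1
    classify "$?"
}

# Sweep the same combinations as the table, but only if the binary exists.
if [ -x ./Hello ]; then
    for n in 2 4 8; do
        p=1
        while [ "$p" -lt "$n" ]; do
            printf 'NPLACES=%s killed=%s -> %s\n' "$n" "$p" "$(run_cell "$n" "$p")"
            p=$((p + 1))
        done
    done
fi
```

Because the "S/2" cells are non-deterministic, each combination would have to be run several times before drawing conclusions. Note also that timeout only signals the launcher it started, so places left over from a stuck run may need to be cleaned up by hand.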
Is there something wrong with the way I use broadcast/allreduce, or are
the corresponding library functions still under construction in X10 2.5.0?
Cheers,
Marco
import x10.util.Team;

public class Hello
{
    var i:Long;

    public def this(val i:Long) { this.i = i; }

    public static def main(args:Rail[String])
    {
        // One Hello instance per place, initialised with that place's ID.
        val plh:PlaceLocalHandle[Hello]
            = PlaceLocalHandle.makeFlat[Hello]
                (Place.places()
                , () => new Hello(Runtime.hereLong()));

        // Kill the place whose ID is given as args(0).
        try { at (Place(Long.parse(args(0)))) { System.killHere(); } }
        catch (e:CheckedThrowable) { e.printStackTrace(Console.OUT); }

        // Print the places reported by Place.places() after the kill.
        Console.OUT.println(here + ": Place.places():");
        for (p:Place in Place.places()) { Console.OUT.println(" " + p); }

        Console.OUT.println(here + ": stage 1.");

        // Build a team over those places and sum the per-place values.
        val t:Team = Team(Place.places());
        Place.places().broadcastFlat
            (() =>
            {
                plh().i = t.allreduce(plh().i, Team.ADD);
                Console.OUT.println(here + ": stage 2.");
            });

        Console.OUT.println(here + ": stage 3.");
        Console.OUT.println(plh().i);
    }
}
_______________________________________________
X10-users mailing list
X10-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/x10-users