Hi there,

I am experimenting with the current SVN version (R28225). In particular, I tried using allreduce in combination with Place.places(). Example code is appended. I made the following observations:

- The behaviour of the code changes with X10_NPLACES.
- The behaviour of the code changes depending on which place died.
- The behaviour is (partially) non-deterministic.

I tested the code under Ubuntu Linux 14.04 with gcc 4.8.3 and OpenJDK 1.7.0_65. I compiled the program with both the C++ and the Java backend; both show the same behaviour. The results are summarized in the table below.

        | ID of killed place (args(0))
NPLACES |  1  |  2  |  3  |  4  |  5  |  6  |  7
--------+-----+-----+-----+-----+-----+-----+-----
      2 |  P  |     |     |     |     |     |
      4 | ok  |  P  |  P  |     |     |     |
      8 | ok  | S/2 | S/2 |  P  |  P  |  P  |  P


"P" means: the program aborts prematurely.
The output can take one of two forms: a short one and a long one.
short:
$ ./Hello 2
x10.lang.DeadPlaceException: DeadPlaceException at Place(2)
Place 2 exited unexpectedly with exit code: 1
Launcher 2: cleanup complete, exit code=1.  Goodbye!
Launcher 0: cleanup complete, exit code=1.  Goodbye!

long:
$ ./Hello 3
x10.lang.DeadPlaceException: DeadPlaceException at Place(3)
Place 3 exited unexpectedly with exit code: 1
Launcher 3: cleanup complete, exit code=1.  Goodbye!
Launcher 1: cleanup complete, exit code=1.  Goodbye!
at x10::lang::FinishResilientPlace0::addDeadPlaceException(x10::lang::FinishResilientPlace0__State*, long long)
        at x10::lang::FinishResilientPlace0::quiescent(long long)
        at x10::lang::FinishResilientPlace0::notifyPlaceDeath()
        at x10::lang::FinishResilient::notifyPlaceDeath()
        at x10::lang::Runtime::notifyPlaceDeath()
at x10::lang::Runtime__Pool::scan(x10::util::Random*, x10::lang::Runtime__Worker*)
        at x10::lang::Runtime__Worker::loop()
        at x10::lang::Runtime__Worker::__apply()
        at x10::lang::Thread::thread_start_routine(void*)
        at GC_inner_start_routine
        at GC_call_with_stack_base
        at GC_start_routine
        at
        at clone
Place(0): Place.places():
Launcher 0: cleanup complete, exit code=1.  Goodbye!
Launcher -1: cleanup complete, exit code=1.  Goodbye!

"S" means the program gets stuck; the output looks like this:
$ ./Hello 3
x10.lang.DeadPlaceException: DeadPlaceException at Place(3)
at x10::lang::FinishResilientPlace0::addDeadPlaceException(x10::lang::FinishResilientPlace0__State*, long long)
        at x10::lang::FinishResilientPlace0::quiescent(long long)
        at x10::lang::FinishResilientPlace0::notifyPlaceDeath()
        at x10::lang::FinishResilient::notifyPlaceDeath()
        at x10::lang::Runtime::notifyPlaceDeath()
at x10::lang::Runtime__Pool::scan(x10::util::Random*, x10::lang::Runtime__Worker*)
        at x10::lang::Runtime__Worker::loop()
        at x10::lang::Runtime__Worker::__apply()
        at x10::lang::Thread::thread_start_routine(void*)
        at GC_inner_start_routine
        at GC_call_with_stack_base
        at GC_start_routine
        at
        at clone
Place(0): Place.places():
    Place(0)
    Place(1)
    Place(2)
    Place(4)
    Place(5)
    Place(6)
    Place(7)
Place(0): stage 1.
Place 3 exited unexpectedly with exit code: 1
(still running)

"S/2" means the program gets stuck in about half of the runs and runs as expected in the other half, i.e. the behaviour is non-deterministic.

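In case someone wants to reproduce the table, a loop along the following lines should do. This is only a sketch under a few assumptions: it uses the native (C++ backend) executable `./Hello`, it relies on GNU `timeout` to detect a stuck run, and it assumes that a run counted as "ok" exits with status 0. The names `HELLO`, `LIMIT` and `run_matrix` are made up for this sketch and are not part of X10. Any other environment settings are simply inherited from the calling shell.

```shell
#!/usr/bin/env bash
# Sketch: run the example for every (NPLACES, killed place) pair in the table.
# HELLO and LIMIT are hypothetical knobs for this script, not X10 settings.
HELLO="${HELLO:-./Hello}"   # path to the compiled example (native backend)
LIMIT="${LIMIT:-60}"        # seconds after which a run is counted as stuck

run_matrix() {
    local n p status result
    for n in 2 4 8; do
        # Place 0 runs main, so only places 1..n-1 are candidates for killing.
        for p in $(seq 1 $((n - 1))); do
            X10_NPLACES="$n" timeout "$LIMIT" "$HELLO" "$p" >/dev/null 2>&1
            status=$?
            case "$status" in
                0)   result="ok" ;;  # assumption: a normal run exits with 0
                124) result="S"  ;;  # GNU timeout's status when the limit is hit
                *)   result="P"  ;;  # any other non-zero status: aborted
            esac
            echo "NPLACES=$n killed=$p result=$result"
        done
    done
}
```

Calling `run_matrix` prints one line per table cell. Since the "S/2" cells are non-deterministic, the function would have to be called several times to see both outcomes.
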
Is there something wrong with the way I use broadcastFlat/allreduce, or are the corresponding library functions still under construction for X10 2.5.0?

Cheers,
Marco

Appended example code:

import x10.util.Team;

public class Hello
{
    var i:Long;

    public def this(val i:Long) { this.i = i; }

    public static def main(args:Rail[String])
    {
        // One Hello instance per place, initialised with that place's id.
        val plh:PlaceLocalHandle[Hello]
            = PlaceLocalHandle.makeFlat[Hello]
                (Place.places()
                , () => new Hello(Runtime.hereLong()));

        // Kill the place whose id is given as args(0); the resulting
        // exception is caught and printed.
        try { at (Place(Long.parse(args(0)))) { System.killHere(); } }
        catch (e:CheckedThrowable) { e.printStackTrace(Console.OUT); }

        // List the places that Place.places() reports after the kill.
        Console.OUT.println(here + ": Place.places():");

        for (p:Place in Place.places()) { Console.OUT.println("    " + p); }

        Console.OUT.println(here + ": stage 1.");
        // Build a team over Place.places() and sum i across its members.
        val t:Team = Team(Place.places());
        Place.places().broadcastFlat
            (() =>
            {
                plh().i = t.allreduce(plh().i, Team.ADD);
                Console.OUT.println(here + ": stage 2.");
            });
        Console.OUT.println(here + ": stage 3.");
        Console.OUT.println(plh().i);
    }
}
_______________________________________________
X10-users mailing list
X10-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/x10-users
