Jim LaGrone <jlagr...@cs.uh.edu> wrote on 08/27/2009 11:47:44 AM:

> Apparently I can't just use a distribution at the time of an array's 
> creation and get away with it. Can anyone suggest where I can look to 
> get a better understanding of how to parallelize computation?
> 
> I have this array
> 
>    var regionS: Region{rank==2};
>    var distS: Dist{rank==2};
>    var S: Array[Complex](distS);
>    ...
>       regionS = [0..N-1, 0..Mc-1];
>       distS = Dist.makeBlock(regionS,0);
>       S = Array.makeVar[Complex](distS, ((p):Point) => new Complex ());
> 
> with this work
> 
>       for( i = 0; i < N; i++){
>          for( j = 0;j < Mc; j++){
>             rawNormPower2 += Math.sqrt(S(i,j).real*S(i,j).real +
>                S(i,j).image*S(i,j).image);
>          }
>       }
> 
> and this exception
> 
> x10.lang.BadPlaceException: point (110,0) not defined at (Place 0)
>   in freadK1
> x10.lang.BadPlaceException: point (110,0) not defined at (Place 0)
>    at x10.array.BaseArray$23.apply(BaseArray.java:1330)
>    at x10.array.BaseArray$23.apply(BaseArray.java:1)
>    at x10.array.RectRegion.check(RectRegion.java:709)
>    at x10.array.BaseArray.checkPlace(BaseArray.java:715)
>    at x10.array.DistArray.apply(DistArray.java:126)
>    at State.formImage(State.java:634)
>    at State.stage1(State.java:1250)
>    at RUN_knowledgeFormation.main(RUN_knowledgeFormation.java:458)
>    at RUN_knowledgeFormation$Main$1.apply(RUN_knowledgeFormation.java:48)
>    at x10.runtime.Activity.now(Activity.java:222)
>    at x10.runtime.Activity.run(Activity.java:127)
>    at x10.runtime.Worker$3.apply(Worker.java:330)
>    at x10.runtime.impl.java.Runtime.runAt(Runtime.java:96)
>    at x10.runtime.Worker.loop(Worker.java:317)
>    at x10.runtime.Runtime.start(Runtime.java:143)
>    at RUN_knowledgeFormation$Main.main(RUN_knowledgeFormation.java:35)
>    at x10.runtime.impl.java.Runtime.run(Runtime.java:46)
>    at java.lang.Thread.run(Thread.java:613)

Jim,

You can use the distribution, but your work has to be distributed too.

You didn't mention the number of places you're running with, but I assume
it's more than one.  The whole point of a distribution is that your array
gets distributed over multiple places.  However, the above work loop does
all of its iterations in place 0 (or whatever place you run it from).

In X10, you can only access a location (be it an object or an array
element) from the place where that location resides.  So, you can
change your work loop to the following:

      val rawNormPower2 : Box[Double] = new Box[Double](0);
      finish ateach ( (i, j) in distS ) {
         val tmp = Math.sqrt(S(i,j).real*S(i,j).real +
               S(i,j).image*S(i,j).image);
         async (Place.places(0)) {
            atomic {
               rawNormPower2.value += tmp;
            }
         }
      }

This will distribute the iterations of the loop to the appropriate places.
You need the box because an async (which is what the body of an ateach
loop is) cannot capture a non-final variable.  You need the async because
rawNormPower2 lives in place 0, and you cannot access it from anywhere
else.  And you need the atomic statement to make sure the += operation
does not introduce data races, since all iterations will run in parallel.
The finish statement ensures that all of the parallel computation
completes before the program proceeds.
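
If it helps to see the same shape outside X10, here is a rough analogy in
plain Java.  This is not X10 and has no notion of places: pool threads stand
in for the activities, a one-element array stands in for the Box (Java
lambdas have a similar restriction on capturing non-final locals), a
synchronized block stands in for atomic, and waiting for the pool to drain
stands in for finish.  The class name, method name, and pool size are all
made up for illustration.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

class AtomicSumSketch {
    // re[i][j] and im[i][j] hold the real and imaginary parts of element (i,j).
    public static double sumOfMagnitudes(double[][] re, double[][] im) {
        // One-element array: a mutable cell that the lambdas below can capture.
        final double[] total = {0.0};
        final Object lock = new Object();
        ExecutorService pool = Executors.newFixedThreadPool(4); // arbitrary size
        for (int i = 0; i < re.length; i++) {
            for (int j = 0; j < re[i].length; j++) {
                final int row = i, col = j;
                // One task per element, like one ateach iteration per point.
                pool.execute(() -> {
                    double tmp = Math.sqrt(re[row][col] * re[row][col]
                            + im[row][col] * im[row][col]);
                    // Guard the += so concurrent tasks cannot race on it.
                    synchronized (lock) {
                        total[0] += tmp;
                    }
                });
            }
        }
        // Wait for every task to complete before reading the result.
        pool.shutdown();
        try {
            if (!pool.awaitTermination(1, TimeUnit.MINUTES)) {
                throw new IllegalStateException("tasks did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        synchronized (lock) {
            return total[0];
        }
    }
}
```

Note that every element takes the lock once, so the accumulation itself is
effectively serialized; only the square roots run in parallel.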

You can also use the place-local idiom, as follows:

      val norms = Array.make[Double](Dist.makeUnique(), (Point)=>0.0);
      finish ateach ( (p) in distS.places() ) {
         for ( (i, j) in distS | p ) { // at this point, p == here
            norms(p.id) += Math.sqrt(S(i,j).real*S(i,j).real +
               S(i,j).image*S(i,j).image);
         }
      }
      rawNormPower2 = norms.sum(); // a distributed reduction

Here, norms is an array with a unique distribution (which has one point
per place).  The ateach loop again spawns parallel iterations on each
place (but now there's one per place covered by the distribution).  The
body of the ateach has a sequential loop that runs over the part of distS
that lives in that place.  Each iteration of that sequential loop updates
the local element of norms.  There is no need for atomic, since all
updates to a given element of norms happen sequentially within its place.
Once the ateach loop is done, a (distributed) reduction is performed over
the norms array, and the sum is stored in the final result location.
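
The same partial-sums idea, again as a plain Java analogy rather than X10:
each thread plays the role of one place, owns a contiguous block of rows,
and writes only to its own slot of norms, so no locking is needed.  The
class and method names are hypothetical, and the row split used here is
only an assumed block partition; Dist.makeBlock may divide the rows
differently.

```java
class PlaceLocalSumSketch {
    // re[i][j] and im[i][j] hold the real and imaginary parts of element (i,j).
    // 'places' is the number of simulated places (threads) and must be >= 1.
    public static double sumOfMagnitudes(double[][] re, double[][] im, int places) {
        final int n = re.length;
        final double[] norms = new double[places]; // one slot per simulated place
        Thread[] workers = new Thread[places];
        for (int p = 0; p < places; p++) {
            final int id = p;
            // Assumed block split of rows [0, n) across the simulated places.
            final int lo = (int) ((long) id * n / places);
            final int hi = (int) ((long) (id + 1) * n / places);
            workers[p] = new Thread(() -> {
                // Sequential loop over the rows this "place" owns.
                for (int i = lo; i < hi; i++) {
                    for (int j = 0; j < re[i].length; j++) {
                        norms[id] += Math.sqrt(re[i][j] * re[i][j]
                                + im[i][j] * im[i][j]);
                    }
                }
            });
            workers[p].start();
        }
        // Wait for all workers; join also makes their writes visible here.
        for (Thread w : workers) {
            try {
                w.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        // Final reduction over the per-place partial sums.
        double total = 0.0;
        for (double v : norms) {
            total += v;
        }
        return total;
    }
}
```

Compared with the one-task-per-element version, this does one reduction
step per place instead of one synchronized update per element.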

Hope this helps you proceed.
        Igor
-- 
Igor Peshansky  (note the spelling change!)
IBM T.J. Watson Research Center
XJ: No More Pain for XML's Gain (http://www.research.ibm.com/xj/)
X10: Parallel Productivity and Performance (http://x10.sf.net/)


_______________________________________________
X10-users mailing list
X10-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/x10-users
