Hello all,
I am trying to implement a parallel version of the N body problem.
I have tried to compile with java and c++ backend and the compiler
otpimisations -O -NO_CHECKS proposed on the webpage, but it seems like
after all these, the sequential version is always faster.
I use distributed arrays of objects with block distribution and a cluster
of 10 nodes for testing. I have tried different combinations for # of
hosts, NPLACES and NTHREADS and no real difference occurs in performance.
On the other hand, results seem correct and from printing here.id's inside
the ateach statements I can see that the workload is balanced over all
nodes.
Does the distributed array of objects give so much overhead? I read
somewhere in the documentation that heavily object-oriented applications do
not benefit a lot from performance tuning ...:(
Could someone please take a look at the code in case I am missing something
and give me a hint on performance improvement?
Below is the code:
-----------------------------------------------------
//Class UniBlock implements the universe in which the bodies are
initialised and interact with each other
import x10.util.Random;
import Body;
import x10.io.Printer;
import x10.io.Console;
import x10.lang.Math;
import x10.array.DistArray;
import x10.util.*;
public class UniBlock {
public var dimx: double;
public var dimy: double;
public var numBodies: Int;
public var R : Region(1);
public val D:Dist;
public val bodies: DistArray[Body](D);
public var px :double;
public var py :double;
public var pz :double;
public var e: double;
//Constructor
public def this(dimx: double, dimy:double, numBodies: Int){
this.dimx = dimx;
this.dimy = dimy;
this.numBodies = numBodies;
this.R = 1..numBodies;
this.D = Dist.makeBlock(R);
this.bodies = DistArray.make[Body](D, (p:Point(D.rank))=>new Body());
this.px = 0.0;
this.py = 0.0;
this.pz = 0.0;
Console.OUT.println("universe created");
Console.OUT.println("dimensions x: "+dimx+" y: "+dimy+" bodies:
"+numBodies);
}
public def printBodies(){
Console.OUT.println("bodies----------------------");
finish ateach(d in bodies){
bodies(d).printBody();
}
}
//The intensive parallel calculation is performed here.
public def Advance(val dt:double){
finish ateach(p in bodies){
px += bodies(p).velx * bodies(p).mass;
py += bodies(p).vely * bodies(p).mass;
pz += bodies(p).velz * bodies(p).mass;
}
finish ateach(b in bodies){
bi: Body = bodies(b);
finish ateach(p in bodies){
bj: Body = bodies(p);
var dx: double = bi.posx - bj.posx ;
var dy: double = bi.posy - bj.posy;
var dz: double = bi.posz - bj.posz;
var d2: double = dx*dx + dy*dy + dz*dz;
if (d2 != 0.0 ){
var mag: double = dt/ (d2*Math.sqrt(d2));
bi.velx -= dx* bj.mass * mag;
bj.velx += dx* bi.mass * mag;
bi.vely -= dy* bj.mass * mag;
bj.vely += dy* bi.mass * mag;
bi.velz -= dz* bj.mass * mag;
bj.velz += dz* bi.mass * mag;
}
}
}
finish ateach(d in bodies){
bodies(d).posx += dt* bodies(d).velx;
bodies(d).posy += dt* bodies(d).vely;
bodies(d).posz += dt* bodies(d).velz;
}
}
//Calculation of the energy produced
public def Energy(): double{
finish ateach(d in bodies){
val bi: Body = bodies(d);
e += 0.5 * bi.mass * (bi.velx*bi.velx + bi.vely*bi.vely + bi.velz*bi.velz);
finish ateach(b in bodies){
val bj: Body = bodies(b);
val dx: double = bi.posx - bj.posx;
val dy: double = bi.posy - bj.posx;
val dz: double = bi.posz - bj.posx;
e -= (bi.mass*bj.mass) / Math.sqrt(dx*dx + dy*dy + dz*dz);
}
}
return e;
}
------------------------------------------------------------------------------
//Genblock is a generator class to initialise and call functions of the
UniBlock class
import UniBlock;
import Body;
import x10.util.Timer;
import x10.io.Printer;
import x10.io.Console;
import x10.util.*;
class GenBlock{
var bodies: Int;
static val x: double = 3.5;
static val y: double = 4.2;
static val dt:double = 0.1;
static val numIterations: Int = 3;
public def run(bodies:Int){
val uni = new UniBlock(x, y, bodies);
Console.OUT.println("==== Initial positions ====");
uni.printBodies();
var e: double = 1.1;
val t = new Timer();
val start: long = t.milliTime();
for (var i: Int = 1; i <= numIterations; i++) {
uni.Advance(dt);
e= uni.Energy();
}
val stop: long = t.milliTime();
Console.OUT.println("Total energy : "+e);
Console.OUT.println("==== Final positions ====");
uni.printBodies();
val total:long = stop-start;
Console.OUT.println("");
Console.OUT.println("Time elapsed: "+((total as double)/1e9));
}
public static def main (args:Rail[String])
{
val bod = Int.parse(args(0));
new GenBlock().run(bod);
}
}
--------------------
Finaly the body class is trivial.It comprises of the Body constructor:
def this(mass: double, posx: double, posy: double, posz: double, velx:
double, vely: double, velz : double)
and some get/set and print methods.
-----------------------------
As you can see I call the methods Advance and Energy (here 3 iterations)
and I am only timing this part.
Example average time measurements :
#bodies = 1000 seq: 3.1E-7 par: 6.299E-6
#bodies = 10000 seq: 2.0492E-5 par: 2.5103E-4
#bodies = 12000 seq 2.8669E-5 par: 3.48155E-4
Another question is for the energy function:
Currently I use public var e: double; which gives me final result = 0.0
although when printing all intermediate values I can see that the
calculation is done properly.
When I try to declare it inside the function body I get warning that it
need to be a preinitialized val because it is accessed in many places. But
then again, it gives a zero result.
How should I handle this?
Regards,
Konstantina
------------------------------------------------------------------------------
This SF.net email is sponsored by Windows:
Build for Windows Store.
http://p.sf.net/sfu/windows-dev2dev
_______________________________________________
X10-users mailing list
X10-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/x10-users