Hi,

Thank you for the report. We are aware of this problem.
We investigated the JIT behavior and confirmed that the current JIT
doesn't remove these redundant boxing operations. We are now planning to
fix this problem.

Regards,
Hiroshi



From: David P Grove <gro...@us.ibm.com>
To: Mailing list for users of the X10 programming language <x10-users@lists.sourceforge.net>
Date: 2010/04/12 07:02
Subject: Re: [X10-users] X10 seems extraordinary slow.




Hi,

Here's what I see with the current svn head of X10 on my laptop
(Linux-IA32) using a recent IBM Java6.

[dgr...@wannalancit myTests]$ javac Pi_Java_Sequential.java
[dgr...@wannalancit myTests]$ java Pi_Java_Sequential
==== Java Sequential pi = 3.1415926535899708
==== Java Sequential iteration count = 1000000000
==== Java Sequential elapse = 34.160766
[dgr...@wannalancit myTests]$ x10c++ -O -o Pi_X10_Sequential Pi_X10_Sequential.x10
[dgr...@wannalancit myTests]$ runx10 Pi_X10_Sequential
==== X10 Sequential pi = 3.141592653589971
==== X10 Sequential iteration count = 1000000000
==== X10 Sequential elapse = 32.374886009000001
[dgr...@wannalancit myTests]$ x10c Pi_X10_Sequential.x10
[dgr...@wannalancit myTests]$ x10 Pi_X10_Sequential.x10
==== X10 Sequential pi = 3.1415926535899708
==== X10 Sequential iteration count = 1000000000
==== X10 Sequential elapse = 46.495616

So, with x10c++, X10 is marginally faster than Java.  With x10c, it is
roughly 36% slower than Java (46.5 s versus 34.2 s).

Looking at the generated .java file produced by x10c, the main loop of the
program was generated (after some formatting to make it readable) to be:

//#line 8
    double sum = 0.0;

//#line 9
    for (
        long i =((java.lang.Long)(((long)(((long)(1))))));
        ((i) <= (n));
        i += ((long) (int) (java.lang.Integer)(1))) {

//#line 10
        final double x =
((java.lang.Double)(((double)((((((((double)(long)(i))) - (0.5)))) *
(delta))))));

//#line 11
        sum += ((java.lang.Double)(((double)(((1.0) / ((((1.0) + (((x) *
(x)))))))))));
    }

//#line 13
    final double pi = ((java.lang.Double)(((double)(((((4.0) * (sum))) *
(delta))))));


So my analysis is that the machine-generated Java produced by x10c has
some amount of casting/conversion/local boxing going on, and the JIT
compiler isn't eliminating all of it quickly enough. This is a side effect
of our current codegen strategy for supporting X10 generics in Java.  We're
working on improving this.
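
To make the box/unbox round trip concrete, here is a minimal hand-written
sketch, not the actual x10c output. The class and method names
(BoxingRoundTrip, sumPrimitive, sumBoxed) are made up for illustration.
The boxed version copies the cast pattern from the generated loop above:
each (java.lang.Double) cast boxes the value, and the assignment to a
double unboxes it again. Both versions should return the same value, so
any timing difference would come from boxing work the JIT did not remove.

```java
class BoxingRoundTrip {
    // Plain primitive loop, as a human would write it.
    static double sumPrimitive(long n, double delta) {
        double sum = 0.0;
        for (long i = 1; i <= n; i++) {
            final double x = (i - 0.5) * delta;
            sum += 1.0 / (1.0 + x * x);
        }
        return sum;
    }

    // Same arithmetic, but each intermediate result is cast to
    // java.lang.Double (boxing) and then assigned back to a double
    // (unboxing), imitating the casts in the generated code.
    static double sumBoxed(long n, double delta) {
        double sum = 0.0;
        for (long i = ((java.lang.Long) ((long) 1));
             i <= n;
             i += ((long) (int) (java.lang.Integer) (1))) {
            final double x = ((java.lang.Double) ((double) ((i - 0.5) * delta)));
            sum += ((java.lang.Double) ((double) (1.0 / (1.0 + x * x))));
        }
        return sum;
    }

    public static void main(String[] args) {
        final long n = 1000000L;        // illustrative size, not the benchmark's
        final double delta = 1.0 / n;   // assumption: delta is 1/n in the original program
        System.out.println("primitive: " + 4.0 * sumPrimitive(n, delta) * delta);
        System.out.println("boxed:     " + 4.0 * sumBoxed(n, delta) * delta);
    }
}
```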

It's also possible that if you structured the program to first "warm up"
and then time some iterations, the JIT would have time to kick in. That
would help the x10c-generated Java program more than the human-written
Java program, because there's more trivial overhead in the x10c-generated
file for the JIT to optimize away.
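
A rough sketch of that warm-up structure, written by hand in Java. The
names (PiWarmup, computePi, timeAfterWarmup), the iteration count and the
number of warm-up passes are arbitrary choices for illustration, and
delta = 1/n is an assumption about the original program. The idea is to
run the kernel a few times untimed, then time one more pass.

```java
class PiWarmup {
    // Midpoint-rule approximation of pi, following the loop shown earlier.
    static double computePi(long n) {
        final double delta = 1.0 / n;   // assumption: delta is 1/n
        double sum = 0.0;
        for (long i = 1; i <= n; i++) {
            final double x = (i - 0.5) * delta;
            sum += 1.0 / (1.0 + x * x);
        }
        return 4.0 * sum * delta;
    }

    // Runs warmupRuns untimed passes so the JIT can compile the hot loop,
    // then returns the elapsed time in seconds of one further pass.
    static double timeAfterWarmup(long n, int warmupRuns) {
        double sink = 0.0;              // accumulate results so the work is not dead code
        for (int r = 0; r < warmupRuns; r++) {
            sink += computePi(n);
        }
        final long start = System.nanoTime();
        sink += computePi(n);
        final long elapsedNanos = System.nanoTime() - start;
        if (Double.isNaN(sink)) {
            System.out.println("unexpected NaN");   // uses sink; never expected to print
        }
        return elapsedNanos / 1e9;
    }

    public static void main(String[] args) {
        final long n = 10000000L;       // smaller than the 10^9 used in the runs above
        System.out.println("pi      = " + computePi(n));
        System.out.println("elapsed = " + timeAfterWarmup(n, 5) + " s (after warm-up)");
    }
}
```

Comparing this number with a cold, single-pass timing of the same kernel
would show how much of the measured gap is JIT start-up rather than
steady-state cost.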

--dave
------------------------------------------------------------------------------
Download Intel® Parallel Studio Eval
Try the new software tools for yourself. Speed compiling, find bugs
proactively, and fine-tune applications for parallel performance.
See why Intel Parallel Studio got high marks during beta.
http://p.sf.net/sfu/intel-sw-dev
_______________________________________________
X10-users mailing list
X10-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/x10-users

