Re: [X10-users] Performance of foreach

Krishna Nandivada Venkata Tue, 13 Apr 2010 21:22:29 -0700

Hi,
There can be many reasons, but most likely it is because you are creating
far more activities than the system has parallelism to run them: each
iteration of the foreach in add2() is spawned as its own activity. See the
following ICS 2009 paper for an explanation of this, and a way out:


Chunking Parallel Loops in the Presence of Synchronization. By Jun Shirako,
Jisheng Zhao, V. Krishna Nandivada, Vivek Sarkar.
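
As a rough illustration of the chunking idea (a minimal sketch only, not the
algorithm from the paper), the loop can be split into a small number of
chunks so that each activity handles many iterations. The names data and
size come from the test program quoted below; add4, numChunks, chunkSize, lo
and hi are hypothetical names introduced here, and the syntax simply follows
the X10 version used in that program.

```x10
// Sketch: one activity per chunk instead of one activity per element.
// numChunks is an assumed parameter, e.g. the value of X10_NTHREADS.
def add4(numChunks: Int)
{
    // Ceiling division, so the chunks together cover all size elements.
    val chunkSize = (size + numChunks - 1) / numChunks;
    finish foreach ((p) in 0..numChunks-1)
    {
        val lo = p * chunkSize;
        // Clamp the last chunk to the end of the array.
        val hi = (lo + chunkSize < size) ? lo + chunkSize - 1 : size - 1;
        if (lo <= hi)
        {
            for ((i) in lo..hi)
            {
                data(i) += 5;
            }
        }
    }
}
```

With numChunks set to 2 this should do the same work as add3() below, except
that it also covers the leftover elements when size is not a multiple of the
chunk count.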

Warm regards,
Krishna.


From:    "Liu, Xing" <xing....@gatech.edu>
To:      x10-users <x10-users@lists.sourceforge.net>
Date:    04/13/2010 10:24 PM
Subject: [X10-users] Performance of foreach





Hi,


We used the following code to test the performance of foreach.
    add1() is sequential code.
    In add2(), we use foreach and let X10 partition the workload.
    In add3(), we partition the workload ourselves.

We use the C++ backend, and run the code as
# env X10_NTHREADS=2 runx10 ./Test_foreach

The performance:

time of add1() = 32.3 ms
time of add2() = 3277.98 ms
time of add3() = 18.33 ms

It is surprising that add2() is about 100 times slower than add1(). Does
anyone know the reason? Thanks.


// Test_foreach.x10

def add1()
{
    for ((i) in 0..size-1)
    {
        data(i) += 5;
    }
}

def add2()
{
    finish foreach ((i) in 0..size-1)
    {
        data(i) += 5;
    }
}

def add3()
{
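    var numThreads: Int = 2;
    // Integer division: any remainder elements are skipped unless
    // size is a multiple of numThreads.
    val mySize = size/numThreads;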
    finish foreach ((p) in 0..numThreads-1)
    {
        for ((i) in p*mySize..(p+1)*mySize-1)
        {
            data(i) += 5;
        }
    }
}

------------------------------------------------------------------------------

Download Intel® Parallel Studio Eval
Try the new software tools for yourself. Speed compiling, find bugs
proactively, and fine-tune applications for parallel performance.
See why Intel Parallel Studio got high marks during beta.
http://p.sf.net/sfu/intel-sw-dev
_______________________________________________
X10-users mailing list
X10-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/x10-users



