Hi Jonas,

I am starting to look into this. Sorry for the delay.

There is a big difference between X10 and APGAS when it comes to detecting 
place failures.
X10 does the detection quickly because it owns the communication channels 
between the places.
As soon as a channel is closed, it assumes the place has failed and reports 
the failure.
APGAS on the other hand relies on Hazelcast for all communication 
channels.
Hazelcast does not consider the loss of a channel as an irrecoverable 
error and tries to reconnect.
I have just committed a patch to instruct Hazelcast to give up on failed 
connections faster.
That should help a lot.
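
For illustration only, here is a minimal Java sketch of what tightening 
Hazelcast's failure detection can look like. It is not the committed patch: 
the class name is hypothetical, the values are arbitrary, and the four 
property names are Hazelcast 3.x settings assumed to be relevant here. It 
relies on Hazelcast falling back to JVM system properties for hazelcast.* 
settings, so it would have to run before the APGAS runtime (and with it 
Hazelcast) starts.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of APGAS or Hazelcast.
public class FastFailureDetection {

    // Hazelcast 3.x property names; the values are illustrative guesses,
    // not the ones used in the actual APGAS commit.
    static Map<String, String> fastFailureProperties() {
        Map<String, String> props = new LinkedHashMap<>();
        // Stop waiting on a socket connect attempt after 1 second.
        props.put("hazelcast.socket.connect.timeout.seconds", "1");
        // Declare a connection faulty after a single socket fault.
        props.put("hazelcast.connection.monitor.max.faults", "1");
        // Minimum interval (ms) between two faults counted separately.
        props.put("hazelcast.connection.monitor.interval", "100");
        // Suspect a member after 5 seconds without a heartbeat.
        props.put("hazelcast.max.no.heartbeat.seconds", "5");
        return props;
    }

    // Install the settings as JVM system properties, keeping any value
    // already supplied on the command line with -D.
    static void apply() {
        fastFailureProperties().forEach((key, value) -> {
            if (System.getProperty(key) == null) {
                System.setProperty(key, value);
            }
        });
    }

    public static void main(String[] args) {
        apply();
        for (String key : fastFailureProperties().keySet()) {
            System.out.println(key + "=" + System.getProperty(key));
        }
    }
}
```

The same settings could instead be passed as -D flags next to 
-Dapgas.resilient=true. Aggressive values trade faster failure reports for 
a higher risk of declaring a slow but healthy place dead.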

Like you, I have also observed some performance issues with Hazelcast 
3.6.3 and upgraded to 3.7.1 in another commit.

Now, this is not going to make up for the difference as Hazelcast will 
still spend up to 1s trying to reconnect and some time doing 
reconfiguration.
I will continue to look into ways to improve the recovery time with 
Hazelcast.

Could you please update and rerun your benchmark programs and share the 
results?

Regards,

Olivier


Jonas Posner <jonas.pos...@uni-kassel.de> wrote on 08/31/2016 08:02:45 AM:

> From: Jonas Posner <jonas.pos...@uni-kassel.de>
> To: Mailing list for users of the X10 programming language <x10-
> us...@lists.sourceforge.net>
> Date: 08/31/2016 08:38 AM
> Subject: [X10-users] [APGAS] Performance at place failure
> 
> Hi all,
> 
> I am playing around with the resiliency of APGAS. I wondered about the 
> relatively high times for managing a place failure. For comparison, I wrote 
> a simple program in X10 and APGAS. Both are attached. The results show 
> a significant difference.
> 
> Every experiment was run with 4 places. X10 and APGAS are deployed from 
> the official git repository.
> 
> Native X10 with X10_RESILIENT_MODE=1
> without crash: 0.003
> with crash: 0.015
> 
> 
> Managed X10 with Hazelcast 3.3.1 and X10_RESILIENT_MODE=1
> without crash: 0.34
> with crash: 0.74
> 
> 
> APGAS with Hazelcast 3.6.3 and -Dapgas.serialization=java 
> -Dapgas.resilient=true -Dapgas.compact=false
> without crash: 0.86
> with crash: varies widely: 8-38
> 
> 
> APGAS with Hazelcast 3.6.3 and -Dapgas.serialization=java 
> -Dapgas.resilient=true -Dapgas.compact=true
> without crash: 0.8
> with crash: varies widely: 8-31
> 
> 
> APGAS with Hazelcast 3.7 and -Dapgas.serialization=java 
> -Dapgas.resilient=true -Dapgas.compact=false
> without crash: 0.77
> with crash: 5.7 or 11.33 (50:50)
> 
> 
> APGAS with Hazelcast 3.7 and -Dapgas.serialization=java 
> -Dapgas.resilient=true -Dapgas.compact=true
> without crash: 0.74
> with crash: 5.7 or 11.33 (50:50)
> 
> 
> 
> Managed X10 absorbs a failure significantly better than APGAS. However, 
> Managed X10 uses Hazelcast 3.3.1 and (official) APGAS uses Hazelcast 
> 3.6.3. What causes the differences?
> 
> A few days ago, Hazelcast 3.7 was released. In my experiments, 
> it performs better.
> 
> What is the purpose of "apgas.compact=true"? Should APGAS perform better 
> with it?
> 
> Are there other options to improve APGAS's performance at a place 
> failure?
> 
> 
> 
> Thanks and cheers
> 
> -- 
> Jonas Posner
> Universitaet Kassel
> Fachbereich 16 Elektrotechnik/Informatik
> Fachgebiet Programmiersprachen/-methodik
> Wilhelmshoeher Allee 71-73
> 34121 Kassel, Germany
> 
> Phone:  +49 (0)561 804-6498
> Fax:    +49 (0)561 804-6219
> mailto: jonas.pos...@uni-kassel.de
> www.uni-kassel.de
> [attachment "MessCrashingTime.java" deleted by Olivier Tardieu/
> Watson/IBM] [attachment "MessCrashingTime.x10" deleted by Olivier 
> Tardieu/Watson/IBM] 
> 
------------------------------------------------------------------------------
> _______________________________________________
> X10-users mailing list
> X10-users@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/x10-users

