On Jun 24, 2011, at 6:32 PM, Igor Peshansky wrote:

> Hi, Dave,
> 
> Since 2.1.2, X10 comes with what we call a multi-vm Java backend
> implementation.  It runs with the sockets transport.  For sockets, the
> Java runner, "x10", uses the same launcher as the C++ backend
> ("X10Launcher"), so one can run it on, e.g., a Linux cluster by
> setting X10_NPLACES and X10_HOSTFILE, just like you would for a C++
> launch.

I am using X10 2.2.0.  I have figured out how to launch multi-vm across 
multiple nodes of the OSC cluster using the x10 command with X10_NPLACES and 
X10_HOSTFILE.
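
For anyone following along, here is a minimal sketch of that launch, written in Python purely for illustration. Only the x10 command and the X10_NPLACES and X10_HOSTFILE variables come from this thread; the helper names x10_launch_env and x10_command, the place count, the hostfile path and the MontyPi argument are all hypothetical.

```python
import os


def x10_launch_env(nplaces, hostfile, base_env=None):
    """Return a copy of the environment with the two launcher variables set.

    X10_NPLACES and X10_HOSTFILE are the variables named in this thread;
    everything else is inherited unchanged from base_env (or os.environ).
    """
    env = dict(os.environ if base_env is None else base_env)
    env["X10_NPLACES"] = str(nplaces)
    env["X10_HOSTFILE"] = hostfile
    return env


def x10_command(main_class, *args):
    """Build the argument list for the 'x10' runner (hypothetical helper)."""
    return ["x10", main_class, *map(str, args)]
```

A job script could then call something like subprocess.run(x10_command("MontyPi", 1000), env=x10_launch_env(4, "/path/to/hostfile")).  Whether the node file that Torque provides can be used directly as the hostfile is an assumption I have not checked.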

However, on our cluster we are always concerned about process cleanup.  So, 
as a test, I ran a four-node MontyPi job (good old MontyPi!) with one java 
instance per node.  I logged into the nodes and ran top on each.  
Sure enough, one 'java' process per node.  

Then I ran it again, but this time I manually killed one of the JVMs (one of 
the launcher "children").  A sibling on one node exited, but a sibling on 
another node did not, and neither did the parent.  I was hoping that the exit 
of any process would cause the entire set of processes to shut down.  Looking 
at the source code for launcher.cc, it looks like there are hooks in there 
(Launcher::handleDeadChild and Launcher::handleDeadParent), but I don't know 
the expected behavior.  

Typically, we rely on the Torque resource manager to start processes.  There is 
a Torque daemon on each node of the job.  The daemons fork child processes to 
do the work.  If one child exits, its daemon is notified and in turn notifies 
the daemons on the other nodes.  If the X10 launcher is the supported process 
management mechanism, it would be a good idea to have it work with resource 
managers.  Open MPI used to have a standalone project called Open Run Time 
Environment (Open RTE or ORTE) which may be an interesting fit.  However, for 
now, I plan to work around this in my job scripts by manually shutting down 
the remaining processes after any exit.  
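
The kind of workaround I have in mind looks roughly like the sketch below.  It is not what the X10 launcher does, and the function name run_until_first_exit and its parameters are hypothetical.  It only supervises processes started directly by the script, so in a real multi-node job each command would have to be a per-node wrapper of some kind, which is an assumption on my part.

```python
import subprocess
import time


def run_until_first_exit(commands, poll_interval=0.1, grace=5.0):
    """Start every command; as soon as any one exits, stop all the others.

    Returns (index, returncode) of the first process observed to exit.
    """
    if not commands:
        raise ValueError("at least one command is required")

    procs = [subprocess.Popen(cmd) for cmd in commands]
    first = None
    try:
        # Poll until some child has exited.
        while first is None:
            for i, p in enumerate(procs):
                rc = p.poll()
                if rc is not None:
                    first = (i, rc)
                    break
            else:
                time.sleep(poll_interval)
    finally:
        # Ask the survivors to exit, then force-kill any that do not
        # finish within the grace period.
        for p in procs:
            if p.poll() is None:
                p.terminate()
        deadline = time.monotonic() + grace
        for p in procs:
            try:
                p.wait(timeout=max(0.0, deadline - time.monotonic()))
            except subprocess.TimeoutExpired:
                p.kill()
                p.wait()
    return first
```

The try/finally means the survivors are also cleaned up if the supervising script itself is interrupted, which is the behavior I was hoping to get from the launcher.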

> 
> As far as I know, you cannot use the MPI transport with multi-vm.

Understood.  This is one advantage of the MPI transport:  I use mpiexec to 
launch processes via Torque and the cleanup works correctly.  

Thanks,
Dave

>       Igor
> 
> On Fri, Jun 24, 2011 at 4:47 PM, David E Hudak <dhu...@osc.edu> wrote:
>> Hi All,
>> 
>> I have a colleague with a Java implementation of a genetic algorithm.  He is 
>> interested in parallelizing the application for both multicore and multinode 
>> execution.
>> 
>> In the initial implementation, there is a set of classes for specifying 
>> fitness functions, expressing genes and implementing gene manipulations.  
>> There is a top-level simulation object that runs a variable number of 
>> generations.  My plan was to try using the Java native interface to use the 
>> existing Java classes for organisms and fitness, and rewrite the top-level 
>> simulation in X10.
>> 
>> I have been evaluating X10 for purely numeric applications on our cluster 
>> (C++ back end, MPI runtime and mpiexec as a process launcher).  I believe I 
>> read somewhere that the Java native interface requires the Java back end.  
>> In that case, I'd need to make sure we could run the sockets runtime and 
>> whatever process launcher we have for java (x10run?).
>> 
>> Anyone have any advice?
>> 
>> Thanks,
>> Dave
> 
> _______________________________________________
> X10-users mailing list
> X10-users@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/x10-users

---
David E. Hudak, Ph.D.          dhu...@osc.edu
Program Director, HPC Engineering
Ohio Supercomputer Center
http://www.osc.edu
