These are pretty easy to solve with ZK. Ephemerality, exclusive create,
atomic update and file versions allow you to implement most of the semantics
you need. I don't know of any recipes available for this, but they would be
worthy additions to ZK.
On Mon, Aug 23, 2010 at 11:33 PM, Todd Nine <t...@spidertracks.co.nz> wrote:
> Solving UC1 and UC2 via zookeeper or some other framework if one is
> recommended. We don't run Hadoop, just ZK and Cassandra as we don't have a
> need for map/reduce. I'm searching for any existing framework that can
> perform standard time based scheduling in a distributed environment. As I
> said earlier, Quartz is the closest model to what we're looking for, but it
> can't be used in a distributed parallel environment. Any suggestions for a
> system that could accomplish this would be helpful.
> On 24 August 2010 11:27, Mahadev Konar <maha...@yahoo-inc.com> wrote:
> > Hi Todd,
> > Just to be clear, are you looking at solving UC1 and UC2 via zookeeper? Or
> > is this a broader question for scheduling on cassandra nodes? For the
> > latter, this probably isn't the right mailing list.
> > Thanks
> > mahadev
> > On 8/23/10 4:02 PM, "Todd Nine" <t...@spidertracks.co.nz> wrote:
> > Hi all,
> > We're using Zookeeper for Leader Election and system monitoring. We're
> > also using it for synchronizing our cluster wide jobs with barriers. We're
> > running into an issue where we now have a single job, but each node can fire
> > the job independently of others with different criteria in the job. In the
> > event of a system failure, another node in our application cluster will need
> > to fire this Job. I've used quartz previously (we're running Java 6), but
> > it simply isn't designed for the use case we have. I found this article from
> > cloudera.
> > http://www.cloudera.com/blog/2008/11/job-scheduling-in-hadoop/
> > I've looked at both plugins, but they require hadoop. We're not
> > running hadoop, we only have Cassandra. Here are the 2 basic use cases we
> > need to support.
> > UC1: Synchronized Jobs
> > 1. A job is fired across all nodes
> > 2. The nodes wait until the barrier is entered by all participants
> > 3. The nodes process the data and leave
> > 4. On all nodes leaving the barrier, the Leader node marks the job as
> > complete.
> > UC2: Multiple Jobs per Node
> > 1. A Job is scheduled for a future time on a specific node (usually the
> > same node that's creating the trigger)
> > 2. A Trigger can be overwritten and cancelled without the job firing
> > 3. In the event of a node failure, the Leader will take all pending jobs
> > from the failed node, and partition them across the remaining nodes.
> > Any input would be greatly appreciated.
> > Thanks,
> > Todd
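
Step 3 of UC2 (the Leader taking the pending jobs of a failed node and partitioning them across the remaining nodes) can be sketched without any ZooKeeper calls. The snippet below is only an illustration in plain Java: the class name `JobRepartitioner`, the method `partition`, and the use of plain strings for job and node ids are all assumptions made for the example, not something from this thread or from the ZooKeeper API. How the Leader learns that a node has failed (for instance by watching ephemeral znodes) is left out entirely.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of ZooKeeper: decides which surviving node
// takes over each pending job of a failed node.
final class JobRepartitioner {

    private JobRepartitioner() {
    }

    /**
     * Spreads the pending jobs of a failed node across the surviving nodes
     * in round-robin order.
     *
     * @param pendingJobs ids of the jobs that had not yet fired on the failed node
     * @param survivors   ids of the nodes that are still alive (assumed distinct)
     * @return map from surviving node id to the job ids it should take over
     */
    static Map<String, List<String>> partition(List<String> pendingJobs,
                                               List<String> survivors) {
        if (survivors.isEmpty()) {
            throw new IllegalArgumentException("no surviving nodes to take over jobs");
        }
        // LinkedHashMap keeps the survivors in the order they were given.
        Map<String, List<String>> assignment = new LinkedHashMap<String, List<String>>();
        for (String node : survivors) {
            assignment.put(node, new ArrayList<String>());
        }
        // Job i goes to survivor (i mod number of survivors).
        for (int i = 0; i < pendingJobs.size(); i++) {
            String target = survivors.get(i % survivors.size());
            assignment.get(target).add(pendingJobs.get(i));
        }
        return assignment;
    }
}
```

With five pending jobs `job-1` to `job-5` and two survivors `node-b` and `node-c`, this assigns `job-1`, `job-3` and `job-5` to `node-b`, and `job-2` and `job-4` to `node-c`. Round-robin is just the simplest even split; a real deployment might prefer to weight by current load on each node.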