We're looking at solving UC1 and UC2 via ZooKeeper, or some other framework if one is recommended. We don't run Hadoop, just ZooKeeper and Cassandra, as we have no need for map/reduce. I'm searching for any existing framework that can perform standard time-based scheduling in a distributed environment. As I said earlier, Quartz is the closest model to what we're looking for, but it can't be used in a distributed parallel environment. Any suggestions for a system that could accomplish this would be helpful.
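To make the requirement concrete, here is a minimal in-JVM sketch of the pattern we'd want from a "distributed Quartz": ordinary time-based triggers on every node, but a job only actually fires on the node currently holding leadership. The leader flag here would be flipped by a ZooKeeper leader-election callback (elided); all class and method names are illustrative, not from any existing framework, and the code sticks to Java 6 syntax.

```java
import java.util.concurrent.*;

// Sketch: time-based triggers gated on leadership. The real leadership
// signal would come from a ZooKeeper leader-election watcher; here it is
// just a volatile flag set by the caller.
public class LeaderGatedScheduler {
    private final ScheduledExecutorService timer =
            Executors.newSingleThreadScheduledExecutor();
    private volatile boolean isLeader; // flipped by the ZK election callback

    public void setLeader(boolean leader) { isLeader = leader; }

    /** Schedule a job; on non-leader nodes the trigger fires but the job is skipped. */
    public ScheduledFuture<?> schedule(final Runnable job, long delayMs) {
        return timer.schedule(new Runnable() { // Java 6: anonymous class, no lambdas
            public void run() {
                if (isLeader) {
                    job.run(); // only the elected leader does the work
                }
            }
        }, delayMs, TimeUnit.MILLISECONDS);
    }

    public void shutdown() { timer.shutdown(); }
}
```

Every node can install the same triggers, so when leadership moves after a failure, the new leader's copies of the triggers simply start firing.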
On 24 August 2010 11:27, Mahadev Konar <maha...@yahoo-inc.com> wrote:
> Hi Todd,
> Just to be clear, are you looking at solving UC1 and UC2 via zookeeper? Or
> is this a broader question for scheduling on cassandra nodes? For the latter
> this probably isnt the right mailing list.
> On 8/23/10 4:02 PM, "Todd Nine" <t...@spidertracks.co.nz> wrote:
> Hi all,
> We're using ZooKeeper for Leader Election and system monitoring. We're
> also using it for synchronizing our cluster-wide jobs with barriers. We're
> running into an issue where we now have a single job, but each node can run
> the job independently of the others, with different criteria in the job. In the
> event of a system failure, another node in our application cluster will need
> to fire this Job. I've used Quartz previously (we're running Java 6), but
> it simply isn't designed for the use case we have. I found an article on the
> subject and looked at both plugins it describes, but they require Hadoop. We're
> not currently running Hadoop, we only have Cassandra. Here are the 2 basic use
> cases we need to support.
> UC1: Synchronized Jobs
> 1. A job is fired across all nodes
> 2. The nodes wait until the barrier is entered by all participants
> 3. The nodes process the data and leave
> 4. On all nodes leaving the barrier, the Leader node marks the job as
> complete
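The four UC1 steps can be sketched in-JVM with a pair of CyclicBarriers standing in for the ZooKeeper barrier recipe (in the real version, each node creates an ephemeral child under a barrier znode and watches for the full participant count). This is an analogy of the control flow only; all names are illustrative.

```java
import java.util.concurrent.*;
import java.util.concurrent.atomic.*;

// In-JVM analogy of UC1: all participants enter a barrier, process, and
// leave; once everyone has left, the job can be marked complete.
public class SynchronizedJob {
    /** Run `work` on `nodes` simulated participants; returns how many completed. */
    public static int runAcross(int nodes, final Runnable work) throws Exception {
        final AtomicInteger completed = new AtomicInteger();
        final CyclicBarrier enter = new CyclicBarrier(nodes); // step 2: all enter
        final CyclicBarrier leave = new CyclicBarrier(nodes); // step 4: all leave
        ExecutorService pool = Executors.newFixedThreadPool(nodes);
        for (int i = 0; i < nodes; i++) {
            pool.submit(new Callable<Void>() {
                public Void call() throws Exception {
                    enter.await();            // wait until every participant arrives
                    work.run();               // step 3: process the data
                    completed.incrementAndGet();
                    leave.await();            // all left -> leader may mark complete
                    return null;
                }
            });
        }
        pool.shutdown();
        pool.awaitTermination(5, TimeUnit.SECONDS);
        return completed.get();
    }
}
```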
> UC2: Multiple Jobs per Node
> 1. A Job is scheduled for a future time on a specific node (usually the
> node that's creating the trigger)
> 2. A Trigger can be overwritten and cancelled without the job firing
> 3. In the event of a node failure, the Leader will take all pending jobs
> from the failed node, and partition them across the remaining nodes.
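Step 3 of UC2 (the Leader repartitioning a dead node's pending jobs) might look like the following plain-Java sketch; a round-robin split keeps the per-node load within one job of even. The class and method names are hypothetical, and in practice the job ids would be read from the failed node's znodes.

```java
import java.util.*;

// Sketch of UC2 step 3: the Leader reassigns a failed node's pending jobs
// round-robin across the surviving nodes. Java 6 syntax (no diamond operator).
public class JobPartitioner {
    /** Distribute the failed node's job ids evenly over the remaining live nodes. */
    public static Map<String, List<String>> partition(List<String> pendingJobs,
                                                      List<String> liveNodes) {
        Map<String, List<String>> assignment =
                new LinkedHashMap<String, List<String>>();
        for (String node : liveNodes) {
            assignment.put(node, new ArrayList<String>());
        }
        int i = 0;
        for (String job : pendingJobs) {
            // round-robin: node i, i+1, ... wrapping around the live set
            assignment.get(liveNodes.get(i++ % liveNodes.size())).add(job);
        }
        return assignment;
    }
}
```

For example, three pending jobs over two survivors yields two jobs on the first node and one on the second.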
> Any input would be greatly appreciated.