a good answer would be fairly long as there are many design choices yet to be 
made.
but let me describe how i would deal with this (given there is still much i 
don't know).

Data things:
        tasks are actual computation steps.
        we have jobs made up of jobs or tasks (or both).
        we store these job descriptions in Redis.
        we have queues for tasks waiting to be executed.
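a minimal sketch (in Python) of what those job/task descriptions might look like, stored as JSON blobs in Redis under keys like "job:<id>". all the names here are illustrative, not a fixed schema:

```python
import json
import uuid
from dataclasses import dataclass, field, asdict

@dataclass
class Task:
    # a task is an actual computation step
    name: str
    args: dict
    id: str = field(default_factory=lambda: uuid.uuid4().hex)

@dataclass
class Job:
    # a job is made up of sub-jobs and/or tasks
    name: str
    tasks: list = field(default_factory=list)
    subjobs: list = field(default_factory=list)
    id: str = field(default_factory=lambda: uuid.uuid4().hex)

    def to_redis_value(self) -> str:
        # serialized form, suitable for SET job:<id> <blob> in Redis
        return json.dumps(asdict(self))
```

the point is just that a job description is a small, self-describing value the scheduler can read back and walk.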

Scheduling:
        we assume some code which, given a job instance,
                can compute whether any new task or job can be scheduled.
        tasks are scheduled by being put on a task-specific queue.
        tasks are executed by a worker process asking (REQ/REP)
                for the next task (or several tasks) from a specific queue.
        workers are responsible for sending status back to the task queue.
        task queues are straightforward (waiting, running+timeout, done).
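the waiting / running+timeout / done lifecycle can be sketched as a small in-memory state machine; in the real system this state would live in Redis, and next_task is what a worker's REQ would trigger. names and the timeout value are illustrative:

```python
import time
from collections import deque

class TaskQueue:
    """Tracks tasks as waiting -> running (with a timeout) -> done."""

    def __init__(self, timeout: float = 30.0):
        self.timeout = timeout
        self.waiting = deque()
        self.running = {}   # task_id -> deadline
        self.done = set()

    def put(self, task_id: str):
        self.waiting.append(task_id)

    def next_task(self):
        # what a worker's request fetches; moves the task to running
        self.requeue_expired()
        if not self.waiting:
            return None
        task_id = self.waiting.popleft()
        self.running[task_id] = time.monotonic() + self.timeout
        return task_id

    def mark_done(self, task_id: str):
        # the status report sent back by the worker
        self.running.pop(task_id, None)
        self.done.add(task_id)

    def requeue_expired(self):
        # tasks whose worker missed the timeout go back to waiting
        now = time.monotonic()
        for task_id, deadline in list(self.running.items()):
            if now > deadline:
                del self.running[task_id]
                self.waiting.append(task_id)
```

the timeout-and-requeue step is what buys you resilience against a worker dying mid-task.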

Control:
        pick a way to select the master; this will run the scheduler and be
                the writer for the Redis store. (lots of algorithms here,
                like leader election.)
        deal with configuration somehow (this means setting up the various
                workers and messaging helpers (like brokers etc)). this can
                be done statically or dynamically. try to go simple here;
                just say how many workers and of what type.
        the workers should not be micro-managed. let them ask for work when
                they are free.
        resilience/shutdown is always delicate. i prefer heartbeat messages
                from the workers and shutting them down via special
                'shutdown' tasks on the task queue.
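a sketch of that shutdown convention: a worker loop fed by some get_task callable (which in the real system would be a REQ over ZeroMQ) that treats a special 'shutdown' task as its signal to exit, heartbeating as it goes. the plain-function shape and all names are illustrative:

```python
import time

SHUTDOWN = "shutdown"   # the special task placed on the queue

def worker_loop(get_task, report_status, send_heartbeat, heartbeat_every=5.0):
    """Pull tasks until a 'shutdown' task arrives; heartbeat in between."""
    last_beat = 0.0
    while True:
        now = time.monotonic()
        if now - last_beat >= heartbeat_every:
            send_heartbeat()
            last_beat = now
        task = get_task()
        if task is None:
            time.sleep(0.1)     # nothing queued; idle briefly
            continue
        if task == SHUTDOWN:
            report_status(task, "stopped")
            return
        report_status(task, "done")
```

because shutdown rides the same queue as work, a worker always drains what it already accepted before stopping, and the master never has to track worker addresses to kill them.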

Messaging:
        once you know the overall structure and roughly how big it is, you
                can pick the messaging topology (the guide helps a lot here).


obviously, this needs to be fleshed out, but that seems straightforward,
with maybe one exception: the jobs that need "multiple subscribers".
i have assumed (maybe incorrectly) that these jobs can be decomposed
or handled by a single worker.

        hope this helps
                andrew

On Jun 29, 2012, at 4:41 AM, Felix De Vliegher wrote:

> Yes, scheduling is done in a single thread.
> Preferably, we should be resilient to network and worker failures. There's 
> one use case for a job to re-publish something to other subscribers, so you 
> might say that one has side effects. But most jobs are functional, 
> self-contained units of work.
> 
> Regards,
> Felix
> 
> On Friday 29 June 2012 at 13:20, Andrew Hume wrote:
> 
>> is the marshalling/scheduling of stuff being done (essentially)
>> single-threaded?
>> that is, even if the work is being done in parallel and distributed,
>> is the organising being done in one place?
>> (somewhat equivalent to running make foo, where the make can fire off jobs 
>> elsewhere.)
>> 
>> and how do you feel about networking and worker failures? do you need to be
>> resilient against them? (and if so, do your jobs have side effects, or are
>> they somehow functional?)
>> 
>> andrew
>> 
>> On Jun 29, 2012, at 4:08 AM, Felix De Vliegher wrote:
>>> Hi andrew
>>> 
>>> The router (or splitter, from the EIA book) would attach a unique 
>>> identifier to each job and store that id and its sub-jobs in Redis. All 
>>> workers would then ultimately report back to the sink, which aggregates the 
>>> results of the tasks that belong together. There might be a better
>>> approach, but this is the idea for now :)
>>> 
>>> Cheers,
>>> Felix
>>> 
>>> On Friday 29 June 2012 at 12:57, Andrew Hume wrote:
>>> 
>>>> before i answer, how are you going to implement patterns such as
>>>> aggregator from the EIA book?
>>>> i think that means knowing how you identify tasks/jobs and if the tracking 
>>>> and organising of all
>>>> that is going to be centralised or distributed.
>>>> 
>>>> andrew
>>>> 
>>>> On Jun 29, 2012, at 3:08 AM, Felix De Vliegher wrote:
>>>>> Hi list
>>>>> 
>>>>> I'm trying to set up a system where certain jobs can be executed through 
>>>>> zeromq, but there are currently a few unknowns in how to tackle certain 
>>>>> issues. Basically, I have a Redis queue with jobs. I pick one job from 
>>>>> the queue and push it to a broker that distributes it to workers that 
>>>>> handle the job.
>>>>> 
>>>>> So far so good, but there's a few extra requirements:
>>>>> - one job can have multiple sub-jobs which might or might not need to be
>>>>> executed in a specific order (e.g. "item_update 5" could have
>>>>> "cache_update 5" and "clear_proxies 5" as sub-jobs). I'm currently
>>>>> thinking of using the routing slip pattern
>>>>> (http://www.eaipatterns.com/RoutingTable.html) to do this.
>>>>> - some sub-jobs need to wait for other sub-jobs to finish first.
>>>>> - some jobs need to be published across multiple subscribers, other jobs 
>>>>> only need to be handled by one worker.
>>>>> - workers should be divided into groups that will only handle specific 
>>>>> tasks (majordomo pattern?)
>>>>> - some workers could forward-publish something themselves to a set of 
>>>>> subscribers
>>>>> 
>>>>> Right now, I have the following setup:
>>>>> 
>>>>> (Redis queue) <---- (one or more routers | push)
>>>>>     -----> (pull | one or more brokers | push)
>>>>>     -----> (pull | multiple workers | push)
>>>>>     ----> (pull | sink)
>>>>> 
>>>>> 
>>>>> The brokers and the sink are the stable part of the architecture. The 
>>>>> routers are responsible for getting a job from the queue, deciding the 
>>>>> sub-jobs for each job and attaching the routing slip. What I haven't done 
>>>>> yet is implementing majordomo to selectively define workers for a certain 
>>>>> service, so every worker can handle every task right now. The requirement 
>>>>> that some jobs are pub/sub and others are push/pull also isn't fulfilled.
>>>>> 
>>>>> I was wondering if this is the right approach and if there are better
>>>>> ways of setting up messaging, taking into account the requirements?
>>>>> 
>>>>> 
>>>>> Kind regards,
>>>>> 
>>>>> Felix De Vliegher
>>>>> Egeniq.com (http://Egeniq.com) (http://Egeniq.com)
>>>>> 
>>>>> 
>>>>> _______________________________________________
>>>>> zeromq-dev mailing list
>>>>> [email protected]
>>>>> http://lists.zeromq.org/mailman/listinfo/zeromq-dev
>>>> 
>>>> 
>>>> 
>>>> 
>>>> ------------------
>>>> Andrew Hume (best -> Telework) +1 623-551-2845
>>>> [email protected] (Work) +1 973-236-2014
>>>> AT&T Labs - Research; member of USENIX and LOPSA
>>>> 
>>>> 
>>>> 
>>>> 
>> 
>> 
>> 
>> 
>> 
>> 
>> 
> 
> 
> 

