It is possible to estimate, accurately in fact: we can record execution times
from previous runs. However, for many millions of tasks that is a lot of
overhead, and I think there is a simpler approach. I can classify messages as
"very fast execution" (where tasks take 3ms, the minimum) and "everything
else". The fast tasks I'd batch (e.g. in groups of 100), then send everything
else individually. Determining which messages/tasks are very fast is
relatively trivial; determining everything else is laborious.
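A minimal sketch of that split in Python (the is_fast heuristic, the
expected_ms field, and the batch size of 100 are all assumptions for
illustration):

```python
from typing import Iterable, List, Tuple

BATCH_SIZE = 100  # assumed batch size for "very fast" tasks

def is_fast(task: dict) -> bool:
    """Assumed heuristic: tasks already known to sit at the ~3ms minimum."""
    return task.get("expected_ms", 0) <= 3

def partition(tasks: Iterable[dict]) -> Tuple[List[List[dict]], List[dict]]:
    """Group fast tasks into batches of BATCH_SIZE; keep the rest individual."""
    fast, slow = [], []
    for t in tasks:
        (fast if is_fast(t) else slow).append(t)
    batches = [fast[i:i + BATCH_SIZE] for i in range(0, len(fast), BATCH_SIZE)]
    return batches, slow

# 250 dummy tasks, alternating fast (3ms) and slow (50ms)
tasks = [{"id": i, "expected_ms": 3 if i % 2 == 0 else 50} for i in range(250)]
batches, slow = partition(tasks)
print(len(batches), len(slow))  # 2 batches cover the 125 fast tasks; 125 slow
```

Each batch then becomes one message, while each slow task remains its own
message.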
I'm trying to minimise idle time, (a) by ensuring that the messaging adds as
little overhead as possible, and (b) by taking into account that there could be
hundreds of cores (either Windows/Linux boxes with 64+ cores, or the newer
generation SPARC T-series CPUs). The other factor is the task execution time.
Each task is actually the pricing of a derivative trade. Depending on the trade
type/configuration, there are many complex pricing models involved. Optimizing
a given pricing model is a project (well, challenge) in its own right.
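For what it's worth, the single-item pull loop from my original mail (worker
asks for one task, processes it, asks again -- no pre-fetching) is roughly
this in pyzmq. A sketch only: the inproc transport, the READY/DONE strings,
and the single worker are all assumptions made to keep it self-contained.

```python
import threading
import zmq  # pyzmq

TASKS = [f"trade-{i}" for i in range(10)]  # stand-ins for pricing tasks

def broker(ctx, bound):
    """REP socket: hand out one task per worker request, then a DONE sentinel."""
    sock = ctx.socket(zmq.REP)
    sock.bind("inproc://tasks")
    bound.set()  # let the worker connect only after bind (inproc requires it)
    queue = list(TASKS)
    for _ in range(len(TASKS) + 1):  # one reply per task, plus the sentinel
        sock.recv()  # worker's READY (which also acknowledges the prior task)
        sock.send_string(queue.pop(0) if queue else "DONE")
    sock.close()

def worker(ctx, results):
    """REQ socket: request, process, repeat -- never holds more than one task."""
    sock = ctx.socket(zmq.REQ)
    sock.connect("inproc://tasks")
    while True:
        sock.send(b"READY")
        task = sock.recv_string()
        if task == "DONE":
            break
        results.append(task)  # "process" the task
    sock.close()

ctx = zmq.Context()
bound = threading.Event()
results = []
t = threading.Thread(target=broker, args=(ctx, bound))
t.start()
bound.wait()
worker(ctx, results)
t.join()
ctx.term()
print(len(results))
```

With hundreds of workers a ROUTER socket on the broker side avoids the strict
REQ/REP lockstep, but the control point is the same: a worker only receives a
task (or a batch) when it asks for one.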
Many thanks,
Sean
________________________________
From: Andrew Hume <[email protected]>
To: Sean Donovan <[email protected]>; ZeroMQ development list
<[email protected]>
Sent: Monday, November 12, 2012 10:08 PM
Subject: Re: [zeromq-dev] Distributed Q with **lots** of consumers
is it possible to estimate the runtime for an item?
and what is the metric you are trying to optimise?
is it average latency? or total throughput? or minimal idle time?
On Nov 12, 2012, at 3:58 PM, Sean Donovan wrote:
Any suggestions for implementing the following in ZMQ?
>
>
>Imagine a single Q containing millions of entries, which is constantly being
>added to. This Q would be fully persistent, probably not managed by ZMQ, and
>run in its own process.
>
>
>We would like N workers. Those workers need to start/stop ad-hoc, and
>reconnect to the Q host process. Each worker would take a single item from
>the Q, process, acknowledge completion, then repeat (to request another item).
> Processing time for each task is 3ms+ (occasionally minutes).
>
>
>Because of the variance in compute time, it is important that the workers don't
>pre-fetch/cache tasks. As an optimization, we'll add a heuristic so we can
>batch short-running tasks together (but, we'd like the control -- a
>load-balancing algorithm wouldn't know how to route efficiently, unless it
>could take hints).
>
>
>We need a pattern that would allow us to scale to 100s of workers.
>
>
>MANY THANKS!
>
>Sean Donovan
>_______________________________________________
>zeromq-dev mailing list
>[email protected]
>http://lists.zeromq.org/mailman/listinfo/zeromq-dev
>
-----------------------
Andrew Hume
623-551-2845 (VO and best)
973-236-2014 (NJ)
[email protected]