Hi Noah,

There is no 4.x roadmap and any previous version calling itself "4.x"
no longer exists. We are working on the 3.x tree, currently pushing
3.1 towards stability.

In general, anyone proposing incompatible patches to 3.x will have to
give extraordinarily good reasons for them. My advice to anyone,
including yourselves, who wants the fullest control over 0MQ's future
direction is to become a contributor.

-Pieter

On Thu, Feb 2, 2012 at 8:18 PM, Noah Gibbs <[email protected]> wrote:
>
> Hi!  My team at Ooyala is putting together a ZMQ-based architecture for some 
> monitoring stuff we're doing.  We're trying to figure out if it's reasonable 
> to keep compatibility options for ZMQ 4.x.  I'm hoping you might have 
> suggestions, for that or in general.
>
> ** First, why we're doing this:
>
> The idea is that a monitoring client runs on each monitored machine.  Local 
> processes send registrations, statistics, heartbeats and notifications 
> (errors, warnings, etc).  They also declare plugins to run periodically to 
> assess process and machine health, roughly like what Nagios does.  Client 
> sets are dynamic, and a lot of this runs in EC2.
>
> We send this information to Graphite, to our alerting system, to our 
> scheduling system for running plugins and to a few other places.  Then we 
> can see the results and determine the health of our cluster -- what machines 
> are running and what applications they're running, as well as health checks 
> from the plugins.
>
> ** Next, what we're doing with ZMQ:
>
> The clients send JSON with the stats, notifications, etc. over a ZMQ_DEALER 
> connected to central routers (six routers, to start with).  The routers bind 
> a ZMQ_ROUTER socket for client traffic, which is resent via a ZMQ_PUSH socket 
> to our back-end message sinks.
>
> A few high-value messages like error notifications require acknowledgements 
> from the sink, and will be resent periodically until the ack is received by 
> the client.  The router doesn't store any state about that; it just forwards 
> messages.
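
A minimal sketch of the client-side "resend until acked" bookkeeping described above, in plain Python with no ZMQ calls. The class name, the per-message id, and the 5-second interval are assumptions for illustration only; the original message does not say how acks are matched to notifications.

```python
import time
import uuid


class PendingAcks:
    """Tracks high-value messages that must be resent until acknowledged."""

    def __init__(self, resend_interval=5.0, clock=time.monotonic):
        self.resend_interval = resend_interval  # seconds between resends (assumed value)
        self.clock = clock                      # injectable so the timer can be tested
        self._pending = {}                      # msg_id -> [payload, next_send_time]

    def track(self, payload):
        """Register a message that was just sent; returns the id the ack must carry."""
        msg_id = uuid.uuid4().hex
        self._pending[msg_id] = [payload, self.clock() + self.resend_interval]
        return msg_id

    def ack(self, msg_id):
        """Forget a message once acked. Returns False for unknown or duplicate acks."""
        return self._pending.pop(msg_id, None) is not None

    def due(self):
        """Return (msg_id, payload) pairs whose timer expired, and reschedule them."""
        now = self.clock()
        expired = []
        for msg_id, entry in self._pending.items():
            if entry[1] <= now:
                expired.append((msg_id, entry[0]))
                entry[1] = now + self.resend_interval
        return expired
```

The client would call `due()` from its main loop and resend whatever comes back over the DEALER socket. Because a resend can cross with a late ack, the sink has to tolerate duplicate deliveries of the same message id.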
>
> Each client has a UUID.  It's sent in their JSON messages, and it's what they 
> set as the socket identity.  That's how we send them things.  It's how we 
> identify things like statistics from them.  It persists across reboots, but 
> we can generate new ones easily when provisioning new virtual machine 
> instances.
>
> The message sinks connect to the routers with a ZMQ_PULL socket.  They 
> receive messages (stats, notifications, etc.) and put them in various 
> back-end storage, including sending out notifications by email or pager where 
> appropriate.  Each message sink has a type (heartbeat sink, stats sink, 
> registration sink, etc), and the pull socket distributes the work among the 
> available sinks.
>
> The routers also bind a REP socket for traffic from the back end *to* the 
> clients.  At the moment, the traffic to the client is either acks or "run 
> this plugin now" messages.
>
> A scheduler (in practice, several machines) looks at that storage, determines 
> what plugins need to run, and then sends "run this plugin" messages to the 
> REP socket on the router to be forwarded to the clients by UUID.
>
> ** What we're worried about with 4.0:
>
> From the mailing list, it sounds like ZMQ 4.0 router sockets won't support 
> setting identity, which makes it difficult to send to a client by UUID.  
> Presumably we could make each client, when it connects to the router, send 
> its UUID in a "hello" message so that the router could then save its identity 
> and forward messages to it.  Does that sound like the right approach?  Should 
> we be doing this already in 3.1?
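
A rough sketch of the "hello" registration idea above, again in plain Python with no ZMQ calls. It assumes the router receives each client message as a frame list whose first frame is the connection identity supplied by the ROUTER socket and whose last frame is the JSON payload. The `type` and `uuid` field names and the class name are hypothetical.

```python
import json


class ClientRegistry:
    """Maps application-level client UUIDs to ROUTER connection identities."""

    def __init__(self):
        self._routes = {}  # client UUID (str) -> connection identity (bytes)

    def on_client_frames(self, frames):
        """Handle [connection_id, ..., payload] from a client; returns the decoded message."""
        conn_id, payload = frames[0], frames[-1]
        msg = json.loads(payload)
        if msg.get("type") == "hello":
            # A reconnecting client may get a new connection identity,
            # so the latest hello always wins.
            self._routes[msg["uuid"]] = conn_id
        return msg

    def frames_for(self, client_uuid, message):
        """Build the frames to send to a client, or None if it never said hello."""
        conn_id = self._routes.get(client_uuid)
        if conn_id is None:
            return None
        return [conn_id, json.dumps(message).encode("utf-8")]
```

With this in place the router no longer depends on the client choosing its own socket identity. The cost is that the router now holds a little routing state, and a client has to send a fresh hello after every reconnect.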
>
> Right now we're using ROUTER and DEALER for the client/router connection, 
> which lets us send everything over a single socket - very nice for keeping 
> our firewall rules simple.  But it sounds like there's no way to do this 
> that's both 3.1- and 4.0-compatible.  Is that true, or am I 
> misunderstanding?
>
> ** Suggestions?
>
> Right now we're in the early stages.  We have a basic ZMQ topology running 
> and a few tests, but there will never be a better time to change this 
> architecture.  What are we doing wrong?
>
> --
> Noah Gibbs
> Software Engineer |
> [email protected] | (510) 260-5409 (cell)
> www.ooyala.com | blog | @ooyala
>
> _______________________________________________
> zeromq-dev mailing list
> [email protected]
> http://lists.zeromq.org/mailman/listinfo/zeromq-dev
>
