31.03.2011, 18:24, "Martin Sustrik" <[email protected]>:
>
> What you are speaking of is the former case -- overall optimisation of
> allocation. 

OK, let's leave that alone :)

>
>>  BTW, if we're discussing features for 3.0, I'd like to propose a few things.
>>
>>  1. Have you considered implementing devices inside the I/O thread?
>>     I'm sure it's controversial, but there are lots of cases where
>>     devices are unavoidable, and adding tens of context switches
>>     for each message affects performance very negatively (we have
>>     a recv call on each zmq_recv and each zmq_send of a single
>>     message part).
>
> Yes, I wanted to do that when I started with devices, then I realised
> it's not really possible. The problem is that there are multiple
> frontend and multiple backend peers. Session for each of them can be
> running in a different I/O thread. The device would have to choose one
> of those threads to run in. Consequently, it would still need context
> switches to communicate with the other peers.

Well, it would still need half as many context switches. And it would
be fine with me if the device pinned all of its sessions to a single
I/O thread. When that becomes a bottleneck I can add another device on
another port and connect it to the same clients, at least for the use
cases I've used ZeroMQ for.
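
For reference, here's a minimal sketch of the status quo, assuming the
2.x API: a ZMQ_QUEUE device running in an application thread, which is
exactly where the extra hops between the application thread and the
I/O thread come from. Endpoints are illustrative.

/* Status quo: the device runs in an application thread, so every
 * message part crosses app-thread <-> I/O-thread twice on its way
 * through. Endpoints below are made up. */
#include <zmq.h>

int main (void)
{
    void *ctx = zmq_init (1);                  /* one I/O thread */
    void *frontend = zmq_socket (ctx, ZMQ_XREP);
    void *backend = zmq_socket (ctx, ZMQ_XREQ);
    zmq_bind (frontend, "tcp://*:5555");
    zmq_bind (backend, "tcp://*:5556");

    zmq_device (ZMQ_QUEUE, frontend, backend); /* blocks forever */

    zmq_close (frontend);
    zmq_close (backend);
    zmq_term (ctx);
    return 0;
}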

But I should probably stop this bikeshedding until (and if) I find the
time to run some tests. I know it would complicate things a lot, so the
justification must be really clear.

> The reliability problem is an extremely complex one. If you don't want
> to lose messages, you start with explicit ACKing. Then you have to
> time-out a particular connection (heartbeats?) and push unacked messages
> back to the shared queue. Then you have to deal with messages delivered
> twice. Then you have to solve the message re-ordering problem. Then you
> have to couple receiving a message and processing it into a single atomic
> transaction. Same on the send side. Then you want multiple send/recv
> operations in a single transaction. If the processing involves any other
> resource you want to add the resource (such as a DB) into the
> transaction. That means using distributed transactions. So you try to
> implement XA. Most probably you'll get lost somewhere close to
> heuristic commits in the 2PC model.
>

Your description is great as always :)

> I would say the question is how can we improve reliability of 0mq (NB:
> not make it perfect, just improve it) without dragging all this madness in.

That was exactly my intention. Maybe I wasn't clear about that. I'm
thinking about an API similar to POSIX shutdown(). First we call:

zmq_shutdown(sock, SHUT_RD)

ZeroMQ should immediately close bound sockets but leave connections
open. Then it should forward this message to the peers. They should
stop adding messages to the pipe which is shutting down. It's OK if
messages that are already in the pipe are still delivered. Then each
peer must send a sentinel message. When the issuer gets the sentinel
it closes the underlying socket if it's unidirectional and returns
ENOTCONN (or whatever error we choose). Then the application processes
all remaining messages and closes the socket with zmq_close().
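
Here's a sketch of how the receiving side could use this, assuming the
proposed zmq_shutdown()/SHUT_RD call and the ENOTCONN convention above
existed (neither does today), and using the 2.x zmq_recv(socket, &msg,
flags) signature:

/* Proposed receive-side shutdown: stop accepting new work, drain what
 * is already in the pipes, then close. zmq_shutdown() and SHUT_RD are
 * the proposal above, not part of the real libzmq API. */
#include <zmq.h>

static void drain_and_close (void *sock)
{
    zmq_shutdown (sock, SHUT_RD);     /* proposed: unbind, notify peers */

    zmq_msg_t msg;
    for (;;) {
        zmq_msg_init (&msg);
        if (zmq_recv (sock, &msg, 0) != 0) {
            zmq_msg_close (&msg);
            /* ENOTCONN once every peer has delivered its sentinel */
            break;
        }
        /* ... process the in-flight message ... */
        zmq_msg_close (&msg);
    }
    zmq_close (sock);                 /* nothing can be lost at this point */
}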

With SHUT_WR, ZeroMQ should close bound sockets to allow somebody else
to bind immediately, disable zmq_send, and send sentinels to the peers.
There is a problem for bidirectional connections, since peers should
reconnect to the same address as fast as possible, but there could be
outstanding requests whose replies should still be forwarded over the
old connection. Probably the peers will reconnect, and if the identity
is the same, the old connection can be closed immediately and the old
responses forwarded over the new connection; if the identity is
different, the old connection stays open until it is zmq_close'd.
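
The sending side would then look roughly like this; again,
zmq_shutdown() and SHUT_WR are the proposed API, not real libzmq calls:

/* Proposed send-side handover: unbind so a replacement process can
 * bind the same endpoint right away, refuse further zmq_send() calls,
 * push sentinels to the peers, then drain replies for any requests
 * that still belong to this connection before closing. */
#include <zmq.h>

static void hand_over (void *old_sock)
{
    zmq_shutdown (old_sock, SHUT_WR);   /* proposed call */
    /* ... drain loop as in drain_and_close() above ... */
    zmq_close (old_sock);
}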

This API lets software updates and network reconfiguration be done
without losing messages. Other problems (network outages, hardware and
software crashes) are less common, and monitoring helps a lot there.

Probably once we have sentinel messages we can also make PUB/SUB more
reliable. When a connection from a publisher is closed unexpectedly we
can return an EIO error (or whatever we choose) to the application.
For tcp we know when the connection is broken; for ipc it only breaks
on an application crash, which we also know about; for pgm we have the
retry timeout. We also have to inject this kind of error when the
queue is full and we lose a message. This way you don't need to count
messages to know when to die if the message stream is broken (and you
don't need to duplicate complex bookkeeping when there are several
publishers). For devices it's up to the application to decide what to
do with the error; it has to forward it as some application-specific
message if it needs to.
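
As a sketch, a subscriber relying on such an injected error (the
EIO-on-recv behaviour is the proposal above, not what libzmq does
today; 2.x zmq_recv signature assumed) could look like this:

/* Subscriber that trusts the proposed injected error instead of
 * counting messages to detect gaps in the stream. */
#include <zmq.h>
#include <errno.h>

static void subscriber_loop (void *sub)
{
    zmq_msg_t msg;
    for (;;) {
        zmq_msg_init (&msg);
        if (zmq_recv (sub, &msg, 0) != 0) {
            zmq_msg_close (&msg);
            if (errno == EIO) {
                /* Publisher link broke or the queue overflowed and
                 * dropped messages: resynchronise or die. */
                break;
            }
            continue;                 /* e.g. EINTR: just retry */
        }
        /* ... handle the update ... */
        zmq_msg_close (&msg);
    }
}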

The shutdown proposal is the one I feel strongly about. The PUB/SUB
point is much more controversial.

BTW, this is much more important than repeating requests on a REQ
socket, since the latter can easily be done in user code. Actually,
that need forces me to always use XREQ/XREP sockets, which suggests
that REQ sockets are probably useless for any realistic application.
So for the blocking use case we probably need some option like
ZMQ_RESET to allow issuing the request again.
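
This is roughly the user-code retry that REQ's strict state machine
forbids: an XREQ socket plus a zmq_poll() timeout, resending on
silence. Endpoint, timeout and payload are made up; 2.x signatures
assumed.

/* Resend a request from user code after a timeout. With a REQ socket
 * the second send would fail with EFSM, which is the problem above. */
#include <zmq.h>
#include <string.h>

static void send_request (void *sock, const char *body)
{
    zmq_msg_t delim, part;
    zmq_msg_init_size (&delim, 0);        /* empty delimiter, as REQ adds */
    zmq_send (sock, &delim, ZMQ_SNDMORE);
    zmq_msg_close (&delim);

    zmq_msg_init_size (&part, strlen (body));
    memcpy (zmq_msg_data (&part), body, strlen (body));
    zmq_send (sock, &part, 0);
    zmq_msg_close (&part);
}

static int request_with_retry (void *xreq, const char *body, int attempts)
{
    while (attempts-- > 0) {
        send_request (xreq, body);

        zmq_pollitem_t item = { xreq, 0, ZMQ_POLLIN, 0 };
        if (zmq_poll (&item, 1, 1000 * 1000) > 0)   /* 1 s (2.x: usec) */
            return 0;                               /* reply is waiting */
        /* timed out: loop and resend the request */
    }
    return -1;
}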

--
Paul
_______________________________________________
zeromq-dev mailing list
[email protected]
http://lists.zeromq.org/mailman/listinfo/zeromq-dev
