Hi Paul,

> 30.03.2011, 12:12, "Martin Sustrik"<[email protected]>:
>> Can you spell out more clearly what's the problem
>> with zmq_content_t?
>
> The only problem is another memory allocation. If I want to use
> another memory allocation technique for performance reasons, allocation
> of content_t structure can make all the benefits negligible.

My point in separating the allocation problem into two parts was that 
optimising the overall allocation mechanism is a different problem 
from using a specific allocator for the message body, e.g. because 
your legacy app requires the data to be allocated by some special 
allocator and you want to avoid copying it after receiving it from 0mq.

What you are speaking of is the former case -- overall optimisation of 
allocation. That, IMO, should be done by overloading malloc/free (the 
way jemalloc, tcmalloc etc. do). If you do so, even the metadata is 
allocated using the new allocator. Also note that with 
zmq_msg_init_size the data and metadata are allocated in a single chunk, 
so there's no additional allocation for the metadata.
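
For the latter case (the message body living in a special allocator), 
the existing zmq_msg_init_data already covers the send side -- 0mq takes 
the buffer as-is and calls your deallocation function when it's done 
with it. A rough sketch against the 2.x C API (legacy_alloc/legacy_free 
are just stand-ins for whatever allocator the legacy app uses):

    #include <stddef.h>
    #include <zmq.h>

    /* Stand-ins for the legacy app's special allocator. */
    extern void *legacy_alloc (size_t size);
    extern void legacy_free (void *data);

    /* 0mq calls this once it no longer needs the buffer, so the data is
       never copied and is released through the original allocator. */
    static void free_legacy (void *data, void *hint)
    {
        (void) hint;
        legacy_free (data);
    }

    int send_legacy_buffer (void *socket, void *data, size_t size)
    {
        zmq_msg_t msg;
        if (zmq_msg_init_data (&msg, data, size, free_legacy, NULL) != 0)
            return -1;
        int rc = zmq_send (socket, &msg, 0);   /* 2.x signature */
        zmq_msg_close (&msg);
        return rc;
    }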

> BTW, if we are discussing features for 3.0, I'd like to propose a few things.
>
> 1. Have you considered implementing devices inside the IO thread?
>    I'm sure it's controversial, but there are lots of cases where
>    devices are unavoidable, and adding tens of context switches
>    for each message affects performance very negatively (we have
>    a recv call on each zmq_recv and each zmq_send of a single
>    message part).

Yes, I wanted to do that when I started with devices, but then I 
realised it's not really possible. The problem is that there are 
multiple frontend and multiple backend peers. The session for each of 
them can be running in a different I/O thread. The device would have to 
choose one of those threads to run in. Consequently, it would still need 
context switches to communicate with the peers handled by the other threads.
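
To make the cost concrete, this is roughly what one direction of a 
queue-style device does today (sketch against the 2.x C API): one 
zmq_recv plus one zmq_send in the application thread for every single 
message part, each of which may have to cross into an I/O thread:

    #include <assert.h>
    #include <stdint.h>
    #include <zmq.h>

    /* One direction of a queue-style device.  Every message part costs
       one zmq_recv and one zmq_send in the device's application thread. */
    static void forward (void *frontend, void *backend)
    {
        while (1) {
            int64_t more;
            size_t more_size = sizeof (more);
            do {
                zmq_msg_t part;
                zmq_msg_init (&part);
                assert (zmq_recv (frontend, &part, 0) == 0);
                assert (zmq_getsockopt (frontend, ZMQ_RCVMORE, &more,
                    &more_size) == 0);
                assert (zmq_send (backend, &part,
                    more ? ZMQ_SNDMORE : 0) == 0);
                zmq_msg_close (&part);
            } while (more);
        }
    }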

> 2. Currently there is no way to close a PUSH/PULL or PUB/SUB
>    socket or even probably a REQ/REP socket without losing
>    a message. The only combination that works is XREP/XREQ
>    (and XREP/XREP), and this while doing all the bookkeeping
>    yourself. I know that it's intentional, because by design my
>    system needs to be failure resistant and so on. But there
>    are lots of use cases which need that. E.g. I want a local
>    process to start, process a single message and die peacefully.
>    Today, I'll lose more messages than I process. The second use
>    case is if I have a device which load balances messages
>    between local processes via ipc://. I know that the only reason
>    for processes to disconnect is when I want to restart a process
>    to get a software update (crashes are quite rare). Okay, here's
>    an even more controversial example. I have a client which will
>    repeat the request if there is network trouble, for sure. But a restart
>    is going to lose at least tens of messages, and a timeout is
>    for very hard edge cases, because of latency. And really, what's
>    the purpose of a message queue which can't queue messages
>    for me? :)

The reliability problem is an extremely complex one. If you don't want 
to lose messages, you start with explicit ACKing. Then you have to 
time-out a particular connection (heartbeats?) and push unacked messages 
back to the shared queue. Then you have to deal with messages delivered 
twice. Then you have to solve the message re-ordering problem. Then you 
have to couple receiving a message and processing it into a single 
atomic transaction. Same on the send side. Then you want multiple 
send/recv operations in a single transaction. If the processing involves 
any other resource (such as a DB) you want to add that resource to the 
transaction. That means using distributed transactions. So you try to 
implement XA. Most probably you'll get hopelessly lost somewhere close 
to heuristic commits in the 2PC model.

I would say the question is how we can improve the reliability of 0mq (NB: 
not make it perfect, just improve it) without dragging all this madness in.
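
Just to illustrate the first rung of that ladder at the application 
level (not a 0mq feature; the names and timeouts below are made up), a 
client can re-send a request on its own if no reply arrives in time, 
accepting that the server may then see the same request twice:

    #include <assert.h>
    #include <zmq.h>

    #define REQUEST_TIMEOUT (2500 * 1000)   /* 2.x zmq_poll takes microseconds */
    #define REQUEST_RETRIES 3

    /* Send a request and wait for the reply; on timeout, abandon the
       socket and retry with a fresh one.  'reply' must be an initialised
       zmq_msg_t.  Duplicate delivery on the server side becomes possible
       -- the first problem on the list above. */
    static int request_with_retry (void *ctx, const char *endpoint,
                                   zmq_msg_t *request, zmq_msg_t *reply)
    {
        int retries;
        for (retries = 0; retries != REQUEST_RETRIES; retries++) {
            void *sock = zmq_socket (ctx, ZMQ_REQ);
            int linger = 0;
            assert (sock);
            assert (zmq_setsockopt (sock, ZMQ_LINGER, &linger,
                sizeof (linger)) == 0);
            assert (zmq_connect (sock, endpoint) == 0);

            zmq_msg_t copy;
            zmq_msg_init (&copy);
            zmq_msg_copy (&copy, request);    /* keep the original for resends */
            assert (zmq_send (sock, &copy, 0) == 0);
            zmq_msg_close (&copy);

            zmq_pollitem_t items [] = {{ sock, 0, ZMQ_POLLIN, 0 }};
            zmq_poll (items, 1, REQUEST_TIMEOUT);
            if (items [0].revents & ZMQ_POLLIN) {
                assert (zmq_recv (sock, reply, 0) == 0);
                zmq_close (sock);
                return 0;
            }
            zmq_close (sock);                 /* give up on this attempt */
        }
        return -1;                            /* caller decides what to do now */
    }

Even that tiny step already puts duplicate handling on the receiving 
side, which is exactly where the madness starts.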

Martin
