Hi,

We've been happily using ZeroMQ for a good few months now and are impressed with its robustness and performance. We are, however, having a problem with one of our servers "leaking" significant amounts of memory over time, and after a good few hours trawling the 0mq code I think I might know why.

The "leaking" server is a proxy that gives (Internet) clients a single TCP port into our ZeroMQ universe, through which they hook in to our server applications. It is the only server in our environment that creates and binds to ZeroMQ (SUB) sockets but does not necessarily call zmq_poll on them. If there are no client subscriptions mapping to a particular SUB socket, we do not add that 0mq socket to the zmq_pollitem_t array when next calling zmq_poll. As soon as at least one client subscription maps to the SUB socket, we rebuild the zmq_pollitem_t array to include it.

The first case is where the problem occurs: even though we're not subscribed to a single topic on a socket, 0mq appears to buffer incoming messages indefinitely. It is only when the socket is next included in a call to zmq_poll that 0mq actually purges all non-matching (i.e. all) messages from the buffer. In other words, 0mq appears to filter and discard unwanted messages on the client side within the call to zmq_poll or zmq_recv in the application thread, not in the 0mq I/O thread(s) as we expected. The ironic side-effect of this, at least with our use of "optimised" calls to zmq_poll, is that our proxy server leaks most when no clients are connected!

My problem now is how to correct this. I see a few options:

1. Always pass all sockets to zmq_poll.
2. Set high water-marks, which appear to be enforced in the I/O thread(s).
3. Use ZeroMQ 3.1, which filters publisher-side and presumably eliminates the scenario completely.

None of these solutions is ideal. Option 1 involves a delicate rewrite of some core messaging code and, if I'm being really picky, introduces extra work (matching/purging) on our main application thread. Option 2 opens up the potential for lost/discarded messages. Option 3 is not possible at the moment given 3.1's beta status.

I'm interested to hear whether anyone out there has experienced this kind of problem before and has any alternative solutions for tackling it.
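
To make option 1 a little more concrete, here is a rough sketch of a lighter-weight variant: rather than rebuilding the zmq_pollitem_t array, touch each unpolled socket once per loop iteration with a non-blocking receive. This rests on my reading of the 2.x code described above, namely that a receive on a SUB socket with no subscriptions discards whatever is buffered and then reports that nothing is available; I have not confirmed that. The names proxy_sub_t, try_recv_fn, purge_idle and counting_stub are all hypothetical, and the callback stands in for a thin wrapper around zmq_recv so the loop can be exercised without a live socket.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical bookkeeping for one SUB socket owned by the proxy. */
typedef struct {
    void  *socket;       /* 0mq socket handle (opaque to this code) */
    size_t client_subs;  /* client subscriptions currently mapped to it */
} proxy_sub_t;

/* Hypothetical callback: attempt one non-blocking receive on `socket`.
 * Returns 1 if a message came back, 0 if nothing was available.
 * Against the 2.x API this would wrap
 *     zmq_msg_init(&msg); zmq_recv(socket, &msg, ZMQ_NOBLOCK); zmq_msg_close(&msg);
 * and treat a -1 return with errno == EAGAIN as "nothing available". */
typedef int (*try_recv_fn)(void *socket);

/* Touch every socket that is left out of the zmq_poll set (no client
 * subscriptions) so that application-thread filtering gets a chance to run.
 * Sockets with subscriptions are skipped; they are serviced via zmq_poll.
 * Returns the number of idle sockets touched. */
size_t purge_idle(const proxy_sub_t *subs, size_t count, try_recv_fn try_recv)
{
    size_t serviced = 0;

    assert(try_recv != NULL);
    assert(subs != NULL || count == 0);

    for (size_t i = 0; i < count; i++) {
        if (subs[i].client_subs > 0)
            continue;
        /* With no subscriptions we expect no message to be returned, so
         * the result is deliberately ignored (assumption, see above). */
        (void)try_recv(subs[i].socket);
        serviced++;
    }
    return serviced;
}

/* Stand-in callback for trying the loop without libzmq: it treats the
 * "socket" as a pointer to an int and counts how often it was touched. */
int counting_stub(void *socket)
{
    int *calls = (int *)socket;
    (*calls)++;
    return 0;
}
```

In the real main loop this would be called just before zmq_poll, passing the zmq_recv wrapper described in the comment instead of counting_stub. It still puts the matching/purging work on the application thread, so it shares that drawback with option 1.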

My suspicion is that a temporary solution of high water-marks, replaced by a proper solution of using 0mq 3.1 once stable, is probably the way forward.

Best regards,
Jess
