Hi Pieter,

I can certainly make a test case for the socket leakage, but as I said before, I thought this was a known issue, so I'm a little confused. Specifically, I was referring to the line in the guide <http://zguide.zeromq.org/page:all#Shrugging-It-Off> where it says "When we use a ROUTER socket in an application that tracks peers, as peers disconnect and reconnect, the application will leak memory (resources that the application holds for each peer) and get slower and slower."

How can we avoid this kind of behaviour? I know that if the outside client calls zmq_close it will actually close the socket. On the ROUTER socket those clients are connected to, though, I can't find any exposed way to tell it that a client should be removed from the (os-level) socket table associated with the router once we have determined, via heartbeat or any other mechanism, that the client is not coming back.
I doubt I'm going to be able to come up with a test case that can get a socket into EPOLLERR or EPOLLHUP, but I think it's reasonably easy to see how this can happen from the code. If the socket is in one of those states, then epoll_wait returns EPOLLERR or EPOLLHUP and we end up on this line <https://github.com/zeromq/zeromq2-x/blob/master/src/epoll.cpp#L153>. This calls in_event in zmq_engine.cpp. We end up on this line <https://github.com/zeromq/zeromq2-x/blob/master/src/zmq_engine.cpp#L164>, because disconnected will be set to true, and that calls back into epoll.cpp here <https://github.com/zeromq/zeromq2-x/blob/master/src/epoll.cpp#L97>. This function *only* unregisters the socket for EPOLLIN, so the next time we call epoll_wait, the socket will still be in the EPOLLERR or EPOLLHUP state, and we do exactly the same thing again.

Thanks,
Will

On Wed, Jan 23, 2013 at 6:27 AM, Pieter Hintjens <[email protected]> wrote:

> Hi Will,
>
> Can you make a minimal test case that shows the socket leakage? That's
> a first step to solving the problem.
>
> > 2. There appears to be a bug with ZeroMQ's epoll implementation when a
> > socket gets into the EPOLLERR or EPOLLHUP state. ZeroMQ unregisters the
> > socket for read, but doesn't actually call EPOLL_CTL_DEL on the fd, so
> > epoll just keeps calling back zmq with the same fd. Is this a known bug?
> > I also tried fixing this, but now it crashes in set_pollout periodically
> > and appears to be passing in a garbage fd, so I must have missed
> > something.
>
> Again, if there's any way to reproduce this, that's a good start.
> Otherwise, file an issue and note as much about the problem as you
> can.
>
> -Pieter
> _______________________________________________
> zeromq-dev mailing list
> [email protected]
> http://lists.zeromq.org/mailman/listinfo/zeromq-dev
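P.S. The epoll behaviour I describe above can probably be shown outside of ZeroMQ. The following is only a minimal sketch, assuming Linux, and it is not taken from the ZeroMQ sources: it uses a pipe whose write end has been closed as a stand-in for a socket in the EPOLLHUP state, and it models "unregister for EPOLLIN" as an EPOLL_CTL_MOD with an empty event mask. The names run_hup_demo, wait_once and HupDemoResult are hypothetical. Per the epoll_ctl man page, EPOLLERR and EPOLLHUP are always reported whether or not they are requested, so I would expect only EPOLL_CTL_DEL to stop the callbacks.

```cpp
#include <sys/epoll.h>
#include <unistd.h>
#include <cassert>
#include <cstdint>

// Event masks observed at each stage of the demo (hypothetical helper type).
struct HupDemoResult {
    std::uint32_t after_add;  // EPOLLIN registered, peer has hung up
    std::uint32_t after_mod;  // EPOLLIN cleared with EPOLL_CTL_MOD
    std::uint32_t after_del;  // fd removed with EPOLL_CTL_DEL
};

// Wait up to 100 ms and return the reported event mask, or 0 on timeout.
static std::uint32_t wait_once(int epfd) {
    epoll_event ev{};
    int n = epoll_wait(epfd, &ev, 1, 100);
    return n == 1 ? ev.events : 0;
}

HupDemoResult run_hup_demo() {
    int fds[2];
    int rc = pipe(fds);
    assert(rc == 0);
    int epfd = epoll_create1(0);
    assert(epfd >= 0);

    // Register the read end for EPOLLIN only.
    epoll_event ev{};
    ev.events = EPOLLIN;
    ev.data.fd = fds[0];
    rc = epoll_ctl(epfd, EPOLL_CTL_ADD, fds[0], &ev);
    assert(rc == 0);

    // Closing the write end puts the read end into the hang-up state.
    close(fds[1]);

    HupDemoResult r{};
    r.after_add = wait_once(epfd);

    // Assumption: this models dropping EPOLLIN without deleting the fd.
    ev.events = 0;
    rc = epoll_ctl(epfd, EPOLL_CTL_MOD, fds[0], &ev);
    assert(rc == 0);
    r.after_mod = wait_once(epfd);

    // Removing the fd from the epoll set entirely.
    rc = epoll_ctl(epfd, EPOLL_CTL_DEL, fds[0], &ev);
    assert(rc == 0);
    (void)rc;
    r.after_del = wait_once(epfd);

    close(fds[0]);
    close(epfd);
    return r;
}
```

If this behaves the way the man page suggests, after_add and after_mod should both contain EPOLLHUP and after_del should be 0. That would match what I think is happening in epoll.cpp, though a pipe is of course not the same thing as a TCP socket in an error state.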
