I've been worried about a similar problem: how to release resources tied up by a client that has gone away. zproto provides a good example of what to do for application-level resources, but queued messages are still stuck. Using credit-based flow control will limit the flow of messages, but at high request rates, a few thousand buggy client deployments can be very disruptive.
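
By "credits" I mean the credit-based flow control pattern from the Guide: the receiver grants the sender a window of N messages and tops it up as it consumes them, so a stalled or dead peer stops receiving once its window drains. A minimal sketch of the receiving side (the "CREDIT" message and the batch size are placeholders of mine, not any ZMQ API):

    #include <zmq.h>

    static const int CREDIT_BATCH = 100;   // window size - placeholder value

    // Receiving side: grant one window up front, then replenish it each
    // time a full batch has been consumed. The sender decrements its
    // credit per message sent and stops sending when it hits zero.
    void receive_with_credit(void *sock)
    {
        int consumed = 0;
        zmq_send(sock, "CREDIT", 6, 0);            // initial grant
        while (true) {
            char buf[256];
            int rc = zmq_recv(sock, buf, sizeof(buf), 0);
            if (rc == -1)
                break;                             // interrupted or closed
            // ... process the message ...
            if (++consumed >= CREDIT_BATCH) {      // batch used up: top up
                zmq_send(sock, "CREDIT", 6, 0);
                consumed = 0;
            }
        }
    }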

Has anyone ever thought about implementing an API that would let an application disconnect a specific peer from a socket? I was thinking about adding an API that would disconnect a peer from a ROUTER socket, identified by its identity. A more general approach might be an API to iterate over connected peers and disconnect based on address, but I don't have that need and so hadn't thought much about it until right now.
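
To make that concrete, here is roughly the shape of the call I have in mind. To be clear, zmq_disconnect_peer() is purely hypothetical - nothing like it exists in libzmq today - this is just what I'm picturing:

    #include <zmq.h>
    #include <string.h>

    // Proposed (hypothetical) API: sever the connection for the ROUTER
    // peer whose routing identity matches, and free any messages still
    // queued for it. Presumably it would return -1 with errno set when
    // no such peer is connected.
    void drop_dead_client(void *router, const char *identity)
    {
        int rc = zmq_disconnect_peer(router, identity, strlen(identity));
        if (rc != 0) {
            // peer already gone, or identity unknown - nothing to free
        }
    }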

On Monday, June 29, 2015 11:26 AM, Marcin Romaszewicz <[email protected]> wrote:

Hi Jonathan,

Your heartbeat code does indeed work in my little test, but I don't know why it didn't work in the wild for me.

Your code, though, gave me an idea to fix my problem slightly differently on top of ZMQ 4.1.2. I already have heartbeats going back and forth, and they propagate some peer information, so I have to send them regardless of whether your code sends ZMQ-internal heartbeats. I'm going to do something similar in the stream engine: if the TCP send returns a size of 0 because it would block or fail, I'll start a timer, then cancel it if we ever have a subsequent successful send or receive. If the timer goes off, we disconnect. This should fix my problem without two layers of heartbeats.
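
In sketch form, the logic I have in mind looks like this. The timer calls here are stand-ins loosely modeled on libzmq's internal io_object interface - this is the shape of the change, not the actual patch:

    // Blocked-send watchdog: arm a timer when a TCP write makes no
    // progress, disarm it on any later successful send or receive, and
    // tear the connection down if it fires first.
    struct send_watchdog_sketch
    {
        bool armed = false;
        static const int timer_id = 0x70;      // arbitrary, assumed unused
        static const int timeout_ms = 5000;    // assumed timeout

        void add_timer(int, int) { /* engine plumbing stand-in */ }
        void cancel_timer(int)   { /* engine plumbing stand-in */ }
        void disconnect()        { /* engine plumbing stand-in */ }

        // Call after every TCP write attempt; would_block is true when
        // the write failed with EAGAIN/EWOULDBLOCK.
        void after_send(int nbytes, bool would_block)
        {
            if (nbytes == 0 && would_block && !armed) {
                add_timer(timeout_ms, timer_id);   // stalled: arm once
                armed = true;
            } else if (nbytes > 0) {
                progress();                        // bytes moved: disarm
            }
        }

        // Also called from the receive path - inbound traffic counts
        // as progress too.
        void progress()
        {
            if (armed) {
                cancel_timer(timer_id);
                armed = false;
            }
        }

        void timer_event(int id)
        {
            if (id == timer_id)
                disconnect();   // window expired with no progress
        }
    };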

Once 4.2.0 is stable and tested, I'll move to using your heartbeat stuff and 
remove our own heartbeats.
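
(For anyone following along: the built-in heartbeats are configured through three socket options that landed for 4.2 - ZMQ_HEARTBEAT_IVL, ZMQ_HEARTBEAT_TIMEOUT, and ZMQ_HEARTBEAT_TTL. The 500 ms values below just mirror what Jonathan used in his test, not a recommendation.)

    #include <zmq.h>

    // Enable ZMTP-level heartbeats on a socket (libzmq >= 4.2).
    void enable_heartbeats(void *sock)
    {
        int ivl = 500;      // send a PING every 500 ms
        int timeout = 500;  // drop the connection if no traffic arrives
                            // within 500 ms of sending a PING
        int ttl = 500;      // ask the remote end to time us out after
                            // 500 ms of silence from our side
        zmq_setsockopt(sock, ZMQ_HEARTBEAT_IVL, &ivl, sizeof(ivl));
        zmq_setsockopt(sock, ZMQ_HEARTBEAT_TIMEOUT, &timeout, sizeof(timeout));
        zmq_setsockopt(sock, ZMQ_HEARTBEAT_TTL, &ttl, sizeof(ttl));
    }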

-- Marcin


On Sat, Jun 27, 2015 at 9:06 AM, Jonathan Reams <[email protected]> wrote:

>Hi Marcin,
>
>I tried running your test case with the new heartbeats turned on, and I saw what I think is the correct behavior. I set the heartbeat interval, timeout, and TTL to 500 ms, and less than a second after setting iptables to DROP, all the sockets on the peer side went from ESTABLISHED to SYN_SENT, indicating that they were trying to reconnect, and all the ESTABLISHED sockets on the router side were closed. After flushing the INPUT iptables chain, the peers eventually recovered. I put my updated copy of your test script here: https://gist.github.com/jbreams/7f507beff87987afad98. I haven't tried this with 4.2.0 talking to 4.1.2, though in your configuration I think it would do almost the right thing - I'd expect the router side to work fine and the peers to never close their sockets.
>
>Jonathan
>
>On Fri, Jun 26, 2015 at 4:58 PM, Marcin Romaszewicz <[email protected]> wrote:
>>Hi All,
>>
>>I've got a trivial bit of code to reproduce this issue on a single host, using iptables to simulate a network partition:
>>https://s3-us-west-2.amazonaws.com/marcin-zmq-example/zmq_test.cpp
>>
>>The file has comments on how to run the executable, but the short version is that you start a ZMQ_ROUTER listener which accepts connections from other peers, remembers their identities, and pings them every 5 seconds. Then you start a number of peers which connect to this router and start pinging it every few seconds.
>>
>>Once you use the iptables command (also in the comments in the file), the router can't ping the peers and the peers can't ping the router. The file descriptors and connections remain open forever on both sides. Furthermore, when you undo the iptables block, the connections never come back.


_______________________________________________
zeromq-dev mailing list
[email protected]
http://lists.zeromq.org/mailman/listinfo/zeromq-dev