RSTs can occur even if the remote node is still up.
On Thu, Dec 19, 2013 at 7:25 AM, artemv zmq <[email protected]> wrote:

> hi Justin. Thanks for the reply!
>
> Can you please elaborate more on:
>
> >> FIN/RST are only part of the TCP protocol. TCP has other heuristics it
> >> uses for re-requests which are needed to determine if a remote node has
> >> become unavailable, but this takes time
>
> Usually ITOps just "kill" a process and that's it. That's why RST always
> works. I don't know (and architects I work with don't know either ...)
> what other heuristics TCP uses to determine if a remote node has
> become unavailable. They know FIN/RST and they are happy.
>
> 2013/12/17 Justin Cook <[email protected]>
>
>> Artem,
>>
>> On Tuesday, 17 December 2013 at 17:35, artemv zmq wrote:
>>
>> > Now, imagine, server shuts down, for example via "ifdown eth0". OS
>> > sends to client RST packet and client now recognizes that server became
>> > unresponsive. At this point I think it would be very, very great to have
>> > a socket_option standing for "if socket reveals during runtime that
>> > remote peer is not responsive -- don't queue a msg and raise error".
>>
>> 0MQ abstracts, to a large degree, the underlying socket implementation.
>> TCP is one transport-layer protocol, and from the list it seems UDP may be
>> joining soon. Multicast (used only in PUB/SUB) is encapsulated in UDP.
>>
>> FIN/RST are only part of the TCP protocol. TCP has other heuristics it
>> uses for re-requests which are needed to determine if a remote node has
>> become unavailable, but this takes time. Unless you receive an RST inserted
>> by a firewall or an `ifcfg eth0 down`, then it is not possible to know
>> immediately to stop queuing messages. If you are sending 1000s of messages
>> per second, and it takes several seconds to mark a host as unavailable,
>> then what?
>>
>> As of now, if you set HWM=1 and the connection breaks, send() will block
>> if a message is on the queue depending on the message pattern.
>>
>> There has been other traffic on the list today regarding a similar topic.
>> As of now, since you are interested in finding a host that has gone down
>> QUICKLY, you need to implement your own heartbeat. Relying on transport
>> protocols to do that for you is very unreliable.
>>
>> Credit-based flow control has also been mentioned along with other
>> possible approaches.
>>
>> > What do you think devs?
>>
>> My opinion is that it would be great if we somehow did give an option to
>> establish a heartbeat, even though 0MQ provides the library to do this
>> yourself. I wouldn't mind a socket option that did this, but it will have
>> quite large implications depending on the message pattern and the queue.
>> This is not something that would be easy and straightforward. It would
>> require a lot of thought.
>>
>> Since it was brought up today, it is definitely worth talking about how
>> this should be done, but if you follow the advice you should implement your
>> own heartbeat. The biggest issue I can see in your case is that you do not
>> have control over the remote node nor the protocol.
>>
>> --
>> Justin Cook
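For what it's worth, the "implement your own heartbeat" advice quoted above can be sketched roughly as below. This is only an illustration of the bookkeeping, not anything 0MQ ships: PeerMonitor, PeerUnavailable and guarded_send are hypothetical names, the interval and liveness values are arbitrary, and it assumes the remote side actually sends something (a heartbeat or ordinary data) at least once per interval, which may not hold if you do not control the peer.

```python
import time


class PeerUnavailable(Exception):
    """Raised instead of queuing a message for a peer that looks dead."""


class PeerMonitor:
    """Tracks when the peer was last heard from (hypothetical helper)."""

    def __init__(self, interval=1.0, liveness=3, clock=time.monotonic):
        self.interval = interval    # assumed seconds between peer heartbeats
        self.liveness = liveness    # missed intervals tolerated before giving up
        self.clock = clock          # injectable clock, handy for testing
        self.last_seen = clock()

    def heard_from_peer(self):
        # Call this on ANY inbound traffic from the peer, heartbeat or data.
        self.last_seen = self.clock()

    def is_alive(self):
        # The peer counts as alive while fewer than `liveness` intervals
        # have passed since we last heard from it.
        return (self.clock() - self.last_seen) < self.interval * self.liveness


def guarded_send(monitor, send, msg):
    """Call send(msg) only if the peer still looks alive, else raise."""
    if not monitor.is_alive():
        raise PeerUnavailable("no traffic from peer within the liveness window")
    send(msg)
```

In a real application `send` would be your socket's send method, and heard_from_peer() would be called from the receive loop. Detection time is roughly interval * liveness, so it is a trade-off between noticing a dead peer quickly and tolerating short stalls.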
_______________________________________________ zeromq-dev mailing list [email protected] http://lists.zeromq.org/mailman/listinfo/zeromq-dev
