Artem,

On Tuesday, 17 December 2013 at 17:35, artemv zmq wrote:

> Now, imagine, server shuts down, for example via "ifdown eth0". OS sends to 
> client RST packet and client now recognizes that server became unresponsive. 
> A this point I think would be very-very great to have an socket_option 
> standing for "if socket reveals during runtime that remote peer is not 
> responsive -- don't queue a msg and raise error" .


0MQ abstracts — to a large degree — the underlying socket implementation. TCP 
is one transport-layer protocol, and from the list it seems UDP may be joining 
soon. Multicast (used only in PUB/SUB) is encapsulated in UDP.  

FIN/RST exist only in TCP. TCP also has retransmission heuristics it uses to 
determine whether a remote node has become unavailable, but this takes time. 
Unless you receive an RST, whether inserted by a firewall or triggered by an 
`ifconfig eth0 down`, it is not possible to know immediately that you should 
stop queuing messages. If you are sending 1000s of messages per second, and it 
takes several seconds to mark a host as unavailable, then what?  

As of now, if you set HWM=1 and the connection breaks, send() will block once a 
message is on the queue, depending on the messaging pattern.  

There has been other traffic on the list today regarding a similar topic. As of 
now, since you are interested in QUICKLY detecting a host that has gone down, 
you need to implement your own heartbeat. Relying on transport protocols to do 
that for you is very unreliable.  
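
To make the heartbeat idea concrete, here is a minimal sketch of the liveness 
bookkeeping only, in plain Python. It is not part of the 0MQ API: the names 
PeerLiveness and try_send, the 1-second interval, and the limit of 3 missed 
heartbeats are all assumptions chosen for illustration. Actually sending and 
receiving the heartbeat messages over a socket is left out.

```python
import time


class PeerLiveness:
    """Track whether a peer looks alive, based on when we last heard from it.

    Hypothetical helper for illustration; not provided by 0MQ.
    """

    def __init__(self, interval=1.0, max_missed=3, clock=time.monotonic):
        self.interval = interval      # expected seconds between heartbeats
        self.max_missed = max_missed  # silent intervals tolerated before "dead"
        self.clock = clock            # injectable so the logic can be tested
        self.last_seen = clock()

    def on_heartbeat(self):
        """Call whenever a heartbeat (or any message) arrives from the peer."""
        self.last_seen = self.clock()

    def is_alive(self):
        """True while the peer has been silent for less than the allowed window."""
        return (self.clock() - self.last_seen) < self.interval * self.max_missed


def try_send(liveness, send, msg):
    """Pass msg to send() only if the peer looks alive; otherwise fail fast."""
    if not liveness.is_alive():
        raise ConnectionError("peer missed too many heartbeats; not queuing")
    send(msg)
```

In an application, send would be the socket's own send method and 
on_heartbeat() would be called from the receive loop. The trade-off is the one 
described above: a shorter interval detects a dead peer sooner, but costs more 
traffic and risks false alarms on a slow link.  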

Credit-based flow control has also been mentioned along with other possible 
approaches.  

> What do you think devs?

My opinion is that it would be great if we offered an option to establish a 
heartbeat, even though 0MQ already gives you what you need to do this 
yourself. I wouldn't mind a socket option that did this, but it would have 
quite large implications depending on the messaging pattern and the queue. This 
is not something that would be easy or straightforward. It would require a lot 
of thought.

Since it was brought up today, it is definitely worth talking about how this 
should be done, but for now the advice is to implement your own heartbeat. The 
biggest issue I can see in your case is that you control neither the remote 
node nor the protocol.  

--  
Justin Cook  


_______________________________________________
zeromq-dev mailing list
[email protected]
http://lists.zeromq.org/mailman/listinfo/zeromq-dev
