On Tue, Jul 03, 2018 at 10:35:31AM -0500, Mark Botner wrote:
> I wonder if the default setting for ZMQ_LINGER is causing the zmq_close()
> to block since there are unsent messages? From the ref. guide:
>
> The default setting of *ZMQ_LINGER* does not discard unsent messages; this
> behaviour may cause the application to block when calling *zmq_ctx_term()*.

Indeed, but in my case, zmq_close() does NOT block. zmq_connect() does NOT
block either. It's just that messages do not arrive, and eventually
zmq_send() blocks (or fails in non-blocking mode).

Charles

> On Tue, Jul 3, 2018 at 9:08 AM, Charles Bouillaguet <[email protected]> wrote:
>
> > Dear zeromq'ers,
> >
> > I'm facing a reliability problem that I couldn't solve by myself so far.
> >
> > I have two machines with two asymmetric programs running. Machine A
> > creates a PULL socket and binds it. Machine B creates a PUSH socket and
> > connects it (to the PULL socket of machine A), using the TCP transport.
> > Machine B then sends messages like crazy (about 500/s). Basically, B is
> > a low-cost device equipped with sensors and A is a server that just
> > stores the data.
> >
> > This works like a charm... until the inevitable happens: some network
> > event occurs, and the messages cannot be transmitted from machine B to
> > machine A.
> >
> > With a blocking send, the process on machine B then gets stuck in
> > zmq_send(), once the high water mark is reached, and the whole pipeline
> > grinds to a halt.
> >
> > To avoid this, I tried the "Lazy Pirate Pattern". I use something like:
> >
> >     if (-1 == zmq_send(socket, msg, size, ZMQ_DONTWAIT)) {
> >         if (errno == EAGAIN) {
> >             zmq_close(socket);
> >             socket = zmq_socket(context, ZMQ_PUSH);
> >             zmq_connect(socket, address);
> >         }
> >     }
> >
> > I don't care if I lose some messages. What I don't want is the pipeline
> > to stop forever.
> >
> > At first, this seems to work as intended. When the network is down, the
> > program actually closes and re-creates the socket; the call to
> > zmq_connect() succeeds... but the messages are still not sent, and the
> > process on machine B ends up in a loop where it fills the ZMQ buffers,
> > destroys the socket, re-creates it, re-connects, rinse, repeat. I
> > observed the loop for several hours.
> >
> > Just stopping the UNIX process and re-starting it solved the problem
> > (i.e. messages get transmitted normally, instantaneously).
> >
> > Is there something I am doing wrong? What are my options to avoid this
> > problem?
> > [I can consider moving away from ZMQ to nanomsg or nng].
> >
> > Thanks,
> > --
> > Charles BOUILLAGUET
> > Université de Lille - Sciences et Technologies
> > [email protected] | www.univ-lille1.fr
> > Laboratoire CRIStAL - Bât M3 - Bureau 332 - 59655 Villeneuve d'Ascq
> > Tél. +33 (0)3 28 77 85 84
> > homepage: http://cristal.univ-lille.fr/~bouillag/
> > _______________________________________________
> > zeromq-dev mailing list
> > [email protected]
> > https://lists.zeromq.org/mailman/listinfo/zeromq-dev
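
The snippet quoted in the thread recreates the socket on every single EAGAIN. At about 500 messages per second, that can mean hundreds of close/connect cycles per second while the link is down. A possible variation, offered only as a sketch and not as a confirmed fix for the problem reported here, is to drop messages on failure and reset the socket only after a run of consecutive failures. All names below (`sender_policy`, `policy_send`, `fake_link`, and the two callback types) are hypothetical. The transport is hidden behind callbacks so the sketch compiles without libzmq, and `fake_link` is a stand-in used purely for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical callbacks. In the setup described in the thread, try_send
 * would wrap zmq_send(socket, msg, size, ZMQ_DONTWAIT) and report success,
 * and reset would close and recreate the PUSH socket. */
typedef bool (*try_send_fn)(void *transport, const void *msg, size_t size);
typedef void (*reset_fn)(void *transport);

typedef struct {
    try_send_fn try_send;
    reset_fn reset;
    void *transport;
    unsigned fail_threshold;     /* consecutive failures before a reset */
    unsigned consecutive_fails;  /* current run of failed sends */
    unsigned long dropped;       /* messages given up on */
    unsigned long resets;        /* number of resets performed */
} sender_policy;

/* Try to send once. On failure the message is dropped, and the transport
 * is reset only when fail_threshold consecutive failures have accumulated.
 * Returns true if the message was handed to the transport. */
static bool policy_send(sender_policy *p, const void *msg, size_t size)
{
    assert(p != NULL && p->try_send != NULL && p->reset != NULL);

    if (p->try_send(p->transport, msg, size)) {
        p->consecutive_fails = 0;
        return true;
    }

    p->dropped++;
    if (++p->consecutive_fails >= p->fail_threshold) {
        p->reset(p->transport);
        p->resets++;
        p->consecutive_fails = 0;
    }
    return false;
}

/* Stand-in transport for illustration only: sends fail while up is false. */
typedef struct {
    bool up;
    unsigned long delivered;
    unsigned long resets;
} fake_link;

static bool fake_try_send(void *transport, const void *msg, size_t size)
{
    fake_link *link = transport;
    (void)msg;
    (void)size;
    if (!link->up)
        return false;
    link->delivered++;
    return true;
}

static void fake_reset(void *transport)
{
    fake_link *link = transport;
    link->resets++;
}
```

With libzmq, the reset callback could also set `ZMQ_LINGER` to 0 through `zmq_setsockopt()` before calling `zmq_close()`, so that each discarded socket does not keep its unsent messages queued in the context. Whether that changes the behaviour reported in this thread is an assumption that would need testing.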
