Hi, I have a program that sends messages on a ZMQ_DEALER socket with ZMQ_DONTWAIT. If it gets back EAGAIN (perhaps because the other end is responding slowly or has gone away), it calls zmq_close to close the socket and then re-establishes the connection (possibly to a new endpoint) with a new socket. ZMQ_LINGER is set to 0 (the problem below doesn't appear to happen if ZMQ_LINGER isn't set, but leaving it unset can cause other issues).
I'm occasionally seeing crashes in the libzmq epoll_t thread with either "pure virtual method called" or a segmentation fault. The stack looks like (this is with libzmq 3.2.4 but others are similar):

#4 0x00007f8928939ca3 in std::terminate() () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
#5 0x00007f892893a77f in __cxa_pure_virtual () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
#6 0x00007f8929649db1 in zmq::v1_encoder_t::message_ready (this=0x7f8918000b90) at v1_encoder.cpp:66
#7 0x00007f892964a2a4 in zmq::encoder_base_t<zmq::v1_encoder_t>::get_data (this=0x7f8918000b90, data_=0x7f8918000928, size_=0x7f8918000930, offset_=0x0) at encoder.hpp:93
#8 0x00007f892963fb42 in zmq::stream_engine_t::out_event (this=0x7f89180008e0) at stream_engine.cpp:261
#9 0x00007f8929627d1a in zmq::epoll_t::loop (this=0x8eace0) at epoll.cpp:158
#10 0x00007f8929644996 in thread_routine (arg_=0x8ead50) at thread.cpp:83
#11 0x00007f8928be6e9a in start_thread (arg=0x7f89271b9700) at pthread_create.c:308
#12 0x00007f89293453fd in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:112

Looking at the core, it appears that the memory pointed to by the msg_source field in the encoder has been freed (the "pure virtual method called" is because the vtbl pointer has been munged by something that re-allocated the buffer). The msg_source field points to the session_base_t, but that was freed by the zmq_close. The session_base_t destructor calls engine->terminate(), which would normally free the engine state but doesn't do anything if the encoder still has data left to be sent.

I've reproduced this with 3.2.4, 4.0.1, and master (as of a few days ago). I filed LIBZMQ-576 and attached a small test program to the issue.

This looks like a libzmq bug to me, though if I'm misusing the API in some way (or if there's a reasonable workaround) please let me know.

Andy
_______________________________________________
zeromq-dev mailing list
[email protected]
http://lists.zeromq.org/mailman/listinfo/zeromq-dev
