Hi, I've noticed that if I create a SUB socket, call setsockopt(...ZMQ_SUBSCRIBE...), connect to a peer that doesn't exist (I'm using "ipc://foo"), and then call zmq_close() on the socket followed by zmq_term(), the termination hangs.
There is slightly more to it that I haven't figured out yet, though. I can't reproduce the bug in a small test program that performs exactly these steps. However, I can reproduce it 100% of the time in my larger event-driven application, which uses ZMQ_FD and ZMQ_EVENTS with an event loop (Qt). I'm reporting it in case anyone has an idea about how this can happen. I've tried strace to see what differs between my app and the small test case, but nothing stands out.

Fortunately, it is possible to work around the problem by setting ZMQ_LINGER on the socket. However, I consider it a bug: SUB technically has no write queue, at least from the application's perspective, so it should not block shutdown.

My theory is that SUB sends the subscription to the publisher under the hood, and that it is the combination of a pending subscription and a connecting socket (which causes the queue to be created in the absence of a peer) that makes the zmq engine consider the socket to have a pending write. It therefore applies the blocking behavior on shutdown, even though the application didn't actually write anything.

If I don't subscribe, or if I bind instead of connect, or if I use a non-SUB socket, then the problem doesn't occur. Further, if I create a second application that binds to "ipc://foo" and start it while the original is blocking in zmq_term(), then the original finishes and exits. So it's clearly waiting on a connect.

Happy to investigate this further if anyone can direct me.

Justin
