While unit testing some code built on PUB/SUB sockets (pyzmq 2.1.10 and zmq 2.1.11), I ran into a race condition that intermittently breaks the tests. The message exchanges usually work fine, but every so often a test fails because some or all of the nodes fail to send any messages at all.
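
To make the setup concrete, here is a minimal sketch of the kind of fixture involved (the details are described in prose further down). It is not the actual test code: the names `Node`, `make_nodes`, `close_nodes` and `NodeExchangeTest`, the endpoint paths, the LINGER setting, and the retry loop in the test are all assumptions made for illustration, and it assumes a reasonably recent pyzmq.

```python
import os
import tempfile
import time
import unittest

import zmq

NUM_NODES = 3
# Hypothetical fixed ipc endpoints, reused by every test on purpose.
ENDPOINTS = [
    "ipc://" + os.path.join(tempfile.gettempdir(), "zmq-race-node%d" % i)
    for i in range(NUM_NODES)
]


class Node(object):
    """One node: a PUB socket bound to its own endpoint plus a SUB socket."""

    def __init__(self, context, endpoint):
        self.pub = context.socket(zmq.PUB)
        # Assumption of this sketch: LINGER=0 so close() never blocks.
        self.pub.setsockopt(zmq.LINGER, 0)
        self.pub.bind(endpoint)
        self.sub = context.socket(zmq.SUB)
        self.sub.setsockopt(zmq.LINGER, 0)
        self.sub.setsockopt(zmq.SUBSCRIBE, b"")  # subscribe to everything

    def close(self):
        self.pub.close()
        self.sub.close()


def make_nodes(context):
    """Create all nodes, then connect every SUB to every PUB (own included)."""
    nodes = [Node(context, endpoint) for endpoint in ENDPOINTS]
    for node in nodes:
        for endpoint in ENDPOINTS:
            node.sub.connect(endpoint)
    return nodes


def close_nodes(nodes):
    for node in nodes:
        node.close()


class NodeExchangeTest(unittest.TestCase):
    def setUp(self):
        # Nodes are created immediately before each test...
        self.nodes = make_nodes(zmq.Context.instance())

    def tearDown(self):
        # ...and torn down immediately after it, with no delay in between.
        close_nodes(self.nodes)

    def test_broadcast(self):
        # Keep publishing until every node has seen a message or we time out.
        pending = set(range(NUM_NODES))
        deadline = time.time() + 5.0
        while pending and time.time() < deadline:
            self.nodes[0].pub.send(b"ping")
            for i in list(pending):
                if self.nodes[i].sub.poll(50):  # timeout in milliseconds
                    self.nodes[i].sub.recv()
                    pending.discard(i)
        self.assertFalse(pending, "nodes that never received: %r" % pending)

    # A second, identical test forces a teardown/setup cycle back to back.
    test_broadcast_again = test_broadcast
```

Saved as a module, this can be run with `python -m unittest`. The point of the sketch is only the shape of the fixture: the same ipc endpoints are closed in `tearDown` and rebound in the next `setUp` with nothing in between.
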
The unit tests use 3 nodes, each consisting of a pair of PUB/SUB sockets, where each SUB socket is connected to all PUB sockets (including the node's own PUB socket). The individual tests involve various message exchanges between the nodes. In typical unit test fashion (for Python, anyway), the nodes are created immediately before each test begins and torn down immediately after it completes.

It seems that, at least when using 'ipc:///' on Linux, closing the zmq sockets and then immediately recreating and reconnecting them occasionally renders the sockets unusable: no messages are received from the PUB sockets and the connection never recovers. Inserting as little as a 50 ms delay between tests completely prevents the problem.

Is this a bug or expected behavior? Given the multi-threaded nature of the zmq backend, I can see the answer legitimately being "don't do that", but I'm not a fan of littering my code with magic "this oughta be long enough" delays.

Tom

_______________________________________________
zeromq-dev mailing list
http://lists.zeromq.org/mailman/listinfo/zeromq-dev
