Here's some small sample code to reproduce the issue: https://gist.github.com/jkarneges/ab2b1abea1ee4cfc1332

A (ztest1.py) creates REQ and ROUTER sockets. B (ztest2.py) creates REP and ROUTER sockets. B assigns a random identity to its ROUTER socket and binds. A connects its sockets to B. A queries for B's id using the REQ socket, and then attempts to send a message via the ROUTER socket right after that. This is repeated every 2 seconds.

A and B can be started in any order. A can be restarted and things will still work. If B is restarted, then A's ROUTER socket will never work again until A is also restarted. A uses ZMQ_ROUTER_MANDATORY to show that the failures are on A's side.

On 02/07/2014 02:16 PM, Justin Karneges wrote:
> It is my understanding that being able to route requires the socket to
> have an identity mapping in its routing table for the peer.
>
> For peers that do not explicitly specify their own identity, I
> believe you are correct that routing is not possible until at least one
> message has been received from the peer. It is at this point that the
> ROUTER socket will make up an identity for this peer and store it in its
> routing table.
>
> However, for peers that *do* explicitly specify their own identity (as I
> am doing), this identity information is delivered immediately after
> the connection is established, allowing routing to the peer even if the
> peer has not sent a message yet.
>
> I should have been more clear in my original message. The B program is
> explicitly specifying a random UUID as the identity of its socket before
> binding.
>
> On 02/07/2014 02:06 PM, Panu Wetterstrand wrote:
>> I did not quite get the problem but could this be because (I think)
>> router is not able to route messages to socket from which it has not
>> received data first...
>>
>> On 7.2.2014 at 22.51, "Justin Karneges" <[email protected]> wrote:
>>
>>     Hi,
>>
>>     1) ROUTER in program A is set to connect to a bind socket in program B.
>>     2) Both programs are started, and the connection is established.
>>     3) A determines B's socket identity out-of-band, and is able to send
>>     messages to B.
>>     4) B is terminated and the connection is lost.
>>     5) B is started again, and the connection is re-established.
>>     6) A determines B's socket identity out-of-band, and is no longer able
>>     to send messages to B.
>>
>>     It seems this problem does not happen if B retains the same socket
>>     identity across reconnects. However, if it uses a random identity (to be
>>     discovered out-of-band by A), then routing will never work again after
>>     the first restart of B. The A program must be restarted in order to make
>>     things right again.
>>
>>     My guess is that each connect queue on a ROUTER socket is somehow bound
>>     for life against the first identity it sees. Is this intentional
>>     behavior?
>>
>>     Thanks,
>>     Justin
>>     _______________________________________________
>>     zeromq-dev mailing list
>>     [email protected]
>>     http://lists.zeromq.org/mailman/listinfo/zeromq-dev
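
As a footnote to the thread above: the guess in the original message (that each connection on a ROUTER socket stays bound to the first identity it sees) can be modelled in a few lines of plain Python. This is only a toy and not libzmq's code. The names ToyRouter, Unroutable, peer_connected and restart_scenario are made up for illustration, and the "sticky" flag is an assumption that stands in for the suspected behaviour. It just shows why a restart of B with a new identity would leave A unable to route.

```python
class Unroutable(Exception):
    """Stand-in for the error a ROUTER_MANDATORY send raises for an unknown peer."""


class ToyRouter:
    """Hypothetical routing table with one entry per outgoing connection."""

    def __init__(self, sticky):
        self.sticky = sticky  # True models the guessed "bound for life" behaviour
        self.table = {}       # connection name -> peer identity currently mapped

    def peer_connected(self, conn, identity):
        # A peer with an explicit identity announces it when the connection is made.
        if self.sticky and conn in self.table:
            return            # guessed behaviour: the first identity seen is kept
        self.table[conn] = identity

    def send(self, identity, msg):
        # Route by identity and fail loudly when no connection maps to it.
        for conn, known in self.table.items():
            if known == identity:
                return conn
        raise Unroutable(identity)


def restart_scenario(sticky):
    """Return True if A can still reach B after B restarts with a new identity."""
    router = ToyRouter(sticky)
    router.peer_connected("conn-to-B", b"id-1")  # B starts with a random identity
    router.send(b"id-1", b"hello")               # routing works before the restart
    router.peer_connected("conn-to-B", b"id-2")  # B restarts with a new identity
    try:
        router.send(b"id-2", b"hello")
        return True
    except Unroutable:
        return False
```

With sticky=True the second send fails, which matches what A sees after B restarts. With sticky=False the table entry is replaced on reconnect and the send goes through, which is the behaviour I would have expected.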
