OK, I've reproduced the problem quite easily. Something strange with messages being delivered even though the socket they're sent on is torn down entirely. I'm investigating...
On Fri, Jun 6, 2014 at 5:57 PM, Pieter Hintjens <p...@imatix.com> wrote:
> OK, I'll simulate this in the code. The peers should automatically
> resend HELLO if they lost contact.
>
> No thanks needed, we enjoy making this software and use it in
> everything we make. :-)
>
> On Fri, Jun 6, 2014 at 4:12 PM, Steve Rasmussen
> <steve.rasmus...@rassimtech.com> wrote:
>>> In principle if the connection is re-established there should be no new
>>> HELLO message sent.
>>
>> This problem occurs after the Wi-Fi connection has been down long enough for
>> the peers to remove each other. When the connection comes back up, as I
>> understand it, the HELLO message is necessary to kick off handshaking.
>>
>>> Can you find a way to reproduce the problem easily?
>>
>> The easiest method that I've found is using a modified version of the
>> zpinger tool on two laptops. The modified zpinger tool is set up to send a
>> whisper, after a time delay, anytime it receives a whisper from a peer. I
>> either turn the Wi-Fi adapter off/on or move the laptop out of range to
>> perform the test.
>>
>> It seems like this may have something to do with the sockets maintaining the
>> TCP/IP connection during the break and then being in a bad state when the
>> Wi-Fi connection comes back up. Is this possible? If so, is there some way to
>> reset the TCP/IP connection?
>>
>>> Thanks for taking the time to analyse the problem.
>>
>> I need this capability for the system I'm developing. Thank you and your
>> colleagues for ZeroMQ, CZMQ, Zyre, ...
>>
>> Regards,
>>
>> Steve
>>
>> -----Original Message-----
>> From: zeromq-dev-boun...@lists.zeromq.org
>> [mailto:zeromq-dev-boun...@lists.zeromq.org] On Behalf Of Pieter Hintjens
>> Sent: Thursday, June 5, 2014 5:22 PM
>> To: ZeroMQ development list
>> Subject: Re: [zeromq-dev] Zyre Wi-Fi Rejoin Issue
>>
>> On Thu, Jun 5, 2014 at 5:32 PM, Steve Rasmussen
>> <steve.rasmus...@rassimtech.com> wrote:
>>
>>> The problem seems to be with the TCP/IP connection, not the beacon. After a
>>> network break, the beacon reestablishes the connection, but no data is
>>> getting through the TCP/IP connection.
>>> It looks as if there are messages that are being buffered before the break
>>> and then delivered after. This prevents the "HELLO" message from getting
>>> through. I've tried various things, but the closest I've come, so far,
>>> is to keep removing the peer until it is reported as being ready. I'm doing
>>> this in the "zyre_node_require_peer" function. If a peer exists I check to
>>> see if it is ready, "zyre_peer_ready" and if not, I remove the peer,
>>> "zyre_node_remove_peer". This seems to fix the problem that I'm having, but
>>> it seems a little kludgy.
>>
>> Thanks for taking the time to analyse the problem.
>>
>> In principle if the connection is re-established there should be no new
>> HELLO message sent. Can you find a way to reproduce the problem easily?
>>
>> Feel free to make a pull request with your change anyhow. I'm reworking a
>> lot of this code atm so will try to include your change if I can reproduce
>> the error.
>>
>> -Pieter
_______________________________________________
zeromq-dev mailing list
zeromq-dev@lists.zeromq.org
http://lists.zeromq.org/mailman/listinfo/zeromq-dev
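
The workaround Steve describes in the quoted 5 June message (when a peer is already known but not yet ready, remove it so that it is created afresh) can be sketched as a small stand-alone model. This is not Zyre's code and does not use Zyre's API: every `toy_` type and function below is a hypothetical stand-in invented for illustration, and the fixed-size table and `generation` counter exist only to make the behaviour observable. Only the idea of checking readiness inside the "require peer" step comes from the thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define TOY_MAX_PEERS 8
#define TOY_ID_LEN 16

/* Hypothetical stand-in for a peer: only the fields this sketch needs. */
typedef struct {
    char id[TOY_ID_LEN];
    bool in_use;
    bool ready;        /* set by the caller once the HELLO handshake is done */
    int generation;    /* increases every time the peer is (re)created */
} toy_peer_t;

/* Hypothetical stand-in for a node holding a small, fixed peer table. */
typedef struct {
    toy_peer_t peers[TOY_MAX_PEERS];
    int next_generation;
} toy_node_t;

/* Return the peer with this id, or NULL if the node does not know it. */
static toy_peer_t *toy_node_find_peer(toy_node_t *node, const char *id)
{
    for (size_t i = 0; i < TOY_MAX_PEERS; i++) {
        toy_peer_t *peer = &node->peers[i];
        if (peer->in_use && strncmp(peer->id, id, TOY_ID_LEN - 1) == 0)
            return peer;
    }
    return NULL;
}

/* Forget a peer; in a real node this is where its socket would be closed. */
static void toy_node_remove_peer(toy_peer_t *peer)
{
    memset(peer, 0, sizeof *peer);
}

/* Return a usable peer for this id. A known peer that never became ready
 * is dropped and recreated, which models the workaround from the thread.
 * Returns NULL if the table is full. */
static toy_peer_t *toy_node_require_peer(toy_node_t *node, const char *id)
{
    assert(node != NULL && id != NULL);

    toy_peer_t *peer = toy_node_find_peer(node, id);
    if (peer != NULL && !peer->ready) {
        toy_node_remove_peer(peer);
        peer = NULL;
    }
    if (peer == NULL) {
        for (size_t i = 0; i < TOY_MAX_PEERS && peer == NULL; i++)
            if (!node->peers[i].in_use)
                peer = &node->peers[i];
        if (peer == NULL)
            return NULL;
        memset(peer, 0, sizeof *peer);
        strncpy(peer->id, id, TOY_ID_LEN - 1);
        peer->in_use = true;
        peer->generation = ++node->next_generation;
    }
    return peer;
}
```

In this model, calling `toy_node_require_peer` twice for the same id without setting `ready` in between yields a recreated peer with a higher `generation`, while a peer marked ready is returned unchanged. Whether recreating the peer is the right fix in Zyre itself is exactly what the thread leaves open.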