Pieter,

Thanks for your reply. I think I might have found the problem. I have an XPUB
socket with about 10 subscribers. The following events happened:

1) One subscriber's HWM is reached and its pipe is moved to the end of the
pipes (in zmq::dist_t::write).

2) Another subscriber was terminated (zmq::xpub_t::xterminated), causing its
pipe to be removed from the dist. However, zmq::dist_t::terminated removes the
terminated pipe by moving the last pipe into the to-be-removed pipe's spot.
The deactivated pipe from step 1 is therefore moved to the front of the pipes.
At the same time, the values of eligible and active are decremented. As a
result, the last eligible pipe (which was just in front of the deactivated pipe
before this event) now becomes ineligible, and it will not receive any
messages after this.
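
To make the sequence concrete, here is a toy model of the pipe array. It is
not the libzmq source: toy_dist_t, attach, hwm_reached, receives and reproduce
are names made up for illustration, and the model tracks only a single
eligible count, assuming pipes at positions [0, eligible) receive messages.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Toy model only, NOT libzmq code: pipes in [0, eligible) receive messages.
struct toy_dist_t {
    std::vector<std::string> pipes;
    std::size_t eligible = 0;

    std::size_t index_of(const std::string &name) const {
        for (std::size_t i = 0; i < pipes.size(); ++i)
            if (pipes[i] == name)
                return i;
        assert(false && "unknown pipe");
        return pipes.size();
    }

    // A new pipe starts out eligible: append it, then swap it into the range.
    void attach(const std::string &name) {
        pipes.push_back(name);
        std::swap(pipes[eligible], pipes.back());
        ++eligible;
    }

    // Step 1: HWM reached, so the pipe is moved behind the eligible range.
    void hwm_reached(const std::string &name) {
        std::swap(pipes[index_of(name)], pipes[eligible - 1]);
        --eligible;
    }

    // Step 2: removal as described above. The count is decremented and the
    // LAST pipe of the whole array is moved into the hole.
    void terminated(const std::string &name) {
        const std::size_t i = index_of(name);
        if (i < eligible)
            --eligible;
        pipes[i] = pipes.back();
        pipes.pop_back();
    }

    bool receives(const std::string &name) const {
        return index_of(name) < eligible;
    }
};

// Replays the two steps with four hypothetical subscribers A, B, C and D.
toy_dist_t reproduce() {
    toy_dist_t d;
    for (const char *name : {"A", "B", "C", "D"})
        d.attach(name);
    d.hwm_reached("D");  // [A, B, C | D], eligible = 3
    d.terminated("A");   // [D, B | C],    eligible = 2
    return d;
}
```

After reproduce(), the throttled pipe D sits inside the eligible range, while
the healthy pipe C has fallen outside it.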

Let me know if this makes sense. It's hard for me to write a standalone test
case for this, so I hope my explanation is clear. If you can suggest a fix,
let me know. One fix I can see is to swap the terminated pipe with the last
eligible pipe and then delete that position.
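
A minimal sketch of that idea, written against a plain std::vector rather than
the real libzmq array type. The helper name remove_keeping_partition and its
signature are hypothetical, and it handles only one counter (eligible); the
same swap would presumably be needed for each counter that dist_t keeps.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical helper, not part of libzmq.
// Invariant before and after: elements in [0, eligible) are eligible,
// elements in [eligible, pipes.size()) are not.
template <typename T>
void remove_keeping_partition(std::vector<T> &pipes,
                              std::size_t &eligible,
                              std::size_t index) {
    assert(index < pipes.size());
    assert(eligible <= pipes.size());

    if (index < eligible) {
        // Swap the doomed pipe with the last eligible one, so the eligible
        // range stays contiguous once it shrinks by one.
        std::swap(pipes[index], pipes[eligible - 1]);
        index = eligible - 1;
        --eligible;
    }

    // The doomed pipe is now in the non-eligible tail, so filling its slot
    // with the very last pipe cannot disturb the eligible range.
    if (index != pipes.size() - 1)
        pipes[index] = std::move(pipes.back());
    pipes.pop_back();
}
```

With pipes [A, B, C, D], eligible = 3 (D throttled) and A terminated, this
leaves [C, B, D] with eligible = 2, so B and C keep receiving and D stays
outside the eligible range.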

Thanks,
Winston
----- Original Message -----
From: Pieter Hintjens
Sent: 03/15/13 09:15 AM
To: ZeroMQ development list
Subject: Re: [zeromq-dev] subscriber stopped receiving messages from XPUB socket

It sounds like a problem in the subscription forwarding, yet it's not clear how
a subscriber could be affected by the publisher restarting, with the proxy in
between. Do you need the proxy at all? First thing would be to connect
subscribers directly to the publisher. If the problem then still happens we can
try to make a reproducible test case.

-Pieter

On Fri, Mar 15, 2013 at 4:59 AM, Winston Huang <[email protected]> wrote:
> hi there,
>
> I have multiple (5-10) subscribers subscribing to the same topic published
> by one publisher. They are connected via a XSUB-XPUB proxy. All the
> subscribers are always up and the publisher may come and go at times. I have
> noticed that at times, after the publisher is restarted, one of the
> subscribers might stop receiving any messages at all. It's not a slow-joiner
> kind of issue because the publisher continues to publish message every
> second and that subscriber may not get any messages at all forever, whereas
> other subscribers are getting messages at the same time. I also verified
> that the subscriber is waiting for messages (it's calling the receive
> function.) and if I restart the subscriber, it will get messages again.
>
> Could someone enlighten me what I may be doing wrong? Is there any thing I
> should be looking into?
>
> Thanks in advance.
> Winston
> _______________________________________________
> zeromq-dev mailing list
> [email protected]
> http://lists.zeromq.org/mailman/listinfo/zeromq-dev