I have run into this many times. The issue is that a SUB socket only receives messages that are sent after it has actually connected:
If the PUB starts before the SUB, the PUB will start broadcasting before the SUB starts and the SUB won't get those messages that were sent before the SUB connects. A PUB socket is like a radio broadcast. If you aren't listening, you don't get the messages.

BUT (this is more subtle). If the SUB starts before the PUB, you will still miss messages. This is because it takes a little bit of time (I think 0.1 sec) for the SUB socket to realize the PUB socket has started. In that short time interval, the PUB socket has already started sending and you miss a few.

So, if you want to make sure that you get all the messages, I would use a separate REQ/REP channel between the two to synchronize everything before the PUB/SUB starts.

Cheers,

Brian

On Mon, Jul 5, 2010 at 3:31 PM, Andrew Hume <[email protected]> wrote:
> folks,
>
> i am doing a simple case and can't see my error:
>
> in process a:
>     ctxt = zmq_init(1, 5, 0);
>     q = zmq_socket(ctxt, ZMQ_PUB);
>     sprintf(buf, "tcp://%s:%s", machine, port);
>     n = zmq_connect(q, buf);
>     for(n = 0; n < 2050; n++){
>         get_goo(loc, &data, &len);
>         zmq_msg_init_size(&msg, len);
>         memcpy(zmq_msg_data(&msg), data, len);
>         m = zmq_send(q, &msg, 0);
>         assert(m == 0);
>     }
>     zmq_term(ctxt);
>     exit(0);
>
> in process b:
>     ctxt = zmq_init(1, 10, 0);
>     q = zmq_socket(ctxt, ZMQ_SUB);
>     sprintf(buf, "tcp://*:%s", port);
>     n = zmq_bind(q, buf);
>     assert(n == 0);
>     n = zmq_setsockopt(q, ZMQ_SUBSCRIBE, 0, 0);
>     assert(n == 0);
>     for(n = 0; ; n++){
>         zmq_msg_init(&msg);
>         m = zmq_recv(q, &msg, 0);
>         assert(m == 0);
>         zmq_msg_close(&msg);
>         if((n%100) == 99){
>             printf("got %d packets\n", n+1);
>             sleep(1);
>         }
>     }
>
> the problem:
> process b doesn't always see all 2050 messages from process a.
> maybe 5% it does. sometimes, 1300 get thru, other times, 2000.
> nothing i've checked returns an error. i'm running 2.0.6 on redhat 5.4.
>
> is my code in error? or am i misunderstanding something?
> thanks
>
> ------------------
> Andrew Hume  (best -> Telework) +1 732-886-1886
> [email protected]  (Work) +1 973-360-8651
> AT&T Labs - Research; member of USENIX and LOPSA
>
> _______________________________________________
> zeromq-dev mailing list
> [email protected]
> http://lists.zeromq.org/mailman/listinfo/zeromq-dev

--
Brian E. Granger, Ph.D.
Assistant Professor of Physics
Cal Poly State University, San Luis Obispo
[email protected]
[email protected]
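
P.S. To make the timing argument concrete, here is a toy model in plain C. It does not use ZeroMQ at all; the "ticks", the function names delivered() and delivered_with_handshake(), and the numbers are all made up for illustration. It only shows why a late subscriber silently loses the early messages, and why making the publisher wait for a "ready" handshake (the job a separate REQ/REP channel would do) avoids the loss in this simplified picture.

```c
/* Toy model (no ZeroMQ involved): the publisher sends one message per
 * tick, starting at tick pub_start. The subscriber only receives
 * messages sent at or after tick sub_ready, the tick at which its
 * connection is established. Returns how many of the `total`
 * messages are received. */
static int delivered(int total, int pub_start, int sub_ready)
{
    int received = 0;
    for (int i = 0; i < total; i++) {
        int sent_at = pub_start + i;
        if (sent_at >= sub_ready)
            received++;   /* subscriber was already listening */
        /* otherwise the message is dropped without any error */
    }
    return received;
}

/* Same model, but the publisher first waits for a "ready" handshake,
 * so it never starts sending before the subscriber is connected. */
static int delivered_with_handshake(int total, int pub_start, int sub_ready)
{
    int start = pub_start > sub_ready ? pub_start : sub_ready;
    return delivered(total, start, sub_ready);
}
```

For example, delivered(2050, 0, 700) models a subscriber that connects 700 ticks after the publisher starts and returns 1350, while delivered_with_handshake(2050, 0, 700) returns 2050. Real connection setup is not measured in neat ticks, so treat this as a picture of the idea rather than of actual socket behaviour.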
