Koert,

I've been trying your test cases.  There are delays in the req and pub
programs.  Could you explain why they are there?  I need to know whether
to test with or without those delays.

What I'm seeing is:

* The delay in the req.py program has no effect, which is expected.
* If I leave the delay in the publisher, the subscriber gets all
messages, no matter what order I start the programs.
* If I remove the delay in the publisher, the subscriber gets no
messages, no matter what order I start the programs.

Also, you mentioned 1,000 messages in your email but your test cases
send 10,000 messages.  Again, I need to know whether you changed this
and why.

Finally, how do you start the test cases: by hand or from a
script?  This is relevant because starting them by hand introduces
additional delays.

What I think you are seeing (and what I'm certainly reproducing using
your test cases) is the "slow subscriber connect" symptom, which
means:

* Connecting takes a certain time, say 10 msec
* During that time a publisher can send, say, 10,000 messages
* If the publisher does bind/send(10000) and the subscriber does
connect/recv, the subscriber will get nothing

There are three trivial ways to verify that this is what's happening.

1. Send more messages, e.g. 100K instead of 1K or 10K
2. Send very large messages, which will take longer to send
3. Send periodic messages, i.e. 1 per second
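
To make the arithmetic concrete, here is a rough timing model of the
symptom and the three checks.  It is plain Python and does not use
ZeroMQ at all.  The function name, the fixed send interval, and the
100 usec figure for large messages are assumptions for illustration;
the 10 msec connect time and 1 usec per message (10,000 messages in
10 msec) are the example figures from above.

```python
def count_received(total, send_interval_us, connect_time_us):
    """Toy model: how many of `total` messages a subscriber receives.

    Message i (0-based) is sent at i * send_interval_us.  It is dropped if
    the subscriber's connection is not yet established at that moment.
    All times are integer microseconds to avoid float rounding.
    """
    if send_interval_us <= 0:
        raise ValueError("send_interval_us must be positive")
    # Index of the first message sent at or after the connection completes
    # (integer ceiling division, clamped at zero).
    first_delivered = max(0, -(-connect_time_us // send_interval_us))
    return max(0, total - first_delivered)


CONNECT_US = 10_000  # assumed connect time: 10 msec

# The failing case: bind/send(10000) against connect/recv.
print(count_received(10_000, 1, CONNECT_US))      # 0
# Check 1: send more messages (100K).
print(count_received(100_000, 1, CONNECT_US))     # 90000
# Check 2: larger messages, assumed to take 100 usec each to send.
print(count_received(10_000, 100, CONNECT_US))    # 9900
# Check 3: periodic messages, 1 per second.
print(count_received(10, 1_000_000, CONNECT_US))  # 9
```

In the periodic case only the first message is lost in this model.
Real connect times vary, so the counts are only indicative.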

If you do send periodic messages and you number them, you will see
that the first 1 or 2 messages a publisher sends are *always* lost
unless you explicitly add a delay, or a synchronization of some kind.
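
A sketch of the bookkeeping a subscriber could do to spot this, again
in plain Python without ZeroMQ.  The message format (b"msg <n>"), the
0-based numbering, and both function names are hypothetical, chosen
only for the example.

```python
def make_message(seq):
    """Build a numbered payload, e.g. b'msg 3' (format is an assumption)."""
    return b"msg %d" % seq


def missing_numbers(received_payloads, total):
    """Return the sorted sequence numbers in range(total) that never arrived."""
    seen = set()
    for payload in received_payloads:
        # Payload looks like b'msg <n>'; take the number after the space.
        seen.add(int(payload.split()[1]))
    return [n for n in range(total) if n not in seen]


# Pretend the publisher sent messages 0..5 and the first two were lost
# while the subscriber was still connecting.
sent = [make_message(n) for n in range(6)]
received = sent[2:]
print(missing_numbers(received, 6))  # [0, 1]
```

If the missing numbers are always a short run at the start, that points
at connect timing rather than at messages being dropped mid-stream.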

Hope this helps.

-Pieter


On Wed, Sep 22, 2010 at 9:16 PM, Pieter Hintjens <[email protected]> wrote:
> Koert,
>
> So you're saying, if you start the subscriber after the publisher, you
> don't get messages?
>
> If that's what you're seeing, it's normal.  Pubsub does not wait for
> subscribers to connect, and if they arrive after the publisher has
> sent its data, they will receive nothing.
>
> -Pieter
>
> On Tue, Sep 21, 2010 at 1:17 PM, Koert Kuipers
> <[email protected]> wrote:
>> Hello all,
>>
>> I ran into a problem while developing a server in python. When a program is
>> listening to both a REP socket and a SUB socket, using multiplexing (poll),
>> messages from the publisher (which should arrive at the SUB socket) get
>> lost. This seems to only happen if there are also messages arriving at the
>> REP socket, and typically all the messages from the publisher get lost.
>>
>>
>>
>> My setup:
>>
>> Windows XP (I also observed the problem on Ubuntu 10.04)
>>
>> zeromq 2.0.7
>>
>> pyzmq
>>
>>
>>
>> The problem doesn’t always occur, and is somewhat hard to replicate.
>>
>>
>>
>> I ended up convincing myself that there is indeed a problem by writing 3
>> little programs. Program 1 listens on both a REP and a SUB socket, program 2
>> only has a PUB socket and sends 1000 messages, and program 3 only has a REQ
>> socket and does 1000 RPC requests in a row.
>>
>>
>>
>> When I start the programs in this order everything works as expected:
>>
>> Start program 1, then program 2 and then program 3 (program 3 starts while
>> program 2 is still working). Program 1 will report it received 1000 messages
>> on the SUB socket and 1000 messages on the REP socket.
>>
>>
>>
>> But when I change the order I get into trouble. I start program 1, then
>> program 3 and then program 2 (program 2 starts while program 3 is still
>> working). Program 1 will report it received 1000 messages on the REP socket
>> but none on the SUB socket.
>>
>>
>>
>> Best,
>>
>> Koert
>>
>>
>>
>> PS I attached the 3 programs. Hope that works.
>>
>> _______________________________________________
>> zeromq-dev mailing list
>> [email protected]
>> http://lists.zeromq.org/mailman/listinfo/zeromq-dev
>>
>>
>
>
>
> --
> -
> Pieter Hintjens
> iMatix - www.imatix.com
>



-- 
-
Pieter Hintjens
iMatix - www.imatix.com
