[Moderator - Please kill my prior message, I joined the list ;-) ]

Folks,

I have a client and server using REQ and REP, running through a ZMQ_QUEUE device.

REQ -- [TCP localhost] - XREP - ZMQ_QUEUE - XREQ - [INPROC] - REP

To handle timeouts, the client closes its socket and opens a new one whenever it receives an empty (zero-length) reply from the server:


                // check to see if we're a special case
                if (reply.size() == 0) {
                        delete psocket;
                        psocket = new zmq::socket_t(*pctx, ZMQ_REQ);
                        assert(psocket != NULL);
                        psocket->connect(client_connect);
                }

The server sends an empty "close" reply every 10th message.

After a few hundred cycles, things hang; see the log below.

I've checked out the latest 2.0.7 from git, as I needed the fix for bug 38 (Assertion failed: fetched (xrep.cpp:196)), which had been biting me.

Any thoughts?


I played around a bit, and the problem goes away if I insert a usleep() in either of two places (marked "helps here" below). My feeling is that there may be a race condition related to tearing down the actual TCP socket, or a timing problem allocating and deallocating a ypipe. I tried using OSMemoryBarrier() (OS X), but that didn't help. I haven't tried different usleep() values:

                if (reply.size() == 0) {
//                      usleep(10000); -- does not help
                        delete psocket;
//                      usleep(10000); //-- helps here
                        psocket = new zmq::socket_t(*pctx, ZMQ_REQ);
                        assert(psocket != NULL);
                        usleep(10000); //-- helps here
                        psocket->connect(client_connect);
                }
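
The reset step can be pulled out into a small helper so the pause is easy to move or tune. This is only a minimal sketch, not the actual client code: reset_socket, FakeSocket and FakeContext are hypothetical names, and the socket and context types are template parameters so the sketch compiles without zmq.hpp. With the C++ binding they would be zmq::socket_t and zmq::context_t. The pause sits between the delete and the new, which is one of the two spots where the usleep() helped.

```cpp
#include <cassert>
#include <chrono>
#include <memory>
#include <string>
#include <thread>

// Hypothetical helper: destroy the old socket, optionally pause, then
// create and connect a replacement. Socket/Context are assumed to look
// like zmq::socket_t / zmq::context_t (constructor taking a context and
// a socket type, plus connect(const char*)).
template <typename Socket, typename Context>
void reset_socket(std::unique_ptr<Socket>& sock, Context& ctx, int type,
                  const char* endpoint, std::chrono::microseconds pause)
{
    sock.reset();                        // equivalent to `delete psocket`
    std::this_thread::sleep_for(pause);  // stands in for usleep()
    sock.reset(new Socket(ctx, type));   // new throws on failure, so no NULL check
    sock->connect(endpoint);
}

// Stand-in types used only to exercise the helper without libzmq.
struct FakeContext {
    int live = 0;     // sockets currently alive
    int created = 0;  // sockets ever created
};

struct FakeSocket {
    FakeContext& ctx;
    std::string endpoint;
    FakeSocket(FakeContext& c, int /*type*/) : ctx(c) { ++ctx.live; ++ctx.created; }
    ~FakeSocket() { --ctx.live; }
    void connect(const char* ep) { endpoint = ep; }
};
```

In the client, the body of the `if (reply.size() == 0)` branch would then reduce to a single call such as `reset_socket(psocket, *pctx, ZMQ_REQ, client_connect, std::chrono::microseconds(10000));`, assuming psocket is changed to a std::unique_ptr<zmq::socket_t>.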


The problem is easily reproducible on OS X.

Code is available. Environment: OS X Leopard.


Thanks,

Best,

Matt

client recv: Xthread# 0x10040a000 request# 297
client send: thread# 0x10040a000 request# 298
server recv: thread# 0x10040a000 request# 298
server send thread# 0x10040a000 request# 298
server send complete
client recv: Xthread# 0x10040a000 request# 298
client send: thread# 0x10040a000 request# 299
server recv: thread# 0x10040a000 request# 299
server send thread# 0x10040a000 request# 299
server send complete
client recv: Xthread# 0x10040a000 request# 299
client send: thread# 0x10040a000 request# 300
server recv: thread# 0x10040a000 request# 300
server send null for thread# 0x10040a000 request# 300
client recv:
client send: thread# 0x10040a000 request# 301
server recv: thread# 0x10040a000 request# 301
server send thread# 0x10040a000 request# 301
server send complete

--- I expected to see this, it never showed up:
client recv: Xthread# 0x10040a000 request# 301


_______________________________________________
zeromq-dev mailing list
[email protected]
http://lists.zeromq.org/mailman/listinfo/zeromq-dev