On Jul 8, 2010, at 5:01 PM, Matt Weinstein wrote:
[Moderator - Please kill my prior message, I joined the list ;-) ]
Folks,
I have a client and server using REQ and REP, running through a
ZMQ_QUEUE device:
REQ -- [TCP localhost] - XREP - ZMQ_QUEUE - XREQ - [INPROC] - REP
To handle timeouts, the client is closing its socket, and opening a
new socket, whenever it sees a null packet coming from the server:
    // check to see if we're a special case
    if (reply.size() == 0) {
        delete psocket;
        psocket = new zmq::socket_t(*pctx, ZMQ_REQ);
        assert(psocket != NULL);
        psocket->connect(client_connect);
    }
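For reference, here is a minimal sketch of the same reset-on-empty-reply pattern written with std::unique_ptr instead of raw new/delete. FakeSocket, reset_on_empty_reply and the make_socket factory are hypothetical names used only for illustration; they are not part of the 0MQ API, and FakeSocket merely stands in for zmq::socket_t so the snippet builds without libzmq. It shows only the order of operations (destroy, create, connect), not a fix for the hang:

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>

// Hypothetical stand-in for zmq::socket_t so the sketch builds without libzmq.
struct FakeSocket {
    std::string endpoint;
    void connect(const std::string& addr) { endpoint = addr; }
};

// Replace the socket when the reply is empty (the "null packet" case).
// make_socket is an assumed factory returning std::unique_ptr<Socket>.
// Returns true if the socket was replaced, false if it was left alone.
template <typename Socket, typename Factory>
bool reset_on_empty_reply(std::unique_ptr<Socket>& sock,
                          std::size_t reply_size,
                          Factory make_socket,
                          const std::string& endpoint)
{
    if (reply_size != 0)
        return false;          // normal reply: keep the current socket

    sock.reset();              // destroy the old socket first, like `delete psocket`
    sock = make_socket();      // then create the replacement
    assert(sock != nullptr);
    sock->connect(endpoint);   // reconnect to the same endpoint
    return true;
}
```

With the real library, the factory would presumably wrap `new zmq::socket_t(*pctx, ZMQ_REQ)` and the endpoint would be `client_connect`.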
I have the server sending close (null) replies every 10th message.
After a few hundred cycles, things hang, see below.
I've pulled the latest 2.0.7 from git, as I needed the fix for bug 38
(Assertion failed: fetched (xrep.cpp:196)), which had been biting me.
Any thoughts?
I played around a bit, and the problem goes away if I insert a
usleep() in either of two places (marked "helps here" below). My
feeling is that there may be a race condition related to tearing
down the actual TCP socket, or a timing problem allocating and
deallocating a ypipe. I tried using an OSMemoryBarrier (OS/X), but
that didn't help. I haven't tried different usleep() values:
    if (reply.size() == 0) {
        // usleep(10000);  // -- does not help
        delete psocket;
        // usleep(10000);  // -- helps here
        psocket = new zmq::socket_t(*pctx, ZMQ_REQ);
        assert(psocket != NULL);
        usleep(10000);     // -- helps here
        psocket->connect(client_connect);
    }
After a longer-running test, this workaround didn't hold up. Threads
slowly hang, and eventually I got a SEGV.
The problem is reproducible (easily) on OS/X.
Code is available. Environment: OS/X Leopard.
Thanks,
Best,
Matt
client recv: Xthread# 0x10040a000 request# 297
client send: thread# 0x10040a000 request# 298
server recv: thread# 0x10040a000 request# 298
server send thread# 0x10040a000 request# 298
server send complete
client recv: Xthread# 0x10040a000 request# 298
client send: thread# 0x10040a000 request# 299
server recv: thread# 0x10040a000 request# 299
server send thread# 0x10040a000 request# 299
server send complete
client recv: Xthread# 0x10040a000 request# 299
client send: thread# 0x10040a000 request# 300
server recv: thread# 0x10040a000 request# 300
server send null for thread# 0x10040a000 request# 300
client recv:
client send: thread# 0x10040a000 request# 301
server recv: thread# 0x10040a000 request# 301
server send thread# 0x10040a000 request# 301
server send complete
--- I expected to see this, it never showed up:
client recv: Xthread# 0x10040a000 request# 301
_______________________________________________
zeromq-dev mailing list
[email protected]
http://lists.zeromq.org/mailman/listinfo/zeromq-dev