i have a program called portal that has one input socket and several output
sockets.
i have a thread R that receives messages from the input and a thread S that
sends messages out on one of the output sockets.
tmp_out (written by R) and tmp_in (read by S) are the two ends of a PUSH/PULL
inproc socket with no queue bounds. pseudocode is:
R:
    while (zmq_recv(isock, &msg, 0) == 0) {
        // do statistics
        zmq_send(tmp_out, &msg, 0);
    }
S:
    while (zmq_recv(tmp_in, &msg, 0) == 0) {
        // do statistics
        // determine which output socket osock
        zmq_send(osock, &msg, 0);
    }
the input socket is the PULL end of a PUSH/PULL with a bound of about 20000
messages, fed by maybe a hundred or so inputs (PUSHers).
the output sockets are the PUSH ends of PUSH/PULL with a bound of 5000 messages,
each going to a single process.
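
to make the bounds concrete, here is a toy model in plain C. it is not ZeroMQ
code and says nothing about how libzmq implements its pipes; hwm_queue,
hq_try_push, hq_try_pop and hq_fill are names made up for this sketch. the only
ZeroMQ fact it leans on is that in 2.1.x a ZMQ_HWM of 0 means "no limit", which
is how i am assuming the unbounded inproc socket is configured.

```c
#include <assert.h>
#include <stddef.h>

/* toy model of one bounded pipe; NOT the libzmq implementation.
   hwm == 0 means "no bound", mirroring ZMQ_HWM = 0 in 2.1.x. */
typedef struct {
    size_t hwm;    /* max queued messages, 0 = unlimited */
    size_t count;  /* messages currently queued */
} hwm_queue;

/* returns 0 if the message was queued, -1 if a blocking send would stall */
static int hq_try_push(hwm_queue *q)
{
    if (q->hwm != 0 && q->count >= q->hwm)
        return -1;
    q->count++;
    assert(q->hwm == 0 || q->count <= q->hwm);
    return 0;
}

/* returns 0 if a message was taken, -1 if a blocking recv would stall */
static int hq_try_pop(hwm_queue *q)
{
    if (q->count == 0)
        return -1;
    q->count--;
    return 0;
}

/* push up to n messages; returns how many were accepted before the bound hit */
static size_t hq_fill(hwm_queue *q, size_t n)
{
    size_t accepted = 0;
    while (accepted < n && hq_try_push(q) == 0)
        accepted++;
    return accepted;
}
```

with hwm set to 5000 (an output socket), hq_fill stops accepting once 5000
messages are queued and further pushes report "would block" until something is
popped; with hwm set to 0 (the inproc socket), pushes are never refused.
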
ordinarily, this works great; the internal inproc socket remains empty (we drain
it as fast as input comes in). under heavy load, about once or twice a day, this
setup wedges; that is, S is blocked on the zmq_send and the destination process
is blocked on a zmq_recv.
this wedging occurs with both TCP transport and ipc transport.
when it occurs, killing just the receiving process does not fix the problem;
all the receiving processes have to be killed.
this occurs under both 2.1.7 and 2.1.11.
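
one way the wedge could be spotted from the statistics S already keeps is
sketched below. this is only an illustration in plain C with no ZeroMQ calls;
stall_detector and stall_sample are hypothetical names, and it assumes S
maintains a count of messages sent plus a flag saying whether it is holding a
message it has not yet managed to send. a monitor would sample those two values
at a fixed interval.

```c
#include <assert.h>
#include <stdint.h>

/* hypothetical stall detector fed from S's statistics */
typedef struct {
    uint64_t last_sent;    /* "messages sent" value at the previous sample */
    unsigned idle_samples; /* consecutive samples with work pending but no progress */
    unsigned limit;        /* idle samples tolerated before reporting a stall */
} stall_detector;

/* sent:    S's running count of completed sends
   pending: nonzero if S is holding a message it has not sent yet
   returns 1 once `limit` consecutive samples show no progress, else 0 */
static int stall_sample(stall_detector *d, uint64_t sent, int pending)
{
    assert(d->limit > 0);
    if (sent != d->last_sent || !pending) {
        /* progress was made, or there is simply nothing to send */
        d->last_sent = sent;
        d->idle_samples = 0;
        return 0;
    }
    d->idle_samples++;
    return d->idle_samples >= d->limit;
}
```

with limit set to 3, three consecutive samples in which the sent count has not
moved while a message is pending make stall_sample return 1; an idle portal with
nothing pending is never reported.
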
i have several portals on each server (there are 8 servers), each handling
messages of different sizes and contents. when the portal on one server wedges,
the portals of the same type on all the other servers will soon wedge too
(within 5-10 minutes).
any clues or advice?
andrew
------------------
Andrew Hume (best -> Telework) +1 623-551-2845
[email protected] (Work) +1 973-236-2014
AT&T Labs - Research; member of USENIX and LOPSA