On Jan 6, 2012, at 3:23 PM, Martin Sustrik wrote:

> Hi Chuck,
>
>> I am periodically receiving ETIMEDOUT (errno 60) when doing a
>> non-blocking read from either a SUB socket or a DEALER/XREQ socket.
>> What can I assume from this error?
>>
>> My guess is that the socket has recently tried to connect to another
>> socket (in this particular case, *everything* is using 'inproc'
>> transport and they only bind once at startup) and it timed out.
>> Because zmq_connect() is async, we don't actually see the error until
>> we try to zmq_send()/zmq_recv() with that socket. At that point the
>> error is delivered.
>>
>> Is that assumption correct? If so, what can I do about it?
>>
>> OS => OSX
>> libzmq => 2.1.11
>> ulimit -n => 400000
>>
>> At the time of the error, there has usually been about 2-3k xreq
>> sockets opened & closed with around 200 being open at any given
>> moment.
>
> 0MQ itself doesn't seem to produce this error. I.e. it must be received
> from the OS and forwarded via 0MQ to the user.
>
> Given that the only transport you are using is inproc, there's not much
> OS functionality involved, so it shouldn't be that hard to track the
> source of the error down.
>
> My guess would be that it is generated by the signaler_t class, which
> contains a system socketpair on the OSX platform. One of the OS
> functions called there is probably returning ETIMEDOUT for some reason.
>
> Unfortunately, I don't have a Mac so it's up to you to investigate.
Martin,

I see this ETIMEDOUT error quite a bit when my machine is under a little bit of load, so I agree that it's probably some OSX kernel resource running out or getting low. (OSX is *not* a good choice for server workloads.)

Do you have any specific suggestions on which components of libzmq I should instrument? I can add some printf's...

cr
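
To make the instrumentation idea concrete, here is a minimal standalone sketch of the pattern: record the errno at each OS call on a socketpair, and separate the expected "nothing to read yet" case from anything else, such as ETIMEDOUT. This is not libzmq code and does not reproduce signaler_t. It is written in Python for brevity, and the names `classify_recv_error`, `try_read` and `demo` are hypothetical. The same checks would presumably be added as printf's around the corresponding calls in the C++ source.

```python
import errno
import socket


def classify_recv_error(err_no):
    """Return a short label for an errno seen on a non-blocking read."""
    if err_no in (errno.EAGAIN, errno.EWOULDBLOCK):
        return "no data yet"            # normal for a non-blocking socket
    if err_no == errno.EINTR:
        return "interrupted, retry"
    if err_no == errno.ETIMEDOUT:
        return "ETIMEDOUT from the OS"  # the case under investigation
    return "unexpected: " + errno.errorcode.get(err_no, str(err_no))


def try_read(sock, log):
    """Read one byte without blocking; on failure, log (errno, label)."""
    try:
        return sock.recv(1)
    except OSError as exc:
        log.append((exc.errno, classify_recv_error(exc.errno)))
        return None


def demo():
    """Exercise the helpers on a local socketpair (assumes a Unix-like OS)."""
    reader, writer = socket.socketpair()
    reader.setblocking(False)
    log = []
    first = try_read(reader, log)   # nothing has been written yet
    writer.send(b"x")
    second = try_read(reader, log)  # one byte is now queued
    reader.close()
    writer.close()
    return first, second, log
```

Calling `demo()` performs one read that fails because no data is queued and one that succeeds. Only the failure ends up in `log`, so any entry whose label is not "no data yet" points at the call site worth a closer look.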
