2011/1/31 Martin Sustrik <[email protected]>:
> On 01/31/2011 05:43 PM, Dhammika Pathirana wrote:
>
>> Try something like 512k, but I don't know about your app/traffic pattern.
>> These are system wide settings so you'd want to be stingy.
>
> There's one problem with 0mq architecture that may be the cause of the
> problem.
>
> Namely, the I/O thread (one that handles the network traffic in async way)
> and application thread (the one you use to access 0mq API from) communicate
> via a socketpair.
>
> Now imagine that I/O thread has something to say to the app thread every now
> and then (such as that new connection was established or that the connection
> was destroyed). If the user doesn't call any 0mq functions for an extended
> period of time the application thread has no chance to read its mailbox.
> Thus the mailbox (the socketpair) will finally fill up cause this assertion.
>
> Martin
>
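
To make the mechanism described above concrete, here is a minimal Python
sketch that uses only the standard library. It is not 0mq code. It only
assumes that a mailbox built on a socketpair behaves like any other
socketpair, i.e. it has a finite kernel buffer. The function names
(bytes_until_full, demo) are made up for illustration. A non-blocking writer
stands in for the I/O thread, and a reader that is not serviced stands in for
the application thread.

```python
import socket


def bytes_until_full(writer, chunk=b"x" * 1024):
    """Send on a non-blocking socket until the kernel buffer is full.

    Returns the number of bytes accepted before send() would block.
    """
    sent = 0
    while True:
        try:
            sent += writer.send(chunk)
        except BlockingIOError:
            # The buffer is full: nobody is reading the other end.
            return sent


def drain(reader):
    """Read everything currently queued on a non-blocking socket."""
    total = 0
    while True:
        try:
            data = reader.recv(65536)
        except BlockingIOError:
            # Nothing left to read right now.
            return total
        if not data:
            # Peer closed the connection.
            return total
        total += len(data)


def demo():
    # Hypothetical stand-ins: writer ~ I/O thread, reader ~ app thread.
    writer, reader = socket.socketpair()
    writer.setblocking(False)
    reader.setblocking(False)
    try:
        first = bytes_until_full(writer)   # reader idle: buffer fills up
        drained = drain(reader)            # reader finally checks its mailbox
        second = bytes_until_full(writer)  # writer can make progress again
    finally:
        writer.close()
        reader.close()
    return first, drained, second


if __name__ == "__main__":
    first, drained, second = demo()
    print("bytes accepted before the socketpair was full:", first)
    print("bytes drained by the reader:", drained)
    print("bytes accepted after draining:", second)
```

The exact number of bytes accepted depends on the OS and on the system-wide
buffer settings mentioned in the quoted text, so no figure is given here. The
point is only that the writer gets stuck once the reader stops servicing its
end, and can continue as soon as the reader drains it.
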
The usage pattern of my crashing application is as follows. The application
runs 4 threads, each of which processes N requests in parallel. If a thread
has free slots, it receives a message on an ST_PULL socket from the previous
application. For each message it makes many additional subrequests to many
other app types. When all subrequests have been processed, the final message
is sent to the next application over an ST_PUSH socket, and the slot is freed.
When subrequests are processed more slowly than messages arrive, the
application (thanks to the HWM on ST_PULL and ST_PUSH) throttles the whole
system without problems.

The application threads do not send subrequests to the other apps directly:
for each app type I have additional threads that receive requests from the
application threads, send them to the app, receive the response, and deliver
it to the application thread. In addition, these threads are responsible for
caching responses and for resending a request if no response is received
within some time period.

Summary: the application consists of many threads, and in almost all
situations every thread is suspended in a blocking zeromq receive() or send()
call. But the application has only one I/O thread; could this be the problem?

I think the assert appears more often when the system is in throttling mode,
but I'm not sure. Since the last restart, my application has been running for
40 hours without trouble...

I'll create an issue on GitHub.
_______________________________________________
zeromq-dev mailing list
[email protected]
http://lists.zeromq.org/mailman/listinfo/zeromq-dev
