at long last, i am starting to scale up my project. that is, amongst the flotilla of processes doing stuff, i am starting to increase the size of a couple of pools of worker processes.
i first ran into the open file descriptor limit, so that has been increased from 1024 to 8192. i then got this on stderr:

Assertion failed: new_sndbuf > old_sndbuf (mailbox.cpp:183)

although when i gdb'ed the core dump, i get a different spot for the error:

(gdb) bt
#0  0x00000032a1630265 in raise () from /lib64/libc.so.6
#1  0x00000032a1631d10 in abort () from /lib64/libc.so.6
#2  0x0000000004c1bd59 in zmq::mailbox_t::send (this=0x52d6fd0, cmd_=...) at mailbox.cpp:178
#3  0x0000000004c1d4cd in zmq::object_t::send_bind (this=0x8f645a0, destination_=0x52d6ef0, in_pipe_=0x8f672f0, out_pipe_=0x0, peer_identity_=..., inc_seqnum_=224) at object.cpp:267
#4  0x0000000004c24cc8 in zmq::session_t::process_attach (this=0x8f645a0, engine_=0x8f5d4e0, peer_identity_=...) at session.cpp:263
#5  0x0000000004c1d7b4 in zmq::object_t::process_command (this=0x8f645a0, cmd_=...) at object.cpp:88
#6  0x0000000004c1a8d0 in zmq::io_thread_t::in_event (this=<value optimized out>) at io_thread.cpp:83
#7  0x0000000004c1923a in zmq::epoll_t::loop (this=0x52d3ea0) at epoll.cpp:161
#8  0x0000000004c2b177 in thread_routine (arg_=0x52d3f10) at thread.cpp:73
#9  0x00000032a1e0673d in start_thread () from /lib64/libpthread.so.0
#10 0x00000032a16d40cd in clone () from /lib64/libc.so.6

the context is that this process is the overall coordinator, and all the sub processes are sending status messages over a PUSH/PULL socket. the load shouldn't be too high; we are going from 200ish processes to 300ish processes, and the messages are only sent every 15 seconds.

to me, it smells of a file descriptor weirdness. and the gdb points to this (the assert on the stack is a fcntl failing on a file descriptor).

can anyone offer advice on what i might look at? (the underlying OS is RHEL6.)

------------------
Andrew Hume  (best -> Telework) +1 623-551-2845
[email protected]  (Work) +1 973-236-2014
AT&T Labs - Research; member of USENIX and LOPSA
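p.s. for anyone wanting to poke at the same two things on their own box, here is a rough diagnostic sketch in python. it is not taken from the zeromq source. it only assumes, going by the assertion text, that the failing check compares a socket's send buffer size before and after asking for a bigger one. the helper names fd_limits and try_double_sndbuf are made up for illustration. it prints the process's open file limit, then repeatedly asks the kernel to double SO_SNDBUF on a local socket pair and stops once the reported size no longer grows.

```python
import resource
import socket


def fd_limits():
    """Return the (soft, hard) open file descriptor limits for this process."""
    return resource.getrlimit(resource.RLIMIT_NOFILE)


def try_double_sndbuf(sock):
    """Ask the kernel to double SO_SNDBUF on sock.

    Returns (old, new) as reported by getsockopt. If the kernel rejects
    the request outright, new is whatever size is still in effect.
    """
    old = sock.getsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF)
    try:
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF, old * 2)
    except OSError:
        pass  # request refused; fall through and report the current size
    new = sock.getsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF)
    return old, new


if __name__ == "__main__":
    soft, hard = fd_limits()
    print("open file limit: soft=%d hard=%d" % (soft, hard))

    a, b = socket.socketpair()
    try:
        # hypothetical cap of 16 rounds, just to keep the loop bounded
        for _ in range(16):
            old, new = try_double_sndbuf(a)
            print("SO_SNDBUF: %d -> %d" % (old, new))
            if new <= old:
                print("kernel would not grow the send buffer any further")
                break
    finally:
        a.close()
        b.close()
```

if the buffer stops growing after only a few rounds, the system-wide socket buffer ceiling (rather than the file descriptor limit) may be worth a look, though that is a guess on my part.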
