I vaguely remember this being reported a long time ago. I spent some time on it but never found anything suspicious. Will look at it.

- Martin

On Tue, Nov 5, 2013 at 4:10 PM, Pieter Hintjens <[email protected]> wrote:
> Hi MinRK,
>
> I'll try to reproduce it tomorrow. Any suggestion to the kind of test
> case I could make?
>
> -Pieter
>
> On Tue, Nov 5, 2013 at 11:44 PM, MinRK <[email protected]> wrote:
>> Once in a while, when running either the IPython or PyZMQ test suite, I
>> still get this error:
>>
>> Bad file descriptor (kqueue.cpp:77)
>>
>> or
>>
>> Bad file descriptor (epoll.cpp:81)
>>
>> Stack trace suggests that this happens when destroying a context:
>>
>> Thread 0:
>> 1 libzmq.3.dylib          0x000000010f26b170 zmq::signaler_t::send() + 52
>> 2 libzmq.3.dylib          0x000000010f261b2f zmq::object_t::send_stop() + 35
>> 3 libzmq.3.dylib          0x000000010f2534a7 zmq::ctx_t::~ctx_t() + 59
>> 4 libzmq.3.dylib          0x000000010f253a29 zmq::ctx_t::terminate() + 439
>> 5 libzmq.3.dylib          0x000000010f27c071 zmq_ctx_term + 35
>>
>>
>> Thread 6 Crashed:
>> 0 libsystem_kernel.dylib  0x00007fff94a4d866 __pthread_kill + 10
>> 1 libsystem_pthread.dylib 0x00007fff8cac835c pthread_kill + 92
>> 2 libsystem_c.dylib       0x00007fff97570bba abort + 125
>> 3 libzmq.3.dylib          0x000000010f25a9e1 zmq::zmq_abort(char const*) + 9
>> 4 libzmq.3.dylib          0x000000010f25d0fe zmq::kqueue_t::kevent_delete(int, short) + 142
>> 5 libzmq.3.dylib          0x000000010f25d1b0 zmq::kqueue_t::rm_fd(void*) + 42
>> 6 libzmq.3.dylib          0x000000010f2687a3 zmq::reaper_t::process_stop() + 59
>> 7 libzmq.3.dylib          0x000000010f26862b zmq::reaper_t::in_event() + 161
>> 8 libzmq.3.dylib          0x000000010f25d40c zmq::kqueue_t::loop() + 362
>>
>>
>> I am still seeing this error once in a while with libzmq-master as of today.
>> I don't think it's a recent regression. A minimal test case is difficult,
>> since it only seems to raise after at least a hundred tests, and only a
>> small fraction of the time even then. Given that it is always late in the
>> process that the assert is hit, I have always assumed that it is FD
>> exhaustion that is causing the problem, but I am not actually sure, and I am
>> fairly careful about cleaning up sockets.
>>
>> Properties of the test suite that sees the issue:
>>
>> - create and destroy many contexts and sockets
>> - the previous test's context should always be destroyed before the next
>>   test starts
>> - it is not reliably the same test where the assert is hit
>>
>> I'm afraid I don't know enough about the internals to really tell what's
>> going on here, or figure out why the deleted FD is invalid (maybe it was
>> already closed, and the error should be ignored?).
>>
>> Anyone have insight on what might be causing the problem, or how I might dig
>> deeper into more useful information?
>>
>> -MinRK
>>
>> _______________________________________________
>> zeromq-dev mailing list
>> [email protected]
>> http://lists.zeromq.org/mailman/listinfo/zeromq-dev
>>
>
> --
> -
> Pieter Hintjens
> CEO of iMatix.com
> Founder of ZeroMQ community
> blog: http://hintjens.com
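
The FD-exhaustion hypothesis in the quoted report can be checked without touching libzmq internals. Below is a minimal sketch, in Python since the affected test suites are IPython and PyZMQ, that counts the process's open file descriptors before and after a block of code. The names open_fd_count and fd_leak_check are hypothetical helpers, not part of PyZMQ or libzmq, and the sketch assumes a Unix-like system where /dev/fd or /proc/self/fd can be listed.

```python
import os
import resource
from contextlib import contextmanager


def open_fd_count():
    """Return the number of entries in this process's FD directory.

    Assumption: /dev/fd (macOS, Linux) or /proc/self/fd (Linux) exists.
    Listing the directory briefly opens one extra descriptor, so the
    absolute value may be off by one; differences between two calls
    are what matter here.
    """
    for path in ("/dev/fd", "/proc/self/fd"):
        try:
            return len(os.listdir(path))
        except OSError:
            continue
    raise RuntimeError("no per-process FD directory found on this platform")


@contextmanager
def fd_leak_check(label):
    """Record how many descriptors a block of code leaves open.

    Hypothetical helper: yields a dict that is filled in with the
    'after' count and the 'leaked' difference once the block exits.
    """
    result = {"label": label, "before": open_fd_count(),
              "after": None, "leaked": None}
    try:
        yield result
    finally:
        result["after"] = open_fd_count()
        result["leaked"] = result["after"] - result["before"]


if __name__ == "__main__":
    # Soft limit on open descriptors for this process.
    soft_limit, _hard_limit = resource.getrlimit(resource.RLIMIT_NOFILE)
    print("soft FD limit:", soft_limit)

    # Simulate a leak with a pipe that is deliberately left open.
    with fd_leak_check("pipe left open") as report:
        read_fd, write_fd = os.pipe()
    print(report["label"], "->", report["leaked"], "descriptors still open")

    # Clean up the simulated leak.
    os.close(read_fd)
    os.close(write_fd)
```

Wrapping each test's setup and teardown in fd_leak_check, or logging open_fd_count() after every context is destroyed, would show whether the count climbs toward the soft limit over a few hundred tests. If it stays flat, exhaustion looks less likely and the "FD was already closed" explanation may deserve more attention.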
