Hi MinRK,

I'll try to reproduce it tomorrow. Any suggestions for the kind of test case I could make?

-Pieter

On Tue, Nov 5, 2013 at 11:44 PM, MinRK <[email protected]> wrote:
> Once in a while, when running either the IPython or PyZMQ test suite, I
> still get this error:
>
> Bad file descriptor (kqueue.cpp:77)
>
> or
>
> Bad file descriptor (epoll.cpp:81)
>
> Stack trace suggests that this happens when destroying a context:
>
> Thread 0:
> 1 libzmq.3.dylib 0x000000010f26b170 zmq::signaler_t::send() + 52
> 2 libzmq.3.dylib 0x000000010f261b2f zmq::object_t::send_stop() + 35
> 3 libzmq.3.dylib 0x000000010f2534a7 zmq::ctx_t::~ctx_t() + 59
> 4 libzmq.3.dylib 0x000000010f253a29 zmq::ctx_t::terminate() + 439
> 5 libzmq.3.dylib 0x000000010f27c071 zmq_ctx_term + 35
>
> Thread 6 Crashed:
> 0 libsystem_kernel.dylib 0x00007fff94a4d866 __pthread_kill + 10
> 1 libsystem_pthread.dylib 0x00007fff8cac835c pthread_kill + 92
> 2 libsystem_c.dylib 0x00007fff97570bba abort + 125
> 3 libzmq.3.dylib 0x000000010f25a9e1 zmq::zmq_abort(char const*) + 9
> 4 libzmq.3.dylib 0x000000010f25d0fe zmq::kqueue_t::kevent_delete(int, short) + 142
> 5 libzmq.3.dylib 0x000000010f25d1b0 zmq::kqueue_t::rm_fd(void*) + 42
> 6 libzmq.3.dylib 0x000000010f2687a3 zmq::reaper_t::process_stop() + 59
> 7 libzmq.3.dylib 0x000000010f26862b zmq::reaper_t::in_event() + 161
> 8 libzmq.3.dylib 0x000000010f25d40c zmq::kqueue_t::loop() + 362
>
> I am still seeing this error once in a while with libzmq-master as of today.
> I don't think it's a recent regression. A minimal test case is difficult,
> since it only seems to raise after at least a hundred tests, and only a
> small fraction of the time even then. Given that it is always late in the
> process that the assert is hit, I have always assumed that it is FD
> exhaustion that is causing the problem, but I am not actually sure, and I am
> fairly careful about cleaning up sockets.
>
> Properties of the test suite that sees the issue:
>
> - create and destroy many contexts and sockets
> - the previous test's context should always be destroyed before the next
>   test starts
> - it is not reliably the same test where the assert is hit
>
> I'm afraid I don't know enough about the internals to really tell what's
> going on here, or figure out why the deleted FD is invalid (maybe it was
> already closed, and the error should be ignored?).
>
> Anyone have insight on what might be causing the problem, or how I might dig
> deeper into more useful information?
>
> -MinRK
>
> _______________________________________________
> zeromq-dev mailing list
> [email protected]
> http://lists.zeromq.org/mailman/listinfo/zeromq-dev
>

-- 
- Pieter Hintjens
CEO of iMatix.com
Founder of ZeroMQ community
blog: http://hintjens.com
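
One way to tell the two hypotheses in the quoted message apart (FD exhaustion versus an FD that was already closed) might be to track the number of open descriptors between tests. The sketch below uses only the Python standard library and does not touch libzmq or pyzmq. The helper names `count_open_fds` and `stale_fd_errno` and the scan limit of 4096 are assumptions made for illustration, not part of any existing test suite. The second helper only shows that using a descriptor after it has been closed yields EBADF, the errno whose standard message is "Bad file descriptor".

```python
import errno
import os


def count_open_fds(limit=4096):
    """Count descriptors in [0, limit) that are open in this process.

    `limit` is an assumed scan bound; raise it if the process may hold
    descriptors with higher numbers.
    """
    open_fds = 0
    for fd in range(limit):
        try:
            os.fstat(fd)
        except OSError as exc:
            if exc.errno != errno.EBADF:
                # The descriptor exists but fstat failed for another reason.
                open_fds += 1
        else:
            open_fds += 1
    return open_fds


def stale_fd_errno():
    """Close both ends of a pipe, then touch the stale read descriptor.

    Returns the errno raised by the stale access. In a multithreaded
    process another thread could reuse the number in between, so this is
    only meaningful as a single-threaded demonstration.
    """
    r, w = os.pipe()
    os.close(w)
    os.close(r)
    try:
        os.fstat(r)
    except OSError as exc:
        return exc.errno
    return None


if __name__ == "__main__":
    print("open descriptors:", count_open_fds())
    print("stale access is EBADF:", stale_fd_errno() == errno.EBADF)
```

Calling `count_open_fds()` in each test's setup and teardown and logging the difference would show whether descriptors accumulate over the run. Steady growth would be consistent with exhaustion. A flat count would point more toward a descriptor being closed before it is removed from the poller.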
