Hi,

I managed to reproduce something that looks like this. I have seen it on both
Linux and Windows.

I modified test_inproc_connect to run the tests in a tight loop (except for 
test_connect_before_bind_pub_sub as that has a sleep in it), so the main
looks like:

        while (true)
        {
                test_bind_before_connect();
                test_connect_before_bind();
                //test_connect_before_bind_pub_sub();
                test_multiple_connects();
                test_multiple_threads();
                test_identity();
        }

This gave me the output:

Bad file descriptor (/home/richard/code/libzmq/src/epoll.cpp:79)
Aborted

On today's master.

It does take a few hours to occur on my machine.

Ric.




From:   MinRK <[email protected]>
To:     "0MQ development list" <[email protected]>,
Date:   05/11/2013 10:44 PM
Subject:        [zeromq-dev] Bad file descriptor in rm_fd()
Sent by:        [email protected]



Once in a while, when running either the IPython or PyZMQ test suite, I still 
get this error:

    Bad file descriptor (kqueue.cpp:77)

or


    Bad file descriptor (epoll.cpp:81)


Stack trace suggests that this happens when destroying a context:

Thread 0:
1   libzmq.3.dylib                 0x000000010f26b170 zmq::signaler_t::send() + 52
2   libzmq.3.dylib                 0x000000010f261b2f zmq::object_t::send_stop() + 35
3   libzmq.3.dylib                 0x000000010f2534a7 zmq::ctx_t::~ctx_t() + 59
4   libzmq.3.dylib                 0x000000010f253a29 zmq::ctx_t::terminate() + 439
5   libzmq.3.dylib                 0x000000010f27c071 zmq_ctx_term + 35


Thread 6 Crashed:
0   libsystem_kernel.dylib         0x00007fff94a4d866 __pthread_kill + 10
1   libsystem_pthread.dylib       0x00007fff8cac835c pthread_kill + 92
2   libsystem_c.dylib             0x00007fff97570bba abort + 125
3   libzmq.3.dylib                 0x000000010f25a9e1 zmq::zmq_abort(char const*) + 9
4   libzmq.3.dylib                 0x000000010f25d0fe zmq::kqueue_t::kevent_delete(int, short) + 142
5   libzmq.3.dylib                 0x000000010f25d1b0 zmq::kqueue_t::rm_fd(void*) + 42
6   libzmq.3.dylib                 0x000000010f2687a3 zmq::reaper_t::process_stop() + 59
7   libzmq.3.dylib                 0x000000010f26862b zmq::reaper_t::in_event() + 161
8   libzmq.3.dylib                 0x000000010f25d40c zmq::kqueue_t::loop() + 362


I am still seeing this error once in a while with libzmq-master as of today. I 
don't think it's a recent regression.  A minimal test case is
difficult, since the error only seems to occur after at least a hundred tests, and 
only a small fraction of the time even then.  Given that it is always late
in the process that the assert is hit, I have always assumed that it is FD 
exhaustion that is causing the problem, but I am not actually sure, and I
am fairly careful about cleaning up sockets.
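
One way to test the FD-exhaustion guess would be to log the number of open
descriptors between tests. The following is only a sketch, assuming a POSIX
system; count_open_fds and its probe_cap parameter are hypothetical names, not
part of libzmq or PyZMQ.

```cpp
// Sketch only: counts descriptors open in the current process.
// count_open_fds and probe_cap are illustrative names, not libzmq API.
#include <fcntl.h>
#include <sys/resource.h>

// Probes each candidate descriptor number with fcntl(F_GETFD), which fails
// with EBADF for numbers that are not open. probe_cap bounds the scan so it
// stays cheap when RLIMIT_NOFILE is very large or unlimited.
int count_open_fds (long probe_cap = 65536)
{
    long max_fd = probe_cap;
    struct rlimit lim;
    if (getrlimit (RLIMIT_NOFILE, &lim) == 0
        && lim.rlim_cur != RLIM_INFINITY
        && lim.rlim_cur < static_cast<rlim_t> (probe_cap))
        max_fd = static_cast<long> (lim.rlim_cur);

    int open_count = 0;
    for (long fd = 0; fd < max_fd; ++fd)
        if (fcntl (static_cast<int> (fd), F_GETFD) != -1)
            ++open_count;
    return open_count;
}
```

If the count climbs steadily from test to test and approaches the soft limit
before the abort, exhaustion becomes more plausible. A flat count would point
elsewhere.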

Properties of the test suite that sees the issue:

- create and destroy many contexts and sockets
- the previous test's context should always be destroyed before the next test 
starts
- it is not reliably the same test where the assert is hit

I'm afraid I don't know enough about the internals to really tell what's going 
on here, or figure out why the deleted FD is invalid (maybe it was
already closed, and the error should be ignored?).
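
To illustrate the "already closed" guess: on Linux, removing a descriptor from
an epoll set after it has been closed fails with EBADF. The sketch below shows
only that kernel behaviour. It is not libzmq code, the function name is
hypothetical, and it does not show that this is what happens inside rm_fd().

```cpp
// Linux-only sketch: how EPOLL_CTL_DEL reports EBADF for a closed fd.
// errno_after_deleting_closed_fd is an illustrative name, not libzmq API.
#include <cerrno>
#include <sys/epoll.h>
#include <unistd.h>

// Returns the errno seen when deleting an already-closed descriptor from an
// epoll set, 0 if the delete unexpectedly succeeds, or -1 if setup fails.
int errno_after_deleting_closed_fd ()
{
    int epfd = epoll_create1 (0);
    if (epfd == -1)
        return -1;

    int fds[2];
    if (pipe (fds) == -1) {
        close (epfd);
        return -1;
    }

    epoll_event ev = {};
    ev.events = EPOLLIN;
    ev.data.fd = fds[0];
    if (epoll_ctl (epfd, EPOLL_CTL_ADD, fds[0], &ev) == -1) {
        close (fds[0]);
        close (fds[1]);
        close (epfd);
        return -1;
    }

    // Close the registered descriptor before the poller removes it.
    close (fds[0]);

    // The descriptor number is no longer valid, so the delete fails.
    int rc = epoll_ctl (epfd, EPOLL_CTL_DEL, fds[0], &ev);
    int err = (rc == -1) ? errno : 0;

    close (fds[1]);
    close (epfd);
    return err;
}
```

In a multithreaded process the closed number could also be reused by another
thread before the delete, in which case the call would fail differently or
touch an unrelated descriptor.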

Anyone have insight on what might be causing the problem, or how I might dig 
deeper into more useful information?

-MinRK
_______________________________________________
zeromq-dev mailing list
[email protected]
http://lists.zeromq.org/mailman/listinfo/zeromq-dev
