Thanks.

It's the test_multiple_threads test that fails, which I guess makes sense.  I'll
test to see if any of the other ones (that do everything on a single
thread) can reproduce it, just in case.

Ric.




From:   "Pieter Hintjens" <[email protected]>
To:     "ZeroMQ development list" <[email protected]>,
Date:   06/11/2013 12:14 PM
Subject:        Re: [zeromq-dev] Bad file descriptor in rm_fd()
Sent by:        [email protected]



Great, nice to have a test case.

I think this is a very old issue: https://zeromq.jira.com/browse/LIBZMQ-76

I'll put Richard's test case into the issues repository, and update issue 76.


On Wed, Nov 6, 2013 at 12:47 PM, <[email protected]> wrote:
  Hi,

  I managed to reproduce something that looks like this; I have seen it on both
  Linux and Windows.

  I modified test_inproc_connect to run the tests in a tight loop (except for
  test_connect_before_bind_pub_sub, as that has a sleep in it), so the main
  looks like:

  while (true)
  {
      test_bind_before_connect();
      test_connect_before_bind();
      // Skipped: this test contains a sleep, which would slow the loop down.
      //test_connect_before_bind_pub_sub();
      test_multiple_connects();
      test_multiple_threads();
      test_identity();
  }

  This gave me the output:

  Bad file descriptor (/home/richard/code/libzmq/src/epoll.cpp:79)
  Aborted

  On today's master.

  It does take a few hours to occur on my machine.

  Ric.
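
A bounded variant of the loop above can make a long-running reproduction easier to localize. The following is a minimal sketch, not the code used in this thread: run_stress_loop and its parameters are hypothetical names, and the callables stand in for the test_* functions from test_inproc_connect. It logs progress to stderr so that the last completed iteration is visible when the process aborts.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <iostream>
#include <vector>

// Hypothetical helper: runs every callable in `tests`, in order, and repeats
// the whole list `max_iterations` times. Progress is written to stderr every
// `report_every` iterations (0 disables reporting).
// Returns the number of completed iterations.
std::size_t run_stress_loop(const std::vector<std::function<void()>>& tests,
                            std::size_t max_iterations,
                            std::size_t report_every = 1000)
{
    std::size_t completed = 0;
    while (completed < max_iterations) {
        for (const auto& test : tests) {
            assert(test && "empty std::function in test list");
            test();
        }
        ++completed;
        if (report_every != 0 && completed % report_every == 0)
            std::cerr << "completed " << completed << " iterations" << std::endl;
    }
    return completed;
}
```

The real test functions would be passed in as the list of callables, for example { test_bind_before_connect, test_multiple_threads }, assuming they keep their void() signatures.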



  From: MinRK <[email protected]>
  To: "0MQ development list" <[email protected]>,
  Date: 05/11/2013 10:44 PM
  Subject: [zeromq-dev] Bad file descriptor in rm_fd()
  Sent by: [email protected]




  Once in a while, when running either the IPython or PyZMQ test suite, I still 
get this error:

      Bad file descriptor (kqueue.cpp:77)

  or


      Bad file descriptor (epoll.cpp:81)


  Stack trace suggests that this happens when destroying a context:

  Thread 0:
  1   libzmq.3.dylib                 0x000000010f26b170 zmq::signaler_t::send() + 52
  2   libzmq.3.dylib                 0x000000010f261b2f zmq::object_t::send_stop() + 35
  3   libzmq.3.dylib                 0x000000010f2534a7 zmq::ctx_t::~ctx_t() + 59
  4   libzmq.3.dylib                 0x000000010f253a29 zmq::ctx_t::terminate() + 439
  5   libzmq.3.dylib                 0x000000010f27c071 zmq_ctx_term + 35


  Thread 6 Crashed:
  0   libsystem_kernel.dylib         0x00007fff94a4d866 __pthread_kill + 10
  1   libsystem_pthread.dylib        0x00007fff8cac835c pthread_kill + 92
  2   libsystem_c.dylib              0x00007fff97570bba abort + 125
  3   libzmq.3.dylib                 0x000000010f25a9e1 zmq::zmq_abort(char const*) + 9
  4   libzmq.3.dylib                 0x000000010f25d0fe zmq::kqueue_t::kevent_delete(int, short) + 142
  5   libzmq.3.dylib                 0x000000010f25d1b0 zmq::kqueue_t::rm_fd(void*) + 42
  6   libzmq.3.dylib                 0x000000010f2687a3 zmq::reaper_t::process_stop() + 59
  7   libzmq.3.dylib                 0x000000010f26862b zmq::reaper_t::in_event() + 161
  8   libzmq.3.dylib                 0x000000010f25d40c zmq::kqueue_t::loop() + 362


  I am still seeing this error once in a while with libzmq-master as of today.
  I don't think it's a recent regression.  A minimal test case is difficult,
  since the error only seems to occur after at least a hundred tests, and only
  a small fraction of the time even then.  Given that the assert is always hit
  late in the process, I have always assumed that FD exhaustion is causing the
  problem, but I am not actually sure, and I am fairly careful about cleaning
  up sockets.
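
One way to check the FD-exhaustion assumption is to count the descriptors open in the process between tests and see whether the number keeps growing. The following is a minimal POSIX sketch, not part of libzmq or PyZMQ: count_open_fds and its probe_limit parameter are hypothetical names, and the approach (probing each descriptor number with fcntl) is an assumption chosen for portability, not something used in this thread.

```cpp
#include <cassert>
#include <fcntl.h>
#include <sys/resource.h>
#include <unistd.h>

// Hypothetical helper: counts the descriptors currently open in this process
// by probing each descriptor number below `probe_limit` (or below the soft
// RLIMIT_NOFILE limit, whichever is smaller). fcntl(fd, F_GETFD) fails with
// EBADF for numbers that are not open descriptors.
int count_open_fds(int probe_limit = 4096)
{
    assert(probe_limit > 0);

    struct rlimit lim {};
    if (getrlimit(RLIMIT_NOFILE, &lim) == 0 &&
        lim.rlim_cur != RLIM_INFINITY &&
        lim.rlim_cur < static_cast<rlim_t>(probe_limit)) {
        probe_limit = static_cast<int>(lim.rlim_cur);
    }

    int open_count = 0;
    for (int fd = 0; fd < probe_limit; ++fd) {
        if (fcntl(fd, F_GETFD) != -1)
            ++open_count;
    }
    return open_count;
}
```

Calling this after each test's context has been destroyed and comparing the result with the previous value would show whether descriptors are leaking; a steady count would argue against exhaustion.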

  Properties of the test suite that sees the issue:

  - create and destroy many contexts and sockets
  - the previous test's context should always be destroyed before the next
    test starts
  - it is not reliably the same test where the assert is hit

  I'm afraid I don't know enough about the internals to really tell what's
  going on here, or figure out why the deleted FD is invalid (maybe it was
  already closed, and the error should be ignored?).

  Anyone have insight on what might be causing the problem, or how I might dig
  deeper into more useful information?
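
On the question of why the deleted FD might be invalid: on Linux, removing a descriptor from an epoll set after it has already been closed fails with EBADF, the same errno reported in this thread. The sketch below only demonstrates that kernel behaviour in isolation. It is not libzmq code, it is Linux-only, and it does not establish that a premature close is the cause here. The function name is hypothetical.

```cpp
#include <cassert>
#include <cerrno>
#include <sys/epoll.h>
#include <unistd.h>

// Hypothetical demonstration: registers the read end of a pipe with an epoll
// instance, closes it, and then tries to remove it from the epoll set.
// Returns the errno of the failed removal, 0 if the removal unexpectedly
// succeeds, or -1 if the setup itself fails.
int errno_after_removing_closed_fd()
{
    int epfd = epoll_create1(0);
    if (epfd == -1)
        return -1;

    int fds[2];
    if (pipe(fds) == -1) {
        close(epfd);
        return -1;
    }

    epoll_event ev{};
    ev.events = EPOLLIN;
    ev.data.fd = fds[0];
    if (epoll_ctl(epfd, EPOLL_CTL_ADD, fds[0], &ev) == -1) {
        close(fds[0]);
        close(fds[1]);
        close(epfd);
        return -1;
    }

    // Close the descriptor while it is still registered.
    close(fds[0]);

    // The descriptor number is no longer valid, so the removal fails.
    errno = 0;
    int rc = epoll_ctl(epfd, EPOLL_CTL_DEL, fds[0], nullptr);
    int saved_errno = (rc == -1) ? errno : 0;

    close(fds[1]);
    close(epfd);
    return saved_errno;
}
```

In a multithreaded process the descriptor number could also be reused by another thread between the close and the removal, which this single-threaded sketch does not cover.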

  -MinRK

  _______________________________________________
  zeromq-dev mailing list
  [email protected]
  http://lists.zeromq.org/mailman/listinfo/zeromq-dev








--
-
Pieter Hintjens
CEO of iMatix.com
Founder of ZeroMQ community
blog: http://hintjens.com



