On Thu, 2017-02-16 at 14:59 +0100, zmqdev wrote:
> Hello,
>
> I could use some advice to diagnose the following issue.
>
> I have a program that has been running without problems for a couple of
> years on Red Hat Enterprise Linux 6 at various sites.
>
> On RHEL7, the program triggers the assertion
>
> Bad file descriptor (src/epoll.cpp:131)
>
> in about 1/3 of executions, during startup (sometimes during shutdown).
>
> Less often, I see
>
> Bad file descriptor (src/epoll.cpp:100)
>
> The problem persists after upgrading to ZeroMQ 4.2.1 from 4.1.6.
>
> I don't get it!
>
> Programming errors aside, I do check all return codes and log errors as
> they occur in the main thread, and there is nothing until libzmq commits
> suicide from one of its threads.
>
> Any idea/advice on how I could track down this problem?
>
> What makes RHEL7 different enough from RHEL6 to emerge this kind of errors?
>
> Cheers :-(
>
> ________________________________________________________________________
> GDB BACKTRACE FROM CORE FILE:
>
> Thread 3 (Thread 0xf736b900 (LWP 5039)):
> #0 0xf7751430 in __kernel_vsyscall ()
> #1 0xf745694b in poll () from /lib/libc.so.6
> #2 0xf6ff5457 in
> zmq::socket_poller_t::wait(zmq::socket_poller_t::event_t*, int, long) ()
> from $TOP/lib/platform/libzmq.so.5
> #3 0xf6ff325f in zmq_poller_wait_all(void*, zmq_poller_event_t*, int,
> long) () from $TOP/lib/platform/libzmq.so.5
> #4 0xf6ff3aa5 in zmq_poller_poll(zmq_pollitem_t*, int, long) () from
> $TOP/lib/platform/libzmq.so.5
> #5 0xf6ff2bb1 in zmq_poll () from $TOP/lib/platform/libzmq.so.5
> #6 0xf702cec1 in zt_reactor_loop (r=<optimized out>) at
> $TOP/src/reactor.c:268
> (...)
> #17 0x080487da in main ()
>
> Thread 2 (Thread 0xf6e6db40 (LWP 5066)):
> #0 0xf7751430 in __kernel_vsyscall ()
> #1 0xf7463a16 in epoll_wait () from /lib/libc.so.6
> #2 0xf6fa17d0 in zmq::epoll_t::loop() () from $TOP/lib/platform/libzmq.so.5
> #3 0xf6fa1a35 in zmq::epoll_t::worker_routine(void*) () from
> $TOP/lib/platform/libzmq.so.5
> #4 0xf6fe36f2 in thread_routine () from $TOP/lib/platform/libzmq.so.5
> #5 0xf7574b2c in start_thread () from /lib/libpthread.so.0
> #6 0xf746308e in clone () from /lib/libc.so.6
>
> Thread 1 (Thread 0xf666cb40 (LWP 5067)):
> #0 0xf7751430 in __kernel_vsyscall ()
> #1 0xf739a1f7 in raise () from /lib/libc.so.6
> #2 0xf739ba33 in abort () from /lib/libc.so.6
> #3 0xf6fa2726 in zmq::zmq_abort(char const*) () from
> $TOP/lib/platform/libzmq.so.5
> #4 0xf6fa164b in zmq::epoll_t::set_pollout(void*) () from
> $TOP/lib/platform/libzmq.so.5
> #5 0xf6fa3951 in zmq::io_object_t::set_pollout(void*) () from
> $TOP/lib/platform/libzmq.so.5
> #6 0xf6fdafe1 in zmq::stream_engine_t::restart_output() () from
> $TOP/lib/platform/libzmq.so.5
> #7 0xf6fcae20 in zmq::session_base_t::read_activated(zmq::pipe_t*) ()
> from $TOP/lib/platform/libzmq.so.5
> #8 0xf6fb9dd3 in zmq::pipe_t::process_activate_read() () from
> $TOP/lib/platform/libzmq.so.5
> #9 0xf6fb2a9e in zmq::object_t::process_command(zmq::command_t&) ()
> from $TOP/lib/platform/libzmq.so.5
> #10 0xf6fa3f77 in zmq::io_thread_t::in_event() () from
> $TOP/lib/platform/libzmq.so.5
> #11 0xf6fa1948 in zmq::epoll_t::loop() () from $TOP/lib/platform/libzmq.so.5
> #12 0xf6fa1a35 in zmq::epoll_t::worker_routine(void*) () from
> $TOP/lib/platform/libzmq.so.5
> #13 0xf6fe36f2 in thread_routine () from $TOP/lib/platform/libzmq.so.5
> #14 0xf7574b2c in start_thread () from /lib/libpthread.so.0
> #15 0xf746308e in clone () from /lib/libc.so.6

Are you building your own binaries in both cases?

What polling mechanism was RHEL 6 using? You can see it in the
./configure output:

"Using 'epoll' polling system"

Kind regards,
Luca Boccassi
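
For readers following the thread: the backtrace shows the abort coming from zmq::epoll_t::set_pollout, which suggests (this is an assumption, not a confirmed diagnosis) that an epoll_ctl modify call failed with EBADF. One way this can happen in general is that a file descriptor registered with an epoll instance gets closed by other code before the poller modifies it. The sketch below reproduces that general situation with Python's select.epoll on Linux. The function name is hypothetical and nothing here is libzmq code.

```python
import errno
import os
import select


def modify_after_close():
    """Register a pipe fd with epoll, close it, then try to modify it.

    Returns the errno raised by the modify call (0 if none was raised).
    Illustrative only: it mimics a descriptor being closed behind the
    poller's back, which is one assumed cause of the EBADF in the thread.
    """
    ep = select.epoll()
    read_fd, write_fd = os.pipe()
    try:
        # Register the write end, as a poller would for an output socket.
        ep.register(write_fd, select.EPOLLIN)
        # Some other code closes the descriptor the poller still tracks.
        os.close(write_fd)
        try:
            # Equivalent to epoll_ctl(EPOLL_CTL_MOD) on a stale descriptor.
            ep.modify(write_fd, select.EPOLLIN | select.EPOLLOUT)
        except OSError as exc:
            return exc.errno
        return 0
    finally:
        os.close(read_fd)
        ep.close()


if __name__ == "__main__":
    err = modify_after_close()
    print(errno.errorcode.get(err, err))
```

If the application (or a library it links) closes a descriptor it does not own, for example through a double close, the symptom would look similar: the main thread sees no error, and the failure surfaces later in an I/O thread.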
_______________________________________________
zeromq-dev mailing list
zeromq-dev@lists.zeromq.org
https://lists.zeromq.org/mailman/listinfo/zeromq-dev