Hello,

I spent a few minutes tracking this down. I'm not sure this is the right
place to report it (I'll write something in Jira too).

The bug comes from a failed assertion in tcp_connecter_t::connect() in
tcp_connecter.cpp. The

int rc = getsockopt (s, SOL_SOCKET, SO_ERROR, (char*) &err, &len);

call leaves err == EINVAL (22 in the attached gdb trace).
In my case, it happens exactly at the last step of this sequence:
- I plug in my 3G dongle and connect to the internet
- I run my subscriber to connect to a remote publisher in the cloud
  (tcp://someserver:someport)
- I unplug the 3G dongle
- I wait
- I plug it in again: the bug happens immediately. The error code EINVAL is
  the same as in Pieter's test case.

The 's' socket became invalid for some reason; unfortunately I do not know
much about zeromq core internals yet, so I have no idea how to fix this bug
properly. But maybe this additional information can be helpful to someone
more experienced... I'm attaching the full gdb trace of my session, with
info about the tcp_connecter instance and its tcp addresses struct.
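
For anyone who wants to poke at the check outside libzmq, here is a minimal
sketch of the SO_ERROR query and of why EINVAL trips the assertion. It is not
the libzmq code: pending_socket_error and is_expected_connect_error are
hypothetical helper names, falling back to errno when getsockopt itself fails
is my assumption, and the expected-error list holds only the two codes that
are visible in the trace (the real assertion continues past what gdb printed).

```cpp
#include <cerrno>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

// Hypothetical helper (not part of libzmq): returns the pending error on a
// socket, or 0 if there is none. Assumption: if getsockopt itself fails,
// report its errno instead.
int pending_socket_error (int fd)
{
    int err = 0;
    socklen_t len = sizeof (err);
    int rc = getsockopt (fd, SOL_SOCKET, SO_ERROR, (char*) &err, &len);
    if (rc == -1)
        err = errno;
    return err;
}

// Hypothetical helper: only the two codes visible in the gdb trace are
// listed here; the real assertion in tcp_connecter.cpp is longer.
bool is_expected_connect_error (int err)
{
    return err == ECONNREFUSED || err == ECONNRESET;
}
```

With these helpers, is_expected_connect_error (EINVAL) is false, which is the
situation the trace shows: an error code the assertion does not anticipate.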



On Wed, May 1, 2013 at 11:04 AM, Pieter Hintjens <[email protected]> wrote:

> Hi Victor,
>
> No bugs are ever out of scope unless fixing them really breaks other
> things.
>
> We can definitely wait a few days if you want to try to get a patch
> for this ready. The process is we fix and test on master, then
> backport specific fixes to the 3.2 stable branch. It's also easy to
> make more stable releases, really whenever people feel it's a good
> time.
>
> For 3.3, we're looking at backwards compatibility with 3.1, and
> security (the new ZMTP 3.0 protocol and its plug-in mechanisms)
>
> -Pieter
>
>
>
> On Wed, May 1, 2013 at 8:52 AM, Victor Perron <[email protected]> wrote:
> > Hi,
> >
> > Actually (sorry, it's really late, so I may be saying something stupid
> > here), I ran into this myself yesterday:
> > https://zeromq.jira.com/browse/LIBZMQ-526
> >
> > It happens on (quite) unstable 3G connections.
> > I have had no time to try to fix it yet, but I'd be really happy to have
> > it fixed in the next stable release.
> > I'd be happy to work on it in the coming days, but my schedule has been
> > incredibly tight for a few weeks.
> >
> > My setup is: the publisher is bound on some remote server; the subscriber
> > connecting to it is behind a 3G connection (dongle in my laptop).
> > It is not far from the minimal test case posted with the issue.
> > For now the result is that, on 3G, as soon as my connection drops once,
> > the assert fails and my subscriber crashes within seconds.
> > It may be worth pointing out that when the connection "drops", the 3G
> > virtual "eth1" interface is removed as well, if that helps reproduce the
> > bug.
> >
> > That's why it would be really great to have it "not a problem anymore" in
> > the next stable release, since it's the first time I have encountered such
> > an issue with libzmq, even after really intensive multiarch use.
> >
> > Anyway, that was just my 2 cents; maybe I did not get what the 3.3.0
> > version is about, and that particular bug is out of scope for the release.
> >
> >
> > On Wed, May 1, 2013 at 5:31 AM, Pieter Hintjens <[email protected]> wrote:
> >>
> >> Hi Trevor,
> >>
> >> Yes, great idea. I'll get onto that asap.
> >>
> >> -Pieter
> >>
> >> On Tue, Apr 30, 2013 at 5:47 PM, Trevor Bernard
> >> <[email protected]> wrote:
> >> > There appears to be no defects for zeromq-3.2.3 in JIRA. I was just
> >> > wondering when you plan to release it.
> >> >
> >> > Warmest regards,
> >> >
> >> > Trevor
> >> >
> >> > _______________________________________________
> >> > zeromq-dev mailing list
> >> > [email protected]
> >> > http://lists.zeromq.org/mailman/listinfo/zeromq-dev
> >> >
> >
> >
> >
> >
> > --
> > Victor
> >
>



-- 
Victor
Breakpoint 4, zmq::tcp_connecter_t::connect (this=this@entry=0x7ffff00094b0) at tcp_connecter.cpp:285
285             errno_assert (errno == ECONNREFUSED || errno == ECONNRESET ||
(gdb) p err
$16 = 22
(gdb) p *this
$17 = {
  <zmq::own_t> = {
    <zmq::object_t> = {
      _vptr.object_t = 0x7ffff77dca30 <vtable for zmq::tcp_connecter_t+16>, 
      ctx = 0x605140, 
      tid = 2
    }, 
    members of zmq::own_t: 
    options = {
      sndhwm = 1000, 
      rcvhwm = 1000, 
      affinity = 0, 
      identity_size = 0 '\000', 
      identity = 
"\000\000\000\000\000\000\000@\000\000\000\000\000\000\000\060", '\000' 
<repeats 16 times>, 
"\004\000\000\377\003\000\000\376\003\000\000\375\003\000\000\374\003\000\000\373\003\000\000p\000\000\000\000\000\000\000P",
 '\000' <repeats 16 times>, 
"\004\000\000\377\003\000\000\376\003\000\000\375\003\000\000\374\003\000\000\373\003\000\000\372\003\000\000\371\003\000\000\370\003\000\000\367\003\000\000\366\003\000\000\365\003\000\000\364\003\000\000\363\003\000\000\000\000\000\000\000\000\000\000\241\001\000\000\000\000\000\000\230w\271\367\377\177\000\000\230w\271\367\377\177\000\000\376\003\000\000\375\003\000\000\374\003\000\000\373\003\000\000\372\003\000\000\371\003\000\000\370\003\000\000\367\003\000\000\366\003\000\000\365\003\000\000\364\003\000\000\363\003\000\000\362\003\000\000\361\003\000\000\360\003\000\000\357\003\000\000\356\003\000\000\355\003\000\000\354\003\000\000\353\003\000\000\352\003\000\000\351\003\000",
 <incomplete sequence \350>, 
      last_endpoint = {
        static npos = <optimized out>, 
        _M_dataplus = {
          <std::allocator<char>> = {
            <__gnu_cxx::new_allocator<char>> = {<No data fields>}, <No data fields>},
          members of std::basic_string<char, std::char_traits<char>, std::allocator<char> >::_Alloc_hider:
          _M_p = 0x7ffff71734d8 ""
        }
      }, 
      rate = 100, 
      recovery_ivl = 10000, 
      multicast_hops = 1, 
      sndbuf = 0, 
      rcvbuf = 0, 
      type = 2, 
      linger = 0, 
      reconnect_ivl = 100, 
      reconnect_ivl_max = 0, 
      backlog = 100, 
      maxmsgsize = -1, 
      rcvtimeo = -1, 
      sndtimeo = -1, 
      ipv4only = 1, 
      delay_attach_on_connect = 0, 
      delay_on_close = true, 
      delay_on_disconnect = true, 
      filter = true, 
      recv_identity = false, 
      tcp_keepalive = -1, 
      tcp_keepalive_cnt = -1, 
      tcp_keepalive_idle = -1, 
      tcp_keepalive_intvl = -1, 
      tcp_accept_filters = {
        <std::_Vector_base<zmq::tcp_address_mask_t, std::allocator<zmq::tcp_address_mask_t> >> = {
          _M_impl = {
            <std::allocator<zmq::tcp_address_mask_t>> = {
              <__gnu_cxx::new_allocator<zmq::tcp_address_mask_t>> = {<No data fields>}, <No data fields>},
            members of std::_Vector_base<zmq::tcp_address_mask_t, std::allocator<zmq::tcp_address_mask_t> >::_Vector_impl:
            _M_start = 0x0, 
            _M_finish = 0x0, 
            _M_end_of_storage = 0x0
          }
        }, <No data fields>}, 
      socket_id = 1
    }, 
    terminating = false, 
    sent_seqnum = {
      value = 1
    }, 
    processed_seqnum = 1, 
    owner = 0x60b7e0, 
    owned = {
      _M_t = {
        _M_impl = {
          <std::allocator<std::_Rb_tree_node<zmq::own_t*> >> = {
            <__gnu_cxx::new_allocator<std::_Rb_tree_node<zmq::own_t*> >> = {<No data fields>}, <No data fields>},
          members of std::_Rb_tree<zmq::own_t*, zmq::own_t*, std::_Identity<zmq::own_t*>, std::less<zmq::own_t*>, std::allocator<zmq::own_t*> >::_Rb_tree_impl<std::less<zmq::own_t*>, false>:
          _M_key_compare = {
            <std::binary_function<zmq::own_t*, zmq::own_t*, bool>> = {<No data fields>}, <No data fields>},
          _M_header = {
            _M_color = std::_S_red, 
            _M_parent = 0x0, 
            _M_left = 0x7ffff0009680, 
            _M_right = 0x7ffff0009680
          }, 
          _M_node_count = 0
        }
      }
    }, 
    term_acks = 0
  }, 
  <zmq::io_object_t> = {
    <zmq::i_poll_events> = {
      _vptr.i_poll_events = 0x7ffff77dcaf0 <vtable for zmq::tcp_connecter_t+208>
    }, 
    members of zmq::io_object_t: 
    poller = 0x607b90
  }, 
  members of zmq::tcp_connecter_t: 
  addr = 0x608550, 
  s = 16, 
  handle = 0x7ffff0000b20, 
  handle_valid = true, 
  delayed_start = true, 
  timer_started = false, 
  session = 0x60b7e0, 
  current_reconnect_ivl = 100, 
  endpoint = {
    static npos = <optimized out>, 
    _M_dataplus = {
      <std::allocator<char>> = {
        <__gnu_cxx::new_allocator<char>> = {<No data fields>}, <No data fields>},
      members of std::basic_string<char, std::char_traits<char>, std::allocator<char> >::_Alloc_hider:
      _M_p = 0x7ffff0000e28 "tcp://106.187.91.94:10080"
    }
  }, 
  socket = 0x607d90
}
(gdb) p addr
$18 = (const zmq::address_t *) 0x608550
(gdb) p *addr
$19 = {
  protocol = {
    static npos = <optimized out>, 
    _M_dataplus = {
      <std::allocator<char>> = {
        <__gnu_cxx::new_allocator<char>> = {<No data fields>}, <No data fields>},
      members of std::basic_string<char, std::char_traits<char>, std::allocator<char> >::_Alloc_hider:
      _M_p = 0x6084f8 "tcp"
    }
  }, 
  address = {
    static npos = <optimized out>, 
    _M_dataplus = {
      <std::allocator<char>> = {
        <__gnu_cxx::new_allocator<char>> = {<No data fields>}, <No data fields>},
      members of std::basic_string<char, std::char_traits<char>, std::allocator<char> >::_Alloc_hider:
      _M_p = 0x608528 "sops.locarise.com:10080"
    }
  }, 
  resolved = {
    tcp_addr = 0x608570, 
    ipc_addr = 0x608570
  }
}
(gdb) p *addr->resolved->tcp_addr 
$20 = {
  _vptr.tcp_address_t = 0x7ffff77dc970 <vtable for zmq::tcp_address_t+16>, 
  address = {
    generic = {
      sa_family = 2, 
      sa_data = "'`j\273[^\000\000\000\000\000\000\000"
    }, 
    ipv4 = {
      sin_family = 2, 
      sin_port = 24615, 
      sin_addr = {
        s_addr = 1583070058
      }, 
      sin_zero = "\000\000\000\000\000\000\000"
    }, 
    ipv6 = {
      sin6_family = 2, 
      sin6_port = 24615, 
      sin6_flowinfo = 1583070058, 
      sin6_addr = {
        __in6_u = {
          __u6_addr8 = '\000' <repeats 15 times>, 
          __u6_addr16 = {0, 0, 0, 0, 0, 0, 0, 0}, 
          __u6_addr32 = {0, 0, 0, 0}
        }
      }, 
      sin6_scope_id = 0
    }
  }
}
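
A note on reading the last struct: sin_port = 24615 and s_addr = 1583070058
look odd because they are network-byte-order values printed as plain
integers (the numbers are consistent with a little-endian host). The sketch
below decodes the raw sa_data bytes instead, which avoids any dependence on
host byte order. decode_ipv4_sa_data is a hypothetical helper written for
this illustration, not something from libzmq.

```cpp
#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/socket.h>
#include <cstring>
#include <string>

// Hypothetical helper: formats the first six bytes of sockaddr::sa_data of
// an AF_INET address as "a.b.c.d:port". Layout: 2 bytes of port followed by
// 4 bytes of IPv4 address, both in network byte order.
std::string decode_ipv4_sa_data (const unsigned char *sa_data)
{
    // Port is a big-endian 16-bit value.
    unsigned port = (unsigned (sa_data [0]) << 8) | unsigned (sa_data [1]);

    // Copy the 4 address bytes into an in_addr; they are already in the
    // network order that inet_ntop expects.
    in_addr ip;
    std::memcpy (&ip, sa_data + 2, sizeof (ip));

    char text [INET_ADDRSTRLEN] = "";
    if (inet_ntop (AF_INET, &ip, text, sizeof (text)) == NULL)
        return std::string ();

    return std::string (text) + ":" + std::to_string (port);
}
```

Fed the bytes from the trace ("'`j\273[^", i.e. 0x27 0x60 0x6a 0xbb 0x5b
0x5e), this gives 106.187.91.94:10080, which matches the endpoint string
printed earlier in the same trace.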


