Indeed... I've been hit by this more than once: a zmq_setsockopt call that failed with an error nobody was handling, with weird and expensive results down the line.
Kind of makes you appreciate assertions more. CZMQ does this -- if a
setsockopt fails for any reason except ETERM, it asserts. I might
propose such a patch to libzmq.

On Mon, Jun 16, 2014 at 8:45 PM, Gerry Steele <[email protected]> wrote:
> Thanks, there was also an error in my error handling, which is why it
> was never flagged. I imagine it's the same in my app code. The uint64_t
> came from the CLI argument handling lib, which is why it was used over
> int. A lesson learned there.
>
> On 16 June 2014 19:13, Pieter Hintjens <[email protected]> wrote:
>>
>> And indeed, this code prints "-1" as the return code:
>>
>>     void *context = zmq_ctx_new ();
>>     void *publisher = zmq_socket (context, ZMQ_PUB);
>>     uint64_t rhwm = 0;
>>     int rc = zmq_setsockopt (publisher, ZMQ_SNDHWM, &rhwm, sizeof (rhwm));
>>     printf ("RC=%d\n", rc);
>>
>> -Pieter
>>
>> On Mon, Jun 16, 2014 at 8:03 PM, Pieter Hintjens <[email protected]> wrote:
>> > Hmm, it does check the size of the passed argument, and if that's
>> > wrong, returns an error (which you do check for).
>> >
>> > On Mon, Jun 16, 2014 at 7:36 PM, Gerry Steele <[email protected]>
>> > wrote:
>> >> Hi Pieter, you have struck on something there.
>> >>
>> >> Converting it to int seems to yield the correct behaviour.
>> >>
>> >> I guess that, the way setsockopt works, type coercion doesn't happen.
>> >>
>> >> Embarrassing! But at least we got to the bottom of it.
>> >>
>> >> I was able to send billions of events without incurring loss.
>> >> Apologies for taking everyone's time.
>> >>
>> >> Thanks all.
>> >>
>> >> g
>> >>
>> >> On 16 June 2014 18:22, Pieter Hintjens <[email protected]> wrote:
>> >>>
>> >>> OK, just to double check, you're using ZeroMQ 4.0.x? In your test
>> >>> case (which I'm belatedly looking at), you use a uint64_t for the
>> >>> hwm values; it should be int. Probably not significant.
>> >>>
>> >>> On Mon, Jun 16, 2014 at 6:20 PM, Gerry Steele <[email protected]>
>> >>> wrote:
>> >>> > In the parent email I have links to the minimal examples on
>> >>> > gist.github.com
>> >>> >
>> >>> > Happy to open an issue and commit them later on if that's what
>> >>> > you need.
>> >>> >
>> >>> > Thanks
>> >>> >
>> >>> > On 16 Jun 2014 14:43, "Pieter Hintjens" <[email protected]> wrote:
>> >>> >>
>> >>> >> Gerry, can you provide a minimal test case that shows the
>> >>> >> behavior? Thanks.
>> >>> >>
>> >>> >> On Mon, Jun 16, 2014 at 12:49 PM, Gerry Steele
>> >>> >> <[email protected]> wrote:
>> >>> >> > Thanks Pieter. I can't try this out till I get home but it is
>> >>> >> > looking like hwm overflows.
>> >>> >> >
>> >>> >> > If you run the utilities you notice the drops start happening
>> >>> >> > after precisely 1000 events in the first instance (which is
>> >>> >> > the default hwm).
>> >>> >> >
>> >>> >> > There was another largely ignored thread about this recently
>> >>> >> > mentioning the same problem.
>> >>> >> >
>> >>> >> > I also tried setting the hwm values to a number greater than
>> >>> >> > the number of events and it seemed to have no effect either.
>> >>> >> >
>> >>> >> > g
>> >>> >> >
>> >>> >> > On 16 Jun 2014 09:32, "Pieter Hintjens" <[email protected]> wrote:
>> >>> >> >>
>> >>> >> >> On Mon, Jun 16, 2014 at 9:10 AM, Gerry Steele
>> >>> >> >> <[email protected]> wrote:
>> >>> >> >>
>> >>> >> >> > Big chunks of messages go missing mid flow and then pick
>> >>> >> >> > up again. There is no literature that indicates that is
>> >>> >> >> > expected behaviour.
>> >>> >> >>
>> >>> >> >> Right. The two plausible causes for this are (a) HWM
>> >>> >> >> overflows, and (b) temporary network disconnects. You have
>> >>> >> >> excluded (a), though to be paranoid I'd probably add some
>> >>> >> >> temporary logging to libzmq's pub socket to shout out
>> >>> >> >> if/when it does hit the HWM. To detect (b) you could use
>> >>> >> >> the socket monitoring. The third possibility is that you're
>> >>> >> >> doing something wrong with subscriptions... though that
>> >>> >> >> seems unlikely.
>> >>> >> >>
>> >>> >> >> -Pieter
>
> --
> Gerry Steele

_______________________________________________
zeromq-dev mailing list
[email protected]
http://lists.zeromq.org/mailman/listinfo/zeromq-dev
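For anyone finding this thread later, here is a minimal sketch of the failure mode and of the assert-on-failure idea discussed above. To keep it compilable without libzmq, `fake_set_hwm` is a hypothetical stand-in for `zmq_setsockopt (socket, ZMQ_SNDHWM, ...)`: it models only the size check described in the thread (value must be exactly `sizeof (int)`, otherwise -1), and the use of EINVAL as the error code is my assumption about the real call. `set_hwm_or_die` is likewise a made-up name, and unlike the CZMQ behaviour mentioned above it does not special-case ETERM.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for zmq_setsockopt (socket, ZMQ_SNDHWM, ...).
   It models only the behaviour described in this thread: the option
   value must be exactly sizeof (int) bytes, otherwise the call returns
   -1 (errno = EINVAL is an assumption). This is not libzmq's code. */
int
fake_set_hwm (int *hwm, const void *optval, size_t optvallen)
{
    if (optval == NULL || optvallen != sizeof (int)) {
        errno = EINVAL;
        return -1;              /* stored value is left untouched */
    }
    memcpy (hwm, optval, sizeof (int));
    return 0;
}

/* Assert-on-failure wrapper (hypothetical name). Taking the value as
   an int parameter means a uint64_t from a CLI parser is converted at
   the call site, and a failed set can no longer go unnoticed. */
void
set_hwm_or_die (int *hwm, int value)
{
    int rc = fake_set_hwm (hwm, &value, sizeof (value));
    assert (rc == 0);
    (void) rc;                  /* avoid unused warning under NDEBUG */
}
```

With a `uint64_t rhwm = 0`, calling `fake_set_hwm (&hwm, &rhwm, sizeof (rhwm))` fails and leaves the previous limit in place, which matches the "drops after precisely 1000 events" symptom; calling `set_hwm_or_die (&hwm, 0)` succeeds. Against the real library the same pattern would wrap `zmq_setsockopt` with an `int` value.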
