People in this discussion may be interested in taking a look at RELP. It's used in Rsyslog when you want to absolutely guarantee that log messages are transmitted.

http://www.rsyslog.com/doc/relp.html

On 12/20/2013 3:01 PM, Lindley French wrote:
I'll agree to a limited extent, though I don't see things exactly as you do.

The problem, in my view, is that normally you can trust TCP to get your packets through intact. When something goes wrong and a connection fails, you can take appropriate action to test what got through and what didn't, and fix it. But when TCP connections go down and come back up within 0MQ, there's no way to react to that, and 0MQ doesn't do much (as I understand it) to ensure no messages were lost in transit. So nothing is done automatically, and nothing can be done manually when a fault occurs.....which means you are forced to write a higher-level protocol on top of 0MQ *as if* it were totally unreliable and failures could happen at any time, even though in reality TCP is pretty good 99% of the time.

I'm not going to request that 0MQ do its own acking and retransmission or anything like that----I've been down that road; sooner or later you're basically writing TCP over TCP----but I do think there should be hooks to let you know which range of messages might be at risk when a connection goes down, so you can give them whatever special treatment you like.
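For what it's worth, the higher-level layer I end up writing usually amounts to a small sequence-number/ack tracker on the sender side. A minimal sketch in plain Python (transport-agnostic; `send_fn` stands in for whatever 0MQ send call you use, and all the names here are mine, not anything from the 0MQ API):

```python
import time

class AckTracker:
    """Track in-flight messages by sequence number so that, when a
    connection drops, we know exactly which messages are at risk."""

    def __init__(self, send_fn, timeout=2.0):
        self.send_fn = send_fn    # callable(seq, payload) doing the real send
        self.timeout = timeout    # seconds to wait for an ack
        self.next_seq = 0
        self.pending = {}         # seq -> (payload, sent_at)

    def send(self, payload):
        seq = self.next_seq
        self.next_seq += 1
        self.pending[seq] = (payload, time.monotonic())
        self.send_fn(seq, payload)
        return seq

    def ack(self, seq):
        # Peer confirmed receipt; message is no longer at risk.
        self.pending.pop(seq, None)

    def at_risk(self, now=None):
        # Messages sent but not acked within the timeout: these are the
        # ones to retransmit, or to escalate to the application.
        now = time.monotonic() if now is None else now
        return [seq for seq, (_, sent) in self.pending.items()
                if now - sent > self.timeout]
```

The retransmission policy (retry silently vs. report to the user) stays with the application, which is exactly the kind of hook I wish 0MQ exposed at the socket level.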


On Fri, Dec 20, 2013 at 2:42 PM, artemv zmq <[email protected]> wrote:

    hi Gregg,

    As for the "acks": the game on the mobile device waits (with a
    timeout) for acks. So yes, we do acks, sure.

    I was also thinking about
    >> timestamping the messages and giving them a TTL

    and considered it unreliable in my case. The problem is that we
    don't have control over where we deploy our software. We can't
    verify that the time settings are the same on all nodes in a
    cluster, and we can't ask our customers: "you have to ensure that
    the time settings are the same on all nodes in your datacenter."
    I'm pretty sure that wouldn't work (at least, in my company).

    As for
    >>This sounds like an application problem, not a 0MQ problem

    I wouldn't put it like that. It's not an application problem; it's
    rather a missing feature in 0mq. I think behaviour like
    "_unconditionally_ deliver messages on a reconnected socket" is
    somewhat too strict. It's designed more to support some kind of
    historical data flow, where you don't want to lose even one
    message. What might that be? E.g. weather data from sensors, or
    quotes from a stock exchange. But it is not very suitable when you
    deal with something like "place a bet", "create a purchase order",
    or "book a hotel room". Agree?




    2013/12/20 Gregg Irwin <[email protected]>

        Hi Artem,

        az> Real example from gambling.

        az> We have thousands of users betting from their phones. For the
        az> end user a bet is just a click in the UI, but for the backend
        az> it's a bunch of remote calls to services. If a service is not
        az> available, the bet message will get stuck in 0mq's in-memory
        az> message queue (up to the HWM). The game UI can wait up to a
        az> certain timeout and then render something akin to "We have a
        az> communication problem with our backend. Try again later." So at
        az> this point the user believes the bet did not succeed (.. this
        az> is important). What happens then -- ITOps get their pagers
        az> ringing, and over the next hour they do their best to restart
        az> the failed service. Ok?

        az> After 1hr or so the service restarts, and now what? Now the
        az> queued bet will be delivered to the restarted service. And this
        az> is not good, because 1hr earlier we assured the user that "we
        az> had a backend issue" and his bet did not succeed.

        az> So the question arose -- how to avoid redelivering messages
        az> upon reconnect?

        This sounds like an application problem, not a 0MQ problem. A
        request to place the bet can be received, which doesn't guarantee
        that the bet has been placed (if other work needs to be done). To
        know that the bet was placed, you need an ack. You can also ack
        that the *request* was received. In your scenario above,
        timestamping the messages and giving them a TTL lets you handle
        cases where requests could not be processed in a timely manner,
        and possibly ask the user what they want to do.
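        Roughly what I mean by timestamp + TTL, as an illustrative sketch
        (the field names are made up, and it assumes sender and receiver
        clocks are reasonably in sync, which may or may not hold in your
        deployment):

```python
import json
import time

TTL_SECONDS = 30.0

def stamp(payload, now=None):
    # Sender side: attach a wall-clock timestamp and a TTL to the message.
    now = time.time() if now is None else now
    return json.dumps({"ts": now, "ttl": TTL_SECONDS, "body": payload})

def accept(message, now=None):
    # Receiver side: drop anything that sat in a queue past its TTL,
    # instead of processing a stale "place a bet" an hour later.
    now = time.time() if now is None else now
    msg = json.loads(message)
    if now - msg["ts"] > msg["ttl"]:
        return None          # expired: reject, and possibly ask the user
    return msg["body"]
```

        An expired message is rejected rather than processed, so a bet
        queued during an outage never executes silently after the service
        comes back.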

        -- Gregg

        _______________________________________________
        zeromq-dev mailing list
        [email protected]
        http://lists.zeromq.org/mailman/listinfo/zeromq-dev






