I also thought about repairers (I called them a message store), but I don't want the design to be overcomplicated. I think the discovery services can also take on the job of the repairers.
Anyway, thanks for your thoughts.

On Sun, May 12, 2013 at 5:50 AM, Ian Barber <[email protected]> wrote:

> Sounds pretty sensible. You might want to consider having separate
> repairers from the publishers, particularly if you have a bursty source of
> messages. Then if a subscriber can't keep up they can go to the repairer
> without affecting the publisher.
>
> Being smart about the batching as well can make the system perform a bit
> more smoothly in failure modes, so if a subscriber is failing to keep up
> and dropping occasional messages, it may be best to disconnect until its
> backlog is processed, pull a large batch in the recovery mode, then
> reconnect to the stream.
>
> Ian
>
>
> On Wed, May 8, 2013 at 12:18 AM, Doron Somech <[email protected]> wrote:
>
>> Hi All,
>>
>> Usually we use zeromq with pgm as our message bus. We use the
>> message bus to publish events between server-side services.
>>
>> The issue is that we need to support environments where multicast is not
>> supported (like the Amazon cloud).
>>
>> I'm working on a design for a tcp-based message bus and want to get
>> your thoughts on it.
>>
>> There are three major requirements: we want services to be able to come
>> and go without needing to reconfigure the system, we want a brokerless
>> design, and we want to be able to recover messages lost between a
>> publisher and a subscriber (caused by connection problems) like pgm does.
>>
>> We have three types of components: a discovery service, publisher and
>> subscriber.
>>
>> The discovery service is a standalone service that holds the list of all
>> the subscribers in the network. Each subscriber pings the discovery
>> service every X seconds; when a specific subscriber hasn't pinged the
>> service for more than Y seconds it is considered dead. On every new
>> subscriber the discovery service publishes a message to all the
>> publishers. For high availability there is more than one discovery
>> service (probably 3).
>>
>> When a publisher starts, it asks the discovery service for all of
>> the subscribers and subscribes for new subscribers (it asks all configured
>> discovery services and takes the first answer, and it subscribes to all of
>> the discovery services). After getting the list the publisher connects to
>> all of the subscribers. The publisher also connects to every new
>> subscriber. The publisher ignores dead subscribers (mostly because I
>> don't know how to handle them: the dead message can come from one of
>> the discovery services while the subscriber is still alive on others).
>>
>> All the messages the publisher sends are numbered, and the
>> publisher saves the last X messages it sent to support recovery of
>> lost messages. Each publisher has a unique random id.
>>
>> If a publisher doesn't send a message for X seconds, it will send
>> a keep-alive message to all subscribers.
>>
>> As mentioned, the subscriber pings the discovery services every X seconds.
>> When the subscriber gets a message from a publisher for the first time it
>> saves the message number. From there, if the subscriber detects a gap in
>> the messages it connects directly to the publisher (using request-response)
>> and asks for the missing messages. The only problem is that while
>> messages are missing, the subscriber will stop handling new messages from
>> all publishers until the missing messages are restored.
>>
>> If the publisher doesn't have those messages anymore the subscriber
>> should raise an exception or restart the entire service.
>>
>> The only thing the subscriber and publisher need to know is the addresses
>> of the discovery services.
>>
>> The reason I want the publisher to connect to the subscriber is to make
>> sure that when the connection is dropped the publisher will be able to
>> recognize it and reconnect (the subscriber may not be able to recognize it
>> because it doesn't send any data to the publishers).
>>
>> Thanks, I will very much appreciate your comments.
>>
>> Doron
>>
>> _______________________________________________
>> zeromq-dev mailing list
>> [email protected]
>> http://lists.zeromq.org/mailman/listinfo/zeromq-dev
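
To make the numbering and recovery part of the design above a bit more concrete, here is a rough sketch in plain Python. It is only an illustration of the idea, not a real implementation: there are no zeromq sockets in it, and the names Publisher, Subscriber, replay and fetch_missing are made up for the example. In the real design fetch_missing would be the request-response call to the publisher.

```python
from collections import deque


class Publisher:
    """Numbers outgoing messages and keeps only the last `history` of them."""

    def __init__(self, history=1000):
        self.next_seq = 1
        # (seq, payload) pairs, oldest first; old entries fall off automatically.
        self.buffer = deque(maxlen=history)

    def publish(self, payload):
        seq = self.next_seq
        self.next_seq += 1
        self.buffer.append((seq, payload))
        return seq, payload

    def replay(self, first, last):
        """Return messages first..last, or raise if any was already evicted."""
        if not self.buffer or first < self.buffer[0][0]:
            raise LookupError("requested messages are no longer buffered")
        return [(s, p) for s, p in self.buffer if first <= s <= last]


class Subscriber:
    """Tracks the last sequence number per publisher and repairs gaps."""

    def __init__(self, fetch_missing):
        # Hypothetical callback: fetch_missing(pub_id, first, last) -> [(seq, payload)]
        self.fetch_missing = fetch_missing
        self.last_seq = {}   # publisher id -> last sequence number delivered
        self.delivered = []  # (publisher id, seq, payload) in delivery order

    def on_message(self, pub_id, seq, payload):
        last = self.last_seq.get(pub_id)
        if last is not None:
            if seq <= last:
                return  # duplicate, already delivered
            if seq > last + 1:
                # Gap detected: recover the missing range before moving on.
                for s, p in self.fetch_missing(pub_id, last + 1, seq - 1):
                    self.delivered.append((pub_id, s, p))
        # First message from this publisher just records its number.
        self.delivered.append((pub_id, seq, payload))
        self.last_seq[pub_id] = seq
```

If replay raises, that is the case where the publisher no longer has the messages and the subscriber has to raise an exception or restart. Note that a gap at the very end of a burst is only noticed when the next message arrives, so in this sketch the keep-alive would have to carry the publisher's latest message number to trigger the same check.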
