I am trying to understand Hedwig. I have read the documentation (user.txt and
dev.txt) along with the code, but
some design aspects are still not clear.
Can someone please clarify the following?
(Let's say there are two regions, A and B.)
1. When a subscriber X subscribes to topic T in region A, does the
RegionManager automatically add a subscription (with id = __A) to topic T in
region B? The RegionManager class has a couple of callbacks, and I was not able
to follow them.
2. What happens when X and Y in region A both subscribe to topic T? Does the
RegionManager try to make a separate subscription for each of X and Y in B?
Since the RegionManager uses a static subscriber id, the second subscription
request would be considered a duplicate.
3. How does X get messages from region B? The RegionManager callbacks are a bit
confusing, and I was not able to understand them.
4. What is the purpose of the classes in the org.apache.hedwig.server.proxy
package (HedwigProxy etc.)? There is no documentation explaining them.
5. What happens when one of the hubs dies? Will the publisher try to contact
another hub? And what about the subscribers? Do they need to do any error
handling or recovery?
6. The Hedwig architecture mandates a load balancer. As I understand it, this
is required because the zk instances of different regions are not shared. I
would expect all host information to be maintained in zk, and even across
colos, the information should be shared through zk (maybe that requires SSL
support in zk).
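
To make the concern in question 2 concrete, here is a toy model of what I am
assuming. It is not Hedwig code: ToyHub and REGION_A_SUBSCRIBER_ID are
hypothetical names, and the idea that the remote hub keys subscriptions by
(topic, subscriber id) is my assumption, not something I have confirmed in the
source.

```python
# Toy model of my assumption in question 2; NOT Hedwig code.
# All names here (ToyHub, REGION_A_SUBSCRIBER_ID) are hypothetical.

REGION_A_SUBSCRIBER_ID = "__A"  # the static id I believe region A's RegionManager uses


class ToyHub:
    """A hub in region B that keys subscriptions by (topic, subscriber id)."""

    def __init__(self):
        self.subscriptions = set()

    def subscribe(self, topic, subscriber_id):
        """Return True for a new subscription, False for a duplicate."""
        key = (topic, subscriber_id)
        if key in self.subscriptions:
            return False
        self.subscriptions.add(key)
        return True


hub_b = ToyHub()
# X subscribes to T in region A, so A's RegionManager subscribes in B.
first = hub_b.subscribe("T", REGION_A_SUBSCRIBER_ID)
# Y subscribes to T in region A; if the RegionManager subscribes again
# with the same static id, B sees the same (topic, id) pair.
second = hub_b.subscribe("T", REGION_A_SUBSCRIBER_ID)
print(first, second)
```

If the real behaviour is instead one cross-region subscription per topic,
shared by all local subscribers, then the second request should never be sent
at all, which is part of what I am trying to confirm.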