To follow up on this thread, which seems to be asking primarily about
Hedwig (and I guess the answer is: it's not in production yet, anywhere),
I have one more question about BookKeeper: is it used in production as a WAL
(or for any other purpose) anywhere? If so, for what uses?
Any info (even anecdotal) would be great!
On Thu, Oct 7, 2010 at 9:15 AM, Benjamin Reed <br...@yahoo-inc.com> wrote:
> hi amit,
> sorry for the late response. this week has been crunch time for a lot of
> different things.
> here are your answers:
> 1. it is still in prototype phase. we are evaluating different aspects, but
> there is still some work to do to make it production ready. we also need to
> get an engineering team to sign up to stand behind it.
> 2. it's a generic pub/sub message bus. in some sense it is really a
> datacenter solution with extensions for multi-data center operation, so it
> is perfectly reasonable to use it in a single datacenter setting.
> 3. yeah, we have removed the hw.bash script. it had some hardcoded
> assumptions and was a swiss army knife on steroids. we have been breaking it
> up into simpler scripts.
> 4. session expiry really represents a fundamental connectivity problem, so
> both bk and hedwig restart the component that gets the expired session.
> 1. yes.
> 2. once all subscribers have consumed a message there is a background
> process that cleans it up.
> 3. yes there is a replication factor and we ensure replication on writes
> and there is a recovery tool to recover bookies that fail. we don't have to
> worry about conflicts because there is only a single writer for a given
> ledger. because of this we do not need to do quorum reads.
> yes, this is something we need to work on. i'll see if i can push out some
> of our hello world applications. we'd also like to put a JMS API on top so
> that the API is more familiar (and documented :). i don't want to delay the
> answers to your other questions, so let me answer that HedwigSubscriber is
> the class for clients. the other classes are internal. (for cross data
> center hubs use a special kind of subscriptions to do cross data center
> On 10/05/2010 10:32 PM, amit jaiswal wrote:
>> In the Hedwig talk (http://vimeo.com/13282102), it was mentioned that the
>> use case for Hedwig comes from the distributed key-value store PNUTS in
>> but also said that the work is new.
>> Could you please clarify the following:
>> Production readiness / Deployment
>> 1. What is the production readiness of Hedwig / BookKeeper? Is it being
>> used anywhere (like in PNUTS)?
>> 2. Is Hedwig designed to be used as a generic message bus or only for
>> multi-datacenter operations?
>> 3. Hedwig installation and deployment is done through a script hw.bash,
>> but that is difficult to use, especially in a production environment. Are
>> there any packages available that can simplify the deployment of Hedwig?
>> 4. How does BK/Hedwig handle zookeeper session expiry?
>> Data Deletion, Handling data loss, Quorum
>> 1. Does BookKeeper support deletion of old log entries which have been
>> 2. How does Hedwig handle the case when all subscribers have consumed all
>> messages? In the talk, it was said that a subscriber can come back after
>> days or weeks. Is there any data retention / expiration policy for the
>> data that is published?
>> 3. How does Hedwig handle data loss? There is a replication factor, and an
>> operation must be accepted by a majority of the bookies, but how data
>> are handled? Is there any possibility of data conflict at all? Is the
>> replication only for recovery? When the hub is reading data from bookies,
>> does it read from all the bookies to satisfy a quorum read?
>> What is the difference between PubSubServer, HedwigSubscriber, and
>> HedwigHubSubscriber? Is there any HelloWorld program that simply
>> illustrates how to instantiate a Hedwig client and publish/consume messages?
>> class is helpful, but was looking something like API documentation).