> Hi Thomas,
> There are a couple of projects inside Yahoo! that use ZooKeeper as an
> event manager for feed processing.
> I am a little bit unclear on your example below. As I understand it:
> 1. There are 1 million feeds that will be stored in HBase.
> 2. A MapReduce job will be run on these feeds to find out which feeds need
> to be fetched.
> 3. This will create queues in ZooKeeper to fetch the feeds.
> 4. Workers will pull items from this queue and process the feeds.
> Did I understand it correctly? Also, if the above is the case, how many queue
> items would you anticipate accumulating every hour?
Yes, that's exactly what I'm thinking about. Currently one node processes about
20,000 feeds an hour, and we have 5 feed-fetch nodes. That would mean ~100,000
queue items/hour. Each queue item should carry some metadata, most importantly
the feed items that are already known to the system, so that only new items
get processed.
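
As a rough sketch of the numbers and of the queue item payload described
above: the names FeedQueueItem, known_item_ids and new_items are hypothetical,
JSON is only an assumed serialization, and nothing here talks to ZooKeeper
itself.

```python
import json
from dataclasses import dataclass, field

FEEDS_PER_NODE_PER_HOUR = 20_000  # figure quoted in this message
FETCH_NODES = 5                   # figure quoted in this message


def queue_items_per_hour(feeds_per_node=FEEDS_PER_NODE_PER_HOUR,
                         nodes=FETCH_NODES):
    """One queue item per feed fetch, so items/hour = feeds/node/hour * nodes."""
    return feeds_per_node * nodes


@dataclass
class FeedQueueItem:
    """Hypothetical payload of one queue entry."""
    feed_url: str
    # IDs of feed items the system already knows about.
    known_item_ids: list = field(default_factory=list)

    def to_bytes(self):
        # Assumed encoding: UTF-8 JSON, suitable as an opaque byte payload.
        payload = {"feed_url": self.feed_url,
                   "known_item_ids": self.known_item_ids}
        return json.dumps(payload).encode("utf-8")

    @classmethod
    def from_bytes(cls, data):
        payload = json.loads(data.decode("utf-8"))
        return cls(payload["feed_url"], payload["known_item_ids"])


def new_items(queue_item, fetched_item_ids):
    """Return only the fetched IDs that are not already known, in order."""
    known = set(queue_item.known_item_ids)
    return [item_id for item_id in fetched_item_ids if item_id not in known]


if __name__ == "__main__":
    print(queue_items_per_hour())         # items per hour across all nodes
    print(queue_items_per_hour() / 3600)  # average items per second
    item = FeedQueueItem("http://example.org/feed", ["a", "b"])
    restored = FeedQueueItem.from_bytes(item.to_bytes())
    print(new_items(restored, ["a", "b", "c"]))
```

Carrying the known item IDs inside each queue item keeps the workers
stateless, but the payload grows with the feed's history; ZooKeeper's default
limit of about 1 MB per znode would bound how much can be carried this way.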
Thomas Koch, http://www.koch.ro