Hi David,

Sharing the cluster between HDFS/MapReduce and ZooKeeper can cause significant problems. MapReduce is very I/O intensive, and that can cause a lot of unnecessary hiccups in your cluster. If you really want to share the nodes, I would suggest at least providing something like this:
- a reasonable amount of memory, say 400-500MB (depending on your usage), for the Java heap
- one dedicated disk not used by MapReduce or the DataNodes, so that ZooKeeper performance is somewhat predictable for you.

Thanks
mahadev

On 3/8/10 10:58 AM, "David Rosenstrauch" <dar...@darose.net> wrote:

> I'm contemplating an upcoming zookeeper rollout and was wondering what
> the zookeeper brain trust here thought about a network deployment question:
>
> Is it generally considered bad practice to just deploy zookeeper on our
> existing hdfs/MR nodes? Or is it better to run zookeeper instances on
> their own dedicated nodes?
>
> On the one hand, we're not going to be making heavy-duty use of
> zookeeper, so it might be sufficient for zookeeper nodes to share box
> resources with HDFS & MR. On the other hand, though, I don't want
> zookeeper to become unavailable if the nodes are running a resource
> intensive job that's hogging CPU or network.
>
> What's generally considered best practice for Zookeeper?
>
> Thanks,
>
> DR
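For concreteness, the two suggestions in the reply above (a dedicated disk for ZooKeeper's transaction log, and a bounded Java heap) might be sketched roughly like this. The paths, heap sizes, and file locations here are illustrative assumptions, not recommendations from this thread; `dataLogDir` and `JVMFLAGS` are the standard ZooKeeper knobs for these two settings.

```shell
# Illustrative sketch only -- paths and sizes are assumptions.

# zoo.cfg: point dataLogDir at a disk that MapReduce tasks and the
# DataNode do not touch, so fsyncs of the ZK transaction log are not
# queued behind MR/HDFS I/O.
cat >> /etc/zookeeper/zoo.cfg <<'EOF'
dataDir=/var/zookeeper/data
dataLogDir=/mnt/zk-disk/txnlog
EOF

# conf/java.env (sourced by zkServer.sh): bound the JVM heap in line
# with the 400-500MB suggestion, e.g. a fixed 512MB heap.
echo 'export JVMFLAGS="-Xms512m -Xmx512m"' >> /etc/zookeeper/java.env
```

Pinning `-Xms` equal to `-Xmx` avoids heap resizing pauses, which matters more than usual when ZooKeeper is sharing a box with resource-hungry MR jobs.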