I'm planning an upcoming ZooKeeper rollout and was wondering what the ZooKeeper brain trust here thinks about a deployment question:

Is it generally considered bad practice to deploy ZooKeeper on our existing HDFS/MR nodes? Or is it better to run ZooKeeper instances on their own dedicated nodes?

On the one hand, we're not going to be making heavy-duty use of ZooKeeper, so it might be sufficient for the ZooKeeper nodes to share box resources with HDFS & MR. On the other hand, I don't want ZooKeeper to become unavailable if the nodes are running a resource-intensive job that's hogging CPU or network.

What's generally considered best practice for ZooKeeper?



