Wrt bandwidth, the issue there is that when you do a write you end up copying
the data between servers in the quorum:
1) client setdata("largedata") -> follower ZK server (copy data)
2) follower ZK server forwards the proposal to the ZK server leader
3) ZK server leader does atomic broadcast to all followers - i.e. sends
individual copies of the data to all the followers (copy * (x-1), where x
is the number of servers)
4) majority of followers ack, leader commits, follower responds to
the client

Again, if you have a handful of nodes it's not a big deal... but as/if
you expand your use you end up with a potential issue.
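
To put rough numbers on the steps above, here is a back-of-the-envelope sketch in Python. The function name and parameters are made up for illustration. It only counts copies of the payload itself (no protocol overhead, acks, snapshots or transaction log writes), and by default it assumes the client is connected to a follower, as in step 1.

```python
def write_traffic_bytes(data_size, ensemble_size, client_on_follower=True):
    """Estimate payload bytes moved over the network for one write.

    Hypothetical helper: counts only copies of the data itself.
    """
    if ensemble_size < 1:
        raise ValueError("ensemble_size must be at least 1")
    # Step 1: the client sends the data to the server it is connected to.
    client_to_server = data_size
    # Step 2: a follower forwards the request to the leader (skipped if the
    # client happens to be connected to the leader).
    forward_to_leader = data_size if client_on_follower else 0
    # Step 3: the leader sends one copy to each of the other servers.
    broadcast = data_size * (ensemble_size - 1)
    return {
        "client_to_server": client_to_server,
        "forward_to_leader": forward_to_leader,
        "broadcast": broadcast,
        "total": client_to_server + forward_to_leader + broadcast,
    }


MB = 1024 * 1024
estimate = write_traffic_bytes(10 * MB, ensemble_size=5)
print({name: size // MB for name, size in estimate.items()})
```

Under those assumptions, a single 10M write against a 5-server ensemble moves about 60M in total: 10M client to follower, 10M follower to leader, and 40M for the broadcast. The broadcast term is the one that grows as servers are added.
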
Of course if you care about reliability/availability of the data then
the choice of "third party data store" is important... this really depends
on your requirements. Perhaps storing in ZK makes sense for your use case.

Eric Bowman wrote:
Thanks for the quick reply Henry & Patrick.
I understand the importance of "small things" from a common use case point
of view; I don't think my case is so common, but it's also not that big
a deal to just write the data to an NFS volume and put its path in ZK.
I was kind of hoping to avoid that, but I have to do that anyhow for
other things, so this doesn't do much damage. :)
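
For what it's worth, the "write the data to an NFS volume and put its path in ZK" approach could look roughly like the sketch below. This is only an illustration under stated assumptions: the function names are hypothetical, a temporary directory stands in for the NFS mount, and a plain dict stands in for the ZooKeeper client (real code would call the client's set/get data operations at those two points).

```python
import os
import tempfile
import uuid


def store_blob(shared_dir, payload):
    """Write the large payload to the shared volume and return its path."""
    blob_path = os.path.join(shared_dir, "blob-" + uuid.uuid4().hex)
    tmp_path = blob_path + ".tmp"
    with open(tmp_path, "wb") as f:
        f.write(payload)
        f.flush()
        os.fsync(f.fileno())
    # Rename into place so readers never see a half-written file.
    os.replace(tmp_path, blob_path)
    return blob_path


def publish_pointer(znodes, znode_path, blob_path):
    """Store only the small path in the ZK stand-in (a dict here)."""
    znodes[znode_path] = blob_path.encode("utf-8")


def fetch_blob(znodes, znode_path):
    """Read the path from the ZK stand-in, then load the data from disk."""
    blob_path = znodes[znode_path].decode("utf-8")
    with open(blob_path, "rb") as f:
        return f.read()


# Demo: the temporary directory plays the role of the NFS mount.
with tempfile.TemporaryDirectory() as shared_dir:
    znodes = {}
    payload = b"x" * (10 * 1024 * 1024)
    publish_pointer(znodes, "/app/largedata", store_blob(shared_dir, payload))
    assert fetch_blob(znodes, "/app/largedata") == payload
    print(len(znodes["/app/largedata"]), "bytes stored in the ZK stand-in")
```

The point of the design is that only the short path goes through the quorum, so the per-write copying between servers stays small no matter how large the payload is. The trade-off is that reliability of the data now depends on the shared volume.
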
At some point I'll spend some time understanding how this really affects
latency in my case ... I'm keeping just a handful of things that are
about 10M in the ensemble, so the memory footprint is no problem. But
the network bandwidth could be ... I'll check it out.