Hey ZooKeeper gurus - Are there any recommended ways to monitor ZooKeeper ensembles? I'm familiar with the four-letter words and know that stats are published via JMX - I'm more interested in what people are doing with those stats.

I'd like to publish the JMX stats to Ganglia, and this works well for the built-in stats. However, the zookeeper-specific names appear to be dynamic which causes issues when deciding what to publish. For example, the current mode (leader/follower) appears to only be accessible from the bean names, instead of looking at, say, a "mode" stat.

    org.apache.ZooKeeperService:name0=ReplicatedServer_id1,name1=replica.1,name2=Follower
    org.apache.ZooKeeperService:name0=ReplicatedServer_id2,name1=replica.2,name2=Leader

The only way I've found to learn if replicas are up-to-date is looking at "synced" buried in followerInfo:

    $ java -jar cmdline-jmxclient-0.10.5.jar - localhost:8081 org.apache.ZooKeeperService:name0=ReplicatedServer_id2,name1=replica.2,name2=Leader followerInfo
    04/14/2010 18:06:06 +0000 org.archive.jmx.Client followerInfo:
    FollowerHandler Socket[addr=/10.0.0.10,port=48104,localport=2888] tickOfLastAck:29793 synced?:true queuedPacketLength:0
    FollowerHandler Socket[addr=/10.0.0.11,port=59599,localport=2888] tickOfLastAck:29793 synced?:true queuedPacketLength:0

I don't mind writing a tool to parse the JMX output and publishing to Ganglia if needed, but it seems like a problem that may have already been solved and I'm curious what others are doing. The tool would basically take the zookeeper stats, normalize the names, and publish to a timeseries database.

Is anyone already monitoring ZK in a way others might find useful?

Thanks!
Travis
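
P.S. To make the "normalize the names" step concrete, here is a rough Python sketch of what such a tool might do with the bean names and followerInfo text shown above. The function names, the "zookeeper." metric prefix, and the output field names are all made up for illustration, and the regex is based only on the two sample lines in this message, so treat it as an assumption about the format rather than a spec. Publishing to Ganglia is left out.

```python
import re

def normalize_bean_name(bean):
    """Split a bean name into (stable metric prefix, mode).

    Hypothetical helper: assumes name1 identifies the replica and
    name2 carries the role, as in the two bean names quoted above.
    """
    _domain, _, props = bean.partition(":")
    pairs = dict(p.split("=", 1) for p in props.split(","))
    replica = pairs.get("name1", "unknown")       # e.g. "replica.2"
    mode = pairs.get("name2", "unknown").lower()  # "leader" / "follower"
    # The prefix no longer changes when the server changes role.
    return "zookeeper." + replica.replace(".", "_"), mode

# Pattern inferred from the sample followerInfo output only.
FOLLOWER_RE = re.compile(
    r"addr=/(?P<addr>[\d.]+),port=\d+,localport=\d+\]\s+"
    r"tickOfLastAck:(?P<tick>\d+)\s+"
    r"synced\?:(?P<synced>true|false)\s+"
    r"queuedPacketLength:(?P<queued>\d+)"
)

def parse_follower_info(text):
    """Return one dict per FollowerHandler entry found in the text."""
    return [
        {
            "addr": m["addr"],
            "tick_of_last_ack": int(m["tick"]),
            "synced": m["synced"] == "true",
            "queued_packet_length": int(m["queued"]),
        }
        for m in FOLLOWER_RE.finditer(text)
    ]
```

With this, the mode becomes an ordinary value (for example 1 for leader, 0 for follower) published under a fixed name, and each follower's "synced" flag can be published per address.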