Patrick Hunt updated ZOOKEEPER-744:
Status: Open (was: Patch Available)
Andrei, looks good, a few comments while reviewing the patch:
1) indicate in the docs that not all keys are available on all platforms (fd
count only on unix for example)
2) change "node_count" to "znode_count" (reduce confusion between serving nodes and
znodes)
3) your implementation of ephemeral counting is inefficient; use entrySet
instead of keySet
4) take a look at how ephemeral counting is done here:
You might refactor to use this code in both places.
5) watch_count is only counting the number of paths that are watched, not the
total number of watches (a path may have multiple watches, i.e. multiple
sessions watching the same path)
Looks like this is a bug in the existing implementation (currently only exposed
in the bean). You should fix this. Add a test for this while you are at it to
verify correct counting.
6) good that you capture the quorum info, is there a way to capture the
date/time of the last election?
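
To illustrate comments 3 and 5, here is a minimal sketch rather than the code from the patch. The class name, method names, and map shapes (session id to ephemeral paths, path to watching session ids) are assumptions made for the example; the real server data structures may differ.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical stand-ins for the server's internal maps; the field names
// and types used in the actual patch may differ.
class CountingSketch {

    // Sum ephemeral znodes across sessions. Iterating entrySet() yields the
    // key and value together, avoiding the extra get(key) lookup that a
    // keySet() loop performs on every iteration.
    static int countEphemerals(Map<Long, Set<String>> ephemeralsBySession) {
        int total = 0;
        for (Map.Entry<Long, Set<String>> e : ephemeralsBySession.entrySet()) {
            total += e.getValue().size();
        }
        return total;
    }

    // Number of distinct watched paths (what comment 5 says is counted today).
    static int countWatchedPaths(Map<String, Set<Long>> watchersByPath) {
        return watchersByPath.size();
    }

    // Total number of watches: one per (path, watching session) pair.
    static int countWatches(Map<String, Set<Long>> watchersByPath) {
        int total = 0;
        for (Set<Long> watchers : watchersByPath.values()) {
            total += watchers.size();
        }
        return total;
    }

    public static void main(String[] args) {
        Map<String, Set<Long>> watches = new HashMap<>();
        // Sessions 1 and 2 both watch /a; only session 1 watches /b.
        watches.computeIfAbsent("/a", k -> new HashSet<>()).add(1L);
        watches.computeIfAbsent("/a", k -> new HashSet<>()).add(2L);
        watches.computeIfAbsent("/b", k -> new HashSet<>()).add(1L);
        System.out.println(countWatchedPaths(watches)); // 2
        System.out.println(countWatches(watches));      // 3
    }
}
```

A test for comment 5 could follow the same shape as main: register two sessions on one path and check that the reported count is the number of watches, not the number of paths.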
> Add monitoring four-letter word
> Key: ZOOKEEPER-744
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-744
> Project: Zookeeper
> Issue Type: New Feature
> Components: server
> Affects Versions: 3.4.0
> Reporter: Travis Crawford
> Assignee: Savu Andrei
> Fix For: 3.4.0
> Attachments: zk-ganglia.png, ZOOKEEPER-744.patch, ZOOKEEPER-744.patch
> Filing a feature request based on a zookeeper-user discussion.
> Zookeeper should have a new four-letter word that returns key-value pairs
> appropriate for importing to a monitoring system (such as Ganglia which has a
> large installed base)
> This command should initially export the following:
> (a) Count of instances in the ensemble.
> (b) Count of up-to-date instances in the ensemble.
> But be designed such that in the future additional data can be added. For
> example, the output could define the statistic in a comment, then print a key
> "space character" value line:
> # Total number of instances in the ensemble
> zk_ensemble_instances_total 5
> # Number of instances currently participating in the quorum.
> zk_ensemble_instances_active 4
> From the mailing list:
> Date: Mon, 19 Apr 2010 12:10:44 -0700
> From: Patrick Hunt <ph...@apache.org>
> To: zookeeper-u...@hadoop.apache.org
> Subject: Re: Recovery issue - how to debug?
> On 04/19/2010 11:55 AM, Travis Crawford wrote:
> > It would be a lot easier from the operations perspective if the leader
> > explicitly published some health stats:
> > (a) Count of instances in the ensemble.
> > (b) Count of up-to-date instances in the ensemble.
> > This would greatly simplify monitoring & alerting - when an instance
> > falls behind one could configure their monitoring system to let
> > someone know and take a look at the logs.
> That's a great idea. Please enter a JIRA for this - a new 4 letter word
> and JMX support. It would also be a great starter project for someone
> interested in becoming more familiar with the server code.
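
The output layout proposed in the issue description (a "# comment" line, then a key, a space character, and a value) could be produced and consumed roughly as follows. This is only a sketch: the class, the Stat type, and the method names are hypothetical, and it assumes every value is an integer.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class StatsFormatSketch {

    // A statistic plus its human-readable description (hypothetical type).
    static final class Stat {
        final String description;
        final long value;

        Stat(String description, long value) {
            this.description = description;
            this.value = value;
        }
    }

    // Render each statistic as a "# description" line followed by "key value".
    static String render(Map<String, Stat> stats) {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<String, Stat> e : stats.entrySet()) {
            sb.append("# ").append(e.getValue().description).append('\n');
            sb.append(e.getKey()).append(' ').append(e.getValue().value).append('\n');
        }
        return sb.toString();
    }

    // Parse the output back into key/value pairs. Comment and blank lines are
    // skipped, and keys are not validated, so a consumer keeps working when
    // new statistics are added later.
    static Map<String, Long> parse(String output) {
        Map<String, Long> values = new LinkedHashMap<>();
        for (String raw : output.split("\n")) {
            String line = raw.trim();
            if (line.isEmpty() || line.startsWith("#")) {
                continue;
            }
            int space = line.indexOf(' ');
            if (space < 0) {
                continue; // no value on this line
            }
            String key = line.substring(0, space);
            // Assumption: values are integers; other types would need more care.
            values.put(key, Long.parseLong(line.substring(space + 1).trim()));
        }
        return values;
    }

    public static void main(String[] args) {
        Map<String, Stat> stats = new LinkedHashMap<>();
        stats.put("zk_ensemble_instances_total",
                new Stat("Total number of instances in the ensemble", 5));
        stats.put("zk_ensemble_instances_active",
                new Stat("Number of instances currently participating in the quorum.", 4));
        String text = render(stats);
        System.out.print(text);
        System.out.println(parse(text));
    }
}
```

Keeping one statistic per line, with comments optional, means a monitoring agent such as a Ganglia collector only needs to split each non-comment line on the first space.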