[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-744?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Patrick Hunt updated ZOOKEEPER-744:
-----------------------------------

    Status: Open  (was: Patch Available)

Andrei, looks good, a few comments while reviewing the patch:

1) indicate in the docs that not all keys are available on all platforms (for
example, the fd count is available only on Unix)
2) change "node_count" to "znode_count" (to reduce confusion between serving
nodes and znodes)
3) your implementation of ephemeral counting:
org.apache.zookeeper.server.DataTree.getEphemeralsCount()
is inefficient; iterate over entrySet() rather than keySet()
4) take a look at how ephemeral counting is done here:
org.apache.zookeeper.server.DataTreeBean.countEphemerals()
You might refactor so that this code is used in both places.
5) watch_count only counts the number of paths that are watched, not the
total number of watches (a path may have multiple watches, i.e. multiple
sessions watching the same path).
This looks like a bug in the existing implementation (currently only exposed
in the bean). You should fix this, and add a test while you are at it to
verify correct counting.
6) good that you capture the quorum info; is there a way to capture the
date/time of the last election?
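
To illustrate points 3 and 5, here is a minimal sketch. It is not the actual
DataTree or watch manager code: the class name CountSketch, the method names,
and the map shapes (path -> set of watchers, session id -> set of ephemeral
paths) are assumptions made for the example.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical stand-in for the server-side tables; not ZooKeeper code.
class CountSketch {

    // What watch_count reports today: the number of watched paths.
    static <W> int countWatchedPaths(Map<String, Set<W>> watchTable) {
        return watchTable.size();
    }

    // What it should report: the sum of the watcher-set sizes, so a path
    // watched by several sessions is counted once per watcher.
    static <W> int countTotalWatches(Map<String, Set<W>> watchTable) {
        int total = 0;
        for (Set<W> watchers : watchTable.values()) {
            total += watchers.size();
        }
        return total;
    }

    // Iterating entrySet() reads each value directly; iterating keySet()
    // would need an extra get(key) lookup per session.
    static int countEphemerals(Map<Long, Set<String>> ephemerals) {
        int total = 0;
        for (Map.Entry<Long, Set<String>> entry : ephemerals.entrySet()) {
            total += entry.getValue().size();
        }
        return total;
    }

    public static void main(String[] args) {
        Map<String, Set<String>> watchTable = new HashMap<>();
        watchTable.put("/a", new HashSet<>(Arrays.asList("session1", "session2")));
        watchTable.put("/b", new HashSet<>(Arrays.asList("session1")));

        System.out.println("paths=" + countWatchedPaths(watchTable));   // paths=2
        System.out.println("watches=" + countTotalWatches(watchTable)); // watches=3
    }
}
```

A test for point 5 could follow the same shape: register two watchers on one
path and check that the reported count is 2 rather than 1.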


> Add monitoring four-letter word
> -------------------------------
>
>                 Key: ZOOKEEPER-744
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-744
>             Project: Zookeeper
>          Issue Type: New Feature
>          Components: server
>    Affects Versions: 3.4.0
>            Reporter: Travis Crawford
>            Assignee: Savu Andrei
>             Fix For: 3.4.0
>
>         Attachments: zk-ganglia.png, ZOOKEEPER-744.patch, ZOOKEEPER-744.patch
>
>
> Filing a feature request based on a zookeeper-user discussion.
> Zookeeper should have a new four-letter word that returns key-value pairs 
> appropriate for importing to a monitoring system (such as Ganglia which has a 
> large installed base)
> This command should initially export the following:
> (a) Count of instances in the ensemble.
> (b) Count of up-to-date instances in the ensemble.
> But be designed such that in the future additional data can be added. For 
> example, the output could define the statistic in a comment, then print a key 
> "space character" value line:
> """
> # Total number of instances in the ensemble
> zk_ensemble_instances_total 5
> # Number of instances currently participating in the quorum.
> zk_ensemble_instances_active 4
> """
> From the mailing list:
> """
> Date: Mon, 19 Apr 2010 12:10:44 -0700
> From: Patrick Hunt <ph...@apache.org>
> To: zookeeper-u...@hadoop.apache.org
> Subject: Re: Recovery issue - how to debug?
> On 04/19/2010 11:55 AM, Travis Crawford wrote:
> > It would be a lot easier from the operations perspective if the leader
> > explicitly published some health stats:
> >
> > (a) Count of instances in the ensemble.
> > (b) Count of up-to-date instances in the ensemble.
> >
> > This would greatly simplify monitoring & alerting - when an instance
> > falls behind one could configure their monitoring system to let
> > someone know and take a look at the logs.
> That's a great idea. Please enter a JIRA for this - a new 4 letter word 
> and JMX support. It would also be a great starter project for someone 
> interested in becoming more familiar with the server code.
> Patrick
> """
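
The output format proposed in the issue description (comment lines starting
with "#", then one key, a space character, and a value per line) is simple to
consume from a monitoring script. A minimal sketch of a parser follows; the
class name MonitorOutputParser and the decision to skip malformed lines are
assumptions for the example, not part of the patch.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical client-side helper; not part of ZooKeeper.
class MonitorOutputParser {

    // Parses "key value" lines, ignoring blank lines and "#" comments.
    static Map<String, String> parse(String output) {
        Map<String, String> stats = new LinkedHashMap<>();
        for (String line : output.split("\\r?\\n")) {
            String trimmed = line.trim();
            if (trimmed.isEmpty() || trimmed.startsWith("#")) {
                continue;
            }
            int sep = trimmed.indexOf(' ');
            if (sep < 0) {
                continue; // assumption: silently skip lines without a value
            }
            stats.put(trimmed.substring(0, sep), trimmed.substring(sep + 1).trim());
        }
        return stats;
    }

    public static void main(String[] args) {
        String sample =
            "# Total number of instances in the ensemble\n"
            + "zk_ensemble_instances_total 5\n"
            + "# Number of instances currently participating in the quorum.\n"
            + "zk_ensemble_instances_active 4\n";
        System.out.println(parse(sample));
    }
}
```

Keeping the values as strings leaves room for non-numeric keys to be added
later without changing the parser.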

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
