Re: Solitication for logging/debugging requirements

Patrick Hunt Mon, 29 Mar 2010 14:50:57 -0700

Take a look at the logging page in the docs:
http://hadoop.apache.org/zookeeper/docs/current/zookeeperInternals.html#sc_logging

Some good guidelines in there. Basically we log things at INFO level that are interesting/informational but not logged so frequently that they fill the log. WARN is for things that are bad but that we can handle (like network connectivity failure). ERROR is generally for things we don't expect and are unlikely to be able to handle. FATAL means really bad, we shut down the server. Many end users log only at WARN level or higher in production, so typically we err on the side of WARN for issues (so that we have a shot at debugging after the fact). Over time, as we gain confidence in production environments, we've been pushing more things that were WARN down to INFO.
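To make the level guidelines above concrete, here is a minimal sketch in plain Java. It is not ZooKeeper code: the class LogLevelGuide, its Level enum, and the three boolean parameters are hypothetical names chosen for illustration, and the mapping is just one reading of the rules of thumb described in this message.

```java
// Hypothetical illustration of the level guidelines; not part of ZooKeeper.
class LogLevelGuide {
    // Declared in increasing order of severity, so ordinal order can be compared.
    enum Level { INFO, WARN, ERROR, FATAL }

    // Picks a level for an event, following the rules of thumb in this message:
    // FATAL if the server must shut down, INFO if nothing is wrong,
    // WARN for a problem we can handle, ERROR for one we probably cannot.
    static Level levelFor(boolean isProblem, boolean canHandle, boolean requiresShutdown) {
        if (requiresShutdown) {
            return Level.FATAL;
        }
        if (!isProblem) {
            return Level.INFO;
        }
        return canHandle ? Level.WARN : Level.ERROR;
    }

    // True if a message at the given level is emitted under the configured threshold.
    static boolean isLogged(Level message, Level threshold) {
        return message.compareTo(threshold) >= 0;
    }

    public static void main(String[] args) {
        // A production-style threshold of WARN, as many end users configure it.
        Level threshold = Level.WARN;
        for (Level level : Level.values()) {
            System.out.println(level + " logged at threshold " + threshold
                    + ": " + isLogged(level, threshold));
        }
    }
}
```

With a WARN threshold, anything classified as INFO never reaches the log, which is the reason for erring on the side of WARN when an issue might need debugging after the fact.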

I fixed a number of JIRAs for 3.3 related to logging. In particular I cleaned up the client session logging significantly. The most fertile area right now to clean up logging is in the quorum code. That code in particular has issues wrt providing sufficient information to debug error conditions. You can easily see this by starting an ensemble of more than one machine and killing one or more of the servers. There are many places where the logging is insufficient (e.g. "got vote", which doesn't say what the vote was or what the effect of such a vote is, etc.). Having improved logging in this area would really help.
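As an illustration of the "got vote" problem, the sketch below builds a message that says who sent the vote, what it proposed, and what effect it had. It is not taken from the quorum code: the class VoteLogExample, the method describeVote, and its parameter names are all hypothetical, and the returned string would be passed to whatever logger the surrounding code already uses.

```java
// Hypothetical illustration of a more informative vote message; not ZooKeeper code.
class VoteLogExample {
    // Builds a log message that states the sender, the content of the vote,
    // and the effect the vote had on the receiving server.
    static String describeVote(long senderId, long proposedLeader,
                               long proposedZxid, boolean accepted) {
        String effect = accepted ? "adopting proposal" : "keeping current proposal";
        return String.format(
                "Got vote from server %d: proposedLeader=%d, proposedZxid=0x%x; effect: %s",
                senderId, proposedLeader, proposedZxid, effect);
    }

    public static void main(String[] args) {
        // Example values are made up for the demonstration.
        System.out.println(describeVote(2, 3, 0x100000005L, true));
    }
}
```

Compared with a bare "got vote", a line like this can be read on its own, without opening the code to work out what happened.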


Try searching on the JIRA
https://issues.apache.org/jira/browse/ZOOKEEPER
of open/closed issues re "log4j" or "logging" or "log" for further insight.

Patrick

Benjamin Reed wrote:

awesome! that would be great ivan. i'm sure pat has some more concrete suggestions, but one simple thing to do is to run the unit tests and look at the log messages that get output. there are a couple of categories of things that need to be fixed (this is in no way exhaustive):
1) messages that have useful information, but only if you look in the code to figure out what it means. there are some leader election messages that fall into this category. it would be nice to clarify them.
2) there are error messages that really aren't errors. when shutting down there are a bunch of errors that are expected, but still logged, for example.
3) misclassified error levels

welcome aboard!

ben

On 03/29/2010 10:07 AM, Ivan Kelly wrote:
Hi,

I'm going to be using Zookeeper quite extensively for a project in a
few weeks, but development hasn't kicked off yet. This means I have
some time on my hands and I'd like to get familiar with zookeeper
beforehand by perhaps writing some tools to make debugging problems
with it easier so as to save myself some time in the future. Problem
is I haven't had to debug many zookeeper problems yet, so I don't know
where the pain points are.

So, without further ado,
    - Are there any places where logging is deficient and sorely needs
improvement?
    - Could current logs be improved any amount or presented in a more
readable fashion?
    - Would some form of log visualisation be useful (for example in
something approximating a sequence diagram)?

Feel free to suggest anything which the list above doesn't allude to
which you think would be helpful.

Cheers,
Ivan
