Hi Ivan, good stuff, please add this as a comment on the jira.
On 07/16/2010 08:24 AM, Ivan Kelly wrote:
Zookeeper's traces (i.e., logs in TRACE level) provide some
information that can be helpful to understand what happened. For
instance, they contain information about the clients that are
connected, the operations issued, etc. However, in real deployments
with many clients (say, hundreds), traces are typically turned off to
avoid the high overhead that they cause. Furthermore, the data in the
traces is probably not enough for our purposes because it does not
include, e.g., the replies to operations or the data values.
As far as I've seen, this overhead comes in two forms, CPU and disk. CPU
overhead is mostly due to formatting. Disk obviously because tracing
will fill your disk fairly quickly. Perhaps something could be done to
combat both of these. To fix the formatting problem we could use a
binary log format. I've seen this done in C++ but not in java. The basic
idea is that if you have TRACE("operation %x happened to %s %p", obj1,
obj2, obj3); a preprocessor replaces this with TRACE(0x1234, obj1, obj2,
obj3) where 0x1234 is an identifier for the trace. Then when the trace
occurs a binary blob [0x1234, value of obj1, value of obj2, value of
obj3] is logged. Then when the logs are pulled of the machine you run a
post processor to do all the formatting and you get your full trace.
Regarding the disk overhead, traces are usually only interesting in the
run up to a failure. We could have a ring buffer in memory that is
constantly traced to, old traces being overwritten when the ring buffer
reaches it's limit. These traces should only be dumped to the filesystem
when an error or fatal level event occurs, thereby giving you a trace of
what was happening before you fell over.