Re: [Zeek-Dev] Proposal: Make Zeek's debug logging thread-safe

Bob Murphy Wed, 15 Jul 2020 14:58:29 -0700

> 
> On Jul 15, 2020, at 12:52 AM, Robin Sommer <[email protected]> wrote:
> Reading through this thread, I'm wondering if we should focus on
> improving identification of log lines in terms of where they come from
> and when they were generated, while keeping to go through the existing
> mechanism of sending messages back to main process for output (so that
> we don't need the mutex). If we sent timestamps & thread IDs along
> with the Debug() messages, one could later post-process debug.log to,
> get things sorted/split as desired.
> 
> This wouldn't support the use case of "millions of lines" very well,
> but I'm not convinced that's what we should be designing this for. A
> mutex becomes potentially problematic at that volume as well, and it
> also seems like a rare use case to begin with. In cases where it's
> really needed, a local patch to get logs into files directly (as you
> have done already) might just do the trick, no?
> 
> Robin


We could definitely change DebugLogger to improve the log line identification, 
and route it through the threading framework’s Debug() call. That will avoid 
turning debug.log into "word salad”.

However, that would also cause a delay in writing the log lines, and I've run 
into situations working on Zeek where that kind of delay would make debugging 
harder.

For example, sometimes I run tail on the log file in a terminal window. Then, 
when the code hits a breakpoint in a debugger, I can analyze the program state 
by looking at log lines emitted right before the breakpoint triggers, and 
compare them to variable contents, the stack trace, etc. That won't work if 
logging is delayed.

There are multiple, conflicting use cases for logging in Zeek. Sometimes a 
developer might think:
- Maximized throughput is important, but a delay is okay
- No delay can be tolerated, but slower throughput is okay
- Correct temporal ordering in the log is (or isn’t) important
- fflush() after every write is (or isn’t) important
- Debug logging output should go to the debug.log file, or stdout, or somewhere 
else

This is a pretty common situation around logging, in my experience.

One way to solve it, as Robin says, is for a developer with a use case Zeek 
doesn't support to apply a temporary local patch. Unfortunately, that doesn't 
help other developers who might have the same use case. Also, I personally hate 
to spend time writing code and getting it to work well, and then throw it away.

On other projects, I've used a different approach that's worked really well: 
use a single, common logging API, but let it send its output to different 
output mechanisms that support different use cases. Then a developer could pick 
the output mechanism that works best for their use case at runtime, using a 
command line option or environment variable. I think it wouldn’t be very 
complicated to add that to Zeek.

- Bob


_______________________________________________
Zeek-Dev mailing list
[email protected]
http://mailman.icsi.berkeley.edu/mailman/listinfo/zeek-dev

Re: [Zeek-Dev] Proposal: Make Zeek's debug logging thread-safe

Reply via email to