I've found it convenient to use an undocumented feature of SumStats:
changing the epoch. This comes in particularly handy when creating statistics
for human consumption, as it is often useful to synchronize them to a
logging interval. For example, if hourly stats are desired, it is useful
to give the first sumstat a shorter epoch that ends on the hour, so that
every subsequent sumstat triggers on the hour.
Looking into this, I realized that the epoch can be changed after the fact
if the argument to *SumStats::create* is a variable, rather than the usual
style of an anonymous record. Then, in *epoch_result* or
*epoch_finished*, the timeout for the next epoch can be recomputed on the
fly using *calc_next_rotate()*.
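A rough sketch of that idea, to show what "a variable rather than an anonymous argument" looks like in practice. The stream name "example.stream" and the SUM reducer are placeholders I picked for illustration, not something from a real script, and this only shows the shape of the attempt.

```zeek
@load base/frameworks/sumstats

# Kept in a global so the epoch_finished callback can reach the record.
global mysumstat: SumStats::SumStat;

event bro_init()
    {
    # Placeholder reducer; any stream/calculation would do here.
    local r1: SumStats::Reducer = [$stream="example.stream",
                                   $apply=set(SumStats::SUM)];

    mysumstat = [
        $name="mysumstat",
        $epoch=10 min,
        $reducers=set(r1),
        $epoch_finished(ts: time) =
            {
            # Attempt to re-align the next epoch to the 10-minute boundary.
            mysumstat$epoch = calc_next_rotate(10 min);
            }
    ];

    SumStats::create(mysumstat);
    }
```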
However, this does not work as expected, because the next epoch is scheduled
before *epoch_result* and *epoch_finished* are executed. What does work is
the following hack:
1. Create the initial sumstat with an epoch that will synchronize to the
logging interval.
2. Immediately change the epoch to the desired interval.
Example:

    # Declare the event so that bro_init() can schedule it.
    global setup_sumstat: event();

    event bro_init()
        {
        # So network_time() will be initialized...
        schedule 0 usec { setup_sumstat() };
        }

    event setup_sumstat()
        {
        ... blah ...
        local mysumstat: SumStats::SumStat;
        mysumstat = [
            $name="mysumstat",
            # calc_next_rotate() already returns the interval until the next
            # boundary, so network_time() need not be subtracted from it.
            $epoch=calc_next_rotate(10 min),
            etc...
        ];
        SumStats::create(mysumstat);
        # Now that the SumStat has been created and the initial epoch
        # scheduled, change the epoch to the regular interval for the future.
        mysumstat$epoch = 10 min;
        }

It would be convenient if the epoch could be changed in *epoch_result* or
*epoch_finished*, but that would require some changes to the internals: the
reschedule would need to take place after the results are processed, which
could throw the timing off a bit. On the other hand, unless one is interested
in exact statistics over a known time period (as I am), the small amount of
jitter probably wouldn't be noticeable or significant.
The above is horribly hackish. A different approach for accomplishing
the goal would be to allow user scripts to schedule the end of the epoch:
1. Mark *epoch* as *&optional*.
2. Expose and document *SumStats::finish_epoch* as part of the public API.
3. Make the minor changes needed to not schedule *SumStats::finish_epoch* if
*epoch* is undefined.
By not defining *epoch*, a script would indicate that it will manage epoch
timing itself. The script would schedule the first epoch based on the logging
interval and, in the *epoch_finished* function, schedule each successive
epoch to stay in sync with the logging interval.
Any comments, suggestions, etc.?
Jim