Sangjin Lee commented on YARN-3949:

bq. One question about the buffer: if for some reason the app collector has 
crashed, will this written, but unflushed data be lost?

It depends on how it crashes. The writer is owned by the 
timeline collector *manager* and may be shared by multiple (app) timeline 
collectors, so as long as the manager service stays up, the writer can still 
flush. On the other hand, if the timeline collector manager crashes without a 
chance to perform the service stop, the unflushed data could be lost.
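
To illustrate the ownership described above, here is a minimal in-memory sketch. All class and method names are hypothetical stand-ins, not the actual YARN or HBase classes; the "store" is just a list. It assumes the manager owns one buffering writer, hands it to each app collector, and flushes it on stop.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a buffering writer; not the real timeline writer API.
class SharedBufferingWriter {
    private final List<String> buffer = new ArrayList<>();
    private final List<String> store = new ArrayList<>();

    // Buffers the record only; nothing reaches the store until flush().
    synchronized void write(String record) {
        buffer.add(record);
    }

    // Moves everything buffered so far into the store.
    synchronized void flush() {
        store.addAll(buffer);
        buffer.clear();
    }

    synchronized int unflushedCount() {
        return buffer.size();
    }

    synchronized int storedCount() {
        return store.size();
    }
}

// Hypothetical per-app collector that writes through the shared writer.
class AppCollectorSketch {
    private final String appId;
    private final SharedBufferingWriter writer;

    AppCollectorSketch(String appId, SharedBufferingWriter writer) {
        this.appId = appId;
        this.writer = writer;
    }

    void putEntity(String entity) {
        writer.write(appId + ":" + entity);
    }
}

// Hypothetical manager: owns the writer, so buffered data outlives any one collector.
class CollectorManagerSketch {
    private final SharedBufferingWriter writer = new SharedBufferingWriter();

    AppCollectorSketch newCollector(String appId) {
        return new AppCollectorSketch(appId, writer);
    }

    SharedBufferingWriter getWriter() {
        return writer;
    }

    // Assumed service-stop hook: flushes whatever is still buffered.
    void stop() {
        writer.flush();
    }
}

class SharedWriterDemo {
    public static void main(String[] args) {
        CollectorManagerSketch manager = new CollectorManagerSketch();
        AppCollectorSketch app1 = manager.newCollector("app_1");
        app1.putEntity("e1");
        app1 = null; // the app collector goes away; its data is still in the manager's writer
        manager.stop();
        System.out.println("stored after stop: " + manager.getWriter().storedCount());
    }
}
```

If the manager process dies before {{stop()}} runs, the buffer contents in this sketch are simply gone, which is the loss case mentioned above.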

bq. The proposal looks good to me for now. We may need to revisit it if we'd 
like to support getting the real-time data later.

One aspect this patch does not address is writes that need to be synchronous 
from the caller's perspective, for example writes of critical application 
lifecycle events. At least in the case of the HBase writer, all writes 
are basically asynchronous. If we want to make some writes synchronous, we can 
either have the caller (timeline collector) add a {{flush()}} call after the 
{{write()}} call or provide a boolean flag in the {{write()}} method to force 
the flush. Yes, we can do that a bit later.
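
As a rough sketch of the two options, assuming a simplified writer interface (the names {{BufferingWriter}}, {{InMemoryBufferingWriter}}, and the {{sync}} parameter are hypothetical and not the actual timeline writer API):

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, simplified writer interface; the real timeline writer differs.
interface BufferingWriter {
    void write(String entity);   // asynchronous: only buffers
    void flush();                // pushes buffered data to the backing store
}

// In-memory stand-in for a writer that batches puts.
class InMemoryBufferingWriter implements BufferingWriter {
    private final List<String> buffer = new ArrayList<>();
    private final List<String> store = new ArrayList<>();

    @Override
    public synchronized void write(String entity) {
        buffer.add(entity);
    }

    // Option 2: a boolean flag on write() that forces the flush.
    public synchronized void write(String entity, boolean sync) {
        write(entity);
        if (sync) {
            flush();
        }
    }

    @Override
    public synchronized void flush() {
        store.addAll(buffer);
        buffer.clear();
    }

    public synchronized int bufferedCount() {
        return buffer.size();
    }

    public synchronized int storedCount() {
        return store.size();
    }
}

class SyncWriteDemo {
    public static void main(String[] args) {
        InMemoryBufferingWriter writer = new InMemoryBufferingWriter();

        // Ordinary write: stays in the buffer until something flushes it.
        writer.write("metric");

        // Option 1: the caller follows write() with an explicit flush().
        writer.write("app-created-event");
        writer.flush();

        // Option 2: the caller asks write() itself to flush.
        writer.write("app-finished-event", true);

        System.out.println("buffered=" + writer.bufferedCount()
            + " stored=" + writer.storedCount());
    }
}
```

Note that in both options the flush also pushes out any other data buffered at that point, since the buffer is shared; the flag variant just keeps that decision inside the writer rather than in every caller.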

> ensure timely flush of timeline writes
> --------------------------------------
>                 Key: YARN-3949
>                 URL: https://issues.apache.org/jira/browse/YARN-3949
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: timelineserver
>    Affects Versions: YARN-2928
>            Reporter: Sangjin Lee
>            Assignee: Sangjin Lee
>         Attachments: YARN-3949-YARN-2928.001.patch
> Currently flushing of timeline writes is not really handled. For example, 
> {{HBaseTimelineWriterImpl}} relies on HBase's {{BufferedMutator}} to batch 
> and write puts asynchronously. However, {{BufferedMutator}} may not flush 
> them to HBase unless the internal buffer fills up.
> We do need a flush functionality first to ensure that data are written in a 
> reasonably timely manner, and to be able to ensure some critical writes are 
> done synchronously (e.g. key lifecycle events).

This message was sent by Atlassian JIRA
