[ 
https://issues.apache.org/jira/browse/YARN-3949?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14640799#comment-14640799
 ] 

Junping Du commented on YARN-3949:
----------------------------------

Thanks for your input, [~jrottinghuis]! 
I agree the current API (write() + flush()) is simple and flexible to use. 
However, I was thinking that an easier-to-use alternative could be something 
like write_through() or write_back() (or write_sync() or write_async()). The 
writer would provide different semantics to the caller, and the caller would 
not have to know the details of when and how to flush(). So this sounds like a 
classic trade-off between flexibility and simplicity. To be clear, I am not 
against the current design; I just want to propose another way in case we have 
other callers (that may not inherit from TimelineCollectorManager) in the 
future. I am fine with deferring that refactoring work until then. 
About the latest patch: it mostly looks good, except that we should document 
the new configuration 
"YarnConfiguration.TIMELINE_SERVICE_WRITER_FLUSH_INTERVAL_SECONDS" in 
yarn-default.xml and put a description there. Also, the default value for this 
new configuration should be defined in YarnConfiguration to conform with 
current code conventions.
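
To make the write_sync/write_async idea concrete, here is a rough sketch of a 
thin facade over a write() + flush() style writer. All names in it 
(SketchTimelineWriter, InMemoryWriter, WriteSemanticsFacade, writeSync, 
writeAsync) are hypothetical and are not taken from the patch or from the 
actual TimelineWriter interface; the in-memory writer exists only so the 
example runs on its own.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for a writer that exposes write() + flush(). */
interface SketchTimelineWriter {
    void write(String entity); // may only buffer the data
    void flush();              // pushes buffered data to the backend
}

/** In-memory writer used only to make this sketch self-contained. */
final class InMemoryWriter implements SketchTimelineWriter {
    private final List<String> buffer = new ArrayList<>();
    private final List<String> stored = new ArrayList<>();

    @Override
    public synchronized void write(String entity) {
        buffer.add(entity);
    }

    @Override
    public synchronized void flush() {
        stored.addAll(buffer);
        buffer.clear();
    }

    /** Number of writes that have been buffered but not yet flushed. */
    synchronized int pending() {
        return buffer.size();
    }

    /** Copy of everything that has reached the (simulated) backend. */
    synchronized List<String> stored() {
        return new ArrayList<>(stored);
    }
}

/** Facade that hides flush() behind two write semantics. */
final class WriteSemanticsFacade {
    private final SketchTimelineWriter delegate;

    WriteSemanticsFacade(SketchTimelineWriter delegate) {
        this.delegate = delegate;
    }

    /** Write-through: the data (and anything buffered earlier) is flushed before returning. */
    void writeSync(String entity) {
        delegate.write(entity);
        delegate.flush();
    }

    /** Write-back: the data is only buffered; a later flush makes it durable. */
    void writeAsync(String entity) {
        delegate.write(entity);
    }
}
```

A caller would use writeSync() for critical writes such as key lifecycle 
events and writeAsync() for everything else, without ever calling flush() 
directly. Note that in this sketch writeSync() also flushes any earlier 
asynchronous writes, which may or may not be the semantics we want.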

> ensure timely flush of timeline writes
> --------------------------------------
>
>                 Key: YARN-3949
>                 URL: https://issues.apache.org/jira/browse/YARN-3949
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: timelineserver
>    Affects Versions: YARN-2928
>            Reporter: Sangjin Lee
>            Assignee: Sangjin Lee
>         Attachments: YARN-3949-YARN-2928.001.patch, 
> YARN-3949-YARN-2928.002.patch, YARN-3949-YARN-2928.002.patch
>
>
> Currently flushing of timeline writes is not really handled. For example, 
> {{HBaseTimelineWriterImpl}} relies on HBase's {{BufferedMutator}} to batch 
> and write puts asynchronously. However, {{BufferedMutator}} may not flush 
> them to HBase unless the internal buffer fills up.
> We do need a flush functionality first to ensure that data are written in a 
> reasonably timely manner, and to be able to ensure some critical writes are 
> done synchronously (e.g. key lifecycle events).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
