[ https://issues.apache.org/jira/browse/YARN-6376?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15949860#comment-15949860 ]
Hudson commented on YARN-6376:
------------------------------
SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #11503 (See
[https://builds.apache.org/job/Hadoop-trunk-Commit/11503/])
YARN-6376. Exceptions caused by synchronous putEntities requests can be swallowed
(varunsaxena: rev b58777a9c9a5b6f2e4bcfd2b3bede33f25f80dec)
* (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-timelineservice/src/main/java/org/apache/hadoop/yarn/server/timelineservice/collector/TimelineCollector.java
* (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-timelineservice/src/main/java/org/apache/hadoop/yarn/server/timelineservice/collector/TimelineCollectorManager.java
> Exceptions caused by synchronous putEntities requests can be swallowed
> ----------------------------------------------------------------------
>
> Key: YARN-6376
> URL: https://issues.apache.org/jira/browse/YARN-6376
> Project: Hadoop YARN
> Issue Type: Sub-task
> Components: ATSv2
> Affects Versions: 3.0.0-alpha2
> Reporter: Haibo Chen
> Assignee: Haibo Chen
> Priority: Critical
> Labels: yarn-5355-merge-blocker
> Fix For: YARN-5355, YARN-5355-branch-2, 3.0.0-alpha3
>
> Attachments: YARN-6376.00.patch
>
>
> TimelineCollector.putEntities() is currently implemented by calling
> TimelineWriter.write() followed by TimelineWriter.flush(). Because
> HBaseTimelineWriter.write() is an asynchronous operation, TimelineClient can
> send a synchronous putEntities() request for critical data and never get back
> an exception, even though the HBase write request to store the entities may
> have failed.
> This is due to a race condition between the WriterFlushThread in
> TimelineCollectorManager and the web threads handling synchronous
> putEntities() requests. The web thread first puts the entities into the
> buffer, but before it invokes writer.flush(), the WriterFlushThread may fire
> and flush the writer. If the entities are not successfully written to the
> backend during that flush, the WriterFlushThread simply logs an error, and
> the web thread never gets an exception from its own writer.flush()
> invocation. This is bad because the reason TimelineClient sends
> putEntities() synchronously is to retry upon any exception.
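
The race described above can be replayed deterministically in a small,
self-contained sketch. Everything in it is hypothetical: BufferingWriter,
backgroundFlush() and the two put methods are stand-ins invented for
illustration, not the Hadoop TimelineWriter API and not the committed patch.
The sketch assumes that write() only buffers and that flush() reports a
backend failure to whichever caller happens to drain the buffer. One possible
remedy, shown in the second method, is to make write-then-flush a single
critical section on the writer.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

public class FlushRaceSketch {

  /** Hypothetical stand-in for an asynchronous, buffering writer. */
  static class BufferingWriter {
    private final List<String> buffer = new ArrayList<>();
    private final boolean backendHealthy;

    BufferingWriter(boolean backendHealthy) {
      this.backendHealthy = backendHealthy;
    }

    // write() only buffers; any backend failure surfaces later, in flush().
    synchronized void write(String entity) {
      buffer.add(entity);
    }

    // flush() drains the buffer and reports a failure only to its own caller.
    synchronized void flush() throws IOException {
      if (buffer.isEmpty()) {
        return; // nothing pending, so nothing can fail
      }
      buffer.clear();
      if (!backendHealthy) {
        throw new IOException("backend write failed");
      }
    }
  }

  /** What a periodic flusher thread would do: log the error and move on. */
  static void backgroundFlush(BufferingWriter writer) {
    try {
      writer.flush();
    } catch (IOException e) {
      System.err.println("flush failed: " + e.getMessage());
    }
  }

  /** Unsafe put: the flusher is replayed between write() and flush(). */
  static boolean unsafePutSeesFailure(BufferingWriter writer) {
    writer.write("entity");
    backgroundFlush(writer); // simulated interleaving with the flusher
    try {
      writer.flush(); // buffer is already empty, so no exception here
      return false;
    } catch (IOException e) {
      return true;
    }
  }

  /** Safer put: write and flush under the same lock the flusher needs. */
  static boolean safePutSeesFailure(BufferingWriter writer) {
    synchronized (writer) {
      writer.write("entity");
      // flush() is synchronized on the writer, so a flusher on another
      // thread cannot drain the buffer between these two calls.
      try {
        writer.flush();
        return false;
      } catch (IOException e) {
        return true;
      }
    }
  }

  public static void main(String[] args) {
    System.out.println("unsafe saw failure: "
        + unsafePutSeesFailure(new BufferingWriter(false)));
    System.out.println("safe saw failure: "
        + safePutSeesFailure(new BufferingWriter(false)));
  }
}
```

With a failing backend, the unsafe path reports no failure to the caller
because the simulated flusher already consumed (and only logged) the
exception, while the locked path surfaces it so the caller can retry.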