[
https://issues.apache.org/jira/browse/YARN-4061?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15592611#comment-15592611
]
Joep Rottinghuis commented on YARN-4061:
----------------------------------------
You do bring up an interesting question [~gtCarrera9], and that is what happens
if the timeline collector / writer is down. In the current implementation this
would occur when the nodemanager is down (for example, while it is being
restarted). Once collectors become dedicated / separate per-application
containers, something similar can happen. The clients will time out and will
have to retry.
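To make that retry behavior concrete, here is a minimal sketch of capped
exponential backoff around a put that may time out while the collector is
down. The PutCall interface and the specific limits are illustrative
assumptions, not the actual YARN client API.
{code:java}
import java.io.IOException;

public class RetryingPut {
  // Hypothetical stand-in for a timeline put; not the YARN client API.
  @FunctionalInterface
  interface PutCall { void run() throws IOException; }

  static void putWithRetries(PutCall put) throws IOException, InterruptedException {
    long backoffMs = 100;              // initial wait between attempts
    final long maxBackoffMs = 30_000;  // cap so waits stay bounded
    final int maxAttempts = 10;
    for (int attempt = 1; ; attempt++) {
      try {
        put.run();                     // times out while the collector is down
        return;                        // success, stop retrying
      } catch (IOException timeout) {
        if (attempt == maxAttempts) {
          throw timeout;               // exhausted retries, surface the error
        }
        Thread.sleep(backoffMs);
        backoffMs = Math.min(backoffMs * 2, maxBackoffMs);
      }
    }
  }
}
{code}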
I think the concern you are raising here is what happens to data buffered in
memory in the collector before it is written to HBase or spooled to disk (or
HDFS). Even in the HDFS case there will be buffering.
The current TimelineWriter interface covers this by assuming that all writes
are buffered and by providing an explicit flush call that pushes all
previously buffered data to permanent storage. For the spooling-to-HDFS case,
that would mean we'd have to do an hsync/flush there as well.
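As a rough illustration of that contract (simplified placeholder types, not
the real TimelineWriter signatures): write() may leave data in memory, and an
explicit flush() is the only durability point.
{code:java}
import java.io.Closeable;
import java.io.IOException;

// Simplified sketch of the buffered-write contract described above.
interface BufferedTimelineWriter extends Closeable {
  // May buffer in memory; durability is only guaranteed after flush().
  void write(String entityJson) throws IOException;

  // Push all previously buffered writes to permanent storage. For an
  // HDFS spool this would translate into an hsync() on the open file.
  void flush() throws IOException;
}
{code}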
This JIRA is really mainly focused on what happens if we cannot persist
data to the distributed back-end system (HBase in the current implementation).
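To sketch what such a fault-tolerant writer could look like: when the put to
the back-end fails, spool the record to HDFS and hsync() it so it survives a
collector crash. The Backend interface, spool path, and record framing are
assumptions for illustration; only the FileSystem / FSDataOutputStream /
hsync() calls are real HDFS API.
{code:java}
import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class SpoolingWriter {
  // Hypothetical stand-in for the distributed back-end (e.g. HBase).
  interface Backend { void put(byte[] record) throws IOException; }

  private final Backend backend;             // primary store, may be down
  private final FSDataOutputStream spool;    // HDFS spool file fallback

  public SpoolingWriter(Backend backend, Configuration conf) throws IOException {
    this.backend = backend;
    FileSystem fs = FileSystem.get(conf);
    this.spool = fs.create(new Path("/tmp/timeline-spool/records.log"));
  }

  public void write(byte[] record) throws IOException {
    try {
      backend.put(record);                   // normal path
    } catch (IOException backendDown) {
      // Record framing / later replay into the back-end omitted for brevity.
      spool.write(record);                   // fall back to the spool
      spool.hsync();                         // durable despite collector failure
    }
  }
}
{code}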
> [Fault tolerance] Fault tolerant writer for timeline v2
> -------------------------------------------------------
>
> Key: YARN-4061
> URL: https://issues.apache.org/jira/browse/YARN-4061
> Project: Hadoop YARN
> Issue Type: Sub-task
> Components: timelineserver
> Reporter: Li Lu
> Assignee: Joep Rottinghuis
> Labels: YARN-5355
> Attachments: FaulttolerantwriterforTimelinev2.pdf
>
>
> We need to build a timeline writer that can be resistant to backend storage
> downtime and timeline collector failures.