[
https://issues.apache.org/jira/browse/YARN-1530?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14127403#comment-14127403
]
bc Wong commented on YARN-1530:
-------------------------------
bq. The current writing channel allows the data to be available on the timeline
server immediately
Let's have reliability before speed. I think one of the requirements of ATS is:
*The channel for writing events should be reliable.*
I'm using *reliable* here in a strong sense, not the TCP-best-effort style
reliability. HDFS is reliable. Kafka is reliable. (They are also scalable and
robust.) A normal RPC connection is not. I don't want the ATS to be able to
slow down my writes, and therefore, my applications, at all. For example, an
ATS failover shouldn't pause all my applications for N seconds. A direct RPC to
the ATS for writing seems a poor choice in general.
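To make the decoupling concrete, here is a minimal sketch of what I mean, not a proposed implementation. The application only appends events to a durable local log and never talks to the ATS; a separate forwarder drains the log when the ATS is reachable. The names {{SpoolingWriter}}, {{drain}} and the {{send}} callback are hypothetical, and a local file stands in for whatever durable store is actually used.

```python
import json
import os


class SpoolingWriter:
    """Hypothetical writer: appends events to a local log, never calls the ATS."""

    def __init__(self, path):
        self.path = path

    def write(self, event):
        # One JSON document per line; fsync so the event survives a crash.
        with open(self.path, "a", encoding="utf-8") as f:
            f.write(json.dumps(event) + "\n")
            f.flush()
            os.fsync(f.fileno())


def drain(path, send, offset=0):
    """Forward events starting at byte `offset`.

    `send` is a hypothetical callable that delivers one event to the ATS and
    raises ConnectionError when the ATS is unavailable. Returns the byte
    offset of the first event that has not been delivered yet.
    """
    with open(path, "rb") as f:
        f.seek(offset)
        while True:
            line = f.readline()
            if not line:
                break  # caught up with the writer
            try:
                send(json.loads(line))
            except ConnectionError:
                break  # ATS is down: keep the offset and retry later
            offset = f.tell()
    return offset
```

With this shape, an ATS outage only delays {{drain}}. The cost of {{write}} depends on the local store alone, so a failover does not pause the application.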
Yes, you could build a distributed, reliable, scalable "ATS service" to accept
written events. But that seems like a lot of work when we can leverage existing
technologies.
If the channel itself is pluggable, then we have lots of options. Kafka is a
very good choice, for sites that already deploy Kafka and know how to operate
it. Using HDFS as a channel is also a good default implementation, since people
already know how to scale and manage HDFS. Embedding a Kafka broker with each
ATS daemon is also an option, if we're ok with that dependency.
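A rough sketch of what "pluggable" could look like, under my own assumptions: the interface name {{TimelineChannel}}, the {{channel}} config key and both implementations below are hypothetical. The file-backed one stands in for an HDFS-style channel and the in-memory one for a Kafka-style log; neither uses the real HDFS or Kafka client APIs.

```python
import json
import os
from abc import ABC, abstractmethod


class TimelineChannel(ABC):
    """Hypothetical plugin point: an append-only log of timeline events."""

    @abstractmethod
    def publish(self, event):
        """Durably append one event (a dict)."""

    @abstractmethod
    def read_from(self, offset):
        """Return all events at record index >= offset."""


class InMemoryChannel(TimelineChannel):
    """Stand-in for a broker-backed channel such as a Kafka topic."""

    def __init__(self):
        self._log = []

    def publish(self, event):
        self._log.append(dict(event))

    def read_from(self, offset):
        return self._log[offset:]


class JsonLinesFileChannel(TimelineChannel):
    """Stand-in for a filesystem-backed channel such as a file on HDFS."""

    def __init__(self, path):
        self._path = path

    def publish(self, event):
        with open(self._path, "a", encoding="utf-8") as f:
            f.write(json.dumps(event) + "\n")

    def read_from(self, offset):
        if not os.path.exists(self._path):
            return []
        with open(self._path, encoding="utf-8") as f:
            return [json.loads(line) for line in f][offset:]


# Hypothetical registry: the site picks a channel by configuration.
CHANNELS = {
    "memory": lambda conf: InMemoryChannel(),
    "file": lambda conf: JsonLinesFileChannel(conf["path"]),
}


def create_channel(conf):
    kind = conf.get("channel", "file")
    if kind not in CHANNELS:
        raise ValueError("unknown timeline channel: %r" % kind)
    return CHANNELS[kind](conf)
```

Writers and the ATS reader would depend only on the interface, so a site that already runs Kafka and a site that only runs HDFS could each configure the channel they know how to operate.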
> [Umbrella] Store, manage and serve per-framework application-timeline data
> --------------------------------------------------------------------------
>
> Key: YARN-1530
> URL: https://issues.apache.org/jira/browse/YARN-1530
> Project: Hadoop YARN
> Issue Type: Bug
> Reporter: Vinod Kumar Vavilapalli
> Attachments: ATS-Write-Pipeline-Design-Proposal.pdf,
> ATS-meet-up-8-28-2014-notes.pdf, application timeline design-20140108.pdf,
> application timeline design-20140116.pdf, application timeline
> design-20140130.pdf, application timeline design-20140210.pdf
>
>
> This is a sibling JIRA for YARN-321.
> Today, each application/framework has to store and serve per-framework
> data all by itself as YARN doesn't have a common solution. This JIRA attempts
> to solve the storage, management and serving of per-framework data from
> various applications, both running and finished. The aim is to change YARN to
> collect and store data in a generic manner with plugin points for frameworks
> to do their own thing w.r.t interpretation and serving.