[ 
https://issues.apache.org/jira/browse/YARN-1530?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14127403#comment-14127403
 ] 

bc Wong commented on YARN-1530:
-------------------------------

bq. The current writing channel allows the data to be available on the timeline 
server immediately

Let's have reliability before speed. I think one of the requirements of ATS is: 
*The channel for writing events should be reliable.*

I'm using *reliable* here in a strong sense, not the best-effort reliability 
that TCP gives you. HDFS is reliable. Kafka is reliable. (They are also scalable and 
robust.) A normal RPC connection is not. I don't want the ATS to be able to 
slow down my writes, and therefore, my applications, at all. For example, an 
ATS failover shouldn't pause all my applications for N seconds. A direct RPC to 
the ATS for writing seems a poor choice in general.

Yes, you could build a distributed, reliable, scalable "ATS service" to accept 
write events. But that seems like a lot of work when we can leverage existing 
technologies instead.

If the channel itself is pluggable, then we have lots of options. Kafka is a 
very good choice, for sites that already deploy Kafka and know how to operate 
it. Using HDFS as a channel is also a good default implementation, since people 
already know how to scale and manage HDFS. Embedding a Kafka broker with each 
ATS daemon is also an option, if we're ok with that dependency.
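
To make the "pluggable channel" idea concrete, here is a minimal sketch. Every name in it (TimelineEventChannel, BufferedChannel, ChannelDemo) is hypothetical and not part of any existing YARN or ATS API. The in-memory queue only stands in for a durable log such as HDFS or Kafka, so it shows the decoupling of writer and reader rather than real reliability: the writer never waits on the ATS, and the ATS drains the channel whenever it is ready.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

/** Hypothetical plug-in point: one implementation per transport (HDFS, Kafka, ...). */
interface TimelineEventChannel {
    /** Hands an event to the channel. Must return quickly and never wait on the ATS. */
    boolean offer(String event);
}

/** Toy in-memory channel (an assumption for illustration; a real one would be durable). */
final class BufferedChannel implements TimelineEventChannel {
    private final BlockingQueue<String> buffer;

    BufferedChannel(int capacity) {
        this.buffer = new ArrayBlockingQueue<>(capacity);
    }

    @Override
    public boolean offer(String event) {
        // Non-blocking: returns false when the buffer is full instead of stalling the writer.
        return buffer.offer(event);
    }

    /** Called by the reader side (the ATS) whenever it is available, e.g. after a failover. */
    List<String> drain() {
        List<String> out = new ArrayList<>();
        buffer.drainTo(out);
        return out;
    }
}

final class ChannelDemo {
    public static void main(String[] args) {
        BufferedChannel channel = new BufferedChannel(1024);

        // The application writes events without ever talking to the ATS directly.
        channel.offer("app_0001 STARTED");
        channel.offer("app_0001 FINISHED");

        // Later, the ATS picks up whatever has accumulated in the channel.
        for (String event : channel.drain()) {
            System.out.println("ATS stored: " + event);
        }
    }
}
```

With an interface like this, a site that runs Kafka could plug in a Kafka-backed implementation, and an HDFS-backed one could be the default, without the application-side code changing.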

> [Umbrella] Store, manage and serve per-framework application-timeline data
> --------------------------------------------------------------------------
>
>                 Key: YARN-1530
>                 URL: https://issues.apache.org/jira/browse/YARN-1530
>             Project: Hadoop YARN
>          Issue Type: Bug
>            Reporter: Vinod Kumar Vavilapalli
>         Attachments: ATS-Write-Pipeline-Design-Proposal.pdf, 
> ATS-meet-up-8-28-2014-notes.pdf, application timeline design-20140108.pdf, 
> application timeline design-20140116.pdf, application timeline 
> design-20140130.pdf, application timeline design-20140210.pdf
>
>
> This is a sibling JIRA for YARN-321.
> Today, each application/framework has to store and serve per-framework 
> data all by itself as YARN doesn't have a common solution. This JIRA attempts 
> to solve the storage, management and serving of per-framework data from 
> various applications, both running and finished. The aim is to change YARN to 
> collect and store data in a generic manner with plugin points for frameworks 
> to do their own thing w.r.t interpretation and serving.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
