[jira] [Commented] (YARN-1530) [Umbrella] Store, manage and serve per-framework application-timeline data

Patrick Wendell (JIRA) Mon, 17 Feb 2014 14:59:29 -0800

    [ 
https://issues.apache.org/jira/browse/YARN-1530?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13903589#comment-13903589
 ]


Patrick Wendell commented on YARN-1530:
---------------------------------------

Hey,

Thanks for the explanation! To make sure I understand how this would all work, 
let me walk through an example.

For the Spark UI we are currently implementing the ability to serialize and 
write events to HDFS, then load them later from a history server that can 
render the UI for jobs that are finished. AFAIK this is basically how MapReduce 
works as well (?)

If users have set up a YARN cluster and set up event ingestion to this shared 
store, then Spark would need two things to integrate with it:

1. Be able to represent our events in JSON and hook into whatever source the 
user has set up for ingestion (flume, HDFS, etc).
2. Be able to render our history timeline UI by reading event data from this 
store.

Correct?
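
To make those two steps concrete, here is a rough sketch of what I have in 
mind. Everything in it is made up for illustration: the Event fields, the 
function names, and the in-memory list standing in for the shared store. It is 
not the timeline API from the design doc and not Spark's actual event format.

```python
import json
from dataclasses import dataclass, asdict


@dataclass
class Event:
    # Hypothetical event shape, not Spark's real listener events.
    app_id: str
    event_type: str
    timestamp_ms: int
    info: dict


def to_json_line(event: Event) -> str:
    """Step 1: represent an event as one JSON line for ingestion."""
    return json.dumps(asdict(event), sort_keys=True)


def load_timeline(lines, app_id):
    """Step 2: read one application's events back, ordered by timestamp."""
    events = [json.loads(line) for line in lines]
    return sorted(
        (e for e in events if e["app_id"] == app_id),
        key=lambda e: e["timestamp_ms"],
    )


# An in-memory list stands in for whatever store the user has set up.
store = [
    to_json_line(Event("app_1", "JobEnd", 2000, {"job": 0})),
    to_json_line(Event("app_1", "JobStart", 1000, {"job": 0})),
    to_json_line(Event("app_2", "JobStart", 1500, {"job": 0})),
]

# The history UI would render from this ordered list.
timeline = load_timeline(store, "app_1")
for e in timeline:
    print(e["timestamp_ms"], e["event_type"])
```

In this picture the framework only owns the event schema and the rendering; 
the transport and the indexed lookup by application would come from the shared 
store.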

The benefit would be that if users set up something fancy like flume, they could 
leverage the same infrastructure for Spark as for other applications since 
there is a shared event model. Also, they would benefit from faster indexed 
serving offered by this application when rendering the "history" UI... 

Is that the main idea? I'm just trying to figure out what redundant work is 
saved by having a generic framework, since each application writes its own UI 
and has its own event model. From what I can tell, the benefit is that a 
shared ingestion and serving infrastructure can be used. 

> [Umbrella] Store, manage and serve per-framework application-timeline data
> --------------------------------------------------------------------------
>
>                 Key: YARN-1530
>                 URL: https://issues.apache.org/jira/browse/YARN-1530
>             Project: Hadoop YARN
>          Issue Type: Bug
>            Reporter: Vinod Kumar Vavilapalli
>         Attachments: application timeline design-20140108.pdf, application 
> timeline design-20140116.pdf, application timeline design-20140130.pdf, 
> application timeline design-20140210.pdf
>
>
> This is a sibling JIRA for YARN-321.
> Today, each application/framework has to store and serve per-framework 
> data all by itself as YARN doesn't have a common solution. This JIRA attempts 
> to solve the storage, management and serving of per-framework data from 
> various applications, both running and finished. The aim is to change YARN to 
> collect and store data in a generic manner with plugin points for frameworks 
> to do their own thing w.r.t interpretation and serving.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)
