[ 
https://issues.apache.org/jira/browse/YARN-321?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13706645#comment-13706645
 ] 

Hitesh Shah commented on YARN-321:
----------------------------------

{quote}
To start with, we will have an implementation with per-app HDFS file.
{quote}

[~vinodkv] Based on the above, it seems this will only let someone analyse one 
job at a time. With a per-app file, it will be non-trivial to search for 
applications that match given criteria: all jobs that ran on a certain day, 
all jobs of a certain type, all jobs that took longer than 10 minutes to run, 
or all jobs that used over 100 containers. Sure, a directory hierarchy based 
on dates may solve the very basic use cases, but won't anyone who needs 
slightly more complex analysis of cluster utilization have to build an 
indexing layer on top of the file-based store?
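
To make the concern concrete, here is a minimal sketch, not the proposed YARN-321 design. It assumes one JSON file per application in a local temp directory as a stand-in for per-app HDFS files. The field names (app_id, app_type, start_day, duration_s, containers, events) and the helper names are hypothetical. Without an index, every cross-application query has to open and parse every file; the small summary index is the extra layer someone would have to build.

```python
import json
import tempfile
from pathlib import Path

# Hypothetical per-application history records; all field names are assumptions.
SAMPLE_APPS = [
    {"app_id": "app_0001", "app_type": "MAPREDUCE", "start_day": "2013-07-11",
     "duration_s": 1200, "containers": 150, "events": ["..."]},
    {"app_id": "app_0002", "app_type": "MAPREDUCE", "start_day": "2013-07-11",
     "duration_s": 300, "containers": 20, "events": ["..."]},
    {"app_id": "app_0003", "app_type": "OTHER", "start_day": "2013-07-12",
     "duration_s": 900, "containers": 250, "events": ["..."]},
    {"app_id": "app_0004", "app_type": "OTHER", "start_day": "2013-07-12",
     "duration_s": 60, "containers": 500, "events": ["..."]},
]


def write_app_history(root, app):
    """Write one history file per application (stand-in for a per-app HDFS file)."""
    (root / (app["app_id"] + ".json")).write_text(json.dumps(app))


def scan_apps(root, predicate):
    """Answer a query by opening and parsing every per-app file."""
    matches = []
    for path in sorted(root.glob("*.json")):
        record = json.loads(path.read_text())
        if predicate(record):
            matches.append(record["app_id"])
    return matches


def build_index(root):
    """One pass that copies only the queryable fields into a summary index."""
    keys = ("app_id", "app_type", "start_day", "duration_s", "containers")
    index = []
    for path in sorted(root.glob("*.json")):
        record = json.loads(path.read_text())
        index.append({key: record[key] for key in keys})
    return index


def slow_and_large(record):
    """Jobs that took longer than 10 minutes and used over 100 containers."""
    return record["duration_s"] > 600 and record["containers"] > 100


def demo():
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        for app in SAMPLE_APPS:
            write_app_history(root, app)
        scanned = scan_apps(root, slow_and_large)   # touches every file
        index = build_index(root)                   # built once, then reused
        indexed = [r["app_id"] for r in index if slow_and_large(r)]
        return scanned, indexed
```

Both paths return the same applications, but the scan cost grows with the number of files for every query, while the index has to be kept in sync with the store.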



                
> Generic application history service
> -----------------------------------
>
>                 Key: YARN-321
>                 URL: https://issues.apache.org/jira/browse/YARN-321
>             Project: Hadoop YARN
>          Issue Type: Improvement
>            Reporter: Luke Lu
>            Assignee: Vinod Kumar Vavilapalli
>
> The mapreduce job history server currently needs to be deployed as a trusted 
> server in sync with the mapreduce runtime. Every new application would need a 
> similar application history server. Having to deploy O(T*V) (where T is the 
> number of application types and V is the number of application versions) 
> trusted servers is clearly not scalable.
> Job history storage handling itself is pretty generic: move the logs and 
> history data into a particular directory for later serving. Job history data 
> is already stored as JSON (or binary Avro). I propose that we create only one 
> trusted application history server, which can also have a generic UI 
> (displaying JSON as a tree of strings). Specific applications/versions can 
> deploy untrusted webapps (a la AMs) to query the application history server 
> and interpret the JSON for their specific UI and/or analytics.
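
As an illustration of the "display json as a tree of strings" idea in the description, here is a minimal sketch. The function name render_tree and the sample document are hypothetical and are not part of any existing history server.

```python
import json


def render_tree(value, name="root", depth=0):
    """Return indented lines that show a parsed JSON value as a tree of strings."""
    pad = "  " * depth
    if isinstance(value, dict):
        lines = [pad + name]
        for key, child in value.items():
            lines.extend(render_tree(child, str(key), depth + 1))
        return lines
    if isinstance(value, list):
        lines = [pad + name]
        for i, child in enumerate(value):
            lines.extend(render_tree(child, "[%d]" % i, depth + 1))
        return lines
    # Scalars (strings, numbers, booleans, null) become leaf lines.
    return ["%s%s: %s" % (pad, name, value)]


# Hypothetical history document, used only for illustration.
doc = json.loads('{"job": {"id": "job_1", "tasks": [1, 2]}}')
tree_lines = render_tree(doc)
```

A generic UI of this kind needs no knowledge of the application type; anything type-specific is left to the untrusted webapps.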

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
