[jira] [Commented] (YARN-321) Generic application history service

Karthik Kambatla (JIRA) Mon, 15 Jul 2013 16:47:43 -0700

    [ https://issues.apache.org/jira/browse/YARN-321?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13709192#comment-13709192 ]


Karthik Kambatla commented on YARN-321:
---------------------------------------

A few other considerations:

bq. Running as service: By default, ApplicationHistoryService will be embedded 
inside ResourceManager but will be independent enough to run as a separate 
service for scaling purposes.
Is there a reason to embed this inside the RM? I don't know whether there were 
reasons for keeping the JHS separate, other than it being MR-specific. If there 
were, embedding the AHS would go against them, no?
That said, I agree it will be easier for the user if the AHS starts along with 
the RM. Maybe that should be configurable and turned on by default?
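
To make the "configurable, on by default" idea concrete, here is a minimal sketch of the decision the RM could make at startup. The class name and the property key are made up for illustration; neither is an existing YARN configuration key or API.

```java
import java.util.Properties;

/** Hypothetical helper: decides whether the AHS is started inside the RM. */
final class AhsLaunchPolicy {
    // Assumed property name for illustration only, not a real YARN key.
    static final String EMBED_KEY = "example.ahs.embedded-in-rm";

    /** Embedded by default; an explicit "false" means the AHS runs as its own service. */
    static boolean startEmbedded(Properties conf) {
        return Boolean.parseBoolean(conf.getProperty(EMBED_KEY, "true"));
    }
}
```

With nothing set, the RM would start the AHS itself; a deployment that wants to scale the AHS separately would set the property to false and launch it as its own daemon.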

bq. Hosting/serving per-framework data is out of scope for this JIRA. 
I understand and agree that it makes sense not to complicate this JIRA. However, 
during the design it would be nice to outline, at least at a high level, how 
the "plugins" can work. For the plugins to serve application-specific 
information (e.g. MapReduce counters), I suspect the RM should write that 
information in addition to the generic YARN information about the application. 
On completion, can we leave a provision for the AM to write a JSON blob (maybe 
via the RM) to {{HistoryStorage}}? And in the AHS, can we leave a provision for 
app "plugins" to access and use this information to render application 
specifics?
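
A rough sketch of how such a provision might look, purely to anchor the discussion. Only the name {{HistoryStorage}} comes from the proposal; every method, the StoredBlob, AppHistoryPlugin and PluginRegistry types, and the in-memory store are assumptions made up for this example, not an existing YARN API.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

/** Hypothetical view of HistoryStorage: holds the opaque blob an AM writes on completion. */
interface HistoryStorage {
    void putAppBlob(String applicationId, String applicationType, String jsonBlob);
    Optional<StoredBlob> getAppBlob(String applicationId);
}

/** The blob plus the application type used to pick a plugin. */
final class StoredBlob {
    final String applicationType;
    final String json;

    StoredBlob(String applicationType, String json) {
        this.applicationType = applicationType;
        this.json = json;
    }
}

/** In-memory stand-in for whatever backing store the real service would use. */
final class InMemoryHistoryStorage implements HistoryStorage {
    private final Map<String, StoredBlob> blobs = new HashMap<>();

    @Override
    public void putAppBlob(String applicationId, String applicationType, String jsonBlob) {
        blobs.put(applicationId, new StoredBlob(applicationType, jsonBlob));
    }

    @Override
    public Optional<StoredBlob> getAppBlob(String applicationId) {
        return Optional.ofNullable(blobs.get(applicationId));
    }
}

/** Hypothetical per-framework plugin: interprets a blob the AHS itself treats as opaque. */
interface AppHistoryPlugin {
    String render(String jsonBlob);
}

/** Hypothetical AHS-side registry that dispatches on application type. */
final class PluginRegistry {
    private final Map<String, AppHistoryPlugin> plugins = new HashMap<>();

    void register(String applicationType, AppHistoryPlugin plugin) {
        plugins.put(applicationType, plugin);
    }

    /** Falls back to the raw JSON (the generic view) when no plugin is registered. */
    String render(HistoryStorage storage, String applicationId) {
        return storage.getAppBlob(applicationId)
                .map(blob -> {
                    AppHistoryPlugin plugin = plugins.get(blob.applicationType);
                    return plugin == null ? blob.json : plugin.render(blob.json);
                })
                .orElse("no history for " + applicationId);
    }
}
```

The point of the split is that the AHS never parses the blob: it stores and returns it, and only a registered plugin for that application type knows what is inside. Without a plugin, the generic view still has something to show.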
                
> Generic application history service
> -----------------------------------
>
>                 Key: YARN-321
>                 URL: https://issues.apache.org/jira/browse/YARN-321
>             Project: Hadoop YARN
>          Issue Type: Improvement
>            Reporter: Luke Lu
>            Assignee: Vinod Kumar Vavilapalli
>
> The mapreduce job history server currently needs to be deployed as a trusted 
> server in sync with the mapreduce runtime. Every new application would need a 
> similar application history server. Having to deploy O(T*V) (where T is the 
> number of application types and V is the number of application versions) 
> trusted servers is clearly not scalable.
> Job history storage handling itself is pretty generic: move the logs and 
> history data into a particular directory for later serving. Job history data 
> is already stored as json (or binary avro). I propose that we create only one 
> trusted application history server, which can have a generic UI (display json 
> as a tree of strings) as well. Specific application/version can deploy 
> untrusted webapps (a la AMs) to query the application history server and 
> interpret the json for its specific UI and/or analytics.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
