[ 
https://issues.apache.org/jira/browse/YARN-321?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13706553#comment-13706553
 ] 

Vinod Kumar Vavilapalli commented on YARN-321:
----------------------------------------------

Fundamentally, this JIRA is to track the management of data related to finished 
applications via a new server called ApplicationHistoryService (AHS). Some 
important design points:

h4. Basics
 - ResourceManager will write per-application data to a (hopefully very) thin 
{{HistoryStorage}} layer.
 - After an application finishes, ResourceManager will push the data to 
HistoryStorage in a separate thread.
 - HistoryStorage is different from the current RMStateStore. Unlike 
JobHistory, HistoryStorage isn't used for state-tracking or as a transaction 
log. ResourceManager will try to publish information about completed apps on a 
best-effort basis, but there will be edge cases during RM restart where some 
data may not be flushed. Making it consistent and complete across an RM 
restart will be a future step.
 - HistoryStorage will expose APIs to publish app-info, retrieve app-info and 
list apps, and can have various implementations:
   -- A file based implementation where RM writes per-app files to DFS, 
HistoryStorage will take care of file management like we do today in 
JobHistoryServer (JHS) and serve users by reading the data in files
   -- A shared bus implementation where RM directly writes to AHS and AHS 
persists them in a storage that it controls - Files/DB etc.
 - To start with, we will have an implementation with per-app HDFS file.
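
To make the three HistoryStorage operations above concrete, here is a minimal sketch. The method names ({{publish}}, {{retrieve}}, {{list}}), the {{AppInfo}} fields and the in-memory implementation are all assumptions for illustration, not the proposed API; a real implementation would write per-app files to DFS instead.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical record of a finished application; the fields are assumptions.
final class AppInfo {
    final String appId;
    final String user;
    final String queue;
    final long finishTimeMillis;

    AppInfo(String appId, String user, String queue, long finishTimeMillis) {
        this.appId = appId;
        this.user = user;
        this.queue = queue;
        this.finishTimeMillis = finishTimeMillis;
    }
}

// Hypothetical shape of the thin storage layer: publish, retrieve, list.
interface HistoryStorage {
    void publish(AppInfo info);
    Optional<AppInfo> retrieve(String appId);
    List<AppInfo> list();
}

// In-memory stand-in used only to show the contract.
final class InMemoryHistoryStorage implements HistoryStorage {
    private final Map<String, AppInfo> apps = new ConcurrentHashMap<>();

    @Override
    public void publish(AppInfo info) {
        // Publishing the same appId twice overwrites the earlier entry.
        apps.put(info.appId, info);
    }

    @Override
    public Optional<AppInfo> retrieve(String appId) {
        return Optional.ofNullable(apps.get(appId));
    }

    @Override
    public List<AppInfo> list() {
        // Return a copy so callers cannot mutate the stored state.
        return new ArrayList<>(apps.values());
    }
}
```

Keeping the interface this small is what would let the file-based and shared-bus variants be swapped without touching ResourceManager.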

h4. Miscellaneous

 - *Running as service*: By default, ApplicationHistoryService will be embedded 
inside ResourceManager but will be independent enough to run as a separate 
service for scaling purposes.

 - *User interfaces*: Command line clients and/or web-clients will have RPC, 
web and REST interfaces to interact with ApplicationHistoryService to get info 
about finished applications. Fundamentally, we'll have the following interfaces:
    -- Per-app info
    -- List of all apps
    -- Querying the list of apps based on user-name, queue-name etc. To start 
with, we will imitate what JHS does: return the list of all apps and do the 
filtering client side. But we need a better server side solution.

 - *Aggregated logs*: Logs will be served, and potentially managed (expiry 
etc.), by ApplicationHistoryService via an abstract LogService component.

 - *Retention*: ApplicationHistoryService will have components to take care of 
retention - expiring very old apps.

 - *Security*: ApplicationHistoryService will be secure from the start and 
will use tokens similar to JHS.
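
As a rough illustration of the listing, client-side filtering and retention points above, the sketch below keeps finished apps in memory, returns the full list, filters it on the client by user and queue, and expires old entries. All class, method and field names here are hypothetical, and the age-based expiry rule is an assumption about what "very old" would mean.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical summary row for a finished application.
final class AppSummary {
    final String appId;
    final String user;
    final String queue;
    final long finishTimeMillis;

    AppSummary(String appId, String user, String queue, long finishTimeMillis) {
        this.appId = appId;
        this.user = user;
        this.queue = queue;
        this.finishTimeMillis = finishTimeMillis;
    }
}

final class FinishedAppIndex {
    private final List<AppSummary> apps = new ArrayList<>();

    void add(AppSummary app) {
        apps.add(app);
    }

    // Server side of the initial approach: return every finished app.
    List<AppSummary> listAll() {
        return new ArrayList<>(apps);
    }

    // Client-side filtering over the full list; a null argument means
    // "do not filter on this field".
    static List<AppSummary> filter(List<AppSummary> all, String user, String queue) {
        return all.stream()
                .filter(a -> user == null || user.equals(a.user))
                .filter(a -> queue == null || queue.equals(a.queue))
                .collect(Collectors.toList());
    }

    // Retention: drop apps that finished more than maxAgeMillis before
    // nowMillis and report how many were removed.
    int expireOlderThan(long maxAgeMillis, long nowMillis) {
        int before = apps.size();
        apps.removeIf(a -> nowMillis - a.finishTimeMillis > maxAgeMillis);
        return before - apps.size();
    }
}
```

Filtering after fetching everything is simple but transfers the whole list on every query, which is why a server side solution would still be needed.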

h4. Out of scope

 - Hosting/serving per-framework data is out of scope for this JIRA. It is 
related to ApplicationHistoryService, but I am keeping the focus on generic 
data for now on this JIRA and will file a separate ticket for 
ApplicationHistoryService or a related service to work with per-framework or 
app data. I see a transition phase where we would continue to run AHS and JHS 
at the same time till the other JIRA is resolved.

 - *Long running services*: We won't be having any special support for long 
running services yet. We should track this with other long running services' 
support.

Feedback appreciated.

I am kickstarting this right now and creating a branch for faster 
progress. 
                
> Generic application history service
> -----------------------------------
>
>                 Key: YARN-321
>                 URL: https://issues.apache.org/jira/browse/YARN-321
>             Project: Hadoop YARN
>          Issue Type: Improvement
>            Reporter: Luke Lu
>            Assignee: Vinod Kumar Vavilapalli
>
> The mapreduce job history server currently needs to be deployed as a trusted 
> server in sync with the mapreduce runtime. Every new application would need a 
> similar application history server. Having to deploy O(T*V) (where T is 
> number of type of application, V is number of version of application) trusted 
> servers is clearly not scalable.
> Job history storage handling itself is pretty generic: move the logs and 
> history data into a particular directory for later serving. Job history data 
> is already stored as json (or binary avro). I propose that we create only one 
> trusted application history server, which can have a generic UI (display json 
> as a tree of strings) as well. Specific application/version can deploy 
> untrusted webapps (a la AMs) to query the application history server and 
> interpret the json for its specific UI and/or analytics.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
