[jira] [Updated] (YARN-7952) RM should be able to recover log aggregation status after restart/fail-over

Wangda Tan (JIRA) Wed, 07 Mar 2018 13:12:18 -0800

     [ https://issues.apache.org/jira/browse/YARN-7952?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]


Wangda Tan updated YARN-7952:
-----------------------------
    Description: Currently, each NM periodically sends its own log aggregation 
status to the RM. The RM aggregates the status for each application, but it 
does not generate the final status until a client call (from the web UI or 
CLI) triggers it. The RM never persists the log aggregation status, so when 
the RM restarts or fails over, the status reverts to “NOT_STARTED”. This is 
confusing; maybe we should change it to “NOT_AVAILABLE” (will create a 
separate ticket for this). In any case, we need to persist the log aggregation 
status for future use.  (was: In MAPREDUCE-6415, we created a CLI to har the 
aggregated logs, and in YARN-4946 (RM should write out Aggregated Log 
Completion file flag next to logs) we discussed how to get the log aggregation 
status: make a client call to the RM or read it directly from the distributed 
file system (HDFS).
No matter which approach we choose, we first need a way to persist the log 
aggregation status. This ticket tracks progress on that work.)

> RM should be able to recover log aggregation status after restart/fail-over
> ---------------------------------------------------------------------------
>
>                 Key: YARN-7952
>                 URL: https://issues.apache.org/jira/browse/YARN-7952
>             Project: Hadoop YARN
>          Issue Type: Bug
>            Reporter: Xuan Gong
>            Assignee: Xuan Gong
>            Priority: Major
>         Attachments: YARN-7952-poc.patch, YARN-7952.1.patch, 
> YARN-7952.2.patch, YARN-7952.3.patch, YARN-7952.3.patch
>
>
> Currently, each NM periodically sends its own log aggregation status to the 
> RM. The RM aggregates the status for each application, but it does not 
> generate the final status until a client call (from the web UI or CLI) 
> triggers it. The RM never persists the log aggregation status, so when the 
> RM restarts or fails over, the status reverts to “NOT_STARTED”. This is 
> confusing; maybe we should change it to “NOT_AVAILABLE” (will create a 
> separate ticket for this). In any case, we need to persist the log 
> aggregation status for future use.
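
To make the problem concrete, here is a minimal sketch of the behavior described above and of one possible fix. It is not taken from the attached patches and does not use the real YARN classes: AggStatus, StatusStore, and AppLogAggregationTracker are hypothetical names, the status values are simplified, and the in-memory map only stands in for a durable RM state store.

```java
import java.util.Collection;
import java.util.HashMap;
import java.util.Map;

// Simplified, hypothetical status values; not the actual YARN enum.
enum AggStatus { NOT_STARTED, RUNNING, SUCCEEDED, FAILED }

// Hypothetical stand-in for a durable RM state store.
// A real store would survive a process restart; this map only models that.
class StatusStore {
    private final Map<String, AggStatus> persisted = new HashMap<>();

    void save(String appId, AggStatus status) { persisted.put(appId, status); }

    // Returns null if nothing was ever saved for this application.
    AggStatus load(String appId) { return persisted.get(appId); }
}

class AppLogAggregationTracker {
    private final StatusStore store;
    // appId -> (nodeId -> latest status reported by that NM); lost on restart.
    private final Map<String, Map<String, AggStatus>> reports = new HashMap<>();

    AppLogAggregationTracker(StatusStore store) { this.store = store; }

    // Called when an NM report for an application arrives.
    void onNodeReport(String appId, String nodeId, AggStatus status) {
        reports.computeIfAbsent(appId, k -> new HashMap<>()).put(nodeId, status);
    }

    // Combine per-node reports into one application-level status.
    // Simplification: any failed node marks the whole application as FAILED.
    static AggStatus combine(Collection<AggStatus> perNode) {
        if (perNode.isEmpty()) return AggStatus.NOT_STARTED;
        if (perNode.contains(AggStatus.FAILED)) return AggStatus.FAILED;
        if (perNode.stream().allMatch(s -> s == AggStatus.SUCCEEDED)) return AggStatus.SUCCEEDED;
        if (perNode.stream().allMatch(s -> s == AggStatus.NOT_STARTED)) return AggStatus.NOT_STARTED;
        return AggStatus.RUNNING;
    }

    // Client-triggered query: the final status is only computed here.
    AggStatus getStatus(String appId) {
        Map<String, AggStatus> perNode = reports.get(appId);
        if (perNode == null) {
            // No in-memory reports (for example after a restart): fall back to the store.
            AggStatus saved = store.load(appId);
            return saved != null ? saved : AggStatus.NOT_STARTED;
        }
        AggStatus status = combine(perNode.values());
        if (status == AggStatus.SUCCEEDED || status == AggStatus.FAILED) {
            store.save(appId, status); // persist final states so they can be recovered
        }
        return status;
    }
}
```

In this sketch, a tracker constructed after a "restart" with the same StatusStore returns the saved final status instead of NOT_STARTED. Without the store lookup in getStatus, it would show exactly the confusing behavior the description reports.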



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

