[ 
https://issues.apache.org/jira/browse/YARN-9237?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16752119#comment-16752119
 ] 

Jiandan Yang  edited comment on YARN-9237 at 1/25/19 10:33 AM:
---------------------------------------------------------------

Thanks [~cheersyang] for your quick response.
{quote}change to ApplicationState.FINISHED != appEntry.getValue().getApplicationState(){quote}
I agree with you; this style is better.
After looking through the code several times, I'm not sure how to test it; maybe
the existing unit tests are sufficient.
Do you have a good idea for a test?
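
For reference, a minimal sketch of the check being discussed and one way it might be exercised. This is not the actual patch: RunningAppsSketch, App, appsToReport, and the two-value ApplicationState enum are hypothetical stand-ins, not the real NodeManager classes. Only the comparison itself is taken from the review comment.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Stand-in types for illustration only; the real NodeManager classes differ.
class RunningAppsSketch {

    /** Hypothetical, reduced set of application states. */
    enum ApplicationState { RUNNING, FINISHED }

    /** Hypothetical stand-in for an application tracked in the NM context. */
    static final class App {
        private final ApplicationState state;

        App(ApplicationState state) {
            this.state = state;
        }

        ApplicationState getApplicationState() {
            return state;
        }
    }

    /** Returns the IDs of apps whose state is not FINISHED, in map order. */
    static List<String> appsToReport(Map<String, App> appsInContext) {
        List<String> running = new ArrayList<>();
        for (Map.Entry<String, App> appEntry : appsInContext.entrySet()) {
            // Constant on the left, as suggested in the review comment.
            if (ApplicationState.FINISHED != appEntry.getValue().getApplicationState()) {
                running.add(appEntry.getKey());
            }
        }
        return running;
    }

    public static void main(String[] args) {
        Map<String, App> apps = new LinkedHashMap<>();
        apps.put("application_1", new App(ApplicationState.RUNNING));
        apps.put("application_2", new App(ApplicationState.FINISHED));
        System.out.println(appsToReport(apps)); // prints [application_1]
    }
}
```

A unit test along these lines could populate the context with one running and one finished app and assert that only the running app's ID ends up in the list sent to the RM.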



was (Author: yangjiandan):
Thanks [~cheersyang] for your quick response. I'll update the patch according to
your comment.

> RM prints a lot of "Cannot get RMApp by appId" log when RM failover
> -------------------------------------------------------------------
>
>                 Key: YARN-9237
>                 URL: https://issues.apache.org/jira/browse/YARN-9237
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: yarn
>            Reporter: Jiandan Yang 
>            Assignee: Jiandan Yang 
>            Priority: Major
>         Attachments: YARN-9237.001.patch, YARN-9237.002.patch
>
>
> I found many of the following log messages in the active RM log file after an RM failover:
> {code:java}
> 2019-01-24 15:43:58,999 WARN 
> org.apache.hadoop.yarn.server.resourcemanager.rmnode.RMNodeImpl: Cannot get 
> RMApp by appId=application_1542178952162_34746156, just added it to 
> finishedApplications list for cleanup
> .....
> {code}
> I looked back through the RM logs and found that this app had finished hours earlier:
> {code:java}
> 2019-01-23 21:49:55,683 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl: 
> appattempt_1542178952162_34746156_000001 State change from FINAL_SAVING to 
> FINISHING
> {code}
> The RM prints "Cannot get RMApp by appId" for the following reason:
> 1. The RM fails over.
> 2. Each NM reports all of its running apps to the RM in its register request.
> 3. The running apps are taken from NMContext, and some of them may have
> already finished.
> 4. In my cluster, yarn.log-aggregation-enable=false and
> yarn.nodemanager.log.retain-seconds=86400 (1 day), so an app is kept in
> NMContext for 24 hours after it has finished.
> 5. My YARN cluster runs 50k apps per day on 7k nodes, so the NMs report
> many finished apps to the RM.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
