Anubhav Dhoot commented on YARN-1365:

The error occurs because RMAppRecoveredTransition leaves the attempt in LAUNCHED 
and the scheduler then issues ATTEMPT_ADDED. I see Jian fixed this one way in 
YARN-1368, but that fix only covers the case where the attempt is in LAUNCHED. 
If the state reaches RUNNING before that, we still get the error. The option I 
see is to pass a flag to AppAttemptAddedSchedulerEvent that tells the scheduler 
not to issue ATTEMPT_ADDED. RMAppRecoveredTransition would set this flag. Let me 
know what you think.
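
To make the flag idea concrete, here is a rough, self-contained sketch. It does 
not use the real YARN classes: AttemptAddedEvent, ToyScheduler and the 
suppressAttemptAdded flag name are all hypothetical stand-ins, and the actual 
patch may look quite different.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, simplified stand-ins; these are NOT the actual YARN classes.
final class RecoveryFlagSketch {

    /** Stand-in for AppAttemptAddedSchedulerEvent carrying the proposed flag. */
    static final class AttemptAddedEvent {
        final String attemptId;
        // Assumed flag name: true when the attempt is being recovered
        // after an RM restart and must not be sent ATTEMPT_ADDED again.
        final boolean suppressAttemptAdded;

        AttemptAddedEvent(String attemptId, boolean suppressAttemptAdded) {
            this.attemptId = attemptId;
            this.suppressAttemptAdded = suppressAttemptAdded;
        }
    }

    /** Toy scheduler that records the events it would send back to attempts. */
    static final class ToyScheduler {
        final List<String> sent = new ArrayList<>();

        void handle(AttemptAddedEvent event) {
            // A real scheduler would still do its own bookkeeping here;
            // only the notification back to the attempt is skipped.
            if (!event.suppressAttemptAdded) {
                sent.add("ATTEMPT_ADDED:" + event.attemptId);
            }
        }
    }

    public static void main(String[] args) {
        ToyScheduler scheduler = new ToyScheduler();
        // Fresh attempt: the scheduler notifies it as usual.
        scheduler.handle(new AttemptAddedEvent("attempt_1", false));
        // Recovered attempt: the recovery path sets the flag, so no event is sent.
        scheduler.handle(new AttemptAddedEvent("attempt_2", true));
        System.out.println(scheduler.sent); // prints [ATTEMPT_ADDED:attempt_1]
    }
}
```

Since the decision rides on the event rather than on the attempt's current 
state, it should not matter whether the recovered attempt is in LAUNCHED or 
has already reached RUNNING.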

> ApplicationMasterService to allow Register and Unregister of an app that was 
> running before restart
> ---------------------------------------------------------------------------------------------------
>                 Key: YARN-1365
>                 URL: https://issues.apache.org/jira/browse/YARN-1365
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: resourcemanager
>            Reporter: Bikas Saha
>            Assignee: Anubhav Dhoot
>         Attachments: YARN-1365.001.patch, YARN-1365.002.patch, 
> YARN-1365.003.patch, YARN-1365.initial.patch
> For an application that was running before restart, the 
> ApplicationMasterService currently throws an exception when the app tries to 
> make the initial register or final unregister call. These should succeed and 
> the RMApp state machine should transition to completed like normal. 
> Unregistration should succeed for an app that the RM considers complete since 
> the RM may have died after saving completion in the store but before 
> notifying the AM that the AM is free to exit.

This message was sent by Atlassian JIRA
