[ 
https://issues.apache.org/jira/browse/YARN-674?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13814541#comment-13814541
 ] 

Omkar Vinit Joshi commented on YARN-674:
----------------------------------------

Thanks [~jianhe], [~bikassaha] .

bq. Saw this is changed back to asynchronous submission on recovery. The 
original intention was to prevent the client from seeing the application as a new 
application. If recovery is asynchronous, the client can query the application before 
the recover event gets processed, that is, before the application is fully recovered, 
since some recovery logic runs while the app is processing the recover 
event (app.FinalTransition).
Fixed: the application is now updated synchronously on recovery.
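
To illustrate the race described in the quoted comment, here is a minimal sketch. It is not the RM code: `RecoverySketch`, `App`, `QueueDispatcher` and the two-state enum are hypothetical stand-ins, and the queue is drained by hand so the ordering is easy to see.

```java
import java.util.ArrayDeque;
import java.util.Queue;

public class RecoverySketch {
    public enum State { NEW, RECOVERED }

    // Hypothetical application object; only tracks whether recovery has run.
    public static class App {
        State state = State.NEW;
        void handleRecover() { state = State.RECOVERED; }
    }

    // Stand-in for an async dispatcher: events wait in a queue until drained.
    public static class QueueDispatcher {
        private final Queue<Runnable> events = new ArrayDeque<>();
        public void dispatch(Runnable event) { events.add(event); }
        public void drain() {
            Runnable e;
            while ((e = events.poll()) != null) {
                e.run();
            }
        }
    }

    // Asynchronous recovery: the event is only queued, so a client query
    // issued right away still sees the pre-recovery state.
    public static State recoverAsync(App app, QueueDispatcher dispatcher) {
        dispatcher.dispatch(app::handleRecover);
        return app.state;
    }

    // Synchronous recovery: the event is handled before returning.
    public static State recoverSync(App app) {
        app.handleRecover();
        return app.state;
    }

    public static void main(String[] args) {
        QueueDispatcher dispatcher = new QueueDispatcher();
        App asyncApp = new App();
        System.out.println(recoverAsync(asyncApp, dispatcher)); // prints NEW
        dispatcher.drain();
        System.out.println(asyncApp.state);                     // prints RECOVERED
        System.out.println(recoverSync(new App()));             // prints RECOVERED
    }
}
```

In the asynchronous case the application looks new until the queue is drained; in the synchronous case no such window exists.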

bq. The assert doesn't make it to the production jar, so it won't catch anything 
on the cluster. Need to throw an exception here. If we don't want to crash the 
RM here then we can log an error. When the attempt state machine gets the 
event, it will crash on the async dispatcher thread if the event is not 
handled in the current state.
Discussed with Bikas offline; this is fine as is.
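
For context on the assert concern: in Java, `assert` statements are skipped at runtime unless assertions are enabled (for example with the `-ea` JVM flag), which is normally not the case on a production cluster. The sketch below contrasts the two styles of check. `StateCheckSketch` and its methods are hypothetical and are not taken from the patch.

```java
public class StateCheckSketch {
    // Only effective when the JVM runs with assertions enabled (-ea).
    public static void checkWithAssert(boolean valid) {
        assert valid : "unexpected state";
    }

    // Always effective, regardless of JVM flags.
    public static void checkWithException(boolean valid) {
        if (!valid) {
            throw new IllegalStateException("unexpected state");
        }
    }

    public static void main(String[] args) {
        try {
            checkWithAssert(false);
            System.out.println("assert check did not fire (assertions disabled)");
        } catch (AssertionError e) {
            System.out.println("assert check fired (assertions enabled)");
        }
        try {
            checkWithException(false);
        } catch (IllegalStateException e) {
            System.out.println("explicit check fired: " + e.getMessage());
        }
    }
}
```

Which of the first two messages appears depends on whether the JVM was started with `-ea`; the explicit check fires either way.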

> Slow or failing DelegationToken renewals on submission itself make RM 
> unavailable
> ---------------------------------------------------------------------------------
>
>                 Key: YARN-674
>                 URL: https://issues.apache.org/jira/browse/YARN-674
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: resourcemanager
>            Reporter: Vinod Kumar Vavilapalli
>            Assignee: Omkar Vinit Joshi
>         Attachments: YARN-674.1.patch, YARN-674.2.patch, YARN-674.3.patch, 
> YARN-674.4.patch, YARN-674.5.patch, YARN-674.5.patch
>
>
> This was caused by YARN-280. A slow or down NameNode will make it look 
> like the RM is unavailable, as it may run out of RPC handlers due to blocked 
> client submissions.
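
The handler exhaustion described in the issue summary can be sketched with a small fixed thread pool. This is an illustration under assumed names (`HandlerStarvationSketch`, `pingAnswered`, a latch standing in for the slow NameNode), not the RM's RPC server.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

public class HandlerStarvationSketch {
    // Returns true if a trivial request is answered within timeoutMs while
    // `blocked` submissions are stuck waiting on a token renewal.
    public static boolean pingAnswered(int handlers, int blocked, long timeoutMs)
            throws Exception {
        ExecutorService rpcHandlers = Executors.newFixedThreadPool(handlers);
        CountDownLatch nameNodeResponds = new CountDownLatch(1);
        try {
            for (int i = 0; i < blocked; i++) {
                rpcHandlers.submit(() -> {
                    nameNodeResponds.await(); // renewal blocks inside the handler
                    return null;
                });
            }
            Future<String> ping = rpcHandlers.submit(() -> "pong");
            try {
                ping.get(timeoutMs, TimeUnit.MILLISECONDS);
                return true;
            } catch (TimeoutException e) {
                return false; // every handler is busy, so the request waits
            }
        } finally {
            nameNodeResponds.countDown(); // release the blocked handlers
            rpcHandlers.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(pingAnswered(2, 2, 200));  // prints false
        System.out.println(pingAnswered(2, 1, 2000)); // prints true
    }
}
```

With as many blocked submissions as handlers, even a trivial request goes unanswered, which is how the service appears unavailable to clients.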



--
This message was sent by Atlassian JIRA
(v6.1#6144)
