[ 
https://issues.apache.org/jira/browse/YARN-1410?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13858582#comment-13858582
 ] 

Bikas Saha commented on YARN-1410:
----------------------------------

We should probably always have taken the second approach, in which 
submitApplication directly returns the appId. But I think we have a separate 
createApplication() so that the client can get an appId, request RM tokens 
for it, and insert those tokens into the AppSubmitContext before app 
submission. Again, all of this could have been combined in a single app 
submission context. We may be able to make the second proposal work by 
enhancing a single submitApplication to do all of this and return the appId. 
That would be the preferred API, and users should move to it for good HA 
behavior.
An alternative would be for the YarnClient to receive the "app does not exist" 
exception from the new active RM, and then use a new API to tell the RM about 
the previously created appId and have the RM accept that appId. However, the 
downside is that a user could reuse an old appId for multiple apps.
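
A minimal sketch of the single-call idea, to make the HA argument concrete. 
It is a toy model, not Hadoop code: ToyRM, its one-argument 
submitApplication(String) and submitWithFailover are hypothetical names, the 
context is reduced to a string, and RM token handling is left out entirely. 
The point is only that a client carrying no appId from the old RM can simply 
retry against the new active RM.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class SingleStepSubmit {

    /** Toy stand-in for a ResourceManager (hypothetical, not the YARN class). */
    static final class ToyRM {
        final long clusterTimestamp;   // differs between RM instances
        final boolean active;
        private int nextId = 1;
        final List<String> apps = new ArrayList<>();

        ToyRM(long clusterTimestamp, boolean active) {
            this.clusterTimestamp = clusterTimestamp;
            this.active = active;
        }

        /** Assumed combined call: allocate the appId and accept the context together. */
        String submitApplication(String context) {
            if (!active) {
                throw new IllegalStateException("RM is not active");
            }
            String appId = "application_" + clusterTimestamp + "_"
                    + String.format("%04d", nextId++);
            apps.add(appId + ":" + context);
            return appId;
        }
    }

    /** Try each RM in turn; no appId from an earlier RM is carried along. */
    static String submitWithFailover(List<ToyRM> rms, String context) {
        for (ToyRM rm : rms) {
            try {
                return rm.submitApplication(context);
            } catch (IllegalStateException e) {
                // This RM is not active; fall through to the next one.
            }
        }
        throw new IllegalStateException("no active RM found");
    }

    public static void main(String[] args) {
        ToyRM oldRm = new ToyRM(1000L, false); // has failed over
        ToyRM newRm = new ToyRM(2000L, true);  // new active
        String appId = submitWithFailover(Arrays.asList(oldRm, newRm), "my-app");
        System.out.println(appId);
    }
}
```

Because the appId is minted by whichever RM accepts the submission, its 
cluster timestamp always matches that RM, so the retry cannot be rejected for 
a stale appId.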

> Handle client failover during 2 step client API's like app submission
> ---------------------------------------------------------------------
>
>                 Key: YARN-1410
>                 URL: https://issues.apache.org/jira/browse/YARN-1410
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>            Reporter: Bikas Saha
>            Assignee: Xuan Gong
>         Attachments: YARN-1410.1.patch
>
>
> App submission involves
> 1) creating appId
> 2) using that appId to submit an ApplicationSubmissionContext to the RM.
> The client may have obtained an appId from an RM, the RM may have failed 
> over, and the client may submit the app to the new RM.
> Since the new RM has a different notion of cluster timestamp (used to create 
> app id) the new RM may reject the app submission resulting in unexpected 
> failure on the client side.
> The same may happen for other 2 step client API operations.
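
The failure described above can be illustrated with a small toy model. This 
is a sketch under stated assumptions, not the RM's actual validation logic: 
AppId and ToyRM are hypothetical classes, and the acceptance check is reduced 
to comparing the cluster timestamp embedded in the appId with the RM's own.

```java
public class TwoStepFailover {

    /** Toy appId: the issuing RM's cluster timestamp plus a sequence number. */
    static final class AppId {
        final long clusterTimestamp;
        final int id;

        AppId(long clusterTimestamp, int id) {
            this.clusterTimestamp = clusterTimestamp;
            this.id = id;
        }
    }

    /** Toy stand-in for a ResourceManager (hypothetical, not the YARN class). */
    static final class ToyRM {
        private final long clusterTimestamp;
        private int nextId = 1;

        ToyRM(long clusterTimestamp) {
            this.clusterTimestamp = clusterTimestamp;
        }

        /** Step 1: hand out an appId stamped with this RM's cluster timestamp. */
        AppId createApplication() {
            return new AppId(clusterTimestamp, nextId++);
        }

        /** Step 2 (simplified): accept only appIds this RM could have issued. */
        boolean submitApplication(AppId appId, String context) {
            return appId.clusterTimestamp == clusterTimestamp;
        }
    }

    public static void main(String[] args) {
        ToyRM oldRm = new ToyRM(1000L);
        ToyRM newRm = new ToyRM(2000L); // became active after failover

        AppId appId = oldRm.createApplication(); // step 1 against the old RM

        // Step 2 succeeds only if it reaches the RM that issued the appId.
        System.out.println("old RM accepts: " + oldRm.submitApplication(appId, "my-app"));
        System.out.println("new RM accepts: " + newRm.submitApplication(appId, "my-app"));
    }
}
```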



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)
