[jira] [Commented] (YARN-1410) Handle client failover during 2 step client API's like app submission

Xuan Gong (JIRA) Thu, 20 Feb 2014 20:20:39 -0800

    [ 
https://issues.apache.org/jira/browse/YARN-1410?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13907937#comment-13907937
 ]


Xuan Gong commented on YARN-1410:
---------------------------------

Thanks for the suggestions, but this proposal seems to have some 
disadvantages. If we use RetryCache, we have to save its information in the 
store, either in HDFS or ZooKeeper. First, we need to ensure that all our APIs 
can store their state *asynchronously*. Second, if we save the RetryCache 
information *asynchronously*, the same race condition will still be there. 
Third, if we save the RetryCache information *synchronously*, there is another 
problem: performance will degrade. 
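
To make the asynchronous and synchronous cases concrete, here is a minimal sketch in Python. It is a toy model, not the Hadoop RetryCache API: Store, ToyRM, handle, flush and failover_replays are hypothetical names chosen for illustration, and the "store" is an in-memory dict standing in for HDFS or ZooKeeper.

```python
class Store:
    """Hypothetical stand-in for a shared store such as HDFS or ZooKeeper."""

    def __init__(self):
        self.entries = {}
        self.writes = 0  # counts store round trips

    def write(self, key, value):
        self.entries[key] = value
        self.writes += 1


class ToyRM:
    """Toy RM that keeps a retry cache and persists it to the store."""

    def __init__(self, store, synchronous):
        self.store = store
        self.synchronous = synchronous
        self.cache = dict(store.entries)  # recover whatever was persisted
        self.pending = []                 # writes queued in async mode

    def handle(self, call_id, compute):
        if call_id in self.cache:
            return self.cache[call_id]    # retried call: replay old result
        result = compute()
        self.cache[call_id] = result
        if self.synchronous:
            # Persist before replying: safe, but every call pays a store write.
            self.store.write(call_id, result)
        else:
            # Reply first, persist later: fast, but the entry can be lost.
            self.pending.append((call_id, result))
        return result

    def flush(self):
        """Drain queued async writes to the store."""
        for key, value in self.pending:
            self.store.write(key, value)
        self.pending.clear()


def failover_replays(synchronous):
    """Return True if a retry after failover sees the original result."""
    store = Store()
    active = ToyRM(store, synchronous)
    first = active.handle("call-1", lambda: "result-from-old-rm")
    # The active RM dies here, before flush() has run.
    standby = ToyRM(store, synchronous)
    retry = standby.handle("call-1", lambda: "result-from-new-rm")
    return first == retry
```

In this model the retry is answered consistently only in the synchronous case; in the asynchronous case the new RM never sees the entry and recomputes the call, which is the race. The synchronous case pays one store write per call before it can reply.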

[~bikassaha]

> Handle client failover during 2 step client API's like app submission
> ---------------------------------------------------------------------
>
>                 Key: YARN-1410
>                 URL: https://issues.apache.org/jira/browse/YARN-1410
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>            Reporter: Bikas Saha
>            Assignee: Xuan Gong
>         Attachments: YARN-1410-outline.patch, YARN-1410.1.patch, 
> YARN-1410.2.patch, YARN-1410.2.patch, YARN-1410.3.patch, YARN-1410.4.patch
>
>   Original Estimate: 48h
>  Remaining Estimate: 48h
>
> App submission involves
> 1) creating appId
> 2) using that appId to submit an ApplicationSubmissionContext to the RM.
> The client may have obtained an appId from an RM, the RM may have failed 
> over, and the client may submit the app to the new RM.
> Since the new RM has a different notion of cluster timestamp (used to create 
> app id) the new RM may reject the app submission resulting in unexpected 
> failure on the client side.
> The same may happen for other 2 step client API operations.
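The failure described in the issue can be sketched with a toy model in Python. YARN application IDs combine the RM's cluster timestamp with a sequence number and print as application_<clusterTimestamp>_<id>; everything else here (TimestampRM, new_application, submit, and the rule that a submission is rejected when the timestamps differ) is a hypothetical simplification of the behaviour the description reports, not the actual RM code.

```python
class TimestampRM:
    """Toy RM that issues app IDs tied to its own cluster timestamp."""

    def __init__(self, cluster_timestamp):
        self.cluster_timestamp = cluster_timestamp
        self.next_id = 1

    def new_application(self):
        # Step 1: hand out an app ID based on this RM's cluster timestamp.
        app_id = (self.cluster_timestamp, self.next_id)
        self.next_id += 1
        return app_id

    def submit(self, app_id):
        # Step 2: assumed rule, accept only IDs carrying this RM's timestamp.
        timestamp, _ = app_id
        return timestamp == self.cluster_timestamp


def format_app_id(app_id):
    """Render an ID in the application_<clusterTimestamp>_<id> form."""
    return "application_%d_%04d" % app_id


old_rm = TimestampRM(cluster_timestamp=1392900000000)
app_id = old_rm.new_application()

# Failover: the new RM starts with a different cluster timestamp.
new_rm = TimestampRM(cluster_timestamp=1392900600000)

accepted_by_old = old_rm.submit(app_id)
accepted_by_new = new_rm.submit(app_id)
```

Here the same ID is accepted by the RM that issued it and rejected by the RM that took over, so a client that fails over between the two steps sees an unexpected failure.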



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)
