[ https://issues.apache.org/jira/browse/YARN-1410?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13911842#comment-13911842 ]
Bikas Saha commented on YARN-1410:
----------------------------------
bq. getApplicationReport() is called, we will get an
ApplicationNotFoundException. So, we need to catch this exception and submit
this application again
It would be good if HAUtil could give us an indication of whether a failover
has occurred. If it has occurred, then it's OK to get this exception, but if
it has not, then it's a bug. We can defer that to a separate JIRA if it's too
much work.
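
The check suggested above could look roughly like the following. This is a minimal sketch under stated assumptions, not the YARN client code: RmClient, FailoverProbe, FakeRm, and the local ApplicationNotFoundException class are hypothetical stand-ins, and no existing HAUtil method is assumed to provide the failover count.

```java
import java.util.HashSet;
import java.util.Set;

public class ResubmitOnFailover {
    /** Local stand-in for YARN's exception of the same name. */
    static class ApplicationNotFoundException extends Exception {
        ApplicationNotFoundException(String msg) { super(msg); }
    }

    /** Hypothetical view of the two RM calls involved. */
    interface RmClient {
        void submitApplication(String appId);
        String getApplicationReport(String appId) throws ApplicationNotFoundException;
    }

    /** Hypothetical failover indicator (the comment proposes exposing one via HAUtil). */
    interface FailoverProbe {
        long failoverCount();
    }

    /** Submit, then verify; resubmit only if a failover explains the missing app. */
    static String submitAndVerify(RmClient rm, FailoverProbe probe, String appId)
            throws ApplicationNotFoundException {
        long before = probe.failoverCount();
        rm.submitApplication(appId);
        try {
            return rm.getApplicationReport(appId);
        } catch (ApplicationNotFoundException e) {
            if (probe.failoverCount() == before) {
                // No failover happened, so a missing app indicates a bug.
                throw new IllegalStateException("app missing without failover: " + appId, e);
            }
            // A failover happened: the new RM never saw the app, so submit it again.
            rm.submitApplication(appId);
            return rm.getApplicationReport(appId);
        }
    }

    /** In-memory fake used only to exercise the sketch. */
    static class FakeRm implements RmClient, FailoverProbe {
        private final Set<String> apps = new HashSet<>();
        private long failovers = 0;
        boolean failOverAfterNextSubmit = false;

        public void submitApplication(String appId) {
            apps.add(appId);
            if (failOverAfterNextSubmit) {
                failOverAfterNextSubmit = false;
                apps.clear();   // the new active RM has no record of the app
                failovers++;
            }
        }

        public String getApplicationReport(String appId) throws ApplicationNotFoundException {
            if (!apps.contains(appId)) {
                throw new ApplicationNotFoundException(appId);
            }
            return "ACCEPTED:" + appId;
        }

        public long failoverCount() { return failovers; }
    }

    public static void main(String[] args) throws Exception {
        FakeRm rm = new FakeRm();
        rm.failOverAfterNextSubmit = true;
        System.out.println(submitAndVerify(rm, rm, "application_1_0001"));
    }
}
```

Comparing a counter taken before the submit with one taken after the exception keeps the two cases apart: a changed count means the exception is expected and a resubmit is safe, while an unchanged count surfaces the problem instead of hiding it behind a retry.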
> Handle client failover during 2 step client API's like app submission
> ---------------------------------------------------------------------
>
> Key: YARN-1410
> URL: https://issues.apache.org/jira/browse/YARN-1410
> Project: Hadoop YARN
> Issue Type: Sub-task
> Reporter: Bikas Saha
> Assignee: Xuan Gong
> Attachments: YARN-1410-outline.patch, YARN-1410.1.patch,
> YARN-1410.2.patch, YARN-1410.2.patch, YARN-1410.3.patch, YARN-1410.4.patch,
> YARN-1410.5.patch
>
> Original Estimate: 48h
> Remaining Estimate: 48h
>
> App submission involves
> 1) creating appId
> 2) using that appId to submit an ApplicationSubmissionContext to the RM.
> The client may have obtained an appId from an RM, the RM may have failed
> over, and the client may submit the app to the new RM.
> Since the new RM has a different notion of cluster timestamp (used to create
> app id) the new RM may reject the app submission resulting in unexpected
> failure on the client side.
> The same may happen for other 2 step client API operations.
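
The failure mode in the description can be illustrated with a toy model. This is a sketch only: ToyRm is hypothetical, the id string merely mirrors the application_<clusterTimestamp>_<sequence> shape, and the rejection rule (refuse ids created under a different cluster timestamp) is an assumption made for illustration rather than the RM's actual check.

```java
public class TwoStepSubmission {
    /** Toy RM: app ids embed the cluster timestamp this RM instance was given. */
    static class ToyRm {
        private final long clusterTimestamp;
        private int nextId = 1;

        ToyRm(long clusterTimestamp) {
            this.clusterTimestamp = clusterTimestamp;
        }

        /** Step 1: hand out a new app id. */
        String newApplicationId() {
            return String.format("application_%d_%04d", clusterTimestamp, nextId++);
        }

        /** Step 2: accept a submission only if the id carries this RM's timestamp. */
        boolean submit(String appId) {
            return appId.startsWith("application_" + clusterTimestamp + "_");
        }
    }

    public static void main(String[] args) {
        ToyRm oldRm = new ToyRm(1000L);   // RM that was active during step 1
        ToyRm newRm = new ToyRm(2000L);   // RM that became active after the failover

        String appId = oldRm.newApplicationId();
        System.out.println(appId + " accepted by old RM: " + oldRm.submit(appId));
        System.out.println(appId + " accepted by new RM: " + newRm.submit(appId));
    }
}
```

Under these assumptions the id obtained in step 1 is only valid against the RM that issued it, which is why a failover between the two steps surfaces as an unexpected client-side failure.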