[ 
https://issues.apache.org/jira/browse/YARN-2209?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14077580#comment-14077580
 ] 

Junping Du commented on YARN-2209:
----------------------------------

Thanks [~zjshen] for the additional details! I made a similar comment above, but 
[~jianhe] mentioned that RESYNC is only used for the RM restart work, which 
hasn't been released as a completed feature. However, I checked our previous 
releases: since at least 2.2 (maybe earlier), AM_RESYNC and AM_SHUTDOWN have 
been part of the public API and could be used in customers' applications. In 
that case, our changes here could break those applications. Previously, an 
application would show "Resource Manager doesn't recognize AttemptId: ..." when 
the RM restarted (even without work-preserving restart), but now it shows 
something like "Could not contact RM after ... milliseconds.", which sounds 
misleading. Maybe we should think of a more compatible approach, e.g. add a new 
API to ApplicationMasterProtocol that throws exceptions instead of returning an 
AMCommand, while the old API stays supported for backward compatibility. Thoughts?
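
To make the compatibility idea concrete, here is a rough, self-contained sketch 
of keeping a command-returning call next to a new exception-throwing one. It does 
not use the real YARN classes: RmState, allocate, allocateOrThrow and 
AttemptNotRecognizedException are hypothetical stand-ins, and mapping an unknown 
attempt to AM_RESYNC is an assumption made only for illustration.

```java
import java.util.Optional;

public class AllocateCompatSketch {
    /** Stand-in for the AMCommand values discussed in this issue. */
    public enum AMCommand { AM_RESYNC, AM_SHUTDOWN }

    /** Hypothetical model of whether the RM still knows the attempt. */
    public enum RmState { ATTEMPT_KNOWN, ATTEMPT_UNKNOWN_AFTER_RESTART }

    /** Hypothetical exception type; not a real YARN class. */
    public static class AttemptNotRecognizedException extends Exception {
        public AttemptNotRecognizedException(String message) {
            super(message);
        }
    }

    /** Old-style call: the condition is reported as a command in the response. */
    public static Optional<AMCommand> allocate(RmState state) {
        if (state == RmState.ATTEMPT_UNKNOWN_AFTER_RESTART) {
            // Assumption for this sketch: an unknown attempt maps to AM_RESYNC.
            return Optional.of(AMCommand.AM_RESYNC);
        }
        return Optional.empty();
    }

    /** New-style call: the same condition is reported as an exception. */
    public static void allocateOrThrow(RmState state)
            throws AttemptNotRecognizedException {
        if (state == RmState.ATTEMPT_UNKNOWN_AFTER_RESTART) {
            throw new AttemptNotRecognizedException(
                "Resource Manager doesn't recognize this attempt");
        }
    }

    public static void main(String[] args) {
        // An existing application keeps checking the returned command.
        Optional<AMCommand> command =
            allocate(RmState.ATTEMPT_UNKNOWN_AFTER_RESTART);
        System.out.println("old API command present: " + command.isPresent());

        // A new application opts into the exception-based call instead.
        try {
            allocateOrThrow(RmState.ATTEMPT_UNKNOWN_AFTER_RESTART);
        } catch (AttemptNotRecognizedException e) {
            System.out.println("new API threw: " + e.getMessage());
        }
    }
}
```

With something along these lines, applications written against the old call 
would keep seeing the command they already handle, and only callers of the new 
method would need to handle the exception.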

> Replace AM resync/shutdown command with corresponding exceptions
> ----------------------------------------------------------------
>
>                 Key: YARN-2209
>                 URL: https://issues.apache.org/jira/browse/YARN-2209
>             Project: Hadoop YARN
>          Issue Type: Improvement
>            Reporter: Jian He
>            Assignee: Jian He
>         Attachments: YARN-2209.1.patch, YARN-2209.2.patch, YARN-2209.3.patch, 
> YARN-2209.4.patch, YARN-2209.5.patch
>
>
> YARN-1365 introduced an ApplicationMasterNotRegisteredException to tell the 
> application to re-register on RM restart. We should do the same for the 
> AMS#allocate call as well.



--
This message was sent by Atlassian JIRA
(v6.2#6252)
