[
https://issues.apache.org/jira/browse/YARN-2209?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14077554#comment-14077554
]
Zhijie Shen commented on YARN-2209:
-----------------------------------
Although the change breaks neither binary nor source compatibility, existing
application logic is still at risk of breaking, because the way the AM is
signaled changes from an AMCommand in the response to an exception. As
mentioned above, applications other than MR will be affected by this change.
For example, suppose an application's AM logic looks as follows:
{code}
try {
  ams.allocate(...);
} catch (Exception e) {
  // any allocate failure is treated as fatal
  ams.finishApplicationMaster(...);
}
if (response is shutdown/resync) {
  // cleanup and reboot ...
}
{code}
This logic is likely to break if the application runs on a YARN cluster with
this patch applied. The application does not expect shutdown/resync to be
signaled via an exception, so it simply catches the allocate failure and
terminates the application. In this case, an application that would have been
retried across an RM restart on a current YARN cluster is likely to conclude
with failure instead (assuming the signal to kill the AM container arrives
after all of the aforementioned logic has run).
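The failure mode described above can be reproduced in a small, self-contained
sketch. Nothing here is the real YARN API: ResyncRequestedException, Scheduler,
and Outcome are hypothetical stand-ins, and the "RESYNC" string stands in for
the AMCommand carried in the allocate response. The sketch only illustrates why
a catch-all handler written before the patch ends the application, and how
catching the narrower signal first would avoid that.

```java
public class AmSignalSketch {

    // Hypothetical stand-in for the exception a patched RM would throw to request a resync.
    static class ResyncRequestedException extends Exception {
        ResyncRequestedException(String msg) { super(msg); }
    }

    // Hypothetical stand-in for the AMS: allocate() either returns a response
    // (carrying the command as a string here) or throws.
    interface Scheduler {
        String allocate() throws Exception;
    }

    enum Outcome { RUNNING, RESYNCED, FINISHED_AS_FAILED }

    // Pre-patch pattern from the comment: any exception from allocate is fatal,
    // and resync is only recognized when it arrives in the response.
    static Outcome legacyHeartbeat(Scheduler ams) {
        String response;
        try {
            response = ams.allocate();
        } catch (Exception e) {
            return Outcome.FINISHED_AS_FAILED; // stands in for finishApplicationMaster(...)
        }
        return "RESYNC".equals(response) ? Outcome.RESYNCED : Outcome.RUNNING;
    }

    // Adapted pattern: the narrower signal exception is caught before the catch-all.
    static Outcome updatedHeartbeat(Scheduler ams) {
        try {
            String response = ams.allocate();
            return "RESYNC".equals(response) ? Outcome.RESYNCED : Outcome.RUNNING;
        } catch (ResyncRequestedException e) {
            return Outcome.RESYNCED; // cleanup and re-register instead of failing
        } catch (Exception e) {
            return Outcome.FINISHED_AS_FAILED;
        }
    }

    public static void main(String[] args) {
        Scheduler commandStyleRm = () -> "RESYNC";
        Scheduler exceptionStyleRm = () -> {
            throw new ResyncRequestedException("RM restarted");
        };
        System.out.println(legacyHeartbeat(commandStyleRm));    // RESYNCED
        System.out.println(legacyHeartbeat(exceptionStyleRm));  // FINISHED_AS_FAILED
        System.out.println(updatedHeartbeat(exceptionStyleRm)); // RESYNCED
    }
}
```

The legacy handler compiles and links unchanged against both signaling styles,
yet its behavior differs, which is the point being raised here.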
In general, the problem is as follows. We previously declared that an API
throws exception 1, exception 2, and so on, and we expected users to handle
these exceptions. To handle them correctly, users need to know, implicitly or
explicitly, in which situations each exception is raised (in YARN, it seems
users had to figure this out themselves, as we hardly wrote any javadoc for
the exceptions). Later, we do not change the API method signature. Instead, we
add or modify the situations in which an exception is raised, or throw a
sub-exception (as in this case) that was not expected before. Hence, existing
API users are likely to break on the newly added or modified exception,
because they could not have taken it into consideration. Should this be
considered a kind of *logic incompatibility*?
> Replace AM resync/shutdown command with corresponding exceptions
> ----------------------------------------------------------------
>
> Key: YARN-2209
> URL: https://issues.apache.org/jira/browse/YARN-2209
> Project: Hadoop YARN
> Issue Type: Improvement
> Reporter: Jian He
> Assignee: Jian He
> Attachments: YARN-2209.1.patch, YARN-2209.2.patch, YARN-2209.3.patch,
> YARN-2209.4.patch, YARN-2209.5.patch
>
>
> YARN-1365 introduced an ApplicationMasterNotRegisteredException to tell the
> application to re-register on RM restart. We should do the same for the
> AMS#allocate call.