[
https://issues.apache.org/jira/browse/YARN-2209?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14077038#comment-14077038
]
Jian He commented on YARN-2209:
-------------------------------
Hi Zhijie, thanks for the review. Here are some responses:
bq. Why is it necessary to use the exception instead of the flag to indicate
the RM restarting? I
Because as you can see, not just allocate API, unregisterResponse is also
required to add AMCommand otherwise. Basically, every AMS API other than
register requires adding a new field otherwise. Throwing exception is much
cleaner way.
bq. For example, MR of prior versions will no longer work properly with a YARN
cluster after this patch during RM restarting.
Not matter how application is reacting to the shutdown command, NM will shoot
down the AM container during RM restart. So prior applications(including MR)
should still work. Even earlier MR AM container is possibly killed by NM
before it actually successfully performs any shutting down logic.
bq. Deprecate the enum type instead of each enum value?
Maybe not deprecating AMCommand, as we may add other commands later on as
needed.
bq. Why not throwing ApplicationAttemptNotFoundException instead? It sounds
more reasonable here, doesn’t it?
Do you mean creating a new ApplicationAttemptNotFoundException exception ? I
think it's fine to just reuse the ApplicationNotFoundException as they are
quite similar. The internal exception msg shows the attemptId.
bq. Is this change necessary?
It is. because the finally block (i.e. "if(allocateResponse == null)" ) will be
executed otherwise.
bq. shall we split the patch into two pieces: one for YARN and the other for MR,
will split once review is done. I think it'll be easier to review with both
side changes having more context.
bq. No need to break it into two lines, right?
will fix it.
> Replace AM resync/shutdown command with corresponding exceptions
> ----------------------------------------------------------------
>
> Key: YARN-2209
> URL: https://issues.apache.org/jira/browse/YARN-2209
> Project: Hadoop YARN
> Issue Type: Improvement
> Reporter: Jian He
> Assignee: Jian He
> Attachments: YARN-2209.1.patch, YARN-2209.2.patch, YARN-2209.3.patch,
> YARN-2209.4.patch, YARN-2209.5.patch
>
>
> YARN-1365 introduced an ApplicationMasterNotRegisteredException to indicate
> application to re-register on RM restart. we should do the same for
> AMS#allocate call also.
--
This message was sent by Atlassian JIRA
(v6.2#6252)