[
https://issues.apache.org/jira/browse/YARN-2074?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14005370#comment-14005370
]
Xuan Gong commented on YARN-2074:
---------------------------------
Comments:
1. {code}
RMAppAttempt attempt =
new RMAppAttemptImpl(appAttemptId, rmContext, scheduler, masterService,
submissionContext, conf, maxAppAttempts <= attempts.size());
{code}
Using this condition to decide whether this RMAppAttempt is the last attempt
does not sound right to me.
For example, suppose we set maxAppAttempts to 3 and the previous 2 AMs were
preempted. Based on the condition you set here, the next RMAppAttempt would be
treated as the last attempt. If that attempt fails, the whole application will
be marked as failed, even though no attempt so far failed because of the AM.
2. {code}
public boolean isPreempted() {
return getDiagnostics().contains(SchedulerUtils.PREEMPTED_CONTAINER);
}
{code}
It is fine to use this to check isPreempted. But see
https://issues.apache.org/jira/browse/YARN-614: that ticket says we should
separate hardware failures and YARN issues from AM failures, and not count
them as AM failures. I think preemption of the AM is one of those cases. So
maybe we could use a more general way to check whether the AM was preempted
(check ContainerExitStatus instead?).
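To make the two comments above concrete, here is a rough, self-contained sketch of what I have in mind. It is not the patch and does not use the real RM classes: ExitStatus and AttemptAccounting are hypothetical stand-ins, the constant values are assumed to mirror ContainerExitStatus, and which statuses get excluded (preempted, aborted, disks failed) is only one possible policy.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for illustration; this is not the real YARN class.
final class ExitStatus {
    // Values assumed to mirror the constants in YARN's ContainerExitStatus.
    static final int ABORTED = -100;
    static final int DISKS_FAILED = -101;
    static final int PREEMPTED = -102;

    private ExitStatus() {}
}

// Hypothetical helper showing exit-status based attempt accounting.
final class AttemptAccounting {
    private AttemptAccounting() {}

    // Returns false when the AM container exited for a reason that is not
    // the AM's own fault. The set of excluded statuses is an assumption.
    static boolean shouldCountTowardsMaxAttempts(int amContainerExitStatus) {
        switch (amContainerExitStatus) {
            case ExitStatus.PREEMPTED:
            case ExitStatus.ABORTED:
            case ExitStatus.DISKS_FAILED:
                return false;
            default:
                return true;
        }
    }

    // Counts only the previous attempts that should be blamed on the AM.
    static int countedFailures(List<Integer> previousExitStatuses) {
        int counted = 0;
        for (int status : previousExitStatuses) {
            if (shouldCountTowardsMaxAttempts(status)) {
                counted++;
            }
        }
        return counted;
    }

    // The next attempt is the last one only if the counted failures, plus
    // the attempt about to start, reach the configured maximum.
    static boolean isLastAttempt(int maxAppAttempts,
                                 List<Integer> previousExitStatuses) {
        return countedFailures(previousExitStatuses) + 1 >= maxAppAttempts;
    }
}
```

With maxAppAttempts = 3 and two preempted AMs, isLastAttempt(3, Arrays.asList(ExitStatus.PREEMPTED, ExitStatus.PREEMPTED)) is false, so a third failure would not kill the application. With two genuine AM failures (for example exit status 1 twice), it is true.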
> Preemption of AM containers shouldn't count towards AM failures
> ---------------------------------------------------------------
>
> Key: YARN-2074
> URL: https://issues.apache.org/jira/browse/YARN-2074
> Project: Hadoop YARN
> Issue Type: Sub-task
> Components: resourcemanager
> Reporter: Vinod Kumar Vavilapalli
> Assignee: Jian He
> Attachments: YARN-2074.1.patch, YARN-2074.2.patch
>
>
> One orthogonal concern with issues like YARN-2055 and YARN-2022 is that AM
> containers getting preempted shouldn't count towards AM failures and thus
> shouldn't eventually fail applications.
> We should explicitly handle AM container preemption/kill as a separate issue
> and not count it towards the limit on AM failures.