Jian He commented on YARN-4479:

- For a finished attempt, I think we do not need to re-add it to the scheduler, 
so this whole block could be removed:
          if (EnumSet.of(RMAppAttemptState.RUNNING, RMAppAttemptState.LAUNCHED)
              .contains(appAttempt.recoveredFinalState)) {
            appAttempt.scheduler.handle(new AppAttemptAddedSchedulerEvent(
                appAttempt.getAppAttemptId(), false, true, true));
          } else {
            appAttempt.scheduler.handle(new AppAttemptAddedSchedulerEvent(
                appAttempt.getAppAttemptId(), false, true));
          }
Accordingly, in BaseFinalTransition, this code needs to be invoked if 
recoveredFinalState == null:
 appAttempt.eventHandler.handle(new AppAttemptRemovedSchedulerEvent(
        appAttemptId, finalAttemptState, keepContainersAcrossAppAttempts));
- With the above change, we can assume that an attempt added to the scheduler is 
running, so the extra field wasAttemptRunning in AppAttemptAddedSchedulerEvent 
is not needed; the existing isAttemptRecovering flag should be enough.
- I think [~Naganarasimha]'s suggestion makes sense. We should consider 
FairComparator too. Maybe we can add a predefined comparator in 
AbstractComparatorOrderingPolicy, with the recoveryComparator initialized, and 
force the underlying implementations to use it?
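
To illustrate the last point, here is a minimal sketch of the predefined-comparator idea. It is not the YARN code: SketchApp, SketchComparatorOrderingPolicy and SketchPriorityFifoPolicy are hypothetical stand-ins, and the "recovering" flag is an assumed simplification of isAttemptRecovering. The base class owns the recovery comparator and applies it ahead of whatever ordering a subclass supplies, so no implementation can skip it.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical stand-in for a schedulable application; not the YARN class.
final class SketchApp {
    final String id;
    final int priority;       // higher value means higher priority
    final long submitOrder;   // lower value means submitted earlier
    final boolean recovering; // assumed flag: attempt was running before restart

    SketchApp(String id, int priority, long submitOrder, boolean recovering) {
        this.id = id;
        this.priority = priority;
        this.submitOrder = submitOrder;
        this.recovering = recovering;
    }
}

// Hypothetical base policy with the recovery comparator predefined.
abstract class SketchComparatorOrderingPolicy {
    // Recovering apps sort first: the key is false for them, and false < true.
    private static final Comparator<SketchApp> RECOVERY_COMPARATOR =
            Comparator.comparing((SketchApp a) -> !a.recovering);

    // Subclasses supply only their own ordering (priority, FIFO, fair, ...).
    protected abstract Comparator<SketchApp> policyComparator();

    // final, so every implementation is forced through the recovery comparator.
    final Comparator<SketchApp> comparator() {
        return RECOVERY_COMPARATOR.thenComparing(policyComparator());
    }

    // Returns the app ids in the order this policy would consider them.
    final List<String> order(List<SketchApp> apps) {
        List<SketchApp> sorted = new ArrayList<>(apps);
        sorted.sort(comparator());
        List<String> ids = new ArrayList<>();
        for (SketchApp a : sorted) {
            ids.add(a.id);
        }
        return ids;
    }
}

// Example subclass: higher priority first, then earlier submission.
final class SketchPriorityFifoPolicy extends SketchComparatorOrderingPolicy {
    @Override
    protected Comparator<SketchApp> policyComparator() {
        return Comparator.comparingInt((SketchApp a) -> a.priority)
                .reversed()
                .thenComparingLong(a -> a.submitOrder);
    }
}
```

With this shape, a recovered low-priority app sorts ahead of a newly submitted high-priority app, while apps with the same recovery status keep the subclass's ordering. A fair-share variant would only need to override policyComparator().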

> Retrospect app-priority in pendingOrderingPolicy during recovering 
> applications
> -------------------------------------------------------------------------------
>                 Key: YARN-4479
>                 URL: https://issues.apache.org/jira/browse/YARN-4479
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: api, resourcemanager
>            Reporter: Rohith Sharma K S
>            Assignee: Rohith Sharma K S
>         Attachments: 0001-YARN-4479.patch, 0002-YARN-4479.patch
> Currently, the same ordering policy is used for pending applications and 
> active applications. When priority is configured for an application, 
> high-priority applications get activated first during recovery. It is 
> possible that a low-priority job had been submitted and was in the running 
> state before recovery. 
> This causes the low-priority job to starve after recovery.

This message was sent by Atlassian JIRA
