[ 
https://issues.apache.org/jira/browse/YARN-4479?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15108236#comment-15108236
 ] 

Wangda Tan commented on YARN-4479:
----------------------------------

Thanks [~sunilg]/[~Naganarasimha],

I looked at the existing code again, and I still think this logic should 
live in the orderingPolicy only.

This logic increases the complexity of LeafQueue, which has to handle the newly 
added {{pendingOPForRecoveredApps}} in many places, such as:
{code}
if (application.isAttemptRecovering()) {
  pendingOPForRecoveredApps.removeSchedulableEntity(application);
} else {
  pendingOrderingPolicy.removeSchedulableEntity(application);
}
{code}
And
{code}
for (FiCaSchedulerApp pendingApp : pendingOPForRecoveredApps
    .getSchedulableEntities()) {
  apps.add(pendingApp.getApplicationAttemptId());
}
for (FiCaSchedulerApp pendingApp : pendingOrderingPolicy
{code}

In addition to this problem, I just noticed another issue with the 
pending-ordering-policy introduced by YARN-3873.
It assumes that a queue's ordering-policy for pending apps should be the same 
as its ordering-policy for active apps. That is not actually the case: for 
example, the pending-ordering-policy of fair-ordering-policy should be FIFO 
instead of FAIR.

Some ideas off the top of my head to fix the above issue and clean up the code:
- Keep the changes to SchedulerApplicationAttempt/CapacityScheduler in the 
patch 0006-YARN-4479.patch that set the "isRecoverying" field.
- Add a RecoveryComparator, and add a new FifoOrderingPolicyForPendingApps 
that extends FifoOrderingPolicy but uses RecoveryComparator.
- Instead of using the queue's configured ordering-policy for pending apps, 
the queue should use FifoOrderingPolicyForPendingApps.

With this change, users cannot configure the ordering-policy for pending apps. 
I don't see a strong real-life requirement for that right now.
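
To make the idea above more concrete, here is a rough, self-contained sketch 
of how the pieces might fit together. It does not use the real YARN classes: 
App is a hypothetical stand-in for FiCaSchedulerApp, FifoOrderingPolicy here 
is a simplified TreeSet-backed model rather than the Hadoop class, and the 
recovering/submitOrder fields and the orderedIds() helper are assumptions made 
only for illustration. The ordering it assumes (recovering attempts first, 
then FIFO) is one reading of the proposal, not the final patch.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.TreeSet;

public class PendingOrderingSketch {

    /** Hypothetical stand-in for FiCaSchedulerApp: only what the comparators need. */
    static final class App {
        final String id;
        final long submitOrder;   // lower value = submitted earlier (assumption)
        final boolean recovering; // stand-in for the recovery flag set by the patch

        App(String id, long submitOrder, boolean recovering) {
            this.id = id;
            this.submitOrder = submitOrder;
            this.recovering = recovering;
        }
    }

    /** Sorts recovering attempts ahead of newly submitted ones. */
    static final class RecoveryComparator implements Comparator<App> {
        @Override
        public int compare(App a, App b) {
            if (a.recovering == b.recovering) {
                return 0;
            }
            return a.recovering ? -1 : 1;
        }
    }

    /** Plain FIFO: earlier submission first, id as a tie-breaker. */
    static final class FifoComparator implements Comparator<App> {
        @Override
        public int compare(App a, App b) {
            int bySubmit = Long.compare(a.submitOrder, b.submitOrder);
            return bySubmit != 0 ? bySubmit : a.id.compareTo(b.id);
        }
    }

    /** Simplified model of a FIFO ordering policy (not the Hadoop class). */
    static class FifoOrderingPolicy {
        protected final TreeSet<App> entities;

        FifoOrderingPolicy() {
            this(new FifoComparator());
        }

        protected FifoOrderingPolicy(Comparator<App> comparator) {
            this.entities = new TreeSet<>(comparator);
        }

        void addSchedulableEntity(App app) {
            entities.add(app);
        }

        boolean removeSchedulableEntity(App app) {
            return entities.remove(app);
        }

        /** Hypothetical helper: ids in the order the apps would be activated. */
        List<String> orderedIds() {
            List<String> ids = new ArrayList<>();
            for (App app : entities) {
                ids.add(app.id);
            }
            return ids;
        }
    }

    /** Pending-app policy: recovering apps first, FIFO within each group. */
    static final class FifoOrderingPolicyForPendingApps extends FifoOrderingPolicy {
        FifoOrderingPolicyForPendingApps() {
            super(new RecoveryComparator().thenComparing(new FifoComparator()));
        }
    }

    public static void main(String[] args) {
        FifoOrderingPolicy pending = new FifoOrderingPolicyForPendingApps();
        pending.addSchedulableEntity(new App("app1", 1, false)); // new submission
        pending.addSchedulableEntity(new App("app2", 2, true));  // recovered
        pending.addSchedulableEntity(new App("app3", 3, true));  // recovered
        System.out.println(pending.orderedIds()); // prints [app2, app3, app1]
    }
}
```

With something along these lines, LeafQueue would keep a single pending 
policy and call addSchedulableEntity/removeSchedulableEntity on it directly, 
without branching on whether the attempt is recovering.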

Thoughts?

> Retrospect app-priority in pendingOrderingPolicy during recovering 
> applications
> -------------------------------------------------------------------------------
>
>                 Key: YARN-4479
>                 URL: https://issues.apache.org/jira/browse/YARN-4479
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: api, resourcemanager
>            Reporter: Rohith Sharma K S
>            Assignee: Rohith Sharma K S
>             Fix For: 2.8.0
>
>         Attachments: 0001-YARN-4479.patch, 0002-YARN-4479.patch, 
> 0003-YARN-4479.patch, 0004-YARN-4479.patch, 0004-YARN-4479.patch, 
> 0005-YARN-4479.patch, 0006-YARN-4479.patch
>
>
> Currently, the same ordering policy is used for pending applications and 
> active applications. When priority is configured for an application, during 
> recovery high-priority applications get activated first. It is possible that 
> a low-priority job was already submitted and in the running state. 
> This causes the low-priority job to starve after recovery.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
