[
https://issues.apache.org/jira/browse/YARN-4479?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15108236#comment-15108236
]
Wangda Tan commented on YARN-4479:
----------------------------------
Thanks [~sunilg]/[~Naganarasimha],
I looked at the existing code again, and I still think this logic should
live in orderingPolicy only.
Keeping it in LeafQueue makes LeafQueue more complex, because it has to handle
the newly added {{pendingOPForRecoveredApps}} in many places, such as:
{code}
if (application.isAttemptRecovering()) {
  pendingOPForRecoveredApps.removeSchedulableEntity(application);
} else {
  pendingOrderingPolicy.removeSchedulableEntity(application);
}
{code}
And
{code}
for (FiCaSchedulerApp pendingApp : pendingOPForRecoveredApps
    .getSchedulableEntities()) {
  apps.add(pendingApp.getApplicationAttemptId());
}
for (FiCaSchedulerApp pendingApp : pendingOrderingPolicy
{code}
In addition to this problem, I just noticed another issue with the
pending-ordering-policy introduced by YARN-3873.
It assumes a queue's ordering-policy for pending apps should be the same as
its ordering-policy for active apps. But that doesn't hold: for example, the
pending-ordering-policy of fair-ordering-policy should be FIFO instead of FAIR.
Some ideas off the top of my head to fix the above issue and clean up the code:
- Keep the changes for SchedulerApplicationAttempt/CapacityScheduler in the
patch 0006-YARN-4479.patch that set the "isRecoverying" field.
- Add a RecoveryComparator, and add a new FifoOrderingPolicyForPendingApps
which extends FifoOrderingPolicy but uses RecoveryComparator.
- Instead of using the queue's configured ordering-policy for pending apps,
the queue should use FifoOrderingPolicyForPendingApps.
With this change, users cannot configure the ordering-policy for pending apps,
but I don't see a strong real-life requirement for that now.
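To make the idea concrete, here is a rough, self-contained sketch (not the patch itself, and not using the real Hadoop classes). PendingApp, PendingAppsPolicySketch, submitSeq and activationOrder are hypothetical stand-ins; only the "recovering apps first" rule comes from the proposal, and the FIFO tie-break by submission order is an assumption on my side.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.TreeSet;

/** Hypothetical stand-in for FiCaSchedulerApp; only what the sketch needs. */
final class PendingApp {
    final String id;
    final long submitSeq;      // assumed unique, increasing with submission time
    final boolean recovering;  // plays the role of isAttemptRecovering()

    PendingApp(String id, long submitSeq, boolean recovering) {
        this.id = id;
        this.submitSeq = submitSeq;
        this.recovering = recovering;
    }
}

/** Recovering apps sort first; ties fall back to submission order (assumed FIFO). */
final class RecoveryComparator implements Comparator<PendingApp> {
    @Override
    public int compare(PendingApp a, PendingApp b) {
        if (a.recovering != b.recovering) {
            return a.recovering ? -1 : 1;
        }
        return Long.compare(a.submitSeq, b.submitSeq);
    }
}

/** Simplified pending-apps policy: one sorted set, no second policy in the queue. */
final class PendingAppsPolicySketch {
    private final TreeSet<PendingApp> apps = new TreeSet<>(new RecoveryComparator());

    void addSchedulableEntity(PendingApp app) {
        apps.add(app);
    }

    boolean removeSchedulableEntity(PendingApp app) {
        return apps.remove(app);
    }

    /** Ids of pending apps in the order they would be considered for activation. */
    List<String> activationOrder() {
        List<String> ids = new ArrayList<>();
        for (PendingApp app : apps) {
            ids.add(app.id);
        }
        return ids;
    }

    public static void main(String[] args) {
        PendingAppsPolicySketch policy = new PendingAppsPolicySketch();
        policy.addSchedulableEntity(new PendingApp("a", 1, false));
        policy.addSchedulableEntity(new PendingApp("b", 2, true));
        policy.addSchedulableEntity(new PendingApp("c", 3, false));
        policy.addSchedulableEntity(new PendingApp("d", 4, true));
        System.out.println(policy.activationOrder()); // prints [b, d, a, c]
    }
}
```

With something like this, LeafQueue would only call add/remove on a single pending policy, and the isAttemptRecovering() branching would move into the comparator.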
Thoughts?
> Retrospect app-priority in pendingOrderingPolicy during recovering
> applications
> -------------------------------------------------------------------------------
>
> Key: YARN-4479
> URL: https://issues.apache.org/jira/browse/YARN-4479
> Project: Hadoop YARN
> Issue Type: Sub-task
> Components: api, resourcemanager
> Reporter: Rohith Sharma K S
> Assignee: Rohith Sharma K S
> Fix For: 2.8.0
>
> Attachments: 0001-YARN-4479.patch, 0002-YARN-4479.patch,
> 0003-YARN-4479.patch, 0004-YARN-4479.patch, 0004-YARN-4479.patch,
> 0005-YARN-4479.patch, 0006-YARN-4479.patch
>
>
> Currently, the same ordering policy is used for pending applications and
> active applications. When priority is configured for applications, high
> priority applications get activated first during recovery. It is possible
> that a low priority job had been submitted and was in running state.
> This causes starvation of the low priority job after recovery.