Yesha Vora created YARN-9570:
--------------------------------

             Summary: Application in pending-ordering-policy is not considered 
during container allocation
                 Key: YARN-9570
                 URL: https://issues.apache.org/jira/browse/YARN-9570
             Project: Hadoop YARN
          Issue Type: Bug
          Components: capacity scheduler
            Reporter: Yesha Vora


This is a 5-node cluster with 15GB total capacity.

1) Configure the Capacity Scheduler and set max cluster priority=10
2) Launch app1 with no priority and wait for it to occupy the full cluster
application_1558135983180_0001 is launched with Priority=0
3) Launch app2 with priority=2 and check that it is in ACCEPTED state
application_1558135983180_0002 is launched with Priority=2
4) Launch app3 with priority=3 and check that it is in ACCEPTED state
application_1558135983180_0003 is launched with Priority=3
5) Kill a container from app1
6) Verify that app3, which has the higher priority, goes to RUNNING state.

When max-application-master-percentage is set to 0.1, app2 goes to RUNNING 
state even though app3 has higher priority.

Root cause:
In the Capacity Scheduler LeafQueue, there are two ordering lists:

If the queue's total application master usage is below 
maxAMResourcePerQueuePercent, the app is added to the "ordering-policy" list.
Otherwise, the app is added to the "pending-ordering-policy" list.
During allocation, only apps in "ordering-policy" are considered.
When an app finishes, the queue config changes, or a node is added or removed, 
"pending-ordering-policy" is reconsidered, and some apps from 
"pending-ordering-policy" are moved to "ordering-policy".
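
The two lists described above can be sketched with a toy model. This is not 
the LeafQueue code: ToyLeafQueue, App, am_gb and the method names are 
hypothetical, and a 1GB AM per app is an assumption made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class App:
    name: str
    priority: int
    am_gb: int = 1  # assumed AM size, not taken from the report


@dataclass
class ToyLeafQueue:
    am_limit_gb: int
    ordering: list = field(default_factory=list)  # "ordering-policy"
    pending: list = field(default_factory=list)   # "pending-ordering-policy"

    def am_used_gb(self):
        return sum(app.am_gb for app in self.ordering)

    def submit(self, app):
        # New apps wait in pending until there is AM headroom.
        self.pending.append(app)
        self.activate()

    def activate(self):
        # Move pending apps into ordering, highest priority first,
        # for as long as the AM limit is not exceeded.
        for app in sorted(self.pending, key=lambda a: -a.priority):
            if self.am_used_gb() + app.am_gb > self.am_limit_gb:
                break
            self.pending.remove(app)
            self.ordering.append(app)

    def finish(self, app):
        # An app finishing is one of the events that re-checks pending.
        self.ordering.remove(app)
        self.activate()

    def candidates(self):
        # Only apps in ordering are considered for container allocation.
        return sorted(self.ordering, key=lambda a: -a.priority)
```

In this model, nothing moves out of pending unless activate() runs, so a 
high-priority app in pending is invisible to candidates() in the meantime.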

This behavior leads to the issue reported in this JIRA:

The cluster has 15GB of resources and max-application-master-percentage is set 
to 0.1. So the queue can accept at most 2GB of AM resource (1.5GB rounded up to 
a multiple of 1GB), which equals 2 applications.
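
The arithmetic can be checked with a short sketch. The helper am_limit_gb 
and its parameter names are hypothetical, rounding up to a 1GB step follows 
the description above, and the 1GB-per-AM figure is an assumption implied by 
"2GB equals 2 applications".

```python
import math


def am_limit_gb(cluster_gb, am_percent, step_gb=1):
    """AM resource limit, rounded up to a multiple of step_gb."""
    raw = cluster_gb * am_percent  # 15 * 0.1 is about 1.5
    return math.ceil(raw / step_gb) * step_gb


limit = am_limit_gb(15, 0.1)   # 2
am_size_gb = 1                 # assumed AM size per application
max_active_apps = limit // am_size_gb
print(limit, max_active_apps)  # prints "2 2"
```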
When app2 is submitted, it is added to ordering-policy.
When app3 is submitted, it is added to pending-ordering-policy.
When we kill app1, it does not finish immediately. Instead, it remains part of 
"ordering-policy" until all of its containers are released, which keeps app3 
in pending-ordering-policy.
So app3 cannot pick up any of the resources released by app1.
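
The sequence can be replayed as a rough simulation. The replay function and 
its event names are hypothetical and only mirror the description above: a 
limit of 2 active apps, a killed app that stays active until it finishes, and 
pending apps that are promoted only when an app finishes.

```python
def replay(events, max_active=2):
    """Return which app receives each released container (or None)."""
    active, pending = {}, {}  # app name -> priority
    dying = set()             # killed apps whose containers are not all released
    grants = []
    for kind, name, *rest in events:
        if kind == "submit":
            target = active if len(active) < max_active else pending
            target[name] = rest[0]
        elif kind == "kill":
            dying.add(name)   # the app is still counted as active
        elif kind == "release":
            # Only active apps that still want containers are considered.
            askers = {n: p for n, p in active.items() if n not in dying}
            grants.append(max(askers, key=askers.get) if askers else None)
        elif kind == "finish":
            active.pop(name)
            dying.discard(name)
            # Pending apps are re-checked only at this point.
            while pending and len(active) < max_active:
                best = max(pending, key=pending.get)
                active[best] = pending.pop(best)
    return grants


events = [
    ("submit", "app1", 0),
    ("submit", "app2", 2),
    ("submit", "app3", 3),  # goes to pending: two apps are already active
    ("kill", "app1"),
    ("release", "app1"),    # app3 is still pending, so app2 wins
    ("release", "app1"),
    ("finish", "app1"),     # only now is app3 activated
    ("release", "app2"),    # app3 finally outranks app2
]
print(replay(events))       # prints ['app2', 'app2', 'app3']
```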



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
