Yesha Vora created YARN-9570:
--------------------------------
Summary: Application in pending-ordering-policy is not considered
during container allocation
Key: YARN-9570
URL: https://issues.apache.org/jira/browse/YARN-9570
Project: Hadoop YARN
Issue Type: Bug
Components: capacity scheduler
Reporter: Yesha Vora
This is a 5-node cluster with 15GB total capacity.
1) Configure Capacity scheduler and set max cluster priority=10
2) launch app1 with no priority and wait for it to occupy full cluster
application_1558135983180_0001 is launched with Priority=0
3) launch app2 with priority=2 and check that it is in ACCEPTED state
application_1558135983180_0002 is launched with Priority=2
4) launch app3 with priority=3 and check that it is in ACCEPTED state
application_1558135983180_0003 is launched with Priority=3
5) kill container from app1
6) Verify app3 with higher priority goes to RUNNING state.
When max-application-master-percentage is set to 0.1, app2 goes to RUNNING
state even though app3 has higher priority.
Root cause:
In the CS LeafQueue, there are two ordering lists:
If the queue's total application master usage is below
maxAMResourcePerQueuePercent, the app is added to the "ordering-policy"
list.
Otherwise, the app is added to the "pending-ordering-policy" list.
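
The admission rule above can be sketched roughly as follows. This is a toy
model, not the LeafQueue code: the admit() helper, the dict layout, the 1GB
AM size and the 2GB AM limit are all assumptions made for illustration.

```python
# Toy model of the two LeafQueue lists (hypothetical names, not YARN code).
AM_LIMIT_MB = 2048  # assumed AM limit for the queue
AM_MB = 1024        # assumed AM container size per application


def admit(app_name, queue, am_mb=AM_MB, am_limit_mb=AM_LIMIT_MB):
    """Place an app in one of the two lists based on current AM usage."""
    if queue["am_used_mb"] + am_mb <= am_limit_mb:
        # Enough AM headroom: the app becomes schedulable.
        queue["ordering-policy"].append(app_name)
        queue["am_used_mb"] += am_mb
    else:
        # AM limit reached: the app waits.
        queue["pending-ordering-policy"].append(app_name)


queue = {"am_used_mb": 0, "ordering-policy": [], "pending-ordering-policy": []}
for name in ("app1", "app2", "app3"):
    admit(name, queue)

print(queue["ordering-policy"])          # ['app1', 'app2']
print(queue["pending-ordering-policy"])  # ['app3']
```

Note that priority plays no role at admission time in this sketch: app3 ends
up in the pending list only because it was submitted last.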
During allocation, only apps in "ordering-policy" are considered.
When an app finishes, the queue config changes, or a node is added or removed,
"pending-ordering-policy" is reconsidered, and some apps from
"pending-ordering-policy" are moved to "ordering-policy".
This behavior leads to the issue in this JIRA:
The cluster has 15GB of resources and max-application-master-percentage is set
to 0.1, so the queue can accept at most 2GB of AM resource (1.5GB rounded up to
a multiple of 1GB), which corresponds to 2 applications.
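
The arithmetic can be checked with a few lines. The 1GB rounding step is taken
from the description above; the 1GB AM container size per application is an
assumption, chosen because it matches the stated limit of 2 applications.

```python
import math

cluster_mb = 15 * 1024   # 15GB cluster
am_percent = 0.1         # max-application-master-percentage
round_to_mb = 1024       # limit is rounded up to a multiple of 1GB
am_container_mb = 1024   # assumed AM size per application

raw_limit_mb = cluster_mb * am_percent  # about 1536 MB, i.e. 1.5GB
am_limit_mb = math.ceil(raw_limit_mb / round_to_mb) * round_to_mb
max_active_apps = am_limit_mb // am_container_mb

print(am_limit_mb, max_active_apps)  # 2048 2
```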
When app2 is submitted, it is added to ordering-policy.
When app3 is submitted, it is added to pending-ordering-policy.
When we kill app1, it does not finish immediately. Instead, it remains part
of "ordering-policy" until all of its containers are released, which keeps
app3 in pending-ordering-policy.
So app3 cannot pick up any of the resources released by app1.
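
The scenario can be replayed with a small simulation. It is a simplified
sketch under stated assumptions, not the scheduler's logic: pick_next_app()
and activate_pending() are hypothetical helpers, each AM is assumed to need
1GB, and the AM limit is assumed to be 2GB.

```python
AM_LIMIT_MB = 2048  # assumed AM limit for the queue
AM_MB = 1024        # assumed AM size per application


def pick_next_app(active):
    """Allocation looks only at the active ("ordering-policy") list."""
    return max(active, key=lambda app: app["priority"])["name"]


def activate_pending(active, pending):
    """Move pending apps to the active list while AM headroom remains."""
    pending.sort(key=lambda app: -app["priority"])
    while pending and (len(active) + 1) * AM_MB <= AM_LIMIT_MB:
        active.append(pending.pop(0))


app1 = {"name": "app1", "priority": 0}
app2 = {"name": "app2", "priority": 2}
app3 = {"name": "app3", "priority": 3}
active = [app1, app2]  # "ordering-policy"
pending = [app3]       # "pending-ordering-policy"

# Step 5: one container of app1 is killed. app1 has not finished, so the
# pending list is not reconsidered and app3 is invisible to allocation.
after_kill = pick_next_app(active)

# Only once app1 has fully finished is the pending list reconsidered.
active.remove(app1)
activate_pending(active, pending)
after_finish = pick_next_app(active)

print(after_kill, after_finish)  # app2 app3
```

In this model the freed container goes to app2, which matches the reported
behavior; app3 becomes eligible only after app1 leaves the active list.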
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)