Bonnie Xu created YARN-11171:
--------------------------------
Summary: Help figuring out why high priority jobs are starving and
low priority jobs are not being preempted
Key: YARN-11171
URL: https://issues.apache.org/jira/browse/YARN-11171
Project: Hadoop YARN
Issue Type: Bug
Components: fairscheduler
Affects Versions: 3.2.1
Reporter: Bonnie Xu
Hi! Recently we've been running into an issue in our production systems where
a high priority job is starved by a lower priority job and preemption isn't
kicking in to rebalance resources for over an hour. Our understanding, at
least, is that when a higher priority job shows up, resources should be
preempted from the lower priority queue relatively quickly, based on fair
share allocations and the fair share preemption timeout.
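For reference, our understanding is that fair share preemption also depends on
a couple of cluster-level switches in yarn-site.xml; a minimal sketch of the
properties we believe are involved (values here are illustrative, not
necessarily what we have set):
{code:xml}
<!-- yarn-site.xml sketch: cluster-wide FairScheduler preemption switches -->
<configuration>
  <property>
    <!-- preemption is off by default; it must be enabled for any timeout to matter -->
    <name>yarn.scheduler.fair.preemption</name>
    <value>true</value>
  </property>
  <property>
    <!-- preemption only kicks in once overall cluster utilization exceeds this fraction -->
    <name>yarn.scheduler.fair.preemption.cluster-utilization-threshold</name>
    <value>0.8</value>
  </property>
</configuration>{code}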
+*This is for the higher priority queue (high):*+
!https://paper.dropbox.com/ep/redirect/image?url=https%3A%2F%2Fpaper-attachments.dropbox.com%2Fs_2C3F7CB2982B9542EDF25C7829E4FC9F52683EF3F37B8BF0F033955DB8D447D3_1654287958159_file.png&hmac=Km%2B2JsKoHiuN9ymq2Pz4bcexI%2FdsWDWIkmLdFCfufIg%3D&width=1490|width=738,height=298!
Between 23:30 and 0:45, notice that the higher priority queue consistently
demands a lot of memory and, based on its weight, should be allocated at least
half of it, but it never gets its fair share.
+*This is for the lower priority queue (medium):*+
!https://paper.dropbox.com/ep/redirect/image?url=https%3A%2F%2Fpaper-attachments.dropbox.com%2Fs_2C3F7CB2982B9542EDF25C7829E4FC9F52683EF3F37B8BF0F033955DB8D447D3_1654287958172_file.png&hmac=%2F6oGoh1smD9OdcmNXlwrIEudFjTVaofHMetUXfKb2KY%3D&width=1490|width=725,height=250!
Notice that over the same period, the medium subqueue is using far more than
its fair share.
One interesting thing (possibly related to the issue) is that when this
happens, the queue is at its max resources and we see a lot of these
diagnostics:
{code:java}
diagnostics: [Mon May 16 06:29:28 +0000 2022] Application is added to the
scheduler and is not yet activated. (Resource request: <memory:27136,
vCores:4> exceeds current queue or its parents maximum resource allowed). Max
share of queue: <memory:9223372036854775807, vCores:2147483647> {code}
This application in particular stays in this state for about an hour and only
gets the resources it needs once the low priority job finishes. Note that the
max share reported for the queue looks off: <memory:9223372036854775807,
vCores:2147483647> is Long.MAX_VALUE / Integer.MAX_VALUE, i.e. effectively
unlimited, which seems inconsistent with the message that the request exceeds
the queue's (or its parents') maximum.
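For context, the per-queue cap we mean by "max resources" is the maxResources
element in the allocation file; a minimal sketch with purely illustrative
values (the queue name and numbers are hypothetical, not our actual settings):
{code:xml}
<!-- fair-scheduler.xml sketch: an explicit per-queue cap (illustrative values) -->
<queue name="medium">
  <!-- hard cap on the queue; a queue with no maxResources reports an unlimited max share -->
  <maxResources>500000 mb, 200 vcores</maxResources>
  <weight>2</weight>
</queue>{code}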
Our current preemption config for this cluster:
{code:xml}
<fairSharePreemptionThreshold>1</fairSharePreemptionThreshold>
<fairSharePreemptionTimeout>900</fairSharePreemptionTimeout>
<minSharePreemptionTimeout>180</minSharePreemptionTimeout>{code}
{code:xml}
<queue name="low">
<weight>1</weight>
</queue>
<queue name="medium">
<weight>2</weight>
</queue>
<queue name="high">
<fairSharePreemptionTimeout>300</fairSharePreemptionTimeout>
<weight>3</weight>
</queue>
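<!-- the sub-queues above are nested under a parent queue whose opening tag is not shown in this snippet -->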
</queue>{code}
We've tried taking a heap dump and enabling debug logging. One of our theories
is that the preemption thread may check whether the queue can actually take on
more resources before preempting, and since the queue is already at its max
resources, that check never passes.
Nothing conclusive yet, though. We'd love any assistance or insight you could
provide, and we're happy to give more details as well.