[ https://issues.apache.org/jira/browse/YARN-6538?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17307191#comment-17307191 ]
Michael Zeoli commented on YARN-6538:
-------------------------------------

Eric - thanks for the response, and apologies for the absence. Currently we have not been able to reproduce this outside of our particular pipeline, though we stopped trying in earnest once our platform vendor indicated they were able to reproduce it with a purpose-built MR job (we are currently working the issue with them). I will try to get details.

Essentially what we see is a single job (in lq1) with several thousand pending containers taking the entire cluster (expected, via dynamic allocation). When a second job enters lq2, it fails to receive executors despite having a guaranteed minimum capacity of 17% (approx. 4 cores: 28 * 0.95 * 0.17). On occasion it also fails to receive an AM. If a third job enters lq3 at this point, it also fails to receive executors. The jobs continue to starve until the first job begins releasing resources as its pending containers fall to zero.

YARN resources (4 NMs, so 280 GiB / 28 vcores total YARN resources):
* yarn.nodemanager.resource.cpu-vcores = 7
* yarn.scheduler.maximum-allocation-vcores = 7
* yarn.nodemanager.resource.memory-mb = 70 GiB
* yarn.scheduler.maximum-allocation-mb = 40 GiB

Queue configuration (note that only lq1, lq2 and lq3 are used in the current tests):
* root.default capacity = 5%
* root.tek capacity = 95%
* root.tek.lq1, .lq2, .lq3, .lq4 capacity = 17% each
* root.tek.lq5, .lq6 capacity = 16% each

For all lqN (leaf queues):
* minimum user limit = 25%
* user limit factor = 100 (intentionally set high to allow a user to exceed queue capacity when idle capacity exists)
* maximum capacity = 100%
* maximum AM resource limit = 20%
* inter-/intra-queue preemption: enabled
* ordering policy = Fair

Spark configuration:
* spark.executor.cores=1
* spark.executor.memory=5G
* spark.driver.memory=4G
* spark.driver.maxResultSize=2G
* spark.executor.memoryOverhead=1024
* spark.dynamicAllocation.enabled = true

> Inter Queue preemption is not happening when DRF is configured
> ---------------------------------------------------------------
>
> Key: YARN-6538
> URL: https://issues.apache.org/jira/browse/YARN-6538
> Project: Hadoop YARN
> Issue Type: Sub-task
> Components: capacity scheduler, scheduler preemption
> Affects Versions: 2.8.0
> Reporter: Sunil G
> Assignee: Sunil G
> Priority: Major
>
> Cluster capacity of <memory:3TB, vCores:168>. Here memory is plentiful while vcores are comparatively scarce, so if applications have enough demand, vcores may be exhausted first.
> Inter-queue preemption should ideally kick in once vcores are over-utilized; however, preemption is not happening.
> Analysis:
> In {{AbstractPreemptableResourceCalculator.computeFixpointAllocation}},
> {code}
> // assign all cluster resources until no more demand, or no resources are
> // left
> while (!orderedByNeed.isEmpty() && Resources.greaterThan(rc, totGuarant,
>     unassigned, Resources.none())) {
> {code}
> loops even when vcores are 0 (because memory is still positive). Hence idealAssigned ends up with more vcores than actually exist, which causes the no-preemption cases.
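For illustration, here is a minimal, self-contained sketch of the loop condition called out in the analysis above. The class name and the literal resource values are invented for the example, but Resource.newInstance, Resources.greaterThan and DominantResourceCalculator are the YARN utility APIs the preemption code relies on; under DRF the comparison is driven by the dominant (memory) share, so it remains true even after vcores hit zero.

{code}
import org.apache.hadoop.yarn.api.records.Resource;
import org.apache.hadoop.yarn.util.resource.DominantResourceCalculator;
import org.apache.hadoop.yarn.util.resource.ResourceCalculator;
import org.apache.hadoop.yarn.util.resource.Resources;

public class FixpointLoopConditionDemo {
  public static void main(String[] args) {
    ResourceCalculator rc = new DominantResourceCalculator();

    // Cluster guarantee roughly matching the description: 3 TB memory, 168 vcores.
    Resource totGuarant = Resource.newInstance(3 * 1024 * 1024, 168);

    // Part-way through computeFixpointAllocation: memory is still unassigned,
    // but every vcore has already been handed out.
    Resource unassigned = Resource.newInstance(512 * 1024, 0);

    // Same comparison as the while condition quoted above. With DRF the
    // dominant share of 'unassigned' is its memory share (> 0), so this is
    // still true even though no vcores are left, and the loop keeps adding
    // non-existent vcores to idealAssigned.
    boolean keepLooping =
        Resources.greaterThan(rc, totGuarant, unassigned, Resources.none());
    System.out.println("loop condition with 0 vcores left: " + keepLooping); // prints true
  }
}
{code}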
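Separately, a quick back-of-the-envelope check of the lq2 guarantee described in the comment above (the class and its arithmetic are purely illustrative, not part of any YARN API; the input numbers come from the reported configuration):

{code}
public class Lq2GuaranteeEstimate {
  public static void main(String[] args) {
    double clusterVcores = 4 * 7;    // 4 NMs x 7 vcores = 28
    double clusterMemGiB = 4 * 70;   // 4 NMs x 70 GiB = 280
    double tekShare = 0.95;          // root.tek capacity
    double lq2Share = 0.17;          // root.tek.lq2 capacity

    double guaranteedVcores = clusterVcores * tekShare * lq2Share; // ~4.5
    double guaranteedMemGiB = clusterMemGiB * tekShare * lq2Share; // ~45.2

    // Each executor in this setup needs 1 vcore and 5 GiB + 1 GiB overhead
    // = 6 GiB, so the guarantee should cover roughly 4 executors, yet the
    // second job receives none until the first job's pending containers drain.
    long executorsByVcores = (long) Math.floor(guaranteedVcores / 1.0);
    long executorsByMemory = (long) Math.floor(guaranteedMemGiB / 6.0);
    System.out.printf("lq2 guarantee: %.1f vcores, %.1f GiB -> ~%d executors%n",
        guaranteedVcores, guaranteedMemGiB,
        Math.min(executorsByVcores, executorsByMemory));
  }
}
{code}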