Muhammad Samir Khan created YARN-6834:
-----------------------------------------
Summary: A container request with only racks specified and relax
locality set to false is never honoured
Key: YARN-6834
URL: https://issues.apache.org/jira/browse/YARN-6834
Project: Hadoop YARN
Issue Type: Bug
Components: capacity scheduler
Reporter: Muhammad Samir Khan
The attached patch adds a unit test that reproduces the issue. The test creates
a container request with only racks specified (nodes=null) and relax locality
set to false. With the node-locality-delay conf set appropriately, the container
is never allocated and the test times out.
My understanding of the cause is as follows. The RegularContainerAllocator
delays a rack-local allocation according to the node-locality-delay parameter,
and that delay is measured in missed opportunities. However, because relax
locality is set to false, the corresponding off-switch request is skipped
without being counted as a missed opportunity. The count therefore never
reaches the delay threshold, and the allocator waits indefinitely.
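The reasoning above can be illustrated with a small, self-contained toy model.
It is only a sketch of the behaviour as I understand it, not the actual
RegularContainerAllocator code: the class name, the method name, and the
assumption that every heartbeat comes from a node on the requested rack are all
made up for illustration.

```java
// Toy model of the delay logic described above. All names are hypothetical;
// this is NOT the real RegularContainerAllocator implementation.
final class LocalityDelaySketch {

    /**
     * Simulates node heartbeats for a rack-only request.
     *
     * @return the 1-based heartbeat at which the rack-local container is
     *         allocated, or -1 if nothing is allocated within maxHeartbeats
     */
    static int firstRackLocalAllocation(int nodeLocalityDelay,
                                        boolean relaxLocality,
                                        int maxHeartbeats) {
        int missedOpportunities = 0;
        for (int heartbeat = 1; heartbeat <= maxHeartbeats; heartbeat++) {
            // Rack-local allocation is delayed until enough opportunities
            // have been missed.
            if (missedOpportunities >= nodeLocalityDelay) {
                return heartbeat;
            }
            // Otherwise the off-switch request is considered. Assumption:
            // it only counts as a missed opportunity when relax locality is
            // true; with relax locality false it is skipped silently.
            if (relaxLocality) {
                missedOpportunities++;
            }
        }
        return -1; // the counter never reached the delay threshold
    }

    public static void main(String[] args) {
        System.out.println("relaxLocality=true:  "
            + firstRackLocalAllocation(3, true, 1000));
        System.out.println("relaxLocality=false: "
            + firstRackLocalAllocation(3, false, 1000));
    }
}
```

In this model, with a delay of 3, the relaxed request is allocated on the
fourth heartbeat, while the request with relax locality set to false returns -1
no matter how many heartbeats are simulated. That matches the indefinite wait
seen in the unit test.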