[
https://issues.apache.org/jira/browse/YARN-6834?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Muhammad Samir Khan updated YARN-6834:
--------------------------------------
Attachment: YARN-6834.001.patch
I am not sure whether the attached patch is the best way to solve the issue,
but I am putting it up for comments.
> A container request with only racks specified and relax locality set to false
> is never honoured
> -----------------------------------------------------------------------------------------------
>
> Key: YARN-6834
> URL: https://issues.apache.org/jira/browse/YARN-6834
> Project: Hadoop YARN
> Issue Type: Bug
> Components: capacity scheduler
> Reporter: Muhammad Samir Khan
> Attachments: YARN-6834.001.patch, yarn-6834-unittest.patch
>
>
> A patch for a unit test is attached to reproduce the issue. It creates a
> container request with only racks specified (nodes=null) and relax locality
> set to false. With the node-locality-delay configuration set appropriately,
> the container is never allocated and the test times out.
> My understanding of what causes this issue is as follows. The
> RegularContainerAllocator delays a rack local allocation based on the
> node-locality-delay parameter. This delay is based on missed opportunities.
> However, when the corresponding off-switch request is skipped because relax
> locality is set to false, the skip is not counted as a missed opportunity.
> The counter therefore never reaches the delay threshold, and the allocator
> waits indefinitely.
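
The reasoning above can be illustrated with a toy model. This is only a sketch
of the counting logic as described in the issue, not the actual
RegularContainerAllocator code and not the contents of the attached patch. The
class name, the method name, and the countSkippedOffSwitch flag are all
hypothetical, and the model assumes that an attempted off-switch request
counts as one missed opportunity per scheduling pass.

```java
/** Toy model of the missed-opportunity counter; every name here is hypothetical. */
final class LocalityDelaySketch {

    /**
     * Returns the 1-based scheduling pass on which the rack-local request is
     * allocated, or -1 if nothing is allocated within maxPasses.
     *
     * countSkippedOffSwitch models one possible remedy (an assumption, not
     * necessarily what the patch does): count a skipped off-switch request
     * as a missed opportunity.
     */
    static int passesUntilRackLocal(int nodeLocalityDelay,
                                    boolean relaxLocality,
                                    boolean countSkippedOffSwitch,
                                    int maxPasses) {
        int missedOpportunities = 0;
        for (int pass = 1; pass <= maxPasses; pass++) {
            // A rack-local allocation is delayed until enough opportunities are missed.
            if (missedOpportunities >= nodeLocalityDelay) {
                return pass;
            }
            // Otherwise the allocator falls through to the off-switch request.
            // With relaxLocality == false that request is skipped, and in the
            // reported behaviour the skip does not advance the counter.
            if (relaxLocality || countSkippedOffSwitch) {
                missedOpportunities++;
            }
        }
        return -1; // the counter never reached the delay within maxPasses
    }

    public static void main(String[] args) {
        // Reported behaviour: strict locality, skips not counted.
        System.out.println(passesUntilRackLocal(3, false, false, 1000)); // prints -1
        // Hypothetical remedy: skips counted as missed opportunities.
        System.out.println(passesUntilRackLocal(3, false, true, 1000));  // prints 4
    }
}
```

In this model a delay of 0 allocates on the first pass regardless of the
flags, which matches the observation that the hang depends on how
node-locality-delay is configured.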