[
https://issues.apache.org/jira/browse/YARN-897?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13699373#comment-13699373
]
Djellel Eddine Difallah commented on YARN-897:
----------------------------------------------
We spotted this bug while experimenting with dynamic queue updates. The TreeSet
methods .contains() and .remove() failed to find a queue that we knew was
there, which hinted that the tree was no longer sorted properly.
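To illustrate the symptom outside of YARN, here is a minimal standalone sketch. The Queue class, the usedPercent field and the comparator are hypothetical stand-ins, not the actual CapacityScheduler types. It changes an element's sort key while the element is still inside the TreeSet, which is what seems to happen on container completion:

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.TreeSet;

public class StaleTreeSetDemo {
    // Hypothetical stand-in for a scheduler queue; not the real CSQueue API.
    static final class Queue {
        final String name;
        int usedPercent; // mutable sort key (assumed stand-in for UsedCapacity)

        Queue(String name, int usedPercent) {
            this.name = name;
            this.usedPercent = usedPercent;
        }
    }

    // Order by used capacity, break ties by name.
    static final Comparator<Queue> BY_USED =
        Comparator.comparingInt((Queue q) -> q.usedPercent).thenComparing(q -> q.name);

    static TreeSet<Queue> build(Queue... queues) {
        TreeSet<Queue> set = new TreeSet<>(BY_USED);
        for (Queue q : queues) {
            set.add(q);
        }
        return set;
    }

    // Names in the set's iteration order.
    static List<String> order(TreeSet<Queue> set) {
        List<String> names = new ArrayList<>();
        for (Queue q : set) {
            names.add(q.name);
        }
        return names;
    }

    public static void main(String[] args) {
        Queue a = new Queue("a", 10);
        Queue b = new Queue("b", 50);
        Queue c = new Queue("c", 90);
        TreeSet<Queue> childQueues = build(a, b, c);

        // A "completion" lowers c's key in place, without re-inserting c.
        c.usedPercent = 20;

        System.out.println(childQueues.contains(c)); // false: lookup follows the new key
        System.out.println(childQueues.remove(c));   // false: same reason
        System.out.println(order(childQueues));      // [a, b, c], i.e. 10, 50, 20
    }
}
```

The lookup for c descends the tree using its new key (20) and ends up in the subtree below b, while c still sits where its old key (90) placed it, so the element is present but unreachable.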
The attached test is a [simple junit test |
https://issues.apache.org/jira/secure/attachment/12590676/TestBugParentQueue.java]
inspired by the existing capacity scheduler tests. It simulates the scenario
that [~curino] describes above and displays the order in which childQueues is
left after a couple of container assignments and completions.
I will post a first version of a patch that re-inserts the recently completed
container's queue (and all its parents) into their respective parents'
childQueues.
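As a rough sketch of the re-insertion idea (not the patch itself), the following standalone example removes each queue from its parent's childQueues before changing the sort key and adds it back afterwards, walking from the leaf up to the root. The Node class, completeContainer() and the percent arithmetic are assumptions made for illustration only. In particular, subtracting the same amount at every level is a simplification.

```java
import java.util.Comparator;
import java.util.TreeSet;

public class ReinsertOnCompletionSketch {
    // Hypothetical queue node; names and fields are illustrative only.
    static final class Node {
        final String name;
        final Node parent;
        int usedPercent; // assumed stand-in for UsedCapacity
        final TreeSet<Node> childQueues = new TreeSet<>(
            Comparator.comparingInt((Node n) -> n.usedPercent).thenComparing(n -> n.name));

        Node(String name, Node parent, int usedPercent) {
            this.name = name;
            this.parent = parent;
            this.usedPercent = usedPercent;
            if (parent != null) {
                parent.childQueues.add(this);
            }
        }
    }

    // On completion, update the leaf and every ancestor. Each node is removed
    // from its parent's set while its key is still the old one, then re-added.
    static void completeContainer(Node leaf, int releasedPercent) {
        for (Node q = leaf; q != null; q = q.parent) {
            if (q.parent != null) {
                q.parent.childQueues.remove(q);
            }
            q.usedPercent -= releasedPercent;
            if (q.parent != null) {
                q.parent.childQueues.add(q);
            }
        }
    }

    public static void main(String[] args) {
        Node root = new Node("root", null, 70);
        Node a = new Node("a", root, 90);
        Node b = new Node("b", root, 50);
        Node a1 = new Node("a1", a, 60);
        Node a2 = new Node("a2", a, 30);

        completeContainer(a1, 50); // a1: 60 -> 10, a: 90 -> 40, root: 70 -> 20

        System.out.println(a.childQueues.first().name);    // a1
        System.out.println(root.childQueues.first().name); // a
        System.out.println(root.childQueues.contains(a));  // true
    }
}
```

Removing before the update matters: once the key has changed, remove() may no longer find the element, as we observed with the TreeSet methods above.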
> CapacityScheduler wrongly sorted queues
> ---------------------------------------
>
> Key: YARN-897
> URL: https://issues.apache.org/jira/browse/YARN-897
> Project: Hadoop YARN
> Issue Type: Bug
> Components: capacityscheduler
> Reporter: Djellel Eddine Difallah
> Attachments: TestBugParentQueue.java
>
>
> The childQueues of a ParentQueue are stored in a TreeSet where UsedCapacity
> defines the sort order. This ensures that the queue with the least
> UsedCapacity receives resources next. On container assignment we correctly
> update the order, but we fail to do so on container completion. This corrupts
> the TreeSet structure, and under-capacity queues might starve for resources.