[ 
https://issues.apache.org/jira/browse/YARN-897?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13699373#comment-13699373
 ] 

Djellel Eddine Difallah commented on YARN-897:
----------------------------------------------

We spotted this bug while experimenting with dynamic queue updates. The TreeSet 
methods .contains() and .remove() failed to find a queue that we knew was 
there, which hinted that the tree was no longer sorted correctly.
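
A minimal sketch of that symptom, outside of YARN. The Queue class, its
usedCapacity field and the BY_USED comparator are hypothetical stand-ins, not
the actual scheduler classes; the only point is that changing a sort key while
the element sits in a TreeSet makes later lookups follow the wrong path:

```java
import java.util.Comparator;
import java.util.TreeSet;

class StaleTreeSetDemo {
    // Hypothetical stand-in for a scheduler queue (not the real YARN class).
    static final class Queue {
        final String name;
        float usedCapacity;

        Queue(String name, float usedCapacity) {
            this.name = name;
            this.usedCapacity = usedCapacity;
        }
    }

    // Assumed ordering: least used capacity first, name as a tie-breaker.
    static final Comparator<Queue> BY_USED =
        Comparator.comparingDouble((Queue q) -> q.usedCapacity)
                  .thenComparing(q -> q.name);

    // Returns what contains() reports after a sort key was changed in place.
    static boolean containsAfterInPlaceUpdate() {
        TreeSet<Queue> childQueues = new TreeSet<>(BY_USED);
        Queue a = new Queue("a", 0.1f);
        Queue b = new Queue("b", 0.5f);
        Queue c = new Queue("c", 0.9f);
        childQueues.add(a);
        childQueues.add(b);
        childQueues.add(c);

        // The key changes, but the node stays where the old key put it.
        a.usedCapacity = 0.95f;

        // The search now walks towards the largest keys and misses "a",
        // although it is still stored in the set.
        return childQueues.contains(a);
    }

    public static void main(String[] args) {
        System.out.println(containsAfterInPlaceUpdate()); // false for this insertion order
    }
}
```

Whether a particular lookup fails depends on the shape of the tree and on which
key changed, which is why the corruption can go unnoticed for a while.
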
The attached test is a [simple JUnit test | 
https://issues.apache.org/jira/secure/attachment/12590676/TestBugParentQueue.java]
 modeled on the existing capacity scheduler tests. It simulates 
the scenario that [~curino] describes above and prints the order in which 
childQueues is left after a couple of container assignments and completions.
I will post a first version of a patch that re-inserts the recently completed 
container's queue (and all its parents) into their respective parents' 
childQueues. 
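
A rough sketch of that re-insertion idea, not the patch itself. Node, adjust()
and the single usedCapacity number per queue are simplifying assumptions made
for illustration; the real scheduler computes used capacity differently. Each
queue is removed from its parent's set before its key changes and added back
afterwards, walking up to the root:

```java
import java.util.Comparator;
import java.util.TreeSet;

class ReinsertOnCompletionSketch {
    // Assumed ordering: least used capacity first, name as a tie-breaker.
    static final Comparator<Node> BY_USED =
        Comparator.comparingDouble((Node n) -> n.usedCapacity)
                  .thenComparing(n -> n.name);

    // Hypothetical queue node; a parent keeps its children sorted by usage.
    static final class Node {
        final String name;
        final Node parent; // null for the root queue
        float usedCapacity;
        final TreeSet<Node> childQueues = new TreeSet<>(BY_USED);

        Node(String name, Node parent) {
            this.name = name;
            this.parent = parent;
            if (parent != null) {
                parent.childQueues.add(this);
            }
        }
    }

    // Applies a usage change (positive for an assignment, negative for a
    // completion) to a queue and all its ancestors, keeping each parent's
    // childQueues consistent.
    static void adjust(Node queue, float delta) {
        for (Node q = queue; q != null; q = q.parent) {
            Node p = q.parent;
            if (p != null) {
                p.childQueues.remove(q); // key still matches its position
            }
            q.usedCapacity += delta;
            if (p != null) {
                p.childQueues.add(q);    // re-inserted under the new key
            }
        }
    }

    public static void main(String[] args) {
        Node root = new Node("root", null);
        Node a = new Node("a", root);
        Node b = new Node("b", root);
        Node a1 = new Node("a1", a);

        adjust(a1, 0.6f);  // assignment in a1
        adjust(b, 0.3f);   // assignment in b
        adjust(a1, -0.5f); // completion in a1
        System.out.println(root.childQueues.first().name);
    }
}
```

Removing before the update matters: once the key has changed, remove() can
miss the element for the same reason contains() does.
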
                
> CapacityScheduler wrongly sorted queues
> ---------------------------------------
>
>                 Key: YARN-897
>                 URL: https://issues.apache.org/jira/browse/YARN-897
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: capacityscheduler
>            Reporter: Djellel Eddine Difallah
>         Attachments: TestBugParentQueue.java
>
>
> The childQueues of a ParentQueue are stored in a TreeSet where UsedCapacity 
> defines the sort order. This ensures that the queue with the least 
> UsedCapacity receives resources next. On container assignment we correctly 
> update the order, but we fail to do so on container completion. This corrupts 
> the TreeSet structure, and under-capacity queues might starve for resources.

