[ https://issues.apache.org/jira/browse/YARN-897?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13699204#comment-13699204 ]
Carlo Curino commented on YARN-897:
-----------------------------------

The childQueues of a ParentQueue are stored in a TreeSet where UsedCapacity defines the sort order. I believe that for the scheduler to work correctly, we must maintain this order explicitly. When a new container is assigned to an application, the corresponding queue is removed and re-added, which maintains the order. When a container completes, however, the UsedCapacity of the queue changes, but we don't re-sort the childQueues. This means the TreeSet assumptions are not maintained, and we might fail to assign containers to this queue.

Example:

Parent queue (root) has four child queues with capacities (A:25%, B:25%, C:25%, D:25%). The cluster has 10GB of resources with a minimum allocation of 1GB, so each queue's capacity is 2.5GB and one 1GB container accounts for 0.4 of it.

1- Through some history we got to assign 1,2,3,4 containers respectively to the queues (note: container = 1GB):
   status child-queues: root.a(0.4), root.b(0.8), root.c(1.2), root.d(1.6)

2- 3 containers from D complete:
   status child-queues: root.a(0.4), root.b(0.8), root.c(1.2), root.d(0.4)

3- Now if A and B keep receiving and releasing containers without ever passing the 1.2 mark of C, D might stay stuck behind C and never receive containers.

In practice this might not show up often because of reservations (which bypass this ordering). If D has reservations pending it might get at least one container, and this will trigger the re-sorting, thus unsticking it. Nonetheless this should be addressed.

I discussed this briefly with a few folks at Hadoop Summit and we seemed to confirm the problem, but we should double check further.

[~dedcode] will post a small test that triggers the issue, and an idea for a patch soon... comments welcome.
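To make the example above concrete, here is a minimal standalone sketch of the stale-order effect. It is not the CapacityScheduler code: the Queue class, the BY_USED comparator, and the method names are hypothetical stand-ins, and the comparator (used capacity, then queue path) is only assumed to resemble the real one. It uses the same numbers as the example and shows that changing UsedCapacity in place leaves root.d at the back of the TreeSet, while removing the queue before the update and re-adding it afterwards restores the order.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.TreeSet;

public class StaleTreeSetDemo {
    /** Hypothetical stand-in for a child queue; not the real scheduler class. */
    static final class Queue {
        final String path;
        double usedCapacity;

        Queue(String path, double usedCapacity) {
            this.path = path;
            this.usedCapacity = usedCapacity;
        }
    }

    // Assumed ordering: least used capacity first, ties broken by queue path
    // so that two distinct queues never compare as equal.
    static final Comparator<Queue> BY_USED = (x, y) -> {
        int c = Double.compare(x.usedCapacity, y.usedCapacity);
        return c != 0 ? c : x.path.compareTo(y.path);
    };

    // Builds the child-queue set from step 1 of the example; d is passed in
    // so the caller keeps a reference to it.
    static TreeSet<Queue> newChildren(Queue d) {
        TreeSet<Queue> children = new TreeSet<>(BY_USED);
        children.add(new Queue("root.a", 0.4));
        children.add(new Queue("root.b", 0.8));
        children.add(new Queue("root.c", 1.2));
        children.add(d);
        return children;
    }

    static List<String> paths(TreeSet<Queue> children) {
        List<String> out = new ArrayList<>();
        for (Queue q : children) {
            out.add(q.path);
        }
        return out;
    }

    // Container completion without re-sorting: the key changes in place and
    // the set keeps the element where it was inserted.
    static List<String> staleOrder() {
        Queue d = new Queue("root.d", 1.6);
        TreeSet<Queue> children = newChildren(d);
        d.usedCapacity = 0.4; // three containers complete
        return paths(children);
    }

    // Remove while the key still matches the element's position, update,
    // then re-add so the set places it according to the new value.
    static List<String> resortedOrder() {
        Queue d = new Queue("root.d", 1.6);
        TreeSet<Queue> children = newChildren(d);
        children.remove(d);
        d.usedCapacity = 0.4;
        children.add(d);
        return paths(children);
    }

    public static void main(String[] args) {
        System.out.println("stale:    " + staleOrder());
        System.out.println("resorted: " + resortedOrder());
    }
}
```

In the stale case the iteration order is still root.a, root.b, root.c, root.d even though root.d is now at 0.4; with remove/update/re-add it becomes root.a, root.d, root.b, root.c. Whether the actual fix should re-sort on every completion or take another approach is left open here.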
> CapacityScheduler wrongly sorted queues
> ---------------------------------------
>
>                 Key: YARN-897
>                 URL: https://issues.apache.org/jira/browse/YARN-897
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: capacityscheduler
>            Reporter: Djellel Eddine Difallah
>
> The childQueues of a ParentQueue are stored in a TreeSet where UsedCapacity
> defines the sort order. This ensures that the queue with the least
> UsedCapacity receives resources next. On container assignment we correctly
> update the order, but we fail to do so on container completions. This
> corrupts the TreeSet structure, and under-capacity queues might starve for
> resources.