[ https://issues.apache.org/jira/browse/YARN-897?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13706348#comment-13706348 ]
Djellel Eddine Difallah commented on YARN-897:
----------------------------------------------
Omkar, thanks for the feedback.
{quote}any reason for this even after this patch? if we don't see any other
issues then why not just use childQueues.remove instead of iterating?{quote}
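The tree is already out of order because usedCapacity has changed, so remove()
won't work: the TreeSet locates elements by following the comparator, and with
a stale position it may not reach the queue at all. We have to iterate to take
the queue out and then add() it again to fix the order.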
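To illustrate the point, here is a minimal sketch, not the patch itself. The Queue class, the BY_USED comparator, and the reinsert() helper are hypothetical stand-ins for the real scheduler types. It shows remove() missing an element whose sort key changed while it was in the set, and the iterate-then-add() repair.

```java
import java.util.Comparator;
import java.util.Iterator;
import java.util.TreeSet;

class ReinsertDemo {
    // Hypothetical stand-in for a child queue; not the real scheduler class.
    static final class Queue {
        final String name;
        float usedCapacity;

        Queue(String name, float usedCapacity) {
            this.name = name;
            this.usedCapacity = usedCapacity;
        }
    }

    // Sort by used capacity, breaking ties by name (an assumption for this sketch).
    static final Comparator<Queue> BY_USED = Comparator
            .comparingDouble((Queue q) -> q.usedCapacity)
            .thenComparing(q -> q.name);

    // Remove by identity through the iterator, which does not rely on the
    // comparator, then add() again so the queue lands in its correct position.
    static void reinsert(TreeSet<Queue> childQueues, Queue changed) {
        for (Iterator<Queue> it = childQueues.iterator(); it.hasNext();) {
            if (it.next() == changed) {
                it.remove();
                break;
            }
        }
        childQueues.add(changed);
    }

    public static void main(String[] args) {
        TreeSet<Queue> childQueues = new TreeSet<>(BY_USED);
        Queue a = new Queue("a", 0.1f);
        Queue b = new Queue("b", 0.5f);
        Queue c = new Queue("c", 0.9f);
        childQueues.add(a);
        childQueues.add(b);
        childQueues.add(c);

        // The key changes while the element is still inside the set.
        a.usedCapacity = 0.95f;

        System.out.println("remove(a) found it: " + childQueues.remove(a));
        reinsert(childQueues, a);
        System.out.println("first after reinsert: " + childQueues.first().name);
    }
}
```

With these three queues, remove(a) searches to the right of the root because of the new value and never reaches the node where a actually sits, so the set is left unchanged until reinsert() repairs it.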
{quote}reinsertQueue could be marked synchronized? thoughts? But yeah.. without
that too it is thread safe as we are locking it at
CapacityScheduler.nodeUpdate(). but still it is better to mark it.{quote}
OK, it sounds reasonable to mark it synchronized.
{quote}LOG.info("Re-sorting queues since queue got completed: " +
childQueue.getQueuePath() +
nit. line > 80{quote}
Sure.
{quote}at present we send the container completed event to leaf queue and then
keep propagating it till root. why not send the event to root, grab the locks
from root->leaf and update it? any thoughts?{quote}
Because the released container is linked to a leaf queue, we have to walk
bottom-up to figure out which parents to propagate the event to. The
assignment phase, however, works the way you described.
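As a rough sketch of that bottom-up walk (QueueNode, usedContainers, and containerCompleted() are hypothetical names, and the real scheduler classes carry far more state), each queue updates itself and then hands the event to its parent until the root is reached:

```java
import java.util.ArrayList;
import java.util.List;

class BottomUpDemo {
    // Hypothetical queue node with a parent link; the root has parent == null.
    static final class QueueNode {
        final String name;
        final QueueNode parent;
        int usedContainers;

        QueueNode(String name, QueueNode parent, int usedContainers) {
            this.name = name;
            this.parent = parent;
            this.usedContainers = usedContainers;
        }

        // Called on the leaf that owns the released container. Each level
        // updates its own usage under its own lock, then notifies its parent,
        // so only the ancestors of that leaf are touched.
        synchronized void containerCompleted(List<String> visited) {
            usedContainers--;
            visited.add(name);
            if (parent != null) {
                parent.containerCompleted(visited);
            }
        }
    }

    public static void main(String[] args) {
        QueueNode root = new QueueNode("root", null, 1);
        QueueNode a = new QueueNode("root.a", root, 1);
        QueueNode a1 = new QueueNode("root.a.a1", a, 1);

        List<String> visited = new ArrayList<>();
        a1.containerCompleted(visited);
        System.out.println(visited);
    }
}
```

Starting from the root instead would require searching the tree for the leaf that owns the container, whereas the parent links give the path directly.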
> CapacityScheduler wrongly sorted queues
> ---------------------------------------
>
> Key: YARN-897
> URL: https://issues.apache.org/jira/browse/YARN-897
> Project: Hadoop YARN
> Issue Type: Bug
> Components: capacityscheduler
> Reporter: Djellel Eddine Difallah
> Attachments: TestBugParentQueue.java, YARN-897-1.patch
>
>
> The childQueues of a ParentQueue are stored in a TreeSet whose sort order is
> defined by UsedCapacity. This ensures that the queue with the least
> UsedCapacity receives resources next. On container assignment we correctly
> update the order, but we fail to do so on container completion. This corrupts
> the TreeSet structure, and under-capacity queues might starve for resources.