[
https://issues.apache.org/jira/browse/YARN-4066?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14803151#comment-14803151
]
Karthik Kambatla commented on YARN-4066:
----------------------------------------
Thanks again for working on this, Johan. Took a closer look at the patch and
have the following comments:
# A few lines are longer than 80 characters.
# For the method parameters, {{recomputeSteadyShares}} might be more
descriptive than {{recalculate}}.
# While at it, I would suggest the following improvements in synchronization as
well:
## In getQueue, some of the code could be outside the synchronized block
{code}
name = ensureRootPrefix(name);
FSQueue queue;
synchronized (queues) {
  queue = queues.get(name);
  if (queue == null && create) {
    // if the queue doesn't exist, create it and return
    queue = createQueue(name, queueType);
  } else {
    recalculate = false;
  }
}
if (recalculate) {
  rootQueue.recomputeSteadyShares();
}
return queue;
{code}
## In updateAllocationConfiguration, club the two synchronized blocks into one,
and call recomputeSteadyShares outside the synchronized block.
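To illustrate the second suggestion, here is a minimal, self-contained sketch of the pattern: a single synchronized block that creates every missing queue, followed by one recompute outside the lock. {{QueueManagerSketch}}, its map of placeholder queues, and the counter are hypothetical stand-ins made up for this example; they are not the actual QueueManager code or the code in the patch.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for illustration only; not the real Hadoop QueueManager.
class QueueManagerSketch {
  // Queue name -> placeholder queue object (assumption: real code stores FSQueue).
  private final Map<String, Object> queues = new HashMap<>();
  // Counts recomputations so the effect of the pattern can be observed.
  private final AtomicInteger recomputeCount = new AtomicInteger();

  // One synchronized block covers all queue creation; the expensive
  // recomputation runs once, after the lock has been released.
  void updateAllocationConfiguration(List<String> configuredQueueNames) {
    synchronized (queues) {
      for (String name : configuredQueueNames) {
        // Create without recomputing per queue.
        queues.putIfAbsent(name, new Object());
      }
    }
    // Outside the synchronized block, and only once per configuration update.
    recomputeSteadyShares();
  }

  // Placeholder for rootQueue.recomputeSteadyShares() in the real code.
  private void recomputeSteadyShares() {
    recomputeCount.incrementAndGet();
  }

  int queueCount() {
    synchronized (queues) {
      return queues.size();
    }
  }

  int recomputeCount() {
    return recomputeCount.get();
  }
}
```

With this shape, loading N configured queues takes the lock once and triggers a single recompute instead of one per created queue, which is where the cost grows with a large number of queues.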
Since we are changing some of the locking, which would be hard to unit-test,
I would appreciate it if you could run the updated patch through the tests you
previously reported.
> Large number of queues choke fair scheduler
> -------------------------------------------
>
> Key: YARN-4066
> URL: https://issues.apache.org/jira/browse/YARN-4066
> Project: Hadoop YARN
> Issue Type: Bug
> Components: fairscheduler
> Affects Versions: 2.7.1
> Reporter: Johan Gustavsson
> Attachments: yarn-4066-1.patch
>
>
> Due to synchronization and all the loops performed during queue creation,
> setting a large number of queues (12000+) will completely choke the
> scheduler. To deal with this, some optimization to
> "QueueManager.updateAllocationConfiguration(AllocationConfiguration
> queueConf)" should be done to reduce the number of unnecessary loops. The
> attached patch has been tested to work with at least 96000 queues.