[
https://issues.apache.org/jira/browse/YARN-2628?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14155398#comment-14155398
]
Varun Vasudev commented on YARN-2628:
-------------------------------------
The release audit error comes from an HDFS file and is unrelated.
> Capacity scheduler with DominantResourceCalculator carries out reservation
> even though slots are free
> -----------------------------------------------------------------------------------------------------
>
> Key: YARN-2628
> URL: https://issues.apache.org/jira/browse/YARN-2628
> Project: Hadoop YARN
> Issue Type: Bug
> Components: capacityscheduler
> Affects Versions: 2.5.1
> Reporter: Varun Vasudev
> Assignee: Varun Vasudev
> Attachments: apache-yarn-2628.0.patch
>
>
> We've noticed that if you run the CapacityScheduler with the
> DominantResourceCalculator, sometimes apps will end up with containers in a
> reserved state even though free slots are available.
> The root cause seems to be this piece of code from CapacityScheduler.java -
> {noformat}
> // Try to schedule more if there are no reservations to fulfill
> if (node.getReservedContainer() == null) {
>   if (Resources.greaterThanOrEqual(calculator, getClusterResource(),
>       node.getAvailableResource(), minimumAllocation)) {
>     if (LOG.isDebugEnabled()) {
>       LOG.debug("Trying to schedule on node: " + node.getNodeName() +
>           ", available: " + node.getAvailableResource());
>     }
>     root.assignContainers(clusterResource, node, false);
>   }
> } else {
>   LOG.info("Skipping scheduling since node " + node.getNodeID() +
>       " is reserved by application " +
>       node.getReservedContainer().getContainerId().getApplicationAttemptId()
>       );
> }
> {noformat}
> The code is meant to check whether a node has any slots available for containers.
> Because it uses the greaterThanOrEqual function, we end up in a situation where
> greaterThanOrEqual returns true even though the node may not have enough CPU or
> memory to actually run the container.
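
The mismatch described above can be reproduced with a small standalone sketch. It does not use the Hadoop classes: Res, dominantShare, dominantGreaterThanOrEqual and fitsIn are hypothetical stand-ins, and the comparison is a simplified version of a dominant-share ordering (dominant shares first, the other share as a tie-breaker). The cluster and node sizes are made-up numbers. The component-wise check is shown only for contrast and is not necessarily what the attached patch does.

```java
public class DominantShareDemo {
    // Hypothetical stand-in for a YARN resource: memory in MB plus virtual cores.
    static final class Res {
        final long memoryMb;
        final int vcores;

        Res(long memoryMb, int vcores) {
            this.memoryMb = memoryMb;
            this.vcores = vcores;
        }
    }

    // Larger of the two per-resource fractions of the cluster.
    static double dominantShare(Res cluster, Res r) {
        return Math.max((double) r.memoryMb / cluster.memoryMb,
                        (double) r.vcores / cluster.vcores);
    }

    // Smaller of the two per-resource fractions of the cluster.
    static double secondaryShare(Res cluster, Res r) {
        return Math.min((double) r.memoryMb / cluster.memoryMb,
                        (double) r.vcores / cluster.vcores);
    }

    // Simplified dominant-share comparison (an assumption, not the Hadoop code):
    // compare dominant shares, fall back to the secondary share on a tie.
    static boolean dominantGreaterThanOrEqual(Res cluster, Res lhs, Res rhs) {
        int c = Double.compare(dominantShare(cluster, lhs), dominantShare(cluster, rhs));
        if (c == 0) {
            c = Double.compare(secondaryShare(cluster, lhs), secondaryShare(cluster, rhs));
        }
        return c >= 0;
    }

    // Component-wise check: every resource of 'smaller' must fit inside 'bigger'.
    static boolean fitsIn(Res smaller, Res bigger) {
        return smaller.memoryMb <= bigger.memoryMb && smaller.vcores <= bigger.vcores;
    }

    public static void main(String[] args) {
        Res cluster = new Res(102400, 100);      // 100 GB, 100 vcores
        Res available = new Res(512, 8);         // node has plenty of CPU, little memory
        Res minimumAllocation = new Res(1024, 1);

        // Dominant share of 'available' is 0.08 (CPU); of the minimum it is 0.01.
        System.out.println(dominantGreaterThanOrEqual(cluster, available, minimumAllocation)); // true
        // The node cannot actually hold a 1024 MB container.
        System.out.println(fitsIn(minimumAllocation, available)); // false
    }
}
```

With these numbers the dominant-share comparison reports that the node has room, while the node is 512 MB short of the minimum allocation, which is the situation in which a container would end up reserved rather than allocated.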