Varun Vasudev created YARN-2628:
-----------------------------------
Summary: Capacity scheduler with DominantResourceCalculator
carries out reservation even though slots are free
Key: YARN-2628
URL: https://issues.apache.org/jira/browse/YARN-2628
Project: Hadoop YARN
Issue Type: Bug
Components: capacityscheduler
Affects Versions: 2.5.1
Reporter: Varun Vasudev
Assignee: Varun Vasudev
We've noticed that if you run the CapacityScheduler with the
DominantResourceCalculator, sometimes apps will end up with containers in a
reserved state even though free slots are available.
The root cause seems to be this piece of code from CapacityScheduler.java -
{noformat}
    // Try to schedule more if there are no reservations to fulfill
    if (node.getReservedContainer() == null) {
      if (Resources.greaterThanOrEqual(calculator, getClusterResource(),
              node.getAvailableResource(), minimumAllocation)) {
        if (LOG.isDebugEnabled()) {
          LOG.debug("Trying to schedule on node: " + node.getNodeName() +
              ", available: " + node.getAvailableResource());
        }
        root.assignContainers(clusterResource, node, false);
      }
    } else {
      LOG.info("Skipping scheduling since node " + node.getNodeID() +
          " is reserved by application " +
          node.getReservedContainer().getContainerId().getApplicationAttemptId()
          );
    }
{noformat}
The code is meant to check whether a node has any slots available for
containers. Because it uses the greaterThanOrEqual function, which with the
DominantResourceCalculator compares dominant shares rather than each resource
individually, we end up in a situation where greaterThanOrEqual returns true
even though we may not have enough CPU or memory to actually run the container.
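
To illustrate the mismatch, here is a minimal, self-contained sketch. It does
not use the Hadoop classes: Res, dominantShare, dominantGreaterThanOrEqual and
fitsIn are hypothetical stand-ins, and the comparison is a simplified model of
a dominant-share check, not the actual DominantResourceCalculator code. The
cluster size and node values are made up for the example.

```java
public class DominantShareSketch {

    /** Hypothetical stand-in for a YARN Resource: memory in MB plus vcores. */
    static final class Res {
        final long memoryMb;
        final int vcores;

        Res(long memoryMb, int vcores) {
            this.memoryMb = memoryMb;
            this.vcores = vcores;
        }
    }

    /** Larger of the memory share and the CPU share of r within the cluster. */
    static double dominantShare(Res cluster, Res r) {
        double memShare = (double) r.memoryMb / cluster.memoryMb;
        double cpuShare = (double) r.vcores / cluster.vcores;
        return Math.max(memShare, cpuShare);
    }

    /** Simplified dominant-share comparison (assumption, not Hadoop's code). */
    static boolean dominantGreaterThanOrEqual(Res cluster, Res lhs, Res rhs) {
        return dominantShare(cluster, lhs) >= dominantShare(cluster, rhs);
    }

    /** Component-wise check: every dimension of needed must fit in available. */
    static boolean fitsIn(Res needed, Res available) {
        return needed.memoryMb <= available.memoryMb
            && needed.vcores <= available.vcores;
    }

    public static void main(String[] args) {
        Res cluster = new Res(102400, 100);
        Res available = new Res(8192, 0);          // memory free, no vcores left
        Res minimumAllocation = new Res(1024, 1);

        // Dominant share of available is 0.08 (memory), of the minimum is 0.01.
        System.out.println("dominant >= : "
            + dominantGreaterThanOrEqual(cluster, available, minimumAllocation));
        // The container needs 1 vcore but the node has 0.
        System.out.println("fits in     : "
            + fitsIn(minimumAllocation, available));
    }
}
```

In this model the dominant-share comparison passes on memory alone, while the
component-wise check rejects the node because it has no vcores left, which is
the kind of case where a container could end up reserved instead of allocated.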