[
https://issues.apache.org/jira/browse/YARN-8476?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
YunFan Zhou updated YARN-8476:
------------------------------
Description: (was: Hi, [~leftnoteasy]
We recently merged https://issues.apache.org/jira/browse/YARN-5139 into our
version and found some bugs.
Below is one of the more serious bugs I've encountered:
{code:java}
  LeafQueue queue = ((LeafQueue) reservedApplication.getQueue());
  assignment = queue.assignContainers(getClusterResource(), candidates,
      // TODO, now we only consider limits for parent for non-labeled
      // resources, should consider labeled resources as well.
      new ResourceLimits(labelManager
          .getResourceByLabel(RMNodeLabelsManager.NO_LABEL,
              getClusterResource())),
      SchedulingMode.RESPECT_PARTITION_EXCLUSIVITY);
  if (assignment.isFulfilledReservation()) {
    if (withNodeHeartbeat) {
      // Only update SchedulerHealth in sync scheduling, existing
      // Data structure of SchedulerHealth need to be updated for
      // Async mode
      updateSchedulerHealth(lastNodeUpdateTime, node.getNodeID(),
          assignment);
    }
    schedulerHealth.updateSchedulerFulfilledReservationCounts(1);
    ActivitiesLogger.QUEUE.recordQueueActivity(activitiesManager, node,
        queue.getParent().getQueueName(), queue.getQueueName(),
        ActivityState.ACCEPTED, ActivityDiagnosticConstant.EMPTY);
    ActivitiesLogger.NODE.finishAllocatedNodeAllocation(activitiesManager,
        node, reservedContainer.getContainerId(),
        AllocationState.ALLOCATED_FROM_RESERVED);
  } else {
    ActivitiesLogger.QUEUE.recordQueueActivity(activitiesManager, node,
        queue.getParent().getQueueName(), queue.getQueueName(),
        ActivityState.ACCEPTED, ActivityDiagnosticConstant.EMPTY);
    ActivitiesLogger.NODE.finishAllocatedNodeAllocation(activitiesManager,
        node, reservedContainer.getContainerId(), AllocationState.SKIPPED);
  }
  assignment.setSchedulingMode(
      SchedulingMode.RESPECT_PARTITION_EXCLUSIVITY);
  submitResourceCommitRequest(getClusterResource(), assignment);
}
{code}
Before we submit the assignment to the *resourceCommitterService*, we must
check that its resource is greater than *Resources.none()*.
After the call to *getRootQueue().assignContainers*, the assignment can be
*CSAssignment(Resources.createResource(0, 0), NodeType.NODE_LOCAL)*, which is a
meaningless value.
We still submit it to the *resourceCommitterService*, so a bunch of
meaningless assignments block other, meaningful event processing.
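The kind of guard meant here can be sketched roughly as follows. This is only
an illustration, not a patch: Res, Assignment, hasResource and
submitIfMeaningful are hypothetical stand-ins rather than Hadoop classes, the
list stands in for the commit queue, and "greater than none" is assumed to mean
that at least one component is positive. In the scheduler itself the comparison
would presumably use the scheduler's ResourceCalculator against
Resources.none().

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-ins for illustration only; these are NOT Hadoop classes.
final class CommitGuardSketch {

  /** Simplified resource vector: memory in MB and virtual cores. */
  static final class Res {
    final long memoryMb;
    final int vcores;

    Res(long memoryMb, int vcores) {
      this.memoryMb = memoryMb;
      this.vcores = vcores;
    }
  }

  /** Simplified assignment that carries only the assigned resource. */
  static final class Assignment {
    final Res resource;

    Assignment(Res resource) {
      this.resource = resource;
    }
  }

  /** Assumption: an assignment is meaningful if any component is positive. */
  static boolean hasResource(Assignment a) {
    return a != null && a.resource != null
        && (a.resource.memoryMb > 0 || a.resource.vcores > 0);
  }

  /**
   * Enqueues the assignment only when it carries a non-empty resource.
   * Returns true if the assignment was added to the commit queue.
   */
  static boolean submitIfMeaningful(Assignment a, List<Assignment> commitQueue) {
    if (!hasResource(a)) {
      return false; // skip empty assignments so they cannot clog the queue
    }
    commitQueue.add(a);
    return true;
  }

  public static void main(String[] args) {
    List<Assignment> commitQueue = new ArrayList<>();
    // An empty (0, 0) assignment is dropped.
    System.out.println(
        submitIfMeaningful(new Assignment(new Res(0, 0)), commitQueue)); // false
    // A real allocation is queued.
    System.out.println(
        submitIfMeaningful(new Assignment(new Res(1024, 1)), commitQueue)); // true
    System.out.println(commitQueue.size()); // 1
  }
}
```

With a check like this in front of submitResourceCommitRequest, only
assignments that actually carry a resource would reach the committer.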
I think this is a very serious bug! Any suggestions?)
> Should check the resource of assignment is greater than Resources.none()
> before submitResourceCommitRequest
> -----------------------------------------------------------------------------------------------------------
>
> Key: YARN-8476
> URL: https://issues.apache.org/jira/browse/YARN-8476
> Project: Hadoop YARN
> Issue Type: Bug
> Components: capacity scheduler, capacityscheduler
> Reporter: YunFan Zhou
> Assignee: YunFan Zhou
> Priority: Minor
>