[
https://issues.apache.org/jira/browse/YARN-10352?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17161682#comment-17161682
]
Prabhu Joseph commented on YARN-10352:
--------------------------------------
[~wangda] For each node in the list returned by
{{CapacityScheduler#getNodesHeartbeated}}, only reserved containers on that
node are allocated.
Allocating or reserving new containers uses the multi-node candidate list
prepared by {{MultiNodeSorter#reSortClusterNodes}} (code snippet below), which
passes the list to the configured {{MultiNodeLookupPolicy}} to be sorted in the
background at every configured sorting interval. {{MultiNodeSortingManager}}
then filters that list before returning it to the
{{RegularContainerAllocator#allocate}} call.
{code:java}
Map<NodeId, SchedulerNode> nodesByPartition = new HashMap<>();
List<SchedulerNode> nodes = ((AbstractYarnScheduler) rmContext
    .getScheduler()).getNodeTracker().getNodesPerPartition(label);
if (nodes != null) {
  nodes.forEach(n -> nodesByPartition.put(n.getNodeID(), n));
  multiNodePolicy.addAndRefreshNodesSet(
      (Collection<N>) nodesByPartition.values(), label);
}
{code}
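To make the two stages described above concrete, here is a minimal, self-contained sketch: a background step that re-sorts a snapshot of the cluster nodes, and a lookup step that filters that snapshot before handing candidates to the allocator. The class, fields and methods below ({{SortedNodeSnapshotSketch}}, {{Node}}, {{reSort}}, {{candidates}}) are hypothetical stand-ins for illustration only, not the YARN classes, and sorting by used memory is just an assumed policy.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.function.Predicate;

class SortedNodeSnapshotSketch {

    /** Hypothetical stand-in for a scheduler node; not the YARN SchedulerNode. */
    static final class Node {
        final String id;
        final long usedMb;
        final long lastHeartbeatMs;

        Node(String id, long usedMb, long lastHeartbeatMs) {
            this.id = id;
            this.usedMb = usedMb;
            this.lastHeartbeatMs = lastHeartbeatMs;
        }
    }

    // Latest sorted snapshot; volatile so lookup threads see the newest list.
    private volatile List<Node> sorted = new ArrayList<>();

    /** Background step: sort a copy of the cluster nodes and publish it. */
    void reSort(List<Node> clusterNodes) {
        List<Node> copy = new ArrayList<>(clusterNodes);
        // Assumed policy: least-used node first.
        copy.sort(Comparator.comparingLong((Node n) -> n.usedMb));
        sorted = copy;
    }

    /** Lookup step: filter the cached snapshot, preserving the sorted order. */
    List<Node> candidates(Predicate<Node> keep) {
        List<Node> out = new ArrayList<>();
        for (Node n : sorted) {
            if (keep.test(n)) {
                out.add(n);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        SortedNodeSnapshotSketch sorter = new SortedNodeSnapshotSketch();
        sorter.reSort(Arrays.asList(
            new Node("worker0", 100, 0),        // stopped, last heartbeat long ago
            new Node("worker1", 500, 9_500)));  // recently heartbeated

        long now = 10_000;
        // Without a heartbeat filter, the stopped node is still the first candidate.
        System.out.println(sorter.candidates(n -> true).get(0).id);
        // With a filter on heartbeat age, only the live node remains.
        System.out.println(
            sorter.candidates(n -> now - n.lastHeartbeatMs <= 2_000).get(0).id);
    }
}
```

Because the snapshot is only refreshed at the sorting interval, any exclusion of stopped nodes has to happen in the lookup-time filter rather than in the sort itself.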
> Skip schedule on not heartbeated nodes in Multi Node Placement
> --------------------------------------------------------------
>
> Key: YARN-10352
> URL: https://issues.apache.org/jira/browse/YARN-10352
> Project: Hadoop YARN
> Issue Type: Bug
> Affects Versions: 3.3.0, 3.4.0
> Reporter: Prabhu Joseph
> Assignee: Prabhu Joseph
> Priority: Major
> Labels: capacityscheduler, multi-node-placement
> Attachments: YARN-10352-001.patch, YARN-10352-002.patch,
> YARN-10352-003.patch
>
>
> When Node Recovery is enabled, stopping an NM does not unregister it from the
> RM. The RM's active nodes therefore still include the stopped nodes until the
> NM Liveliness Monitor expires them after the configured timeout
> (yarn.nm.liveness-monitor.expiry-interval-ms = 10 mins). During these 10 mins,
> Multi Node Placement assigns containers to those nodes. It needs to exclude
> nodes that have not heartbeated within the configured heartbeat interval
> (yarn.resourcemanager.nodemanagers.heartbeat-interval-ms = 1000 ms), similar
> to the asynchronous Capacity Scheduler threads
> (CapacityScheduler#shouldSkipNodeSchedule).
> *Repro:*
> 1. Enable Multi Node Placement
> (yarn.scheduler.capacity.multi-node-placement-enabled) and Node Recovery
> (yarn.nodemanager.recovery.enabled).
> 2. Have only one NM running, say worker0.
> 3. Stop worker0 and start another NM, say worker1.
> 4. Submit a sleep job. The containers will time out because they are assigned
> to the stopped NM worker0.
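
The gap between the two timeouts in the description can be illustrated with a small sketch. It compares the liveness-monitor expiry with a heartbeat-age check for a node that stopped heartbeating at time 0. The class and method names are hypothetical, and the multiplier of two missed heartbeats is an assumption made for illustration; the actual threshold is whatever {{CapacityScheduler#shouldSkipNodeSchedule}} applies.

```java
class HeartbeatStalenessSketch {

    // Assumption for illustration: treat a node as stale after this many
    // missed heartbeats.
    static final int MISSED_HEARTBEATS = 2;

    /** True if the node has not heartbeated within the allowed window. */
    static boolean isStale(long nowMs, long lastHeartbeatMs,
                           long heartbeatIntervalMs) {
        return nowMs - lastHeartbeatMs > heartbeatIntervalMs * MISSED_HEARTBEATS;
    }

    /** True once the liveness monitor would have expired the node. */
    static boolean isExpired(long nowMs, long lastHeartbeatMs,
                             long expiryIntervalMs) {
        return nowMs - lastHeartbeatMs > expiryIntervalMs;
    }

    public static void main(String[] args) {
        // Values taken from the issue description.
        long heartbeatIntervalMs = 1_000L;  // heartbeat-interval-ms
        long expiryIntervalMs = 600_000L;   // expiry-interval-ms (10 mins)

        long worker0LastHeartbeatMs = 0L;   // worker0 stopped at t = 0
        long nowMs = 5_000L;                // job submitted 5 seconds later

        // Still an active node from the RM's point of view...
        System.out.println("expired: "
            + isExpired(nowMs, worker0LastHeartbeatMs, expiryIntervalMs));
        // ...but already identifiable as not heartbeating.
        System.out.println("stale:   "
            + isStale(nowMs, worker0LastHeartbeatMs, heartbeatIntervalMs));
    }
}
```

Under these assumptions, worker0 stays in the active node list for the full 10 minutes, while a heartbeat-age check would flag it a few seconds after it stops.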