[
https://issues.apache.org/jira/browse/YARN-3223?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15031914#comment-15031914
]
Junping Du commented on YARN-3223:
----------------------------------
bq. RMNodeImpl does not know directly the amount of usedResource in order to
trigger an RMNodeResourceUpdateEvent. I can use
rmNode.context.getScheduler().(rmNode.getNodeID()).getUsedResource(), but I'm
not sure if adding that dependency on scheduler is okay.
This is a reasonable concern. I would also prefer that rmNode not talk to
schedulerNode directly. Instead, I would prefer that YarnScheduler, which knows
the usedResource of schedulerNode, trigger the RMNodeResourceUpdateEvent.
Concretely, there are two scenarios that trigger a resource update event
while a node is decommissioning:
1. DecommissioningNodeTransition happens on RMNode:
RMNode can send a new scheduler event carrying only the RMNode info (no
resource info needed), perhaps called something like
DecommissioningNodeResourceUpdateSchedulerEvent. When handling this event, the
scheduler creates an RMNodeResourceUpdateEvent using the SchedulerNode's
usedResource instead.
2. Every time a container finishes on a decommissioning node:
YarnScheduler (Fifo/Fair/Capacity) can also send an RMNodeResourceUpdateEvent
from within completedContainer(), just after the SchedulerNode's usedResource
is updated.
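To make the two scenarios concrete, here is a rough, self-contained sketch of the event flow. It is not the actual YARN code: the classes Scheduler, SchedulerNode and ResourceUpdate are hypothetical stand-ins for YarnScheduler, SchedulerNode and RMNodeResourceUpdateEvent, resources are reduced to a single memory figure in MB, and events are collected in a list rather than dispatched. It also assumes the node's total resource is set equal to its used resource, which is one way to keep the available resource at 0.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class DecommissionSketch {

    /** Hypothetical stand-in for RMNodeResourceUpdateEvent: new total resource for a node. */
    static final class ResourceUpdate {
        final String nodeId;
        final int memoryMb;

        ResourceUpdate(String nodeId, int memoryMb) {
            this.nodeId = nodeId;
            this.memoryMb = memoryMb;
        }
    }

    /** Hypothetical stand-in for SchedulerNode; only the scheduler reads usedMb. */
    static final class SchedulerNode {
        int usedMb;
        boolean decommissioning;
    }

    /** Hypothetical stand-in for YarnScheduler (Fifo/Fair/Capacity). */
    static final class Scheduler {
        final Map<String, SchedulerNode> nodes = new HashMap<>();
        // Updates that would be sent back to the RMNode, recorded here for inspection.
        final List<ResourceUpdate> emitted = new ArrayList<>();

        void addNode(String nodeId, int usedMb) {
            SchedulerNode node = new SchedulerNode();
            node.usedMb = usedMb;
            nodes.put(nodeId, node);
        }

        // Scenario 1: the RMNode sends an event with node info only;
        // the scheduler fills in usedResource from its own SchedulerNode.
        void handleDecommissioning(String nodeId) {
            SchedulerNode node = nodes.get(nodeId);
            if (node == null) {
                return;
            }
            node.decommissioning = true;
            emitUpdate(nodeId, node);
        }

        // Scenario 2: after usedResource is updated for a finished container,
        // send another update if the node is decommissioning.
        void completedContainer(String nodeId, int containerMb) {
            SchedulerNode node = nodes.get(nodeId);
            if (node == null) {
                return;
            }
            node.usedMb = Math.max(0, node.usedMb - containerMb);
            if (node.decommissioning) {
                emitUpdate(nodeId, node);
            }
        }

        // Total resource := used resource, so available resource stays at 0.
        private void emitUpdate(String nodeId, SchedulerNode node) {
            emitted.add(new ResourceUpdate(nodeId, node.usedMb));
        }
    }

    public static void main(String[] args) {
        Scheduler scheduler = new Scheduler();
        scheduler.addNode("node1", 4096);
        scheduler.handleDecommissioning("node1");
        scheduler.completedContainer("node1", 1024);
        for (ResourceUpdate update : scheduler.emitted) {
            System.out.println(update.nodeId + " -> " + update.memoryMb + " MB");
        }
    }
}
```

The point of the sketch is that RMNode never reads usedResource itself: in both scenarios the scheduler is the only component that looks at SchedulerNode.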
> Resource update during NM graceful decommission
> -----------------------------------------------
>
> Key: YARN-3223
> URL: https://issues.apache.org/jira/browse/YARN-3223
> Project: Hadoop YARN
> Issue Type: Sub-task
> Components: graceful, nodemanager, resourcemanager
> Affects Versions: 2.7.1
> Reporter: Junping Du
> Assignee: Brook Zhou
> Attachments: YARN-3223-v0.patch, YARN-3223-v1.patch,
> YARN-3223-v2.patch
>
>
> During NM graceful decommission, we should handle resource updates properly,
> including: make RMNode keep track of the old resource for possible rollback,
> keep available resource at 0, and update used resource when a
> container finishes.