[
https://issues.apache.org/jira/browse/YARN-291?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13542683#comment-13542683
]
Junping Du commented on YARN-291:
---------------------------------
Luke, thanks for the comments. I agree that changing node resources on the RM
directly via JMX is simpler, as it requires no protocol (NM-RM heartbeat or
Client-RM) changes. However, I don't understand your comment below:
"Changing resource on NM and propagating the change to RM would lead to nasty
race conditions where RM still thinks that an NM has enough resource and
schedule new containers on the already downsized NM, which should fail and RM
would need to reschedule the containers elsewhere"
The RM's scheduling event is triggered by the NM heartbeat (NM -> RM's
ResourceTrackerService -> RMNode (via RMNodeStatusEvent) -> Scheduler (via
NodeUpdateSchedulerEvent)). So I think that whether the change is made on the
NM or on the RM, the resource update will affect container assignment only on
the next NM-RM heartbeat, and I don't see a race condition here.
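To make the argument concrete, here is a minimal sketch of a heartbeat-driven
scheduler. It is a simplified model, not YARN code: the Node and Scheduler
classes, their fields, and onHeartbeat() are all hypothetical, and it assumes
the heartbeat carries the node's current capacity and that the scheduler
refreshes its view before assigning a container. It does not model containers
that were already assigned before the node was downsized.

```java
// Simplified, hypothetical model of heartbeat-driven scheduling.
// None of these classes are real YARN types.
class HeartbeatSchedulingSketch {

    /** NM-side state: the capacity the node reports on its next heartbeat. */
    static class Node {
        int capacityMb;

        Node(int capacityMb) {
            this.capacityMb = capacityMb;
        }
    }

    /** RM-side state: the scheduler acts only when a heartbeat arrives. */
    static class Scheduler {
        int knownCapacityMb;
        int allocatedMb = 0;

        Scheduler(int initialCapacityMb) {
            this.knownCapacityMb = initialCapacityMb;
        }

        /**
         * Handles one heartbeat: first refresh the node's capacity, then try
         * to assign a single container of the requested size.
         * Returns true if the container was assigned.
         */
        boolean onHeartbeat(Node node, int requestMb) {
            knownCapacityMb = node.capacityMb;               // resource update
            int availableMb = knownCapacityMb - allocatedMb;
            if (requestMb <= availableMb) {                  // scheduling decision
                allocatedMb += requestMb;
                return true;
            }
            return false;
        }
    }

    public static void main(String[] args) {
        Node node = new Node(8192);
        Scheduler scheduler = new Scheduler(8192);

        // First heartbeat: 8192 MB free, so a 4096 MB container fits.
        System.out.println(scheduler.onHeartbeat(node, 4096));

        // The node is downsized on the NM side between heartbeats.
        node.capacityMb = 4096;

        // Next heartbeat: the new capacity is seen before any assignment,
        // so no further container is placed on the downsized node.
        System.out.println(scheduler.onHeartbeat(node, 4096));
    }
}
```

In this model the capacity refresh and the assignment decision happen in the
same heartbeat handler, so the scheduler never assigns against a stale
capacity. Whether the real code path gives the same guarantee is exactly the
point under discussion.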
I think we can allow resource changes to happen on both the RM and the NM, as
long as they are consistent after a heartbeat exchange. Allowing resource
changes on the RM (whether via JMX or RPC; in fact, RPC is not much more
complicated, as the 03 patch shows) is, like you said, simple and
straightforward. On the other hand, there is already a dummy
NodeResourceMonitor in the NodeManager, and I remember there is a JIRA (I
forget the number) about detecting the OS's resources rather than relying on
static configuration. So I think allowing resource changes on the NM is also a
good way to go (at least, heartbeats that carry resource info can benefit
other work). Thoughts?
> Dynamic resource configuration on NM
> ------------------------------------
>
> Key: YARN-291
> URL: https://issues.apache.org/jira/browse/YARN-291
> Project: Hadoop YARN
> Issue Type: New Feature
> Components: nodemanager, scheduler
> Reporter: Junping Du
> Assignee: Junping Du
> Labels: features
> Attachments: Elastic Resources for YARN-v0.2.pdf,
> YARN-291-AddClientRMProtocolToSetNodeResource-03.patch,
> YARN-291-all-v1.patch, YARN-291-core-HeartBeatAndScheduler-01.patch,
> YARN-291-JMXInterfaceOnNM-02.patch,
> YARN-291-OnlyUpdateWhenResourceChange-01-fix.patch,
> YARN-291-YARNClientCommandline-04.patch
>
>
> The current Hadoop YARN resource management logic assumes per node resource
> is static during the lifetime of the NM process. Allowing run-time
> configuration on per node resource will give us finer granularity of resource
> elasticity. This allows Hadoop workloads to coexist with other workloads on
> the same hardware efficiently, whether or not the environment is virtualized.
> For more background and design details, please refer to HADOOP-9165.