[
https://issues.apache.org/jira/browse/YARN-291?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13542825#comment-13542825
]
Luke Lu commented on YARN-291:
------------------------------
Regarding the race condition, I meant that you could change the resource on the
NM in the middle of a heartbeat, when the RM has already assigned some
containers to the NM based on the last heartbeat but the NM no longer has the
resource to launch these containers. It's relatively harmless now, as the NM
does not enforce its resource limit when launching containers, but the race
condition can rear its ugly head if someone later decides to add code to
enforce the limit when adding more resource features such as GPUs. The race
condition would not be obvious to later maintainers.
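
To make the race concrete, here is a minimal sketch in plain Java. All class
and method names are hypothetical and do not correspond to the actual
NodeManager or scheduler code; the enforcing launch check is exactly the
kind of later change described above, not something the NM does today.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical names throughout; this is not the YARN NodeManager or scheduler API.
class StaleAssignmentSketch {

    /** NM-side view: a memory limit that can be changed at run time. */
    static class Node {
        int totalMb;
        int usedMb = 0;

        Node(int totalMb) {
            this.totalMb = totalMb;
        }

        /** Launch with the limit enforced (the hypothetical later change). */
        boolean launchEnforced(int containerMb) {
            if (usedMb + containerMb > totalMb) {
                return false;
            }
            usedMb += containerMb;
            return true;
        }
    }

    /** RM-side view: fills the capacity it saw at the last heartbeat. */
    static List<Integer> assign(int capacityAtLastHeartbeatMb, int containerMb) {
        List<Integer> assigned = new ArrayList<>();
        for (int free = capacityAtLastHeartbeatMb; free >= containerMb; free -= containerMb) {
            assigned.add(containerMb);
        }
        return assigned;
    }

    /** Returns how many already-assigned containers the node refuses to launch. */
    static int rejectedAfterShrink(int reportedMb, int shrunkMb, int containerMb) {
        Node node = new Node(reportedMb);
        // The RM assigns against the capacity from the last heartbeat.
        List<Integer> assigned = assign(node.totalMb, containerMb);
        // The resource is changed on the NM before those containers launch.
        node.totalMb = shrunkMb;
        int rejected = 0;
        for (int mb : assigned) {
            if (!node.launchEnforced(mb)) {
                rejected++;
            }
        }
        return rejected;
    }

    public static void main(String[] args) {
        System.out.println(rejectedAfterShrink(8192, 4096, 2048));
    }
}
```

With an 8192 MB node shrunk to 4096 MB and 2048 MB containers, the RM has
assigned four containers but an enforcing NM can launch only two of them.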
With the RM-only approach, I think that it's OK/harmless not to change
totalResource at all on the NM side and to leave it as the original resource
limit, as long as we set the new resource limit <= the original limit, even if
limit enforcement is added later. This makes the change an order of magnitude
smaller and less invasive, as it doesn't have to change the node update code
paths. It would also be wasteful to heartbeat the node resource (especially
when it is later enhanced to include features besides memory), since it
doesn't change most of the time.
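
A rough sketch of that RM-only idea, again in plain Java with hypothetical
names (RmNode, setSchedulableMb) rather than anything from the attached
patches: the RM keeps its own schedulable limit per node and refuses values
above what the NM originally registered, so the NM-side total never has to
change.

```java
// Hypothetical names; a sketch of the RM-only idea, not the YARN-291 patch.
class RmOnlyLimitSketch {

    /** RM-side record for one node. */
    static class RmNode {
        // What the NM registered with; the NM keeps this value unchanged.
        final int originalMb;
        // What the RM scheduler is allowed to hand out on this node.
        private int schedulableMb;

        RmNode(int originalMb) {
            this.originalMb = originalMb;
            this.schedulableMb = originalMb;
        }

        /** Change made on the RM only; values above the original limit are rejected. */
        void setSchedulableMb(int requestedMb) {
            if (requestedMb < 0 || requestedMb > originalMb) {
                throw new IllegalArgumentException(
                    "limit must be between 0 and the original " + originalMb + " MB");
            }
            schedulableMb = requestedMb;
        }

        int getSchedulableMb() {
            return schedulableMb;
        }
    }

    /** True if an NM enforcing its original limit could launch everything the RM may assign. */
    static boolean nmWouldAccept(RmNode node) {
        return node.getSchedulableMb() <= node.originalMb;
    }

    public static void main(String[] args) {
        RmNode node = new RmNode(8192);
        node.setSchedulableMb(4096);
        System.out.println(node.getSchedulableMb() + " " + nmWouldAccept(node));
    }
}
```

Because the schedulable limit can never exceed the original one, whatever the
RM assigns fits within the NM's unchanged total, and nothing extra has to be
carried in the heartbeat.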
> Dynamic resource configuration on NM
> ------------------------------------
>
> Key: YARN-291
> URL: https://issues.apache.org/jira/browse/YARN-291
> Project: Hadoop YARN
> Issue Type: New Feature
> Components: nodemanager, scheduler
> Reporter: Junping Du
> Assignee: Junping Du
> Labels: features
> Attachments: Elastic Resources for YARN-v0.2.pdf,
> YARN-291-AddClientRMProtocolToSetNodeResource-03.patch,
> YARN-291-all-v1.patch, YARN-291-core-HeartBeatAndScheduler-01.patch,
> YARN-291-JMXInterfaceOnNM-02.patch,
> YARN-291-OnlyUpdateWhenResourceChange-01-fix.patch,
> YARN-291-YARNClientCommandline-04.patch
>
>
> The current Hadoop YARN resource management logic assumes per node resource
> is static during the lifetime of the NM process. Allowing run-time
> configuration on per node resource will give us finer granularity of resource
> elasticity. This allows Hadoop workloads to coexist with other workloads on
> the same hardware efficiently, whether or not the environment is virtualized.
> For more background and design details, please refer to HADOOP-9165.