[
https://issues.apache.org/jira/browse/YARN-4844?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15300389#comment-15300389
]
Varun Vasudev commented on YARN-4844:
-------------------------------------
Thanks for the updates [~leftnoteasy]. Some comments on the latest patch -
# It needs to be rebased - FSQueueMetrics had an issue
# We're using getMemoryLong() in 3 places -
*RMWebServices.java*
{code}
- if (newApp.getResource().getMemory() > rm.getConfig().getInt(
+ if (newApp.getResource().getMemoryLong() > rm.getConfig().getInt(
{code}
*FairSchedulerQueueInfo.java*
{code}
- fractionMemUsed = (float)usedResources.getMemory() /
- clusterResources.getMemory();
+ fractionMemUsed = (float)usedResources.getMemoryLong() /
+ clusterResources.getMemoryLong();
{code}
*ResourceInfo.java*
{code}
- public int getMemory() {
+ public long getMemoryLong() {
{code}
Maybe we should change these to getMemorySize() as well?
# Should we deprecate getVirtualCores as well, or will you take care of that in
a follow-up patch?
Rest of the patch looks good to me.
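To illustrate the rename and deprecation raised in the last two items, here is a minimal sketch of the accessor pattern. ResourceSketch is a hypothetical stand-in and not the actual o.a.h.y.api.records.Resource class; the saturating behavior of the legacy getter is an assumption made for the example, not something the patch is known to do.

```java
// Hypothetical stand-in for a resource record; NOT the real YARN Resource class.
final class ResourceSketch {
    private final long memoryMb;     // memory, assumed here to be tracked in MB
    private final int virtualCores;

    ResourceSketch(long memoryMb, int virtualCores) {
        this.memoryMb = memoryMb;
        this.virtualCores = virtualCores;
    }

    /** 64-bit accessor that callers would migrate to. */
    long getMemorySize() {
        return memoryMb;
    }

    /**
     * Legacy 32-bit accessor kept so existing callers still compile.
     *
     * @deprecated use {@link #getMemorySize()} instead.
     */
    @Deprecated
    int getMemory() {
        // Assumption for this sketch: saturate at Integer.MAX_VALUE
        // instead of letting the narrowing cast wrap to a negative value.
        return (int) Math.min(memoryMb, (long) Integer.MAX_VALUE);
    }

    int getVirtualCores() {
        return virtualCores;
    }
}
```

With this shape, call sites such as the three listed above can switch to the long getter one at a time, while the compiler flags any remaining uses of the deprecated int getter.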
> Add getMemoryLong/getVirtualCoreLong to o.a.h.y.api.records.Resource
> --------------------------------------------------------------------
>
> Key: YARN-4844
> URL: https://issues.apache.org/jira/browse/YARN-4844
> Project: Hadoop YARN
> Issue Type: Sub-task
> Components: api
> Reporter: Wangda Tan
> Assignee: Wangda Tan
> Priority: Blocker
> Attachments: YARN-4844.1.patch, YARN-4844.10.patch,
> YARN-4844.11.patch, YARN-4844.12.patch, YARN-4844.13.patch,
> YARN-4844.14.patch, YARN-4844.2.patch, YARN-4844.3.patch, YARN-4844.4.patch,
> YARN-4844.5.patch, YARN-4844.6.patch, YARN-4844.7.patch,
> YARN-4844.8.branch-2.patch, YARN-4844.8.patch, YARN-4844.9.branch,
> YARN-4844.9.branch-2.patch
>
>
> We use int32 for memory now. If a cluster has 10k nodes and each node has 210G
> of memory, we will get a negative total cluster memory.
> Another case that overflows int32 even more easily: we add all pending
> resources of running apps to the cluster's total pending resources. If a
> problematic app requires too many resources (let's say 1M+ containers, each
> of them 3G), int32 will not be enough.
> Even if we can cap each app's pending request, we cannot handle the case where
> there are many running apps, each with a capped but still significant
> amount of pending resources.
> So we may need to add getMemoryLong/getVirtualCoreLong to
> o.a.h.y.api.records.Resource.
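
The overflow in the description above can be reproduced with plain arithmetic. The sketch below assumes memory is counted in MB and that 210G means 210 * 1024 MB; the class and method names are made up for the example and are not part of YARN.

```java
// Illustrative only: shows how summing per-node memory in an int wraps around.
final class ClusterMemoryOverflow {

    // Sums per-node memory (MB) into a 32-bit int, as an int-based API would.
    static int totalMemoryMbInt(int nodes, int memoryPerNodeMb) {
        int total = 0;
        for (int i = 0; i < nodes; i++) {
            total += memoryPerNodeMb; // wraps silently past Integer.MAX_VALUE
        }
        return total;
    }

    // Same sum accumulated in a 64-bit long.
    static long totalMemoryMbLong(int nodes, int memoryPerNodeMb) {
        long total = 0L;
        for (int i = 0; i < nodes; i++) {
            total += memoryPerNodeMb;
        }
        return total;
    }

    public static void main(String[] args) {
        int nodes = 10_000;
        int perNodeMb = 210 * 1024; // 215,040 MB per node (assumed unit)

        System.out.println("int  total MB: " + totalMemoryMbInt(nodes, perNodeMb));
        System.out.println("long total MB: " + totalMemoryMbLong(nodes, perNodeMb));
        System.out.println("int max      : " + Integer.MAX_VALUE);
    }
}
```

Under these assumptions the true total is 2,150,400,000 MB, just above Integer.MAX_VALUE (2,147,483,647), so the int sum comes out negative while the long sum stays correct.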