[ 
https://issues.apache.org/jira/browse/YARN-4844?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15300389#comment-15300389
 ] 

Varun Vasudev commented on YARN-4844:
-------------------------------------

Thanks for the updates [~leftnoteasy]. Some comments on the latest patch -
# It needs to be rebased - FSQueueMetrics had an issue
# We're using getMemoryLong() in 3 places -
*RMWebServices.java*
{code}
-    if (newApp.getResource().getMemory() > rm.getConfig().getInt(
+    if (newApp.getResource().getMemoryLong() > rm.getConfig().getInt(
{code}
*FairSchedulerQueueInfo.java*
{code}
-    fractionMemUsed = (float)usedResources.getMemory() /
-        clusterResources.getMemory();
+    fractionMemUsed = (float)usedResources.getMemoryLong() /
+        clusterResources.getMemoryLong();
{code}
*ResourceInfo.java*
{code}
-  public int getMemory() {
+  public long getMemoryLong() {
{code}
Maybe we should change these to getMemorySize() as well?
# Should we deprecate getVirtualCores as well, or will you take care of that in 
a follow-up patch?

Rest of the patch looks good to me.

> Add getMemoryLong/getVirtualCoreLong to o.a.h.y.api.records.Resource
> --------------------------------------------------------------------
>
>                 Key: YARN-4844
>                 URL: https://issues.apache.org/jira/browse/YARN-4844
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: api
>            Reporter: Wangda Tan
>            Assignee: Wangda Tan
>            Priority: Blocker
>         Attachments: YARN-4844.1.patch, YARN-4844.10.patch, 
> YARN-4844.11.patch, YARN-4844.12.patch, YARN-4844.13.patch, 
> YARN-4844.14.patch, YARN-4844.2.patch, YARN-4844.3.patch, YARN-4844.4.patch, 
> YARN-4844.5.patch, YARN-4844.6.patch, YARN-4844.7.patch, 
> YARN-4844.8.branch-2.patch, YARN-4844.8.patch, YARN-4844.9.branch, 
> YARN-4844.9.branch-2.patch
>
>
> We currently use int32 for memory. If a cluster has 10k nodes, each with 210G 
> of memory, the total cluster memory overflows and becomes negative.
> Another case that overflows int32 even more easily: we add all pending 
> resources of running apps to the cluster's total pending resources. If a 
> problematic app requests too many resources (say 1M+ containers of 3G 
> each), int32 will not be enough.
> Even if we cap each app's pending requests, we cannot handle the case where 
> there are many running apps, each with a capped but still significant 
> amount of pending resources.
> So we may need to add getMemoryLong/getVirtualCoreLong to 
> o.a.h.y.api.records.Resource.
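
The overflow described in the issue can be checked with a small sketch. It assumes memory is counted in MB and summed with plain 32-bit arithmetic; the class and method names are hypothetical and are not part of the patch or of the YARN API.

```java
public class MemoryOverflowSketch {
    // Hypothetical helper: total memory in MB using 32-bit arithmetic,
    // as would happen when summing values from an int-returning getter.
    static int totalMemoryMbInt(int count, int memoryEachMb) {
        return count * memoryEachMb; // silently wraps around on overflow
    }

    // Hypothetical helper: the same total, widened to 64 bits before multiplying.
    static long totalMemoryMbLong(int count, int memoryEachMb) {
        return (long) count * memoryEachMb;
    }

    public static void main(String[] args) {
        // 10k nodes with 210G each, expressed in MB.
        System.out.println(totalMemoryMbInt(10_000, 210 * 1024));   // -2144567296
        System.out.println(totalMemoryMbLong(10_000, 210 * 1024));  // 2150400000

        // 1M containers of 3G each, expressed in MB.
        System.out.println(totalMemoryMbInt(1_000_000, 3 * 1024));  // -1222967296
        System.out.println(totalMemoryMbLong(1_000_000, 3 * 1024)); // 3072000000
    }
}
```

Both totals exceed Integer.MAX_VALUE (2147483647), so the int version wraps to a negative number while the long version keeps the expected value.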



