[ 
https://issues.apache.org/jira/browse/YARN-4844?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15300572#comment-15300572
 ] 

Sunil G commented on YARN-4844:
-------------------------------

Hi [~leftnoteasy]
I have one doubt..
In {{CheckpointAMPreemptionPolicy}}, *totalMemoryToRelease* is changed to long 
and *pendingPreemptionRam* is still integer. And we have below calculation also 
happening.
{code}
          if (pendingPreemptionRam > 0) {
            // if goes negative we simply exit
            totalMemoryToRelease -= pendingPreemptionRam;
            // decrement pending resources if zero or negatve we will
            // ignore it while processing next PreemptionResourceRequest
            pendingPreemptionRam -= totalMemoryToRelease;
          }
{code}
Is it better if we keep same data type for both totalMemoryToRelease and 
totalMemoryToRelease.

> Add getMemoryLong/getVirtualCoreLong to o.a.h.y.api.records.Resource
> --------------------------------------------------------------------
>
>                 Key: YARN-4844
>                 URL: https://issues.apache.org/jira/browse/YARN-4844
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: api
>            Reporter: Wangda Tan
>            Assignee: Wangda Tan
>            Priority: Blocker
>         Attachments: YARN-4844.1.patch, YARN-4844.10.patch, 
> YARN-4844.11.patch, YARN-4844.12.patch, YARN-4844.13.patch, 
> YARN-4844.14.patch, YARN-4844.15.patch, YARN-4844.2.patch, YARN-4844.3.patch, 
> YARN-4844.4.patch, YARN-4844.5.patch, YARN-4844.6.patch, YARN-4844.7.patch, 
> YARN-4844.8.branch-2.patch, YARN-4844.8.patch, YARN-4844.9.branch, 
> YARN-4844.9.branch-2.patch
>
>
> We use int32 for memory now, if a cluster has 10k nodes, each node has 210G 
> memory, we will get a negative total cluster memory.
> And another case that easier overflows int32 is: we added all pending 
> resources of running apps to cluster's total pending resources. If a 
> problematic app requires too much resources (let's say 1M+ containers, each 
> of them has 3G containers), int32 will be not enough.
> Even if we can cap each app's pending request, we cannot handle the case that 
> there're many running apps, each of them has capped but still significant 
> numbers of pending resources.
> So we may possibly need to add getMemoryLong/getVirtualCoreLong to 
> o.a.h.y.api.records.Resource.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to