[ 
https://issues.apache.org/jira/browse/YARN-4844?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15279144#comment-15279144
 ] 

Wangda Tan commented on YARN-4844:
----------------------------------

[~ozawa],

We have already discussed this on this JIRA.

To summarize quickly, what we plan to do is: 
- In branch-2, add getMemoryLong and getVCoresLong methods, and change all YARN 
internal uses of Resource.getMemory to Resource.getMemoryLong. We only change 
the getMemory callers because only the memory of pending resources is likely 
to overflow.
- In branch-2, mark getMemoryLong and getVCoresLong as private, since we 
plan to change the return type of getMemory and getVirtualCores directly to 
long in Hadoop 3.x releases. It's better not to ask applications to update 
their code before branch-3. This is also why I didn't deprecate the int 
getters in the Resource object.
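
As a rough sketch of the plan above (SketchResource is a hypothetical stand-in, not the actual Resource class or the attached patch): the value is stored as a long, the existing int getter stays for applications, and internal callers move to the long getter. Clamping the int getter at Integer.MAX_VALUE is an assumption of this sketch only.

```java
// Hypothetical stand-in for a resource record; NOT the real YARN Resource class.
final class SketchResource {
    private final long memoryMb; // memory in MB, stored as a 64-bit value
    private final long vcores;

    SketchResource(long memoryMb, long vcores) {
        this.memoryMb = memoryMb;
        this.vcores = vcores;
    }

    // Existing int getter, kept so applications do not have to change code.
    // Assumption of this sketch: values above Integer.MAX_VALUE are clamped.
    int getMemory() {
        return (int) Math.min(memoryMb, Integer.MAX_VALUE);
    }

    // New long getter, intended for internal callers only.
    long getMemoryLong() {
        return memoryMb;
    }

    // Same pattern for virtual cores.
    int getVirtualCores() {
        return (int) Math.min(vcores, Integer.MAX_VALUE);
    }

    long getVCoresLong() {
        return vcores;
    }
}
```

Internal code that sums pending resources would call getMemoryLong, while application code keeps compiling against getMemory.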

Please let me know your thoughts.

Thanks,

> Add getMemoryLong/getVirtualCoreLong to o.a.h.y.api.records.Resource
> --------------------------------------------------------------------
>
>                 Key: YARN-4844
>                 URL: https://issues.apache.org/jira/browse/YARN-4844
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: api
>            Reporter: Wangda Tan
>            Assignee: Wangda Tan
>            Priority: Blocker
>         Attachments: YARN-4844.1.patch, YARN-4844.2.patch, YARN-4844.3.patch, 
> YARN-4844.4.patch, YARN-4844.5.patch, YARN-4844.6.patch, YARN-4844.7.patch
>
>
> We use int32 for memory now. If a cluster has 10k nodes and each node has 210G 
> of memory, we will get a negative total cluster memory.
> Another case that overflows int32 even more easily: we add all pending 
> resources of running apps to the cluster's total pending resources. If a 
> problematic app requests too many resources (let's say 1M+ containers, each 
> of them 3G), int32 will not be enough.
> Even if we cap each app's pending requests, we cannot handle the case where 
> there are many running apps, each with a capped but still significant 
> amount of pending resources.
> So we may need to add getMemoryLong/getVirtualCoreLong to 
> o.a.h.y.api.records.Resource.
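
The arithmetic in the description can be checked with a small sketch, assuming memory is tracked in megabytes. MemoryOverflowDemo and its method names are hypothetical and only illustrate the overflow; they are not YARN code.

```java
// Illustrative only: class and method names are hypothetical, not YARN code.
final class MemoryOverflowDemo {
    // Multiplies in 32-bit arithmetic, so the result wraps past Integer.MAX_VALUE.
    static int totalMbAsInt(int count, int mbEach) {
        return count * mbEach;
    }

    // Widens to 64 bits before multiplying, so the result is exact.
    static long totalMbAsLong(int count, int mbEach) {
        return (long) count * mbEach;
    }

    public static void main(String[] args) {
        int nodeMb = 210 * 1024;      // 210G per node, expressed in MB
        int containerMb = 3 * 1024;   // 3G per container, expressed in MB

        // 10k nodes: the int total wraps to a negative number.
        System.out.println(totalMbAsInt(10_000, nodeMb));
        System.out.println(totalMbAsLong(10_000, nodeMb));

        // 1M pending containers: the int total wraps as well.
        System.out.println(totalMbAsInt(1_000_000, containerMb));
        System.out.println(totalMbAsLong(1_000_000, containerMb));
    }
}
```

Both totals exceed Integer.MAX_VALUE (2,147,483,647) in MB, which is why the int results come out negative while the long results stay correct.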



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
