[ 
https://issues.apache.org/jira/browse/YARN-4844?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15257142#comment-15257142
 ] 

Wangda Tan commented on YARN-4844:
----------------------------------

bq. So the plan is to force users to change their usage of these APIs in some 
version of 3.x but not in 3.0.0 ?
Regardless of debates about the first release of 3.x, let's assume it happens 
soon.
My plan is to make sure incompatible API changes get in when 3.x enters 
beta releases. We have a couple of other API changes on the way; in YARN, for 
example, ATSv2, the new web UI, etc.

bq. Additionally we are not talking about use in production but rather making 
upstream apps change as needed to work with 3.x and over time stabilize 3.x.
Per my understanding, changing from int to long won't affect downstream 
projects much: it's an error the compiler catches directly. Also, 
getMemory/getVCores should not be used heavily by downstream projects. For 
example, MR calls getMemory()/getVCores() only ~20 times in non-test code, 
which can be easily fixed.
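
To illustrate the kind of compiler-caught error meant here, a minimal sketch follows. ResourceLike is a hypothetical stand-in, not the real o.a.h.y.api.records.Resource class, and it assumes the getter's return type simply changes from int to long.

```java
public class WideningSketch {
    // Hypothetical stand-in for a Resource-like record after the change.
    static final class ResourceLike {
        private final long memory;

        ResourceLike(long memory) {
            this.memory = memory;
        }

        // Assumed to have returned int before the change.
        long getMemory() {
            return memory;
        }
    }

    public static void main(String[] args) {
        ResourceLike r = new ResourceLike(3L * 1024 * 1024 * 1024);

        // int mem = r.getMemory();  // rejected by javac: lossy conversion from long to int
        long mem = r.getMemory();    // the one-line fix at the call site

        System.out.println(mem);
    }
}
```

Call sites that assign the result to an int fail to compile, so they are found immediately. Call sites that pass the value into an expression that already accepts long keep compiling unchanged.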

bq. Making an API change earlier rather than later is actually better as the 
API changes in this case have no relevance to production stability.
I agree that it won't affect production stability. However, it adds overhead 
to development work, which I don't want.

> Upgrade fields of o.a.h.y.api.records.Resource from int32 to int64
> ------------------------------------------------------------------
>
>                 Key: YARN-4844
>                 URL: https://issues.apache.org/jira/browse/YARN-4844
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: api
>            Reporter: Wangda Tan
>            Assignee: Wangda Tan
>            Priority: Blocker
>         Attachments: YARN-4844.1.patch, YARN-4844.2.patch
>
>
> We use int32 for memory now; if a cluster has 10k nodes and each node has 
> 210G of memory, we will get a negative total cluster memory.
> Another case that overflows int32 more easily: we add all pending 
> resources of running apps to the cluster's total pending resources. If a 
> problematic app requests too many resources (say 1M+ containers, each of 
> them 3G), int32 will not be enough.
> Even if we cap each app's pending request, we cannot handle the case where 
> there are many running apps, each with a capped but still significant 
> amount of pending resources.
> So we may need to upgrade the int32 memory field (possibly v-cores as well) 
> to int64 to avoid integer overflow. 
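
The arithmetic in the description above can be checked with a short sketch. It assumes memory is counted in MB and that 210G means 210 * 1024 MB; the class and method names are hypothetical and are not taken from the patch.

```java
public class OverflowSketch {
    // Sums per-node memory (MB) in a 32-bit accumulator, as an int32 field would.
    static int totalMemoryInt(int nodes, int mbPerNode) {
        int total = 0;
        for (int i = 0; i < nodes; i++) {
            total += mbPerNode; // wraps silently once the sum passes Integer.MAX_VALUE
        }
        return total;
    }

    // Same sum with a 64-bit result; the cast forces the multiplication into long.
    static long totalMemoryLong(int nodes, int mbPerNode) {
        return (long) nodes * mbPerNode;
    }

    public static void main(String[] args) {
        int nodes = 10_000;          // 10k nodes
        int mbPerNode = 210 * 1024;  // 210G per node, expressed in MB (assumption)

        System.out.println("int32 total: " + totalMemoryInt(nodes, mbPerNode));
        System.out.println("int64 total: " + totalMemoryLong(nodes, mbPerNode));
        System.out.println("int32 max:   " + Integer.MAX_VALUE);
    }
}
```

Under these assumptions the true total is 2,150,400,000 MB, just above the int32 maximum of 2,147,483,647, so the 32-bit sum comes out negative while the 64-bit sum is correct.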



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
