[ 
https://issues.apache.org/jira/browse/YARN-689?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13677247#comment-13677247
 ] 

Bikas Saha commented on YARN-689:
---------------------------------

bq. the MRAppMaster obtains the MIN/MAX values from the registration response, 
see the RMCommunicator#register() method.
That's a good observation. The MAX value is definitely important because it's 
invalid to ask for more than the max and the scheduler will throw an exception. 
I am open to removing the MIN value since it looks like a scheduling artifact 
that has been unnecessarily exposed to the user.

bq. And this normalization is also 'misusing' the current minimum as the 
increment for the normalization.
The MR app knows too much about the internal details and IMO this is precisely 
what we should try to avoid.

bq. As a follow up JIRA to this one I plan to fix MR to use the increment 
instead.
IMO, we should fix MR to not understand scheduler details (and that's not hard) 
instead of changing it to use this multiplier.

bq. There is value for an AM to know the normalized capacity as based on that 
it can decide at allocation request time how to plan its processing 
distribution and further allocations (if I ask for 1.2 GB and I'll be getting 
2GB I can bump at planning phase how much will do in that container and correct 
my subsequent allocation requests to be less).
I am sorry, but I don't quite agree with this approach because it is fragile. I 
expect scheduler heuristics (to maximize utilization, etc.) to keep evolving. If 
apps start developing heuristics based on an understanding of how certain 
schedulers currently do their calculations, then those apps are going to be open 
to disruption. Do we want to support backwards compatibility for this type of 
multiplier-based calculation? What if we implement a scheduler that solves the 
maximization problem as a graph flow problem? That would be totally different 
from our current box-fitting approach.

bq. The change doesn't make the code more complicated. Instead of the range of 
allowable requests being [1, n] (normalized by the minimum), it becomes [m, n], 
for configurable m and n, with m = 1 as the default.
I am not opposing this change so there is no need to defend it. Based on some 
of the examples above, it looks like there is merit to this approach. I am only 
trying to not expose these internal details to the users via the API. That's all.
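
For reference, the normalization being discussed can be sketched roughly as 
below. This is only an illustration under stated assumptions, not the scheduler 
code or the patch: the class NormalizeSketch, the method normalize() and its 
parameter names are hypothetical, and rejecting an over-max request with an 
IllegalArgumentException is a stand-in for whatever the scheduler actually 
throws. With the increment equal to the minimum (today's behaviour) a 1.5GB 
request becomes 2GB; with a separate, smaller increment it would stay at 1.5GB.

```java
public class NormalizeSketch {

    /**
     * Hypothetical helper (not a YARN API): raise the request to at least
     * minMb, round it up to the next multiple of incrementMb, and never
     * return more than maxMb. Requests above maxMb are rejected.
     */
    static int normalize(int requestedMb, int minMb, int incrementMb, int maxMb) {
        if (requestedMb > maxMb) {
            // Assumption: stands in for the exception the scheduler throws.
            throw new IllegalArgumentException(
                "requested " + requestedMb + "MB exceeds max " + maxMb + "MB");
        }
        int atLeastMin = Math.max(requestedMb, minMb);
        // Integer ceiling division, then scale back up to a multiple.
        int rounded = ((atLeastMin + incrementMb - 1) / incrementMb) * incrementMb;
        return Math.min(rounded, maxMb);
    }

    public static void main(String[] args) {
        // Minimum doubles as the increment: 1.5GB is rounded up to 2GB.
        System.out.println(normalize(1536, 1024, 1024, 8192));
        // Separate 512MB increment with a 1GB minimum: 1.5GB is kept as is.
        System.out.println(normalize(1536, 1024, 512, 8192));
    }
}
```

An app that hard-codes this arithmetic is relying on exactly the kind of 
scheduler detail described above, so the sketch shows what would be exposed 
rather than something an AM should depend on.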

                
> Add multiplier unit to resourcecapabilities
> -------------------------------------------
>
>                 Key: YARN-689
>                 URL: https://issues.apache.org/jira/browse/YARN-689
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: api, scheduler
>    Affects Versions: 2.0.4-alpha
>            Reporter: Alejandro Abdelnur
>            Assignee: Alejandro Abdelnur
>         Attachments: YARN-689.patch, YARN-689.patch, YARN-689.patch, 
> YARN-689.patch, YARN-689.patch
>
>
> Currently we are overloading the minimum resource value as the actual 
> multiplier used by the scheduler.
> Today with a minimum memory set to 1GB, requests for 1.5GB are always 
> translated to allocation of 2GB.
> We should decouple the minimum allocation from the multiplier.
> The multiplier should also be exposed to the client via the 
> RegisterApplicationMasterResponse

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
