[ https://issues.apache.org/jira/browse/YARN-689?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13677247#comment-13677247 ]
Bikas Saha commented on YARN-689:
---------------------------------
bq. the MRAppMaster obtains the MIN/MAX values from the registration response,
see the RMCommunicator#register() method.
That's a good observation. The MAX value is definitely important because it's
invalid to ask for more than the max, and the scheduler will throw an exception.
I am open to removing the MIN value since it looks like a scheduling artifact
that has been unnecessarily exposed to the user.
bq. And this normalization is also 'misusing' the current minimum as the
increment for the normalization.
The MR app knows too much about the internal details and IMO this is precisely
what we should try to avoid.
bq. As a follow up JIRA to this one I plan to fix MR to use the increment
instead.
IMO, we should fix MR to not understand scheduler details (and that's not hard)
instead of changing it to use this multiplier.
bq. There is value for an AM to know the normalized capacity as based on that
it can decide at allocation request time how to plan its processing
distribution and further allocations (if I ask for 1.2 GB and I'll be getting
2GB I can bump at planning phase how much will do in that container and correct
my subsequent allocation requests to be less).
I am sorry, but I don't quite agree with this approach because it is fragile. I
expect scheduler heuristics (to maximize utilization, etc.) to keep evolving. If
apps start developing heuristics based on an understanding of how certain
schedulers currently do their calculations, then those apps are going to be open
to disruption. Do we want to support backwards compatibility for this type of
multiplier-based calculation? What if we implement a scheduler that solves the
maximization problem as a graph flow problem? That would be totally different
from our current box-fitting approach.
bq. The change doesn't make the code more complicated. Instead of the range of
allowable requests being [1, n] (normalized by the minimum), it becomes [m, n],
for configurable m and n, with m = 1 as the default.
I am not opposing this change, so there is no need to defend it. Based on some
of the examples above, it looks like there is merit to this approach. I am only
trying to not expose these internal details to the users via the API. That's all.
> Add multiplier unit to resourcecapabilities
> -------------------------------------------
>
> Key: YARN-689
> URL: https://issues.apache.org/jira/browse/YARN-689
> Project: Hadoop YARN
> Issue Type: Sub-task
> Components: api, scheduler
> Affects Versions: 2.0.4-alpha
> Reporter: Alejandro Abdelnur
> Assignee: Alejandro Abdelnur
> Attachments: YARN-689.patch, YARN-689.patch, YARN-689.patch,
> YARN-689.patch, YARN-689.patch
>
>
> Currently we are overloading the minimum resource value as the actual
> multiplier used by the scheduler.
> Today, with the minimum memory set to 1GB, requests for 1.5GB are always
> translated to an allocation of 2GB.
> We should decouple the minimum allocation from the multiplier.
> The multiplier should also be exposed to the client via the
> RegisterApplicationMasterResponse.
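
To make the arithmetic in the description concrete, here is a minimal sketch of
what decoupling the minimum from the multiplier could look like. It is not the
code from the attached patch or from any YARN scheduler: the class
ResourceNormalizer, the normalize() method, and all parameter names are
hypothetical, and the rounding rule (round up to a multiple of the increment,
then apply the minimum) is an assumption made for illustration only.

```java
// Hypothetical illustration only; not the YARN-689 patch or any scheduler's code.
final class ResourceNormalizer {
    private ResourceNormalizer() {}

    /**
     * Rounds a memory request up to a multiple of incrementMb, never returning
     * less than minMb or more than maxMb. Requests above maxMb are rejected,
     * mirroring the "scheduler will throw an exception" behaviour discussed
     * in the comment (the exception type here is an assumption).
     */
    static int normalize(int requestedMb, int minMb, int incrementMb, int maxMb) {
        if (minMb <= 0 || incrementMb <= 0 || maxMb < minMb) {
            throw new IllegalArgumentException("invalid scheduler limits");
        }
        if (requestedMb < 0 || requestedMb > maxMb) {
            throw new IllegalArgumentException("request outside [0, max]: " + requestedMb);
        }
        // Integer ceiling division: round up to the next multiple of the increment.
        int roundedUp = ((requestedMb + incrementMb - 1) / incrementMb) * incrementMb;
        // Apply the minimum, then cap at the maximum in case max is not a multiple.
        return Math.min(Math.max(roundedUp, minMb), maxMb);
    }

    public static void main(String[] args) {
        // Today's behaviour: the minimum (1024 MB) doubles as the increment.
        System.out.println(normalize(1536, 1024, 1024, 8192)); // 2048
        // Decoupled: same minimum, but a finer 512 MB increment.
        System.out.println(normalize(1536, 1024, 512, 8192));  // 1536
    }
}
```

With the increment equal to the minimum, this reproduces the 1.5GB to 2GB
example above. With a separate, smaller increment, the same request is granted
as asked. Whether clients should see the increment at all is the open question
in this thread.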