[ 
https://issues.apache.org/jira/browse/YARN-689?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13663473#comment-13663473
 ] 

Alejandro Abdelnur commented on YARN-689:
-----------------------------------------

[~acmurthy], let me try to be a bit clearer by going over this again from the start.

1. I have a set of services co-existing with a Yarn cluster.

2. These services run out of band from Yarn. They are not started as Yarn 
containers and they don't use Yarn containers for processing.

3. These services use, dynamically, different amounts of CPU and memory based 
on their load. They manage their CPU and memory requirements independently. In 
other words, depending on their load, they may require more CPU but not memory 
or vice-versa.

By using Yarn as the RM for these services I'm able to share and utilize the 
resources of the cluster appropriately and in a dynamic way. Yarn keeps tabs on 
all the resources.

These services run an AM that reserves resources on their behalf. When this AM 
gets the requested resources, the services bump up their CPU/memory utilization 
out of band from Yarn. If the Yarn allocations are released/preempted, the 
services back off on their resource utilization. By doing this, Yarn and these 
services correctly share the cluster resources, with the Yarn RM being the only 
one that does the overall resource bookkeeping.

So as not to break the lifecycle of containers, the services' AM starts 
containers in the corresponding NMs. These container processes basically do a 
sleep forever (e.g. sleep 10000d). They use almost no CPU or memory (less than 
1MB). Thus it is reasonable to assume their required CPU and memory utilization 
is NIL (more on hard enforcement later).

Because of this NIL utilization of CPU and memory, it is possible to specify, 
when doing a request, zero as one of the dimensions (CPU or memory).

The current limitation is that, because we are overloading the minimum to also 
be the multiplier, setting it to zero does not work.

If we set the current minimum to 1MB and 1CPU....

When doing a pure CPU request, we would have to specify 1MB of memory. That 
would work. However, it would also allow arbitrary memory requests without the 
desired normalization (increments of 256, 512, etc).

When doing a pure memory request, we would have to specify 1CPU. CPU amounts are 
much smaller than memory amounts, and because we don't have fractional CPUs, it 
would mean that all my pure memory requests would be wasting 1 CPU each, thus 
reducing the overall utilization of the cluster.

Decoupling minimum and multiplier solves this problem.

Regarding your comment on 2 AMs asking for 6.8GB and 5.9GB: having the minimum 
and multiplier decoupled still allows normalizing values to big round values 
even if the minimum is set to zero, e.g. by setting the multiplier to 512 or 
1024. Without decoupling minimum and multiplier, this is not possible. Using 1MB 
and 1CPU to pseudo-address my use case with the current functionality would 
make me run into exactly what you point out: *assume significant complexity (and 
this now a best-fit problem) for unclear gains.*
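
To make the difference concrete, here is a minimal sketch of the two 
normalization schemes. It is only an illustration, not the code in the patch: 
the class and method names are made up, the formula max(minimum, round up to 
the multiplier) is my assumption of how the normalization would look, and the 
GB figures are rounded to whole MBs.

```java
// Hypothetical sketch; none of these names are part of the YARN API.
final class NormalizeSketch {

    /** Rounds value up to the nearest multiple of step (step must be > 0). */
    static int roundUp(int value, int step) {
        return ((value + step - 1) / step) * step;
    }

    /** Today, as described in this issue: the minimum doubles as the multiplier. */
    static int normalizeCoupled(int requestedMb, int minimumMb) {
        return Math.max(minimumMb, roundUp(requestedMb, minimumMb));
    }

    /** Proposed: minimum and multiplier are independent, so the minimum may be zero. */
    static int normalizeDecoupled(int requestedMb, int minimumMb, int multiplierMb) {
        return Math.max(minimumMb, roundUp(requestedMb, multiplierMb));
    }

    public static void main(String[] args) {
        // Minimum of 1GB: a 1.5GB request becomes 2GB.
        System.out.println(normalizeCoupled(1536, 1024));      // 2048
        // Minimum of 1MB: no useful rounding happens any more.
        System.out.println(normalizeCoupled(6963, 1));         // 6963
        // Minimum of 0 with a multiplier of 512: zero stays zero,
        // and ~6.8GB / ~5.9GB still land on round values.
        System.out.println(normalizeDecoupled(0, 0, 512));     // 0
        System.out.println(normalizeDecoupled(6963, 0, 512));  // 7168
        System.out.println(normalizeDecoupled(6042, 0, 512));  // 6144
    }
}
```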

Finally, on hard enforcement. For CPUs, for example, the cgroup cpu controller 
enforcement would use an absolute minimum, max(ABS_MIN, REQUEST_VALUE), to 
ensure there are enough CPU cycles to run the sleep process. In this case the 
ABS_MIN would be 10 CPU shares, so it would not impact the overall effective 
distribution (the default for one CPU is 1024). Similarly, for the memory 
controller the ABS_MIN would be 1 or 2 MB. These ABS_MIN values don't have to 
be configurations; they can be constants in the code, not exposed in any way. 
They are just a minor correction that takes effect when zero is specified.
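
As a rough sketch of that correction (the class, method and constant names 
here are hypothetical, and I am assuming the CPU request is converted to 
shares at 1024 per CPU before the floor is applied):

```java
// Hypothetical sketch of max(ABS_MIN, REQUEST_VALUE); not actual NM code.
final class EnforcementSketch {

    static final int SHARES_PER_CPU = 1024;     // default cpu shares for one CPU
    static final int ABS_MIN_CPU_SHARES = 10;   // floor so the sleep process can run
    static final int ABS_MIN_MEMORY_MB = 1;     // "1 or 2 MB"; 1 is picked here

    /** CPU shares handed to the cgroup cpu controller for a container. */
    static int cpuShares(int requestedCpus) {
        return Math.max(ABS_MIN_CPU_SHARES, requestedCpus * SHARES_PER_CPU);
    }

    /** Memory limit in MB handed to the memory controller for a container. */
    static int memoryLimitMb(int requestedMb) {
        return Math.max(ABS_MIN_MEMORY_MB, requestedMb);
    }

    public static void main(String[] args) {
        System.out.println(cpuShares(0));        // 10: the floor applies only for zero
        System.out.println(cpuShares(2));        // 2048: normal requests are unchanged
        System.out.println(memoryLimitMb(0));    // 1
        System.out.println(memoryLimitMb(512));  // 512
    }
}
```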

Please let me know if any of this requires further explanation.



                
> Add multiplier unit to resourcecapabilities
> -------------------------------------------
>
>                 Key: YARN-689
>                 URL: https://issues.apache.org/jira/browse/YARN-689
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: api, scheduler
>    Affects Versions: 2.0.4-alpha
>            Reporter: Alejandro Abdelnur
>            Assignee: Alejandro Abdelnur
>         Attachments: YARN-689.patch, YARN-689.patch, YARN-689.patch
>
>
> Currently we are overloading the minimum resource value as the actual 
> multiplier used by the scheduler.
> Today with a minimum memory set to 1GB, requests for 1.5GB are always 
> translated to an allocation of 2GB.
> We should decouple the minimum allocation from the multiplier.
> The multiplier should also be exposed to the client via the 
> RegisterApplicationMasterResponse.

