MENG DING commented on YARN-1197:

We have a real use case for better supporting long running services on YARN, and 
for sharing resources between long running services and batch jobs. We have 
carefully reviewed the discussions and documentation in this thread (and other 
topics related to this thread), and are committed to bringing this work to 
completion. We agree with the general design of this feature, and understand 
that it is the result of extensive discussion among many experts. 

We will attempt to post an updated design shortly for review.

We don't really see a bottleneck at the scheduler side at this moment. However, 
we do see problems with memory enforcement for long running services.
- For JVM based containers (e.g., a container running HBase), it is not possible 
right now to change the heap size of the JVM without restarting the Java process. 
Even if we implement a wrapper in the container to relaunch the Java process 
when the resources of the container change, we still need to implement an 
interface between the node manager and the container to trigger the relaunch.
- We thought about launching the JVM based container with -Xmx set to the 
physical memory of the node, and using cgroup memory control to enforce the 
resource limit, but we don't think LCE (LinuxContainerExecutor) supports memory 
isolation right now (?). We cannot use YARN's default memory enforcement because 
it kills containers that exceed their limit, and we don't want long running 
services to be killed.
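
To make the two options above concrete, here is a minimal sketch in Java. It 
is not part of any YARN patch: the class name, the method names and the heap 
fraction are all hypothetical, and the cgroup directory is passed in by the 
caller. The only real name assumed is memory.limit_in_bytes, the hard-limit 
file of the cgroup v1 memory controller.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical sketch, not YARN code: illustrates the two options discussed above.
public class ContainerMemorySketch {

    // Hard-limit file of the cgroup v1 memory controller.
    static final String LIMIT_FILE = "memory.limit_in_bytes";

    // Option 1: a wrapper relaunches the JVM with a heap derived from the
    // new container size. heapFraction is an assumed tuning knob that leaves
    // room for non-heap memory.
    static String heapFlag(long containerMb, double heapFraction) {
        if (containerMb <= 0 || heapFraction <= 0.0 || heapFraction > 1.0) {
            throw new IllegalArgumentException("invalid container size or heap fraction");
        }
        long heapMb = Math.max(1L, (long) (containerMb * heapFraction));
        return "-Xmx" + heapMb + "m";
    }

    // Option 2: keep the JVM running and change the limit of the cgroup
    // directory that holds the container's processes.
    static void writeMemoryLimit(Path cgroupDir, long limitBytes) throws IOException {
        if (limitBytes <= 0) {
            throw new IllegalArgumentException("limit must be positive");
        }
        byte[] value = Long.toString(limitBytes).getBytes(StandardCharsets.UTF_8);
        Files.write(cgroupDir.resolve(LIMIT_FILE), value);
    }

    public static void main(String[] args) throws IOException {
        // A temporary directory stands in for a real cgroup directory here.
        Path fakeCgroup = Files.createTempDirectory("container-cgroup");
        writeMemoryLimit(fakeCgroup, 4096L * 1024 * 1024);
        System.out.println(heapFlag(4096, 0.75));
        System.out.println(new String(
            Files.readAllBytes(fakeCgroup.resolve(LIMIT_FILE)), StandardCharsets.UTF_8));
    }
}
```

Neither option avoids the problem by itself: option 1 still restarts the 
service, and with option 2 we would expect the kernel to reclaim memory or 
invoke the OOM killer when the limit is lowered below current usage, so 
shrinking a running container remains the hard case.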

So overall there doesn't seem to be an easy solution for memory enforcement 
without killing the long running services right now. Any comments or 
suggestions will be greatly appreciated.


> Support changing resources of an allocated container
> ----------------------------------------------------
>                 Key: YARN-1197
>                 URL: https://issues.apache.org/jira/browse/YARN-1197
>             Project: Hadoop YARN
>          Issue Type: Task
>          Components: api, nodemanager, resourcemanager
>    Affects Versions: 2.1.0-beta
>            Reporter: Wangda Tan
>         Attachments: mapreduce-project.patch.ver.1, 
> tools-project.patch.ver.1, yarn-1197-scheduler-v1.pdf, yarn-1197-v2.pdf, 
> yarn-1197-v3.pdf, yarn-1197-v4.pdf, yarn-1197-v5.pdf, yarn-1197.pdf, 
> yarn-api-protocol.patch.ver.1, yarn-pb-impl.patch.ver.1, 
> yarn-server-common.patch.ver.1, yarn-server-nodemanager.patch.ver.1, 
> yarn-server-resourcemanager.patch.ver.1
> The current YARN resource management logic assumes that the resources 
> allocated to a container are fixed for its lifetime. When users want to 
> change the resources of an allocated container, the only way is to release 
> it and allocate a new container of the expected size.
> Allowing the resources of an allocated container to be changed at run time 
> will give us better control of resource usage on the application side.

This message was sent by Atlassian JIRA
