[
https://issues.apache.org/jira/browse/YARN-1404?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13844352#comment-13844352
]
Arun C Murthy commented on YARN-1404:
-------------------------------------
I've opened YARN-1488 to track delegation of container resources.
> Enable external systems/frameworks to share resources with Hadoop leveraging
> Yarn resource scheduling
> -----------------------------------------------------------------------------------------------------
>
> Key: YARN-1404
> URL: https://issues.apache.org/jira/browse/YARN-1404
> Project: Hadoop YARN
> Issue Type: New Feature
> Components: nodemanager
> Affects Versions: 2.2.0
> Reporter: Alejandro Abdelnur
> Assignee: Alejandro Abdelnur
> Attachments: YARN-1404.patch
>
>
> Currently Hadoop Yarn expects to manage the lifecycle of the processes in
> which its applications run their workload. External frameworks/systems could
> benefit from sharing resources with other Yarn applications while running
> their workload within long-running processes owned by the external framework
> (in other words, running their workload outside of the context of a Yarn
> container process).
> Because Yarn provides robust and scalable resource management, it is
> desirable for some external systems to leverage the resource governance
> capabilities of Yarn (queues, capacities, scheduling, access control) while
> supplying their own resource enforcement.
> Impala is an example of such a system. Impala uses Llama
> (http://cloudera.github.io/llama/) to request resources from Yarn.
> Impala runs an impalad process on every node of the cluster. When a user
> submits a query, the processing is broken into 'query fragments' which are
> run in multiple impalad processes, leveraging data locality (similar to
> MapReduce mappers processing a collocated HDFS block of input data).
> The execution of a 'query fragment' requires an amount of CPU and memory in
> the impalad, and the impalad shares its host with other services (HDFS
> DataNode, Yarn NodeManager, HBase RegionServer) and Yarn applications
> (MapReduce tasks).
> To ensure that cluster utilization follows the Yarn scheduler policies and
> does not overload the cluster nodes, before running a 'query fragment' on a
> node Impala requests the required amount of CPU and memory from Yarn. Once
> the requested CPU and memory have been allocated, Impala starts running the
> 'query fragment', taking care that it does not use more resources than have
> been allocated. Memory is bookkept per 'query fragment', and the threads used
> to process the 'query fragment' are placed under a cgroup to contain CPU
> utilization.
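> For illustration, a per-fragment request expressed directly against the Yarn
> AMRMClient API would look roughly like the following minimal sketch (Impala
> actually goes through Llama; the class name, host name, sizes and priority
> here are illustrative assumptions):
> {code:java}
> import org.apache.hadoop.conf.Configuration;
> import org.apache.hadoop.yarn.api.records.Container;
> import org.apache.hadoop.yarn.api.records.FinalApplicationStatus;
> import org.apache.hadoop.yarn.api.records.Priority;
> import org.apache.hadoop.yarn.api.records.Resource;
> import org.apache.hadoop.yarn.client.api.AMRMClient;
> import org.apache.hadoop.yarn.client.api.AMRMClient.ContainerRequest;
>
> public class FragmentResourceRequest {
>   public static void main(String[] args) throws Exception {
>     AMRMClient<ContainerRequest> rmClient = AMRMClient.createAMRMClient();
>     rmClient.init(new Configuration());
>     rmClient.start();
>     rmClient.registerApplicationMaster("", 0, "");
>     // Ask for 1 GB and 2 vcores on the node holding the fragment's data
>     // (hypothetical host name), mirroring a locality-aware request.
>     rmClient.addContainerRequest(new ContainerRequest(
>         Resource.newInstance(1024, 2),
>         new String[] {"impala-node-17.example.com"},
>         null,
>         Priority.newInstance(1)));
>     // In practice the AM keeps heartbeating allocate() until the grant arrives.
>     for (Container c : rmClient.allocate(0.0f).getAllocatedContainers()) {
>       System.out.println("Granted " + c.getResource() + " on " + c.getNodeId());
>     }
>     rmClient.unregisterApplicationMaster(FinalApplicationStatus.SUCCEEDED, "", "");
>     rmClient.stop();
>   }
> }
> {code}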
> Today, for all resources that have been requested from the Yarn RM, a
> (container) process must be started via the corresponding NodeManager.
> Failing to do so results in the cancellation of the container allocation,
> relinquishing the acquired resource capacity back to the pool of available
> resources. To avoid this, Impala starts a dummy container process doing
> 'sleep 10y'.
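> For illustration, the workaround amounts to launching the granted container
> with a command that just sleeps. A minimal sketch using the Yarn NMClient
> API (the class and method names other than the Yarn API itself are
> hypothetical):
> {code:java}
> import java.util.Collections;
> import org.apache.hadoop.conf.Configuration;
> import org.apache.hadoop.yarn.api.records.Container;
> import org.apache.hadoop.yarn.api.records.ContainerLaunchContext;
> import org.apache.hadoop.yarn.api.records.LocalResource;
> import org.apache.hadoop.yarn.client.api.NMClient;
>
> public class DummyContainerLauncher {
>   // Keeps a granted container alive by running a placeholder process in it.
>   public static void launchSleeper(Container container) throws Exception {
>     NMClient nmClient = NMClient.createNMClient();
>     nmClient.init(new Configuration());
>     nmClient.start();
>     // No local resources, environment, service data, tokens or ACLs; the
>     // command's only purpose is to keep the container's process alive.
>     ContainerLaunchContext ctx = ContainerLaunchContext.newInstance(
>         Collections.<String, LocalResource>emptyMap(),
>         Collections.<String, String>emptyMap(),
>         Collections.singletonList("sleep 315360000"), // ~10 years, as in 'sleep 10y'
>         null, null, null);
>     nmClient.startContainer(container, ctx);
>   }
> }
> {code}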
> Using a dummy container process has its drawbacks:
> * the dummy container process sits in a cgroup with a given number of CPU
> shares that are never used, and Impala re-issues those CPU shares to another
> cgroup for the threads running the 'query fragment' (see the sketch after
> this list). The cgroup CPU enforcement happens to work correctly because of
> the CPU controller implementation, but the formally specified behavior is
> actually undefined.
> * Impala may ask for CPU and memory independently of each other; some
> requests may be memory only with no CPU, or vice versa. Because a container
> requires a process, the complete absence of memory or CPU is not possible:
> even if the dummy process is just 'sleep', a minimal amount of memory and CPU
> is required for the dummy process.
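> The CPU-shares juggling in the first drawback amounts to cgroup filesystem
> writes along the lines of this minimal sketch (assuming a cgroups v1 CPU
> controller mounted at /sys/fs/cgroup/cpu, sufficient permissions, and
> hypothetical paths and values; Impala pins individual threads, while this
> moves a whole process for simplicity):
> {code:java}
> import java.lang.management.ManagementFactory;
> import java.nio.charset.StandardCharsets;
> import java.nio.file.Files;
> import java.nio.file.Path;
> import java.nio.file.Paths;
>
> public class FragmentCgroup {
>   // Creates a cgroup for one query fragment, grants it the CPU shares
>   // obtained from Yarn, and moves the current process into it.
>   public static void confine(String fragmentId, int cpuShares) throws Exception {
>     Path cg = Paths.get("/sys/fs/cgroup/cpu/impala", fragmentId);
>     Files.createDirectories(cg);
>     // cpu.shares is a relative weight; the value would come from the Yarn grant.
>     Files.write(cg.resolve("cpu.shares"),
>         Integer.toString(cpuShares).getBytes(StandardCharsets.UTF_8));
>     // Writing a PID to cgroup.procs moves that process (all its threads)
>     // into the cgroup.
>     String pid = ManagementFactory.getRuntimeMXBean().getName().split("@")[0];
>     Files.write(cg.resolve("cgroup.procs"), pid.getBytes(StandardCharsets.UTF_8));
>   }
> }
> {code}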
> Because of these drawbacks, it is desirable to be able to have a container
> without a backing process.