[ 
https://issues.apache.org/jira/browse/YARN-1404?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13844774#comment-13844774
 ] 

Vinod Kumar Vavilapalli commented on YARN-1404:
-----------------------------------------------

bq. In this scenario, I think explicitly allowing for delegation of a container 
would solve the problem in a first-class manner.
This is an interesting solution that avoids the problems about trust, 
liveliness reporting and resource limitations' enforcement. +1 for considering 
something like this.

> Enable external systems/frameworks to share resources with Hadoop leveraging 
> Yarn resource scheduling
> -----------------------------------------------------------------------------------------------------
>
>                 Key: YARN-1404
>                 URL: https://issues.apache.org/jira/browse/YARN-1404
>             Project: Hadoop YARN
>          Issue Type: New Feature
>          Components: nodemanager
>    Affects Versions: 2.2.0
>            Reporter: Alejandro Abdelnur
>            Assignee: Alejandro Abdelnur
>         Attachments: YARN-1404.patch
>
>
> Currently Hadoop Yarn expects to manage the lifecycle of the processes its 
> applications run workload in. External frameworks/systems could benefit from 
> sharing resources with other Yarn applications while running their workload 
> within long-running processes owned by the external framework (in other 
> words, running their workload outside of the context of a Yarn container 
> process). 
> Because Yarn provides robust and scalable resource management, it is 
> desirable for some external systems to leverage the resource governance 
> capabilities of Yarn (queues, capacities, scheduling, access control) while 
> supplying their own resource enforcement.
> Impala is an example of such system. Impala uses Llama 
> (http://cloudera.github.io/llama/) to request resources from Yarn.
> Impala runs an impalad process in every node of the cluster, when a user 
> submits a query, the processing is broken into 'query fragments' which are 
> run in multiple impalad processes leveraging data locality (similar to 
> Map-Reduce Mappers processing a collocated HDFS block of input data).
> The execution of a 'query fragment' requires an amount of CPU and memory in 
> the impalad. As the impalad shares the host with other services (HDFS 
> DataNode, Yarn NodeManager, Hbase Region Server) and Yarn Applications 
> (MapReduce tasks).
> To ensure cluster utilization that follow the Yarn scheduler policies and it 
> does not overload the cluster nodes, before running a 'query fragment' in a 
> node, Impala requests the required amount of CPU and memory from Yarn. Once 
> the requested CPU and memory has been allocated, Impala starts running the 
> 'query fragment' taking care that the 'query fragment' does not use more 
> resources than the ones that have been allocated. Memory is book kept per 
> 'query fragment' and the threads used for the processing of the 'query 
> fragment' are placed under a cgroup to contain CPU utilization.
> Today, for all resources that have been asked to Yarn RM, a (container) 
> process must be started via the corresponding NodeManager. Failing to do 
> this, will result on the cancelation of the container allocation 
> relinquishing the acquired resource capacity back to the pool of available 
> resources. To avoid this, Impala starts a dummy container process doing 
> 'sleep 10y'.
> Using a dummy container process has its drawbacks:
> * the dummy container process is in a cgroup with a given number of CPU 
> shares that are not used and Impala is re-issuing those CPU shares to another 
> cgroup for the thread running the 'query fragment'. The cgroup CPU 
> enforcement works correctly because of the CPU controller implementation (but 
> the formal specified behavior is actually undefined).
> * Impala may ask for CPU and memory independent of each other. Some requests 
> may be only memory with no CPU or viceversa. Because a container requires a 
> process, complete absence of memory or CPU is not possible even if the dummy 
> process is 'sleep', a minimal amount of memory and CPU is required for the 
> dummy process.
> Because of this it is desirable to be able to have a container without a 
> backing process.



--
This message was sent by Atlassian JIRA
(v6.1.4#6159)

Reply via email to