[ https://issues.apache.org/jira/browse/YARN-1404?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13821657#comment-13821657 ]
Alejandro Abdelnur commented on YARN-1404:
------------------------------------------
[[email protected]]
bq. 1. I'd be inclined to treat this as a special case of YARN-1040
I've just commented on YARN-1040, following Bikas' comment on this:
https://issues.apache.org/jira/browse/YARN-1040?focusedCommentId=13821597&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13821597
bq. It's dangerously easy to leak containers here; I know llama keeps an eye on
things, but I worry about other people's code -though admittedly, any
long-lived command line app "yes" could do the same.
We can have NM configs to disable no-process or multi-process containers, but
you could still work around that by having a dummy process. This is how Llama
does things today, and it is not ideal for several reasons.
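For concreteness, a minimal sketch of what that dummy-process workaround looks
like from an AM, against the Hadoop 2.x NMClient API (the class and method
names here are illustrative, error handling omitted):
{code:java}
import java.util.Collections;

import org.apache.hadoop.yarn.api.records.Container;
import org.apache.hadoop.yarn.api.records.ContainerLaunchContext;
import org.apache.hadoop.yarn.client.api.NMClient;
import org.apache.hadoop.yarn.util.Records;

public class DummyContainerLauncher {

  // Keep an allocation alive by launching a no-op process; the resources
  // are actually consumed out of band by a collocated process (Impala).
  public static void launchDummy(NMClient nmClient, Container container)
      throws Exception {
    ContainerLaunchContext ctx =
        Records.newRecord(ContainerLaunchContext.class);
    // ~10 years expressed in seconds ('sleep 10y' only works where sleep
    // accepts a year suffix)
    ctx.setCommands(Collections.singletonList("sleep 315360000"));
    nmClient.startContainer(container, ctx);
  }
}
{code}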
IMO, from a Yarn perspective, we need to allow AMs to do sophisticated things
within the Yarn programming model (like what you are trying to do with
long-lived containers, or what I'm doing with Llama).
bq. For the multi-process (and that includes processes=0), we really do need
some kind of lease renewal option to stop containers being retained forever. It
would become the job of the AM to do the renewal
As I've mentioned above, I don't think we need a special lease for this:
https://issues.apache.org/jira/browse/YARN-1404?focusedCommentId=13820200&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13820200
(look for 'The reason I've taken the approach of leaving the container leases
out of band is:')
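Concretely: the AM's periodic allocate() heartbeat to the RM already behaves
like a lease. If the AM dies and the heartbeats stop, the RM expires the AM
and reclaims its containers, so no per-container renewal API is needed. A
minimal sketch with the Hadoop 2.x async client (names illustrative):
{code:java}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.yarn.client.api.AMRMClient.ContainerRequest;
import org.apache.hadoop.yarn.client.api.async.AMRMClientAsync;

public class HeartbeatLease {

  // The async client calls allocate() every intervalMs as a heartbeat; if
  // the heartbeats stop, the RM expires the AM and reclaims its containers:
  // an implicit, out-of-band lease.
  public static AMRMClientAsync<ContainerRequest> startRmClient(
      Configuration conf, AMRMClientAsync.CallbackHandler handler) {
    AMRMClientAsync<ContainerRequest> rmClient =
        AMRMClientAsync.createAMRMClientAsync(1000 /* heartbeat ms */, handler);
    rmClient.init(conf);
    rmClient.start();
    return rmClient;
  }
}
{code}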
[~vinodkv]
bq. -1 for this...
I think you are jumping too fast here.
bq. As I repeated on other JIRAs, please change the title with the problem
statement instead of solutions.
IMO that makes complete sense for bugs; for improvements/new features, a
description of the change communicates more, as it will become the commit
message. The shortcomings the JIRA is trying to address should be captured in
the description.
Take, for example, the following JIRA summaries; would you change them to
describe a problem?
* AHS should support application-acls and queue-acls
* AM's tracking URL should be a URL instead of a string
* Add support for zipping/unzipping logs while in transit for the NM logs
web-service
* YARN should have a ClusterId/ServiceId
bq. I indicated offline about llama with others. I don't think you need
NodeManagers either to do what you want, forget about containers. All you need
is use the ResourceManager/scheduler in isolation using MockRM/LightWeightRM
(YARN-1385) - your need seems to be using the scheduling logic in YARN and not
use the physical resources.
The whole point of Llama is to allow Impala to share resources in a real Yarn
cluster running other workloads like Map-Reduce. In other words, Impala/Llama and
other AMs must share cluster resources.
> Add support for unmanaged containers
> ------------------------------------
>
> Key: YARN-1404
> URL: https://issues.apache.org/jira/browse/YARN-1404
> Project: Hadoop YARN
> Issue Type: New Feature
> Components: nodemanager
> Affects Versions: 2.2.0
> Reporter: Alejandro Abdelnur
> Assignee: Alejandro Abdelnur
> Attachments: YARN-1404.patch
>
>
> Currently a container allocation requires starting a container process on
> the corresponding NodeManager's node.
> For applications that need to use the allocated resources out of band from
> Yarn, this means that a dummy container process must be started.
> Impala/Llama is an example of such an application; it currently starts a
> 'sleep 10y' (10 years) process as the container process, and the resource
> capabilities are used out of band by the Impala process collocated on the
> node. The Impala process ensures that the processing associated with those
> resources does not exceed the capabilities of the container. Also, if the
> container is lost/preempted/killed, Impala stops using the corresponding
> resources.
> In addition, in the case of Llama, the current requirement of having a
> container process gets complicated when hard resource enforcement (memory
> -ContainersMonitor- or cpu -via cgroups-) is enabled, because Impala/Llama
> requests resources with CPU and memory independent of each other. Some
> requests are CPU only and others are memory only. Unmanaged containers solve
> this problem, as there would be no underlying process forced to run with
> zero CPU or zero memory.
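A minimal sketch of the independent CPU-only/memory-only asks described above,
against the Hadoop 2.x AMRMClient records (values illustrative; note that a
scheduler may normalize zero-valued asks up to its configured minimum
allocation):
{code:java}
import org.apache.hadoop.yarn.api.records.Priority;
import org.apache.hadoop.yarn.api.records.Resource;
import org.apache.hadoop.yarn.client.api.AMRMClient.ContainerRequest;

public class SplitAsks {

  // CPU-only ask: zero memory, nonzero vcores. No real container process
  // can sensibly be backed by zero memory, which is the enforcement
  // problem unmanaged containers avoid.
  public static ContainerRequest cpuOnly(int vcores) {
    return new ContainerRequest(
        Resource.newInstance(0 /* memory MB */, vcores),
        null /* nodes */, null /* racks */, Priority.newInstance(1));
  }

  // Memory-only ask: nonzero memory, zero vcores.
  public static ContainerRequest memoryOnly(int memoryMb) {
    return new ContainerRequest(
        Resource.newInstance(memoryMb, 0 /* vcores */),
        null /* nodes */, null /* racks */, Priority.newInstance(1));
  }
}
{code}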
--
This message was sent by Atlassian JIRA
(v6.1#6144)