[jira] [Commented] (YARN-1404) Add support for unmanaged containers

Vinod Kumar Vavilapalli (JIRA) Wed, 13 Nov 2013 16:38:30 -0800

    [ 
https://issues.apache.org/jira/browse/YARN-1404?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13822034#comment-13822034
 ]


Vinod Kumar Vavilapalli commented on YARN-1404:
-----------------------------------------------

bq. Vinod Kumar Vavilapalli, a lightweight RM is not sufficient because the 
goal of llama is to be able to run frameworks that use unmanaged containers 
alongside frameworks that don't. While Impala does its own resource 
enforcement, it wants to coexist on a YARN instance with MR and other 
frameworks that fit more naturally with the YARN model.
Well, this has been my problem, and I'm sure others will agree. Proposing 
unmanaged containers before explaining your key requirements keeps folks who 
only follow JIRA in the dark.

bq. Are you saying YARN should never support containers that don't launch a 
process? Is there anything gained by this?
If that need arises, and if there are no other first-class solutions, then yes. 
Otherwise no.

bq. I think you are jumping too fast here
That's because I see multiple JIRAs all trying to achieve a common goal, and 
instead of discussing that design, we are shoe-horned into debating 
individual tickets that don't add up to the overall goal.

bq. IMO that makes complete sense for bugs; for improvements/new features, a 
description of the change communicates more, as it will be the commit message. 
The shortcomings the JIRA is trying to address should be captured in the 
description.
Agree that it is subjective. But for tickets that potentially have a 
solution-space > 1, I'd suggest renaming them. For example, this one can be 
renamed to "support running a service that doesn't want to use YARN containers 
but still co-exists with YARN".

bq. Take for example the following JIRA summaries, would you change them to 
describe a problem?
bq.    AM's tracking URL should be a URL instead of a string
bq.    YARN should have a ClusterId/ServiceId
Yes, I'd change the above two. The other two are apt summaries. The goal should 
be to indicate the problem one is attacking. And my point here is not that you 
or anyone in particular is making that mistake while others are not.

bq. The whole point of Llama is to allow Impala to share resources in a real 
Yarn cluster doing other workloads like Map-Reduce. In other words, 
Impala/Llama and other AMs must share cluster resources.
Well, you should have started with this requirement so that we can all discuss 
and come up with a solution instead of putting in approaches that you think are 
best. This was the same discussion we had in YARN-689, where it took a while 
for the rest of us to understand the real requirements. Similarly, YARN-789 was 
put in FairScheduler without giving consideration to the rest of the system.

bq. The AM that started the unmanaged container gets the 
early-preemption/preemption/lost notification from the RM and notifies the out 
of band process in the corresponding node to release the corresponding 
resources. (Impala/Llama is doing this today with the dummy sleep containers)
That won't work for cases where the RM wants to forcefully terminate 
containers in emergency situations.

bq. An NM plugin notifies the collocated out of band process that the unmanaged 
container has ended. This prompts the out of band process to release the 
corresponding resources. (We are working on getting this in Impala/Llama).
This again is a new proposal that has never been discussed.

Re this problem, I think you should create a ticket about supporting services 
that want to use cluster and node level scheduling without using containers. 
Then if you follow up with a requirement list, we can discuss solutions and an 
end-to-end design. I can already come up with more solutions, which may or may 
not work depending on your requirements.
 - Use the dynamic NM resource feature that just went in, with signalling 
between the YARN NM and some outside component to dynamically adjust NM 
resources
 - Run a long-running service under YARN with containers that dynamically grow 
and shrink
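
To make the first option a little more concrete, here is a toy model of the 
idea, not YARN code. The class ToyNodeManager and its methods are hypothetical 
names made up for illustration, and memory is the only resource modelled. It 
assumes the outside service signals the NM to grow or shrink its reserved 
share, and the NM advertises only the remainder for regular containers.

```python
from dataclasses import dataclass, field


@dataclass
class ToyNodeManager:
    """Hypothetical stand-in for a NodeManager; not a YARN class."""
    total_mb: int                                    # physical memory on the node
    reserved_mb: int = 0                             # share claimed by the outside service
    containers: dict = field(default_factory=dict)   # container id -> MB

    @property
    def schedulable_mb(self) -> int:
        # Capacity the node advertises for regular YARN containers.
        return self.total_mb - self.reserved_mb

    @property
    def free_mb(self) -> int:
        # Advertised capacity not yet used by running containers.
        return self.schedulable_mb - sum(self.containers.values())

    def set_reservation(self, mb: int) -> bool:
        """Signal from the outside service: grow or shrink its share.

        The change is refused if running containers would no longer fit.
        """
        if mb < 0 or mb > self.total_mb:
            return False
        if self.total_mb - mb < sum(self.containers.values()):
            return False
        self.reserved_mb = mb
        return True

    def launch(self, container_id: str, mb: int) -> bool:
        """Start a regular container if it fits in the advertised capacity."""
        if mb > self.free_mb:
            return False
        self.containers[container_id] = mb
        return True


nm = ToyNodeManager(total_mb=8192)
nm.set_reservation(4096)      # outside service claims half the node
nm.launch("mr_1", 3072)       # fits in the remaining 4096 MB
nm.launch("mr_2", 2048)       # refused: only 1024 MB left to schedule
nm.set_reservation(2048)      # outside service shrinks its share
nm.launch("mr_2", 2048)       # now fits
```

In this sketch a reservation increase is simply refused when running 
containers are in the way; whether the real system should refuse, wait, or 
preempt is exactly the kind of requirement that would need to be discussed.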

> Add support for unmanaged containers
> ------------------------------------
>
>                 Key: YARN-1404
>                 URL: https://issues.apache.org/jira/browse/YARN-1404
>             Project: Hadoop YARN
>          Issue Type: New Feature
>          Components: nodemanager
>    Affects Versions: 2.2.0
>            Reporter: Alejandro Abdelnur
>            Assignee: Alejandro Abdelnur
>         Attachments: YARN-1404.patch
>
>
> Currently a container allocation requires starting a container process on 
> the corresponding NodeManager's node.
> For applications that need to use the allocated resources out of band from 
> Yarn this means that a dummy container process must be started.
> Impala/Llama is an example of such an application; it currently starts a 
> 'sleep 10y' (10 years) process as the container process. The resource 
> capabilities are used out of band by the Impala process collocated on the 
> node. The Impala process ensures that the processing associated with those 
> resources does not exceed the capabilities of the container. Also, if the 
> container is lost/preempted/killed, Impala stops using the corresponding 
> resources.
> In addition, in the case of Llama, the current requirement of having a 
> container process gets complicated when hard resource enforcement (memory 
> -ContainersMonitor- or cpu -via cgroups-) is enabled because Impala/Llama 
> requests resources with CPU and memory independently of each other. Some 
> requests are CPU only and others are memory only. Unmanaged containers solve 
> this problem as there is no underlying process with zero CPU or zero memory.



--
This message was sent by Atlassian JIRA
(v6.1#6144)
