[ https://issues.apache.org/jira/browse/YARN-1039?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14541129#comment-14541129 ]

Vinod Kumar Vavilapalli commented on YARN-1039:
-----------------------------------------------

*sigh* This JIRA was all over the place. 

Can we please agree not to discuss here *how* the scheduling features, UI, 
log-aggregation, and security-tokens related to long-running services should 
be implemented? There are separate JIRAs with good progress on each of them.

Let's also please not discuss how the platform _could_ make use of the 
long-lived nature of an application/container. I understand that the type of 
usage will dictate what the input looks like, but hold on to that for a 
second.

h3. Blocker
I've already started seeing real-life situations where we need the RM to know 
about the long-livedness of a container and an application. The prominent ones 
are (a) reservations, (b) white-listed requests, or (c) node-label requests 
getting stuck on a node used by other services' containers that don't exit.
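
The stuck-reservation case can be sketched as follows. This is an illustration 
under assumptions, not RM code: `Container`, `Node`, and 
`reservation_can_ever_fit` are hypothetical names, memory is the only resource 
considered, and a container marked long-lived is assumed never to exit.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Container:
    memory_mb: int
    long_lived: bool = False  # hypothetical flag; not an existing YARN field


@dataclass
class Node:
    total_memory_mb: int
    containers: List[Container] = field(default_factory=list)


def reservation_can_ever_fit(node: Node, requested_mb: int) -> bool:
    """Return True if the request fits once all short-lived containers exit."""
    # Memory held by long-lived containers is assumed never to be released.
    pinned_mb = sum(c.memory_mb for c in node.containers if c.long_lived)
    return node.total_memory_mb - pinned_mb >= requested_mb
```

Without such a flag, every container looks like it may exit soon, so a 
scheduler has no basis for giving up on a reservation made on a node like this.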

Absence of this notion is increasingly becoming a *blocker* for running 
services. I'd like to get some progress here.

h3. Short Proposal

There seems to be general agreement on having the notion itself. Here are the 
proposals and dimensions:
 # The notion at app level, or at per-container level
 # A boolean flag, an enum, or a duration

I propose that we solve the blocker use-case that I pointed out above with a 
boolean at both app-level and container-level. Tomorrow, when somebody 
implements a duration-based bin-packing scheduling policy, they can add in the 
notion of a duration and then reconcile the boolean with an infinite value for 
the duration. To me, the enum proposal is a dup of YARN-3409, which covers a 
much larger problem space.
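
To illustrate that reconciliation, here is a minimal sketch, not YARN code. The 
`ResourceRequestSketch` class, its `long_lived` field, and the later 
`expected_duration_secs` field are all hypothetical names. It assumes that a 
duration, once added, treats an infinite value as equivalent to the boolean 
being set, and that an explicitly supplied duration takes precedence over the 
flag.

```python
import math
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ResourceRequestSketch:
    """Hypothetical request; none of these names exist in YARN."""

    long_lived: bool = False
    # Hypothetical later addition; None means the client gave no duration.
    expected_duration_secs: Optional[float] = None

    def effective_duration_secs(self) -> Optional[float]:
        """Map the boolean onto the duration: long-lived means infinite."""
        if self.expected_duration_secs is not None:
            # Assumption: an explicit duration wins over the boolean flag.
            return self.expected_duration_secs
        return math.inf if self.long_lived else None

    def is_long_lived(self) -> bool:
        """True when the effective duration is infinite."""
        return self.effective_duration_secs() == math.inf
```

Clients that only know about the boolean keep working unchanged, while a 
duration-aware scheduler can read a single value.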

Thoughts?

> Add parameter for YARN resource requests to indicate "long lived"
> -----------------------------------------------------------------
>
>                 Key: YARN-1039
>                 URL: https://issues.apache.org/jira/browse/YARN-1039
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: resourcemanager
>    Affects Versions: 3.0.0, 2.1.1-beta
>            Reporter: Steve Loughran
>            Assignee: Craig Welch
>         Attachments: YARN-1039.1.patch, YARN-1039.2.patch, YARN-1039.3.patch
>
>
> A container request could support a new parameter "long-lived". This could be 
> used by a scheduler that would know not to host the service on a transient 
> (cloud: spot priced) node.
> Schedulers could also decide whether or not to allocate multiple long-lived 
> containers on the same node



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
