[ https://issues.apache.org/jira/browse/YARN-1039?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14541129#comment-14541129 ]
Vinod Kumar Vavilapalli commented on YARN-1039:
-----------------------------------------------

*sigh* This JIRA was all over the place.

Can we please agree not to discuss here *how* the scheduling features, UI, log-aggregation, and security-tokens related to long-running services should be implemented? There are separate JIRAs with good progress on each of them.

Let's also please not discuss how the platform _could_ make use of the notion of the long-lived nature of an application/container. I understand that the type of usage shall dictate what the input will look like, but hold on to that for a second.

h3. Blocker

I've already started seeing real-life situations where we need the RM to know about the long-lived'ness of a container and an application. The prominent ones are (a) reservations, (b) white-listed requests, or (c) node-label requests getting stuck on a node used by other services' containers that don't exit. The absence of this notion is increasingly becoming a *blocker* for running services. I'd like to get some progress here.

h3. Short Proposal

There seems to be general agreement on having the notion itself. Here are the proposals and dimensions:
# The notion at app level, at per-container level
# A boolean flag, an enum, a duration

I propose that we solve the blocker use-case that I pointed out above with a boolean at both app-level and container-level. Tomorrow, when somebody implements a duration-based bin-packing scheduling policy, they can add in the notion of a duration and then reconcile the boolean with infinity values on the duration.

The enum proposal is to me a dup of YARN-3409, which covers a much larger problem space.

Thoughts?
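To make the short proposal concrete, here is a minimal Java sketch of how a boolean at app level and container level might be resolved, later reconciled with a duration, and used to spot a reservation that can never be satisfied. Every class and method name below is hypothetical and not part of any YARN API. The rule that a container-level value overrides the app-level one, and the use of the largest representable `Duration` as "infinity", are assumptions made only for illustration.

```java
import java.time.Duration;
import java.util.List;
import java.util.Optional;

// Hypothetical stand-in for a container already running on a node.
final class RunningContainer {
    final int memoryMb;
    final boolean longLived;

    RunningContainer(int memoryMb, boolean longLived) {
        this.memoryMb = memoryMb;
        this.longLived = longLived;
    }
}

// Hypothetical helper; none of these methods exist in YARN.
final class LongLivedSketch {
    // Assumed encoding of "infinity": the largest Duration Java can hold.
    static final Duration UNBOUNDED = Duration.ofSeconds(Long.MAX_VALUE);

    // Assumption: a container-level flag, when present, overrides the
    // app-level flag; otherwise the app-level value applies.
    static boolean isLongLived(boolean appLongLived,
                               Optional<Boolean> containerLongLived) {
        return containerLongLived.orElse(appLongLived);
    }

    // Reconcile the boolean with a later duration-based notion: long-lived
    // maps to an unbounded duration, otherwise any supplied hint is used.
    static Optional<Duration> expectedDuration(boolean longLived,
                                               Optional<Duration> hint) {
        return longLived ? Optional.of(UNBOUNDED) : hint;
    }

    // A reservation can only ever fit if free memory plus the memory held
    // by containers that are expected to exit covers the request.
    static boolean reservationCanEverFit(int requestedMb, int freeMb,
                                         List<RunningContainer> running) {
        long reclaimableMb = freeMb;
        for (RunningContainer c : running) {
            if (!c.longLived) {
                reclaimableMb += c.memoryMb;
            }
        }
        return reclaimableMb >= requestedMb;
    }
}
```

With a check along these lines, a scheduler could decline to place a reservation on a node whose capacity is held by containers that will not exit, rather than letting the request sit there indefinitely.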
> Add parameter for YARN resource requests to indicate "long lived"
> -----------------------------------------------------------------
>
>                 Key: YARN-1039
>                 URL: https://issues.apache.org/jira/browse/YARN-1039
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: resourcemanager
>    Affects Versions: 3.0.0, 2.1.1-beta
>            Reporter: Steve Loughran
>            Assignee: Craig Welch
>         Attachments: YARN-1039.1.patch, YARN-1039.2.patch, YARN-1039.3.patch
>
>
> A container request could support a new parameter "long-lived". This could be
> used by a scheduler that would know not to host the service on a transient
> (cloud: spot priced) node.
> Schedulers could also decide whether or not to allocate multiple long-lived
> containers on the same node

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)