[ https://issues.apache.org/jira/browse/YARN-1039?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14284304#comment-14284304 ]
Craig Welch commented on YARN-1039:
-----------------------------------

As I understand it (and I may be wrong on this...), the original intent of this jira was to provide a "boolean switch" to control a set of behaviors expected to be important for a long running service: among other things, what sort of nodes to schedule on and how to handle logs. This could be on a sliding scale based on duration, but I'm not sure that works so well. At what duration do we start to change how we handle logs and/or where we schedule things? While related, I think that converting this from a boolean to a range will make it more difficult to use for the intended use case. I also think that packing all of these behaviors into one parameter might be a negative overall.

I do think, to [~john.jian.fang]'s point, that using this to determine where to schedule tasks, to avoid spot instances and the like, has by now been superseded by Node Labels, and I do not think we should add additional functionality for that here. Node Labels is really the way to handle that part of the use case.

That leaves, potentially among other things, affinity/anti-affinity issues (scheduling long running tasks together or apart) and log handling (how do we tell the system we want log handling for a long running service, if, in fact, the system needs to be told that). I submit that it would be better to have separate solutions to each of these needs which can be bundled together to achieve the overall use case, as I think that will provide better control without adding too much complexity for the end user. That means we would break this out into affinity/anti-affinity and logging configuration.
We could always have a single parameter (like this one) which sets the others for convenience. I'm not sure we'll actually need it, but I do think that splitting out the bundled functionality into individual items (some of which may already be being worked on elsewhere) is the way to go.

> Add parameter for YARN resource requests to indicate "long lived"
> -----------------------------------------------------------------
>
>                 Key: YARN-1039
>                 URL: https://issues.apache.org/jira/browse/YARN-1039
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: resourcemanager
>    Affects Versions: 3.0.0, 2.1.1-beta
>            Reporter: Steve Loughran
>            Assignee: Craig Welch
>         Attachments: YARN-1039.1.patch, YARN-1039.2.patch, YARN-1039.3.patch
>
>
> A container request could support a new parameter "long-lived". This could be
> used by a scheduler that would know not to host the service on a transient
> (cloud: spot priced) node.
> Schedulers could also decide whether or not to allocate multiple long-lived
> containers on the same node

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)