[
https://issues.apache.org/jira/browse/YARN-1039?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14284304#comment-14284304
]
Craig Welch commented on YARN-1039:
-----------------------------------
As I understand it (and I may be wrong on this), the original intent of this
jira was to provide a "boolean switch" to control a set of behaviors expected
to be important for a long running service: among other things, what sort of
nodes to schedule on and how to handle logs. This could be on a sliding scale
based on duration, but I'm not sure that works so well. At what duration do we
start to change how we handle logs and/or where we schedule things? While
related, I think that converting this from a boolean to a range will make it
more difficult to use for the intended use case. I also think that packing
all of these behaviors into one parameter might be a negative overall.

I do think, to [~john.jian.fang]'s point, that using this to determine where
to schedule tasks to avoid spot instances and the like has by now been
superseded by Node Labels, and I do not think we should add additional
functionality for that here. Node Labels is really the way to handle that part
of the use case.

That leaves, potentially among other things, affinity/anti-affinity issues
(scheduling long running tasks together or keeping them apart) and log
handling (how do we tell the system we want log handling for a long running
service, if, in fact, the system needs to be told that). I submit that it
would be better to have separate solutions for each of these needs, which can
be bundled together to achieve the overall use case, as I think that will
provide better control without adding too much complexity for the end user.
That means we would break this out into affinity/anti-affinity and logging
configuration.

We could always have a single parameter (like this one) which sets the others
for convenience. I'm not sure we'll actually need it, but I do think that
splitting out the bundled functionality into individual items (some of which
may already be being worked on elsewhere) is the way to go.
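
To make the proposed split concrete, here is a rough, language-neutral sketch
in Python. It is an illustration only: none of these names (Placement,
LogHandling, RequestOptions, long_lived_preset) exist in YARN or in the
attached patches, and the values chosen for the preset are arbitrary
assumptions. Node placement for spot instances is deliberately left out, since
that part is covered by Node Labels.

```python
from dataclasses import dataclass, replace
from enum import Enum


class Placement(Enum):
    """Hypothetical affinity setting for an application's containers."""
    NO_PREFERENCE = "no_preference"
    AFFINITY = "affinity"            # prefer scheduling together
    ANTI_AFFINITY = "anti_affinity"  # prefer keeping apart


class LogHandling(Enum):
    """Hypothetical log-handling setting."""
    ON_FINISH = "on_finish"  # handle logs once the application ends
    ROLLING = "rolling"      # handle logs while the application still runs


@dataclass(frozen=True)
class RequestOptions:
    """Independent settings; each one can be controlled on its own."""
    placement: Placement = Placement.NO_PREFERENCE
    log_handling: LogHandling = LogHandling.ON_FINISH


def long_lived_preset() -> RequestOptions:
    """Convenience 'switch' that just sets the individual options.

    The specific values are assumptions made for this sketch.
    """
    return RequestOptions(
        placement=Placement.ANTI_AFFINITY,
        log_handling=LogHandling.ROLLING,
    )


if __name__ == "__main__":
    # Start from the bundle, then override a single behavior.
    options = replace(long_lived_preset(), placement=Placement.NO_PREFERENCE)
    print(options)
```

The point of the sketch is that the convenience preset carries no behavior of
its own: a user who wants only rolling log handling, or only anti-affinity,
can set that one option without opting in to the rest.
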
> Add parameter for YARN resource requests to indicate "long lived"
> -----------------------------------------------------------------
>
> Key: YARN-1039
> URL: https://issues.apache.org/jira/browse/YARN-1039
> Project: Hadoop YARN
> Issue Type: Sub-task
> Components: resourcemanager
> Affects Versions: 3.0.0, 2.1.1-beta
> Reporter: Steve Loughran
> Assignee: Craig Welch
> Attachments: YARN-1039.1.patch, YARN-1039.2.patch, YARN-1039.3.patch
>
>
> A container request could support a new parameter "long-lived". This could be
> used by a scheduler that would know not to host the service on a transient
> (cloud: spot priced) node.
> Schedulers could also decide whether or not to allocate multiple long-lived
> containers on the same node