[ 
https://issues.apache.org/jira/browse/YARN-1039?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14297783#comment-14297783
 ] 

Craig Welch commented on YARN-1039:
-----------------------------------

[~chris.douglas]

bq.  YARN shouldn't understand the lifecycle for a service or the 
progress/dependencies for task containers

That's not necessarily so. There are some cases where the type of life cycle 
for an application is important, for example when determining whether it is 
open-ended ("service") or a batch process which entails a notion of progress 
("session"), at least for purposes of display.

I think we need to rescope and clarify this jira a bit so that we can make 
progress. A number of items in the original problem statement and subsequent 
comments have been taken on elsewhere, and so no longer make sense to pursue 
here.  Here's an attempt at a breakdown:

bq. This could be used by a scheduler that would know not to host the service 
on a transient (cloud: spot priced) node

I think this is now clearly covered by [YARN-796]. Giving nodes qualities 
(including operational qualities such as these) is one of the core purposes of 
that work, so it makes no sense to duplicate it here, and it should be 
de-scoped from this jira.

bq. Schedulers could also decide whether or not to allocate multiple long-lived 
containers on the same node

As [~ste...@apache.org] mentioned in an earlier comment 
[https://issues.apache.org/jira/browse/YARN-1039?focusedCommentId=14038041&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14038041],
affinity / anti-affinity is covered in a more general sense in [YARN-1042].  
The above component of this jira is really just one such case, so it should be 
covered by that general solution and dropped from scope as well.  There may be 
some interest in informing that solution based on a generalized "service" 
setting, but to really understand that, the affinity approach needs to be 
worked out first. I think the affinity approach will need to inform/integrate 
with this rather than the other way around, and integration should be 
approached as part of that effort.

That leaves nothing, so we can close the jira ;-)  Not quite, there were 
several things added in comments:

Token management - handled in [YARN-941]

Scheduler hints not related to node categories or anti-affinity (opportunistic 
scheduling, etc.) - this does strike me as something better handled via the 
duration route et al. [YARN-2877] [YARN-1051] and not something which needs to 
be replicated here

I think that really just leaves the progress bar (and potentially other 
display-related items).  This is covered by [YARN-1079].  I suggest, then, that 
we either rescope this jira to providing the lifecycle information as an 
application tag 
[https://issues.apache.org/jira/browse/YARN-1039?focusedCommentId=14039679&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14039679],
as suggested by [~zjshen] early on, or close it and cover the work as part of 
[YARN-1079].  I originally objected to that approach on the basis that tags 
appeared to be a display-type feature which did not fit this effort, but if 
rescoped as I'm proposing, this jira becomes such a feature, and I think that 
approach is now a good fit.
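
To make the tag-based proposal concrete, here is a minimal sketch of how a 
display layer might read a lifecycle from an application's tags and decide 
whether a progress bar makes sense. The tag strings ("lifecycle:service", 
"lifecycle:session"), the class, and the method names are all hypothetical; 
nothing in this jira defines a naming convention. The sketch deliberately has 
no Hadoop dependency; in a real client the tag set would presumably be passed 
to ApplicationSubmissionContext#setApplicationTags.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

public class LifecycleTags {
    // Hypothetical tag values, chosen only for illustration.
    static final String SERVICE_TAG = "lifecycle:service";
    static final String SESSION_TAG = "lifecycle:session";

    enum Lifecycle { SERVICE, SESSION, UNKNOWN }

    // Derive the lifecycle from the tags; applications without a lifecycle
    // tag are reported as UNKNOWN rather than guessed.
    static Lifecycle lifecycleOf(Set<String> applicationTags) {
        if (applicationTags.contains(SERVICE_TAG)) {
            return Lifecycle.SERVICE;
        }
        if (applicationTags.contains(SESSION_TAG)) {
            return Lifecycle.SESSION;
        }
        return Lifecycle.UNKNOWN;
    }

    // An open-ended service has no meaningful notion of progress, so a UI
    // could hide the progress bar for it and keep it for everything else.
    static boolean showProgressBar(Set<String> applicationTags) {
        return lifecycleOf(applicationTags) != Lifecycle.SERVICE;
    }

    public static void main(String[] args) {
        Set<String> tags = new HashSet<>(Arrays.asList("hbase", SERVICE_TAG));
        System.out.println(lifecycleOf(tags) + " progressBar=" + showProgressBar(tags));
    }
}
```

Treating untagged applications as UNKNOWN and still showing the progress bar 
keeps today's display behavior for anything that does not opt in.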

Thoughts?


> Add parameter for YARN resource requests to indicate "long lived"
> -----------------------------------------------------------------
>
>                 Key: YARN-1039
>                 URL: https://issues.apache.org/jira/browse/YARN-1039
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: resourcemanager
>    Affects Versions: 3.0.0, 2.1.1-beta
>            Reporter: Steve Loughran
>            Assignee: Craig Welch
>         Attachments: YARN-1039.1.patch, YARN-1039.2.patch, YARN-1039.3.patch
>
>
> A container request could support a new parameter "long-lived". This could be 
> used by a scheduler that would know not to host the service on a transient 
> (cloud: spot priced) node.
> Schedulers could also decide whether or not to allocate multiple long-lived 
> containers on the same node



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
