[jira] [Commented] (YARN-1039) Add parameter for YARN resource requests to indicate "long lived"

Jian Fang (JIRA) Tue, 20 Jan 2015 20:44:41 -0800

    [ 
https://issues.apache.org/jira/browse/YARN-1039?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14285159#comment-14285159
 ]


Jian Fang commented on YARN-1039:
---------------------------------

I can see the potential of distributed scheduling for repeated, handcrafted 
Hadoop jobs, provided the algorithm is robust enough. But how would that scale? 
We run tens of thousands of jobs every day in some production clusters, so 
specifying duration values by hand would be difficult unless Hadoop is smart 
enough to learn them automatically from past job history, or to calculate them 
from something like the input size.
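
To make the "learn from past history, scaled by input size" idea concrete, 
here is a minimal sketch. PastRun and DurationEstimator are hypothetical names 
for illustration and are not part of YARN or of the attached patches. It 
assumes runtime grows roughly linearly with input size, which real jobs often 
do not satisfy.

```java
import java.time.Duration;
import java.util.List;

/** One finished run of a recurring job (hypothetical type, not a YARN class). */
final class PastRun {
    final long inputBytes;
    final Duration runtime;

    PastRun(long inputBytes, Duration runtime) {
        this.inputBytes = inputBytes;
        this.runtime = runtime;
    }
}

/** Estimates a duration from history, assuming runtime is proportional to input size. */
final class DurationEstimator {
    private DurationEstimator() {}

    static Duration estimate(List<PastRun> history, long inputBytes) {
        long totalBytes = 0;
        long totalMillis = 0;
        for (PastRun run : history) {
            totalBytes += run.inputBytes;
            totalMillis += run.runtime.toMillis();
        }
        if (totalBytes <= 0) {
            throw new IllegalArgumentException("history has no input bytes to learn from");
        }
        // Average cost per byte over all past runs, applied to the new input size.
        double millisPerByte = (double) totalMillis / totalBytes;
        return Duration.ofMillis(Math.round(millisPerByte * inputBytes));
    }
}
```

A submission tool could call DurationEstimator.estimate(...) per job instead 
of asking a person to type in a value.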

It seems to me that categories could sometimes be more robust than a raw 
duration. For example, I could define short-lived, medium, and long-lived 
categories using fuzzy logic and then use those categories in my scheduler to 
improve performance.
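
As a rough illustration of the category idea, the sketch below maps an 
expected duration onto three buckets. It uses crisp thresholds as a simple 
stand-in for fuzzy logic. The enum name and both cut-off values are 
assumptions made up for this example, not anything defined by YARN.

```java
import java.time.Duration;

/** Hypothetical coarse lifetime buckets; names and thresholds are illustrative only. */
enum LifetimeCategory {
    SHORT_LIVED, MEDIUM, LONG_LIVED;

    // Assumed cut-offs; a real scheduler would tune or learn these.
    static final Duration SHORT_LIMIT = Duration.ofMinutes(10);
    static final Duration LONG_LIMIT = Duration.ofHours(6);

    /** Buckets an expected duration using the two crisp thresholds above. */
    static LifetimeCategory classify(Duration expected) {
        if (expected.compareTo(SHORT_LIMIT) <= 0) {
            return SHORT_LIVED;
        }
        if (expected.compareTo(LONG_LIMIT) >= 0) {
            return LONG_LIVED;
        }
        return MEDIUM;
    }
}
```

A scheduler that only needs to decide, say, whether a container may land on a 
transient node can then branch on the category and tolerate a fairly 
inaccurate duration estimate.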

How about using a generic object to represent what is known about the job? It 
could hold a flag indicating long-lived or short-lived, a duration value, or a 
category. Sorry if this does not make sense; I am just throwing out some 
ideas.
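
One possible shape for such a generic object is sketched below. LifetimeHint 
and all of its methods are hypothetical; nothing here is the API proposed in 
the attached patches. The category is kept as a plain string, and the literal 
"long-lived" is an assumed category name.

```java
import java.time.Duration;
import java.util.Objects;

/** Hypothetical carrier for whatever the submitter knows about a job's lifetime. */
final class LifetimeHint {
    enum Kind { FLAG, DURATION, CATEGORY }

    private final Kind kind;
    private final Object value;

    private LifetimeHint(Kind kind, Object value) {
        this.kind = kind;
        this.value = Objects.requireNonNull(value);
    }

    static LifetimeHint flag(boolean longLived) {
        return new LifetimeHint(Kind.FLAG, longLived);
    }

    static LifetimeHint duration(Duration expected) {
        return new LifetimeHint(Kind.DURATION, expected);
    }

    static LifetimeHint category(String name) {
        return new LifetimeHint(Kind.CATEGORY, name);
    }

    Kind kind() {
        return kind;
    }

    /** Scheduler-side view: reduce any kind of hint to a long-lived yes/no answer. */
    boolean isLongLived(Duration longLivedThreshold) {
        switch (kind) {
            case FLAG:
                return (Boolean) value;
            case DURATION:
                return ((Duration) value).compareTo(longLivedThreshold) >= 0;
            case CATEGORY:
                return "long-lived".equals(value);
            default:
                throw new IllegalStateException("unknown kind: " + kind);
        }
    }
}
```

A scheduler that only understands a boolean can call isLongLived(...), while 
a smarter one can inspect kind() and use the richer value.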





> Add parameter for YARN resource requests to indicate "long lived"
> -----------------------------------------------------------------
>
>                 Key: YARN-1039
>                 URL: https://issues.apache.org/jira/browse/YARN-1039
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: resourcemanager
>    Affects Versions: 3.0.0, 2.1.1-beta
>            Reporter: Steve Loughran
>            Assignee: Craig Welch
>         Attachments: YARN-1039.1.patch, YARN-1039.2.patch, YARN-1039.3.patch
>
>
> A container request could support a new parameter "long-lived". This could be 
> used by a scheduler that would know not to host the service on a transient 
> (cloud: spot priced) node.
> Schedulers could also decide whether or not to allocate multiple long-lived 
> containers on the same node



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
