[ 
https://issues.apache.org/jira/browse/YARN-3870?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karthik Kambatla reassigned YARN-3870:
--------------------------------------

    Assignee:     (was: Karthik Kambatla)

> Providing raw container request information for fine scheduling
> ---------------------------------------------------------------
>
>                 Key: YARN-3870
>                 URL: https://issues.apache.org/jira/browse/YARN-3870
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: api, applications, capacityscheduler, fairscheduler, 
> resourcemanager, scheduler, yarn
>            Reporter: Lei Guo
>
> Currently, when AM sends container requests to RM and scheduler, it expands 
> individual container requests into host/rack/any format. For instance, if I 
> am asking for container request with preference "host1, host2, host3", 
> assuming all are in the same rack rack1, instead of sending one raw container 
> request to RM/Scheduler with raw preference list, it basically expand it to 
> become 5 different objects with host1, host2, host3, rack1 and any in there. 
> When scheduler receives information, it basically already lost the raw 
> request. This is ok for single container request, but it will cause trouble 
> when dealing with multiple container requests from the same application. 
> Consider this case:
> 6 hosts, two racks:
> rack1 (host1, host2, host3) rack2 (host4, host5, host6)
> When application requests two containers with different data locality 
> preference:
> c1: host1, host2, host4
> c2: host2, host3, host5
> This will end up with following container request list when client sending 
> request to RM/Scheduler:
> host1: 1 instance
> host2: 2 instances
> host3: 1 instance
> host4: 1 instance
> host5: 1 instance
> rack1: 2 instances
> rack2: 2 instances
> any: 2 instances
> Fundamentally, it is hard for scheduler to make a right judgement without 
> knowing the raw container request. The situation will get worse when dealing 
> with affinity and anti-affinity or even gang scheduling etc.
> We need some way to provide raw container request information for fine 
> scheduling purpose.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

---------------------------------------------------------------------
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org

Reply via email to