Weiwei Yang commented on YARN-1042:

Hello [~cchen317]

Thanks for your thoughts. 

Basically, one application/attempt may include multiple groups of container 
requests, and each group includes multiple container requests.

This sounds like supporting a container allocation rule per request, e.g. 
app1 asks for 3 containers with an affinity policy and then asks for 4 
containers with an anti-affinity policy. That requires an AM-RM protocol 
change, which is why it is not in my patch yet. What I have done so far is 
let you specify a rule per app.
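
To make the difference between the two scopes concrete, here is a minimal 
sketch in plain Java. Every name in it (AllocationRule, RequestGroup, 
RuleScopeSketch) is hypothetical and made up for illustration; none of them 
are YARN APIs, and the code is not taken from the patch.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical types for illustration only; none of these are YARN APIs.
enum AllocationRule { AFFINITY, ANTI_AFFINITY }

// A group of container requests that share one allocation rule.
final class RequestGroup {
    final int numContainers;
    final AllocationRule rule;

    RequestGroup(int numContainers, AllocationRule rule) {
        this.numContainers = numContainers;
        this.rule = rule;
    }
}

final class RuleScopeSketch {
    // Per-app scope: one rule is set for the application and applies to
    // every group of requests it makes.
    static List<RequestGroup> perApp(AllocationRule appRule, int... groupSizes) {
        List<RequestGroup> groups = new ArrayList<>();
        for (int size : groupSizes) {
            groups.add(new RequestGroup(size, appRule));
        }
        return groups;
    }

    // Per-request scope: each group carries its own rule, so the rule has
    // to travel with the request itself.
    static List<RequestGroup> perRequest(RequestGroup... groups) {
        return new ArrayList<>(Arrays.asList(groups));
    }
}
```

With the per-app scope, the app1 example above (3 containers, then 4) can 
only get the same rule for both groups. With the per-request scope, the 
first group can ask for affinity and the second for anti-affinity.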

Fundamentally, it is hard for the scheduler to make the right judgement 
without knowing the raw container request. The situation gets worse when 
dealing with affinity and anti-affinity, or even gang scheduling.

I do not fully understand what you mean by "raw container request" in your 
comments, but I think I understand your point.
While implementing container allocation policies, the hard part for me is 
that the scheduler is not aware of the *context* when it tries to allocate a 
container. Ideally, it needs to know which other containers this container is 
related to (they are not independent, like the groups you mentioned). It also 
needs to know scheduling details such as how long a request has been waiting 
and how many requests are waiting. This information would help the scheduler 
make more complex decisions.
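
As a rough illustration of the kind of context I mean, here is a small 
sketch in plain Java. AllocationContext and all of its fields and methods 
are hypothetical names, not YARN classes. It works at node level only for 
simplicity, and the waiting-time check is just an assumed example of a 
decision that needs this information.

```java
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

// Hypothetical sketch of per-allocation context; not a YARN class.
final class AllocationContext {
    // Nodes already hosting containers from the same group as this request.
    final Set<String> nodesUsedByGroup;
    // How long this request has been waiting, in milliseconds.
    final long waitedMillis;
    // How many requests are still waiting.
    final int pendingRequests;

    AllocationContext(Set<String> nodesUsedByGroup, long waitedMillis,
                      int pendingRequests) {
        this.nodesUsedByGroup =
            Collections.unmodifiableSet(new HashSet<>(nodesUsedByGroup));
        this.waitedMillis = waitedMillis;
        this.pendingRequests = pendingRequests;
    }

    // Anti-affinity: reject a node that already hosts a related container.
    boolean satisfiesAntiAffinity(String candidateNode) {
        return !nodesUsedByGroup.contains(candidateNode);
    }

    // Affinity: the first container may go anywhere; later ones must land
    // on a node that already hosts a related container.
    boolean satisfiesAffinity(String candidateNode) {
        return nodesUsedByGroup.isEmpty()
            || nodesUsedByGroup.contains(candidateNode);
    }

    // Assumed example: flag a request that has waited longer than a limit,
    // so a scheduler could decide to treat it differently.
    boolean hasWaitedLongerThan(long limitMillis) {
        return waitedMillis > limitMillis;
    }
}
```

Without nodesUsedByGroup the scheduler sees each request in isolation, and 
neither check above can be evaluated.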

> add ability to specify affinity/anti-affinity in container requests
> -------------------------------------------------------------------
>                 Key: YARN-1042
>                 URL: https://issues.apache.org/jira/browse/YARN-1042
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: resourcemanager
>    Affects Versions: 3.0.0
>            Reporter: Steve Loughran
>            Assignee: Arun C Murthy
>         Attachments: YARN-1042-demo.patch, YARN-1042-design-doc.pdf, 
> YARN-1042.001.patch, YARN-1042.002.patch
> container requests to the AM should be able to request anti-affinity to 
> ensure that things like Region Servers don't come up on the same failure 
> zones. 
> Similarly, you may want to be able to specify affinity to the same host or rack 
> without specifying which specific host/rack. Example: bringing up a small 
> giraph cluster in a large YARN cluster would benefit from having the 
> processes in the same rack purely for bandwidth reasons.

