[ 
https://issues.apache.org/jira/browse/YARN-5215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15332268#comment-15332268
 ] 

Konstantinos Karanasos commented on YARN-5215:
----------------------------------------------

I also think this is a good feature to have -- thanks for initiating this, 
[~elgoiri]...

We have had similar use cases, so we went over possible designs some time 
ago.
I see some advantages in the "fake container" approach that [~kasha] mentioned.
What I like about it is that you do not have to introduce the "external 
resources" to the RM. Essentially everything happens at the NM level, and 
the RM just sees an extra container.
The disadvantage I see is that, out of the box, we will not be able to 
distinguish those fake containers from regular ones or make the user aware 
of them...
What do you think?
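
To make the "fake container" idea a bit more concrete, here is a rough sketch 
of how an NM might size such a container. It assumes the external usage is 
estimated as the node's utilization minus the sum of the containers' 
utilization (as the issue description suggests), and that the result is 
rounded up to an allocation increment so it looks like any other container to 
the RM. The class and method names are hypothetical; none of this is existing 
YARN API or taken from the attached patches.

```java
/** Hypothetical sketch, not YARN API. All sizes are memory in MB. */
final class FakeContainerSizer {
    private FakeContainerSizer() {}

    /**
     * Estimates external usage as node usage minus the usage of all YARN
     * containers on the node, floored at zero to absorb measurement noise.
     */
    static long estimateExternalMb(long nodeUsedMb, long... containerUsedMb) {
        long yarnUsedMb = 0;
        for (long used : containerUsedMb) {
            yarnUsedMb += used;
        }
        return Math.max(0L, nodeUsedMb - yarnUsedMb);
    }

    /**
     * Rounds the external usage up to a multiple of the allocation
     * increment (an assumed parameter) to get the fake container size.
     */
    static long fakeContainerMb(long externalMb, long incrementMb) {
        if (externalMb < 0 || incrementMb <= 0) {
            throw new IllegalArgumentException("sizes must be positive");
        }
        return ((externalMb + incrementMb - 1) / incrementMb) * incrementMb;
    }
}
```

With a node using 10000 MB, of which two containers use 2000 MB and 3000 MB, 
the estimated external usage is 5000 MB, and with a 1024 MB increment the NM 
would hold a 5120 MB fake container.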

Regarding overcommitment, I also believe it is orthogonal to this feature, 
but the two can be nicely combined.
The way I see the full picture is to use guaranteed containers for the fake 
containers, as well as for a few containers that we are sure are not going to 
be preempted. Then use NM-queuing (YARN-2883) and opportunistic containers to 
place more containers at the NMs (using YARN-5220). At the same time, we can 
enable overcommitment through YARN-1011 to start even more opportunistic 
containers, based on the node's actual utilization (especially if we know that 
the fake container usually does not use all the resources allocated to it).
Eventually we can also introduce additional container types, as [~curino] 
mentioned, to have even tighter control over what gets preempted.
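
As a rough illustration of the overcommitment part, the sketch below computes 
how much room a node has for opportunistic containers from its actual 
utilization rather than from what is allocated. The utilization cap 
(maxUtilization) and all names here are assumptions for illustration only; 
they are not taken from YARN-1011 or any existing YARN API.

```java
/** Hypothetical sketch, not YARN API. All sizes are memory in MB. */
final class OpportunisticHeadroom {
    private OpportunisticHeadroom() {}

    /**
     * Room left for opportunistic containers: the node may be filled up to
     * maxUtilization of its capacity, measured against actual usage (which
     * already includes whatever the fake container really consumes).
     */
    static long headroomMb(long capacityMb, long actualUsedMb,
                           double maxUtilization) {
        if (maxUtilization <= 0.0 || maxUtilization > 1.0) {
            throw new IllegalArgumentException("maxUtilization must be in (0, 1]");
        }
        long limitMb = (long) Math.floor(capacityMb * maxUtilization);
        return Math.max(0L, limitMb - actualUsedMb);
    }

    /** Number of opportunistic containers of a given size that fit. */
    static long startable(long headroomMb, long containerMb) {
        if (containerMb <= 0) {
            throw new IllegalArgumentException("containerMb must be positive");
        }
        return headroomMb / containerMb;
    }
}
```

For example, a 16384 MB node that is fully allocated to guaranteed containers 
(including the fake one) but actually uses only 6000 MB would, with a 0.75 cap, 
have 6288 MB of headroom, enough for six 1024 MB opportunistic containers.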

> Scheduling containers based on external load in the servers
> -----------------------------------------------------------
>
>                 Key: YARN-5215
>                 URL: https://issues.apache.org/jira/browse/YARN-5215
>             Project: Hadoop YARN
>          Issue Type: Improvement
>            Reporter: Inigo Goiri
>         Attachments: YARN-5215.000.patch, YARN-5215.001.patch
>
>
> Currently YARN runs containers in the servers assuming that they own all the 
> resources. The proposal is to use the utilization information in the node and 
> the containers to estimate how much is consumed by external processes and 
> schedule based on this estimation.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

