[ https://issues.apache.org/jira/browse/YARN-8250?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16476312#comment-16476312 ]
Arun Suresh commented on YARN-8250:
-----------------------------------
Also with regard to this:
{quote}
Minimize impact on GUARANTEED containers from over-allocating node with
OPPORTUNISTIC containers. Queuing time of GUARANTEED containers would increase
with more running OPPORTUNISTIC containers, which is the case with
over-allocating.
{quote}
I would like to understand why this is so. When a G container comes in and
resources are currently being used by a number of O containers, the
ContainerScheduler (CS) first queues the G container and then requests that the
appropriate number of O containers be killed (or paused). Once the CS receives
the event that the O containers have been killed/paused, it starts the queued G
containers. If the kill signals are kill -9, then the events should be received
almost immediately. I don't expect the queued G containers to take more than a
second or two to start.
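To make the flow above concrete, here is a minimal Java sketch of it. This is a toy model, not the actual NodeManager ContainerScheduler: the class name, the method names, the integer resource units, and the oldest-first choice of which O containers to kill are all assumptions made for illustration.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Toy model of "queue G, kill O, start G on the kill event". Hypothetical names throughout. */
class PreemptingSchedulerSketch {
    static final class Container {
        final String id;
        final int size; // resource units; a stand-in for memory/vcores
        Container(String id, int size) { this.id = id; this.size = size; }
    }

    private final int capacity;        // total units on the node
    private int used = 0;              // units held by running containers (including ones being killed)
    private int pendingRelease = 0;    // units that free up once in-flight kills complete
    private final Map<String, Container> runningO = new LinkedHashMap<>();
    private final Map<String, Container> killing = new LinkedHashMap<>();
    private final Deque<Container> queuedG = new ArrayDeque<>();
    final List<String> started = new ArrayList<>();      // G containers started, in order
    final List<String> killRequests = new ArrayList<>(); // O containers asked to stop, in order

    PreemptingSchedulerSketch(int capacity) { this.capacity = capacity; }

    void startOpportunistic(String id, int size) {
        if (used + size > capacity) throw new IllegalStateException("node is full");
        runningO.put(id, new Container(id, size));
        used += size;
    }

    /** A G container arrives: queue it, start it if it fits, otherwise request kills. */
    void submitGuaranteed(String id, int size) {
        queuedG.addLast(new Container(id, size));
        startQueuedGuaranteed();
        requestKillsIfNeeded();
    }

    /** Event from the container: the O container has actually been killed/paused. */
    void onOpportunisticStopped(String id) {
        Container c = killing.remove(id);
        if (c == null) return; // not a container we asked to stop
        used -= c.size;
        pendingRelease -= c.size;
        startQueuedGuaranteed();
    }

    private void startQueuedGuaranteed() {
        while (!queuedG.isEmpty() && used + queuedG.peekFirst().size <= capacity) {
            Container g = queuedG.pollFirst();
            used += g.size;
            started.add(g.id);
        }
    }

    private void requestKillsIfNeeded() {
        int needed = 0;
        for (Container g : queuedG) needed += g.size;
        // Shortfall after counting free units and units already being released.
        int shortfall = needed - (capacity - used) - pendingRelease;
        Iterator<Container> it = runningO.values().iterator();
        while (shortfall > 0 && it.hasNext()) {
            Container victim = it.next(); // oldest first: an arbitrary choice for this sketch
            it.remove();
            killing.put(victim.id, victim);
            pendingRelease += victim.size;
            shortfall -= victim.size;
            killRequests.add(victim.id);
        }
    }
}
```

In this model the queuing delay of a G container is exactly the time between the kill request and the matching onOpportunisticStopped call, which is the interval I would expect to be short with kill -9.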
Given that, in a decently utilized cluster, it is possible for the RM to take
a couple of seconds to return container tokens, do you think the added
complexity is justified just to shave a second or two off container startup
times?
I agree that for extremely short tasks (where the lifetime is on the order of a
few seconds) it may be justified, but in our experience, for many of those
tasks, localization time dominates runtime, and localization happens before
the container is even sent to the scheduler.
> Create another implementation of ContainerScheduler to support NM
> overallocation
> --------------------------------------------------------------------------------
>
> Key: YARN-8250
> URL: https://issues.apache.org/jira/browse/YARN-8250
> Project: Hadoop YARN
> Issue Type: Sub-task
> Reporter: Haibo Chen
> Assignee: Haibo Chen
> Priority: Major
> Attachments: YARN-8250-YARN-1011.00.patch,
> YARN-8250-YARN-1011.01.patch, YARN-8250-YARN-1011.02.patch
>
>
> YARN-6675 adds NM over-allocation support by modifying the existing
> ContainerScheduler and providing a utilizationBased resource tracker.
> However, the implementation adds a lot of complexity to ContainerScheduler,
> and future tweaks to the over-allocation strategy based on how many containers
> have been launched would be even more complicated.
> As such, this Jira proposes a new ContainerScheduler that always launches
> guaranteed containers immediately and queues opportunistic containers. It
> relies on a periodic check to launch opportunistic containers.