[ https://issues.apache.org/jira/browse/YARN-8250?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16496872#comment-16496872 ]
Haibo Chen commented on YARN-8250:
----------------------------------
[~asuresh], [~leftnoteasy] and I had an offline discussion about this again.
We think one alternative to avoid two different implementations of the
container scheduler is to modify the behavior of the existing
ContainerScheduler to accommodate the requirements of NM over-allocation.
Specifically, the behavior of the current ContainerScheduler would change as
follows:
Before:
1) Upon a GUARANTEED container scheduling event, always queue the GUARANTEED
container first and then check if any OPPORTUNISTIC container needs to be
preempted. If so, wait for the OPPORTUNISTIC container(s) to be killed.
Otherwise, launch the GUARANTEED container.
2) Upon an OPPORTUNISTIC container scheduling event, queue the container first
and only launch the OPPORTUNISTIC container if there is enough room.
3) Upon any container completed or finished event that signals that resources
have been released, check if any container (GUARANTEED containers first, then
OPPORTUNISTIC containers) can be launched.
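To make the current behavior concrete, here is a minimal sketch of the three "Before" rules above. It is a toy model, not the real ContainerScheduler: the class BeforeSchedulerSketch, its methods, the integer resource units, and the choice of the newest OPPORTUNISTIC container as the preemption victim are all assumptions made for illustration.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Hypothetical model of the "Before" behavior; not the real ContainerScheduler.
class BeforeSchedulerSketch {
    enum Type { GUARANTEED, OPPORTUNISTIC }

    static final class Container {
        final String id;
        final Type type;
        final int size; // abstract resource units (assumption of this sketch)
        Container(String id, Type type, int size) {
            this.id = id;
            this.type = type;
            this.size = size;
        }
    }

    private final int capacity;
    private int used;
    private final Deque<Container> queuedGuaranteed = new ArrayDeque<>();
    private final Deque<Container> queuedOpportunistic = new ArrayDeque<>();
    final List<Container> running = new ArrayList<>();

    BeforeSchedulerSketch(int capacity) {
        this.capacity = capacity;
    }

    // Rules 1 and 2: always queue first, then launch whatever fits.
    // Returns the running OPPORTUNISTIC containers that would have to be
    // killed before the queued GUARANTEED containers fit (empty if none).
    List<Container> onSchedule(Container c) {
        if (c.type == Type.GUARANTEED) {
            queuedGuaranteed.addLast(c);
        } else {
            queuedOpportunistic.addLast(c);
        }
        launchQueued();
        return c.type == Type.GUARANTEED ? pickVictims() : new ArrayList<>();
    }

    // Rule 3: a finished (or killed) container releases resources,
    // so the queues are retried right away.
    void onCompleted(Container c) {
        if (running.remove(c)) {
            used -= c.size;
        }
        launchQueued();
    }

    // GUARANTEED first; OPPORTUNISTIC only when no GUARANTEED container waits.
    private void launchQueued() {
        drain(queuedGuaranteed);
        if (queuedGuaranteed.isEmpty()) {
            drain(queuedOpportunistic);
        }
    }

    private void drain(Deque<Container> queue) {
        while (!queue.isEmpty() && used + queue.peekFirst().size <= capacity) {
            Container c = queue.pollFirst();
            used += c.size;
            running.add(c);
        }
    }

    private List<Container> pickVictims() {
        int shortfall = used - capacity;
        for (Container c : queuedGuaranteed) {
            shortfall += c.size;
        }
        List<Container> victims = new ArrayList<>();
        // Newest OPPORTUNISTIC containers are picked first (assumption).
        for (int i = running.size() - 1; i >= 0 && shortfall > 0; i--) {
            Container c = running.get(i);
            if (c.type == Type.OPPORTUNISTIC) {
                victims.add(c);
                shortfall -= c.size;
            }
        }
        return victims;
    }
}
```

Note that in this model a GUARANTEED container that does not fit stays queued until the victims have actually completed, which is the launch latency the proposal is trying to remove.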
After:
1) Upon a GUARANTEED container scheduling event, launch the GUARANTEED
container immediately (without queuing). Rely on cgroups OOM control
(YARN-6677) to preempt OPPORTUNISTIC containers as necessary.
2) Upon an OPPORTUNISTIC container scheduling event, simply queue the
OPPORTUNISTIC container.
3) Upon any container completed or finished event, do not try to launch any
container.
4) Introduce a periodic check (in the ContainersMonitor thread) that launches
queued OPPORTUNISTIC containers. Ideally, the period is configurable so that
the latency to launch OPPORTUNISTIC containers can be reduced.
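The four "After" rules above can be sketched roughly as follows. Again this is only a toy model under stated assumptions: AfterSchedulerSketch, periodicCheck, and the availableUnits parameter (standing in for whatever headroom the monitor observes) are hypothetical names, and preemption by cgroups OOM control happens outside the sketch.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Hypothetical model of the proposed "After" behavior; not the real ContainerScheduler.
class AfterSchedulerSketch {
    enum Type { GUARANTEED, OPPORTUNISTIC }

    static final class Container {
        final String id;
        final Type type;
        final int size; // abstract resource units (assumption of this sketch)
        Container(String id, Type type, int size) {
            this.id = id;
            this.type = type;
            this.size = size;
        }
    }

    final List<Container> running = new ArrayList<>();
    private final Deque<Container> queuedOpportunistic = new ArrayDeque<>();

    void onSchedule(Container c) {
        if (c.type == Type.GUARANTEED) {
            // Rule 1: launch immediately, without queuing. Preempting
            // OPPORTUNISTIC containers is left to cgroups OOM control.
            running.add(c);
        } else {
            // Rule 2: simply queue.
            queuedOpportunistic.addLast(c);
        }
    }

    // Rule 3: release the container but do not try to launch anything.
    void onCompleted(Container c) {
        running.remove(c);
    }

    // Rule 4: called periodically; launches queued OPPORTUNISTIC containers
    // in FIFO order while they fit into the observed headroom.
    // Returns the number of containers launched in this round.
    int periodicCheck(int availableUnits) {
        int launched = 0;
        while (!queuedOpportunistic.isEmpty()
                && queuedOpportunistic.peekFirst().size <= availableUnits) {
            Container c = queuedOpportunistic.pollFirst();
            availableUnits -= c.size;
            running.add(c);
            launched++;
        }
        return launched;
    }
}
```

The sketch leaves the timer out on purpose: in the NodeManager the check would be driven from the ContainersMonitor thread, and the interval between calls is what bounds the extra launch latency for OPPORTUNISTIC containers.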
As we have discussed in previous comments, this reduces the latency to launch
GUARANTEED containers and allows us to control how aggressively OPPORTUNISTIC
containers are launched, which is especially important for reliability when
over-allocation is turned on. The code can be a lot simpler as well.
*But it does increase the latency to launch OPPORTUNISTIC containers in cases
where over-allocation is not on, because we give up the opportunity to launch
them when containers finish or are paused*. In addition, it adds a
dependency on cgroup OOM control to preempt OPPORTUNISTIC containers, even
though I'd argue it's best to turn on cgroup isolation anyway to ensure
GUARANTEED containers are not adversely impacted by running OPPORTUNISTIC
containers.
Let us know your thoughts, in particular whether the workloads you are running
would be okay with this change. [~leftnoteasy] Please add anything that I may
have missed.
> Create another implementation of ContainerScheduler to support NM
> overallocation
> --------------------------------------------------------------------------------
>
> Key: YARN-8250
> URL: https://issues.apache.org/jira/browse/YARN-8250
> Project: Hadoop YARN
> Issue Type: Sub-task
> Reporter: Haibo Chen
> Assignee: Haibo Chen
> Priority: Major
> Attachments: YARN-8250-YARN-1011.00.patch,
> YARN-8250-YARN-1011.01.patch, YARN-8250-YARN-1011.02.patch
>
>
> YARN-6675 adds NM over-allocation support by modifying the existing
> ContainerScheduler and providing a utilizationBased resource tracker.
> However, the implementation adds a lot of complexity to ContainerScheduler,
> and future tweaks of the over-allocation strategy based on how many
> containers have been launched are even more complicated.
> As such, this Jira proposes a new ContainerScheduler that always launches
> guaranteed containers immediately and queues opportunistic containers. It
> relies on a periodic check to launch opportunistic containers.