[
https://issues.apache.org/jira/browse/YARN-4599?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16474543#comment-16474543
]
Haibo Chen commented on YARN-4599:
----------------------------------
Thanks [[email protected]] for the patch! The
TestContainersMonitor.testContainerKillOnMemoryOverflow failure seems related.
I have a few comments/questions:
1) Error handling when writing to {{cgroup.event_control}} fails seems missing
in oom_listener.c, do we need handle such an case?
2) Is 'descriptors->event_control_fd = -1;' necessary?
3) The comments for test_oom() do not quite make sense to me. My current
understanding is that it adds the calling process to the given pgroup and
simulates an OOM by repeatedly asking the OS for memory?
4) Can you please elaborate on how cgroup simulation is done in
oom_listener_test_main.c? The child process that is added to the cgroup only
does sleep().
5) Doing param matching in CGroupsHandlerImpl.GetCGroupParam() does not seem
like good practice to me. Does it make sense to create a new method for the
special case?
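To illustrate 5), here is a minimal sketch of the separation I have in mind. The class and method names are hypothetical stand-ins (the real CGroupsHandlerImpl API differs), and the cgroup filesystem is replaced by an in-memory map so the sketch is self-contained:

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for CGroupsHandlerImpl, backed by a map instead of
// the cgroup filesystem so the sketch stays self-contained.
class CGroupsReaderSketch {
    private final Map<String, String> files = new HashMap<>();

    void put(String param, String value) {
        files.put(param, value);
    }

    // Generic accessor: no parameter-name matching in here.
    String getCGroupParam(String param) {
        return files.get(param);
    }

    // Dedicated accessor for the special case (hypothetical name); any
    // special handling lives here instead of inside getCGroupParam().
    String getOomControlParam() {
        String raw = files.get("memory.oom_control");
        return raw == null ? null : raw.trim();
    }
}
```

The point is that the generic accessor stays free of per-parameter branching, and any special handling gets an explicitly named method.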
6) Let's wrap the new thread join in ContainersMonitorImpl in a try-catch
clause, as we do with the monitoring thread.
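For 6), something along these lines (names are hypothetical; the real code would log the InterruptedException through the NM logger rather than silently restoring the flag):

```java
// Hypothetical helper showing the join wrapped in try-catch, mirroring the
// handling already used for the monitoring thread.
class MonitorShutdownSketch {
    static boolean joinQuietly(Thread t, long timeoutMs) {
        try {
            t.join(timeoutMs);
            return !t.isAlive();
        } catch (InterruptedException e) {
            // Restore the interrupt status instead of swallowing it.
            Thread.currentThread().interrupt();
            return false;
        }
    }
}
```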
7) The configuration changes are incompatible: before the patch, the
poll-based pmcheck and vmcheck take precedence over the cgroup-based memory
control mechanism, and after the patch this is reversed. If cgroup-based
memory control is enabled, then the poll-based pmcheck and vmcheck are
disabled automatically. IIUC, one of the reasons is that we need to reuse the
pmcheck and vmcheck flags, which are dedicated to controlling the poll-based
memory control. How about we create separate configurations for pm elastic
control and vm elastic control? We can make sure they are mutually exclusive
as indicated in CGroupElasticMemoryController. We want to keep the elastic
memory control mechanism independent of the per-container memory control
mechanism, so we can get rid of the shortcut in checkLimit(). (Warnings are
probably more appropriate if we want to say the poll-based mechanism is not
robust, which is an issue unrelated to what we are doing here.)
8) In CGroupElasticMemoryController, we can create a createOomHandler() method
that is called by the constructor and overridden by the unit tests, to avoid
the test-only setOomHandler() method.
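For 8), a sketch of the factory-method pattern I mean, with simplified stand-in names (the real handler type is not a Runnable):

```java
// The constructor obtains the handler from a protected createOomHandler(),
// which a test subclass can override, so no test-only setter is needed.
class ElasticControllerSketch {
    final Runnable oomHandler;

    ElasticControllerSketch() {
        // Ask the factory, so subclasses can substitute a stub.
        this.oomHandler = createOomHandler();
    }

    protected Runnable createOomHandler() {
        return () -> { /* default OOM handling would go here */ };
    }
}
```

A test then subclasses and overrides createOomHandler() to return its stub, instead of mutating the controller after construction.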
9) bq. // TODO could we miss an event before the process starts?
This is no longer an issue based on your experiment, per our offline discussion?
10) We only need two threads in the thread pool, one for reading the error
stream and the other for watching and logging the OOM state, don't we? If so,
we can change executor = Executors.newFixedThreadPool(5); to executor =
Executors.newFixedThreadPool(2);
11) I'm not quite sure how the watchdog thread can tell the elastic memory
controller to stop. I guess once the watchdog thread calls stopListening(),
the process is destroyed, `read = events.read(event)` would return false, and
we'd realize in the memory controller thread that the OOM was not resolved in
time and throw an exception to crash the NM? This flow seems pretty obscure to
me. Does it make sense to have the stopListening logic in the
`if (!watchdog.get()) {}` block instead?
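To make 11) concrete, a toy model of the alternative flow, with all names as stand-ins: the watchdog only flips a flag, and the controller loop itself calls stopListening() from the `if (!watchdog.get())` branch instead of inferring shutdown from the event stream closing.

```java
import java.util.concurrent.atomic.AtomicBoolean;

// Toy model: the controller checks the watchdog flag explicitly and owns
// the decision to stop listening when the OOM was not resolved in time.
class WatchdogSketch {
    final AtomicBoolean watchdog = new AtomicBoolean(true);
    boolean stoppedListening = false;

    void checkWatchdog() {
        if (!watchdog.get()) {
            // Explicit shutdown path instead of relying on read() failing.
            stopListening();
        }
    }

    void stopListening() {
        stoppedListening = true;
    }
}
```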
12) Can we replace the thrown.expect() statements with @Test(expected), which
is more declarative? Similarly in TestDefaultOOMHandler.
13) In TestCGroupElasticMemoryController.testNormalExit(), I am not quite sure
what the purpose of the sleep task is. Can you please add some comments there?
14) Can we add some comments to the DefaultOOMHandler javadoc, especially
about which containers are considered to be killed first?
15) If new YarnRuntimeException("Could not find any containers but CGroups " +
"reserved for containers ran out of memory. " + "I am giving up") is thrown in
DefaultOOMHandler, CGroupElasticMemoryController simply logs the exception. Do
we want to crash the NM as well in this case?
16) In TestDefaultOOMHandler.testBothContainersOOM(), I think we also need to
verify container 2 is killed. Similarly, in testOneContainerOOM() and
testNoContainerOOM().
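For 16), if we do not want to depend on a mocking framework for this, the check could look like this sketch, where a hypothetical recording stub stands in for the kill executor and the test asserts on every expected victim rather than only the first:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical recording stub: the test injects it as the kill executor and
// then asserts on each container id it expects to have been signalled.
class KillRecorderSketch {
    final List<String> killed = new ArrayList<>();

    void killContainer(String containerId) {
        killed.add(containerId);
    }
}
```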
> Set OOM control for memory cgroups
> ----------------------------------
>
> Key: YARN-4599
> URL: https://issues.apache.org/jira/browse/YARN-4599
> Project: Hadoop YARN
> Issue Type: Sub-task
> Components: nodemanager
> Affects Versions: 2.9.0
> Reporter: Karthik Kambatla
> Assignee: Miklos Szegedi
> Priority: Major
> Labels: oct16-medium
> Attachments: Elastic Memory Control in YARN.pdf, YARN-4599.000.patch,
> YARN-4599.001.patch, YARN-4599.002.patch, YARN-4599.003.patch,
> YARN-4599.004.patch, YARN-4599.005.patch, YARN-4599.006.patch,
> YARN-4599.sandflee.patch, yarn-4599-not-so-useful.patch
>
>
> YARN-1856 adds memory cgroups enforcing support. We should also explicitly
> set OOM control so that containers are not killed as soon as they go over
> their usage. Today, one could set the swappiness to control this, but
> clusters with swap turned off exist.
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)