[ 
https://issues.apache.org/jira/browse/YARN-4599?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16474543#comment-16474543
 ] 

Haibo Chen commented on YARN-4599:
----------------------------------

Thanks [[email protected]] for the patch! The 
TestContainersMonitor.testContainerKillOnMemoryOverflow failure seems related.

I have a few comments/questions:

1) Error handling when writing to {{cgroup.event_control}} fails seems missing 
in oom_listener.c, do we need handle such an case?

2) Is 'descriptors->event_control_fd = -1;'  necessary?

3) The comments for test_oom() does not quite make sense to me. My current 
understanding is that it adds the calling process to the given pgroup and 
simulates an OOM by keep asking OS for memory?

4) Can you please elaborate on how cgroup simulation is done in 
oom_listener_test_main.c? The child process that is added to the cgroup only 
does sleep().

5) Doing a param matching in CGroupsHandlerImpl.GetCGroupParam() does not seem 
a good practice to me. Does it make sense to create a new method for the 
special case?

6) Let's wrap the new thread join in ContainersMonitorImpl with try-catch 
clause as we do with the monitoring thread.

7) The configuration changes are incompatible in that before the patch, 
poll-based pmcheck and vmcheck takes precedence over the cgroup based memory 
control mechanism. It is now reversed after the patch. if cgroup-based memory 
control is enabled, then poll-based pmcheck and vmcheck is disabled 
automatically. IIUC, one of the reasons is that we need to reuse the pmcheck 
and vmcheck flags which are dedicated to control the poll-based memory control. 
 How about we create separate configurations for pm elastic control and vm 
elastic control? We can make sure they are mutual exclusive as indicated in 
CGroupElasticMemoryController. We want to keep the elastic memory control 
mechanism independent of the per-container memory control mechanism, so we can 
get ride of the shortcut in checkLimit()  (warnings are probably more 
appropriate if we want to say the poll-based mechanism is not robust, which is 
an issue unrelated to what we are doing here.)

8) In  CGroupElasticMemoryController, we can create a createOomHandler() method 
that is called by the constructor and overridden by the unit tests to avoid the 
test-only setOomHandler() method

9) bq.       // TODO could we miss an event before the process starts?

This is no longer an issue based on your experiment, per our offline discussion?

10) We only need two threads in the thread pool, one for reading the error 
stream, and the other for watching and logging OOM state, don't we. If so, we 
change       executor = Executors.newFixedThreadPool(5);  =>       executor = 
Executors.newFixedThreadPool(2);

11) I'm not quite sure how the watchdog thread can tell the elastic memory 
controller to stop. I guess once the watchdog thread calls stopListening(), the 
process is destroyed,  `(read = events.read(event)` would return false, we'd 
realize in the memory controller thread that OOM is not resolved in time and 
throw an exception to crash NM?  This process seems pretty obscure to me. Doe 
it make sense to have the stopListening logic in `if (!watchdog.get) {}` block 
instead?

12) Can we replace thrown.expected() statements with @Test(expected) which is 
more declarative? Similarly in TestDefaultOOMHandler.

13) In TestCGroupElasticMemoryController.testNormalExit(), not quite sure what 
the purpose of the sleep task is. Can you please add some comments there?

14) Can we add some comments to DefaultOOMHandler javadoc, especially which 
containers are considered to be killed first.

15) if new YarnRuntimeException("Could not find any containers but CGroups " + 
"reserved for containers ran out of memory. " + "I am giving up") is thrown in 
DefaultOOMHandler, CGroupElasticMemoryController  simply logs the exception. Do 
we want to crash the NM as well in this case?

16) In TestDefaultOOMHandler.testBothContainersOOM(), I think we also need to 
verify container 2 is killed. Similarly, in  testOneContainerOOM() and  
testNoContainerOOM().

 

 

> Set OOM control for memory cgroups
> ----------------------------------
>
>                 Key: YARN-4599
>                 URL: https://issues.apache.org/jira/browse/YARN-4599
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: nodemanager
>    Affects Versions: 2.9.0
>            Reporter: Karthik Kambatla
>            Assignee: Miklos Szegedi
>            Priority: Major
>              Labels: oct16-medium
>         Attachments: Elastic Memory Control in YARN.pdf, YARN-4599.000.patch, 
> YARN-4599.001.patch, YARN-4599.002.patch, YARN-4599.003.patch, 
> YARN-4599.004.patch, YARN-4599.005.patch, YARN-4599.006.patch, 
> YARN-4599.sandflee.patch, yarn-4599-not-so-useful.patch
>
>
> YARN-1856 adds memory cgroups enforcing support. We should also explicitly 
> set OOM control so that containers are not killed as soon as they go over 
> their usage. Today, one could set the swappiness to control this, but 
> clusters with swap turned off exist.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to