[ 
https://issues.apache.org/jira/browse/YARN-4599?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16476682#comment-16476682
 ] 

Haibo Chen commented on YARN-4599:
----------------------------------

Thanks [[email protected]] for updating the patch! The new 
documentation, how to configure is very clear and easy to understand to me. A 
few follow-up comments.
{quote}You are mixing the parent with the child. The parent gets the child pid 
and the child gets 0 after the fork() since the child can just call getpid(). 
It forks a child process gets it's pid in the parent and adds that to a cgroup. 
Once the child notices that it is in the cgroup it starts eating memory 
triggering an OOM
{quote}
Reading the code once more, test_oom in oom_listener_main.c happens to have the 
same name as oom_listener_test_main.c. The test_oom() function in 
oom_listener_main.c seems to be a helper function that simulates OOM situation. 
Let's add more comments to it so that it is clearly indicated.

The DefaultOOMHandler checks both physical and virtual memory regardless of 
whether we'd want to just enforce physical or virtural memory. We can make a 
flag and checks one only dependent on whether physical or virtual memory is 
enabled.

The  ContainerSignalContext.equals() is just for the unit test. We could either 
complete it by including more fields, or we can use Argument Matcher in 
TestDefaultOOMHandler.java as an alternative.

The way that the watchdog in the elastic controller can stop the main 
controller thread from listening is not quite straightforward. How about we add 
a comment before the while(read()) loop to say read() returns -1 if the process 
is destroye?

> Set OOM control for memory cgroups
> ----------------------------------
>
>                 Key: YARN-4599
>                 URL: https://issues.apache.org/jira/browse/YARN-4599
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: nodemanager
>    Affects Versions: 2.9.0
>            Reporter: Karthik Kambatla
>            Assignee: Miklos Szegedi
>            Priority: Major
>              Labels: oct16-medium
>         Attachments: Elastic Memory Control in YARN.pdf, YARN-4599.000.patch, 
> YARN-4599.001.patch, YARN-4599.002.patch, YARN-4599.003.patch, 
> YARN-4599.004.patch, YARN-4599.005.patch, YARN-4599.006.patch, 
> YARN-4599.007.patch, YARN-4599.sandflee.patch, yarn-4599-not-so-useful.patch
>
>
> YARN-1856 adds memory cgroups enforcing support. We should also explicitly 
> set OOM control so that containers are not killed as soon as they go over 
> their usage. Today, one could set the swappiness to control this, but 
> clusters with swap turned off exist.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to