[ https://issues.apache.org/jira/browse/YARN-4599?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16475059#comment-16475059 ]

Miklos Szegedi commented on YARN-4599:
--------------------------------------

{quote}bq. Is 'descriptors->event_control_fd = -1;'  necessary?
{quote}
Yes, it is a defense against chained errors; it may make it easier to debug 
when you get a core dump.
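To illustrate the idea, here is a minimal sketch (not the patch itself; the 
struct type and function names are made up, only the field name comes from the 
discussion above):

{code:c}
#include <unistd.h>

/* Illustrative sketch only. Resetting the descriptor after close() makes any
 * later accidental use fail fast with EBADF instead of silently hitting a
 * recycled fd, which is what helps when you look at a core dump. */
typedef struct {
  int event_control_fd;
} listener_descriptors_t;

static void close_event_control(listener_descriptors_t *descriptors) {
  if (descriptors->event_control_fd >= 0) {
    close(descriptors->event_control_fd);
    descriptors->event_control_fd = -1; /* defense against chained errors */
  }
}
{code}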
{quote}bq. 3) The comments for test_oom() does not quite make sense to me. My 
current understanding is that it adds the calling process to the given pgroup 
and simulates an OOM by keep asking OS for memory?
{quote}
You are mixing up the parent with the child. After the fork(), the parent gets 
the child's pid and the child gets 0, since the child can just call getpid(). 
The test forks a child process, gets its pid in the parent, and adds that pid 
to a cgroup. Once the child notices that it is in the cgroup, it starts eating 
memory, triggering an OOM.
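A minimal sketch of that flow, assuming a pre-created cgroup directory (the 
path and the sleep-based synchronization are simplifications, not the actual 
oom_listener_test_main.c):

{code:c}
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

#define CGROUP_DIR "/sys/fs/cgroup/memory/test-oom" /* assumed path */

int main(void) {
  pid_t pid = fork();
  if (pid == 0) {
    /* Child: fork() returned 0, so the child would call getpid() if it
     * needed its own pid. Give the parent time to move us into the cgroup
     * (the real test polls instead of sleeping), then eat memory until the
     * kernel OOM killer fires. */
    sleep(1);
    for (;;) {
      char *p = malloc(1 << 20);
      if (p != NULL) {
        memset(p, 1, 1 << 20); /* touch the pages so they get charged */
      }
    }
  } else if (pid > 0) {
    /* Parent: fork() returned the child's pid; add it to the cgroup. */
    char path[256];
    snprintf(path, sizeof(path), "%s/tasks", CGROUP_DIR);
    FILE *tasks = fopen(path, "w");
    if (tasks != NULL) {
      fprintf(tasks, "%d\n", (int) pid);
      fclose(tasks);
    }
    /* ...the test would now wait for the OOM event notification... */
  }
  return 0;
}
{code}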
{quote}bq. 4) Can you please elaborate on how cgroup simulation is done in 
oom_listener_test_main.c? The child process that is added to the cgroup only 
does sleep().
{quote}
/*
 Unit test for cgroup testing. There are two modes.
 If the unit test is run as root and we have cgroups
 we try to create a cgroup and generate an OOM.
 If we are not running as root we just sleep instead
 of eating memory and simulate the OOM by sending
 an event in a mock event fd mock_oom_event_as_user.
*/
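In other words, the non-root mode only exercises the listener code path: the 
test hands the listener an eventfd it created itself and simulates the kernel 
notification by writing to it. A sketch of that idea (the helper name is an 
assumption; mock_oom_event_as_user comes from the test):

{code:c}
#include <stdint.h>
#include <sys/eventfd.h>
#include <unistd.h>

/* Sketch of the non-root mode: the listener blocks reading this eventfd, and
 * the test simulates an OOM by writing a counter to it, just as the kernel
 * would on a real memory.oom_control event. */
int simulate_oom(int mock_oom_event_as_user) {
  uint64_t counter = 1;
  ssize_t written = write(mock_oom_event_as_user, &counter, sizeof(counter));
  return written == (ssize_t) sizeof(counter) ? 0 : -1;
}

/* Usage: int fd = eventfd(0, 0); hand fd to the listener as if it came from
 * cgroup.event_control, then call simulate_oom(fd). */
{code}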
{quote}bq. 5) Doing a param matching in CGroupsHandlerImpl.GetCGroupParam() 
does not seem a good practice to me.
{quote}
CGroupsHandlerImpl.GetCGroupParam() is a smart function that returns the file 
name for a given parameter name. I do not see any good-practice issue here. 
The tasks file is always without the controller name prefix.
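For context, this just mirrors the cgroupfs layout: parameter files are named 
<controller>.<param>, while tasks has no controller prefix. A small sketch of 
the convention (the helper and the mount path are illustrative, not the Java 
code):

{code:c}
#include <stdio.h>
#include <string.h>

/* Parameter files live at <cgroup>/<controller>.<param>, for example
 * memory.oom_control; the tasks file is just <cgroup>/tasks. */
static void cgroup_param_path(char *buf, size_t len, const char *cgroup_dir,
                              const char *controller, const char *param) {
  if (strcmp(param, "tasks") == 0) {
    snprintf(buf, len, "%s/tasks", cgroup_dir);
  } else {
    snprintf(buf, len, "%s/%s.%s", cgroup_dir, controller, param);
  }
}

/* cgroup_param_path(p, sizeof(p), "/sys/fs/cgroup/memory/hadoop-yarn",
 *                   "memory", "oom_control")
 *   -> /sys/fs/cgroup/memory/hadoop-yarn/memory.oom_control */
{code}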
{quote}bq. 6) Let's wrap the new thread join in ContainersMonitorImpl with 
try-catch clause as we do with the monitoring thread.
{quote}
May I ask why? I thought only exceptions that will actually be thrown need to 
be caught. CGroupElasticMemoryController has a much better cleanup process than 
the monitoring thread, and it does not need InterruptedException. In fact, any 
InterruptedException would mean that we have likely leaked the external 
process, so I would advise against using it.
{quote}bq. 7) The configuration changes are incompatible  ... How about we 
create separate configurations for pm elastic control and vm elastic control?
{quote}
I do not necessarily agree here.

a) First of all, polling and cgroups memory control did not work together 
before the patch either. The NM exited with an exception, so there is no 
previous functionality that worked before and does not work now. There is no 
compatibility break. cgroups indeed takes precedence; that is a new feature.

b) I would like to have a clean long-term design for configuration, avoiding 
too many configuration entries and definitely avoiding confusion. If there is 
a yarn.nodemanager.pmem-check-enabled, it suggests general use, and it would 
be unintuitive not to honor it. We indeed cannot change its general meaning 
anymore. I think the clean design is having 
yarn.nodemanager.resource.memory.enabled to enable cgroups, 
yarn.nodemanager.resource.memory.enforced to enforce it per container, and 
yarn.nodemanager.elastic-memory-control.enabled to enforce it at the node 
level. The detailed settings like yarn.nodemanager.pmem-check-enabled and 
yarn.nodemanager.vmem-check-enabled can then only intuitively apply to all of 
them. I understand the concern, but this solution would let us keep only these 
five configuration entries.

{quote}bq. 11) Does it make sense to have the stopListening logic in `if 
(!watchdog.get) {}` block instead?
{quote}

It is completely equivalent. It will be called a few milliseconds earlier or 
later, but there was a missing explanation there, so I added a comment.
{quote}bq. 16) In TestDefaultOOMHandler.testBothContainersOOM(), I think we 
also need to verify container 2 is killed. Similarly, in  testOneContainerOOM() 
and  testNoContainerOOM().
{quote}
Only one container should be killed. However, I refined the verify logic to 
check this even more precisely.

I addressed the rest. I will provide a patch soon.

 

> Set OOM control for memory cgroups
> ----------------------------------
>
>                 Key: YARN-4599
>                 URL: https://issues.apache.org/jira/browse/YARN-4599
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: nodemanager
>    Affects Versions: 2.9.0
>            Reporter: Karthik Kambatla
>            Assignee: Miklos Szegedi
>            Priority: Major
>              Labels: oct16-medium
>         Attachments: Elastic Memory Control in YARN.pdf, YARN-4599.000.patch, 
> YARN-4599.001.patch, YARN-4599.002.patch, YARN-4599.003.patch, 
> YARN-4599.004.patch, YARN-4599.005.patch, YARN-4599.006.patch, 
> YARN-4599.sandflee.patch, yarn-4599-not-so-useful.patch
>
>
> YARN-1856 adds memory cgroups enforcing support. We should also explicitly 
> set OOM control so that containers are not killed as soon as they go over 
> their usage. Today, one could set the swappiness to control this, but 
> clusters with swap turned off exist.


