[ 
https://issues.apache.org/jira/browse/YARN-8259?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16483845#comment-16483845
 ] 

Shane Kumpf commented on YARN-8259:
-----------------------------------

{quote}System administrator can reserve one cpu core for node manager and all 
the docker inspect call are counted toward saturating one cpu core{quote}
I'm less concerned about the cpu usage and more concerned about docker's 
client/server model and its potential for hangs, which I've seen many times 
under load. Personally, I want the /proc route for my systems, and I am not 
using hidepid. Losing a container to an intermittent docker issue isn't 
acceptable to me when an alternative exists that avoids the issue.

What I could do is implement both the /proc and {{docker inspect}} approaches, 
and add a configuration switch to choose the implementation, for those that 
use hidepid (or a system without /proc). Would that be acceptable?
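
To make the proposal concrete, here is a rough sketch of the two checks 
behind a switch. It is illustrative only and is not taken from the attached 
patch: the function names, the method strings "proc" and "docker-inspect", 
and the 10 second timeout are all assumptions. The only real CLI call it 
relies on is {{docker inspect --format '{{.State.Running}}'}}.

```python
import os
import subprocess


def proc_liveness(pid, proc_root="/proc"):
    """Return True if <proc_root>/<pid> exists.

    No docker client/server round trip is involved, but this cannot see
    other users' processes when /proc is mounted with hidepid.
    """
    return os.path.isdir(os.path.join(proc_root, str(pid)))


def docker_inspect_liveness(container_id, docker_bin="docker", timeout_s=10.0):
    """Return True if docker reports the container as running.

    The timeout value is an assumption. subprocess.TimeoutExpired is
    deliberately left to propagate: a hung docker client is not evidence
    that the container is dead, so the caller has to decide what to do.
    """
    result = subprocess.run(
        [docker_bin, "inspect", "--format", "{{.State.Running}}", container_id],
        capture_output=True,
        text=True,
        timeout=timeout_s,
    )
    return result.returncode == 0 and result.stdout.strip() == "true"


def is_container_alive(method, pid, container_id, proc_root="/proc"):
    """Dispatch on a (hypothetical) configuration value."""
    if method == "proc":
        return proc_liveness(pid, proc_root)
    if method == "docker-inspect":
        return docker_inspect_liveness(container_id)
    raise ValueError("unknown liveness check method: %r" % (method,))
```

With something like this, sites that use hidepid would select 
"docker-inspect", and everyone else could stay on "proc" and avoid the 
docker client entirely.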

I'm also going to make this a blocker, as all privileged containers are leaked 
on NM restart today.

> Revisit liveliness checks for Docker containers
> -----------------------------------------------
>
>                 Key: YARN-8259
>                 URL: https://issues.apache.org/jira/browse/YARN-8259
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>    Affects Versions: 3.0.2, 3.2.0, 3.1.1
>            Reporter: Shane Kumpf
>            Assignee: Shane Kumpf
>            Priority: Major
>              Labels: Docker
>         Attachments: YARN-8259.001.patch
>
>
> As privileged containers may execute as a user that does not match the YARN 
> run as user, sending the null signal for liveliness checks could fail. We 
> need to reconsider how liveliness checks are handled in the Docker case.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

