[
https://issues.apache.org/jira/browse/YARN-8259?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16482992#comment-16482992
]
Eric Yang edited comment on YARN-8259 at 5/21/18 8:15 PM:
----------------------------------------------------------
If I am not mistaken, DockerContainerRuntime runs as part of the NodeManager.
If the system administrator mounts /proc with the hidepid option, the yarn user
might not have the rights to check whether /proc/[pid] exists. If we go with
the suggested pid file check path, we might need to create an LCE operation to
perform the check.
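
To illustrate the concern: with hidepid=2 on /proc, entries for processes owned
by other users are not visible, so an existence check made as the yarn user can
report a live container process as missing. The following is only a minimal
sketch. The class ProcVisibilityCheck and its method are hypothetical names for
illustration and are not part of YARN.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper, not part of YARN: shows why a /proc/[pid] existence
// check depends on what the calling user is allowed to see.
final class ProcVisibilityCheck {

    /**
     * Returns true only if procRoot/pid is visible as a directory to the
     * calling user. Under hidepid=2 this returns false for processes owned
     * by other users even when they are alive, so "false" does not prove
     * that the process has exited.
     */
    static boolean isVisible(Path procRoot, long pid) {
        return Files.isDirectory(procRoot.resolve(Long.toString(pid)));
    }

    /** Convenience overload that uses the real /proc mount point. */
    static boolean isVisible(long pid) {
        return isVisible(Paths.get("/proc"), pid);
    }
}
```

Because a negative answer is ambiguous here, the check would have to run with
sufficient privileges, which is why an LCE operation may be needed.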

I prefer the docker inspect command path with retry logic. In a non-blocking
IO system, it is hard to avoid writing retry logic. The investment will pay off
in the long run, once each retry value is defined and tuned to make the system
reliable and robust.
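
The docker inspect path with retries could look roughly like the sketch below.
This is an assumption-laden illustration, not the patch: the class
DockerLivenessRetry, its method names, and the retry parameters are
hypothetical, and the sketch invokes the docker CLI directly, whereas the
NodeManager would normally go through the container-executor. Only the
command "docker inspect --format {{.State.Running}} <container>" is taken from
the Docker CLI itself.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.util.Optional;
import java.util.concurrent.Callable;

// Hypothetical sketch, not the YARN implementation.
final class DockerLivenessRetry {

    /**
     * Calls the probe up to maxAttempts times, sleeping sleepMs between
     * attempts. Returns the first definite answer (running or not running),
     * or Optional.empty() if every attempt failed or was inconclusive.
     */
    static Optional<Boolean> checkWithRetry(Callable<Optional<Boolean>> probe,
                                            int maxAttempts, long sleepMs) {
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                Optional<Boolean> result = probe.call();
                if (result.isPresent()) {
                    return result;
                }
            } catch (InterruptedException e) {
                // Preserve the interrupt and stop retrying.
                Thread.currentThread().interrupt();
                return Optional.empty();
            } catch (Exception e) {
                // Treated as transient here; a real caller would log it.
            }
            if (attempt < maxAttempts) {
                try {
                    Thread.sleep(sleepMs);
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return Optional.empty();
                }
            }
        }
        return Optional.empty();
    }

    /**
     * Runs "docker inspect" once. Returns empty when the command fails or
     * prints something other than "true" or "false".
     */
    static Optional<Boolean> inspectRunning(String containerId)
            throws IOException, InterruptedException {
        Process p = new ProcessBuilder("docker", "inspect",
                "--format", "{{.State.Running}}", containerId)
                .redirectErrorStream(true)
                .start();
        String first = null;
        try (BufferedReader r = new BufferedReader(new InputStreamReader(
                p.getInputStream(), StandardCharsets.UTF_8))) {
            String line;
            while ((line = r.readLine()) != null) {  // drain all output
                if (first == null) {
                    first = line.trim();
                }
            }
        }
        if (p.waitFor() != 0 || first == null) {
            return Optional.empty();
        }
        if (first.equals("true")) {
            return Optional.of(true);
        }
        if (first.equals("false")) {
            return Optional.of(false);
        }
        return Optional.empty();
    }
}
```

A caller might use it as
DockerLivenessRetry.checkWithRetry(() -> DockerLivenessRetry.inspectRunning(id), 3, 500L),
where the attempt count and sleep interval are placeholder values that would
need to be defined and tuned as configuration.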
was (Author: eyang):
If I am not mistaken, DockerContainerRuntime is running as part of node
manager. If hidepid option is used by system administrator, yarn user might
not have rights to check if /proc/[pid] exists. We might need to create a LCE
operation to perform the check, if we are going with the suggested pid file
check path.
I still prefers the docker inspect command path with retry logic. In a
non-blocking IO system, it is hard to avoid coding logic for retries. The
investment will pay off in the long run, when each retry value is defined and
optimized to make the system reliable and robust.
> Revisit liveliness checks for Docker containers
> -----------------------------------------------
>
> Key: YARN-8259
> URL: https://issues.apache.org/jira/browse/YARN-8259
> Project: Hadoop YARN
> Issue Type: Sub-task
> Affects Versions: 3.0.2, 3.2.0, 3.1.1
> Reporter: Shane Kumpf
> Assignee: Shane Kumpf
> Priority: Major
> Labels: Docker
> Attachments: YARN-8259.001.patch
>
>
> As privileged containers may execute as a user that does not match the YARN
> run as user, sending the null signal for liveliness checks could fail. We
> need to reconsider how liveliness checks are handled in the Docker case.