[ https://issues.apache.org/jira/browse/YARN-8259?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16482992#comment-16482992 ]
Eric Yang commented on YARN-8259:
---------------------------------

If I am not mistaken, DockerContainerRuntime runs as part of the node manager. If the system administrator uses the hidepid option, the yarn user might not have the rights to check whether /proc/[pid] exists. If we go with the suggested pid file check path, we might need to create an LCE operation to perform the check.

I still prefer the docker inspect command path with retry logic. In a non-blocking IO system, it is hard to avoid coding logic for retries. The investment will pay off in the long run, once each retry value is defined and tuned to make the system reliable and robust.

> Revisit liveliness checks for Docker containers
> -----------------------------------------------
>
>                 Key: YARN-8259
>                 URL: https://issues.apache.org/jira/browse/YARN-8259
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>    Affects Versions: 3.0.2, 3.2.0, 3.1.1
>            Reporter: Shane Kumpf
>            Assignee: Shane Kumpf
>            Priority: Major
>              Labels: Docker
>         Attachments: YARN-8259.001.patch
>
>
> As privileged containers may execute as a user that does not match the YARN
> run as user, sending the null signal for liveliness checks could fail. We
> need to reconsider how liveliness checks are handled in the Docker case.
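
To make the "docker inspect with retry logic" idea from the comment concrete, here is a minimal sketch in Java. It is not the code in YARN-8259.001.patch and not how YARN invokes Docker (YARN goes through the container-executor binary). The class name LivenessRetry, both method names, and the attempt count and delay parameters are hypothetical, chosen only for illustration. The only external thing assumed is the standard Docker CLI call `docker inspect --format '{{.State.Running}}' <container>`.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.function.BooleanSupplier;

/** Illustrative sketch only; names and defaults are hypothetical, not YARN's. */
public class LivenessRetry {

    /**
     * Runs the check up to maxAttempts times and returns true as soon as one
     * attempt succeeds. Sleeps delayMs between failed attempts. Returns false
     * if every attempt fails or the thread is interrupted while waiting.
     */
    public static boolean isAliveWithRetry(BooleanSupplier check,
                                           int maxAttempts,
                                           long delayMs) {
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            if (check.getAsBoolean()) {
                return true;
            }
            if (attempt < maxAttempts) {
                try {
                    Thread.sleep(delayMs);
                } catch (InterruptedException e) {
                    // Restore the interrupt flag and give up early.
                    Thread.currentThread().interrupt();
                    return false;
                }
            }
        }
        return false;
    }

    /**
     * One liveness probe via the Docker CLI. Assumes "docker" is on the PATH
     * and the caller is allowed to talk to the Docker daemon. Any failure to
     * run the command is reported as "not alive" so the caller can retry.
     */
    public static boolean dockerInspectRunning(String containerId) {
        try {
            Process p = new ProcessBuilder(
                    "docker", "inspect", "--format", "{{.State.Running}}", containerId)
                    .redirectErrorStream(true)
                    .start();
            String out = new String(
                    p.getInputStream().readAllBytes(), StandardCharsets.UTF_8).trim();
            return p.waitFor() == 0 && "true".equals(out);
        } catch (IOException e) {
            return false;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }
}
```

A caller might write `LivenessRetry.isAliveWithRetry(() -> LivenessRetry.dockerInspectRunning(id), 3, 500L)`. Because the check does not read /proc/[pid], it does not depend on what hidepid lets the yarn user see. A real implementation would probably also want to distinguish "the container has exited" from "the inspect command failed transiently", so that only the latter is retried.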