[ 
https://issues.apache.org/jira/browse/YARN-9126?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16723580#comment-16723580
 ] 

Chandni Singh commented on YARN-9126:
-------------------------------------

Two changes combined to cause the issue:
- YARN-7644: the cleanup of the working directory is now done asynchronously.
- YARN-8569: this introduced a sysfs directory in the container's working 
directory, which needs to be deleted when the working directory is cleaned up.
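
To illustrate the failure mode, here is a minimal, self-contained sketch. It is not the NodeManager code and not the attached patch. The class name, method names, and file names (ReinitCleanupSketch, cleanWorkDir, localize, example.json) are hypothetical. It assumes the symptom is a plain "file already exists" collision when a resource is copied into a working directory that has not been fully cleaned, nested sysfs directory included.

```java
import java.io.IOException;
import java.nio.file.FileAlreadyExistsException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical illustration only; names do not come from Hadoop.
final class ReinitCleanupSketch {

    // Synchronously delete everything under workDir, including nested
    // directories such as sysfs, while keeping workDir itself.
    static void cleanWorkDir(Path workDir) throws IOException {
        List<Path> paths;
        try (Stream<Path> walk = Files.walk(workDir)) {
            // Reverse order puts children before their parent directories.
            paths = walk.sorted(Comparator.reverseOrder())
                        .collect(Collectors.toList());
        }
        for (Path p : paths) {
            if (!p.equals(workDir)) {
                Files.delete(p);
            }
        }
    }

    // Copy a resource into workDir. Returns false if the target already
    // exists, which stands in for the localization failure described here.
    static boolean localize(Path resource, Path workDir) throws IOException {
        try {
            Files.copy(resource, workDir.resolve(resource.getFileName()));
            return true;
        } catch (FileAlreadyExistsException e) {
            return false;
        }
    }

    public static void main(String[] args) throws IOException {
        Path resource = Files.createTempFile("resource", ".txt");
        Path workDir = Files.createTempDirectory("container-work");

        // State left behind by the previous launch: a localized file
        // and a sysfs directory with content.
        Files.copy(resource, workDir.resolve(resource.getFileName()));
        Files.createDirectories(workDir.resolve("sysfs"));
        Files.createFile(workDir.resolve("sysfs").resolve("example.json"));

        System.out.println("localize before cleanup: " + localize(resource, workDir));
        cleanWorkDir(workDir);
        System.out.println("localize after cleanup:  " + localize(resource, workDir));
    }
}
```

In this sketch the copy only succeeds once the cleanup has finished. If cleanup runs asynchronously, or skips the sysfs directory, the copy can race with or collide against leftover files.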

Attached is patch 001. [~eyang], could you please take a look?

> Container reinit always fails in branch-3.2 and trunk
> -----------------------------------------------------
>
>                 Key: YARN-9126
>                 URL: https://issues.apache.org/jira/browse/YARN-9126
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>            Reporter: Eric Yang
>            Assignee: Chandni Singh
>            Priority: Major
>              Labels: docker
>         Attachments: YARN-9126.001.patch
>
>
> When upgrading a container, container reinitialization always fails with code 
> 33.  This error code means that the file being localized already exists when 
> resource files are copied.  The container retries with another container ID, 
> so the problem is masked.
> The Hadoop 3.1.x relaunch logic seems to have some way to prevent this bug 
> from happening.  The same logic might be useful in branch-3.2 and trunk.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
