[
https://issues.apache.org/jira/browse/YARN-8672?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16661198#comment-16661198
]
Jason Lowe commented on YARN-8672:
----------------------------------
Thanks for the analysis and patch!

I believe the patch will fix the issue but cause another one in the process.
If I'm reading it properly, canceling a localizer will leak the localizer's
tokens file. For example, if a container is in the process of localizing and
gets killed, we will cancel the localizer, which will prevent its tokens file
from being deleted. I'd expect localizer tokens files to pile up as a result.

It seems odd to me that we create localizer tokens _and_ container tokens.
It seems like we only need one of these, and the container tokens have the
benefit of getting cleaned up automatically as part of removing the container
directories. If for some reason we have to keep them separate, then we could
change the path of the localizer tokens to be under the same nmPrivate
directory used for the container. That way we don't have to be so careful about
removing these files as part of cleaning up localizers; they will be cleaned up
automatically as part of cleaning up the container.
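
To illustrate the second suggestion, here is a minimal standalone sketch. It
is not NodeManager code: the class name, the method names, and the
<nmPrivate>/<containerId>/<localizerId>.tokens layout are all assumptions made
up for the example. It only shows that a tokens file placed under the
container's private directory goes away when that directory is removed, even
if the localizer that wrote it was canceled and never cleaned up after itself.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class PrivateDirCleanupSketch {

  // Hypothetical layout: <nmPrivate>/<containerId>/<localizerId>.tokens
  static Path writeLocalizerTokens(Path containerPrivateDir, String localizerId)
      throws IOException {
    Files.createDirectories(containerPrivateDir);
    Path tokens = containerPrivateDir.resolve(localizerId + ".tokens");
    // Placeholder bytes stand in for real credentials.
    return Files.write(tokens, new byte[] {0});
  }

  // Stand-in for container cleanup: recursively delete the private directory.
  static void deleteContainerPrivateDir(Path containerPrivateDir)
      throws IOException {
    if (!Files.exists(containerPrivateDir)) {
      return;
    }
    List<Path> childrenFirst;
    try (Stream<Path> paths = Files.walk(containerPrivateDir)) {
      // Reverse order puts files before the directories that contain them.
      childrenFirst = paths.sorted(Comparator.reverseOrder())
          .collect(Collectors.toList());
    }
    for (Path p : childrenFirst) {
      Files.delete(p);
    }
  }

  public static void main(String[] args) throws IOException {
    Path nmPrivate = Files.createTempDirectory("nmPrivate");
    Path containerDir = nmPrivate.resolve("container_0001");
    Path tokens = writeLocalizerTokens(containerDir, "localizer_a");

    // Assume the localizer is canceled here and never deletes its tokens file.

    // Cleaning up the container removes the tokens file as a side effect.
    deleteContainerPrivateDir(containerDir);
    System.out.println("tokens file still exists: " + Files.exists(tokens));

    Files.delete(nmPrivate);
  }
}
```

With this layout the localizer cancel path needs no token-specific cleanup,
which is the property I'm after. Whether it fits the real localization code is
a separate question.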
> TestContainerManager#testLocalingResourceWhileContainerRunning occasionally
> times out
> -------------------------------------------------------------------------------------
>
> Key: YARN-8672
> URL: https://issues.apache.org/jira/browse/YARN-8672
> Project: Hadoop YARN
> Issue Type: Bug
> Components: nodemanager
> Affects Versions: 3.2.0
> Reporter: Jason Lowe
> Assignee: Chandni Singh
> Priority: Major
> Attachments: YARN-8672.001.patch, YARN-8672.002.patch
>
>
> Precommit builds have been failing in
> TestContainerManager#testLocalingResourceWhileContainerRunning. I have been
> able to reproduce the problem without any patch applied if I run the test
> enough times. It looks like something is removing container tokens from the
> nmPrivate area just as a new localizer starts.