[
https://issues.apache.org/jira/browse/YARN-8672?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16677311#comment-16677311
]
Chandni Singh commented on YARN-8672:
-------------------------------------
[~eyang] Please see below:
In {{DefaultContainerExecutor.startLocalizer(LocalizerStartContext ctx)}}, the
token file path is read from the start context and the file is then copied to
{{appStorageDir/<containerId.token>}}. The {{appStorageDir}} is then set as the
working directory for {{ContainerLocalizer}}. This copy is the file that
{{runLocalization}} reads, so patch 005 is not going to break that method.
{code:java}
Path nmPrivateContainerTokensPath = ctx.getNmPrivateContainerTokens();
String tokenFn =
    String.format(ContainerLocalizer.TOKEN_FILE_NAME_FMT, locId);
Path tokenDst = new Path(appStorageDir, tokenFn);
copyFile(nmPrivateContainerTokensPath, tokenDst, user);
LOG.info("Copying from " + nmPrivateContainerTokensPath
    + " to " + tokenDst);
...
localizerFc.setWorkingDirectory(appStorageDir);
{code}
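To illustrate the copy-then-read-relative flow described above, here is a minimal standalone sketch that uses only {{java.nio.file}} rather than the Hadoop {{FileContext}} API. The class name {{TokenCopySketch}}, the helper names {{stageToken}} and {{readTokenFromWorkingDir}}, and the {{"%s.tokens"}} format value are assumptions for illustration and are not taken from the patch.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

public class TokenCopySketch {
    // Assumption: stands in for ContainerLocalizer.TOKEN_FILE_NAME_FMT.
    static final String TOKEN_FILE_NAME_FMT = "%s.tokens";

    // Hypothetical helper: copy the token file from the NM-private area
    // into the localizer's working directory, named after the localizer id.
    static Path stageToken(Path nmPrivateTokens, Path appStorageDir, String locId)
            throws IOException {
        Path tokenDst =
            appStorageDir.resolve(String.format(TOKEN_FILE_NAME_FMT, locId));
        Files.copy(nmPrivateTokens, tokenDst, StandardCopyOption.REPLACE_EXISTING);
        return tokenDst;
    }

    // Hypothetical helper: read the token the way the localizer would,
    // resolving only the file name against its working directory.
    static byte[] readTokenFromWorkingDir(Path workingDir, String locId)
            throws IOException {
        Path tokenPath =
            workingDir.resolve(String.format(TOKEN_FILE_NAME_FMT, locId));
        return Files.readAllBytes(tokenPath);
    }

    public static void main(String[] args) throws IOException {
        Path nmPrivate = Files.createTempDirectory("nmPrivate");
        Path appStorageDir = Files.createTempDirectory("appStorage");
        Path src = nmPrivate.resolve("container_1.tokens");
        Files.write(src, "secret".getBytes(StandardCharsets.UTF_8));

        stageToken(src, appStorageDir, "container_1");
        // Removing the NM-private original does not affect the staged copy.
        Files.delete(src);

        byte[] read = readTokenFromWorkingDir(appStorageDir, "container_1");
        System.out.println(new String(read, StandardCharsets.UTF_8));
    }
}
```

In this sketch, the read succeeds after the original is deleted because it depends only on the copy in the working directory. That is the property being relied on for the {{DefaultContainerExecutor}} path.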
In {{LinuxContainerExecutor.startLocalizer(LocalizerStartContext ctx)}}, the
token file path is appended to the arguments of the privileged operation.
{code}
initializeContainerOp.appendArgs(
    runAsUser,
    user,
    Integer.toString(
        PrivilegedOperation.RunAsUserCommand.INITIALIZE_CONTAINER
            .getValue()),
    appId,
    locId,
    nmPrivateContainerTokensPath.toUri().getPath().toString(),
    StringUtils.join(PrivilegedOperation.LINUX_FILE_PATH_SEPARATOR,
        localDirs),
    StringUtils.join(PrivilegedOperation.LINUX_FILE_PATH_SEPARATOR,
        logDirs));
{code}
I assumed the token file is copied to the working directory when the privileged
operation is executed.
The {{ContainerLocalizer.run}} method does assume that the token file is in the
current working directory.
{code}
Path tokenPath =
    new Path(String.format(TOKEN_FILE_NAME_FMT, localizerId));
credFile = lfs.open(tokenPath);
creds.readTokenStorageStream(credFile);
{code}
cc [~jlowe] [[email protected]]
> TestContainerManager#testLocalingResourceWhileContainerRunning occasionally
> times out
> -------------------------------------------------------------------------------------
>
> Key: YARN-8672
> URL: https://issues.apache.org/jira/browse/YARN-8672
> Project: Hadoop YARN
> Issue Type: Bug
> Components: nodemanager
> Affects Versions: 3.2.0
> Reporter: Jason Lowe
> Assignee: Chandni Singh
> Priority: Major
> Attachments: YARN-8672.001.patch, YARN-8672.002.patch,
> YARN-8672.003.patch, YARN-8672.004.patch, YARN-8672.005.patch
>
>
> Precommit builds have been failing in
> TestContainerManager#testLocalingResourceWhileContainerRunning. I have been
> able to reproduce the problem without any patch applied if I run the test
> enough times. It looks like something is removing container tokens from the
> nmPrivate area just as a new localizer starts.