[
https://issues.apache.org/jira/browse/YARN-2185?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16334872#comment-16334872
]
Jason Lowe commented on YARN-2185:
----------------------------------
Thanks for updating the patch!
Should a SuppressWarnings("deprecation") be added? I personally would rather
see that with a comment next to the call site explaining why we're using a
deprecated method rather than add yet another warning to the pile, but I'm
curious what others think here.
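
To make the suggestion above concrete, here is a minimal sketch of a narrowly
scoped suppression with the justification kept next to the call site. The
names LegacyApi, legacyLength, and DeprecationExample are hypothetical
stand-ins, not identifiers from the patch.

```java
/** Hypothetical stand-in for the class that owns the deprecated method. */
final class LegacyApi {
  private LegacyApi() {}

  @Deprecated
  static int legacyLength(String s) {
    return s.length();
  }
}

final class DeprecationExample {
  private DeprecationExample() {}

  // Suppress only on the method that contains the call, not on the class.
  @SuppressWarnings("deprecation")
  static int measure(String s) {
    // legacyLength is deprecated, but (in this hypothetical) no replacement
    // offers the same behavior yet. Revisit once one is available.
    return LegacyApi.legacyLength(s);
  }
}
```
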
There were checks for paths with embedded single quotes, and those checks are
now missing. The code should escape single quotes in the filename so the shell
does not mis-parse the command.
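
One way to do the escaping, as a rough sketch: wrap the argument in single
quotes and rewrite each embedded single quote as '\'' (close the quoted
string, emit an escaped quote, reopen the quoted string). The class and method
names here are hypothetical and assume a POSIX shell.

```java
final class ShellQuoting {
  private ShellQuoting() {}

  /**
   * Quotes an argument for a POSIX shell. Inside single quotes every
   * character is literal, so an embedded single quote has to end the quoted
   * string, add a backslash-escaped quote, and start a new quoted string.
   */
  static String singleQuote(String arg) {
    return "'" + arg.replace("'", "'\\''") + "'";
  }
}
```

For example, a file named it's.tar becomes 'it'\''s.tar' on the command line.
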
runCommandOnStream only creates a thread pool and reads the subprocess stdout
and stderr if logging is enabled. If the subprocess produces too much output
on either channel, this will deadlock: the child process stops consuming
input while it waits for its output to be drained, and the parent process
stays blocked waiting for the child to consume more input. We need to consume
the subprocess stdout and stderr even if we do not intend to log them. If the
data is not being logged or otherwise acted upon, it can simply be thrown
away.
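
A minimal sketch of the draining described above, assuming a caller-supplied
ExecutorService. StreamDrainer and drain are hypothetical names, not part of
the patch. The task reads until end of stream and discards the bytes, so the
child never blocks on a full pipe.

```java
import java.io.InputStream;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Future;

final class StreamDrainer {
  private StreamDrainer() {}

  /**
   * Reads the stream to end-of-file on a pool thread and throws the data
   * away. The future completes with the number of bytes discarded.
   */
  static Future<Long> drain(InputStream in, ExecutorService pool) {
    Callable<Long> task = () -> {
      byte[] buf = new byte[8192];
      long total = 0;
      int n;
      while ((n = in.read(buf)) != -1) {
        total += n; // the bytes themselves are intentionally ignored
      }
      return total;
    };
    return pool.submit(task);
  }
}
```

The idea would be to call drain on both process.getInputStream() and
process.getErrorStream() before writing anything to the subprocess, whether or
not logging is enabled.
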
Speaking of throwing away subprocess output, if the tar command fails there
will be nothing but an exit code to help figure out what went wrong. The
existing unTarUsingTar reports the error output via the ShellCommandExecutor.
I think runCommandOnStream should throw an exception (e.g., ExitCodeException
or something similar) containing the error output if the subprocess does not
return a zero exit code.
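
As a rough illustration of that behavior, the sketch below feeds a stream to
a subprocess, discards stdout, captures stderr, and puts the captured text
into the exception when the exit code is non-zero. PipedCommand and run are
hypothetical names, and a plain IOException stands in for ExitCodeException
so that the example does not depend on Hadoop classes.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

final class PipedCommand {
  private PipedCommand() {}

  /** Pipes input to the command's stdin; throws with stderr text on failure. */
  static void run(InputStream input, String... command)
      throws IOException, InterruptedException {
    Process p = new ProcessBuilder(command).start();
    ExecutorService pool = Executors.newFixedThreadPool(2);
    ByteArrayOutputStream errBytes = new ByteArrayOutputStream();
    try {
      // Start both readers before writing so the child cannot block on output.
      Future<?> out = pool.submit(() -> { copy(p.getInputStream(), null); return null; });
      Future<?> err = pool.submit(() -> { copy(p.getErrorStream(), errBytes); return null; });
      IOException writeError = null;
      try (OutputStream stdin = p.getOutputStream()) {
        copy(input, stdin);
      } catch (IOException e) {
        writeError = e; // the child may have exited early; check its exit code first
      }
      int exit = p.waitFor();
      out.get();
      err.get(); // make sure all of stderr has been collected
      if (exit != 0) {
        String stderr = new String(errBytes.toByteArray(), StandardCharsets.UTF_8).trim();
        throw new IOException("Command '" + String.join(" ", command)
            + "' exited with code " + exit + ": " + stderr);
      }
      if (writeError != null) {
        throw writeError;
      }
    } catch (ExecutionException e) {
      throw new IOException("Failed reading subprocess output", e.getCause());
    } finally {
      pool.shutdownNow();
    }
  }

  /** Copies in to out until end-of-file; a null out discards the data. */
  private static void copy(InputStream in, OutputStream out) throws IOException {
    byte[] buf = new byte[8192];
    int n;
    while ((n = in.read(buf)) != -1) {
      if (out != null) {
        out.write(buf, 0, n);
      }
    }
  }
}
```
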
> Use pipes when localizing archives
> ----------------------------------
>
> Key: YARN-2185
> URL: https://issues.apache.org/jira/browse/YARN-2185
> Project: Hadoop YARN
> Issue Type: Improvement
> Components: nodemanager
> Affects Versions: 2.4.0
> Reporter: Jason Lowe
> Assignee: Miklos Szegedi
> Priority: Major
> Attachments: YARN-2185.000.patch, YARN-2185.001.patch,
> YARN-2185.002.patch, YARN-2185.003.patch, YARN-2185.004.patch,
> YARN-2185.005.patch, YARN-2185.006.patch, YARN-2185.007.patch,
> YARN-2185.008.patch
>
>
> Currently the nodemanager downloads an archive to a local file, unpacks it,
> and then removes it. It would be more efficient to stream the data as it's
> being unpacked to avoid both the extra disk space requirements and the
> additional disk activity from storing the archive.