[
https://issues.apache.org/jira/browse/YARN-8207?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16457139#comment-16457139
]
Eric Yang edited comment on YARN-8207 at 4/27/18 11:03 PM:
-----------------------------------------------------------
[~jlowe] Thank you for the review. Good suggestions on coding style issues. I
will fix the coding style issues.
{quote}
stderr.txt is fopen'd and never used before the fclose? Are we supposed to dup2
these file descriptors to 1 and 2 before the execv so any errors from docker
run appear in those output files?{quote}
When using launch_script.sh, there is stdout and stderr redirection inside
launch_script.sh which bind-mount to host log directory. This is the reason
that there is fopen and fclosed immediately until YARN-7654 logic are added.
{quote}The parent process that is responsible for obtaining the pid is not
waiting for the child to complete before running the inspect command. That's
why retries had to be added to get it to work when they were not needed before.
The parent should simply wait and check for error exit codes as it did before
when it was using popen. After that we can ditch the retries since they won't
be necessary.{quote}
Using launch_script.sh, container-executor runs "docker run" with detach
option. It assumes the exit code can be obtained quickly. This is the reason
there is no logic for retry "docker inspect". This assumption is some what
flawed. If the docker image is unavailable on the host, docker will show
download progress and some other information and errors. The progression are
not captured, which is difficult to debug. When docker inspect is probed,
there is no information of what failed.
Without launch_script.sh, container-executor runs "docker run" in the
foreground, and obtain pid when the first process is started. Inspect command
is checked asynchronously because docker run exit code is only reported when
the docker process is terminated. There is a balance between how long that we
should wait before we decide if the system is hang. We can make MAX_RETRIES
configurable in case people like to wait for longer or period of time before
deciding if the container should fail.
{quote}Why does make_string calculate size = n + 2 instead of n + 1?{quote}
This change makes make_string function twice faster than sample code while
waste 1% or less space if recursion is required. It is probably a reasonable
trade off for modern day computers.
was (Author: eyang):
[~jlowe] Thank you for the review. Good suggestions on coding style issues. I
will fix the coding style issues.
{quote}
stderr.txt is fopen'd and never used before the fclose? Are we supposed to dup2
these file descriptors to 1 and 2 before the execv so any errors from docker
run appear in those output files?{quote}
When using launch_script.sh, there is stdout and stderr redirection inside
launch_script.sh which bind-mount to host log directory. This is the reason
that there is fopen and fclosed immediately until YARN-7654 logic are added.
{quote}The parent process that is responsible for obtaining the pid is not
waiting for the child to complete before running the inspect command. That's
why retries had to be added to get it to work when they were not needed before.
The parent should simply wait and check for error exit codes as it did before
when it was using popen. After that we can ditch the retries since they won't
be necessary.{quote}
Using launch_script.sh, container-executor runs "docker run" with detach
option. It assumes the exit code can be obtained quickly. This is the reason
there is no logic for retry "docker inspect". This assumption is some what
flawed. If the docker image is unavailable on the host, docker will show
download progress and some other information and errors. The progression are
not captured, which is difficult to debug. When docker inspect is probed,
there is no information of what failed.
Without launch_script.sh, container-executor runs "docker run" in the
foreground, and obtain pid when the first process is started. Inspect command
is checked asynchronously because docker run exit code is only reported when
the docker process is terminated. There is a balance between how long that we
should wait before we decide if the system is hang. We can make MAX_RETRIES
configurable in case people like to wait for longer or period of time before
deciding if the container should fail.
{quote}Why does make_string calculate size = n + 2 instead of n + 1?{quote}
This change make make_string function twice faster than sample code while waste
1% or less space if recursion is required. It is probably a reasonable trade
off for modern day computers.
> Docker container launch use popen have risk of shell expansion
> --------------------------------------------------------------
>
> Key: YARN-8207
> URL: https://issues.apache.org/jira/browse/YARN-8207
> Project: Hadoop YARN
> Issue Type: Sub-task
> Components: yarn-native-services
> Reporter: Eric Yang
> Assignee: Eric Yang
> Priority: Major
> Attachments: YARN-8207.001.patch
>
>
> Container-executor code utilize a string buffer to construct docker run
> command, and pass the string buffer to popen for execution. Popen spawn a
> shell to run the command. Some arguments for docker run are still vulnerable
> to shell expansion. The possible solution is to convert from char * buffer
> to string array for execv to avoid shell expansion.
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]