[ https://issues.apache.org/jira/browse/YARN-8207?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16457085#comment-16457085 ]
Jason Lowe commented on YARN-8207: ---------------------------------- Thanks for the patch! Nit: chosen_container_log_dir should be static. init_log_path should use make_string instead of manually calculating string lengths. It would make the code easier to read and maintain if the argv pointer array and size of the array (i.e. \*\*out and \*index) were passed around as a structure instead of separate function arguments. It's easier to treat them together as a single object. For example, it would be simple for free_buffer to not only free all the arguments but also reset the size to zero at the same time since it has access to the whole structure. DOCKER_ARG_MAX is sometimes used as a maximum argument count and other times used as a character buffer count which is inconsistent. For example, the following code has nothing to do with the maximum argument count for the docker command. It should be using make_string to form the command rather than trying to manually size the buffer. {code} size_t command_size = DOCKER_ARG_MAX; char* proc_pid_path = alloc_and_clear_memory(command_size, sizeof(char)); snprintf(proc_pid_path, command_size, "%s/%d", PROC_PATH, pid); {code} docker_command_with_binary is initialized to an allocated buffer and then overwritten with the result of the call to flatten. docker_wait_command and docker_rm_command are used to track an allocated buffer that is never used. docker_inspect_command and docker_inspect_exitcode_command are tracking a DOCKER_ARG_MAX buffer used for popen. Should these be using execv instead of popen as part of this change? If not, DOCKER_ARG_MAX isn't an appropriate buffer sizer for those per above discussion. I think we can simply use make_string for those. stderr.txt is fopen'd and never used before the fclose? Are we supposed to dup2 these file descriptors to 1 and 2 before the execv so any errors from docker run appear in those output files? The parent process that is responsible for obtaining the pid is not waiting for the child to complete before running the inspect command. That's why retries had to be added to get it to work when they were not needed before. The parent should simply wait and check for error exit codes as it did before when it was using popen. After that we can ditch the retries since they won't be necessary. Nit: flatten would be more efficient and quite a bit simpler if written with stpcpy rather than strlen and strncpy. add_param_to_command and add_param_to_command_if_allowed should use make_string instead of guessing at the required buffer size. add_param_to_command_if_allowed is trying to reset the index on errors, but it fails to re-NULL out the written index values (if there were any). Either we should assume all errors are fatal and therefore the buffer doesn't need to be reset or the reset logic needs to be fixed. Why is get_docker_inspect_command calling free_buffer at the beginning? Index is left unmodified and we're not nulling the values after freeing them in free_buffer, so I'm not sure what the intent is with that call here. I think the method should simply append to the buffer, and it is the responsibility of the caller to make sure the buffer is in the desired state before calling the function that will append the commands. Similar comment for other places free_buffer is called towards the beginning of a function. The set_env changes look relevant to YARN-7654 rather than this JIRA. get_docker_run_command should use make_string. Why does make_string calculate size = n + 2 instead of n + 1? Style nit: space between realloc and parentheses in make_string I didn't quite get through everything today, and I'll try to get to the rest of it early next week. > Docker container launch use popen have risk of shell expansion > -------------------------------------------------------------- > > Key: YARN-8207 > URL: https://issues.apache.org/jira/browse/YARN-8207 > Project: Hadoop YARN > Issue Type: Sub-task > Components: yarn-native-services > Reporter: Eric Yang > Assignee: Eric Yang > Priority: Major > Attachments: YARN-8207.001.patch > > > Container-executor code utilize a string buffer to construct docker run > command, and pass the string buffer to popen for execution. Popen spawn a > shell to run the command. Some arguments for docker run are still vulnerable > to shell expansion. The possible solution is to convert from char * buffer > to string array for execv to avoid shell expansion. -- This message was sent by Atlassian JIRA (v7.6.3#76005) --------------------------------------------------------------------- To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org