[jira] [Commented] (YARN-6623) Add support to turn off launching privileged containers in the container-executor

Eric Badger (JIRA) Fri, 25 Aug 2017 15:47:37 -0700

    [ 
https://issues.apache.org/jira/browse/YARN-6623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16142372#comment-16142372
 ]


Eric Badger commented on YARN-6623:
-----------------------------------

{noformat}
container-executor.c:run_docker()

+  docker_command = construct_docker_command(command_file);
+  docker_binary = get_docker_binary(&CFG);

I don’t see these getting freed. Am I missing this invocation somewhere?
container-executor.c:run_docker()

We call the execvp function, the running program will get replaced by the 
docker invocation.
{noformat}
Ahhh yes of course. Disregard my comment 

{noformat}
if (command == NULL || (strcmp(command, docker_command) != 0)) {
  ret = INCORRECT_COMMAND;
}

Is command guaranteed to be null-terminated? It comes from the configuration 
file, which is a Java creation and I don’t think Java null-terminates. This 
comment is probably relevant for quite a few other places that are doing string 
operations. We need to be very safe about this or else we risk possibly 
overrunning into random regions of the heap. A safe alternative would be to use 
the “n” version of all the string operations. This patch uses a mixed bag of 
the regular versions and their accompanying “n” versions. I don’t quite 
understand the reasoning behind the usage of each. If we guarantee that the 
string is null terminated (and always null terminated) then we don’t need the 
“n” versions. But even if we guarantee that the input string is null 
terminated, it may accidentally have the null character chopped off by an off 
by 1 error in a strdup or something like that. So my preference here would be 
to use the “n” versions of all of the string functions. Thoughts?
command is guaranteed to be null terminated by the confiugration functions. If 
we use the 'n' versions of the function we end up doing a 'begins with' match 
instead of an exact match which could cause problems(e.g. "inspect" would match 
"inspectcommand")
{noformat}
Yea I guess that makes sense. I still think we should be consistent on the 
usage of the str vs strn functions, though. There are still places in the patch 
using both, seemingly interchangeably. 

{noformat}
docker-util.c: get_docker_load_command()

+  image_name = get_configuration_value("image", DOCKER_COMMAND_FILE_SECTION, 
&command_config);
+  if (image_name == NULL) {
+    return INVALID_DOCKER_IMAGE_NAME;
+  }
+
+  memset(out, 0, outlen);

Maybe being paranoid is a good thing, but get_docker_load_command and any of 
the other commands from get_docker_command are always passed in a buffer that 
has already been nulled out, so the memset here is redundant. I imagine it will 
be similar in the other get_*_comand functions. Maybe we should put a comment 
saying that it is the caller’s responsibility to null out the buffer that they 
pass in and get rid of these memsets? Either that or keep the memset here and 
get rid of the resets higher up in get_docker_command?
I've gotten rid of the memset in the higher calls and left the memset in the 
get_docker_* functions. I just feel more comfortable having the memset there.
{noformat}
That sounds good to me

{noformat}
docker-util.c: get_docker_load_command()

+  image_name = get_configuration_value("image", DOCKER_COMMAND_FILE_SECTION, 
&command_config);
+  if (image_name == NULL) {
+    return INVALID_DOCKER_IMAGE_NAME;
+  }

We should validate the image name here as we do in get_docker_pull_command 
below.
You can pass tar files to docker load so we can't do much validation here. 
Docker save lets you save the image to any file you want and then you load it 
using docker load.
{noformat}
Ah that’s true, good point.

{noformat}
docker-util.c

+static int add_param_to_command(const struct configuration *command_config, 
const char *key, const char *param,
+                                 const int with_argument, char *out, const 
size_t outlen) {
+  size_t tmp_buffer_size = 1024;

What’s the reasoning behind the buffer size here? It’s hardcoded to 1024 in a 
bunch of places. This doesn’t follow the system path limit, the 
EXECUTOR_PATH_MAX or the system argument length.
It's just a temporary buffer. The size doesn't matter too much because 
quote_and_append_arg will resize the buffer if it's too small.
{noformat}
That makes sense. Shouldn’t we bound the size that {{quote_and_append()}} can 
{{realloc()}} to, though? I think the proper size would be {{MIN(_SC_ARG_MAX, 
128*1024)}}

{noformat}
docker-util.c:set_network()

+  } else if (value == NULL) {
+    ret = 0;
+    goto free_and_exit;

Since we didn’t find any network in the configuration, we didn’t set a network. 
That seems like it should print out an error and/or fail. Right now it will 
just silently do nothing.
You can launch docker containers with no network specified. It'll just pick up 
the default network.
{noformat}
Ah ok. No problem then

{noformat}
docker-util.c:check_mount_permitted()

+    // directory check
+    permitted_mount_len = strlen(permitted_mounts[i]);
+    if (permitted_mount_len > 0
+        && permitted_mounts[i][permitted_mount_len - 1] == '/') {
+      if (strncmp(requested, permitted_mounts[i], permitted_mount_len) == 0) {
+        return 1;

This looks potentially dangerous. Though we are normalizing the paths in all of 
the current invocations, someone in the future may add in another invocation 
that doesn’t normalize. I think at a bare minimum we need a comment saying that 
this function requires all symlinks to be resolved in the path that is passed 
to this function. Because if we allow symlinks to be followed then we will get 
different behavior because of how docker treats symlinks. A possible fix for 
this could be to combine get_src_mount() and get_real_src_mount() into a single 
function that parses the mount string to get the source and then resolves it.
I've moved the normalize_mount call into the check_mount_permitted function. 
Does that address your concern?
{noformat}
Yes, this should fix it since we’re checking exactly what we will be mounting 
against exactly what is allowed

{noformat}
docker-util.c:set_capabilities()

+  if (values != NULL) {
+    if(permitted_values != NULL) {
+      for (i = 0; values[i] != NULL; ++i) {
+        memset(tmp_buffer, 0, tmp_buffer_size);
+        permitted = 0;
+        for (j = 0; permitted_values[j] != NULL; ++j) {
+          if (strcmp(values[i], permitted_values[j]) == 0) {
+            permitted = 1;
+            break;
+          }
+        }

I think this is the 3rd time I’ve seen this structure. I think it would be nice 
if we had a single function that took in a double array for permitted_values, 
and the value to be checked to see whether that value is permitted. That way we 
don’t have to duplicate all of the looping and can minimize the amount of code 
that we need to maintain. We could use this for networks, capabilities, mounts, 
devices, etc.
Fixed. I added a new function called add_param_to_command_if_allowed which if 
used for network, capabilities and devices. Mounts has it's own function due to 
the read-write behaviour.
{noformat}
Perfect, this will compact the code quite a bit.

{noformat}
As a general comment for docker-util.c, it would be nice if we could 
consolidate some of the code in the get_command functions and in the set 
functions. There is a lot of redundant code in there. I’m worried that someone 
in the future will change part of the flow in one function (possibly a bug fix) 
that will then not get propagated to the rest of the similar code. This 
wouldn't necessarily have to be a part of this patch, but it might get kicked 
down the line and never get in if we don't do it now.
I think I've addressed some of this as part of the latest patch. Can you take a 
look and let me know?
{noformat}
I think you’ve done a good job consolidating. Especially with the permitted 
values. The functions are much leaner now.

{noformat}
Thank you the comprehensive review. Very much appreciated.
{noformat}
My pleasure!

{noformat}
+static int add_mounts(const struct configuration *command_config, const struct 
configuration *conf, const char *key,
+                      const int ro, char *out, const size_t outlen) {
+  size_t tmp_buffer_size = 1024;
+  const char *ro_suffix = "";
+  const char *tmp_path_buffer[2] = {NULL, NULL};
+  char *tmp_buffer = (char *) alloc_and_clear_memory(tmp_buffer_size, 
sizeof(char));
+  char **permitted_ro_mounts = 
get_configuration_values_delimiter("allowed.ro-mounts",
+                                                                  
CONTAINER_EXECUTOR_CFG_DOCKER_SECTION, conf, ",");
+  char **permitted_rw_mounts = 
get_configuration_values_delimiter("allowed.rw-mounts",
+                                                                  
CONTAINER_EXECUTOR_CFG_DOCKER_SECTION, conf, ",");
+  char **values = get_configuration_values_delimiter(key, 
DOCKER_COMMAND_FILE_SECTION, command_config, ",");
+  char *tmp_buffer_2 = NULL, *mount_src = NULL;
+  const char *container_executor_cfg_path = 
normalize_mount(get_config_path(""));
+  int i = 0, permitted_rw = 0, permitted_ro = 0, ret = 0;
+  if (ro != 0) {
+    ro_suffix = ":ro";
+  }
+
+  ret = normalize_mounts(permitted_ro_mounts);
+  ret |= normalize_mounts(permitted_rw_mounts);
+  if (ret != 0) {
+    fprintf(ERRORFILE, "Unable to find permitted docker mounts on disk\n");
+    ret = MOUNT_ACCESS_ERROR;
+    goto free_and_exit;
+  }
+
+  if (values != NULL) {
+    for (i = 0; values[i] != NULL; ++i) {
+      mount_src = get_mount_source(values[i]);
+      if (mount_src == NULL) {
+        ret = INVALID_DOCKER_MOUNT;
+        goto free_and_exit;
+      }
+      permitted_rw = check_mount_permitted((const char **) 
permitted_rw_mounts, mount_src);
+      permitted_ro = check_mount_permitted((const char **) 
permitted_ro_mounts, mount_src);
+      if (permitted_ro == -1 || permitted_rw == -1) {
+        ret = INVALID_DOCKER_MOUNT;
+        goto free_and_exit;
+      }
+      // rw mount
+      if (ro == 0) {
+        if (permitted_rw == 0) {
+          fprintf(ERRORFILE, "Invalid docker rw mount '%s', realpath=%s\n", 
values[i], mount_src);
+          ret = INVALID_DOCKER_RW_MOUNT;
+          goto free_and_exit;
+        } else {
+          // determine if the user can modify the container-executor.cfg file
+          tmp_path_buffer[0] = normalize_mount(mount_src);
+          // just re-use the function, flip the args to check if the 
container-executor path is in the requested
+          // mount point
+          ret = check_mount_permitted(tmp_path_buffer, 
container_executor_cfg_path);
+          free((void *) tmp_path_buffer[0]);
+          if (ret == 1) {
+            fprintf(ERRORFILE, "Attempting to mount a parent directory '%s' of 
container-executor.cfg as read-write\n",
+                    values[i]);
+            ret = INVALID_DOCKER_RW_MOUNT;
+            goto free_and_exit;
+          }
+        }
+      }
+      //ro mount
+      if (ro != 0 && permitted_ro == 0 && permitted_rw == 0) {
+        ret = INVALID_DOCKER_RO_MOUNT;
+        goto free_and_exit;
+      }
+      tmp_buffer_2 = (char *) alloc_and_clear_memory(strlen(values[i]) + 
strlen(ro_suffix) + 1, sizeof(char));
+      strncpy(tmp_buffer_2, values[i], strlen(values[i]));
+      strncpy(tmp_buffer_2 + strlen(values[i]), ro_suffix, strlen(ro_suffix));
+      quote_and_append_arg(&tmp_buffer, &tmp_buffer_size, "-v ", tmp_buffer_2);
+      ret = add_to_buffer(out, outlen, tmp_buffer);
+      free(tmp_buffer_2);
+      free(mount_src);
+      tmp_buffer_2 = NULL;
+      mount_src = NULL;
+      memset(tmp_buffer, 0, tmp_buffer_size);
+      if (ret != 0) {
+        ret = BUFFER_TOO_SMALL;
+        goto free_and_exit;
+      }
+    }
+  }
+
+  free_and_exit:
+  free_values(permitted_ro_mounts);
+  free_values(permitted_rw_mounts);
+  free_values(values);
+  free(mount_src);
+  free((void *) container_executor_cfg_path);
+  free(tmp_buffer);
+  if (ret != 0) {
+    memset(out, 0, outlen);
+  }
+  return ret;
+}
{noformat}
As an additional comment on my second pass, I don’t think need to call 
{{normalize_mounts()}} up front since {{values}} might not have anything in it. 
Also, we might have only RO or only RW mounts, so we don’t need to necessarily 
compute {{permitted_rw}} or {{permitted_ro}}. We only need to if we’re actually 
going to mount RW or RO respectively. 

The rest of your responses that I didn't touch on looked like they were fixed 
correctly, so I left them out of this comment for brevity.

> Add support to turn off launching privileged containers in the 
> container-executor
> ---------------------------------------------------------------------------------
>
>                 Key: YARN-6623
>                 URL: https://issues.apache.org/jira/browse/YARN-6623
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: nodemanager
>            Reporter: Varun Vasudev
>            Assignee: Varun Vasudev
>            Priority: Blocker
>         Attachments: YARN-6623.001.patch, YARN-6623.002.patch, 
> YARN-6623.003.patch, YARN-6623.004.patch, YARN-6623.005.patch, 
> YARN-6623.006.patch
>
>
> Currently, launching privileged containers is controlled by the NM. We should 
> add a flag to the container-executor.cfg allowing admins to disable launching 
> privileged containers at the container-executor level.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

[jira] [Commented] (YARN-6623) Add support to turn off launching privileged containers in the container-executor

Reply via email to