[ 
https://issues.apache.org/jira/browse/YARN-2701?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14177477#comment-14177477
 ] 

zhihai xu commented on YARN-2701:
---------------------------------

Hi [~xgong], thanks for the details explanation. The explanation sounds 
reasonable to me. 

Some nits:
1. since we only check permission for the final directory component,
I think we also need check finalComponent in the first call check_permission.
change
{code}
        } else if (check_permission(sb.st_mode, perm) == -1) {
{code}
to
{code}
        } else if (finalComponent == 1 && check_permission(sb.st_mode, perm) == 
-1) {
{code}

2. Can we create a new function check_dir to remove the duplicate code which 
verify the existing directory at two places?
We can also remove function check_permission by moving check_permission code 
into check_dir.
This is check_dir function:
{code}
int check_dir(char* npath, mode_t st_mode, mode_t desired, int finalComponent) {
    // Check whether it is a directory
    if (!S_ISDIR (st_mode)) {
      fprintf(LOGFILE, "Path %s is file not dir\n", npath);
      return -1;
   } else if (finalComponent == 1) {
  int filePermInt = st_mode & (S_IRWXU | S_IRWXG | S_IRWXO);
  int desiredInt = desired & (S_IRWXU | S_IRWXG | S_IRWXO);
  if (filePermInt != desiredInt) {
      fprintf(LOGFILE, "Path %s does not have desired permission.\n", npath);
      return -1;
    }
   }
   return 0;
}
{code}

3. Can we move free(npath); from create_validate_dirs to mkdirs?
It will be better to free the memory at the same function(mkdirs) which 
allocated the memory.
in mkdirs 
{code}
if (create_validate_dirs(npath, perm, path, 0) == -1) {
      free(npath);
       return -1;
     }
{code}

4. a little more optimization to remove redundant code:
we can merge these two piece of code:
        fprintf(LOGFILE, "Can't create directory %s in %s - %s\n", npath,
                path, strerror(errno));
by 
if (errno != EEXIST || stat(npath, &sb) != 0) {

The code after change will be like the following: 
{code}
int create_validate_dir(char* npath, mode_t perm, char* path, int 
finalComponent) {
  struct stat sb;
  if (stat(npath, &sb) != 0) {
    if (mkdir(npath, perm) != 0) {
      if (errno != EEXIST  || stat(npath, &sb) != 0) {
        fprintf(LOGFILE, "Can't create directory %s in %s - %s\n", npath,
                path, strerror(errno));
        return -1;
      }
      // The directory npath should exist.
      if (check_dir(npath, sb.st_mode, perm, finalComponent) == -1) {
          return -1;
       }
    }
  } else if(check_dir(npath, sb.st_mode, perm, finalComponent) == -1){
      return -1;
  }
  return 0;
}
{code}

5. Can we change the name create_validate_dirs to create_validate_dir? since we 
only create one directory in create_validate_dirs.

> Potential race condition in startLocalizer when using LinuxContainerExecutor  
> ------------------------------------------------------------------------------
>
>                 Key: YARN-2701
>                 URL: https://issues.apache.org/jira/browse/YARN-2701
>             Project: Hadoop YARN
>          Issue Type: Bug
>            Reporter: Xuan Gong
>            Assignee: Xuan Gong
>            Priority: Blocker
>         Attachments: YARN-2701.1.patch, YARN-2701.2.patch, YARN-2701.3.patch
>
>
> When using LinuxContainerExecutor do startLocalizer, we are using native code 
> container-executor.c. 
> {code}
>      if (stat(npath, &sb) != 0) {
>        if (mkdir(npath, perm) != 0) {
> {code}
> We are using check and create method to create the appDir under /usercache. 
> But if there are two containers trying to do this at the same time, race 
> condition may happen.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Reply via email to