[ 
https://issues.apache.org/jira/browse/YARN-6708?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16063878#comment-16063878
 ] 

Jason Lowe commented on YARN-6708:
----------------------------------

Thanks for the report and the patch!

I'm not a fan of moving LocalCacheDirectoryManager to yarn-common.  It is 
_very_ specific to the peculiarities of how container localization works, and 
therefore isn't really a reusable component as something in yarn-common would 
imply.

Instead I think it's more appropriate for the directory to be created _before_ 
FSDownload tries to download into it.  Notice that when FSDownload creates the 
directory, it does not expect to create parents, because it explicitly calls 
the mkdir form that should fail when parent directories do not exist.  That 
made me wonder how this actually works in practice, and I found that the 
place where the parents are getting auto-created is this code chunk in 
ContainerLocalizer:
{code}
  Callable<Path> download(Path path, LocalResource rsrc,
      UserGroupInformation ugi) throws IOException {
    diskValidator.checkStatus(new File(path.toUri().getRawPath()));
    return new FSDownloadWrapper(lfs, ugi, conf, path, rsrc);
  }
{code}

The checkStatus call invokes checkDir, which in turn calls 
mkdirsWithExistsCheck.  That creates the parent directories with default 
permissions.  I'd rather see ContainerLocalizer set up the parent directories 
with proper permissions before calling FSDownload.  ContainerLocalizer is 
already in the appropriate package to leverage LocalCacheDirectoryManager, and 
it seems like a more appropriate place to make this change.
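
To illustrate the idea of setting up parent directories with explicit permissions before the download, here is a minimal sketch. It is not the Hadoop implementation and does not use FileContext or LocalCacheDirectoryManager; it relies only on java.nio.file. The class name ParentDirSetup, the method createParents, and its parameters are hypothetical names chosen for this example, and it assumes a POSIX file system. Each missing parent is created one level at a time and then given explicit permissions, so the result does not depend on the process umask.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.attribute.PosixFilePermission;
import java.nio.file.attribute.PosixFilePermissions;
import java.util.Set;

/** Hypothetical helper, for illustration only; not part of Hadoop. */
final class ParentDirSetup {
  private ParentDirSetup() {}

  /**
   * Creates every missing directory between base and the parent of target,
   * then sets the requested permissions on each directory it created.
   * The target itself is left for the downloader to create.
   */
  static void createParents(Path base, Path target, String perms)
      throws IOException {
    Set<PosixFilePermission> wanted = PosixFilePermissions.fromString(perms);
    Path parent = target.getParent();
    if (parent == null || !parent.startsWith(base)) {
      throw new IllegalArgumentException(target + " is not under " + base);
    }
    Path current = base;
    for (Path part : base.relativize(parent)) {
      current = current.resolve(part);
      if (!Files.isDirectory(current)) {
        // Non-recursive create; the umask may strip bits at this point.
        Files.createDirectory(current);
        // An explicit chmod-style call is not affected by the umask.
        Files.setPosixFilePermissions(current, wanted);
      }
    }
  }
}
```

For example, calling ParentDirSetup.createParents(filecacheDir, filecacheDir.resolve("0/14/rsrc"), "rwxr-xr-x") would leave both 0/ and 0/14/ with mode 755 even under a umask of 027, while directories that already exist are left untouched.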


> Nodemanager container crash after ext3 folder limit
> ---------------------------------------------------
>
>                 Key: YARN-6708
>                 URL: https://issues.apache.org/jira/browse/YARN-6708
>             Project: Hadoop YARN
>          Issue Type: Bug
>            Reporter: Bibin A Chundatt
>            Assignee: Bibin A Chundatt
>            Priority: Critical
>         Attachments: YARN-6708.001.patch, YARN-6708.002.patch, 
> YARN-6708.003.patch
>
>
> Configure the umask as *027* for the nodemanager service user
> and {{yarn.nodemanager.local-cache.max-files-per-directory}} as {{40}}. After 
> 4 *private* dir localizations, the next directory will be *0/14*
> Local Directory cache manager 
> {code}
> vm2:/opt/hadoop/release/data/nmlocal/usercache/mapred/filecache # l
> total 28
> drwx--x--- 7 mapred hadoop 4096 Jun 10 14:35 ./
> drwxr-s--- 4 mapred hadoop 4096 Jun 10 12:07 ../
> drwxr-x--- 3 mapred users  4096 Jun 10 14:36 0/
> drwxr-xr-x 3 mapred users  4096 Jun 10 12:15 10/
> drwxr-xr-x 3 mapred users  4096 Jun 10 12:22 11/
> drwxr-xr-x 3 mapred users  4096 Jun 10 12:27 12/
> drwxr-xr-x 3 mapred users  4096 Jun 10 12:31 13/
> {code}
> *drwxr-x---* 3 mapred users  4096 Jun 10 14:36 0/ is only *750*
> The nodemanager user will not be able to check whether the localization path 
> exists.
> {{LocalResourcesTrackerImpl}}
> {code}
>     case REQUEST:
>       if (rsrc != null && (!isResourcePresent(rsrc))) {
>         LOG.info("Resource " + rsrc.getLocalPath()
>             + " is missing, localizing it again");
>         removeResource(req);
>         rsrc = null;
>       }
>       if (null == rsrc) {
>         rsrc = new LocalizedResource(req, dispatcher);
>         localrsrc.put(req, rsrc);
>       }
>       break;
> {code}
> *isResourcePresent* will always return false, and the same resource will be 
> localized again under {{0}} with the next unique number


