Jason Lowe commented on YARN-7879:

bq. We are allowing file cache to be mounted in docker container as read only 
in YARN-7815.

If we are mounting a file cache directory into a container then I assume the 
user running in the Docker container should have the right to read every file 
under that file cache directory.  I do not see the security concern there if 
that's the case, but maybe I'm missing a key scenario that would be problematic?

bq. The risk of exposing filename is marginally small, but I like to confirm 
that is not a problem even the filename contains sensitive information exposed 
in docker containers.

The only way I can see it being an issue specific to Docker is if something 
untrusted in the Docker container, running as a different user (but still in 
the hadoop group or its equivalent within the container), pokes around for the 
filename.  It would have to probe for filenames, since the filecache top-level 
directory grants no read access, only group-execute permission.
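
To illustrate the execute-only point, here is a minimal sketch that decodes 
the {{rwx--x---}} mode shown for the filecache directory in the description 
and reports what a group member may do.  This is not NM code: the class, enum 
and method names are hypothetical, and it only inspects permission bits rather 
than touching a real directory.

```java
import java.nio.file.attribute.PosixFilePermission;
import java.nio.file.attribute.PosixFilePermissions;
import java.util.Set;

// Hypothetical illustration only; not part of YARN.
public class DirAccessSketch {
  /** Which permission class applies to the caller. */
  enum Who { OWNER, GROUP, OTHER }

  /** Listing a directory's entries requires the read bit. */
  static boolean canList(String mode, Who who) {
    Set<PosixFilePermission> perms = PosixFilePermissions.fromString(mode);
    switch (who) {
      case OWNER: return perms.contains(PosixFilePermission.OWNER_READ);
      case GROUP: return perms.contains(PosixFilePermission.GROUP_READ);
      default:    return perms.contains(PosixFilePermission.OTHERS_READ);
    }
  }

  /** Resolving a known name inside a directory requires the execute bit. */
  static boolean canTraverse(String mode, Who who) {
    Set<PosixFilePermission> perms = PosixFilePermissions.fromString(mode);
    switch (who) {
      case OWNER: return perms.contains(PosixFilePermission.OWNER_EXECUTE);
      case GROUP: return perms.contains(PosixFilePermission.GROUP_EXECUTE);
      default:    return perms.contains(PosixFilePermission.OTHERS_EXECUTE);
    }
  }

  public static void main(String[] args) {
    // Mode of the filecache directory from the ls output (drwx--x---).
    String filecache = "rwx--x---";
    System.out.println("group can list:     " + canList(filecache, Who.GROUP));
    System.out.println("group can traverse: " + canTraverse(filecache, Who.GROUP));
  }
}
```

With that mode a group member cannot enumerate entries but can resolve a name 
it already knows or guesses, which is why probing would be the only option.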

However, I would argue that if the user is running untrusted things within the 
Docker container it's simply much easier to access the sensitive files _as the 
user_.  Then there would be access to the file's contents in addition to the 

bq. Can cache directory contain subdirectories to prevent this arrangement from 

Yes, if the cache directory manager is being used there can be subdirectories 
to limit the total number of entries in a single directory.  In those cases the 
intermediate directories are set up with similar 0755 permissions so the NM user 
can access them easily; see ContainerLocalizer#createParentDirs.

This patch is restoring the usercache permissions behavior from before 
YARN-2185 went in.  YARN-2185 wasn't about addressing directory permissions, 
but it had a sidecar permission change that broke the ability for the NM to 
reuse non-public localized resources.  Therefore I'd like to see this go in so 
we aren't regressing functionality, and if there are concerns/improvements for 
how usercache permissions are handled we should address those in a separate 
JIRA.  Either that or we revert YARN-2185, remove the unrelated permissions 
change, recommit it, and still end up addressing any usercache permissions 
concerns in a separate JIRA. ;-)

> NM user is unable to access the application filecache due to permissions
> ------------------------------------------------------------------------
>                 Key: YARN-7879
>                 URL: https://issues.apache.org/jira/browse/YARN-7879
>             Project: Hadoop YARN
>          Issue Type: Bug
>    Affects Versions: 3.1.0
>            Reporter: Shane Kumpf
>            Assignee: Jason Lowe
>            Priority: Critical
>         Attachments: YARN-7879.001.patch
> I noticed the following log entries where localization was being retried on 
> several MR AM files. 
> {code}
> 2018-02-02 02:53:02,905 INFO 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.LocalResourcesTrackerImpl:
>  Resource 
> /hadoop-yarn/usercache/hadoopuser/appcache/application_1517539453610_0001/filecache/11/job.jar
>  is missing, localizing it again
> 2018-02-02 02:53:42,908 INFO 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.LocalResourcesTrackerImpl:
>  Resource 
> /hadoop-yarn/usercache/hadoopuser/appcache/application_1517539453610_0001/filecache/13/job.xml
>  is missing, localizing it again
> {code}
> The cluster is configured to use LCE and 
> {{yarn.nodemanager.linux-container-executor.nonsecure-mode.local-user}} is 
> set to a user ({{hadoopuser}}) that is in the {{hadoop}} group. The user has 
> a umask of {{0002}}. The cluster is configured with 
> {{fs.permissions.umask-mode=022}}, coming from {{core-default}}. Setting the 
> local-user to {{nobody}}, who is not a login user or in the {{hadoop}} group, 
> produces the same results.
> {code}
> [hadoopuser@y7001 ~]$ umask
> 0002
> [hadoopuser@y7001 ~]$ id
> uid=1003(hadoopuser) gid=1004(hadoopuser) groups=1004(hadoopuser),1001(hadoop)
> {code}
> The cause of the log entry was tracked down to a simple !file.exists call in 
> {{LocalResourcesTrackerImpl#isResourcePresent}}.
> {code}
>   public boolean isResourcePresent(LocalizedResource rsrc) {
>     boolean ret = true;
>     if (rsrc.getState() == ResourceState.LOCALIZED) {
>       File file = new File(rsrc.getLocalPath().toUri().getRawPath().
>         toString());
>       if (!file.exists()) {
>         ret = false;
>       } else if (dirsHandler != null) {
>         ret = checkLocalResource(rsrc);
>       }
>     }
>     return ret;
>   }
> {code}
> The Resources Tracker runs as the NM user, in this case {{yarn}}. The files 
> being retried are in the filecache. The directories in the filecache are all 
> owned by the local-user's primary group with 700 perms, which makes them 
> unreadable by the {{yarn}} user.
> {code}
> [root@y7001 ~]# ls -la 
> /hadoop-yarn/usercache/hadoopuser/appcache/application_1517540536531_0001/filecache
> total 0
> drwx--x---. 6 hadoopuser hadoop     46 Feb  2 03:06 .
> drwxr-s---. 4 hadoopuser hadoop     73 Feb  2 03:07 ..
> drwx------. 2 hadoopuser hadoopuser 61 Feb  2 03:05 10
> drwx------. 3 hadoopuser hadoopuser 21 Feb  2 03:05 11
> drwx------. 2 hadoopuser hadoopuser 45 Feb  2 03:06 12
> drwx------. 2 hadoopuser hadoopuser 41 Feb  2 03:06 13
> {code}
> I saw YARN-5287, but that appears to be related to a restrictive umask and 
> the usercache itself. I was unable to locate any other known issues that 
> seemed relevant. Is the above already known? Is it a configuration issue?

