[ 
https://issues.apache.org/jira/browse/YARN-1771?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13918548#comment-13918548
 ] 

Chris Douglas commented on YARN-1771:
-------------------------------------

The simpler check doesn't seem to have any practical issues. Since the cache is 
keyed on Paths, the case where a user can refer to an object without having 
access to it seems pretty esoteric. That holds as long as the public cache runs 
with lowered privileges and the check isn't needed to verify that the "public" 
resource isn't private to YARN. Copying with the user's HDFS credentials would 
avoid that problem, though it seems like a heavyweight solution if reducing 
getFileStatus calls is the only motivation.

> many getFileStatus calls made from node manager for localizing a public 
> distributed cache resource
> --------------------------------------------------------------------------------------------------
>
>                 Key: YARN-1771
>                 URL: https://issues.apache.org/jira/browse/YARN-1771
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: nodemanager
>            Reporter: Sangjin Lee
>            Assignee: Sangjin Lee
>            Priority: Critical
>
> We're observing that getFileStatus calls are putting a fair amount of load 
> on the name node as part of checking public-ness when localizing a resource 
> that belongs in the public cache.
> We see 7 getFileStatus calls made for each of these resources. We should look 
> into reducing the number of calls to the name node. One example:
> {noformat}
> 2014-02-27 18:07:27,351 INFO audit: ... cmd=getfileinfo       src=/tmp/temp-887708724/tmp883330348/foo-0.0.44.jar ...
> 2014-02-27 18:07:27,352 INFO audit: ... cmd=getfileinfo       src=/tmp/temp-887708724/tmp883330348/foo-0.0.44.jar ...
> 2014-02-27 18:07:27,352 INFO audit: ... cmd=getfileinfo       src=/tmp/temp-887708724/tmp883330348 ...
> 2014-02-27 18:07:27,353 INFO audit: ... cmd=getfileinfo       src=/tmp/temp-887708724 ...
> 2014-02-27 18:07:27,353 INFO audit: ... cmd=getfileinfo       src=/tmp ...
> 2014-02-27 18:07:27,354 INFO audit: ... cmd=getfileinfo       src=/    ...
> 2014-02-27 18:07:27,354 INFO audit: ... cmd=getfileinfo       src=/tmp/temp-887708724/tmp883330348/foo-0.0.44.jar ...
> 2014-02-27 18:07:27,355 INFO audit: ... cmd=open              src=/tmp/temp-887708724/tmp883330348/foo-0.0.44.jar ...
> {noformat}
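
The audit excerpt above is consistent with a public-ness check that looks up 
the file and then each ancestor directory up to the root. The following is a 
minimal Python sketch of that pattern, not the node manager's actual code. It 
assumes the rule is "file readable by others, every ancestor executable by 
others"; CountingNamespace, get_mode and is_public are hypothetical names, and 
the permission bits used with them are made up for illustration.

```python
import posixpath
import stat


class CountingNamespace:
    """Hypothetical in-memory stand-in for a file system that records lookups."""

    def __init__(self, modes):
        self.modes = modes       # path -> permission bits (illustrative values)
        self.stat_calls = []     # every path looked up, in order

    def get_mode(self, path):
        # Each call here stands in for one getFileStatus round trip.
        self.stat_calls.append(path)
        return self.modes[path]


def is_public(ns, path):
    """Assumed rule: file is other-readable, all ancestors are other-executable."""
    if not (ns.get_mode(path) & stat.S_IROTH):
        return False
    current = posixpath.dirname(path)
    while True:
        if not (ns.get_mode(current) & stat.S_IXOTH):
            return False
        if current == "/":
            return True
        current = posixpath.dirname(current)


# Usage: the path from the audit log, with made-up permissions.
jar = "/tmp/temp-887708724/tmp883330348/foo-0.0.44.jar"
ns = CountingNamespace({
    "/": 0o755,
    "/tmp": 0o777,
    "/tmp/temp-887708724": 0o755,
    "/tmp/temp-887708724/tmp883330348": 0o755,
    jar: 0o644,
})
result = is_public(ns, jar)
lookups = len(ns.stat_calls)
```

Under these assumptions the walk costs one lookup per path component plus one 
for the root, so five for this path. The log shows seven getfileinfo entries; 
the two additional lookups on the jar itself are not explained by this sketch.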



