[
https://issues.apache.org/jira/browse/YARN-1771?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13918301#comment-13918301
]
Jason Lowe commented on YARN-1771:
----------------------------------
Yes, it would be a weaker condition check, but I'm wondering if the weaker
check still meets the security needs of the dist cache.
A user is requesting a resource to be publicly localized. If they have read
permissions to it, then even if others lack access the original user can
trivially work around that obstacle by copying it to a publicly accessible
location (e.g., /tmp). So in that sense the user has a legitimate way to make
the resource data public even if it isn't right now.
A subsequent request for the same resource would check the timestamp using the
same doAs logic, so if another user doesn't have access then they won't
localize. It's true that the other user's container can still access the
resource by avoiding explicit localization and instead scanning/scraping the
local public distcache area directly once it runs. However, the original user
who requested the resource asked for it to be public and has the means to make
it public, so they probably aren't concerned that the public can access it.
This approach would also be useful to the shared cache design in YARN-1492,
which calls for the ability to make something a public resource
directly from a user's staging area.
There may be some security concerns that I've missed, but if this ends up being
a possibility then it would eliminate all of the parent directory stat calls on
public localization.
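To make the saving concrete, here is a rough sketch in plain Java (no Hadoop
classes) of which paths each style of check would have to stat. It assumes the
current public-ness check stats the resource and then every ancestor up to
"/", as the audit log in the issue description suggests; the class and method
names are hypothetical and not taken from the node manager code.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

final class PublicCheckStats {
    /**
     * Paths a full public-ness check would stat (assumed behavior):
     * the resource itself, then each ancestor directory up to "/".
     * Expects an absolute, '/'-separated path.
     */
    static List<String> fullCheckTargets(String resource) {
        List<String> targets = new ArrayList<>();
        String current = resource;
        while (true) {
            targets.add(current);
            if (current.equals("/")) {
                break;
            }
            int slash = current.lastIndexOf('/');
            // Strip the last path component; fall back to the root.
            current = (slash <= 0) ? "/" : current.substring(0, slash);
        }
        return targets;
    }

    /**
     * Under the weaker check only the resource is stat'ed, as the
     * requesting user, so no ancestor lookups are needed.
     */
    static List<String> weakerCheckTargets(String resource) {
        return Collections.singletonList(resource);
    }

    public static void main(String[] args) {
        String jar = "/tmp/temp-887708724/tmp883330348/foo-0.0.44.jar";
        System.out.println("full check:   " + fullCheckTargets(jar));
        System.out.println("weaker check: " + weakerCheckTargets(jar));
    }
}
```

For the jar in the issue description, the full walk covers the file plus four
ancestors, while the weaker check touches only the file itself.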
> many getFileStatus calls made from node manager for localizing a public
> distributed cache resource
> --------------------------------------------------------------------------------------------------
>
> Key: YARN-1771
> URL: https://issues.apache.org/jira/browse/YARN-1771
> Project: Hadoop YARN
> Issue Type: Bug
> Components: nodemanager
> Reporter: Sangjin Lee
> Assignee: Sangjin Lee
> Priority: Critical
>
> We're observing that the getFileStatus calls are putting a fair amount of
> load on the name node as part of checking the public-ness for localizing
> resources that belong in the public cache.
> We see 7 getFileStatus calls made for each of these resources. We should look
> into reducing the number of calls to the name node. One example:
> {noformat}
> 2014-02-27 18:07:27,351 INFO audit: ... cmd=getfileinfo
> src=/tmp/temp-887708724/tmp883330348/foo-0.0.44.jar ...
> 2014-02-27 18:07:27,352 INFO audit: ... cmd=getfileinfo
> src=/tmp/temp-887708724/tmp883330348/foo-0.0.44.jar ...
> 2014-02-27 18:07:27,352 INFO audit: ... cmd=getfileinfo
> src=/tmp/temp-887708724/tmp883330348 ...
> 2014-02-27 18:07:27,353 INFO audit: ... cmd=getfileinfo
> src=/tmp/temp-887708724 ...
> 2014-02-27 18:07:27,353 INFO audit: ... cmd=getfileinfo src=/tmp ...
> 2014-02-27 18:07:27,354 INFO audit: ... cmd=getfileinfo src=/ ...
> 2014-02-27 18:07:27,354 INFO audit: ... cmd=getfileinfo
> src=/tmp/temp-887708724/tmp883330348/foo-0.0.44.jar ...
> 2014-02-27 18:07:27,355 INFO audit: ... cmd=open
> src=/tmp/temp-887708724/tmp883330348/foo-0.0.44.jar ...
> {noformat}