[
https://issues.apache.org/jira/browse/YARN-1771?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13923176#comment-13923176
]
Sangjin Lee commented on YARN-1771:
-----------------------------------
I have been looking into this from the perspective of reducing the number of
unnecessary getFileStatus calls (and thereby reducing the pressure on the name
node). So for now I'm gravitating towards a solution that caches the
getFileStatus calls for the duration of a container initialization (i.e.
resource localization). It would be pretty effective (reducing the number of
calls from (m + 3)*n to n + (small constant)).
I'll upload a patch for your review shortly. Thanks!
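
To make the call-count argument concrete, here is a minimal, self-contained sketch of the caching idea. It is not the patch itself and does not use the Hadoop API: `StatusCacheSketch`, `StatusCache`, and `selfAndAncestors` are hypothetical names, the lookup function stands in for a remote getFileStatus call, and the paths are made up. It assumes the public-ness check visits a resource and each of its ancestor directories, as the audit log in the issue description suggests.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

// Hypothetical sketch: memoize per-path status lookups for the lifetime of
// one cache instance (e.g. one container initialization).
class StatusCacheSketch {

    static final class StatusCache<S> {
        private final Function<String, S> lookup; // stand-in for the remote call
        private final Map<String, S> cache = new HashMap<>();
        private int remoteCalls = 0;

        StatusCache(Function<String, S> lookup) {
            this.lookup = lookup;
        }

        // Returns the cached status, calling the lookup only on a miss.
        // Not thread-safe; a real implementation would need to be.
        S getStatus(String path) {
            S status = cache.get(path);
            if (status == null) {
                remoteCalls++;
                status = lookup.apply(path);
                cache.put(path, status);
            }
            return status;
        }

        int remoteCalls() {
            return remoteCalls;
        }
    }

    // Returns the path followed by each ancestor directory, ending with "/".
    static List<String> selfAndAncestors(String path) {
        List<String> out = new ArrayList<>();
        String p = path;
        while (true) {
            out.add(p);
            if (p.equals("/")) {
                break;
            }
            int i = p.lastIndexOf('/');
            p = (i <= 0) ? "/" : p.substring(0, i);
        }
        return out;
    }

    public static void main(String[] args) {
        StatusCache<String> cache = new StatusCache<>(p -> "status-of:" + p);
        // Two made-up resources that share the same parent directories.
        String[] resources = {"/tmp/a/b/foo.jar", "/tmp/a/b/bar.jar"};
        for (String resource : resources) {
            for (String p : selfAndAncestors(resource)) {
                cache.getStatus(p);
            }
        }
        // Without the cache this loop would issue 10 lookups.
        System.out.println(cache.remoteCalls()); // prints 6
    }
}
```

In this toy run the shared ancestors are looked up once, so only the resource paths themselves add new calls. Scoping the cache to a single container initialization keeps any stale entries short-lived.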
> many getFileStatus calls made from node manager for localizing a public
> distributed cache resource
> --------------------------------------------------------------------------------------------------
>
> Key: YARN-1771
> URL: https://issues.apache.org/jira/browse/YARN-1771
> Project: Hadoop YARN
> Issue Type: Bug
> Components: nodemanager
> Reporter: Sangjin Lee
> Assignee: Sangjin Lee
> Priority: Critical
>
> We're observing that the getFileStatus calls are putting a fair amount of
> load on the name node as part of checking the public-ness for localizing a
> resource that belongs in the public cache.
> We see 7 getFileStatus calls made for each of these resources. We should look
> into reducing the number of calls to the name node. One example:
> {noformat}
> 2014-02-27 18:07:27,351 INFO audit: ... cmd=getfileinfo
> src=/tmp/temp-887708724/tmp883330348/foo-0.0.44.jar ...
> 2014-02-27 18:07:27,352 INFO audit: ... cmd=getfileinfo
> src=/tmp/temp-887708724/tmp883330348/foo-0.0.44.jar ...
> 2014-02-27 18:07:27,352 INFO audit: ... cmd=getfileinfo
> src=/tmp/temp-887708724/tmp883330348 ...
> 2014-02-27 18:07:27,353 INFO audit: ... cmd=getfileinfo
> src=/tmp/temp-887708724 ...
> 2014-02-27 18:07:27,353 INFO audit: ... cmd=getfileinfo src=/tmp ...
> 2014-02-27 18:07:27,354 INFO audit: ... cmd=getfileinfo src=/ ...
> 2014-02-27 18:07:27,354 INFO audit: ... cmd=getfileinfo
> src=/tmp/temp-887708724/tmp883330348/foo-0.0.44.jar ...
> 2014-02-27 18:07:27,355 INFO audit: ... cmd=open
> src=/tmp/temp-887708724/tmp883330348/foo-0.0.44.jar ...
> {noformat}