[
https://issues.apache.org/jira/browse/YARN-1016?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13732231#comment-13732231
]
Steve Loughran commented on YARN-1016:
--------------------------------------
We can re-use existing artifacts in HDFS today; there is no need to re-upload
them to HDFS every time they are needed. For example, in Hoya the user
specifies an HDFS path to an HBase tarball when a cluster is created; that
(potentially shared) file is listed as a local resource to download when
starting every container.
The main place I could see a benefit is for the NM to cache downloaded
resources, so that if you start many containers with the same files, there's
no download at all, just a copy.
But this implies:
# a per-user cache (you don't want a shared one for security reasons)
# cache expiry and/or a stat on every launch to see if the source files have changed
# cache quotas
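
To make those three requirements concrete, here is a rough sketch of such a
cache as a toy in-memory model. It is not the NM's actual localization code:
PerUserResourceCache, RemoteStat and the stat/download callables are all
hypothetical names, with stat standing in for a cheap HDFS file-status lookup
and download for the expensive fetch. Staleness is detected by comparing size
and modification time on every launch, and the quota is enforced per user by
evicting the least recently used entries.

```python
from collections import OrderedDict
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass(frozen=True)
class RemoteStat:
    """Size and modification time of the source file (hypothetical stand-in
    for an HDFS file status)."""
    size: int
    mtime: int


class PerUserResourceCache:
    """Toy model of an NM-side resource cache; all names are illustrative."""

    def __init__(self, quota_bytes: int,
                 stat: Callable[[str], RemoteStat],
                 download: Callable[[str], bytes]) -> None:
        self.quota_bytes = quota_bytes   # quota applied to each user separately
        self.stat = stat                 # cheap metadata check, run on every launch
        self.download = download         # expensive fetch, run only on a miss
        self.downloads = 0               # counter, handy for seeing cache hits
        # user -> (path -> (stat at download time, data)), least recently used first
        self._by_user: Dict[str, OrderedDict] = {}

    def get(self, user: str, path: str) -> bytes:
        """Return the resource for this user, downloading only if needed."""
        entries = self._by_user.setdefault(user, OrderedDict())
        current = self.stat(path)
        cached = entries.get(path)
        if cached is not None and cached[0] == current:
            entries.move_to_end(path)    # mark as recently used
            return cached[1]             # hit: no download, just a copy
        entries.pop(path, None)          # drop a stale entry, if any
        data = self.download(path)
        self.downloads += 1
        if current.size <= self.quota_bytes:
            # Evict this user's oldest entries until the new one fits.
            while entries and self._usage(entries) + current.size > self.quota_bytes:
                entries.popitem(last=False)
            entries[path] = (current, data)
        return data

    @staticmethod
    def _usage(entries: OrderedDict) -> int:
        return sum(st.size for st, _ in entries.values())
```

Keying the cache by user means two users launching containers with the same
tarball each pay for one download; that is the price of not sharing a cache
across users. A real implementation would also have to handle the window
between the stat and the download, which this sketch ignores.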
> Define a HDFS based repository that allows YARN services to share resources
> ---------------------------------------------------------------------------
>
> Key: YARN-1016
> URL: https://issues.apache.org/jira/browse/YARN-1016
> Project: Hadoop YARN
> Issue Type: Sub-task
> Components: api
> Affects Versions: 3.0.0
> Reporter: Kam Kasravi
>
> YARN services both short and long lived can benefit from a resource repo
> rather than packaging resources within the YARN client to be extracted and
> used by the Application Master and (later) the containers. Standardizing a
> resource repo will provide performance benefits as well. The repo should be
> similar to Maven or Ivy repos so discovery and versioning are built-in.