[ 
https://issues.apache.org/jira/browse/YARN-1016?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13732231#comment-13732231
 ] 

Steve Loughran commented on YARN-1016:
--------------------------------------

We can already re-use existing artifacts in HDFS today; there is no need to
re-upload them to HDFS every time they are needed. For example, in Hoya the
user specifies an HDFS path to an HBase tarball when a cluster is created, and
that (potentially shared) file is listed as a local resource to download when
starting every container.

The main place I could see a benefit is for the NM (NodeManager) to cache
downloaded resources, so that if you start many containers with the same
files there's no download at all, just a local copy.

But this implies:
# a per-user cache (you don't want a shared one for security reasons)
# cache expiry and/or a stat on every launch to see if they've changed
# cache quotas
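
To make those three requirements concrete, here is a minimal sketch of what
such a cache could look like. It is illustration only, not an existing YARN
API: the class PerUserResourceCache, its Stat and Result types, and the
localize() method are all hypothetical names, and the "download" is only
recorded, not performed. It keeps one cache per user, compares the size and
modification time from a stat on every launch, and enforces a per-user byte
quota by evicting the least recently used entries. Time-based expiry is left
out.

```java
import java.util.HashMap;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical sketch of a per-user NM resource cache; not a YARN API. */
final class PerUserResourceCache {

    /** What a stat of the remote file reports: size and modification time. */
    static final class Stat {
        final long size;
        final long mtime;
        Stat(long size, long mtime) { this.size = size; this.mtime = mtime; }
    }

    /** Outcome of a lookup: served from the cache, or (re)downloaded. */
    enum Result { HIT, DOWNLOADED }

    private final long quotaBytesPerUser;
    // user -> (remote path -> stat of the cached copy), kept in access order
    private final Map<String, LinkedHashMap<String, Stat>> caches = new HashMap<>();

    PerUserResourceCache(long quotaBytesPerUser) {
        this.quotaBytesPerUser = quotaBytesPerUser;
    }

    /** Called on every container launch with a fresh stat of the remote file. */
    Result localize(String user, String path, Stat current) {
        // Requirement 1: each user gets a private cache, nothing is shared.
        LinkedHashMap<String, Stat> cache =
            caches.computeIfAbsent(user, u -> new LinkedHashMap<>(16, 0.75f, true));
        // Requirement 2: a cached copy is valid only if the stat still matches.
        Stat cached = cache.get(path);
        if (cached != null && cached.size == current.size
                && cached.mtime == current.mtime) {
            return Result.HIT;
        }
        cache.put(path, current); // missing or stale: record a fresh download
        evictOverQuota(cache, path);
        return Result.DOWNLOADED;
    }

    /** Requirement 3: evict least recently used entries until under quota. */
    private void evictOverQuota(LinkedHashMap<String, Stat> cache, String justAdded) {
        long used = 0;
        for (Stat s : cache.values()) {
            used += s.size;
        }
        Iterator<Map.Entry<String, Stat>> it = cache.entrySet().iterator();
        while (used > quotaBytesPerUser && it.hasNext()) {
            Map.Entry<String, Stat> e = it.next();
            if (e.getKey().equals(justAdded)) {
                continue; // never evict the entry that was just added
            }
            used -= e.getValue().size;
            it.remove();
        }
    }

    /** Bytes currently held in the cache for one user. */
    long usedBytes(String user) {
        long used = 0;
        LinkedHashMap<String, Stat> cache = caches.get(user);
        if (cache != null) {
            for (Stat s : cache.values()) {
                used += s.size;
            }
        }
        return used;
    }
}
```

Even in this toy form the costs show up: every launch still needs one stat
against HDFS, and a file larger than the quota is kept anyway, so a real
implementation would have to decide what to do in that case.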



                
> Define a HDFS based repository that allows YARN services to share resources
> ---------------------------------------------------------------------------
>
>                 Key: YARN-1016
>                 URL: https://issues.apache.org/jira/browse/YARN-1016
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: api
>    Affects Versions: 3.0.0
>            Reporter: Kam Kasravi
>
> YARN services, both short- and long-lived, can benefit from a resource repo
> rather than packaging resources within the YARN client to be extracted and
> used by the Application Master and (later) the containers. Standardizing a
> resource repo will provide performance benefits as well. The repo should be
> similar to Maven or Ivy repos, so that discovery and versioning are built in.
