Chris Trezzo created YARN-3637:
----------------------------------

             Summary: Handle localization sym-linking correctly at the YARN level
                 Key: YARN-3637
                 URL: https://issues.apache.org/jira/browse/YARN-3637
             Project: Hadoop YARN
          Issue Type: Sub-task
            Reporter: Chris Trezzo
            Assignee: Chris Trezzo


The shared cache needs to handle resource sym-linking at the YARN layer. 
Currently, we let the application layer (e.g. MapReduce) handle this, but it 
would probably be better for all applications if YARN handled it transparently.

Here is the scenario:
Imagine two separate jars (with unique checksums) that have the same name, 
job.jar.

They are stored in the shared cache as two separate resources:
checksum1/job.jar
checksum2/job.jar

A new application tries to use both of these resources, but internally refers 
to them by different names:
foo.jar maps to checksum1
bar.jar maps to checksum2

When the shared cache returns the paths to the resources, both resources have 
the same file name (i.e. job.jar). Because of this, when the resources are 
localized, one of them clobbers the other: both symlinks in the container_id 
directory have the same name (i.e. job.jar), even though they point to two 
separate resource directories.
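
The collision can be illustrated with a small stand-alone sketch. This is not 
YARN code: the class and method names are hypothetical, and the container 
directory is modeled as a simple map from symlink name to target. It only 
shows why naming links after the cached file loses one resource, while naming 
them after the name the application requested does not.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch; none of these names come from YARN.
class SymlinkClobberSketch {

    // Returns the last path component, e.g. "checksum1/job.jar" -> "job.jar".
    static String baseName(String path) {
        return path.substring(path.lastIndexOf('/') + 1);
    }

    // Models the container directory as (link name -> target), with each
    // link named after the file name stored in the shared cache.
    static Map<String, String> linkByCachedName(Map<String, String> requested) {
        Map<String, String> links = new LinkedHashMap<>();
        for (String cachePath : requested.values()) {
            // A later entry with the same base name overwrites the earlier one.
            links.put(baseName(cachePath), cachePath);
        }
        return links;
    }

    // Same model, but each link is named after the name the application uses.
    static Map<String, String> linkByRequestedName(Map<String, String> requested) {
        return new LinkedHashMap<>(requested);
    }

    public static void main(String[] args) {
        Map<String, String> requested = new LinkedHashMap<>();
        requested.put("foo.jar", "checksum1/job.jar");
        requested.put("bar.jar", "checksum2/job.jar");

        // One link survives: {job.jar=checksum2/job.jar}
        System.out.println(linkByCachedName(requested));
        // Both links survive:
        // {foo.jar=checksum1/job.jar, bar.jar=checksum2/job.jar}
        System.out.println(linkByRequestedName(requested));
    }
}
```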

Originally we tackled this in the MapReduce client by using the fragment 
portion of the resource URL. This, however, seems like something that should be 
solved at the YARN layer.
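
As a rough sketch of the fragment idea, the snippet below uses only 
java.net.URI to pick a link name: the fragment if one is present, otherwise 
the last path component. The class name, the helper, and the hdfs:// URLs are 
assumptions made up for illustration; this is not the MapReduce client code 
and not a proposed YARN API.

```java
import java.net.URI;

// Hypothetical sketch; the class, method, and URLs are illustrative only.
class FragmentLinkNameSketch {

    // Chooses the symlink name for a resource URL: the fragment when the
    // caller supplied one, otherwise the last component of the path.
    static String linkName(URI resource) {
        String fragment = resource.getFragment();
        if (fragment != null && !fragment.isEmpty()) {
            return fragment;
        }
        String path = resource.getPath();
        return path.substring(path.lastIndexOf('/') + 1);
    }

    public static void main(String[] args) {
        // Assumed example locations; real shared cache paths may differ.
        URI foo = URI.create("hdfs://namenode/sharedcache/checksum1/job.jar#foo.jar");
        URI bar = URI.create("hdfs://namenode/sharedcache/checksum2/job.jar#bar.jar");
        URI plain = URI.create("hdfs://namenode/sharedcache/checksum1/job.jar");

        System.out.println(linkName(foo));   // foo.jar
        System.out.println(linkName(bar));   // bar.jar
        System.out.println(linkName(plain)); // job.jar
    }
}
```

With the fragment carrying the application's own name, the two job.jar 
resources from the scenario above resolve to distinct link names.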



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
