[ 
https://issues.apache.org/jira/browse/YARN-4095?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15960233#comment-15960233
 ] 

zhihai xu commented on YARN-4095:
---------------------------------

[~Feng Yuan], I think, For ShuffleHandler, we always want it to access all 
local directories which also include full local directories, since output data 
from mappers may be at the full local directories. Otherwise shuffle may fail 
due to data or index file can't be found in the good local directories.
{code}
          Path indexFileName = lDirAlloc.getLocalPathToRead(
              attemptBase + "/" + INDEX_FILE_NAME, conf);
          Path mapOutputFileName = lDirAlloc.getLocalPathToRead(
              attemptBase + "/" + DATA_FILE_NAME, conf);
    public Path getLocalPathToRead(String pathStr,
        Configuration conf) throws IOException {
      Context ctx = confChanged(conf);
      int numDirs = ctx.localDirs.length;
      int numDirsSearched = 0;
      //remove the leading slash from the path (to make sure that the uri
      //resolution results in a valid path on the dir being checked)
      if (pathStr.startsWith("/")) {
        pathStr = pathStr.substring(1);
      }
      while (numDirsSearched < numDirs) {
        Path file = new Path(ctx.localDirs[numDirsSearched], pathStr);
        if (ctx.localFS.exists(file)) {
          return file;
        }
        numDirsSearched++;
      }

      //no path found
      throw new DiskErrorException ("Could not find " + pathStr +" in any of" +
      " the configured local directories");
    }
{code}
I think This may be also the reason why we didn't want to use the same 
configuration between ShuffleHandler and LocalDirHandlerService.

> Avoid sharing AllocatorPerContext object in LocalDirAllocator between 
> ShuffleHandler and LocalDirsHandlerService.
> -----------------------------------------------------------------------------------------------------------------
>
>                 Key: YARN-4095
>                 URL: https://issues.apache.org/jira/browse/YARN-4095
>             Project: Hadoop YARN
>          Issue Type: Improvement
>          Components: nodemanager
>            Reporter: zhihai xu
>            Assignee: zhihai xu
>             Fix For: 2.8.0, 3.0.0-alpha1
>
>         Attachments: YARN-4095.000.patch, YARN-4095.001.patch
>
>
> Currently {{ShuffleHandler}} and {{LocalDirsHandlerService}} share 
> {{AllocatorPerContext}} object in {{LocalDirAllocator}} for configuration 
> {{NM_LOCAL_DIRS}} because {{AllocatorPerContext}} are stored in a static 
> TreeMap with configuration name as key
> {code}
>   private static Map <String, AllocatorPerContext> contexts = 
>                  new TreeMap<String, AllocatorPerContext>();
> {code}
> {{LocalDirsHandlerService}} and {{ShuffleHandler}} both create a 
> {{LocalDirAllocator}} using {{NM_LOCAL_DIRS}}. Even they don't use the same 
> {{Configuration}} object, but they will use the same {{AllocatorPerContext}} 
> object. Also {{LocalDirsHandlerService}} may change {{NM_LOCAL_DIRS}} value 
> in its {{Configuration}} object to exclude full and bad local dirs, 
> {{ShuffleHandler}} always uses the original {{NM_LOCAL_DIRS}} value in its 
> {{Configuration}} object. So every time {{AllocatorPerContext#confChanged}} 
> is called by {{ShuffleHandler}} after {{LocalDirsHandlerService}}, 
> {{AllocatorPerContext}} need be reinitialized because {{NM_LOCAL_DIRS}} value 
> is changed. This will cause some overhead.
> {code}
>       String newLocalDirs = conf.get(contextCfgItemName);
>       if (!newLocalDirs.equals(savedLocalDirs)) {
> {code}
> So it will be a good improvement to not share the same 
> {{AllocatorPerContext}} instance between {{ShuffleHandler}} and 
> {{LocalDirsHandlerService}}.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to