[ 
https://issues.apache.org/jira/browse/YARN-467?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13615908#comment-13615908
 ] 

Omkar Vinit Joshi commented on YARN-467:
----------------------------------------

Adding tests to validate the expected behavior:

* TestHierarchicalDirectory
** testHierarchicalSubDirectoryCreation: tests the following scenarios
*** Limiting files per directory to 
YarnConfiguration.NM_LOCAL_CACHE_NUM_FILES_PER_DIRECTORY (a limit that 
includes the 36 sub-directories)
*** If a file is removed (via a decFileCountForPath call) from any 
sub-directory, then those directories are reused in the order in which their 
state changes to DirectoryState.VACANT
*** Checks path generation up to the 2nd level.
** testMinimumPerDirectoryFileLimit: tests the behavior when the 
configuration parameter is set to a value <= 36.

* TestLocalResourcesTrackerImpl
** testMinimumPerDirectoryFileLimit: tests PUBLIC resources against the 
hierarchical directory structure.
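
The behavior these tests exercise can be sketched as follows. This is a 
hypothetical illustration, not the patch's actual implementation: the class 
and method names (CacheDirSketch, nextPath, removeFile) are invented, and it 
only assumes what the comment describes — 36 sub-directories (0-9, a-z) per 
level, a configured limit that counts those 36 slots, and vacant directories 
reused in the order they became vacant.

```java
import java.util.ArrayDeque;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Queue;
import java.util.Set;

// Hypothetical sketch of a local-cache directory manager that caps the
// number of files per directory and spills into a 36-way hierarchy.
public class CacheDirSketch {
    // 36 possible sub-directories per level: 0-9 and a-z.
    private static final String SUBDIRS = "0123456789abcdefghijklmnopqrstuvwxyz";

    private final int fileCapacity;                  // file slots per directory
    private final Queue<String> nonFull = new ArrayDeque<>();
    private final Map<String, Integer> counts = new HashMap<>();
    private final Set<String> expanded = new HashSet<>();

    public CacheDirSketch(int filesPerDirectory) {
        // The configured limit includes the 36 sub-directory slots, so a
        // value <= 36 would leave no room for files; clamp it to 37 here.
        this.fileCapacity = Math.max(filesPerDirectory, 37) - SUBDIRS.length();
        nonFull.add("");                             // start with the cache root
        counts.put("", 0);
    }

    /** Relative directory under the cache root for the next localized file. */
    public String nextPath() {
        String dir = nonFull.peek();
        int c = counts.merge(dir, 1, Integer::sum);
        if (c == fileCapacity) {                     // directory is now full
            nonFull.poll();
            if (expanded.add(dir)) {                 // create children only once
                for (char s : SUBDIRS.toCharArray()) {
                    String child = dir.isEmpty() ? String.valueOf(s)
                                                 : dir + "/" + s;
                    counts.put(child, 0);
                    nonFull.add(child);
                }
            }
        }
        return dir;
    }

    /** Removing a file makes a full directory VACANT; vacant directories are
     *  re-queued, i.e. reused in the order their state changed to VACANT. */
    public void removeFile(String dir) {
        int c = counts.merge(dir, -1, Integer::sum);
        if (c == fileCapacity - 1) {
            nonFull.add(dir);
        }
    }
}
```

With the minimum limit of 37 (one file slot per directory), the paths come 
out as the root, then "0" through "z", then "0/0" and so on into the 2nd 
level, which matches the path-generation checks described above.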
                
> Jobs fail during resource localization when public distributed-cache hits 
> unix directory limits
> -----------------------------------------------------------------------------------------------
>
>                 Key: YARN-467
>                 URL: https://issues.apache.org/jira/browse/YARN-467
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: nodemanager
>    Affects Versions: 3.0.0, 2.0.0-alpha
>            Reporter: Omkar Vinit Joshi
>            Assignee: Omkar Vinit Joshi
>         Attachments: yarn-467-20130322.1.patch, yarn-467-20130322.2.patch, 
> yarn-467-20130322.3.patch, yarn-467-20130322.patch, 
> yarn-467-20130325.1.patch, yarn-467-20130325.path
>
>
> If multiple jobs use the distributed cache with many small files, the 
> per-directory limit is reached before the cache-size limit, and no further 
> directories can be created in the file cache (PUBLIC). The jobs start 
> failing with the exception below.
> java.io.IOException: mkdir of /tmp/nm-local-dir/filecache/3901886847734194975 
> failed
>       at org.apache.hadoop.fs.FileSystem.primitiveMkdir(FileSystem.java:909)
>       at 
> org.apache.hadoop.fs.DelegateToFileSystem.mkdir(DelegateToFileSystem.java:143)
>       at org.apache.hadoop.fs.FilterFs.mkdir(FilterFs.java:189)
>       at org.apache.hadoop.fs.FileContext$4.next(FileContext.java:706)
>       at org.apache.hadoop.fs.FileContext$4.next(FileContext.java:703)
>       at 
> org.apache.hadoop.fs.FileContext$FSLinkResolver.resolve(FileContext.java:2325)
>       at org.apache.hadoop.fs.FileContext.mkdir(FileContext.java:703)
>       at org.apache.hadoop.yarn.util.FSDownload.call(FSDownload.java:147)
>       at org.apache.hadoop.yarn.util.FSDownload.call(FSDownload.java:49)
>       at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
>       at java.util.concurrent.FutureTask.run(FutureTask.java:138)
>       at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441)
>       at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
>       at java.util.concurrent.FutureTask.run(FutureTask.java:138)
>       at 
> java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
>       at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
>       at java.lang.Thread.run(Thread.java:662)
> We need a mechanism that creates a directory hierarchy and limits the 
> number of files per directory.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira