[ https://issues.apache.org/jira/browse/YARN-2755?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14186238#comment-14186238 ]
Sangjin Lee commented on YARN-2755:
-----------------------------------

Thanks [~l201514]. Just wanted to add that when this happens and the NM has enough of these empty directories, it takes a long time for the NM to finish the init, so the RM gives up on it and sends a shutdown to the NM. This cycle repeats.

> NM fails to clean up usercache_DEL_<timestamp> dirs after YARN-661
> ------------------------------------------------------------------
>
>                 Key: YARN-2755
>                 URL: https://issues.apache.org/jira/browse/YARN-2755
>             Project: Hadoop YARN
>          Issue Type: Bug
>            Reporter: Siqi Li
>            Assignee: Siqi Li
>            Priority: Critical
>         Attachments: YARN-2755.v1.patch
>
>
> When the NM restarts frequently for some reason, a large number of directories
> like these are left in /data/disk$num/yarn/local/:
> /data/disk1/yarn/local/usercache_DEL_1414372756105
> /data/disk1/yarn/local/usercache_DEL_1413557901696
> /data/disk1/yarn/local/usercache_DEL_1413657004894
> /data/disk1/yarn/local/usercache_DEL_1413675321860
> /data/disk1/yarn/local/usercache_DEL_1414093167936
> /data/disk1/yarn/local/usercache_DEL_1413565841271
> These directories are empty, but take up 100M+ due to the number of them.
> There were 38714 per data disk on the machine I looked at.
> It appears to be a regression introduced by YARN-661.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)