[ 
https://issues.apache.org/jira/browse/YARN-8703?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16598533#comment-16598533
 ] 

lujie edited comment on YARN-8703 at 8/31/18 10:14 AM:
-------------------------------------------------------

[~jlowe]

Thanks for your explanation. I added a sleep and a kill-job command in the
constructor of ResourceLocalizedEvent, and I also added a log statement to show
the suspicious file path. With these changes I was able to trigger the warning
log statement.

Below is a snippet of the log file. Log 1 is generated by the warning log
statement, log 2 is generated by my added log statement, and log 3 shows that
the suspicious file is deleted when the application finishes.
{code:java}
1. Received LOCALIZED event for request { 
hdfs://hadoop11:29000/tmp/hadoop-yarn/staging/hires/.staging/job_1535704989367_0001/job.splitmetainfo,
 1535705000427, FILE, null } but localized resource is missing

2. start to delete 
file:/home/hires/cloudraid/hadoop/hadoop-3.2.0-SNAPSHOT/tmp/nm-local-dir/usercache/hires/appcache/application_1535704989367_0001/filecache/10
..............

3. Deleting path : 
/home/hires/cloudraid/hadoop/hadoop-3.2.0-SNAPSHOT/tmp/nm-local-dir/usercache/hires/appcache/application_1535704989367_0001/filecache/10
{code}
 



> Localized resource may leak on disk if container is killed while localizing
> ---------------------------------------------------------------------------
>
>                 Key: YARN-8703
>                 URL: https://issues.apache.org/jira/browse/YARN-8703
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: nodemanager
>            Reporter: Jason Lowe
>            Priority: Major
>         Attachments: hadoop-hires-nodemanager-hadoop15.log
>
>
> If a container is killed while localizing then it releases all of its 
> resources.  If the resource count goes to zero and it is in the DOWNLOADING 
> state then the resource bookkeeping is removed in the resource tracker.  
> Shortly afterwards the localizer could heartbeat in and report the successful 
> localization of the resource that was just removed.  When the 
> LocalResourcesTrackerImpl receives the LOCALIZED event but does not find the 
> corresponding LocalResource for the event then it simply logs a "localized 
> without a location" warning.  At that point I think the localized resource 
> has been leaked on the disk since the NM has removed bookkeeping for the 
> resource without removing it on disk.
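
The sequence in the description can be illustrated with a small toy model. This is only a sketch under stated assumptions, not the NodeManager code: ToyTracker, Entry, onDisk and untrackedPaths are hypothetical names invented for illustration, and the real LocalResourcesTrackerImpl is considerably more involved.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Toy model of the race described above; NOT the NodeManager code.
// All names here (ToyTracker, Entry, onDisk, ...) are hypothetical.
class ToyTracker {
    enum State { DOWNLOADING, LOCALIZED }

    static final class Entry {
        State state = State.DOWNLOADING;
        int refCount = 0;
        String localPath;
    }

    // Bookkeeping: request key -> tracked resource.
    final Map<String, Entry> tracked = new HashMap<>();
    // Stand-in for the local filesystem: paths the localizer has written.
    final Set<String> onDisk = new HashSet<>();

    // A container asks for a resource; the entry starts in DOWNLOADING.
    void request(String key) {
        tracked.computeIfAbsent(key, k -> new Entry()).refCount++;
    }

    // A container (e.g. one being killed) releases the resource.
    void release(String key) {
        Entry e = tracked.get(key);
        if (e == null) {
            return;
        }
        e.refCount--;
        // Unreferenced and still downloading: drop the bookkeeping.
        if (e.refCount == 0 && e.state == State.DOWNLOADING) {
            tracked.remove(key);
        }
    }

    // The localizer reports success; the file already exists on disk.
    // Returns true if the event matched a tracked resource.
    boolean localized(String key, String path) {
        onDisk.add(path);
        Entry e = tracked.get(key);
        if (e == null) {
            // Only a warning is logged; nothing cleans up 'path' here.
            System.out.println("WARN: LOCALIZED event for " + key
                + " but localized resource is missing");
            return false;
        }
        e.state = State.LOCALIZED;
        e.localPath = path;
        return true;
    }

    // Paths on disk that no tracked entry points to.
    Set<String> untrackedPaths() {
        Set<String> untracked = new HashSet<>(onDisk);
        for (Entry e : tracked.values()) {
            untracked.remove(e.localPath);
        }
        return untracked;
    }
}
```

Calling request, then release, then localized for the same key leaves the reported path in untrackedPaths(): the bookkeeping is gone but the file is still there. The model deliberately leaves out application-level cleanup, which according to the log snippet in the comment above does remove the appcache directory once the application finishes.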


