[ https://issues.apache.org/jira/browse/YARN-8703?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16598533#comment-16598533 ]
lujie edited comment on YARN-8703 at 8/31/18 10:05 AM:
-------------------------------------------------------
[~jlowe]
Thanks for your explanation. I added a sleep and a kill-job command in the
constructor of ResourceLocalizedEvent, and I also added a log statement to show
the suspicious file path. By doing this, I triggered the warning log statement.
The log file shows that the suspicious file is deleted when the application
finishes:
{code:java}
2018-08-31 16:43:23,357 WARN org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.LocalResourcesTrackerImpl: Received LOCALIZED event for request { hdfs://hadoop11:29000/tmp/hadoop-yarn/staging/hires/.staging/job_1535704989367_0001/job.splitmetainfo, 1535705000427, FILE, null } but localized resource is missing
2018-08-31 16:43:23,358 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.LocalResourcesTrackerImpl: start to delete file:/home/hires/cloudraid/hadoop/hadoop-3.2.0-SNAPSHOT/tmp/nm-local-dir/usercache/hires/appcache/application_1535704989367_0001/filecache/10
2018-08-31 16:43:23,381 WARN org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.LocalResourcesTrackerImpl: Resource { hdfs://hadoop11:29000/tmp/hadoop-yarn/staging/hires/.staging/job_1535704989367_0001/job.splitmetainfo, 1535705000427, FILE, null } has been removed and will no longer be localized
2018-08-31 16:43:23,395 INFO org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor: Deleting path : /home/hires/cloudraid/hadoop/hadoop-3.2.0-SNAPSHOT/tmp/nm-local-dir/usercache/hires/appcache/application_1535704989367_0001/filecache/10
{code}
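For readers following along, the sequence in the log can be modelled with a
small standalone sketch. This is not the NodeManager code: the class
LocalizationRaceSketch, its Tracker and Entry types, the scheduledDeletes list
and the example path are all hypothetical stand-ins. The sketch only assumes
what the issue description states (bookkeeping is dropped when the reference
count reaches zero in the DOWNLOADING state) plus one possible handling of the
late LOCALIZED report, namely handing the reported path to a cleanup step.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Simplified model of the race; NOT the actual NodeManager classes. */
class LocalizationRaceSketch {

    enum State { DOWNLOADING, LOCALIZED }

    /** Hypothetical per-resource bookkeeping entry. */
    static final class Entry {
        State state = State.DOWNLOADING;
        int refCount = 0;
        String localPath;
    }

    /** Hypothetical stand-in for the resource tracker. */
    static final class Tracker {
        final Map<String, Entry> resources = new HashMap<>();
        /** Paths handed to an assumed cleanup step (nothing is deleted here). */
        final List<String> scheduledDeletes = new ArrayList<>();

        /** A container asks for the resource; download starts on first request. */
        void request(String key) {
            resources.computeIfAbsent(key, k -> new Entry()).refCount++;
        }

        /** A container (for example a killed one) gives up its reference. */
        void release(String key) {
            Entry e = resources.get(key);
            if (e == null) {
                return;
            }
            e.refCount--;
            // Per the issue description: no references left while still
            // downloading means the bookkeeping is removed.
            if (e.refCount <= 0 && e.state == State.DOWNLOADING) {
                resources.remove(key);
            }
        }

        /** The localizer reports a finished download, possibly too late. */
        void localized(String key, String localPath) {
            Entry e = resources.get(key);
            if (e == null) {
                // Late report for a resource that is no longer tracked.
                // Assumed handling: remember the path so it can be cleaned up
                // instead of being left on disk.
                scheduledDeletes.add(localPath);
                return;
            }
            e.state = State.LOCALIZED;
            e.localPath = localPath;
        }
    }

    public static void main(String[] args) {
        Tracker tracker = new Tracker();
        tracker.request("job.splitmetainfo");   // container starts localizing
        tracker.release("job.splitmetainfo");   // container is killed
        // The localizer heartbeat arrives after the bookkeeping is gone.
        tracker.localized("job.splitmetainfo", "/example/filecache/10");
        System.out.println("scheduled for cleanup: " + tracker.scheduledDeletes);
    }
}
```

Without the branch that records the path in localized(), the late report would
only produce a warning and the file would stay on disk, which is the leak the
issue describes.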
was (Author: xiaoheipangzi):
[~jlowe]
Thanks for your explanation. I added a sleep and a kill-job command in the
constructor of ResourceLocalizedEvent, and I also added a log statement to show
the suspicious file path. By doing this, I triggered the warning log statement.
The log file shows that the suspicious file is deleted when the application
finishes.
> Localized resource may leak on disk if container is killed while localizing
> ---------------------------------------------------------------------------
>
> Key: YARN-8703
> URL: https://issues.apache.org/jira/browse/YARN-8703
> Project: Hadoop YARN
> Issue Type: Bug
> Components: nodemanager
> Reporter: Jason Lowe
> Priority: Major
>
> If a container is killed while localizing then it releases all of its
> resources. If the resource count goes to zero and it is in the DOWNLOADING
> state then the resource bookkeeping is removed in the resource tracker.
> Shortly afterwards the localizer could heartbeat in and report the successful
> localization of the resource that was just removed. When the
> LocalResourcesTrackerImpl receives the LOCALIZED event but does not find the
> corresponding LocalResource for the event then it simply logs a "localized
> without a location" warning. At that point I think the localized resource
> has been leaked on the disk since the NM has removed bookkeeping for the
> resource without removing it on disk.
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)