[ https://issues.apache.org/jira/browse/YARN-7070?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16143813#comment-16143813 ]
Jason Lowe commented on YARN-7070:
----------------------------------
Hmm, at this point we cannot know for sure what's going on here without more
information in the logs. I'd recommend instrumenting the container-executor
binary so it logs what it's doing during deletes. For example, have it log
every path it tries to delete; that should show where it stops deleting, or
whether it even attempts to delete the paths that remain. Also emitting a log
at the end stating it completed successfully would help in case it is actually
crashing during deletes but the NM is not recognizing that properly and
reporting it.
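
To illustrate the kind of trace I mean, here is a minimal sketch in Python.
This is not container-executor code (that is a native binary), and the names
delete_tree_logged and _remove are made up for this example. It only shows the
shape of the logging: one line per path before the delete is attempted, one
line per failure, and a final line proving the walk ran to completion.

```python
import logging
import os

log = logging.getLogger("delete_trace")


def _remove(path, remover, failures):
    """Log the path, then try to remove it; record it on failure."""
    log.info("deleting %s", path)
    try:
        remover(path)
    except OSError as exc:
        log.error("failed to delete %s: %s", path, exc)
        failures.append(path)


def delete_tree_logged(root):
    """Recursively delete root, logging every path before removing it.

    Returns the list of paths that could not be listed or deleted.
    """
    failures = []

    def on_walk_error(exc):
        # Called by os.walk when a directory cannot be listed.
        log.error("cannot list %s: %s", exc.filename, exc)
        failures.append(exc.filename)

    # topdown=False yields children before their parent, so each directory
    # should already be empty by the time rmdir is called on it.
    for dirpath, dirnames, filenames in os.walk(
        root, topdown=False, onerror=on_walk_error
    ):
        for name in filenames:
            _remove(os.path.join(dirpath, name), os.unlink, failures)
        for name in dirnames:
            path = os.path.join(dirpath, name)
            # Symlinks to directories show up in dirnames; unlink those.
            remover = os.unlink if os.path.islink(path) else os.rmdir
            _remove(path, remover, failures)
    _remove(root, os.rmdir, failures)

    # The completion line: if it is missing from the log, the delete died
    # part-way through instead of finishing with errors.
    log.info("delete of %s finished, %d failure(s)", root, len(failures))
    return failures
```

Running it with logging.basicConfig(level=logging.INFO) against a copy of one
of the leftover application directories would show whether the blockmgr-*
paths are ever visited and, if so, which call fails on them.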
Also to clarify -- are these directories always getting left behind or only
under certain circumstances (e.g., only when an application or container is
getting killed)?
> some of local cache files for yarn can't be deleted
> ---------------------------------------------------
>
> Key: YARN-7070
> URL: https://issues.apache.org/jira/browse/YARN-7070
> Project: Hadoop YARN
> Issue Type: Bug
> Components: nodemanager
> Affects Versions: 2.8.1
> Environment: Hadoop 2.8.1
> Reporter: Changyao Ye
> Attachments: application_1501810184023_55949.log
>
>
> We have found that some of the cache files (in
> /tmp/hadoop-yarn/nm-local-dir/usercache/hdfs/appcache) for yarn on the
> nodemanager cannot be deleted properly. The directories look like the
> following (blockmgr***):
> =================
> # ls -ltr application_1501810184023_55949
> total 120
> drwx--x--- 2 hdfs yarn 4096 Aug 22 04:29 filecache
> drwxr-s--- 2 hdfs yarn 4096 Aug 22 04:56 blockmgr-881fab2c-fba4-4bb1-8dd9-5ab35a512df7
> drwxr-s--- 10 hdfs yarn 4096 Aug 22 04:56 blockmgr-bf8a19f5-e9ae-4269-a0ef-b27d0f9c17e7
> drwxr-s--- 11 hdfs yarn 4096 Aug 22 04:58 blockmgr-f3437e8d-9595-4898-8bda-92ebff3ada1d
> drwxr-s--- 18 hdfs yarn 4096 Aug 22 05:01 blockmgr-930c0cd8-1d31-4cdb-a244-f6ad4bf74bff
> drwxr-s--- 12 hdfs yarn 4096 Aug 22 05:13 blockmgr-83fc0702-ac40-4743-812a-7d488e92004e
> drwxr-s--- 9 hdfs yarn 4096 Aug 22 05:13 blockmgr-f6cfe045-12c3-41d6-b77e-aa5200daeb6a
> drwxr-s--- 12 hdfs yarn 4096 Aug 22 05:13 blockmgr-53dcb4ea-ba5d-4b8b-859b-805b9303a149
> drwxr-s--- 10 hdfs yarn 4096 Aug 22 05:13 blockmgr-0c0c4bb9-ef5e-4ca1-8d23-ce5cd58d0a75
> drwxr-s--- 9 hdfs yarn 4096 Aug 22 05:13 blockmgr-557d0f39-67d2-491a-9307-12fc1d724380
> drwxr-s--- 10 hdfs yarn 4096 Aug 22 05:13 blockmgr-fbc87680-4df7-498e-bf6d-456a5aea4fc9
> drwxr-s--- 10 hdfs yarn 4096 Aug 22 05:13 blockmgr-53ee8251-fac1-4f62-82c2-5e970f0d86ec
> drwxr-s--- 9 hdfs yarn 4096 Aug 22 05:14 blockmgr-5a8bc187-abcf-482d-9da5-e8c4647d4731
> drwxr-s--- 10 hdfs yarn 4096 Aug 22 05:14 blockmgr-251c3a99-cd85-442a-8945-52c344c0d861
> drwxr-s--- 13 hdfs yarn 4096 Aug 22 05:14 blockmgr-c352c1ad-15dc-456b-8b62-5b83b9950494
> drwxr-s--- 12 hdfs yarn 4096 Aug 22 05:15 blockmgr-b4f01347-4b51-4b35-8146-2aa840084c2b
> drwxr-s--- 14 hdfs yarn 4096 Aug 22 05:15 blockmgr-0095d26c-c134-48b4-82a6-e8ae02f0189c
> drwxr-s--- 13 hdfs yarn 4096 Aug 22 05:15 blockmgr-28a31574-61ae-459f-be3a-8608892246d7
> drwxr-s--- 16 hdfs yarn 4096 Aug 22 05:15 blockmgr-c0cd0df9-b355-4209-b6aa-b549a1fa36eb
> drwxr-s--- 11 hdfs yarn 4096 Aug 22 05:15 blockmgr-a2730abb-9517-461e-bedf-d9a2dcef373f
> drwxr-s--- 14 hdfs yarn 4096 Aug 22 05:15 blockmgr-91dd2e1a-6bc2-4429-8b71-2f4240987159
> drwxr-s--- 12 hdfs yarn 4096 Aug 22 05:15 blockmgr-f4e3a586-8817-45ea-a197-9fdbb3d91946
> drwxr-s--- 15 hdfs yarn 4096 Aug 22 05:15 blockmgr-ba2c605e-89d8-4f7c-b42c-6ed4ba6bf4ea
> drwxr-s--- 16 hdfs yarn 4096 Aug 22 05:15 blockmgr-2ae72383-5f72-4002-84a7-e6335b8c2b6c
> drwxr-s--- 13 hdfs yarn 4096 Aug 22 05:15 blockmgr-6c5e260f-d3c7-4af6-91c1-168c73343f2d
> drwxr-s--- 16 hdfs yarn 4096 Aug 22 05:15 blockmgr-2e9923b1-281c-4a9d-8069-6c5430bd5fc3
> drwxr-s--- 18 hdfs yarn 4096 Aug 22 05:15 blockmgr-cc3f1406-d8a2-4bf5-a276-8f7aed75c513
> drwxr-s--- 11 hdfs yarn 4096 Aug 22 05:15 blockmgr-975bcce0-84b2-4590-880b-bf182d76e319
> drwxr-s--- 11 hdfs yarn 4096 Aug 22 05:15 blockmgr-ce82cb63-5998-4227-b85e-77f1c633db43
> drwxr-s--- 11 hdfs yarn 4096 Aug 22 05:15 blockmgr-592af4aa-3c89-4081-8746-29b99f2220b1
> =================
> We also applied the patches for YARN-4594 and YARN-4731, but nothing changed.
> YARN-4594 https://issues.apache.org/jira/browse/YARN-4594
> YARN-4731 https://issues.apache.org/jira/browse/YARN-4731
> Any advice will be greatly appreciated.