[
https://issues.apache.org/jira/browse/YARN-9192?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16807236#comment-16807236
]
Rayman commented on YARN-9192:
------------------------------
[~sihai]
This is probably because you have set yarn.nodemanager.recovery.enabled to true
and yarn.nodemanager.recovery.supervised to false.
[https://hadoop.apache.org/docs/current/hadoop-yarn/hadoop-yarn-site/NodeManager.html]
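
To illustrate the combination of settings mentioned above, here is a minimal Java sketch. It is a simplified reading of the NodeManager restart documentation linked above, not actual NodeManager code. The class name, the helper methods, and the use of a plain Map in place of the Hadoop configuration object are all hypothetical, and the assumption that both properties default to false should be checked against your yarn-default.xml.

```java
import java.util.HashMap;
import java.util.Map;

class RecoveryConfigSketch {
    // Property names taken from the comment above.
    static final String RECOVERY_ENABLED = "yarn.nodemanager.recovery.enabled";
    static final String RECOVERY_SUPERVISED = "yarn.nodemanager.recovery.supervised";

    // Hypothetical helper: reads a boolean property, assuming a default of false.
    static boolean getBoolean(Map<String, String> conf, String key) {
        return Boolean.parseBoolean(conf.getOrDefault(key, "false"));
    }

    // Simplified model (an assumption): containers are left alone on shutdown
    // only when recovery is enabled AND the NodeManager is declared supervised.
    // In every other case, applications are cleaned up when the NM stops.
    static boolean cleansUpOnShutdown(Map<String, String> conf) {
        boolean keepContainers =
            getBoolean(conf, RECOVERY_ENABLED) && getBoolean(conf, RECOVERY_SUPERVISED);
        return !keepContainers;
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        conf.put(RECOVERY_ENABLED, "true");
        conf.put(RECOVERY_SUPERVISED, "false");
        // The combination suspected in this comment: state is persisted,
        // yet applications are still cleaned up on shutdown.
        System.out.println("cleans up on shutdown: " + cleansUpOnShutdown(conf));
    }
}
```

Under this model, setting yarn.nodemanager.recovery.supervised to true would be the case in which cleanup on shutdown is skipped.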
> Deletion Tasks will be picked up to delete running containers
> -------------------------------------------------------------
>
> Key: YARN-9192
> URL: https://issues.apache.org/jira/browse/YARN-9192
> Project: Hadoop YARN
> Issue Type: Bug
> Components: applications
> Affects Versions: 2.9.1
> Reporter: Sihai Ke
> Priority: Major
>
> I suspect there is a bug in the YARN deletion task service. Below are my
> repro steps:
> # First, set yarn.nodemanager.delete.debug-delay-sec=3600, which means that
> when the app finishes, the binary/container folder will be deleted after
> 3600 seconds.
> # While application App1 (a long-running service) is running on machine1,
> machine1 shuts down. ContainerManagerImpl#serviceStop() is called ->
> ContainerManagerImpl#cleanUpApplicationsOnNMShutDown, and
> ApplicationFinishEvent is sent. Some deletion tasks are then created, but
> they are stored in the DB and will only be picked up for execution after
> 3600 seconds.
> # 100 seconds later, machine1 comes back and the same app is assigned to
> run on this machine; the container is created and works well.
> # The deletion tasks created in step 2 are later picked up and delete the
> containers created in step 3.
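
The timeline in the steps above can be sketched as a small, self-contained Java simulation. This is only a toy model under stated assumptions, not the NodeManager implementation: the class, the DeletionTask type, the methods, and the directory name "appcache/app1" are all hypothetical, and the model simply assumes that a persisted task stores a path and is executed without checking whether a running container uses that path again.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.Iterator;
import java.util.List;
import java.util.Set;

class DelayedDeletionSketch {
    /** Hypothetical stand-in for a deletion task persisted in the NM state store. */
    static final class DeletionTask {
        final String dir;
        final long dueAtSec;

        DeletionTask(String dir, long dueAtSec) {
            this.dir = dir;
            this.dueAtSec = dueAtSec;
        }
    }

    // Survives the simulated restart, like tasks stored in the recovery DB.
    private final List<DeletionTask> persistedTasks = new ArrayList<>();
    // Directories currently used by running containers.
    private final Set<String> liveDirs = new HashSet<>();

    void startContainer(String dir) {
        liveDirs.add(dir);
    }

    // Step 2: shutdown stops the container and schedules a delayed deletion.
    void shutdown(String dir, long nowSec, long debugDelaySec) {
        liveDirs.remove(dir);
        persistedTasks.add(new DeletionTask(dir, nowSec + debugDelaySec));
    }

    // Step 4: run every due task; report directories that were still in use.
    List<String> runDueTasks(long nowSec) {
        List<String> deletedWhileLive = new ArrayList<>();
        Iterator<DeletionTask> it = persistedTasks.iterator();
        while (it.hasNext()) {
            DeletionTask task = it.next();
            if (task.dueAtSec <= nowSec) {
                // No check against running containers: this is the suspected bug.
                if (liveDirs.remove(task.dir)) {
                    deletedWhileLive.add(task.dir);
                }
                it.remove();
            }
        }
        return deletedWhileLive;
    }

    static List<String> simulate() {
        DelayedDeletionSketch nm = new DelayedDeletionSketch();
        String appDir = "appcache/app1";   // hypothetical directory name
        nm.startContainer(appDir);         // App1 running on machine1
        nm.shutdown(appDir, 0, 3600);      // t=0: shutdown, deletion due at t=3600
        nm.startContainer(appDir);         // t=100: same app is assigned again
        nm.runDueTasks(100);               // nothing is due yet
        return nm.runDueTasks(3600);       // t=3600: the stale task fires
    }

    public static void main(String[] args) {
        System.out.println(simulate());    // prints [appcache/app1]
    }
}
```

In this model the directory of the container started in step 3 is removed by the task created in step 2, which matches the behaviour reported in the description.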