Jason Lowe created YARN-6846:
--------------------------------
Summary: Nodemanager can fail to fully delete application local
directories when applications are killed
Key: YARN-6846
URL: https://issues.apache.org/jira/browse/YARN-6846
Project: Hadoop YARN
Issue Type: Bug
Components: nodemanager
Affects Versions: 2.8.1
Reporter: Jason Lowe
Priority: Critical
When an application is killed, all of its running containers are killed and the
app waits for those containers to complete before cleaning up. As each container
completes, its container directory is deleted via the DeletionService. After
all containers have completed, the app completes and the app directory is
deleted. If the app completes quickly enough, the deletions of the container
and app directories can race against each other. If the container deletion
executor deletes a file just before the application deletion executor tries to
delete it, the application deletion can fail, leaving the remaining entries in
the application directory lingering.
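
To illustrate the race described above, here is a minimal sketch of a recursive
delete that tolerates entries vanishing underneath it. It is not the
NodeManager's DeletionService code and not a proposed patch; the class name
TolerantDelete and the method deleteTree are hypothetical, and it assumes plain
java.nio.file access to the local directories. The idea is that an entry
already removed by a concurrent deleter counts as deleted rather than as a
failure that aborts the rest of the walk.

```java
import java.io.IOException;
import java.nio.file.FileVisitResult;
import java.nio.file.Files;
import java.nio.file.NoSuchFileException;
import java.nio.file.Path;
import java.nio.file.SimpleFileVisitor;
import java.nio.file.attribute.BasicFileAttributes;

// Hypothetical helper for illustration only; not part of Hadoop.
final class TolerantDelete {

    private TolerantDelete() {
    }

    /**
     * Recursively deletes root. Entries that disappear while the walk is in
     * progress (for example, removed by another deletion task) are treated
     * as already deleted instead of failing the whole operation.
     */
    static void deleteTree(Path root) throws IOException {
        Files.walkFileTree(root, new SimpleFileVisitor<Path>() {
            @Override
            public FileVisitResult visitFile(Path file, BasicFileAttributes attrs)
                    throws IOException {
                // deleteIfExists does not throw if the file is already gone.
                Files.deleteIfExists(file);
                return FileVisitResult.CONTINUE;
            }

            @Override
            public FileVisitResult visitFileFailed(Path file, IOException exc)
                    throws IOException {
                // The entry vanished before it could be visited: keep going.
                if (exc instanceof NoSuchFileException) {
                    return FileVisitResult.CONTINUE;
                }
                throw exc;
            }

            @Override
            public FileVisitResult postVisitDirectory(Path dir, IOException exc)
                    throws IOException {
                // Ignore a directory that vanished mid-listing; rethrow anything else.
                if (exc != null && !(exc instanceof NoSuchFileException)) {
                    throw exc;
                }
                Files.deleteIfExists(dir);
                return FileVisitResult.CONTINUE;
            }
        });
    }
}
```

With a helper like this, calling TolerantDelete.deleteTree(appDir) after a
container subdirectory has already been removed would still clear the rest of
the application directory. Other I/O errors, such as permission failures, are
still propagated to the caller.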