[ https://issues.apache.org/jira/browse/YARN-8385?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16534030#comment-16534030 ]
Jason Lowe commented on YARN-8385:
----------------------------------
YARN cleans up a container's working directories immediately after the
container terminates (unless the debug delay is configured on the NM). The
application directories are not cleaned up until the application completes,
because that is where a container can leave local data that may be accessed by
a subsequent container on the node or by an auxiliary service (e.g., shuffle
data served by a shuffle handler auxiliary service, as is done for MapReduce,
Tez, and Spark).
Are you sure the data is being placed in the container's working directory and
not the application directory?
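
One way to check is to look at where the leftover files sit relative to the
NodeManager local-dir layout, which normally looks like
<local-dir>/usercache/<user>/appcache/<application_id>/<container_id>/...
The following is only a rough sketch, not YARN code: the function name, the
regular expressions, and the example paths are assumptions made for
illustration. It labels a path as belonging to the container's working
directory (removed when the container terminates) or to the application
directory (kept until the application completes).

```python
import re
from pathlib import PurePosixPath

# Assumed ID shapes: application_<clusterTs>_<seq> and
# container_[e<epoch>_]<clusterTs>_<appSeq>_<attempt>_<containerSeq>.
APP_RE = re.compile(r"^application_\d+_\d+$")
CONTAINER_RE = re.compile(r"^container_(e\d+_)?\d+_\d+_\d+_\d+$")


def classify_local_path(path: str) -> str:
    """Return 'container', 'application', or 'unknown' for an NM-local path.

    Hypothetical helper: it only inspects the path string and never
    touches the filesystem.
    """
    parts = PurePosixPath(path).parts
    for i, part in enumerate(parts):
        # Look for .../appcache/<application_id>/...
        if part == "appcache" and i + 1 < len(parts) and APP_RE.match(parts[i + 1]):
            # A container ID directly below the application ID marks the
            # container's working directory.
            if i + 2 < len(parts) and CONTAINER_RE.match(parts[i + 2]):
                return "container"
            # Anything else below the application ID is application-level data.
            return "application"
    return "unknown"


if __name__ == "__main__":
    # Example paths are made up for illustration.
    base = "/data/nm-local/usercache/alice/appcache/application_1528000000000_0001"
    print(classify_local_path(base + "/container_1528000000000_0001_01_000002/tmp/x"))
    print(classify_local_path(base + "/blockmgr-1234/0c/shuffle_0_0_0.data"))
```

If the leftover files are classified as "application", they are in the
directory that is intentionally kept until the application completes.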
> Clean local directories when a container is killed
> --------------------------------------------------
>
> Key: YARN-8385
> URL: https://issues.apache.org/jira/browse/YARN-8385
> Project: Hadoop YARN
> Issue Type: Bug
> Affects Versions: 2.7.0
> Reporter: Marco Gaido
> Priority: Major
>
> In long-running applications, many containers may be created and killed. One
> use case is the Spark Thrift Server with dynamic allocation enabled: many
> containers are killed while the application keeps running indefinitely.
> Currently, YARN seems to remove the local directories only when the whole
> application terminates. In the scenario described above, this can cause
> serious resource leaks. Please see
> https://issues.apache.org/jira/browse/SPARK-22575.
> I think YARN should clean up all the local directories of a container when it
> is killed and not when the whole application terminates.