[
https://issues.apache.org/jira/browse/YARN-1338?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14012235#comment-14012235
]
Junping Du commented on YARN-1338:
----------------------------------
bq. Good point. I added shutdown code that removes the recovery directory if
the shutdown is due to a decommission. I also added a unit test for this
scenario.
Thanks for addressing my comments, Jason!
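For readers following the thread, the decommission cleanup described in the quote could look roughly like the sketch below. It is not the code from the patch: the class name, the method name, and the boolean flag are hypothetical, and it uses only plain java.nio calls rather than any YARN API.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical sketch; names are illustrative and not taken from the patch.
final class RecoveryCleanupSketch {

    /**
     * Removes the recovery directory only when the shutdown is due to a
     * decommission. Returns true if the directory was removed.
     */
    static boolean cleanupOnShutdown(Path recoveryDir, boolean decommissioned) {
        if (!decommissioned || !Files.exists(recoveryDir)) {
            // Normal shutdown (or nothing to delete): keep state for recovery.
            return false;
        }
        try (Stream<Path> walk = Files.walk(recoveryDir)) {
            // Reverse order puts children before their parents, so every
            // directory is already empty by the time it is deleted.
            List<Path> ordered =
                walk.sorted(Comparator.reverseOrder()).collect(Collectors.toList());
            for (Path p : ordered) {
                Files.delete(p);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return true;
    }
}
```

The point of the flag is that an ordinary restart leaves the recovery state in place, while a decommissioned node has nothing worth recovering.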
bq. The last component of localDir is the unique resource ID and not a
directory managed by the local cache directory manager.
I see. That is really confusing, so we should document it somewhere
(it doesn't have to be in this patch, though, given that the patch is already big enough).
I will review it again today.
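As a small illustration of the localDir point above, the sketch below splits a path into its last component (the unique resource ID, per Jason's explanation) and everything above it. The example path and all names are hypothetical, and treating the parent as the part tracked by the directory manager is an assumption of this sketch, not something confirmed by the patch.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical sketch; the path layout and names are illustrative only.
final class LocalDirSketch {

    // Last path component: the unique resource ID, per the comment above.
    // Note: getFileName() returns null for a root path; not handled here.
    static String resourceId(Path localDir) {
        return localDir.getFileName().toString();
    }

    // Everything above the last component. Assumed here to be the part
    // that the local cache directory manager would track.
    static Path managedDir(Path localDir) {
        return localDir.getParent();
    }

    public static void main(String[] args) {
        Path localDir = Paths.get("/hypothetical/nm-local-dir/filecache/0/12");
        System.out.println("resource ID: " + resourceId(localDir));
        System.out.println("managed dir: " + managedDir(localDir));
    }
}
```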
> Recover localized resource cache state upon nodemanager restart
> ---------------------------------------------------------------
>
> Key: YARN-1338
> URL: https://issues.apache.org/jira/browse/YARN-1338
> Project: Hadoop YARN
> Issue Type: Sub-task
> Components: nodemanager
> Affects Versions: 2.3.0
> Reporter: Jason Lowe
> Assignee: Jason Lowe
> Attachments: YARN-1338.patch, YARN-1338v2.patch,
> YARN-1338v3-and-YARN-1987.patch, YARN-1338v4.patch, YARN-1338v5.patch,
> YARN-1338v6.patch
>
>
> Today, when the nodemanager restarts, we clean up all the distributed cache
> files from disk. This is not ideal for two reasons:
> * For work-preserving restart, we definitely want to keep them, since running
> containers are using them.
> * Even for non-work-preserving restart, keeping them is useful because we
> don't have to download them again if future tasks need them.