[ https://issues.apache.org/jira/browse/YARN-1338?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14005493#comment-14005493 ]
Junping Du commented on YARN-1338:
----------------------------------
Thanks for addressing my comments, [~jlowe]! Some additional comments:
I think we currently use initStorage(conf) both to create the DB items that
store NMState when the NM starts for the first time and to locate those DB
items when the NM restarts. Do we have any code that destroys the NMState DB
items when the NM is decommissioned (i.e., no short-term restart is expected)?
If not, then when the NM is recommissioned, at which point it should be treated
as a fresh node, it will still have stale NMState info if NM_RECOVERY_DIR and
DB_NAME have not changed. Am I missing anything here?
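To make the question concrete, here is a rough sketch of the kind of cleanup I have in mind. It is not taken from the patch: the class name, the destroyStateStore() helper, and the directory names are all hypothetical, and it uses plain java.nio instead of the real state store API. It only shows removing the on-disk DB directory so that a recommissioned node starts with no stale state.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical sketch, not code from the patch.
class StateStoreCleanupSketch {

    // Recursively deletes <recoveryDir>/<dbName>.
    // Returns true if a state store directory existed and was removed.
    static boolean destroyStateStore(Path recoveryDir, String dbName) throws IOException {
        Path dbPath = recoveryDir.resolve(dbName);
        if (!Files.exists(dbPath)) {
            return false; // nothing to clean up
        }
        List<Path> paths;
        try (Stream<Path> walk = Files.walk(dbPath)) {
            // Reverse order so that children are deleted before their parents.
            paths = walk.sorted(Comparator.reverseOrder()).collect(Collectors.toList());
        }
        for (Path p : paths) {
            Files.delete(p);
        }
        return true;
    }

    public static void main(String[] args) throws IOException {
        // Simulate a recovery dir holding a DB directory with one file in it.
        Path recoveryDir = Files.createTempDirectory("nm-recovery");
        Path db = Files.createDirectory(recoveryDir.resolve("example-state-db"));
        Files.createFile(db.resolve("000001.log"));

        boolean removed = destroyStateStore(recoveryDir, "example-state-db");
        System.out.println("removed=" + removed + ", stillExists=" + Files.exists(db));
    }
}
```

Whether something like this belongs in the decommission path or somewhere else is a separate question; the sketch only shows the intended effect.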
In LocalResourcesTrackerImpl#recoverResource():
{code}
+ incrementFileCountForLocalCacheDirectory(localDir.getParent());
{code}
Given that localDir is already the parent of localPath, maybe we should just
increment the count for localDir rather than for its parent? I also didn't see
a unit test that checks the file count for the resource directory after
recovery. Maybe we should add one?
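As an illustration of the check I mean, here is a simplified, self-contained sketch. It is not the real LocalResourcesTrackerImpl: the class, the map-based counter, and the method names are hypothetical stand-ins, and it does not model the actual cache directory layout, so it does not settle which level is correct. It only shows the shape of an assertion on the per-directory file count after recovery.

```java
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for the tracker's per-directory file counting.
class RecoveryFileCountSketch {
    private final Map<Path, Integer> fileCounts = new HashMap<>();

    // Recovers one resource and counts it against the directory that directly
    // contains it (localDir), which is the alternative suggested above.
    void recoverResource(Path localPath) {
        Path localDir = localPath.getParent();
        fileCounts.merge(localDir, 1, Integer::sum);
    }

    // Returns how many recovered resources are counted against dir.
    int getFileCount(Path dir) {
        return fileCounts.getOrDefault(dir, 0);
    }

    public static void main(String[] args) {
        RecoveryFileCountSketch tracker = new RecoveryFileCountSketch();
        Path localPath = Paths.get("filecache", "0", "resource.jar");
        tracker.recoverResource(localPath);

        Path localDir = localPath.getParent();
        System.out.println("count(localDir)=" + tracker.getFileCount(localDir));
        System.out.println("count(parent)=" + tracker.getFileCount(localDir.getParent()));
    }
}
```

A real test would recover a resource through the tracker and then assert on the count for whichever directory is supposed to own it.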
> Recover localized resource cache state upon nodemanager restart
> ---------------------------------------------------------------
>
> Key: YARN-1338
> URL: https://issues.apache.org/jira/browse/YARN-1338
> Project: Hadoop YARN
> Issue Type: Sub-task
> Components: nodemanager
> Affects Versions: 2.3.0
> Reporter: Jason Lowe
> Assignee: Jason Lowe
> Attachments: YARN-1338.patch, YARN-1338v2.patch,
> YARN-1338v3-and-YARN-1987.patch, YARN-1338v4.patch, YARN-1338v5.patch
>
>
> Today when the node manager restarts, we clean up all the distributed cache
> files from disk. This is not ideal for two reasons:
> * For work-preserving restart, we definitely want to keep them, since running
> containers are using them.
> * Even for non-work-preserving restart, keeping them is useful because we
> don't have to download them again if future tasks need them.