[
https://issues.apache.org/jira/browse/YARN-2183?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14167175#comment-14167175
]
Karthik Kambatla commented on YARN-2183:
----------------------------------------
Thanks for updating the patch, Chris.
Review comments on the current patch:
# YarnConfiguration - let us include units for time in the config names;
period-mins, initial-delay-mins, resource-sleep-ms.
# CleanerService
## Don't need a SHUTDOWN_HOOK_PRIORITY specific to CleanerService, it is not
being used anywhere either.
## Nit: In the constructor, can we explicitly initialize AtomicBoolean(false).
## Nit: Rename scheduler to executor or scheduledExecutor?
## serviceStart: the exception thrown should have a message more amenable to
the user. How about appending "It appears there is another CleanerService
running in the cluster"?
## serviceStop: the log message that the background thread stopped should be
moved to the else-block in the try, instead of after catch?
{code}
try {
if (!scheduler.awaitTermination(10, TimeUnit.SECONDS)) {
LOG.warn("Gave up waiting for the cleaner task to shutdown.");
}
} catch (InterruptedException e) {
LOG.warn("The cleaner service was interrupted while shutting down the
task.",
e);
}
LOG.info("The background thread stopped.");
{code}
## runCleanerTask: Instead of checking if there is a scheduled-cleaner-task
running here, why not just rely on the check in CleanerTask#run(). Agree, we
might be doing a little more work here unnecessarily, but not sure the savings
are worth an extra check and an extra parameter in the CleanerTask constructor.
## How does a user use runCleanerTask? Instantiate another SCM? The SCM isn't
listening to any requests. I can see the SCM being run in the RM, and one could
potentially add "yarn rmadmin -clean-shared-cache". In any case, given there is
no way to reach a running SCM, I would remove runCleanerTask altogether for
now, and add it back later when we need it? Thoughts?
## Should we worry about users starting SCMs with roots at different levels
that can lead to multiple cleaners?
#CleanerTask:
## Nit: RENAMED_SUFFIX can be private
## Add @Override to run()
## The following code is duplicated in InMemoryStateStore a well. May be, we
should just add a static method to SharedCacheUtil?
{code}
StringBuilder pattern = new StringBuilder();
for (int i = 0; i < nestedLevel; i++) {
pattern.append("*/");
}
pattern.append("*");
{code}
## process: the sleep between directories is in millis, do we want to really
calculate nanos?
## Should cleanResourceReferences be moved to SCMStore?
## For the race condition (YARN-2663), would it help to handle the delete files
on HDFS in the store#remove?
# CleanerMetrics:
## Make initSingleton private and call it in getInstance if the instance is
null?
## How about using MutableRate or MutableStat for the rates?
# Do we need CleanerMetricsCollector, wouldn't CleanerMetrics extending
MetricsSource suffice?
> Cleaner service for cache manager
> ---------------------------------
>
> Key: YARN-2183
> URL: https://issues.apache.org/jira/browse/YARN-2183
> Project: Hadoop YARN
> Issue Type: Sub-task
> Reporter: Chris Trezzo
> Assignee: Chris Trezzo
> Attachments: YARN-2183-trunk-v1.patch, YARN-2183-trunk-v2.patch,
> YARN-2183-trunk-v3.patch, YARN-2183-trunk-v4.patch
>
>
> Implement the cleaner service for the cache manager along with metrics for
> the service. This service is responsible for cleaning up old resource
> references in the manager and removing stale entries from the cache.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)