[ 
https://issues.apache.org/jira/browse/YARN-2183?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14167175#comment-14167175
 ] 

Karthik Kambatla edited comment on YARN-2183 at 10/10/14 5:42 PM:
------------------------------------------------------------------

Thanks for updating the patch, Chris.

Review comments on the current patch:
# YarnConfiguration - let us include units for time in the config names; 
period-mins, initial-delay-mins, resource-sleep-ms.
# CleanerService
## Don't need a SHUTDOWN_HOOK_PRIORITY specific to CleanerService, it is not 
being used anywhere either.
## Nit: In the constructor, can we explicitly initialize AtomicBoolean(false). 
## Nit: Rename scheduler to executor or scheduledExecutor? 
## serviceStart: the exception thrown should have a message more amenable to 
the user. How about appending "It appears there is another CleanerService 
running in the cluster"? 
## serviceStop: the log message that the background thread stopped should be 
moved to the else-block in the try, instead of after catch? 
{code}
    try {
      if (!scheduler.awaitTermination(10, TimeUnit.SECONDS)) {
        LOG.warn("Gave up waiting for the cleaner task to shutdown.");
      }
    } catch (InterruptedException e) {
      LOG.warn("The cleaner service was interrupted while shutting down the 
task.",
          e);
    }
    LOG.info("The background thread stopped.");
{code}
## runCleanerTask: Instead of checking if there is a scheduled-cleaner-task 
running here, why not just rely on the check in CleanerTask#run(). Agree, we 
might be doing a little more work here unnecessarily, but not sure the savings 
are worth an extra check and an extra parameter in the CleanerTask constructor. 
## How does a user use runCleanerTask? Instantiate another SCM? The SCM isn't 
listening to any requests. I can see the SCM being run in the RM, and one could 
potentially add "yarn rmadmin -clean-shared-cache". In any case, given there is 
no way to reach a running SCM, I would remove runCleanerTask altogether for 
now, and add it back later when we need it? Thoughts? 
## Should we worry about users starting SCMs with roots at different levels 
that can lead to multiple cleaners? 
# CleanerTask:
## Nit: RENAMED_SUFFIX can be private
## Add @Override to run()
## The following code is duplicated in InMemoryStateStore a well. May be, we 
should just add a static method to SharedCacheUtil? 
{code}
      StringBuilder pattern = new StringBuilder();
      for (int i = 0; i < nestedLevel; i++) {
        pattern.append("*/");
      }
      pattern.append("*");
{code}
## process: the sleep between directories is in millis, do we want to really 
calculate nanos? 
## Should cleanResourceReferences be moved to SCMStore?
## For the race condition (YARN-2663), would it help to handle the delete files 
on HDFS in the store#remove? 
# CleanerMetrics:
## Make initSingleton private and call it in getInstance if the instance is 
null? 
## How about using MutableRate or MutableStat for the rates?
# Do we need CleanerMetricsCollector, wouldn't CleanerMetrics extending 
MetricsSource suffice? 


was (Author: kkambatl):
Thanks for updating the patch, Chris.

Review comments on the current patch:
# YarnConfiguration - let us include units for time in the config names; 
period-mins, initial-delay-mins, resource-sleep-ms.
# CleanerService
## Don't need a SHUTDOWN_HOOK_PRIORITY specific to CleanerService, it is not 
being used anywhere either.
## Nit: In the constructor, can we explicitly initialize AtomicBoolean(false). 
## Nit: Rename scheduler to executor or scheduledExecutor? 
## serviceStart: the exception thrown should have a message more amenable to 
the user. How about appending "It appears there is another CleanerService 
running in the cluster"? 
## serviceStop: the log message that the background thread stopped should be 
moved to the else-block in the try, instead of after catch? 
{code}
    try {
      if (!scheduler.awaitTermination(10, TimeUnit.SECONDS)) {
        LOG.warn("Gave up waiting for the cleaner task to shutdown.");
      }
    } catch (InterruptedException e) {
      LOG.warn("The cleaner service was interrupted while shutting down the 
task.",
          e);
    }
    LOG.info("The background thread stopped.");
{code}
## runCleanerTask: Instead of checking if there is a scheduled-cleaner-task 
running here, why not just rely on the check in CleanerTask#run(). Agree, we 
might be doing a little more work here unnecessarily, but not sure the savings 
are worth an extra check and an extra parameter in the CleanerTask constructor. 
## How does a user use runCleanerTask? Instantiate another SCM? The SCM isn't 
listening to any requests. I can see the SCM being run in the RM, and one could 
potentially add "yarn rmadmin -clean-shared-cache". In any case, given there is 
no way to reach a running SCM, I would remove runCleanerTask altogether for 
now, and add it back later when we need it? Thoughts? 
## Should we worry about users starting SCMs with roots at different levels 
that can lead to multiple cleaners? 
#CleanerTask:
## Nit: RENAMED_SUFFIX can be private
## Add @Override to run()
## The following code is duplicated in InMemoryStateStore a well. May be, we 
should just add a static method to SharedCacheUtil? 
{code}
      StringBuilder pattern = new StringBuilder();
      for (int i = 0; i < nestedLevel; i++) {
        pattern.append("*/");
      }
      pattern.append("*");
{code}
## process: the sleep between directories is in millis, do we want to really 
calculate nanos? 
## Should cleanResourceReferences be moved to SCMStore?
## For the race condition (YARN-2663), would it help to handle the delete files 
on HDFS in the store#remove? 
# CleanerMetrics:
## Make initSingleton private and call it in getInstance if the instance is 
null? 
## How about using MutableRate or MutableStat for the rates?
# Do we need CleanerMetricsCollector, wouldn't CleanerMetrics extending 
MetricsSource suffice? 

> Cleaner service for cache manager
> ---------------------------------
>
>                 Key: YARN-2183
>                 URL: https://issues.apache.org/jira/browse/YARN-2183
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>            Reporter: Chris Trezzo
>            Assignee: Chris Trezzo
>         Attachments: YARN-2183-trunk-v1.patch, YARN-2183-trunk-v2.patch, 
> YARN-2183-trunk-v3.patch, YARN-2183-trunk-v4.patch
>
>
> Implement the cleaner service for the cache manager along with metrics for 
> the service. This service is responsible for cleaning up old resource 
> references in the manager and removing stale entries from the cache.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Reply via email to