[ 
https://issues.apache.org/jira/browse/YARN-3090?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14305514#comment-14305514
 ] 

Jason Lowe commented on YARN-3090:
----------------------------------

Thanks for the patch, Varun!  I kicked the tires on this patch and discovered 
it doesn't work.  I forced the container executor to throw a runtime exception 
but it wasn't logged.

It turns out I was mistaken in my earlier analysis.  The afterExecute method 
normally does not receive any exception because it only reports exceptions that 
the _FutureTask run method_ itself throws, and normally that throws nothing.  
FutureTask.run catches any exception thrown by the underlying Runnable and 
stores it rather than rethrowing it.  Therefore afterExecute sees no exception 
even if one escaped during deletion, and we log nothing.  I mistakenly assumed 
that the thread pool executor would pass any exception stored by the FutureTask 
to afterExecute, but that's not the case.
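
The behavior described above can be reproduced outside of YARN.  The following 
is a minimal sketch rather than code from the patch: the class name 
AfterExecuteDemo, the Observed holder and the simulated failure are all 
hypothetical, and only standard java.util.concurrent calls are used.  It 
schedules one failing task and records both what the pool passes to 
afterExecute and what the returned Future holds.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.Future;
import java.util.concurrent.ScheduledThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical demo class, not part of the patch.
class AfterExecuteDemo {
  /** What each side observed for one failing task. */
  static final class Observed {
    Throwable passedToAfterExecute;  // argument the pool handed to afterExecute
    Throwable storedInFuture;        // cause recovered via Future.get()
  }

  static Observed runFailingTask() {
    final AtomicReference<Throwable> seen = new AtomicReference<>();
    final CountDownLatch hookCalled = new CountDownLatch(1);
    ScheduledThreadPoolExecutor pool = new ScheduledThreadPoolExecutor(1) {
      @Override
      protected void afterExecute(Runnable task, Throwable exception) {
        // Record the argument exactly as the pool supplied it.
        seen.set(exception);
        hookCalled.countDown();
      }
    };
    // Typed as Runnable so the Runnable overload of schedule() is used.
    Runnable failing = () -> {
      throw new IllegalStateException("simulated failure");
    };
    Observed result = new Observed();
    try {
      Future<?> future = pool.schedule(failing, 0, TimeUnit.MILLISECONDS);
      if (!hookCalled.await(10, TimeUnit.SECONDS)) {
        throw new IllegalStateException("afterExecute was never called");
      }
      result.passedToAfterExecute = seen.get();
      try {
        future.get();
      } catch (ExecutionException e) {
        // The exception thrown by the Runnable was stored in the Future.
        result.storedInFuture = e.getCause();
      }
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new IllegalStateException(e);
    } finally {
      pool.shutdownNow();
    }
    return result;
  }

  public static void main(String[] args) {
    Observed o = runFailingTask();
    System.out.println("afterExecute saw: " + o.passedToAfterExecute);
    System.out.println("Future holds:     " + o.storedInFuture);
  }
}
```

In this sketch passedToAfterExecute comes back null while storedInFuture is the 
IllegalStateException, which matches the fire-and-forget problem in the issue 
description: nothing ever calls get() on the Future, so nothing ever sees the 
failure.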

To log the escaped exceptions we'd need to do something like this in 
afterExecute:

{code}
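    @Override
    protected void afterExecute(Runnable task, Throwable exception) {
      // The scheduled pool wraps tasks in a FutureTask, which stores any
      // exception thrown by the Runnable instead of rethrowing it.
      if (task instanceof FutureTask<?>) {
        FutureTask<?> futureTask = (FutureTask<?>) task;
        if (!futureTask.isCancelled()) {
          try {
            // For a one-shot task that has already run, this returns at once.
            futureTask.get();
          } catch (ExecutionException e) {
            exception = e.getCause();
          } catch (InterruptedException e) {
            // get() declares this checked exception, so it must be handled.
            exception = e;
          }
        }
      }
      if (exception != null) {
        LOG.warn("Exception during execution of task in DeletionService",
            exception);
      }
    }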
{code}

A couple of other nits on the patch:
* The new class should be private
* The log message should be an ERROR rather than a WARN since this should only 
occur when an exception escaped the exception handling already in place for 
deletion.  Those kinds of exceptions are usually pretty bad (like NPEs).
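
Putting the afterExecute approach and the two nits together, a self-contained 
sketch might look like the following.  This is an illustration under stated 
assumptions, not the patch itself: LoggingExecutorSketch, LoggingExecutor, 
lastLogged, hookCalled and runAndReturnLogged are hypothetical names, and 
java.util.logging with Level.SEVERE stands in for the NodeManager's own logger 
at ERROR level so that the example runs without Hadoop on the classpath.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.Future;
import java.util.concurrent.ScheduledThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.logging.Level;
import java.util.logging.Logger;

// Hypothetical wrapper class, used only to make the sketch runnable.
class LoggingExecutorSketch {
  // Assumption: java.util.logging replaces the real DeletionService logger.
  private static final Logger LOG =
      Logger.getLogger(LoggingExecutorSketch.class.getName());

  /** Private nested executor, per the first nit. */
  private static final class LoggingExecutor
      extends ScheduledThreadPoolExecutor {
    volatile Throwable lastLogged;  // test hook, not needed in real code
    final CountDownLatch hookCalled = new CountDownLatch(1);

    LoggingExecutor(int threads) {
      super(threads);
    }

    @Override
    protected void afterExecute(Runnable task, Throwable exception) {
      super.afterExecute(task, exception);
      if (exception == null && task instanceof Future<?>) {
        Future<?> future = (Future<?>) task;
        // isDone() guards against blocking on a task that has not completed.
        if (future.isDone() && !future.isCancelled()) {
          try {
            future.get();
          } catch (ExecutionException e) {
            exception = e.getCause();
          } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
          }
        }
      }
      if (exception != null) {
        lastLogged = exception;
        // SEVERE is the java.util.logging stand-in for ERROR (second nit).
        LOG.log(Level.SEVERE,
            "Exception during execution of task in DeletionService",
            exception);
      }
      hookCalled.countDown();
    }
  }

  /** Runs one task and returns the exception that was logged, or null. */
  static Throwable runAndReturnLogged(Runnable task) {
    LoggingExecutor pool = new LoggingExecutor(1);
    try {
      pool.schedule(task, 0, TimeUnit.MILLISECONDS);
      if (!pool.hookCalled.await(10, TimeUnit.SECONDS)) {
        throw new IllegalStateException("afterExecute was never called");
      }
      return pool.lastLogged;
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new IllegalStateException(e);
    } finally {
      pool.shutdownNow();
    }
  }

  public static void main(String[] args) {
    Runnable failing = () -> {
      throw new NullPointerException("simulated NPE");
    };
    System.out.println("logged: " + runAndReturnLogged(failing));
  }
}
```

The isDone() check is an extra precaution that is not in the snippet above.  
My understanding is that a periodic task is not marked done between runs, so 
calling get() on one from afterExecute could block the worker thread.  That 
should not matter if the DeletionService only schedules one-shot tasks, but 
the guard is cheap.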

> DeletionService can silently ignore deletion task failures
> ----------------------------------------------------------
>
>                 Key: YARN-3090
>                 URL: https://issues.apache.org/jira/browse/YARN-3090
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: nodemanager
>    Affects Versions: 2.1.1-beta
>            Reporter: Jason Lowe
>            Assignee: Varun Saxena
>         Attachments: YARN-3090.001.patch, YARN-3090.002.patch
>
>
> If a non-I/O exception occurs while the DeletionService is executing a 
> deletion task then it will be silently ignored.  The exception bubbles up to 
> the thread workers of the ScheduledThreadPoolExecutor which simply attaches 
> the throwable to the Future that was returned when the task was scheduled.  
> However the thread pool is used as a fire-and-forget pool, so nothing ever 
> looks at the Future and therefore the exception is never logged.



