[ https://issues.apache.org/jira/browse/YARN-3090?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14305514#comment-14305514 ]
Jason Lowe commented on YARN-3090:
----------------------------------
Thanks for the patch, Varun! I kicked the tires on this patch and discovered
it doesn't work. I forced the container executor to throw a runtime exception
but it wasn't logged.
It turns out I was mistaken in my earlier analysis. The afterExecute method
normally does not receive any exception because it only reports exceptions the
_FutureTask run method_ itself throws, and normally that doesn't throw
anything. FutureTask catches any exception thrown by the underlying Runnable,
stores it so that a later get() call can report it, and almost always throws
nothing itself. Therefore afterExecute sees a null exception even if one
escaped during deletion, and we log nothing. I mistakenly assumed that the
thread pool executor would pass any exception stored by the FutureTask to
afterExecute, but that's not the case.
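The behavior can be reproduced outside the NodeManager with a small standalone
sketch. The class and method names below (AfterExecuteDemo, observe) are
hypothetical and not part of the patch; only the java.util.concurrent calls are
real. It schedules a task that throws, records what afterExecute was given, and
then asks the Future for the stored failure:

```java
import java.util.concurrent.ExecutionException;
import java.util.concurrent.Future;
import java.util.concurrent.ScheduledThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicReference;

class AfterExecuteDemo {
  // Hypothetical marker to tell "never called" apart from "called with null".
  static final Throwable NOT_CALLED = new Throwable("afterExecute not called");

  /** Returns { what afterExecute received, cause reported by Future.get() }. */
  public static String[] observe() throws InterruptedException {
    final AtomicReference<Throwable> seen = new AtomicReference<>(NOT_CALLED);
    ScheduledThreadPoolExecutor pool = new ScheduledThreadPoolExecutor(1) {
      @Override
      protected void afterExecute(Runnable task, Throwable exception) {
        super.afterExecute(task, exception);
        seen.set(exception); // record exactly what the pool handed us
      }
    };

    // Typed as Runnable so the Runnable overload of schedule() is chosen.
    Runnable failing = () -> { throw new IllegalStateException("boom"); };
    Future<?> future = pool.schedule(failing, 0, TimeUnit.MILLISECONDS);

    String fromGet = "none";
    try {
      future.get();
    } catch (ExecutionException e) {
      fromGet = String.valueOf(e.getCause()); // the exception the task threw
    }

    // Wait for the worker to exit so afterExecute has definitely run.
    pool.shutdown();
    pool.awaitTermination(5, TimeUnit.SECONDS);
    return new String[] { String.valueOf(seen.get()), fromGet };
  }

  public static void main(String[] args) throws InterruptedException {
    String[] result = observe();
    System.out.println("afterExecute received: " + result[0]);
    System.out.println("Future.get() cause:    " + result[1]);
  }
}
```

In this sketch afterExecute is handed null, while the IllegalStateException is
only visible through Future.get(), which matches what I saw with the patch.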
To log the escaped exceptions we'd need to do something like this in
afterExecute:
{code}
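@Override
protected void afterExecute(Runnable task, Throwable exception) {
  super.afterExecute(task, exception);
  // FutureTask.run() stores a thrown exception instead of propagating it,
  // so pull it back out of the completed future.
  if (exception == null && task instanceof FutureTask<?>) {
    FutureTask<?> futureTask = (FutureTask<?>) task;
    if (futureTask.isDone() && !futureTask.isCancelled()) {
      try {
        futureTask.get();
      } catch (ExecutionException e) {
        exception = e.getCause();
      } catch (InterruptedException e) {
        // get() declares this checked exception; it should not occur for a
        // completed task, but preserve the interrupt status if it does.
        Thread.currentThread().interrupt();
      }
    }
  }
  if (exception != null) {
    LOG.warn("Exception during execution of task in DeletionService",
        exception);
  }
}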
{code}
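For anyone who wants to try the approach in isolation, here is a self-contained
sketch under a few assumptions: RecordingExecutor and EscapedExceptionDemo are
hypothetical names, the executor collects escaped exceptions in a list instead
of calling LOG, and the NullPointerException is simulated. It is not the
DeletionService code itself.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.Future;
import java.util.concurrent.ScheduledThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

class EscapedExceptionDemo {
  /** Hypothetical stand-in for the deletion pool; records instead of logging. */
  static class RecordingExecutor extends ScheduledThreadPoolExecutor {
    final List<Throwable> escaped = new CopyOnWriteArrayList<>();

    RecordingExecutor(int threads) {
      super(threads);
    }

    @Override
    protected void afterExecute(Runnable task, Throwable exception) {
      super.afterExecute(task, exception);
      // The pool wraps scheduled work in a future that swallows exceptions,
      // so ask the completed future what happened.
      if (exception == null && task instanceof Future<?>) {
        Future<?> future = (Future<?>) task;
        if (future.isDone() && !future.isCancelled()) {
          try {
            future.get();
          } catch (ExecutionException e) {
            exception = e.getCause();
          } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
          }
        }
      }
      if (exception != null) {
        escaped.add(exception); // a real service would log here
      }
    }
  }

  /** Schedules one succeeding and one failing task, fire-and-forget style. */
  public static List<Throwable> runTasks() throws InterruptedException {
    RecordingExecutor pool = new RecordingExecutor(2);
    Runnable ok = () -> { };
    Runnable bad = () -> { throw new NullPointerException("simulated"); };
    pool.schedule(ok, 0, TimeUnit.MILLISECONDS);
    pool.schedule(bad, 0, TimeUnit.MILLISECONDS); // returned Future is ignored
    pool.shutdown(); // already-scheduled delayed tasks still run by default
    pool.awaitTermination(5, TimeUnit.SECONDS);
    return pool.escaped;
  }

  public static void main(String[] args) throws InterruptedException {
    for (Throwable t : runTasks()) {
      System.out.println("Escaped exception: " + t);
    }
  }
}
```

Nothing in the caller ever looks at the returned Futures here, yet the failing
task's exception is still captured, which is the behavior we want from the
fire-and-forget pool.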
A couple of other nits on the patch:
* The new class should be private
* The log message should be an ERROR rather than a WARN since this should only
occur when an exception escaped the exception handling already in place for
deletion. Those kinds of exceptions are usually pretty bad (like NPEs).
> DeletionService can silently ignore deletion task failures
> ----------------------------------------------------------
>
> Key: YARN-3090
> URL: https://issues.apache.org/jira/browse/YARN-3090
> Project: Hadoop YARN
> Issue Type: Bug
> Components: nodemanager
> Affects Versions: 2.1.1-beta
> Reporter: Jason Lowe
> Assignee: Varun Saxena
> Attachments: YARN-3090.001.patch, YARN-3090.002.patch
>
>
> If a non-I/O exception occurs while the DeletionService is executing a
> deletion task then it will be silently ignored. The exception bubbles up to
> the thread workers of the ScheduledThreadPoolExecutor which simply attaches
> the throwable to the Future that was returned when the task was scheduled.
> However the thread pool is used as a fire-and-forget pool, so nothing ever
> looks at the Future and therefore the exception is never logged.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)