Junping Du commented on YARN-4354:

+1. Patch LGTM. Will commit it shortly.
bq. Looks like this can cause nodemanagers to crash as well.
To make the NM more robust, I think we should tolerate this kind of 
failure/exception in LocalResourcesTracker rather than letting it crash the 
NM's dispatcher and exit the process. Maybe we can give LocalResourcesTracker 
a separate AsyncDispatcher with "DISPATCHER_EXIT_ON_ERROR_KEY" set to false, 
like what we do in the RM for SchedulerEventDispatcher?
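
A minimal sketch of the exit-on-error idea suggested above, written as a 
standalone model rather than against the real Hadoop classes. The class 
TolerantDispatcherSketch, its Result type, and the dispatchAll method are all 
hypothetical names for illustration; the exitOnError flag only stands in for 
what "DISPATCHER_EXIT_ON_ERROR_KEY" would control, and the loop is synchronous 
to keep the behavior easy to follow.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model of a dispatcher with an exit-on-error switch.
// This is NOT Hadoop's AsyncDispatcher; it only illustrates the trade-off.
public class TolerantDispatcherSketch {

    /** Outcome of draining one batch of events. */
    static final class Result {
        final int handled;     // events whose handler completed normally
        final int failed;      // events whose handler threw
        final boolean stopped; // true if dispatching was aborted on error

        Result(int handled, int failed, boolean stopped) {
            this.handled = handled;
            this.failed = failed;
            this.stopped = stopped;
        }
    }

    /**
     * Runs each event handler in order. With exitOnError=true the first
     * failure aborts dispatching (modeling a dispatcher that brings the
     * process down). With exitOnError=false the failure is logged and the
     * remaining events are still handled.
     */
    static Result dispatchAll(List<Runnable> events, boolean exitOnError) {
        int handled = 0;
        int failed = 0;
        for (Runnable event : events) {
            try {
                event.run();
                handled++;
            } catch (RuntimeException e) {
                failed++;
                if (exitOnError) {
                    return new Result(handled, failed, true);
                }
                System.err.println("Ignoring handler failure: " + e);
            }
        }
        return new Result(handled, failed, false);
    }

    public static void main(String[] args) {
        List<Runnable> events = new ArrayList<>();
        events.add(() -> { });
        // Simulated failure, standing in for the NPE seen during localization.
        events.add(() -> { throw new NullPointerException("simulated NPE"); });
        events.add(() -> { });

        Result tolerant = dispatchAll(events, false);
        Result strict = dispatchAll(events, true);
        System.out.println("tolerant: handled=" + tolerant.handled
            + " failed=" + tolerant.failed + " stopped=" + tolerant.stopped);
        System.out.println("strict: handled=" + strict.handled
            + " failed=" + strict.failed + " stopped=" + strict.stopped);
    }
}
```

In the tolerant case the third event is still handled after the failure; in 
the strict case dispatching stops at the failing event, which is the behavior 
the comment proposes to avoid for resource tracking.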

> Public resource localization fails with NPE
> -------------------------------------------
>                 Key: YARN-4354
>                 URL: https://issues.apache.org/jira/browse/YARN-4354
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: nodemanager
>    Affects Versions: 2.7.2
>            Reporter: Jason Lowe
>            Assignee: Jason Lowe
>            Priority: Blocker
>         Attachments: YARN-4354-unittest.patch, YARN-4354.001.patch, 
> YARN-4354.002.patch
> I saw public localization on nodemanagers get stuck because it was constantly 
> rejecting requests to the thread pool executor.

This message was sent by Atlassian JIRA