Jason Lowe commented on YARN-4354:

I committed the 2.7 patch to branch-2.7.2 as well, since it was missing from 
that release branch.

bq. I am not against to keep consistent for localization event handling with 
other subsystems, but not sure if ignoring other exceptional events could 
potentially cause NM ends up in a bad state.

From my perspective, any escaped exception at the Async Dispatcher level is
capable of leaving the NM in a bad state.  Since it's escaped we don't know
where it occurred and what we were trying to do at the time.  That's why I
think it's a bit dangerous to assume the decisions we will make from that bad
state are better than crashing.  Anyway if we want to do this then we should
take up the discussion in a JIRA targeting that feature.
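
To make the trade-off concrete, here is a rough sketch of the two policies
being discussed. It is not the NodeManager or dispatcher code from the patch;
DispatcherSketch, Policy, and drain are hypothetical names made up for
illustration, and events are plain strings. With FAIL_FAST an escaped exception
stops the loop, on the assumption that the state is no longer trustworthy.
With LOG_AND_CONTINUE the loop keeps handling later events from whatever state
the failed handler left behind.

```java
import java.util.ArrayDeque;
import java.util.List;
import java.util.Queue;
import java.util.function.Consumer;

// Hypothetical illustration only; not taken from Hadoop.
class DispatcherSketch {
    /** What the loop does when a handler lets an exception escape. */
    enum Policy { FAIL_FAST, LOG_AND_CONTINUE }

    /**
     * Drains the queue, passing each event to the handler.
     * Returns the number of events handled without an exception.
     */
    static int drain(Queue<String> events, Consumer<String> handler, Policy policy) {
        int handled = 0;
        String event;
        while ((event = events.poll()) != null) {
            try {
                handler.accept(event);
                handled++;
            } catch (RuntimeException e) {
                if (policy == Policy.FAIL_FAST) {
                    // We do not know what was half-done when this escaped,
                    // so stop instead of acting on a possibly bad state.
                    throw new IllegalStateException(
                        "Escaped exception while handling " + event, e);
                }
                // Keep going; later events run against whatever state is left.
                System.err.println("Ignoring escaped exception for " + event + ": " + e);
            }
        }
        return handled;
    }

    public static void main(String[] args) {
        Consumer<String> handler = e -> {
            if (e.equals("bad")) {
                throw new NullPointerException("simulated NPE");
            }
        };
        Queue<String> events = new ArrayDeque<>(List.of("a", "bad", "c"));
        int handled = drain(events, handler, Policy.LOG_AND_CONTINUE);
        System.out.println("handled=" + handled);
    }
}
```

In the sketch, the continue policy still processes the event queued after the
failure, which is exactly the case where we would be making decisions from an
unknown state.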

> Public resource localization fails with NPE
> -------------------------------------------
>                 Key: YARN-4354
>                 URL: https://issues.apache.org/jira/browse/YARN-4354
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: nodemanager
>    Affects Versions: 2.7.2
>            Reporter: Jason Lowe
>            Assignee: Jason Lowe
>            Priority: Blocker
>             Fix For: 2.7.2
>         Attachments: YARN-4354-branch-2.7.002.patch, 
> YARN-4354-unittest.patch, YARN-4354.001.patch, YARN-4354.002.patch
> I saw public localization on nodemanagers get stuck because it was constantly 
> rejecting requests to the thread pool executor.

