[ 
https://issues.apache.org/jira/browse/YARN-690?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13660388#comment-13660388
 ] 

Vinod Kumar Vavilapalli commented on YARN-690:
----------------------------------------------

Bobby, the fix went in a little too fast for any of us to notice. Please give 
others a bit of time to look at it. Thanks.

While this is a quick fix that should help, we should think about longer-term 
solutions, specifically catching the correct exception types. After our recent 
exception work, mainly after YARN-628 and MAPREDUCE-5254, we can look for 
IOException specifically. Is that enough?
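
To make the question concrete, here is a minimal sketch of what "look for 
IOException specifically" could mean when the IOException may arrive wrapped 
in an UndeclaredThrowableException. It is not the committed patch and not the 
actual DelegationTokenRenewer code: the class RenewFailurePolicy, its methods, 
and the depth limit of 16 are hypothetical names and values chosen for 
illustration.

```java
import java.io.IOException;
import java.lang.reflect.UndeclaredThrowableException;
import java.net.UnknownHostException;

// Hypothetical helper, not part of Hadoop: decides whether a failure seen by
// a renewer-style thread should be treated as fatal to the process.
final class RenewFailurePolicy {

    private RenewFailurePolicy() {}

    // Walks the cause chain and returns the first IOException found, or null.
    // The depth bound is an arbitrary guard against cyclic cause chains.
    static IOException findIOException(Throwable t) {
        int depth = 0;
        while (t != null && depth++ < 16) {
            if (t instanceof IOException) {
                return (IOException) t;
            }
            t = t.getCause();
        }
        return null;
    }

    // Assumed policy for this sketch: a failure is recoverable if an
    // IOException appears anywhere in the chain, fatal otherwise.
    static boolean isFatal(Throwable t) {
        return findIOException(t) == null;
    }

    public static void main(String[] args) {
        Throwable wrapped =
            new UndeclaredThrowableException(new UnknownHostException("nn1"));
        System.out.println("wrapped UnknownHostException fatal? "
            + isFatal(wrapped));
        System.out.println("IllegalStateException fatal? "
            + isFatal(new IllegalStateException("bug")));
    }
}
```

Under this assumed policy, an UnknownHostException that surfaces as an 
UndeclaredThrowableException would be logged and survived rather than 
triggering an exit. Whether unrelated RuntimeExceptions should still be fatal 
is exactly the open question above.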
                
> RM exits on token cancel/renew problems
> ---------------------------------------
>
>                 Key: YARN-690
>                 URL: https://issues.apache.org/jira/browse/YARN-690
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: resourcemanager
>    Affects Versions: 3.0.0, 0.23.7, 2.0.5-beta
>            Reporter: Daryn Sharp
>            Assignee: Daryn Sharp
>            Priority: Blocker
>             Fix For: 3.0.0, 2.0.5-beta, 0.23.8
>
>         Attachments: YARN-690.patch, YARN-690.patch
>
>
> The DelegationTokenRenewer thread is critical to the RM.  When a 
> non-IOException occurs, the thread calls System.exit to prevent the RM from 
> running without the thread.  It should exit only on non-RuntimeExceptions.
> The problem is especially bad in 0.23 because the YARN protobuf layer converts 
> IOExceptions into UndeclaredThrowableExceptions (RuntimeException), which 
> causes the renewer to abort the process.  An UnknownHostException takes down 
> the RM...

