[ 
https://issues.apache.org/jira/browse/YARN-875?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13703540#comment-13703540
 ] 

Bikas Saha commented on YARN-875:
---------------------------------

It's fine to catch the Throwable, report an error to the app, and exit the 
callback thread. It looks like simply assigning the exception to the existing 
savedException will make everything work as expected. We should probably not 
stop the heartbeat thread, since the RM would then start the AM exit timer. 
It's reasonable to expect the app to call stop() when AMRMClient sends it an 
onError(), as you suggest above. We should add this to the javadoc of 
onError(), recommending that the app call stop() in response.
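
To make the proposal concrete, here is a minimal, self-contained sketch of the 
behaviour described above. It is not the AMRMClientAsync code: the class 
CallbackThreadSketch, the Handler interface, onEvent(), runCallbacks() and the 
heartbeatRunning flag are hypothetical stand-ins invented for illustration. 
Only the names savedException, onError() and stop() come from this discussion.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class CallbackThreadSketch {
    /** Hypothetical stand-in for the application's callback handler. */
    interface Handler {
        void onEvent(String event);  // application code; may throw anything
        void onError(Throwable t);   // recommended reaction: call stop()
    }

    // Stand-in for the existing savedException field mentioned above.
    private volatile Throwable savedException;
    // Stand-in for the heartbeat thread's run flag (assumption of this sketch).
    private volatile boolean heartbeatRunning = true;

    Throwable getSavedException() { return savedException; }
    boolean isHeartbeatRunning() { return heartbeatRunning; }

    /** Called by the application; this is the only place the heartbeat stops. */
    void stop() { heartbeatRunning = false; }

    /**
     * Body of the callback thread. Delivers events until one fails, then
     * records the failure, notifies the app and exits the loop.
     * Returns the number of events delivered successfully.
     */
    int runCallbacks(List<String> events, Handler handler) {
        int delivered = 0;
        for (String event : events) {
            try {
                handler.onEvent(event);
                delivered++;
            } catch (Throwable t) {
                savedException = t;   // remember the failure
                handler.onError(t);   // tell the app instead of dying silently
                return delivered;     // exit the callback thread
            }
        }
        return delivered;
    }

    public static void main(String[] args) {
        final CallbackThreadSketch client = new CallbackThreadSketch();
        final List<String> seen = new ArrayList<String>();
        Handler handler = new Handler() {
            @Override
            public void onEvent(String event) {
                if (event.equals("bad")) {
                    throw new IllegalStateException("boom");
                }
                seen.add(event);
            }
            @Override
            public void onError(Throwable t) {
                client.stop();  // the app decides to shut down
            }
        };
        int delivered = client.runCallbacks(Arrays.asList("a", "bad", "c"), handler);
        System.out.println("delivered=" + delivered + " seen=" + seen
            + " heartbeatRunning=" + client.isHeartbeatRunning());
    }
}
```

Note that runCallbacks() never touches the heartbeat flag itself. The heartbeat 
keeps going until the app calls stop() from onError(), so the RM does not start 
the AM exit timer before the app has had a chance to react.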
                
> Application can hang if AMRMClientAsync callback thread has exception
> ---------------------------------------------------------------------
>
>                 Key: YARN-875
>                 URL: https://issues.apache.org/jira/browse/YARN-875
>             Project: Hadoop YARN
>          Issue Type: Bug
>    Affects Versions: 2.1.0-beta
>            Reporter: Bikas Saha
>            Assignee: Xuan Gong
>
> Currently that thread will die and then never call back, so the app can hang. 
> A possible solution is to catch Throwable in the callback and then call 
> client.onError().

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
