[ https://issues.apache.org/jira/browse/YARN-875?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13703540#comment-13703540 ]
Bikas Saha commented on YARN-875:
---------------------------------
It's fine to catch the Throwable, report an error to the app, and exit the
callback thread. It looks like simply assigning the exception to the existing
savedException will make everything work as expected. We should probably not
stop the heartbeat thread, since the RM would then start the AM exit timer.
It's reasonable to expect the app to call stop() when AMRMClient sends it an
onError(), as you suggest above. We should add this to the javadoc of
onError(), naming stop() as the recommended action.
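
A minimal sketch of the idea above, not the actual AMRMClientAsync code.
Handler, AsyncClient, offer() and demo() are hypothetical stand-ins made up for
illustration; only the names savedException, onError() and stop() come from this
discussion. The heartbeat thread is not modeled here and would keep running
until stop() is called.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicReference;

public class CallbackErrorSketch {

    /** Hypothetical stand-in for the app's callback interface. */
    interface Handler {
        void onResponse(String response);

        /** Called once if a callback throws; the recommended action is to call stop(). */
        void onError(Throwable t);
    }

    /** Hypothetical stand-in for the async client; only the callback thread is modeled. */
    static class AsyncClient {
        private final BlockingQueue<String> responses = new LinkedBlockingQueue<>();
        private final AtomicReference<Throwable> savedException = new AtomicReference<>();
        private final Handler handler;
        private final Thread callbackThread;

        AsyncClient(Handler handler) {
            this.handler = handler;
            this.callbackThread = new Thread(this::runCallbacks, "callback-thread");
        }

        void start() { callbackThread.start(); }

        void offer(String response) { responses.add(response); }

        Throwable getSavedException() { return savedException.get(); }

        void stop() throws InterruptedException {
            callbackThread.interrupt();
            // Guard against self-join in case the app calls stop() from inside onError().
            if (Thread.currentThread() != callbackThread) {
                callbackThread.join();
            }
        }

        private void runCallbacks() {
            while (true) {
                try {
                    handler.onResponse(responses.take());
                } catch (InterruptedException e) {
                    return; // stop() was requested
                } catch (Throwable t) {
                    // Save the failure, tell the app, and exit the callback thread
                    // instead of letting it die silently.
                    savedException.compareAndSet(null, t);
                    handler.onError(t);
                    return;
                }
            }
        }
    }

    /** Runs one failing callback and reports whether onError() fired. */
    public static String demo() {
        CountDownLatch errorSeen = new CountDownLatch(1);
        AsyncClient client = new AsyncClient(new Handler() {
            @Override public void onResponse(String r) {
                throw new IllegalStateException("bad response: " + r);
            }
            @Override public void onError(Throwable t) { errorSeen.countDown(); }
        });
        try {
            client.start();
            client.offer("r1");
            boolean seen = errorSeen.await(5, TimeUnit.SECONDS);
            client.stop(); // the app reacts to onError() by stopping the client
            Throwable t = client.getSavedException();
            return seen + ":" + (t == null ? "none" : t.getMessage());
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return "interrupted";
        }
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

In this sketch the app thread waits for onError() and then calls stop() itself,
which mirrors the recommended action proposed for the javadoc.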
> Application can hang if AMRMClientAsync callback thread has exception
> ---------------------------------------------------------------------
>
> Key: YARN-875
> URL: https://issues.apache.org/jira/browse/YARN-875
> Project: Hadoop YARN
> Issue Type: Bug
> Affects Versions: 2.1.0-beta
> Reporter: Bikas Saha
> Assignee: Xuan Gong
>
> Currently, that thread dies and never calls back, so the app can hang.
> A possible solution is to catch Throwable in the callback and then call
> client.onError().