[
https://issues.apache.org/jira/browse/YARN-875?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13710344#comment-13710344
]
Bikas Saha commented on YARN-875:
---------------------------------
If users have done their homework, then the library's catch statement will not
be executed and we are fine. If users have not done their homework and we ignore
their exceptions, then we can get into bad cases where an allocation from the RM
is lost due to an exception in onContainersAllocated(), and the app is now hung
because it is waiting for that allocation to happen. That is not acceptable
IMO. These libraries are all freshly written, and IMO it's better to fail fast
and expose issues than to silently ignore them. If we see a common case of
innocuous exceptions then we can choose to ignore them, but we first need to see
them in real-life usage.
We should fix the circular exception. The last patch attached has a bug in that
regard.
Changing to Throwable will not be incompatible because the async library has
not yet been officially released. It did not go out in 2.0.4-alpha.
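
For illustration, the fail-fast behaviour argued for above could look roughly
like the sketch below. It is not the attached patch and does not use the real
AMRMClientAsync classes: Handler, CallbackThread and the String "containers"
are hypothetical stand-ins. The assumption is that the callback thread catches
Throwable from the user callback, reports it once through onError(), and then
stops instead of silently continuing.

```java
import java.util.Collections;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.atomic.AtomicReference;

public class CallbackFailFastSketch {

    /** Hypothetical stand-in for the user's callback handler (not the real YARN interface). */
    interface Handler {
        void onContainersAllocated(List<String> containers);
        void onError(Throwable t);
    }

    /** Hypothetical callback thread that delivers queued allocations to the handler. */
    static final class CallbackThread extends Thread {
        private final BlockingQueue<List<String>> queue;
        private final Handler handler;

        CallbackThread(BlockingQueue<List<String>> queue, Handler handler) {
            super("callback-thread-sketch");
            this.queue = queue;
            this.handler = handler;
        }

        @Override
        public void run() {
            while (!Thread.currentThread().isInterrupted()) {
                List<String> allocation;
                try {
                    allocation = queue.take();
                } catch (InterruptedException e) {
                    // Restore the interrupt flag and exit quietly.
                    Thread.currentThread().interrupt();
                    return;
                }
                try {
                    handler.onContainersAllocated(allocation);
                } catch (Throwable t) {
                    // Fail fast: surface the user's exception instead of ignoring it.
                    // onError() is called from the catch block, so anything it throws
                    // is not routed back into onError() again.
                    handler.onError(t);
                    return;
                }
            }
        }
    }

    public static void main(String[] args) throws Exception {
        BlockingQueue<List<String>> queue = new LinkedBlockingQueue<>();
        AtomicReference<Throwable> reported = new AtomicReference<>();

        // A handler with a bug in its allocation callback.
        Handler buggy = new Handler() {
            @Override
            public void onContainersAllocated(List<String> containers) {
                throw new IllegalStateException("bug in user callback");
            }

            @Override
            public void onError(Throwable t) {
                reported.set(t);
            }
        };

        CallbackThread thread = new CallbackThread(queue, buggy);
        thread.start();
        queue.put(Collections.singletonList("container_1"));
        thread.join(5000);

        System.out.println("callback thread alive: " + thread.isAlive());
        System.out.println("reported to onError: " + reported.get());
    }
}
```

With this shape, a buggy handler learns about its own exception through
onError() rather than the app hanging while it waits for an allocation that
was already delivered and lost.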
> Application can hang if AMRMClientAsync callback thread has exception
> ---------------------------------------------------------------------
>
> Key: YARN-875
> URL: https://issues.apache.org/jira/browse/YARN-875
> Project: Hadoop YARN
> Issue Type: Bug
> Affects Versions: 2.1.0-beta
> Reporter: Bikas Saha
> Assignee: Xuan Gong
> Attachments: YARN-875.1.patch, YARN-875.1.patch, YARN-875.2.patch
>
>
> Currently that thread will die and never call back, so the app can hang.
> Possible solution could be to catch Throwable in the callback and then call
> client.onError().