[
https://issues.apache.org/jira/browse/YARN-6967?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16119291#comment-16119291
]
Chengbing Liu commented on YARN-6967:
-------------------------------------
Hi [~templedf], I don't think the message will be any longer. In the patch, the
local {{BoundedAppender diags}} is independent from the field variable
{{diagnostics}} of {{RMAppAttemptImpl}}, and starts as a new
{{BoundedAppender}}.
We have seen cases where a Spark application throws an exception whose message
is too large to be stored in ZooKeeper. Both the AppMaster and the
ResourceManager then keep retrying and failing, which eventually makes the RM
unresponsive.
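To illustrate why a fresh local appender cannot produce a longer message, here
is a minimal sketch. {{BoundedDiagnostics}} is a hypothetical stand-in, not the
actual {{BoundedAppender}} from YARN-6125, and keeping only the trailing
characters is an assumption made for this example.

```java
// Hypothetical stand-in for a bounded appender; NOT Hadoop's BoundedAppender.
// Assumption: when the limit is exceeded, only the most recent characters are kept.
final class BoundedDiagnostics {
    private final int limit;
    private final StringBuilder tail = new StringBuilder();

    BoundedDiagnostics(int limit) {
        if (limit <= 0) {
            throw new IllegalArgumentException("limit must be positive");
        }
        this.limit = limit;
    }

    // Appends text, then drops leading characters so that at most
    // `limit` characters remain.
    BoundedDiagnostics append(CharSequence text) {
        tail.append(text);
        int overflow = tail.length() - limit;
        if (overflow > 0) {
            tail.delete(0, overflow);
        }
        return this;
    }

    int length() {
        return tail.length();
    }

    @Override
    public String toString() {
        return tail.toString();
    }
}
```

Each instance holds its own buffer, so a new local instance starts empty and
can never hold more than its own limit, regardless of what the field instance
already contains.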
> Limit application attempt's diagnostic message size thoroughly
> --------------------------------------------------------------
>
> Key: YARN-6967
> URL: https://issues.apache.org/jira/browse/YARN-6967
> Project: Hadoop YARN
> Issue Type: Bug
> Components: resourcemanager
> Affects Versions: 2.8.1
> Reporter: Chengbing Liu
> Assignee: Chengbing Liu
> Attachments: YARN-6967.01.patch
>
>
> YARN-6125 implemented {{BoundedAppender}} and applied it to the field
> {{diagnostics}} to limit the size of the diagnostic message.
> However, some code bypasses this limit. In
> {{RMAppAttemptImpl.rememberTargetTransitionsAndStoreState(...)}}, a local
> variable {{diags}} is eventually written to ZooKeeper if ZKRMStateStore
> is used.
> A simple fix is to also use {{BoundedAppender}} for the local variable.