[ 
https://issues.apache.org/jira/browse/YARN-6967?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16119291#comment-16119291
 ] 

Chengbing Liu commented on YARN-6967:
-------------------------------------

Hi [~templedf], I don't think the message will be any longer. In the patch, the 
local {{BoundedAppender diags}} is independent of the {{diagnostics}} field of 
{{RMAppAttemptImpl}} and starts as a new {{BoundedAppender}}.
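
To illustrate the point, here is a minimal sketch of a local bounded buffer that 
is separate from a bounded field. It is not the patch and not the Hadoop class: 
{{SimpleBoundedAppender}}, {{BoundedDiagnosticsSketch}}, {{boundForStore}} and 
the limit of 16 characters are hypothetical names and values chosen for 
illustration, and the sketch assumes the newest characters are the ones worth 
keeping.

```java
public class BoundedDiagnosticsSketch {

    /** Hypothetical, simplified stand-in for a bounded appender (not the Hadoop class). */
    public static final class SimpleBoundedAppender {
        private final int limit;
        private final StringBuilder buffer = new StringBuilder();

        public SimpleBoundedAppender(int limit) {
            if (limit <= 0) {
                throw new IllegalArgumentException("limit must be positive");
            }
            this.limit = limit;
        }

        /** Appends text, then drops the oldest characters beyond the limit. */
        public SimpleBoundedAppender append(CharSequence text) {
            if (text == null) {
                return this;
            }
            buffer.append(text);
            int overflow = buffer.length() - limit;
            if (overflow > 0) {
                buffer.delete(0, overflow);
            }
            return this;
        }

        public int length() {
            return buffer.length();
        }

        @Override
        public String toString() {
            return buffer.toString();
        }
    }

    // Assumed limit, for illustration only.
    public static final int LIMIT = 16;

    // Plays the role of the long-lived, bounded field.
    public final SimpleBoundedAppender diagnostics = new SimpleBoundedAppender(LIMIT);

    /**
     * Bounds a message in a fresh local appender before it would be handed
     * to a state store. The field above is neither read nor modified here.
     */
    public String boundForStore(String eventMessage) {
        SimpleBoundedAppender diags = new SimpleBoundedAppender(LIMIT);
        diags.append(eventMessage);
        return diags.toString();
    }

    public static void main(String[] args) {
        BoundedDiagnosticsSketch attempt = new BoundedDiagnosticsSketch();
        attempt.diagnostics.append("earlier message");
        StringBuilder huge = new StringBuilder();
        for (int i = 0; i < 1000; i++) {
            huge.append('x');
        }
        String stored = attempt.boundForStore(huge.toString());
        System.out.println("stored length: " + stored.length());
        System.out.println("field length: " + attempt.diagnostics.length());
    }
}
```

Because the local buffer starts fresh and has its own limit, whatever it 
produces cannot exceed that limit, regardless of what the field already holds.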

We have seen cases where a Spark application throws an exception whose message 
is too large for ZooKeeper. Both the AppMaster and the ResourceManager then 
keep retrying and failing, which eventually makes the RM unresponsive.

> Limit application attempt's diagnostic message size thoroughly
> --------------------------------------------------------------
>
>                 Key: YARN-6967
>                 URL: https://issues.apache.org/jira/browse/YARN-6967
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: resourcemanager
>    Affects Versions: 2.8.1
>            Reporter: Chengbing Liu
>            Assignee: Chengbing Liu
>         Attachments: YARN-6967.01.patch
>
>
> YARN-6125 implemented {{BoundedAppender}} and applied it to the field 
> {{diagnostics}} to limit the size of the diagnostic message.
> However, some code bypasses this limit. In 
> {{RMAppAttemptImpl.rememberTargetTransitionsAndStoreState(...)}}, a local 
> variable {{diags}} is eventually written to ZooKeeper if ZKRMStateStore 
> is used.
> A simple fix is to also use {{BoundedAppender}} for the local variable.



