[ 
https://issues.apache.org/jira/browse/YARN-6125?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15840265#comment-15840265
 ] 

Jason Lowe commented on YARN-6125:
----------------------------------

For the huge examples encountered so far, what would have worked best?  Were 
they simply one gigantic stacktrace, an accumulation of independent diagnostic 
messages, or recurring, redundant messages for the same error?  I would 
normally lean towards preserving the tail end of the message, on the 
assumption that the most recent error is logged there, but of course with 
cascading errors the first error is often the root cause, and then the 
beginning would be better.

That's why I'm hoping real-world examples can help shape the direction here.  
I'd rather not add yet another config that nobody sets, or that nobody knows 
how to set correctly.  If we do add a config, the next question is whether it 
should be app-specific (e.g., app framework A's diagnostics are best served by 
preserving the end, while preserving the beginning is better for framework B).
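
To make the options concrete, here is a rough sketch of the three pruning 
strategies under discussion (keep the head, keep the tail, or drop the middle). 
It is illustrative only: the class DiagnosticsTruncator, its method names, the 
"..." marker, and the character-based limit are all assumptions made for this 
example and are not existing YARN code or a proposed patch.

```java
// Hypothetical helper, not part of YARN. Limits are counted in chars,
// which is an assumption; a real state-store limit would likely be in bytes.
final class DiagnosticsTruncator {
    // Marker inserted where the middle of the message was dropped (assumed).
    static final String MARKER = "...";

    // "Truncate the tail": keep only the first max chars.
    static String keepHead(String msg, int max) {
        return msg.length() <= max ? msg : msg.substring(0, max);
    }

    // "Truncate the head": keep only the last max chars.
    static String keepTail(String msg, int max) {
        return msg.length() <= max ? msg : msg.substring(msg.length() - max);
    }

    // "Truncate the middle": keep both ends and join them with MARKER.
    static String keepEnds(String msg, int max) {
        if (msg.length() <= max) {
            return msg;
        }
        if (max <= MARKER.length()) {
            // No room for the marker, so fall back to keeping the head.
            return keepHead(msg, max);
        }
        int keep = max - MARKER.length();
        int head = (keep + 1) / 2;   // head gets the extra char when keep is odd
        int tail = keep - head;
        return msg.substring(0, head) + MARKER
            + msg.substring(msg.length() - tail);
    }
}
```

For example, with a limit of 7, keepEnds("abcdefghij", 7) yields "ab...ij", 
while keepHead and keepTail with a limit of 4 yield "abcd" and "ghij".  Note 
that substring works on UTF-16 code units, so a cut could in principle split a 
surrogate pair; a real implementation would need to handle that.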


> The application attempt's diagnostic message should have a maximum size
> -----------------------------------------------------------------------
>
>                 Key: YARN-6125
>                 URL: https://issues.apache.org/jira/browse/YARN-6125
>             Project: Hadoop YARN
>          Issue Type: Improvement
>          Components: resourcemanager
>    Affects Versions: 2.7.0
>            Reporter: Daniel Templeton
>            Assignee: Daniel Templeton
>            Priority: Critical
>
> We've found through experience that the diagnostic message can grow 
> unbounded.  I've seen attempts that have diagnostic messages over 1MB.  Since 
> the message is stored in the state store, it's a bad idea to allow the 
> message to grow unbounded.  Instead, there should be a property that sets a 
> maximum size on the message.
> I suspect that some of the ZK state store issues we've seen in the past were 
> due to the size of the diagnostic messages and not to the size of the 
> classpath, as is the current prevailing opinion.
> An open question is how best to prune the message once it grows too large.  
> Should we
> # truncate the tail,
> # truncate the head,
> # truncate the middle,
> # add another property to make the behavior selectable, or
> # none of the above?



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
