[
https://issues.apache.org/jira/browse/YARN-6125?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15851698#comment-15851698
]
Daniel Templeton commented on YARN-6125:
----------------------------------------
That is certainly an option. The point [~yufeigu] made is that when a
message is a stack trace, the most important information is at the top. What
[~andras.piros] is suggesting is that we prune the oldest message first, but
trim it starting from its tail, not its head. That way we get:
{noformat}java.io.IOException: Failed on local exception:
java.io.IOException: javax.security.sasl.SaslException: GSS initiate failed
[Caused by GSSException: No valid credentials provided (Mechanism level: Failed
to find any Kerberos tgt)]; Host Details : local host is:
"foo.bar.com/127.0.0.1"; destination host is: "foo.bar.com":8020;
at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:772)
at org.apache.hadoop.ipc.Client.call(Client.java:1476)
at org.apache.hadoop.ipc.Client.call(Client.java:1409)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:230)
at com.sun.proxy.$Proxy14.getFileInfo(Unknown Source)
...{noformat} instead of {noformat}...
at sun.security.jgss.krb5.Krb5MechFactory.getMechanismContext(Krb5MechFactory.java:187)
at sun.security.jgss.GSSManagerImpl.getMechanismContext(GSSManagerImpl.java:223)
at sun.security.jgss.GSSContextImpl.initSecContext(GSSContextImpl.java:212)
at sun.security.jgss.GSSContextImpl.initSecContext(GSSContextImpl.java:179)
at com.sun.security.sasl.gsskerb.GssKrb5Client.evaluateChallenge(GssKrb5Client.java:193)
... 41 more{noformat}
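
To make the suggestion concrete, here is a minimal sketch of "prune the oldest message first, but from its tail". It is not taken from any of the attached patches; the class name BoundedDiagnostics, the character-based limit, and the method names are assumptions chosen for illustration only.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical helper, not the YARN-6125 patch: keeps the combined length of
// all diagnostic messages at or below maxChars.
final class BoundedDiagnostics {
    private final int maxChars;
    // Oldest message sits at the head of the deque, newest at the tail.
    private final Deque<StringBuilder> messages = new ArrayDeque<>();
    private int totalChars = 0;

    BoundedDiagnostics(int maxChars) {
        if (maxChars < 0) {
            throw new IllegalArgumentException("maxChars must be >= 0");
        }
        this.maxChars = maxChars;
    }

    void append(String message) {
        messages.addLast(new StringBuilder(message));
        totalChars += message.length();

        // While over the limit, trim characters from the END of the oldest
        // message, so the top of a stack trace survives the longest.
        while (totalChars > maxChars && !messages.isEmpty()) {
            StringBuilder oldest = messages.peekFirst();
            int excess = totalChars - maxChars;
            int cut = Math.min(excess, oldest.length());
            oldest.setLength(oldest.length() - cut);
            totalChars -= cut;
            // Once the oldest message is fully consumed, drop it and
            // continue with the next oldest.
            if (oldest.length() == 0) {
                messages.removeFirst();
            }
        }
    }

    String get() {
        StringBuilder sb = new StringBuilder(totalChars);
        for (StringBuilder m : messages) {
            sb.append(m);
        }
        return sb.toString();
    }
}
```

With a limit of 10, appending "abcdefgh" and then "XYZW" leaves "abcdefXYZW": the oldest message lost its last two characters, not its first two. The sketch counts Java chars rather than bytes and adds no "..." marker; both are simplifications.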
> The application attempt's diagnostic message should have a maximum size
> -----------------------------------------------------------------------
>
> Key: YARN-6125
> URL: https://issues.apache.org/jira/browse/YARN-6125
> Project: Hadoop YARN
> Issue Type: Improvement
> Components: resourcemanager
> Affects Versions: 2.7.0
> Reporter: Daniel Templeton
> Assignee: Andras Piros
> Priority: Critical
> Fix For: 3.0.0-alpha3
>
> Attachments: YARN-6125.000.patch, YARN-6125.001.patch,
> YARN-6125.002.patch, YARN-6125.003.patch
>
>
> We've found through experience that the diagnostic message can grow
> without bound. I've seen attempts with diagnostic messages over 1 MB. Since
> the message is stored in the state store, it's a bad idea to allow the
> message to grow without limit. Instead, there should be a property that
> sets a maximum size on the message.
> I suspect that some of the ZK state store issues we've seen in the past were
> due to the size of the diagnostic messages and not to the size of the
> classpath, as is the current prevailing opinion.
> An open question is how best to prune the message once it grows too large.
> Should we
> # truncate the tail,
> # truncate the head,
> # truncate the middle,
> # add another property to make the behavior selectable, or
> # none of the above?
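
For comparison, the first three options in the list above (truncate the tail, the head, or the middle) could be sketched for a single message as follows. Everything here is an assumption for illustration: the class DiagnosticsPruner, the Mode enum, and the "..." marker are hypothetical and do not come from the issue or its patches.

```java
// Hypothetical helper illustrating the three truncation options.
final class DiagnosticsPruner {
    enum Mode { TAIL, HEAD, MIDDLE }

    // Marker that stands in for the removed text (an assumption).
    static final String MARKER = "...";

    static String prune(String message, int maxChars, Mode mode) {
        if (maxChars < MARKER.length()) {
            throw new IllegalArgumentException("maxChars too small for marker");
        }
        if (message.length() <= maxChars) {
            return message; // already within the limit
        }
        int keep = maxChars - MARKER.length(); // characters of the original kept
        switch (mode) {
            case TAIL: {
                // Drop the end, keep the beginning.
                return message.substring(0, keep) + MARKER;
            }
            case HEAD: {
                // Drop the beginning, keep the end.
                return MARKER + message.substring(message.length() - keep);
            }
            case MIDDLE: {
                // Keep both ends, drop the middle.
                int front = (keep + 1) / 2;
                int back = keep - front;
                return message.substring(0, front) + MARKER
                        + message.substring(message.length() - back);
            }
            default:
                throw new AssertionError(mode);
        }
    }
}
```

For the message "0123456789" and a limit of 7, TAIL yields "0123...", HEAD yields "...6789", and MIDDLE yields "01...89". A selectable property, the fourth option, would amount to choosing the Mode from configuration.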