[ https://issues.apache.org/jira/browse/YARN-6125?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15851698#comment-15851698 ]

Daniel Templeton commented on YARN-6125:
----------------------------------------

That is certainly an option. The point that [~yufeigu] made is that if a message is a stack trace, the most important stuff is at the top. What [~andras.piros] is saying is that we should drop the oldest message first, but starting from its tail, not its head. That way we get:
{noformat}
java.io.IOException: Failed on local exception: java.io.IOException: javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)]; Host Details : local host is: "foo.bar.com/127.0.0.1"; destination host is: "foo.bar.com":8020;
	at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:772)
	at org.apache.hadoop.ipc.Client.call(Client.java:1476)
	at org.apache.hadoop.ipc.Client.call(Client.java:1409)
	at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:230)
	at com.sun.proxy.$Proxy14.getFileInfo(Unknown Source)
	...
{noformat}
instead of
{noformat}
	...
	at sun.security.jgss.krb5.Krb5MechFactory.getMechanismContext(Krb5MechFactory.java:187)
	at sun.security.jgss.GSSManagerImpl.getMechanismContext(GSSManagerImpl.java:223)
	at sun.security.jgss.GSSContextImpl.initSecContext(GSSContextImpl.java:212)
	at sun.security.jgss.GSSContextImpl.initSecContext(GSSContextImpl.java:179)
	at com.sun.security.sasl.gsskerb.GssKrb5Client.evaluateChallenge(GssKrb5Client.java:193)
	... 41 more
{noformat}

> The application attempt's diagnostic message should have a maximum size
> -----------------------------------------------------------------------
>
>                 Key: YARN-6125
>                 URL: https://issues.apache.org/jira/browse/YARN-6125
>             Project: Hadoop YARN
>          Issue Type: Improvement
>          Components: resourcemanager
>    Affects Versions: 2.7.0
>            Reporter: Daniel Templeton
>            Assignee: Andras Piros
>            Priority: Critical
>             Fix For: 3.0.0-alpha3
>
>         Attachments: YARN-6125.000.patch, YARN-6125.001.patch,
> YARN-6125.002.patch, YARN-6125.003.patch
>
>
> We've found through experience that the diagnostic message can grow
> unbounded. I've seen attempts that have diagnostic messages over 1MB. Since
> the message is stored in the state store, it's a bad idea to allow the
> message to grow unbounded. Instead, there should be a property that sets a
> maximum size on the message.
> I suspect that some of the ZK state store issues we've seen in the past were
> due to the size of the diagnostic messages and not to the size of the
> classpath, as is the current prevailing opinion.
> An open question is how best to prune the message once it grows too large.
> Should we
> # truncate the tail,
> # truncate the head,
> # truncate the middle,
> # add another property to make the behavior selectable, or
> # none of the above?

--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
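
The pruning order described in the comment (drop the oldest message first, cutting from its tail rather than its head) could be sketched roughly as below. This is only an illustration, not the code in the attached YARN-6125 patches: the class name `BoundedDiagnostics`, its methods, and the choice to measure the limit in characters are all assumptions made for the example.

```java
import java.util.ArrayDeque;
import java.util.Deque;

/**
 * Hypothetical helper (not from the YARN-6125 patches): keeps appended
 * diagnostic messages under a fixed total size.
 */
final class BoundedDiagnostics {
    private final int limit;                       // assumed unit: characters
    private final Deque<String> messages = new ArrayDeque<>();
    private int size = 0;                          // total characters currently kept

    BoundedDiagnostics(int limit) {
        if (limit < 0) {
            throw new IllegalArgumentException("limit must be non-negative");
        }
        this.limit = limit;
    }

    void append(String message) {
        // A single message longer than the limit keeps only its head,
        // since the top of a stack trace carries the most useful information.
        if (message.length() > limit) {
            message = message.substring(0, limit);
        }
        messages.addLast(message);
        size += message.length();

        // Shrink starting with the oldest message, cutting from its tail.
        while (size > limit) {
            String oldest = messages.removeFirst();
            int excess = size - limit;
            if (oldest.length() > excess) {
                // Trimming the tail of the oldest message is enough.
                messages.addFirst(oldest.substring(0, oldest.length() - excess));
                size -= excess;
            } else {
                // The oldest message is dropped entirely; continue with the next one.
                size -= oldest.length();
            }
        }
    }

    String get() {
        return String.join("", messages);
    }
}
```

With a limit of 10, appending `"AAAAAA"` and then `"BBBBBB"` leaves `"AAAABBBBBB"`: the older message loses two characters from its tail, so its beginning survives as long as possible.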