[ 
https://issues.apache.org/jira/browse/YARN-6125?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15851698#comment-15851698
 ] 

Daniel Templeton commented on YARN-6125:
----------------------------------------

That is certainly an option.  The point that [~yufeigu] made is that if a 
message is a stack trace, the most important stuff is at the top.  What 
[~andras.piros] is saying is that we should drop the oldest message first, but 
starting from its tail, not its head.  That way we 
get:{noformat}java.io.IOException: Failed on local exception: 
java.io.IOException: javax.security.sasl.SaslException: GSS initiate failed 
[Caused by GSSException: No valid credentials provided (Mechanism level: Failed 
to find any Kerberos tgt)]; Host Details : local host is: 
"foo.bar.com/127.0.0.1"; destination host is: "foo.bar.com":8020;
        at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:772)
        at org.apache.hadoop.ipc.Client.call(Client.java:1476)
        at org.apache.hadoop.ipc.Client.call(Client.java:1409)
        at 
org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:230)
        at com.sun.proxy.$Proxy14.getFileInfo(Unknown Source)
...{noformat} instead of {noformat}...
        at 
sun.security.jgss.krb5.Krb5MechFactory.getMechanismContext(Krb5MechFactory.java:187)
        at 
sun.security.jgss.GSSManagerImpl.getMechanismContext(GSSManagerImpl.java:223)
        at 
sun.security.jgss.GSSContextImpl.initSecContext(GSSContextImpl.java:212)
        at 
sun.security.jgss.GSSContextImpl.initSecContext(GSSContextImpl.java:179)
        at 
com.sun.security.sasl.gsskerb.GssKrb5Client.evaluateChallenge(GssKrb5Client.java:193)
        ... 41 more{noformat}

> The application attempt's diagnostic message should have a maximum size
> -----------------------------------------------------------------------
>
>                 Key: YARN-6125
>                 URL: https://issues.apache.org/jira/browse/YARN-6125
>             Project: Hadoop YARN
>          Issue Type: Improvement
>          Components: resourcemanager
>    Affects Versions: 2.7.0
>            Reporter: Daniel Templeton
>            Assignee: Andras Piros
>            Priority: Critical
>             Fix For: 3.0.0-alpha3
>
>         Attachments: YARN-6125.000.patch, YARN-6125.001.patch, 
> YARN-6125.002.patch, YARN-6125.003.patch
>
>
> We've found through experience that the diagnostic message can grow 
> unbounded.  I've seen attempts that have diagnostic messages over 1MB.  Since 
> the message is stored in the state store, it's a bad idea to allow the 
> message to grow unbounded.  Instead, there should be a property that sets a 
> maximum size on the message.
> I suspect that some of the ZK state store issues we've seen in the past were 
> due to the size of the diagnostic messages and not to the size of the 
> classpath, as is the current prevailing opinion.
> An open question is how best to prune the message once it grows too large.  
> Should we
> # truncate the tail,
> # truncate the head,
> # truncate the middle,
> # add another property to make the behavior selectable, or
> # none of the above?



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

---------------------------------------------------------------------
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org

Reply via email to