[ https://issues.apache.org/jira/browse/YARN-2893?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14395412#comment-14395412 ]

zhihai xu commented on YARN-2893:
---------------------------------

Hi [~jira.shegalov],
I can move the whole block inside the try/catch, so the exception is caught in both cases:
{code}
try {
  Credentials credentials = parseCredentials(submissionContext);
  if (UserGroupInformation.isSecurityEnabled()) {
    this.rmContext.getDelegationTokenRenewer().addApplicationAsync(appId,
        credentials, submissionContext.getCancelTokensWhenComplete(),
        application.getUser());
  } else {
    this.rmContext.getDispatcher().getEventHandler()
        .handle(new RMAppEvent(applicationId, RMAppEventType.START));
  }
} catch (Exception e) {
  LOG.warn("Unable to parse credentials.", e);
  // Sending APP_REJECTED is fine, since we assume that the
  // RMApp is in NEW state and thus we haven't yet informed the
  // scheduler about the existence of the application
  assert application.getState() == RMAppState.NEW;
  this.rmContext.getDispatcher().getEventHandler()
      .handle(new RMAppRejectedEvent(applicationId, e.getMessage()));
  throw RPCUtil.getRemoteException(e);
}
{code}
Are you OK with the above change?
I think it is better to call parseCredentials and catch the exception even when security is not enabled, so that we can detect corrupted credentials from the client earlier.

> AMLaucher: sporadic job failures due to EOFException in readTokenStorageStream
> ------------------------------------------------------------------------------
>
>                 Key: YARN-2893
>                 URL: https://issues.apache.org/jira/browse/YARN-2893
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: resourcemanager
>    Affects Versions: 2.4.0
>            Reporter: Gera Shegalov
>            Assignee: zhihai xu
>         Attachments: YARN-2893.000.patch, YARN-2893.001.patch, YARN-2893.002.patch
>
>
> MapReduce jobs on our clusters experience sporadic failures due to corrupt
> tokens in the AM launch context.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)