[ 
https://issues.apache.org/jira/browse/YARN-2893?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14383151#comment-14383151
 ] 

zhihai xu commented on YARN-2893:
---------------------------------

[~adhoot], thanks for the review. I added a test case for the AMLauncher 
changes in the new patch, YARN-2893.002.patch.
The root cause of this bug is in the job client, which submitted a bad token 
in the ApplicationSubmissionContext.
The change to RMAppManager#submitApplication catches this error earlier, so 
that the user who submits the application sees the real cause of the issue.
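
To illustrate the idea of failing early at submission time, here is a minimal, 
self-contained sketch. It is not the patch itself: EarlyTokenCheck, encode, 
parseTokens and submit are hypothetical names, and the token format (an int 
count followed by UTF strings) is only a stand-in for the real Credentials 
serialization.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the submission path; this is not Hadoop code.
class EarlyTokenCheck {

  // Toy token format (an assumption): int count, then that many UTF strings.
  static byte[] encode(List<String> tokens) {
    ByteArrayOutputStream bytes = new ByteArrayOutputStream();
    try (DataOutputStream out = new DataOutputStream(bytes)) {
      out.writeInt(tokens.size());
      for (String t : tokens) {
        out.writeUTF(t);
      }
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
    return bytes.toByteArray();
  }

  // Throws EOFException (a subclass of IOException) if the buffer is truncated.
  static List<String> parseTokens(byte[] buf) throws IOException {
    try (DataInputStream in =
        new DataInputStream(new ByteArrayInputStream(buf))) {
      int n = in.readInt();
      List<String> tokens = new ArrayList<>();
      for (int i = 0; i < n; i++) {
        tokens.add(in.readUTF());
      }
      return tokens;
    }
  }

  // Parse synchronously, so a bad token is reported to the submitter
  // right away instead of surfacing later when the AM is launched.
  static String submit(String appId, byte[] tokenBuf) {
    try {
      parseTokens(tokenBuf);
      return "ACCEPTED " + appId;
    } catch (IOException e) {
      return "REJECTED " + appId + ": unable to parse credentials ("
          + e.getClass().getSimpleName() + ")";
    }
  }
}
```

In this sketch a truncated buffer is rejected in submit itself, which is the 
behavior the submitApplication change aims for.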

bq. The changes for RMAppManager#submitApplication seems to no longer return 
RMAppRejectedEvent for any exception in 
getDelegationTokenRenewer().addApplicationAsync. Is that deliberate?
I checked the code for DelegationTokenRenewer#addApplicationAsync and did not 
find any exception that addApplicationAsync itself can throw.
addApplicationAsync launches a thread to run handleDTRenewerAppSubmitEvent, 
and any exception thrown from handleDTRenewerAppSubmitEvent results in an 
RMAppRejectedEvent being sent.
{code}
    private void handleDTRenewerAppSubmitEvent(
        DelegationTokenRenewerAppSubmitEvent event) {
      try {
        // Setup tokens for renewal
        DelegationTokenRenewer.this.handleAppSubmitEvent(event);
        rmContext.getDispatcher().getEventHandler()
            .handle(new RMAppEvent(event.getApplicationId(),
                RMAppEventType.START));
      } catch (Throwable t) {
        LOG.warn(
            "Unable to add the application to the delegation token renewer.",
            t);
        // Sending APP_REJECTED is fine, since we assume that the
        // RMApp is in NEW state and thus we haven't yet informed the
        // Scheduler about the existence of the application
        rmContext.getDispatcher().getEventHandler().handle(
            new RMAppRejectedEvent(event.getApplicationId(), t.getMessage()));
      }
    }
{code}
This is why I only check for the exception from parseCredentials.
The original code also expected only the exception from parseCredentials, 
judging by its log message:
{code}
LOG.warn("Unable to parse credentials.", e);
{code}
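
The asynchronous path described above can be modeled with a small standalone 
sketch. It assumes nothing about the real DelegationTokenRenewer internals 
beyond what is quoted in this comment: AsyncRenewerSketch, nextEvent and the 
string events are hypothetical, and a single-threaded executor stands in for 
the renewer thread.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;
import java.util.function.Consumer;

// Hypothetical model of the async submit path; this is not Hadoop code.
class AsyncRenewerSketch {
  // Daemon worker thread, so the JVM can exit even without shutdown().
  private final ExecutorService pool = Executors.newSingleThreadExecutor(r -> {
    Thread t = new Thread(r, "renewer-sketch");
    t.setDaemon(true);
    return t;
  });

  // Stand-in for the RM dispatcher: events are just strings here.
  private final BlockingQueue<String> events = new LinkedBlockingQueue<>();
  private final Consumer<String> setupTokens;

  AsyncRenewerSketch(Consumer<String> setupTokens) {
    this.setupTokens = setupTokens;
  }

  // Returns immediately. A failure in token setup is never thrown to the
  // caller; it is turned into a rejection event on the worker thread.
  void addApplicationAsync(String appId) {
    pool.execute(() -> {
      try {
        setupTokens.accept(appId);
        events.add("START " + appId);
      } catch (Throwable t) {
        events.add("APP_REJECTED " + appId + ": " + t.getMessage());
      }
    });
  }

  // Waits up to five seconds for the next event; returns null on timeout
  // or if the waiting thread is interrupted.
  String nextEvent() {
    try {
      return events.poll(5, TimeUnit.SECONDS);
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      return null;
    }
  }

  void shutdown() {
    pool.shutdown();
  }
}
```

Because the caller of addApplicationAsync never sees the failure, a try/catch 
around the call in the submitter would have nothing to catch in this model, 
which matches the reasoning for checking only parseCredentials.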

> AMLaucher: sporadic job failures due to EOFException in readTokenStorageStream
> ------------------------------------------------------------------------------
>
>                 Key: YARN-2893
>                 URL: https://issues.apache.org/jira/browse/YARN-2893
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: resourcemanager
>    Affects Versions: 2.4.0
>            Reporter: Gera Shegalov
>            Assignee: zhihai xu
>         Attachments: YARN-2893.000.patch, YARN-2893.001.patch, 
> YARN-2893.002.patch
>
>
> MapReduce jobs on our clusters experience sporadic failures due to corrupt 
> tokens in the AM launch context.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
