Rohith updated YARN-1366:

    Attachment: YARN-1366.3.patch

I have updated the patch with the changes below.

   bq. Pending releases - AM forgets about a request to release once it is made. 
We will have to reissue a release request after RM restart.
   bq. Blacklisting has logic in ignoreBlacklisting to ignore it if we cross a 
   bq. There are a few places where the line exceeds 80 chars
      Even after running the formatter, these lines do not come down to 80 characters or fewer.
       Ex: Line 209 in RMContainerRequestor
           Line 267 in AMRMClientImpl

Apart from the above fixes, the other changes are:
* AMRMClient
**  AMRMClient maintains the blacklisted nodes. These are sent back to the RM on resync.
**  Added a test to check this functionality.

* MapReduce
** Added a test that relies on the YARN-1365 patch. To run this test, the 
YARN-1365 patch must be applied.
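
For reviewers, the AM-side bookkeeping described above can be pictured roughly as follows. This is only a minimal sketch, not code from the patch: ResyncState, ResyncRequest and all of their members are hypothetical names, containers and nodes are plain strings, and it assumes a pending release can be dropped once the RM reports that container as completed.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical AM-side state; not the classes changed by the patch. */
class ResyncState {
    // Release requests already sent to the RM but not yet seen as completed.
    private final Set<String> pendingReleases = new LinkedHashSet<>();
    // Nodes the AM has asked the RM to avoid.
    private final Set<String> blacklistedNodes = new LinkedHashSet<>();
    // Sequence number of the last allocate response (hypothetical field).
    private int lastResponseId = 0;

    void release(String containerId) {
        pendingReleases.add(containerId);
    }

    void blacklist(String node) {
        blacklistedNodes.add(node);
    }

    /** Assumption: a completed container no longer needs its release re-sent. */
    void onAllocateResponse(int responseId, List<String> completedContainers) {
        lastResponseId = responseId;
        pendingReleases.removeAll(completedContainers);
    }

    int getLastResponseId() {
        return lastResponseId;
    }

    /** On resync: reset the sequence number and collect what must be re-sent. */
    ResyncRequest resync() {
        lastResponseId = 0;
        return new ResyncRequest(new ArrayList<>(pendingReleases),
                                 new ArrayList<>(blacklistedNodes));
    }
}

/** Hypothetical value object carrying the state replayed to the RM. */
class ResyncRequest {
    final List<String> releases;
    final List<String> blacklist;

    ResyncRequest(List<String> releases, List<String> blacklist) {
        this.releases = releases;
        this.blacklist = blacklist;
    }
}
```

The point of keeping both sets on the AM side is that a restarted RM has lost them, so the AM has to be able to replay them on the first allocate call after the resync.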

Please review the patch.

> ApplicationMasterService should Resync with the AM upon allocate call after 
> restart
> -----------------------------------------------------------------------------------
>                 Key: YARN-1366
>                 URL: https://issues.apache.org/jira/browse/YARN-1366
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: resourcemanager
>            Reporter: Bikas Saha
>            Assignee: Rohith
>         Attachments: YARN-1366.1.patch, YARN-1366.2.patch, YARN-1366.3.patch, 
> YARN-1366.patch, YARN-1366.prototype.patch, YARN-1366.prototype.patch
> The ApplicationMasterService currently sends a resync response to which the 
> AM responds by shutting down. The AM behavior is expected to change to 
> resyncing with the RM. Resync means resetting the allocate RPC 
> sequence number to 0 and the AM should send its entire outstanding request to 
> the RM. Note that if the AM is making its first allocate call to the RM then 
> things should proceed like normal without needing a resync. The RM will 
> return all containers that have completed since the RM last synced with the 
> AM. Some container completions may be reported more than once.
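
Since the description says completions may be reported more than once, the AM side needs to treat them idempotently. A minimal sketch of one way to do that, assuming container IDs are unique strings; CompletionTracker and filterNew are hypothetical names and are not part of the patch or of the YARN API:

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical helper that drops completions the AM has already handled. */
class CompletionTracker {
    // IDs of containers whose completion was already processed.
    private final Set<String> seen = new HashSet<>();

    /** Returns only the reported completions that were not seen before. */
    List<String> filterNew(List<String> reported) {
        List<String> fresh = new ArrayList<>();
        for (String containerId : reported) {
            // Set.add returns false when the ID is already present.
            if (seen.add(containerId)) {
                fresh.add(containerId);
            }
        }
        return fresh;
    }
}
```

With this, a completion that the RM repeats after a resync is processed only the first time it arrives.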

This message was sent by Atlassian JIRA