[ https://issues.apache.org/jira/browse/YARN-1366?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Rohith updated YARN-1366:
-------------------------
Attachment: YARN-1366.3.patch
I updated the patch with the changes below.
bq. Pending releases - AM forgets about a request to release once its made.
We will have to reissue a release request after RM restart
FIXED
bq. Blacklisting has logic in ignoreBlacklisting to ignore it if we cross a
threshold.
FIXED
bq. There a few places where the line exceeds 80 chars
Even after running the formatter, these lines do not come down below 80 characters.
Examples: line 209 in RMContainerRequestor and line 267 in AMRMClientImpl.
Apart from the fixes above, the other changes are:
* AMRMClient
** AMRMClient now maintains the blacklisted nodes. These are sent back to the RM on resync.
** Added test for checking functionality.
* MapReduce
** Added a test that depends on the YARN-1365 patch. To run this test, the
YARN-1365 patch must be applied first.
Please review the patch.
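
To illustrate the two review points above (pending releases and the blacklist surviving an RM restart), here is a minimal sketch of the bookkeeping involved. It is not the code from the attached patch: the class ResyncTracker and all of its method names are hypothetical, and container and node IDs are plain strings rather than YARN types. The assumption is that a release stays "pending" until the RM reports the container as completed, so anything still pending can be reissued on resync.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical illustration only; not the AMRMClientImpl code from the patch.
class ResyncTracker {
    // Releases that were requested but not yet confirmed as completed by the RM.
    private final Set<String> pendingReleases = new LinkedHashSet<>();
    // Nodes the AM currently has blacklisted.
    private final Set<String> blacklistedNodes = new LinkedHashSet<>();

    // Remember the release instead of forgetting it once it is sent.
    void releaseContainer(String containerId) {
        pendingReleases.add(containerId);
    }

    // The RM reported the container as completed, so the release is done.
    void containerCompleted(String containerId) {
        pendingReleases.remove(containerId);
    }

    void addToBlacklist(String node) {
        blacklistedNodes.add(node);
    }

    void removeFromBlacklist(String node) {
        blacklistedNodes.remove(node);
    }

    // On resync, every still-pending release is reissued.
    List<String> releasesToResend() {
        return new ArrayList<>(pendingReleases);
    }

    // On resync, the whole current blacklist is sent again.
    List<String> blacklistToResend() {
        return new ArrayList<>(blacklistedNodes);
    }
}
```

With this shape, a release for a container that the RM has already confirmed is not reissued, while one that was in flight during the restart is.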
> ApplicationMasterService should Resync with the AM upon allocate call after
> restart
> -----------------------------------------------------------------------------------
>
> Key: YARN-1366
> URL: https://issues.apache.org/jira/browse/YARN-1366
> Project: Hadoop YARN
> Issue Type: Sub-task
> Components: resourcemanager
> Reporter: Bikas Saha
> Assignee: Rohith
> Attachments: YARN-1366.1.patch, YARN-1366.2.patch, YARN-1366.3.patch,
> YARN-1366.patch, YARN-1366.prototype.patch, YARN-1366.prototype.patch
>
>
> The ApplicationMasterService currently sends a resync response, to which the
> AM responds by shutting down. The AM behavior is expected to change to
> resyncing with the RM. Resync means resetting the allocate RPC sequence
> number to 0 and having the AM send its entire outstanding request to the RM.
> Note that if the AM is making its first allocate call to the RM, then things
> should proceed as normal without needing a resync. The RM will return all
> containers that have completed since the RM last synced with the AM. Some
> container completions may be reported more than once.
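
The resync behavior described in the issue can be sketched on the AM side roughly as follows. This is only an illustration under stated assumptions, not the YARN implementation: ResyncingAllocator and its methods are hypothetical names, resource requests are reduced to a map from resource name to container count, and a normal allocate call is assumed to send only the entries changed since the previous call. On resync the sequence number goes back to 0 and the full outstanding request is sent. Because completions may be reported more than once, the sketch also filters out container IDs it has already seen.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical AM-side sketch of the resync described in the issue.
class ResyncingAllocator {
    private int responseId = 0; // allocate RPC sequence number
    private final Map<String, Integer> outstanding = new LinkedHashMap<>();
    private final Set<String> changed = new HashSet<>();
    private final Set<String> seenCompleted = new HashSet<>();

    // Record (or update) an outstanding request for a resource name.
    void request(String resource, int containers) {
        outstanding.put(resource, containers);
        changed.add(resource);
    }

    // Build the ask for the next allocate call.
    Map<String, Integer> buildAsk(boolean resyncRequested) {
        Map<String, Integer> ask = new LinkedHashMap<>();
        if (resyncRequested) {
            responseId = 0;          // reset the sequence number
            ask.putAll(outstanding); // send the entire outstanding request
        } else {
            for (Map.Entry<String, Integer> e : outstanding.entrySet()) {
                if (changed.contains(e.getKey())) {
                    ask.put(e.getKey(), e.getValue()); // only the delta
                }
            }
        }
        changed.clear();
        return ask;
    }

    // Called after the RM accepts an allocate call.
    void onSuccessfulAllocate() {
        responseId++;
    }

    int responseId() {
        return responseId;
    }

    // Drop completions that were already reported before.
    List<String> newCompletions(List<String> reported) {
        List<String> fresh = new ArrayList<>();
        for (String id : reported) {
            if (seenCompleted.add(id)) {
                fresh.add(id);
            }
        }
        return fresh;
    }
}
```

An AM making its first allocate call already has sequence number 0 and nothing sent yet, so in this sketch the normal path and the resync path produce the same ask, which matches the note that the first call needs no special handling.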
--
This message was sent by Atlassian JIRA
(v6.2#6252)