[ 
https://issues.apache.org/jira/browse/YARN-1366?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13997722#comment-13997722
 ] 

Bikas Saha commented on YARN-1366:
----------------------------------

Is there any value in combining the re-register and the re-sending of pending 
requests into one new "resync" method? I am not arguing in favor of it, but it 
would help to evaluate the pros and cons and go through the mental exercise of 
how things would work on the AM and RM sides. This is important because we are 
making API changes, and these are hard to undo.
For example, one pro of a new resync method is that the API clearly specifies 
that pending requests must be re-submitted. Are there any other advantages on 
the RM side to having this information arrive together in one "atomic" 
operation? Does it help the RM differentiate between an AM that was launched 
and had registered and an AM that had been launched but had not yet registered 
when the RM died? Is that important in any case?
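
To make the comparison concrete, here is a rough sketch of the two API shapes 
against a toy RM that has just restarted and lost its state. Every name here 
(ToyRm, reRegister, resendPending, resync, the hadRegistered flag) is 
hypothetical and invented for illustration; none of it is the YARN API or the 
contents of the attached patches.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy RM that restarted with empty state. All names are hypothetical.
class ToyRm {
    private final Map<String, List<String>> asksByAttempt = new HashMap<>();
    private final Map<String, Boolean> hadRegistered = new HashMap<>();

    // Option A, step 1: the AM re-registers. The RM now knows the attempt
    // but still has none of its pending asks.
    void reRegister(String attemptId) {
        asksByAttempt.putIfAbsent(attemptId, new ArrayList<>());
    }

    // Option A, step 2: the AM re-sends its pending asks in a later call.
    void resendPending(String attemptId, List<String> asks) {
        List<String> known = asksByAttempt.get(attemptId);
        if (known == null) {
            throw new IllegalStateException("not registered: " + attemptId);
        }
        known.addAll(asks);
    }

    // Option B: one call carries the registration, the pending asks, and
    // whether the AM had registered before the RM went down.
    void resync(String attemptId, boolean registeredBefore, List<String> asks) {
        hadRegistered.put(attemptId, registeredBefore);
        asksByAttempt.put(attemptId, new ArrayList<>(asks));
    }

    boolean knows(String attemptId) {
        return asksByAttempt.containsKey(attemptId);
    }

    List<String> asks(String attemptId) {
        return asksByAttempt.getOrDefault(attemptId, new ArrayList<>());
    }

    // null means the RM was never told (Option A does not carry this flag).
    Boolean hadRegistered(String attemptId) {
        return hadRegistered.get(attemptId);
    }
}
```

In this toy model, Option A leaves a window after reRegister in which the RM 
knows the attempt but holds no asks for it, while Option B never exposes that 
intermediate state. Whether that matters for the real RM is exactly the open 
question.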

> ApplicationMasterService should Resync with the AM upon allocate call after 
> restart
> -----------------------------------------------------------------------------------
>
>                 Key: YARN-1366
>                 URL: https://issues.apache.org/jira/browse/YARN-1366
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: resourcemanager
>            Reporter: Bikas Saha
>            Assignee: Rohith
>         Attachments: YARN-1366.1.patch, YARN-1366.2.patch, YARN-1366.patch, 
> YARN-1366.prototype.patch, YARN-1366.prototype.patch
>
>
> The ApplicationMasterService currently sends a resync response to which the 
> AM responds by shutting down. The AM behavior is expected to change to 
> resyncing with the RM. Resync means resetting the allocate RPC sequence 
> number to 0, and the AM should resend all of its outstanding requests to 
> the RM. Note that if the AM is making its first allocate call to the RM then 
> things should proceed like normal without needing a resync. The RM will 
> return all containers that have completed since the RM last synced with the 
> AM. Some container completions may be reported more than once.
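
The AM-side behavior described above could be sketched roughly as follows. 
This is a minimal model under stated assumptions, not the actual client code: 
the class ResyncingAm and all of its methods are hypothetical, asks and 
container IDs are plain strings, and removal of satisfied asks is left out. It 
shows the three points in the description: the sequence number goes back to 0, 
the full outstanding set is resent instead of a delta, and completions that 
the RM reports more than once are filtered out.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical AM-side bookkeeping; none of these names come from YARN.
class ResyncingAm {
    private int responseId = 0;                                  // allocate RPC sequence number
    private final List<String> outstanding = new ArrayList<>();  // every ask made so far (satisfaction not modeled)
    private final List<String> unsent = new ArrayList<>();       // asks to send on the next allocate
    private final Set<String> seenCompletions = new HashSet<>(); // completions already handled

    void addRequest(String ask) {
        outstanding.add(ask);
        unsent.add(ask);
    }

    // Asks for the next allocate call. Normally this is only the delta
    // since the previous call.
    List<String> nextAsks() {
        List<String> asks = new ArrayList<>(unsent);
        unsent.clear();
        return asks;
    }

    // The RM asked for a resync: reset the sequence number to 0 and queue
    // the entire outstanding set for the next allocate call.
    void onResync() {
        responseId = 0;
        unsent.clear();
        unsent.addAll(outstanding);
    }

    // Process an allocate response. Returns only completions not seen
    // before, since the RM may report some of them more than once.
    List<String> onAllocateResponse(List<String> completedContainers) {
        responseId++;
        List<String> fresh = new ArrayList<>();
        for (String id : completedContainers) {
            if (seenCompletions.add(id)) {
                fresh.add(id);
            }
        }
        return fresh;
    }

    int responseId() {
        return responseId;
    }
}
```

Deduplicating on the AM side is one way to cope with repeated completion 
reports; the issue text only says that duplicates may occur, not where they 
should be handled.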



--
This message was sent by Atlassian JIRA
(v6.2#6252)
