[ 
https://issues.apache.org/jira/browse/YARN-1366?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14000284#comment-14000284
 ] 

Bikas Saha commented on YARN-1366:
----------------------------------

bq. Seems like we are going with no resync API for now as per the current 
patch. I think it's a good idea to hold off on the new API unless we see a need. 
I feel there isn't a strong case for it yet.

I don't think we can summarily make such a choice without a proper discussion. 
Again, I am not advocating either choice. But we should understand the 
approaches and their effects on the system (users + back-end implementation) 
before we make a call on the API. My last comment opened the discussion with 
some questions, and it would be great if the assignee ([~rohithsharma]) and other 
committers/contributors expressed their understanding and insight on those 
questions.

> ApplicationMasterService should Resync with the AM upon allocate call after 
> restart
> -----------------------------------------------------------------------------------
>
>                 Key: YARN-1366
>                 URL: https://issues.apache.org/jira/browse/YARN-1366
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: resourcemanager
>            Reporter: Bikas Saha
>            Assignee: Rohith
>         Attachments: YARN-1366.1.patch, YARN-1366.2.patch, YARN-1366.patch, 
> YARN-1366.prototype.patch, YARN-1366.prototype.patch
>
>
> The ApplicationMasterService currently sends a resync response to which the 
> AM responds by shutting down. The AM behavior is expected to change to 
> resyncing with the RM. Resync means resetting the allocate RPC 
> sequence number to 0, after which the AM should resend all of its outstanding requests to 
> the RM. Note that if the AM is making its first allocate call to the RM then 
> things should proceed like normal without needing a resync. The RM will 
> return all containers that have completed since the RM last synced with the 
> AM. Some container completions may be reported more than once.
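
The resync behavior described above could be sketched on the AM side roughly as follows. This is only an illustration under stated assumptions, not the approach taken in any of the attached patches: the class ResyncSketch and all of its methods are hypothetical names, asks and container IDs are modeled as plain strings, and the de-duplication of completions is one possible way for an AM to tolerate completions being reported more than once.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical AM-side bookkeeping; none of these names come from the YARN code base.
class ResyncSketch {
    // Allocate RPC sequence number, as tracked by the AM.
    private int responseId = 0;
    // Every request the AM has asked for and not yet had satisfied.
    private final List<String> outstandingAsks = new ArrayList<>();
    // Container IDs whose completion the AM has already processed.
    private final Set<String> seenCompletions = new HashSet<>();

    void addAsk(String ask) {
        outstandingAsks.add(ask);
    }

    // A normal, successful allocate call advances the sequence number.
    void onAllocateSuccess() {
        responseId++;
    }

    int getResponseId() {
        return responseId;
    }

    // On a resync signal: reset the sequence number to 0 and return a copy of
    // the full outstanding request list, to be resent on the next allocate call.
    List<String> onResync() {
        responseId = 0;
        return new ArrayList<>(outstandingAsks);
    }

    // Returns true only the first time a completion is seen, so a completion
    // that the RM reports again after a restart is ignored.
    boolean recordCompletion(String containerId) {
        return seenCompletions.add(containerId);
    }

    public static void main(String[] args) {
        ResyncSketch am = new ResyncSketch();
        am.addAsk("ask-1");
        am.addAsk("ask-2");
        am.onAllocateSuccess();

        List<String> toResend = am.onResync();
        System.out.println("sequence number after resync: " + am.getResponseId());
        System.out.println("asks to resend: " + toResend);
        System.out.println("first report handled: " + am.recordCompletion("container_1"));
        System.out.println("repeat report handled: " + am.recordCompletion("container_1"));
    }
}
```

Keeping the complete ask list on the AM side is what makes the resend possible in this sketch; whether that state lives in the AM itself or in a client library is one of the open design questions in this discussion.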



--
This message was sent by Atlassian JIRA
(v6.2#6252)
