[jira] [Commented] (YARN-1815) RM doesn't recover unmanaged AMs into its memory after restart

Bikas Saha (JIRA) Wed, 26 Mar 2014 19:56:55 -0700

    [ 
https://issues.apache.org/jira/browse/YARN-1815?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13948810#comment-13948810
 ]


Bikas Saha commented on YARN-1815:
----------------------------------

I am not up to date with the latest state of the code. The original restart 
code saved all AM info into the state store, managed and unmanaged. 
Upon recovery, the unmanaged AM was explicitly discarded because all AMs were 
asked to restart after RM recovery. The plan was that once AMs are asked 
to resync instead of restart, the unmanaged AM would no longer be discarded on 
recovery. It would go through the same flow as managed AMs: it would 
ping the RM and be asked to resync. So the flow of 
unmanaged AMs would be identical to the flow of managed AMs. This comment 
mainly concerns running unmanaged AMs. Unmanaged AMs that had already 
finished before the RM restart should already be handled, since their 
completion information will tell the recovered RM that they are done.
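
To make the flow above concrete, here is a minimal Java sketch of the two 
recovery behaviours. It is a toy model, not the actual ResourceManager code: 
RecoverySketch, SavedApp, Policy and recoverRunning are hypothetical names 
that do not correspond to any YARN class or method. It assumes only what is 
described above: every AM is saved, finished AMs are recognised from their 
completion information, and running unmanaged AMs are discarded only while 
AMs are asked to restart rather than resync.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Toy model of the recovery flow; all names here are hypothetical.
class RecoverySketch {

    // What the state store is assumed to hold for each AM.
    static final class SavedApp {
        final String id;
        final boolean unmanaged;
        final boolean finished;

        SavedApp(String id, boolean unmanaged, boolean finished) {
            this.id = id;
            this.unmanaged = unmanaged;
            this.finished = finished;
        }
    }

    // How AMs are treated after the RM comes back up.
    enum Policy { RESTART_AMS, RESYNC_AMS }

    // Returns the ids of the AMs the recovered RM keeps tracking as running.
    static List<String> recoverRunning(List<SavedApp> saved, Policy policy) {
        List<String> running = new ArrayList<>();
        for (SavedApp app : saved) {
            if (app.finished) {
                // Completion information tells the recovered RM it is done.
                continue;
            }
            if (app.unmanaged && policy == Policy.RESTART_AMS) {
                // Original behaviour: running unmanaged AMs are discarded.
                continue;
            }
            // Managed AMs always, and unmanaged AMs once resync is in place.
            running.add(app.id);
        }
        return running;
    }

    public static void main(String[] args) {
        List<SavedApp> saved = Arrays.asList(
            new SavedApp("managed-running", false, false),
            new SavedApp("unmanaged-running", true, false),
            new SavedApp("unmanaged-finished", true, true));
        System.out.println(recoverRunning(saved, Policy.RESTART_AMS));
        System.out.println(recoverRunning(saved, Policy.RESYNC_AMS));
    }
}
```

In this sketch the only difference between the two policies is whether the 
running unmanaged AM survives recovery; the managed and finished cases are 
handled identically in both.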

bq.  by recording unmanaged AMs also when they finish and use that information 
for recovery
Is completion information for unmanaged AMs not saved currently? We should be 
saving that information so that the flow is identical to that of managed AMs.


> RM doesn't recover unmanaged AMs into its memory after restart
> --------------------------------------------------------------
>
>                 Key: YARN-1815
>                 URL: https://issues.apache.org/jira/browse/YARN-1815
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: resourcemanager
>    Affects Versions: 2.3.0
>            Reporter: Karthik Kambatla
>            Assignee: Karthik Kambatla
>            Priority: Critical
>         Attachments: Unmanaged AM recovery.png, yarn-1815-1.patch, 
> yarn-1815-2.patch, yarn-1815-2.patch
>
>
> RM doesn't recover unmanaged AMs into its memory after restart



--
This message was sent by Atlassian JIRA
(v6.2#6252)
