[
https://issues.apache.org/jira/browse/YARN-1815?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15303192#comment-15303192
]
Subru Krishnan edited comment on YARN-1815 at 5/26/16 11:54 PM:
----------------------------------------------------------------
I tested the following scenarios:
For failover:
* Submit an unmanaged AM and fail over the RM. The unmanaged AM is able to
re-register and complete successfully. Verified that all its containers are
preserved and no work is lost.
For recording state:
* An unmanaged AM that completes successfully; verified the state is SUCCEEDED
* An unmanaged AM that is killed during execution; verified the state is KILLED
* An unmanaged AM that fails (essentially times out); verified the state is
FAILED
> Work preserving recovery of Unmanaged AMs
> -----------------------------------------
>
> Key: YARN-1815
> URL: https://issues.apache.org/jira/browse/YARN-1815
> Project: Hadoop YARN
> Issue Type: Sub-task
> Components: resourcemanager
> Affects Versions: 2.3.0
> Reporter: Karthik Kambatla
> Assignee: Subru Krishnan
> Priority: Critical
> Attachments: Unmanaged AM recovery.png, YARN-1815-v3.patch,
> yarn-1815-1.patch, yarn-1815-2.patch, yarn-1815-2.patch
>
>
> Currently, work-preserving RM restart recovers unmanaged AMs, but it has two
> shortcomings: all running containers are killed, and completed unmanaged AMs
> are also recovered because we do _not_ record the final state for unmanaged
> AMs in the RM StateStore. This JIRA proposes to address both shortcomings so
> that work-preserving unmanaged AM recovery works exactly as it does for
> managed AMs.