[jira] [Commented] (YARN-3387) container complete message couldn't pass to am if am restarted and rm changed

Karthik Kambatla (JIRA) Mon, 23 Mar 2015 14:10:11 -0700

    [ 
https://issues.apache.org/jira/browse/YARN-3387?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14376640#comment-14376640
 ]


Karthik Kambatla commented on YARN-3387:
----------------------------------------

Does this imply that our work-preserving AM restart is broken on an RM failover?

> container complete message couldn't pass to am if am restarted and rm changed
> -----------------------------------------------------------------------------
>
>                 Key: YARN-3387
>                 URL: https://issues.apache.org/jira/browse/YARN-3387
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: resourcemanager
>    Affects Versions: 2.6.0
>            Reporter: sandflee
>            Priority: Critical
>
> Suppose work-preserving AM restart and RM HA are both enabled.
> The container-complete message is passed to appAttempt.justFinishedContainers
> in the RM. In the normal situation, all attempts of one app share the same
> justFinishedContainers, but when the RM changes, every attempt has its own
> justFinishedContainers. So in the sequence below, the container-complete
> message cannot be passed to the AM:
> 1. The AM restarts.
> 2. The RM changes.
> 3. A container launched by the first AM completes.
> The container-complete message is passed to appAttempt1, not appAttempt2, but
> the AM pulls finished containers from appAttempt2 (currentAppAttempt).
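
The sequence in the description can be sketched in Java. This is only a minimal illustration of the reported behavior, not ResourceManager code: the Attempt class, the simulate method and the container ID are hypothetical, and the only name taken from the report is justFinishedContainers.

```java
import java.util.ArrayList;
import java.util.List;

public class JustFinishedContainersSketch {

    /** Hypothetical stand-in for an application attempt tracked by the RM. */
    static final class Attempt {
        final String id;
        final List<String> justFinishedContainers;

        Attempt(String id, List<String> justFinishedContainers) {
            this.id = id;
            this.justFinishedContainers = justFinishedContainers;
        }

        /** What the AM receives when it pulls finished containers from this attempt. */
        List<String> pullJustFinishedContainers() {
            List<String> pulled = new ArrayList<>(justFinishedContainers);
            justFinishedContainers.clear();
            return pulled;
        }
    }

    /**
     * Models the report: before an RM change both attempts share one list,
     * after an RM change (assumption from the report) each attempt has its own.
     */
    static List<String> simulate(boolean rmChanged) {
        List<String> shared = new ArrayList<>();
        Attempt appAttempt1 = new Attempt("appAttempt1",
                rmChanged ? new ArrayList<>() : shared);
        Attempt appAttempt2 = new Attempt("appAttempt2",
                rmChanged ? new ArrayList<>() : shared);

        // Step 3: a container launched by the first AM completes, and the
        // completion is recorded on the attempt that launched it.
        appAttempt1.justFinishedContainers.add("container_1");

        // The restarted AM pulls from the current attempt (appAttempt2).
        return appAttempt2.pullJustFinishedContainers();
    }

    public static void main(String[] args) {
        System.out.println("same RM:    " + simulate(false)); // [container_1]
        System.out.println("RM changed: " + simulate(true));  // []
    }
}
```

With a shared list the restarted AM sees the completion. With per-attempt lists it pulls an empty list, which is the lost message described in the report.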



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
