[ https://issues.apache.org/jira/browse/YARN-4373?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15068335#comment-15068335 ]
Daniel Templeton commented on YARN-4373:
----------------------------------------
I'm also incredulous. I'm still working to reproduce the issue. It was
reported by our testing team. If/when I reproduce it, I'll post the details.
> Jobs can be temporarily forgotten during recovery
> -------------------------------------------------
>
> Key: YARN-4373
> URL: https://issues.apache.org/jira/browse/YARN-4373
> Project: Hadoop YARN
> Issue Type: Bug
> Affects Versions: 2.7.1
> Reporter: Daniel Templeton
> Assignee: Daniel Templeton
> Priority: Critical
>
> The RM becomes available to service requests before state store recovery is
> started. Before recovery and during the recovery period, it's possible for a
> client to request an application report for a running application to which
> the RM will respond that the application is unknown.
> I'm seeing this issue with Oozie during an RM failover. Until the active
> finishes recovery, it reports erroneous information to Oozie, which doesn't
> have context to know that it should just try again later.