[
https://issues.apache.org/jira/browse/YARN-2579?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14197890#comment-14197890
]
Vinod Kumar Vavilapalli commented on YARN-2579:
-----------------------------------------------
Thanks for working on this [~rohithsharma]!
The abstractions are all broken in this part of the code-base, but it's not
your fault. Given this is a blocker, your approach to minimize the changes is
good!
One comment: this still invokes transitionToStandby from within the
RMStateStore's dispatcher, so what we will see is the following:
{code}
RMStateStoreDispatcher.handle() -> store fails in the event, generates a
notifyStoreOperationFailed -> invokes resourceManager.handleTransitionToStandBy
-> calls transitionToStandby(boolean) -> activeServices.stop() ->
stateStore.close() -> RMStateStoreDispatcher.stop()
{code}
We should avoid these dispatcher events trying to close the dispatchers - that
was why I suggested a separate thread (my point 2.2 in the proposal).
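To illustrate the separate-thread idea, here is a minimal Java sketch. It is not the YARN code: MiniDispatcher, runDemo and the thread names are hypothetical stand-ins, and it assumes a stop() that waits for the dispatcher's worker thread to finish. The failed-store event only starts a new thread, and that thread performs the stop, so the dispatcher never tries to shut itself down from inside its own event handler.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

/** Hypothetical sketch; none of these names are actual YARN classes. */
public class StandbyHandoffSketch {

    /** Minimal single-threaded dispatcher standing in for the state-store dispatcher. */
    static class MiniDispatcher {
        private final BlockingQueue<Runnable> queue = new LinkedBlockingQueue<>();
        private volatile boolean running = true;
        private final Thread worker = new Thread(() -> {
            while (running) {
                try {
                    Runnable event = queue.poll(50, TimeUnit.MILLISECONDS);
                    if (event != null) {
                        event.run();
                    }
                } catch (InterruptedException e) {
                    return; // stop() interrupts the worker to end the loop
                }
            }
        }, "mini-dispatcher");

        void start() { worker.start(); }
        void post(Runnable event) { queue.add(event); }
        boolean isDispatcherThread() { return Thread.currentThread() == worker; }
        boolean isAlive() { return worker.isAlive(); }

        /** Stops the loop and waits for the worker; refuses to run on the worker itself. */
        void stop() throws InterruptedException {
            if (isDispatcherThread()) {
                throw new IllegalStateException("dispatcher asked to stop itself");
            }
            running = false;
            worker.interrupt();
            worker.join();
        }
    }

    /** Returns true if the dispatcher was stopped cleanly from a thread other than its own. */
    public static boolean runDemo() {
        MiniDispatcher dispatcher = new MiniDispatcher();
        CountDownLatch stopped = new CountDownLatch(1);
        boolean[] stoppedOffThread = {false};

        dispatcher.start();
        // This event simulates a failed store operation. Instead of running the
        // standby transition inline, the handler hands it to a separate thread.
        dispatcher.post(() -> {
            Thread transition = new Thread(() -> {
                try {
                    stoppedOffThread[0] = !dispatcher.isDispatcherThread();
                    dispatcher.stop(); // stands in for the transitionToStandby call chain
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                } finally {
                    stopped.countDown();
                }
            }, "standby-transition");
            transition.setDaemon(true);
            transition.start();
        });

        try {
            boolean finished = stopped.await(5, TimeUnit.SECONDS);
            return finished && stoppedOffThread[0] && !dispatcher.isAlive();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println("stopped off dispatcher thread: " + runDemo());
    }
}
```

In this sketch, calling dispatcher.stop() directly inside the posted event would hit the IllegalStateException guard. The guard is only a device of the example to make the self-stop visible; the real dispatcher may behave differently.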
> Both RM's state is Active , but 1 RM is not really active.
> ----------------------------------------------------------
>
> Key: YARN-2579
> URL: https://issues.apache.org/jira/browse/YARN-2579
> Project: Hadoop YARN
> Issue Type: Bug
> Components: resourcemanager
> Affects Versions: 2.5.1
> Reporter: Rohith
> Assignee: Rohith
> Priority: Blocker
> Attachments: YARN-2579-20141105.1.patch, YARN-2579-20141105.patch,
> YARN-2579.patch, YARN-2579.patch
>
>
> I encountered a situation where both RMs' web pages were accessible and each
> displayed its state as Active, but one RM's ActiveServices were stopped.