[
https://issues.apache.org/jira/browse/YARN-1574?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13865741#comment-13865741
]
Xuan Gong commented on YARN-1574:
---------------------------------
For example, suppose we have two RMs (RM1 and RM2) and one NM, and we perform
the following transitions:
Set RM1 to Active --> set RM1 to Standby --> set RM1 to Active.
Afterwards, the same event, RMNodeEventType.STARTED, is handled twice:
{code}
14/01/08 10:39:20 INFO rmnode.RMNodeImpl: localhost:9105 Node Transitioned from NEW to RUNNING
14/01/08 10:39:20 ERROR rmnode.RMNodeImpl: Can't handle this event at current state
org.apache.hadoop.yarn.state.InvalidStateTransitonException: Invalid event: STARTED at RUNNING
    at org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:305)
    at org.apache.hadoop.yarn.state.StateMachineFactory.access$300(StateMachineFactory.java:46)
    at org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:448)
    at org.apache.hadoop.yarn.server.resourcemanager.rmnode.RMNodeImpl.handle(RMNodeImpl.java:377)
    at org.apache.hadoop.yarn.server.resourcemanager.rmnode.RMNodeImpl.handle(RMNodeImpl.java:73)
    at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$NodeEventDispatcher.handle(ResourceManager.java:731)
    at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$NodeEventDispatcher.handle(ResourceManager.java:715)
    at org.apache.hadoop.yarn.event.AsyncDispatcher$MultiListenerHandler.handle(AsyncDispatcher.java:264)
    at org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:173)
    at org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:106)
    at java.lang.Thread.run(Thread.java:695)
{code}
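To make the failure mode concrete, here is a minimal, self-contained sketch of
the sequence above. It is a toy model, not the ResourceManager code:
ToyDispatcher, ToyNode and invalidTransitionsAfter are hypothetical names, and
the assumption is simply that each transition to Active registers one more
handler on a dispatcher that outlives the transition.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Consumer;

// Toy model only: none of these classes are the real Hadoop ones.
class DuplicateRegistrationDemo {

    /** Toy dispatcher: every register() call appends another handler. */
    static class ToyDispatcher {
        private final Map<String, List<Consumer<String>>> handlers = new HashMap<>();

        void register(String eventType, Consumer<String> handler) {
            handlers.computeIfAbsent(eventType, k -> new ArrayList<>()).add(handler);
        }

        void dispatch(String eventType, String event) {
            // Deliver the event once to each registered handler, in order.
            for (Consumer<String> h : handlers.getOrDefault(eventType, new ArrayList<>())) {
                h.accept(event);
            }
        }
    }

    /** Toy node with two states; STARTED is only valid while in NEW. */
    static class ToyNode {
        String state = "NEW";
        int invalidTransitions = 0;

        void handle(String event) {
            if (event.equals("STARTED") && state.equals("NEW")) {
                state = "RUNNING";
            } else {
                // Stands in for the "Invalid event: STARTED at RUNNING" error.
                invalidTransitions++;
            }
        }
    }

    /**
     * Registers a node handler once per simulated transition to Active on the
     * same long-lived dispatcher, then dispatches a single STARTED event.
     */
    static int invalidTransitionsAfter(int activations) {
        ToyDispatcher dispatcher = new ToyDispatcher();
        ToyNode node = new ToyNode();
        for (int i = 0; i < activations; i++) {
            dispatcher.register("NODE", node::handle);
        }
        dispatcher.dispatch("NODE", "STARTED");
        return node.invalidTransitions;
    }

    public static void main(String[] args) {
        System.out.println(invalidTransitionsAfter(1)); // one activation
        System.out.println(invalidTransitionsAfter(2)); // Active -> Standby -> Active
    }
}
```

With one activation the single STARTED event moves the toy node from NEW to
RUNNING; with two, the second handler sees STARTED while the node is already
RUNNING, which mirrors the error in the log.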
> When RM transit from Active to Standby, the same eventDispatcher should not
> be registered more than once
> --------------------------------------------------------------------------------------------------------
>
> Key: YARN-1574
> URL: https://issues.apache.org/jira/browse/YARN-1574
> Project: Hadoop YARN
> Issue Type: Sub-task
> Reporter: Xuan Gong
> Assignee: Xuan Gong
> Priority: Blocker
>
> Currently, rmDispatcher has been moved out of ActiveService, but we still
> register the event dispatchers, such as schedulerDispatcher and
> RMAppEventDispatcher, when we initialize the ActiveService.
> Almost every time we transition the RM from Active to Standby, we need to
> initialize the ActiveService again. That means the same event dispatchers
> are registered again, which causes the same event to be handled several times.
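
One way to keep a repeated initialization from stacking handlers is to have
registration replace the previous handler for an event type instead of
appending to it. The sketch below illustrates only that idea; it is not the
patch for this issue, and ReplacingDispatcher and deliveriesAfter are
hypothetical names.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Consumer;

// Illustrative sketch only: not the Hadoop dispatcher and not the actual fix.
class SingleRegistrationSketch {

    /** Toy dispatcher that keeps at most one handler per event type. */
    static class ReplacingDispatcher {
        private final Map<String, Consumer<String>> handlers = new HashMap<>();

        void register(String eventType, Consumer<String> handler) {
            // put() overwrites any handler registered by an earlier activation.
            handlers.put(eventType, handler);
        }

        void dispatch(String eventType, String event) {
            Consumer<String> h = handlers.get(eventType);
            if (h != null) {
                h.accept(event);
            }
        }
    }

    /** Counts how often one STARTED event is delivered after N registrations. */
    static int deliveriesAfter(int activations) {
        ReplacingDispatcher dispatcher = new ReplacingDispatcher();
        int[] deliveries = {0};
        for (int i = 0; i < activations; i++) {
            dispatcher.register("NODE", e -> deliveries[0]++);
        }
        dispatcher.dispatch("NODE", "STARTED");
        return deliveries[0];
    }

    public static void main(String[] args) {
        System.out.println(deliveriesAfter(3));
    }
}
```

An alternative with the same effect would be to create a fresh dispatcher
whenever the active services are re-initialized, so that handlers from the
previous activation are discarded along with it.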