[ https://issues.apache.org/jira/browse/YARN-4209?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15019097#comment-15019097 ]
Sangjin Lee commented on YARN-4209:
-----------------------------------
Does this issue exist in 2.6.x? Should this be backported to branch-2.6?
> RMStateStore FENCED state doesn’t work due to updateFencedState called by
> stateMachine.doTransition
> ---------------------------------------------------------------------------------------------------
>
> Key: YARN-4209
> URL: https://issues.apache.org/jira/browse/YARN-4209
> Project: Hadoop YARN
> Issue Type: Bug
> Components: resourcemanager
> Affects Versions: 2.7.2
> Reporter: zhihai xu
> Assignee: zhihai xu
> Priority: Critical
> Fix For: 2.7.2
>
> Attachments: YARN-4209.000.patch, YARN-4209.001.patch,
> YARN-4209.002.patch, YARN-4209.branch-2.7.patch
>
>
> The RMStateStore FENCED state doesn’t work because {{updateFencedState}} is
> called from within {{stateMachine.doTransition}}. The
> {{stateMachine.doTransition}} call made by {{updateFencedState}} is nested
> inside the {{stateMachine.doTransition}} call made by a public
> API (removeRMDelegationToken...) or by {{ForwardingEventHandler#handle}}. So
> right after the internal state transition from {{updateFencedState}} changes
> the state to FENCED, the external state transition changes the state
> back to ACTIVE. The end result is that RMStateStore is still in the ACTIVE
> state even after {{notifyStoreOperationFailed}} is called. The only case in
> which the FENCED state works is when {{notifyStoreOperationFailed}} is called
> from {{ZKRMStateStore#VerifyActiveStatusThread}}.
> For example: {{removeRMDelegationToken}} => {{handleStoreEvent}} => enter
> external {{stateMachine.doTransition}} => {{RemoveRMDTTransition}} =>
> {{notifyStoreOperationFailed}} => {{updateFencedState}} =>
> {{handleStoreEvent}} => enter internal {{stateMachine.doTransition}} =>
> exit internal {{stateMachine.doTransition}}, changing state to FENCED =>
> exit external {{stateMachine.doTransition}}, changing state back to ACTIVE.
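
The call chain described above can be reproduced with a toy model. This is only a sketch under stated assumptions, not the Hadoop code: ToyStateMachine, simulateFailedStoreOperation and simulateFenceFromOtherThread are hypothetical names, and the model assumes (as the description implies) that doTransition assigns the post-state only after the transition hook returns.

```java
import java.util.function.Supplier;

public class NestedTransitionDemo {
    public enum State { ACTIVE, FENCED }

    // Hypothetical stand-in for the real state machine. Assumption: the
    // current state is assigned only after the transition hook returns.
    static class ToyStateMachine {
        private State current = State.ACTIVE;

        State getCurrentState() {
            return current;
        }

        void doTransition(Supplier<State> hook) {
            State next = hook.get(); // the hook may re-enter doTransition
            current = next;          // this assignment runs last, so it wins
        }
    }

    // Models: external doTransition => store operation fails =>
    // internal doTransition sets FENCED => external exit sets ACTIVE.
    public static State simulateFailedStoreOperation() {
        ToyStateMachine sm = new ToyStateMachine();
        sm.doTransition(() -> {
            // Internal (nested) transition, as triggered by updateFencedState.
            sm.doTransition(() -> State.FENCED);
            // The external transition still reports ACTIVE as its post-state.
            return State.ACTIVE;
        });
        return sm.getCurrentState();
    }

    // Models a fence request that is not nested in another transition,
    // such as one issued from a separate monitoring thread.
    public static State simulateFenceFromOtherThread() {
        ToyStateMachine sm = new ToyStateMachine();
        sm.doTransition(() -> State.FENCED);
        return sm.getCurrentState();
    }

    public static void main(String[] args) {
        System.out.println("nested fence:    " + simulateFailedStoreOperation());
        System.out.println("top-level fence: " + simulateFenceFromOtherThread());
    }
}
```

In this model the nested fence is overwritten when the external transition exits, while the top-level fence sticks, which matches the observation that only the VerifyActiveStatusThread path reaches FENCED.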