[ 
https://issues.apache.org/jira/browse/YARN-4209?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zhihai xu updated YARN-4209:
----------------------------
    Description: 
The RMStateStore FENCED state doesn’t work when {{updateFencedState}} is 
called from within {{stateMachine.doTransition}}. The reason is that the 
{{stateMachine.doTransition}} call made by {{updateFencedState}} is nested 
inside the {{stateMachine.doTransition}} call made by a public API 
(removeRMDelegationToken...) or by {{ForwardingEventHandler#handle}}. So right 
after the internal state transition from {{updateFencedState}} changes the 
state to FENCED, the external state transition changes the state back to 
ACTIVE. The end result is that RMStateStore is still in the ACTIVE state even 
after {{notifyStoreOperationFailed}} is called. The only case in which the 
FENCED state works is when {{notifyStoreOperationFailed}} is called from 
{{ZKRMStateStore#VerifyActiveStatusThread}}.
For example: {{removeRMDelegationToken}} => {{handleStoreEvent}} => enter 
external {{stateMachine.doTransition}} => {{RemoveRMDTTransition}} => 
{{notifyStoreOperationFailed}} => {{updateFencedState}} => 
{{handleStoreEvent}} => enter internal {{stateMachine.doTransition}} => exit 
internal {{stateMachine.doTransition}}, changing the state to FENCED => exit 
external {{stateMachine.doTransition}}, changing the state back to ACTIVE.


  was:
RMStateStore FENCED state doesn’t work due to {{updateFencedState}} called by 
{{stateMachine.doTransition}}. The reason is
{{stateMachine.doTransition}} called from {{updateFencedState}} is embedded in 
{{stateMachine.doTransition}} called from public 
API(removeRMDelegationToken...) or {{ForwardingEventHandler#handle}}. So right 
after the internal state transition from {{updateFencedState}} changes the 
state to FENCED state, the external state transition changes the state back to 
ACTIVE state. The end result is that RMStateStore is still in ACTIVE state even 
{{notifyStoreOperationFailed}} is called. The only working case for FENCED 
state is {{notifyStoreOperationFailed}} called from 
{{ZKRMStateStore#VerifyActiveStatusThread}}.
For example: {{removeRMDelegationToken}} => {{handleStoreEvent}} => enter 
external {{stateMachine.doTransition}} => {{RemoveRMDTTransition}} => 
{{notifyStoreOperationFailed}} =>{{updateFencedState}}=>{{handleStoreEvent}}=> 
enter internal {{stateMachine.doTransition}} => exit internal 
{{stateMachine.doTransition}} change state to FENCED => exit external 
{{stateMachine.doTransition}} change state to ACTIVE.



> RMStateStore FENCED state doesn’t work due to updateFencedState called by 
> stateMachine.doTransition
> ---------------------------------------------------------------------------------------------------
>
>                 Key: YARN-4209
>                 URL: https://issues.apache.org/jira/browse/YARN-4209
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: resourcemanager
>    Affects Versions: 2.7.2
>            Reporter: zhihai xu
>            Assignee: zhihai xu
>            Priority: Critical
>         Attachments: YARN-4209.000.patch
>
>
> The RMStateStore FENCED state doesn’t work when {{updateFencedState}} is 
> called from within {{stateMachine.doTransition}}. The reason is that the 
> {{stateMachine.doTransition}} call made by {{updateFencedState}} is nested 
> inside the {{stateMachine.doTransition}} call made by a public API 
> (removeRMDelegationToken...) or by {{ForwardingEventHandler#handle}}. So 
> right after the internal state transition from {{updateFencedState}} changes 
> the state to FENCED, the external state transition changes the state back to 
> ACTIVE. The end result is that RMStateStore is still in the ACTIVE state 
> even after {{notifyStoreOperationFailed}} is called. The only case in which 
> the FENCED state works is when {{notifyStoreOperationFailed}} is called from 
> {{ZKRMStateStore#VerifyActiveStatusThread}}.
> For example: {{removeRMDelegationToken}} => {{handleStoreEvent}} => enter 
> external {{stateMachine.doTransition}} => {{RemoveRMDTTransition}} => 
> {{notifyStoreOperationFailed}} => {{updateFencedState}} => 
> {{handleStoreEvent}} => enter internal {{stateMachine.doTransition}} => exit 
> internal {{stateMachine.doTransition}}, changing the state to FENCED => exit 
> external {{stateMachine.doTransition}}, changing the state back to ACTIVE.
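
The nesting described in the issue can be reproduced with a small toy model. This is only a sketch under stated assumptions: {{ToyStateMachine}}, {{runNested}} and {{runDirect}} are hypothetical names, and the class is not the Hadoop state machine API. It assumes only that a transition runs its hook first and then stores a fixed post-state, which is the behavior the description implies.

```java
public class NestedTransitionDemo {
    enum State { ACTIVE, FENCED }

    // Hypothetical stand-in for a state machine; NOT the Hadoop implementation.
    static class ToyStateMachine {
        private State current = State.ACTIVE;

        // Runs the transition hook, then stores the declared post-state.
        // The assignment happens only after the hook returns, so any state
        // set by a nested doTransition inside the hook is overwritten.
        void doTransition(Runnable hook, State postState) {
            hook.run();
            current = postState;
        }

        State getCurrentState() {
            return current;
        }
    }

    // Models the failing path: the fencing transition is triggered from
    // inside the hook of an outer ACTIVE -> ACTIVE transition.
    static State runNested() {
        ToyStateMachine sm = new ToyStateMachine();
        sm.doTransition(() -> {
            // Store operation fails; the nested transition sets FENCED.
            sm.doTransition(() -> { }, State.FENCED);
        }, State.ACTIVE); // The outer exit then resets the state to ACTIVE.
        return sm.getCurrentState();
    }

    // Models a fencing transition that is not nested in another transition.
    static State runDirect() {
        ToyStateMachine sm = new ToyStateMachine();
        sm.doTransition(() -> { }, State.FENCED);
        return sm.getCurrentState();
    }

    public static void main(String[] args) {
        System.out.println("nested: " + runNested());
        System.out.println("direct: " + runDirect());
    }
}
```

In this toy model the nested path ends in ACTIVE and the direct path ends in FENCED. The direct path presumably corresponds to the {{ZKRMStateStore#VerifyActiveStatusThread}} case, where the fencing call is not wrapped in another transition.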



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
