zhihai xu commented on YARN-4209:

Hi [~rohithsharma], I uploaded a new patch, YARN-4209.001.patch, which uses 
MultipleArcTransition. It adds a private function, 
{{notifyStoreOperationFailedInternal}}, so {{notifyStoreOperationFailed}} is now 
called only by {{ZKRMStateStore#VerifyActiveStatusThread}}.
I therefore acquire {{writeLock}} and check {{isFencedState}} in 
{{notifyStoreOperationFailed}} to make sure {{handleTransitionToStandBy}} is 
called only once. Please review it, thanks.
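
To illustrate the idea, here is a minimal, self-contained sketch. It is not the 
code in the patch: the class {{FencedStoreSketch}}, the method 
{{storeTransition}}, the {{standbyCalls}} counter and all method bodies are 
hypothetical. Only the names {{notifyStoreOperationFailed}}, 
{{notifyStoreOperationFailedInternal}}, {{isFencedState}}, {{writeLock}} and 
{{handleTransitionToStandBy}} are taken from the description above. It assumes 
the transition hook chooses the post state itself (multiple-arc style) and the 
external entry point takes the write lock and checks for FENCED first.

```java
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical stand-in for the state store; not the real RMStateStore.
class FencedStoreSketch {
    enum State { ACTIVE, FENCED }

    private final ReentrantReadWriteLock.WriteLock writeLock =
        new ReentrantReadWriteLock().writeLock();
    private State state = State.ACTIVE;
    private int standbyCalls = 0; // counts handleTransitionToStandBy invocations

    // Multiple-arc style: the transition decides the post state itself,
    // so no outer transition can overwrite FENCED afterwards.
    public State storeTransition(boolean storeOpFails) {
        writeLock.lock();
        try {
            if (state == State.FENCED) {
                return state; // already fenced, ignore further store events
            }
            if (storeOpFails) {
                state = notifyStoreOperationFailedInternal();
            }
            return state;
        } finally {
            writeLock.unlock();
        }
    }

    // Called only from inside a transition; the caller already holds the lock.
    private State notifyStoreOperationFailedInternal() {
        handleTransitionToStandBy();
        return State.FENCED;
    }

    // Entry point for an outside caller such as a verification thread.
    public void notifyStoreOperationFailed() {
        writeLock.lock();
        try {
            if (isFencedState()) {
                return; // standby was already requested once
            }
            state = notifyStoreOperationFailedInternal();
        } finally {
            writeLock.unlock();
        }
    }

    public boolean isFencedState() { return state == State.FENCED; }
    public State getState() { return state; }
    public int getStandbyCalls() { return standbyCalls; }

    private void handleTransitionToStandBy() { standbyCalls++; }

    public static void main(String[] args) {
        FencedStoreSketch store = new FencedStoreSketch();
        store.storeTransition(true);        // failure seen inside a transition
        store.notifyStoreOperationFailed(); // later report from another thread
        System.out.println(store.getState() + " " + store.getStandbyCalls());
    }
}
```

In this sketch the second failure report returns early, so the standby 
handler runs once and the store stays in FENCED.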

> RMStateStore FENCED state doesn’t work due to updateFencedState called by 
> stateMachine.doTransition
> ---------------------------------------------------------------------------------------------------
>                 Key: YARN-4209
>                 URL: https://issues.apache.org/jira/browse/YARN-4209
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: resourcemanager
>    Affects Versions: 2.7.2
>            Reporter: zhihai xu
>            Assignee: zhihai xu
>            Priority: Critical
>         Attachments: YARN-4209.000.patch, YARN-4209.001.patch
> RMStateStore FENCED state doesn’t work because {{updateFencedState}} is 
> called from within {{stateMachine.doTransition}}. The reason is that the 
> {{stateMachine.doTransition}} call made by {{updateFencedState}} is nested 
> inside the {{stateMachine.doTransition}} call made by a public 
> API (removeRMDelegationToken...) or by {{ForwardingEventHandler#handle}}. So 
> right after the internal state transition from {{updateFencedState}} changes 
> the state to FENCED, the external state transition changes the state 
> back to ACTIVE. The end result is that RMStateStore is still in the ACTIVE 
> state even after {{notifyStoreOperationFailed}} is called. The only case in 
> which the FENCED state works is when {{notifyStoreOperationFailed}} is called 
> from {{ZKRMStateStore#VerifyActiveStatusThread}}.
> For example: {{removeRMDelegationToken}} => {{handleStoreEvent}} => enter 
> external {{stateMachine.doTransition}} => {{RemoveRMDTTransition}} => 
> {{notifyStoreOperationFailed}} => {{updateFencedState}} => 
> {{handleStoreEvent}} => enter internal {{stateMachine.doTransition}} => exit 
> internal {{stateMachine.doTransition}} change state to FENCED => exit 
> external {{stateMachine.doTransition}} change state to ACTIVE.
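
The overwrite described in the issue can be reproduced with a toy state 
machine. This is only a sketch under assumptions: {{ToyStateMachine}} and 
{{NestedTransitionDemo}} are hypothetical and are not the YARN state machine. 
The sketch assumes a single-arc transition, where the hook runs first and the 
pre-declared post state is assigned afterwards.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo of a nested transition being overwritten by the outer one.
class NestedTransitionDemo {
    enum State { ACTIVE, FENCED }

    // Toy single-arc machine: run the hook, then assign the fixed post state.
    static class ToyStateMachine {
        State current = State.ACTIVE;
        final List<String> log = new ArrayList<>();

        void doTransition(String label, Runnable hook, State postState) {
            log.add("enter " + label);
            hook.run();           // the hook may itself call doTransition
            current = postState;  // assigned last, so the outer call wins
            log.add("exit " + label + " -> " + postState);
        }
    }

    // Runs an external transition whose hook fences the store internally.
    static ToyStateMachine runNested() {
        ToyStateMachine sm = new ToyStateMachine();
        sm.doTransition(
            "external",
            () -> sm.doTransition("internal", () -> { }, State.FENCED),
            State.ACTIVE);
        return sm;
    }

    public static void main(String[] args) {
        ToyStateMachine sm = runNested();
        for (String line : sm.log) {
            System.out.println(line);
        }
        System.out.println("final state: " + sm.current);
    }
}
```

The internal call sets FENCED, but the external call assigns its own post 
state on exit, so the toy machine ends in ACTIVE.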
