[ 
https://issues.apache.org/jira/browse/YARN-2946?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14256618#comment-14256618
 ] 

Rohith commented on YARN-2946:
------------------------------

Since handleStoreEvent() is called from the event dispatcher for RMApp store 
events and synchronously for the DT store, the TestRMRestart override of 
handleStoreEvent(), which simulates a test scenario, was causing the startup failure.
I will correct the test case and update the patch.

> DeadLocks in RMStateStore<->ZKRMStateStore
> ------------------------------------------
>
>                 Key: YARN-2946
>                 URL: https://issues.apache.org/jira/browse/YARN-2946
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: resourcemanager
>    Affects Versions: 2.7.0
>            Reporter: Rohith
>            Assignee: Rohith
>            Priority: Blocker
>         Attachments: 0001-YARN-2946.patch, 0001-YARN-2946.patch, 
> 0002-YARN-2946.patch, 0003-YARN-2946.patch, 
> RM_BeforeFix_Deadlock_cycle_1.png, RM_BeforeFix_Deadlock_cycle_2.png, 
> TestYARN2946.java
>
>
> Found one deadlock in ZKRMStateStore.
> # Initially, zkClient is null because of a ZooKeeper Disconnected event.
> # While ZKRMStateStore#runWithCheck() is in wait(zkSessionTimeout), waiting for 
> zkClient to re-establish the ZooKeeper connection via either a SyncConnected 
> or an Expired event, it is quite possible for another thread to obtain the 
> lock on {{ZKRMStateStore.this}} from state machine transition events. This 
> causes a deadlock in ZKRMStateStore.
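
The second step above relies on a general property of Java monitors: Object.wait() releases the monitor it is called on, so another thread can enter a synchronized block on the same object while the first thread is still waiting. The following is a minimal sketch of that property only, not the ZKRMStateStore code. The class name, method name, and the plain Object used as a stand-in for {{ZKRMStateStore.this}} are all hypothetical.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

// Hypothetical illustration; none of these names come from Hadoop.
final class WaitReleasesMonitorSketch {

    /**
     * Returns true if a second thread managed to enter synchronized (monitor)
     * while the calling thread was inside monitor.wait(...).
     */
    static boolean otherThreadEnteredDuringWait(long waitMillis)
            throws InterruptedException {
        final Object monitor = new Object(); // stand-in for ZKRMStateStore.this
        final CountDownLatch waiterHoldsMonitor = new CountDownLatch(1);
        final boolean[] entered = {false};   // only accessed while holding monitor

        Thread other = new Thread(() -> {
            try {
                // Do not compete for the monitor until the waiter already owns it.
                waiterHoldsMonitor.await();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return;
            }
            // This block can only be entered once the waiter gives up the monitor.
            synchronized (monitor) {
                entered[0] = true;
                monitor.notifyAll();
            }
        });
        other.start();

        boolean result;
        synchronized (monitor) {
            waiterHoldsMonitor.countDown();
            long deadline = System.nanoTime()
                    + TimeUnit.MILLISECONDS.toNanos(waitMillis);
            while (!entered[0]) {
                long remaining =
                        TimeUnit.NANOSECONDS.toMillis(deadline - System.nanoTime());
                if (remaining <= 0) {
                    break; // timed out without the other thread getting in
                }
                // wait() releases the monitor for the duration of the wait.
                monitor.wait(remaining);
            }
            // Read while still holding the monitor: the flag can only have
            // been set during the wait above.
            result = entered[0];
        }
        other.join();
        return result;
    }
}
```

Calling WaitReleasesMonitorSketch.otherThreadEnteredDuringWait(2000) is expected to return true. If the thread that gets in during the wait then blocks on a second lock that the waiting thread already holds, the waiter can never re-acquire the monitor and both threads are stuck, which is the kind of cycle the description reports.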



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
