[ 
https://issues.apache.org/jira/browse/YARN-1929?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13966538#comment-13966538
 ] 

Rohith commented on YARN-1929:
------------------------------

Complete stack trace of the deadlock:
{noformat}
Found one Java-level deadlock:
=============================
"Thread-2":
  waiting to lock monitor 0x00007fb514303cf0 (object 0x00000000ef153fd0, a org.apache.hadoop.ha.ActiveStandbyElector),
  which is held by "main-EventThread"
"main-EventThread":
  waiting to lock monitor 0x00007fb514750a48 (object 0x00000000ef154020, a org.apache.hadoop.yarn.server.resourcemanager.EmbeddedElectorService),
  which is held by "Thread-2"

Java stack information for the threads listed above:
===================================================
"Thread-2":
        at org.apache.hadoop.ha.ActiveStandbyElector.quitElection(ActiveStandbyElector.java:353)
        - waiting to lock <0x00000000ef153fd0> (a org.apache.hadoop.ha.ActiveStandbyElector)
        at org.apache.hadoop.yarn.server.resourcemanager.EmbeddedElectorService.serviceStop(EmbeddedElectorService.java:108)
        - locked <0x00000000ef154020> (a org.apache.hadoop.yarn.server.resourcemanager.EmbeddedElectorService)
        at org.apache.hadoop.service.AbstractService.stop(AbstractService.java:221)
        - locked <0x00000000ef154068> (a java.lang.Object)
        at org.apache.hadoop.service.ServiceOperations.stop(ServiceOperations.java:52)
        at org.apache.hadoop.service.ServiceOperations.stopQuietly(ServiceOperations.java:80)
        at org.apache.hadoop.service.CompositeService.stop(CompositeService.java:158)
        - locked <0x00000000ef154090> (a org.apache.hadoop.yarn.server.resourcemanager.AdminService)
        at org.apache.hadoop.service.CompositeService.serviceStop(CompositeService.java:131)
        at org.apache.hadoop.yarn.server.resourcemanager.AdminService.serviceStop(AdminService.java:134)
        - locked <0x00000000ef154090> (a org.apache.hadoop.yarn.server.resourcemanager.AdminService)
        at org.apache.hadoop.service.AbstractService.stop(AbstractService.java:221)
        - locked <0x00000000ef154108> (a java.lang.Object)
        at org.apache.hadoop.service.ServiceOperations.stop(ServiceOperations.java:52)
        at org.apache.hadoop.service.ServiceOperations.stopQuietly(ServiceOperations.java:80)
        at org.apache.hadoop.service.CompositeService.stop(CompositeService.java:158)
        - locked <0x00000000ef154118> (a org.apache.hadoop.yarn.server.resourcemanager.ResourceManager)
        at org.apache.hadoop.service.CompositeService.serviceStop(CompositeService.java:131)
        at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.serviceStop(ResourceManager.java:947)
        at org.apache.hadoop.service.AbstractService.stop(AbstractService.java:221)
        - locked <0x00000000ef1541c0> (a java.lang.Object)
        at org.apache.hadoop.service.ServiceOperations.stop(ServiceOperations.java:52)
        at org.apache.hadoop.service.ServiceOperations.stopQuietly(ServiceOperations.java:80)
        at org.apache.hadoop.service.ServiceOperations.stopQuietly(ServiceOperations.java:65)
        at org.apache.hadoop.service.CompositeService$CompositeServiceShutdownHook.run(CompositeService.java:184)
        at org.apache.hadoop.util.ShutdownHookManager$1.run(ShutdownHookManager.java:54)
"main-EventThread":
        at org.apache.hadoop.yarn.server.resourcemanager.EmbeddedElectorService.becomeActive(EmbeddedElectorService.java:116)
        - waiting to lock <0x00000000ef154020> (a org.apache.hadoop.yarn.server.resourcemanager.EmbeddedElectorService)
        at org.apache.hadoop.ha.ActiveStandbyElector.becomeActive(ActiveStandbyElector.java:804)
        at org.apache.hadoop.ha.ActiveStandbyElector.processResult(ActiveStandbyElector.java:480)
        - locked <0x00000000ef153fd0> (a org.apache.hadoop.ha.ActiveStandbyElector)
        at org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:543)
        at org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:495)

Found 1 deadlock.
{noformat}
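
The trace shows a lock-order inversion: "Thread-2" (the shutdown hook) holds the EmbeddedElectorService monitor in serviceStop() and waits for the ActiveStandbyElector monitor in quitElection(), while "main-EventThread" holds the ActiveStandbyElector monitor in processResult() and waits for the EmbeddedElectorService monitor in becomeActive(). The following is only a minimal sketch of that pattern, not Hadoop code. The class name, the two lock objects, and the thread names are hypothetical stand-ins, and the latch exists only to make the interleaving deterministic.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadMXBean;
import java.util.concurrent.CountDownLatch;

public class LockOrderInversionSketch {
    // Hypothetical stand-ins for the two monitors named in the trace.
    private static final Object elector = new Object();        // plays ActiveStandbyElector
    private static final Object electorService = new Object(); // plays EmbeddedElectorService

    /**
     * Starts two daemon threads that take the two locks in opposite order and
     * returns how many monitor-deadlocked threads the JVM reports (0 on timeout).
     */
    public static int reproduce() throws InterruptedException {
        CountDownLatch bothHoldFirstLock = new CountDownLatch(2);

        Thread stopper = new Thread(() -> {
            synchronized (electorService) {       // like serviceStop()
                awaitOther(bothHoldFirstLock);
                synchronized (elector) { }        // like quitElection()
            }
        }, "stopper");

        Thread callback = new Thread(() -> {
            synchronized (elector) {              // like processResult()
                awaitOther(bothHoldFirstLock);
                synchronized (electorService) { } // like becomeActive()
            }
        }, "callback");

        // Daemon threads so the stuck threads do not keep the JVM alive.
        stopper.setDaemon(true);
        callback.setDaemon(true);
        stopper.start();
        callback.start();

        // Poll the JVM's own deadlock detector for up to about 5 seconds.
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        for (int i = 0; i < 100; i++) {
            long[] ids = mx.findMonitorDeadlockedThreads();
            if (ids != null) {
                return ids.length;
            }
            Thread.sleep(50);
        }
        return 0;
    }

    // Signals that this thread holds its first lock, then waits for the other thread.
    private static void awaitOther(CountDownLatch latch) {
        latch.countDown();
        try {
            latch.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("deadlocked threads: " + reproduce());
    }
}
```

In this sketch, making both paths acquire the two locks in the same order, or having the stop path release its own monitor before calling into the other object, would break the cycle. Whether either approach suits the ResourceManager code is not something the trace alone settles.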

> DeadLock in RM when automatic failover is enabled.
> --------------------------------------------------
>
>                 Key: YARN-1929
>                 URL: https://issues.apache.org/jira/browse/YARN-1929
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: resourcemanager
>         Environment: Yarn HA cluster
>            Reporter: Rohith
>            Priority: Critical
>
> Deadlock detected in RM when automatic failover is enabled.
> {noformat}
> Found one Java-level deadlock:
> =============================
> "Thread-2":
>   waiting to lock monitor 0x00007fb514303cf0 (object 0x00000000ef153fd0, a org.apache.hadoop.ha.ActiveStandbyElector),
>   which is held by "main-EventThread"
> "main-EventThread":
>   waiting to lock monitor 0x00007fb514750a48 (object 0x00000000ef154020, a org.apache.hadoop.yarn.server.resourcemanager.EmbeddedElectorService),
>   which is held by "Thread-2"
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.2#6252)
