[ 
https://issues.apache.org/jira/browse/YARN-2917?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14233023#comment-14233023
 ] 

Rohith commented on YARN-2917:
------------------------------

Attaching the thread dump taken when the RM hung:
{code}
"Thread-1" prio=10 tid=0x00000000006e1000 nid=0x55a4 in Object.wait() 
[0x00007f2ce9493000]
   java.lang.Thread.State: TIMED_WAITING (on object monitor)
        at java.lang.Object.wait(Native Method)
        - waiting on <0x00000000f26b0d48> (a java.lang.Object)
        at org.apache.hadoop.yarn.event.AsyncDispatcher.serviceStop(AsyncDispatcher.java:141)
        - locked <0x00000000f26b0d48> (a java.lang.Object)
        at org.apache.hadoop.service.AbstractService.stop(AbstractService.java:221)
        - locked <0x00000000f26b0aa8> (a java.lang.Object)
        at org.apache.hadoop.yarn.nodelabels.CommonNodeLabelsManager.stopDispatcher(CommonNodeLabelsManager.java:232)
        at org.apache.hadoop.yarn.nodelabels.CommonNodeLabelsManager.serviceStop(CommonNodeLabelsManager.java:238)
        at org.apache.hadoop.service.AbstractService.stop(AbstractService.java:221)
        - locked <0x00000000f26b0968> (a java.lang.Object)
        at org.apache.hadoop.service.ServiceOperations.stop(ServiceOperations.java:52)
        at org.apache.hadoop.service.ServiceOperations.stopQuietly(ServiceOperations.java:80)
        at org.apache.hadoop.service.CompositeService.stop(CompositeService.java:157)
        at org.apache.hadoop.service.CompositeService.serviceStop(CompositeService.java:131)
        at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$RMActiveServices.serviceStop(ResourceManager.java:599)
        at org.apache.hadoop.service.AbstractService.stop(AbstractService.java:221)
        - locked <0x00000000f2842458> (a java.lang.Object)
        at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.stopActiveServices(ResourceManager.java:1002)
        at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.transitionToStandby(ResourceManager.java:1057)
        - locked <0x00000000c0c96c98> (a org.apache.hadoop.yarn.server.resourcemanager.ResourceManager)
        at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.serviceStop(ResourceManager.java:1104)
        at org.apache.hadoop.service.AbstractService.stop(AbstractService.java:221)
        - locked <0x00000000c0cab280> (a java.lang.Object)
        at org.apache.hadoop.service.ServiceOperations.stop(ServiceOperations.java:52)
        at org.apache.hadoop.service.ServiceOperations.stopQuietly(ServiceOperations.java:80)
        at org.apache.hadoop.service.ServiceOperations.stopQuietly(ServiceOperations.java:65)
        at org.apache.hadoop.service.CompositeService$CompositeServiceShutdownHook.run(CompositeService.java:183)
        at org.apache.hadoop.util.ShutdownHookManager$1.run(ShutdownHookManager.java:54)

"AsyncDispatcher event handler" daemon prio=10 tid=0x00007f2cf0b81000 
nid=0x54a1 in Object.wait() [0x00007f2cf7bfa000]
   java.lang.Thread.State: WAITING (on object monitor)
        at java.lang.Object.wait(Native Method)
        - waiting on <0x00000000c01b83e8> (a org.apache.hadoop.util.ShutdownHookManager$1)
        at java.lang.Thread.join(Thread.java:1281)
        - locked <0x00000000c01b83e8> (a org.apache.hadoop.util.ShutdownHookManager$1)
        at java.lang.Thread.join(Thread.java:1355)
        at java.lang.ApplicationShutdownHooks.runHooks(ApplicationShutdownHooks.java:106)
        at java.lang.ApplicationShutdownHooks$1.run(ApplicationShutdownHooks.java:46)
        at java.lang.Shutdown.runHooks(Shutdown.java:123)
        at java.lang.Shutdown.sequence(Shutdown.java:167)
        at java.lang.Shutdown.exit(Shutdown.java:212)
        - locked <0x00000000c04ae9c0> (a java.lang.Class for java.lang.Shutdown)
        at java.lang.Runtime.exit(Runtime.java:109)
        at java.lang.System.exit(System.java:962)
        at org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:185)
        at org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:106)
        at java.lang.Thread.run(Thread.java:745)
{code}
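
Reading the two stacks together, the hang looks like a circular wait: the "AsyncDispatcher event handler" thread calls System.exit() from dispatch() and then blocks in Thread.join() on the shutdown hook thread, while the hook thread ("Thread-1") sits in AsyncDispatcher.serviceStop() waiting for the dispatcher to drain, which presumably needs that same handler thread to make progress. The sketch below is only a simplified model of that pattern, not Hadoop code. The class DrainHangSketch, the method shutdownCompletes, and the drained flag are hypothetical names introduced for illustration.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical model of the circular wait seen in the dump; not Hadoop code.
class DrainHangSketch {

    // Returns true if the simulated stop finished within timeoutMs.
    // exitFromDispatcher = true mimics the dump: the event handler thread
    // itself triggers the shutdown and then joins the hook thread.
    static boolean shutdownCompletes(boolean exitFromDispatcher, long timeoutMs)
            throws InterruptedException {
        final AtomicBoolean drained = new AtomicBoolean(false);
        final CountDownLatch hookDone = new CountDownLatch(1);

        // Stand-in for the shutdown hook that ends up in serviceStop():
        // it polls until the dispatcher reports that it has drained.
        final Thread hook = new Thread(() -> {
            try {
                while (!drained.get()) {
                    Thread.sleep(10); // "Waiting for AsyncDispatcher to drain."
                }
                hookDone.countDown();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // cleanup by the caller
            }
        }, "hook");
        hook.setDaemon(true);

        // Stand-in for the event handler thread.
        final Thread dispatcher = new Thread(() -> {
            try {
                if (exitFromDispatcher) {
                    // Like System.exit(): run the hook and wait for it
                    // before doing anything else.
                    hook.start();
                    hook.join();
                }
                // Only reached once the thread is free to finish its work.
                drained.set(true);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // cleanup by the caller
            }
        }, "dispatcher");
        dispatcher.setDaemon(true);
        dispatcher.start();

        if (!exitFromDispatcher) {
            hook.start(); // shutdown requested from some other thread
        }

        boolean finished = hookDone.await(timeoutMs, TimeUnit.MILLISECONDS);
        // Break the cycle so the sketch itself never hangs.
        hook.interrupt();
        dispatcher.interrupt();
        return finished;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("exit from dispatcher thread completes: "
                + shutdownCompletes(true, 300));
        System.out.println("stop from another thread completes: "
                + shutdownCompletes(false, 2000));
    }
}
```

In this model the first call should time out, because each thread is waiting on the other, while the second call finishes once the dispatcher thread is free to mark itself drained. Whether the real fix belongs in AsyncDispatcher or in the caller is not something this sketch tries to settle.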

> RM get hanged if fail to store NodeLabels into store.
> -----------------------------------------------------
>
>                 Key: YARN-2917
>                 URL: https://issues.apache.org/jira/browse/YARN-2917
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: resourcemanager
>            Reporter: Rohith
>            Assignee: Rohith
>            Priority: Critical
>
> I encountered a scenario where the RM hung while shutting down and kept logging
> {{2014-12-03 19:32:44,283 INFO org.apache.hadoop.yarn.event.AsyncDispatcher: 
> Waiting for AsyncDispatcher to drain.}}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
