Please see: https://issues.apache.org/jira/browse/ZOOKEEPER-846
On Thu, Aug 12, 2010 at 10:00 AM, Patrick Hunt <ph...@apache.org> wrote:

> Great bug report Ted, the stack trace in particular is very useful.
>
> It looks like a timing bug where the client is not shutting down cleanly on
> the close call. I reviewed the code in question but nothing pops out at me.
> Also the logs just show us shutting down, nothing else from zk in there.
>
> Create a jira and attach all the detail you have available.
>
> Patrick
>
>
> On 08/11/2010 03:21 PM, Ted Yu wrote:
>
>> Hi,
>> Using HBase 0.20.6 (with HBASE-2473) we encountered a situation where
>> Regionserver process was shutting down and seemed to hang.
>>
>> Here is the bottom of region server log:
>> http://pastebin.com/YYawJ4jA
>>
>> zookeeper-3.2.2 is used.
>>
>> Your comment is welcome.
>>
>> Here is relevant portion from jstack - I attempted to attach jstack twice
>> in my email to d...@hbase.apache.org but failed:
>>
>> "DestroyJavaVM" prio=10 tid=0x00002aabb849c800 nid=0x6c60 waiting on condition [0x0000000000000000]
>>    java.lang.Thread.State: RUNNABLE
>>
>> "regionserver/10.32.42.245:60020" prio=10 tid=0x00002aabb84ce000 nid=0x6c81 in Object.wait() [0x0000000043755000]
>>    java.lang.Thread.State: WAITING (on object monitor)
>>         at java.lang.Object.wait(Native Method)
>>         - waiting on <0x00002aaab76633c0> (a org.apache.zookeeper.ClientCnxn$Packet)
>>         at java.lang.Object.wait(Object.java:485)
>>         at org.apache.zookeeper.ClientCnxn.submitRequest(ClientCnxn.java:1099)
>>         - locked <0x00002aaab76633c0> (a org.apache.zookeeper.ClientCnxn$Packet)
>>         at org.apache.zookeeper.ClientCnxn.close(ClientCnxn.java:1077)
>>         at org.apache.zookeeper.ZooKeeper.close(ZooKeeper.java:505)
>>         - locked <0x00002aaabf5e0c30> (a org.apache.zookeeper.ZooKeeper)
>>         at org.apache.hadoop.hbase.zookeeper.ZooKeeperWrapper.close(ZooKeeperWrapper.java:681)
>>         at org.apache.hadoop.hbase.regionserver.HRegionServer.run(HRegionServer.java:654)
>>         at java.lang.Thread.run(Thread.java:619)
>>
>> "main-EventThread" daemon prio=10 tid=0x0000000043474000 nid=0x6c80 waiting on condition [0x00000000413f3000]
>>    java.lang.Thread.State: WAITING (parking)
>>         at sun.misc.Unsafe.park(Native Method)
>>         - parking to wait for <0x00002aaabf6e9150> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
>>         at java.util.concurrent.locks.LockSupport.park(LockSupport.java:158)
>>         at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:1987)
>>         at java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:399)
>>         at org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:414)
>>
>> "RMI TCP Accept-0" daemon prio=10 tid=0x00002aabb822c800 nid=0x6c7d runnable [0x0000000040752000]
>>    java.lang.Thread.State: RUNNABLE
>>         at java.net.PlainSocketImpl.socketAccept(Native Method)
>>         at java.net.PlainSocketImpl.accept(PlainSocketImpl.java:390)
>>         - locked <0x00002aaabf585578> (a java.net.SocksSocketImpl)
>>         at java.net.ServerSocket.implAccept(ServerSocket.java:453)
>>         at java.net.ServerSocket.accept(ServerSocket.java:421)
>>         at sun.management.jmxremote.LocalRMIServerSocketFactory$1.accept(LocalRMIServerSocketFactory.java:34)
>>         at sun.rmi.transport.tcp.TCPTransport$AcceptLoop.executeAcceptLoop(TCPTransport.java:369)
>>         at sun.rmi.transport.tcp.TCPTransport$AcceptLoop.run(TCPTransport.java:341)
>>         at java.lang.Thread.run(Thread.java:619)