Runaway thread

                 Key: ZOOKEEPER-865
             Project: Zookeeper
          Issue Type: Bug
    Affects Versions: 3.3.1, 3.3.0
         Environment: Linux; Java 1.6; x86;
            Reporter: Stephen McCants
            Priority: Critical

I'm starting a standalone ZooKeeper server (v3.3.1).  It starts normally and 
does not have a runaway thread.

Next, I start an Eclipse-based application that uses ZK 3.3.0 to register 
itself with the ZooKeeper server (3.3.1).  The Eclipse application uses the 
following arguments to Eclipse:


When the Eclipse application starts, the ZK server prints out:

2010-09-03 09:59:46,006 - INFO  
[NIOServerCxn.Factory:$fact...@250] - 
Accepted socket connection from /
2010-09-03 09:59:46,039 - INFO  
[NIOServerCxn.Factory:] - Client 
attempting to establish new session at /
2010-09-03 09:59:46,045 - INFO  [SyncThread:0:nioserverc...@1579] - Established 
session 0x12ad81b90000002 with negotiated timeout 4000 for client 
2010-09-03 09:59:46,046 - INFO  
[NIOServerCxn.Factory:$fact...@250] - 
Accepted socket connection from /
2010-09-03 09:59:46,078 - INFO  
[NIOServerCxn.Factory:] - Client 
attempting to establish new session at /
2010-09-03 09:59:46,080 - INFO  [SyncThread:0:nioserverc...@1579] - Established 
session 0x12ad81b90000003 with negotiated timeout 4000 for client 

Then both the Eclipse application and the ZK server go into runaway states and 
consume 100% of the CPU.

Here is a view from top:

4949 smccants  15   0  597m  78m 5964 S    66.2      1.0      1:03.14 
4876 smccants  17   0  554m  27m 6688 S    30.9       0.3     0:34.74 java

PID 4949 (autosubmitter) is the Eclipse application and is using more than 
twice the CPU of PID 4876 (java), which is the ZK server.  Both processes 
continue in this state indefinitely.
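
For anyone trying to reproduce this: top only shows per-process totals, so it 
does not say which thread is spinning.  Below is a minimal sketch, using only 
the standard java.lang.management API, that lists the threads of the JVM it 
runs in, sorted by accumulated CPU time.  The class name ThreadCpuReport and 
its Row type are hypothetical names chosen for illustration; they are not part 
of ZooKeeper or of the application described here.  It would have to be 
invoked from inside the affected process (for example from a debug hook), and 
it assumes the JVM supports per-thread CPU time measurement.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical helper: reports CPU time per thread for the current JVM.
public class ThreadCpuReport {

    /** One report row: a thread name and its accumulated CPU time. */
    public static final class Row implements Comparable<Row> {
        public final String name;
        public final long cpuNanos;

        Row(String name, long cpuNanos) {
            this.name = name;
            this.cpuNanos = cpuNanos;
        }

        // Sort descending, so the busiest thread comes first.
        public int compareTo(Row other) {
            if (cpuNanos == other.cpuNanos) {
                return 0;
            }
            return cpuNanos > other.cpuNanos ? -1 : 1;
        }
    }

    /** Returns live threads of this JVM, busiest first (empty if unsupported). */
    public static List<Row> snapshot() {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        List<Row> rows = new ArrayList<Row>();
        if (!mx.isThreadCpuTimeSupported() || !mx.isThreadCpuTimeEnabled()) {
            return rows;
        }
        for (long id : mx.getAllThreadIds()) {
            long cpu = mx.getThreadCpuTime(id);      // -1 if the thread has died
            ThreadInfo info = mx.getThreadInfo(id);  // null if the thread has died
            if (cpu >= 0 && info != null) {
                rows.add(new Row(info.getThreadName(), cpu));
            }
        }
        Collections.sort(rows);
        return rows;
    }

    public static void main(String[] args) {
        for (Row r : snapshot()) {
            System.out.printf("%10d ms  %s%n", r.cpuNanos / 1000000L, r.name);
        }
    }
}
```

Taking two snapshots a few seconds apart and comparing the values shows which 
thread is accumulating CPU time during the runaway period.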

I can attach a debugger to the Eclipse application, and if I stop the thread 
named "pool-1-thread-2-SendThread(", the 
runaway condition stops on both the application and the ZK server.  However, 
the ZK server then reports:

2010-09-03 10:03:38,001 - INFO  [SessionTracker:zookeeperser...@315] - Expiring 
session 0x12ad81b90000003, timeout of 4000ms exceeded
2010-09-03 10:03:38,002 - INFO  [ProcessThread:-1:preprequestproces...@208] - 
Processed session termination for sessionid: 0x12ad81b90000003
2010-09-03 10:03:38,005 - INFO  [SyncThread:0:nioserverc...@1434] - Closed 
socket connection for client / which had sessionid 

Here is the stack trace from the suspended thread:

EPollArrayWrapper.epollWait(long, int, long, int) line: not available [native 
EPollArrayWrapper.poll(long) line: 215  
EPollSelectorImpl.doSelect(long) line: 77       
EPollSelectorImpl(SelectorImpl).lockAndDoSelect(long) line: 69  
EPollSelectorImpl(SelectorImpl).select(long) line: 80   
ClientCnxn$ line: 1066  

Any ideas what might be going wrong?


