Runaway thread
--------------

Key: ZOOKEEPER-865
URL: https://issues.apache.org/jira/browse/ZOOKEEPER-865
Project: Zookeeper
Issue Type: Bug
Affects Versions: 3.3.1, 3.3.0
Environment: Linux; Java 1.6; x86;
Reporter: Stephen McCants
Priority: Critical

I'm starting a standalone ZooKeeper server (v3.3.1). It starts normally and does not have a runaway thread. Next, I start an Eclipse-based application that uses ZK 3.3.0 to register itself with the ZooKeeper server (3.3.1). The Eclipse application is started with the following arguments to Eclipse:

  -Dzoodiscovery.autoStart=true
  -Dzoodiscovery.flavor=zoodiscovery.flavor.centralized=smccants.austin.ibm.com

When the Eclipse application starts, the ZK server prints out:

  2010-09-03 09:59:46,006 - INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:nioservercnxn$fact...@250] - Accepted socket connection from /9.53.189.11:42271
  2010-09-03 09:59:46,039 - INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:nioserverc...@776] - Client attempting to establish new session at /9.53.189.11:42271
  2010-09-03 09:59:46,045 - INFO [SyncThread:0:nioserverc...@1579] - Established session 0x12ad81b90000002 with negotiated timeout 4000 for client /9.53.189.11:42271
  2010-09-03 09:59:46,046 - INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:nioservercnxn$fact...@250] - Accepted socket connection from /9.53.189.11:42272
  2010-09-03 09:59:46,078 - INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:nioserverc...@776] - Client attempting to establish new session at /9.53.189.11:42272
  2010-09-03 09:59:46,080 - INFO [SyncThread:0:nioserverc...@1579] - Established session 0x12ad81b90000003 with negotiated timeout 4000 for client /9.53.189.11:42272

Then both the Eclipse application and the ZK server go into runaway states and consume 100% of the CPU. Here is a view from top:

   PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
  4949 smccants  15   0  597m  78m 5964 S 66.2  1.0  1:03.14 autosubmitter
  4876 smccants  17   0  554m  27m 6688 S 30.9  0.3  0:34.74 java

PID 4949 (autosubmitter) is the Eclipse application and is using more than twice the CPU of PID 4876 (java), which is the ZK server. They will continue in this state indefinitely.

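For readers trying to reproduce this, one way to see which thread inside a JVM is burning CPU, without attaching a debugger, is to rank threads by accumulated CPU time using the standard java.lang.management API. The following is only a minimal sketch under assumptions: the class name BusyThreads and the method topThreadsByCpu are hypothetical and not part of ZooKeeper or of the application described here, and the code only sees threads of the JVM it runs in, so it would have to be invoked from inside the affected process. It is written to compile on Java 1.6.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;

// Hypothetical helper, not part of ZooKeeper: ranks the threads of the
// current JVM by the CPU time they have accumulated so far.
final class BusyThreads {

    /** Returns up to n lines of the form "<cpu ms> ms  <thread name>", busiest first. */
    static List<String> topThreadsByCpu(int n) {
        final ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        List<long[]> samples = new ArrayList<long[]>(); // each entry: {threadId, cpuNanos}
        if (mx.isThreadCpuTimeSupported()) {
            for (long id : mx.getAllThreadIds()) {
                // Returns -1 if the thread has died or CPU timing is disabled.
                long cpu = mx.getThreadCpuTime(id);
                if (cpu >= 0) {
                    samples.add(new long[] {id, cpu});
                }
            }
        }
        // Sort by CPU time, largest first.
        Collections.sort(samples, new Comparator<long[]>() {
            public int compare(long[] a, long[] b) {
                return a[1] < b[1] ? 1 : (a[1] > b[1] ? -1 : 0);
            }
        });
        List<String> out = new ArrayList<String>();
        for (long[] s : samples) {
            if (out.size() >= n) {
                break;
            }
            ThreadInfo info = mx.getThreadInfo(s[0]); // null if the thread has exited
            if (info != null) {
                out.add((s[1] / 1000000L) + " ms  " + info.getThreadName());
            }
        }
        return out;
    }

    public static void main(String[] args) {
        for (String line : topThreadsByCpu(5)) {
            System.out.println(line);
        }
    }
}
```

Calling topThreadsByCpu twice a few seconds apart and comparing the numbers shows which thread's CPU time is still growing; a thread stuck in a tight loop would be expected to stand out.
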
I can attach a debugger to the Eclipse application, and if I suspend the thread named "pool-1-thread-2-SendThread(smccants.austin.ibm.com:2181)", the runaway condition stops on both the application and the ZK server. However, the ZK server then reports:

  2010-09-03 10:03:38,001 - INFO [SessionTracker:zookeeperser...@315] - Expiring session 0x12ad81b90000003, timeout of 4000ms exceeded
  2010-09-03 10:03:38,002 - INFO [ProcessThread:-1:preprequestproces...@208] - Processed session termination for sessionid: 0x12ad81b90000003
  2010-09-03 10:03:38,005 - INFO [SyncThread:0:nioserverc...@1434] - Closed socket connection for client /9.53.189.11:42272 which had sessionid 0x12ad81b90000003

Here is the stack trace from the suspended thread:

  EPollArrayWrapper.epollWait(long, int, long, int) line: not available [native method]
  EPollArrayWrapper.poll(long) line: 215
  EPollSelectorImpl.doSelect(long) line: 77
  EPollSelectorImpl(SelectorImpl).lockAndDoSelect(long) line: 69
  EPollSelectorImpl(SelectorImpl).select(long) line: 80
  ClientCnxn$SendThread.run() line: 1066

Any ideas what might be going wrong?

Thanks.

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.