2008/7/23 James Strachan <[EMAIL PROTECTED]>:
> 2008/7/23 Benjamin Reed <[EMAIL PROTECTED]>:
>> SessionExpiredExceptions should be extremely rare. Basically they should only
>> happen if a machine goes down (of course that would mean no exception would
>> actually get generated since the client is dead :) or a network partition
>> occurs.
>>
>> Having said that we seem to have a bug that causes SessionExpiredExceptions
>> when nothing bad has happened. The bug must be in the heartbeat code (we do
>> them automatically, so the client shouldn't have to worry about it). If you
>> can reproduce it well, it would greatly help to track down the bug! Can you
>> send me the code to reproduce the problem?
>
> It's the test case WriteLockTest in the patch for ZOOKEEPER-78 which is
> currently dependent on the ZOOKEEPER-84 patch as well (though given
> your recent comment I'm gonna refactor the code to not require a
> ZooKeeper change :)
>
> I'll ping the list when I've refactored the test case to not require
> the ZOOKEEPER-84 change.

I've just updated the patch on ZOOKEEPER-78 to avoid the dependency on
ZOOKEEPER-84. It now uses a ZooKeeperFacade class which wraps up the
creation of the ZooKeeper - and recreation of it if a
SessionExpiredException is received.
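
Roughly, the facade idea looks like the sketch below. This is a simplified
illustration, not the code in the patch: ReconnectingFacade and
SessionExpiredStandIn are hypothetical names, the client type is left
generic, and the stand-in exception replaces the real session expired
exception so the sketch compiles without ZooKeeper on the classpath.

```java
import java.util.function.Function;
import java.util.function.Supplier;

/** Hypothetical stand-in for the session-expired error, so the sketch compiles alone. */
class SessionExpiredStandIn extends RuntimeException {}

/** Sketch of a facade that owns client creation and recreates the client after expiry. */
final class ReconnectingFacade<C> {
    private final Supplier<C> factory; // knows how to build a fresh client
    private C client;

    ReconnectingFacade(Supplier<C> factory) {
        this.factory = factory;
        this.client = factory.get();
    }

    /** Runs the operation once; on session expiry builds a new client and retries one time. */
    synchronized <R> R call(Function<C, R> operation) {
        try {
            return operation.apply(client);
        } catch (SessionExpiredStandIn e) {
            client = factory.get(); // the old session is gone, so start a new one
            return operation.apply(client);
        }
    }
}
```

Callers never hold the client directly, so a recreated client is picked up
on the next call. Retrying only once is an arbitrary choice for the sketch.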

The test case currently hangs there...

    [junit] "main" prio=5 tid=0x01001710 nid=0xb0801000 in Object.wait() [0xb07ff000..0xb0800148]
    [junit]     at java.lang.Object.wait(Native Method)
    [junit]     - waiting on <0x096105e0> (a org.apache.zookeeper.ClientCnxn$Packet)
    [junit]     at java.lang.Object.wait(Object.java:474)
    [junit]     at org.apache.zookeeper.ClientCnxn.submitRequest(ClientCnxn.java:822)
    [junit]     - locked <0x096105e0> (a org.apache.zookeeper.ClientCnxn$Packet)
    [junit]     at org.apache.zookeeper.ZooKeeper.close(ZooKeeper.java:329)
    [junit]     - locked <0x0bd54108> (a org.apache.zookeeper.ZooKeeper)
    [junit]     at org.apache.zookeeper.protocols.ZooKeeperFacade.close(ZooKeeperFacade.java:99)
    [junit]     at org.apache.zookeeper.protocols.WriteLockTest.tearDown(WriteLockTest.java:146)
    [junit]     at junit.framework.TestCase.runBare(TestCase.java:140)
    [junit]     at junit.framework.TestResult$1.protect(TestResult.java:110)
    [junit]     at junit.framework.TestResult.runProtected(TestResult.java:128)
    [junit]     at junit.framework.TestResult.run(TestResult.java:113)
    [junit]     at junit.framework.TestCase.run(TestCase.java:124)
    [junit]     at junit.framework.TestSuite.runTest(TestSuite.java:232)
    [junit]     at junit.framework.TestSuite.run(TestSuite.java:227)
    [junit]     at org.junit.internal.runners.JUnit38ClassRunner.run(JUnit38ClassRunner.java:81)
    [junit]     at junit.framework.JUnit4TestAdapter.run(JUnit4TestAdapter.java:36)
    [junit]     at org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.run(JUnitTestRunner.java:421)
    [junit]     at org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.launch(JUnitTestRunner.java:912)
    [junit]     at org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.main(JUnitTestRunner.java:766)


Basically the 3rd ZooKeeper client cannot close down; it just hangs in
the close() method.

(BTW it might be nice to avoid the close() method waiting forever - it
might as well wait, say, 10 seconds then just close anyway).
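
Something along these lines is what I have in mind. It is only a sketch of
a bounded wait wrapped around a blocking close, not a change to ZooKeeper
itself: TimedClose and closeWithTimeout are hypothetical names, and the
close action is passed in as a Callable so the example needs nothing
beyond the JDK.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

/** Hypothetical helper, not part of ZooKeeper: bounds how long a blocking close may take. */
final class TimedClose {
    private TimedClose() {}

    /**
     * Runs closeAction on a daemon thread and waits at most timeoutMillis for it.
     * Returns true if the close finished in time, false if we gave up waiting.
     */
    static boolean closeWithTimeout(Callable<Void> closeAction, long timeoutMillis) {
        ExecutorService executor = Executors.newSingleThreadExecutor(runnable -> {
            Thread thread = new Thread(runnable, "timed-close");
            thread.setDaemon(true); // a stuck close must not keep the JVM alive
            return thread;
        });
        Future<Void> pending = executor.submit(closeAction);
        try {
            pending.get(timeoutMillis, TimeUnit.MILLISECONDS);
            return true;
        } catch (TimeoutException e) {
            pending.cancel(true); // interrupt the closer; it may or may not respond
            return false;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the caller's interrupt status
            return false;
        } catch (ExecutionException e) {
            throw new IllegalStateException("close failed", e.getCause());
        } finally {
            executor.shutdownNow();
        }
    }
}
```

In the test this would be used as something like
closeWithTimeout(() -> { zk.close(); return null; }, 10000), with zk being
the client to shut down. A close that times out is abandoned rather than
completed, so the session would be left to expire on the server side.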

Though now I've refactored the code to avoid the patch on ZooKeeper to
deal with reconnecting when a SessionExpiredException occurs, I don't
seem to get any session expired exceptions :). I'm starting to wonder
if it's maybe related to old persistent data on disk causing the
exception?

I still get the strange lack of Watch Events on the 3rd client though,
and the hang on closing (if
WriteLockTest.workAroundClosingLastZNodeFails is set to false - I've
hacked the test to pass by default).

-- 
James
-------
http://macstrac.blogspot.com/

Open Source Integration
http://open.iona.com
