[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-251?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12654811#action_12654811
 ] 

vinodjohnson edited comment on ZOOKEEPER-251 at 12/9/08 6:47 AM:
-------------------------------------------------------------------------

Thank you. Post patch, I don't see the exceptions. To clear up confusion on my 
part, when you say "the situation arises after switching from quorum to 
standalone", you are referring to the single server that is stopped and 
started, and not the ensemble as a whole? I believe that throughout the test, I 
was maintaining quorum by having at least 2 out of 3 servers running and 
communicating to each other.

      was (Author: vinodjohnson):
    Thank you. Post patch, I don't see the exceptions. To clear up confusion on 
my part, when you say "the situation arises after switching from quorum to 
standalone", you are referring to the single server that is stopped and 
started, correct and not the ensemble as a whole? I believe that throughout the 
test, I was maintaining quorum by having at least 2 out of 3 servers running 
and communicating to each other.
  
> NullPointerException stopping and starting Zookeeper servers
> ------------------------------------------------------------
>
>                 Key: ZOOKEEPER-251
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-251
>             Project: Zookeeper
>          Issue Type: Bug
>          Components: server
>    Affects Versions: 3.0.0, 3.0.1
>         Environment: Tested with JDK 1.5, Solaris, but I suspect it is not 
> relevant in this case.
>            Reporter: Thomas Vinod Johnson
>            Assignee: Mahadev konar
>            Priority: Blocker
>             Fix For: 3.1.0
>
>         Attachments: ZOOKEEPER-251.patch
>
>
> See the following thread for the original report:
> http://mail-archives.apache.org/mod_mbox/hadoop-zookeeper-user/200812.mbox/browser
> Steps to reproduce:
> 1) Start a replicated zookeeper service consisting of 3 zookeeper (3.0.1) 
> servers all running on the same host (of course, all using their own ports 
> and log directories)
> 2) Create one znode in this ensemble (using the zookeeper client console, I 
> issued 'create /node1 node1data').
> 3) Stop, then restart a single zookeeper server; moving onto the next one a 
> few seconds later. 
> 4) Go back to 3. After 4-5 iterations, the following should occur, with the 
> failing server exiting:
> java.lang.NullPointerException
>         at 
> org.apache.zookeeper.server.persistence.FileTxnLog$FileTxnIterator.next(FileTxnLog.java:447)
>         at 
> org.apache.zookeeper.server.persistence.FileTxnLog$FileTxnIterator.init(FileTxnLog.java:358)
>         at 
> org.apache.zookeeper.server.persistence.FileTxnLog$FileTxnIterator.<init>(FileTxnLog.java:333)
>         at 
> org.apache.zookeeper.server.persistence.FileTxnLog.read(FileTxnLog.java:250)
>         at 
> org.apache.zookeeper.server.persistence.FileTxnSnapLog.restore(FileTxnSnapLog.java:102)
>         at 
> org.apache.zookeeper.server.ZooKeeperServer.loadData(ZooKeeperServer.java:183)
>         at org.apache.zookeeper.server.quorum.Leader.lead(Leader.java:245)
>         at 
> org.apache.zookeeper.server.quorum.QuorumPeer.run(QuorumPeer.java:421)
> 2008-12-08 14:14:24,880 - INFO  
> [QuorumPeer:/0:0:0:0:0:0:0:0:2183:[EMAIL PROTECTED] - Shutdown called
> java.lang.Exception: shutdown Leader! reason: Forcing shutdown
>         at 
> org.apache.zookeeper.server.quorum.Leader.shutdown(Leader.java:336)
>         at 
> org.apache.zookeeper.server.quorum.QuorumPeer.run(QuorumPeer.java:427)
> Exception in thread "QuorumPeer:/0:0:0:0:0:0:0:0:2183" 
> java.lang.NullPointerException
>         at 
> org.apache.zookeeper.server.quorum.Leader.shutdown(Leader.java:339)
>         at 
> org.apache.zookeeper.server.quorum.QuorumPeer.run(QuorumPeer.java:427)
> The inputStream field is null, apparently because next is being called 
> at line 358 even after next returns false. Having very little knowledge 
> about the implementation, I don't know if the existence of hdr.getZxid() 
>  >= zxid is supposed to be an invariant across all invocations of the 
> server; however the following change to FileTxnLog.java seems to make 
> the problem go away.
> diff FileTxnLog.java /tmp/FileTxnLog.java
> 358c358,359
> <                 next();
> ---
>  >               if (!next())
>  >                   return;
> 447c448,450
> <                 inputStream.close();
> ---
>  >               if (inputStream != null) {
>  >                   inputStream.close();
>  >               }

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

Reply via email to