Travis Crawford updated ZOOKEEPER-803:

    Attachment: connection-bugfix-diff.png

This diff shows a bug where the client developer confused disconnections with 
expired sessions. In the ZooKeeper programming model, clients reconnect 
automatically when disconnected. However, should the session expire, 
the application is responsible for creating a new session.
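
To illustrate the distinction, here is a minimal Python sketch. It does not use 
the real client library: State, FakeHandle and SessionManager are hypothetical 
stand-ins invented for this example. The point is only that a disconnect leaves 
the existing handle alone, while an expiry closes the old handle before opening 
exactly one new one.

```python
from enum import Enum, auto


class State(Enum):
    # Hypothetical stand-ins for the client library's connection states.
    CONNECTED = auto()
    DISCONNECTED = auto()
    EXPIRED = auto()


class FakeHandle:
    """Stand-in for a client handle; tracks how many handles are open."""
    open_count = 0

    def __init__(self):
        FakeHandle.open_count += 1
        self.closed = False

    def close(self):
        if not self.closed:
            self.closed = True
            FakeHandle.open_count -= 1


class SessionManager:
    def __init__(self, handle_factory=FakeHandle):
        self._factory = handle_factory
        self.handle = handle_factory()

    def on_state_change(self, state):
        if state is State.DISCONNECTED:
            # The library reconnects by itself; keep the existing handle.
            return
        if state is State.EXPIRED:
            # The session is gone: close the old handle, then open one new one.
            self.handle.close()
            self.handle = self._factory()
```

Under these assumptions, any number of DISCONNECTED events leaves a single open 
handle, and each EXPIRED event replaces it rather than adding to it.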

In this case the developer attempted to throttle reconnects. However, due to a 
bug, the application created a new connection each time.
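
One way such a throttle could look is sketched below in Python. This is not the 
code from the attached diff; ReconnectThrottle and its base, cap and clock 
parameters are hypothetical names chosen for illustration. It gates reconnect 
attempts behind an exponentially growing delay, so a caller that asks too soon 
is refused instead of opening another connection.

```python
import time


class ReconnectThrottle:
    """Permit a reconnect attempt only after an exponentially growing delay."""

    def __init__(self, base=1.0, cap=60.0, clock=time.monotonic):
        self.base = base            # first delay, in seconds
        self.cap = cap              # upper bound on the delay
        self._clock = clock         # injectable so the logic can be tested
        self._attempts = 0
        self._next_allowed = float("-inf")

    def try_acquire(self):
        """Return True if a reconnect may be attempted now."""
        now = self._clock()
        if now < self._next_allowed:
            return False
        delay = min(self.cap, self.base * (2 ** self._attempts))
        self._attempts += 1
        self._next_allowed = now + delay
        return True

    def reset(self):
        """Call after a successful connect to start over from the base delay."""
        self._attempts = 0
        self._next_allowed = float("-inf")
```

The caller would create a new connection only when try_acquire() returns True, 
and would close the previous handle before doing so.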

A small number of clients running the buggy code took down a 3-node ZooKeeper 
cluster by exhausting the 65k file descriptor limit. The cluster recovered only 
after the clients were shut down, the ZooKeeper servers were restarted, and the 
well-behaved clients were then started again.
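
As a rough illustration of one possible server-side defense, the Python sketch 
below caps the number of open connections per client address. It is an 
assumption about what such a defense might look like, not a description of how 
the server behaves; ConnectionLimiter and max_per_client are hypothetical names.

```python
from collections import Counter


class ConnectionLimiter:
    """Refuse new connections from an address that already holds too many."""

    def __init__(self, max_per_client):
        self.max_per_client = max_per_client
        self._open = Counter()  # client address -> open connection count

    def accept(self, client_addr):
        """Return True and record the connection if the client is under its cap."""
        if self._open[client_addr] >= self.max_per_client:
            return False
        self._open[client_addr] += 1
        return True

    def release(self, client_addr):
        """Record that one connection from client_addr was closed."""
        if self._open[client_addr] > 0:
            self._open[client_addr] -= 1
            if self._open[client_addr] == 0:
                del self._open[client_addr]
```

With a cap like this, a single misbehaving host is refused once it reaches its 
own limit, long before it can consume the server's file descriptors.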

> Improve defenses against misbehaving clients
> --------------------------------------------
>                 Key: ZOOKEEPER-803
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-803
>             Project: Zookeeper
>          Issue Type: Bug
>    Affects Versions: 3.3.0
>            Reporter: Travis Crawford
>         Attachments: connection-bugfix-diff.png
> This issue is in response to ZOOKEEPER-801. The short version is that a small 
> number of buggy clients opened thousands of connections and caused ZooKeeper 
> to fail. The misbehaving client did not correctly handle expired sessions, 
> creating a new connection each time. The huge number of connections 
> exacerbated the issue.

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
