[ https://issues.apache.org/jira/browse/ZOOKEEPER-856?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12903170#action_12903170 ]
Travis Crawford commented on ZOOKEEPER-856:
-------------------------------------------

@patrick - We're using these settings, which I believe are based on what's recommended in the troubleshooting guide:

-XX:+PrintGC -XX:+PrintGCDetails -XX:+PrintGCTimeStamps -XX:+PrintGCApplicationStoppedTime -XX:+PrintGCApplicationConcurrentTime -XX:+UseConcMarkSweepGC

Looking at the logs, I do see lots of GC activity. For example:

Total time for which application threads were stopped: 0.5599050 seconds
Application time: 0.0056590 seconds

I only see this on the hosts that became unresponsive after acquiring lots of connections.

Any suggestions for the GC flags? If there's something better, I can experiment and update the wiki if we discover something interesting.

http://wiki.apache.org/hadoop/ZooKeeper/Troubleshooting

> Connection imbalance leads to overloaded ZK instances
> -----------------------------------------------------
>
>                 Key: ZOOKEEPER-856
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-856
>             Project: Zookeeper
>          Issue Type: Bug
>            Reporter: Travis Crawford
>             Fix For: 3.4.0
>
>         Attachments: zk_open_file_descriptor_count_members.gif, zk_open_file_descriptor_count_total.gif
>
>
> We've experienced a number of issues lately where "ruok" requests would take
> upwards of 10 seconds to return, and ZooKeeper instances were extremely
> sluggish. The sluggish instance requires a restart to make it responsive
> again.
> I believe the issue is that connections are very imbalanced, leading to certain
> instances having many thousands of connections, while other instances are
> largely idle.
> A potential solution is periodically disconnecting/reconnecting to balance
> connections over time; this seems fine because sessions should not be
> affected, and therefore ephemeral nodes and watches should not be affected.
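
To put the GC log excerpt from the comment in perspective, here is a minimal sketch that sums the "stopped" and "Application time" lines and reports the fraction of wall-clock time spent paused. The helper name pause_fraction and the sample list are hypothetical, for illustration only. The script assumes the log contains lines in exactly the two formats quoted in the comment; it is not part of ZooKeeper or the JVM.

```python
import re

# Line formats as quoted in the comment, produced by
# -XX:+PrintGCApplicationStoppedTime and -XX:+PrintGCApplicationConcurrentTime.
STOPPED = re.compile(
    r"Total time for which application threads were stopped: ([\d.]+) seconds"
)
RUNNING = re.compile(r"Application time: ([\d.]+) seconds")


def pause_fraction(lines):
    """Return stopped / (stopped + running) over all matching log lines.

    Hypothetical helper: lines that match neither pattern are ignored.
    """
    stopped = 0.0
    running = 0.0
    for line in lines:
        match = STOPPED.search(line)
        if match:
            stopped += float(match.group(1))
            continue
        match = RUNNING.search(line)
        if match:
            running += float(match.group(1))
    total = stopped + running
    return stopped / total if total else 0.0


# The two sample lines from the comment.
sample = [
    "Total time for which application threads were stopped: 0.5599050 seconds",
    "Application time: 0.0056590 seconds",
]

print(f"{pause_fraction(sample):.1%}")  # prints 99.0%
```

For the two sample lines alone, the application threads were paused for roughly 99% of the interval. If that ratio holds across the log, it would be consistent with the slow "ruok" responses on the overloaded hosts.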