On Thu, Apr 29, 2010 at 9:49 AM, Patrick Hunt <ph...@apache.org> wrote:
> Is there any good (simple/fast/bulletproof) way to monitor the FD use inside
> the jvm? If so we could stop accepting new client connections once we get
> close to the os imposed limit... The test would have to be a bulletproof one
> though - we wouldn't want to end up in some worse situation (where we refuse
> connection because we mistakenly believe that the limit has been reached).
>
> Might be good to open a JIRA for this and add some tests. In particular we
> should verify the server handles this as gracefully as it can when the limit
> has been reached.

Poking around with jconsole I found two stats that already measure FDs:

- java.lang.OperatingSystem.MaxFileDescriptorCount
- java.lang.OperatingSystem.OpenFileDescriptorCount

They're described (rather tersely) at:

http://java.sun.com/javase/6/docs/jre/api/management/extension/com/sun/management/UnixOperatingSystemMXBean.html

So it sounds like the feature request would be to stop accepting new
client connections if OpenFileDescriptorCount > 95% of
MaxFileDescriptorCount, and to only resume accepting them once
OpenFileDescriptorCount < 90% of MaxFileDescriptorCount. Basically the
high/low watermark thing.

Thoughts?

--travis

>
> Patrick
>
> On 04/29/2010 09:34 AM, Mahadev Konar wrote:
>>
>> Hi Travis,
>>
>> How many clients did you have connected to this server? Usually the
>> default is 8K file descriptors. Did you have clients more than that?
>>
>> Also, if clients fail to attach to a server, they will run off to another
>> server. We do not do any blacklisting because we expect the server to heal
>> and if it does not, it mostly shuts itself down in most of the cases.
>>
>> Thanks
>> mahadev
>>
>>
>> On 4/29/10 12:08 AM, "Travis Crawford"<traviscrawf...@gmail.com> wrote:
>>
>>> Hey zookeeper gurus -
>>>
>>> We recently had a zookeeper outage when one ZK server was started with
>>> a low limit after upgrading to 3.3.0. Several days later the outage
>>> occurred when that node reached its file descriptor limit and clients
>>> started having major issues.
>>>
>>> Are there any circumstances when a ZK server will get blacklisted from
>>> the ensemble? Something similar to how tasktrackers are blacklisted
>>> when too many tasks fail.
>>>
>>> Thanks!
>>> Travis
>>
>
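
For reference, the high/low watermark idea from the reply above could be sketched roughly as follows. This is only an illustration, not ZooKeeper code: the class name FdWatermarkGate and its methods are hypothetical, and the 95%/90% ratios are simply the numbers floated in the thread. It assumes a Unix JVM that exposes com.sun.management.UnixOperatingSystemMXBean; where that bean is unavailable the gate stays open, so a missing reading never causes a refused connection.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.OperatingSystemMXBean;
import com.sun.management.UnixOperatingSystemMXBean;

// Hypothetical helper (not part of ZooKeeper): a high/low watermark gate
// driven by the JVM's open and max file descriptor counts.
class FdWatermarkGate {
    private final double highRatio; // stop accepting above this fraction of max
    private final double lowRatio;  // resume accepting below this fraction of max
    private boolean accepting = true;

    FdWatermarkGate(double highRatio, double lowRatio) {
        if (!(lowRatio > 0.0 && lowRatio < highRatio && highRatio <= 1.0)) {
            throw new IllegalArgumentException("need 0 < low < high <= 1");
        }
        this.highRatio = highRatio;
        this.lowRatio = lowRatio;
    }

    // Pure hysteresis step: feed in a reading, get back the new state.
    synchronized boolean update(long open, long max) {
        if (max <= 0) {
            return accepting; // unknown limit: leave the state unchanged
        }
        if (accepting && open > highRatio * max) {
            accepting = false; // crossed the high watermark
        } else if (!accepting && open < lowRatio * max) {
            accepting = true;  // dropped back below the low watermark
        }
        return accepting;
    }

    synchronized boolean isAccepting() {
        return accepting;
    }

    // Read the current counts from the platform MXBean, if it is the Unix one.
    boolean poll() {
        OperatingSystemMXBean os = ManagementFactory.getOperatingSystemMXBean();
        if (os instanceof UnixOperatingSystemMXBean) {
            UnixOperatingSystemMXBean unix = (UnixOperatingSystemMXBean) os;
            return update(unix.getOpenFileDescriptorCount(),
                          unix.getMaxFileDescriptorCount());
        }
        return isAccepting(); // no FD stats on this platform: keep current state
    }

    public static void main(String[] args) {
        FdWatermarkGate gate = new FdWatermarkGate(0.95, 0.90);
        System.out.println("accepting new connections: " + gate.poll());
    }
}
```

A server would presumably call poll() before each accept (or on a timer) and skip accepting while it returns false. Because update() only changes state when a watermark is actually crossed, a count hovering between 90% and 95% does not make the gate flap.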