On Thu, Apr 29, 2010 at 10:24 AM, Patrick Hunt <ph...@apache.org> wrote:
> Did you find any bugs on java.sun.com related to those? ;-)
>
> That does sound like a good solution to me. We should stop accepting
> connections and log it to the log as well. We might also want to update the
> user docs and tell users to monitor the FD count as part of their monitoring
> regime. Is there a way to register for notifications on those via JMX? We
> might want to add this to our own JMX/4letterwords to simplify monitoring of
> this critical resource for users.
>
> Travis, would you mind creating a JIRA for this? Thanks!

Filed: https://issues.apache.org/jira/browse/ZOOKEEPER-759

Thanks for the feedback all!

Travis

>
> Patrick
>
> On 04/29/2010 10:09 AM, Travis Crawford wrote:
>>
>> On Thu, Apr 29, 2010 at 9:49 AM, Patrick Hunt <ph...@apache.org> wrote:
>>>
>>> Is there any good (simple/fast/bulletproof) way to monitor the FD use
>>> inside the jvm? If so we could stop accepting new client connections
>>> once we get close to the os imposed limit... The test would have to be
>>> a bulletproof one though - we wouldn't want to end up in some worse
>>> situation (where we refuse connection because we mistakenly believe
>>> that the limit has been reached).
>>>
>>> Might be good to open a JIRA for this and add some tests. In particular
>>> we should verify the server handles this as gracefully as it can when
>>> the limit has been reached.
>>
>> Poking around with jconsole I found two stats that already measure FDs:
>>
>> - java.lang.OperatingSystem.MaxFileDescriptorCount
>> - java.lang.OperatingSystem.OpenFileDescriptorCount
>>
>> They're described (rather tersely) at:
>>
>> http://java.sun.com/javase/6/docs/jre/api/management/extension/com/sun/management/UnixOperatingSystemMXBean.html
>>
>> So it sounds like the feature request would be stop accepting new
>> client connections if OpenFileDescriptorCount > 95% of
>> MaxFileDescriptorCount? Only start accepting new requests when
>> OpenFileDescriptorCount < 90% of MaxFileDescriptorCount. Basically the
>> high/low watermark thing.
>>
>> Thoughts?
>>
>> --travis
>>
>>>
>>> Patrick
>>>
>>> On 04/29/2010 09:34 AM, Mahadev Konar wrote:
>>>>
>>>> Hi Travis,
>>>>
>>>> How many clients did you have connected to this server? Usually the
>>>> default is 8K file descriptors. Did you have clients more than that?
>>>>
>>>> Also, if clients fail to attach to a server, they will run off to
>>>> another server.
>>>> We do not do any blacklisting because we expect the server to heal
>>>> and if it does not, it mostly shuts itself down in most of the cases.
>>>>
>>>> Thanks
>>>> mahadev
>>>>
>>>>
>>>> On 4/29/10 12:08 AM, "Travis Crawford" <traviscrawf...@gmail.com> wrote:
>>>>
>>>>> Hey zookeeper gurus -
>>>>>
>>>>> We recently had a zookeeper outage when one ZK server was started with
>>>>> a low limit after upgrading to 3.3.0. Several days later the outage
>>>>> occurred when that node reached its file descriptor limit and clients
>>>>> started having major issues.
>>>>>
>>>>> Are there any circumstances when a ZK server will get blacklisted from
>>>>> the ensemble? Something similar to how tasktrackers are blacklisted
>>>>> when too many tasks fail.
>>>>>
>>>>> Thanks!
>>>>> Travis
>>>>
>>>
>
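
For readers following the high/low watermark proposal in this thread, here is a minimal sketch of how it might look in Java. It is not ZooKeeper code and not what ZOOKEEPER-759 implements. The class name FdWatermark and its methods are hypothetical, and the 95% / 90% thresholds are simply the figures suggested in the thread. The only real API used is com.sun.management.UnixOperatingSystemMXBean, which backs the two jconsole attributes mentioned above and is only available on Unix-like JVMs.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.OperatingSystemMXBean;

// Hypothetical helper, for illustration only (not part of ZooKeeper).
class FdWatermark {
    // Assumed thresholds, taken from the 95% / 90% figures in the thread.
    static final double HIGH = 0.95;
    static final double LOW = 0.90;

    private boolean accepting = true;

    // Hysteresis step: stop accepting above HIGH, resume only below LOW.
    // Between the two watermarks the previous decision is kept.
    boolean update(long open, long max) {
        if (max <= 0) {
            return accepting; // limit unknown: leave the state unchanged
        }
        double used = (double) open / max;
        if (accepting && used > HIGH) {
            accepting = false;
        } else if (!accepting && used < LOW) {
            accepting = true;
        }
        return accepting;
    }

    // Returns {open, max}, or null when the JVM does not expose FD counts.
    static long[] readFdCounts() {
        OperatingSystemMXBean os = ManagementFactory.getOperatingSystemMXBean();
        if (os instanceof com.sun.management.UnixOperatingSystemMXBean) {
            com.sun.management.UnixOperatingSystemMXBean unix =
                (com.sun.management.UnixOperatingSystemMXBean) os;
            return new long[] {
                unix.getOpenFileDescriptorCount(),
                unix.getMaxFileDescriptorCount()
            };
        }
        return null;
    }

    public static void main(String[] args) {
        long[] counts = readFdCounts();
        if (counts == null) {
            System.out.println("FD counts not available on this JVM");
            return;
        }
        FdWatermark w = new FdWatermark();
        System.out.println("open=" + counts[0] + " max=" + counts[1]
            + " accept=" + w.update(counts[0], counts[1]));
    }
}
```

Keeping the decision in a pure update(open, max) method makes the "bulletproof" concern raised above easier to test, since the thresholds can be exercised without actually exhausting descriptors. When the counts are unavailable the sketch keeps accepting connections, so a monitoring failure cannot by itself cause refusals.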