https://bugzilla.wikimedia.org/show_bug.cgi?id=56882
--- Comment #4 from Tim Starling <[email protected]> ---

The error message indicates that a libmemcached memcached_connect() call gave a result of MEMCACHED_SERVER_TEMPORARILY_DISABLED. This can happen if memcached_mark_server_for_timeout() was called on the server, which can happen if memcached_quit_server() is called with io_death=true, which can happen in all sorts of different cases in io.cc and in one case in response.cc. More investigation is needed to determine exactly which case is causing it.

Unfortunately, MEMCACHED_BEHAVIOR_RETRY_TIMEOUT (i.e. retry_timeout in our config) has a minimum of one whole second. In the case of PHP talking to twemproxy, immediate reconnection would probably be a better policy. The one-second timeout is why we see floods of messages in bursts that last approximately one second each.

The fact that pmtpa apaches responding to pybal monitoring requests are heavily represented in the logs may be caused by transient packet loss on the pmtpa to eqiad link, which would increase the rate of connection timeouts.
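
To illustrate why a one-second retry timeout would produce error floods in bursts of roughly one second, here is a rough toy model in Python. It is not libmemcached's code: the class, function names and request rate are hypothetical, and the only behaviour modelled is the one described above (an I/O failure marks the server, and every request before the retry timeout expires fails immediately).

```python
# Toy model only: this is NOT libmemcached's code, just the behaviour
# described in the comment above, with hypothetical names and timings.

SUCCESS = "SUCCESS"
IO_FAILURE = "IO_FAILURE"
TEMPORARILY_DISABLED = "SERVER_TEMPORARILY_DISABLED"


class ToyServer:
    def __init__(self, retry_timeout_ms):
        self.retry_timeout_ms = retry_timeout_ms
        self.disabled_until_ms = 0  # server is usable when now >= this value

    def request(self, now_ms, io_fails=False):
        # While the server is marked, fail fast without touching the network.
        if now_ms < self.disabled_until_ms:
            return TEMPORARILY_DISABLED
        if io_fails:
            # Analogue of marking the server for timeout after an I/O death.
            self.disabled_until_ms = now_ms + self.retry_timeout_ms
            return IO_FAILURE
        return SUCCESS


def simulate(retry_timeout_ms, failure_at_ms=500, step_ms=10, duration_ms=3000):
    """Issue one request every step_ms; exactly one of them hits an I/O error."""
    server = ToyServer(retry_timeout_ms)
    results = []
    for now_ms in range(0, duration_ms, step_ms):
        results.append(server.request(now_ms, io_fails=(now_ms == failure_at_ms)))
    return results


one_second = simulate(retry_timeout_ms=1000)
immediate = simulate(retry_timeout_ms=0)

print(one_second.count(TEMPORARILY_DISABLED))  # 99
print(immediate.count(TEMPORARILY_DISABLED))   # 0
```

In this model a single I/O failure with a 1000 ms retry timeout turns into 99 further errors over the following second, while a zero timeout (the "immediate reconnection" policy, which the real library does not currently allow) yields only the one original failure.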
