https://bugzilla.wikimedia.org/show_bug.cgi?id=29233
--- Comment #6 from Tim Starling <[email protected]> 2011-06-17 00:44:24 UTC ---
(In reply to comment #3)
> Apache/PHP doesn't select databases; MediaWiki's LoadBalancer class does. It
> should already be failing over to the next available server in the case of a
> connection error (MySQL error 2003 is "can't connect").

Yes, LoadBalancer has failover code. During the downtime on May 24, the
appropriate sort of connection error was logged:

Tue May 24 13:41:01 UTC 2011 srv191 rowiki Connection error: No working slave server: No working slave server: Unknown error ()

This error indicates that the fallback sequence was exhausted, so whatever is
going on, it's clear that we're not just letting connection error exceptions
leak out of LoadBalancer. There were 4775 instances on that day.

If we get an error in Database::query(), then we don't close the connection and
switch over to another database. I'm not sure if that's what CT is asking for.

We don't appear to properly log "MySQL server has gone away" or "Lost
connection to MySQL server during query" errors. They are dealt with by
automatically reconnecting, and then if the reconnection fails, a
DBConnectionError would probably be thrown.

There were 162,973 instances of "LB failure with no last connection" on May 24,
which is somewhat concerning. But it's hard to know if that's the problem of
interest. What we really need to know is: when exactly was this "site failure",
and what were the observed symptoms of it?
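
For anyone following along, the failover behaviour described above can be sketched roughly as follows. This is a minimal illustration in Python, not MediaWiki's actual LoadBalancer code (which is PHP). The names connect_with_failover, ConnectionFailed and NoWorkingServer, and the host names, are hypothetical and exist only for this sketch. It assumes servers are tried in a fixed order and that only connection errors trigger the fallback.

```python
class ConnectionFailed(Exception):
    """A single server could not be reached (stand-in for a 'can't connect' error)."""


class NoWorkingServer(Exception):
    """Every server in the fallback sequence failed."""

    def __init__(self, errors):
        super().__init__(f"No working slave server: tried {len(errors)} server(s)")
        self.errors = errors  # list of (host, message) pairs, in the order tried


def connect_with_failover(servers, connect):
    """Try each host in order and return (host, connection) for the first success.

    `connect` is any callable that takes a host name and either returns a
    connection object or raises ConnectionFailed. Other exception types
    (for example, errors from running a query) are deliberately not caught,
    so they do not cause a switch to another server.
    """
    errors = []
    for host in servers:
        try:
            return host, connect(host)
        except ConnectionFailed as exc:
            # Remember the failure and fall through to the next candidate.
            errors.append((host, str(exc)))
    # The fallback sequence is exhausted: report it as one aggregate error.
    raise NoWorkingServer(errors)


if __name__ == "__main__":
    down = {"db1", "db2"}  # hypothetical hosts that refuse connections

    def fake_connect(host):
        if host in down:
            raise ConnectionFailed("can't connect")
        return f"connection to {host}"

    print(connect_with_failover(["db1", "db2", "db3"], fake_connect))
    try:
        connect_with_failover(["db1", "db2"], fake_connect)
    except NoWorkingServer as exc:
        print(exc, exc.errors)
```

In this sketch, a logged "No working slave server" corresponds to the NoWorkingServer case: each candidate was tried and failed, rather than a single connection error escaping unhandled.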
