daniel added a comment.

In https://phabricator.wikimedia.org/T118162#1813400, @jcrespo wrote:
> Connection re-use does not work if you open 900 connections at the same time
> every 3 minutes. I have been saying that for a long time. If a DBA has to
> explain connection reuse vs pool of connections... :-/

There seems to be a fundamental misunderstanding here. The script is not designed to open 900 connections, nor should it be possible for this script to open 900 connections, if the code in core works as I understand it.

I suspect that we are talking about "connection re-use" at a completely different level. I'm talking about re-use inside the same PHP invocation, not between requests. Do you know the relevant code in the LoadBalancer class? Can you tell me in what respect I am understanding or using it wrong? Maybe Aaron could shed some light on this.

> > We could just as well place the locks for all client wikis on the wikidata
> > master db. Then there should be no reason to connect to the client database
> > at all (assuming the job queue is not using mysql).
>
> Please don't. You are moving the problem from one server to another (and a
> more critical server). A good start would be to close connections as soon as
> they are not needed anymore- instead of idling them. Whatever you do, please
> do not test it in production.

The script shouldn't idle the connection really, it's actually bounded by sql query speed on the repo's master DB. No queries are run against other databases.

I do not see how we would "shift the problem" - the problem is too many connections. What I suggest is to use one connection for everything.

We were connecting to the client databases just to obtain a named lock on that database. The connection then stayed open (not so great), but (according to how I understand LoadBalancer), we would use the same connection for all wikis on the same cluster - so we may end up with 10 or so open connections per script. Not ideal, but not catastrophic.
Unless the connection wasn't getting re-used, and we ended up creating a new one for every wiki. To me, that seems to be the problem - a problem caused by the bug in core that Aaron fixed.

We could start to forcibly close the connection after every query, even if it's just a second or less until we are going to fire the next one. But we'd have to discuss why we should do that in this case, and not in others. Explicitly closing connections is generally Not Done in MediaWiki - we keep the same connection(s) alive for the duration of the request. As far as I can see, DatabaseBase::close and LoadBalancer::closeConnection are never called in any regular maintenance script or during web requests.

If the "request" (script run) lasts 10 minutes, that can of course be problematic, especially if there are many such scripts, and the connection isn't actually used. But the connection to the repo database *is* used. And we don't have hundreds of script instances.

Closing the connection to slave DBs (instead of using the same connection for all wikis on the same master), as Aude suggests, would work around the issue of connection re-use not working properly. My patch also works around the problem, by using the connection to the repo wiki for everything (which requires a change to the locking logic). But with LoadBalancer working correctly, neither should be needed to avoid opening hundreds of connections. That's either a core bug (probably the one Aaron fixed), or a fundamental misunderstanding on my part (about how LoadBalancer is supposed to work).

There is a lot that can and should be improved about the dispatching process. But the //massive// problem we are seeing is not caused by the script working as designed. It's caused by the script misbehaving due to, as far as I can tell, a core bug. Which, I think, is fixed.
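
The difference between the two behaviours described in this comment (one connection per cluster versus one connection per wiki, inside a single script run) can be illustrated with a toy model. This is only a sketch in Python, not MediaWiki code: `ToyLoadBalancer`, `count_connections`, the wiki and cluster names, and the 900/10 split are all hypothetical, chosen to mirror the rough figures mentioned in the thread. It does not reproduce how the real LoadBalancer class or the core bug actually works.

```python
from dataclasses import dataclass, field


@dataclass
class ToyLoadBalancer:
    """Toy per-process connection cache (NOT MediaWiki's LoadBalancer)."""

    wiki_to_cluster: dict            # hypothetical mapping: wiki id -> cluster name
    reuse_per_cluster: bool = True   # False mimics re-use not working as expected
    open_connections: dict = field(default_factory=dict)

    def get_connection(self, wiki):
        cluster = self.wiki_to_cluster[wiki]
        # With working re-use, the cache key is the cluster only, so every
        # wiki on that cluster shares one handle. Without it, each wiki
        # ends up with its own handle.
        key = cluster if self.reuse_per_cluster else (cluster, wiki)
        if key not in self.open_connections:
            self.open_connections[key] = object()  # placeholder for a DB handle
        return self.open_connections[key]


def count_connections(wiki_to_cluster, reuse_per_cluster):
    """Touch every wiki once and report how many handles stayed open."""
    lb = ToyLoadBalancer(wiki_to_cluster, reuse_per_cluster)
    for wiki in wiki_to_cluster:
        lb.get_connection(wiki)
    return len(lb.open_connections)


# Illustrative numbers only: 900 client wikis spread over 10 clusters.
wikis = {f"wiki{i}": f"cluster{i % 10}" for i in range(900)}

print(count_connections(wikis, reuse_per_cluster=True))   # 10
print(count_connections(wikis, reuse_per_cluster=False))  # 900
```

In this model, neither closing connections early nor moving the locks to a single database is needed to stay at roughly ten handles per script; the count only balloons when the cache stops keying on the cluster.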
TASK DETAIL
  https://phabricator.wikimedia.org/T118162
