2010/11/15 Daniel Friesen <[email protected]>:
> There was a thought about the job queue that popped into my mind today.
>
> From what I understand, for a Wiki Farm, in order to use runJobs.php
> instead of the in-request queue (which on high-traffic sites is less
> desirable), the Wiki Farm has to run runJobs.php periodically for each
> and every wiki on the farm.
> So, for example: if a Wiki Farm is hosting 10,000 wikis, and the Wiki
> Host really wants to ensure that the queue is run at least hourly to
> keep the data on the wikis reasonably up to date, the wiki farm
> essentially needs to call runJobs.php 10,000 times an hour (i.e. once
> for each individual wiki), regardless of whether a wiki has jobs or
> not. Either that, or poll each database beforehand, which in itself is
> 10,000 database calls an hour plus the runJobs execution, which still
> isn't that desirable.

Have you considered the fact that the WMF cluster is in this exact situation? ;)
However, we don't call runJobs.php for all wikis periodically. Instead, we call nextJobDB.php, which generates a list of wikis that have pending jobs (by connecting to all of their DBs), caches it in memcached (caching was broken until a few minutes ago, oops) and outputs a random DB name. We then run runJobs.php on that random DB name. This whole thing is in maintenance/jobs-loop.sh

Roan Kattouw (Catrope)

_______________________________________________
Wikitech-l mailing list
[email protected]
https://lists.wikimedia.org/mailman/listinfo/wikitech-l
