https://bugzilla.wikimedia.org/show_bug.cgi?id=54195
Faidon Liambotis <[email protected]> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Priority|High                        |Highest
                 CC|                            |[email protected],
                   |                            |[email protected]

--- Comment #20 from Faidon Liambotis <[email protected]> ---
OK, I had another quick look, as this, in combination with consistent hashing directing /wiki/Special:CentralAutoLogin/start?type=script to a single Varnish backend, produced a partial outage by exhausting that backend's resources. (This has since been worked around with https://gerrit.wikimedia.org/r/#/c/95458/ )

This is far more than just "caching". I'm going to report these here so we don't get lost across multiple bugs, but feel free to fork into other bugs as needed:

1) From Varnish text frontends, i.e. *all* text client-side requests (hits & misses), the top 10 URLs are:

  6895.39 RxURL /wiki/Special:CentralAutoLogin/start?type=script
  5627.74 RxURL /w/index.php?title=MediaWiki:Wikiminiatlas.js&action=raw&ctype=text/javascript&smaxage=21600&maxage=86400
  3235.81 RxURL /wiki/Special:CentralAutoLogin/createSession?gu_id=0&type=script&proto=https
  2137.83 RxURL /wiki/Special:CentralAutoLogin/createSession?gu_id=0&type=script&proto=http
  1981.98 RxURL /w/index.php?title=MediaWiki:EnhancedCollapsibleElements.js&action=raw&ctype=text/javascript
  1943.81 RxURL /w/index.php?title=MediaWiki:Common.js/NormalizeCharWidth.js&action=raw&ctype=text/javascript
  1567.70 RxURL /wiki/Special:CentralAutoLogin/checkLoggedIn?wikiid=enwiki&proto=https&type=script
  1383.12 RxURL /favicon.ico
  1266.87 RxURL /w/index.php?title=%E7%89%B9%E5%88%A5:%E4%B8%AD%E5%A4%AE%E7%AE%A1%E7%90%86%E8%87%AA%E5%8B%95%E3%83%AD%E3%82%B0%E3%82%A4%E3%83%B3/star
  1020.66 RxURL /w/index.php?title=MediaWiki:OSM.js&action=raw&ctype=text/javascript&smaxage=21600&maxage=86400

(Also note that the first gets roughly double the requests of the third and triple the requests of the fourth.)

Of the top 50 URLs, 10 (20%) are CentralAutoLogin URLs, namely:

  6944.76 RxURL /wiki/Special:CentralAutoLogin/start?type=script
  3003.28 RxURL /wiki/Special:CentralAutoLogin/createSession?gu_id=0&type=script&proto=https
  2273.40 RxURL /wiki/Special:CentralAutoLogin/createSession?gu_id=0&type=script&proto=http
  1452.62 RxURL /wiki/Special:CentralAutoLogin/checkLoggedIn?wikiid=enwiki&proto=https&type=script
   551.49 RxURL /wiki/Special:CentralAutoLogin/checkLoggedIn?wikiid=jawiki&proto=https&type=script
   362.93 RxURL /wiki/Special:CentralAutoLogin/checkLoggedIn?wikiid=eswiki&proto=https&type=script
   298.33 RxURL /wiki/Special:CentralAutoLogin/checkLoggedIn?wikiid=zhwiki&proto=http&type=script
   293.00 RxURL /wiki/Special:CentralAutoLogin/checkLoggedIn?wikiid=enwiki&proto=http&type=script
   289.84 RxURL /wiki/Special:CentralAutoLogin/createSession?gu_id=0&type=1x1&proto=https
   240.99 RxURL /wiki/Special:CentralAutoLogin/checkLoggedIn?wikiid=ptwiki&proto=https&type=script

Moreover, these stats do not include the localized URLs that I'm seeing, such as http://de.wikipedia.org/w/index.php?title=Spezial:Zentrale_automatische_Anmeldung/start&type=script (localized URLs for such basic functions are a bad idea, as they make it harder for us to aggregate URL counts and find outliers like the above).

And, surprisingly, they do not include the 1x1 URL that I see e.g. on all(?) enwiki pages as:

  <noscript><img src="//en.wikipedia.org/w/index.php?title=Special:CentralAutoLogin/start&type=1x1" alt="" title="" width="1" height="1" style="border: none; position: absolute;" /></noscript></div>

which also comes with Cache-control: private.

The fact that these URLs are so popular points to far more than just editors attempting to log in; it surely includes anonymous page views, and it's probably aggravated by the fact that, due to the caching headers, we do not make use of client-side caching. I see ext.centralauth.centralautologin.js doing such requests and that script being loaded on every anon page view, but it's clearly not my area of expertise.
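To illustrate the aggregation point, here is a minimal sketch (not the tooling actually used to produce the numbers above) that sums request rates per page title, so that the /wiki/ form, the /w/index.php?title= form, and a localized alias all end up in one bucket. The `ALIASES` table and the function names are assumptions made for the example; only the German alias quoted above is listed, and a real mapping would have to come from MediaWiki's special page aliases.

```python
from collections import defaultdict
from urllib.parse import parse_qs, unquote, urlsplit

# Hypothetical alias table: localized special-page name -> canonical name.
# Only the one alias quoted in the comment above is included.
ALIASES = {
    "Spezial:Zentrale_automatische_Anmeldung": "Special:CentralAutoLogin",
}


def page_title(url):
    """Return a canonical page title for a /wiki/ or /w/index.php URL."""
    parts = urlsplit(url)
    if parts.path.startswith("/wiki/"):
        title = unquote(parts.path[len("/wiki/"):])
    else:
        # parse_qs already percent-decodes values; fall back to the path
        # for URLs without a title parameter (e.g. /favicon.ico).
        title = parse_qs(parts.query).get("title", [parts.path])[0]
    prefix, _, subpage = title.partition("/")
    prefix = ALIASES.get(prefix, prefix)
    return f"{prefix}/{subpage}" if subpage else prefix


def aggregate(lines):
    """Sum the rate column of '<rate> RxURL <url>' lines per page title."""
    totals = defaultdict(float)
    for line in lines:
        fields = line.split(None, 2)
        if len(fields) != 3 or fields[1] != "RxURL":
            continue  # skip anything that is not a rate/RxURL/url line
        totals[page_title(fields[2].strip())] += float(fields[0])
    return dict(totals)


SAMPLE = """\
6895.39 RxURL /wiki/Special:CentralAutoLogin/start?type=script
3235.81 RxURL /wiki/Special:CentralAutoLogin/createSession?gu_id=0&type=script&proto=https
2137.83 RxURL /wiki/Special:CentralAutoLogin/createSession?gu_id=0&type=script&proto=http
1383.12 RxURL /favicon.ico
"""

if __name__ == "__main__":
    ranked = sorted(aggregate(SAMPLE.splitlines()).items(), key=lambda kv: -kv[1])
    for title, rate in ranked:
        print(f"{rate:10.2f} {title}")
```

Grouping by title rather than by raw URL merges the proto=http and proto=https variants of createSession into a single row, which makes outliers easier to spot.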
Those URLs not being cached also means that they get passed through to the appservers and increase their load significantly, which was the original point of this report. It's also very apparent now, though, that they're causing problems in all kinds of layers in our infrastructure (Varnish frontend/backend hit ratio is down the drain, Varnish backends were/are suffering) and possibly causing a client-side slow-down.

I think we can fix the caching issues (even for redirects depending on GeoIP), but first and foremost we need to stop forcing all page views to request all kinds of (uncacheable) CentralAutoLogin URLs in this way.
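For readers unfamiliar with why consistent hashing concentrates the load from a single hot URL, the following is a generic hash-ring sketch. It is an illustration only and not Varnish's actual director implementation; the backend names, the replica count, and the choice of MD5 are all assumptions made for the example.

```python
import bisect
import hashlib


class HashRing:
    """Generic consistent-hash ring; an illustration, not Varnish's director."""

    def __init__(self, backends, replicas=100):
        # Each backend is placed on the ring at `replicas` pseudo-random points.
        self._ring = sorted(
            (self._hash(f"{backend}#{i}"), backend)
            for backend in backends
            for i in range(replicas)
        )
        self._keys = [point for point, _ in self._ring]

    @staticmethod
    def _hash(key):
        digest = hashlib.md5(key.encode("utf-8")).digest()
        return int.from_bytes(digest[:8], "big")

    def backend_for(self, url):
        # The first ring point clockwise from the URL's hash owns the URL.
        idx = bisect.bisect(self._keys, self._hash(url)) % len(self._keys)
        return self._ring[idx][1]


if __name__ == "__main__":
    # Hypothetical backend names.
    ring = HashRing(["backend-a", "backend-b", "backend-c", "backend-d"])
    hot = "/wiki/Special:CentralAutoLogin/start?type=script"
    # Every request for the same URL hashes to the same point on the ring.
    print({ring.backend_for(hot) for _ in range(1000)})
```

Because the mapping depends only on the URL, every miss or pass for one very popular uncacheable URL lands on the same backend, regardless of how many backends are in the pool.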
