https://bugzilla.wikimedia.org/show_bug.cgi?id=45877
Krinkle <krinklem...@gmail.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
            Summary|ResourceLoader caches       |ResourceLoader: Bad cache
                   |missing module for many     |stuck due to race condition
                   |minutes after it's          |with scap between load.php
                   |available                   |and index.php server

--- Comment #5 from Krinkle <krinklem...@gmail.com> ---
(In reply to Krinkle from comment #3)
> This bug (bug 45877) is about the race condition where one server is already
> embedding request urls in the page (or mw.loader.load calls) while the
> module in question is not yet available on the apache server that the
> request will be made to.
>
> e.g.
> en.wikipedia.org --> srv123 @r12 --> outputs html mw.loader.load('foo')
> -> bits.wikimedia.org --> srv214 @r11 --> mw.loader.state('foo', 'missing');
>
> Where that exact url (with that timestamp) will get cached for 30 days.
>
> Now it won't be broken for 30 days because the startup manifest
> en.wikipedia.org requests context work with (from bits) will be rebuilt
> every 5 minutes.
>
> I'm not sure what the proper solution is for this problem. Perhaps syncing
> to bits first, though that might bring the opposite problem, which is likely
> less visible, but might be equally problematic.

Rephrasing bug summary to more accurately reflect this.

The different relevant scenarios:

1) index.php server first, load.php server second

Whenever a change of any kind is deployed (a new module, or a change to an existing module), the code might arrive on one server first (e.g. the server serving the html, with the module load queue). The client then makes a subsequent request to load.php for those modules, and that request is handled by a server that doesn't yet have the code. In that case, the response will be module.state=missing, which is fine and degrades gracefully. This is not a broken state, simply the old state prior to this particular deployment.
Once the deployment is finished, the next time a client requests the startup module from load.php, the module will be in there with a higher timestamp, so the module=missing response won't be served again.

2) index.php server and load.php/startup first, load.php/module second

The html server gets the change first, ensuring the module is in the load queue (if not already), and sometime before or after this the server handling the startup module gets it too. The client then makes a request for the module with the newer version number in the url, and that request is handled by a server that we haven't synced to yet. It responds with an old version of the module (or module=missing if it's a new module).

This situation doesn't resolve itself within 5 minutes because the bad response is stuck in the frontend varnish cache at the correct url. Touching startup.js won't help either. It can only be resolved by changing the module again (touch any of the relevant module's included files), syncing that, and hoping you don't hit the same race condition (it's rare, but I'd say it happens about 1 in 20 times).
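
The difference between the two scenarios can be illustrated with a small simulation. This is only a sketch under simplifying assumptions, not ResourceLoader or Varnish code: the names Server, FrontendCache, module_url and the "foo@r11" style payloads are all hypothetical, the cache is modelled as a plain dict keyed by url with no expiry, and the url format is made up for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, Tuple

# Hypothetical deployed states: module name -> (version, content).
R11 = {"foo": (11, "foo@r11")}
R12 = {"foo": (12, "foo@r12")}
R13 = {"foo": (13, "foo@r13")}


@dataclass
class Server:
    """Stand-in for one app server holding whatever code was last synced to it."""
    modules: Dict[str, Tuple[int, str]]

    def startup_version(self, name: str) -> int:
        # Version the startup manifest would advertise for this module.
        return self.modules[name][0]

    def load(self, name: str) -> str:
        # Module body as this server currently knows it.
        return self.modules[name][1]


@dataclass
class FrontendCache:
    """Stand-in for the frontend cache: keyed by url only, never revalidated."""
    store: Dict[str, str] = field(default_factory=dict)

    def get(self, url: str, backend: Server, name: str) -> str:
        if url not in self.store:
            self.store[url] = backend.load(name)
        return self.store[url]


def module_url(name: str, version: int) -> str:
    # Made-up url shape; the point is only that the version is part of the cache key.
    return f"/load.php?modules={name}&version={version}"


def scenario_1():
    """Startup and module servers are both still old while html is already new."""
    cache = FrontendCache()
    startup, module = Server(dict(R11)), Server(dict(R11))
    url_before = module_url("foo", startup.startup_version("foo"))
    cache.get(url_before, module, "foo")  # old content cached at the OLD url
    startup.modules, module.modules = dict(R12), dict(R12)  # sync finishes
    url_after = module_url("foo", startup.startup_version("foo"))
    return url_before, url_after, cache.get(url_after, module, "foo")


def scenario_2():
    """Startup server is synced, module server is not: stale body at the NEW url."""
    cache = FrontendCache()
    startup, module = Server(dict(R12)), Server(dict(R11))
    url = module_url("foo", startup.startup_version("foo"))
    during = cache.get(url, module, "foo")  # r11 body cached under the r12 url
    module.modules = dict(R12)  # sync finishes, but the url does not change
    after = cache.get(url, module, "foo")
    # Changing the module again yields a new version and therefore a new url.
    startup.modules, module.modules = dict(R13), dict(R13)
    url_fixed = module_url("foo", startup.startup_version("foo"))
    return url, during, after, cache.get(url_fixed, module, "foo")
```

In this model, scenario_1 recovers on its own because the post-deployment startup manifest points at a different url, while scenario_2 keeps returning the stale body until the module's version (and so its url) changes again.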