I'm not dead set against it, but there are some problems I see with it:

a) It's not well maintained or documented, since people don't really consider
it when changing anything. A concerned volunteer could probably manage this.
For example, dealing with HTTPS could be documented better, and the code
could have some failsafe logic around it (just as with $wgShowIPinHeader);
see the first sketch after this list.
b) It requires additional code paths and complexity everywhere there is
already CDN (squid/varnish) purge logic. Bugs fixed in one may not carry
over to the other, which makes the file cache more vulnerable to bit rot.
c) The cache files are not managed LRU and don't even have an expiry
mechanism. One could put a script on a cron, I guess (I rediscovered the
existence of PruneFileCache.php, which I forgot I wrote), though anyone who
can do that probably also has the rights to install varnish/squid. Hacking
around the lack of LRU requires MediaWiki to try to bound the worst-case
number of cache entries: the page cache only stores the current version of
each page, and the resource loader cache uses a bunch of hit-count and
IP-uniqueness checks to decide whether a load.php cluster of modules is
worth caching the response for (you don't want to cache every combination
of modules that happens to hit the server, only ones that are hit often and
by different sources). There is a cron line and a sketch of that heuristic
after this list.
d) It can only use filesystems, not object stores or anything else. This
means you need to either have only one server, or use NFS, or, if you want
to be exotic, use FUSE over some distributed object store, or use
cephfs/gluster (though if you can do all that, you may as well use
varnish/squid). I'd imagine people would just use NFS, which may do fine
for lots of small-to-moderate-traffic installs (see the config line after
this list). Still, I'd rather someone set up a CDN than install NFS (either
one takes a little work). I'd bet people would use a CDN if it were made
easier to do.
e) I'd rather invest time in documentation, packaging, and core changes to
make a CDN as easy to set up as possible (for people with VMs or their own
physical boxes). Bugs found by third parties and by WMF could then be fixed
to both sides' benefit, since common code paths would be used. Encouraging
squid/varnish usage fits nicely with the idea of encouraging other open
source projects and libraries. Also, using tools designed and heavily
optimized for a certain use case is better than everyone inventing their
own little hacky version of the same thing (e.g. the file cache instead of
a proper CDN).
f) Time spent maintaining hacks that do the work of CDNs to make MediaWiki
faster could be spent on actually making origin requests to MediaWiki
faster and making responses more cache-friendly (e.g. ESI and such; see the
example after this list). For instance, if good ESI support were added,
would the file cache just lag behind, unable to do something similar? One
*could* do an analogous thing with the file cache reconstructing pages from
file fragments... but that seems like a waste of time and code if we can
just make it easy to use a CDN.

In any case, I would not want to see the file cache removed until CDN
support is evaluated, documented, and cleaned up, so people have an easy
alternative in its place. For example, if a bunch of confusing VCL is
needed to use varnish, then few will go through the effort.
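
To be fair, the trivial part of the VCL is short; it's the purge ACLs and
cookie/session handling that get confusing. A minimal (and by itself
insufficient) sketch:

    # Varnish in front of Apache on the same box; a real MediaWiki setup
    # also needs purge ACLs and logged-in cookie handling on top of this.
    backend default {
        .host = "127.0.0.1";
        .port = "8080";
    }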


