> When I build a mirror, I would like to compress the <text
> ...>plaintext</text> to get:
> old_text: ciphertext
> old_flags: utf-8,gzip
> I would like this done for every text revision, so as to save both disk
> space...

Maybe https://www.mediawiki.org/wiki/Manual:Reduce_size_of_the_database
will help. maintenance/storage/compressOld.php will compress older
revisions, optionally using gzip, and you can set the parameters to
compress every revision.

Did you set $wgCompressRevisions in your installation before importing? I'm
not sure if that has effect when building a mirror. It feels like it
should, and/or importDump.php should have some option to compress all
revisions imported; you could file a bug in Phabricator.

> and communication bandwidth between web server and browser.

If I understand you correctly, that's a separate issue. MediaWiki doesn't
send compressed page data to the browser, it sends HTML. However, most
browsers send the
  Accept-Encoding: gzip, deflate
HTTP header, and in response most web servers will gzip the HTML of
MediaWiki pages and other web content. To verify, load a page from your
wiki in your browser and look in your web browser's developer tools'
Network tab for the request and response headers; the latter will probably include
  Content-Encoding: gzip
Or you could do something like `curl -H 'Accept-Encoding: gzip, deflate'
--dump-header - http://localhost/wiki/Main_Page | less` and see what you get.

> 2) Problem
> There is little relevant documentation on <https://www.mediawiki.org>. So
> I
> have run a few experiments.
> exp1) I pipe the plaintext through gzip, escape for MySQL, and build the
> mirror.

I wouldn't try to do this yourself. If import with $wgCompressRevisions =
true doesn't do what you want and you don't want to run a compressOld.php
maintenance step afterwards, I would suggest modifying some PHP somewhere
solely during the import to your mirror to encourage MediaWiki to
compress every revision.

> Please provide documentation as to how mediawiki handles compressed
> old_text.
> a) How is plaintext compressed?

From looking at core/includes/Revision.php, if PHP's gzdeflate() exists
then MediaWiki will use this to compress the contents of old_text.
http://php.net/manual/en/function.gzdeflate.php has some documentation on
how the function works.
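
One practical consequence for your exp1: the PHP manual describes
gzdeflate() as producing a raw DEFLATE stream, which is not the same byte
format that the gzip command-line tool writes (gzip adds its own header and
trailer). Below is a minimal Python sketch of the difference. It is not
MediaWiki code; the helper names raw_deflate and raw_inflate are made up for
illustration, and I am assuming Python's zlib with wbits=-15 is a fair
stand-in for PHP's gzdeflate()/gzinflate().

```python
import gzip
import zlib


def raw_deflate(data: bytes) -> bytes:
    """Compress to a raw DEFLATE stream (no zlib or gzip header)."""
    # wbits=-15 requests a headerless stream; assumed here to match
    # the format PHP documents for gzdeflate().
    compressor = zlib.compressobj(wbits=-15)
    return compressor.compress(data) + compressor.flush()


def raw_inflate(blob: bytes) -> bytes:
    """Decompress a raw DEFLATE stream (stand-in for gzinflate())."""
    return zlib.decompress(blob, wbits=-15)


# Hypothetical revision text, repeated so that it compresses well.
plain = ("Some ''wikitext'' for a revision. " * 20).encode("utf-8")

deflated = raw_deflate(plain)   # headerless DEFLATE
gzipped = gzip.compress(plain)  # gzip container, as the gzip tool writes

# Raw DEFLATE round-trips through the raw inflater.
assert raw_inflate(deflated) == plain

# gzip output starts with the magic bytes 1f 8b, and a raw inflater
# rejects it instead of returning the text.
try:
    raw_inflate(gzipped)
    gzip_is_raw_deflate = True
except zlib.error:
    gzip_is_raw_deflate = False

print(gzipped[:2].hex(), gzip_is_raw_deflate)
```

If that stand-in holds, text piped through the gzip tool would not be
readable by gzinflate(), which is one more reason to let MediaWiki do the
compression itself.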

> b) Is the ciphertext escaped for MySQL after compression?

No idea. old_text is a mediumblob storing binary data. As I understand it,
escaping applies only to transfer in and out of the DB.

> c) How does mediawiki handle old_flags=utf-8,gzip?
> d) How are the contents of old_text unescaped and decompressed for
> rendering?
> e) Where in the mediawiki code should I be looking to understand this
> better?

As above: PHP's gzdeflate()/gzinflate(), called from
Revision::compressRevisionText() and Revision::decompressRevisionText() in
core/includes/Revision.php.
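
To make the read path concrete, here is a rough Python sketch of how I
understand the flags to be used: old_flags is a comma-separated list, and
the "gzip" flag means the blob must be inflated before use. This is my
reading, not a port of the PHP. The function name decode_old_text is
hypothetical, it only handles the two flags from your example, and it
assumes zlib with wbits=-15 matches PHP's gzinflate().

```python
import zlib


def decode_old_text(blob: bytes, old_flags: str) -> str:
    """Turn a stored old_text blob back into text, guided by old_flags."""
    flags = [flag for flag in old_flags.split(",") if flag]

    # This sketch only knows the two flags discussed in this thread;
    # anything else is rejected rather than guessed at.
    unknown = set(flags) - {"utf-8", "gzip"}
    if unknown:
        raise ValueError(f"unhandled flags: {sorted(unknown)}")

    if "gzip" in flags:
        # Raw DEFLATE (headerless), assumed equivalent to gzinflate().
        blob = zlib.decompress(blob, wbits=-15)

    # Assumption: the text is UTF-8 whether or not the flag is present.
    return blob.decode("utf-8")


# Usage: build a blob the way the sketch expects, then read it back.
text = "Hello, ''wiki'' world!"
compressor = zlib.compressobj(wbits=-15)
stored = compressor.compress(text.encode("utf-8")) + compressor.flush()

print(decode_old_text(stored, "utf-8,gzip") == text)
```

Note that nothing here unescapes anything: the blob is treated as plain
binary once it is out of the database.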

Hope this helps. I didn't know anything about this 25 minutes ago :)

=S Page  WMF Tech writer
