https://bugzilla.wikimedia.org/show_bug.cgi?id=1935

Roan Kattouw <roan.katt...@gmail.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |roan.katt...@gmail.com

--- Comment #3 from Roan Kattouw <roan.katt...@gmail.com> 2011-01-21 23:30:18 UTC ---
(In reply to comment #2)
> Quoting Roan in
> http://permalink.gmane.org/gmane.science.linguistics.wikipedia.technical/51583
> 
> '''
> Wikimedia doesn't technically use delta compression. It concatenates a
> couple dozen adjacent revisions of the same page and compresses that
> (with gzip?), achieving very good compression ratios because there is
> a huge amount of duplication in, say, 20 adjacent revisions of
> [[Barack Obama]] (small changes to a large page, probably a few
> identical versions due to vandalism reverts, etc.). However,
> decompressing it just gets you the raw text, so nothing in this
> storage system helps generation of diffs. Diff generation is still
> done by shelling out to wikidiff2 (a custom C++ diff implementation
> that generates diffs with HTML markup like <ins>/<del>) and caching
> the result in memcached.
> 
> '''
>
...and I was wrong, see the replies to that post. We actually DO use
delta-based storage, almost exactly in the way you propose.
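
For anyone following along, the concatenate-then-compress idea from the quoted
post can be sketched roughly as below. This only illustrates the quoted
description; it is not Wikimedia's storage code and it is not the delta-based
scheme. The function names, the NUL separator and the synthetic page text are
all assumptions made for the example.

```python
import random
import zlib

SEPARATOR = "\x00"  # assumed delimiter; must never occur in revision text


def make_revisions(count: int, seed: int = 0) -> list[str]:
    """Build hypothetical revisions: one largish page plus a small edit each time."""
    rng = random.Random(seed)
    letters = "abcdefghijklmnopqrstuvwxyz"
    words = [
        "".join(rng.choice(letters) for _ in range(rng.randint(3, 9)))
        for _ in range(1200)
    ]
    text = " ".join(words)  # a few KB, well under DEFLATE's 32 KiB window
    revisions = []
    for i in range(count):
        text += f"\nSmall edit number {i}."
        revisions.append(text)
    return revisions


def size_individually(revisions: list[str]) -> int:
    """Total bytes when every revision is compressed on its own."""
    return sum(len(zlib.compress(r.encode("utf-8"))) for r in revisions)


def pack(revisions: list[str]) -> bytes:
    """Concatenate adjacent revisions and compress them as a single blob."""
    return zlib.compress(SEPARATOR.join(revisions).encode("utf-8"))


def unpack(blob: bytes) -> list[str]:
    """Decompress a blob back into raw revision texts (no diff information)."""
    return zlib.decompress(blob).decode("utf-8").split(SEPARATOR)


if __name__ == "__main__":
    revs = make_revisions(20)
    print("compressed individually:", size_individually(revs), "bytes")
    print("compressed concatenated:", len(pack(revs)), "bytes")
```

Note that unpack() only hands back the raw texts, which matches the point in
the quoted post that this kind of storage does not help diff generation. The
gain also depends on repeated text falling inside the compressor's window
(32 KiB for DEFLATE), so much larger pages would benefit less from this sketch.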

-- 
Configure bugmail: https://bugzilla.wikimedia.org/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are the assignee for the bug.
You are on the CC list for the bug.

_______________________________________________
Wikibugs-l mailing list
Wikibugs-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikibugs-l
