https://bugzilla.wikimedia.org/show_bug.cgi?id=28956

--- Comment #21 from Ariel T. Glenn <[email protected]> ---
(In reply to comment #20)

> The biggest problem is slowness of xml dumps, so SQL dumps also should be
> created in such way.

If I understand you correctly, you're suggesting that the text revisions be
dumped using e.g. mysqldump in order to make them faster.  While the production
of the XML dumps for WMF projects is very slow for large projects, using
mysqldump isn't feasible, for a few reasons:

* Text revisions live in external storage clusters in separate databases and
tables.  Older revisions might live in a different cluster than newer ones.  For
any given revision the way to find out where the text content is stored is to
check the pointer in the wiki's text table.
* Some text revisions are hidden from public view (deleted or oversighted) and
should not be included in the dumps.
* The XML dumps carry all of the metadata that should accompany the text of
each page, for bot users, researchers and importers alike.  This is a
convenience measure more than anything else, but a very popular one.  Of course
if there were some other proposal for packaging the metadata in the glorious
new dump format to come, this issue could be addressed.
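
To make the first two points concrete, here is a rough sketch in Python of
the per-revision work a dumper has to do and a plain mysqldump cannot.  It is
not the actual dump code.  The row shapes, the "external" flag, the
"DB://cluster/id" pointer form and the DELETED_TEXT bit are assumptions
modelled on the usual MediaWiki schema, and plan_dump and parse_pointer are
hypothetical names.

```python
from typing import Dict, Iterable, Iterator, NamedTuple, Optional, Tuple

# Assumed bit in rev_deleted meaning "text hidden from public view".
DELETED_TEXT = 1


class Revision(NamedTuple):
    rev_id: int
    rev_text_id: int   # key into the wiki's text table
    rev_deleted: int   # bitfield of hidden parts


class TextRow(NamedTuple):
    old_id: int
    old_text: str      # either the content itself or a pointer
    old_flags: str     # comma-separated flags, e.g. "utf-8,external"


def parse_pointer(row: TextRow) -> Optional[Tuple[str, int]]:
    """Return (cluster, blob_id) if the row points at external storage,
    or None if the text is stored in the row itself."""
    if "external" not in row.old_flags.split(","):
        return None
    prefix = "DB://"  # assumed pointer scheme
    if not row.old_text.startswith(prefix):
        raise ValueError(f"unrecognised pointer: {row.old_text!r}")
    parts = row.old_text[len(prefix):].split("/")
    if len(parts) < 2:
        raise ValueError(f"malformed pointer: {row.old_text!r}")
    return parts[0], int(parts[1])


def plan_dump(
    revisions: Iterable[Revision], text_rows: Dict[int, TextRow]
) -> Iterator[Tuple[int, Tuple[str, int]]]:
    """Yield (rev_id, location) for each revision that may be published.

    location is (cluster, blob_id) for external text, or
    ("local", old_id) when the text sits in the text table itself.
    """
    for rev in revisions:
        if rev.rev_deleted & DELETED_TEXT:
            continue  # hidden text must not reach the public dump
        row = text_rows[rev.rev_text_id]
        pointer = parse_pointer(row)
        yield rev.rev_id, pointer if pointer else ("local", row.old_id)
```

Even in this toy form, every revision needs a visibility check plus a lookup
that may land on a different cluster from its neighbour, which is why a
straight table dump doesn't map onto the problem.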

-- 
You are receiving this mail because:
You are on the CC list for the bug.
_______________________________________________
Wikibugs-l mailing list
[email protected]
https://lists.wikimedia.org/mailman/listinfo/wikibugs-l