On 15.09.2017 at 19:49, Erik Zachte wrote:
> Computing the hashes on the fly for the offline analysis doesn't work for
> Wikistats 1.0, as it only parses the stub dumps, without article content,
> just metadata.
> Parsing the full archive dumps is quite expensive, time-wise.

We can always compute the hash while writing the XML dumps that contain the
full content (the text is already loaded at that point, so no big deal), and
then generate the metadata-only XML dump from the full dump.
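
A minimal sketch of that flow, to make the idea concrete. This is not the
actual dump code: the element names, the function names build_full_dump and
derive_stub_dump, and the hex-encoded SHA-1 are all assumptions chosen for
illustration, and a real dump would be streamed rather than held in memory.

```python
import hashlib
import xml.etree.ElementTree as ET


def build_full_dump(revisions):
    """Build a full-content dump, hashing each text while it is in memory."""
    root = ET.Element("dump")  # hypothetical root element name
    for rev_id, text in revisions:
        rev = ET.SubElement(root, "revision")
        ET.SubElement(rev, "id").text = str(rev_id)
        # The text is already loaded here, so hashing it adds little cost.
        digest = hashlib.sha1(text.encode("utf-8")).hexdigest()
        ET.SubElement(rev, "sha1").text = digest
        ET.SubElement(rev, "text").text = text
    return root


def derive_stub_dump(full_root):
    """Copy the full dump, dropping <text> but keeping the stored hash."""
    stub_root = ET.Element(full_root.tag)
    for rev in full_root.iter("revision"):
        stub_rev = ET.SubElement(stub_root, "revision")
        for child in rev:
            if child.tag != "text":
                ET.SubElement(stub_rev, child.tag).text = child.text
    return stub_root


full = build_full_dump([(1, "Hello"), (2, "World")])
stub = derive_stub_dump(full)
print(ET.tostring(stub, encoding="unicode"))
```

The stub dump then carries the hash without anyone having to parse the full
content a second time.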


-- 
Daniel Kinzler
Principal Platform Engineer

Wikimedia Deutschland
Gesellschaft zur Förderung Freien Wissens e.V.

_______________________________________________
Wikitech-l mailing list
[email protected]
https://lists.wikimedia.org/mailman/listinfo/wikitech-l
