On Thu, 16 Dec 2010 07:50:56 +0200, Andrew Dunbar <[email protected]> wrote:
> On 15 December 2010 20:24, Manuel Schneider <[email protected]> wrote:
>> Hi Andrew,
>>
>> maybe you'd like to check out ZIM: This is an standardized file format
>> for compressed HTML dumps, focused on Wikimedia content at the moment.
>>
>> There is some C++ code around to read and write ZIM files and there are
>> several projects using that, eg. the WP1.0 project, the Israeli and
>> Kenyan Wikipedia Offline initiatives and more. Also the Wikimedia
>> Foundation is currently in progress to adopt the format to provide ZIM
>> files from Wikimedia wikis in the future.
>
> This is very interesting and I'll be watching it. Where do the HTML
> dumps come from?

I generate the HTML dumps myself, using a customized version of the dumpHTML extension and additional scripts.

Emmanuel
_______________________________________________
Wikitech-l mailing list
[email protected]
https://lists.wikimedia.org/mailman/listinfo/wikitech-l
