* MZMcBride <[email protected]> [2012-09-10 02:45]:
> K. Peachey wrote:
> > On Mon, Sep 10, 2012 at 8:01 AM, MZMcBride wrote:
> >> page) is just absurd. There's enormous value in the HTML dumps. This subject
> >> came up in December 2011 and from the comments in that thread, it seemed as
> >> though the only reason the HTML dumps haven't been updated is that nobody has
> >> run the relevant script.
> > 
> > AFAIK, E:DumpHTML needs some loving first.
> 
> Can you elaborate on this? Is there anything actually stopping the extension
> (or rather the script) from being run? Of course every piece of software has
> bugs or feature requests, but if there are blockers to actually running this
> script, can you point me to the list of these (or, preferably, add them
> as blockers to bug 15017)?
> 
> For context, "E:DumpHTML" refers to
> <https://www.mediawiki.org/wiki/Extension:DumpHTML>, a pseudo-extension
> (quasi-extension?) used to generate HTML dumps.

I use this extension on my wiki (http://spiele.j-crew.de/,
http://misc.j-crew.de/wiki-dump/), but I find it quite brittle in the
face of MediaWiki software changes. Every few months, a change in trunk
breaks the extension in one way or another.

I recently submitted a bunch of fixes for the extension (see
https://gerrit.wikimedia.org/r/#/c/17697/). These changes worked for me
a few months ago, but on current trunk, image handling in DumpHTML is
broken again: filename mangling of images seems broken, and thumbnails
are no longer included in the dump.

I think HTML dumps of Wikipedia would be very useful, but the extension
needs someone at WMF to actively maintain it.

Best regards
Thomas


_______________________________________________
Wikitech-l mailing list
[email protected]
https://lists.wikimedia.org/mailman/listinfo/wikitech-l
