On 09-09-2010, Thu, at 20:08 +0200, Roan Kattouw
wrote:
> 2010/9/8 David Gerard <dger...@gmail.com>:
> > This is something that's been a problem for years now.
> >
> > I do not think there is any sort of deliberate intent. However,
> > keeping the data close is a way to proprietise a wiki even if it's
> > free content, so making it easy to fork is an important attitude to
> > maintain.
> >
> > I realise this is difficult when the devs have to work as hard as
> > possible just to keep everything from falling over ...
> >
> That's right, there is no deliberate intent and it's really a lack of
> people on the ops side (dumps are an ops thing, not a dev thing, and
> devs generally can't do much to help here). WMF is also not "ignoring"
> requests to provide image dumps, it just hasn't gotten around to
> setting them up yet; presumably, this is because text dumps aren't
> running smoothly yet (I'd appreciate a reply from Ariel Glenn to get
> the facts here, but since Ariel is out of the country I may or may not
> get my wish).
> 
> It's true that the dumps situation is still a problem, but you (OP)
> should assume some good faith here rather than accusing the WMF of
> ignoring you, not earning the community's trust or even trying to
> usurp Wikipedia. You're right, you are being paranoid.

I am not thinking about image dumps at all.  I am concentrating on the
regular XML dumps which have been in sorry shape for various reasons
ever since I started as a volunteer in the community adding content.
(note that I am not laying blame about the sorry state, that's not the
point).

For the rest of September I'll be fooling with these parallel runs until
I get something that seems to perform well.  For the next 5-6 days I'm
out of action on them but after that it's back to the grind on them.
Today, though I should have been working on something else, I spent the
time crunching some numbers and trying to figure out what better chunk
sizes ought to be.  Since the earlier articles have by far the bulk of
the revisions, equal-sized chunks won't do, and it turns out I need to
write some code to handle that.
Anyways, either I'm (mostly) hard at work on this problem or I'm
secretly plotting to run off with all the old copies of wikipedia to
Bermuda and retire.... :-P
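
To illustrate the chunk-size point, here is a minimal sketch (not the
actual dump code) of splitting a page ID range into contiguous chunks
that hold roughly equal numbers of revisions, so that the early,
revision-heavy pages end up in smaller ranges.  The function name, the
per-page revision counts and the example numbers are all hypothetical.

```python
def chunk_by_revisions(rev_counts, num_chunks):
    """Split page IDs 1..N into contiguous ranges with similar revision totals.

    rev_counts[i] is the (hypothetical) revision count of page ID i + 1.
    Returns a list of inclusive (first_page_id, last_page_id) tuples.
    """
    total = sum(rev_counts)
    chunks = []
    start = 1
    running = 0
    for page_id, count in enumerate(rev_counts, start=1):
        running += count
        # Cumulative revision count at which the current chunk should end.
        target = total * (len(chunks) + 1) / num_chunks
        # The last chunk is left open so it can absorb the remaining pages.
        if running >= target and len(chunks) < num_chunks - 1:
            chunks.append((start, page_id))
            start = page_id + 1
    if start <= len(rev_counts):
        chunks.append((start, len(rev_counts)))
    return chunks


# Made-up counts: the first page has as many revisions as all the rest.
print(chunk_by_revisions([100, 50, 20, 10, 10, 5, 5], 2))
```

With these made-up counts the first chunk is page 1 alone and the
second covers pages 2 through 7, which is the kind of skew the early
articles cause.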

Off of the dumps page on wikitech
http://wikitech.wikimedia.org/view/Dumps
there's a link to a page where I'm starting to keep updates, now that
there is an actual run going.  I may shoot this run and restart this
piece in a few days, but what the heck, at least there's some
information there.  Also there's a link to a wish list for the XML
dumps; if the image dumps aren't listed there please add them.  I'm not
going to try to think about how feasible or not they might be right now
though, brain too full.

Happy trails,

Ariel





_______________________________________________
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l
