On 3/25/09 10:08 AM, Christian Storm wrote:
> Thanks to everyone who got the enwiki dumps going again!  Should we expect
> more regular dumps now?  What was the final solution of fixing this?
>
>

Lots of love and upkeep by everyone :)

But really it needs to be more automated and parallelised so that 
we can spot issues faster, validate inconsistencies, and finish quicker.

Brion and I have met about this and we've even brought it into the 
Wikimedia dev meetings to brainstorm how the system could change for the 
better.

I've started drafting some new ideas at 
http://wikitech.wikimedia.org/view/Data_dump_redesign

covering the various problems that we're facing and what kind of job 
management we can put around them. We're taking this on as a full "should 
have been done 2 years ago" project and I'm going to be shepherding it along.

Right now I'm collecting stats about the throughput of the components to 
see how much of this could be farmed out in parallel to a job management 
system.

This is a large project that has some distinct problem areas that we'll 
be isolating and welcoming help on.

--tomasz



_______________________________________________
Wikitech-l mailing list
[email protected]
https://lists.wikimedia.org/mailman/listinfo/wikitech-l
