Anthony, the process is linear: you have a PHP script inserting X rows per
Y time frame. Yes, rebuilding the externallinks, links, and langlinks
tables will take some additional time and won't scale, but I have been
working with the Toolserver since 2007 and I've lost count of the number
of times the TS has needed to re-import a cluster (s1-s7), and even enwiki
can be done in a semi-reasonable timeframe. The WMF actually compresses
all text blobs, not just old versions. A complete download and
decompression of simple took only 20 minutes on my two-year-old
consumer-grade laptop with a standard home cable internet connection; the
same download on the Toolserver (minus decompression) took 88 seconds.
Yes, importing will take a little longer, but it shouldn't be that big of
a deal. There will also be some needed cleanup tasks. The main point
stands, though: archiving and restoring WMF wikis isn't a problem, and
with moderately recent hardware it is no big deal. I'm putting my money
where my mouth is and gathering actual, valid stats and figures. The
scaling may not be an exact 1:1 ratio, but given the basics of how
importing a dump functions, it should remain close to the same ratio.
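The rate-based argument above can be sketched in a few lines. This is a back-of-the-envelope illustration only: the revision counts and the calibration numbers below are hypothetical placeholders, not measurements from the thread.

```python
# Sketch of the "X revisions per Y time unit" scaling argument.
# All concrete numbers here are illustrative assumptions.

def estimated_import_time(revisions: float, revisions_per_second: float) -> float:
    """Estimate wall-clock import time, assuming a constant insert rate."""
    return revisions / revisions_per_second

# Hypothetical calibration: time a small-wiki import to get the rate...
simplewiki_revisions = 5_000_000        # assumed revision count
simplewiki_import_seconds = 20 * 60     # assumed 20-minute import
rate = simplewiki_revisions / simplewiki_import_seconds

# ...then extrapolate to a larger wiki at the same rate.
enwiki_revisions = 500_000_000          # assumed revision count
seconds = estimated_import_time(enwiki_revisions, rate)
print(f"estimated enwiki import: {seconds / 86400:.1f} days")
```

Whether the rate actually stays constant at 100x the data volume is exactly the point under dispute; the sketch only shows what the linear model predicts if it does.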

On Thu, May 17, 2012 at 12:54 AM, Anthony <wikim...@inbox.org> wrote:

> On Thu, May 17, 2012 at 12:45 AM, John <phoenixoverr...@gmail.com> wrote:
> > Simple.wikipedia is nothing like en.wikipedia: I care to dispute that
> > statement. All WMF wikis are set up basically the same (an odd extension
> > here or there differs, and namespace names vary at times), but for the
> > purpose of recovery simplewiki_p is a very standard example. This issue
> > isn't just about enwiki_p but *all* WMF wikis. Doing a data recovery for
> > enwiki vs. simplewiki is just a matter of time; for enwiki a 5-day
> > estimate would be fairly standard (depending on server setup), with
> > lower times for smaller databases. Typically you can express it as a
> > rate of X revisions processed per Y time unit, regardless of the
> > project, and that rate should be similar for everything given the same
> > hardware setup.
>
> Are you compressing old revisions, or not?  Does the WMF database
> compress old revisions, or not?
>
> In any case, I'm sorry, a 20 gig mysql database does not scale
> linearly to a 20 terabyte mysql database.
>
_______________________________________________
Wikimedia-l mailing list
Wikimedia-l@lists.wikimedia.org
Unsubscribe: https://lists.wikimedia.org/mailman/listinfo/wikimedia-l