https://bugzilla.wikimedia.org/show_bug.cgi?id=34104

--- Comment #4 from Asher Feldman <[email protected]> 2012-02-08 01:53:41 
UTC ---
The revision.rev_sha1 column addition will be problematic on enwiki in our
current production environment.

I took the enwiki slave db38 out of the production and ran:

mysql -e "SET SQL_LOG_BIN = 0; ALTER TABLE /*$wgDBprefix*/revision ADD rev_sha1
varbinary(32) NOT NULL default ''" enwiki

It took over 27 hours to complete, and may take another ~8 hours for
replication to catch up.  We can go ahead and do this on every db individually,
but it will take quite a while, and may require pushing back the release cycle.
 It also increased ibdata1 by 200GB.

Due to the huge amount of time required, we may want to complete the other
migrations on db38 and then rebuild all of the other enwiki db's from a
hotbackup of it, except for analytics systems where this isn't an option.  We
could also take the opportunity to redo db38 from a full sql dump of enwiki in
order to switch to file-per-table mode, but this could take an additional week.

Or we can scuttle this change for a future release and until then, examine ways
to partition the revision table or store it differently, obtain SSD's for
enwiki db's, or store sha1's in their own table and push the cost to queries
that join against it.  As its currently unindexed, it doesn't appear it will be
of any use except in offline cases anyways.


FYI, I'm also experimenting with Percona's pt-online-schema-change tool, since
our revision table meets its requirement of having a single column unique
index.  

On db1043, an enwiki slave in eqiad, I am running :

pt-online-schema-change t=enwiki.revision --alter "ADD rev_sha1 varbinary(32)
NOT NULL default ''" --sleep 0.1

This writes 1000 rows at a time to a __tmp_revision table and then sleeps 100ms
between chunks.  Replication couldn't keep up at all with more aggressive
settings.  With these settings, replication kept up for around 12 hours, but is
now 3.5 hours behind.  It's been running for 25 hours and the tool estimates it
will require 29 more hours to complete, by which time it will be further behind
on replication.  We'd need a larger sleep and/or smaller chunks to avoid
seriously backing up replication, which may not even work.

-- 
Configure bugmail: https://bugzilla.wikimedia.org/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are on the CC list for the bug.

_______________________________________________
Wikibugs-l mailing list
[email protected]
https://lists.wikimedia.org/mailman/listinfo/wikibugs-l

Reply via email to