Bug 2939 [1] is one bug relevant to this; it could probably use an index.

[1] https://bugzilla.wikimedia.org/show_bug.cgi?id=2939
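
For reference, the revert detection described in the quoted message below (looking for the same hash twice within a single page) could be sketched roughly as follows. This is only an illustration under stated assumptions: the names `sha1_hex` and `find_reverts`, the `(rev_id, hash)` tuple layout, and hashing the raw text as UTF-8 hex are all hypothetical and not taken from r94289 or the dump format.

```python
import hashlib
from typing import Dict, Iterable, List, Tuple


def sha1_hex(text: str) -> str:
    """Hex SHA-1 of revision text (assumed encoding: UTF-8)."""
    return hashlib.sha1(text.encode("utf-8")).hexdigest()


def find_reverts(revisions: Iterable[Tuple[int, str]]) -> List[Tuple[int, int]]:
    """Flag revisions whose hash matches an earlier revision of the same page.

    `revisions` is (rev_id, hash) for ONE page, oldest first.
    Returns (reverting_rev_id, restored_rev_id) pairs.
    """
    first_seen: Dict[str, int] = {}  # hash -> earliest rev_id with that hash
    reverts: List[Tuple[int, int]] = []
    prev_hash = None
    for rev_id, digest in revisions:
        # A repeat of the immediately preceding hash is a null edit, not a revert.
        if digest in first_seen and digest != prev_hash:
            reverts.append((rev_id, first_seen[digest]))
        first_seen.setdefault(digest, rev_id)
        prev_hash = digest
    return reverts


# Hypothetical page history: good text, vandalism, then a revert.
history = [
    (1, sha1_hex("good text")),
    (2, sha1_hex("vandalized text")),
    (3, sha1_hex("good text")),
]
print(find_reverts(history))  # [(3, 1)]
```

Only the hashes are needed, which is why this could run against a stub dump. A hash collision would show up here as a false positive, i.e. an edit flagged as a revert when it is not one.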
~Daniel Friesen (Dantman, Nadir-Seen-Fire) [http://daniel.friesen.name]

On 11-09-02 05:20 PM, Asher Feldman wrote:
> Would it be possible to generate offline hashes for the bulk of our revision
> corpus via dumps and load that into prod to minimize the time and impact of
> the backfill?
>
> When using for analysis, will we wish the new columns had partial indexes
> (first 6 characters?)
>
> Is code written to populate rev_sha1 on each new edit?
>
> On Thu, Aug 18, 2011 at 7:40 AM, Diederik van Liere
> <[email protected]> wrote:
>
>> Hi!
>> I am starting this thread because Brion's revision r94541 reverted
>> r94289 [0] stating "core schema change with no discussion" [1].
>> Bugs 21860 [2] and 25312 [3] advocate for the inclusion of a hash
>> column (either md5 or sha1) in the revision table. The primary use
>> case of this column will be to assist detecting reverts. I don't think
>> that data integrity is the primary reason for adding this column. The
>> huge advantage of having such a column is that it will no longer be
>> necessary to analyze full dumps to detect reverts; instead you can
>> look for reverts in the stub dump file by looking for the same hash
>> within a single page. The fact that there is a theoretical chance of a
>> collision is not very important IMHO; it would just mean that in very
>> rare cases in our research we would flag an edit as reverted when
>> it's not. The two bug reports contain quite long discussions and this
>> feature has also been discussed internally quite extensively, but oddly
>> enough it hasn't happened yet on the mailing list.
>>
>> So let's have a discussion!
>>
>> [0] http://www.mediawiki.org/wiki/Special:Code/MediaWiki/94289
>> [1] http://www.mediawiki.org/wiki/Special:Code/MediaWiki/94541
>> [2] https://bugzilla.wikimedia.org/show_bug.cgi?id=21860
>> [3] https://bugzilla.wikimedia.org/show_bug.cgi?id=25312
>>
>> Best,
>>
>> Diederik
>>
>> _______________________________________________
>> Wikitech-l mailing list
>> [email protected]
>> https://lists.wikimedia.org/mailman/listinfo/wikitech-l
