On 11-09-19 12:57 PM, Brion Vibber wrote:
> On Mon, Sep 19, 2011 at 12:53 PM, Asher Feldman <[email protected]> wrote:
>
>> Since the primary use case here seems to be offline analysis and it may not
>> be of much interest to MediaWiki users outside of WMF, can we store the
>> checksums in new tables (e.g. revision_sha1) instead of running large
>> ALTERs, and implement the code to generate checksums on new edits via an
>> extension?
>>
>> Checksums for most old revs can be generated offline and populated before
>> the extension goes live. Since nothing will be using the new table yet,
>> there'd be no issues with things like gap lock contention on the revision
>> table from mass-populating it.
>
> That's probably the simplest solution; adding a new empty table will be very
> quick. It may make it slower to use the field, though, depending on what
> uses/exposes it.
>
> During stub dump generation, for instance, this would need to add a left
> outer join on the other table and add things to the dump output (which also
> needs an update to the XML schema for the dump format). This would then need
> to be preserved through subsequent dump passes as well.
>
> -- brion

Revision is going to need to either make a JOIN whenever it grabs revision
info, or make an additional db query whenever someone uses the checksum.
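To make the trade-off concrete, here is a minimal sketch of the side-table design using SQLite in Python (MediaWiki itself is PHP; the table name revision_sha1 comes from the thread, while the column names rs_rev_id and rs_sha1 are made up for illustration). It shows the LEFT OUTER JOIN Brion describes: every revision row comes back, with NULL where the checksum has not been back-filled yet.

```python
import sqlite3

# In-memory sketch: checksums live in their own revision_sha1 table keyed
# on rev_id, so the main revision table needs no ALTER.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE revision (rev_id INTEGER PRIMARY KEY, rev_text TEXT);
    CREATE TABLE revision_sha1 (rs_rev_id INTEGER PRIMARY KEY, rs_sha1 TEXT);
    INSERT INTO revision VALUES (1, 'first edit'), (2, 'second edit');
    -- Only rev 1 has been back-filled with a checksum so far.
    INSERT INTO revision_sha1 VALUES (1, 'abc123');
""")

# The LEFT OUTER JOIN keeps revisions that have no checksum row yet,
# returning NULL (None) for the missing rs_sha1.
rows = conn.execute("""
    SELECT r.rev_id, rs.rs_sha1
    FROM revision r
    LEFT OUTER JOIN revision_sha1 rs ON rs.rs_rev_id = r.rev_id
    ORDER BY r.rev_id
""").fetchall()
# rows == [(1, 'abc123'), (2, None)]
```

The alternative Daniel mentions, a second query only when the checksum is actually requested, avoids the join cost on the common path at the price of an extra round-trip per checksum lookup.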
Btw, instead of having Revision return a checksum string and needing to check
what type it is with a second method (best to program generically in case we
do switch checksum types), how about we return an instance of a simple SHA1
wrapper class? We could have an MD5 one too, and use a simple descriptive
method instead of having to manually call wfBaseConvert with the right args
when you want something good for filesystem use.

~Daniel Friesen (Dantman, Nadir-Seen-Fire) [http://daniel.friesen.name]

_______________________________________________
Wikitech-l mailing list
[email protected]
https://lists.wikimedia.org/mailman/listinfo/wikitech-l
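A rough sketch of the wrapper-class idea, in Python for brevity (MediaWiki would do this in PHP; the class and method names here are invented for illustration). Callers get an object whose type identifies the algorithm, and a descriptive for_filesystem() method replaces hand-rolled base conversion (the base-36 output mirrors what wfBaseConvert($hash, 16, 36) would give):

```python
import hashlib

class ContentHash:
    """Generic checksum wrapper so callers need not hard-code the algorithm."""
    algo = None  # subclasses set this to a hashlib algorithm name

    def __init__(self, data: bytes):
        self._hex = hashlib.new(self.algo, data).hexdigest()

    def hex(self) -> str:
        """Canonical lowercase hex digest."""
        return self._hex

    def for_filesystem(self) -> str:
        """Shorter base-36 form, safe for filenames; stands in for a
        manual wfBaseConvert(hash, 16, 36) call."""
        n = int(self._hex, 16)
        digits = "0123456789abcdefghijklmnopqrstuvwxyz"
        out = ""
        while n:
            n, r = divmod(n, 36)
            out = digits[r] + out
        return out or "0"

class Sha1Hash(ContentHash):
    algo = "sha1"

class Md5Hash(ContentHash):
    algo = "md5"
```

Switching checksum types then only means returning a different subclass; callers that use isinstance checks or just call for_filesystem() keep working unchanged.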
