On 11-09-19 12:57 PM, Brion Vibber wrote:
> On Mon, Sep 19, 2011 at 12:53 PM, Asher Feldman <[email protected]>wrote:
>
>> Since the primary use case here seems to be offline analysis and it may not
>> be of much interest to mediawiki users outside of wmf, can we store the
>> checksums in new tables (e.g. revision_sha1) instead of running large
>> alters, and implement the code to generate checksums on new edits via an
>> extension?
>>
>> Checksums for most old revs can be generated offline and populated before
>> the extension goes live.  Since nothing will be using the new table yet,
>> there'd be no issues with things like gap lock contention on the revision
>> table from mass populating it.
>>
> That's probably the simplest solution; adding a new empty table will be very
> quick. It may make it slower to use the field though, depending on what all
> uses/exposes it.
>
> During stub dump generation for instance this would need to add a left outer
> join on the other table, and add things to the dump output (and also needs
> an update to the XML schema for the dump format). This would then need to be
> preserved through subsequent dump passes as well.
>
> -- brion
The Revision class is going to need to either add a JOIN whenever it
grabs revision info, or make an additional db query whenever someone
actually uses the checksum.
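
For illustration, the join pattern that implies might look something like
this (a sketch using SQLite; the revision_sha1 table name comes from
Asher's suggestion, but the rs_* column names and sample data here are
made up):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE revision (rev_id INTEGER PRIMARY KEY, rev_text TEXT);
    CREATE TABLE revision_sha1 (rs_rev_id INTEGER PRIMARY KEY, rs_sha1 TEXT);
    INSERT INTO revision VALUES (1, 'first edit'), (2, 'second edit');
    -- Only revision 1 has been back-filled with a checksum so far.
    INSERT INTO revision_sha1 VALUES (1, 'placeholder-checksum-1');
""")

# LEFT OUTER JOIN so revisions whose checksum hasn't been populated yet
# still come back, just with NULL in the checksum column.
rows = conn.execute("""
    SELECT rev_id, rev_text, rs_sha1
    FROM revision
    LEFT OUTER JOIN revision_sha1 ON rs_rev_id = rev_id
    ORDER BY rev_id
""").fetchall()
```

The alternative is a second query against revision_sha1 only when a
caller asks for the checksum, which keeps the common path cheap.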

Btw, instead of having Revision return a checksum string and requiring a
second method call to check what type it is (best to program generically
in case we do switch checksum types), how about we return an instance of
a simple SHA1 wrapper class? We could have an MD5 one too, and use a
simple descriptive method instead of having to manually call
wfBaseConvert with the right args whenever you want something good for
filesystem use.
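
A minimal sketch of that wrapper idea (Python standing in for the PHP;
the class names and the to_base36 method are made up here, mirroring
what wfBaseConvert is used for today):

```python
import hashlib

class Checksum:
    """Wraps a raw hex digest so callers don't hard-code the algorithm."""
    algorithm = None  # set by subclasses

    def __init__(self, hex_digest):
        self.hex = hex_digest

    def to_base36(self):
        # Base-36 output is compact and filesystem-safe; this is the
        # descriptive method replacing manual base-conversion calls.
        n = int(self.hex, 16)
        digits = "0123456789abcdefghijklmnopqrstuvwxyz"
        out = ""
        while n:
            n, r = divmod(n, 36)
            out = digits[r] + out
        return out or "0"

class Sha1Checksum(Checksum):
    algorithm = "sha1"

    @classmethod
    def of(cls, text):
        return cls(hashlib.sha1(text.encode("utf-8")).hexdigest())

class Md5Checksum(Checksum):
    algorithm = "md5"

    @classmethod
    def of(cls, text):
        return cls(hashlib.md5(text.encode("utf-8")).hexdigest())
```

Then Revision can hand back whichever subclass matches the stored data,
and callers ask the object what it is rather than calling a second
Revision method.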

~Daniel Friesen (Dantman, Nadir-Seen-Fire) [http://daniel.friesen.name]


_______________________________________________
Wikitech-l mailing list
[email protected]
https://lists.wikimedia.org/mailman/listinfo/wikitech-l