Hi,

I've suggested generating bulk checksums as well, but both Brion and Ariel see 
the primary purpose of this field as checking the validity of the dump 
generation process, so they want to generate the checksums straight from 
external storage. 

In a general sense, there are two use cases for this new field:
1) Checking the validity of the XML dump files
2) Identifying reverts
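
To make the two use cases concrete, here is a minimal sketch in Python. It is 
only an illustration, not the proposed implementation: the function names are 
made up, and it assumes the checksum is a hex SHA-1 of the UTF-8 revision text 
(the thread only mentions a rev_sha1 column, not how it would be encoded). A 
"revert" here means a revision whose text is identical to an earlier revision 
of the same page.

```python
import hashlib


def sha1_hex(text):
    """Hex SHA-1 of the revision text (assumed encoding: UTF-8)."""
    return hashlib.sha1(text.encode("utf-8")).hexdigest()


def verify_revision(text, expected_checksum):
    """Use case 1: check dumped text against its stored checksum."""
    return sha1_hex(text) == expected_checksum


def find_reverts(revisions):
    """Use case 2: find identity reverts within one page.

    revisions: list of (rev_id, text) tuples in chronological order.
    Returns a list of (reverting_rev_id, restored_rev_id) pairs.
    """
    first_seen = {}        # checksum -> first rev_id that had this text
    reverts = []
    prev_checksum = None
    for rev_id, text in revisions:
        checksum = sha1_hex(text)
        # Same text as an earlier revision, but not a null edit on top
        # of the immediately preceding revision.
        if checksum in first_seen and checksum != prev_checksum:
            reverts.append((rev_id, first_seen[checksum]))
        first_seen.setdefault(checksum, rev_id)
        prev_checksum = checksum
    return reverts


if __name__ == "__main__":
    history = [(1, "original"), (2, "vandalism"), (3, "original")]
    print(find_reverts(history))
```

With a stored checksum, the revert check only has to compare short hashes 
per page instead of loading and comparing the full text of every revision.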

I have started working on a proposal for deployment. It is still incomplete, 
but it might be a good starting point for planning the deployment further. I 
have been trying to come up with back-of-the-envelope calculations of how much 
time and space it would take, but I don't yet have all the information needed 
for reasonable estimates. 

You can find the proposal here: 
http://strategy.wikimedia.org/wiki/Proposal:Implement_and_deploy_checksum_revision_table

I want to thank Brion and Asher for giving feedback on prior drafts. Please 
feel free to improve this proposal.

Best,
Diederik

PS: not sure if this proposal should be on strategy or mediawiki...


On 2011-09-03, at 7:16 AM, Daniel Friesen wrote:

> On 11-09-02 09:33 PM, Rob Lanphier wrote:
>> On Fri, Sep 2, 2011 at 5:47 PM, Daniel Friesen
>> <li...@nadir-seen-fire.com> wrote:
>>> On 11-09-02 05:20 PM, Asher Feldman wrote:
>>>> When using for analysis, will we wish the new columns had partial indexes
>>>> (first 6 characters?)
>>> Bug 2939 is one relevant bug to this, it could probably use an index.
>>> [1] https://bugzilla.wikimedia.org/show_bug.cgi?id=2939
>> My understanding is that having a normal index on a table the size of
>> our revision table will be far too expensive for db writes.
>> ...
>> Rob
> We've got 5 normal indexes on revision:
> - A unique int+int
> - A binary(14)
> - An int+binary(14)
> - Another int+binary(14)
> - And a varchar(255)+binary(14)
> 
> For that bug, an index on (rev_page,rev_sha1) or
> (rev_page,rev_timestamp,rev_sha1) may do.
> 
> -- 
> ~Daniel Friesen (Dantman, Nadir-Seen-Fire) [http://daniel.friesen.name]
> 
> 
> _______________________________________________
> Wikitech-l mailing list
> Wikitech-l@lists.wikimedia.org
> https://lists.wikimedia.org/mailman/listinfo/wikitech-l
