On Sat, Sep 17, 2011 at 8:26 AM, Roan Kattouw <[email protected]> wrote:
> Minor detail: I think it's more likely we'll use SHA-1 hashes rather
> than MD5 hashes.

Is there a good reason to prefer SHA-1?

Both have weaknesses allowing one to construct a collision (with considerable effort), but I don't see why that would matter for the proposed use. With only about 1 billion revisions in the collective databases, the odds of an accidental collision with either MD5 or SHA-1 are infinitesimal (less than 1 in 10^18 for the weaker MD5).

MD5 is shorter and, in my experience, about 25% faster to compute. Personally, I've tended to view MD5 as more than good enough in offline analyses.

-Robert Rohde

_______________________________________________
Wikitech-l mailing list
[email protected]
https://lists.wikimedia.org/mailman/listinfo/wikitech-l
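
The accidental-collision estimate in the message can be checked with the standard birthday bound: for n items hashed to b bits, the chance of any collision is at most n(n-1)/2 divided by 2^b. The sketch below is only an illustration under the assumption that hash outputs are uniformly distributed; the function name and the round figure of 10^9 revisions are illustrative, not part of any MediaWiki code.

```python
from fractions import Fraction

MD5_BITS = 128   # MD5 digest length in bits
SHA1_BITS = 160  # SHA-1 digest length in bits


def collision_probability_bound(n_items: int, hash_bits: int) -> Fraction:
    """Birthday (union) bound on the probability of at least one
    accidental collision among n_items uniformly random hash values."""
    pairs = n_items * (n_items - 1) // 2  # number of distinct pairs
    return Fraction(pairs, 2 ** hash_bits)


# Assumed figure taken from the message: about 1 billion revisions.
n_revisions = 10 ** 9

md5_bound = collision_probability_bound(n_revisions, MD5_BITS)
sha1_bound = collision_probability_bound(n_revisions, SHA1_BITS)

print(f"MD5   bound: {float(md5_bound):.3e}")
print(f"SHA-1 bound: {float(sha1_bound):.3e}")
```

Under these assumptions the MD5 bound comes out on the order of 10^-21, comfortably below the "1 in 10^18" figure quoted above, and the SHA-1 bound is smaller still. This covers accidental collisions only, not deliberately constructed ones.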
