On Sun, Sep 18, 2011 at 1:55 AM, Robert Rohde <raro...@gmail.com> wrote:
> If collision attacks really matter we should use SHA-1.

If collision attacks really matter you should use, at least, SHA-256, no?

> However, do
> any of the proposed use cases care about whether someone might
> intentionally inject a collision?  In the proposed uses I've looked
> at, it seems irrelevant.  The intentional collision will get flagged
> as a revert and the text leading to that collision would be discarded.
>  How is that a bad thing?

Well, what if the checksum of the initial page hasn't been calculated
yet?  Then some miscreant replaces the page with spam whose checksum
collides with the good page's, and then the spam gets reverted.  The
good page would be the one that gets thrown out.
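
To make that scenario concrete, here is a minimal sketch.  Everything
in it is an assumption for illustration: the 8-bit toy_checksum is
deliberately weak so that a collision is trivial to build (it stands in
for a broken hash, not for MD5 itself), and the index dict and
find_revert_target are hypothetical names, not anything in MediaWiki.

```python
def toy_checksum(text):
    """Deliberately weak 8-bit checksum so a collision is trivial to construct."""
    return sum(text.encode("utf-8")) % 256


def find_revert_target(index, text):
    """Return the id of an indexed revision with the same checksum, or None."""
    return index.get(toy_checksum(text))


good = "ab"  # revision 1: the legitimate page text
spam = "ba"  # revision 2: same bytes reordered, so the toy checksum collides

# checksum -> revision id.  Revision 1 has not been checksummed yet,
# so only the spam revision is present in the index.
index = {toy_checksum(spam): 2}

# Restoring the good text now looks like a revert *to the spam revision*.
target = find_revert_target(index, good)
```

Here target ends up pointing at the spam revision, so a tool that
trusts the checksum alone would treat the good text as the duplicate.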

Maybe that's not feasible.  Maybe it is.  Either way, I'd feel very
uncomfortable about the fact that someday someone might decide to use
the checksums in some way in which collisions would matter.

Now I don't know how significant the difference in CPU cost between
computing the two hashes would be.  If it's significant enough, then
fine, use MD5, but make sure there are warnings all over the place
about its use.
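
One rough way to get a feel for the CPU difference is to time the
hashes directly.  This is only a sketch using Python's hashlib and
timeit; the 32 KiB payload, the repeat count, and the time_hash helper
are arbitrary assumptions, and real numbers will depend on the
hardware and on the size of actual revisions.

```python
import hashlib
import timeit


def time_hash(name, payload, repeats=200):
    """Return the seconds taken to hash `payload` `repeats` times with `name`."""
    return timeit.timeit(
        lambda: hashlib.new(name, payload).digest(), number=repeats
    )


if __name__ == "__main__":
    payload = b"x" * 32 * 1024  # stand-in for one revision's text
    for name in ("md5", "sha1", "sha256"):
        print(name, time_hash(name, payload))
```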

(As another possibility, what if someone writes a bot to detect
certain reverts?  I can see spammers/vandals having a field day with
this sort of thing.)

>> For offline analyses, there's no need to change the online database tables.
>
> Need?  That's debatable, but one of the major motivators is the desire
> to have hash values in database dumps (both for revert checks and for
> checksums on correct data import / export).  Both of those are
> "offline" uses, but it is beneficial to have that information
> precomputed and stored rather than frequently regenerated.

Why not in a separate file?  There's no need to get permission from
anyone or mess with the schema to generate a file with revision ids
and checksums.  If WMF won't host it at the regular dump location
(though I can't see why they wouldn't), you could host it at
archive.org.
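
Generating such a file could be as simple as the sketch below.  It
assumes the revisions are already available as (revision id, text)
pairs, for example from parsing a dump; write_checksums, the
tab-separated layout, and the choice of SHA-256 are all just
illustrative.

```python
import csv
import hashlib
import io


def write_checksums(revisions, out):
    """Write one 'rev_id<TAB>sha256-hex' line per (rev_id, text) pair to `out`."""
    writer = csv.writer(out, delimiter="\t", lineterminator="\n")
    for rev_id, text in revisions:
        digest = hashlib.sha256(text.encode("utf-8")).hexdigest()
        writer.writerow([rev_id, digest])


# Usage: write to an in-memory buffer here; a real run would open a file.
buf = io.StringIO()
write_checksums([(1, "hello"), (2, "hello world")], buf)
```

The result is a standalone file that can be published next to a dump
without touching the schema.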

_______________________________________________
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l