Base36 certainly isn't the most efficient way to store a sha1, but it's
what is in use all over mediawiki.  I think there was some discussion on
this list of the tradeoffs of different methods when revision.rev_sha1 was
added, and base36 was picked as a compromise.  I don't know why base36 was
picked over base62 once it was decided to stick with an ascii alpha-numeric
encoding but regardless, there was opposition to binary.   Taken on its
own, an integer index would be more efficient but I don't think it makes
sense if we continue using base36.

On Tue, Sep 25, 2012 at 11:20 AM, Artur Fijałkowski <[email protected]>wrote:

> > <tl;dr> Where an index is desired on a mysql column of base36 encoded
> sha1
> > hashes, I recommend ADD INDEX (sha1column(10)).  Shorter indexes will be
> > sufficient in many cases, but this is still provides a >2/3 space savings
> > while covering a huge (2^51.43) space.
>
> Isn't it better to store BIGINT containing part of (binary) sha1 and
> use index on numeric column?
>
> AJF/WarX
>
> _______________________________________________
> Wikitech-l mailing list
> [email protected]
> https://lists.wikimedia.org/mailman/listinfo/wikitech-l
>
_______________________________________________
Wikitech-l mailing list
[email protected]
https://lists.wikimedia.org/mailman/listinfo/wikitech-l

Reply via email to