Base36 certainly isn't the most efficient way to store a sha1, but it's what is in use all over mediawiki. I think there was some discussion on this list of the tradeoffs of different methods when revision.rev_sha1 was added, and base36 was picked as a compromise. I don't know why base36 was picked over base62 once it was decided to stick with an ascii alpha-numeric encoding but regardless, there was opposition to binary. Taken on its own, an integer index would be more efficient but I don't think it makes sense if we continue using base36.
On Tue, Sep 25, 2012 at 11:20 AM, Artur Fijałkowski <[email protected]>wrote: > > <tl;dr> Where an index is desired on a mysql column of base36 encoded > sha1 > > hashes, I recommend ADD INDEX (sha1column(10)). Shorter indexes will be > > sufficient in many cases, but this is still provides a >2/3 space savings > > while covering a huge (2^51.43) space. > > Isn't it better to store BIGINT containing part of (binary) sha1 and > use index on numeric column? > > AJF/WarX > > _______________________________________________ > Wikitech-l mailing list > [email protected] > https://lists.wikimedia.org/mailman/listinfo/wikitech-l > _______________________________________________ Wikitech-l mailing list [email protected] https://lists.wikimedia.org/mailman/listinfo/wikitech-l
