On Oct 24, 2012, at 11:36 AM, Mark Bergsma <[email protected]> wrote:
> How about this idea:
>
> Just "purge all images with this prefix" doesn't really work in Squid or
> Varnish, because they don't store their cache database in a format that makes
> it cheap to determine which objects would match that. Varnish could do it
> with their "bans", but each ban is kept around for a long time, and with the
> tens, sometimes hundreds of purges a second we do, this would quickly add up
> to a massive ban list.
>
> But... Varnish allows you to customize how it hashes objects into its object
> hash table (vcl_hash). What we could do is hash thumbnails to the same hash
> key as their original. Because of our current URL structure, that's pretty
> much a matter of stripping off the thumbnail postfix. Then the original and
> all its associated thumbnails end up at the same hash key in the hash table,
> and only a single purge for the original would nuke them all out of the cache.
>
> This relies on Varnish having an efficient implementation for storing multiple
> objects at a single hash key. It probably does, since it implements Vary
> processing this way. We would essentially be doing the same, Vary-ing on the
> thumbnail size. But I'll check the implementation to be sure.
I checked, and Varnish stores all variant objects in a linked list per hash
table entry. So once it looks up the hash entry for the URL of the original,
it'll have to do a linear search for the right thumbnail size, matching each
against a Vary header string. If we do this, we'll need to restrict the number
of variants (thumb sizes) so we don't get hundreds/thousands on a single hash
key.
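To make the lookup cost concrete, here's a toy model of that per-entry variant list (names and structure are illustrative, not Varnish internals):

```python
cache = {}  # hash key (original's URL) -> list of (vary_key, body) pairs

def insert(key, vary_key, body):
    """Store one variant (e.g. a thumb size) under the shared hash key."""
    cache.setdefault(key, []).append((vary_key, body))

def lookup(key, vary_key):
    """Hash lookup, then a linear scan over all variants at that key,
    matching each against the Vary value -- this is the O(variants) part."""
    for vk, body in cache.get(key, []):
        if vk == vary_key:
            return body
    return None

def purge(key):
    """One purge of the original's key drops every variant at once."""
    cache.pop(key, None)
```

The scan in lookup() is why the variant count per key needs a cap: every cache hit for a thumbnail pays for walking the whole list, so hundreds or thousands of sizes under one original would hurt.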
Here's a little proof of concept to demonstrate how it could work:
https://gerrit.wikimedia.org/r/#/c/29805/2
--
Mark Bergsma <[email protected]>
Lead Operations Architect
Wikimedia Foundation
_______________________________________________
Wikitech-l mailing list
[email protected]
https://lists.wikimedia.org/mailman/listinfo/wikitech-l