On May 13, 2014 7:13 PM, "Sumana Harihareswara" <suma...@wikimedia.org>
wrote:
>
> I am trying to figure out how thumbnail retrieval & caching works right
> now - with Swift, and the frontline & secondary ("frontend" and
> "backend") Varnishes. (I am working on the caching-related bit of the
> performance guidelines, and want to understand and help push forward on
> https://www.mediawiki.org/wiki/Requests_for_comment/Simplify_thumbnail_cache
> .) I looked for docs but didn't find anything that had been updated this
> year.
>
> Here's how I think it works, assuming you are a MediaWiki developer
> who's written, e.g., a page that includes a thumbnail of an image:
>
> First, your code must get the metadata about the image, which might come
> from the local database, or memcached, or Commons. Then, you need to get
> a thumbnail of the image at the dimensions your page requires. Rather
> than create the thumbnail immediately on demand via parsing the filename
> and dimensions, Wikimedia's MediaWiki is configured to use the "404
> handler." (see [[Manual:Thumb_handler.php]]) Your page first receives a
> URL indicating the eventual location of the thumbnail, then the browser
> asks for that URL. If it hasn't been created yet, the web server
> initially gets an internal 404 error; the 404 handler then kicks off the
> thumbnailer to create the thumbnail, and the response gets sent to the
> client.
>
> As it is sent to the client, each thumbnail is stored in a Swift store
> and stored in our frontline and secondary Varnish caches.
>
> (The Varnish caches cache entire HTTP responses, including thumbnails of
> images, frequently-requested pages, ResourceLoader modules, and similar
> items that can be retrieved by URL. The frontline Varnishes keep these
> in memory. (A weighted-random load balancer (LVS) distributes web
> requests to the front-end Varnishes.) But if a frontline Varnish doesn't
> have a response cached, it passes the request to the secondary Varnishes
> via hash-based load balancing (on the hash of the URL). The secondary
> Varnishes hold more responses, storing them on disk. Every URL is on at
> most one secondary Varnish.)
>
> So, at the end of this whole process, any given thumbnail is in:
> * the Swift thumbnail store (and will persist until the canonical image
> changes, or is deleted, or we run out of space and flush Swift)
> * the frontline and secondary Varnishes (and will persist until the
> canonical image changes, or is deleted, or we restart the frontline
> Varnishes or we evict data from the hard disks of the secondary Varnishes)
>
> Is this right?
>
> --
> Sumana Harihareswara
> Senior Technical Writer
> Wikimedia Foundation
>
> _______________________________________________
> Wikitech-l mailing list
> Wikitech-l@lists.wikimedia.org
> https://lists.wikimedia.org/mailman/listinfo/wikitech-l

That is mostly correct, afaik. The Varnish setup also includes separate
caches in different locations (so during invalidation failures you can have
correct data in, say, the USA but not Europe, which confuses bug reporters
considerably).
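
For reference, the request path described in the quoted message can be
sketched roughly as below. This is a toy model, not the production
configuration: the names Tier, pick_backend, render_thumbnail and fetch are
hypothetical, a single frontend stands in for the LVS-balanced pool, and
plain modulo hashing is assumed where the real setup may hash differently.

```python
import hashlib


class Tier:
    """A toy cache tier: a dict from URL to response bytes."""

    def __init__(self, name):
        self.name = name
        self.store = {}


def pick_backend(url, backends):
    """Hash the URL so a given URL always maps to the same backend."""
    digest = hashlib.sha256(url.encode("utf-8")).digest()
    return backends[int.from_bytes(digest[:8], "big") % len(backends)]


def render_thumbnail(url):
    """Hypothetical stand-in for the thumbnailer run by the 404 handler."""
    return b"thumb:" + url.encode("utf-8")


def fetch(url, frontend, backends, swift):
    """Return (body, source): frontend, then backend, then Swift, then render."""
    if url in frontend.store:
        return frontend.store[url], frontend.name
    backend = pick_backend(url, backends)
    if url in backend.store:
        body, source = backend.store[url], backend.name
    elif url in swift:
        body, source = swift[url], "swift"
    else:
        # Nothing stored yet: the 404 handler creates the thumbnail.
        body, source = render_thumbnail(url), "thumbnailer"
        swift[url] = body
    # The response is cached in both Varnish tiers on its way to the client.
    backend.store[url] = body
    frontend.store[url] = body
    return body, source
```

With one frontend, two backends and an empty Swift dict, the first fetch of
a URL reports "thumbnailer" as its source and the second reports the
frontend; the URL ends up on exactly one backend.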

A thumbnail can also be removed from storage by doing ?action=purge on the
image description page. I believe the Varnish caches only store objects for
a maximum of 30 days (not 100% sure on that). Swift stores them forever.
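
To make the expiry and purge behaviour concrete, here is a minimal sketch.
TtlCache and purge_thumbnail are hypothetical names, the 30-day figure is
the unconfirmed value mentioned above, and real purges are delivered to the
caches over the network rather than by a direct function call.

```python
import time

# Assumed 30-day cap; the message above is not certain of this value.
MAX_AGE_SECONDS = 30 * 24 * 60 * 60


class TtlCache:
    """A toy cache whose entries expire after max_age seconds."""

    def __init__(self, max_age=MAX_AGE_SECONDS, clock=time.time):
        self.max_age = max_age
        self.clock = clock
        self.entries = {}  # url -> (stored_at, body)

    def put(self, url, body):
        self.entries[url] = (self.clock(), body)

    def get(self, url):
        entry = self.entries.get(url)
        if entry is None:
            return None
        stored_at, body = entry
        if self.clock() - stored_at > self.max_age:
            # Too old: drop the entry and report a miss.
            del self.entries[url]
            return None
        return body

    def purge(self, url):
        self.entries.pop(url, None)


def purge_thumbnail(url, swift, caches):
    """Drop a thumbnail from the persistent store and from every cache given."""
    swift.pop(url, None)
    for cache in caches:
        cache.purge(url)
```

If one location's cache is left out of the caches list (a missed purge), it
keeps serving the old thumbnail until the entry ages out, which is the
per-location inconsistency described earlier in this message.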

I'm not sure if it's in scope of what you're trying to document, but HTCP
purging is also an important aspect of how our Varnish cache works, and a
part that historically has exploded several times.

--bawolff