https://bugzilla.wikimedia.org/show_bug.cgi?id=65217

--- Comment #4 from Bawolff (Brian Wolff) <[email protected]> ---
I recently realized that we still download the source file, even if it's above
$wgMaxImageArea (e.g.
https://commons.wikimedia.org/wiki/File:Map_of_New-York_Bay_and_Harbor_and_the_environs_-_founded_upon_a_trigonometrical_survey_under_the_direction_of_F._R._Hassler,_superintendent_of_the_Survey_of_the_Coast_of_the_United_States;_NYPL1696369.tiff
is a 540 MB file, which takes 37 seconds just to get to the error message that
says we aren't even going to attempt to thumbnail the file). I've submitted
https://gerrit.wikimedia.org/r/135101 to fix this.
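
To illustrate the idea of checking the size limit before fetching, here is a
minimal sketch in Python (not MediaWiki's PHP, and not the actual Gerrit
change). It assumes the pixel dimensions are already known from stored
metadata. MAX_IMAGE_AREA, ImageTooLargeError, prepare_thumbnail and
fetch_source are hypothetical names, and the limit value is only
illustrative.

```python
# Illustrative stand-in for $wgMaxImageArea (value chosen for the example only).
MAX_IMAGE_AREA = 12_500_000


class ImageTooLargeError(Exception):
    """Raised when an image exceeds the area limit for thumbnailing."""


def prepare_thumbnail(width, height, fetch_source, max_area=MAX_IMAGE_AREA):
    """Return the source bytes to scale, or fail before any download.

    width and height are assumed to come from already-stored metadata,
    so the check costs nothing. fetch_source is a hypothetical callable
    that performs the expensive download from the file backend.
    """
    if width * height > max_area:
        # Reject up front: the source file is never requested.
        raise ImageTooLargeError(
            f"{width}x{height} exceeds the limit of {max_area} pixels"
        )
    return fetch_source()
```

With this ordering, an oversized file costs one multiplication and one
comparison instead of a multi-hundred-megabyte transfer.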

I've missed many of the events that unfolded around this situation. Looking
back in the mailing list archives, I'm not even clear whether the problem is
Swift being overloaded, or the time taken to actually thumbnail the image
(or both, or something else). One of the earlier emails says:

>We just had a brief imagescaler outage today at approx. 11:20 UTC that
>was investigated and NYPL maps were found to be the cause of the outage.
>Besides the complete outage of imagescaling, Swift's (4Gbps) bandwidth
>was saturated again, which would cause slowdowns and timeouts in file
>serving as well.

So possibly (correct me if I'm off base here) it's just Swift's network
connection being overloaded, which in turn causes the image scalers to wait
longer before the original image asset is delivered to them, causing them to
be overloaded. If so, the fact that we are fetching the original > 100 MB
source file, only to not even try to scale it, and doing so repeatedly until 4
failed attempts at a specific file width trigger the attempt-failures limit,
which stops it for an hour for that particular size only, may be a very
significant contributor to the situation.

The attempt-failures mechanism only increments the cache key after the attempt
has failed. Given that it was taking ~ 38 seconds just to download the file to
the image scaler (in the case I tried), a lot of people could try to render
that file in that time before the key is incremented (still limited by the
pool counter, though). Maybe that key should be incremented at the beginning
of the request. Sure, in certain situations a couple of people might get an
error for the couple of seconds it takes a good file to render, but that would
only last a couple of seconds and would much more quickly limit the damage
that a stampede of people requesting a hard-to-render file could do.
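
The proposal of counting the attempt at the start of the request can be
sketched roughly as follows, in Python rather than MediaWiki's PHP. This is
an assumption-laden illustration, not the existing implementation. The
AttemptLimiter class and its methods are hypothetical, an in-process dict
stands in for the shared cache, and the limit of 4 attempts and the one-hour
expiry are taken from the description above. Refunding the attempt on
success is one possible reading of "would only last a couple seconds".

```python
import time


class AttemptLimiter:
    """Hypothetical per-(file, width) limiter that counts attempts up front."""

    def __init__(self, limit=4, ttl=3600, clock=time.monotonic):
        self._limit = limit
        self._ttl = ttl
        self._clock = clock
        self._counts = {}  # key -> (count, expiry time); stand-in for a cache

    def begin(self, key):
        """Count the attempt before rendering; False means refuse it."""
        now = self._clock()
        count, expires = self._counts.get(key, (0, now + self._ttl))
        if now >= expires:
            # The old window has expired, so start a fresh one.
            count, expires = 0, now + self._ttl
        if count >= self._limit:
            return False
        self._counts[key] = (count + 1, expires)
        return True

    def succeed(self, key):
        """Refund the attempt once the render finished successfully."""
        count, expires = self._counts.get(key, (0, 0))
        if count > 0:
            self._counts[key] = (count - 1, expires)
```

A caller would invoke begin() before fetching the source and succeed() only
after a thumbnail was produced, so slow or failing renders hold their slot
for the whole duration of the request. A shared deployment would need an
atomic increment in the cache. This sketch is not safe for concurrent use.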

