https://bugzilla.wikimedia.org/show_bug.cgi?id=65217
--- Comment #4 from Bawolff (Brian Wolff) <[email protected]> ---

I recently realized that we still download the source file even if it is above $wgMaxImageArea. For example, https://commons.wikimedia.org/wiki/File:Map_of_New-York_Bay_and_Harbor_and_the_environs_-_founded_upon_a_trigonometrical_survey_under_the_direction_of_F._R._Hassler,_superintendent_of_the_Survey_of_the_Coast_of_the_United_States;_NYPL1696369.tiff is a 540 MB file, and it takes 37 seconds just to get to the error message saying we aren't even going to attempt to thumbnail it. I've submitted https://gerrit.wikimedia.org/r/135101 to fix this.

I've missed much of what unfolded around this situation. Looking back through the mailing list archives, I'm not even clear whether the problem is Swift being overloaded, the time taken to actually thumbnail the image, both, or something else. One of the earlier emails says:

>We just had a brief imagescaler outage today at approx. 11:20 UTC that
>was investigated and NYPL maps were found to be the cause of the outage.
>Besides the complete outage of imagescaling, Swift's (4Gbps) bandwidth
>was saturated again, which would cause slowdowns and timeouts in file
>serving as well.

So possibly (correct me if I'm off base here) it's just Swift's network connection being overloaded, which in turn makes the image scalers wait longer for the original image asset to be delivered to them, causing them to become overloaded as well. If so, the fact that we fetch the original > 100 MB source file only to not even try to scale it, and do so repeatedly until 4 attempts at a specific file width trigger attempt-failures to stop it for an hour (on that particular size only), may be a very significant contributor to the situation.

The attempt-failures mechanism only increments the cache key after the attempt has failed.
Given that it took ~ 38 seconds just to download the file to the image scaler (in the case I tried), a lot of people could try to render that file before the key is incremented (still limited by the pool counter, though). Maybe that key should be incremented at the beginning of the request. Sure, in certain situations a couple of people might get an error during the couple of seconds it takes a good file to render, but that would only last a couple of seconds, and it would much more quickly limit the damage that a stampede of people requesting a hard-to-render file could do.
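
To make the "increment at the beginning of the request" idea concrete, here is a rough sketch. It is written in Python rather than MediaWiki's PHP, and everything in it (FailureLimiter, TooManyFailures, the key format, decrementing again on success) is hypothetical and only meant to illustrate the ordering, not how the real attempt-failures code works. The limit of 4 and the one-hour expiry are the numbers mentioned above.

```python
import time


class TooManyFailures(Exception):
    """Raised when a (file, width) key has used up its attempts."""


class FailureLimiter:
    """Hypothetical attempt counter, keyed per file and thumbnail width."""

    def __init__(self, limit=4, ttl=3600, clock=time.monotonic):
        self.limit = limit
        self.ttl = ttl
        self.clock = clock
        self._entries = {}  # key -> (count, expiry time)

    def _count(self, key):
        count, expiry = self._entries.get(key, (0, 0.0))
        return count if self.clock() < expiry else 0

    def _add(self, key, delta):
        now = self.clock()
        count, expiry = self._entries.get(key, (0, 0.0))
        if now >= expiry:
            # Missing or expired entry: start a fresh one-hour window.
            count, expiry = 0, now + self.ttl
        self._entries[key] = (max(0, count + delta), expiry)

    def render(self, key, do_render):
        if self._count(key) >= self.limit:
            raise TooManyFailures(key)
        # Count the attempt *before* the slow work starts, so requests
        # arriving during a 38-second download already see it.
        self._add(key, 1)
        result = do_render()  # if this raises, the increment stays
        # Assumption of this sketch: a successful render gives its slot back.
        self._add(key, -1)
        return result
```

With this ordering, a file that cannot be rendered is cut off after 4 attempts no matter how long each attempt takes, and the cost is that a renderable file can briefly return an error while 4 renders of it are already in flight. A plain dict like this is not safe across processes; a shared cache with an atomic increment would be needed for real use.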
