https://bugzilla.wikimedia.org/show_bug.cgi?id=23258
--- Comment #4 from Daniel Kinzler <[email protected]> 2010-06-22 15:18:36 UTC --- (In reply to comment #2) > Review follows. I'm not sure who is still involved in this project so I'm just > putting my comments here. > > I don't know why this is an extension. Can it be moved to the core, replacing > the barely-functional TiffHandler there? The reson is purely organisational. Developing in core would have required Hallo Welt to get access toi trunk right away, or we would have had to use patches, wich is quite annoying. There's no reson not to rename and/or mere it with core. Well actually, there is one: They do like to be visible on Special:Version. Not sure if adding the extension authors to the core credits would be an option. > Is there any reason we have to use exiv to read EXIF data? We already have > functions for reading EXIF data from TIFF files in the MediaWiki core, in > Exif.php. Use of exiv2 is optional, the reason to use it is to also fetch IPTC and XMP besides Exif. I would very much like to see this for other image formats too - perhaps there could be an abstraction like Exif.php covering all kinds of image meta-data? Otoh, we'd really want to have this per file type, so we could also cover ID3 etc for Ogg files and other media... Needs some more thought. At the moment, PagedTiffHandler defaults to using ImageMagick's identify command to get the basic data. Not sure if we can get around that, since we need some info that is not included in Exif, most notable, the number of pages in the tiff. Also, identify provides clues about whether imagemagick is able to process the file at all, and helps to reject broken tiffs. Anyway, we could ineed use Exif.php to fetch exif data per default, in additon to what identify returns. > The "cache" in PagedTiffImage::retrieveMetaData() seems to just be a way to > completely break image uploads for a given filename for a day, I can't see any > practical purpose for it. But it's a moot point if the whole file is going to > be deleted in favour of Exif.php. The cache thing is an akward hack to provide for broken legacy files. We already have several thousand tiff files uploaded, some of which we'll not be able to render. Caching the fact that extraction failed for a specific file is supposed to prevent mediawiki trying to extract that info over and over again, whenever the file or a thumbnail is requested. However, if the failure occurs while uploading, it should of course not be cached. If it is, that's a bug, and a nasty one at that, since it would confuse the user who might have removed the cause for the failure, but still gets the error. > I've read a bit about VIPS, I see that this extension uses it. Optionally, the default is to use ImageMagick. This feature has only seen light testing. Could be intersting, though. > VIPS has the > ability to scale arbitrarily sized TIFF files with limited RAM, which would be > a useful property for us. But it only works if the whole pipeline is PIO > rather > than WIO. This means that the input TIFF file must be tiled (RowsPerStrip > > 16), and only PIO transformation functions may be used. PagedTiffHandler > currently uses im_resize_linear which is WIO. If it used the PIO function > im_shrink instead, then the RAM usage would be limited. > > im_resize_linear will look nicer when a small TIFF image is resized to a > slightly smaller image. That's not the expected use case for PagedTiffHandler > on Wikimedia. im_shrink will look OK when the thumbnail is several times > smaller than the source image, which is the case we expect to be dealing with. > > To limit memory usage in VIPS, please have PagedTiffHandler error out if it > tries to scale a source image with area greater than $wgMaxImageArea and > RowsPerStrip less than 16. And use im_shrink. Considering the comment you added later, it seems that using img_shrink is always ok? did I understand that correctly? > Ideally I would like to see two-stage conversion, so that very large (>100 > Mpx) > images can be scaled down with im_shrink to somewhat larger than web size, say > 4 Mpx, once only per upload. Then the 4 Mpx intermediate image would be used > to > generate the ~0.1 Mpx thumbnail images requested by editors. The final step > could be done with ImageMagick, or with VIPS with a transformation similar to > the one ImageMagick uses. Yes, we have discussed this option (both with Markus and, I think, with you). This would be very good to have, but it should be implemented in a generic way, not at a per-file-type basuis, I think. PNGs would also benefit from this. > It's a common misconception that downscaling requires interpolation. Really it > needs blurring followed by downsampling followed by sharpening, that's what > ImageMagick does. It's only upscaling (or downscaling by a very small factor) > that needs interpolation. im_shrink uses a very simple kind of blurring, > specifically a block average. The key to high quality downscaling is to choose > an appropriate window function to do the blur. ImageMagick uses a Lanczos > filter by default, see: > > http://www.imagemagick.org/script/command-line-options.php#filter Not sure what you are getting at here - does this mean ImageMagick's scaling looks better? > In PagedTiffHandler.php: > > $wgTiffErrorCacheTTL = 84600; > > Is this meant to be 86400, the number of seconds in a day? Probably, and should probably be written as 60*60*24 to make it clear :) > In PagedTiffHandler_body.php: > > // @todo check height ## MediaWiki doesn't have a MagicWord-ID for height, > appears to be a bug. ^SU > > It's not a bug. Width and height are specified using two numbers separated by > an "x", e.g. 100x100px. There is a special case for this at Parser.php line > 4689. If the img_width magic word parameter matches this format, then a height > parameter will automatically be passed through to the handler. wow... uh... why? i mean, just adding "height" would have been cleaner and simpler, no? > * TODO: The purpose of this function is not yet fully clear. > */ > function getScriptParams( $params ) { # # wtf?? ^DK > // FIXME: This function is unused, seems to be useless, > // and could be replaced with an array_intersect() call > > The function is used to specify which parameters to File::transform() should > be > passed through to thumb.php, in the case where the configuration specifies > thumb.php is to be used (e.g. $wgThumbnailScriptPath !== false). You should > pass through the same parameters as in makeParamString(). The code you have > here would be fine if the fixme comments were removed, except for a detail > about the lossy parameter, see below. > > PagedTiffHandler::normaliseParams() seems to be duplicating code from > ImageHandler::normaliseParams(). I'm not sure why this is necessary. > DjVuHandler gets away with not overriding normaliseParams() at all. If there > is > a bug in ImageHandler::normaliseParams(), you should fix it, rather than > forking it. If all you're doing is validating the lossy parameter, then you > can > do that with code like: > [...] > If you normalise the lossy parameter properly at this point, you avoid the > need > to duplicate the code in makeParamString(), getScriptParams() and > getThumbExtension(). Also, you resolve a bug due to makeParamString() not > matching getThumbExtension() in the way it handles a missing parameter. [...] Thank you very much flor clarifying this. It's quite confusing how these methods interact. What would be a good place to document this? > In PagedTiffHandler::doTransform: please remove the error suppression (@) > operators. normaliseParams() should return false if the width and height are > missing. Error suppression operators have lots of implementation issues in > PHP, > they are evil and should almost never be used. OK. > PagedTiffHandler::getThumbType() is missing and the parent is not returning > the > correct type. Please fix this by modifying File::thumbName() to pass handler > parameters through to MediaHandler::getThumbType() as a third parameter, so > that your lossy/lossless feature can work. You mean, this needs to be fixed in core? I'll have a look. Thanks agin for your detailed review! -- Configure bugmail: https://bugzilla.wikimedia.org/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are the assignee for the bug. You are on the CC list for the bug. _______________________________________________ Wikibugs-l mailing list [email protected] https://lists.wikimedia.org/mailman/listinfo/wikibugs-l
