Simply, from a practical point of view, my suggestion is: don't try to get a good DjVu from the IA PDF; use the _jp2.zip images instead (after conversion to JPG the images are very good), and the result will be much better - almost as good as the images in the IA viewer, which uses those same images.
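For what it's worth, a minimal sketch of that workflow in Python. To stay self-contained it only *plans* one ImageMagick `convert` call per page of the _jp2.zip (rather than running them); if ImageMagick is installed, the commands can be executed with subprocess. The file names are placeholders, not any real item's.

```python
import zipfile
from pathlib import Path

def plan_jp2_conversion(zip_path, out_dir):
    """Return one ImageMagick `convert` command per .jp2 page in the archive."""
    commands = []
    with zipfile.ZipFile(zip_path) as zf:
        for name in sorted(zf.namelist()):
            if name.lower().endswith(".jp2"):
                # One JPG per page, named after the JP2 member.
                jpg = str(Path(out_dir) / (Path(name).stem + ".jpg"))
                commands.append(["convert", name, jpg])
    return commands
```

The resulting JPGs can then be fed to the usual DjVu tools (e.g. c44/djvm) without going through the over-compressed PDF at all.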
Alex

2016-05-13 10:06 GMT+02:00 Federico Leva (Nemo) <[email protected]>:

> Alex Brollo, 13/05/2016 09:02:
>
>> I presume that this complex structure is somewhat similar of djvu
>> background/foreground segmentation into djvu files, and artifacts are
>> similar.
>
> Sure.
>
>> So, pdf images are not only "compressed", but deeply processed and
>> segmented images.
>
> ...which is what I call "compression". I still recommend to try and
> increase the fixed-ppi parameter in such a case of excessive compression.
>
> I also still need an answer to https://it.wikisource.org/?diff=1733473
>
>> Is something of this complex IA image processing path documented
>> anywhere?
>
> What do you mean? Are you asking about details of their derivation plan
> for books? What we know has been summarised over time at
> https://en.wikisource.org/wiki/Help:DjVu_files#The_Internet_Archive , as
> always. As the help page IIRC states, the best way to understand what's
> going on is to check the item history and read the derive.php log, like
> https://catalogd.archive.org/log/487271468 which I linked.
>
> The main difference compared to the past is, I think, that they're no
> longer creating the luratech b/w PDF, probably because the "normal" PDF now
> manages to compress enough. They may have not realised that the single PDF
> they now produce is too compressed for illustrations and for cases where
> the original JP2 is too small.
>
> Nemo
>
> _______________________________________________
> Wikisource-l mailing list
> [email protected]
> https://lists.wikimedia.org/mailman/listinfo/wikisource-l
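As a small aid for the "check the item history and the derive log" step: a hedged sketch that, given an archive.org item identifier, builds the URLs one would check by hand. The metadata-endpoint and download-URL patterns are the standard archive.org ones; the derive.php log lives on catalogd under a per-item task number, so it cannot be guessed from the identifier and is not constructed here. The identifier used is a placeholder.

```python
def ia_item_urls(identifier):
    """Build the archive.org URLs worth checking for a scanned book item."""
    base = "https://archive.org"
    return {
        # JSON listing of all files and metadata for the item.
        "metadata": f"{base}/metadata/{identifier}",
        # The original JPEG 2000 page scans, as suggested above.
        "jp2_zip": f"{base}/download/{identifier}/{identifier}_jp2.zip",
        # Item history, where the derive tasks (and their logs) are listed.
        "history": f"{base}/history/{identifier}",
    }
```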
