There is an interesting suggestion on the web about the difference between DjVu and PDF. The question was: how can I get hOCR from the hidden text layer of a PDF file? The reply: convert the PDF to DjVu, then everything will be simple (more or less). This comes from the fact that everything inside a DjVu file is open and "simply" accessible, while everything inside a PDF is difficult and obscure. DjVu is wiki, PDF isn't. I don't know of any other open format that implements a searchable hidden text layer underlying the page image.
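As a rough illustration of that route, here is a minimal sketch in Python. It assumes the `pdf2djvu` tool and the `djvu2hocr` tool (shipped with ocrodjvu) are installed and on the PATH; the function names `build_commands` and `pdf_to_hocr` are made up for this example, and the whole thing is one possible way to do it, not a tested recipe.

```python
import shutil
import subprocess
from pathlib import Path


def build_commands(pdf_path, djvu_path):
    """Return the two command lines: PDF -> DjVu, then DjVu -> hOCR on stdout."""
    # pdf2djvu carries the PDF's hidden text layer over into the DjVu file.
    convert = ["pdf2djvu", "-o", str(djvu_path), str(pdf_path)]
    # djvu2hocr (from ocrodjvu) prints the DjVu text layer as hOCR.
    extract = ["djvu2hocr", str(djvu_path)]
    return convert, extract


def pdf_to_hocr(pdf_path):
    """Convert a PDF to DjVu next to it and return its text layer as hOCR."""
    pdf_path = Path(pdf_path)
    djvu_path = pdf_path.with_suffix(".djvu")
    convert, extract = build_commands(pdf_path, djvu_path)

    # Fail early with a clear message if a tool is missing.
    for tool in (convert[0], extract[0]):
        if shutil.which(tool) is None:
            raise RuntimeError(f"{tool} not found on PATH")

    subprocess.run(convert, check=True)
    result = subprocess.run(extract, check=True, capture_output=True, text=True)
    return result.stdout


if __name__ == "__main__":
    import sys

    if len(sys.argv) == 2:
        print(pdf_to_hocr(sys.argv[1]))
```

Once the DjVu file exists, the same text layer can also be inspected directly with `djvused file.djvu -e print-txt`, which is part of what makes the format feel "open" in the sense described above.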
But as a first step, the incredible opportunities DjVu offers should be *actively explored and used*! If you use a car only as a hen-house and never drive it, then in your opinion any ordinary, well-built hen-house will be just as good, or better.

Alex

2018-04-06 15:45 GMT+02:00 Federico Leva (Nemo) <nemow...@gmail.com>:

> Peter Meyer, 06/04/2018 14:59:
>
>> Could we distill these issues online on a wiki page somewhere? Or is it
>> already done?
>> (1) what are the significant differences between pdf and djvu (or some
>> new version of djvu that we could imagine coming up with)
>>
>
> I agree this is important to outline. For instance, is there some
> Wikisource where PDF files are actively discouraged in favour of DjVu, and
> for what reasons?
>
> Which DjVu features we dream of using within 5 years, which PDF doesn't
> provide? Do we want a system where libraries can feed us with DjVu files,
> the proofread text gets ingested back to the DjVu file and libraries can
> reuse it? Do we want to use some of the low level features of the text
> layer to widely deploy some dark magic, such as the captcha-based
> proofreading we talked about many times or some other interaction between
> MediaWiki and the scans? What "market" is there for such features?
>
> DjVu became our favourite format back at the time when the upload size
> limit was around 10 MiB, if I remember correctly, and compression was the
> most important factor. I often find myself explaining why it's such a
> useful format, but in the end if someone asks me "so, is it fine to just
> upload a PDF at Wikisource?" I have a hard time giving an answer other than
> "sure, don't worry, it will be the same".
>
> Federico
>
>
> _______________________________________________
> Wikisource-l mailing list
> Wikisource-l@lists.wikimedia.org
> https://lists.wikimedia.org/mailman/listinfo/wikisource-l
>