There is an interesting suggestion on the web about the difference between
DjVu and PDF. The question was: how can I get hOCR from the hidden text layer
of a PDF file? The reply: convert the PDF to DjVu, and then everything will be
simple (more or less). This comes from the fact that everything inside a DjVu
file is open and "simply" accessible, whereas everything inside a PDF is
difficult and obscure. DjVu is wiki, PDF isn't. I don't know of any other open
format that implements a searchable hidden text layer underlying the page
image.
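
To make the "convert, then read" idea concrete, here is a rough sketch, not
a tested recipe. It assumes the pdf2djvu and djvused (DjVuLibre) command-line
tools are installed, and that `djvused -e print-txt` prints the text layer as
S-expressions with `(word xmin ymin xmax ymax "text")` zones. The function
names, the file paths and the SAMPLE string are invented for illustration.

```python
import re
import shutil
import subprocess

# One word zone as printed by `djvused -e print-txt` (assumed layout):
# (word xmin ymin xmax ymax "text")
WORD_ZONE = re.compile(
    r'\(word\s+(-?\d+)\s+(-?\d+)\s+(-?\d+)\s+(-?\d+)\s+"((?:[^"\\]|\\.)*)"\s*\)'
)


def conversion_commands(pdf_path, djvu_path):
    """Return two command lines: PDF -> DjVu, then dump the text layer."""
    return [
        ["pdf2djvu", "-o", djvu_path, pdf_path],
        ["djvused", "-e", "print-txt", djvu_path],
    ]


def parse_words(sexpr):
    """Extract (xmin, ymin, xmax, ymax, text) tuples from print-txt output.

    Backslash escapes inside the quoted text are kept as-is, not decoded.
    """
    words = []
    for m in WORD_ZONE.finditer(sexpr):
        x1, y1, x2, y2 = (int(g) for g in m.groups()[:4])
        words.append((x1, y1, x2, y2, m.group(5)))
    return words


def dump_text_layer(pdf_path, djvu_path):
    """Run both tools, if they are on PATH, and return the parsed words."""
    convert, dump = conversion_commands(pdf_path, djvu_path)
    if not (shutil.which(convert[0]) and shutil.which(dump[0])):
        raise RuntimeError("pdf2djvu and djvused must be on PATH")
    subprocess.run(convert, check=True)
    result = subprocess.run(dump, check=True, capture_output=True, text=True)
    return parse_words(result.stdout)


# Hypothetical sample of print-txt output, for trying the parser offline.
SAMPLE = '''(page 0 0 2550 3300
 (line 300 3000 900 3060
  (word 300 3000 520 3060 "Hello")
  (word 560 3000 900 3060 "world")))'''

if __name__ == "__main__":
    for word in parse_words(SAMPLE):
        print(word)
```

Each word comes back with its bounding box on the page, which is essentially
the information hOCR carries. As far as I know the ocrodjvu package also
ships a djvu2hocr tool that does the DjVu-to-hOCR step directly.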

But as a first step, the incredible opportunities DjVu offers should be
*actively explored and used*! If you use a car simply as a hen-house and never
drive it, then in your opinion any standard hen-house will seem just as good,
or better.


2018-04-06 15:45 GMT+02:00 Federico Leva (Nemo) <>:

> Peter Meyer, 06/04/2018 14:59:
>> Could we distill these issues online on a wiki page somewhere?   Or is it
>> already done?
>> (1) what are the significant differences between pdf and djvu (or some
>> new version of djvu that we could imagine coming up with)
> I agree this is important to outline. For instance, is there some
> Wikisource where PDF files are actively discouraged in favour of DjVu, and
> for what reasons?
> Which DjVu features we dream of using within 5 years, which PDF doesn't
> provide? Do we want a system where libraries can feed us with DjVu files,
> the proofread text gets ingested back to the DjVu file and libraries can
> reuse it? Do we want to use some of the low level features of the text
> layer to widely deploy some dark magic, such as the captcha-based
> proofreading we talked about many times or some other interaction between
> MediaWiki and the scans? What "market" is there for such features?
> DjVu became our favourite format back at the time when the upload size
> limit was around 10 MiB, if I remember correctly, and compression was the
> most important factor. I often find myself explaining why it's such a
> useful format, but in the end if someone asks me "so, is it fine to just
> upload a PDF at Wikisource?" I have a hard time giving an answer other than
> "sure, don't worry, it will be the same".
> Federico
> _______________________________________________
> Wikisource-l mailing list