On Thu, Jul 11, 2019 at 11:22 PM Alex Brollo <alex.bro...@gmail.com> wrote:
> I don't understand fully your statement "Right now, I'm going to convert them 
> to DjVu and upload them, without any text information.". Don't you feel any 
> need of an excellent OCR layer when proofreading it into wikisource?

I reuploaded the first issue of Weird Tales as DjVu because the PDF
was significantly fuzzier than the DjVu. The PDF's OCR is slightly
better than what I can get from the interface, but given the choice
between better images and better OCR, I go with better images.

> Do you feel fully satisfied by mediawiki OCR of images?

I can't even get the MediaWiki OCR to work. I use the Google OCR gadget.

> I don't know how to get xml data about mapping of words into page image.

It's a pretty distant concern for me, somewhat tangential to producing
transcriptions of the works.

Where there is life, there is hope.

Wikisource-l mailing list
