On Thu, Jul 11, 2019 at 11:22 PM Alex Brollo <alex.bro...@gmail.com> wrote:
>
> I don't fully understand your statement "Right now, I'm going to convert them
> to DjVu and upload them, without any text information." Don't you feel any
> need for an excellent OCR layer when proofreading it on Wikisource?

I reuploaded the first issue of Weird Tales as DjVu because the PDF
was significantly fuzzier than the DjVu. Looking at the PDF's OCR,
it's slightly better than what I can get from the interface, but given
the choice between better images and better OCR, I go with better images.

> Do you feel fully satisfied by mediawiki OCR of images?

I can't even get the MediaWiki OCR to work. I use the Google OCR gadget.

> I don't know how to get xml data about mapping of words into page image.

It's a pretty distant concern for me, somewhat tangential to producing
transcriptions of the works.

-- 
Kie ekzistas vivo, ekzistas espero.

_______________________________________________
Wikisource-l mailing list
Wikisource-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikisource-l
