Re: [Wikisource-l] Does really wikisource need djvu/pdf files?

David Starner Fri, 12 Jul 2019 00:27:27 -0700

On Thu, Jul 11, 2019 at 11:22 PM Alex Brollo <[email protected]> wrote:
>
> I don't understand fully your statement "Right now, I'm going to convert them
> to DjVu and upload them, without any text information.". Don't you feel any
> need of an excellent OCR layer when proofreading it into wikisource?


I reuploaded the first issue of Weird Tales in DjVu because the PDF
was significantly fuzzier than the DjVu, and looking at the PDF OCR,
it's only slightly better than what I can get from the interface. Given
the choice between better images and better OCR, I go with better images.

> Do you feel fully satisfied by mediawiki OCR of images?

I can't even get the MediaWiki OCR to work. I use the Google OCR gadget.

> I don't know how to get xml data about mapping of words into page image.

It's a pretty distant concern for me, somewhat tangential to producing
transcriptions of the works.

-- 
Kie ekzistas vivo, ekzistas espero.

