Andrea Zanni, 03/10/2013 13:19:
Hi Nemo,
that's great news.

I wonder, though, whether it would be worth redoing the OCR on the archive.org DjVu,
as it will be on archive.org but not on Commons...
Do you mean that we would need to re-upload the DjVu to Commons?

Of course, it will need to be reuploaded. Reuploading is cheap, while correcting OCR errors consumes precious volunteer time.

Nemo


BTW,
I think it's long past time for Archive.org and Wikimedia to start a real
partnership/collaboration.
With Micru, some months ago, we tried to draft a possible model:
https://docs.google.com/file/d/0B1PNcNlN2oqvajVfOEFuM29sbzg/edit?usp=sharing

But I think the discussion died (as did many others).
One of the things we could do is a project similar to this:
https://www.mediawiki.org/wiki/Possible_projects#Google_Books_.3E_Internet_Archive_.3E_Commons_upload_cycle

Aubrey




On Tue, Oct 1, 2013 at 6:25 PM, Federico Leva (Nemo) wrote:

    As you know, many of us use archive.org to OCR
    their books:
    <https://en.wikisource.org/wiki/Help:DjVu_files#The_Internet_Archive>
    For a while, they've been stuck with FineReader 8.0. I've just
    noticed that the latest OCR processes use 9.0, which supports 5 more
    languages and 2 more dictionaries:
    http://www.abbyy.com/support/finereader_90_ts/RecognitionLanguages/
    http://www.abbyy.com/support/finereader_80_ts/RecognitionLanguages/

    I think it's worth redoing the OCR on any archive.org DjVu you're
    using (and you definitely must do so if it's in one of those
    languages). I'm a (limited) admin there, so feel free to leave
    lists of items whose OCR should be updated on my talk page:
    https://wikisource.org/wiki/User_talk:Nemo_bis

    Nemo

    _______________________________________________
    Wikisource-l mailing list
    [email protected]
    https://lists.wikimedia.org/mailman/listinfo/wikisource-l



