Today I practised with ScanTailor and unpaper to upload a rare book from WWI: https://archive.org/details/lettere-ferrovieri-patria I noticed the OCR is now "ABBYY FineReader 11". IA only recently switched to 9 after many years on 8, so this seems like good news.

If I read correctly, version 11 claims:

    New and improved language support
        New OCR languages: Turkmen (Latin) and Old Slavonic
        New ICR languages: Danish, Norwegian (Bokmal & Nynorsk), Old English, Serbian (Cyrillic), Tajik
        Latin language has full dictionary support

    Improved OCR for Chinese, Japanese & Korean

Version 10 claims:

    Improved Language Support
        Chinese
        Korean
        Japanese

It seems IA technology is evolving rather fast:
https://blog.archive.org/2015/10/21/grant-to-develop-the-next-generation-wayback-machine/
https://blog.archive.org/2015/10/23/zoom-in-to-9-3-million-internet-archive-books-and-images-through-iiif/

Nemo

_______________________________________________
Wikisource-l mailing list
[email protected]
https://lists.wikimedia.org/mailman/listinfo/wikisource-l
