I tried ABBYY before and the quality was low. I will try Tesseract and see what happens.

Best

On Tue, Jun 24, 2014 at 7:08 PM, Aleksey Chalabyan <[email protected]> wrote:

> ABBYY FineReader supports Hebrew and Arabic since v. 11. But I'm afraid
> the same script is not enough. For example, FineReader has 3 versions for
> Armenian. All three use the same script, with different orthography and
> slightly different vocabulary, but if you set the wrong language the drop
> in quality is dramatic. So I'm not sure if Arabic OCR would work well for
> text in Farsi (Persian).
> FineReader provides a 30-day full trial, and I think it's worth giving it
> a try.
>
> You may try to approach ABBYY and check if there are any plans for full
> support of Persian in the near future.
>
> And trying to train Tesseract seems like a good idea to get a free/open
> source OCR for Persian, if you can get enough resources for that. But I
> can't comment on how well it will work with RTL scripts, especially with
> Nastaliq/Naskh, when letters and words are not separated from each other.
>
> On Tue, Jun 24, 2014 at 6:13 PM, Federico Leva (Nemo) <[email protected]>
> wrote:
>
>> Amir Ladsgroup, 24/06/2014 15:37:
>>
>>> I have access to huge resources of old books in Persian (some of them
>>> are even typed) and almost all of them can be imported to Wikisource, but
>>> the problem is I don't have (or know) any OCR for Persian. Do you know
>>> which OCR software supports Persian texts (supporting Arabic is not
>>> enough; I checked several programs)?
>>
>> The only result for "Persian" and OCR on the ABBYY website is
>> <http://www.abbyy.com/CaseStudies/SISU-Reveals-Its-Multilingual-Content-to-Academic-Community-Thanks-to-ABBYY-Recognition-Server/>,
>> weird! Worth asking them for some details, they might have some
>> additional plugins.
>>
>> On the FLOSS side, maybe some library in Iran made some investments in
>> Tesseract? If there's any big digital library of Persian content you
>> should ask them as well.
>>
>> Reminder: archive.org is still in need of people willing to compare 8.0
>> vs. 9.0 OCR results of some books in their language. :)
>> http://thread.gmane.org/gmane.org.wikimedia.wikisource/1552
>>
>> Nemo
>>
>> _______________________________________________
>> Wikisource-l mailing list
>> [email protected]
>> https://lists.wikimedia.org/mailman/listinfo/wikisource-l

--
Amir
