@Sumana Harihareswara

Please take a look at the Bengali OCR project at
https://code.google.com/p/banglaocr/ ; it needs further development.


On Mon, Aug 19, 2013 at 10:12 PM, Sumana Harihareswara <
suma...@wikimedia.org> wrote:

> On 08/19/2013 02:52 AM, L. Shyamal wrote:
> > Re-posting a now outdated query from meta
> >
> > http://meta.wikimedia.org/wiki/Talk:India_Access_To_Knowledge/Events/Bangalore/Digitization_workshop_18August2013
> >
> > Now that the workshop has already been conducted, I think those who
> > attended could comment on whether it covered Indic-language OCR. If it
> > did, it would be worthwhile to document the OCR software used on the
> > meta pages or elsewhere, such as Wikisource. Most of the more
> > experienced editors here will be fairly familiar with using scanners
> > to create PDF documents and uploading them to places like the Internet
> > Archive, but experience with or knowledge of OCR tools and their
> > success rates is a bit wanting for Indic languages (fonts).
> >
> > best wishes
> > Shyamal
> > en:User:Shyamal
>
> I looked at the talk page on Meta - thank you, Shyamal!
>
> For those who do not know: OCR means Optical Character Recognition.
> When we want to get archival documents onto the web, it's nice to have
> photos of them, but it's even better to OCR them so that people can
> clearly read, copy, excerpt, translate, and remix the text.
>
> Is there a central list of the problems that OCR software (especially
> open source OCR software) has with text written in Indic languages?  If
> so, I could help encourage people to fix those problems, as volunteers,
> via a Google Summer of Code/Outreach Program for Women internship, via a
> grant-funded project (such as https://meta.wikimedia.org/wiki/Grants:IEG
> ), or via some other method.
>
> People who would like to make Wikisource more easily useful for Indic
> languages might want to contribute to the Wikisource vision development
> project that's going on right now:
>
> https://wikisource.org/wiki/Wikisource_vision_development
>
> The ProofreadPage extension (part of the Wikisource technology stack) is
> being worked on right now in Aarti K. Dwivedi's Google Summer of Code
> internship.  http://aartindi.blogspot.in/  She might be interested in
> knowing about these issues, so I am cc'ing her.
>
> Also - just because people on this list might be interested! - if you
> have an old historical map that you'd like to vectorize to get it onto
> OpenStreetMap, try out the new "Map polygon and feature extractor" tool:
> https://github.com/NYPL/map-vectorizer
>
> --
> Sumana Harihareswara
> Engineering Community Manager
> Wikimedia Foundation
>
> _______________________________________________
> Wikimediaindia-l mailing list
> Wikimediaindia-l@lists.wikimedia.org
> To unsubscribe from the list / change mailing preferences visit
> https://lists.wikimedia.org/mailman/listinfo/wikimediaindia-l
>
