I got the 80% success from Sankarshan's posterous -

The problem that Ashwin Baindur raised was the improper digitisation effort. A
rough Google search tells me that C-DAC is doing the digitisation for the
Maharashtra Archives - http://www.cdac.in/html/egov/mda.aspx - which, as
Ashwin pointed out, is stored on compact disks. Interestingly, they are
using SQL and Visual Basic under Windows NT. I am not sure if this is a good
thing. I also do not know when this project was done, so I am not sure if
those were current technologies at the time.

We discussed yesterday that the Maharashtra Archives, being a public
institution (or for that matter any public institution), should ideally
either place these documents in the public domain or release them under an
open copyright (do correct me if I am wrong with the terminology).

warm regards,

On 14 February 2011 11:29, Pradeep Mohandas <pradeep.mohan...@gmail.com> wrote:

> hi,
> At the discussion yesterday, we were told that OCR did not work at all
> for many Indian languages. Also, as a person who does not understand
> OCR at all, can anyone help me with what they mean by an 80% successful
> OCR?
> The other end of the process is the digitisation machine needed to convert
> the physical text into an image. Any ideas on the availability and cost of
> a museum-grade digitisation machine? I am sure you cannot use an ordinary
> device to handle these documents, and the archives would not allow it.
> thanks in advance,
> Pradeep
Wikimediaindia-l mailing list
