Thank you, Trey, for this analysis! There is also this page with a transliteration table (in case it helps rather than confuses more)): https://crh.wikipedia.org/wiki/Vikipediya:İmlâ/Latin_elifbesi

I asked user:Bunyk (a Ukrainian Wikimedian) for help, but he does not work with PHP. He is attending the Vienna Hackathon, though, so you can reach him there if you feel like doing this.

I feel uncomfortable because I want this transliteration tool to exist but can't contribute to the code myself (( So please, my volunteer hero, appear!

*--*
*Vira Motorko*
project manager, Wikimedia Ukraine <https://ua.wikimedia.org/>
non-profit organisation
m: +380667740499 | f: vira.motorko <https://www.facebook.com/vira.motorko> | w: Ата <https://meta.wikimedia.org/wiki/User:Ата>

Are you saving your documents in free formats? ;)
Help save natural resources – please think twice before printing this e-mail or any attachments.

2017-03-24 16:05 GMT+02:00 Trey Jones <[email protected]>:

> It looks like a lot of the pieces needed to make this happen are out there.
>
> Unfortunately it doesn't look like a one-to-one transliteration based on
> the description in English Wikipedia.[1] But when is language ever
> straightforward?
>
> It looks like much of the work to deal with all the contextual variation
> and the exceptions to the transliteration was at least attempted twice.
> There's a zip file of code attached to the Phab Ticket,[2] and link to some
> code on-wiki[5]. From the comments, it looks like that code never quite
> worked, but it seems possible to harvest the conversion data from one or
> both and put it into the same format as the other existing language
> converters, like Kazakh[3]—and it *might* be easier this time since it's
> been 6.5 years and the LanguageConverter code is probably more mature now.
>
> It would be even better if someone could create an Elasticsearch plugin to
> do the same kind of conversion. That would allow cross-alphabet searching,
> too. I've been working with a plugin[4] that does that kind of thing for
> Traditional and Simplified Chinese.
>
> —Trey
>
> [1] https://en.wikipedia.org/wiki/Crimean_Tatar_alphabet#Cyrillic_to_Latin_transliteration
> [2] https://phabricator.wikimedia.org/T23582#247642
> [3] https://doc.wikimedia.org/mediawiki-core/master/php/classKkConverter.html
> [4] https://github.com/medcl/elasticsearch-analysis-stconvert
> [5] https://phabricator.wikimedia.org/T23582#247634
>
> Trey Jones
> Software Engineer, Discovery
> Wikimedia Foundation
>
> On Fri, Mar 24, 2017 at 5:40 AM, Vira Motorko <[email protected]> wrote:
>
> > [I'm sorry if it's not the place to ask, please forward where it should
> > be.]
> >
> > Hi all,
> >
> > There is a long frozen idea: to make a transliterator for Crimean Tatar
> > Wikipedia. Native speakers of crh use both cyrillic and latin script
> > depending on the country they used to live in.
> > One example of similar thing in use is https://kk.wikipedia.org — one can
> > choose in what script they see the content.
> >
> > There is an old task on Phabricator and were attempts to write a tool in
> > php but the effort stopped.
> > https://phabricator.wikimedia.org/T23582
> >
> > Maybe someone can/wants to help with this tool or create one from scratch?
> > Maybe you know where else I can find help?
> >
> > Thanks!
_______________________________________________
Wikitech-l mailing list
[email protected]
https://lists.wikimedia.org/mailman/listinfo/wikitech-l
