I might work on this at the Hackathon (and maybe find Bunyk there if he's interested). I would need someone who knows Crimean Tatar to help validate the results, though that wouldn't have to happen during the Hackathon.
On Sun, Mar 26, 2017 at 1:23 AM, Vira Motorko <[email protected]> wrote:
> Thank you, Trey, for this analysis!
>
> There is also this page with a transliteration table (in case it helps
> rather than confuses things further):
> https://crh.wikipedia.org/wiki/Vikipediya:İmlâ/Latin_elifbesi
>
> I asked user:Bunyk (Ukrainian Wikimedian) for help, but he does not work
> with PHP. He is attending the Vienna Hackathon, though, so anyone who
> feels like doing this can reach him there.
>
> I feel uncomfortable because I want this transliteration tool to exist
> but can't contribute to the code myself ((
> So please, my volunteer hero, appear!
> *--*
> *Vira Motorko*
> project manager, Wikimedia Ukraine <https://ua.wikimedia.org/>
> non-profit organisation
> m: +380667740499 | f: vira.motorko <https://www.facebook.com/vira.motorko>
> | w: Ата <https://meta.wikimedia.org/wiki/User:Ата>
>
> Are you saving your documents in free formats? ;)
> Help save natural resources – please think twice before printing this
> e-mail or any attachments.
>
> 2017-03-24 16:05 GMT+02:00 Trey Jones <[email protected]>:
>
> > It looks like a lot of the pieces needed to make this happen are out
> > there.
> >
> > Unfortunately, it doesn't look like a one-to-one transliteration, based
> > on the description in English Wikipedia.[1] But when is language ever
> > straightforward?
> >
> > It looks like much of the work to deal with all the contextual variation
> > and the exceptions to the transliteration was at least attempted twice.
> > There's a zip file of code attached to the Phab ticket,[2] and a link to
> > some code on-wiki.[5] From the comments, it looks like that code never
> > quite worked, but it seems possible to harvest the conversion data from
> > one or both and put it into the same format as the other existing
> > language converters, like Kazakh.[3] It *might* be easier this time,
> > since it's been 6.5 years and the LanguageConverter code is probably
> > more mature now.
> >
> > It would be even better if someone could create an Elasticsearch plugin
> > to do the same kind of conversion. That would allow cross-alphabet
> > searching, too. I've been working with a plugin[4] that does that kind
> > of thing for Traditional and Simplified Chinese.
> >
> > -Trey
> >
> > [1] https://en.wikipedia.org/wiki/Crimean_Tatar_alphabet#Cyrillic_to_Latin_transliteration
> > [2] https://phabricator.wikimedia.org/T23582#247642
> > [3] https://doc.wikimedia.org/mediawiki-core/master/php/classKkConverter.html
> > [4] https://github.com/medcl/elasticsearch-analysis-stconvert
> > [5] https://phabricator.wikimedia.org/T23582#247634
> >
> > Trey Jones
> > Software Engineer, Discovery
> > Wikimedia Foundation
> >
> > On Fri, Mar 24, 2017 at 5:40 AM, Vira Motorko <[email protected]>
> > wrote:
> >
> > > [I'm sorry if this is not the place to ask; please forward it to
> > > where it should be.]
> > >
> > > Hi all,
> > >
> > > There is a long-frozen idea: to make a transliterator for Crimean
> > > Tatar Wikipedia. Native speakers of crh use both Cyrillic and Latin
> > > script, depending on the country they used to live in.
> > > One example of a similar thing in use is https://kk.wikipedia.org,
> > > where one can choose which script they see the content in.
> > >
> > > There is an old task on Phabricator, and there were attempts to write
> > > a tool in PHP, but the effort stopped.
> > > https://phabricator.wikimedia.org/T23582
> > >
> > > Maybe someone can/wants to help with this tool or create one from
> > > scratch?
> > > Maybe you know where else I can find help?
> > >
> > > Thanks!
_______________________________________________
Wikitech-l mailing list
[email protected]
https://lists.wikimedia.org/mailman/listinfo/wikitech-l
