On 05/22/2014 05:41 PM, Petr Bena wrote:
I was looking for a free (possibly open source) provider of automatic translations for an open source application I am working on, and had quite some trouble finding one. Then I realized we have a project called "wiktionary" which could possibly help me here (I was assuming it's an open dictionary), but I was quite disappointed, as I couldn't find any simple way to perform simple queries like:
There are several open-source machine translation projects. They are either rule-based or statistics-based. One of the rule-based projects is Apertium. When you start from zero, building a rule-based system gives you a useful system quite fast, especially if the two languages are similar.

A statistics-based system (such as Google Translate) requires enormous amounts of data to become useful. It's not something that you can start as a subproject within Wiktionary, not even as a separate WMF project. It's a very large task.

One naive approach is to base a statistics-based machine translator (SMT) on the European Union's freely available parallel text corpus. When you try to translate Finnish "terve" (which means: hello!) into English in such a system, it will say "health", since the same word also means health, and EU texts only talk about healthcare, never "hello".

-- 
Lars Aronsson ([email protected])
Aronsson Datateknik - http://aronsson.se

_______________________________________________
Wikitech-l mailing list
[email protected]
https://lists.wikimedia.org/mailman/listinfo/wikitech-l
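
To make the "terve" example concrete, here is a minimal sketch of the naive statistical idea: pick the target word most often aligned with the source word in the training data. The aligned pairs below are invented toy data standing in for an EU-style corpus, and build_lexicon and translate_word are hypothetical helper names, not part of any real SMT toolkit. A real system would also have to learn the alignments and model word order.

```python
from collections import Counter, defaultdict

# Invented toy word alignments, standing in for a corpus of official texts.
# Note that no pair ever aligns "terve" with "hello".
ALIGNED_PAIRS = [
    ("terve", "health"),
    ("terve", "health"),
    ("terve", "healthy"),
    ("komissio", "commission"),
]


def build_lexicon(pairs):
    """Count how often each source word is aligned with each target word."""
    table = defaultdict(Counter)
    for source_word, target_word in pairs:
        table[source_word][target_word] += 1
    return table


def translate_word(table, word):
    """Return the most frequently aligned target word, or None if unseen."""
    candidates = table.get(word)
    if not candidates:
        return None
    return candidates.most_common(1)[0][0]


lexicon = build_lexicon(ALIGNED_PAIRS)
print(translate_word(lexicon, "terve"))  # health
```

Because the greeting sense never occurs in the training data, no amount of counting can produce "hello"; the model only reflects the domain it was trained on.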
