Erik,

2013/4/25 Erik Moeller <e...@wikimedia.org>
> > The system I am really aiming at is a different one, and there has
> > been plenty of related work in this direction: imagine a wiki where you
> > enter or edit content, sentence by sentence, but the natural language
> > representation is just a surface syntax for an internal structure. Your
> > editing interface is a constrained, but natural language. Now, in order to
> > really make this fly, both the rules for the parsers (interpreting the
> > input) and the serializer (creating the output) would need to be editable
> > by the community - in addition to the content itself. There are a number of
> > major challenges involved, but I have by now a fair idea of how to tackle
> > most of them (and I don't have the time to detail them right now).
>
> So what would you want to enable with this? Faster bootstrapping of
> content? How would it work, and how would this be superior to an
> approach like the one taken in the Translate extension (basically,
> providing good interfaces for 1:1 translation, tracking differences
> between documents, and offering MT and translation memory based
> suggestions)? Are there examples of this approach being taken
> somewhere else?

Not just bootstrapping the content. By having the primary content saved in a language-independent form, and always translating it on the fly, it would not merely bootstrap content in different languages; it would mean that editors from different languages would be working on the same content. The texts in the different languages are not translations of each other; they are all created from the same source. There would be no primacy of, say, English.

It would be foolish to create any such plan without reusing tools and concepts from the Translate extension, translation memories, etc. There is a lot of UI and conceptual goodness in these tools. The idea would be to make them user-extensible with rules.
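To make the idea concrete, here is a minimal sketch of on-the-fly rendering from language-independent content. All names and structures here are invented for illustration; nothing like this is specified anywhere. The point is only that the stored content carries no language, while the labels and serializer templates are the community-editable, per-language parts:

```python
# Language-independent content: entity identifiers, no English primacy.
# (Q64 and Q183 are the Wikidata identifiers for Berlin and Germany.)
abstract = {"predicate": "capital_of", "city": "Q64", "country": "Q183"}

# Community-editable lexicons: per-language labels for the entities.
LABELS = {
    "en": {"Q64": "Berlin", "Q183": "Germany"},
    "de": {"Q64": "Berlin", "Q183": "Deutschland"},
}

# Community-editable serializer rules: one template per predicate and language.
TEMPLATES = {
    ("capital_of", "en"): "{city} is the capital of {country}.",
    ("capital_of", "de"): "{city} ist die Hauptstadt von {country}.",
}

def render(statement, lang):
    """Create the surface text for one language from the shared content."""
    labels = LABELS[lang]
    template = TEMPLATES[(statement["predicate"], lang)]
    return template.format(city=labels[statement["city"]],
                           country=labels[statement["country"]])
```

Editing `abstract` once changes what readers of every language see, and adding a new language means adding only labels and templates, never retranslating the content itself. Real sentence realization is of course far harder than template filling (agreement, word order, morphology), which is where the community-editable parser and serializer rules come in.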
If you want examples of that, consider the bots currently working on some Wikipedias, creating text from structured input. They partially reuse the same structured input, and "merely" need a translation of the way the bots create the text to save in the given Wikipedia. I have seen some research in the area, but it all has one drawback or another; still, it can and should be used as an inspiration and to inform the project (like Allegro Controlled English, or a chat program developed at the Open University in Milton Keynes to allow conducting business in different languages, etc.).

I hope this helps a bit.

Cheers,
Denny

--
Project director Wikidata
Wikimedia Deutschland e.V. | Obentrautstr. 72 | 10963 Berlin
Tel. +49-30-219 158 26-0 | http://wikimedia.de

Wikimedia Deutschland - Society for the Promotion of Free Knowledge (Gesellschaft zur Förderung Freien Wissens e.V.). Registered in the register of associations of the district court Berlin-Charlottenburg under number 23855 B. Recognized as charitable by the tax office for corporations I Berlin, tax number 27/681/51985.

_______________________________________________
Wikimedia-l mailing list
Wikimedia-l@lists.wikimedia.org
Unsubscribe: https://lists.wikimedia.org/mailman/listinfo/wikimedia-l