Hi Roman!

On 02/28/2014 01:24 AM, Brian Wolff wrote:
> On 2/28/14, Roman Zaynetdinov <[email protected]> wrote:
>> Help people in reading complex texts by providing inline translation for
>> unknown words. For me as a non-native English speaker student sometimes is
>> hard to read complicated texts or articles, that's why I need to search for
>> translation or description every time. Why not to simplify this and change
>> the flow from translate and understand to translate, learn and understand?

This sounds like a great idea.

>> There are two ways in my mind right now. First is to make a web-site built
>> on Node.js with open API for users. Parsoid could be used for parsing data
>> from Wiktionary API which is suitable for Node. A small JavaScript widget
>> is also required for front-end representation.

You could basically write a node service that pulls in the Parsoid HTML
for a given wiktionary term and extracts the info you need from the DOM
and returns it in a JSON response to a client-side library.

Alternatively (or as a first step), you could download the Parsoid HTML
of the wiktionary article on the client and extract the info there. This
could even be implemented as a gadget. We recently set liberal CORS
headers to make this easy.

>> Parsoid could be used for parsing data
>> from Wiktionary API which is suitable for Node
>
> Just as a warning, parsing data from wiktionary into usable form is a
> lot harder then it looks, so don't underestimate this step. (Or at
> least it was several years ago when I last tried)

The Parsoid rendering (e.g. [1]) has pretty much all semantic
information in the DOM. There might still be wiktionary-specific issues
that we don't know about yet, but tasks like extracting template
parameters or the rendering of specific templates (IPA,..) are already
straightforward. Also see the DOM spec [2] for background.

Gabriel

[1]: http://parsoid-lb.eqiad.wikimedia.org/enwiktionary/foo
     Other languages via frwiktionary, fiwiktionary, ...
[2]: https://www.mediawiki.org/wiki/Parsoid/MediaWiki_DOM_spec

_______________________________________________
Wikitech-l mailing list
[email protected]
https://lists.wikimedia.org/mailman/listinfo/wikitech-l
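
To make the "extracting template parameters" step a bit more concrete, here is a rough sketch of what the extraction could look like in Node. It is an illustration under assumptions, not how Parsoid or any existing service does it. The helper names (decodeEntities, extractTemplates) are made up for this example. The shape assumed for the data-mw attribute (a JSON object with a "parts" array of template entries) is my reading of the DOM spec and should be checked against real output. The sample HTML is hand-written, not fetched from Parsoid. A regex is used only to keep the example dependency-free; a real service would more likely walk the DOM with a proper HTML parser.

```javascript
// Sketch only: pull template names and parameters out of Parsoid-style HTML
// by reading the data-mw attribute on transclusion elements.
// Assumption: data-mw holds JSON shaped like
//   {"parts":[{"template":{"target":{"wt":"IPA"},"params":{"1":{"wt":"..."}}}}]}

// Decode the few HTML entities that commonly appear inside attribute values.
// &amp; is handled last so already-decoded text is not decoded twice.
function decodeEntities(s) {
  return s
    .replace(/&quot;/g, '"')
    .replace(/&#39;/g, "'")
    .replace(/&lt;/g, '<')
    .replace(/&gt;/g, '>')
    .replace(/&amp;/g, '&');
}

// Hypothetical helper: returns [{ name, params }] for every template found.
function extractTemplates(html) {
  const results = [];
  // Matches data-mw='...' as well as data-mw="...".
  const attrRe = /data-mw=(?:'([^']*)'|"([^"]*)")/g;
  let m;
  while ((m = attrRe.exec(html)) !== null) {
    const raw = m[1] !== undefined ? m[1] : m[2];
    let dataMw;
    try {
      dataMw = JSON.parse(decodeEntities(raw));
    } catch (e) {
      continue; // not valid JSON, skip this attribute
    }
    if (!dataMw || !Array.isArray(dataMw.parts)) continue;
    for (const part of dataMw.parts) {
      // Parts may also be plain strings (literal wikitext); skip those.
      if (typeof part !== 'object' || part === null || !part.template) continue;
      const tpl = part.template;
      const name = tpl.target && tpl.target.wt ? tpl.target.wt.trim() : '';
      const params = {};
      for (const [key, value] of Object.entries(tpl.params || {})) {
        params[key] = value && value.wt;
      }
      results.push({ name, params });
    }
  }
  return results;
}

// Usage with a hand-written fragment (not real Parsoid output):
const sample =
  '<span typeof="mw:Transclusion" about="#mwt1" ' +
  'data-mw=\'{"parts":[{"template":{"target":{"wt":"IPA"},' +
  '"params":{"1":{"wt":"/fuː/"},"lang":{"wt":"en"}}}}]}\'>/fuː/</span>';

console.log(JSON.stringify(extractTemplates(sample)));
```

The same function could run in a node service that returns JSON, or in a client-side gadget that has fetched the HTML itself, since it only needs the HTML as a string.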
