On 04/24/2014 05:24 AM, Daan Kuijsten wrote:
>
> On 23-Apr-14 21:29, [email protected] wrote:
>> Re: API attribute ID for querying wikipedia pages
>
> @Matma Rex: This is way too general; I think it would be a lot better if
> this were in more detail. For example, when I want to fetch a table with
> all currencies on
> https://en.wikipedia.org/wiki/List_of_circulating_currencies, I would make
> an API call like this:
> https://en.wikipedia.org/w/api.php?action=parse&page=List%20of%20circulating%20currencies&prop=sections&format=jsonfm
> This returns 5 sections with "numbers" which I can use as reference points,
> but I would rather have a "number" for the table in the section. A section
> can have multiple tables.
>
> Querying specific (structured) data from Wikipedia is still very difficult
> in my opinion. My suggestion is that every paragraph, image, link and table
> gets a unique identifiable number. This way Wikipedia becomes more machine
> readable.

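
For reference, the sections call quoted above could be sketched in Python roughly as follows. This is only a minimal sketch: the helper names (build_sections_url, list_sections) and the sample response are hypothetical, and it assumes format=json rather than jsonfm, since jsonfm returns an HTML-wrapped rendering intended for debugging. It also shows the limitation being discussed: the response addresses sections, not the tables inside them.

```python
from urllib.parse import urlencode

# Endpoint taken from the URL quoted above.
API_ENDPOINT = "https://en.wikipedia.org/w/api.php"


def build_sections_url(page_title):
    """Build an action=parse request that lists the sections of a page."""
    params = {
        "action": "parse",
        "page": page_title,
        "prop": "sections",
        "format": "json",  # assumption: machine-readable variant of jsonfm
    }
    return API_ENDPOINT + "?" + urlencode(params)


def list_sections(response):
    """Return (index, heading) pairs from a decoded prop=sections response."""
    return [(s["index"], s["line"]) for s in response["parse"]["sections"]]


url = build_sections_url("List of circulating currencies")

# Hypothetical, trimmed response for illustration only; real responses
# carry more fields per section and the headings here are made up.
sample_response = {
    "parse": {
        "title": "List of circulating currencies",
        "sections": [
            {"index": "1", "number": "1", "line": "Example heading A"},
            {"index": "2", "number": "2", "line": "Example heading B"},
        ],
    }
}

sections = list_sections(sample_response)
```

The "index" values are the only reference points available here, so a table inside a section still has to be located by parsing that section's content.
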
We (the Parsoid team) are actually working on this, see
https://www.mediawiki.org/wiki/Parsoid/MediaWiki_DOM_spec/Element_IDs

Besides making it possible to reference content, our goal is to use these
ids as a key that lets us associate additional metadata with each element
in the DOM. We expect stable element ids to be available in Parsoid output
by this summer.

Gabriel

_______________________________________________
Wikitech-l mailing list
[email protected]
https://lists.wikimedia.org/mailman/listinfo/wikitech-l
