On Jun 14, 2014 4:54 AM, "Maximilian Klein" <[email protected]> wrote:
>
> Hello All,
>
> I'm working on the Open-Access Signalling Project[1], which aims to signal
> and badge when a reference in Wikipedia is an Open Access source. I'm writing
> the bot at the moment to do this, and I'm encountering a question - how do
> I keep track of the values of the template {{Cite doi | doi=value}}, in as
> close to real-time as possible?
>
> The most efficient approach I can come up with is to query the SQL servers
> on Labs in a constant loop, returning the results of "What transcludes {{Cite
> doi}}" and seeing if the last_edited timestamp is newer than the previous one. If
> the last_edit is newer, then get the content of the page and see if the
> {{Cite_doi}} value has changed, checking against a local database.
>
> This seems horribly inefficient still. Is there a hook to know when a
> template on a page has been edited, rather than having to check every time
> the page has been edited?

The API can provide the list of URLs on the page, which may be enough for you.
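
A minimal sketch of that lookup, assuming the standard action=query module with prop=extlinks. The helper names (extlinks_url, urls_from_response) and the sample payload are made up for illustration; the parser only reads a JSON string, so it can be tried without touching the network.

```python
import json
from urllib.parse import urlencode

API = "https://en.wikipedia.org/w/api.php"


def extlinks_url(title, limit=500):
    """Build a query URL listing the external links on one page."""
    params = {
        "action": "query",
        "prop": "extlinks",
        "titles": title,
        "ellimit": limit,
        "format": "json",
    }
    return API + "?" + urlencode(params)


def urls_from_response(payload):
    """Collect the external URLs from a JSON response body (a string)."""
    data = json.loads(payload)
    pages = data.get("query", {}).get("pages", {})
    # The default JSON layout keys pages by page id; formatversion=2
    # returns a plain list instead, so accept either shape.
    page_iter = pages.values() if isinstance(pages, dict) else pages
    urls = []
    for page in page_iter:
        for link in page.get("extlinks", []):
            # The link is stored under "*" (default) or "url" (formatversion=2).
            url = link.get("*") or link.get("url")
            if url:
                urls.append(url)
    return urls


# Hypothetical sample payload, shaped like the default JSON output.
sample = json.dumps({
    "query": {"pages": {"1": {"pageid": 1, "title": "Example",
                              "extlinks": [{"*": "http://dx.doi.org/10.1000/182"}]}}}
})
```

Filtering the returned list for dx.doi.org links would give a rough picture of which DOIs a page cites, without parsing the wikitext.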

It sounds like you want an API hook which returns (only) the reference
metadata for a page.  That would be lovely.  I would love to see an
option to provide those results in Zotero's JSON format.

COinS metadata can be downloaded in structured form.

1. get a list of sections

https://en.wikipedia.org/w/api.php?action=parse&prop=sections&page=COinS

2. fetch the formatted HTML for the relevant section(s)

https://en.wikipedia.org/w/api.php?action=mobileview&prop=text&page=COinS&sections=7&format=dumpfm&notransform=1

3. extract the COinS metadata

Look in the output from step 2 above - it should contain 'reference-*'
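
Step 3 could be sketched roughly as below. It assumes the HTML from step 2 carries the metadata as COinS spans, i.e. span elements with class "Z3988" whose title attribute holds a URL-encoded OpenURL key/value string; that is the COinS convention, not something I have checked against the mobileview output. The names CoinsParser, extract_coins and dois are made up for illustration.

```python
from html.parser import HTMLParser
from urllib.parse import parse_qs


class CoinsParser(HTMLParser):
    """Collect the decoded title attribute of every COinS span."""

    def __init__(self):
        super().__init__()
        self.records = []

    def handle_starttag(self, tag, attrs):
        if tag != "span":
            return
        attr = dict(attrs)
        classes = (attr.get("class") or "").split()
        # Assumption: COinS spans are marked with the class "Z3988".
        if "Z3988" in classes and attr.get("title"):
            # The title is a query string; parse_qs maps key -> list of values.
            self.records.append(parse_qs(attr["title"]))


def extract_coins(html):
    """Return one dict per COinS span found in the HTML fragment."""
    parser = CoinsParser()
    parser.feed(html)
    parser.close()
    return parser.records


def dois(records):
    """Pull DOIs out of rft_id values of the form info:doi/<doi>."""
    prefix = "info:doi/"
    found = []
    for record in records:
        for value in record.get("rft_id", []):
            if value.startswith(prefix):
                found.append(value[len(prefix):])
    return found


# Hypothetical fragment standing in for the HTML returned by step 2.
sample_html = (
    '<li id="cite_note-1"><span class="Z3988" '
    'title="ctx_ver=Z39.88-2004&amp;rft_id=info%3Adoi%2F10.1000%2F182'
    '&amp;rft.atitle=Example"></span></li>'
)
```

Comparing the output of dois(extract_coins(html)) against a local database on each poll would show whether a page's DOI set has changed, which is the check described in the original question.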

--
John Vandenberg

_______________________________________________
Wikitech-l mailing list
[email protected]
https://lists.wikimedia.org/mailman/listinfo/wikitech-l