Smalyshev added a comment.

I think we have several concepts here that need to be refined.

  1. Canonical object URI - this is the URI that uniquely identifies an object in the Wikimedia world and, by extension, in the whole world of linked data. Note that in theory this URI does not have to produce any content when accessed (in fact, it need not even use an accessible scheme like http:). However, it is common and beneficial to link it with:
  2. Data access URI - this is the address one can use to retrieve some representation of the object identified above. The kind of representation varies a lot: sometimes it is a text description, sometimes some kind of RDF, sometimes a negotiated page, etc. I suggest we use content negotiation as much as we can and choose sane defaults when we can't. I also suggest that we link alternative representations to this data URI.
  3. Human-readable URI - since we are in the wiki world, our content is meant to be edited by humans, and thus has a human-readable (at least to a certain extent :) representation that you can interpret and edit. Not every object would have one (e.g. individual values in Wikidata don't), but many interesting ones would.

I would suggest designing a scheme that supports each of the above and allows moving between them automatically - i.e. given one of them, it is easy for a simple script to get to the others. I'd also suggest using redirects and content negotiation to reconcile the differences between how we represent things in the wiki and how we want external URLs to look.
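
To illustrate what "easy for a simple script" could mean, here is a minimal Python sketch. It assumes the Commons layout proposed further down in this comment (/data/{name} as the canonical URI, /wiki/Data:{name}.map as the human-readable page). The function names, the prefix constants and the explicit `ext` parameter are hypothetical, not an existing API.

```python
from urllib.parse import urlsplit, urlunsplit

CANONICAL_PREFIX = "/data/"   # assumed canonical path prefix
WIKI_PREFIX = "/wiki/Data:"   # assumed human-readable path prefix


def canonical_to_human(uri: str, ext: str = ".map") -> str:
    """Map a canonical URI to the wiki page a human would read or edit."""
    parts = urlsplit(uri)
    if not parts.path.startswith(CANONICAL_PREFIX):
        raise ValueError(f"not a canonical data URI: {uri}")
    name = parts.path[len(CANONICAL_PREFIX):]
    # Query string and fragment are dropped on purpose.
    return urlunsplit((parts.scheme, parts.netloc, WIKI_PREFIX + name + ext, "", ""))


def human_to_canonical(uri: str, ext: str = ".map") -> str:
    """Map a wiki page URL back to the canonical URI, stripping the extension."""
    parts = urlsplit(uri)
    if not parts.path.startswith(WIKI_PREFIX):
        raise ValueError(f"not a Data: page URL: {uri}")
    name = parts.path[len(WIKI_PREFIX):]
    if ext and name.endswith(ext):
        name = name[: -len(ext)]
    return urlunsplit((parts.scheme, parts.netloc, CANONICAL_PREFIX + name, "", ""))
```

In this sketch the data access URI is simply the canonical URI fetched with a suitable Accept header, which is only one possible reading. The `ext` parameter exists because a canonical URI without the .map suffix does not by itself say which page type it maps to.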

/api/rest_v1/page/{type}/{title}{/revision} pattern

I do not think we should include revisions in data URIs unless we intend to represent our revision structure in linked data formats (which I hope we don't; 99.99% of intended usage won't need it). Also, the api/rest_v1 part should not be part of the canonical data URI: the api part because the canonical URI should be the same however you access it (via a specific API or not), since it identifies the object, not a specific way of retrieving it; and rest_v1 for the same reason, plus the canonical URI should not change if we change our API version. Basically, unless we radically change the whole data structure, the URI should be forever (and even if we do, there's an argument for preserving the old one, so, even more basically, canonical URIs are forever).
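
A small Python sketch of that point: paths following the REST pattern quoted above, for different API versions and with or without a revision, would all collapse to one canonical path. The regular expression, the /data/ prefix (taken from the proposal below), the function name and the "data" value used for {type} in the usage example are assumptions for illustration only.

```python
import re

# Assumed shape of the quoted pattern: /api/rest_v{N}/page/{type}/{title}{/revision}
REST_PATH = re.compile(
    r"^/api/rest_v(?P<version>\d+)/page/(?P<type>[^/]+)/(?P<title>[^/]+)"
    r"(?:/(?P<revision>\d+))?$"
)


def canonical_path(rest_path: str) -> str:
    """Drop the API prefix, API version and revision; keep only the title."""
    match = REST_PATH.match(rest_path)
    if match is None:
        raise ValueError(f"not a REST page path: {rest_path}")
    return "/data/" + match.group("title")
```

For example, `canonical_path("/api/rest_v1/page/data/Avignon_City_Wall/12345")` and `canonical_path("/api/rest_v2/page/data/Avignon_City_Wall")` both yield `/data/Avignon_City_Wall`, so an API version bump or a new revision would not change the identifier.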

I would propose the following scheme for Commons:

https://commons.wikimedia.org/data/Avignon_City_Wall as the canonical URI (we can add .map if it's important, but if we can avoid it, it looks nicer without it). Requests to this URI would be resolved as follows:

  • if accessed with an Accept type that is known to us and has a representation, it either produces this representation directly or redirects to https://commons.wikimedia.org/wiki/Data:Avignon_City_Wall.map?action=
  • if Accept suggests it is a browser asking for HTML, it redirects to https://commons.wikimedia.org/wiki/Data:Avignon_City_Wall.map
  • if there is no Accept, choose a sane default for it - e.g. JSON or something - and proceed as if this were the requested type.

Relying on redirects for the canonical URI should solve the caching problem, at the (supposedly minimal) cost of an extra HTTP request in some cases. Of course, tools that care about that cost could access the target URLs directly.
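
The three cases above could be sketched roughly as follows in Python. This is not an existing MediaWiki handler: `resolve`, `parse_accept`, `KNOWN_TYPES` and `DEFAULT_TYPE` are hypothetical names, JSON is assumed to be the only known machine-readable type as well as the default, and a known type is served directly rather than redirected. Falling back to the default for an unrecognized Accept value (such as */*) is also an assumption, since the cases above do not cover it.

```python
from typing import List, Optional, Tuple

BASE = "https://commons.wikimedia.org"
DEFAULT_TYPE = "application/json"     # assumed "sane default"
KNOWN_TYPES = {"application/json"}    # assumed machine-readable representations
HTML_TYPES = {"text/html", "application/xhtml+xml"}


def parse_accept(header: str) -> List[str]:
    """Return media types from an Accept header, highest q-value first."""
    entries = []
    for item in header.split(","):
        fields = [field.strip() for field in item.split(";")]
        media_type = fields[0].lower()
        if not media_type:
            continue
        quality = 1.0
        for param in fields[1:]:
            if param.startswith("q="):
                try:
                    quality = float(param[2:])
                except ValueError:
                    quality = 0.0
        entries.append((quality, media_type))
    # Stable sort: types with equal q keep their order in the header.
    entries.sort(key=lambda entry: entry[0], reverse=True)
    return [media_type for quality, media_type in entries if quality > 0]


def resolve(name: str, accept: Optional[str], ext: str = ".map") -> Tuple[str, str]:
    """Return ("serve", media_type) or ("redirect", url) for /data/{name}."""
    if not accept or not accept.strip():
        return ("serve", DEFAULT_TYPE)          # no Accept: use the default
    for media_type in parse_accept(accept):
        if media_type in KNOWN_TYPES:
            return ("serve", media_type)        # known type with a representation
        if media_type in HTML_TYPES:
            return ("redirect", f"{BASE}/wiki/Data:{name}{ext}")  # browser
    return ("serve", DEFAULT_TYPE)              # assumption: unknown types get the default
```

A typical browser Accept header starts with text/html, so `resolve("Avignon_City_Wall", ...)` would redirect it to the wiki page, while a client sending application/json or no Accept at all would get JSON.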

