daniel created this task. daniel added projects: Wikidata, ArchCom-RfC (ArchCom-Approved), Wikimedia-Apache-configuration.
Wikimedia is managing a growing amount of machine-readable data as wiki page content. The latest addition is the Data namespace on Commons, which hosts tabular data like Data:Dolmens_of_the_Preseli_Hills.tab and geographic data like Data:Avignon_City_Wall.map.
Problem:
There is currently no canonical URI/URL for referring to and retrieving these data sets.
Concrete need:
Wikidata can reference geo-shape data from the Data namespace on Commons. To represent such references in RDF, the data set needs a canonical URI. See T159517: [RFC] RDF mapping for geo-shape / URIs for commons data pages
Current solutions:
- Raw page data can be retrieved for most data types using action=raw with the "ugly" URL form: https://commons.wikimedia.org/w/index.php?title=Data:Avignon_City_Wall.map&action=raw. However, this is not supported for data types that have "direct editing" disabled. E.g. https://www.wikidata.org/w/index.php?title=Q23&action=raw does not work.
- Wikidata uses https://www.wikidata.org/entity/Q23 as the canonical URI of concepts, and https://www.wikidata.org/wiki/Special:EntityData/Q23 as the canonical URI of the description. Both apply content negotiation and trigger a 303 redirect. The canonical URL for a specific serialization has the form https://www.wikidata.org/wiki/Special:EntityData/Q23.ttl.
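The two existing URL forms can be sketched as small helpers. This is only an illustration: the function names are hypothetical, it assumes the raw-content action is spelled action=raw, and it assumes the Wikidata description URL uses the Special:EntityData/<id> form with an optional file extension.

```python
from urllib.parse import urlencode


def raw_page_url(host: str, title: str) -> str:
    """Build the "ugly" index.php URL that returns a page's raw content."""
    # safe=":" keeps the namespace separator in the title readable.
    query = urlencode({"title": title, "action": "raw"}, safe=":")
    return f"https://{host}/w/index.php?{query}"


def entity_data_url(entity_id: str, extension: str = "") -> str:
    """Build the Wikidata description URL, optionally for one serialization."""
    # Without an extension, the server is expected to negotiate the format.
    suffix = f".{extension}" if extension else ""
    return f"https://www.wikidata.org/wiki/Special:EntityData/{entity_id}{suffix}"


print(raw_page_url("commons.wikimedia.org", "Data:Avignon_City_Wall.map"))
print(entity_data_url("Q23", "ttl"))
```

The helpers only construct URLs; they do not fetch anything, so they say nothing about which data types actually accept the raw action.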
Proposed URIs for data:
- Special case for the data namespace: https://commons.wikimedia.org/data/Avignon_City_Wall.map
- ...or with the namespace, so other kinds of data can be added: https://commons.wikimedia.org/data/Data:Avignon_City_Wall.map
- ...or bind it to action=raw explicitly: https://commons.wikimedia.org/raw/Data:Avignon_City_Wall.map
Note that in contrast to Wikidata concept URIs, the above URIs identify descriptions (data), not the thing described by the data.
Also note that these would return the "internal" serialization of the data (with the appropriate MIME type in the response header). They do not support custom serialization or apply content negotiation.
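To compare the three proposed forms, here is a rough sketch of how each could be mapped onto the existing raw-content URL. It is not the actual rewrite configuration: the function name, the implicit_namespace flag, and the assumption that the target is index.php with action=raw are all illustrative.

```python
from urllib.parse import quote

# Assumed target endpoint; the real rewrite rule may differ.
RAW_ENDPOINT = "https://commons.wikimedia.org/w/index.php"


def to_raw_url(path: str, implicit_namespace: bool = True) -> str:
    """Map a proposed /data/... or /raw/... path to a raw-content URL."""
    prefix, _, rest = path.lstrip("/").partition("/")
    if prefix not in ("data", "raw") or not rest:
        raise ValueError(f"not a data path: {path!r}")
    if prefix == "data" and implicit_namespace:
        # Variant 1: /data/<name> always refers to the Data namespace.
        title = f"Data:{rest}"
    else:
        # Variants 2 and 3: the path already carries the full page title.
        title = rest
    return f"{RAW_ENDPOINT}?title={quote(title, safe=':')}&action=raw"


print(to_raw_url("/data/Avignon_City_Wall.map"))
print(to_raw_url("/data/Data:Avignon_City_Wall.map", implicit_namespace=False))
print(to_raw_url("/raw/Data:Avignon_City_Wall.map"))
```

All three calls resolve to the same page, which shows that the variants differ only in how much of the title is spelled out in the path. The first two cannot coexist under /data/ without a rule for telling them apart, which the flag stands in for here.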
Question:
Do we need to plan for supporting custom serialization and content negotiation? Or is it sufficient to later add a query parameter to specify an alternative serialization?
Example: to get .tab data as CSV instead of JSON, one would use a URL like https://commons.wikimedia.org/data/Dolmens_of_the_Preseli_Hills.tab?format=text/csv.
Note that specifying the format makes no sense for a "pure" URI; it is only relevant when resolving the URI as a URL and fetching the associated data.
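The distinction between the identifying URI and the retrieval URL can be sketched as follows. This is a hypothetical helper, not an existing API: the format parameter name comes from the example above, and the default of application/json is an assumption based on JSON being the internal serialization of .tab pages.

```python
from urllib.parse import parse_qs, urlsplit, urlunsplit


def split_format(url: str, default: str = "application/json") -> tuple[str, str]:
    """Separate the identifying URI from the requested serialization."""
    parts = urlsplit(url)
    params = parse_qs(parts.query)
    # Fall back to the assumed internal serialization if none is requested.
    fmt = params.get("format", [default])[0]
    # The identifier is the URL with query string and fragment removed.
    identifier = urlunsplit((parts.scheme, parts.netloc, parts.path, "", ""))
    return identifier, fmt


uri, fmt = split_format(
    "https://commons.wikimedia.org/data/Dolmens_of_the_Preseli_Hills.tab?format=text/csv"
)
print(uri)
print(fmt)
```

Both the plain URL and the one with a format parameter yield the same identifier, so RDF references would use the form without the query string.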
Cc: Aklapper, Jonas, Smalyshev, mkroetzsch, Lydia_Pintscher, daniel, QZanden, D3r1ck01, Izno, suriyaa, Wikidata-bugs, aude, GWicke, jayvdb, Southparkfan, fbstj, RobLa-WMF, santhosh, Mbch331, Jay8g, Ltrlg, Glaisher, bd808, Krenair, Legoktm