Christopher added a comment.
The RDF standard that you reference explicitly supports my point.

> IRI normalization: Interoperability problems can be avoided by minting only IRIs that are normalized according to Section 5 of [RFC3987].
> Non-normalized forms that are best avoided include:
> Percent-encoding of characters where it is not required by IRI syntax

The sitelinks are **not** properly represented in the RDF according to the standards. When a sitelink is rendered as a **dereferenced URI** (a webpage link), yes, it should be percent-encoded, because its function is defined by the HTTP protocol. In RDF, however, the link should be represented as a reference that is string-comparable to the source, in this case the wiki article IRI.

I have //many problems// with Wikidata as a linked data source, but I am able to work around them. What is not fixed at the source can be normalized with Java and then rectified with SPARQL Update, though it would certainly be better for the developers to produce data that was not so proprietary.

For example, https://phabricator.wikimedia.org/T121274 can easily be fixed in the RDF with SPARQL Update once it is in the linked data store. For some reason, it is an epic task for Wikibase to distinguish object properties from datatype properties, and this **major interoperability problem** remains. The consequence of this übercomplexity is that you have hacks like the authority control gadget that the whole project is dependent on. Why?

TASK DETAIL
https://phabricator.wikimedia.org/T132319
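
To illustrate the kind of client-side normalization the comment describes, here is a minimal sketch. It is written in Python rather than the Java the comment mentions, and it is not taken from Wikibase or from any tool named in the task. The function name `normalize_iri` and the example article URLs are hypothetical, and the test for which decoded characters may stay unescaped is a simplifying assumption (unreserved ASCII plus any printable non-ASCII character), not the full `ucschar` ranges of RFC 3987.

```python
import re
from urllib.parse import quote

# RFC 3986 "unreserved" characters: these never need percent-encoding.
_UNRESERVED = set(
    "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789-._~"
)

# One or more consecutive %XX escapes, so multi-byte UTF-8 sequences
# are decoded together.
_PCT_RUN = re.compile(r"(?:%[0-9A-Fa-f]{2})+")


def _keep_decoded(ch):
    """Return True if ch may appear unescaped in the IRI.

    Assumption: any printable non-ASCII character is treated as allowed;
    a strict implementation would check the ucschar ranges of RFC 3987.
    """
    if ch in _UNRESERVED:
        return True
    return ord(ch) > 0x9F and ch.isprintable()


def _decode_run(match):
    raw = bytes.fromhex(match.group(0).replace("%", ""))
    try:
        text = raw.decode("utf-8")
    except UnicodeDecodeError:
        # Not valid UTF-8: leave the escapes exactly as they were.
        return match.group(0)
    # Re-escape (with uppercase hex) anything that must stay encoded,
    # such as reserved characters or spaces.
    return "".join(
        ch if _keep_decoded(ch) else quote(ch, safe="") for ch in text
    )


def normalize_iri(iri):
    """Remove percent-encoding that IRI syntax does not require."""
    return _PCT_RUN.sub(_decode_run, iri)


if __name__ == "__main__":
    print(normalize_iri("https://de.wikipedia.org/wiki/K%C3%B6ln"))
    print(normalize_iri("https://en.wikipedia.org/wiki/A%2FB"))
```

With this sketch, an escaped UTF-8 sequence such as `%C3%B6` becomes the literal character `ö`, while an escaped reserved character such as `%2F` stays encoded, so the result can be compared as a plain string against the article IRI.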
