VladimirAlexiev added a comment.
https://github.com/eclipse/rdf4j/issues/1291 got this answer:

http://w3.org/TR/rdf11-concepts/#section-IRIs says the following:

> IRI equality: Two IRIs are equal if and only if they are equivalent under Simple String Comparison according to section 5.1 of [RFC3987]. Further normalization MUST NOT be performed when comparing IRIs for equality.

This explicitly states that in RDF, IRIs are considered equal only under simple string comparison, and even goes so far as to explicitly forbid any other normalizations. Normalization of %-encoding is one of these "other" normalizations (see http://tools.ietf.org/html/rfc3987#section-5.3.2.3), and is therefore explicitly ruled out.

This is further enforced by RFC3987 itself, which in section 5.1 states:

> Applications using IRIs as identity tokens with no relationship to a protocol MUST use the Simple String Comparison (see section 5.3.1). All other applications MUST select one of the comparison practices from the Comparison Ladder (see section 5.3 or, after IRI-to-URI conversion, select one of the comparison practices from the URI comparison ladder in [RFC3986], section 6.2).

RDF4J (and RDF triplestores in general) fall into the first category: IRIs are considered identity tokens not related to a specific protocol.

rdf4j will probably add a function like `wikibase:decodeUri` to deal with this, but won't do %-decoding automatically.

--------------

So there's nothing to do on this issue except: document the Wikipedia encoding rules so we can apply the same encoding before querying.

SPARQL's encode-for-uri <https://www.w3.org/TR/xpath-functions/#func-encode-for-uri> encodes everything except upper- and lower-case letters A-Z, the digits 0-9, HYPHEN-MINUS ("-"), LOW LINE ("_"), FULL STOP ".", and TILDE "~", but Wikipedia URLs use less encoding, so encode-for-uri alone will not reproduce them.
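
To make the mismatch concrete, here is a minimal Python sketch, not RDF4J or Wikibase code. It assumes Python 3.7+, where `urllib.parse.quote(text, safe="")` leaves unescaped exactly the character set listed above for encode-for-uri. The function names `encode_for_uri`, `iris_equal`, and `decode_uri` and the `example.org` IRIs are made up for illustration; `decode_uri` only approximates what a `wikibase:decodeUri`-style function might do.

```python
from urllib.parse import quote, unquote


def encode_for_uri(text: str) -> str:
    """Approximate encode-for-uri: percent-encode (as UTF-8) every character
    except A-Z, a-z, 0-9, "-", "_", "." and "~".
    Assumes Python 3.7+, where "~" is never quoted."""
    return quote(text, safe="")


def iris_equal(a: str, b: str) -> bool:
    """Simple String Comparison: plain character-by-character equality,
    with no %-decoding or any other normalization."""
    return a == b


def decode_uri(iri: str) -> str:
    """Hypothetical stand-in for a decodeUri-style helper:
    turns %XX escapes back into characters."""
    return unquote(iri)


if __name__ == "__main__":
    encoded = "http://example.org/wiki/Z%C3%BCrich"
    plain = "http://example.org/wiki/Zürich"

    # Different strings, so different IRIs as far as a triplestore is concerned.
    print(iris_equal(encoded, plain))

    # They only match if the query decodes (or encodes) one side explicitly.
    print(iris_equal(decode_uri(encoded), plain))

    # encode-for-uri escapes parentheses, which is more encoding than a
    # lightly encoded URL such as ".../Mercury_(planet)" would carry.
    print(encode_for_uri("Mercury_(planet)"))
```

The point of the sketch is that the conversion has to be done explicitly on the query side, and it has to follow the same rules that produced the stored IRI; a stricter encoder yields a different string and therefore a different IRI.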
TASK DETAIL https://phabricator.wikimedia.org/T210738
