jindrichmynarz created this task.
jindrichmynarz added a project: Wikidata.
Herald added a subscriber: Aklapper.

TASK DESCRIPTION

Wikidata's SPARQL endpoint doesn't escape commas in IRIs in CSV output, causing the produced CSV to be syntactically invalid. For example, the issue can be replicated using the following request:

curl \
  -H "Accept:text/csv" \
  --data-urlencode "query=SELECT (<https://example.com/a,b> AS ?result) WHERE {}" \
  https://query.wikidata.org/sparql

The request produces the following CSV results:

result
https://example.com/a,b

This CSV fails to parse correctly, since the second row is interpreted as two columns. Correctly escaped, the results should look like this:

result
"https://example.com/a,b"
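
The difference can be checked with any standards-following CSV parser. Below is a minimal sketch using Python's standard `csv` module. The two strings are the outputs shown above; the CRLF line endings and the helper name `parse_rows` are assumptions made for illustration, not something taken from the endpoint.

```python
import csv
import io

# The two result bodies from above (CRLF line endings are an assumption).
unquoted = "result\r\nhttps://example.com/a,b\r\n"
quoted = 'result\r\n"https://example.com/a,b"\r\n'


def parse_rows(text):
    """Parse CSV text into a list of rows (hypothetical helper)."""
    return list(csv.reader(io.StringIO(text, newline="")))


# Unquoted: the data row splits into two fields under a one-column header.
unquoted_rows = parse_rows(unquoted)

# Quoted: the data row stays a single field holding the whole IRI.
quoted_rows = parse_rows(quoted)

# A CSV writer with default (minimal) quoting emits the quoted form.
buffer = io.StringIO(newline="")
csv.writer(buffer).writerow(["https://example.com/a,b"])
written = buffer.getvalue()

print(unquoted_rows)
print(quoted_rows)
print(repr(written))
```

With the unquoted input the data row has two fields while the header has one; with the quoted input the IRI round-trips intact.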

Commas in IRIs typically appear in those linking to other Wikimedia sites, such as <https://en.wikipedia.org/wiki/Versailles,_Yvelines>.

This might be an upstream issue in Blazegraph, the RDF store backing Wikidata's SPARQL endpoint.


TASK DETAIL
https://phabricator.wikimedia.org/T200612

To: jindrichmynarz
Cc: Aklapper, jindrichmynarz, Lahi, Gq86, GoranSMilovanovic, QZanden, LawExplorer, Wikidata-bugs, aude, Mbch331