M.schmidt00 added a subscriber: M.schmidt00.
M.schmidt00 added a comment.
Generally speaking, the FILTER NOT EXISTS query is quite challenging: it
requires the evaluation of a composed graph pattern for each of the outer
solutions. What’s missing here for the query to run efficiently is a distinct
projection on the variables that are bound outside and projected into the
FILTER NOT EXISTS clause, to eliminate redundant computations. This is in line
with other issues we’re currently working on; we will address these issues as
part of ticket http://trac.bigdata.com/ticket/1140 in the near future.
In the meantime, here’s a proposal for rewriting the query into (as far as we
understand) an equivalent one that should run in the sub-second range. Note
that the query differs slightly from yours, e.g. it does not contain the
filters, because we’ve got a different data set running here at the moment.
The key idea is to use GROUP BY and HAVING COUNT = 1 to identify references
(?ref) that are used by exactly one statement; it should be straightforward to
apply to the original query:
SELECT ?s ?p ?o
WITH {
  SELECT ?ref
  WHERE {
    <http://www.wikidata.org/entity/Q30> ?statementPred ?statement .
    ?statement <http://www.w3.org/ns/prov#wasDerivedFrom> ?ref .
  }
  GROUP BY ?ref
  HAVING (COUNT(?statement) = 1)
} AS %unreferenced
WHERE {
  INCLUDE %unreferenced
  ?ref ?expandedValuePred ?s .
  ?s ?p ?o .
}
The query uses the (non-standard) SPARQL extension “WITH” (a named subquery,
pulled in via INCLUDE), which allows enforcing the order in which subqueries
are evaluated. Once we’ve tackled the optimization, you can switch back to the
original query, which should then execute more efficiently.
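For comparison, here is a minimal sketch of the same rewrite using only a
standard SPARQL 1.1 subquery instead of the WITH/INCLUDE extension. It is our
assumption that this form returns the same results; unlike the named subquery,
it does not pin the evaluation order, so whether it runs as fast depends on the
optimizer:

```sparql
SELECT ?s ?p ?o
WHERE {
  {
    # Inner subquery: keep only the references used by exactly one
    # statement of the entity.
    SELECT ?ref
    WHERE {
      <http://www.wikidata.org/entity/Q30> ?statementPred ?statement .
      ?statement <http://www.w3.org/ns/prov#wasDerivedFrom> ?ref .
    }
    GROUP BY ?ref
    HAVING (COUNT(?statement) = 1)
  }
  # Outer pattern: expand each remaining reference, as in the query above.
  ?ref ?expandedValuePred ?s .
  ?s ?p ?o .
}
```

This variant may be useful when the query has to run on an engine that does
not support named subqueries.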
TASK DETAIL
https://phabricator.wikimedia.org/T96094