M.schmidt00 added a subscriber: M.schmidt00.
M.schmidt00 added a comment.

Generally speaking, the FILTER NOT EXISTS query is quite challenging: it 
requires the evaluation of a composed graph pattern for each of the outer 
solutions. What’s missing here for the query to run efficiently is a distinct 
projection on the variables that are bound outside and projected into the 
FILTER NOT EXISTS clause, to eliminate redundant computations. This is in line 
with other issues we’re currently working on; we will address these issues as 
part of ticket http://trac.bigdata.com/ticket/1140 in the near future.
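
To illustrate why the distinct projection matters, here is a hypothetical 
pattern. It is not the query from this ticket, and the example.org predicate 
is made up purely for the sketch:

```sparql
SELECT ?statement ?ref
WHERE {
  <http://www.wikidata.org/entity/Q30> ?statementPred ?statement .
  ?statement <http://www.w3.org/ns/prov#wasDerivedFrom> ?ref .
  # Hypothetical inner pattern: the only outer variable it mentions is ?ref.
  FILTER NOT EXISTS {
    ?ref <http://example.org/hypotheticalProperty> ?value .
  }
}
```

In a pattern of this shape the outcome of the inner check can depend only on 
the value of ?ref, so it would be enough to compute it once per distinct ?ref 
rather than once per outer (?statement, ?ref) solution.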

In the meantime, here’s a proposal for rewriting the query into what we 
understand to be an equivalent one that should run in the sub-second range. 
Note that it differs slightly from your query (for example, it does not 
contain the filters) because we’ve got a different data set running here at 
the moment. The key idea is to use GROUP BY and HAVING COUNT = 1 to identify 
references (?ref) that are used by exactly one statement. It should be 
straightforward to apply to the original query:

  SELECT ?s ?p ?o
  WITH {
    SELECT ?ref
    WHERE {
      <http://www.wikidata.org/entity/Q30> ?statementPred ?statement .
      ?statement <http://www.w3.org/ns/prov#wasDerivedFrom> ?ref . 
    }
    GROUP BY ?ref
    HAVING (COUNT(?statement)=1)
  } AS %unreferenced
  WHERE {
    INCLUDE %unreferenced
    ?ref ?expandedValuePred ?s .
    ?s ?p ?o .
  }

The query uses the (non-standard) SPARQL extension “WITH”, which defines a 
named subquery that is pulled in via INCLUDE and allows enforcing the order in 
which subqueries are evaluated. Once we’ve tackled the optimization, you 
should be able to switch back to the original query and have it execute 
efficiently.
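
For comparison, the same logic can be sketched with a plain SPARQL 1.1 
subquery in place of the named subquery. This is only a sketch: we have not 
run it against your data, and with this form the engine may choose its own 
evaluation order, so it may not perform the same way:

```sparql
SELECT ?s ?p ?o
WHERE {
  {
    # Same inner query as the named subquery above: references that are
    # attached to exactly one statement of Q30.
    SELECT ?ref
    WHERE {
      <http://www.wikidata.org/entity/Q30> ?statementPred ?statement .
      ?statement <http://www.w3.org/ns/prov#wasDerivedFrom> ?ref .
    }
    GROUP BY ?ref
    HAVING (COUNT(?statement) = 1)
  }
  ?ref ?expandedValuePred ?s .
  ?s ?p ?o .
}
```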


TASK DETAIL
  https://phabricator.wikimedia.org/T96094

To: M.schmidt00
Cc: M.schmidt00, Thompsonbry, Thompsonbry.systap, Beebs.systap, Haasepeter, 
Manybubbles, Aklapper, Smalyshev, jkroll, Wikidata-bugs, Jdouglas, aude, 
GWicke, daniel, JanZerebecki


