Addshore added a comment.

A rough version of my approach can be seen at
https://github.com/wikimedia/analytics-limn-wikidata-data/blob/master/graphite/sparql/references.php

First, get all properties that should be used as references:

  SELECT ?s WHERE {?s wdt:P31/wdt:P279* wd:Q18608359}

Then query the count for each property:

                $query .= "SELECT (count(?s) AS ?scount) WHERE {";
                $query .= "?wdref <http://www.wikidata.org/prop/reference/$propertyId> ?x .";
                $query .= "?s prov:wasDerivedFrom ?wdref";
                $query .= "}";
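
For illustration only, the two steps above could be glued together roughly as follows. This is a sketch in Python rather than the PHP of the linked script, and it is not taken from that script: the helper names property_ids and count_query are hypothetical, and it assumes the first query's result arrives in the standard SPARQL JSON results format with the property IRIs bound to ?s.

```python
# Hypothetical helpers, not part of references.php.
REFERENCE_PREFIX = "http://www.wikidata.org/prop/reference/"


def property_ids(sparql_json):
    """Return property IDs (e.g. 'P248') from a SPARQL JSON result.

    Assumes each binding has an ?s variable holding an entity IRI such as
    http://www.wikidata.org/entity/P248.
    """
    bindings = sparql_json["results"]["bindings"]
    return [b["s"]["value"].rsplit("/", 1)[-1] for b in bindings]


def count_query(property_id):
    """Build the per-property count query shown above."""
    return (
        "PREFIX prov: <http://www.w3.org/ns/prov#>\n"
        "SELECT (count(?s) AS ?scount) WHERE {\n"
        f"  ?wdref <{REFERENCE_PREFIX}{property_id}> ?x .\n"
        "  ?s prov:wasDerivedFrom ?wdref\n"
        "}"
    )
```

Each string returned by count_query would then be sent to the query service once per property.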

Of course, this runs into the issue that a single statement can be included 
in multiple counts.

So instead I will simply query for the statements that are referenced by 
each property (a query which completes, but times out on the public interface 
when sending back the result) and then do some post-processing to work out 
the number that we actually want.
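
As a rough idea of what that post-processing might look like (a sketch in Python, not the actual script): it assumes the statement IRIs have already been fetched per property into a dict, and that the number wanted is the count of distinct referenced statements overall. The function name summarize and the input shape are hypothetical.

```python
def summarize(statements_by_property):
    """Deduplicate statement IRIs on the client side.

    statements_by_property maps a property ID to the list of statement IRIs
    returned for it (assumed input shape). Returns a tuple of
    (distinct count per property, distinct count across all properties).
    """
    per_property = {
        pid: len(set(stmts)) for pid, stmts in statements_by_property.items()
    }
    # A statement referenced via several properties is only counted once here.
    distinct_overall = set().union(*statements_by_property.values())
    return per_property, len(distinct_overall)
```

For example, a statement that appears under two reference properties adds one to each per-property count but only one to the overall total.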

This is now blocked only on https://phabricator.wikimedia.org/T120010.

Also, when digging into all of these queries, it turns out that adding 
DISTINCT is what actually causes them to exceed the execution time limit.
If you remove the DISTINCT from your query, it will actually complete rather 
quickly.


TASK DETAIL
  https://phabricator.wikimedia.org/T117234

To: Christopher, Addshore
Cc: Lydia_Pintscher, StudiesWorld, Addshore, Christopher, Aklapper, 
Wikidata-bugs, aude, Mbch331