Smalyshev added a subscriber: Smalyshev.
Smalyshev added a comment.

Yeah, a 13-minute query is not really the best idea, I'm afraid. Also, `?wds a wikibase:Statement` should not have worked on query.wikidata.org, since the service strips those type triples (see https://www.mediawiki.org/wiki/Wikibase/Indexing/RDF_Dump_Format#WDQS_data_differences), though it can of course be run against a raw dump. In fact, you do not need the "a statement" part at all, since only statements ever appear on the right-hand side of p: predicates anyway.
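
To illustrate the point about the type triple, here is a minimal sketch. It assumes the original query contained a pattern of roughly this shape; P227 is used only because it is the property under discussion in this task, and the exact form of the original pattern is a guess.

```sparql
PREFIX wikibase: <http://wikiba.se/ontology#>
PREFIX p: <http://www.wikidata.org/prop/>

# Every object of a p: predicate is a statement node, so the pattern
# below already restricts ?wds to statements.
SELECT ?wds WHERE {
  ?s p:P227 ?wds .
  # Redundant extra constraint (assumed shape of the original pattern).
  # It needs the type triples, which are present in a raw RDF dump but
  # stripped on query.wikidata.org:
  # ?wds a wikibase:Statement .
}
```

Leaving the commented-out line out should give the same set of ?wds values on a raw dump, and it avoids depending on triples that the query service does not load.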
I also don't think DISTINCT is needed in the last query, since having the same reference twice is pretty rare, I think. The OPTIONAL part may perhaps be omitted too: if you enumerate all statements and then remove the ones with non-zero counts, you get the ones with zero counts (e.g. the MINUS operator could do it). With these modifications, a query like:

    prefix wikibase: <http://wikiba.se/ontology#>
    prefix wdt: <http://www.wikidata.org/prop/direct/>
    prefix prov: <http://www.w3.org/ns/prov#>
    prefix wd: <http://www.wikidata.org/entity/>
    prefix p: <http://www.wikidata.org/prop/>
    SELECT ?wds (count(?o) AS ?ocount) WHERE {
      ?s p:P227 ?wds .
      ?wds prov:wasDerivedFrom ?o .
    } GROUP BY ?wds

runs for me in 26 s. Of course, I may be missing something here.

In general, the query service is not well suited to queries that have to touch the whole database or a significant part of it; they will be slow. Going over 300K+ entities one by one has to take some time.

TASK DETAIL
  https://phabricator.wikimedia.org/T120166
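
For reference, the MINUS idea mentioned in the comment could be sketched as follows. This is only an illustration of the operator, not a query that was timed or run as part of this task: it lists the P227 statements that have no prov:wasDerivedFrom reference at all.

```sparql
PREFIX prov: <http://www.w3.org/ns/prov#>
PREFIX p: <http://www.wikidata.org/prop/>

SELECT ?wds WHERE {
  # Enumerate every P227 statement node.
  ?s p:P227 ?wds .
  # Remove the statements that have at least one reference;
  # ?wds is the shared variable that MINUS matches on.
  MINUS { ?wds prov:wasDerivedFrom ?o . }
}
```

A FILTER NOT EXISTS { ?wds prov:wasDerivedFrom ?o } clause would express the same condition; which form performs better on the query service is not something this sketch claims.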
