Hi!

> statements (about 2.5M) and on the question if SPARQL could list all
> entries in Wikidata that do not have statements. I played a bit with

Technically, it could, but since there are so many of them, the query
might not finish in time. The problem is that there are no indexes on
something not existing, so what probably happens is that the database
goes entity by entity trying to find one that doesn't have a statement,
and that is slow. I think there may be a bug in the LIMIT
implementation, or maybe it's just indeed taking too long...

> combinations of OPTIONAL and FILTER-BOUND and FILTER NOT EXIST...
> something like:
>
> PREFIX wikibase: <http://wikiba.se/ontology#>
> SELECT DISTINCT ?entry ?label ?statement WHERE {
>   ?entry rdfs:label ?label . FILTER (lang(?label) = "en")
>   FILTER NOT EXISTS {
>     ?statement ?prop ?entry ;
>       wikibase:rank ?rank .
>   }
> } LIMIT 5

This query also seems a bit wrong, since it looks for ?entry as the
object of the statement triple, not the subject.

> But there was something else I noted... statements are not typed...
> that would probably kick in some index, rather than the above query,
> and the documentation actually speaks about wikibase:Statement [1] but
> if I search for anything rdf:type-d as such, then it finds nothing in
> the SPARQL end point:

Right, please check out:
https://www.mediawiki.org/wiki/Wikibase/Indexing/RDF_Dump_Format#WDQS_data_differences

wikibase:Statement is omitted from the database for performance
reasons. You could still match statements by URL by converting them
with str() and then using the substr() function, but that probably
wouldn't help much: there are a lot of statements, so the filtering
would not be very selective.

-- 
Stas Malyshev
smalys...@wikimedia.org

_______________________________________________
Wikidata mailing list
Wikidata@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata
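
For reference, here is a sketch of how the quoted query might look with ?entry moved to the subject position, as the reply suggests. It has not been run against the endpoint. It assumes that statement nodes carry wikibase:rank, as in the quoted query, and it declares the rdfs: prefix explicitly in case the endpoint does not predefine it. ?statement is dropped from the SELECT list because variables used only inside FILTER NOT EXISTS are never bound in the results, and DISTINCT is dropped to avoid extra work.

```sparql
PREFIX wikibase: <http://wikiba.se/ontology#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>

SELECT ?entry ?label WHERE {
  # Candidate entries: anything with an English label.
  ?entry rdfs:label ?label .
  FILTER (lang(?label) = "en")

  # Keep only entries that do not link to any statement node.
  # ?entry is the subject here; a node counts as a statement
  # if it has a wikibase:rank (assumption taken from the quoted query).
  FILTER NOT EXISTS {
    ?entry ?prop ?statement .
    ?statement wikibase:rank ?rank .
  }
} LIMIT 5
```

Even in this form the query still has to check candidate entries one at a time for the absence of a statement, so it may well run into the same timeout described above.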