Hi!

> statements (about 2.5M) and on the question if SPARQL could list all
> entries in Wikidata that do not have statements. I played a bit with

Technically, it could, but since there are so many of them, the query
might not finish in time. The problem is that there are no indexes on
something not existing, so what probably happens is that the database
goes entity by entity trying to find one that doesn't have a statement,
and that is slow. I think there may be a bug in the LIMIT implementation,
or maybe it's just indeed taking too long...

> combinations of OPTIONAL and FILTER-BOUND and FILTER NOT EXIST...
> something like:
> 
> PREFIX wikibase: <http://wikiba.se/ontology#>
> SELECT DISTINCT ?entry ?label ?statement WHERE {
>   ?entry rdfs:label ?label . FILTER (lang(?label) = "en")
>   FILTER NOT EXISTS {
>     ?statement ?prop ?entry ;
>       wikibase:rank ?rank .
>   }
> } LIMIT 5

This query also seems a bit wrong, since it looks for ?entry as the
object of the triple, not the subject.
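
A rough sketch of the same query with ?entry moved to the subject
position. It is untested, it assumes statements hang off the entity via
some property and carry wikibase:rank, and it would likely still run
into the timeout described above:

```sparql
PREFIX wikibase: <http://wikiba.se/ontology#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>

# ?statement is dropped from SELECT: it is only bound inside
# FILTER NOT EXISTS, so it would always come back empty.
SELECT ?entry ?label WHERE {
  ?entry rdfs:label ?label . FILTER (lang(?label) = "en")
  FILTER NOT EXISTS {
    # entity as subject, statement node as object
    ?entry ?prop ?statement .
    ?statement wikibase:rank ?rank .
  }
} LIMIT 5
```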

> But there was something else I noted... statements are not typed...
> that would probably kick in some index, rather than the above query,
> and the documentation actually speaks about wikibase:Statement [1] but
> if I search for anything rdf:type-d as such, then it finds nothing in
> the SPARQL end point:

Right, please check out:
https://www.mediawiki.org/wiki/Wikibase/Indexing/RDF_Dump_Format#WDQS_data_differences

wikibase:Statement is omitted from the database for performance
reasons. You could still match statements by URL by converting them
with str() and then using the substr() function, but that probably
wouldn't help much, since there are a lot of statements and the
filtering would not be very selective.
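
For illustration, the str()/substr() matching could look roughly like
this. The statement URL prefix used here is an assumption (check it
against the dump format page above), and the query is untested:

```sparql
SELECT ?entry ?statement WHERE {
  ?entry ?prop ?statement .
  # skip literals, keep only IRI-valued objects
  FILTER (isIRI(?statement))
  # assumed statement prefix; it is 41 characters long
  FILTER (SUBSTR(STR(?statement), 1, 41) =
          "http://www.wikidata.org/entity/statement/")
} LIMIT 5
```

The filter has to be evaluated against every triple, so it does not
give the database anything to index on.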
-- 
Stas Malyshev
smalys...@wikimedia.org
