David, Thank you so much. This is very helpful and I've improved the wiki docs in a few places with this new information.
Thad
https://www.linkedin.com/in/thadguidry/

On Fri, Aug 7, 2020 at 12:31 PM David Causse <[email protected]> wrote:

> Some answers inline,
>
> On Fri, Aug 7, 2020 at 6:07 PM Thad Guidry <[email protected]> wrote:
>
>> Very nice David!
>>
>> 1. Does the MINUS actually utilize ElasticSearch indexes or just
>> Blazegraph?
>>
>
> No, elasticsearch is being used only during the call to the wikibase:mwapi
> SERVICE. Everything happening outside this call is handled by blazegraph.
>
>> I'd like to help the community by writing up a bit better documentation
>> on our SPARQL pages that talks about FILTER() versus MINUS() if no one has
>> this info floating around?
>> The only footnote I saw was:
>> " MINUS lets you select results that *don’t* fit some graph pattern. FILTER
>> NOT EXISTS is mostly equivalent (see the SPARQL spec for an example
>> where they differ), but – at least on WDQS – usually slower by quite a bit."
>> at the bottom of the SPARQL tutorial
>> <https://www.wikidata.org/wiki/Wikidata:SPARQL_tutorial>
>> and the wiki page SPARQL query service
>> <https://www.wikidata.org/wiki/Wikidata:SPARQL_query_service/queries#Excluding_subsets>
>> has:
>>
>> Excluding subsets
>>
>> SPARQL has three different idioms for excluding subsets:
>>
>> - OPTIONAL { ... ?x ... } FILTER(!bound(?x)),
>> - FILTER NOT EXISTS { ... }
>> - MINUS { ... }
>>
>> Currently, in almost all circumstances, Blazegraph resolves all of these
>> to the same query plan.
>>
>> 2. Is that still a true statement that those 3 above use the same query
>> plan currently?
>>
>
> I think they indeed serve the same purpose but might vary in subtle ways,
> for MINUS vs FILTER NOT EXISTS the sparql specs states that they can
> produce different solutions
> <https://www.w3.org/TR/sparql11-query/#neg-notexists-minus>.
> As to which approach is better I can't answer clearly, I tend to prefer
> MINUS as I find it easier to read/understand. I also tend to avoid plain
> FILTER(constraint on ?x) when possible as they tend to be rather slow (here
> the FILTER(!bound(?x)) should be pretty fast though).
>
> David.
> _______________________________________________
> Wikidata mailing list
> [email protected]
> https://lists.wikimedia.org/mailman/listinfo/wikidata
>
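
For anyone wondering how MINUS and FILTER NOT EXISTS can "produce different solutions", here is a rough sketch in Python of the case the SPARQL 1.1 spec describes, where the excluded pattern shares no variables with the rest of the query. This is only a toy model of the spec semantics over a single triple, not how Blazegraph or WDQS actually evaluates anything; the helper names (`match`, `filter_not_exists`, `minus`) are made up for illustration.

```python
# Toy model of SPARQL negation over one triple (illustrative only).
# A solution is a dict mapping variable names (starting with '?') to values.

DATA = {("a", "b", "c")}  # single triple, as in the spec's example


def match(pattern, triples, binding=None):
    """Return solutions for one triple pattern, extending an optional binding."""
    binding = binding or {}
    solutions = []
    for triple in triples:
        candidate = dict(binding)
        ok = True
        for term, value in zip(pattern, triple):
            if term.startswith("?"):
                # Bind the variable, or check it against an existing binding.
                if candidate.setdefault(term, value) != value:
                    ok = False
                    break
            elif term != value:
                ok = False
                break
        if ok:
            solutions.append(candidate)
    return solutions


def filter_not_exists(left, pattern, triples):
    """Keep a solution only if the pattern, with that solution's bindings
    substituted in, matches nothing."""
    return [mu for mu in left if not match(pattern, triples, mu)]


def minus(left, pattern, triples):
    """Drop a solution only if some right-hand solution shares at least one
    variable with it and agrees on every shared variable."""
    right = match(pattern, triples)

    def excluded(mu):
        for other in right:
            shared = mu.keys() & other.keys()
            if shared and all(mu[v] == other[v] for v in shared):
                return True
        return False

    return [mu for mu in left if not excluded(mu)]


left = match(("?s", "?p", "?o"), DATA)

# Excluded pattern with no shared variables: { ?x ?y ?z }
disjoint = ("?x", "?y", "?z")
not_exists_disjoint = filter_not_exists(left, disjoint, DATA)
minus_disjoint = minus(left, disjoint, DATA)

# Excluded pattern that shares ?s with the outer pattern: { ?s <b> ?z }
shared = ("?s", "b", "?z")
not_exists_shared = filter_not_exists(left, shared, DATA)
minus_shared = minus(left, shared, DATA)

print(not_exists_disjoint, minus_disjoint)
print(not_exists_shared, minus_shared)
```

With no shared variables, FILTER NOT EXISTS drops the only solution (the pattern matches something), while MINUS keeps it (nothing is shared, so nothing is subtracted). Once the excluded pattern reuses `?s`, both idioms agree, which is the usual situation in real queries.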
