Thank you so much, David! This was such a great example that I had to add it to our SPARQL Examples page in a new section, "Mediawiki API": https://www.wikidata.org/wiki/Wikidata:SPARQL_query_service/queries/examples#Mediawiki_API
The community thanks you sincerely!

Thad
https://www.linkedin.com/in/thadguidry/

On Mon, Jul 13, 2020 at 2:26 AM David Causse <[email protected]> wrote:

> On Sat, Jul 11, 2020 at 7:12 PM Thad Guidry <[email protected]> wrote:
>
>> This query times out:
>>
>> SELECT ?item ?label
>> WHERE
>> {
>>   ?item wdt:P31 ?instance ;
>>         rdfs:label ?label ;
>>         rdfs:label ?enLabel .
>>   FILTER(CONTAINS(lcase(?label), "Soriano")).
>>   FILTER(?instance != wd:Q5).
>>   SERVICE wikibase:label {bd:serviceParam wikibase:language "en".}
>> }
>> LIMIT 100
>>
>> I have this feeling that it's not actually using an index or even asking
>> the right question and so is slow and times out?
>
> Indeed, none of the criteria in your query allow the triple store to
> determine an index to follow to extract the results in a timely manner.
> The sole non-negative criterion would be FILTER(CONTAINS(lcase(?label),
> "Soriano")), but being in a FILTER, and moreover a function, it cannot be
> used to determine an index to work on.
> The only way to speed up your query would be to introduce a discriminating
> "matching" criterion.
>
>> However the MediaWiki wbsearchentities API does seem to use an index and
>> is performant for label searching:
>>
>> https://www.wikidata.org/w/api.php?action=wbsearchentities&search=soriano&language=en
>
> wbsearchentities is backed by Elasticsearch, which is optimized for such
> lookups.
>
>> How can I get my SPARQL query to be more performant or asking the right
>> question?
>
> Unfortunately I don't see an obvious way to adapt your SPARQL query and keep
> exactly the same semantics, but to illustrate the problem:
>
> SELECT ?item ?label WHERE {
>   ?item wdt:P31 ?instance ;
>         rdfs:label "Soriano"@en .
>   FILTER(?instance != wd:Q5).
> }
> LIMIT 100
>
> will return results in a timely manner, only because we helped the graph
> traversal with an initial path on ?item rdfs:label "Soriano"@en.
>
> But by combining the query service and the Wikidata API [0] backed by
> Elasticsearch I think you can extract what you want:
>
> SELECT ?item ?itemLabel WHERE {
>   ?item wdt:P31 ?instance .
>   FILTER(?instance != wd:Q5).
>   SERVICE wikibase:mwapi {
>     bd:serviceParam wikibase:endpoint "www.wikidata.org";
>                     wikibase:api "EntitySearch";
>                     mwapi:search "soriano";
>                     mwapi:language "en".
>     ?item wikibase:apiOutputItem mwapi:item.
>   }
>   SERVICE wikibase:label {bd:serviceParam wikibase:language "en".}
> }
> LIMIT 100
>
> This query will first contact EntitySearch (an alias to wbsearchentities),
> which will pass the items it found to the triple store, which in turn can
> now query the graph in a timely manner. Obviously this solution only works
> if the number of items returned by wbsearchentities remains reasonable.
>
> 0: https://www.mediawiki.org/wiki/Wikidata_Query_Service/User_Manual/MWAPI
> --
> David C.
> _______________________________________________
> Wikidata mailing list
> [email protected]
> https://lists.wikimedia.org/mailman/listinfo/wikidata
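
For anyone who wants to reuse David's EntitySearch pattern with other search terms, here is a minimal Python sketch that fills in the search string and builds a request URL for the query service. The helper names (escape_literal, build_query, build_url) and the defaults are my own assumptions for illustration, not part of any Wikidata library; the sketch only builds the query text and URL and does not send anything.

```python
from urllib.parse import urlencode

# Public SPARQL endpoint of the Wikidata Query Service.
WDQS_ENDPOINT = "https://query.wikidata.org/sparql"

# David's query, with the search term, language and limit as placeholders.
# Literal SPARQL braces are doubled so that str.format leaves them alone.
QUERY_TEMPLATE = """SELECT ?item ?itemLabel WHERE {{
  ?item wdt:P31 ?instance .
  FILTER(?instance != wd:Q5).
  SERVICE wikibase:mwapi {{
    bd:serviceParam wikibase:endpoint "www.wikidata.org";
                    wikibase:api "EntitySearch";
                    mwapi:search "{search}";
                    mwapi:language "{language}".
    ?item wikibase:apiOutputItem mwapi:item.
  }}
  SERVICE wikibase:label {{bd:serviceParam wikibase:language "{language}".}}
}}
LIMIT {limit}"""


def escape_literal(text):
    """Escape characters that would break a double-quoted SPARQL string."""
    return (
        text.replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
        .replace("\r", "\\r")
    )


def build_query(search, language="en", limit=100):
    """Return the SPARQL text for a label search routed through EntitySearch."""
    return QUERY_TEMPLATE.format(
        search=escape_literal(search),
        language=escape_literal(language),
        limit=int(limit),
    )


def build_url(search, language="en", limit=100):
    """Return a GET URL for the query service that asks for JSON results."""
    params = {"query": build_query(search, language, limit), "format": "json"}
    return WDQS_ENDPOINT + "?" + urlencode(params)


if __name__ == "__main__":
    print(build_query("soriano"))
    print(build_url("soriano", limit=10))
```

As David notes, this only helps while the number of items returned by wbsearchentities stays reasonable, so a small limit and a specific search term are the safer choice.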
