Hi!

> As suggested somewhere far above, it would be great for the community to
> catalogue the queries that are most important for their use cases that
> do not do well on the SPARQL endpoint.  Its likely that the list isn't
> going to be super-long (in terms of query structure), hence it might
> make sense to establish dedicated, optimized web services (that exist
> apart from the endpoint) to call upon when those kinds of queries need
> to be executed.  

Good idea. As a preliminary point, I think the same basics as with most
other query engines (SQL, etc.) apply:

- Non-restrictive queries with huge result sets will be slow -
e.g. "list of all humans" is probably not a good question to ask :)

- Negative searches are usually slower - e.g. "all humans without
images" will be slow, since that query has to inspect the records of
every human

- Unbound paths/traversals will usually be slower (unfortunately, many
queries that use TREE in WDQ fall into this category), especially if
there are a lot of starting points for the traversal (again, "all
humans that...", etc.)
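As a concrete sketch of the negative-search case ("all humans without
images") on the Wikidata endpoint - using the standard wdt:/wd:
prefixes, with P31 = instance of, Q5 = human, P18 = image:

```sparql
# Likely slow: the engine has to enumerate every instance of
# human (Q5) and check that it has no image (P18) statement.
SELECT ?human WHERE {
  ?human wdt:P31 wd:Q5 .
  FILTER NOT EXISTS { ?human wdt:P18 ?image }
}
```

The FILTER NOT EXISTS is what makes it negative: there is no index
that lists "things missing P18", so the whole Q5 set gets scanned.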

It is also a good idea to put a LIMIT on queries when experimenting:
if you intended to write a query that returns 10 records but
accidentally wrote one that returns 10 million, it's much nicer to
discover that via a suitable limit than by waiting for the query to
time out and then trying to figure out why it happened.
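For instance, capping an experimental query at 10 rows (again with
P31 = instance of, Q5 = human) looks like this:

```sparql
# Returns at most 10 humans, even if the full result set is huge.
SELECT ?human WHERE {
  ?human wdt:P31 wd:Q5 .
}
LIMIT 10
```

Once the shape of the results looks right, you can raise or drop the
LIMIT.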

Yes, I realize all this has to go to some page in the manual eventually :)
-- 
Stas Malyshev
smalys...@wikimedia.org

_______________________________________________
Wikidata mailing list
Wikidata@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata
