MichaelSchoenitzer added a comment.
That's the reason why queries time out. There's no magic to it - queries that time out are those that require the engine to process a huge amount of data.
Well, that's obvious. But what I was referring to is this: while the examples above produce a huge amount of data in the output and can therefore, by definition, never be significantly faster, the example I gave does not produce a large amount of output data.
It is slow because the SPARQL engine first generates a list of all humans – which is huge – then sorts that list, and only then applies the limit. The filter is likewise applied only after the list has been generated, so it does not improve the query time much. There is currently no way to avoid generating the huge interim list in the first place – but there could be!
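To illustrate the pattern described above, a query of this shape might look as follows (the exact query is my sketch, not the one originally posted; `wdt:P31 wd:Q5` for "instance of human" and `wikibase:sitelinks` are the usual Wikidata modelling):

```sparql
# Hypothetical example of the slow pattern: the engine must first
# materialize all humans (a huge interim list), then sort the whole
# list, and only afterwards apply the LIMIT.
SELECT ?item ?sitelinks WHERE {
  ?item wdt:P31 wd:Q5 .               # every human – millions of bindings
  ?item wikibase:sitelinks ?sitelinks .
}
ORDER BY DESC(?sitelinks)             # sorts the full interim list
LIMIT 10                              # limit is applied last
```

The output is tiny (ten rows), yet the engine still has to touch every human, which is why the query times out.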
So, in contrast to the queries by Magnus and others, this type of query could be made faster.
A (not very nice, but workable) way to do so would be to add binned or qualitative information to the RDF database containing an estimate of the number of sitelinks, so that one could add something like:
?item wikibase:popularity wikibase:verypopular.
to restrict the list to items with more than some constant number of sitelinks. Not very pretty, but something like that could be implemented to improve this very common and still very inefficient query pattern described above.
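In full, the proposed pattern might look like this (note that `wikibase:popularity` and `wikibase:verypopular` are the hypothetical predicate and value suggested above, not existing vocabulary, and the rest of the query is my sketch):

```sparql
# Hypothetical: restrict to a pre-computed "very popular" bin first,
# so the engine never materializes the full list of all humans
# before sorting.
SELECT ?item ?sitelinks WHERE {
  ?item wikibase:popularity wikibase:verypopular .  # hypothetical binned estimate
  ?item wdt:P31 wd:Q5 .
  ?item wikibase:sitelinks ?sitelinks .
}
ORDER BY DESC(?sitelinks)
LIMIT 10
```

Because the binned triple would drastically shrink the candidate set before the sort, the sort and limit would operate on thousands of items instead of millions.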
Cc: MichaelSchoenitzer, Edgars2007, chasemp, Lydia_Pintscher, Magnus, MichaelSchoenitzer_WMDE, MisterSynergy, doctaxon, Jonas, Ash_Crow, Daniel_Mietchen, Lucas_Werkmeister_WMDE, Jane023, Base, Gehel, Smalyshev, Ijon, Aklapper, Lahi, Gq86, Darkminds3113, GoranSMilovanovic, QZanden, EBjune, merbst, LawExplorer, Avner, FloNight, Xmlizer, jkroll, Wikidata-bugs, Jdouglas, aude, Tobias1984, Manybubbles, Mbch331
_______________________________________________ Wikidata-bugs mailing list Wikidataemail@example.com https://lists.wikimedia.org/mailman/listinfo/wikidata-bugs