Smalyshev added a comment.

OK, this can be done, but the issue is that we can't evaluate a solution such 
as Titan/Gremlin (for performance, fitness to data, etc.) if we have no data 
to test it on. Suppose we coded up all the queries under the assumption that 
the data is ranked properly. The current data, as far as I can see, is not. 
So how can we know whether a given query would be fast on real data? 

Another issue is that if the cleverness is in the query, we cannot pre-process 
data on import. This means that for an operation as simple as "select the 
latest relevant population figure", instead of preparing for it on data 
import, we'd be doing it every time a query is run, for every data item. I 
suspect that may hurt performance, especially when such items are combined. 
For example, "get me the mayors of the 10 most populous capitals" gives us 3 
data sets to process: mayors (past or current), capitals (current or old), 
and population figures. Instead of considering one data point at each 
junction, we will have to consider multiple ones, and this work will be done 
anew each time. Of course, ideally all the latest data would be marked as 
preferred, but I don't think that is the case now even for relatively common 
data, and it is probably even worse once the data sets become more exotic and 
less visible/visited.
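
To make the import-time option concrete, here is a rough sketch of resolving 
the "latest relevant population figure" once per item on import, so that a 
query only looks at one data point per item. The dict layout, the field names 
(point_in_time, current_population) and the function names are all made up 
for illustration; this is not the actual importer or the Wikibase data format.

```python
from datetime import date

# Hypothetical in-memory shape for statements; not the real Wikibase layout.


def latest_statement(statements):
    """Return the non-deprecated statement with the most recent date, or None."""
    candidates = [s for s in statements if s["rank"] != "deprecated"]
    return max(candidates, key=lambda s: s["point_in_time"], default=None)


def preprocess_on_import(item):
    """Resolve the current population once, at import time."""
    best = latest_statement(item["population"])
    item["current_population"] = best["value"] if best else None
    return item


def most_populous(items, n):
    """Query over the precomputed field: one data point per item."""
    known = [i for i in items if i["current_population"] is not None]
    return sorted(known, key=lambda i: i["current_population"], reverse=True)[:n]


# Made-up sample data, for illustration only.
cities = [
    {"name": "A", "population": [
        {"value": 100, "point_in_time": date(2000, 1, 1), "rank": "normal"},
        {"value": 150, "point_in_time": date(2010, 1, 1), "rank": "normal"},
    ]},
    {"name": "B", "population": [
        {"value": 300, "point_in_time": date(1990, 1, 1), "rank": "normal"},
        {"value": 120, "point_in_time": date(2012, 1, 1), "rank": "normal"},
    ]},
]

imported = [preprocess_on_import(c) for c in cities]
top = most_populous(imported, 1)
```

Without the import step, latest_statement() would have to run inside the 
query for every item, and again for mayors and capitals in the combined case.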

Maybe the solution would indeed be to make a bot that just reads qualifiers 
and automatically puts ranks on the latest figure if there are none; that bot 
could perhaps run more complex queries than a regular user can. 
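
A minimal sketch of what such a bot's core rule might look like, assuming 
statements carry a rank and an optional date qualifier. The function name and 
the dict fields are hypothetical, and a real bot would have to go through the 
Wikidata API rather than edit dicts in place.

```python
from datetime import date


def assign_preferred_rank(statements):
    """Mark the most recently dated statement as preferred if none is preferred.

    Returns True if a rank was changed, False otherwise.
    """
    if any(s["rank"] == "preferred" for s in statements):
        return False  # someone already picked a preferred value; leave it alone
    dated = [
        s for s in statements
        if s["rank"] == "normal" and s.get("point_in_time") is not None
    ]
    if not dated:
        return False  # nothing to go on without a date qualifier
    latest = max(dated, key=lambda s: s["point_in_time"])
    latest["rank"] = "preferred"
    return True


# Made-up sample data, for illustration only.
population = [
    {"value": 100, "point_in_time": date(2000, 1, 1), "rank": "normal"},
    {"value": 150, "point_in_time": date(2010, 1, 1), "rank": "normal"},
    {"value": 999, "point_in_time": date(2014, 1, 1), "rank": "deprecated"},
]

changed = assign_preferred_rank(population)
changed_again = assign_preferred_rank(population)
```

Deprecated statements are skipped and the second run is a no-op, so the bot 
could be re-run safely over the same items.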

In any case, I'll add rank support to the importer/query prototype, but I 
think we still need to consider optimizing for the common case.

TASK DETAIL
  https://phabricator.wikimedia.org/T76373

To: Smalyshev
Cc: Smalyshev, Manybubbles, GWicke, JanZerebecki, aude, Lydia_Pintscher, 
jkroll, Wikidata-bugs, daniel



_______________________________________________
Wikidata-bugs mailing list
Wikidata-bugs@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata-bugs