| jcrespo added a comment. |
@daniel I think this is a case of premature optimization. It is true that things like:
  foreach ... { 'SELECT' }

are usually considered bad practice, but in trying to minimize round-trip time, we are actually running a far worse query. Your query cannot know that a single result per IN value would be enough; it is implemented using a GROUP BY.
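To make the contrast concrete, here is a small sketch of the two query shapes being discussed. It uses Python's sqlite3 with a toy table standing in for wbc_entity_usage (the table contents are made up, and this is not Wikibase code):

```python
import sqlite3

# Toy stand-in for wbc_entity_usage: many rows can share one eu_entity_id.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE wbc_entity_usage (eu_entity_id TEXT, eu_page_id INTEGER)")
rows = [("Q148475", p) for p in range(1000)] + [("Q54919", p) for p in range(5)]
conn.executemany("INSERT INTO wbc_entity_usage VALUES (?, ?)", rows)
conn.execute("CREATE INDEX eu_entity ON wbc_entity_usage (eu_entity_id)")

ids = ["Q148475", "Q54919", "Q999999"]

# Shape 1: the batched query. To return each matching id once, it must
# deduplicate (GROUP BY), touching every index entry for "hot" ids.
batched = [r[0] for r in conn.execute(
    "SELECT eu_entity_id FROM wbc_entity_usage "
    "WHERE eu_entity_id IN (?, ?, ?) GROUP BY eu_entity_id", ids)]

# Shape 2: one point SELECT per id, with LIMIT 1. Each lookup can stop
# at the first index hit, no matter how many rows share that id.
point = [i for i in ids if conn.execute(
    "SELECT 1 FROM wbc_entity_usage WHERE eu_entity_id = ? LIMIT 1",
    (i,)).fetchone()]

assert sorted(batched) == point == ["Q148475", "Q54919"]
```

Both shapes answer the same question ("which of these ids are used at all?"), but only the point-lookup shape can stop after the first match per id.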
Look at this quick script I wrote:
$ time while read id; do mysql -h db1068 commonswiki -e "SELECT eu_entity_id FROM wbc_entity_usage WHERE eu_entity_id = '$id' LIMIT 1"; done < eu_entity_id
eu_entity_id
Q148475
eu_entity_id
Q54919
eu_entity_id
Q423048
eu_entity_id
Q2494649
eu_entity_id
Q13219454
eu_entity_id
Q131454
eu_entity_id
Q36578
eu_entity_id
Q1798125
eu_entity_id
Q2597810
eu_entity_id
Q19938912
eu_entity_id
Q193563
eu_entity_id
Q2553334
eu_entity_id
Q477675
eu_entity_id
Q623578
eu_entity_id
Q19675
eu_entity_id
Q13481868
eu_entity_id
Q384602

real    0m0.119s
user    0m0.028s
sys     0m0.048s

Despite creating a *new connection* each time, it only takes 0.1 seconds (and I am connecting and disconnecting 17 times!). It will take even less if we are not reconnecting every time, as a normal script would do.
Let's go with the simpler approach (point SELECT queries), and we can later try to optimize the round-trip time if needed. InnoDB is very, very fast at index lookups thanks to its internal (adaptive) hash indexes.
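The recommended pattern, sketched again with Python's sqlite3 (the helper name `used_entities` and the in-memory table are illustrative assumptions, not Wikibase code): open one connection, then issue one point SELECT ... LIMIT 1 per entity id over it.

```python
import sqlite3

def used_entities(conn, entity_ids):
    """Return the subset of entity_ids that have at least one usage row.

    One point SELECT per id over a single, reused connection: each query
    is a single index lookup that stops at the first match (LIMIT 1).
    """
    found = []
    for eid in entity_ids:
        row = conn.execute(
            "SELECT 1 FROM wbc_entity_usage WHERE eu_entity_id = ? LIMIT 1",
            (eid,)).fetchone()
        if row is not None:
            found.append(eid)
    return found

# Tiny demo with made-up data.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE wbc_entity_usage (eu_entity_id TEXT)")
conn.executemany("INSERT INTO wbc_entity_usage VALUES (?)",
                 [("Q148475",), ("Q148475",), ("Q54919",)])
conn.execute("CREATE INDEX eu_entity ON wbc_entity_usage (eu_entity_id)")
print(used_entities(conn, ["Q148475", "Q423048", "Q54919"]))
# prints ['Q148475', 'Q54919']
```

The loop does N round trips, but each round trip is a single index probe, which is exactly the trade-off the comment argues is acceptable.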
Of course, that was not a contrived example; this happens many times: we assume only a few records exist per value, when in reality usage quickly explodes to several million rows.
Cc: aaron, aude, daniel, hoo, Aklapper, jcrespo, Vali.matei, Minhnv-2809, Volans, D3r1ck01, Izno, Luke081515, Wikidata-bugs, GWicke, Mbch331, Jay8g, Krenair
_______________________________________________ Wikidata-bugs mailing list [email protected] https://lists.wikimedia.org/mailman/listinfo/wikidata-bugs
