https://bugzilla.wikimedia.org/show_bug.cgi?id=45529
--- Comment #3 from Asher Feldman <[email protected]> --- (In reply to comment #0) > Since only a single index will be used for any given query, this is not > helpful, and results in inefficient searches. This isn't strictly true. Since the select queries in production against wb_terms aren't join queries, index merge optimizations are possible and are being utilized. I.e. a query containing WHERE (term_language='oc' AND term_type='label' AND term_entity_type='property') is executed via an intersection of two indexes - "Using intersect(wb_terms_entity_type,wb_terms_language)". That said, index merging isn't as efficient as composite indexes, and there's one query case that isn't performing well at all. It looks like there are four common select query types in production: 1) /* Wikibase\TermSqlIndex::getTermsOfEntity */ select term_language, term_type, term_text from `wb_terms` where term_entity_id = ? and term_entity_type = ? 2) /* Wikibase\TermSqlIndex::getMatchingIDs */ select distinct term_entity_id from `wb_terms` where (term_language=? and term_search_key like ? and term_type=? and term_entity_type=?) or (term_language=? and term_search_key like ? and term_type=? and term_entity_type=?) limit ? 3) /* Wikibase\TermSqlIndex::getMatchingTerms */ SELECT term_entity_type, term_type, term_language, term_text, term_entity_id FROM `wb_terms` WHERE (term_language=? AND term_type=? AND term_entity_type=?) 4) /* Wikibase\TermSqlIndex::getMatchingTermCombination */ select terms?term_entity_type as terms?term_entity_type, terms?term_type as terms?term_type, terms?term_language as terms?term_language, terms?term_text as terms?term_text, terms?term_entity_id as terms?term_entity_id, terms?term_entity_type as terms?term_entity_type, terms?term_type as terms?term_type, terms?term_language as terms?term_language, terms?term_text as terms?term_text, terms?term_entity_id as terms?term_entity_id from `wb_terms` `terms?` inner join `wb_terms` `terms?` on ((terms?term_entity_id=terms?term_entity_id) and (terms?term_entity_type=terms?term_entity_type)) where (terms?term_language=? and terms?term_text=? and terms?term_type=? and terms?term_entity_type=?) and (terms?term_language=? and terms?term_text=? and terms?term_type=? and terms?term_entity_type=?) and (terms?term_entity_id <> ? or terms?term_entity_type <> ?) limit ? Query type 3 above can have a large number of results returned and should probably include a limit. I think your proposed indexes should be good, but please limit the length on term_search_key. Most queries use a few letters only ('abc%'), indexing term_search_key(6) should be good. If the combination ( term_entity_type, term_entity_id ) is guaranteed to be unique, and it will be used to delete rows, define this index as unique to minimize locking. Submit the changes (with drop index statements) as a regular schema change and let me know when it's available in gerrit for review. -- You are receiving this mail because: You are on the CC list for the bug. _______________________________________________ Wikibugs-l mailing list [email protected] https://lists.wikimedia.org/mailman/listinfo/wikibugs-l
