daniel added a comment. A first review of the schema and queries:
Queries run against wbqc_constraints seem to all use the (pid) index. What's the (constraint_type_qid) index for? Queries run against wbqev_external_data seem to all use the (dump_id, external_id, pid) index. Unless ExternalDataRepo->get is called with no dump id or no external id. In these cases, an exception should be thrown instead of creating extremely inefficient queries. Calling `$db->commit( __METHOD__, "flush" )` should not be done in DAO objects. Simmilarly, calling `wfWaitForSlaves` should be avoided in the DAO layer. These things should be kept on the level of updaters/importers. wbqev_identifier_properties should have an index on dump_id, so the query planner is free to choose which side of the join to evaluate first. It's also needed for the DELETE queries that select by dumpId. wbq_violations seems to be unused in v1, is that correct? The (claim_guid) index is redundant to the composite primary key (claim_guid, constraint_id). Prefixes of composite keys can be used like separate keys. The (constraint_id) index is thus not redundant. It seems very likely that wbq_violations needs more indexed. At one on (entity_id), for getting (or clearing) all violations for a given entity. Or is the plan to do this by prefix-matching on the claim_guid? wbq_evaluation does not define any indexes. That's bad, at least give it an auto-increment row id. But is it used at all? It seems like the eval data is going to a log file at the moment. TASK DETAIL https://phabricator.wikimedia.org/T102992 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/ To: daniel Cc: jcrespo, Tamslo, csteipp, Springle, hoo, Aklapper, daniel, Wikidata-bugs, aude, GWicke, Krenair, Malyacko, P.Copp _______________________________________________ Wikidata-bugs mailing list [email protected] https://lists.wikimedia.org/mailman/listinfo/wikidata-bugs
