daniel added a comment.

A first review of the schema and queries:

Queries run against wbqc_constraints seem to all use the (pid) index.  What's 
the (constraint_type_qid) index for?

Queries run against wbqev_external_data seem to all use the (dump_id, 
external_id, pid) index. Unless ExternalDataRepo->get is called with no dump id 
or no external id. In these cases, an exception should be thrown instead of 
creating extremely inefficient queries.

Calling `$db->commit( __METHOD__, "flush" )` should not be done in DAO objects. 
Simmilarly, calling `wfWaitForSlaves` should be avoided in the DAO layer. These 
things should be kept on the level of updaters/importers.

wbqev_identifier_properties should have an index on dump_id, so the query 
planner is free to choose which side of the join to evaluate first. It's also 
needed for the DELETE queries that select by dumpId.

wbq_violations seems to be unused in v1, is that correct? The (claim_guid) 
index is redundant to the composite primary key (claim_guid, constraint_id). 
Prefixes of composite keys can be used like separate keys. The (constraint_id) 
index is thus not redundant.

It seems very likely that wbq_violations needs more indexed. At one on 
(entity_id), for getting (or clearing) all violations for a given entity. Or is 
the plan to do this by prefix-matching on the claim_guid?

wbq_evaluation does not define any indexes. That's bad, at least give it an 
auto-increment row id. But is it used at all? It seems like the eval data is 
going to a log file at the moment.


TASK DETAIL
  https://phabricator.wikimedia.org/T102992

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: daniel
Cc: jcrespo, Tamslo, csteipp, Springle, hoo, Aklapper, daniel, Wikidata-bugs, 
aude, GWicke, Krenair, Malyacko, P.Copp



_______________________________________________
Wikidata-bugs mailing list
[email protected]
https://lists.wikimedia.org/mailman/listinfo/wikidata-bugs

Reply via email to