Lucas_Werkmeister_WMDE added a comment.
The data model for a constraint violation would be a single triple pointing from the violating statement to the constraint it violates.
A few notes on load / scale: last week (2018-04-12 – 2018-04-19), there were some 121.3K requests to the wbcheckconstraints API, according to Grafana. Assuming the vast majority of them were for a single entity, and that each request results in one write, that comes out to approximately 17K writes per day, or one every five seconds. We don’t have data on how many violations there are per constraint check, but I would estimate the average to be somewhere between 10 and 100.
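To make the arithmetic behind these figures easy to re-run with other numbers, here is a rough sketch in Python. The request count comes from the Grafana figure quoted above; the one-write-per-request mapping and the 10 to 100 violations range are assumptions, not measurements, and the variable names are purely illustrative.

```python
# Back-of-the-envelope load estimate using the figures quoted in the comment.
REQUESTS_PER_WEEK = 121_300      # wbcheckconstraints requests, 2018-04-12 to 2018-04-19
SECONDS_PER_DAY = 24 * 60 * 60

# Assumption: each request checks a single entity and causes exactly one write.
writes_per_day = REQUESTS_PER_WEEK / 7
seconds_between_writes = SECONDS_PER_DAY / writes_per_day

# Assumption: the average number of violations per check lies in this range (a guess).
VIOLATIONS_LOW, VIOLATIONS_HIGH = 10, 100
triples_per_day_low = writes_per_day * VIOLATIONS_LOW
triples_per_day_high = writes_per_day * VIOLATIONS_HIGH

print(f"writes per day:         {writes_per_day:,.0f}")
print(f"seconds between writes: {seconds_between_writes:.1f}")
print(f"triples written per day: {triples_per_day_low:,.0f} to {triples_per_day_high:,.0f}")
```

Under these assumptions each write touches one violation triple per violation, so the last line is the number of triples written per day, not net growth of the store, since a later check of the same entity would presumably replace its earlier results.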
At the moment, we can’t make any promises on the completeness of this data anyway, so it might be okay if we don’t have the data after a reimport. Alternatively – might it be possible to get the data from another query service node? The data should be easy to identify – all triples with a special predicate (something like wikibase:violatesConstraint).
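As an illustration of how such triples could be pulled from another node, here is a minimal sketch that builds a SPARQL CONSTRUCT query in Python. It assumes the predicate ends up being named wikibase:violatesConstraint, which is only a placeholder from this comment and not an existing ontology term; the function name is hypothetical, and how the query would be sent to a node and its result loaded elsewhere is left open.

```python
# Namespace used by the Wikibase ontology.
WIKIBASE_PREFIX = "http://wikiba.se/ontology#"

# Placeholder predicate name from the comment above; not an existing term.
VIOLATION_PREDICATE = "wikibase:violatesConstraint"


def build_export_query(predicate: str = VIOLATION_PREDICATE) -> str:
    """Return a SPARQL CONSTRUCT query matching every violation triple.

    Each result is one triple: violating statement -> violated constraint.
    """
    return (
        f"PREFIX wikibase: <{WIKIBASE_PREFIX}>\n"
        f"CONSTRUCT {{ ?statement {predicate} ?constraint }}\n"
        f"WHERE {{ ?statement {predicate} ?constraint }}"
    )


if __name__ == "__main__":
    print(build_export_query())
```

Because the data model is a single triple per violation, one pattern on the predicate is enough to select the whole data set; nothing else in the store needs to be inspected.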
Cc: Lucas_Werkmeister_WMDE, Gehel, Smalyshev, Jonas, Aklapper, Lahi, Gq86, Darkminds3113, GoranSMilovanovic, QZanden, EBjune, merbst, LawExplorer, Avner, Agabi10, FloNight, Xmlizer, jkroll, Wikidata-bugs, Jdouglas, aude, Tobias1984, Manybubbles, Mbch331
