Lucas_Werkmeister_WMDE added a comment.

  In T214362#6680770 <https://phabricator.wikimedia.org/T214362#6680770>, 
@Krinkle wrote:
  
  > - The authoritative source for describing items is Wikidata.org.
  > - The authoritative source for describing the constraint checks is also on 
Wikidata.org.
  >
  > What is the authoritative source for executing a constraint check if all 
caches and secondary services were empty? I believe this is currently in 
MediaWiki (WBQC extension), which may consult WDQS as part of the constraint 
check, where WDQS in this context is mainly used as a way to query the 
relational data of Wikidata items, which we can't do efficiently within 
MediaWiki so we rely on WDQS for that. This means to run a constraint check, 
WDQS needs to be fairly up to date with all the items, which happens through 
some sync process that is not related to this RFC. Does that sound right?
  
  Yes, though I would add that WDQS is only used for certain kinds of 
constraint checks – mainly those that have to traverse the subclass of 
<https://www.wikidata.org/wiki/Property:P279> hierarchy. Constraint checks that 
only use the data of one or two items (this item should also have property X; 
the value of property Y should be an item that has property Z) don’t use the 
query service.
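
  To illustrate the difference between the two kinds of checks, here is a rough sketch. The function names and the exact query shape are assumptions made for illustration and are not taken from the WBQC code; only the P279 traversal itself comes from the description above (P31, "instance of", is added to make the query complete).

```python
import re

# Sketch only: helper names and query shape are assumptions for illustration,
# not the actual WikibaseQualityConstraints implementation.

ITEM_ID = re.compile(r"Q[1-9][0-9]*")


def subclass_ask_query(item_id: str, class_id: str) -> str:
    """Build a SPARQL ASK query that needs the query service.

    It asks whether item_id is an instance of class_id or of any of its
    subclasses, which means walking the "subclass of" (P279) hierarchy
    across an unknown number of other items.
    """
    for entity_id in (item_id, class_id):
        if not ITEM_ID.fullmatch(entity_id):
            raise ValueError(f"not an item ID: {entity_id!r}")
    # wdt:P279* is a property path: zero or more "subclass of" steps.
    return f"ASK {{ wd:{item_id} wdt:P31/wdt:P279* wd:{class_id} }}"


def item_has_property(statements: dict, property_id: str) -> bool:
    """A check that only needs the item's own data, so no query service.

    statements is assumed to map property IDs to lists of values.
    """
    return bool(statements.get(property_id))


print(subclass_ask_query("Q42", "Q5"))
print(item_has_property({"P31": ["Q5"]}, "P569"))
```

  The first function only builds a query string; actually sending it is left out. The second one works on data that is already loaded, which is why those checks are unaffected by query service lag.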
  
  Or put differently – when checking constraints, we use the query service for 
data of some other items, but not for data of the item itself; that’s why it’s 
okay if the query service has slightly stale data when we check constraints of 
an item immediately after it has been edited. (I realized while writing this 
that that’s not entirely true, but hopefully T269859 
<https://phabricator.wikimedia.org/T269859> won’t be too hard to fix.)
  
  > Speaking of caches and secondary services, where do we currently expose or 
store the result of constraint checks? As I understand it, they are:
  >
  > - Saved in Memcached for 1 day after a computation happens.
  > - Exposed via action=rdf, which is best-effort only. Returns cache hit or 
nothing. It's not clear to me when one would use this, and what higher-level 
requirements this needs to meet. I'll assume for now there are cases somewhere 
where a JS gadget can't afford to wait to generate it and is fine with results 
just being missing if they weren't recently computed by something unrelated.
  
  It’s mainly used by the query service updater, which adds constraint check 
results to the query service, best-effort as you say.
  
  > - Exposed via Special:ConstraintReport/Q123, which ignores the cache and 
always computes it fresh.
  > - Exposed via API action=wbcheckconstraints, which is the main and reliable 
way to access this data from the outside. Considers cache and re-generates on 
the fly as needed, so it might be slow.
  >
  > It's not clear to me why Special:ConstraintReport exists in this way.
  
  Historically, the special page predates the API (and also any form of 
caching); the main reason it //still// exists this way is just that we haven’t 
removed it yet. You can see on Grafana 
<https://grafana.wikimedia.org/d/000000344/wikidata-quality?viewPanel=6> that 
it’s barely used, a dozen requests a day or so.
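
  The three access patterns discussed above (cache hit or nothing, always fresh, cache with fallback to computing) can be sketched roughly like this. The class and method names are hypothetical, the in-memory dict merely stands in for Memcached, and whether the special page writes its fresh result back to the cache is not stated above, so this sketch assumes it does not.

```python
import time
from typing import Callable, Dict, List, Optional, Tuple

CACHE_TTL_SECONDS = 24 * 60 * 60  # "1 day", as described in the list above


class ConstraintResults:
    """Hypothetical model of the three ways results are exposed."""

    def __init__(
        self,
        compute: Callable[[str], List[str]],
        clock: Callable[[], float] = time.time,
    ) -> None:
        self._compute = compute  # the expensive constraint check
        self._clock = clock      # injectable so expiry can be tested
        self._cache: Dict[str, Tuple[float, List[str]]] = {}

    def _cached(self, item_id: str) -> Optional[List[str]]:
        entry = self._cache.get(item_id)
        if entry is None:
            return None
        stored_at, result = entry
        if self._clock() - stored_at >= CACHE_TTL_SECONDS:
            del self._cache[item_id]  # expired
            return None
        return result

    def cached_only(self, item_id: str) -> Optional[List[str]]:
        """Like action=rdf: cache hit or nothing, never computes."""
        return self._cached(item_id)

    def always_fresh(self, item_id: str) -> List[str]:
        """Like Special:ConstraintReport: ignores the cache (assumed not to write it)."""
        return self._compute(item_id)

    def cached_or_computed(self, item_id: str) -> List[str]:
        """Like action=wbcheckconstraints: use the cache, recompute on a miss."""
        result = self._cached(item_id)
        if result is None:
            result = self._compute(item_id)
            self._cache[item_id] = (self._clock(), result)
        return result
```

  With this model, `cached_only("Q123")` returns `None` until something has called `cached_or_computed("Q123")` within the last day, which matches the best-effort behaviour described for action=rdf.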
  
  I’ll skip quoting the rest of the comment, but regarding using WDQS as the 
store itself, I guess it’s important that WDQS isn’t really one store – there 
are (I believe) about a dozen instances, distributed across eqiad and codfw, 
and they’re fairly independent from one another. They’re supposed to all 
contain the same data, since they update themselves from the same data source, 
but over time the number of triples 
<https://grafana.wikimedia.org/d/000000489/wikidata-query-service?viewPanel=7> 
still drifts apart slightly. So to use WDQS as the store, we would have to push 
the data to each instance. The store which this RFC asks for, I believe, should 
instead store the data only once, and then each of the query service updaters 
(each instance has its own updater) can pull the data from there, and so can 
action=wbcheckconstraints.
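
  As a rough sketch of that pull model: results are written once to a shared store, and every consumer reads from it independently. All class and method names below are hypothetical, and the dicts stand in for whatever storage the RFC ends up choosing.

```python
from typing import Dict, List, Optional


class SharedResultStore:
    """Hypothetical single store: each result is written exactly once."""

    def __init__(self) -> None:
        self._results: Dict[str, List[str]] = {}

    def put(self, item_id: str, result: List[str]) -> None:
        self._results[item_id] = result

    def get(self, item_id: str) -> Optional[List[str]]:
        return self._results.get(item_id)


class QueryServiceUpdater:
    """Hypothetical per-instance updater that pulls from the shared store."""

    def __init__(self, instance_name: str, store: SharedResultStore) -> None:
        self.instance_name = instance_name
        self._store = store
        self.local_copy: Dict[str, List[str]] = {}  # stands in for the instance's triples

    def sync(self, item_id: str) -> bool:
        """Pull one item's results; returns False if the store has none yet."""
        result = self._store.get(item_id)
        if result is None:
            return False
        self.local_copy[item_id] = list(result)
        return True


store = SharedResultStore()
# Instance names are made up for the example.
updaters = [QueryServiceUpdater(f"instance-{n}", store) for n in range(3)]

store.put("Q123", ["some violation"])  # one write ...
for updater in updaters:               # ... many independent reads
    updater.sync("Q123")
```

  The writer never needs to know how many query service instances exist; adding or removing an instance only changes who reads from the store.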

TASK DETAIL
  https://phabricator.wikimedia.org/T214362

To: Lucas_Werkmeister_WMDE
Cc: WMDE-leszek, eprodromou, CCicalese_WMF, kchapman, Krinkle, mobrovac, abian, 
Lydia_Pintscher, Lucas_Werkmeister_WMDE, Marostegui, Joe, daniel, Agabi10, 
Aklapper, Addshore, Akuckartz, Demian, WDoranWMF, holger.knust, EvanProdromou, 
DannyS712, Nandana, kostajh, Lahi, Gq86, Pablo-WMDE, GoranSMilovanovic, 
RazeSoldier, QZanden, merbst, LawExplorer, _jensen, rosalieper, xSavitar, 
Scott_WUaS, Pchelolo, Izno, SBisson, Perhelion, Wikidata-bugs, Base, aude, 
GWicke, Bawolff, jayvdb, fbstj, santhosh, Jdforrester-WMF, Ladsgroup, Mbch331, 
Rxy, Jay8g, Ltrlg, bd808, Legoktm