Ladsgroup added subscribers: Joe, Ladsgroup.
Ladsgroup added a comment.

  I'm mostly repeating the reasons I gave in T193733#5276659 
<https://phabricator.wikimedia.org/T193733#5276659>:
  
  - It's a SPOF: if the mwmaint1002 node goes down because of hardware issues, 
we can't dispatch at all. If the node needs a restart, dispatching has to stop 
until it's done.
  - "Noisy neighbor" effect: people run maintenance scripts on the mwmaint 
node, so dispatching can be choked to death by other scripts, and a bug in 
dispatching that eats all of the resources can make running maintenance 
scripts impossible.
  - We designed our own distributed system for this (pulling the wikis using 
three cronjobs, dispatching and picking up basically random + most stalled 
ones). It could instead use the great jobqueue infrastructure we already have.
  - Cronjobs are hard to debug; moving them to the jobqueue makes them easier 
to debug in logstash.
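  
  To illustrate what "basically random + most stalled ones" could mean, here 
is a rough sketch. It is not the actual dispatcher code: ClientWiki, 
last_seen_change, pick_wikis, batch_size and pool_size are all made-up names, 
and I'm assuming "most stalled" means the wikis whose last dispatched change 
is the oldest.

```python
import random
from dataclasses import dataclass


@dataclass
class ClientWiki:
    # Hypothetical record; the real dispatch state may look different.
    name: str
    last_seen_change: int  # ID of the last change dispatched to this wiki


def pick_wikis(wikis, batch_size, pool_size, rng=None):
    """Pick up to batch_size wikis at random from the pool_size most stalled ones."""
    rng = rng or random.Random()
    # Assumption: a lower last_seen_change means the wiki is further behind.
    most_stalled = sorted(wikis, key=lambda w: w.last_seen_change)[:pool_size]
    # Random choice inside the pool, so concurrent workers are less likely
    # to all go for the same wiki.
    return rng.sample(most_stalled, min(batch_size, len(most_stalled)))


wikis = [
    ClientWiki("enwiki", 900),
    ClientWiki("dewiki", 120),
    ClientWiki("frwiki", 450),
    ClientWiki("commonswiki", 80),
    ClientWiki("nlwiki", 870),
]
picked = pick_wikis(wikis, batch_size=2, pool_size=3, rng=random.Random(0))
```

  With a jobqueue, each picked wiki would become one job instead of being 
handled by a cronjob on a single node.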
  
  Reducing the number of edits happening on Wikidata (using one big 
wbeditentity API call instead of several when a termbox v1 edit happens) can 
help, but there might be better ways to do it. @Joe has lots of good insight 
in this regard.

TASK DETAIL
  https://phabricator.wikimedia.org/T48643

To: Ladsgroup
Cc: Ladsgroup, Joe, Addshore, Aklapper, Tobi_WMDE_SW, JanZerebecki, 
Wikidata-bugs, Abraham, Nemo_bis, Denny, aude, Ricordisamoa, Lydia_Pintscher, 
daniel, hoo, darthmon_wmde, Nandana, Lahi, Gq86, GoranSMilovanovic, QZanden, 
LawExplorer, _jensen, rosalieper, Mbch331