faidon added a comment.

So post-mortem, I think there are 4 different things here:

  • T189519: Audit switch ports/descriptions/enable (and do this on an ongoing basis)
  • T189522: Detect IP address collisions
  • General enhancements to our server provisioning and decommissioning pipeline, which has a bunch of long-standing issues but also requires a more dedicated long-term effort. I'm sure there are one or more tasks related to this, but more broadly, this work stream has been incorporated into our (draft) annual plan as a major item for next year.
  • (Tangential) Triage the decom queue more promptly, to avoid servers lingering for months after their service decom.

My apologies for all the wasted engineering time -- this is a pretty unfortunate and pretty basic issue that continues to bite us :(


