https://bugzilla.wikimedia.org/show_bug.cgi?id=50585
Web browser: ---
Bug ID: 50585
Summary: Silence the qacct transfer jobs and monitor them with
Icinga instead
Product: Wikimedia Labs
Version: unspecified
Hardware: All
OS: All
Status: NEW
Severity: normal
Priority: Unprioritized
Component: tools
Assignee: [email protected]
Reporter: [email protected]
Classification: Unclassified
Mobile Platform: ---
During the NFS outage, the qacct transfer jobs pestered the roots' mailboxes
every five minutes. Though such an outage of course will never ever happen
again :-), it sucked nonetheless.
The transfer job is a service, and if we monitored it as one, we would get
better behaviour as well: a nice green or red icon on a web dashboard, and only
one (or none?) ping by mail when the status *changes*.
So we should set up Icinga monitoring for the transfer job, in one of two ways:
a) The transfer job directs all stdout/stderr to a file and saves its exit code
in another; Icinga periodically queries these files.
b) The transfer job passes its output and exit code directly to an Icinga
sentinel that passes them somewhere up the chain.
Whether a) or b) is preferable (or possible, for that matter), I haven't
figured out yet, but this bug will track the progress on that.
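
To make option a) a bit more concrete, here is a minimal sketch, not a tested
setup. It assumes a wrapper that runs the job and records its result, plus a
check that Icinga could call as a plugin. The function names, the file layout
and the 900-second staleness limit are all hypothetical; only the exit codes
(0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN) follow the usual Nagios/Icinga plugin
convention.

```python
#!/usr/bin/env python3
"""Sketch of option a): record job status in files, check them plugin-style."""
import subprocess
import sys
import time
from pathlib import Path

# Usual Nagios/Icinga plugin exit codes.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3


def run_and_record(command, output_file, status_file):
    """Run the job quietly; save its combined output and exit code to files."""
    result = subprocess.run(
        command, stdout=subprocess.PIPE, stderr=subprocess.STDOUT, text=True
    )
    Path(output_file).write_text(result.stdout)
    Path(status_file).write_text(f"{result.returncode}\n")
    return result.returncode


def check_status(output_file, status_file, max_age_seconds=900):
    """Return (plugin exit code, one-line message) from the recorded files."""
    status_path = Path(status_file)
    try:
        age = time.time() - status_path.stat().st_mtime
        exit_code = int(status_path.read_text().strip())
    except (OSError, ValueError) as error:
        return UNKNOWN, f"UNKNOWN: cannot read job status ({error})"
    # A stale status file means the job itself has stopped running.
    if age > max_age_seconds:
        return CRITICAL, f"CRITICAL: last run finished {int(age)}s ago"
    if exit_code != 0:
        try:
            lines = Path(output_file).read_text().splitlines()
        except OSError:
            lines = []
        detail = lines[-1] if lines else "no output"
        return CRITICAL, f"CRITICAL: job exited with {exit_code}: {detail}"
    return OK, "OK: last run succeeded"


# Hypothetical invocation as a plugin: <script> OUTPUT_FILE STATUS_FILE
if __name__ == "__main__" and len(sys.argv) == 3:
    code, message = check_status(sys.argv[1], sys.argv[2])
    print(message)
    sys.exit(code)
```

The cron entry would call run_and_record() instead of the job itself, so
nothing is printed and no mail is sent; Icinga then notifies only when the
result of check_status() changes. The staleness test is there because a job
that never runs would otherwise keep showing its last good state.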