https://bugzilla.wikimedia.org/show_bug.cgi?id=71239
Bug ID: 71239
Summary: Re-examine icinga warning thresholds and job expiry.
Product: OCG
Version: unspecified
Hardware: All
OS: All
Status: NEW
Severity: normal
Priority: Unprioritized
Component: General/Unknown
Assignee: [email protected]
Reporter: [email protected]
CC: [email protected]
Web browser: ---
Mobile Platform: ---
Before OCG was turned on for everyone, we had a 30k Icinga warning limit for
the job status queue. Since entries expire from the queue after 5 days and we
expect around 10k jobs/day (roughly 50k entries at steady state), we raised
the limit to a more reasonable 100k. But we should re-examine this once OCG
goes live by default in production and see whether this limit makes sense.
We also have:
output dir: warn 40GB, critical 50GB
postmortem dir: warn 1GB, critical 2GB
render jobs queue: warn 100, critical 500
temp size: warn 1GB, critical 5GB
We should examine these as well. (If changes are needed, see
https://gerrit.wikimedia.org/r/162623 )
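As a starting point for that review, here is a minimal sketch of the arithmetic
behind the status queue limit, plus a generic warn/critical check. It uses only
the figures quoted in this report. The function names are hypothetical, and
treating a value equal to the threshold as a breach is an assumption, not
necessarily how the actual Icinga checks compare values.

```python
# Figures taken from this report; all names here are illustrative only.
JOBS_PER_DAY = 10_000      # expected job rate
EXPIRY_DAYS = 5            # status queue entries expire after this many days
STATUS_QUEUE_WARN = 100_000


def steady_state_entries(jobs_per_day, expiry_days):
    """Queue size once new entries and expired entries balance out."""
    return jobs_per_day * expiry_days


def classify(value, warn, critical):
    """Return an Icinga-style state name (assumes >= comparisons)."""
    if value >= critical:
        return "CRITICAL"
    if value >= warn:
        return "WARNING"
    return "OK"


expected = steady_state_entries(JOBS_PER_DAY, EXPIRY_DAYS)
headroom = STATUS_QUEUE_WARN / expected
print(expected, headroom)  # prints: 50000 2.0
```

Under these assumptions the 100k warning limit sits at about twice the expected
steady-state size, so a warning would indicate roughly double the anticipated
load (or expiry not running). The same `classify` helper can be applied to the
other pairs, e.g. `classify(45, warn=40, critical=50)` for an output dir at 45GB.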
Finally: is the 5 day expiry reasonable? Should we instead (or also) have a
limit on the number of entries, and expire things as needed until the status
queue goes down below NNN entries?
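To make the question concrete, here is a rough sketch of what combining the two
policies could look like. It is not OCG code: the in-memory `OrderedDict` stands
in for whatever store actually backs the status queue, the function name is
hypothetical, and evicting the oldest entries first is an assumption about what
"expire things as needed" would mean.

```python
from collections import OrderedDict


def expire(entries, now, max_age_seconds, max_entries):
    """Apply age-based, then count-based expiry.

    entries: OrderedDict mapping job id -> creation timestamp, oldest first.
    Mutates entries in place and returns the list of removed job ids.
    """
    removed = []

    # Age-based expiry: drop entries older than max_age_seconds.
    while entries:
        job_id, created = next(iter(entries.items()))
        if now - created <= max_age_seconds:
            break  # entries are ordered, so the rest are newer
        entries.popitem(last=False)
        removed.append(job_id)

    # Count-based expiry: drop the oldest entries until under the cap.
    while len(entries) > max_entries:
        job_id, _ = entries.popitem(last=False)
        removed.append(job_id)

    return removed


queue = OrderedDict([("a", 0), ("b", 100), ("c", 200), ("d", 300)])
dropped = expire(queue, now=350, max_age_seconds=300, max_entries=2)
print(dropped, list(queue))  # prints: ['a', 'b'] ['c', 'd']
```

With both limits in place, the age limit bounds how stale an entry can get and
the count limit bounds the queue size during a burst, at the cost of some
entries disappearing before their 5 days are up.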