https://bugzilla.wikimedia.org/show_bug.cgi?id=36993
Web browser: ---
Bug #: 36993
Summary: Labs cluster dies daily at roughly 6:30 UTC
Product: Wikimedia Labs
Version: unspecified
Platform: All
OS/Version: All
Status: NEW
Severity: normal
Priority: Unprioritized
Component: General
AssignedTo: [email protected]
ReportedBy: [email protected]
CC: [email protected], [email protected]
Classification: Unclassified
Mobile Platform: ---
Every day, all instances hosted on WMFLabs become barely accessible starting at
roughly 6:30 UTC, for about an hour. The symptoms are:
* very high load reported in Ganglia for most instances
* ssh clients timing out
* `ls -l` being
This is known to be related to I/O, an area where GlusterFS seems to be
lacking.
Regardless of GlusterFS, Ubuntu has a default daily cron set up at 6:25 UTC,
which means that all instances start rotating or processing their logs at
exactly the same time.
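
To illustrate the synchronization problem, here is a minimal sketch of how a
per-instance start time could be derived so that daily jobs do not all fire at
the same minute. This is not something the Labs setup is known to do; the
function names, the 60-minute window, and the example hostname are assumptions
made for illustration only.

```python
import hashlib


def stagger_offset_minutes(hostname: str, window_minutes: int = 60) -> int:
    """Return a stable offset in [0, window_minutes) derived from the hostname.

    Hashing the hostname gives every instance the same offset on every run,
    while spreading different instances across the window.
    """
    digest = hashlib.sha256(hostname.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") % window_minutes


def staggered_start(hostname: str, base_hour: int = 6, base_minute: int = 25,
                    window_minutes: int = 60) -> tuple:
    """Return (hour, minute) for the daily job on this host.

    The base time defaults to 6:25, the daily cron time mentioned above.
    """
    total = base_hour * 60 + base_minute
    total += stagger_offset_minutes(hostname, window_minutes)
    return (total // 60) % 24, total % 60


if __name__ == "__main__":
    # "instance-01" is a hypothetical hostname.
    hour, minute = staggered_start("instance-01")
    print("daily jobs would start at %02d:%02d" % (hour, minute))
```

The resulting hour and minute could then be written into each instance's
crontab instead of the shared default.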
There must be a cron job on some of the instances that uses too much I/O. We
would need some disk usage metrics in Ganglia to find it.
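
As a rough sketch of the kind of metric that would help, the snippet below
reads the Linux /proc/diskstats counters and computes how many bytes each
block device read and wrote between two samples. It only gathers the numbers;
how they would be fed into Ganglia is left open. The function names and the
10-second sampling interval are assumptions for illustration.

```python
import time

SECTOR_BYTES = 512  # /proc/diskstats counts sectors in 512-byte units


def parse_diskstats(text: str) -> dict:
    """Map device name -> (sectors_read, sectors_written)."""
    stats = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 10:
            continue  # skip blank or malformed lines
        # fields: major, minor, name, reads, reads_merged, sectors_read,
        #         ms_reading, writes, writes_merged, sectors_written, ...
        stats[fields[2]] = (int(fields[5]), int(fields[9]))
    return stats


def io_delta_bytes(before: dict, after: dict) -> dict:
    """Map device name -> (bytes_read, bytes_written) between two samples."""
    delta = {}
    for name, (r1, w1) in after.items():
        if name in before:
            r0, w0 = before[name]
            delta[name] = ((r1 - r0) * SECTOR_BYTES, (w1 - w0) * SECTOR_BYTES)
    return delta


def read_diskstats() -> dict:
    with open("/proc/diskstats") as f:
        return parse_diskstats(f.read())


if __name__ == "__main__":
    first = read_diskstats()
    time.sleep(10)  # assumed sampling interval
    second = read_diskstats()
    for dev, (rd, wr) in sorted(io_delta_bytes(first, second).items()):
        print("%s read=%d B written=%d B" % (dev, rd, wr))
```

Running this on each instance around 6:25 UTC would show which devices see
the largest spike.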