https://bugzilla.wikimedia.org/show_bug.cgi?id=36993
Antoine "hashar" Musso <has...@free.fr> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
            Summary|Labs cluster dies daily at  |dumps project overload
                   |roughly 6:30 UTC            |GlusterFS and cause cluster
                   |                            |failure
           Severity|normal                      |major

--- Comment #9 from Antoine "hashar" Musso <has...@free.fr> 2012-05-22 14:05:34 UTC ---
We just had some kind of outage for the whole cluster.

The virtualization cluster showed load gradually increasing from 13:20 UTC:

http://ganglia.wikimedia.org/latest/?r=hour&cs=05%2F22%2F2012+13%3A00+&ce=05%2F22%2F2012+14%3A00+&m=load_report&s=by+name&c=Virtualization+cluster+pmtpa&h=&host_regex=&max_graphs=0&tab=m&vn=&sh=1&z=small&hc=4

At the same time, the dumps project on labs started showing network activity,
which corresponds to I/O activity over NFS:

http://ganglia.wmflabs.org/latest/graph.php?c=dumps&m=network_report&r=custom&s=by%20name&hc=4&mc=2&cs=05%2F22%2F2012%2011%3A00%20&ce=05%2F22%2F2012%2014%3A00%20&st=1337694997&g=network_report&z=medium&c=dumps

I saw the exact same behavior earlier this morning, when 30 MBytes/s were
output from a datadump host in eqiad and 30 MBytes/s were input in the dumps
project. At the same time, instances were unresponsive.

We need to find a workaround. Some possible solutions:
 - get the `dump` project to use an NFS share on real storage, thus bypassing
   GlusterFS
 - rate limit network bandwidth between dataset1001 in eqiad and the labs
 - find a parameter in GlusterFS that will throttle the connection

Other ideas?

Changing summary from:
  "Labs cluster dies daily at roughly 6:30 UTC"
To:
  "dumps project overload GlusterFS and cause cluster failure"

Raising severity since this makes the cluster unusable from time to time.
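
To make the rate-limiting option above more concrete, here is a minimal token-bucket sketch in Python. It is only an illustration of the idea, not how the dumps are actually transferred: `TokenBucket` and `throttled_copy` are hypothetical names, and the 10 MB/s cap in the usage note is an arbitrary figure. In practice the limit would more likely be applied in the transfer tool or at the network level.

```python
import time


class TokenBucket:
    """Allow `rate` bytes per second, with bursts of up to `capacity` bytes."""

    def __init__(self, rate, capacity, clock=time.monotonic, sleep=time.sleep):
        self.rate = float(rate)
        self.capacity = float(capacity)
        self.tokens = float(capacity)  # start with a full bucket
        self.clock = clock
        self.sleep = sleep
        self.last = clock()

    def consume(self, nbytes):
        """Block until `nbytes` tokens are available, then spend them."""
        if nbytes > self.capacity:
            raise ValueError("request larger than bucket capacity")
        while True:
            now = self.clock()
            # Refill according to elapsed time, never above capacity.
            self.tokens = min(self.capacity,
                              self.tokens + (now - self.last) * self.rate)
            self.last = now
            if self.tokens >= nbytes:
                self.tokens -= nbytes
                return
            # Wait just long enough for the missing tokens to accumulate.
            self.sleep((nbytes - self.tokens) / self.rate)


def throttled_copy(src, dst, bucket, chunk_size=64 * 1024):
    """Copy src to dst without exceeding the bucket's rate; return bytes copied."""
    total = 0
    while True:
        chunk = src.read(chunk_size)
        if not chunk:
            return total
        bucket.consume(len(chunk))
        dst.write(chunk)
        total += len(chunk)
```

For example, `throttled_copy(src, dst, TokenBucket(rate=10 * 1024 * 1024, capacity=1024 * 1024))` would cap a copy between two open binary file objects at roughly 10 MB/s. The bucket capacity must be at least `chunk_size`, otherwise `consume` raises `ValueError`.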