A quick followup for the archives...
General Recommendation:
Some older systems automatically set /proc/sys/vm/zone_reclaim_mode to 1,
which can cause severe performance problems in various cases. This
appears to happen on only a small range of systems (Nehalem in
particular).
If in doubt, just set /proc/sys/vm/zone_reclaim_mode to 0.
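For anyone who wants to script the check, here is a minimal Python sketch
that reads the current value and writes 0 if it is set to anything else.
The function names are my own, and the path argument exists only so the
functions can be tried against an ordinary file. Writing to the real /proc
entry requires root.

```python
from pathlib import Path

# Standard location of the tunable on Linux.
ZONE_RECLAIM_PATH = "/proc/sys/vm/zone_reclaim_mode"


def read_zone_reclaim_mode(path=ZONE_RECLAIM_PATH):
    """Return the current zone_reclaim_mode value as an integer."""
    return int(Path(path).read_text().strip())


def disable_zone_reclaim(path=ZONE_RECLAIM_PATH):
    """Write 0 if the current value is non-zero; return the previous value."""
    previous = read_zone_reclaim_mode(path)
    if previous != 0:
        Path(path).write_text("0\n")
    return previous


# Example use (as root, on a real node):
#   old = disable_zone_reclaim()
#   print("zone_reclaim_mode was", old, "now", read_zone_reclaim_mode())
```

A value written this way does not survive a reboot. The usual way to make
it persistent is a "vm.zone_reclaim_mode = 0" line in /etc/sysctl.conf (or
in whatever mechanism builds your node image).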
People have been recommending this for years, and the mainline Linux kernel
has finally adopted it as the default.
For non-HPC uses it will probably help.
For HPC uses it probably does not matter. Where it might matter, you
are probably already doing enough other NUMA-aware processing that it
makes no difference.
More details:
We were seeing severe performance issues on our diskless systems with
an application doing mmap reads of large files on GPFS. The I/O
pattern was sequential reads of a large file. The file was 5-10 times
the size of RAM on the nodes.
We tracked this down to 'pgscand/s' in the 'sar -B' output reaching
outrageous levels (13M pages scanned per second in an attempt to find
pages to free).
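If sar is not installed on a node, roughly the same number can be derived
from /proc/vmstat. The sketch below is my own approximation, not how we
originally measured it: it assumes the direct-reclaim scan counters are the
fields whose names start with "pgscan_direct" (per-zone on older kernels,
a single field on newer ones) and skips "pgscan_direct_throttle", which
counts throttling events rather than pages. The function names are
hypothetical.

```python
import time


def direct_scan_total(vmstat_text):
    """Sum the pgscan_direct* page counters from /proc/vmstat-style text."""
    total = 0
    for line in vmstat_text.splitlines():
        parts = line.split()
        if len(parts) != 2:
            continue
        name, value = parts
        # Assumption: pgscan_direct_throttle is an event count, not pages.
        if name.startswith("pgscan_direct") and name != "pgscan_direct_throttle":
            total += int(value)
    return total


def direct_scan_rate(interval=1.0, path="/proc/vmstat"):
    """Return pages scanned by direct reclaim per second over `interval`."""
    with open(path) as f:
        before = direct_scan_total(f.read())
    time.sleep(interval)
    with open(path) as f:
        after = direct_scan_total(f.read())
    return (after - before) / interval


# Example use on a node:
#   print(direct_scan_rate(5.0), "pages/s scanned by direct reclaim")
```

A rate that stays in the millions while the application is mostly waiting
is the symptom we saw.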
Some googling led us to:
<http://engineering.linkedin.com/performance/optimizing-linux-memory-management-low-latency-high-throughput-databases>
Although it describes a fairly different problem, this was just the
information we needed.
We found that /proc/sys/vm/zone_reclaim_mode was being set to 1 on our
systems despite various documentation indicating that the default
value should be 0.
It appears that this largely impacts Nehalem processors.
Red Hat 6.3 appears to have addressed the issue:
<https://access.redhat.com/documentation/en-US/Red_Hat_Enterprise_Linux/6/html/6.3_Technical_Notes/kernel_issues.html>
indicates that Red Hat 6.1 had an incorrect value of 1 and that RH6.2
and RH6.3 corrected that defect. We are running CentOS 6.5.
Red Hat also has something about the inconsistencies at:
<https://access.redhat.com/solutions/60669>
However, it needs a subscriber login. We do have a number of Red Hat
licenses, but we don't use them, preferring to use a single consistent
image.
A few other items of interest:
Something from 2010 "zone_reclaim_mode is the essence of all evil":
<http://www.poempelfox.de/blog/2010/03/19/>
It looks like setting zone_reclaim_mode to 0 was proposed at least as early
as 2009. I'm unclear what happened with this patch:
<http://osdir.com/ml/linux-kernel/2009-05/msg05670.html>
It appears that the mainline Linux kernel has just recently merged a
change claiming to do this (although the diff does not appear to do it
very directly).
<http://rhaas.blogspot.com/2014/06/linux-disables-vmzonereclaimmode-by.html>
Stuart
--
I've never been lost; I was once bewildered for three days, but never lost!
-- Daniel Boone
_______________________________________________
xCAT-user mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/xcat-user