The default value is either one or zero, depending on the distance between
NUMA nodes as reported by the numactl utility (numactl --hardware). If the
distance is over 20, zone_reclaim_mode will default to one. The Nehalem nodes
report a distance of 21; SandyBridge and IvyBridge report lower values.
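
A rough way to check this on a node is sketched below. It assumes the
kernel exposes node distances under /sys/devices/system/node/node*/distance
and takes the threshold of 20 from the description above. The helper names
max_distance and expected_mode are made up for illustration, and the result
only predicts the boot-time default on kernels that still autotune; it says
nothing about a value set later via sysctl.

```shell
#!/bin/sh
# Sketch only: compare the largest NUMA distance on this node with the
# threshold of 20 described above. Helper names are hypothetical.

# max_distance: read whitespace-separated distances on stdin, print the largest.
max_distance() {
    awk '{ for (i = 1; i <= NF; i++) if ($i + 0 > max) max = $i + 0 }
         END { print max + 0 }'
}

# expected_mode DISTANCE: print 1 if DISTANCE is over 20, otherwise 0.
expected_mode() {
    if [ "$1" -gt 20 ]; then echo 1; else echo 0; fi
}

# Assumed sysfs layout; if the files are missing the distance is reported as 0.
dist=$(cat /sys/devices/system/node/node*/distance 2>/dev/null | max_distance)
mode=$(cat /proc/sys/vm/zone_reclaim_mode 2>/dev/null || echo unknown)
echo "max distance: $dist, expected default: $(expected_mode "$dist"), actual: $mode"
```

Run across a cluster (for example through xdsh, as in the quoted message),
this should make it easy to spot nodes where the actual value differs from
the expected default.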

  -- ddj
Dave Johnson

> On Sep 19, 2014, at 1:42 PM, Christopher Samuel <[email protected]> wrote:
> 
>> On 20/09/14 03:27, Christopher Samuel wrote:
>> 
>> To be honest I'm suspicious of kernel autotuning here..
> 
> OK, did a quick bit of poking around.
> 
> On our M4 iDataplex I see 0 everywhere.
> 
> *but* on our M2 cluster I see:
> 
> [root@merri-m ~]# xdsh all -v cat /proc/sys/vm/zone_reclaim_mode | xcoll
> ====================================
> idataplex,login,terri
> ====================================
> 1
> 
> ====================================
> merri081,merri083,turpin,merri082
> ====================================
> 0
> 
> So all the M2's have 1, along with an (old, 2011, Westmere) SGI UV10.
> 
> The nodes that are all 0 are x3690 X5 (Westmere with Maxx5's).
> 
> Now they're all booting the same statelite image and so I know it's
> not something we've configured:
> 
> provmethod=rhel64_gpfs35016_updates
> 
> So I'd say that's highly likely to be kernel autotuning.
> 
> 
> Aha - there was a patch merged for 3.16 to disable it by default!
> 
> commit 4f9b16a64753d0bb607454347036dc997fd03b82
> Author: Mel Gorman <[email protected]>
> Date:   Wed Jun 4 16:07:14 2014 -0700
> 
>    mm: disable zone_reclaim_mode by default
> 
>    When it was introduced, zone_reclaim_mode made sense as NUMA distances
>    punished and workloads were generally partitioned to fit into a NUMA
>    node.  NUMA machines are now common but few of the workloads are
>    NUMA-aware and it's routine to see major performance degradation due to
>    zone_reclaim_mode being enabled but relatively few can identify the
>    problem.
> 
>    Those that require zone_reclaim_mode are likely to be able to detect
>    when it needs to be enabled and tune appropriately so lets have a
>    sensible default for the bulk of users.
> 
> 
> That modified the text describing the sysctl in vm.txt, and that
> old text confirms that the kernel does autotune this..
> 
> zone_reclaim_mode is set during bootup to 1 if it is determined that pages
> from remote zones will cause a measurable performance reduction. The
> page allocator will then reclaim easily reusable pages (those page
> cache pages that are currently not used) before allocating off node pages.
> 
> So there you go!
> 
> Best of luck,
> Chris (off to catch a plane)
> -- 
> Christopher Samuel        Senior Systems Administrator
> VLSCI - Victorian Life Sciences Computation Initiative
> Email: [email protected] Phone: +61 (0)3 903 55545
> http://www.vlsci.org.au/      http://twitter.com/vlsci
> 
> 
> ------------------------------------------------------------------------------
> Slashdot TV.  Video for Nerds.  Stuff that Matters.
> http://pubads.g.doubleclick.net/gampad/clk?id=160591471&iu=/4140/ostg.clktrk
> _______________________________________________
> xCAT-user mailing list
> [email protected]
> https://lists.sourceforge.net/lists/listinfo/xcat-user
