RE: [PATCH 4/4] zone_reclaim_mode is always 0 by default

From: Zhang, Yanmin
Date: Tue May 19 2009 - 01:07:21 EST


>>-----Original Message-----
>>From: KOSAKI Motohiro [mailto:kosaki.motohiro@xxxxxxxxxxxxxx]
>>Sent: May 19, 2009 12:31
>>To: Zhang, Yanmin
>>Cc: kosaki.motohiro@xxxxxxxxxxxxxx; Wu, Fengguang; LKML; linux-mm; Andrew
>>Morton; Rik van Riel; Christoph Lameter
>>Subject: Re: [PATCH 4/4] zone_reclaim_mode is always 0 by default
>>
>>> >>-----Original Message-----
>>> >>From: KOSAKI Motohiro [mailto:kosaki.motohiro@xxxxxxxxxxxxxx]
>>> >>Sent: May 19, 2009 10:54
>>> >>To: Wu, Fengguang
>>> >>Cc: kosaki.motohiro@xxxxxxxxxxxxxx; LKML; linux-mm; Andrew Morton; Rik van
>>> >>Riel; Christoph Lameter; Zhang, Yanmin
>>> >>Subject: Re: [PATCH 4/4] zone_reclaim_mode is always 0 by default
>>> >>
>>> >>> On Wed, May 13, 2009 at 12:08:12PM +0900, KOSAKI Motohiro wrote:
>>> >>> > Subject: [PATCH] zone_reclaim_mode is always 0 by default
>>> >>> >
>>> >>> > Current linux policy is, if the machine has large remote node distance,
>>> >>> > zone_reclaim_mode is enabled by default because we've be able to assume
>>> >>Fortunately (or unfortunately), typical workload and machine size were
>>> >>strongly correlated.
>>> >>Thus, the current default setting calculation worked well in the past.
>>> [YM] Your analysis is clear and deep.
>>
>>Thanks!
>>
>>
>>> >>Now it is broken. What should we do?
>>> >>Yanmin, we know 99% of Linux people use Intel CPUs, and you are one of
>>> >>the most persistent testers on LKML and you test a lot.
>>> [YM] It's very easy to reproduce them on my machines. :) Sometimes that is
>>> because the issues only exist on machines with lots of CPUs, while other
>>> community developers have no such environments.
>>>
>>> >>May I ask which machines and benchmarks you tested?
>>> [YM] Usually I start lots of benchmark testing against the latest kernel,
>>> but this issue was first reported by a customer. The customer runs Apache
>>> on Nehalem machines to access lots of files, so the issue is an example of
>>> a file server workload.
>>
>>Hmm.
>>I'm surprised by this report. I didn't know about this problem.
[YM] Did you run a file server workload on such a NUMA machine with
zone_reclaim_mode=1? If all nodes have the same amount of memory, the behavior
is obvious.


>>
>>Actually, I don't think Apache is only a file server.
>>Apache is one of the killer applications on Linux. It runs in a very wide
>>range of organizations.
[YM] I know that. Apache can serve documents, e-commerce, and lots of other
usage models. What I mean is that one customer hit it with their
workload.


>>Do you think large machines don't run Apache? I don't think so.
>>
>>
>>
>>> BTW, I found that many fio test cases have a big drop after I upgraded the
>>> BIOS of one Nehalem machine. By checking vmstat data, I found almost half
>>> of the memory is always free. It's also related to zone_reclaim_mode
>>> because the new BIOS changes the node distance to a large value. I use
>>> numactl --interleave=all to work around the problem temporarily.
>>>
>>> I have no HPC environment.
>>
>>Yeah, that's ok. Christoph and I have one. My worry is that workloads unknown
>>to me will regress.
>>So, may I assume you ran your benchmarks with both zone reclaim 0 and 1 and
>>haven't seen regressions in non-zone-reclaim mode?
[YM] What is non-zone reclaim mode? Is that zone_reclaim_mode=0?
I didn't do that intentionally. So far I have only confirmed that fio has a big
drop when zone_reclaim_mode=1. I might test it with other benchmarks on 2
Nehalem machines.
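
When comparing the two settings, it helps to record what the machine actually
reports. The following is a minimal Python sketch, not anything from the kernel
tree: it reads /proc/sys/vm/zone_reclaim_mode and the per-node distance files
under /sys/devices/system/node/, then checks whether any node distance exceeds
a threshold. The threshold of 20 and all function names here are assumptions
for illustration; the real cutoff depends on the kernel version.

```python
from pathlib import Path

# Assumption: illustrative cutoff only; check your kernel's RECLAIM_DISTANCE.
ASSUMED_RECLAIM_DISTANCE = 20


def parse_distances(text):
    """Parse one node's distance line, e.g. '10 21' -> [10, 21]."""
    return [int(tok) for tok in text.split()]


def exceeds_threshold(distance_rows, threshold=ASSUMED_RECLAIM_DISTANCE):
    """Return True if any node-to-node distance is above the threshold."""
    return any(d > threshold for row in distance_rows for d in row)


def read_system_state(root="/"):
    """Read zone_reclaim_mode and node distances; missing files yield None/[]."""
    base = Path(root)
    mode_file = base / "proc/sys/vm/zone_reclaim_mode"
    mode = int(mode_file.read_text().strip()) if mode_file.exists() else None

    rows = []
    node_dir = base / "sys/devices/system/node"
    if node_dir.is_dir():
        # Lexicographic order (node10 sorts before node2); fine for this check.
        for dist_file in sorted(node_dir.glob("node[0-9]*/distance")):
            rows.append(parse_distances(dist_file.read_text()))
    return mode, rows


if __name__ == "__main__":
    mode, rows = read_system_state()
    print("zone_reclaim_mode:", mode)
    print("node distances:", rows)
    print("exceeds assumed threshold:", exceeds_threshold(rows))
```

Running this before and after a BIOS upgrade would show whether the reported
node distances changed, which is the situation described for the fio drop.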


>>If so, that encourages me very much.
>>
>>If disabling zone reclaim mode doesn't cause regressions, I'll push again to
>>remove the default zone reclaim mode completely.
[YM] I run lots of benchmarks, but that doesn't mean I run all benchmarks; in
particular, I run no HPC benchmarks.


>>
>>
>>> >>If workloads that favor zone_reclaim=0 are far more common than workloads
>>> >>that favor zone_reclaim=1,
>>> >>we can drop our fears and we would prioritize your opinion, of course.
>>> So it seems only file servers have the issue currently.
