Re: [Approach #2] [RFC][PATCH] Remove cgroup member from struct page

From: Balbir Singh
Date: Wed Sep 10 2008 - 10:36:20 EST


KAMEZAWA Hiroyuki wrote:
> On Thu, 11 Sep 2008 07:02:44 +1000
> Nick Piggin <nickpiggin@xxxxxxxxxxxx> wrote:
>
>> On Wednesday 10 September 2008 21:03, KAMEZAWA Hiroyuki wrote:
>>> On Thu, 11 Sep 2008 06:44:37 +1000
>>>
>>> Nick Piggin <nickpiggin@xxxxxxxxxxxx> wrote:
>>>> On Wednesday 10 September 2008 11:49, KAMEZAWA Hiroyuki wrote:
>>>>> On Tue, 9 Sep 2008 18:20:48 -0700
>>>>>
>>>>> Balbir Singh <balbir@xxxxxxxxxxxxxxxxxx> wrote:
>>>>>> * KAMEZAWA Hiroyuki <kamezawa.hiroyu@xxxxxxxxxxxxxx> [2008-09-09 21:30:12]:
>>>>>>
>>>>>> OK, here is approach #2, it works for me and gives me
>>>>>> really good performance (surpassing even the current memory
>>>>>> controller). I am seeing almost a 7% increase
>>>>> This number is from pre-allocation, maybe.
>>>>> Do we really allocate all page_cgroup at boot? This seems a big change.
>>>> It seems really nice to me -- we get the best of both worlds, less
>>>> overhead for those who don't enable the memory controller, and even
>>>> better performance for those who do.
>>> I have no trouble with the allocate-all-at-boot policy.
>>> My small concerns are:
>>> - wasting page_cgroup for hugepage area.
>>> - memory hotplug
>> In those cases you still waste the struct page area too. I realise that
>> isn't a good way to justify even more wastage. But I guess it is
>> relatively low. At least, I would think the users would be more happy to
>> get a 7% performance increase for small pages! :)
>>
> I guess the increase is mostly because we can completely avoid the kmalloc/kfree slow path.
>

Correct

> Balbir, how about we settle on the allocate-all-at-boot policy?
> If you say yes, I think I can help you, and I'll salvage the usable parts from my garbage.
>

I am perfectly fine with it. I'll need your expertise to get the
alloc-at-boot policy correct.

> The following are the results for the lockless+remove-page-cgroup-pointer-from-page-struct patch.
>
> rc5-mm1
> ==
> Execl Throughput                    3006.5 lps   (29.8 secs, 3 samples)
> C Compiler Throughput               1006.7 lpm   (60.0 secs, 3 samples)
> Shell Scripts (1 concurrent)        4863.7 lpm   (60.0 secs, 3 samples)
> Shell Scripts (8 concurrent)         943.7 lpm   (60.0 secs, 3 samples)
> Shell Scripts (16 concurrent)        482.7 lpm   (60.0 secs, 3 samples)
> Dc: sqrt(2) to 99 decimal places  124804.9 lpm   (30.0 secs, 3 samples)
>
> lockless
> ==
> Execl Throughput                    3035.5 lps   (29.6 secs, 3 samples)
> C Compiler Throughput               1010.3 lpm   (60.0 secs, 3 samples)
> Shell Scripts (1 concurrent)        4881.0 lpm   (60.0 secs, 3 samples)
> Shell Scripts (8 concurrent)         947.7 lpm   (60.0 secs, 3 samples)
> Shell Scripts (16 concurrent)        485.0 lpm   (60.0 secs, 3 samples)
> Dc: sqrt(2) to 99 decimal places  125437.9 lpm   (30.0 secs, 3 samples)
>
> lockless + remove page cgroup pointer (my version).
> ==
> Execl Throughput                    3021.1 lps   (29.5 secs, 3 samples)
> C Compiler Throughput                980.3 lpm   (60.0 secs, 3 samples)
> Shell Scripts (1 concurrent)        4600.0 lpm   (60.0 secs, 3 samples)
> Shell Scripts (8 concurrent)         915.7 lpm   (60.0 secs, 3 samples)
> Shell Scripts (16 concurrent)        468.3 lpm   (60.0 secs, 3 samples)
> Dc: sqrt(2) to 99 decimal places  124909.1 lpm   (30.0 secs, 3 samples)
>
> Oh, yes. A significant slowdown. I'm glad to kick this patch out to the trash box.
>
> Thanks,
> -Kame
>
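
To make the three result sets quoted above easier to compare, here is a
minimal sketch (Python, not part of the original posting) that computes the
per-test percentage change relative to rc5-mm1. The throughput figures are
copied from the quoted results; the names RESULTS, pct_change and deltas, and
the short key "lockless+remove", are made up for illustration only.

```python
# Throughput figures copied from the quoted results (lps/lpm, higher is better).
# The dictionary layout and key names are illustrative assumptions.
RESULTS = {
    "rc5-mm1": {
        "Execl Throughput": 3006.5,
        "C Compiler Throughput": 1006.7,
        "Shell Scripts (1 concurrent)": 4863.7,
        "Shell Scripts (8 concurrent)": 943.7,
        "Shell Scripts (16 concurrent)": 482.7,
        "Dc: sqrt(2) to 99 decimal places": 124804.9,
    },
    "lockless": {
        "Execl Throughput": 3035.5,
        "C Compiler Throughput": 1010.3,
        "Shell Scripts (1 concurrent)": 4881.0,
        "Shell Scripts (8 concurrent)": 947.7,
        "Shell Scripts (16 concurrent)": 485.0,
        "Dc: sqrt(2) to 99 decimal places": 125437.9,
    },
    "lockless+remove": {
        "Execl Throughput": 3021.1,
        "C Compiler Throughput": 980.3,
        "Shell Scripts (1 concurrent)": 4600.0,
        "Shell Scripts (8 concurrent)": 915.7,
        "Shell Scripts (16 concurrent)": 468.3,
        "Dc: sqrt(2) to 99 decimal places": 124909.1,
    },
}


def pct_change(base, new):
    """Relative change of `new` against `base`, in percent."""
    return (new - base) / base * 100.0


def deltas(baseline, candidate):
    """Per-test percentage change of `candidate` relative to `baseline`."""
    base = RESULTS[baseline]
    return {test: pct_change(base[test], value)
            for test, value in RESULTS[candidate].items()}


if __name__ == "__main__":
    for kernel in ("lockless", "lockless+remove"):
        print(kernel)
        for test, change in deltas("rc5-mm1", kernel).items():
            print(f"  {test:<34}{change:+7.2f}%")
```

With these figures, the lockless run comes out slightly ahead of rc5-mm1 on
every test, while the lockless + remove-pointer run loses roughly 2.6 to 5.4
percent on the C compiler and shell script tests. Each figure comes from only
3 samples, so small differences should be read with care.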


--
Thanks,
Balbir