Re: [PATCH] remove rq->lock from cpuacct cgroup v2

From: KAMEZAWA Hiroyuki
Date: Wed Mar 04 2009 - 09:17:18 EST


Bharata B Rao さんは書きました:
> On Wed, Mar 4, 2009 at 1:50 PM, KAMEZAWA Hiroyuki
> <kamezawa.hiroyu@xxxxxxxxxxxxxx> wrote:
>> On Wed, 4 Mar 2009 13:24:43 +0530
>> Bharata B Rao <bharata.rao@xxxxxxxxx> wrote:
>>
>>> Instead of subsystems handling all these percpu counter problems
>>> themselves, shouldn't we be using percpu_counter subsytem and let it
>>> handle all the issues transparently for us ? I am not sure if all
>>> these problems have been addressed in percpu_counter, but would like
>>> to know why we are not using percpu_counter for these kinds of things
>>> and enhance percpu_counter if it can't handle some of the issues which
>>> we are solving here specifically for cpuacct subsystem ?
>>>
>> At first, generic per-cpu counter sounds interesting but to be honest,
>> some special handling is used for cpuacct based on its characteristic.
>>
>
> Just trying to understand this clearly ...
>
>> &#160;- Writer works under non-preemptable context.
>
> Which means the writer is running with preemption disabled.
> percpu_counter writes (via percpu_counter_add) don't assume anything
> and disable preemption themselves. Is this the problem or is there a
> bigger issue here why percpu_counter can't be used ?
>
Ah, just means it doesn't call disable_preempt() explicitly and
use __get_per_cpu() (not get_cpu_var) and smp_processor_id (not get_cpu).


>> &#160;- There is only one writer.
>
> Not sure how you have optimized for this case in cpuacct.
Because there is only one writer, I used seq_counter not seq_lock.
(there are no atomic ops by this.)

> percpu_counters use spinlocks to serialize writers. Are you saying
> using spinlocks for this 1 writer case is too much ?
I'm sorry I noticed there is lib/percpu_counter now.......
I think lib/percpu_counter does proper jobs.

> Also note that they update 32 bit percpu counters without any lock
> and take spinlocks only when they do a batch update to the 64bit counter.
>
Hmm, cpuacct's one uses hierachical update of several counters at once.
So, some private code like mine is not so bad, I think.

What I think now is it's ok to see your patch first and later do
this kind of update to avoid locks if necessary. This patch is micro
optimization and not in hurry.

Thanks,
-Kame

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/