Re: local_add_return

From: Mathieu Desnoyers
Date: Fri Dec 19 2008 - 12:06:44 EST


* Rusty Russell (rusty@xxxxxxxxxxxxxxx) wrote:
> On Friday 19 December 2008 14:05:14 Mathieu Desnoyers wrote:
> > * Rusty Russell (rusty@xxxxxxxxxxxxxxx) wrote:
> > But can we turn what you call "nmi_safe_t" into "local_atomic_t" then ?
> > Because we have to specify that this type must only be used as part of
> > per-cpu data with preemption disabled, and we also specify that it is
> > atomic.
> >
> > Plus, nmi_safe_t does not make much sense on architectures without NMIs,
> > where we sometimes disable interrupts to make the modification "atomic"
> > wrt all other interrupts that can happen.
>
> But those archs can use local_t. I don't like either name local_t nor
> atomic_local_t, but renaming sucks too.
>
> OK, how about a different approach? Since there's really only one case
> where we need this local_t property outside arch-specific code, how about
> we define ARCH_LOCAL_T_TRACE_SAFE for x86?
>
> Then some trace-specific typedef like "trace_counter_t" which goes to local_t
> or atomic_(long?)_t?
>
> Should be a simple patch and pretty clear.
>

Hrm, is it just me, or does linking a basic type definition to a single
user seem like the wrong approach?

The idea behind declaring new types is, to me, that they should describe
as generally as possible what they provide and what they are. If we
think of the future, where we might want to use such local atomic types
for purposes other than tracing, I think we will end up regretting such
a specific naming scheme. I don't think the argument "because the type
has only one arch-independent user" holds, because the idea behind new
types is that they _will_ be used by others eventually. For instance,
we did some work last year on moving the slub allocator to such local
atomic operations, and it gave very good results on architectures where
disabling interrupts is costly (threefold acceleration of the
fastpath).

In your trace_counter_t proposal, you don't take into account that (what
I call) local_atomic_long_t is a _new_ primitive, which cannot be
implemented by a trivalue and differs from atomic_long_t, on more
architectures than x86. On mips and powerpc, at least, it can be
implemented as an atomic operation without the memory barriers, which
improves performance a lot.

I think the following scheme would be pretty simple and yet not tied to
any specific user :

local_long_t
  - Fast per-cpu counter, not necessarily atomic.
    Implements long trivalues, or uses local_atomic_long_t.
local_atomic_long_t
  - Fast per-cpu atomic counter.
    Implements per-cpu atomic counters or uses atomic_long_t.
atomic_long_t
  - Global atomic counter.
    Implements globally synchronized atomic operations.

We could do the same with "int" type for :
local_t
local_atomic_t
atomic_t

If we need smaller counters.
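
To make the three tiers concrete, here is a rough userspace sketch using
C11 atomics. It is only an analogy, not kernel code: the demo_* names are
hypothetical, and a relaxed C11 atomic is still atomic across CPUs,
whereas the local atomic type only has to be atomic wrt the local CPU.
The relaxed ordering merely stands in for the "atomic operation without
the memory barriers" case (mips, powerpc) mentioned above.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical userspace stand-ins for the three proposed tiers. */

/* Tier 1 (local_long_t): plain counter, no atomicity guarantee. */
typedef struct { long v; } demo_local_long_t;

/* Tier 2 (local_atomic_long_t): atomic update, no memory ordering. */
typedef struct { atomic_long v; } demo_local_atomic_long_t;

/* Tier 3 (atomic_long_t): atomic update with full ordering. */
typedef struct { atomic_long v; } demo_atomic_long_t;

/* Plain read-modify-write; an interrupt in the middle could lose it. */
static inline long demo_local_add_return(long i, demo_local_long_t *l)
{
    l->v += i;
    return l->v;
}

/* Single atomic add with relaxed ordering (assumption: models the
 * barrier-free variant; the real type would be per-cpu only). */
static inline long demo_local_atomic_add_return(long i,
                                                demo_local_atomic_long_t *l)
{
    return atomic_fetch_add_explicit(&l->v, i, memory_order_relaxed) + i;
}

/* Atomic add with sequentially consistent ordering. */
static inline long demo_atomic_add_return(long i, demo_atomic_long_t *a)
{
    return atomic_fetch_add_explicit(&a->v, i, memory_order_seq_cst) + i;
}

/* Exercise each tier once from a single thread; returns the sum of
 * the three results. */
static long demo_run(void)
{
    demo_local_long_t plain = { 0 };
    demo_local_atomic_long_t local_atomic = { 0 };
    demo_atomic_long_t global = { 0 };

    long a = demo_local_add_return(1, &plain);
    long b = demo_local_atomic_add_return(2, &local_atomic);
    long c = demo_atomic_add_return(3, &global);

    assert(a == 1 && b == 2 && c == 3);
    return a + b + c;
}
```

The point of the sketch is that all three share the same add_return
shape and differ only in the guarantee they provide, so a caller picks
the weakest tier that is still correct for its context.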

Mathieu


> Thanks,
> Rusty.

--
Mathieu Desnoyers
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F BA06 3F25 A8FE 3BAE 9A68