Re: local_add_return

From: Mathieu Desnoyers
Date: Tue Dec 16 2008 - 11:25:45 EST


* Rusty Russell (rusty@xxxxxxxxxxxxxxx) wrote:
> On Tuesday 16 December 2008 00:17:35 Steven Rostedt wrote:
> > Shouldn't local_add_return be a way for archs that can increment a memory
> > location atomically against interrupts to use that infrastructure? It can
> > simply fall back to atomic_add_return for those archs that do not have
> > a lesser equivalent of atomic_add_return.
>
> local_t was originally introduced (but actually never used for) the
> SNMP counters. They use two counters to avoid atomics, but as the ancient
> comment says:
>
> /*
> * FIXME: On x86 and some other CPUs the split into user and softirq parts
> * is not needed because addl $1,memory is atomic against interrupts (but
> * atomic_inc would be overkill because of the lock cycles). Wants new
> * nonlocked_atomic_inc() primitives -AK
> */
> #define DEFINE_SNMP_STAT(type, name) \
> __typeof__(type) *name[2]
>
> Then last year Mathieu sent (and Andrew accepted) a "rich set of atomic
> operations", including excellent documentation "local_ops.txt". Except
> he thought they were atomics, so treated them accordingly. Also, there
> were no users (you're now the only one).
>
> But if these new operations are to become the norm, it changes how archs
> should implement local_t. eg. trivalue becomes less attractive, atomic_long
> more. x86 has its own implementation so doesn't have these issues.
>
> Now, I posted a benchmark patch before for archs to test. I'm interested
> in Sparc64. Does any arch win from using multiple counters? PowerPC has
> soft interrupt disable, so that solution wins over atomic_long_t for them.
>

Hi Rusty,

I'd like to comment on your test case found at
http://groups.google.com/group/linux.kernel/msg/98c512fceda26351

Specifically on this comment :

+/* There are three obvious ways to implement local_t on an arch which
+ * can't do single-instruction inc/dec etc.
+ * 1) atomic_long
+ * 2) irq_save/irq_restore
+ * 3) multiple counters.
+ *
+ * This does a very rough benchmark on each one.
+ */

Option 3) is not workable for tracers, because it's not safe against
some exceptions (e.g. some hardware errors) nor NMIs. Also, local_t
operations must have preemption disabled before playing on per-cpu data,
which I don't see in your test. This has to be taken into account in the
runtime cost. The "multiple counters" options should also disable
preemption, because a thread being moved to another CPU could corrupt
some other thread's data when being rescheduled.

Only two alternatives does not have this preempt_disable() requirement :
atomic_long_t and the CPU_OPS work done by Christoph Lameter which use
segments to address the per-cpu data, which effectively removes the need
for disabling preemption around local_t operations because the CPU ID
becomes encoded in a cpu register.

Otherwise, you can be moved to a different CPU between the moment you
read the CPU ID and the moment you access the local data, which can lead
to corruption with local_t and multiple counters options.

Cheers,

Mathieu

> Cheers,
> Rusty.

--
Mathieu Desnoyers
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F BA06 3F25 A8FE 3BAE 9A68
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/