Re: [Patch] statistics infrastructure - update 9

From: Andi Kleen
Date: Thu Jul 06 2006 - 12:59:03 EST



> Good question. Btw. - faster by what order of magnitude?

pushf + popf costs at least ~18 cycles on K8; on P4 it is much more
(hundreds of cycles) because they synchronize the pipeline there.

A cpu local add would be a few cycles at best and doesn't have
any impact on the pipeline.


> local_irq_save/restore seems to be fine for kernel/profile.c
>
>
> Reason 1:
> cpu_local_* uses __get_cpu_var, which conflicts with struct statistic
> being embedded into struct xyz that is allocated whenever the client
> needs it.
>
> I could try to use local_t in conjunction with local_add etc.
> (as seen in include/linux/dmaengine.h in 2.6.17-mm6).
> Does this also yield a performance gain worth consideration?

Yes, but you would then need preempt_disable(). For non-preemptible
kernels (the vast majority) that would already be a big win.
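
A minimal sketch of the pattern described above, written as a userspace
mock-up so it can be compiled and run outside the kernel. local_t,
local_add(), local_read(), preempt_disable(), preempt_enable() and
smp_processor_id() are real kernel names, but the definitions here are
simplified stand-ins, and the layout of struct statistic (one local_t
slot per CPU) is a hypothetical choice made only for illustration.

```c
/* Userspace stand-ins for kernel primitives; NOT the kernel definitions. */
typedef struct { long v; } local_t;

static int preempt_count;                      /* mock preemption counter */
static void preempt_disable(void) { preempt_count++; }
static void preempt_enable(void)  { preempt_count--; }

/* In the kernel, local_add() is atomic only against the current CPU. */
static void local_add(long i, local_t *l) { l->v += i; }
static long local_read(local_t *l)        { return l->v; }

#define NR_CPUS 4
static int smp_processor_id(void) { return 0; } /* mock: always CPU 0 */

/* Hypothetical layout: one counter slot per CPU, embedded in the struct. */
struct statistic {
    local_t counter[NR_CPUS];
};

void statistic_inc(struct statistic *s, long value)
{
    /* Stay on one CPU between choosing the slot and doing the add. */
    preempt_disable();
    local_add(value, &s->counter[smp_processor_id()]);
    preempt_enable();
}
```

On a non-preemptible kernel preempt_disable() compiles away, so the
update path is reduced to the cpu local add itself.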


> So, removing local_irq_save/restore would require statistics to be
> switched on and their buffers being available all the time. That is,
> buffers holding counters etc. can't be allocated at run time - what
> if allocation fails? (Should I leave this issue to clients?).

Can't you use RCU for this?
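
One way the RCU suggestion could be read, sketched in userspace C: the
buffer pointer is published only after allocation succeeds, and the
update path skips the statistic while the pointer is NULL. C11 atomics
stand in for rcu_assign_pointer() and rcu_dereference(); struct
stat_buf and the statistic_* functions are hypothetical names, and
grace-period handling is only noted in a comment.

```c
#include <stdatomic.h>
#include <stdlib.h>

struct stat_buf { long count; };            /* hypothetical buffer */

struct statistic {
    _Atomic(struct stat_buf *) buf;         /* NULL while switched off */
};

/* Switch on: allocate first, publish only on success. */
int statistic_enable(struct statistic *s)
{
    struct stat_buf *b = calloc(1, sizeof(*b));

    if (!b)
        return -1;                          /* statistic simply stays off */
    /* Stand-in for rcu_assign_pointer(). */
    atomic_store_explicit(&s->buf, b, memory_order_release);
    return 0;
}

/* Update path: no allocation, no irq save/restore. */
void statistic_add(struct statistic *s, long value)
{
    /* Stand-in for rcu_read_lock() + rcu_dereference(). */
    struct stat_buf *b = atomic_load_explicit(&s->buf, memory_order_acquire);

    if (b)
        b->count += value;  /* kernel code would use a per-CPU counter here */
}

/* Switch off: unpublish and hand the old buffer back to the caller. */
struct stat_buf *statistic_disable(struct statistic *s)
{
    /* Kernel code would call synchronize_rcu() before freeing the buffer. */
    return atomic_exchange_explicit(&s->buf, NULL, memory_order_acq_rel);
}
```

With this shape an allocation failure never reaches the update path;
the statistic just stays off and the client can retry later.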


> Reason 4:
> The alleged overhead of local_irq_save/restore (as compared
> to atomic operations)

local_* doesn't need to be atomic. It isn't on x86 at least.
On some other architectures it can be, but I think it's just a SMOP
of fixing them.

-Andi