Re: [thiscpuops upgrade 05/10] x86: Use this_cpu_inc_return for nmicounter

From: Tejun Heo
Date: Fri Nov 26 2010 - 12:06:59 EST


On 11/26/2010 06:02 PM, Christoph Lameter wrote:
> On Fri, 26 Nov 2010, Tejun Heo wrote:
>
>>> - __this_cpu_inc(alert_counter);
>>> - if (__this_cpu_read(alert_counter) == 5 * nmi_hz)
>>> + if (__this_cpu_inc_return(alert_counter) == 5 * nmi_hz)
>>
>> Hmmm... one worry I have is that xadd, being not a very popular
>> operation, might be slower than add and read. Using it for atomicity
>> would probably be beneficial in most cases but have you checked this
>> actually is cheaper?
>
> XADD takes 3 uops; INC and MOV take 1 uop each. So there is one additional uop.
>
> However, a memory fetch from L1 takes a minimum of 4 cycles. Doing that twice
> already ends up with at least 8 cycles.
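
To make the comparison concrete, here is a minimal userspace sketch of the
two forms being discussed. It is only an illustration: alert_counter and
nmi_hz are plain statics standing in for the per-cpu variable and the real
nmi_hz, and inc_return() is a hypothetical helper, not the kernel's
__this_cpu_inc_return(). Plain C also does not force the compiler to emit
XADD; the point is just that the old form touches the counter twice
(increment, then a separate read) while the new form is a single
read-modify-write that hands back the new value.

```c
/* Hypothetical stand-ins for the per-cpu counter and nmi_hz. */
static unsigned int alert_counter;
static unsigned int nmi_hz = 1;

/* Old form: increment, then a separate read of the same location. */
static int hit_threshold_inc_then_read(void)
{
	alert_counter++;
	return alert_counter == 5 * nmi_hz;
}

/* Hypothetical helper: increment and return the new value in one step. */
static unsigned int inc_return(unsigned int *p)
{
	return ++(*p);
}

/* New form: a single operation yields the value used in the comparison. */
static int hit_threshold_inc_return(void)
{
	return inc_return(&alert_counter) == 5 * nmi_hz;
}
```

Both functions report the threshold on the same call; they differ only in
how many times the counter is accessed to get there.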

Thanks for the explanation. It might be worth noting these performance
characteristics in a comment on top of the x86 implementation.
Anyway, for this and the following simple conversion patches:

Reviewed-by: Tejun Heo <tj@xxxxxxxxxx>

--
tejun