Re: [RFC] Extend mwait idle to optimize away IPIs when possible

From: Suresh Siddha
Date: Mon Feb 06 2012 - 21:16:59 EST


On Mon, 2012-02-06 at 12:42 -0800, Venkatesh Pallipadi wrote:
> * Lower overhead on Async IPI send path. Measurements on Westmere based
> systems show savings on "no wait" smp_call_function_single with idle
> target CPU (as measured on the sender side).
> local socket smp_call_func cost goes from ~1600 to ~1200 cycles
> remote socket smp_call_func cost goes from ~2000 to ~1800 cycles

Interesting that the savings on the remote socket are smaller than on the
local socket.

> +int smp_need_ipi(int cpu)
> +{
> +	int oldval;
> +
> +	if (!system_using_cpu_idle_sync || cpu == smp_processor_id())
> +		return 1;
> +
> +	oldval = atomic_cmpxchg(&per_cpu(cpu_idle_sync, cpu),
> +				CPU_STATE_IDLE, CPU_STATE_WAKING);

To avoid too many cache line bounces when the target cpu is in the
running state, shouldn't we first do a plain read to check whether the
state is idle before going ahead with the locked operation?
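
Something along these lines perhaps (just an untested sketch, reusing the
cpu_idle_sync/CPU_STATE_* names from your patch):

int smp_need_ipi(int cpu)
{
	atomic_t *sync;
	int oldval;

	if (!system_using_cpu_idle_sync || cpu == smp_processor_id())
		return 1;

	sync = &per_cpu(cpu_idle_sync, cpu);

	/*
	 * Plain read first: if the target is not idle, don't touch its
	 * cache line with a locked operation at all.
	 */
	oldval = atomic_read(sync);
	if (oldval == CPU_STATE_IDLE)
		oldval = atomic_cmpxchg(sync, CPU_STATE_IDLE,
					CPU_STATE_WAKING);

	if (oldval == CPU_STATE_RUNNING)
		return 1;

	if (oldval == CPU_STATE_IDLE) {
		set_tsk_ipi_pending(idle_task(cpu));
		atomic_set(sync, CPU_STATE_WOKENUP);
	}

	return 0;
}

That keeps the locked access off the target's cache line entirely in the
common RUNNING case, while the return values stay the same as in your
version.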

> +
> +	if (oldval == CPU_STATE_RUNNING)
> +		return 1;
> +
> +	if (oldval == CPU_STATE_IDLE) {
> +		set_tsk_ipi_pending(idle_task(cpu));
> +		atomic_set(&per_cpu(cpu_idle_sync, cpu), CPU_STATE_WOKENUP);
> +	}
> +
> +	return 0;

We should probably disable interrupts around this; otherwise any delay in
the waking -> wokenup transition will leave the idle cpu stuck for a
similar amount of time.
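
i.e., wrap the state transition on the sender side in
local_irq_save()/local_irq_restore(), roughly (again only a sketch, not
tested):

	unsigned long flags;

	local_irq_save(flags);
	oldval = atomic_cmpxchg(&per_cpu(cpu_idle_sync, cpu),
				CPU_STATE_IDLE, CPU_STATE_WAKING);
	if (oldval == CPU_STATE_IDLE) {
		set_tsk_ipi_pending(idle_task(cpu));
		atomic_set(&per_cpu(cpu_idle_sync, cpu), CPU_STATE_WOKENUP);
	}
	local_irq_restore(flags);

With that, the time the idle cpu can spend seeing WAKING is bounded by
this short irq-off region instead of by whatever interrupt happens to
land between the cmpxchg and the atomic_set().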

thanks,
suresh
