Re: idle issues running sembench on 128 cpus

From: Thomas Gleixner
Date: Wed May 04 2011 - 18:34:43 EST


On Wed, 4 May 2011, Andi Kleen wrote:
> Dave Kleikamp <dkleikamp@xxxxxxxxx> writes:
> > I also don't know if it makes sense to be able to tune the cpuidle
> > governors to add more resistance to enter the C3 state, or even being
> > able to switch to a performance governor at runtime, similar to
> > cpufreq.
> >
> > I'd like to hear your thoughts before I dive any deeper into this.
>
> It's fixed on Westmere. There the APIC timer will always tick
> and all that logic is not needed anymore and disabled.
>
> That is mostly fixed. One problem right now is that the
> CLOCK_EVT_FEAT_C3STOP test is inside the lock. But we
> can easily move it out, assuming the clock_event_device
> gets RCU freed or has a reference count.

No, it does not even need refcounting. We can access it outside of the
lock, as this is atomic context called on the cpu which is about to go
idle, and therefore the device cannot go away. Easy and straightforward
fix.
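
To illustrate the reordering, here is a minimal userspace sketch, not
the actual kernel code. All names in it (sketch_clock_dev,
SKETCH_FEAT_C3STOP, sketch_idle_enter and so on) are made up for the
example, and a pthread mutex stands in for the broadcast lock. The only
point it shows is that the feature test happens before the lock is
taken, so a cpu whose timer keeps ticking in deep idle never touches
the shared lock.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for CLOCK_EVT_FEAT_C3STOP. */
#define SKETCH_FEAT_C3STOP 0x1u

/* Hypothetical, heavily reduced model of a per-cpu clock event device. */
struct sketch_clock_dev {
	unsigned int features;
};

/* Stand-in for the global broadcast lock and the state it protects. */
static pthread_mutex_t sketch_broadcast_lock = PTHREAD_MUTEX_INITIALIZER;
static uint64_t sketch_broadcast_mask;     /* cpus relying on broadcast */
static unsigned long sketch_lock_acquisitions; /* counted for illustration */

/*
 * Called by the cpu that is about to go idle, for its own device.
 * Returns true if the cpu had to be switched to broadcast mode.
 */
static bool sketch_idle_enter(const struct sketch_clock_dev *dev, int cpu)
{
	assert(cpu >= 0 && cpu < 64);

	/*
	 * Feature test outside the lock. The assumption, as in the mail,
	 * is that the caller runs on the cpu owning 'dev' and cannot be
	 * preempted, so 'dev' cannot go away underneath us.
	 */
	if (!(dev->features & SKETCH_FEAT_C3STOP))
		return false;

	/* Only devices which stop in deep idle pay for the global lock. */
	pthread_mutex_lock(&sketch_broadcast_lock);
	sketch_lock_acquisitions++;
	sketch_broadcast_mask |= UINT64_C(1) << cpu;
	pthread_mutex_unlock(&sketch_broadcast_lock);

	return true;
}
```

With a device that lacks the C3STOP flag, sketch_idle_enter() returns
false and sketch_lock_acquisitions stays at zero, which is the whole
effect of moving the test out.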

> But yes it would be still good to fix Nehalem too.
>
> One fix would be to make all the masks hierarchical,
> similar to what RCU does. Perhaps even some code
> could be shared with RCU on that because it's a very
> similar problem.

In theory. It's not about the mask. The mask is uninteresting. It's
about the expiry time, which we have to protect. There is nothing
hierarchical about that. It all boils down to _ONE_ single functional
device and you don't want to miss your deadline just because you
decided to be extra clever. RCU does not care much whether you run the
callbacks a tick later or not. Time and timekeeping do.
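
A rough userspace model of why the expiry time is the thing to protect,
again a sketch under stated assumptions and not kernel code: the names
(bc_enter_idle, bc_exit_idle, cpu_next_event, bc_programmed) and the
fixed NR_CPUS of 8 are invented for the example. Every idle cpu hands
its next expiry to one shared device, and that device must always be
programmed to the earliest of them, so the update is a single global
minimum under one lock.

```c
#include <assert.h>
#include <pthread.h>
#include <stdint.h>

#define NR_CPUS  8            /* arbitrary size for the sketch */
#define NO_EVENT UINT64_MAX   /* "nothing pending" marker */

static pthread_mutex_t bc_lock = PTHREAD_MUTEX_INITIALIZER;
static uint64_t cpu_next_event[NR_CPUS]; /* per-cpu expiry, NO_EVENT if none */
static uint64_t bc_programmed = NO_EVENT; /* expiry of the one shared device */

/* Reset the model: no cpu is idle, the device is not armed. */
static void bc_init(void)
{
	for (int cpu = 0; cpu < NR_CPUS; cpu++)
		cpu_next_event[cpu] = NO_EVENT;
	bc_programmed = NO_EVENT;
}

/*
 * A cpu goes idle and hands over its next expiry. The shared device is
 * reprogrammed only if the new expiry is earlier than the current one.
 * Returns the expiry the device is programmed to afterwards.
 */
static uint64_t bc_enter_idle(int cpu, uint64_t expires)
{
	uint64_t programmed;

	assert(cpu >= 0 && cpu < NR_CPUS);

	pthread_mutex_lock(&bc_lock);
	cpu_next_event[cpu] = expires;
	if (expires < bc_programmed)
		bc_programmed = expires;
	programmed = bc_programmed;
	pthread_mutex_unlock(&bc_lock);

	return programmed;
}

/*
 * A cpu leaves idle. Its expiry is dropped and the device is set to the
 * earliest expiry of the cpus which are still idle.
 */
static uint64_t bc_exit_idle(int cpu)
{
	uint64_t earliest = NO_EVENT;

	assert(cpu >= 0 && cpu < NR_CPUS);

	pthread_mutex_lock(&bc_lock);
	cpu_next_event[cpu] = NO_EVENT;
	for (int i = 0; i < NR_CPUS; i++)
		if (cpu_next_event[i] < earliest)
			earliest = cpu_next_event[i];
	bc_programmed = earliest;
	pthread_mutex_unlock(&bc_lock);

	return earliest;
}
```

Splitting cpu_next_event[] into a tree would not remove the final step:
the one device still has to end up with the global minimum, and getting
that value late means a missed deadline.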

Thanks,

tglx