Re: [PATCH V2 0/6][RFC] futex: FUTEX_LOCK with optional adaptivespinning

From: Thomas Gleixner
Date: Tue Apr 06 2010 - 15:33:38 EST


On Tue, 6 Apr 2010, Alan Cox wrote:
> > Do you feel some of these situations would also benefit from some kernel
> > assistance to stop spinning when the owner schedules out? Or are you
> > saying that there are situations where pure userspace spinlocks will
> > always be the best option?
>
> There are cases its the best option - you are assuming for example that
> the owner can get scheduled out. Eg nailing one thread per CPU in some
> specialist high performance situations means they can't.

Fair enough, but that's not the problem Darren is targeting.

> > If the latter, I'd think that they would also be situations where
> > sched_yield() is not used as part of the spin loop. If so, then these
> > are not our target situations for FUTEX_LOCK_ADAPTIVE, which hopes to
> > provide a better informed mechanism for making spin or sleep decisions.
> > If sleeping isn't part of the locking construct implementation, then
> > FUTEX_LOCK_ADAPTIVE doesn't have much to offer.
>
> I am unsure about the approach. As Avi says knowing that the lock owner is
> scheduled out allows for far better behaviour. It doesn't need complex
> per lock stuff or per lock notifier entries on pre-empt either.
>
> A given task is either pre-empted or not and in the normal case of things
> you need this within a process so you've got shared pages anyway. So you
> only need one instance of the 'is thread X pre-empted' bit somewhere in a
> non swappable page.

I fear we might end up with a pinned page per thread to get this
working properly and it restricts the mechanism to process private
locks.

> That gives you something along the lines of
>
> runaddr = find_run_flag(lock);
> do {
> while(*runaddr == RUNNING) {
> if (trylock(lock))
> return WHOOPEE;
> cpu relax
> }
> yield (_on(thread));

That would require a new yield_to_target() syscall, which either
blocks the caller when the target thread is not runnable or returns an
error code which signals to go into the slow path.

> } while(*runaddr != DEAD);
>
>
> which unlike blindly spinning can avoid the worst of any hit on the CPU
> power and would be a bit more guided ?

I doubt that the syscall overhead per se is large enough to justify
all of the above horror. We need to figure out a more efficient way to
do the spinning in the kernel where we have all the necessary
information already. Darren's implementation is suboptimal AFAICT.

Thanks,

tglx
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/