Re: [PATCH 6/6] futex: Add aggressive adaptive spinning argument to FUTEX_LOCK

From: Darren Hart
Date: Thu Apr 08 2010 - 01:58:59 EST


To eliminate syscall overhead from the equation, I modified the testcase to allow forcing the syscall on lock(). Doing so cut the non-adaptive scores by more than half. The adaptive scores dropped accordingly. The relative difference between normal and adaptive remained intact (with my adaptive implementation lagging by 10x). So while the syscall overhead does impact the scores, it is not the source of the performance issue with the adaptive futex implementation I posted.

The following bits were being used to test for spinners and attempt to only allow one spinner. Obviously it failed miserably at that. I found up to 8 spinners running at a time with an instrumented kernel.

@@ -2497,6 +2502,14 @@ static int futex_lock(u32 __user *uaddr, int flags, int detect, ktime_t *time)
 retry:
 #ifdef CONFIG_SMP
 	if (flags & FLAGS_ADAPTIVE) {
+		if (!aas) {
+			ret = get_user(uval, uaddr);
+			if (ret)
+				goto out;
+			if (uval & FUTEX_WAITERS)
+				goto skip_adaptive;
+		}
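
For illustration only, here is a user-space sketch of the decision made by the hunk above. It is not kernel code: ex_may_spin() and ex_count_spinners() are hypothetical helpers, and the EX_ prefix marks a local copy of the flag value. One plausible reading of the failure is that the check only reads the futex word and never records that a spinner exists, so every task that sees the same value reaches the same decision.

```c
#include <assert.h>
#include <stdint.h>

/* Local copy of the FUTEX_WAITERS flag value, prefixed to avoid clashes. */
#define EX_FUTEX_WAITERS 0x80000000u

/*
 * Hypothetical helper mirroring the check in the hunk: returns 1 if a
 * task observing uval would go on to spin, 0 if it would take the
 * skip_adaptive path. "aas" stands for aggressive adaptive spinning.
 */
static int ex_may_spin(uint32_t uval, int aas)
{
	if (!aas && (uval & EX_FUTEX_WAITERS))
		return 0;	/* corresponds to "goto skip_adaptive" */
	return 1;
}

/*
 * Count how many of ntasks tasks would spin if they all observed the
 * same futex value. Nothing here claims a "spinner slot", so the
 * answer is always either 0 or ntasks.
 */
static int ex_count_spinners(uint32_t uval, int aas, int ntasks)
{
	int i, spinners = 0;

	assert(ntasks >= 0);
	for (i = 0; i < ntasks; i++)
		spinners += ex_may_spin(uval, aas);
	return spinners;
}
```

With no blocked waiter yet, ex_count_spinners(tid, 0, 8) admits all eight tasks, which is at least consistent with the spinner counts I saw.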

The trouble is that at this point there are no more bits in the word available for a FUTEX_SPINNER bit. The futex word is the only per-futex storage we have; the futex_q is per task.

If we overload the FUTEX_WAITERS bit, it will force more futex_wake() calls on the unlock() path. It will also effectively disable spinning under contention, as there are bound to be waiters (and so FUTEX_WAITERS set) in that case.
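
To make the unlock() cost concrete, here is a minimal user-space simulation, assuming an unlock path that releases the word with a compare-and-swap and only enters the kernel when that fails. ex_unlock() and ex_wake_calls are hypothetical names, and the counter stands in for a real futex_wake() syscall; this is a sketch, not the posted implementation.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Local copy of the FUTEX_WAITERS flag value, prefixed to avoid clashes. */
#define EX_FUTEX_WAITERS 0x80000000u

/* Counts simulated futex_wake() calls (stand-in for the real syscall). */
static unsigned int ex_wake_calls;

/* Release a lock word currently owned by tid. */
static void ex_unlock(_Atomic uint32_t *word, uint32_t tid)
{
	uint32_t expected = tid;

	assert((tid & EX_FUTEX_WAITERS) == 0);

	/* Fast path: no flag bits set, release entirely in user space. */
	if (atomic_compare_exchange_strong(word, &expected, 0))
		return;

	/*
	 * Slow path: the word is not a bare TID, which in this sketch
	 * means FUTEX_WAITERS is set. Release and "call" futex_wake().
	 */
	atomic_store(word, 0);
	ex_wake_calls++;
}
```

If a spinner also set FUTEX_WAITERS, every unlock() racing with a spinner would take the slow path here, even when nobody is actually asleep.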

Another option I dislike is to forget about robust futexes in conjunction with adaptive futexes and overload the FUTEX_OWNER_DIED bit. Ulrich mentioned in another mail that "If we have 31 bit TID values there isn't enough room for another bit." Since we have two flag bits now, I figured TID values were 30 bits. Is there an option to run with 31 bits or something?
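
For reference, the bit budget can be written out as below. The three mask values match FUTEX_WAITERS, FUTEX_OWNER_DIED and FUTEX_TID_MASK in the kernel's futex header as I understand them; the EX_ prefix and all helper names are hypothetical and exist only so this sketch compiles on its own in user space.

```c
#include <assert.h>
#include <stdint.h>

/* Local copies of the futex word masks, prefixed to avoid clashes. */
#define EX_FUTEX_WAITERS	0x80000000u	/* bit 31 */
#define EX_FUTEX_OWNER_DIED	0x40000000u	/* bit 30 */
#define EX_FUTEX_TID_MASK	0x3fffffffu	/* bits 0-29: owner TID */

/* Build a futex word from an owner TID and the two flags. */
static uint32_t ex_futex_word(uint32_t tid, int waiters, int owner_died)
{
	assert((tid & ~EX_FUTEX_TID_MASK) == 0);	/* TID must fit in 30 bits */
	return tid |
	       (waiters ? EX_FUTEX_WAITERS : 0u) |
	       (owner_died ? EX_FUTEX_OWNER_DIED : 0u);
}

static inline uint32_t ex_futex_owner_tid(uint32_t uval)
{
	return uval & EX_FUTEX_TID_MASK;
}

static inline int ex_futex_has_waiters(uint32_t uval)
{
	return (uval & EX_FUTEX_WAITERS) != 0;
}

static inline int ex_futex_owner_died(uint32_t uval)
{
	return (uval & EX_FUTEX_OWNER_DIED) != 0;
}

/* Number of bits in the 32-bit word not claimed by any mask above. */
static int ex_futex_free_bits(void)
{
	uint32_t unused = ~(EX_FUTEX_WAITERS | EX_FUTEX_OWNER_DIED |
			    EX_FUTEX_TID_MASK);
	int n = 0;

	while (unused) {
		n += (int)(unused & 1u);
		unused >>= 1;
	}
	return n;
}
```

With these masks ex_futex_free_bits() returns 0, so a FUTEX_SPINNER bit would have to come out of the TID field or replace one of the two existing flags.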

Assuming we all agree that these options are "bad", that leaves us looking for somewhere else to store the information we need, which in turn brings us back around to what Avi, Alan, and Ulrich were discussing regarding non-swappable TLS data and a pointer in the futex value.

--
Darren Hart
IBM Linux Technology Center
Real-Time Linux Team