Re: [PATCH] 2.6.16 - futex: small optimization (?)

From: Eric Dumazet
Date: Tue Mar 28 2006 - 05:05:11 EST


Pierre PEIFFER a écrit :
Hi,


I found a (optimization ?) problem in the futexes, during a futex_wake, if the waiter has a higher priority than the waker.

In fact, in this case, the waiter is immediately scheduled and tries to take a lock still held by the waker. This is specially expensive on UP or if both threads are on the same CPU, due to the two task-switchings. This produces an extra latency during a wakeup in pthread_cond_broadcast or pthread_cond_signal, for example.

See below my detailed explanation.

I found a solution given by the patch, at the end of this mail. It works for me on kernel 2.6.16, but the kernel hangs if I use it with -rt patch from Ingo Molnar. So, I have a doubt on the correctness of the patch.

The idea is simple: in unqueue_me, I first check
"if (list_empty(&q->list))"

If yes => we were woken (the list is initialized in wake_futex).
Then, it immediately returns and let the waker drop the key_refs (instead of the waiter).



Its true that futex code implies lot of context switches (kernel side but also user side).

Even if you change kernel behavior in futex_wake(), you wont change the fact that a typical pthread_cond_signal does :

1) lock cond var
lll_lock(cv->lock);
2) wake one waiter if necessary
FUTEX_WAKE(cv->wakeup_seq, 1);
3) unlock cond var

If a waiter process B has higher priority than the wake process A, then most probably, B is scheduled before A had a chance to unlock cond var (step 3))

So B will re-enter kernel (because of the contended cond var lock), and A will re-enter kernel too to futex_wake() process A again, but on cond var lock this time, not on condvar wakeup_seq futex.

Each time a thread enters futex kernel code, an expensive find_extend_vma() lookup is done, (expensive because of the read_lock but also the possible amount of vm_area_struct in mm_struct)

I wish futex code had a special implementation for PTHREAD_SCOPE_PROCESS futexes , where no vma lookups would be necessary at all. Most mutexes or condvar have a process private scope (not shared by different processes)

Eric



-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/