>I ended up rewriting the x86 semaphore code (and some of Richard's pipe
>code too, for that matter, to get rid of some races in waking things up),
>and it doesn't show the problems I saw before, but hey, maybe I just
>exchanged one set of problems for another set that I can't trigger any
>more. Give me feedback, please.
I guess the problem is the pipe code, since I understood the old semaphores
completely and there weren't SMP races there.

Your new semaphores seem completely buggy to me and I am surprised your
kernel works without crash or corruption with them.
task1          task2          task3          -> effect ->   count  sleepers
-----          -----          -----                         -----  --------
                                                            1      0
------- task 0 does a down() ------------------             0      0
------- here task 1,2,3 try to get the lock ---
down()                                                      -1     1
(I avoided the details here)
schedule()
               down()                                       -2     1
               spin_lock()
               sleepers++                                   -2     2
               add_neg(1)                                   -1???  2
               sleepers = 1                                 -1     1
               schedule()
                              down()
                              spin_lock()
                              sleepers++                    -1     2
                              add_neg(1)                    0???????????
                              ret not negative!!
                              sleepers = 0                  0      0
                              wakeup()
                              spin_unlock()
Two tasks got the lock at the same time!!!!!
The above isn't a subtle SMP race, and I think it can trigger easily in the
real world. So I would be surprised if 2.3.15 were rock solid.
Maybe I am missing something in your semaphores but I can't see what.
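
To make the arithmetic in the trace easier to check, here is a minimal
userspace sketch of the two add_neg(1) steps. It is not the kernel code:
sem_model, add_negative() and slow_path_step() are hypothetical names, and
the sketch assumes that add_neg(n) adds n to count and reports whether the
result is negative, and that n is sleepers - 1 (which matches the
add_neg(1) calls made while sleepers is 2). The starting states are taken
from the trace exactly as written.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical model of the two fields the trace tracks. */
struct sem_model {
    atomic_int count;
    int sleepers;   /* protected by a spinlock in the trace; not modelled here */
};

/* Assumed meaning of add_neg(): add delta to *count and return true
 * if the resulting value is negative. */
static bool add_negative(int delta, atomic_int *count)
{
    /* atomic_fetch_add returns the value held before the addition. */
    return atomic_fetch_add(count, delta) + delta < 0;
}

/* One pass through the steps shown in the trace after spin_lock():
 * sleepers++, add_neg(sleepers - 1), then update sleepers.
 * Returns true when the caller would consider the lock acquired. */
static bool slow_path_step(struct sem_model *sem)
{
    sem->sleepers++;
    if (!add_negative(sem->sleepers - 1, &sem->count)) {
        sem->sleepers = 0;      /* "ret not negative": caller proceeds */
        return true;
    }
    sem->sleepers = 1;          /* still contended: caller would schedule() */
    return false;
}
```

Starting from count = -2, sleepers = 1 (task 2 after its down()), the first
call leaves count = -1, sleepers = 1 and returns false. Calling it again
from that state, as the trace does for task 3, leaves count = 0,
sleepers = 0 and returns true.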
Tomorrow I'll continue thinking about this issue and (if I am not dreaming
about the above ;) I'll fix the new semaphores (so the new global set_mb()
patch will be delayed a bit).
Andrea