This is because of the RSB (Return Stack Buffer), introduced around the
time of the P5/MMX. It predicts return addresses, but a push/ret
sequence can throw the prediction stack completely out of sync.
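
To make the effect concrete, here is a toy model of a return-address
predictor. It is only a sketch: the class and function names, the
addresses and the depth of 4 are made up for illustration, and a real
RSB's depth and behaviour on an empty or stale entry vary by CPU. Here
an empty predictor is simply counted as a misprediction.

```python
from collections import deque


class ReturnStackBuffer:
    """Toy return-address predictor (hypothetical, not any real CPU)."""

    def __init__(self, depth=4):
        # Assumed depth; the oldest entry is dropped when the buffer is full.
        self.entries = deque(maxlen=depth)
        self.mispredicts = 0

    def call(self, return_addr):
        # A call instruction records its return address in the predictor.
        self.entries.append(return_addr)

    def ret(self, actual_target):
        # A ret pops the newest prediction and compares it with the
        # address actually taken from the real stack.
        predicted = self.entries.pop() if self.entries else None
        if predicted != actual_target:
            self.mispredicts += 1


def paired_call_ret():
    """Every ret matches an earlier call: predictions stay in sync."""
    rsb = ReturnStackBuffer()
    rsb.call(0x100)  # caller: call f
    rsb.call(0x200)  # f: call g
    rsb.ret(0x200)   # g: ret
    rsb.ret(0x100)   # f: ret
    return rsb.mispredicts


def push_ret_jump():
    """A return address is pushed by hand, so the predictor never sees it."""
    rsb = ReturnStackBuffer()
    rsb.call(0x100)  # caller: call f
    # f: push $0x300, then ret. The push is invisible to the predictor.
    rsb.ret(0x300)   # predictor pops 0x100, real target is 0x300
    # Code at 0x300 later returns to the original caller.
    rsb.ret(0x100)   # predictor is now empty, so this misses as well
    return rsb.mispredicts
```

In this model paired_call_ret() returns 0 and push_ret_jump() returns 2:
the hand-pushed return costs one misprediction itself and a second one
on the enclosing return, because the predictor is left one entry short.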
> o changes "inc 0(ecx)" to "inc (ecx)". gas does not change the
> addressing mode in this case - why waste a full byte of icache
> for every semaphore op?.. :)
Ok.
> o makes the contention case use 'conventional' calling style.
Perhaps this change only makes sense if the out-of-line case ever
returns without scheduling. I'm not sure if it does (but I can think of
a few tricks where that would be useful). If it always schedules, the
RSB misprediction overhead is still there and you've added a bit more
overhead. Though for some cases of semaphore contention I guess the RSB
could be right despite scheduling.
-- Jamie