expiring tasks that are holding semaphores

From: Mike Galbraith (efault@gmx.de)
Date: Thu Apr 24 2003 - 05:15:01 EST


Greetings Folks,

In order to explore why SCHED_RR tasks are being delayed under load on my
box, I added some debug code to semaphores (I knew it was /proc in the
case of my [RR] vmstat, but what about others?). At the moment, normal
semaphores only record the fact that they have been downed in the owner's
task struct. rw_semaphores, however, also have some timeout code
(lib/rwsem.c) to find out _who_ has been holding the semaphore for so
long. In scheduler_tick(), I print out stats on how many expires occur
per second, along with the number of tasks that were expired while
holding a semaphore. See the attached logs... the numbers seem _awfully_
high to me, and if I prevent expiration of these tasks, most of my
SCHED_RR vmstat/top stalls just go away.
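
For anyone curious, here's roughly the shape of the instrumentation (a
minimal sketch only, not the actual patch; the sem_held field and the
counter names are invented for illustration):

#include <linux/kernel.h>
#include <linux/sched.h>
#include <linux/jiffies.h>
#include <asm/semaphore.h>

/* hypothetical per-task counter, added to struct task_struct:
 *	int sem_held;		semaphores currently held
 */

static inline void down_traced(struct semaphore *sem)
{
	down(sem);
	current->sem_held++;	/* record the down in the owner */
}

static inline void up_traced(struct semaphore *sem)
{
	current->sem_held--;
	up(sem);
}

/* called from scheduler_tick() when a task is moved to the expired
 * array; reports once per second */
static void note_expiration(struct task_struct *p)
{
	static unsigned long last_report;
	static int expired, expired_holding_sem;

	expired++;
	if (p->sem_held)
		expired_holding_sem++;
	if (time_after(jiffies, last_report + HZ)) {
		printk(KERN_DEBUG "expires/sec: %d, holding a sem: %d\n",
		       expired, expired_holding_sem);
		expired = expired_holding_sem = 0;
		last_report = jiffies;
	}
}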

I'm wondering whether it's worthwhile adding the wrappers and starvation
timeout code to normal semaphores as well, to see if anybody starves on
those. (Hmm, what about spinlocks... we do have a might_sleep() check,
which amounts to the same thing if you send any lock holder off to
expired land while interactive tasks are doing round robin.) The same
random thoughts occur for yield() in normal tasks, though I've yet to
trigger a printk there.
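
For the normal-semaphore case, something like the below might do (again
just a sketch with invented names, not the real thing; the unlocked
owner check is racy, but that's good enough for a debug printk):

#include <linux/kernel.h>
#include <linux/sched.h>
#include <linux/jiffies.h>
#include <asm/semaphore.h>

/* a plain semaphore wrapped with owner/timestamp bookkeeping, so a
 * periodic check can name whoever has been sitting on it too long --
 * the same idea as the rwsem timeout code in lib/rwsem.c */
struct traced_semaphore {
	struct semaphore sem;
	struct task_struct *owner;	/* NULL while not held */
	unsigned long downed_at;	/* jiffies at acquisition */
};

static inline void down_timed(struct traced_semaphore *ts)
{
	down(&ts->sem);
	ts->owner = current;
	ts->downed_at = jiffies;
}

static inline void up_timed(struct traced_semaphore *ts)
{
	ts->owner = NULL;
	up(&ts->sem);
}

/* run from a timer or from scheduler_tick(): complain if the current
 * holder has had the semaphore for more than five seconds */
static void check_sem_starvation(struct traced_semaphore *ts)
{
	if (ts->owner && time_after(jiffies, ts->downed_at + 5 * HZ))
		printk(KERN_WARNING "sem held >5s by %s (pid %d)\n",
		       ts->owner->comm, ts->owner->pid);
}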

Anyway, logs attached in case they might be of interest.

        -Mike

P.S. The logs are from 67-mm4, in case anyone looks up line numbers and
wonders. There are no twiddlings from yours truly other than the
semaphore stuff... it's virgin code, no wild/crazy/stupid experiments ;-)



