Re: [PATCH 07/10] generic rwsem: implement down_read_critical() / up_read_critical()

From: Michel Lespinasse
Date: Wed May 19 2010 - 19:47:26 EST


On Wed, May 19, 2010 at 6:21 AM, David Howells <dhowells@xxxxxxxxxx> wrote:
> Michel Lespinasse <walken@xxxxxxxxxx> wrote:
>
>> +void __sched down_read_critical(struct rw_semaphore *sem)
>> +{
>> +     might_sleep();
>> +     rwsem_acquire_read(&sem->dep_map, 0, 0, _RET_IP_);
>> +
>> +     LOCK_CONTENDED(sem, __down_read_trylock, __down_read_unfair);
>> +
>> +     preempt_disable();
>
> Shouldn't preemption really be disabled before __down_read_unfair() is called?
> Otherwise you can get an unfair read on a sem and immediately get taken off
> the CPU.  Of course, this means __down_read_unfair() would have to deal with
> that in the slow path:-/

I've asked myself the same question; it is true that we don't fully
prevent ourselves from getting preempted while holding the rwsem here.

My understanding is that Linus wants the preempt_disable() mainly to
prevent threads doing voluntary preemption (anything that might_sleep)
while holding the unfairly acquired rwsem; and also to have people
think twice before adding more down_read_critical() calls.

It wouldn't be difficult to move the preempt_disable() ahead of the
lock acquire fast path. However, I don't think I can do it for the
blocking path, where thread A tries to acquire the lock on behalf of
thread B and then wakes B if it succeeded. I don't think we have an
API for A to say 'I want to disable preemption in thread B'; is there
one?
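
To make the ordering concrete, here is a rough userspace sketch, not
kernel code. Every toy_* name is made up for illustration, the
"preempt count" is just an integer, and the slow path simply pretends
that a waker handed us the lock. It only shows that the fast path can
run with preemption already disabled, while the blocking path still
leaves a window where the lock is held with preemption enabled.

```c
#include <assert.h>

/* Hypothetical stand-ins for preempt_disable()/preempt_enable(). */
static int toy_preempt_count;

static void toy_preempt_disable(void)
{
	toy_preempt_count++;
}

static void toy_preempt_enable(void)
{
	assert(toy_preempt_count > 0);
	toy_preempt_count--;
}

/* Toy semaphore: just counts readers and flags an active writer. */
struct toy_rwsem {
	int readers;
	int writer;
};

/* Fast path: succeeds only if no writer holds the lock. */
static int toy_down_read_trylock(struct toy_rwsem *sem)
{
	if (sem->writer)
		return 0;
	sem->readers++;
	return 1;
}

/*
 * Stand-in for the blocking path. In this sketch we assume the writer
 * releases and a waker grants us the lock while we "sleep".
 */
static void toy_down_read_slowpath(struct toy_rwsem *sem)
{
	/* Sleeping with preemption disabled would be a bug. */
	assert(toy_preempt_count == 0);
	sem->writer = 0;
	sem->readers++;
}

/* Returns 1 if the fast path was taken, 0 if we went through the slow path. */
static int toy_down_read_critical(struct toy_rwsem *sem)
{
	toy_preempt_disable();
	if (toy_down_read_trylock(sem))
		return 1;	/* lock acquired with preemption already off */

	toy_preempt_enable();	/* we must be allowed to sleep */
	toy_down_read_slowpath(sem);
	/* Window: lock is held here but preemption is still enabled. */
	toy_preempt_disable();
	return 0;
}

static void toy_up_read_critical(struct toy_rwsem *sem)
{
	sem->readers--;
	toy_preempt_enable();
}
```

In this sketch the uncontended case never holds the lock with
preemption enabled; the contended case still does, between the slow
path returning and the second toy_preempt_disable().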

> Oh, and something else that occurs to me:  Do unfair readers have to go at the
> front of the wakeup queue?  Can they be slightly less unfair and go either
> before the first reader in the queue or at the back of the queue instead?

Going before the first reader would be fine for our use, as we're
really only using this for mmap_sem and the write holders there don't
keep it very long. I'm not sure what this buys us though.

Going at the back of the queue would mean the critical readers would
still occasionally get blocked behind other readers doing disk
accesses, which we'd like to avoid.
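
To compare the three placements being discussed, here is a small
userspace sketch. It is an illustration only and does not reflect the
actual rwsem wait list: the queue is modelled as a string of 'W'
(writer), 'R' (reader) and 'C' (critical reader) characters, and the
toy_* names and policy enum are assumptions made up for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical placement policies for a critical (unfair) reader. */
enum toy_policy {
	TOY_FRONT,			/* ahead of every waiter */
	TOY_BEFORE_FIRST_READER,	/* behind leading writers only */
	TOY_BACK			/* behind every waiter */
};

/* Index at which a critical reader would be queued under a policy. */
static size_t toy_insert_pos(const char *queue, size_t len,
			     enum toy_policy policy)
{
	size_t i;

	switch (policy) {
	case TOY_FRONT:
		return 0;
	case TOY_BEFORE_FIRST_READER:
		for (i = 0; i < len; i++)
			if (queue[i] == 'R')
				return i;
		return len;	/* no reader queued: same as the back */
	case TOY_BACK:
	default:
		return len;
	}
}

/*
 * Insert a 'C' waiter into the NUL-terminated queue string.
 * cap is the buffer size including the terminating NUL.
 * Returns the insert position, or (size_t)-1 if the buffer is full.
 */
static size_t toy_enqueue_critical(char *queue, size_t cap,
				   enum toy_policy policy)
{
	size_t len = strlen(queue);
	size_t pos;

	if (len + 2 > cap)
		return (size_t)-1;

	pos = toy_insert_pos(queue, len, policy);
	/* Shift the tail (including the NUL) one slot to the right. */
	memmove(queue + pos + 1, queue + pos, len - pos + 1);
	queue[pos] = 'C';
	return pos;
}
```

Starting from a queue of "WWRR", the front policy gives "CWWRR",
before-first-reader gives "WWCRR", and the back policy gives "WWRRC".
Only the last one leaves the critical reader waiting behind ordinary
readers.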

--
Michel "Walken" Lespinasse
A program is never fully debugged until the last user dies.