Re: [rfc] "fair" rw spinlocks

From: Thomas Gleixner
Date: Mon Nov 30 2009 - 16:13:18 EST


On Mon, 30 Nov 2009, Ingo Molnar wrote:
> * Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx> wrote:
>
> > On Mon, 30 Nov 2009, Christoph Hellwig wrote:
> > >
> > > How long will this use be around? I've seen some slow progress toward
> > > replacing most read side uses of the task list lock with RCU. While we
> > > still have lots of read side users now I wonder when they'll go away.
> >
> > tasklist_lock is pretty nasty. I threw out "replace it with RCU"
> > because it would be nice, but the data structures used are not just
> > simple linked lists that we have RCU helpers for traversing.
> >
> > There are various real exclusion rules about things like
> > 'tsk->exit_state' etc, which do not translate directly to RCU usage.
> > Of course, _maybe_ all the places that care already take the thing for
> > writing and would just automatically have exclusion anyway.
> >
> > So I'd love to see somebody try to do the conversion. To a first
> > approximation, you probably could do
> >
> > - turn tasklist_lock into a spinlock
> >
> > - sed 's/write_lock_irq(&tasklist_lock)/spin_lock(&tasklist_lock)/g'
> >   sed 's/write_unlock_irq(&tasklist_lock)/spin_unlock(&tasklist_lock)/g'
> >
> > - sed 's/read_lock(&tasklist_lock)/rcu_read_lock()/g'
> >   sed 's/read_unlock(&tasklist_lock)/rcu_read_unlock()/g'
> >
> > - make all the task lists use the RCU versions of the list routines
> >
> > - free the task structure using RCU
> >
> > and you'd be _pretty_ close to a working system.
>
> In -rt we've got that in essence, and it's indeed working fine (with a
> few caveats). A few RCU conversions of tasklist_lock usage in that area
> even trickled upstream, because the simple lock would hurt so much under
> -rt.

I think the conversion Linus proposed is pretty feasible. I went
through the read_lock sites and most of them protect function calls
that we already make under rcu_read_lock() elsewhere, such as
find_task* and the thread or pid iterators.

There are a few non-obvious ones in signal.c and posix-cpu-timers.c
(what a surprise) but nothing looks too scary.

If nobody beats me to it, I'm going to let sed loose on the kernel, lift
the task_struct rcu free code from -rt and figure out what explodes.
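
For reference, a minimal sketch of what such a sed pass could look like,
wrapped in a shell function that filters stdin to stdout. The function
name convert_tasklist_locking is made up for illustration, and the four
mappings are copied as-is from the first-approximation list quoted above.
One detail worth noting: on the replacement side of s///, sed treats a
bare '&' as "the whole match", so the literal '&' in '&tasklist_lock'
has to be written as '\&' there.

```shell
# Sketch only: reads C source on stdin, writes converted text to stdout.
# convert_tasklist_locking is a hypothetical name, not an existing tool.
# The mappings follow the quoted proposal, including
# write_lock_irq -> spin_lock, which drops the _irq suffix; a real
# conversion would have to review that at every call site.
convert_tasklist_locking() {
    # In the pattern, '&', '(' and ')' are literal (basic regex).
    # In the replacement, '&' means "the matched text", hence '\&'.
    sed \
        -e 's/write_lock_irq(&tasklist_lock)/spin_lock(\&tasklist_lock)/g' \
        -e 's/write_unlock_irq(&tasklist_lock)/spin_unlock(\&tasklist_lock)/g' \
        -e 's/read_lock(&tasklist_lock)/rcu_read_lock()/g' \
        -e 's/read_unlock(&tasklist_lock)/rcu_read_unlock()/g'
}
```

It could be tried on a single file with something like
"convert_tasklist_locking < some_file.c > some_file.c.new" (file names
are placeholders). This covers only the textual substitution; switching
the task lists to the RCU list routines and freeing the task structure
via RCU are separate steps that sed will not do.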

Thanks,

tglx