Re: [PATCH v4 1/1] dcache: Translating dentry into pathname without taking rename_lock

From: Al Viro
Date: Mon Sep 09 2013 - 14:06:14 EST


On Mon, Sep 09, 2013 at 10:45:38AM -0700, Linus Torvalds wrote:
> On Mon, Sep 9, 2013 at 10:29 AM, Al Viro <viro@xxxxxxxxxxxxxxxxxx> wrote:
> >
> > I'm not sure I like mixing rcu_read_lock() into that - d_path() and friends
> > can do that themselves just fine (it needs to be taken when seq is even),
> > and e.g. d_walk() doesn't need it at all. Other than that, I'm OK with
> > this variant.
>
> Hmm.. I think you need the RCU read lock even when you get the write_seqlock().
>
> Yes, getting the seqlock for write implies that you get a spinlock and
> in many normal circumstances that basically is equivalent to being
> rcu-locked, but afaik in some configurations that is *not* sufficient
> protection against an RCU grace period on another CPU. You need to do
> a real rcu_read_lock that increments that whole rcu_read_lock_nesting
> level, which a spinlock won't do.
>
> And while the rename sequence lock protects against _renames_, it does
> not protect against just plain dentries getting free'd under memory
> pressure.

It protects the chain of ->d_parent, so they'd better not get freed at
all...

> So I think the RCU-readlockness really needs to be independent of the
> sequence lock.

Actually, now that I've tried to convert d_walk() to those guys, I think
I like my proposal for the set of primitives better:

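static inline bool seqretry_and_lock(seqlock_t *lock, unsigned *seq)
{
	/* done if we already hold it as writer or the lockless pass was clean */
	if ((*seq & 1) || !read_seqretry(lock, *seq))
		return true;
	/* odd seq marks "holding the lock as writer" for the next pass */
	*seq |= 1;
	write_seqlock(lock);
	return false;
}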

static inline void seqretry_done(seqlock_t *lock, unsigned seq)
{
	if (seq & 1)
		write_sequnlock(lock);
}

with the prepend_path() and friends becoming

	rcu_read_lock();
	seq = read_seqbegin(&rename_lock);
again:
	....
	if (!seqretry_and_lock(&rename_lock, &seq))
		goto again;	/* now as writer */
	seqretry_done(&rename_lock, seq);
	rcu_read_unlock();
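
To make the two-pass behaviour concrete, here is a rough userspace sketch of that loop. It is only a single-threaded toy: struct toy_seqlock, the toy_* helpers and walk_passes() are made-up stand-ins for illustration, not the kernel's seqlock_t API, and the "concurrent rename" is simulated by bumping the counter during the first pass. It assumes the helper is called with a pointer to seq.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for seqlock_t (hypothetical, single-threaded only). */
struct toy_seqlock {
	unsigned sequence;	/* even: no writer, odd: writer active */
	bool locked;		/* stands in for the embedded spinlock */
};

static unsigned toy_read_seqbegin(const struct toy_seqlock *sl)
{
	assert(!(sl->sequence & 1));	/* the real one would spin here */
	return sl->sequence;
}

static bool toy_read_seqretry(const struct toy_seqlock *sl, unsigned start)
{
	return sl->sequence != start;
}

static void toy_write_seqlock(struct toy_seqlock *sl)
{
	assert(!sl->locked);
	sl->locked = true;
	sl->sequence++;
}

static void toy_write_sequnlock(struct toy_seqlock *sl)
{
	assert(sl->locked);
	sl->sequence++;
	sl->locked = false;
}

/* Same logic as the proposed primitives, on the toy type. */
static bool seqretry_and_lock(struct toy_seqlock *lock, unsigned *seq)
{
	if ((*seq & 1) || !toy_read_seqretry(lock, *seq))
		return true;
	*seq |= 1;
	toy_write_seqlock(lock);
	return false;
}

static void seqretry_done(struct toy_seqlock *lock, unsigned seq)
{
	if (seq & 1)
		toy_write_sequnlock(lock);
}

/*
 * Run the retry loop once and return the number of passes made.
 * With interfere set, a simulated rename (a full write section)
 * happens during the first, lockless pass.
 */
int walk_passes(struct toy_seqlock *lock, bool interfere)
{
	unsigned seq = toy_read_seqbegin(lock);
	int passes = 0;

again:
	passes++;
	if (interfere && passes == 1) {
		toy_write_seqlock(lock);
		toy_write_sequnlock(lock);
	}
	if (!seqretry_and_lock(lock, &seq))
		goto again;	/* now as writer */
	seqretry_done(lock, seq);
	return passes;
}
```

In this model an undisturbed walk finishes in one pass without ever touching the lock, while a disturbed one makes exactly one more pass with the lock held for write, and seqretry_done() drops it again.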

The thing is, d_walk() does essentially

	seq = read_seqbegin(&rename_lock);
again:
	....
	spin_lock(&d->d_lock);
	if (!seqretry_and_lock(&rename_lock, &seq)) {
		spin_unlock(&d->d_lock);
		goto again;	/* now as writer */
	}
	/* now we are holding ->d_lock on it and we know
	 * that d has not gone stale until that point.
	 */
	do stuff with d
	spin_unlock(&d->d_lock);
	seqretry_done(&rename_lock, seq);

OTOH, it's not impossible to handle with Waiman's primitives, just more
massage to do that...