Re: fs/dcache.c - BUG: soft lockup - CPU#5 stuck for 22s! [systemd-udevd:1667]

From: Al Viro
Date: Thu May 29 2014 - 14:52:12 EST


On Thu, May 29, 2014 at 05:53:51PM +0100, Al Viro wrote:
> On Thu, May 29, 2014 at 09:29:42AM -0700, Linus Torvalds wrote:
> > On Thu, May 29, 2014 at 9:23 AM, Al Viro <viro@xxxxxxxxxxxxxxxxxx> wrote:
> > >
> > > BTW, lock_parent() might be better off if in contended case it would not
> > > bother with rename_lock and did something like this:
> > > again:
> >
> > Ack. I think that's much better.
>
> Pushed to #for-linus (with dumb braino fixed - it's if (parent != dentry),
> not if (parent)). I'll wait with folding it back into the commit that
> introduces lock_parent() until we get testing results...

Grrr... Sadly, that's not good enough. Leaking rcu_read_lock() on
success is trivial to fix, but there's a more serious problem: suppose the
dentries involved get moved before we get to locking what we thought was
the parent. We end up taking ->d_lock on two dentries that might be nowhere
near each other in the tree, with obvious nasty implications. Would be
_very_ hard to reproduce ;-/

AFAICS, the following would be safe, but I'd really appreciate any extra
eyes on that sucker:

static inline struct dentry *lock_parent(struct dentry *dentry)
{
	struct dentry *parent = dentry->d_parent;
	if (IS_ROOT(dentry))
		return NULL;
	if (likely(spin_trylock(&parent->d_lock)))
		return parent;
	spin_unlock(&dentry->d_lock);
	rcu_read_lock();
again:
	parent = ACCESS_ONCE(dentry->d_parent);
	spin_lock(&parent->d_lock);
	/*
	 * We can't blindly lock dentry until we are sure
	 * that we won't violate the locking order.
	 * While parent->d_lock is not enough to stabilize
	 * dentry->d_parent, it *is* enough to stabilize
	 * dentry->d_parent == parent.
	 */
	if (unlikely(parent != dentry->d_parent)) {
		spin_unlock(&parent->d_lock);
		goto again;
	}
	rcu_read_unlock();
	if (parent != dentry)
		spin_lock(&dentry->d_lock);
	else
		parent = NULL;
	return parent;
}

That variant got force-pushed in place of the previous one, again at the
head of #for-linus. And I'm definitely not folding it in until it gets
more review and testing.
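
For anybody who wants to poke at the lock/recheck/retry logic outside the
kernel, a minimal userspace sketch of the same pattern follows. It is not
kernel code: struct node, node_init() and lock_parent_sketch() are
hypothetical names, pthread mutexes stand in for ->d_lock, a C11 atomic
load stands in for ACCESS_ONCE(), and it assumes nodes are never freed,
which is what lets it get away without rcu_read_lock().

```c
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical userspace stand-in for struct dentry. */
struct node {
	pthread_mutex_t lock;
	_Atomic(struct node *) parent;	/* a root points to itself */
};

static void node_init(struct node *n, struct node *parent)
{
	pthread_mutex_init(&n->lock, NULL);
	atomic_store(&n->parent, parent ? parent : n);
}

/*
 * Called with n->lock held.  Returns the parent with both locks held,
 * or NULL with only n->lock held if n is a root.
 * Assumption: nodes are never freed, so no RCU equivalent is needed.
 */
static struct node *lock_parent_sketch(struct node *n)
{
	struct node *parent = atomic_load(&n->parent);

	if (parent == n)
		return NULL;
	/* Fast path: parent lock is uncontended. */
	if (pthread_mutex_trylock(&parent->lock) == 0)
		return parent;
	/* Slow path: drop the child lock and retake both, parent first. */
	pthread_mutex_unlock(&n->lock);
again:
	parent = atomic_load(&n->parent);
	pthread_mutex_lock(&parent->lock);
	/*
	 * Holding parent->lock keeps "n->parent == parent" stable,
	 * so one recheck after taking the lock is enough.
	 */
	if (parent != atomic_load(&n->parent)) {
		pthread_mutex_unlock(&parent->lock);
		goto again;
	}
	if (parent != n)
		pthread_mutex_lock(&n->lock);
	else
		parent = NULL;	/* n became a root; its lock is already held */
	return parent;
}
```

The point of the recheck is that the child is only locked once the lock
held is known to belong to its current parent, so the parent-before-child
order is never violated even if the node was moved in the meantime.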