Re: [PATCH v7 1/4] spinlock: A new lockref structure for locklessupdate of refcount

From: Al Viro
Date: Mon Sep 09 2013 - 01:44:27 EST


On Sun, Sep 08, 2013 at 08:32:03PM -0700, Linus Torvalds wrote:
> On Sun, Sep 8, 2013 at 5:03 PM, Al Viro <viro@xxxxxxxxxxxxxxxxxx> wrote:
> >
> > There's one exception - basically, we decide to put duplicates of
> > reference(s) we hold into (a bunch of) structures being created. If
> > we decide that we'd failed and need to roll back, the structures
> > need to be taken out of whatever lists, etc. they'd been already
> > put on and references held in them - dropped. That removal gets done
> > under a spinlock. Sure, we can string those structures on some kind
> > of temp list, drop the spinlock and do dput() on everything in there,
> > but it's much more convenient to just free them as we are evicting
> > them, doing dput() as we go. Which is safe, since we are still have
> > the references used to create these buggers pinned down.
>
> Hmm. Which codepath does this? Because I got curious and added back
> the __might_sleep() unconditionally to dput() just to see (now that I
> think that the dput() bugs are gone), and at least under normal load
> it doesn't trigger. I even wrote a thing that just constantly creates
> and renames the target file concurrently with looking it up, so that
> I've stress-tested the RCU sequence number failure path (and verified
> with a profile that yes, it does trigger the "oops, need to retry"
> case). I didn't test anything odd at all (just my dentry stress tests
> and a regular graphical desktop), though.

Not sure if we have anything of that sort in the current tree; I remember
that kind of stuff in shared subtree code (basically, if we decided that
operation should fail halfway through the process, we could just evict
all created vfsmounts from the lists and mntput them, spinlocks or no
spinlocks - they all had been copied from existing ones protected by
the system-wide serialization on namespace modifications, so resulting
dput() calls wouldn't have d_count on anything reach zero). But I'm not
sure if it had been about vfsmount_lock or namespace_sem (we really don't
want any IO under the latter, since one stuck fs can make _any_ umount
impossible afterwards) and all remnants of that got killed off by
reorganizations of locking in there - all pending dput()/mntput() calls are
delayed until namespace_unlock() now.

Anyway, that wouldn't break even now - I'm not saying that it's a good
pattern to use, but it's legitimate. If you are holding a reference
already, something like
p = alloc_foo();
spin_lock(&lock);
...
p->dentry = dget(dentry);
...
if (error) {
...
free_foo(p);
...
spin_unlock(&lock);
}
with free_foo(p) including dput(p->dentry) is safe. The whole thing was just
a comment on your "who does that? Nobody" - that kind of stuff has uses and
it did happen at least once. And yes, it is safe *and* anybody writing
anything of that sort needs to look hard if it can be reorganized into
something less subtle...

> And I have too much memory to sanely stress any out-of-memory situations.

KVM image with -m <size> or mem=<size> in kernel command line ;-)
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/