Re: dcache shrink list corruption?

From: Al Viro
Date: Wed Apr 30 2014 - 16:38:29 EST


On Wed, Apr 30, 2014 at 01:23:26PM -0700, Linus Torvalds wrote:
> On Wed, Apr 30, 2014 at 12:59 PM, Al Viro <viro@xxxxxxxxxxxxxxxxxx> wrote:
> >
> > Another thing: I don't like what's going on with freeing vs. ->d_lock there.
> > Had that been a mutex, we'd definitely get a repeat of "vfs: fix subtle
> > use-after-free of pipe_inode_info". The question is, can spin_unlock(p)
> > dereference p after another CPU gets through spin_lock(p)? Linus?
>
> spin_unlock() *should* be safe wrt that issue.
>
> But I have to say, I think paravirtualized spinlocks may break that.
> They do all kinds of "kick waiters" after releasing the lock.
>
> Doesn't the RCU protection solve that, though? Nobody should be
> releasing the dentry under us, afaik..

We do not (and cannot) call dentry_kill() with rcu_read_lock held - it can
trigger any amount of IO, for one thing. We can take it around the
couple of places where we do that spin_unlock(&dentry->d_lock) (along with
setting DCACHE_RCUACCESS) - that's what I'd been referring to. Then this
sucker (tests still running, so far everything seems to survive) becomes
the following (again, on top of 1/6..4/6). BTW, is there any convenient
way to tell git commit --amend to update the commit date? Something
like --date=now would be nice, but it isn't accepted...

commit 797ff22681dc969b478ed837787d24dfd2dd2132
Author: Al Viro <viro@xxxxxxxxxxxxxxxxxx>
Date: Tue Apr 29 23:52:05 2014 -0400

dentry_kill(): don't try to remove from shrink list

If the victim is on the shrink list, don't remove it from there.
If shrink_dentry_list() manages to remove it from the list before
we are done - fine, we'll just free it as usual. If not - mark
it with new flag (DCACHE_MAY_FREE) and leave it there.

Eventually, shrink_dentry_list() will get to it, remove the sucker
from shrink list and call dentry_kill(dentry, 0). Which is where
we'll deal with freeing.

Since now dentry_kill(dentry, 0) may happen after or during
dentry_kill(dentry, 1), we need to recognize that (by seeing
DCACHE_DENTRY_KILLED already set), unlock everything
and either free the sucker (in case DCACHE_MAY_FREE has been
set) or leave it for ongoing dentry_kill(dentry, 1) to deal with.

Signed-off-by: Al Viro <viro@xxxxxxxxxxxxxxxxxx>

diff --git a/fs/dcache.c b/fs/dcache.c
index e482775..fa40d26 100644
--- a/fs/dcache.c
+++ b/fs/dcache.c
@@ -489,6 +489,20 @@ relock:
 		goto relock;
 	}

+	if (unlikely(dentry->d_flags & DCACHE_DENTRY_KILLED)) {
+		if (parent)
+			spin_unlock(&parent->d_lock);
+		if (dentry->d_flags & DCACHE_MAY_FREE) {
+			spin_unlock(&dentry->d_lock);
+			dentry_free(dentry);
+		} else {
+			dentry->d_flags |= DCACHE_RCUACCESS;
+			rcu_read_lock();
+			spin_unlock(&dentry->d_lock);
+			rcu_read_unlock();
+		}
+		return parent;
+	}
 	/*
 	 * The dentry is now unrecoverably dead to the world.
 	 */
@@ -504,8 +518,6 @@ relock:
 	if (dentry->d_flags & DCACHE_LRU_LIST) {
 		if (!(dentry->d_flags & DCACHE_SHRINK_LIST))
 			d_lru_del(dentry);
-		else
-			d_shrink_del(dentry);
 	}
 	/* if it was on the hash then remove it */
 	__d_drop(dentry);
@@ -527,7 +539,16 @@ relock:
 	if (dentry->d_op && dentry->d_op->d_release)
 		dentry->d_op->d_release(dentry);

-	dentry_free(dentry);
+	spin_lock(&dentry->d_lock);
+	if (dentry->d_flags & DCACHE_SHRINK_LIST) {
+		dentry->d_flags |= DCACHE_MAY_FREE | DCACHE_RCUACCESS;
+		rcu_read_lock();
+		spin_unlock(&dentry->d_lock);
+		rcu_read_unlock();
+	} else {
+		spin_unlock(&dentry->d_lock);
+		dentry_free(dentry);
+	}
 	return parent;
 }

@@ -829,7 +850,7 @@ static void shrink_dentry_list(struct list_head *list)
 		 * We found an inuse dentry which was not removed from
 		 * the LRU because of laziness during lookup. Do not free it.
 		 */
-		if (dentry->d_lockref.count) {
+		if (dentry->d_lockref.count > 0) {
 			spin_unlock(&dentry->d_lock);
 			continue;
 		}
diff --git a/include/linux/dcache.h b/include/linux/dcache.h
index 3b9bfdb..3c7ec32 100644
--- a/include/linux/dcache.h
+++ b/include/linux/dcache.h
@@ -221,6 +221,8 @@ struct dentry_operations {
#define DCACHE_SYMLINK_TYPE 0x00300000 /* Symlink */
#define DCACHE_FILE_TYPE 0x00400000 /* Other file type */

+#define DCACHE_MAY_FREE 0x00800000
+
extern seqlock_t rename_lock;

static inline int dname_external(const struct dentry *dentry)
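
For readers following the commit message, the handoff between dentry_kill() and shrink_dentry_list() can be sketched as a small user-space model. This is only an illustration under loud assumptions: the toy_* names, the flag values and the struct are hypothetical and are not kernel code, the model is single-threaded, and it ignores ->d_lock, RCU and the parent dentry entirely. It only shows that, whichever side gets there first, the object is freed exactly once.

```c
#include <assert.h>

/* Hypothetical flag bits for this toy model; not the kernel's values. */
enum {
	TOY_SHRINK_LIST = 1 << 0,	/* stands in for DCACHE_SHRINK_LIST */
	TOY_KILLED      = 1 << 1,	/* stands in for DCACHE_DENTRY_KILLED */
	TOY_MAY_FREE    = 1 << 2,	/* stands in for DCACHE_MAY_FREE */
};

struct toy_dentry {
	unsigned int flags;
	int frees;			/* how many times it has been "freed" */
};

/* Stand-in for dentry_free(); a second call trips the assertion. */
static void toy_free(struct toy_dentry *d)
{
	assert(d->frees == 0);
	d->frees++;
}

/* First half of a kill: the object is now dead to the world. */
static void toy_kill_begin(struct toy_dentry *d)
{
	d->flags |= TOY_KILLED;
}

/* Tail of a kill: free it, unless it is still on the shrink list. */
static void toy_kill_finish(struct toy_dentry *d)
{
	if (d->flags & TOY_SHRINK_LIST)
		d->flags |= TOY_MAY_FREE;	/* leave it for the shrinker */
	else
		toy_free(d);
}

/* The shrinker takes the object off its list, then tries to kill it. */
static void toy_shrinker_visit(struct toy_dentry *d)
{
	d->flags &= ~TOY_SHRINK_LIST;
	if (d->flags & TOY_KILLED) {
		if (d->flags & TOY_MAY_FREE)
			toy_free(d);		/* the kill finished earlier */
		return;				/* else the ongoing kill frees it */
	}
	toy_kill_begin(d);
	toy_kill_finish(d);
}

/*
 * order 0: the kill completes, then the shrinker arrives
 * order 1: the shrinker arrives in the middle of the kill
 * other:   the shrinker gets there first and does the kill itself
 * Returns the number of times the object was freed.
 */
static int toy_run(int order)
{
	struct toy_dentry d = { .flags = TOY_SHRINK_LIST, .frees = 0 };

	switch (order) {
	case 0:
		toy_kill_begin(&d);
		toy_kill_finish(&d);
		toy_shrinker_visit(&d);
		break;
	case 1:
		toy_kill_begin(&d);
		toy_shrinker_visit(&d);
		toy_kill_finish(&d);
		break;
	default:
		toy_shrinker_visit(&d);
		break;
	}
	return d.frees;
}
```

Calling toy_run() with 0, 1 and 2 from a main() of your own should return 1 each time in this model. The real code additionally relies on ->d_lock to make each flag test and update atomic, which the sketch does not attempt to show.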