Re: [PATCH 10/41] whiteout: Add vfs_whiteout() and whiteout inode operation

From: Erez Zadok
Date: Sun Nov 29 2009 - 22:05:00 EST


In message <1256152779-10054-11-git-send-email-vaurora@xxxxxxxxxx>, Valerie Aurora writes:
> From: Jan Blunck <jblunck@xxxxxxx>
>
> Simply white-out a given directory entry. This functionality is usually used
> in the sense of unlink. Therefore the given dentry can still be in-use and
> contains an in-use inode. The filesystems inode operation has to do what
> unlink or rmdir would in that case. Since the dentry still might be in-use
> we have to provide a fresh unhashed dentry that is used as the whiteout
> dentry instead. The given dentry is dropped and the whiteout dentry is
> rehashed instead.
>
> Signed-off-by: Jan Blunck <jblunck@xxxxxxx>
> Signed-off-by: David Woodhouse <dwmw2@xxxxxxxxxxxxx>
> Signed-off-by: Valerie Aurora <vaurora@xxxxxxxxxx>
> ---
> fs/dcache.c | 4 +-
> fs/namei.c | 104 ++++++++++++++++++++++++++++++++++++++++++++++++
> include/linux/dcache.h | 6 +++
> include/linux/fs.h | 3 +
> 4 files changed, 116 insertions(+), 1 deletions(-)
>
> diff --git a/fs/dcache.c b/fs/dcache.c
> index 3415e9e..0fcae4b 100644
> --- a/fs/dcache.c
> +++ b/fs/dcache.c
> @@ -1076,8 +1076,10 @@ struct dentry *d_alloc_name(struct dentry *parent, const char *name)
> /* the caller must hold dcache_lock */
> static void __d_instantiate(struct dentry *dentry, struct inode *inode)
> {
> - if (inode)
> + if (inode) {
> + dentry->d_flags &= ~DCACHE_WHITEOUT;
> list_add(&dentry->d_alias, &inode->i_dentry);
> + }
> dentry->d_inode = inode;
> fsnotify_d_instantiate(dentry, inode);
> }
> diff --git a/fs/namei.c b/fs/namei.c
> index 46cf1cb..d2fc8c9 100644
> --- a/fs/namei.c
> +++ b/fs/namei.c
> @@ -2169,6 +2169,110 @@ SYSCALL_DEFINE2(mkdir, const char __user *, pathname, int, mode)
> return sys_mkdirat(AT_FDCWD, pathname, mode);
> }
>
> +
> +/* Checks on the victim for whiteout */
> +static inline int may_whiteout(struct inode *dir, struct dentry *victim,
> + int isdir)

Why not make 'isdir' a boolean?

I'd prefer to see more documentation above this function: explain what each
arg does, return value, etc.

> +{
> + int err;
> +
> + /* from may_create() */
> + if (IS_DEADDIR(dir))
> + return -ENOENT;
> + err = inode_permission(dir, MAY_WRITE | MAY_EXEC);
> + if (err)
> + return err;
> +
> + /* from may_delete() */
> + if (IS_APPEND(dir))
> + return -EPERM;
> + if (!victim->d_inode)
> + return 0;
> + if (check_sticky(dir, victim->d_inode) ||
> + IS_APPEND(victim->d_inode) ||
> + IS_IMMUTABLE(victim->d_inode))
> + return -EPERM;
> + if (isdir) {
> + if (!S_ISDIR(victim->d_inode->i_mode))
> + return -ENOTDIR;
> + if (IS_ROOT(victim))
> + return -EBUSY;
> + } else if (S_ISDIR(victim->d_inode->i_mode))
> + return -EISDIR;
> + if (victim->d_flags & DCACHE_NFSFS_RENAMED)
> + return -EBUSY;
> + return 0;
> +}
> +
> +/**
> + * vfs_whiteout: creates a white-out for the given directory entry
> + * @dir: parent inode
> + * @dentry: directory entry to white-out

Nit: is it 'white-out' or 'whiteout'? Whatever you choose is fine, but
please use consistent hypenation/spelling everywhere (code, comments, and
documentation).

> + *
> + * Simply white-out a given directory entry. This functionality is usually used
> + * in the sense of unlink. Therefore the given dentry can still be in-use and
> + * contains an in-use inode. The filesystem has to do what unlink or rmdir

Nit: other than the line of comment just above, the other two instances of
"in-use" in this comment (and the patch header) should be changed to "in
use" (no hyphen).

> + * would in that case. Since the dentry still might be in-use we have to
> + * provide a fresh unhashed dentry that whiteout can fill the new inode into.
> + * In that case the given dentry is dropped and the fresh dentry containing the
> + * whiteout is rehashed instead. If the given dentry is unused, the whiteout
> + * inode is instantiated into it instead.
> + *
> + * After this returns with success, don't make any assumptions about the inode.

What kinds of assumptions one should not make? Perhaps it'd be better to
document what you can/should assume, instead of what you shouldn't (or
both?)

> + * Just dput() it dentry.

The last line is awkward: do you mean "its dentry"?

> + */
> +int vfs_whiteout(struct inode *dir, struct dentry *dentry, int isdir)
> +{
> + int err;
> + struct inode *old_inode = dentry->d_inode;
> + struct dentry *parent, *whiteout;
> +
> + err = may_whiteout(dir, dentry, isdir);
> + if (err)
> + return err;
> +
> + BUG_ON(dentry->d_parent->d_inode != dir);
> +
> + if (!dir->i_op || !dir->i_op->whiteout)
> + return -EOPNOTSUPP;
> +
> + if (old_inode) {
> + vfs_dq_init(dir);
> +
> + mutex_lock(&old_inode->i_mutex);
> + if (isdir)
> + dentry_unhash(dentry);
> + if (d_mountpoint(dentry))
> + err = -EBUSY;
> + else {
> + if (isdir)
> + err = security_inode_rmdir(dir, dentry);
> + else
> + err = security_inode_unlink(dir, dentry);
> + }
> + }
> +
> + parent = dget_parent(dentry);
> + whiteout = d_alloc_name(parent, dentry->d_name.name);
> +
> + if (!err)
> + err = dir->i_op->whiteout(dir, dentry, whiteout);
> +
> + if (old_inode) {
> + mutex_unlock(&old_inode->i_mutex);
> + if (!err) {
> + fsnotify_link_count(old_inode);
> + d_delete(dentry);
> + }
> + if (isdir)
> + dput(dentry);
> + }
> +
> + dput(whiteout);
> + dput(parent);
> + return err;
> +}
> +
> /*
> * We try to drop the dentry early: we should have
> * a usage count of 2 if we're the only user of this
> diff --git a/include/linux/dcache.h b/include/linux/dcache.h
> index 30b93b2..7648b49 100644
> --- a/include/linux/dcache.h
> +++ b/include/linux/dcache.h
> @@ -183,6 +183,7 @@ d_iput: no no no yes
> #define DCACHE_INOTIFY_PARENT_WATCHED 0x0020 /* Parent inode is watched by inotify */
>
> #define DCACHE_COOKIE 0x0040 /* For use by dcookie subsystem */
> +#define DCACHE_WHITEOUT 0x0080 /* This negative dentry is a whiteout */
>
> #define DCACHE_FSNOTIFY_PARENT_WATCHED 0x0080 /* Parent inode is watched by some fsnotify listener */
>
> @@ -358,6 +359,11 @@ static inline int d_unlinked(struct dentry *dentry)
> return d_unhashed(dentry) && !IS_ROOT(dentry);
> }
>
> +static inline int d_is_whiteout(struct dentry *dentry)
> +{
> + return (dentry->d_flags & DCACHE_WHITEOUT);
> +}
> +
> static inline struct dentry *dget_parent(struct dentry *dentry)
> {
> struct dentry *ret;
> diff --git a/include/linux/fs.h b/include/linux/fs.h
> index 5fb7343..04a9870 100644
> --- a/include/linux/fs.h
> +++ b/include/linux/fs.h
> @@ -205,6 +205,7 @@ struct inodes_stat_t {
> #define MS_KERNMOUNT (1<<22) /* this is a kern_mount call */
> #define MS_I_VERSION (1<<23) /* Update inode I_version field */
> #define MS_STRICTATIME (1<<24) /* Always perform atime updates */
> +#define MS_WHITEOUT (1<<26) /* fs does support white-out filetype */
> #define MS_ACTIVE (1<<30)
> #define MS_NOUSER (1<<31)
>
> @@ -1422,6 +1423,7 @@ extern int vfs_link(struct dentry *, struct inode *, struct dentry *);
> extern int vfs_rmdir(struct inode *, struct dentry *);
> extern int vfs_unlink(struct inode *, struct dentry *);
> extern int vfs_rename(struct inode *, struct dentry *, struct inode *, struct dentry *);
> +extern int vfs_whiteout(struct inode *, struct dentry *, int);
>
> /*
> * VFS dentry helper functions.
> @@ -1526,6 +1528,7 @@ struct inode_operations {
> int (*mkdir) (struct inode *,struct dentry *,int);
> int (*rmdir) (struct inode *,struct dentry *);
> int (*mknod) (struct inode *,struct dentry *,int,dev_t);
> + int (*whiteout) (struct inode *, struct dentry *, struct dentry *);
> int (*rename) (struct inode *, struct dentry *,
> struct inode *, struct dentry *);
> int (*readlink) (struct dentry *, char __user *,int);

Nit: I'm curious why you decided to add the function proto for (*whiteout)
where you did? inode_operations isn't sorted alphabetically, so why not
just append it to the end of the op list?

> 1.6.3.3
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
> the body of a message to majordomo@xxxxxxxxxxxxxxx
> More majordomo info at http://vger.kernel.org/majordomo-info.html

Erez.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/