[PATCH 4.19 23/73] Btrfs: fix fsync not persisting changed attributes of a directory

From: Greg Kroah-Hartman
Date: Fri Jun 07 2019 - 12:01:17 EST


From: Filipe Manana <fdmanana@xxxxxxxx>

commit 60d9f50308e5df19bc18c2fefab0eba4a843900a upstream.

While logging an inode we follow its ancestors and for each one we mark
it as logged in the current transaction, even if we have not logged it.
As a consequence if we change an attribute of an ancestor, such as the
UID or GID for example, and then explicitly fsync it, we end up not
logging the inode at all despite returning success to user space, which
results in the attribute being lost if a power failure happens after
the fsync.

Sample reproducer:

$ mkfs.btrfs -f /dev/sdb
$ mount /dev/sdb /mnt

$ mkdir /mnt/dir
$ chown 6007:6007 /mnt/dir

$ sync

$ chown 9003:9003 /mnt/dir
$ touch /mnt/dir/file
$ xfs_io -c fsync /mnt/dir/file

# fsync our directory after fsync'ing the new file, should persist the
# new values for the uid and gid.
$ xfs_io -c fsync /mnt/dir

<power failure>

$ mount /dev/sdb /mnt
$ stat -c %u:%g /mnt/dir
6007:6007

--> should be 9003:9003, the uid and gid were not persisted, despite
the explicit fsync on the directory prior to the power failure

Fix this by not updating the logged_trans field of ancestor inodes when
logging an inode, since we have not logged them. Let only future calls to
btrfs_log_inode() to mark inodes as logged.

This could be triggered by my recent fsync fuzz tester for fstests, for
which an fstests patch exists titled "fstests: generic, fsync fuzz tester
with fsstress".

Fixes: 12fcfd22fe5b ("Btrfs: tree logging unlink/rename fixes")
CC: stable@xxxxxxxxxxxxxxx # 4.4+
Signed-off-by: Filipe Manana <fdmanana@xxxxxxxx>
Signed-off-by: David Sterba <dsterba@xxxxxxxx>
Signed-off-by: Greg Kroah-Hartman <gregkh@xxxxxxxxxxxxxxxxxxx>

---
fs/btrfs/tree-log.c | 12 ------------
1 file changed, 12 deletions(-)

--- a/fs/btrfs/tree-log.c
+++ b/fs/btrfs/tree-log.c
@@ -5312,7 +5312,6 @@ static noinline int check_parent_dirs_fo
{
int ret = 0;
struct dentry *old_parent = NULL;
- struct btrfs_inode *orig_inode = inode;

/*
* for regular files, if its inode is already on disk, we don't
@@ -5332,16 +5331,6 @@ static noinline int check_parent_dirs_fo
}

while (1) {
- /*
- * If we are logging a directory then we start with our inode,
- * not our parent's inode, so we need to skip setting the
- * logged_trans so that further down in the log code we don't
- * think this inode has already been logged.
- */
- if (inode != orig_inode)
- inode->logged_trans = trans->transid;
- smp_mb();
-
if (btrfs_must_commit_transaction(trans, inode)) {
ret = 1;
break;
@@ -6070,7 +6059,6 @@ void btrfs_record_unlink_dir(struct btrf
* if this directory was already logged any new
* names for this file/dir will get recorded
*/
- smp_mb();
if (dir->logged_trans == trans->transid)
return;