Re: [PATCH V2 2/2] fs/super.c: don't fool lockdep in freeze_super() and thaw_super() paths

From: Jan Kara
Date: Tue Sep 27 2016 - 02:51:54 EST


On Mon 26-09-16 18:55:25, Oleg Nesterov wrote:
> On 09/26, Jan Kara wrote:
> >
> > On Mon 26-09-16 18:08:06, Oleg Nesterov wrote:
> > > +/*
> > > + * Tell lockdep we are holding these locks before we call ->unfreeze_fs(sb).
> > > + */
> > > +static void sb_freeze_acquire(struct super_block *sb)
> >
> > Can we call this lockdep_sb_freeze_acquire() or something like that so that
> > it is clear this is only about lockdep annotations? Similarly with
> > sb_freeze_unlock()...
>
> OK, thanks, done. See V2 below.
>
> > and I hope you really tested
> > there are no more lockdep false positives ;).
>
> Heh ;) if only I knew how to test this... I ran the following script
> under qemu
>
> mkfs.xfs -f /dev/vda
> mkfs.xfs -f /dev/vdb
>
> mkdir -p TEST SCRATCH
>
> TEST_DEV=/dev/vda TEST_DIR=TEST SCRATCH_DEV=/dev/vdb SCRATCH_MNT=SCRATCH \
> ./check `grep -il freeze tests/*/???`

You can run either:

./check -g freeze

to check just the freezing tests or

./check

to run all sensible tests which is what I'd do (but it will take couple of
hours to pass). If that passes, chances are good there are no easy false
positives.

> And yes, I'm afraid this change can uncover some false positives later.
> But at the same time potentially it can find the real problems.

Well, sure it's not an end of world if there is some false positive - we
can just revert the change - but lockdep false positives are always
annoying because they take time to analyze and until they are fixed, you
are unable to see other probles found by lockdep...

> It would be nice to remove another hack in __sb_start_write under
> ifdef(CONFIG_LOCKDEP), but iirc XFS actually takes the same rw_sem twice
> for reading, so we can't do this.

Yes, and I don't really consider this a hack. Filesystem freezing was never
meant to have all the properties of classical rwsems in kernel - I just
happened to notice that the semantics is close to rwsems and so decided to
model it as such in lockdep to get some lockdep verification for it. In
particular "read lock recursion" has always been OK with freeze protection
by design and I don't see strong motivation for changing that.

> -------------------------------------------------------------------------------
> From 70e774533ab6319f9b90a490b036150ad9604af7 Mon Sep 17 00:00:00 2001
> From: Oleg Nesterov <oleg@xxxxxxxxxx>
> Date: Mon, 26 Sep 2016 17:23:24 +0200
> Subject: [PATCH V2 2/2] fs/super.c: don't fool lockdep in freeze_super() and thaw_super() paths
>
> sb_wait_write()->percpu_rwsem_release() fools lockdep to avoid the
> false-positives. Now that xfs was fixed by Dave's commit dbad7c993053
> ("xfs: stop holding ILOCK over filldir callbacks") we can remove it and
> change freeze_super() and thaw_super() to run with s_writers.rw_sem locks
> held; we add two trivial helpers for that, lockdep_sb_freeze_release()
> and lockdep_sb_freeze_acquire().
>
> xfstests-dev/check `grep -il freeze tests/*/???` does not trigger any
> warning from lockdep.
>
> Signed-off-by: Oleg Nesterov <oleg@xxxxxxxxxx>

Thanks. The patch looks good. You can add:

Reviewed-by: Jan Kara <jack@xxxxxxx>

Honza
> ---
> fs/super.c | 37 +++++++++++++++++++++++++------------
> 1 file changed, 25 insertions(+), 12 deletions(-)
>
> diff --git a/fs/super.c b/fs/super.c
> index 2549896c..0447afe 100644
> --- a/fs/super.c
> +++ b/fs/super.c
> @@ -1214,25 +1214,34 @@ EXPORT_SYMBOL(__sb_start_write);
> static void sb_wait_write(struct super_block *sb, int level)
> {
> percpu_down_write(sb->s_writers.rw_sem + level-1);
> - /*
> - * We are going to return to userspace and forget about this lock, the
> - * ownership goes to the caller of thaw_super() which does unlock.
> - *
> - * FIXME: we should do this before return from freeze_super() after we
> - * called sync_filesystem(sb) and s_op->freeze_fs(sb), and thaw_super()
> - * should re-acquire these locks before s_op->unfreeze_fs(sb). However
> - * this leads to lockdep false-positives, so currently we do the early
> - * release right after acquire.
> - */
> - percpu_rwsem_release(sb->s_writers.rw_sem + level-1, 0, _THIS_IP_);
> }
>
> -static void sb_freeze_unlock(struct super_block *sb)
> +/*
> + * We are going to return to userspace and forget about these locks, the
> + * ownership goes to the caller of thaw_super() which does unlock().
> + */
> +static void lockdep_sb_freeze_release(struct super_block *sb)
> +{
> + int level;
> +
> + for (level = SB_FREEZE_LEVELS - 1; level >= 0; level--)
> + percpu_rwsem_release(sb->s_writers.rw_sem + level, 0, _THIS_IP_);
> +}
> +
> +/*
> + * Tell lockdep we are holding these locks before we call ->unfreeze_fs(sb).
> + */
> +static void lockdep_sb_freeze_acquire(struct super_block *sb)
> {
> int level;
>
> for (level = 0; level < SB_FREEZE_LEVELS; ++level)
> percpu_rwsem_acquire(sb->s_writers.rw_sem + level, 0, _THIS_IP_);
> +}
> +
> +static void sb_freeze_unlock(struct super_block *sb)
> +{
> + int level;
>
> for (level = SB_FREEZE_LEVELS - 1; level >= 0; level--)
> percpu_up_write(sb->s_writers.rw_sem + level);
> @@ -1328,6 +1337,7 @@ int freeze_super(struct super_block *sb)
> * when frozen is set to SB_FREEZE_COMPLETE, and for thaw_super().
> */
> sb->s_writers.frozen = SB_FREEZE_COMPLETE;
> + lockdep_sb_freeze_release(sb);
> up_write(&sb->s_umount);
> return 0;
> }
> @@ -1354,11 +1364,14 @@ int thaw_super(struct super_block *sb)
> goto out;
> }
>
> + lockdep_sb_freeze_acquire(sb);
> +
> if (sb->s_op->unfreeze_fs) {
> error = sb->s_op->unfreeze_fs(sb);
> if (error) {
> printk(KERN_ERR
> "VFS:Filesystem thaw failed\n");
> + lockdep_sb_freeze_release(sb);
> up_write(&sb->s_umount);
> return error;
> }
> --
> 2.5.0
>
>
--
Jan Kara <jack@xxxxxxxx>
SUSE Labs, CR