Re: [syzbot] [ext4?] KASAN: slab-use-after-free Read in fsnotify

From: Khazhy Kumykov
Date: Thu Apr 11 2024 - 16:28:00 EST


On Thu, Apr 11, 2024 at 12:25 PM Gabriel Krisman Bertazi
<krisman@xxxxxxx> wrote:
>
> Amir Goldstein <amir73il@xxxxxxxxx> writes:
>
> > On Thu, Apr 11, 2024 at 3:13 PM Jan Kara <jack@xxxxxxx> wrote:
> >>
> >> On Thu 11-04-24 01:11:20, syzbot wrote:
> >> > Hello,
> >> >
> >> > syzbot found the following issue on:
> >> >
> >> > HEAD commit: 6ebf211bb11d Add linux-next specific files for 20240410
> >> > git tree: linux-next
> >> > console+strace: https://syzkaller.appspot.com/x/log.txt?x=12be955d180000
> >> > kernel config: https://syzkaller.appspot.com/x/.config?x=16ca158ef7e08662
> >> > dashboard link: https://syzkaller.appspot.com/bug?extid=5e3f9b2a67b45f16d4e6
> >> > compiler: Debian clang version 15.0.6, GNU ld (GNU Binutils for Debian) 2.40
> >> > syz repro: https://syzkaller.appspot.com/x/repro.syz?x=13c91175180000
> >> > C reproducer: https://syzkaller.appspot.com/x/repro.c?x=1621af9d180000
> >> >
> >> > Downloadable assets:
> >> > disk image: https://storage.googleapis.com/syzbot-assets/b050f81f73ed/disk-6ebf211b.raw.xz
> >> > vmlinux: https://storage.googleapis.com/syzbot-assets/412c9b9a536e/vmlinux-6ebf211b.xz
> >> > kernel image: https://storage.googleapis.com/syzbot-assets/016527216c47/bzImage-6ebf211b.xz
> >> > mounted in repro: https://storage.googleapis.com/syzbot-assets/75ad050c9945/mount_0.gz
> >> >
> >> > IMPORTANT: if you fix the issue, please add the following tag to the commit:
> >> > Reported-by: syzbot+5e3f9b2a67b45f16d4e6@xxxxxxxxxxxxxxxxxxxxxxxxx
> >> >
> >> > Quota error (device loop0): do_check_range: Getting block 0 out of range 1-5
> >> > EXT4-fs error (device loop0): ext4_release_dquot:6905: comm kworker/u8:4: Failed to release dquot type 1
> >> > ==================================================================
> >> > BUG: KASAN: slab-use-after-free in fsnotify+0x2a4/0x1f70 fs/notify/fsnotify.c:539
> >> > Read of size 8 at addr ffff88802f1dce80 by task kworker/u8:4/62
> >> >
> >> > CPU: 0 PID: 62 Comm: kworker/u8:4 Not tainted 6.9.0-rc3-next-20240410-syzkaller #0
> >> > Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 03/27/2024
> >> > Workqueue: events_unbound quota_release_workfn
> >> > Call Trace:
> >> > <TASK>
> >> > __dump_stack lib/dump_stack.c:88 [inline]
> >> > dump_stack_lvl+0x241/0x360 lib/dump_stack.c:114
> >> > print_address_description mm/kasan/report.c:377 [inline]
> >> > print_report+0x169/0x550 mm/kasan/report.c:488
> >> > kasan_report+0x143/0x180 mm/kasan/report.c:601
> >> > fsnotify+0x2a4/0x1f70 fs/notify/fsnotify.c:539
> >> > fsnotify_sb_error include/linux/fsnotify.h:456 [inline]
> >> > __ext4_error+0x255/0x3b0 fs/ext4/super.c:843
> >> > ext4_release_dquot+0x326/0x450 fs/ext4/super.c:6903
> >> > quota_release_workfn+0x39f/0x650 fs/quota/dquot.c:840
> >> > process_one_work kernel/workqueue.c:3218 [inline]
> >> > process_scheduled_works+0xa2c/0x1830 kernel/workqueue.c:3299
> >> > worker_thread+0x86d/0xd70 kernel/workqueue.c:3380
> >> > kthread+0x2f0/0x390 kernel/kthread.c:389
> >> > ret_from_fork+0x4b/0x80 arch/x86/kernel/process.c:147
> >> > ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:244
> >> > </TASK>
> >>
> >> Amir, I believe this happens on umount when the filesystem calls
> >> fsnotify_sb_error() after calling fsnotify_sb_delete().

Hmm, so we're releasing dquots after already shutting down the
filesystem? Is that expected? This "Failed to release dquot type"
error message only appears if we have an open handle from
ext4_journal_start (although this filesystem was mounted without a
journal, so we hit ext4_get_nojournal()...)

> >> In theory these two
> >> calls can even run in parallel and fsnotify() can be holding
> >> fsnotify_sb_info pointer while fsnotify_sb_delete() is freeing it so we
> >> need to figure out some proper synchronization for that...
> >
> > Is it really needed to handle any for non SB_ACTIVE sb?
>
> I think it should be fine to exclude volumes being torn down. Cc'ing
> Khazhy, who sponsored this work at the time and owned the use-case.

In terms of a real-world use case... I'm not sure there is one. You'll
notice the errors when you fsck/mount again later on, and any users
who care about notifications should be gone by the time we're
unmounting. But it seems weird to me that we can get write errors
while shutting everything down.

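To make the "skip non-SB_ACTIVE superblocks" idea concrete, here is a rough user-space sketch, not kernel code and not a proposed patch. Everything prefixed model_ or MODEL_ is a hypothetical stand-in invented for illustration: model_sb_delete() plays the role of fsnotify_sb_delete() at umount, and model_sb_error() plays the role of fsnotify_sb_error(). It only models the ordering case (error reported after teardown); it is single-threaded and does nothing about the parallel case Jan mentions, which would still need real synchronization.

```c
/* User-space model only. All model_* / MODEL_* names are hypothetical. */
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Stand-in for the kernel's SB_ACTIVE superblock flag. */
#define MODEL_SB_ACTIVE (1u << 30)

/* Stand-in for the per-superblock notification state. */
struct model_sb_info {
    int errors_reported;
};

/* Stand-in for a superblock: a flags word plus the notification state. */
struct model_sb {
    unsigned int flags;
    struct model_sb_info *notify_info;
};

/*
 * Teardown path: mark the superblock inactive, then drop the
 * notification state. The kernel would free the state here; in this
 * model the caller owns the storage, so we only clear the pointer.
 */
static void model_sb_delete(struct model_sb *sb)
{
    assert(sb != NULL);
    sb->flags &= ~MODEL_SB_ACTIVE;
    sb->notify_info = NULL;
}

/*
 * Error-report path: refuse to touch the notification state once the
 * superblock is no longer active. Returns true if the event was
 * delivered, false if it was dropped.
 */
static bool model_sb_error(struct model_sb *sb)
{
    assert(sb != NULL);
    if (!(sb->flags & MODEL_SB_ACTIVE))
        return false;   /* being torn down: drop the event */
    if (sb->notify_info == NULL)
        return false;   /* nothing to deliver to */
    sb->notify_info->errors_reported++;
    return true;
}
```

In this model, an error reported before model_sb_delete() is counted, and one reported afterwards is dropped without dereferencing the cleared pointer, which is the behavior the SB_ACTIVE check is meant to give for the quota_release_workfn path in the trace above.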
>
> --
> Gabriel Krisman Bertazi