Re: KCSAN: data-race in fat16_ent_put / fat_search_long

From: Marco Elver
Date: Wed Nov 06 2019 - 06:27:07 EST


On Wed, 6 Nov 2019 at 09:31, OGAWA Hirofumi <hirofumi@xxxxxxxxxxxxxxxxxx> wrote:
>
> Matthew Wilcox <willy@xxxxxxxxxxxxx> writes:
>
> > On Tue, Nov 05, 2019 at 03:39:23PM +0100, Marco Elver wrote:
> >> On Tue, 05 Nov 2019, syzbot wrote:
> >> > ==================================================================
> >> > BUG: KCSAN: data-race in fat16_ent_put / fat_search_long
> >> >
> >> > write to 0xffff8880a209c96a of 2 bytes by task 11985 on cpu 0:
> >> > fat16_ent_put+0x5b/0x90 fs/fat/fatent.c:181
> >> > fat_ent_write+0x6d/0xf0 fs/fat/fatent.c:415
> >> > fat_chain_add+0x34e/0x400 fs/fat/misc.c:130
> >> > fat_add_cluster+0x92/0xd0 fs/fat/inode.c:112
> >> > __fat_get_block fs/fat/inode.c:154 [inline]
> >> > fat_get_block+0x3ae/0x4e0 fs/fat/inode.c:189
> >> > __block_write_begin_int+0x2ea/0xf20 fs/buffer.c:1968
> >> > __block_write_begin fs/buffer.c:2018 [inline]
> >> > block_write_begin+0x77/0x160 fs/buffer.c:2077
> >> > cont_write_begin+0x3d6/0x670 fs/buffer.c:2426
> >> > fat_write_begin+0x72/0xc0 fs/fat/inode.c:235
> >> > pagecache_write_begin+0x6b/0x90 mm/filemap.c:3148
> >> > cont_expand_zero fs/buffer.c:2353 [inline]
> >> > cont_write_begin+0x17a/0x670 fs/buffer.c:2416
> >> > fat_write_begin+0x72/0xc0 fs/fat/inode.c:235
> >> > pagecache_write_begin+0x6b/0x90 mm/filemap.c:3148
> >> > generic_cont_expand_simple+0xb0/0x120 fs/buffer.c:2317
> >> >
> >> > read to 0xffff8880a209c96b of 1 bytes by task 11990 on cpu 1:
> >> > fat_search_long+0x20a/0xc60 fs/fat/dir.c:484
> >> > vfat_find+0xc1/0xd0 fs/fat/namei_vfat.c:698
> >> > vfat_lookup+0x75/0x350 fs/fat/namei_vfat.c:712
> >> > lookup_open fs/namei.c:3203 [inline]
> >> > do_last fs/namei.c:3314 [inline]
> >> > path_openat+0x15b6/0x36e0 fs/namei.c:3525
> >> > do_filp_open+0x11e/0x1b0 fs/namei.c:3555
> >> > do_sys_open+0x3b3/0x4f0 fs/open.c:1097
> >> > __do_sys_open fs/open.c:1115 [inline]
> >> > __se_sys_open fs/open.c:1110 [inline]
> >> > __x64_sys_open+0x55/0x70 fs/open.c:1110
> >> > do_syscall_64+0xcc/0x370 arch/x86/entry/common.c:290
> >> > entry_SYSCALL_64_after_hwframe+0x44/0xa9
> >> >
> >> > Reported by Kernel Concurrency Sanitizer on:
> >> > CPU: 1 PID: 11990 Comm: syz-executor.2 Not tainted 5.4.0-rc3+ #0
> >> > Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS
> >> > Google 01/01/2011
> >> > ==================================================================
> >>
> >> I was trying to understand what is happening here, but fail to see how
> >> this can happen. So it'd be good if somebody who knows this code can
> >> explain. We are quite positive this is not a false positive, given the
> >> addresses accessed match.
> >
> > Both of these accesses are into a buffer head; ie the data being accessed
> > is stored in the page cache. Is it possible the page was reused for
> > different data between these two accesses?
>
> No and yes. The reader side is a directory buffer, the writer side is
> a FAT buffer, so a FAT buffer is never reused as a directory buffer.
> But the page cache page itself can be freed and reused at a different
> index. So if KCSAN can't detect page cache recycling, it would be
> possible.
>
> Is there any way to know "why KCSAN thought this was a data race"?

KCSAN set up a watchpoint on the plain read and stalled that thread
for a few microseconds. While the reader was stalled, a concurrent
plain write occurred that matched the watchpoint the reader had set
up. Whenever KCSAN detects a data race, the two operations *must*
actually be happening in parallel at the time.
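
As a rough illustration of that matching step, here is a simplified,
single-threaded user-space model. It is not the actual KCSAN code: the
struct and all function names are made up for this sketch, there is
only one watchpoint slot, and the stall is left out. It only shows why
a 2-byte write at ...c96a matches a 1-byte read watchpoint at ...c96b.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical single-slot stand-in for a watchpoint table. */
struct watchpoint {
	uint64_t addr;
	size_t size;
	bool armed;
	bool hit;
};

static struct watchpoint wp;

/* True if [a, a + a_size) and [b, b + b_size) share at least one byte. */
static bool ranges_overlap(uint64_t a, size_t a_size,
			   uint64_t b, size_t b_size)
{
	return a < b + b_size && b < a + a_size;
}

/* Watched access: arm the watchpoint before the (omitted) stall. */
static void watch_begin(uint64_t addr, size_t size)
{
	assert(size > 0);
	wp = (struct watchpoint){ .addr = addr, .size = size, .armed = true };
}

/* Any other access: record a hit if it overlaps the armed watchpoint. */
static void check_access(uint64_t addr, size_t size)
{
	if (wp.armed && ranges_overlap(wp.addr, wp.size, addr, size))
		wp.hit = true;
}

/* Watched access: disarm after the stall; true means a race was seen. */
static bool watch_end(void)
{
	bool raced = wp.hit;

	wp.armed = false;
	return raced;
}
```

With the addresses from the report, watch_begin(0xffff8880a209c96b, 1)
followed by check_access(0xffff8880a209c96a, 2) makes watch_end()
return true, because the write covers bytes ...c96a and ...c96b. An
access that arrives after watch_end() is not flagged, which is the
sense in which a report implies the two accesses overlapped in time.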

I will try to reproduce this somehow.

> >> The two bits of code in question here are:
> >>
> >> static void fat16_ent_put(struct fat_entry *fatent, int new)
> >> {
> >> 	if (new == FAT_ENT_EOF)
> >> 		new = EOF_FAT16;
> >>
> >> 	*fatent->u.ent16_p = cpu_to_le16(new); <<== data race here
> >> 	mark_buffer_dirty_inode(fatent->bhs[0], fatent->fat_inode);
> >> }
>
> This is updating FAT entry (index for data cluster placement) on FAT buffer.
>
> >> int fat_search_long(struct inode *inode, const unsigned char *name,
> >> 		    int name_len, struct fat_slot_info *sinfo)
> >> {
> >> 	struct super_block *sb = inode->i_sb;
> >> 	struct msdos_sb_info *sbi = MSDOS_SB(sb);
> >> 	struct buffer_head *bh = NULL;
> >> 	struct msdos_dir_entry *de;
> >> 	unsigned char nr_slots;
> >> 	wchar_t *unicode = NULL;
> >> 	unsigned char bufname[FAT_MAX_SHORT_SIZE];
> >> 	loff_t cpos = 0;
> >> 	int err, len;
> >>
> >> 	err = -ENOENT;
> >> 	while (1) {
> >> 		if (fat_get_entry(inode, &cpos, &bh, &de) == -1)
> >> 			goto end_of_dir;
> >> parse_record:
> >> 		nr_slots = 0;
> >> 		if (de->name[0] == DELETED_FLAG)
> >> 			continue;
> >> 		if (de->attr != ATTR_EXT && (de->attr & ATTR_VOLUME)) <<== data race here
>
> Checking attribute on directory buffer.
>
> >> 			continue;
> >> 		if (de->attr != ATTR_EXT && IS_FREE(de->name))
> >> 			continue;
> >> 		<snip>
> >> }
> >>
> >> Thanks,
> >> -- Marco
>
> --
> OGAWA Hirofumi <hirofumi@xxxxxxxxxxxxxxxxxx>