Re: KCSAN: data-race in __alloc_file / __alloc_file

From: Eric Dumazet
Date: Mon Nov 11 2019 - 14:13:47 EST


On Mon, Nov 11, 2019 at 11:01 AM Linus Torvalds
<torvalds@xxxxxxxxxxxxxxxxxxxx> wrote:
>
> On Mon, Nov 11, 2019 at 10:44 AM Eric Dumazet <edumazet@xxxxxxxxxx> wrote:
> >
> > An interesting case is the race in ksys_write()
>
> Not really.
>
> > if (ppos) {
> >         pos = *ppos; // data-race
>
> That code uses "fdget_pos()".
>
> Which does mutual exclusion _if_ the file is something we care about
> pos for, and if it has more than one process using it.
>
> Basically the rule there is that we don't care about the data race in
> certain circumstances. We don't care about non-regular files, for
> example, because regular files are what POSIX gives guarantees for.
>
> (We have since moved towards FMODE_STREAM handling instead of the
> older FMODE_ATOMIC_POS which does this better, and it's possible we
> should get rid of the FMODE_ATOMIC_POS behavior in favor of
> FMODE_STREAM entirely)
>
> Again, that's pretty hard to tell something like KCSAN.

Well, this is hard to explain to humans... Probably fewer than 10
people on this planet could tell that.
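
For readers trying to follow the rule quoted above, here is a minimal
userspace sketch of "lock only if the position matters and the file is
shared". It is a toy model, not the kernel code: struct toy_file,
toy_fdget_pos(), toy_fdput_pos() and toy_write() are hypothetical names,
atomic_pos stands in for FMODE_ATOMIC_POS, and refcount stands in for
the number of users of the file.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical userspace stand-in for struct file; not the kernel layout. */
struct toy_file {
	bool atomic_pos;          /* models FMODE_ATOMIC_POS being set */
	int refcount;             /* models how many users share the file */
	long pos;                 /* models the file position */
	pthread_mutex_t pos_lock; /* models the per-file position lock */
};

/*
 * Take the position lock only when the position is something we care
 * about AND the file has more than one user. Returns true when the
 * caller must unlock afterwards.
 */
static bool toy_fdget_pos(struct toy_file *f)
{
	if (f->atomic_pos && f->refcount > 1) {
		pthread_mutex_lock(&f->pos_lock);
		return true;
	}
	return false;
}

static void toy_fdput_pos(struct toy_file *f, bool locked)
{
	if (locked)
		pthread_mutex_unlock(&f->pos_lock);
}

/* Models the "pos = *ppos; ... ; *ppos = pos" pattern around a write. */
static long toy_write(struct toy_file *f, long nbytes)
{
	bool locked;
	long pos;

	assert(nbytes >= 0);
	locked = toy_fdget_pos(f);
	pos = f->pos;   /* plain read: unprotected whenever !locked */
	pos += nbytes;
	f->pos = pos;   /* plain write: unprotected whenever !locked */
	toy_fdput_pos(f, locked);
	return pos;
}
```

In this model the plain accesses to f->pos are serialized only on the
locked path. On the unlocked path they are exactly the kind of access a
tool like KCSAN would flag, even though the race is tolerated by design.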

What about this other one? It looks like multiple threads can
execute tsk->min_flt++ at the same time in faultin_page().

Should we not care, or should we mirror min_flt with a second
atomic_long_t, or simply convert min_flt to atomic_long_t?
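
To make the trade-off concrete, here is a rough userspace sketch using
C11 atomics and pthreads rather than the kernel's atomic_long_t API.
Everything in it (struct toy_task, fault_worker(), run_faults(), the
thread and iteration counts) is hypothetical. The "split" counter does
a separate relaxed load and store, which models a non-atomic increment
without being undefined behavior in C11; the "rmw" counter models what
converting the field to an atomic type would give.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>

#define N_THREADS 4
#define N_FAULTS  100000

/* Hypothetical stand-in for the fault counter in a task structure. */
struct toy_task {
	atomic_ulong min_flt_split; /* load, then store: can lose updates */
	atomic_ulong min_flt_rmw;   /* single atomic read-modify-write */
};

static void *fault_worker(void *arg)
{
	struct toy_task *t = arg;

	for (int i = 0; i < N_FAULTS; i++) {
		/* Models a non-atomic "min_flt++": two separate accesses. */
		unsigned long v = atomic_load_explicit(&t->min_flt_split,
						       memory_order_relaxed);
		atomic_store_explicit(&t->min_flt_split, v + 1,
				      memory_order_relaxed);

		/* Models an atomic counter: no increment is ever lost. */
		atomic_fetch_add_explicit(&t->min_flt_rmw, 1,
					  memory_order_relaxed);
	}
	return NULL;
}

/* Run N_THREADS workers that all bump the same task's counters. */
static void run_faults(struct toy_task *t)
{
	pthread_t th[N_THREADS];

	for (int i = 0; i < N_THREADS; i++) {
		int rc = pthread_create(&th[i], NULL, fault_worker, t);
		assert(rc == 0);
		(void)rc;
	}
	for (int i = 0; i < N_THREADS; i++)
		pthread_join(th[i], NULL);
}
```

After run_faults(), min_flt_rmw is always N_THREADS * N_FAULTS, while
min_flt_split can end up anywhere at or below that. Whether an
occasionally low statistics counter matters enough to pay for an
atomic operation on every fault is the "should we not care" question.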

BUG: KCSAN: data-race in __get_user_pages / __get_user_pages

read to 0xffff8880b0b8f650 of 8 bytes by task 11553 on cpu 1:
faultin_page mm/gup.c:653 [inline]
__get_user_pages+0x78f/0x1160 mm/gup.c:845
__get_user_pages_locked mm/gup.c:1023 [inline]
get_user_pages_remote+0x206/0x3e0 mm/gup.c:1163
process_vm_rw_single_vec mm/process_vm_access.c:109 [inline]
process_vm_rw_core.isra.0+0x3a4/0x8c0 mm/process_vm_access.c:216
process_vm_rw+0x1c4/0x1e0 mm/process_vm_access.c:284
__do_sys_process_vm_writev mm/process_vm_access.c:306 [inline]
__se_sys_process_vm_writev mm/process_vm_access.c:301 [inline]
__x64_sys_process_vm_writev+0x8b/0xb0 mm/process_vm_access.c:301
do_syscall_64+0xcc/0x370 arch/x86/entry/common.c:290
entry_SYSCALL_64_after_hwframe+0x44/0xa9

write to 0xffff8880b0b8f650 of 8 bytes by task 11531 on cpu 0:
faultin_page mm/gup.c:653 [inline]
__get_user_pages+0x7b1/0x1160 mm/gup.c:845
__get_user_pages_locked mm/gup.c:1023 [inline]
get_user_pages_remote+0x206/0x3e0 mm/gup.c:1163
process_vm_rw_single_vec mm/process_vm_access.c:109 [inline]
process_vm_rw_core.isra.0+0x3a4/0x8c0 mm/process_vm_access.c:216
process_vm_rw+0x1c4/0x1e0 mm/process_vm_access.c:284
__do_sys_process_vm_writev mm/process_vm_access.c:306 [inline]
__se_sys_process_vm_writev mm/process_vm_access.c:301 [inline]
__x64_sys_process_vm_writev+0x8b/0xb0 mm/process_vm_access.c:301
do_syscall_64+0xcc/0x370 arch/x86/entry/common.c:290
entry_SYSCALL_64_after_hwframe+0x44/0xa9

Reported by Kernel Concurrency Sanitizer on:
CPU: 0 PID: 11531 Comm: syz-executor.4 Not tainted 5.4.0-rc6+ #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011