Re: KCSAN: data-race in __alloc_file / __alloc_file

From: Linus Torvalds
Date: Tue Nov 12 2019 - 12:23:57 EST


On Tue, Nov 12, 2019 at 8:50 AM Kirill Smelkov <kirr@xxxxxxxxxx> wrote:
>
> The same logic applies if it is not 2 processes, but 2 threads:
> thread T2 adjusts file position racily to thread T1 while T1 is doing
> read syscall with the end result that T1 read could access file range
> that it should not be allowed to access.

Well, I think we actually always copy the file position before we pass
it down. So everybody always _uses_ their own private pointer, and the
race is only in the "read original value" vs "write new value back".

You had a patch that passed the address of file->f_pos down in your
original series iirc, but I NAK'ed that one. Exactly because it made
me nervous.
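
To make the "private copy" pattern concrete, here is a minimal userspace
sketch. It is not the kernel code: model_file, model_vfs_read and
model_read are hypothetical names made up for illustration. The lower
layer only ever sees the caller's local pos, so the only racy accesses
to the shared f_pos are the initial read and the final write-back.

```c
#include <stddef.h>
#include <string.h>
#include <sys/types.h>   /* off_t, ssize_t */

/* Hypothetical stand-in for struct file: only the shared position matters. */
struct model_file {
    const char *data;
    size_t size;
    off_t f_pos;         /* shared by every user of this open file */
};

/* Lower layer: works only on the pointer it is handed, never on file->f_pos. */
static ssize_t model_vfs_read(struct model_file *file, char *buf,
                              size_t count, off_t *ppos)
{
    if (*ppos < 0 || (size_t)*ppos >= file->size)
        return 0;                        /* at or past EOF: nothing to read */

    size_t left = file->size - (size_t)*ppos;
    if (count > left)
        count = left;

    memcpy(buf, file->data + *ppos, count);
    *ppos += (off_t)count;               /* advances the caller's copy only */
    return (ssize_t)count;
}

/* Syscall-like entry point: copy in, use the private copy, copy out. */
static ssize_t model_read(struct model_file *file, char *buf, size_t count)
{
    off_t pos = file->f_pos;             /* racy read of the shared value */
    ssize_t ret = model_vfs_read(file, buf, count, &pos);

    if (ret >= 0)
        file->f_pos = pos;               /* racy write-back of the new value */
    return ret;
}
```

Passing &file->f_pos straight into model_vfs_read instead would let a
concurrent writer change the position in the middle of the bounds check
and the copy, which is the case the private copy avoids.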

> By the way on "1" topic I suspect there is a race of how
> `N(file-users) > 1` check is done: file_count(file) is
> atomic_long_read(&file->f_count), but let's think on how that atomic
> read is positioned wrt another process creation: I have not studied this in
> detail, so I might be wrong here, but offhand it looks like there is no
> synchronization.

Well, that's one reason to add the test for threads - it also gets rid
of that race. Because without threads, there's nothing else that could
access - or fork - a "N(file-users) == 1" file but us.

> So talking about the kernel I would also review the possibility of
> file_count wrt clone race once again.

See above. That goes away with the test for FDPUT_FPUT.
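
As a rough sketch of that combined test (a userspace model, not the
actual patch): assume FDPUT_FPUT is the flag the fd lookup sets when it
had to take a real reference because the descriptor table is shared
with other threads. MODEL_FDPUT_FPUT and model_needs_pos_lock are
hypothetical names used only here.

```c
#include <stdbool.h>

/* Hypothetical flag bit; only its meaning matters for this model. */
#define MODEL_FDPUT_FPUT 1u

/*
 * Decide whether the file position needs locking.
 *
 * fd_flags:   flags returned by the (modelled) fd lookup
 * file_count: current number of users of the open file
 */
static bool model_needs_pos_lock(unsigned int fd_flags, long file_count)
{
    /* Shared descriptor table: another thread could reach the file or fork. */
    if (fd_flags & MODEL_FDPUT_FPUT)
        return true;

    /* Unshared table: a count of 1 can only be us, so nobody can race. */
    return file_count > 1;
}
```

With the thread test done first, the plain count check is only reached
when nothing else could be bumping the count behind our back.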

> About "2": I generally agree with the direction, but I think the kernel
> is not yet ready for this switch. Let me quote myself:

Hmm. I thought we had already applied all the patches that marked
things that don't use f_pos as FMODE_STREAM. Including pipes and
sockets etc.

But if we didn't - and no, I didn't double-check now either - then
obviously that part of the patch can't be applied now.

Linus