Re: [mm] f9ce0be71d: BUG:KCSAN:data-race_in_next_uptodate_page/next_uptodate_page

From: Will Deacon
Date: Tue Aug 31 2021 - 10:07:58 EST


On Tue, Aug 31, 2021 at 03:38:17PM +0200, Marco Elver wrote:
> On Tue, 31 Aug 2021 at 15:13, Will Deacon <will@xxxxxxxxxx> wrote:
> > > > [ 184.717904][ T1873] ==================================================================
> > > > [ 184.718938][ T1873] BUG: KCSAN: data-race in next_uptodate_page / unlock_page
> > > > [ 184.719828][ T1873]
> > > > [ 184.720103][ T1873] write (marked) to 0xffffea00050f37c0 of 8 bytes by task 1872 on cpu 1:
> > > > [ 184.721024][ T1873] unlock_page+0x102/0x1b0
> > > > [ 184.721533][ T1873] filemap_map_pages+0x6c6/0x890
> > > > [ 184.722102][ T1873] handle_mm_fault+0x179c/0x27f0
> > > > [ 184.722672][ T1873] do_user_addr_fault+0x3fb/0x830
> > > > [ 184.723263][ T1873] exc_page_fault+0xc3/0x1a0
> > > > [ 184.723845][ T1873] asm_exc_page_fault+0x1e/0x30
> > > > [ 184.724427][ T1873]
> > > > [ 184.724720][ T1873] read to 0xffffea00050f37c0 of 8 bytes by task 1873 on cpu 0:
> > > > [ 184.725575][ T1873] next_uptodate_page+0x456/0x830
> > > > [ 184.726161][ T1873] filemap_map_pages+0x728/0x890
> > > > [ 184.726747][ T1873] handle_mm_fault+0x179c/0x27f0
> > > > [ 184.727332][ T1873] do_user_addr_fault+0x3fb/0x830
> > > > [ 184.727905][ T1873] exc_page_fault+0xc3/0x1a0
> > > > [ 184.728440][ T1873] asm_exc_page_fault+0x1e/0x30
> > > > [ 184.729027][ T1873]
> > > > [ 184.729313][ T1873] Reported by Kernel Concurrency Sanitizer on:
> > > > [ 184.730019][ T1873] CPU: 0 PID: 1873 Comm: systemd-udevd Not tainted 5.11.0-rc4-00001-gf9ce0be71d1f #1
> > > > [ 184.731103][ T1873] ==================================================================
> > >
> > > Line annotation would be helpful.
> >
> > Agreed.
> >
> > > And I'm not very familiar with KCSAN. My guess it reports PageLock() vs.
> > > clearing PG_locked. In this context it looks safe, unless I miss
> > > something.
> >
> > The access in clear_bit_unlock() is annotated as a full sizeof(long) atomic
> > write, so we could be racing with a non-atomic read of another bit in the
> > page flags but I can't spot where that happens before the trylock_page() in
> > next_uptodate_page().
> >
> > > Do we need some annotation to help KCSAN?
> >
> > clear_bit_unlock() is already instrumented and _most_ of the helpers in
> > page-flags.h look they should be as well by virtue of using test_bit().
>
> Even though they're data races, if this is a "1-bit value change" data
> race, then I'd leave it as-is for now (unless maintainer prefers
> marking everything that can concurrently be accessed).
>
> Bots wanting to enable KCSAN and races like this was the reason for
> CONFIG_KCSAN_PERMISSIVE (ignoring those "1-bit value changes"):
> https://lkml.kernel.org/r/20210607125653.1388091-1-elver@xxxxxxxxxx --
> This series is supposed to land in this merge-window.
>
> For syzbot, we do active moderation -- when we get the time, we look
> at races, and try to analyze or fix (although I've not looked
> recently), or unconditionally forward to some maintainers who
> committed addressing all data races in their subsystems (RCU, some
> parts of networking).
>
> Without active moderation before kernel 5.15, I do not recommend
> turning on KCSAN on bots unconditionally, simply because prioritizing
> the severity of data races is still difficult (and maintainers only
> have so much time). Ultimately, our goal is of course to eventually
> address all data races, but I think it requires time and patience.

(sounds like the 0day folks should disable KCSAN for now then)

> With 5.15, it may be possible to cautiously enable KCSAN on bots
> without moderation if they set CONFIG_KCSAN_PERMISSIVE=y. If no
> moderation is possible, an intermediate step may be:
>
> -- sending everything to a separate mailing list by default, without
> Cc (those interested, can subscribe)
> -- maintainers who opt into KCSAN reports will get the reports as normal.
>
> We could also make use of those opt-ins for syzbot. :-)
>
> Thoughts?

I wasn't complaining about the report! It's more that without line numbers
we're struggling a bit to figure out where the race is. All the page-flag
tests on the reader side should be using test_bit(), but the report above
doesn't seem to think that the read is marked. Given your series adding
CONFIG_KCSAN_PERMISSIVE and the fact that you try to triage these things, I
thought maybe you've seen this before and might be able to point at the race
(which is hopefully benign, but it's annoying when you can't spot it!).

Will